On high-dimensional wavelet eigenanalysis
Patrice Abry, B. Cooper Boniece, Gustavo Didier, Herwig Wendt
Patrice Abry, Univ Lyon, ENS de Lyon, Univ Claude Bernard, CNRS, Laboratoire de Physique, F-69342 Lyon, France
B. Cooper Boniece, Department of Mathematics and Statistics, Washington University in St. Louis
Gustavo Didier, Mathematics Department, Tulane University
Herwig Wendt, IRIT-ENSEEIHT, CNRS (UMR 5505), Université de Toulouse, France

February 12, 2021
Abstract
In this paper, we mathematically construct wavelet eigenanalysis in high dimensions (Abry and Didier (2018a, 2018b)) by characterizing the scaling behavior of the eigenvalues of large wavelet random matrices. We assume that possibly non-Gaussian, finite-variance p-variate measurements are made of a low-dimensional r-variate (r ≪ p) fractional stochastic process with non-canonical scaling coordinates and in the presence of additive high-dimensional noise. We show that the r largest eigenvalues of the wavelet random matrices, when appropriately rescaled, converge to scale-invariant functions in the high-dimensional limit. By contrast, the remaining p − r eigenvalues remain bounded. In addition, we show that, up to a log transformation, the r largest eigenvalues of wavelet random matrices exhibit asymptotically Gaussian distributions. We further show how the asymptotic and large-scale behavior of wavelet eigenvalues can be used to construct statistical inference methodology for a high-dimensional signal-plus-noise system.

∗ The first author was partially supported by the French ANR AMATIS 2011 grant.
† AMS Subject classification. Primary: 60G18, 60B20, 62H25, 42C40.
‡ Keywords and phrases: wavelets, operator self-similarity, random matrices.

1 Introduction

A wavelet is a unit L^2(\mathbb{R})-norm function that annihilates polynomials. For a fixed (octave) j ∈ \mathbb{N}, a wavelet random matrix is given by

W(a(n)2^j) = \frac{1}{n_{a,j}} \sum_{k=1}^{n_{a,j}} D(a(n)2^j, k) D(a(n)2^j, k)^*.   (1.1)

In (1.1), * denotes transposition, n_{a,j} = n/(a(n)2^j) is the number of wavelet-domain observations for a sample size n, and each random vector D(a(n)2^j, k) is the wavelet transform of a multivariate stochastic process Y at the dyadic scale a(n)2^j and shift k ∈ \mathbb{Z}. The entries of W(2^j) are generally correlated. The so-named wavelet eigenanalysis methodology consists in using the behavior across scales of the eigenvalues of wavelet random matrices to study the fractality of stochastic systems (Abry and Didier (2018a, 2018b)). In this paper, we characterize the asymptotic behavior of the eigenvalues of wavelet random matrices in high dimensions. The (possibly non-Gaussian) underlying stochastic process Y = \{Y(t)\}_{t ∈ \mathbb{Z}} is assumed to have the form

Y(t) = P X(t) + Z(t), \quad t ∈ \mathbb{Z}.   (1.2)

In (1.2), both Y and the noise term Z = \{Z(t)\}_{t ∈ \mathbb{Z}} are (high-dimensional) p = p(n)-variate processes, P = P(n) is a rectangular coordinates matrix and, for fixed r ∈ \mathbb{N}, X = \{X(t)\}_{t ∈ \mathbb{Z}} is a (low-dimensional) r-variate fractional process. We show that, if the ratio a(n)p(n)/n converges to a positive constant, then, under appropriate rescaling, the r largest eigenvalues of W(a(n)2^j) converge to scale-invariant functions. By contrast, the remaining p(n) − r eigenvalues remain bounded. In addition, we show that, up to a log transformation, the r largest eigenvalues of W(a(n)2^j) exhibit asymptotically Gaussian distributions. We further discuss how the asymptotic and large-scale behavior of wavelet eigenvalues can be used to construct statistical inference for a high-dimensional signal-plus-noise system of the form (1.2), where X is a latent process containing fractal information and P (as well as Z) is unknown.
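To fix ideas on the object in (1.1), the following minimal Python sketch assembles a wavelet random matrix from p-variate measurements using Haar-type detail coefficients at a single dyadic scale. The Haar filter, the block-wise computation and the helper names are illustrative assumptions made for exposition, not the exact pipeline analyzed in the paper.

```python
import numpy as np

def haar_detail_coeffs(Y, scale):
    """Haar-type wavelet (detail) coefficients of each row of Y at a dyadic scale.

    For each shift k, the coefficient contrasts the second and first halves of a
    block of length `scale`, normalized so the discrete wavelet has unit L2 norm
    (an illustrative choice)."""
    p, n = Y.shape
    n_coeffs = n // scale
    blocks = Y[:, : n_coeffs * scale].reshape(p, n_coeffs, scale)
    half = scale // 2
    return (blocks[:, :, half:].sum(axis=-1) - blocks[:, :, :half].sum(axis=-1)) / np.sqrt(scale)

def wavelet_random_matrix(Y, scale):
    """Sample wavelet random matrix (1/n_coeffs) * sum_k D(scale, k) D(scale, k)^T, cf. (1.1)."""
    D = haar_detail_coeffs(Y, scale)          # shape (p, n_coeffs)
    return D @ D.T / D.shape[1]               # shape (p, p)

# Toy usage: p = 20 white-noise series of length n = 2**12, scale a(n)2^j = 8.
rng = np.random.default_rng(0)
Y = rng.standard_normal((20, 2**12))
W = wavelet_random_matrix(Y, scale=8)
eigvals = np.sort(np.linalg.eigvalsh(W))      # ordered eigenvalues, as studied in Section 3
```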
Since the 1950s, the spectral behavior of large-dimensional random matrices has attracted considerable attention from the mathematical research community. In quantum mechanics, for example, random matrices are of great interest as statistical mechanical models of infinite-dimensional and possibly unknown Hamiltonian operators (e.g., Mehta (2004), Anderson et al. (2010), Tao and Vu (2011)). Random matrices have also naturally emerged as an essential mathematical framework for the modern era of "Big Data" (Brody (2011)), when hundreds to several tens of thousands of time series get recorded and stored on a daily basis. In arbitrary dimension p, stochastic modeling has to cope with the curse of dimensionality: as the dimension increases, the volume of the space increases so fast that the amount of data needed for traditional statistical inference typically grows exponentially (Wainwright (2019)). Consequently, one is often interested in understanding the behavior of statistics such as the spectral distribution of sample covariance matrices when the dimension p is comparable to the sample size n (e.g., Tao and Vu (2012), Xia et al. (2013), Paul and Aue (2014)).

In turn, scale invariance manifests itself in a wide range of natural and social phenomena, such as in climate studies (Isotta et al. (2014)), dendrochronology (Bai and Taqqu (2018)), hydrology (Benson et al. (2006)) and turbulence (Kolmogorov (1941)). In a multidimensional setting, scaling behavior does not always appear along standard coordinate axes, and it often involves multiple (scaling) relations. An \mathbb{R}^r-valued stochastic process X is called operator self-similar (o.s.s.; Laha and Rohatgi (1981), Hudson and Mason (1982)) if it exhibits the scaling property

\{X(ct)\}_{t ∈ \mathbb{R}} \overset{f.d.d.}{=} \{c^H X(t)\}_{t ∈ \mathbb{R}}, \quad c > 0.   (1.3)

In (1.3), H is some (Hurst) matrix whose eigenvalues have real parts lying in the interval
(0, 1], and c^H := \exp\{\log(c) H\} = \sum_{k=0}^{\infty} (\log(c) H)^k / k!. A canonical model for multivariate fractional systems is operator fractional Brownian motion (ofBm), namely, a Gaussian, o.s.s., stationary-increment stochastic process (Maejima and Mason (1994), Mason and Xiao (2002), Didier and Pipiras (2011)). In particular, ofBm is the natural generalization of the classical fBm (Mandelbrot and Van Ness (1968)).

The literature on random matrices under dependence, as well as on high-dimensional stochastic processes, has been expanding at a fast pace (e.g., Basu and Michailidis (2015), Chakrabarty et al. (2016), Merlevède and Peligrad (2016), Che (2017), Steland and von Sachs (2017), Taylor and Salhi (2017), Wang et al. (2017), Zhang and Wu (2017), Merlevède et al. (2019)). In applications, models of the form (1.2) appear in neuroscience (see Liu et al. (2015), Section 3.4), in particular as related to fMRI or M/EEG imaging (Ciuciu et al. (2012)), and also, for example, in econometrics (e.g., Brown (1989), Forni and Lippi (1999), Lam and Yao (2012), Chan et al. (2017)).

In the characterization of scaling properties, the use of eigenanalysis was first proposed in Meerschaert and Scheffler (1999, 2003) for operator stable laws, and later in Becker-Kern and Pap (2008) for o.s.s. processes in the time domain. It has also been used in the cointegration literature (e.g., Phillips and Ouliaris (1988), Harris and Poskitt (2004), Li et al. (2009), Zhang et al. (2018)). In Abry and Didier (2018a, 2018b), wavelet eigenanalysis is proposed in low dimensions for the construction of a general estimation method for the scaling (Hurst) structure of ofBm.

In Abry et al. (2018) and Boniece et al. (2019), presented without proofs and containing preliminary computational studies, wavelet random matrices were first used in the modeling of high-dimensional systems. In this paper, we mathematically construct wavelet eigenanalysis in high dimensions assuming measurements of the form (1.2), where the fractional properties of X are characterized by a scaling matrix of the Jordan form H = P_H \operatorname{diag}(h_1, \ldots, h_r) P_H^{-1}, with real eigenvalues. Under very general assumptions, we show that, for functions ξ_q(\cdot),

λ_{p(n)−r+q}\Big( \frac{W(a(n)2^j)}{a(n)^{2h_q+1}} \Big) \overset{P}{\to} ξ_q(2^j), \quad q = 1, \ldots, r,   (1.4)

whereas the remaining eigenvalues λ_ℓ\big( W(a(n)2^j)/a(n)^{2h_q+1} \big), ℓ = 1, \ldots, p(n) − r, are bounded in probability (Proposition 3.1). Moreover, under slightly stronger conditions, the random vector

\sqrt{n_{a,j}} \Big( λ_{p(n)−r+q}\Big( \frac{W(a(n)2^j)}{a(n)^{2h_q+1}} \Big) − λ_{p(n)−r+q}\Big( \frac{\mathbb{E} W(a(n)2^j)}{a(n)^{2h_q+1}} \Big) \Big), \quad q = 1, \ldots, r,   (1.5)

is asymptotically Gaussian (Theorem 3.1). In (1.4) and (1.5), we assume

\lim_{n \to \infty} \frac{p(n) a(n)}{n} =: c ∈ [0, \infty),   (1.6)

which quantifies the three-fold effect of dimension (p(n)), sample size (n) and scale (a(n)) (cf. the traditional ratio \lim_{n \to \infty} p(n)/n for sample covariance matrices; see Bai and Silverstein (2010), Liu et al. (2015)).

Our results have direct statistical consequences. We further construct a multiscale wavelet eigenvalue regression methodology to efficiently estimate the scaling structure of the latent process X, given by (the scaling eigenvalues)

h_1, \ldots, h_r,   (1.7)

and also to statistically identify the effective dimension r of the system (Corollary 3.1 and Proposition 3.2).
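The matrix power c^H in (1.3) is a matrix exponential and can be evaluated numerically. The short Python sketch below, with an arbitrary illustrative Hurst matrix in non-canonical coordinates, checks the semigroup property c_1^H c_2^H = (c_1 c_2)^H that underlies operator self-similarity; the matrix values are assumptions chosen only for illustration.

```python
import numpy as np
from scipy.linalg import expm

def matrix_power_cH(c, H):
    """c^H := exp(log(c) * H), the operator scaling factor appearing in (1.3)."""
    return expm(np.log(c) * H)

# Illustrative (non-canonical) Hurst matrix: eigenvalues 0.3 and 0.8, mixed coordinates.
P = np.array([[1.0, 1.0], [0.0, 1.0]])
H = P @ np.diag([0.3, 0.8]) @ np.linalg.inv(P)

lhs = matrix_power_cH(2.0, H) @ matrix_power_cH(4.0, H)
rhs = matrix_power_cH(8.0, H)
assert np.allclose(lhs, rhs)   # c1^H and c2^H commute, so c1^H c2^H = (c1 c2)^H
```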
In particular, our results show that (1.7) can be estimated regardless of the coordinates P(n)P_H. In fact, non-canonical scaling coordinates P(n)P_H generally mix together slow and fast power laws, leading to the so-called amplitude and dominance effects (see Abry and Didier (2018b) for a detailed discussion). This, in turn, may severely bias the modeling of fractality in multidimensional systems by means of standard, univariate-like statistical methodology (on the multivariate character of Internet traffic fractality, see, for instance, Abry and Didier (2018a), Section 6).

For the sake of clarity and mathematical generality, our assumptions are stated directly in the wavelet domain (see Section 2). We further describe a very broad Gaussian framework for X and Z whose behavior satisfies the assumptions used in Section 2, as well as some simple finite-variance and non-Gaussian instances of interest (Section 4). For both Gaussian and non-Gaussian instances, we draw upon the concentration of measure phenomenon (Ledoux (2001), Boucheron et al. (2013)) to nonasymptotically characterize the (stochastic) properties of large-dimensional wavelet random matrices.

This paper is organized as follows. In Section 2, we provide the basic wavelet framework, definitions and wavelet-domain assumptions used throughout the paper. In Section 3, we provide the main results on the asymptotic and high-dimensional behavior of wavelet eigenvalues and on the wavelet eigenvalue regression. In Section 4, we provide a Gaussian framework, as well as Gaussian and non-Gaussian examples. In Section 5, we lay out conclusions and discuss open problems. All proofs can be found in the appendix.

2 Wavelet framework and assumptions

For m ∈ \mathbb{N}, let M(m, \mathbb{R}) be the space of m × m real-valued matrices. Let S(m, \mathbb{C}), S_{\geq 0}(m, \mathbb{C}) and S_{>0}(m, \mathbb{C}), respectively, be the space of m × m Hermitian matrices and the cones of m × m positive semidefinite and positive definite Hermitian matrices. Also, let S_{\geq 0}(m, \mathbb{R}) and S_{>0}(m, \mathbb{R}) be their real-valued analogues. Let O(m) and U(m) be the orthogonal and unitary groups of m × m matrices, respectively. Throughout the manuscript, \|A\| denotes the Euclidean norm of a matrix A ∈ M(p, \mathbb{R}) in arbitrary dimension p, i.e.,

\|A\| = \sqrt{\sup_{u ∈ S^{p-1}} u^* A A^* u} = \sqrt{\sup_{u ∈ S^{p-1}} u^* A^* A u}.

For any A ∈ S(p, \mathbb{R}),

−∞ < λ_1(A) ≤ \ldots ≤ λ_q(A) ≤ \ldots ≤ λ_p(A) < ∞

denotes the set of ordered eigenvalues of the matrix A. For S = (s_{i_1, i_2})_{i_1, i_2 = 1, \ldots, n} ∈ M(n, \mathbb{R}), let

\operatorname{vec}_S(S) = (s_{11}, s_{21}, \ldots, s_{n1}, s_{22}, s_{32}, \ldots, s_{n2}, \ldots, s_{nn}).   (2.1)

We use the asymptotic notation

o_P(1), \quad O_P(1), \quad Ω_P(1),   (2.2)

to describe sequences of random vectors or matrices whose norms vanish, are bounded above and are bounded away from zero, respectively, in probability.

Throughout the paper, we make use of a wavelet multiresolution analysis (MRA; see Mallat (1999), chapter 7), which decomposes L^2(\mathbb{R}) into a sequence of approximation (low-frequency) and detail (high-frequency) subspaces V_j and W_j, respectively, associated with different scales of analysis 2^j. In almost all mathematical statements, we assume the following conditions hold on the underlying wavelet MRA.

Assumption (W1): ψ ∈ L^2(\mathbb{R}) is a wavelet function, namely, it satisfies the relations

\int_{\mathbb{R}} ψ^2(t)\, dt = 1, \quad \int_{\mathbb{R}} t^p ψ(t)\, dt = 0, \quad p = 0, 1, \ldots,
N_ψ − 1, \quad \int_{\mathbb{R}} t^{N_ψ} ψ(t)\, dt ≠ 0,   (2.3)

for some integer (number of vanishing moments) N_ψ ≥ 1.

Assumption (W2): the scaling and wavelet functions

φ ∈ L^2(\mathbb{R}) and ψ ∈ L^2(\mathbb{R}) are compactly supported   (2.4)

and \widehat{φ}(0) = 1.

Assumption (W3): there is α > 1 such that

\sup_{x ∈ \mathbb{R}} |\widehat{ψ}(x)| (1 + |x|)^α < ∞.   (2.5)

Assumption (W4): the function

\sum_{k ∈ \mathbb{Z}} k^m φ(\cdot − k)   (2.6)

is a polynomial of degree m for all m = 0, \ldots, N_ψ − 1.

Under assumptions (W1) and (W2), \widehat{ψ}(x) exists, is everywhere differentiable and its first N_ψ − 1 derivatives vanish at x = 0. Condition (2.5), in turn, implies that ψ is continuous (see Mallat (1999), Theorem 6.1) and, hence, bounded.
Note that assumptions (W1)–(W4) are closely related to the broad wavelet framework for the analysis of Gaussian stochastic processes laid out in Moulines et al. (2007a, 2007b, 2008). The Daubechies scaling and wavelet functions generally satisfy
(W1)–(W4) (see Moulines et al. (2008), p. 1927, or Mallat (1999), p. 253). Usually, the parameter α increases to infinity as N_ψ goes to infinity (see Moulines et al. (2008), p. 1927, or Cohen (2003), Theorem 2.10.1).

We further suppose the wavelet coefficients stem from Mallat's pyramidal algorithm (Mallat (1999), chapter 7). For expositional simplicity, in our description of the algorithm we use the \mathbb{R}^p-valued process Y, though analogous developments also hold for both X and Z. Initially, suppose an infinite time series

\{Y(ℓ)\}_{ℓ ∈ \mathbb{Z}},   (2.7)

associated with the starting scale 2^j = 1 (or octave j = 0), is available. Then, we can apply Mallat's algorithm to extract the so-named approximation (A(2^{j+1}, \cdot)) and detail (D(2^{j+1}, \cdot)) coefficients at coarser scales 2^{j+1} by means of an iterative procedure. In fact, as commonly done in the wavelet literature, we initialize the algorithm with the process

\mathbb{R}^p ∋ \widetilde{Y}(t) := \sum_{k ∈ \mathbb{Z}} Y(k) φ(t − k) ∈ V_0, \quad t ∈ \mathbb{R}.   (2.8)

By the orthogonality of the shifted scaling functions \{φ(\cdot − k)\}_{k ∈ \mathbb{Z}},

\mathbb{R}^p ∋ A(2^0, k) = \int_{\mathbb{R}} \widetilde{Y}(t) φ(t − k)\, dt = Y(k), \quad k ∈ \mathbb{Z}   (2.9)

(see Stoev et al. (2002), proof of Lemma 6.1, or Moulines et al. (2007b), p. 160; cf. Abry and Flandrin (1994), p. 33). In other words, the initial sequence of approximation coefficients, at octave j = 0, is given by the original time series. To obtain approximation and detail coefficients at coarser scales, we use Mallat's iterative procedure

A(2^{j+1}, k) = \sum_{k' ∈ \mathbb{Z}} u_{k'−2k} A(2^j, k'), \quad D(2^{j+1}, k) = \sum_{k' ∈ \mathbb{Z}} v_{k'−2k} A(2^j, k'), \quad j ∈ \mathbb{N} ∪ \{0\}, \ k ∈ \mathbb{Z},   (2.10)

where the (scalar) filter sequences \{u_k := 2^{-1/2} \int_{\mathbb{R}} φ(t/2) φ(t − k)\, dt\}_{k ∈ \mathbb{Z}} and \{v_k := 2^{-1/2} \int_{\mathbb{R}} ψ(t/2) φ(t − k)\, dt\}_{k ∈ \mathbb{Z}} are called low- and high-pass MRA filters, respectively. Due to the assumed compactness of the supports of ψ and the associated scaling function φ (see condition (2.4)), only a finite number of filter terms is nonzero, which is convenient for computational purposes (Daubechies (1992)). Hereinafter, we assume without loss of generality that supp(φ) = supp(ψ) = [0, T] (cf. Moulines et al. (2007b), p. 160). Moreover, the wavelet (detail) coefficients D(2^j, k) of Y can be expressed as

\mathbb{R}^p ∋ D(2^j, k) = \sum_{ℓ ∈ \mathbb{Z}} Y(ℓ) h_{j, 2^j k − ℓ},   (2.11)

where the filter terms are defined by

\mathbb{R} ∋ h_{j, ℓ} = 2^{-j/2} \int_{\mathbb{R}} φ(t + ℓ) ψ(2^{-j} t)\, dt.   (2.12)

If we replace (2.7) with the realistic assumption that only a finite-length time series

\{Y(k)\}_{k = 1, \ldots, n}   (2.13)

is available, then, writing \widetilde{Y}^{(n)}(t) := \sum_{k=1}^{n} Y(k) φ(t − k), we have \widetilde{Y}^{(n)}(t) = \widetilde{Y}(t) for all t ∈ (T, n + 1) (cf. Moulines et al. (2007b)). Noting that D(2^j, k) = \int_{\mathbb{R}} \widetilde{Y}(t) 2^{-j/2} ψ(2^{-j} t − k)\, dt and D^{(n)}(2^j, k) = \int_{\mathbb{R}} \widetilde{Y}^{(n)}(t) 2^{-j/2} ψ(2^{-j} t − k)\, dt, it follows that the finite-sample wavelet coefficients D^{(n)}(2^j, k) of \widetilde{Y}^{(n)}(t) are equal to D(2^j, k) whenever supp ψ(2^{-j} \cdot − k) = (2^j k, 2^j(k + T)) ⊆ (T, n + 1). In other words,

D^{(n)}(2^j, k) = D(2^j, k), \quad ∀ (j, k) ∈ \{(j, k) : 2^{-j} T ≤ k ≤ 2^{-j}(n + 1) − T\}.   (2.14)

Equivalently, this subset of finite-sample wavelet coefficients is not affected by the so-named border effect (cf. Craigmile et al. (2005), Percival and Walden (2006), Didier and Pipiras (2010)). Moreover, by (2.14) the number of such coefficients at octave j is given by n_j = \lfloor 2^{-j}(n + 1 − T) − T \rfloor. Hence, n_j ∼ 2^{-j} n for large n.
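As an illustration of the recursion (2.10), the following Python sketch iterates low- and high-pass filtering with downsampling by two, using the Haar filter pair; the filter choice, the boundary handling and the function names are simplifying assumptions rather than the exact conventions of the paper.

```python
import numpy as np

# Haar MRA filters (illustrative choice); for other wavelets, substitute the
# corresponding low-pass u and high-pass v sequences.
u = np.array([1.0, 1.0]) / np.sqrt(2.0)   # low-pass
v = np.array([1.0, -1.0]) / np.sqrt(2.0)  # high-pass

def pyramid(approx, n_octaves):
    """One-dimensional Mallat pyramid: returns detail coefficients D(2^j, .) for
    j = 1, ..., n_octaves. Terms near the end of each array are affected by the
    finite sample (border effect) and would be discarded in practice."""
    details = []
    for _ in range(n_octaves):
        # A(2^{j+1}, k) = sum_{k'} u_{k'-2k} A(2^j, k');  D(2^{j+1}, k) analogous with v.
        full_low = np.convolve(approx, u[::-1], mode="full")
        full_high = np.convolve(approx, v[::-1], mode="full")
        approx = full_low[len(u) - 1 :: 2]        # correlate with u, downsample by 2
        details.append(full_high[len(v) - 1 :: 2])
    return details

# Octave-0 approximation coefficients are the series itself, as in (2.9).
y = np.random.default_rng(1).standard_normal(2**10)
D = pyramid(y, n_octaves=5)   # D[j-1] holds the detail coefficients at scale 2^j
```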
For notational simplicity, we thus suppose that

n_j = 2^{-j} n   (2.15)

holds exactly and only work with wavelet coefficients unaffected by the border effect.

Throughout the paper, we assume observations stem from the model (1.2). The independent "signal" X = \{X(t)\} and the noise component Z = \{Z(t)\} are \mathbb{R}^r-valued and \mathbb{R}^p-valued, respectively, where r is fixed and p = p(n). The deterministic matrix P = P(n) can be expressed as

M(p, r, \mathbb{R}) ∋ P(n) = (p_1(n), \ldots, p_r(n)), \quad \|p_q(n)\| = 1, \quad q = 1, \ldots, r.   (2.16)

We state the appropriate conditions for the consistency and asymptotic normality of wavelet log-eigenvalues directly in the wavelet domain. For j ∈ \mathbb{N} and k ∈ \mathbb{Z}, the random vectors D(a(n)2^j, k) ∈ \mathbb{R}^p, D_X(a(n)2^j, k) ∈ \mathbb{R}^r and D_Z(a(n)2^j, k) ∈ \mathbb{R}^p denote the wavelet coefficients of Y, X or Z, respectively, where \{a(n)\}_{n ∈ \mathbb{N}} is a dyadic sequence. Whenever well defined, the wavelet variance of Y at scale a(n)2^j is denoted by

W(a(n)2^j) = \frac{1}{n_{a,j}} \sum_{k=1}^{n_{a,j}} D(a(n)2^j, k) D(a(n)2^j, k)^*, \quad n_{a,j} = \frac{n}{a(n)2^j}.   (2.17)

The remaining wavelet variance terms W_X, W_{X,Z}, W_Z are naturally defined as

W_X(a(n)2^j) = \frac{1}{n_{a,j}} \sum_{k=1}^{n_{a,j}} D_X(a(n)2^j, k) D_X(a(n)2^j, k)^*,

W_{X,Z}(a(n)2^j) = \frac{1}{n_{a,j}} \sum_{k=1}^{n_{a,j}} D_X(a(n)2^j, k) D_Z(a(n)2^j, k)^*,

W_Z(a(n)2^j) = \frac{1}{n_{a,j}} \sum_{k=1}^{n_{a,j}} D_Z(a(n)2^j, k) D_Z(a(n)2^j, k)^*.   (2.18)

Further define the auxiliary random matrix

\widehat{B}_a(2^j) = P_H^{-1} a(n)^{-(H + (1/2)I)} W_X(a(n)2^j) a(n)^{-(H^* + (1/2)I)} (P_H^*)^{-1} ∈ S_{\geq 0}(r, \mathbb{R}),   (2.19)

as well as its mean B_a(2^j) := \mathbb{E} \widehat{B}_a(2^j). In (2.19), we assume that the scaling matrix H has the Jordan form

H = P_H \operatorname{diag}(h_1, \ldots, h_r) P_H^{-1}, \quad P_H ∈ GL(r, \mathbb{R}), \quad −1/2 < h_1 ≤ \ldots ≤ h_r < ∞.   (2.20)

We make use of the following assumptions in the main results of this paper (Section 3).

Assumption (A1): the wavelet variance

W(a(n)2^j) = P(n) W_X(a(n)2^j) P^*(n) + W_Z(a(n)2^j) + P(n) W_{X,Z}(a(n)2^j) + W^*_{X,Z}(a(n)2^j) P^*(n)   (2.21)

is well defined a.s.

Assumption (A2): with H as in (2.19),

\max\big\{ \|\mathbb{E} W_Z(a(n)2^j)\|, \|W_Z(a(n)2^j)\|, \|a(n)^{-H - (1/2)I} W_{X,Z}(a(n)2^j)\| \big\} = O_P(1).   (2.22)

Assumption (A3): the auxiliary random matrix \widehat{B}_a(2^j) in (2.19) satisfies

\big( \sqrt{n_{a,j}} (\operatorname{vec}_S \widehat{B}_a(2^j) − \operatorname{vec}_S B_a(2^j)) \big)_{j = j_1, \ldots, j_m} \overset{d}{\to} \mathcal{N}(0, Σ_B(j_1, \ldots, j_m)), \quad n \to \infty.   (2.23)

In addition,

\|B_a(2^j) − B(2^j)\| = O(a(n)^{-\varpi}), \quad n \to \infty, \quad j = j_1, \ldots, j_m,   (2.24)

where \varpi is the regularity parameter

\varpi = \min\Big\{ h_1 + \tfrac{1}{2}, \ \min_{1 \le q < r} (h_{q+1} − h_q) \Big\} + \Big(h_1 + \tfrac{1}{2}\Big) \mathbf{1}_{\{h_1 = \ldots = h_r\}},   (2.25)

and where B(2^j) ∈ S_{>0}(r, \mathbb{R}) and

\det(Σ_B(j_1, \ldots, j_m)) > 0.   (2.26)

Assumption (A4): the dimension p(n) and the scaling factor a(n) satisfy the relations

a(n) ≤ n_j, \quad \frac{a(n)}{n} + \frac{n}{a(n)^{2\varpi + 1}} \to 0, \quad \frac{p(n)}{n/a(n)} \to c ∈ [0, \infty), \quad n \to \infty,   (2.27)

where \varpi is given as in (2.25).

Assumption (A5): let P(n) ∈ M(p, r, \mathbb{R}) and P_H ∈ GL(r, \mathbb{R}) be as in (2.16) and (2.20), respectively. Let P(n)P_H = Q(n)R(n) be the QR decomposition of P(n)P_H and let \varpi be as in (2.25). Then, there exists a (deterministic) matrix A ∈ S_{>0}(r, \mathbb{R}) with Cholesky decomposition

A = R^* R   (2.28)

such that

\|R(n) − R\| = O(a(n)^{-\varpi}).   (2.29)
Assumptions (A1)–(A3) are stated in the wavelet domain. Assumption
(A1) holds under very general conditions. In fact, since X and Z are assumed independent, it suffices that \|P(n)\| < ∞ and that the wavelet random matrices W_X(a(n)2^j) and W_Z(a(n)2^j) be well defined a.s. Assumption
(A2) ensures that the influence of the wavelet random matrices W_{X,Z}(a(n)2^j) and W_Z(a(n)2^j) on the observed wavelet spectrum W(a(n)2^j) is not too large. Assumption (A3) is an asymptotic normality condition on the auxiliary matrix \widehat{B}_a(2^j), namely, on the latent wavelet random matrix W_X(a(n)2^j) after compensating for scaling. Note that this does not alter the scaling properties of the observed spectrum W(a(n)2^j). Assumption
(A4) controls the divergence rates among n, a(n) and p(n). Assumption
(A5) ensures that, asymptotically speaking, the angles between the column vectors of the matrix P(n)P_H converge in such a way that the matrix of asymptotic angles \lim_{n \to \infty} P_H^* P^*(n) P(n) P_H = A has full rank. This entails that P(n) does not significantly perturb the scaling properties of the hidden matrix W_X(a(n)2^j). In particular, (2.29) implies that

\|P_H^* P^*(n) P(n) P_H − A\| = O(a(n)^{-\varpi}).   (2.30)

A discussion of broad contexts where assumptions
(A2) and
(A3) hold is deferred to Section 4. In the following example, to fix ideas, we briefly illustrate assumption (A3).

Example 2.1
Recall that an ofBm is a multivariate (i) Gaussian, (ii) o.s.s., (iii) stationary-increment stochastic process. Suppose the \mathbb{R}^r-valued stochastic process X is an ofBm, and rewrite H = P_H J_H P_H^{-1} ∈ M(r, \mathbb{R}), where P_H ∈ GL(r, \mathbb{C}) and J_H is the Jordan form of H. Due to discrete sampling, an exact o.s.s.-type scaling relation does not hold for the matrices \widehat{B}_a(2^j), j ≥
0, in (2.19) for finite n. However, under mild assumptions on the ofBm, it follows from Theorem 3.1 and from the proof of Lemma C.2 in Abry and Didier (2018b) that \widehat{B}_a(2^j) satisfies (2.23) for some matrix B_a(2^j) ≡ B(2^j) ∈ S_{>0}(r, \mathbb{R}). Moreover,

B(2^j) = 2^{j(H + (1/2)I)} B(1) 2^{j(H + (1/2)I)^*}.   (2.31)

In particular, if H is diagonalizable with real eigenvalues, the matrix B(2^j) in (2.31) satisfies the entrywise scaling relations

B(2^j) = \big( b(2^j)_{ii'} \big)_{i, i' = 1, \ldots, r} = \big( 2^{j(h_i + h_{i'} + 1)} b(1)_{ii'} \big)_{i, i' = 1, \ldots, r}.   (2.32)

Further note that, in this case, condition (2.24) is satisfied when H has distinct eigenvalues or h_1 = \ldots = h_r (see Proposition B.2).
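For a diagonal H with real eigenvalues, the entrywise relation (2.32) follows directly from (2.31), since the scaling factors act coordinatewise. The short Python check below verifies this with an arbitrary illustrative B(1) and illustrative Hurst eigenvalues, both of which are assumptions made only for this example.

```python
import numpy as np

h = np.array([0.3, 0.7])                  # illustrative Hurst eigenvalues
B1 = np.array([[2.0, 0.5], [0.5, 1.0]])    # illustrative B(1), symmetric positive definite
j = 3

# (2.31) for diagonal H: B(2^j) = 2^{j(H + I/2)} B(1) 2^{j(H + I/2)^T}.
scaling = np.diag(2.0 ** (j * (h + 0.5)))
B2j = scaling @ B1 @ scaling.T

# Entrywise form (2.32): b(2^j)_{ii'} = 2^{j(h_i + h_{i'} + 1)} b(1)_{ii'}.
expected = (2.0 ** (j * (h[:, None] + h[None, :] + 1.0))) * B1
assert np.allclose(B2j, expected)
```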
3 Main results

We first establish that, after proper rescaling, the r largest eigenvalues of a wavelet random matrix in high dimensions converge in probability to deterministic functions ξ_q(2^j), q = 1, \ldots, r. Thus, such functions can be interpreted as asymptotic rescaled eigenvalues. Notably, they display a scaling property. Moreover, the remaining p(n) − r eigenvalues of a wavelet random matrix are bounded in probability.

Proposition 3.1
Assume that (W1)–(W4) and (A1)–(A5) hold, and suppose the dyadic scaling sequence \{a(n)\}_{n ∈ \mathbb{N}} satisfies a(n) \to \infty as n \to \infty. Then, for p = p(n), the limits

p\text{-}\lim_{n \to \infty} \frac{λ_{p−r+q}(W(a(n)2^j))}{a(n)^{2h_q + 1}} =: ξ_q(2^j), \quad q = 1, \ldots, r,   (3.1)

exist, and the deterministic functions ξ_q satisfy the scaling relation

ξ_q(2^j) = 2^{j(2h_q + 1)} ξ_q(1).   (3.2)

In addition,

λ_{p−r}(W(a(n)2^j)) = O_P(1).   (3.3)

In the following theorem, we establish the asymptotic normality of the r largest wavelet log-eigenvalues in high dimensions in the case of simple Hurst eigenvalues, or of identical Hurst eigenvalues under a condition on the rescaled limits ξ_q(2^j). In the subsequent corollary, we further establish the asymptotic normality of the wavelet eigenvalue regression estimator, as applied to the r largest wavelet log-eigenvalues. Note that the new condition (3.5) on the asymptotic rescaled eigenvalues is needed for the asymptotic normality of wavelet log-eigenvalues, since it ensures the finite-sample differentiability of the latter with respect to the Hurst parameters (see also Remark 3.1).

Theorem 3.1
Suppose (W1)–(W4) and (A1)–(A5) hold. Further suppose one of the following conditions holds, namely,

(i) either

−1/2 < h_1 < \ldots < h_r < ∞;   (3.4)

or

(ii) h_1 = \ldots = h_r and the functions ξ_q(2^j) in (3.1) satisfy

ξ_{q_1}(1) ≠ ξ_{q_2}(1), \quad q_1 ≠ q_2.   (3.5)

Then,

\mathbb{R}^{m \times r} ∋ \Big( \sqrt{n_{a,j}} \big( \log λ_{p−r+q}(W(a(n)2^j)) − \log λ_{p−r+q}(\mathbb{E} W(a(n)2^j)) \big)_{q = 1, \ldots, r} \Big)_{j = j_1, \ldots, j_m} \overset{d}{\to} \mathcal{N}(0, Σ_λ)   (3.6)

as n \to \infty. If we write the asymptotic covariance matrix in block form Σ_λ = \big( Σ_λ(jj') \big)_{j, j' = j_1, \ldots, j_m}, then its main diagonal entries satisfy Σ_λ(jj)_{ii} > 0, i = 1, \ldots, r.

Remark 3.1 When the scaling matrix eigenvalues are no longer simple, without condition (3.5) the asymptotic distribution of wavelet log-eigenvalues is generally non-Gaussian (in fixed dimensions, see Proposition A.1.1 in Li (2017); cf. Anderson (2003), Sections 13.5.1 and 13.5.2, on sample covariance matrices).
Remark 3.2
Condition (2.20) posits a diagonalizable scaling matrix H with real eigenvalues, an assumption that is used in both Proposition 3.1 and Theorem 3.1. However, by a slight adaptation of the proof of the former, a much more general consistency statement than (3.1) holds. In fact, suppose

H = P_H J_H P_H^{-1}, \quad J_H = \operatorname{diag}(J_{h_1}, \ldots, J_{h_{n'}}), \quad P_H ∈ GL(r, \mathbb{C}),

where each J_{h_ℓ} is a Jordan block of length r_ℓ, and −1/2 < ℜ h_1 ≤ \cdots ≤ ℜ h_{n'} < ∞. Fix j ∈ \mathbb{N} and suppose assumptions (W1)–(W4), (A1), (A2), (A
4) and (A
5) hold. In addition, suppose the auxiliary random matrix (2.19) satisfies

Ω_P(1) = λ_1(\widehat{B}_a(2^j)) ≤ λ_r(\widehat{B}_a(2^j)) = O_P(1).   (3.7)

Then,

\frac{1}{2} \Big( \frac{\log λ_{p−r+q}(W(a(n)2^j))}{\log a(n)} − 1 \Big) \overset{P}{\to} ℜ h_{q'}, \quad \frac{1}{2} \Big( \frac{\log λ_{p−r+q}(\mathbb{E} W(a(n)2^j))}{\log a(n)} − 1 \Big) \to ℜ h_{q'}, \quad q = 1, \ldots, r,   (3.8)

as n \to \infty, where q' ∈ \{1, \ldots, n'\} is such that

r_1 + r_2 + \ldots + r_{q'−1} < q ≤ r_1 + r_2 + \ldots + r_{q'}   (3.9)

and r_0 := 0 (cf. Abry and Didier (2018a), Theorem 3.1).

Proposition 3.1 and Theorem 3.1 bear direct consequences for the construction of statistical methodology. In the following definition, we describe the wavelet eigenvalue regression in a high-dimensional framework (cf. Abry and Didier (2018b, 2018a)).

Definition 3.1
Let \{W(a(n)2^j)\}_{j = j_1, \ldots, j_2} be the wavelet (variance) random matrices corresponding to scales \{a(n)2^{j_1}, \ldots, a(n)2^{j_2}\}. Fix a range of octaves

j = j_1, j_1 + 1, \ldots, j_2, \quad m := j_2 − j_1 + 1.   (3.10)

The wavelet eigenstructure estimator is given by the regression system

\{\widehat{ℓ}_q\}_{q = 1, \ldots, r} := \Big\{ \frac{1}{2} \Big( \sum_{j = j_1}^{j_2} w_j \log_2 λ_{p(n)−r+q}(W(a(n)2^j)) − 1 \Big) \Big\}_{q = 1, \ldots, r}.   (3.11)

In (3.11), w_j, j = j_1, \ldots, j_2, are weights satisfying the relations

\sum_{j = j_1}^{j_2} w_j = 0, \quad \sum_{j = j_1}^{j_2} j w_j = 1,   (3.12)

where w_j = 1 if m = 1 (see (3.10)).

If W(a(n)2^j) ∈ S_{>0}(p, \mathbb{R}), then the estimator (3.11) is well defined. Expression (3.11) can be interpreted as a collection of wavelet eigenstructure estimators of the Hurst eigenvalues, i.e., we can write \{\widehat{ℓ}_q\}_{q = 1, \ldots, r} = \{\widehat{h}_q\}_{q = 1, \ldots, r}.

In the following corollary, we characterize the asymptotic behavior of the wavelet eigenstructure estimator.

Corollary 3.1
Under the conditions of Theorem 3.1,

\sqrt{\frac{n}{a(n)}} \big( \widehat{h}_q − h_q \big)_{q = 1, \ldots, r} \overset{d}{\to} \mathcal{N}(0, M Σ_λ M^*),   (3.13)

as n \to \infty, for some weight matrix M (see (A.88)) and Σ_λ as in Theorem 3.1.

Note that, in (3.11), it is implicitly assumed that the effective dimension r, i.e., the dimension of X, is known. In the following definition, we introduce an estimator of r.

Definition 3.2
Let j_1 < j_2 and consider the regression weights w_j in (3.11) satisfying (3.12), and write

Δ_q(j_1, j_2) := \sum_{j = j_1}^{j_2} j w_j \frac{\log λ_q(W(a(n)2^j))}{\log(a(n)2^j)}.   (3.14)

Given κ >
0, we define

\widehat{r}(2^{j_1}, 2^{j_2}, κ) := \#\{ q : Δ_q(j_1, j_2) > κ \}.   (3.15)

Now note that, under the assumptions of Theorem 3.1, the lowest p − r eigenvalues stay bounded. Hence, for q ≤ p − r, the quantity Δ_q(j_1, j_2) tends to zero. On the other hand, for q > p − r, still under the assumptions of Theorem 3.1, (3.14) converges to 2h_{q−(p−r)} + 1 > 0, the scaling exponent of the corresponding rescaled eigenvalue of W(a(n)2^j). This phenomenon lies behind the following proposition, which establishes the consistency of the estimator \widehat{r}(2^{j_1}, 2^{j_2}, κ).

Proposition 3.2
Let \widehat{r}(2^{j_1}, 2^{j_2}, κ) be as in (3.15) with κ ∈ (0, 2h_1 + 1] and suppose the assumptions of Theorem 3.1 hold. Then,

\widehat{r}(2^{j_1}, 2^{j_2}, κ) \overset{P}{\to} r, \quad n \to \infty.   (3.16)
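To illustrate how Definitions 3.1 and 3.2 combine into an estimation procedure, the following Python sketch computes wavelet random matrices over a range of octaves, runs the log-eigenvalue regression (3.11) with weights satisfying (3.12), and thresholds the slope statistic (3.14) as in (3.15). The Haar-based coefficient routine, the particular regression weights and the threshold value are illustrative assumptions, not prescriptions from the paper.

```python
import numpy as np

def haar_detail_coeffs(Y, scale):
    """Haar-type detail coefficients of each row of Y at a dyadic scale (illustrative)."""
    n_coeffs = Y.shape[1] // scale
    blocks = Y[:, : n_coeffs * scale].reshape(Y.shape[0], n_coeffs, scale)
    half = scale // 2
    return (blocks[:, :, half:].sum(-1) - blocks[:, :, :half].sum(-1)) / np.sqrt(scale)

def regression_weights(octaves):
    """One choice of weights w_j with sum(w_j) = 0 and sum(j * w_j) = 1, cf. (3.12)."""
    j = np.asarray(octaves, dtype=float)
    return (j - j.mean()) / np.sum((j - j.mean()) ** 2)

def wavelet_eigen_regression(Y, a, octaves, kappa=0.3, r_max=3):
    """Log-eigenvalue regression (3.11) and effective-dimension estimator (3.15), sketched."""
    w = regression_weights(octaves)
    log_eigs = []
    for j in octaves:
        D = haar_detail_coeffs(Y, a * 2 ** j)
        W = D @ D.T / D.shape[1]                         # wavelet random matrix, cf. (2.17)
        log_eigs.append(np.log2(np.sort(np.linalg.eigvalsh(W))[-r_max:]))
    log_eigs = np.array(log_eigs)                        # shape (m, r_max), ascending order

    h_hat = (w @ log_eigs - 1.0) / 2.0                   # cf. (3.11)
    scale_log2 = np.log2(a) + np.asarray(octaves, dtype=float)
    jw = np.asarray(octaves) * w
    delta = (jw[:, None] * log_eigs / scale_log2[:, None]).sum(axis=0)   # cf. (3.14)
    r_hat = int((delta > kappa).sum())                   # cf. (3.15)
    return h_hat, r_hat

# Toy usage on pure noise (no latent fractal component), so r_hat will typically be 0 here.
rng = np.random.default_rng(2)
Y = rng.standard_normal((30, 2**14))
h_hat, r_hat = wavelet_eigen_regression(Y, a=4, octaves=range(1, 5))
```

In practice, the threshold κ and the octave range (j_1, j_2) would be tuned to the scaling range of interest, in line with Proposition 3.2.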
4 Gaussian and non-Gaussian frameworks

Recall that assumptions (A2) and
(A3) (i.e., (2.22) and (2.23), respectively) are stated in the wavelet domain. In this section, we provide a Gaussian framework as well as Gaussian and non-Gaussian examples where such assumptions are satisfied.
4.1 A Gaussian framework

When X and Z are both Gaussian, we can show that conditions (2.21)–(2.23) hold under very general assumptions on the processes. For the sake of generality, we consider an array of (colored) noise density functions. In other words, we assume that, for each n, a new set of such functions is randomly picked. Also, to facilitate comparison with (multivariate) wavelet eigenanalysis for ofBm, we suppose the hidden process X has (first order) stationary increments. This is not a restrictive assumption, as discussed in Remark 4.1.

Proposition 4.1 Suppose assumptions (W1)–(W4) are in place. Further suppose the following conditions hold.

(i) The \mathbb{R}^p-valued (noise) stochastic process Z = \{Z(t)\}_{t ∈ \mathbb{Z}} as in (1.2) has spectral density

g_Z(x) ∈ S_{\geq 0}(p, \mathbb{R}), \quad x ∈ [−π, π).   (4.1)

In (4.1), g_Z = \operatorname{diag}(g_{1,n}, \ldots, g_{p,n}), where g_{ℓ,n}, ℓ = 1, \ldots, p, are drawn independently according to a distribution F(dg) on a class

\mathcal{G} ⊆ L^1[−π, π)   (4.2)

of univariate functions g such that

\sup_{g ∈ \mathcal{G}} \sup_{x ∈ [−π, π)} |g(x)| < ∞, \quad |\mathbb{E} g(x) − \mathbb{E} g(0)| = O(x), \quad x \to 0.   (4.3)

In addition, suppose that, conditionally on g, the univariate covariance function γ_{Z, g} satisfies

|γ_{Z, g}(h)| ≤ C_1 e^{−C_2 |h|}, \quad h ∈ \mathbb{Z},   (4.4)

where the constants C_1, C_2 > 0 do not depend on the (univariate) density g;

(ii) the measure F(dg) is independent of X in (1.2) and of P = P(n) in (1.2);

(iii) 0 < h_1 ≤ \ldots ≤ h_r < 1, and the stationary-increment, \mathbb{R}^r-valued stochastic process X = \{X(t)\}_{t ∈ \mathbb{Z}} has spectral density

g_X(x) = |x|^{−(H + (1/2)I)} s(x) |x|^{−(H^* + (1/2)I)} ∈ S_{\geq 0}(r, \mathbb{R}),   (4.5)

x ∈ [−π, π). In (4.5), s(x) is an S_{\geq 0}(r, \mathbb{C})-valued, entry-wise bounded and continuous, Hermitian function such that, for some δ_s ≥ \varpi (cf. (2.25)),

\|s(x) − s(0)\| ≤ C |x|^{δ_s}, \quad x ∈ [−π, π), \quad λ_1(s(0)) > 0.   (4.6)

Then, conditions (2.22) and (2.23) (i.e., (A2) and (A3)) are satisfied.

Example 4.1
Suppose that, for any fixed n, the r-variate and p(n)-variate processes X and Z are independent. Further assume that X is an ofBm and that, for fixed integers p_0 and q_0, Z is made up of entry-wise independent, identically distributed ARMA(p_0, q_0) processes (in particular, F(dg) is a Dirac delta on the associated spectral density). Then, the assumptions of Proposition 4.1 are met. In particular, the spectral density of ofBm is given by

S_{\geq 0}(r, \mathbb{R}) ∋ g_X(x) = \sum_{ℓ ∈ \mathbb{Z}} |x + 2πℓ|^{−H − (1/2)I} |x + 2πℓ|^{−H^* − (1/2)I}, \quad x ∈ [−π, π).

Remark 4.1
As a consequence of Proposition 4.1, the wavelet eigenvalue regression methodology, originally developed for ofBm (fixed dimensions) in Abry and Didier (2018a), carries over to a broad class of multivariate fractional, Gaussian, stationary-increment processes (see Theorem 3.1 and Lemma C.2 in that reference).

In addition, note that the assumption that X has (first order) stationary increments is not crucial. By a simple adaptation of the proof of Proposition 4.1, if X has stationary increments of any order k ∈ \mathbb{N}, it is possible to show that analogous conclusions hold as long as the number of vanishing moments of the underlying wavelet basis (see (2.3)) is sufficiently large. This reflects the suitability of wavelet analysis for stationary-increment processes of any order. This topic has been broadly explored in the literature (e.g., Moulines et al. (2008), Roueff and Taqqu (2009), Abry et al. (2019); see also Proposition C.4 in this paper).

4.2 Non-Gaussian instances: a discussion

Developing high-dimensional, non-Gaussian, second-order and fractional frameworks is a very broad topic that lies well outside the scope of this paper. In this section, we restrict ourselves to discussing certain non-Gaussian instances so as to help illustrate the fact that the wavelet-domain methodology constructed in Sections 2 and 3 does not require the system (1.2) to be Gaussian.
Example 4.2
Let X = \{X(t)\}_{t ∈ \mathbb{Z}} be a possibly non-Gaussian, \mathbb{R}^r-valued stochastic process made up of independent linear fractional processes with finite second moments. Then, based on the framework constructed in Roueff and Taqqu (2009), we can show that, under conditions (W1)–(W4) and suitable assumptions on X, the associated random matrix \widehat{B}_a(2^j) ∈ S_{>0}(r, \mathbb{R}) satisfies condition (2.23) (i.e., assumption (A3)). Although the r components of X are assumed independent, note that estimation of the r univariate Hurst exponents based on the associated model (1.2) is still, in general, a nontrivial problem. This is so due to the presence of the unknown rectangular coordinates matrix P, as well as of the high-dimensional noise component Z. In fact, mathematically speaking, such a model is closely related to so-called blind source separation problems, which are of great interest in the field of signal processing (e.g., Comon and Jutten (2010); in a fractional context, see Abry et al. (2019)).

Example 4.3
Recall that a random variable ξ is called sub-Gaussian when

\inf\big\{ t > 0 : \mathbb{E} \exp(ξ^2/t^2) ≤ 2 \big\} < ∞.

This is equivalent to the notion that the tails of the distribution of ξ are no heavier than those of the Gaussian distribution (Vershynin (2018), Proposition 2.5.2). Sub-Gaussian distributions form a broad family that includes the Gaussian distribution itself, as well as compactly supported distributions, for example.

So, suppose the noise process \{Z(t)\}_{t ∈ \mathbb{Z}} consists of (discrete-time) i.i.d. sub-Gaussian observations. Consider the wavelet analysis framework provided by the Haar scaling and wavelet functions, i.e.,

φ(t) = 1_{[0, 1)}(t), \quad ψ(t) = −1_{[0, 1/2)}(t) + 1_{[1/2, 1)}(t),   (4.7)

respectively (Mallat (2009), p. 291). Starting from (4.7), suppose the wavelet coefficients are computed by means of Mallat's iterative procedure (2.10). Then, we can show that the wavelet random matrix W_Z satisfies condition (2.22) (i.e., assumption (A2)).

5 Conclusion

In this paper, we mathematically construct wavelet eigenanalysis methodology (Abry and Didier (2018b, 2018a)) in high dimensions by characterizing the scaling behavior of the eigenvalues of large wavelet random matrices. We assume that possibly non-Gaussian, finite-variance p-variate measurements are made of a low-dimensional r-variate (r ≪ p) fractional stochastic process with unknown scaling coordinates and in the presence of additive high-dimensional noise. In the high-dimensional limit, we establish that the rescaled r largest eigenvalues of the wavelet random matrices converge to scale-invariant functions, whereas the remaining p − r eigenvalues remain bounded. In addition, we show that, up to a log transformation, the r largest eigenvalues of wavelet random matrices exhibit asymptotically Gaussian distributions. We further show how the asymptotic and large-scale behavior of wavelet eigenvalues can be used to construct statistical inference methodology for a high-dimensional signal-plus-noise system.

This research leads to many relevant open problems in high dimensions: (i) an extended study of the theory of wavelet eigenvalues in specific second-order, non-Gaussian fractional frameworks, especially in regard to the impact of heavier tails; (ii) further investigation of the behavior of wavelet random matrices beyond the model (1.2); (iii) developing efficient testing procedures for the identification of distinct Hurst eigenvalues; (iv) applications in fields where a profusion of high-dimensional data is observed, such as physics, neuroscience and signal processing.

A Proofs: Section 3
In this section, whenever convenient we simply write p = p ( n ), a = a ( n ) or P = P ( n ). For theproofs, it is useful to first recall the variational characterization of the eigenvalues of a matrix M ∈ S ( p, C ) provided by the Courant–Fischer principle. In other words, fix q ∈ { , . . . , p } andconsider the ordered eigenvalues λ ( M ) ≤ . . . ≤ λ p ( M ) . (A.1)By the Courant–Fischer principle, we can express λ q ( M ) = inf U q sup u ∈U q ∩ S p − C u ∗ M u = sup U p − q +1 inf u ∈U p − q +1 ∩ S p − C u ∗ M u , (A.2)where U q is a q -dimensional subspace of C n (Horn and Johnson (2012), Chapter 4). A relateduseful idea is the following. For M ∈ S ( p, R ), let u , . . . , u p be unit eigenvectors of M associatedwith the eigenvalues (A.1) respectively. Then, for q ∈ { , . . . , p } , we can further express λ q ( M ) = inf u ∈ span { u q ,..., u p }∩ S p − u ∗ M u = sup u ∈ span { u ,..., u q }∩ S p − u ∗ M u . (A.3)In proofs, we often use the following notation. For an arbitrary q ∈ { , . . . , r } , I − , I and I + are index sets given by the relations I − := { ℓ : h ℓ < h q } , I := { ℓ : h ℓ = h q } , I + := { ℓ : h ℓ > h q } . (A.4)Note that I − and I + are possibly empty. Also write r := card( I − ) , r := card( I ) ≥ , r := card( I + ) . (A.5)Throughout this section, for notational simplicity, we write P ( n ) P H ≡ P ( n ) ≡ P, (A.6)whose column vectors are denoted by p ℓ ( n ) ≡ p ℓ , ℓ = 1 , . . . , r . Proof of Proposition 3.1 : Statement 3.3 is a consequence of Lemma (B.1), and the scalingrelationship (3.2) is an immediate consequence of the limit (3.1), since if such a limit holds, writing e a ( n ) = a ( n )2 j , ξ q (2 j ) = p -lim n →∞ λ p − r + q ( W ( a ( n )2 j ) a ( n ) h q +1 = p -lim n →∞ λ p − r + q ( W ( e a ( n ))( e a ( n )2 − j ) h q +1 = 2 j (2 h q +1) ξ q (1) . So, we now show (3.1). 14ix q ∈ { , . . . , r } . Let I and the (possibly empty) sets I − , I + be given by the relations (A.4),and consider r , r and r as defined in (A.5). Let P H be as in (2.20). Starting from conditions(2.22) and (2.23), consider the decomposition W ( a j ) a h q +1 = P a h b B a (2 j ) a h P ∗ a h q + O P (1) a h q +1 + P a h O P (1) a h q +1 / + O P (1) a h P ∗ a h q +1 / . (A.7)If I + = ∅ , then λ p − r + q ( W ( a j )) a hq +1 = O P (1). Thus, it suffices to show that every subsequence of λ p − r + q ( W ( a j )) a hq +1 that is convergent in probability converges to the same limit. So, suppose two suchlimits exist along some subsequences n a , n b , and write p -lim n i →∞ λ p − r + q ( W ( a ( n i )2 j )) a ( n i ) h q +1 =: ξ q,i (2 j ) , i = a , b . (A.8)The goal is to show that ξ q, a (2 j ) ≤ ξ q, b (2 j ) . (A.9)Once (A.9) is established, we can exchange the roles of n a , n b in the proof and obtain the desiredequality ξ q, a (2 j ) = ξ q, b (2 j ) . (A.10)So, by passing to further subsequences if necessary, for i = a , b and ℓ = 1 , . . . , r we can assumethat there are deterministic vectors γ iℓ such that P ∗ ( n i ) u p − r + ℓ ( n i ) → γ iℓ ∈ R r a.s. , n i → ∞ . (A.11)Moreover, Γ i := ( γ i , . . . , γ ir ) ∈ GL ( r, R ) , i = a , b , (A.12)by Lemma B.4. Now, take any nonzero vector e γ ∈ span { γ a q , . . . , γ a r } ∩ span { γ b , . . . , γ b q } ∩ S r − , (A.13)whose existence is guaranteed since dim span { γ a q , . . . , γ a r } + dim span { γ b , . . . , γ b q } = r + 1. In viewof (A.13), there are scalars α q , . . . , α r such that e γ = r X ℓ = q α ℓ γ a ℓ =: Γ a α ∈ S r − . 
(A.14)Along the subsequence n a , define the unit (random) vector v a ( n a ) = P rℓ = q α ℓ u p − r + ℓ ( n a ) k P rℓ = q α ℓ u p − r + ℓ ( n a ) k = P rℓ = q α ℓ u p − r + ℓ ( n a ) k α k . Then, by (A.11) and for the deterministic vector e γ as in (A.13), P ∗ ( n a ) v a ( n a ) P → e γ k α k , n a → ∞ , (A.15)where α := (0 , . . . , , α q , . . . , α r ) ∈ R r . Moreover, by (A.3), λ p − r + q ( W ( a ( n a )2 j )) a ( n a ) h q +1 = inf u ∈ span { u p − r + q ( n a ) ,..., u p ( n a ) }∩ S p − u ∗ W ( a ( n a )2 j ) a ( n a ) h q +1 u v ∗ a ( n a ) W ( a ( n a )2 j ) a ( n a ) h q +1 v a ( n a ) . Taking limits, expressions (A.7), (A.8) and (A.15) imply that k α k ξ q, a (2 j ) ≤ e γ ∗ h diag(0 , . . . , | {z } r , , . . . , | {z } r ) B (2 j )diag(0 , . . . , | {z } r , , . . . , | {z } r ) ie γ . (A.16)Now, again in view of (A.13), there exist scalars β , . . . , β q such that e γ = q X ℓ =1 β ℓ γ b ℓ =: Γ b β ∈ R r . (A.17)So, along the subsequence n b , define the (random) vector v b ( n b ) = P qℓ =1 β ℓ u p − r + ℓ ( n b ) k P qℓ =1 β ℓ u p − r + ℓ ( n b ) k = P qℓ =1 β ℓ u p − r + ℓ ( n b ) k β k . Again by (A.11) and for e γ as in (A.13), P ∗ ( n b ) v b ( n b ) P → e γ k β k , n b → ∞ , (A.18)where β = ( β , . . . , β q , , . . . , ∈ R r .In addition, again by (A.3), λ p − r + q ( W ( a ( n b )2 j )) a ( n b ) h q +1 = sup u ∈ span { u p − r +1 ( n b ) ,..., u p − r + q ( n b ) }∩ S p − u ∗ W ( a ( n b )2 j ) a ( n b ) h q +1 u ≥ v ∗ b ( n b ) W ( a ( n b )2 j ) a ( n b ) h q +1 v b ( n b ) . Taking limits, (A.8) and (A.18) imply that k β k ξ q, b (2 j ) ≥ e γ ∗ h diag(0 , . . . , | {z } r , , . . . , | {z } r ) B (2 j )diag(0 , . . . , | {z } r , , . . . , | {z } r ) ie γ . (A.19)As a consequence of (A.16) and (A.19), k α k ξ q, a (2 j ) ≤ k β k ξ q, b (2 j ) . (A.20)Now, recall that Γ i is given by (A.12). Note that, by conditions (2.30) and (A.6), for some A ∈ S > ( r, R ), Γ i Γ ∗ i = p -lim n i →∞ P ∗ ( n i ) P ( n i ) = A, i = a , b . (A.21)So, recast Γ i = V i Σ i W ∗ i , i = a , b , (A.22)where Σ i ∈ GL ( r, R ) is a diagonal matrix containing the singular values of Γ i and V i , W i ∈ U ( r ), i = a , b . By (A.21), V a Σ a V ∗ a = Γ a Γ ∗ a = A = Γ b Γ ∗ b = V b Σ b V ∗ b . (A.23)16ecall that the diagonal entries of Σ i , i = a, b , are nonnegative. Thus, based on the two spectraldecompositions of A in (A.23), without loss of generality we can assume that V := V a = V b , Σ := Σ a = Σ b . (A.24)However, in view of (A.12), (A.14) and (A.17),Γ a α = e γ = Γ b β . So, relations (A.22) and (A.24) imply that k α k = k Γ − a e γ k = k W a Σ − V ∗ e γ k = k Σ − V ∗ e γ k = k W b Σ − V ∗ e γ k = k Γ − b e γ k = k β k . (A.25)By (A.20), expression (A.9) holds, as claimed. As anticipated, we can now use the symmetricroles played by n a , n b in the proof of (A.9) to further conclude that (A.10) holds.This completes the proof for the case I + = ∅ .Now, suppose that I + = ∅ and fix q ∈ { , . . . , r } . By relation (B.17) (itself a consequence ofLemma B.2), λ p − r + q ( W ( a j )) a h q +1 = O P (1) . So, it suffices to show that every subsequence of λ p − r + q ( W ( a j )) a hq +1 that is convergent in probabilityconverges to the same limit. Proceeding as for the case I + = ∅ , suppose two such limits existalong some subsequences n a , n b as in (A.8). 
As in (A.9) and (A.10), the goal is to show that ξ q, a (2 j ) ≤ ξ q, b (2 j ) and then use the symmetry in the argument to conclude that ξ q, a (2 j ) = ξ q, b (2 j ).So, analogously to the argument for the case I + = ∅ , by passing to further subsequences ifnecessary, expressions (A.11) and (A.12) hold. For i = a , b , let γ iℓ,k be the k –th entry of the limitvector γ iℓ in (A.11), ℓ = 1 , . . . , r . As a consequence of expression (B.10) in Lemma B.2, for i = a , b and k ∈ I + , the k –th entries of the limit vector γ iℓ in (A.11) are zero whenever ℓ ∈ I − ∪ I . Inother words, we can write ℓ ∈ I − ∪ I ⇒ γ iℓ = (cid:16) γ iℓ, , . . . , γ iℓ,r + r , , . . . , (cid:17) ∗ , i = a , b . Thus, by Lemma B.4,span { γ a , . . . , γ a r − r } = span { e ℓ , ℓ ∈ I + } ⊥ = span { γ b , . . . , γ b r − r } , (A.26)where e ℓ is the ℓ -th standard Euclidean basis vector in R r . So, take any nonzero vector e γ ∈ span { γ a q , . . . , γ a r − r } ∩ span { γ b , . . . , γ b q } ∩ S r − . (A.27)The existence of e γ is guaranteed since dim span { γ a q , . . . , γ a r − r } +dim span { γ b , . . . , γ b q } = r − r +1,and the spanning sets in (A.26) generate the same ( r − r )-dimensional space. Now, for γ = ( γ , . . . , γ r ) ∈ R r and x = ( x r − r +1 , . . . , x r ) ∈ R r , (A.28)define the vector y ( γ ; x ) = (cid:16) , . . . , | {z } r , γ r +1 , . . . , γ r + r | {z } r , x r − r +1 , . . . , x r | {z } r (cid:17) ∗ ∈ R r . (A.29)17ased on (A.29), define the (deterministic) function g ( γ ; x ) := y ∗ ( γ ; x ) B (2 j ) y ( γ ; x ) ∈ R . (A.30)For fixed γ ∈ R r , as a function of x we can reexpress g ( γ ; · ) as g ( γ ; x ) = x ∗ Q x + h c , x i + d, where Q = ( b ii ′ ) i,i ′ ∈I + , c ∈ R r and d are constants. Note that Q = ( b ii ′ ) i,i ′ ∈I + ∈ S > ( r, R ), since b B (2 j ) ∈ S > ( r, R ), i.e., the mapping x x ∗ Q x is strictly convex. Consequently, for a fixed γ , g ( γ ; · ) is also strictly convex, and since the function g ( γ ; · ) is quadratic in the argument x , it hasa unique global minimum. Also, for fixed γ ∈ R r and x ∈ R r as in (A.28), define the vector y n ( γ ; x ) = (cid:16) γ a ( n ) h − h q , . . . , γ r a ( n ) h r − h q | {z } r , γ r +1 , . . . , γ r + r | {z } r , x r − r +1 , . . . , x r | {z } r (cid:17) ∗ . (A.31)Based on (A.31), analogously to (A.30), define the random function g n ( γ ; x ) := y ∗ n ( γ ; x ) b B a (2 j ) y n ( γ ; x ) . (A.32)Let x ∗ , e γ = ( x ∗ , e γ ,ℓ ) ℓ ∈I + ∈ R r (A.33)be the unique (deterministic) minimizer of the function g ( e γ ; · ) as defined by (A.30). By (A.27),there are scalars α q , . . . , α r − r based on which we may reexpress e γ = r − r X ℓ = q α ℓ γ a ℓ . (A.34)For fixed x ∗ , e γ , • as in (A.33), Lemma B.5 implies that, along the subsequence n a , there is a sequenceof unit vectors v a ( n a ) ∈ span { u p − r + q ( n a ) , . . . , u p ( n a ) } (A.35)such that, with probability tending to 1 as n a → ∞ , their angles with respect to the (non-unit)column vectors of P ( n a ) satisfy h p ℓ ( n a ) , v a ( n a ) i = x ∗ , e γ ,ℓ a ( n a ) h ℓ − h q , ℓ ∈ I + . (A.36)Moreover, still by Lemma B.5, e γ a ( n a ) := P ∗ ( n a ) v a ( n a ) P → e γ k α k , n a → ∞ , where, for α q , . . . , α r − r as in (A.34), we write α = (0 , . . . , , α q , . . . , α r − r , , . . . , ∈ R r . So, for each n a consider the function g n a ( e γ a ( n a ); x ∗ , e γ ) as defined by (A.32). 
Note that, based on(A.35) and (A.36), we can reexpress g n a ( e γ a ( n a ); x ∗ , e γ ) = v ∗ a ( n a ) P ( n a ) a ( n a ) h b B a (2 j ) a ( n a ) h a ( n a ) h q P ∗ ( n a ) v a ( n a ) . λ p − r + q ( W ( a ( n a )2 j )) a ( n a ) h q +1 = inf u ∈ span { u p − r + q ( n a ) ,..., u p ( n a ) } u ∗ W ( a ( n a )2 j ) a ( n a ) h q +1 u ≤ v ∗ a ( n a ) W ( a ( n a )2 j )) a ( n a ) h q +1 v a ( n a )= g n a ( e γ a ( n a ); x ∗ , e γ ) + v ∗ a ( n a ) (cid:16) W ( a ( n a )2 j ) a ( n a ) h q +1 − P ( n a ) a ( n a ) h b B a (2 j ) a ( n a ) h P ∗ ( n a ) a ( n a ) h q (cid:17) v a ( n a ) . (A.37)However, by (A.36), a ( n a ) h − h q I P ∗ ( n a ) v a ( n a ) = y n a ( e γ a ( n a ); x ∗ , e γ ) P → k α k − y ( e γ , x ∗ , e γ ) , n a → ∞ . In particular, a ( n a ) h − h q I P ∗ ( n a ) v a ( n a ) = O P (1) along the subsequence n a . Thus, in view ofexpression (A.7) and still along n a , v ∗ a ( n a ) (cid:16) W ( a j ) − P a h b B a (2 j ) a h P ∗ a h q (cid:17) v a ( n a )= o P (1) + v ∗ a ( n a ) (cid:16) P a h − h q I O P (1) a h q +1 / + O P (1) a h − h q I P ∗ a h q +1 / (cid:17) v a ( n a ) = o P (1) . Consequently, g n a ( e γ a ( n a ); x ∗ , e γ ) + v ∗ a ( n a ) (cid:16) W ( a j ) a h q +1 − P a h b B a (2 j ) a h P ∗ a h q (cid:17) v a ( n a ) P → g ( e γ ; x ∗ , e γ ) k α k , n a → ∞ . (A.38)So, taking lim n a →∞ in (A.37), expressions (A.8) and (A.38) imply that ξ q, a (2 j ) ≤ g ( e γ ; x ∗ , e γ ) k α k . (A.39)On the other hand, proceeding in the same fashion as with expression (A.17), there are scalars β , . . . , β q based on which we can reexpress e γ = q X ℓ =1 β ℓ γ b ℓ . (A.40)Moreover, set v b ( n b ) = P qℓ =1 β ℓ u p − r + ℓ ( n b ) k P qℓ =1 β ℓ u p − r + ℓ ( n b ) k . (A.41)Then, by (A.11), P ∗ ( n b ) v b ( n b ) P → e γ k β k , n b → ∞ , (A.42)where, for β , . . . , β q as in (A.40), we write β := ( β , . . . , β q , , . . . , ∈ R r . x ∈ R r such that (cid:16) a ( n b ) h ℓ − h q h p ℓ ( n b ) , v b ( n b ) i (cid:17) ℓ ∈I + P → x k β k , n b → ∞ . (A.43)So, write e γ b ( n b ) = P ∗ ( n b ) v b ( n b ). Then, by (A.3), λ p − r + q ( W ( a ( n b )2 j )) a ( n b ) h q +1 = sup u ∈ span { u p − r +1 ( n b ) ,..., u p − r + q ( n b ) } u ∗ W ( a ( n b )2 j ) a ( n b ) h q +1 u ≥ v ∗ b ( n b ) W ( a ( n b )2 j ) a ( n b ) h q +1 v b ( n b )= g n b (cid:16)e γ b ( n b ); (cid:0) a ( n b ) h ℓ − h q h p ℓ ( n b ) , v b ( n b ) i (cid:1) ℓ ∈I + (cid:17) + v ∗ b ( n b ) (cid:16) W ( a j ) − P a h b B a (2 j ) a h P ∗ a h q (cid:17) v b ( n b ) . (A.44)However, by expressions (A.7) and (A.32), the right-hand side of (A.44) can be rewritten as y ∗ n (cid:16)e γ b ( n b ); (cid:0) a ( n b ) h ℓ − h q h p ℓ ( n b ) , v b ( n b ) i (cid:1) ℓ ∈I + (cid:17) b B a (2 j ) y n (cid:16)e γ b ( n b ); (cid:0) a ( n b ) h ℓ − h q h p ℓ ( n b ) , v b ( n b ) i (cid:1) ℓ ∈I + (cid:17) + o P (1) + v ∗ b ( n b ) (cid:16) P a h − h q I O P (1) a h q +1 / + O P (1) a h − h q I P ∗ a h q +1 / (cid:17) v b ( n b ) P → g ( e γ ; x ) k β k ≥ g ( e γ ; x ∗ , e γ ) k β k , n b → ∞ , (A.45)where x ∗ , e γ is defined as in (A.33). In (A.45), the limit is a consequence of relations (A.42) and(A.43). By taking lim n b →∞ in (A.44), expressions (A.8) and (A.45) imply that ξ q, b (2 j ) ≥ g ( e γ ; x ∗ , e γ ) k β k . By repeating the argument leading to (A.25), we find that k α k = k β k , implying ξ q, b (2 j ) ≥ g ( e γ ; x ∗ , e γ ) k α k ≥ ξ q, a (2 j ) . 
Analogously, by exchanging the roles of the subsequences n a and n b in the argument, we find that ξ q, a (2 j ) = g ( e γ , x ∗ , e γ ) k α k = ξ q, b (2 j ) . (A.46)This finishes the proof for the case where I + = ∅ . Thus, (3.1) is established. (cid:3) Proof of Theorem 3.1 : Recall that, in this section, for notational simplicity we work under(A.6), namely, P ( n ) P H ≡ P ( n ) ≡ P . Throughout the proof, we use h as a shorthand for thematrix diag( h , . . . , h r ). In view of (2.22), for notational simplicity we write W Z ( a ( n )2 j ) = O P (1) , a ( n ) − h − (1 / I W X,Z ( a ( n )2 j ) = O P (1) , (A.47)20here the appropriate matrix dimensions are implicit. To avoid needless duplication in provingthe result under conditions (3.5) and (3.6) separately, we also continue to make use of the notation(A.4). So, rewrite log λ p − r + q (cid:16) W ( a ( n )2 j ) a h q +1 (cid:17) − log λ p − r + q (cid:16) E W ( a ( n )2 j ) a h q +1 (cid:17) = n log λ p − r + q (cid:16) P a h b B a (2 j ) a h P ∗ a h q + O P (1) a h q +1 + P a h O P (1) a h q +1 / + O P (1) a h P ∗ a h q +1 / (cid:17) − log λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + O P (1) a h q +1 + P a h O P (1) a h q +1 / + O P (1) a h P ∗ a h q +1 / (cid:17)o + n log λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + O P (1) a h q +1 + P a h O P (1) a h q +1 / + O P (1) a h P ∗ a h q +1 / (cid:17) − log λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + O P (1) a h q +1 (cid:17)o + n log λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + O P (1) a h q +1 (cid:17) − log λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q (cid:17)o . (A.48)We consider each term on the right-hand side of (A.48) separately. Note that there is functionaldependence between the matrices b B a (2 j ) ∈ S ≥ ( r, R ), O P (1) ∈ S ≥ ( p, R ) and O P (1) ∈ M ( r, p, R ).However, we will construct partial mean value theorem-type expansions of each term in (A.48)where each of these matrix arguments is first treated as functionally independent variables, andthen plug back in the actual value of the matrix argument.For B ∈ S > ( r, R ), define the functions f n,q, ( B ) := log λ p − r + q (cid:16) P a h B a h P ∗ a h q + O P (1) a h q +1 + P a h O P (1) a h q +1 / + O P (1) a h P ∗ a h q +1 / (cid:17) , (A.49) f n,q, ( R ) := log λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + W Z ( a j ) a h q +1 + P a h a h q R + R ∗ a h P ∗ a h q (cid:17) (A.50)and f n,q, ( R ) := log λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + R (cid:17) . (A.51)The layout of the proof is as follows. We will establish the convergence (cid:16) √ n a,j (cid:16) f n,q, ( b B a (2 j )) − f n,q, ( B a (2 j )) (cid:17) q =1 ,...,r (cid:17) j = j ,...,j d → N (0 , Σ λ ) , (A.52)and also that r na n f n,q, (cid:16) a − h − (1 / I W X,Z ( a j ) a h q +1 / (cid:17) − f n,q, ( ) (cid:17)o = o P (1) , (A.53) r na (cid:16) f n,q, (cid:16) W Z ( a j ) a h q +1 (cid:17) − f n,q, ( ) (cid:17) = o P (1) . (A.54)Indeed, as a consequence of (A.52), (A.53) and (A.54), (cid:16) √ n a,j (cid:16) log λ p − r + q ( W ( a ( n )2 j )) − log λ p − r + q ( E W ( a ( n )2 j )) (cid:17) q =1 ,...,r (cid:17) j = j ,...,j = (cid:16) √ n a,j (cid:16) f n,q, ( b B a (2 j )) − f n,q, ( B a (2 j )) (cid:17) q =1 ,...,r (cid:17) j = j ,...,j + o P (1) d → N (0 , Σ λ ) , n → ∞ , which proves (3.6).We proceed first to establish (A.52). 
Recall that, for any M ∈ S ( p, R ), the differential of a simple eigenvalue λ q ( M ) exists in a connected vicinity of M and is given by dλ q ( M ) = u ∗ q { dM } u q , (A.55)where u q is a unit eigenvector of M associated with λ q ( M ) (Magnus (1985), p. 182, Theorem 1).Now, fix an arbitrary q ∈ { , . . . , r } , and let I and (the possibly empty) sets I − , I + be as in(A.4). The eigenvalue λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + O P (1) a h q +1 + P a h O P (1) a h q +1 / + O P (1) a h P ∗ a h q +1 / (cid:17) (A.56)must be simple for large enough n with probability tending to 1. In fact, for M n = O P (1) a hq +1 + P a h O P (1) a hq +1 / + O ∗ P (1) a h P ∗ a hq +1 / and R n = O P (1) a hq +1 / , Lemma B.2 implies that u ∗ p − r + q ( n, R n ) M n u p − r + q ( n, R n ) = o P (1). Therefore, by Corollary B.1, λ p − r + ℓ (cid:16) P a h B a (2 j ) a h P ∗ a h q + O P (1) a h q +1 + P a h O P (1) a h q +1 / + O ∗ P (1) a h P ∗ a h q +1 / (cid:17) P → , ℓ ∈ I − ; ξ ℓ (2 j ) , ℓ ∈ I ; ∞ , ℓ ∈ I + , as n → ∞ , where ξ ℓ (2 j ), ℓ ∈ I , are given by (3.1) and are pairwise distinct by condition (3.5).So, for any fixed (and small enough) ε >
0, let O n = { B ∈ S > ( r, R ) : k B − B a (2 j ) k < ε } . Then, also for large n , by (A.55) with probability going to 1 the derivative of the function f n,q, in (A.49) exists in the open and connected set O n ⊆ S > ( r, R ). For any B ∈ O n , an applicationof Proposition 3 in Abry and Didier (2018 a ) yields f n,q, ( B ) − f n,q, ( B a (2 j )) = r X ℓ =1 r X ℓ ′ =1 ∂∂b ℓ,ℓ ′ f n,q ( ˘ B ) π ℓ,ℓ ′ ( B − B a (2 j )) (A.57)for some matrix ˘ B ∈ S > ( r, R ) lying in a segment connecting B and B a (2 j ) across S > ( r, R ).Define the event A = n ω : b B a (2 j ) ∈ O n o . By (2.23), P ( A ) → , n → ∞ . (A.58)By (A.57), for large enough n and in the set A , the expansion f n,q, ( b B a (2 j )) − f n,q, ( B a (2 j )) = r X ℓ =1 r X ℓ ′ =1 ∂∂b ℓ,ℓ ′ f n,q, ( ˘ B a (2 j )) π ℓ,ℓ ′ ( b B a (2 j ) − B a (2 j )) (A.59)holds for some matrix ˘ B a (2 j ) lying in a segment connecting b B a (2 j ) and B a (2 j ) across S > ( r, R ).Consider the matrix of derivatives n ∂∂b ℓ,ℓ ′ f n,q, ( ˘ B ) o ℓ,ℓ ′ =1 ,...,r n λ − p − r + q (cid:16) P a h ˘ B a h P ∗ a h q + O P (1) a h q +1 + P a h a h q O P (1) a h q +1 / + O ∗ P (1) a h q +1 / a h P ∗ a h q (cid:17) × ∂∂b ℓ,ℓ ′ λ p − r + q (cid:16) P a h ˘ B a h P ∗ a h q + O P (1) a h q +1 + P a h a h q O P (1) a h q +1 / + O ∗ P (1) a h q +1 / a h P ∗ a h q (cid:17)o ℓ,ℓ ′ =1 ,...,r . (A.60)In (A.60), the differential of the eigenvalue λ p − r + q (cid:16) P a h ˘ B a h P ∗ a h q + O P (1) a h q +1 + P a h a h q O P (1) a h q +1 / + O ∗ P (1) a h q +1 / a h P ∗ a h q (cid:17) is given by expression (A.55) with˘ W a h q +1 := P a h ˘ B a h P ∗ a h q + O P (1) a h q +1 + P a h a h q O P (1) a h q +1 / + O ∗ P (1) a h q +1 / a h P ∗ a h q (A.61)in place of M and u p − r + q ( n ) denoting a unit eigenvector of (A.61) associated with its ( p − r + q )–theigenvalue. Moreover, ∂∂b ℓ,ℓ ′ λ p − r + q (cid:16) ˘ W a h q +1 (cid:17) = u ∗ p − r + q ( n ) ∂∂b ℓ,ℓ ′ h P a h ˘ B a h P ∗ a h q + O P (1) a h q +1 + P a h a h q O P (1) a h q +1 / + O ∗ P (1) a h q +1 / a h P ∗ a h q i u p − r + q ( n )= u ∗ p − r + q ( n ) (cid:16) P a h ℓ,ℓ ′ a h P ∗ a h q (cid:17) u p − r + q ( n ) , ℓ, ℓ ′ = 1 , . . . , r, where ℓ,ℓ ′ is a matrix with 1 on entry ( ℓ, ℓ ′ ) and zeroes elsewhere. Therefore, by Proposition B.1,for 1 ≤ ℓ ≤ ℓ ′ ≤ r , and by using ˘ B a (2 j ) in place of ˘ B and ˘ W ( a j ) in place of ˘ W in (A.61), wecan pick the sequence u p − r + q ( n ) as to obtain a − (2 h q +1) u ∗ p − r + q ( n ) n ∂∂b ℓ,ℓ ′ ˘ W ( a j ) o u p − r + q ( n )= u ∗ p − r + q ( n ) P diag( a h − h q , . . . , , . . . , a h n − h q ) ℓ,ℓ ′ diag( a h − h q , . . . , , . . . , a h n − h q ) P ∗ u p − r + q ( n )= h p ℓ , u p − r + q ( n ) i a h ℓ − h q h p ℓ ′ , u p − r + q ( n ) i a h ℓ ′ − h q P → , ℓ ∈ I − ; γ ℓq γ ℓ ′ q , ℓ, ℓ ′ ∈ I ,γ ℓq x ℓ ′ , ∗ , ℓ ∈ I , ℓ ′ ∈ I + ; x ℓ, ∗ x ℓ ′ , ∗ , ℓ, ℓ ′ ∈ I + . (A.62)In (A.62), the entries x ℓ, ∗ (depending on q ), ℓ ∈ I + , of the vector x q, ∗ (2 j ) as given by expression(B.89) in Lemma B.7, and each γ ℓ,q is as in Proposition B.1. Then, by (A.62) and (3.1) inProposition 3.1, expression (A.60) converges in probability to the matrix1 ξ q (2 j ) × γ ℓq γ ℓ ′ q ) ℓ,ℓ ′ ∈I ( γ ℓq x ℓ ′ , ∗ ) ℓ ∈I ,ℓ ′ ∈I + x ℓ, ∗ γ ℓ ′ q ) ℓ ∈I + ,ℓ ′ ∈I ( x ℓ, ∗ x ℓ ′ , ∗ ) ℓ,ℓ ′ ∈I + , (A.63)where ξ q (2 j ) >
0. Turning back to (A.59), expression (A.63) and condition (2.23) imply that R ∋ √ n a,j ( f n,q, ( b B a (2 j )) − f n,q, ( B a (2 j )))23 r X ℓ =1 r X ℓ ′ =1 ∂∂b ℓ,ℓ ′ f n,q, ( ˘ B ) √ n a,j π ℓ,ℓ ′ ( b B a (2 j ) − B a (2 j )) d → N (0 , σ q ( j )) , (A.64)as n → ∞ , where σ q ( j ) > B ( j ) in (2.23) has full rank. Since,for j = j , . . . , j and q = 1 , . . . , r , the asymptotic normality of each individual log-eigenvalueresults from the factor (2.23), then the joint asymptotic normality (A.52) holds, as claimed.We now turn to (A.54). The eigenvaluelog λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q (cid:17) must be simple for large enough n with probability tending to 1. In fact, by Corollary B.1 with M n ≡ , λ p − r + ℓ (cid:16) P a h B a (2 j ) a h P ∗ a h q (cid:17) → , ℓ ∈ I − ; ξ ℓ (2 j ) , ℓ ∈ I ; ∞ , ℓ ∈ I + , where the functions ξ ℓ (2 j ) are given by (3.1) and are pairwise distinct by condition (3.5). In otherwords, for any small δ >
0, there is n ∈ N (A.65)such that, for n ≥ n , the mapping M ( p, R ) ∋ R λ p − r + q (cid:18) P a h B a (2 j ) a h P ∗ a h q + R (cid:19) is differentiable for all k R k < δ . Therefore, by Proposition 3 in Abry and Didier (2018 a ), we canwrite f n,q, ( R ) − f n,q, ( ) := log λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + R (cid:17) − log λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q (cid:17) = p X i =1 p X i =1 ∂∂r i ,i f n,q, ( ˘ R ) π i ,i ( R ) (A.66)for some matrix ˘ R ∈ S > ( p, R ) lying in a segment connecting R and across S > ( p, R ). Moreover, n ∂∂r i ,i f n,q, ( ˘ R ) o i ,i =1 ,...,p = n λ − p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + ˘ R (cid:17) × ∂∂r i ,i λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + ˘ R (cid:17)o i ,i =1 ,...,p , (A.67)where the differential of the eigenvalue λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + R (cid:17) is given by expression (A.55) with P a h B a (2 j ) a h P ∗ a h q + R (A.68)24n place of M and u p − r + q ( n ) denoting a unit eigenvector of (A.68) associated with its ( p − r + q )–theigenvalue. In addition, ∂∂r i ,i λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + R (cid:17) = u ∗ p − r + q ( n ) ∂∂r i ,i h P a h B a (2 j ) a h P ∗ a h q + R i u p − r + q ( n )= u ∗ p − r + q ( n ) i ,i u p − r + q ( n ) , i , i = 1 , . . . , p. Fix ϕ > . (A.69)Under condition (2.22), for any small δ >
0, there is n ∈ N such that, for n ≥ n , k W Z ( a j ) /a h q +1 k ≤ δ with probability 1 − ϕ . So, let n be as in (A.65). Then, for n ≥ max { n , n } , expression (A.66) implies that r na (cid:16) f n,q, (cid:16) W Z ( a j ) a h q +1 (cid:17) − f n,q, ( ) (cid:17) = p X i =1 p X i =1 ∂∂r i ,i f n,q, ( ˘ R n ) r na π i ,i (cid:16) W Z ( a j ) a h q +1 (cid:17) = λ − p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + ˘ R n (cid:17) p X i =1 p X i =1 u ∗ p − r + q ( n ) i ,i u p − r + q ( n ) r na π i ,i (cid:16) W Z ( a j ) a h q +1 (cid:17) (A.70)with probability 1 − ϕ . In (A.70), the matrix ˘ R n lies between O P (1) /a h q +1 and . Since k ˘ R n k = o P (1), by Corollary B.1, λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a hq + ˘ R n (cid:17) P → ξ q (2 j ), n → ∞ . Now recall that,for M = ( m i ,i ) ∈ M ( p , p , R ), max i =1 ,...,p ; i =1 ,...,p | m i ,i ( n ) | ≤ k M k . (A.71)Thus, by condition (2.22), max i ,i =1 ,...,p | π i ,i ( W Z ( a j )) | = O P (1). Also recall that, for a vector x ∈ R p , k x k ≤ √ p k x k . (A.72)Then, expression (A.70) is bounded in absolute value by Ca h q +1 r na p X i =1 p X i =1 | u p − r + q ( n ) i u p − r + q ( n ) i | = Ca h q +1 r na k u p − r + q ( n ) k ≤ Ca h q +1 r na p = Ca h q +1 (cid:16) na (cid:17) / pn/a = C a h a h q +1 (cid:16) na h +1 (cid:17) / pn/a → , n → ∞ , (A.73)where the limit follows from condition (2.27). Since ϕ > r na (cid:16) f n,q, (cid:16) W Z ( a j ) a h q +1 (cid:17) − f n,q, ( ) (cid:17) = o P (1) , which is (A.54).We now turn to (A.53). Note that, again by Corollary B.1 with M n = W Z ( a j ) a hq +1 = o P (1) (see(2.22)), λ p − r + ℓ (cid:16) P a h B a (2 j ) a h P ∗ a h q + W Z ( a j ) a h q +1 (cid:17) P → , ℓ ∈ I − ; ξ ℓ (2 j ) , ℓ ∈ I ; ∞ , ℓ ∈ I + , (A.74)25here the functions ξ ℓ (2 j ), ℓ ∈ I , are given by (3.1) and are pairwise distinct by condition (3.5).For R ∈ M ( r, p, R ), recast f n,q, ( R ) = log λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + W Z ( a j ) a h q +1 + P a h a h q R + R ∗ a h P ∗ a h q (cid:17) = log (cid:16) u ∗ p − r + q ( n, R ) (cid:16) P a h B a (2 j ) a h P ∗ a h q + W Z ( a j ) a h q +1 + P a h a h q R + R ∗ a h P ∗ a h q (cid:17) u p − r + q ( n, R ) (cid:17) =: log λ p − r + q ( W ( a ( n )2 j , R )) . (A.75)We claim that, by taking η sufficiently small in (A.78), the eigenvalueslog λ p − r + ℓ ( W ( a ( n )2 j , R )) , ℓ ∈ I , (A.76)must be simple for large enough n with probability tending to 1 uniformly over k R k < ε for all ℓ ∈ I . For this purpose,( a ) we first show that the eigenvalues (A.76) converge in probability to ξ ℓ (2 j ), ℓ ∈ I ;( b ) then, we prove that the eigenvalues log λ p − r + ℓ ( W ( a ( n )2 j , R )), ℓ / ∈ I , go to zero or infinityin probability as n → ∞ .Given ( a ) and ( b ), since min ℓ,ℓ ′ ∈I ,ℓ = ℓ ′ | ξ ℓ (2 j ) − ξ ℓ ′ (2 j ) | >
0, by taking $\eta$ sufficiently small in (A.78), the eigenvalues (A.76) must be simple for large enough $n$ with probability tending to 1 uniformly over $\|R\| < \varepsilon$ for all $\ell \in \mathcal{I}$, as claimed. So, to show ($a$), fix $\varepsilon >$
0. By Lemma B.2,
$$ \sup_{\|R\| \le \varepsilon} \Big\| u^*_{p-r+q}(n,R)\, \frac{P a^{h}}{a^{h_q}} \Big\| = O_P(1), \quad q = 1, \dots, r-1. \qquad (A.77) $$
Together with Corollary B.1, this implies that, for all small $\eta >$
0, there is an ε > k R k≤ ε (cid:12)(cid:12)(cid:12) λ p − r + ℓ (cid:0) W ( a ( n )2 j , R ) (cid:1) − ξ ℓ (2 j ) (cid:12)(cid:12)(cid:12) < η (A.78)with probability tending to 1 as n → ∞ for all ℓ ∈ I . Indeed, suppose that, for some ℓ ∈ I andalong a subsequence of n there exists k R n k → η > (cid:12)(cid:12)(cid:12) λ p − r + ℓ (cid:16) P a h B a (2 j ) a h P ∗ a h q + W Z ( a j ) a h q +1 + P a h a h q R n + R ∗ n a h P ∗ a h q (cid:17) − ξ ℓ (2 j ) (cid:12)(cid:12)(cid:12) ≥ η . (A.79)Let M n = W Z ( a j ) a hq +1 + P a h a hq R n + R ∗ n a h P ∗ a hq . By (2.22) and (A.77), u ∗ p − r + q ( n, R n ) M n u p − r + q ( n, R n ) = o P (1). Then, an application of Corollary B.1 implies that the event (A.79) must tend to 0 inprobability. This shows (A.78) and, hence, ( a ). Likewise, for ℓ ∗ ∈ I − ∪ I + , Lemma B.6 showsthatsup k R k≤ ε λ p − r + ℓ ∗ (cid:16) P a h B a (2 j ) a h P ∗ a h q + W Z ( a j ) a h q +1 + P a h a h q R + R ∗ a h P ∗ a h q (cid:17) P → ( ∞ , ℓ ∗ ∈ I + , ℓ ∗ ∈ I − ∪ { , . . . , p − r } , where the statement for ℓ ∗ ∈ { , . . . , p − r } follows since sup k R k≤ ε λ p − r (cid:0) W ( a ( n )2 j , R ) (cid:1) = O P (1) a hq +1 due to Lemma B.2. This shows ( b ). Therefore, claim (A.76) is established.26hus, by an adaptation of Proposition 3 in Abry and Didier (2018 a ) to rectangular matrices, f n,q, ( R ) − f n,q, (0) = log λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + W Z ( a j ) a h q +1 + P a h a h q R + R ∗ a h P ∗ a h q (cid:17) − log λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + W Z ( a j ) a h q +1 (cid:17) = r X i =1 p X i =1 ∂∂r i ,i f n,q, ( ˘ R ) π i ,i ( R )for some matrix ˘ R ∈ M ( r, p, R ) lying in a segment connecting R and across M ( r, p, R ). Considerthe matrix of derivatives n ∂∂r i ,i f n,q, ( ˘ R ) o i =1 ,...,r ; i =1 ,...,p = n λ − p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + W Z ( a j ) a h q +1 + P a h a h q ˘ R + ˘ R ∗ a h P ∗ a h q (cid:17) × ∂∂r i ,i λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + W Z ( a j ) a h q +1 + P a h a h q ˘ R + ˘ R ∗ a h P ∗ a h q (cid:17)o i =1 ,...,r ; i =1 ,...,p . (A.80)In (A.80), the differential of the eigenvalue λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + W Z ( a j ) a h q +1 + P a h a h q ˘ R + ˘ R ∗ a h P ∗ a h q (cid:17) is given by expression (A.55) with P a h B a (2 j ) a h P ∗ a h q + W Z ( a j ) a h q +1 + P a h a h q ˘ R + ˘ R ∗ a h P ∗ a h q (A.81)in place of M and u p − r + q ( n, ˘ R ) denoting a unit eigenvector of (A.81) associated with its ( p − r + q )–th eigenvalue. Moreover, ∂∂r i ,i λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + W Z ( a j ) a h q +1 + P a h a h q ˘ R + ˘ R ∗ a h P ∗ a h q (cid:17) = u ∗ p − r + q ( n, ˘ R ) ∂∂r i ,i h P a h B a (2 j ) a h P ∗ a h q + W Z ( a j ) a h q +1 + P a h a h q ˘ R + ˘ R ∗ a h P ∗ a h q i u p − r + q ( n, ˘ R )= u ∗ p − r + q ( n, ˘ R ) (cid:16) P a h a h q i ,i + i ,i a h P ∗ a h q (cid:17) u p − r + q ( n, ˘ R ) , i = 1 , . . . , r, i = 1 , . . . , p. Therefore, r na (cid:16) f n,q, ( R ) − f n,q, ( ) (cid:17) = λ − p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + W Z ( a j ) a h q +1 + P a h a h q ˘ R + ˘ R ∗ a h P ∗ a h q (cid:17) r X i =1 p X i =1 n u ∗ p − r + q ( n, ˘ R ) (cid:16) P a h a h q i ,i + i ,i a h P ∗ a h q (cid:17) u p − r + q ( n, ˘ R ) o i =1 ,...,r ; i =1 ,...,p r na π i ,i ( R ) . 
(A.82)Note that, by condition (2.22), 1 a h q +1 / k a − h − (1 / I W X,Z ( a j ) k = o P (1) , (A.83)27amely, k a − h − (1 / I W X,Z ( a j ) a hq +1 / k is arbitrarily small, in probability, for large enough n . Thus, by(A.82) and (A.83), r na (cid:16) f n,q, (cid:16) a − h − (1 / I W X,Z ( a j ) a h q +1 / (cid:17) − f n,q, ( ) (cid:17) = λ − p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + W Z ( a j ) a h q +1 + P a h a h q ˘ R n + ˘ R ∗ n a h P ∗ a h q (cid:17) × r X i =1 p X i =1 u ∗ p − r + q ( n ) (cid:16) P a h a h q i ,i + i ,i a h P ∗ a h q (cid:17) u p − r + q ( n ) × r na π i ,i (cid:16) a − h − (1 / I W X,Z ( a j ) a h q +1 / (cid:17) . (A.84)Note that ˘ R n lies in a segment connecting and the matrix a − h − (1 / I W X,Z ( a j ) a hq +1 / , whose norm tendsto 0 in probability due to condition (2.22). Hence, ˘ R n = o P (1). On the other hand, Lemma B.2implies that max ℓ ∈I + {|h u p − r + q ( n, ˘ R n ) , p ℓ ( n ) i| a ( n ) h ℓ − h q } = O P (1) . (A.85)This, in turn, implies u ∗ p − r + q ( n, ˘ R n ) (cid:16) W Z ( a j ) a hq +1 + P a h a hq ˘ R n + ˘ R ∗ n a h P ∗ a hq (cid:17) u p − r + q ( n, ˘ R ) = o P (1). Thus,for any small ϕ >
0, an application of Corollary B.1 gives λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + W Z ( a j ) a h q +1 + P a h a h q ˘ R n + ˘ R ∗ n a h P ∗ a h q (cid:17) → ξ q (2 j ) , n → ∞ , with probability 1 − ϕ . As a consequence of condition (2.22) and of (A.71),max i =1 ,...,r ; i =1 ,...,p | π i ,i ( a − h − (1 / I W X,Z ( a j )) | = O P (1) . Therefore, by expressions (A.72), (A.85) and by condition (2.27), the right-hand side of (A.84) isbounded, in absolute value, by O P (1) (cid:12)(cid:12)(cid:12) r X i =1 p X i =1 u ∗ p − r + q ( n ) (cid:16) P a h a h q i ,i + i ,i a h P ∗ a h q (cid:17) u p − r + q ( n ) r na π i ,i (cid:16) a − h − (1 / I W X,Z ( a j ) a h q +1 / (cid:17)(cid:12)(cid:12)(cid:12) ≤ O P (1) r na (cid:16) a h q +1 / (cid:17) r X i =1 p X i =1 (cid:12)(cid:12)(cid:12) u ∗ p − r + q ( n ) (cid:16) P a h a h q i ,i + i ,i a h P ∗ a h q (cid:17) u p − r + q ( n ) (cid:12)(cid:12)(cid:12) = O P (1) r na (cid:16) a h q +1 / (cid:17) r X i =1 (cid:12)(cid:12)(cid:12) h p i , u p − r + q ( n ) i a h i − h q (cid:12)(cid:12)(cid:12) p X i =1 | u p − r + q ( n ) i |≤ O P (1) r na (cid:16) a h q +1 / (cid:17) √ p ≤ O P (1) r pn/a (cid:16) na (cid:17) a h q +1 / ≤ O P (1) na ̟ +3 / P → . (A.86)In (A.86), the last inequality results from the bound ( h q − h ) + h ≥ ̟ , and convergence inprobability follows from the fact that ϕ > r na n log λ p − r + q (cid:16) P a h B (2 j ) a h P ∗ a h q + W Z ( a j ) a h q +1 + P a h W X,Z ( a j ) a h q +1 / + W ∗ X,Z ( a j ) a h P ∗ a h q +1 / (cid:17) log λ p − r + q (cid:16) P a h B a (2 j ) a h P ∗ a h q + W Z ( a j ) a h q +1 (cid:17)o = r na n f n,q, (cid:16) a − h − (1 / I W X,Z ( a j ) a h q +1 / (cid:17) − f n,q, (0) (cid:17)o = o P (1) . Hence, (A.54) holds. (cid:3)
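As an informal numerical aside (not part of the formal argument), the first-order perturbation formula (A.55), which drives the expansions above, can be checked against a finite-difference approximation. The sketch below uses an arbitrary symmetric test matrix and perturbation direction; nothing about the wavelet setting is assumed.

```python
import numpy as np

# Sanity check of the first-order perturbation formula (A.55):
# for a simple eigenvalue lambda_q(M) of a symmetric matrix M,
# d lambda_q(M)[E] = u_q^* E u_q, where u_q is a unit eigenvector.
rng = np.random.default_rng(0)
p, q = 8, 5                      # arbitrary dimension and (0-based) eigenvalue index

A = rng.standard_normal((p, p))
M = (A + A.T) / 2                # symmetric test matrix (simple spectrum a.s.)
E = rng.standard_normal((p, p))
E = (E + E.T) / 2                # symmetric perturbation direction

eigvals, eigvecs = np.linalg.eigh(M)   # eigenvalues in increasing order
u_q = eigvecs[:, q]

# Predicted first-order change of lambda_q along the direction E.
predicted = u_q @ E @ u_q

# Central finite-difference approximation of the same directional derivative.
t = 1e-6
lam_plus = np.linalg.eigvalsh(M + t * E)[q]
lam_minus = np.linalg.eigvalsh(M - t * E)[q]
finite_diff = (lam_plus - lam_minus) / (2 * t)

print(predicted, finite_diff)    # the two values agree to roughly 1e-6
```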
Proof of Corollary 3.1: The proof is a slight modification of Corollary 2, ($ii$), in Abry and Didier (2018$a$). For a fixed $q \in \{1, \dots, r\}$, the left-hand side of (3.13) can be recast as
$$ \sqrt{\tfrac{n}{a}} \sum_{j=j_1}^{j_2} w_j \Big( \log_2\lambda_{p-r+q}(W(a2^j)) - \log_2\lambda_{p-r+q}(\mathbb{E}W(a2^j)) \Big) + \sqrt{\tfrac{n}{a}} \sum_{j=j_1}^{j_2} w_j \Big( \log_2\lambda_{p-r+q}(\mathbb{E}W(a2^j)) - \log_2\xi_q(a(n)2^j) \Big) $$
$$ \qquad + \sqrt{\tfrac{n}{a}} \Big( \sum_{j=j_1}^{j_2} w_j \log_2\xi_q(a(n)2^j) - (2h_q+1) \Big). \qquad (A.87) $$
Note that, by (3.2) in Proposition 3.1, $\xi_q(a(n)2^j) = (a(n)2^j)^{2h_q+1}\,\xi_q(1)$. Therefore, by property (3.12), the third term in the sum (A.87) is zero. By Proposition B.2, the second term in the sum (A.87) is bounded by
$$ \sqrt{\tfrac{n}{a}} \sum_{j=j_1}^{j_2} |w_j|\, \frac{C}{a^{\varpi}} \le C' \sqrt{\tfrac{n}{a}}\; a^{-\varpi} \to 0, \quad n \to \infty, $$
where the limit is a consequence of condition (2.27). Therefore, we can rewrite (3.13) as
$$ \sum_{j=j_1}^{j_2} \frac{2^{j/2}\, w_j}{\log 2} \sqrt{K_{a,j}} \Big( \log\lambda_{p-r+q}(W(a2^j)) - \log\lambda_{p-r+q}(\mathbb{E}W(a2^j)) \Big) + o(1), $$
and the weak limit (3.13) follows from Theorem 3.1. In the limiting variance in (3.13), the weight matrix $M \in M(r, mr, \mathbb{R})$ is given by
$$ M = \Big( \frac{2^{j_1/2}\, w_{j_1}}{\log 2}\, I_r;\; \frac{2^{(j_1+1)/2}\, w_{j_1+1}}{\log 2}\, I_r;\; \dots;\; \frac{2^{j_2/2}\, w_{j_2}}{\log 2}\, I_r \Big), \qquad (A.88) $$
where $I_r \in M(r, \mathbb{R})$ is an identity matrix and $m = j_2 - j_1 + 1$. $\Box$

Proof of Proposition 3.2: Proposition 3.1 and the scaling relation (3.2) show that
$$ \frac{\log\lambda_{p-r+q}(W(a2^j))}{\log(a2^j)} \overset{P}{\to} 2h_q + 1 > 0, \quad q = 1, \dots, r. $$
In addition, Lemma B.1 shows that $\frac{\log\lambda_{p-r}(W(a2^j))}{\log(a2^j)} \overset{P}{\to} 0$. Thus, for $q = 1, \dots, r$ and the threshold $\kappa > 0$,
$$ \Delta_q(j_1, j_2) \overset{P}{\to} \sum_{j=j_1}^{j_2} j\, w_j\, (2h_q + 1) = 2h_q + 1 > \kappa. $$
Likewise, for $q = 1, \dots, p - r$, $\Delta_q(j_1, j_2) \overset{P}{\to} 0$, from which (3.16) follows. $\Box$

Auxiliary statements
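Before turning to the auxiliary statements themselves, we record an informal numerical aside (not part of the formal development) on the weighted log-eigenvalue regression analyzed in Corollary 3.1 above. The sketch below is a toy version only: it assumes (our reading of property (3.12)) that the weights satisfy $\sum_j w_j = 0$ and $\sum_j j\,w_j = 1$, and it feeds the regression synthetic eigenvalues obeying the exact scaling $\xi_q(a2^j) = (a2^j)^{2h_q+1}\xi_q(1)$ up to multiplicative noise.

```python
import numpy as np

# Toy illustration of the weighted log2-eigenvalue regression behind Corollary 3.1.
# Assumption (our reading of property (3.12)): the weights satisfy
#   sum_j w_j = 0  and  sum_j j * w_j = 1,
# so that sum_j w_j * log2 xi_q(a * 2^j) = 2*h_q + 1 exactly.
rng = np.random.default_rng(1)

h_q = 0.3                                  # true scaling exponent of the q-th mode
a, octaves = 16, np.arange(3, 9)           # coarse scale factor and octave range j1..j2

# Standard linear-regression weights on the octaves (an assumption here).
S0, S1, S2 = len(octaves), octaves.sum(), (octaves ** 2).sum()
w = (S0 * octaves - S1) / (S0 * S2 - S1 ** 2)
assert np.isclose(w.sum(), 0) and np.isclose((w * octaves).sum(), 1)

# Synthetic "largest eigenvalues" obeying xi_q(a*2^j) = (a*2^j)^(2h_q+1),
# contaminated by multiplicative noise standing in for finite-sample fluctuations.
lam = (a * 2.0 ** octaves) ** (2 * h_q + 1) * np.exp(0.05 * rng.standard_normal(S0))

estimate = (w * np.log2(lam)).sum()        # estimates 2*h_q + 1
print(estimate, 2 * h_q + 1)
```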
In this section, as in Section A, whenever convenient we simply write $p = p(n)$, $a = a(n)$ or $P = P(n)$. Also, for notational simplicity, as in (A.6) we write $P(n)P_H \equiv P(n) \equiv P$. The following lemma is used in Proposition 3.1. It is also applied in Lemma B.4, which in turn is needed to establish Proposition 3.1. The convergence statement in Proposition 3.1 is also needed in establishing Proposition 3.2.
Lemma B.1
Fix $j \in \mathbb{N}$ and $q \in \{1, \dots, r\}$. Suppose $B_a(2^j) = \mathbb{E}\widehat{B}_a(2^j) \in \mathcal{S}_{>0}(r,\mathbb{R})$, and suppose the dyadic scaling sequence $\{a(n)\}_{n\in\mathbb{N}}$ satisfies $a(n) \to \infty$ as $n \to \infty$. Then, under the (W)- and (A)-assumptions, as $n \to \infty$,
$$ \lambda_{p-r}(W(a(n)2^j)) = O_P(1) \qquad (B.1) $$
and
$$ \lambda_{p-r+1}(W(a(n)2^j)) \overset{P}{\to} \infty. \qquad (B.2) $$

Proof:
To establish (B.1), just note that, by (A.2),
$$ 0 \le \lambda_{p-r}(W(a2^j)) \le \sup_{u\in\{p_1,\dots,p_r\}^{\perp}\cap S^{p-1}} u^*W(a2^j)u = \sup_{u\in\{p_1,\dots,p_r\}^{\perp}\cap S^{p-1}} u^*W_Z(a2^j)u = O_P(1), $$
under condition (2.22). To establish (B.2), it suffices to show that, for some constant $C > 0$,
$$ \lambda_{p-r+1}(W(a2^j)) \ge C\, a^{2h_1+1} \qquad (B.3) $$
with probability tending to 1. In fact, by (A.2), first rewrite
$$ \lambda_{p-r+1}(W(a2^j)) = \sup_{\mathcal{U}_r}\, \inf_{u\in\mathcal{U}_r\cap S^{p-1}} u^*W(a2^j)u \ge \inf_{u\in\mathrm{span}\{p_1,\dots,p_r\}\cap S^{p-1}} u^*W(a2^j)u =: \widetilde u^*(n)\,W(a2^j)\,\widetilde u(n). \qquad (B.4) $$
In (B.4),
$$ \widetilde u \equiv \widetilde u(n) \in \mathrm{span}\{p_1,\dots,p_r\} \qquad (B.5) $$
is an appropriately chosen unit vector, and, like the vectors $p_\ell = p_\ell(n)$, $\ell = 1,\dots,r$, it is a function of $n$. Now recast
$$ a^{-1}W(a2^j) = a^{-1}\big(PW_X(a2^j)P^* + W_Z(a2^j) + PW_{X,Z}(a2^j) + W^*_{X,Z}(a2^j)P^*\big) = Pa^{h}\widehat B_a(2^j)a^{h}P^* + O_P(1) + Pa^{h}O_P(1) + O^*_P(1)a^{h}P^*. \qquad (B.6) $$
In (B.6), we use the notation (A.47), where the appropriate matrix dimensions are implicit. In view of (B.5), there is some constant $C > 0$ such that $\|P^*\widetilde u\| > C$ with probability tending to 1. In turn, this implies that, for some $C' > 0$, $\|a^{h}P^*\widetilde u\| \ge C'a^{h_1}$, again with probability tending to 1. Therefore, by (B.6), as well as by conditions (2.24) and (2.26), there is some constant $C'' > 0$ such that
$$ a^{-1}\widetilde u^*W(a2^j)\widetilde u = \|a^{h}P^*\widetilde u\|^2\Big\{ \frac{\widetilde u^*Pa^{h}}{\|a^{h}P^*\widetilde u\|}\,\widehat B_a(2^j)\,\frac{a^{h}P^*\widetilde u}{\|a^{h}P^*\widetilde u\|} + \frac{\widetilde u^*O_P(1)\widetilde u}{\|a^{h}P^*\widetilde u\|^2} + 2\,\frac{\widetilde u^*Pa^{h}}{\|a^{h}P^*\widetilde u\|}\,\frac{O_P(1)\widetilde u}{\|a^{h}P^*\widetilde u\|} \Big\} \ge \|a^{h}P^*\widetilde u\|^2\,\big\{\Omega_P(1) + o_P(1)\big\} \ge C''a^{2h_1} $$
with probability tending to 1 (see (2.2) on asymptotic notation). This shows (B.3). Hence, (B.2) is established. $\Box$

The following corollary of the proof of Proposition 3.1 is used several times throughout the proof of Theorem 3.1 and in the proof of Proposition B.1. In particular, it establishes that the asymptotic behavior of the largest $r$ rescaled eigenvalues of $P(n)W_X(a(n)2^j)P^*(n)$ or $P(n)\mathbb{E}W_X(a(n)2^j)P^*(n)$ is indifferent to random perturbations that appropriately align with its eigenspaces.

Corollary B.1
Fix $q \in \{1,\dots,r\}$. Let $B$ denote either $\widehat B_a(2^j)$ or $B_a(2^j) = \mathbb{E}\widehat B_a(2^j)$ as in (2.19). For each $n$, let $M_n \in \mathcal{S}(p(n),\mathbb{R})$ be any sequence of random matrices such that
$$ u^*_{p-r+q}(n, M_n)\, M_n\, u_{p-r+q}(n, M_n) = o_P(1), $$
where $u_{p-r+\ell}(n,M_n) \in \mathbb{R}^p$ is a unit eigenvector associated with the eigenvalue
$$ \lambda_{p-r+\ell}\Big( \frac{P(n)\,a(n)^{h}\,B\,a(n)^{h}\,P^*(n)}{a(n)^{2h_q}} + M_n \Big). $$
Then,
$$ \lambda_{p-r+\ell}\Big( \frac{P(n)\,a(n)^{h}\,B\,a(n)^{h}\,P^*(n)}{a(n)^{2h_q}} + M_n \Big) \overset{P}{\to} \begin{cases} 0, & \ell \in \mathcal{I}_-;\\ \xi_q(2^j), & \ell \in \mathcal{I};\\ \infty, & \ell \in \mathcal{I}_+, \end{cases} \quad n \to \infty. \qquad (B.7) $$
In (B.7), $\xi_q(2^j)$, $q = 1, \dots, r$, are the functions (3.1) appearing in Proposition 3.1.

Proof:
Since $\frac{P(n)a(n)^{h}Ba(n)^{h}P^*(n)}{a(n)^{2h_q}} + M_n \in \mathcal{S}(p,\mathbb{R})$, to establish the limits in (B.7) for $\ell \in \mathcal{I}$, mutatis mutandis, it suffices to repeat the arguments of the proof of Proposition 3.1 on the matrices $\frac{Pa^{h}\widehat B_a(2^j)a^{h}P^*}{a^{2h_q}} + M_n$ or $\frac{Pa^{h}B_a(2^j)a^{h}P^*}{a^{2h_q}} + M_n$, instead of $\frac{W(a2^j)}{a^{2h_q+1}}$. The cases $\ell \in \mathcal{I}_{\pm}$ follow analogously by repeating the arguments for establishing Lemma B.1. $\Box$

The following lemma is used in the proof of Theorem 3.1 and Proposition B.1.
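Before stating it, the eigenvalue separation established in Lemma B.1 can be illustrated with a toy signal-plus-noise model. The sketch below is not the paper's generative model: the matrices $P$, $B$ and the Wishart-type noise are arbitrary stand-ins, chosen only so that the rank-$r$ signal part scales like $a^{2h+1}$ as in (B.6).

```python
import numpy as np

# Toy numerical illustration of the eigenvalue separation in Lemma B.1:
# for a "signal + noise" matrix with a rank-r signal whose strength grows with
# the scale a, the top r eigenvalues diverge while the remaining ones stay bounded.
rng = np.random.default_rng(2)
p, r = 200, 3
h = np.array([0.2, 0.4, 0.6])                 # stand-ins for exponents h_1 < ... < h_r

P = rng.standard_normal((p, r)) / np.sqrt(p)  # coordinates matrix (columns ~ unit norm)
B = np.array([[2.0, 0.3, 0.1],
              [0.3, 1.5, 0.2],
              [0.1, 0.2, 1.0]])               # positive definite r x r matrix

for a in [4, 16, 64, 256]:
    D = np.diag(a ** h)                       # a^h = diag(a^{h_1}, ..., a^{h_r})
    G = rng.standard_normal((p, 4 * p))
    Z = G @ G.T / (4 * p)                     # bounded-norm noise matrix (Wishart-type)
    W = a * P @ D @ B @ D @ P.T + Z           # signal part scales like a^{2h+1}
    ev = np.linalg.eigvalsh(W)
    print(a, ev[-r:], ev[-r - 1])             # top r grow with a; the (r+1)-th stays O(1)
```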
Lemma B.2
Fix $q \in \{1,\dots,r\}$ and consider the index sets $\mathcal{I}_-$, $\mathcal{I}$, and $\mathcal{I}_+$ as in (A.4). Suppose $\mathcal{I}_+ \neq \emptyset$. Let $R \in M(r,p,\mathbb{R})$, and define
$$ W_n \equiv W(a(n)2^j, R) := \frac{P(n)a(n)^{h}B_na(n)^{h}P^*(n)}{a(n)^{2h_q}} + \frac{W_Z(a(n)2^j)}{a(n)^{2h_q+1}} + \frac{P(n)a(n)^{h}}{a(n)^{h_q}}R + R^*\frac{a(n)^{h}P^*(n)}{a(n)^{h_q}}. \qquad (B.8) $$
In (B.8), $B_n$ is any sequence of random matrices in $\mathcal{S}_{>0}(r,\mathbb{R})$ having some limit $B_n \overset{P}{\to} B_\infty \in \mathcal{S}_{>0}(r,\mathbb{R})$, as $n \to \infty$. Also, let $u_{p-r+q}(n,R)$ be a unit eigenvector associated with the $(p-r+q)$-th eigenvalue of $W(a(n)2^j,R)$. Then, for every $\varepsilon > 0$,
$$ \sup_{\|R\|\le\varepsilon}\; \max_{\ell\in\mathcal{I}_+} \big\{ |\langle p_\ell(n), u_{p-r+q}(n,R)\rangle|\, a(n)^{h_\ell-h_q} \big\} = O_P(1), \quad n \to \infty. \qquad (B.9) $$
In particular, for the unit eigenvector $u_{p-r+q}(n)$ of $W(a2^j)$,
$$ \max_{\ell\in\mathcal{I}_+} \big\{ |\langle p_\ell(n), u_{p-r+q}(n)\rangle|\, a(n)^{h_\ell-h_q} \big\} = O_P(1), \quad n \to \infty. \qquad (B.10) $$
Moreover, if $u_{p-r+q}(n)$ denotes a unit eigenvector associated with the $(p-r+q)$-th eigenvalue of
$$ \frac{\mathbb{E}W(a2^j)}{a^{2h_q+1}} = \frac{P(n)a(n)^{h}B_a(2^j)a(n)^{h}P^*(n)}{a(n)^{2h_q}} + \frac{\mathbb{E}W_Z(a(n)2^j)}{a(n)^{2h_q+1}}, $$
then
$$ \max_{\ell\in\mathcal{I}_+} \big\{ |\langle p_\ell(n), u_{p-r+q}(n)\rangle|\, a(n)^{h_\ell-h_q} \big\} = O(1), \quad n \to \infty. \qquad (B.11) $$

Proof:
First note that λ p − r + q ( W n ) a ( n ) h q +1 = inf U p − r + q sup u ∈U p − r + q ∩ S p − u ∗ W n a ( n ) h q +1 u ≤ sup u ∈ span { p q +1 ,..., p r } ⊥ ∩ S p − u ∗ W n a ( n ) h q +1 u ≤ sup u ∈ span { p ℓ ,ℓ ∈I + } ⊥ ∩ S p − u ∗ W n a ( n ) h q +1 u = O P (1) . (B.12)By way of contradiction, suppose there exists a sequence R m ∈ M ( r, p, R ) with k R m k ≤ ε suchthat the set I ∗ := n ℓ ∈ { , . . . , r } : lim sup m →∞ |h p ℓ , u p − r + q ( m, R m ) i| a ( m ) h ℓ − h q = ∞ o ⊆ I + is nonempty with non-vanishing probability. Equivalently, we can pick some ℓ ∈ I ∗ , and asubsequence n = n ( m ) such that |h p ℓ , u p − r + q ( n ( m )) i| a ( n ( m )) h ℓ − h q → ∞ . If there existssome ℓ ∈ I ∗ such that lim sup m →∞ |h p ℓ , u p − r + q ( n ( m )) i| a ( n ( m )) h ℓ |h p ℓ , u p − r + q ( n ( m )) i| a ( n ( m ) h ℓ = ∞ , then we can take a subsequence n of n such thatlim m →∞ |h p ℓ , u p − r + q ( n ( m )) i| a ( n ( m )) h ℓ |h p ℓ , u p − r + q ( n ( m )) i| a ( n ( m )) h ℓ = ∞ . Repeating this process, if there exists some ℓ i +1 ∈ I ∗ such thatlim sup m →∞ |h p ℓ i +1 , u p − r + q ( n i ( m )) i| a ( n i ( m )) h ℓi +1 |h p ℓ i , u p − r + q ( n i ( m )) i| a ( n i ( m )) h ℓi = ∞ , (B.13)take a refinement n i +1 of the previous subsequence n i so that (B.13) holds as a limit. Let r ∗ ≤ |I ∗ | be the maximal index from this construction such that (B.13) holds as a limit. By definition,such limit holds along the sequence n r ∗ . Then, clearly, if r ∗ > i =1 ,...,r ∗ − lim m →∞ |h p ℓ i , u p − r + q ( n r ∗ ( m )) i| a ( n r ∗ ( m )) h ℓi |h p ℓ r ∗ , u p − r + q ( n r ∗ ( m )) i| a ( n r ∗ ( m )) h ℓr ∗ = 0 . In particular, for r ∗ ≥ |h p ℓ , u p − r + q ( n r ∗ ) i| a ( n r ∗ ) h ℓ |h p ℓ r ∗ , u p − r + q ( n r ∗ ) i| a ( n r ∗ ) h ℓr ∗ = O P (1) , ℓ ∈ I + . n r ∗ further if necessary we may assume that, for each ℓ = 1 , . . . , r ,lim m →∞ h p ℓ , u p − r + q ( n r ∗ ( m )) i a ( n r ∗ ( m )) h ℓ h p ℓ r ∗ , u p − r + q ( n r ∗ ( m )) i a ( n r ∗ ( m )) h ℓr ∗ =: y ℓ ∈ R . (B.14)Note that, in (B.14), y ℓ = 0 ⇒ ℓ ∈ I ∗ ⊆ I + . Moreover, there must be at least one such ℓ . Inother words, we can write y ∗ = ( y , . . . , y r ) = , where, with non-vanishing probability, a h P ∗ u p − r + q ( n r ∗ , R n r ∗ ) |h p ℓ r ∗ , u p − r + q ( n r ∗ ) i| a ( n r ∗ ) h ℓr ∗ → y , n r ∗ → ∞ . (B.15)So, consider the (divergent) sequence ϑ ( n r ∗ ) := |h p ℓ r ∗ , u p − r + q ( n r ∗ ) i| a ( n r ∗ ) h ℓr ∗ − h q . Then, by rela-tion (B.15), 1 ϑ ( n r ∗ ) λ p − r + q ( W n ) a ( n ) h q +1 = u ∗ p − r + q ( n ) W n ϑ ( n r ∗ ) a ( n ) h q +1 u p − r + q ( n )= u ∗ p − r + q ( n, R n ) P a h B n a h P ∗ ϑ ( n r ∗ ) a h q u p − r + q ( n, R n ) { o P (1) }→ y ∗ B ∞ y > . (B.16)The limit (B.16) implies that λ p − r + q ( W n ) a ( n ) hq +1 is unbounded with positive probability, which contradicts(B.12). Hence, (B.9) is established.To show statement (B.10), fix ε >
0. By condition (2.22), with probability going to 1, $\big\|a^{-h-(1/2)I}W_{X,Z}(a2^j)/a^{h_q+1/2}\big\| \le \varepsilon$. Moreover, for $W(a2^j, R)$ as in expression (B.8) and its corresponding eigenvector $u_{p-r+q}(n,R)$,
$$ W\Big(a2^j,\; \frac{a^{-h-(1/2)I}W_{X,Z}(a2^j)}{a^{h_q+1/2}}\Big) = \frac{W(a2^j)}{a^{2h_q+1}}, \qquad u_{p-r+q}\Big(n,\; \frac{a^{-h-(1/2)I}W_{X,Z}(a2^j)}{a^{h_q+1/2}}\Big) = u_{p-r+q}(n). $$
Hence, by expression (B.9), expression (B.10) follows. The statement (B.11) can be established by a simpler version of the argument leading to (B.9) applied to $\mathbb{E}W(a2^j)/a^{2h_q+1}$. $\Box$

Remark B.1
Under the convention $P(n) \equiv P(n)P_H$ (see (A.6)), recall that
$$ \frac{W(a(n)2^j)}{a(n)^{2h_q+1}} = \frac{P(n)a(n)^{h}\widehat B_a(2^j)a(n)^{h}P^*(n)}{a(n)^{2h_q}} + \frac{W_Z(a(n)2^j)}{a(n)^{2h_q+1}} + \frac{P(n)a(n)^{h}}{a(n)^{h_q}}\,\frac{\big[a(n)^{-h}P_H^{-1}W_{X,Z}(a(n)2^j)\big]}{a(n)^{h_q+1}} + \frac{\big[a(n)^{-h}P_H^{-1}W_{X,Z}(a(n)2^j)\big]^*}{a(n)^{h_q+1}}\,\frac{a(n)^{h}P^*(n)}{a(n)^{h_q}}. $$
Thus, as a consequence of expression (B.10) in Lemma B.2,
$$ \frac{\lambda_{p-r+q}(W(a(n)2^j))}{a(n)^{2h_q+1}} = O_P(1). \qquad (B.17) $$
Relation (B.17) is invoked in the proof of Proposition 3.1.

The following lemma is used in the proofs of Lemma B.4 and Proposition B.1.

Lemma B.3 For each $n$, let $p_\ell(n)$, $\ell = r+1,\dots,p$, be an orthonormal basis for the null space of $P$. For $\varpi$ as in (2.25) and for any $q\in\{1,\dots,r\}$,
$$ \sum_{i=r+1}^{p} \langle p_i(n), u_{p-r+q}(n)\rangle^2 = o_P\big(a(n)^{-\varpi}\big). \qquad (B.18) $$
Moreover, if $u_{p-r+q}(n)$ denotes a unit eigenvector associated with the $(p-r+q)$-th eigenvalue of $\frac{\mathbb{E}W(a2^j)}{a^{2h_q+1}}$, then
$$ \sum_{i=r+1}^{p} \langle p_i(n), u_{p-r+q}(n)\rangle^2 = o\big(a(n)^{-\varpi}\big). \qquad (B.19) $$

Proof:
By way of contradiction, suppose that, for some $\varepsilon > 0$ and some subsequence (still indexed by $n\in\mathbb{N}$, for notational simplicity),
$$ a(n)^{\varpi} \sum_{i=r+1}^{p} \langle p_i(n), u_{p-r+q}(n)\rangle^2 \ge \varepsilon > 0. \qquad (B.20) $$
Consider the projection $v(n)$ of $u_{p-r+q}(n)$ onto the null space of $P(n)$, i.e.,
$$ v(n) = \sum_{i=r+1}^{p} \langle p_i(n), u_{p-r+q}(n)\rangle\, p_i(n). \qquad (B.21) $$
Clearly, $v^*(n)\{P(n)W_{X,Z}(a2^j) + W_{X,Z}(a2^j)^*P^*(n)\}v(n) = 0$. In view of condition (2.22), with non-vanishing probability,
$$ v^*(n)\, W(a2^j)\, v(n) = O_P(1). \qquad (B.22) $$
However, let $O(n) = (u_1(n)\,\dots\,u_p(n))$ be a matrix of eigenvectors of $W(a2^j)$. Then, for any $q\in\{1,\dots,r\}$,
$$ v^*(n)W(a2^j)v(n) = v^*(n)O(n)\,\mathrm{diag}\big(\lambda_1(W(a2^j)),\dots,\lambda_p(W(a2^j))\big)\,O^*(n)v(n) = \sum_{i=1}^{p}\langle u_i(n), v(n)\rangle^2\lambda_i(W(a2^j)) $$
$$ \ge \langle u_{p-r+q}(n), v(n)\rangle^2\lambda_{p-r+q}(W(a2^j)) = \Big(\sum_{i=r+1}^{p}\langle p_i(n), u_{p-r+q}(n)\rangle^2\Big)^2\lambda_{p-r+q}(W(a2^j)) \ge \varepsilon^2 a(n)^{-2\varpi}\lambda_{p-r+q}(W(a2^j)), \qquad (B.23) $$
where the last inequality follows from (B.20). As a consequence of Proposition 3.1, and the fact that $2\varpi \le h_1 < 2h_q+1$, the right-hand side of (B.23) blows up in probability, which contradicts (B.22). This shows (B.18). The statement (B.19) can be obtained by repeating the same argument above applied to $\mathbb{E}W(a2^j)$ in place of $W(a2^j)$. $\Box$

The following lemma is used in the proofs of Propositions 3.1 and B.1.
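Before stating it, we record an informal computational aside on the quantity controlled in Lemma B.3, namely the squared projection of a unit eigenvector onto the orthogonal complement of the column space of $P$. The sketch below uses an illustrative, hand-built "eigenvector"; only the projection computation itself reflects the lemma.

```python
import numpy as np

# Sketch of the quantity controlled in Lemma B.3: the squared projection of a unit
# vector onto the orthogonal complement of the column space of P. The model for the
# vector u below is illustrative only.
rng = np.random.default_rng(3)
p, r = 50, 3

P = rng.standard_normal((p, r))
Q, _ = np.linalg.qr(P, mode="complete")
P_perp = Q[:, r:]                    # orthonormal basis p_{r+1}, ..., p_p of range(P)^perp

# A vector that is mostly aligned with the column space of P plus a small leak.
u = P @ rng.standard_normal(r) + 1e-3 * rng.standard_normal(p)
u /= np.linalg.norm(u)

leak = np.sum((P_perp.T @ u) ** 2)   # sum_{i=r+1}^{p} <p_i, u>^2 as in (B.18)
print(leak)                          # small when u nearly lies in range(P)
```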
Lemma B.4
Let $u_\ell(n)$ be an eigenvector associated with the $\ell$-th eigenvalue of the random matrix $W(a(n)2^j)$, and define
$$ U_r(n) = [u_{p-r+1}(n), \dots, u_p(n)] \in M(p, r, \mathbb{R}). \qquad (B.24) $$
Let $\Gamma_n = P^*(n)U_r(n)$, and consider any subsequence $\{\Gamma_{n'}\}_{n'}$ of $\{\Gamma_n\}_{n\in\mathbb{N}}$ along which the limit
$$ \mathrm{p\text{-}lim}_{n'\to\infty}\, \Gamma_{n'} =: \Gamma \in M(r, \mathbb{R}) \qquad (B.25) $$
exists. Then, almost surely,
$$ \Gamma \in GL(r, \mathbb{R}). \qquad (B.26) $$
Moreover, for each $q\in\{1,\dots,r\}$, if $\mathcal{I}_+ = \{\ell : h_\ell > h_q\} \neq \emptyset$ (cf. (A.4)), then the submatrix
$$ \Gamma_+ := (\Gamma_{\ell,\ell'})_{\ell,\ell'\in\mathcal{I}_+} \qquad (B.27) $$
of $\Gamma$ is nonsingular. Moreover, if $U_r(n) = [u_{p-r+1}(n),\dots,u_p(n)]$, where $u_\ell(n)$ denotes an $\ell$-th unit eigenvector associated to the (deterministic) matrix $\mathbb{E}W(a2^j)$, and if along any subsequence $n'$ it holds that $\Gamma_{n'} := P^*(n')U_r(n')$ has a limit $\lim_{n'\to\infty}\Gamma_{n'} =: \Gamma \in M(r,\mathbb{R})$, then $\Gamma$ and $\Gamma_+ := (\Gamma_{\ell,\ell'})_{\ell,\ell'\in\mathcal{I}_+}$ are also nonsingular.

Proof:
Let $q\in\{1,\dots,r\}$. For notational simplicity, we write $n = n'$ to index the subsequence. So, for each $n$, let $p_{r+1}(n),\dots,p_p(n)$ be an orthonormal basis for the null space of $P(n)$. As a consequence of Lemma B.3,
$$ \sum_{i=r+1}^{p}\langle p_i(n), u_{p-r+q}(n)\rangle^2 \overset{P}{\to} 0, \quad n\to\infty. \qquad (B.28) $$
Also, for all $n\in\mathbb{N}$ and some $Q(n)\in M(p,r,\mathbb{R})$, $R(n)\in GL(r,\mathbb{R})$, let
$$ P(n) = Q(n)R(n) \qquad (B.29) $$
be the QR decomposition of $P(n)$ (e.g., Horn and Johnson (2012), Theorem 2.1.14, (a)). Write $\widetilde p_1(n),\dots,\widetilde p_r(n)$ for the (orthonormal) columns of $Q(n)$. Recall that, for a matrix $M = (m_{i,i'}) \in M(r,\mathbb{R})$, $\sum_{i,i'=1}^{r}m_{i,i'}^2 = \mathrm{tr}(M^*M) = \sum_{i=1}^{r}\sigma_i^2(M)$, where $\sigma_i(M)$, $i = 1,\dots,r$, are the singular values of $M$. Before showing (B.26), we establish that, for $U_r(n)$ as in (B.24), there exists some constant $C_1 > 0$ such that
$$ \sigma_1\big(Q^*(n)U_r(n)\big) \ge C_1 > 0 \qquad (B.30) $$
for large $n$ with probability going to 1. In fact, (B.28) implies that, as $n\to\infty$,
$$ \sum_{i=1}^{r}\langle\widetilde p_i(n), u_{p-r+q}(n)\rangle^2 \overset{P}{\to} 1, \quad q = 1,\dots,r. $$
Then, again as $n\to\infty$,
$$ \sum_{i,\ell=1}^{r}\langle\widetilde p_i(n), u_{p-r+\ell}(n)\rangle^2 = \sum_{s=1}^{r}\sigma_s^2\big(Q^*(n)U_r(n)\big) \overset{P}{\to} r. \qquad (B.31) $$
However, the largest singular value $\sigma_r\big(Q^*(n)U_r(n)\big)$ satisfies $\sigma_r\big(Q^*(n)U_r(n)\big) \le \|Q^*(n)\|\,\|U_r(n)\| = 1$. Hence, (B.30) holds, since the above inequality, together with (B.31), implies that all singular values of $Q^*(n)U_r(n)$ converge to 1 in probability.

Now, suppose $\Gamma x = 0$ for some $x\in\mathbb{R}^r$. Note that
$$ \|R^*(n)Q^*(n)U_r(n)\,x\| \ge \sigma_1(R^*(n))\,\sigma_1\big(Q^*(n)U_r(n)\big)\,\|x\| \ge C_1\,\sigma_1(R^*(n))\,\|x\|, $$
where the second inequality stems from (B.30). Thus, since $P(n)$ and $R(n)$ in (B.29) share the same singular values, condition (2.30) implies that
$$ \liminf_{n\to\infty}\|R^*(n)Q^*(n)U_r(n)\,x\| \ge C_1\lim_{n\to\infty}\sigma_1(R^*(n))\,\|x\| > 0. \qquad (B.32) $$
However, by condition (B.25),
$$ \|\Gamma x\| = \mathrm{p\text{-}lim}_{n\to\infty}\,\|R^*(n)Q^*(n)U_r(n)\,x\|. \qquad (B.33) $$
Therefore, by expressions (B.30), (B.32) and (B.33), as $n\to\infty$, for some $C' > 0$, $\|\Gamma x\| \ge C'\|x\|$ with probability going to 1. This implies that $x = 0$. Hence, (B.26) is established.

To show (B.27), once again let $\Gamma$ be as in (B.26) and fix $q\in\{1,\dots,r\}$. If $\mathcal{I}_+ = \{\ell: h_\ell > h_q\}\neq\emptyset$, Lemma B.7 implies that $\Gamma_{\ell',\ell} = 0$ whenever $\ell\in\mathcal{I}_-\cup\mathcal{I}$ and $\ell'\in\mathcal{I}_+$, since, in this case, $\ell < \ell'$ and $\langle p_{\ell'}(n), u_{p-r+\ell}(n)\rangle\,a(n)^{h_{\ell'}-h_\ell}$ is bounded, but $a(n)^{h_{\ell'}-h_\ell}\to\infty$. In other words, we can recast
$$ \Gamma = \begin{pmatrix} (\Gamma_{\ell,\ell'})_{\ell,\ell'\in\mathcal{I}_-\cup\mathcal{I}} & (\Gamma_{\ell,\ell'})_{\ell\in\mathcal{I}_-\cup\mathcal{I},\,\ell'\in\mathcal{I}_+} \\ 0 & (\Gamma_{\ell,\ell'})_{\ell,\ell'\in\mathcal{I}_+} \end{pmatrix}. \qquad (B.34) $$
For $\Gamma_+$ as in (B.27), suppose $\Gamma_+^*\,y = 0$ for some $y$. Now let $x = (0, y)^{T}\in\mathbb{R}^r$. Then, by (B.34), $\Gamma^*x = 0$. Thus, $x = 0$. In other words, (B.27) holds.

The statement regarding the limit $\Gamma$ of the sequence $\Gamma_{n'}$ built from eigenvectors of $\mathbb{E}W(a2^j)$ can be established by the same argument as for $\Gamma$. $\Box$

The following lemma is used in the proofs of Propositions 3.1 and B.1.
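Before stating it, two elementary linear-algebra facts invoked in the proof of Lemma B.4 above admit a quick numerical check: in a QR decomposition $P = QR$ with $Q$ having orthonormal columns, $P$ and $R$ share singular values, and the squared Frobenius norm of a matrix equals the sum of its squared singular values. The dimensions below are arbitrary.

```python
import numpy as np

# (i) In a QR decomposition P = Q R with Q having orthonormal columns, P and R
#     share the same singular values.
# (ii) For any M, sum of squared entries = tr(M^* M) = sum of squared singular values.
rng = np.random.default_rng(4)
p, r = 60, 4

P = rng.standard_normal((p, r))
Q, R = np.linalg.qr(P, mode="reduced")       # Q is p x r, R is r x r upper-triangular

sv_P = np.linalg.svd(P, compute_uv=False)
sv_R = np.linalg.svd(R, compute_uv=False)
print(np.allclose(np.sort(sv_P), np.sort(sv_R)))   # True: same singular values

M = rng.standard_normal((r, r))
lhs = np.sum(M ** 2)
rhs = np.sum(np.linalg.svd(M, compute_uv=False) ** 2)
print(np.isclose(lhs, rhs))                        # True: tr(M^T M) = sum sigma_i^2
```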
Lemma B.5
Fix $q\in\{1,\dots,r\}$, let $u_\ell(n)$ be a unit eigenvector corresponding to the $\ell$-th eigenvalue of the random matrix $W(a2^j)$, and for each $n$ define
$$ \gamma_\ell(n) := P^*(n)\,u_{p-r+\ell}(n), \quad \ell = 1,\dots,r. \qquad (B.35) $$
Suppose $n_a$ is a subsequence along which the limits
$$ \mathrm{p\text{-}lim}_{n_a\to\infty}\, \gamma_\ell(n_a) = \gamma^a_\ell, \quad \ell = 1,\dots,r, \qquad (B.36) $$
exist. For any fixed scalars
$$ \alpha_q, \dots, \alpha_{r-r_+}, \qquad (B.37) $$
let
$$ \widetilde\gamma = \sum_{\ell=q}^{r-r_+} \alpha_\ell\,\gamma^a_\ell. \qquad (B.38) $$
Fix a vector
$$ x_* = (x_{*,r-r_++1}, \dots, x_{*,r})^* \in \mathbb{R}^{r_+}. \qquad (B.39) $$
Then, there exists a sequence of unit vectors $v_a(n_a) \in \mathrm{span}\{u_{p-r+q}(n_a),\dots,u_p(n_a)\}$ satisfying
$$ P^*(n_a)\,v_a(n_a) \overset{P}{\to} \frac{\widetilde\gamma}{\sqrt{\sum_{\ell=q}^{r-r_+}\alpha_\ell^2}}, \quad n_a\to\infty, \qquad (B.40) $$
where, with probability tending to 1,
$$ \langle p_\ell(n_a), v_a(n_a)\rangle = \frac{x_{*,\ell}}{a(n_a)^{h_\ell-h_q}}, \quad \ell\in\mathcal{I}_+. \qquad (B.41) $$
In particular, (A.36) holds. Moreover, for any sequence of eigenvectors $u_\ell(n)$ associated to the $\ell$-th eigenvalue of $\mathbb{E}W(a2^j)$, if along some subsequence $n_a$ the (deterministic) limits
$$ \lim_{n_a\to\infty} P^*(n_a)\,u_{p-r+\ell}(n_a) = \gamma^a_\ell, \quad \ell=1,\dots,r, \qquad (B.42) $$
exist, then, given any $x_*$ as in (B.39) and $\widetilde\gamma$ as in (B.38), there exists $v_a(n_a) \in \mathrm{span}\{u_{p-r+q}(n_a),\dots,u_p(n_a)\}$ satisfying (B.40), with deterministic convergence replacing convergence in probability, and (B.41). In addition, we may write $v(n_a) = \sum_{\ell=q}^{r} c_\ell(n_a)\,u_{p-r+\ell}(n_a)$, where
$$ \sum_{\ell\in\mathcal{I}_+} c_\ell^2(n_a) = O(a^{-\varpi}). \qquad (B.43) $$

Proof:
For notational simplicity, write n = n a . For n ∈ N , we will construct R -valued coeffi-cients c q ( n ) , . . . , c r ( n ) (B.44)such that the vector v ( n ) := r X ℓ = q c ℓ ( n ) u p − r + ℓ ( n ) (B.45)satisfies relation (B.40), and also such that (B.41) holds with probability tending to 1. For vectors γ a ℓ , ℓ = 1 , . . . , r , as in (B.36), let Γ = ( γ a , . . . , γ a r ) . (B.46)Also define ϑ n = (cid:16) x ∗ ,ℓ a ( n ) h ℓ − h q (cid:17) ℓ ∈I + ∈ R r , (B.47)which is the vector that contains the right-hand terms in (B.41). Note that ϑ n → , n → ∞ . (B.48)37onsider the matrices Γ n ∈ M ( r, R ) and Γ + n ∈ M ( r , R ) given byΓ n = (cid:0) h p ℓ ( n ) , u p − r + ℓ ′ ( n ) i (cid:1) ≤ ℓ,ℓ ′ ≤ r , Γ + n = (cid:0) h p ℓ ( n ) , u p − r + ℓ ′ ( n ) i (cid:1) ℓ,ℓ ′ ∈I + . (B.49)By (B.36) and (B.46), as n → ∞ , Γ n P → Γ . (B.50)In particular, Γ + n P → Γ + = (cid:0) Γ ℓ,ℓ ′ (cid:1) ℓ,ℓ ′ ∈I + . Therefore, by Lemma B.4, Γ + is nonsingular. Thisimplies that P (cid:0) Γ + n has full rank (cid:1) → , n → ∞ . (B.51)Therefore, for any fixed 0 ≤ s ≤ α q , ..., α r − r as in (B.37), thesystem of equationsΓ + n ς = ϑ n − s qP r − r ℓ = q α ℓ P r − r ℓ = q α ℓ h p r − r +1 ( n ) , u p − r + ℓ ( n ) i ... P r − r ℓ = q α ℓ h p r ( n ) , u p − r + ℓ ( n ) i (B.52)has a (unique) solution ς = ς n ( s ) = (cid:0) ς r − r +1 ,n ( s ) , . . . , ς r,n ( s ) (cid:1) ∗ ∈ R r (B.53)also with probability going to 1 as n → ∞ . Note that, by Lemma B.7, as n → ∞ , h p ℓ ′ ( n ) , u p − r + ℓ ( n ) i P → , ℓ ∈ I , ℓ ′ ∈ I + . (B.54)In view of (B.48) and (B.54), the vector on the right-hand side of (B.52) goes to zero in probability.Therefore, by the asymptotic full rank relation (B.51), the solution ς n ( s ) to the system (B.52)satisfies sup s ∈ [0 , k ς n ( s ) k P → , n → ∞ . (B.55)Thus, for any small ε >
0, $0 \le \sup_{s\in[0,1]}\|\varsigma_n(s)\| < \varepsilon$ for large enough $n$ with probability going to 1. Now note that, for each $n$, the function $s \mapsto f(s) = 1 - s^2 - \|\varsigma_n(s)\|^2$ depends continuously on $s$. Moreover, $f(0) > 1 - \varepsilon^2$ and $f(1) = -\|\varsigma_n(1)\|^2$. Hence, $f$ must have a root $s_*(n) \in (0,$
1] withprobability tending to 1. So, for one such root s ∗ ( n ) ∈ (0 , , (B.56)for α ℓ , ℓ = q, . . . , r − r , as in (B.37), and for ς ℓ,n ( s ), ℓ = 1 , . . . , r , as in (B.53), define the vector c ( n ) ∗ = ( c ( n ) , . . . , c r ( n )) := (cid:16) , . . . , , s ∗ ( n ) α q qP r − r ℓ = q α ℓ , . . . , s ∗ ( n ) α r − r qP r − r ℓ = q α ℓ , ς r − r +1 ,n ( s ∗ ( n )) , . . . , ς r,n ( s ∗ ( n )) (cid:17) , if s ∗ ( n ) exists; (cid:16) , . . . , , α q qP r − r ℓ = q α ℓ , . . . , α r − r qP r − r ℓ = q α ℓ , , . . . , (cid:17) , otherwise . (B.57)Then, k c ( n ) k = 1 a.s. However, (B.55) further implies that the root (B.56) satisfies s ∗ ( n ) P → n → ∞ . Therefore, as a consequence of the uniform limit in (B.55), c ( n ) ∗ P → (cid:16) , . . . , , α q qP r − r ℓ = q α ℓ , . . . , α r − r qP r − r ℓ = q α ℓ , , . . . , (cid:17) , n → ∞ . (B.58)38urning back to the matrix Γ n as in (B.49), the limits in probability (B.50) and (B.58) then implythat Γ n c ( n ) P → e γ qP r − r ℓ = q α ℓ , n → ∞ , (B.59)where e γ is given by (B.38). This establishes (B.40) for v ( n ) as in (B.45) and for c ( n ) as in (B.57).Then, for a fixed x ∗ as in (B.39), by again taking v ( n ) as in (B.45), with c q ( n ) , . . . , c r ( n ) asin (B.57), relation (B.52) implies that, for ℓ ∈ I + , h p ℓ ( n ) , v ( n ) i = r X i = q c i ( n ) h p ℓ ( n ) , u p − r + i ( n ) i (B.57) = s ∗ ( n ) r − r X i = q α i h p ℓ ( n ) , u p − r + i ( n ) i qP r − r ℓ = q α ℓ + X i ∈I + ς i,n ( s ∗ ( n )) h p ℓ ( n ) , u p − r + i ( n ) i (B.52) = s ∗ ( n ) r − r X i = q α i h p ℓ ( n ) , u p − r + i ( n ) i qP r − r ℓ = q α ℓ + (cid:16) x ∗ ,ℓ a ( n ) h ℓ − h q − s ∗ ( n ) r − r X i = q α i h p ℓ ( n ) , u p − r + i ( n ) i qP r − r ℓ = q α ℓ (cid:17) = x ∗ ,ℓ a ( n ) h ℓ − h q , with probability tending to 1. This establishes (B.41).The deterministic analogs to statements (B.40) and (B.41) regarding E W ( a j ) can be estab-lished by repeating the same argument for establishing (B.40) and (B.41) applied to E W ( a j ).To establish expression (B.43), consider expression (B.52) with u p − r + ℓ ( n ) in place of u p − r + ℓ ( n )and Γ + n := (cid:0) h p ℓ ( n ) , u p − r + ℓ ′ ( n ) i (cid:1) ℓ,ℓ ′ ∈I + in place of Γ + n . Then, for all large n , we may write, forsome s ∗ ( n ) ∈ (0 , c p − r +1 ( n )... c r ( n ) = (Γ + n ) − ϑ n − s ∗ ( n ) qP r − r ℓ = q α ℓ P r − r ℓ = q α ℓ h p r − r +1 ( n ) , u p − r + ℓ ( n ) i ... P r − r ℓ = q α ℓ h p r ( n ) , u p − r + ℓ ( n ) i . (B.60)Note that Γ + n is invertible due to Lemma B.4, since the limit Γ + of Γ + n is invertible. By con-struction, k ϑ n k = O ( a − ̟ ). In addition, h p i ( n ) , u p − r + ℓ ( n ) i = O ( a h i − h ℓ ) = O ( a − ̟ ) , when i ∈ I + , ℓ ∈ I due to Lemma B.2, expression (B.11). Thus, from (B.60), we obtain X ℓ ∈I + c ℓ ( n ) ≤ k (Γ + n ) − k k O ( a − ̟ ) + O ( a − ̟ ) k = O ( a − ̟ ) . This shows (B.43). (cid:3)
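The normalization device in the proof of Lemma B.5 (solving the small system (B.52) and then choosing $s_*$ so that the coefficient vector has unit norm) can be mimicked numerically. The sketch below uses arbitrary toy inputs in place of $\Gamma^+_n$, $\vartheta_n$ and the $\alpha$-terms; it only illustrates the intermediate-value argument, not the lemma itself.

```python
import numpy as np
from scipy.optimize import brentq

# Toy illustration of the normalization step in the proof of Lemma B.5: solve a small
# linear system (standing in for (B.52)) for varsigma_n(s) and pick s* in (0, 1] with
# f(s*) = 1 - s*^2 - ||varsigma_n(s*)||^2 = 0, so the coefficient vector has unit norm.
rng = np.random.default_rng(5)
r_plus = 2                                   # stand-in for |I_+|

Gamma_plus = np.eye(r_plus) + 0.05 * rng.standard_normal((r_plus, r_plus))  # ~ full rank
theta = 1e-2 * rng.standard_normal(r_plus)   # small right-hand side, cf. (B.47)-(B.48)
drift = 1e-2 * rng.standard_normal(r_plus)   # small term multiplying s, cf. (B.52)

def varsigma(s):
    # Unique solution of Gamma_plus @ varsigma = theta - s * drift.
    return np.linalg.solve(Gamma_plus, theta - s * drift)

def f(s):
    return 1.0 - s ** 2 - np.sum(varsigma(s) ** 2)

# f(0) > 0 and f(1) < 0 when ||varsigma|| stays small, so a root exists in (0, 1).
s_star = brentq(f, 0.0, 1.0)
print(s_star, np.isclose(s_star ** 2 + np.sum(varsigma(s_star) ** 2), 1.0))
```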
The following lemma is used in the proof of Theorem 3.1.
Lemma B.6
Fix $q\in\{1,\dots,r\}$, and let $W(a2^j, R)$ be as in (A.75). If $\mathcal{I}_- = \{\ell: h_\ell < h_q\} \neq \emptyset$, then for any fixed $\ell\in\mathcal{I}_-$ and for every small $\eta > 0$, there is $\varepsilon(\eta) > 0$ such that, for $\|R\|\le\varepsilon(\eta)$,
$$ |\lambda_{p-r+\ell}(W(a2^j, R))| \le \eta \qquad (B.61) $$
as $n\to\infty$ with probability going to 1. Likewise, if $\mathcal{I}_+ = \{\ell: h_\ell > h_q\}\neq\emptyset$, then for any fixed $\ell\in\mathcal{I}_+$ and for every large $\Xi > 0$, there is $\varepsilon(\Xi) > 0$ such that, for $\|R\|\le\varepsilon(\Xi)$,
$$ |\lambda_{p-r+\ell}(W(a2^j, R)) - \xi_q(2^j)| \ge \Xi \qquad (B.62) $$
as $n\to\infty$ with probability going to 1. In (B.62), the function $\xi_q(2^j)$ is as in Proposition 3.1.

Proof: Let $q\in\{1,\dots,r\}$. By Lemma B.2, for fixed $\varepsilon > 0$ and $\varphi' >$
0, there is M ( ε ) > k R k≤ ε max q =1 ,...,r (cid:13)(cid:13)(cid:13) u ∗ p − r + q ( n, R ) P a h a h q (cid:13)(cid:13)(cid:13) ≤ M ( ε ) , (B.63)for all n , with probability at least 1 − ϕ ′ . So, fix any small η >
0, and suppose ℓ ∈ I − . By (A.78)for λ p − r + ℓ , for an appropriately small ε = ε ( η ) and any R such that k R k ≤ ε , (cid:12)(cid:12)(cid:12) u ∗ p − r + ℓ ( n, R ) (cid:16) P a h B a (2 j ) a h P ∗ a h q + W Z ( a j ) a h q +1 + P a h a h q R + R ∗ a h P ∗ a h q (cid:17) u p − r + ℓ ( n, R ) (cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12) a h ℓ − h q ) u ∗ p − r + ℓ ( n, R ) (cid:16) P a h B a (2 j ) a h P ∗ a h ℓ + W Z ( a j ) a h ℓ +1 + P a h a h ℓ R + R ∗ a h P ∗ a h ℓ (cid:17) u p − r + ℓ ( n, R )+ (cid:16) a h ℓ − h q u ∗ p − r + ℓ ( n, R ) (cid:16) P a h a h ℓ R + R ∗ a h P ∗ a h ℓ (cid:17) u p − r + ℓ ( n, R ) (cid:17) − (cid:16) a h ℓ − h q ) u ∗ p − r + ℓ ( n, R ) (cid:16) P a h a h ℓ R + R ∗ a h P ∗ a h ℓ (cid:17) u p − r + ℓ ( n, R ) (cid:17)(cid:12)(cid:12)(cid:12) ≤ (cid:12)(cid:12)(cid:12) a h ℓ − h q ) u ∗ p − r + ℓ ( n, R ) (cid:16) P a h B a (2 j ) a h P ∗ a h ℓ + W Z ( a j ) a h ℓ +1 + P a h a h ℓ R + R ∗ a h P ∗ a h ℓ (cid:17) u p − r + ℓ ( n, R ) (cid:12)(cid:12)(cid:12) + (cid:12)(cid:12)(cid:12) a h ℓ − h q u ∗ p − r + ℓ ( n, R ) (cid:16) P a h a h ℓ R + R ∗ a h P ∗ a h ℓ (cid:17) u p − r + ℓ ( n, R ) (cid:12)(cid:12)(cid:12) + (cid:12)(cid:12)(cid:12) a h ℓ − h q ) u ∗ p − r + ℓ ( n, R ) (cid:16) P a h a h ℓ R + R ∗ a h P ∗ a h ℓ (cid:17) u p − r + ℓ ( n, R ) (cid:12)(cid:12)(cid:12) ≤ a h ℓ − h q ) ξ ℓ (2 j )(1 + η ) + a h ℓ − h q M ( ε ) ε + a h ℓ − h q ) M ( ε ) ε < η, (B.64)where the last inequality holds for large enough n . This shows (B.61). Now, suppose ℓ ∈ I + , andfix any large Ξ >
0. Consider M ( ε ) as in (B.63). For any small η ′ >
0, if needed pick ε ≤ ε such that 2 M ( ε ) ε < η ′ ξ ℓ (2 j ) , where ξ ℓ (2 j ) is as in Proposition 3.1. Then, by (A.78) for λ p − r + ℓ , for any R such that k R k ≤ ε ,with probability at least 1 − ϕ ′ , u ∗ p − r + ℓ ( n, R ) (cid:16) P a h B a (2 j ) a h P ∗ a h q + W Z ( a j ) a h ℓ + P a h a h q R + R ∗ a h P ∗ a h q (cid:17) u p − r + ℓ ( n, R )= a h ℓ − h q ) u ∗ p − r + ℓ ( n, R ) (cid:16) P a h B a (2 j ) a h P ∗ a h ℓ + W Z ( a j ) a h ℓ +1 + P a h a h ℓ R + R ∗ a h P ∗ a h ℓ (cid:17) u p − r + ℓ ( n, R )+ a h ℓ − h q u ∗ p − r + ℓ ( n, R ) (cid:16) P a h a h ℓ R + R ∗ a h P ∗ a h ℓ (cid:17) u p − r + ℓ ( n, R ) − a h ℓ − h q ) u ∗ p − r + ℓ ( n, R ) (cid:16) P a h a h ℓ R + R ∗ a h P ∗ a h ℓ (cid:17) u p − r + ℓ ( n, R )= a h ℓ − h q ) n u ∗ p − r + ℓ ( n, R ) (cid:16) P a h B a (2 j ) a h P ∗ a h ℓ + W Z ( a j ) a h ℓ +1 + P a h a h ℓ R + R ∗ a h P ∗ a h ℓ (cid:17) u p − r + ℓ ( n, R )+ 1 a h ℓ − h q u ∗ p − r + ℓ ( n, R ) (cid:16) P a h a h ℓ R + R ∗ a h P ∗ a h ℓ (cid:17) u p − r + ℓ ( n, R )40 u ∗ p − r + ℓ ( n, R ) (cid:16) P a h a h ℓ R + R ∗ a h P ∗ a h ℓ (cid:17) u p − r + ℓ ( n, R ) o ≥ a h ℓ − h q ) n ξ ℓ (2 j )(1 − η ′ ) − a h ℓ − h q M ( ε ) ε − M ( ε ) ε o ≥ a h ℓ − h q ) n ξ ℓ (2 j )(1 − η ′ ) − a h ℓ − h q M ( ε ) ε o > Ξ , (B.65)where the last inequality holds for large enough n . This shows (B.62). (cid:3) The following proposition is used in the proof of Theorem 3.1 as well as in Corollary 3.1.In particular, note that establishing the behavior of the eigenvectors of W ( a ( n )2 j ) in terms ofthe coordinates given by P ( n ) (see expression (B.67)) is an important step unique to the high-dimensional framework. This is so because, in finite dimensions, it can be shown that there alwaysexists a convergent sequence of eigenvectors of the corresponding wavelet matrix (cf. Proposition1, ( iii ), and Lemma 3 in Abry and Didier (2018 a )). Proposition B.1
Fix $j\in\mathbb{N}$ and suppose the (W)- and (A)-assumptions hold. Suppose that either

(i) $0 < h_1 < \dots < h_r < 1$; or

(ii) $h_1 = \dots = h_r$, and whenever $q_1 \neq q_2$, the functions $\xi_q(2^j)$ in (3.1) satisfy $\xi_{q_1}(1) \neq \xi_{q_2}(1)$. \qquad (B.66)

Then, for each $q\in\{1,\dots,r\}$, there is a sequence of $(p-r+q)$-th unit eigenvectors $\{u_{p-r+q}(n)\}_{n\in\mathbb{N}}$ of $W(a2^j)$ along which the limits
$$ \mathrm{p\text{-}lim}_{n\to\infty}\, P^*(n)\,u_{p-r+q}(n) =: \gamma_q \qquad (B.67) $$
exist.

Proof:
Let p r +1 ( n ) , . . . , p p ( n ) (B.68)be an orthonormal basis for the nullspace of P = P ( n ). Recall (B.18), which shows that, in eithercase ( i ) or ( ii ), p X i = r +1 h p i ( n ) , u p − r + q ( n ) i = o P ( a − ̟ ) . (B.69)So, consider the statement (B.67) under condition ( i ), and let q ∈ { , . . . , r } . Now, for each n , consider the ( QR ) decomposition P ( n ) = Q ( n ) R ( n ) . (B.70)In (B.70), R ( n ) ∈ GL ( r, R ) is upper-triangular. Also, the columns e p ( n ) , . . . , e p r ( n ) of Q ( n ) ∈ M ( p, r, R ), are orthonormal and satisfy the set of relationsspan { e p ℓ ( n ) , . . . , e p r ( n ) } = span { p ℓ ( n ) , . . . , p r ( n ) } , ℓ = 1 , . . . , r. (B.71)Note that, in particular, span { e p r ( n ) } = span { p r ( n ) } . (B.72)41oreover, for p i ( n ), i = r + 1 , . . . , p , as in (B.69),span { e p ( n ) , . . . , e p r ( n ) } ⊥ span { p r +1 ( n ) , . . . , p p ( n ) } . (B.73)As a consequence of (B.69) and (B.73), for any ℓ in the range 1 , . . . , r , r X i =1 h e p i ( n ) , u p − r + ℓ ( n ) i P → , n → ∞ . (B.74)Starting from (B.72), we now show that h e p r ( n ) , u p ( n ) i = 1 k p r ( n ) k h p r ( n ) , u p ( n ) i P → , n → ∞ . (B.75)By way of contradiction, suppose that, for some ε > h e p r ( n ) , u p ( n ) i ≤ − ε for some subsequence (still denoted n ∈ N , for notational simplicity).Observe that, from expression (A.7), 1 k p r ( n ) k u ∗ p ( n ) W ( a j ) a h r +1 u p ( n )= 1 k p r ( n ) k (cid:16) h p ( n ) , u p ( n ) i , . . . , h p r ( n ) , u p ( n ) i (cid:17) diag( a h − h r , . . . , × b B a (2 j ) diag( a h − h r , . . . , h p ( n ) , u p ( n ) i ... h p r ( n ) , u p ( n ) i + o P (1) P ∼ k p r ( n ) k h p r ( n ) , u p ( n ) i (cid:0) b B a (2 j ) (cid:1) rr = h e p r ( n ) , u p ( n ) i (cid:0) b B a (2 j ) (cid:1) rr ≤ (1 − ε ) (cid:0) b B a (2 j ) (cid:1) rr < (cid:0) b B a (2 j ) (cid:1) rr P ∼ |h p r ( n ) , e p r ( n ) i| e p ∗ r ( n ) W ( a j ) a h r +1 e p r ( n )= 1 k p r ( n ) k e p ∗ r ( n ) W ( a j ) a h r +1 e p r ( n ) ≤ k p r ( n ) k u ∗ p ( n ) W ( a j ) a h r +1 u p ( n ) , which is a contradiction. Therefore, (B.75) holds. In particular, this establishes the asymptoticbehavior of the eigenvector u p ( n ) with respect to the coordinates given by e p ( n ) , . . . , e p r ( n ) and p r +1 ( n ) , . . . , p p ( n ) , (B.76)where p = p ( n ).We now establish analogous statements for the eigenvectors u p − r + q ( n ), q = 1 , . . . , r −
1. Basedon conditions (2.22), Lemma B.2 implies that |h p i ( n ) , u p − r + q ( n ) i| a h i − h q = O P (1) , i = q + 1 , . . . , r. Therefore, sequentially for q = 1 , . . . , r − i = q +1 ,...,r |h p i ( n ) , u p − r + q ( n ) i| P → , n → ∞ . (B.77)42ow recall that, by relation (B.71),span { e p ℓ ( n ) , . . . , e p r ( n ) } = span { p ℓ ( n ) , . . . , p r ( n ) } , ℓ = 1 , . . . , r. Thus, using (B.74) and (B.77), by choosing the sequences (i.e., possibly flipping the signs) of u p − r + ℓ ( n ) appropriately for each ℓ = 1 , . . . , r , we obtain Q ∗ ( n ) U r ( n ) P → I r , n → ∞ , (B.78)where U r ( n ) = ( u p − r +1 ( n ) , . . . , u p ( n )) ∈ M ( p, r, R ). In other words, |h e p ℓ ( n ) , u p − r + ℓ ( n ) i| P → , ℓ = 1 , . . . , r. This establishes the asymptotic behavior of the eigenvectors u p − r + q ( n ), q = 1 , . . . , r , with respectto the coordinates given by (B.76). We now want to express such behavior in terms of thecoordinate system p ( n ) , . . . , p r ( n ) (i.e., the columns of P ( n )) and p r +1 ( n ) , . . . , p p ( n ) , (B.79)where p = p ( n ).In fact, under condition (2.30) P ∗ ( n ) P ( n ) = R ∗ ( n ) R ( n ) → A, n → ∞ . (B.80)Moreover, the (bounded) sequence { R ( n ) } n ∈ N must have a limit. In fact, let R and R belimits along any two subsequences. Then, by (B.80), R ∗ R = R ∗ R = A . Now, since A issymmetric positive definite, then R = R =: R as a consequence of the uniqueness of theCholesky decomposition (Horn and Johnson (2012), Corollary 7.2.9). Namely, R ( n ) → R, n → ∞ . (B.81)Thus, by (B.78) and for q = 1 , . . . , r , P ∗ ( n ) u p − r + q ( n ) = R ∗ ( n ) Q ∗ ( n ) u p − r + q ( n ) P → R e q =: γ q , where e q is the standard Euclidean vector in R r . This establishes (B.67) under condition ( i ) . Now, suppose condition ( ii ) holds, and fix q = 1 , . . . , r . For P = P ( n ), write W P X ( a j ) := P W X ( a j ) P ∗ , and let h := h = . . . = h r . Also let e u p − r + q ( n ) be a ( p − r + q )–th unit eigenvectorof the matrix W P X ( a j ) a h +1 = P a h b B a (2 j ) a h P ∗ a h = P b B a (2 j ) P ∗ (namely, an eigenvector associated with the ( p − r + q )–th ordered eigenvalue). Then, clearly, P ∗ P b B a (2 j ) (cid:0) P ∗ e u p − r + q ( n ) (cid:1) = λ p − r + q (cid:0) P b B a (2 j ) P ∗ (cid:1) P ∗ e u p − r + q ( n ) . In other words, for q = 1 , . . . , r , the vector v q ( n ) := P ∗ e u p − r + q ( n ) k P ∗ e u p − r + q ( n ) k (B.82)43s a unit eigenvector of P ∗ P b B a (2 j ) with corresponding eigenvalue λ q ( P ∗ P b B a (2 j )) = λ p − r + q ( P b B a (2 j ) P ∗ ) . (B.83)Moreover, by Corollary B.1 with M n ≡ , as n → ∞ , λ q ( P ∗ P b B a (2 j )) = λ p − r + q ( P b B a (2 j ) P ∗ ) = λ p − r + q (cid:16) P a h b B a (2 j ) a h P ∗ a h (cid:17) P → ξ q (2 j ) . (B.84)In view of assumption (B.94), expression (B.84) shows that the eigenvalues of P ∗ P b B a (2 j ) aresimple with probability tending to 1, since by the scaling relation (3.2), for any q = q , ξ q (2 j ) = 2 j (2 h +1) ξ q (1) = 2 j (2 h +1) ξ q (1) = ξ q (2 j ) . In particular, since P ∗ P b B a (2 j ) P → A B (2 j ) due to condition (2.30), by flipping the signs of v q ( n ) = P ∗ e u p − r + q ( n ) k P ∗ e u p − r + q ( n ) k if necessary, we obtain the limit P ∗ e u p − r + q ( n ) k P ∗ e u p − r + q ( n ) k = v q ( n ) P → v q , n → ∞ , (B.85)where v q is a q –th unit eigenvector of the matrix A B (2 j ). 
Moreover, the normalizing factor $\|P^*\widetilde u_{p-r+q}(n)\|$ in (B.82) must have a limit, since
$$ \|P^*\widetilde u_{p-r+q}(n)\|^2 = \frac{1}{v_q^*(n)\widehat B_a(2^j)v_q(n)}\;\widetilde u_{p-r+q}(n)^*\,P\widehat B_a(2^j)P^*\,\widetilde u_{p-r+q}(n) = \frac{\lambda_{p-r+q}\big(P\widehat B_a(2^j)P^*\big)}{v_q^*(n)\widehat B_a(2^j)v_q(n)} \overset{P}{\to} \frac{\xi_q(2^j)}{v_q^*B(2^j)v_q}, \quad n\to\infty. \qquad (B.86) $$
In the last line of (B.86), we made use of the convergence $\widehat B_a(2^j) \overset{P}{\to} B(2^j)$, as well as of the limits (B.84) and (B.85). Together with (B.85), this shows that the limit
$$ \mathrm{p\text{-}lim}_{n\to\infty}\, P^*\widetilde u_{p-r+q}(n) = \mathrm{p\text{-}lim}_{n\to\infty}\, v_q(n)\,\|P^*\widetilde u_{p-r+q}(n)\| =: \gamma_q \qquad (B.87) $$
exists. By assumption (2.22),
$$ \Big\|\frac{W(a2^j)}{a^{2h+1}} - \frac{W^P_X(a2^j)}{a^{2h+1}}\Big\| = \Big\|\frac{O_P(1)}{a^{2h+1}} + \frac{Pa^{h}O_P(1)}{a^{2h+1/2}} + \frac{O^*_P(1)a^{h}P^*}{a^{2h+1/2}}\Big\| = o_P(1). $$
Since the $r$ largest eigenvalues of $\frac{W(a2^j)}{a^{2h+1}}$ and $\frac{W^P_X(a2^j)}{a^{2h+1}}$ are simple, Lemma B.9 implies that, by flipping the signs of the $(p-r+q)$-th unit eigenvectors $u_{p-r+q}(n)$ of $W(a2^j)$ if necessary, we obtain
$$ \|u_{p-r+q}(n) - \widetilde u_{p-r+q}(n)\| = o_P(1). \qquad (B.88) $$
Hence, $\|P^*u_{p-r+q}(n) - P^*\widetilde u_{p-r+q}(n)\| \le \|P\|\,\|u_{p-r+q}(n) - \widetilde u_{p-r+q}(n)\| = o_P(1)$. In particular, in view of (B.87), this implies that $P^*u_{p-r+q}(n) \overset{P}{\to} \gamma_q$, $n\to\infty$, which establishes (B.67) under condition ($ii$). $\Box$

The following lemma is used in the proof of Theorem 3.1.

Lemma B.7
Fix $q\in\{1,\dots,r-1\}$ and suppose $h_1 < \dots < h_r$. Let $u_{p-r+\ell}(n)$ be a unit eigenvector associated with the $(p-r+\ell)$-th eigenvalue of the random matrix $W(a(n)2^j)$. Then, there are constants $x_{*,q,\ell}$, $\ell\in\mathcal{I}_+$, such that, as $n\to\infty$,
$$ \langle p_\ell(n), u_{p-r+q}(n)\rangle\, a(n)^{h_\ell-h_q} \overset{P}{\to} x_{*,q,\ell}, \quad \ell\in\mathcal{I}_+. \qquad (B.89) $$

Proof:
Consider an arbitrary subsequence of h p ℓ ( n ) , u p − r + q ( n ) i a ( n ) h ℓ − h q , still indexed by n ∈ N for simplicity. By expression (B.10) in Lemma B.2, there is some further subsequence n ′ alongwhich e x ∗ ,q,ℓ := p -lim n ′ →∞ h p ℓ ( n ′ ) , u p − r + q ( n ′ ) i a ( n ′ ) h ℓ − h q exists for all ℓ ∈ I + = { q + 1 , . . . , r } .Write x ( n ) := (cid:0) a ( n ′ ) h ℓ − h q h p ℓ ( n ′ ) , u p − r + q ( n ′ ) i (cid:1) ℓ ∈I + , e x := ( e x ∗ ,q,ℓ ) ℓ ∈I + , (B.90)i.e., x ( n ) P → e x . Moreover, by Proposition B.1, the limits γ ℓ = p -lim n →∞ γ ℓ ( n ) exist, where γ ℓ ( n ) := P ∗ ( n ) u p − r + ℓ ( n ). Now, let x ∗ ( n ) ∈ R r and x ∗ = ( x ∗ ,q +1 , . . . , x ∗ ,r ) ∈ R r (B.91)the minimizers of the functions g n ′ ( γ q ( n ′ ) , · ) and g ( γ q , · ), respectively, as defined in(A.32) and (A.30). By Lemma B.5, we may take a sequence of unit vectors v ( n ′ ) ∈ span { u p − r + q ( n ′ ) , . . . , u p ( n ′ ) } such that, with probability tending to 1, h p ℓ ( n ′ ) , v ( n ′ ) i = x ∗ ,ℓ a h ℓ − h q , ℓ ∈ { q + 1 , . . . , r } , P ∗ ( n ′ ) v ( n ′ ) P → γ q . (B.92)Thus, g n ′ (cid:16) γ q ( n ′ ); x ∗ ( n ′ ) (cid:17) + u ∗ p − r + q ( n ′ ) (cid:16) W ( a j ) − P a h b B a (2 j ) a h P ∗ a h q (cid:17) u p − r + q ( n ′ ) . ≤ g n ′ (cid:16) γ q ( n ′ ); x ( n ) (cid:17) + u ∗ p − r + q ( n ′ ) (cid:16) W ( a j ) − P a h b B a (2 j ) a h P ∗ a h q (cid:17) u p − r + q ( n ′ ) . = λ p − r + q ( W ( a ( n ′ )2 j )) a ( n ′ ) h q +1 ≤ v ∗ ( n ′ ) W ( a ( n ′ )2 j )) a ( n ′ ) h q +1 v ( n ′ )= g n ′ ( γ ( n ′ ); x ∗ ) + v ∗ ( n ′ ) (cid:16) W ( a ( n ′ )2 j ) a ( n ′ ) h q +1 − P ( n ′ ) a ( n ′ ) h b B a (2 j ) a ( n ′ ) h P ∗ ( n ′ ) a ( n ′ ) h q (cid:17) v ( n ′ ) . (B.93)Now, as n ′ → ∞ , g n ′ ( γ q ( n ′ ); x ∗ ( n ′ )) P → g ( γ q , x ∗ ), and g n ′ ( γ q ( n ′ ); x ( n ′ )) P → g ( γ q , e x ). Analogouslyto (A.38), and (A.45), by taking limits along the subsequence n ′ in the string of inequalities(B.93), we obtain ξ q (2 j ) = g ( γ q ; x ∗ ) = g ( γ q ; e x ∗ ) , Since the minimizer x ∗ of g ( γ q ; · ) is unique, we conclude that e x ∗ ,ℓ = x ∗ ,ℓ for ℓ = q + 1 , . . . , r ,which establishes the claim. (cid:3) The following proposition is used in the proof of Corollary 3.1.
Proposition B.2
Fix $j\in\mathbb{N}$ and suppose the (W)- and (A)-assumptions hold. Suppose that either

(i) $0 < h_1 < \dots < h_r < 1$; or

(ii) $h_1 = \dots = h_r$, and whenever $q_1\neq q_2$, the functions $\xi_q(2^j)$ in (3.1) satisfy $\xi_{q_1}(1)\neq\xi_{q_2}(1)$. \qquad (B.94)

Then, for some $C > 0$ that does not depend on $j$,
$$ \Big| \frac{\lambda_{p-r+q}(\mathbb{E}W(a(n)2^j))}{a(n)^{2h_q+1}} - \xi_q(2^j) \Big| \le \frac{C}{a(n)^{\varpi}}, \quad q = 1,\dots,r. \qquad (B.95) $$

Proof:
Assume condition ( i ) holds. We prove only the statement for q ∈ { , . . . , r − } sincethe cases q = 1 , r can be handled by a simplified version of the following argument. For γ = (cid:0) γ , . . . , γ r (cid:1) ∗ ∈ R r , x ∈ R r , consider the deterministic function g n ( γ ; x ) := y ∗ n ( γ ; x ) B a (2 j ) y n ( γ ; x ) ∈ R (B.96)(c.f. expression (A.32)), where y n ( γ ; x ) = (cid:16) γ a ( n ) h − h q , . . . , γ r a ( n ) h r − h q | {z } r , γ q , x q +1 , . . . , x r | {z } r (cid:17) ∗ . Now recall the function g ( γ ; x ) given by (A.30), namely, g ( γ ; x ) = y ∗ ( γ ; x ) B (2 j ) y ( γ ; x ) ∈ R , (B.97)where R r ∋ y ( γ ; x ) = (cid:16) , . . . , | {z } r , γ q , x q +1 , . . . , x r | {z } r (cid:17) ∗ = lim n →∞ y n ( γ ; x ) . (observe that y ( γ ; x ) depends on γ only through its q –th entry γ q ). For ℓ = 1 , . . . , p , let u ℓ ( n )( = u ℓ ( n )) denote the ℓ –th (deterministic) eigenvector of E W ( a j ). Observe that, by repeatingthe argument leading to (B.67) for the deterministic matrix E W ( a j ) in place of W ( a j ), γ q ( n ) := P ∗ ( n ) u p − r + q ( n ) → γ q =: ( γ ,q , . . . , γ r,q ) ∗ , n → ∞ . (B.98)So, let x ∗ ( n ) ∈ R r and x ∗ = ( x ∗ ,q +1 , . . . , x ∗ ,r ) ∈ R r (B.99)be the minimizers of the (deterministic) functions g n ( γ q ( n ); · ) and g ( γ q ; · ), respectively, wheresuch functions are given in (B.96) and (B.97). Observe that, as n → ∞ , x ∗ ( n ) → x ∗ , implying y n ( γ q ( n ) , x ∗ ( n )) → y ( γ q , x ∗ ). Hence, g n ( γ q ( n ) , x ∗ ( n )) → g ( γ q , x ∗ ) , n → ∞ . (B.100)By Lemma B.5, for large enough n we may take a sequence of unit vectors v ( n ) ∈ span { u p − r + q ( n ) , . . . , u p ( n ) } such that h p ℓ ( n ) , v ( n ) i = x ∗ ,ℓ a h ℓ − h q , ℓ ∈ { q + 1 , . . . , r } , P ∗ ( n ) v ( n ) → γ q . (B.101)46herefore, as n → ∞ , a h − h q I P ∗ ( n ) v ( n ) = y n ( v ( n ) , x ∗ ( n )) → y ( γ q ; x ∗ ) , which implies g n ( v ( n ) , x ∗ ( n )) → g ( γ q , x ∗ ) , n → ∞ . (B.102)Moreover, let R r ∋ x ( n ) := (cid:0) h p q +1 ( n ) , u p − r + q ( n ) i a h q +1 − h q , . . . , h p r ( n ) , u p − r + q ( n ) i a h r − h q (cid:1) , n ∈ N (not to be confused with the minimizer x ∗ ( n ) of g n ( γ q ( n ) , · ) as in (B.99)). Then, we can express λ p − r + q ( E W ( a j ) /a h q +1 ) = g n ( γ q ( n ) , x ( n )) + u ∗ p − r + q ( n ) E W Z ( a j ) a h q +1 u p − r + q ( n ) . Thus, for all large n and for γ q ( n ) and v ( n ) as in (B.98) and (B.101), respectively, g n ( γ q ( n ) , x ∗ ( n )) + u ∗ p − r + q ( n ) E W Z ( a j ) a h q +1 u p − r + q ( n ) ≤ g n ( γ q ( n ) , x ( n )) + u ∗ p − r + q ( n ) E W Z ( a j ) a h q +1 u p − r + q ( n )= u ∗ p − r + q ( n ) E W ( a ( n )2 j ) a ( n ) h q +1 u p − r + q ( n ) = λ p − r + q ( E W ( a ( n )2 j )) a ( n ) h q +1 ≤ v ∗ ( n ) E W ( a ( n )2 j )) a ( n ) h q +1 v ( n ) = g n ( v ( n ); x ∗ ) + v ∗ ( n ) E W Z ( a j ) a h q +1 v ( n ) , (B.103)where in the second inequality we used the fact that v ( n ) ∈ span { u p − r + q ( n ) , . . . , u p ( n ) } . However,by Corollary B.1 with M n = E W Z ( a j ) /a h q +1 , λ p − r + q ( E W ( a ( n )2 j )) a ( n ) h q +1 → ξ q (2 j ) , n → ∞ . Therefore, in view of (B.100) and (B.102), by taking limits in (B.103), we see that ξ q (2 j ) = g ( γ q , x ∗ ) . 
Moreover, since k E W Z ( a j ) /a h q +1 k = o ( a − ̟ ) under assumption (2.22), this implies that (cid:12)(cid:12)(cid:12) λ p − r + q ( E W ( a ( n )2 j )) a ( n ) h q +1 − ξ q (2 j ) (cid:12)(cid:12)(cid:12) ≤ max n(cid:12)(cid:12)(cid:12) g n (cid:0) γ q ( n ) , x ∗ ( n ) (cid:1) − g (cid:0) γ q , x ∗ (cid:1)(cid:12)(cid:12)(cid:12) , (cid:12)(cid:12)(cid:12) g n (cid:0) v ( n ); x ∗ (cid:1) − g (cid:0) γ q , x ∗ (cid:1)(cid:12)(cid:12)(cid:12)o + o ( a − ̟ ) . (B.104)Consider the first and the second terms inside the max {· , ·} operator on the right-hand side ofexpression (B.104). If (a) (cid:12)(cid:12)(cid:12) g n ( γ q ( n ) , x ∗ ( n )) − ξ q (2 j ) (cid:12)(cid:12)(cid:12) = O ( a − ̟ ) (B.105)and (b) (cid:12)(cid:12)(cid:12) g n ( v ( n ); x ∗ ) − ξ q (2 j ) (cid:12)(cid:12)(cid:12) = O ( a − ̟ ) , (B.106)47hen (B.95) holds under condition ( i ). So, we now establish (a) and (b).First, we show (a). Let γ q ( n ) and x ∗ ( n ) be as in (B.98) and (B.99), respectively. For notationalsimplicity, write y ( n ) = (cid:0) y ( n ) , . . . , y r ( n ) (cid:1) ∗ := y n (cid:0) γ q ( n ) , x ∗ ( n ) (cid:1) = (cid:16) h p ( n ) , u p − r + q ( n ) i a h − h q , . . . , h p r ( n ) , u p − r + q ( n ) i a h r − h q | {z } r , h p q ( n ) , u p − r + q ( n ) i , x ∗ ( n ) | {z } r (cid:17) (B.107)and y = (cid:0) , . . . , | {z } r , γ q,q , x ∗ |{z} r (cid:1) ∗ = ( y , . . . , y r ) ∗ := lim n →∞ y ( n ) , (B.108)where lim n →∞ h p q ( n ) , u p − r + q ( n ) i = γ q,q as a consequence of (B.98). Recast g n ( γ q ( n ) , x ∗ ( n )) − ξ q (2 j ) = g n ( γ q ( n ) , x ∗ ( n )) − g ( γ q , x ∗ )= y ∗ ( n ) B a (2 j ) y ( n ) − y ∗ B (2 j ) y = ( y ( n ) − y ) ∗ B a (2 j )( y ( n ) − y ) + 2 y ∗ B (2 j )( y ( n ) − y ) + y ∗ (cid:0) B a (2 j ) − B (2 j ) (cid:1) y . (B.109)Therefore, in view of condition (2.24), if k y ( n ) − y k = O ( a − ̟ ) , (B.110)then (cid:12)(cid:12) g n ( γ q ( n ) , x ∗ ( n )) − g ( γ q , x ∗ ) (cid:12)(cid:12) = O ( a − ̟ ) . Thus, (B.105) holds, which establishes (a). So, we now show (B.110). We establish relation(B.110) entry-wise for each of the ranges ℓ ∈ I − = { , . . . , q − } , ℓ = q and ℓ ∈ I + = { q +1 , . . . , r } .In fact, for ℓ ∈ I − = { , . . . , q − } , expression (B.107) shows that | y ℓ ( n ) − y ℓ | = | y ℓ ( n ) | = O (cid:16) a h ℓ − h q (cid:17) = O ( a − ̟ ) . (B.111)On the other hand, for ℓ = q , let P ( n ) = Q ( n ) R ( n ) be the QR decomposition of P ( n ) asin (B.70), where R ( n ) ∈ GL ( r, R ), Q ( n ) = ( e p ( n ) , . . . , e p r ( n )) ∈ M ( p, r, R ) with orthonormalcolumns. Again, let p r +1 ( n ) , . . . , p p ( n ) be an orthonormal basis for the nullspace of P ( n ) as in(B.68). Observe that, by Lemma B.8,1 − h e p k ( n ) , u p − r + k ( n ) i = O ( a − ̟ ) , k = 1 , . . . , r. (B.112)Consequently, after flipping the sign of u p − r + q ( n ) if necessary, k Q ∗ ( n ) u p − r + q ( n ) − e q k = O ( a − ̟ ).Recalling that R ( n ) → R as n → ∞ (see (B.81)), from (B.78) it holds that γ q = R e q . Conse-quently, k P ∗ ( n ) u p − r + q ( n ) − γ q k = k R ∗ ( n ) Q ∗ ( n ) u p − r + q ( n ) − R e q k≤ k R ( n ) − R k + k R kk Q ∗ ( n ) u p − r + q ( n ) − e q k = O ( a − ̟ ) , where the in the last equality we make use of condition (2.29). Hence, | y q ( n ) − y q | = |h p q ( n ) , u p − r + q ( n ) i − γ q,q | = O ( a − ̟ ) . (B.113)In other words, (B.110) also holds for ℓ = q . 48urning to y ℓ ( n ) − y ℓ for ℓ ∈ I + = { q + 1 , . . . , r } , recall that x ∗ ( n ) := ( x ∗ ,q +1 ( n ) , . . . 
, x ∗ ,r ( n )) ∗ is the unique minimizer of g n ( γ q ( n ); · ) in (B.96), and that x ∗ = lim n →∞ x ∗ ( n ). Writing B a (2 j ) =( b ii ′ ( a )) ≤ i,i ′ ≤ r , let B + ,a = (cid:0) b iℓ ( a ) (cid:1) i,ℓ ∈I + , B ,a = (cid:0) b iℓ ( a ) (cid:1) i, ∈I + ,ℓ ∈I − ∪I , and let B + , B be their corresponding limits. The first order conditions for the minimization of g n ( γ q ( n ); · ) imply that (cid:0) B ,a B + ,a (cid:1) y ( n ) = . (B.114)Since B + ∈ S > and y ℓ ( n ) = x ∗ ,ℓ ( n ) for ℓ = q + 1 , . . . , r , by rearranging (B.114) we have, for alllarge n , x ∗ ( n ) = − B − ,a B ,a (cid:0) y ℓ ( n ) (cid:1) ℓ ∈I − ∪I , x ∗ = − B − B (cid:0) y ℓ (cid:1) ℓ ∈I − ∪I . Together with (B.111) and (B.113), this implies that k ( y ℓ ( n ) − y ℓ ) ℓ ∈I + k = k x ∗ ( n ) − x ∗ k ≤ k B − B ( y ℓ ( n ) − y ℓ ) ℓ ∈I − ∪I k + C k B − ,a B ,a − B − B k≤ C k ( y ℓ ( n ) − y ℓ ) ℓ ∈I − ∪I k + O ( a − ̟ ) = O ( a − ̟ ) , due to expressions (B.111) and (B.113). Hence, (B.110) also holds for ℓ ∈ I + . Thus, (B.110)holds for ℓ = 1 , . . . , r . This establishes (B.105) and, hence, (a).Now, we turn to (b). Consider the sequence of unit vectors v ( n ) given by (B.40) in LemmaB.5 for the deterministic matric E W ( a j ). For a sequence of scalars { c q ( n ) , c q +1 ( n ) , . . . , c r ( n ) } satisfying P ri = q c i ( n ) = 1 we may write v ( n ) = P ri = q c i ( n ) u p − r + i ( n ), and by expression (B.43) ofLemma B.5, 1 − c q ( n ) = r X i = q +1 c i ( n ) = O ( a − ̟ ) . Moreover, observe that (B.101) implies c q ( n ) → n → ∞ . Hence, | − c q ( n ) | = O ( a − ̟ ).Therefore, |h p q ( n ) , v ( n ) i − h p q ( n ) , u p − r + q ( n ) i| = (cid:12)(cid:12)(cid:12) ( c q ( n ) − h p q ( n ) , u p − r + q ( n ) i + X i ∈I + c i ( n ) h p ℓ , u p − r + i ( n ) i (cid:12)(cid:12)(cid:12) ≤ | − c q ( n ) | |h p q ( n ) , u p − r + q ( n ) i| + O ( a − ̟ ) = O ( a − ̟ ) . (B.115)Thus, (B.113) and (B.115) show that (cid:12)(cid:12) h p q ( n ) , v ( n ) i − γ q,q (cid:12)(cid:12) ≤ (cid:12)(cid:12) h p q ( n ) , v ( n ) i − h p q ( n ) , u p − r + q ( n ) i (cid:12)(cid:12) + (cid:12)(cid:12) h p q ( n ) , u p − r + q ( n ) i − γ q,q (cid:12)(cid:12) = O ( a − ̟ ) + O ( a − ̟ ) = O ( a − ̟ ) . (B.116)Now, define e y ( n ) = a h − h q I P ∗ ( n ) v ( n ) = y n (cid:0) P ∗ ( n ) v ( n ) , x ∗ (cid:1) . Thus, entry-wise, we can express e y ( n ) ℓ = h p ℓ ( n ) , v ( n ) i a h ℓ − h q = O ( a − ̟ ) , ℓ ∈ I − ; h p ℓ ( n ) , v ( n ) i , ℓ = q ; x ∗ ,ℓ , ℓ ∈ I + . y as in (B.108), we obtain k e y ( n ) − y k = (cid:0) h p q ( n ) , v ( n ) i − γ q,q (cid:1) + O ( a − ̟ ) = O ( a − ̟ ) . (B.117)Hence, (cid:12)(cid:12)(cid:12) g n ( v ( n ); x ∗ ) − ξ q (2 j ) (cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)e y ∗ ( n ) B a (2 j ) e y ( n ) − e y ∗ B (2 j ) e y (cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12) ( e y ( n ) − y ) ∗ B a (2 j )( e y ( n ) − y ) + 2 y ∗ B (2 j )( e y ( n ) − y ) + y ∗ (cid:0) B a (2 j ) − B (2 j ) (cid:1) y (cid:12)(cid:12)(cid:12) ≤ C k B a (2 j ) − B (2 j ) k + C ′ k e y ( n ) − y k = O ( a − ̟ ) , where the last equality is a consequence of condition (2.24) and relation (B.117). This establishes(B.106) and, hence, (b). Thus, as anticipated, (B.95) holds under condition ( i ).We now turn to (B.95) for the case ( ii ). Fix any q ∈ { , . . . , r } . 
By Weyl’s inequality (e.g.,Vershynin (2018), Theorem 4.5.3), (cid:12)(cid:12)(cid:12) λ p − r + q ( E W ( a j )) a h q +1 − λ p − r + q ( E W P X ( a j ) a h q +1 (cid:12)(cid:12)(cid:12) ≤ k E W ( a j ) /a h q +1 − E W P X ( a j ) /a h q +1 k = O ( a − ̟ ) . So, it suffices to show (B.95) for the matrix E W P X ( a j ) /a h q +1 = P B a (2 j ) P ∗ .Let e u i ( n ) denote the i –th unit eigenvector of the (deterministic) matrix P B a (2 j ) P ∗ , i =1 , . . . , p . Observe that P ∗ P B a (2 j ) P ∗ e u i ( n ) = λ i ( P B a (2 j ) P ∗ ) · P ∗ e u i ( n ) for i = 1 , . . . , r . Then, λ p − r + q ( P B a (2 j ) P ∗ ) = λ q ( P ∗ P B a (2 j )) , q = 1 , . . . , r (B.118)(where P ∗ e u q ( n ) is the associated eigenvector of P ∗ P B a (2 j )). By a similar argument, and consid-ering again the QR decomposition P ≡ P ( n ) = Q ( n ) R ( n ) as in (B.70), λ q (cid:0) R ( n ) B a (2 j ) R ∗ ( n ) (cid:1) = λ q (cid:0) R ∗ ( n ) R ( n ) B a (2 j ) (cid:1) , q = 1 , . . . , r. (B.119)As a consequence of (B.118), (B.119) and of the fact that Q ∗ ( n ) Q ( n ) = I r , λ q (cid:0) P ∗ ( n ) P ( n ) B a (2 j ) (cid:1) = λ q (cid:0) R ( n ) B a (2 j ) R ∗ ( n ) (cid:1) , q = 1 , . . . , r. So, by taking limits in (B.118), Corollary B.1 with M n ≡ implies that λ q ( P ∗ P B a (2 j )) → ξ q (2 j )as n → ∞ . Moreover, by a similar argument to the one leading to (B.119), ξ q (2 j ) = λ q ( A B (2 j )) = λ q ( R B (2 j ) R ∗ ) (B.120)due to assumptions (2.24), (2.28) and (2.30). Therefore, again by Weyl’s inequality, (cid:12)(cid:12) λ q (cid:0) P ∗ ( n ) P ( n ) B a (2 j ) (cid:1) − ξ q (2 j ) (cid:12)(cid:12) = (cid:12)(cid:12) λ q (cid:0) R ∗ ( n ) R ( n ) B a (2 j ) (cid:1) − λ q ( R ∗ R B (2 j )) (cid:12)(cid:12) = (cid:12)(cid:12) λ q (cid:0) R ( n ) B a (2 j ) R ∗ ( n ) (cid:1) − λ q (cid:0) R B (2 j ) R ∗ (cid:1)(cid:12)(cid:12) ≤ (cid:13)(cid:13) R ( n ) B a (2 j ) R ∗ ( n ) − R B (2 j ) R ∗ (cid:13)(cid:13) ≤ k ( R ( n ) − R ) B a (2 j )( R ( n ) − R ) ∗ k + k ( R ( n ) − R ) B a (2 j ) R ∗ k + k R B a (2 j )( R ( n ) − R ) ∗ k + k R ( B a (2 j ) − B (2 j )) R ∗ k = O ( a − ̟ ) , (B.121)where the last equality is a consequence of conditions (2.24) and (2.29). This establishes (B.95)in the case ( ii ). (cid:3) The following lemma is used in the proof of Proposition B.2.
Lemma B.8
Suppose the assumptions of Proposition B.1 hold, as well as condition ( i ) in thesame proposition. Let P ( n ) = Q ( n ) R ( n ) be the QR decomposition of P ( n ) , where R ( n ) ∈ GL ( r, R ) is upper-triangular, Q ( n ) = ( e p ( n ) , . . . , e p r ( n )) ∈ M ( p, r, R ) with orthonormal columns. Alsolet p r +1 ( n ) , . . . , p p ( n ) be an orthonormal basis for the nullspace of P ( n ) that is orthogonal to span { p ( n ) , . . . , p r ( n ) } . Then, expression (B.112) holds. roof: Recall thatspan { e p ℓ ( n ) , . . . , e p r ( n ) } = span { p ℓ ( n ) , . . . , p r ( n ) } , ℓ = 1 , . . . , r. (B.122)Expression (B.11) of Lemma B.2 shows h p i ( n ) , u p − r +1 ( n ) i = O ( a − h − h i ) ) = O ( a − ̟ ), i =2 , . . . , r . Hence, from (B.122), r X i =2 h e p i ( n ) , u p − r +1 ( n ) i = O (cid:16) r X i =2 h p i ( n ) , u p − r +1 ( n ) i (cid:17) = O ( a − ̟ ) . Moreover, expression (B.19) of Lemma B.3 shows that P pi = r +1 h p i ( n ) , u p − r +1 ( n ) i = o ( a − ̟ ),implying 1 − h e p ( n ) , u p − r +1 ( n ) i = r X i =2 h e p i ( n ) , u p − r +1 ( n ) i + p X i = r +1 h p i ( n ) , u p − r +1 ( n ) i = O ( a − ̟ ) + o ( a − ̟ ) = O ( a − ̟ ) , i.e., (B.112) holds for k = 1. Proceeding by finite induction on k , suppose we have 1 −h e p i ( n ) , u p − r + i ( n ) i = O ( a − ̟ ), i = 1 , . . . , k ; i.e.,1 − h e p i ( n ) , u p − r + i ( n ) i = X ℓ = p − r + i h e p i ( n ) , u ℓ ( n ) i = O ( a − ̟ ) , i = 1 , . . . , k. In particular, h e p i ( n ) , u p − r +( k +1) ( n ) i = O ( a − ̟ ), i = 1 , . . . , k . However, expression (B.19) ofLemma B.3 shows that P pi = r +1 h p i ( n ) , u p − r +( k +1) ( n ) i = o ( a − ̟ ). This implies1 − h e p k +1 ( n ) , u p − r +( k +1) ( n ) i = k X i =1 h e p i ( n ) , u p − r +( k +1) ( n ) i + r X i = k +2 h e p i ( n ) , u p − r +( k +1) ( n ) i + p X i = r +1 h p i ( n ) , u p − r +( k +1) ( n ) i = O ( a − ̟ ) + O ( a − ̟ ) + o ( a − ̟ ) = O ( a − ̟ ) . This shows (B.112). (cid:3)
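As an illustrative aside (not part of the proof), two linear-algebra facts driving Proposition B.2, case (ii), and Lemma B.8 can be checked numerically: for the QR decomposition P = QR, the nonzero eigenvalues of P B P^* coincide with those of R B R^*, and the associated eigenvectors lie in the column span of Q. A minimal sketch with hypothetical dimensions and a generic positive definite matrix in place of B_a(2^j):

```python
import numpy as np

# Illustrative sketch (not part of the proof): for a full column rank p x r matrix P
# with reduced QR decomposition P = Q R, the r nonzero eigenvalues of P B P* equal the
# eigenvalues of R B R*, and the corresponding eigenvectors of P B P* lie in col(Q).
# This mirrors the identities (B.118)-(B.119) and the alignment property (B.112).
# All matrices and dimensions below are hypothetical.
rng = np.random.default_rng(1)
p, r = 50, 3
P = rng.standard_normal((p, r))
Q, R = np.linalg.qr(P)                     # Q is p x r with orthonormal columns, R is r x r
A = rng.standard_normal((r, r))
B = A @ A.T + np.eye(r)                    # generic positive definite stand-in for B_a(2^j)

M = P @ B @ P.T                            # p x p, rank r
top_eigs = np.linalg.eigvalsh(M)[-r:]      # the r largest (i.e., the nonzero) eigenvalues
print(np.allclose(top_eigs, np.sort(np.linalg.eigvalsh(R @ B @ R.T))))   # True

vals, vecs = np.linalg.eigh(M)
U = vecs[:, -r:]                           # eigenvectors of the r largest eigenvalues
proj_onto_colQ = Q @ (Q.T @ U)             # orthogonal projection onto col(Q) = col(P)
print(np.allclose(proj_onto_colQ, U, atol=1e-8))                         # True
```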
The following lemma is used in the proof of Proposition B.1.
Lemma B.9
Fix q ∈ { , . . . , r } and let h := h = . . . = h r . Let u p − r + q ( n ) be a ( p − r + q ) –thunit eigenvector of the matrix W ( a j ) a h +1 . Write W P X ( a j ) := P W X ( a j ) P ∗ and let e u p − r + q ( n ) bea ( p − r + q ) –th unit eigenvector of the matrix W P X ( a ( n )2 j ) a ( n ) h +1 = P ( n ) a ( n ) h b B a (2 j ) a ( n ) h P ∗ ( n ) a ( n ) h = P ( n ) b B a (2 j ) P ∗ ( n ) . Under the assumptions of Proposition B.1, ( ii ) , after flipping the signs of u p − r + q ( n ) if necessary,expression (B.88) holds, i.e., k u p − r + q ( n ) − e u p − r + q ( n ) k = o P (1) . (B.123)51 roof: It suffices to show that h u p − r + q ( n ) , e u p − r + q ( n ) i P → , n → ∞ . (B.124)So, take an arbitrary subsequence of the left-hand side of (B.124) (still indexed n , for notationalsimplicity), and consider the QR decomposition of P ( n ) (see (B.29)), where e p ℓ ( n ), ℓ = 1 , . . . , r ,denote the columns of Q ( n ), which are orthonormal. Now, extract a further subsequence (stillindexed n , for notational simplicity) such that, for all ℓ = 1 , . . . , r , h e p ℓ ( n ) , u p − r + q ( n ) i → c ℓ,q , n → ∞ , (B.125)for some constant c ℓ,q still with probability at least δ . Let c q = ( c ,q , . . . , c r,q ) ∗ be the vector ofsuch constants. Recast the limit (B.125) as Q ∗ ( n ) u p − r + q ( n ) P → c q , n → ∞ . (B.126)Now let p r +1 ( n ) , . . . , p p ( n ) be an orthonormal basis of span { p ( n ) , . . . , p r ( n ) } ⊥ . By Lemma B.3, p X i = r +1 h p i ( n ) , u p − r + q ( n ) i = o P ( a ( n ) − ̟ ) . (B.127)Therefore, k c q k = 1 . (B.128)However, W ( a j ) a h +1 u p − r + q ( n ) = λ p − r + q (cid:16) W ( a j ) a h +1 (cid:17) · u p − r + q ( n ) . By expression (A.7), and using the fact that Q ∗ ( n ) Q ( n ) = I r , this implies that R ( n ) b B a (2 j ) R ∗ ( n ) Q ∗ ( n ) u p − r + q ( n ) + o P (1) = λ p − r + q (cid:16) W ( a j ) a h +1 (cid:17) · Q ∗ ( n ) u p − r + q ( n ) . (B.129)Recall that, by condition (2.29), R ∗ ( n ) → R ∗ as n → ∞ . In view of (B.126), by taking p -lim n →∞ on both sides of (B.129), R B (2 j ) R ∗ c q = ξ q (2 j ) · c q . On the other hand, for e u p − r + q ( n ) as in (B.123), P ( n ) b B a (2 j ) P ∗ ( n ) e u p − r + q ( n ) = λ q (cid:0) P ( n ) b B a (2 j ) P ∗ ( n ) (cid:1) · e u p − r + q ( n ) , q = 1 , . . . , r. Hence, R ( n ) b B a (2 j ) R ∗ ( n ) Q ∗ ( n ) e u p − r + q ( n ) = λ q (cid:0) P ( n ) b B a (2 j ) P ∗ ( n ) (cid:1) · Q ∗ ( n ) e u p − r + q ( n ) . (B.130)However, by expression (B.87), R ∗ ( n ) Q ∗ ( n ) e u p − r + q ( n ) converges in probability to some vector as n → ∞ . Moreover, the matrix R ∗ ( n ) ∈ GL ( r, R ) also converges to some matrix R ∗ ∈ GL ( r, R ).Therefore, there exists a vector e c q such that Q ∗ ( n ) e u p − r + q ( n ) P → e c q , n → ∞ . Thus, in view of condition (2.23), by taking p -lim n →∞ on both sides of (B.130), R B (2 j ) R ∗ e c q = ξ q (2 j ) · e c q ,
52n addition, note that, for all n ∈ N and for all u ⊥ span { e p ( n ) , . . . , e p r ( n ) } , u ∗ P ( n ) b B a (2 j ) P ∗ ( n ) u = u ∗ Q ( n ) R ( n ) b B a (2 j ) R ∗ ( n ) Q ∗ ( n ) u = 0 . Clearly, e u p − r + q ( n ) is orthogonal to the space span { p r +1 ( n ) , . . . , p p ( n ) } =span { e p ( n ) , . . . , e p r ( n ) } ⊥ . Thus, again for all n ∈ N , k Q ∗ ( n ) e u p − r + q ( n ) k = 1 . Hence, e c q is itself a unit vector. Therefore, the unit vectors c q (see (B.128)) and e c q are eigenvectorsof R B (2 j ) R ∗ with corresponding eigenvalue ξ q (2 j ). Since the eigenvalues ξ ℓ (2 j ), ℓ = 1 , . . . , r of R B (2 j ) R ∗ are distinct due to assumption (3.5), we conclude that c q = ± e c q . (cid:3) C Proofs: Section 4
In proofs, whenever convenient we use the matrix exponent D := H − (1/2)I.

C.1 Section 4.1: proofs and auxiliary results
Proof of Proposition 4.1 : Note that condition (4.1) is equivalent to stating that the R p -valuedstochastic process Z = { Z ( t ) } t ∈ Z as in (1.2) admits the harmonizable representation R p ∋ Z ( t ) = Z π − π e i tx g ( x ) / e B Z ( dx ) , (C.1)where the C p -valued, Gaussian random measure e B Z ( dx ) is such that e B Z ( − dx ) = e B Z ( dx ), E e B Z ( dx ) e B Z ( dx ) ∗ = dx . In addition, condition (4.5) is equivalent to stating that the R r -valuedstochastic process X = { X ( t ) } t ∈ Z admits the harmonizable representation X ( t ) = Z π − π ( e i tx − g X ( x ) / e B ( dx ) . (C.2)In (C.2), e B ( dx ) is an orthogonal-increment Gaussian random measure satisfying e B ( − dx ) = e B ( dx )and E e B ( dx ) e B ( dx ) ∗ = dx . Then, the claim is a consequence of Proposition C.1, Lemma C.2 andProposition C.3. (cid:3) In the remainder of this section, we establish all the auxiliary results needed in the proof ofProposition 4.1 (including Proposition C.1, Lemma C.2 and Proposition C.3 themselves).Starting from the harmonizable representations (C.1) and (C.2) for the stochastic processes Z = { Z ( t ) } t ∈ Z and X = { X ( t ) } t ∈ Z , we now establish concentration inequalities and asymptoticresults for the random wavelet variance and covariance matrices W Z ( a ( n )2 j ) and W X,Z ( a ( n )2 j ).We can express D Z (2 j , k ) = X ℓ ∈ Z Z ( ℓ ) h j, j k − ℓ , j ∈ N , k ∈ Z , where h j, · is given by (2.12). By the independence between g and e B Z ( dx ) and by Itˆo’s isometry, E D Z (2 j , k ) D Z (2 j ′ , k ′ ) ∗ = Z π − π e i x (2 j k − j ′ k ′ ) H j ( x ) H j ′ ( x ) E g Z ( x ) dx, j , we define H j ( x ) := X ℓ ∈ Z h j,ℓ e − i xℓ . (C.3)In particular, E W Z (2 j ) = I p Z π − π |H j ( x ) | E g Z ( x ) dx =: I p m (2 j ) = I p E D Z (2 j , . (C.4)Let Γ V n be the covariance matrix of the vector V n of wavelet coefficients (of Z ) at scale a ( n )2 j (i.e., (C.21) for m = 1). Let β := 12 π lim sup n →∞ max i =1 ,...,n a,j λ i,V n < ∞ , (C.5)Note that, by Lemma C.1, ( iii ), β < ∞ .In the following proposition, we establish a concentration inequality for the norm k W Z (2 j ) k = λ p ( W Z (2 j )). The proof follows an ε -net argument, involving steps of approximation, concentra-tion and union bound (cf. Vershynin (2018), Sections 4.4 and 4.6, or Lugosi (2017), pp. 13–14). Proposition C.1
Suppose the assumptions of Proposition 4.1 hold. Let m ( a j ) = E D Z ( a j , be as in (C.4) and let β be as in (C.5) and take any t > γ := 2 π E g (0) ≥ . (C.6) Fix a small ε > . Then, for large enough n ∈ N , P (cid:16) λ p (cid:0) W Z ( a ( n )2 j ) (cid:1) > m ( a ( n )2 j ) + t (cid:17) ≤ exp ( p log 9 − na ( n )2 j t − m ( a ( n )2 j )8 π ( β + ε ) − s t − m ( a ( n )2 j )8 π ( β + ε ) !) . (C.7) Let γ = 2 π E g (0) . In particular, as n → ∞ , with probability going to 1 the eigenvalue λ p (cid:0) W Z ( a ( n )2 j ) (cid:1) is bounded above by any constant C satisfying C > γ + 8 π ( β + ε ) p c j +2 log 92 ! . (C.8) Proof:
Fix any u ∈ S p − and let D ∗ Z,n = 1 p m ( a j ) (cid:16) D Z ( a j , ∗ u , D Z ( a j , ∗ u , . . . , D Z ( a j , n a,j ) ∗ u (cid:17) ∈ R n a,j . Write Σ D ,n = E D Z,n D ∗ Z,n and consider its spectral decomposition Σ D ,n = O n Λ D ,n O ∗ n for an orthogonal matrix O n . Recast1 n a,j D ∗ Z,n D Z,n d = 1 n a,j Z ∗ n Σ D ,n Z n = 1 n a,j Z ∗ n O n Λ D ,n O ∗ n Z n d = 1 n a,j Z ∗ n Λ D ,n Z n =: n a,j X k =1 η k,n Z k , η ≡ η n = (cid:16) η ,n , . . . , η n a,j ,n (cid:17) (C.9)be the vector of eigenvalues of the deterministic matrix n a,j Σ D ,n . Fix a small ε >
0. We claimthat, for large enough n , k η k ∞ ≤ π ( β + ε ) n a,j m ( a j ) . (C.10)Indeed, recall that F ( d g ) is a distribution on the class G as in (4.2). Note that the covariancestructure of the nonnormalized wavelet coefficients of Z is given by E (cid:2) D Z ( a j , k ) i D Z ( a j , k ′ ) i ′ (cid:3) = (cid:26) , i = i ′ ; R γ g , j (cid:0) a j ( k − k ′ ) (cid:1) F ( d g ) , i = i ′ . Therefore, for k, k ′ ∈ Z , R ∋ E (cid:2) D Z ( a j , k ) ∗ u D Z ( a j , k ′ ) ∗ u (cid:3) = E h p X i =1 p X i ′ =1 D Z ( a j , k ) i D Z ( a j , k ′ ) i ′ u p,i u p,i ′ i = Z γ g ,a j (cid:0) a j ( k − k ′ ) (cid:1) F ( d g ) p X i =1 u p,i = Z γ g ,a j (cid:0) a j ( k − k ′ ) (cid:1) F ( d g )= Z π − π e i xa j ( k − k ′ ) |H a j ( x ) | E g ( x ) dx. (C.11)Therefore, Σ D ,n = E D Z,n D ∗ Z,n = 1 m ( a j ) × R γ g ,a j (0) F ( d g ) R γ g ,a j ( a j ) F ( d g ) . . . R γ g ,a j (cid:0) a j ( n a,j − (cid:1) F ( d g ) R γ g ,a j ( a j ) F ( d g ) R γ g ,a j (0) F ( d g ) . . . R γ g ,a j (cid:0) a j ( n a,j − (cid:1) ) F ( d g )... ... . . . ... R γ g ,a j (cid:0) a j ( n a,j − (cid:1) F ( d g ) R γ g ,a j (cid:0) a j ( n a,j − (cid:1) F ( d g ) . . . R γ g ,a j (0) F ( d g ) = 1 m ( a j ) E V n V ∗ n , (C.12)where V n is given by (C.21) for m = 1. Let ε >
0. By (C.12) and Lemma C.1, ( iii ), for largeenough n and for all k = 1 , . . . , n a,j , η k,n ≤ π ( β + ε ) n a,j m ( a j ) . In other words, (C.10) holds.So, let s > x = −k η k + p k η k + 2 k η k ∞ s/m ( a j )2 k η k ∞ ! . Then, 2 k η k √ x + 2 k η k ∞ x = s/m ( a j ). Since u ∗ E W Z ( a j ) u = m ( a j ), then P (cid:16) u ∗ W Z ( a j ) u − m ( a j ) > s (cid:17) P (cid:16) u ∗ W Z ( a j ) u m ( a j ) − > sm ( a j ) (cid:17) = P (cid:16) n a,j X k =1 η k,n ( Z k − ≥ sm ( a j ) (cid:17) = P (cid:16) n a,j X k =1 η k,n ( Z k − ≥ k η k √ x + 2 k η k ∞ x (cid:17) ≤ exp {− x } , (C.13)where the last inequality is a consequence of Lemma C.3. Now, note that k η k ≤ √ n a,j k η k ∞ ,and that p k η k + 2 k η k ∞ s/m ( a j ) ≤ k η k + p k η k ∞ s/m ( a j ). For notational simplicity, let β ε = β + ε . By applying relation (C.10) twice, we obtain x = k η k + k η k ∞ s/m ( a j ) − k η k p k η k + 2 k η k ∞ s/m ( a j )2 k η k ∞ ≥ k η k ∞ s/m ( a j ) − k η k p k η k ∞ s/m ( a j )2 k η k ∞ ≥ k η k ∞ s m ( a j ) − s n a,j s k η k ∞ m ( a j ) ! ≥ n a,j (cid:18) s πβ ε − r s πβ ε (cid:19) . (C.14)From (C.13) and (C.14), we arrive at P (cid:16) u ∗ W Z ( a ( n )2 j ) u − m ( a j ) > s (cid:17) ≤ exp (cid:26) − na j (cid:18) s πβ ε − r s πβ ε (cid:19)(cid:27) . (C.15)We now appeal to the same argument as in Lugosi (2017), pp. 13–14. In fact, let N be a 1 / λ p ( W Z ( a j )) ≤ u ∈N u ∗ W Z ( a j ) u , and card( N ) ≤ p . Thus, by the union bound, P (cid:16) λ p ( W Z ( a ( n )2 j )) > t + m ( a j ) (cid:17) ≤ p max u ∈N P (cid:16) u ∗ W Z ( a ( n )2 j ) u > t + m ( a j )2 (cid:17) . = 9 p max u ∈N P (cid:16) u ∗ W Z ( a ( n )2 j ) u − m ( a j ) > t − m ( a j )2 (cid:17) . (C.16)Note that, by Lemma C.1, ( ii ), m ( a j ) ≡ m ( a ( n )2 j ) → π E g (0) = γ, n → ∞ . (C.17)In view of (C.6), take s = ( t − m ( a j )) / s > n . Bycombining this with (C.16), we arrive at (C.7).To show the statement regarding (C.8), recastexp p log 9 − na ( n )2 j t − m ( a ( n )2 j )8 πβ ε − s t − m ( a ( n )2 j )8 πβ ε =: exp (cid:26) na ( n ) b ( n ) (cid:27) . However, by (2.27) and (C.17), b ( n ) = pn/a ( n ) log 9 − j t − m ( a ( n )2 j )8 πβ ε − s t − m ( a ( n )2 j )8 πβ ε c log 9 − j (cid:18) t − γ πβ ε − r t − γ πβ ε (cid:19) =: L, n → ∞ . Thus, if t > t ∗ := γ + 8 πβ ε (cid:18) √ j +2 c log 92 (cid:19) , then L <
0. So, for C satisfying (C.8), C > γ + t ∗ .Therefore, by (C.7) and (C.17), P (cid:16) λ p (cid:0) W Z ( a ( n )2 j ) (cid:1) > C (cid:17) ≤ P (cid:16) λ p (cid:0) W Z ( a ( n )2 j ) (cid:1) > t ∗ + γ (cid:17) → , n → ∞ . Consequently, λ p (cid:0) W Z ( a ( n )2 j ) (cid:1) is bounded above by such constant C with probability going to1, as claimed. (cid:3) In the following lemma, we establish a bound on the cross-correlation between noise waveletcoefficients whose indices are sufficiently far apart. It is used in the proof of Proposition 4.1 bymeans of Proposition C.1 and Lemma C.2.
Lemma C.1
Suppose the assumptions of Proposition 4.1 hold. ( i ) Let T = length(supp( ψ )) and choose j, j ′ ∈ N , k, k ′ ∈ Z that satisfy | j k − j ′ k ′ | max { j , j ′ } ≥ T. (C.18) Fix n ∈ N . Then, for constants C, C ′ > that are independent of j , j ′ , k , k ′ and n , max i =1 ,...,p ( n ) E | D Z ( a ( n )2 j , k ) i D Z ( a ( n )2 j , k ′ ) i | ≤ Ca ( n )2 j + j ′ e − C ′ a ( n ) | j k − j ′ k ′ | . (C.19)( ii ) For g as in (4.1) , the univariate wavelet coefficients { D Z (2 j , } j ∈ N satisfy lim j →∞ E D Z (2 j , = 2 π E g (0) . (C.20)( iii ) Fix fixed j , . . . , j m ∈ N , define the vector of univariate wavelet coefficients V n := (cid:0) D Z ( a ( n )2 j , , . . . , D Z ( a ( n )2 j , n a,j ) , D Z ( a ( n )2 j , , . . .. . . , D Z ( a ( n )2 j m , n a,j m ) (cid:1) ⊤ , (C.21) V n ∈ R ν ( n ) , where ν ( n ) = n a,j + . . . + n a,j m . Let λ i,V n , i = 1 , . . . , ν ( n ) , be the eigenvaluesof the covariance matrix Γ V n = Cov( V n , V n ) . Then, for some C > , max i =1 ,...,ν ( n ) λ i,V n ≤ C. (C.22) Proof:
We first show ( i ). Fix i ∈ { , . . . , p } . For any k, k ′ , E (cid:2) D Z ( a j , k ) i D Z ( a j ′ , k ′ ) i (cid:3) = a − − ( j + j ′ ) Z R Z R X ℓ ∈ Z X ℓ ′ ∈ Z ψ (cid:0) ( a j ) − t (cid:1) ψ (cid:0) ( a j ′ ) − t ′ (cid:1) φ ( t + ℓ ) φ ( t ′ + ℓ ′ ) × E (cid:2) Z ( a j k − ℓ ) i Z ( a j ′ k ′ − ℓ ′ ) i (cid:3) dtdt ′ a − − ( j + j ′ ) Z R Z R X ℓ ∈ Z X ℓ ′ ∈ Z ψ (cid:0) ( a j ) − t (cid:1) ψ (cid:0) ( a j ′ ) − t ′ (cid:1) φ ( t + ℓ ) φ ( t ′ + ℓ ′ ) × γ Z , E g (cid:0) a (2 j k − j ′ k ′ ) + ( ℓ ′ − ℓ ) (cid:1) dtdt ′ . (C.23)In (C.23), for any t, h ∈ Z , γ Z , E g ( h ) := E (cid:2) Z ( t + h ) Z ( t ) (cid:3) = Z π − π e i hx E g ( x ) dx = E h E (cid:16) Z π − π e i hx g ( x ) dx (cid:12)(cid:12)(cid:12) g (cid:17)i . Consequently, by condition (4.4), (cid:12)(cid:12) γ Z , E g ( h ) (cid:12)(cid:12) ≤ C e − C | h | , h ∈ Z . (C.24)Let r = a (2 j k − j ′ k ′ ). Note that condition (C.18) is equivalent to | a j k − a j ′ k ′ | max { a j , a j ′ } ≥ T. Therefore, by the same argument leading to expression (A.25) in Boniece et al. (2021), | ℓ − ℓ ′ r | ≤ .So, by the bound (C.24), (cid:12)(cid:12) γ Z , E g (cid:0) j k − j ′ k ′ + ( ℓ ′ − ℓ ) (cid:1)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12) γ Z , E g (cid:16) r (cid:16) ℓ ′ − ℓr (cid:17)(cid:17)(cid:12)(cid:12)(cid:12) ≤ C e − C ′ | r | . Hence, (cid:12)(cid:12) E D Z ( a j , k ) i D Z ( a j , k ′ ) i (cid:12)(cid:12) ≤ Ca − − ( j + j ′ ) e − C ′ | r | Z R Z R X ℓ ∈ Z X ℓ ′ ∈ Z (cid:12)(cid:12) ψ (cid:0) ( a j ) − t (cid:1) ψ (cid:0) ( a j ′ ) − t ′ (cid:1) φ ( t + ℓ ) φ ( t ′ + ℓ ′ ) (cid:12)(cid:12) dtdt ′ . (C.25)Now observe that each sum over ℓ and ℓ ′ in (C.25) contains only ( a j + 1) T − a j ′ + 1) T − ψ is bounded. Thus, after interchanging the summation andintegration symbols, each summand is bounded by the constant k ψ k ∞ k φ k L . Thus, (C.25) isbounded by C ′′ a − − ( j + j ′ ) e − C ′ | r | a j + j ′ k ψ k ∞ k φ k L ≤ C ′′′ a ( j + j ′ ) e − C ′ | r | . Then, (C.19) holds. This establishes ( i ).We now turn to ( ii ). Recast E D Z (2 j , = Z π − π |H j ( x ) | E g ( x ) dx. Recall that 12 π Z [ − π,π ] |H j ( x ) | dx = X ℓ ∈ Z h j,ℓ = 1 , (C.26)itself a consequence of Parseval’s theorem (see, e.g., Boniece et al. (2021), relations (A.11) and(A.12)). Fix a ∈ (0 , π ]. By relations (A.15) and (A.16) in Boniece et al. (2021), Z [ − π,π ] \ ( − a,a ) |H j ( x ) | dx → , j → ∞ , (C.27)58nd lim j →∞ π Z a − a |H j ( x ) | dx = 1 . (C.28)Now note that, by condition (4.3), the function E g ( x ) is continuous at x = 0. Therefore, for anyfixed ε >
0, we can choose 0 < a < π small so that, for | x | < a , | E g ( x ) − E g (0) | < ε . Thus, inview of (C.26), (cid:12)(cid:12)(cid:12) π Z π − π |H j ( x ) | E g ( x ) dx − E g (0) (cid:12)(cid:12)(cid:12) ≤ π Z π − π |H j ( x ) | | E g ( x ) − E g (0) | dx = 12 π Z a − a |H j ( x ) | | E g ( x ) − E g (0) | dx + 12 π Z [ − π,π ] \ [ − a,a ] |H j ( x ) | | E g ( x ) − E g (0) | dx< ε π Z a − a |H j ( x ) | dx + 1 π k E g k ∞ Z [ − π,π ] \ [ − a,a ] |H j ( x ) | dx, (C.29)where k·k ∞ is the usual L ∞ norm. By condition (4.3), k E g k ∞ < ∞ . Hence, by taking j sufficientlylarge, (C.27), (C.28) and (C.29) imply (C.20).To show ( iii ), consider the boundsup k u k =1 u ⊤ Γ V n u ≤ max ℓ =1 ,...,ν ν X k =1 | (Γ V n ) k,ℓ |≤ m max j,j ′ = j ,...,j m max k =1 ,...,n j n j ′ X k ′ =1 | E D Z (2 j , k ) D Z (2 j ′ , k ′ ) | . (C.30)Now, take M large so that (C.19) holds whenever | j k − j ′ k ′ | > M and a ≡ a ( n ) = 1. Then, forany k ′ , j , j ′ and a , { k : | a j k − a j ′ k ′ | < aM } = { k : | j k − j ′ k ′ | < M } = { k : 2 j ′ − j k ′ − − j M < k < j ′ − j k ′ + 2 − j M } ≤ − j M + 1 ≤ M + 1 . So, for large n , all but finitely many terms satisfy the bound (C.19), and the maximum numberof which do not satisfy the bound is independent of k ′ , j , j ′ and a . Furthermore, for arbitrary k, k ′ , j, j ′ , | E D Z ( a j , k ) D Z ( a j ′ , k ′ ) | ≤ | E D Z ( a j , E D Z ( a j ′ , | / ≤ sup j,j ′ | E D Z ( a j , E D Z ( a j ′ , | / ≤ C < ∞ by (C.20). In other words, the finitely many terms in the sum on the right-hand side of (C.30)that do not satisfy the bound (C.19) are bounded by a constant irrespective of n . Thus, for someconstants C ′ , C ′′ > m max j,j ′ = j ,...,j m max k =1 ,...,n a,j n a,j ′ X k ′ =1 | E D Z ( a j , k ) D Z ( a j ′ , k ′ ) | ≤ C ′ X r =0 e − C ′′ | r | < ∞ . (C.31)Hence, by (C.30) and (C.31) the eigenvalues of the sequence of matrices { Γ V n } n ∈ N are bounded,i.e., (C.22) holds. This establishes ( iii ). (cid:3) The following lemma is used in the proof of Proposition 4.1.
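As an illustrative aside (not part of the proof), the bound (C.30) rests on the elementary fact that the largest eigenvalue of a symmetric matrix is dominated by its maximum absolute row sum; with exponentially decaying covariances as in (C.19), the row sums, and hence the eigenvalues of Γ_{V_n}, remain bounded uniformly in the dimension. A minimal numerical sketch with a hypothetical decay rate:

```python
import numpy as np

# Illustrative sketch (not part of the proof): the largest eigenvalue of a symmetric
# matrix is bounded by its maximum absolute row sum, which is the elementary bound
# behind (C.30). With exponentially decaying covariances, as in (C.19), the row sums,
# and hence the eigenvalues of Gamma_{V_n}, stay bounded in the dimension.
# The decay rate c below is hypothetical.
def max_eig_and_rowsum_bound(nu, c=0.7):
    idx = np.arange(nu)
    Gamma = c ** np.abs(idx[:, None] - idx[None, :])   # Toeplitz, exponentially decaying
    lam_max = np.linalg.eigvalsh(Gamma)[-1]
    row_sum = np.abs(Gamma).sum(axis=1).max()
    return lam_max, row_sum

for nu in (10, 100, 1000):
    lam, bound = max_eig_and_rowsum_bound(nu)
    print(nu, round(lam, 3), round(bound, 3))          # lam <= bound, both bounded in nu
```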
Lemma C.2
Suppose the assumptions of Proposition 4.1 hold. Then, k a ( n ) − ( H + I ) W X,Z ( a ( n )2 j )) k = O P (1) . roof: For any u ∈ R r ,0 ≤ u ∗ a − ( H + I ) W X,Z ( a j ) W X,Z ( a j ) ∗ a − ( H + I ) ∗ u = 1 n a,j n a,j X k =1 n a,j X k ′ =1 u ∗ a − ( H + I ) D X ( a j , k ) D Z ( a j , k ′ ) ∗ D Z ( a j , k ) D X ( a j , k ) ∗ a − ( H + I ) ∗ u = 1 n a,j n a,j X k =1 n a,j X k ′ =1 (cid:0) u ∗ a − ( H + I ) D X ( a j , k ) D X ( a j , k ) ∗ a − ( H + I ) ∗ u (cid:1) h D Z ( a j , k ) , D Z ( a j , k ′ ) i . Pick any κ >
0. Since u is fixed and X and Z are independent, Markov’s inequality implies that P (cid:18)(cid:20) n a,j n a,j X k =1 n a,j X k ′ =1 n u ∗ a − ( H + I ) D X ( a j , k ) D X ( a j , k ) ∗ a − ( H + I ) ∗ u × (cid:10) D Z ( a j , k ) , D Z ( a j , k ′ ) (cid:11)o(cid:21) > κ (cid:19) ≤ κ E (cid:18) n a,j n a,j X k =1 n a,j X k ′ =1 n u ∗ a − ( H + I ) D X ( a j , k ) D X ( a j , k ) ∗ a − ( H + I ) ∗ u × (cid:10) D Z ( a j , k ) , D Z ( a j , k ′ ) (cid:11)o(cid:19) ≤ κ n a,j n a,j X k =1 n a,j X k ′ =1 n k a − ( H + I ) E D X ( a j , k ) D X ( a j , k ) ∗ a − ( H + I ) ∗ k× (cid:12)(cid:12) E (cid:10) D Z ( a j , k ) , D Z ( a j , k ′ ) (cid:11)(cid:12)(cid:12)o , (C.32)which holds uniformly in u . However, condition (4.5) and Lemma C.4 imply that M := sup n,k,k ′ k a − ( H + I ) E D X ( a j , k ) D X ( a j , k ′ ) ∗ a − ( H + I ) ∗ k < ∞ . Furthermore, E D Z ( a j , k ) i D Z ( a j , k ′ ) i = E D Z ( a j , k ) D Z ( a j , k ′ ) , i = 1 , . . . , p . Therefore, ex-pression (C.32) is bounded above by1 κ Mn a,j n a,j X k =1 n a,j X k ′ =1 (cid:12)(cid:12) E (cid:10) D Z ( a j , k ) , D Z ( a j , k ′ ) (cid:11)(cid:12)(cid:12) = Mκ pn a,j n a,j ϕ a j (0) + 2 n a,j − X | τ | =1 ( n a,j − | τ | ) | ϕ a j ( τ ) | , (C.33)where τ = | k − k ′ | . In (C.33), for some constant C > | ϕ a j ( τ ) | = (cid:12)(cid:12) E D Z ( a j , k ) D Z ( a j , k ′ ) (cid:12)(cid:12) ≤ E D Z ( a j , = m ( a j ) ≤ C. Consider the second sum term on the right-hand side of (C.33). By taking any integer
K > ψ ), pn a,j n a,j − X | τ | =1 ( n a,j − | τ | ) ϕ a j ( τ ) = pn a,j n K X | τ | =1 + n a,j − X | τ | = K +1 o ( n a,j − | τ | ) ϕ a j ( τ )60 pn a,j C ′ ( n a,j −
1) + n a,j sup | τ | >K ϕ a j ( τ ) ! ≤ C ′ pn a,j + p sup | τ | >K ϕ a j ( τ ) ≤ C ′′ pn a,j → C ′′′ ∈ [0 , ∞ ) , n → ∞ , where the last inequality and the limit follow from Lemma C.1 and condition (2.27), respectively.This implies that, for some constant M ′ > n a,j (and κ ), expression (C.33)is bounded above by M ′ /κ . Consequently, so is (C.32). Since κ > (cid:3) The following lemma is used in the proof of Proposition C.1. It provides a concentrationinequality for centered quadratic forms. It corresponds to Lemma 1 in Laurent and Massart(2000) (see also Birg´e and Massart (1998), Lemma 8, and Boucheron et al. (2013), p. 39).
Lemma C.3 (Laurent and Massart (2000)) Let Z_1, . . . , Z_n i.i.d. ∼ N(0, 1) and η_1, . . . , η_n ≥ 0, not all zero. Let ‖η‖_2 and ‖η‖_∞ be the Euclidean and sup norms of the vector η = (η_1, . . . , η_n)^*. Also, define the random variable X = Σ_{i=1}^n η_i (Z_i^2 − 1). Then, for every x > 0,

P( X ≥ 2‖η‖_2 √x + 2‖η‖_∞ x ) ≤ exp{−x},   (C.34)

P( X ≤ −2‖η‖_2 √x ) ≤ exp{−x}.   (C.35)

In the following lemma, we characterize the same-scale scaling property of the cross-correlation E[ D_X(a(n)2^j, k) D_X(a(n)2^j, k′)^* ] between wavelet coefficients. The lemma is used in the proof of Lemma C.2.

Lemma C.4
Under the assumptions of Proposition 4.1, there is a constant
C > 0 so that

‖ a(n)^{−(H+(1/2)I)} E[ D_X(a(n)2^j, k) D_X(a(n)2^j, k′)^* ] a(n)^{−(H^*+(1/2)I)} ‖ ≤ C,   (C.36)

uniformly in k, k′, as n → ∞.

Proof:
For notational simplicity, fix j and set j ′ = j . Also fix k, k ′ ∈ Z . Recall that the spectraldensity of X is given by g X ( x ) = | x | − ( H +(1 / I ) s ( x ) | x | − ( H ∗ +(1 / I ) (see (4.5)). Then, by the change of variable y = 2 j x , E D X (2 j , k ) D X (2 j , k ′ ) ∗ = Z π − π e ix j ( k − k ′ ) |H j ( x ) | g X ( x ) dx = Z π j − π j e iy ( k − k ′ ) (cid:12)(cid:12)(cid:12) H j (cid:16) y j (cid:17)(cid:12)(cid:12)(cid:12) g X (cid:16) y j (cid:17) dy j = 2 jH n Z π j − π j (cid:12)(cid:12)(cid:12) H j (cid:16) y j (cid:17)(cid:12)(cid:12)(cid:12) | y | − ( H +(1 / I ) s (cid:16) y j (cid:17) | y | − ( H ∗ +(1 / I ) dy o jH ∗ . (C.37)Recast the expression on the right-hand side of (C.37) as2 jH hn Z | y |≤ + Z < | y |≤ π j o(cid:12)(cid:12)(cid:12) H j (cid:16) y j (cid:17)(cid:12)(cid:12)(cid:12) | y | − ( H +(1 / I ) s (cid:16) y j (cid:17) | y | − ( H ∗ +(1 / I ) dy i jH ∗ . (C.38)61y Moulines et al. (2007 b ), relation (78), |H j ( x ) | ≤ C j/ | j x | N ψ (1 + 2 j | x | ) α + N ψ , x ∈ [ − π, π ) . (C.39)Therefore, for | y | ≤ j π , (cid:12)(cid:12)(cid:12) H j (cid:16) y j (cid:17)(cid:12)(cid:12)(cid:12) ≤ C j y N ψ (1 + | y | ) α + N ψ ) . (C.40)In regard to the first integral term in (C.38), relation (C.40) as well as pre- and post-multiplicationby (2 j ) − H − (1 / I and its transpose imply that (cid:13)(cid:13)(cid:13) (2 j ) − H − (1 / I n jH Z | y |≤ (cid:12)(cid:12)(cid:12) H j (cid:16) y j (cid:17)(cid:12)(cid:12)(cid:12) | y | − ( H +(1 / I ) s (cid:16) y j (cid:17) | y | − ( H ∗ +(1 / I ) dy jH ∗ o (2 j ) − H ∗ − (1 / I (cid:13)(cid:13)(cid:13) = 2 − j (cid:13)(cid:13)(cid:13) Z | y |≤ (cid:12)(cid:12)(cid:12) H j (cid:16) y j (cid:17)(cid:12)(cid:12)(cid:12) | y | − ( H +(1 / I ) s (cid:16) y j (cid:17) | y | − ( H ∗ +(1 / I ) dy (cid:13)(cid:13)(cid:13) ≤ C (cid:13)(cid:13)(cid:13) Z − | y | N ψ (1 + | y | ) α + N ψ ) | y | − ( H +(1 / I ) s (cid:16) y j (cid:17) | y | − ( H ∗ +(1 / I ) dy (cid:13)(cid:13)(cid:13) ≤ C ′ . (C.41)In (C.41), the constant C ′ is finite and independent of j since 2 N ψ − h r − > − s ( · ) isassumed bounded (see condition ( iii ) in the statement of Proposition 4.1). Analogously, for thesecond term in (C.38), (cid:13)(cid:13)(cid:13) (2 j ) − H − (1 / I n jH Z < | y |≤ π j (cid:12)(cid:12)(cid:12) H j (cid:16) y j (cid:17)(cid:12)(cid:12)(cid:12) | y | − ( H +(1 / I ) s (cid:16) y j (cid:17) | y | − ( H ∗ +(1 / I ) dy jH ∗ o (2 j ) − H ∗ − (1 / I (cid:13)(cid:13)(cid:13) ≤ C ′′ Z j π y N ψ (1 + | y | ) α + N ψ ) (cid:13)(cid:13) | y | − ( H +(1 / I ) s (cid:16) y j (cid:17) | y | − ( H ∗ +(1 / I ) (cid:13)(cid:13) dy ≤ C ′′ Z ∞ y N ψ (1 + | y | ) α + N ψ ) (cid:13)(cid:13) | y | − ( H +(1 / I ) s (cid:16) y j (cid:17) | y | − ( H ∗ +(1 / I ) (cid:13)(cid:13) dy ≤ C ′′′ . (C.42)In (C.42), the constant C ′′′ is finite and independent of j since 2 α + 2 h + 1 > s ( · )is assumed bounded. So, turning back to (C.38), by relations (C.41) and (C.42) we obtain thebound (cid:13)(cid:13)(cid:13) (2 j ) − H − (1 / I jH n Z | y |≤ π j (cid:12)(cid:12)(cid:12) H j (cid:16) y j (cid:17)(cid:12)(cid:12)(cid:12) | y | − ( H +(1 / I ) × s (cid:16) y j (cid:17) | y | − ( H ∗ +(1 / I ) dy o jH ∗ (2 j ) − H ∗ − (1 / I (cid:13)(cid:13)(cid:13) ≤ C. (C.43)This establishes (C.36). (cid:3) In Proposition C.2, we establish the asymptotic distribution of the auxiliary wavelet randommatrix b B a ( a j ). The proposition is used in the proof of Proposition C.3. 
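Before turning to Proposition C.2, we record an illustrative aside (not part of the proof) on the operator normalization a^{−(H+(1/2)I)} used to define the normalized wavelet coefficients in (C.44) just below: it is the matrix power a^M = exp(M log a), computable with a matrix exponential. A minimal sketch with a hypothetical 2 × 2 exponent:

```python
import numpy as np
from scipy.linalg import expm

# Illustrative sketch (not part of the proof): the operator normalization a^{-(H + I/2)}
# appearing in (C.44) is the matrix power a^{M} = exp(M log a), computable via the matrix
# exponential. The 2 x 2 exponent H and the scale a below are hypothetical.
H = np.array([[0.7, 0.0],
              [0.3, 0.4]])
a = 2.0 ** 6                                # stand-in for a dyadic scale a(n)2^j
M = -(H + 0.5 * np.eye(2))
norm_matrix = expm(np.log(a) * M)           # a^{-(H + I/2)}

# sanity checks: a^{M} a^{-M} = I, and (a*b)^{M} = a^{M} b^{M} since the exponents commute
print(np.allclose(norm_matrix @ expm(-np.log(a) * M), np.eye(2)))
b = 4.0
print(np.allclose(expm(np.log(a * b) * M), expm(np.log(a) * M) @ expm(np.log(b) * M)))
```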
For the statements of Proposition C.2, define the operator-normalized wavelet coefficients

a(n)^{−(H+(1/2)I)} D_X(a(n)2^j, k) =: D(a(n)2^j, k) = ( d_1(a(n)2^j, k), . . . , d_r(a(n)2^j, k) )^*.   (C.44)

Also recall that, for a zero mean, Gaussian random vector Z ∈ R^m, the Isserlis theorem (e.g., Vignat (2012)) gives

E( Z_1 · · · Z_{2k} ) = Σ Π E( Z_i Z_j ),   E( Z_1 · · · Z_{2k+1} ) = 0,   k = 1, . . . , ⌊m/2⌋.   (C.45)

The notation Σ Π stands for adding over all possible k-fold products of pairs E( Z_i Z_j ), where the indices partition the set {1, . . . , 2k}. Further recall that α, N_ψ and δ_s are given by (2.3), (2.5) and (4.6), respectively.

Proposition C.2
Fix j , j ′ , and let Φ z := 2 j + j ′ Z ∞−∞ e i xz b ψ (2 j x ) b ψ (2 j ′ x ) | i x | − ( D + I ) s (0) | i x | − ( D ∗ + I ) dx, z ∈ Z . (C.46)( i ) Fix k , k ′ . Then, for large enough n and some C > , (cid:13)(cid:13)(cid:13) E D ( a ( n )2 j , k ) D ( a ( n )2 j ′ , k ′ ) ∗ − Φ j k − j ′ k ′ (cid:13)(cid:13)(cid:13) ≤ Ca ( n ) min { α − ,α + N ψ − ε,δ s } (C.47) for all small enough ε . ( ii ) As n → ∞ , √ n a,j √ n a,j ′ n a,j X k =1 n a,j ′ X k ′ =1 E D ( a ( n )2 j , k ) D ( a ( n )2 j ′ , k ′ ) ∗ ⊗ E D ( a ( n )2 j , k ) D ( a ( n )2 j ′ , k ′ ) ∗ → ( j + j ′ ) / gcd(2 j , j ′ ) ∞ X z = −∞ Φ z gcd(2 j , j ′ ) ⊗ Φ z gcd(2 j , j ′ ) . (C.48)( iii ) There is a matrix G jj ′ ∈ M ( r ( r + 1) / , R ) , not necessarily symmetric, such that, as n → ∞ , √ n a,j √ n a,j ′ Cov (cid:16) b B a (2 j ) , b B a (2 j ′ ) (cid:17) → G jj ′ , (C.49) where the entries of G jj ′ can be retrieved from (C.48) by means of (C.45) (see (2.1) on thenotation vec S ). ( iv ) For Υ( n ) := r j X j = j n a,j , (C.50) let Y n ∈ R Υ( n ) be the vector of all operator-normalized wavelet coefficients at scales a ( n )2 j , j = j , . . . , j (cf. (C.75) ). Let λ i, Y n , i = 1 , . . . , Υ( n ) , be the eigenvalues of the covariancematrix Γ Y n = Cov( Y n , Y n ) . Then, for some C > , max i =1 ,..., Υ( n ) λ i, Y n ≤ C. (C.51) Proof:
We first show ( i ). Let g ∗ ( x ) = | b φ ( x ) | | − e − i x | D + I g X ( x ) | − e − i x | D ∗ + I . (C.52)Now consider the bound (cid:13)(cid:13)(cid:13) E D ( a ( n )2 j , k ) D ( a ( n )2 j ′ , k ′ ) ∗ − Φ j k − j ′ k ′ (cid:13)(cid:13)(cid:13) = (cid:13)(cid:13)(cid:13) a ( n ) − ( H +(1 / I ) E D X ( a ( n )2 j , k ) D X ( a ( n )2 j ′ , k ′ ) ∗ a ( n ) − ( H ∗ +(1 / I ) − Φ j k − j ′ k ′ (cid:13)(cid:13)(cid:13) (cid:13)(cid:13)(cid:13) a ( n ) − ( H +(1 / I ) n E D X ( a ( n )2 j , k ) D X ( a ( n )2 j ′ , k ′ ) ∗ − a ( n )2 j + j ′ Z π − π e i xa ( n )(2 j k − j ′ k ′ ) | b φ ( x ) | b ψ (cid:0) a ( n )2 j x (cid:1) b ψ (cid:0) a ( n )2 j ′ x (cid:1) g X ( x ) dx o a ( n ) − ( H ∗ +(1 / I ) (cid:13)(cid:13)(cid:13) + (cid:13)(cid:13)(cid:13) a ( n ) − ( H +(1 / I ) n a ( n )2 j + j ′ Z π − π e i xa ( n )(2 j k − j ′ k ′ ) b ψ (cid:0) a ( n )2 j x (cid:1) b ψ (cid:0) a ( n )2 j ′ x (cid:1) | − e − i x | − ( D + I ) (cid:2) g ∗ ( x ) − g ∗ (0) (cid:3) | − e − i x | − ( D ∗ + I ) dx o a ( n ) − ( H ∗ +(1 / I ) (cid:13)(cid:13)(cid:13) + (cid:13)(cid:13)(cid:13) a ( n ) − ( H +(1 / I ) n a ( n )2 j + j ′ Z π − π e i xa ( n )(2 j k − j ′ k ′ ) b ψ (cid:0) a ( n )2 j x (cid:1) b ψ (cid:0) a ( n )2 j ′ x (cid:1) ×| − e − i x | − ( D + I ) g ∗ (0) | − e − i x | − ( D ∗ + I ) dx − a ( n )2 j + j ′ Z π − π e i xa ( n )(2 j k − j ′ k ′ ) b ψ (cid:0) a ( n )2 j x (cid:1) b ψ (cid:0) a ( n )2 j ′ x (cid:1) | i x | − ( D + I ) g ∗ (0) | i x | − ( D ∗ + I ) dx o a ( n ) − ( H ∗ +(1 / I ) (cid:13)(cid:13)(cid:13) + (cid:13)(cid:13)(cid:13) a ( n ) − ( H +(1 / I ) n a ( n )2 j + j ′ Z π − π e i xa ( n )(2 j k − j ′ k ′ ) b ψ (cid:0) a ( n )2 j x (cid:1) b ψ (cid:0) a ( n )2 j ′ y (cid:1) ×| i x | − ( D + I ) g ∗ (0) | i x | − ( D ∗ + I ) dx o a ( n ) − ( H ∗ +(1 / I ) − Φ j k − j ′ k ′ (cid:13)(cid:13)(cid:13) . (C.53)In the rest of this proof, we establish bounds for each term of the sum on the right-hand side of(C.53).In regard to the first sum term in (C.53), for notational simplicity set a ( n ) = 1 for the momentand reexpress (2 j ) − ( H +(1 / I ) E D X (2 j , k ) D X (2 j ′ , k ′ ) ∗ (2 j ′ ) − ( H ∗ +(1 / I ) = Z π − π e i x (2 j k − j ′ k ′ ) H j ( x ) H j ′ ( x )(2 j ) − ( H +(1 / I ) g X ( x )(2 j ′ ) − ( H ∗ +(1 / I ) dx = Z π − π e i x (2 j k − j ′ k ′ ) n [ H j ( x ) − j/ b φ ( x ) b ψ (2 j x )][ H j ′ ( x ) − j ′ / b φ ( x ) b ψ (2 j ′ x )]+[ H j ( x ) − j/ b φ ( x ) b ψ (2 j x )]2 j ′ / b φ ( x ) b ψ (2 j ′ x ) + 2 j/ b φ ( x ) b ψ (2 j x )[ H j ′ ( x ) − j ′ / b φ ( x ) b ψ (2 j ′ x )]+2 j/ j ′ / | b φ ( x ) | b ψ (2 j x ) b ψ (2 j ′ x ) o (2 j ) − ( H +(1 / I ) g X ( x )(2 j ′ ) − ( H ∗ +(1 / I ) dx. (C.54)We now construct bounds for the first, second and third sum terms on the right-hand side of(C.54). It suffices to consider the former two. Recall that, by relation (76) in Moulines et al.(2007 b ), (cid:12)(cid:12) H j ( x ) − j/ b φ ( x ) b ψ (2 j x ) (cid:12)(cid:12) ≤ C (2 j ) − α | x | N ψ , x ∈ [ − π, π ) . (C.55)So, by (C.55), (cid:13)(cid:13)(cid:13) Z π − π e i x (2 j k − j ′ k ′ ) (cid:2) H j ( x ) − j/ b φ ( x ) b ψ (2 j x ) (cid:3)(cid:2) H j ′ ( x ) − j ′ / b φ ( x ) b ψ (2 j ′ x ) (cid:3) g X ( x ) dx (cid:13)(cid:13)(cid:13) ≤ C (2 j + j ′ ) ( − α ) Z π − π | x | N ψ k (2 j ) − ( H +(1 / I ) g X ( x )(2 j ′ ) − ( H ∗ +(1 / I ) k dx, (C.56)where C > j or j ′ . Now recall that, by relation (77) in Moulines et al.(2007 b ), | b φ ( x ) b ψ (2 j ′ x ) | ≤ C | j ′ x | N ψ (1 + 2 j ′ | x | ) α + N ψ , x ∈ [ − π, π ) . 
(C.57)64hus, by relations (C.55) and (C.57), (cid:13)(cid:13)(cid:13) Z π − π e i x (2 j k − j ′ k ′ ) (cid:2) H j ( x ) − j/ b φ ( x ) b ψ (2 j x ) (cid:3) j ′ / b φ ( x ) b ψ (2 j ′ x ) × (2 j ) − ( H +(1 / I ) g X ( x )(2 j ′ ) − ( H ∗ +(1 / I ) dx (cid:13)(cid:13)(cid:13) (C.58) ≤ C (2 j ) ( − α ) j ′ / Z π − π | x | N ψ | b φ ( x ) b ψ (2 j ′ x ) | k (2 j ) − ( H +(1 / I ) g X ( x )(2 j ′ ) − ( H ∗ +(1 / I ) k dx ≤ C (2 j ) ( − α ) j ′ / Z π − π | x | N ψ | j ′ x | N ψ (1 + 2 j ′ | x | ) α + N ψ k (2 j ) − ( H +(1 / I ) g X ( x )(2 j ′ ) − ( H ∗ +(1 / I ) k dx. (C.59)By a change of variable y = 2 j ′ x in (C.59), C (2 j ) ( − α ) − j ′ / Z π j ′ − π j ′ (cid:12)(cid:12)(cid:12) y j ′ (cid:12)(cid:12)(cid:12) N ψ | y | N ψ (1 + | y | ) α + N ψ (cid:13)(cid:13)(cid:13) (2 j ) − ( H +(1 / I ) g X (cid:16) y j ′ (cid:17) (2 j ′ ) − ( H ∗ +(1 / I ) (cid:13)(cid:13)(cid:13) dy = C (2 j ) ( − α ) (2 j ′ ) − − N ψ Z π j ′ − π j ′ | y | N ψ (1 + | y | ) α + N ψ (cid:13)(cid:13)(cid:13) (2 j − j ′ ) − ( D + I ) | y | − ( D + I ) s (cid:16) y j ′ (cid:17) | y | − ( D ∗ + I ) (cid:13)(cid:13)(cid:13) dy, (C.60)where, again, C > j or j ′ . Therefore, at scales a j and a j ′ for fixed j and j ′ (in place of 2 j and 2 j ′ ), expression (C.58) is bounded by C ( a j ) ( − α ) ( a j ′ ) − − N ψ Z πa j ′ − πa j ′ | y | N ψ (1 + | y | ) α + N ψ (cid:13)(cid:13)(cid:13) (2 j − j ′ ) − ( D + I ) | y | − ( D + I ) s (cid:16) ya j ′ (cid:17) | y | − ( D ∗ + I ) (cid:13)(cid:13)(cid:13) dy = C ( a j ) ( − α ) ( a j ′ ) − − N ψ n Z | y |≤ + Z ≤| y |≤ πa j ′ o | y | N ψ (1 + | y | ) α + N ψ × (cid:13)(cid:13)(cid:13) (2 j − j ′ ) − ( D + I ) | y | − ( D + I ) s (cid:16) ya j ′ (cid:17) | y | − ( D ∗ + I ) (cid:13)(cid:13)(cid:13) dy ≤ C j,j ′ a − ( α + N ψ ) + C ′ j,j ′ a − ( α + N ψ ) Z ≤| y |≤ πa j ′ | y | N ψ (1 + | y | ) α + N ψ (cid:13)(cid:13)(cid:13) | y | − ( D + I ) s (cid:16) ya j ′ (cid:17) | y | − ( D ∗ + I ) (cid:13)(cid:13)(cid:13) dy ≤ C j,j ′ a − ( α + N ψ ) + C ′′ j,j ′ a − ( α + N ψ ) Z πa j ′ y N ψ − α − h − dy. (C.61)However, Z πa j ′ y N ψ − α − h − dy ≤ C × , if N ψ − α − h < a, if N ψ − α − h = 0; a N ψ − α − h , if N ψ − α − h > . Therefore, the right-hand side of (C.61) is bounded by C × a − ( α + N ψ ) , if N ψ − α − h < a − ( α + N ψ ) log a, if N ψ − α − h = 0; a − α + h ) , if N ψ − α − h > . Thus, turning back to (C.54), bounds (C.56) and (C.60) imply that (cid:13)(cid:13)(cid:13) a ( n ) − ( H +(1 / I ) n E D X ( a ( n )2 j , k ) D X ( a ( n )2 j ′ , k ′ ) ∗ a ( n )2 j + j ′ Z π − π e i xa ( n )(2 j k − j ′ k ′ ) | b φ ( x ) | b ψ (cid:0) a ( n )2 j x (cid:1) b ψ (cid:0) a ( n )2 j ′ x (cid:1) g X ( x ) dx o a ( n ) − ( H ∗ +(1 / I ) (cid:13)(cid:13)(cid:13) ≤ C (cid:16) a ( n ) − (2 α − + a ( n ) − min { α + N ψ − ε, α + h ) } (cid:17) ≤ Ca ( n ) − min { α − ,α + N ψ − ε } (C.62)for all small enough ε >
0. This establishes a bound for the first term on the right-hand side of(C.53).Turning to the second term on the right-hand side of (C.53), recall that, by assumption ( W b φ is infinitely differentiable and | b φ (0) | = 1. Since, in addition, | b φ | is even, then ( | b φ ( x ) | ) ′ | x =0 = 0.Likewise, for ℓ = 1 , . . . , r , the functions | − e − i x i x | d ℓ +1 – defined as 1 at x = 0 – are even and infinitelydifferentiable. Therefore, (cid:12)(cid:12)(cid:12) − e − i x i x (cid:12)(cid:12)(cid:12) D + I has vanishing first derivative at x = 0. (C.63)Then, for g ∗ ( x ) is as in (C.52), assumption (4.6) implies that, for some C > (cid:13)(cid:13) g ∗ ( x ) − g ∗ (0) (cid:13)(cid:13) ≤ C | x | δ s , x ∈ [ − π, π ) , where g ∗ (0) = s (0) . Hence, a − ( H +(1 / I ) n ( a j + j ′ ) Z π − π e i xa (2 j k − j ′ k ′ ) b ψ (cid:0) a j x (cid:1) b ψ (cid:0) a j ′ x (cid:1) | − e − i x | − ( D + I ) (cid:2) g ∗ ( x ) − g ∗ (0) (cid:3) | − e − i x | − ( D ∗ + I ) dx o a − ( H ∗ +(1 / I ) = a − ( H +(1 / I ) n ( a j + j ′ ) Z π − π e i xa (2 j k − j ′ k ′ ) b ψ (cid:0) a j x (cid:1) b ψ (cid:0) a j ′ x (cid:1)(cid:12)(cid:12)(cid:12) − e − i x i x (cid:12)(cid:12)(cid:12) − ( D + I ) | i x | − ( D + I ) (cid:2) g ∗ ( x ) − g ∗ (0) (cid:3) | i x | − ( D ∗ + I ) (cid:12)(cid:12)(cid:12) − e − i x i x (cid:12)(cid:12)(cid:12) − ( D ∗ + I ) dx o a − ( H ∗ +(1 / I ) = a − ( H +(1 / I ) n j + j ′ Z πa − πa e i y (2 j k − j ′ k ′ ) b ψ (cid:0) j y (cid:1) b ψ (cid:0) j ′ y (cid:1)(cid:12)(cid:12)(cid:12) − e − i y/a i y/a (cid:12)(cid:12)(cid:12) − ( D + I ) | i y/a | − ( D + I ) h g ∗ (cid:16) ya (cid:17) − g ∗ (0) i | i y/a | − ( D ∗ + I ) (cid:12)(cid:12)(cid:12) − e − i y/a i y/a (cid:12)(cid:12)(cid:12) − ( D ∗ + I ) dy o a − ( H ∗ +(1 / I ) = 2 j + j ′ Z πa − πa e i y (2 j k − j ′ k ′ ) b ψ (cid:0) j y (cid:1) b ψ (cid:0) j ′ y (cid:1)(cid:12)(cid:12)(cid:12) − e − i y/a i y/a (cid:12)(cid:12)(cid:12) − ( D + I ) | i y | − ( D + I ) h g ∗ (cid:16) ya (cid:17) − g ∗ (0) i | i y | − ( D ∗ + I ) (cid:12)(cid:12)(cid:12) − e − i y/a i y/a (cid:12)(cid:12)(cid:12) − ( D ∗ + I ) dy. (C.64)However, by expression (4.6), the norm of the right-hand side of (C.64) is bounded by C Z πa − πa | b ψ (cid:0) j y (cid:1) b ψ (cid:0) j ′ y (cid:1) | (cid:13)(cid:13) | i y | − ( D + I ) (cid:13)(cid:13)(cid:12)(cid:12)(cid:12) ya (cid:12)(cid:12)(cid:12) δ s (cid:13)(cid:13) | i y | − ( D ∗ + I ) (cid:13)(cid:13) dy ≤ C ′ a δ s Z πa − πa | b ψ (cid:0) j y (cid:1) b ψ (cid:0) j ′ y (cid:1) | (cid:13)(cid:13) | i y | − ( D + I ) (cid:13)(cid:13) | y | δ s (cid:13)(cid:13) | i y | − ( D ∗ + I ) (cid:13)(cid:13) dy ≤ C ′′ a δ s Z ∞−∞ | b ψ (cid:0) j y (cid:1) b ψ (cid:0) j ′ y (cid:1) | (cid:13)(cid:13) | i y | − ( D + I ) (cid:13)(cid:13) | y | δ s (cid:13)(cid:13) | i y | − ( D ∗ + I ) (cid:13)(cid:13) dy. (C.65)66ote that the integral on the right-hand side of (C.65) is finite. In fact, first recall that, bycondition (2.20), the ordered eigenvalues of D satisfy − < d ≤ d ≤ . . . ≤ d r < . Considering the scaling behavior of the integrand around the origin, conditions (2.3) and (4.6)imply that 2 N ψ + δ s > > d r + 2. In turn, in regard to the behavior of the integrand as | x | → ∞ ,2 d + 2 + 2 α − δ s > d + 2 α >
1. So, (C.65) establishes a bound for the second term on theright-hand side of (C.53).We now turn to the third term on the right-hand side of (C.53). After a change of variable y = ax , we can reexpress a − ( H +(1 / I ) n j + j ′ Z πa − πa e i y (2 j k − j ′ k ′ ) b ψ (cid:0) j y (cid:1) b ψ (cid:0) j ′ y (cid:1) | − e − i y/a | − ( D + I ) g ∗ (0) | − e − i y/a | − ( D ∗ + I ) dy − j + j ′ Z πa − πa e i y (2 j k − j ′ k ′ ) b ψ (cid:0) j y (cid:1) b ψ (cid:0) j ′ y (cid:1) | i y/a | − ( D + I ) g ∗ (0) | i y/a | − ( D ∗ + I ) dy o a − ( H ∗ +(1 / I ) = a − ( H +(1 / n j + j ′ Z πa − πa e i y (2 j k − j ′ k ′ ) b ψ (cid:0) j y (cid:1) b ψ (cid:0) j ′ y (cid:1)(cid:16)(cid:12)(cid:12)(cid:12) − e − i y/a i y/a (cid:12)(cid:12)(cid:12) − ( D + I ) − I (cid:17) | i y/a | − ( D + I ) × g ∗ (0) | i y/a | − ( D ∗ + I ) (cid:12)(cid:12)(cid:12) − e − i y/a i y/a (cid:12)(cid:12)(cid:12) − ( D ∗ + I ) dy o a − ( H ∗ +(1 / + a − ( H +(1 / n j + j ′ Z πa − πa e i y (2 j k − j ′ k ′ ) b ψ (cid:0) j y (cid:1) b ψ (cid:0) j ′ y (cid:1) | i y/a | − ( D + I ) × g ∗ (0) | i y/a | − ( D ∗ + I ) (cid:16)(cid:12)(cid:12)(cid:12) − e − i y/a i y/a (cid:12)(cid:12)(cid:12) − ( D ∗ + I ) − I (cid:17) dy o a − ( H ∗ +(1 / = 2 j + j ′ Z πa − πa e i y (2 j k − j ′ k ′ ) b ψ (cid:0) j y (cid:1) b ψ (cid:0) j ′ y (cid:1)(cid:16)(cid:12)(cid:12)(cid:12) − e − i y/a i y/a (cid:12)(cid:12)(cid:12) − ( D + I ) − I (cid:17) | i y | − ( D + I ) × g ∗ (0) | i y | − ( D ∗ + I ) (cid:12)(cid:12)(cid:12) − e − i y/a i y/a (cid:12)(cid:12)(cid:12) − ( D ∗ + I ) dy +2 j + j ′ Z πa − πa e i y (2 j k − j ′ k ′ ) b ψ (cid:0) j y (cid:1) b ψ (cid:0) j ′ y (cid:1) | i y | − ( D + I ) × g ∗ (0) | i y | − ( D ∗ + I ) (cid:16)(cid:12)(cid:12)(cid:12) − e − i y/a i y/a (cid:12)(cid:12)(cid:12) − ( D ∗ + I ) − I (cid:17) dy. (C.66)For ℓ = 1 , . . . , r , the functions | − e − i y/a i y/a | − ( d ℓ +1) – defined as 1 at x = 0 – have vanishing firstderivative at x = 0 (cf. (C.63)). Consequently, the norm of the expression on the right-hand sideof (C.66) is bounded by C Z πa − πa (cid:12)(cid:12) b ψ (cid:0) j y (cid:1) b ψ (cid:0) j ′ y (cid:1)(cid:12)(cid:12) (cid:12)(cid:12)(cid:12) ya (cid:12)(cid:12)(cid:12) (cid:13)(cid:13) | i y | − ( D + I ) (cid:13)(cid:13) dy ≤ C ′ a . (C.67)Note that the integral appearing in expression (C.67) is finite. In fact, considering the behavior ofthe integrand around the origin, condition (2.3) implies that 2 N ψ +1 > d r +2. In turn, considering67he behavior of the integrand as | x | → ∞ , condition (2.5) implies that 2 d +2+2 α − > − y = ax , (cid:13)(cid:13)(cid:13) a − ( H +(1 / I ) n j + j ′ Z πa − πa e i y (2 j k − j ′ k ′ ) b ψ (cid:0) j y (cid:1) b ψ (cid:0) j ′ y (cid:1) | i y/a | − ( D + I ) g ∗ (0) | i y/a | − ( D ∗ + I ) dy o a − ( H ∗ +(1 / − Φ j k − j ′ k ′ (cid:13)(cid:13)(cid:13) = (cid:13)(cid:13)(cid:13) a − ( H +(1 / I ) n j + j ′ h Z πa − πa − Z ∞−∞ i e i y (2 j k − j ′ k ′ ) b ψ (cid:0) j y (cid:1) b ψ (cid:0) j ′ y (cid:1) | i y/a | − ( D + I ) × g ∗ (0) | i y/a | − ( D ∗ + I ) dy o a − ( H ∗ +(1 / (cid:13)(cid:13)(cid:13) = (cid:13)(cid:13)(cid:13) j + j ′ Z | y |≥ πa e i y (2 j k − j ′ k ′ ) b ψ (cid:0) j y (cid:1) b ψ (cid:0) j ′ y (cid:1) | i y | − ( D + I ) g ∗ (0) | i y | − ( D ∗ + I ) dy (cid:13)(cid:13)(cid:13) ≤ C ′ Z | y |≥ πa j | y | ) α j ′ | y | ) α | y | d +2 dy ≤ C ′′ y α +2 d +1 (cid:12)(cid:12)(cid:12) ∞ πa = C ′′′ a α +2 d +1 . 
(C.68)Consequently, (C.68) establishes a bound for the fourth term on the right-hand side of (C.53).Thus, in light of (C.53), the bound (C.53) for (cid:13)(cid:13)(cid:13) E D ( a ( n )2 j , k ) D ( a ( n )2 j ′ , k ′ ) ∗ − Φ j k − j ′ k ′ (cid:13)(cid:13)(cid:13) isnow a consequence of relations (C.62), (C.65), (C.67) and (C.68), where we use the facts that2 α + 2 d + 1 > α − , ≥ δ s . This establishes ( i ).In regard to ( ii ), first recall that, for two matrices A and B , A ⊗ A − B ⊗ B = ( A − B ) ⊗ ( A − B ) + B ⊗ ( A − B ) + ( A − B ) ⊗ B . Therefore, k A ⊗ A − B ⊗ B k ℓ ≤ k ( A − B ) ⊗ ( A − B ) k ℓ + k B ⊗ ( A − B ) k ℓ + k ( A − B ) ⊗ ( A − B ) k ℓ ≤ k A − B k ℓ + 2 k A − B k ℓ k B k ∞ . (C.69)So, let Ξ j k − j ′ k ′ = Φ j k − j ′ k ′ ⊗ Φ j k − j ′ k ′ . (C.70)Recast 1 √ n a,j √ n a,j ′ n a,j X k =1 n a,j ′ X k ′ =1 E D ( a ( n )2 j , k ) D ( a ( n )2 j ′ , k ′ ) ∗ ⊗ E D ( a ( n )2 j , k ) D ( a ( n )2 j ′ , k ′ ) ∗ = 1 √ n a,j √ n a,j ′ n a,j X k =1 n a,j ′ X k ′ =1 n E D ( a ( n )2 j , k ) D ( a ( n )2 j ′ , k ′ ) ∗ ⊗ E D ( a ( n )2 j , k ) D ( a ( n )2 j ′ , k ′ ) ∗ − Ξ j k − j ′ k ′ o + 1 √ n a,j √ n a,j ′ n a,j X k =1 n a,j ′ X k ′ =1 Ξ j k − j ′ k ′ . (C.71)68y (C.69), the ℓ norm of the first sum term in (C.71) is bounded by1 √ n a,j √ n a,j ′ n a,j X k =1 n a,j ′ X k ′ =1 (cid:13)(cid:13)(cid:13) E D ( a ( n )2 j , k ) D ( a ( n )2 j ′ , k ′ ) ∗ − Φ j k − j ′ k ′ (cid:13)(cid:13)(cid:13) ℓ + 2 √ n a,j √ n a,j ′ n a,j X k =1 n a,j ′ X k ′ =1 (cid:13)(cid:13)(cid:13) E D ( a ( n )2 j , k ) D ( a ( n )2 j ′ , k ′ ) ∗ − Φ j k − j ′ k ′ (cid:13)(cid:13)(cid:13) ℓ × k Φ j k − j ′ k ′ k ∞ ≤ C na ( n ) (cid:16) a ( n ) { α − ,α + N ψ ,δ s } + 1 a ( n ) min { α − ,α + N ψ ,δ s } (cid:17) → , n → ∞ , (C.72)by expression (C.47), where the limit is a consequence of condition (2.27). Namely, the firstterm in (C.71) vanishes as n → ∞ . In regard to the second term, we now proceed as in theproof of Proposition 3.3, ( i ), in Abry and Didier (2018 b ). It suffices to consider the subsequence n/a ( n ) = 2 j + j ′ n ∗ , n ∗ → ∞ . Then, n a,j = 2 j ′ n ∗ , n a,j ′ = 2 j n ∗ , and n − / a,j n − / a,j ′ = 2 − ( j + j ′ ) / /n ∗ .Let Ξ • be as in (C.70). By Theorem 1.8 in Jones and Jones (1998), p.10, the range of indicesspanned by 2 j k − j ′ k ′ is Z gcd(2 j , j ′ ). Thus, we would like to show that ∞ X z = −∞ k Ξ z gcd(2 j , j ′ ) k < ∞ . (C.73)Note that k Ξ j k − j ′ k ′ k ℓ ≤ k Φ z gcd(2 j , j ′ ) k ℓ . Thus, if P ∞ z = −∞ k Φ z k < ∞ , then1 √ n a,j √ n a,j ′ n a,j X k =1 n a,j ′ X k ′ =1 Ξ a ( n )(2 j k − j ′ k ′ ) → ( j + j ′ ) / gcd(2 j , j ′ ) ∞ X z = −∞ Φ z gcd(2 j , j ′ ) ⊗ Φ z gcd(2 j , j ′ ) (C.74)as a consequence of Lemma B.3 in Abry and Didier (2018 b ). Now recall that, for any1 ≤ ℓ , ℓ ≤ r , b ψ (2 j x ) b ψ (2 j ′ x ) | x | − ( d ℓ + d ℓ +2) ∈ L ( R ), which is a consequence of conditions(2.3)–(2.5). Note that, for constants c · , · ∈ C , we can write P − H | x | − ( D + I ) | x | − ( D ∗ + I ) ( P ∗ H ) − = { c ℓ ,ℓ | x | − ( d ℓ + d ℓ +2) } ℓ ,ℓ =1 ,...,r . Thus, by Parseval’s theorem, ∞ X z = −∞ (cid:12)(cid:12)(cid:12)(cid:12) Z R e i zx b ψ (2 j x ) b ψ (2 j ′ x ) | x | − ( d ℓ + d ℓ +2) dx (cid:12)(cid:12)(cid:12)(cid:12) = 2 π Z R (cid:12)(cid:12)(cid:12)(cid:12) b ψ (2 j x ) b ψ (2 j ′ x ) | x | − ( d ℓ + d ℓ +2) (cid:12)(cid:12)(cid:12)(cid:12) dx < ∞ . This proves P ∞ z = −∞ k Φ z k < ∞ , as claimed. Then, (C.74) holds. Expressions (C.72) and (C.74)imply (C.48). This establishes ( ii ).Statement ( iii ) is a direct consequence of ( i ) and ( ii ).We turn to ( iv ). 
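Before the argument for (iv), an illustrative aside (not part of the proof): the Isserlis identity (C.45), which is how the covariance entries in statement (iii) are read off from the limit (C.48), is easily checked by simulation. A minimal Monte Carlo sketch with an arbitrary 4 × 4 covariance matrix:

```python
import numpy as np

# Illustrative sketch (not part of the proof): Monte Carlo check of the Isserlis identity
# (C.45) for fourth moments of a zero-mean Gaussian vector,
#   E[Z1 Z2 Z3 Z4] = E[Z1 Z2] E[Z3 Z4] + E[Z1 Z3] E[Z2 Z4] + E[Z1 Z4] E[Z2 Z3].
# The 4 x 4 covariance matrix Sigma below is arbitrary (hypothetical).
rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
Sigma = A @ A.T
Z = rng.multivariate_normal(np.zeros(4), Sigma, size=2_000_000)

empirical = np.mean(Z[:, 0] * Z[:, 1] * Z[:, 2] * Z[:, 3])
isserlis = (Sigma[0, 1] * Sigma[2, 3]
            + Sigma[0, 2] * Sigma[1, 3]
            + Sigma[0, 3] * Sigma[1, 2])
print(empirical, isserlis)   # agree up to Monte Carlo error
```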
For notational simplicity, we assume r = 2 and consider only two octaves j, j ′ , whence J = j − j + 1 = 2. Again for notation simplicity, recall that we also write a = a ( n ).Then, from (C.50), Υ( n ) = 2( n a,j + n a,j ) . Let Y n := (cid:16) d ( a j , , . . . , d ( a j , n a,j ); d ( a j , , . . . , ( a j , n a,j ); d ( a j , , . . . , d ( a j , n a,j ); d ( a j , , . . . , d ( a j , n a,j ) (cid:17) ∈ R Υ( n ) , (C.75)Fix v ∈ C Υ( n ) . For notational simplicity, divide the summation range k = 1 , . . . , Υ( n ) into thesubranges K = { , . . . , n a,j } , K = { n a,j + 1 , . . . , n a,j + n a,j ′ } ,K = { n a,j + n a,j ′ + 1 , . . . , n a,j + n a,j ′ } , K = { n a,j + n a,j ′ + 1 , . . . , n a,j + n a,j ′ ) } . Define the octave and index functions j ( k ) = j { K ∪ K } ( k ) + j ′ { K ∪ K } ( k ) , q ( k ) = 1 · { K ∪ K } ( k ) + 2 · { K ∪ K } ( k ) , respectively, which reflects the order of appearance of different octave and index values in thevector (C.75). For g X as in (4.5), let e g n ( x ) = a − ( D + I ) g X ( x ) a − ( D ∗ + I ) , x ∈ [ − π, π ) . Fix j and any ε >
0. Then, under conditions (2.27) and (4.5), for large enough n,

‖ g̃_n( y/(a2^j) ) ‖ ≤ C ‖ |y|^{−(D+I)} s( y/(a2^j) ) |y|^{−(D*+I)} ‖   (C.76)

for some C >
0. Then, v ∗ Γ Y n v = Υ( n ) X k =1 Υ( n ) X k =1 v k v k (cid:16) Γ Y n (cid:17) k ,k = Z π − π Υ( n ) X k =1 Υ( n ) X k =1 v k v k e ia (2 j ( k k − j ( k k ) x H a j ( k ( x ) H a j ( k ( x ) e g n ( x ) q ( k ) q ( k ) dx. (C.77)By expanding the double summation in (C.77) into each pair in the Cartesian product { K , K , K , K } , we end up with 16 double summation terms under the sign of the integral,i.e., 8 pairs of conjugates. To the cross terms, namely, those involving distinct summation rangesin the index k , we can apply the elementary inequality | xy + xy | ≤ | x | + | y | . One such pair,associated with the summation regions K × K and K × K , is (cid:12)(cid:12)(cid:12) Z π − π (cid:16) X k ∈ K v k e i a j k x H a j ( x ) X k ∈ K v k e − i a j ′ k x H a j ′ ( x )+ X k ∈ K v k e i a j ′ k x H a j ′ ( x ) X k ∈ K v k e − i a j k x H a j ( x ) (cid:17)e g n ( x ) dx (cid:12)(cid:12)(cid:12) ≤ Z π − π n(cid:12)(cid:12)(cid:12) X k ∈ K v k e i a j kx H a j ( x ) (cid:12)(cid:12)(cid:12) + (cid:12)(cid:12)(cid:12) X k ∈ K v k e − i a j ′ kx H a j ′ ( x ) (cid:12)(cid:12)(cid:12) o | e g n ( x ) | dx, since e g n ( x ) = e g n ( x ) , and analogous bounds hold for the remaining terms. Therefore, (C.77)is bounded by Z π − π (cid:12)(cid:12)(cid:12) X k ∈ K v k e ia j kx (cid:12)(cid:12)(cid:12) |H a j ( x ) | ( e g n ( x ) + 3 | e g n ( x ) | ) dx + Z π − π (cid:12)(cid:12)(cid:12) X k ∈ K v k e ia j ′ kx (cid:12)(cid:12)(cid:12) |H a j ′ ( x ) | ( e g n ( x ) + 3 | e g n ( x ) | ) dx Z π − π (cid:12)(cid:12)(cid:12) X k ∈ K v k e ia j kx (cid:12)(cid:12)(cid:12) |H a j ( x ) | ( e g n ( x ) + 3 | e g n ( x ) | ) dx + Z π − π (cid:12)(cid:12)(cid:12) X k ∈ K v k e ia j ′ kx (cid:12)(cid:12)(cid:12) |H a j ′ ( x ) | ( e g n ( x ) + 3 | e g n ( x ) | ) dx ≤ Z π − π (cid:12)(cid:12)(cid:12) X k ∈ K v k e ia j kx (cid:12)(cid:12)(cid:12) C a j | a j x | N ψ (1 + a j | x | ) α +2 N ψ ( e g n ( x ) + 3 | e g n ( x ) | ) dx (C.78)+ Z π − π (cid:12)(cid:12)(cid:12) X k ∈ K v k e ia j ′ kx (cid:12)(cid:12)(cid:12) C a j ′ | a j ′ x | N ψ (1 + a j ′ | x | ) α +2 N ψ ( e g n ( x ) + 3 | e g n ( x ) | ) dx + Z π − π (cid:12)(cid:12)(cid:12) X k ∈ K v k e ia j kx (cid:12)(cid:12)(cid:12) C a j | a j x | N ψ (1 + a j | x | ) α +2 N ψ ( e g n ( x ) + 3 | e g n ( x ) | ) dx + Z π − π (cid:12)(cid:12)(cid:12) X k ∈ K v k e ia j ′ kx (cid:12)(cid:12)(cid:12) C a j ′ | a j ′ x | N ψ (1 + a j ′ | x | ) α +2 N ψ ( e g n ( x ) + 3 | e g n ( x ) | ) dx = Z πa j − πa j (cid:12)(cid:12)(cid:12) X k ∈ K v k e i ky (cid:12)(cid:12)(cid:12) C | y | N ψ (1 + | y | ) α +2 N ψ (cid:16)e g n (cid:16) ya j (cid:17) + 3 (cid:12)(cid:12)(cid:12)e g n (cid:16) ya j (cid:17) (cid:12)(cid:12)(cid:12)(cid:17) dy (C.79)+ Z πa j ′ − πa j ′ (cid:12)(cid:12)(cid:12) X k ∈ K v k e i ky (cid:12)(cid:12)(cid:12) C | y | N ψ (1 + | y | ) α +2 N ψ (cid:16)e g n (cid:16) ya j ′ (cid:17) + 3 (cid:12)(cid:12)(cid:12)e g n (cid:16) ya j ′ (cid:17) (cid:12)(cid:12)(cid:12)(cid:17) dy + Z πa j − πa j (cid:12)(cid:12)(cid:12) X k ∈ K v k e i ky (cid:12)(cid:12)(cid:12) C | y | N ψ (1 + | y | ) α +2 N ψ (cid:16)e g n (cid:16) ya j (cid:17) + 3 (cid:12)(cid:12)(cid:12)e g n (cid:16) ya j (cid:17) (cid:12)(cid:12)(cid:12)(cid:17) dy + Z πa j ′ − πa j ′ (cid:12)(cid:12)(cid:12) X k ∈ K v k e i ky (cid:12)(cid:12)(cid:12) C | y | N ψ (1 + | y | ) α +2 N ψ (cid:16)e g n (cid:16) ya j ′ (cid:17) + 3 (cid:12)(cid:12)(cid:12)e g n (cid:16) ya j ′ (cid:17) (cid:12)(cid:12)(cid:12)(cid:17) dy ≤ C ′ v ∗ v . 
(C.80)In equality (C.78), we apply relation (78) in Moulines et al. (2007 b ). In inequality (C.79), wemake the changes of variable y = a j x and y = a j ′ x . To justify inequality (C.80), it suffices toconsider only the term (C.79), involving summation over k ∈ K . Bound (C.76) implies that Z πa j − πa j (cid:12)(cid:12)(cid:12) X k ∈ K v k e i ky (cid:12)(cid:12)(cid:12) C | y | N ψ (1 + | y | ) α +2 N ψ (cid:16)e g n (cid:16) ya j (cid:17) + 3 (cid:12)(cid:12)(cid:12)e g n (cid:16) ya j (cid:17) (cid:12)(cid:12)(cid:12)(cid:17) dy ≤ C ′ Z πa j − πa j (cid:12)(cid:12)(cid:12) X k ∈ K v k e i ky (cid:12)(cid:12)(cid:12) | y | N ψ (1 + | y | ) α +2 N ψ (cid:13)(cid:13)(cid:13) | y | − ( D + I ) s (cid:16) ya j (cid:17) | y | − ( D ∗ + I ) (cid:13)(cid:13)(cid:13) dy. (C.81)However, based on the properties of s ( x ) as in (4.5) and on the conditions that N ψ ≥ α > W
1) and ( W Z πa j − πa j (cid:12)(cid:12)(cid:12) X k ∈ K v k e i ky (cid:12)(cid:12)(cid:12) | y | N ψ (1 + | y | ) α +2 N ψ (cid:13)(cid:13)(cid:13) | y | − ( D + I ) s (cid:16) ya j (cid:17) | y | − ( D ∗ + I ) (cid:13)(cid:13)(cid:13) dy n Z π − π + Z π ≤| y | <πa j o(cid:12)(cid:12)(cid:12) X k ∈ K v k e i ky (cid:12)(cid:12)(cid:12) | y | N ψ (1 + | y | ) α +2 N ψ (cid:13)(cid:13)(cid:13) | y | − ( D + I ) s (cid:16) ya j (cid:17) | y | − ( D ∗ + I ) (cid:13)(cid:13)(cid:13) dy ≤ C Z π − π (cid:12)(cid:12)(cid:12) X k ∈ K v k e i ky (cid:12)(cid:12)(cid:12) | y | N ψ − (2 d r +2) (1 + | y | ) α +2 N ψ (cid:13)(cid:13)(cid:13) s (cid:16) ya j (cid:17)(cid:13)(cid:13)(cid:13) dy + C Z π ≤| y | <πa j (cid:12)(cid:12)(cid:12) X k ∈ K v k e i ky (cid:12)(cid:12)(cid:12) | y | N ψ − (2 d +2) (1 + | y | ) α +2 N ψ (cid:13)(cid:13)(cid:13) s (cid:16) ya j (cid:17)(cid:13)(cid:13)(cid:13) dy ≤ C ′ Z π − π (cid:12)(cid:12)(cid:12) X k ∈ K v k e i ky (cid:12)(cid:12)(cid:12) dy + C ′ a j X ℓ =3 Z π ( ℓ − ≤ y<πℓ (cid:12)(cid:12)(cid:12) X k ∈ K v k e i ky (cid:12)(cid:12)(cid:12) | y | N ψ − (2 d +2) (1 + | y | ) α +2 N ψ dy ≤ C ′ Z π − π (cid:12)(cid:12)(cid:12) X k ∈ K v k e i ky (cid:12)(cid:12)(cid:12) dy + C ′′ a j X ℓ =3 Z π ( ℓ − ≤ y<πℓ (cid:12)(cid:12)(cid:12) X k ∈ K v k e i ky (cid:12)(cid:12)(cid:12) | y | α +2 d +2 dy ≤ C ′ Z π − π (cid:12)(cid:12)(cid:12) X k ∈ K v k e i ky (cid:12)(cid:12)(cid:12) dy + C ′′ a j X ℓ =3 | π ( ℓ − | α +2 d +2 Z π ( ℓ − ≤ y<πℓ (cid:12)(cid:12)(cid:12) X k ∈ K v k e i ky (cid:12)(cid:12)(cid:12) dy ≤ v ∗ v (cid:16) C ′ + C ′′ ∞ X ℓ =3 | π ( ℓ − | α +2 d +2 (cid:17) . Thus, (C.80) holds, which shows (C.51). This establishes ( iv ). (cid:3) In the following proposition, we establish the asymptotic normality of the operator-normalizedwavelet random matrix b B a (2 j ). The proposition is used to establish condition ( A
(A3) holds in the context of Proposition 4.1.
Proposition C.3
Under the assumptions of Proposition 4.1, the auxiliary wavelet random matrix $\widehat{B}_a(a(n)2^j)$ is asymptotically normal. In other words, as $n \to \infty$,
$$
\Big( \sqrt{n_{a,j}}\, \big( \mathrm{vec}_S\, \widehat{B}_a(2^j) - \mathrm{vec}_S\, B_a(2^j) \big) \Big)_{j = j_1, \ldots, j_m} \stackrel{d}{\to} \mathcal{N}_{\frac{r(r+1)}{2} \times J}(0, G), \qquad (C.82)
$$
where
$$
J = j_m - j_1 + 1. \qquad (C.83)
$$
Each entry of the asymptotic covariance matrix $G$ in (C.82) is given by the terms $G_{jj'}$ in (C.49) for appropriate values of $j$ and $j'$. Moreover,
$$
\| B_a(2^j) - B(2^j) \| = O(a(n)^{-\varpi}), \quad n \to \infty, \quad j = j_1, \ldots, j_m, \qquad (C.84)
$$
where $\varpi$ is as in (2.25); i.e., assumption (A3) holds.

Proof:
We first establish (C.82). The argument is an adaptation of the proof of Theorem 3.1 in Abry and Didier (2018b) (see also Bardet (2000), pp. 510–513, Istas and Lang (1997), Lemma 2, Wendt et al. (2017), Proposition 3.1, (vi)). For notational simplicity, we only write the proof for $r = 2$, where the entries are indexed 1 and 2, and $q_1, q_2 = 1, 2$. For fixed $n$, $j$ and $k$, consider the vector of operator-normalized wavelet coefficients $D(a(n)2^j, k)$ as in (C.44), i.e.,
$$
D(a(n)2^j, k) = \big( d_1(a(n)2^j, k),\; d_2(a(n)2^j, k) \big)^*. \qquad (C.85)
$$
Recall that $\Upsilon(n) = 2 \sum_{j=j_1}^{j_m} n_{a,j}$ (cf. (C.50)). Form the vector of normalized wavelet coefficients
$$
Y_n := \big( d_1(a2^{j_1}, 1), d_2(a2^{j_1}, 1), \ldots, d_1(a2^{j_1}, n_{a,j_1}), d_2(a2^{j_1}, n_{a,j_1}); \ldots; d_1(a2^{j_m}, 1), d_2(a2^{j_m}, 1), \ldots, d_1(a2^{j_m}, n_{a,j_m}), d_2(a2^{j_m}, n_{a,j_m}) \big)^* \in \mathbb{R}^{\Upsilon(n)}. \qquad (C.86)
$$
Note that, without loss of generality, $Y_n$ in (C.86) is rewritten as a permutation of (C.75), and we keep the notation. Let
$$
\theta = (\theta_{j_1}, \ldots, \theta_{j_m}) \in \mathbb{R}^{3J}, \qquad (C.87)
$$
where $J = j_m - j_1 + 1$ and $\theta_j = (\theta_{j,1}, \theta_{j,2}, \theta_{j,3})^* \in \mathbb{R}^3$, $j = j_1, \ldots, j_m$. Now form the block-diagonal matrix $D_n$ defined by
$$
\mathrm{diag}\Big( \underbrace{\tfrac{1}{n_{a,j_1}\sqrt{r_{j_1}}}\,\Omega_{n,j_1}, \ldots, \tfrac{1}{n_{a,j_1}\sqrt{r_{j_1}}}\,\Omega_{n,j_1}}_{n_{a,j_1}}; \ldots; \underbrace{\tfrac{1}{n_{a,j_m}\sqrt{r_{j_m}}}\,\Omega_{n,j_m}, \ldots, \tfrac{1}{n_{a,j_m}\sqrt{r_{j_m}}}\,\Omega_{n,j_m}}_{n_{a,j_m}} \Big), \qquad (C.88)
$$
where
$$
\Omega_{n,j} = \begin{pmatrix} \theta_{j,1} & \theta_{j,2} \\ \theta_{j,2} & \theta_{j,3} \end{pmatrix}, \quad j = j_1, \ldots, j_m. \qquad (C.89)
$$
We would like to establish the limiting distribution of the statistic
$$
T_n = \sum_{j=j_1}^{j_m} \frac{\theta_j^*}{\sqrt{r_j}}\, \mathrm{vec}_S\, \widehat{B}_a(a2^j)
= \sum_{j=j_1}^{j_m} \frac{\theta_j^*}{\sqrt{r_j}\, n_{a,j}} \Big( \sum_{k=1}^{n_{a,j}} d_1(a2^j, k)^2,\; \sum_{k=1}^{n_{a,j}} d_1(a2^j, k)\, d_2(a2^j, k),\; \sum_{k=1}^{n_{a,j}} d_2(a2^j, k)^2 \Big)^* = Y_n^* D_n Y_n, \qquad (C.90)
$$
where it suffices to consider $\theta$ in (C.87) such that $\theta^* \Sigma(H) \theta > 0$. The matrix $\Sigma(H)$ is obtained from (C.49) in Proposition C.2, (iii). It can be written in block form as $\Sigma(H) =: (G_{jj'})_{j,j' = j_1, \ldots, j_m}$, corresponding to block entries of the vector $\theta = (\theta_{j_1}, \ldots, \theta_{j_m})^*$. Let $\Gamma_{Y_n} = \mathrm{Cov}(Y_n, Y_n)$, and consider the spectral decomposition $\Gamma^{1/2}_{Y_n} D_n \Gamma^{1/2}_{Y_n} = O \Lambda O^*$, where $\Lambda$ is diagonal with real, and not necessarily positive, eigenvalues
$$
\zeta_i(a2^j), \quad i = 1, \ldots, \Upsilon(n), \qquad (C.91)
$$
and $O$ is an orthogonal matrix. Now let $Z \sim \mathcal{N}(0, I_{\Upsilon(n)})$. Then,
$$
T_n \stackrel{d}{=} Z^* \Gamma^{1/2}_{Y_n} D_n \Gamma^{1/2}_{Y_n} Z = Z^* O \Lambda O^* Z \stackrel{d}{=} Z^* \Lambda Z =: \sum_{i=1}^{\Upsilon(n)} \zeta_i(a2^j)\, Z_i^2.
$$
Assume for the moment that
$$
\max_{i=1,\ldots,\Upsilon(n)} |\zeta_i(a2^j)| = o\Big( \Big(\frac{a}{n}\Big)^{1/2} \Big). \qquad (C.92)
$$
By (C.49) and (C.90),
$$
\frac{n}{a}\, \mathrm{Var}\, T_n = \sum_{j=j_1}^{j_m} \sum_{j'=j_1}^{j_m} \theta_j^* \Big\{ \frac{n}{a}\, \frac{1}{\sqrt{r_j\, r_{j'}}}\, \mathrm{Cov}\big( \mathrm{vec}_S\, \widehat{B}_a(a2^j),\; \mathrm{vec}_S\, \widehat{B}_a(a2^{j'}) \big) \Big\} \theta_{j'}
\to \sum_{j=j_1}^{j_m} \sum_{j'=j_1}^{j_m} \theta_j^* G_{jj'}\, \theta_{j'} > 0, \quad n \to \infty.
$$
Therefore, there exists a constant
$C > 0$ such that, for large enough $n$, $\frac{n}{a}\, \mathrm{Var}\, T_n \geq C >$
$0$. In view of condition (C.92),
$$
\frac{\max_{i=1,\ldots,\Upsilon(n)} |\zeta_i(a2^j)|}{\sqrt{\mathrm{Var}\, T_n}} \leq C' \Big(\frac{n}{a}\Big)^{1/2} \max_{i=1,\ldots,\Upsilon(n)} |\zeta_i(a2^j)| \to 0, \quad n \to \infty.
$$
The weak convergence (C.82) is now a consequence of Lemma B.4 in Abry and Didier (2018b). So, we need to show (C.92). The first step is to establish the bound
$$
\sup_{u \in S^{\Upsilon(n)-1}} |u^* \Gamma^{1/2}_{Y_n} D_n \Gamma^{1/2}_{Y_n} u| \leq C \max_{j = j_1, \ldots, j_m} \Big\{ \frac{1}{n_{a,j}\sqrt{r_j}}\, \|\Omega_{n,j}\| \Big\} \sup_{u \in S^{\Upsilon(n)-1}} u^* \Gamma_{Y_n} u. \qquad (C.93)
$$
Let $u \in S^{\Upsilon(n)-1}$ and let $v = \Gamma^{1/2}_{Y_n} u$. We can break up the vector $v$ into two-dimensional subvectors $v_{\cdot,\cdot}$ to reexpress $v = (v_{j_1,1}, \ldots, v_{j_1,n_{a,j_1}}; \ldots; v_{j_m,1}, \ldots, v_{j_m,n_{a,j_m}})^*$. Based on the block-diagonal structure of $D_n$ expressed in (C.88),
$$
|u^* \Gamma^{1/2}_{Y_n} D_n \Gamma^{1/2}_{Y_n} u| = |v^* D_n v| = \Big| \sum_{j=j_1}^{j_m} \sum_{\ell=1}^{n_{a,j}} v_{j,\ell}^*\, \frac{\Omega_{n,j}}{n_{a,j}\sqrt{r_j}}\, v_{j,\ell} \Big|
\leq C \sum_{j=j_1}^{j_m} \sum_{\ell=1}^{n_{a,j}} \frac{1}{n_{a,j}\sqrt{r_j}}\, \|\Omega_{n,j}\|\, \|v_{j,\ell}\|^2
$$
$$
\leq C \Big( \max_{j = j_1, \ldots, j_m} \frac{1}{n_{a,j}\sqrt{r_j}}\, \|\Omega_{n,j}\| \Big) \sum_{j=j_1}^{j_m} \sum_{\ell=1}^{n_{a,j}} \|v_{j,\ell}\|^2
= C \Big( \max_{j = j_1, \ldots, j_m} \frac{1}{n_{a,j}\sqrt{r_j}}\, \|\Omega_{n,j}\| \Big)\, u^* \Gamma_{Y_n} u, \qquad (C.94)
$$
where the constant $C$ comes from a change of matrix norms and only depends on the fixed dimension $r = 2$. By taking $\sup_{u \in S^{\Upsilon(n)-1}}$ on both sides of (C.94), we arrive at (C.93).

The second step towards showing (C.92) consists in analyzing the asymptotic behavior of the right-hand side of (C.93), as $n \to \infty$. So, note that, for some $C > 0$,
$$
\max_{j = j_1, \ldots, j_m} \frac{1}{n_{a,j}\sqrt{r_j}}\, \|\Omega_{n,j}\| \leq C\, \frac{a}{n}. \qquad (C.95)
$$
Moreover, by relation (C.51) in Proposition C.2, (iv), the maximum eigenvalue of $\Gamma_{Y_n}$ is bounded, i.e., $\|\Gamma_{Y_n}\| \leq C$ for some $C >$
$0$, where $\|\cdot\|$ is the matrix Euclidean norm. Therefore, in view of (C.95), the right-hand side of (C.93) is bounded by $C\, a/n$. In turn, the latter expression divided by $\sqrt{a/n}$ is bounded by $C (a/n)^{1/2}$. By condition (2.27), this implies (C.92), and as a result, also (C.82).

To establish (C.84), simply note that, by expression (C.44) and by the stationarity of the wavelet coefficients $D(2^j, k)$, we may write $P_H B_a(2^j) P_H^* = \mathbb{E}\, D(a(n)2^j, 1)\, D(a(n)2^j, 1)^*$. Then, by (C.47),
$$
\| B_a(2^j) - B(2^j) \| = \Big\| \mathbb{E}\big[ P_H\, D(a(n)2^j, 1)\, D(a(n)2^j, 1)^*\, P_H^* \big] - P_H\, \Phi\, P_H^* \Big\|
\leq C\, a(n)^{-\min\{2\alpha - 1,\; \alpha + N_\psi - \varepsilon,\; \delta_s\}} = O(a(n)^{-\varpi})
$$
for any small enough $\varepsilon >$
$0$, since $2\alpha - 1 > \varpi$, $\alpha + N_\psi - \varepsilon > \varpi$, and $\delta_s > \varpi$ due to assumption (iii) in Proposition 4.1. Thus, (C.84) is established, and assumption
(A3) holds. $\Box$
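As an informal numerical companion to Proposition C.3 (not part of the argument above), the following Python sketch replaces the operator-normalized wavelet coefficient vectors by i.i.d. bivariate Gaussian draws with a prescribed second-moment matrix and inspects the fluctuations of the half-vectorized sample second-moment matrix. The matrix $B$, the sample sizes and the seed are our own illustrative choices; in this simplified setting the Isserlis-type covariance plays the role of the limit $G$.

```python
# Minimal illustration (not the authors' code): CLT-type fluctuations of the
# half-vectorized sample second-moment matrix of i.i.d. bivariate Gaussian
# "wavelet coefficients" at a single octave.
import numpy as np

rng = np.random.default_rng(0)

def vecS(M):
    # half-vectorization of a symmetric 2x2 matrix: (M11, M12, M22)
    return np.array([M[0, 0], M[0, 1], M[1, 1]])

B = np.array([[1.0, 0.3], [0.3, 2.0]])    # assumed population second-moment matrix
L = np.linalg.cholesky(B)
n_coef, n_rep = 1024, 2000                # coefficients per replication, replications

stats = np.empty((n_rep, 3))
for m in range(n_rep):
    D = rng.standard_normal((n_coef, 2)) @ L.T   # rows distributed as N(0, B)
    B_hat = D.T @ D / n_coef                     # sample wavelet-type matrix
    stats[m] = np.sqrt(n_coef) * (vecS(B_hat) - vecS(B))

print("empirical covariance of sqrt(n)*(vecS(B_hat) - vecS(B)):")
print(np.cov(stats.T))

# Gaussian (Isserlis) limit: Cov((B_hat)_{ab}, (B_hat)_{cd}) -> B_ac B_bd + B_ad B_bc
idx = [(0, 0), (0, 1), (1, 1)]
G = np.array([[B[a, c] * B[b, d] + B[a, d] * B[b, c] for (c, d) in idx] for (a, b) in idx])
print("Isserlis-type limit:")
print(G)
```

The two printed matrices should be close, mirroring the kind of Gaussian limit that Proposition C.3 establishes for the (dependent, operator-normalized) wavelet coefficients themselves.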
C.2 Section 4.2: auxiliary results and proofs for the case of non-Gaussian X

The following proposition is mentioned in Example 4.2. In the statement of the proposition, we make use of the following univariate construct. Fix $\tilde{d} \in \mathbb{R}$ and any integer $k > \tilde{d} - 1/2$. Suppose $W_{\tilde{d},k} = \{W_{\tilde{d},k}(t)\}_{t \in \mathbb{Z}}$ is an $M(\tilde{d})$ process (Roueff and Taqqu (2009), Definition 1) with parameter $\tilde{d}$, i.e., its $k$-th order difference process $\Delta^k W_{\tilde{d},k}$ is weakly stationary with (generalized) spectral density
$$
f_{\Delta^k W_{\tilde{d},k}}(x) = |1 - e^{-\mathbf{i}x}|^{2(k - \tilde{d})}\, f^*(x), \quad x \in [-\pi, \pi). \qquad (C.96)
$$
In (C.96), $f^*(x) \geq 0$ and, for some
$$
\beta \geq \varpi, \qquad (C.97)
$$
the function $f^*$ satisfies $|f^*(x) - f^*(0)| = O(|x|^{\beta})$, $x \to 0$. In addition, suppose $\Delta^k W_{\tilde{d},k}(t)$ is a linear process of the form
$$
\Delta^k W_{\tilde{d},k}(t) = \sum_{\ell \in \mathbb{Z}} a_{\tilde{d}}(t - \ell)\, \xi_\ell, \qquad \sum_{\ell \in \mathbb{Z}} a_{\tilde{d}}(\ell)^2 < \infty. \qquad (C.98)
$$
In (C.98), $\{\xi_\ell\}_{\ell \in \mathbb{Z}}$ is a sequence of i.i.d. random variables with mean zero, unit variance, and finite fourth moment.
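As a rough illustration of this construct (ours, not part of the paper's argument), the sketch below simulates a truncated linear approximation of $W_{\tilde{d},0}$ with $f^*$ taken constant, i.e., an FARIMA$(0,\tilde{d},0)$-type moving average, and checks that the Haar wavelet variance grows across octaves at a log-rate of roughly $2\tilde{d}$; this is the kind of scaling behavior that the next proposition makes precise for the wavelet random matrix. The truncation lag, sample size and seed are arbitrary choices.

```python
# Numerical companion (ours) to the M(d~) construct: truncated FARIMA(0, d, 0)
# sample path and Haar wavelet variances across dyadic scales.
import numpy as np

rng = np.random.default_rng(1)
d, L, n = 0.3, 2000, 2 ** 15

# MA(inf) weights a_l = Gamma(l + d) / (Gamma(d) Gamma(l + 1)), built recursively
a = np.empty(L)
a[0] = 1.0
for l in range(1, L):
    a[l] = a[l - 1] * (l - 1 + d) / l

xi = rng.standard_normal(n + L)
X = np.convolve(xi, a, mode="valid")[:n]        # approximate long-memory path

def haar_coeffs(x, j):
    """Haar coefficients d(2^j, k) over disjoint blocks of length 2^j."""
    m = 2 ** j
    nb = len(x) // m
    blocks = x[: nb * m].reshape(nb, m)
    half = m // 2
    return (blocks[:, :half].sum(axis=1) - blocks[:, half:].sum(axis=1)) / np.sqrt(m)

prev = None
for j in range(2, 9):
    wv = np.mean(haar_coeffs(X, j) ** 2)
    note = "" if prev is None else f"  (increment {np.log2(wv) - prev:+.2f})"
    print(f"octave j={j}: log2 wavelet variance = {np.log2(wv):+.2f}{note}")
    prev = np.log2(wv)
# The increments should hover around 2*d = 0.6, up to sampling noise and
# truncation bias at the coarsest octaves.
```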
Proposition C.4

Fix $\tilde{d}_\ell > 0$ and $k_\ell \in \mathbb{N} \cup \{0\}$, $\ell = 1, \ldots, r$, and $\beta$ satisfying (C.97). Further fix pairwise distinct $j_1, \ldots, j_m \in \mathbb{N}$. Suppose the following conditions are in place.

(i) $X = \{X(t)\}_{t \in \mathbb{Z}} = \{(X_1(t), \ldots, X_r(t))^*\}_{t \in \mathbb{Z}}$ is an $r$-variate stochastic process, where the entry-wise processes are independent and $X_\ell \stackrel{f.d.d.}{=} W_{\tilde{d}_\ell, k_\ell}$, $\ell = 1, \ldots, r$;

(ii) the wavelet conditions hold with $(1 + \beta)/2 - \alpha < \tilde{d}_\ell \leq N_\psi$ and $k_\ell \leq N_\psi$, $\ell = 1, \ldots, r$;

(iii) $\{a(n)\}_{n \in \mathbb{N}}$ is a dyadic sequence such that $n/a(n) \to \infty$ and $(n/a(n))^{1/2}\, a(n)^{-\alpha - d} \to 0$, as $n \to \infty$.

Then, with $H = \mathrm{diag}(\tilde{d}_1 - 1/2, \ldots, \tilde{d}_r - 1/2)$, the weak limit (2.23) holds for some diagonal matrix $\Sigma_B(j_1, \ldots, j_m)$ with strictly positive main diagonal entries. Moreover,
$$
\| B_a(2^j) - B(2^j) \| = O(a(n)^{-\varpi}), \quad n \to \infty, \quad j = j_1, \ldots, j_m, \qquad (C.99)
$$
where $\varpi$ is as in (2.25); i.e., assumption (A3) holds.

Proof: The limit (2.23) is an immediate consequence of Theorem 2 in Roueff and Taqqu (2009). For the limit (C.99), observe that
$$
\mathbb{E}\, W_X(2^j) = \mathrm{diag}\big( \mathbb{E}\, D_X(2^j, 1)_1^2, \ldots, \mathbb{E}\, D_X(2^j, 1)_r^2 \big).
$$
Moreover, in view of condition (C.97), under (i), (ii) and (iii), expression (67) in Roueff and Taqqu (2009) shows that, for some constants $K_\ell(j)$,
$$
\big| a(n)^{-2d_\ell}\, \mathbb{E}\, D_X(a2^j, 1)_\ell^2 - K_\ell(j) \big| = O(a(n)^{-\beta}) = O(a(n)^{-\varpi}), \quad \ell = 1, \ldots, r.
$$
Therefore, with $H = \mathrm{diag}(\tilde{d}_1 - 1/2, \ldots, \tilde{d}_r - 1/2)$,
$$
\| B_a(2^j) - B(2^j) \| = \big\| \mathrm{diag}\big( a^{-2d_1}\, \mathbb{E}\, D_X(a2^j, 1)_1^2 - K_1(j), \ldots, a^{-2d_r}\, \mathbb{E}\, D_X(a2^j, 1)_r^2 - K_r(j) \big) \big\| = O(a(n)^{-\varpi}),
$$
which shows (C.99). $\Box$

C.3 Section 4.2: auxiliary results and proofs for the case of non-Gaussian Z

We now turn to Proposition C.5, which is mentioned in Example 4.3. Proposition C.5 establishes that, under assumptions, the wavelet random matrix $W_Z(a(n)2^j)$ for sub-Gaussian noise is bounded in probability. The proof of the proposition is based on two intermediary results, namely, Lemmas C.5 and C.6. In Lemma C.5, assuming sub-Gaussian noise $Z$, we establish a concentration inequality for $W_Z(a(n)2^j)$. Such result depends on the sub-Gaussian norm $K$ (cf. (C.105)) of the vector of wavelet coefficients. Then, in the ensuing Lemma C.6, we show that, under assumptions, such norm is not a function of $j$.

Before we state and prove Lemma C.5, we set the notation. More precisely, let $Z = \{Z(t)\}_{t \in \mathbb{Z}}$ be a sequence of $\mathbb{R}^p$-valued random vectors. For $\ell = 1, \ldots, p$, define the vector of (univariate) wavelet coefficients as
$$
D_Z(a(n)2^j)_\ell = \big( D_Z(a(n)2^j, 1)_\ell, \ldots, D_Z(a(n)2^j, n_{a,j})_\ell \big)^* \in \mathbb{R}^{n_{a,j}}. \qquad (C.100)
$$
In other words, $D_Z(a(n)2^j)_\ell$ contains the wavelet-domain observations of the $\ell$-th entry process. Also let $D_Z(a(n)2^j)_\ell =: \Sigma_a^{1/2}\, \widetilde{D}_Z(a(n)2^j)_\ell$, where $\mathbb{E}\, \widetilde{D}_Z(a(n)2^j)_\ell\, \widetilde{D}_Z(a(n)2^j)_\ell^* = I_{n_{a,j}}$. In particular,
$$
\Sigma_a \in \mathcal{S}(n_{a,j}, \mathbb{R}) \qquad (C.101)
$$
denotes the covariance matrix of the stationary process $d_Z(a(n)2^j, k)_\ell$, $k = 1, \ldots, n_{a,j}$. So, recast
$$
n_{a,j}\, W_Z(a(n)2^j) = \sum_{k=1}^{n_{a,j}} D_Z(a(n)2^j, k)\, D_Z(a(n)2^j, k)^*
= \begin{pmatrix} D_Z(a(n)2^j)_1^* \\ \vdots \\ D_Z(a(n)2^j)_p^* \end{pmatrix} \Big( D_Z(a(n)2^j)_1\; \cdots\; D_Z(a(n)2^j)_p \Big)
= \begin{pmatrix} \widetilde{D}_Z(a(n)2^j)_1^*\, \Sigma_a^{1/2} \\ \vdots \\ \widetilde{D}_Z(a(n)2^j)_p^*\, \Sigma_a^{1/2} \end{pmatrix} \Big( \Sigma_a^{1/2}\, \widetilde{D}_Z(a(n)2^j)_1\; \cdots\; \Sigma_a^{1/2}\, \widetilde{D}_Z(a(n)2^j)_p \Big).
$$
Now recall that the sub-Gaussian norm of a random vector $X \in \mathbb{R}^q$, $q \in \mathbb{N}$, is defined as
$$
\| X \|_{\psi_2} := \sup_{u \in S^{q-1}} \| \langle X, u \rangle \|_{\psi_2} \qquad (C.102)
$$
(see Vershynin (2018), p. 51; n.b.: $\psi_2$ should not be confused with the wavelet function $\psi$). We are now in a position to state and prove Lemma C.5.

Lemma C.5
Let $Z = \{Z(t)\}_{t \in \mathbb{Z}}$ be a $p$-variate stochastic process made up of $p$ independent and identically distributed entry-wise processes with sub-Gaussian finite-dimensional distributions. Fix $t \geq 0$ and let $W_Z(a(n)2^j)$ be as in (2.18). Also let $D_Z(a(n)2^j)_1$ be a random vector as in (C.100) and suppose
$$
K := \| D_Z(a(n)2^j)_1 \|_{\psi_2} < \infty. \qquad (C.103)
$$
Let $\Sigma_a$ be as in (C.101). Then,
$$
\| W_Z(a(n)2^j) - M(p, n, a(n)2^j) \| \leq \frac{p}{n_{a,j}}\, \| \Sigma_a \|\, K^2 \max\{\delta, \delta^2\}, \qquad \delta = C \Big( \sqrt{\frac{n_{a,j}}{p}} + \frac{t}{\sqrt{p}} \Big), \qquad (C.104)
$$
for some $C > 0$, with probability at least $1 - 2\exp\{-t^2\}$. In (C.104), $M(p, n, a(n)2^j) \in \mathcal{S}_{\geq 0}(p, \mathbb{R})$ is a (random) matrix whose eigenvalues are either zero or equal to eigenvalues of the matrix $\Sigma_a$ as in (C.101).

Proof:
We first recap a result in Vershynin (2018). Let $A \in M(p, n, \mathbb{R})$ be a random matrix whose rows $A_i$, $i = 1, \ldots, p$, are independent, sub-Gaussian, isotropic (i.e., $\mathbb{E}\, A_i^* A_i = I_n$) and have mean zero. The proof of Theorem 4.6.1 in Vershynin (2018) consists in establishing relation (4.22) in that reference, namely, in showing that, for any $t \geq 0$,
$$
\Big\| \frac{1}{p} A^* A - I_n \Big\| \leq K^2 \max\{\delta, \delta^2\}, \qquad \delta := C \Big( \sqrt{\frac{n}{p}} + \frac{t}{\sqrt{p}} \Big), \qquad K := \max_{i=1,\ldots,p} \| A_i \|_{\psi_2}, \qquad (C.105)
$$
with probability at least $1 - 2\exp\{-t^2\}$. In (C.105), $C > 0$ does not depend on $n$, $p$, $t$ or $K$. So, let
$$
A = \begin{pmatrix} D_Z(a(n)2^j)_1^* \\ \vdots \\ D_Z(a(n)2^j)_p^* \end{pmatrix} \in \mathbb{R}^{p \times n_{a,j}}.
$$
Note that the rows of $A$ are $n_{a,j}$-dimensional i.i.d. random vectors (i.e., with potentially dependent entries), but not isotropic, and $A A^* = n_{a,j}\, W_Z(a(n)2^j)$. However,
$$
\Big\| \frac{1}{p} A^* A - \Sigma_a \Big\|
= \Big\| \frac{1}{p} \Big( \Sigma_a^{1/2}\, \widetilde{D}_Z(a(n)2^j)_1\; \cdots\; \Sigma_a^{1/2}\, \widetilde{D}_Z(a(n)2^j)_p \Big) \begin{pmatrix} \widetilde{D}_Z(a(n)2^j)_1^*\, \Sigma_a^{1/2} \\ \vdots \\ \widetilde{D}_Z(a(n)2^j)_p^*\, \Sigma_a^{1/2} \end{pmatrix} - \Sigma_a \Big\|
$$
$$
= \Big\| \frac{1}{p} \sum_{\ell=1}^{p} \Sigma_a^{1/2}\, \widetilde{D}_Z(a(n)2^j)_\ell\, \widetilde{D}_Z(a(n)2^j)_\ell^*\, \Sigma_a^{1/2} - \Sigma_a \Big\|
\leq \| \Sigma_a \| \Big\| \frac{1}{p} \sum_{\ell=1}^{p} \widetilde{D}_Z(a(n)2^j)_\ell\, \widetilde{D}_Z(a(n)2^j)_\ell^* - I_{n_{a,j}} \Big\|
\leq \| \Sigma_a \|\, K^2 \max\{\delta, \delta^2\}, \qquad (C.106)
$$
where, by (C.105), the last inequality holds with probability at least $1 - 2\exp\{-t^2\}$. Now consider the (random) singular value decomposition
$$
A = \begin{pmatrix} D_Z(a(n)2^j)_1^* \\ \vdots \\ D_Z(a(n)2^j)_p^* \end{pmatrix} = U S_A V^*, \qquad U \in \mathcal{U}(p), \quad V \in \mathcal{U}(n_{a,j}), \qquad (C.107)
$$
as well as the (deterministic) spectral decomposition
$$
\Sigma_a = O\, \mathrm{diag}\big( \lambda_1(\Sigma_a), \ldots, \lambda_{n_{a,j}}(\Sigma_a) \big)\, O^*, \qquad O \in \mathcal{O}(n_{a,j}).
$$
First assume $n_{a,j} > p$. Then,
$$
\frac{1}{p} A^* A = V\, \mathrm{diag}\Big( \underbrace{0, \ldots, 0}_{n_{a,j} - p\ \mathrm{terms}}, \frac{s_1(A)^2}{p}, \ldots, \frac{s_p(A)^2}{p} \Big)\, V^*.
$$
From Weyl's inequality (Vershynin (2018), Theorem 4.5.3) and (C.106),
$$
\max\Big\{ \lambda_1(\Sigma_a), \ldots, \lambda_{n_{a,j}-p}(\Sigma_a),\; \Big| \lambda_{n_{a,j}-p+1}(\Sigma_a) - \frac{s_1(A)^2}{p} \Big|, \ldots, \Big| \lambda_{n_{a,j}}(\Sigma_a) - \frac{s_p(A)^2}{p} \Big| \Big\}
\leq \Big\| \frac{1}{p} A^* A - \Sigma_a \Big\| \leq \| \Sigma_a \|\, K^2 \max\{\delta, \delta^2\} \qquad (C.108)
$$
with probability at least $1 - 2\exp\{-t^2\}$. Further note that, for $U$ as in (C.107), we can write
$$
\frac{1}{n_{a,j}} \sum_{k=1}^{n_{a,j}} D_Z(a(n)2^j, k)\, D_Z(a(n)2^j, k)^* = \frac{1}{n_{a,j}} A A^* = U\, \mathrm{diag}\Big( \frac{s_1(A)^2}{n_{a,j}}, \ldots, \frac{s_p(A)^2}{n_{a,j}} \Big)\, U^*.
$$
Hence, using the random matrix $U$ and the deterministic scalars $\lambda_\bullet(\Sigma_a)$,
$$
\Big\| W_Z(a(n)2^j) - \frac{p}{n_{a,j}}\, U\, \mathrm{diag}\big( \lambda_{n_{a,j}-p+1}(\Sigma_a), \ldots, \lambda_{n_{a,j}}(\Sigma_a) \big)\, U^* \Big\|
= \Big\| \frac{1}{n_{a,j}} A A^* - \frac{p}{n_{a,j}}\, U\, \mathrm{diag}\big( \lambda_{n_{a,j}-p+1}(\Sigma_a), \ldots, \lambda_{n_{a,j}}(\Sigma_a) \big)\, U^* \Big\|
$$
$$
= \frac{p}{n_{a,j}} \Big\| U\, \mathrm{diag}\Big( \frac{s_1(A)^2}{p} - \lambda_{n_{a,j}-p+1}(\Sigma_a), \ldots, \frac{s_p(A)^2}{p} - \lambda_{n_{a,j}}(\Sigma_a) \Big)\, U^* \Big\|
\leq \frac{p}{n_{a,j}} \max\Big\{ \Big| \frac{s_1(A)^2}{p} - \lambda_{n_{a,j}-p+1}(\Sigma_a) \Big|, \ldots, \Big| \frac{s_p(A)^2}{p} - \lambda_{n_{a,j}}(\Sigma_a) \Big| \Big\}
\leq \frac{p}{n_{a,j}}\, \| \Sigma_a \|\, K^2 \max\{\delta, \delta^2\} \qquad (C.109)
$$
with probability at least $1 - 2\exp\{-t^2\}$.
In (C.109), the last inequality is a consequence of (C.108). Alternatively, suppose $n_{a,j} \leq p$. Then, for $V$ as in (C.107),
$$
\frac{1}{p} A^* A = V\, \mathrm{diag}\Big( \frac{s_1(A)^2}{p}, \ldots, \frac{s_{n_{a,j}}(A)^2}{p} \Big)\, V^*.
$$
From Weyl's inequality and (C.106),
$$
\max\Big\{ \Big| \lambda_1(\Sigma_a) - \frac{s_1(A)^2}{p} \Big|, \ldots, \Big| \lambda_{n_{a,j}}(\Sigma_a) - \frac{s_{n_{a,j}}(A)^2}{p} \Big| \Big\} \leq \Big\| \frac{1}{p} A^* A - \Sigma_a \Big\| \leq \| \Sigma_a \|\, K^2 \max\{\delta, \delta^2\} \qquad (C.110)
$$
with probability at least $1 - 2\exp\{-t^2\}$. Hence, again using the random matrix $U$ and the deterministic scalars $\lambda_\bullet(\Sigma_a)$,
$$
\Big\| W_Z(a(n)2^j) - \frac{p}{n_{a,j}}\, U\, \mathrm{diag}\Big( \underbrace{0, \ldots, 0}_{p - n_{a,j}\ \mathrm{entries}}, \lambda_1(\Sigma_a), \ldots, \lambda_{n_{a,j}}(\Sigma_a) \Big)\, U^* \Big\|
= \Big\| \frac{1}{n_{a,j}} A A^* - \frac{p}{n_{a,j}}\, U\, \mathrm{diag}\big( 0, \ldots, 0, \lambda_1(\Sigma_a), \ldots, \lambda_{n_{a,j}}(\Sigma_a) \big)\, U^* \Big\|
$$
$$
= \frac{p}{n_{a,j}} \Big\| U\, \mathrm{diag}\Big( 0, \ldots, 0, \frac{s_1(A)^2}{p} - \lambda_1(\Sigma_a), \ldots, \frac{s_{n_{a,j}}(A)^2}{p} - \lambda_{n_{a,j}}(\Sigma_a) \Big)\, U^* \Big\|
\leq \frac{p}{n_{a,j}} \max\Big\{ \Big| \frac{s_1(A)^2}{p} - \lambda_1(\Sigma_a) \Big|, \ldots, \Big| \frac{s_{n_{a,j}}(A)^2}{p} - \lambda_{n_{a,j}}(\Sigma_a) \Big| \Big\}
\leq \frac{p}{n_{a,j}}\, \| \Sigma_a \|\, K^2 \max\{\delta, \delta^2\} \qquad (C.111)
$$
with probability at least $1 - 2\exp\{-t^2\}$. In (C.111), the last inequality is a consequence of (C.110). In view of relations (C.109) and (C.111), (C.104) holds. $\Box$

So, in Lemma C.6, we lay out simple sufficient conditions for $K$ in (C.103) to be, indeed, a constant not depending on $n$, $p$ or $j$.

Lemma C.6

Let $\{Z(t)\}_{t \in \mathbb{Z}}$ be an i.i.d. sequence of sub-Gaussian random variables, i.e., $\| Z(t) \|_{\psi_2} < \infty$, $t \in \mathbb{Z}$. Fix $j \in \mathbb{N}$ and let $\{h_{j,\ell}\}_{\ell \in \mathbb{Z}}$ be the Haar filter (associated with the Haar framework; see (4.7)). Also let
$$
D_Z(2^j) = \big( d_Z(2^j, 1), \ldots, d_Z(2^j, n_j) \big)^* \qquad (C.112)
$$
be the associated vector of available wavelet coefficients at scale $2^j$. Then, for some constant $C > 0$ not depending on $n$, $p$ or $j$,
$$
\| D_Z(2^j) \|_{\psi_2} = \sup_{u \in S^{n_j - 1}} \| \langle D_Z(2^j), u \rangle \|_{\psi_2} \leq C. \qquad (C.113)
$$
Proof:
Fix $\lambda \geq 0$. By the independence of $\{Z(t)\}_{t \in \mathbb{Z}}$, by property (5) in Proposition 2.5.2, Vershynin (2018), and by the fact that the random variables $\{Z(t)\}_{t \in \mathbb{Z}}$ are sub-Gaussian and i.d.,
$$
\mathbb{E} \exp\big\{ \lambda\, d_Z(2^j, k) \big\} = \prod_{\ell \in \mathbb{Z}} \mathbb{E} \exp\big\{ \lambda\, h_{j, 2^j k - \ell}\, Z(\ell) \big\}
\leq \prod_{\ell \in \mathbb{Z}} \exp\big\{ \lambda^2\, h_{j, 2^j k - \ell}^2\, K^2 \big\}
= \exp\Big\{ \lambda^2 K^2 \sum_{\ell \in \mathbb{Z}} h_{j, 2^j k - \ell}^2 \Big\} \leq \exp\big\{ \lambda^2 K^2 \big\} \qquad (C.114)
$$
for some absolute constant $K > 0$. In the last step in (C.114), we use the property $\sum_{\ell \in \mathbb{Z}} h_{j,\ell}^2 = 1$ (see (C.26)). Expression (C.114) establishes that the coefficients $d_Z(2^j, k)$ are sub-Gaussian, and that, for some constant $C' > 0$ not depending on $n$, $j$ or $k$,
$$
\| d_Z(2^j, k) \|_{\psi_2} \leq C'. \qquad (C.115)
$$
Observe that $\{d_Z(2^j, k)\}_{k \in \mathbb{Z}}$ is an i.d. sequence, and recall that we can write
$$
h_{j,\ell} = 2^{-j/2} \int_{\mathbb{R}} \phi(t + \ell)\, \psi(2^{-j} t)\, dt, \quad \ell \in \mathbb{Z}
$$
(see (2.12)). Starting the Haar scaling and wavelet functions (4.7) at octave $j = 1$,
$$
h_{1,0} = \frac{1}{\sqrt{2}}, \qquad h_{1,-1} = -\frac{1}{\sqrt{2}}.
$$
More generally, by induction, for any $j \in \mathbb{N}$,
$$
h_{j,\ell} = \begin{cases} 2^{-j/2}, & \ell = -2^{j-1} + 1, \ldots, 0; \\ -2^{-j/2}, & \ell = -2^{j} + 1, \ldots, -2^{j-1}; \\ 0, & \textrm{otherwise}. \end{cases}
$$
In particular, the nonzero coefficients in the sequences $\{h_{j, 2^j k - \ell}\}_{\ell \in \mathbb{Z}}$ and $\{h_{j, 2^j k' - \ell}\}_{\ell \in \mathbb{Z}}$ do not overlap if $k \neq k'$. Hence, besides being i.d.,
$$
\textrm{the terms of the sequence } \{d_Z(2^j, k)\}_{k \in \mathbb{Z}} \textrm{ are also independent.} \qquad (C.116)
$$
Now let $u \in S^{n_j - 1}$. Then, by relation (C.115) and Proposition 2.6.1 in Vershynin (2018),
$$
\| \langle D_Z(2^j), u \rangle \|_{\psi_2}^2 \leq C \sum_{k=1}^{n_j} u_k^2\, \| d_Z(2^j, k) \|_{\psi_2}^2 = C\, \| d_Z(2^j, 1) \|_{\psi_2}^2 \sum_{k=1}^{n_j} u_k^2 = C\, \| d_Z(2^j, 1) \|_{\psi_2}^2 \leq C'.
$$
This shows (C.113). $\Box$
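The two Haar-filter facts invoked above, unit sum of squares and non-overlapping supports across shifts, can be checked numerically. The snippet below is only an illustrative sanity check of ours (the ordering of the two filter halves and the seed are arbitrary); it also verifies empirically that Haar coefficients of an i.i.d. sequence are essentially uncorrelated, consistent with the independence statement (C.116).

```python
# Sanity check (ours) of the Haar-filter properties used in Lemma C.6.
import numpy as np

def haar_filter(j):
    # nonzero values of h_{j,.}: +2^{-j/2} on one half of the support,
    # -2^{-j/2} on the other (the lag convention is immaterial here)
    half = 2 ** (j - 1)
    return np.concatenate([np.full(half, 2 ** (-j / 2)),
                           np.full(half, -2 ** (-j / 2))])

for j in range(1, 6):
    h = haar_filter(j)
    print(f"j={j}: sum of squares = {np.sum(h**2):.6f}, support length = {h.size}")

# Disjoint supports across k: the filter for shift k only touches a window of
# 2^j consecutive lags, and windows for different k do not overlap. Hence the
# Haar coefficients of an i.i.d. sequence are independent; their lag-1 sample
# autocorrelation should be close to zero.
rng = np.random.default_rng(2)
Z = rng.standard_normal(2 ** 16)
j = 4
m = 2 ** j
nb = Z.size // m
D = Z[: nb * m].reshape(nb, m) @ haar_filter(j)
print("lag-1 sample autocorrelation of d_Z(2^j, k):",
      np.corrcoef(D[:-1], D[1:])[0, 1])
```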
We are now in a position to accurately state and prove Proposition C.5.

Proposition C.5
Suppose $Z = \{Z(t)\}_{t \in \mathbb{Z}}$ is a sequence of i.i.d. sub-Gaussian $\mathbb{R}^p$-valued random vectors. Assume the wavelet coefficients are computed by means of Mallat's iterative procedure (2.10) based on a Haar framework (see (4.7)). Fix $j \in \mathbb{N}$ and let $W_Z(a(n)2^j)$ be as in (2.18). Further suppose condition (2.27) holds. Then,
$$
\| W_Z(a(n)2^j) \| = O_{\mathbb{P}}(1). \qquad (C.117)
$$
Moreover,
$$
\| \mathbb{E}\, W_Z(a(n)2^j) \| = O(1). \qquad (C.118)
$$
In other words, the assumption statements pertaining to $Z$ alone are satisfied.

Proof:
Lemmas C.5 and C.6 establish the concentration inequality (C.104), where $K$ does not depend on $n$, $a(n)$ or $p(n)$. In addition, relations (C.109) and (C.111) show that the eigenvalues of the matrix $M(p, n, a(n)2^j) \in \mathcal{S}_{\geq 0}(p, \mathbb{R})$ as in (C.104) are either zero or eigenvalues of $\Sigma_a$ as in (C.101). However, by (C.116), we can express $\Sigma_a = \mathbb{E}\, d_Z(a(n)2^j, 1)^2 \cdot I_{n_{a,j}}$. Moreover, for any $n$, $j$ and $k$, and for $d_Z(2^j, k)$ as in (C.112),
$$
\mathbb{E}\, d_Z(2^j, k)^2 = \mathbb{E} \Big( \sum_{\ell \in \mathbb{Z}} h_{j, 2^j k - \ell}\, Z(\ell) \Big)^2 = \mathbb{E}\, Z(1)^2 \sum_{\ell \in \mathbb{Z}} h_{j, 2^j k - \ell}^2 = \mathbb{E}\, Z(1)^2. \qquad (C.119)
$$
In particular, $\mathbb{E}\, d_Z(a(n)2^j, 1)^2 = \mathbb{E}\, Z(1)^2$. Therefore, the norm $\| \Sigma_a \|$ is bounded in all parameters ($n$, $p$ and $j$). Hence, (C.117) holds. In addition, since $\mathbb{E}\, W_Z(a(n)2^j) = \mathbb{E}\, d_Z(a(n)2^j, 1)^2 \cdot I_p$, whose operator norm equals $\| \Sigma_a \|$, relation (C.118) also holds. $\Box$
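Proposition C.5 can also be seen at work in simulation. The sketch below is our own illustration, not the authors' code: it forms the Haar wavelet random matrix of i.i.d. Gaussian noise for growing $p$, with $n$ proportional to $p$ so that $p/n_{a,j}$ stays fixed, and prints the largest eigenvalue, which remains bounded rather than diverging. The octave, aspect ratio and seed are arbitrary choices.

```python
# Illustration (ours) of the boundedness of W_Z(a(n)2^j) for i.i.d. noise,
# with p growing and n proportional to p.
import numpy as np

rng = np.random.default_rng(3)

def wavelet_matrix(Z, j):
    """p x p Haar wavelet random matrix at octave j from a p x n data matrix Z."""
    p, n = Z.shape
    m = 2 ** j
    nb = n // m
    half = m // 2
    blocks = Z[:, : nb * m].reshape(p, nb, m)
    D = (blocks[:, :, :half].sum(axis=2) - blocks[:, :, half:].sum(axis=2)) / np.sqrt(m)
    return D @ D.T / nb          # (1/n_{a,j}) sum_k D(k) D(k)^*

j = 3
for p in (50, 100, 200, 400):
    n = 64 * p                    # keeps p / n_{a,j} roughly constant across p
    Z = rng.standard_normal((p, n))
    W = wavelet_matrix(Z, j)
    print(f"p={p:4d}, n={n:6d}: largest eigenvalue of W_Z = "
          f"{np.linalg.eigvalsh(W).max():.3f}")
# The printed values should stabilize near (1 + sqrt(p/n_{a,j}))^2 rather than
# grow with p, in line with the Vershynin-type concentration bound behind
# Lemma C.5.
```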
References

Abry, P. & Didier, G. (2018a), 'Wavelet eigenvalue regression for n-variate operator fractional Brownian motion', Journal of Multivariate Analysis, 75–104.
Abry, P. & Didier, G. (2018b), 'Wavelet estimation for operator fractional Brownian motion', Bernoulli (2), 895–928.
Abry, P. & Flandrin, P. (1994), 'On the initialization of the discrete wavelet transform algorithm', IEEE Signal Processing Letters (2), 32–34.
Abry, P., Didier, G. & Li, H. (2019), 'Two-step wavelet-based estimation for Gaussian mixed fractional processes', Statistical Inference for Stochastic Processes (2), 157–185.
Abry, P., Wendt, H. & Didier, G. (2018), Detecting and estimating multivariate self-similar sources in high-dimensional noisy mixtures, in '2018 IEEE Statistical Signal Processing Workshop (SSP)', pp. 688–692.
Anderson, G., Guionnet, A. & Zeitouni, O. (2010), An Introduction to Random Matrices, Cambridge Studies in Advanced Mathematics, Vol. 118, Cambridge University Press.
Anderson, T. (2003), An Introduction to Multivariate Statistical Analysis, 3rd edn, Wiley.
Bai, S. & Taqqu, M. S. (2018), 'How the instability of ranks under long memory affects large-sample inference', Statistical Science (1), 96–116.
Bai, Z. & Silverstein, J. (2010), Spectral Analysis of Large Dimensional Random Matrices, Vol. 20, Springer.
Bardet, J.-M. (2000), 'Testing for the presence of self-similarity of Gaussian time series having stationary increments', Journal of Time Series Analysis (5), 497–515.
Basu, S. & Michailidis, G. (2015), 'Regularized estimation in sparse high-dimensional time series models', Annals of Statistics (4), 1535–1567.
Becker-Kern, P. & Pap, G. (2008), 'Parameter estimation of selfsimilarity exponents', Journal of Multivariate Analysis, 117–140.
Benson, D. A., Meerschaert, M. M., Baeumer, B. & Scheffler, H.-P. (2006), 'Aquifer operator scaling and the effect on solute mixing and dispersion', Water Resources Research.
Birgé, L. & Massart, P. (1998), 'Minimum contrast estimators on sieves: exponential bounds and rates of convergence', Bernoulli (3), 329–375.
Boniece, B. C., Didier, G. & Sabzikar, F. (2021), 'Tempered fractional Brownian motion: wavelet estimation, modeling and testing', Applied and Computational Harmonic Analysis, 461–509.
Boniece, B. C., Wendt, H., Didier, G. & Abry, P. (2019), Wavelet-based detection and estimation of fractional Lévy signals in high dimensions, in '2019 IEEE 8th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP)', pp. 574–578.
Boucheron, S., Lugosi, G. & Massart, P. (2013), Concentration Inequalities: a Nonasymptotic Theory of Independence, Oxford University Press.
Brockwell, P. & Davis, R. (1991), Time Series: Theory and Methods, 2nd edn, Springer.
Brody, D. (2011), Big data: Harnessing a game-changing asset, in G. Stahl & M. Kenny, eds, 'A report from the Economist Intelligence Unit, sponsored by SAS', The Economist Intelligence Unit Ltd., U.K.
Brown, S. J. (1989), 'The number of factors in security returns', Journal of Finance (5), 1247–1262.
Chakrabarty, A., Hazra, R. S. & Sarkar, D. (2016), 'From random matrices to long range dependence', Random Matrices: Theory and Applications (02), 1650008.
Chan, N. H., Lu, Y. & Yau, C. Y. (2017), 'Factor modelling for high-dimensional time series: inference and model selection', Journal of Time Series Analysis (2), 285–307.
Che, Z. (2017), 'Universality of random matrices with correlated entries', Electronic Journal of Probability, 1–38.
Ciuciu, P., Varoquaux, G., Abry, P., Sadaghiani, S. & Kleinschmidt, A. (2012), 'Scale-free and multifractal properties of fMRI signals during rest and task', Frontiers in Physiology, 186.
Cohen, A. (2003), Numerical Analysis of Wavelet Methods, Vol. 32, North-Holland, Amsterdam.
Comon, P. & Jutten, C. (2010), Handbook of Blind Source Separation: Independent Component Analysis and Applications, Academic Press.
Craigmile, P., Guttorp, P. & Percival, D. (2005), 'Wavelet-based parameter estimation for polynomial contaminated fractionally differenced processes', IEEE Transactions on Signal Processing (8), 3151–3161.
Daubechies, I. (1992), Ten Lectures on Wavelets, Vol. 61, Society for Industrial and Applied Mathematics, Philadelphia, PA.
Didier, G. & Pipiras, V. (2010), 'Adaptive wavelet decompositions of stationary time series', Journal of Time Series Analysis (3), 182–209.
Didier, G. & Pipiras, V. (2011), 'Integral representations and properties of operator fractional Brownian motions', Bernoulli (1), 1–33.
Forni, M. & Lippi, M. (1999), 'Aggregation of linear dynamic microeconomic models', Journal of Mathematical Economics (1), 131–158.
Harris, D. & Poskitt, D. (2004), 'Determination of cointegrating rank in partially non-stationary processes via a generalised von-Neumann criterion', The Econometrics Journal (1), 191–217.
Horn, R. A. & Johnson, C. R. (2012), Matrix Analysis, Cambridge University Press.
Hudson, W. N. & Mason, J. D. (1982), 'Operator-self-similar processes in a finite-dimensional space', Transactions of the American Mathematical Society (1), 281–297.
Isotta, F., Frei, C., Weilguni, V., Perčec Tadić, M., Lassegues, P., Rudolf, B., Pavan, V., Cacciamani, C., Antolini, G., Ratto, S. M. & Munari, M. (2014), 'The climate of daily precipitation in the Alps: development and analysis of a high-resolution grid dataset from pan-Alpine rain-gauge data', International Journal of Climatology (5), 1657–1675.
Istas, J. & Lang, G. (1997), 'Quadratic variations and estimation of the local Hölder index of a Gaussian process', Annales de l'Institut Henri Poincaré (4), 407–436.
Jones, G. A. & Jones, J. M. (1998), Elementary Number Theory, Berlin: Springer-Verlag.
Kolmogorov, A. N. (1941), The local structure of turbulence in an incompressible fluid at very high Reynolds numbers, in 'Dokl. Akad. Nauk SSSR', Vol. 30, pp. 299–303.
Laha, R. & Rohatgi, V. (1981), 'Operator self similar stochastic processes in R^d', Stochastic Processes and their Applications (1), 73–84.
Lam, C. & Yao, Q. (2012), 'Factor modeling for high-dimensional time series: inference for the number of factors', Annals of Statistics (2), 694–726.
Laurent, B. & Massart, P. (2000), 'Adaptive estimation of a quadratic functional by model selection', Annals of Statistics, 1302–1338.
Ledoux, M. (2001), The Concentration of Measure Phenomenon, number 89, American Mathematical Society.
Li, H. (2017), Wavelet-based estimation for Gaussian and non-Gaussian mixed fractional processes, PhD thesis, Tulane University.
Li, Q., Pan, J. & Yao, Q. (2009), 'On determination of cointegration ranks', Statistics and Its Interface (1), 45–56.
Liu, H., Aue, A. & Paul, D. (2015), 'On the Marčenko–Pastur law for linear time series', Annals of Statistics (2), 675–712.
Lugosi, G. (2017), Lectures on Combinatorial Statistics, in '47th Probability Summer School, Saint-Flour', pp. 1–91.
Maejima, M. & Mason, J. (1994), 'Operator-self-similar stable processes', Stochastic Processes and their Applications (1), 139–163.
Magnus, J. R. (1985), 'On differentiating eigenvalues and eigenvectors', Econometric Theory (2), 179–191.
Mallat, S. (1999), A Wavelet Tour of Signal Processing, Academic Press, London.
Mallat, S. (2009), A Wavelet Tour of Signal Processing: the Sparse Way, Academic Press, London.
Mandelbrot, B. & Van Ness, J. (1968), 'Fractional Brownian motions, fractional noises and applications', SIAM Review (4), 422–437.
Mason, J. & Xiao, Y. (2002), 'Sample path properties of operator-self-similar Gaussian random fields', Theory of Probability and Its Applications (1), 58–78.
Meerschaert, M. & Scheffler, H.-P. (1999), 'Moment estimator for random vectors with heavy tails', Journal of Multivariate Analysis, 145–159.
Meerschaert, M. & Scheffler, H.-P. (2003), Portfolio modeling with heavy-tailed random vectors, in 'Handbook of Heavy-Tailed Distributions in Finance (S. T. Rachev (ed.))', Elsevier Science B.V., Amsterdam, pp. 595–640.
Mehta, M. L. (2004), Random Matrices, Elsevier.
Merlevède, F. & Peligrad, M. (2016), 'On the empirical spectral distribution for matrices with long memory and independent rows', Stochastic Processes and their Applications (9), 2734–2760.
Merlevède, F., Najim, J. & Tian, P. (2019), 'Unbounded largest eigenvalue of large sample covariance matrices: asymptotics, fluctuations and applications', Linear Algebra and its Applications, 317–359.
Moulines, E., Roueff, F. & Taqqu, M. (2007a), 'Central limit theorem for the log-regression wavelet estimation of the memory parameter in the Gaussian semi-parametric context', Fractals (4), 301–313.
Moulines, E., Roueff, F. & Taqqu, M. (2007b), 'On the spectral density of the wavelet coefficients of long-memory time series with application to the log-regression estimation of the memory parameter', Journal of Time Series Analysis (2), 155–187.
Moulines, E., Roueff, F. & Taqqu, M. (2008), 'A wavelet Whittle estimator of the memory parameter of a nonstationary Gaussian time series', Annals of Statistics, 1925–1956.
Paul, D. & Aue, A. (2014), 'Random matrix theory in statistics: a review', Journal of Statistical Planning and Inference, 1–29.
Percival, D. B. & Walden, A. (2006), Wavelet Methods for Time Series Analysis, Vol. 4, Cambridge University Press.
Phillips, P. C. B. & Ouliaris, S. (1988), 'Testing for cointegration using principal components methods', Journal of Economic Dynamics and Control (2-3), 205–230.
Roueff, F. & Taqqu, M. S. (2009), 'Asymptotic normality of wavelet estimators of the memory parameter for linear processes', Journal of Time Series Analysis (5), 534–558.
Steland, A. & Von Sachs, R. (2017), 'Large-sample approximations for variance-covariance matrices of high-dimensional time series', Bernoulli (4A), 2299–2329.
Stoev, S., Pipiras, V. & Taqqu, M. (2002), 'Estimation of the self-similarity parameter in linear fractional stable motion', Signal Processing, 1873–1901.
Tao, T. & Vu, V. (2011), 'Random matrices: universality of local eigenvalue statistics', Acta Mathematica (1), 127–204.
Tao, T. & Vu, V. (2012), 'Random covariance matrices: universality of local statistics of eigenvalues', Annals of Probability (3), 1285–1315.
Taylor, C. & Salhi, A. (2017), 'On partitioning multivariate self-affine time series', IEEE Transactions on Evolutionary Computation (6), 845–862.
Vershynin, R. (2018), High-Dimensional Probability: an Introduction with Applications in Data Science, Vol. 47, Cambridge University Press.
Vignat, C. (2012), 'A generalized Isserlis theorem for location mixtures of Gaussian random vectors', Statistics and Probability Letters (1), 67–71.
Wainwright, M. J. (2019), High-Dimensional Statistics: a Non-Asymptotic Viewpoint, Vol. 48, Cambridge University Press.
Wang, L., Aue, A. & Paul, D. (2017), 'Spectral analysis of sample autocovariance matrices of a class of linear time series in moderately high dimensions', Bernoulli (4A), 2181–2209.
Wendt, H., Didier, G., Combrexelle, S. & Abry, P. (2017), 'Multivariate Hadamard self-similarity: testing fractal connectivity', Physica D: Nonlinear Phenomena, 1–36.
Xia, N., Qin, Y. & Bai, Z. (2013), 'Convergence rates of eigenvector empirical spectral distribution of large dimensional sample covariance matrix'.