Limiting laws for extreme eigenvalues of large-dimensional spiked Fisher matrices with a divergent number of spikes
Junshan Xie∗, Yicheng Zeng†, Lixing Zhu‡

Abstract
Consider the p × p matrix that is the product of a population covariance matrix and the inverse of another population covariance matrix. Suppose that their difference has a divergent rank with respect to p. When two samples of sizes n and T from the two populations are available, we construct its corresponding sample version. In the regime of high dimension where both n and T are proportional to p, we investigate the limiting laws for the extreme (spiked) eigenvalues of the sample (spiked) Fisher matrix when the number of spikes is divergent and these spikes are unbounded.

Keywords:
Extreme eigenvalue, Fisher matrix, Phase transition phenomenon, Random matrix theory, Spiked population model.
1 Introduction

In the last few decades, with the remarkable development in storage devices and computing capability, the demand for processing complex-structured data has increased dramatically. One of the features, as well as the challenges, of these data sets is their high dimension. The difficulty is that the classical limit theory for multivariate statistical analysis fails to ensure reliable inference for high-dimensional data analysis. Classical limit theorems require "small p, large n" to keep their validity, which conflicts with the "large p, large n" situation in high-dimensional settings, in the sense that p/n → c > 0. To attack the relevant issues, random matrix theory (RMT) serves as a powerful tool in addressing statistical problems in high dimensions. The first research on random matrices in multivariate statistics was about the Wishart matrices in [18]. Abundant research has been established on various topics in this field during the past half century, especially in recent years. In the area of RMT in statistics, we refer to the monographs [2] and [19] for systematic study and [12] for a comprehensive review.

∗ Co-first author. School of Mathematics and Statistics, Henan University, Kaifeng, China.
† Co-first author. Department of Mathematics, Hong Kong Baptist University, Hong Kong.
‡ Corresponding author. Research Center for Statistics and Data Science, Beijing Normal University, Zhuhai, China and Department of Mathematics, Hong Kong Baptist University, Hong Kong. Email address: [email protected].

A relevant topic in multivariate statistics is testing the equality of two covariance matrices:

  H_0 : Σ_1 = Σ_2  vs.  H_1 : Σ_1 = Σ_2 + ∆,   (1.1)

where Σ_1 and Σ_2 are two covariance matrices corresponding to two p-variate populations, and ∆ is a non-negative definite matrix with rank q. Let S_1 and S_2 be the sample covariance matrices from these two populations, respectively.
When S_2 is invertible, the random matrix F = S_2^{−1} S_1 is called a Fisher matrix. The difference between the null hypothesis and the alternative hypothesis is reflected in the extreme eigenvalues of F. Under the null hypothesis Σ_1 = Σ_2, [16] established the well-known Wachter distribution as the limiting spectral distribution (LSD) of F. Some extensions were built later (see examples in [13], [14] and [15]). Furthermore, [1] pointed out that the largest eigenvalue of F converges to the upper bound of the support of the LSD of F. Under the alternative hypothesis, F is called a spiked Fisher matrix (see [17]), because Σ_2^{−1}Σ_1 has a spiked structure similar to that of the spiked population model proposed by [10]. More specifically, the matrix Σ_2^{−1}Σ_1 is assumed to have the spectrum

  spec(Σ_2^{−1}Σ_1) = {λ_1, ..., λ_q, 1, ..., 1},   (1.2)

where λ_1 ≥ ... ≥ λ_q > 1. When the rank q of ∆ is finite, [6] showed the phase transition phenomenon of the extreme eigenvalues of F under the Gaussian population assumption. That is, for 1 ≤ i ≤ q, the i-th largest eigenvalue of F departs from the upper bound of the support of the LSD of F if and only if λ_i exceeds a certain phase transition point. [17] extended this to the cases without the Gaussian assumption and established central limit theorems for the outlier eigenvalues of F.

We in this paper consider, as a reasonable extension in theory and applications, the case of divergent q with respect to the dimension p. We will investigate the convergence in probability and central limit theorems for spiked eigenvalues of spiked Fisher matrices. We formulate our problem as follows. Assume that

  Y = (y_1, ..., y_T) = (y_{ij})_{1≤i≤p, 1≤j≤T} ∈ R^{p×T}  and  Z = (z_1, ..., z_n) = (z_{ij})_{1≤i≤p, 1≤j≤n} ∈ R^{p×n}   (1.3)

are two independent arrays of independent real-valued random variables with zero mean and unit variance. We consider two samples {Σ_1^{1/2} y_i}_{1≤i≤T} and {Σ_2^{1/2} z_i}_{1≤i≤n}; their corresponding sample covariance matrices can respectively be written as

  S_1 = (1/T) Σ_{i=1}^T Σ_1^{1/2} y_i y_i^⊤ Σ_1^{1/2} = (1/T) Σ_1^{1/2} Y Y^⊤ Σ_1^{1/2}  and  S_2 = (1/n) Σ_{i=1}^n Σ_2^{1/2} z_i z_i^⊤ Σ_2^{1/2} = (1/n) Σ_2^{1/2} Z Z^⊤ Σ_2^{1/2}.

Also, define the Fisher matrix F := S_2^{−1} S_1 as the sample version of the matrix Σ_2^{−1}Σ_1. We aim to investigate the limiting properties of the eigenvalues of F. As the eigenvalues of F remain invariant under the linear transformation

  (S_1, S_2) → ( Σ_2^{−1/2} S_1 Σ_2^{−1/2}, Σ_2^{−1/2} S_2 Σ_2^{−1/2} ),   (1.4)

we can assume Σ_2 = I_p throughout this paper without loss of generality. Under assumption (1.2), the eigenvalues of Σ_1 are λ_1 ≥ ... ≥ λ_q > λ_{q+1} = ... = λ_p = 1. Recalling from (1.1) that Σ_1 is a rank-q perturbation of Σ_2 = I_p, we simply assume

  Σ_1 = diag( Σ*, I_{p−q} ),   (1.5)

where Σ* is the q × q block carrying the spiked eigenvalues. For the sake of brevity and readability, we write the eigenvalues of F in descending order as λ̂_1 ≥ ... ≥ λ̂_p, simplifying the double subscripts as single ones. It should be noted that λ̂_i is related to the sample size n.

We then describe the related work and our contributions in this paper. When the number of spiked eigenvalues q is fixed and all the spiked eigenvalues λ_i, i = 1, ..., q, are bounded, there are some results on the limiting properties of the eigenvalues of F in the literature, such as the almost sure convergence (strong consistency) and central limit theorem (CLT) of spiked eigenvalues ([17]) and the asymptotic Tracy–Widom distribution for the largest non-spiked eigenvalue ([7] and [8]). In this paper, we consider the case where the number of spiked eigenvalues q = q(p) → ∞ as p → ∞, and the spiked eigenvalues λ_i, 1 ≤ i ≤ q, diverge as p → ∞. To the best of our knowledge, there is no relevant result in the literature. A related work is [5], which studied spiked population models, where the asymptotics for spiked eigenvalues, including convergence in probability (weak consistency) and CLT, as well as the Tracy–Widom law for the largest non-spiked eigenvalue, were built under a quite general framework. Unlike the case of fixed q and bounded spikes λ_i, 1 ≤ i ≤ q, normalizations for λ̂_i, 1 ≤ i ≤ q, are needed in the divergent q case. We consider the normalized eigenvalues λ̂_i/λ_i for consistency and (λ̂_i − θ_i)/θ_i for the CLT, where θ_i is a centering parameter defined later.

The basic approach behind the proofs of the asymptotics for spiked eigenvalues is the analysis of an equation for the determinant of a q × q random matrix (indexed by n). When q is bounded, [17] derived the almost sure entrywise convergence of the q × q matrix (and hence the convergence with respect to matrix norms) and then solved the equation to obtain the almost sure limits of spiked eigenvalues. This argument does not work in the divergent q case, where the convergence of a q × q matrix with respect to some norm is not directly implied by the entrywise convergence. Instead, we use the CLT for random sesquilinear forms in [3] to derive the convergence rate of each entry, and then use Chebyshev's inequality to put all entries together to derive the convergence rate of the matrix in the ℓ∞ norm. In this way, we achieve the convergence in probability as well as the CLT of spiked eigenvalues (after proper normalizations). This approach is similar to that used in [5], so some technical assumptions are also imposed similarly.

The remaining parts of the paper are organized as follows. Section 2 establishes the main results, including the convergence in probability of λ̂_i/λ_i and central limit theorems of (λ̂_i − θ_i)/θ_i for the spiked eigenvalues of the spiked Fisher matrix F. Here θ_i, 1 ≤ i ≤ q, is a sequence of centering parameters defined in that section. In Section 3, we show the proofs of our main results in Section 2. Some important technical lemmas and their proofs are displayed in Section 4.

2 Main results

Considering the linear transformation (1.4), we assume Σ_2 = I_p without loss of generality, so that Σ_1 has the structure shown in (1.5). Further, we decompose the q × q block Σ* in (1.5) as Σ* = U^⊤ Λ U. Here U ≡ (u_1, u_2, ..., u_q)^⊤ is a q × q orthogonal matrix and

  Λ = diag( λ_1, ..., λ_{N_1} [n_1 copies], ..., λ_{N_{ℓ−1}+1}, ..., λ_q [n_ℓ copies] ),

where λ_1 = ... = λ_{N_1} > ... > λ_{N_{ℓ−1}+1} = ... = λ_q and N_i := Σ_{j=1}^i n_j for 1 ≤ i ≤ ℓ. In this case, Σ_1 can be decomposed as

  Σ_1 = diag(U^⊤, I_{p−q}) · diag(Λ, I_{p−q}) · diag(U, I_{p−q}).

We give decompositions of the sample covariance matrices S_1 and S_2 as follows. We first decompose the matrices Y and Z defined in (1.3) as Y = (Y_1^⊤, Y_2^⊤)^⊤ and Z = (Z_1^⊤, Z_2^⊤)^⊤, where Y_1 ∈ R^{q×T}, Z_1 ∈ R^{q×n}, Y_2 ∈ R^{(p−q)×T} and Z_2 ∈ R^{(p−q)×n}. Let X := Σ_1^{1/2} Y. Then we can similarly write X = (X_1^⊤, X_2^⊤)^⊤, where X_1 = Σ*^{1/2} Y_1 = U^⊤ Λ^{1/2} U Y_1 ∈ R^{q×T} and X_2 = Y_2 ∈ R^{(p−q)×T}. It follows that

  S_1 = (1/T) [ X_1X_1^⊤  X_1X_2^⊤ ; X_2X_1^⊤  X_2X_2^⊤ ]  and  S_2 = (1/n) [ Z_1Z_1^⊤  Z_1Z_2^⊤ ; Z_2Z_1^⊤  Z_2Z_2^⊤ ].   (2.1)

For λ ∈ R \ {0}, we introduce

  F_2 = ( (1/n) Z_2Z_2^⊤ )^{−1} ( (1/T) X_2X_2^⊤ ),  M(λ) = I_{p−q} − F_2/λ,
  m̃_θ(z) = (1/(p−q)) tr( z I_{p−q} − F_2/θ )^{−1},  θ ∈ R, z ∈ C^+.   (2.2)

Let μ_1 ≥ ... ≥ μ_{p−q} be the eigenvalues of the Fisher matrix F_2. Then the empirical spectral distribution (ESD) of F_2 can be defined as

  F_n(x) = (1/(p−q)) Σ_{j=1}^{p−q} 1{μ_j ≤ x},  x ∈ R.
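The construction above is easy to visualize numerically. The following sketch (hypothetical sizes and spike values, Gaussian entries for simplicity; not the paper's own experiment) builds the spiked Fisher matrix F = S_2^{−1}S_1 and counts how many eigenvalues escape the limiting bulk, whose upper edge is b = (1 + √(c+y−cy))²(1−y)^{−2}:

```python
import numpy as np

# Minimal sketch of the spiked Fisher matrix model: Sigma_2 = I_p and
# Sigma_1 = diag(spikes, I_{p-q}); all parameter values are illustrative.
rng = np.random.default_rng(0)
p, T, n, q = 200, 400, 600, 5                 # c_p = p/T = 0.5, y_p = p/n = 1/3
spikes = np.array([50.0, 40.0, 30.0, 20.0, 10.0])

d = np.ones(p)
d[:q] = spikes                                # population spectrum of Sigma_1
Y = rng.standard_normal((p, T))               # entries: mean 0, variance 1
Z = rng.standard_normal((p, n))
S1 = (np.sqrt(d)[:, None] * Y) @ (np.sqrt(d)[:, None] * Y).T / T
S2 = Z @ Z.T / n
F = np.linalg.solve(S2, S1)                   # the Fisher matrix S_2^{-1} S_1

eigs = np.sort(np.linalg.eigvals(F).real)[::-1]
c, y = p / T, p / n
b = (1 + np.sqrt(c + y - c * y)) ** 2 / (1 - y) ** 2   # upper bulk edge
print("bulk edge b:", round(b, 3))
print("top q+1 eigenvalues:", np.round(eigs[:q + 1], 2))
print("eigenvalues above b:", int((eigs > b).sum()))
```

With spikes this large, exactly q sample eigenvalues separate from the bulk, which is the outlier behavior studied in the rest of the section.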
By the result in [17], under the assumption p/n → y ∈ (0, 1) and p/T → c > 0, almost surely the empirical spectral distribution F_n weakly converges to the limiting spectral distribution F_{c,y}, whose Stieltjes transform S(z) = ∫ (x − z)^{−1} dF_{c,y}(x) satisfies, for z ∉ [a, b],

  S(z) = [ 1 − c − z(1+y) + √{ (z(1−y)+1−c)² − 4z } ] / [ 2z(c + zy) ],   (2.3)

where the branch of the square root is chosen so that S(z) ∼ −1/z as z → ∞, and

  a = (1 − √(c+y−cy))² (1−y)^{−2},  b = (1 + √(c+y−cy))² (1−y)^{−2}.

In the following, for any complex matrix A, we use s_i(A) to denote its i-th largest singular value and ‖A‖ to denote its largest singular value. Write a_n = O_{a.s.}(b_n) if it almost surely holds that a_n = O(b_n). Throughout this paper, C is a constant that may vary from place to place. The following assumptions are required.

Assumption 2.1. y_p := p/n → y ∈ (0, 1), ỹ_p := (p−q)/n; c_p := p/T → c > 0, c̃_p := (p−q)/T; q = q(n) → ∞ as n → ∞ but q = o(n).

Assumption 2.2.
For any 1 ≤ i ≤ q, λ_i satisfies q/λ_i → 0, and
(a) λ_i^{−1} Σ_{j=1}^q λ_j = o(q^{−1} n) and λ_i^{−1} Σ_{j=1}^q λ_j = o(q^{−1});
(b) λ_i Σ_{j=1}^q λ_j^{−1} = o(q^{−1} n).

Assumption 2.3.
Random vectors in {y_i : 1 ≤ i ≤ T} ∪ {z_i : 1 ≤ i ≤ n} are independent and identically distributed, E z_{ij} = 0, E|z_{ij}|² = 1 for all 1 ≤ i ≤ p, 1 ≤ j ≤ n, and sup_{1≤i≤p} E|z_{ij}|⁴ < ∞.

Assumption 2.4.
There exists a constant C > 1 such that λ_{N_i}/λ_{N_i+1} ≥ C for any 1 ≤ i ≤ ℓ − 1.

Assumption 2.5.
Suppose that {λ_i}_{1≤i≤q} are of bounded multiplicities, i.e., sup_{1≤i≤ℓ} n_i < ∞.

2.2 Weak consistency

The weak consistency of λ̂_i is stated below. Due to the fact that λ_i may go to infinity with n, we consider the limit in probability of the ratio λ̂_i/λ_i, 1 ≤ i ≤ q.

Theorem 2.6.
Assume that Assumptions 2.1, 2.2 and 2.3 hold. Then for all 1 ≤ i ≤ q,

  λ̂_i/λ_i = 1/(1−y) + O( y_p − y ) + κ q · O_p( 1/√n + λ_i^{−1} ),

where κ := min{κ_1, κ_2} with κ_1 := q + λ_i^{−1} Σ_{j=1}^q λ_j and κ_2 := q + λ_i Σ_{j=1}^q λ_j^{−1}.

Remark 2.7.
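Theorem 2.6 can be probed by a quick Monte Carlo sketch (hypothetical sizes and spikes, Gaussian data; an illustration under stated assumptions, not the paper's experiment): for large spikes the ratio λ̂_i/λ_i should cluster near 1/(1−y).

```python
import numpy as np

# Check that hat{lambda}_i / lambda_i is close to 1/(1 - y) for large spikes.
rng = np.random.default_rng(1)
p, T, n, q = 1600, 6400, 3200, 4              # y = p/n = 0.5, c = p/T = 0.25
spikes = np.array([4000.0, 3000.0, 2000.0, 1000.0])   # unbounded-spike regime

d = np.ones(p)
d[:q] = spikes                                # Sigma_1 = diag(spikes, I_{p-q})
Y = rng.standard_normal((p, T))
Z = rng.standard_normal((p, n))
S1 = (np.sqrt(d)[:, None] * Y) @ (np.sqrt(d)[:, None] * Y).T / T
S2 = Z @ Z.T / n
lam_hat = np.sort(np.linalg.eigvals(np.linalg.solve(S2, S1)).real)[::-1][:q]

ratios = lam_hat / spikes
print("ratios:", np.round(ratios, 3), "predicted limit:", 1.0 / (1.0 - p / n))
```

Here the predicted limit is 1/(1 − 0.5) = 2; the observed ratios fluctuate around it on the O_p scale of the theorem.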
Note that the limit of the ratio λ̂_i/λ_i is 1/(1−y) > 1 for all 1 ≤ i ≤ q. This is different from the corresponding limit for the spiked population model with divergent q, which is 1 (see Theorem 2.1 in [5]). Roughly speaking, when we take y → 0, 1/(1−y) → 1.

Remark 2.8. In the case of fixed q and bounded spikes λ_i, 1 ≤ i ≤ q, Theorem 3.1 in [17] shows that almost surely the spiked eigenvalue λ̂_i converges to the limit λ_i(λ_i + c − 1)(λ_i − λ_i y − 1)^{−1}. Simply taking λ_i → ∞, the limit of λ_i(λ_i + c − 1)(λ_i − λ_i y − 1)^{−1} λ_i^{−1} equals 1/(1−y). Thus, Theorem 2.6 indicates that for the divergent q case the result coincides with that for the fixed q case in [17].
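The limit computation in Remark 2.8 is elementary and can be verified numerically; a minimal sketch with arbitrary illustrative values of c and y:

```python
# Numerical check of Remark 2.8: the fixed-q almost sure limit
# lambda*(lambda + c - 1)/(lambda - lambda*y - 1), scaled by 1/lambda,
# approaches 1/(1 - y) as the spike lambda grows.
c, y = 0.5, 0.25                      # illustrative values, c > 0 and y in (0, 1)

def scaled_limit(lam: float) -> float:
    # lambda*(lambda + c - 1)/(lambda*(1 - y) - 1) divided by lambda
    return (lam + c - 1.0) / (lam * (1.0 - y) - 1.0)

target = 1.0 / (1.0 - y)
for lam in (10.0, 1e2, 1e4, 1e6):
    print(lam, scaled_limit(lam))
```

The gap to 1/(1−y) shrinks at rate O(λ^{−1}), matching the λ_i^{−1} error term in Theorem 2.6.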
In Theorem 2.6 we only consider unbounded spikes, but the result can be readily extended to handle the case with both bounded and unbounded spikes. Consider the model

  Σ_1 = diag(U^⊤, I_{p−q}) · diag(Λ, I_{p−q}) · diag(U, I_{p−q}),

where Λ = diag(λ_1, ..., λ_{q_1}, λ_{q_1+1}, ..., λ_{q_1+q_2}, 1, ..., 1), q = q_1 + q_2 with q_1 = o(n^{1/2}) and q_2 bounded. Assume that the spikes λ_1 ≥ ... ≥ λ_{q_1} are unbounded as in Theorem 2.6 and λ_{q_1+1} ≥ ... ≥ λ_{q_1+q_2} are bounded. For q_1 + 1 ≤ i ≤ q_1 + q_2, by Theorem A.10 in [2], we have

  λ̂_i = s_i( S_2^{−1}S_1 ) ≤ s_i(S_1) s_1( S_2^{−1} ) ≤ s_i(Σ_1) s_1( (1/T)YY^⊤ ) s_1( S_2^{−1} ) < ∞

almost surely. So it holds that

  det( λ̂_i (1/n)Z_2Z_2^⊤ − (1/T)X_2X_2^⊤ ) ≠ 0.

Similar to the decomposition in (3.3), we have

  det[ λ̂_i (1/n)Z_1Z_1^⊤ − (1/T)X_1X_1^⊤ − ( λ̂_i (1/n)Z_1Z_2^⊤ − (1/T)X_1X_2^⊤ )( λ̂_i (1/n)Z_2Z_2^⊤ − (1/T)X_2X_2^⊤ )^{−1}( λ̂_i (1/n)Z_2Z_1^⊤ − (1/T)X_2X_1^⊤ ) ] = 0.   (2.4)

In the same manner as used in the proof of Theorem 2.6, it can be checked that the ℓ∞ norm of the subtracted term is o_p(1). Then the solution of equation (2.4) is close to that of the equation

  det( λ̂_i (1/n)Z_1Z_1^⊤ − (1/T)X_1X_1^⊤ ) = 0.   (2.5)

Note that the solution of (2.5) is an eigenvalue of the spiked Fisher matrix (Z_1Z_1^⊤/n)^{−1}(X_1X_1^⊤/T), which has been well studied by [17]. Thus, the weak consistency for all outliers λ̂_i, 1 ≤ i ≤ q_1 + q_2, could be achieved by combining Theorem 3.1 in [17] and Theorem 2.6.
Such a kind of extension could also be considered for the CLT in Theorem 2.10.

2.3 Central limit theorem

As λ_i, 1 ≤ i ≤ q, goes to infinity, the consistency of λ̂_i/λ_i in Theorem 2.6 does not mean that (1−y)λ̂_i is a good estimator of λ_i. In this section, we establish the CLT for λ̂_i to provide further properties.

We first introduce a centering parameter for λ̂_i. Let θ_i ∈ R, 1 ≤ i ≤ q, satisfy

  1 − (1/n) E[ tr{ M^{−1}(θ_i) } ] = (λ_i/θ_i) ( 1 + (1/T) E[ tr{ M^{−1}(θ_i) F_2/θ_i } ] ),   (2.6)

and define δ_i, for 1 ≤ i ≤ q, as

  δ_i = (λ̂_i − θ_i)/θ_i.   (2.7)

By Lemma 4.1, when n → ∞, we can easily see that

  (1/(p−q)) E[ tr{ M^{−1}(θ_i) } ] = E{ m̃_{θ_i}(1) } → 1,  (1/(p−q)) E[ tr{ M^{−1}(θ_i) F_2/θ_i } ] → 0.

It follows by (2.6) that

  λ_i/θ_i = ( 1 − (p−q)/n ) + o(1) → 1 − y.

Since the equation (2.6) for θ_i is hard to calculate, an alternative definition of θ_i is proposed as follows. Recall the definition of m̃_θ(z) in (2.2):

  m̃_θ(z) = (1/(p−q)) tr( z I_{p−q} − F_2/θ )^{−1},  θ ∈ R, z ∈ C^+.

Denoting f_θ(x) = θ/(θ−x) for any fixed θ ∈ R, we have

  m̃_θ(1) = (1/(p−q)) tr( I_{p−q} − F_2/θ )^{−1} = ∫ θ/(θ−x) dF_n(x) =: F_n(f_θ),

where F_n denotes the ESD of the matrix F_2. By the CLT for linear spectral statistics (LSS) of Fisher matrices (see Theorem 3.10 in [19]), for any fixed θ, p{ F_n(f_θ) − F_{c̃_p, ỹ_p}(f_θ) } converges weakly to a Gaussian variable.
It follows that

  m̃_θ(1) = F_{c̃_p, ỹ_p}(f_θ) + O_p(n^{−1}) = −θ S̃(θ) + O_p(n^{−1})
    = −[ 1 − c̃_p − θ(1+ỹ_p) + √{ (θ(1−ỹ_p)+1−c̃_p)² − 4θ } ] / [ 2(c̃_p + θ ỹ_p) ] + O_p(n^{−1}),

where S̃(·) denotes the Stieltjes transform of F_{c̃_p, ỹ_p}. This leads to

  E{ m̃_θ(1) } = −θ S̃(θ) + O(n^{−1}).   (2.8)

The definition of θ_i in (2.6) can be rewritten as

  1 − ỹ_p E{ m̃_{θ_i}(1) } = (λ_i/θ_i) [ 1 − c̃_p + c̃_p E{ m̃_{θ_i}(1) } ].

According to (2.8), it is equivalent to

  1 + ỹ_p θ_i S̃(θ_i) + O(n^{−1}) = (λ_i/θ_i) { 1 − c̃_p − c̃_p θ_i S̃(θ_i) + O(n^{−1}) }.   (2.9)

Thus, we give another definition of θ_i by the following equation:

  1 + ỹ_p θ_i S̃(θ_i) = (λ_i/θ_i) { 1 − c̃_p − c̃_p θ_i S̃(θ_i) }.   (2.10)

It is notable that the θ_i defined by (2.10) is also applicable to the CLT of δ_i in the later section. Comparing the two equations (2.9) and (2.10), we can derive that the difference between the two δ_i's respectively derived from these two equations is at most O(n^{−1}), which is smaller than the scale n^{−1/2} of δ_i. A Taylor expansion of the Stieltjes transform S̃(·) can even be applied to equation (2.10) to obtain explicit forms of θ_i, although some errors would appear. In the remaining parts of this paper, we use θ_i defined by (2.6) in all results and their proofs.

Consider the case where all the spiked eigenvalues are simple, that is, n_i = 1 for 1 ≤ i ≤ ℓ, which means that Λ = diag(λ_1, λ_2, ..., λ_q).

Theorem 2.10.
Under Assumptions 2.1, 2.2, 2.3, 2.4 and that n_i = 1, 1 ≤ i ≤ ℓ (i.e., ℓ = q), it holds that, for all 1 ≤ i ≤ q,

  √p δ_i / σ_i  →_d  N(0, 1),  σ_i² := { (y+c) ν_i − c − y(1−y) } (1−y)^{−2},

where ν_i = E| u_i^⊤ Z_1 e_1 |⁴, e_1 = (1, 0, ..., 0)^⊤ ∈ R^q and u_i ∈ R^q is the i-th column of the matrix U^⊤.

Remark 2.11.
When the value of the variance σ_i² at the population level is unknown, estimating σ_i² is needed for statistical inference. A natural way would be to estimate the eigenvector u_i first. For the spiked population model, [5] shows that when a leading eigenvalue of the sample covariance matrix is divergent, the corresponding sample eigenvector is a good estimator of its population counterpart in terms of their inner product. However, the situation becomes much more difficult when it comes to the spiked Fisher matrix. Recalling the assumed structure Σ_2^{−1/2} Σ_1 Σ_2^{−1/2} = I_p + ∆, we suppose that v_i := (u_i^⊤, 0, ..., 0)^⊤ ∈ R^p is the eigenvector of Σ_2^{−1/2} Σ_1 Σ_2^{−1/2} = I_p + ∆ corresponding to λ_i and v̂_i is that of S_1 = (1/T)Σ_1^{1/2}YY^⊤Σ_1^{1/2}. Then Σ_2^{1/2} v̂_i is the eigenvector of (1/T)(I_p+∆)^{1/2}YY^⊤(I_p+∆)^{1/2} corresponding to the i-th largest eigenvalue. If Σ_2 is known or can be consistently estimated, Σ_2^{1/2} v̂_i is a good estimator of v_i by Theorem 4.1 in [5]. But actually Σ_2 cannot be easily recovered based on S_2 because of the delocalization of the eigenvectors for non-outliers (see [4]). Thus, how to construct a consistent estimator of u_i becomes a challenging issue. As a special case, when the entries of Y and Z are Gaussian, the parameter ν_i equals 3, which is independent of the value of u_i. In practice, the bootstrap approximation would be an alternative way to achieve a reliable estimation of σ_i². For estimation of the variance of the largest sample eigenvalue in a spiked population model, [11] shows that the bootstrap approximation works when the largest eigenvalue is quite large. This deserves further study.

To check the practical applicability of Theorem 2.10, a simulation is conducted. Set p = T = n, q = ⌈p^{·}⌉ and λ_i = (3/2)^{q+1−i}(log p)^{·} for 1 ≤ i ≤ q, where ⌈x⌉ denotes the smallest integer greater than or equal to x. Let Σ_1 = diag(λ_1, ..., λ_q, 1, ..., 1) and Σ_2 = I_p. Draw a sample {x_i}_{1≤i≤T} of size T from N(0, Σ_1) and a sample {z_i}_{1≤i≤n} of size n from N(0, Σ_2). Compute the largest q eigenvalues λ̂_i, 1 ≤ i ≤ q, of the Fisher matrix F = S_2^{−1}S_1 and then δ_i accordingly, where S_1 = Σ_{i=1}^T x_i x_i^⊤/T and S_2 = Σ_{i=1}^n z_i z_i^⊤/n. We draw qq plots of √p δ_1/σ_1 and √p δ_q/σ_q from 1000 independent replications in Figure 1. It suggests that both √p δ_1/σ_1 and √p δ_q/σ_q are well approximated by the standard normal distribution.

Figure 1: (a) The qq plot of the normalized largest spiked eigenvalue √p δ_1/σ_1 from 1000 independent replications. (b) The qq plot of the normalized smallest spiked eigenvalue √p δ_q/σ_q from 1000 independent replications.

Next, consider the case where some spiked eigenvalues are possibly multiple:

  Λ = diag( λ_1, ..., λ_{N_1} [n_1 copies], ..., λ_{N_{ℓ−1}+1}, ..., λ_q [n_ℓ copies] ),

where λ_1 = ... = λ_{N_1} > ... > λ_{N_{ℓ−1}+1} = ... = λ_q, N_i := Σ_{j=1}^i n_j for 1 ≤ i ≤ ℓ, and there exists a constant C < ∞ such that 1 ≤ n_i ≤ C for all 1 ≤ i ≤ ℓ. According to the multiplicities of the spiked eigenvalues, we divide the index set {1, ..., q} into ℓ subsets, J_i = {N_{i−1}+1, ..., N_i}, 1 ≤ i ≤ ℓ. Here we denote N_0 = 0. For any 1 ≤ i ≤ ℓ and 1 ≤ h, k, h_1, k_1, h_2, k_2 ≤ n_i, define

  M_{N_i,h,k} := E( u_{N_{i−1}+h}^⊤ Z_1 e_1 · u_{N_{i−1}+k}^⊤ Z_1 e_1 ),
  M_{N_i,h_1,k_1,h_2,k_2} := E( u_{N_{i−1}+h_1}^⊤ Z_1 e_1 · u_{N_{i−1}+k_1}^⊤ Z_1 e_1 · u_{N_{i−1}+h_2}^⊤ Z_1 e_1 · u_{N_{i−1}+k_2}^⊤ Z_1 e_1 ).

Theorem 2.12.
Suppose that Assumptions 2.1, 2.2, 2.3, 2.4 and 2.5 hold. Define φ_i(λ̂_j) = (λ̂_j − θ_j)/θ_j, for 1 ≤ i ≤ ℓ and j ∈ J_i. Then √p { φ_i(λ̂_j), j ∈ J_i } converges weakly to the distribution of the eigenvalues of the n_i × n_i random matrix R^{(i)}, where R^{(i)} = ( R^{(i)}_{hk} )_{1≤h,k≤n_i} is a symmetric matrix with independent Gaussian entries of mean zero and covariance structure

  cov( R^{(i)}_{h_1,k_1}, R^{(i)}_{h_2,k_2} ) = (1−y)^{−2} ω ( M_{N_i,h_1,k_1,h_2,k_2} − M_{N_i,h_1,k_1} M_{N_i,h_2,k_2} )
    + (1−y)^{−2} (β − ω) ( M_{N_i,h_1,k_2} M_{N_i,h_2,k_1} + M_{N_i,h_1,h_2} M_{N_i,k_1,k_2} ),

where ω = (y + c)(1−y)^{·} and β = y(1−y)^{·} + c(1−y)^{·}.

3 Proofs of the main results

We begin with a summary of the proofs. Roughly, the proof of Theorem 2.6 proceeds in three steps. First, we prove that the spiked eigenvalue λ̂_i, 1 ≤ i ≤ q, solves equation (3.5), whose left-hand side is the determinant of a q × q matrix which can be decomposed into four terms, namely U Ξ_A U^⊤, U Ξ_B U^⊤, U Ξ_C U^⊤ and U Ξ_D U^⊤ defined below. Second, we derive the limit of each entry of these four matrices and their convergence rates in the ℓ∞ norm, where the CLT for random sesquilinear forms in [3] and Chebyshev's inequality are repeatedly used. Third, using eigenvalue perturbation theorems on (3.5), we estimate the fluctuation of the scaled eigenvalue λ̂_i/λ_i and reach the result. As for the proof of Theorem 2.10, we also work on equation (3.5) in three main steps. First, we rewrite the matrix in (3.5) as the sum of U Θ_{1n} U^⊤, U δ_i Θ_{2n} U^⊤ and U Θ_{3n} U^⊤; see equation (3.30) below. Second, we prove the CLT for each diagonal entry of U Θ_{1n} U^⊤ (Lemma 4.2) and estimate the ℓ∞ norm of the remaining part of U Θ_{1n} U^⊤ (Lemma 4.3), as well as those of U Θ_{2n} U^⊤ (Lemma 4.4) and U Θ_{3n} U^⊤.
Third, we expand the determinant in (3.30) by the Leibniz formula and then achieve the CLT for δ_i. In this section, we will cite the lemmas given in the next section; the details of their proofs are postponed to Section 4.

Proof of Theorem 2.6.
We first show that for 1 ≤ i ≤ q, λ̂_i goes to infinity at the same order as λ_i almost surely; i.e., there exists some constant C > 1 such that C^{−1} < λ̂_i/λ_i < C almost surely.

For any 1 ≤ i ≤ q, by Theorem A.10 in [2], we have

  λ̂_i = s_i( S_2^{−1}S_1 ) ≤ s_i(S_1) s_1( S_2^{−1} ) = s_i(S_1) s_p^{−1}(S_2)  and  s_i(S_1) ≤ s_i( S_2^{−1}S_1 ) s_1(S_2).

Since s_1(S_2) → (1 + √y)² and s_p(S_2) → (1 − √y)² > 0 almost surely, it follows that 0 < C_1 < λ̂_i/s_i(S_1) ≤ C_2 < +∞ almost surely for some constants C_1 and C_2.

Again, by Theorem A.10 in [2] and Weyl's inequality, we have

  s_i(S_1) ≤ s_i(Σ_1) s_1( (1/T)YY^⊤ ) = λ_i s_1( (1/T)YY^⊤ )

and

  s_i(S_1) = s_i( (1/T)Y^⊤Σ_1Y ) = s_i( (1/T)Y_1^⊤Σ*Y_1 + (1/T)Y_2^⊤Y_2 ) ≥ s_i( (1/T)Y_1^⊤Σ*Y_1 ) ≥ s_i(Σ*) s_q( (1/T)Y_1Y_1^⊤ ) = λ_i s_q( (1/T)Y_1Y_1^⊤ ).

Due to the fact that s_1( (1/T)YY^⊤ ) → (1 + √c)² and s_q( (1/T)Y_1Y_1^⊤ ) → 1 almost surely, it follows that 0 < C_3 < s_i(S_1)/λ_i < C_4 < +∞ almost surely for some constants C_3 and C_4. Thus, we conclude that C^{−1} < λ̂_i/λ_i < C almost surely for some constant C.

For any 1 ≤ i ≤ q, by the definition of λ̂_i, it solves the equation

  det( λ̂_i I_p − S_2^{−1}S_1 ) = 0, i.e., det( λ̂_i S_2 − S_1 ) = 0.   (3.1)

By the decomposition of S_1 and S_2 in (2.1), equation (3.1) can be rewritten as

  det[ λ̂_i (1/n)Z_1Z_1^⊤ − (1/T)X_1X_1^⊤  λ̂_i (1/n)Z_1Z_2^⊤ − (1/T)X_1X_2^⊤ ; λ̂_i (1/n)Z_2Z_1^⊤ − (1/T)X_2X_1^⊤  λ̂_i (1/n)Z_2Z_2^⊤ − (1/T)X_2X_2^⊤ ] = 0.   (3.2)

By the formula for the determinant of partitioned matrices, we know that

  det[ A B ; C D ] = det(D) det( A − BD^{−1}C )

when D is nonsingular.
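The almost sure edge limits s_1(S_2) → (1 + √y)² and s_p(S_2) → (1 − √y)² invoked above are the Marchenko–Pastur bulk edges of the denominator sample covariance matrix; they are easy to observe numerically (a minimal sketch with hypothetical sizes and Gaussian entries):

```python
import numpy as np

# Observe the extreme eigenvalues of S_2 = (1/n) Z Z^T against the
# Marchenko-Pastur edges (1 +/- sqrt(y))^2, y = p/n.
rng = np.random.default_rng(2)
p, n = 300, 1200                              # y = p/n = 0.25
Z = rng.standard_normal((p, n))
S2 = Z @ Z.T / n
s = np.linalg.eigvalsh(S2)                    # ascending eigenvalues of S_2

y = p / n
print("largest:", s[-1], "edge:", (1 + np.sqrt(y)) ** 2)   # edge = 2.25
print("smallest:", s[0], "edge:", (1 - np.sqrt(y)) ** 2)   # edge = 0.25
```

At these sizes the extreme eigenvalues already sit within a few percent of the limiting edges, which is what makes the almost sure bounds on λ̂_i/λ_i effective.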
Since, for 1 ≤ i ≤ q, λ̂_i is an outlier eigenvalue of S_2^{−1}S_1 (because λ̂_i goes to infinity at the same order as λ_i), we have

  det( λ̂_i (1/n)Z_2Z_2^⊤ − (1/T)X_2X_2^⊤ ) ≠ 0,

and then it follows from (3.2) that

  det[ λ̂_i (1/n)Z_1Z_1^⊤ − (1/T)X_1X_1^⊤ − ( λ̂_i (1/n)Z_1Z_2^⊤ − (1/T)X_1X_2^⊤ )( λ̂_i (1/n)Z_2Z_2^⊤ − (1/T)X_2X_2^⊤ )^{−1}( λ̂_i (1/n)Z_2Z_1^⊤ − (1/T)X_2X_1^⊤ ) ] = 0.   (3.3)

For λ ∈ R, defining

  A(λ) = Z_2^⊤ M^{−1}(λ) ( (1/n)Z_2Z_2^⊤ )^{−1} (1/n) Z_2,
  B(λ) = X_2^⊤ M^{−1}(λ) ( (1/n)Z_2Z_2^⊤ )^{−1} (1/(λT)) X_2,
  C(λ) = Z_2^⊤ M^{−1}(λ) ( (1/n)Z_2Z_2^⊤ )^{−1} (1/(λT)) X_2,
  D(λ) = X_2^⊤ M^{−1}(λ) ( (1/n)Z_2Z_2^⊤ )^{−1} (1/(λn)) Z_2,

it holds that A(λ) = A(λ)^⊤, B(λ) = B(λ)^⊤ and T C(λ) = n D(λ)^⊤. Then some elementary calculations lead to

  det[ λ̂_i (1/n) Z_1{ I_n − A(λ̂_i) }Z_1^⊤ − (1/T) X_1{ I_T + B(λ̂_i) }X_1^⊤ + (λ̂_i/n) Z_1 C(λ̂_i) X_1^⊤ + (λ̂_i/T) X_1 D(λ̂_i) Z_1^⊤ ] = 0.   (3.4)

To ease the notation, we define

  Ξ_A := λ̂_i (1/n) Z_1{ I_n − A(λ̂_i) }Z_1^⊤,  Ξ_B := (1/T) X_1{ I_T + B(λ̂_i) }X_1^⊤,
  Ξ_C := (λ̂_i/n) Z_1 C(λ̂_i) X_1^⊤,  Ξ_D := (λ̂_i/T) X_1 D(λ̂_i) Z_1^⊤.

Multiplying the matrix in (3.4) by U on the left and by U^⊤ on the right, we have

  det{ U( Ξ_A − Ξ_B + Ξ_C + Ξ_D )U^⊤ } = 0.   (3.5)

Next, we analyze these four terms in (3.5).

For the term U Ξ_A U^⊤, we first consider the decomposition

  (1/n) Z_1{ I_n − A(λ̂_i) }Z_1^⊤ = (1/n) Z_1{ I_n − A(λ_i) }Z_1^⊤ + (1/n) Z_1{ A(λ_i) − A(λ̂_i) }Z_1^⊤.

By Lemma 4.1 below, we have m̃_{λ_i}(1) − 1 = O_{a.s.}(λ_i^{−1}), which implies

  (1/n) tr{ I_n − A(λ_i) } = 1 − ((p−q)/n) m̃_{λ_i}(1) = 1 − y_p + q/n + O_{a.s.}(λ_i^{−1}).

Note that E( Z_1Z_1^⊤/n ) = I_q and that (X_1, Z_1) is independent of (X_2, Z_2). Under Assumption 2.3, by using Theorem 7.2 of [3], we have that, for all 1 ≤ j ≤ q,

  e_j^⊤[ (1/n) Z_1{ I_n − A(λ_i) }Z_1^⊤ ]e_j − { 1 − ((p−q)/n) m̃_{λ_i}(1) } = O_p( 1/√n )   (3.6)

and

  E( e_j^⊤[ (1/n) Z_1{ I_n − A(λ_i) }Z_1^⊤ ]e_j − { 1 − ((p−q)/n) m̃_{λ_i}(1) } )² = O( 1/n ).   (3.7)

For the off-diagonal elements, we have that, for any 1 ≤ j_1 ≠ j_2 ≤ q,

  e_{j_1}^⊤[ (1/n) Z_1{ I_n − A(λ_i) }Z_1^⊤ ]e_{j_2} = O_p( 1/√n )   (3.8)

and

  E( e_{j_1}^⊤[ (1/n) Z_1{ I_n − A(λ_i) }Z_1^⊤ ]e_{j_2} )² = O( 1/n ),   (3.9)

which is implied by Theorem 7.1 and Corollary 7.1 in [3]. Also we can write

  A(λ_i) − A(λ̂_i) = Z_2^⊤{ M^{−1}(λ_i) − M^{−1}(λ̂_i) }( (1/n)Z_2Z_2^⊤ )^{−1} (1/n) Z_2
    = Z_2^⊤ M^{−1}(λ_i){ M(λ̂_i) − M(λ_i) }M^{−1}(λ̂_i)( (1/n)Z_2Z_2^⊤ )^{−1} (1/n) Z_2
    = ( λ_i^{−1} − λ̂_i^{−1} ) Z_2^⊤ M^{−1}(λ_i) F_2 M^{−1}(λ̂_i)( (1/n)Z_2Z_2^⊤ )^{−1} (1/n) Z_2.
It can be bounded by (cid:13)(cid:13)(cid:13)(cid:13) A ( λ i ) − A (cid:16)(cid:98) λ i (cid:17)(cid:13)(cid:13)(cid:13)(cid:13) = (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:16) λ − i − (cid:98) λ − i (cid:17) Z (cid:62) M − ( λ i ) F M − (cid:16)(cid:98) λ i (cid:17) (cid:32) n Z Z (cid:62) (cid:33) − n Z (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) ≤ (cid:12)(cid:12)(cid:12)(cid:12) λ − i − (cid:98) λ − i (cid:12)(cid:12)(cid:12)(cid:12) (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) √ n Z (cid:62) (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) (cid:13)(cid:13)(cid:13) M − ( λ i ) (cid:13)(cid:13)(cid:13) (cid:107) F (cid:107) (cid:13)(cid:13)(cid:13)(cid:13) M − (cid:16)(cid:98) λ i (cid:17)(cid:13)(cid:13)(cid:13)(cid:13) (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:32) n Z Z (cid:62) (cid:33) − (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) √ n Z (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) = O( λ − i )almost surely. It follows that, for any 1 ≤ j , j ≤ q , e (cid:62) j (cid:34) n Z (cid:110) A ( (cid:98) λ i ) − A ( λ i ) (cid:111) Z (cid:62) (cid:35) e j = O a . s . ( λ − i ) . (3.10)Combining (3.6), (3.8)) and (3.10), we can get that, for any 1 ≤ j ≤ q , e (cid:62) j (cid:34) n Z (cid:110) I n − A ( (cid:98) λ i ) (cid:111) Z (cid:62) (cid:35) e j − (cid:18) − p − qn (cid:19) = O p (cid:32) √ n (cid:33) + O a . s . ( λ − i )and that, for any 1 ≤ j (cid:44) j ≤ q , e (cid:62) j (cid:34) n Z (cid:110) I n − A ( (cid:98) λ i ) (cid:111) Z (cid:62) (cid:35) e j = O p (cid:32) √ n (cid:33) + O a . s . ( λ − i ) . Replacing Z by UZ , it is easy to check that all the above conclusions still hold: e (cid:62) j (cid:34) n UZ (cid:110) I n − A ( (cid:98) λ i ) (cid:111) Z (cid:62) U (cid:62) (cid:35) e j = − p − qn + O p (cid:32) √ n (cid:33) + O a . s . 
( λ − i ) (3.11) for all 1 ≤ j ≤ q , and e (cid:62) j (cid:34) n UZ (cid:110) I n − A ( (cid:98) λ i ) (cid:111) Z (cid:62) U (cid:62) (cid:35) e j = O p (cid:32) √ n (cid:33) + O a . s . ( λ − i ) (3.12) for all 1 ≤ j (cid:44) j ≤ q . By the definition of Ξ A in (3.4), together with (3.11) and (3.12), we can see that, for all 1 ≤ j ≤ q , e (cid:62) j U Ξ A U (cid:62) e j = (cid:98) λ i (cid:18) − p − qn (cid:19) + λ i · O p (cid:32) √ n (cid:33) + O a . s . (1) (3.13) and that, for all 1 ≤ j (cid:44) j ≤ q , e (cid:62) j U Ξ A U (cid:62) e j = λ i · O p (cid:32) √ n (cid:33) + O a . s . (1) . (3.14) For the term U Ξ B U (cid:62) , by the definition of X , we can derive that U Ξ B U (cid:62) = T Λ UY (cid:110) I T + B ( (cid:98) λ i ) (cid:111) Y (cid:62) U (cid:62) Λ = T Λ UY { I T + B ( λ i ) } Y (cid:62) U (cid:62) Λ + T Λ UY (cid:110) B ( (cid:98) λ i ) − B ( λ i ) (cid:111) Y (cid:62) U (cid:62) Λ , where 1 T tr { I T + B ( λ i ) } = T tr I T + X (cid:62) M − ( λ i ) (cid:32) n Z Z (cid:62) (cid:33) − λ i T X = + T tr M − ( λ i ) (cid:32) n Z Z (cid:62) (cid:33) − λ i T X X (cid:62) = + T tr (cid:32) I p − q − F λ i (cid:33) − F λ i = + p − qT (cid:8)(cid:101) m λ i (1) − (cid:9) and B ( (cid:98) λ i ) − B ( λ i ) = X (cid:62) (cid:110)(cid:98) λ − i M − (cid:16)(cid:98) λ i (cid:17) − λ − i M − ( λ i ) (cid:111) (cid:32) n Z Z (cid:62) (cid:33) − T X = (cid:98) λ − i λ − i X (cid:62) M − (cid:16)(cid:98) λ i (cid:17) (cid:110) λ i M ( λ i ) − (cid:98) λ i M (cid:16)(cid:98) λ i (cid:17)(cid:111) M − ( λ i ) (cid:32) n Z Z (cid:62) (cid:33) − T X = (cid:16)(cid:98) λ − i − λ − i (cid:17) X (cid:62) M − (cid:16)(cid:98) λ i (cid:17) M − ( λ i ) (cid:32) n Z Z (cid:62) (cid:33) − T X . The same arguments for deriving (3.13) and (3.14) lead to that, for all 1 ≤ j ≤ q , e (cid:62) j U Ξ B U (cid:62) e j = λ j + λ j · O p (cid:32) √ n (cid:33) + λ j · O a . s .
( λ − i ) (3.15)and that, for 1 ≤ j , j ≤ q , e (cid:62) j U Ξ B U (cid:62) e j = λ j λ j · O p (cid:32) √ n (cid:33) + λ j λ j · O a . s . ( λ − i ) (3.16)for all 1 ≤ j (cid:44) j ≤ q .For the term U ( Ξ C + Ξ D ) U (cid:62) , by using the fact that Y = U (cid:62) Λ − UX , we have that U ( Ξ C + Ξ D ) U (cid:62) U (cid:98) λ i Z C ( (cid:98) λ i ) X (cid:62) n + (cid:98) λ i X D ( (cid:98) λ i ) Z (cid:62) T U (cid:62) = U (cid:40) λ i Z C ( λ i ) X (cid:62) n + λ i X D ( λ i ) Z (cid:62) T (cid:41) U (cid:62) + UZ (cid:110)(cid:98) λ i C ( (cid:98) λ i ) − λ i C ( λ i ) (cid:111) n Y (cid:62) U (cid:62) Λ + Λ UY (cid:110)(cid:98) λ i D ( (cid:98) λ i ) − λ i D ( λ i ) (cid:111) T Z (cid:62) U (cid:62) , and that U (cid:40) λ i Z C ( λ i ) X (cid:62) n + λ i X D ( λ i ) Z (cid:62) T (cid:41) U (cid:62) = U (cid:16) Z X (cid:17) O λ i C ( λ i ) n λ i D ( λ i ) T O Z (cid:62) X (cid:62) U (cid:62) = (cid:18) UZ Λ UY (cid:19) O λ i C ( λ i ) n λ i D ( λ i ) T O Z (cid:62) U (cid:62) Y (cid:62) U (cid:62) Λ . 
Then we have that, for all 1 ≤ j , j ≤ q , e (cid:62) j U (cid:40) λ i Z C ( λ i ) X (cid:62) n + λ i X D ( λ i ) Z (cid:62) T (cid:41) U (cid:62) e j = e (cid:62) j (cid:18) UZ Λ UY (cid:19) O λ i C ( λ i ) n λ i D ( λ i ) T O Z (cid:62) U (cid:62) Y (cid:62) U (cid:62) Λ e j = e (cid:62) j (cid:18) UZ λ j UY (cid:19) O λ i C ( λ i ) n λ i D ( λ i ) T O Z (cid:62) U (cid:62) λ j Y (cid:62) U (cid:62) e j = λ j + λ j · e (cid:62) j (cid:16) UZ UY (cid:17) O λ i C ( λ i ) n λ i D ( λ i ) T O Z (cid:62) U (cid:62) Y (cid:62) U (cid:62) e j + λ j − λ j i · e (cid:62) j (cid:16) UZ UY (cid:17) O λ i C ( λ i ) n · i − λ i D ( λ i ) T · i O Z (cid:62) U (cid:62) Y (cid:62) U (cid:62) e j = λ j + λ j · O p (cid:32) √ n (cid:33) + λ j − λ j i · O p (cid:32) √ n (cid:33) = (cid:18) λ j + λ j (cid:19) · O p (cid:32) √ n (cid:33) , where i : = √− (cid:98) λ i C ( (cid:98) λ i ) − λ i C ( λ i ) = Z (cid:62) (cid:110) M − ( (cid:98) λ i ) − M − ( λ i ) (cid:111) (cid:32) n Z Z (cid:62) (cid:33) − T X = Z (cid:62) M − ( (cid:98) λ i ) (cid:110) M ( λ i ) − M ( (cid:98) λ i ) (cid:111) M − ( λ i ) (cid:32) n Z Z (cid:62) (cid:33) − T X = (cid:16)(cid:98) λ − i − λ − i (cid:17) Z (cid:62) M − ( (cid:98) λ i ) F M − ( λ i ) (cid:32) n Z Z (cid:62) (cid:33) − T X , (cid:98) λ i D ( (cid:98) λ i ) − λ i D ( λ i ) = X (cid:62) (cid:110) M − ( (cid:98) λ i ) − M − ( λ i ) (cid:111) (cid:32) n Z Z (cid:62) (cid:33) − n Z X (cid:62) M − ( (cid:98) λ i ) (cid:110) M ( λ i ) − M ( (cid:98) λ i ) (cid:111) M − ( λ i ) (cid:32) n Z Z (cid:62) (cid:33) − n Z = (cid:16)(cid:98) λ − i − λ − i (cid:17) X (cid:62) M − ( (cid:98) λ i ) F M − ( λ i ) (cid:32) n Z Z (cid:62) (cid:33) − n Z , we can get e (cid:62) j UZ (cid:110)(cid:98) λ i C ( (cid:98) λ i ) − λ i C ( λ i ) (cid:111) n Y (cid:62) U (cid:62) Λ e j = λ j · O a . s . (cid:16) λ − i (cid:17) and e (cid:62) j Λ UY (cid:110)(cid:98) λ i D ( (cid:98) λ i ) − λ i D ( λ i ) (cid:111) T Z (cid:62) U (cid:62) e j = λ j · O a . s . 
(cid:16) λ − i (cid:17) for any 1 ≤ j , j ≤ q . By using similar arguments for proving (3.13) and (3.14), it holds that e (cid:62) j U ( Ξ C + Ξ D ) U (cid:62) e j = (cid:18) λ j + λ j (cid:19) · O p (cid:32) √ n (cid:33) + (cid:18) λ j + λ j (cid:19) · O a . s . (cid:16) λ − i (cid:17) (3.17) for any 1 ≤ j , j ≤ q . Combining (3.14)-(3.17) and the determinant (3.5), we can compute the limit of (cid:98) λ i /λ i for each 1 ≤ i ≤ q . We use a new notation to denote the matrix in the determinant (3.5). Define Ξ : = U ( Ξ A − Ξ B + Ξ C + Ξ D ) U (cid:62) , (cid:101) Ξ : = diag (cid:16) ξ , . . . , ξ qq (cid:17) , where ξ j j = (cid:98) λ i { 1 − ( p − q ) / n } − λ j . Then by (3.14)-(3.17), we have that e (cid:62) j (cid:16) Ξ − (cid:101) Ξ (cid:17) e j = λ i · O p (cid:32) √ n (cid:33) + O p (1) + λ j λ j · O p (cid:32) √ n (cid:33) + λ j λ j · O a . s . ( λ − i ) + (cid:18) λ j + λ j (cid:19) · O p (cid:32) √ n (cid:33) + (cid:18) λ j + λ j (cid:19) · O a . s . (cid:16) λ − i (cid:17) = (cid:18) λ i + λ j λ j (cid:19) (cid:40) O p (cid:32) √ n (cid:33) + O a . s . (cid:16) λ − i (cid:17)(cid:41) for any 1 ≤ j , j ≤ q , from which it follows that e (cid:62) j λ − i (cid:16) Ξ − (cid:101) Ξ (cid:17) e j = (cid:18) + λ − i λ j λ j (cid:19) (cid:40) O p (cid:32) √ n (cid:33) + O a . s . (cid:16) λ − i (cid:17)(cid:41) . (3.18) According to (3.7) and (3.9) for Ξ A (similar results also hold for Ξ B , Ξ C and Ξ D ), it can be easily checked that the variance of the term in (3.18) has the order (cid:18) + λ − i λ j λ j (cid:19) (cid:18) n − + λ − i (cid:19) .
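The entrywise variance bound just obtained is converted into a uniform (maximum-entry) bound by applying Chebyshev's inequality entrywise and then a union bound over the q² entries. A minimal sketch of that mechanism (the array size, variance, and threshold below are arbitrary demo choices):

```python
import numpy as np

# Chebyshev + union bound: for a q x q array E of mean-zero entries with
# variance sigma^2,  Pr( max_{j,k} |E_jk| >= t ) <= q^2 * sigma^2 / t^2 ,
# so the maximum entry is O_p(q * sigma).  Empirical check with Gaussian entries:
rng = np.random.default_rng(1)
q, sigma, t, trials = 50, 0.01, 1.0, 2000
union_bound = q * q * sigma**2 / t**2             # ~ 0.25 here
exceed = 0
for _ in range(trials):
    E = sigma * rng.standard_normal((q, q))
    exceed += np.max(np.abs(E)) >= t
empirical = exceed / trials                       # empirical exceedance frequency
```

Chebyshev is crude per entry, but summing the q² individual tail bounds is exactly what produces the factor κ² = (q + λᵢ⁻¹Σⱼλⱼ)²-type constants in the display that follows.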
By Chebyshev’s inequality, we have that, for any (cid:15) > (cid:40) max ≤ j , j ≤ q (cid:12)(cid:12)(cid:12)(cid:12) e (cid:62) j λ − i (cid:16) Ξ − (cid:101) Ξ (cid:17) e j (cid:12)(cid:12)(cid:12)(cid:12) ≥ (cid:15) (cid:18) n − + λ − i (cid:19)(cid:41) (cid:88) ≤ j , j ≤ q Pr (cid:26)(cid:12)(cid:12)(cid:12)(cid:12) e (cid:62) j λ − i (cid:16) Ξ − (cid:101) Ξ (cid:17) e j (cid:12)(cid:12)(cid:12)(cid:12) ≥ (cid:15) (cid:18) n − + λ − i (cid:19)(cid:27) ≤ (cid:88) ≤ j , j ≤ q E (cid:110) e (cid:62) j λ − i (cid:16) Ξ − (cid:101) Ξ (cid:17) e j (cid:111) (cid:15) (cid:16) n − + λ − i (cid:17) = (cid:88) ≤ j , j ≤ q (cid:18) + λ − i λ j λ j (cid:19) · O (cid:16) (cid:15) − (cid:17) = q + λ − i q (cid:88) j = λ j · O (cid:16) (cid:15) − (cid:17) = κ · O (cid:16) (cid:15) − (cid:17) , which means (cid:13)(cid:13)(cid:13)(cid:13) λ − i (cid:16) Ξ − (cid:101) Ξ (cid:17)(cid:13)(cid:13)(cid:13)(cid:13) ∞ = max ≤ j , j ≤ q (cid:12)(cid:12)(cid:12)(cid:12) e (cid:62) j λ − i (cid:16) Ξ − (cid:101) Ξ (cid:17) e j (cid:12)(cid:12)(cid:12)(cid:12) = κ · O p (cid:32) √ n + λ − i (cid:33) and then ||| λ − i ( Ξ − (cid:101) Ξ ) ||| ∞ ≤ q (cid:107) λ − i ( Ξ − (cid:101) Ξ ) (cid:107) ∞ = κ q · O p (cid:32) √ n + λ − i (cid:33) . Note that the determinant equation det (cid:0)(cid:101) Ξ (cid:1) = (cid:16) λ − i (cid:101) Ξ (cid:17) =
0, that is, det (cid:104) ( (cid:98) λ i /λ i ) (cid:18) 1 − p − q n (cid:19) I q − λ − 1 i Λ (cid:105) = 0 . At the same time, the equation det (cid:0) Ξ (cid:1) = 0 is equivalent to det (cid:16) λ − 1 i Ξ (cid:17) =
0, that is,det (cid:98) λ i λ i (cid:18) − p − qn (cid:19) I q − λ − i Λ + λ − i (cid:16) Ξ − (cid:101) Ξ (cid:17) = . By eigenvalue perturbation theorems (see Theorem 6.3.2 in Chapter 6, [9]), we have (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:98) λ i λ i (cid:18) − p − qn (cid:19) − (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ ||| λ − i ( Ξ − (cid:101) Ξ ) ||| ∞ = κ q · O p (cid:32) √ n + λ − i (cid:33) , that is (cid:98) λ i λ i = − y + O (cid:16) y p − y (cid:17) + κ q · O p (cid:32) √ n + λ − i (cid:33) . (3.19)Instead, we can compare determinant equationsdet (cid:18) Λ − (cid:101) ΞΛ − (cid:19) = (cid:18) Λ − ΞΛ − (cid:19) = , and then repeat all the derivations above to achieve an upper bound of ||| Λ − / ( Ξ − (cid:101) Ξ ) Λ − / ||| ∞ .In this case, we can get (cid:98) λ i λ i = − y + O (cid:16) y p − y (cid:17) + κ q · O p (cid:32) √ n + λ − i (cid:33) . (3.20)Thus, (3.19) and (3.20) lead to (cid:98) λ i λ i = − y + O (cid:16) y p − y (cid:17) + κ q · O p (cid:32) √ n + λ − i (cid:33) , κ q ( n − / + λ − i ) = o(1) under Assumption 2.2. The proof is finished. (cid:3) Proof of Theorem 2.10.
We begin with the equation on (cid:98) λ i in (3.4). Recall that we haveexpressed (3.4) as det (cid:0) Ξ A − Ξ B + Ξ C + Ξ D (cid:1) = . (3.21)For the first term Ξ A , we can write Ξ A = (cid:98) λ i n Z (cid:104)(cid:110) I n − A ( (cid:98) λ i ) (cid:111) − { I n − A ( θ i ) } (cid:105) Z (cid:62) + (cid:98) λ i n (cid:16) Z { I n − A ( θ i ) } Z (cid:62) − E (cid:104) Z { I n − A ( θ i ) } Z (cid:62) (cid:105)(cid:17) + (cid:98) λ i n E (cid:104) Z { I n − A ( θ i ) } Z (cid:62) (cid:105) . Using the fact (cid:110) I n − A ( (cid:98) λ i ) (cid:111) − { I n − A ( θ i ) } = − δ i A ( θ i ) + δ i Z (cid:62) M − ( (cid:98) λ i ) M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − n Z and (cid:98) λ i = θ i (1 + δ i ) by (2.7), we can get Ξ A = θ i δ i (1 + δ i ) 1 n (cid:16) Z { I n − A ( θ i ) } Z (cid:62) − E (cid:104) Z { I n − A ( θ i ) } Z (cid:62) (cid:105)(cid:17) + θ i δ i (1 + δ i ) 1 n Z Z (cid:62) M − ( (cid:98) λ i ) M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − n Z Z (cid:62) − θ i δ i (1 + δ i ) 1 n Z Z (cid:62) + θ i δ i (1 + δ i ) 1 n E (cid:104) Z { I n − A ( θ i ) } Z (cid:62) (cid:105) + θ i (1 + δ i ) 1 n (cid:16) Z { I n − A ( θ i ) } Z (cid:62) − E (cid:104) Z { I n − A ( θ i ) } Z (cid:62) (cid:105)(cid:17) + θ i (1 + δ i ) 1 n E (cid:104) Z { I n − A ( θ i ) } Z (cid:62) (cid:105) = θ i (1 + δ i ) n (cid:16) Z { I n − A ( θ i ) } Z (cid:62) − E (cid:104) Z { I n − A ( θ i ) } Z (cid:62) (cid:105)(cid:17) + θ i δ i (1 + δ i ) 1 n Z Z (cid:62) M − ( (cid:98) λ i ) M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − n Z Z (cid:62) − θ i δ i (1 + δ i ) 1 n Z Z (cid:62) + θ i (1 + δ i ) n E (cid:104) Z { I n − A ( θ i ) } Z (cid:62) (cid:105) = : θ i (1 + δ i ) Ξ A + θ i δ i (1 + δ i ) Ξ A − θ i δ i (1 + δ i ) Ξ A + θ i (1 + δ i ) Ξ A . 
(3.22)For the second term Ξ B , we can similarly write Ξ B = T X (cid:110) B ( (cid:98) λ i ) − B ( θ i ) (cid:111) X (cid:62) + T (cid:16) X { I T + B ( θ i ) } X (cid:62) − E (cid:104) X { I T + B ( θ i ) } X (cid:62) (cid:105)(cid:17) + T E (cid:104) X { I T + B ( θ i ) } X (cid:62) (cid:105) = T (cid:16) X { I T + B ( θ i ) } X (cid:62) − E (cid:104) X { I T + B ( θ i ) } X (cid:62) (cid:105)(cid:17) − δ i (cid:98) λ i · T X X (cid:62) M − ( (cid:98) λ i ) M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − T X X (cid:62) + T E (cid:104) X { I T + B ( θ i ) } X (cid:62) (cid:105) : Ξ B − δ i (cid:98) λ i Ξ B + Ξ B , (3.23)where the second equality above uses the fact B ( (cid:98) λ i ) − B ( θ i ) = − δ i (cid:98) λ i X (cid:62) M − ( (cid:98) λ i ) M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − T X . For the term Ξ C , we have Ξ C = (cid:98) λ i n Z (cid:110) C ( (cid:98) λ i ) − C ( θ i ) (cid:111) X (cid:62) + (cid:98) λ i − θ i n Z C ( θ i ) X (cid:62) + θ i n (cid:104) Z C ( θ i ) X (cid:62) − E (cid:110) Z C ( θ i ) X (cid:62) (cid:111)(cid:105) . Using the fact C ( (cid:98) λ i ) − C ( θ i ) = − δ i (cid:98) λ i Z (cid:62) M − ( (cid:98) λ i ) M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − T X , we have the decomposition Ξ C = θ i n (cid:104) Z C ( θ i ) X (cid:62) − E (cid:110) Z C ( θ i ) X (cid:62) (cid:111)(cid:105) − δ i n Z Z (cid:62) M − ( (cid:98) λ i ) M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − T X X (cid:62) + θ i δ i n Z C ( θ i ) X (cid:62) = : θ i Ξ C − δ i Ξ C + θ i δ i Ξ C . (3.24)Similarly, we can write the last term Ξ D as Ξ D = θ i T (cid:104) X D ( θ i ) Z (cid:62) − E (cid:110) X D ( θ i ) Z (cid:62) (cid:111)(cid:105) − δ i T X X (cid:62) M − ( (cid:98) λ i ) M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − n Z Z (cid:62) + θ i δ i T X D ( θ i ) Z (cid:62) = : θ i Ξ D − δ i Ξ D + θ i δ i Ξ D . 
(3.25) Putting (3.22)-(3.25) into (3.21), we have det( θ i Θ n + θ i δ i Θ n + θ i Θ n ) = , (3.26) where Θ n : = (1 + δ i ) Ξ A − θ − i Ξ B + Ξ C + Ξ D , (3.27) Θ n : = (1 + δ i ) Ξ A − (1 + δ i ) Ξ A + θ i (cid:98) λ i Ξ B − θ − i Ξ C + Ξ C − θ − i Ξ D + Ξ D , (3.28) Θ n : = (1 + δ i ) Ξ A − θ − i Ξ B . (3.29) Multiplying both sides of the matrix in (3.26) by θ − / i U from the left-hand side and θ − / i U (cid:62) from the right-hand side, we get det (cid:110) U ( Θ n + δ i Θ n + Θ n ) U (cid:62) (cid:111) = . (3.30) Recall that e i is the q -dimensional vector whose i -th element is 1 and others are 0. By Lemma 4.2 below, we have √ p (cid:98) S i : = √ p e (cid:62) i U Θ n U (cid:62) e i d −→ N (0 , (cid:101) σ i ) , (3.31) where (cid:101) σ i = ( y + c ) (1 − y ) ν i − y (1 − y ) (1 − y ) − c (1 − y ) . It follows by Lemma 4.3 below that (cid:107) U Θ n U (cid:62) (cid:107) ∞ = O p (cid:32) q √ n + (cid:80) j λ j √ n λ i (cid:33) . (3.32) By Lemma 4.4 below, we also have max ≤ j ≤ q (cid:12)(cid:12)(cid:12) e (cid:62) j U Θ n U (cid:62) e j − ( y − (cid:12)(cid:12)(cid:12) = O p √ q δ i λ i + √ q √ n + (cid:113)(cid:80) j λ j λ i + (cid:112)(cid:80) j λ j λ i , (3.33) max ≤ j (cid:44) j ≤ q (cid:12)(cid:12)(cid:12) e (cid:62) j U Θ n U (cid:62) e j (cid:12)(cid:12)(cid:12) = O p q δ i λ i + q √ n + (cid:80) j λ j λ i + (cid:112) q (cid:80) j λ j λ i . (3.34) For the term U Θ n U (cid:62) in (3.30), by considering its ( j , j ) entry for all 1 ≤ j , j ≤ q , we can easily get that (1 + δ i ) U Ξ A U (cid:62) = (1 + δ i ) (cid:20) − p − qn E (cid:8)(cid:101) m θ i (1) (cid:9)(cid:21) I q , (3.35) U Ξ B U (cid:62) = (cid:18) + p − qT (cid:2) − + E (cid:8)(cid:101) m θ i (1) (cid:9)(cid:3)(cid:19) Λ .
(3.36) By the definition of θ i in (2.6), we know 1 − p − qn E (cid:8)(cid:101) m θ i (1) (cid:9) = λ i θ i (cid:18) + p − qT (cid:2) − + E (cid:8)(cid:101) m θ i (1) (cid:9)(cid:3)(cid:19) , which, together with the results in Lemma 4.1 below and Theorem 2.6, yields that (1 + δ i ) (cid:20) − p − qn E (cid:8)(cid:101) m θ i (1) (cid:9)(cid:21) − λ i θ i (cid:18) + p − qT (cid:2) − + E (cid:8)(cid:101) m θ i (1) (cid:9)(cid:3)(cid:19) = δ i (cid:20) − p − qn E (cid:8)(cid:101) m θ i (1) (cid:9)(cid:21) + δ i (cid:20) − p − qn E (cid:8)(cid:101) m θ i (1) (cid:9)(cid:21) = δ i (cid:26) − p − qn + o(1) (cid:27) . (3.37) Combining (3.35)-(3.37) and the definition of Θ n in (3.29), we can get that, for 1 ≤ j ≤ q , e (cid:62) j U Θ n U (cid:62) e (cid:62) j = (cid:40) (1 + δ i ) − λ j λ i (cid:41) (cid:20) − p − qn E (cid:8)(cid:101) m θ i (1) (cid:9)(cid:21) , which converges to zero if and only if λ j = λ i because (1 + δ i ) − λ j /λ i > C > C if λ j (cid:44) λ i under Assumption 2.4. When λ j = λ i , we have e (cid:62) j U Θ n U (cid:62) e (cid:62) j = δ i (cid:26) − p − qn + o(1) (cid:27) . (3.38) Note that all off-diagonal entries of the matrix U Θ n U (cid:62) are zero, i.e. e (cid:62) j U Θ n U (cid:62) e j = , ∀ ≤ j (cid:44) j ≤ q . (3.39) Inserting (3.31), (3.32), (3.33), (3.34), (3.38) and (3.39) into (3.30), we can solve the determinant equation (3.30) and get the limiting distribution of δ i (1 ≤ i ≤ q ) immediately. Since diagonal elements of U Θ n U (cid:62) are at least constant order, when e (cid:62) j U Θ n U (cid:62) e (cid:62) j goes to infinity for some j ’s, we can divide these rows by e (cid:62) j U Θ n U (cid:62) e (cid:62) j . In this way, we can get det O p (1) . . . O p ( ∗ ) . . . O p ( ∗ ) ... . . . ... . . . ... O p ( ∗ ) . . . (cid:98) S i + (1 − y + o p (1)) δ i . . . O p ( ∗ ) ... . . . ... . . . ... O p ( ∗ ) . . . O p ( ∗ ) . . .
O p (1) = √ p (cid:98) S i d −→ N (0 , (cid:101) σ i ) and ∗ = q √ n + (cid:80) j λ j √ n λ i + q δ i λ i + δ i (cid:80) j λ j λ i + δ i (cid:112) q (cid:80) j λ j λ i . By the Leibniz formula for determinants, we can get that (cid:98) S i + (cid:110) − y + o p (1) (cid:111) δ i + q O p (cid:16) ∗ (cid:17) = (cid:98) S i + (cid:110) − y + o p (1) (cid:111) δ i + O p q n + q ( (cid:80) j λ j ) n λ i + q δ i λ i + q δ i ( (cid:80) j λ j ) λ i + q δ i (cid:80) j λ j λ i = . Under Assumptions 2.1 and 2.2(a), we have q = o( n ) and λ − i (cid:80) j λ j = o( q − n ), then it follows that q n = o( n − ) , q (cid:80) j λ j n λ i = o( n − ) , q δ i λ i = o p ( δ i n ) , q δ i ( (cid:80) j λ j ) λ i = o p ( δ i n ) , q δ i (cid:80) j λ j λ i = o p ( δ i n ) . This leads to (cid:98) S i + (cid:110) − y + o p (1) (cid:111) δ i + o p ( δ i n ) + o( n − ) = . By multiplying √ p on both sides, we further obtain that √ p (cid:98) S i + (cid:110) − y + o p (1) (cid:111) · √ p δ i + o p (1) · p δ i + o (1) = . Recalling that √ p (cid:98) S i d −→ N (0 , (cid:101) σ i ), we arrive at √ p δ i d −→ N (0 , σ i ), where σ i = (cid:101) σ i (1 − y ) = ( y + c ) ν i − c − y (1 − y )1 − y . Instead, we can consider the determinant det (cid:26)(cid:101) Λ − U ( θ i Θ n + θ i δ i Θ n + θ i Θ n ) U (cid:62) (cid:101) Λ − (cid:27) = , (3.40) where (cid:101) Λ = diag( θ 1 , . . . , θ q ) ∈ R q × q .
Repeating all the derivations above, we can get (cid:107) (cid:101) Λ − U θ i Θ n U (cid:62) (cid:101) Λ − (cid:107) ∞ = O p q √ n + λ i (cid:80) j λ − j √ n , (3.41) max ≤ j ≤ q (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) e (cid:62) j (cid:101) Λ − θ i U Θ n U (cid:62) (cid:101) Λ − e j − ( y − θ i θ j (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) = O p δ i (cid:115)(cid:88) j λ − j + λ i (cid:113)(cid:80) j λ − j √ n + √ q λ i + (cid:115)(cid:88) j λ − j , (3.42) max ≤ j (cid:44) j ≤ q (cid:12)(cid:12)(cid:12)(cid:12) e (cid:62) j (cid:101) Λ − θ i U Θ n U (cid:62) (cid:101) Λ − e j (cid:12)(cid:12)(cid:12)(cid:12) = O p δ i (cid:88) j λ − j + λ i (cid:80) j λ − j √ n + q λ i + (cid:115) q (cid:88) j λ − j , (3.43) e (cid:62) j (cid:101) Λ − θ i U Θ n U (cid:62) (cid:101) Λ − e (cid:62) j = (cid:40) (1 + δ i ) − λ j λ i (cid:41) (cid:20) − p − qn E (cid:8)(cid:101) m θ i (1) (cid:9)(cid:21) , (3.44) e (cid:62) j (cid:101) Λ − θ i U Θ n U (cid:62) (cid:101) Λ − e j = , ∀ ≤ j (cid:44) j ≤ q . (3.45) Inserting (3.41)-(3.45) into (3.40), we can similarly prove √ p δ i d −→ N (0 , σ i ) under Assumption 2.2(b). Thus the proof is completed. (cid:3) Proof of Theorem 2.12.
The proof of Theorem 2.12 is similar to that of Theorem 2.10; the only difference is that we take the J i × J i block as the typical object to analyse. The required lemmas follow from Lemmas 4.2-4.4 below, and arguments similar to those used to prove Theorem 4.1 in [17] apply. We therefore omit the details. (cid:3) Lemma 4.1.
Suppose that Assumptions 2.1 and 2.3 hold. For any θ → ∞ , we have (cid:101) m θ (1) − 1 = O a . s . ( θ − 1 ). Proof of Lemma 4.1.
By the definition of (cid:101) m θ ( z ) in (2.2), (cid:101) m θ (1) = 1 p − q tr (cid:32) I p − q − F θ (cid:33) − 1 = 1 + 1 p − q tr F θ (cid:32) I p − q − F θ (cid:33) − 1 , we have (cid:101) m θ (1) − 1 = 1 p − q tr F θ (cid:32) I p − q − F θ (cid:33) − 1 = θ − 1 1 p − q (cid:88) 1 ≤ j ≤ p − q µ j 1 − µ j /θ . Since all the eigenvalues of F , namely µ 1 ≥ . . . ≥ µ p − q , are almost surely bounded, we can get that (cid:101) m θ (1) − 1 = O a . s . ( θ − 1 ). (cid:3) Recall that e i is the q -dimensional vector whose i -th element is 1 and others are 0, and that U (cid:62) = ( u 1 , u 2 , . . . , u q ), where u i ∈ R q is the i -th column of the matrix U (cid:62) . Then we get the following lemma. Lemma 4.2. For any fixed 1 ≤ i ≤ q , denote G ni = √ p U Θ n U (cid:62) . Under the assumptions of Theorem 2.10, we have e (cid:62) i G ni e i d −→ N (0 , (cid:101) σ i ) , where (cid:101) σ i = ( y + c ) (1 − y ) ν i − y (1 − y ) (1 − y ) − c (1 − y ) and ν i = E | u (cid:62) i Z e | for 1 ≤ i ≤ q . Proof of Lemma 4.2.
From the definition of Θ n in (3.27) and the fact that Y = Σ − X = U (cid:62) Λ − UX , we have the decomposition e (cid:62) i G ni e i = u (cid:62) i (cid:104) (1 + δ i ) √ pn Z { I n − A ( θ i ) } Z (cid:62) − λ i θ i √ pT Y { I T + B ( θ i ) } Y (cid:62) + √ p λ i n Z C ( θ i ) Y (cid:62) + √ p λ i T Y D ( θ i ) Z (cid:62) (cid:105) u i − E( · ) , (4.1) where E[ · ] is the expectation of all the preceding terms after the equal sign. By Theorem 2.6, δ i converges in probability to 0, thus we only need to consider the limit of e (cid:62) i (cid:101) G ni e i : = u (cid:62) i (cid:104) √ pn Z { I n − A ( θ i ) } Z (cid:62) − λ i θ i √ pT Y { I T + B ( θ i ) } Y (cid:62) + √ p λ i n Z C ( θ i ) Y (cid:62) + √ p λ i T Y D ( θ i ) Z (cid:62) (cid:105) u i − E[ · ] . For the first two terms, Theorem 7.2 in [3] implies that, for any 1 ≤ i ≤ q , 1 √ n (cid:104) u (cid:62) i Z { I n − A ( θ i ) } Z (cid:62) u i − tr { I n − A ( θ i ) } (cid:105) d −→ N (0 , (cid:101) σ i A ) , √ T (cid:104) u (cid:62) i Y { I T + B ( θ i ) } Y (cid:62) u i − tr { I T + B ( θ i ) } (cid:105) d −→ N (0 , (cid:101) σ i B ) , with (cid:101) σ iA = ω I n − A ( θ i ) ( ν i − + β I n − A ( θ i ) and (cid:101) σ iB = ω I T + B ( θ i ) ( ν i − + β I T + B ( θ i ) , where ν i = E | u (cid:62) i Z e | = E | u (cid:62) i Y e | ,ω I n − A ( θ i ) = lim n →∞ n (cid:88) ≤ k ≤ n [ { I n − A ( θ i ) } ( k , k )] ,β I n − A ( θ i ) = lim n →∞ n tr { I n − A ( θ i ) } ,ω I T + B ( θ i ) and β I T + B ( θ i ) are similarly defined. Here the fact that E | u (cid:62) i Z e | = E | u (cid:62) i Y e | is implied by Assumption 2.3.
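The quadratic-form CLT invoked above (Theorem 7.2 in [3]) can be seen in its simplest instance: with A = I_n, Gaussian entries, and a fixed unit vector u, the vector z = Z^⊤u has iid N(0,1) entries, so u^⊤ZZ^⊤u is a sum of n iid squared normals and (1/√n)(u^⊤ZZ^⊤u − n) → N(0, 2). A numerical sketch of this special case (sample sizes and seed are arbitrary demo choices):

```python
import numpy as np

# Simplest case of the CLT for quadratic forms u^T Z A Z^T u, taking A = I_n:
# z = Z^T u ~ N(0, I_n), so (z'z - n)/sqrt(n) -> N(0, 2) as n grows.
rng = np.random.default_rng(4)
n, reps = 2000, 4000
samples = np.empty(reps)
for r in range(reps):
    z = rng.standard_normal(n)
    samples[r] = (z @ z - n) / np.sqrt(n)
mean_hat, var_hat = samples.mean(), samples.var()   # should be ~0 and ~2
```

The general statement replaces I_n by the data-dependent matrices I_n − A(θᵢ) and I_T + B(θᵢ), which is where the constants ω, β, and the fourth-moment parameter νᵢ enter the limiting variances.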
Based on the facts that E (cid:104) u (cid:62) i Z { I n − A ( θ i ) } Z (cid:62) u i (cid:105) = E (cid:16) tr (cid:104) Z (cid:62) u i u (cid:62) i Z { I n − A ( θ i ) } (cid:105)(cid:17) = tr (cid:104) E (cid:16) Z (cid:62) u i u (cid:62) i Z (cid:17) E { I n − A ( θ i ) } (cid:105) = E [tr { I n − A ( θ i ) } ] = n − ( p − q )E (cid:8)(cid:101) m θ i (1) (cid:9) , and that (cid:101) m θ i (1) − E (cid:8)(cid:101) m θ i (1) (cid:9) = O p ( n − ), we can get that 1 √ n (cid:16) E (cid:104) u (cid:62) i Z { I n − A ( θ i ) } Z (cid:62) u (cid:62) i (cid:105) − tr { I n − A ( θ i ) } (cid:17) = o p (1) . Hence 1 √ n (cid:16) u (cid:62) i Z { I n − A ( θ i ) } Z (cid:62) u i − E (cid:104) u (cid:62) i Z { I n − A ( θ i ) } Z (cid:62) u (cid:62) i (cid:105)(cid:17) d −→ N (0 , (cid:101) σ i A ) , and similarly, 1 √ T (cid:16) u (cid:62) i Y { I T + B ( θ i ) } Y (cid:62) u i − E (cid:104) u (cid:62) i Y { I T + B ( θ i ) } Y (cid:62) u i (cid:105)(cid:17) d −→ N (0 , (cid:101) σ i B ) . For the other two terms, by the same approach as in the proof of Theorem 2.6, we have that u (cid:62) i (cid:40) √ p λ i n Z C ( θ i ) Y (cid:62) + √ p λ i T Y D ( θ i ) Z (cid:62) (cid:41) u i = O p (cid:32) √ λ i (cid:33) . By all these arguments above, we can derive that e (cid:62) i G ni e i d −→ N (0 , (cid:101) σ i ) with (cid:101) σ i = y σ i A + c (1 − y ) σ i B . We compute ω I n − A ( θ i ) , β I n − A ( θ i ) , ω I T + B ( θ i ) and β I T + B ( θ i ) in the following.
By the derivations in the proof of Lemma 6 in [17], { I n − A ( θ i ) } ( k , k ) = − Z (cid:62) M ( θ i ) − (cid:32) n Z Z (cid:62) (cid:33) − n Z ( k , k ) = − θ i n Z (cid:62) (cid:32) θ i · n Z Z (cid:62) − T X X (cid:62) (cid:33) − Z ( k , k ) = + θ i n (cid:26) η (cid:62) k (cid:16) θ i n Z k Z (cid:62) k − T X X (cid:62) (cid:17) − η k (cid:27) , where η k is the k -th column of Z and Z k is defined by removing the k -th column of Z . Note that (cid:32) n Z k Z (cid:62) k − θ i T X X (cid:62) (cid:33) − − (cid:32) n Z k Z (cid:62) k (cid:33) − = (cid:32) n Z i Z (cid:62) i − θ i T X X (cid:62) (cid:33) − (cid:40) n Z k Z (cid:62) k − (cid:32) n Z k Z (cid:62) k − θ i T X X (cid:62) (cid:33)(cid:41) (cid:32) n Z k Z (cid:62) k (cid:33) − = θ − i (cid:32) n Z i Z (cid:62) i − θ i T X X (cid:62) (cid:33) − (cid:32) T X X (cid:62) (cid:33) (cid:32) n Z k Z (cid:62) k (cid:33) − (4.2) and 1 p − q − (cid:32) n Z k Z (cid:62) k (cid:33) − = S MP (0) + O p (cid:16) p − (cid:17) = − y + O p (cid:16) p − (cid:17) , (4.3) where S MP denotes the Stieltjes transform of the Marcenko-Pastur law. Then we have that 1 p − q θ i E tr (cid:32) θ i n Z k Z (cid:62) k − T X X (cid:62) (cid:33) − = E p − q tr (cid:32) n Z k Z (cid:62) k (cid:33) − + O a . s . (cid:16) θ − i (cid:17) = E (cid:40) − y + O a . s . (cid:16) θ − i (cid:17) + O p (cid:16) p − (cid:17)(cid:41) → − y .
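The constant behind (4.3) is the Marcenko–Pastur Stieltjes transform at the origin: for Z of size m × n with iid standardized entries and y = m/n < 1, the normalized trace (1/m) tr[(ZZ^⊤/n)^{-1}] converges to 1/(1 − y). A quick numerical check (the dimensions and seed are arbitrary demo choices; for Gaussian entries the exact expectation n/(n − m − 1) is also known):

```python
import numpy as np

# Marcenko-Pastur at z = 0: (1/m) tr[(Z Z^T / n)^{-1}] -> 1/(1 - y), y = m/n < 1.
rng = np.random.default_rng(5)
m, n = 300, 1000                                  # y = 0.3
Z = rng.standard_normal((m, n))
W = Z @ Z.T / n                                   # sample covariance, true cov = I
trace_inv = np.trace(np.linalg.inv(W)) / m
limit = 1.0 / (1.0 - m / n)                       # ~ 1.4286
print(trace_inv, limit)
```

This is the quantity the proof feeds into the trace expansion for {I_n − A(θᵢ)}(k, k) once the θᵢ⁻¹-order correction from the X-block is peeled off.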
By Lemma A.2 in [17], it holds that { I n − A ( θ i ) } ( k , k ) → + y (1 − y ) − = − y , which implies ω I n − A ( θ i ) = lim n →∞ n (cid:88) ≤ k ≤ n [ { I n − A ( θ i ) } ( k , k )] = (1 − y ) . By a similar argument, we can obtain that ω I T + B ( θ i ) = lim T →∞ T (cid:88) ≤ k ≤ T [ { I T + B ( θ i ) } ( k , k )] = . Now we come to the calculation of β I n − A ( θ i ) and β I T + B ( θ i ) . Since θ i → + ∞ as n goes to infinity, we have lim n →∞ (cid:90) ∞−∞ θ i θ i − x dF n ( x ) = , lim n →∞ (cid:90) ∞−∞ θ i ( θ i − x ) dF n ( x ) = , lim T →∞ (cid:90) ∞−∞ x θ i − x dF n ( x ) = , lim T →∞ (cid:90) ∞−∞ x ( θ i − x ) dF n ( x ) = . Then these calculations lead to β I n − A ( θ i ) = lim n →∞ n tr { I n − A ( θ i ) } = lim n →∞ n tr (cid:110) I n − A ( θ i ) + A ( θ i ) (cid:111) = − n →∞ (cid:32) p − qn (cid:90) ∞−∞ θ i θ i − x dF n ( x ) (cid:33) + lim n →∞ p − qn (cid:90) ∞−∞ θ i ( θ i − x ) dF n ( x ) = − y + y = − y ,β I T + B ( θ i ) = lim T →∞ T tr { I T + B ( θ i ) } = lim T →∞ T tr (cid:110) I T + B ( θ i ) + B ( θ i ) (cid:111) = + T →∞ (cid:40) p − qT (cid:90) ∞−∞ x θ i − x dF n ( x ) (cid:41) + lim T →∞ (cid:40) p − qT (cid:90) ∞−∞ x ( θ i − x ) dF n ( x ) (cid:41) = + + = . Thus, we can write (cid:101) σ i = y (cid:101) σ i A + c (1 − y ) (cid:101) σ i B = y { ω I n − A ( θ i ) ( ν i − + β I n − A ( θ i ) } + c (1 − y ) { ω I T + B ( θ i ) ( ν i − + β I T + B ( θ i ) } = ( y + c ) (1 − y ) ν i − y (1 − y ) (1 − y ) − c (1 − y ) . The proof is completed. (cid:3)
Lemma 4.3.
Under the assumptions of Theorem 2.10, (cid:107) U Θ n U (cid:62) (cid:107) ∞ = O p (cid:32) q √ n + (cid:80) j λ j √ n λ i (cid:33) . Proof of Lemma 4.3. By the definition of Θ n in (3.27) again, we know Θ n = (1 + δ i ) n Z { I n − A ( θ i ) } Z (cid:62) − θ − i T X { I T + B ( θ i ) } X (cid:62) + n Z C ( θ i ) X (cid:62) + T X D ( θ i ) Z (cid:62) − E( · ) , (4.4) where E( · ) is the expectation of all the preceding terms. Denote η n = n Z { I n − A ( θ i ) } Z (cid:62) − E (cid:34) n Z { I n − A ( θ i ) } Z (cid:62) (cid:35) ,η n = T Y { I T + B ( θ i ) } Y (cid:62) − E (cid:34) T Y { I T + B ( θ i ) } Y (cid:62) (cid:35) ,η n = (cid:112) θ i n Z C ( θ i ) Y (cid:62) − E (cid:40) (cid:112) θ i n Z C ( θ i ) Y (cid:62) (cid:41) ,η n = (cid:112) θ i T Y D ( θ i ) Z (cid:62) − E (cid:40) (cid:112) θ i T Y D ( θ i ) Z (cid:62) (cid:41) . By the fact X = U (cid:62) Λ UY , we can write U Θ n U (cid:62) : = (cid:88) i = V ni , (4.5) where V n = (1 + δ i ) U (cid:32) n Z { I n − A ( θ i ) } Z (cid:62) − E (cid:34) n Z { I n − A ( θ i ) } Z (cid:62) (cid:35)(cid:33) U (cid:62) = (1 + δ i ) U η n U (cid:62) , (4.6) V n = − θ − i U (cid:32) T X { I T + B ( θ i ) } X (cid:62) − E (cid:34) T X { I T + B ( θ i ) } X (cid:62) (cid:35)(cid:33) U (cid:62) = − θ − i Λ U (cid:32) T Y { I T + B ( θ i ) } Y (cid:62) − E (cid:34) T Y { I T + B ( θ i ) } Y (cid:62) (cid:35)(cid:33) U (cid:62) Λ = − θ − i Λ U η n U (cid:62) Λ , (4.7) V n = U (cid:34) n Z C ( θ i ) X (cid:62) − E (cid:40) n Z C ( θ i ) X (cid:62) (cid:41)(cid:35) U (cid:62) = U (cid:34) n Z C ( θ i ) Y (cid:62) − E (cid:40) n Z C ( θ i ) Y (cid:62) (cid:41)(cid:35) U (cid:62) Λ = U (cid:34) n Z C ( θ i ) Y (cid:62) − E (cid:40) n Z C ( θ i ) Y (cid:62) (cid:41)(cid:35) U (cid:62) Λ = θ − i U η n U (cid:62) Λ (4.8) V n = U (cid:34) T X D ( θ i ) Z (cid:62) − E (cid:40) T X D ( θ i ) Z (cid:62) (cid:41)(cid:35) U (cid:62) = θ − i Λ U η n U (cid:62) .
(4.9)Similarly as the arguments in the proof of Lemma 4.2, it holds that, for 1 ≤ j , j ≤ q , e (cid:62) j η n e j = O p (cid:32) √ n (cid:33) , e (cid:62) j η n e j = O p (cid:32) √ n (cid:33) , (cid:62) j η n e j = O p (cid:32) √ n λ i (cid:33) , e (cid:62) j η n e j = O p (cid:32) √ n λ i (cid:33) . Noting that U is an orthogonal matrix, we have that e (cid:62) j V n e j = e (cid:62) j (1 + δ i ) U η n U (cid:62) e j = O p (cid:32) √ n (cid:33) , e (cid:62) j V n e j = − e (cid:62) j θ − i Λ U η n U (cid:62) Λ e j = λ − i λ j λ j · O p (cid:32) √ n (cid:33) , e (cid:62) j V n e j = e (cid:62) j θ − i U η n U (cid:62) Λ e j = λ − i λ j · O p (cid:32) √ n (cid:33) , e (cid:62) j V n e j = e (cid:62) j θ − i Λ U η n U (cid:62) e j = λ − i λ j · O p (cid:32) √ n (cid:33) . Then by Chebyshev’s inequality, we can deduce that (cid:107) V n (cid:107) ∞ = O p (cid:32) q √ n (cid:33) , (cid:107) V n (cid:107) ∞ = O p (cid:32) (cid:80) j λ j √ n λ i (cid:33) , (cid:107) V n + V n (cid:107) ∞ = O p (cid:112) q (cid:80) j λ j √ n λ i , where (cid:112) q (cid:80) j λ j = o( (cid:80) j λ j ). Thus we complete the proof by (4.5). (cid:3) Lemma 4.4.
Under the assumptions of Theorem 2.10,max ≤ j ≤ q (cid:12)(cid:12)(cid:12) e (cid:62) j U Θ n U (cid:62) e j − ( y − (cid:12)(cid:12)(cid:12) = O p √ q δ i λ i + √ q √ n + (cid:113)(cid:80) j λ j λ i + (cid:112)(cid:80) j λ j λ i , (4.10)max ≤ j (cid:44) j ≤ q | e (cid:62) j U Θ n U (cid:62) e j | = O p q δ i λ i + q √ n + (cid:80) j λ j λ i + (cid:112) q (cid:80) j λ j λ i . (4.11) Proof of Lemma 4.4.
Recall the definition of Θ n in (3.28): Θ n = (1 + δ i ) 1 n Z Z (cid:62) M − ( (cid:98) λ i ) M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − n Z Z (cid:62) − (1 + δ i ) 1 n Z Z (cid:62) + (cid:98) λ i θ i T X X (cid:62) M − ( (cid:98) λ i ) M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − T X X (cid:62) − (1 + δ i ) (cid:98) λ i n Z Z (cid:62) M − ( (cid:98) λ i ) M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − T X X (cid:62) + n Z C ( θ i ) X (cid:62) − (1 + δ i ) (cid:98) λ i T X X (cid:62) M − ( (cid:98) λ i ) M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − n Z Z (cid:62) + T X D ( θ i ) Z (cid:62) . Noting that M − ( (cid:98) λ i ) − M − ( θ i ) = M − ( (cid:98) λ i ) (cid:110) M ( θ i ) − M ( (cid:98) λ i ) (cid:111) M − ( θ i ) = − δ i (cid:98) λ i M − ( (cid:98) λ i ) F M − ( θ i ) , we decompose the first term in Θ n as1 n Z Z (cid:62) M − ( (cid:98) λ i ) M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − n Z Z (cid:62) n Z Z (cid:62) (cid:110) M − ( (cid:98) λ i ) − M − ( θ i ) (cid:111) M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − n Z Z (cid:62) + n Z Z (cid:62) M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − n Z Z (cid:62) = − δ i (cid:98) λ i n Z Z (cid:62) M − ( (cid:98) λ i ) F M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − n Z Z (cid:62) + n Z Z (cid:62) M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − n Z Z (cid:62) . 
On one hand, similar to the arguments in the proof of Theorem 2.6, we can derive thatmax ≤ j ≤ q (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) e (cid:62) j δ i (cid:98) λ i n Z Z (cid:62) M − ( (cid:98) λ i ) F M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − n Z Z (cid:62) e j (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) = O p (cid:32) √ q δ i λ i (cid:33) , max ≤ j (cid:44) j ≤ q (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) e (cid:62) j δ i (cid:98) λ i n Z Z (cid:62) M − ( (cid:98) λ i ) F M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − n Z Z (cid:62) e j (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) = O p (cid:32) q δ i λ i (cid:33) . On the other hand, similar to the proof of Lemma 4.2, we can get that1 n e (cid:62) j Z Z (cid:62) M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − n Z Z (cid:62) e j − E (cid:104) tr (cid:110) M − ( θ i ) (cid:111)(cid:105) = O p (cid:32) √ n (cid:33) , n e (cid:62) j Z Z (cid:62) M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − n Z Z (cid:62) e j = O p (cid:32) √ n (cid:33) , where n E (cid:110) tr M − ( θ i ) (cid:111) → y . It follows thatmax ≤ j ≤ q (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n e (cid:62) j (1 + δ i ) Z Z (cid:62) M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − n Z Z (cid:62) e j − n E (cid:104) tr (cid:110) M − ( θ i ) (cid:111)(cid:105)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) = O p (cid:32) √ q √ n (cid:33) max ≤ j (cid:44) j ≤ q (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) n e (cid:62) j (1 + δ i ) Z Z (cid:62) M − ( θ i ) (cid:32) n Z Z (cid:62) (cid:33) − n Z Z (cid:62) e j (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) = O p (cid:32) q √ n (cid:33) . 
Similarly, we can get the following for the other terms:
\begin{align*}
\max_{1\le j\le q}\Big|e_j^{\top}(1+\delta_i)\frac{1}{n}ZZ^{\top}e_j-(1+\delta_i)\Big| &= O_p\bigg(\frac{\sqrt{q}}{\sqrt{n}}\bigg),\\
\max_{1\le j_1\ne j_2\le q}\Big|e_{j_1}^{\top}(1+\delta_i)\frac{1}{n}ZZ^{\top}e_{j_2}\Big| &= O_p\bigg(\frac{q}{\sqrt{n}}\bigg),\\
\max_{1\le j\le q}\bigg|e_j^{\top}\frac{\hat{\lambda}_i}{\theta_i}\frac{1}{T}XX^{\top}M^{-1}(\hat{\lambda}_i)M^{-1}(\theta_i)\Big(\frac{1}{n}ZZ^{\top}\Big)^{-1}\frac{1}{T}XX^{\top}e_j\bigg| &= O_p\bigg(\frac{\sqrt{\sum_j\lambda_j}}{\lambda_i}\bigg),\\
\max_{1\le j_1\ne j_2\le q}\bigg|e_{j_1}^{\top}\frac{\hat{\lambda}_i}{\theta_i}\frac{1}{T}XX^{\top}M^{-1}(\hat{\lambda}_i)M^{-1}(\theta_i)\Big(\frac{1}{n}ZZ^{\top}\Big)^{-1}\frac{1}{T}XX^{\top}e_{j_2}\bigg| &= O_p\bigg(\frac{\sum_j\lambda_j}{\lambda_i}\bigg),\\
\max_{1\le j\le q}\bigg|e_j^{\top}(1+\delta_i)\hat{\lambda}_i\frac{1}{n}ZZ^{\top}M^{-1}(\hat{\lambda}_i)M^{-1}(\theta_i)\Big(\frac{1}{n}ZZ^{\top}\Big)^{-1}\frac{1}{T}XX^{\top}e_j\bigg| &= O_p\bigg(\frac{\sqrt{\sum_j\lambda_j}}{\lambda_i}\bigg),\\
\max_{1\le j_1\ne j_2\le q}\bigg|e_{j_1}^{\top}(1+\delta_i)\hat{\lambda}_i\frac{1}{n}ZZ^{\top}M^{-1}(\hat{\lambda}_i)M^{-1}(\theta_i)\Big(\frac{1}{n}ZZ^{\top}\Big)^{-1}\frac{1}{T}XX^{\top}e_{j_2}\bigg| &= O_p\bigg(\frac{\sqrt{q\sum_j\lambda_j}}{\lambda_i}\bigg),\\
\max_{1\le j\le q}\Big|e_j^{\top}\frac{1}{n}ZC(\theta_i)X^{\top}e_j\Big| &= O_p\bigg(\frac{\sqrt{\sum_j\lambda_j}}{\sqrt{n}\,\lambda_i}\bigg),\\
\max_{1\le j_1\ne j_2\le q}\Big|e_{j_1}^{\top}\frac{1}{n}ZC(\theta_i)X^{\top}e_{j_2}\Big| &= O_p\bigg(\frac{\sqrt{q\sum_j\lambda_j}}{\sqrt{n}\,\lambda_i}\bigg),\\
\max_{1\le j\le q}\bigg|e_j^{\top}(1+\delta_i)\hat{\lambda}_i\frac{1}{T}XX^{\top}M^{-1}(\hat{\lambda}_i)M^{-1}(\theta_i)\Big(\frac{1}{n}ZZ^{\top}\Big)^{-1}\frac{1}{n}ZZ^{\top}e_j\bigg| &= O_p\bigg(\frac{\sqrt{\sum_j\lambda_j}}{\lambda_i}\bigg),\\
\max_{1\le j_1\ne j_2\le q}\bigg|e_{j_1}^{\top}(1+\delta_i)\hat{\lambda}_i\frac{1}{T}XX^{\top}M^{-1}(\hat{\lambda}_i)M^{-1}(\theta_i)\Big(\frac{1}{n}ZZ^{\top}\Big)^{-1}\frac{1}{n}ZZ^{\top}e_{j_2}\bigg| &= O_p\bigg(\frac{\sqrt{q\sum_j\lambda_j}}{\lambda_i}\bigg),\\
\max_{1\le j\le q}\Big|e_j^{\top}\frac{1}{T}XD(\theta_i)Z^{\top}e_j\Big| &= O_p\bigg(\frac{\sqrt{\sum_j\lambda_j}}{\sqrt{n}\,\lambda_i}\bigg),\\
\max_{1\le j_1\ne j_2\le q}\Big|e_{j_1}^{\top}\frac{1}{T}XD(\theta_i)Z^{\top}e_{j_2}\Big| &= O_p\bigg(\frac{\sqrt{q\sum_j\lambda_j}}{\sqrt{n}\,\lambda_i}\bigg).
\end{align*}
Thus, all these inequalities lead to
\begin{align*}
\max_{1\le j\le q}\Big|e_j^{\top}\Theta_n e_j-(y-1)\Big| &= O_p\bigg(\frac{\sqrt{q}\,\delta_i}{\lambda_i}+\frac{\sqrt{q}}{\sqrt{n}}+\frac{\sqrt{\sum_j\lambda_j}}{\lambda_i}\bigg),\\
\max_{1\le j_1\ne j_2\le q}\Big|e_{j_1}^{\top}\Theta_n e_{j_2}\Big| &= O_p\bigg(\frac{q\,\delta_i}{\lambda_i}+\frac{q}{\sqrt{n}}+\frac{\sum_j\lambda_j}{\lambda_i}+\frac{\sqrt{q\sum_j\lambda_j}}{\lambda_i}\bigg).
\end{align*}
The proof is completed. $\Box$
Acknowledgments
The authors gratefully acknowledge a grant from the University Grants Council of Hong Kong and an NSFC grant (NSFC11671042). Drs. Xie and Zeng are co-first authors.
References

[1] Zhidong Bai and Jack W. Silverstein. No eigenvalues outside the support of the limiting spectral distribution of large-dimensional sample covariance matrices. Ann. Probab., 26(1):316–345, 1998.

[2] Zhidong Bai and Jack W. Silverstein. Spectral analysis of large dimensional random matrices. Springer Series in Statistics. Springer, New York, second edition, 2010.

[3] Zhidong Bai and Jianfeng Yao. Central limit theorems for eigenvalues in a spiked population model. Ann. Inst. Henri Poincaré Probab. Stat., 44(3):447–474, 2008.

[4] Alex Bloemendal, Antti Knowles, Horng-Tzer Yau, and Jun Yin. On the principal components of sample covariance matrices. Probab. Theory Relat. Fields, 164(1-2):459–552, 2016.

[5] T. Tony Cai, Xiao Han, and Guangming Pan. Limiting laws for divergent spiked eigenvalues and largest nonspiked eigenvalue of sample covariance matrices. Ann. Stat., 48(3):1255–1280, 2020.

[6] Prathapasinghe Dharmawansa, Iain M. Johnstone, and Alexei Onatski. Local asymptotic normality of the spectrum of high-dimensional spiked F-ratios. arXiv preprint arXiv:1411.3875, 2014.

[7] Xiao Han, Guangming Pan, and Qing Yang. A unified matrix model including both CCA and F matrices in multivariate analysis: the largest eigenvalue and its applications. Bernoulli, 24(4B):3447–3468, 2018.

[8] Xiao Han, Guangming Pan, and Bo Zhang. The Tracy–Widom law for the largest eigenvalue of F type matrices. Ann. Stat., 44(4):1564–1592, 2016.

[9] Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press, Cambridge, second edition, 2013.

[10] Iain M. Johnstone. On the distribution of the largest eigenvalue in principal components analysis. Ann. Stat., 29(2):295–327, 2001.

[11] Noureddine El Karoui and Elizabeth Purdom. The non-parametric bootstrap and spectral analysis in moderate and high-dimension. Volume 89 of Proceedings of Machine Learning Research, pages 2115–2124. PMLR, 2019.

[12] Debashis Paul and Alexander Aue. Random matrix theory in statistics: A review. J. Stat. Plan. Infer., 150:1–29, 2014.

[13] Jack W. Silverstein. The limiting eigenvalue distribution of a multivariate F matrix. SIAM J. Math. Anal., 16(3):641–646, 1985.

[14] Jack W. Silverstein. Strong convergence of the empirical distribution of eigenvalues of large dimensional random matrices. J. Multivar. Anal., 55(2):331–339, 1995.

[15] Jack W. Silverstein and Zhidong Bai. On the empirical distribution of eigenvalues of a class of large dimensional random matrices. J. Multivar. Anal., 54(2):175–192, 1995.

[16] Kenneth W. Wachter. The limiting empirical measure of multiple discriminant ratios. Ann. Stat., 8(5):937–957, 1980.

[17] Qinwen Wang and Jianfeng Yao. Extreme eigenvalues of large-dimensional spiked Fisher matrices with application. Ann. Stat., 45(1):415–460, 2017.

[18] John Wishart. The generalised product moment distribution in samples from a normal multivariate population. Biometrika, 20(1/2):32–52, 1928.

[19] Jianfeng Yao, Shurong Zheng, and Zhidong Bai. Large Sample Covariance Matrices and High-Dimensional Data Analysis, volume 39 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 2015.