A Note on Central Limit Theorems for Linear Spectral Statistics of Large Dimensional F-matrix
Shurong Zheng and Zhidong Bai
KLAS and School of Mathematics & Statistics, Northeast Normal University, P.R. China
E-mails: [email protected]; [email protected]
Abstract.
Sample covariance matrices and the multivariate F-matrix play important roles in multivariate statistical analysis. The central limit theorems (CLTs) of linear spectral statistics associated with these matrices were established in Bai and Silverstein (2004) and Zheng (2012); these results have received considerable attention and have been applied to many large dimensional statistical problems. However, the sample covariance matrices used in those papers are not centralized, and some questions have been raised about the CLTs defined by centralized sample covariance matrices. In this note, we provide short complements to the CLTs in Bai and Silverstein (2004) and Zheng (2012), and show that the results in these two papers remain valid for centralized sample covariance matrices, provided that the ratios of the dimension $p$ to the sample sizes $(n, n_1, n_2)$ are redefined as $p/(n-1)$ and $p/(n_i-1)$, $i = 1, 2$, respectively.
Key words and phrases.
Linear spectral statistics, central limit theorem, centralized sample covariance matrix, centralized F-matrix, simplified sample covariance matrix, simplified F-matrix.

1. Introduction

Let $\{X_{jk}, j, k = 1, 2, \cdots\}$ and $\{Y_{jk}, j, k = 1, 2, \cdots\}$ be two independent double arrays of independent random variables, either both real or both complex. In the sequel, we use $A^*$ to denote the complex conjugate transpose of a vector or matrix $A$. For $p, n, N > 1$, we define $X = (X_1, \cdots, X_n)$ and $Y = (Y_1, \cdots, Y_N)$ with column vectors $X_j = (X_{1j}, \ldots, X_{pj})'$, $1 \le j \le n$, and $Y_k = (Y_{1k}, \ldots, Y_{pk})'$, $1 \le k \le N$. Let $T_p$ be a $p \times p$ non-negative definite (nnd) matrix. There exists a unique nnd matrix $T_p^{1/2}$ such that $T_p = (T_p^{1/2})^2$. Then $(T_p^{1/2}X_1, \cdots, T_p^{1/2}X_n)$ and $(T_p^{1/2}Y_1, \cdots, T_p^{1/2}Y_N)$ can be considered as two independent samples of sizes $n$ and $N$, respectively, drawn from a $p$-dimensional population with population covariance matrix $T_p$.

It is well known that the sample covariance matrices for $T_p^{1/2}X$ and $T_p^{1/2}Y$ are often defined as
$$S_x = \frac{1}{n-1}\Bigg(\sum_{i=1}^n T_p^{1/2}X_iX_i^*T_p^{1/2} - nT_p^{1/2}\bar X\bar X^*T_p^{1/2}\Bigg), \eqno(1.1)$$
$$S_y = \frac{1}{N-1}\Bigg(\sum_{i=1}^N T_p^{1/2}Y_iY_i^*T_p^{1/2} - NT_p^{1/2}\bar Y\bar Y^*T_p^{1/2}\Bigg), \eqno(1.2)$$
respectively, where $\bar X = \frac{1}{n}\sum_{i=1}^n X_i$ and $\bar Y = \frac{1}{N}\sum_{i=1}^N Y_i$. The multivariate F-matrix is then defined as
$$F = S_xS_y^{-1}. \eqno(1.3)$$
Since the matrices defined in (1.1)-(1.3) are invariant under translations of the observations, we will call them the centralized sample covariance matrices and the centralized multivariate F-matrix, respectively.

Due to Corollary A.41 and Theorem A.43 of Bai and Silverstein (2009), in the literature of random matrix theory the sample covariance matrices are usually simplified as
$$B_x = \frac{1}{n}\sum_{i=1}^n T_p^{1/2}X_iX_i^*T_p^{1/2} \quad\text{and}\quad B_y = \frac{1}{N}\sum_{i=1}^N T_p^{1/2}Y_iY_i^*T_p^{1/2}, \eqno(1.4)$$
and the multivariate F-matrix is simplified as $G = B_xB_y^{-1}$.

Bai and Silverstein (2004) considered the central limit theorem (CLT) of the linear spectral statistics (LSS) of the simplified sample covariance matrix $B_x$ and provided explicit expressions for the asymptotic means and covariance functions. Later, Zheng (2012) extended this work to the multivariate F-matrix $G$ and obtained explicit expressions for the asymptotic means, variances, and covariances for $G$.
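The definitions in (1.1)-(1.4) can be made concrete with a few lines of code. The following sketch is illustrative only; real Gaussian data, $T_p = I_p$, and the small sizes with $p < N$ are assumptions of the example, not of the paper:

```python
import numpy as np

# Illustrative instantiation of (1.1)-(1.4); T_p = I_p, real Gaussian entries,
# and the sizes with p < N are assumptions chosen so that S_y is invertible.
rng = np.random.default_rng(0)
p, n, N = 5, 40, 60

X = rng.standard_normal((p, n))   # columns are X_1, ..., X_n
Y = rng.standard_normal((p, N))   # columns are Y_1, ..., Y_N

def centralized_cov(Z):
    """S in (1.1)-(1.2): (sum_i Z_i Z_i^* - m * Zbar Zbar^*) / (m - 1)."""
    m = Z.shape[1]
    Zbar = Z.mean(axis=1, keepdims=True)
    return (Z @ Z.T - m * Zbar @ Zbar.T) / (m - 1)

def simplified_cov(Z):
    """B in (1.4): (1/m) sum_i Z_i Z_i^*."""
    return Z @ Z.T / Z.shape[1]

S_x, S_y = centralized_cov(X), centralized_cov(Y)
B_x, B_y = simplified_cov(X), simplified_cov(Y)
F = S_x @ np.linalg.inv(S_y)      # centralized F-matrix (1.3)
G = B_x @ np.linalg.inv(B_y)      # simplified F-matrix

# The centralized matrix is exactly the usual unbiased sample covariance.
assert np.allclose(S_x, np.cov(X))   # np.cov centers and divides by m - 1
```

Here `np.cov` serves as an independent check that (1.1) is the familiar unbiased sample covariance; the eigenvalues of `F` and `G` carry the linear spectral statistics that the CLTs below concern.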
Examining the inequalities derived from Corollary A.41 and Theorem A.43 of Bai and Silverstein (2009), one finds that the difference between the empirical spectral distributions (ESD) of $S_x$ and $B_x$ is of the order $O(n^{-1})$. Hence, we conclude that $S_x$ and $B_x$ have the same limiting spectral distribution (LSD). However, the scale normalizers in the CLTs of LSS of the random matrices $S_x$ and $B_x$ have the same order as $p$. Thus, it is expected that the asymptotic biases in the CLTs of LSS of $S_x$ and $B_x$ differ slightly. Upon such a consideration, Pan (2012) reconsidered the CLT of LSS of the centralized sample covariance matrix $S_x$. To reduce the asymptotic bias, he added an additional term to that of Bai and Silverstein (2004), which can be written as
$$\frac{y}{2\pi i}\oint g(z)\,\frac{\underline m_y(z)\big(1+z\underline m_y(z)\big)\int\frac{t\,dH(t)}{(1+t\underline m_y(z))^2}}{z\Big(1 - y\int\frac{\underline m_y^2(z)\,t^2\,dH(t)}{(1+t\underline m_y(z))^2}\Big)}\,dz, \eqno(1.5)$$
where $\underline m_y(z) = -\frac{1-y}{z} + ym_y(z)$, $m_y(z)$ is the Stieltjes transform of the LSD of $S_x$, $H(t)$ is the LSD of $T_p$, and $y_n = p/n \to y > 0$.

(To guarantee that the definition of $F$ makes sense, we need to assume that $p < N$ and that $T_p$ is positive definite. Because the eigenvalues of $F$ do not depend on $T_p$, we may assume that $T_p$ is an identity matrix.)

In this note, we shall show that the LSS of $S_x$ have the same asymptotic distribution as those of the simplified covariance matrix $B_x$ with sample size $n-1$. Hence, if $B_x$ is replaced by the centralized sample covariance matrix $S_x$, the result of Bai and Silverstein (2004) remains valid provided that the ratio of dimension to sample size $y_n$ is replaced by $p/(n-1)$ (this is equivalent to $c_n = n/(N-1)$ in the notation of Bai and Silverstein (2004)). This result is equivalent to, but much simpler than, that of Pan (2012), in both expression and proof. Moreover, we shall prove that if the simplified multivariate F-matrix $G$ is replaced by the centralized $F$, the results of Zheng (2012) remain valid provided that the ratios of dimension to sample sizes, $y_{n_1}$ and $y_{n_2}$, are replaced by $p/(n-1)$ and $p/(N-1)$, respectively.

2. Main Results

As mentioned in the previous section, the centralized covariance matrix has the same LSD as the corresponding simplified covariance matrix. In this note, we shall prove the following theorems.
Theorem 2.1
Assume that:

(a) For each $p$, $\{X_{ij}, i \le p, j \le n\}$ are independent random variables with $EX_{ij} = 0$ and $E|X_{ij}|^2 = 1$, satisfying
$$\frac{1}{np}\sum_{j=1}^{p}\sum_{k=1}^{n} E|X_{jk}|^4 I_{\{|X_{jk}| \ge \eta\sqrt n\}} \to 0 \quad\text{for any fixed } \eta > 0. \eqno(2.1)$$
Note that the random variables may be allowed to depend on $p$, but we suppress this dependence from the notation for brevity.

(b) $E|X_{ij}|^4 = 3$ for the real case; $E|X_{ij}|^4 = 2$ and $EX_{ij}^2 = 0$ for the complex case.

(c) $y_n = p/n \to y > 0$.

(d) $T_p$ is a $p \times p$ non-random nnd Hermitian matrix whose spectral norm is bounded in $p$, and its ESD $H_p$ converges weakly to $H$, a proper probability distribution.

Let $f$ be an analytic function on an open region of the complex plane which covers the support of the LSD of $S_x$, with the origin excluded. Then:

(i) The random variables
$$X_p(f) = p\int f(x)\,d\big(F^{S_x}(x) - F^{\{y_{n-1},H_p\}}(x)\big), \eqno(2.2)$$
where $y_{n-1} = p/(n-1)$, form a tight sequence in $p$. Here $F^{S_x}$ is the ESD of the centralized sample covariance matrix $S_x$, and $F^{\{y,H\}}$ is the LSD of $S_x$, whose Stieltjes transform $m_y(z)$ satisfies $\underline m_y(z) = ym_y(z) - (1-y)/z$, $\underline m_y(z)$ being the unique solution of
$$z = -\frac{1}{\underline m_y} + y\int\frac{t}{1+t\underline m_y(z)}\,dH(t) \eqno(2.3)$$
in the upper half complex plane, for each $z \in \mathbb{C}^+ = \{z : \Im(z) > 0\}$.

(ii) The random variables in (2.2) converge weakly to Gaussian variables $X_f$ with the same means and covariance functions as given in Theorem 1.1 of Bai and Silverstein (2004).

The proof of Theorem 2.1 is postponed to Section 3. As for the CLT of LSS of the F-matrix, we have the following theorem.

Theorem 2.2
Assume that:

1. The two arrays $\{X_{jk}, j \le p, k \le n\}$ and $\{Y_{jk}, j \le p, k \le N\}$ satisfy, for any fixed $\eta > 0$,
$$\frac{1}{np}\sum_{j=1}^p\sum_{k=1}^n E|X_{jk}|^4 I_{\{|X_{jk}|\ge\eta\sqrt n\}} \to 0, \qquad \frac{1}{Np}\sum_{j=1}^p\sum_{k=1}^N E|Y_{jk}|^4 I_{\{|Y_{jk}|\ge\eta\sqrt N\}} \to 0. \eqno(2.4)$$

2. For all $j, k$, $E|X_{jk}|^4 = \beta_x + 1 + \kappa$ and $E|Y_{jk}|^4 = \beta_y + 1 + \kappa$, where $\kappa = 2$ for the real case and $\kappa = 1$ for the complex case. If both $X$ and $Y$ are complex valued, then $EX_{jk}^2 = EY_{jk}^2 = 0$. Moreover, $y_n = p/n \to y_1 > 0$ and $y_N = p/N \to y_2 \in (0, 1)$.

Let $f$ be an analytic function on an open region of the complex plane containing the interval $\Big[\frac{(1-h)^2}{(1-y_2)^2}, \frac{(1+h)^2}{(1-y_2)^2}\Big]$, the support of the continuous part of the LSD $F_y$ of the F-matrix, where $h = \sqrt{y_1 + y_2 - y_1y_2}$ and $y = (y_1, y_2)$. Then, as $p \to \infty$, the random variables
$$W_p(f) = p\int f(x)\,d\big(F^{F}(x) - F_{(y_{n-1},\,y_{N-1})}(x)\big)$$
converge weakly to Gaussian variables $\{W_f\}$ which have the same means and covariance functions as given in Zheng (2012), where $y_{n-1} = p/(n-1)$, $y_{N-1} = p/(N-1)$, $F^{F}(x)$ is the ESD of the centralized F-matrix $F$, and $F_{(y_1,y_2)}(x)$ is the LSD defined by (2.4) of Zheng (2012).

Proof. As mentioned in Section 1, we may assume $T_p$ to be an identity matrix. We split the proof into two steps by writing
$$\operatorname{tr}(F - zI_p)^{-1} - p\,m_{(y_{n-1},\,y_{N-1})}(z) = \Big[\operatorname{tr}(S_xS_y^{-1} - zI_p)^{-1} - p\,m_{(y_{n-1},\,F^{S_y^{-1}})}(z)\Big] + p\Big[m_{(y_{n-1},\,F^{S_y^{-1}})}(z) - m_{(y_{n-1},\,y_{N-1})}(z)\Big],$$
where $F^{S_y^{-1}}(t)$ and $F^{S_y}(t)$ are the ESDs of $S_y^{-1}$ and $S_y$, $m_{(y_1,y_2)}$ is the Stieltjes transform of the LSD of the F-matrix, $\underline m_{\{y_{n-1},F^{S_y^{-1}}\}}(z) = -\frac{1-y_{n-1}}{z} + y_{n-1}m_{\{y_{n-1},F^{S_y^{-1}}\}}(z)$, and
$$z = -\frac{1}{\underline m_{\{y_{n-1},F^{S_y^{-1}}\}}} + y_{n-1}\int\frac{t}{1+t\underline m_{\{y_{n-1},F^{S_y^{-1}}\}}}\,dF^{S_y^{-1}}(t) = -\frac{1}{\underline m_{\{y_{n-1},F^{S_y^{-1}}\}}} + y_{n-1}\int\frac{1}{t+\underline m_{\{y_{n-1},F^{S_y^{-1}}\}}}\,dF^{S_y}(t). \eqno(2.5)$$

Step 1. Given $S_y$, in the proof of Theorem 2.1 we have proved that the process $\operatorname{tr}(S_xS_y^{-1} - zI_p)^{-1} - p\,m_{\{y_{n-1},F^{S_y^{-1}}\}}(z)$ weakly tends to a Gaussian process on the contour, with mean and covariance function as given in (6.29) and (6.30) of Zheng (2012).

Step 2. We use (2.5) together with the fact that
$$z = -\frac{1}{\underline m_{\{y_{n-1},\,y_{N-1}\}}} + y_{n-1}\int\frac{1}{t+\underline m_{\{y_{n-1},\,y_{N-1}\}}}\,dF_{y_{N-1}}(t), \eqno(2.6)$$
where $F_{y_{N-1}}$ is the Marchenko-Pastur law with ratio of dimension to sample size $y_{N-1}$.
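For the simplest choice of population, the defining equation of the limiting Stieltjes transform can be solved in closed form. The sketch below is an illustration under the assumption $H = \delta_1$ (i.e. $T_p = I_p$), in which case (2.3) reduces to a quadratic (the Marchenko-Pastur case); it also checks the companion identity used later in Section 4:

```python
import numpy as np

# Illustrative sketch (assumption: H is a point mass at 1, i.e. T_p = I_p).
# The defining equation z = -1/m + y/(1 + m) for the companion transform m(z)
# is then the quadratic z*m**2 + (z + 1 - y)*m + 1 = 0 (Marchenko-Pastur case).
y = 0.5
z = 1.0 + 0.5j                      # a point in the upper half plane

roots = np.roots([z, z + 1.0 - y, 1.0])
m = roots[roots.imag > 0][0]        # the root with positive imaginary part

# Check the defining equation directly ...
assert abs(z - (-1.0 / m + y / (1.0 + m))) < 1e-10
# ... and the identity y * int t/(1+t*m) dH(t) = (1 + z*m)/m used in Section 4.
assert abs(y / (1.0 + m) - (1.0 + z * m) / m) < 1e-10

# Stieltjes transform of the LSD itself, via the companion relation
# m = y*m_y - (1 - y)/z:
m_y = (m + (1.0 - y) / z) / y
assert m_y.imag > 0                 # a Stieltjes transform maps C+ into C+
```

Both assertions hold by construction; the example is only meant to make the fixed-point equation and the companion relation tangible.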
Subtracting both sides of (2.6) from those of (2.5) and writing $\underline m_1 = \underline m_{\{y_{n-1},F^{S_y^{-1}}\}}(z)$ and $\underline m_2 = \underline m_{\{y_{n-1},\,y_{N-1}\}}(z)$ for short, we obtain
$$p\big(\underline m_1 - \underline m_2\big) = \frac{-y_{n-1}\,\underline m_1\underline m_2\Big[\operatorname{tr}\big(S_y + \underline m_2 I_p\big)^{-1} - p\,m_{y_{N-1}}(-\underline m_2)\Big]}{1 - y_{n-1}\int\frac{\underline m_1\underline m_2\,dF_{N-1}(t)}{(t+\underline m_1)(t+\underline m_2)}} = \frac{-y_{n-1}\,\underline m_1\underline m_2\cdot p\big[m_{N-1}(-\underline m_2) - m_{y_{N-1}}(-\underline m_2)\big]}{1 - y_{n-1}\int\frac{\underline m_1\underline m_2\,dF_{N-1}(t)}{(t+\underline m_1)(t+\underline m_2)}}, \eqno(2.7)$$
which weakly tends to a Gaussian process on the contour, with mean and covariance function as given in (6.33) and (6.34) of Zheng (2012); here $m_{N-1}$ is the Stieltjes transform of the ESD $F_{N-1}(x)$ of $S_y$ and $\underline m_{y_{N-1}}(z) = -\frac{1-y_{N-1}}{z} + y_{N-1}m_{y_{N-1}}(z)$.

By Theorem 2.1 and (2.7), we obtain that $\operatorname{tr}(F - zI_p)^{-1} - p\,m_{\{y_{n-1},y_{N-1}\}}(z)$ and $\operatorname{tr}(G - zI_p)^{-1} - p\,m_{\{y_n,y_N\}}(z)$ have the same asymptotic distribution. Hence, the random variables
$$p\int f(x)\,d\big(F^F(x) - F_{\{y_{n-1},\,y_{N-1}\}}(x)\big)$$
converge weakly to Gaussian variables $W_f$ with the same means and covariance functions as in Zheng (2012). The proof of Theorem 2.2 is then completed. $\blacksquare$

(The condition (2.1) allows us to truncate the random variables at $\eta_n\sqrt n$ and then renormalize them to have means zero and variances one, where $\eta_n \to 0$; after truncation and renormalization, $E|X_{ij}|^4 = \kappa + 1 + \beta_x + o(1)$, and for the complex case $EX_{ij}^2 = o(n^{-1})$. The contour is defined similarly to that of Bai and Silverstein (2004).)
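The second equality in (2.5) is a deterministic change of variables: the eigenvalues of $S_y^{-1}$ are the reciprocals of those of $S_y$. A small numerical confirmation follows; the sizes and the evaluation point are arbitrary choices of this sketch:

```python
import numpy as np

# Check (toy example, not from the paper): integrating t/(1 + t*m) against the
# ESD of S_y^{-1} equals integrating 1/(t + m) against the ESD of S_y, because
# the eigenvalues of S_y^{-1} are the reciprocals of those of S_y.
rng = np.random.default_rng(1)
p, N = 4, 30
Y = rng.standard_normal((p, N))
Ybar = Y.mean(axis=1, keepdims=True)
S_y = (Y @ Y.T - N * Ybar @ Ybar.T) / (N - 1)

t = np.linalg.eigvalsh(S_y)         # eigenvalues of S_y (positive a.s. for p < N)
m = 0.3 + 0.7j                      # an arbitrary point off the spectrum

lhs = np.mean((1.0 / t) / (1.0 + m / t))   # int s/(1+s*m) dF^{S_y^{-1}}(s), s = 1/t
rhs = np.mean(1.0 / (t + m))               # int 1/(t+m)   dF^{S_y}(t)
assert abs(lhs - rhs) < 1e-12
```

The identity is exact, so the two empirical integrals agree to machine precision.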
3. Proof of Theorem 2.1

Define $\gamma_i = \frac{1}{\sqrt n}T_p^{1/2}X_i$. Then we have
$$S_x = \frac{n}{n-1}\sum_{i=1}^n(\gamma_i - \bar\gamma)(\gamma_i - \bar\gamma)^* = \sum_{i=1}^n\gamma_i\gamma_i^* - \frac{1}{n-1}\sum_{i\ne j}\gamma_i\gamma_j^* = B_x - \Delta,$$
where $\bar\gamma = \frac1n\sum_{i=1}^n\gamma_i$ and $\Delta = \frac{1}{n-1}\sum_{j\ne k}\gamma_j\gamma_k^*$. Because
$$(S_x - zI)^{-1} = (A - \Delta)^{-1} = A^{-1} + (A-\Delta)^{-1}\Delta A^{-1} = A^{-1} + A^{-1}\Delta A^{-1} + A^{-1}(\Delta A^{-1})^2 + (A-\Delta)^{-1}(\Delta A^{-1})^3,$$
where $A = A(z) = B_x - zI$, we obtain
$$\begin{aligned} p\Big(\tfrac1p\operatorname{tr}(S_x - zI)^{-1} - m^{(0)}_{n-1}(z)\Big) ={}& p\Big(\tfrac1p\operatorname{tr}(A-\Delta)^{-1} - m^{(0)}_n(z) + m^{(0)}_n(z) - m^{(0)}_{n-1}(z)\Big) \\ ={}& p\Big(\tfrac1p\operatorname{tr}A^{-1}(z) - m^{(0)}_n(z)\Big) + p\big(m^{(0)}_n(z) - m^{(0)}_{n-1}(z)\big) \\ & + \operatorname{tr}A^{-2}(z)\Delta + \operatorname{tr}A^{-1}(z)\big(\Delta A^{-1}(z)\big)^2 + \operatorname{tr}\big(A(z)-\Delta\big)^{-1}\big(\Delta A^{-1}(z)\big)^3, \end{aligned} \eqno(3.1)$$
where $\underline m^{(0)}_n(z) = -\frac{1-y_n}{z} + y_nm^{(0)}_n(z)$, and $\underline m^{(0)}_n(z)$, $\underline m^{(0)}_{n-1}(z)$ satisfy
$$z = -\frac{1}{\underline m^{(0)}_n} + \frac pn\int\frac{t}{1+t\underline m^{(0)}_n}\,dH_p(t), \eqno(3.2)$$
$$z = -\frac{1}{\underline m^{(0)}_{n-1}} + \frac{p}{n-1}\int\frac{t}{1+t\underline m^{(0)}_{n-1}}\,dH_p(t), \eqno(3.3)$$
$$z = -\frac{1}{\underline m_y} + y\int\frac{t}{1+t\underline m_y}\,dH(t). \eqno(3.4)$$
By (3.1) and Lemmas 4.1 and 4.6, we have
$$\operatorname{tr}(S_x - zI)^{-1} - p\,m^{(0)}_{n-1}(z) = p\Big(\tfrac1p\operatorname{tr}A^{-1}(z) - m^{(0)}_n(z)\Big) + o_p(1). \eqno(3.5)$$
So Theorem 2.1 has the same asymptotic mean and covariance function as Theorem 1.1 of Bai and Silverstein (2004).

Tightness of $\operatorname{tr}(S_x - zI)^{-1} - \operatorname{tr}(B_x - zI)^{-1}$. Using what has been proved in Bai and Silverstein (2004), we only need to prove that there is an absolute constant $M$ such that, for any $z_1, z_2$ on the contour,
$$\begin{aligned} &E\Big|\operatorname{tr}(B_x - z_1I)^{-1}(B_x - z_2I)^{-1} - \operatorname{tr}(S_x - z_1I)^{-1}(S_x - z_2I)^{-1}\Big| \\ &\quad = E\Bigg|\sum_{i=1}^p\frac{(\lambda_i - \tilde\lambda_i)(\lambda_i + \tilde\lambda_i - z_1 - z_2)}{(\lambda_i - z_1)(\lambda_i - z_2)(\tilde\lambda_i - z_1)(\tilde\lambda_i - z_2)}\Bigg| \le K\,E\Bigg(\sum_{i=1}^p|\lambda_i - \tilde\lambda_i|\Bigg)I_{B_n} + o(1) \le M, \end{aligned} \eqno(3.6)$$
where $\{\lambda_i\}$ and $\{\tilde\lambda_i\}$ are the eigenvalues of $B_x$ and $S_x$, respectively, arranged in descending order, and the event $B_n$ is defined by $x_l + \epsilon < \tilde\lambda_p$ and $\lambda_1 < x_r - \epsilon$, chosen such that $P(B_n^c) = o(n^{-\ell})$ for any fixed $\ell > 0$.
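The decomposition $S_x = B_x - \Delta$, the four-term resolvent expansion above, and the rank-one identity $\gamma_k^*A^{-1}\gamma_k = 1-\beta_k$ invoked in Lemma 4.2 are exact algebraic identities, so they can be checked to machine precision. The sketch below does so on a toy example; sizes, seed and $T_p = I$ are assumptions of the sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
p, n = 4, 25
z = 1.0 + 1.0j
X = rng.standard_normal((p, n))
gamma = X / np.sqrt(n)                        # gamma_i = X_i / sqrt(n), T_p = I

# (1) S_x = B_x - Delta with Delta = (1/(n-1)) sum_{j != k} gamma_j gamma_k^*.
B_x = gamma @ gamma.T
s = gamma.sum(axis=1, keepdims=True)
Delta = (s @ s.T - B_x) / (n - 1)             # sum_{j != k} = (sum)(sum)^* - diagonal part
Xbar = X.mean(axis=1, keepdims=True)
S_x = (X @ X.T - n * Xbar @ Xbar.T) / (n - 1)
assert np.allclose(S_x, B_x - Delta)

# (2) Resolvent expansion of (S_x - zI)^{-1} around A = B_x - zI:
# (A - Delta)^{-1} = A^{-1} + A^{-1} M + A^{-1} M^2 + (A - Delta)^{-1} M^3,
# with M = Delta A^{-1}.
A_inv = np.linalg.inv(B_x - z * np.eye(p))
M = Delta @ A_inv
lhs = np.linalg.inv(B_x - Delta - z * np.eye(p))
rhs = A_inv + A_inv @ M + A_inv @ M @ M + lhs @ M @ M @ M
assert np.allclose(lhs, rhs)

# (3) Rank-one identity: with A_k = A - gamma_k gamma_k^* and
# beta_k = 1/(1 + gamma_k^* A_k^{-1} gamma_k),  gamma_k^* A^{-1} gamma_k = 1 - beta_k.
A = B_x - z * np.eye(p)
g = gamma[:, [0]]
q_k = (g.T @ np.linalg.inv(A - g @ g.T) @ g).item()
beta_k = 1.0 / (1.0 + q_k)
q = (g.T @ A_inv @ g).item()
assert np.isclose(q, 1.0 - beta_k)
```

Identities (1) and (2) underpin (3.1); identity (3) is the one behind Lemma 4.2.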
(For the justification of the definition of $B_n$, see Bai and Silverstein (1998).) The last step of (3.6) follows from the fact that
$$\sum_{i=1}^p|\lambda_i - \tilde\lambda_i| = \sum_{i=1}^p(\lambda_i - \tilde\lambda_i) \le \lambda_1 - \tilde\lambda_p \le x_r$$
by the interlacing theorem.

The equi-continuity of $E\operatorname{tr}(S_x - zI)^{-1} - p\,m^{(0)}_{n-1}(z)$ can be proved in a way similar to that used for the tightness of $\operatorname{tr}(S_x - zI)^{-1} - \operatorname{tr}(B_x - zI)^{-1}$. By now, the proof of Theorem 2.1 is completed. $\blacksquare$

4. Some Lemmas

Lemma 4.1
Under the assumptions of Theorem 2.1, for every $z \in \mathbb{C}^+$, we have
$$p\big(m^{(0)}_n - m^{(0)}_{n-1}\big) \to \frac{\big(1+z\underline m_y\big)\big(\underline m_y + z\underline m_y'\big)}{z\underline m_y}.$$

Proof. We have
$$\underline m^{(0)}_n(z) = -\frac{n-p}{nz} + \frac pn m^{(0)}_n(z), \qquad \underline m^{(0)}_{n-1}(z) = -\frac{n-1-p}{(n-1)z} + \frac{p}{n-1}m^{(0)}_{n-1}(z), \eqno(4.1)$$
where $p/n \to y > 0$. By (3.4), we obtain
$$\underline m_y' = \frac{1}{\frac{1}{\underline m_y^2} - y\int\frac{t^2}{(1+t\underline m_y)^2}\,dH(t)}, \qquad y\int\frac{t}{1+t\underline m_y}\,dH(t) = \frac{1+z\underline m_y}{\underline m_y}. \eqno(4.2)$$
Using (3.2)-(3.3), we obtain
$$0 = \frac{\underline m^{(0)}_n - \underline m^{(0)}_{n-1}}{\underline m^{(0)}_n\underline m^{(0)}_{n-1}} - \big(\underline m^{(0)}_n - \underline m^{(0)}_{n-1}\big)\frac pn\int\frac{t^2\,dH_p(t)}{(1+t\underline m^{(0)}_n)(1+t\underline m^{(0)}_{n-1})} - \frac{p}{n(n-1)}\int\frac{t\,dH_p(t)}{1+t\underline m^{(0)}_{n-1}},$$
that is,
$$n\big(\underline m^{(0)}_n - \underline m^{(0)}_{n-1}\big) = \frac{\frac{p}{n-1}\int\frac{t\,dH_p(t)}{1+t\underline m^{(0)}_{n-1}}}{\frac{1}{\underline m^{(0)}_n\underline m^{(0)}_{n-1}} - \frac pn\int\frac{t^2\,dH_p(t)}{(1+t\underline m^{(0)}_n)(1+t\underline m^{(0)}_{n-1})}} \to \frac{y\int\frac{t\,dH(t)}{1+t\underline m_y}}{\frac{1}{\underline m_y^2} - y\int\frac{t^2\,dH(t)}{(1+t\underline m_y)^2}}. \eqno(4.3)$$
By (4.1), (4.2) and (4.3), we have
$$\begin{aligned} p\big(m^{(0)}_n - m^{(0)}_{n-1}\big) &= n\underline m^{(0)}_n + \frac{n-p}{z} - \Big((n-1)\underline m^{(0)}_{n-1} + \frac{n-1-p}{z}\Big) = n\big(\underline m^{(0)}_n - \underline m^{(0)}_{n-1}\big) + \underline m^{(0)}_{n-1}(z) + \frac1z \\ &\to \frac{y\int\frac{t\,dH(t)}{1+t\underline m_y}}{\frac{1}{\underline m_y^2} - y\int\frac{t^2\,dH(t)}{(1+t\underline m_y)^2}} + \frac{1+z\underline m_y(z)}{z} = \underline m_y'\cdot\frac{1+z\underline m_y}{\underline m_y} + \frac{1+z\underline m_y(z)}{z} = \frac{(1+z\underline m_y)(\underline m_y + z\underline m_y')}{z\underline m_y}. \end{aligned} \eqno(4.4)$$
Thus, Lemma 4.1 holds. $\blacksquare$

In the sequel, we shall use the Vitali lemma frequently. Let
$$\Delta = \frac1n\sum_{j\ne k}\gamma_j\gamma_k^*$$
(it should be $\frac{1}{n-1}\sum_{j\ne k}\gamma_j\gamma_k^*$, but this does no harm to the limit). We will derive the limit of $\operatorname{tr}(A-\Delta)^{-1} - \operatorname{tr}A^{-1}$.

Lemma 4.2
After truncation and normalization, we have $E\big|\gamma_k^*A^{-1}\gamma_k - (1+z\underline m_y(z))\big|^2 \le Kn^{-1}$ for every $z \in \mathbb{C}^+$.

Proof. We have $\gamma_k^*A^{-1}\gamma_k = \gamma_k^*A_k^{-1}\gamma_k\,\beta_k = 1 - \beta_k$, where $A_k = A - \gamma_k\gamma_k^*$ and $\beta_k = (1 + \gamma_k^*A_k^{-1}\gamma_k)^{-1}$. Therefore, by (1.15) and (2.17) of Bai and Silverstein (2004), we have
$$E\big|\gamma_k^*A^{-1}\gamma_k - g(z)\big|^2 = E\big|\beta_k + z\underline m_y(z)\big|^2 \le Kn^{-1}$$
with $g(z) = 1 + z\underline m_y(z)$. $\blacksquare$

Corollary 4.1. After truncation and normalization, we have $E\big|\gamma_k^*A^{-2}\gamma_k - \frac{d}{dz}\big(1+z\underline m_y(z)\big)\big|^2 \le Kn^{-1}$ for every $z \in \mathbb{C}^+$.

Proof. By the Cauchy integral formula, we have
$$\gamma_k^*A^{-2}\gamma_k = \frac{1}{2\pi i}\oint_{|\zeta - z| = v/2}\frac{\gamma_k^*A^{-1}(\zeta)\gamma_k}{(\zeta - z)^2}\,d\zeta \quad\text{and}\quad g'(z) = \frac{1}{2\pi i}\oint_{|\zeta - z| = v/2}\frac{1+\zeta\underline m_y(\zeta)}{(\zeta - z)^2}\,d\zeta.$$
Then $E\big|\gamma_k^*A^{-2}\gamma_k - \frac{d}{dz}(1+z\underline m_y(z))\big|^2 \le Kn^{-1}$ follows from Lemma 4.2. $\blacksquare$

Lemma 4.3
For any subset $U$ of $\{1, 2, \cdots, n\}$, after truncation and normalization, we have $E\big|\operatorname{tr}A^{-1}\Delta_U\big|^2 \le Kn^{-1}$ for every $z \in \mathbb{C}^+$, where $\Delta_U = \frac1n\sum_{j\ne k\in U}\gamma_j\gamma_k^*$. In particular, for every $z \in \mathbb{C}^+$,
$$E\big|\operatorname{tr}(A^{-2}\Delta)\big|^2 = E\Bigg|\frac1n\sum_{j\ne k}\gamma_j^*A^{-2}\gamma_k\Bigg|^2 = O(n^{-1}).$$

Proof. We have $\operatorname{tr}A^{-1}\Delta_U = \frac1n\sum_{j\ne k\in U}\gamma_j^*A^{-1}\gamma_k = \frac1n\sum_{j\ne k\in U}\gamma_j^*A_{jk}^{-1}\gamma_k\,\beta_j\beta_{k(j)}$, where $A_{jk} = A_k - \gamma_j\gamma_j^*$ for $j \ne k$ and $\beta_{k(j)} = (1+\gamma_k^*A_{jk}^{-1}\gamma_k)^{-1}$; we will similarly define $A_{ijk}$ and $\beta_{k(ij)}$ for later use. Then we obtain
$$E\big|\operatorname{tr}(A^{-1}\Delta_U)\big|^2 = E\,\frac1n\sum_{j_1\ne k_1\in U}\gamma_{j_1}^*A_{j_1k_1}^{-1}\gamma_{k_1}\beta_{j_1}\beta_{k_1(j_1)}\cdot\overline{\frac1n\sum_{j_2\ne k_2\in U}\gamma_{j_2}^*A_{j_2k_2}^{-1}\gamma_{k_2}\beta_{j_2}\beta_{k_2(j_2)}} := \sum{}^{(2)} + \sum{}^{(3)} + \sum{}^{(4)},$$
where the superscript $(\cdot)$ denotes the number of distinct integers in the set $\{j_1, k_1, j_2, k_2\}$.

By the fact that $|\beta_j| \le |z|/v$, where $v = \Im(z)$, we have
$$\sum{}^{(2)} \le \frac{|z|^4}{n^2v^4}\sum_{j\ne k\in U}E\big|\gamma_j^*A_{jk}^{-1}\gamma_k\big|^2 \le \frac{|z|^4}{n^2v^4}\sum_{j\ne k}\frac{K}{n^2}\,E\operatorname{tr}\big(TA_{jk}^{-1}T\bar A_{jk}^{-1}\big) \le \frac{Kp\,|z|^4\|T\|^2}{n^2v^6} \le Kn^{-1}.$$

For $\sum{}^{(4)}$, we expand $\gamma_{j_1}^*A^{-1}\gamma_{k_1}$ by successively removing $\gamma_{k_2}$ and $\gamma_{j_2}$:
$$\gamma_{j_1}^*A^{-1}\gamma_{k_1} = \beta_{j_1}\beta_{k_1(j_1)}\gamma_{j_1}^*A_{j_1k_1}^{-1}\gamma_{k_1} = \beta_{j_1}\beta_{k_1(j_1)}\Big[\gamma_{j_1}^*A_{j_1k_1k_2}^{-1}\gamma_{k_1} - \beta_{k_2(j_1k_1)}\gamma_{j_1}^*A_{j_1k_1k_2}^{-1}\gamma_{k_2}\,\gamma_{k_2}^*A_{j_1k_1k_2}^{-1}\gamma_{k_1}\Big],$$
and, removing $\gamma_{j_2}$ in the same way, $\gamma_{j_1}^*A^{-1}\gamma_{k_1}$ is decomposed into six terms whose quadratic forms involve only $B := A_{j_1j_2k_1k_2}^{-1}$. We also use the expansions
$$\beta_{j_1} = b_{j_1} - \beta_{j_1}b_{j_1}\epsilon_{j_1} = b_{j_1} - b_{j_1}^2\epsilon_{j_1} + \beta_{j_1}b_{j_1}^2\epsilon_{j_1}^2, \qquad \beta_{j_1(k_1)} = b_{j_1(k_1)} - b_{j_1(k_1)}^2\epsilon_{j_1(k_1)} + \beta_{j_1(k_1)}b_{j_1(k_1)}^2\epsilon_{j_1(k_1)}^2, \eqno(4.5)$$
with $b_{j_1} = \big(1 + E\gamma_{j_1}^*A_{j_1}^{-1}\gamma_{j_1}\big)^{-1}$ and $\epsilon_{j_1} = \gamma_{j_1}^*A_{j_1}^{-1}\gamma_{j_1} - E\gamma_{j_1}^*A_{j_1}^{-1}\gamma_{j_1}$, where $b_{j_1(k_1)}$ and $\epsilon_{j_1(k_1)}$ are defined similarly with $A_{j_1}^{-1}$ replaced by $A_{j_1k_1}^{-1}$. Decomposing $\overline{\gamma_{j_2}^*A^{-1}\gamma_{k_2}}$ into six similar terms, it remains to estimate the expectations of the 36 products in the expansion of $E\,\gamma_{j_1}^*A^{-1}\gamma_{k_1}\,\overline{\gamma_{j_2}^*A^{-1}\gamma_{k_2}}$.

Case 1. The product contains at least six quadratic forms in $B$. We use the fact that all $\beta$-factors are bounded by $|z|/v \le K$. Applying the Cauchy-Schwarz inequality, with the factors $\gamma_{j_1}^*B\gamma_{k_1}$ of the first batch and $\overline{\gamma_{j_2}^*B\gamma_{k_2}}$ of the second batch exchanging positions so as to avoid an eighth power of any single $\gamma$ under the expectation sign, and then applying the standard moment bounds for bilinear and quadratic forms in independent variables with mean $0$, variance $1$ and finite fourth moments (here we also use $E|X_{ij}|^4I_{\{|X_{ij}|\ge\eta_n\sqrt n\}} = o(1)$, $\|T\| \le K$ and $\gamma_j^*BTB^*\gamma_j \le K\|T\|^2/v^2$), each such expectation is shown to be bounded by $O(n^{-3})$.

Case 2. The product contains five quadratic forms in $B$. We first use the expansions (4.5) and the bound $|z|/v \le K$ for the $\beta$-factors. After centering, each term in the expansion of the difference between the product of $\beta$-factors and the product of the corresponding non-random $b$-factors contains at least one $\epsilon$-factor, which is a centered quadratic form in the $\gamma$'s. Using the same approach as in Case 1, one can show that the bound is again $O(n^{-3})$.

Case 3. The product contains fewer than five quadratic forms in $B$. If the number is four, we further expand the matrix $A_{j_1}^{-1}$ in $\epsilon_{j_1}$ as $A_{j_1j_2}^{-1} - A_{j_1j_2}^{-1}\gamma_{j_2}\gamma_{j_2}^*A_{j_1j_2}^{-1}\beta_{j_2(j_1)}$, and similarly for the other $\epsilon$-factors, and then use the approach employed in Case 2 to obtain the desired bound. If the number is less than four, we need to further expand the inverses of the $A$-matrices; the details are omitted.

Finally, we obtain $\sum{}^{(4)} = O(n^{-1})$ and, similarly, $\sum{}^{(3)} = O(n^{-1})$, which proves the first assertion. Because $\operatorname{tr}(A^{-2}\Delta) = \frac{d}{dz}\operatorname{tr}(A^{-1}\Delta)$, we then have
$$E\big|\operatorname{tr}(A^{-2}\Delta)\big|^2 = E\Bigg|\frac1n\sum_{j\ne k}\gamma_j^*A^{-2}\gamma_k\Bigg|^2 = O(n^{-1}).$$
The lemma is proved. $\blacksquare$

Lemma 4.4
After truncation and normalization, we have
$$\operatorname{tr}\big(A^{-2}\Delta A^{-1}\Delta\big) \to \big(\underline m_y(z) + z\underline m_y'(z)\big)\big(1+z\underline m_y(z)\big)$$
in $L_1$, uniformly for $z \in \mathbb{C}^+$.

Proof. Write
$$\operatorname{tr}A^{-1}(z_1)\Delta A^{-1}(z_2)\Delta = \frac{1}{n^2}\sum_{j\ne k,\ i\ne t}\gamma_k^*A^{-1}(z_1)\gamma_i\,\gamma_t^*A^{-1}(z_2)\gamma_j = Q_1 + Q_2,$$
where
$$Q_1 = \frac{1}{n^2}\sum_{j\ne k}\gamma_k^*A^{-1}(z_1)\gamma_k\,\gamma_j^*A^{-1}(z_2)\gamma_j \quad\text{and}\quad Q_2 = \frac{1}{n^2}\sum_{\substack{j\ne k,\ i\ne t \\ i\ne k\ \text{or}\ t\ne j}}\gamma_k^*A^{-1}(z_1)\gamma_i\,\gamma_t^*A^{-1}(z_2)\gamma_j.$$
By Lemmas 4.2 and 4.3, we obtain
$$E\big|Q_1 - \big(1+z_1\underline m_y(z_1)\big)\big(1+z_2\underline m_y(z_2)\big)\big| \le Kn^{-1/2} \quad\text{and}\quad E|Q_2| = o(1).$$
We thus have
$$E\big|\operatorname{tr}A^{-1}(z_1)\Delta A^{-1}(z_2)\Delta - g(z_1)g(z_2)\big| = o(1).$$
Consequently, because $\operatorname{tr}A^{-2}(z_1)\Delta A^{-1}(z_2)\Delta = \frac{\partial}{\partial z_1}\operatorname{tr}A^{-1}(z_1)\Delta A^{-1}(z_2)\Delta$, we have
$$E\big|\operatorname{tr}A^{-2}(z_1)\Delta A^{-1}(z_2)\Delta - g'(z_1)g(z_2)\big| = o(1),$$
that is, $\operatorname{tr}A^{-2}(z_1)\Delta A^{-1}(z_2)\Delta \to g'(z_1)g(z_2)$ in $L_1$. Setting $z_1 = z_2 = z$, we obtain $\operatorname{tr}(A^{-2}\Delta A^{-1}\Delta) \to g'(z)g(z) = \big(\underline m_y(z)+z\underline m_y'(z)\big)\big(1+z\underline m_y(z)\big)$ in $L_1$. $\blacksquare$

Lemma 4.5
After truncation and normalization, we have
$$\operatorname{tr}(A^{-1}\Delta)^3(A - \Delta)^{-1} = g(z)\operatorname{tr}\big((A^{-1}\Delta)^2(A-\Delta)^{-1}\big) + o_p(1)$$
uniformly for $z \in \mathbb{C}^+$.

Proof. We have
$$\begin{aligned} \operatorname{tr}(A^{-1}\Delta)^3(A-\Delta)^{-1} &= \frac{1}{n^3}\sum_{i\ne t,\ j\ne g,\ h\ne s}\gamma_i^*A^{-1}\gamma_j\,\gamma_g^*A^{-1}\gamma_h\,\gamma_s^*(A-\Delta)^{-1}A^{-1}\gamma_t \\ &= g(z)\,\frac{1}{n^2}\sum_{g\ne h,\ s\ne t}\gamma_g^*A^{-1}\gamma_h\,\gamma_s^*(A-\Delta)^{-1}A^{-1}\gamma_t + o_p(1) = g(z)\operatorname{tr}\big((A^{-1}\Delta)^2(A-\Delta)^{-1}\big) + o_p(1), \end{aligned}$$
where the first step separates the diagonal terms (those with $i = j$), whose quadratic forms $\gamma_i^*A^{-1}\gamma_i$ concentrate around $g(z)$ by Lemma 4.2, from the off-diagonal terms, which are negligible as in Lemma 4.3. $\blacksquare$

Then, by Lemma 4.4 and the expansion $(A-\Delta)^{-1} = A^{-1} + (A-\Delta)^{-1}\Delta A^{-1}$, we have
$$\operatorname{tr}(A^{-1}\Delta)^2(A-\Delta)^{-1} = \operatorname{tr}(A^{-1}\Delta)^2A^{-1} + \operatorname{tr}(A^{-1}\Delta)^3(A-\Delta)^{-1} = \operatorname{tr}(A^{-1}\Delta)^2A^{-1} + g(z)\operatorname{tr}(A^{-1}\Delta)^2(A-\Delta)^{-1} + o_p(1),$$
so that
$$\operatorname{tr}(A^{-1}\Delta)^2(A-\Delta)^{-1} = \frac{\big(1+z\underline m_y(z)\big)\big(\underline m_y(z)+z\underline m_y'(z)\big)}{1 - g(z)} + o_p(1).$$
Hence, noting that $1 - g(z) = -z\underline m_y(z)$ and that $\operatorname{tr}A^{-2}\Delta = o_p(1)$ by Lemma 4.3, we obtain the following lemma.

Lemma 4.6. After truncation and normalization, we have
$$\operatorname{tr}A^{-2}(z)\Delta + \operatorname{tr}A^{-1}(z)\big(\Delta A^{-1}(z)\big)^2 + \operatorname{tr}\big(A(z)-\Delta\big)^{-1}\big(\Delta A^{-1}(z)\big)^3 = \frac{\big(\underline m_y(z)+z\underline m_y'(z)\big)\big(1+z\underline m_y(z)\big)}{-z\underline m_y(z)} + o_p(1)$$
uniformly for $z \in \mathbb{C}^+$.

References
Bai, Z. D. and Silverstein, J. W. (1998). No eigenvalues outside the support of the limiting spectral distribution of large dimensional sample covariance matrices. Ann. Probab., 26(1), 316-345.

Bai, Z. D. and Silverstein, J. W. (2004). CLT for linear spectral statistics of large-dimensional sample covariance matrices. Ann. Probab., 32(1A), 553-605.

Bai, Z. D. and Silverstein, J. W. (2009). Spectral Analysis of Large Dimensional Random Matrices, 2nd edition. Springer.

Pan, G. M. (2012). Comparison between two types of large sample covariance matrices. Annales de l'Institut Henri Poincaré - Probabilités et Statistiques (in press).

Zheng, S. R. (2012). Central limit theorem for linear spectral statistics of large dimensional F-matrix. Annales de l'Institut Henri Poincaré - Probabilités et Statistiques, 48(2), 444-476.