On asymptotic expansion and CLT of linear eigenvalue statistics for sample covariance matrices when N/M→0
ZHIGANG BAO
Department of Mathematics, Zhejiang University, Hangzhou 310027, P. R. China.
Abstract. We study the renormalized real sample covariance matrix $H = X^T X/\sqrt{MN} - \sqrt{M/N}\,I_N$ with $N/M \to 0$ as $N, M \to \infty$; throughout we assume $M = M(N)$. Here $X = [X_{jk}]_{M\times N}$ is an $M\times N$ real random matrix with i.i.d. entries, and we assume $\mathbb{E}|X_{11}|^{5+\delta} < \infty$ for some small positive $\delta$. We consider the Stieltjes transform $m_N(z) = N^{-1}\mathrm{Tr}(H - z)^{-1}$ and the linear eigenvalue statistics of $H$. Our main focus is the asymptotic expansion of $\mathbb{E}\{m_N(z)\}$. Then, for suitable test functions, we establish a central limit theorem for the linear eigenvalue statistics of $H$. We show that the variance of the limiting normal distribution coincides with that for a real Wigner matrix with Gaussian entries.

1. Introduction
As an important branch of Random Matrix Theory, the study of sample covariance matrices traces back to the work of Hsu [8] and Wishart [17]. The modern formulation of a large dimensional sample covariance matrix is usually the matrix $S_N = M^{-1}X^T X$, where $X = [X_{jk}]_{M\times N}$ is an $M\times N$ real random matrix with i.i.d. entries of mean zero and variance $\sigma^2$, and $N/M \to y \in (0,\infty)$ as $N, M := M(N)$ tend to infinity. So the sample size $M$ and the parameter number $N$ are usually assumed to be of the same order. For a symmetric matrix with eigenvalues $\lambda_1, \cdots, \lambda_N$, we denote its empirical spectral distribution by $F_N(x) = N^{-1}\sum_{i=1}^N \mathbf{1}_{\{\lambda_i \le x\}}$; the limit of $F_N(x)$ as $N \to \infty$ is referred to as the limiting spectral distribution. In the case of $S_N$, Marchenko and Pastur [13] first showed that, as $N$ tends to infinity, $F_N(x)$ almost surely converges to the so-called Marchenko-Pastur (MP) law with density
\[ \rho_y(x) = \frac{1}{2\pi x y \sigma^2}\sqrt{(b-x)(x-a)}\,\mathbf{1}_{\{a \le x \le b\}}, \]
together with a point mass $1 - 1/y$ at the origin if $y > 1$, where $a = \sigma^2(1-\sqrt{y})^2$ and $b = \sigma^2(1+\sqrt{y})^2$.

In modern statistics, however, the case $N/M \to 0$ as $N$ and $M$ tend to infinity is also very common; see El Karoui [7] for example. In this paper we focus on the theoretical aspects of this particular regime of the sample covariance matrix.

Date: August 28, 2018.
2010 Mathematics Subject Classification.
Key words and phrases. Sample covariance matrix, Stieltjes transform, Asymptotic expansion, Linear eigenvalue statistics.
* The work was supported partially by NSFC grant 11071213, ZJNSF grant R6090034 and SRFDP grant 20100101110001.

It is easy to see that when $N/M \to y = 0$, the interval $[a,b]$ shrinks to the point $\sigma^2$, so every eigenvalue of $S_N$ then tends to $\sigma^2$. If we centralize $S_N$ by subtracting $\sigma^2 I_N$ and multiply by $\sqrt{M/N}$, the range of the eigenvalues is typically enlarged to order 1. In fact, under the assumption $N/M \to 0$ as $N \to \infty$, it has long been understood that the matrix
\[ H_N := \sqrt{\frac{M}{N}}\Big(\frac{1}{M}X^T X - \sigma^2 I_N\Big) \tag{1.1} \]
behaves similarly to a Wigner matrix of dimension $N$ in many spectral respects. A real Wigner matrix can be defined as a real symmetric random matrix $W_N$ with mean zero, finite variance i.i.d. diagonal entries and mean zero, variance $\omega^2/N$ i.i.d. above-diagonal entries, all entries on or above the diagonal being independent. As a cornerstone of Random Matrix Theory, the so-called semicircle law as the limiting spectral distribution of $W_N$ was first obtained by Wigner [16], with density
\[ \rho_{sc}(x) = \frac{1}{2\pi\omega^2}\sqrt{4\omega^2 - x^2}\,\mathbf{1}_{\{|x| \le 2\omega\}}. \tag{1.2} \]
It is not difficult to see that if we fix $N$ and only let $M$ tend to infinity, $H_N$ tends to a Wigner matrix with Gaussian entries by the central limit theorem. Thus if we let both $N$ and $M$ tend to infinity with $N/M \to
0, it is natural to ask whether $F_N(x)$ of $H_N$ tends to the semicircle law as well. A rigorous proof was given by Bai and Yin [5] through a moment method: under the existence of the 4-th moment of $X_{jk}$, $F_N(x)$ converges almost surely to the semicircle law (1.2) with $\omega = \sigma^2$. A question then naturally arises: how close are the spectral properties of the Wigner matrix $W_N$ and the renormalized sample covariance matrix $H_N$ when we set $\omega = \sigma^2$? Such a question makes no sense unless we specify particular spectral statistics to compare. So in this paper we study some representative objects for $H_N$, namely the Stieltjes transform of $F_N(x)$ and the linear eigenvalue statistics of $H_N$, and compare them with the corresponding results known for the Wigner matrix $W_N$. The Stieltjes transform of $F_N(x)$ is defined by
\[ m_N(z) = \int \frac{1}{x - z}\,dF_N(x) \]
for any $z = E + i\eta$ with $E \in \mathbb{R}$ and $\eta >
0. We consider only fixed $z$, and throughout the paper $F_N(x)$ is the empirical spectral distribution of $H_N$. If we introduce the Green function $G_N(z) = (H_N - z)^{-1}$, we also have
\[ m_N(z) = \frac{1}{N}\mathrm{Tr}\,G_N(z) = \frac{1}{N}\sum_{j=1}^N G_{jj}. \]
Here we denote by $G_{jk}$ the $(j,k)$ entry of $G_N(z)$. Both as a powerful tool and as a relevant spectral statistic, the Stieltjes transform of the empirical spectral distribution has proved to be particularly important in Random Matrix Theory. As is well known, the convergence of a sequence of probability measures is equivalent to the convergence of the corresponding sequence of Stieltjes transforms to the transform of the limiting measure (see [4] for example). So under the assumption $\mathbb{E}\{X_{11}^4\} < \infty$, one has that as $N \to \infty$, $m_N(z)$ almost surely converges to the Stieltjes transform $f(z)$ of the semicircle law, given by
\[ f(z) = \frac{-z + \sqrt{z^2 - 4\sigma^4}}{2\sigma^4}. \tag{1.3} \]
Here the square root is specified to be the one with positive imaginary part. What is more, owing to the basic fact $|m_N(z)| \le \eta^{-1}$, the a.s. convergence of $m_N(z)$ also implies
\[ f_N(z) := \mathbb{E}\{m_N(z)\} = f(z) + o(1). \tag{1.4} \]
In this paper we take a step further and calculate the leading order term of the remainder. A corresponding result for Wigner matrices was obtained in [10] by Khorunzhy et al., and extended by subsequent articles to some other matrix ensembles; see [1], [11]. Such a topic is often referred to as the asymptotic expansion of $f_N(z)$ in the literature. For ease of exposition and calculation, we consider only the normalized case $\sigma^2 = 1$ hereafter, so the semicircle law in (1.2) is the standard one; the general case is analogous. We denote the $\alpha$-th moment of $X_{11}$ by $\omega_\alpha$ and the $\alpha$-th cumulant by $\kappa_\alpha$; in particular $\kappa_2 = \sigma^2 = 1$ and $\kappa_4 = \omega_4 - 3$. Our first result is the following asymptotic expansion of $f_N(z)$:

Theorem 1.1. Consider the matrix model $H_N$ defined in (1.1) with $\sigma^2 = 1$.
And $X = [X_{jk}]_{M\times N}$ is an $M\times N$ real random matrix with i.i.d. entries of mean zero and variance 1. We assume $M = M(N)$ and $N/M \to 0$ as $N \to \infty$. If $\mathbb{E}|X_{11}|^{5+\delta} < \infty$ for some small positive $\delta$, then the following asymptotic expansion holds for any fixed $z$ with $\eta > 0$:
\[ f_N(z) = f(z)\left\{1 - \sqrt{\frac{N}{M}}\,\frac{f^2(z)}{1 - f^2(z)} + \frac{1}{N}\left(\frac{f^4(z)}{(1 - f^2(z))^2} + \kappa_4\,\frac{f^4(z)}{1 - f^2(z)}\right)\right\} + o\!\left(\sqrt{\frac{N}{M}} \vee \frac{1}{N}\right). \]

Remark 1.1: Comparing with the corresponding result for Wigner matrices in [10], we find a different first order remainder, of order $\sqrt{N/M}$ rather than $1/N$, when $M = o(N^3)$. Moreover, we only require $\eta > 0$ here, while $\eta \ge \omega$ is needed in the Wigner matrix case in [10].

The semicircle law and Theorem 1.1 give the convergence of $m_N(z)$ a.s. and in expectation respectively; as usual, we shall consider the fluctuation of $m_N(z)$ as a further task. In this paper we deal with the more general linear eigenvalue statistic
\[ \mathcal{L}_N[\varphi] = \sum_{i=1}^N \varphi(\lambda_i) \]
of $H_N$, where the test function $\varphi$ satisfies certain smoothness conditions. Note that $N m_N(z)$ is a linear eigenvalue statistic with $\varphi(x) = 1/(x - z)$. The topic of CLTs for linear eigenvalue statistics is classical and attractive in Random Matrix Theory, and there is a vast literature for different matrix ensembles; see for example [2], [3], [9], [12], [14], [15]. However, for large dimensional sample covariance matrices there is no corresponding result in the case $N/M \to
0. So as a complement, we will discuss it in this paper.
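As a quick numerical illustration of the kind of statement we are after (a sketch for orientation only, not part of the proofs; the dimensions, repetition count, and seed below are ad hoc choices), one can simulate the fluctuation of the linear eigenvalue statistic with the test function $\varphi(x) = x^2$ for Gaussian entries, for which the fourth cumulant vanishes and the limiting variance should be close to the Gaussian-Wigner value 4:

```python
import numpy as np

# Monte Carlo sketch: fluctuations of Tr(H_N^2) for Gaussian entries.
# With kappa_4 = 0 the variance of this linear eigenvalue statistic
# should approach the Gaussian-Wigner value 4; sizes are ad hoc.
rng = np.random.default_rng(0)
N, M, reps = 60, 6000, 200

stats = []
for _ in range(reps):
    X = rng.standard_normal((M, N))
    H = np.sqrt(M / N) * (X.T @ X / M - np.eye(N))
    stats.append(np.sum(H * H))  # Tr(H^2), no diagonalization needed

print(round(np.var(stats), 2))  # typically lands in the vicinity of 4
```

Note that no normalization in $N$ is applied to the statistic: the fluctuations of $\mathcal{L}_N[\varphi]$ are already of order one, which is the characteristic feature of linear eigenvalue statistics of random matrices.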
For convenience, we follow the notation of [14] and write $\xi^\circ = \xi - \mathbb{E}\{\xi\}$ for any random variable $\xi$ below, and introduce the norm
\[ \|\varphi\|_s^2 = \int (1 + 2|k|)^{2s}|\widehat{\varphi}(k)|^2\,dk, \qquad \widehat{\varphi}(k) = \frac{1}{2\pi}\int e^{ikx}\varphi(x)\,dx. \]
Then we can state our second result as follows:

Theorem 1.2. Consider the matrix model $H_N$ defined in (1.1) with $\sigma^2 = 1$, where $X = [X_{jk}]_{M\times N}$ is an $M\times N$ real random matrix with i.i.d. entries of mean zero and variance 1. We assume $M = M(N)$ and $N/M \to 0$ as $N \to \infty$. If $\mathbb{E}\{X_{11}^6\} < \infty$, then for any real test function $\varphi$ satisfying $\|\varphi\|_{3/2+\epsilon} < \infty$ with some $\epsilon > 0$, the statistic $\mathcal{L}^\circ_N[\varphi]$ converges weakly to the centered Gaussian distribution with variance
\[ V[\varphi] = \frac{1}{2\pi^2}\int_{-2}^{2}\!\!\int_{-2}^{2}\left(\frac{\varphi(\lambda_1) - \varphi(\lambda_2)}{\lambda_1 - \lambda_2}\right)^2\frac{4 - \lambda_1\lambda_2}{\sqrt{4 - \lambda_1^2}\sqrt{4 - \lambda_2^2}}\,d\lambda_1\,d\lambda_2 + \frac{\kappa_4}{4\pi^2}\left(\int_{-2}^{2}\frac{\varphi(\mu)\,\mu}{\sqrt{4 - \mu^2}}\,d\mu\right)^2. \tag{1.5} \]

Remark 1.2: Comparing again with the corresponding result for Wigner matrices in [14], the variance coincides with the counterpart for a Wigner matrix with Gaussian entries (not the GOE!) with $\omega = 1$, up to the term involving the fourth cumulant $\kappa_4 = \omega_4 - 3$ of the entry distribution. Note that we require $\varphi$ to be real in the above theorem, so when we deal with $m_N(z)$ we need to work with $\mathrm{Re}\,m_N(z)$ and $\mathrm{Im}\,m_N(z)$ as in [10].

Our article is organized as follows. We provide some basic facts and tools in Section 2, and as a warm-up we use them to revisit the semicircle law for the Gaussian case (i.e. $X$ Gaussian) in an average sense. In Section 3 we present the proof of Theorem 1.1, which is based on the preliminaries of Section 2 together with some extra lemmas, whose proofs are postponed to Section 4. In Section 5 we present the proof of Theorem 1.2. Throughout the paper, we use $C$ and $C_i$ ($i = 1, 2, 3, 4$) to denote positive constants which may differ from line to line and may depend on $\eta$.

2. Preliminaries and Gaussian Case
In this section we present some basic tools and facts needed in the sequel, and use them to revisit, in an average sense, the semicircle law proved in [5]. For convenience, when there is no confusion, we drop the subscript $N$ from the matrix notation and the variable $z$ from the notation $G_N(z)$ below. If we set $Y = [Y_{jk}]_{M\times N} := (MN)^{-1/4}X$, we have
\[ H = Y^T Y - \sqrt{\frac{M}{N}}\,I_N. \]
To carry out this program we state the following two lemmas, whose detailed but routine proofs are omitted.
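Before the formal computation, here is a small numerical sketch of the objects just defined (an illustration only; the dimensions and the evaluation point $z$ are arbitrary choices): with $Y = (MN)^{-1/4}X$, the spectrum of $H = Y^TY - \sqrt{M/N}\,I_N$ is approximately semicircular, and the empirical Stieltjes transform $m_N(z)$ is close to $f(z)$ from (1.3) with $\sigma^2 = 1$.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 200, 20000
X = rng.standard_normal((M, N))
Y = (M * N) ** (-0.25) * X
H = Y.T @ Y - np.sqrt(M / N) * np.eye(N)

eig = np.linalg.eigvalsh(H)
z = 0.5 + 1.0j
m_N = np.mean(1.0 / (eig - z))        # m_N(z) = N^{-1} Tr (H - z)^{-1}

# f(z) = (-z + sqrt(z^2 - 4)) / 2, branch with positive imaginary part
f = (-z + np.sqrt(z * z - 4)) / 2
if f.imag < 0:
    f = (-z - np.sqrt(z * z - 4)) / 2

print(round(np.mean(eig**2), 2))      # second spectral moment, close to 1
print(abs(m_N - f))                   # small discrepancy for N/M = 0.01
```

The discrepancy $|m_N(z) - f(z)|$ observed here is of the size predicted by the first-order correction in Theorem 1.1.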
Lemma 2.1 (Generalized Stein equation). For any real-valued random variable $\xi$ with $\mathbb{E}\{|\xi|^{p+2}\} < \infty$ and any complex-valued function $g(t)$ with continuous and bounded derivatives up to order $p+1$, we have
\[ \mathbb{E}\{\xi g(\xi)\} = \sum_{a=0}^{p}\frac{\kappa_{a+1}}{a!}\,\mathbb{E}\{g^{(a)}(\xi)\} + \epsilon, \tag{2.1} \]
where $\kappa_a$ is the $a$-th cumulant of $\xi$ and
\[ |\epsilon| \le C\,\sup_t|g^{(p+1)}(t)|\;\mathbb{E}\{|\xi|^{p+2}\}, \]
with a positive constant $C$ depending only on $p$.

Remark 2.1.
The proof of Lemma 2.1 can be found in many references; see [10] for details. When $\xi$ is centered Gaussian, (2.1) reduces to the classical Stein equation
\[ \mathbb{E}\{\xi g(\xi)\} = \mathbb{E}\{\xi^2\}\cdot\mathbb{E}\{g'(\xi)\}. \tag{2.2} \]
Our second lemma concerns the derivatives $D_{jk}\{\cdot\}$ with respect to the matrix entry $Y_{jk}$, which will be used frequently in Sections 2 and 3.

Lemma 2.2. For any $\alpha, j \in \{1, 2, \cdots, M\}$ and $\beta, k \in \{1, 2, \cdots, N\}$, we have
\[ \text{(i):}\quad D_{jk}G_{\alpha\beta} = -(YG)_{j\alpha}G_{\beta k} - (YG)_{j\beta}G_{\alpha k}, \]
\[ \text{(ii):}\quad D_{jk}(YG)_{\alpha\beta} = \delta_{\alpha j}G_{\beta k} - G_{\beta k}(YGY^T)_{j\alpha} - (YG)_{j\beta}(YG)_{\alpha k}, \]
\[ \text{(iii):}\quad D_{jk}\big[G_{kk}(YGY^T)_{jj}\big] = 2G_{kk}(YG)_{jk} - 4G_{kk}(YG)_{jk}(YGY^T)_{jj}, \]
where $\delta_{\alpha j}$ in (ii) is the Kronecker delta.

Remark 2.2.
The proof of Lemma 2.2 is based on the following resolvent identity for real symmetric matrices $A$ and $B$:
\[ (A + B - z)^{-1} - (A - z)^{-1} = -(A + B - z)^{-1}B(A - z)^{-1}, \]
which implies
\[ \frac{d}{d\epsilon}(A + \epsilon B - z)^{-1}\Big|_{\epsilon=0} = -(A - z)^{-1}B(A - z)^{-1}. \tag{2.3} \]
Now we set $A = H$ and $A + \epsilon B = (Y + \epsilon E(j,k))^T(Y + \epsilon E(j,k))$, where $E(j,k)$ denotes the $M\times N$ matrix whose $(j,k)$-th entry is 1 and all other entries are 0. Then (2.3) readily yields (i) of Lemma 2.2, and (ii) and (iii) follow by the chain rule. We omit the details, which are quite similar to the counterparts in [10] and [12].

As a warm-up for the main task of Section 3, we use the above two lemmas to prove that $f_N(z)$ tends to $f(z)$ (as $N \to \infty$) in the Gaussian case. By the basic relation between a matrix and its Green function,
\[ G = -z^{-1} + z^{-1}GH, \]
we have
\[ f_N(z) = -z^{-1} + \frac{z^{-1}}{N}\,\mathbb{E}\{\mathrm{Tr}\,GH\} = -z^{-1} - z^{-1}\sqrt{\frac{M}{N}}f_N(z) + \frac{z^{-1}}{N}\sum_{j,k}\mathbb{E}\{Y_{jk}(YG)_{jk}\}. \tag{2.4} \]
Applying the Stein equation (2.2) to (2.4) yields
\begin{align*}
f_N(z) &= -z^{-1} - z^{-1}\sqrt{\frac{M}{N}}f_N(z) + \frac{z^{-1}}{N\sqrt{MN}}\sum_{j,k}\mathbb{E}\big\{G_{kk} - ((YG)_{jk})^2 - G_{kk}(YGY^T)_{jj}\big\} \\
&= -z^{-1} - \frac{z^{-1}}{N\sqrt{MN}}\,\mathbb{E}\{\mathrm{Tr}\,YG^2Y^T\} - \frac{z^{-1}}{N\sqrt{MN}}\,\mathbb{E}\{\mathrm{Tr}\,G\cdot\mathrm{Tr}\,YGY^T\} \\
&= -z^{-1} - z^{-1}\frac{N+1}{\sqrt{MN}}f_N(z) - z^{-1}\Big(1 + \sqrt{\tfrac{N}{M}}z\Big)\mathbb{E}\{m_N^2(z)\} - \frac{z^{-1}}{N^2}\Big(1 + \sqrt{\tfrac{N}{M}}z\Big)\mathbb{E}\{\mathrm{Tr}\,G^2\}, \tag{2.5}
\end{align*}
where we have used the facts that
\[ \mathrm{Tr}\,YGY^T = \mathrm{Tr}\Big(I_N + \Big(\sqrt{\tfrac{M}{N}} + z\Big)G\Big), \qquad \mathrm{Tr}\,YG^2Y^T = \mathrm{Tr}\Big(G + \Big(\sqrt{\tfrac{M}{N}} + z\Big)G^2\Big). \]
Noting the trivial bound $\|G\| \le \eta^{-1}$ for the operator norm of $G$, (2.5) implies
\[ f_N(z) = -z^{-1} - z^{-1}\mathbb{E}\{m_N^2(z)\} + O\Big(\frac{1}{N} \vee \sqrt{\frac{N}{M}}\Big). \]
To estimate $\mathbb{E}\{m_N^2(z)\}$ we need a bound on $\mathrm{Var}\{m_N(z)\}$. Using the Poincaré inequality for Gaussian matrix entries, one gets
\[ \mathrm{Var}\{m_N(z)\} = O\Big(\frac{1}{N^2}\Big). \]
Consequently,
\[ f_N(z) = -z^{-1} - z^{-1}f_N^2(z) + O\Big(\frac{1}{N} \vee \sqrt{\frac{N}{M}}\Big). \tag{2.6} \]
On the other hand, it is well known that $f(z)$ satisfies the equation
\[ f(z) = -z^{-1} - z^{-1}f^2(z). \tag{2.7} \]
By a routine comparison of (2.6) and (2.7) we obtain $f_N(z) \to f(z)$ as $N \to \infty$ (see Section 2 of [4] for example), which implies the convergence of the expected empirical spectral distribution to the semicircle law. However, for Theorem 1.
1, we need to take a step further and calculate the leading order term of the remainder $f_N(z) - f(z)$ precisely, for general entry distributions. So first of all we must use (2.1) instead of (2.2); as a result, more involved estimates of the derivatives are required. What is more, the Poincaré inequality is no longer available, and we estimate $\mathrm{Var}\{m_N(z)\}$ by a martingale difference method as in [14].

3. Asymptotic Expansion for $f_N(z)$

To prove Theorem 1.
1, we begin with the basic idea raised in [10]: apply (2.1) to (2.4) and estimate the derivatives. However, the standard procedure becomes rather involved for our matrix model $H_N$. As shown in [10], for a Wigner matrix every term in the expansion formula factorizes into entries of $G_N(z)$, each bounded by $\eta^{-1}$. Owing to this trivial bound, many estimates hold immediately, especially for the remainder term, and the approach of iterating the expansion also works well. However, as we will see, in our case the derivatives take more complicated forms: some factors of the terms admit no trivial bound (see (3.8), (3.9)), and iterating the expansion brings in factors of new types. To bound the remainder we first need a truncation step.

Now we truncate $X_{jk}$ at $\tau := (MN)^{1/4 - t}$ with some small positive $t$. We denote by $F^\tau_N(x)$ the empirical spectral distribution of the matrix defined in (1.1) with truncated entries $X^\tau_{jk} := X_{jk}\mathbf{1}_{\{|X_{jk}| \le \tau\}}$, by $m^\tau_N(z)$ the Stieltjes transform of $F^\tau_N(x)$, and set $f^\tau_N(z) = \mathbb{E}\{m^\tau_N(z)\}$. Taking into account that $|m_N(z)|, |m^\tau_N(z)| \le \eta^{-1}$ and
\[ \sum_{j,k}\mathbb{P}\{|X_{jk}| \ge \tau\} \le CMN\cdot\tau^{-(5+\delta)} = o\Big(\sqrt{\frac{N}{M}} \vee \frac{1}{N}\Big), \]
we have
\[ |f_N(z) - f^\tau_N(z)| = o\Big(\sqrt{\frac{N}{M}} \vee \frac{1}{N}\Big). \]
After the truncation, the first five moments of $X_{jk}$ are slightly modified. For example,
\[ |\mathbb{E}\{X_{jk}\mathbf{1}_{\{|X_{jk}| \le \tau\}}\}| = |\mathbb{E}\{X_{jk}\mathbf{1}_{\{|X_{jk}| \ge \tau\}}\}| \le \tau^{-(4+\delta)}\,\mathbb{E}|X_{jk}|^{5+\delta} \le C(MN)^{-1-t_1} \]
for some $t_1 > 0$. Similarly we can control the modification of the $\alpha$-th moment of $X_{jk}$ under the truncation for all $\alpha = 2, 3, 4,
5. So, without loss of generality, we may and do assume below that the $X_{jk}$, $(j,k) \in \{1,\cdots,M\}\times\{1,\cdots,N\}$, are i.i.d. with
\[ |X_{jk}| \le (MN)^{1/4-t}, \qquad |\mathbb{E}\{X_{jk}\}| \le C(MN)^{-1-t_1}, \qquad \mathbb{E}\{X^2_{jk}\} = 1 + O\big((MN)^{-1-t_1}\big), \]
\[ \mathbb{E}\{X^\alpha_{jk}\} = \omega_\alpha + o(1),\;\;\alpha = 3, 4, \qquad \mathbb{E}\{|X_{jk}|^{5+\delta}\} \le C. \tag{3.1} \]
For notational simplicity we still write $f_N(z)$ for $f^\tau_N(z)$ below. Furthermore, we need two lemmas before the rigorous proof of Theorem 1.
1. The first lemma is an estimate of $\mathrm{Var}\{m_N(z)\}$, which, as we saw in Section 2, is a necessary ingredient for the final result. The second lemma is an estimate of every diagonal entry of the Green function, which helps to overcome the difficulties mentioned above. Their proofs are postponed to Section 4.
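The content of the second lemma can be illustrated numerically as follows (a sketch only, with the limit $f(z)$ used in place of $f_N(z)$, and arbitrary dimensions and evaluation point): the diagonal Green function entries concentrate around $-1/(z + f(z))$.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 200, 20000
Y = (M * N) ** (-0.25) * rng.standard_normal((M, N))
H = Y.T @ Y - np.sqrt(M / N) * np.eye(N)

z = 0.5 + 1.0j
G = np.linalg.inv(H - z * np.eye(N))   # Green function G(z) = (H - z)^{-1}

f = (-z + np.sqrt(z * z - 4)) / 2      # Stieltjes transform of the semicircle law
if f.imag < 0:
    f = (-z - np.sqrt(z * z - 4)) / 2

# The lemma suggests G_kk ~ -1/(z + f(z)); check the average deviation
dev = np.mean(np.abs(np.diag(G) + 1 / (z + f)))
print(dev)  # small compared to |G_kk| ~ |f| here
```

By the self-consistency relation (2.7), $-1/(z + f(z)) = f(z)$, so this is the same as saying that each $G_{kk}$ is close to the Stieltjes transform itself.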
Lemma 3.1. Under the assumptions $\omega_1 = 0$ and $\omega_4 < \infty$, we have
\[ \mathrm{Var}\{m_N(z)\} = O\Big(\frac{1}{N^2}\Big) \]
for any fixed $z$ with $\eta > 0$. Below we will also need the $M\times M$ Green function
\[ \widetilde{G} = [\widetilde{G}_{jk}]_{M\times M} := \Big(YY^T - \sqrt{\tfrac{M}{N}} - z\Big)^{-1}. \]

Lemma 3.2. Under the assumptions of Lemma 3.1, for any fixed $z$ with $\eta > 0$ we have the following two estimates:
(i) for any $k \in \{1, 2, \cdots, N\}$,
\[ \mathbb{E}\,\Big|G_{kk} + \frac{1}{z + f_N(z)}\Big|^2 \le \frac{C_1}{N} + C_2\frac{N}{M}; \tag{3.2} \]
(ii) for any $j \in \{1, 2, \cdots, M\}$,
\[ \mathbb{E}\,\Big|\widetilde{G}_{jj} + \frac{1}{\sqrt{M/N} + z + f_N(z)}\Big|^2 \le C_1\Big(\frac{N}{M}\Big)^3 + C_2\frac{N}{M^2}. \tag{3.3} \]

Remark 3.1.
In fact, one can replace $1/(z + f_N(z))$ in (3.2) by $-f_N(z)$ or even $-f(z)$ with the same bound on the right hand side; but at the level of a lemma we shall not do so before we reach the final conclusion on the differences between $-1/(z + f_N(z))$, $f_N(z)$ and $f(z)$.

Proof of Theorem 1.1. Applying (2.1) to (2.4), we obtain the following expansion for the general distribution case:
\[ f_N(z) = -z^{-1} - z^{-1}\sqrt{\frac{M}{N}}f_N(z) + \frac{z^{-1}}{N}\sum_{a=0}^{4}\frac{\kappa_{a+1}}{a!\,(MN)^{(a+1)/4}}\sum_{j,k}\mathbb{E}\,\{D^a_{jk}(YG)_{jk}\} + \mathcal{R}_N, \tag{3.4} \]
where $D^a_{\alpha\beta}\{\cdot\}$ denotes the $a$-th derivative with respect to $Y_{\alpha\beta}$ and
\[ |\mathcal{R}_N| \le \frac{C}{N}\sum_{j,k}\mathbb{E}\{|Y_{jk}|^6\}\,\sup_{jk}\,\mathbb{E}_{jk}\,|D^5_{jk}(YG)_{jk}|. \]
Here $\sup_{jk}$ means that the supremum is taken with respect to $Y_{jk}$, and $\mathbb{E}_{jk}$ denotes the conditional expectation given $Y_{jk}$. Observe that in (3.4) the $a = 0$ term
\[ \frac{z^{-1}}{N(MN)^{1/4}}\,\mathbb{E}\{X_{11}\}\sum_{j,k}\mathbb{E}\,(YG)_{jk} = o\Big(\sqrt{\frac{N}{M}} \vee \frac{1}{N}\Big) \]
by (3.1) and the estimate (3.13) proved below. So we focus on the terms with $a \ge 1$ and on $\mathcal{R}_N$ in the sequel. By Lemma 2.2,
\[ D_{jk}(YG)_{jk} = G_{kk} - [(YG)_{jk}]^2 - G_{kk}(YGY^T)_{jj}, \]
\[ D^2_{jk}(YG)_{jk} = -6G_{kk}(YG)_{jk} + 6G_{kk}(YG)_{jk}(YGY^T)_{jj} + 2[(YG)_{jk}]^3, \tag{3.5} \]
\begin{align*}
D^3_{jk}(YG)_{jk} &= -6(G_{kk})^2 + 36G_{kk}[(YG)_{jk}]^2 + 12(G_{kk})^2(YGY^T)_{jj} \\
&\quad - 36G_{kk}[(YG)_{jk}]^2(YGY^T)_{jj} - 6(G_{kk})^2[(YGY^T)_{jj}]^2 - 6[(YG)_{jk}]^4, \tag{3.6}
\end{align*}
\begin{align*}
D^4_{jk}(YG)_{jk} &= 120[G_{kk}]^2(YG)_{jk} - 240G_{kk}[(YG)_{jk}]^3 - 240[G_{kk}]^2(YG)_{jk}(YGY^T)_{jj} \\
&\quad + 240G_{kk}[(YG)_{jk}]^3(YGY^T)_{jj} + 120[G_{kk}]^2(YG)_{jk}[(YGY^T)_{jj}]^2 + 24[(YG)_{jk}]^5
\end{align*}
for any $(j,k) \in \{1,\cdots,M\}\times\{1,\cdots,N\}$. Observe that for any integer $a \ge 1$, $D^a_{jk}(YG)_{jk}$ is a linear combination of terms of the form
\[ F_{jk}(a_1, a_2, a_3) := [G_{kk}]^{a_1}[(YG)_{jk}]^{a_2}[(YGY^T)_{jj}]^{a_3} \tag{3.7} \]
with nonnegative integers $a_1, a_2, a_3$ satisfying $a_1 + a_2 + a_3 \le a + 1$. Moreover, by Lemma 2.2, each factor $(YGY^T)_{jj}$ always appears together with one $G_{kk}$, so we always have $a_1 \ge a_3$.
By observing that
\[ YGY^T = \widetilde{G}\,YY^T = I_M + \Big(\sqrt{\tfrac{M}{N}} + z\Big)\widetilde{G} \]
and the trivial fact $\|\widetilde{G}\| \le \eta^{-1}$, we have
\[ |(YGY^T)_{jj}| \le C\sqrt{\frac{M}{N}}. \tag{3.8} \]
Similarly we have the estimate
\[ |(YG)_{jk}| \le \Big(\sum_k|(YG)_{jk}|^2\Big)^{1/2} = |(YGG^*Y^T)_{jj}|^{1/2} \le C\Big(\frac{M}{N}\Big)^{1/4}. \tag{3.9} \]
But these two bounds are too crude, so we must treat the higher order terms in $\mathcal{R}_N$ more carefully. Fortunately, when taking expectations in (3.4), we can use the estimate
\[ \mathbb{E}\,\Big|(YGY^T)_{jj} - \frac{f_N(z)}{\sqrt{M/N} + z + f_N(z)}\Big| \le C_1\frac{N}{M} + \frac{C_2}{\sqrt{M}}, \tag{3.10} \]
which is a direct consequence of (3.3) and (3.8). To deal with $\mathcal{R}_N$, we observe that
\[ |F_{jk}(a_1,a_2,a_3)| \le C\Big|\sum_\alpha Y_{j\alpha}G_{\alpha k}\Big|^{a_2}\Big|\sum_{\beta,\gamma}Y_{j\beta}Y_{j\gamma}G_{\beta\gamma}\Big|^{a_3} \le CN^{a_3/2}\Big(\sum_\alpha Y^2_{j\alpha}\Big)^{a_2/2 + a_3}, \tag{3.11} \]
where in the last step we used the Cauchy-Schwarz inequality together with the facts
\[ \sum_\alpha|G_{\alpha k}|^2 = [G^*G]_{kk} \le C, \qquad \sum_{\beta,\gamma}|G_{\beta\gamma}|^2 = \mathrm{Tr}\,G^*G \le CN, \qquad \sum_{\beta,\gamma}Y^2_{j\beta}Y^2_{j\gamma} = \Big(\sum_\alpha Y^2_{j\alpha}\Big)^2. \]
So by (3.11),
\[ \sup_{jk}\,\mathbb{E}_{jk}|F_{jk}(a_1,a_2,a_3)| \le CN^{a_3/2}\Big[(MN)^{-t(a_2+2a_3)} + \Big(\frac{N}{M}\Big)^{(a_2+2a_3)/4}\Big]. \]
Then, taking into account that $a_2 + 2a_3 \le 6$ for the terms arising from $D^5_{jk}(YG)_{jk}$, it is not difficult to see that
\[ |\mathcal{R}_N| = o\Big(\sqrt{\frac{N}{M}} \vee \frac{1}{N}\Big). \]
We still need to bound sums of the type $\sum_{j,k}[(YG)_{jk}]^{a_2}$. When $a_2 \ge 2$,
\[ \Big|\sum_{j,k}[(YG)_{jk}]^{a_2}\Big| \le \max_{j,k}|(YG)_{jk}|^{a_2-2}\sum_k\big[(G^*Y^TYG)_{kk}\big] = \max_{j,k}|(YG)_{jk}|^{a_2-2}\sum_k\Big[G^*_{kk}\cdot 1 + \Big(\sqrt{\tfrac{M}{N}} + \bar z\Big)(G^*G)_{kk}\Big]\cdot C = O\big(M^{a_2/4}N^{1-a_2/4}\big). \tag{3.12} \]
When $a_2 = 1$, we can use the elementary Cauchy-Schwarz bound
\[ \Big|\sum_{j,k}(YG)_{jk}\Big| \le \Big(MN\sum_{j,k}|(YG)_{jk}|^2\Big)^{1/2} = O\big((MN)^{3/4}\big). \tag{3.13} \]
And for this case we only need to bound expectations, so (3.10) can be used. By Lemma 3.
2, (3.10), (3.12) and (3.13), we can discard all the terms in (3.4) except those coming from $F_{jk}(1,0,0)$, $-F_{jk}(0,2,0)$ and $-F_{jk}(1,0,1)$ in the $\kappa_2$ part, and $F_{jk}(2,0,0)$ in the $\kappa_4$ part. For $\sum_{j,k}\mathbb{E}\{(YG)_{jk}\}$ the coarse bound (3.13) is not sufficient; to improve it we iterate the expansion process once more, the main term of $\sum_{j,k}\mathbb{E}\{(YG)_{jk}\}$ being enough for us. We do this as follows:
\[ \mathbb{E}\sum_{j,k}(YG)_{jk} = \sum_{j,\alpha,k}\mathbb{E}\{Y_{j\alpha}G_{\alpha k}\} = \sum_{j,\alpha,k}\Big[\frac{1}{\sqrt{MN}}\,\mathbb{E}\,D_{j\alpha}G_{\alpha k} + \frac{\kappa_3}{2(MN)^{3/4}}\,\mathbb{E}\,D^2_{j\alpha}G_{\alpha k}\Big] + \epsilon_N, \tag{3.14} \]
where
\[ |\epsilon_N| \le \frac{C\omega_4}{MN}\sum_{j,\alpha,k}\,\sup_{j\alpha}\,\mathbb{E}_{j\alpha}|D^3_{j\alpha}G_{\alpha k}|. \]
By Lemma 2.2,
\[ D_{j\alpha}G_{\alpha k} = -(YG)_{j\alpha}G_{\alpha k} - (YG)_{jk}G_{\alpha\alpha}, \tag{3.15} \]
\[ D^2_{j\alpha}G_{\alpha k} = -2G_{\alpha\alpha}G_{\alpha k} + 2G_{\alpha k}[(YG)_{j\alpha}]^2 + 2G_{\alpha\alpha}G_{\alpha k}(YGY^T)_{jj} + 4G_{\alpha\alpha}(YG)_{j\alpha}(YG)_{jk}. \tag{3.16} \]
Inserting (3.15) and (3.16) into (3.14), one finds
\begin{align*}
\mathbb{E}\sum_{j,k}(YG)_{jk} &= -\frac{1}{\sqrt{MN}}\sum_{j,k}\mathbb{E}\Big[(YG^2)_{jk} + (YG)_{jk}\,\mathrm{Tr}\,G\Big] \\
&\quad + \frac{\kappa_3}{2(MN)^{3/4}}\sum_{\alpha,k}\mathbb{E}\Big[-2MG_{\alpha\alpha}G_{\alpha k} + 2G_{\alpha k}(GY^TYG)_{\alpha\alpha} + 2G_{\alpha\alpha}G_{\alpha k}\,\mathrm{Tr}(YGY^T) + 4G_{\alpha\alpha}(GY^TYG)_{\alpha k}\Big] + \epsilon_N.
\end{align*}
Similarly to (3.13) we have $\mathbb{E}\sum_{j,k}(YG^2)_{jk} = O\big((MN)^{3/4}\big)$, and the other terms in this expansion are easier to estimate by similar calculations. Moreover, as for $\mathcal{R}_N$, it is not difficult to check that $\epsilon_N$ contributes nothing to the main term of $\sum_{j,k}\mathbb{E}\{(YG)_{jk}\}$. So we obtain the improved bound
\[ \mathbb{E}\sum_{j,k}(YG)_{jk} = O\big(M^{1/4}N^{3/4}\big). \]
Therefore, under the estimates above, we arrive at
\[ f_N(z) = -z^{-1} - \frac{z^{-1}}{N\sqrt{MN}}\,\mathbb{E}\sum_{j,k}[(YG)_{jk}]^2 - \frac{z^{-1}}{N\sqrt{MN}}\,\mathbb{E}\sum_{j,k}G_{kk}(YGY^T)_{jj} - z^{-1}\frac{\kappa_4}{N^2}\,\mathbb{E}\sum_k G^2_{kk} + o\Big(\sqrt{\frac{N}{M}} \vee \frac{1}{N}\Big). \]
By (2.5) and Lemma 3.
1, we can rewrite it as
\[ f_N(z) = -z^{-1} - z^{-1}\sqrt{\tfrac{N}{M}}f_N(z) - z^{-1}\Big(1 + \sqrt{\tfrac{N}{M}}z\Big)f_N^2(z) - \frac{z^{-1}}{N^2}\,\mathbb{E}\{\mathrm{Tr}\,G^2(z)\} - z^{-1}\frac{\kappa_4}{N^2}\,\mathbb{E}\sum_k G^2_{kk} + o\Big(\sqrt{\tfrac{N}{M}} \vee \tfrac{1}{N}\Big), \]
which implies
\[ \Big|f_N(z) + \frac{1}{z + f_N(z)}\Big| = O\Big(\frac{1}{N} \vee \sqrt{\frac{N}{M}}\Big). \tag{3.17} \]
Furthermore, both $f_N(z)$ and $f(z)$ are analytic functions of $z$; by (1.4) and Cauchy's integral formula we also have
\[ \frac{d}{dz}f_N(z) = \frac{d}{dz}f(z) + o(1). \]
Now we note that
\[ \frac{1}{N}\,\mathbb{E}\,\mathrm{Tr}\,G^2(z) = \frac{d}{dz}f_N(z) = \frac{d}{dz}f(z) + o(1) = -\frac{f(z)}{z + 2f(z)} + o(1). \]
Together with (3.2) we finally have
\[ f_N(z) = f(z) - \sqrt{\frac{N}{M}}\,\frac{f^4(z) + zf^3(z)}{z + 2f(z)} + \frac{1}{N}\Big[\frac{f^3(z)}{(z + 2f(z))^2} - \frac{\kappa_4 f^4(z)}{z + 2f(z)}\Big] + O\Big(\frac{N}{M}\Big) + o\Big(\frac{1}{N}\Big). \]
This concludes the proof, on inserting the basic relation $z + 2f(z) = f(z) - 1/f(z)$. $\Box$

4. Variance Estimates
In this section we present the proofs of Lemmas 3.1 and 3.2. In fact, one can use the asymptotic expansion method again to estimate $\mathrm{Var}\{m_N(z)\}$, but that is considerably more complicated than in the Wigner case. Instead we use a martingale difference method employed very recently in [14]. For convenience we write $\gamma_N$ for $\mathrm{Tr}\,G$ below.

Proof of Lemma 3.1. Denote by $\mathbb{E}_{\le k}\{\cdot\}$ and $\mathbb{E}_k\{\cdot\}$ the expectation with respect to the random variables in the first $k$ columns and in the $k$-th column of $Y_N$ respectively. By the classical martingale method in [6], one has
\begin{align*}
\mathrm{Var}\{\gamma_N\} &= \sum_{k=1}^N\mathbb{E}\{|\mathbb{E}_{\le k-1}\{\gamma_N\} - \mathbb{E}_{\le k}\{\gamma_N\}|^2\} = \sum_{k=1}^N\mathbb{E}\{|\mathbb{E}_{\le k-1}\{\gamma_N - \mathbb{E}_k\{\gamma_N\}\}|^2\} \\
&\le \sum_{k=1}^N\mathbb{E}\{|\gamma_N - \mathbb{E}_k\{\gamma_N\}|^2\}. \tag{4.1}
\end{align*}
Now let $y_k$ be the $k$-th column of $Y_N$ and $B_k$ the $M\times(N-
1) matrix consisting of the other $N - 1$ columns of $Y_N$. Then we have
\[ H = \begin{pmatrix} y_1\cdot y_1 - \sqrt{M/N} & (B_1^Ty_1)^T \\ B_1^Ty_1 & B_1^TB_1 - \sqrt{M/N}\,I_{N-1} \end{pmatrix}. \]
With the notation $H^{(k)} = B_k^TB_k - \sqrt{M/N}\,I_{N-1}$ and $G^{(k)} = (H^{(k)} - z)^{-1}$, we have
\[ \mathrm{Tr}\,G - \mathrm{Tr}\,G^{(1)} = \frac{1 + y_1\cdot B_1B_1^T\big(B_1B_1^T - \sqrt{M/N} - z\big)^{-2}y_1}{y_1\cdot y_1 - \sqrt{M/N} - z - y_1\cdot B_1B_1^T\big(B_1B_1^T - \sqrt{M/N} - z\big)^{-1}y_1} =: \frac{1 + U}{V}, \tag{4.2} \]
and $G_{11} = V^{-1}$. Here we used the basic identity
\[ B_1\Big(B_1^TB_1 - \sqrt{\tfrac{M}{N}} - z\Big)^{-n}B_1^T = B_1B_1^T\Big(B_1B_1^T - \sqrt{\tfrac{M}{N}} - z\Big)^{-n}, \quad n = 1, 2. \]
To estimate (4.1), we only need to deal with the first term
\[ \mathbb{E}\{|\gamma_N - \mathbb{E}_1\{\gamma_N\}|^2\} = \mathbb{E}\Big\{\Big|\frac{1+U}{V} - \mathbb{E}_1\Big\{\frac{1+U}{V}\Big\}\Big|^2\Big\}, \tag{4.3} \]
since the others are analogous. So it suffices to estimate $\mathbb{E}\{|UV^{-1} - \mathbb{E}\{UV^{-1}\}|^2\}$ and $\mathbb{E}\{|V^{-1} - \mathbb{E}\{V^{-1}\}|^2\}$; we present only the first one below. Clearly we have
\[ \mathbb{E}\{|UV^{-1} - \mathbb{E}\{UV^{-1}\}|^2\} \le \mathbb{E}\Big\{\Big|\frac{U}{V} - \frac{\mathbb{E}\{U\}}{\mathbb{E}\{V\}}\Big|^2\Big\} = \mathbb{E}\Big\{\Big|\frac{U - \mathbb{E}\{U\}}{\mathbb{E}\{V\}} - \frac{V - \mathbb{E}\{V\}}{\mathbb{E}\{V\}}\cdot\frac{U}{V}\Big|^2\Big\}. \tag{4.4} \]
Observing that $|\mathbb{E}\{V\}|^{-1} \le \eta^{-1}$ and
\[ \Big|\frac{U}{V}\Big| \le \Big|\frac{U}{\mathrm{Im}\,V}\Big| = \frac{y_1\cdot B_1B_1^T|B_1B_1^T - \sqrt{M/N} - z|^{-2}y_1}{\eta + \eta\,y_1\cdot B_1B_1^T|B_1B_1^T - \sqrt{M/N} - z|^{-2}y_1} \le \eta^{-1}, \]
it suffices to provide the estimates
\[ \mathbb{E}\{|U - \mathbb{E}\{U\}|^2\},\;\mathbb{E}\{|V - \mathbb{E}\{V\}|^2\} = O\Big(\frac{1}{N}\Big). \tag{4.5} \]
For simplicity we introduce
\[ M^{[1]} = [M^{[1]}_{jk}]_{M\times M} := B_1B_1^T\Big(B_1B_1^T - \sqrt{\tfrac{M}{N}} - z\Big)^{-1}, \qquad M^{[2]} = [M^{[2]}_{jk}]_{M\times M} := B_1B_1^T\Big(B_1B_1^T - \sqrt{\tfrac{M}{N}} - z\Big)^{-2}. \]
Thus, writing $Y_i$ for $Y_{i1}$, we have
\[ U - \mathbb{E}_1\{U\} = \sum_{i\ne j}M^{[2]}_{ij}Y_iY_j + \sum_i M^{[2]}_{ii}\big(Y_i^2 - \mathbb{E}\{Y_i^2\}\big), \]
\[ V - \mathbb{E}_1\{V\} = \Big(y_1\cdot y_1 - \sqrt{\tfrac{M}{N}}\Big) - \sum_{i\ne j}M^{[1]}_{ij}Y_iY_j - \sum_i M^{[1]}_{ii}\big(Y_i^2 - \mathbb{E}\{Y_i^2\}\big). \]
It is not difficult to get
\[ \mathbb{E}\{|U - \mathbb{E}\{U\}|^2\} \le \frac{C}{MN}\sum_{i,j}|M^{[2]}_{ij}|^2 = \frac{C}{MN}\,\mathrm{Tr}\,|M^{[2]}|^2, \tag{4.6} \]
\[ \mathbb{E}\{|V - \mathbb{E}\{V\}|^2\} \le \frac{C_1}{MN}\,\mathrm{Tr}\,|M^{[1]}|^2 + \frac{C_2}{N}. \tag{4.7} \]
Now denote the eigenvalues of $H^{(k)}$ by $\mu^{(k)}_1 \le \cdots \le \mu^{(k)}_{N-1}$. Note that $\mu^{(k)}_\alpha$ ($\alpha = 1, \cdots, N-
1) are also eigenvalues of the matrix
\[ \check{H}^{(k)} := B_kB_k^T - \sqrt{\tfrac{M}{N}}\,I_M, \]
which in addition has the $(M - N + 1)$-fold eigenvalue $-\sqrt{M/N}$. It follows that
\[ \mathrm{Tr}\,|M^{[n]}|^2 = \sum_{\alpha=1}^{N-1}\frac{\big(\mu^{(1)}_\alpha + \sqrt{M/N}\big)^2}{|\mu^{(1)}_\alpha - z|^{2n}} = O(M), \quad n = 1, 2, \]
which implies (4.5) in view of (4.6) and (4.7). $\Box$

Proof of Lemma 3.2. We prove the claims only for $G_{11}$ and $\widetilde{G}_{11}$; the others are analogous. As shown above,
\[ G_{11} = V^{-1} = \frac{1}{y_1\cdot y_1 - \sqrt{M/N} - z - y_1\cdot M^{[1]}y_1}. \tag{4.8} \]
We denote the unit eigenvector of $\check{H}^{(1)}$ corresponding to the eigenvalue $\mu^{(1)}_\alpha$ by $v^{(1)}_\alpha = (v^{(1)}_\alpha(1), \cdots, v^{(1)}_\alpha(M))$ ($\alpha = 1, \cdots, N-
1) and set $\xi^{(1)}_\alpha = \sqrt{MN}\,|y_1\cdot v^{(1)}_\alpha|^2$. It is easy to check that $\mathbb{E}\{\xi^{(1)}_\alpha\} = 1$. Then by (4.8) we have
\[ G_{11} = \frac{1}{y_1\cdot y_1 - \sqrt{M/N} - z - \frac{1}{\sqrt{MN}}\sum_{\alpha=1}^{N-1}\frac{(\mu^{(1)}_\alpha + \sqrt{M/N})\,\xi^{(1)}_\alpha}{\mu^{(1)}_\alpha - z}} =: \frac{1}{-z - m_N(z) + r_1}, \]
where
\[ r_1 = \Big(m_N(z) - \frac{1}{N}\sum_{\alpha=1}^{N-1}\frac{\sqrt{N/M}\,\mu^{(1)}_\alpha + 1}{\mu^{(1)}_\alpha - z}\Big) + \big(V - \mathbb{E}_1\{V\}\big). \]
As a consequence we have
\[ \Big|G_{11} + \frac{1}{z + f_N(z)}\Big| = \Big|\frac{f_N(z) - m_N(z) + r_1}{(-z - m_N(z) + r_1)(z + f_N(z))}\Big| \le C\,\big|(f_N(z) - m_N(z)) + r_1\big|. \]
Thus we obtain
\[ \mathbb{E}\Big|G_{11} + \frac{1}{z + f_N(z)}\Big|^2 \le C\Big(\mathrm{Var}\{m_N(z)\} + \mathbb{E}\Big|m_N(z) - \frac{1}{N}\sum_{\alpha=1}^{N-1}\frac{\sqrt{N/M}\,\mu^{(1)}_\alpha + 1}{\mu^{(1)}_\alpha - z}\Big|^2 + \mathbb{E}|V - \mathbb{E}_1\{V\}|^2\Big). \tag{4.9} \]
Note the trivial bound
\[ \sqrt{\frac{N}{M}}\,\Big|\frac{1}{N}\sum_{\alpha=1}^{N-1}\frac{\mu^{(1)}_\alpha}{\mu^{(1)}_\alpha - z}\Big| = O\Big(\sqrt{\frac{N}{M}}\Big), \tag{4.10} \]
and the fact that
\begin{align*}
\Big|m_N(z) - \frac{1}{N}\sum_{\alpha=1}^{N-1}\frac{1}{\mu^{(1)}_\alpha - z}\Big| &= \Big|\int\frac{dF_N(x)}{x - z} - \Big(1 - \frac{1}{N}\Big)\int\frac{dF^{(1)}_N(x)}{x - z}\Big| \\
&= \frac{1}{N}\Big|\int\frac{NF_N(x) - (N-1)F^{(1)}_N(x)}{(x - z)^2}\,dx\Big| \le \frac{1}{N}\int\frac{dx}{|x - z|^2} = \frac{\pi}{N\eta}, \tag{4.11}
\end{align*}
where $F^{(1)}_N(x)$ is the empirical spectral distribution of $H^{(1)}$. In the inequality above we used the well known interlacing property between the eigenvalues of a Hermitian matrix and those of its submatrix:
\[ \lambda_1 \le \mu^{(1)}_1 \le \lambda_2 \le \cdots \le \mu^{(1)}_{N-1} \le \lambda_N. \]
Combining Lemma 3.
1, (4.5), (4.10) and (4.11), we conclude the proof of part (i) of Lemma 3.2. For $\widetilde{G}$ we now turn to
\[ \widetilde{H} = YY^T - \sqrt{\tfrac{M}{N}}\,I_M. \]
Denote the $k$-th row of $Y_N$ by $\tilde y_k$ and the $(M-1)\times N$ matrix consisting of the other $M - 1$ rows of $Y_N$ by $D_k$, and further introduce the matrix
\[ \widetilde{H}^{(k)} := D_k^TD_k - \sqrt{\tfrac{M}{N}}\,I_N, \]
with eigenvalues $\tilde\mu^{(k)}_\alpha$, $\alpha = 1, \cdots, N$. Then we have the representation
\[ \widetilde{H} = \begin{pmatrix} \tilde y_1\cdot\tilde y_1 - \sqrt{M/N} & \tilde y_1D_1^T \\ D_1\tilde y_1^T & D_1D_1^T - \sqrt{M/N}\,I_{M-1} \end{pmatrix}. \]
Using $\widetilde{\mathbb{E}}\{\cdot\}$ to denote the expectation with respect to $\tilde y_1$, and setting
\[ \widetilde{V} = \tilde y_1\,D_1^TD_1\Big(D_1^TD_1 - \sqrt{\tfrac{M}{N}} - z\Big)^{-1}\tilde y_1^T, \]
we have
\[ \widetilde{G}_{11} = \frac{1}{\tilde y_1\cdot\tilde y_1 - \sqrt{M/N} - z - \widetilde{V}} =: \frac{1}{-\sqrt{M/N} - z - m_N(z) + \tilde r_1}, \]
where $\tilde r_1 = \tilde y_1\cdot\tilde y_1 + m_N(z) - \widetilde{V}$, which, in analogy with $r_1$, decomposes into
\[ \Big(m_N(z) - \frac{1}{N}\sum_{\alpha=1}^{N}\frac{\sqrt{N/M}\,\tilde\mu^{(1)}_\alpha + 1}{\tilde\mu^{(1)}_\alpha - z}\Big) - \big(\widetilde{V} - \widetilde{\mathbb{E}}\{\widetilde{V}\}\big) \]
plus centered and deterministic terms of order $O(\sqrt{N/M})$. Now we introduce the event
\[ \Omega_\circ := \Big\{|f_N(z) - m_N(z) + \tilde r_1(z)| \ge \tfrac{1}{2}\sqrt{\tfrac{M}{N}}\Big\}, \]
and observe that on $\Omega^c_\circ$ we have
\[ \Big|\Big(\sqrt{\tfrac{M}{N}} + z + m_N(z) - \tilde r_1(z)\Big)\Big(\sqrt{\tfrac{M}{N}} + z + f_N(z)\Big)\Big| \ge C\,\frac{M}{N}. \]
Then we have
\begin{align*}
\mathbb{E}\Big|\widetilde{G}_{11} + \frac{1}{\sqrt{M/N} + z + f_N(z)}\Big|^2 &= \mathbb{E}\Big|\frac{f_N(z) - m_N(z) + \tilde r_1(z)}{\big(\sqrt{M/N} + z + m_N(z) - \tilde r_1(z)\big)\big(\sqrt{M/N} + z + f_N(z)\big)}\Big|^2 \\
&\le C\Big[\Big(\frac{N}{M}\Big)^2\,\mathbb{E}|f_N(z) - m_N(z) + \tilde r_1(z)|^2 + \frac{N}{M}\,\mathbb{E}\big\{\mathbf{1}_{\Omega_\circ}|f_N(z) - m_N(z) + \tilde r_1(z)|^2\big\}\Big], \tag{4.12}
\end{align*}
where we used the fact that
\[ \Big|\Big(\sqrt{\tfrac{M}{N}} + z + m_N(z) - \tilde r_1(z)\Big)\Big(\sqrt{\tfrac{M}{N}} + z + f_N(z)\Big)\Big| \ge C\sqrt{\frac{M}{N}} \tag{4.13} \]
holds on the full probability space. To see this, denote the unit eigenvector of $\widetilde{H}^{(1)}$ corresponding to the eigenvalue $\tilde\mu^{(1)}_\alpha$ by $\tilde v^{(1)}_\alpha = (\tilde v^{(1)}_\alpha(1), \cdots, \tilde v^{(1)}_\alpha(N))$ ($\alpha = 1, \cdots, N$), and set $\tilde\xi^{(1)}_\alpha = \sqrt{MN}\,|\tilde y_1\cdot\tilde v^{(1)}_\alpha|^2$. Then we have
\begin{align*}
\sqrt{\tfrac{M}{N}} + z + m_N(z) - \tilde r_1(z) &= \sqrt{\tfrac{M}{N}} + z - \tilde y_1\Big[I - \Big(\widetilde{H}^{(1)} + \sqrt{\tfrac{M}{N}}\Big)\big(\widetilde{H}^{(1)} - z\big)^{-1}\Big]\tilde y_1^T \\
&= \Big(\sqrt{\tfrac{M}{N}} + z\Big)\Big(1 + \frac{1}{\sqrt{MN}}\sum_{\alpha=1}^N\frac{\tilde\xi^{(1)}_\alpha}{\tilde\mu^{(1)}_\alpha - z}\Big) =: \Big(\sqrt{\tfrac{M}{N}} + z\Big)(1 + S). \tag{4.14}
\end{align*}
Taking into account that $\tilde\mu^{(1)}_\alpha \ge -\sqrt{M/N}$ and $\tilde\xi^{(1)}_\alpha \ge
0, we have
\[ \mathrm{Re}\,S \ge -C\sqrt{\frac{M}{N}}\,\mathrm{Im}\,S, \qquad \mathrm{Im}\,S \ge 0, \]
which implies
\[ |1 + S| \ge C\sqrt{\frac{N}{M}}. \tag{4.15} \]
Then (4.13) is a direct consequence of (4.14) and (4.15). Now we proceed to the estimate of (4.12). Observing that, by Markov's inequality,
\[ \mathbb{P}(\Omega_\circ) \le C\,\frac{N}{M}\,\mathbb{E}|f_N(z) - m_N(z) + \tilde r_1(z)|^2, \]
it suffices to provide
\[ \mathbb{E}|f_N(z) - m_N(z) + \tilde r_1(z)|^2 \le C_1\frac{N}{M} + \frac{C_2}{N}, \]
which can be derived similarly to what we have done for (4.9). $\Box$

5. CLT for Linear Eigenvalue Statistics
To prove Theorem 1 .
2, we will follow the recent article [14] by Shcherbina. Since there are only some technical differences, we state only the main body of the proof in this section and leave the technical details to the Appendix.

Proof of Theorem 1.2. Similarly to [14], as we will see, the sixth moment $\omega_6$ is needed in our proof. So we first truncate the variables $X_{jk}$ at $(MN)^{1/6}$ and then re-centralize them. To use the truncated matrix, it is necessary to show first that its linear eigenvalue statistics have the same limiting distribution as the original ones. We denote the truncated and re-centralized matrices by
\[ \widehat{Y} = [\widehat{Y}_{ij}]_{M\times N}, \quad \widehat{Y}_{ij} = Y_{ij}\mathbf{1}_{\{|X_{ij}| \le (MN)^{1/6}\}}, \qquad \breve{Y} = [\breve{Y}_{ij}]_{M\times N}, \quad \breve{Y}_{ij} = \widehat{Y}_{ij} - \mathbb{E}\{\widehat{Y}_{ij}\}, \]
and further introduce
\[ \widehat{H}_N = \widehat{Y}^T\widehat{Y} - \sqrt{\tfrac{M}{N}}I_N, \quad \widehat{\mathcal{L}}_N[\varphi] = \mathrm{Tr}\,\varphi(\widehat{H}_N), \qquad \breve{H}_N = \breve{Y}^T\breve{Y} - \sqrt{\tfrac{M}{N}}I_N, \quad \breve{\mathcal{L}}_N[\varphi] = \mathrm{Tr}\,\varphi(\breve{H}_N), \]
\[ \widehat{H}_N(t) = \big[\widehat{Y} + t(Y - \widehat{Y})\big]^T\big[\widehat{Y} + t(Y - \widehat{Y})\big] - \sqrt{\tfrac{M}{N}}I_N. \]
Below we denote the eigenvalues and corresponding eigenvectors of $\widehat{H}_N(t)$ by $\lambda_i(t)$ and $v_i(t)$ ($i = 1, \cdots, N$). It suffices to prove that, as $N$ tends to infinity,
\[ \mathbb{E}\{|e^{ix\mathcal{L}^\circ_N[\varphi]} - e^{ix\widehat{\mathcal{L}}^\circ_N[\varphi]}|\} \le 2\,\mathbb{P}\{\widehat{Y} \ne Y\} + |x|\,\big|\mathbb{E}\{\mathcal{L}_N[\varphi]\} - \mathbb{E}\{\widehat{\mathcal{L}}_N[\varphi]\}\big| \to 0, \tag{5.1} \]
\[ \mathbb{E}\{|e^{ix\widehat{\mathcal{L}}^\circ_N[\varphi]} - e^{ix\breve{\mathcal{L}}^\circ_N[\varphi]}|\} \le 2|x|\,\mathbb{E}\{|\widehat{\mathcal{L}}_N[\varphi] - \breve{\mathcal{L}}_N[\varphi]|\} \to 0. \tag{5.2} \]
The proof of (5.2) being similar to that of (5.1), we only give the latter below. First, $\mathbb{P}\{\widehat{Y} \ne Y\}$ is $o(1)$ by the truncation, since
\[ \sum_{j,k}\mathbb{P}\{|X_{jk}| > (MN)^{1/6}\} \le \mathbb{E}\big\{|X_{11}|^6\mathbf{1}_{\{|X_{11}| > (MN)^{1/6}\}}\big\} = o(1). \]
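The effect of this truncation step can be sketched numerically (an illustration only; the entry distribution, a rescaled Student $t$ with df = 7 so that exactly six moments are finite, and the dimensions are arbitrary choices for the demonstration): entries exceed the level $(MN)^{1/6}$ with probability controlled by the Markov bound $\mathbb{E}\{X^6\}/\tau^6$, and the truncated, re-centralized entries still have variance close to 1.

```python
import numpy as np

# Truncation sketch: entries with six finite moments (rescaled Student t,
# df = 7 -- an arbitrary illustrative choice), truncated at tau = (MN)^{1/6}.
rng = np.random.default_rng(3)
M, N = 10000, 100
tau = (M * N) ** (1 / 6)                               # here tau = 10.0

X = rng.standard_t(df=7, size=M * N) / np.sqrt(7 / 5)  # unit-variance entries
p_hat = np.mean(np.abs(X) > tau)                       # fraction truncated
bound = np.mean(X**6) / tau**6                         # Markov bound E{X^6}/tau^6

X_trunc = np.where(np.abs(X) <= tau, X, 0.0)
X_check = X_trunc - X_trunc.mean()                     # re-centralized entries

print(p_hat <= bound, round(X_check.var(), 3))
```

The bound $\mathbb{E}\{X^6\}/\tau^6 = \mathbb{E}\{X^6\}/(MN)$ is summable over the $MN$ entries only up to a constant, which is exactly why the dominated-convergence refinement $\mathbb{E}\{|X|^6\mathbf{1}_{\{|X| > (MN)^{1/6}\}}\} = o(1)$ is used in the proof above.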
Also, we have
\[
\begin{aligned}
\mathbb E\{|\mathcal L_N[\varphi]-\widehat{\mathcal L}_N[\varphi]|\}
&=\int_0^1 dt\,\mathbb E\Big\{\Big|\sum_i \varphi'(\lambda_i(t))\lambda_i'(t)\Big|\Big\}
\le \|\varphi'\|_\infty\int_0^1 dt\,\mathbb E\Big\{\sum_i \big|v_i^T(t)\widehat H_N'(t)v_i(t)\big|\Big\}\\
&\le \frac{\|\varphi'\|_\infty}{\sqrt{MN}}\int_0^1 dt\,\mathbb E\Big\{\sum_{i,j}\big|[(Y-\widehat Y)^T\widehat Y+\widehat Y^T(Y-\widehat Y)+2t(Y-\widehat Y)^T(Y-\widehat Y)]_{ij}\big|\Big\}\\
&\le \frac{\|\varphi'\|_\infty}{\sqrt{MN}}\int_0^1 dt\,\bigg\{\sum_{i,k}\Big[\mathbb E\{|Y_{ki}-\widehat Y_{ki}|\cdot|\widehat Y_{ki}|\}+t\,\mathbb E\{|Y_{ki}-\widehat Y_{ki}|^2\}\Big]\\
&\qquad+\sum_{i\neq j}\sum_k\Big[\mathbb E|Y_{ki}-\widehat Y_{ki}|\cdot\big|\mathbb E\{\widehat Y_{kj}\}\big|+t\,\mathbb E|Y_{ki}-\widehat Y_{ki}|\cdot\mathbb E|Y_{kj}-\widehat Y_{kj}|\Big]\bigg\}.
\end{aligned}\tag{5.3}
\]
By the truncation and $\mathbb E\{X_{kj}\}=0$, we have
\[
\big|\mathbb E\{\widehat Y_{kj}\}\big|\le (MN)^{-3/4}\int |X_{kj}|^{4}\,\mathbf 1_{\{|X_{kj}|\ge(MN)^{1/4}\}}\,dP\le C(MN)^{-3/4},
\]
\[
\mathbb E|Y_{ki}-\widehat Y_{ki}|\le (MN)^{-3/4}\int |X_{ki}|^{4}\,\mathbf 1_{\{|X_{ki}|\ge(MN)^{1/4}\}}\,dP\le C(MN)^{-3/4},
\]
\[
\mathbb E|Y_{ki}-\widehat Y_{ki}|^{2}\le (MN)^{-1/2}\int |X_{ki}|^{4}\,\mathbf 1_{\{|X_{ki}|\ge(MN)^{1/4}\}}\,dP\le C(MN)^{-1/2}.
\]
It is then easy to check that (5.3) tends to $0$ as $N$ tends to infinity, so (5.1) holds. Hence, without loss of generality, we may and do assume in the sequel that
\[
\mathbb E\{X_{jk}\}=0,\qquad \mathbb E\{X_{jk}^{2}\}=1+o(1),\qquad \mathbb E\{X_{jk}^{4}\}=\omega+o(1),\tag{5.4}
\]
\[
\mathbb E\{X_{jk}^{6}\}\le C(MN)^{1/2},\qquad \mathbb E\{X_{jk}^{8}\}\le C\,MN.\tag{5.5}
\]
Relying on (5.4) and (5.5), we have the following lemma, which collects all the estimates we need to derive the central limit theorem for $\mathcal L_N[\varphi]$.

Lemma 5.1. Under the assumptions (5.4), (5.5) and our basic assumption
$N/M\to 0$ as $N\to\infty$, the following estimates hold for any fixed $z$ with $\eta=\operatorname{Im}z>0$, any $z_1,z_2$ with $\operatorname{Im}z_1,\operatorname{Im}z_2>0$ and any $0<\delta<1$:
\[
\operatorname{Var}\{\gamma_N\}\le CN^{-1}\sum_{i=1}^{N}\big(\mathbb E\{|G_{ii}(z)|^{2\delta}\}\big)\big(\eta^{-1-\delta}+\eta^{-2\delta}\big),\tag{5.6}
\]
\[
\mathbb E\{(V^{\circ})^{2}\},\ \mathbb E\{(U^{\circ})^{2}\},\ \mathbb E\{|V^{\circ}|^{2}\},\ \mathbb E\{|U^{\circ}|^{2}\}=O(N^{-1}),\tag{5.7}
\]
\[
\begin{aligned}
N\,\mathbb E\{V^{\circ}(z_1)V^{\circ}(z_2)\}&=(\omega-3)-\frac{\omega-3}{M}\operatorname{Tr}\big[M^{[1]}(z_1)+M^{[1]}(z_2)\big]+\frac{\omega-3}{M}\sum_i M^{[1]}_{ii}(z_1)M^{[1]}_{ii}(z_2)\\
&\quad+\frac{2}{M}\operatorname{Tr}\big[M^{[1]}(z_1)M^{[1]}(z_2)\big]+\frac{1}{M}\Big(\sqrt{\tfrac MN}+z_1\Big)\Big(\sqrt{\tfrac MN}+z_2\Big)\gamma^{\circ(1)}_{N}(z_1)\gamma^{\circ(1)}_{N}(z_2),
\end{aligned}\tag{5.8}
\]
\[
\operatorname{Var}\big\{N\,\mathbb E\{V^{\circ}(z_1)V^{\circ}(z_2)\}\big\},\ \operatorname{Var}\big\{N\,\mathbb E\{V^{\circ}(z_1)U^{\circ}(z_2)\}\big\}=O\Big(\frac NM\vee\frac1N\Big),\tag{5.9}
\]
\[
\mathbb E\{|\gamma^{\circ}_N|^{4}\}\le CN^{1/2}\eta^{-4}+o(N^{1/2}),\qquad \mathbb E\{|\gamma^{\circ(1)}_N-\gamma^{\circ}_N|^{2}\}=O(N^{-1/2}),\tag{5.10}
\]
\[
\big|\mathbb E\{\gamma^{(1)}_N\}/N-f(z)\big|,\ \big|\mathbb E^{-1}\{V\}+f(z)\big|=o(1).\tag{5.11}
\]
We postpone the proof of Lemma 5.1 to the Appendix. By Proposition 1, Proposition 3 of [14] and (5.6), we only need to prove Theorem 1.2 for the functions $\varphi=\varphi_\eta$ which are the convolutions of some $\varphi_0$ with the Poisson kernel $P_\eta=\eta/\big(\pi(x^2+\eta^2)\big)$, where $\varphi_0$ is restricted to satisfy $\int|\varphi_0(\lambda)|\,d\lambda\le C$ with some positive constant $C$. For such test functions $\varphi$ we have
\[
\mathcal L^{\circ}_N[\varphi]=\frac1\pi\int\varphi_0(\mu)\operatorname{Im}\gamma^{\circ}_N(z_\mu)\,d\mu,\qquad z_\mu=\mu+i\eta.
\]
Following the notation of [14], we set
\[
Z_N(x)=\mathbb E\{e^{ix\mathcal L^{\circ}_N[\varphi]}\},\qquad e(x)=e^{ix\mathcal L^{\circ}_N[\varphi]},\qquad e_1(x)=e^{ix(\mathcal L^{(1)}_N[\varphi])^{\circ}},\tag{5.12}
\]
where $\mathcal L^{(1)}_N[\varphi]$ stands for the corresponding linear eigenvalue statistic of $H^{(1)}$. Observe that Theorem 1.2 can be proved by showing that
\[
\frac{d}{dx}Z_N(x)=-xV[\varphi_0,\eta]\,Z_N(x)+o(1).\tag{5.13}
\]
To do this, we introduce
\[
Y_N(z,x):=\mathbb E\{\operatorname{Tr}G(z)\,e^{\circ}(x)\}=\sum_k\mathbb E\{G_{kk}(z)e^{\circ}(x)\}.\tag{5.14}
\]
Thus we have
\[
\frac{d}{dx}Z_N(x)=\frac{1}{2\pi}\int\varphi_0(\mu)\big(Y_N(z_\mu,x)-Y_N(\bar z_\mu,x)\big)\,d\mu.
\]
We only need to deal with the first term in the summation (5.14); the others are analogous. So we can deal with $N\,\mathbb E\{V^{-1}e^{\circ}(x)\}:=T_1+T_2$ instead, where
\[
T_1=-N\,\mathbb E\{(V^{-1})^{\circ}e_1(x)\},\qquad T_2=-N\,\mathbb E\{(V^{-1})^{\circ}(e(x)-e_1(x))\}.
\]
Since the calculations of $T_1$ and $T_2$ are similar to those in the case studied in [14], we do not reproduce the tedious process. Indeed, by inserting the estimates of Lemma 5.1 we easily obtain
\[
T_1=f^{2}(z)Y_N(z,x)+O\Big(\frac{1}{\sqrt N}\vee\sqrt{\frac NM}\Big)
\]
and
\[
T_2=ixZ_N(x)\int d\mu\,\varphi_0(\mu)\,\frac{D(z,z_\mu)-D(z,\bar z_\mu)}{2i\pi}+o(1),
\]
where
\[
D(z,z_\mu):=2f^{2}(z)f(z_\mu)\big(1+f'(z_\mu)\big)\Big(\frac{f(z)-f(z_\mu)}{z-z_\mu}+\omega-3\Big)+2f^{2}(z)f(z_\mu)\frac{d}{dz_\mu}\Big(\frac{f(z)-f(z_\mu)}{z-z_\mu}\Big).\tag{5.15}
\]
So we have
\[
Y_N(z,x)=f^{2}(z)Y_N(z,x)+ixZ_N(x)\int d\mu\,\varphi_0(\mu)\,\frac{D(z,z_\mu)-D(z,\bar z_\mu)}{2i\pi}+o(1),
\]
i.e.,
\[
Y_N(z,x)=\frac{ixZ_N(x)}{1-f^{2}(z)}\int d\mu\,\varphi_0(\mu)\,\frac{D(z,z_\mu)-D(z,\bar z_\mu)}{2i\pi}+o(1).
\]
Thus the variance in (5.13) can be represented as
\[
V[\varphi_0,\eta]=\frac{1}{4\pi^{2}}\iint\varphi_0(\mu_1)\varphi_0(\mu_2)\Big(C(z_{\mu_1},\bar z_{\mu_2})+C(\bar z_{\mu_1},z_{\mu_2})-C(z_{\mu_1},z_{\mu_2})-C(\bar z_{\mu_1},\bar z_{\mu_2})\Big)\,d\mu_1\,d\mu_2,
\]
where
\[
C(z,z_\mu)=\frac{D(z,z_\mu)}{1-f^{2}(z)}.
\]
We can get (1.5) by comparing (5.15) with (2.40) of [14], which completes the proof. $\Box$

Appendix
Proof of Lemma 5.1. We begin with (5.6); we will use (4.1) and (4.3), and so we need to estimate (4.4) more carefully. According to (4.6) and (4.7), we have
\[
\mathbb E\bigg\{\bigg|\frac{V-\mathbb E\{V\}}{\mathbb E\{V\}}\bigg|^{2}\bigg\}\le \frac{C}{MN}\,\mathbb E\bigg\{\frac{\operatorname{Tr}|M^{[1]}|^{2}}{\big|z+\frac{1}{\sqrt{MN}}\operatorname{Tr}M^{[1]}\big|^{2}}\bigg\}+\frac{C}{N\,|\mathbb E\{V\}|^{2}}\tag{6.1}
\]
and
\[
\mathbb E\bigg\{\bigg|\frac{U-\mathbb E\{U\}}{\mathbb E\{V\}}\bigg|^{2}\bigg\}\le \frac{C}{MN}\,\mathbb E\bigg\{\frac{\operatorname{Tr}|M^{[2]}|^{2}}{\big|z+\frac{1}{\sqrt{MN}}\operatorname{Tr}M^{[1]}\big|^{2}}\bigg\}.
\]
For (6.1), if we set $\mathcal N^{(1)}=B^{T}B$, we have
\[
\operatorname{Tr}M^{[1]}=\operatorname{Tr}G^{(1)}\mathcal N^{(1)},\qquad \operatorname{Tr}|M^{[1]}|^{2}=\operatorname{Tr}G^{(1)}\mathcal N^{(1)}G^{(1)*}\mathcal N^{(1)}.
\]
So
\[
\begin{aligned}
\frac{1}{MN}\operatorname{Tr}|M^{[1]}|^{2}&=\frac{1}{MN}\operatorname{Tr}G^{(1)}\mathcal N^{(1)}G^{(1)*}\mathcal N^{(1)}
\le\frac{1}{MN}\big[\operatorname{Tr}G^{(1)}\mathcal N^{(1)}G^{(1)*}\big]^{1-\delta}\big[\operatorname{Tr}G^{(1)}\big(\mathcal N^{(1)}\big)^{\frac{1+\delta}{\delta}}G^{(1)*}\big]^{\delta}\\
&\le\frac{1}{MN}\big[\operatorname{Tr}G^{(1)}\mathcal N^{(1)}G^{(1)*}\big]^{1-\delta}\big[\operatorname{Tr}\big(\mathcal N^{(1)}\big)^{\frac{1+\delta}{\delta}}\big]^{\delta}\,\eta^{-2\delta}\\
&=\frac{1}{N}\Big[\frac{1}{\sqrt{MN}}\operatorname{Tr}G^{(1)}\mathcal N^{(1)}G^{(1)*}\Big]^{1-\delta}\Big[\frac{1}{N}\Big(\frac{N}{M}\Big)^{\frac{1+\delta}{2\delta}}\operatorname{Tr}\big(\mathcal N^{(1)}\big)^{\frac{1+\delta}{\delta}}\Big]^{\delta}\,\eta^{-2\delta}\\
&=\frac{1}{N}\Big[\frac{1}{\sqrt{MN}}\operatorname{Tr}G^{(1)}\mathcal N^{(1)}G^{(1)*}\Big]^{1-\delta}\Big[\frac{1}{N}\operatorname{Tr}\Big(\frac{X^{(1)T}X^{(1)}}{M}\Big)^{\frac{1+\delta}{\delta}}\Big]^{\delta}\,\eta^{-2\delta}.
\end{aligned}
\]
Furthermore, we have
\[
\operatorname{Im}\operatorname{Tr}G^{(1)}\mathcal N^{(1)}=\eta\operatorname{Tr}G^{(1)}\mathcal N^{(1)}G^{(1)*},
\]
which implies
\[
\mathbb E\bigg\{\frac{\frac{1}{MN}\operatorname{Tr}|M^{[1]}|^{2}}{\big|z+\frac{1}{\sqrt{MN}}\operatorname{Tr}M^{[1]}\big|^{2}}\bigg\}
\le \frac{C}{N}\,\eta^{-1-\delta}\,\mathbb E\bigg\{\Big[\frac1N\operatorname{Tr}\Big(\frac{X^{(1)T}X^{(1)}}{M}\Big)^{\frac{1+\delta}{\delta}}\Big]^{\delta}\,\big|\mathbb E\{V\}\big|^{-2\delta}\bigg\}
\le \frac{C}{N}\,\eta^{-1-\delta}\,\mathbb E\bigg\{\Big[\frac1N\operatorname{Tr}\Big(\frac{X^{(1)T}X^{(1)}}{M}\Big)^{\frac{1+\delta}{\delta}}\Big]^{\delta}\,\mathbb E\{|G_{11}|^{2\delta}\}\bigg\}.\tag{6.2}
\]
Here we have used Jensen's inequality $|\mathbb E\{V\}|^{-2\delta}\le \mathbb E\{|V|^{-2\delta}\}$. Controlling the factor $\frac1N\operatorname{Tr}\big(X^{(1)T}X^{(1)}/M\big)^{\frac{1+\delta}{\delta}}$ in (6.2) by the corresponding moment estimates for sample covariance matrices in [4] (applied with the ratio $y$ there replaced by $N/M$), we can get
\[
\mathbb E\bigg\{\bigg|\frac{V-\mathbb E\{V\}}{\mathbb E\{V\}}\bigg|^{2}\bigg\}\le \frac{C}{N}\,\mathbb E\{|G_{11}|^{2\delta}\}\big(\eta^{-1-\delta}+\eta^{-2\delta}\big).
\]
Then, by the fact that $\operatorname{Im}V=-\eta U-\eta$, we also have
\[
\mathbb E\bigg\{\bigg|\frac{V-\mathbb E\{V\}}{\mathbb E\{V\}}\cdot\frac{U}{V}\bigg|^{2}\bigg\}\le \frac{C}{N}\big(\mathbb E\{|G_{11}|^{2\delta}\}\big)\big(\eta^{-1-\delta}+\eta^{-2\delta}\big)
\]
and
\[
\mathbb E\bigg\{\bigg|\frac{U-\mathbb E\{U\}}{\mathbb E\{V\}}\bigg|^{2}\bigg\}\le \frac{C}{N}\big(\mathbb E\{|G_{11}|^{2\delta}\}\big)\big(\eta^{-1-\delta}+\eta^{-2\delta}\big).
\]
Next we turn to the proof of the first inequality of (5.10), which will be used in the proof of (5.7). We use the following inequality for martingales (see [6]):
\[
\mathbb E\{|\gamma^{\circ}_N|^{4}\}\le CN\sum_{k=1}^{N}\mathbb E\{|(\mathbb E_{k-1}-\mathbb E_{k})\{\gamma_N\}|^{4}\}.
\]
Similar to (4.3), it suffices to check
\[
\mathbb E\{|U-\mathbb E\{U\}|^{4}\}\le CN^{-3/2}\eta^{-4}+o(N^{-3/2})\tag{6.3}
\]
and
\[
\mathbb E\{|V-\mathbb E\{V\}|^{4}\}\le CN^{-3/2}\eta^{-4}+o(N^{-3/2}).\tag{6.4}
\]
We only present the proof of (6.4) below; (6.3) is analogous. Observing that $M^{[1]}_{ii}=(BG^{(1)}B^{T})_{ii}$, (3.8) and (3.10) are valid as well. We have
\[
\sum_i\mathbb E|M^{[1]}_{ii}|^{4}\le C\eta^{-2}\,\frac{M}{N}\sum_i\mathbb E|M^{[1]}_{ii}|^{2}=CM\eta^{-2},
\]
which implies
\[
\begin{aligned}
\mathbb E|V-\mathbb E\{V\}|^{4}&\le C\,\mathbb E\bigg\{\mathbb E_y\bigg[\Big|\sum_{i\neq j}M^{[1]}_{ij}Y_iY_j\Big|^{4}+\Big|\sum_i M^{[1]}_{ii}\Big(Y_i^{2}-\frac{1}{\sqrt{MN}}\Big)\Big|^{4}+\Big|y\cdot y^{T}-\sqrt{\frac MN}\Big|^{4}\bigg]\bigg\}\\
&\le \frac{C}{M^{2}N^{2}}\,\mathbb E\big\{\big(\operatorname{Tr}|M^{[1]}|^{2}\big)^{2}\big\}+\frac{C}{M^{2}N^{2}}\sum_i\mathbb E\big\{|M^{[1]}_{ii}|^{4}\big\}\,\omega+\frac{C}{M^{2}N^{2}}\,\mathbb E\big\{\big(\operatorname{Tr}|M^{[1]}|\big)^{2}\big\}+\frac{C}{M^{2}N^{2}}\,\omega\\
&\le CN^{-3/2}\eta^{-4}+o(N^{-3/2}).
\end{aligned}
\]
For (5.7), we only deal with $\mathbb E\{|V^{\circ}|^{2}\}$ below; the others are similar.
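The martingale moment inequality of [6] quoted above, of the form $\mathbb E|\sum_k d_k|^4\le C\,n\sum_k\mathbb E|d_k|^4$ for martingale differences $d_k$, can be sanity-checked on a toy example. The sketch below is illustrative only: sums of independent centered variables form a martingale, and the constant $3$ is a convenient stand-in, not the constant of [6].

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 200, 2000

# partial sums of independent centered entries form a martingale whose
# differences are the entries themselves; compare E|sum d_k|^4 with n * sum E|d_k|^4
d = rng.uniform(-1.0, 1.0, size=(reps, n))      # centered differences, E d_k = 0
lhs = np.mean(np.sum(d, axis=1) ** 4)           # Monte Carlo estimate of E|sum_k d_k|^4
rhs = n * np.sum(np.mean(d**4, axis=0))         # n * sum_k E|d_k|^4
print(lhs, 3.0 * rhs)                           # lhs stays below the (illustrative) bound
```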
Note that
\[
V^{\circ}=V-\mathbb E_y\{V\}+\mathbb E_y\{V\}-\mathbb E\{V\}=V-\mathbb E_y\{V\}+\frac{1}{\sqrt{MN}}\big(\operatorname{Tr}M^{[1]}-\mathbb E\operatorname{Tr}M^{[1]}\big)=V-\mathbb E_y\{V\}+\Big(\frac1N+\frac{z}{\sqrt{MN}}\Big)\big(\gamma^{(1)}_N-\mathbb E\{\gamma^{(1)}_N\}\big).\tag{6.5}
\]
So applying the first inequality of (5.10) to $\gamma^{(1)}_N$, together with (6.5), we can get (5.7). For (5.8), note the following expansion:
\[
\begin{aligned}
N\,\mathbb E\{V^{\circ}(z_1)V^{\circ}(z_2)\}&=(\omega-3)-\frac{\omega-3}{M}\operatorname{Tr}\big[M^{[1]}(z_1)+M^{[1]}(z_2)\big]+\frac{\omega-3}{M}\sum_i M^{[1]}_{ii}(z_1)M^{[1]}_{ii}(z_2)\\
&\quad+\frac{2}{M}\operatorname{Tr}\big[M^{[1]}(z_1)M^{[1]}(z_2)\big]+\frac{1}{M}\big(\operatorname{Tr}M^{[1]}(z_1)-\mathbb E\operatorname{Tr}M^{[1]}(z_1)\big)\big(\operatorname{Tr}M^{[1]}(z_2)-\mathbb E\operatorname{Tr}M^{[1]}(z_2)\big)\\
&=(\omega-3)-\frac{\omega-3}{M}\operatorname{Tr}\big[M^{[1]}(z_1)+M^{[1]}(z_2)\big]+\frac{\omega-3}{M}\sum_i M^{[1]}_{ii}(z_1)M^{[1]}_{ii}(z_2)\\
&\quad+\frac{2}{M}\operatorname{Tr}\big[M^{[1]}(z_1)M^{[1]}(z_2)\big]+\frac{1}{M}\Big(\sqrt{\tfrac MN}+z_1\Big)\Big(\sqrt{\tfrac MN}+z_2\Big)\gamma^{\circ(1)}_N(z_1)\gamma^{\circ(1)}_N(z_2).
\end{aligned}\tag{6.6}
\]
To deal with the first estimate in (5.9), we only need to take care of the variances of the third and fourth terms of (6.6). Using (3.10) for $M^{[1]}_{ii}$ again, we can get
\[
\operatorname{Var}\{M^{[1]}_{ii}\}\le C\frac NM+\frac CM,
\]
together with the trivial bound $|M^{[1]}_{ii}|\le C\sqrt{M/N}$, we have
\[
\operatorname{Var}\Big\{\frac1M\sum_i M^{[1]}_{ii}(z_1)M^{[1]}_{ii}(z_2)\Big\}\le\frac1M\sum_i\operatorname{Var}\big\{M^{[1]}_{ii}(z_1)M^{[1]}_{ii}(z_2)\big\}\le\frac1M\sum_i\mathbb E\Big(\big|M^{[1]}_{ii}(z_1)M^{[1]\circ}_{ii}(z_2)\big|+\big|M^{[1]\circ}_{ii}(z_1)\big|\,\big|\mathbb E M^{[1]}_{ii}(z_2)\big|\Big)^{2}\le C\frac NM+\frac CN.
\]
Also, we have
\[
\begin{aligned}
\operatorname{Tr}M^{[1]}(z_1)M^{[1]}(z_2)&=\sum_{\alpha=1}^{N-1}\frac{\big(\mu^{(1)}_{\alpha}+\sqrt{M/N}\big)^{2}}{\big(\mu^{(1)}_{\alpha}-z_1\big)\big(\mu^{(1)}_{\alpha}-z_2\big)}\\
&=\bigg[\frac MN-z_1z_2+\Big(\sqrt{\tfrac MN}+\frac{z_1+z_2}{2}\Big)(z_1+z_2)\bigg]\sum_{\alpha=1}^{N-1}\frac{1}{\big(\mu^{(1)}_{\alpha}-z_1\big)\big(\mu^{(1)}_{\alpha}-z_2\big)}\\
&\quad+\Big(\sqrt{\tfrac MN}+\frac{z_1+z_2}{2}\Big)\sum_{\alpha=1}^{N-1}\frac{2\mu^{(1)}_{\alpha}-z_1-z_2}{\big(\mu^{(1)}_{\alpha}-z_1\big)\big(\mu^{(1)}_{\alpha}-z_2\big)}+N-1\\
&=\bigg[\frac MN-z_1z_2+\Big(\sqrt{\tfrac MN}+\frac{z_1+z_2}{2}\Big)(z_1+z_2)\bigg]\operatorname{Tr}\frac{G^{(1)}(z_1)-G^{(1)}(z_2)}{z_1-z_2}\\
&\quad+\Big(\sqrt{\tfrac MN}+\frac{z_1+z_2}{2}\Big)\operatorname{Tr}\big(G^{(1)}(z_1)+G^{(1)}(z_2)\big)+N-1.
\end{aligned}
\]
Using (5.6) for $\operatorname{Tr}G^{(1)}=\gamma^{(1)}_N$, we have
\[
\operatorname{Var}\Big\{\frac1M\operatorname{Tr}M^{[1]}(z_1)M^{[1]}(z_2)\Big\}\le \frac CN.
\]
The estimate of $\operatorname{Var}\{N\,\mathbb E\{V^{\circ}(z_1)U^{\circ}(z_2)\}\}$ is straightforward by taking into account that $(G^{(1)})^{2}=dG^{(1)}/dz$. The second estimate of (5.10) is a direct consequence of (4.2) and (5.7). The first part of (5.11) is just the consequence of (1.4) if we replace $f_N(z)=\gamma_N/N$ by $\gamma^{(1)}_N/N$, and the second one follows directly from the facts
\[
\mathbb E\{V\}=z+\frac1N\,\mathbb E\{\gamma^{(1)}_N\}+O\Big(\sqrt{\tfrac NM}\Big)=z+f(z)+o(1)\qquad\text{and}\qquad f(z)=-\frac{1}{z+f(z)}.
\]
So we complete the proof. $\Box$
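The self-consistency relation $f(z)=-1/(z+f(z))$ used in the last step characterizes the Stieltjes transform of the semicircle law. The sketch below is a numerical illustration only (hypothetical sizes, Gaussian entries): it checks the relation in closed form and then compares $m_N(z)=N^{-1}\operatorname{Tr}(H-z)^{-1}$ with $f(z)$ for a simulated renormalized sample covariance matrix.

```python
import numpy as np

def f_semicircle(z):
    """Stieltjes transform of the semicircle law: the root of f^2 + z f + 1 = 0
    with Im f(z) > 0 when Im z > 0, i.e. f(z) = (-z + sqrt(z^2 - 4))/2 on the right branch."""
    s = np.sqrt(z * z - 4.0 + 0j)
    f = (-z + s) / 2.0
    if f.imag < 0:          # exactly one of the two roots has positive imaginary part
        f = (-z - s) / 2.0
    return f

z = 0.3 + 1.0j
f = f_semicircle(z)
print(abs(f + 1.0 / (z + f)))   # self-consistency f(z) = -1/(z + f(z)), used for (5.11)

# empirical check: m_N(z) = N^{-1} Tr(H - z)^{-1} is close to f(z) when N/M is small
rng = np.random.default_rng(2)
M, N = 4000, 100
X = rng.standard_normal((M, N))
H = X.T @ X / np.sqrt(M * N) - np.sqrt(M / N) * np.eye(N)
m_N = np.mean(1.0 / (np.linalg.eigvalsh(H) - z))
print(abs(m_N - f))
```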
References

[1] S. Albeverio, L. Pastur, M. Shcherbina: On the 1/n expansion for some unitary invariant ensembles of random matrices. Commun. Math. Phys. 224, 271-305 (2001).
[2] G.W. Anderson, O. Zeitouni: CLT for a band matrix model. Probab. Theory Related Fields 134, 283-338 (2006).
[3] Z.D. Bai, J.W. Silverstein: CLT for linear spectral statistics of large-dimensional sample covariance matrices. Ann. Probab. 32, 553-605 (2004).
[4] Z.D. Bai, J.W. Silverstein: Spectral Analysis of Large Dimensional Random Matrices. Mathematics Monograph Series 2, Science Press, Beijing.
[5] Z.D. Bai, Y.Q. Yin: Convergence to the semicircle law. Ann. Probab. 16, No. 2, 863-875 (1988).
[6] S.W. Dharmadhikari, V. Fabian, K. Jogdeo: Bounds on the moments of martingales. Ann. Math. Statist. 39, 1719-1723 (1968).
[7] N. El Karoui: On the largest eigenvalue of Wishart matrices with identity covariance when n, p and p/n → ∞. Preprint, arXiv:math.ST/0309355.
[8] P.L. Hsu: On the distribution of roots of certain determinantal equations. Ann. Eugenics 9, 250-258.
[9] K. Johansson: On fluctuations of eigenvalues of random Hermitian matrices. Duke Math. J. 91, 151-204 (1998).
[10] A.M. Khorunzhy, B.A. Khoruzhenko, L.A. Pastur: On asymptotic properties of large random matrices with independent entries. J. Math. Phys. 37, 5033 (1996).
[11] A.M. Khorunzhy, W. Kirsch: On asymptotic expansions and scales of spectral universality in band random matrix ensembles. Commun. Math. Phys. 231, 223-255 (2002).
[12] A. Lytova, L. Pastur: Central limit theorem for linear eigenvalue statistics of random matrices with independent entries. Ann. Probab. 37, No. 5, 1778-1840 (2009).
[13] V.A. Marčenko, L.A. Pastur: Distribution of eigenvalues for some sets of random matrices. Math. USSR-Sb. 1, 457-483 (1967).
[14] M. Shcherbina: Central limit theorem for linear eigenvalue statistics of the Wigner and sample covariance random matrices. Preprint, arXiv:1101.3249v1.
[15] Ya. Sinai, A. Soshnikov: Central limit theorem for traces of large random symmetric matrices with independent matrix elements. Bol. Soc. Brasil. Mat. (N.S.) 29, 1-24 (1998).
[16] E. Wigner: Characteristic vectors of bordered matrices with infinite dimensions. Ann. of Math. 62, 548-564 (1955).
[17] J. Wishart: The generalized product moment distribution in samples from a normal multivariate population. Biometrika 20A, 32-43 (1928).