Convergence rate to the Tracy-Widom laws for the largest eigenvalue of Wigner matrices
Kevin Schnelli ∗ KTH Royal Institute of Technology
Yuanyuan Xu † KTH Royal Institute of Technology
Abstract.
This paper shows that the largest eigenvalue of a real symmetric or complex Hermitian Wigner matrix of size $N$ converges to the Tracy–Widom laws at a rate $O(N^{-1/3+\omega})$, as $N$ tends to infinity. For Wigner matrices this improves the previous rate $O(N^{-2/9+\omega})$ obtained by Bourgade [5] for generalized Wigner matrices. Our result follows from a Green function comparison theorem, originally introduced in [19] to prove edge universality, on a finer spectral parameter scale with improved error estimates. The proof relies on the continuous Green function flow induced by a matrix-valued Ornstein–Uhlenbeck process. Precise estimates on leading contributions from the third and fourth order moments of the matrix entries are obtained using iterative cumulant expansions and recursive comparisons for correlation functions, along with uniform convergence estimates for correlation kernels of the Gaussian invariant ensembles.

Date: February 8, 2021

1. Introduction and main results
In this paper we study a quantitative version of the edge universality for Wigner random matrices. Let $H_N$ be a real symmetric or complex Hermitian Wigner matrix of size $N$. Then the edge universality asserts that the largest eigenvalue, $\lambda_N$, of $H_N$ satisfies
\[
  \lim_{N\to\infty} \mathbb{P}\Big( N^{2/3}(\lambda_N - 2) < r \Big) = \mathrm{TW}_\beta(r), \qquad r \in \mathbb{R}, \tag{1.1}
\]
where $\mathrm{TW}_\beta$ are the cumulative distribution functions of the Tracy–Widom laws [42, 43] and $\beta = 1, 2$ ($\beta = 1$ for real symmetric and $\beta = 2$ for complex Hermitian Wigner matrices). The universality of the Tracy–Widom laws was first proved in [38, 39] for Wigner matrices whose entries have symmetric distributions. This symmetry assumption was partially removed in [34, 35]. Edge universality for Wigner matrices whose entries have vanishing third moments was proved in [41]. Edge universality without moment matching was proved in [19] for Wigner matrices and in [2, 6] for generalized Wigner matrices. A necessary and sufficient condition on the entries' distributions for the edge universality to hold was given in [30].

The main result of this paper is an estimate on the rate of convergence in (1.1) for Wigner matrices. Theorem 1.3 below states that, for any fixed $r_0 \in \mathbb{R}$ and small $\omega > 0$,
\[
  \sup_{r > r_0} \Big| \mathbb{P}\Big( N^{2/3}(\lambda_N - 2) < r \Big) - \mathrm{TW}_\beta(r) \Big| \le N^{-1/3+\omega}, \tag{1.2}
\]
for $N$ sufficiently large. For the Gaussian unitary ensemble (GUE, $\beta = 2$) and Gaussian orthogonal ensemble (GOE, $\beta = 1$) it was established in [24] that the convergence rate for the largest eigenvalue is of order $O(N^{-2/3})$; see Theorem 1.2 below. The first rate of convergence for non-invariant ensembles was recently given by Bourgade in [5], where the upper bound $O(N^{-2/9+\omega})$ for the convergence rate was obtained for generalized Wigner matrices.

∗ Supported by the Swedish Research Council Grant VR-2017-05195, and the Knut and Alice Wallenberg Foundation.
† Supported by the Swedish Research Council Grant VR-2017-05195.
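To get a feeling for the three rates quoted above, the following sketch computes, for a target accuracy $\delta$, the matrix size at which each power law $N^{-a}$ drops below $\delta$. It is an illustrative calculation only: the function name `size_for_accuracy` is hypothetical, and the arbitrarily small $\omega$ as well as all multiplicative constants in the actual bounds are ignored, so the numbers indicate orders of magnitude at best.

```python
# Exponents a in the bounds O(N^{-a}) quoted in the introduction.
# Assumption: the arbitrarily small omega and all constants are dropped.
RATES = {
    "Gaussian ensembles [24]": 2 / 3,
    "Wigner matrices (this paper)": 1 / 3,
    "generalized Wigner matrices [5]": 2 / 9,
}


def size_for_accuracy(exponent: float, delta: float) -> float:
    """Return the N solving N**(-exponent) == delta, i.e. delta**(-1/exponent)."""
    if not (0 < delta < 1) or exponent <= 0:
        raise ValueError("need 0 < delta < 1 and exponent > 0")
    return delta ** (-1.0 / exponent)


if __name__ == "__main__":
    for label, a in RATES.items():
        print(f"{label}: N of order {size_for_accuracy(a, 1e-2):.3g}")
```

For $\delta = 10^{-2}$ the three exponents give matrix sizes of order $10^3$, $10^6$ and $10^9$, respectively, which illustrates how much the exponent matters in practice.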
The proof of the estimate in (1.2) is based on the Green function comparison method for the edge universality by Erdős, Yau and Yin [19]. Our main technical result, given in Theorem 1.4, compares the expectation of a suitably chosen function of the Green function or resolvent of the Wigner matrix $H_N$ with the corresponding quantity for the Gaussian invariant ensembles. Instead of the traditional Lindeberg type swapping strategy [8, 19, 41], we use the continuous Green function flow induced by a matrix-valued Ornstein–Uhlenbeck process in combination with cumulant expansions [28, 29] for the comparison. To achieve the convergence rate in (1.2) the comparison is required on a much finer spectral scale than the typical $O(N^{-2/3})$ edge scaling. This requires in turn precise estimates on the contributions to the Green function flow from third and fourth order moments.

Contributions from third moments can be estimated using the idea of unmatched indices [19]; however, due to the finer spectral scale, we require expansions to arbitrary order in terms of the control parameter of the strong local law for the Green function [19] to implement this idea. This step relies on applying cumulant expansions iteratively to Green functions, observing a cancellation to leading order [21, 22, 29]. The usefulness of cumulant expansions in random matrix theory was recognized in [26] and they have been widely used since, e.g., [7, 16, 20, 31].

Contributions from fourth moments are controlled by first showing that they can be reduced to trace-like correlation functions of products of Green functions. This step is motivated by the Weingarten calculus [10] to compute Haar integrals of products of eigenvector components for the invariant Gaussian ensembles. The actual reduction for non-invariant ensembles relies on applying cumulant expansions iteratively. In a second step we compare the resulting trace-like correlation functions between Wigner matrices and the invariant ensembles using again the interpolating flow.
This leads to a hierarchy of correlation functions which, after expansion to arbitrary order, can be recursively estimated by the local law for the Green function. Finally, we need to control the trace-like correlation functions for the invariant ensembles. This is accomplished by using the uniform asymptotics [13] for correlation kernels of the invariant ensembles in the edge scaling.

Edge universality can also be studied through the dynamical approach of Erdős, Schlein and Yau. The local relaxation time of Dyson's Brownian motion (DBM) at the edges is known [1, 5, 27] to be of order $O(N^{-1/3})$. Combining his quantitative local relaxation estimates for the DBM with a Green function comparison for short times, Bourgade obtained in [5] the convergence rate $O(N^{-2/9+\omega})$ to the Tracy–Widom laws for generalized Wigner matrices. In view of the local relaxation time of the DBM at the spectral edges, we suspect that the convergence rate estimate in (1.2) is optimal for Wigner matrices. In addition, in our proof the leading error terms of order $O(N^{-1/3+\omega})$ depend on the fourth cumulants of the matrix entries, which is in agreement with the necessary and sufficient Lindeberg condition for the fourth moments in [30] for the edge universality.

The methods presented in this paper are rather robust and can be applied to other random matrix models. Of interest in statistics are in particular convergence rate estimates for sample covariance matrices. For the white Wishart ensemble the convergence rate $O(N^{-2/3})$ was obtained in [14, 32]. Edge universality for sample covariance matrices was established in [36] and a first quantitative version appeared recently in [44]. In the accompanying article [37] we establish the results corresponding to (1.2) for sample covariance matrices.
We further expect that in combination with [5] our method can be used to derive quantitative edge universality for other random matrix models such as generalized Wigner matrices and sample covariance matrices with general populations.

Acknowledgment: We thank Paul Bourgade and Maurice Duits for useful comments and suggestions.

1.1. Setup and main results. Let $H \equiv H_N$ be an $N \times N$ Wigner matrix satisfying the following assumption:
Assumption 1.1. For a real symmetric ($\beta = 1$) Wigner matrix, we assume:

1. The matrix entries $\{H_{ij} \mid i \le j\}$ are independent real-valued centered random variables with $H_{ij} = H_{ji}$.
2. For $i \neq j$, $\mathbb{E}[(\sqrt{N} H_{ij})^2] = 1$, and $\mathbb{E}[(\sqrt{N} H_{ii})^2]$ are uniformly bounded.
3. All moments of the entries of $\sqrt{N} H_N$ are uniformly bounded, i.e., for any $k \in \mathbb{N}$, there exists $C_k$ independent of $N$ such that, for all $1 \le i, j \le N$,
\[
  \mathbb{E}\big[ |\sqrt{N} H_{ij}|^k \big] \le C_k. \tag{1.3}
\]

For a complex Hermitian ($\beta = 2$) Wigner matrix, we assume:

a. The matrix entries $\{\mathrm{Re}\, H_{ij}, \mathrm{Im}\, H_{ij} \mid i \le j\}$ are independent real-valued centered random variables with $H_{ij} = \overline{H_{ji}}$.
b. For $i \neq j$, $\mathbb{E}[|\sqrt{N} H_{ij}|^2] = 1$, and $\mathbb{E}[(H_{ij})^2] = 0$.
c. The bound (1.3) holds true.

The Gaussian ensembles, which we denote by G$\beta$E for short, are Wigner matrices with Gaussian entries: For the Gaussian unitary ensemble (GUE, $\beta = 2$) the off-diagonal matrix entries $\sqrt{N} H_{ij}$, $i \neq j$, are standard complex Gaussians (i.e., $\sqrt{N} H_{ij} \sim \mathcal{N}(0, \tfrac12) + \mathrm{i}\, \mathcal{N}(0, \tfrac12)$) and the diagonal entries are standard real Gaussians, $\sqrt{N} H_{ii} \sim \mathcal{N}(0, 1)$. For the Gaussian orthogonal ensemble (GOE, $\beta = 1$) the matrix entries are real Gaussians with $\sqrt{N} H_{ij} \sim \mathcal{N}(0, 1)$, $i \neq j$, and $\sqrt{N} H_{ii} \sim \mathcal{N}(0, 2)$.

Let the eigenvalues $(\lambda_j)_{j=1}^N$ of $H_N$ be arranged in non-decreasing order. It is well known that the largest eigenvalue $\lambda_N$ converges to $2$ in probability. The typical spacing of the top eigenvalues near $2$ is of order $O(N^{-2/3})$, due to the square-root behavior at the end points of the limiting spectral density and eigenvalue rigidity. The limiting distribution of the properly rescaled largest eigenvalue of the Gaussian ensembles was found by Tracy and Widom in [42, 43]. The corresponding convergence rate was quantified by Johnstone and Ma [24] in the following theorem.

Theorem 1.2 (Convergence rate for the Gaussian ensembles). Let $H_N$ be the GUE. For any fixed $r_0 \in \mathbb{R}$, there exists a constant $C = C(r_0)$ such that
\[
  \sup_{r > r_0} \Big| \mathbb{P}^{\mathrm{GUE}}\Big( N^{2/3}(\lambda_N - 2) < r \Big) - \mathrm{TW}_2(r) \Big| \le C N^{-2/3}. \tag{1.4}
\]
Moreover, considering the GOE with $N$ even, we have
\[
  \sup_{r > r_0} \Big| \mathbb{P}^{\mathrm{GOE}}\Big( (N - \tfrac12)^{1/6} \sqrt{N} \Big( \lambda_N - \big(4 - \tfrac{2}{N}\big)^{1/2} \Big) < r \Big) - \mathrm{TW}_1(r) \Big| \le C N^{-2/3}. \tag{1.5}
\]

The first quantitative convergence rate $O(N^{-2/9+\omega})$ for generalized Wigner matrices was obtained by Bourgade [5] using optimal local relaxation estimates for the Dyson Brownian motion and a quantitative Green function comparison theorem for short times.

The main result of this paper is an improved bound for the convergence rate of the distribution of $N^{2/3}(\lambda_N - 2)$ for Wigner matrices.
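The $N^{-2/3}$ edge scale behind the rescaling $N^{2/3}(\lambda_N - 2)$ can be recovered by a short heuristic computation with the semicircle density $\rho_{sc}(x) = \frac{1}{2\pi}\sqrt{(4 - x^2)_+}$. The following worked estimate is a sketch and not part of the proofs; the symbol $s_k$, denoting the distance from the edge $2$ within which one expects $k$ eigenvalues, is introduced here only for illustration.

```latex
% Square-root behaviour of the semicircle density at the upper edge:
\[
  \rho_{sc}(2-u) \;=\; \frac{1}{2\pi}\sqrt{u\,(4-u)}
  \;=\; \frac{\sqrt{u}}{\pi}\,\bigl(1+O(u)\bigr),
  \qquad u \downarrow 0 .
\]
% Expected number of eigenvalues within distance s of the edge:
\[
  N\int_{0}^{s}\rho_{sc}(2-u)\,\mathrm{d}u
  \;=\; \frac{2N}{3\pi}\,s^{3/2}\,\bigl(1+O(s)\bigr).
\]
% Setting this equal to k and solving for s:
\[
  s_k \;\approx\; \Bigl(\frac{3\pi k}{2N}\Bigr)^{2/3},
  \qquad\text{so that}\qquad s_1 \asymp N^{-2/3}.
\]
```

In particular, the top eigenvalues are expected to sit at distance of order $N^{-2/3}$ from the edge and from each other, which is why the factor $N^{2/3}$ appears in (1.4) and in the main result.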
Theorem 1.3 (Convergence rate for Wigner matrices). Let $H_N$ be a real or complex Wigner matrix satisfying Assumption 1.1. For any fixed $r_0 \in \mathbb{R}$ and $\omega > 0$,
\[
  \sup_{r > r_0} \Big| \mathbb{P}\Big( N^{2/3}(\lambda_N - 2) < r \Big) - \mathrm{TW}_\beta(r) \Big| \le N^{-1/3+\omega}, \tag{1.6}
\]
for sufficiently large $N \ge N_0(r_0, \omega)$. The corresponding statement holds for the smallest eigenvalue $\lambda_1$ of $H_N$.

The proof of Theorem 1.3 relies on the Green function comparison method [18, 19]. Let $H \equiv H_N$ be a Wigner matrix satisfying Assumption 1.1. Let
\[
  G(z) := \frac{1}{H - z}, \qquad m_N(z) := \frac{1}{N} \mathrm{Tr}\, G(z), \qquad z \in \mathbb{C}^+, \tag{1.7}
\]
denote the resolvent or Green function of $H_N$ and its normalized trace. The distribution of the rescaled largest eigenvalue can be linked to the expectation (of smooth functions) of the imaginary part of $m_N(z)$ for appropriately chosen spectral parameters $z$; see Subsection 2.3. The main technical result of this paper is the following theorem.

Theorem 1.4 (Green function comparison theorem). Let $F$ be any smooth function with uniformly bounded derivatives. For any small $\epsilon > 0$, let $N^{-1+\epsilon} \le \eta \le N^{-2/3-\epsilon}$ and $|\kappa_1|, |\kappa_2| \le C_0 N^{-2/3+\epsilon}$ for some $C_0 > 0$. Then there exists some $c > 0$ that does not depend on $\epsilon$, such that
\[
  \Big| \big( \mathbb{E} - \mathbb{E}^{\mathrm{G}\beta\mathrm{E}} \big) \Big[ F\Big( N \int_{\kappa_1}^{\kappa_2} \mathrm{Im}\, m_N(2 + x + \mathrm{i}\eta)\, \mathrm{d}x \Big) \Big] \Big| \le N^{-1/3 + c\epsilon}, \tag{1.8}
\]
for sufficiently large $N \ge N_0(\epsilon, C_0)$.

Remark 1.5. A first Green function comparison theorem at the spectral edges was obtained in [19] for spectral parameters $\eta$ of size $O(N^{-2/3-\epsilon})$ and with an error estimate of size $O(N^{-1/6+c\epsilon})$. The constant $c$ in the upper bound in (1.8) can be chosen as any number bigger than one.
An inspection of our proof in fact yields that the upper bound in (1.8) can be written as

max { K , | M − |} N − + c ǫ + N − / ǫ ,

where $M := \max_i \big| \mathbb{E}[(\sqrt{N} h_{ii})^2] \big|$, $K := \max_{i \neq j} \big| c^{(4)}(\sqrt{N} h_{ij}) \big|$ for $\beta = 1$, and $K := \max_{i \neq j} \big| c^{(2,2)}(\sqrt{N} h_{ij}) \big|$ for $\beta = 2$, with $c^{(4)}(\sqrt{N} h_{ij})$ the fourth cumulants of $\sqrt{N} h_{ij}$ given in (2.27), and $c^{(2,2)}$ the corresponding (2,2)-cumulants.

1.2. Organization of the paper and outline of proofs.
The paper is organized as follows. In Section 2, we provide the preliminaries for the proofs, e.g., the local law for the Green function and cumulant expansions, and recall some properties of the invariant ensembles. In Section 3, following the approach of [19], we first reduce the proof of the main Theorem 1.3 to the Green function comparison Theorem 1.4. We then prove Theorem 1.4 using the idea of the interpolating Green function flow and the key estimates on the resulting drift term stated in Proposition 3.3 below.

In Section 4, before we give the proof of Proposition 3.3 for arbitrary functions $F$, we prove the corresponding Green function comparison theorem in the simplest case, $F(x) = x$; see Proposition 4.1. To make the statements easier, we first consider complex Hermitian Wigner matrices. The proof of Proposition 4.1 is carried out in Sections 4 to 6 using interpolations and expansions. In the following, we sketch the proof.

(1) We first set up the interpolation (3.8) between a given Wigner matrix and the GUE using the matrix Ornstein–Uhlenbeck process (3.7). Using Itô's formula, we derive the stochastic evolution for the time-dependent normalized trace of the Green function $m_N(t, z)$ in (4.4). It then suffices to estimate the drift term given in (4.6). Using the cumulant expansions of Lemma 2.6, we expand the expectation of the drift term up to the fourth order and observe a cancellation of the second order terms in the cumulant expansions (4.5).

(2) All the third order terms, as well as the fourth order terms excluding the ones corresponding to the (2,2)-cumulants of the off-diagonal entries, are unmatched; see Definition 4.2. The contributions from these unmatched terms are negligible; see Proposition 4.3, which is proved in Section 6. In Subsection 6.1, we use the Weingarten calculus to validate Proposition 4.3 for the GUE. In Subsection 6.2, we introduce the expansion mechanism by studying an example of an unmatched term for arbitrary Wigner matrices. In Subsection 6.3, we give the proof of Proposition 4.3 for any unmatched term using the expansion mechanism iteratively.

(3) The fourth order terms corresponding to the (2,2)-cumulants of the off-diagonal entries and the second order terms corresponding to the second cumulants of the diagonal entries are given in terms of type-AB and type-A terms; see Definition 4.5. Motivated by the GUE computations based on the Weingarten calculus in Subsection 5.1, we show that such terms can be expanded into trace-like correlation functions of Green functions, referred to as type-0 terms in Definition 4.5; see Proposition 4.6. The proof of Proposition 4.6 is presented in Subsection 5.2, using cumulant expansions iteratively. The resulting type-0 terms are estimated in Lemma 4.8, which is proved in the following Subsection 5.3 using the idea of comparison recursively. The key observation is that, after deriving the stochastic evolution (5.27) of any type-0 term containing $d$ off-diagonal Green function entries, we can expand the corresponding drift term (5.29) to arbitrary order using Proposition 4.6 and end up with type-0 terms containing at least $d + 1$ off-diagonal Green function entries; see (5.32). By recursive comparison, Lemma 4.8 follows from the local law in (3.10) for the Green function and the estimates of type-0 terms for the GUE in Lemma 5.4. The last Subsection 5.4 is devoted to proving Lemma 5.4 using the determinantal structure of the GUE and the properties of its correlation kernel in the edge scaling.

In Section 7, we extend the above ideas to general functions $F$, and use the estimate (4.3) from Proposition 4.1 as an input to prove Proposition 3.3. We then conclude with the Green function comparison in Theorem 1.4 and hence our main result, Theorem 1.3. In the last Section 8, the real symmetric case is proved with the required modifications.

Notation:
We will use the following definition on high-probability estimates from [15].

Definition 1.6. Let $X \equiv X^{(N)}$ and $Y \equiv Y^{(N)}$ be two sequences of nonnegative random variables. We say $Y$ stochastically dominates $X$ if, for all (small) $\tau > 0$ and (large) $\Gamma > 0$,
\[
  \mathbb{P}\big( X^{(N)} > N^{\tau} Y^{(N)} \big) \le N^{-\Gamma}, \tag{1.9}
\]
for sufficiently large $N \ge N_0(\tau, \Gamma)$, and we write $X \prec Y$ or $X = O_\prec(Y)$.

We often use the notation $\prec$ also for deterministic quantities; then (1.9) holds with probability one. Properties of stochastic domination can be found in the following lemma.

Lemma 1.7 (Proposition 6.5 in [17]).
(1) $X \prec Y$ and $Y \prec Z$ imply $X \prec Z$;
(2) If $X_1 \prec Y_1$ and $X_2 \prec Y_2$, then $X_1 + X_2 \prec Y_1 + Y_2$ and $X_1 X_2 \prec Y_1 Y_2$;
(3) If $X \prec Y$, $\mathbb{E} Y \ge N^{-c_1}$ and $|X| \le N^{c_2}$ almost surely with some fixed exponents $c_1, c_2 > 0$, then we have $\mathbb{E} X \prec \mathbb{E} Y$.

For any vector $v \in \mathbb{C}^N$, let $v(j)$ be the $j$-th entry of the vector. For any matrix $A \in \mathbb{C}^{N \times N}$, the matrix norm induced by the Euclidean vector norm is given by $\|A\| := \sigma_{\max}(A)$, where $\sigma_{\max}(A)$ denotes the largest singular value of $A$. We denote the sup norm of the matrix by $\|A\|_{\max} := \max_{i,j} |A_{ij}|$. We use the notation $\underline{A} := \frac{1}{N} \mathrm{Tr}\, A$ for the normalized trace.

Throughout the paper, we use $c$ and $C$ to denote strictly positive constants that are independent of $N$. Their values may change from line to line. We use the standard Big-O and little-o notations for large $N$. For $X, Y \in \mathbb{R}$, we write $X \ll Y$ if there exists a small $c > 0$ such that $|X| \le N^{-c} |Y|$ for large $N$. Moreover, we write $X \sim Y$ if there exist constants $c, C > 0$ such that $c|Y| \le |X| \le C|Y|$ for large $N$. Finally, we denote the upper half-plane by $\mathbb{C}^+ := \{ z \in \mathbb{C} : \mathrm{Im}\, z > 0 \}$, and the non-negative real numbers by $\mathbb{R}^+ := \{ x \in \mathbb{R} : x \ge 0 \}$.

2. Preliminaries
In this section, we collect some basic notation, tools and results required in the subsequent sections. In particular, we introduce the local law for the Green function of Wigner matrices and eigenvalue rigidity estimates; relate the distribution function of the largest eigenvalue to the normalized trace of the Green function; introduce the cumulant expansion formalism; and finally recall properties of the GUE and the Airy kernel.

2.1. Local law for Wigner matrices. For a probability measure $\nu$ on $\mathbb{R}$ denote by $m_\nu$ its Stieltjes transform, i.e.,
\[
  m_\nu(z) := \int_{\mathbb{R}} \frac{\mathrm{d}\nu(x)}{x - z}, \qquad z \in \mathbb{C}^+. \tag{2.1}
\]
We refer to $z$ as the spectral parameter and often write $z = E + \mathrm{i}\eta$, $E \in \mathbb{R}$, $\eta > 0$. Note that $m_\nu : \mathbb{C}^+ \to \mathbb{C}^+$ is analytic and can be analytically continued to the real line outside the support of $\nu$. Moreover, $m_\nu$ satisfies $\lim_{\eta \nearrow \infty} \mathrm{i}\eta\, m_\nu(\mathrm{i}\eta) = -1$. The Stieltjes transform of the semicircle distribution $\rho_{sc}(x) := \frac{1}{2\pi} \sqrt{(4 - x^2)_+}$ is denoted by $m_{sc}(z)$. It is well known that $m_{sc}(z)$ is the unique solution to
\[
  1 + z\, m(z) + m^2(z) = 0, \tag{2.2}
\]
satisfying $\mathrm{Im}\, m_{sc}(z) > 0$ for $\mathrm{Im}\, z > 0$. The Stieltjes transform of the empirical eigenvalue measure of a Wigner matrix $H_N$, $\mu_N := \frac{1}{N} \sum_{j=1}^N \delta_{\lambda_j}$, is then given by the normalized trace of its Green function defined in (1.7).

Let $\kappa = \kappa(E)$ be the distance from $E \in \mathbb{R}$ to the closest edge point of the semicircle law, i.e.,
\[
  \kappa := \min\{ |E - 2|, |E + 2| \}. \tag{2.3}
\]
Define the domain of the spectral parameter $z$,
\[
  S := \{ z = E + \mathrm{i}\eta : |E| \le 5,\ 0 < \eta \le 10 \}. \tag{2.4}
\]
The Stieltjes transform $m_{sc}$ has the following quantitative properties; for a reference, see, e.g., [17].

Lemma 2.1. The Stieltjes transform of the semicircular law has the following properties:
(1) The imaginary part of $m_{sc}$ satisfies
\[
  |\mathrm{Im}\, m_{sc}(z)| \sim
  \begin{cases}
    \sqrt{\kappa + \eta}, & \text{if } E \in [-2, 2], \\
    \dfrac{\eta}{\sqrt{\kappa + \eta}}, & \text{otherwise},
  \end{cases} \tag{2.5}
\]
uniformly in $z \in S$.
(2) There is a strictly positive constant $c$ such that
\[
  c \le |m_{sc}(z)| \le 1 - c\eta \tag{2.6}
\]
holds for all $z \in S$.

For any arbitrarily small $\epsilon > 0$, introduce the following subdomain of $S$,
\[
  S_0 \equiv S_0(\epsilon) := \big\{ z = E + \mathrm{i}\eta : |E| \le 5,\ N^{-1+\epsilon} \le \eta \le 10 \big\}. \tag{2.7}
\]
We also define the deterministic control parameter
\[
  \Psi \equiv \Psi(z) := \sqrt{\frac{\mathrm{Im}\, m_{sc}(z)}{N\eta}} + \frac{1}{N\eta}, \qquad z = E + \mathrm{i}\eta. \tag{2.8}
\]
In particular, from (2.5), for any $z \in S_0$, we have
\[
  \frac{C}{\sqrt{N}} \le \Psi(z) \ll 1. \tag{2.9}
\]
Moreover, for $z \in S_{\mathrm{edge}}$ defined in (4.1) below, we have $\Psi(z) = O\big(\frac{1}{N\eta}\big)$, as follows from (2.5).

With this notation, we are now ready to state the following local law for the Green function of a Wigner matrix.

Theorem 2.2 (Local law for Wigner matrices [19]). Let $H$ be a symmetric or Hermitian $N$ by $N$ matrix satisfying Assumption 1.1 and recall the Green function of $H$ and its normalized trace in (1.7). Then we have
\[
  \max_{1 \le i, j \le N} |G_{ij}(z) - \delta_{ij} m_{sc}(z)| \prec \Psi(z), \qquad |m_N(z) - m_{sc}(z)| \prec \frac{1}{N\eta}, \tag{2.10}
\]
uniformly in $z \in S_0$.

2.2. Rigidity of eigenvalues.
The local law for the Green function in Theorem 2.2 implies the following rigidity estimates for the eigenvalues of $H$. Recall that the eigenvalues of $H$ are denoted by $(\lambda_j)_{j=1}^N$, arranged in non-decreasing order. For $E_1 < E_2$ ($E_1, E_2 \in \mathbb{R} \cup \{\pm\infty\}$) denote the eigenvalue counting function by
\[
  \mathcal{N}(E_1, E_2) := \#\{ j : E_1 \le \lambda_j \le E_2 \}. \tag{2.11}
\]
We also define the classical location $\gamma_j$ of the $j$-th eigenvalue $\lambda_j$ by
\[
  \frac{j}{N} = \int_{-\infty}^{\gamma_j} \rho_{sc}(x)\, \mathrm{d}x. \tag{2.12}
\]

Theorem 2.3 (Rigidity of eigenvalues [19]). For any $E_1 < E_2$, we have
\[
  \Big| \mathcal{N}(E_1, E_2) - N \int_{E_1}^{E_2} \rho_{sc}(x)\, \mathrm{d}x \Big| \prec 1. \tag{2.13}
\]
In addition, for any $1 \le j \le N$, we have
\[
  |\lambda_j - \gamma_j| \prec N^{-2/3} \big( \min\{ j, N - j + 1 \} \big)^{-1/3}. \tag{2.14}
\]
In particular, for any small $\epsilon > 0$ and large $\Gamma > 0$, we have
\[
  |\lambda_N - 2| \le N^{-2/3+\epsilon}, \qquad \mathcal{N}\big( 2 - C_0 N^{-2/3+\epsilon},\, 2 + C_0 N^{-2/3+\epsilon} \big) \le N^{2\epsilon}, \tag{2.15}
\]
with probability bigger than $1 - N^{-\Gamma}$, for $N$ sufficiently large.

2.3. Relating the distribution of the largest eigenvalue to the Green function.
Fix a small $\epsilon > 0$ and set
\[
  E_L := 2 + 4 N^{-2/3+\epsilon}. \tag{2.16}
\]
For any $E \le E_L$, we define
\[
  \chi_E := \mathbb{1}_{[E, E_L]}, \tag{2.17}
\]
and note that $\mathcal{N}(E, E_L) = \mathrm{Tr}\, \chi_E(H)$. For $\eta > 0$, we define the mollifier $\theta_\eta$ by setting
\[
  \theta_\eta(x) := \frac{\eta}{\pi (x^2 + \eta^2)} = \frac{1}{\pi} \mathrm{Im}\, \frac{1}{x - \mathrm{i}\eta}. \tag{2.18}
\]
We can relate $\mathrm{Tr}\, \chi_E \star \theta_\eta(H)$ to the normalized trace of the Green function by the following identity,
\[
  \mathrm{Tr}\, \chi_E \star \theta_\eta(H) = \frac{N}{\pi} \int_{\mathbb{R}} \chi_E(y)\, \mathrm{Im}\, m_N(y + \mathrm{i}\eta)\, \mathrm{d}y = \frac{N}{\pi} \int_{E}^{E_L} \mathrm{Im}\, m_N(y + \mathrm{i}\eta)\, \mathrm{d}y. \tag{2.19}
\]
The following lemma assures that $\mathrm{Tr}\, \chi_E(H)$ can be sufficiently well approximated by $\mathrm{Tr}\, \chi_E \star \theta_\eta(H)$ for $\eta \ll N^{-2/3}$. Relying on this approximation, the subsequent Lemma 2.5 then yields the desired link between the distribution function of the rescaled largest eigenvalue of $H$ and the normalized trace of the Green function using a cleverly chosen observable. This line of argument was first used in [19] to prove the edge universality of Wigner matrices, where $\eta$ is chosen slightly smaller than the typical edge eigenvalue spacing $N^{-2/3}$. In order to obtain a quantitative convergence rate, we aim to choose here $\eta$ much smaller, with $\eta \gg N^{-1}$. A similar argument was used in [5]. The proofs of Lemma 2.4 and Lemma 2.5 are modifications of those in [19] in order to accommodate the small $\eta$ regime, and are postponed to the Appendix.

Lemma 2.4.
Let
E, l and η be scale parameters satisfying N − ≪ η ≪ l ≪ E L − E ≤ CN − / ǫ .Then, for any Γ > , (cid:12)(cid:12)(cid:12) Tr χ E ( H ) − Tr χ E ⋆ θ η ( H ) (cid:12)(cid:12)(cid:12) ≤ C (cid:16) N ( E − l , E + l ) + ηl N ǫ (cid:17) , (2.20) holds with probability bigger than − N − Γ , for N sufficiently large. Let F : R −→ R be a smooth cut-off function such that F ( x ) = 1 , if | x | ≤ / F ( x ) = 0 , if | x | ≥ / , (2.21)and we assume that F ( x ) is non-increasing for x ≥
0. Then one obtains from Lemma 2.4 the followingresult.
Lemma 2.5.
Set l = N ǫ η and l = N ǫ l such that N − ≪ η ≪ l ≪ l ≪ E L − E ≤ CN − / ǫ . Thenfor any Γ > , we have Tr χ E + l ⋆ θ η ( H ) − N − ǫ ≤ N ( E, ∞ ) ≤ Tr χ E − l ⋆ θ η ( H ) + N − ǫ , (2.22) with probability bigger than − N − Γ , for N sufficiently large. Furthermore, we have E h F (cid:16) Tr χ E − l ⋆ θ η ( H ) (cid:17)i − N − Γ ≤ P (cid:16) N ( E, ∞ ) = 0 (cid:17) ≤ E h F (cid:16) Tr χ E + l ⋆ θ η ( H ) (cid:17)i + N − Γ , (2.23) where F ( x ) is the cut-off function given in (2.21). Hence, recalling (2.19), we have established the desired link to the normalized trace of the Greenfunction.2.4.
Cumulant expansion formulas. A key tool in this paper is the following pair of cumulant expansion identities. For a reference, see Lemma 3.1 in [20].

Lemma 2.6. Let $h$ be a complex-valued random variable with finite moments. Define the $(p, q)$-cumulant of $h$ to be
\[
  c^{(p,q)} := (-\mathrm{i})^{p+q} \Big( \frac{\partial^{p+q}}{\partial s^p\, \partial t^q} \log \mathbb{E}\, e^{\mathrm{i} s h + \mathrm{i} t \bar{h}} \Big) \Big|_{s = t = 0}. \tag{2.24}
\]
Let $f : \mathbb{C} \times \mathbb{C} \to \mathbb{C}$ be a smooth function and denote its derivatives by
\[
  f^{(p,q)}(z_1, z_2) := \frac{\partial^{p+q}}{\partial z_1^p\, \partial z_2^q} f(z_1, z_2).
\]
Then for any fixed $l \in \mathbb{N}$, we have
\[
  \mathbb{E}\big[ \bar{h} f(h, \bar{h}) \big] = \sum_{p+q+1=1}^{l} \frac{1}{p!\, q!}\, c^{(p,q+1)}\, \mathbb{E}\big[ f^{(p,q)}(h, \bar{h}) \big] + R_{l+1}, \tag{2.25}
\]
where the error term $R_{l+1}$ can be bounded as
\[
  |R_{l+1}| \le C_l\, \mathbb{E}|h|^{l+1} \max_{p+q=l} \Big\{ \sup_{|z| \le M} |f^{(p,q)}(z, \bar{z})| \Big\} + C_l\, \mathbb{E}\big[ |h|^{l+1} \mathbb{1}_{|h| > M} \big] \max_{p+q=l} \| f^{(p,q)}(z, \bar{z}) \|_\infty, \tag{2.26}
\]
and $M > 0$ is an arbitrary fixed cutoff.

Moreover, we have the analogous cumulant expansion formula for a real-valued random variable $h$ with finite moments. Define the $k$-th cumulant of $h$ to be
\[
  c^{(k)} := (-\mathrm{i})^{k} \Big( \frac{\mathrm{d}^k}{\mathrm{d}t^k} \log \mathbb{E}\, e^{\mathrm{i} t h} \Big) \Big|_{t = 0}. \tag{2.27}
\]
Let $f : \mathbb{R} \to \mathbb{C}$ be a smooth function and denote by $f^{(k)}$ its $k$-th derivative. Then for any fixed $l \in \mathbb{N}$, we have
\[
  \mathbb{E}\big[ h f(h) \big] = \sum_{k+1=1}^{l} \frac{1}{k!}\, c^{(k+1)}\, \mathbb{E}\big[ f^{(k)}(h) \big] + R_{l+1}, \tag{2.28}
\]
where the error term satisfies
\[
  |R_{l+1}| \le C_l\, \mathbb{E}|h|^{l+1} \sup_{|x| \le M} |f^{(l)}(x)| + C_l\, \mathbb{E}\big[ |h|^{l+2} \mathbb{1}_{|h| > M} \big] \| f^{(l)} \|_\infty,
\]
and $M > 0$ is an arbitrary fixed cutoff.

2.5. GUE and the Airy kernel.
Let H ≡ H N belong to the GUE and denote the eigenvalues of √ NH by { µ j } Nj =1 in non-decreasing order. The joint eigenvalue density is explicitly given by p ( µ , · · · , µ N ) = 1 Z N,β Y i
In this section we give the proof of Theorem 1.3 from the main technical result, the Green functioncomparison theorem, Theorem 1.4.
Proof of Theorem 1.3.
Because of the rigidity of the eigenvalues in (2.15), one easily verifies that, for any ǫ > > /
3, sup | r |≥ N ǫ (cid:12)(cid:12)(cid:12) P (cid:16) N / ( λ N − < r (cid:17) − P G β E (cid:16) N / ( λ N − < r (cid:17)(cid:12)(cid:12)(cid:12) ≤ N − Γ , (3.1)for sufficiently large N . Hence in order to prove Theorem 1.3, it suffices to focus on r < r < N ǫ with r as in Theorem 1.2 and Theorem 1.3.Set as in (2.16) E := 2 + N − / r, and E L := 2 + 4 N − / ǫ . Fix η = N − ǫ and l = N − ǫ as in Lemma 2.5. Here we choose ǫ > l ≪ N − / . From (2.19) and (2.23), we can relate the distribution of the largest eigenvalue to thenormalized trace of the Green function as follows, E h F (cid:16) N Z N − / ǫ N − / r − l Im m N (2 + x + i η )d x (cid:17)i − N − Γ ≤ P (cid:16) N / ( λ N − < r (cid:17) = P (cid:16) N ( E, ∞ ) = 0 (cid:17) ≤ E h F (cid:16) N Z N − / ǫ N − / r + l Im m N (2 + x + i η )d x (cid:17)i + N − Γ . (3.2)By shifting the value of r in the second inequality of (3.2) and combining with the first inequality of (3.2),we obtain P (cid:16) N / ( λ N − < r − N / l (cid:17) − N − Γ ≤ E h F (cid:16) N Z N − / ǫ N − / r − l Im m N (2 + x + i η )d x (cid:17)i ≤ P (cid:16) N / ( λ N − < r (cid:17) + N − Γ . (3.3)Note that the above inequalities hold true for β = 1 , N even), and the convergence rate N − / obtained in Theorem 1.2 of [9] for the GOEwith N odd, we findTW β (cid:16) r − N / l (cid:17) − CN − / ≤ E G β E h F (cid:16) N Z N − / ǫ N − / r − l Im m N (2 + x + i η )d x (cid:17)i ≤ TW β ( r ) + CN − / . A similar upper and lower bound can be obtained in the same way when we consider + l in the integraldomain instead of − l . Since the Tracy–Widom distributions have smooth and uniformly bounded densitiesand l = N − ǫ , we havesup r Proof of Theorem 1.4. Consider the matrix Ornstein-Uhlenbeck process (cid:0) h ab ( t ) (cid:1) Na,b =1 :d h ab = 1 √ N d β ab − h ab d t, h ab (0) = ( H N ) ab , (3.7)where (cid:0) β ab ( t ) (cid:1) Na
0. Indeed, we choose a mesh of the interval 0 ≤ t ≤ T :=8 log N of size N , and obtain that (3.10) holds uniformly in z ∈ S , t ∈ [0 , N ] from the continuityof the process (3.8) in time. Moreover, (3.10) also holds uniformly in t ≥ N from (3.34) below.In the following, we often ignore the parameters and write for short H ≡ H ( t ) , G ≡ G ( t, z ) , m N ≡ m N ( t, z ) , t ∈ R + , z ∈ C \ R . For a fixed small ǫ > C > 0, let N − ǫ ≤ η ≤ N − / ǫ , | κ | , | κ | ≤ C N − / ǫ , (3.11)with κ < κ . In view of (2.19) and (2.23), we are interested in the quantity X ≡ X ( t ) := N Z κ κ Im m N ( t, x + i η )d x, t ∈ R + . (3.12)Hence X is a function of t , η as well as κ and κ .Let F : R → R be an arbitrary smooth function with uniformly bounded derivatives. The next lemmadetermines the evolution of the observable F (cid:0) X ( t ) (cid:1) under the Ornstein–Uhlenbeck flow in (3.7). Toalleviate the notation, we introduce the following abbreviations. Let P : R + × C \ R −→ C be anarbitrary function, then we introduce f Im P ≡ f Im P ( t, z ) := 12i ( P ( t, z ) − P ( t, ¯ z )) , t ∈ R + , z ∈ C \ R . (3.13)For example, for complex Wigner matrices, f Im G ij ( t, z ) = Im G ij ( t, z ), unless i = j . Further, we abbre-viate, for t ∈ R + , and z , z ∈ C \ R ,∆ f Im P ≡ ∆ f Im P ( t, z , z ) := 12i (cid:16) P ( t, z ) − P ( t, z ) (cid:17) − (cid:16) P ( t, z ) − P ( t, z ) (cid:17) , (3.14)where the spectral parameters are given as z = 2 + κ + i η , z = 2 + κ + i η , (3.15)with κ , κ , and η from (3.11). In particular, we have z , z ∈ S edge defined in (4.1) below.Returning to F ( X ), Ito’s lemma yields the following result. Lemma 3.1. The observable F ( X ) satisfies the following stochastic differential equation: d F ( X ) = d M + Θd t, (3.16) with the diffusion term d M = − √ N N X a,b =1 (cid:16) F ′ ( X )∆ f Im G ba (cid:17) d β ab , (3.17) and the drift term Θ ≡ Θ( t, z , z ) is explicitly given in (3.25) below. 
Moreover, E [Θ] can be written as E [Θ] = X p + q +1=3 p,q ∈ N K p,q +1 + E + O ≺ ( N − / ) , (3.18) for N sufficiently large, with K p,q +1 := 12 p ! q ! N p + q +12 N X a,b =1 a = b s ( p,q +1) ab E h ∂ p + q F ′ ( X )∆ f Im G ba ∂h pba ∂h qab i ; (3.19) E := 12 N N X a =1 ( s (2) aa − E h ∂F ′ ( X )∆ f Im G aa ∂h aa i , (3.20) where s ( p,q +1) ab and s (2) aa are the cumulants of the rescaled entries √ N h ab given in (2.24) and (2.27).Remark . The diffusion term d M in (3.17) yields a martingale M ( t ) upon integration. Note that theoperator norm of the Green function has the deterministic bound k G ( z ) k ≤ η ≤ N − ǫ , given z = E + i η with η ≥ N − ǫ . Since F has bounded derivatives, | F ′ ( X )∆ f Im G ba | ≤ CN − ǫ for some C > 0. Thus M ( t ) is a true martingale with vanishing expectation. Proof of Lemma 3.1. Recall the dynamics of the Orstein-Uhlenbeck process in (3.7) and that G is afunction of the matrix entries h ab . Using first Ito’s lemma and then the relation ∂G ij ∂h ab = − G ia G bj , (3.21)we computed G ij ( t, z ) = ∂G ij ∂t d t + X a ∂G ij ∂h aa d h aa + 12 X a ∂ G ij ∂h aa ∂h aa d h aa d h aa + X aN − / γ i E h max p + q =4 n sup w ∈ C (cid:12)(cid:12)(cid:12) ∂ p + q ∂h pba ∂h qab f ab (cid:16) H ( ab ) + wE ( ba ) + ¯ wE ( ab ) (cid:17)(cid:12)(cid:12)(cid:12)oi , (3.27)with a fixed small γ > 0, and where we use the notation E ( ab ) := ( δ ab ) Ni,j =1 , H ( ab ) := H − h ab E ( ab ) − h ba E ( ba ) , as well as f ab ( H ) := F ′ ( X )∆ f Im G ba . (3.28)Using the second resolvent identity, we can write G H ( ab ) ij = G Hij + (cid:16) G H ( ab ) ( h ab E ( ab ) + h ba E ( ba ) ) G H (cid:17) ij . (3.29)From the local law in (3.10), we have max i = j | G Hij | ≺ Ψ and max i | G Hii | ≺ 1. In addition, we have | h ij | ≺ √ N from the moment condition (1.3). Therefore, we have from (3.29) that max i = j | G H ( ab ) ij | ≺ Ψand max i | G H ( ab ) ii | ≺ 1. 
Similarly, we have G H ( ab ) + wE ( ab ) + ¯ wE ( ba ) ij = G H ( ab ) ij − (cid:16) G H ( ab ) + wE ( ab ) + ¯ wE ( ba ) ( wE ( ab ) + ¯ wE ( ba ) ) G H ( ab ) (cid:17) ij , (3.30)and thus sup | w | The drift term E [Θ] in (3.18) has the following bound: | E [Θ( t, z , z )] | ≤ N − / cǫ , (3.32) uniformly in t ≥ and z , z given in (3.15), for a numerical constant c > that does not depend on ǫ and sufficiently large N ≥ N ( ǫ, C ) . In order to finish the proof of Theorem 1.4, we now choose T := 8 log N and integrate (3.16) over[0 , T ]. Then taking the expectation, the diffusion term vanishes (see Remark 3.2) and the drift term isbounded using (3.32). We hence find by writing out X in (3.12) that (cid:12)(cid:12)(cid:12) E h F (cid:16) N Z κ κ Im m N (0 , x + i η )d x (cid:17)i − E h F (cid:16) N Z κ κ Im m N ( T, x + i η )d x (cid:17)i(cid:12)(cid:12)(cid:12) = O ( N − + cǫ log N ) . (3.33)Using the inequality k A k max ≤ k A k ≤ N k A k max , the second resolvent identity, that k G ( E + i η ) k ≤ η ,and (3.8), one shows that G ( T, z ) is sufficiently close to the Green function of the GUE, i.e., k G ( T, z ) − G GUE ( z ) k max ≤ k G ( T, z )(GUE − H ( T )) G GUE ( z ) k ≤ Nη k (GUE − H ( T )) k max ≺ N η . (3.34)Since F is a smooth function with uniformly bounded derivatives, we have (cid:12)(cid:12)(cid:12) F (cid:16) N Z κ κ Im m N ( T, x + i η )d x (cid:17) − F (cid:16) N Z κ κ Im m GUE N (2 + x + i η )d x (cid:17)(cid:12)(cid:12)(cid:12) ≺ N ǫ N / η . (3.35)Combining (3.33) and (3.35), we conclude the proof of Theorem 1.4. (cid:3) A simple case: estimates on E [Im m N ]In this section, we prove the simplest version of the Green function comparison theorem, Theorem 1.4,when F ( x ) = x . It then suffices to compare the expected normalized trace of the Green function ofa Wigner matrix E [ m N ( z )] with E GUE [ m N ( z )]. 
The ideas in this section will also be used to prove Proposition 3.3, which is a key ingredient to establish the Green function comparison theorem for a general function $F$. The proof for general functions $F$ will rely on the estimate (4.3) in Proposition 4.1 below as an input.

Proposition 4.1. Let $H_N$ be a complex Wigner matrix satisfying Assumption 1.1 and recall the time dependent matrix $H(t)$ in (3.7). For any $\epsilon>0$ and $C_0>0$, define the domain of the spectral parameter $z$ near the upper edge,
$$S_{\mathrm{edge}}\equiv S_{\mathrm{edge}}(\epsilon,C_0):=\bigl\{z=E+\mathrm{i}\eta\in S:\ |E-2|\le C_0N^{-2/3+\epsilon},\ N^{-1+\epsilon}\le\eta\le N^{-2/3+\epsilon}\bigr\},\tag{4.1}$$
with $S$ given in (2.7). Then for any $\tau>0$, we have
$$\bigl|\mathbb{E}[m_N(t,z)]-\mathbb{E}^{\mathrm{GUE}}[m_N(z)]\bigr|\le N^{-1/3+\tau},\tag{4.2}$$
uniformly in $z\in S_{\mathrm{edge}}$ and $t\ge 0$, for sufficiently large $N\ge N_0(C_0,\epsilon,\tau)$. Furthermore, there exists some $C>0$ independent of $\epsilon$, such that
$$\mathbb{E}[\mathrm{Im}\,m_N(t,z)]\le CN^{-1/3+\epsilon},\tag{4.3}$$
uniformly in $z\in S_{\mathrm{edge}}$ and $t\ge 0$, for sufficiently large $N\ge N_0'(C_0,\epsilon)$.

In the rest of this section we prove Proposition 4.1; its proof is split into several parts organized in subsections.

4.1. Interpolation between a Wigner matrix and the GUE. Following the proof of Lemma 3.1 in Section 3, we start by applying Ito's lemma to the time dependent normalized trace of the Green function, $m_N(t,z)$, from (3.9). Recalling (3.22), we find the analogue of (3.16),
$$\mathrm{d}\bigl(m_N(t,z)\bigr)=-\frac{1}{N^{3/2}}\sum_{v,a,b=1}^{N}G_{va}G_{bv}\,\mathrm{d}\beta_{ab}+\frac{1}{2N}\sum_{v,a,b=1}^{N}\Bigl(h_{ab}G_{va}G_{bv}+\frac{1}{N}G_{vb}G_{bv}G_{aa}+\frac{1}{N}G_{va}G_{av}G_{bb}\Bigr)\mathrm{d}t=:\mathrm{d}M_0+\Theta_0\,\mathrm{d}t,\tag{4.4}$$
with diffusion term $\mathrm{d}M_0$ and drift term $\Theta_0\,\mathrm{d}t\equiv\Theta_0(t,z)\,\mathrm{d}t$; here we use the subscript 0 to indicate that we are considering the simple case $F(x)=x$. The diffusion term $\mathrm{d}M_0$ yields a martingale after integration; see Remark 3.2.
Applying the cumulant expansions in Lemma 2.6, we have the analogue of (3.18) for thedrift term, E [Θ ] = 12 N N X v,a =1 ( s (2) aa − E h ∂ ( G av G va ) ∂h aa i + 12 N N X v,a,b =1 a = b X p + q +1=3 p ! q ! s ( p,q +1) ab N p + q +12 E h ∂ p + q ( G bv G va ) ∂h pba ∂h qab i + O ≺ (cid:0) √ N (cid:1) = − N N X v,a =1 ( s (2) aa − E h ∂ G vv ∂h aa i − X p + q +1=3 p ! q ! N p + q +32 N X v,a,b =1 a = b s ( p,q +1) ab E h ∂ p + q +1 G vv ∂h pba ∂h q +1 ab i + O ≺ (cid:0) √ N (cid:1) , (4.5)where the error stems from the truncation of the cumulant expansions at fourth order. Recalling thearguments in Section 3, in order to establish Proposition 4.1 it suffices to show that for any τ > | E [Θ ( t, z )] | ≤ N − / τ , (4.6)uniformly in z ∈ S edge ( ǫ, C ) and t ≥ 0, for sufficiently large N ≥ N ( C , ǫ, τ ).Admitting (4.6), for T = 8 log N and any 0 ≤ t ′ ≤ T , we integrate (4.4) over [ t ′ , T ] and take theexpectation to get (cid:12)(cid:12)(cid:12) E h m N ( t ′ , z ) i − E h m N ( T, z ) i(cid:12)(cid:12)(cid:12) = O (cid:16) N − / τ log N (cid:17) . (4.7)Combining with (3.34), we obtain the comparison estimate in (4.2) between the GUE and the timedependent H ( t ) in (3.7) staring from the Wigner matrix H . The bound (4.3) will follow directly fromthe comparison result (4.2) and the corresponding estimate for the GUE in Lemma 5.4 below.In the remaining part of this section, we will hence prove (4.6). For that it suffices to estimate theterms on the right side of (4.5).4.2. Third and fourth order terms with unmatched indices. 
Using the differential rule for the Green function entries in (3.21), each term in the cumulant expansion (4.5) can be written out in terms of an averaged product of Green function entries of the form
$$\frac{1}{N^{m}}\sum_{v_1=1}^{N}\cdots\sum_{v_m=1}^{N}c_{v_1,\ldots,v_m}\Bigl(\prod_{i=1}^{n}G_{x_iy_i}(t,z)\Bigr)=:\frac{1}{N^{\#\mathcal{I}}}\sum_{\mathcal{I}}c_{\mathcal{I}}\Bigl(\prod_{i=1}^{n}G_{x_iy_i}(t,z)\Bigr),\qquad t\in\mathbb{R}^{+},\ z\in\mathbb{C}^{+},\tag{4.8}$$
for $m,n\in\mathbb{N}$, where $\mathcal{I}:=\{v_j\}_{j=1}^{m}$ is a free summation index set which may include $a,b,v$ from (4.5), $m:=\#\mathcal{I}$ is the number of elements in the set $\mathcal{I}$, and the coefficients $\{c_{\mathcal{I}}:=c_{v_1,\ldots,v_m}\}$ are uniformly bounded complex numbers. Moreover, $n$ is the number of Green function entries in the product, and each row index $x_i$ and column index $y_i$ ($1\le i\le n$) of the Green function entries represents some element of the free summation index set $\mathcal{I}$. We further define the degree of such a term in (4.8) to be the number of off-diagonal terms in the product of the Green function entries, i.e.,
$$d:=\#\{1\le i\le n:\ x_i\ne y_i\}.\tag{4.9}$$
In particular, we have $0\le d\le n$. We use $\mathcal{Q}_d\equiv\mathcal{Q}_d(t,z)$ to denote the collection of the averaged products of the Green function entries of the form in (4.8) of degree $d$. For any $Q_d\equiv Q_d(t,z)\in\mathcal{Q}_d$, it is clear from the local law in (3.10) that
$$|Q_d(t,z)|\prec\Psi^{d}+\frac{1}{N},\tag{4.10}$$
uniformly in $z\in S$ given in (2.7) and $t\ge 0$. We will often omit the parameters $z$ and $t$ for notational simplicity.

In the following, we use the letter $v_j$ to denote a free summation index running from 1 to $N$, and the letters $x_i$, $y_i$ as the row and column indices of the Green function entries. In order to avoid confusion, we clarify that $x_i=y_i=v_j$ means that both $x_i$ and $y_i$ represent the same element $v_j\in\mathcal{I}$. Further we write $x_i\ne y_i$ if $x_i$ and $y_i$ represent two distinct indices from $\mathcal{I}$, say $v_j$ and $v_{j'}$.
Note that two such distinct indices may still take the same value as the summation indices $v_j$ and $v_{j'}$ run from 1 to $N$.

Now we first look at the third order terms in the cumulant expansion (4.5) with $p+q+1=3$. Using the differential rule for the Green function entries in (3.21), all the third order terms with $p+q+1=3$ can be written out in the form in (4.8), with an extra factor $\sqrt{N}$ in front. We observe that these terms are unmatched, see Definition 4.2 below, since the indices $a,b$ both appear an odd number of times in the product of the Green function entries.

In a similar way, the fourth order terms in the cumulant expansion (4.5) with $p+q+1=4$, except the ones corresponding to $p=2$, $q=1$, are also unmatched terms of the form in (4.8) in the sense of Definition 4.2, since the number of times the index $a$ (or $b$) appears in the row index set does not agree with the number of times it appears in the column index set of the product of Green function entries.

Definition 4.2 (Terms with unmatched indices). Given any $Q_d\in\mathcal{Q}_d$ of the form in (4.8) of degree $d$, let $\nu^{(r)}_j$, $\nu^{(c)}_j$ be the number of times the free summation index $v_j\in\mathcal{I}$ appears as the row, respectively column, index in the product of the Green function entries, i.e.,
$$\nu^{(r)}_j:=\#\{1\le i\le n:\ x_i=v_j\},\qquad \nu^{(c)}_j:=\#\{1\le i\le n:\ y_i=v_j\},\qquad 1\le j\le m.\tag{4.11}$$
We define the set of the unmatched summation indices as $\mathcal{I}^{o}:=\{1\le j\le m:\ \nu^{(r)}_j\ne\nu^{(c)}_j\}\subset\mathcal{I}$. If $\mathcal{I}^{o}$ is empty, i.e., all the free summation indices appear the same number of times in the row index set $\{x_i\}$ and the column index set $\{y_i\}$, then we say that $Q_d$ is matched. Otherwise, we say $Q_d$ is an unmatched term, denoted by $Q^{o}_d$. The collection of the unmatched terms of the form in (4.8) of degree $d$ is denoted by $\mathcal{Q}^{o}_d\subset\mathcal{Q}_d$.

Given any unmatched term $Q^{o}_d\in\mathcal{Q}^{o}_d$, we define the unmatched index sets for rows and columns as $\mathcal{R}^{o}:=\{1\le j\le m:\ \nu^{(r)}_j>\nu^{(c)}_j\}\subset\mathcal{I}^{o}$; $\mathcal{C}^{o}:=\{1\le j\le m:\ \nu^{(r)}_j<\nu^{(c)}_j\}\subset\mathcal{I}^{o}$.
(4.12)Neither of R o and C o is empty. Moreover, R o ∩ C o is empty, and R o ∪ C o = I o . Next, we give two examples of unmatched terms, which appear as fourth order terms in (4.5), − N X v,a,b s (1 , ab E h G va G bv G ba G aa G bb i ∈ Q o ; − N X v,a,b s (0 , ab E h G va G bv G ba G ba G ba i ∈ Q o ; (4.13)and two examples of the unmatched terms from the third order terms on the right side of (4.5),1 N X v,a,b s (1 , ab E h G va G bv G aa G bb i ∈ Q o ; 1 N X v,a,b s (0 , ab E h G va G bv G ba G ba i ∈ Q o , (4.14)up to a factor of √ N .The following proposition states that the expectations of the unmatched terms are much smaller thantheir naive size obtained by the power counting from the local law as in (4.10). The proof is postponedto Section 6. Proposition 4.3. Consider any unmatched term Q od ∈ Q od of degree d with fixed n given in (4.8). Forany fixed D ∈ N , we have E [ Q od ( t, z )] = O ≺ (cid:16) N + Ψ D (cid:17) , (4.15) uniformly in z ∈ S given in (2.7) and t ≥ .Remark . In the observable Q od ( t, z ) in (4.15) the Green function entries from (4.8) are all chosen atthe same spectral parameter z ∈ S . Our proofs can be extended to the setting where the Green functionentries are evaluated at different spectral parameters in the domain S with the estimate in (4.15) holding true. As we do not require this generalization to prove Proposition 4.1 we do not pursue this directionhere.Therefore, using Proposition 4.3, the third order terms in the cumulant expansion (4.5) are all boundedas O ≺ ( N − / + √ N Ψ D ). Moreover all the fourth order terms in the cumulant expansion (4.5), exceptthe one corresponding to p = 2 , q = 1, are bounded by O ≺ ( N − + Ψ D ). By choosing D sufficiently largedepending on ǫ , we hence obtain from (4.5) that E [Θ ] = − N N X v,a =1 ( s (2) aa − E h ∂ G vv ∂h aa i − N N X v,a,b =1 a = b s (2 , ab E h ∂ G vv ∂h ba ∂h ab i + O ≺ ( 1 √ N ) . (4.16)The remaining terms on the right side of (4.16) are matched under Definition 4.2. 
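The matched/unmatched dichotomy used above can be traced by a short index count. The following is a sketch under the assumption that the terms arise by applying (3.21) repeatedly to $G_{vv}$, with $p$ derivatives in $h_{ba}$ and $q+1$ derivatives in $h_{ab}$; the bookkeeping is illustrative and not quoted from the original.

```latex
% Sketch: row/column bookkeeping after differentiating G_{vv}.
% By (3.21), each \partial/\partial h_{ab} replaces a factor G_{xy} by -G_{xa}G_{by}
% (one extra column index a, one extra row index b); \partial/\partial h_{ba} swaps the roles.
\begin{align*}
\frac{\partial^{p+q+1} G_{vv}}{\partial h_{ba}^{p}\,\partial h_{ab}^{q+1}}:
\qquad
&\nu^{(r)}_a = \nu^{(c)}_b = p, \qquad
 \nu^{(c)}_a = \nu^{(r)}_b = q+1, \qquad
 \nu^{(r)}_v = \nu^{(c)}_v = 1,\\
&\text{matched} \iff p = q+1 .
\end{align*}
% p+q+1 = 3: p = q+1 has no integer solution, so every third order term is unmatched.
% p+q+1 = 4: p = q+1 forces (p,q) = (2,1), the only matched fourth order term.
% Example from (4.14): G_{va}G_{bv}G_{aa}G_{bb} has rows (v,b,a,b) and columns (a,v,a,b),
% so \nu^{(r)}_a = 1 < \nu^{(c)}_a = 2 and \nu^{(r)}_b = 2 > \nu^{(c)}_b = 1,
% i.e. a lies in C^o and b lies in R^o.
```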
It is thus sufficient to estimate these matched terms, as presented in the next subsection.

4.3. Terms with matched indices. Applying the differentiation rule (3.21) to the right side of (4.16), the index $v$ appears once as a row index and once as a column index of the Green function entries of the resulting terms on the right side of (4.16). In addition, the indices $a,b$ from (4.16) take a special role and appear twice as a row index and twice as a column index of the Green function entries. After differentiation by (3.21), we write out these products of Green function entries and observe that they are of the following form, which we call type-AB terms.

Definition 4.5 (Type-AB terms, type-A terms, type-0 terms). For arbitrary $m,n\in\mathbb{N}$, we consider averaged products of Green functions of the form
$$\frac{1}{N^{m+2}}\sum_{v_1=1}^{N}\cdots\sum_{v_m=1}^{N}\sum_{a=1}^{N}\sum_{b=1}^{N}c_{a,b,v_1,\ldots,v_m}\Bigl(\prod_{i=1}^{n}G_{x_iy_i}(t,z)\Bigr)=:\frac{1}{N^{\#\mathcal{I}+2}}\sum_{\mathcal{I},a,b}c_{a,b,\mathcal{I}}\Bigl(\prod_{i=1}^{n}G_{x_iy_i}\Bigr),\tag{4.17}$$
for $t\in\mathbb{R}^{+}$, $z\in\mathbb{C}^{+}$, where each $x_i$ and $y_i$ represents one of the free summation indices $a$, $b$ or $v_j$ ($1\le j\le m$). Here the coefficients $\{c_{a,b,\mathcal{I}}:=c_{a,b,v_1,\ldots,v_m}\}$ are uniformly bounded complex numbers. Note that the form in (4.17) is a special case of the form given in (4.8) with the two indices $a$ and $b$ singled out. The degree, denoted by $d$, of such a term is defined as in (4.9) by counting the number of the off-diagonal Green function entries. Recall $\nu^{(r)}_j$, $\nu^{(c)}_j$ defined in (4.11). We further define similarly
$$\nu^{(r)}_a:=\#\{i:\ x_i=a\},\qquad \nu^{(c)}_a:=\#\{i:\ y_i=a\},\qquad \nu^{(r)}_b:=\#\{i:\ x_i=b\},\qquad \nu^{(c)}_b:=\#\{i:\ y_i=b\},$$
for the special indices $a$, $b$.

A type-AB term, denoted by $P^{AB}_d$, is of the form in (4.17) with each $v_j$ appearing once in the row index set $\{x_i\}$ and once in the column index set $\{y_i\}$ in the product of the Green function entries, i.e., $\nu^{(r)}_j=\nu^{(c)}_j=1$.
The indices $a$ and $b$ both appear the same number of times (more than once) in the row index set $\{x_i\}$ and the column index set $\{y_i\}$ in the product of the Green function entries, i.e., $\nu^{(r)}_a=\nu^{(c)}_a\ge 2$ and $\nu^{(r)}_b=\nu^{(c)}_b\ge 2$. We denote by $\mathcal{P}^{AB}_d\equiv\mathcal{P}^{AB}_d(t,z)$ the collection of the type-AB terms of degree $d$. We remark that type-AB terms are matched in the sense of Definition 4.2.

A type-A term, denoted by $P^{A}_d$, is of the form in (4.17) with $\nu^{(r)}_a=\nu^{(c)}_a\ge 2$, and $\nu^{(r)}_b=\nu^{(c)}_b=\nu^{(r)}_j=\nu^{(c)}_j=1$ for $1\le j\le m$. We denote the collection of the type-A terms of degree $d$ by $\mathcal{P}^{A}_d\equiv\mathcal{P}^{A}_d(t,z)$. We remark that the index $b$ no longer plays a special role in type-A terms; we keep it in the notation in order to emphasize the inheritance from the form in (4.17).

Finally, a type-0 term, denoted by $P_d$, is of the form in (4.17) with all the free summation indices appearing once in the row index set $\{x_i\}$ and once in the column index set $\{y_i\}$ in the product of the Green function entries, i.e., $\nu^{(r)}_a=\nu^{(c)}_a=\nu^{(r)}_b=\nu^{(c)}_b=\nu^{(r)}_j=\nu^{(c)}_j=1$ for $1\le j\le m$. We denote the collection of the type-0 terms of degree $d$ by $\mathcal{P}_d\equiv\mathcal{P}_d(t,z)$.
In addition, the indices a and b do nolonger play a special role in type-0 terms and we keep them in the notation in order to emphasize theinheritance from the form in (4.17)Next, we give two examples for type-AB terms, which are generated from the fourth order expansionterms in (4.16) corresponding to the (2 , − N X v,a,b s (2 , ab (cid:16) G va G aa G av G bb G bb (cid:17) ∈ P AB ; − N X v,a,b s (2 , ab (cid:16) G va G ab G bv G aa G bb (cid:17) ∈ P AB ; and an example of a type-A term, which is from the second order terms of diagonal entries in the cumulantexpansion (4.16), − N X v,a ( s (2) aa − (cid:16) G va G aa G av (cid:17) ∈ P A , where the index b no longer takes the special role.In the following, we only consider special type-AB terms with both indices a and b appearing in theproduct of the Green function entries four times in total, and the corresponding type-A terms with index a appearing four times in the product of Green function entries. For the general case, see Remark 4.7.The next proposition claims that, in expectation, any type-AB term as well as any type-A term ofdegree d can be expanded into linear combinations of type-0 terms of degrees at least d up to negligibleerror. The proof of Proposition 4.6 is presented in Subsection 5.2. Proposition 4.6. Consider any type-AB term P ABd ∈ P ABd of the form in (4.8) with fixed n ∈ N and ν ( r ) a = ν ( c ) a = ν ( r ) b = ν ( c ) b = 2 . Then for any fixed D ∈ N , we have E [ P ABd ( t, z )] = X P d ′ ∈P d ′ d ≤ d ′ 2, where the number of Green function entries in each type-AB term is n = 5, thesummation index set I = { v } and the coefficients c a,b,v = s (2 , ab . Similarly, the first group of terms on theright side of (4.16) can be written as a type-A term with degree d = 2 and the number of Green functionentries n = 3. 
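To make the classification of Definition 4.5 concrete, the index counts for the example products quoted above can be tabulated. This is a sketch for illustration only; each entry in the columns $v$, $a$, $b$ is the pair $(\nu^{(r)},\nu^{(c)})$ for that index, and $d$ counts the off-diagonal entries as in (4.9).

```latex
% Sketch: checking the classification in Definition 4.5 on the example products.
% Columns v, a, b list (\nu^{(r)}, \nu^{(c)}); d is the number of off-diagonal entries.
\[
\begin{array}{l|l|l|c|c|c|c}
\text{product} & (x_i) & (y_i) & v & a & b & d \\ \hline
G_{va}G_{aa}G_{av}G_{bb}G_{bb} & (v,a,a,b,b) & (a,a,v,b,b) & (1,1) & (2,2) & (2,2) & 2 \\
G_{va}G_{ab}G_{bv}G_{aa}G_{bb} & (v,a,b,a,b) & (a,b,v,a,b) & (1,1) & (2,2) & (2,2) & 3 \\
G_{va}G_{aa}G_{av}             & (v,a,a)     & (a,a,v)     & (1,1) & (2,2) & \text{absent} & 2
\end{array}
\]
```

The first two rows satisfy $\nu^{(r)}_a=\nu^{(c)}_a=\nu^{(r)}_b=\nu^{(c)}_b=2$ and are therefore type-AB with $n=5$; the last row has only the index $a$ repeated and is type-A with $n=3$ and $d=2$, in line with the counts stated in the text.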
Therefore, from Proposition 4.6, we can expand (4.16) as a sum of finitely many type-0terms of degrees at least two, i.e., E [Θ ( t, z )] = X P d ∈P d ≤ d ≤ D − E [ P d ( t, z )] + O ≺ (cid:16) √ N + Ψ D (cid:17) , (4.20)uniformly in z ∈ S and t ∈ R + , where the number of type-0 terms in the sum above can be bounded by( CD ) cD , for some numerical constants C, c .Having expanded E [Θ ( t, z )] into type-0 terms, we next estimate the size of type-0 terms of the formin (4.17) of degree d ≥ i.e., when the spectral parameter z is chosen in the domain S edge defined in (4.1). The proof of Lemma 4.8 is presented in Subsection 5.3. Lemma 4.8. For any type-0 term P d ∈ P d of the form in (4.17) of degree d ≥ with fixed n ∈ N , wehave | E [ P d ( t, z )] | = O ≺ ( N − / ) , (4.21) uniformly in z ∈ S edge given by (4.1) and t ≥ . We hence obtain the estimate of E [Θ ( t, z )] in (4.6) by combining (4.20) and (4.21), and by choosing D sufficiently large depending on ǫ > 0. This yields the proof of Proposition 4.1.5. Product of Green function entries with matched indices In this section, we prove Proposition 4.6 and Lemma 4.8. Before diving into their proofs, we outlinein the next subsection the intuition stemming from the GUE.5.1. Intuition from the GUE. In this subsection, we focus on the special case of the GUE. Theidea of eliminating the indices appearing more than twice and reducing type-AB to type-0 terms as inProposition 4.6 stems from explicit computations for the GUE based on the Weingarten calculus for Haarunitary matrices. To simplify the arguments, we only consider the following example of a type-AB termof the form in (4.17), 1 N X a,b ( G aa ( z )) ( G bb ( z )) ∈ P AB . 
(5.1)

Thanks to the unitary conjugation invariance, we know that the eigenvalues $(\lambda_i)$ and the corresponding orthonormal eigenvectors $(u_i)$ of a GUE matrix are independent, and that the collection of eigenvectors $U:=(u_1,\cdots,u_N)$ is distributed according to Haar measure on the unitary group $U(N)$. Further, using the spectral decomposition
$$G(z)=\frac{1}{H-z}=\sum_{j=1}^{N}\frac{u_ju_j^{*}}{\lambda_j-z},\qquad z\in S,\tag{5.2}$$
we write the expectation of (5.1) as
$$\begin{aligned}\frac{1}{N^{2}}\sum_{a,b}\mathbb{E}\bigl[(G_{aa}(z))^{2}(G_{bb}(z))^{2}\bigr]&=\frac{1}{N^{2}}\sum_{a,b}\sum_{j,k,p,q}\mathbb{E}\Bigl[\frac{u_j(a)\overline{u_j(a)}\,u_k(a)\overline{u_k(a)}\,u_p(b)\overline{u_p(b)}\,u_q(b)\overline{u_q(b)}}{(\lambda_j-z)(\lambda_k-z)(\lambda_p-z)(\lambda_q-z)}\Bigr]\\&=\frac{1}{N^{2}}\sum_{a,b}\sum_{j,k,p,q}\mathbb{E}\Bigl[\frac{1}{(\lambda_j-z)(\lambda_k-z)(\lambda_p-z)(\lambda_q-z)}\Bigr]\times\mathbb{E}\bigl[U_{aj}U_{ak}U_{bp}U_{bq}\overline{U_{aj}}\,\overline{U_{ak}}\,\overline{U_{bp}}\,\overline{U_{bq}}\bigr].\end{aligned}\tag{5.3}$$
In order to estimate the expectations of the eigenvectors above, we use the following result for the Weingarten calculus on the unitary groups [10, 11].

Lemma 5.1 (Corollary 2.4, Proposition 2.6 in [11]). Let $U=(U_{ij})_{i,j=1}^{N}$ be a Haar unitary random matrix of size $N$. Let $n\in\mathbb{N}$ and denote by $S_n$ the symmetric group of order $n$. Then, for arbitrary column and row indices $i_k,i'_k,j_k,j'_k\in[\![1,N]\!]$, $1\le k\le n$, we have
$$\mathbb{E}\bigl[U_{i_1j_1}\cdots U_{i_nj_n}\overline{U_{i'_1j'_1}}\cdots\overline{U_{i'_nj'_n}}\bigr]=\sum_{\alpha,\beta\in S_n}\delta_{i_1,i'_{\alpha(1)}}\cdots\delta_{i_n,i'_{\alpha(n)}}\,\delta_{j_1,j'_{\beta(1)}}\cdots\delta_{j_n,j'_{\beta(n)}}\,\mathrm{Wg}(N,\alpha^{-1}\beta),\tag{5.4}$$
where $\mathrm{Wg}(N,\gamma)$ is the Weingarten function given by
$$\mathrm{Wg}(N,\gamma):=\mathbb{E}\bigl[U_{11}\cdots U_{nn}\overline{U_{1\gamma(1)}}\cdots\overline{U_{n\gamma(n)}}\bigr],\qquad\gamma\in S_n.\tag{5.5}$$
In the limit of large $N$, the Weingarten function $\mathrm{Wg}(N,\gamma)$ has the following asymptotic behavior: Let $\{c_i\}_{i=1}^{\#(\gamma)}$ denote the cycles of $\gamma\in S_n$, with $\#(\gamma)$ the total number of cycles. Then
$$\mathrm{Wg}(N,\gamma)=N^{\#(\gamma)-2n}\prod_{i=1}^{\#(\gamma)}(-1)^{|c_i|-1}\,\mathrm{Cat}(|c_i|-1)+O\bigl(N^{\#(\gamma)-2n-2}\bigr),\tag{5.6}$$
where $|c_i|$ denotes the length of the cycle $c_i$ and $\mathrm{Cat}(k)=\frac{(2k)!}{k!\,(k+1)!}$ is the $k$-th Catalan number.
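As a sanity check on the asymptotics in (5.6), one may compare with the closed forms of the unitary Weingarten function for $n=1$ and $n=2$. This is a sketch: the closed forms are standard values that are not quoted in the text, and (5.6) is read here as $\mathrm{Wg}(N,\gamma)=N^{\#(\gamma)-2n}\prod_i(-1)^{|c_i|-1}\mathrm{Cat}(|c_i|-1)+O(N^{\#(\gamma)-2n-2})$.

```latex
% Sketch: (5.6) checked against the closed forms for n = 1 and n = 2.
% Left column: exact value; right column: leading term predicted by (5.6).
% Cat(0) = Cat(1) = 1.
\begin{align*}
n=1:&\quad \mathrm{Wg}(N,\mathrm{id}) = \frac{1}{N},
      && N^{1-2}\,\mathrm{Cat}(0) = N^{-1};\\
n=2:&\quad \mathrm{Wg}(N,\mathrm{id}) = \frac{1}{N^{2}-1} = N^{-2} + O(N^{-4}),
      && N^{2-4}\,\mathrm{Cat}(0)^{2} = N^{-2};\\
n=2:&\quad \mathrm{Wg}(N,(1\,2)) = -\frac{1}{N(N^{2}-1)} = -N^{-3} + O(N^{-5}),
      && N^{1-4}\,(-1)^{1}\,\mathrm{Cat}(1) = -N^{-3}.
\end{align*}
```

In each case the remainder is smaller than the leading term by a factor $N^{-2}$, matching the error term in (5.6).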
Now we are ready to evaluate, for large N , E [ U aj U ak U bp U bq U aj U ak U bp U bq ] from (5.3) using Lemma 5.1with n = 4. We may assume that a = b , as the case a = b only contributes O ( N − ) to the expectationof (5.1) uniformly for z ∈ S , using the local law (3.10) and Lemma 2.1. We set n = 4, i = i = i ′ = i ′ = a , i = i = i ′ = i ′ = b , j = j ′ = j , j = j ′ = k , j = j ′ = p , and j = j ′ = q . Since max γ ∈ S n γ ) = 4,the leading term in (5.4), corresponding to Wg( N, γ ) with γ = ( α − β = ), is of size O ( N ) from (5.6) and the rest terms are bounded by O ( N ). Moreover, the coefficient in front of Wg( N, ) is given by thenumber of permutations σ ∈ S such that i l = i ′ σ ( l ) , j l = j ′ σ ( l ) , l = 1 , , , . (5.7)We then separate into the following five cases: 1.) all indices j, k, p, q are distinct, 2.) only two of themcoincide while the other two are distinct, 3.) two pairs of them coincide, 4.) three of them coincide andthe rest one is different, and 5.) all the indices are the same. As a = b , the number of permutationssatisfying (5.7) is given by 1, 8, 6, 8 and 4, respectively. Therefore, for a = b , we obtain E [( G aa ( z )) ( G bb ( z )) ] = 1 N X j,k,p,q all distinct E h λ j − z )( λ k − z )( λ p − z )( λ q − z ) i(cid:16) O (cid:0) N (cid:1)(cid:17) + 8 N X j,p,q all distinct E h λ j − z ) ( λ p − z )( λ q − z ) i(cid:16) O (cid:0) N (cid:1)(cid:17) + 6 N X j = q E h λ j − z ) ( λ q − z ) i(cid:16) O (cid:0) N (cid:1)(cid:17) + 8 N X j = q E h λ j − z ) ( λ q − z ) i(cid:16) O (cid:0) N (cid:1)(cid:17) + 4 N X j E h λ j − z ) i(cid:16) O (cid:0) N (cid:1)(cid:17) . (5.8)For example, by direct computation, the first term on the right side of (5.8) can be written using thespectral decomposition (5.2) as1 N X j,k,p,q all distinct E h λ j − z )( λ k − z )( λ p − z )( λ q − z ) i = 1 N E [(Tr G ) ] − N E [(Tr G )(Tr G ) ]+ 8 N E [(Tr G )(Tr G )] − N E [Tr G ] + 3 N E [(Tr G )(Tr G )] . 
(5.9)Observe that the resulting terms on the right side are type-0 terms under Definition 4.5. We furtherwrite the other terms on the right side of (5.8) similarly by type-0 terms using the spectral decomposi-tion. Averaging over a, b , adding the subleading diagonal terms, and using exactly cancellations, (5.3)eventually becomes, uniformly in z ∈ S ,1 N X a,b E [( G aa ) ( G bb ) ] = 1 N E [(Tr G ) ] + 2 N E [(Tr G )(Tr G ) ] + 1 N E [(Tr G )(Tr G )] + O ( N − ) . In this way, we have eliminated one pair of a -indices and b -indices from the type-AB term (5.1) andshown that they can be written as linear combinations of type-0 terms, which involves only products oftraces.For Wigner matrices, the above does not apply anymore as the eigenvectors are no longer exactlyHaar distributed on U ( N ), further the expectation in (5.3) does not factorize. Yet successively applyingcumulant expansions, we can reduce type-AB terms to sums of type-A terms up to negligible error,and then finally reduce type-A terms to sums of type-0 terms. This procedure is explained in the nextsubsection.5.2. Proof of Proposition 4.6. In this subsection, we give the proof of Proposition 4.6 for arbitraryWigner matrices using cumulant expansions. Proof of Proposition 4.6. We consider a type-AB term of the form in (4.17) with both indices a and b appearing twice as a row index and twice as a column index in the product of the Green function entries.There are two steps as follows. We first expand the type-AB term as a linear combination of type-Aterms by eliminating one pair of the index b . Then in a second step we expand the resulting type-A termsas linear combinations of type-0 terms by further eliminating a pair of the index a . Step 1: Reduction to type-A terms. Given a type-AB term, we will eliminate one pair of theindex b using the relation G ij = δ ij G + G ij HG − G ( HG ) ij , (5.10) and then applying cumulant expansions. 
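The relation (5.10) can be verified in two lines. The sketch below assumes that $\underline{A}:=\frac{1}{N}\mathrm{Tr}\,A$ denotes the normalized trace (so that the first and last terms of (5.10) carry $\underline{G}$ and the middle term carries $\underline{HG}$) and uses only the defining relation $(H-z)G=I$.

```latex
% Sketch: direct verification of (5.10) from (H - z) G = I.
% Assumption: \underline{A} := N^{-1} \mathrm{Tr} A is the normalized trace.
\begin{align*}
HG &= I + zG
  \quad\Longrightarrow\quad
  \underline{HG} = 1 + z\,\underline{G},
  \qquad (HG)_{ij} = \delta_{ij} + zG_{ij},\\
\delta_{ij}\,\underline{G} + G_{ij}\,\underline{HG} - \underline{G}\,(HG)_{ij}
  &= \delta_{ij}\,\underline{G} + G_{ij}\bigl(1 + z\,\underline{G}\bigr)
     - \underline{G}\bigl(\delta_{ij} + zG_{ij}\bigr)
   = G_{ij}.
\end{align*}
```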
The identity may be checked directly from the definition of theGreen function. In (5.10) we use the notation A := N Tr A , for any A ∈ C N × N , to denote the normalizedtrace. Similar ideas were used in [21, 29].Consider now a type-AB term P ABd ∈ P ABd of the form in (4.17). We split into the following two cases. Case 1: If there exists some i such that x i = y i = b , i.e., there is a factor G bb in the product of Greenfunction entries, we may then assume i = 1. Applying (5.10) to G bb and performing cumulant expansionsfor the resulting terms HG and ( HG ) bb , we obtain E [ P ABd ] = 1 N I +2 X I ,a,b c a,b, I E h ( G + G bb HG − G ( HG ) bb ) Y ≤ i ≤ n G x i y i i = 1 N I +2 X I ,a,b c a,b, I E h G Y ≤ i ≤ n G x i y i i + 1 N I +4 X I ,a,b,j,k c a,b, I E h ∂G bb G jk Q ≤ i ≤ n G x i y i ∂h jk i − N I +4 X I ,a,b,j,k c a,b, I E h ∂G jj G kb Q ≤ i ≤ n G x i y i ∂h kb i + O ≺ (cid:0) √ N (cid:1) , (5.11)where the error O ≺ ( √ N ) is from the truncation of the cumulant expansions. Using (3.21), the firstorder of the second term above corresponding to ∂∂h jk G jk is precisely canceled by that of the third termcorresponding to ∂∂h kb G kb . Then we write E [ P ABd ] = 1 N I +2 X I ,a,b c a,b, I E h G Y ≤ i ≤ n G x i y i i − N I +4 X I ,a,b,j,k c a,b, I E h ∂G bb Q ≤ i ≤ n G x i y i ∂h jk G jk i + 1 N I +4 X I ,a,b,j,k c a,b, I E h ∂G jj Q ≤ i ≤ n G x i y i ∂h kb G kb i + O ≺ ( 1 √ N ) . (5.12)The first term on the right side above is obtained by replacing G bb by the normalized trace G in theexpression of P ABd . In this way we have eliminated one pair of the index b . Since the index b originallyappeared twice as a row index and twice as a column index in the product of the Green function entries,the first term has become a type-A term of degree d . 
Moreover, from (3.21) and the fact that j, k are freshindices, the other terms on the right side of (5.12) can be written out as a sum of 2 n type-AB terms ofthe form in (4.17) with the corresponding free summation index set is I ′ = {I , j, k } , m ′ = I ′ = m + 2,and the number of Green function entries is n ′ = n + 2. Their degrees, denoted by d ′ , satisfy d ′ ≥ d + 1.We use P P ABd ′ ∈P ABd ′ ,d ′ ≥ d +1 E [ P ABd ′ ] to denote the finite sum of these terms, i.e., we write E [ P ABd ] = 1 N I +2 X I ,a,b c a,b, I E h G n Y i =2 G x i y i i + X P ABd ′ ∈P ABd ′ d ′ ≥ d +1 E [ P ABd ′ ] + O ≺ (cid:0) √ N (cid:1) . (5.13)Therefore, the combination of the identity (5.10) and the cumulant expansion gives a cancellation to firstorder, and the only leading term left is obtained by replacing a factor G bb with the normalized trace G of the product of Green function entries in the expression of the original P ABd . Case 2: If there is no i such that x i = y i = b , i.e., there is no factor as G bb in the product ofGreen function entries in (4.17), we may then assume that x = b and y = b . Since the index b appearsexactly twice in { y i } ni =2 , we may assume that y = y = b and x = b and x = b . Then there is no b in the remaining column index set { y i } ni =4 . Using the identity (5.10) on G by and applying cumulantexpansions, we find E [ P ABd ] = 1 N I +2 X I ,a,b c a,b, I E h ( G by HG − G ( HG ) by ) G x b G x b n Y i =4 G x i y i i = − N I +4 X I ,a,b,j,k c a,b, I E h ∂G by G x b G x b Q ni =4 G x i y i ∂h jk G jk i + 1 N I +4 X I ,a,b,j,k c a,b, I E h ∂G jj G x b G x b Q ni =4 G x i y i ∂h kb G ky i + O ≺ ( 1 √ N ) , (5.14) where in the second step, we observe a cancellation to first order similarly as in (5.12). From (3.21),the right side of (5.14) can again be written as a sum of 2 n type-AB terms of the form in (4.17) with I ′ = {I , j, k } , m ′ = m + 2, and n ′ = n + 2. 
Since j, k are fresh indices, the resulting type-AB terms havedegrees d ′ ≥ d + 2 (the finite sum of such terms is denoted by P P ABd ′ ∈P ABd ′ ,d ′ ≥ d +2 E [ P ABd ′ ]), except thefollowing two terms corresponding to taking ∂∂h kb of a Green function entry whose column index coincideswith b , i.e., N I +4 X I ,a,b,j,k c a,b, I E h G jj G x k G bb G x b n Y i =4 G x i y i G ky i (5.15)and 1 N I +4 X I ,a,b,j,k c a,b, I E h G jj G x b G x k G bb n Y i =4 G x i y i G ky i . (5.16)Observe that the terms in (5.15) and (5.16) are type-AB terms in P ABd with a factor G bb in the productof Green function entries considered in Case 1. Using (5.13) on these terms and combining with (5.14),we hence obtain E [ P ABd ] = 1 N I +4 X I ,a,b,j,k c a,b, I E h G jj GG ky G x k G x b n Y i =4 G x i y i i + 1 N I +4 X I ,a,b,j,k c a,b, I E h G jj GG ky G x b G x k n Y i =4 G x i y i i + X P ABd ′ ∈P ABd ′ d ′ ≥ d +2 E [ P ABd ′ ] + X P ABd ′′ ∈P ABd ′′ d ′′ ≥ d +1 E [ P ABd ′′ ] + O ≺ (cid:0) √ N (cid:1) , (5.17)where the first two terms on the right side above are type-A terms in P Ad , obtained by replacing a pairof index b , i.e., ( x , y ) or ( x , y ) by a fresh index k and multiplied by ( G ) . The first group of sumon the last line of (5.17) comes from (5.14) excluding two terms (5.15) and (5.16), and the number ofthe type-AB terms in the sum is at most 2 n − 2. The second group of sum on the last line of (5.17) isobtained from expanding (5.15) and (5.16) by (5.13). 
The corresponding type-AB terms are of the formin (4.17) with m ′′ = m ′ + 2 and n ′′ = n ′ + 2, and the number of the terms in the sum is at most 4 n ′ .Combining with Case 1, for any type-AB term P ABd ∈ P ABd , we rewrite (5.13) and (5.17) in the shortform E [ P ABd ] = X P Ad ∈P Ad E [ P Ad ] + X P ABd ′ ∈P ABd ′ d ′ ≥ d +1 E [ P ABd ′ ] + O ≺ (cid:0) √ N (cid:1) , (5.18)where the summations above denote a sum of at most two type-A terms of degree d and a sum of at most(6 n + 8) type-AB terms of degree not less than d + 1. The number of the Green function entries in theproduct (see (4.17)) of each term is at most n + 4. Remark . In general, if the number of the index b appearing in the Green function entries of P ABd isnot limited to four, i.e., ν ( r ) b = ν ( c ) b = s ≥ 3, then the terms in the first group of sum on the right sideof (5.18) are of the form in (4.17) with ν ( r ) b = ν ( c ) b = s − ≥ 2. Moreover, the number of such terms inthe first group of the sum is at most s . We can repeat the expansion procedure in (5.18) for s times until ν ( r ) b = ν ( c ) b = 1. We then end up with at most s ! type-A terms in P Ad , and at most 6 s s ( n + 4 s ) type-ABterms of degrees not less than d + 1 generated in the above expansion procedures.Iterating the expansion procedure (5.18) D − d times, the resulting type-AB terms have degrees atleast D . Using the local law in (3.10), we expand an arbitrary type-AB term P ABd ∈ P ABd as a finite sumof type-A terms of degrees at least d , up to negligible error. 
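The bookkeeping behind this iteration can be sketched as follows. The counts are an illustration derived from the constants stated for (5.18) (at most two type-A terms and at most $6n+8$ type-AB terms per step, with at most four additional Green function entries); they assume that after $k$ iterations every term has at most $n+4k$ entries, and they are not claimed to be optimal.

```latex
% Sketch of the bookkeeping when (5.18) is iterated D-d times.
% Assumption: after k iterations every term has at most n+4k Green function entries.
\begin{align*}
\#\{\text{type-AB terms after } D-d \text{ steps}\}
  &\le \prod_{k=0}^{D-d-1}\bigl(6(n+4k)+8\bigr)
   \le \bigl(6(n+4D)\bigr)^{D-d},\\
\#\{\text{type-A terms collected}\}
  &\le 2\sum_{l=0}^{D-d-1}\ \prod_{k=0}^{l-1}\bigl(6(n+4k)+8\bigr)
   \le 2\,(D-d)\,\bigl(6(n+4D)\bigr)^{D-d}.
\end{align*}
% Both bounds use 6(n+4k)+8 \le 6(n+4D) for 0 \le k \le D-1.
% The surviving type-AB terms have degree at least D and are
% O_\prec(\Psi^{D} + N^{-1}) by the power counting in (4.10).
```

Since $n$ and $D$ are fixed, both counts are independent of $N$, which is what allows the accumulated $O_\prec(N^{-1/2})$ truncation errors to be summed.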
We hence arrive at E [ P ABd ] = X d ≤ d ′ For the expanded type-A terms on the right side of (5.19),we follow the idea in Step 1 to expand the resulting type-A terms as linear combinations of type-0 termsby further eliminating one pair of the index a .Given a type-A term P Ad ∈ P Ad of the form in (4.17), we split into two cases: 1) there exists a factor G aa in the product of Green function entries; 2) there is no factor G aa in the product of the Greenfunction entries. We utilize similar arguments as in Case 1 and Case 2 of Step 1 above and obtain theanalogue of (5.18), namely that E [ P Ad ] = X P d ∈P d P d + X P Ad ′′ ∈P Ad ′′ d ′′ ≥ d +1 E [ P Ad ′′ ] + O ≺ (cid:0) √ N (cid:1) , (5.20)where the summations above denote a sum of at most two type-0 terms of degree d and a sum of at most(6 n + 8) type-A terms of degrees at least d + 1. The number of the Green function entries in the productof each term is bound by n + 4.Iterating the above expansion for D − d times, we then expand an arbitrary type-A term P Ad ∈ P Ad asa sum of at most (6( n + 4 D )) D type-0 terms of degree d ′ satisfying d ≤ d ′ < D , up to negligible error.As the analogue of (5.19), we write E [ P Ad ] = X d ≤ d ′ In the subsection, we estimate the expectations of the type-0 terms andprove Lemma 4.8. We start with the following lemma for the GUE. Lemma 5.4. Let H belong to the GUE. For any ǫ > and C > , recall the domain S edge ≡ S edge ( ǫ, C ) defined in (4.1) . Then there exists a constant C independent of ǫ such that N E GUE h Im Tr G ( z ) i ≤ CN − / ǫ , (5.22) holds uniformly for all z ∈ S edge , for sufficiently large N ≥ N ( C , ǫ ) . Furthermore, for any τ > , allthe type-0 terms P d ∈ P d ( d ≥ ) of the form in (4.17) have the upper bound | E GUE [ P d ( z )] | ≤ N − / τ , (5.23) uniformly for all z ∈ S edge , for sufficiently large N ≥ N ′ ( C , ǫ, τ ) . The proof of Lemma 5.4 is postponed to Subsection 5.4. 
Using the above lemma for the GUE and thecomparison method, we are now ready to prove Lemma 4.8 for arbitrary Wigner matrices. Proof of Lemma 4.8. Consider any type-0 term P d ∈ P d of the form in (4.17) of degree d ≥ 2. If d ≥ D for some large D , then by the local law in (3.10), | E [ P d ] | = O ≺ (Ψ D + N − ). Else, if d is smaller, weestimate E [ P d ] using the comparison method iteratively and the corresponding estimates for the GUEin (5.23).We start the iteration by denoting the type-0 term P d of the form in (4.17) as P d ≡ P (1) d , where thesuperscript (1) and degree d ≡ d will be used to indicate the iteration step. We hence consider a term of the form P (1) d ≡ P (1) d ( t, z ) : 1 N I +2 X I ,a ,b c a ,b , I (cid:16) n Y i =1 G x i y i ( t, z ) (cid:17) , z ∈ S edge , (5.24)with n = I + 2, where each summation index in { a , b , I } appears exactly once in the row indexset { x i } and exactly once in the column index set { y i } . In the following, we often omit the parameters t, z and the errors below are always bounded uniformly in z ∈ S edge and t ≥ P (1) d under the Ornstein–Uhlenbeck flow in (3.7), similarly to (4.4). In general, for any { x i , y i } ni =1 with some n ∈ N , using Ito’sformula and the stochastic differential equation for the Green function entries in (3.22), we haved (cid:16) n Y i =1 G x i y i (cid:17) = n X j =1 Y i = j G x i y i d G x j y j + 12 n X j,k =1 Y i = j,k G x i y i d G x j y j d G x k y k = − √ N X a,b n X j =1 G x j a G by j Y i = j G x i y i d β ab + 12 X a,b n X j =1 (cid:16) h ab G x j a G by j + 1 N G x j b G by j G aa + 1 N G x j a G ay j G bb (cid:17) Y i = j G x i y i d t + 12 N X a,b X j,k G x j a G by j G x k b G ay k Y i = j,k G x i y i d t := d c M + b Θ d t , (5.25)with diffusion term d c M and drift term b Θ d t . 
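Applying cumulant expansions to the drift term, we obtain the analogue of (4.5),
\[
\begin{aligned}
\mathbb{E}[\widehat{\Theta}]
&= \frac{1}{2N} \sum_{a=1}^N \big( s^{(2)}_{aa} - 1 \big) \sum_{j=1}^n \mathbb{E}\Big[ \frac{\partial \big( G_{x_j a} G_{a y_j} \prod_{i \ne j} G_{x_i y_i} \big)}{\partial h_{aa}} \Big] \\
&\quad + \frac{1}{2} \sum_{\substack{a,b=1 \\ a \ne b}}^N \sum_{p+q+1=3}^{4} \frac{s^{(p,q+1)}_{ab}}{p!\, q!\, N^{\frac{p+q+1}{2}}} \sum_{j=1}^n \mathbb{E}\Big[ \frac{\partial^{p+q} \big( G_{x_j a} G_{b y_j} \prod_{i \ne j} G_{x_i y_i} \big)}{\partial h_{ba}^p\, \partial h_{ab}^q} \Big] + O_\prec\Big( \frac{1}{\sqrt{N}} \Big) \\
&= -\frac{1}{2N} \sum_{a=1}^N \big( s^{(2)}_{aa} - 1 \big)\, \mathbb{E}\Big[ \frac{\partial^2 \big( \prod_{i=1}^n G_{x_i y_i} \big)}{\partial h_{aa}^2} \Big] \\
&\quad - \frac{1}{2} \sum_{p+q+1=3}^{4} \frac{1}{p!\, q!\, N^{\frac{p+q+1}{2}}} \sum_{\substack{a,b=1 \\ a \ne b}}^N s^{(p,q+1)}_{ab}\, \mathbb{E}\Big[ \frac{\partial^{p+q+1} \big( \prod_{i=1}^n G_{x_i y_i} \big)}{\partial h_{ba}^p\, \partial h_{ab}^{q+1}} \Big] + O_\prec\Big( \frac{1}{\sqrt{N}} \Big).
\end{aligned} \tag{5.26}
\]
From (5.25) and (5.26), we find that $P^{(1)}_{d_1}$ in (5.24) satisfies the stochastic differential equation
\[
\mathrm{d}\big( P^{(1)}_{d_1} \big) = \mathrm{d}M^{(1)}_{d_1} + \Theta^{(1)}_{d_1}\, \mathrm{d}t, \tag{5.27}
\]
where the diffusion term $\mathrm{d}M^{(1)}_{d_1}$ yields a martingale after integration (see Remark 3.2) and the drift term $\Theta^{(1)}_{d_1}\,\mathrm{d}t$ satisfies the following analogue of (4.5),
\[
\begin{aligned}
\mathbb{E}[\Theta^{(1)}_{d_1}]
&= -\frac{1}{2N} \sum_{a_2=1}^N \big( s^{(2)}_{a_2 a_2} - 1 \big)\, \mathbb{E}\Big[ \frac{\partial^2 \big( P^{(1)}_{d_1} \big)}{\partial h_{a_2 a_2}^2} \Big] \\
&\quad - \frac{1}{2} \sum_{p+q+1=3}^{4} \frac{1}{p!\, q!\, N^{\frac{p+q+1}{2}}} \sum_{\substack{a_2,b_2=1 \\ a_2 \ne b_2}}^N s^{(p,q+1)}_{a_2 b_2}\, \mathbb{E}\Big[ \frac{\partial^{p+q+1} \big( P^{(1)}_{d_1} \big)}{\partial h_{b_2 a_2}^p\, \partial h_{a_2 b_2}^{q+1}} \Big] + O_\prec\Big( \frac{1}{\sqrt{N}} \Big),
\end{aligned} \tag{5.28}
\]
where $a_2, b_2$ are fresh summation indices, as $a, b$ in (5.26). The subscript 2 is used to indicate the iteration step and to distinguish them from $a_1, b_1$ in (5.24).

From (3.21), all the third order terms with $p+q+1=3$ in the cumulant expansion above can be written out in the form in (4.8), with an extra factor $\sqrt{N}$ in front. Since the fresh indices $a_2, b_2$ both appear an odd number of times in the product of the Green function entries, they are unmatched in the sense of Definition 4.2. Using Proposition 4.3, these terms are bounded by $O_\prec(N^{-1/2} + \sqrt{N}\,\Psi^D)$.

The fourth order terms in the cumulant expansion with $p+q+1=4$ in (5.28), with the exception of those corresponding to $p=2$, $q=1$, are also unmatched terms of the form in (4.8), since the number of times the index $a_2$ (or $b_2$) appears in the row index set $\{x_i\}$ does not agree with the number of times it appears in the column index set $\{y_i\}$.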
Using Proposition 4.3, these terms are bounded by $O_\prec(N^{-1} + \Psi^D)$. By choosing $D$ sufficiently large depending on $\epsilon$, we hence obtain the following analogue of (4.16):
\[
\mathbb{E}[\Theta^{(1)}_{d_1}] = -\frac{1}{2N} \sum_{a_2=1}^N \big( s^{(2)}_{a_2 a_2} - 1 \big)\, \mathbb{E}\Big[ \frac{\partial^2 \big( P^{(1)}_{d_1} \big)}{\partial h_{a_2 a_2}^2} \Big] - \frac{1}{4N^2} \sum_{\substack{a_2,b_2=1 \\ a_2 \ne b_2}}^N s^{(2,2)}_{a_2 b_2}\, \mathbb{E}\Big[ \frac{\partial^4 \big( P^{(1)}_{d_1} \big)}{\partial h_{b_2 a_2}^2\, \partial h_{a_2 b_2}^2} \Big] + O_\prec\big( N^{-1/2} \big). \tag{5.29}
\]
It then suffices to estimate the remaining matched terms above. Using (3.21) and (5.24), the second group of terms on the right side of (5.29) can be written out in the form
\[
\frac{1}{N^{\#\mathcal{I}_1+4}} \sum_{\mathcal{I}_1, a_1, b_1, a_2, b_2} c_{a_1,b_1,a_2,b_2,\mathcal{I}_1} \Big( \prod_{i=1}^{n_1+4} G_{x_i y_i} \Big), \tag{5.30}
\]
where the coefficients $\{c_{a_1,b_1,a_2,b_2,\mathcal{I}_1}\}$ are determined by $\{c_{a_1,b_1,\mathcal{I}_1}\}$ and $\{s^{(2,2)}_{a_2,b_2}\}$, and each summation index in $\{a_1, b_1, \mathcal{I}_1\}$ appears once in the row index set $\{x_i\}$ and once in the column index set $\{y_i\}$. Moreover, both indices $a_2, b_2$ appear exactly twice in the row index set $\{x_i\}$ and exactly twice in the column index set $\{y_i\}$. We define the degree of the form in (5.30) as in (4.9) by counting the number of off-diagonal Green function entries. Recall the definition of the type-AB, type-A and type-0 terms from Definition 4.5. The definitions can be adapted naturally with respect to the fresh indices $a_2$ and $b_2$, for the form given in (5.30).

Thus the second group of terms on the right side of (5.29) consists of $n_1(n_1+1)(n_1+2)(n_1+3)$ type-AB terms of degrees not less than $d_1+1$, from (3.21) and the fact that $a_2, b_2$ are fresh indices. Similarly, the first group of terms on the right side of (5.29) consists of $n_1(n_1+1)$ type-A terms of degrees not less than $d_1+1$.
Using Proposition 4.6, we expand each of these terms as a sum of finitely many type-0 terms ofdegrees at least d + 1, which are in the form: P (2) d : 1 N I +4 X I ,a ,b ,a ,b c a ,b ,a ,b , I (cid:16) n Y i =1 G x i y i (cid:17) , (5.31)where I is a set of free summation indices, the coefficients { c a ,b ,a ,b , I } are uniformly bounded complexnumbers, and each index in { a , b , a , b , I } appears once in { x i } and once in { y i } . In particular, n = I + 4. The degree of such a term, denoted by d , is given as in (4.9). Here we use the subscript 2to indicate the iteration step. Note that the form in (5.31) is a special case of the form given in (4.8)and the indices a , b , a , b do not take special roles. We keep them in the notation to emphasize theinheritance from (5.30). The collection of the type-0 terms of the form in (5.31) of degree d is denotedby P (2) d . Then from Proposition 4.6, we expand (5.29) and write for short E [Θ (1) d ] = X P (2) d ∈P (2) d d +1 ≤ d 1, we find | E [ P (2) d ( t, z )] | = O ≺ (Ψ D + N − ) using the local law in (3.10). We then obtain from (5.33) that | E [ P (1) d ( t ′ , z )] | = O ≺ (cid:16) log N ( N − / + Ψ D ) + N − / (cid:17) , (5.34)uniformly in t ′ ∈ [0 , T ] and z ∈ S edge .Else, if d ≤ D − 2, we repeat the above arguments for the resulting type-0 terms P (2) d ∈ P (2) d ( d ≥ d + 1) on the right side of (5.33) as in (5.27). Using (5.25) and (5.26), we then create two freshsummation indices, denoted by a , b , to derive the evolution under the Ornstein–Uhlenbeck flow of any P (2) d ∈ P (2) d . As the analogue of (5.29), the expectation of the corresponding drift terms is given by E [Θ (2) d ] = − N N X a =1 ( s (2) a a − E h ∂ ( P (2) d ) ∂h a a i − N N X a ,b =1 a = b s (2 , a b E h ∂ ( P (2) d ) ∂h b a ∂h a b i + O ≺ ( N − / ) . 
(5.35)From Definition 4.5, the right side above can be written out as linear combinations of type-A termsand type-AB terms, with respect to fresh summation indices a and b , of degrees not less than d + 1.Using Proposition 4.6, these terms can further be expanded using the type-0 terms of degrees at least d + 1. In this way, we obtain an estimate similar to (5.34) for d = D − d ≥ 2. In general, for any s ≥ 1, wedefine a type-0 term in the s -th iteration step to be in the form of P ( s ) d s : 1 N I s +2 s X I s ,a ,b ,...,a s ,b s c a ,b ,...,a s ,b s , I s E h n s Y i =1 G x i y i ( t, z ) i , (5.36)where I s is a set of free summation indices, the coefficients { c a ,b ,...,a s ,b s , I s } are uniformly boundedcomplex numbers, and each free summation index in { a , b , . . . , a s , b s , I s } appears once in { x i } and oncein { y i } . In particular, we have n s = I s + 2 s . The degree, denoted by d s , of such a term in (5.36) isgiven as in (4.9) by counting the number of off-diagonal Green function entries. We denote by P ( s ) d s thecollection of the type-0 terms in the s -th step of the form in (5.36) of degree d s . Note that the form in(5.36) is a special case of the form given in (4.8), in order to emphasize the s -th iteration step and thedependence on { a s , b s } .We then derive the stochastic evolution for any P ( s ) d s ∈ P ( s ) d s ( s ≥ P ( s ) d s ) = d M ( s ) d s + Θ ( s ) d s d t , (5.37)where d M ( s ) d yields a martingale after integration, and E [Θ ( s ) d ] satisfies E [Θ ( s ) d s ( t, z )] = X P ( s +1) ds +1 ∈P ( s +1) ds +1 d s +1 ≤ d s +1 2. Indeed, from(3.34) and the local law in (3.10), we have (cid:12)(cid:12) E [ P ( s ) d s ( T, z )] − E GUE [ P ( s ) d s ( z )] (cid:12)(cid:12) = O ( N − ) . (5.39)Together with the estimate (5.23) for the GUE, we obtain that, for any s ≥ d s ≥ (cid:12)(cid:12) E [ P ( s ) d s ( T, z )] (cid:12)(cid:12) = O ≺ ( N − / ) . 
(5.40) Next, we return to the stochastic differential equation of P ( s ) d s in (5.37). Integrating (5.37) over [ t ′ , T ]for any 0 ≤ t ′ ≤ T and taking the expectation as in (5.33), we have from (5.38) and (5.40) that E [ P ( s ) d s ( t ′ , z )] = X P ( s +1) ds +1 ∈P ( s +1) ds +1 d s +1 ≤ d s +1 1. We then obtaina similar estimate for any P ( s − d s − ∈ P ( s − d s − with d s − ≥ D − (cid:12)(cid:12) E [ P ( s − d s − ( t ′ , z )] (cid:12)(cid:12) = O ≺ (cid:16) log N ( N − / + Ψ D ) + N − / log N (cid:17) . Repeating the above process until s = 1, we hence obtain that, for d ≥ (cid:12)(cid:12) E [ P (1) d ( t, z )] (cid:12)(cid:12) = O ≺ (cid:16) ( N − / + Ψ D ) log D N (cid:17) , uniformly in t ∈ [0 , T ] and z ∈ S edge . By choosing D sufficiently large which depends on ǫ , we prove (4.21)for t ∈ [0 , T ]. If t ≥ T , a similar estimate can be obtained by using (5.39) and (5.40). We have hencefinished the proof of Lemma 4.8. (cid:3) Proof of Lemma 5.4. We end this section with the proof of Lemma 5.4 considering the GUE. Proof of Lemma 5.4. Using the spectral decomposition (5.2), we write1 N E GUE h Im Tr G ( z ) i = N ηN E GUE h N X j =1 | λ j − z | i , z ∈ S edge . (5.42)Then it suffices to estimate the following linear eigenvalue statistics, which can be written from (2.29),(2.36) and then (2.37) as1 N E GUE h N X i =1 | λ i − z | i = 1 N Z R e K N ( x, x ) | x − − κ − i η | d x = 1 N Z R K edge N ( x, x ) | x − N / κ − i N / η | d x , (5.43)where z = 2 + κ + i η ∈ S edge , with | κ | ≤ C N − / ǫ and N − ǫ ≤ η ≤ N − / ǫ .To control the integral on the right side of (5.43), we choose a fixed L < −∞ , − N / ], ( − N / , L ] and ( L ∞ ).For the integration domain ( −∞ , − N / ], we find that1 N Z x< − N / K edge N ( x, x ) | x − N / κ − i N / η | d x = O ( N − ) , (5.44)using the trace identity (2.34) for the kernel K N and that | κ | ≤ C N − / ǫ . 
Moreover, from Theorem 2.8 and Lemma 2.7, we have on ( L , ∞ ), that1 N Z x>L K edge N ( x, x ) | x − N / κ − i N / η | d x = 1 N Z x>L K airy ( x, x ) + O ( N − / ) | x − N / κ − i N / η | d x = O (cid:16) N η (cid:17) . (5.45)It hence suffices to focus on the regime ( − N / , L ]. Recall from (2.32) and (2.37) that K N ( x, x ) = N − X k =0 φ k ( x ); K edge N ( x, x ) = 1 N / K N (cid:16) √ N + xN / , √ N + xN / (cid:17) . (5.46)From (2.31) and (2.33), the derivative of K N ( x, x ) is given by K ′ N ( x, x ) = −√ N φ N − ( x ) φ N ( x ) . The Hermite functions satisfy, for all k , sup x ∈ R | φ k ( x ) | ≤ Ck − / . (5.47)for some constant C independent of k , as was proved in [4]. Therefore, the derivative of the edge kernel K edge N ( x, x ) is given by (cid:16) K edge N ( x, x ) (cid:17) ′ = 1 N / K ′ N (cid:16) √ N + xN / , √ N + xN / (cid:17) = O (1) . (5.48)For any x ∈ ( − N / , L ], we have from (5.48) and Lemma 2.7 that K edge N ( x, x ) = K edge N ( L , L ) − Z L x (cid:0) K edge N ( x, x ) (cid:1) ′ d x ≤ C ′ (1 + | x | ) . (5.49)Therefore, we obtain from (5.49) that1 N k Z − N / 2, where each summation index v j (1 ≤ j ≤ n ) appears once in the row index set { x i } ni =1 and once in the column index set { y i } ni =1 and the coefficients { c v ,...,v n } are uniformly bounded complexnumbers. For any 1 ≤ j ≤ n , if there exists 1 ≤ i ≤ n such that x i = y i = v j , then we say that v j isisolated. For any 1 ≤ j = j ′ ≤ n , if there exists 1 ≤ i ≤ n such that either x i = v j , y i = v j ′ or y i = v j , x i = v j ′ , then we say that v j and v j ′ are connected indices. Because the degree of (5.52) is at least two,there exists at least one cluster of connected indices containing at least two elements. We may assumethat v , . . . , v n (2 ≤ n ≤ n ) form a cluster of connected indices. Using the local law in (3.10), we have | P d ( z ) | ≺ N n N X v =1 · · · N X v n =1 (cid:12)(cid:12) G v v G v v · · · G v n v ( z ) (cid:12)(cid:12) . 
If n = 2, from Young’s inequality and the Ward identity1 N X i,j | G ij ( z ) | = Im m N ( z ) N η , z = E + i η ∈ C + , (5.53)which follows from the spectral decomposition (5.2), we obtain | P d ( z ) | ≺ N X v ,v (cid:12)(cid:12) G v v ( z ) G v v ( z ) (cid:12)(cid:12) ≤ N X v ,v (cid:0) | G v v ( z ) | + | G v v ( z ) | (cid:1) = Im m N ( z ) N η . (5.54)For n ≥ 3, we have similarly from the local law (3.10) that | P d ( z ) | ≺ Ψ n − N X v ,v ,v (cid:12)(cid:12) G v v ( z ) G v v ( z ) (cid:12)(cid:12)(cid:12) ≤ Ψ n − N X v ,v ,v (cid:0) | G v v ( z ) | + | G v v ( z ) | (cid:1) = O (cid:16) Im m N ( z )( N η ) n − (cid:17) , (5.55)where in the last two steps we use Young’s inequality, the Ward identity (5.53), and that Ψ( z ) = O ( Nη )for any z ∈ S edge . Therefore, combining with the estimate (5.22) for the expectation of Im m N ( z ), theproperties of stochastic domination in Lemma 1.7, and that η ≥ N − ǫ , we have, for any τ > E GUE (cid:2) | P ( s ) d ( z ) | (cid:3) ≤ N − / τ , d ≥ , uniformly in z ∈ S edge , for sufficiently large N ≥ N ′ ( C , ǫ, τ ). This completes the proof of (5.23), andhence the proof of Lemma 5.4. (cid:3) Product of Green function entries with unmatched indices In this section, we prove Proposition 4.3. Before stating the proof for Wigner matrices, we first considerthe GUE for the intuition why expectations of unmatched terms are much smaller than the naive sizeobtained using power counting and the local law as in (4.10).6.1. Intuition from the GUE. In this subsection, we focus on the special case of the GUE, as inSubsection 5.1. Consider any Q od ∈ Q od of the form (4.8). 
Using the spectral decomposition (5.2) and the unitary invariance of the GUE similarly as in (5.3), we write the expectation of the unmatched $Q^o_d$ as
\[
\mathbb{E}[Q^o_d] = \frac{1}{N^{\#\mathcal{I}}} \sum_{\mathcal{I}} c_{\mathcal{I}} \sum_{j_1,\ldots,j_n=1}^N \mathbb{E}\Big[ \prod_{i=1}^n \frac{1}{\lambda_{j_i} - z} \Big] \times \mathbb{E}\Big[ \prod_{i=1}^n u_{j_i}(x_i)\, \overline{u_{j_i}(y_i)} \Big], \tag{6.1}
\]
with $(\lambda_j)$ the eigenvalues and $(u_j)$ the corresponding normalized eigenvectors, where each $x_i, y_i$ represents some free summation index in $\mathcal{I}$. In order to estimate the expectations of the eigenvectors, recall the Weingarten calculus formula in Lemma 5.1. Under Definition 4.2 for unmatched indices, if the values of the free summation indices in $\mathcal{I}$ are distinct, then $\delta_{x_1, y_{\sigma(1)}} \cdots \delta_{x_n, y_{\sigma(n)}} = 0$ for any permutation $\sigma \in S_n$. Thus from (5.4), for any $1 \le j_1, \ldots, j_n \le N$, we have
\[
\mathbb{E}\Big[ \prod_{i=1}^n u_{j_i}(x_i)\, \overline{u_{j_i}(y_i)} \Big] = 0.
\]
The non-vanishing contributions come from the diagonal cases when the values of some free summation indices in $\mathcal{I}$ coincide. Because of the averaged form of $Q^o_d$ in (4.8) and the local law in (3.10), one works out that, for any $z \in S$ and $t \ge 0$,
\[
\mathbb{E}[Q^o_d] = O(N^{-1}). \tag{6.2}
\]
For Wigner matrices, the above argument no longer applies. We hence use expansions similar to those in Subsection 5.2 to extend the estimate to arbitrary Wigner matrices. Before we give the proof of Proposition 4.3, we start by considering an example of an unmatched term in $\mathcal{Q}^o_d$ to illustrate the mechanism.

6.2. Example of an unmatched term. We look at the following example of an unmatched term,
\[
\frac{1}{N^2} \sum_{a,b} G_{ab} G_{ba} G_{ab} \in \mathcal{Q}^o_3, \tag{6.3}
\]
with unmatched row index $a$ and unmatched column index $b$. Using the local law in (3.10), the expectation of this term can be naively bounded by $O_\prec(\Psi^3 + N^{-1})$. The idea to improve this bound is similar to the proof of Proposition 4.6. Note that the combination of the identity (5.10) and the cumulant expansion gives a cancellation to the leading order.
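As a side remark on this mechanism: the identity (5.10) is not restated in this part of the text. Judging from the first equality in (6.4), it is presumably the elementary resolvent identity below, which is recorded here only as a sketch under that assumption. The notation $\underline{A} := N^{-1}\operatorname{Tr} A$ for the normalized trace is an assumption of this remark.

```latex
% Assumption: G = (H - z)^{-1} and \underline{A} := N^{-1} \operatorname{Tr} A.
% From (H - z) G = I we get HG = I + zG, hence
\begin{align*}
(HG)_{ab} &= \delta_{ab} + z\, G_{ab},
\qquad
\underline{HG} = 1 + z\, \underline{G}, \\
G_{ab}\, \underline{HG} - \underline{G}\, (HG)_{ab}
&= G_{ab} + z\, G_{ab}\, \underline{G} - \delta_{ab}\, \underline{G} - z\, \underline{G}\, G_{ab}
 = G_{ab} - \delta_{ab}\, \underline{G}.
\end{align*}
% In particular, for a \neq b:
%   G_{ab} = G_{ab}\, \underline{HG} - \underline{G}\, (HG)_{ab}.
```

Both terms on the right side contain exactly one explicit factor of $H$, which is what makes the cumulant expansion applicable; the leading contributions of the two terms are the ones that cancel.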
This cancellation allows us to improve the upper bound to $O_\prec\big(\Psi^4 + \frac{\Psi^3}{\sqrt{N}} + N^{-1}\big)$. We next discuss the details.

Using the identity (5.10) on the off-diagonal entry $G_{ab}$ with unmatched $a$ as the row index and applying cumulant expansions, we have
\[
\begin{aligned}
\frac{1}{N^2} \sum_{a,b} \mathbb{E}[G_{ab} G_{ba} G_{ab}]
&= \frac{1}{N^2} \sum_{a \ne b} \mathbb{E}\Big[ \Big( G_{ab}\, \underline{HG} - \underline{G}\, (HG)_{ab} \Big) G_{ba} G_{ab} \Big] + \frac{1}{N^2} \sum_{a=1}^N \mathbb{E}\big[ (G_{aa})^3 \big] \\
&= \frac{1}{N^4} \sum_{a,b,j,k} \mathbb{E}\Big[ \frac{\partial \big( G_{ab} G_{jk} G_{ba} G_{ab} \big)}{\partial h_{jk}} \Big] - \frac{1}{N^4} \sum_{a,b,j,k} \mathbb{E}\Big[ \frac{\partial \big( G_{jj} G_{kb} G_{ba} G_{ab} \big)}{\partial h_{ka}} \Big] \\
&\quad + \frac{1}{\sqrt{N}\, N^4} \sum_{p+q+1=3} \frac{1}{p!\, q!} \sum_{a,b,j,k} s^{(p,q+1)}_{jk}\, \mathbb{E}\Big[ \frac{\partial^2 \big( G_{ab} G_{jk} G_{ba} G_{ab} \big)}{\partial h_{jk}^p\, \partial h_{kj}^q} \Big] \\
&\quad - \frac{1}{\sqrt{N}\, N^4} \sum_{p+q+1=3} \frac{1}{p!\, q!} \sum_{a,b,j,k} s^{(p,q+1)}_{ak}\, \mathbb{E}\Big[ \frac{\partial^2 \big( G_{jj} G_{kb} G_{ba} G_{ab} \big)}{\partial h_{ka}^p\, \partial h_{ak}^q} \Big] + O_\prec\Big( \frac{1}{N} \Big),
\end{aligned} \tag{6.4}
\]
where the underline denotes the normalized trace, and the last error term comes from the truncation of the cumulant expansions at the third order and the diagonal case $a = b$.

Using (3.21) and that $j, k$ are fresh summation indices, all the third order expansions for $\{p+q+1=3\}$ can be written out using terms of the form in (4.8) of degree at least three, with an additional factor $\frac{1}{\sqrt{N}}$ in front. Since both the fresh indices $j, k$ appear in the product of the Green function entries an odd number of times, the resulting terms are unmatched in the sense of Definition 4.2. From the local law in (3.10), they are bounded by $O_\prec\big( \frac{\Psi^3}{\sqrt{N}} + N^{-3/2} \big)$.

Now we return to the second order terms in the cumulant expansions in (6.4), i.e.,
\[
\frac{1}{N^4} \sum_{a,b,j,k} \mathbb{E}\Big[ \frac{\partial \big( G_{ab} G_{jk} G_{ba} G_{ab} \big)}{\partial h_{jk}} \Big] - \frac{1}{N^4} \sum_{a,b,j,k} \mathbb{E}\Big[ \frac{\partial \big( G_{jj} G_{kb} G_{ba} G_{ab} \big)}{\partial h_{ka}} \Big]. \tag{6.5}
\]
Using (3.21), the fresh indices $j, k$ are then matched and the index $a$ remains an unmatched row index. The key observation here is that the leading sub-term from the first term above, corresponding to taking $\frac{\partial}{\partial h_{jk}}$ of $G_{jk}$, is canceled precisely by the leading sub-term from the second term above, resulting from taking $\frac{\partial}{\partial h_{ka}}$ of $G_{kb}$. We hence rewrite (6.5) as
\[
\frac{1}{N^4} \sum_{a,b,j,k} \mathbb{E}\Big[ \frac{\partial \big( G_{ab} G_{ba} G_{ab} \big)}{\partial h_{jk}}\, G_{jk} \Big] - \frac{1}{N^4} \sum_{a,b,j,k} \mathbb{E}\Big[ \frac{\partial \big( G_{jj} G_{ba} G_{ab} \big)}{\partial h_{ka}}\, G_{kb} \Big]. \tag{6.6}
\]
The degrees of the resulting terms from the first part above are five, as $j, k$ are fresh indices. Similarly, the ones from the second part have degrees at least four, except one sub-term from taking $\frac{\partial}{\partial h_{ka}}$ of $G_{ba}$, whose column index coincides with the unmatched row index $a$:
\[
\frac{1}{N^4} \sum_{a,b,j,k} \mathbb{E}\big[ G_{jj} G_{bk} G_{aa} G_{ab} G_{kb} \big].
\]
Compared with the original term in (6.3), one replaces one pair of the index $a$ by a fresh index $k$ and adds a factor $G_{aa}$ for the replaced index $a$. The good news is that this leading term of degree three remains unmatched with an unmatched row index $a$. We then expand it further as in (6.4) and obtain that
\[
\begin{aligned}
\frac{1}{N^4} \sum_{a,b,j,k} \mathbb{E}\big[ G_{jj} G_{aa} G_{ab} G_{bk} G_{kb} \big]
&= \frac{1}{N^6} \sum_{a,b,j,k,j',k'} \mathbb{E}\Big[ \frac{\partial \big( G_{jj} G_{aa} G_{ab} G_{bk} G_{kb} \big)}{\partial h_{j'k'}}\, G_{j'k'} \Big] \\
&\quad - \frac{1}{N^6} \sum_{a,b,j,k,j',k'} \mathbb{E}\Big[ \frac{\partial \big( G_{jj} G_{aa} G_{j'j'} G_{bk} G_{kb} \big)}{\partial h_{k'a}}\, G_{k'b} \Big] + \{\text{third order terms}\} + O_\prec\Big( \frac{1}{N} \Big),
\end{aligned} \tag{6.7}
\]
with $j', k'$ another two fresh summation indices. Here, the third order terms are also unmatched terms of the form in (4.8) of degree at least three with an extra $\frac{1}{\sqrt{N}}$ in front, similarly as in (6.4). From (3.21), the resulting terms from the first part on the right side of (6.7) have degrees at least five. As for the second part above, even though the column index of the diagonal entry $G_{aa}$ coincides with the unmatched row index $a$, the resulting terms have degrees at least four.

In this way, we improve the upper bound of the unmatched term given in (6.3) to
\[
\Big| \frac{1}{N^2} \sum_{a,b} \mathbb{E}\big[ G_{ab} G_{ba} G_{ab} \big] \Big| \prec \Psi^4 + \frac{\Psi^3}{\sqrt{N}} + N^{-1}.
\]
Indeed, we expand this unmatched term as
\[
\frac{1}{N^2} \sum_{a,b} \mathbb{E}\big[ G_{ab} G_{ba} G_{ab} \big] = \sum_{\substack{Q^o_{d'} \in \mathcal{Q}^o_{d'} \\ d' \ge 4}} \mathbb{E}[Q^o_{d'}] + \frac{1}{\sqrt{N}} \sum_{\substack{Q^o_{d'} \in \mathcal{Q}^o_{d'} \\ d' \ge 3}} \mathbb{E}[Q^o_{d'}] + O_\prec(N^{-1}), \tag{6.8}
\]
where we write $\sum_{Q^o_{d'} \in \mathcal{Q}^o_{d'},\, d' \ge 4} \mathbb{E}[Q^o_{d'}]$ as a sum of finitely many unmatched terms of the form in (4.8) of degrees increased by at least one, which come from the second order expansions.
Moreover, we write $\frac{1}{\sqrt{N}} \sum_{Q^o_{d'} \in \mathcal{Q}^o_{d'},\, d' \ge 3} \mathbb{E}[Q^o_{d'}]$ as a finite sum of unmatched terms of the form in (4.8) with an extra factor $\frac{1}{\sqrt{N}}$ in front, which corresponds to the third order expansions. The last error $O_\prec(N^{-1})$ is from the truncation of the cumulant expansion and the diagonal cases. By repeating the above expansion procedure in (6.8) for arbitrary $D$ times, we improve the upper bound to $O_\prec\big( \Psi^D + \frac{\Psi^{D-1}}{\sqrt{N}} + N^{-1} \big)$. The full proof is presented in the following section.

6.3. Proof of Proposition 4.3. In this section, we give the proof of Proposition 4.3 for Wigner matrices using the cumulant expansions as explained above.

Proof of Proposition 4.3. Consider an arbitrary unmatched term $Q^o_d \in \mathcal{Q}^o_d$ of the form (4.8). Because it is equivalent to expand the Green function entry $G_{xy}$ in the row index $x$ or the column index $y$, we focus on the unmatched row indices in the following.

We may assume that the index $v_1$ belongs to the unmatched row index set $\mathcal{R}^o$ (which cannot be empty) from Definition 4.2. Then there exists an off-diagonal factor in the product of Green function entries with $v_1$ as the row index. Without loss of generality, we set $x_1 = v_1$ and $y_1 \ne v_1$. Using (5.10) on the off-diagonal entry $G_{v_1 y_1}$ and applying cumulant expansions similarly as in (6.4), we have
\[
\begin{aligned}
\mathbb{E}[Q^o_d]
&= \frac{1}{N^{\#\mathcal{I}}} \sum_{\mathcal{I}} c_{\mathcal{I}}\, \mathbb{E}\Big[ G_{v_1 y_1} \prod_{i=2}^n G_{x_i y_i} \Big] \\
&= \frac{1}{N^{\#\mathcal{I}}} \sum_{\mathcal{I}} c_{\mathcal{I}}\, \mathbb{E}\Big[ \delta_{v_1 y_1} G_{v_1 y_1} \prod_{i=2}^n G_{x_i y_i} \Big]
 + \frac{1}{N^{\#\mathcal{I}+2}} \sum_{\mathcal{I}} c_{\mathcal{I}} \sum_{j,k} \mathbb{E}\Big[ \frac{\partial \big( G_{v_1 y_1} G_{jk} \prod_{i=2}^n G_{x_i y_i} \big)}{\partial h_{jk}} \Big] \\
&\quad - \frac{1}{N^{\#\mathcal{I}+2}} \sum_{\mathcal{I}} c_{\mathcal{I}} \sum_{j,k} \mathbb{E}\Big[ \frac{\partial \big( G_{jj} G_{k y_1} \prod_{i=2}^n G_{x_i y_i} \big)}{\partial h_{k v_1}} \Big] \\
&\quad + \frac{1}{N^{\#\mathcal{I}+2} \sqrt{N}} \sum_{\mathcal{I}} c_{\mathcal{I}} \sum_{p+q+1=3} \frac{1}{p!\, q!} \sum_{j,k} s^{(p,q+1)}_{jk}\, \mathbb{E}\Big[ \frac{\partial^2 \big( G_{x_1 y_1} G_{jk} \prod_{i=2}^n G_{x_i y_i} \big)}{\partial h_{jk}^p\, \partial h_{kj}^q} \Big] \\
&\quad - \frac{1}{N^{\#\mathcal{I}+2} \sqrt{N}} \sum_{\mathcal{I}} c_{\mathcal{I}} \sum_{p+q+1=3} \frac{1}{p!\, q!} \sum_{j,k} s^{(p,q+1)}_{v_1 k}\, \mathbb{E}\Big[ \frac{\partial^2 \big( G_{jj} G_{k y_1} \prod_{i=2}^n G_{x_i y_i} \big)}{\partial h_{k x_1}^p\, \partial h_{x_1 k}^q} \Big] + O_\prec\Big( \frac{1}{N} \Big) \\
&= \frac{1}{N^{\#\mathcal{I}+2}} \sum_{\mathcal{I}} c_{\mathcal{I}} \sum_{j,k} \mathbb{E}\Big[ \frac{\partial \big( G_{v_1 y_1} \prod_{i=2}^n G_{x_i y_i} \big)}{\partial h_{jk}}\, G_{jk} \Big]
 - \frac{1}{N^{\#\mathcal{I}+2}} \sum_{\mathcal{I}} c_{\mathcal{I}} \sum_{j,k} \mathbb{E}\Big[ \frac{\partial \big( G_{jj} \prod_{i=2}^n G_{x_i y_i} \big)}{\partial h_{k v_1}}\, G_{k y_1} \Big] \\
&\quad + \{\text{third order terms for } p+q+1=3\} + O_\prec\Big( \frac{1}{N} \Big),
\end{aligned} \tag{6.9}
\]
where $j, k$ are fresh summation indices, and the last error $O_\prec(\frac{1}{N})$ is from the truncation of the cumulant expansions at the third order and the diagonal case $v_1 \equiv y_1$.

We first look at the third order expansions for $p+q+1=3$, which are much smaller because we gain an extra $\frac{1}{\sqrt{N}}$ from the third order cumulants. Since both $j, k$ are fresh indices, it is straightforward to check from (3.21) that the resulting terms are also of the form in (4.8) with an extra $\frac{1}{\sqrt{N}}$ in front. Their degrees, denoted by $d'$, satisfy $d' \ge d$, the corresponding free summation index set is $\mathcal{I}' = \{\mathcal{I}, j, k\}$ and the number of Green function entries is $n' = n+3$. In addition, the number of such terms is at most $6(n+3)^2$. Comparing these terms with the original $Q^o_d$, we add in total an odd number of $j$'s (or $k$'s) into the original row index set and column index set of the product of the Green function entries. Then all these terms are unmatched terms in the sense of Definition 4.2. We use $\frac{1}{\sqrt{N}} \sum_{Q^o_{d'} \in \mathcal{Q}^o_{d'};\, d' \ge d} \mathbb{E}[Q^o_{d'}]$ to denote the finite sum of these unmatched terms from the third order expansions.

Next, we estimate the second order expansion terms, i.e., the second-to-last line on the right side of (6.9). Using (3.21), we write them as a sum of at most $2n$ terms of the form in (4.8) with $\mathcal{I}' = \{\mathcal{I}, j, k\}$ and $n' = n+2$. The degrees of these terms are estimated as follows.

For the first group of terms in the second-to-last line of (6.9), comparing with the original $Q^o_d$, we have added one fresh index $j$ and one fresh index $k$ into both the original row index set and the column index set. Then $j$ and $k$ are both matched indices.
Moreover, $v_1$ from $G_{v_1 y_1}$ remains an unmatched row index. After taking $\frac{\partial}{\partial h_{jk}}$ by (3.21), the degrees are then increased by at least two.

Similarly, we compare the second group in the second-to-last line of (6.9) with the original $Q^o_d$. We find that both $j$ and $k$ are again matched. The resulting index $v_1$ from $\frac{\partial}{\partial h_{k v_1}}$ is still an unmatched row index. However, the degrees of the resulting terms from taking $\frac{\partial}{\partial h_{k v_1}}$ may not be increased. This is because the column index of some Green function entry $G_{x_i y_i}$ ($2 \le i \le n$) may coincide with the unmatched row index $v_1$. The number of such Green function entries with $v_1$ as column index is given by $\nu^{(c)}_1$ ($\le n$) from Definition 4.2. So we split the discussion into three cases.

Case 1: If $y_i \ne v_1$, then after taking $\frac{\partial}{\partial h_{k v_1}}$ of $G_{x_i y_i}$, the degree of the resulting term is increased by at least one.

Case 2: If $y_i = x_i = v_1$, then after taking $\frac{\partial}{\partial h_{k v_1}}$ of $G_{x_i y_i}$, the degree is increased by exactly one.

Case 3: If $y_i = v_1$ but $x_i \ne v_1$, then, for simplicity, we may assume that $y_2 = v_1$ and $x_2 \ne v_1$. From Definition 4.2 for unmatched indices, there exists some $3 \le i' \le n$ such that $x_{i'} = v_1$ and $y_{i'} \ne v_1$, because otherwise $v_1$ cannot be an unmatched row index of the original $Q^o_d$. We may assume $x_3 = v_1$ and $y_3 \ne v_1$. Then the corresponding term after taking $\frac{\partial}{\partial h_{k v_1}}$ of $G_{x_2, v_1}$ becomes
\[
(\ast) := \frac{1}{N^{\#\mathcal{I}+2}} \sum_{\mathcal{I}, j, k} c_{\mathcal{I}}\, \mathbb{E}\Big[ G_{jj} G_{v_1 v_1} G_{k y_1} G_{x_2 k} G_{v_1, y_3} \prod_{i=4}^n G_{x_i y_i} \Big], \tag{6.10}
\]
with $y_1 \ne v_1$, $x_2 \ne v_1$, and $y_3 \ne v_1$, and the degree of this term is still $d$. Compared with the original $Q^o_d$, we have replaced one pair of the index $v_1$, i.e., the row index of $G_{x_1 y_1}$ and the column index of $G_{x_2 y_2}$, by the fresh index $k$. Further, we get an additional diagonal Green function entry $G_{v_1 v_1}$ for the replaced pair of the index $v_1$. Since the index $v_1$ from $G_{v_1 y_3}$ remains an unmatched row index, we can further expand the term in (6.10) using the unmatched row index $v_1$, as in (6.9).
We write
\[
\begin{aligned}
(\ast) &= -\frac{1}{N^{\#\mathcal{I}+4}} \sum_{\mathcal{I}, j, k, j', k'} c_{\mathcal{I}}\, \mathbb{E}\Big[ \frac{\partial \big( G_{jj} G_{v_1 v_1} G_{v_1 y_3} G_{x_2 k} G_{k y_1} \prod_{i=4}^n G_{x_i y_i} \big)}{\partial h_{j'k'}}\, G_{j'k'} \Big] \\
&\quad + \frac{1}{N^{\#\mathcal{I}+4}} \sum_{\mathcal{I}, j, k, j', k'} c_{\mathcal{I}}\, \mathbb{E}\Big[ \frac{\partial \big( G_{jj} G_{v_1 v_1} G_{j'j'} G_{x_2 k} G_{k y_1} \prod_{i=4}^n G_{x_i y_i} \big)}{\partial h_{k' v_1}}\, G_{k' y_3} \Big] \\
&\quad + \{\text{third order expansions for } p+q+1=3\} + O_\prec\Big( \frac{1}{N} \Big).
\end{aligned} \tag{6.11}
\]
Similarly to (6.9), the third order expansions contain at most $6(n+5)^2$ unmatched terms of the form in (4.8) with an additional factor $\frac{1}{\sqrt{N}}$ in front, of degrees $d'' \ge d$, with $\mathcal{I}'' = \{\mathcal{I}, j, k, j', k'\}$ and $n'' = n+5$.

We next estimate the second order expansions on the right side of (6.11). From (3.21), they become a sum of at most $2n$ terms of the form in (4.8), with $\mathcal{I}'' = \{\mathcal{I}, j, k, j', k'\}$ and $n'' = n+4$.

If for any $4 \le i \le n$, either $y_i \ne v_1$ or $x_i = y_i = v_1$ holds, as considered in Cases 1 and 2 above, then the degrees of these resulting terms are increased by at least one, i.e., $d'' \ge d+1$.

Otherwise, we may assume that $y_4 = v_1$ and $x_4 \ne v_1$. The resulting leading term of degree $d$, as the analogue of (6.10), is obtained by replacing one pair of the index $v_1$, i.e., the row index of $G_{x_3 y_3}$ and the column index of $G_{x_4 y_4}$, by the fresh index $k''$ and adding an additional diagonal Green function entry $G_{v_1 v_1}$. Moreover, there exists some $5 \le i''' \le n$ such that $x_{i'''} = v_1$ and $y_{i'''} \ne v_1$ to make sure $v_1$ is an unmatched row index of the original $Q^o_d$ in (6.9), as explained at the beginning of Case 3. We may assume $i''' = 5$ for simplicity. Then the index $v_1$ from $G_{v_1 y_5}$ is again unmatched. We can expand this leading term of degree $d$ for the third time by applying (5.10) on $G_{v_1 y_5}$ and applying cumulant expansions, similarly as in (6.11).

We continue this procedure of expanding in the unmatched row index $v_1$ repeatedly for $s$ times, until there is no off-diagonal Green function entry with column index $y_i = v_1$ in the remaining product of the Green function entries $\prod_{i=2s}^n G_{x_i y_i}$.
Then from Case 1 and Case 2 above, the resulting terms have degrees increased by at least one. The number of iterations $s$ is at most $\nu^{(c)}_1$ ($\le n$), where $\nu^{(c)}_1$ defined in (7.5) is the number of times the unmatched row index $v_1$ appears in the column index set of the original $Q^o_d$.

In this way, we expand the original unmatched $Q^o_d$ in terms of finitely many unmatched terms in the form (4.8) of degrees at least $d+1$, as well as the third order cumulant expansion terms generated in the iterations, plus an error $O_\prec(N^{-1})$ from the truncation of the cumulant expansion and the diagonal cases. In summary, for any unmatched $Q^o_d \in \mathcal{Q}^o_d$, we write the following expansion for short:
\[
\mathbb{E}[Q^o_d] = \sum_{\substack{Q^o_{d'} \in \mathcal{Q}^o_{d'} \\ d' \ge d+1}} \mathbb{E}[Q^o_{d'}] + \frac{1}{\sqrt{N}} \sum_{\substack{Q^o_{d'} \in \mathcal{Q}^o_{d'} \\ d' \ge d}} \mathbb{E}[Q^o_{d'}] + O_\prec\Big( \frac{1}{N} \Big), \tag{6.12}
\]
where the number of unmatched terms in the summations above is bounded by $(Cn)^{cn}$, and the number of Green function entries in the product of each unmatched term is bounded by $Cn$, for some numerical constants $C, c > 0$.

We iterate the expansion (6.12) $D - d$ times. Then the unmatched terms in the first summation have degrees at least $D$, and the unmatched terms with $\frac{1}{\sqrt{N}}$ in the second summation have degrees at least $D-1$. Note that the total number of terms generated in the iteration of the expansions is bounded by $\big( (C^D n)^{c^D n} \big)^D$, and the number of Green function entries in the product of each term is bounded by $C^D n$. We hence obtain from the local law in (3.10) that
\[
\mathbb{E}[Q^o_d] = O_\prec\Big( \Psi^D + \frac{\Psi^{D-1}}{\sqrt{N}} + \frac{1}{N} \Big) = O_\prec\Big( \Psi^D + \frac{1}{N} \Big). \tag{6.13}
\]
We have hence finished the proof of Proposition 4.3. $\square$

7. Proof of Proposition 3.3

In this section, we prove Proposition 3.3, which is a key ingredient in the proof of the Green function comparison theorem, Theorem 1.4. The special case of Proposition 3.3 with $F(x) = x$ was stated in (4.6), which leads to the corresponding Green function comparison theorem for $F(x) = x$ in Proposition 4.1.
The proof of Proposition 3.3 relies on the analogues of Proposition 4.6 (expansion in type-0 terms) and Proposition 4.3 (the negligibility of unmatched terms), as well as the estimate (4.3) obtained in Proposition 4.1.

Proof of Proposition 3.3. We extend the ideas from the proofs of (4.6) to the setup of Proposition 3.3. Recall $\mathbb{E}[\Theta(t, z_1, z_2)]$ from (3.18), i.e.,
\[
\mathbb{E}[\Theta(t, z_1, z_2)] \equiv \mathbb{E}[\Theta] = \sum_{\substack{p+q+1=3 \\ p,q \in \mathbb{N}}}^{4} K_{p,q+1} + \mathcal{E} + O_\prec\big( N^{-1/2} \big), \tag{7.1}
\]
with $K_{p,q+1}$ given in (3.19) and $\mathcal{E}$ given in (3.20).

In order to estimate the terms on the right side of (7.1), we write them, using the differentiation rules (3.21) and (3.26), in the analogous form of (4.8):
\[
\widetilde{Q}(t, z_1, z_2):\quad \frac{1}{N^m} \sum_{v_1=1}^N \cdots \sum_{v_m=1}^N c_{v_1,\ldots,v_m}\, \mathbb{E}\Big[ F^{(\alpha)}(X) \prod_{i=1}^{i_0} \Delta\widetilde{\mathrm{Im}} \Big( \prod_{l=1}^{n_i} G_{x^{(i)}_l y^{(i)}_l} \Big) \Big], \tag{7.2}
\]
with $\alpha, m, i_0, n_i \in \mathbb{N}$, where $F^{(\alpha)}$ is the $\alpha$-th derivative of a smooth function $F$ which has uniformly bounded derivatives, $\Delta\widetilde{\mathrm{Im}} : \mathbb{R}^+ \times (\mathbb{C} \setminus \mathbb{R}) \to \mathbb{C}$ is defined in (3.14), $\mathcal{I} := \{v_j\}_{j=1}^m$ is a free summation index set, and the $v_j$'s may also represent $a, b$ from (3.19) and (3.20). The coefficients $\{c_{\mathcal{I}} := c_{v_1,\ldots,v_m}\}$ are uniformly bounded complex numbers, and each $x^{(i)}_l$ and $y^{(i)}_l$ represents some element in the free summation index set $\mathcal{I}$. The total number of Green function entries in (7.2) is then given by
\[
n := \sum_{i=1}^{i_0} n_i. \tag{7.3}
\]
We further define the degree of a term in the form (7.2) by counting the number of off-diagonal Green function entries, i.e.,
\[
d := \sum_{i=1}^{i_0} \#\big\{ 1 \le l \le n_i : x^{(i)}_l \ne y^{(i)}_l \big\}. \tag{7.4}
\]
In particular, we have $0 \le d \le n$. The collection of the terms in the form (7.2) of degree $d$ is denoted by $\widetilde{\mathcal{Q}}_d \equiv \widetilde{\mathcal{Q}}_d(t, z_1, z_2)$. From the definition of $\Delta\widetilde{\mathrm{Im}}$ in (3.14), the local law in (3.10) and the fact that $F$ has bounded derivatives, we have, for any term $\widetilde{Q}_d \equiv \widetilde{Q}_d(t, z_1, z_2) \in \widetilde{\mathcal{Q}}_d$,
\[
|\widetilde{Q}_d(t, z_1, z_2)| = O_\prec\Big( \Psi^d + \frac{1}{N} \Big),
\]
uniformly in $t \in \mathbb{R}^+$ and $z_1, z_2 \in S$ given in (2.7).
In the following, we often omit the parameters $t, z_1, z_2$ for notational simplicity.

7.1. Unmatched terms $K_{p,q+1}$ in (3.19). In this subsection, we follow the idea in Section 6 to show the negligibility of the terms $K_{p,q+1}$ given in (3.19) with unmatched indices as defined next, cf. Proposition 4.3. Recall Definition 4.2 for unmatched terms of the form in (4.8).

Definition 7.1. Given any $\widetilde{Q}_d \in \widetilde{\mathcal{Q}}_d$ of the form in (7.2), let $\nu^{(r)}_j$, $\nu^{(c)}_j$ be the number of times the free summation index $v_j \in \mathcal{I}$ appears in the row index set $\{x^{(i)}_l\}$ and the column index set $\{y^{(i)}_l\}$ of the Green function entries, i.e.,
\[
\nu^{(r)}_j := \sum_{i=1}^{i_0} \#\{ 1 \le l \le n_i : x^{(i)}_l = v_j \}, \qquad \nu^{(c)}_j := \sum_{i=1}^{i_0} \#\{ 1 \le l \le n_i : y^{(i)}_l = v_j \}. \tag{7.5}
\]
Definition 4.2 for unmatched terms can be adapted naturally to the general form given in (7.2). Define the set of unmatched summation indices as
\[
\mathcal{I}^o := \{ 1 \le j \le m : \nu^{(r)}_j \ne \nu^{(c)}_j \} \subset \mathcal{I}.
\]
If $\mathcal{I}^o$ is not empty, then we say $\widetilde{Q}_d$ is an unmatched term, denoted by $\widetilde{Q}^o_d$. We denote by $\widetilde{\mathcal{Q}}^o_d \subset \widetilde{\mathcal{Q}}_d$ the collection of unmatched terms in the form (7.2) of degree $d$.

The combination of the identity (5.10) and the cumulant expansion formula in Lemma 2.6, used previously in the proof of Proposition 4.3, still applies similarly to the form in (7.2), using that $\{h_{ij}\}$ commute with $\Delta\widetilde{\mathrm{Im}}$ given in (3.14), the differentiation rules (3.21) and (3.26), and the assumption that the function $F$ has bounded derivatives. Therefore, for fixed $D \ge d$ and any unmatched term $\widetilde{Q}^o_d \in \widetilde{\mathcal{Q}}^o_d$ of the form in (7.2) with fixed $n$ given in (7.3),
\[
\mathbb{E}[\widetilde{Q}^o_d(t, z_1, z_2)] = O_\prec\Big( \frac{1}{N} + \Psi^D \Big) \tag{7.6}
\]
holds uniformly in $t \in \mathbb{R}^+$ and $z_1, z_2 \in S$, as in Proposition 4.3.

Now we return to the right side of (7.1).
Using (3.21) and (3.26), all the third order expansion terms $K_{p,q+1}$ in (3.19) for $p+q+1=3$ can be written out as a sum of finitely many unmatched terms of the form in (7.2) with an extra factor $\sqrt{N}$ in front, since both the indices $a$ and $b$ appear an odd number of times in the product of the Green function entries. We hence have from (7.6) that
\[
|K_{2,1} + K_{1,2} + K_{0,3}| = O_\prec\big( N^{-1/2} + \sqrt{N}\, \Psi^D \big). \tag{7.7}
\]
Similarly, the fourth order expansion terms $K_{p,q+1}$, $p+q+1=4$, in (3.19), with the exception of $K_{2,2}$, can also be written as a finite sum of unmatched terms of the form in (7.2), since the number of times the index $a$ (or $b$) appears in the row index set $\{x^{(i)}_l\}$ does not agree with the number of times it appears in the column index set $\{y^{(i)}_l\}$. We then find from (7.6) that
\[
|K_{3,1} + K_{1,3} + K_{0,4}| = O_\prec\big( N^{-1} + \Psi^D \big). \tag{7.8}
\]
It hence suffices to estimate the remaining matched terms $K_{2,2}$ and $\mathcal{E}$ on the right side of (7.1) as follows. We first consider $K_{2,2}$ given in (3.19); $\mathcal{E}$ in (3.20) can then be estimated similarly. The proof contains three steps:

1) expanding matched terms into type-0 terms defined as below, cf. Proposition 4.6;
2) estimating the resulting type-0 terms whose degrees are at least two, cf. Lemma 4.8;
3) estimating the remaining type-0 terms of degree zero using (4.3) in the edge scaling.

7.2. Expanding $K_{2,2}$. We start with $K_{2,2}$ given in (3.19), corresponding to the (2,2)-cumulants.
Usingthe differentiation rules (3.21) and (3.26), we first write K , as the following sum K , = X k =1 I k , (7.9)with I := − N X a = b s (2 , ab E h F ′ ( X )∆ f Im (cid:16) ( G aa ) ( G bb ) (cid:17)i ; I := − N X a = b s (2 , ab E h F ′ ( X )∆ f Im (cid:16) G ab G ba G aa G bb (cid:17)i ; I := − N X a = b s (2 , ab E h F ′′ ( X )∆ f Im ( G ab )∆ f Im (cid:16) G aa G bb G ba (cid:17)i ; I := − N X a = b s (2 , ab E h F ′′ ( X ) (cid:16) ∆ f Im ( G aa G bb ) (cid:17) i ; I := − N X a = b s (2 , ab E h F ′′′ ( X )∆ f Im ( G ab )∆ f Im ( G ba )∆ f Im (cid:16) G aa G bb ) (cid:17)i ; I := − N X a = b s (2 , ab E h F ′′ ( X )∆ f Im (cid:16) ( G ab ) (cid:17) ∆ f Im (cid:16) ( G ba ) (cid:17)i ; I := − N X a = b s (2 , ab E h F ′′′ ( X ) (cid:16) ∆ f Im ( G ab ) (cid:17) ∆ f Im (cid:16) ( G ba ) (cid:17)i ; I := − N X a = b s (2 , ab E h F ′′′′ ( X ) (cid:16) ∆ f Im ( G ab ) (cid:17) (cid:16) ∆ f Im ( G ba ) (cid:17) i , (7.10)where s (2 , ab ( a = b ) are the (2,2)-cumulants of the rescaled entries √ Nh ab given in (2.24).Observe that for the terms given in (7.10), both indices a and b appear exactly twice as the row indexand exactly twice as the column index of a Green function entry. We hence consider the special case ofthe form in (7.2) with the two indices a, b singled out, namely,1 N I +2 X a,b, I c a,b, I E h F ( α ) ( X ) i Y i =1 ∆ f Im (cid:16) n i Y l =1 G x ( i ) l y ( i ) l (cid:17)i , (7.11)where each x ( i ) l and y ( i ) l represent a , b or some element in the free summation index set I = { v j } mj =1 , and { c a,b, I } are uniformly bounded complex numbers. The number of Green function entries in the product,denoted by n , is given as in (7.3). The degree, denoted by d , is given as in (7.4) by counting the numberof off-diagonal Green function entries in the product. Definition 7.2. Given any term of the form in (7.11), Definition 4.5 for the type-AB, type-A and Type-0terms of the form in (4.17) can be adapted naturally. 
Recall $\nu^{(r)}_j$, $\nu^{(c)}_j$ given in (7.5) for any free summation index $v_j \in \mathcal{I}$. We define the analogous quantities for the special summation indices $a$ and $b$, i.e.,
\[
\nu^{(r)}_a := \sum_{i=1}^{i_0} \#\{1 \le l \le n_i : x^{(i)}_l = a\}, \qquad
\nu^{(c)}_a := \sum_{i=1}^{i_0} \#\{1 \le l \le n_i : y^{(i)}_l = a\};
\]
\[
\nu^{(r)}_b := \sum_{i=1}^{i_0} \#\{1 \le l \le n_i : x^{(i)}_l = b\}, \qquad
\nu^{(c)}_b := \sum_{i=1}^{i_0} \#\{1 \le l \le n_i : y^{(i)}_l = b\}.
\]
If the following two conditions are satisfied,
(1) all the free summation indices in $\mathcal{I}$ appear once in the row index set $\{x^{(i)}_l\}$ and once in the column index set $\{y^{(i)}_l\}$ of the Green function entries, i.e., $\nu^{(r)}_j = \nu^{(c)}_j = 1$ $(1 \le j \le m)$;
(2) both the special indices $a$ and $b$ appear twice in the row index set $\{x^{(i)}_l\}$ and twice in the column index set $\{y^{(i)}_l\}$ of the Green function entries, i.e., $\nu^{(r)}_a = \nu^{(c)}_a = \nu^{(r)}_b = \nu^{(c)}_b = 2$,
then such a term is a type-AB term. We denote a type-AB term of the form (7.11) of degree $d$ by $T^{AB}_d \equiv T^{AB}_d(t, z_1, z_2)$. The collection of all the type-AB terms of degree $d$ is denoted by $\mathcal{T}^{AB}_d \equiv \mathcal{T}^{AB}_d(t, z_1, z_2)$.

A type-A term of the form (7.11) of degree $d$, denoted by $T^{A}_d$, has $\nu^{(r)}_a = \nu^{(c)}_a = 2$ and $\nu^{(r)}_b = \nu^{(c)}_b = \nu^{(r)}_j = \nu^{(c)}_j = 1$ $(1 \le j \le m)$. Moreover, a type-0 term, denoted by $T^{0}_d$, is of the form (7.11) of degree $d$ with $\nu^{(r)}_a = \nu^{(c)}_a = \nu^{(r)}_b = \nu^{(c)}_b = \nu^{(r)}_j = \nu^{(c)}_j = 1$ $(1 \le j \le m)$. In addition, the collections of the type-A terms and the type-0 terms of the form in (7.2) of degree $d$ are denoted by $\mathcal{T}^{A}_d \equiv \mathcal{T}^{A}_d(t, z_1, z_2)$ and $\mathcal{T}^{0}_d \equiv \mathcal{T}^{0}_d(t, z_1, z_2)$, respectively. We finally remark that the index $b$ in a type-A term, as well as both indices $a, b$ in a type-0 term, do not play special roles. We keep them in the notation in order to emphasize the inheritance from the form (7.11).

Under Definition 7.2, we observe that all the terms given in (7.10) are type-AB terms of the form (7.11) with $\mathcal{I} = \emptyset$ and the coefficients given by $c_{a,b} = s^{(2,2)}_{ab}$.
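The index bookkeeping of Definitions 7.1 and 7.2 can be illustrated with a minimal sketch. The encoding below is an assumption made for illustration only: each term is a list of $\Delta\widetilde{\mathrm{Im}}$ factors, each factor a list of (row, column) index pairs, and the function name `classify` is hypothetical. The index patterns for the eight terms of (7.10) are entered as read from that display, and the degree counts off-diagonal entries as in (7.4).

```python
from collections import Counter

def classify(factors, free=()):
    """Classify a term of the form (7.11) (illustrative encoding, not from the paper).

    factors: one list of (row, column) index pairs per Delta-Im factor.
    free:    names of the free summation indices v_j; "a" and "b" are special.
    Returns (type, degree), where degree counts the off-diagonal entries.
    """
    entries = [pair for factor in factors for pair in factor]
    rows = Counter(r for r, _ in entries)   # nu^(r) for every index
    cols = Counter(c for _, c in entries)   # nu^(c) for every index
    degree = sum(1 for r, c in entries if r != c)
    # Unmatched: some index has different row and column multiplicities.
    if any(rows[v] != cols[v] for v in set(rows) | set(cols)):
        return "unmatched", degree
    # Free indices must appear exactly once as a row (and as a column) index.
    if any(rows[v] != 1 for v in free):
        return "other", degree
    kinds = {(2, 2): "type-AB", (2, 1): "type-A", (1, 1): "type-0"}
    return kinds.get((rows["a"], rows["b"]), "other"), degree

aa, bb, ab, ba = ("a", "a"), ("b", "b"), ("a", "b"), ("b", "a")
terms_7_10 = {
    "I1": [[aa, aa, bb, bb]],
    "I2": [[ab, ba, aa, bb]],
    "I3": [[ab], [aa, bb, ba]],
    "I4": [[aa, bb], [aa, bb]],
    "I5": [[ab], [ba], [aa, bb]],
    "I6": [[ab, ab], [ba, ba]],
    "I7": [[ab], [ab], [ba, ba]],
    "I8": [[ab], [ab], [ba], [ba]],
}
results = {name: classify(f) for name, f in terms_7_10.items()}
for name, (kind, degree) in results.items():
    print(name, kind, degree)
```

In this encoding every term of (7.10) comes out as type-AB, while a product such as $G_{aa} G_{bb} G_{ab}$ is flagged as unmatched because $a$ appears twice as a row index but only once as a column index.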
In particular, we have that I , I ∈ T AB , I , I , I ∈ T AB , and I , I , I ∈ T AB . In the following, we use, as in the proof of Proposition 4.6, thecombination of the identity (5.10) and cumulant expansion formula Lemma 2.6 to eliminate one pair ofthe index b and also one pair of the index a , and thus expand the type-AB terms as linear combinationsof type-0 terms up to negligible error. Lemma 7.3. For any fixed D ∈ N , we have K , = − s n E (cid:2) F ′ ( X ) (cid:0) ∆ f Im ( G ) (cid:1)(cid:3) + E (cid:2) F ′′ ( X ) (cid:0) ∆ f Im ( G ) (cid:1) (cid:3)o + X T d ∈T d ≤ d 2, the number of Green function entries n ′ = 6,and I ′ = { j, k } . We denote the finite sum as P T ABd ′ ∈T ABd ′ ; d ′ ≥ T ABd ′ , and write I = − N X a,b s (2 , ab E h F ′ ( X )∆ f Im (cid:16) ( G aa ) G bb G (cid:17)i + X T ABd ′ ∈T ABd ′ ; d ′ ≥ T ABd ′ + O ≺ (cid:0) √ N (cid:1) . (7.15)Next, we further replace G bb in the first terms on the right side of (7.15) by G using (5.10) and thecumulant expansion formula as in (7.14) to obtain − N X a,b s (2 , ab E h F ′ ( X )∆ f Im (cid:16) ( G aa ) G bb G (cid:17)i = − N X a,b s (2 , ab E h F ′ ( X )∆ f Im (cid:16) ( G aa ) ( G ) (cid:17)i − N X a,b,j,k s (2 , ab E h ∂F ′ ( X )∆ f Im (cid:16) ( G aa ) G bb GG jk (cid:17) ∂h jk i + 12 N X a,b,j,k s (2 , ab E h ∂F ′ ( X )∆ f Im (cid:16) ( G aa ) GG jj G kb (cid:17) ∂h kb i + O ≺ (cid:0) √ N (cid:1) . (7.16)Observe similarly to above that the leading sub-term from the second term will be cancelled exactly bythe leading sub-term from the third term. The remaining sub-terms form a sum of at most ten type-Aterms of degrees at least two, denoted as P T Ad ′ ∈T Ad ′ ; d ′ ≥ T Ad ′ . Combining with (7.15), we have I = − N X a,b s (2 , ab E h F ′ ( X )∆ f Im (cid:16) ( G aa ) ( G ) (cid:17)i + X T Ad ′ ∈T Ad ′ d ′ ≥ T Ad ′ + X T ABd ′ ∈T ABd ′ d ′ ≥ T ABd ′ + O ≺ (cid:0) √ N (cid:1) . 
(7.17)In general, for an arbitrary type-AB term T ABd ∈ T ABd of the form (7.11) with fixed n given in (7.3),we extend the arguments as in Step 1 in Subsection 5.2, using the differentiation rules (3.21) and (3.26)and that { h ij } commute with ∆ f Im in (3.14). We hence obtain the analogue of (5.18), T ABd = X T Ad ∈T Ad T Ad + X T ABd ′ ∈T ABd ′ d ′ ≥ d +1 T ABd ′ + O ≺ (cid:0) √ N (cid:1) , (7.18)where the summations above denote a sum of at most two type-A terms of degree d and a sum of atmost 6( n + 4) type-AB terms of degrees not less than d + 1. The number of the Green function entriesin each term above is at most n + 4. Iterating the expansion procedure (7.18) D − d times and using thelocal law in (3.10), we expand T ABd ∈ T ABd as a sum of at most (6( n + 4 D )) D type-A terms of degrees atleast d , up to negligible error. We write for short T ABd = X d ≤ d ′ For any type-0 term T d ∈ T d of the form (7.11) of degree d ≥ , we have | T d ( t, z , z ) | = O ≺ (cid:0) N − / (cid:1) , (7.24) uniformly in t ∈ R + , z , z ∈ S edge given in (4.1).Proof. Given any type-0 term T d ∈ T d of the form (7.11), we no longer emphasize the indices a , b fornotational simplicity. We then write T d from the definition of ∆ f Im in (3.14) as E h F ( α ) ( X ) 1 N I X I c I i Y i =1 (cid:16) n i Y l =1 G x ( i ) l y ( i ) l ( t, z ) − n i Y l =1 G x ( i ) l y ( i ) l ( t, z ) − n i Y l =1 G x ( i ) l y ( i ) l ( t, z ) + n i Y l =1 G x ( i ) l y ( i ) l ( t, z ) i , with t ≥ z , z ∈ S edge , and α, m, i , n i ∈ N , where each summation index v j ∈ I := { v j } mj =1 appearsexactly once in the row index set { x ( i ) l } and once in the column index set { x ( i ) l } of the Green functionentries. In particular, we have I = n = P i i =1 n i . For 1 ≤ j ≤ m , if there exist x ( i ) l = y ( i ) l = v j , then wesay v j is isolated. 
For any 1 ≤ j = j ′ ≤ m , if there exist 1 ≤ i ≤ i , ≤ l ≤ n i such that either x ( i ) l = v j , y ( i ) l = v j ′ or y ( i ) l = v j , x ( i ) l = v j ′ , then we say that v j and v j ′ are connected indices. We then writeout T d as a linear combination of the terms in the following form, which are rearranged using clusters ofconnected indices, denoted by { v ( q )1 , . . . , v ( q ) l q } q ,( ∗∗ ) := E h F ( α ) ( X ) 1 N I X I c I Y q (cid:16) G v ( q )1 v ( q )2 ( t, z ( q )1 ) G v ( q )2 v ( q )3 ( t, z ( q )2 ) · · · G v ( q ) lq v ( q )1 ( t, z ( q ) l q ) (cid:17)i , (7.25)where P q l q = n , z ( q ) l for any q and 1 ≤ l ≤ l q takes the values z , z , z , or z . Because the degree d ≥ l q ≥ 2. We may assume that q = 1. Recall that the coefficients { c I } are uniformly bounded and that the function F has bounded derivatives. Thenusing the local law in (3.10) and the properties of stochastic domination in Lemma 1.7, we have that | ( ∗∗ ) | ≺ E h N l N X v (1)1 ,...,v (1) l =1 (cid:12)(cid:12)(cid:12) G v (1)1 v (1)2 ( t, z (1)1 ) G v (1)2 v (1)3 ( t, z (1)2 ) · · · G v (1) l v (1)1 ( t, z (1) l ) (cid:12)(cid:12)(cid:12)i . In combination with Young’s inequality and the Ward identity (5.53), we find, similarly to (5.55), that | ( ∗∗ ) | ≺ E [Im m N ( t, z )]( N η ) l − + E [Im m N ( t, z )]( N η ) l − , l ≥ , z , z ∈ S edge . (7.26)Together with the estimate (4.3) on E [Im m N ( t, z )] in the edge scaling and the fact that η ≥ N − ǫ , weobtain the estimate in (7.24). (cid:3) Applying the estimate (7.24) to (7.12), we find that K , = − s n E h F ′ ( X ) (cid:16) ∆ f Im ( G ) (cid:17)i + E h F ′′ ( X ) (cid:16) ∆ f Im ( G ) (cid:17) io + O ≺ ( N − / + Ψ D ) , (7.27)uniformly in t ≥ z , z ∈ S edge . It then suffices to estimate the remaining type-0 terms of degreezero on the right side of (7.27). 
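The Ward identity invoked for (7.26) states that $\sum_j |G_{ij}(z)|^2 = \mathrm{Im}\, G_{ii}(z)/\eta$ for the Green function of a Hermitian matrix at $z = E + \mathrm{i}\eta$. As a sanity check, the sketch below verifies it on a $2\times 2$ Hermitian matrix, where the resolvent has a closed form. The function names and the numerical values are illustrative assumptions, not taken from the paper.

```python
def resolvent_2x2(p, q, r, z):
    """Green function (H - z)^{-1} of the Hermitian matrix H = [[p, q], [conj(q), r]]."""
    det = (p - z) * (r - z) - abs(q) ** 2
    return [[(r - z) / det, -q / det],
            [-q.conjugate() / det, (p - z) / det]]

def ward_gap(p, q, r, z):
    """Largest violation of sum_j |G_ij|^2 = Im G_ii / eta over the two rows."""
    g = resolvent_2x2(p, q, r, z)
    eta = z.imag
    return max(
        abs(sum(abs(x) ** 2 for x in g[i]) - g[i][i].imag / eta)
        for i in range(2)
    )

# Hypothetical entries and a spectral parameter close to the edge at 2.
print(ward_gap(0.3, 0.5 + 0.2j, -0.7, 2.0 + 0.01j))
```

The identity is what converts a sum over a chain of off-diagonal Green function entries into a factor $\mathrm{Im}\, m_N/(N\eta)$ per summed index, which is the gain used in (7.26).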
The idea is to reduce the power of the normalized trace of the Greenfunction G and then use the estimate in (4.3) to find a desired upper bound O ≺ ( N − / ǫ ) of K , .7.4. Estimate of K , . From the local law in (3.10) and the properties of the Stieltjes transform m sc ofthe semicircle law in Lemma 2.1, the normalized trace of the Green function G ( z ) is close to − z ∈ S edge . Using these ideas, we will estimate the leading terms on the right side of (7.27) with the helpof Lemma 7.4. Lemma 7.5. For any fixed k, k , k ≥ , we have E h F ′ ( X ) (cid:16) ∆ f Im ( G ) k (cid:17)i =( − k − k E h F ′ ( X ) (cid:16) ∆ f Im G (cid:17)i + O ≺ (cid:0) N − / (cid:1) , (7.28) and E h F ′′ ( X ) (cid:16) ∆ f Im ( G ) k (cid:17)(cid:16) ∆ f Im ( G ) k (cid:17)i =( − k + k − k k E h F ′′ ( X ) (cid:16) ∆ f Im ( G ) (cid:17) i + O ≺ (cid:0) N − / (cid:1) , (7.29) with ∆ f Im given in (3.14), uniformly in t ∈ R + and z , z ∈ S edge given in (4.1).Proof. We start by the first equality (7.28). Multiplying the left side of (7.28) by two, we write out theabbreviation ∆ f Im defined in (3.14) and obtain that2 E h F ′ ( X ) (cid:16)(cid:0) G ( t, z ) − G ( t, z ) (cid:1) k (cid:17)i = E h F ′ ( X ) (cid:16) z (cid:0) G ( t, z ) (cid:1) k − z (cid:0) G ( t, z ) (cid:1) k − z (cid:0) G ( t, z ) (cid:1) k + z ( G (cid:0) t, z ) (cid:1) k (cid:17)i + O ≺ (cid:0) N − / ǫ (cid:1) , (7.30)where we use that | z − | , | z − | = O ( N − / ǫ ), the local law in (3.10), and that the function F hasbounded derivatives.Using the definition of the Green function, the cumulant expansions, and the short hand ∆ f Im in(3.14), we obtain from the differentiation rules (3.21) and (3.26) that2 E h F ′ ( X ) (cid:0) ∆ f Im ( G ) k (cid:1)i = 1 N X a E h F ′ ( X )∆ f Im (cid:0) ( G ) k − (( HG ) aa − (cid:1)i + O ≺ ( N − / ǫ )= − E h F ′ ( X ) (cid:0) ∆ f Im ( G ) k − (cid:1)i + 1 N X a E h ∂F ′ ( X )∆ f Im (cid:0) ( G ) k − G la (cid:1) ∂h la i + O ≺ (cid:16) √ N (cid:17) = − E h F ′ ( X ) (cid:0) ∆ f Im ( 
G ) k − (cid:1)i − E h F ′ ( X )∆ f Im (cid:0) ( G ) k +1 (cid:1)i − k − N X a,j,l E h F ′ ( X )∆ f Im (cid:0) ( G ) n − G jl G aj G la (cid:1)i − N X a,l E h F ′′ ( X )(∆ f Im G al )∆ f Im (cid:0) G la ( G ) n − (cid:1)i + O ≺ (cid:16) √ N (cid:17) , (7.31) where the error is from the truncation of the cumulant expansions. The third and fourth term on theright side above are type-0 terms in T d with d ≥ E h F ′ ( X )∆ f Im (cid:16) ( G ) k +1 (cid:17)i = − E h F ′ ( X ) (cid:16) ∆ f Im ( G ) k − (cid:17)i − E h F ′ ( X ) (cid:16) ∆ f Im ( G ) k (cid:17)i + O ≺ (cid:0) N − / (cid:1) . (7.32)Iterating (7.32) several times till the power of G is reduced to one, we hence obtain (7.28).We next prove the second equality (7.29) in a similar way. Using the definition of the Green functionand the cumulant expansions, for any k , k ≥ 1, we obtain that2 E h F ′′ ( X ) (cid:16) ∆ f Im ( G ) k (cid:17)(cid:16) ∆ f Im ( G ) k (cid:17)i = − E h F ′′ ( X ) (cid:16) ∆ f Im ( G ) k (cid:17)(cid:16) ∆ f Im ( G ) k − (cid:17)i − E h F ′′ ( X ) (cid:16) ∆ f Im ( G ) k (cid:17)(cid:16) ∆ f Im ( G ) k +1 (cid:17)i + X T d ∈T d ; d ≥ T d + O ≺ (cid:0) √ N (cid:1) , where the summation on the right side above is a sum of finitely many type-0 terms in the form (7.11)of degrees at least two. Using the estimate for the type-0 terms in (7.24), we hence have E h F ′′ ( X ) (cid:16) ∆ f Im ( G ) k (cid:17)(cid:16) ∆ f Im ( G ) k +1 (cid:17)i = − E h F ′′ ( X ) (cid:16) ∆ f Im ( G ) k (cid:17)(cid:16) ∆ f Im ( G ) k − (cid:17)i − E h F ′′ ( X ) (cid:16) ∆ f Im ( G ) k (cid:17)(cid:16) ∆ f Im ( G ) k (cid:17)i + O ≺ (cid:0) N − / (cid:1) . (7.33)Iterating (7.33) several times till both the powers of G are reduced to one, we prove (7.29). (cid:3) Therefore, plugging the equalities (7.28) and (7.29) into (7.27), we obtain K , = 2 s E h F ′ ( X ) (cid:0) ∆ f Im G (cid:1)i − s E h F ′′ ( X ) (cid:0) ∆ f Im G (cid:1) i + O ≺ ( N − / + Ψ D ) , uniformly in t ≥ z , z ∈ S edge . 
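The iteration behind (7.28) can be checked on the error-free recursion. Writing $a_k$ for the expectation of $F'(X)$ times $\Delta\widetilde{\mathrm{Im}}$ applied to the $k$-th power of the normalized trace, the relation (7.31) reads $2a_k = -a_{k-1} - a_{k+1}$ up to negligible errors. The sketch below iterates $a_{k+1} = -2a_k - a_{k-1}$ and compares with the prefactor $(-1)^{k-1}k$. The starting value $a_0 = 0$ is an assumption of this sketch, motivated by the fact that $\Delta\widetilde{\mathrm{Im}}$ of a constant vanishes; the function name is hypothetical.

```python
def reduce_power(k_max):
    """Coefficients c_k with a_k = c_k * a_1 under a_{k+1} = -2 a_k - a_{k-1}."""
    c = [0, 1]  # c_0 = 0 (assumed: Delta-Im of a constant), c_1 = 1
    for k in range(1, k_max):
        c.append(-2 * c[k] - c[k - 1])
    return c

coeffs = reduce_power(10)
closed_form = [(-1) ** (k - 1) * k for k in range(1, 11)]
print(coeffs[1:])
print(closed_form)
```

The same computation follows from linearizing around the value $-1$ of the normalized trace at the edge: if the trace equals $-1 + \delta$, then its $k$-th power is $(-1)^k + (-1)^{k-1} k\,\delta + O(\delta^2)$, and the constant part is removed by $\Delta\widetilde{\mathrm{Im}}$.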
Using the definition of ∆ f Im in (3.14), the estimate (4.3) of E [Im G ( t, z )] for z ∈ S edge and t ≥ 0, the properties of stochastic domination Lemma 1.7 and that thefunction F has bounded derivatives, we conclude, for any fixed D ≥ 1, that | K , | = O ≺ ( N − / ǫ + Ψ D ) , (7.34)uniformly in t ∈ R + and z , z ∈ S edge .7.5. Estimate of E . In this subsection, we estimate the second order term E given in (3.20) similarlyas K , . Using (3.26) and (3.21), we write E as E = − N N X a =1 ( s (2) aa − E h F ′ ( X )∆ f Im ( G aa ) i − N N X a =1 ( s (2) aa − E h F ′′ ( X ) (cid:16) ∆ f Im ( G aa ) (cid:17) i . (7.35)Observe that the above two terms are both type-A terms in T A of the form (7.11), where the index b no longer plays a special role. Using the combination of the identity (5.10) and the cumulant expansionformula, we expand E into a sum of finitely many type-0 terms, similarly to (7.12). That is, E = − s − n E h F ′ ( X )∆ f Im ( G ) i + E h F ′′ ( X ) (cid:16) ∆ f Im ( G ) (cid:17) io + X T d ∈T d ≤ d In this section, we prove the Green function comparison theorem, Theorem 1.4, for real Wigner ma-trices, using similar ideas as for the complex Hermitian case. To simplify the discussion, we will onlyaddress the differences.Consider the real-valued matrix Ornstein-Uhlenbeck process (cid:0) h ab ( t ) (cid:1) Na,b =1 :d h ab = r δ ab N d β ab − h ab d t, h ab (0) = ( H N ) ab , (8.1)where (cid:0) β ab ( t ) (cid:1) a ≤ b are independent real standard Brownian motions with β ba ( t ) = β ab ( t ). The initialcondition H N is a real symmetric Wigner matrix satisfying Assumption 1.1. In distribution this isequivalent to writing H ( t ) = e − t H N + √ − e − t GOE N , t ∈ R + . (8.2)As the analogue of (3.21), we have a new differentiation rule for the Green function entry of a realsymmetric matrix, ∂G ij ∂h ab = − G ia G bj + G ib G aj δ ab . 
(8.3)Then using Ito’s formula similarly to (3.22), we obtaind G ij ( t, z ) = d M ij + Θ ij d t , (8.4)where d M ij := − √ N P a ≤ b √ δ ab (cid:16) G ia G bj + G ib G aj (cid:17) d β ab , andΘ ij := 12 X a,b h ab G ia G bj + 12 N X a,b (cid:16) G ia G ab G bj + G ib G bj G aa + G ia G aj G bb (cid:17) . Recall F in (2.21) and X in (3.12). Applying Ito’s formula on F ( X ) and using (8.4), we derive thedynamics of F ( X ) in the real symmetric case,d F ( X ) = d M + Θd t , where the diffusion term d M yields a martingale after integration, see Remark 3.2, and the drift term isgiven by (we omit the parameters t and 2 + x + i η of the following Green function entries)Θ = F ′ ( X )Im Z κ κ X i,a,b (cid:16) h ab G ia G bi + 2 N G ia G ab G bi + 1 N G ib G bi G aa + 1 N G ia G ai G bb (cid:17) d x + F ′′ ( X ) 1 N X i,j X a,b (cid:16) Im Z κ κ G ia G bi d x (cid:17)(cid:16) Im Z κ κ G jb G aj d x (cid:17) = 12 X a,b h ab (cid:16) F ′ ( X )∆Im G ba (cid:17) + 1 N X a,b (cid:16) F ′ ( X )∆Im ( G aa G bb ) (cid:17) + 1 N X a,b (cid:16) F ′ ( X )∆Im ( G ab ) ) (cid:17) + 1 N X a,b (cid:16) F ′′ ( X )(∆Im G ab )(∆Im G ba ) (cid:17) . (8.5)where we abbreviate, for any function P : R + × C \ R −→ C ,∆Im P ≡ ∆Im P ( t, z , z ) := Im P ( t, z ) − Im P ( t, z ) , (8.6) with t ∈ R + , z = 2 + κ + i η, z = 2 + κ + i η ∈ S edge , as in (3.15). In fact, comparing with the driftterm in (3.24) for complex Hermitian matrices, the notation f Im in (3.13) is replaced with the imaginarypart Im . This is because { h ab } commute with taking the imaginary part, and the Green function of areal symmetric matrix satisfies G ij ( z ) = G ji ( z ) . (8.7)Moreover, using (8.3), it is easy to find the analogous differentiation rule to (3.26), ∂F ′ ( X ) ∂h ab = − 21 + δ ab F ′′ ( X ) N X i =1 Im (cid:16) Z κ κ G ia G bi (2 + x + i η )d x (cid:17) = − 21 + δ ab F ′′ ( X )∆Im G ab , (8.8)with ∆Im given in (8.6).Next, we return to the right side of (8.5). 
Applying the real cumulant expansion formula in Lemma 2.6 for the independent entries $\{h_{ab}\}_{a \le b}$ in the first term up to the fourth order and using the differentiation rules (8.3) and (8.8), the second order terms in the cumulant expansions are canceled exactly by the last three terms on the right side of (8.5). We hence obtain the real analogue of (3.18),
\[
\mathbb{E}[\Theta] = \frac{1}{2N} \sum_{a=1}^{N} \big(s^{(2)}_{aa} - 2\big)\, \mathbb{E}\Big[\frac{\partial \big(F'(X)\, \Delta\mathrm{Im}\, G_{aa}\big)}{\partial h_{aa}}\Big]
+ \frac{1}{4N^{3/2}} \sum_{a,b} s^{(3)}_{ab}\, \mathbb{E}\Big[\frac{\partial^2 \big(F'(X)\, \Delta\mathrm{Im}\, G_{ba}\big)}{\partial h_{ab}^2}\Big]
+ \frac{1}{12N^{2}} \sum_{a,b} s^{(4)}_{ab}\, \mathbb{E}\Big[\frac{\partial^3 \big(F'(X)\, \Delta\mathrm{Im}\, G_{ba}\big)}{\partial h_{ab}^3}\Big]
+ O_\prec\Big(\frac{1}{\sqrt{N}}\Big), \tag{8.9}
\]
where the error $O_\prec(1/\sqrt{N})$ is from the truncation of the cumulant expansion, and $s^{(k)}_{ab}$ is the $k$-th cumulant, defined in (2.27), of the rescaled entries $\sqrt{N} h_{ab}$.

We now claim that Proposition 3.3 holds true in the real case, which leads to Theorem 1.4 for $\beta = 1$. The arguments in the complex case discussed before can be applied similarly, using the modified differentiation rules (8.3) and (8.8), and the real cumulant expansion formula in Lemma 2.6.

To simplify the statement, we only consider the simplest version of the Green function comparison theorem, for $F(x) = x$, as proved in Proposition 4.1 for complex Hermitian Wigner matrices. The Green function comparison theorem for general functions $F$ can be proved using the same idea, following the arguments in Section 7 for the complex Hermitian case.

Applying (8.4) to the time dependent normalized trace of the Green function, $m_N(t, z)$, we find the real analogue of (4.4), i.e.,
\[
\mathrm{d}\big(m_N(t, z)\big) = \mathrm{d}M_0 + \Theta_0\, \mathrm{d}t, \tag{8.10}
\]
with the diffusion term $\mathrm{d}M_0 := \frac{1}{N} \sum_{v=1}^{N} \mathrm{d}M_{vv}$, which yields a martingale term after integration (see Remark 3.2), and the drift term $\Theta_0\, \mathrm{d}t := \frac{1}{N} \sum_{v=1}^{N} \Theta_{vv}\, \mathrm{d}t$.
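The real symmetric differentiation rule (8.3), $\partial G_{ij}/\partial h_{ab} = -(G_{ia}G_{bj} + G_{ib}G_{aj})/(1+\delta_{ab})$, and the symmetry (8.7) used throughout this section can be sanity-checked by finite differences. The sketch below is illustrative only: the matrix size, the random seed, the spectral parameter and the helper names are assumptions, and the resolvent is computed by a plain Gauss-Jordan elimination to keep the example free of external libraries.

```python
import random

def resolvent(h, z):
    """Return G = (h - z)^{-1} via Gauss-Jordan elimination with partial pivoting."""
    n = len(h)
    # Augmented matrix [h - z | identity].
    m = [[complex(h[i][j]) - (z if i == j else 0) for j in range(n)]
         + [1 + 0j if i == j else 0j for j in range(n)]
         for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]

def perturbed(h, a, b, eps):
    """Shift the independent entry h_ab = h_ba of a symmetric matrix by eps."""
    g = [row[:] for row in h]
    g[a][b] += eps
    if a != b:
        g[b][a] += eps
    return g

random.seed(0)
n = 4
h = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i, n):
        h[i][j] = h[j][i] = random.gauss(0.0, 1.0) / n ** 0.5
z = 2.0 + 0.1j          # hypothetical spectral parameter near the edge
G = resolvent(h, z)
eps = 1e-6
max_err = 0.0
for a in range(n):
    for b in range(a, n):
        Gp = resolvent(perturbed(h, a, b, eps), z)
        Gm = resolvent(perturbed(h, a, b, -eps), z)
        delta = 1 if a == b else 0
        for i in range(n):
            for j in range(n):
                numeric = (Gp[i][j] - Gm[i][j]) / (2 * eps)   # central difference
                exact = -(G[i][a] * G[b][j] + G[i][b] * G[a][j]) / (1 + delta)
                max_err = max(max_err, abs(numeric - exact))
sym_err = max(abs(G[i][j] - G[j][i]) for i in range(n) for j in range(n))
print(max_err, sym_err)
```

The factor $1/(1+\delta_{ab})$ reflects that an off-diagonal entry $h_{ab}$ moves two matrix elements at once, whereas a diagonal entry moves only one.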
Applying the real cumulant expansion formula as in (8.9), the drift term satisfies the real analogue of (4.5), i.e.,
\[
\mathbb{E}[\Theta_0] = \frac{1}{2N^2} \sum_{v,a} \big(s^{(2)}_{aa} - 2\big)\, \mathbb{E}\Big[\frac{\partial (G_{va} G_{av})}{\partial h_{aa}}\Big]
+ \frac{1}{4N^{5/2}} \sum_{v,a,b} s^{(3)}_{ab}\, \mathbb{E}\Big[\frac{\partial^2 (G_{va} G_{bv})}{\partial h_{ab}^2}\Big]
+ \frac{1}{12N^{3}} \sum_{v,a,b} s^{(4)}_{ab}\, \mathbb{E}\Big[\frac{\partial^3 (G_{va} G_{bv})}{\partial h_{ab}^3}\Big]
+ O_\prec\Big(\frac{1}{\sqrt{N}}\Big)
=: J_2 + J_3 + J_4 + O_\prec\Big(\frac{1}{\sqrt{N}}\Big). \tag{8.11}
\]
It then suffices to prove the estimate (4.6) in the real symmetric case. Using (8.3), the terms $J_2, J_3, J_4$ above can be written out again in the form (4.8). The degree of a term of the form (4.8) is defined as in (4.9). We recall from (8.7) that the row and column index of a Green function entry can be switched. Following the idea from the complex Hermitian case, the proof of (4.6) consists of three steps: 1) the third order terms from $J_3$ are unmatched and thus negligible (analogous to Proposition 4.3); 2) expanding the fourth order terms from $J_4$ (as well as the second order terms in $J_2$) as linear combinations of type-0 terms of degrees at least two (analogous to Proposition 4.6); 3) estimating the resulting type-0 terms in 2) of degrees at least two (analogous to Lemma 4.8).

We start with the first step. Recall Definition 4.2 for unmatched terms in the complex Hermitian case. Because of (8.7), we can ignore the distinction between the row and column index of a Green function entry of a real symmetric matrix.

Definition 8.1 (Terms with unmatched indices in the real case). Given any term, denoted by $Q_d$, of the form (4.8) of degree $d$, let $\nu_j$ be the number of times the free summation index $v_j \in \mathcal{I}$ appears as the row or column index in the product of the Green function entries, i.e.,
\[
\nu_j := \#\{1 \le i \le n : x_i = v_j\} + \#\{1 \le i \le n : y_i = v_j\}, \qquad 1 \le j \le m. \tag{8.12}
\]
We define the set of unmatched summation indices as
\[
\mathcal{I}^o := \{1 \le j \le m : \nu_j \text{ is odd}\} \subset \mathcal{I}.
\]
Note that $\#\mathcal{I}^o$ is even, since $\sum_{j} \nu_j = 2n$. If $\mathcal{I}^o = \emptyset$, then we say $Q_d$ is matched. Otherwise, $Q_d$ is an unmatched term, denoted by $Q^o_d$.
The collection of the unmatched terms of the form (4.8) of degree $d$ is denoted by $\mathcal{Q}^o_d$.

Then the third order terms from $J_3$ on the right side of (8.11) are of the form (4.8) with an extra $\sqrt{N}$ in front and are unmatched, with $\nu_a = \nu_b = 3$ defined in (8.14) below. Following the arguments in Section 6, using the relation (5.10), the real cumulant expansion formula, and the new differentiation rule of the Green function entry (8.3), we observe a similar cancellation to the first order, and then expand an unmatched term of the form (4.8) iteratively and prove that Proposition 4.3 holds true in the real symmetric case. Therefore, we have
\[
|J_3| = O_\prec\big(N^{-1/2} + \sqrt{N}\,\Psi^D\big). \tag{8.13}
\]
Next, in the second step, we expand the remaining terms of the form (4.8) from $J_2$ and $J_4$ that are matched. Recall the special case of matched terms as in (4.17) with two summation indices $a, b$ singled out, and Definition 4.5 for type-AB, type-A, type-0 terms in the complex case.

Definition 8.2 (Type-AB terms, type-A terms, type-0 terms). Given any term of the form in (4.17) of degree $d$ with two special indices $a$ and $b$, recall $\nu_j$ in (8.12) for any $v_j \in \mathcal{I}$ and define similarly
\[
\nu_a := \#\{1 \le i \le n : x_i = a\} + \#\{1 \le i \le n : y_i = a\}, \qquad
\nu_b := \#\{1 \le i \le n : x_i = b\} + \#\{1 \le i \le n : y_i = b\}. \tag{8.14}
\]
If $\nu_j = 2$ for all $1 \le j \le m$ and $\nu_a = \nu_b = 4$, then such a term is a type-AB term. A type-A term has $\nu_a = 4$ and $\nu_b = \nu_j = 2$ $(1 \le j \le m)$. Finally, a type-0 term is defined to be of the form (4.17) with $\nu_a = \nu_b = \nu_j = 2$ $(1 \le j \le m)$. The collections of the type-AB, type-A, type-0 terms of degree $d$ are denoted by $\mathcal{P}^{AB}_d$, $\mathcal{P}^{A}_d$, $\mathcal{P}^{0}_d$, respectively.

Following the arguments in Section 5, using the relations (5.10) and (8.3), and the real cumulant expansion formula, we expand any type-AB (or type-A) term iteratively and prove that Proposition 4.6 holds true in the real symmetric case.
Therefore, expanding the type-AB terms from J and the type-Aterms from J and then combining with (8.13), we write (8.11) as E [Θ ( t, z )] = X P d ∈P d ≤ d ≤ D − E [ P d ( t, z )] + O ≺ (cid:0) √ N + Ψ D (cid:1) , (8.15)where the summation on the right side above denotes a linear combination of at most ( CD ) cD type-0terms, for some numerical constants C, c .In the last step, we aim to show that any type-0 term with degree d ≥ O ≺ ( N − / )for real symmetric Wigner matrices, as in Lemma 4.8. This reduces to prove Lemma 5.4 for the GOE. Lemma 8.3. For any z ∈ S edge ( ǫ, C ) given in (4.1) and t ≥ , we have the following uniform estimate: N E GOE h Im Tr G ( z ) i = O (cid:0) N − / ǫ (cid:1) . (8.16)The corresponding estimate (5.23) of the type-0 terms of degree d ≥ Proof of Lemma 8.3. The proof is similar to that of Lemma 5.4. For the one-point correlation functionof the GOE and the corresponding diagonal kernel K N, , we refer to [3] and [33]. From Chapter 3.9 in[3], we write K N, ( x, x ) = K N, ( x, x ) + √ N φ N − ( x ) (cid:16) Z ∞−∞ sgn( x − t ) φ N ( t )d t (cid:17) + 12 I N − φ N − ( x ) N =2 m +1 , (8.17)where K N, ( x, x ) is the one-point correlation function for the GUE given by (2.33), { φ k } are the Hermitefunctions in (2.30), and we use β = 1 , I m := Z ∞ φ m ( t )d t = 12 Z R φ m ( t )d t = 2 − / π / s (2 m )!2 m ( m !) ∼ m − / , (8.18)by the Stirling approximation; see Proposition 3.9.28 in [3]. In addition, from Lemma 1 in [25], we have I m +1 := Z ∞ φ m +1 ( t )d t = O ( m − / ) . (8.19)Note that the trace identity for the kernel K N, still holds as in (2.34). Next, we change the variable asin (2.37) and define K edge N, ( x, x ) := 1 N / K N, (cid:16) √ N + xN / , √ N + xN / (cid:17) . (8.20)From Theorem 1.1 in [13], as the real analogue of Theorem 2.8, for any L ∈ R , we have, in the limit oflarge N , that K edge N, ( x, x ) = K airy ( x, x ) + 12 Ai( x ) Z x −∞ Ai( t )d t + o (1) , (8.21)uniformly in x ∈ [ L , ∞ ). 
In addition, the right side of (8.21) is uniformly bounded for x > L ; seeChapter 3 in [3] for a reference. Now we are ready to estimate1 N E GOE h Im Tr G ( z ) i = N ηN E GOE h N X j =1 | λ j − z | i = N ηN Z R K edge N, ( x, x ) | x − N / κ − i N / η | d x , (8.22)for z = 2 + κ + i η ∈ S edge , in a similar way as in the proof of Lemma 5.4. Note that (5.44) and (5.45) stillhold true for the GOE. We will focus on the regime − N / < x ≤ L , for some fixed L < 0. Recallingthe estimate (5.50) for the GUE, it suffices to prove, for any x ∈ ( − N / , L ], that (cid:12)(cid:12)(cid:12) K edge N, ( x, x ) − K edge N, ( x, x ) (cid:12)(cid:12)(cid:12) = O (1) , (8.23)which then leads to 1 N / Z L − N / K edge N, ( x, x ) | x − N / κ + i N / η | d x = O (cid:16) N − ǫ η (cid:17) . (8.24)We hence obtain (8.16) for the GOE. In order to prove (8.23), we split into two cases below and followideas from [25]. Case 1: N is even. Let N = 2 m and the last term in (8.17) is vanishing. Since φ N is even, we write K edge N, ( x, x ) = K edge N, ( x, x ) + 12 N / φ N − ( y ) Z y φ N ( t )d t , (8.25)where we set for simplicity, y = 2 √ N + xN / , with − N / < x ≤ L , (8.26)which implies that √ N < y < √ N + L N − / . From [25] and references therein, we have the followingasymptotic formula of φ N ( t ). In the domain | t | ≤ √ (cid:16) (2 N + 1) / − (2 N + 1) − / (cid:17) , (8.27)we have as N → ∞ , φ N ( t ) = A N ( t ) + O (cid:16) N / (4 N + 2 − t ) − / (cid:17) , (8.28) with A N ( t ) := r π (4 N + 2 − t ) − / cos (cid:16) (2 N + 1)(2 α N − sin 2 α N ) − π (cid:17) , (8.29)and α N := arccos( t (4 N +2) − / ). We choose L < y of the integral in (8.25) satisfies (8.27). Thus we have from (8.28) that Z y φ N ( t )d t = Z y A N ( t )d t + O (cid:16) √ N Z y (4 N + 2 − t ) − / d t (cid:17) = Z y A N ( t )d t + O ( N − / ) . 
(8.30)Integrating A N given in (8.29) and using integration by parts, it was shown in (14) in [25] that (cid:12)(cid:12)(cid:12) Z y A N ( t )d t (cid:12)(cid:12)(cid:12) ≤ C (4 N + 2 − y ) − / = O ( N − / ) , (8.31)with √ N < y < √ N + L N − / . Thus we have from (8.30) that (cid:12)(cid:12)(cid:12) R y φ N ( t )d t (cid:12)(cid:12)(cid:12) = O ( N − / ) , for y givenin (8.26). Combining with (5.47), the estimate (8.23) then follows from (8.25). Case 2: N is odd. Let N = 2 m + 1. Since φ N is an odd function, we write K edge N, ( x, x ) = K edge N, ( x, x ) + 12 N / φ N − ( y ) Z y φ N ( t )d t − N / φ m ( y ) I m +1 + 12 N / I m φ m ( y ) , with y given in (8.26). Using (5.47), (8.18), and (8.19), the last two terms above are bounded by O (1).The second term can be estimated similarly as in the case N = 2 m . Thus (8.23) also hold true for N = 2 m + 1.We hence have finished the proof of Lemma 8.3. (cid:3) Appendix In this appendix we prove Lemma 2.4 and Lemma 2.5. To prove Lemma 2.4, we follow the argumentsin [19]. Proof of Lemma 2.4. Recall the mollifier θ η given in (2.18) and the indicator function χ E given in (2.17),where N − ≪ η ≪ E L − E ≤ CN − / ǫ , with ǫ > χ E ( H ) − Tr χ E ⋆ θ η ( H ) = Tr g ( H ) = N X j =1 g ( λ j ) , where g ( x ) := χ E ( x ) − χ E ⋆ θ η ( x ) = (cid:16) Z R [ E,E L ] − Z E L − xE − x (cid:17) θ η ( y )d y . (.32)We first consider the function g . Note that for any E > 0, we have cηE + η ≤ Z ∞ E θ η ( y )d y = 1 π Z ∞ E ηy + η d y ≤ CηE + η . Because of the symmetry of the integrand, we have a similar estimate for the integral over ( −∞ , E ] with E < 0. Thus, if x ∈ [ E, E L ], we have from (.32) that | g ( x ) | = (cid:16) Z E − x −∞ + Z ∞ E L − x (cid:17) θ η ( y )d y ≤ Cη (cid:16) | x − E | + η + 1 | x − E L | + η (cid:17) . 
Else, if x ∈ [ E, E L ] c , we have from the positiveness of θ η ( y ) that | g ( x ) | = Z E L − xE − x θ η ( y )d y ≤ ( Cη | x − E | + η , if x < E, Cη | x − E L | + η , if x > E L , (.33)It is easy to check that | g ( x ) | ≤ C, for x ∈ R . (.34)Now we choose a parameter l such that η ≪ l ≪ E L − E ≤ CN − / ǫ . If we further assumemin {| x − E | , | x − E L |} ≥ l , then we have | g ( x ) | ≤ Cηl , for | x − E | > l , | x − E L | < l . (.35) Plugging (.34) and (.35) into (.32), we hence obtain (cid:12)(cid:12)(cid:12) Tr χ E ( H ) − Tr χ E ⋆ θ η ( H ) (cid:12)(cid:12)(cid:12) ≤ C (cid:16) N ( E − l , E + l ) + N ( E L − l , ∞ ) + ηl N ( E, E L ) + Tr f ( H ) (cid:17) , where f ( x ) := (cid:0) χ E ⋆ θ η (cid:1) ( x ) x ≤ E − l . Using the rigidity of eigenvalues in (2.15), we obtain that (cid:12)(cid:12)(cid:12) Tr χ E ( H ) − Tr χ E ⋆ θ η ( H ) (cid:12)(cid:12)(cid:12) ≤ C (cid:16) N ( E − l , E + l ) + ηl N ǫ + Tr f ( H ) (cid:17) , (.36)with high probability, i.e., with probability bigger than 1 − N − Γ for any large Γ > 0, for N sufficientlylarge. It is then sufficient to estimate Tr f ( H ). We writeTr f ( H ) = X λ i ≤ E − l f ( λ i ) = ∞ X k =0 X λ i ∈I k f ( λ i ) , I k := ( E − k +1 l , E − k l ] . (.37)If x ≤ E − l , then E L − x ≥ E − x ≥ l ≫ η , and we have f ( x ) = Z E L − xE − x θ η ( y )d y = arctan (cid:16) E L − xη (cid:17) − arctan (cid:16) E − xη (cid:17) = arctan (cid:16) ηE − x (cid:17) − arctan (cid:16) ηE L − x (cid:17) ≤ Cη ( E L − E )( E L − x )( E − x ) ≤ C min n ( E L − E ) η ( E − x ) , ηE − x o . In combination with (.37), we haveTr f ( H ) ≤ C ∞ X k =0 min n ( E L − E ) η k l , η k l o N k , N k := { i : λ i ∈ I k } . (.38)We next estimate N k using the local law in (3.10). ConsiderIm m N ( E − · k l + i3 k l ) = 1 N N X i =1 k l | λ i − ( E − · k l ) | + (3 k l ) ≥ N N k · k l . 
(.39)Using the local law in (3.10) and (2.5), for any small τ > > 0, we find an upper bound forthe left hand side above asIm m N ( E − · k l + i3 k l ) ≤ Im m sc ( E − · k l + i3 k l ) + N ǫ + τ N k l ≤ C q k l + | E − · k l − | + N ǫ + τ N k l ≤ C (cid:16)p k l + N ǫ + τ N k l + N − / ǫ (cid:17) , with probability bigger than 1 − N − Γ . By choosing τ < ǫ , we hence obtain from (.39) that N k ≤ C (cid:16) (3 k l ) / N + N ǫ + 3 k l N / ǫ (cid:17) , with high probability. Combining with (.38), we haveTr f ( H ) ≤ C ∞ X k =0 min n ( E L − E ) η k l , η k l o(cid:16) (3 k l ) / N + N ǫ + 3 k l N / ǫ (cid:17) ≤ CN / ǫ η √ l + CN ǫ ηl ≤ C ′ N ǫ ηl , with high probability. Together with (.36), we hence obtain (cid:12)(cid:12)(cid:12) Tr χ E ( H ) − Tr χ E ⋆ θ η ( H ) (cid:12)(cid:12)(cid:12) ≤ C ′ (cid:16) N ( E − l , E + l ) + ηl N ǫ (cid:17) , with high probability. This completes the proof of Lemma 2.4. (cid:3) Next, we use Lemma 2.4 to prove Lemma 2.5. Proof of Lemma 2.5. Under the same assumption in Lemma 2.4, we choose a parameter l satisfying N − ≪ η ≪ l ≪ l ≪ E L − E ≤ CN − / ǫ . We have from Lemma 2.5 thatTr χ E ( H ) ≤ l − Z EE − l Tr χ y ( H )d y ≤ l − Z EE − l Tr χ y ⋆ θ η ( H )d y + Cl − Z EE − l (cid:16) N ( y − l , y + l ) + ηl N ǫ (cid:17) d y ≤ Tr χ E − l ⋆ θ η ( H ) + C (cid:16) N ǫ ηl + l l N ( E − l, E + l ) (cid:17) , (.40)with high probability. Using the rigidity result (2.13) and l ≪ N − / ǫ , we have N ( E − l, E + l ) ≤ Z E + lE − l N ρ sc ( x )d x + N ǫ ≤ CN ǫ , with high probability. Thus we obtain from (.40) that with high probabilityTr χ E ( H ) − Tr χ E − l ⋆ θ η ( H ) ≤ CN ǫ (cid:16) ηl + l l (cid:17) . One obtains a lower bound similarly. Therefore, for any large Γ > 0, we haveTr χ E + l ⋆ θ η ( H ) − CN ǫ (cid:16) ηl + l l (cid:17) ≤ Tr χ E ( H ) ≤ Tr χ E − l ⋆ θ η ( H ) + CN ǫ (cid:16) ηl + l l (cid:17) , with probability bigger than 1 − N − Γ . We pick l = N ǫ η and l = N ǫ l such that N ǫ (cid:16) ηl + l l (cid:17) = N − ǫ . 
Since the counting function $\mathcal{N}(E, E_L) = \mathrm{Tr}\,\chi_E(H)$ is integer valued, we have
\[
\mathbb{P}\big( \mathcal{N}(E, E_L) = 0 \big)
\le \mathbb{P}\Big( \mathrm{Tr}\,\chi_{E+l} \star \theta_\eta(H) \le 1/9 \Big) + N^{-\Gamma}
\le \mathbb{E}\Big[ F\big( \mathrm{Tr}\,\chi_{E+l} \star \theta_\eta(H) \big) \Big] + N^{-\Gamma},
\]
where $F$ is the cut-off function given in (2.21). In the other direction, we have
\[
\mathbb{E}\Big[ F\big( \mathrm{Tr}\,\chi_{E-l} \star \theta_\eta(H) \big) \Big]
\le \mathbb{P}\Big( \mathrm{Tr}\,\chi_{E-l} \star \theta_\eta(H) \le 2/9 \Big)
\le \mathbb{P}\big( \mathcal{N}(E, E_L) = 0 \big) + N^{-\Gamma}.
\]
Therefore, together with (2.15), we obtain
\[
\mathbb{E}\Big[ F\big( \mathrm{Tr}\,\chi_{E-l} \star \theta_\eta(H) \big) \Big] - N^{-\Gamma}
\le \mathbb{P}\big( \mathcal{N}(E, \infty) = 0 \big)
\le \mathbb{E}\Big[ F\big( \mathrm{Tr}\,\chi_{E+l} \star \theta_\eta(H) \big) \Big] + N^{-\Gamma}.
\]
This completes the proof of Lemma 2.5. $\square$