Spectral rigidity for addition of random matrices at the regular edge
Zhigang Bao ∗ HKUST [email protected]
László Erdős † IST Austria [email protected]
Kevin Schnelli ‡ KTH Royal Institute of Technology [email protected]
Abstract.
We consider the sum of two large Hermitian matrices A and B with a Haar unitary conjugation bringing them into a general relative position. We prove that the eigenvalue density on the scale slightly above the local eigenvalue spacing is asymptotically given by the free additive convolution of the laws of A and B as the dimension of the matrix increases. This implies optimal rigidity of the eigenvalues and optimal rate of convergence in Voiculescu's theorem. Our previous works [4, 5] established these results in the bulk spectrum; the current paper completely settles the problem at the spectral edges provided they have the typical square-root behavior. The key element of our proof is to compensate the deterioration of the stability of the subordination equations by sharp error estimates that properly account for the local density near the edge. Our results also hold if the Haar unitary matrix is replaced by the Haar orthogonal matrix.

Date: August 30, 2017
Keywords: Random matrices, local eigenvalue density, free convolution, spectral edge
AMS Subject Classification (2010): 46L54, 60B20

1. Introduction
The pioneering work of Voiculescu [27] identified the eigenvalue density of the sum of two Hermitian N × N matrices A and B in a general relative position as the free additive convolution of the eigenvalue densities µ_A and µ_B of A and B. The primary example for general relative position is asymptotic freeness that can be generated by conjugation via a Haar distributed unitary matrix. In fact, under some mild regularity conditions on µ_A and µ_B, local laws also hold, asserting that the empirical eigenvalue density of the sum converges on small scales as well. The optimal precision in such a local law pins down the location of individual eigenvalues with an error bar that is just slightly above the local eigenvalue spacing. With an optimal error term, it identifies the speed of convergence of order N^{−1+ǫ} in Voiculescu's limit theorem. After several gradual improvements on the precision in [19, 20, 3], the local law on the optimal scale N^{−1+ǫ} was established in [4] and the optimal convergence speed was obtained in [5]. All these results were, however, restricted to the regular bulk spectrum, i.e., to the spectral regime where the density of the free convolution is non-vanishing and bounded from above. In particular, the regime of the spectral edges was not covered. Under mild conditions on the limiting eigenvalue densities of A and B, the free convolution density always vanishes as the square-root function near the edges of its support. We call such edges regular. We remark that the regular edge is typical in many random matrix models; for instance, the semicircle law, i.e., the limiting density for Wigner matrices, has square-root edges. Near the edges the eigenvalues are sparser, hence they fluctuate more; naively, the extreme eigenvalues might be prone to very large fluctuations due to the room available to them on the opposite side of the support.
Nevertheless, for Wigner matrices and many related ensembles with independent or weakly dependent entries it has been shown that the eigenvalue fluctuation does not exceed its natural threshold, the local spacing, even at the edge; see e.g., [17, 21, 2] and references therein. In general, it implies a very strong concentration of the empirical measure. For the smallest and largest eigenvalues it means a fluctuation of order N^{−2/3}. In fact, the precise fluctuation is universal and it follows the Tracy–Widom distribution; see e.g., [25, 11, 22] for proofs in various models.
In this paper we present a comprehensive edge local law on the optimal scale and with optimal precision for the ensemble A + UBU* where U is Haar unitary. We assume that the laws of A and B are close to continuous limiting profiles µ_α and µ_β with a single interval support and power law behavior at the edge with exponent less than one. We prove that the free convolution µ_α ⊞ µ_β has a square root singularity at its edge and µ_A ⊞ µ_B closely trails this behavior. Furthermore, we establish that the eigenvalues of A + UBU* follow µ_A ⊞ µ_B down to the scale of the local spacing, uniformly throughout the spectrum. In particular, we show that the extreme eigenvalues are in the optimal N^{−2/3+ε} vicinity of the deterministic spectral edges. Previously, a similar result was only known with o(1) precision, see [14] for instance. We expect that the Tracy–Widom law holds at the regular edge of our additive model.
∗ Partially supported by Hong Kong RGC grant ECS 26301517.
† Partially supported by ERC Advanced Grant RANMAT No. 338804.
‡ Partially supported by the Göran Gustafsson Foundation.
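The convergence of the empirical measure to the free additive convolution is easy to probe numerically. The following minimal sketch (Python with NumPy; the size N = 1000, the two-atom choice µ_A = µ_B = ½(δ_{−1} + δ_{+1}), and all tolerances are our own illustrative choices, not taken from this paper) samples H = A + UBU* with a Haar unitary U and compares the empirical Stieltjes transform with that of the free additive convolution, which for this particular choice is the arcsine law on (−2, 2). Note that this two-atom choice does not satisfy the regularity assumptions imposed below; it is used here only because the limiting law is fully explicit.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000

# mu_A = mu_B = (delta_{-1} + delta_{+1})/2, encoded as diagonal matrices.
A = np.diag(np.repeat([-1.0, 1.0], N // 2))

# Haar unitary: QR of a complex Ginibre matrix with the usual phase fix
# (without the fix, plain QR is not exactly Haar distributed).
Z = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2.0)
Q, R = np.linalg.qr(Z)
d = np.diag(R)
U = Q * (d / np.abs(d))          # rescale columns by unit phases

H = A + U @ A @ U.conj().T
lam = np.linalg.eigvalsh(H)      # sorted eigenvalues of H = A + U B U*

# Empirical Stieltjes transform vs. that of mu_A boxplus mu_B, here the
# arcsine law on (-2, 2): m(z) = -1/sqrt(z^2 - 4), with the principal
# square root (which is the correct branch for the test point z = 2i).
z = 2j
m_H = np.mean(1.0 / (lam - z))
m_free = -1.0 / np.sqrt(z * z - 4.0)
print(abs(m_H - m_free))         # small already at this N

# The extreme eigenvalues sit close to the deterministic edges at -2 and 2.
print(lam[0], lam[-1])
```

For input measures with the regular (square-root) edges studied in this paper, the same experiment also exhibits the edge rigidity discussed above; here it only illustrates the global convergence of µ_H to µ_A ⊞ µ_B.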
Very recently, bulk universality has been demonstrated in [12]. Our analysis also implies an optimal rate of convergence for Voiculescu's global law for free convolution densities with the typical square root edges.
The result demonstrates that the Haar randomness in the additive model leads to a similarly strong concentration of the empirical density as already proved for the Wigner ensemble earlier. In fact, the additive model is only the simplest prototype of a large family of models involving polynomials of Haar unitaries and deterministic matrices; other examples include the ensemble in the single ring theorem [18, 6]. The technique developed in the current paper can potentially handle square root edges in more complicated ensembles where the main source of randomness is the Haar unitaries.
After the statement of the main result and the introduction of a few basic quantities, we show in Section 3 that µ_α ⊞ µ_β has, under suitable conditions, a square root singularity at the lowest edge and we establish stability properties of the subordination equations around that edge. In Section 4 we give an informal outline of the proof that explains the main difficulties stemming from the edge in contrast to the related analysis in the bulk. Here we highlight only the key point. A typical proof of a local law has two parts: (i) stability analysis of a deterministic (Dyson) equation for the limiting eigenvalue distribution, and (ii) proof that the empirical density approximately satisfies the Dyson equation, together with an estimate on the error. Given these two inputs, the local law follows by simply inverting the Dyson equation. For our model the Dyson equation is actually the pair of subordination equations that define the free convolution. Near the spectral edge, the subordination equations become unstable. A similar phenomenon is well known for the Dyson equation of Wigner type models, but it has not yet been analyzed for the subordination equations.
This instability can only be compensated by a very accurate estimate on the approximation error; a formidable task given the complexity of the analogous error estimates in the bulk [5]. Already the bulk analysis required carefully selected counter terms and weights in the fluctuation averaging mechanisms before recursive moment estimates could be started. All these ideas are used at the edge, even up to higher order, but they still fall short of the necessary precision. The key novelty is to identify a very specific linear combination of two basic fluctuating quantities with a fluctuation smaller than those of its constituents, indicating a very special strong correlation between them.

Notation:
The symbols O(·) and o(·) stand for the standard big-O and little-o notation. We use c and C to denote positive finite constants that do not depend on the matrix size N. Their values may change from line to line. We denote by M_N(C) the set of N × N matrices over C. For a vector v ∈ C^N, we use ‖v‖ to denote its Euclidean norm. For A ∈ M_N(C), we denote by ‖A‖ its operator norm and by ‖A‖₂ its Hilbert–Schmidt norm. We use tr A = N^{−1} Σ_i A_ii to denote the normalized trace of an N × N matrix A = (A_ij)_{i,j=1}^N.
Let g = (g_1, …, g_N) be a real or complex Gaussian vector. We write g ∼ N_R(0, σ² I_N) if g_1, …, g_N are independent and identically distributed (i.i.d.) N(0, σ²) normal variables; and we write g ∼ N_C(0, σ² I_N) if g_1, …, g_N are i.i.d. N_C(0, σ²) variables, where g_i ∼ N_C(0, σ²) means that Re g_i and Im g_i are independent N(0, σ²/2) normal variables.
For two possibly N-dependent numbers a, b ∈ C, we write a ∼ b if there is a (large) positive constant C such that C^{−1}|a| ≤ |b| ≤ C|a|. Finally, we use double brackets to denote index sets, i.e., for n₁, n₂ ∈ R, ⟦n₁, n₂⟧ := [n₁, n₂] ∩ Z.

2. Definition of the model and main results
2.1. Model and assumptions.
Let A ≡ A_N = diag(a_1, …, a_N) and B ≡ B_N = diag(b_1, …, b_N) be two deterministic real diagonal matrices in M_N(C). Let U ≡ U_N be a random unitary matrix which is Haar distributed on U(N), where U(N) is the N-dimensional unitary group. We study the following random Hermitian matrix
H ≡ H_N := A + UBU*. (2.1)
More specifically, we study the eigenvalues of H, denoted by λ_1 ≤ ⋯ ≤ λ_N. Throughout the paper, we are mainly working in the vicinity of the bottom of the spectrum. The discussion for the top of the spectrum is analogous. Let µ_A, µ_B and µ_H be the empirical eigenvalue distributions of A, B, and H, i.e.,
µ_A := (1/N) Σ_{i=1}^N δ_{a_i},  µ_B := (1/N) Σ_{i=1}^N δ_{b_i},  µ_H := (1/N) Σ_{i=1}^N δ_{λ_i}.
For any probability measure µ on the real line, its Stieltjes transform is defined as
m_µ(z) := ∫_R dµ(x)/(x − z),  z ∈ C⁺,
where z is called the spectral parameter. Throughout the paper, we write z = E + iη, i.e., E = Re z, η = Im z.
In this paper, we assume that there are two N-independent absolutely continuous probability measures µ_α and µ_β with continuous density functions ρ_α and ρ_β, respectively, such that the following assumptions, Assumptions 2.1 and 2.2, are satisfied. The first one discusses some qualitative properties of µ_α and µ_β, while the second one demands that µ_A and µ_B are close to µ_α and µ_β, respectively.

Assumption 2.1.
We assume the following:
(i) Both density functions ρ_α and ρ_β have single non-empty interval supports, [E_α^−, E_α^+] and [E_β^−, E_β^+], respectively, and ρ_α and ρ_β are strictly positive in the interior of their supports.
(ii) In a small δ-neighborhood of the lower edges of the supports, these measures have a power law behavior, namely, there is a (small) constant δ > 0 and exponents −1 < t_α^−, t_β^− < 1 such that
C^{−1} ≤ ρ_α(x)/(x − E_α^−)^{t_α^−} ≤ C,  for all x ∈ [E_α^−, E_α^− + δ],
C^{−1} ≤ ρ_β(x)/(x − E_β^−)^{t_β^−} ≤ C,  for all x ∈ [E_β^−, E_β^− + δ],
hold for some positive constant C > 0.
(iii) We assume that at least one of the following two bounds holds:
sup_{z∈C⁺} |m_{µ_α}(z)| ≤ C,  sup_{z∈C⁺} |m_{µ_β}(z)| ≤ C, (2.2)
for some positive constant C.

Assumption 2.2.
We assume the following:
(iv) For the Lévy distance d_L, we have that
d := d_L(µ_A, µ_α) + d_L(µ_B, µ_β) ≤ N^{−1+ǫ}, (2.3)
for any constant ǫ > 0, when N is sufficiently large.
(v) For the lower edges, we have
inf supp µ_A ≥ E_α^− − δ,  inf supp µ_B ≥ E_β^− − δ, (2.4)
for any constant δ > 0, when N is sufficiently large.
(vi) For the upper edges, we assume that there is a constant C such that
sup supp µ_A ≤ C,  sup supp µ_B ≤ C. (2.5)
A direct consequence of (v) and (vi) above is that there is a constant C′ such that ‖A‖, ‖B‖ ≤ C′.
Since [27], it is well known that µ_H can be weakly approximated by a deterministic probability measure, called the free additive convolution of µ_A and µ_B. Here we briefly introduce some notation concerning the free additive convolution, which will be necessary to state our main results.
For a probability measure µ on R, we denote by F_µ its negative reciprocal Stieltjes transform, i.e.,
F_µ(z) := −1/m_µ(z),  z ∈ C⁺. (2.6)
Note that F_µ : C⁺ → C⁺ is analytic and such that
lim_{η↗∞} F_µ(iη)/(iη) = 1. (2.7)
Conversely, if F : C⁺ → C⁺ is an analytic function with lim_{η↗∞} F(iη)/(iη) = 1, then F is the negative reciprocal Stieltjes transform of a probability measure µ, i.e., F(z) = F_µ(z), for all z ∈ C⁺; see e.g., [1]. The free additive convolution is the symmetric binary operation on Borel probability measures on R characterized by the following result.

Proposition 2.3 (Theorem 4.1 in [8], Theorem 2.1 in [13]). Given two Borel probability measures, µ₁ and µ₂, on R, there exist unique analytic functions, ω₁, ω₂ : C⁺ → C⁺, such that,
(i) for all z ∈ C⁺, Im ω₁(z), Im ω₂(z) ≥ Im z, and
lim_{η↗∞} ω₁(iη)/(iη) = lim_{η↗∞} ω₂(iη)/(iη) = 1; (2.8)
(ii) for all z ∈ C⁺,
F_{µ₁}(ω₂(z)) = F_{µ₂}(ω₁(z)),  ω₁(z) + ω₂(z) − z = F_{µ₁}(ω₂(z)).
(2.9)
The analytic function F : C⁺ → C⁺ defined by
F(z) := F_{µ₁}(ω₂(z)) = F_{µ₂}(ω₁(z)), (2.10)
is, in virtue of (2.8), the negative reciprocal Stieltjes transform of a probability measure µ, called the free additive convolution of µ₁ and µ₂, denoted by µ ≡ µ₁ ⊞ µ₂. The functions ω₁ and ω₂ are referred to as the subordination functions. The subordination phenomenon for the addition of freely independent non-commutative random variables was first noted by Voiculescu [28] in a generic situation and extended to full generality by Biane [10].
Choosing (µ₁, µ₂) = (µ_α, µ_β) in Proposition 2.3, we denote the associated subordination functions ω₁ and ω₂ by ω_α and ω_β, respectively. Analogously, for the choice (µ₁, µ₂) = (µ_A, µ_B), we denote by ω_A and ω_B the associated subordination functions. With the above notation, we obtain from (2.9) and (2.10) the following subordination equations:
m_{µ_A}(ω_B(z)) = m_{µ_B}(ω_A(z)) = m_{µ_A⊞µ_B}(z),
ω_A(z) + ω_B(z) − z = −1/m_{µ_A⊞µ_B}(z). (2.11)
The same system of equations holds if we replace the subscripts (A, B) by (α, β). We denote the lower and upper edges of the support of µ_α ⊞ µ_β by
E_− := inf supp µ_α ⊞ µ_β,  E_+ := sup supp µ_α ⊞ µ_β. (2.12)
In Section 3, we establish various qualitative properties of µ_α ⊞ µ_β and of µ_A ⊞ µ_B. In particular, under Assumption 2.1, we show that µ_α ⊞ µ_β has a square-root decay at the lower edge, see (3.62).

2.2. Main results.
To state our results, we introduce some more terminology. We denote the Green function or resolvent of H and its normalized trace by
G(z) ≡ G_H(z) := 1/(H − z),  m_H(z) := tr G(z) = (1/N) Σ_{i=1}^N G_ii(z),  z ∈ C⁺.
Observe that m_H(z) is also the Stieltjes transform of µ_H, i.e.,
m_H(z) = ∫_R dµ_H(x)/(x − z) = (1/N) Σ_{i=1}^N 1/(λ_i − z),  z ∈ C⁺.
We further set
K := ‖A‖ + ‖B‖ + 1. (2.13)
Moreover, for any spectral parameter z = E + iη ∈ C⁺, we let
κ ≡ κ(z) := min{|E − E_−|, |E − E_+|}, (2.14)
with E_± given in (2.12). We then introduce the following domain of the spectral parameter z: for any 0 < a ≤ b and 0 < τ < E_+ − E_−,
D_τ(a, b) := {z = E + iη ∈ C⁺ : −K ≤ E ≤ E_− + τ, a ≤ η ≤ b}. (2.15)
For any (small) positive constant γ > 0, we set η_m := N^{−1+γ}. Let η_M > 0 be a (large) constant. We will mostly consider spectral parameters z ∈ D_τ(η_m, η_M) with a sufficiently small constant τ > 0; in particular, we usually have η_m ≤ η ≤ η_M.
We also need the following definition on high-probability estimates from [16]. In Appendix A we collect some of its properties.

Definition 2.4.
Let X ≡ X^{(N)} and Y ≡ Y^{(N)} be two sequences of nonnegative random variables. We say that Y stochastically dominates X if, for all (small) ǫ > 0 and (large) D > 0,
P(X^{(N)} > N^ǫ Y^{(N)}) ≤ N^{−D}, (2.16)
for sufficiently large N ≥ N₀(ǫ, D), and we write X ≺ Y or X = O_≺(Y). When X^{(N)} and Y^{(N)} depend on a parameter v ∈ V (typically an index label or a spectral parameter), then X(v) ≺ Y(v), uniformly in v ∈ V, means that the threshold N₀(ǫ, D) can be chosen independently of v.
With these definitions and notations, we now state our main result.
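Before stating it, we record a toy illustration of Definition 2.4 (Python with NumPy; the choices X^{(N)} = |N^{−1} Σ_i σ_i| for i.i.d. random signs σ_i, Y^{(N)} = N^{−1/2}, and all parameters are our own, not taken from the paper). Here X ≺ Y, and already at moderate N an exceedance of N^ǫ Y is never observed in a Monte Carlo run:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stochastic domination: X = |mean of N independent signs| is
# stochastically dominated by Y = N^(-1/2), since the event
# {X > N^eps * Y} has probability smaller than any power of N.
N, trials, eps = 10_000, 300, 0.25
signs = rng.choice([-1.0, 1.0], size=(trials, N))
X = np.abs(signs.mean(axis=1))
threshold = N**eps * N**-0.5      # = N^(eps - 1/2), here a 10-sigma level
exceed_frac = np.mean(X > threshold)
print(exceed_frac)                # 0.0: the exceedance never occurs here
```

By Hoeffding's inequality the exceedance probability is at most exp(−N^{2ǫ}/2), which for the parameters above is astronomically small; this is the quantitative content behind the notation X ≺ Y.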
Theorem 2.5 (Local law at the regular edge). Suppose that Assumptions 2.1 and 2.2 hold. Let τ > 0 be a sufficiently small constant and fix any (small) constants γ > 0 and ε > 0. Let d_1, …, d_N ∈ C be any deterministic complex numbers satisfying max_{i∈⟦1,N⟧} |d_i| ≤ 1. Then
|(1/N) Σ_{i=1}^N d_i (G_ii(z) − 1/(a_i − ω_B(z)))| ≺ 1/(Nη) (2.17)
holds uniformly on D_τ(η_m, η_M) with η_m = N^{−1+γ} and any constant η_M > 0. In particular, choosing d_i = 1 for all i ∈ ⟦1, N⟧, we have the estimate
|m_H(z) − m_{µ_A⊞µ_B}(z)| ≺ 1/(Nη), (2.18)
uniformly on D_τ(η_m, η_M). Moreover, we have the improved estimate
|m_H(z) − m_{µ_A⊞µ_B}(z)| ≺ 1/(N(κ + η)), (2.19)
uniformly for all z = E + iη ∈ D_τ(0, η_M) with E ≤ E_− − N^{−2/3+ε}. Here, κ = |E − E_−| is given in (2.14).

Let γ_j be the j-th N-quantile of µ_α ⊞ µ_β, i.e., γ_j is the smallest real number such that
µ_α ⊞ µ_β((−∞, γ_j]) = j/N. (2.20)
Similarly, we define γ*_j to be the j-th N-quantile of µ_A ⊞ µ_B. The following theorem is on the rigidity property of the eigenvalues of H.

Theorem 2.6 (Rigidity at the lower edge). Suppose that Assumptions 2.1 and 2.2 hold. For any sufficiently small constant c > 0, we have that for all 1 ≤ i ≤ cN,
|λ_i − γ*_i| ≺ i^{−1/3} N^{−2/3}. (2.21)
In fact, the same estimate also holds if γ*_i is replaced with γ_i.

With the following additional assumptions on the upper edges of µ_α, µ_β and µ_A, µ_B, we can combine the current edge analysis with our strong local law in the bulk regime in [5]. This yields the rigidity result for the whole spectrum.

Assumption 2.7.
We assume the following:
(ii′) In a small δ-neighborhood of the upper edges of their supports, the measures µ_α and µ_β have a power law behavior, namely, there is a (large) constant C ≥ 1 and exponents −1 < t_α^+, t_β^+ < 1 such that
C^{−1} ≤ ρ_α(x)/(E_α^+ − x)^{t_α^+} ≤ C,  for all x ∈ [E_α^+ − δ, E_α^+],
C^{−1} ≤ ρ_β(x)/(E_β^+ − x)^{t_β^+} ≤ C,  for all x ∈ [E_β^+ − δ, E_β^+],
hold for some sufficiently small constant δ > 0.
(v′) For the upper edges of µ_A and µ_B, we have
sup supp µ_A ≤ E_α^+ + δ,  sup supp µ_B ≤ E_β^+ + δ,
for any constant δ > 0, when N is sufficiently large.
(vii) The density function of µ_α ⊞ µ_β has a single interval support, i.e.,
supp µ_α ⊞ µ_β = [E_−, E_+].

Corollary 2.8 (Rigidity for the whole spectrum). Suppose that Assumptions 2.1, 2.2 and 2.7 hold. Then we have, for all i ∈ ⟦1, N⟧, the estimate
|λ_i − γ*_i| ≺ min{i, N − i + 1}^{−1/3} N^{−2/3}. (2.22)
The same estimate also holds if γ*_i is replaced with γ_i. Moreover, we have the following estimate on the convergence rate of µ_H:
sup_{x∈R} |µ_H((−∞, x]) − µ_A ⊞ µ_B((−∞, x])| ≺ 1/N. (2.23)
We remark here that all of our results above also hold for the orthogonal setup, i.e., when U is a random orthogonal matrix Haar distributed on the orthogonal group O(N). The proof is nearly the same as in the unitary setup. A discussion of the necessary modifications for the block additive model in the bulk regime can be found in Appendix C of [6]. Here for our model, the modifications can be done in the same way. We omit the details.

3. Properties of the subordination functions at the regular edge
In this section, we collect some key properties of the subordination functions and related quantities that will often be used in Sections 5-9. We first introduce
S_AB ≡ S_AB(z) := (F′_A(ω_B(z)) − 1)(F′_B(ω_A(z)) − 1) − 1,
T_A ≡ T_A(z) := (1/2)(F″_A(ω_B(z))(F′_B(ω_A(z)) − 1)² + F″_B(ω_A(z))(F′_A(ω_B(z)) − 1)),
T_B ≡ T_B(z) := (1/2)(F″_B(ω_A(z))(F′_A(ω_B(z)) − 1)² + F″_A(ω_B(z))(F′_B(ω_A(z)) − 1)), (3.1)
where we use the shorthand notation F_A ≡ F_{µ_A} and F_B ≡ F_{µ_B} for the negative reciprocal Stieltjes transforms of µ_A and µ_B, and where ω_A and ω_B are the subordination functions associated through (2.9). The main result in this section is the following proposition on the domain D_τ(η_m, η_M); see (2.15).

Proposition 3.1.
Suppose that Assumptions 2.1 and 2.2 hold. Then, for a sufficiently small constant τ > 0, we have the following statements:
(i) There exist strictly positive constants k and K such that
min_i |a_i − ω_B(z)| ≥ k,  min_i |b_i − ω_A(z)| ≥ k, (3.2)
|ω_A(z)| ≤ K,  |ω_B(z)| ≤ K, (3.3)
hold uniformly on D_τ(η_m, η_M).
(ii) For the Stieltjes transform m_{µ_A⊞µ_B} of µ_A ⊞ µ_B, we have that
Im m_{µ_A⊞µ_B}(z) ∼ √(κ + η), if E ∈ supp µ_A ⊞ µ_B;  Im m_{µ_A⊞µ_B}(z) ∼ η/√(κ + η), if E ∉ supp µ_A ⊞ µ_B, (3.4)
uniformly for z = E + iη ∈ D_τ(η_m, η_M), with κ given in (2.14).
(iii) For S_AB, T_A and T_B defined in (3.1), we have
S_AB(z) ∼ √(κ + η),  |T_A(z)| ≤ C,  |T_B(z)| ≤ C, (3.5)
uniformly on z ∈ D_τ(η_m, η_M), for some constant C. In addition, for z = E + iη ∈ D_τ(η_m, η_M) with |E − E_−| ≤ δ and η ≤ δ for some sufficiently small constant δ > 0, we also have
|T_A(z)| ≥ c,  |T_B(z)| ≥ c, (3.6)
for some strictly positive constant c = c(δ).
(iv) For ω_A, ω_B and S_AB we have
|ω′_A(z)| ≤ C/√(κ + η),  |ω′_B(z)| ≤ C/√(κ + η),  |S′_AB(z)| ≤ C/√(κ + η), (3.7)
for any z ∈ D_τ(η_m, η_M), for some constant C.

The proof of Proposition 3.1 is split into two steps. In the first step, carried out in Subsection 3.1, we derive the analogous statements for the N-independent measures µ_α and µ_β. This step requires only Assumption 2.1. In the second step, carried out in Subsection 3.2, we show that the statements carry over to the N-dependent measures µ_A and µ_B under Assumption 2.2, for N sufficiently large.

3.1. Free convolution measure µ_α ⊞ µ_β. In this subsection, we derive some properties of the free additive convolution of µ_α and µ_β. We will always assume that µ_α and µ_β satisfy Assumption 2.1. From Assumption 2.1 (iii) and Lemma 4.1 in [28], we know that
sup_{z∈C⁺} |m_{µ_α⊞µ_β}(z)| ≤ C.
(3.8)
In addition, under Assumption 2.1, we see from Theorem 2.3 and Remark 2.4 in [7] that ω_α(z), ω_β(z) and m_{µ_α⊞µ_β}(z) can be extended continuously to C⁺ ∪ R. This together with (3.8) implies that µ_α ⊞ µ_β is absolutely continuous with a continuous and bounded density function.
Recall from Assumption 2.1 that supp µ_α = [E_α^−, E_α^+] and supp µ_β = [E_β^−, E_β^+]. We introduce the spectral domain E ⊂ C by setting
E := {z ∈ C⁺ ∪ R : E_α^− + E_β^− − 1 ≤ Re z ≤ E_α^+ + E_β^+ + 1, 0 ≤ Im z ≤ η_M}, (3.9)
where η_M > 0 is the constant introduced above. Note that supp µ_α ⊞ µ_β ⊂ E ∩ R.

Lemma 3.2.
There exists a constant C such that
sup_{z∈E} (|ω_α(z)| + |ω_β(z)|) ≤ C. (3.10)
Proof.
Let L > max{|E_α^+ + E_β^+ + 1|, |E_α^− + E_β^− − 1|} and M > 10 be large numbers to be chosen later. We will argue by contradiction. Assume first that there is z ∈ E such that
|ω_α(z)| > LM,  |ω_β(z)| > L. (3.11)
Then we have from (2.9) that
1/(ω_α(z) + ω_β(z) − z) = −∫_R dµ_α(x)/(x − ω_β(z)) = 1/ω_β(z) + O((ω_β(z))^{−2}), (3.12)
1/(ω_α(z) + ω_β(z) − z) = −∫_R dµ_β(x)/(x − ω_α(z)) = 1/ω_α(z) + O((ω_α(z))^{−2}), (3.13)
as L → ∞. Thus we get from (3.13), as z ∈ E is bounded, that in the same limit
ω_β(z)/ω_α(z) = O((ω_α(z))^{−1}). (3.14)
But then we have from (3.11) and (3.14) that
L/|ω_α(z)| ≤ |ω_β(z)|/|ω_α(z)| ≤ C/|ω_α(z)|, (3.15)
hence for L sufficiently large, we get a contradiction.
Next, assume that there is z ∈ E such that
|ω_α(z)| > LM,  |ω_β(z)| ≤ L. (3.16)
Then we conclude from (2.9) that
1/|m_{µ_α}(ω_β(z))| = |ω_α(z) + ω_β(z) − z| ≥ LM/2, (3.17)
for M sufficiently large, where we used that z ∈ E is bounded. On the other hand, the Stieltjes transform m_{µ_α} does not have any zeros in C⁺ ∪ R, as the support of µ_α is connected. Thus there is a constant c > 0, depending on L, such that |m_{µ_α}(z′)| ≥ c, for all z′ ∈ C⁺ with |z′| ≤ L. Hence, for M sufficiently large, we get a contradiction from (3.17).
Finally, as both (3.11) and (3.16) have been ruled out, we can conclude that
|ω_α(z)| ≤ LM,  |ω_β(z)| ≤ L, (3.18)
for all z ∈ E. This completes the proof of Lemma 3.2. □
Recall from (2.12) that E_− = inf supp µ_α ⊞ µ_β. Recall further that, for any spectral parameter z, κ = κ(z) defined in (2.14) is the distance of Re z to the endpoints of supp(µ_α ⊞ µ_β).

Lemma 3.3.
Let u ∈ R with u ≤ E_−. Then we have
Re ω_α(u) ≤ E_β^−,  Re ω_β(u) ≤ E_α^−. (3.19)
Moreover, Re ω_α and Re ω_β are monotone increasing on (−∞, E_−).
Proof. We argue by contradiction. Assume that there exists y′ with y′ ≤ E_− such that Re ω_α(y′) > E_β^−. Then either Re ω_α(y′) ∈ (E_β^−, E_β^+) or Re ω_α(y′) ≥ E_β^+. In the first case, using the imaginary part of the identity m_{µ_α⊞µ_β}(z) = m_{µ_β}(ω_α(z)), we conclude that Im m_{µ_α⊞µ_β}(y′) > 0, i.e., the density of µ_α ⊞ µ_β at y′ is strictly positive. This contradicts the definition of E_− (as the lowest endpoint of supp µ_α ⊞ µ_β). In the second case, Re ω_α(y′) ≥ E_β^+, we have
Re m_{µ_β}(ω_α(y′)) = ∫_{E_β^−}^{E_β^+} (x − Re ω_α(y′)) dµ_β(x)/|x − ω_α(y′)|² < 0. (3.20)
However, since Re m_{µ_β}(ω_α(y′)) = Re m_{µ_α⊞µ_β}(y′), we get a contradiction as
Re m_{µ_α⊞µ_β}(y′) = ∫_R dµ_α⊞µ_β(x)/(x − y′) > 0, (3.21)
by the definition of E_−. From the above, we get Re ω_α(y′) ≤ E_β^−. Repeating the argument for ω_β, we obtain (3.19).
Finally, that Re ω_α is increasing on (−∞, E_−) follows from the observation that Re m_{µ_α⊞µ_β} is increasing on (−∞, E_−), the subordination property m_{µ_α⊞µ_β}(z) = m_{µ_β}(ω_α(z)) and (3.20). The same argument shows that Re ω_β is increasing on (−∞, E_−). This finishes the proof of Lemma 3.3. □
We now show that we actually have Re ω_α(E_−) ≤ E_β^− − k and Re ω_β(E_−) ≤ E_α^− − k, for some constant k > 0. Our argument relies on the following computational lemma.
Lemma 3.4.
Let ω = λ + iν, with ν ≥ 0 and |ω| ≤ ϑ, for some small ϑ > 0. Let −1 < t < 1. Then,
∫_0^ϑ x^t dx/((x − λ)² + ν²) ∼ λ^t/ν, if λ > ν;  ∼ |ω|^{t−1} ∼ |λ|^{t−1}, if λ < 0, |λ| > ν;  ∼ ν^{t−1}, if ν > |λ|. (3.22)
Proof. Follows from elementary estimations. □
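The three regimes in (3.22) are easy to confirm numerically; the following sketch (Python with NumPy; the exponent t = 1/2, the cutoff ϑ = 1, the sample values of λ, ν, and the order-one brackets are our own illustrative choices, not from the paper) approximates the integral by a midpoint Riemann sum and compares it with the predicted size:

```python
import numpy as np

def integral(t, lam, nu, theta=1.0, n=2_000_000):
    # Midpoint Riemann sum for  int_0^theta x^t dx / ((x - lam)^2 + nu^2);
    # the grid spacing is far below nu, so the Lorentzian peak is resolved,
    # and the midpoint rule avoids the integrable singularity at x = 0.
    dx = theta / n
    x = (np.arange(n) + 0.5) * dx
    return np.sum(x**t / ((x - lam) ** 2 + nu**2)) * dx

t = 0.5
# Regime 1: 0 < nu < lam        -> comparable to lam^t / nu
# Regime 2: lam < 0, |lam| > nu -> comparable to |lam|^(t-1)
# Regime 3: nu > |lam|          -> comparable to nu^(t-1)
r1 = integral(t, 0.5, 0.01) / (0.5**t / 0.01)
r2 = integral(t, -0.5, 0.01) / 0.5 ** (t - 1.0)
r3 = integral(t, 0.0, 0.01) / 0.01 ** (t - 1.0)
print(r1, r2, r3)   # all three ratios are of order one
```

The ratios stay bounded above and below as λ and ν vary within each regime, which is exactly the content of the comparison relation ∼ defined in the Notation paragraph.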
Recall from (2.6) that F_µ(w) = −1/m_µ(w), w ∈ C⁺, denotes the negative reciprocal Stieltjes transform of any probability measure µ. As F_µ : C⁺ → C⁺ is analytic, and since µ is a probability measure, it admits the representation
F_µ(z) − z = Re F_µ(i) + ∫_R (1/(x − z) − x/(1 + x²)) dµ̂(x), (3.23)
for some finite Borel measure µ̂ on R. Note that µ̂ is in general not a probability measure. In particular, we have µ̂ ≡ 0 if and only if µ is supported at a single point. The following result about the support of the measure µ̂ associated with the measure µ is of relevance.

Lemma 3.5.
Let µ be a probability measure on R which is supported at more than one point, is of bounded support, and satisfies m_µ(x) ≠ 0, for all x ∈ R \ supp µ. Then we have that
supp µ = supp µ̂, (3.24)
where µ̂ is the finite Borel measure associated with µ through (3.23).
Proof. Given any probability measure ν on R, we first note that x ∈ R is in the support of ν if and only if its Stieltjes transform fails to be analytic in a neighborhood of x. For the measure µ from above, we have m_µ(x) ≠ 0 for all x ∈ R \ supp µ. Therefore, we know that x ∈ R is in the support of µ if and only if the negative reciprocal Stieltjes transform F_µ fails to be analytic in a neighborhood of x. Since µ is supported at more than one point, we have µ̂ ≠ 0 in (3.23). We then apply the same reasoning to conclude that x ∈ R is in the support of the measure µ̂ if and only if F_µ fails to be analytic in a neighborhood of x. Thus (3.24) directly follows. □

Lemma 3.6.
There is a constant k > 0 such that
Re ω_α(E_−) ≤ E_β^− − k,  Re ω_β(E_−) ≤ E_α^− − k. (3.25)
Moreover, there exists a constant C such that
Im ω_α(z) + Im ω_β(z) ≤ η + C Im m_{µ_α⊞µ_β}(z), (3.26)
for all z ∈ E. The constants k and C only depend on µ_α and µ_β.
Proof. Let z ∈ E. Taking the imaginary part in the subordination equations (2.9) we get
(Im ω_α(z) + Im ω_β(z) − Im z)/|ω_α(z) + ω_β(z) − z|² = Im m_{µ_α⊞µ_β}(z).
Thus we obtain
Im ω_α(z) + Im ω_β(z) = Im z + |ω_α(z) + ω_β(z) − z|² Im m_{µ_α⊞µ_β}(z) ≤ η + C Im m_{µ_α⊞µ_β}(z),
where we used Lemma 3.2 to get the inequality. This proves (3.26).
We move on to prove the estimates in (3.25). Using
Im m_{µ_α⊞µ_β}(z) = Im ω_α(z) ∫_R dµ_β(x)/|x − ω_α(z)|² = Im ω_β(z) ∫_R dµ_α(x)/|x − ω_β(z)|², (3.27)
and (2.9), we can write
(Im m_{µ_α⊞µ_β}(z)/Im z) [(∫_R dµ_α(x)/|x − ω_β(z)|²)^{−1} + (∫_R dµ_β(x)/|x − ω_α(z)|²)^{−1}] = 1 + (Im m_{µ_α⊞µ_β}(z)/Im z)·|m_{µ_α⊞µ_β}(z)|^{−2},
for all z ∈ E ∩ C⁺. Since Im m_{µ_α⊞µ_β}(z)/Im z > 0, for all z ∈ E ∩ C⁺, we obtain
(∫_R dµ_α(x)/|x − ω_β(z)|²)^{−1} + (∫_R dµ_β(x)/|x − ω_α(z)|²)^{−1} ≥ |m_{µ_α⊞µ_β}(z)|^{−2} = |ω_α(z) + ω_β(z) − z|², (3.28)
for all z ∈ E ∩ C⁺, and we can take the limit Im z ց 0 for z ∈ E.
Next, we introduce the quantities
d_α := |Re ω_α(E_−) − E_β^−|,  d_β := |Re ω_β(E_−) − E_α^−|. (3.29)
We now claim that d_α ≥ k₀ and d_β ≥ k₀, for some constant k₀ > 0. Without loss of generality, we may assume that d_β ≥ d_α. We then proceed by distinguishing two cases. First assume that
d_α ≤ ǫk,  d_β > k, (3.30)
for some small constants k > ǫ > 0. Using Lemma 3.4 with a small ϑ > 0, we obtain from Assumption 2.1 that
∫_{E_β^−}^{E_β^−+ϑ} dµ_β(x)/|x − ω_α(z)|² ∼ (Re ω_α(z) − E_β^−)^{t_β^−}/Im ω_α(z), if Re ω_α(z) − E_β^− ≥ Im ω_α(z);  ∼ |Re ω_α(z) − E_β^−|^{t_β^−−1}, if Re ω_α(z) − E_β^− ≤ −Im ω_α(z);  ∼ (Im ω_α(z))^{t_β^−−1}, if Im ω_α(z) > |Re ω_α(z) − E_β^−|, (3.31)
uniformly on the domain E, where we have −1 < t_β^− < 1. (In the limit Im z → 0, the integral may be divergent, but this does not affect the following argument.) Fixing a small δ > 0 and choosing z = E_− − δ, we obtain from all three cases in (3.31) that
(∫_{E_β^−}^{E_β^+} dµ_β(x)/|x − ω_α(E_− − δ)|²)^{−1} ≤ c |Re ω_α(E_− − δ) − E_β^−|^{1−t_β^−} ≤ c (d_α)^{1−t_β^−}, (3.32)
where we used that Re ω_α(E_− − δ) − E_β^− is a non-positive increasing function as δ decreases, by Lemma 3.3. In particular we can take the limit δ ց 0. Since d_α ≤ ǫk and d_β > k, we have from (3.28) and (3.32) that
(∫_R dµ_α(x)/|x − ω_β(E_− − δ)|²)^{−1} + c(ǫk)^{1−t_β^−} ≥ |m_{µ_α⊞µ_β}(E_− − δ)|^{−2}, (3.33)
which implies
1 ≥ ∫_R dµ_α(x)/|x − ω_β(E_− − δ)|² (|m_{µ_α⊞µ_β}(E_− − δ)|^{−2} − c(ǫk)^{1−t_β^−}) (3.34)
= (∫_R dµ_α(x)/|x − ω_β(E_− − δ)|²)/|∫_R dµ_α(x)/(x − ω_β(E_− − δ))|² − c(ǫk)^{1−t_β^−} ∫_R dµ_α(x)/|x − ω_β(E_− − δ)|², (3.35)
where we used (2.9) to get the equality. As we are currently assuming that d_β > k, we have
c(ǫk)^{1−t_β^−} ∫_R dµ_α(x)/|x − ω_β(E_− − δ)|² ≤ c(ǫk)^{1−t_β^−} d_β^{−2} ≤ c ǫ^{1−t_β^−} k^{−1−t_β^−}, (3.36)
where we used that Re ω_β(E_− − δ) − E_α^− is a non-positive increasing function as δ decreases.
Next, as we assume that µ_α is not a single point mass, we have by the Cauchy–Schwarz inequality
(∫_R dµ_α(x)/|x − ω_β(E_− − δ)|²)/|∫_R dµ_α(x)/(x − ω_β(E_− − δ))|² ≥ 1 + C_S, (3.37)
for some constant C_S > 0, uniformly for, say, all 0 ≤ δ ≤ 1/2. Taking the limit δ ց 0, we conclude from (3.35), (3.36) and (3.37) that
1 ≥ 1 + C_S − c ǫ^{1−t_β^−} k^{−1−t_β^−}. (3.38)
We therefore get, for ǫ < (C_S k^{1+t_β^−}/c)^{1/(1−t_β^−)}, for any k > 0, a contradiction. Here we used that t_β^− < 1, so that the condition on ǫ is satisfied whenever ǫ is sufficiently small depending on k.
Assume next that
d_α ≤ ǫk,  d_β ≤ k. (3.39)
Following the lines from (3.31) to (3.32) with α and β interchanged, we find that for any small δ > 0,
(∫_{E_α^−}^{E_α^+} dµ_α(x)/|x − ω_β(E_− − δ)|²)^{−1} ≤ c |Re ω_β(E_− − δ) − E_α^−|^{1−t_α^−} ≤ c (d_β)^{1−t_α^−}. (3.40)
Hence, together with (3.32), we get from (3.28) that
c(ǫk)^{1−t_β^−} + c k^{1−t_α^−} ≥ |m_{µ_α⊞µ_β}(E_− − δ)|^{−2}. (3.41)
As m_{µ_α⊞µ_β}(E_− − δ) is increasing as δ decreases, we can take the limit δ ց 0. Thus
|m_{µ_α⊞µ_β}(E_−)|^{−2} ≤ c(ǫk)^{1−t_β^−} + c k^{1−t_α^−}. (3.42)
By (3.8), the left side is bounded below by a positive constant. Hence, since t_α^− < 1 and t_β^− < 1, we get a contradiction by choosing k > 0 and ǫ > 0 sufficiently small. Thus we are left with the case
d_α > ǫk,  d_β > k, (3.43)
for ǫ > 0 and k > 0 small enough. This proves the claim with k₀ := ǫk and concludes the proof of Lemma 3.6. □

Lemma 3.7.
The lowest endpoint $E_-$ of $\operatorname{supp}\mu_\alpha\boxplus\mu_\beta$ is the smallest real solution to the equation
$$\big(F_{\mu_\alpha}'(\omega_\beta(z))-1\big)\big(F_{\mu_\beta}'(\omega_\alpha(z))-1\big)=1\,,\qquad z\in\mathbb{R}\,. \tag{3.44}$$
Moreover, there are constants $\kappa_0>0$ and $\eta_0>0$ such that
$$\operatorname{Im}m_{\mu_\alpha\boxplus\mu_\beta}(z)\sim\operatorname{Im}\omega_\alpha(z)\sim\operatorname{Im}\omega_\beta(z)\sim\begin{cases}\sqrt{\kappa+\eta}\,, & \text{if } E\ge E_-\,,\\[1mm] \dfrac{\eta}{\sqrt{\kappa+\eta}}\,, & \text{if } E<E_-\,,\end{cases} \tag{3.45}$$
uniformly for all $z=E+\mathrm{i}\eta\in\mathcal{E}$, where
$$\mathcal{E}:=\big\{z\in\mathcal{E}_\kappa\,:\,-\kappa_0\le\operatorname{Re}z-E_-\le\kappa_0\,,\ 0\le\operatorname{Im}z\le\eta_0\big\}\,. \tag{3.46}$$

Proof of Lemma 3.7.
From Lemma 3.6 we know that $\operatorname{Re}\omega_\alpha(E_-)\le E_\beta^--k$ and $\operatorname{Re}\omega_\beta(E_-)\le E_\alpha^--k$, for some $k>0$. From the subordination equations (2.9) and (3.23), we have that
$$F_{\mu_\alpha\boxplus\mu_\beta}(z)=F_{\mu_\alpha}(\omega_\beta(z))=\operatorname{Re}F_{\mu_\alpha}(\mathrm{i})+\omega_\beta(z)+\int_{\mathbb{R}}\Big(\frac{1}{x-\omega_\beta(z)}-\frac{x}{1+x^2}\Big)\,\mathrm{d}\widehat{\mu}_\alpha(x)\,, \tag{3.47}$$
for some Borel measure $\widehat{\mu}_\alpha$ on $\mathbb{R}$ with, according to Lemma 3.5, $\operatorname{supp}\widehat{\mu}_\alpha=\operatorname{supp}\mu_\alpha$. Arguing as in the proof of Lemma 3.5, we notice that $u\in\mathbb{R}$ is an edge of the measure $\mu_\alpha\boxplus\mu_\beta$ if $m_{\mu_\alpha\boxplus\mu_\beta}$ fails to be analytic at $u\in\mathbb{R}$ and $\operatorname{Im}m_{\mu_\alpha\boxplus\mu_\beta}(u)=0$. Analyticity breaks down if either $F_{\mu_\alpha\boxplus\mu_\beta}(u)=0$ or, according to (3.47), if $\omega_\beta(u)\in\operatorname{supp}\widehat{\mu}_\alpha=\operatorname{supp}\mu_\alpha$, or if $\omega_\beta$ fails to be analytic at $u$. For the lowest edge at $u=E_-$, we can exclude $F_{\mu_\alpha\boxplus\mu_\beta}(u)=0$ by (3.8), and also $\omega_\beta(u)\in\operatorname{supp}\mu_\alpha$ since $\operatorname{Re}\omega_\beta(E_-)\le E_\alpha^--k$, $k>0$. Thus $E_-\in\mathbb{R}$ is the smallest point where $\omega_\beta$ is not analytic.

We next claim that $\omega_\beta$ is not analytic at $u\in\mathbb{R}$ if $\big(F_{\mu_\alpha}'(\omega_\beta(u))-1\big)\big(F_{\mu_\beta}'(\omega_\alpha(u))-1\big)=1$. We argue as follows. From (3.23) we know that there is a Borel measure $\widehat{\mu}_\beta$ such that
$$F_{\mu_\beta}(\omega)=\operatorname{Re}F_{\mu_\beta}(\mathrm{i})+\omega+\int_{\mathbb{R}}\Big(\frac{1}{x-\omega}-\frac{x}{1+x^2}\Big)\,\mathrm{d}\widehat{\mu}_\beta(x)\,, \tag{3.48}$$
and $F_{\mu_\beta}$ is analytic in a disk of radius $K$ centered at $\omega=\omega_\alpha(E_-)$ by (3.25). Here we also used that $\operatorname{supp}\widehat{\mu}_\beta=\operatorname{supp}\mu_\beta$ by Lemma 3.5. It follows that
$$F_{\mu_\beta}'(\omega)=1+\int_{\mathbb{R}}\frac{\mathrm{d}\widehat{\mu}_\beta(x)}{(x-\omega)^2}\,, \tag{3.49}$$
and in particular that $F_{\mu_\beta}'(\omega_\alpha(E_-))>1$, since $\omega_\alpha(E_-)$ is real valued, $E_-$ being defined as the lower endpoint of the support of $\mu_\alpha\boxplus\mu_\beta$. By the analytic inverse function theorem, the functional inverse $F_{\mu_\beta}^{(-1)}$ of $F_{\mu_\beta}$ is analytic in a neighborhood of $F_{\mu_\beta}(\omega_\alpha(E_-))$. Thus the function
$$\widetilde{z}(\omega):=-F_{\mu_\alpha}(\omega)+\omega+F_{\mu_\beta}^{(-1)}\circ F_{\mu_\alpha}(\omega) \tag{3.50}$$
is well-defined and analytic in a neighborhood of $\omega_\alpha(E_-)$. It follows from (2.9) that $\omega_\beta(z)$ is a solution $\omega=\omega_\beta(z)$ to the equation $z=\widetilde{z}(\omega)$ (with $\operatorname{Im}\omega_\beta(z)\ge\operatorname{Im}z$). Moreover, we have $\omega_\alpha(z)=F_{\mu_\beta}^{(-1)}\circ F_{\mu_\alpha}(\omega_\beta(z))$. The function $\widetilde{z}(\omega)$ admits the following Taylor expansion in a neighborhood of $\omega_\beta(E_-)$,
$$\widetilde{z}(\omega)=E_-+\widetilde{z}\,'(\omega_\beta(E_-))(\omega-\omega_\beta(E_-))+\tfrac12\,\widetilde{z}\,''(\omega_\beta(E_-))(\omega-\omega_\beta(E_-))^2+O\big((\omega-\omega_\beta(E_-))^3\big)\,. \tag{3.51}$$
In particular, $\widetilde{z}(\omega)$ admits an inverse around $z=E_-$ that is locally analytic if and only if $\widetilde{z}\,'(\omega_\beta(E_-))\ne 0$. Thus the smallest edge $E_-$ of the support of $\mu_\alpha\boxplus\mu_\beta$ is the smallest $u\in\mathbb{R}$ such that $\widetilde{z}\,'(\omega_\beta(u))=0$. To find the location of the edge, we compute
$$\widetilde{z}\,'(\omega)=-F_{\mu_\alpha}'(\omega)+1+\frac{F_{\mu_\alpha}'(\omega)}{F_{\mu_\beta}'\circ F_{\mu_\beta}^{(-1)}\circ F_{\mu_\alpha}(\omega)}\,. \tag{3.52}$$
Hence, choosing $\omega=\omega_\beta(z)$, we get
$$\widetilde{z}\,'(\omega_\beta(z))=-F_{\mu_\alpha}'(\omega_\beta(z))+1+\frac{F_{\mu_\alpha}'(\omega_\beta(z))}{F_{\mu_\beta}'(\omega_\alpha(z))}\,, \tag{3.53}$$
and thence, from $\widetilde{z}\,'(\omega_\beta(E_-))=0$ we have
$$\big(F_{\mu_\alpha}'(\omega_\beta(E_-))-1\big)\big(F_{\mu_\beta}'(\omega_\alpha(E_-))-1\big)=1\,. \tag{3.54}$$
This proves (3.44). We move on to proving (3.45). From (3.50) we compute
$$\widetilde{z}\,''(\omega)=-F_{\mu_\alpha}''(\omega)+\frac{F_{\mu_\alpha}''(\omega)}{F_{\mu_\beta}'\circ F_{\mu_\beta}^{(-1)}\circ F_{\mu_\alpha}(\omega)}-\frac{F_{\mu_\beta}''\circ F_{\mu_\beta}^{(-1)}\circ F_{\mu_\alpha}(\omega)}{\big(F_{\mu_\beta}'\circ F_{\mu_\beta}^{(-1)}\circ F_{\mu_\alpha}(\omega)\big)^3}\,\big(F_{\mu_\alpha}'(\omega)\big)^2\,,$$
and thus by choosing $\omega=\omega_\beta(z)$, we get
$$\widetilde{z}\,''(\omega_\beta(z))=-F_{\mu_\alpha}''(\omega_\beta(z))+\frac{F_{\mu_\alpha}''(\omega_\beta(z))}{F_{\mu_\beta}'(\omega_\alpha(z))}-\frac{F_{\mu_\beta}''(\omega_\alpha(z))}{\big(F_{\mu_\beta}'(\omega_\alpha(z))\big)^3}\,\big(F_{\mu_\alpha}'(\omega_\beta(z))\big)^2\,.$$
This we can rewrite as
$$\widetilde{z}\,''(\omega_\beta(z))=\frac{F_{\mu_\alpha}''(\omega_\beta(z))}{F_{\mu_\beta}'(\omega_\alpha(z))}\big(1-F_{\mu_\beta}'(\omega_\alpha(z))\big)-\frac{F_{\mu_\beta}''(\omega_\alpha(z))}{\big(F_{\mu_\beta}'(\omega_\alpha(z))\big)^3}\,\big(F_{\mu_\alpha}'(\omega_\beta(z))\big)^2\,. \tag{3.55}$$
Thus choosing $z=E_-$ and recalling (3.53) and (3.54), we get
$$\widetilde{z}\,''(\omega_\beta(E_-))=\frac{F_{\mu_\alpha}''(\omega_\beta(E_-))}{F_{\mu_\beta}'(\omega_\alpha(E_-))}\big(1-F_{\mu_\beta}'(\omega_\alpha(E_-))\big)-\frac{F_{\mu_\beta}''(\omega_\alpha(E_-))}{\big(F_{\mu_\beta}'(\omega_\alpha(E_-))\big)^3}\,\big(F_{\mu_\alpha}'(\omega_\beta(E_-))\big)^2\,. \tag{3.56}$$
From (3.49), we directly get
$$F_{\mu_\beta}'(\omega_\alpha(E_-))=1+\int_{\mathbb{R}}\frac{\mathrm{d}\widehat{\mu}_\beta(x)}{(x-\omega_\alpha(E_-))^2}\,,\qquad F_{\mu_\alpha}'(\omega_\beta(E_-))=1+\int_{\mathbb{R}}\frac{\mathrm{d}\widehat{\mu}_\alpha(x)}{(x-\omega_\beta(E_-))^2}\,, \tag{3.57}$$
as well as
$$F_{\mu_\beta}''(\omega_\alpha(E_-))=2\int_{\mathbb{R}}\frac{\mathrm{d}\widehat{\mu}_\beta(x)}{(x-\omega_\alpha(E_-))^3}\,,\qquad F_{\mu_\alpha}''(\omega_\beta(E_-))=2\int_{\mathbb{R}}\frac{\mathrm{d}\widehat{\mu}_\alpha(x)}{(x-\omega_\beta(E_-))^3}\,. \tag{3.58}$$
Recalling that $\omega_\alpha(E_-)\le E_\beta^--K$, $\omega_\beta(E_-)\le E_\alpha^--K$ and that $\widehat{\mu}_\alpha\ne 0$, $\widehat{\mu}_\beta\ne 0$ (as $\mu_\alpha$ and $\mu_\beta$ are not single point masses), we infer from (3.57) and (3.58) that there are constants $c>0$ and $C<\infty$ such that
$$c\le\big|\widetilde{z}\,''(\omega_\beta(E_-))\big|\le C\,. \tag{3.59}$$
Choosing $\omega=\omega_\beta(z)$ (thus $\widetilde{z}(\omega_\beta(z))=z$) and using $\widetilde{z}\,'(\omega_\beta(E_-))=0$, $\widetilde{z}\,''(\omega_\beta(E_-))\ne 0$ in (3.51), we get
$$\omega_\beta(z)-\omega_\beta(E_-)=\sqrt{\frac{2}{\widetilde{z}\,''(\omega_\beta(E_-))}}\,\sqrt{E_--z}+O(|z-E_-|)\,, \tag{3.60}$$
for $z$ in a neighborhood of $E_-$. The branches of the square roots are chosen such that $\operatorname{Im}\omega_\beta(z)>0$ for $z\in\mathbb{C}^+$. Next, setting $z=E+\mathrm{i}\eta$, we observe that (3.59) and (3.60) imply, for $z$ near $E_-$, that
$$\operatorname{Im}\omega_\beta(z)\sim\begin{cases}\sqrt{\kappa+\eta}\,, & \text{if } E\ge E_-\,,\\[1mm] \dfrac{\eta}{\sqrt{\kappa+\eta}}\,, & \text{if } E<E_-\,.\end{cases}$$
(3.61)

This proves the third estimate in (3.45). The second estimate is obtained in the same way by interchanging the rôles of the indices $\alpha$ and $\beta$. Finally, the first estimate follows from (3.27) and the fact that $\omega_\alpha(z)$ and $\omega_\beta(z)$, $z\in\mathcal{E}$, stay away from the supports of the measures $\mu_\beta$ and $\mu_\alpha$, respectively, by (3.25) and (3.60). This shows (3.45) and concludes the proof of Lemma 3.7. □

Remark 3.8. From (3.60) and $m_{\mu_\alpha\boxplus\mu_\beta}(z)=m_{\mu_\alpha}(\omega_\beta(z))$ we get the precise behavior of $m_{\mu_\alpha\boxplus\mu_\beta}(z)$ on $\mathcal{E}$,
$$m_{\mu_\alpha\boxplus\mu_\beta}(z)-m_{\mu_\alpha\boxplus\mu_\beta}(E_-)=m_{\mu_\alpha}'(\omega_\beta(E_-))\sqrt{\frac{2}{\widetilde{z}\,''(\omega_\beta(E_-))}}\,\sqrt{E_--z}+O(|z-E_-|)\,,$$
and thus by the Stieltjes inversion formula we have the square root behavior for the density of $\mu_\alpha\boxplus\mu_\beta$,
$$\mathrm{d}\mu_\alpha\boxplus\mu_\beta(x)\sim\sqrt{x-E_-}\,\mathrm{d}x\,,\qquad \forall\,x\in[E_-,E_-+\kappa_0]\,. \tag{3.62}$$

Corollary 3.9.
Let $\mathcal{E}$ be as in (3.46). Then the following behaviors hold uniformly for $z\in\mathcal{E}$,
$$m_{\mu_\alpha\boxplus\mu_\beta}'(z)\sim\frac{1}{\sqrt{|z-E_-|}}\,,\qquad m_{\mu_\alpha\boxplus\mu_\beta}''(z)\sim\frac{1}{|z-E_-|^{3/2}}\,, \tag{3.63}$$
$$|\omega_\alpha'(z)|\sim\frac{1}{\sqrt{|z-E_-|}}\,,\qquad |\omega_\alpha''(z)|\sim\frac{1}{|z-E_-|^{3/2}}\,, \tag{3.64}$$
and
$$F_{\mu_\alpha}'(\omega_\beta(z))\sim 1\,,\qquad F_{\mu_\alpha}''(\omega_\beta(z))\sim 1\,,\qquad F_{\mu_\alpha}'''(\omega_\beta(z))\sim 1\,. \tag{3.65}$$
The same estimates hold true when the rôles of the subscripts $\alpha$ and $\beta$ are interchanged.

Proof. Having established (3.45) for the behavior of $\omega_\alpha$ and $\omega_\beta$ around the smallest edge $E_-$, the behaviors in (3.63) follow directly. Using the subordination equations (2.9), we note that $F_{\mu_\alpha}'(\omega_\beta(z))\,\omega_\beta'(z)=F_{\mu_\beta}'(\omega_\alpha(z))\,\omega_\alpha'(z)=-m_{\mu_\alpha\boxplus\mu_\beta}'(z)/\big(m_{\mu_\alpha\boxplus\mu_\beta}(z)\big)^2$, which together with (3.63) implies (3.64). Finally, (3.65) follows directly from the analyticity of $F_{\mu_\beta}$ and $F_{\mu_\alpha}$ in neighborhoods of $\omega_\alpha(E_-)$ and $\omega_\beta(E_-)$, respectively. □

Let us define a second domain $\mathcal{E}_\kappa$ by setting
$$\mathcal{E}_\kappa:=\{z\in\mathbb{C}^+\,:\,E_\alpha^-+E_\beta^--1\le\operatorname{Re}z\,,\ \operatorname{Re}z-E_-\le\kappa_0\,,\ 0\le\operatorname{Im}z\le\eta_M\} \tag{3.66}$$
with $\kappa_0$, $\eta_0$ as in (3.46) and a fixed $\eta_M\ge\eta_0$. Note that $\mathcal{E}\subset\mathcal{E}_\kappa$. We further introduce the functions
$$\mathcal{S}_{\alpha\beta}\equiv\mathcal{S}_{\alpha\beta}(z):=\big(F_{\mu_\alpha}'(\omega_\beta(z))-1\big)\big(F_{\mu_\beta}'(\omega_\alpha(z))-1\big)-1\,,$$
$$\mathcal{T}_\alpha\equiv\mathcal{T}_\alpha(z):=\frac12\Big(F_{\mu_\alpha}''(\omega_\beta(z))\big(F_{\mu_\beta}'(\omega_\alpha(z))-1\big)^2+F_{\mu_\beta}''(\omega_\alpha(z))\big(F_{\mu_\alpha}'(\omega_\beta(z))-1\big)\Big)\,,$$
$$\mathcal{T}_\beta\equiv\mathcal{T}_\beta(z):=\frac12\Big(F_{\mu_\beta}''(\omega_\alpha(z))\big(F_{\mu_\alpha}'(\omega_\beta(z))-1\big)^2+F_{\mu_\alpha}''(\omega_\beta(z))\big(F_{\mu_\beta}'(\omega_\alpha(z))-1\big)\Big)\,,\qquad z\in\mathbb{C}^+\,. \tag{3.67}$$
These functions are essentially the first and second order derivatives of the subordination equations (2.9). We have the following corollary on the estimates of $m_{\mu_\alpha\boxplus\mu_\beta}$, $\omega_\alpha$, $\omega_\beta$ and also the above functions.

Corollary 3.10.
Let $\mathcal{E}_\kappa$ be as in (3.66) and let $\mathcal{E}$ be as in (3.46). Then
$$\operatorname{Im}m_{\mu_\alpha\boxplus\mu_\beta}(z)\sim\operatorname{Im}\omega_\alpha(z)\sim\operatorname{Im}\omega_\beta(z)\sim\begin{cases}\sqrt{\kappa+\eta}\,, & \text{if } E\ge E_-\,,\\[1mm] \dfrac{\eta}{\sqrt{\kappa+\eta}}\,, & \text{if } E<E_-\,,\end{cases} \tag{3.68}$$
and
$$|\mathcal{S}_{\alpha\beta}(z)|\sim\sqrt{\kappa+\eta} \tag{3.69}$$
hold uniformly for $z\in\mathcal{E}_\kappa$, with $\kappa$ given in (2.14). Moreover, we have
$$\mathcal{T}_\alpha(z)\sim 1\,,\qquad \mathcal{T}_\beta(z)\sim 1\,, \tag{3.70}$$
uniformly for $z\in\mathcal{E}$, respectively
$$|\mathcal{T}_\alpha(z)|\le C\,,\qquad |\mathcal{T}_\beta(z)|\le C\,, \tag{3.71}$$
uniformly for $z\in\mathcal{E}_\kappa$, for some constant $C$.

Proof of Corollary 3.10. Having established (3.45) for the behavior of $\omega_\alpha$ and $\omega_\beta$ on $\mathcal{E}$, the behaviors in (3.68), (3.69) and (3.70) on $\mathcal{E}$ can be checked by elementary computations using Taylor expansions as in the proof of Lemma 3.7, and the estimates in (3.57) and (3.58).

Consider now the complementary domain $\mathcal{E}_\kappa\setminus\mathcal{E}$. Observe that $\kappa+\eta\sim 1$ on $\mathcal{E}_\kappa\setminus\mathcal{E}$. Hence, we have
$$\operatorname{Im}m_{\mu_\alpha\boxplus\mu_\beta}(z)=\int_{\mathbb{R}}\frac{\eta}{(x-E)^2+\eta^2}\,\mathrm{d}\mu_\alpha\boxplus\mu_\beta(x)\sim\eta \tag{3.72}$$
uniformly on $\mathcal{E}_\kappa\setminus\mathcal{E}$. Then, from (3.26), (3.72) and $\operatorname{Im}\omega_\alpha(z)\ge\eta$, $\operatorname{Im}\omega_\beta(z)\ge\eta$, we get
$$\operatorname{Im}\omega_\alpha(z)\sim\eta\,,\qquad \operatorname{Im}\omega_\beta(z)\sim\eta\,. \tag{3.73}$$
Observe that both estimates in (3.68) are of the same order as $\eta$ if $z\in\mathcal{E}_\kappa\setminus\mathcal{E}$. Hence, we have (3.68).

Next, we show that (3.69) can be extended to the whole $\mathcal{E}_\kappa\setminus\mathcal{E}$. Since $\kappa+\eta\sim 1$ on $\mathcal{E}_\kappa\setminus\mathcal{E}$, it suffices to show $|\mathcal{S}_{\alpha\beta}(z)|\sim 1$ there. We first consider real $z\in[E_\alpha^-+E_\beta^--1,E_--\kappa_0]$. Using (3.49) and its $F_{\mu_\alpha}'$ analogue, (3.54), the monotonicity of $\omega_\alpha(z)$ and $\omega_\beta(z)$ on $(-\infty,E_--\kappa_0]$ (c.f., Lemma 3.3), and (3.25), we see that
$$0\le\big(F_{\mu_\alpha}'(\omega_\beta(z))-1\big)\big(F_{\mu_\beta}'(\omega_\alpha(z))-1\big)\le 1-c\,,\qquad \forall\,z\in[E_\alpha^-+E_\beta^--1,E_--\kappa_0]\,,$$
for some small constant $c>0$. Hence, we have
$$\Big|\big(F_{\mu_\alpha}'(\omega_\beta(z))-1\big)\big(F_{\mu_\beta}'(\omega_\alpha(z))-1\big)-1\Big|\sim 1\,,\qquad \forall\,z\in[E_\alpha^-+E_\beta^--1,E_--\kappa_0]\,. \tag{3.74}$$
Then, by continuity, (3.74) can be extended to all $z=E+\mathrm{i}\eta$ with $E\in[E_\alpha^-+E_\beta^--1,E_--\kappa_0]$ and $0\le\eta\le\widetilde{\eta}_0$, for a sufficiently small constant $\widetilde{\eta}_0>0$; together with (3.69) on $\mathcal{E}$ this covers $E\in[E_\alpha^-+E_\beta^--1,E_-+\kappa_0]$ and $0\le\eta\le\eta_0$, after possibly reducing $\eta_0$ to $\widetilde{\eta}_0$ if $\eta_0>\widetilde{\eta}_0$.

It remains to show that the left side of (3.69) is proportional to 1 when $E\in[E_\alpha^-+E_\beta^--1,E_-+\kappa_0]$ and $\widetilde{\eta}_0\le\eta\le\eta_M$. To this end, we first recall (3.49), and observe from (3.47) that
$$\frac{\operatorname{Im}F_{\mu_\alpha}(\omega_\beta(z))-\operatorname{Im}\omega_\beta(z)}{\operatorname{Im}\omega_\beta(z)}=\int_{\mathbb{R}}\frac{\mathrm{d}\widehat{\mu}_\alpha(x)}{|x-\omega_\beta(z)|^2}\,. \tag{3.75}$$
Hence, using (3.49), (3.75) and their $F_{\mu_\beta}$ analogues, we have
$$\Big|\big(F_{\mu_\alpha}'(\omega_\beta(z))-1\big)\big(F_{\mu_\beta}'(\omega_\alpha(z))-1\big)\Big|\le\frac{\operatorname{Im}F_{\mu_\alpha}(\omega_\beta(z))-\operatorname{Im}\omega_\beta(z)}{\operatorname{Im}\omega_\beta(z)}\cdot\frac{\operatorname{Im}F_{\mu_\beta}(\omega_\alpha(z))-\operatorname{Im}\omega_\alpha(z)}{\operatorname{Im}\omega_\alpha(z)}=\frac{\operatorname{Im}\omega_\alpha(z)-\eta}{\operatorname{Im}\omega_\beta(z)}\cdot\frac{\operatorname{Im}\omega_\beta(z)-\eta}{\operatorname{Im}\omega_\alpha(z)}\le 1-c\,, \tag{3.76}$$
for a strictly positive constant $c$, where in the second step we used the second equation in (2.9) and in the last step we used the fact that $\eta\ge\widetilde{\eta}_0$ and (3.73). Then, from (3.76) we get (3.69) in the whole $\mathcal{E}_\kappa$.

Similarly, the upper bound in (3.71) follows from (3.73), (3.25), the monotonicity in Lemma 3.3, and the continuity of $\omega_\alpha$ and $\omega_\beta$. Omitting the details, we conclude the proof of Corollary 3.10. □

At this stage we have completed the first step in the proof of Proposition 3.1. In the next subsection, we carry out the second step, where we translate the results obtained so far for $\mu_\alpha$ and $\mu_\beta$ to the measures $\mu_A$ and $\mu_B$ by giving the actual proof of Proposition 3.1.

3.2. Proof of Proposition 3.1.
In this subsection, we prove Proposition 3.1. Consider the $N$-dependent measures $\mu_A$ and $\mu_B$, always assuming that they satisfy Assumption 2.2. Let $\omega_A(z)$ and $\omega_B(z)$ denote the subordination functions associated by (2.11) to the measures $\mu_A$ and $\mu_B$. Recall further the definition of the $z$-dependent quantities $\mathcal{S}_{AB}$, $\mathcal{T}_A$ and $\mathcal{T}_B$ in (3.1). Recall that $E_-=\inf\operatorname{supp}\mu_\alpha\boxplus\mu_\beta$. Fix sufficiently small $\varepsilon,\delta>0$ and let the domain $\mathcal{D}$ be defined by $\mathcal{D}:=\mathcal{D}_{\mathrm{in}}\cup\mathcal{D}_{\mathrm{out}}$, with
$$\mathcal{D}_{\mathrm{in}}:=\{z\in\mathbb{C}^+\,:\,|z-E_-|\le\delta\}\cap\{\operatorname{Im}z\ge N^{-\varepsilon}\,,\ \operatorname{Re}z>E_--N^{-\varepsilon}\}\,,$$
$$\mathcal{D}_{\mathrm{out}}:=\{z\in\mathbb{C}^+\,:\,|z-E_-|\le\delta\}\cap\{\operatorname{Re}z<E_--N^{-\varepsilon}\}\,.$$
Notice that the bounds on the $A,B$-quantities will be for spectral parameters $z$ that are separated away from the limiting spectrum (e.g., by assuming that $\operatorname{Im}z\ge N^{-\varepsilon}$), unlike in the case of the $\alpha,\beta$-quantities.

Lemma 3.11.
Let $\mu_A$, $\mu_B$, $\mu_\alpha$ and $\mu_\beta$ satisfy Assumptions 2.1 and 2.2. Then, there is a constant $c>0$ such that for any $z\in\mathcal{D}$ we have
$$|\omega_A(z)-\omega_\alpha(z)|+|\omega_B(z)-\omega_\beta(z)|\lesssim\frac{N^{-c\varepsilon}}{\sqrt{|z-E_-|}}\le N^{-c\varepsilon/2}\,, \tag{3.77}$$
$$|\mathcal{S}_{AB}(z)|\sim\sqrt{|z-E_-|}\,, \tag{3.78}$$
and
$$|\mathcal{T}_A(z)|\sim 1\,,\qquad |\mathcal{T}_B(z)|\sim 1\,, \tag{3.79}$$
for $N$ sufficiently large. Moreover, we have for any $z\in\mathcal{D}$ that
$$\operatorname{Im}m_{\mu_A\boxplus\mu_B}(z)\sim\sqrt{|z-E_-|}\,,\qquad z\in\mathcal{D}_{\mathrm{in}}\,, \tag{3.80}$$
$$\operatorname{Im}m_{\mu_A\boxplus\mu_B}(z)\lesssim\frac{\operatorname{Im}z+O(N^{-c\varepsilon})}{\sqrt{|z-E_-|}}\,,\qquad z\in\mathcal{D}_{\mathrm{out}}\,, \tag{3.81}$$
for $N$ sufficiently large. Furthermore, for the imaginary parts the bound (3.77) is, for $N$ sufficiently large, sharpened to
$$|\operatorname{Im}\omega_A-\operatorname{Im}\omega_\alpha|+|\operatorname{Im}\omega_B-\operatorname{Im}\omega_\beta|\le(\operatorname{Im}\omega_\alpha+\operatorname{Im}\omega_\beta)N^{-c\varepsilon}+\frac{\operatorname{Im}z}{\sqrt{|z-E_-|}}\,, \tag{3.82}$$
for $z\in\mathcal{D}_{\mathrm{out}}$ with $\eta\le N^{-1}$, which implies that
$$\inf\operatorname{supp}\mu_A\boxplus\mu_B\ge E_--N^{-\varepsilon}\,. \tag{3.83}$$
Away from the edge we have the following weaker versions of (3.78), (3.79):
$$|\mathcal{S}_{AB}(z)|\sim 1\,, \tag{3.84}$$
$$|\mathcal{T}_A(z)|+|\mathcal{T}_B(z)|\le C\,, \tag{3.85}$$
uniformly for any $z$ with $\delta\le|z-E_-|\le C$, for $N$ sufficiently large.

Proof. First, note that we can rewrite the subordination equations for $\mu_\alpha$ and $\mu_\beta$ (c.f., (2.9) with $\mu_1=\mu_\alpha$, $\mu_2=\mu_\beta$) as
$$F_{\mu_A}(\omega_\beta(z))-\omega_\alpha(z)-\omega_\beta(z)+z=r_1(z)\,,\qquad F_{\mu_B}(\omega_\alpha(z))-\omega_\alpha(z)-\omega_\beta(z)+z=r_2(z)\,, \tag{3.86}$$
where we introduced
$$r_1(z):=F_{\mu_A}(\omega_\beta(z))-F_{\mu_\alpha}(\omega_\beta(z))\,,\qquad r_2(z):=F_{\mu_B}(\omega_\alpha(z))-F_{\mu_\beta}(\omega_\alpha(z))\,. \tag{3.87}$$
By Lemma 3.6 and Lemma 3.7, we know that $\omega_\beta(z)$, $z\in\mathcal{D}$, is far away from the support of $\mu_\alpha$ and also from the support of $\mu_A$, using (2.4). Hence, using Corollary 3.9 and Lemma 3.5, we have
$$|r_1(z)|\le C\mathrm{d}\le CN^{-\varepsilon}\,,\qquad |r_2(z)|\le C\mathrm{d}\le CN^{-\varepsilon}\,,\qquad z\in\mathcal{D}\,, \tag{3.88}$$
with $\mathrm{d}$ given in (2.3). We rely on the following local stability result for the system (3.86).

Lemma 3.12.
Fix z ∈ D . Assume that the functions ω α , ω β , r , r : C + → C satisfy (3.86) with z = z . Assume moreover that there is a function q ≡ q ( z ) such that | ω A ( z ) − ω α ( z ) | ≤ q ( z ) , | ω B ( z ) − ω β ( z ) | ≤ q ( z ) , (3.89) with S αβ ( z ) q ( z ) = o (1) and S αβ ( z ) q ( z ) = o (1) , with S αβ given in (3.67) . Then we have | ω A ( z ) − ω α ( z ) | + | ω B ( z ) − ω β ( z ) | ≤ | r ( z ) | + | r ( z ) ||S αβ ( z ) | , (3.90) for N sufficiently large.Proof. The proof is almost identical to the proof of Proposition 4.1 in [3]. The only difference is that,by Corollary 3.9, F ′′ µ α ( ω β ( z )) and F ′′ µ β ( ω α ( z )) are O (1) uniformly in z ∈ D . Hence, in (4.11) of [3], wecan stop the Taylor expansion in Ω ( z ) = ω B ( z ) − ω β ( z ) at second order and estimate the remainderby O ( | Ω ( z ) | ). This means that the factor K/k in the subsequent formulas (4.12) and (4.13) can bereplaced by a constant. Recalling that the current S αβ plays the rˆole of 1 /S in [3], we find that inthe equation (4.13) we are in the linear regime provided that S αβ ( z ) q ( z ) ≪ S αβ ( z ) q ( z ) ≪ (cid:3) Continuing the proof of Lemma 3.11, we use a continuity argument to establish (3.90) with q ( z ) := N − ǫ / p | z − E − | . For z ∈ D with Im z = η M , for some fixed η M = O (1), the local linear stabilityresult of Lemma 4.2. of [3] shows that | ω A ( z ) − ω α ( z ) | + | ω B ( z ) − ω β ( z ) | ≤ | r ( z ) | + 2 | r ( z ) | ≤ N − ǫ ,provided that Im ω A ( z ) − Im z ≥ c > ω B ( z ) − Im z ≥ c >
0. These bounds follow from thesubordination equation and the representation:Im ω A ( z ) − Im z = Im F µ A ( ω B ( z )) − Im ω B ( z ) = (Im z ) Z R d b µ A ( x ) | x − z | ≥ c ′ > z ≥ η M , and similarly for ω B .Using the Lipschitz continuity of the subordination functions on D , in particular | ω ′ A ( z ) | , | ω ′ B ( z ) | ≤ η − , and similar for ω α and ω β , we can bootstrap (3.89) and (3.90) with q ( z ) = N − ǫ / p | z − E − | , asthen q ( z ) S αβ ( z ) ∼ N − ǫ (since S αβ ( z ) ∼ p | z − E − | by (3.69)). Thus we have | ω A ( z ) − ω α ( z ) | + | ω B ( z ) − ω β ( z ) | . d |S αβ | ≤ N − ε p | z − E − | ≤ N − / ε , z ∈ D , since for z ∈ D , we have | z − E − | ≥ N − ε , i.e., |S αβ ( z ) | ≥ N − / ε , this proves (3.77).From this bound we can compare S αβ and S AB , T α and T A , and T β and T B , e.g., |S AB ( z ) − S αβ ( z ) | ≤ | ( F ′ µ A ( ω B ( z )) − F ′ µ B ( ω A ( z )) − − ( F ′ µ A ( ω β ( z )) − F ′ µ B ( ω α ( z )) − | + | ( F ′ µ A ( ω β ( z )) − F ′ µ B ( ω α ( z )) − − ( F ′ µ α ( ω β ( z )) − F ′ µ β ( ω α ( z )) − | . | ω A ( z ) − ω α ( z ) | + | ω B ( z ) − ω β ( z ) | + d ≤ N − / ε , z ∈ D , (in the first estimate we used that F ’s are all regular and in the second we used the same in addi-tion to (3.25) and (2.4)). Since |S αβ | ≥ N − / ε in this regime, we immediately get (3.78). Thebounds (3.79), (3.80), (3.81), (3.84) are proven exactly in the same way by showing that the differencebetween the finite- N quantity and the limiting quantity is smaller than the size of the limiting quantitygiven in (3.67) and (3.63).The proof of (3.82) requires one more argument. Outside of the support, (3.77) is not optimal for theimaginary parts. Recall r and r from (3.87), z ∈ C + . 
Clearly | Im r ( z ) | ≤ C (Im ω β ( z )) N − ε , | Im r ( z ) | ≤ C (Im ω α ( z )) N − ε , z ∈ D , since Im F µ A ( ω β ( z )) = Im m µ A ( ω β ( z )) | m µ A ( ω β ( z )) | = Im ω β ( z ) | m µ A ( ω β ( z )) | Z R d µ A ( x ) | x − ω β ( z ) | , so changing A to α yields a factor N − ε by (2.3) since ω β ( z ) is away from the support of µ A . Takingimaginary parts in (3.86) and using the representations from (3.23) gives,Im ω β ( z ) Z R d b µ A ( x ) | x − ω β ( z ) | − Im ω α ( z ) + Im z = Im r ( z ) = O (cid:0) Im ω β ( z ) N − ε (cid:1) , Im ω α ( z ) Z R d b µ B ( x ) | x − ω α ( z ) | − Im ω β ( z ) + Im z = Im r ( z ) = O (cid:0) Im ω α ( z ) N − ε (cid:1) , (3.91) z ∈ D , and similarly, starting from the subordination equations for µ A and µ B , we haveIm ω B ( z ) Z R d b µ A ( x ) | x − ω B ( z ) | − Im ω A ( z ) + Im z = 0 , Im ω A ( z ) Z R d b µ B ( x ) | x − ω A ( z ) | − Im ω B ( z ) + Im z = 0 . (3.92)In fact, we can change ω β to ω B and ω α to ω A in (3.91), to getIm ω β ( z ) Z R d b µ A ( x ) | x − ω B ( z ) | − Im ω α ( z ) + Im z = O (cid:0) Im ω β ( z ) N − ε (cid:1) , Im ω α ( z ) Z R d b µ B ( x ) | x − ω A ( z ) | − Im ω β ( z ) + Im z = O (cid:0) Im ω α ( z ) N − ε (cid:1) , (3.93) z ∈ D . Subtracting (3.92) from (3.93) and using that for very small η the determinant of the resultinglinear system is very close to S AB ( z ) ∼ p | z − E − | , z ∈ D , from (3.78), we have proved (3.82).To prove (3.83), let z = x +i η with x ≤ E − − N − ε . At a distance of at least N − below E − , we getIm m µ α ⊞ µ β ( z ) = Im z Z R d µ α ⊞ µ β ( x ) | x − z | ≤ N Im z . Moreover from m µ α ⊞ µ β ( z ) = m α ( ω β ( z )) we have Im m α ( ω β ( z )) ∼ Im ω β ( z ) since ω β ( z ) is away fromthe support of µ α . The same holds for ω α ( z ), so we get Im ω α ( z ) + Im ω β ( z ) ≤ N Im z . Taking η ց ω A ( x ) = Im ω B ( x ) = 0. SinceIm m µ A ⊞ µ B ( z ) ∼ Im ω A ( z ) in this regime, x cannot lie in the support of µ A ⊞ µ B . This proves (3.83). 
□

Recall that $\gamma_j$ denotes the $j$-th $N$-quantile of $\mu_\alpha\boxplus\mu_\beta$ from (2.20), and similarly let $\gamma_j^*$ denote the $j$-th $N$-quantile of $\mu_A\boxplus\mu_B$; i.e., these are the smallest numbers $\gamma_j$ and $\gamma_j^*$ such that
$$\mu_\alpha\boxplus\mu_\beta\big((-\infty,\gamma_j]\big)=\mu_A\boxplus\mu_B\big((-\infty,\gamma_j^*]\big)=\frac{j}{N}\,.$$

Lemma 3.13 (Rigidity). Suppose Assumptions 2.1 and 2.2 hold. Then we have the rigidity bound
$$|\gamma_j-\gamma_j^*|\le j^{-1/3}N^{-2/3+\varepsilon}\,,\qquad j\in[\![1,cN]\!]\,, \tag{3.94}$$
for $N$ sufficiently large and for some sufficiently small constant $c>0$.

Under the additional Assumption 2.7 we have the rigidity estimate for all quantiles, i.e.,
$$|\gamma_j-\gamma_j^*|\le\min\{j^{-1/3},(N+1-j)^{-1/3}\}\,N^{-2/3+\varepsilon}\,,\qquad j\in[\![1,N]\!]\,. \tag{3.95}$$

Proof.
The proof of these rigidity results are fairly straightforward from the information collected sofar, by using standard arguments to translate the closeness of Stieltjes transform of two measures intocloseness of their quantiles. We will just outline the argument. Recall the domain E κ from (3.46).First, we establish that there are at most N ε γ j -quantiles as well as N ε γ ∗ j -quantiles in an N − / ε vicinity of E − = inf supp µ α ⊞ µ β . This fact is immediate for the γ j quantiles since their distributionis given by the regular square root law, see (3.62). For the γ ∗ j -quantiles, we know from (3.83) that γ ∗ ≥ E − − N − ε . We compute from (3.80) jN = Z γ ∗ j −∞ d µ A ⊞ µ B = Z γ ∗ j E − − N − ε µ A ⊞ µ B ( x )d x ≤ C Z γ ∗ j E − − N − ε Im m A ⊞ B ( x + i N − ε )d x ≤ C Z γ ∗ j E − − N − ε (cid:2) | x − E − | + N − ε (cid:3) / d x ≤ C | γ ∗ j − E − | / + CN − ε | γ ∗ j − E − | , which means that | γ ∗ j − E − | ≥ c (cid:16) jN (cid:17) / , with some positive constant c >
0. So we have γ ∗ j ≥ E − + cN − / ε , if j ≥ cN ε/ , (3.96)and note that the condition j ≥ cN ε/ is equivalent to γ j ≥ E − + cN − / ε . In the other direction we use Z γ ∗ j E − − N − ε µ A ⊞ µ B ( x ) d x ≥ c Z γ ∗ j E − − N − ε Im m A ⊞ B ( x + i N − ε ) d x if | γ ∗ j − E − | ≫ N − ε . Using again (3.80) we get jN ≥ c | γ ∗ j − E − | / , i.e., γ ∗ j ≤ E − + C (cid:16) jN (cid:17) / ∀ j, since this latter bound also holds in the case, when | γ ∗ j − E − | ≫ N − ε is not satisfied.Thus we have established | γ j − γ ∗ j | ≤ | γ j − E − | + | γ ∗ j − E − | ≤ CN − / ε , whenever γ j ≤ E − + N − / ε . (3.97)From the continuity of the free convolution (Proposition 4.13 of [9]) and the condition (2.3) we getd L ( µ A ⊞ µ B , µ α ⊞ µ β ) ≤ d L ( µ A , µ α ) + d L ( µ B , µ β ) ≤ N − ǫ . On the other hand, the definition of the L´evy distance and the boundedness of the density of µ α ⊞ µ β below E − + κ (see (3.62)) directly imply that (cid:12)(cid:12) µ A ⊞ µ B (cid:0) ( −∞ , x ) (cid:1) − µ α ⊞ µ β (cid:0) ( −∞ , x ) (cid:1)(cid:12)(cid:12) ≤ CN − ε (3.98)holds for any x ≤ E − + κ . Together with (3.97), this estimate immediately implies the bound (3.94).For the proof of (3.95), we note that ( ii ′ ) and ( v ′ ) of Assumption 2.7 guarantee that near the upperedge of the support of µ α ⊞ µ β a similar rigidity statement holds as (3.94). Finally, ( ii ′ ) of Assumption 2.7together with the continuity and boundedness of the density of µ α ⊞ µ β (see (3.8)) imply that the densityhas a positive lower and upper bound away the two extreme edges of its support. These informationtogether with (2.3) are sufficient to conclude that (3.98) hold uniformly for any x ∈ R . The correspondingresult (3.95) for the quantiles follows immediately. (cid:3) Proof of Proposition 3.1.
First, on the domain $\mathcal{D}$, $(i)$ of Proposition 3.1 follows from (3.77), (3.25), the assumption (2.4) and also the continuity of $\omega_\alpha$ and $\omega_\beta$. In the complementary domain $\mathcal{D}_\tau(\eta_m,\eta_M)\setminus\mathcal{D}$, we first prove (3.3). Using the equations $m_{\mu_A\boxplus\mu_B}=m_{\mu_A}(\omega_B)=m_{\mu_B}(\omega_A)$, we see that the upper bounds on $\omega_A$ and $\omega_B$ follow from the fact that $|m_{\mu_A\boxplus\mu_B}(z)|\ge c$, which can be derived easily from the rigidity (3.94). For (3.2), we further split into two regimes. In the regime $\eta\ge\eta_0$ for some small $\eta_0>0$, we use the fact $\operatorname{Im}\omega_A(z),\operatorname{Im}\omega_B(z)\ge\eta$ directly. In the regime $\eta\le\eta_0$, we use the continuity of $\omega_A$ and $\omega_B$, and also the monotonicity of $\omega_A(u)$ and $\omega_B(u)$ for $u\in(-\infty,E_--\delta]$, which can be proved similarly to the monotonicity of $\omega_\alpha(u)$ and $\omega_\beta(u)$ (c.f., (3.19)).

Similarly, on the domain $\mathcal{D}$, Proposition 3.1 $(ii)$ follows from (3.80) and (3.81) directly. In the complementary domain $\mathcal{D}_\tau(\eta_m,\eta_M)\setminus\mathcal{D}$, we apply again the rigidity result (3.94) to conclude the proof. Statement $(iii)$ follows from (3.78), (3.79), (3.84) and (3.85). Finally, to prove item $(iv)$, we differentiate the subordination equations (2.9) with respect to $z$ to get
$$\begin{pmatrix}1 & 1-F_A'(\omega_B(z))\\ 1-F_B'(\omega_A(z)) & 1\end{pmatrix}\begin{pmatrix}\omega_A'(z)\\ \omega_B'(z)\end{pmatrix}=\begin{pmatrix}1\\ 1\end{pmatrix}\,,$$
with the shorthand $F_A\equiv F_{\mu_A}$, $F_B\equiv F_{\mu_B}$. Hence,
$$\begin{pmatrix}\omega_A'(z)\\ \omega_B'(z)\end{pmatrix}=-\mathcal{S}^{-1}\begin{pmatrix}F_A'(\omega_B(z))\\ F_B'(\omega_A(z))\end{pmatrix}\,,$$
where $\mathcal{S}\equiv\mathcal{S}_{AB}$. Using (3.1), (3.2) and (3.5), we directly get the first two estimates in (3.7). Next, from the definition of $\mathcal{S}(z)$ in (3.1), we observe that
$$|\mathcal{S}'(z)|=\Big|F_B''(\omega_A)\big(F_A'(\omega_B)-1\big)\,\omega_A'(z)+F_A''(\omega_B)\big(F_B'(\omega_A)-1\big)\,\omega_B'(z)\Big|\le C|\mathcal{S}^{-1}(z)|\,, \tag{3.99}$$
where in the second step we used (3.2) and the first two estimates in (3.7). Hence, by (3.5) we get the third estimate in (3.7), and statement $(iv)$ is proved. This finishes the proof of Proposition 3.1. □

4. General structure of the proof
4.1. Partial randomness decomposition.
In this subsection, we recall the partial randomness decomposition of the Haar unitary matrix used in [4], which will often be used below.

Let $\mathbf{u}_i=(u_{i1},\ldots,u_{iN})$ be the $i$-th column of $U$. Let $\theta_i$ be the argument of $u_{ii}$. The following partial randomness decomposition of $U$ is taken from [15] (see also [23]): for any $i\in[\![1,N]\!]$, we can write
$$U=-\mathrm{e}^{\mathrm{i}\theta_i}R_iU^{\langle i\rangle}\,, \tag{4.1}$$
where $U^{\langle i\rangle}$ is a unitary block-diagonal matrix whose $(i,i)$-th entry equals 1, and whose $(i,i)$-minor is Haar distributed on $U(N-1)$. In particular, $U^{\langle i\rangle}\mathbf{e}_i=\mathbf{e}_i$ and $\mathbf{e}_i^*U^{\langle i\rangle}=\mathbf{e}_i^*$, where $\mathbf{e}_i$ is the $i$-th coordinate vector. Here $R_i$ is a reflection matrix, defined as
$$R_i:=I-\mathbf{r}_i\mathbf{r}_i^*\,, \tag{4.2}$$
where
$$\mathbf{r}_i:=\sqrt{2}\,\frac{\mathbf{e}_i+\mathrm{e}^{-\mathrm{i}\theta_i}\mathbf{u}_i}{\|\mathbf{e}_i+\mathrm{e}^{-\mathrm{i}\theta_i}\mathbf{u}_i\|}\,. \tag{4.3}$$
Using $U^{\langle i\rangle}\mathbf{e}_i=\mathbf{e}_i$ and (4.1), we see that
$$\mathbf{u}_i=U\mathbf{e}_i=-\mathrm{e}^{\mathrm{i}\theta_i}R_i\mathbf{e}_i\,. \tag{4.4}$$
Hence, $R_i=R_i^*$ is actually the Householder reflection (up to a sign) sending $\mathbf{e}_i$ to $-\mathrm{e}^{-\mathrm{i}\theta_i}\mathbf{u}_i$. With the decomposition in (4.1), we can write
$$H=A+\widetilde{B}=A+R_i\widetilde{B}^{\langle i\rangle}R_i\,,$$
where we introduced the notations $\widetilde{B}:=UBU^*$ and $\widetilde{B}^{\langle i\rangle}:=U^{\langle i\rangle}B(U^{\langle i\rangle})^*$. Observe that $\widetilde{B}^{\langle i\rangle}\mathbf{e}_i=b_i\mathbf{e}_i$ and $\mathbf{e}_i^*\widetilde{B}^{\langle i\rangle}=b_i\mathbf{e}_i^*$. Clearly, $\widetilde{B}^{\langle i\rangle}$ is independent of $\mathbf{u}_i$.

It is known that $\mathbf{u}_i\in\mathbb{S}_{\mathbb{C}}^{N-1}:=\{\mathbf{x}\in\mathbb{C}^N:\mathbf{x}^*\mathbf{x}=1\}$ is a uniformly distributed complex unit vector, and there exists a Gaussian vector $\widetilde{\mathbf{g}}_i\sim\mathcal{N}_{\mathbb{C}}(0,N^{-1}I_N)$ such that $\mathbf{u}_i=\widetilde{\mathbf{g}}_i/\|\widetilde{\mathbf{g}}_i\|$. We then further introduce the notations
$$\mathbf{g}_i:=\mathrm{e}^{-\mathrm{i}\theta_i}\widetilde{\mathbf{g}}_i\,,\qquad \mathbf{h}_i:=\frac{\mathbf{g}_i}{\|\mathbf{g}_i\|}=\mathrm{e}^{-\mathrm{i}\theta_i}\mathbf{u}_i\,,\qquad \ell_i:=\frac{\sqrt{2}}{\|\mathbf{e}_i+\mathbf{h}_i\|}\,. \tag{4.5}$$
Observe that the components $g_{ik}$ of $\mathbf{g}_i$ are independent. Moreover, for $k\ne i$, $g_{ik}\sim\mathcal{N}_{\mathbb{C}}(0,N^{-1})$, while $g_{ii}$ is a $\chi$-distributed random variable with $\mathbb{E}g_{ii}^2=N^{-1}$. With the above notations, we can write $\mathbf{r}_i$ in (4.3) as
$$\mathbf{r}_i=\ell_i(\mathbf{e}_i+\mathbf{h}_i)\,.$$
(4.6)

In addition, using (4.4) and the fact $R_i^2=I$, we have
$$R_i\mathbf{e}_i=-\mathbf{h}_i\,,\qquad R_i\mathbf{h}_i=-\mathbf{e}_i\,, \tag{4.7}$$
which also imply
$$\mathbf{h}_i^*\widetilde{B}^{\langle i\rangle}R_i=-\mathbf{e}_i^*\widetilde{B}\,,\qquad \mathbf{e}_i^*\widetilde{B}^{\langle i\rangle}R_i=-b_i\mathbf{h}_i^*=-\mathbf{h}_i^*\widetilde{B}\,. \tag{4.8}$$
Here, in the first equality of the second equation we used that $\mathbf{e}_i^*\widetilde{B}^{\langle i\rangle}=b_i\mathbf{e}_i^*$. We introduce the vectors
$$\mathring{\mathbf{g}}_i:=\mathbf{g}_i-g_{ii}\mathbf{e}_i\,,\qquad \mathring{\mathbf{h}}_i:=\frac{\mathring{\mathbf{g}}_i}{\|\mathbf{g}_i\|}\,,$$
where the $\chi$-distributed variable $g_{ii}$ is kicked out.

4.2. Summary of the proof route.
In this subsection, we summarize the main route of the proof. While the final goal of the local law is to understand $G_{ii}$, $i\in[\![1,N]\!]$, and its averaged version, we work with several auxiliary quantities first. To understand their origin, it is useful to review the structure of our previous proofs of the local laws in the bulk [4, 5]. We first introduce the following control parameters
$$\Psi\equiv\Psi(z):=\sqrt{\frac{1}{N\eta}}\,,\qquad \Pi\equiv\Pi(z):=\sqrt{\frac{\operatorname{Im}m_H}{N\eta}}\,. \tag{4.9}$$
In [4], we investigated two main quantities:
$$S_i\equiv S_i(z):=\mathbf{h}_i^*\widetilde{B}^{\langle i\rangle}G\mathbf{e}_i\,,\qquad T_i\equiv T_i(z):=\mathbf{h}_i^*G\mathbf{e}_i\,. \tag{4.10}$$
In particular, we showed that
$$S_i=\frac{z-\omega_B(z)}{a_i-\omega_B(z)}+O_\prec(\Psi)\,,\qquad T_i=O_\prec(\Psi)\,,$$
by performing integration by parts in the $\mathbf{h}_i$ variable. Using the identity
$$G_{ii}=\frac{1-(\widetilde{B}G)_{ii}}{a_i-z}$$
and that
$$(\widetilde{B}G)_{ii}=\mathbf{e}_i^*R_i\widetilde{B}^{\langle i\rangle}R_iG\mathbf{e}_i=-\mathbf{h}_i^*\widetilde{B}^{\langle i\rangle}R_iG\mathbf{e}_i=-S_i+\mathbf{h}_i^*\widetilde{B}^{\langle i\rangle}\mathbf{r}_i\mathbf{r}_i^*G\mathbf{e}_i=-S_i+\ell_i^2\big(\mathbf{h}_i^*\widetilde{B}^{\langle i\rangle}\mathbf{h}_i+b_ih_{ii}\big)(G_{ii}+T_i)\,,$$
we obtained the entry-wise local law for $G_{ii}$ from a precise control on $S_i$ and $T_i$.

Technically, $S_i$ is a better quantity to handle than $G_{ii}$, since integration by parts can be applied to it directly. However, along the calculation the quantity $T_i$ appeared, and a second integration by parts was needed to control it. We obtained a closed system of equations on the expectations of $S_i$ and $T_i$ (see (6.23)–(6.24) of [4]), from which the entry-wise local law in the bulk followed.

To obtain the law for the normalized trace of $G$ in [5], we performed a fluctuation averaging, but again not for $G_{ii}$ directly. We considered averages (with arbitrary weights $d_i$) of the quantity $Z_i:=Q_i+G_{ii}\Upsilon$, where we defined
$$Q_i\equiv Q_i(z):=(\widetilde{B}G)_{ii}\,\mathrm{tr}\,G-G_{ii}\,\mathrm{tr}\,\widetilde{B}G\,, \tag{4.11}$$
$$\Upsilon\equiv\Upsilon(z):=\mathrm{tr}\,\widetilde{B}G-\big(\mathrm{tr}\,\widetilde{B}G\big)^2+\mathrm{tr}\,G\,\mathrm{tr}\,\widetilde{B}G\widetilde{B}\,. \tag{4.12}$$
From the entry-wise laws it is clear that $|Q_i|,|\Upsilon|\prec\Psi$, and now we improve these bounds, at least in an averaged sense in the case of $Q_i$.
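Since all of these quantities are built on the partial randomness decomposition above, it is worth sanity-checking that decomposition numerically. The following sketch (not from the paper; the QR-based Haar sampler is a standard recipe) verifies the reflection properties (4.2), (4.7) and the decomposition (4.1) for $i=1$ (Python index 0):

```python
import numpy as np

rng = np.random.default_rng(0)
N, i = 8, 0

# Haar-distributed unitary via QR of a complex Ginibre matrix with phase fix.
Z = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
Q, R = np.linalg.qr(Z)
U = Q * (np.diag(R) / np.abs(np.diag(R)))

e = np.zeros(N, dtype=complex); e[i] = 1.0
u = U[:, i]                                   # i-th column u_i of U
theta = np.angle(u[i])                        # theta_i = arg(u_ii)
h = np.exp(-1j * theta) * u                   # h_i = e^{-i theta_i} u_i
r = np.sqrt(2) * (e + h) / np.linalg.norm(e + h)
Ri = np.eye(N) - np.outer(r, r.conj())        # R_i = I - r_i r_i^*, cf. (4.2)

# (4.7): the reflection swaps e_i and h_i up to a sign, and R_i^2 = I.
assert np.allclose(Ri @ e, -h) and np.allclose(Ri @ h, -e)
# (4.1): U^{<i>} = -e^{-i theta_i} R_i U fixes the i-th coordinate vector.
Uh = -np.exp(-1j * theta) * Ri @ U
assert np.allclose(Uh @ e, e) and np.allclose(e.conj() @ Uh, e)
assert np.allclose(U, -np.exp(1j * theta) * Ri @ Uh)
```

The check that the $(i,i)$-minor of $U^{\langle i\rangle}$ is Haar distributed on $U(N-1)$ is a distributional statement and is not tested here.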
Notice that $Q_i$ is the most "symmetric" quantity; in particular, $\sum_iQ_i=0$. But technically it is not the most convenient object to start a high moment estimate for $\frac1N\sum_id_iQ_i$. The reason is that one step of integration by parts generates an additional term, $G_{ii}\Upsilon$, which is hard to control directly. So instead of averaging $Q_i$, in [5] we included a counter term, i.e., we averaged $Z_i$ instead. We first proved that this average is one order better, i.e.,
$$\Big|\frac1N\sum_{i=1}^Nd_iZ_i\Big|\prec\Psi^2\,. \tag{4.13}$$
Then, using (4.13) with $d_i\equiv 1$, we obtained $|\Upsilon|\prec\Psi^2$. Thus, a posteriori, we showed that the counter term $G_{ii}\Upsilon$ is irrelevant for estimates of order $\Psi^2$, and we obtained the same bound (4.13) for $Q_i$ as well. Finally, the bounds on the average of $Q_i$ with careful choices of the weights $d_i$, together with the algebraic identities between $G$ and $\widetilde{B}G$, yielded the averaged law for $G_{ii}$ with the optimal $O_\prec(\Psi^2)$ error.

All results in [4, 5] concerned the bulk. It is well known from the analogous results for Wigner matrices that the edge analysis is more difficult. The main reason is that the corresponding Dyson equation, the subordination equation in the current model, is unstable at the spectral edge; hence more precise estimates are necessary for the error terms. Theoretically, all error terms involving $\Psi=1/\sqrt{N\eta}$ should be improved by a factor of $\sqrt{\operatorname{Im}m}$, where we set $m=m_{\mu_A\boxplus\mu_B}$. This factor reflects that the density of states is small at the edge (at a square root edge we have $\operatorname{Im}m(z)\sim\sqrt{\kappa+\eta}$, where $\eta=\operatorname{Im}z$ and $\kappa$ is the distance of $\operatorname{Re}z$ to the edge). This improvement exactly compensates for the bound of order $(\kappa+\eta)^{-1/2}$ on the inverse of the linearization of the subordination equation near the edge. However, this improvement is quite complicated to obtain, and the method in [5] is not sufficient.

In this paper we present a new strategy to obtain the stronger bound. To prepare for the higher accuracy, already in the entry-wise law we work with two new quantities, $P_i$ and $K_i$, instead of $S_i$ and $T_i$. They are defined as
$$P_i\equiv P_i(z):=(\widetilde{B}G)_{ii}\,\mathrm{tr}\,G-G_{ii}\,\mathrm{tr}\,\widetilde{B}G+(G_{ii}+T_i)\Upsilon\,, \tag{4.14}$$
$$K_i\equiv K_i(z):=T_i+\big(b_iT_i+(\widetilde{B}G)_{ii}\big)\,\mathrm{tr}\,G-(G_{ii}+T_i)\,\mathrm{tr}\,\widetilde{B}G\,. \tag{4.15}$$
We recognize that $P_i=Q_i+(G_{ii}+T_i)\Upsilon=Z_i+T_i\Upsilon$, i.e., we included an additional counter term $T_i\Upsilon$ compared to the previous $Z_i$. While a posteriori this counter term turns out to be irrelevant, it is necessary in order to perform the integration by parts more precisely.
Similarly,
$$K_i=\big(1+b_i\,\mathrm{tr}\,G-\mathrm{tr}\,\widetilde{B}G\big)T_i+Q_i\,, \tag{4.16}$$
i.e., $K_i$ is a linear combination of $T_i$ and $Q_i$; it is nevertheless easier to work with $K_i$.

The proof is divided into three parts.

In the first part (Section 5) we obtain entry-wise bounds of the form
$$|K_i|,|Q_i|,|T_i|,|P_i|\prec\Psi\,,\qquad\text{as well as}\qquad |\Upsilon|\prec\Psi^2\,; \tag{4.17}$$
see Proposition 5.1. Notice that the estimates are still in terms of $\Psi=1/\sqrt{N\eta}$, without the improving factor $\sqrt{\operatorname{Im}m}$. These results could be derived directly from the estimates in [4] by operating with $S_i$ and $T_i$; we nevertheless use the new quantities, since the formulas derived along the proof of the entry-wise bounds will be used in the improved bounds later.

There is yet another reason for introducing the new quantities $P_i$ and $K_i$, namely that in the current paper we have also changed the strategy concerning the entry-wise laws. In [4], a precursor to [5], we first proved entry-wise laws by deriving a system of equations for the expectation values (of $S_i$ and $T_i$), complemented with concentration inequalities to enhance them to high probability bounds. For the improved bound on averaged quantities, high moment estimates were performed only in [5], using the entry-wise law as an input. In the current paper we organize the proof in a more straightforward way, similarly to [6]. We bypass the fairly complicated argument leading to the entry-wise law in [4] and rely on high moment estimates directly, even for the entry-wise law. This strategy is not only conceptually cleaner but also allows us to use essentially the same calculations for the entry-wise and the averaged law. The estimates of many error terms are shared between the two parts of the proofs; in the case of some other estimates it will be sufficient to point out the necessary improvements. However, high moment estimates require us to consider more carefully chosen quantities. For example, no direct high moment estimates are possible for $S_i$, since it is not even a small quantity.
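The algebraic relations among these quantities can be verified mechanically in finite dimensions. The sketch below assumes the forms of (4.11) and (4.15)–(4.16) as displayed above (with normalized traces); it checks that $\sum_iQ_i=0$ holds exactly and that $K_i$ is indeed the stated linear combination of $T_i$ and $Q_i$:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 50
a = np.linspace(-1.0, 1.0, N)                 # eigenvalues of A
b = np.linspace(0.0, 2.0, N)                  # eigenvalues of B

Z = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
Q, R = np.linalg.qr(Z)
U = Q * (np.diag(R) / np.abs(np.diag(R)))     # Haar unitary

Bt = U @ np.diag(b) @ U.conj().T              # B~ = U B U*
H = np.diag(a) + Bt
z = 0.5 + 0.5j
G = np.linalg.inv(H - z * np.eye(N))          # resolvent G(z)
BtG = Bt @ G
trG, trBtG = np.trace(G) / N, np.trace(BtG) / N    # normalized traces

Qi = np.diag(BtG) * trG - np.diag(G) * trBtG       # cf. (4.11)
assert abs(Qi.sum()) < 1e-8                        # sum_i Q_i = 0 identically

# T_i = h_i^* G e_i with h_i = e^{-i theta_i} u_i; check the K_i relation (4.16).
i = 3
h = np.exp(-1j * np.angle(U[i, i])) * U[:, i]
Ti = h.conj() @ G[:, i]
Ki = Ti + (b[i] * Ti + BtG[i, i]) * trG - (G[i, i] + Ti) * trBtG   # (4.15)
assert np.isclose(Ki, (1 + b[i] * trG - trBtG) * Ti + Qi[i])       # (4.16)
```

The identity $\sum_iQ_i=0$ holds for every realization, which is the "symmetry" of $Q_i$ referred to above; the stochastic smallness $|T_i|\prec\Psi$ is a separate probabilistic statement and is not asserted here.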
But high moment estimates even for T_i and Q_i produce additional terms that are difficult to handle. It turns out that the carefully chosen counterterms in P_i and K_i make them suitable for high moment bounds. More precisely, in the first step we compute the high moments of K_i and conclude that |K_i| ≺ Ψ. In the second step we prove a high moment bound for P_i = Q_i + (G_{ii} + T_i)Υ, i.e., we prove |P_i| ≺ Ψ. In the third step we average this bound and conclude |Υ| ≺ Ψ, which in turn yields |Q_i| ≺ Ψ. Finally, from (4.16) we conclude that |T_i| ≺ Ψ. This proves (4.17) and completes the entry-wise bounds.

In the second part of the proof (Section 6) we derive a rough bound on the averaged quantities. We will focus on N^{-1} Σ_i d_i Q_i, since Q_i is the most fundamental quantity. Averaged quantities are typically one order better than the trivial entry-wise bounds indicate, i.e., we expect |N^{-1} Σ_i d_i Q_i| ≺ Ψ² = (Nη)^{-1}; indeed, this was proven in [5] in the bulk and could be extended to the edge. Due to the improvement at the edge, we now expect a bound of order Π² ≈ Im m/(Nη), but we cannot obtain this in general. In this second part of the proof, we prove a bound of the form ΠΨ ≈ √(Im m)/(Nη), which is "half-way" between the standard fluctuation averaging bound and the optimal bound. We compute the high moments of N^{-1} Σ_i d_i Q_i to achieve this bound. Interestingly, the apparently leading term in the high moment calculation already gives the optimal bound Π² (first term on the right of (6.5)), but a "cross-term" (when the derivative hits another factor of N^{-1} Σ_i d_i Q_i) is responsible for the weaker ΠΨ bound.

Another point to make is that it is not necessary to compute the high moments of another quantity for the rough averaged bound, unlike in [4, 5] and in the first part of the current proof, where we always operated with two different quantities in parallel.
Various error terms along the calculation of N^{-1} Σ_i d_i Q_i do contain T_i, but these terms can all be estimated using the entry-wise bound T_i ≺ Ψ only. Choosing a special weight sequence d_i we also improve the bound on Υ to Υ ≺ ΠΨ. In particular, we could immediately obtain an improved averaged bound on P_i = Q_i + (G_{ii} + T_i)Υ, and with a little extra effort on K_i and T_i as well, but we do not need them.

Finally, in the third part of the proof (Section 7) we obtain the optimal Π² bound for the average of the Q_i, but only for two very specially chosen weights, see (7.11)–(7.13). In fact, only the estimates on the "cross-term" need to be improved, and the weights are chosen to achieve an additional cancellation. Nevertheless, linear combinations of the Q_i with these two special sequences of weights are sufficient to invert the subordination equations and conclude that Λ_ι := ω_ι^c − ω_ι satisfies |Λ_ι| ≺ Ψ², ι = A, B. We finally notice that

N^{-1} Σ_{i=1}^N d_i (G_{ii} − (a_i − ω_B^c)^{-1})

may be expressed as a linear combination of the Q_i, see (8.40); this quantity is already stochastically bounded by ΠΨ ≲ Ψ² from the second part of the proof. Since replacing ω_B^c with ω_B yields an error of at most Ψ², we obtain (2.17), the optimal averaged law for G_{ii}.

The actual proofs are considerably more complicated than this informal summary. On the one hand, many error terms that have not been mentioned here need to be estimated; in particular, we need fluctuation averaging with random weights, a novel complication that has not been considered before. On the other hand, in this summary we used the deterministic Ψ = (Nη)^{-1/2} and Π ≈ (Im m/(Nη))^{1/2} as control parameters. In fact, Π is random, see (4.9): it contains Im m_H, which equals Im m_{μ_A ⊞ μ_B} up to a random error that itself depends on Λ := |Λ_A| + |Λ_B|. In the third part of the proof (Section 7) we obtain a self-consistent inequality for this random quantity Λ (see (7.2)).
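To indicate the flavor of this final step, here is a schematic caricature, ours and not the precise inequality (7.2), of how a self-consistent inequality is upgraded to a deterministic bound.

```latex
% Schematic caricature (not the precise (7.2)): suppose that at a fixed
% spectral parameter z one can show
\[
  \Lambda \;\prec\; \Psi^{2} + \sqrt{\Lambda}\,\Psi .
\]
% Feeding an a priori bound \Lambda \prec \Psi into the right-hand side
% improves it to \Lambda \prec \Psi^{3/2}; iterating a bounded number of
% times gives \Lambda \prec \Psi^{2-\epsilon} for any small \epsilon > 0.
% Since each \prec-bound is obtained at a fixed z only, one then decreases
% \eta = \operatorname{Im} z in small steps, using continuity of \Lambda
% in \eta, to propagate the bound through the whole spectral domain.
```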
Therefore an additional continuity argument in η is necessary to conclude a deterministic bound on Λ.

5. Entry-wise Green function subordination
In this section we prove a subordination property for the Green function entries. From this section up to Appendix B we assume, without loss of generality, that

tr A = tr B = 0. (5.1)

We define the approximate subordination functions as

ω_A^c(z) := z − tr(AG(z))/m_H(z), ω_B^c(z) := z − tr(B̃G(z))/m_H(z), z ∈ C^+. (5.2)

It will be seen that the functions ω_A^c and ω_B^c are good approximations of ω_A and ω_B defined in (2.3) with (μ_1, μ_2) = (μ_A, μ_B). Switching the rôles of A and B, and also the rôles of U and U^*, we introduce the following analogues of B̃, H, and G(z), respectively:

Ã := U^* A U, ℋ := B + Ã,
𝒢 ≡ 𝒢(z) := (ℋ − z)^{-1}. (5.3)

Observe that, by the cyclicity of the trace, ω_A^c(z) = z − tr(Ã𝒢(z))/m_ℋ(z). From (5.2) and the identity (A + B̃ − z)G = I, it is easy to check that

ω_A^c(z) + ω_B^c(z) − z = −1/m_H(z), z ∈ C^+. (5.4)

Recall the quantities S_i and T_i defined in (4.10). We will also need their variants

S̊_i ≡ S̊_i(z) := h̊_i^* B̃^⟨i⟩ G e_i = S_i − h_ii b_i G_ii, T̊_i ≡ T̊_i(z) := h̊_i^* G e_i = T_i − h_ii G_ii, (5.5)

where the diagonal random variable h_ii is removed. Further, we denote (dropping the z-dependence from the notation for brevity)

Λ_{d,i} := |G_ii − (a_i − ω_B)^{-1}|, Λ_d := max_i Λ_{d,i}, Λ_T := max_i |T_i|. (5.6)

We also define Λ^c_{d,i} and Λ^c_d analogously, by replacing ω_B with ω_B^c in the definitions of Λ_{d,i} and Λ_d, respectively. In addition, we use the notations Λ̃_{d,i}, Λ̃_d, Λ̃_T, Λ̃^c_{d,i}, Λ̃^c_d for their analogues obtained by switching the rôles of A and B, and the rôles of U and U^*, in the definitions of Λ_{d,i}, Λ_d, Λ_T, Λ^c_{d,i}, Λ^c_d; e.g.,

Λ^c_{d,i} := |G_ii − (a_i − ω_B^c)^{-1}|, Λ̃_{d,i} := |𝒢_ii − (b_i − ω_A)^{-1}|. (5.7)

Recall P_i, K_i, and Υ defined in (4.14), (4.15) and (4.12). We further observe the elementary identities

B̃G = I − (A − z)G, GB̃ = I − G(A − z). (5.8)

Using the first identity in (5.8), we can rewrite Υ, defined in (4.12), as

Υ = tr(AG) tr(B̃G) − tr G tr(B̃GA) = (1/N) Σ_{i=1}^N a_i (G_ii tr(B̃G) − (B̃G)_ii tr G). (5.9)

To ease the presentation, we further introduce the control parameter

Π_i ≡ Π_i(z) := ( Im(G_ii(z) + 𝒢_ii(z)) / (Nη) )^{1/2}, i ∈ ⟦1, N⟧. (5.10)

Note that since ‖H‖ ≲ 1 (c.f. (2.13)), it follows by spectral decomposition that Im G_ii(z) ≳ η and Im 𝒢_ii(z) ≳ η for all z ∈ D_τ(0, η_M). This implies

1/√N ≲ Π_i(z), ∀ z ∈ D_τ(0, η_M). (5.11)
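The approximate subordination functions mirror the exact subordination system behind (2.3) and (5.4). As an illustration, that exact system can be solved numerically for atomic measures by the standard fixed-point iteration; the following sketch (all function names are ours, and this is not code from the paper) checks the solver against the textbook example μ_A = μ_B = ½(δ_{−1}+δ_{+1}), whose free additive convolution is the arcsine law on [−2, 2].

```python
import numpy as np

def m_mu(atoms, weights, w):
    # Stieltjes transform m_mu(w) = sum_k weights_k / (atoms_k - w), Im w > 0.
    return np.sum(weights / (atoms - w))

def solve_subordination(a, wa, b, wb, z, n_iter=100_000, tol=1e-13):
    # Fixed-point iteration for the subordination system of mu_A ⊞ mu_B:
    #   m_{mu_A}(omega_B(z)) = m_{mu_B}(omega_A(z)) = m_{mu_A ⊞ mu_B}(z),
    #   omega_A(z) + omega_B(z) - z = -1/m_{mu_A ⊞ mu_B}(z),
    # using the updates omega_A = z - omega_B - 1/m_{mu_A}(omega_B), and
    # symmetrically for omega_B; the iterates stay in the upper half plane.
    oA, oB = z, z
    for _ in range(n_iter):
        oA_new = z - oB - 1.0 / m_mu(a, wa, oB)
        oB_new = z - oA_new - 1.0 / m_mu(b, wb, oA_new)
        done = abs(oA_new - oA) + abs(oB_new - oB) < tol
        oA, oB = oA_new, oB_new
        if done:
            break
    return oA, oB, m_mu(a, wa, oB)

# Example: mu_A = mu_B = (delta_{-1} + delta_{+1})/2.  The free additive
# convolution is the arcsine law on [-2, 2], with Stieltjes transform
# m(z) = -1/sqrt(z^2 - 4) on the upper half plane (principal branch).
z = 0.5 + 0.1j
atoms = np.array([-1.0, 1.0])
w = np.array([0.5, 0.5])
oA, oB, m = solve_subordination(atoms, w, atoms, w, z)
m_exact = -1.0 / np.sqrt(z * z - 4.0)

# A second, asymmetric pair, checked only against the subordination identities.
a2 = np.array([-1.0, 0.0, 2.0]); wa2 = np.array([0.3, 0.4, 0.3])
b2 = np.array([-2.0, 1.0]);      wb2 = np.array([0.5, 0.5])
z2 = 0.3 + 0.5j
oA2, oB2, m2 = solve_subordination(a2, wa2, b2, wb2, z2)
```

(The arcsine example is the classical check for the solver; note that its density does not vanish like a square root at the endpoints, so it is not an instance of the regular edge studied in this paper.)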
In this section we derive the following Green function subordination property.

Proposition 5.1.
Suppose that the assumptions of Theorem 2.5 hold. Fix z ∈ D_τ(η_m, η_M). Assume that

Λ_d(z) ≺ N^{-γ}, Λ̃_d(z) ≺ N^{-γ}, Λ_T(z) ≺ 1, Λ̃_T(z) ≺ 1. (5.12)

Then we have, for all i ∈ ⟦1, N⟧, that

|P_i(z)| ≺ Ψ(z), |K_i(z)| ≺ Ψ(z). (5.13)

In addition, we also have that

|Υ(z)| ≺ Ψ(z) (5.14)

and, for all i ∈ ⟦1, N⟧, that

Λ^c_{d,i}(z) ≺ Ψ(z), |T_i| ≺ Ψ(z). (5.15)

The same statements hold if we switch the rôles of A and B, and also the rôles of U and U^*.

Before the actual proof of Proposition 5.1, we establish several bounds that follow from the assumption (5.12). From the definitions in (5.6), the assumptions in (5.12), together with (3.2), we see that

max_{i ∈ ⟦1,N⟧} |G_ii| ≺ 1, max_{i ∈ ⟦1,N⟧} |T_i| ≺ 1. (5.16)

Analogously, we also have max_{i ∈ ⟦1,N⟧} |𝒢_ii| ≺ 1. Hence, under (5.12), we see that

max_{i ∈ ⟦1,N⟧} Π_i(z) ≺ Ψ(z).

Moreover, using the identities in (5.8), we also get from the first bound in (5.16) that

max_{i ∈ ⟦1,N⟧} |(XGY)_ii| ≺ 1, X, Y = I or B̃. (5.17)

In addition, from (2.11) we see that

(1/N) Σ_{i=1}^N (a_i − ω_B(z))^{-1} = m_{μ_A}(ω_B(z)) = m_{μ_A ⊞ μ_B}(z). (5.18)

Then the first bound in (5.12), together with (5.18), (5.8), (3.3) and (3.2), leads to the following estimates:

tr G = m_{μ_A ⊞ μ_B} + O_≺(N^{-γ}),
tr(B̃G) = (z − ω_B) m_{μ_A ⊞ μ_B} + O_≺(N^{-γ}),
tr(B̃GB̃) = (ω_B − z)(1 + (ω_B − z) m_{μ_A ⊞ μ_B}) + O_≺(N^{-γ}). (5.19)

Furthermore, by (3.2), (3.3), and (5.18), all the above tracial quantities are O_≺(1). This also implies that |Υ| ≺ 1 (c.f. (4.12)). Moreover, from (5.2) and the first two equations in (5.19), we get the following rough estimate under (5.12) and (3.2):

ω_B^c = ω_B + O_≺(N^{-γ}). (5.20)

Proof of Proposition 5.1.
To prove (5.13), it suffices to show the high order moment estimates

E[|P_i|^{2p}] ≺ Ψ^{2p}, E[|K_i|^{2p}] ≺ Ψ^{2p}, (5.21)

for any fixed p ∈ N. Let us introduce the notations

m_i^{(k,l)} := P_i^k P̄_i^l, n_i^{(k,l)} := K_i^k K̄_i^l, k, l ∈ N, i ∈ ⟦1, N⟧. (5.22)

Further, we make the following convention in the rest of the paper: the notation O_≺(Ψ^k), for any given integer k, represents a generic (possibly z-dependent) random variable X ≡ X(z) which satisfies

|X| ≺ Ψ^k, and E|X|^q ≺ Ψ^{qk}, for any given positive integer q.

The first bound above follows directly from the original definition of the notation O_≺(·). It turns out to be convenient to require the second one in our discussions below as well; it will be clear that the second bound always follows from the first one whenever this notation is used. For more details, we refer to the paragraph above Proposition 6.1 in [5]. Analogously, for all notation of the form O_≺(Γ) with some deterministic control parameter Γ, we make the same convention. With the definitions in (5.22) and the convention made above, we have the following recursive moment estimates. This type of estimate was first used in [22] to derive local laws for sparse Wigner matrices.

Lemma 5.2 (Recursive moment estimate for P_i and K_i). Suppose the assumptions of Proposition 5.1 hold. For any fixed integer p ≥ 1 and any i ∈ ⟦1, N⟧, we have

E[m_i^{(p,p)}] = E[O_≺(Ψ) m_i^{(p−1,p)}] + E[O_≺(Ψ²) m_i^{(p−2,p)}] + E[O_≺(Ψ²) m_i^{(p−1,p−1)}], (5.23)
E[n_i^{(p,p)}] = E[O_≺(Ψ) n_i^{(p−1,p)}] + E[O_≺(Ψ²) n_i^{(p−2,p)}] + E[O_≺(Ψ²) n_i^{(p−1,p−1)}], (5.24)

where we make the conventions m_i^{(0,0)} = n_i^{(0,0)} = 1 and m_i^{(−1,p)} = n_i^{(−1,p)} = 0 when p = 1.

Although in the statement of Lemma 5.2 we use Ψ, in the proof we actually obtain better estimates, in terms of Π_i instead of Ψ, for some of the error terms.
We will keep the stronger form of these estimates, since the same errors will appear in the averaged bounds in Section 6 as well. The average of these errors is typically smaller than Ψ².

Proof of Lemma 5.2.
The proof is very similar to that of Lemma 7.3 of [6], which is presented for theblock additive model in the bulk regime. It suffices to go through the strategy in [6] for our additivemodel again. The strategy also works well at the regular edge, provided (3.2) and (3.3) hold. In addition,instead of the control parameter Ψ used in the proof of Lemma 7.3 of [6], we aim here at controlling manyerrors in terms of Π i . This requires a more careful estimate on the error terms. Due to the similarity tothe proof of Lemma 7.3 of [6], we only sketch the proof of Lemma 5.2 in the sequel.For each i ∈ J , N K , we write E [ m ( p,p ) i ] = E [ P i m ( p − ,p ) i ] = E [( e BG ) ii tr G m ( p − ,p ) i ] + E (cid:2)(cid:0) − G ii tr e BG + ( G ii + T ii )Υ (cid:1) m ( p − ,p ) i (cid:3) , (5.25)respectively, E [ n ( p,p ) i ] = E [ K i n ( p − ,p ) i ] = E [ T i n ( p − ,p ) i ] + E (cid:2)(cid:0) ( b i T i + ( e BG ) ii )tr G − ( G ii + T i )tr e BG (cid:1) n ( p − ,p ) i (cid:3) . (5.26)Using the fact e ∗ i R i = − h ∗ i ( c.f., (4.7)), we can write( e BG ) ii = e ∗ i R i e B h i i R i G e i = − h ∗ i e B h i i R i G e i = − h ∗ i e B h i i G e i + ℓ i h ∗ i e B h i i ( e i + h i )( e i + h i ) ∗ G e i = − S i + ℓ i ( b i h ii + h ∗ i e B h i i h i )( G ii + T i ) = − ˚ S i + ε i , (5.27) where S i and ˚ S i are defined in (4.10) and (5.5), respectively, and ε i := (cid:0) ( ℓ i − b i h ii + ℓ i h ∗ i e B h i i h i (cid:1) G ii + ℓ i (cid:0) b i h ii + h ∗ i e B h i i h i (cid:1) T i . (5.28)With the aid of Lemma A.1, it is elementary to check | h ii | ≺ √ N , | ℓ i − | ≺ √ N , | h ∗ i e B h i i h i | ≺ √ N , (5.29)where in the last inequality we also used the fact that tr e B h i i = tr B = 0, under the convention (5.1).Applying the bounds in (5.16) and (5.29), it is easy to see that | ε i | ≺ √ N . 
(5.30)Substituting (5.27) and (5.30) into the first term on the right hand side of (5.25), we have E [( e BG ) ii tr G m ( p − ,p ) i ] = − E [˚ S i tr G m ( p − ,p ) i ] + E [ O ≺ ( N − ) m ( p − ,p ) i ] , (5.31)where for the second term on the right hand side above we also used tr G = O ≺ (1); c.f., (5.19). We recallthe definition of ˚ S i from (5.5) and rewrite˚ S i = ( i ) X k ¯ g ik k g i k e ∗ k e B h i i G e i . Hereafter, we use the notation P ( i ) k to represent the sum over k ∈ J , N K \ { i } . Thus, the first term onthe right of (5.31) is of the form E [ P ( i ) k ¯ g ik h· · · i ], where h· · · i can be regarded as a function of the ¯ g ik ’sand the g ik ’s. Recall the following integration by parts formula for complex centered Gaussian variables, Z C ¯ gf ( g, ¯ g )e − | g | σ d g = σ Z C ∂ g f ( g, ¯ g )e − | g | σ d g , (5.32)for any differentiable function f : C → C . Applying (5.32) to the first term on the right of (5.31), we get E [˚ S i tr G m ( p − ,p ) i ] = 1 N ( i ) X k E h k g i k ∂ ( e ∗ k e B h i i G e i ) ∂g ik tr G m ( p − ,p ) i i + 1 N ( i ) X k E h ∂ k g i k − ∂g ik e ∗ k e B h i i G e i tr G m ( p − ,p ) i i + 1 N ( i ) X k E h e ∗ k e B h i i G e i k g i k ∂ tr G∂g ik m ( p − ,p ) i i + p − N ( i ) X k E h e ∗ k e B h i i G e i k g i k tr G ∂P i ∂g ik m ( p − ,p ) i i + pN ( i ) X k E h e ∗ k e B h i i G e i k g i k tr G ∂P i ∂g ik m ( p − ,p − i i . (5.33)Analogously, by T i = ˚ T i + h ii G ii , (5.5), the first bound in (5.16), the first bound in (5.29), and also(5.11), we can write the first term on the right hand side of (5.26) as E [ T i n ( p − ,p ) i ] = E [˚ T i n ( p − ,p ) i ] + E [ O ≺ ( N − ) n ( p − ,p ) i ] . 
(5.34)Similarly to (5.33), applying the integration by parts formula, we obtain E [˚ T i n ( p − ,p ) i ] = 1 N ( i ) X k E h k g i k ∂ ( e ∗ k G e i ) ∂g ik n ( p − ,p ) i i + 1 N ( i ) X k E h ∂ k g i k − ∂g ik e ∗ k G e i n ( p − ,p ) i i + p − N ( i ) X k E h e ∗ k G e i k g i k ∂K i ∂g ik n ( p − ,p ) i i + pN ( i ) X k E h e ∗ k G e i k g i k ∂K i ∂g ik n ( p − ,p − i i . (5.35)First, we consider the first term on the right side of (5.33). Recall ℓ i from (4.5). For brevity, we set c i := ℓ i k g i k . (5.36)It is elementary to derive that ∂G∂g ik = c i (cid:0) G e k ( e i + h ∗ i ) e B h i i R i G + GR i e B h i i e k ( e i + h i ) ∗ G (cid:1) + ∆ G ( i, k ) . (5.37) Here ∆ G ( i, k ) is a small remainder, defined as∆ G ( i, k ) := − G ∆ R ( i, k ) e B h i i R i G − GR i e B h i i ∆ R ( i, k ) G, (5.38)where ∆ R ( i, k ) := ℓ i k g i k ¯ g ik (cid:0) e i h ∗ i + h i e ∗ i + 2 h i h ∗ i (cid:1) − ℓ i k g i k g ii ¯ g ik (cid:0) e i + h i (cid:1)(cid:0) e i + h i (cid:1) ∗ . (5.39)The ∆ G ( i, k )’s are irrelevant error terms. We handle quantities with ∆ G ( i, k ) separately in Appendix B.Similarly to (7.55) of [6], using (5.37), we can get1 N ( i ) X k ∂ ( e ∗ k e B h i i G e i ) ∂g ik = − c i N ( i ) X k e ∗ k e B ( i ) G e k ( b i T i + ( e BG ) ii )+ c i N ( i ) X k e ∗ k e B h i i GR i e B h i i e k ( G ii + T i ) + 1 N ( i ) X k e ∗ k e B h i i ∆ G ( i, k ) e i . (5.40)Note that T i naturally appears in the first term of (5.33) after integrating by parts the ˚ S i term. Thisexplains why we need to study the high moments of K i to get another equation. Now, we claim that1 N ( i ) X k e ∗ k e B ( i ) G e k = tr e BG + O ≺ (Π i ) , N ( i ) X k e ∗ k e B h i i GR i e B h i i e k = tr e BG e B + O ≺ (Π i ) , (5.41)with Π i given in (5.10). We state the proof for the first estimate in (5.41). 
Note that1 N ( i ) X k e ∗ k e B ( i ) G e k = tr e B h i i G − N ( e B h i i G ) ii = tr e B h i i G + O ≺ ( 1 N ) , (5.42)where the last step follows from the identity ( e B h i i G ) ii = b i G ii and (5.16). Then, using that e B h i i = R i e BR i and R i = I − r i r ∗ i ( c.f., (4.2)), we see thattr e BG − tr e B h i i G = tr e BG − tr R i e BR i G = 1 N r ∗ i e BG r i + 1 N r ∗ i G e B r i − N r ∗ i e B r i r ∗ i G r i . Using (4.6), ℓ i = 1 + O ≺ ( √ N ) and k r ∗ i e B k .
1, we get by Cauchy-Schwarz that (cid:12)(cid:12) r ∗ i e BG r i (cid:12)(cid:12) . (cid:16) k G e i k + k G h i k (cid:17) = (cid:16) Im ( G ii + h ∗ i G h i ) η (cid:17) = (cid:16) Im ( G ii + G ii ) η (cid:17) , with G given in (5.3), where in the last step we used h ∗ i G h i = u ∗ i G u i = e ∗ i U ∗ GU e i = G ii (5.43)and the identities | G | = η Im G and |G| = η Im G . Similarly, we have (cid:12)(cid:12) r ∗ i G e B r i (cid:12)(cid:12) . (cid:16) Im ( G ii + G ii ) η (cid:17) , (cid:12)(cid:12) r ∗ i G r i (cid:12)(cid:12) . (cid:16) Im ( G ii + G ii ) η (cid:17) . Hence, we have (cid:12)(cid:12) tr e BG − tr e B h i i G (cid:12)(cid:12) . N (cid:16) Im ( G ii + G ii ) η (cid:17) . Im ( G ii + G ii ) N η = O ≺ (Π i ) , (5.44)where in the second step, we used the fact Im G ii , Im G ii & η . Combining (5.42) with (5.44) we obtainthe first estimate of (5.41). The second estimate in (5.41) is proved in the same way.Hence, using (5.41) and the first estimate in (B.1), we obtain from (5.40) that1 N ( i ) X k ∂ ( e ∗ k e B h i i G e i ) ∂g ik = − c i tr e BG (cid:0) b i T i + ( e BG ) ii (cid:1) + c i tr e BG e B (cid:0) G ii + T i (cid:1) + O ≺ (Π i ) . (5.45)Analogously, we can show that1 N ( i ) X k ∂ ( e ∗ k G e i ) ∂g ik = − c i tr G (cid:0) b i T i + ( e BG ) ii (cid:1) + c i tr e BG (cid:0) G ii + T i (cid:1) + O ≺ (Π i ) . (5.46) Using (5.26), (5.34), (5.35) and (5.46) and the estimate c i k g i k = 1 + O ≺ ( √ N ), we obtain E [ n ( p,p ) i ] = E h O ≺ (Ψ) n ( p − ,p ) i i + 1 N ( i ) X k E h ∂ k g i k − ∂g ik e ∗ k G e i n ( p − ,p ) i i + p − N ( i ) X k E h e ∗ k G e i k g i k ∂K i ∂g ik n ( p − ,p ) i i + pN ( i ) X k E h e ∗ k G e i k g i k ∂K i ∂g ik n ( p − ,p − i i . 
(5.47)Then, combining (5.45) with (5.46), we obtain1 N ( i ) X k ∂ ( e ∗ k e B h i i G e i ) ∂g ik tr G = − c i ( G ii + T i ) (cid:0) tr e BG − Υ (cid:1) + 1 N ( i ) X k ∂ ( e ∗ k G e i ) ∂g ik tr e BG + O ≺ (Π i )= − c i ( G ii + T i ) (cid:0) tr e BG − Υ (cid:1) + ˚ T i tr e BG + (cid:16) N ( i ) X k ∂ ( e ∗ k G e i ) ∂g ik − ˚ T i (cid:17) tr e BG + O ≺ (Π i ) . (5.48)Recall the definition of c i from (5.36). It is elementary to check that c i = k g i k − h ii − (cid:0) k g i k − (cid:1) + O ≺ ( 1 N ) . (5.49)Plugging (5.49) into (5.48) and also using the second equation in (5.5), we can write1 N ( i ) X k ∂ ( e ∗ k e B h i i G e i ) ∂g ik tr G = −k g i k (cid:0) G ii tr e BG − ( G ii + T i )Υ (cid:1) + (cid:16) N ( i ) X k ∂ ( e ∗ k G e i ) ∂g ik − k g i k ˚ T i (cid:17) tr e BG + ε i + O ≺ (Π i ) , (5.50)where ε i collects irrelevant terms ε i := (cid:0) k g i k − c i (cid:1)(cid:0) G ii tr e BG − ( G ii + T i )Υ (cid:1) + (cid:0) k g i k ˚ T i − c i T i (cid:1) tr e BG = (cid:0) k g i k − (cid:1) G ii tr e BG − (cid:0) h ii + (cid:0) k g i k − (cid:1)(cid:1) ( G ii + T i )Υ+ (cid:0) h ii + (cid:0) k g i k − (cid:1)(cid:1) T i tr e BG + O ≺ (cid:0) N (cid:1) . (5.51)From the estimates | h ii | ≺ √ N , k g i k = 1+ O ≺ ( √ N ), (5.16) and the observation that the tracial quantitiesare O ≺ (1), we see that ε i = O ≺ (cid:0) √ N (cid:1) . 
(5.52)Combining (5.25), (5.27), (5.33) and (5.50), we have E [ m ( p,p ) i ] = − E [(˚ S i + ε i )tr G m ( p − ,p ) i ] + E (cid:2)(cid:0) − G ii tr e BG + ( G ii + T ii )Υ (cid:1) m ( p − ,p ) i (cid:3) = E h(cid:16) ˚ T i − k g i k N ( i ) X k ∂ ( e ∗ k G e i ) ∂g ik (cid:17) tr e BG m ( p − ,p ) i i − N ( i ) X k E h ∂ k g i k − ∂g ik e ∗ k e B h i i G e i tr G m ( p − ,p ) i i − N ( i ) X k E h e ∗ k e B h i i G e i k g i k ∂ tr G∂g ik m ( p − ,p ) i i − p − N ( i ) X k E h e ∗ k e B h i i G e i k g i k tr G ∂P i ∂g ik m ( p − ,p ) i i − pN ( i ) X k E h e ∗ k e B h i i G e i k g i k tr G ∂P i ∂g ik m ( p − ,p − i i + E h(cid:16) ε i tr G − k g i k ε i + O ≺ (Π i ) (cid:17) m ( p − ,p ) i i . (5.53)For the first term on the right of (5.53), analogously to (5.35), applying (5.32) to the ˚ T i -term, we get E h(cid:16) ˚ T i − k g i k N ( i ) X k ∂ ( e ∗ k G e i ) ∂g ik (cid:17) tr e BG m ( p − ,p ) i i = 1 N ( i ) X k E h k g i k ∂ tr e BG∂g ik e ∗ k G e i tr e BG m ( p − ,p ) i i + 1 N ( i ) X k E h ∂ k g i k − ∂g ik e ∗ k G e i tr e BG m ( p − ,p ) i i + p − N ( i ) X k E h e ∗ k G e i k g i k ∂P i ∂g ik tr e BG m ( p − ,p ) i i + pN ( i ) X k E h e ∗ k G e i k g i k ∂P i ∂g ik tr e BG m ( p − ,p − i i . (5.54)Recall the estimates of ε i and ε i in (5.30) and (5.52), respectively, which implies that | ε i | ≺ Ψ and | ε i | ≺ Ψ. Therefore, to show (5.23), it suffices to estimate the second to the fifth terms on the right sideof (5.53), and all the terms on the right side of (5.54). Similarly, in light of (5.26), (5.34), and (5.46), toshow (5.24), it suffices to estimate the last three terms on the right side of (5.47). All these terms canbe estimated based on the following lemma.
Lemma 5.3.
Suppose the assumptions in Proposition 5.1 hold. Set X i = I or e B h i i . Let Q be any(possibly random) diagonal matrix satisfying k Q k ≺ and X = I or A . We have the following estimates N ( i ) X k ∂ k g i k − ∂g ik e ∗ k X i G e i = O ≺ ( 1 N ) , N ( i ) X k e ∗ i X ∂G∂g ik e i e ∗ k X i G e i = O ≺ (Π i ) , N ( i ) X k ∂T i ∂g ik e ∗ k X i G e i = O ≺ (Π i ) , N ( i ) X k tr (cid:16) QX ∂G∂g ik (cid:17) e ∗ k X i G e i = O ≺ (cid:0) Ψ Π i (cid:1) , N ( i ) X k tr (cid:16) QX ∂G∂g ik (cid:17) e ∗ k X i ˚ g i = O ≺ (cid:0) Ψ Π i (cid:1) . (5.55) In addition, the same estimates hold if we replace ∂G∂g ik and ∂T i ∂g ik by their complex conjugates ∂G∂g ik and ∂T i ∂g ik in the last four equations above. The proof of Lemma 5.3 will be postponed to Appendix B. With the aid of Lemma 5.3, the remainingproof of Lemma 5.2 is the same as the counterpart to the proof of Lemma 7.3 in [6]. The only differenceis that we use the improved bounds in Lemma 5.3 instead of those in Lemma 7.4 in [6]. Specifically, theestimates for the second term of (5.47), the second term of (5.53), and the second term of (5.54) followfrom the first equation in (5.55). The third term of (5.53) and the first term of (5.54) can be estimatedby the last equation in (5.55), after writing tr e BG = 1 − tr ( A − z ) G . All the other terms have ∂K i ∂g ik and ∂P i ∂g ik or their complex conjugate involved. Recall the definitions in (4.14) and (4.15), and also the firstequation in (5.8). Then, by the chain rule, we see that all terms in (5.47), (5.53) and (5.54), with ∂K i ∂g ik and ∂P i ∂g ik or their complex conjugate counterparts involved, can be estimated by combining the last threeequations in (5.55). This completes the proof of Lemma 5.2. (cid:3) With Lemma 5.2, we can complete the proof of Proposition 5.1. The proof is nearly the same as thatfor Theorem 7.2 in [6]. 
For the convenience of the reader, we sketch it below. First, using Young's inequality, we obtain from (5.23) that, for any given (small) ε > 0,

E[m_i^{(p,p)}] ≲ N^{2pε} Ψ^{2p} + N^{−2pε/(2p−1)} E[m_i^{(p,p)}],

so that E[|P_i|^{2p}] ≺ N^{2pε} Ψ^{2p}. Since ε > 0 is arbitrary, Markov's inequality yields |P_i| ≺ Ψ; in the same way, (5.24) gives |K_i| ≺ Ψ. This proves (5.13). We next show that

Λ_T(z) ≺ N^{-γ}. (5.56)

From the definition in (4.15), we can rewrite the second estimate in (5.13) as

(1 + b_i tr G − tr(B̃G)) T_i = G_ii tr(B̃G) − (B̃G)_ii tr G + O_≺(Ψ). (5.57)

Using the identity

(B̃G)_ii = 1 − (a_i − z) G_ii(z), (5.58)

and approximating G_ii by (a_i − ω_B)^{-1}, we get from (5.12) and (3.2) that

(B̃G)_ii = (z − ω_B)/(a_i − ω_B) + O_≺(N^{-γ}). (5.59)

We also recall the estimates of the tracial quantities in (5.19) under the assumption (5.12). Plugging (5.59), (5.19) and the first bound in (5.12) into (5.57), we get

(1 + (b_i − z + ω_B) m_{μ_A ⊞ μ_B} + O_≺(N^{-γ})) T_i = O_≺(N^{-γ}) + O_≺(Ψ) = O_≺(N^{-γ}), (5.60)

where in the last step we used that Ψ ≤ N^{-γ} for all η ≥ η_m. From the second line in (2.11), we note that

1 + (b_i − z + ω_B) m_{μ_A ⊞ μ_B} = m_{μ_A ⊞ μ_B} (1/m_{μ_A ⊞ μ_B} + b_i − z + ω_B) = m_{μ_A ⊞ μ_B} (b_i − ω_A).

Using (3.2) and ‖A‖, ‖B‖ ≤ C, we get |m_{μ_A ⊞ μ_B} (b_i − ω_A)| ≳ 1. This together with (5.60) implies (5.56).

To prove (5.14), we recall the definition of P_i in (4.14), which implies that

(1/N) Σ_{i=1}^N (G_ii + T_i) Υ = (1/N) Σ_{i=1}^N P_i = O_≺(Ψ). (5.61)

Using the facts that (1/N) Σ_{i=1}^N G_ii = m_{μ_A ⊞ μ_B} + O_≺(N^{-γ}) (c.f. (5.19)) and (1/N) Σ_{i=1}^N T_i = O_≺(N^{-γ}), together with |m_{μ_A ⊞ μ_B}| ≳ 1, we get (5.14) from (5.61).

Then, combining (5.14) with the first estimate in (5.13), we get

(B̃G)_ii tr G − G_ii tr(B̃G) = O_≺(Ψ). (5.62)

Applying the identity (5.58) and the definition of ω_B^c, we can rewrite (5.62) as

((a_i − ω_B^c) G_ii − 1) tr G = O_≺(Ψ).

As shown above, |tr G| ≳ 1, so that (a_i − ω_B^c) G_ii − 1 = O_≺(Ψ). By (5.20) and (3.2), we also note that |a_i − ω_B^c| ≳ 1, and the first estimate in (5.15) follows. Plugging these improved inputs back into (5.57), whose right hand side is O_≺(Ψ) by (5.62), the second estimate in (5.15) follows as well. This completes the proof of Proposition 5.1. □

6. Rough fluctuation averaging for general linear combinations
In this section we prove a rough fluctuation averaging estimate for the basic quantities Q_i defined in (4.11). From (5.62), we see that

|Q_i| ≺ Ψ. (6.1)

Recall the definitions of the control parameters Π and Π_i in (4.9) and (5.10), respectively. The following proposition states that the average of the Q_i is typically smaller than an individual Q_i.

Proposition 6.1.
Fix z ∈ D_τ(η_m, η_M). Suppose that the assumptions of Proposition 5.1 hold. Set X_i = I or B̃^⟨i⟩. Let d_1, ..., d_N ∈ C be possibly H-dependent quantities satisfying max_j |d_j| ≺ 1. Assume that they depend only weakly on the randomness, in the sense that, for all j ∈ ⟦1, N⟧,

(1/N) Σ_{i=1}^N Σ_k^{(i)} (∂d_j/∂g_ik) e_k^* X_i G e_i = O_≺(Ψ²Π²), (1/N) Σ_{i=1}^N Σ_k^{(i)} (∂d_j/∂g_ik) e_k^* X_i g̊_i = O_≺(Ψ²Π²), (6.2)

and the same bounds hold when the d_j's are replaced by their complex conjugates d̄_j. Suppose that Π(z) ≺ Π̂(z) for some deterministic positive function Π̂(z) that satisfies (N√η)^{-1/2} ≺ Π̂ ≺ Ψ. Then

|(1/N) Σ_{i=1}^N d_i Q_i| ≺ Ψ Π̂. (6.3)

We remark that whenever the d_j's are deterministic, (6.2) trivially holds. However, we will also need (6.3) for certain random d_j's that satisfy (6.2). For any d_i's satisfying the assumptions of Proposition 6.1, we introduce the notation

m^{(k,l)} := ((1/N) Σ_{i=1}^N d_i Q_i)^k ((1/N) Σ_{i=1}^N d̄_i Q̄_i)^l, k, l ∈ N. (6.4)

Similarly to Lemma 5.2, it suffices to prove the following recursive moment estimate.
Lemma 6.2.
Fix z ∈ D_τ(η_m, η_M). Suppose that the assumptions of Proposition 6.1 hold. Then, for any fixed integer p ≥ 1, we have

E[m^{(p,p)}] = E[O_≺(Π̂²) m^{(p−1,p)}] + E[O_≺(Ψ²Π̂²) m^{(p−2,p)}] + E[O_≺(Ψ²Π̂²) m^{(p−1,p−1)}]. (6.5)

Proof of Proposition 6.1.
Similarly to the proof of (5.13) from Lemma 5.2, given Lemma 6.2, we can obtain (6.3) by applying Young's and Markov's inequalities. This completes the proof of Proposition 6.1. □
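The engine behind the moment computations in the proof below is the integration by parts formula (5.32) for complex centered Gaussians, E[ḡ f(g, ḡ)] = σ² E[∂_g f(g, ḡ)]. A quick Monte Carlo sanity check, with a simple test function of our own choosing (f(g, ḡ) = g²ḡ, for which both sides equal 2σ⁴):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.0
n = 1_000_000

# complex centered Gaussian with E|g|^2 = sigma^2
x = rng.standard_normal(n)
y = rng.standard_normal(n)
g = sigma * (x + 1j * y) / np.sqrt(2.0)

# test function f(g, gbar) = g^2 gbar, with d/dg f = 2 g gbar
f = g**2 * np.conj(g)
df = 2.0 * g * np.conj(g)

lhs = np.mean(np.conj(g) * f)   # Monte Carlo estimate of E[gbar f]
rhs = sigma**2 * np.mean(df)    # Monte Carlo estimate of sigma^2 E[d_g f]
# both sides approximate 2 sigma^4
```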
Proof of Lemma 6.2.
We first claim that it suffices to prove the following statement: if |Υ(z)| ≺ Υ̂(z) for some deterministic positive function Υ̂(z) ≤ Ψ(z), then

E[m^{(p,p)}] = E[(O_≺(Π̂²) + O_≺(ΨΥ̂)) m^{(p−1,p)}] + E[O_≺(Ψ²Π̂²) m^{(p−2,p)}] + E[O_≺(Ψ²Π̂²) m^{(p−1,p−1)}]. (6.6)

Indeed, similarly to the proof of (5.13) from Lemma 5.2, we can again apply Young's and Markov's inequalities to get, for any d_i's satisfying the assumptions of Proposition 6.1, that (6.6) implies

|(1/N) Σ_{i=1}^N d_i Q_i| ≺ Π̂² + ΨΥ̂ + ΨΠ̂ ≺ ΨΥ̂ + ΨΠ̂, (6.7)

where in the last step we used the assumption Π̂ ≺ Ψ. Next, recall from (5.9) that

Υ = −(1/N) Σ_{i=1}^N a_i Q_i.

Choosing d_i = a_i for all i, we get from (6.7)

|Υ| ≺ ΨΥ̂ + ΨΠ̂ ≺ N^{-γ} Υ̂ + ΨΠ̂. (6.8)

Using the right hand side of (6.8) as a new deterministic bound on Υ instead of the initial Υ̂ in (6.6), and performing the above argument iteratively, we finally get

|Υ| ≺ ΨΠ̂.

Hence, at the end, we can choose Υ̂ = ΨΠ̂ in (6.6) and get

E[m^{(p,p)}] = E[(O_≺(Π̂²) + O_≺(Ψ²Π̂)) m^{(p−1,p)}] + E[O_≺(Ψ²Π̂²) m^{(p−2,p)}] + E[O_≺(Ψ²Π̂²) m^{(p−1,p−1)}]. (6.9)

Observe that, by the assumption (N√η)^{-1/2} ≺ Π̂, we also have Ψ² ≺ Π̂ on D_τ(η_m, η_M). Then the O_≺(Ψ²Π̂) term can be absorbed into the O_≺(Π̂²) term in (6.9). Hence we conclude (6.5) from (6.6). Therefore, in the sequel, we focus on proving (6.6).

Denote D := diag(d_i)_{i=1}^N. We first write

(1/N) Σ_{i=1}^N d_i Q_i = (1/N) Σ_{i=1}^N (B̃G)_ii (d_i tr G − tr(DG)) = (1/N) Σ_{i=1}^N (B̃G)_ii tr G τ_i, (6.10)

where we introduced the notation

τ_i := d_i − tr(DG)/tr G. (6.11)

Similarly to the proof of (5.13), we approximate (B̃G)_ii by −S̊_i (c.f. (5.27)), and then perform integration by parts using (5.32) with respect to g̊_i in S̊_i.
More specifically, we write E (cid:2) m ( p,p ) (cid:3) = 1 N N X i =1 E h ( e BG ) ii tr Gτ i m ( p − ,p ) i = − N N X i =1 E h ˚ S i tr Gτ i m ( p − ,p ) i + E h ε m ( p − ,p ) i , (6.12) where we used the notation ε := 1 N N X i =1 ε i tr Gτ i . (6.13)Here ε i is defined in (5.28). To ease the presentation, we further introduce the notation τ i := − τ i tr e BG. (6.14)Using assumption (5.12), (5.19), and also (3.2), one checks that | τ i | ≺ | τ i | ≺
1, for all i ∈ J , N K .Similarly to (5.33), applying (5.32) to the first term on the right hand side of (6.12), we obtain1 N N X i =1 E h ˚ S i tr Gτ i m ( p − ,p ) i = 1 N N X i =1 ( i ) X k E h k g i k ∂ ( e ∗ k e B h i i G e i ) ∂g ik tr Gτ i m ( p − ,p ) i + 1 N N X i =1 ( i ) X k E h ∂ k g i k − ∂g ik e ∗ k e B h i i G e i tr Gτ i m ( p − ,p ) i + 1 N N X i =1 ( i ) X k E h k g i k e ∗ k e B h i i G e i ∂ (tr Gτ i ) ∂g ik m ( p − ,p ) i + p − N N X i =1 ( i ) X k E h k g i k e ∗ k e B h i i G e i tr Gτ i (cid:16) N N X j =1 ∂ ( d j Q j ) ∂g ik (cid:17) m ( p − ,p ) i + pN N X i =1 ( i ) X k E h k g i k e ∗ k e B h i i G e i tr Gτ i (cid:16) N N X j =1 ∂ ( d j Q j ) ∂g ik (cid:17) m ( p − ,p − i i . (6.15)First, we estimate the first term on the right hand side of (6.15). Using (5.50) and the bound1 N N X i =1 Π i ≤ , we have 1 N N X i =1 ( i ) X k k g i k ∂ ( e ∗ k e B h i i G e i ) ∂g ik tr Gτ i = − N N X i =1 (cid:0) G ii tr e BG − ( G ii + T i )Υ (cid:1) τ i + 1 N N X i =1 ( i ) X k (cid:16) ˚ T i − k g i k ∂ ( e ∗ k G e i ) ∂g ik (cid:17) τ i + ε + O ≺ (Π ) , where we have introduced ε := 1 N N X i =1 k g i k τ i ε i ; (6.16)see (5.51) for the definition of ε i . According to the definition in (6.11), we observe that1 N N X i =1 (cid:0) G ii tr e BG − ( G ii + T i )Υ (cid:1) τ i = 1 N N X i =1 G ii τ i (cid:0) tr e BG − Υ (cid:1) − N N X i =1 T i τ i Υ = O ≺ (Ψ ˆΥ) . 
Here in the last step we used the facts N X i =1 G ii τ i = 0 , N N X i =1 T i τ i Υ = O ≺ (Ψ ˆΥ) , (6.17)where the second estimate is implied by the second estimate in (5.15), and the assumption that | Υ | ≺ ˆΥ.Therefore, for the first term on the right hand side of (6.15), we have1 N N X i =1 ( i ) X k E h k g i k ∂ ( e ∗ k e B h i i G e i ) ∂g ik tr Gτ i m ( p − ,p ) i = 1 N N X i =1 ( i ) X k E h(cid:16) ˚ T i − k g i k ∂ ( e ∗ k G e i ) ∂g ik (cid:17) τ i m ( p − ,p ) i + E (cid:2) ( ε + O ≺ (Π ) + O ≺ (Ψ ˆΥ)) m ( p − ,p ) (cid:3) = 1 N N X i =1 ( i ) X k E h ∂ k g i k − ∂g ik e ∗ k G e i τ i m ( p − ,p ) i + 1 N N X i =1 ( i ) X k E h k g i k ∂τ i ∂g ik e ∗ k G e i m ( p − ,p ) i + p − N N X i =1 ( i ) X k E h k g i k e ∗ k G e i τ i (cid:16) N N X j =1 ∂ ( d j Q j ) ∂g ik (cid:17) m ( p − ,p ) i + pN N X i =1 ( i ) X k E h k g i k e ∗ k G e i τ i (cid:16) N N X j =1 ∂ ( d j Q j ) ∂g ik (cid:17) m ( p − ,p − i + E (cid:2)(cid:0) ε + O ≺ (Π ) + O ≺ (Ψ ˆΥ) (cid:1) m ( p − ,p ) (cid:3) , (6.18)where the second equation is obtained analogously to (5.54), by writing ˚ T i = P ( i ) k ¯ g ik e ∗ k G e i / k g i k andperforming integration by parts with respect to the g ik ’s.According to (6.12), (6.15), and (6.18), it suffices to estimate the last term on the right side of (6.12),the last four terms on the right side of (6.15), and all the terms on the right side of (6.18). All the desiredestimates can be derived from the following lemma. Lemma 6.3.
Fix a z ∈ D τ ( η m , η M ) . Suppose that the assumptions of Proposition 6.1 hold, especially(6.2) holds for d , . . . , d N in the definition (6.4). Let ˜ d , . . . , ˜ d N ∈ C be any (possibly random) numberswith the bound max i | ˜ d i | ≺ . Let Q be any (possibly random) diagonal matrix that satisfies k Q k ≺ .Set X = I or A , and set X i = I or e B h i i . Then we have N N X i =1 ( i ) X k ˜ d i ∂ k g i k − ∂g ik e ∗ k X i G e i = O ≺ ( 1 N ) , (6.19)1 N N X i =1 ( i ) X k ˜ d i tr (cid:16) QX ∂G∂g ik (cid:17) e ∗ k X i G e i = O ≺ (Ψ Π ) , (6.20) and the same estimate holds if we replace ∂G∂g ik by the complex conjugate ∂G∂g ik in (6.20). Further, we have E (cid:2) ε j m ( p − ,p ) (cid:3) = E (cid:2) O ≺ ( ˆΠ ) m ( p − ,p ) (cid:3) + E (cid:2) O ≺ (Ψ ˆΠ ) m ( p − ,p ) (cid:3) + E (cid:2) O ≺ (Ψ ˆΠ ) m ( p − ,p − (cid:3) , j = 1 , . (6.21)We postpone the proof of Lemma 6.3 and continue with the proof of Lemma 6.2 instead.The second term of (6.15) and the first term of (6.18) are directly estimated by (6.19). Using the defi-nition of τ i in (6.11) and of τ i in (6.14), the boundedness of the tracial quantities ( c.f., (5.19)), and thechain rule, we get the estimate on the third term of (6.15) and the second term of (6.18), using (6.20)and the assumption (6.2). For the last two terms of (6.15), and the third and fourth terms of (6.18),we note that1 N N X j =1 d j Q j = tr D e BG tr G − tr e BG tr DG = tr D tr G − tr DG − tr DAG tr G + tr AG tr DG , where in the last step we used the first identity of (5.8). Hence, by the chain rule, the fourth term of(6.15) and the third term of (6.18) are estimated with the aid of (6.20) and (6.2). The last term of (6.15)and the fourth term of (6.18) can be estimated analogously. Finally, the estimates of the second term of(6.12) and the last term of (6.18) are given by (6.21). Thus we conclude the proof of Lemma 6.2. (cid:3)
In the sequel, we prove Lemma 6.3.
Proof of Lemma 6.3.
Note that (6.19) and (6.20) follow from the first and the last estimates in (5.55),respectively, by averaging over the index i . Hence, it suffices to prove (6.21). Recall the definition of ε from (6.13) and of ε from (6.16). We first consider E [ ε m ( p − ,p ) ]. Recall the definition of ε i from (5.28). Using (5.14), (5.15), the firstbound in (5.16), and (5.29), we have ε i = h ∗ i e B h i i h i a i − ω cB + O ≺ (cid:0) Ψ √ N (cid:1) = ˚ h ∗ i e B h i i ˚ h i a i − ω cB + O ≺ ( ˆΠ ) . (6.22)Here the last step follows from the assumption N √ η ≺ ˆΠ , and that h i = ˚ h i + g ii k g i k e i with | g ii | ≺ √ N , ˚ h ∗ i e B h i i e i = b i ˚ h ∗ i e i = 0 . Hence, by the definition of ε in (6.13), we have ε = 1 N N X i =1 ˚ h ∗ i e B h i i ˚ h i d i tr G − tr DGa i − ω cB + O ≺ ( ˆΠ ) = 1 N N X i =1 ˚ h ∗ i e B h i i ˚ h i τ i + O ≺ ( ˆΠ ) , where we introduced the notation τ i := d i tr G − tr DGa i − ω cB . Using the integration by parts formula (5.32), we obtain1 N N X i =1 E (cid:2) ˚ h ∗ i e B h i i ˚ h i τ i m ( p − ,p ) (cid:3) = 1 N N X i =1 ( i ) X k E (cid:2) k g i k ¯ g ik e ∗ k e B h i i ˚ g i τ i m ( p − ,p ) (cid:3) = 1 N N X i =1 ( i ) X k E h ∂ (cid:0) k g i k − e ∗ k e B h i i ˚ g i τ i m ( p − ,p ) (cid:1) ∂g ik i . (6.23)Note that ∂ (cid:0) k g i k − e ∗ k e B h i i ˚ g i τ i m ( p − ,p ) (cid:1) ∂g ik = ∂ k g i k − ∂g ik e ∗ k e B h i i ˚ g i τ i m ( p − ,p ) + k g i k − e ∗ k e B h i i e k τ i m ( p − ,p ) + k g i k − e ∗ k e B h i i ˚ g i ∂τ i ∂g ik m ( p − ,p ) + ( p − k g i k − e ∗ k e B h i i ˚ g i τ i (cid:16) N N X j =1 ∂ ( d j Q j ) ∂g ik (cid:17) m ( p − ,p ) + p k g i k − e ∗ k e B h i i ˚ g i τ i (cid:16) N N X j =1 ∂ ( d j Q j ) ∂g ik (cid:17) m ( p − ,p − . (6.24)Notice that ∂ k g i k − ∂g ik = −k g i k − ¯ g ik and that τ i = O ≺ (1). In addition, we also have that ( i ) X k ¯ g ik e k = ˚ g ∗ i , ( i ) X k e ∗ k e B h i i e k = Tr B − b i = b i . Denoting by ˜ d , . . . 
, ˜ d N ∈ C generic (possibly random) numbers with max i | ˜ d i | ≺
1, we see that the contributions from the first two terms on the right side of (6.24) to (6.23) follow from the estimates 1 N N X i =1 ˜ d i ˚ g ∗ i e B h i i ˚ g i = O ≺ ( 1 N ) , N N X i =1 ˜ d i b i e ∗ k e B h i i e k = O ≺ ( 1 N ) . Here ˜ d i includes τ i and an appropriate power of k g i k . In addition, for the estimate of the remaining terms in (6.24), we claim that, for X i = I, e B h i i ,1 N N X i =1 ( i ) X k ˜ d i e ∗ k X i ˚ g i ∂τ i ∂g ik = O ≺ (Ψ Π ) , (6.25)1 N N X i =1 ( i ) X k ˜ d i e ∗ k X i ˚ g i (cid:16) N N X j =1 ∂ ( d j Q j ) ∂g ik (cid:17) = O ≺ (Ψ Π ) , (6.26)1 N N X i =1 ( i ) X k ˜ d i e ∗ k X i ˚ g i (cid:16) N N X j =1 ∂ ( d j Q j ) ∂g ik (cid:17) = O ≺ (Ψ Π ) . (6.27) The above three bounds follow from the last estimate in (5.55) and the chain rule. Hence, we conclude the proof of (6.21) with j = 1. The proof of (6.21) for j = 2 is similar to that for j = 1. Recall the definition of ε i from (5.51). Using (5.14), (5.15), the first bound in (5.16), and also the bounds in (5.29), we have ε i = (cid:0) k g i k − (cid:1) G ii tr e BG + O ≺ (cid:16) Ψ √ N (cid:17) = (cid:0) ˚ g ∗ i ˚ g i − (cid:1) tr e BGa i − ω cB + O ≺ ( ˆΠ ) , which possesses a very similar structure to (6.22). The remaining proof is nearly the same as the case for ε ; it suffices to replace ˚ g ∗ i e B h i i ˚ g i by ˚ g ∗ i ˚ g i throughout the proof. We thus omit the details. Hence, we conclude the proof of Lemma 6.3. □

Optimal fluctuation averaging
In this section, we establish the optimal fluctuation averaging estimate for a very special linear combination of the Q i ’s and their analogues the Q i ’s ( c.f., (7.8)), under assumption (5.12). Recall the definition of the approximate subordination functions ω cA and ω cB in (5.2). We denote Λ A := ω cA − ω A , Λ B := ω cB − ω B , Λ := | Λ A | + | Λ B | . (7.1) Recall S AB , T A and T B defined in (3.1). For brevity, in the sequel, we use the shorthand notation S ≡ S AB . Proposition 7.1.
Fix a z = E + i η ∈ D τ ( η m , η M ) . Suppose that the assumptions of Proposition 5.1hold. Suppose that Λ( z ) ≺ ˆΛ( z ) , for some deterministic and positive function ˆΛ( z ) ≺ N − γ , then (cid:12)(cid:12)(cid:12) S Λ ι + T ι Λ ι + O (Λ ι ) (cid:12)(cid:12)(cid:12) ≺ q (Im m µ A ⊞ µ B + ˆΛ)( |S| + ˆΛ) N η + 1(
N η ) , ι = A, B . (7.2)Before commencing the proof of Proposition 7.1, we first claim that the control parameter ˆΠ inProposition 6.1 can be chosen as the square root of the right side of (7.2) as long as Λ ≺ ˆΛ, i.e., ˆΠ := q (Im m µ A ⊞ µ B + ˆΛ)( |S| + ˆΛ) N η + 1(
N η ) ! . (7.3)Indeed, observe that when Λ ≺ ˆΛ ≺ N − γ , we obtain from the second line of (2.11) that | m H − m µ A ⊞ µ B | = | m H m µ A ⊞ µ B | (cid:12)(cid:12)(cid:12) m H ( z ) − m µ A ⊞ µ B ( z ) (cid:12)(cid:12)(cid:12) ≺ | m H m µ A ⊞ µ B | Λ . (7.4)Further, from the first line of (2.11) and (3.2), we see that, for any z ∈ D τ ( η m , η M ), | m H m µ A ⊞ µ B | ≺ (cid:12)(cid:12) ( m µ A ⊞ µ B + O ≺ ( N − γ )) m µ A ⊞ µ B (cid:12)(cid:12) ≺ . (7.5)Hence, we conclude from (7.4) and (7.5) that | m H − m µ A ⊞ µ B | ≺ Λ ≺ ˆΛ . (7.6)Therefore, recalling (4.9), we haveΠ ≺ Im m µ A ⊞ µ B + ˆΛ N η ≺ q (Im m µ A ⊞ µ B + ˆΛ)( |S| + ˆΛ) N η ≺ Ψ , where in the last two steps, we used that Im m µ A ⊞ µ B . |S| ≺
1; (3.4) and (3.5). In addition, from (3.4) and (3.5), we also have Im m µ A ⊞ µ B |S| & η . Thus we also have 1 N √ η ≺ q (Im m µ A ⊞ µ B + ˆΛ)( |S| + ˆΛ) N η .
From the definition of Π in (4.9), we note that up to a Nη term ˆΠ here is equivalent to Π inside thespectrum but it is much larger than Π in the outside regime where S ≫ Im m µ A ⊞ µ B ( c.f., (3.4), (3.5)).With the above notation, we can rewrite (7.2) as (cid:12)(cid:12)(cid:12) S Λ ι + T ι Λ ι + O (Λ ι ) (cid:12)(cid:12)(cid:12) ≺ ˆΠ , ι = A, B. (7.7) Recall the definition of Q i from (4.11). We also introduce their analogues Q i ≡ Q i ( z ) := ( e A G ) ii tr G − G ii tr e A G , i ∈ J , N K . (7.8)with e A and G given in (5.3). To prove Proposition 7.1, we need an optimal fluctuation averaging for avery special combination of Q i ’s and Q i ’s. To this end, we define the functions Φ , Φ : ( C + ) −→ C ,Φ ( ω , ω , z ) := F A ( ω ) − ω − ω + z , Φ ( ω , ω , z ) := F B ( ω ) − ω − ω + z . (7.9)From (2.11), we have Φ ( ω A , ω B , z ) = Φ ( ω A , ω B , z ) = 0, with ω A ≡ ω A ( z ) and ω B ≡ ω B ( z ). For brevity,we use the shorthand notationsΦ c := Φ ( ω cA , ω cB , z ) , Φ c := Φ ( ω cA , ω cB , z ) . (7.10)Further, we define the quantities Z := Φ c + ( F ′ A ( ω B ) − c , Z := Φ c + ( F ′ B ( ω A ) − c . (7.11)We are going to show that Z and Z are actually certain linear combinations of the Q i ’s and the Q i ’s.We start with the identitiesΦ c = − F A ( ω cB )( m H ( z )) N N X i =1 a i − ω cB Q i , Φ c = − F B ( ω cA )( m H ( z )) N N X i =1 b i − ω cA Q i , (7.12)which can be derived by combining (5.2), (5.4) and (5.58). For all i ∈ J , N K , we set d i, := − F A ( ω cB )( m H ( z )) a i − ω cB , d i, := − ( F ′ A ( ω B ) − F B ( ω cA )( m H ( z )) b i − ω cA . (7.13)According to the definition in (7.11), (7.12), and also (7.13), we can write Z = 1 N N X i =1 d i, Q i + 1 N N X i =1 d i, Q i , (7.14)and Z can be represented in a similar way.Now, we choose d i = d i, , i ∈ J , N K , in Proposition 6.1. 
Observe that d i, can be regarded as asmooth function of tr e BG = 1 − tr ( A − z ) G and m H ( z ) = tr G , according to the definition in (7.13) andthat of ω cB in (5.2). Then, using the chain rule and the estimates of the tracial quantities in (5.19), onecan check that the first equation in assumption (6.2) is satisfied for the choice d i = d i, , i ∈ J , N K , byusing (5.55). The second equation can be checked analogously. Hence, applying Proposition 6.1, we get | Φ c | ≺ Ψ ˆΠ , | Φ c | ≺ Ψ ˆΠ , (7.15)where ˆΠ is chosen as in (7.3).The main technical task in this section is to establish the following estimates for Z and Z , wherethe previous order Ψ ˆΠ bounds from (6.3) are strengthened. Proposition 7.2.
Fix z ∈ D τ ( η m , η M ) . Suppose that the assumptions of Proposition 5.1 hold and that Λ( z ) ≺ ˆΛ( z ) for some deterministic and positive function ˆΛ( z ) ≤ N − γ . Choose ˆΠ( z ) as (7.3). Then, |Z | ≺ ˆΠ , |Z | ≺ ˆΠ . (7.16) We postpone the proof of Proposition 7.2 and first prove Proposition 7.1 with the aid of Proposition 7.2. Proof of Proposition 7.1.
By assumption, we see that | Λ A | , | Λ B | ≺ N − γ . First of all, expanding Φ c and Φ c around ( ω A , ω B ) and using the subordination equations Φ ( ω A , ω B , z ) = Φ ( ω A , ω B , z ) = 0, we get Φ c = − Λ A + ( F ′ A ( ω B ) − B + 12 F ′′ A ( ω B )Λ B + O (Λ B ) , Φ c = − Λ B + ( F ′ B ( ω A ) − A + 12 F ′′ B ( ω A )Λ A + O (Λ A ) . (7.17) We rewrite the second equation in (7.17) as Λ B = − Φ c + ( F ′ B ( ω A ) − A + 12 F ′′ B ( ω A )Λ A + O (Λ A ) . (7.18) Substituting (7.18) into the first equation in (7.17) yields Φ c = − ( F ′ A ( ω B ) − c + S Λ A + T A Λ A + O ((Φ c ) ) + O (Φ c Λ A ) + O (Λ A ) , where T A is defined in (3.1). In light of the definition in (7.11), we have Z = S Λ A + T A Λ A + O ((Φ c ) ) + O (Φ c Λ A ) + O (Λ A ) . (7.19) Combining (7.15) and (7.16) with (7.19) leads to (cid:12)(cid:12) S Λ A + T A Λ A + O (Λ A ) (cid:12)(cid:12) ≺ ˆΠ + Ψ ˆΠˆΛ . (7.20) The second term on the right hand side of (7.20) can be absorbed into the first term, in light of the fact that Ψ ˆΛ ≺ ˆΠ ( c.f., (7.3)). Hence, we have (cid:12)(cid:12) S Λ A + T A Λ A + O (Λ A ) (cid:12)(cid:12) ≺ ˆΠ . (7.21) Analogously, we also have (cid:12)(cid:12) S Λ B + T B Λ B + O (Λ B ) (cid:12)(cid:12) ≺ ˆΠ . (7.22) This completes the proof of Proposition 7.1. □ It remains to prove Proposition 7.2. We state the proof for Z ; Z is handled similarly. We set l ( k,l ) := Z k Z l , k, l ∈ N . We can now prove a stronger estimate on E [ l ( p,p ) ] than the estimate obtained from Lemma 6.2, by improving the error terms from O ≺ (Ψ ˆΠ) to O ≺ ( ˆΠ ). Lemma 7.3.
Fix a z ∈ D τ ( η m , η M ) . Suppose that the assumptions of Proposition 7.2 hold. For any fixed integer p ≥ , we have E (cid:2) l ( p,p ) (cid:3) = E (cid:2) O ≺ ( ˆΠ ) l ( p − ,p ) (cid:3) + E (cid:2) O ≺ ( ˆΠ ) l ( p − ,p ) (cid:3) + E (cid:2) O ≺ ( ˆΠ ) l ( p − ,p − (cid:3) . Now, with Lemma 7.3, we can prove Proposition 7.2.
Proof of Proposition 7.2.
Similarly to the proof of (5.13) from Lemma 5.2, with Lemma 7.3, we can get (7.16) by applying Young’s and Markov’s inequalities. This completes the proof of Proposition 7.2. □
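As background for the subordination functions ω A , ω B that underlie the quantities Z and Λ ι , we note that for atomic measures the system Φ 1 = Φ 2 = 0 from (7.9) can be solved numerically by the standard fixed-point iteration ω A = z + h A ( ω B ), ω B = z + h B ( ω A ), where h µ ( w ) = F µ ( w ) − w. The sketch below is our own toy implementation, not the paper's method; it uses the convention G µ ( w ) = ∫ ( w − x )⁻¹ dµ( x ) and F µ = 1/G µ , which may differ in sign from the paper's m µ . For two symmetric Bernoulli measures the free additive convolution is the arcsine law on [−2, 2], against which the output can be checked.

```python
import math

def g_transform(atoms, w):
    """Cauchy transform G_mu(w) = (1/N) sum_i 1/(w - a_i) of an atomic measure."""
    return sum(1.0 / (w - a) for a in atoms) / len(atoms)

def h_transform(atoms, w):
    """h_mu(w) = F_mu(w) - w, with F_mu = 1/G_mu the reciprocal Cauchy transform."""
    return 1.0 / g_transform(atoms, w) - w

def free_additive_g(atoms_a, atoms_b, z, iters=2000):
    """G_{mu_A boxplus mu_B}(z) via the subordination fixed point
    omega_B = z + h_B(z + h_A(omega_B))  (Belinschi-Bercovici type iteration)."""
    wb = z
    for _ in range(iters):
        wb = z + h_transform(atoms_b, z + h_transform(atoms_a, wb))
    return g_transform(atoms_a, wb)   # subordination: G_boxplus(z) = G_A(omega_B(z))

# Bernoulli(+-1) boxplus Bernoulli(+-1) is the arcsine law on [-2, 2];
# its Cauchy transform at z = i*eta equals -i / sqrt(4 + eta^2).
z = 0.05j
g = free_additive_g([-1.0, 1.0], [-1.0, 1.0], z)
exact = -1j / math.sqrt(4 + 0.05 ** 2)
```

As Im z decreases the contraction factor of this iteration degrades, a numerical shadow of the deteriorating stability of the subordination equations near the edge.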
In the sequel, we prove Lemma 7.3.
Proof of Lemma 7.3.
Recall the definition of Z in (7.14). We can write E (cid:2) l ( p,p ) (cid:3) = 1 N N X i =1 E (cid:2) d i, Q i l ( p − ,p ) (cid:3) + 1 N N X i =1 E (cid:2) d i, Q i l ( p − ,p ) (cid:3) . We only state the estimate for the first term on the right hand side above. The second term can beestimated in a similar way. By (6.10), we can write1 N N X i =1 d i, Q i = 1 N N X i =1 ( e BG ) ii tr Gτ i , where we chose d i = d i, , i ∈ J , N K , in the definition of τ i in (6.11).Then, analogously to (6.12), we can also write1 N N X i =1 E (cid:2) d i, Q i l ( p − ,p ) (cid:3) = 1 N N X i =1 E h ( e BG ) ii tr Gτ i l ( p − ,p ) i (7.23)with d i = d i, , i ∈ J , N K . Analogously to (6.5), we can show1 N N X i =1 E (cid:2) d i, Q i l ( p − ,p ) (cid:3) = E (cid:2) O ≺ ( ˆΠ ) l ( p − ,p ) (cid:3) + E (cid:2) O ≺ (Ψ ˆΠ ) l ( p − ,p ) (cid:3) + E (cid:2) O ≺ (Ψ ˆΠ ) l ( p − ,p − (cid:3) , where the last two terms come from the estimates of the analogues of the last two terms of (6.15), thethird and fourth terms in the right side of (6.18), and also the terms in (6.26) and (6.27), but with N P Nj =1 d j Q j replaced by Z . It suffices to improve the estimates of these terms. All these termscontain a derivative ∂ Z ∂g ik or ∂ Z ∂g ik , which is smaller than the derivative of an arbitrary linear combination ∂ ( N P i d i Q i ) /∂g ik or ∂ ( N P i d i Q i ) /∂g ik , due to the special choice of d i, ’s and d i, ’s. Specifically, weshall show the following lemma, which contains the estimates of all necessary terms. Lemma 7.4.
Fix a z ∈ D τ ( η m , η M ) . Suppose that the assumptions of Proposition 5.1 hold. Let ˜ d , . . . , ˜ d N ∈ C be (possibly random) numbers with max i | ˜ d i | ≺ . Let X i = I or e B h i i . Then we have N N X i =1 ( i ) X k ˜ d i e ∗ k X i G e i ∂ Z ∂g ik = O ≺ ( ˆΠ ) , N N X i =1 ( i ) X k ˜ d i e ∗ k X i G e i ∂ Z ∂g ik = O ≺ ( ˆΠ ) , N N X i =1 ( i ) X k ˜ d i e ∗ k X i ˚ g i ∂ Z ∂g ik = O ≺ ( ˆΠ ) , N N X i =1 ( i ) X k ˜ d i e ∗ k X i ˚ g i ∂ Z ∂g ik = O ≺ ( ˆΠ ) . (7.24) Proof of Lemma 7.4.
We give the proof for the first estimate in (7.24). The third one is analogous, and the other two are just their complex conjugates. From the definitions in (7.10) and (7.11), we get ∂ Z ∂g ik = ∂ Φ c ∂g ik + ( F ′ A ( ω B ) − ∂ Φ c ∂g ik = (cid:16)(cid:0) F ′ A ( ω B ) − (cid:1)(cid:0) F ′ B ( ω cA ) − (cid:1) − (cid:17) ∂ω cA ∂g ik + (cid:0) F ′ A ( ω cB ) − F ′ A ( ω B ) (cid:1) ∂ω cB ∂g ik . Note that by the regularity of F A and F B , we have (cid:0) F ′ A ( ω B ) − (cid:1)(cid:0) F ′ B ( ω cA ) − (cid:1) − S + O ( | Λ A | ) , F ′ A ( ω cB ) − F ′ A ( ω B ) = O ( | Λ B | ) . The smallness of these coefficients carries the gain. According to the definition of ˆΠ in (7.3), we see that ( |S| + Λ)Ψ Π ≤ ˆΠ if Λ ≤ ˆΛ. Hence, for the first estimate in (7.24), it suffices to show that 1 N N X i =1 ( i ) X k ˜ d i e ∗ k X i G e i ∂ω cι ∂g ik = O ≺ (Ψ Π ) , ι = A, B . (7.25) This follows from (6.20), the fact that ω cB is a tracial quantity, and the chain rule. The other terms in (7.24) can be estimated similarly. This concludes the proof of Lemma 7.4. □ With the aid of Lemma 7.4, we can conclude the proof of Lemma 7.3. □

Strong local law
In this section, we use a continuity argument to prove the strong local law, i.e.,
Theorem 2.5, based on Propositions 5.1, 6.1, and 7.1. We start with the following lemma. Recall
S ≡ S AB from (3.1) and Λ = | Λ A | + | Λ B | from (7.1). Further recall that η m = N − γ , with γ > 0. Lemma 8.1.
Fix z ∈ D τ ( η m , η M ) . Suppose that the assumptions of Proposition 5.1 hold. Let ε ∈ (0 , γ ) . Suppose that Λ ≺ ˆΛ for some deterministic control parameter ˆΛ ≤ N − γ . If ˆΛ ≥ N ε Nη , then we have: ( i ) : If √ κ + η > N − ε ˆΛ , there is a sufficiently large constant K > 0 , such that (cid:16) Λ ≤ |S| K (cid:17) | Λ A | ≺ N − ε ˆΛ , (cid:16) Λ ≤ |S| K (cid:17) | Λ B | ≺ N − ε ˆΛ ; (8.1) ( ii ) : If √ κ + η ≤ N − ε ˆΛ , we have | Λ A | ≺ N − ε ˆΛ , | Λ B | ≺ N − ε ˆΛ . Proof.
From (3.4) and (3.5), we see that | S | & Im m µ A ⊞ µ B for all z ∈ D τ ( η m , η M ). Thus (7.2) gives (cid:12)(cid:12)(cid:12) S Λ ι + T ι Λ ι + O (Λ ι ) (cid:12)(cid:12)(cid:12) ≺ |S| + ˆΛ N η + 1(
N η ) , ι = A, B. (8.2)with S , T A and T B given in (3.1). Then, from | Λ ι | ≺ ˆΛ ≤ N − γ , we have S Λ ι + T ι Λ ι = O ≺ (cid:16) |S| + ˆΛ N η + 1(
N η ) + N − γ ˆΛ (cid:17) , ι = A, B. (8.3)If √ κ + η > N − ε ˆΛ, we have for ι = A, B , (cid:16) Λ ≤ |S| K (cid:17) | Λ ι | ≺ |S| − (cid:16) |S| + ˆΛ N η + 1(
N η ) + N − γ ˆΛ (cid:17) ≤ C N ε N η + N ε − γ ˆΛ ≤ CN − ε ˆΛ . (8.4) Here we absorbed the quadratic term on the left hand side in (8.3) into the linear term. Hence, we proved ( i ). From (8.4), we also see that if √ κ + η > N − ε ˆΛ, then (cid:16) Λ ≤ |S| K (cid:17) | Λ ι | ≺ N − ε |S| , ι = A, B. (8.5) Next, we prove ( ii ). If √ κ + η ≤ N − ε ˆΛ, from (3.5) and (3.6), we see that T ι ∼
1. Hence, we solve the quadratic equation (8.3) directly and get | Λ ι | ≺ |S| + (cid:16) |S| + ˆΛ N η + 1(
N η ) + N − γ ˆΛ (cid:17) ≤ CN − ε ˆΛ , ι = A, B , under the assumption that ˆΛ ≥ N ε Nη . This concludes the proof of Lemma 8.1. □ Recall the definitions of S in (3.1) and of Λ d , e Λ d , Λ T , e Λ T in (5.6). For any z ∈ D τ ( η m , η M ) and any δ ∈ [0 , 1], we define the event Θ( z, δ ) := n Λ d ( z ) ≤ δ, e Λ d ( z ) ≤ δ, Λ( z ) ≤ δ , Λ T ( z ) ≤ , e Λ T ( z ) ≤ o . (8.6) We further decompose the domain D τ ( η m , η M ) into the following two disjoint parts: D > := n z ∈ D τ ( η m , η M ) : √ κ + η > N ε N η o , D ≤ := n z ∈ D τ ( η m , η M ) : √ κ + η ≤ N ε N η o . (8.7) For z ∈ D > , any δ ∈ [0 , 1] and any ε ′ ∈ [0 , 1], we define the subset Θ > ( z, δ, ε ′ ) ⊂ Θ( z, δ ) as Θ > ( z, δ, ε ′ ) := n Λ d ( z ) ≤ δ, e Λ d ( z ) ≤ δ, Λ( z ) ≤ min { δ , N − ε ′ |S|} , Λ T ( z ) ≤ , e Λ T ( z ) ≤ o . Lemma 8.2.
Suppose that the assumptions in Theorem 2.5 hold. For any fixed z ∈ D τ ( η m , η M ) , any ε ∈ (0 , γ ) and any D > 0 , there exists a positive integer N ( D, ε ) and an event Ω( z ) ≡ Ω( z, D, ε ) with P (Ω( z )) ≥ 1 − N − D , ∀ N ≥ N ( D, ε ) (8.8) such that the following hold: (i) If z ∈ D > , we have Θ > (cid:16) z, N ε √ N η , ε (cid:17) ∩ Ω( z ) ⊂ Θ > (cid:16) z, N ε √ N η , ε (cid:17) . (8.9) (ii) If z ∈ D ≤ , we have Θ (cid:16) z, N ε √ N η (cid:17) ∩ Ω( z ) ⊂ Θ (cid:16) z, N ε √ N η (cid:17) . (8.10) Proof.
In this proof, we fix a z ∈ D τ ( η m , η M ). From Proposition 5.1, we see that under the assumption Λ d ( z ) ≺ N − γ , e Λ d ( z ) ≺ N − γ , Λ T ( z ) ≺ , e Λ T ( z ) ≺ , (8.11) we have using (5.15) that Λ c d ( z ) ≺ √ N η , e Λ c d ( z ) ≺ √ N η , Λ T ( z ) ≺ √ N η , e Λ T ( z ) ≺ √ N η . (8.12) The following more quantitative statement for (8.12) can be derived if one states the proof of Proposition 5.1 in a quantitative way: if the event Θ( z, N ε √ Nη ) holds, then Λ c d ( z ) ≤ N ε √ N η , e Λ c d ( z ) ≤ N ε √ N η , Λ T ( z ) ≤ N ε √ N η , e Λ T ( z ) ≤ N ε √ N η , (8.13) hold on Θ( z, N ε √ Nη ) ∩ Ω( z ). Here Ω( z ) is the typical “event” on which all the concentration estimates in the proof of Proposition 5.1 hold. Note that these concentration estimates are done with respect to the entries or quadratic forms of the Gaussian vectors g i ’s; the probability of Ω( z ) is thus independent of z . Hence, we have a positive integer N ( D, ε ), uniformly in z , such that (8.8) holds. Moreover, on Ω( z ), we can write Lemma 8.1 in a quantitative way. For instance, (8.1) can be written as (cid:0) Λ ≤ |S| K (cid:1) | Λ ι | ≤ N − ε ˆΛ on Ω( z ). Now, we choose ˆΛ = N ε Nη in Lemma 8.1. From Lemma 8.1 ( i ) and (8.5), we see that for z ∈ D > , the following bound holds on the event Θ > ( z, N ε √ Nη , ε ) ∩ Ω( z ), Λ ≤ min n N ε N η , N − ε | S | o . (8.14) From Lemma 8.1 ( ii ), we see that for z ∈ D ≤ , the following bound holds on the event Θ( z, N ε √ Nη ) ∩ Ω( z ), Λ ≤ N ε N η . (8.15) Substituting (8.14) and (8.15) into the first two estimates in (8.13), we further get that Λ d ( z ) ≤ N ε √ N η , e Λ d ( z ) ≤ N ε √ N η hold on Θ > ( z, N ε √ Nη , ε ) ∩ Ω( z ) if z ∈ D > and on Θ( z, N ε √ Nη ) ∩ Ω( z ) if z ∈ D ≤ . This completes the proof. □ With Lemma 8.2, we can now prove (2.17) and (2.18) in Theorem 2.5, using a continuity argument. The proof of (2.19) will be stated in Section 9.
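Schematically, Lemma 8.1 extracts a bound on Λ from the approximate quadratic relation |SΛ + TΛ²| ≤ err: when |S|² dominates T·err the linear term wins (regime (i)), while at the edge one can only solve the quadratic, which yields a square-root rate (regime (ii)). The following scalar toy version of this dichotomy is our own illustration, with hypothetical names; it is not the paper's proof.

```python
import math

def lambda_bound(S, T, err):
    """Toy bound on |Lam| from |S*Lam + T*Lam**2| <= err with S, T, err > 0,
    assuming Lam is the small root of the quadratic.
    Linear (stable) regime when S*S >= 4*T*err; square-root (edge) regime
    otherwise -- mirroring cases (i) and (ii) of Lemma 8.1."""
    if S * S >= 4.0 * T * err:
        return 2.0 * err / S            # |Lam| <~ err / |S|
    return 2.0 * math.sqrt(err / T)     # |Lam| <~ sqrt(err / T)

# Away from the edge (S of order one) the bound improves linearly in err;
# at the edge (S comparable to sqrt(err)) only the square-root rate survives.
bulk = lambda_bound(1.0, 1.0, 1e-6)
edge = lambda_bound(1e-6, 1.0, 1e-6)
```

This is why the iterative self-improvement of the bound ˆΛ proceeds at different speeds in the two domains D_> and D_≤.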
Proof of (2.17) and (2.18) in Theorem 2.5.
With Lemma 8.2, the remaining proof of Theorem 2.5 isquite similar to the proof of Theorem 7.1 of [6]. So we only sketch the arguments.We start with an entry-wise Green function subordination estimate on global scale, i.e., η = η M forsome sufficiently large constant η M >
0. Recall Q i from (4.11). We regard Q i as a function of therandom unitary matrix U . Then, for z = E + i e η M with any fixed E and any e η M ≥ η M , we apply theGromov-Milman concentration inequality ( c.f., (6.2) in [6]), and get | Q i ( E + i e η M ) − E Q i ( E + i e η M ) | ≺ p N e η M ; (8.16)see Section 6.2 of [6] for similar estimates for the Green function entries of the block additive model.Next, using the invariance of the Haar measure, one can check the equation E ( e BG ⊗ G − G ⊗ e BG ) = 0 ; (8.17)see Proposition 3.2 of [24]. Taking the ( i, i )-th entry for the first component and the normalized tracefor the second component in the tensor product, we obtain from (8.17) that E Q i = E (cid:0) ( e BG ) ii tr G − G ii tr e BG (cid:1) = 0 . (8.18)We claim that, for sufficiently large η M >
1, we havesup z :Im z ≥ η M | Q i ( z ) | ≺ √ N , ∀ i ∈ J , N K , (8.19)where we used (8.16), (8.18), the Lipschitz continuity of Q i in the regime | z | ≤ √ N and the deterministicbound | Q i ( z ) | ≤ C √ N when | z | ≥ √ N . In addition, using that k H k ≤ k A k + k B k < K and the conventiontr e B = tr B = 0 ( c.f., (5.1)), we have, for z = E + i e η M with fixed E and any e η M ≥ η M , the expansionstr G ( z ) = − z + O ( 1 | z | ) = i e η M + O (cid:0) e η M (cid:1) , tr e BG ( z ) = − tr e Bz + O ( 1 | z | ) = O ( 1 e η M ) , (8.20)where we used tr B = 0 in the second equality. Hence, by the definition of ω cB in (5.2), we see that, ω cB ( z ) = z + O ( 1 e η M ) , z = E + i e η M . (8.21)Using the identity ( e BG ) ii = 1 − ( a i − z ) G ii , we can rewrite (8.19) as(1 − ( a i − ω cB ) G ii )tr G = O ≺ ( 1 √ N ) , z = E + i e η M . From the first line of (8.20) and (8.21) we getΛ c d ( z ) ≺ √ N , z = E + i e η M . (8.22) Analogously, we also have e Λ c d ( z ) ≺ √ N , z = E + i e η M . (8.23)Averaging over the index i in the definition of Λ cdi and e Λ cdi ( c.f., (5.7)), using (8.22) and (8.23) and usingthe fact tr G = tr G = m H yieldssup z :Im z ≥ η M (cid:12)(cid:12) m H ( z ) − m A ( ω cB ( z )) (cid:12)(cid:12) ≺ √ N , sup z :Im z ≥ η M (cid:12)(cid:12) m H ( z ) − m B ( ω cA ( z )) (cid:12)(cid:12) ≺ √ N (8.24)where in the large z regime these bounds even hold deterministically, similarly to (8.19). This togetherwith (5.4) gives us the systemsup z :Im z ≥ η M | Φ ( ω cA ( z ) , ω cB ( z ) , z ) | ≺ √ N , sup z :Im z ≥ η M | Φ ( ω cA ( z ) , ω cB ( z ) , z ) | ≺ √ N , (8.25)where Φ and Φ are defined in (7.9). We regard (8.25) as a perturbation of Φ ( ω A ( z ) , ω B ( z ) , z ) = 0,Φ ( ω A ( z ) , ω B ( z ) , z ) = 0. 
The stability of this system in the large η regime is analyzed in Lemma A.2.Choosing ( µ , µ ) = ( µ A , µ B ), ( e ω ( z ) , e ω ( z )) = ( ω cA ( z ) , ω cB ( z )) in Lemma A.2 below, and using the factthat (8.25) and (8.21) hold for any sufficiently large e η M , we obtain from the stability Lemma A.2 that | Λ ι ( z ) | = | ω cι ( z ) − ω ι ( z ) | ≺ √ N , ι = A, B, z = E + i η M (8.26)for any sufficiently large constant η M >
1, say.Substituting (8.26) into (8.22) and (8.23) givesΛ d ( E + i η M ) ≺ √ N , e Λ d ( E + i η M ) ≺ √ N , (8.27)for any fixed E ∈ R . Using the bound k G k ≤ η and the inequality | x ∗ G y | ≤ k G kk x kk y k , we also getΛ T ( E + i η M ) ≤ η M , e Λ T ( E + i η M ) ≤ η M , (8.28)for any fixed E ∈ R . Since (8.27) and (8.28) guarantee assumption (5.12), similarly to (8.12), we canapply Proposition 5.1 to get, for any fixed E ∈ R , thatΛ T ( E + i η M ) ≺ √ N , e Λ T ( E + i η M ) ≺ √ N . (8.29)Also observe that E + i η M ∈ D > , for any fixed E , and that |S ( E + i η M ) | &
1. Hence Λ( E + i η M ) ≺ N − ε |S ( E + i η M ) | . Then we can apply Lemma 8.1 ( i ) repeatedly for smaller and smaller Λ to getΛ( E + i η M ) ≺ N . (8.30)Combining (8.27), (8.29), (8.30) with the fact Λ( E + i η M ) ≺ N − ε |S ( E + i η M ) | , we see that the eventΘ > ( E + i η M , N ε √ N , ε ) holds with high probability. More quantitively, we have for any fixed E that P (cid:16) Θ > (cid:0) E + i η M , N ε √ N , ε (cid:1)(cid:17) ≥ − N − D , (8.31)for all D > N ≥ N ( D, ε ) with some threshold N ( D, ε ).Now we take (8.31) as the initial input, and use a continuity argument based on Lemma 8.2, to controlthe probability of the “good” events Θ > for z ∈ D > and Θ for z ∈ D ≤ . To this end, we first recall theevent Ω( z ) in Lemma 8.2. The main task is to show for any z = E + i η ∈ D > ,Θ > (cid:16) E + i η, N ε √ N η , ε (cid:17) ∩ Ω( E + i( η − N − )) ⊂ Θ > (cid:16) E + i( η − N − ) , N ε √ N η , ε (cid:17) , (8.32)and, for any z = E + i η ∈ D ≤ ,Θ (cid:16) E + i η, N ε √ N η (cid:17) ∩ Ω( E + i( η − N − )) ⊂ Θ (cid:16) E + i( η − N − ) , N ε √ N η (cid:17) . (8.33)The inclusions (8.32) and (8.33) are analogous to (7.20) of [4]. The only difference is here we decomposethe domain D τ ( η m , η M ) into D > and D ≤ , and in D > we also keep monitoring the event Λ ≤ N − ε |S| in order to use Lemma 8.1 ( i ). As we are gradually reducing Im z , once z enters into the domain D ≤ , wedo not need to monitor S anymore.The proofs of (8.32) and (8.33) rely on the Lipschitz continuity of the Green function, k G ( z ) − G ( z ′ ) k ≤ N | z − z ′ | , and of the subordination functions and S in (3.7). Using the Lipschitz continuity of thesefunctions, it is not difficult to see the following twoΘ > (cid:16) E + i η, N ε √ N η , ε (cid:17) ⊂ Θ > (cid:16) E + i( η − N − ) , N ε √ N η , ε (cid:17) , z = E + i η ∈ D > , (8.34)Θ (cid:16) E + i η, N ε √ N η (cid:17) ⊂ Θ (cid:16) E + i( η − N − ) , N ε √ N η (cid:17) , z = E + i η ∈ D ≤ . 
(8.35)Then, (8.34) together with (8.9) implies (8.32). Similarly, (8.35) together with (8.10) implies (8.33).Applying (8.32) and (8.33) recursively and using the simple fact that the domains D > and D ≤ areconnected, one can go from η = η M to η = η m , step by step of size N − . Consequently, we obtain forany η ∈ [ η m , η M ] ∩ N − Z that, if E + i η ∈ D > thenΘ > (cid:16) E + i η M , N ε √ N η M , ε (cid:17) ∩ Ω( E + i( η M − N − )) ∩ . . . ∩ Ω( E + i η ) ⊂ Θ > (cid:16) E + i η, N ε √ N η , ε (cid:17) ⊂ Θ > (cid:16) E + i η, N ε √ N η (cid:17) , (8.36)respectively, if E + i η ∈ D ≤ thenΘ > (cid:16) E + i η M , N ε √ N η M , ε (cid:17) ∩ Ω( E + i( η M − N − )) ∩ . . . ∩ Ω( E + i η ) ⊂ Θ (cid:16) E + i η, N ε √ N η (cid:17) . (8.37)Combining (8.8), (8.31), (8.36) and (8.37), we have P (cid:16) Θ (cid:16) E + i η, N ε √ N η (cid:17)(cid:17) ≥ − N − D (1 + N ( η M − η )) , (8.38)uniformly for all η ∈ [ η m , η M ] ∩ N − Z , when N ≥ max { N ( D, ε ) , N ( D, ε ) } . Finally, by the Lipschitzcontinuity of the Green function and also that of the subordination functions in (3.7), we can extend thebounds from z in the discrete lattice to the entire domain D τ ( η m , η M ).By the definition in (8.6), we obtain from (8.38) thatmax i ∈ J ,N K (cid:12)(cid:12)(cid:12) G ii ( z ) − a i − ω B ( z ) (cid:12)(cid:12)(cid:12) ≺ √ N η , | Λ A ( z ) | ≺ N η , max i ∈ J ,N K (cid:12)(cid:12)(cid:12) G ii ( z ) − b i − ω A ( z ) (cid:12)(cid:12)(cid:12) ≺ √ N η , | Λ B ( z ) | ≺ N η , (8.39)uniformly on D τ ( η m , η M ) with high probability. For any deterministic d , . . . , d N ∈ C , we further write1 N N X i =1 d i (cid:16) G ii − a i − ω cB (cid:17) = 1 N N X i =1 d i tr G ( a i − ω cB ) Q i , (8.40)which can easily be checked from the definition of ω cB , Q i and the equation ( a i − z ) G ii + ( e BG ) ii = 1.Regarding d i tr G ( a i − ω cB ) as the random coefficients d i in (6.3), it is not difficult to check that (6.2) holds,similarly to the last two equations in (5.55). 
Hence, we have (cid:12)(cid:12)(cid:12) N N X i =1 d i (cid:16) G ii − a i − ω cB (cid:17)(cid:12)(cid:12)(cid:12) ≺ Ψ ˆΠ . (8.41) Plugging the last estimate in (8.39) into (8.41), and using (3.2), we obtain (2.17) uniformly on D τ ( η m , η M ). Finally, choosing d i = 1 for all i ∈ J , N K in (8.41), we get (2.18) uniformly on D τ ( η m , η M ). This completes the proof of (2.17) and (2.18) in Theorem 2.5. □

Rigidity of the eigenvalues
In this section, we prove Theorem 2.6, and also (2.19) in Theorem 2.5. Recall the definition of D > in (8.7). We start by improving the estimate of Λ defined in (7.1) in the following subdomain of D > , e D > := { z = E + i η ∈ D > : E < E − } . (9.1) Lemma 9.1.
Suppose that the assumptions in Theorem 2.5 hold. Then, we have the following uniformestimate for all z ∈ e D > , Λ( z ) ≺ N p ( κ + η ) η + 1 √ κ + η N η ) . (9.2) Proof.
First, from (8.39), we see that Λ ≺ Nη on D τ ( η m , η M ). Now, suppose that Λ ≺ ˆΛ for some deterministic ˆΛ ≡ ˆΛ( z ) that satisfies N ε (cid:16) N p ( κ + η ) η + 1 √ κ + η N η ) (cid:17) ≤ ˆΛ( z ) ≤ N ε N η . (9.3) Observe that such ˆΛ always exists on D > . From (7.2), (3.4) and (3.5), we have for ι = A, B , and z ∈ e D > (cid:12)(cid:12)(cid:12) S Λ ι + T ι Λ ι + O (Λ ι ) (cid:12)(cid:12)(cid:12) ≺ q ( η √ κ + η + ˆΛ)( √ κ + η + ˆΛ) N η + 1(
N η ) ≺ q ˆΛ √ κ + ηN η + √ ηN η + 1( N η ) , (9.4) where we used that ˆΛ ≺ N ε Nη ≤ N − ε √ κ + η for all z ∈ e D > . Moreover, for z ∈ e D > , we see that | Λ ι | ≺ N η ≤ N − ε √ κ + η ∼ N − ε |S| , for ι = A, B . Hence, according to the fact T ι ≤ C ( c.f., (3.5)), we can absorb the second term on the left side of (9.4) into the first term, and thus we have for ι = A, B | Λ ι | ≺ √ κ + η (cid:18) q ˆΛ √ κ + ηN η + √ ηN η + 1( N η ) (cid:19) ≤ N η ( κ + η ) ˆΛ + N − ε ˆΛ ≤ N − ε ˆΛ , where in the second step we used the lower bound in (9.3) directly, and in the last step we used the fact ( N η ) − ( κ + η ) − ≤ N − ε ˆΛ which again follows from the lower bound in (9.3). Hence, we improved the bound from Λ ≤ ˆΛ to Λ ≤ N − ε ˆΛ as long as the lower bound in (9.3) holds. Performing the above improvement iteratively, one finally gets (9.2). This completes the proof. □ With the aid of Lemma 9.1, we can now prove Theorem 2.6.
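The square-root behavior invoked here through (3.4) and (3.5) — Im m ∼ √(κ + η) inside the support and Im m ∼ η/√(κ + η) outside — can be checked explicitly in the model case of the semicircle law, whose Cauchy transform is known in closed form. This is our own numerical illustration (with the convention G(z) = ∫(z − x)⁻¹ρ(x)dx, so the density is −Im G/π), not a computation from the paper.

```python
import cmath

def g_semicircle(z):
    """Cauchy transform G(z) = (z - sqrt(z^2 - 4))/2 of the semicircle law
    on [-2, 2]; the principal branch gives G(z) ~ 1/z for Re z > 0."""
    return (z - cmath.sqrt(z * z - 4)) / 2

kappa = 0.1
# inside the support, at distance kappa from the edge E_+ = 2:
inside = -g_semicircle(2 - kappa + 1e-6j).imag    # ~ sqrt(kappa), order 0.3
# outside the support, at distance kappa, with eta = 1e-4:
outside = -g_semicircle(2 + kappa + 1e-4j).imag   # ~ eta/sqrt(kappa), order 1e-4
```

The sharp drop of Im G across the edge is exactly the effect that makes ˆΠ much larger than Π in the outside regime, as noted below (7.3).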
Proof of Theorem 2.6.
We first show (2.21) for the smallest eigenvalue λ , i.e., | λ − γ | ≺ N − . (9.5)Recall K defined in (2.13). For any (small) constant ε >
0, we define the line segment e D ( ε ) := { z = E + i η : E ∈ [ −K , E − − N − +6 ε ] , η = N − + ε } . (9.6) Then it is easy to check that e D ( ε ) ⊂ e D > ( c.f., (9.1)). Applying (9.2), we obtain Λ ≺ N − ε Nη uniformly on e D ( ε ), which together with (7.6) implies | m H ( z ) − m µ A ⊞ µ B ( z ) | ≺ N − ε N η , (9.7) uniformly on e D ( ε ). Moreover, by (3.4), we have Im m µ A ⊞ µ B ( z ) ∼ η √ κ + η ≤ N − ε N η , (9.8) uniformly on e D ( ε ). Combining (9.7) with (9.8) yields Im m H ( z ) ≺ N − ε N η , (9.9) uniformly on e D ( ε ). Since k H k < K , to see (9.5), it suffices to show that with high probability λ is not in the interval [ −K , E − − N − +6 ε ]. We prove it by contradiction. Suppose that λ ∈ [ −K , E − − N − +6 ε ]. Then clearly for any η > 0, sup E ∈ [ −K ,E − − N −
23 +6 ε ] Im m H ( E + i η ) = sup E ∈ [ −K ,E − − N −
23 +6 ε ] N N X i =1 η ( λ i − E ) + η ≥ N η , which contradicts the fact that (9.9) holds uniformly on e D ( ε ). Hence, we have (9.5). Next, (2.18), (3.80) and (3.81) and a standard application of the Helffer–Sjöstrand formula ( c.f., Lemma 5.1 [2]) on D τ ( η m , η M ) yield sup x ≤ E − + c | µ H (( −∞ , x ]) − µ A ⊞ µ B (( −∞ , x ]) | ≺ N , (9.10) for any sufficiently small c = c ( τ ). Then (9.5), (9.10), together with the rigidity (3.94) and the square-root behavior of the distribution µ α ⊞ µ β ( c.f., (3.62)), lead to the conclusion. The same conclusion holds with the γ ∗ j ’s replaced by the γ j ’s, by rigidity (3.94). □ Finally, with the aid of Theorem 2.6, we can prove (2.19) in Theorem 2.5.
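The contour representation (9.11) used in the proof below is an instance of Cauchy's integral formula applied to the analytic function m H − m µ A ⊞ µ B on a rectangular contour around z. As a self-contained sanity check of that step, here is a minimal discretized contour integral (our own construction, with a hypothetical test function f) recovering f(z) = (1/2πi)∮ f(z̃)/(z̃ − z) dz̃:

```python
import cmath

def contour_integral(f, z, half_width, half_height, n=4000):
    """(1/(2*pi*i)) * integral of f(t)/(t - z) over the boundary of the
    rectangle centered at z with the given half sides, counterclockwise,
    discretized by the midpoint rule with n points per side."""
    corners = [z + complex(-half_width, -half_height),
               z + complex(half_width, -half_height),
               z + complex(half_width, half_height),
               z + complex(-half_width, half_height)]
    total = 0j
    for a, b in zip(corners, corners[1:] + corners[:1]):
        dt = (b - a) / n
        for k in range(n):
            t = a + (k + 0.5) * dt       # midpoint of each sub-segment
            total += f(t) / (t - z) * dt
    return total / (2j * cmath.pi)

f = lambda t: 1.0 / (t - 5.0)            # analytic inside the contour below
z = 1.0 + 0.5j
approx = contour_integral(f, z, 1.0, 1.0)
```

In the proof itself the contour is split into C_< and C_≥ so that the local law (2.18) is only invoked where Im z̃ is above the scale η_m.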
Proof of (2.19) in Theorem 2.5.
Let $\varepsilon > 0$ be a small constant and recall that $\kappa = E_- - E \ge N^{-2/3+\varepsilon}$ in (2.19). We see that (2.19) follows from (2.18) directly in the regime $\eta \ge \kappa$, say. Hence, in the sequel, we work in the regime $\eta \le \kappa$ only. For any $z = E + \mathrm{i}\eta \in \mathcal D_\tau(\eta_m, \eta_M)$ with $\kappa \ge N^{-2/3+\varepsilon}$, we set the contour $\mathcal C \equiv \mathcal C(z) := \mathcal C_l \cup \mathcal C_r \cup \mathcal C_u \cup \mathcal C_d$, where
$$\mathcal C_l \equiv \mathcal C_l(z) := \big\{ \tilde z = E - \tfrac{\kappa}{2} + \mathrm{i}\tilde\eta : -\eta - \tfrac{\kappa}{2} \le \tilde\eta \le \eta + \tfrac{\kappa}{2} \big\}, \qquad \mathcal C_r \equiv \mathcal C_r(z) := \big\{ \tilde z = E + \tfrac{\kappa}{2} + \mathrm{i}\tilde\eta : -\eta - \tfrac{\kappa}{2} \le \tilde\eta \le \eta + \tfrac{\kappa}{2} \big\},$$
and $\mathcal C_u$, $\mathcal C_d$ are the two horizontal segments
$$\mathcal C_{u/d} \equiv \mathcal C_{u/d}(z) := \big\{ \tilde z = \tilde E \pm \mathrm{i}\big(\eta + \tfrac{\kappa}{2}\big) : E - \tfrac{\kappa}{2} \le \tilde E \le E + \tfrac{\kappa}{2} \big\}.$$
We then further decompose $\mathcal C = \mathcal C_< \cup \mathcal C_\ge$, where
$$\mathcal C_< \equiv \mathcal C_<(z) := \big\{ \tilde z \in \mathcal C : |\operatorname{Im} \tilde z| < \eta_m \big\}, \qquad \mathcal C_\ge \equiv \mathcal C_\ge(z) := \mathcal C \setminus \mathcal C_<.$$
Now, we further introduce the event
$$\Xi := \bigcap_{\tilde z \in \mathcal C_\ge} \Big\{ \big| m_H(\tilde z) - m_{\mu_A \boxplus \mu_B}(\tilde z) \big| \le \frac{N^{\varepsilon}}{N \operatorname{Im} \tilde z} \Big\} \;\bigcap\; \Big\{ \lambda_1 \ge E_- - N^{-2/3+\varepsilon/2} \Big\}.$$
Then, on the event $\Xi$, we have
$$m_H(z) - m_{\mu_A \boxplus \mu_B}(z) = \frac{1}{2\pi\mathrm{i}} \oint_{\mathcal C} \frac{m_H(\tilde z) - m_{\mu_A \boxplus \mu_B}(\tilde z)}{\tilde z - z}\, \mathrm{d}\tilde z = \frac{1}{2\pi\mathrm{i}} \Big( \int_{\mathcal C_<} + \int_{\mathcal C_\ge} \Big) \frac{m_H(\tilde z) - m_{\mu_A \boxplus \mu_B}(\tilde z)}{\tilde z - z}\, \mathrm{d}\tilde z. \qquad (9.11)$$
Note that, for $\tilde z \in \mathcal C$, we always have $|\tilde z - z| \ge \kappa/2$. In addition, for $\tilde z \in \mathcal C_<$, we have $|\mathcal C_<| \le 4\eta_m$, together with the bounds $|m_H(\tilde z)| \le C/\kappa$ and $|m_{\mu_A \boxplus \mu_B}(\tilde z)| \le C/\kappa$, which hold on $\Xi$. For $\tilde z \in \mathcal C_\ge$, we have $|\mathcal C_\ge| \le C\kappa$ and the bound
$$\big| m_H(\tilde z) - m_{\mu_A \boxplus \mu_B}(\tilde z) \big| \le \frac{N^{\varepsilon}}{N \operatorname{Im} \tilde z},$$
which holds on $\Xi$. Applying the above bounds to (9.11), it is elementary to check that
$$\big| m_H(z) - m_{\mu_A \boxplus \mu_B}(z) \big| \le C\Big( \frac{\eta_m}{\kappa^2} + \frac{N^{-1+\varepsilon} \log N}{\kappa} \Big)$$
on $\Xi$. Since $\gamma$ in $\eta_m = N^{-\gamma}$ and $\varepsilon$ can be arbitrary, we can conclude that
$$\big| m_H(z) - m_{\mu_A \boxplus \mu_B}(z) \big| \;\prec\; \frac{1}{N\kappa} \qquad (9.12)$$
if we can show that $\Xi$ holds with high probability. Using (9.5), it suffices to show that
$$\big| m_H(\tilde z) - m_{\mu_A \boxplus \mu_B}(\tilde z) \big| \;\prec\; \frac{1}{N \operatorname{Im} \tilde z},$$
uniformly in $\tilde z \in \mathcal C_\ge$.
This only requires us to enlarge the domain $\mathcal D_\tau(\eta_m, \eta_M)$, and also to consider its complex conjugate, so as to include $\mathcal C_\ge$ in the proof of (2.18). Hence, we conclude the proof of (2.19) by combining the $\frac{1}{N\kappa}$ bound in (9.12) with the $\frac{1}{N\eta}$ bound in (2.18). □

We conclude the main part of the paper with the proof of Corollary 2.8.
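Since the derivation of (9.10) above, as well as step (2.22) in the proof below, rests on the Helffer–Sjöstrand formula, we recall it here in one standard form (see, e.g., Lemma 5.1 of [2]): for $f \in C^2(\mathbb R)$ with compact support and $\lambda \in \mathbb R$,

```latex
f(\lambda) \;=\; \frac{1}{2\pi} \int_{\mathbb R^2}
\frac{\mathrm{i}\,\sigma f''(x)\,\chi(\sigma)
      \;+\; \mathrm{i}\,\big(f(x) + \mathrm{i}\sigma f'(x)\big)\,\chi'(\sigma)}
     {\lambda - x - \mathrm{i}\sigma}\, \mathrm{d}x\, \mathrm{d}\sigma ,
```

where $\chi \in C^\infty_c(\mathbb R; [0,1])$ is a cutoff with $\chi \equiv 1$ in a neighborhood of $0$. Integrating $f$ against the signed measure $\mu_H - \mu_A \boxplus \mu_B$ and estimating the right-hand side thus converts local-law bounds on $m_H - m_{\mu_A \boxplus \mu_B}$ into bounds on the difference of the distribution functions.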
Proof of Corollary 2.8.
With the additional Assumption 2.7, we can show analogously that the estimates (2.18) and (2.21) also hold around the upper edge. According to Assumption 2.7 (vii) and the fact $\sup_{\mathbb C_+} |m_{\mu_\alpha \boxplus \mu_\beta}| \le C$ (c.f. (3.8)), we see that, except for the two vicinities of the lower and upper edges, the remaining spectrum is within the regular bulk. Together with the strong local law in the bulk regime, c.f. Theorem 2.4 in [5], we have
$$\big| m_H(z) - m_{\mu_A \boxplus \mu_B}(z) \big| \;\prec\; \frac{1}{N\eta}, \qquad (9.13)$$
uniformly on the domain $\mathcal D(\eta_m, \eta_M) := \{ z = E + \mathrm{i}\eta \in \mathbb C_+ : -K \le E \le K,\; \eta_m \le \eta \le \eta_M \}$. Then (9.13), together with (2.21) and its counterpart at the upper edge, implies the rigidity for all eigenvalues, i.e., (2.22) can be proved again with the Helffer–Sjöstrand formula. Then, from (2.22), we conclude that (2.23) holds. This completes the proof of Corollary 2.8. □

Appendix A

In this appendix, we collect some basic technical results.

A.1. Stochastic domination and large deviation properties.
Recall the stochastic domination in Definition 2.4. The relation $\prec$ is transitive and it satisfies the following arithmetic rules: if $X_1 \prec Y_1$ and $X_2 \prec Y_2$, then $X_1 + X_2 \prec Y_1 + Y_2$ and $X_1 X_2 \prec Y_1 Y_2$. Further, assume that $\Phi(v) \ge N^{-C}$ is deterministic and that $Y(v)$ is a nonnegative random variable satisfying $\mathbb E[Y(v)^2] \le N^{C'}$ for all $v$. Then $Y(v) \prec \Phi(v)$, uniformly in $v$, implies $\mathbb E[Y(v)] \prec \Phi(v)$, uniformly in $v$.

Gaussian vectors have well-known large deviation properties, which we use in the following form:

Lemma A.1. Let $X = (x_{ij}) \in M_N(\mathbb C)$ be a deterministic matrix and let $\mathbf y = (y_i) \in \mathbb C^N$ be a deterministic complex vector. For a Gaussian random vector $\mathbf g = (g_1, \ldots, g_N) \sim \mathcal N_{\mathbb R}(0, \sigma^2 I_N)$ or $\mathcal N_{\mathbb C}(0, \sigma^2 I_N)$, we have
$$|\mathbf y^* \mathbf g| \;\prec\; \sigma \|\mathbf y\|_2, \qquad \big| \mathbf g^* X \mathbf g - \sigma^2 \operatorname{tr} X \big| \;\prec\; \sigma^2 \|X\|_2. \qquad \mathrm{(A.1)}$$

A.2. Stability for large $\eta$. For any probability measures $\mu_1$ and $\mu_2$ on the real line, we define the functions $\Phi_1, \Phi_2 : (\mathbb C_+)^3 \to \mathbb C$ by setting
$$\Phi_1(\omega_1, \omega_2, z) := F_{\mu_1}(\omega_1) - \omega_1 - \omega_2 + z, \qquad \Phi_2(\omega_1, \omega_2, z) := F_{\mu_2}(\omega_2) - \omega_1 - \omega_2 + z. \qquad \mathrm{(A.2)}$$
We observe that the system of subordination equations (2.9) is equivalent to
$$\Phi_1(\omega_1(z), \omega_2(z), z) = 0, \qquad \Phi_2(\omega_1(z), \omega_2(z), z) = 0, \qquad \forall\, z \in \mathbb C_+.$$
We have the following linear stability for the subordination equations in the large-$\eta$ regime. A somewhat weaker version of this result has already been proven in Lemma 4.2 of [3], requiring an unnecessarily stronger condition (compare (4.14) of [3] with the current (A.3) below). However, in our applications only the weaker assumption can be guaranteed. In fact, already in [3] (in equation (6.56)) we tacitly relied on the current version of this stability result. Thus, by proving the stronger stability result below, we also correct this small inconsistency in [3].

Lemma A.2.
Let $\tilde\eta_0 > 0$ be any (large) positive number and let $\tilde\omega_1, \tilde\omega_2, \tilde r_1, \tilde r_2 : \mathbb C_{\tilde\eta_0} \to \mathbb C$ be analytic functions, where $\mathbb C_{\tilde\eta_0} := \{ z \in \mathbb C : \operatorname{Im} z \ge \tilde\eta_0 \}$. Assume that there is a constant $C_0 > 0$ such that the following hold for all $z \in \mathbb C_{\tilde\eta_0}$:
$$|\operatorname{Im} \tilde\omega_1(z) - \operatorname{Im} z| \le C_0, \qquad |\operatorname{Im} \tilde\omega_2(z) - \operatorname{Im} z| \le C_0, \qquad \mathrm{(A.3)}$$
$$|\tilde r_1(z)| \le C_0, \qquad |\tilde r_2(z)| \le C_0, \qquad \mathrm{(A.4)}$$
$$\Phi_1(\tilde\omega_1(z), \tilde\omega_2(z), z) = \tilde r_1(z), \qquad \Phi_2(\tilde\omega_1(z), \tilde\omega_2(z), z) = \tilde r_2(z). \qquad \mathrm{(A.5)}$$
Then there is a constant $\eta_0$ with $\eta_0 \ge \tilde\eta_0$, such that
$$|\tilde\omega_1(z) - \omega_1(z)| \le \|\tilde r(z)\|, \qquad |\tilde\omega_2(z) - \omega_2(z)| \le \|\tilde r(z)\|, \qquad \mathrm{(A.6)}$$
on the domain $\mathbb C_{\eta_0} := \{ z \in \mathbb C : \operatorname{Im} z \ge \eta_0 \}$, where $\tilde r := (\tilde r_1, \tilde r_2)$ and $\omega_1(z)$, $\omega_2(z)$ are the subordination functions associated with $\mu_1$ and $\mu_2$.

Proof. Since most of the proof is identical to that in [3], here we only give the necessary modifications involving the weaker condition (A.3). Following the proof in [3] to the letter up to (4.23), for every $z \in \mathbb C_{\eta_0}$ we have constructed functions $\hat\omega_1(z)$, $\hat\omega_2(z)$ such that $\Phi_{\mu_1,\mu_2}(\hat\omega_1(z), \hat\omega_2(z), z) = 0$ with
$$|\tilde\omega_j(z) - \hat\omega_j(z)| \le \|\tilde r(z)\|, \qquad j = 1, 2, \qquad z \in \mathbb C_{\eta_0}. \qquad \mathrm{(A.7)}$$
From (4.20) of [3] we know that the Jacobian of the subordination equations (denoted by $\Gamma_{\mu_1,\mu_2}$ in [3]) is close to 1 for sufficiently large $\tilde\eta_0$. Thus, by the analytic inverse function theorem, we obtain that the $\hat\omega_j(z)$, $j = 1, 2$, are also analytic in $z$ for large $\eta = \operatorname{Im} z$. From (A.3), (A.4) and (A.7), we see that
$$\lim_{\eta \nearrow \infty} \frac{\operatorname{Im} \hat\omega_1(\mathrm{i}\eta)}{\eta} = \lim_{\eta \nearrow \infty} \frac{\operatorname{Im} \hat\omega_2(\mathrm{i}\eta)}{\eta} = 1.$$
It is known from the proof of the uniqueness of the solution to the subordination equations near $z = \mathrm{i}\infty$ that $(\hat\omega_1(z), \hat\omega_2(z))$ is the unique such solution in a neighborhood of $z = \mathrm{i}\infty$, and it can be analytically extended to all $z \in \mathbb C_+$. Hence, $(\hat\omega_1(z), \hat\omega_2(z)) = (\omega_1(z), \omega_2(z))$. This together with (A.7) concludes the proof. □

Appendix B

In this appendix, we prove some technical lemmas. First, we estimate the small terms involving $\Delta_G$. Specifically, we provide bounds for the $\Delta_G$-involved terms in the last four estimates in Lemma 5.3. Then, we prove Lemma 5.3. We summarize the estimates for the $\Delta_G$-involved terms in the following lemma.

Lemma B.1.
Fix a $z \in \mathcal D_\tau(\eta_m, \eta_M)$. Let $Q \in M_N(\mathbb C)$ be arbitrary, with $\|Q\| \prec 1$. Let $X_i = I$ or $\widetilde B^{\langle i \rangle}$, and $X = I$ or $A$. Suppose that the assumptions of Proposition 5.1 hold. Then, we have
$$\frac{1}{N} \sum_k^{(i)} \mathbf e_k^* X_i \Delta_G(i,k)\, \mathbf e_i = O_\prec(\Pi_i), \qquad \frac{1}{N} \sum_k^{(i)} \mathbf e_i^* X \Delta_G(i,k)\, \mathbf e_i\, \mathbf e_k^* X_i G \mathbf e_i = O_\prec(\Pi_i),$$
$$\frac{1}{N} \sum_k^{(i)} \mathbf h_i^* \Delta_G(i,k)\, \mathbf e_i\, \mathbf e_k^* X_i G \mathbf e_i = O_\prec(\Pi_i), \qquad \frac{1}{N} \sum_k^{(i)} \operatorname{tr}\big( Q X \Delta_G(i,k) \big)\, \mathbf e_k^* X_i G \mathbf e_i = O_\prec(\Psi\, \Pi_i). \qquad \mathrm{(B.1)}$$

Proof.
The proof is similar to that of Lemma B.1 in [6], but here we need finer estimates. Recall $\Delta_R(i,k)$ and $\Delta_G(i,k)$ from (5.39) and (5.38). We note that $\Delta_R(i,k)$ is a sum of terms of the form $\tilde d_i\, \bar g_{ik}\, \boldsymbol\alpha_i \boldsymbol\beta_i^*$ for some $\tilde d_i \in \mathbb C$ with $|\tilde d_i| \prec 1$, where $\boldsymbol\alpha_i, \boldsymbol\beta_i = \mathbf e_i$ or $\mathbf h_i$. Hereafter, we use $\tilde d_i$ to represent a generic number satisfying $|\tilde d_i| \prec 1$. Consequently, on $\mathcal D_\tau(\eta_m, \eta_M)$, $\Delta_G(i,k)$ is a sum of terms of the form
$$\tilde d_i\, \bar g_{ik}\, G \boldsymbol\alpha_i \boldsymbol\beta_i^* \widetilde B^{\langle i \rangle} R_i G, \qquad \tilde d_i\, \bar g_{ik}\, G R_i \widetilde B^{\langle i \rangle} \boldsymbol\alpha_i \boldsymbol\beta_i^* G. \qquad \mathrm{(B.2)}$$
Then, the left hand side of the first estimate in (B.1) is a sum of terms of the form
$$\frac{1}{N}\, \tilde d_i\, \big( \mathring{\mathbf g}_i^* X_i G \boldsymbol\alpha_i \big)\big( \boldsymbol\beta_i^* \widetilde B^{\langle i \rangle} R_i G \mathbf e_i \big), \qquad \frac{1}{N}\, \tilde d_i\, \big( \mathring{\mathbf g}_i^* X_i G R_i \widetilde B^{\langle i \rangle} \boldsymbol\alpha_i \big)\big( \boldsymbol\beta_i^* G \mathbf e_i \big). \qquad \mathrm{(B.3)}$$
By the Cauchy–Schwarz inequality, we have
$$\big| \mathring{\mathbf g}_i^* X_i G \boldsymbol\alpha_i \big| \prec \| G \boldsymbol\alpha_i \| = \sqrt{\frac{\operatorname{Im} \boldsymbol\alpha_i^* G \boldsymbol\alpha_i}{\eta}}, \qquad \big| \boldsymbol\beta_i^* \widetilde B^{\langle i \rangle} R_i G \mathbf e_i \big| \prec \| G \mathbf e_i \| = \sqrt{\frac{\operatorname{Im} G_{ii}}{\eta}},$$
$$\big| \boldsymbol\beta_i^* G \mathbf e_i \big| \prec \| G \mathbf e_i \| = \sqrt{\frac{\operatorname{Im} G_{ii}}{\eta}}, \qquad \big| \mathring{\mathbf g}_i^* X_i G R_i \widetilde B^{\langle i \rangle} \boldsymbol\alpha_i \big| \prec \| G R_i \widetilde B^{\langle i \rangle} \boldsymbol\alpha_i \| = \sqrt{\frac{\operatorname{Im} \boldsymbol\alpha_i^* \widetilde B^{\langle i \rangle} R_i G R_i \widetilde B^{\langle i \rangle} \boldsymbol\alpha_i}{\eta}}. \qquad \mathrm{(B.4)}$$
Note that for $\boldsymbol\alpha_i = \mathbf e_i$,
$$\boldsymbol\alpha_i^* G \boldsymbol\alpha_i = G_{ii}, \qquad \boldsymbol\alpha_i^* \widetilde B^{\langle i \rangle} R_i G R_i \widetilde B^{\langle i \rangle} \boldsymbol\alpha_i = |b_i|^2\, \mathbf h_i^* G \mathbf h_i, \qquad \mathrm{(B.5)}$$
and for $\boldsymbol\alpha_i = \mathbf h_i$,
$$\boldsymbol\alpha_i^* G \boldsymbol\alpha_i = \mathbf h_i^* G \mathbf h_i, \qquad \boldsymbol\alpha_i^* \widetilde B^{\langle i \rangle} R_i G R_i \widetilde B^{\langle i \rangle} \boldsymbol\alpha_i = \mathbf e_i^* \widetilde B G \widetilde B \mathbf e_i = \widetilde B_{ii} - (a_i - z) + (a_i - z)^2 G_{ii}. \qquad \mathrm{(B.6)}$$
Plugging (B.5) and (B.6) into the bounds in (B.4), we see that both terms in (B.3) are of order $O_\prec(\Pi_i)$. Hence, we proved the first estimate in (B.1).

Next, we verify the second estimate in (B.1). Since $\Delta_G(i,k)$ is a sum of terms of the form in (B.2), we see that the left side of the second estimate in (B.1) is a sum of terms of the form
$$\frac{1}{N}\, \tilde d_i\, \big( \mathbf e_i^* X G \boldsymbol\alpha_i \big)\big( \boldsymbol\beta_i^* \widetilde B^{\langle i \rangle} R_i G \mathbf e_i \big)\big( \mathring{\mathbf g}_i^* X_i G \mathbf e_i \big), \qquad \frac{1}{N}\, \tilde d_i\, \big( \mathbf e_i^* X G R_i \widetilde B^{\langle i \rangle} \boldsymbol\alpha_i \big)\big( \boldsymbol\beta_i^* G \mathbf e_i \big)\big( \mathring{\mathbf g}_i^* X_i G \mathbf e_i \big). \qquad \mathrm{(B.7)}$$
Note that
$$\mathbf e_i^* \widetilde B^{\langle i \rangle} R_i G \mathbf e_i = -b_i T_i, \qquad \mathbf h_i^* \widetilde B^{\langle i \rangle} R_i G \mathbf e_i = -(\widetilde B G)_{ii}.$$
Hence, we have
$$\big| \boldsymbol\beta_i^* \widetilde B^{\langle i \rangle} R_i G \mathbf e_i \big| \prec 1, \qquad \big| \boldsymbol\beta_i^* G \mathbf e_i \big| \prec 1. \qquad \mathrm{(B.8)}$$
Further, we claim that
$$\big| \mathbf e_i^* X G \boldsymbol\alpha_i \big|,\; \big| \mathbf e_i^* X G R_i \widetilde B^{\langle i \rangle} \boldsymbol\alpha_i \big| \;\prec\; \sqrt{\frac{\operatorname{Im}\big( G_{ii} + \mathbf h_i^* G \mathbf h_i \big)}{\eta}}. \qquad \mathrm{(B.9)}$$
The proof of the above bounds is analogous to the proof of (B.4); we thus omit the details. Then, using the first estimate in (B.4), together with (B.8) and (B.9), we see that both terms in (B.7) are of order $O_\prec(\Pi_i)$.

The proof of the third estimate in (B.1) is nearly the same as that for the second one; we thus omit it. To show the last estimate, we again use the fact that $\Delta_G(i,k)$ is a sum of terms of the form in (B.2). Then it is not difficult to see that the left side of the last estimate in (B.1) is a sum of terms of the form
$$\frac{\tilde d_i}{N}\, \big( \boldsymbol\beta_i^* \widetilde B^{\langle i \rangle} R_i G Q X G \boldsymbol\alpha_i \big)\big( \mathring{\mathbf g}_i^* X_i G \mathbf e_i \big), \qquad \frac{\tilde d_i}{N}\, \big( \boldsymbol\beta_i^* G Q X G R_i \widetilde B^{\langle i \rangle} \boldsymbol\alpha_i \big)\big( \mathring{\mathbf g}_i^* X_i G \mathbf e_i \big). \qquad \mathrm{(B.10)}$$
Note that
$$\big| \boldsymbol\beta_i^* \widetilde B^{\langle i \rangle} R_i G Q X G \boldsymbol\alpha_i \big| \;\prec\; \frac{1}{\eta}\, \| G \boldsymbol\alpha_i \| \;\le\; \frac{1}{\eta} \sqrt{\frac{\operatorname{Im}\big( G_{ii} + \mathbf h_i^* G \mathbf h_i \big)}{\eta}}. \qquad \mathrm{(B.11)}$$
Analogously, we have
$$\big| \boldsymbol\beta_i^* G Q X G R_i \widetilde B^{\langle i \rangle} \boldsymbol\alpha_i \big| \;\prec\; \frac{1}{\eta} \sqrt{\frac{\operatorname{Im}\big( G_{ii} + \mathbf h_i^* G \mathbf h_i \big)}{\eta}}. \qquad \mathrm{(B.12)}$$
Applying (B.11), (B.12), and the first estimate in (B.4), we see that both terms in (B.10) are of order $O_\prec(\Psi\, \Pi_i)$. Hence, we obtain the last estimate in (B.1). This concludes the proof of Lemma B.1. □

Proof of Lemma 5.3.
The proof is similar to that of Lemma 7.4 in [6]. In the latter, we used $\Psi$ instead of $\Pi_i$ in the statement. However, the proof of Lemma 7.4 in [6] readily shows that the stronger bounds in (5.55) hold for the counterparts in the block additive model (c.f. (7.77), (7.80), (7.81) and (7.87) of [6]). The proof for our additive model given here is analogous.

First, by (5.16), (5.17), (5.27), (5.30), and the fact $\mathring T_i = T_i - h_{ii} G_{ii}$, we have $|\mathring S_i| \prec 1$ and $|\mathring T_i| \prec 1$ under the assumption (5.12). Then, for the first estimate in (5.55), we have
$$\frac{1}{N} \sum_k^{(i)} \frac{\partial \|\mathbf g_i\|^{-1}}{\partial g_{ik}}\, \mathbf e_k^* X_i G \mathbf e_i = -\frac{1}{2N \|\mathbf g_i\|^{3}} \sum_k^{(i)} \bar g_{ik}\, \mathbf e_k^* X_i G \mathbf e_i = -\frac{1}{2N \|\mathbf g_i\|^{2}}\, \mathring{\mathbf h}_i^* X_i G \mathbf e_i = O_\prec\Big(\frac{1}{N}\Big),$$
where we used the fact that $\mathring{\mathbf h}_i^* X_i G \mathbf e_i = \mathring S_i$ or $\mathring T_i$ if $X_i = \widetilde B^{\langle i \rangle}$ or $I$, respectively.

Next, we show the second bound in (5.55). It is convenient to set $I^{\langle i \rangle} := I - \mathbf e_i \mathbf e_i^*$. Using (5.37), we get
$$\frac{1}{N} \sum_k^{(i)} \mathbf e_i^* X \frac{\partial G}{\partial g_{ik}}\, \mathbf e_i\, \mathbf e_k^* X_i G \mathbf e_i = \frac{c_i}{N}\, \mathbf e_i^* X G I^{\langle i \rangle} X_i G \mathbf e_i\, (\mathbf e_i + \mathbf h_i)^* \widetilde B^{\langle i \rangle} R_i G \mathbf e_i + \frac{c_i}{N}\, \mathbf e_i^* X G R_i \widetilde B^{\langle i \rangle} I^{\langle i \rangle} X_i G \mathbf e_i\, (\mathbf e_i + \mathbf h_i)^* G \mathbf e_i + \frac{1}{N} \sum_k^{(i)} \mathbf e_i^* X \Delta_G(i,k)\, \mathbf e_i\, \mathbf e_k^* X_i G \mathbf e_i. \qquad \mathrm{(B.13)}$$
The desired estimate of the last term was obtained in the second estimate of (B.1). Further, using (4.8), we get
$$(\mathbf e_i + \mathbf h_i)^* \widetilde B^{\langle i \rangle} R_i G \mathbf e_i = -b_i T_i - (\widetilde B G)_{ii} = O_\prec(1), \qquad (\mathbf e_i + \mathbf h_i)^* G \mathbf e_i = G_{ii} + T_i = O_\prec(1),$$
where the estimates follow from (5.16) and (5.17). Hence, it suffices to show that
$$\big| \mathbf e_i^* X G I^{\langle i \rangle} X_i G \mathbf e_i \big| \prec \frac{\operatorname{Im}\big( G_{ii} + \mathbf h_i^* G \mathbf h_i \big)}{\eta}, \qquad \big| \mathbf e_i^* X G R_i \widetilde B^{\langle i \rangle} I^{\langle i \rangle} X_i G \mathbf e_i \big| \prec \frac{\operatorname{Im}\big( G_{ii} + \mathbf h_i^* G \mathbf h_i \big)}{\eta}. \qquad \mathrm{(B.14)}$$
Note that, by the assumption $X = I$ or $A$, both terms in (B.14) can be bounded by
$$C \| G X \mathbf e_i \| \| G \mathbf e_i \| = \frac{C}{\eta} \sqrt{\operatorname{Im}(X G X)_{ii}}\, \sqrt{\operatorname{Im} G_{ii}} \le \frac{C'\, \operatorname{Im} G_{ii}}{\eta}.$$
This completes the proof of the second inequality in (5.55).

Next, we show the third estimate in (5.55). In light of the definition of $T_i$, it suffices to show
$$\frac{1}{N} \sum_k^{(i)} \frac{\partial \mathbf h_i^*}{\partial g_{ik}}\, G \mathbf e_i\, \mathbf e_k^* X_i G \mathbf e_i = O_\prec\Big(\frac{1}{N}\Big), \qquad \frac{1}{N} \sum_k^{(i)} \mathbf h_i^* \frac{\partial G}{\partial g_{ik}}\, \mathbf e_i\, \mathbf e_k^* X_i G \mathbf e_i = O_\prec(\Pi_i). \qquad \mathrm{(B.15)}$$
The first estimate in (B.15) is proved as follows:
$$\frac{1}{N} \sum_k^{(i)} \frac{\partial \mathbf h_i^*}{\partial g_{ik}}\, G \mathbf e_i\, \mathbf e_k^* X_i G \mathbf e_i = -\frac{1}{2N \|\mathbf g_i\|^{2}} \sum_k^{(i)} \bar h_{ik}\, \mathbf e_k^* X_i G \mathbf e_i\, \mathbf h_i^* G \mathbf e_i = -\frac{1}{2N \|\mathbf g_i\|^{2}}\, \mathring{\mathbf h}_i^* X_i G \mathbf e_i\, \mathbf h_i^* G \mathbf e_i = O_\prec\Big(\frac{1}{N}\Big),$$
where in the last step we again used the facts $\mathring{\mathbf h}_i^* \widetilde B^{\langle i \rangle} G \mathbf e_i = \mathring S_i = O_\prec(1)$ and $\mathbf h_i^* G \mathbf e_i = T_i = O_\prec(1)$. The proof of the second estimate in (B.15) is similar to that of the second inequality in (5.55): it suffices to replace $\mathbf e_i^* X$ by $\mathbf h_i^*$ in (B.13) and estimate the resulting terms. The counterpart of the last term in (B.13) is estimated in the third estimate of (B.1). The counterparts of the first two terms on the right side of (B.13) are bounded by
$$C \| G \mathbf h_i \| \| G \mathbf e_i \| = \frac{C}{\eta} \sqrt{\operatorname{Im} \mathbf h_i^* G \mathbf h_i}\, \sqrt{\operatorname{Im} G_{ii}} \le \frac{C'\, \operatorname{Im}\big( G_{ii} + \mathbf h_i^* G \mathbf h_i \big)}{\eta},$$
where we have used (5.43).

Next, we show the fourth estimate in (5.55). Using (5.37) again, we get
$$\frac{1}{N} \sum_k^{(i)} \operatorname{tr}\Big( Q X \frac{\partial G}{\partial g_{ik}} \Big)\, \mathbf e_k^* X_i G \mathbf e_i = \frac{c_i}{N}\, (\mathbf e_i + \mathbf h_i)^* \widetilde B^{\langle i \rangle} R_i G Q X G I^{\langle i \rangle} X_i G \mathbf e_i + \frac{c_i}{N}\, (\mathbf e_i + \mathbf h_i)^* G Q X G R_i \widetilde B^{\langle i \rangle} I^{\langle i \rangle} X_i G \mathbf e_i + \frac{1}{N} \sum_k^{(i)} \operatorname{tr}\big( Q X \Delta_G(i,k) \big)\, \mathbf e_k^* X_i G \mathbf e_i. \qquad \mathrm{(B.16)}$$
The last term above is estimated in the last estimate of (B.1). Using (4.8) and $\|G\| \le \eta^{-1}$, we have
$$\Big| \frac{1}{N} (\mathbf e_i + \mathbf h_i)^* \widetilde B^{\langle i \rangle} R_i G Q X G I^{\langle i \rangle} X_i G \mathbf e_i \Big| = \Big| \frac{1}{N} (b_i \mathbf h_i^* + \mathbf e_i^* \widetilde B) G Q X G I^{\langle i \rangle} X_i G \mathbf e_i \Big| \le \frac{C}{N\eta} \big( \| G \mathbf h_i \| + \| G \widetilde B \mathbf e_i \| \big) \| G \mathbf e_i \|$$
$$\le \frac{C}{N\eta} \big( \| G \mathbf h_i \| + \| G \widetilde B \mathbf e_i \| + \| G \mathbf e_i \| \big)^2 = \frac{C}{N\eta^2}\, \operatorname{Im}\big( \mathbf h_i^* G \mathbf h_i + (\widetilde B G \widetilde B)_{ii} + G_{ii} \big) \prec \frac{\operatorname{Im}\big( G_{ii} + \mathbf h_i^* G \mathbf h_i \big)}{N\eta^2}. \qquad \mathrm{(B.17)}$$
Here, in the last step, we again used (5.43) and also the fact
$$\operatorname{Im}(\widetilde B G \widetilde B)_{ii} = \eta + \operatorname{Im}\big( (a_i - z)^2 G_{ii} \big) = O_\prec(\eta + \operatorname{Im} G_{ii}) = O_\prec(\operatorname{Im} G_{ii}). \qquad \mathrm{(B.18)}$$
In (B.18), we used (5.8), the first bound in (5.16), and $\operatorname{Im} G_{ii} \gtrsim \eta$, which is easily checked by spectral decomposition. Similarly to (B.17), we get the desired estimate for the second term on the right side of (B.16).

Finally, the last equation in (5.55) can be proved analogously to the fourth one. The only difference is that, instead of the factor $\mathbf e_k^* X_i G \mathbf e_i$ in (6.20), here we have $\mathbf e_k^* X_i \mathring{\mathbf g}_i$, which does not contain any $G$ factor; this actually makes the estimates even simpler. This completes the proof of Lemma 5.3. □

References

[1] Akhiezer, N. I.:
The classical moment problem and some related questions in analysis, Hafner Publishing Co., New York, 1965.
[2] Ajanki, O., Erdős, L., Krüger, T.: Universality for general Wigner-type matrices, arXiv:1506.05098 (2016).
[3] Bao, Z. G., Erdős, L., Schnelli, K.: Local stability of the free additive convolution, J. Funct. Anal., 672-719 (2016).
[4] Bao, Z. G., Erdős, L., Schnelli, K.: Local law of addition of random matrices on optimal scale, Comm. Math. Phys., 947-990 (2017).
[5] Bao, Z. G., Erdős, L., Schnelli, K.: Convergence rate for spectral distribution of addition of random matrices, arXiv:1606.03076 (2016).
[6] Bao, Z. G., Erdős, L., Schnelli, K.: Local single ring theorem on optimal scale, arXiv:1612.05920 (2016).
[7] Belinschi, S.: A note on regularity for free convolutions, Ann. Inst. H. Poincaré Probab. Stat., 635-648 (2006).
[8] Belinschi, S., Bercovici, H.: A new approach to subordination results in free probability, J. Anal. Math., 357-365 (2007).
[9] Bercovici, H., Voiculescu, D.: Free convolution of measures with unbounded support, Indiana Univ. Math. J., 733-773 (1993).
[10] Biane, P.: Processes with free increments, Math. Z., 143-174 (1998).
[11] Bourgade, P., Erdős, L., Yau, H.-T.: Edge universality of beta ensembles, Comm. Math. Phys., 261-354 (2014).
[12] Che, Z., Landon, B.: Local spectral statistics of the addition of random matrices, arXiv:1701.00513 (2017).
[13] Chistyakov, G. P., Götze, F.: The arithmetic of distributions in free probability theory, Cent. Eur. J. Math., 997-1050 (2011).
[14] Collins, B., Male, C.: The strong asymptotic freeness of Haar and deterministic matrices, Ann. Sci. Éc. Norm. Supér. (4), 147-163 (2014).
[15] Diaconis, P., Shahshahani, M.: The subgroup algorithm for generating uniform random variables, Probab. Engrg. Inform. Sci., 15-32 (1987).
[16] Erdős, L., Knowles, A., Yau, H.-T.: Averaging fluctuations in resolvents of random band matrices, Ann. Henri Poincaré, 1837-1926 (2013).
[17] Erdős, L., Knowles, A., Yau, H.-T., Yin, J.: The local semicircle law for a general class of random matrices, Electron. J. Probab., 1-58 (2013).
[18] Guionnet, A., Krishnapur, M., Zeitouni, O.: The single ring theorem, Ann. of Math. (2), 1189-1217 (2011).
[19] Kargin, V.: A concentration inequality and a local law for the sum of two random matrices, Probab. Theory Related Fields, 677-702 (2012).
[20] Kargin, V.: Subordination for the sum of two random matrices, Ann. Probab., 2119-2150 (2015).
[21] Lee, J. O., Schnelli, K.: Local deformed semicircle law and complete delocalization for Wigner matrices with random potential, J. Math. Phys., 103504 (2013).
[22] Lee, J. O., Schnelli, K.: Local law and Tracy–Widom limit for sparse random matrices, arXiv:1605.08767 (2016).
[23] Mezzadri, F.: How to generate random matrices from the classical compact groups, Notices Amer. Math. Soc., 592-604 (2007).
[24] Pastur, L., Vasilchuk, V.: On the law of addition of random matrices, Comm. Math. Phys., 249-286 (2000).
[25] Tracy, C., Widom, H.: Level spacing distributions and the Airy kernel, Comm. Math. Phys., 151-174 (1994).
[26] Voiculescu, D.: Addition of certain non-commuting random variables, J. Funct. Anal., 323-346 (1986).
[27] Voiculescu, D.: Limit laws for random matrices and free products, Invent. Math., 201-220 (1991).
[28] Voiculescu, D.: The analogues of entropy and of Fisher's information theory in free probability theory, I, Comm. Math. Phys. 155.