Local Stability of the Free Additive Convolution
Zhigang Bao † IST Austria [email protected]
László Erdős ∗ IST Austria [email protected]
Kevin Schnelli † IST Austria [email protected]
We prove that the system of subordination equations, defining the free additive convolution of two probability measures, is stable away from the edges of the support and blow-up singularities by showing that the recent smoothness condition of Kargin is always satisfied. As an application, we consider the local spectral statistics of the random matrix ensemble A + UBU*, where U is a Haar distributed random unitary or orthogonal matrix, and A and B are deterministic matrices. In the bulk regime, we prove that the empirical spectral distribution of A + UBU* concentrates around the free additive convolution of the spectral distributions of A and B on scales down to N^{−2/3}.

Keywords: Free convolution, subordination, local eigenvalue density
AMS Subject Classification (2010): 46L54, 60B20

1. Introduction
One of the basic concepts of free probability theory is the free additive convolution of two probability laws in a non-commutative probability space; it describes the law of the sum of two free random variables. In the case of a bounded self-adjoint random variable, its law can be identified with a probability measure of compact support on the real line. Hence the free additive convolution of two probability measures is a well-defined concept and it is characteristically different from the classical convolution.

In this paper, we prove a local stability result for the free additive convolution. A direct consequence is the continuity of the free additive convolution in a much stronger topology than established earlier by Bercovici and Voiculescu [10]. A second application of our stability result is to establish a local law on a very small scale for the eigenvalue density of the random matrix ensemble A + UBU*, where U is a Haar distributed unitary or orthogonal matrix and A, B are deterministic N by N Hermitian matrices.

The free additive convolution was originally introduced by Voiculescu [36] for the sum of free bounded noncommutative random variables in an algebraic setup (see Maassen [32] and Bercovici and Voiculescu [10] for extensions to the unbounded case). The Stieltjes transform of the free additive convolution is related to the Cauchy–Stieltjes transforms of the original measures by an elegant analytic change of variables. This subordination phenomenon was first observed by Voiculescu [38] in a generic situation and extended to full generality by Biane [14]. In fact, the subordination equations, see (2.5)-(2.6) below, may directly be used to define the free additive convolution.

∗ Partially supported by ERC Advanced Grant RANMAT No. 338804.
† Supported by ERC Advanced Grant RANMAT No. 338804.
This analytic definition was given independently by Belinschi and Bercovici [4] and by Chistyakov and Götze [18]; for further details we refer to, e.g., [39, 27, 2].

Kargin [30] pointed out that the analytic approach to the subordination equations, in contrast to the algebraic one, allows one to effectively study how the free additive convolution is affected by small perturbations; this is especially useful to treat various error terms in the random matrix problem [31]. The basic tool is a local stability analysis of the subordination equations. In [30], Kargin assumed a lower bound on the imaginary part of the subordination functions and a certain non-degeneracy condition on the Jacobian that holds for generic values of the spectral parameter. While these so-called smoothness conditions hold in many examples, a general characterization was lacking. Our first result, Theorem 2.5, shows that the smoothness conditions hold wherever the absolutely continuous part of the free convolution measure is finite and nonzero. In particular, local stability holds unconditionally (Corollary 2.6) and, following Kargin's argument [30], we immediately obtain the continuity of the free additive convolution in a stronger sense; see Theorem 2.7.

The random matrix application of this stability result, however, goes well beyond Kargin's analysis [31] since our proof is valid on a much smaller scale. To explain the new elements, we recall how free probability connects to random matrices.

The following fundamental observation was made by Voiculescu [37] (later extended by Dykema [20] and Speicher [35]): if A = A(N) and B = B(N) are two sequences of Hermitian matrices that are asymptotically free with eigenvalue distributions converging to probability measures µ_α and µ_β, then the eigenvalue density of A + B is asymptotically given by the free additive convolution µ_α ⊞ µ_β. One of the most natural ways to ensure asymptotic freeness is to consider conjugation by independent unitary matrices.
Indeed, if A and B are deterministic (they may even be chosen diagonal) with limit laws µ_α and µ_β, then A and UBU* are asymptotically free if U = U(N) is a Haar distributed matrix; see [37] and many subsequent works, e.g., [34, 40, 15, 33, 19]. In particular, the limiting spectral density of the eigenvalues of H = A + UBU* is given by µ_α ⊞ µ_β.

The conventional setup of free probability operates with moment calculations. An alternative approach [33] proves the convergence of the resolvent at any fixed spectral parameter z ∈ C+. Both approaches give rise to weak convergence of measures; in particular, they identify the limiting spectral density on the macroscopic scale.

Armed with these macroscopic results, it is natural to ask for a local law, i.e., for the smallest possible (N-dependent) scale so that the local eigenvalue density on that scale still converges as N tends to infinity. Local laws have been somewhat outside of the focus of free probability before Kargin's recent works. After having improved a concentration result for the Haar measure by Chatterjee [17] by using the Gromov–Milman concentration inequality, Kargin obtained a local law for the ensemble H = A + UBU* on scale η ≫ (log N)^{−1/2} [29], i.e., slightly below the macroscopic scale. Recently in [31], he improved this result down to scale η ≫ N^{−1/7} under the above mentioned smoothness condition. In Theorem 2.8 we prove the local law down to scale η = Im z ≫ N^{−2/3} without any additional assumption. To achieve this short scale, we effectively use the positivity of the imaginary parts of the subordination functions by localizing the Gromov–Milman concentration inequality within the spectrum. Since the subordination functions are obtained as the solution of a system of self-consistent equations whose derivation itself requires bounds on the subordination functions, the reasoning seems circular.
We break this circularity by a continuity argument (similarly as in [23]) in which we reduce the imaginary part of the spectral parameter in very small steps, use the previous step as an a priori bound, and show that the bound does not deteriorate by using the local stability result, Corollary 2.6. Finally, we remark that the local stability result is also a key ingredient in [3], where we were able to prove a local law down to the smallest possible scale η ≫ N^{−1}, but with a weaker error bound than in Theorem 2.8; see Remark 2.4 for details.

1.1. Notation.
We use the symbols O(·) and o(·) for the standard big-O and little-o notation. We use c and C to denote positive numerical constants; their values may change from line to line. For a, b > 0, we write a ≲ b, a ≳ b if there is C ≥ 1 such that a ≤ Cb, a ≥ C^{−1}b, respectively. We write a ∼ b if a ≲ b and a ≳ b both hold. We denote by ‖v‖ the Euclidean norm of v ∈ C^N. For an N × N matrix A ∈ M_N(C), we denote by ‖A‖ its operator norm and by ‖A‖_2 := √⟨A, A⟩ its Hilbert–Schmidt norm, where ⟨A, B⟩ := Trace(AB*), for A, B ∈ M_N(C). Finally, we denote by tr A the normalized trace of A, i.e., tr A = (1/N) Trace A.

Acknowledgment.
We thank an anonymous referee for many useful comments and remarks, and for bringing references [7, 12] to our attention.

2. Main results
2.1. Free additive convolution.
In this subsection, we recall the definition of the free additive convolution. Given a probability measure ∗ µ on R, its Stieltjes transform, m_µ, on the complex upper half-plane C+ := { z ∈ C : Im z > 0 } is defined by

  m_µ(z) := ∫_R dµ(x)/(x − z),  z ∈ C+. (2.1)

We denote by F_µ the negative reciprocal Stieltjes transform of µ, i.e.,

  F_µ(z) := −1/m_µ(z),  z ∈ C+. (2.2)

Observe that

  lim_{η↑∞} F_µ(iη)/(iη) = 1, (2.3)

as follows easily from (2.1). Note, moreover, that F_µ is an analytic function on C+ with non-negative imaginary part. Conversely, if F : C+ → C+ is an analytic function such that lim_{η↑∞} F(iη)/(iη) = 1, then F is the negative reciprocal Stieltjes transform of a probability measure µ, i.e., F(z) = F_µ(z), for all z ∈ C+; see, e.g., [1].

The free additive convolution is the binary operation on probability measures on R characterized by the following result.

Proposition 2.1 (Theorem 4.1 in [4], Theorem 2.1 in [18]). Given two probability measures µ1 and µ2 on R, there exist unique analytic functions ω1, ω2 : C+ → C+ such that

(i) for all z ∈ C+, Im ω1(z), Im ω2(z) ≥ Im z, and

  lim_{η↑∞} ω1(iη)/(iη) = lim_{η↑∞} ω2(iη)/(iη) = 1; (2.4)

(ii) for all z ∈ C+,

  F_{µ1}(ω2(z)) − ω1(z) − ω2(z) + z = 0,
  F_{µ2}(ω1(z)) − ω1(z) − ω2(z) + z = 0. (2.5)

∗ All probability measures considered will be assumed to be Borel.
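The system (2.5) also lends itself to numerical solution. The following sketch (ours, not part of the original text; measure data and function names are our choices) solves (2.5) for two atomic measures by iterating the map w ↦ z + h1(z + h2(w)), where h_j(w) := F_{µj}(w) − w. Since Im F_{µj}(w) ≥ Im w, this map sends C+ into { Im w ≥ Im z }, and its iterates converge to ω1(z) by a Denjoy–Wolff argument in the spirit of [4, 18].

```python
import numpy as np

def m_mu(atoms, weights, w):
    # Stieltjes transform (2.1) of an atomic measure: sum_i p_i / (a_i - w)
    return np.sum(weights / (atoms - w))

def F_mu(atoms, weights, w):
    # negative reciprocal Stieltjes transform (2.2)
    return -1.0 / m_mu(atoms, weights, w)

def subordination(mu1, mu2, z, n_iter=500):
    # Solve (2.5) by fixed-point iteration: w -> z + h1(z + h2(w)),
    # with h_j(w) = F_{mu_j}(w) - w.
    (a1, p1), (a2, p2) = mu1, mu2
    h1 = lambda w: F_mu(a1, p1, w) - w
    h2 = lambda w: F_mu(a2, p2, w) - w
    om1 = z
    for _ in range(n_iter):
        om1 = z + h1(z + h2(om1))
    om2 = z + h2(om1)
    return om1, om2

# Example: mu1 = mu2 = Bernoulli(1/2) on {0, 1}.  Their free additive
# convolution is the arcsine law on [0, 2], whose Stieltjes transform is
# -1/sqrt(z(z-2)) (branch with positive imaginary part).
mu = (np.array([0.0, 1.0]), np.array([0.5, 0.5]))
z = 1.0 + 1.0j
om1, om2 = subordination(mu, mu, z)
m = m_mu(*mu, om2)   # m_{mu1 boxplus mu2}(z) = m_{mu1}(omega_2(z)), cf. (2.6)
```

At z = 1 + i this returns m ≈ i/√2, matching −1/√(z(z − 2)), and one checks Im ω_j(z) ≥ Im z, as required by (2.4).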
It follows from (2.4) that the analytic function F : C+ → C+ defined by

  F(z) := F_{µ1}(ω2(z)) = F_{µ2}(ω1(z)), (2.6)

satisfies (2.3). Thus F is the negative reciprocal Stieltjes transform of a probability measure µ, called the free additive convolution of µ1 and µ2, usually denoted by µ ≡ µ1 ⊞ µ2. Note that (2.6) shows that the rôles of µ1 and µ2 are symmetric and thus µ1 ⊞ µ2 = µ2 ⊞ µ1. The functions ω1 and ω2 of Proposition 2.1 are called subordination functions, and F is said to be subordinated to F_{µ1}, respectively to F_{µ2}.

We mention that Voiculescu [36] originally introduced the free additive convolution in a different, algebraic manner. The equivalent analytic definition based on the existence of subordination functions (taken up in Proposition 2.1 above) was introduced in [4, 18].

We next recall some basic examples. Choosing µ1 arbitrary and µ2 as a single point mass at b ∈ R, it is easy to check that µ1 ⊞ µ2 simply is µ1 shifted by b. We exclude this uninteresting case by henceforth assuming that µ1 and µ2 are both supported at more than one point. Choosing µ1 = µ2 = µ as the Bernoulli distribution µ = (1 − ξ)δ_0 + ξδ_1, ξ ∈ (0, 1), the free additive convolution is explicitly given by (see, e.g., (5.5) of [33])

  (µ ⊞ µ)(x) = √((ℓ_+ − x)_+ (x − ℓ_−)_+) / (π x (2 − x)) + (1 − 2ξ)_+ δ_0(x) + (2ξ − 1)_+ δ_2(x),  x ∈ R, (2.7)

where ℓ_± := 1 ± 2√(ξ(1 − ξ)) and where (·)_+ denotes the positive part. Observe that µ ⊞ µ has a nonzero absolutely continuous part and, depending on the choice of ξ, a point mass. Another important choice for µ2 is Wigner's semicircle law µ_sc. For arbitrary µ1, the measure µ1 ⊞ µ_sc is purely absolutely continuous with a bounded density † that is real analytic wherever positive [13].

Returning to the generic setting, the atoms of µ1 ⊞ µ2 are identified as follows. A point c ∈ R is an atom of µ1 ⊞ µ2 if and only if there exist a, b ∈ R such that c = a + b and µ1({a}) + µ2({b}) > 1; see Theorem 7.4 in [11]. For other interesting properties of the atoms of µ1 ⊞ µ2 we refer the reader to [12]. The boundary behavior of the functions F_{µ1⊞µ2}, ω1 and ω2 has been studied by Belinschi [5, 6, 7], who proved the next two results. For simplicity, we restrict the discussion to compactly supported probability measures.

Proposition 2.2 (Theorem 2.3 in [5], Theorem 3.3 in [6]). Let µ1 and µ2 be compactly supported probability measures on R, neither of them being a single point mass. Then the functions F_{µ1⊞µ2}, ω1, ω2 : C+ → C+ extend continuously to R.

Belinschi further showed in Theorem 4.1 in [6] that the singular continuous part of µ1 ⊞ µ2 is always zero and that the absolutely continuous part, (µ1 ⊞ µ2)^ac, of µ1 ⊞ µ2 is always nonzero. We denote the density function of (µ1 ⊞ µ2)^ac by f_{µ1⊞µ2}.

We are now ready to introduce our notion of regular bulk, B_{µ1⊞µ2}, of µ1 ⊞ µ2. Informally, we let B_{µ1⊞µ2} be the open set on which µ1 ⊞ µ2 admits a continuous density that is strictly positive and bounded from above. For a formal definition we first introduce the set

  U_{µ1⊞µ2} := int( supp (µ1 ⊞ µ2)^ac \ { x ∈ R : lim_{η↓0} F_{µ1⊞µ2}(x + iη) = 0 } ). (2.8)

Note that U_{µ1⊞µ2} does not contain any atoms of µ1 ⊞ µ2. By Privalov's theorem, the set { x ∈ R : lim_{η↓0} F_{µ1⊞µ2}(x + iη) = 0 } has Lebesgue measure zero. In fact, an even stronger statement applies in the case at hand. Belinschi [7] showed that if x ∈ R is such that lim_{η↓0} F_{µ1⊞µ2}(x + iη) = 0, then it must be of the form x = a + b with µ1({a}) + µ2({b}) ≥ 1, a, b ∈ R. There can only be finitely many such x, thus U_{µ1⊞µ2} must contain an open non-empty interval.

Proposition 2.3 (Theorem 3.3 in [6]). Let µ1 and µ2 be as above and fix any x ∈ U_{µ1⊞µ2}. Then F_{µ1⊞µ2}, ω1, ω2 : C+ → C+ extend analytically around x. In particular, the density function f_{µ1⊞µ2} is real analytic in U_{µ1⊞µ2} wherever positive.

† All densities are with respect to Lebesgue measure on R.
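The explicit formula (2.7) can be sanity-checked numerically: the absolutely continuous part must carry mass 1 − (1 − 2ξ)_+ − (2ξ − 1)_+, and for ξ = 1/2 the density reduces to the arcsine density 1/(π√(x(2 − x))) on [0, 2]. A small sketch (ours; the value ξ = 0.3 is an arbitrary test choice):

```python
import numpy as np

def f_ac(x, xi):
    # a.c. density in (2.7): sqrt((l+ - x)_+ (x - l-)_+) / (pi x (2 - x))
    s = 2.0 * np.sqrt(xi * (1.0 - xi))
    lp, lm = 1.0 + s, 1.0 - s
    q = np.clip((lp - x) * (x - lm), 0.0, None)
    return np.sqrt(q) / (np.pi * x * (2.0 - x))

def total_mass(xi, n=200_001):
    # trapezoid rule over [l-, l+], plus the point masses at x = 0 and x = 2
    s = 2.0 * np.sqrt(xi * (1.0 - xi))
    x = np.linspace(1.0 - s, 1.0 + s, n)
    y = f_ac(x, xi)
    ac = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))
    return ac + max(1.0 - 2.0 * xi, 0.0) + max(2.0 * xi - 1.0, 0.0)

print(total_mass(0.3))          # close to 1: a.c. mass 0.6 plus atom 0.4 at 0
print(f_ac(1.0, 0.5) * np.pi)   # close to 1: arcsine density at x = 1 is 1/pi
```

For ξ = 0.3 the atom sits at 0 with mass 0.4, and the absolutely continuous part integrates to 0.6, so the total mass is 1, as it must be for a probability measure.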
The regular bulk is obtained from U_{µ1⊞µ2} by removing the zeros of f_{µ1⊞µ2} inside U_{µ1⊞µ2}.

Definition 2.4.
The regular bulk of the measure µ1 ⊞ µ2 is defined as the set

  B_{µ1⊞µ2} := U_{µ1⊞µ2} \ { x ∈ U_{µ1⊞µ2} : f_{µ1⊞µ2}(x) = 0 }. (2.9)

Note that B_{µ1⊞µ2} is an open non-empty set on which µ1 ⊞ µ2 admits the density f_{µ1⊞µ2}. The density is strictly positive and thus (by Proposition 2.3) real analytic on B_{µ1⊞µ2}.

2.2. Stability Result.
To present our results it is convenient to recast (2.5) in a compact form: For generic probability measures µ1, µ2 as above, let the function Φ_{µ1,µ2} : (C+)² × C+ → C² be given by

  Φ_{µ1,µ2}(ω1, ω2, z) := ( F_{µ1}(ω2) − ω1 − ω2 + z , F_{µ2}(ω1) − ω1 − ω2 + z )^⊤. (2.10)

Considering µ1, µ2 as fixed, the equation

  Φ_{µ1,µ2}(ω1, ω2, z) = 0, (2.11)

is equivalent to (2.5) and, by Proposition 2.1, there are unique analytic functions ω1, ω2 : C+ → C+, z ↦ ω1(z), ω2(z), satisfying (2.4) that solve (2.11) in terms of z. We use the following conventions: We denote by ω1 and ω2 generic variables on C+ and we denote, with a slight abuse of notation, by ω1(z) and ω2(z) the subordination functions solving (2.11) in terms of z. When no confusion can arise, we simply write Φ for Φ_{µ1,µ2}.

We call the system (2.11) linearly S-stable at (ω1, ω2) if

  Γ_{µ1,µ2}(ω1, ω2) := ‖ [ −1 , F'_{µ1}(ω2) − 1 ; F'_{µ2}(ω1) − 1 , −1 ]^{−1} ‖ ≤ S, (2.12)

for some constant S (rows of the 2 × 2 matrix separated by semicolons). Especially, the partial Jacobian matrix, DΦ(ω1, ω2), of (2.10), given by

  DΦ(ω1, ω2) := ( ∂Φ/∂ω1 (ω1, ω2, z) , ∂Φ/∂ω2 (ω1, ω2, z) ) = [ −1 , F'_{µ1}(ω2) − 1 ; F'_{µ2}(ω1) − 1 , −1 ],

admits a bounded inverse at (ω1, ω2). Note that DΦ(ω1, ω2) is independent of z.

Our first main result shows that the system (2.11) is linearly stable and that the imaginary parts of the subordination functions are bounded below in the regular bulk. We require some more notation: For a, b ≥ 0 with b ≥ a, and an interval I ⊂ R, we introduce the domain

  S_I(a, b) := { z = E + iη ∈ C+ : E ∈ I, a ≤ η ≤ b }. (2.13)

Theorem 2.5.
Let µ1 and µ2 be compactly supported probability measures on R, and assume that neither is supported at a single point and that at least one of them is supported at more than two points. Let I ⊂ B_{µ1⊞µ2} be a compact non-empty interval and fix some 0 < η_M < ∞. Then there are two constants k > 0 and S < ∞, both depending on the measures µ1 and µ2, on the interval I as well as on the constant η_M, such that the following statements hold.

(i) The imaginary parts Im ω1 and Im ω2 of the subordination functions associated with µ1 and µ2 satisfy

  min_{z ∈ S_I(0,η_M)} Im ω1(z) ≥ k,  min_{z ∈ S_I(0,η_M)} Im ω2(z) ≥ k. (2.14)

(ii) The system Φ_{µ1,µ2}(ω1, ω2, z) = 0 is linearly S-stable at (ω1(z), ω2(z)) uniformly in S_I(0, η_M), i.e.,

  max_{z ∈ S_I(0,η_M)} Γ_{µ1,µ2}(ω1(z), ω2(z)) ≤ S. (2.15)
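The quantity Γ_{µ1,µ2} in (2.12) is the operator norm of an explicit 2 × 2 matrix inverse, so for atomic measures it can be evaluated directly using F'_µ = m'_µ/m_µ². The sketch below is ours; the closed-form value of ω at the chosen point comes from the fact that, for µ1 = µ2 = (δ_0 + δ_1)/2, the system (2.5) reduces by symmetry to the quadratic equation ω² − zω + z/2 = 0, which can be verified by direct substitution.

```python
import numpy as np

def m_and_deriv(atoms, weights, w):
    d = atoms - w
    return np.sum(weights / d), np.sum(weights / d**2)  # m_mu(w), m_mu'(w)

def F_prime(atoms, weights, w):
    # F_mu = -1/m_mu, hence F_mu' = m_mu' / m_mu^2
    m, mp = m_and_deriv(atoms, weights, w)
    return mp / m**2

def Gamma(mu1, mu2, om1, om2):
    # operator norm of the inverse of DPhi(om1, om2), cf. (2.12)
    (a1, p1), (a2, p2) = mu1, mu2
    DPhi = np.array([[-1.0 + 0j, F_prime(a1, p1, om2) - 1.0],
                     [F_prime(a2, p2, om1) - 1.0, -1.0 + 0j]])
    return np.linalg.norm(np.linalg.inv(DPhi), 2)

# mu1 = mu2 = Bernoulli(1/2) on {0, 1}; at z = 1 + i the subordination
# functions coincide and solve om^2 - z*om + z/2 = 0
mu = (np.array([0.0, 1.0]), np.array([0.5, 0.5]))
z = 1.0 + 1.0j
om = (z + 1j * np.sqrt(2.0)) / 2.0   # root with Im om >= Im z
print(Gamma(mu, mu, om, om))          # finite: the system is stable here
```

The printed value is finite and of order one, illustrating (2.15) at a single bulk point of this example.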
Remark 2.1. The assumption that neither of µ1, µ2 is a point mass guarantees that the free additive convolution is not a simple translate. The case when both µ1 and µ2 are combinations of two point masses is special and its discussion is postponed to Section 7.

Theorem 2.5 has the following local stability result as a corollary.

Corollary 2.6.
Let µ1, µ2 and S_I(0, η_M) be as in Theorem 2.5. Fix z ∈ C+. Assume that the functions ω̃1, ω̃2, r̃1, r̃2 : C+ → C satisfy Im ω̃1(z) > 0, Im ω̃2(z) > 0 and

  Φ_{µ1,µ2}(ω̃1(z), ω̃2(z), z) = r̃(z), (2.16)

with r̃(z) := (r̃1(z), r̃2(z))^⊤. Let ω1, ω2 be the subordination functions solving the system Φ_{µ1,µ2}(ω1(z), ω2(z), z) = 0, z ∈ C+. Then there exists a (small) constant δ > 0 such that whenever we have

  |ω̃1(z) − ω1(z)| ≤ δ,  |ω̃2(z) − ω2(z)| ≤ δ, (2.17)

we also have

  |ω̃1(z) − ω1(z)| ≤ S ‖r̃(z)‖,  |ω̃2(z) − ω2(z)| ≤ S ‖r̃(z)‖. (2.18)

The constant δ > 0 depends on µ1 and µ2, on the interval I as well as on η_M.

We omit the proof of Corollary 2.6 from Theorem 2.5, since it follows directly from Proposition 4.1 in Section 4 below.

2.3. Applications.

We next explain two main applications of the stability estimates obtained in Theorem 2.5.

2.3.1. Continuity of the free additive convolution.
Our first application shows that the free additive convolution is a continuous operation when the image is equipped with the topology of local uniform convergence of the density in the regular bulk; see (2.23). Bercovici and Voiculescu (Proposition 4.13 of [10]) showed that the free additive convolution is continuous with respect to weak convergence of measures. More precisely, given two pairs of probability measures µ_A, µ_B and µ_α, µ_β on R, the measures µ_A ⊞ µ_B and µ_α ⊞ µ_β satisfy

  d_L(µ_A ⊞ µ_B, µ_α ⊞ µ_β) ≤ d_L(µ_A, µ_α) + d_L(µ_B, µ_β), (2.19)

where d_L denotes the Lévy distance. In particular, weak convergence of µ_A to µ_α and weak convergence of µ_B to µ_β imply weak convergence of µ_A ⊞ µ_B to µ_α ⊞ µ_β.

Using the Stieltjes transform, we can easily link (2.19) to the systems of equations in (2.5), respectively in (2.10). Using integration by parts and the definition of the Stieltjes transform, a direct computation reveals that there is a numerical constant C such that

  |m_{µ_A⊞µ_B}(z) − m_{µ_α⊞µ_β}(z)| ≤ (C/η)(1 + 1/η) d_L(µ_A ⊞ µ_B, µ_α ⊞ µ_β)
    ≤ (C/η)(1 + 1/η) (d_L(µ_A, µ_α) + d_L(µ_B, µ_β)),  η = Im z, (2.20)

for all z ∈ C+, where we used (2.19) to get the second line. Note that the estimate in (2.20) deteriorates as η approaches the real line. Our next result strengthens (2.20) as follows. We consider the measure µ_α ⊞ µ_β as "reference" measure (in the sense that it locates the regular bulk) while µ_A, µ_B are arbitrary probability measures, and we show that the Lévy distances bound |m_{µ_A⊞µ_B}(E + iη) − m_{µ_α⊞µ_β}(E + iη)| uniformly in η, for all E inside the regular bulk of µ_α ⊞ µ_β.

Theorem 2.7.
Let µ_α and µ_β be compactly supported probability measures on R, and assume that neither is supported at a single point and that at least one of them is supported at more than two points. Let I ⊂ B_{µ_α⊞µ_β} be a compact non-empty interval and fix some 0 < η_M < ∞. Let µ_A and µ_B be two arbitrary probability measures on R. Then there are constants b > 0 and Z < ∞, both depending on the measures µ_α and µ_β, on the interval I as well as on the constant η_M, such that whenever

  d_L(µ_A, µ_α) + d_L(µ_B, µ_β) ≤ b (2.21)

holds, then

  max_{z ∈ S_I(0,η_M)} |m_{µ_A⊞µ_B}(z) − m_{µ_α⊞µ_β}(z)| ≤ Z (d_L(µ_A, µ_α) + d_L(µ_B, µ_β)), (2.22)

holds, too.

Note that max_{z ∈ S_I(0,η_M)} |m_{µ_α⊞µ_β}(z)| < ∞ by compactness of I and analyticity of m_{µ_α⊞µ_β} in I. Thus the Stieltjes–Perron inversion formula directly implies that (µ_A ⊞ µ_B)^ac has a density, f_{µ_A⊞µ_B}, inside I and that

  max_{x ∈ I} |f_{µ_A⊞µ_B}(x) − f_{µ_α⊞µ_β}(x)| ≤ Z (d_L(µ_A, µ_α) + d_L(µ_B, µ_β)), (2.23)

provided that (2.21) holds, where f_{µ_α⊞µ_β} is the density of (µ_α ⊞ µ_β)^ac.

Remark 2.2. The estimate (2.22) was recently given by Kargin [30] under the assumption that (2.14) and (2.15) hold for all z ∈ S_I(0, η_M), i.e., under the assumption that the conclusions of our Theorem 2.5 hold. It is quite surprising that one can directly set Im z = 0 in (2.22). As first noted by Kargin, this is due to the regularizing effect of ω_α, ω_β and to the global uniqueness of solutions to (2.5) for arbitrary probability measures.

2.3.2. Application to random matrix theory.
We now turn to an application of Theorem 2.5 in random matrix theory. Let A ≡ A(N) and B ≡ B(N) be two sequences of N × N deterministic real diagonal matrices, whose empirical spectral distributions are denoted by µ_A and µ_B respectively, i.e.,

  µ_A := (1/N) Σ_{i=1}^N δ_{a_i},  µ_B := (1/N) Σ_{i=1}^N δ_{b_i}, (2.24)

where A = diag(a_i), B = diag(b_i). The matrices A and B depend on N, but we omit this fact from the notation. Let ω_A and ω_B denote the subordination functions associated with µ_A and µ_B by Proposition 2.1.

We assume that there are deterministic probability measures µ_α and µ_β on R, neither of them being a single point mass, such that the empirical spectral distributions µ_A, µ_B converge weakly to µ_α, µ_β, as N → ∞. More precisely, we assume that

  d_L(µ_A, µ_α) + d_L(µ_B, µ_β) → 0, (2.25)

as N → ∞. Let ω_α, ω_β denote the subordination functions associated with µ_α and µ_β.

Let U be an independent N × N Haar distributed unitary matrix (in short Haar unitary) and consider the random matrix

  H ≡ H(N) := A + UBU*. (2.26)

We introduce the Green function, G_H, of H and its normalized trace, m_H, by setting

  G_H(z) := 1/(H − z),  m_H(z) := tr G_H(z), (2.27)

z ∈ C+. We refer to z as the spectral parameter and we often write z = E + iη, E ∈ R, η > 0. Recall the definition of S_I(a, b) in (2.13). We have the following local law for m_H.

Theorem 2.8.
Let µ_α and µ_β be two compactly supported probability measures on R, and assume that neither is supported at a single point and that at least one of them is supported at more than two points. Let I ⊂ B_{µ_α⊞µ_β} be a compact non-empty interval and fix some 0 < η_M < ∞. Assume that the sequences of matrices A and B in (2.26) are such that their empirical eigenvalue distributions µ_A and µ_B satisfy (2.25). Fix any small γ > 0 and set η_m := N^{−2/3+γ}.

Then we have the following uniform estimate: For any (small) ε > 0 and any (large) D,

  P( ⋃_{z ∈ S_I(η_m,η_M)} { |m_H(z) − m_{µ_A⊞µ_B}(z)| > N^ε/(N η^{3/2}) } ) ≤ N^{−D}, (2.28)

holds for N ≥ N_0, with some N_0 sufficiently large, where we write z = E + iη.

Using standard techniques of random matrix theory, we can translate the estimate (2.28) on the Green function into an estimate on the empirical spectral distribution of the matrix H. Let λ_1, ..., λ_N denote the ordered eigenvalues of H and denote by

  µ_H := (1/N) Σ_{i=1}^N δ_{λ_i} (2.29)

its empirical spectral distribution. Our result on the rate of convergence of µ_H is as follows.

Corollary 2.9. Let I ⊂ B_{µ_α⊞µ_β} be a compact non-empty interval. Then, for any E_1 < E_2 in I, we have the following estimate. For any (small) ε > 0 and any (large) D we have

  P( |µ_H([E_1, E_2)) − (µ_A ⊞ µ_B)([E_1, E_2))| > N^ε/N^{2/3} ) ≤ N^{−D}, (2.30)

for N ≥ N_0, with some N_0 sufficiently large.

We omit the proof of Corollary 2.9 from Theorem 2.8, but mention that the normalized trace m_H of the Green function and the empirical spectral distribution µ_H of H are linked by

  m_H(z) = tr G_H(z) = (1/N) Σ_{i=1}^N 1/(λ_i − z) = ∫_R dµ_H(x)/(x − z),  z ∈ C+.

Corollary 2.9 then follows from a standard application of the Helffer–Sjöstrand functional calculus; see, e.g., Section 7.1 of [22] for a similar argument.

Note that assumption (2.25) does not exclude that the matrix H has outliers in the large N limit. In fact, the model H = A + UBU* shows a rich phenomenology when, say, A has a finite number of large spikes; we refer to the recent works in [8, 9, 16, 31].

Remark 2.3. Our results in Theorem 2.8 and Corollary 2.9 are stated for U Haar distributed on the unitary group U(N). However, they also hold true (with the same proofs) when U is Haar distributed on the orthogonal group O(N).

Remark 2.4. In [3], we derive, with a different approach, the estimate (with the notation of (2.28))

  P( ⋃_{z ∈ S_I(η_m,η_M)} { |m_H(z) − m_{µ_A⊞µ_B}(z)| > N^ε/√(Nη) } ) ≤ N^{−D}, (2.31)

for N ≥ N_0, with some N_0 sufficiently large, and with η_m = N^{−1+γ}. In fact, we obtain estimates for individual matrix elements of the resolvent G_H as well. Comparing with (2.28), we see that we can choose η in (2.31) almost as small as N^{−1}, at the price of losing a factor √N η in the error bound. The stability and perturbation analysis in [3] rely on the optimal results in Theorem 2.5 and Theorem 2.7, as well as on Sections 3-5 of the present paper.

2.4. Organization of the paper.
In Section 3, we consider the stability of the system (2.5) when at least one of the measures µ1 and µ2 is supported at more than two points, and we give the proof of Theorem 2.5. In Section 4, we consider perturbations of the system (2.5) and derive results that will be used in the proof of Theorem 2.8 and also in [3]. In Section 5, we prove Theorem 2.7. In Section 6 we consider the random matrix setup and prove Theorem 2.8. In the final Section 7, we separately settle the special case when both µ1 and µ2 are combinations of two point masses and give the results analogous to Theorem 2.5, Theorem 2.7 and Theorem 2.8 for that case.

3. Stability of the system (2.11) and proof of Theorem 2.5
In this section, we discuss stability properties of the system (2.11), with µ1, µ2 two compactly supported probability measures satisfying the assumptions of Theorem 2.5.

Lemma 3.1.
Let µ1, µ2 be two probability measures on R, neither of them being supported at a single point. Then there is, for any compact set K ⊂ C+, a strictly positive constant 0 < σ_1(µ1, K) < 1 such that the reciprocal Stieltjes transform F_{µ1} (see (2.2)) satisfies

  Im z ≤ (1 − σ_1(µ1, K)) Im F_{µ1}(z),  ∀ z ∈ K. (3.1)

Similarly, there is 0 < σ_2(µ2, K) < 1 such that (3.1) holds with µ2 and F_{µ2}, respectively. Assume in addition that µ1 is supported at more than two points. Then there is, for any compact set K ⊂ C+, a strictly positive constant 0 < σ̃_1(µ1, K) < 1 such that

  |F'_{µ1}(z) − 1| ≤ (1 − σ̃_1(µ1, K)) (Im F_{µ1}(z) − Im z)/Im z,  ∀ z ∈ K, (3.2)

where F'_{µ1}(z) ≡ ∂_z F_{µ1}(z).

Proof of Lemma 3.1. Assuming by contradiction that inequality (3.1) saturates (with vanishing constant σ_1(µ1, K) = 0, for some z ∈ K ⊂ C+), we have Im F_{µ1}(z) = Im z for some z, thus F_{µ1}(z) = z − a, a ∈ R, i.e., µ1 = δ_a. This shows (3.1).

To establish (3.2), we first note that the analytic functions F_{µ_j} : C+ → C+, j = 1, 2, admit the Nevanlinna representations

  F_{µ_j}(z) = a_{F_{µ_j}} + z + ∫_R (1 + xz)/(x − z) dρ_{F_{µ_j}}(x),  j = 1, 2,  z ∈ C+, (3.3)

where a_{F_{µ_j}} ∈ R and ρ_{F_{µ_j}} are finite Borel measures on R. Note that the coefficients of z on the right-hand side are determined by (2.3). From (3.3) we see that

  |F'_{µ1}(z) − 1| = | ∫_R (1 + x²)/(x − z)² dρ_{F_{µ1}}(x) |,  z ∈ C+, (3.4)

as well as

  (Im F_{µ1}(z) − Im z)/Im z = ∫_R (1 + x²)/|x − z|² dρ_{F_{µ1}}(x),  z ∈ C+. (3.5)

Hence, assuming by contradiction that inequality (3.2) saturates (with σ̃_1(µ1, K) = 0, for some z ∈ K), we must have

  ∫_R (1 + x²)/|x − z|² dρ_{F_{µ1}}(x) = | ∫_R (1 + x²)/(x − z)² dρ_{F_{µ1}}(x) |, (3.6)

for some z ∈ K, implying that ρ_{F_{µ1}} is either a single point mass or ρ_{F_{µ1}} = 0. In the latter case, we have F_{µ1}(z) = a_{F_{µ1}} + z and we conclude that µ1 must be a single point measure, but this is excluded by assumption.
Thus ρ_{F_{µ1}} is a single point mass, i.e., there are constants d_{µ1} ∈ R and t := ρ_{F_{µ1}}({d_{µ1}}) > 0 such that F_{µ1}(z) = a_{F_{µ1}} + z + t (1 + z d_{µ1})/(d_{µ1} − z), z ∈ K. It follows that µ1 is a convex combination of two point measures, yielding a contradiction. This shows (3.2). □

Bounds on the subordination functions.
Let µ1, µ2 be as above and let ω1(z), ω2(z) be the associated subordination functions. Recall that we rewrite the defining equations (2.5) for ω1 and ω2 in the compact form Φ_{µ1,µ2}(ω1, ω2, z) = 0 introduced in (2.11). We first provide upper bounds on the subordination functions ω1(z), ω2(z). Our proof relies on the assumption that µ1, µ2 are compactly supported, i.e., that there is a constant L < ∞ such that

  supp µ1 ⊂ [−L, L],  supp µ2 ⊂ [−L, L]. (3.7)

Recall from Theorem 2.5 that we fixed a compact non-empty interval I ⊂ B_{µ1⊞µ2}. Since the density f_{µ1⊞µ2} is real analytic inside the regular bulk by Proposition 2.3 and since I is compact, there exists a constant κ > 0 such that

  0 < κ ≤ min_{x ∈ I} f_{µ1⊞µ2}(x). (3.8)

Fixing a constant 0 < η_M < ∞, it further follows that there is a constant M < ∞ such that

  max_{z ∈ S_I(0,η_M)} |m_{µ1⊞µ2}(z)| ≤ M. (3.9)
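For the Bernoulli example (2.7) with ξ = 1/2, the constants in (3.8) and (3.9) can be made concrete: µ ⊞ µ is the arcsine law on [0, 2], with Stieltjes transform −1/√(z(z − 2)) (branch with positive imaginary part) and density 1/(π√(x(2 − x))). A numerical illustration (ours; the choices I = [0.5, 1.5] and η_M = 1 are arbitrary):

```python
import numpy as np

def m_arcsine(z):
    # Stieltjes transform of the arcsine law on [0, 2]; choose the square-root
    # branch with positive imaginary part so that Im m > 0 on C+
    s = np.sqrt(z * (z - 2.0))
    s = np.where(s.imag > 0, s, -s)
    return -1.0 / s

E = np.linspace(0.5, 1.5, 201)        # I = [0.5, 1.5], inside the regular bulk
eta = np.linspace(1e-3, 1.0, 200)     # 0 < eta <= eta_M = 1
Z = E[:, None] + 1j * eta[None, :]

kappa = np.min(1.0 / (np.pi * np.sqrt(E * (2.0 - E))))   # cf. (3.8)
M = np.max(np.abs(m_arcsine(Z)))                          # cf. (3.9)
print(kappa, M)   # kappa is about 0.32, M about 1.15
```

Here the minimum of the density over I is attained at E = 1 (value 1/π), while |m| is largest near the endpoints of I at small η; both constants stay of order one uniformly down to the real axis, as Theorem 2.5 asserts in general.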
Lemma 3.2.
Let µ1, µ2 be two compactly supported probability measures on R satisfying (3.7), for some L < ∞, and assume that both are supported at more than one point. Let I ⊂ B_{µ1⊞µ2} be a compact non-empty interval. Then there is a constant K < ∞ such that

  max_{z ∈ S_I(0,η_M)} |ω1(z)| ≤ K,  max_{z ∈ S_I(0,η_M)} |ω2(z)| ≤ K. (3.10)

The constant K depends on the constant η_M and on the interval I, as well as on the measures µ1, µ2 through the constant κ in (3.8) and the constant L in (3.7).

Proof. We start by noticing that there is a constant κ_0 > 0 such that

  Im m_{µ1⊞µ2}(z) = ∫_R η d(µ1 ⊞ µ2)(x)/((x − E)² + η²) ≥ ∫_I η f_{µ1⊞µ2}(x) dx/((x − E)² + η²) ≥ κ_0, (3.11)

uniformly in z = E + iη ∈ S_I(0, η_M), where we used (3.8). Thus by subordination we have

  min_{z ∈ S_I(0,η_M)} |m_{µ1⊞µ2}(z)| = min_{z ∈ S_I(0,η_M)} | ∫_R dµ1(a)/(a − ω2(z)) | ≥ κ_0, (3.12)

since m_{µ1⊞µ2}(z) = −1/F_{µ1⊞µ2}(z) by (2.6). On the other hand, µ1 is supported in the interval [−L, L]; see (3.7). Hence, using (3.12), |ω2(z)| must be bounded from above on S_I(0, η_M). Interchanging the rôles of the indices 1 and 2, we also get that |ω1(z)| is bounded from above on S_I(0, η_M). □

Having established upper bounds on the subordination functions, we show that their imaginary parts are uniformly bounded from below on the domain S_I(0, η_M). The proof relies on inequality (3.1).

Lemma 3.3.
Let µ1, µ2 be two probability measures on R satisfying (3.7), for some L < ∞, and assume that neither of them is supported at a single point. Let I ⊂ B_{µ1⊞µ2} be a compact non-empty interval. Then there is a strictly positive constant k > 0 such that

  min_{z ∈ S_I(0,η_M)} Im ω1(z) ≥ k,  min_{z ∈ S_I(0,η_M)} Im ω2(z) ≥ k. (3.13)

Remark 3.1. The constant k in (3.13) depends on the interval I through the constants κ in (3.8) and M in (3.9). It further depends on η_M, as well as on σ_1(µ1, K_2) and σ_2(µ2, K_1) in (3.1), with

  K_i = { u ∈ C+ : u = ω_i(z), z ∈ S_I(0, η_M) },  i = 1, 2. (3.14)

Proof of Lemma 3.3.
First note that there is κ_0 > 0 such that Im m_{µ1⊞µ2}(z) ≥ κ_0 for all z ∈ S_I(0, η_M); c.f. (3.11). Moreover, there is M < ∞ such that |m_{µ1⊞µ2}(z)| ≤ M for all z ∈ S_I(0, η_M); c.f. (3.9). Recall from (2.5) and (2.6) that

  ω1(z) + ω2(z) = z − 1/m_{µ1⊞µ2}(z),  z ∈ C+. (3.15)

Hence, considering the imaginary part, we notice from (3.9) that there is κ_1 > 0 such that

  min_{z ∈ S_I(0,η_M)} (Im ω1(z) + Im ω2(z)) ≥ κ_1. (3.16)

It remains to show that Im ω1 and Im ω2 are separately bounded from below. To do so we invoke (3.1) and assume by contradiction that Im ω2(z) ≤ ε, for some small 0 ≤ ε < κ_1/2; then Im ω1(z) ≥ κ_1/2. Since µ2 is assumed not to be a single point mass, Lemma 3.1 assures that

  Im F_{µ2}(ω1(z)) ≥ Im ω1(z)/(1 − σ_2(µ2, K_1)),  z ∈ S_I(0, η_M), (3.17)

with 0 < σ_2(µ2, K_1) < 1, where K_1 denotes the image of S_I(0, η_M) under the map ω1 (which is necessarily compact by Lemma 3.2). On the other hand, (2.11) implies

  Im F_{µ2}(ω1(z)) = Im ω1(z) + Im ω2(z) − Im z,  z ∈ C+. (3.18)

Since Im ω2(z) ≥ Im z, by Proposition 2.1, we get, by comparing (3.18) and (3.17), a contradiction with the assumption that Im ω2(z) ≤ ε, for sufficiently small ε. Repeating the argument with the rôles of the indices 1 and 2 interchanged, we get (3.13). □

Linear stability of (2.11). Having established lower and upper bounds on the subordination functions ω1, ω2, we now turn to the stability of the system Φ_{µ1,µ2}(ω1, ω2, z) = 0. Remember that we call the system linearly S-stable at (ω1, ω2) if Γ_{µ1,µ2}(ω1, ω2) ≤ S, where Γ_{µ1,µ2} is defined in (2.12).

Lemma 3.4.
Let µ₁, µ₂ be two probability measures on ℝ satisfying (3.7) for some L < ∞. Assume that neither of them is a single point mass and that at least one of them is supported at more than two points. Let I ⊂ B_{µ₁⊞µ₂} be a compact non-empty interval. Then, there is a finite constant S such that

  max_{z∈S_I(0,η_M)} Γ_{µ₁,µ₂}(ω₁(z), ω₂(z)) ≤ S ,   (3.19)

and

  max_{z∈S_I(0,η_M)} |ω₁′(z)| ≤ S ,   max_{z∈S_I(0,η_M)} |ω₂′(z)| ≤ S ,   (3.20)

where ω₁(z), ω₂(z) are the solutions to Φ_{µ₁,µ₂}(ω₁, ω₂, z) = 0.

Remark 3.2. Lemma 3.4 is the first instance where we use that at least one of µ₁ and µ₂ is supported at more than two points. For definiteness, we assume that µ₁ is supported at more than two points. The constant S in (3.19) depends on the interval I through the constant κ in (3.8), on the constants η_M, L in (3.7), σ(µ₁, K₂) and σ(µ₂, K₁), as well as on σ̃(µ₁, K₂) of (3.2), with K₂ defined in (3.14).

Proof of Lemma 3.4.
Using (2.12) and Cramer's rule, Γ ≡ Γ_{µ₁,µ₂}(ω₁, ω₂, z) equals

  Γ = 1/|1 − (F′_{µ₁}(ω₂) − 1)(F′_{µ₂}(ω₁) − 1)| · ‖ ( −1 , −(F′_{µ₁}(ω₂(z)) − 1) ; −(F′_{µ₂}(ω₁(z)) − 1) , −1 ) ‖ .   (3.21)

As above, we assume for definiteness that µ₁ is supported at more than two points. We first focus on F′_{µ₁}(ω₂). Recalling the definition of K₂ from Remark 3.1 and invoking (3.2), we obtain

  |F′_{µ₁}(ω₂(z)) − 1| ≤ (1 − σ̃²(µ₁, K₂)) ( Im F_{µ₁}(ω₂(z)) − Im ω₂(z) )/Im ω₂(z) ,   (3.22)

for all z ∈ S_I(0,η_M), where 0 < σ̃(µ₁, K₂) < 1. Abbreviating σ̃ ≡ σ̃(µ₁, K₂) and using Φ_{µ₁,µ₂}(ω₁(z), ω₂(z), z) = 0, we thus have

  |F′_{µ₁}(ω₂(z)) − 1| ≤ (1 − σ̃²) Im ω₁(z)/Im ω₂(z) ,   z ∈ S_I(0,η_M) .   (3.23)

Reasoning in a similar way (c.f., (4.9)), we also obtain

  |F′_{µ₂}(ω₁(z)) − 1| ≤ Im ω₂(z)/Im ω₁(z) ,   z ∈ ℂ⁺ ,   (3.24)

where the inequality may saturate here since we do not exclude µ₂ being supported at two points only. Multiplying (3.23) and (3.24), we get

  max_{z∈S_I(0,η_M)} |F′_{µ₁}(ω₂(z)) − 1| |F′_{µ₂}(ω₁(z)) − 1| ≤ 1 − σ̃² .   (3.25)

Using Lemma 3.2 and Lemma 3.3, we also have from (3.23) and (3.24) that

  max_{z∈S_I(0,η_M)} |F′_{µ_i}(ω_j(z)) − 1| ≤ K/k ,   {i, j} = {1, 2} .   (3.26)

Hence, bounding the operator norm by the Hilbert–Schmidt norm in (3.21), we obtain by (3.26) and (3.25) that

  max_{z∈S_I(0,η_M)} Γ_{µ₁,µ₂}(ω₁(z), ω₂(z)) ≤ (√2/σ̃²) ( 1 + (K/k)² )^{1/2} =: S ,   (3.27)

with a finite constant S. This proves (3.19). The estimates in (3.20) follow by differentiating the equation Φ_{µ₁,µ₂}(ω₁(z), ω₂(z), z) = 0 with respect to z. We get

  ( −1 , F′_{µ₁}(ω₂(z)) − 1 ; F′_{µ₂}(ω₁(z)) − 1 , −1 ) (ω₁′(z), ω₂′(z))ᵀ = (−1, −1)ᵀ .   (3.28)

From (3.19) we know that Φ is uniformly S-stable and we get (3.20) by inverting (3.28). □

Remark 3.3. The crucial estimate in the proof above is (3.25). An alternative proof of (3.25), under the assumption that both µ₁ and µ₂ are supported at more than two points, was pointed out by an anonymous referee. From (2.5) we observe that the subordination function ω₁(z) appears, for fixed z ∈ ℂ⁺, as the fixed point of the map

  F_z : ℂ⁺ → ℂ⁺ ,   u ↦ F_z(u) := F_{µ₁}( F_{µ₂}(u) − u + z ) − ( F_{µ₂}(u) − u + z ) + z .   (3.29)

Indeed, assuming that ω₁(z) ∈ ℂ⁺ and that µ₁, µ₂ are supported at least at three points (so that F_{µ₁}(z) − z and F_{µ₂}(z) − z are not Möbius transformations), the fixed point ω₁(z) is attracting, as was shown in [4]. Thus, for any fixed k > 0, the Schwarz–Pick theorem and (2.5) imply that for any compact subset K̂ of {z ∈ ℂ⁺ ∪ ℝ : Im ω₁(z), Im ω₂(z) ≥ k} there is a constant σ̂(K̂) < 1 such that

  |F′_{µ₁}(ω₂(z)) − 1| |F′_{µ₂}(ω₁(z)) − 1| ≤ σ̂(K̂) < 1 ,   for any z ∈ K̂ .

Thus, under the assumption that µ₁ and µ₂ are both supported at least at three points, (3.25) follows from Lemma 3.2 and Lemma 3.3.

Collecting the results of this section, we obtain the proof of Theorem 2.5.

Proof of Theorem 2.5.
Lemma 3.3 proves (2.14). Lemma 3.4 proves (2.15). □

Perturbations of the system (2.11)

In this section, we study perturbations of the system Φ_{µ₁,µ₂}(ω₁, ω₂, z) = 0, where µ₁, µ₂ denote general compactly supported probability measures on ℝ. The main result of this section, Proposition 4.1 below, is used repeatedly in the continuity argument to prove Theorem 2.8. Yet, as noted in Corollary 2.6, it is of interest in itself and it is also used in [3].

Proposition 4.1.
Fix z₀ ∈ ℂ⁺. Assume that the functions ω̃₁, ω̃₂, r̃₁, r̃₂ : ℂ⁺ → ℂ satisfy Im ω̃₁(z₀) > 0, Im ω̃₂(z₀) > 0 and

  Φ_{µ₁,µ₂}(ω̃₁(z₀), ω̃₂(z₀), z₀) = r̃(z₀) ,   (4.1)

where r̃(z₀) := (r̃₁(z₀), r̃₂(z₀))ᵀ. Assume moreover that there is δ ∈ [0, 1) such that

  |ω̃₁(z₀) − ω₁(z₀)| ≤ δ ,   |ω̃₂(z₀) − ω₂(z₀)| ≤ δ ,   (4.2)

where ω₁(z₀), ω₂(z₀) solve the unperturbed system Φ_{µ₁,µ₂}(ω₁, ω₂, z₀) = 0. Assume that there is a constant S such that Φ is linearly S-stable at (ω₁(z₀), ω₂(z₀)), and assume in addition that there are strictly positive constants K and k, with k > 2δ and k² > 16δKS, such that

  0 < k ≤ Im ω₁(z₀) ≤ K ,   0 < k ≤ Im ω₂(z₀) ≤ K .   (4.3)

Then we have the bounds

  |ω̃₁(z₀) − ω₁(z₀)| ≤ 2S ‖r̃(z₀)‖ ,   |ω̃₂(z₀) − ω₂(z₀)| ≤ 2S ‖r̃(z₀)‖ .   (4.4)

Proof.
Combining (4.3) and (4.2) with δ < k/2, we get

  Im ω̃₁(z₀) ≥ k/2 ,   Im ω̃₂(z₀) ≥ k/2 .   (4.5)

Next, we bound higher derivatives of F_i ≡ F_{µ_i}, i = 1, 2. We first note that by the Nevanlinna representation (3.3) we have

  Im F_i(ω)/Im ω = 1 + ∫_ℝ (1 + x²)/|x − ω|² dρ_{F_i}(x) ,   ω ∈ ℂ⁺ ,  i = 1, 2 .   (4.6)

On the other hand, we also have from (3.3) that

  |F_i′(ω) − 1| ≤ ∫_ℝ (1 + x²)/|x − ω|² dρ_{F_i}(x) ,   ω ∈ ℂ⁺ ,  i = 1, 2 ,   (4.7)

and analogously for higher derivatives, n ≥ 2,

  |F_i^{(n)}(ω)| ≤ n! ∫_ℝ (1 + x²)/|x − ω|^{n+1} dρ_{F_i}(x) ≤ n!/(Im ω)^{n−1} · ∫_ℝ (1 + x²)/|x − ω|² dρ_{F_i}(x) ,   (4.8)

ω ∈ ℂ⁺, i = 1, 2. Thus, combining (4.7), (4.6) and (2.11) we get

  |F_i′(ω_j(z)) − 1| ≤ ( Im F_i(ω_j(z)) − Im ω_j(z) )/Im ω_j(z) = ( Im ω_i(z) − Im z )/Im ω_j(z) ,   {i, j} = {1, 2} ,   (4.9)

z ∈ ℂ⁺, and similarly, starting from (4.8),

  |F_i^{(n)}(ω_j(z))| ≤ n! ( Im ω_i(z) − Im z )/(Im ω_j(z))ⁿ ,   z ∈ ℂ⁺ ,  {i, j} = {1, 2} .   (4.10)

Let Ω_i(z) := ω̃_i(z) − ω_i(z), i = 1, 2, and Ω := (Ω₁, Ω₂)ᵀ. Fixing z = z₀ and Taylor expanding F₁(ω̃₂(z₀)) around ω₂(z₀) we get

  F₁′(ω₂(z₀))Ω₂(z₀) − Ω₁(z₀) − Ω₂(z₀) = r̃₁(z₀) − Σ_{n≥2} (1/n!) F₁^{(n)}(ω₂(z₀)) Ω₂(z₀)ⁿ .   (4.11)

Recalling that ‖Ω(z₀)‖/k ≤ 2δ/k < 1, we get from (4.10) and (4.11) that

  |F₁′(ω₂(z₀))Ω₂(z₀) − Ω₁(z₀) − Ω₂(z₀)| ≤ ‖r̃(z₀)‖ + (2K/k²) ‖Ω(z₀)‖² ,   (4.12)

and the analogous expansion with the rôles of the indices 1 and 2 interchanged. We therefore obtain from (2.12) and from solving the linearized equation that

  ‖Ω(z₀)‖ ≤ S ‖r̃(z₀)‖ + (4KS/k²) ‖Ω(z₀)‖² .   (4.13)

Thus, we have the dichotomy that either ‖Ω(z₀)‖ ≤ 2S‖r̃(z₀)‖ or k²/(8KS) ≤ ‖Ω(z₀)‖. Since k² > 16δKS by assumption, the second alternative contradicts ‖Ω(z₀)‖ ≤ 2δ. This proves the estimates in (4.4). □

In Proposition 4.1 we assumed the a priori bound |ω̃_i − ω_i| ≤ δ; see (4.2). The next lemma shows that we may drop this assumption, for spectral parameters z with sufficiently large imaginary part, at the price of assuming effective lower bounds on Im ω̃_i. This statement will be used as an initial input to start the continuity argument in Section 6.

Lemma 4.2.
Assume there is a (large) η̃ > 0 such that for any z ∈ ℂ⁺ with Im z ≥ η̃ the analytic functions ω̃₁, ω̃₂, r̃₁, r̃₂ : ℂ⁺ → ℂ satisfy

  Im ω̃₁(z) − Im z ≥ 4‖r̃(z)‖ ,   Im ω̃₂(z) − Im z ≥ 4‖r̃(z)‖ ,   (4.14)

and

  Φ_{µ₁,µ₂}(ω̃₁(z), ω̃₂(z), z) = r̃(z) ,   (4.15)

where r̃(z) := (r̃₁(z), r̃₂(z))ᵀ. Then there is a constant η₀ > 0, with η₀ ≥ η̃, such that

  |ω̃₁(z) − ω₁(z)| ≤ 4‖r̃(z)‖ ,   |ω̃₂(z) − ω₂(z)| ≤ 4‖r̃(z)‖ ,   (4.16)

on the domain {z ∈ ℂ⁺ : Im z ≥ η₀}, where ω₁ and ω₂ are the subordination functions associated with µ₁ and µ₂. The constant η₀ depends on the measures µ₁ and µ₂, and on the function r̃ through the constant η̃ > 0.

Proof.
Recall the Nevanlinna representation (3.3) for F_{µ₁} and F_{µ₂}. Since µ₁ and µ₂ are compactly supported, we have, as Im ω ↗ ∞,

  F_{µ₁}(ω) − ω = a₁ + O(|ω|⁻¹) ,   F_{µ₂}(ω) − ω = a₂ + O(|ω|⁻¹) ,   (4.17)

with a₁ ≡ a_{F_{µ₁}} and a₂ ≡ a_{F_{µ₂}}. There are thus s̃₁, s̃₂ : ℂ⁺ → ℂ such that

  Φ_{µ₁,µ₂}(ω̃₁(z), ω̃₂(z), z) = ( a₁ + s̃₁(z) − ω̃₁(z) + z ; a₂ + s̃₂(z) − ω̃₂(z) + z ) = ( r̃₁(z) ; r̃₂(z) ) ,   (4.18)

with

  s̃₁(z) = O(|ω̃₂(z)|⁻¹) ,   s̃₂(z) = O(|ω̃₁(z)|⁻¹) ,   (4.19)

as Im z ↗ ∞. It follows immediately that ω̃₁(z) = O(Im z) and ω̃₂(z) = O(Im z), as Im z ↗ ∞. Thus, recalling the definition of Γ_{µ₁,µ₂} in (2.12), we get

  Γ_{µ₁,µ₂}(ω̃₁(z), ω̃₂(z)) = 1 + O(η⁻²) ,   (4.20)

as η = Im z ↗ ∞. In particular, we obtain

  ‖((DΦ)⁻¹Φ)(ω̃₁(z), ω̃₂(z), z)‖ ≤ Γ_{µ₁,µ₂}(ω̃₁(z), ω̃₂(z)) ‖Φ(ω̃₁(z), ω̃₂(z), z)‖ ≤ 2‖r̃(z)‖ ,   (4.21)

for Im z sufficiently large. From (4.8) and (4.17), we also get

  |F^{(2)}_{µ_i}(ω)| ≤ 2 ( Im F_{µ_i}(ω) − Im ω )/(Im ω)² = O((Im ω)⁻³) ,   ω ∈ ℂ⁺ ,  i = 1, 2 ,   (4.22)

as Im ω ↗ ∞. Thus the matrix of second derivatives of Φ, given by

  D²Φ(ω₁, ω₂) := ( ∂²Φ/∂ω₁²(ω₁, ω₂, z) , ∂²Φ/∂ω₂²(ω₁, ω₂, z) ) = ( 0 , F^{(2)}_{µ₁}(ω₂) ; F^{(2)}_{µ₂}(ω₁) , 0 ) ,

satisfies ‖D²Φ(ω̃₁(z), ω̃₂(z))‖ = O((Im z)⁻¹), as Im z ↗ ∞. Hence, choosing η₀ > 0 sufficiently large, we can achieve that

  s := 4 ‖r̃(z)‖ ‖D²Φ(ω̃₁(z), ω̃₂(z))‖ < 1/2 ,

on the domain {z ∈ ℂ⁺ : Im z ≥ η₀}. Thus, by the Newton–Kantorovich theorem (see, e.g., Theorem 1 in [25]), there are for every such z unique ω̂₁(z), ω̂₂(z) such that Φ_{µ₁,µ₂}(ω̂₁(z), ω̂₂(z), z) = 0, with

  |ω̃_i(z) − ω̂_i(z)| ≤ (1 − √(1 − 2s))/s · 2‖r̃(z)‖ ≤ 4‖r̃(z)‖ ,   i = 1, 2 .   (4.23)

Finally, we note that Im ω̂₁(z) = Im ω̂₁(z) − Im ω̃₁(z) + Im ω̃₁(z) ≥ Im z, by (4.14) and (4.23), for Im z ≥ η₀. Similarly, Im ω̂₂(z) ≥ Im z, for Im z ≥ η₀.
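As an aside, the Newton scheme behind the Newton–Kantorovich argument is easy to run numerically. The sketch below is our own illustration, not part of the paper: the two atomic measures, the spectral parameter z = 1 + 5i, and the starting point ω₁ = ω₂ = z are arbitrary test choices in the large-Im z regime of Lemma 4.2.

```python
# Newton's method for the 2x2 subordination system
#   Phi_1 = F_{mu1}(omega_2) - omega_1 - omega_2 + z = 0
#   Phi_2 = F_{mu2}(omega_1) - omega_1 - omega_2 + z = 0
# for two illustrative atomic measures (hypothetical example data).

atoms1, atoms2 = (-1.0, 0.0, 1.0), (-2.0, 2.0)

def m(u, atoms):
    # Stieltjes transform of the uniform measure on the given atoms
    return sum(1.0 / (a - u) for a in atoms) / len(atoms)

def mp(u, atoms):
    # derivative of the Stieltjes transform
    return sum(1.0 / (a - u) ** 2 for a in atoms) / len(atoms)

def F(u, atoms):
    # negative reciprocal Stieltjes transform F_mu = -1/m_mu
    return -1.0 / m(u, atoms)

def Fp(u, atoms):
    # F'(u) = m'(u)/m(u)^2
    return mp(u, atoms) / m(u, atoms) ** 2

z = 1.0 + 5.0j
w1 = w2 = z  # starting point with large imaginary part
for _ in range(20):
    p1 = F(w2, atoms1) - w1 - w2 + z          # Phi_1
    p2 = F(w1, atoms2) - w1 - w2 + z          # Phi_2
    b, c = Fp(w2, atoms1) - 1.0, Fp(w1, atoms2) - 1.0
    det = 1.0 - b * c                          # det of DPhi (up to sign), nonzero here
    w1 += (p1 + b * p2) / det                  # Newton step: omega <- omega - (DPhi)^{-1} Phi
    w2 += (c * p1 + p2) / det

res1 = F(w2, atoms1) - w1 - w2 + z
res2 = F(w1, atoms2) - w1 - w2 + z
```

At large Im z the Jacobian DΦ is close to minus the identity, c.f. (4.20), so the iteration settles in a few steps, and the computed solution indeed satisfies Im ω_i ≥ Im z.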
It further follows that Γ_{µ₁,µ₂}(ω̂₁(z), ω̂₂(z)) < ∞, for all z ∈ ℂ⁺ with Im z ≥ η₀; thus ω̂₁(z) and ω̂₂(z) are analytic on {z ∈ ℂ : Im z > η₀} since F_{µ₁} and F_{µ₂} are. Finally, using (4.17) with ω = ω̂₁, ω = ω̂₂ respectively, we see that

  lim_{η↗∞} ω̂₁(iη)/(iη) = lim_{η↗∞} ω̂₂(iη)/(iη) = 1 .

Thus, by the uniqueness claim in Proposition 2.1, ω̂₁(z), ω̂₂(z) agree with ω₁(z), ω₂(z) on the domain {z ∈ ℂ⁺ : Im z ≥ η₀}. This proves (4.16) from (4.23). □

Proof of Theorem 2.7
In the setup of Theorem 2.7 we have two pairs of probability measures on ℝ, (µ_α, µ_β) and (µ_A, µ_B), where we consider µ_α, µ_β as "reference" measures (in the sense that they satisfy the assumptions of Theorem 2.7), while µ_A, µ_B are arbitrary. Under the assumptions of Theorem 2.7 we can apply Theorem 2.5 with the choices µ₁ = µ_α and µ₂ = µ_β. Recall from (2.13) the definition of the domain S_I(a, b), a ≤ b.

Lemma 5.1.
Let µ_A, µ_B and µ_α, µ_β be the probability measures from (2.24) and (2.25) satisfying the assumptions of Theorem 2.7. Let ω_A, ω_B and ω_α, ω_β denote the associated subordination functions given by Proposition 2.1. Let I ⊂ B_{µ_α⊞µ_β} be a compact non-empty interval. Fix 0 < η_M < ∞. Then there are a (small) constant b > 0 and a (large) constant K₀ < ∞, both depending on the measures µ_α and µ_β, on the interval I and on the constant η_M, such that whenever

  d_L(µ_A, µ_α) + d_L(µ_B, µ_β) ≤ b   (5.1)

holds, then

  |ω_A(z) − ω_α(z)| ≤ K₀ d_L(µ_A, µ_α)/(Im ω_β(z))² + K₀ d_L(µ_B, µ_β)/(Im ω_α(z))² ,
  |ω_B(z) − ω_β(z)| ≤ K₀ d_L(µ_A, µ_α)/(Im ω_β(z))² + K₀ d_L(µ_B, µ_β)/(Im ω_α(z))² ,   (5.2)

hold uniformly on S_I(0, η_M). In particular, choosing b₀ ≤ b sufficiently small and assuming that d_L(µ_A, µ_α) + d_L(µ_B, µ_β) ≤ b₀, we have

  max_{z∈S_I(0,η_M)} |ω_A(z)| ≤ 2K ,   max_{z∈S_I(0,η_M)} |ω_B(z)| ≤ 2K ,   (5.3)
  min_{z∈S_I(0,η_M)} Im ω_A(z) ≥ k/2 ,   min_{z∈S_I(0,η_M)} Im ω_B(z) ≥ k/2 ,   (5.4)

where K and k are the constants from Lemma 3.2 and Lemma 3.3, respectively.

Remark 5.1. Armed with the conclusions of Theorem 2.5, our proof follows closely the arguments of [30]. We further remark that the main argument in the proof of Lemma 5.1 is different from the ones given in Section 4: it crucially relies on the global uniqueness of solutions on the upper half-plane for both systems, Φ_{µ_α,µ_β}(ω_α, ω_β, z) = 0 and Φ_{µ_A,µ_B}(ω_A, ω_B, z) = 0, asserted by Proposition 2.1.

Proof of Lemma 5.1.
We first write the system Φ_{µ_α,µ_β}(ω_α(z), ω_β(z), z) = 0 as

  Φ_{µ_A,µ_B}(ω_α(z), ω_β(z), z) = r(z) ,   z ∈ ℂ⁺ ,

with

  r(z) ≡ ( r_A(z) ; r_B(z) ) := ( F_{µ_A}(ω_β(z)) − F_{µ_α}(ω_β(z)) ; F_{µ_B}(ω_α(z)) − F_{µ_β}(ω_α(z)) ) .   (5.5)

From Lemma 3.3, we know that the imaginary parts of the subordination functions ω_α, ω_β are uniformly bounded from below on S_I(0, η_M). Next, integration by parts reveals that for any probability measures µ₁ and µ₂,

  |m_{µ₁}(z) − m_{µ₂}(z)| ≤ c d_L(µ₁, µ₂)/Im z · ( 1 + 1/Im z ) ,   z ∈ ℂ⁺ ,   (5.6)

with some numerical constant c; see, e.g., [31]. Thus,

  |F_{µ_A}(ω_β(z)) − F_{µ_α}(ω_β(z))| = |m_{µ_A}(ω_β(z)) − m_{µ_α}(ω_β(z))| / |m_{µ_A}(ω_β(z)) m_{µ_α}(ω_β(z))| ≤ C d_L(µ_A, µ_α)/(Im ω_β(z))² ,   (5.7)

with a new constant C that depends on the lower bound on Im m_{µ_α}(ω_β(z)) = Im m_{µ_α⊞µ_β}(z), which is strictly positive on S_I(0, η_M); c.f., (3.12). Here we used

  Im m_{µ_A}(ω_β(z)) ≥ Im m_{µ_α}(ω_β(z)) − |Im m_{µ_A}(ω_β(z)) − Im m_{µ_α}(ω_β(z))| ≥ ½ Im m_{µ_α}(ω_β(z)) ,

as follows from (5.6) for small enough d_L(µ_A, µ_α) ≤ b. Repeating the argument with the rôles of A and B interchanged, we arrive at

  |r_A(z)| ≤ C d_L(µ_A, µ_α)/(Im ω_β(z))² ,   |r_B(z)| ≤ C d_L(µ_B, µ_β)/(Im ω_α(z))² ,   z ∈ S_I(0, η_M) ,   (5.8)

for some constant C. Recalling the definition of Γ in (2.12), we get, for sufficiently small b,

  Γ_{µ_A,µ_B}(ω_α, ω_β) ≤ 2 Γ_{µ_α,µ_β}(ω_α, ω_β) ≤ 2S ,   (5.9)

where S is from Lemma 3.4, and where we also use Lemma 3.3 and the assumption d_L(µ_A, µ_α) + d_L(µ_B, µ_β) ≤ b. The Newton–Kantorovich theorem then implies (c.f., the proof of Lemma 4.2 for a similar application) that there are ω̂_A(z), ω̂_B(z) satisfying

  Φ_{µ_A,µ_B}(ω̂_A(z), ω̂_B(z), z) = 0 ,   z ∈ S_I(0, η_M) ,   (5.10)

and

  |ω_α(z) − ω̂_A(z)| ≤ 4S ‖r(z)‖ ,   |ω_β(z) − ω̂_B(z)| ≤ 4S ‖r(z)‖ ,   (5.11)

z ∈ S_I(0, η_M). Invoking (5.8), (5.11) and Lemma 3.3, we see that ω̂_A(z) ∈ ℂ⁺ and ω̂_B(z) ∈ ℂ⁺, for any z ∈ S_I(0, η_M), if b is sufficiently small. Yet, by the global uniqueness of solutions asserted in Proposition 2.1, we must have ω̂_A(z) = ω_A(z), ω̂_B(z) = ω_B(z), z ∈ ℂ⁺. Together with (5.11) and (5.8) this implies (5.2) and concludes the proof. Then, choosing b₀ sufficiently small, (5.3) and (5.4) are direct consequences of (5.2), Lemma 3.2 and Lemma 3.3. □

With the aid of Lemma 3.4, we prove the stability of the system Φ_{µ_A,µ_B}(ω_A, ω_B, z) = 0.

Corollary 5.2.
Under the assumptions of Lemma 5.1, there is a (small) constant b > 0, depending on the measures µ_α and µ_β, on the interval I and on the constant η_M, such that

  d_L(µ_A, µ_α) + d_L(µ_B, µ_β) ≤ b   (5.12)

implies

  max_{z∈S_I(0,η_M)} Γ_{µ_A,µ_B}(ω_A(z), ω_B(z)) ≤ 2S   (5.13)

and

  max_{z∈S_I(0,η_M)} |ω_A′(z)| ≤ 2S ,   max_{z∈S_I(0,η_M)} |ω_B′(z)| ≤ 2S ,   (5.14)

where ω_A(z), ω_B(z) satisfy Φ_{µ_A,µ_B}(ω_A(z), ω_B(z), z) = 0 and S is the constant in Lemma 3.4.

Proof. Let Γ ≡ Γ_{µ_A,µ_B}(ω_A(z), ω_B(z)). Analogously to (3.21), we have

  Γ = 1/|1 − (F′_{µ_A}(ω_B) − 1)(F′_{µ_B}(ω_A) − 1)| · ‖ ( −1 , −(F′_{µ_A}(ω_B(z)) − 1) ; −(F′_{µ_B}(ω_A(z)) − 1) , −1 ) ‖ .   (5.15)

Using the bounds (5.3) and (5.4) for sufficiently small b, we follow, mutatis mutandis, the proof of Lemma 3.4 to get (5.13). The estimates in (5.14) then follow as in Lemma 3.4. □

We are now ready to complete the proof of Theorem 2.7.

Proof of Theorem 2.7.
Recall that m_{µ_A⊞µ_B}(z) = m_{µ_A}(ω_B(z)), z ∈ ℂ⁺. We first note that

  |m_{µ_A}(ω_B(z)) − m_{µ_α}(ω_B(z))| ≤ C d_L(µ_A, µ_α)/Im ω_B(z) · ( 1 + 1/Im ω_B(z) ) ,

for some numerical constant C; c.f., (5.6). Thus using (5.4) we get

  |m_{µ_A}(ω_B(z)) − m_{µ_α}(ω_B(z))| ≤ K₁ k⁻² d_L(µ_A, µ_α) ,   z ∈ S_I(0, η_M) ,

for some numerical constant K₁. Choosing b₀ as in Lemma 5.1 and assuming that d_L(µ_A, µ_α) + d_L(µ_B, µ_β) ≤ b₀, we get from (5.2) that

  |m_{µ_α}(ω_B(z)) − m_{µ_α}(ω_β(z))| ≤ K₂ k⁻⁴ ( d_L(µ_A, µ_α) + d_L(µ_B, µ_β) ) ,   z ∈ S_I(0, η_M) .

Setting Z := K₁ k⁻² + K₂ k⁻⁴, we thus obtain (2.22). □

Remark 5.2. Note that under the assumptions of Theorem 2.7, we have, for d_L(µ_A, µ_α) + d_L(µ_B, µ_β) ≤ b₀, the bounds

  κ/2 ≤ |m_{µ_A⊞µ_B}(z)| ≤ 2/k ,   (5.16)

uniformly on S_I(0, η_M), with κ > 0 the constant from (3.11) and k > 0 the constant from Lemma 3.3.

Proof of Theorem 2.8
Before we immerse into the details of the proof of Theorem 2.8, we outline how Theorem 2.5 and the local stability results of Section 4, in combination with concentration estimates for the unitary groups, lead to the local law in (2.28).

6.1. Outline of proof. We briefly outline our proof when U is Haar distributed on U(N). Since we are interested in the tracial quantity m_H of H = A + UBU*, we may replace H by the matrix

  H̃ := VAV* + UBU* ,   (6.1)

where V is another Haar unitary independent of U. By cyclicity of the trace we have m_H = m_{H̃}, and we study m_{H̃} below. We emphasize that this replacement is a convenient technicality which is not essential to our proof. Using the shorthand

  Ã := VAV* ,   B̃ := UBU* ,   (6.2)

we introduce the Green functions

  G_Ã(z) := (Ã − z)⁻¹ ,   G_B̃(z) := (B̃ − z)⁻¹ ,   z ∈ ℂ⁺ .   (6.3)

For a given N × N matrix Q, we introduce the function

  f_Q(z) := tr Q G_{H̃}(z) ,   z ∈ ℂ⁺ ,   (6.4)

where G_{H̃} = (H̃ − z)⁻¹ is the Green function of H̃. We define the approximate subordination functions, ω^c_A and ω^c_B, by setting

  ω^c_A(z) := z − E f_Ã(z)/E m_{H̃}(z) ,   ω^c_B(z) := z − E f_B̃(z)/E m_{H̃}(z) ,   z ∈ ℂ⁺ ,   (6.5)

where the expectation E is with respect to both Haar unitaries U and V. From the identity (H̃ − z)G_{H̃}(z) = 1, z ∈ ℂ⁺, we then obtain the relation

  ω^c_A(z) + ω^c_B(z) − z = −1/E m_{H̃}(z) ,   z ∈ ℂ⁺ ,   (6.6)

reminiscent of (c.f., (2.5)–(2.6))

  ω_A(z) + ω_B(z) − z = −1/m_{A⊞B}(z) ,   z ∈ ℂ⁺ .

For the proof of Theorem 2.8, we decompose

  m_{H̃}(z) − m_{A⊞B}(z) = ( m_{H̃}(z) − E m_{H̃}(z) ) + ( E m_{H̃}(z) − m_{A⊞B}(z) ) ,   (6.7)

where we abbreviate m_{A⊞B} ≡ m_{µ_A⊞µ_B}. To control the fluctuation part, m_{H̃}(z) − E m_{H̃}(z), we rely on the Gromov–Milman concentration inequality [26] for the unitary group; see (6.22) below. To control the deterministic part, we first note that, by (6.6) and m_{A⊞B}(z) = m_A(ω_B(z)), bounding |E m_{H̃}(z) − m_{A⊞B}(z)| amounts to bounding |ω^c_A(z) − ω_A(z)| and |ω^c_B(z) − ω_B(z)|.
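For orientation, the approximate subordination functions in (6.5) can be probed numerically. The sketch below is our own illustration (not part of the paper) and assumes NumPy; the matrix size, sample count and tolerances are arbitrary test choices. We take A and B with spectral distribution ½(δ₋₁ + δ₊₁), so that µ_A ⊞ µ_B is the arcsine law on [−2, 2], with ω_A(i) = ω_B(i) = i(1+√5)/2 and m_{A⊞B}(i) = i/√5, and we replace the Haar expectations by a small Monte Carlo average:

```python
import numpy as np

rng = np.random.default_rng(0)
N, samples, z = 200, 20, 1j   # illustrative choices

def haar_unitary(n):
    # QR of a complex Ginibre matrix, with the phases of R fixed, is Haar distributed
    g = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(g)
    return q * (np.diagonal(r) / np.abs(np.diagonal(r)))

a = np.diag(np.tile([1.0, -1.0], N // 2))   # spectral measure (delta_{-1}+delta_{+1})/2
m_avg, fA_avg = 0.0, 0.0
for _ in range(samples):
    u, v = haar_unitary(N), haar_unitary(N)
    At, Bt = v @ a @ v.conj().T, u @ a @ u.conj().T   # A~ and B~ of (6.2)
    g = np.linalg.inv(At + Bt - z * np.eye(N))        # Green function of H~
    m_avg += np.trace(g) / (N * samples)              # Monte Carlo E m_H~(z)
    fA_avg += np.trace(At @ g) / (N * samples)        # Monte Carlo E f_A~(z)

omega_cA = z - fA_avg / m_avg                # approximate subordination function (6.5)
omega_exact = 1j * (1 + np.sqrt(5)) / 2      # omega_A(i) for the arcsine limit
m_exact = 1j / np.sqrt(5)                    # m_{mu_A ⊞ mu_B}(i)
```

Already at these modest sizes ω^c_A(i) agrees with the limiting subordination function to a few percent, consistent with the heuristic that the error in (6.8) is small.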
We then show that ω^c_A(z) and ω^c_B(z) are both in the upper half-plane and satisfy

  Φ_{µ_A,µ_B}(ω^c_A(z), ω^c_B(z), z) = r(z) ,   z ∈ S_I(η_m, η_M) ,   (6.8)

for some small error r(z), i.e., we consider (6.8) as a perturbation of the system Φ_{µ_A,µ_B}(ω_A(z), ω_B(z), z) = 0; c.f., (2.10). The formal derivation of (6.8) goes back to Pastur and Vasilchuk [33]. Using Proposition 4.1 (with rough a priori estimates on |ω^c_A(z) − ω_A(z)| and |ω^c_B(z) − ω_B(z)| obtained from the continuity argument below) and the stability results of Theorem 2.5 and of Section 5, we then bound |ω^c_A(z) − ω_A(z)| and |ω^c_B(z) − ω_B(z)| in terms of r(z).

In sum, for fixed z ∈ ℂ⁺, our proof includes two parts: (i) estimation of the error r(z) in (6.8), and (ii) concentration of m_{H̃}(z) around E m_{H̃}(z). Both parts rely on the estimates

  E m_{H̃}(z), ω^c_A(z), ω^c_B(z) ∼ 1 ,   Im ω^c_A(z), Im ω^c_B(z) ≳ 1 ,   z ∈ S_I(η_m, η_M) .   (6.9)

Note that the quantities in (6.9) are obtained from the Green function of H̃ by averaging with respect to the Haar measure. Similar bounds for m_{A⊞B}, ω_A and ω_B were obtained in Section 5; these latter quantities are defined directly from µ_A and µ_B via Proposition 2.1. To establish (6.9), we use a similar continuity argument as was used for Wigner matrices in [24]: For Im z = η_M sufficiently large, the estimates in (6.9) directly follow from the definitions. For z = E + iη, with E ∈ I fixed, we decrease η from η_M down to η = η_m in steps of size O(N⁻¹), where, at each step, we invoke parts (i) and (ii). However, a direct application of the Gromov–Milman concentration inequality for part (ii) does not allow to push η below the mesoscopic scale η = N^{−1/3}. Indeed, the Gromov–Milman inequality is effective if L²/N = o(1), where L is the Lipschitz constant of m_{H̃}(z) with respect to the Haar unitary V. It is roughly bounded by (tr |G_{H̃}(z)|⁴/N)^{1/2}, which in turn is trivially bounded by 1/(Nη⁴)^{1/2}, giving the η ≥ N^{−1/3+γ}, γ > 0, threshold. However, in reality, the random quantity (tr |G_{H̃}(z)|⁴/N)^{1/2} is typically of order 1/(Nη³)^{1/2}, as follows by combining the deterministic estimate tr |G_{H̃}(z)|² = η⁻¹ Im m_{H̃}(z) with a probabilistic order one bound for Im m_{H̃}(z). Our key novelty here is to capitalize on this latter information. We introduce a smooth cutoff that regularizes m_{H̃}(z) and then apply the Gromov–Milman inequality to this regularized quantity. With the bound 1/(Nη³)^{1/2} for the Lipschitz constant, we get concentration estimates down to scales η ≥ N^{−2/3+γ}, γ > 0.

Notation.
The following notation for high-probability estimates is suited for our purposes. A slightly different form was first used in [21].
Definition 6.1.
Let

  X = ( X^{(N)}(v) : N ∈ ℕ, v ∈ V^{(N)} ) ,   Y = ( Y^{(N)}(v) : N ∈ ℕ, v ∈ V^{(N)} )   (6.10)

be two families of nonnegative random variables, where V^{(N)} is a possibly N-dependent parameter set. We say that Y stochastically dominates X, uniformly in v, if for all (small) ε > 0 and (large) D > 0,

  P( ⋃_{v∈V^{(N)}} { X^{(N)}(v) > N^ε Y^{(N)}(v) } ) ≤ N^{−D} ,   (6.11)

for sufficiently large N ≥ N₀(ε, D). If Y stochastically dominates X, uniformly in v, we write X ≺ Y. If we wish to indicate the set V^{(N)} explicitly, we write that X(v) ≺ Y(v) for all v ∈ V^{(N)}.

Localized Gromov–Milman concentration estimate.
In this subsection, we derive concentration bounds for some key tracial quantities. They are tailored for the continuity argument of Subsection 6.3 used to complete the proof of Theorem 2.8. The argument works with U, V independent and both Haar distributed on U(N) or on O(N). Below, E denotes the expectation with respect to the Haar measure. In the rest of this section, we let I ⊂ B_{µ_α⊞µ_β} denote the compact non-empty subset fixed in Theorem 2.8. Also recall from Theorem 2.8 that we set η_m = N^{−2/3+γ}, γ > 0. Below we choose the constant η_M, 0 < η_M < ∞, of order one but otherwise arbitrary. Recall from (6.4) the notation f_Q, where Q is an arbitrary N × N matrix.

Proposition 6.2.
Let Q be a given N × N deterministic matrix with ‖Q‖ ≲ 1. Fix E ∈ I and η̂ ∈ [η_m, η_M]. Then

  Im m_{H̃}(E + iη) ≺ 1 ,   ∀η ∈ [η̂, η_M] ,   (6.12)

implies the concentration bound

  | f_{VQV*}(E + iη̂) − E f_{VQV*}(E + iη̂) | ≺ 1/(N η̂^{3/2}) .   (6.13)

The same concentration holds with VQV* replaced by UQU*.

Proof. For fixed E ∈ I, we consider z = E + iη ∈ ℂ⁺ as a varying spectral parameter and use ẑ = E + iη̂ for the specific choice in the lemma. By the definition of f_{(·)} and cyclicity of the trace, we have

  f_{VQV*}(z) = tr VQV* ( VAV* + UBU* − z )⁻¹ = tr Q ( A + V*UBU*V − z )⁻¹ ,   (6.14)

where tr(·) stands for the normalized trace. For simplicity, we denote

  W := V*U ,   H := A + WBW* ,   G_H(z) := (H − z)⁻¹ .   (6.15)

Observe that W is Haar distributed on U(N), respectively O(N), too. By cyclicity of the trace we have tr G_{H̃}(z) = tr G_H(z) = m_H(z). According to (6.14) and (6.15), we may regard in the sequel f_{VQV*} as a function of the Haar unitary matrix W by writing

  h(z) = h_W(z) := f_{VQV*}(z) .

For any fixed (small) ε > 0, let χ̂ be a smooth cutoff supported on [0, 2N^ε], with χ̂(x) = 1, x ∈ [0, N^ε], and with bounded derivatives. Since m_H(z) = tr G_H(z), we can regard m_H(z) as a function of W and write

  χ(z) = χ_W(z) := χ̂( Im m_H(z) ) .   (6.16)

We then introduce a regularization, h̃_W, of h_W by setting

  h̃(z) = h̃_W(z) := h_W(z) ∏_{n=0}^{⌈−log₂ η⌉} χ_W(E + i2ⁿη) .   (6.17)

We will often drop the W subscript from the notations h_W(z), h̃_W(z) and χ_W(z), but remember that these are random variables depending on the Haar unitary W. We will use assumption (6.12) at dyadic points, i.e., that

  Im m_H(E + i2^l η̂) ≺ 1 ,   0 ≤ l ≤ ⌈−log₂ η̂⌉ ,   (6.18)

(recall that m_H(z) = m_{H̃}(z), so we may drop the tilde in the subscript of m). Hence, by (6.16) and (6.18) we see that, for arbitrary large D > 0,

  ∏_{l=0}^{⌈−log₂ η̂⌉} χ(E + i2^l η̂) = 1 ,   i.e.,   h̃(ẑ) = h(ẑ) ,   (6.19)

with probability larger than 1 − N^{−D}, for N sufficiently large (depending on ε and D). Taking the trivial bound ‖Q‖/η̂ for h(ẑ) and for h̃(ẑ) into account, we also have

  E h̃(ẑ) − E h(ẑ) = O( N^{−D+1} ) .   (6.20)

To prove (6.13), it therefore suffices to establish the concentration estimate

  | h̃(ẑ) − E h̃(ẑ) | ≺ 1/(N η̂^{3/2}) ,   (6.21)

for the regularized quantity h̃(ẑ). To verify (6.21), we use the Gromov–Milman concentration inequality [26] (see Theorem 4.4.27 in [2] for similar applications), which states the following. Let M(N) = SO(N) or SU(N), endowed with the Riemannian metric ‖ds‖₂ inherited from M_N(ℂ) (equipped with the Hilbert–Schmidt norm).
If g : (M(N), ‖ds‖₂) → ℝ is an L-Lipschitz function satisfying E g = 0, then

  P( |g| > δ ) ≤ e^{−c N δ²/L²} ,   ∀δ > 0 ,   (6.22)

with some numerical constant c > 0 (independent of N). Here P and E are with respect to the Haar measure on M(N). In order to apply (6.22) to the function W ↦ h̃_W(ẑ) = h̃(ẑ), we need to control its Lipschitz constant. To that end, we define the event

  Ω(η̂) ≡ Ω_E(η̂) := { Im m_H(E + i2ⁿη̂) ≤ N^ε : ∀n ∈ ℕ } .   (6.23)

To bound the Lipschitz constant, we need to bound quantities of the form tr |G_H(ẑ)|^k restricted to the event Ω(η̂). Let (λ_i(H)) denote the eigenvalues of H and introduce

  I_n := [E − 2ⁿη̂, E + 2ⁿη̂] ,   N_n := |{ i : λ_i(H) ∈ I_n }| ,   n ∈ ℕ .

Since H and H̃ are unitarily equivalent, their empirical eigenvalue distributions are the same, µ_H; c.f., (2.29). Using the definition of the Stieltjes transform we have, for all n ∈ ℕ, the estimate

  N_n = N ∫_{I_n} dµ_H ≤ N · 2^{n+1}η̂ ∫_{E−2ⁿη̂}^{E+2ⁿη̂} (2ⁿη̂) dµ_H(x)/((x − E)² + (2ⁿη̂)²) ≤ N · 2^{n+1}η̂ Im m_H(E + i2ⁿη̂) .

Thus we have on the event Ω(η̂) that

  N_n ≲ 2ⁿ N^{1+ε} η̂ ,   ∀n ∈ ℕ .   (6.24)

By the spectral theorem, we can bound

  tr |G_H(ẑ)| ≲ (1/N) Σ_{i=1}^N 1/( |λ_i(H) − E| + η̂ ) .   (6.25)

Then we observe (with the convention I_{−1} = ∅) that

  (1/N) Σ_{i=1}^N 1/(|λ_i(H) − E| + η̂) = (1/N) Σ_{i=1}^N Σ_{n=0}^∞ 𝟙(λ_i ∈ I_n \ I_{n−1})/(|λ_i(H) − E| + η̂) = (1/N) Σ_{n=0}^{⌈c log N⌉} Σ_{λ_i∈I_n\I_{n−1}} 1/(|λ_i(H) − E| + η̂) ,

where we used ‖H‖ ≤ C to truncate the sum over n at ⌈c log N⌉. We then bound

  𝟙(Ω(η̂)) (1/N) Σ_{n=0}^{⌈c log N⌉} Σ_{λ_i∈I_n\I_{n−1}} 1/(|λ_i(H) − E| + η̂) ≤ 𝟙(Ω(η̂)) (1/N) Σ_{n=0}^{⌈c log N⌉} N_n/(2^{n−1}η̂) ≲ N^ε log N ,

where we used (6.24), i.e., with (6.25) we arrive at

  𝟙(Ω(η̂)) tr |G_H(ẑ)| ≲ N^ε log N .   (6.26)

Using the spectral decomposition of H we see that

  tr |G_H(ẑ)|² = (1/N) Σ_{i=1}^N 1/((λ_i(H) − E)² + η̂²) = (1/(N η̂)) Σ_{i=1}^N η̂/((λ_i(H) − E)² + η̂²) = Im m_H(ẑ)/η̂ ,   (6.27)

where we also used that tr G_H(ẑ) G_H*(ẑ) = tr |G_H(ẑ)|² and Im tr G_H(ẑ) = Im m_H(ẑ). Thus, we bound

  𝟙(Ω(η̂)) tr |G_H(ẑ)|^k ≤ 𝟙(Ω(η̂)) η̂^{−k+1} Im m_H(ẑ) ≲ N^ε η̂^{−k+1} ,   ∀k ≥ 2 .   (6.28)

Having established (6.26) and (6.28), we proceed to estimate the Lipschitz constant of h̃(ẑ) as a function of W. Let su(N) and so(N) denote the (fundamental representations in M_N(ℂ) of the) Lie algebras of SU(N) and SO(N), respectively. Let m stand for either su(N) or so(N). Note that X ∈ m satisfies X* = −X. Since SU(N) and SO(N) are matrix groups, the Lie bracket on su(N), respectively so(N), is given by the commutator in the matrix algebra. For fixed X ∈ M_N(ℂ), we let ad_X : M_N(ℂ) → M_N(ℂ), Y ↦ ad_X(Y) := XY − YX. For X ∈ m and t ∈ ℝ, we may write e^{t ad_X}(WBW*) = (e^{tX}W) B (e^{tX}W)*, where we used that X* = −X. Further note that

  d/dt e^{t ad_X}(WBW*) = e^{t ad_X} ad_X(WBW*) .   (6.29)

For X ∈ m with ‖X‖₂ = 1, we let

  G_H(z, tX) := ( A + e^{t ad_X}(WBW*) − z )⁻¹ ,   t ∈ ℝ ,

and denote accordingly

  m_H(z, tX) := tr G_H(z, tX) ,   χ(z, tX) := χ̂( Im m_H(z, tX) ) ,   h̃(z, tX) := tr Q G_H(z, tX) ∏_{l=0}^{⌈−log₂ η̂⌉} χ(E + i2^l η̂, tX) ,

with χ(z, 0) ≡ χ(z), h̃(z, 0) ≡ h̃(z), etc.
Evaluating the derivative of e h ( b z, tX ) with respect to t at t = 0 we get ∂∂t e h ( b z, tX ) (cid:12)(cid:12)(cid:12) t =0 = − tr (cid:16) QG H ( b z )ad X ( W BW ∗ ) G H ( b z ) (cid:17) ⌈− log b η ⌉ Y l =0 χ ( E + i2 l b η ) − tr (cid:0) QG H ( b z ) (cid:1)(cid:18) ⌈− log b η ⌉ X j =0 h ⌈− log b η ⌉ Y l =0 l = j χ ( E + i2 l b η ) i · φ ( E + i2 j b η ) × Im tr (cid:16) G H ( E + i2 j b η ) ad X ( W BW ∗ ) G H ( E + i2 j b η ) (cid:17)(cid:19) , (6.30)where we used (6.29) and where we introduced φ ( z ) := b χ ′ (Im m H ( z )), with b χ ′ the derivativeof b χ . Recalling (6.16) and the definition of the cutoff b χ , we note the bounds ⌈− log b η ⌉ Y l =0 χ ( E + i2 l b η ) ≤ , ⌈− log b η ⌉ X j =0 h Y l = j χ ( E + i2 l b η ) i · φ ( E + i2 j b η ) = O (log N ) . (6.31)On the event Ω c ( b η ), the complementary event to Ω( b η ), we further have the identities ⌈− log b η ⌉ Y l =0 χ ( E + i2 l b η ) = 0 , ⌈− log b η ⌉ X j =0 h ⌈− log b η ⌉ Y l =0 l = j χ ( E + i2 l b η ) i · φ ( E + i2 j b η ) = 0 . (6.32)It thus suffices to bound (6.30) on the event Ω( b η ). We bound the first term on the right sideof (6.30) as (Ω( b η )) (cid:12)(cid:12)(cid:12) tr (cid:16) QG H ( b z )ad X ( W BW ∗ ) G H ( b z ) (cid:17) ⌈− log b η ⌉ Y l =0 χ ( E + i2 l b η ) (cid:12)(cid:12)(cid:12) ≤ (Ω( b η )) 1 N k ad X ( W BW ∗ ) k k G H ( b z ) QG H ( b z ) k ⌈− log b η ⌉ Y l =0 χ ( E + i2 l b η ) , (6.33)where we used cyclicity of the trace and Cauchy–Schwarz inequality. Next, note that k ad X ( W BW ∗ ) k ≤ k B kk X k ≤ k B k , where we used the definition of ad X , k W k ≤ k X k = 1. Similarly, we have k G H ( b z ) QG H ( b z ) k ≤ k Q kk G H ( b z ) G ∗H ( b z ) k . Thus from (6.33), (Ω( b η )) (cid:12)(cid:12)(cid:12) tr (cid:16) QG H ( b z )ad X ( W BW ∗ ) G H ( b z ) (cid:17) ⌈− log b η ⌉ Y l =0 χ ( E + i2 l b η ) (cid:12)(cid:12)(cid:12) ≤ k B kk Q k (Ω( b η )) (cid:18) tr | G H ( b z ) | N (cid:19) ⌈− log b η ⌉ Y l =0 χ ( E + i2 l b η ) . 
  ≲ ( N^ε / (Nη̂³) )^{1/2} ,   (6.34)

where we used (6.28) with k = 4 in the last step. To handle the second term on the right side of (6.30), we use (6.31) and (6.26) to get

  1_{Ω(η̂)} | tr( QG_H(ẑ) ) Σ_{j=0}^{⌈−log₂ η̂⌉} [ ∏_{l≠j} χ(E + i2^l η̂) ] · φ(E + i2^j η̂) × Im tr( G_H(E + i2^j η̂) ad_X(WBW*) G_H(E + i2^j η̂) ) |
   ≤ 1_{Ω(η̂)} ‖Q‖ tr |G_H(ẑ)| Σ_{j=0}^{⌈−log₂ η̂⌉} [ ∏_{l≠j} χ(E + i2^l η̂) ] |φ(E + i2^j η̂)| × 2‖B‖ ( tr |G_H(E + i2^j η̂)|⁴ / N )^{1/2} ≲ ( N^ε / (Nη̂³) )^{1/2} .   (6.35)

Combining (6.35) and (6.34) we obtain, for any X ∈ su(N) or so(N) with ‖X‖₂ = 1, that

  | ∂/∂t h̃(ẑ, tX)|_{t=0} | ≲ ( N^ε / (Nη̂³) )^{1/2} ,   (6.36)

i.e., the Lipschitz constant of h̃(ẑ) as a function of W is bounded by C ( N^ε / (Nη̂³) )^{1/2}, for some constant C depending only on ‖B‖ and ‖Q‖. Thus, taking

  g = h̃(ẑ) − E h̃(ẑ) ,  L = C ( N^ε / (Nη̂³) )^{1/2} ,  δ = N^{ε/2} / √(N²η̂³) ,

in (6.22), and choosing ε > 0 sufficiently small, the claim follows. □

Continuity argument.
In this subsection, we often omit z ∈ C⁺ from the notation. Let U and V be independent and both Haar distributed on either U(N) or O(N). Recalling the notation in Section 6.1, we set

  ∆_A(z) := − ( IE[m_H(z)] ) G_H̃(z) − ( IE[f_B̃(z)] ) G_Ã(z) G_H̃(z) ,
  ∆_B(z) := − ( IE[m_H(z)] ) G_H̃(z) − ( IE[f_Ã(z)] ) G_B̃(z) G_H̃(z) ,  z ∈ C⁺ ,   (6.37)

where we introduced IE X := X − E X, for any random variable X. Using the left-invariance of Haar measure, one derives the identities

  E[ G_H̃ ⊗ ÃG_H̃ ] = E[ ÃG_H̃ ⊗ G_H̃ ] ,  E[ G_H̃ ⊗ B̃G_H̃ ] = E[ B̃G_H̃ ⊗ G_H̃ ] ;

see Theorem 7 in [33] or Appendix A of [31] for proofs. Taking the partial trace in the first component of the tensor products, we get

  E G_H̃(z) = E G_Ã(ω^c_B(z)) + δ^c_A(z) ,  δ^c_A(z) := (1 / E m_H(z)) E[ G_Ã(ω^c_B(z)) (Ã − z) ∆_A(z) ] ,   (6.38)

where ω^c_B(z) is defined in (6.5), we used (6.6), and where we implicitly assumed that Im ω^c_B(z) >
0. This last assumption will be verified along the continuity argument. Then, we set

  r^c_A(z) := − tr δ^c_A(z) / [ tr G_Ã(ω^c_B(z)) ( tr G_Ã(ω^c_B(z)) + tr δ^c_A(z) ) ] ,   (6.39)

and define δ^c_B(z) and r^c_B(z) in the same way by swapping the roles of A and B. Using (6.38), (6.6), we eventually obtain, under the assumption that Im ω^c_A(z) > 0 and Im ω^c_B(z) > 0,

  Φ_{µ_A,µ_B}( ω^c_A(z), ω^c_B(z), z ) = r^c(z) ,  z ∈ C⁺ ,   (6.40)

with r^c(z) = ( r^c_A(z), r^c_B(z) )^⊤ .

Lemma 6.3.
Fix E ∈ I and any η̂ ∈ [η_m, η_M]. Set the notation z = E + iη and ẑ = E + iη̂. Suppose that

  |ω^c_A(z) − ω_A(z)| + |ω^c_B(z) − ω_B(z)| ≤ N^{−γ} ,  ∀ η = Im z ∈ [η̂, η_M] .   (6.41)

Moreover, assume that for the event

  Ξ(η̂) ≡ Ξ_E(η̂) := { |m_H(z) − m_{A⊞B}(z)| ≤ N^{−γ} : z = E + iη, ∀ η ∈ [η̂, η_M] }

we have

  P( Ξ(η̂) ) ≥ 1 − N^{−D} ( N³(η_M − η̂) ) ,   (6.42)

for any D > 0, if N ≥ N₀(D). Then, for any ǫ > 0, the estimates

  |r^c_A(ẑ)| + |r^c_B(ẑ)| ≤ N^ǫ / (N²η̂³) ,   (6.43)

  |ω^c_A(ẑ) − ω_A(ẑ)| + |ω^c_B(ẑ) − ω_B(ẑ)| ≤ N^ǫ / (N²η̂³) ,   (6.44)

  |E m_H(ẑ) − m_{A⊞B}(ẑ)| ≤ N^ǫ / (N²η̂³) ,   (6.45)

hold for any N ≥ N₁(ǫ). Moreover, for any ǫ, D > 0, the event

  Θ(η̂) ≡ Θ_E(η̂) := Ξ_E(η̂) ∩ { |m_H(ẑ) − m_{A⊞B}(ẑ)| ≥ N^ǫ / √(N²η̂³) }   (6.46)

satisfies

  P( Θ(η̂) ) ≤ N^{−D} ,   (6.47)

if N ≥ N₂(ǫ, D). The threshold functions N₀, N₁, N₂ depend only on µ_α, µ_β, the speed of convergence in (2.25), and they are uniform in η̂ ∈ [η_m, η_M] and E ∈ I.

We postpone the proof of Lemma 6.3 and prove Theorem 2.8 first.
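Before turning to the proofs, it may help to see the quantity controlled here in a toy computation. The sketch below (our own illustration, not part of the argument; all parameter choices are ours) takes µ_A = µ_B = ½δ₀ + ½δ₁, for which µ_A ⊞ µ_B is the arcsine law on (0, 2) with Stieltjes transform m(z) = −1/√(z(z − 2)) (branch chosen so that Im m > 0 on C⁺; c.f. (7.7) below with ξ = ½), and compares it with the empirical m_H of a single sample H = A + UBU* with Haar-distributed U.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 400
# A, B diagonal with spectral distribution (1/2) delta_0 + (1/2) delta_1
a = np.zeros(N)
a[: N // 2] = 1.0
A = np.diag(a)
B = np.diag(a)

def haar_unitary(n, rng):
    """Haar-distributed unitary: QR of a complex Ginibre matrix with phase fix."""
    X = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    Q, R = np.linalg.qr(X)
    d = np.diag(R)
    return Q * (d / np.abs(d))   # multiply column j by the phase of R_jj

z = 1.0 + 0.5j
# Stieltjes transform of the arcsine law on (0, 2); pick the branch with Im m > 0
m_th = -1.0 / np.sqrt(z * (z - 2.0))
if m_th.imag < 0:
    m_th = -m_th

U = haar_unitary(N, rng)
H = A + U @ B @ U.conj().T
m_emp = np.mean(1.0 / (np.linalg.eigvalsh(H) - z))

assert abs(m_emp - m_th) < 0.05   # deviation shrinks as N grows
```

Already at this moderate N and at a macroscopic Im z the two values agree closely; the lemma quantifies how far down in Im z such agreement persists.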
Proof of Theorem 2.8.
We start by observing that it suffices to prove a version of (2.28) where the real part of the spectral parameter, E, is fixed. This version asserts that there is a large (N-independent) η_M, to be fixed below, such that for any (small) ǫ > 0, any D > 0, and any fixed E ∈ I,

  P( ⋃_{z ∈ S_E(η_m,η_M)} { |m_H(z) − m_{µ_A⊞µ_B}(z)| > N^ǫ / (N(Im z)^{3/2}) } ) ≤ N^{−D} ,   (6.48)

holds for N ≥ N₃, i.e., the set S_I(η_m, η_M) in (2.28) is replaced with S_E(η_m, η_M) := { E + iη : η ∈ [η_m, η_M] }. The threshold N₃ depends on ǫ, D, µ_α, µ_β, I and on the speed of convergence in (2.25). Indeed, by introducing the discretized lattice version

  Ŝ_I(a, b) := S_I(a, b) ∩ N^{−3}{ Z × iZ }

of the spectral domain S_I(a, b) (c.f., (2.13)) and by taking a union bound, we see that (6.48) implies

  P( ⋃_{z ∈ Ŝ_I(η_m,η_M)} { |m_H(z) − m_{µ_A⊞µ_B}(z)| > N^ǫ / (N(Im z)^{3/2}) } ) ≤ CN^{−D+6} .   (6.49)

Thanks to the Lipschitz continuity of the Stieltjes transforms m_H(z) and m_{µ_A⊞µ_B}(z), with Lipschitz constant η^{−2} = (Im z)^{−2} ≤ N², for any Im z ≥ η_m, we see that (2.28) follows from (6.49) after a small adjustment of ǫ and D, which were anyway arbitrary.

From now on we fix E ∈ I and our goal is to prove (6.48). We will use Lemma 6.3. In the first step we verify that the assumptions of this lemma hold for η̂ = η_M, i.e., that (6.41) and (6.42) hold for z = E + iη_M. In the second step, we successively use Lemma 6.3 to reduce η̂, step by step, in decrements of size N^{−3}, until we have verified (6.41)–(6.42) down to η̂ = η_m. Then (6.48) will follow from a final application of Lemma 6.3, combined with a discretization argument similar to the one above, but this time in the η variable instead of the E variable.

Step 1. Initial bound.
First we note that, since µ_A and µ_B are compactly supported, ‖H‖ is deterministically bounded; we thus have Im m_H(E + iη_M) ≤ (η_M)^{−1} ≤ 1 for η_M ≥ 1. Hence the concentration estimate (6.13) yields

  | f_{VQV*}(E + iη_M) − E f_{VQV*}(E + iη_M) | ≺ 1 / (Nη_M^{3/2}) ,   (6.50)

uniformly for any deterministic Q with ‖Q‖ ≲ 1. The analogous concentration holds with V replaced by U. Using (6.50) with Q = I (I the identity matrix), we have |IE[m_H(E + iη_M)]| ≺ N^{−1}. Hence, it suffices to show that

  | E m_H(E + iη_M) − m_{A⊞B}(E + iη_M) | ≺ N^{−1} .   (6.51)

Recalling the definitions of ω^c_A and ω^c_B in (6.5), we have, with z = E + iη_M, the expansion

  ω^c_A(z) = z − E tr ÃG_H̃(z) / E tr G_H̃(z) = z − tr A − ( E tr Ã(Ã + B̃) − tr A tr H̃ ) z^{−1} + O(z^{−2}) ,

as η_M ↗ ∞. Thus, using the assumption tr A = 0, we get

  Im ω^c_A(E + iη_M) − η_M = ( tr A² + E tr ÃB̃ ) η_M / |E + iη_M|² + O( η_M^{−2} ) ,

as η_M ↗ ∞. Next, since V and U are independent, we have

  E tr VAV* UBU* = tr E[VAV*] E[UBU*] = tr A tr B = 0 ,

since tr A = tr B = 0 by assumption. Thus

  Im ω^c_A(E + iη_M) − η_M = tr A² η_M / |E + iη_M|² + O( η_M^{−2} ) ,   (6.52)

as η_M ↗ ∞. Since tr A² > 0, we achieve, by choosing η_M sufficiently large (but independent of N), that

  Im ω^c_A(E + iη_M) − η_M ≥ (1/2) tr A² η_M / |E + iη_M|² ,   (6.53)

and the analogous estimate holds with A replaced by B. In particular, we have, for such η_M,

  Im ω^c_A(E + iη_M) ≳ 1 ,  Im ω^c_B(E + iη_M) ≳ 1 ,   (6.54)

and ω^c_A(E + iη_M) ∼ ω^c_B(E + iη_M) ∼ iη_M. To show (6.51), we apply Lemma 4.2 to the system (6.40). Having established (6.53), it suffices to show that

  | r^c_A(E + iη_M) | ≺ N^{−1} ,  | r^c_B(E + iη_M) | ≺ N^{−1} ,   (6.55)

since then we have, for N sufficiently large and η_M as above, that, for any fixed ε ∈ (0, 1),

  N^ε / N ≤ (1/2) tr A² η_M / |E + iη_M|² ≤ Im ω^c_A(E + iη_M) − η_M ,   (6.56)

and similarly with B replacing A. In particular, combining (6.55) and (6.56), we see that assumption (4.14) of Lemma 4.2 (with the choice r̃ = r^c) is satisfied for N sufficiently large (with high probability). Consequently, we see that (6.41) (even with N^{−1+ǫ} instead of N^{−γ} in the latter) holds for z = E + iη_M. Finally, the equations

  E m_H(z) = 1 / ( z − ω^c_A(z) − ω^c_B(z) ) ,  m_{A⊞B}(z) = 1 / ( z − ω_A(z) − ω_B(z) ) ,   (6.57)

together with the concentration estimate (6.50), yield (6.42). It remains to justify (6.55). Since iη_M m_H(E + iη_M) = −1 + O(η_M^{−1}), we have iη_M E m_H(E + iη_M) ∼ 1. In addition, from (6.54) it follows that m_A(ω^c_B(E + iη_M)) ∼ m_B(ω^c_A(E + iη_M)) ∼ η_M^{−1}. Thus it suffices to show

  | tr δ^c_A(E + iη_M) | ≺ N^{−1} ,  | tr δ^c_B(E + iη_M) | ≺ N^{−1} .   (6.58)

By the definitions of δ^c_A, δ^c_B in (6.38), and ∆_A, ∆_B in (6.37), it is easy to obtain (6.58) by using (6.50) and Cauchy–Schwarz. This completes Step 1, i.e., the verification of (6.41)–(6.42) for η̂ = η_M.

Step 2. Induction.
Recall that ω_A, ω_B and m_{A⊞B} (see Lemma 5.1) are uniformly bounded, and that ω^c_A(z), ω^c_B(z), m_H(z), ω_A(z), ω_B(z) and m_{A⊞B}(z) are Lipschitz continuous with a Lipschitz constant bounded by (Im z)^{−2} ≤ N², for any Im z ≥ η_m. Applying Lemma 6.3 to conclude (6.44) with the choice ǫ = γ/10, we see that if (6.41) and (6.42) hold for some η̂, then (6.41) also holds for η̂ replaced with η̂ − N^{−3}, as long as η̂ ≥ η_m. Moreover, by the Lipschitz continuity of m_H and m_{A⊞B}, notice that

  Ξ(η̂ − N^{−3}) ⊃ Ξ(η̂) \ Θ(η̂) .   (6.59)

Thus, if (6.42) holds for some η̂, then (6.59) and (6.47) imply that (6.42) also holds for η̂ replaced with η̂ − N^{−3}. Using Step 1 as an initial input with the choice η̂ = η_M, and applying the above induction argument O(N³) times, reducing η̂ in steps of size N^{−3}, we see that (6.41) and (6.42) hold for all η̂_k ∈ [η_m, η_M] of the form η̂_k = η_M − k·N^{−3} with some integer k. Applying Lemma 6.3 once more for these η̂_k, but now with an arbitrary ǫ > 0, we get

  | m_H(E + iη̂_k) − m_{A⊞B}(E + iη̂_k) | ≺ 1 / √(N²η̂_k³) ,  k = 0, 1, . . . , k₀ ,   (6.60)

where k₀ is the largest integer with η̂_{k₀} ≥ η_m. The uniformity of (6.60) in k follows from the fact that the threshold functions N_j in Lemma 6.3 are independent of η̂. Clearly k₀ = O(N³), so taking a union bound of (6.60), compensating the combinatorial factor by replacing D with D − 5, and slightly adjusting ǫ to extend the control from the set { z = E + iη̂_k : k ≤ k₀ } to all z ∈ S_E(η_m, η_M), we obtain (6.48). □

It remains to prove Lemma 6.3.

Proof of Lemma 6.3.
First we notice that E ∈ I and (2.25) imply that, for all sufficiently large N, the bounds (5.3)-(5.4) hold. Together with (6.41) they imply that

  ω^c_A(ẑ), ω^c_B(ẑ) ∼ 1 ,  Im ω^c_A(ẑ), Im ω^c_B(ẑ) ≳ 1 ;   (6.61)

moreover, using (6.6), we also get

  1 / |E m_H(ẑ)| ≲ 1 .   (6.62)

We start with (6.43). Thanks to symmetry, we only need to estimate |r^c_A(ẑ)|. By (6.61) we have

  ‖G_Ã(ω^c_B(ẑ))‖ = ‖G_A(ω^c_B(ẑ))‖ ≲ 1 .   (6.63)

Furthermore, ω^c_B(ẑ) ∼ 1 and Im ω^c_B(ẑ) ≳ 1 imply m_A(ω^c_B(ẑ)) ∼ 1. Hence it suffices to show that

  | E[ tr( G_Ã(ω^c_B(ẑ)) (Ã − ẑ) ∆_A(ẑ) ) ] | ≤ N^ǫ / (N²η̂³) ,   (6.64)

for any ǫ > 0, provided N ≥ N₁(ǫ) is large enough, uniformly for η̂ ∈ [η_m, η_M]. Assuming (6.64) and recalling the definitions of δ^c_A and r^c_A in (6.38)-(6.39), from (6.62) we get the first estimate in (6.43). Next, we prove (6.64). By the definitions in (6.37), we have

  E[ tr( G_Ã(ω^c_B(ẑ)) (Ã − ẑ) ∆_A(ẑ) ) ] = − E[ IE[m_H(ẑ)] tr( G_Ã(ω^c_B(ẑ)) (Ã − ẑ) G_H̃(ẑ) ) ] − E[ IE[f_B̃(ẑ)] tr( G_Ã(ω^c_B(ẑ)) G_H̃(ẑ) ) ] .   (6.65)

We rewrite the two terms on the right side separately as covariances,

  E[ IE[m_H(ẑ)] tr( G_Ã(ω^c_B(ẑ)) (Ã − ẑ) G_H̃(ẑ) ) ] = Cov( m_H(ẑ), tr( G_Ã(ω^c_B(ẑ)) (Ã − ẑ) G_H̃(ẑ) ) ) ,

respectively,

  E[ IE[f_B̃(ẑ)] tr( G_Ã(ω^c_B(ẑ)) G_H̃(ẑ) ) ] = Cov( f_B̃(ẑ), tr( G_Ã(ω^c_B(ẑ)) G_H̃(ẑ) ) ) ,

where Cov(X, Y) := E( IE[X] · IE[Y] ), for arbitrary random variables X and Y. Given (6.42) and the uniform boundedness of m_{A⊞B}(z) from (5.16), we see that (6.12) is satisfied and we can apply Proposition 6.2 using different choices for Q. Together with the Cauchy–Schwarz inequality |Cov(X, Y)| ≤ ( E|IE[X]|² )^{1/2} ( E|IE[Y]|² )^{1/2}, we get

  | Cov( m_H(ẑ), tr( G_Ã(ω^c_B(ẑ)) (Ã − ẑ) G_H̃(ẑ) ) ) | ≺ 1 / (N²η̂³) ,
  | Cov( f_B̃(ẑ), tr( G_Ã(ω^c_B(ẑ)) G_H̃(ẑ) ) ) | ≺ 1 / (N²η̂³) .   (6.66)

More specifically, for the first line of (6.66), we chose Q = I and Q = G_A(ω^c_B(ẑ))(A − ẑ); for the second line we chose Q = B and Q = G_A(ω^c_B(ẑ)), where we also used the facts Ã = VAV* and B̃ = UBU*. Here, we also implicitly used (6.63). Then, (6.64) follows from (6.66), which in turn proves (6.43). Next, using Proposition 4.1, (6.40) and (6.43), we immediately get (6.44). Moreover, since Im ω_A(z), Im ω_B(z) ≥ Im z, we have |z − ω_A(z) − ω_B(z)| ≥ Im ω_B(z) ≳ 1. Together with (6.44) and (6.57), this implies (6.45). Notice that (6.42) together with the uniform bound on m_{A⊞B} implies the condition (6.12) in Proposition 6.2. Thus, finally, (6.46) and (6.47) follow from (6.45) and the concentration inequality (6.13). This completes the proof of Lemma 6.3. □

Two point mass case
In this section, we discuss stability properties of the free additive convolution µ_α ⊞ µ_β when both µ_α and µ_β are convex combinations of two point masses. The analogous result to Theorem 2.5 is given in Proposition 7.2 below. Applications of that result in the spirit of Theorems 2.7 and 2.8 are then stated in Proposition 7.3 and Proposition 7.4. When we refer to the results in Sections 2-4, we will henceforth regard µ₁ and µ₂ as µ_α and µ_β, respectively, unless specified otherwise.

7.1. Stability in the two point mass case.
Without loss of generality (up to shifting and scaling), we assume that

  µ_α = ξδ₁ + (1 − ξ)δ₀ ,  µ_β = ζδ_θ + (1 − ζ)δ₀ ,  θ ≠ 0 ,  ξ, ζ ∈ (0, 1/2] ,  ξ ≤ ζ ,  (θ, ξ, ζ) ≠ (−1, 1/2, 1/2) .   (7.1)

Here we exclude the case (θ, ξ, ζ) = (−1, 1/2, 1/2) since it is equivalent to (θ, ξ, ζ) = (1, 1/2, 1/2) under a shift. Note that the latter is a special case of µ_α = µ_β. Set

  ℓ₁ := min{ (1/2)( 1 + θ − √((1 − θ)² + 4θr₊) ) , (1/2)( 1 + θ − √((1 − θ)² + 4θr₋) ) } ,
  ℓ₂ := max{ (1/2)( 1 + θ − √((1 − θ)² + 4θr₊) ) , (1/2)( 1 + θ − √((1 − θ)² + 4θr₋) ) } ,
  ℓ₃ := min{ (1/2)( 1 + θ + √((1 − θ)² + 4θr₊) ) , (1/2)( 1 + θ + √((1 − θ)² + 4θr₋) ) } ,
  ℓ₄ := max{ (1/2)( 1 + θ + √((1 − θ)² + 4θr₊) ) , (1/2)( 1 + θ + √((1 − θ)² + 4θr₋) ) } ,

where we introduced

  r± := ξ + ζ − 2ξζ ± 2√( ξζ(1 − ξ)(1 − ζ) ) .   (7.2)

Note that ℓ₁ < ℓ₂ ≤ ℓ₃ < ℓ₄. The following result, taken from [28], describes the regular bulk of µ_α ⊞ µ_β in the setting of (7.1). Recall that f_{µ_α⊞µ_β} denotes the density of (µ_α ⊞ µ_β)_ac.

Lemma 7.1.
Let µ_α and µ_β be as in (7.1). Then the regular bulk is given by

  B_{µ_α⊞µ_β} = (ℓ₁, ℓ₂) ∪ (ℓ₃, ℓ₄) ,   (7.3)

in case µ_α ≠ µ_β, while in case µ_α = µ_β it is given by

  B_{µ_α⊞µ_α} = (ℓ₁, ℓ₄) .   (7.4)

Proof.
Choose the diagonal matrices A and B with spectral distributions µ_A = ξ_N δ₁ + (1 − ξ_N)δ₀ and µ_B = ζ_N δ_θ + (1 − ζ_N)δ₀, respectively, with ξ_N := ⌊ξN⌋/N and ζ_N := ⌊ζN⌋/N, where ⌊·⌋ denotes the integer part. Recall from (7.1) that ξ ≤ ζ and ξ + ζ ≤ 1. From Theorem 1.1 of [28], we first observe that θ and 0 are eigenvalues of the matrix H = A + UBU*, U a Haar unitary, with multiplicities N(ζ_N − ξ_N) and N(1 − ζ_N − ξ_N), respectively. The remaining 2ξ_N N eigenvalues of H may be obtained via a two-fold transformation from the eigenvalues, (t_j), of a ξ_N N-dimensional Jacobi ensemble as

  τ±_j := (1/2)( 1 + θ ± √((1 − θ)² + 4θ t_j) ) ,  j = 1, . . . , ξ_N N ,   (7.5)

and then identifying the spectrum of H as the set {τ⁺_j} ∪ {τ⁻_j} ∪ {0, θ}. In addition, the weak limit of (1/(ξ_N N)) Σ_j δ_{t_j}, as N → ∞, admits a density given by

  f(x) = (1/(2πξ)) √((r₊ − x)(x − r₋)) / (x(1 − x)) · 1_{[r₋, r₊]}(x) ,  x ∈ R ,   (7.6)

where r₊ and r₋ are defined in (7.2). Since the limiting spectral distribution of H is given by µ_α ⊞ µ_β, we see that (µ_α ⊞ µ_β)_ac agrees with the weak limit of the measure (1/N) Σ_j (δ_{τ⁺_j} + δ_{τ⁻_j}), as N → ∞. Using this information together with (7.5) and (7.6), one deduces that supp(µ_α ⊞ µ_β)_ac = [ℓ₁, ℓ₂] ∪ [ℓ₃, ℓ₄]. It then follows from the explicit form of the limiting distribution of the Jacobi ensemble that f_{µ_α⊞µ_β} is bounded and strictly positive inside its support. This proves (7.3).

In the special case µ_α = µ_β, we have ℓ₂ = ℓ₃ = 1 and thus supp(µ_α ⊞ µ_α)_ac = [ℓ₁, ℓ₄], with ℓ₁ = 1 − 2√(ξ(1 − ξ)) and ℓ₄ = 1 + 2√(ξ(1 − ξ)). In fact, the density of (µ_α ⊞ µ_α)_ac equals

  f_{µ_α⊞µ_α}(x) = (1/π) √((ℓ₄ − x)(x − ℓ₁)) / (x(2 − x)) ,  x ∈ (ℓ₁, ℓ₄) ;   (7.7)

see (5.5) of [33] for instance. Then (7.4) follows directly. □
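The formulas above are easy to check numerically. The following sketch (ours, for illustration; the parameter values are arbitrary) computes r± and ℓ₁, …, ℓ₄, verifies the ordering ℓ₁ < ℓ₂ ≤ ℓ₃ < ℓ₄, the merging ℓ₂ = ℓ₃ = 1 in the case µ_α = µ_β, and that the density (7.7) carries total mass 2ξ, the atoms accounting for the remaining 1 − 2ξ.

```python
import numpy as np

def r_pm(xi, zeta):
    # r_± from (7.2)
    s = 2 * np.sqrt(xi * zeta * (1 - xi) * (1 - zeta))
    return xi + zeta - 2 * xi * zeta + s, xi + zeta - 2 * xi * zeta - s

def edges(theta, xi, zeta):
    # l1 <= l2 <= l3 <= l4, with supp (mu_a ⊞ mu_b)_ac = [l1, l2] ∪ [l3, l4]
    rp, rm = r_pm(xi, zeta)
    lo = [0.5 * (1 + theta - np.sqrt((1 - theta) ** 2 + 4 * theta * r)) for r in (rp, rm)]
    hi = [0.5 * (1 + theta + np.sqrt((1 - theta) ** 2 + 4 * theta * r)) for r in (rp, rm)]
    return min(lo), max(lo), min(hi), max(hi)

# generic two-band case
l1, l2, l3, l4 = edges(theta=0.5, xi=0.3, zeta=0.4)
assert l1 < l2 <= l3 < l4

# mu_a = mu_b (theta = 1, xi = zeta): the bands merge at l2 = l3 = 1
xi = 0.3
l1, l2, l3, l4 = edges(1.0, xi, xi)
assert abs(l2 - 1.0) < 1e-12 and abs(l3 - 1.0) < 1e-12
assert abs(l1 - (1 - 2 * np.sqrt(xi * (1 - xi)))) < 1e-12

# total mass of the density (7.7) is 2*xi (trapezoid rule)
x = np.linspace(l1 + 1e-9, l4 - 1e-9, 400_001)
f = np.sqrt((l4 - x) * (x - l1)) / (np.pi * x * (2.0 - x))
mass = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x))
assert abs(mass - 2 * xi) < 1e-3
```

For ξ = 1/2 the density (7.7) reduces to the arcsine law 1/(π√(x(2 − x))) on (0, 2), which has full mass 1.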
Let µ_α and µ_β be as in (7.1). Let I ⊂ B_{µ_α⊞µ_β} be a compact non-empty interval. Fix 0 < η_M < ∞. Then there are constants k > 0, K < ∞ and S < ∞, depending on the constants ξ, ζ, θ, η_M and on the interval I, such that the subordination functions possess the following bounds:

  min_{z∈S_I(0,η_M)} Im ω_α(z) ≥ k ,  min_{z∈S_I(0,η_M)} Im ω_β(z) ≥ k ,   (7.8)

  max_{z∈S_I(0,η_M)} |ω_α(z)| ≤ K ,  max_{z∈S_I(0,η_M)} |ω_β(z)| ≤ K .   (7.9)

Moreover, we have the following bounds:

(i) If µ_α ≠ µ_β,

  max_{z∈S_I(0,η_M)} Γ_{µ_α,µ_β}(ω_α(z), ω_β(z)) ≤ S .   (7.10)

(ii) If µ_α = µ_β,

  Γ_{µ_α,µ_α}(ω_α(z), ω_α(z)) ≤ S / |z − 1| ,   (7.11)

holds uniformly on S_I(0, η_M).

Remark 7.1. As an immediate consequence of Proposition 7.2 and (3.28), we obtain for µ_α ≠ µ_β the bounds max_{z∈S_I(0,η_M)} |ω′_α(z)| ≤ S, max_{z∈S_I(0,η_M)} |ω′_β(z)| ≤ S, with I as in (7.10). For µ_α = µ_β, we get |ω′_α(z)| ≤ S/|z − 1|, uniformly on S_I(0, η_M) as in (7.11).

Remark 7.2. In the case µ_α = µ_β, we note from Lemma 7.1 (c.f., (7.7)) that the point E = 1 is in the regular bulk B_{µ_α⊞µ_α}. However, m_{µ_α⊞µ_β}(1 + i0) is unstable under small perturbations. For instance, let µ_A = µ_α = ξδ₁ + (1 − ξ)δ₀, µ_B = (ξ − ε)δ₁ + (1 − ξ + ε)δ₀, for some small ε > 0. Then, according to Theorem 7.4 of [11], µ_A ⊞ µ_B has a point mass εδ₁. Hence, even though (2.25) (i.e., d_L(µ_B, µ_β) → 0, as ε → 0) is satisfied, m_{µ_A⊞µ_B}(z) contains a singular part ε/(1 − z), which blows up as |z − 1| = o(ε). This explains, on a heuristic level, the bound in (7.11), and shows why the µ_α = µ_β case at energy E = 1 is special even though the density f_{µ_α⊞µ_α} is real analytic in a neighborhood of E = 1.

Remark 7.3. Consider a more general setup with µ_α = ξδ_a + µ̃_α and µ_β = (1 − ξ)δ_b + µ̃_β, for some constants ξ ∈ (0, 1), a, b ∈ R, and for some Borel measures µ̃_α and µ̃_β with µ̃_α(R) = 1 − ξ and µ̃_β(R) = ξ. Analogously to the discussion in Remark 7.2, we note that m_{µ_α⊞µ_β}(a + b + i0) is unstable under small perturbations. However, from Lemma 3.4, we know that the system Φ_{µ_α,µ_β}(ω_α, ω_β, z) = 0 is linearly S-stable in the regular bulk under the assumptions of Theorem 2.5. That means, if neither µ_α nor µ_β is supported at a single point and at least one of them is supported at more than two points, then the point E = a + b cannot lie in the regular bulk B_{µ_α⊞µ_β}. Thus, only in the special case µ_α = µ_β with µ_α as in (7.1) is there an unstable point, up to scaling and shifting given by E = 1, inside the regular bulk B_{µ_α⊞µ_α}.

Proof of Proposition 7.2.
Estimates (7.8) and (7.9) follow from Lemma 3.2 and Lemma 3.3. To show statement (i), we recall from the proof of Lemma 3.4 that Φ_{µ_α,µ_β}(ω_α, ω_β, z) = 0 is linearly S-stable at (ω_α, ω_β) if

  | 1 − (F′_{µ_α}(ω_β) − 1)(F′_{µ_β}(ω_α) − 1) | ≥ c ,   (7.12)

for some strictly positive constant c. We now show that (7.12) holds in the case µ_α ≠ µ_β in the setup of (7.1). Using henceforth the shorthand F_α ≡ F_{µ_α}, F_β ≡ F_{µ_β}, we compute

  F_α(z) = z(1 − z)/(1 − ξ − z) ,  F_β(z) = z(θ − z)/(θ − θζ − z) ,  z ∈ C⁺ .   (7.13)

Then it is easy to obtain

  F′_α(z) − 1 = (ξ − ξ²)/(1 − ξ − z)² ,  F′_β(z) − 1 = θ²(ζ − ζ²)/(θ − θζ − z)² ,   (7.14)

and

  |F′_α(z) − 1| = (Im F_α(z) − Im z)/Im z ,  |F′_β(z) − 1| = (Im F_β(z) − Im z)/Im z .

Consequently, we have (c.f., (3.18))

  | (F′_α(ω_β(z)) − 1)(F′_β(ω_α(z)) − 1) | = (Im ω_α(z) − Im z)(Im ω_β(z) − Im z) / ( Im ω_α(z) Im ω_β(z) )   (7.15)

for any z ∈ C⁺. Hence, by (7.8) and (7.9), the right side of (7.15) is bounded away from 1, and (7.12) holds, for z ∈ S_I(η₀, η_M) with some small but fixed η₀ > 0. It remains to consider z ∈ S_I(0, η₀). Then (7.13), together with (2.5), implies that

  ω_β(1 − ω_β)/(1 − ξ − ω_β) = ω_α(θ − ω_α)/(θ − θζ − ω_α) ,  ω_β(1 − ω_β)/(1 − ξ − ω_β) = ω_α + ω_β − z .   (7.16)

Denote s := 1 − ξ − ω_β and t := θ − θζ − ω_α. From (7.14) we then have

  (F′_α(ω_β) − 1)(F′_β(ω_α) − 1) = (ξ − ξ²)(θ²ζ − (θζ)²)/(st)² .   (7.17)

Using (7.16), some algebra reveals that

  1/(st) = −1/(ξ − ξ²) + (ξ + θ − θζ − z)/((ξ − ξ²)t) ,  1/(st) = −1/(θ²(ζ − ζ²)) + (θζ + 1 − ξ − z)/(θ²(ζ − ζ²)s) .   (7.18)

Owing to (7.15) and (ξ − ξ²)(θ²ζ − (θζ)²) > 0 (since ξ, ζ ∈ (0, 1/2] and θ ≠ 0), it suffices to show that

  | Im (1/(st)) | ≥ c ,   (7.19)

in order to prove (7.12). Note that, from the definitions of s and t, together with (7.8) and (7.9), we have

  |Im s|, |Im t| ≥ c ,  |s|, |t| ≤ C .   (7.20)

Since µ_α ≠ µ_β, there exists a positive constant d such that max{|ξ − ζ|, |θ − 1|} ≥ d. It is then elementary to work out that

  max{ |(ξ − ξ²) − θ²(ζ − ζ²)| , |2ξ − 2θζ + θ − 1| } ≥ d₁ ,   (7.21)

for some positive constant d₁ ≡ d₁(ξ, ζ, θ) > 0, since the special case (θ, ξ, ζ) = (−1, 1/2, 1/2) is also excluded in the setting (7.1). For brevity, we adopt the notation

  φ := (θζ + 1 − ξ − z)/(θ²(ζ − ζ²)s) ,  ψ := (ξ + θ − θζ − z)/((ξ − ξ²)t) .

Then, according to (7.18), we have

  Re(1/(st)) = Re ψ − 1/(ξ − ξ²) = Re φ − 1/(θ²(ζ − ζ²)) ,  Im(1/(st)) = Im ψ = Im φ .   (7.22)

If |(ξ − ξ²) − θ²(ζ − ζ²)| ≥ d₁ holds in (7.21), then (7.22) implies that

  |Re ψ − Re φ| ≥ d₂ ,   (7.23)

for some positive constant d₂ ≡ d₂(ξ, ζ, θ). For small enough η₀ = η₀(ξ, ζ, θ), we then get

  Re ψ − Re φ = ( (ξ + θ − θζ − E) Re t + O(η₀) )/((ξ − ξ²)|t|²) − ( (θζ + 1 − ξ − E) Re s + O(η₀) )/(θ²(ζ − ζ²)|s|²) ,

which, together with (7.20) and (7.23), implies that

  max{ |θζ + 1 − ξ − E| , |ξ + θ − θζ − E| } ≥ d₃ ,   (7.24)

for some positive constant d₃ ≡ d₃(ξ, ζ, θ). If, on the other hand, |2ξ − 2θζ + θ − 1| ≥ d₁ holds in (7.21), we get (7.24) by the triangle inequality. Either way, (7.24) follows from (7.21), for sufficiently small, but fixed, η₀ > 0. Hence, there is c₁ > 0 such that, for all z ∈ S_I(0, η₀), we have max{|Im φ|, |Im ψ|} ≥ c₁. Since Im φ = Im ψ by (7.22), (7.19) holds on S_I(0, η₀), and hence (7.12) holds on all of S_I(0, η_M). So, if µ_α ≠ µ_β, the system Φ_{µ_α,µ_β}(ω_α, ω_β, z) = 0 is linearly S-stable with some finite S.

We next prove statement (ii), where µ_α = µ_β and thus θ = 1, ξ = ζ. From (7.16), we see that ω_α = ω_β satisfies the equation

  ω_α(1 − ω_α)/(1 − ξ − ω_α) = 2ω_α − z .   (7.25)

Solving (7.25) for ω_α(z), we get

  ω_α(z) = ω_β(z) = (1/2)( z + 1 − 2ξ + √((z − 1)² − 4ξ(1 − ξ)) ) ,   (7.26)

where the branch of the square root is chosen such that ω_β(z) → 1 − ξ + i√(ξ(1 − ξ)), as z → 1. Substituting (7.26) into (7.17), together with θ = 1, ζ = ξ, s = t = 1 − ξ − ω_α, yields

  (F′_α(ω_β(z)) − 1)(F′_β(ω_α(z)) − 1) = 16(ξ − ξ²)² / ( z − 1 + √((z − 1)² − 4(ξ − ξ²)) )⁴ .

Then it is elementary to check that

  | 1 − (F′_α(ω_β(z)) − 1)(F′_β(ω_α(z)) − 1) | ≳ |z − 1| ,  z ∈ S_I(0, η_M) ,

which further implies Γ_{µ_α,µ_β}(ω_α(z), ω_β(z)) ≲ 1/|z − 1|. Hence (7.11) is proved. □

Applications of Proposition 7.2.
Analogously to Theorem 2.5, we have two main applications of Proposition 7.2. The first one is the following modification of Theorem 2.7. Let µ_α, µ_β be as in (7.1) and let µ_A, µ_B be arbitrary probability measures on R. Recall the domain S_I(a, b) introduced in (2.13). For given (small) ς > 0, we set

  S^ς_I(a, b) := { z ∈ S_I(a, b) : ς|z − 1| ≥ max{ √(d_L(µ_A, µ_α)), √(d_L(µ_B, µ_β)) } } .   (7.27)

Proposition 7.3.
Let µ_α, µ_β be as in (7.1). Let I ⊂ B_{µ_α⊞µ_β} be a compact non-empty interval. Let µ_A, µ_B be two probability measures on R. Fix 0 < η_M < ∞. Then there are constants b > 0 and Z < ∞ such that the condition

  d_L(µ_A, µ_α) + d_L(µ_B, µ_β) ≤ b   (7.28)

implies

  max_{z∈S_I(0,η_M)} | m_{µ_A⊞µ_B}(z) − m_{µ_α⊞µ_β}(z) | ≤ Z ( d_L(µ_A, µ_α) + d_L(µ_B, µ_β) ) ,   (7.29)

in case µ_α ≠ µ_β, respectively

  | m_{µ_A⊞µ_B}(z) − m_{µ_α⊞µ_α}(z) | ≤ (Z/|z − 1|) ( d_L(µ_A, µ_α) + d_L(µ_B, µ_α) ) ,   (7.30)

uniformly on S^ς_I(0, η_M) with ς ≤ ς₀, for some ς₀ > 0, in case µ_α = µ_β. The constants b and Z depend only on the constants ξ, ζ, θ and on the interval I, while ς₀ also depends on b.

Proof. Having established Proposition 7.2, the proof of (7.29) is the same as that of Theorem 2.7. To establish (7.30), we mimic the proof of Theorem 2.7 with S replaced by S/|z − 1|. We only give a sketch here. Similarly to (5.9), using (7.8) and (7.11), we have, with b in (7.28) sufficiently small, that

  Γ_{µ_A,µ_B}(ω_α(z), ω_α(z)) ≲ 1/|z − 1| ,  z ∈ S^ς_I(0, η_M) .   (7.31)

As in the proof of Lemma 5.1, we rewrite the system Φ_{µ_α,µ_α}(ω_α(z), ω_α(z), z) = 0 as Φ_{µ_A,µ_B}(ω_α(z), ω_α(z), z) = r(z), with ‖r(z)‖ satisfying the bound (5.8). From the uniqueness of the solution to Φ_{µ_A,µ_B}(ω_A, ω_B, z) = 0 and (7.31), we get

  |ω_A(z) − ω_α(z)| ≲ ‖r(z)‖/|z − 1| ,  |ω_B(z) − ω_α(z)| ≲ ‖r(z)‖/|z − 1| ,  z ∈ S^ς_I(0, η_M) ,   (7.32)

via the Newton-Kantorovich theorem. Note that the inequality ‖r(z)‖ ≲ ς²|z − 1|² is needed to guarantee that the first order term dominates the higher order terms in the Taylor expansion of Φ_{µ_A,µ_B}(ω_A, ω_B, z) around Φ_{µ_A,µ_B}(ω_α, ω_β, z). This is the reason why we restrict our discussion to the set S^ς_I(0, η_M). In addition, thanks to (7.32), we see that (5.3) and (5.4) still hold with S_I(0, η_M) replaced by S^ς_I(0, η_M). Then the remaining parts of the proof of (7.30) are the same as the counterparts in the proof of Theorem 2.7. □

The second application of Proposition 7.2 gives the following local law for the Green function in the random matrix setup from Subsection 2.3.2. Fix any γ >
0. We introduce a sub-domain of S^ς_I(a, b) by setting

  S̃^ς_I(a, b) := S^ς_I(a, b) ∩ { z ∈ C : |z − 1| ≥ N^γ / √(N(Im z)^{3/2}) } .   (7.33)

Proposition 7.4. Let µ_α, µ_β be as in (7.1). Assume that the empirical eigenvalue distributions µ_A, µ_B of the sequences of matrices A, B satisfy (2.25). Fix any 0 < η_M < ∞, any small γ > 0, and set η_m = N^{−2/3+γ}. Let I ⊂ B_{µ_α⊞µ_β} be a compact non-empty interval. Then we have the following conclusions.

(i) If µ_α ≠ µ_β, then

  | m_H(z) − m_{A⊞B}(z) | ≺ 1 / (N(Im z)^{3/2}) ,

uniformly on S_I(η_m, η_M).

(ii) If µ_α = µ_β, then, for any fixed (small) ς > 0,

  | m_H(z) − m_{A⊞B}(z) | ≺ 1 / ( |z − 1| N(Im z)^{3/2} ) ,

uniformly on S̃^ς_I(η_m, η_M).

Proof of Proposition 7.4. Note that, in the proof of Theorem 2.8, the only place where we use the assumption that at least one of µ_α and µ_β is supported at more than two points is Lemma 3.4; in particular in (3.25). Hence, it suffices to mimic the proof of Theorem 2.8 with Lemma 3.4 replaced by Proposition 7.2. The proof in the case µ_α ≠ µ_β is then exactly the same as that of Theorem 2.8. It suffices to discuss the case µ_α = µ_β below. Analogously to Corollary 5.2, with the aid of (7.31) and (7.32), we show that

  Γ_{µ_A,µ_B}(ω_A(z), ω_B(z)) ≲ 1/|z − 1| ,  z ∈ S^ς_I(0, η_M) .   (7.34)

Then, we use a continuity argument, based on Lemma 4.2 and Proposition 4.1, with S replaced by S/|z − 1| therein, to deduce from (6.40) that |ω^c_i(z) − ω_i(z)| ≺ ‖r^c(z)‖/|z − 1|, i = A, B, on S̃^ς_I(η_m, η_M). The remaining parts of the proof are the same as in Theorem 2.8. This completes the proof of part (ii) of Proposition 7.4. □

References

[1] Akhieser, N. I.:
The classical moment problem and some related questions in analysis, Hafner Publishing Co., New York, 1965. [2] Anderson, G., Guionnet, A., Zeitouni, O.:
An introduction to random matrices , Cambridge Stud. Adv.Math. , Cambridge Univ. Press, Cambridge, 2010.[3] Bao Z. G., Erd˝os, L., Schnelli K.:
Local law of addition of random matrices on optimal scale ,arXiv:1509.07080 (2015).[4] Belinschi, S., Bercovici, H.:
A new approach to subordination results in free probability , J. Anal. Math. , 357-365 (2007).[5] Belinschi, S.:
A note on regularity for free convolutions , Ann. Inst. Henri Poincar´e Probab. Stat. ,635-648 (2006).[6] Belinschi, S.:
The Lebesgue decomposition of the free additive convolution of two probability distribu-tions , Probab. Theory Related Fields , 125-150 (2008).[7] Belinschi, S.: L ∞ -boundedness of density for free additive convolutions , Rev. Roumaine Math. PuresAppl. , 173-184 (2014).[8] Belinschi, S., Bercovici, H., Capitaine, M., F´evrier, M.: Outliers in the spectrum of large deformedunitarily invariant models , arXiv:1412.4916 (2014).[9] Benaych-Georges, F., Nadakuditi, R. R.:
The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices, Adv. Math., 494-521 (2011). [10] Bercovici, H., Voiculescu, D.: Free convolution of measures with unbounded support, Indiana Univ. Math. J., 733-773 (1993). [11] Bercovici, H., Voiculescu, D.: Regularity questions for free convolution, nonselfadjoint operator algebras, operator theory, and related topics, Oper. Theory Adv. Appl., 37-47 (1998). [12] Bercovici, H., Wang, J.-C.:
On freely indecomposable measures , Indiana Univ. Math. J. , 2601-2610 (2008).[13] Biane, P.:
On the free convolution with a semi-circular distribution , Indiana Univ. Math. J. , 705-718(1997).[14] Biane, P.: Processes with free increments , Math. Z. , 143-174 (1998).[15] Biane, P.:
Representations of symmetric groups and free probability , Adv. Math. , 126-181(1998).[16] Capitaine, M.:
Additive/multiplicative free subordination property and limiting eigenvectors of spikedadditive deformations of Wigner matrices and spiked sample covariance matrices , J. Theoret. Probab. , 595-648 (2013).[17] Chatterjee, S.:
Concentration of Haar measures, with an application to random matrices , J. Funct.Anal. , 379-389 (2007).[18] Chistyakov, G. P., G¨otze, F.:
The arithmetic of distributions in free probability theory, Cent. Eur. J. Math., 997-1050 (2011). [19] Collins, B.: Moments and cumulants of polynomial random variables on unitary groups, the Itzykson-Zuber integral, and free probability, Int. Math. Res. Not., 953-982 (2003). [20] Dykema, K.:
On certain free product factors via an extended matrix model , J. Funct. Anal. ,31-60 (1993).[21] Erd˝os, L., Knowles, A., Yau, H.-T.:
Averaging fluctuations in resolvents of random band matrices ,Ann. Henri Poincar´e , 1837-1926 (2013).[22] Erd˝os, L., Knowles, A., Yau, H.-T., Yin, J.: The local semicircle law for a general class of randommatrices . Electron. J. Probab. , 1-58 (2013).[23] Erd˝os, L., Schlein, B., Yau, H.-T.:
Local semicircle law and complete delocalization for Wigner randommatrices , Ann. Probab. , 815-852 (2009).[24] Erd˝os, L., Yau, H.-T., Yin, J.:
Bulk universality for generalized Wigner matrices , Probab. TheoryRelated Fields , 341-407 (2012).[25] Ferreira, O. P., Svaiter, B. F.:
Kantorovich’s theorem on Newton’s method , arXiv:1209.5704 (2012).[26] Gromov, M., Milman V. D.:
A topological application of the isoperimetric inequality , Amer. J. Math. , 843-854 (1983).[27] Hiai, F., Petz, D.:
The semicircle law, free random variables and entropy , Math. Surveys Monogr. ,Amer. Math. Soc., Providence RI, 2000.[28] Kargin, V.: On eigenvalues of the sum of two random projections , J. Stat. Phys. , 246-258(2012).[29] Kargin, V.:
A concentration inequality and a local law for the sum of two random matrices , Probab.Theory Related Fields , 677-702 (2012).[30] Kargin, V.:
An inequality for the distance between densities of free convolutions , Ann. Probab. ,3241-3260 (2013).[31] Kargin, V.:
Subordination for the sum of two random matrices , Ann. Probab. , 2119-2150 (2015).[32] Maassen, H.:
Addition of freely independent random variables, J. Funct. Anal., 409-438 (1992). [33] Pastur, L., Vasilchuk, V.:
On the law of addition of random matrices , Comm. Math. Phys. ,249-286 (2000).[34] Speicher, R.:
Free convolution and the random sum of matrices, Publ. Res. Inst. Math. Sci., 731-744 (1993). [35] Speicher, R.:
Multiplicative functions on the lattice of non-crossing partitions and free convolution ,Math. Ann. , 611-628 (1994).[36] Voiculescu, D.:
Addition of certain non-commuting random variables , J. Funct. Anal. , 323-346(1986).[37] Voiculescu, D.:
Limit laws for random matrices and free products , Invent. Math. , 201-220(1991).[38] Voiculescu, D.:
The analogues of entropy and of Fisher’s information measure in free probabilitytheory I , Comm. Math. Phys. , 71-92 (1993).[39] Voiculescu, D., Dykema, K. J., Nica, A.:
Free random variables, CRM Monogr. Ser., Amer. Math. Soc., Providence, RI, 1992. [40] Xu, F.: A random matrix model from two-dimensional Yang-Mills theory, Comm. Math. Phys. 190