An inequality for the distance between densities of free convolutions
The Annals of Probability
Institute of Mathematical Statistics, 2013
AN INEQUALITY FOR THE DISTANCE BETWEEN DENSITIES OF FREE CONVOLUTIONS
By V. Kargin
University of Cambridge
This paper contributes to the study of the free additive convolution of probability measures. It shows that under some conditions, if measures $\mu_i$ and $\nu_i$, $i = 1, 2$, are close to each other in terms of the Lévy metric and if the free convolution $\mu_1 \boxplus \mu_2$ is sufficiently smooth, then $\nu_1 \boxplus \nu_2$ is absolutely continuous, and the densities of measures $\nu_1 \boxplus \nu_2$ and $\mu_1 \boxplus \mu_2$ are close to each other. In particular, convergence in distribution $\mu_1^{(n)} \to \mu_1$, $\mu_2^{(n)} \to \mu_2$ implies that the density of $\mu_1^{(n)} \boxplus \mu_2^{(n)}$ is defined for all sufficiently large $n$ and converges to the density of $\mu_1 \boxplus \mu_2$. Some applications are provided, including: (i) a new proof of the local version of the free central limit theorem, and (ii) new local limit theorems for sums of free projections, for sums of $\boxplus$-stable random variables and for eigenvalues of a sum of two $N$-by-$N$ random matrices.
Received August 2011; revised January 2012.
AMS 2000 subject classifications.
Key words and phrases. Free probability, free convolution, convergence of measures.
This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Probability, 2013, Vol. 41, No. 5, 3241–3260. This reprint differs from the original in pagination and typographic detail.

1. Introduction. Free convolution is a binary operation on the set of probability measures on the real line that converts this set into a commutative semigroup. In contrast to the usual convolution, this operation is nonlinear relative to taking convex combinations of measures. The study of properties of free convolution is motivated by its numerous applications to operator algebras [11, 21, 24], random matrices [10, 17, 19, 22], representations of the symmetric group [8] and quantum physics [9, 27].

Starting with work by Voiculescu [21], it was noted that free convolution has strong smoothing properties. Let $\mu_1 \boxplus \mu_2$ denote the free convolution of probability measures $\mu_1$ and $\mu_2$. In [6], it was proved that $\mu_1 \boxplus \mu_2$ has an atom at $x \in \mathbb{R}$ if and only if there are $y \in \mathbb{R}$ and $z \in \mathbb{R}$ such that $x = y + z$ and $\mu_1(\{y\}) + \mu_2(\{z\}) > 1$. In [1], it was shown that $\mu_1 \boxplus \mu_2$ can have a singular component if and only if one of the measures is concentrated on one point, and the other has a singular component (so that the resulting free convolution is simply a translation of the measure with the singular component). Moreover, in the same paper it was shown that the density of the absolutely continuous part of the free convolution measure is analytic wherever the density is positive and finite.

Some quantitative versions of the smoothing property of free convolution have also been given. In particular, in [23] it was shown that if $\mu_1$ is absolutely continuous with density $f_{\mu_1} \in L^p(\mathbb{R})$ ($p \in (1, \infty]$), then the free convolution of $\mu_1$ with an arbitrary other measure $\mu_2$ is absolutely continuous with density $f_{\mu_1 \boxplus \mu_2} \in L^p(\mathbb{R})$, and $\|f_{\mu_1 \boxplus \mu_2}\|_p \le \|f_{\mu_1}\|_p$. In particular, the supremum of the density $f_{\mu_1 \boxplus \mu_2}$ is less than or equal to the supremum of the density $f_{\mu_1}$.

Another important property of free convolution is that it is continuous with respect to weak convergence of measures. In particular, by a result in [4], if $\mu_1^{(N)} \to \mu_1$ and $\mu_2^{(N)} \to \mu_2$ as $N$ grows to infinity (where $\to$ denotes convergence in distribution), then $\mu_1^{(N)} \boxplus \mu_2^{(N)} \to \mu_1 \boxplus \mu_2$. In fact, Theorem 4.13 in [4] says that $d_L(\mu_1 \boxplus \mu_2, \nu_1 \boxplus \nu_2) \le d_L(\mu_1, \nu_1) + d_L(\mu_2, \nu_2)$, where $d_L$ denotes the Lévy distance on the set of probability measures on $\mathbb{R}$.

The main result of this paper establishes a strengthened version of this property. If distances $d_L(\mu_1, \nu_1)$ and $d_L(\mu_2, \nu_2)$ are sufficiently small, and if $\mu_1 \boxplus \mu_2$ is sufficiently smooth, then $\nu_1 \boxplus \nu_2$ is absolutely continuous and the distance between the densities of $\mu_1 \boxplus \mu_2$ and $\nu_1 \boxplus \nu_2$ can be bounded in terms of the Lévy distances between the original measures.

In particular, this result shows that the convergence in distribution $\mu_1^{(N)} \to \mu_1$ and $\mu_2^{(N)} \to \mu_2$ implies the convergence of the probability densities of $\mu_1^{(N)} \boxplus \mu_2^{(N)}$ to the density of $\mu_1 \boxplus \mu_2$.

We prove this result under an assumption imposed on the measures $\mu_1$ and $\mu_2$, which we call the smoothness of the pair $(\mu_1, \mu_2)$ at a point $x$ of its support. This assumption holds at a generic point $x$ if $\mu_1 = \mu_2 = \mu$, and $\mu \boxplus \mu$ is absolutely continuous with positive density at $x$.
In the case when $\mu_1 \ne \mu_2$, this assumption should be checked directly. We envision that in applications $\mu_1$ and $\mu_2$ are fixed measures for which this assumption can be directly checked, and $\mu_1^{(N)}$ and $\mu_2^{(N)}$ are (perhaps random) measures for which it can be checked that they are close to $\mu_1$ and $\mu_2$ in the Lévy distance.

In order to formulate our main result precisely, we introduce several definitions. Let $\mu_1$ and $\mu_2$ be two probability measures on $\mathbb{R}$ with the Stieltjes transforms $m_{\mu_1}(z)$ and $m_{\mu_2}(z)$, where the Stieltjes transform of a probability measure $\mu$ is defined by the formula
$$m_\mu(z) := \int_{\mathbb{R}} \frac{\mu(dx)}{x - z}.$$
Then, the free convolution $\mu_1 \boxplus \mu_2$ is defined as a probability measure on $\mathbb{R}$ with the Stieltjes transform $m_{\mu_1 \boxplus \mu_2}(z)$, which satisfies the following system of equations:
$$m_{\mu_1 \boxplus \mu_2}(z) = m_{\mu_1}(\omega_1(z)), \qquad m_{\mu_1 \boxplus \mu_2}(z) = m_{\mu_2}(\omega_2(z)), \qquad z - \frac{1}{m_{\mu_1 \boxplus \mu_2}(z)} = \omega_1(z) + \omega_2(z). \tag{1}$$
Here $\omega_1(z)$ and $\omega_2(z)$ are analytic functions in $\mathbb{C}^+ := \{z : \Im z > 0\}$ that map $\mathbb{C}^+$ to itself, that have the property $\Im \omega_j(z) \ge \Im z$, and such that $\omega_j(z) = z + o(z)$ as $z \to \infty$ in the sector $\Im z > \kappa |\Re z|$, where $\kappa$ is an arbitrary positive constant [7]. Functions $\omega_1(z)$ and $\omega_2(z)$ are called the subordination functions for the pair $(\mu_1, \mu_2)$.

The definition of free convolution by the system (1) is equivalent to the standard definition through $R$-transforms ([25] and [16]) if one sets $\omega_1(z) = z - R_{\mu_2}(-m_{\mu_1 \boxplus \mu_2}(z))$, and similarly for $\omega_2(z)$.

The subordination functions $\omega_j(z)$ depend not only on $z$ but also on the pair $(\mu_1, \mu_2)$. In particular, some properties of the measures $\mu_1$ and $\mu_2$ are encoded in the functions $\omega_j$. A proper but more cumbersome notation would be $\omega_j(\mu_1, \mu_2, z)$ where $j = 1, 2$. In the cases when we need to compare the subordination functions for pairs $(\mu_1, \mu_2)$ and $(\nu_1, \nu_2)$, we will denote them by $\omega_{\mu,j}(z)$ and $\omega_{\nu,j}(z)$, respectively.

The system (1) implies the following system of equations for $\omega_j$:
$$\frac{1}{z - \omega_1(z) - \omega_2(z)} = m_{\mu_1}(\omega_1(z)), \qquad \frac{1}{z - \omega_1(z) - \omega_2(z)} = m_{\mu_2}(\omega_2(z)). \tag{2}$$
Note that the analytic solutions of the system (2) that satisfy the asymptotic condition at infinity are unique in $\mathbb{C}^+$. (This follows from the facts that the solutions are unique in the area $\Im z \ge \eta$ for sufficiently large $\eta$ and that the analytic continuation in a simply-connected domain is unique.)

By Theorem 3.3 in [1], the limits $\omega_j(x) = \lim_{\eta \downarrow 0} \omega_j(x + i\eta)$ exist, and we make the following definition.

Definition 1.1. A pair of probability measures on the real line $(\mu_1, \mu_2)$ is said to be smooth at $x$ if the following two conditions hold:
(i) $\Im \omega_j(x) > 0$ for $j = 1, 2$, and
(ii)
$$k_\mu(x) := \frac{1}{m'_{\mu_1}(\omega_1(x))} + \frac{1}{m'_{\mu_2}(\omega_2(x))} - (x - \omega_1(x) - \omega_2(x))^2 \ne 0. \tag{3}$$
Inequality (3) is a technical condition and holds for a generic point $x \in \mathbb{R}$.
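The Stieltjes transform used throughout can be evaluated directly when the measure is finitely supported. The following is a minimal sketch under that assumption; the function names `stieltjes_transform` and `smoothed_density` are illustrative and do not come from the paper. It uses the convention $m_\mu(z) = \int \mu(dx)/(x - z)$, so that $\Im m_\mu(x + i\eta)/\pi$ is the density of $\mu$ smoothed by a Cauchy kernel of width $\eta$.

```python
import math


def stieltjes_transform(atoms, weights, z):
    """m_mu(z) = sum_k w_k / (a_k - z) for the atomic measure mu = sum_k w_k * delta_{a_k}.

    Uses the sign convention of the text: m_mu(z) = int mu(dx) / (x - z).
    """
    return sum(w / (a - z) for a, w in zip(atoms, weights))


def smoothed_density(atoms, weights, x, eta):
    """Im m_mu(x + i*eta) / pi.

    For an atomic measure this equals sum_k w_k * eta / (pi * ((a_k - x)^2 + eta^2)),
    i.e. the measure convolved with a Cauchy kernel of scale eta.
    """
    return stieltjes_transform(atoms, weights, complex(x, eta)).imag / math.pi


if __name__ == "__main__":
    # Symmetric Bernoulli measure (delta_{-1} + delta_{1}) / 2 evaluated at z = i.
    print(stieltjes_transform([-1.0, 1.0], [0.5, 0.5], 1j))
```

With this convention the transform maps the upper half-plane to itself and satisfies $|m_\mu(x + i\eta)| \le 1/\eta$, two facts that are used repeatedly in the proofs.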
Condition (i) is somewhat stronger than the condition that $\mu_1 \boxplus \mu_2$ is Lebesgue absolutely continuous at $x$. Indeed, if $\Im \omega_j(x) > 0$ for $j = 1, 2$, then the limit
$$\lim_{\eta \to 0} m_{\mu_1 \boxplus \mu_2}(x + i\eta) = \lim_{\eta \to 0} m_{\mu_1}(\omega_1(x + i\eta))$$
exists and is finite. By using results in [1], we can infer from this fact that $\mu_1 \boxplus \mu_2$ is Lebesgue absolutely continuous at $x$.

In the converse direction, we have only that if $\mu_1 = \mu_2 = \mu$, and $\mu \boxplus \mu$ is absolutely continuous with positive density at $x$, then condition (i) in the definition of smoothness is satisfied; see Proposition 1.4 below.

The fact that smoothness is strictly stronger than absolute continuity of $\mu_1 \boxplus \mu_2$ can be seen from the following example. If $\mu_1$ is a point mass at 0, that is, $\mu_1 = \delta_0$, and if $\mu_2$ is absolutely continuous at $x$, then $\mu_1 \boxplus \mu_2 = \mu_2$ is absolutely continuous at $x$, but the pair $(\mu_1, \mu_2)$ is not smooth at $x$. Indeed, $m_{\delta_0}(z) = -z^{-1}$, and system (2) implies that $\omega_2(z) = z$. Hence, $\Im \omega_2(x) = 0$ for every $x$.

On the other hand, smoothness holds for many examples that we consider below.

Next, let us recall the following standard definition.

Definition 1.2. The Lévy distance between probability measures $\mu$ and $\nu$ is
$$d_L(\mu, \nu) = \sup_x \inf\{s \ge 0 : F_\nu(x - s) - s \le F_\mu(x) \le F_\nu(x + s) + s\},$$
where $F_\mu(t)$ and $F_\nu(t)$ are the cumulative distribution functions of $\mu$ and $\nu$.

It is well known that $\mu^{(N)} \to \mu$ in distribution (i.e., the cumulative distribution function of $\mu^{(N)}$ weakly converges to the cumulative distribution function of $\mu$) if and only if $d_L(\mu^{(N)}, \mu) \to 0$; see, for example, Theorem III.1.2 on page 314 and Exercise III.1.4 on page 316 in [18].

Here is the main result of this paper.
Theorem 1.3. Assume that a pair of probability measures $(\mu_1, \mu_2)$ is smooth at $x$. Then, there are some $s_{\mu,0} > 0$ and $c_\mu > 0$, which depend only on $(\mu_1, \mu_2)$, such that for all pairs of probability measures $(\nu_1, \nu_2)$ with $d_L(\mu_j, \nu_j) < s \le s_{\mu,0}$ for $j = 1, 2$, the measure $\nu_1 \boxplus \nu_2$ is absolutely continuous in a neighborhood of $x$, and $|f_{\nu_1 \boxplus \nu_2}(x) - f_{\mu_1 \boxplus \mu_2}(x)| < c_\mu s$, where $f_{\mu_1 \boxplus \mu_2}$ and $f_{\nu_1 \boxplus \nu_2}$ denote the densities of $\mu_1 \boxplus \mu_2$ and $\nu_1 \boxplus \nu_2$, respectively.

[…]

Let $m_{\nu_j}(z)$ and $m_{\nu_1 \boxplus \nu_2}(z)$ denote the Stieltjes transforms of $\nu_j$ and $\nu_1 \boxplus \nu_2$, respectively, and let $\omega_{\nu,j}$ denote the subordination functions for the pair $(\nu_1, \nu_2)$. First, we prove that the smallness of $d_L(\mu_j, \nu_j)$ implies that the differences $|m_{\nu_j} - m_{\mu_j}|$ are small, and that the differences between the derivatives of $m_{\nu_j}$ and $m_{\mu_j}$ are also small. Then we show that this fact, together with system (2), implies that the differences between the corresponding subordination functions are small. At this stage we need the assumption of smoothness. Finally, we check that if both the Stieltjes transforms and the subordination functions of pairs $(\mu_1, \mu_2)$ and $(\nu_1, \nu_2)$ are close to each other, then the Stieltjes transforms of $\mu_1 \boxplus \mu_2$ and $\nu_1 \boxplus \nu_2$ are close to each other uniformly on the half-line $\Re z = x$, $\Im z > 0$. This fact implies that the densities of $\mu_1 \boxplus \mu_2$ and $\nu_1 \boxplus \nu_2$ at $x$ are close to each other.

Before discussing applications of Theorem 1.3, let us mention some results which are helpful in checking the assumptions of this theorem.

Proposition 1.4. If $\mu \boxplus \mu$ is (Lebesgue) absolutely continuous in a neighborhood of $x$, and the density of $\mu \boxplus \mu$ is positive at $x$, then $\Im \omega_j(x) > 0$ for $j = 1, 2$.

Another important case is when one of the probability measures has the semicircle distribution with the density $f_{sc}(x) = \frac{1}{2\pi}\sqrt{(4 - x^2)_+}$. Since such a measure, $\mu_{sc}$, is absolutely continuous, $\mu_{sc} \boxplus \mu$ is also absolutely continuous, for an arbitrary $\mu$.

Proposition 1.5. If the density of $\mu_{sc} \boxplus \mu$ is positive at $x$, and $|m_{\mu_{sc} \boxplus \mu}(x)| \ne 1$, then $\Im \omega_j(x) > 0$ for $j = 1, 2$.

The proofs of Propositions 1.4 and 1.5 will be given in Section 3.

Now let us turn to applications. Theorem 1.3 can be applied to derive some old and new results about sums of free random variables and about eigenvalues of large random matrices.
Recall that if $X_1, \ldots, X_n$ are free, identically distributed self-adjoint random variables with finite variance $\sigma^2$, then [15, 20] $S_n := (X_1 + \cdots + X_n)/(\sigma\sqrt{n})$ converges in distribution to a random variable $X$ with the standard semicircle law.

In terms of free convolutions, it means that if $\mu$ is a probability measure with variance $\sigma^2$, and if
$$\mu_n(dx) := \underbrace{\mu \boxplus \cdots \boxplus \mu}_{n\ \text{times}}(\sigma\sqrt{n}\,dx),$$
then $\mu_n \to \mu_{sc}$.

Bercovici and Voiculescu in [5] showed that the convergence in this limit law holds in a stronger sense. Namely, assuming in addition that the support of $\mu$ is bounded, they showed that $\mu_n$ has a density for all sufficiently large $n$ and that the sequence of these densities converges uniformly to the density of the semicircle law. Recently, this result was generalized in [26] to the case of $\mu_n$ with unbounded support and finite variance. Results in [5] and [26] can be considered as local limit versions of the free CLT.

In the first application (Theorem 4.1), we give a short proof of the easier part of the results in [5] and [26] by using Theorem 1.3. (A more difficult part of these results concerns the uniformity of the convergence on $\mathbb{R}$.)

In the second application (Theorem 4.2), we prove an analogous local limit result for the sums $S_n = X_{1,n} + \cdots + X_{n,n}$, where $X_{i,n}$ are free projection operators with parameters $p_{i,n}$ such that $\sum_{i=1}^n p_{i,n} \to \lambda$ and $\max_i p_{i,n} \to 0$ as $n \to \infty$. The classical analogue of this situation is the sum of independent indicator random variables, and the classical result states that the sums converge in distribution to the Poisson random variable with parameter $\lambda$. A local version of this result is absent in the classical case because the Poisson random variable is discrete, and it does not make sense to talk about convergence of densities. In the free probability case, the limit of the spectral distributions of $S_n$ is the Marchenko–Pastur distribution, which is absolutely continuous with bounded density for $\lambda > 1$. We show that in this case the spectral measures of $S_n$ have a density for all sufficiently large $n$ and that the sequence of these densities converges uniformly to the density of the Marchenko–Pastur law.

In the third application (Theorem 4.3), we show that a similar local limit result holds for sums of free $\boxplus$-stable random variables.

The fourth application (Theorem 4.4) is of a different kind and is concerned with eigenvalues of large random matrices. Let $H_N = A_N + U_N B_N U_N^*$, where $A_N$ and $B_N$ are $N$-by-$N$ Hermitian matrices, and $U_N$ is a random unitary matrix with the Haar distribution on the unitary group $U(N)$. Let $\lambda_1^{(A)} \ge \cdots \ge \lambda_N^{(A)}$ be the eigenvalues of $A_N$. Similarly, let $\lambda_k^{(B)}$ and $\lambda_k^{(H)}$ be ordered eigenvalues of matrices $B_N$ and $H_N$, respectively. Define the spectral point measures of $A_N$ by $\mu_{A_N} := N^{-1} \sum_{k=1}^N \delta_{\lambda_k^{(A)}}$, and define the spectral point measures of $B_N$ and $H_N$ similarly.

Assume that $\mu_{A_N} \to \mu_\alpha$ and $\mu_{B_N} \to \mu_\beta$, and that the support of $\mu_{A_N}$ and $\mu_{B_N}$ is uniformly bounded. Let the pair $(\mu_\alpha, \mu_\beta)$ be smooth at $x$. Define $N_I := N \mu_{H_N}(I)$, the number of eigenvalues of $H_N$ in interval $I$, and let $N_\eta(x) := N_{(x - \eta, x + \eta]}$. Finally, assume that $\eta = \eta(N)$ and $1/\sqrt{\log(N)} \ll \eta(N) \ll 1$. Then
$$\frac{N_\eta(x)}{2\eta N} \to f_{\mu_\alpha \boxplus \mu_\beta}(x)$$
with probability 1, where $f_{\mu_\alpha \boxplus \mu_\beta}$ denotes the density of $\mu_\alpha \boxplus \mu_\beta$. This result generalizes the main result in [17] where it was proved that $\mu_{H_N} \to \mu_\alpha \boxplus \mu_\beta$. It can be interpreted as a local limit law for eigenvalues of a sum of random Hermitian matrices.

The rest of the paper is organized as follows. Section 2 is concerned with the proof of the main theorem, Section 3 contains proofs of Propositions 1.4 and 1.5, Section 4 contains applications, and Section 5 concludes.
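Applying the main result to concrete measures, such as the spectral point measures above, requires a bound on the Lévy distance of Definition 1.2. The following is a minimal sketch, not taken from the paper, that computes $d_L$ between two empirical measures with equal weights on finitely many points. It assumes (as can be checked from the definition) that for step-function distribution functions the inequality $F_\nu(x - s) - s \le F_\mu(x)$ only needs to be tested at the points $x = b + s$ with $b$ an atom of $\nu$, and that the admissible set of $s$ is a closed half-line, so bisection applies. All function names are hypothetical.

```python
from bisect import bisect_right


def _prepare(points):
    """Sorted atoms and cumulative weights of the empirical measure of `points`."""
    atoms = sorted(points)
    n = len(atoms)
    cum = [(k + 1) / n for k in range(n)]
    return atoms, cum


def _cdf(atoms, cum, t):
    """Right-continuous distribution function of the empirical measure at t."""
    k = bisect_right(atoms, t)
    return cum[k - 1] if k > 0 else 0.0


def _one_sided_ok(a_atoms, a_cum, b_atoms, b_cum, s):
    """Check F_b(x - s) - s <= F_a(x) for all real x.

    F_b(x - s) - F_a(x) is piecewise constant and jumps up only at x = b + s
    for atoms b of the second measure, so it suffices to test those points.
    """
    return all(
        _cdf(b_atoms, b_cum, b) - _cdf(a_atoms, a_cum, b + s) <= s
        for b in b_atoms
    )


def levy_distance(xs, ys, n_iter=60):
    """Levy distance between the empirical measures of xs and ys (by bisection)."""
    mu = _prepare(xs)
    nu = _prepare(ys)

    def ok(s):
        # Both inequalities of Definition 1.2, written as two one-sided checks.
        return _one_sided_ok(*mu, *nu, s) and _one_sided_ok(*nu, *mu, s)

    lo, hi = 0.0, 1.0  # s = 1 always satisfies both inequalities
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if ok(mid):
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    print(levy_distance([0.0], [0.3]))
```

For two point masses $\delta_0$ and $\delta_a$ with $0 < a < 1$ the definition gives $d_L = a$ by hand, which is a convenient check of the routine.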
2. Proof of Theorem 1.3.
Let $F_\mu(x)$ and $F_\nu(x)$ denote the cumulative distribution functions of the measures $\mu$ and $\nu$, respectively.

Lemma 2.1. Suppose that $d_L(\mu, \nu) = s$. Assume that $h(x)$ is a $C^1$ real-valued function, such that $\int_{-\infty}^{\infty} |h(u)|\,du < \infty$ and $\int_{-\infty}^{\infty} |h'(u)|\,du < \infty$. Assume in addition that $h(u)$ has a finite number of zeros. Then,
$$\Delta := \int_{\mathbb{R}} |h(u)[F_\nu(\eta u) - F_\mu(\eta u)]|\,du \le c s \max\{1, \eta^{-1}\}, \tag{4}$$
where $c > 0$ depends only on $h$.

Proof. Since $h$ is a continuous function with a finite number of zeros, we can decompose the set on which $h(u)$ is nonzero into a finite number of intervals $I_k$ on which $h(u)$ has a constant sign. Note that it suffices to estimate the integral on each of these intervals. Consider the case when $h(u) > 0$ on $I_k$. The treatment of the case $h(u) < 0$ is similar. By the definition of the Lévy distance,
$$|F_\nu(\eta u) - F_\mu(\eta u)| \le \max\{F_\mu(\eta u + s) - F_\mu(\eta u),\ F_\nu(\eta u + s) - F_\nu(\eta u),\ F_\mu(\eta u) - F_\mu(\eta u - s),\ F_\nu(\eta u) - F_\nu(\eta u - s)\} + s.$$
It suffices to estimate
$$\int_{I_k} h(u)\{F_\mu(\eta u + s) - F_\mu(\eta u) + s\}\,du,$$
since the other cases are similar.

First of all, note that
$$\int_{I_k} h(u)\,s\,du \le s \int_{-\infty}^{\infty} |h(u)|\,du \le c s. \tag{5}$$
Next, let $\tilde{I}_k = I_k + s/\eta$. Then,
$$\int_{I_k} h(u) F_\mu(\eta u + s)\,du = \int_{\tilde{I}_k} h(t - s/\eta) F_\mu(\eta t)\,dt$$
and therefore,
$$\int_{I_k} h(u)[F_\mu(\eta u + s) - F_\mu(\eta u)]\,du \le \int_{I_k \cap \tilde{I}_k} [h(t - s/\eta) - h(t)] F_\mu(\eta t)\,dt + \int_{I_k \triangle \tilde{I}_k} \max(|h(t - s/\eta)|, |h(t)|) F_\mu(\eta t)\,dt. \tag{6}$$
For the first integral in this estimate, we can use the fact that
$$h(t - s/\eta) - h(t) = -\int_{t - s/\eta}^{t} h'(\xi)\,d\xi$$
and therefore,
$$\left|\int_{I_k \cap \tilde{I}_k} [h(t - s/\eta) - h(t)] F_\mu(\eta t)\,dt\right| \le \int_{\mathbb{R}} \int_{t - s/\eta}^{t} |h'(\xi)| F_\mu(\eta t)\,d\xi\,dt = \int_{\mathbb{R}} |h'(\xi)| \left(\int_{\xi}^{\xi + s/\eta} F_\mu(\eta t)\,dt\right) d\xi \le \frac{s}{\eta} \int_{\mathbb{R}} |h'(\xi)|\,d\xi. \tag{7}$$
For the second integral, we note that
$$\int_{I_k \triangle \tilde{I}_k} \max(|h(t - s/\eta)|, |h(t)|) F_\mu(\eta t)\,dt \le \sup_t |h(t)|\,|I_k \triangle \tilde{I}_k| \le 2 \sup_t |h(t)|\,s/\eta. \tag{8}$$
By using estimates (5), (6), (7) and (8), we obtain
$$\Delta \le c s \max\{1, \eta^{-1}\},$$
where $c$ depends only on the function $h$. □

Now, let $m_\mu(z)$ and $m_\nu(z)$ denote the Stieltjes transforms of the probability measures $\mu$ and $\nu$, respectively.

Lemma 2.2.
Let $d_L(\mu, \nu) = s$ and $z = x + i\eta$, where $\eta > 0$. Then:
(a) $|m_\mu(z) - m_\nu(z)| < c s \eta^{-1} \max\{1, \eta^{-1}\}$ where $c$ is a positive constant, and
(b) $\left|\frac{d^r}{dz^r}(m_\mu(z) - m_\nu(z))\right| < c_r s \eta^{-1-r} \max\{1, \eta^{-1}\}$ where $c_r$ are positive constants.

Proof. (a) By integration by parts,
$$m_\mu(z) = \int_{\mathbb{R}} \frac{F_\mu(\lambda)}{(\lambda - z)^2}\,d\lambda.$$
Hence, setting $u = (\lambda - x)/\eta$,
$$\Im m_\mu(z) = \frac{2}{\eta} \int_{\mathbb{R}} F_\mu(x + \eta u) \frac{u\,du}{(1 + u^2)^2}, \qquad \Re m_\mu(z) = \frac{1}{\eta} \int_{\mathbb{R}} F_\mu(x + \eta u) \frac{(u^2 - 1)\,du}{(1 + u^2)^2},$$
and similar formulas hold for $\Im m_\nu(z)$ and $\Re m_\nu(z)$. Since $u(1 + u^2)^{-2}$ and $(u^2 - 1)(1 + u^2)^{-2}$ satisfy the assumptions of Lemma 2.1, claim (a) follows.

Claim (b) can be derived similarly by writing
$$\frac{d^r}{dz^r} m_\mu(z) = (r + 1)! \int_{\mathbb{R}} \frac{F_\mu(\lambda)\,d\lambda}{(\lambda - x - i\eta)^{r+2}} = \frac{(r + 1)!}{\eta^{r+1}} \int_{\mathbb{R}} \frac{1}{(u - i)^{r+2}} F_\mu(\eta u + x)\,du,$$
separating imaginary and real parts of the integrand, and applying Lemma 2.1. □

Lemma 2.3.
Assume that the pair $(\mu_1, \mu_2)$ is smooth at $x$. Suppose that $(\nu_1, \nu_2)$ is another pair of probability measures such that $d_L(\mu_j, \nu_j) < s$ for $j = 1, 2$. Let $\Re z = x$ and $\Im z \ge 0$. Then
$$\left|\frac{1}{z - \omega_{\mu,1}(z) - \omega_{\mu,2}(z)} - m_{\nu_j}(\omega_{\mu,j}(z))\right| \le c_\mu s$$
for $j = 1, 2$. Here $c_\mu > 0$ depends only on $(\mu_1, \mu_2)$ and $x$.

That is, if we substitute $\omega_{\mu,j}(z)$ in the system for $\omega_{\nu,j}(z)$, then the equalities will be satisfied up to a quantity of order $s$.

Proof of Lemma 2.3. The functions $\omega_{\mu,j}(z)$ satisfy equations (2), which implies that it is enough to show that
$$|m_{\nu_j}(\omega_{\mu,j}(z)) - m_{\mu_j}(\omega_{\mu,j}(z))| < c s$$
for $j = 1, 2$. Note that $\min_{j=1,2}\{\Im(\omega_{\mu,j}(x))\} > 0$ by the assumption of smoothness of $(\mu_1, \mu_2)$. In addition, $\Im(\omega_{\mu,j}(x + i\eta)) \ge \eta$ for all $\eta > 0$. Hence, by continuity of $\omega_{\mu,j}(x + i\eta)$ in $\eta$, we have $\kappa_j := \inf_{\eta \ge 0} \Im \omega_{\mu,j}(x + i\eta) > 0$. Then, by Lemma 2.2 applied with $\eta$ replaced by $\Im \omega_{\mu,j}(z) \ge \kappa_j$,
$$|m_{\nu_j}(\omega_{\mu,j}(z)) - m_{\mu_j}(\omega_{\mu,j}(z))| < c s \max\{\kappa_j^{-1}, \kappa_j^{-2}\}. \qquad \square$$

Proposition 2.4.
Assume that a pair of probability measures $(\mu_1, \mu_2)$ is smooth at $x$. Then there are some $s_{\mu,0} > 0$ and $c_\mu > 0$ that depend only on $(\mu_1, \mu_2)$ and $x$, such that for all pairs of probability measures $(\nu_1, \nu_2)$ with $d_L(\mu_j, \nu_j) < s \le s_{\mu,0}$ for $j = 1, 2$, the limits $\omega_{\nu,j}(x) := \lim_{\eta \downarrow 0} \omega_{\nu,j}(x + i\eta)$ exist, and it is true that
$$|\omega_{\nu,j}(x) - \omega_{\mu,j}(x)| \le c_\mu s$$
for $j = 1, 2$.

Corollary 2.5. Assume that the assumptions of Proposition 2.4 hold and that $d_L(\mu_j, \nu_j) < s \le s_{\mu,0}$ for $j = 1, 2$. Then, $\nu_1 \boxplus \nu_2$ is absolutely continuous in a neighborhood of $x$, and
$$|f_{\nu_1 \boxplus \nu_2}(x) - f_{\mu_1 \boxplus \mu_2}(x)| < c_\mu s,$$
where $f_{\mu_1 \boxplus \mu_2}$ and $f_{\nu_1 \boxplus \nu_2}$ denote the densities of $\mu_1 \boxplus \mu_2$ and $\nu_1 \boxplus \nu_2$, respectively.

Proof. Since $m_{\nu_1 \boxplus \nu_2}(z) = (z - \omega_{\nu,1}(z) - \omega_{\nu,2}(z))^{-1}$, Proposition 2.4 implies that the limit $m_{\nu_1 \boxplus \nu_2}(x) := \lim_{\eta \downarrow 0} m_{\nu_1 \boxplus \nu_2}(x + i\eta)$ exists and
$$|m_{\nu_1 \boxplus \nu_2}(x) - m_{\mu_1 \boxplus \mu_2}(x)| < c_\mu s. \tag{9}$$
By [1], $\nu_1 \boxplus \nu_2$ has no singular component. Hence, inequality (9) and the absolute continuity of $\mu_1 \boxplus \mu_2$ in a neighborhood of $x$ imply that for all sufficiently small $s$, the measure $\nu_1 \boxplus \nu_2$ is absolutely continuous in a neighborhood of $x$ with the density $f_{\nu_1 \boxplus \nu_2}(x) = \pi^{-1} \Im(m_{\nu_1 \boxplus \nu_2}(x))$, and
$$|f_{\nu_1 \boxplus \nu_2}(x) - f_{\mu_1 \boxplus \mu_2}(x)| < c_\mu s. \qquad \square$$

Proof of Proposition 2.4.
Let $F(\omega) : \mathbb{C}^2 \to \mathbb{C}^2$ be defined by the formula
$$F : \begin{pmatrix} \omega_1 \\ \omega_2 \end{pmatrix} \to \begin{pmatrix} (z - \omega_1 - \omega_2)^{-1} - m_{\nu_2}(\omega_2) \\ (z - \omega_1 - \omega_2)^{-1} - m_{\nu_1}(\omega_1) \end{pmatrix}.$$
Let us use the norm $\|(x_1, x_2)\| = (|x_1|^2 + |x_2|^2)^{1/2}$. By Lemma 2.3, $\|F(\omega_{\mu,1}(z), \omega_{\mu,2}(z))\| \le c_\mu s$ for all $z = x + i\eta$ and $\eta \ge 0$. The derivative of $F$ with respect to $\omega$ is
$$F' = \begin{pmatrix} (z - \omega_1 - \omega_2)^{-2} & (z - \omega_1 - \omega_2)^{-2} - m'_{\nu_2}(\omega_2) \\ (z - \omega_1 - \omega_2)^{-2} - m'_{\nu_1}(\omega_1) & (z - \omega_1 - \omega_2)^{-2} \end{pmatrix}.$$
The determinant of this matrix is
$$[m'_{\nu_1}(\omega_1) + m'_{\nu_2}(\omega_2)](z - \omega_1 - \omega_2)^{-2} - m'_{\nu_1}(\omega_1) m'_{\nu_2}(\omega_2).$$
By the assumption of smoothness and by Lemma 2.2, this is close (i.e., the difference is less than $cs$ for some $c > 0$) to
$$[m'_{\mu_1}(\omega_1) + m'_{\mu_2}(\omega_2)](z - \omega_1 - \omega_2)^{-2} - m'_{\mu_1}(\omega_1) m'_{\mu_2}(\omega_2)$$
at $(\omega_1, \omega_2) = (\omega_{\mu,1}(z), \omega_{\mu,2}(z))$ for all $z = x + i\eta$ with $\eta \ge 0$. The latter expression is nonzero by (3). In addition, the assumption of smoothness shows that $(z - \omega_{\mu,1}(z) - \omega_{\mu,2}(z))^{-1}$ is bounded for $z = x + i\eta$ with $\eta \ge 0$. Hence, the entries of the matrix $[F']^{-1}$ are bounded at $(\omega_{\mu,1}(z), \omega_{\mu,2}(z))$, and the bound does not depend on $\eta$. This shows that the operator norm of $[F']^{-1}$ is bounded uniformly in $\eta$.

Similarly, an explicit calculation of $F''$, the assumption of smoothness of $(\mu_1, \mu_2)$ and Lemma 2.2 imply that for all $z = x + i\eta$ with $\eta \ge 0$, the operator norm of $F''$ is bounded (uniformly in $\eta$) for all $(\omega_1, \omega_2)$ in a neighborhood of $(\omega_{\mu,1}(z), \omega_{\mu,2}(z))$.

It follows by the Newton–Kantorovich theorem [13] that if $s = \max_j d_L(\mu_j, \nu_j)$ is sufficiently small, then the solution of the equation $F(\omega) = 0$ exists for all $z$ with $\Re z = x$ and $\Im z \ge 0$. This solution coincides with $(\omega_{\nu,1}(z), \omega_{\nu,2}(z))$ by the following argument from [2].

A solution of equation $F(\omega) = 0$ satisfies the following pair of equations:
$$\omega_1 = z + h_2(\omega_2), \qquad \omega_2 = z + h_1(\omega_1),$$
where
$$h_j(\omega) = -\omega - \frac{1}{m_{\nu_j}(\omega)}.$$
Note in particular that $\Im h_j(\omega) \ge 0$ for $\omega \in \mathbb{C}^+$; see, for example, [4] or [15].

Hence, $\omega_1$ is a fixed point of the function
$$f_z(\omega) = z + h_2(z + h_1(\omega)),$$
which maps $\mathbb{C}^+$ to $\mathbb{C}^+$. For every $z \in \mathbb{C}^+$, the function $f_z(\omega)$ is not a conformal automorphism because it maps $\mathbb{C}^+$ to a subset of $\mathbb{C}^+ + i\,\Im z$, which is a proper subset of $\mathbb{C}^+$. In addition, it is analytic as a function of $z$ and $\omega$ that maps $\mathbb{C}^+ \times \mathbb{C}^+$ to $\mathbb{C}^+$. Hence, by Theorem 2.4 in [2], for every $z \in \mathbb{C}^+$ the function $f_z(\omega)$ has a unique fixed point $\omega_1(z)$.

A similar argument holds for $\omega_2(z)$, and we conclude that equation $F(\omega) = 0$ has a unique solution in $\mathbb{C}^+ \times \mathbb{C}^+$, which necessarily coincides with $(\omega_{\nu,1}(z), \omega_{\nu,2}(z))$.

In addition, this solution satisfies the inequalities
$$|\omega_{\nu,j}(z) - \omega_{\mu,j}(z)| < c_\mu s, \qquad j = 1, 2, \tag{10}$$
for all $z$ with $\Re z = x$ and $\Im z > 0$. The limits
$$\omega_{\nu,j}(x) := \lim_{\eta \downarrow 0} \omega_{\nu,j}(x + i\eta) \qquad \text{and} \qquad \omega_{\mu,j}(x) := \lim_{\eta \downarrow 0} \omega_{\mu,j}(x + i\eta)$$
exist, and by taking the limits in (10), we find that $|\omega_{\nu,j}(x) - \omega_{\mu,j}(x)| \le c s$. □
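The fixed-point characterization used in the proof above also suggests a simple numerical scheme: iterate $\omega \mapsto f_z(\omega) = z + h_2(z + h_1(\omega))$ for $z$ slightly above the real axis. The following is a minimal sketch under stated assumptions, not the paper's method of proof. It assumes plain iteration converges for the chosen measures (it does for the example below), and it uses two semicircle laws of variance $1/2$, whose free convolution is known to be the standard semicircle law. The function names and the default parameters `n_iter` and `eta` are illustrative.

```python
import cmath
import math


def stieltjes_semicircle(z, var):
    """m(z) = int mu(dx) / (x - z) for the semicircle law with variance `var`, z in C+.

    sqrt(z - r) * sqrt(z + r) selects the branch of sqrt(z^2 - r^2) that behaves
    like z at infinity, so that m(z) ~ -1/z and Im m(z) > 0 on C+.
    """
    r = 2.0 * math.sqrt(var)
    root = cmath.sqrt(z - r) * cmath.sqrt(z + r)
    return (-z + root) / (2.0 * var)


def h(omega, m):
    """h(omega) = -omega - 1 / m(omega), as in the pair of equations for omega_1, omega_2."""
    return -omega - 1.0 / m(omega)


def subordination_functions(z, m1, m2, n_iter=200):
    """Approximate (omega_1(z), omega_2(z)) by iterating f_z(w) = z + h_2(z + h_1(w))."""
    omega1 = z
    for _ in range(n_iter):
        omega1 = z + h(z + h(omega1, m1), m2)
    omega2 = z + h(omega1, m1)
    return omega1, omega2


def free_convolution_density(x, m1, m2, eta=1e-6):
    """Approximate density of mu_1 [+] mu_2 at x via m(z) = 1 / (z - omega_1 - omega_2)."""
    z = complex(x, eta)
    omega1, omega2 = subordination_functions(z, m1, m2)
    return (1.0 / (z - omega1 - omega2)).imag / math.pi


if __name__ == "__main__":
    def m_half(w):
        return stieltjes_semicircle(w, 0.5)

    print(free_convolution_density(0.5, m_half, m_half))
```

In this example both subordination functions have positive imaginary part on the interior of the support, which is condition (i) of Definition 1.1; near points where that condition fails, the plain iteration can converge very slowly.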
3. Proofs of Propositions 1.4 and 1.5.
Recall that a function $f(x)$ is said to be Hölder continuous at $x_0$ if there exist positive constants $\alpha$, $C$ and $\varepsilon$ such that $|x - x_0| < \varepsilon$ implies that $|f(x) - f(x_0)| < C|x - x_0|^\alpha$.

Lemma 3.1. Suppose that a probability measure $\mu$ has a density which is positive and Hölder continuous at $x$. Let $m_\mu(z)$ be the Stieltjes transform of $\mu$. Then $|m_\mu(x + i\eta)| \le M < \infty$ for all $\eta > 0$.

Proof. The results of Sokhotskyi, Plemelj and Privalov ensure that the limit of $m_\mu(x + i\eta)$ exists when $\eta \downarrow 0$; see Theorems 14.1b and 14.1c in [12]. In particular this implies that $m_\mu(x + i\eta)$ is bounded for sufficiently small $\eta$. In addition, $|m_\mu(x + i\eta)| \le 1/\eta$ so it is bounded for large $\eta$. Since $m_\mu(x + i\eta)$ is continuous in the upper half-plane, $m_\mu(x + i\eta)$ is bounded for all $\eta$, and the claim of the lemma follows. □

Proof of Proposition 1.4. Note that for the case $\mu_1 = \mu_2 = \mu$,
$$\omega_1(z) = \omega_2(z) = (z - m_{\mu \boxplus \mu}(z)^{-1})/2. \tag{11}$$
Since by assumption $\mu \boxplus \mu$ is absolutely continuous in a neighborhood of $x$, and its density $f_{\mu \boxplus \mu}$ is positive at $x$, by the results in [1] $f_{\mu \boxplus \mu}$ is analytic and therefore uniformly Hölder continuous in a neighborhood of $x$. By Sokhotskyi, Plemelj and Privalov's results, the limit $m_{\mu \boxplus \mu}(x) = \lim_{\eta \downarrow 0} m_{\mu \boxplus \mu}(x + i\eta)$ exists and $\Im m_{\mu \boxplus \mu}(x) = \pi f_{\mu \boxplus \mu}(x) > 0$. Then it follows from (11) that the limits $\omega_j(x) = \lim_{\eta \downarrow 0} \omega_j(x + i\eta)$ exist. Moreover,
$$\Im \omega_j(z) = \frac{1}{2}\left(\eta + \frac{\Im m_{\mu \boxplus \mu}(z)}{|m_{\mu \boxplus \mu}(z)|^2}\right),$$
and by Lemma 3.1, $|m_{\mu \boxplus \mu}(z)|$ is bounded uniformly in $\eta$; hence the fact that $\Im m_{\mu \boxplus \mu}(x) = \pi f_{\mu \boxplus \mu}(x) > 0$ implies that $\Im \omega_j(x) > 0$. This completes the proof of the proposition. □
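The formula for $\Im \omega_j(z)$ used in the proof of Proposition 1.4 follows from (11) in one line. The short derivation below spells it out, writing $m$ for $m_{\mu \boxplus \mu}(z)$ and $z = x + i\eta$; the explicit lower bound in the last line is an added remark that combines the formula with the constant $M$ of Lemma 3.1.

```latex
\begin{align*}
-\frac{1}{m} &= -\frac{\overline{m}}{|m|^{2}},
\qquad\text{so}\qquad
\Im\Bigl(-\frac{1}{m}\Bigr) = \frac{\Im m}{|m|^{2}},\\[4pt]
\Im \omega_j(z) &= \frac{1}{2}\,\Im\Bigl(z - \frac{1}{m}\Bigr)
 = \frac{1}{2}\Bigl(\eta + \frac{\Im m}{|m|^{2}}\Bigr),\\[4pt]
\Im \omega_j(x) &= \frac{\Im m_{\mu\boxplus\mu}(x)}{2\,|m_{\mu\boxplus\mu}(x)|^{2}}
 \;\ge\; \frac{\pi f_{\mu\boxplus\mu}(x)}{2M^{2}} \;>\; 0 .
\end{align*}
```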
Lemma 3.2. If $\mu_1$ has the semicircle distribution, then:
(i) $\omega_1(z) = z - \omega_2(z) + [z - \omega_2(z)]^{-1}$;
(ii) $m_{\mu_{sc} \boxplus \mu_2}(z) = \omega_2(z) - z$;
(iii) $\omega_2(z)$ satisfies the equation
$$\omega_2(z) = z + \int \frac{\mu_2(dx)}{x - \omega_2(z)}.$$

Proof. (i) If $\mu_1$ has the semicircle distribution, then $m_{\mu_1}^{(-1)}(z) = -(z + z^{-1})$; hence the first equation in system (2) implies
$$\omega_1 = -\left(\frac{1}{z - \omega_1 - \omega_2} + z - \omega_1 - \omega_2\right),$$
which simplifies to
$$\omega_1 = z - \omega_2 + \frac{1}{z - \omega_2}.$$
(ii) By using (i),
$$m_{\mu_{sc} \boxplus \mu_2} = \frac{1}{z - \omega_1 - \omega_2} = -(z - \omega_2).$$
(iii) The second equation in system (2) becomes
$$-(z - \omega_2(z)) = \int \frac{\mu_2(dx)}{x - \omega_2(z)}. \qquad \square$$

Proof of Proposition 1.5. From (ii) in Lemma 3.2 (with $\mu_1 = \mu_{sc}$ and $\mu_2 = \mu$),
$$\Im \omega_2(x) = \Im m_{\mu_{sc} \boxplus \mu}(x) = \pi f_{\mu_{sc} \boxplus \mu}(x) > 0.$$
From (i),
$$\Im \omega_1(x) = \Im \omega_2(x) \left(\frac{1}{|x - \omega_2|^2} - 1\right) = \Im \omega_2(x) \left(\frac{1}{|m_{\mu_{sc} \boxplus \mu}(x)|^2} - 1\right).$$
Since $\Im \omega_2(x) > 0$, if $|m_{\mu_{sc} \boxplus \mu}(x)| < 1$, then $\Im \omega_1(x) > 0$, and we are done. Two remaining possibilities are $|m_{\mu_{sc} \boxplus \mu}(x)| = 1$ and $|m_{\mu_{sc} \boxplus \mu}(x)| > 1$. However, $|m_{\mu_{sc} \boxplus \mu}(x)| > 1$ implies that $\Im \omega_1(x) < 0$, which is ruled out by a general result of Biane. To sum up, the assumptions $f_{\mu_{sc} \boxplus \mu}(x) > 0$ and $|m_{\mu_{sc} \boxplus \mu}(x)| \ne 1$ imply that $\Im \omega_j(x) > 0$ for $j = 1, 2$. □
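As a sanity check of Lemma 3.2, and an illustration that is not part of the original argument, one can take $\mu_1 = \mu_{sc}$ and $\mu_2 = \delta_0$, for which $\mu_{sc} \boxplus \delta_0 = \mu_{sc}$ and $m_{\delta_0}(\omega) = -1/\omega$. Writing $m$ for $m_{\mu_{sc} \boxplus \delta_0}(z)$, the three statements of the lemma give:

```latex
\begin{align*}
\text{(iii)}:\quad & \omega_2 = z + m_{\delta_0}(\omega_2) = z - \frac{1}{\omega_2},\\[2pt]
\text{(ii)}:\quad & m = \omega_2 - z = -\frac{1}{\omega_2}
   \;\Longrightarrow\; m = -\frac{1}{z + m}
   \;\Longleftrightarrow\; m^{2} + z m + 1 = 0,\\[2pt]
\text{(i)}:\quad & \omega_1 = (z - \omega_2) + \frac{1}{z - \omega_2}
   = -m - \frac{1}{m} = z .
\end{align*}
```

The quadratic equation is the one satisfied by the Stieltjes transform of the standard semicircle law, whose root in $\mathbb{C}^+$ is $m(z) = (-z + \sqrt{z^2 - 4})/2$. For $x \in (-2, 2)$ this gives $|m(x)|^2 = (x^2 + 4 - x^2)/4 = 1$ and $\Im \omega_1(x) = 0$, so this pair falls exactly into the case $|m_{\mu_{sc} \boxplus \mu}(x)| = 1$ excluded in Proposition 1.5, in agreement with the point-mass example of the Introduction.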
4. Applications.
In the first application we re-prove an easier part of the free local limit theorem which was first demonstrated in [5] for bounded random variables and later generalized in [26] to the case of unbounded variables with finite variance. We will show the convergence of densities, but we will not investigate whether the convergence is uniform on $\mathbb{R}$.

Let $X_i$ be a sequence of self-adjoint identically-distributed free random variables in the sense of free probability theory. Define $S_n = (X_1 + \cdots + X_n)/\sqrt{n}$, and let $\mu$ and $\mu_n$ denote the spectral probability measures of $X_i$ and $S_n$, respectively. It is known that
$$\mu_n(dx) = \underbrace{\mu \boxplus \cdots \boxplus \mu}_{n\ \text{times}}(\sqrt{n}\,dx).$$

Theorem 4.1. Suppose $\mu$ has zero mean and unit variance. Let $I_\varepsilon = [-2 + \varepsilon, 2 - \varepsilon]$. Then for all sufficiently large $n$, $\mu_n$ is (Lebesgue) absolutely continuous everywhere on $I_\varepsilon$, and the density $d\mu_n/dx$ uniformly converges on $I_\varepsilon$ to the density of the standard semicircle law.

Note that the results in [5] imply that for every closed interval $J$ outside of $[-2, 2]$, $\mu_n(J) = 0$ for all sufficiently large $n$, provided that $\mu$ has bounded support. In addition, the uniform convergence on $I_\varepsilon$ can be strengthened to the uniform convergence on $\mathbb{R}$ as in the proof of Theorem 3.4(iii) in [26].

Proof of Theorem 4.1. Let $\nu_{1,n}$ be the distribution of $(X_1 + \cdots + X_{[n/2]})/\sqrt{n}$ and $\nu_{2,n}$ be the distribution of $(X_{[n/2]+1} + \cdots + X_n)/\sqrt{n}$. By using the free CLT (central limit theorem) from [15] (which generalizes the free CLT in [20]), we infer that both $\nu_{1,n}$ and $\nu_{2,n}$ converge weakly to $\tilde{\mu}_{sc}$, where $\tilde{\mu}_{sc}$ is the semicircle law with variance $1/2$. It is easy to calculate for the pair $(\tilde{\mu}_{sc}, \tilde{\mu}_{sc})$ that
$$\omega_{\tilde{\mu},1} = \omega_{\tilde{\mu},2} = \frac{3z + \sqrt{z^2 - 4}}{4},$$
and therefore $\Im \omega_{\tilde{\mu},j}(x) > 0$ for $x \in I_\varepsilon$. (This also follows from Proposition 1.4.) A calculation shows that the genericity condition (3) is satisfied for each $x \in I_\varepsilon$, and therefore the density of $\nu_{1,n} \boxplus \nu_{2,n}$ exists for all sufficiently large $n$, and converges to the density of $\tilde{\mu}_{sc} \boxplus \tilde{\mu}_{sc}$ at each $x \in I_\varepsilon$. A remark after Theorem 1.3 shows that the convergence is in fact uniform on $I_\varepsilon$. Since $\nu_{1,n} \boxplus \nu_{2,n} = \mu_n$, this implies that the density of $\mu_n$ converges uniformly on $I_\varepsilon$ to the density of the standard semicircle law. □

In a similar fashion, it is possible to prove the local limit law for the convergence to the free Poisson distribution.
Let $\{X_{n,i}\}_{i=1}^n$ be freely independent self-adjoint random variables with the distribution $\mu_{n,i} = p_{n,i}\delta_1 + (1 - p_{n,i})\delta_0$. Let $S_n = X_{n,1} + \cdots + X_{n,n}$, and let $\mu_n$ denote the spectral probability measure of $S_n$. Then
$$\mu_n(dx) = \mu_{n,1} \boxplus \cdots \boxplus \mu_{n,n}(dx).$$

Recall that the Marchenko–Pastur distribution with parameter $\lambda \ge 1$ is a probability distribution $\mu_{mp}$ on $\mathbb{R}$, with the density
$$f_{mp}(x) = \frac{\sqrt{4x - (1 - \lambda + x)^2}}{2\pi x}$$
supported on the interval $[x_{\min}, x_{\max}] := [(1 - \sqrt{\lambda})^2, (1 + \sqrt{\lambda})^2]$. In the free probability literature, this distribution is called the free Poisson distribution.

Theorem 4.2. Assume that $\sum_{i=1}^n p_{n,i} \to \lambda > 1$ and $\max_i p_{n,i} \to 0$ as $n \to \infty$. Let $I_\varepsilon = [x_{\min} + \varepsilon, x_{\max} - \varepsilon]$. Then for all sufficiently large $n$, $\mu_n$ is (Lebesgue) absolutely continuous everywhere on $I_\varepsilon$, and the density $d\mu_n/dx$ uniformly converges on $I_\varepsilon$ to the density of the Marchenko–Pastur law with parameter $\lambda$.

The proof of this theorem is similar to the proof of the previous one. The first step is the weak convergence of $\mu_n$. In the case when $p_{n,i} = \lambda/n$ for all $i$, a proof of weak convergence can be found on page 34 in [25]. The general case is a minor adaptation of this case, and we omit it. Next, we choose $k_n$ so that
$$\sum_{i=1}^{k_n} p_{n,i} \le \lambda/2 < \sum_{i=1}^{k_n + 1} p_{n,i}$$
and define $\nu_{1,n}$ and $\nu_{2,n}$ as the spectral probability measures of $X_{n,1} + \cdots + X_{n,k_n}$ and $X_{n,k_n+1} + \cdots + X_{n,n}$, respectively. It is easy to see that both $\nu_{1,n}$ and $\nu_{2,n}$ converge weakly to $\tilde{\mu}_{mp}$, the Marchenko–Pastur distribution with parameter $\lambda/2$. By using Proposition 1.4, we conclude that $\Im \omega_{\tilde{\mu}_{mp},j}(x) > 0$ for $x \in I_\varepsilon$. Moreover, a direct calculation shows that
$$\omega_{\tilde{\mu},1}(z) = \omega_{\tilde{\mu},2}(z) = \frac{1}{4}\left(3z + 1 - \lambda - \sqrt{(z - (1 + \lambda))^2 - 4\lambda}\right)$$
and
$$m'_{\tilde{\mu}_{mp}}(z) = \frac{1 - \lambda/2}{2z^2} + \frac{-z(1 + \lambda/2) + (1 - \lambda/2)^2}{2z^2\sqrt{(z - (1 + \lambda/2))^2 - 2\lambda}}.$$
After some calculations the genericity condition (3) can be simplified to the following inequality:

f(x, λ) := x − (5 + λ) x + (7 + λ + 2λ) x − (3 − λ + λ + λ) ≠ 0. (12)
Fig. 1. Contour plot of the right-hand side of (12).

Fig. 2. The zero set of the right-hand side of (12) compared with the support bounds for $x(\lambda)$.

Figure 1 shows the contour plot of $f(x, \lambda)$. It can be seen from this plot and can be checked formally that for $\lambda > 1$, there is only one $x = x(\lambda)$ that violates (12). Figure 2 shows the zero set of $f(x, \lambda)$ for $\lambda > 1$, compared with the bounds on the support of the Marchenko–Pastur distribution. It can be seen from this graph and can be checked formally that $x(\lambda) < x_{\min}(\lambda) = (1 - \sqrt{\lambda})^2$. Consequently, if $x$ is in the support of $\tilde{\mu}_{mp} \boxplus \tilde{\mu}_{mp}$, the genericity condition (3) holds, and the pair $(\tilde{\mu}_{mp}, \tilde{\mu}_{mp})$ is smooth at $x$. Hence, Theorem 1.3 applies, and the density of $\mu_n = \nu_{1,n} \boxplus \nu_{2,n}$ converges uniformly on $I_\varepsilon$ to the density of $\tilde{\mu}_{mp} \boxplus \tilde{\mu}_{mp}$, that is, to the density of the Marchenko–Pastur distribution with parameter $\lambda$.

Similar results can be established for other limit theorems, except that it is more difficult to check the genericity condition (3) for a point in the support of the limit distribution. Here is one more theorem of this type.

Let measures $\mu$ and $\nu$ be called equivalent ($\mu \sim \nu$) if there exist real $a$ and $b$, with $b > 0$, such that for every Borel set $S \subset \mathbb{R}$, $\mu(S) = \nu(bS + a)$. Recall that a measure $\mu$ is called $\boxplus$-stable if $\mu \boxplus \mu \sim \mu$. The measure $\nu$ belongs to the domain of attraction of a $\boxplus$-stable law $\mu$ if there exist measures $\nu_n$ equivalent to $\nu$ such that
$$\underbrace{\nu_n \boxplus \nu_n \boxplus \cdots \boxplus \nu_n}_{n\ \text{times}} \to \mu.$$
Clearly, in this case there exists a sequence of real constants $b_n > 0$ and $a_n$ such that
$$\mu_n := \underbrace{\nu \boxplus \nu \boxplus \cdots \boxplus \nu}_{n\ \text{times}}(b_n\,\cdot + a_n) \to \mu. \tag{13}$$
(More about the $\boxplus$-stability of probability measures and its relation to the classical stability of probability measures can be found in [3].)

Theorem 4.3.
Suppose that a $\boxplus$-stable distribution $\mu$ is not equivalent to $\delta_0$ and that $\nu$ belongs to the domain of attraction of $\mu$. Let $a_n$, $b_n$ and $\mu_n$ be defined as in (13), and let $J$ be a bounded closed interval such that the density of $\mu$ is strictly positive on $J$. Then $\mu_n$ is (Lebesgue) absolutely continuous on $J$ for all sufficiently large $n$, and there exist real $\kappa_n > 0$ and $\xi_n$ such that the density of $\mu_n(\kappa_n \cdot + \xi_n)$ converges to the density of $\mu$ at (Lebesgue) almost all $E \in J$.

Proof.
Let $J \subset I$, where the inclusion is strict, and $I$ is a bounded, closed interval such that the density of $\mu$ is strictly positive on $I$. (The interval $I$ exists because, by the results of Biane in [3], $\mu$ is absolutely continuous with analytic density.)

First, note that $\mu_n$ is (Lebesgue) absolutely continuous on $\mathbb{R}$ for all sufficiently large $n$. Indeed, for even $n = 2k$, the definition of $\mu_n$ implies that $\mu_{2k} = \mu_k \boxplus \mu_k(s_k^{-1} \cdot - t_k)$ for some constants $t_k$ and $s_k > 0$. For large $k$, $\mu_k$ is close in the Lévy metric to $\mu$, which is known to be absolutely continuous. Hence, $\mu_k$ has no atoms with weight $\geq 1/2$. This implies that $\mu_{2k}$ has no atoms at all. In addition, by results of Belinschi, $\mu_{2k}$ has no singular component. Therefore, $\mu_{2k}$ is absolutely continuous on $\mathbb{R}$ if $k$ is sufficiently large. The argument for odd $n = 2k + 1$ is similar if we write $\mu_{2k+1} = \mu_{k+1} \boxplus \mu_k(s_k \cdot + t_k)$.

In the second and final step, we note that there exist sequences of constants $\kappa_n > 0$ and $\xi_n$ such that the density of $\mu_n(\kappa_n \cdot + \xi_n)$ converges to the density of $\mu$ at (Lebesgue) almost all $x \in I$. Indeed, by the stability of $\mu$, $\mu \boxplus \mu = \mu(s \cdot + t)$ and $\mu$ has positive analytic density on $I$; therefore, by Proposition 1.4, $\Im \omega_{\mu,j}(x) > 0$ for $x \in (I - t)/s$. For almost all points $x$, the genericity condition (3) holds, since otherwise $k_\mu(x)$ (in the genericity condition) would be exactly 0, which is not possible. For even $n = 2k$, we have $\mu_k \boxplus \mu_k = \mu_{2k}(s_k \cdot + t_k)$, where $s_k > 0$ and $t_k$ are certain real constants. Hence, by Theorem 1.3 the weak convergence $\mu_k \to \mu$ implies that the density of $\mu_k \boxplus \mu_k \equiv \mu_{2k}(s_k \cdot + t_k)$ converges to the density of $\mu \boxplus \mu \equiv \mu(s \cdot + t)$ at almost all points of $(I - t)/s$. It follows that for $\kappa_{2k} = s/s_k > 0$ and $\xi_{2k} = t - (s/s_k) t_k$, the density of $\mu_{2k}(\kappa_{2k} \cdot + \xi_{2k})$ converges to the density of $\mu$ at almost all points of $I$. The case of $\mu_{2k+1}$ can be handled similarly by considering $\mu_k \boxplus \mu_{k+1}$. □

Our next application is of a different kind and answers a question that arises in the theory of large random matrices.

Let $H_N = A_N + U_N B_N U_N^*$, where $A_N$ and $B_N$ are $N$-by-$N$ Hermitian matrices, and $U_N$ is a random unitary matrix with the Haar distribution on the unitary group $U(N)$.

Let $\lambda_1^{(A)} \geq \cdots \geq \lambda_N^{(A)}$ be the eigenvalues of $A_N$. Similarly, let $\lambda_k^{(B)}$ and $\lambda_k^{(H)}$ be the ordered eigenvalues of the matrices $B_N$ and $H_N$, respectively. Define the spectral point measure of $A_N$ by $\mu_{A_N} := N^{-1} \sum_{k=1}^{N} \delta_{\lambda_k^{(A)}}$, and define the spectral point measures of $B_N$ and $H_N$ similarly.
Let $\mathcal{N}_I := N \mu_{H_N}(I)$ denote the number of eigenvalues of $H_N$ in the interval $I$, and let $\mathcal{N}_\eta(x) := \mathcal{N}_{(x - \eta, x + \eta]}$.

Let the notation $g_1(N) \ll g_2(N)$ mean that $\lim_{N \to \infty} g_2(N)/g_1(N) = +\infty$.

Theorem 4.4. Assume that:
(1) $\mu_{A_N} \to \mu_\alpha$ and $\mu_{B_N} \to \mu_\beta$;
(2) $\operatorname{supp}(\mu_{A_N}) \cup \operatorname{supp}(\mu_{B_N}) \subseteq [-K, K]$ for all $N$;
(3) the pair $(\mu_\alpha, \mu_\beta)$ is smooth at $x$;
(4) $1/\sqrt{\log N} \ll \eta(N) \ll 1$.
Then
\[ \frac{\mathcal{N}_\eta(x)}{2 \eta N} \to f_{\mu_\alpha \boxplus \mu_\beta}(x) \]
with probability 1, where $f_{\mu_\alpha \boxplus \mu_\beta}$ denotes the density of $\mu_\alpha \boxplus \mu_\beta$.

Previously, it was shown by Pastur and Vasilchuk in [17] that assumption (1) together with a weaker version of assumption (2) implies that $\mu_{H_N} \to \mu_\alpha \boxplus \mu_\beta$ with probability 1. Theorem 4.4 says that the convergence of $\mu_{H_N}$ to $\mu_\alpha \boxplus \mu_\beta$ holds on the level of densities, so it can be seen as a local limit law for the eigenvalues of the sum of random Hermitian matrices.

Proof of Theorem 4.4.
In Theorem 2 in [14], it was shown that the following claim holds. Suppose that $\eta = \eta(N)$ and $1/\sqrt{\log N} \ll \eta(N) \ll 1$. Assume that the measure $\mu_{A_N} \boxplus \mu_{B_N}$ is absolutely continuous, and its density is bounded by a constant $T_N$. Then, for all sufficiently large $N$,
\[ P\left\{ \sup_x \left| \frac{\mathcal{N}_\eta(x)}{2 N \eta} - f_{\boxplus,N}(x) \right| \geq \delta \right\} \leq \exp\left( -c \delta^2 \frac{(\eta N)^2}{(\log N)^2} \right), \tag{14} \]
where $c > 0$ depends on $K_N := \max\{\|A_N\|, \|B_N\|\}$ and $T_N$. Here $f_{\boxplus,N}$ denotes the density of $\mu_{A_N} \boxplus \mu_{B_N}$.

This statement can be modified so that the supremum in the inequality is taken over $x$ in an interval, provided that the density of $\mu_{A_N} \boxplus \mu_{B_N}$ is bounded by a constant $T_N$ in this interval:
\[ P\left\{ \sup_{x \in (a,b)} \left| \frac{\mathcal{N}_\eta(x)}{2 N \eta} - f_{\boxplus,N}(x) \right| \geq \delta \right\} \leq \exp\left( -c \delta^2 \frac{(\eta N)^2}{(\log N)^2} \right). \tag{15} \]

Since assumptions (1) and (3) hold, we can use Theorem 1.3 and infer that $f_{\boxplus,N}(x) \to f_{\mu_\alpha \boxplus \mu_\beta}(x)$. In particular, the sequence of densities $f_{\boxplus,N}(x)$ is uniformly bounded by a constant $T$. This fact and assumption (2) imply that the positive constant $c$ in (14) can be chosen independently of $N$. By using the Borel–Cantelli lemma, we can conclude that $\mathcal{N}_\eta(x)/(2 N \eta) \to f_{\mu_\alpha \boxplus \mu_\beta}(x)$ with probability 1. □
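The counting statistic in Theorem 4.4 can be illustrated with a small sketch. It is not a simulation of $H_N$: no Haar unitary is sampled. Instead, as a labeled assumption, the eigenvalues are replaced by a hypothetical stand-in spectrum, the midpoint quantiles of the arcsine law on $[-2, 2]$, which is the free convolution of two symmetric Bernoulli measures $(\delta_{-1} + \delta_1)/2$. The function names and the window choice $\eta = (\log N)^{-1/4}$ are illustrative, and the smoothness assumption (3) is not checked here.

```python
import math
from bisect import bisect_right


def local_density(eigenvalues, x, eta):
    """Return N_eta(x) / (2 * eta * N), the normalized count in (x - eta, x + eta]."""
    ordered = sorted(eigenvalues)
    # bisect_right(a, t) is the number of entries <= t, so the difference
    # counts entries in the half-open window (x - eta, x + eta].
    count = bisect_right(ordered, x + eta) - bisect_right(ordered, x - eta)
    return count / (2.0 * eta * len(ordered))


def arcsine_quantiles(n):
    """Hypothetical stand-in spectrum: n midpoint quantiles of the arcsine law on [-2, 2].

    The arcsine law has distribution function 1/2 + arcsin(x / 2) / pi,
    so its quantile function is u -> 2 * sin(pi * (u - 1/2)).
    """
    return [2.0 * math.sin(math.pi * ((k - 0.5) / n - 0.5)) for k in range(1, n + 1)]


def arcsine_density(x):
    """Density 1 / (pi * sqrt(4 - x^2)) of the arcsine law on (-2, 2)."""
    return 1.0 / (math.pi * math.sqrt(4.0 - x * x))


if __name__ == "__main__":
    x = 0.0
    for n in (10**3, 10**4, 10**5):
        # Illustrative window satisfying 1/sqrt(log N) << eta << 1.
        eta = math.log(n) ** -0.25
        estimate = local_density(arcsine_quantiles(n), x, eta)
        print(n, round(eta, 3), estimate, arcsine_density(x))
```

Because $\eta$ shrinks only logarithmically, the printed estimates approach the limiting density slowly; the sketch shows only how the window count is normalized, not the rate in (14).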
5. Conclusion.
We have proved that if probability measures $\nu_1$ and $\nu_2$ are sufficiently close to probability measures $\mu_1$ and $\mu_2$ in the Lévy distance, and if $\mu_1 \boxplus \mu_2$ is sufficiently smooth at $x$, then $\nu_1 \boxplus \nu_2$ is absolutely continuous at $x$, and its density is close to the density of $\mu_1 \boxplus \mu_2$.

We have applied this result to derive several local limit law results for sums of free random variables and for eigenvalues of a sum of random Hermitian matrices.

Acknowledgment. I would like to thank Diana Bloom for her editorial help and an anonymous referee for helpful suggestions.
REFERENCES

[1] Belinschi, S. T. (2008). The Lebesgue decomposition of the free additive convolution of two probability distributions. Probab. Theory Related Fields.
[2] Belinschi, S. T. and Bercovici, H. (2007). A new approach to subordination results in free probability. J. Anal. Math.
[3] Bercovici, H. and Pata, V. (1999). Stable laws and domains of attraction in free probability theory. Ann. of Math. (2).
[4] Bercovici, H. and Voiculescu, D. (1993). Free convolution of measures with unbounded support. Indiana Univ. Math. J.
[5] Bercovici, H. and Voiculescu, D. (1995). Superconvergence to the central limit and failure of the Cramér theorem for free random variables. Probab. Theory Related Fields.
[6] Bercovici, H. and Voiculescu, D. (1998). Regularity questions for free convolution. In Nonselfadjoint Operator Algebras, Operator Theory, and Related Topics (H. Bercovici and C. Foias, eds.). Operator Theory: Advances and Applications.
[7] Biane, P. (1998). Processes with free increments. Math. Z.
[8] Biane, P. (1998). Representations of symmetric groups and free probability. Adv. Math.
[9] Feinberg, J. and Zee, A. (1997). Non-Gaussian non-Hermitian random matrix theory: Phase transition and addition formalism. Nuclear Phys. B.
[10] Guionnet, A., Krishnapur, M. and Zeitouni, O. (2011). The single ring theorem. Ann. of Math. (2).
[11] Haagerup, U. and Thorbjørnsen, S. (2005). A new application of random matrices: Ext($C^*_{\mathrm{red}}(F_2)$) is not a group. Ann. of Math. (2).
[12] Henrici, P. (1986). Applied and Computational Complex Analysis. Vol. 3. Wiley, New York. MR0822470
[13] Kantorovič, L. V. (1948). Functional analysis and applied mathematics. Uspekhi Matem. Nauk (N.S.).
[14] Kargin, V. (2012). A concentration inequality and a local law for the sum of two random matrices. Probab. Theory Related Fields. To appear. Available at http://arxiv.org/abs/1010.0353.
[15] Maassen, H. (1992). Addition of freely independent random variables. J. Funct. Anal.
[16] Nica, A. and Speicher, R. (2006). Lectures on the Combinatorics of Free Probability. London Mathematical Society Lecture Note Series. Cambridge Univ. Press, Cambridge. MR2266879
[17] Pastur, L. and Vasilchuk, V. (2000). On the law of addition of random matrices. Comm. Math. Phys.
[18] Shiryaev, A. N. (1996). Probability, 2nd ed. Graduate Texts in Mathematics. Springer, New York. MR1368405
[19] Speicher, R. (1993). Free convolution and the random sum of matrices. Publ. Res. Inst. Math. Sci.
[20] Voiculescu, D. (1985). Symmetries of some reduced free product $C^*$-algebras. In Operator Algebras and Their Connections with Topology and Ergodic Theory (Buşteni, 1983). Lecture Notes in Math.
[21] Voiculescu, D. (1986). Addition of certain noncommuting random variables. J. Funct. Anal.
[22] Voiculescu, D. (1991). Limit laws for random matrices and free products. Invent. Math.
[23] Voiculescu, D. (1993). The analogues of entropy and of Fisher's information measure in free probability theory. I. Comm. Math. Phys.
[24] Voiculescu, D. (1996). The analogues of entropy and of Fisher's information measure in free probability theory. III. The absence of Cartan subalgebras. Geom. Funct. Anal.
[25] Voiculescu, D. V., Dykema, K. J. and Nica, A. (1992). Free Random Variables: A Noncommutative Probability Approach to Free Products With Applications to Random Matrices, Operator Algebras and Harmonic Analysis on Free Groups. CRM Monograph Series. Amer. Math. Soc., Providence, RI. MR1217253
[26] Wang, J.-C. (2010). Local limit theorems in free probability theory. Ann. Probab.
[27] Zee, A. (1996). Law of addition in random matrix theory. Nuclear Phys. B.