Rate of convergence and Edgeworth-type expansion in the entropic central limit theorem
By Sergey G. Bobkov, Gennadiy P. Chistyakov and Friedrich Götze
University of Minnesota, University of Bielefeld and University of Bielefeld
An Edgeworth-type expansion is established for the entropy distance to the class of normal distributions of sums of i.i.d. random variables or vectors, satisfying minimal moment conditions.
1. Introduction.
Let $(X_n)_{n\ge 1}$ be independent, identically distributed random variables with mean $E X = 0$ and variance $\mathrm{Var}(X) = 1$. According to the central limit theorem, the normalized sums
\[
Z_n = \frac{X_1 + \cdots + X_n}{\sqrt{n}}
\]
are weakly convergent in distribution to the standard normal law, $Z_n \Rightarrow Z$, where $Z \sim N(0,1)$ with density $\varphi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$. A much stronger statement (when applicable), the entropic central limit theorem, states that, if for some $n_0$, or equivalently, for all $n \ge n_0$, the random variables $Z_n$ have absolutely continuous distributions with finite entropies $h(Z_n)$, then these entropies converge,
\[
h(Z_n) \to h(Z) \qquad \text{as } n \to \infty. \tag{1.1}
\]
This theorem is due to Barron [3]. Some weaker variants of the theorem in case of regularized distributions were known before; they go back to the work of Linnik [16], initiating an information-theoretic approach to the central limit theorem.

Received April 2011; revised May 2012. Supported in part by NSF Grant DMS-11-06530 and SFB 701.
AMS 2000 subject classifications.
Key words and phrases.
Entropy, entropic distance, central limit theorem, Edgeworth-type expansions.
This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Probability, 2013, Vol. 41, No. 4, 2479–2512. This reprint differs from the original in pagination and typographic detail.
To clarify in which sense (1.1) is strong, recall that, if a random variable $X$ with finite second moment has a density $p(x)$, its entropy
\[
h(X) = -\int_{-\infty}^{+\infty} p(x) \log p(x)\,dx
\]
is well defined and is bounded from above by the entropy of the normal random variable $Z$ having the same mean $a$ and the same variance $\sigma^2$ as $X$. Note that the value $h(X) = -\infty$ is possible. The relative entropy
\[
D(X) = D(X\,\|\,Z) = h(Z) - h(X) = \int_{-\infty}^{+\infty} p(x) \log \frac{p(x)}{\varphi_{a,\sigma}(x)}\,dx,
\]
where $\varphi_{a,\sigma}$ stands for the density of $Z$, is nonnegative and serves as kind of a distance to the class of normal laws, or to Gaussianity. This quantity does not depend on the mean or the variance of $X$, and can be related to the total variation distance between the distributions of $X$ and $Z$ by virtue of the Pinsker-type inequality $D(X) \ge \frac{1}{2}\,\|F_X - F_Z\|_{\mathrm{TV}}^2$. This already shows that the entropic convergence (1.1) is stronger than convergence in the total variation norm.

Thus, the entropic central limit theorem may be reformulated as $D(Z_n) \to 0$, as long as $D(Z_n) < +\infty$ for some $n$. This property itself gives rise to a number of intriguing questions, such as to the type and the rate of convergence. In particular, it has been proved only recently that the sequence $h(Z_n)$ is nondecreasing, so that $D(Z_n) \downarrow 0$; cf. [1, 17]. This leads to the question as to the precise rate of $D(Z_n)$ tending to zero; however, not much seems to be known about this problem. The best results in this direction are due to Artstein et al. [2] and to Barron and Johnson [15]. In the i.i.d. case as above, these authors have obtained an expected asymptotic bound $D(Z_n) = O(1/n)$ under the hypothesis that the distribution of $X$ admits an analytic inequality of Poincaré-type (in [15], a restricted Poincaré inequality is used). These inequalities involve a large variety of "nice" probability distributions which necessarily have a finite exponential moment.

The aim of this paper is to study the rate of $D(Z_n)$, using moment conditions $E|X|^s < +\infty$ with fixed values $s \ge 2$, which are comparable to those required for classical Edgeworth-type approximations in the Kolmogorov distance. The cumulants
\[
\gamma_r = i^{-r}\, \frac{d^r}{dt^r} \log E\, e^{itX} \Big|_{t=0}
\]
are then well defined for all $r \le [s]$ (the integer part of $s$), and one may introduce the functions
\[
q_k(x) = \varphi(x) \sum H_{k+2j}(x)\, \frac{1}{r_1!\cdots r_k!} \Big(\frac{\gamma_3}{3!}\Big)^{r_1} \cdots \Big(\frac{\gamma_{k+2}}{(k+2)!}\Big)^{r_k} \tag{1.2}
\]
involving the Chebyshev–Hermite polynomials $H_k$. The summation in (1.2) runs over all nonnegative integer solutions $(r_1, \ldots, r_k)$ to the equation $r_1 + 2r_2 + \cdots + k r_k = k$, and one uses the notation $j = r_1 + \cdots + r_k$. The functions $q_k$ are defined for $k = 1, \ldots, [s] - 2$. They appear in Edgeworth-type expansions including the local limit theorem, where the $q_k$ are used to construct the approximation of the densities of $Z_n$. These results can be applied to obtain an expansion in powers of $1/n$ for the distance $D(Z_n)$. For a multidimensional version of the following Theorem 1.1 for moments of integer order $s \ge 2$, see Theorem 6.1 below.
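For orientation, the first two functions in (1.2) can be written out explicitly (a routine computation from the definition, recorded here only for illustration):
\[
q_1(x) = \frac{\gamma_3}{3!}\, H_3(x)\, \varphi(x), \qquad
q_2(x) = \Big( \frac{\gamma_4}{4!}\, H_4(x) + \frac{\gamma_3^2}{2\,(3!)^2}\, H_6(x) \Big) \varphi(x),
\]
where $H_3(x) = x^3 - 3x$ and $H_4(x) = x^4 - 6x^2 + 3$. In particular, each $q_k$ is a polynomial multiple of $\varphi$ of degree at most $3k$.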
Theorem 1.1.
Let $E|X|^s < +\infty$ ($s \ge 2$), and assume $D(Z_n) < +\infty$ for some $n$. Then
\[
D(Z_n) = \frac{c_1}{n} + \frac{c_2}{n^2} + \cdots + \frac{c_{[(s-2)/2]}}{n^{[(s-2)/2]}} + o\big((n\log n)^{-(s-2)/2}\big). \tag{1.3}
\]
Here
\[
c_j = \sum_{k=2}^{2j} \frac{(-1)^k}{k(k-1)} \sum \int_{-\infty}^{+\infty} \frac{q_{r_1}(x) \cdots q_{r_k}(x)}{\varphi(x)^{k-1}}\,dx, \tag{1.4}
\]
where the summation runs over all positive integers $(r_1, \ldots, r_k)$ such that $r_1 + \cdots + r_k = 2j$.

Each coefficient $c_j$ in (1.3) represents a certain polynomial in the cumulants $\gamma_3, \ldots, \gamma_{2j+1}$. For example, $c_1 = \frac{\gamma_3^2}{12}$, and in the case $s = 4$, (1.3) gives
\[
D(Z_n) = \frac{(E X^3)^2}{12\, n} + o\Big(\frac{1}{n \log n}\Big) \qquad (E X^4 < +\infty). \tag{1.5}
\]
Thus, under the 4th moment condition, we have $D(Z_n) \le \frac{C}{n}$, where the constant depends on the underlying distribution. This has been conjectured by Johnson [14], page 49. Actually, the constant $C$ may be expressed in terms of $E X^4$ and $D(X)$, only.

When $s$ varies in the range $4 \le s < 6$, the leading linear term in (1.5) will be unchanged, while the remainder term improves and becomes $O(n^{-2})$ in case $E X^6 < +\infty$. But for $s = 6$, the result involves the subsequent coefficient $c_2$, which depends on $\gamma_3$, $\gamma_4$ and $\gamma_5$. In particular, if $\gamma_3 = 0$, we have $c_2 = \frac{\gamma_4^2}{48}$, thus
\[
D(Z_n) = \frac{(E X^4 - 3)^2}{48\, n^2} + o\Big(\frac{1}{(n \log n)^2}\Big) \qquad (E X^3 = 0,\ E X^6 < +\infty).
\]
More generally, representation (1.3) simplifies if the first $k-1$ moments of $X$ coincide with the corresponding moments of $Z \sim N(0,1)$.
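To illustrate how the coefficients arise from (1.4), consider $j = 1$ (a routine verification, added here only for the reader's convenience). The only admissible term has $k = 2$ and $r_1 = r_2 = 1$, so that, using $q_1 = \frac{\gamma_3}{6} H_3 \varphi$ and the orthogonality relation $\int_{-\infty}^{+\infty} H_3(x)^2 \varphi(x)\,dx = 3! = 6$,
\[
c_1 = \frac{1}{2} \int_{-\infty}^{+\infty} \frac{q_1(x)^2}{\varphi(x)}\,dx
    = \frac{1}{2} \Big(\frac{\gamma_3}{6}\Big)^2 \int_{-\infty}^{+\infty} H_3(x)^2 \varphi(x)\,dx
    = \frac{\gamma_3^2}{12},
\]
in accordance with the value quoted above (recall that $\gamma_3 = E X^3$ when $E X = 0$ and $\mathrm{Var}(X) = 1$).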
Corollary 1.2.
Let $E|X|^s < +\infty$ ($s \ge 2$), and assume that $D(Z_n) < +\infty$ for some $n$. Given $k = 3, 4, \ldots, [s]$, assume that $\gamma_j = 0$ for all $3 \le j < k$. Then
\[
D(Z_n) = \frac{\gamma_k^2}{2\,k!}\cdot\frac{1}{n^{k-2}} + O\Big(\frac{1}{n^{k-1}}\Big) + o\Big(\frac{1}{(n\log n)^{(s-2)/2}}\Big). \tag{1.6}
\]

Johnson had noticed (though in terms of the standardized Fisher information, see [14], Lemma 2.12) that if $\gamma_k \ne 0$, $D(Z_n)$ cannot be of smaller order than $n^{-(k-2)}$.

Note that when $E X^{2k} < +\infty$, the $o$-term may be removed in the representation (1.6). On the other hand, when $k > \frac{s+2}{2}$, the $o$-term will dominate the $n^{-(k-2)}$-term, and we can only conclude that $D(Z_n) = o((n\log n)^{-(s-2)/2})$.

As for the missing range $2 \le s < 4$, here there are no coefficients $c_j$ appearing in the sum (1.3), and Theorem 1.1 just tells us that
\[
D(Z_n) = o\Big(\frac{1}{(n\log n)^{(s-2)/2}}\Big). \tag{1.7}
\]
This bound is worse than the rate $1/n$. In particular, it only gives $D(Z_n) = o(1)$ for $s = 2$, which is the statement of Barron's theorem. In fact, in this case the entropic distance to normality may decay to zero at an arbitrarily slow rate. In case of a finite 3rd absolute moment, $D(Z_n) = o(1/\sqrt{n\log n})$. To see that this, and more generally relation (1.7), cannot be improved with respect to the powers of $1/n$, we prove:

Theorem 1.3. Let $\eta > 0$. Given $2 < s < 4$, there exists a sequence of independent, identically distributed random variables $(X_n)_{n\ge1}$ with $E|X|^s < +\infty$, such that $D(X) < +\infty$ and
\[
D(Z_n) \ge \frac{c}{(n\log n)^{(s-2)/2} (\log n)^{\eta}}, \qquad n \ge n_1(X),
\]
with a constant $c = c(\eta, s) > 0$, depending on $\eta$ and $s$ only.

Known bounds on the entropy are commonly based on de Bruijn's identity, which may be used to represent the entropic distance to normality as an integral of the Fisher information for regularized distributions; cf. [3]. However, it is not clear how to reach exact asymptotics with this approach. The proofs of Theorems 1.1 and 1.3 stated above rely upon classical tools and results in the theory of sums of independent summands, including Edgeworth-type expansions for convolutions of densities formulated as local limit theorems with nonuniform remainder bounds. For noninteger values of $s$, the authors had to complete the otherwise extensive literature by recent, technically rather involved results based on fractional differential calculus; see [6, 7]. Our approach applies to random variables in higher dimension as well, and to nonidentically distributed summands with uniformly bounded $s$th moments.

We start with the description of a truncation-of-density argument, which allows us to reduce many questions about bounding the entropic distance to the case of bounded densities (Section 2). In Section 3 we discuss known results about Edgeworth-type expansions that will be used in the proof of Theorem 1.1. The main steps of the proofs based on these results are carried out in Sections 4 and 5. All auxiliary results cover the scheme of i.i.d. random vectors in $\mathbb{R}^d$ as well (however, with integer values of $s$) and are finalized in Section 6 to obtain multidimensional variants of Theorem 1.1 and Corollary 1.2. Sections 7 and 8 are devoted to lower bounds on the entropic distance to normality for a special class of probability distributions on the real line that are used in the proof of Theorem 1.3.
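As an aside (this observation is not used in the sequel), the entropic bounds above immediately translate into total variation bounds via the Pinsker-type inequality quoted earlier. For example, under the assumptions of (1.5),
\[
\|F_n - \Phi\|_{\mathrm{TV}} \le \sqrt{2\,D(Z_n)} = O\big(n^{-1/2}\big),
\]
where $F_n$ and $\Phi$ denote the distribution functions of $Z_n$ and $Z$, respectively.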
2. Binomial decomposition of convolutions.
First let us comment on theassumptions in Theorem 1.1. It may happen that X has a singular distribu-tion, but the distribution of X + X and of all next sums S n = X + · · · + X n ( n ≥
2) are absolutely continuous; cf. [25].If it exists, the density p of X may or may not be bounded. In the firstcase, all the entropies h ( S n ) are finite. If p is unbounded, it may happenthat all h ( S n ) are infinite, even if p is compactly supported. But if h ( S n )is finite for some n = n then, for all n ≥ n , entropies are finite; see [3] forspecific examples.Denote by p n ( x ) the density of Z n = S n / √ n (when it exists). Since it isdesirable to work with bounded densities, we will slightly modify p n at theexpense of a small change in the entropy. Variants of the next constructionare well known; see, for example, [13, 23], where the central limit theoremwas studied with respect to the total variation distance. Without any extraefforts, we may assume that X n take values in R d which we equip with theusual inner product h· , ·i and the Euclidean norm | · | . For simplicity, wedescribe the construction in the situation, where X has a density p ( x ); cf.Remark 2.5 on appropriate modifications in the general case.Let m ≥ m = [ s ] + 1.)If p is bounded, we put e p n ( x ) = p n ( x ) for all n ≥
1. Otherwise, the integral b = Z p ( x ) >M p ( x ) dx (2.1)is positive for all M >
0. Choose M to be sufficiently large to satisfy, forexample, 0 < b < ; cf. Remark 2.4. In this case (when p is unbounded),consider the decomposition p ( x ) = (1 − b ) ρ ( x ) + bρ ( x ) , (2.2) S. G. BOBKOV, G. P. CHISTYAKOV AND F. G ¨OTZE where ρ , ρ are the normalized restrictions of p to the sets { p ( x ) ≤ M } and { p ( x ) > M } , respectively. Hence, for the convolutions we have a binomialdecomposition p ∗ n = n X k =0 C kn (1 − b ) k b n − k ρ ∗ k ∗ ρ ∗ ( n − k )2 . For n ≥ m + 1, we split the above sum into the two parts, so that p ∗ n = ρ n + ρ n with ρ n = n X k = m +1 C kn (1 − b ) k b n − k ρ ∗ k ∗ ρ ∗ ( n − k )2 ,ρ n = m X k =0 C kn (1 − b ) k b n − k ρ ∗ k ∗ ρ ∗ ( n − k )2 . Note that, whenever b < b < , ε n ≡ Z ρ n ( x ) dx = m X k =0 C kn (1 − b ) k b n − k (2.3) ≤ n m b n − m = o ( b n ) as n → ∞ . Finally define e p n ( x ) = p n ( x ) = 11 − ε n n d/ ρ n ( x √ n )(2.4)and similarly p n ( x ) = ε n n d/ ρ n ( x √ n ). Thus, we have the desired decom-position p n ( x ) = (1 − ε n ) p n ( x ) + ε n p n ( x ) . (2.5)The probability densities p n ( x ) are bounded and provide an approxima-tion for p n ( x ) = n d/ p ∗ n ( x √ n ) in total variation. In particular, from (2.3)–(2.5) it follows that Z | p n ( x ) − p n ( x ) | dx < − n for all n large enough. One of the immediate consequences of this estimateis the bound | v n ( t ) − v n ( t ) | < − n ( t ∈ R d )(2.6)for the characteristic functions v n ( t ) = R e i h t,x i p n ( x ) dx and v n ( t ) = R e i h t,x i × p n ( x ) dx , corresponding to the densities p n and p n .This property may be sharpened in case of finite moments. NTROPIC CENTRAL LIMIT THEOREM Lemma 2.1. If E | X | s < + ∞ ( s ≥ , then for all n large enough, Z (1 + | x | s ) | e p n ( x ) − p n ( x ) | dx < − n . In particular, (2.6) also holds for all partial derivatives of v n and v n up toorder m = [ s ] . Proof.
By definition (2.5), | p n ( x ) − p n ( x ) | ≤ ε n ( p n ( x )+ p n ( x )), hence Z | x | s | p n ( x ) − p n ( x ) | dx ≤ ε n − ε n n − s/ Z | x | s ρ n ( x ) dx + n − s/ Z | x | s ρ n ( x ) dx. Let U , U , . . . be independent copies of U and V , V , . . . be independentcopies of V (that are also independent of U n ’s), where U and V are randomvectors with densities ρ and ρ , respectively. From (2.2) β s ≡ E | X | s = (1 − b ) E | U | s + b E | V | s , so E | U | s ≤ β s /b and E | V | s ≤ β s /b (using b < ). Therefore, for the normal-ized sums R k,n = 1 √ n ( U + · · · + U k + V + · · · + V n − k ) , ≤ k ≤ n, we have E | R k,n | s ≤ β s b n s/ , if s ≥
1, and E | R k,n | s ≤ β s b n − ( s/ , if 0 ≤ s ≤ ρ n and ρ n , Z | x | s ρ n ( x ) dx = n s/ n X k = m +1 C kn (1 − b ) k b n − k E | R k,n | s ≤ β s b n s +1 , Z | x | s ρ n ( x ) dx = n s/ m X k =0 C kn (1 − b ) k b n − k E | R k,n | s ≤ β s b n s +1 ε n . It remains to apply estimate (2.3) on ε n , and Lemma 2.1 follows. (cid:3) We need to extend the assertion of Lemma 2.1 to the relative entropieswith respect to the standard normal distribution on R d with density ϕ ( x ) =(2 π ) − d/ e −| x | / . Thus put D n = Z p n ( x ) log p n ( x ) ϕ ( x ) dx, e D n = Z e p n ( x ) log e p n ( x ) ϕ ( x ) dx. Lemma 2.2. If X has a finite second moment and finite entropy, then | e D n − D n | < − n , for all n large enough. S. G. BOBKOV, G. P. CHISTYAKOV AND F. G ¨OTZE
First, we collect a few elementary properties of the convex function L ( u ) = u log u ( u ≥ Lemma 2.3.
For all u, v ≥ and ≤ ε ≤ : (a) L ((1 − ε ) u + εv ) ≤ (1 − ε ) L ( u ) + εL ( v ) ; (b) L ((1 − ε ) u + εv ) ≥ (1 − ε ) L ( u ) + εL ( v ) + uL (1 − ε ) + vL ( ε ) ; (c) L ((1 − ε ) u + εv ) ≥ (1 − ε ) L ( u ) − e u − e . The first assertion is just Jensen’s inequality applied to L . By the convex-ity of L , for each y ≥
0, the function L ( x + y ) − L ( x ) is increasing in x ≥ L ( x + y ) − L ( x ) ≥ L ( y ), which is (b) for x = (1 − ε ) u and y = εv .Similarly, using L ≥ − e , we obtain (c). Proof of Lemma 2.2.
Assuming that p is (essentially) unbounded,define D nj = Z p nj ( x ) log p nj ( x ) ϕ ( x ) dx ( j = 1 , , so that e D n = D n, . By Lemma 2.3(a), D n ≤ (1 − ε n ) D n + ε n D n . On theother hand, by (b), D n ≥ ((1 − ε n ) D n + ε n D n ) + ε n log ε n + (1 − ε n ) log(1 − ε n ) . In view of (2.3), the two estimates give | D n − D n | < C ( n + D n + D n ) b n , (2.7)which holds for all n ≥ C . In addition, by the inequalityin (c) with ε = b , from (2.2) it follows that D ( X k Z ) = Z L (cid:18) p ( x ) ϕ ( x ) (cid:19) ϕ ( x ) dx ≥ (1 − b ) Z ρ ( x ) log ρ ( x ) ϕ ( x ) dx − e , (2.8)where Z denotes a standard normal random vector in R d . By the samereasoning, D ( X k Z ) ≥ b Z ρ ( x ) log ρ ( x ) ϕ ( x ) dx − e . (2.9)Now, by the convexity of the function L ( u ) = u log u , D n ≤ − ε n n X k = m +1 C kn (1 − b ) k b n − k Z r k,n ( x ) log r k,n ( x ) ϕ ( x ) dx,D n ≤ ε n m X k =0 C kn (1 − b ) k b n − k Z r k,n ( x ) log r k,n ( x ) ϕ ( x ) dx, NTROPIC CENTRAL LIMIT THEOREM where r k,n are densities of the normalized sums R k,n from the proof of Lem-ma 2.1. Here each integral may also be written as Z r k,n ( x ) log r k,n ( x ) ϕ ( x ) dx = Z L ( r k,n ( x )) dx + d π ) + 12 E | R k,n | . (2.10)We have E | R k,n | ≤ β b n , as noticed in the proof of Lemma 2.1. In addition,by the convexity of L , there is a general inequality Z L (( f ∗ g )( x )) dx ≤ Z L ( f ( x )) dx valid for the convolution of any two probability densities f and g on R d (ifthe integrals exist). In particular, Z L ( r k,n ( x )) dx ≤ d n + max (cid:26)Z L ( ρ ( x )) dx, Z L ( ρ ( x )) dx (cid:27) , which may actually be sharpened in case 1 < k < n by replacing max withmin. By (2.8) and (2.9), the integrals on the right-hand side are finite, thusthe integrals on the left-hand side of (2.10) are bounded by Cn with someconstant C . Hence, a similar bound also holds for D nj , and it remains toapply (2.7). Lemma 2.2 is proved. (cid:3) Remark 2.4. If X has a finite second moment and D ( X ) < + ∞ , thetruncation level M in (2.1) can be chosen explicitly in terms of b using theentropic distance D ( X ) and σ = det(Σ), where Σ is the covariance matrixof X .Indeed, putting a = E X and using an elementary inequality t log(1 + t ) ≤ t log t + 1 ( t ≥ Z p log (cid:18) pϕ a, Σ (cid:19) dx = Z pϕ a, Σ log (cid:18) pϕ a, Σ (cid:19) ϕ a, Σ dx ≤ Z p log pϕ a, Σ dx + 1 = D ( X ) + 1 . On the other hand, the original expression majorizes Z { p ( x ) >M } p ( x ) log Mϕ a, Σ ( x ) dx ≥ b log( M σ (2 π ) d/ ) , hence M ≤ σ (2 π ) d/ e ( D ( X )+1) /b . Remark 2.5. If Z n have absolutely continuous distributions with finiteentropies for n ≥ n >
1, the above construction should be properly modified.
Namely, one may put e p n = p n , if p n are bounded, and otherwise applythe same decomposition (2.2) to p n in place of p . As a result, for any n = An + B ( A ≥
1, 0 ≤ B ≤ n − S n will have thedensity r n ( x ) = A X k =0 C kA (1 − b ) k b A − k Z ( ρ ∗ k ∗ ρ ∗ ( A − k )2 )( x − y ) dF B ( y ) , where F B is the distribution of S B . For A ≥ m + 1, split the above suminto the two parts with summation over m + 1 ≤ k ≤ A and 0 ≤ k ≤ m ,respectively, so that r n = ρ n + ρ n . Then, like in (2.4) and for the samesequence ε n described in (2.3), define e p n ( x ) = 11 − ε n n d/ ρ n ( x √ n ) . Clearly, these densities are bounded and approximate p n ( x ) in total varia-tion. In particular, for all sufficiently large n , they satisfy the estimates thatare similar to the estimates in Lemmas 2.1 and 2.2.
3. Edgeworth-type expansions.
Let $(X_n)_{n\ge1}$ be independent, identically distributed random variables with mean $E X = 0$ and variance $\mathrm{Var}(X) = 1$. In this section we collect some auxiliary results about Edgeworth-type expansions both for the distribution functions $F_n(x) = P\{Z_n \le x\}$ and the densities $p_n(x)$ of the normalized sums $Z_n = S_n/\sqrt{n}$, where $S_n = X_1 + \cdots + X_n$. If the absolute moment $E|X|^s$ is finite for a given $s \ge 2$, put $m = [s]$ and define
\[
\varphi_m(x) = \varphi(x) + \sum_{k=1}^{m-2} q_k(x)\, n^{-k/2} \tag{3.1}
\]
with the functions $q_k$ described in (1.2). Introduce as well
\[
\Phi_m(x) = \int_{-\infty}^{x} \varphi_m(y)\,dy = \Phi(x) + \sum_{k=1}^{m-2} Q_k(x)\, n^{-k/2}. \tag{3.2}
\]
Similar to (1.2), the functions $Q_k$ have an explicit description involving the cumulants $\gamma_3, \ldots, \gamma_{k+2}$ of $X$. Namely,
\[
Q_k(x) = -\varphi(x) \sum H_{k+2j-1}(x)\, \frac{1}{r_1!\cdots r_k!} \Big(\frac{\gamma_3}{3!}\Big)^{r_1} \cdots \Big(\frac{\gamma_{k+2}}{(k+2)!}\Big)^{r_k},
\]
where the summation is carried out over all nonnegative integer solutions $(r_1, \ldots, r_k)$ to the equation $r_1 + 2r_2 + \cdots + k r_k = k$ with $j = r_1 + \cdots + r_k$; cf., for example, [4] or [21] for details.

Theorem 3.1.
Assume that $\limsup_{|t|\to+\infty} |E\, e^{itX}| < 1$. If $E|X|^s < +\infty$ ($s \ge 2$), then as $n \to \infty$, uniformly for all $x$,
\[
(1 + |x|^s)(F_n(x) - \Phi_m(x)) = o\big(n^{-(s-2)/2}\big). \tag{3.3}
\]

For $2 \le s < 3$ we have $m = 2$, there are no expansion terms in the sum (3.2), and hence $\Phi_2(x) = \Phi(x)$ is the distribution function of the standard normal law. In this case, (3.3) becomes
\[
(1 + |x|^s)(F_n(x) - \Phi(x)) = o\big(n^{-(s-2)/2}\big). \tag{3.4}
\]
In fact, in this case Cramér's condition on the characteristic function of $X$ is not used. The result was obtained by Osipov and Petrov [19]; cf. also [5], where (3.4) is established with $O$ in place of $o$.

In the case $s \ge 3$, when $s = m$ is integer, relation (3.3) without the factor $1 + |x|^m$ represents the classical Edgeworth expansion. It is essentially due to Cramér and is described in many papers and textbooks; cf. [9, 10]. However, the case of fractional values of $s$ is more delicate, especially in the following local limit theorem.

Theorem 3.2.
Let E | X | s < + ∞ ( s ≥ . Suppose Z n has a boundeddensity for some n . Then for all sufficiently large n , the random variables Z n have continuous bounded densities p n satisfying, as n → ∞ , (1 + | x | m )( p n ( x ) − ϕ m ( x )) = o ( n − ( s − / )(3.5) uniformly for all x . Moreover, (1 + | x | s )( p n ( x ) − ϕ m ( x ))(3.6) = o ( n − ( s − / ) + (1 + | x | s − m )( O ( n − ( m − / ) + o ( n − ( s − )) . If s = m is integer and m ≥
3, Theorem 3.2 is well known; then (3.5) and (3.6) simplify to
\[
(1 + |x|^m)(p_n(x) - \varphi_m(x)) = o\big(n^{-(m-2)/2}\big). \tag{3.7}
\]
In this formulation the result is due to Petrov [20]; cf. [21], page 211, or [4], page 192. Without the term $1 + |x|^m$, relation (3.7) goes back to the results of Cramér and Gnedenko (cf. [11]).

In the general (fractional) case, Theorem 3.2 has recently been obtained in [6, 7] by using the technique of Liouville fractional integrals and derivatives. Assertion (3.6) gives an improvement over (3.5) on relatively large intervals of the real axis, and this is essential in the case of noninteger $s$.

An obvious weak point in Theorem 3.2 is that it requires the boundedness of the densities $p_n$, which is, however, necessary for conclusions such as (3.5) or (3.7). Nevertheless, this condition may be removed, if we replace $p_n$ by slightly modified densities $\tilde p_n$.
Theorem 3.3.
Let E | X | s < + ∞ ( s ≥ . Suppose that, for all for allsufficiently large n , Z n have absolutely continuous distributions with densi-ties p n . Then there exist some bounded continuous densities e p n such that: (a) the relations (3.5) and (3.6) hold true for e p n instead of p n ; (b) R + ∞−∞ (1 + | x | s ) | e p n ( x ) − p n ( x ) | dx < − n , for all sufficiently large n ; (c) e p n ( x ) = p n ( x ) almost everywhere, if p n is bounded ( a.e. ) . Here, property (c) is added to include Theorem 3.2 in Theorem 3.3 as aparticular case. Moreover, one can use the densities e p n constructed in theprevious section with m = [ s ] + 1. We refer to [6, 7] for detailed proofs.This extended result allows us to immediately recover, for example, thecentral limit theorem with respect to the total variation distance (withoutthe assumption of boundedness of p n ). Namely, we have k F n − Φ m k TV = Z + ∞−∞ | p n ( x ) − ϕ m ( x ) | dx = o ( n − ( s − / ) . (3.8)For s = 2 and ϕ ( x ) = ϕ ( x ), this statement corresponds to a theorem ofProkhorov [22], while for s = 3 and ϕ ( x ) = ϕ ( x )(1 + γ x − x √ n )—to the resultof Sirazhdinov and Mamatov [23]. The multidimensional case.
Similar results are also available in the mul-tidimensional case for integer values s = m . In the remaining part of this sec-tion, let ( X n ) n ≥ denote independent identically distributed random vectorsin the Euclidean space R d with mean zero and identity covariance matrix.Assuming E | X | m < + ∞ for some integer m ≥ | · | denotesthe Euclidean norm), introduce the cumulants γ ν of X and the associatedcumulant polynomials γ k ( it ) up to order m by using the equality1 k ! d k du k log E e iu h t,X i (cid:12)(cid:12)(cid:12)(cid:12) u =0 = 1 k ! γ k ( it ) = X | ν | = k γ ν ( it ) ν ν ! ( k = 1 , . . . , m, t ∈ R d ) . Here the summation runs over all d -tuples ν = ( ν , . . . , ν d ) with integer com-ponents ν j ≥ | ν | = ν + · · · + ν d = k . We also write ν ! = ν ! · · · ν d !and use a standard notation for the generalized powers z ν = z ν · · · z ν d d ofreal or complex vectors z = ( z , . . . , z d ), which are treated as polynomials in z of degree | ν | .For 1 ≤ k ≤ m −
2, define the polynomials P k ( it ) = X r +2 r + ··· + kr k = k r ! · · · r k ! (cid:18) γ ( it )3! (cid:19) r · · · (cid:18) γ k +2 ( it )( k + 2)! (cid:19) r k , (3.9)where the summation is performed over all nonnegative integer solutions( r , . . . , r k ) to the equation r + 2 r + · · · + kr k = k . NTROPIC CENTRAL LIMIT THEOREM Furthermore, like in dimension one, define the approximating functions ϕ m ( x ) on R d by virtue of the equality (3.1), where every q k is determinedby its Fourier transform Z e i h t,x i q k ( x ) dx = P k ( it ) e −| t | / . (3.10)If Z n has a bounded density for some n , then for all sufficiently large n , Z n have continuous bounded densities p n satisfying (3.7); see [4], Theo-rem 19.2. We need an extension of this theorem to the case of unboundeddensities, as well as integral variants such as (3.8). The first assertion (3.11)in the next theorem is similar to the one-dimensional Theorem 3.3 in thecase where s = m is integer; cf. (3.5). For the proof (which we omit), one mayapply Lemma 2.1 and follow the standard arguments from [4], Chapter 4. Theorem 3.4.
Suppose that E | X | m < + ∞ with some integer m ≥ .If, for all sufficiently large n , Z n have densities p n , then the densities e p n introduced in Section 2 with m = m + 1 satisfy (1 + | x | m )( e p n ( x ) − ϕ m ( x )) = o ( n − ( m − / )(3.11) uniformly for all x . In addition, Z (1 + | x | m ) | e p n ( x ) − ϕ m ( x ) | dx = o ( n − ( m − / ) . (3.12)The second assertion is Theorem 19.5 in [4], where it is stated for m ≥ X has a nonzero absolutely contin-uous component. Note that, by Lemma 2.1, it does not matter whether e p n or p n are used in (3.12).
4. Entropic distance to normality and moderate deviations.
Let X ,X , . . . be independent, identically distributed random vectors in R d withmean zero, identity covariance matrix and such that D ( Z n ) < + ∞ , for all n large enough.According to Lemma 2.2 and Remark 2.5, up to an error at most 2 − n forsufficiently large n , the entropic distance to normality, D n = D ( Z n ), is equalto the relative entropy e D n = Z e p n ( x ) log e p n ( x ) ϕ ( x ) dx, where ϕ is the density of a standard normal random vector Z in R d .Given T ≥
1, split the integral into two parts by writing e D n = Z | x |≤ T e p n ( x ) log e p n ( x ) ϕ ( x ) dx + Z | x | >T e p n ( x ) log e p n ( x ) ϕ ( x ) dx. (4.1)By Theorems 3.3 and 3.4, e p n are uniformly bounded, that is, e p n ( x ) ≤ M ,for all x ∈ R d and n ≥ M . Hence, the second integral S. G. BOBKOV, G. P. CHISTYAKOV AND F. G ¨OTZE in (4.1) may be treated by virtue of moderate deviations results (when T isnot too large). Indeed, since T ≥ Z | x | >T e p n ( x ) log e p n ( x ) ϕ ( x ) dx ≤ Z | x | >T e p n ( x ) log Mϕ ( x ) dx ≤ C Z | x | >T | x | e p n ( x ) dx, where C = + log(1 + M (2 π ) d/ ). One the other hand, using u log u ≥ u − Z | x | >T e p n ( x ) log e p n ( x ) ϕ ( x ) dx ≥ Z | x | >T ( e p n ( x ) − ϕ ( x )) dx ≥ − P {| Z | > T } . The two estimates give (cid:12)(cid:12)(cid:12)(cid:12)Z | x | >T e p n ( x ) log e p n ( x ) ϕ ( x ) dx (cid:12)(cid:12)(cid:12)(cid:12) ≤ P {| Z | > T } + C Z | x | >T | x | e p n ( x ) dx. (4.2)This is a very general upper bound, valid for any probability density e p n on R d , bounded by a constant M (with C as above).Following (4.1), we are faced with two analytic problems. The first one isto give a sharp estimate of e p n ( x ) − ϕ ( x ) on a relatively large Euclidean ball | x | ≤ T . Clearly, T has to be small enough, so that results like local limittheorems, such as Theorems 3.2–3.4 may be applied. The second problemis to give a sharp upper bound of the last integral in (4.2). To this aim,we need moderate deviations inequalities, so that Theorems 3.1 and 3.4are applicable. Anyway, in order to use both types of results we are forcedto choose T from a very narrow window only. This value turns out to beapproximately T n = p ( s −
2) log n + s log log n + ρ n ( s > , (4.3)where ρ n → + ∞ is a sufficiently slowly growing sequence (whose growth willbe restricted by the decay of the n -dependent constants in o -expressions ofTheorems 3.2–3.4). In the case s = 2, one may put T n = √ ρ n such that T n → + ∞ is a sufficiently slowly growing sequence. Lemma 4.1 (The case d = 1 and s real). If E X = 0 , E X = 1 , E | X | s < + ∞ ( s ≥ , then Z | x | >T n x e p n ( x ) dx = o (( n log n ) − ( s − / ) . (4.4) Lemma 4.2 (The case d ≥ s integer). If X has mean zero andidentity covariance matrix, and E | X | m < + ∞ , then Z | x | >T n x e p n ( x ) dx = o ( n − ( m − / (log n ) − ( m − d ) / ) ( m ≥ and R | x | >T n x e p n ( x ) dx = o (1) in the case m = 2 . NTROPIC CENTRAL LIMIT THEOREM Note that plenty of results and techniques concerning moderate deviationshave been developed by now. Useful estimates can be found, for example,in [12]. Restricting ourselves to integer values of s = m , one may argue asfollows. Proof of Lemma 4.2.
Given T ≥
1, write Z | x | >T | x | e p n ( x ) dx ≤ T m − Z | x | m e p n ( x ) dx ≤ T m − Z | x | m | e p n ( x ) − ϕ m ( x ) | dx (4.6) + 1 T m − Z | x | >T | x | m ϕ m ( x ) dx. By Theorem 3.4 [cf. (3.12)] the first integral in (4.6) is bounded by o ( n − ( m − / ).From the definition of q k it follows that q k ( x ) = N ( x ) ϕ ( x ) with somepolynomial N of degree at most 3( m − ϕ m ( x ) ≤ ϕ ( x ) on the balls of large radii | x | < n δ with sufficientlylarge n (where 0 < δ < ). On the other hand, with some constants C d , C ′ d depending on the dimension only, Z | x | >T | x | m ϕ ( x ) dx = C d Z + ∞ T r m + d − e − r / dr ≤ C ′ d T m + d − e − T / . (4.7)But for T = T n and s = m ≥
3, we have e − T / = T − m o ( n − ( m − / ), so by(4.6) and (4.7), Z | x | >T n | x | e p n ( x ) dx ≤ C (cid:18) T m − + 1 T m − d (cid:19) o ( n − ( m − / ) . Since T n is of order √ log n , (4.5) follows. Furthermore, in the case m = 2,(4.6) gives the desired relation Z | x | >T n | x | e p n ( x ) dx ≤ o (1) + Z | x | >T n | x | ϕ ( x ) dx → n → ∞ ) . (cid:3) Proof of Lemma 4.1.
The above argument also works for d = 1, butit can be refined applying Theorem 3.1 for real s . The case s = 2 is alreadycovered, so let s > T ≥ − ε n ) Z | x | >T x e p n ( x ) dx (4.8) S. G. BOBKOV, G. P. CHISTYAKOV AND F. G ¨OTZE ≤ Z | x | >T x p n ( x ) dx = Z | x | >T x dF n ( x )= T (1 − F n ( T ) + F n ( − T )) + 2 Z + ∞ T x (1 − F n ( x ) + F n ( − x )) dx, (4.9)where F n denotes the distribution function of Z n . [Note that the first in-equality in (4.8) should be just ignored in the case, where p is bounded.]By (3.3), F n ( x ) = Φ m ( x ) + r n ( x ) n ( s − /
11 + | x | s , r n = sup x | r n ( x ) | → n → ∞ ) . Hence, the first term in (4.9) can be replaced with T (1 − Φ m ( T ) + Φ m ( − T ))(4.10)at the expense of an error not exceeding (for the values T ∼ √ log n )2 r n n ( s − / T T s = o (( n log n ) − ( s − / ) . (4.11)Similarly, the integral in (4.9) can be replaced with Z + ∞ T x (1 − Φ m ( x ) + Φ m ( − x )) dx (4.12)at the expense of an error not exceeding2 r n n ( s − / Z + ∞ T x dx x s = o (( n log n ) − ( s − / ) . (4.13)To explore the behavior of expressions (4.10) and (4.12) for T = T n usingprecise asymptotics as in (4.3), recall that, by (3.2),1 − Φ m ( x ) = 1 − Φ( x ) − m − X k =1 Q k ( x ) n − k/ . Moreover, we note that Q k ( x ) = N k − ( x ) ϕ ( x ), where N k − is a poly-nomial of degree at most 3 k −
1. Thus these functions admit a bound | Q k ( x ) | ≤ C m (1 + | x | m ) ϕ ( x ) with some constants C m (depending on m andthe cumulants γ , . . . , γ m of X ), which implies with some other constants | − Φ m ( x ) | ≤ (1 − Φ( x )) + C m (1 + | x | m ) √ n ϕ ( x ) . (4.14)Hence, using 1 − Φ( x ) < ϕ ( x ) x ( x > T n | − Φ m ( T n ) | ≤ CT n (1 − Φ( T n )) ≤ CT n e − T n / (4.15) = o (( n log n ) − ( s − / ) . A similar bound also holds for T n | Φ m ( − T n ) | . NTROPIC CENTRAL LIMIT THEOREM Now, we use (4.14) to estimate (4.12) with T = T n up to a constant by Z ∞ T x (1 − Φ( x )) dx < − Φ( T ) = o (( n log n ) − ( s − / ) . It remains to combine the last relation with (4.11), (4.13) and (4.15).Since ε n → (cid:3) Remark 4.3.
Note that the probabilities P {| Z | > T } appearing in (4.2)yield a smaller contribution for T = T n in comparison with the right-handsides of (4.4) and (4.5). Indeed, we have P {| Z | > T } ≤ C d T d − e − T / ( T ≥ Z | x | >T n e p n ( x ) log e p n ( x ) ϕ ( x ) dx.
5. Taylor-type expansion for the entropic distance.
In this section weprovide the last auxiliary step toward the proof of Theorem 1.1. In orderto describe the multidimensional case, let X , X , . . . be independent identi-cally distributed random vectors in R d with mean zero, identity covariancematrix, and such that D ( Z n ) < + ∞ for some n .If p n is bounded, then the densities p n of Z n ( n ≥ n ) are uniformlybounded, and we put e p n = p n . Otherwise, we use the modified densities e p n according to the construction of Section 2. In particular, if e Z n has density e p n ,then | D ( e Z n k Z ) − D ( Z n ) | < − n for all n large enough (where Z is a stan-dard normal random vector; cf. Lemma 2.2 and Remark 2.5). Moreover, byLemmas 4.1, 4.2 and Remark 4.3, (cid:12)(cid:12)(cid:12)(cid:12) D ( Z n ) − Z | x |≤ T n e p n ( x ) log e p n ( x ) ϕ ( x ) dx (cid:12)(cid:12)(cid:12)(cid:12) = o (∆ n ) , (5.1)where T n are defined in (4.3) and∆ n = n − ( s − / (log n ) − ( s − max( d, / (5.2)(with the convention that ∆ n = 1 for the critical case s = 2).Thus, all information about the asymptotics of D ( Z n ) is contained in theintegral in (5.1). More precisely, writing a Taylor expansion for e p n using theapproximating functions ϕ m in Theorems 3.2–3.4 leads to the following rep-resentation (which is more convenient in applications such as Corollary 1.2). Theorem 5.1.
Let E | X | s < + ∞ ( s ≥ , assuming that s is integer incase d ≥ . Then D ( Z n ) = m − X k =2 ( − k k ( k − Z ( ϕ m ( x ) − ϕ ( x )) k dxϕ ( x ) k − (5.3) + o (∆ n ) ( m = [ s ]) . S. G. BOBKOV, G. P. CHISTYAKOV AND F. G ¨OTZE
Note that in the case 2 ≤ s < D ( Z n ) = o (∆ n ). Proof of Theorem 5.1.
In terms of L ( u ) = u log u , rewrite the inte-gral in (5.1) as e D n, = Z | x |≤ T n L (cid:18) e p n ( x ) ϕ ( x ) (cid:19) ϕ ( x ) dx (5.4) = Z | x |≤ T n L (1 + u m ( x ) + v n ( x )) ϕ ( x ) dx, where u m ( x ) = ϕ m ( x ) − ϕ ( x ) ϕ ( x ) , v n ( x ) = e p n ( x ) − ϕ m ( x ) ϕ ( x ) . By Theorems 3.3 and 3.4, more precisely, by (3.6) for d = 1, and by (3.11)for d ≥ s = m integer, in the region | x | = O ( n δ ) with an appropriate δ >
0, we have | e p n ( x ) − ϕ m ( x ) | ≤ r n n ( s − /
11 + | x | s , r n → . (5.5)Since ϕ ( x )(1 + | x | s ) is decreasing as a function of | x | for large | x | , we obtain,for all | x | ≤ T n , | v n ( x ) | ≤ C r n n ( s − / e T n / T sn ≤ C ′ r n e ρ n / . The last expression tends to zero by a suitable choice of ρ n → ∞ which wewill assume from now on. In particular, for n large enough, | v n ( x ) | < in | x | ≤ T n .From the definitions of q k and ϕ m [cf. (1.2), (3.1) and (3.10)], it followsthat | u m ( x ) | ≤ C m | x | m − √ n (5.6)with some constants depending on m and the cumulants, only. Thus, we alsohave | u m ( x ) | < for | x | ≤ T n with sufficiently large n .Now, by Taylor’s formula, for | u | ≤ , | v | ≤ , L (1 + u + v ) = L (1 + u ) + v + 2 θ uv + θ v with some | θ j | ≤ u, v ). Applying this approximation with u = u m ( x ) and v = v n ( x ), we see that v n ( x ) can be removed from the right-hand side of (5.4) at the expense of an error not exceeding | J | + J + J ,where J = Z | x |≤ T n ( e p n ( x ) − ϕ m ( x )) dx, J = Z | x |≤ T n | u m ( x ) || e p n ( x ) − ϕ m ( x ) | dx NTROPIC CENTRAL LIMIT THEOREM and J = Z | x |≤ T n ( e p n ( x ) − ϕ m ( x )) ϕ ( x ) dx. But | J | = (cid:12)(cid:12)(cid:12)(cid:12)Z | x | >T n ( e p n ( x ) − ϕ m ( x )) dx (cid:12)(cid:12)(cid:12)(cid:12) (5.7) ≤ Z | x | >T n e p n ( x ) dx + Z | x | >T n | ϕ m ( x ) | dx. By Lemmas 4.1 and 4.2, the first integral on the right-hand side is T n -timessmaller than o (∆ n ). Also, by (5.6), the last integral in (5.7) is bounded by Z | x | >T n | ϕ m ( x ) − ϕ ( x ) | dx + Z | x | >T n ϕ ( x ) dx ≤ C m √ n Z | x | >T n (1 + | x | m − ) ϕ ( x ) dx + P {| Z | > T n } = o (∆ n ) . As a result, J = o (∆ n ).Applying (5.6) once more and then relation (3.12), we may also concludethat J ≤ C m T m − n √ n Z | x |≤ T n | e p n ( x ) − ϕ m ( x ) | dx = o (∆ n ) . Finally, using (5.5) with s >
2, we get, up to some constants, J ≤ C r n n s − Z | x |≤ T n e | x | / | x | s dx ≤ C d r n n s − Z T n r d − s − e r / dr ≤ C ′ d r n n s − T s − d +2 n e T n / = o (cid:18) n ( s − / (log n ) ( s − d +2) / (cid:19) = o (∆ n ) . If s = 2, all these steps are valid as well and give J ≤ C ′ d r n n s − T s − d +2 n e T n / → T n → + ∞ .Thus, at the expense of an error not exceeding o (∆ n ) one may remove v n ( x ) from (5.4), and we obtain the relation e D n, = Z | x |≤ T n L (1 + u m ( x )) ϕ ( x ) dx + o (∆ n ) , (5.8)which contains specified expansion terms, only. S. G. BOBKOV, G. P. CHISTYAKOV AND F. G ¨OTZE
Moreover, u m ( x ) = u ( x ) = 0 for 2 ≤ s <
3, and then the theorem is proved.Next, we consider the case s ≥
3. By Taylor’s expansion around zero, weget, whenever | u | < , for some positive constants θ m , L (1 + u ) = u + m − X k =2 ( − k k ( k − u k + θu m − , | θ | ≤ θ m , assuming that the sum has no terms in the case m = 3. Hence, with some | θ | ≤ θ m , Z | x |≤ T n L (1 + u m ( x )) ϕ ( x ) dx (5.9) = Z | x |≤ T n ( ϕ m ( x ) − ϕ ( x )) dx + m − X k =2 ( − k k ( k − Z | x |≤ T n u m ( x ) k ϕ ( x ) dx (5.10) + θ Z R d | u m ( x ) | m − ϕ ( x ) dx. For n large enough, by (5.6), the second integral in (5.9) has an absolutevalue (cid:12)(cid:12)(cid:12)(cid:12)Z | x | >T n ( ϕ m ( x ) − ϕ ( x )) dx (cid:12)(cid:12)(cid:12)(cid:12) ≤ C √ n Z | x | >T n (1 + | x | m − ) ϕ ( x ) dx = o (∆ n ) . This proves the theorem in the case 3 ≤ s < m = 3).Now, let s ≥
4. The last integral in (5.10) can be estimated again by virtueof (5.6) by Cn ( m − / Z R d (1 + | x | m − m − ) ϕ ( x ) dx = o (∆ n ) . In addition, the first integral in (5.10) can be extended to the whole spaceat the expense of an error not exceeding (for all n large enough) Z | x | >T n | u m ( x ) | k ϕ ( x ) dx ≤ Cn k/ Z | x | >T n (1 + | x | k ( m − ) ϕ ( x ) dx ≤ C ′ T k ( m − n √ n e − T n / = o (∆ n ) . Collecting these estimates in (5.9) and (5.10) and applying them in (5.8),we arrive at e D n, = m − X k =2 ( − k k ( k − Z u m ( x ) k ϕ ( x ) dx + o (∆ n ) . It remains to apply (5.1). Thus, Theorem 5.1 is proved. (cid:3)
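Before passing to the multidimensional statements, let us illustrate how (5.3) reproduces the expansion of Theorem 1.1 in the simplest nontrivial case (this computation is added only for orientation and uses nothing beyond the definitions (1.2) and (3.1)). Let $d = 1$ and $s = m = 4$, so that $\varphi_4 - \varphi = q_1 n^{-1/2} + q_2 n^{-1}$ with $q_1 = \frac{\gamma_3}{6} H_3 \varphi$. Then (5.3) contains only the quadratic term, and by the orthogonality of the Chebyshev–Hermite polynomials the cross term $\int q_1 q_2/\varphi\,dx$ vanishes, so that
\[
D(Z_n) = \frac{1}{2}\int \frac{(\varphi_4(x) - \varphi(x))^2}{\varphi(x)}\,dx + o(\Delta_n)
        = \frac{1}{2n}\int \frac{q_1(x)^2}{\varphi(x)}\,dx + O\Big(\frac{1}{n^2}\Big) + o(\Delta_n)
        = \frac{\gamma_3^2}{12\,n} + O\Big(\frac{1}{n^2}\Big) + o(\Delta_n),
\]
in accordance with (1.5).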
6. Theorem 1.1 and its multidimensional extension.
The desired repre-sentation (1.3) of Theorem 1.1 can be deduced from Theorem 5.1. Note thatthe latter covers the multidimensional case as well, although under some-what stronger moment assumptions.Thus, let ( X n ) n ≥ be independent identically distributed random vectorsin R d with finite second moment. If the normalized sum Z n = ( X + · · · + X n ) / √ n has density p n ( x ), the entropic distance to Gaussianity is definedas in dimension one to be the relative entropy D ( Z n ) = Z p n ( x ) log p ( x ) ϕ a, Σ ( x ) dx with respect to the normal law on R d with the same mean a = E X andcovariance matrix Σ = Var( X ). This quantity is affine invariant, and in thissense it does not depend on ( a, Σ).
Theorem 6.1. If $D(Z_n) < +\infty$ for some $n$, then $D(Z_n) \to 0$ as $n \to \infty$. Moreover, given that $E|X|^s < +\infty$ ($s \ge 2$), and that $X$ has mean zero and identity covariance matrix, we have
\[
D(Z_n) = \frac{c_1}{n} + \frac{c_2}{n^2} + \cdots + \frac{c_{[(m-2)/2]}}{n^{[(m-2)/2]}} + o(\Delta_n) \qquad (m = [s]), \tag{6.1}
\]
where $\Delta_n$ is defined in (5.2), and where we assume that $s$ is integer in case $d \ge 2$.

Here, as in Theorem 1.1, each coefficient $c_j$ is defined according to (1.4) again. It may be represented as a certain polynomial in the cumulants $\gamma_\nu$, $3 \le |\nu| \le 2j+1$.

Proof of Theorem 6.1.
We shall start from the representation (5.3)of Theorem 5.1, so let us return to definition (3.1), ϕ m ( x ) − ϕ ( x ) = m − X r =1 q r ( x ) n − r/ . In the case 2 ≤ s < m = 2), the right-hand side contains no termsand is therefore vanishing. Anyhow, raising this sum to the power k ≥ ϕ m ( x ) − ϕ ( x )) k = X j n − j/ X q r ( x ) · · · q r k ( x ) , where the inner sum is carried out over all positive integers r , . . . , r k ≤ m − r + · · · + r k = j . Respectively, the k th integral in (5.3) is equal to X j n − j/ X Z q r ( x ) · · · q r k ( x ) dxϕ ( x ) k − . (6.2) S. G. BOBKOV, G. P. CHISTYAKOV AND F. G ¨OTZE
Here the integrals are vanishing for odd j . In dimension one, this fol-lows directly from definition (1.2) of q r and the following property of theChebyshev–Hermite polynomials [24] Z + ∞−∞ H r ( x ) · · · H r k ( x ) ϕ ( x ) dx = 0 ( r + · · · + r k is odd) . (6.3)As for the general case, let us look at the structure of the functions q r . Givena multi-index ν = ( ν , . . . , ν d ) with integers ν , . . . , ν d ≥
1, define H ν ( x , . . . ,x d ) = H ν ( x ) · · · H ν d ( x d ), so that Z e i h t,x i H ν ( x ) ϕ ( x ) dx = ( it ) ν e −| t | / , t ∈ R d . Hence, by definition (3.10), q r ( x ) = ϕ ( x ) X ν a ν H ν ( x ) , (6.4)where the coefficients a ν emerge from the expansion P r ( it ) = P ν a ν ( it ) ν .Using (3.9), write these polynomials as P r ( it ) = X l ! · · · l r ! (cid:18) X | ν | =3 γ ν ( it ) ν ν ! (cid:19) l · · · (cid:18) X | ν | = r +2 γ ν ( it ) ν ν ! (cid:19) l r , (6.5)where the outer summation is performed over all nonnegative integer solu-tions ( l , . . . , l r ) to the equation l + 2 l + · · · + rl r = r . Removing the bracketsof the inner sums, we obtain a linear combination of the power polynomials( it ) ν with exponents of order | ν | = 3 l + · · · + ( r + 2) l r = r + 2 b l , b l = l + · · · + l r . (6.6)In particular, r + 2 ≤ | ν | ≤ r , so that P r ( it ) is a polynomial of degree atmost 3 r , and thus ϕ m ( x ) = N ( x ) ϕ ( x ), where N ( x ) is a polynomial of degreeat most 3( m − q r ( x ) · · · q r k ( x ) ϕ ( x ) k − = ϕ ( x ) X a ν (1) · · · a ν ( k ) H ν (1) ( x ) · · · H ν ( k ) ( x ) , (6.7)where | ν (1) | + · · · + | ν ( k ) | = r + · · · + r k (mod 2). Hence, if r + · · · + r k is odd,the sum | ν (1) | + · · · + | ν ( k ) | = d X i =1 ( | ν (1) i | + · · · + | ν ( k ) i | )is odd as well. But then at least one of the inner sums, say with coordinate i ,must be odd as well. Hence in this case, the integral of (6.7) over x i willvanish by property (6.3).Thus, in expression (6.2), only even values of j should be taken intoaccount. NTROPIC CENTRAL LIMIT THEOREM Moreover, since the terms containing n − j/ with j > s − n in relation (6.1), we get from (5.3) and (6.2), D ( Z n ) = m − X k =2 ( − k k ( k − m − X even j =2 n − j/ X Z q r ( x ) · · · q r k ( x ) dxϕ ( x ) k − + o (∆ n ) . Replace now j with 2 j and rearrange the summation. Then D ( Z n ) = X j ≤ m − c j n j + o (∆ n )with c j = m − X k =2 ( − k k ( k − X Z q r ( x ) · · · q r k ( x ) dxϕ ( x ) k − . Here the inner summation is carried out over all positive integers r , . . . , r k ≤ m − r + · · · + r k = 2 j . This implies k ≤ j . Furthermore, 2 j ≤ m − j ≤ [ s − ]. As a result, we arrive at the required relation(6.1) with c j = j X k =2 ( − k k ( k − X r + ··· + r k =2 j Z q r ( x ) · · · q r k ( x ) dxϕ ( x ) k − . (6.8)Thus, Theorem 6.1 and therefore Theorem 1.1 are proved. (cid:3) Remark.
In order to show that c j is a polynomial in the cumulants γ ν , 3 ≤ | ν | ≤ j + 1, first note that r + · · · + r k = 2 j , r , . . . , r k ≥ j ≥ max i r i + ( k − i r i ≤ j −
1. Thus, the maximal index for thefunctions q r i in (6.8) does not exceed 2 j −
1. On the other hand, it followsfrom (6.4) and (6.5) that P r and q r are polynomials in the same set of thecumulants; more precisely, P r is a polynomial in γ ν with 3 ≤ | ν | ≤ r + 2. Proof of Corollary 1.2.
By Theorem 5.1 [cf. (5.3)], D ( Z n ) = m − X k =2 ( − k k ( k − Z ( ϕ m ( x ) − ϕ ( x )) k dxϕ ( x ) k − + o (∆ n ) . (6.9)Assume that m ≥ γ = · · · = γ k − = 0 for a given integer 3 ≤ k ≤ m .(This is no restriction, when k = 3.) Then, by (1.2), q = · · · = q k − = 0,while q k − ( x ) = γ k k ! H k ( x ) ϕ ( x ). Hence, according to definition (3.1), ϕ m ( x ) − ϕ ( x ) = γ k k ! H k ( x ) ϕ ( x ) 1 n ( k − / + m − X j = k − q j ( x ) n j/ , where the sum is empty in the case m = 3. Therefore, the sum in (1.3) willcontain powers of 1 /n starting from 1 /n k − , and the leading coefficient isdue to the quadratic term in (6.9) when k = 2. More precisely, if k − ≤ m − , S. G. BOBKOV, G. P. CHISTYAKOV AND F. G ¨OTZE we get that c = · · · = c k − = 0, and c k − = γ k k ! Z + ∞−∞ H k ( x ) ϕ ( x ) dx = γ k k ! . (6.10)Hence, if k ≤ m , (6.9) yields D ( Z n ) = γ k k ! 1 n k − + O ( n − ( k − ). Otherwise, the O -term should be replaced by o (( n log n ) − ( s − / ). Thus Corollary 1.2 isproved. (cid:3) By a similar argument, the conclusion may be extended to the mul-tidimensional case. Indeed, if γ ν = 0, for all 3 ≤ | ν | < k , then by (6.5), P = · · · = P k − = 0, while P k − ( it ) = X | ν | = k γ ν ( it ) ν ν ! . Correspondingly, in (6.4) we have q = · · · = q k − = 0 and q k − ( x ) = ϕ ( x ) × P | ν | = k γ ν ν ! H ν ( x ). Therefore, ϕ m ( x ) − ϕ ( x ) = ϕ ( x ) X | ν | = k γ ν ν ! H ν ( x ) 1 n ( k − / + m − X j = k − q j ( x ) n j/ . Applying this relation in (6.9), we arrive at (6.1) with c = · · · = c k − = 0and, by orthogonality of the polynomials H ν , c k − = 12 Z (cid:18) X | ν | = k γ ν ν ! H ν ( x ) (cid:19) ϕ ( x ) dx = 12 X | ν | = k γ ν ν ! . We may summarize our findings as follows.
Corollary 6.2.
Let ( X n ) n ≥ be i.i.d. random vectors in R d ( d ≥ with mean zero and identity covariance matrix. Suppose that E | X | m < + ∞ , for some integer m ≥ , and D ( Z n ) < + ∞ , for some n . Given k =3 , , . . . , m , if γ ν = 0 for all ≤ | ν | < k , we have D ( Z n ) = 12 n k − X | ν | = k γ ν ν ! + O (cid:18) n k − (cid:19) + o (cid:18) n ( m − / (log n ) ( m − d ) / (cid:19) . (6.11)The conclusion corresponds to Corollary 1.2, if we replace d with 2 in theremainder on the right-hand side.As in dimension one, when E X k < + ∞ , the o -term may be removedfrom this representation, while for k > m , the o -term dominates. Moreover,if m +22 < k ≤ m , we are left with this term, only, that is, D ( Z n ) = o (cid:18) n ( m − / (log n ) ( m − d ) / (cid:19) . NTROPIC CENTRAL LIMIT THEOREM When k = 3, there is no restriction on the cumulants in Corollary 6.2, and(6.11) becomes D ( Z n ) = 12 n X | ν | =3 γ ν ν ! + O (cid:18) n (cid:19) + o (cid:18) n ( m − / (log n ) ( m − d ) / (cid:19) . If E | X | < + ∞ , we get D ( Z n ) = O (1 /n ) for d ≤
4, and the weaker bound D ( Z n ) = o ((log n ) ( d − / /n ) for d ≥
5. However, if E | X | < + ∞ , we alwayshave D ( Z n ) = O (1 /n ) regardless of the dimension d .Technically, this slight difference between conclusions for different di-mensions is due to the dimension-dependent asymptotic R | x | >T | x | ϕ ( x ) dx ∼ C d T d e − T / . Remark.
In case of discrete distributions when X takes integer val-ues, asymptotics for D ( S n ) were studied by Vilenkin and D’yachkov [26],who used an Edgeworth-type expansion for probabilities P { S n = k } in thecorresponding local limit theorem.
7. Convolutions of mixtures of normal laws.
Is the asymptotic descrip-tion of D ( Z n ) in Theorem 1.1 still optimal, if no expansion terms of order n − j are present? This is exactly the case for 2 ≤ s < p ( x ) = Z + ∞ ϕ σ ( x ) dP ( σ ) ( x ∈ R ) , (7.1)where P is a (mixing) probability measure on the positive half-axis (0 , + ∞ ),and where ϕ σ ( x ) = 1 σ √ π e − x / (2 σ ) is the density of the normal law with mean zero and variance σ [as usual,we write ϕ ( x ) in the standard normal case with σ = 1].Equivalently, let p ( x ) denote the density of the random variable X = ρZ , where the factors Z ∼ N (0 ,
1) and ρ > P ) areindependent. Such distributions appear naturally, for example, as limit lawsof sums with randomized length; cf., for example, [8].For densities such as (7.1), we need a refinement of the local limit theoremfor convolutions, described in the expansions (3.5) and (3.6). More precisely,our aim is to find a representation with an essentially smaller remainder termcompared to o ( n − ( s − / ). S. G. BOBKOV, G. P. CHISTYAKOV AND F. G ¨OTZE
Thus, let X , X , . . . be independent random variables, having a commondensity p ( x ) as in (7.1), and let p n ( x ) denote the density of the normalizedsum Z n = ( X + · · · + X n ) / √ n . If X = ρZ , where Z ∼ N (0 ,
1) and ρ > E X = E ρ and more generally, E | X | s = β s E ρ s = β s Z + ∞ σ s dP ( σ ) , where β s denotes the s th absolute moment of Z .Note that p ( x ) is unimodal with mode at the origin, and p (0) = E ρ √ π .If ρ ≥ σ >
0, the density is bounded, and therefore the entropy h ( X ) isfinite. Proposition 7.1.
Assume that E ρ = 1 , E ρ s < + ∞ (2 < s ≤ . If P { ρ ≥ σ } = 1 with some constant σ > , then uniformly over all x , p n ( x ) = ϕ ( x ) + n Z + ∞ ( ϕ σ n ( x ) − ϕ ( x )) dP ( σ ) + O (cid:18) n s − (cid:19) , (7.2) where σ n = q σ − n . Of course, when E ρ s < + ∞ for s >
4, the proposition may be still applied,but with s = 4. In this case (7.2) has a remainder term of order O ( n ). Notethat necessarily σ ≤ E ρ = 1.The function p n may also be described as the density of Z n = q ρ + ··· + ρ n n Z ,where ρ k are independent copies of ρ (independent of Z as well). This rep-resention already indicates the closeness of p n and ϕ and suggests to appealto the law of large numbers. However, we shall choose a different approachbased on the characteristic functions of Z n .Obviously, the characteristic function of X is given by v ( t ) = E e itX = E e − ρ t / ( t ∈ R ) . Using Jensen’s inequality and the assumption ρ ≥ σ >
0, we get a two-sidedestimate e − t / ≤ v ( t ) ≤ e − σ t / . (7.3)In particular, the function ψ ( t ) = e t / v ( t ) − t real. Lemma 7.2. If E ρ = 1 , M s = E ρ s < + ∞ (2 ≤ s ≤ , then for all | t | ≤ , ≤ ψ ( t ) ≤ M s | t | s . Proof.
We may assume 0 < t ≤
1. Write ψ ( t ) = E ( e − ( ρ − t / − ρt >
1, hence ψ ( t ) ≤ E ( e − ( ρ − t / − { ρ ≤ /t } . NTROPIC CENTRAL LIMIT THEOREM Let x = − ( ρ − t . Clearly, | x | ≤ ρ ≤ /t . Using e x ≤ x + x ( | x | ≤
1) and E ρ = 1, we get ψ ( t ) ≤ − t E ( ρ − { ρ ≤ /t } + t E ( ρ − { ρ ≤ /t } (7.4) = t E ( ρ − { ρ> /t } + t E ( ρ − { ρ ≤ /t } . The last expectation is equal to E ρ { ρ ≤ /t } + 2 E ( ρ − { ρ> /t } − P { ρ ≤ /t }≤ E ρ { ρ ≤ /t } + 2 E ρ { ρ> /t } − ≤ E ρ { ρ ≤ /t } + E ρ { ρ> /t } . Together with (7.4), this gives ψ ( t ) ≤ t E ρ { ρ> /t } + t E ρ { ρ ≤ /t } . (7.5)Finally, E ρ { ρ> /t } ≤ E ρ s t s − { ρ> /t } ≤ M s t s − and E ρ { ρ ≤ /t } ≤ E ρ s t s − × { ρ ≤ /t } ≤ M s t s − . It remains to use these estimates in (7.5), and Lemma 7.2is proved. (cid:3) Proof of Proposition 7.1.
The characteristic functions v n ( t ) = v ( t √ n ) n of Z n are real-valued and admit, by (7.3), similar bounds e − t / ≤ v n ( t ) ≤ e − σ t / . (7.6)In particular, one may apply the inverse Fourier transform to represent thedensity of Z n as p n ( x ) = 12 π Z + ∞−∞ e − itx v n ( t ) dt = 12 π Z + ∞−∞ e − itx − t / (1 + ψ ( t/ √ n )) n dt. Letting T n = σ log n , we split the integral into the two regions, defined by I = Z | t |≤ T n e − itx v n ( t ) dt, I = Z | t | >T n e − itx v n ( t ) dt. By the upper bound in (7.6), | I | ≤ Z | t | >T n e − σ t / dt ≤ √ πσ e − σ T n / = √ πσ n . (7.7)In the interval | t | ≤ T n , by Lemma 7.2, ψ ( t √ n ) ≤ M s | t | s n s/ ≤ n , for all n ≥ n .But for 0 ≤ ε ≤ n , there is the simple estimate 0 ≤ (1 + ε ) n − − nε ≤ nε ) . S. G. BOBKOV, G. P. CHISTYAKOV AND F. G ¨OTZE
Hence, once more by Lemma 7.2,0 ≤ (1 + ψ ( t/ √ n )) n − − nψ ( t/ √ n ) ≤ nψ ( t/ √ n )) ≤ M s | t | s n s − ( n ≥ n ) . This gives (cid:12)(cid:12)(cid:12)(cid:12) I − Z | t |≤ T n e − itx − t / (1 + nψ ( t/ √ n )) dt (cid:12)(cid:12)(cid:12)(cid:12) ≤ M s n s − Z + ∞−∞ | t | s e − t / dt. (7.8)In addition, (cid:12)(cid:12)(cid:12)(cid:12)Z | t | >T n e − itx − t / (1 + nψ ( t/ √ n )) dt (cid:12)(cid:12)(cid:12)(cid:12) ≤ Z | t | >T n e − t / dt + n Z | t | >T n e − t / ψ ( t/ √ n ) dt. Here, the first integral on the right-hand side is of order O ( n − ). To estimatethe second one, recall that, by (7.3), ψ ( t ) = e t / v ( t ) − ≤ e (1 − σ ) t / . Hence, ψ ( t/ √ n ) ≤ e (1 − σ ) t / and Z | t | >T n e − t / ψ ( t/ √ n ) dt ≤ Z | t | >T n e − σ t / dt ≤ √ πσ n . Together with (7.7) and (7.8) these bounds imply that p n ( x ) = 12 π Z + ∞−∞ e − itx − t / (1 + nψ ( t/ √ n )) dt + O (cid:18) n s − (cid:19) uniformly over all x . It remains to note that12 π Z + ∞−∞ e − itx − t / ψ ( t/ √ n ) dt = 12 π Z + ∞−∞ e − itx − t / ( e t / n v ( t/ √ n ) − dt = Z + ∞ ( ϕ σ n ( x ) − ϕ ( x )) dP ( σ ) . Proposition 7.1 is proved. (cid:3)
Remark 7.3.
An inspection of (7.5) shows that, in the case 2 < s < ψ ( t ) = o ( | t | s ). Correspondingly,the O -relation in Proposition 7.1 can be replaced with an o -relation. Thisimprovement is convenient, but not crucial for the proof of Theorem 1.3.
8. Lower bounds. Proof of Theorem 1.3.
Let X , X , . . . be independentrandom variables with a common density of the form p ( x ) = Z + ∞ ϕ σ ( x ) dP ( σ ) , x ∈ R . NTROPIC CENTRAL LIMIT THEOREM Equivalently, let X = ρZ with independent random variables Z ∼ N (0 , ρ > P .A basic tool for proving Theorem 1.3 will be the following lower bound onthe entropic distance to Gaussianity for the partial sums S n = X + · · · + X n . Proposition 8.1.
Let E ρ = 1 , E ρ s < + ∞ (2 < s < and P { ρ ≥ σ } = 1 with σ > . Assume that, for some γ > , lim inf n →∞ n s − / Z + ∞ n / γ σ dP ( σ ) > . (8.1) Then with some absolute constant c > and some constant δ > , D ( S n ) ≥ cn log n P { ρ ≥ p n log n } + o (cid:18) n ( s − / δ (cid:19) . (8.2)In fact, in (8.2) one may take any positive number δ < min { γs, s − } . Proof of Proposition 8.1.
By Proposition 7.1 and Remark 7.3, uni-formly over all x , p n ( x ) = ϕ ( x ) + n Z + ∞ ( ϕ σ n ( x ) − ϕ ( x )) dP ( σ ) + o (cid:18) n s − (cid:19) , (8.3)where p n is the density of S n / √ n and σ n = q σ − n .Define the sequence N n = n / γ √ log n for n large enough (so that N n ≥ P { ρ ≥ N n } ≤ s M s log nn (1 / γ ) s = o (cid:18) n s/ δ (cid:19) , < δ < γs. (8.4)Using u log u ≥ u − u ≥
0) and applying (8.3), we may write I n ≡ Z | x |≤ √ log n p n ( x ) log p n ( x ) ϕ ( x ) dx ≥ Z | x |≤ √ log n ( p n ( x ) − ϕ ( x )) dx (8.5) ≥ n Z + ∞ Z | x |≤ √ log n ( ϕ σ n ( x ) − ϕ ( x )) dx dP ( σ ) − C √ log nn s − with some constant C . S. G. BOBKOV, G. P. CHISTYAKOV AND F. G ¨OTZE
Note that σ_n < 1 whenever σ < 1, and thus, for any T > 0,

∫_{|x|≤T} (ϕ_{σ_n}(x) − ϕ(x)) dx = 2(Φ(T/σ_n) − Φ(T)) > 0,

where Φ denotes the distribution function of the standard normal law. Hence, the outer integral in (8.5) may be restricted to the range σ ≥ 1. In addition, the contribution of the values σ ≥ N_n is negligible. More precisely, (8.4) gives

n | ∫_{N_n}^{+∞} ∫_{|x|≤4√log n} (ϕ_{σ_n}(x) − ϕ(x)) dx dP(σ) | ≤ 2n P{ρ ≥ N_n} = o(n^{−(s−2)/2−δ}).

Comparing this relation with (8.5) and imposing the additional requirement δ < (s−2)/2, we get

I_n ≥ n ∫_1^{N_n} ∫_{|x|≤4√log n} (ϕ_{σ_n}(x) − ϕ(x)) dx dP(σ) + o(n^{−(s−2)/2−δ})   (8.6)
    = −2n ∫_1^{N_n} ∫_{4√(log n)/σ_n}^{4√log n} ϕ(x) dx dP(σ) + o(n^{−(s−2)/2−δ}).

Now, let us estimate p_n(x) from below in the region 4√log n ≤ |x| ≤ n^γ. If |x| ≥ 4√log n, it follows from (8.3) that

p_n(x) = n ∫_0^{+∞} ϕ_{σ_n}(x) dP(σ) + o(n^{−(s−2)}).   (8.7)

Consider the function

g_n(x) = ∫_0^{+∞} (ϕ_{σ_n}(x)/ϕ(x)) dP(σ).

Note that 1 ≤ σ_n ≤ σ for σ ≥ 1. In this case, the ratio ϕ_{σ_n}(x)/ϕ(x) is nondecreasing in x ≥ 0. Moreover, for σ ≥ √(3n+1), we have σ_n² = 1 + (σ²−1)/n ≥ 4, so 1 − 1/σ_n² ≥ 3/4. Hence, for |x| ≥ 4√log n,

ϕ_{σ_n}(x)/ϕ(x) = (1/σ_n) e^{x²(1 − 1/σ_n²)/2} ≥ n^6/σ.

Therefore,

g_n(x) ≥ n^6 ∫_{√(3n+1)}^{+∞} (1/σ) dP(σ).

But by assumption (8.1), the last expression tends to infinity with n, so for all n large enough, g_n(x) ≥ 1 for all |x| ≥ 4√log n.

Furthermore, if σ ≥ |x|√n, then σ_n² = 1 + (σ²−1)/n ≥ x², so x²/σ_n² ≤ 1. On the other hand, nσ_n² = n + σ² − 1 ≤ σ²/x² + σ² ≤ 2σ², since |x| ≥ 4√log n > 1 (n ≥ 2). The two estimates give

ϕ_{σ_n}(x) = (1/(σ_n√(2π))) e^{−x²/(2σ_n²)} ≥ √n/(6σ).

Therefore, whenever 4√log n ≤ |x| ≤ n^γ,

n ∫_0^{+∞} ϕ_{σ_n}(x) dP(σ) ≥ (n^{3/2}/6) ∫_{|x|√n}^{+∞} (1/σ) dP(σ) ≥ (n^{3/2}/6) ∫_{n^{1/2+γ}}^{+∞} (1/σ) dP(σ).

By assumption (8.1), the last expression, and therefore the left integral, is larger than c n^{−(s−2)} with some constant c > 0. Consequently, the remainder term in (8.7) is indeed smaller, so that for all n large enough, we may write, for example,

p_n(x) ≥ 0.5 n ∫_0^{+∞} ϕ_{σ_n}(x) dP(σ) = 0.5 n g_n(x) ϕ(x)   (4√log n ≤ |x| ≤ n^γ).

Since g_n(x) ≥ 1 for |x| ≥ 4√log n with large n, we have in this region p_n(x)/ϕ(x) ≥ 0.5 n > √n, thus

p_n(x) log(p_n(x)/ϕ(x)) ≥ (1/2) p_n(x) log n ≥ 0.25 n log n ∫_0^{+∞} ϕ_{σ_n}(x) dP(σ).

Hence,

∫_{4√log n ≤ |x| ≤ n^γ} p_n(x) log(p_n(x)/ϕ(x)) dx ≥ 0.25 n log n ∫_0^{+∞} ∫_{4√log n ≤ |x| ≤ n^γ} ϕ_{σ_n}(x) dx dP(σ)   (8.8)
    = 0.5 n log n ∫_0^{+∞} ∫_{4√(log n)/σ_n}^{n^γ/σ_n} ϕ(x) dx dP(σ).

At this point, it is useful to note that n^γ/σ_n ≥ 4√log n, as long as σ ≤ N_n with n large enough. Indeed, in this case

σ_n² ≤ (1 − 1/n) + N_n²/n ≤ 1 + n^{2γ}/(25 log n),

so

(4 σ_n √log n)² ≤ 16 log n + (16/25) n^{2γ} < n^{2γ}

for all n large enough. Hence, from (8.8),

∫_{4√log n ≤ |x| ≤ n^γ} p_n(x) log(p_n(x)/ϕ(x)) dx ≥ 0.5 n log n ∫_1^{N_n} ∫_{4√(log n)/σ_n}^{4√log n} ϕ(x) dx dP(σ).
But the last expression dominates the double integral in (8.6), which enters there with a factor of 2n only. Therefore, combining the above estimate with (8.6), we get

∫_{|x|≤n^γ} p_n(x) log(p_n(x)/ϕ(x)) dx ≥ 0.25 n log n ∫_1^{N_n} ∫_{4√(log n)/σ_n}^{4√log n} ϕ(x) dx dP(σ) + o(n^{−(s−2)/2−δ}).

Finally, we may extend the outer integral on the right-hand side to all values σ ≥ 1, since by (8.4),

n log n ∫_{N_n}^{+∞} ∫_{4√(log n)/σ_n}^{4√log n} ϕ(x) dx dP(σ) ≤ n log n P{ρ > N_n} = o(n^{−(s−2)/2−δ}).

Hence,

∫_{|x|≤n^γ} p_n(x) log(p_n(x)/ϕ(x)) dx ≥ 0.25 n log n ∫_1^{+∞} ∫_{4√(log n)/σ_n}^{4√log n} ϕ(x) dx dP(σ) + o(n^{−(s−2)/2−δ}).   (8.9)

For the remaining values |x| ≥ n^γ, one can just use the property u log u ≥ −1/e to get a simple lower bound

∫_{|x|>n^γ} p_n(x) log(p_n(x)/ϕ(x)) dx ≥ ∫_{|x|>n^γ, p_n(x)≤ϕ(x)} p_n(x) log(p_n(x)/ϕ(x)) dx
    ≥ −(1/e) ∫_{|x|>n^γ, p_n(x)≤ϕ(x)} ϕ(x) dx ≥ −e^{−n^{2γ}/2}.

Together with (8.9) this yields

∫_{−∞}^{+∞} p_n(x) log(p_n(x)/ϕ(x)) dx ≥ 0.25 n log n ∫_1^{+∞} ∫_{4√(log n)/σ_n}^{4√log n} ϕ(x) dx dP(σ) + o(n^{−(s−2)/2−δ}).

To simplify, finally note that σ_n ≥ √log n, and hence 4√(log n)/σ_n ≤ 4, as soon as σ ≥ √(n log n). In this case the last inner integral is separated from zero (for large n), hence with some absolute constant c > 0,

∫_{−∞}^{+∞} p_n(x) log(p_n(x)/ϕ(x)) dx ≥ c n log n P{ρ ≥ √(n log n)} + o(n^{−(s−2)/2−δ}).

This is exactly the required inequality (8.2), and Proposition 8.1 is proved. □
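As a numerical aside (again not part of the paper), the entropic distance D(S_n/√n) can be evaluated exactly for a two-point mixing measure: conditionally on the number of factors ρ_i taking the larger value, S_n/√n is a centred normal, so p_n is a binomial mixture of normal densities and D can be computed by quadrature. The particular values of q and b below are hypothetical.

```python
# Numerical aside (not from the paper): exact D(S_n / sqrt(n)) for a two-point mixing measure.
# rho equals b with probability q and a with probability 1 - q, normalized so E rho^2 = 1.
import numpy as np
from scipy.stats import norm, binom
from scipy.integrate import quad

q, b = 0.05, 2.5
a = np.sqrt((1 - q * b**2) / (1 - q))    # enforces (1-q) a^2 + q b^2 = 1 (needs q b^2 < 1)

def entropic_distance(n):
    # If k of the n factors equal b, then S_n/sqrt(n) ~ N(0, v_k) with v_k = (k b^2 + (n-k) a^2)/n,
    # so the density p_n is a binomial mixture of centred normal densities.
    k = np.arange(n + 1)
    weights = binom.pmf(k, n, q)
    scales = np.sqrt((k * b**2 + (n - k) * a**2) / n)
    p_n = lambda x: np.sum(weights * norm.pdf(x, scale=scales))
    # D = int p_n log(p_n / phi) dx; by symmetry, twice the integral over [0, infinity).
    integrand = lambda x: p_n(x) * (np.log(p_n(x)) - norm.logpdf(x))
    return 2 * quad(integrand, 0, 30, limit=200)[0]

for n in [1, 2, 4, 8, 16, 32]:
    print(n, entropic_distance(n))
```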
Proof of Theorem 1.3. Given η > 0, one may apply Proposition 8.1 to the probability measure P with density

dP(σ)/dσ = c / (σ^{s+1} (log σ)^η),   σ > 2,

extending it to an interval [σ_0, 2] so as to meet the requirement ∫_{σ_0}^{+∞} σ² dP(σ) = 1 (with some 0 < σ_0 < 1 and a suitable constant c = c_{η,s}). It is easy to see that in this case condition (8.1) is fulfilled for 0 < γ < (s−2)/(2(s+1)). In addition, if ρ has the distribution P, we have

P{ρ ≥ σ} ≥ const · 1/(σ^s (log σ)^η)

for all σ large enough. Hence, by taking σ = √(n log n), (8.2) provides the desired lower bound. □
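The constant c and the extension to [σ_0, 2] only serve to make P a probability measure with ∫σ²dP(σ) = 1. One concrete way to realize this numerically is sketched below; the particular extension (the same power law σ^{−(s+1)}, matched continuously at σ = 2) and the parameter values are my own assumptions, not the paper's.

```python
# Sketch (my own concretization, not from the paper): normalize the mixing density
#   dP/dsigma = c * sigma^-(s+1) * (log sigma)^-eta   for sigma > 2,
# extended to [sigma0, 2] by c * kappa * sigma^-(s+1), so that P has total mass 1 and E rho^2 = 1.
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

s, eta = 3.0, 2.0   # hypothetical parameter values (2 < s < 4, eta > 0)

# mass and second moment of the tail part on (2, infinity), computed for c = 1
tail_mass = quad(lambda t: t ** (-(s + 1)) * np.log(t) ** (-eta), 2, np.inf)[0]
tail_m2   = quad(lambda t: t ** (1 - s)   * np.log(t) ** (-eta), 2, np.inf)[0]
kappa = np.log(2.0) ** (-eta)            # makes the two pieces agree at sigma = 2

def bulk_mass(sigma0):   # integral of kappa * t^-(s+1) over [sigma0, 2]
    return kappa * (sigma0 ** (-s) - 2.0 ** (-s)) / s

def bulk_m2(sigma0):     # integral of kappa * t^2 * t^-(s+1) over [sigma0, 2]
    return kappa * (sigma0 ** (2 - s) - 2.0 ** (2 - s)) / (s - 2)

# choose sigma0 in (0, 1) so that total mass equals total second moment;
# the common value is then rescaled to 1 by the constant c.
gap = lambda sigma0: (bulk_mass(sigma0) + tail_mass) - (bulk_m2(sigma0) + tail_m2)
sigma0 = brentq(gap, 1e-6, 1.0)
c = 1.0 / (bulk_mass(sigma0) + tail_mass)
print("sigma0 =", sigma0, "  c =", c)
print("check: mass =", c * (bulk_mass(sigma0) + tail_mass),
      "  E rho^2 =", c * (bulk_m2(sigma0) + tail_m2))
```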
Remark. In the case s = 2 (i.e., with minimal moment assumptions), mixtures of normal laws with discrete mixing measures P were used by Matskyavichyus [18] in the central limit theorem with respect to the Kolmogorov distance. Namely, it is shown there that, for any prescribed sequence ε_n → 0, one may choose P such that Δ_n = sup_x |F_n(x) − Φ(x)| ≥ ε_n for all n large enough (where F_n is the distribution function of Z_n). In view of the Pinsker-type inequality, one may conclude that D(Z_n) ≥ 2Δ_n² ≥ 2ε_n². Therefore, D(Z_n) may decay at an arbitrarily slow rate.
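For completeness, here is a tiny numerical check (not from the paper) of the Pinsker-type bound invoked above, D(P‖Q) ≥ 2·(sup_x |F_P(x) − F_Q(x)|)², in the simplest case where both laws are centred normals and D is available in closed form.

```python
# Sketch (not from the paper): Pinsker-type bound D(P||Q) >= 2 * (sup_x |F_P(x) - F_Q(x)|)^2,
# checked for P = N(0, v) and Q = N(0, 1), where D(P||Q) = (v - 1 - log v)/2 in closed form.
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

for v in [1.2, 1.7, 3.0]:                       # hypothetical variances of P
    D = 0.5 * (v - 1.0 - np.log(v))             # relative entropy of N(0, v) w.r.t. N(0, 1)
    neg_gap = lambda x: -abs(norm.cdf(x / np.sqrt(v)) - norm.cdf(x))
    Delta = -minimize_scalar(neg_gap, bounds=(0.0, 10.0), method="bounded").fun
    print(f"v = {v}:  D = {D:.4f}   2*Delta^2 = {2 * Delta**2:.4f}   holds: {D >= 2 * Delta**2}")
```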
REFERENCES

[1] Artstein, S., Ball, K. M., Barthe, F. and Naor, A. (2004). Solution of Shannon's problem on the monotonicity of entropy. J. Amer. Math. Soc.
[2] Artstein, S., Ball, K. M., Barthe, F. and Naor, A. (2004). On the rate of convergence in the entropic central limit theorem. Probab. Theory Related Fields.
[3] Barron, A. R. (1986). Entropy and the central limit theorem. Ann. Probab.
[4] Bhattacharya, R. N. and Ranga Rao, R. (1976). Normal Approximation and Asymptotic Expansions. Wiley, New York. MR0436272
[5] Bikjalis, A. (1964). An estimate for the remainder term in the central limit theorem. Litovsk. Mat. Sb.
[6] Bobkov, S. G., Chistyakov, G. P. and Götze, F. (2011). Non-uniform bounds in local limit theorems in case of fractional moments. I. Math. Methods Statist.
[7] Bobkov, S. G., Chistyakov, G. P. and Götze, F. (2011). Non-uniform bounds in local limit theorems in case of fractional moments. II. Math. Methods Statist.
[8] Bobkov, S. G. and Götze, F. (2007). Concentration inequalities and limit theorems for randomized sums. Probab. Theory Related Fields.
[9] Esseen, C.-G. (1945). Fourier analysis of distribution functions. A mathematical study of the Laplace–Gaussian law. Acta Math.
[10] Feller, W. (1971). An Introduction to Probability Theory and Its Applications, Vol. II, 2nd ed. Wiley, New York. MR0270403
[11] Gnedenko, B. V. and Kolmogorov, A. N. (1954). Limit Distributions for Sums of Independent Random Variables. Addison-Wesley, Reading, MA. Translated and annotated by K. L. Chung, with an appendix by J. L. Doob. MR0062975
[12] Götze, F. and Hipp, C. (1978). Asymptotic expansions in the central limit theorem under moment conditions. Z. Wahrsch. Verw. Gebiete.
[13] Ibragimov, I. A. and Linnik, J. V. (1965). Independent and Stationarily Connected Variables. Nauka, Moscow.
[14] Johnson, O. (2004). Information Theory and the Central Limit Theorem. Imperial College Press, London. MR2109042
[15] Johnson, O. and Barron, A. (2004). Fisher information inequalities and the central limit theorem. Probab. Theory Related Fields.
[16] Linnik, J. V. (1959). An information-theoretic proof of the central limit theorem with Lindeberg conditions. Theory Probab. Appl.
[17] Madiman, M. and Barron, A. (2007). Generalized entropy power inequalities and monotonicity properties of information. IEEE Trans. Inform. Theory.
[18] Matskyavichyus, V. K. (1983). A lower bound for the rate of convergence in the central limit theorem. Teor. Veroyatn. Primen.
[19] Osipov, L. V. and Petrov, V. V. (1967). On the estimation of the remainder term in the central limit theorem. Teor. Veroyatn. Primen.
[20] Petrov, V. V. (1964). On local limit theorems for sums of independent random variables. Theory Probab. Appl.
[21] Petrov, V. V. (1975). Sums of Independent Random Variables. Springer, New York. MR0388499
[22] Prohorov, Y. V. (1952). A local theorem for densities. Doklady Akad. Nauk SSSR (N.S.)
[23] Siraždinov, S. H. and Mamatov, M. (1962). On mean convergence for densities. Teor. Veroyatn. Primen.
[24] Szegő, G. (1967). Orthogonal Polynomials, 3rd ed. American Mathematical Society Colloquium Publications. Amer. Math. Soc., Providence, RI. MR0310533
[25] Tucker, H. G. (1965). On a necessary and sufficient condition that an infinitely divisible distribution be absolutely continuous. Trans. Amer. Math. Soc.
[26] Vilenkin, P. A. and D'yachkov, A. G. (1998). Asymptotics of Shannon and Rényi entropies for sums of independent random variables. Problemy Peredachi Informatsii; English translation in Probl. Inf. Transm. (1999) 219–232. MR1663910

S. G. Bobkov
School of Mathematics
University of Minnesota
127 Vincent Hall, 206 Church St. S.E.
Minneapolis, Minnesota 55455
USA
E-mail: [email protected]