On the entropy power inequality for the Rényi entropy of order [0,1]
Arnaud Marsiglietti∗ and James Melbourne†

Abstract
Using a sharp version of the reverse Young inequality, and a Rényi entropy comparison result due to Fradelizi, Madiman, and Wang (2016), the authors derive Rényi entropy power inequalities for log-concave random vectors when the Rényi parameters belong to $[0, 1]$. Furthermore, the estimates are shown to be sharp up to absolute constants.

Keywords.
Entropy power inequality, Rényi entropy, log-concave.
Let $r \in [0, \infty]$. The Rényi entropy [39] of parameter $r$ is defined for continuous random vectors $X \sim f_X$ as
$$h_r(X) = \frac{1}{1-r} \log \left( \int_{\mathbb{R}^n} f_X^r(x) \, dx \right). \qquad (1)$$
We take the Rényi entropy power of $X$ to be
$$N_r(X) = e^{\frac{2}{n} h_r(X)} = \left( \int_{\mathbb{R}^n} f_X^r(x) \, dx \right)^{\frac{2}{n(1-r)}}. \qquad (2)$$
Three important cases are handled by continuous limits,
$$N_0(X) = \mathrm{Vol}_n(\mathrm{supp}(X))^{2/n}, \qquad (3)$$
$$N_\infty(X) = \|f_X\|_\infty^{-2/n}, \qquad (4)$$
and $N_1(X)$ corresponds to the usual Shannon entropy power $N(X) = N_1(X) = e^{-\frac{2}{n} \int f \log f}$. Here, $\mathrm{Vol}_n(A)$ denotes the Lebesgue measure of a measurable set $A$, and $\mathrm{supp}(X)$ denotes the support of $X$.

The entropy power inequality (EPI) is the statement that the Shannon entropy power of independent random vectors $X$ and $Y$ is super-additive,
$$N(X + Y) \geq N(X) + N(Y). \qquad (5)$$
In this language we interpret the Brunn-Minkowski inequality of Convex Geometry, classically stated as the fact that
$$\mathrm{Vol}_n(A + B)^{\frac{1}{n}} \geq \mathrm{Vol}_n(A)^{\frac{1}{n}} + \mathrm{Vol}_n(B)^{\frac{1}{n}} \qquad (6)$$
for any pair of compact sets of $\mathbb{R}^n$ (see [25] for an introduction to the literature surrounding this inequality), as a Rényi-EPI corresponding to $r = 0$. That is, the Brunn-Minkowski inequality is equivalent to the fact that for independent random vectors $X$ and $Y$, the square root of the $0$-th Rényi entropy power is super-additive,
$$N_0^{\frac{1}{2}}(X + Y) \geq N_0^{\frac{1}{2}}(X) + N_0^{\frac{1}{2}}(Y). \qquad (7)$$
The parallels between the two famed inequalities had been observed in the 1984 paper of Costa and Cover [17], and a unified proof using the sharp Young inequality was given in 1991 by Dembo, Cover, and Thomas [19]. Subsequently, analogs of further Shannon entropic inequalities and properties in Convex Geometry have been pursued.

∗ Supported in part by the Walter S. Baer and Jeri Weiss CMI Postdoctoral Fellowship.
† Supported by NSF grant CNS 1544721. Parts of this paper were presented at the 2018 IEEE International Symposium on Information Theory, Vail, CO, USA, June 2018.
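As a numerical sanity check of definition (2) (not part of the original paper), one can verify, for a one-dimensional standard Gaussian, the closed form $N_r(Z) = 2\pi r^{1/(r-1)}$ that is also used in (117) below. A minimal Python sketch using a simple quadrature:

```python
import math

def gaussian_density(x):
    # Standard normal density on the real line.
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def renyi_entropy_power(r, step=1e-3, half_width=40.0):
    """N_r(Z) = (integral of f^r)^(2/(1-r)) for n = 1, via a midpoint Riemann sum."""
    n = int(2 * half_width / step)
    integral = sum(gaussian_density(-half_width + (i + 0.5) * step) ** r
                   for i in range(n)) * step
    return integral ** (2.0 / (1.0 - r))

for r in (0.25, 0.5, 0.75):
    closed_form = 2.0 * math.pi * r ** (1.0 / (r - 1.0))
    assert abs(renyi_entropy_power(r) - closed_form) < 1e-6 * closed_form
```

For instance, at $r = 1/2$ the quadrature reproduces $N_{1/2}(Z) = 8\pi$.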
For example, the monotonicity of entropy in the central limit theorem (see [1, 30, 42]) motivated the investigation of quantifiable convexification of a general measurable set under repeated Minkowski summation with itself (see [21, 22]). Motivated by Costa's EPI improvement [16], Costa and Cover conjectured that the volume of a general set, when summed with a dilate of the Euclidean unit ball, should have concave growth in the dilation parameter [17]. Though this was disproved for general sets in [24], open questions of this nature remain.

Conversely, V. Milman's reversal of the Brunn-Minkowski inequality (for symmetric convex bodies under certain volume preserving linear maps) [36] inspired Bobkov and Madiman to ask and answer whether the EPI could be reversed for log-concave random vectors under analogous mappings [6]. In [5] the authors also formulated an entropic version of Bourgain's slicing conjecture [13], a longstanding open problem in convex geometry that has attracted a lot of attention.

A further example of an inequality at the interface of geometry and information theory can be found in [2], where Ball, Nayar, and Tkocz conjectured the existence of an entropic Busemann inequality [15] for symmetric log-concave random variables and proved some partial results; see [44] for an extension to "$s$-concave" random variables.

We refer to the survey [31] for further details on the connections between convex geometry and information theory.

Recently the super-additivity of more general Rényi functionals has seen significant activity, starting with Bobkov and Chistyakov [8, 9], where it is shown (the former focusing on $r = \infty$, the latter on $r \in (1, \infty)$) that for $r \in (1, \infty]$ there exist universal constants $c(r) \in (1/e, 1)$ such that for $X_i$ independent random vectors
$$N_r(X_1 + \cdots + X_k) \geq c(r) \sum_{i=1}^k N_r(X_i). \qquad (8)$$
This was followed by Ram and Sason [38], who used optimization techniques to sharpen bounds on the constant $c(r)$, which should more appropriately be written $c(r, k)$, as the authors were able to clarify the dependency on the number of summands as well as the Rényi parameter $r$. Bobkov and Marsiglietti [10] showed that for $r \in (1, \infty)$ there exists an $\alpha$-modification of the Rényi entropy power that preserves super-additivity. More precisely, taking $\alpha = \frac{r+1}{2}$, $r \in [1, \infty)$, and $X, Y$ independent random vectors,
$$N_r^\alpha(X + Y) \geq N_r^\alpha(X) + N_r^\alpha(Y). \qquad (9)$$
This was sharpened by Li [28], who optimized the argument of Bobkov and Marsiglietti. The case of $r = \infty$ was studied using functional analytic tools by Madiman, Melbourne, and Xu [32, 45], who showed that the $N_\infty$ functional enjoys an analog of the matrix generalizations of Brunn-Minkowski and the Shannon-Stam EPI due to Feder and Zamir [47, 48], and began investigation into discrete versions of the inequality in [46].

Conspicuously absent from the discussion above, and mentioned as an open problem in [9, 28, 31, 38], are super-additivity properties of the Rényi entropy power when $r \in (0, 1)$. In this paper, we address this problem, and provide a solution in the log-concave case (see Definition 5). Our first main result is the following.

Theorem 1. Let $r \in (0, 1)$. Let $X, Y$ be independent log-concave random vectors in $\mathbb{R}^n$. Then,
$$N_r^\alpha(X + Y) \geq N_r^\alpha(X) + N_r^\alpha(Y), \qquad (10)$$
where
$$\alpha \triangleq \alpha(r) = \frac{(1-r) \log 2}{(1+r) \log(1+r) - r \log r - 2r \log 2}. \qquad (11)$$
Furthermore, and in contrast to some previous optimism (see, e.g., [28]), these estimates are somewhat sharp for log-concave random vectors. Indeed, letting $\alpha_{opt} = \alpha_{opt}(r)$ denote the infimum over all $\alpha$ satisfying the inequality (10) for log-concave random vectors, we have
$$\max \left\{ 1, \ \frac{(1-r) \log 2}{2 \log \Gamma(1+r) - 2r \log r} \right\} \leq \alpha_{opt} \leq \alpha(r), \qquad (12)$$
(see Proposition 11 in Section 5).
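The bounds in (11) and (12) lend themselves to a quick numerical check (a sketch of ours, not from the paper, assuming the displayed formulas): the lower and upper bounds should sandwich within a factor of $2$, blow up as $r \to 0$, and both recover the classical exponent $\alpha = 1$ as $r \to 1$.

```python
import math

def alpha_upper(r):
    # alpha(r) from (11), the exponent achieved in Theorem 1.
    return ((1 - r) * math.log(2)) / (
        (1 + r) * math.log(1 + r) - r * math.log(r) - 2 * r * math.log(2))

def alpha_lower(r):
    # Lower bound on alpha_opt from (12), via the exponential-distribution example.
    return ((1 - r) * math.log(2)) / (2 * math.lgamma(1 + r) - 2 * r * math.log(r))

for r in (0.01, 0.1, 0.5, 0.9, 0.999):
    assert alpha_lower(r) <= alpha_upper(r) <= 2 * alpha_lower(r)

assert alpha_upper(0.01) > 10          # the exponent blows up like (-r log r)^(-1)
assert abs(alpha_upper(1 - 1e-6) - 1.0) < 1e-3  # the classical EPI exponent as r -> 1
```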
Unsurprisingly, the bounds (11) and (12) imply that
$$\lim_{r \to 1} \alpha(r) = \lim_{r \to 1} \alpha_{opt}(r) = 1, \qquad (13)$$
recovering the usual EPI. In fact, the ratio of the lower and upper bounds satisfies
$$\frac{(1+r) \log(1+r) - r \log r - 2r \log 2}{2 \log \Gamma(1+r) - 2r \log r} \to \frac{1}{2} \qquad (14)$$
as $r \to 0$, as can be seen by applying L'Hôpital's rule and the strict convexity of $\log \Gamma(1+r)$. It can be verified numerically that the derivative of (14) is strictly positive on $(0, 1]$. Thus the $\alpha(r)$ derived cannot be improved beyond a factor of $2$.

More strikingly, as $r \to 0$ the bounds derived force both $\alpha_{opt}$ and $\alpha$ to be of the order $(-r \log r)^{-1}$. Thus, $\alpha_{opt}(r) \to +\infty$ for $r \to 0$, while $\alpha_{opt}(0) = 1/2$ by the Brunn-Minkowski inequality. Nevertheless, in the case that the random vectors are uniformly distributed we do have better behavior.

Theorem 2.
Let $r \in (0, 1)$. Let $X, Y$ be independent random vectors uniformly distributed on compact sets. Then,
$$N_r^\beta(X + Y) \geq N_r^\beta(X) + N_r^\beta(Y), \qquad (15)$$
where
$$\beta \triangleq \beta(r) = \frac{(1-r) \log 2}{2 \log 2 + r \log r - (r+1) \log(r+1)}. \qquad (16)$$
Stated geometrically, Theorem 2 is the following generalization of the Brunn-Minkowski inequality.

Theorem 3.
Let $r \in (0, 1)$. Let $A, B$ be compact sets in $\mathbb{R}^n$. Then, letting $X$ and $Y$ denote independent random vectors distributed uniformly on the respective sets $A$ and $B$,
$$e^{h_r(X+Y)} \geq \left( \mathrm{Vol}^\gamma(A) + \mathrm{Vol}^\gamma(B) \right)^{\frac{1}{\gamma}} \qquad (17)$$
where $\gamma \triangleq 2\beta/n$.

Theorems 2 and 3 can be understood as a family of Rényi-EPIs for uniform distributions interpolating between the Brunn-Minkowski inequality and the EPI. Indeed, $\lim_{r \to 0} \gamma = 1/n$, while $e^{h_r(X+Y)}$ increases to $\mathrm{Vol}(A + B)$, and we recover the Brunn-Minkowski inequality (6). Observe that $\lim_{r \to 1} \beta = 1$ gives the usual EPI in the special case that the random vectors are uniform distributions. Note also that the exponent $\beta$ in (16) is identical to the exponent obtained in [28, Theorem 2.2] for $r > 1$.

We also approach the Rényi EPI of the form (8) and obtain the following result.

Theorem 4. Let $r \in (0, 1)$. For all independent log-concave random vectors $X_1, \ldots, X_k$ in $\mathbb{R}^n$,
$$N_r(X_1 + \cdots + X_k) \geq c(r, k) \sum_{i=1}^k N_r(X_i) \qquad (18)$$
where
$$c(r, k) \geq r^{\frac{1}{1-r}} \left( 1 + \frac{1}{k|r'|} \right)^{k|r'| + 1}. \qquad (19)$$
This bound is shown to be tight up to absolute constants as well. Indeed, we will see in Proposition 14 in Section 6 that the largest constant $c_{opt}(r)$ satisfying
$$N_r(X_1 + \cdots + X_k) \geq c_{opt}(r) \sum_{i=1}^k N_r(X_i) \qquad (20)$$
for any $k$-tuple of independent log-concave random vectors satisfies
$$e \, r^{\frac{1}{1-r}} \leq c_{opt}(r) \leq \pi \, r^{\frac{1}{1-r}}. \qquad (21)$$

For $p \in [0, \infty]$, we denote by $p'$ the conjugate of $p$,
$$\frac{1}{p} + \frac{1}{p'} = 1. \qquad (22)$$
For a non-negative function $f: \mathbb{R}^n \to [0, +\infty)$ we introduce the notation
$$\|f\|_p = \left( \int_{\mathbb{R}^n} f^p(x) \, dx \right)^{1/p}. \qquad (23)$$

Definition 5.
A random vector $X$ in $\mathbb{R}^n$ is log-concave if it possesses a log-concave density $f_X: \mathbb{R}^n \to [0, +\infty)$ with respect to Lebesgue measure. In other words, for all $\lambda \in (0, 1)$ and $x, y \in \mathbb{R}^n$,
$$f_X((1 - \lambda) x + \lambda y) \geq f_X^{1-\lambda}(x) \, f_X^{\lambda}(y). \qquad (24)$$
Equivalently, $f_X$ can be written in the form $e^{-V}$, where $V$ is a proper convex function.

Log-concave random vectors and functions are important classes in many disciplines. In the context of information theory, several nice properties involving the entropy of log-concave random vectors were recently established (see, e.g., [3, 5, 18, 33, 40, 41]). Significant examples are Gaussian and exponential distributions, as well as any uniform distribution on a convex set.

The main tool in establishing Theorems 1, 2 and 4 is the reverse form of the sharp Young inequality. The reversal of Young's inequality for parameters in $[0, 1]$ is due to Leindler [27], while sharp constants were obtained independently by Beckner, and Brascamp and Lieb:

Theorem 6 ([4, 14]). Let $0 \leq p, q, r \leq 1$ be such that $\frac{1}{p'} + \frac{1}{q'} = \frac{1}{r'}$. Then,
$$\|f \star g\|_r \geq C^{n/2} \|f\|_p \|g\|_q, \qquad (25)$$
where
$$C = C(p, q, r) = \frac{c_p c_q}{c_r}, \qquad c_m = \frac{m^{1/m}}{|m'|^{1/m'}}. \qquad (26)$$

Let us recall the information-theoretic interpretation of Young's inequality. Given independent random vectors $X$ with density $f$ and $Y$ with density $g$, the random vector $X + Y$ will be distributed according to $f \star g$. Observe that the $L_p$ "norms" have the following expression as Rényi entropy powers (for $r < 1$),
$$\|f\|_r = N_r(X)^{-\frac{n}{2r'}} = N_r(X)^{\frac{n}{2|r'|}}. \qquad (27)$$
Hence, we can rewrite (25) as follows,
$$N_r(X + Y)^{\frac{1}{|r'|}} \geq C \, N_p(X)^{\frac{1}{|p'|}} N_q(Y)^{\frac{1}{|q'|}}. \qquad (28)$$
This is an information-theoretic interpretation of the sharp Young inequality, which was developed in [19].

We also need a Rényi comparison result for log-concave random vectors that the authors first learned from private communication [29]. Though the result is implicit in [23], we give a derivation in the appendix for the convenience of the reader.
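The reverse Young inequality of Theorem 6 can be checked numerically (a sketch of ours, not part of the paper): for $n = 1$ the constant in (25) is $C^{1/2}$, the $L_p$ norms of centered Gaussian densities have a closed form, and Gaussians with variances proportional to $1/|p'|$ and $1/|q'|$ saturate the bound.

```python
import math

def conjugate(m):
    # 1/m + 1/m' = 1; for m in (0, 1) the conjugate m' is negative.
    return m / (m - 1)

def gaussian_lp_norm(p, var):
    # ||f||_p for the N(0, var) density on R, in closed form.
    return (2 * math.pi * var) ** ((1 - p) / (2 * p)) * p ** (-1 / (2 * p))

def young_constant(p, q, r):
    # C^{n/2} for n = 1, with C = c_p c_q / c_r and c_m = m^{1/m} / |m'|^{1/m'} as in (26).
    def c_half(m):
        mp = conjugate(m)
        return (m ** (1 / m) / abs(mp) ** (1 / mp)) ** 0.5
    return c_half(p) * c_half(q) / c_half(r)

p, q = 5 / 7, 0.625
r = 1 / (1 - (1 / conjugate(p) + 1 / conjugate(q)))  # enforces 1/p' + 1/q' = 1/r'

for v1, v2 in [(1.0, 1.0), (0.3, 2.0)]:
    lhs = gaussian_lp_norm(r, v1 + v2)  # ||f * g||_r: the convolution is N(0, v1 + v2)
    rhs = young_constant(p, q, r) * gaussian_lp_norm(p, v1) * gaussian_lp_norm(q, v2)
    assert lhs >= rhs

# Variances proportional to 1/|p'| and 1/|q'| give equality in (25).
v1, v2 = 1 / abs(conjugate(p)), 1 / abs(conjugate(q))
lhs = gaussian_lp_norm(r, v1 + v2)
rhs = young_constant(p, q, r) * gaussian_lp_norm(p, v1) * gaussian_lp_norm(q, v2)
assert abs(lhs - rhs) < 1e-9 * lhs
```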
A generalization of the result to $s$-concave random variables (see [7, 12]) is planned to be included in a revised version of [20].

Lemma 7 (Fradelizi-Madiman-Wang [23, 29]). Let $0 < p < q$. Then, for every log-concave random vector $X$,
$$N_q(X) \leq N_p(X) \leq p^{\frac{2}{p-1}} q^{-\frac{2}{q-1}} N_q(X). \qquad (29)$$

The first inequality is classical and holds for general $X$, and follows from the expression $N_p(X)^{n/2} = \left( \mathbb{E} f^{p-1}(X) \right)^{-1/(p-1)}$. Indeed, the increasingness of the function $s \mapsto (\mathbb{E} Y^s)^{1/s}$ for a positive random variable $Y$ and $s \in (-\infty, \infty)$, which follows from Jensen's inequality, implies the decreasingness of Rényi entropy powers. The content of Fradelizi, Madiman, and Wang's result is thus the second inequality, that this decrease is not too rapid for log-concave random vectors.

We will also have use for a somewhat technical but elementary Calculus result.

Lemma 8.
Let $c > 0$. Let $L, F: [0, c] \to [0, \infty)$ be twice differentiable on $(0, c]$, continuous on $[0, c]$, such that $L(0) = F(0) = 0$ and $L'(c) = F'(c) = 0$. Let us also assume that $F(x) > 0$ for $x > 0$, that $F$ is strictly increasing, and that $F'$ is strictly decreasing. Then $\frac{L''}{F''}$ increasing on $(0, c)$ implies that $\frac{L}{F}$ is increasing on $(0, c)$ as well. In particular,
$$\max_{x \in [0, c]} \frac{L(x)}{F(x)} = \frac{L(c)}{F(c)}. \qquad (30)$$
The proof is an exercise in Cauchy's mean value theorem.

Proof.
For $0 < u < v < c$, by Cauchy's mean value theorem,
$$\frac{L'(c) - L'(v)}{F'(c) - F'(v)} = \frac{L''(c_2)}{F''(c_2)}, \qquad (31)$$
$$\frac{L'(v) - L'(u)}{F'(v) - F'(u)} = \frac{L''(c_1)}{F''(c_1)}, \qquad (32)$$
for some $c_1 \in (u, v)$ and $c_2 \in (v, c)$. Thus,
$$\frac{L'(v)}{F'(v)} = \frac{L'(c) - L'(v)}{F'(c) - F'(v)} \qquad (33)$$
$$= \frac{L''(c_2)}{F''(c_2)} \qquad (34)$$
$$\geq \frac{L''(c_1)}{F''(c_1)} \qquad (35)$$
$$= \frac{L'(v) - L'(u)}{F'(v) - F'(u)}, \qquad (36)$$
where (33) holds by the assumption that $L'(c) = F'(c) = 0$, (34) and (36) follow from (31) and (32) respectively, and (35) holds by the assumption that $\frac{L''}{F''}$ is monotonically increasing on $(0, c)$. The inequality
$$\frac{L'(v)}{F'(v)} \geq \frac{L'(v) - L'(u)}{F'(v) - F'(u)} \qquad (37)$$
is equivalent to
$$-L'(v) F'(u) \leq -L'(u) F'(v), \qquad (38)$$
because $F'$ is non-negative and strictly decreasing on $(0, c)$. Thus $L'(v)/F'(v) \geq L'(u)/F'(u)$ since $F' \geq 0$. That is, $L'/F'$ is non-decreasing on $(0, c)$. Now we can apply a similar argument to show that $L/F$ is non-decreasing. Again by Cauchy's mean value theorem, for $0 < u < v < c$ we have
$$\frac{L(u) - L(0)}{F(u) - F(0)} = \frac{L'(c_3)}{F'(c_3)}, \qquad (39)$$
$$\frac{L(v) - L(u)}{F(v) - F(u)} = \frac{L'(c_4)}{F'(c_4)}, \qquad (40)$$
for some $c_3 \in (0, u)$ and $c_4 \in (u, v)$. Thus by the proven non-decreasingness of $\frac{L'}{F'}$ and the fact that $F(0) = L(0) = 0$, the above implies
$$\frac{L(v) - L(u)}{F(v) - F(u)} \geq \frac{L(u)}{F(u)}. \qquad (41)$$
Since $F$ is non-negative and strictly increasing on $(0, c)$, we have
$$L(v) F(u) \geq L(u) F(v). \qquad (42)$$
Thus it follows that $L/F$ is indeed non-decreasing.
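Lemma 8 can be illustrated on a concrete pair (an example of ours, not the paper's): on $[0, \pi/2]$ take $F(x) = \sin x$ and $L(x) = \sin^2 x$. The hypotheses hold ($F(0) = L(0) = 0$, $F'(\pi/2) = L'(\pi/2) = 0$, $F$ strictly increasing, $F'$ strictly decreasing), and $L''/F'' = 4 \sin x - 2/\sin x$ is increasing, so $L/F = \sin x$ should be increasing with maximum at the endpoint:

```python
import math

c = math.pi / 2
F = math.sin                    # F(0) = 0, F'(c) = 0, F increasing, F' decreasing
L = lambda x: math.sin(x) ** 2  # L(0) = 0, L'(x) = sin(2x) vanishes at c
# Here L''/F'' = 2 cos(2x) / (-sin x) = 4 sin x - 2 / sin x, increasing on (0, c).

xs = [c * (i + 1) / 1000 for i in range(999)]
ratios = [L(x) / F(x) for x in xs]
assert all(a < b for a, b in zip(ratios, ratios[1:]))  # L/F increasing, as Lemma 8 predicts
assert max(ratios) <= L(c) / F(c)                      # the maximum is attained at c
```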
We first combine Lemma 7 and the information-theoretic formulation of the reverse Young inequality (28). Observe that for $p, q, r \in (0, 1)$, satisfying the equation $\frac{1}{p'} + \frac{1}{q'} = \frac{1}{r'}$ forces $p, q > r$. Thus, our invocation of Lemma 7 is necessarily at the expense of the two constants below,
$$N_r(X + Y)^{\frac{1}{|r'|}} \geq C \left( p^{\frac{2}{p-1}} r^{\frac{2}{1-r}} \right)^{\frac{1}{|p'|}} \left( q^{\frac{2}{q-1}} r^{\frac{2}{1-r}} \right)^{\frac{1}{|q'|}} N_r(X)^{\frac{1}{|p'|}} N_r(Y)^{\frac{1}{|q'|}}. \qquad (43)$$
Since
$$\frac{1}{|p'|} = -\frac{1}{p'} = \frac{1}{p} - 1 = \frac{1-p}{p}, \qquad (44)$$
we deduce that
$$|p'|(p - 1) = -p. \qquad (45)$$
Also, we have
$$\frac{1}{|p'|} + \frac{1}{|q'|} = \frac{1}{|r'|}. \qquad (46)$$
Hence, we can rewrite (43) as
$$N_r(X + Y)^{\frac{1}{|r'|}} \geq C \, p^{-\frac{2}{p}} q^{-\frac{2}{q}} r^{\frac{2}{r}} N_r(X)^{\frac{1}{|p'|}} N_r(Y)^{\frac{1}{|q'|}} = A(p, q, r) \, N_r(X)^{\frac{1}{|p'|}} N_r(Y)^{\frac{1}{|q'|}}, \qquad (47)$$
where
$$A(p, q, r) = \frac{c_p c_q}{c_r} \cdot \frac{r^{2/r}}{p^{2/p} q^{2/q}}. \qquad (48)$$
Equivalently,
$$N_r(X + Y) \geq A(p, q, r)^{|r'|} N_r(X)^{\frac{|r'|}{|p'|}} N_r(Y)^{\frac{|r'|}{|q'|}}. \qquad (49)$$
We collect these arguments to state the following result, actually stronger than Theorem 1.

Theorem 9.
Let $r \in (0, 1)$. Let $X, Y$ be independent log-concave random vectors in $\mathbb{R}^n$. For $0 < p, q < 1$ satisfying $\frac{1}{p'} + \frac{1}{q'} = \frac{1}{r'}$, we have
$$N_r(X + Y) \geq A(p, q, r)^{|r'|} N_r(X)^{\frac{|r'|}{|p'|}} N_r(Y)^{\frac{|r'|}{|q'|}} \qquad (50)$$
with $A(p, q, r)$ as defined in (48).

Thus to complete our proof of Theorem 1 it suffices to obtain, for a fixed $r \in (0, 1)$, an $\alpha > 0$ such that for any given pair of independent log-concave random vectors $X$ and $Y$, there exist $0 < p, q < 1$ such that $\frac{1}{p'} + \frac{1}{q'} = \frac{1}{r'}$ and
$$A(p, q, r)^{\alpha |r'|} N_r(X)^{\frac{\alpha |r'|}{|p'|}} N_r(Y)^{\frac{\alpha |r'|}{|q'|}} \geq N_r^\alpha(X) + N_r^\alpha(Y). \qquad (51)$$
Let us observe that there is nothing probabilistic about equation (51). If we write $x = N_r(X)^\alpha$, $y = N_r(Y)^\alpha$, our Rényi-EPI is implied by the following algebraic inequality.

Proposition 10.
Given $r \in (0, 1)$ and taking
$$\alpha = \frac{(1-r) \log 2}{(1+r) \log(1+r) - r \log r - 2r \log 2}, \qquad (52)$$
then for any $x, y > 0$ there exist $0 < p, q < 1$ satisfying $\frac{1}{p'} + \frac{1}{q'} = \frac{1}{r'}$ such that
$$A(p, q, r)^{\alpha |r'|} x^{\frac{|r'|}{|p'|}} y^{\frac{|r'|}{|q'|}} \geq x + y. \qquad (53)$$

Proof.
Using the homogeneity of equation (53), we may assume without loss of generality that
$$x + y = \frac{1}{|r'|}. \qquad (54)$$
We then choose admissible $p, q$ by selecting $p' = -\frac{1}{x}$ and $q' = -\frac{1}{y}$. Hence, equation (53) becomes
$$A(p, q, r)^\alpha \geq \frac{(x+y)^{x+y}}{x^x y^y}. \qquad (55)$$
Let us note that $A(p, q, r) \geq 1$ (we will prove this fact in the appendix based on the description of $A(p, q, r)$ in (62)), so that taking logarithms we can choose
$$\alpha = \sup \frac{\log \left( \frac{(x+y)^{x+y}}{x^x y^y} \right)}{\log(A(p, q, r))}, \qquad (56)$$
where the sup runs over all $x, y > 0$ satisfying $x + y = \frac{1}{|r'|}$ (recall that $r \in (0, 1)$ is fixed). We claim that this is exactly the $\alpha$ defined in (52). To establish this fact, let us first rewrite $A(p, q, r)$ in terms of $x$ and $y$. From
$$p = \frac{1}{x + 1}, \qquad q = \frac{1}{y + 1}, \qquad r = \frac{1}{x + y + 1}, \qquad (57)$$
we can write
$$c_p = \frac{p^{1/p}}{|p'|^{1/p'}} = \frac{(x+1)^{-(x+1)}}{x^x} = \frac{1}{x^x (x+1)^{x+1}}, \qquad (58)$$
$$c_q = \frac{1}{y^y (y+1)^{y+1}}, \qquad (59)$$
$$c_r = \frac{1}{(x+y)^{x+y} (x+y+1)^{x+y+1}}. \qquad (60)$$
From (58)-(60) it follows that
$$A(p, q, r) = \frac{c_p c_q}{c_r} \cdot \frac{r^{2/r}}{p^{2/p} q^{2/q}} \qquad (61)$$
$$= \frac{(x+y)^{x+y} (x+1)^{x+1} (y+1)^{y+1}}{x^x y^y (x+y+1)^{x+y+1}}. \qquad (62)$$
Let us denote
$$F(x) \triangleq \log \left( \frac{(x+y)^{x+y}}{x^x y^y} \right) = \frac{1}{|r'|} \log \left( \frac{1}{|r'|} \right) - x \log(x) - \left( \frac{1}{|r'|} - x \right) \log \left( \frac{1}{|r'|} - x \right), \qquad (63)$$
and
$$G(x) \triangleq \log \left( \frac{(x+y)^{x+y} (x+1)^{x+1} (y+1)^{y+1}}{x^x y^y (x+y+1)^{x+y+1}} \right) = F(x) - L(x), \qquad (64)$$
where
$$L(x) \triangleq \left( \frac{1}{|r'|} + 1 \right) \log \left( \frac{1}{|r'|} + 1 \right) - (x+1) \log(x+1) - \left( \frac{1}{|r'|} - x + 1 \right) \log \left( \frac{1}{|r'|} - x + 1 \right). \qquad (65)$$
Our claim is that
$$\alpha = \sup \frac{F}{G} = \sup \frac{F}{F - L} = \left( 1 - \sup \frac{L}{F} \right)^{-1}. \qquad (66)$$
We invoke Lemma 8 to prove that the ratio $L/F$ is increasing on $[0, \frac{1}{2|r'|}]$. Indeed, taking derivatives it is easy to see that $F$ is positive and increasing on $(0, \frac{1}{2|r'|}]$, and its derivative $F'$ is strictly decreasing on the same interval.
Furthermore, $\frac{L''}{F''}$ is non-decreasing on $(0, \frac{1}{2|r'|})$. Indeed,
$$\frac{L''(x)}{F''(x)} = \frac{\frac{1}{|r'|} + 2}{\frac{1}{|r'|}} \cdot \frac{x \left( \frac{1}{|r'|} - x \right)}{(x+1) \left( \frac{1}{|r'|} - x + 1 \right)}, \qquad (67)$$
and one can see that this is non-decreasing when $x \in (0, \frac{1}{2|r'|})$, again by taking the derivative. Now by Lemma 8 applied to $F$, $L$, and $c = \frac{1}{2|r'|}$, we have
$$\sup \left( 1 - \frac{L(x)}{F(x)} \right)^{-1} = \left( 1 - \frac{L(c)}{F(c)} \right)^{-1} = \left( 1 - \frac{L(1/(2|r'|))}{F(1/(2|r'|))} \right)^{-1}. \qquad (68)$$
Let us compute $F(c)$ and $L(c)$, with $c = \frac{1}{2|r'|}$. We have
$$F(c) = 2c \log(2c) - 2c \log(c) = 2c \log 2 = \frac{(1-r) \log 2}{r}, \qquad (69)$$
and
$$L(c) = (2c + 1) \log(2c + 1) - 2(c + 1) \log(c + 1) \qquad (70)$$
$$= \frac{1}{r} \log \left( \frac{1}{r} \right) - \frac{r+1}{r} \log \left( \frac{r+1}{2r} \right). \qquad (71)$$
Thus
$$\alpha = \left( 1 - \frac{L(c)}{F(c)} \right)^{-1} \qquad (72)$$
$$= \left( 1 - \frac{\log \left( \frac{1}{r} \right) - (r+1) \log \left( \frac{r+1}{2r} \right)}{(1-r) \log 2} \right)^{-1} \qquad (73)$$
$$= \frac{(1-r) \log 2}{(1+r) \log(1+r) - r \log r - 2r \log 2}. \qquad (74)$$

We now turn to the proofs of Theorems 2 and 3. The proof is very similar to the proof of Theorem 1. The improvement is by virtue of the fact that for $U$ a random vector uniformly distributed on a set $A \subseteq \mathbb{R}^n$, the Rényi entropy is determined entirely by the volume of $A$, and is thus independent of the parameter. Indeed,
$$N_r(U) = \left( \int_{\mathbb{R}^n} \left( \mathbb{1}_A(x) / \mathrm{Vol}(A) \right)^r dx \right)^{\frac{2}{n(1-r)}} = \mathrm{Vol}(A)^{2/n}. \qquad (75)$$
We again use the information-theoretic version of the sharp Young inequality (see (28)):
$$N_r(X + Y)^{\frac{1}{|r'|}} \geq C \, N_p(X)^{\frac{1}{|p'|}} N_q(Y)^{\frac{1}{|q'|}}. \qquad (76)$$
Now, since $X$ and $Y$ are uniformly distributed, we have
$$N_p(X) = N_r(X), \qquad N_q(Y) = N_r(Y). \qquad (77)$$
Hence,
$$N_r(X + Y) \geq C^{|r'|} N_r(X)^{\frac{|r'|}{|p'|}} N_r(Y)^{\frac{|r'|}{|q'|}}. \qquad (78)$$
Let us raise (78) to the power $\beta$, and put $x = N_r(X)^\beta$, $y = N_r(Y)^\beta$. As before, we can assume that $x + y = \frac{1}{|r'|}$. Thus, it is enough to show that
$$C^{\beta |r'|} x^{\frac{|r'|}{|p'|}} y^{\frac{|r'|}{|q'|}} \geq \frac{1}{|r'|}, \qquad (79)$$
for some admissible $(p, q)$. Let us choose $p, q$ such that $x = \frac{1}{|p'|}$ and $y = \frac{1}{|q'|}$.
The inequality is valid since
$$\beta = \sup \frac{\log \left( \frac{(x+y)^{x+y}}{x^x y^y} \right)}{\log(C)} = \sup \frac{\log \left( \frac{(x+y)^{x+y}}{x^x y^y} \right)}{\log \left( \frac{(x+y)^{x+y} (x+y+1)^{x+y+1}}{x^x y^y (x+1)^{x+1} (y+1)^{y+1}} \right)}, \qquad (80)$$
where the sup runs over all $x, y > 0$ satisfying $x + y = \frac{1}{|r'|}$ (recall that $r \in (0, 1)$ is fixed). Indeed, as in Section 3, it is a consequence of Lemma 8 that the sup is attained at $x = \frac{1}{2|r'|}$, and from this the result follows.

Lower bound on the optimal exponent
Proposition 11.
The optimal exponent $\alpha_{opt}$ that satisfies (10) verifies
$$\max \left\{ 1, \ \frac{(1-r) \log 2}{2 \log \Gamma(r+1) - 2r \log r} \right\} \leq \alpha_{opt} \leq \frac{(1-r) \log 2}{(1+r) \log(1+r) - r \log r - 2r \log 2}. \qquad (81)$$

Let us remark that a smooth interpolation between Brunn-Minkowski and the EPI, as in Theorem 2, cannot hold for any class of random variables that contains the Gaussians. Indeed, let $Z_1$ and $Z_2$ be i.i.d. standard Gaussians. Hence, $Z_1 + Z_2 \sim \sqrt{2} Z_1$, and by homogeneity of the Rényi entropy power,
$$N_r^\alpha(Z_1 + Z_2) = 2^\alpha N_r^\alpha(Z_1), \qquad (82)$$
while
$$N_r^\alpha(Z_1) + N_r^\alpha(Z_2) = 2 N_r^\alpha(Z_1). \qquad (83)$$
It follows that for a modified Rényi-EPI to hold, even when restricted to the class of log-concave random vectors, we must have $2^\alpha \geq 2$. That is, $\alpha \geq 1$.

We now show by direct computation on the exponential distribution on $(0, \infty)$ the lower bounds on $\alpha_{opt}$. Let $X \sim f_X$ be a random variable with exponential distribution, $f_X(x) = \mathbb{1}_{(0, \infty)}(x) e^{-x}$. The computation of the Rényi entropy of $X$ is an obvious change of variables,
$$N_r(X) = \left( \int f_X^r \right)^{\frac{2}{1-r}} = \left( \int_0^\infty e^{-rx} dx \right)^{\frac{2}{1-r}} = \left( \frac{1}{r} \right)^{\frac{2}{1-r}}. \qquad (84)$$
Let $Y$ be an independent copy of $X$. The density of $X + Y$ is
$$f \ast f(x) = \int_{-\infty}^{\infty} \mathbb{1}_{(0, \infty)}(x - y) e^{-(x-y)} \mathbb{1}_{(0, \infty)}(y) e^{-y} dy \qquad (85)$$
$$= \mathbb{1}_{(0, \infty)}(x) \, x e^{-x}. \qquad (86)$$
Hence,
$$N_r(X + Y) = \left( \int \mathbb{1}_{(0, \infty)}(x) x^r e^{-rx} dx \right)^{\frac{2}{1-r}} \qquad (87)$$
$$= \left( \frac{\Gamma(r+1)}{r^{r+1}} \right)^{\frac{2}{1-r}}. \qquad (88)$$
Since the optimal exponent $\alpha_{opt}$ satisfies
$$N_r^{\alpha_{opt}}(X + Y) \geq 2 N_r^{\alpha_{opt}}(X), \qquad (89)$$
we have
$$\left( \frac{\Gamma(r+1)}{r^{r+1}} \right)^{\frac{2 \alpha_{opt}}{1-r}} \geq 2 \left( \frac{1}{r} \right)^{\frac{2 \alpha_{opt}}{1-r}}. \qquad (90)$$
Canceling and taking logarithms, this rearranges to
$$\log \Gamma(r+1) + r \log \frac{1}{r} \geq \frac{(1-r) \log 2}{2 \alpha_{opt}}, \qquad (91)$$
which implies that we must have
$$\alpha_{opt} \geq \frac{(1-r) \log 2}{2 \left( \log \Gamma(r+1) + r \log \frac{1}{r} \right)}. \qquad (92)$$
Note that by the log-convexity of $\Gamma$ and the fact that $\Gamma(1) = \Gamma(2) = 1$, we have $\log(\Gamma(1+r)) \leq 0$, which implies
$$\alpha_{opt} \geq \frac{(1-r) \log 2}{2r \log \frac{1}{r}}. \qquad (93)$$
In particular, we must have $\alpha_{opt} \, r^{1-\varepsilon} \to \infty$ as $r \to 0$, for any $\varepsilon > 0$.
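The closed forms (84) and (88) for the exponential example can be confirmed numerically (our sketch, not part of the paper), with the integrals evaluated by a simple midpoint rule:

```python
import math

def renyi_power(density, r, hi, steps=200000):
    # N_r = (integral of f^r)^(2/(1-r)) for n = 1, midpoint rule on (0, hi).
    h = hi / steps
    integral = sum(density((i + 0.5) * h) ** r for i in range(steps)) * h
    return integral ** (2.0 / (1.0 - r))

exp_density = lambda x: math.exp(-x)         # Exp(1) density on (0, infinity)
sum_density = lambda x: x * math.exp(-x)     # density (86) of X + Y

r = 0.3
n_x = renyi_power(exp_density, r, 60.0)
n_sum = renyi_power(sum_density, r, 60.0)

assert abs(n_x - (1 / r) ** (2 / (1 - r))) < 1e-5 * n_x                                  # (84)
assert abs(n_sum - (math.gamma(r + 1) / r ** (r + 1)) ** (2 / (1 - r))) < 1e-5 * n_sum   # (88)

# The resulting lower bound (92) on alpha_opt at r = 0.3:
lower = (1 - r) * math.log(2) / (2 * (math.lgamma(r + 1) + r * math.log(1 / r)))
assert lower > 0.9
```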
The reverse sharp Young inequality can be generalized to $k \geq 2$ functions in the following way.

Theorem 12 ([14]). Let $f_1, \ldots, f_k: \mathbb{R}^n \to [0, \infty)$ and $r, r_1, \ldots, r_k \in (0, 1]$ be such that $\frac{1}{r_1'} + \cdots + \frac{1}{r_k'} = \frac{1}{r'}$. Then,
$$\|f_1 \ast \cdots \ast f_k\|_r \geq C^{n/2} \prod_{i=1}^k \|f_i\|_{r_i}. \qquad (94)$$
Here,
$$C = C(r, r_1, \ldots, r_k) = \frac{\prod_{i=1}^k c_{r_i}}{c_r}, \qquad (95)$$
where we recall that $c_m$ is defined in (26) as $c_m = \frac{m^{1/m}}{|m'|^{1/m'}}$.

We have the following information-theoretic reformulation for $X_1, \ldots, X_k$ independent random vectors,
$$N_r(X_1 + \cdots + X_k) \geq C^{|r'|} \prod_{i=1}^k N_{r_i}(X_i)^{\frac{|r'|}{|r_i'|}}. \qquad (96)$$
Thus when we restrict to log-concave random vectors $X_i$, $1 \leq i \leq k$, and invoke Lemma 7, we can collect our observations as the following.

Theorem 13.
Let $r, r_1, \ldots, r_k \in (0, 1)$ be such that $\sum_{i=1}^k \frac{1}{r_i'} = \frac{1}{r'}$. Let $X_1, \ldots, X_k$ be independent log-concave random vectors. Then,
$$N_r(X_1 + \cdots + X_k) \geq A^{|r'|} \prod_{i=1}^k N_r(X_i)^{t_i}, \qquad (97)$$
where $t_i = r'/r_i'$ and $A = A(r, r_1, \ldots, r_k) = \frac{\prod_{i=1}^k A_{r_i}}{A_r}$ with $A_m = \frac{|m'|^{1/|m'|}}{m^{1/m}}$.

Proof. By combining (96) with Lemma 7, we obtain
$$N_r(X_1 + \cdots + X_k) \geq C^{|r'|} \prod_{i=1}^k \left( N_r(X_i) \, \frac{r^{\frac{2}{1-r}}}{r_i^{\frac{2}{1-r_i}}} \right)^{\frac{|r'|}{|r_i'|}} \qquad (98)$$
$$= C^{|r'|} \prod_{i=1}^k \left( \frac{r^{\frac{2}{1-r}}}{r_i^{\frac{2}{1-r_i}}} \right)^{\frac{|r'|}{|r_i'|}} \prod_{i=1}^k N_r(X_i)^{\frac{r'}{r_i'}} \qquad (99)$$
$$= A(r, r_1, \ldots, r_k)^{|r'|} \prod_{i=1}^k N_r(X_i)^{t_i}. \qquad (100)$$

Now let us show that Theorem 13 implies a super-additivity property for the Rényi entropy power of independent log-concave vectors.

Proof of Theorem 4.
By the homogeneity of equation (18), we can assume without loss of generality that $\sum_{i=1}^k N_r(X_i) = 1$. From Theorem 13, for every $r_1, \ldots, r_k \in (0, 1)$ such that $\sum_i \frac{1}{r_i'} = \frac{1}{r'}$, we have
$$N_r(X_1 + \cdots + X_k) \geq A^{|r'|} \prod_{i=1}^k N_r(X_i)^{t_i} \qquad (101)$$
$$= \left( \frac{r^{1/r}}{|r'|^{1/|r'|}} \prod_{i=1}^k \frac{|r_i'|^{1/|r_i'|}}{r_i^{1/r_i}} \right)^{|r'|} \prod_{i=1}^k N_r(X_i)^{t_i}, \qquad (102)$$
where $t_i = r'/r_i'$. Thus,
$$\log N_r(X_1 + \cdots + X_k) \geq \sum_{i=1}^k \left( t_i \log \frac{|r'|}{t_i} - |r'| \frac{\log r_i}{r_i} \right) + \left( |r'| \frac{\log r}{r} - \log |r'| \right) + \sum_{i=1}^k t_i \log N_r(X_i). \qquad (103)$$
Since $\frac{1}{r_i} = 1 + \frac{t_i}{|r'|}$ and $r_i = \frac{|r_i'|}{1 + |r_i'|}$, we deduce that
$$\log N_r(X_1 + \cdots + X_k) \geq |r'| \frac{\log r}{r} + \sum_{i=1}^k |r'| \left( 1 + \frac{t_i}{|r'|} \right) \log \left( 1 + \frac{t_i}{|r'|} \right) + \sum_{i=1}^k t_i \log \frac{N_r(X_i)}{t_i}. \qquad (104)$$
It follows that
$$N_r(X_1 + \cdots + X_k) \geq c(r, k), \qquad (105)$$
with
$$c(r, k) \triangleq \inf_\lambda \sup_t \exp \left\{ |r'| \frac{\log r}{r} + \sum_{i=1}^k |r'| \left( 1 + \frac{t_i}{|r'|} \right) \log \left( 1 + \frac{t_i}{|r'|} \right) + \sum_{i=1}^k t_i \log \frac{\lambda_i}{t_i} \right\}, \qquad (106)$$
where the infimum runs over all $\lambda = (\lambda_1, \ldots, \lambda_k)$ such that $\lambda_i \geq 0$ and $\sum_{i=1}^k \lambda_i = 1$, and the supremum runs over all $t = (t_1, \ldots, t_k)$ such that $t_i \geq 0$ and $\sum_{i=1}^k t_i = 1$. For a fixed $\lambda$, we can always choose $t = \lambda$, and thus
$$c(r, k) \geq \inf_t \exp \left\{ |r'| \frac{\log r}{r} + \sum_{i=1}^k |r'| \left( 1 + \frac{t_i}{|r'|} \right) \log \left( 1 + \frac{t_i}{|r'|} \right) \right\}. \qquad (107)$$
Due to the convexity of the function $G(u) \triangleq u \log(u)$, $u > 0$, we have
$$G \left( 1 + \frac{t_i}{|r'|} \right) \geq G \left( 1 + \frac{1}{k|r'|} \right) + G' \left( 1 + \frac{1}{k|r'|} \right) \left( \frac{t_i}{|r'|} - \frac{1}{k|r'|} \right). \qquad (108)$$
Using the fact that $\sum_{i=1}^k t_i = 1$, inequality (108) yields
$$|r'| \sum_{i=1}^k \left( 1 + \frac{t_i}{|r'|} \right) \log \left( 1 + \frac{t_i}{|r'|} \right) \geq (k|r'| + 1) \log \left( 1 + \frac{1}{k|r'|} \right). \qquad (109)$$
Since there is equality in (109) when $t_i = \frac{1}{k}$, $i = 1, \ldots, k$, we deduce that the infimum in (107) is attained at $t_i = \frac{1}{k}$, $i = 1, \ldots, k$.
As a consequence, we have
$$c(r, k) \geq r^{\frac{1}{1-r}} \left( 1 + \frac{1}{k|r'|} \right)^{k|r'| + 1}. \qquad (110)$$

Proposition 14.
The largest constant $c_{opt}(r)$ satisfying
$$N_r(X_1 + \cdots + X_k) \geq c_{opt}(r) \sum_{i=1}^k N_r(X_i) \qquad (111)$$
for any $k$-tuple of independent log-concave random vectors satisfies
$$e \, r^{\frac{1}{1-r}} \leq c_{opt}(r) \leq \pi \, r^{\frac{1}{1-r}}. \qquad (112)$$

Proof.
Note that the function $k \mapsto \left( 1 + \frac{1}{k|r'|} \right)^{k|r'| + 1}$ decreases to $e$ in $k$. Thus, taking the limit in (110) we have
$$c_{opt}(r) \geq c(r, k) \geq e \, r^{\frac{1}{1-r}}. \qquad (113)$$
On the other hand, specializing the inequality
$$N_r(X_1 + \cdots + X_k) \geq c_{opt}(r) \sum_{i=1}^k N_r(X_i) \qquad (114)$$
to the case in which $X_1, \ldots, X_k$ are i.i.d., we must have
$$\liminf_{k \to \infty} N_r \left( \frac{X_1 + \cdots + X_k}{\sqrt{k}} \right) \geq c_{opt}(r) N_r(X_1). \qquad (115)$$
Notice that if $X_1$ is a centered log-concave random variable of variance $1$, then the $\frac{X_1 + \cdots + X_k}{\sqrt{k}}$ are also log-concave random variables of variance $1$, converging weakly by the central limit theorem to a standard normal random variable $Z$. Moreover, letting $f_k$ denote the density of $\frac{X_1 + \cdots + X_k}{\sqrt{k}}$, one may apply the argument of [11, Theorem 1.1] to $r \in (0, 1)$ when one has
$$\lim_{T \to \infty} \int_{\{|x| > T\}} f_k^r(x) \, dx = 0 \qquad (116)$$
uniformly in $k$, to conclude that
$$\lim_{k \to \infty} N_r \left( \frac{X_1 + \cdots + X_k}{\sqrt{k}} \right) = N_r(Z) = 2\pi \, r^{\frac{1}{r-1}}. \qquad (117)$$
Alternatively, one can arrive at (117) by invoking classical local limit theorems [26, 37] to obtain pointwise convergence of the densities, and conclude with Lebesgue dominated convergence to interchange the limit. Recall that the class of centered log-concave densities with a fixed variance can be bounded uniformly by a single sub-exponential function $C e^{-c|x|}$ for universal constants $C, c > 0$ depending only on the variance. This gives the existence of all moments, in particular the third moment requisite for the local limit theorem; additionally, it gives domination by an integrable function.

Inserting (117) into (115), we see that $c_{opt}(r)$ must satisfy
$$2\pi \, r^{\frac{1}{r-1}} \geq c_{opt}(r) N_r(X_1). \qquad (118)$$
For $X_1$ having a Laplace distribution of variance $1$, its density is $f(x) = \frac{e^{-\sqrt{2}|x|}}{\sqrt{2}}$ on $(-\infty, \infty)$, so that
$$N_r(X_1) = 2 \, r^{\frac{2}{r-1}}. \qquad (119)$$
We conclude that
$$\pi \, r^{\frac{1}{1-r}} \geq c_{opt}(r). \qquad (120)$$

Proposition 14 shows that there does not exist a universal constant $C$ (independent of $r$ and $k$) such that the inequality
$$N_r(X_1 + \cdots + X_k) \geq C \sum_{i=1}^k N_r(X_i) \qquad (121)$$
holds. Note that this is in contrast with the case $r \geq 1$, when $C = 1/e$ suffices.

We have shown that a Rényi EPI does hold for $r \in (0, 1)$, at least for log-concave random vectors, for the Rényi EPI of the form (8), as well as for the Rényi entropy power raised to a power $\alpha$ as in (11). Let us comment on the sharpness of the $\alpha$ derived, and contrast this behavior with that of the constant derived for uniform distributions, $\beta$ from (16).

Due to Madiman and Wang [43], the Rényi entropy of independent sums decreases on spherically symmetric decreasing rearrangement. Let us recall a few definitions. For a measurable set $A \subseteq \mathbb{R}^n$, denote by $A^*$ the open origin-symmetric Euclidean ball satisfying $\mathrm{Vol}(A) = \mathrm{Vol}(A^*)$. For a non-negative measurable function $f$, define its symmetric decreasing rearrangement by
$$f^*(x) = \int_0^\infty \mathbb{1}_{\{f > t\}^*}(x) \, dt. \qquad (122)$$

Theorem 15 ([43]). If $f_i$ are probability density functions and $f_i^*$ denote their spherically symmetric decreasing rearrangements, then
$$N_r(X_1 + \cdots + X_k) \geq N_r(X_1^* + \cdots + X_k^*) \qquad (123)$$
for any $r \in [0, \infty]$, where $X_i$ has density $f_i$, and $X_i^*$ has density $f_i^*$, $i = 1, \ldots, k$.

It follows that to prove inequality (9) it suffices to consider $X$ and $Y$ possessing spherically symmetric decreasing densities. Indeed, using Theorem 15 we would have
$$N_r^\alpha(X + Y) \geq N_r^\alpha(X^* + Y^*) \geq N_r^\alpha(X^*) + N_r^\alpha(Y^*) = N_r^\alpha(X) + N_r^\alpha(Y), \qquad (124)$$
where the last equality comes from the equimeasurability of a density and its rearrangement. The same argument applies to inequality (8).
Motivated by this fact, the authors replaced the exponential distribution in the example above with its spherically symmetric rearrangement, the Laplace distribution, to yield a tighter lower bound in an announcement of this work [34]. Additionally, since spherically symmetric rearrangement is stable on the class of log-concave random vectors (see [35, Corollary 5.2]), one can reduce to random vectors with spherically symmetric decreasing densities, even under the log-concave restriction taken in this work.

The authors thank Mokshay Madiman for valuable comments and the explanation of the Rényi comparison results used in this work. The authors are also indebted to the anonymous reviewers whose suggestions greatly improved the paper, leading in particular to the inclusion of Theorem 4 and Proposition 14.

A Proof of Lemma 7
Theorem 16 ([23, Theorem 2.9]). For a log-concave function $f$ on $\mathbb{R}^n$, the map
$$\varphi(t) = t^n \int_{\mathbb{R}^n} f^t, \qquad t > 0, \qquad (125)$$
is log-concave as well.

Proof of Lemma 7.
The proof is a straightforward consequence of Theorem 16. What remains is an algebraic computation. When $X$ has density $f$, one has $\varphi(1) = 1$. Write $1, p, q$ in convex combination, and unwind the implication of $\varphi$ being log-concave. We will show the result in the case that we need, $0 < p < q < 1$; the other arguments are similar. In this case, $\lambda p + (1 - \lambda) 1 = q$ for $\lambda = \frac{1-q}{1-p} \in (0, 1)$. By log-concavity,
$$\varphi(q) \geq \varphi^\lambda(p) \, \varphi^{1-\lambda}(1), \qquad (126)$$
which is
$$q^n \int f^q \geq \left( p^n \int f^p \right)^{\frac{1-q}{1-p}}. \qquad (127)$$
Since $1 - q > 0$, raising both sides to the power $\frac{2}{n(1-q)}$ preserves the inequality, and we have
$$q^{\frac{2}{1-q}} N_q(X) \geq p^{\frac{2}{1-p}} N_p(X), \qquad (128)$$
which implies our result.

B Positivity of $A(p, q, r)$

By (62), it suffices to show that
$$W(x, y) = \log \left( \frac{(x+y)^{x+y} (x+1)^{x+1} (y+1)^{y+1}}{x^x y^y (x+y+1)^{x+y+1}} \right) > 0, \qquad (129)$$
for $x, y > 0$. First observe that for $y > 0$,
$$\lim_{x \to 0} W(x, y) = 0. \qquad (130)$$
Computing,
$$\frac{\partial}{\partial x} W(x, y) = \log \left( \frac{(x+y)(x+1)}{(x+y+1) x} \right), \qquad (131)$$
which is always greater than $0$, since
$$(x+y)(x+1) > (x+y+1) x \qquad (132)$$
reduces to $y > 0$. Thus $W(x, y) > W(0, y) = 0$ for $x, y > 0$, and our result follows.

References

[1] S. Artstein, K. M. Ball, F. Barthe, and A. Naor. Solution of Shannon's problem on the monotonicity of entropy.
J. Amer. Math. Soc., 17(4):975-982 (electronic), 2004.
[2] K. Ball, P. Nayar, and T. Tkocz. A reverse entropy power inequality for log-concave random vectors. Studia Math., 235(1):17-30, 2016.
[3] K. Ball and V. H. Nguyen. Entropy jumps for isotropic log-concave random vectors and spectral gap. Studia Math., 213(1):81-96, 2012.
[4] W. Beckner. Inequalities in Fourier analysis. Ann. of Math. (2), 102(1):159-182, 1975.
[5] S. Bobkov and M. Madiman. The entropy per coordinate of a random vector is highly constrained under convexity conditions. IEEE Trans. Inform. Theory, 57(8):4940-4954, August 2011.
[6] S. Bobkov and M. Madiman. Reverse Brunn-Minkowski and reverse entropy power inequalities for convex measures. J. Funct. Anal., 262:3309-3339, 2012.
[7] S. Bobkov and J. Melbourne. Hyperbolic measures on infinite dimensional spaces. Probability Surveys, 13:57-88, 2016.
[8] S. G. Bobkov and G. P. Chistyakov. Bounds for the maximum of the density of the sum of independent random variables. Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI), 408(Veroyatnost i Statistika. 18):62-73, 324, 2012.
[9] S. G. Bobkov and G. P. Chistyakov. Entropy power inequality for the Rényi entropy. IEEE Trans. Inform. Theory, 61(2):708-714, February 2015.
[10] S. G. Bobkov and A. Marsiglietti. Variants of the entropy power inequality. IEEE Trans. Inform. Theory, 63(12):7747-7752, 2017.
[11] S. G. Bobkov and A. Marsiglietti. Asymptotic behavior of Rényi entropy in the central limit theorem. Preprint, arXiv:1802.10212, 2018.
[12] C. Borell. Convex measures on locally convex spaces. Ark. Mat., 12:239-252, 1974.
[13] J. Bourgain. On high-dimensional maximal functions associated to convex bodies. Amer. J. Math., 108(6):1467-1476, 1986.
[14] H. J. Brascamp and E. H. Lieb. Best constants in Young's inequality, its converse, and its generalization to more than three functions. Advances in Math., 20(2):151-173, 1976.
[15] H. Busemann. A theorem on convex bodies of the Brunn-Minkowski type. Proc. Nat. Acad. Sci. U.S.A., 35:27-31, 1949.
[16] M. H. M. Costa. A new entropy power inequality. IEEE Trans. Inform. Theory, 31(6):751-760, 1985.
[17] M. H. M. Costa and T. M. Cover. On the similarity of the entropy power inequality and the Brunn-Minkowski inequality.
IEEE Trans. Inform. Theory , 30(6):837–839, 1984.[18] T. A. Courtade, M. Fathi, and A. Pananjady. Wasserstein stability of the entropy powerinequality for log-concave densities.
Preprint, arXiv:1610.07969 , 2016.[19] A. Dembo, T. M. Cover, and J. A. Thomas. Information-theoretic inequalities.
IEEETrans. Inform. Theory , 37(6):1501–1518, 1991.[20] M. Fradelizi, J. Li, and M. Madiman. Concentration of information content for convexmeasures.
Preprint, arXiv:1512.01490 , 2015.[21] M. Fradelizi, M. Madiman, A. Marsiglietti, and A. Zvavitch. The convexification effect ofMinkowski summation.
Preprint, arXiv:1704.05486 , 2016.[22] M. Fradelizi, M. Madiman, A. Marsiglietti, and A. Zvavitch. Do Minkowski averages getprogressively more convex?
C. R. Acad. Sci. Paris Sér. I Math. , 354(2):185–189, February2016.
23] M. Fradelizi, M. Madiman, and L. Wang. Optimal concentration of information contentfor log-concave densities. In C. Houdré, D. Mason, P. Reynaud-Bouret, and J. Rosinski,editors,
High Dimensional Probability VII: The Cargèse Volume , Progress in Probability.Birkhäuser, Basel, 2016. Available online at arXiv:1508.04093 .[24] M. Fradelizi and A. Marsiglietti. On the analogue of the concavity of entropy power in theBrunn-Minkowski theory.
Adv. in Appl. Math. , 57:1–20, 2014.[25] R. J. Gardner. The Brunn-Minkowski inequality.
Bull. Amer. Math. Soc. (N.S.) , 39(3):355–405 (electronic), 2002.[26] B. V. Gnedenko and A. N. Kolmogorov.
Limit distributions for sums of independentrandom variables . Translated from the Russian, annotated, and revised by K. L. Chung.With appendices by J. L. Doob and P. L. Hsu. Revised edition. Addison-Wesley PublishingCo., Reading, Mass.-London-Don Mills., Ont., 1968.[27] L. Leindler. On a certain converse of Hölder’s inequality. II.
Acta Sci. Math. (Szeged) ,33(3-4):217–223, 1972.[28] J. Li. Rényi entropy power inequality and a reverse.
Studia Math , 242:303 – 319, 2018.[29] M. Madiman. Private communication. 2017.[30] M. Madiman and A. R. Barron. Generalized entropy power inequalities and monotonicityproperties of information.
IEEE Trans. Inform. Theory , 53(7):2317–2329, July 2007.[31] M. Madiman, J. Melbourne, and P. Xu. Forward and reverse entropy power inequalitiesin convex geometry.
Convexity and Concentration , pages 427–485, 2017.[32] M. Madiman, J. Melbourne, and P. Xu. Rogozin’s convolution inequality for locally com-pact groups.
Preprint, arXiv:1705.00642 , 2017.[33] A. Marsiglietti and V. Kostina. A lower bound on the differential entropy of log-concaverandom vectors with applications.
Entropy , 20(3):185, 2018.[34] A. Marsiglietti and J. Melbourne. A Rényi entropy power inequality for log-concave vectorsand parameters in [0 , . In Proceedings 2018 IEEE International Symposium on Informa-tion Theory , Vail, USA, 2018.[35] J. Melbourne. Rearrangement and Prékopa-Leindler type inequalities.
Preprint,arXiv:1806.08837 , 2018.[36] V. D. Milman. Inégalité de Brunn-Minkowski inverse et applications à la théorie locale desespaces normés.
C. R. Acad. Sci. Paris Sér. I Math. , 302(1):25–28, 1986.[37] V. V. Petrov. On local limit theorems for sums of independent random variables.
Theoryof Probability & Its Applications , 9(2):312–320, 1964.[38] E. Ram and I. Sason. On Rényi entropy power inequalities.
IEEE Transactions on Infor-mation Theory , 62(12):6800–6815, 2016.[39] A. Rényi. On measures of entropy and information. In
Proc. 4th Berkeley Sympos. Math.Statist. and Prob., Vol. I , pages 547–561. Univ. California Press, Berkeley, Calif., 1961.[40] G. Toscani. A concavity property for the reciprocal of Fisher information and its conse-quences on Costa’s EPI.
Phys. A , 432:35–42, 2015.[41] G. Toscani. A strengthened entropy power inequality for log-concave densities.
IEEETrans. Inform. Theory , 61(12):6550–6559, 2015.
42] A. M. Tulino and S. Verdú. Monotonic decrease of the non-gaussianness of the sum ofindependent random variables: A simple proof.
IEEE Trans. Inform. Theory , 52(9):4295–7, September 2006.[43] L. Wang and M. Madiman. Beyond the entropy power inequality, via rearrangements.
IEEE Transactions on Information Theory , 60(9):5116–5137, 2014.[44] P. Xu, J. Melbourne, and M. Madiman. Reverse entropy power inequalities for s-concavedensities. In
Proceedings 2016 IEEE International Symposium on Information Theory ,pages 2284–2288, Barcelona, Spain, 2016.[45] P. Xu, J. Melbourne, and M. Madiman. Infinity-rényi entropy power inequalities. In
Proceedings 2017 IEEE International Symposium on Information Theory , pages 2985–2989, Aachen, Germany, 2017.[46] P. Xu, J. Melbourne, and M. Madiman. A min-entropy power inequality for groups. In
Proceedings 2017 IEEE International Symposium on Information Theory , pages 674–678,Aachen, Germany, 2017.[47] R. Zamir and M. Feder. A generalization of the entropy power inequality with applications.
IEEE Trans. Inform. Theory , 39(5):1723–1728, 1993.[48] R. Zamir and M. Feder. On the volume of the Minkowski sum of line sets and the entropy-power inequality.
IEEE Trans. Inform. Theory , 44(7):3039–3063, 1998.Arnaud MarsigliettiDepartment of MathematicsUniversity of FloridaGainesville, FL 32611E-mail address: a.marsiglietti@ufl.eduJames MelbourneElectrical and Computer EngineeringUniversity of MinnesotaMinneapolis, MN 55455, USAE-mail address: [email protected], 44(7):3039–3063, 1998.Arnaud MarsigliettiDepartment of MathematicsUniversity of FloridaGainesville, FL 32611E-mail address: a.marsiglietti@ufl.eduJames MelbourneElectrical and Computer EngineeringUniversity of MinnesotaMinneapolis, MN 55455, USAE-mail address: [email protected]