Location and scale behaviour of the quantiles of a natural exponential family

Abstract

Let $P_0$ be a probability on the real line generating a natural exponential family $(P_t)_{t\in\mathbb{R}}$. Fix $\alpha$ in $(0,1)$. We show that the property that $P_t((-\infty,t)) \le \alpha \le P_t((-\infty,t])$ for all $t$ implies that there exists a number $\mu_\alpha$ such that $P_0$ is the Gaussian distribution $N(\mu_\alpha,1)$. In other terms, if for all $t$, $t$ is a quantile of $P_t$ associated to some threshold $\alpha\in(0,1)$, then the exponential family must be Gaussian. The case $\alpha=1/2$, i.e. $t$ is always a median of $P_t$, has been considered in Letac et al. (2018). Analogously let $Q$ be a measure on $[0,\infty)$ generating a natural exponential family $(Q_{-t})_{t>0}$. We show that $Q_{-t}([0,t^{-1})) \le \alpha \le Q_{-t}([0,t^{-1}])$ for all $t>0$ implies that there exists a number $p=p_\alpha>0$ such that $Q(dx)\propto x^{p-1}\,dx$, and thus $Q_{-t}$ has to be a gamma distribution with parameters $p$ and $t$.


Mauro Piccioni ∗, Bartosz Kołodziejek † and Gérard Letac ‡

October 30, 2018

Keywords:

Characterization of normal and gamma laws, one-dimensional exponential families, quantiles of a distribution, Deny equations.

MSC2010 classification:

Let $P_0$ be a probability on the real line and assume its moment generating function

$$M(t) = \int_{-\infty}^{+\infty} e^{tx}\, P_0(dx) \tag{1}$$

is finite for all real $t$. Such a probability generates the natural exponential family

$$P_t(dx) = \frac{e^{tx}}{M(t)}\, P_0(dx), \qquad t \in \mathbb{R}. \tag{2}$$

For example, the Gaussian probability $P_0(dx) = (2\pi)^{-1/2} e^{-(x-m)^2/2}\,dx$, i.e. $P_0 = N(m,1)$, generates the natural exponential family $(P_t) = (N(m+t,1))$. Thus, if $X_t \sim P_t$ for any $t \in \mathbb{R}$, then $X_t \sim X_0 + t$; in other words $(P_t)$ is a location family generated by $P_0$ with location parameter $t$. It is well known and easily verified that the property $X_t \sim X_0 + t$ forces a natural exponential family to be generated by $P_0 = N(m,1)$ for some $m$. A way to see this is to compute the m.g.f. of $X_t$ and substitute $X_0 + t$ for $X_t$, getting the equation

$$M(t+s) = M(t)\,M(s)\,e^{ts}, \qquad t, s \in \mathbb{R}.$$

Taking logarithms and differentiating with respect to $t$ and $s$ we get that the cumulant generating function $k = \log M$ of $P_0$ satisfies $k''(u) = 1$ for all $u \in \mathbb{R}$, from which $k(u) = mu + u^2/2$, that is precisely the c.g.f. of $N(m,1)$.

∗ Dipartimento di Matematica, Sapienza Università di Roma, 00185 Roma, Italia.
† Wydział Matematyki i Nauk Informacyjnych, Politechnika Warszawska, Warszawa, Polska.
‡ Institut de Mathématiques de Toulouse, Équipe de Probabilités et Statistique, Université Paul Sabatier, 31062 Toulouse, France.

The assumption $X_t \sim X_0 + t$, for any $t \in \mathbb{R}$, means that the distribution function of $X_t - t$ is independent of $t$, and so the same is true for the quantile function. If we make the weaker assumption that for some fixed $\alpha \in (0,1)$ an $\alpha$-quantile of $X_t - t$ does not depend on $t$, does one obtain the same characterization? In slightly simplified words, if $X_t \sim P_t$, as defined in (2), is such that $\Pr(X_t \le t + b) = \alpha$ for any $t \in \mathbb{R}$, does this imply that $P_0$ is $N(m,1)$ for some $m$?

A recent paper (Letac et al. (2018)) gives the answer to this question for $\alpha = 1/2$ (and $b = 0$). Indeed it is proved there that if $t$ is a median of $P_t$, for any $t \in \mathbb{R}$, then $P_0$ is the standard Gaussian $N(0,1)$. The first result of this paper extends this to an arbitrary $\alpha \in (0,1)$ (and an arbitrary $b$).

Theorem 1. Let $P_0$ be a probability on the real line which generates the exponential family (2). Let $b \in \mathbb{R}$ and suppose that $b + t$ is an $\alpha$-quantile of $P_t$, for $t \in \mathbb{R}$, that is

$$\int_{(-\infty,\,b+t)} e^{tx}\, P_0(dx) \;\le\; \alpha M(t) \;\le\; \int_{(-\infty,\,b+t]} e^{tx}\, P_0(dx), \qquad t \in \mathbb{R}. \tag{3}$$

Then $P_0 = N(m^*, 1)$, where $m^* = b - \Phi^{-1}(\alpha)$, $\Phi$ being the standard Gaussian distribution function; moreover $P_t = N(m^* + t, 1)$.

Next let $Q$ be a Radon measure on the positive real line such that its Laplace transform

$$L(t) = \int_{[0,+\infty)} e^{-ty}\, Q(dy) \tag{4}$$

is finite for all $t > 0$. Such a measure generates the exponential family

$$Q_{-t}(dy) = \frac{e^{-ty}}{L(t)}\, Q(dy), \qquad t > 0 \tag{5}$$

(the value $t = 0$ could also be allowed for a finite measure $Q$, but this will turn out to be impossible). For example, the measure

$$Q^p(dy) = \frac{1}{\Gamma(p)}\, y^{p-1}\, dy, \tag{6}$$

defined for $p > 0$, generates the natural exponential family $Q^p_{-t} = \mathrm{Ga}(p,t)$, with $t > 0$, where $\mathrm{Ga}(p,t)$ is the gamma law with parameters $p$ and $t$. Now it is immediately verified that if $Y_t \sim Q^p_{-t}$ then $Y_t \sim Y_1/t$, that is $(Q^p_{-t})$ is a scale family generated by $Q^p_{-1} = \mathrm{Ga}(p,1)$, with scale parameter $t^{-1}$. It is relatively easy to verify that this property forces $Q$ to be of the form (6). However the argument in the scale case is slightly more involved than in the location case and we prefer to give the statement as a proposition.

Proposition 1. Suppose that $(Q_{-t})_{t>0}$ is the natural exponential family defined in (5), for some measure $Q$ on the non-negative real line. With $Y_t \sim Q_{-t}$, assume that $Y_t \sim Y_1/t$ for any $t > 0$. Then $Q = Q^p$ defined by (6), for some $p > 0$, up to a positive multiplicative constant.

Proof of Proposition 1. Compute the Laplace transform of $Y_t$ at the point $st$, where $s, t > 0$. Using $Y_t \sim Y_1/t$ one arrives at

$$\frac{L(t+ts)}{L(t)} = \frac{L(1+s)}{L(1)}.$$

Defining $c(t) = \log L(t)$, for $t > 0$, and differentiating with respect to $t$ and $s$, this implies

$$u\,c''(u) + c'(u) = 0,$$

where $u = t + ts > 0$. Integrating twice one arrives at $c(u) = -p \log u + \ell$, with $p > 0$ and $\ell \in \mathbb{R}$, from which $L(u) = e^{\ell} u^{-p}$, the Laplace transform of $e^{\ell} Q^p$. □

It is worth noticing that the statement of the previous proposition and the analogous result for location families are special cases of the general results obtained by Ferguson (1962), which characterize general exponential families that are location and scale families.

Now the assumption $Y_t \sim Y_1/t$ for any $t > 0$ is equivalent to saying that the distribution function of $tY_t$ is independent of $t$, and so the same is true for the quantile function. If we make the weaker assumption that, for some fixed $\alpha \in (0,1)$, an $\alpha$-quantile of $tY_t$ does not depend on $t$, is it enough to obtain the characterization stated in Proposition 1? In slightly simplified words, if $Y_t \sim Q_{-t}$, as defined in (5), is such that $\Pr(Y_t \le a/t) = \alpha$ for all $t > 0$, for some $a > 0$, does this still imply that $Q$ is proportional to $Q^p$ for some $p > 0$? Our second result gives a positive answer to this conjecture.
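The two questions raised above can be explored numerically in the model cases. The following is a minimal sketch, not part of the paper: it uses only the Python standard library, and the helper names (`std_normal_cdf`, `gamma_cdf`, `location_level`, `scale_level`, `solve_shape`) are hypothetical. It checks that $\Pr(X_t \le t+b)$ does not depend on $t$ when $X_t \sim N(m+t,1)$, that $\Pr(Y_t \le a/t)$ does not depend on $t$ when $Y_t \sim \mathrm{Ga}(p,t)$ with $t$ read as a rate parameter, and it searches by bisection for the shape $p$ giving a prescribed level $\alpha$, under the assumption that the level is decreasing in $p$.

```python
import math


def std_normal_cdf(x):
    """Standard Gaussian distribution function Phi."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gamma_cdf(p, x):
    """Distribution function of Ga(p, 1) at x, computed with the power
    series of the regularized lower incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    # First term: x^p e^{-x} / Gamma(p + 1).
    term = math.exp(p * math.log(x) - x - math.lgamma(p + 1.0))
    total, n = term, 1
    while term > 1e-16 * total:
        term *= x / (p + n)  # next term: multiply by x / (p + n)
        total += term
        n += 1
    return total


def location_level(m, b, t):
    """Pr(X_t <= t + b) when X_t ~ N(m + t, 1)."""
    return std_normal_cdf((t + b) - (m + t))


def scale_level(p, a, t):
    """Pr(Y_t <= a / t) when Y_t ~ Ga(p, t), t being a rate (assumption)."""
    return gamma_cdf(p, t * (a / t))


def solve_shape(alpha, a, lo=1e-6, hi=200.0, iters=200):
    """Bisection for p with gamma_cdf(p, a) = alpha.
    Assumes the map p -> gamma_cdf(p, a) is decreasing on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, a) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for t in (-5.0, 0.0, 2.5, 10.0):
        print("location", t, location_level(0.3, 1.0, t))
    for t in (0.1, 1.0, 7.0):
        print("scale", t, scale_level(2.0, 1.5, t))
    print("shape for alpha = 0.5, a = 1:", solve_shape(0.5, 1.0))
```

The check only confirms the easy direction (Gaussian and gamma families have a quantile with the stated location or scale behaviour); the converse is the content of the theorems.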

Theorem 2. Let $Q$ be a Radon measure on the non-negative real line which generates the exponential family (5). Let $a > 0$ and suppose that $a/t$ is an $\alpha$-quantile of $Q_{-t}$, for $t > 0$, that is

$$\int_{(0,\,a/t)} e^{-ty}\, Q(dy) \;\le\; \alpha L(t) \;\le\; \int_{(0,\,a/t]} e^{-ty}\, Q(dy), \qquad t > 0. \tag{7}$$

Then $Q$ is proportional to $Q^{p^*}$, where $p^* = p^*(\alpha)$ is the unique solution in $p > 0$ of $E_p(a) = \alpha$, $E_p$ being the distribution function of $\mathrm{Ga}(p,1)$; moreover $Q_{-t} = \mathrm{Ga}(p^*, t/a)$.

It is convenient to comment on the existence and the uniqueness of $p^*$. The family $(\mathrm{Ga}(p,1),\ p > 0)$ is a convolution semigroup of laws supported by $(0,\infty)$. Hence, for any fixed $a > 0$, the map $p \mapsto E_p(a)$ is strictly decreasing in $p$ and is continuous. From the Markov inequality and the fact that the expectation of $\mathrm{Ga}(p,1)$ is $p$ we have $1 - E_p(a) \le p/a$, and this implies $\lim_{p \downarrow 0} E_p(a) = 1$. The limit of $E_p(a)$ as $p \to \infty$ is zero by the law of large numbers.

The proofs of Theorems 1 and 2 are given in the next section. These proofs deduce from (3) and (7) two convolution equations in additive and multiplicative forms, respectively. The solutions to these equations have been investigated by Deny (1960). The result for additive convolutions is reported in the final section of Deny (1960). The result for multiplicative convolutions can be obtained by passing to the additive convolution form through logarithms. In the next proposition we report both of them explicitly.

Proposition 2.

a) Suppose $H$ is a probability density defined on the whole real line, and consider the equation

$$f(t) = \int_{-\infty}^{+\infty} H(t-x)\, f(x)\, dx, \qquad t \in \mathbb{R}, \tag{8}$$

where $f$ is a locally integrable, non-negative function. Then $f$ is necessarily a linear combination, with non-negative coefficients, of a constant function with an exponential function of the form $e^{-s^* x}$, where $s^* \neq 0$ is a solution of the following equation in the real unknown $s$:

$$\int_{-\infty}^{+\infty} e^{sx} H(x)\, dx = 1. \tag{9}$$

If there is no solution of this form then $f$ is necessarily constant.

b) Suppose $K$ is a probability density on the positive real line and consider the equation

$$g(t) = \int_0^{+\infty} K\!\left(\frac{t}{y}\right) g(y)\, \frac{dy}{y}, \qquad t > 0, \tag{10}$$

where $g$ is a locally integrable and non-negative function on $(0,\infty)$. Then $g(t)$ is necessarily a linear combination, with non-negative coefficients, of the function $t^{-1}$ with a power function of the form $t^{-1-u^*}$, where $u^* \neq 0$ is a solution of the following equation in the real unknown $u$:

$$\int_0^{+\infty} y^u K(y)\, dy = 1. \tag{11}$$

If there is no solution of this form then $g(t) = c/t$, where $c \ge 0$.

Note that equations (9) and (11) each have at most one non-zero solution, in $s$ and $u$ respectively. This follows by convexity of the logarithm of the functions appearing at the l.h.s. of these equations.

Proof of Theorem 1.

Let us prove the theorem with $b = 0$. Then we will adjust the solution to take into account an arbitrary value of $b$. First we prove that $P_0$ is absolutely continuous. Take $-A \le s < t \le A$, for some constant $A > 0$. Then

$$P_0((s,t)) = \int_{(s,t)} e^{-tx} e^{tx}\, P_0(dx) \le e^{A^2} \int_{(s,t)} e^{tx}\, P_0(dx) \le e^{A^2} \left( \int_{(-\infty,t)} e^{tx}\, P_0(dx) - \int_{(-\infty,s]} e^{sx}\, P_0(dx) + \int_{(-\infty,s]} \left( e^{sx} - e^{tx} \right) P_0(dx) \right).$$

Using (3) and the inequality $|e^u - e^v| \le |u - v|\, e^w$, for $|u|, |v| \le w$, this is bounded by

$$e^{A^2} \left( \alpha\,(M(t) - M(s)) + |t-s| \int_{\mathbb{R}} |x|\, e^{A|x|}\, P_0(dx) \right) \le c_A\, |t-s|,$$

since $M$, being analytic, is locally Lipschitz, and the integral at the r.h.s. is finite by the existence of the m.g.f. of $P_0$ on the whole real line.

So we can always assume that $P_0$ has a density $p$. Setting $\alpha = \frac{C}{1+C}$, with $C > 0$, the quantile relation (3) leads to

$$\int_{-\infty}^{t} e^{tx} p(x)\, dx = C \int_{t}^{+\infty} e^{tx} p(x)\, dx. \tag{12}$$

Differentiating both sides w.r.t. $t$ and multiplying by $e^{-t^2}$ one gets

$$p(t) + e^{-t^2} \int_{-\infty}^{t} x\, e^{tx} p(x)\, dx = -C p(t) + C e^{-t^2} \int_{t}^{+\infty} x\, e^{tx} p(x)\, dx. \tag{13}$$

Introduce the function defined by

$$\mathrm{abs}_C(x) = -C x\, \mathbf{1}_{\{x<0\}} + x\, \mathbf{1}_{\{x>0\}}. \tag{14}$$

Multiply both sides of (12) by $t e^{-t^2}$ and subtract from (13). We obtain

$$p(t) = \frac{1}{1+C} \int_{-\infty}^{+\infty} \mathrm{abs}_C(t-x)\, e^{t(x-t)}\, p(x)\, dx. \tag{15}$$

As expected, a solution to the equation (15) is given by $\varphi(t - m^*)$, where $\varphi$ is the standard Gaussian density function and $m^* = -\Phi^{-1}(\alpha)$. Next set

$$p(x) = \varphi(x-m)\, f(x), \tag{16}$$

with $m = m^*$. We aim to prove that $f(x)$ has to be constant to solve the equation (15), with the substitution (16). Rewriting the equation for $f$, one gets

$$f(t)\, e^{mt - t^2/2} = \frac{1}{1+C} \int_{-\infty}^{+\infty} \mathrm{abs}_C(t-x)\, e^{t(x-t) + mx - x^2/2}\, f(x)\, dx,$$

which is equivalent to

$$f(t) = \frac{e^{m^2/2}}{1+C} \int_{-\infty}^{+\infty} \mathrm{abs}_C(t-x)\, e^{-\frac{(t-x+m)^2}{2}}\, f(x)\, dx, \tag{17}$$

which has the form (8) with

$$H(x) = \frac{e^{m^2/2}}{1+C}\, \mathrm{abs}_C(x)\, e^{-\frac{(x+m)^2}{2}}. \tag{18}$$

The moment generating function of $H$ can be exactly computed:

$$\int_{-\infty}^{+\infty} e^{sx} H(x)\, dx = 1 + \sqrt{2\pi}\, e^{(s-m)^2/2}\, (s-m)\, \big( \Phi(s-m) - \alpha \big). \tag{19}$$

This is clearly equal to 1 only if $s = 0$ (hence $H$ is a density) and if $s = m$. We apply Proposition 2 a) to the equation (17). When $m = 0$, that is if $\alpha = \frac{1}{2}$, the r.h.s. of (19) is equal to 1 only at 0, hence the only non-negative non-trivial solutions of the convolution equation (8) with kernel $H$ given by (18) are the positive constants. This yields immediately that $p(x) = \varphi(x)$, as desired. In the case $\alpha \neq \frac{1}{2}$ the solutions $f(x)$ are linear combinations with non-negative coefficients of the constant 1 and the function $e^{-mx}$. Coming back to $p(x) = f(x)\varphi(x-m)$, this gives density solutions for $p$ which are mixtures of $N(m,1)$ with $N(0,1)$. However $N(0,1)$ satisfies (12) only when $\alpha = 1/2$, therefore a positive component from $N(0,1)$ is forbidden. This proves that $p(x)$ has to be $\varphi(x-m)$.

Finally, to deal with an arbitrary value for $b$, define $\tau_{-b}(x) = x - b$. Now observe that if $P_t$ has $\alpha$-quantile $b + t$ then $P^*_t = P_t \circ \tau_{-b}^{-1}$ has $\alpha$-quantile $t$ and it is still a natural exponential family, for $t \in \mathbb{R}$. So $P^*_t = N(-\Phi^{-1}(\alpha) + t, 1)$ and $P_t = N(-\Phi^{-1}(\alpha) + b + t, 1)$. □

Proof of Theorem 2.

First we prove the result with $a = 1$. Assume the relation (7) and let $t \to 0$. Then $L(t)$ increases to $Q(\mathbb{R}_+)$ by the monotone convergence theorem. Suppose $Q(\mathbb{R}_+)$ is finite: then, for any $\varepsilon > 0$, the l.h.s. of (7) can be made larger than $Q(\mathbb{R}_+) - \varepsilon$ in the following way. First choose $K$ in such a way that $Q((0,K]) > Q(\mathbb{R}_+) - \varepsilon$; then choose $t < K^{-1}$ and small enough to guarantee

$$\int_{(0,\,t^{-1})} e^{-tx}\, Q(dx) \ge \int_{(0,K]} e^{-tx}\, Q(dx) > Q(\mathbb{R}_+) - \varepsilon. \tag{20}$$

Since $0 < \alpha < 1$, the first inequality in (7) becomes absurd for $t$ and $\varepsilon$ sufficiently small. This implies that $Q(\mathbb{R}_+) = +\infty$, hence the natural parameter space of the natural exponential family $(Q_s)$ coincides with the negative reals.

Next we prove that $Q$ is absolutely continuous. Take $0 < s < t < +\infty$ and compute

$$Q((t^{-1}, s^{-1})) = \int_{(t^{-1},\,s^{-1})} e^{sx} e^{-sx}\, Q(dx) \le e \int_{(t^{-1},\,s^{-1})} e^{-sx}\, Q(dx) = e \left( \int_{(0,\,s^{-1})} e^{-sx}\, Q(dx) - \int_{(0,\,t^{-1}]} e^{-tx}\, Q(dx) + \int_{(0,\,t^{-1}]} \left( e^{-tx} - e^{-sx} \right) Q(dx) \right).$$

By (7) the difference between the first two integrals at the r.h.s. is bounded by $\alpha\,(L(s) - L(t))$, whereas the remaining integral is non-positive. Again, since $L$ is analytic on the positive real line it is locally Lipschitz, and this proves the absolute continuity of $Q$, that is $Q(dx) = q(x)\,dx$, with $q$ non-negative and locally integrable.

Now we can write (7) in the form of an equality, setting again $\alpha = \frac{C}{1+C}$, namely

$$\int_0^{t^{-1}} e^{-ty} q(y)\, dy = C \int_{t^{-1}}^{+\infty} e^{-ty} q(y)\, dy, \qquad t > 0. \tag{21}$$

Differentiating both sides w.r.t. $t$, one gets

$$\frac{1+C}{t^2}\, e^{-1} q(t^{-1}) = C \int_{t^{-1}}^{+\infty} y\, e^{-ty} q(y)\, dy - \int_0^{t^{-1}} y\, e^{-ty} q(y)\, dy.$$

Adding the l.h.s. of (21) and subtracting the r.h.s., both multiplied by $t^{-1}$, to the r.h.s. of the above equality, we get for any $t > 0$

$$q\!\left(t^{-1}\right) = \frac{e\,t^2}{1+C} \left\{ \int_0^{t^{-1}} (t^{-1} - y)\, e^{-yt} q(y)\, dy + C \int_{t^{-1}}^{+\infty} (y - t^{-1})\, e^{-yt} q(y)\, dy \right\}. \tag{22}$$

With the help of the function $\mathrm{abs}_C$ defined in (14), equality (22) is rewritten as

$$q(t^{-1}) = \frac{e\,t}{1+C} \int_0^{\infty} \mathrm{abs}_C(1 - ty)\, e^{-ty} q(y)\, dy, \qquad t > 0. \tag{23}$$

Next, for any $p > 0$, define $q_p(x) = \frac{1}{\Gamma(p)}\, x^{p-1}\, \mathbf{1}_{(0,+\infty)}(x)$. Recall from the introduction that for $p = p^*(\alpha)$ one has

$$\int_0^1 y^{p^*-1} e^{-y}\, dy = C \int_1^{+\infty} y^{p^*-1} e^{-y}\, dy. \tag{24}$$

Now multiply both sides of (23) by $t^{p^*-2}$ and change the variable of integration at the r.h.s. to $z = y^{-1}$. One gets

$$t^{p^*-2} q(t^{-1}) = \frac{e\,t^{p^*-1}}{1+C} \int_0^{\infty} \mathrm{abs}_C\!\left(1 - t z^{-1}\right) e^{-t/z}\, q\!\left(z^{-1}\right) \frac{dz}{z^2}. \tag{25}$$

Defining the l.h.s. of the above equality to be $g(t)$, one has

$$q(t) = g(t^{-1})\, t^{p^*-2}, \tag{26}$$

which turns the equation (25) into an equation of the form (10) in $g$ with

$$K(y) = \frac{e}{1+C}\, \mathrm{abs}_C(1-y)\, e^{-y}\, y^{p^*-1}\, \mathbf{1}_{(0,+\infty)}(y). \tag{27}$$

The Mellin transform of $K$ can be easily computed:

$$\int_0^{\infty} y^u K(y)\, dy = 1 + e\, \Gamma(p^*+u)\,(p^*+u-1)\, \{ \alpha - E_{p^*+u}(1) \}. \tag{28}$$

Now observe that the quantity inside the brackets at the r.h.s. of (28) is always increasing in $u$; moreover it is equal to 0 for $u = 0$, due to (24). Hence for any value of $C > 0$, $K$ is always a density. When $p^* = 1$ (equivalently, $C = e - 1$, or $\alpha = 1 - e^{-1}$), $u = 0$ is the unique global minimum point of the r.h.s. of (28). Then, by Proposition 2 b), the only non-negative non-trivial solutions to the equation (10), with $K$ given by (27), have necessarily the form $g(t) = c\,t^{-1}$, with $c > 0$. Thus $q(y) = c\, q_{p^*}(y) = c'\, y^{p^*-1}$.

Moreover, for $p^* \neq 1$ (equivalently, $C \neq e - 1$, or $\alpha \neq 1 - e^{-1}$) the value $u = 1 - p^* \neq 0$ makes the expression at the r.h.s. of (28) equal to 1, too. As a consequence $g_2(t) = t^{p^*-2}$ is also a solution of the multiplicative convolution equation with $K$ given by (27), besides $g_1(t) = t^{-1}$. Applying again Proposition 2 b), all the non-negative non-trivial solutions are linear combinations $c_1 g_1(t) + c_2 g_2(t) = c_1 t^{-1} + c_2 t^{p^*-2}$, with $c_1, c_2 \ge 0$, which correspond to $q(y) = c_1'\, y^{p^*-1} + c_2$. But for $c_2 > 0$ the equation (21) fails at $t = 1$. Indeed, in this case the difference between the l.h.s. and the r.h.s. of (21) is equal to $c_2 \left( 1 - \frac{1+C}{e} \right)$, and this is different from 0 as soon as $C \neq e - 1$. So again $q(y) = c'\, y^{p^*-1}$, as desired.

Finally, to deal with an arbitrary value of $a > 0$, first define $\sigma_{a^{-1}}$ to be the multiplication by $a^{-1}$. Now observe that, if $Q_{-t}$ has $\alpha$-quantile $a t^{-1}$, then $Q^*_{-t} = Q_{-t} \circ \sigma_{a^{-1}}^{-1}$ has $\alpha$-quantile $t^{-1}$ and it is still a natural exponential family, for $t > 0$. So $Q^*_{-t} = \mathrm{Ga}(p^*, t)$ and $Q_{-t} = \mathrm{Ga}(p^*, t/a)$, ending the proof of Theorem 2. □

References

Deny, J. (1960). Sur l'équation de convolution $\mu = \mu * \sigma$. Séminaire Brelot-Choquet-Deny (Théorie du Potentiel), année 1959-60, Exposé numéro 5.

Ferguson, T.S. (1962). Location and scale parameters in exponential families of distributions. Ann. Math. Statist., pp. 986-1001.

Letac, G., Mattner, L., Piccioni, M. (2018). The median of an exponential family and the normal law. Statist. Prob. Lett. 133.