Asymptotic results for linear combinations of spacings generated by i.i.d. exponential random variables
Camilla Calì, Maria Longobardi, Claudio Macci, Barbara Pacchiarotti
February 25, 2021
Abstract
We prove large (and moderate) deviations for a class of linear combinations of spacings generated by i.i.d. exponentially distributed random variables. We allow a wide class of coefficients which can be expressed in terms of continuous functions defined on $[0,1]$ which satisfy some suitable conditions. In this way we generalize some recent results by Giuliano et al. (2015) which concern the empirical cumulative entropies defined in Di Crescenzo and Longobardi (2009a).
Keywords: large deviations, moderate deviations, cumulative entropy, $L$-statistics.
Mathematics Subject Classification: 60F10, 62G30, 94A17.

∗ Address: Dipartimento di Biologia, Università di Napoli Federico II, Via Cintia, Complesso Monte S. Angelo, 80126 Naples, Italy. E-mail: [email protected]
† Address: Dipartimento di Biologia, Università di Napoli Federico II, Via Cintia, Complesso Monte S. Angelo, 80126 Naples, Italy. E-mail: [email protected]
‡ Address: Dipartimento di Matematica, Università di Roma Tor Vergata, Via della Ricerca Scientifica, I-00133 Roma, Italia. E-mail: [email protected]
§ Address: Dipartimento di Matematica, Università di Roma Tor Vergata, Via della Ricerca Scientifica, I-00133 Roma, Italia. E-mail: [email protected]

1 Introduction

Empirical processes and their applications to statistics are widely studied (see e.g. Shorack and Wellner (1986) as a monograph on this topic). An important part of the results on this topic concerns linear combinations of order statistics (called $L$-statistics) and, more in particular, linear combinations of spacings (a spacing is a difference between two consecutive order statistics). Among the references with results on large deviations for $L$-statistics here we recall Aleshkyavichene (1991), Bentkus and Zitikis (1990), Groeneboom et al. (1979) and Groeneboom and Shorack (1981). In some cases the large deviation results are formulated in terms of the concept of large deviation principle (see e.g. Dembo and Zeitouni (1998)) and, among the references with this kind of results, here we recall Boistard (2007) and Duffy et al. (2011).

The aim of this paper is to generalize the results in Giuliano et al. (2015) concerning a particular sequence of linear combinations of spacings $\{C_n : n \geq 1\}$ generated by a sequence of independent and identically distributed (i.i.d. for short) random variables $\{X_n : n \geq 1\}$. We recall that the random variables $\{C_n : n \geq 1\}$ are the empirical cumulative entropies defined in Di Crescenzo and Longobardi (2009a) for a sequence of i.i.d. positive random variables $\{X_n : n \geq 1\}$ with a (common) absolutely continuous distribution function. Moreover the results in Giuliano et al. (2015) concern the case of exponentially distributed random variables $\{X_n : n \geq 1\}$ and, in such a case, the joint distribution of the spacings has some nice properties. In this paper the random variables $\{X_n : n \geq 1\}$ are again exponentially distributed, and we allow a wide class
of sequences of linear combinations of spacings $\{C_n(w) : n \geq 1\}$, where $w$ is a continuous function on $[0,1]$ which satisfies some suitable conditions.

We conclude with the outline of the paper. Section 2 is devoted to some preliminaries; in particular we also illustrate the connections with some references as Di Crescenzo and Longobardi (2009a), Di Crescenzo and Longobardi (2009b) and Gao and Zhao (2011). In Section 3 we generalize the results in Giuliano et al. (2015). The connection between our moderate deviation result and the moderate deviation result for $L$-statistics in Gao and Zhao (2011) is discussed in Section 4. Finally, in Section 5, we discuss some possible choices of the function $w$ based on some empirical entropies in the literature.

2 Preliminaries

We start with some preliminaries on large deviations. We also present the sequence studied in this paper, and some connections with the literature.
2.1 Preliminaries on large deviations

Here we briefly recall some basic preliminaries on large deviations (see e.g. Dembo and Zeitouni (1998), pages 4–5). Let $\mathcal{X}$ be a topological space equipped with its completed Borel $\sigma$-field. A sequence of $\mathcal{X}$-valued random variables $\{Z_n : n \geq 1\}$ satisfies the large deviation principle (LDP for short) with speed function $v_n$ and rate function $I$ if: $\lim_{n\to\infty} v_n = \infty$; the function $I : \mathcal{X} \to [0,\infty]$ is lower semi-continuous; we have the upper bound
$$\limsup_{n\to\infty} \frac{1}{v_n} \log P(Z_n \in F) \leq -\inf_{x \in F} I(x) \quad \text{for all closed sets } F,$$
and the lower bound
$$\liminf_{n\to\infty} \frac{1}{v_n} \log P(Z_n \in G) \geq -\inf_{x \in G} I(x) \quad \text{for all open sets } G.$$
A rate function $I$ is said to be good if its level sets $\{\{x \in \mathcal{X} : I(x) \leq \eta\} : \eta \geq 0\}$ are compact. In the LDPs presented in this paper we always have $\mathcal{X} = \mathbb{R}$. In some cases we apply the Gärtner–Ellis Theorem (see e.g. Theorem 2.3.6 in Dembo and Zeitouni (1998)) with the speed function $v_n$, and the rate functions are good. Here we briefly recall the statement of this theorem for real valued random variables: if there exists
$$\Lambda(\theta) := \lim_{n\to\infty} \frac{1}{v_n} \log E\big[e^{v_n \theta Z_n}\big] \quad \text{for all } \theta \in \mathbb{R},$$
the origin belongs to the interior of $\mathcal{D}(\Lambda) := \{\theta \in \mathbb{R} : \Lambda(\theta) < \infty\}$, and the function $\Lambda$ is essentially smooth (see e.g. Definition 2.3.5 in Dembo and Zeitouni (1998)) and lower semi-continuous, then $\{Z_n : n \geq 1\}$ satisfies the LDP with speed function $v_n$ and good rate function $\Lambda^*$ defined by $\Lambda^*(z) := \sup_{\theta \in \mathbb{R}} \{\theta z - \Lambda(\theta)\}$. For the sake of completeness we recall that the function $\Lambda$ is essentially smooth if the interior of $\mathcal{D}(\Lambda)$ is non-empty, it is differentiable throughout the interior of $\mathcal{D}(\Lambda)$, and $|\Lambda'(\theta_n)| \to \infty$ whenever $\{\theta_n\}$ is a sequence of points in the interior of $\mathcal{D}(\Lambda)$ which converges to a boundary point of $\mathcal{D}(\Lambda)$.

2.2 Preliminaries on the sequence $\{C_n(w) : n \geq 1\}$

Let $\{X_n : n \geq 1\}$ be a sequence of i.i.d. positive random variables and let $X_{1:n} \leq \cdots \leq X_{n:n}$ be the ascending order statistics of $X_1, \ldots, X_n$ (for all $n \geq 1$); moreover we set $X_{0:n} = 0$.
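The order statistics and spacings introduced above can be computed directly; the following is a minimal sketch (the function name, the seed and the $\mathrm{Exp}(\lambda)$ sample are illustrative assumptions of ours, not taken from the paper):

```python
import random

def order_statistics_and_spacings(sample):
    """Return the ascending order statistics X_{1:n} <= ... <= X_{n:n}
    (with the convention X_{0:n} = 0 prepended) and the spacings
    X_{k+1:n} - X_{k:n} for k = 0, ..., n-1."""
    xs = [0.0] + sorted(sample)            # convention X_{0:n} = 0
    spacings = [xs[k + 1] - xs[k] for k in range(len(sample))]
    return xs, spacings

# Hypothetical illustration on an i.i.d. Exp(lam) sample.
random.seed(0)
lam = 2.0
sample = [random.expovariate(lam) for _ in range(10)]
xs, spacings = order_statistics_and_spacings(sample)

# The spacings telescope: their sum equals the largest order statistic.
assert abs(sum(spacings) - max(sample)) < 1e-12
```

The final assertion checks the telescoping property of the spacings, which is used repeatedly in what follows.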
Then we consider the sequence $\{C_n(w) : n \geq 1\}$ defined by
$$C_n(w) := \sum_{k=0}^{n-1} w(k/n)(X_{k+1:n} - X_{k:n}), \qquad (1)$$
for some function $w : [0,1] \to \mathbb{R}$. So we have
$$C_n(w) = \sum_{k=0}^{n-1} w(k/n) X_{k+1:n} - \sum_{k=0}^{n-1} w(k/n) X_{k:n} = \sum_{k=1}^{n} w((k-1)/n) X_{k:n} - \sum_{k=0}^{n-1} w(k/n) X_{k:n}$$
and, by taking into account $X_{0:n} = 0$, we get
$$C_n(w) = \sum_{k=1}^{n-1} \big(w((k-1)/n) - w(k/n)\big) X_{k:n} + w((n-1)/n) X_{n:n}. \qquad (2)$$
Actually in this paper we assume that the common distribution of the random variables $\{X_n : n \geq 1\}$ is $\mathrm{Exp}(\lambda)$ for some $\lambda >$
0, i.e. their (common) distribution function is
$$F(t) := 1 - e^{-\lambda t} \quad \text{for all } t \geq 0. \qquad (3)$$
Then, in such a case, it is known (see e.g. Subsection 2.3 in Pyke (1965)) that the spacings
$$\{X_{1:n} - X_{0:n},\ X_{2:n} - X_{1:n},\ \ldots,\ X_{n:n} - X_{n-1:n}\}$$
are independent and, for all $k \in \{0, \ldots, n-1\}$, $X_{k+1:n} - X_{k:n}$ is $\mathrm{Exp}(\lambda(n-k))$ distributed. This result allows us to provide some explicit formulas for the moment generating function, mean and variance of $C_n(w)$. Firstly, for all $\theta \in \mathbb{R}$, we have
$$E\big[e^{\theta C_n(w)}\big] = \prod_{k=0}^{n-1} E\big[e^{\theta w(k/n)(X_{k+1:n} - X_{k:n})}\big],$$
and therefore
$$E\big[e^{\theta C_n(w)}\big] = \begin{cases} \prod_{k=0}^{n-1} \frac{\lambda(n-k)}{\lambda(n-k) - \theta w(k/n)} & \text{if } \theta w(k/n) < \lambda(n-k) \text{ for all } k \in \{0, \ldots, n-1\}, \\ \infty & \text{otherwise.} \end{cases} \qquad (4)$$
Moreover
$$E[C_n(w)] = \frac{1}{\lambda} \sum_{k=0}^{n-1} \frac{w(k/n)}{n-k} \quad \text{and} \quad \mathrm{Var}[C_n(w)] = \frac{1}{\lambda^2} \sum_{k=0}^{n-1} \frac{w^2(k/n)}{(n-k)^2}. \qquad (5)$$
Now we discuss the almost sure convergence and the asymptotic normality following the lines of some proofs in Di Crescenzo and Longobardi (2009a) and Di Crescenzo and Longobardi (2009b). We introduce the following condition.

Condition 1.
The function $w : [0,1] \to \mathbb{R}$ is continuous and there exist $x_0 \in (0,1)$, $\beta \in (0,1]$ and $c > 0$ such that $|w(x)| \leq c(1-x)^\beta$ for all $x \in [1-x_0, 1]$.

We start with a generalization of Proposition 2 in Di Crescenzo and Longobardi (2009b). In view of what follows we recall that Condition 1 yields $w(1) = 0$, and this condition is needed to have the finiteness of the almost sure limit $\int_0^\infty w(F(z))\,dz$ (see (6) below).

Proposition 2.1. Assume that Condition 1 holds. Let $\{X_n : n \geq 1\}$ be a sequence of i.i.d. positive random variables in $L^p$ for some $p$ such that $\beta p > 1$, with (common) distribution function $F$ possibly different from the one in (3). Then
$$C_n(w) \to \int_0^\infty w(F(z))\,dz \quad \text{a.s. (as } n \to \infty\text{).} \qquad (6)$$

Proof.
We follow the lines of the proof of Proposition 2 in Di Crescenzo and Longobardi (2009b) (see also the proof of Theorem 9 in Rao et al. (2004)). Obviously we have
$$C_n(w) = \int_0^\infty w(\hat F_n(z))\,dz \quad (\text{for all } n \geq 1),$$
where $\hat F_n(x) := \frac{1}{n}\sum_{k=1}^n 1_{\{X_k \leq x\}}$ is the empirical distribution function. We take $a > 0$ such that $F(a) \geq 1 - \frac{x_0}{2}$ and, by the Glivenko–Cantelli Theorem, for $n$ large enough we have $F(a) + \frac{x_0}{2} \geq \hat F_n(a) \geq F(a) - \frac{x_0}{2}$. Thus for all $z \geq a$ we have $\hat F_n(z) \geq \hat F_n(a) \geq 1 - x_0$, which yields $|w(\hat F_n(z))| \leq c(1 - \hat F_n(z))^\beta$ by Condition 1. We also remark that
$$1 - \hat F_n(z) = 1 - \frac{1}{n}\sum_{k=1}^n 1_{\{X_k \leq z\}} = \frac{1}{n}\sum_{k=1}^n 1_{\{X_k > z\}} \leq \frac{1}{n}\sum_{k=1}^n 1_{\{X_k > z\}}\frac{X_k^p}{z^p} \leq \frac{1}{n}\sum_{k=1}^n \frac{X_k^p}{z^p} \leq \frac{\alpha}{z^p},$$
where $\alpha := \sup_{n \geq 1} \frac{1}{n}\sum_{k=1}^n X_k^p < \infty$ a.s. (in fact, since the random variables $\{X_n : n \geq 1\}$ are in $L^p$, $\alpha$ is the supremum of a sequence that converges a.s.); thus $|w(\hat F_n(z))| \leq \frac{c\,\alpha^\beta}{z^{\beta p}}$. So, by the Glivenko–Cantelli Theorem, we can apply the dominated convergence theorem (noting that $\int_a^\infty \frac{dz}{z^{\beta p}} < \infty$ because $\beta p >$
1) and we have
$$\int_a^\infty w(\hat F_n(z))\,dz \to \int_a^\infty w(F(z))\,dz \quad \text{a.s. (as } n \to \infty\text{).}$$
Then we easily conclude the proof noting that we also have
$$\int_0^a w(\hat F_n(z))\,dz \to \int_0^a w(F(z))\,dz \quad \text{a.s. (as } n \to \infty\text{)}$$
again by the Glivenko–Cantelli Theorem and the dominated convergence theorem (noting that $w$ is continuous and therefore bounded, and the integral is over a bounded interval).

In particular, if $F$ is the distribution function in (3), it is easy to check that the limit value is
$$\int_0^\infty w(F(z))\,dz = \int_0^\infty w(1 - e^{-\lambda z})\,dz = \frac{1}{\lambda}\int_0^1 \frac{w(x)}{1-x}\,dx =: \mu_w, \qquad (7)$$
which is finite by Condition 1; moreover, if we take the mean value in (5), we have
$$\lim_{n\to\infty} E[C_n(w)] = \mu_w. \qquad (8)$$
We conclude with a brief comment on the asymptotic normality of the empirical estimators, i.e. the weak convergence of $\frac{C_n(w) - E[C_n(w)]}{\sqrt{\mathrm{Var}[C_n(w)]}}$ to the standard Normal distribution. We can follow the lines of the proof of Theorem 7.1 in Di Crescenzo and Longobardi (2009a) and, in particular, the Lyapunov condition for the sequence $\{C_n(w) : n \geq 1\}$ is
$$\lim_{n\to\infty} \frac{\frac{1}{(\lambda n)^3}\sum_{k=0}^{n-1}\frac{|w(k/n)|^3}{(1-k/n)^3}}{\Big(\frac{1}{(\lambda n)^2}\sum_{k=0}^{n-1}\frac{w^2(k/n)}{(1-k/n)^2}\Big)^{3/2}} = 0. \qquad (9)$$

Remark 2.1.
It is easy to check that (9) holds if
$$\lim_{n\to\infty} \frac{1}{n}\sum_{k=0}^{n-1} \frac{|w(k/n)|^b}{(1-k/n)^b} = \int_0^1 \frac{|w(x)|^b}{(1-x)^b}\,dx < \infty \quad \text{for } b \in \{2,3\} \qquad (10)$$
and, if we refer to Condition 1, we have finite integrals for $b \in \{2,3\}$ if $b(1-\beta) < 1$; thus we have to take $\beta > \frac{2}{3}$.

Remark 2.2.
The limit in (10) with $b = 2$ is equivalent to
$$\lim_{n\to\infty} n\,\mathrm{Var}[C_n(w)] = \frac{1}{\lambda^2}\int_0^1 \frac{w^2(x)}{(1-x)^2}\,dx =: \sigma_w^2; \qquad (11)$$
thus the above weak convergence of $\frac{C_n(w) - E[C_n(w)]}{\sqrt{\mathrm{Var}[C_n(w)]}}$ to the standard Normal distribution is equivalent to the weak convergence of $\sqrt{n}\,(C_n(w) - E[C_n(w)])$ to the centered Normal distribution with variance $\sigma_w^2$.

2.3 Connections with the literature

We start noting that the sequence $\{C_n(w) : n \geq 1\}$ defined by (1) (see also (2)) coincides with the sequence $\{L_n : n \geq 1\}$ of $L$-statistics in Gao and Zhao (2011) (Section 4.6) if we take $w(\cdot) = w(J; \cdot)$, where $w(J; \cdot)$ is defined by
$$w(J; x) := \int_x^1 J(u)\,du \qquad (12)$$
for some function $J$ (called score function). In that reference it is not required that the i.i.d. random variables $\{X_n : n \geq 1\}$ are exponentially distributed.

Moreover, if we consider the score function $\tilde J(u) := \log u + 1$, we get
$$w(\tilde J; x) := \int_x^1 (\log u + 1)\,du = [u \log u]_{u=x}^{u=1} = -x \log x;$$
then, by (1) (and by taking into account that $0 \log 0 = 0$), we get
$$C_n(w(\tilde J; \cdot)) = \sum_{k=1}^{n-1} \Big(-\frac{k}{n}\log\frac{k}{n}\Big)(X_{k+1:n} - X_{k:n}).$$
So $\{C_n(w(\tilde J; \cdot)) : n \geq 1\}$ coincides with:
• $\{\mathcal{CE}(\hat F_n) : n \geq 1\}$ in Di Crescenzo and Longobardi (2009a) (Section 7), when $\{X_n : n \geq 1\}$ are i.i.d. and positive random variables;
• $\{C_n : n \geq 1\}$ in Giuliano et al. (2015) (Section 4), when $\{X_n : n \geq 1\}$ are i.i.d. $\mathrm{Exp}(\lambda)$ distributed random variables.

3 Results
In this section we generalize the results for the sequence $\{C_n : n \geq 1\}$ in Giuliano et al. (2015) (Section 4). In view of what follows we introduce the following condition.

Condition 2.
Let $w : [0,1] \to \mathbb{R}$ be a function as in Condition 1 with $\beta = 1$, and set
$$h_w(x) := \frac{w(x)}{1-x} \quad \text{for } x \in [0,1).$$
Moreover let $\Lambda_w : \mathbb{R} \to \mathbb{R} \cup \{\infty\}$ be the function defined by
$$\Lambda_w(\theta) := \begin{cases} \int_0^1 \log\Big(\frac{\lambda}{\lambda - \theta h_w(x)}\Big)dx & \text{if } \sup_{x \in [0,1)}\{\theta h_w(x)\} \leq \lambda, \\ \infty & \text{otherwise,} \end{cases}$$
and assume that $\Lambda_w$ is finite in a neighbourhood of the origin $\theta = 0$.

We remark that the function $\Lambda_w$ would not be finite in a neighbourhood of the origin $\theta = 0$ if we had Condition 1 with $\beta \in (0,1)$ only (in which case $h_w$ may be unbounded).

Some examples for the function $w$. We consider the following functions:
$$w_1(x) := 1-x; \quad w_2(x) := (1-x)^2; \quad w_3(x) := (1-x)(1-\sqrt{x}). \qquad (13)$$
For all these cases Condition 1 holds with $\beta = 1$; moreover: $\sup_{x \in [0,1)}\{\theta h_w(x)\} \leq \lambda$ if and only if $\theta \leq \lambda$, $\Lambda_w$ is lower semicontinuous, and there exists $\Lambda'_w(\theta)$ for $\theta < \lambda$. Thus, for each function, we have to check the steepness of $\Lambda_w$, i.e.
$$\lim_{\theta \to \lambda^-} \Lambda'_w(\theta) = \infty, \qquad (14)$$
which yields its essential smoothness required in the statement of Proposition 3.1.
• For $w = w_1$ we have
$$\Lambda_{w_1}(\theta) = \begin{cases} \log\big(\frac{\lambda}{\lambda-\theta}\big) & \text{if } \theta < \lambda, \\ \infty & \text{otherwise.} \end{cases} \qquad (15)$$
So we have $\Lambda_{w_1}(\lambda) = \infty$, and therefore (14) holds; indeed we have
$$\lim_{\theta \to \lambda^-} \Lambda'_{w_1}(\theta) = \lim_{\theta \to \lambda^-} \frac{1}{\lambda - \theta} = \infty.$$
• For $w = w_2$ we have
$$\Lambda_{w_2}(\lambda) = -\int_0^1 \log(1 - h_{w_2}(x))\,dx = 1;$$
however, even if $\Lambda_{w_2}(\lambda) < \infty$, (14) holds because
$$\lim_{\theta \to \lambda^-} \Lambda'_{w_2}(\theta) = \lim_{\theta \to \lambda^-} \int_0^1 \frac{h_{w_2}(x)}{\lambda - \theta h_{w_2}(x)}\,dx = \frac{1}{\lambda}\int_0^1 \frac{h_{w_2}(x)}{1 - h_{w_2}(x)}\,dx = \frac{1}{\lambda}\Big(\int_0^1 \frac{dx}{x} - 1\Big) = \infty.$$
• For $w = w_3$ we have
$$\Lambda_{w_3}(\lambda) = -\int_0^1 \log(1 - h_{w_3}(x))\,dx = \frac{1}{2}$$
and
$$\lim_{\theta \to \lambda^-} \Lambda'_{w_3}(\theta) = \lim_{\theta \to \lambda^-} \int_0^1 \frac{h_{w_3}(x)}{\lambda - \theta h_{w_3}(x)}\,dx = \frac{1}{\lambda}\int_0^1 \frac{h_{w_3}(x)}{1 - h_{w_3}(x)}\,dx = \frac{1}{\lambda}\Big(\int_0^1 \frac{dx}{\sqrt{x}} - 1\Big) = \frac{1}{\lambda};$$
thus (14) fails.

We start with the first result, which is the analogue of Proposition 4.1 in Giuliano et al. (2015).

Proposition 3.1.
Assume that $\{X_n : n \geq 1\}$ are i.i.d. and $\mathrm{Exp}(\lambda)$ distributed, that Condition 2 holds, and that $\Lambda_w$ is essentially smooth and lower semi-continuous. Then the sequence $\{C_n(w) : n \geq 1\}$ defined by (1) satisfies the LDP with speed function $v_n = n$ and good rate function $\Lambda^*_w$ defined by $\Lambda^*_w(y) := \sup_{\theta \in \mathbb{R}}\{\theta y - \Lambda_w(\theta)\}$.

Proof.
We want to apply the Gärtner–Ellis Theorem; thus we have to check that
$$\lim_{n\to\infty} \frac{1}{n}\log E\big[e^{n\theta C_n(w)}\big] = \Lambda_w(\theta) \quad (\text{for all } \theta \in \mathbb{R}). \qquad (16)$$
We remark that, by (4), we have
$$\frac{1}{n}\log E\big[e^{n\theta C_n(w)}\big] = \frac{1}{n}\sum_{k=0}^{n-1}\log\Big(\frac{\lambda(n-k)}{\lambda(n-k) - n\theta w(k/n)}\Big) = \frac{1}{n}\sum_{k=0}^{n-1}\log\Bigg(\frac{\lambda\big(1-\frac{k}{n}\big)}{\lambda\big(1-\frac{k}{n}\big) - \theta w(k/n)}\Bigg)$$
for all $\theta \in \mathbb{R}$ such that
$$\theta w(k/n) < \lambda\Big(1 - \frac{k}{n}\Big) \quad \text{for all } k \in \{0, \ldots, n-1\} \qquad (17)$$
(and $\frac{1}{n}\log E\big[e^{n\theta C_n(w)}\big]$ equal to infinity otherwise). Moreover condition (17) holds (for any fixed $n \geq 1$) if and only if $\theta h_w(k/n) < \lambda$ for all $k \in \{0, \ldots, n-1\}$. Thus the limit in (16) trivially holds if $\sup_{x\in[0,1)}\{\theta h_w(x)\} > \lambda$ while, if $\sup_{x\in[0,1)}\{\theta h_w(x)\} \leq \lambda$, the limit (16) can be checked noting that we have a limit of an integral sum (possibly equal to infinity). In conclusion the desired LDP holds as a straightforward application of the Gärtner–Ellis Theorem.

Remark 3.1.
It is well-known that $\Lambda^*_w(y) = 0$ if and only if $y = \Lambda'_w(0)$. Then, since we can differentiate under the integral sign by Condition 2, we get
$$\Lambda'_w(0) = \frac{1}{\lambda}\int_0^1 h_w(x)\,dx = \frac{1}{\lambda}\int_0^1 \frac{w(x)}{1-x}\,dx,$$
i.e. $\Lambda'_w(0)$ coincides with $\mu_w$ in (7).

Remark 3.2.
Here we consider Proposition 3.1 with $w = w_1$, where $w_1$ is the function in (13). Thus $\Lambda_w$ coincides with the function $\Lambda_{w_1}$ in (15); moreover we can check (after some easy computations) that $\Lambda^*_w$ coincides with
$$\Lambda^*_{w_1}(y) = \begin{cases} \lambda y - 1 - \log(\lambda y) & \text{if } y > 0, \\ \infty & \text{otherwise.} \end{cases}$$
Then we have the rate function provided by the Cramér Theorem (see e.g. Theorem 2.2.3 in Dembo and Zeitouni (1998)) for the sequence of empirical means $\big\{\frac{X_1 + \cdots + X_n}{n} : n \geq 1\big\}$ when (as happens in Proposition 3.1) $\{X_n : n \geq 1\}$ is a sequence of i.i.d. and $\mathrm{Exp}(\lambda)$ distributed random variables. In fact it is easy to check that
$$C_n(w_1) = \frac{1}{n}\sum_{k=0}^{n-1}(n-k)(X_{k+1:n} - X_{k:n}) \quad (\text{for all } n \geq 1)$$
by (1) and the definition of $w_1$ in (13), and therefore $\{C_n(w_1) : n \geq 1\}$ and $\big\{\frac{X_1 + \cdots + X_n}{n} : n \geq 1\big\}$ are equally distributed by taking into account the independence and the distributions of the spacings.

The second result, which is the analogue of Proposition 4.2 in Giuliano et al. (2015), provides an upper bound for the rate function $\Lambda^*_w$ in Proposition 3.1 when $h_w(x) > 0$ almost everywhere with respect to $x$. This upper bound can be expressed in terms of the relative entropy (see e.g. Kullback and Leibler (1951)) of an exponential distribution with respect to another one. We recall that, given two absolutely continuous real valued random variables $X_1$ and $X_2$ with densities $f_1$ and $f_2$, the relative entropy of $X_1$ with respect to $X_2$ is defined by
$$H(X_1|X_2) := \int_{\mathbb{R}} f_1(x)\log\frac{f_1(x)}{f_2(x)}\,dx;$$
thus $H(X_1|X_2)$ actually depends on the laws of the random variables $X_1$ and $X_2$. Then the relative entropy of the distribution $\mathrm{Exp}(\lambda_1)$ with respect to the distribution $\mathrm{Exp}(\lambda_2)$ is
$$H(\mathrm{Exp}(\lambda_1)|\mathrm{Exp}(\lambda_2)) = \frac{\lambda_2}{\lambda_1} - 1 - \log\frac{\lambda_2}{\lambda_1}.$$

Proposition 3.2.
Let $h_w$ be as in Condition 2 and assume that $h_w(x) > 0$ almost everywhere with respect to $x$. Moreover set
$$M_w(y) := \int_0^1 H\big(\mathrm{Exp}(1/y)\,\big|\,\mathrm{Exp}(\lambda h_w^{-1}(x))\big)\,dx.$$
Then: (i) $\Lambda^*_w(y) \leq M_w(y)$ for all $y \in (0,\infty)$; (ii) $\Lambda^*_w(y) = \infty$ for all $y \in (-\infty, 0]$; (iii) the infimum of $M_w(y)$ is attained at $y = \bar y_w$, where $\bar y_w := \big(\lambda\int_0^1 h_w^{-1}(x)\,dx\big)^{-1}$.

Proof. We start with the proof of (i). We remark that, for $y > 0$, we have
$$\sup_{\theta < \eta}\Big\{\theta y - \log\Big(\frac{\eta}{\eta - \theta}\Big)\Big\} = H(\mathrm{Exp}(1/y)\,|\,\mathrm{Exp}(\eta))$$
for $\eta > 0$; then we get
$$\Lambda^*_w(y) = \sup_{\theta:\ \sup_{z\in[0,1)}\theta h_w(z) \leq \lambda}\Big\{\theta y - \int_0^1 \log\Big(\frac{\lambda}{\lambda - \theta h_w(x)}\Big)dx\Big\} = \sup_{\theta:\ \sup_{z\in[0,1)}\theta h_w(z) \leq \lambda}\Big\{\theta y - \int_0^1 \log\Big(\frac{\lambda h_w^{-1}(x)}{\lambda h_w^{-1}(x) - \theta}\Big)dx\Big\}$$
$$\leq \int_0^1 \sup_{\theta:\ \sup_{z\in[0,1)}\theta h_w(z) \leq \lambda}\Big\{\theta y - \log\Big(\frac{\lambda h_w^{-1}(x)}{\lambda h_w^{-1}(x) - \theta}\Big)\Big\}dx \leq \int_0^1 \sup_{\theta < \lambda h_w^{-1}(x)}\Big\{\theta y - \log\Big(\frac{\lambda h_w^{-1}(x)}{\lambda h_w^{-1}(x) - \theta}\Big)\Big\}dx$$
$$= \int_0^1 H\big(\mathrm{Exp}(1/y)\,\big|\,\mathrm{Exp}(\lambda h_w^{-1}(x))\big)\,dx.$$
Now the proof of (ii): for $y < 0$ we have
$$\Lambda^*_w(y) \geq \sup_{\theta \leq 0}\Big\{\theta y - \int_0^1 \log\Big(\frac{\lambda h_w^{-1}(x)}{\lambda h_w^{-1}(x) - \theta}\Big)dx\Big\} \geq \sup_{\theta \leq 0}\{\theta y\} = \infty;$$
for $y = 0$ (this case was forgotten in the proof of Proposition 4.2 in Giuliano et al. (2015)) we have
$$\Lambda^*_w(0) \geq \sup_{\theta \leq 0}\Big\{-\int_0^1 \log\Big(\frac{\lambda h_w^{-1}(x)}{\lambda h_w^{-1}(x) - \theta}\Big)dx\Big\} = \lim_{\theta \to -\infty}\Big(-\int_0^1 \log\Big(\frac{\lambda h_w^{-1}(x)}{\lambda h_w^{-1}(x) - \theta}\Big)dx\Big) = \infty.$$
Finally the proof of (iii). One can check that
$$M_w(y) = \lambda\int_0^1 h_w^{-1}(x)\,dx \cdot y - 1 - \log\lambda - \int_0^1 \log(h_w^{-1}(x))\,dx - \log y$$
and its derivative is
$$M'_w(y) = \lambda\int_0^1 h_w^{-1}(x)\,dx - \frac{1}{y}.$$
So we have $M'_w(y) = 0$ if and only if $y = \bar y_w$, and $y = \bar y_w$ is a minimizer by the convexity of $M_w$.

The third result, which is the analogue of Proposition 4.3 in Giuliano et al. (2015), concerns moderate deviations. In view of its proof we remark that
$$\text{there exists } \delta > 0 \text{ such that } \log(1+x) \leq x - \frac{x^2}{2} + |x|^3 \text{ for all } |x| < \delta \qquad (18)$$
(which can be proved by checking that the function $g_1$ defined by $g_1(x) := \log(1+x) - (x - \frac{x^2}{2} + |x|^3)$ has a local maximum at $x = 0$) and
$$\text{for all } v > \tfrac{1}{2}, \text{ there exists } \delta > 0 \text{ such that } \log(1+x) \geq x - vx^2 \text{ for all } |x| < \delta \qquad (19)$$
(which can be proved by checking that the function $g_2$ defined by $g_2(x) := \log(1+x) - (x - vx^2)$ has a local minimum at $x = 0$).

Proposition 3.3.
Assume that $\{X_n : n \geq 1\}$ are i.i.d. and $\mathrm{Exp}(\lambda)$ distributed, and that Condition 2 holds. Then, for any positive sequence $\{a_n : n \geq 1\}$ such that
$$a_n \to 0 \quad \text{and} \quad na_n \to \infty \quad (\text{as } n \to \infty), \qquad (20)$$
the sequence $\big\{\sqrt{na_n}\,(C_n(w) - E[C_n(w)]) : n \geq 1\big\}$ satisfies the LDP with speed function $v_n = 1/a_n$ and good rate function $\tilde\Lambda^*_w$ defined by $\tilde\Lambda^*_w(y) := \frac{y^2}{2\sigma_w^2}$, where $\sigma_w^2$ is the value in (11).

Proof. We want to apply the Gärtner–Ellis Theorem with speed function $1/a_n$; thus we have to check that
$$\liminf_{n\to\infty} a_n \log E\Big[\exp\Big(\theta\sqrt{\tfrac{n}{a_n}}\,(C_n(w) - E[C_n(w)])\Big)\Big] \geq \frac{\sigma_w^2\theta^2}{2} \qquad (21)$$
and
$$\limsup_{n\to\infty} a_n \log E\Big[\exp\Big(\theta\sqrt{\tfrac{n}{a_n}}\,(C_n(w) - E[C_n(w)])\Big)\Big] \leq \frac{\sigma_w^2\theta^2}{2} \qquad (22)$$
for all $\theta \in \mathbb{R}$. It is useful to remark that, by (4) and the mean value in (5) (together with some computations), we have
$$\log E\Big[\exp\Big(\theta\sqrt{\tfrac{n}{a_n}}\,(C_n(w) - E[C_n(w)])\Big)\Big] = \log E\Big[e^{\theta\sqrt{n/a_n}\,C_n(w)}\Big] - \theta\sqrt{\tfrac{n}{a_n}}\,E[C_n(w)]$$
$$= \sum_{k=0}^{n-1}\log\frac{\lambda(n-k)}{\lambda(n-k) - \theta\sqrt{n/a_n}\,w(k/n)} - \frac{\theta}{\lambda}\sqrt{\tfrac{n}{a_n}}\sum_{k=0}^{n-1}\frac{w(k/n)}{n-k}$$
$$= -\sum_{k=0}^{n-1}\Big(\log\Big(1 - \frac{\theta}{\lambda\sqrt{na_n}}\cdot\frac{w(k/n)}{1-k/n}\Big) + \frac{\theta}{\lambda\sqrt{na_n}}\cdot\frac{w(k/n)}{1-k/n}\Big)$$
for all $\theta \in \mathbb{R}$ such that $\frac{\theta}{\lambda\sqrt{na_n}}\cdot\frac{w(k/n)}{1-k/n} < 1$ for all $k \in \{0,\ldots,n-1\}$ (and $\log E\big[\exp\big(\theta\sqrt{n/a_n}\,(C_n(w) - E[C_n(w)])\big)\big]$ equal to infinity otherwise). Then, by Condition 2 and by $na_n \to \infty$, for all $\delta > 0$ there exists $\bar n$ such that
$$\Big|\frac{\theta}{\lambda\sqrt{na_n}}\cdot\frac{w(k/n)}{1-k/n}\Big| < \delta \quad \text{for all } k \in \{0,\ldots,n-1\}$$
for all $n > \bar n$ (in fact $\big|\frac{\theta}{\lambda\sqrt{na_n}}\cdot\frac{w(k/n)}{1-k/n}\big| \leq \frac{|\theta|c}{\lambda\sqrt{na_n}} \to 0$ as $n \to \infty$). Now we are ready for the proof of (21) and (22); this will be done by using (18) and (19) for suitable $\delta > 0$ and $x$ which depend on $n > \bar n$. We start with the proof of (21). If we combine the above computations in this proof and (18) (with $x = -\frac{\theta}{\lambda\sqrt{na_n}}\cdot\frac{w(k/n)}{1-k/n}$), we have
$$a_n \log E\Big[\exp\Big(\theta\sqrt{\tfrac{n}{a_n}}\,(C_n(w) - E[C_n(w)])\Big)\Big] \geq a_n\sum_{k=0}^{n-1}\Big(\frac{\theta^2}{2\lambda^2 na_n}\cdot\frac{w^2(k/n)}{(1-k/n)^2} - \frac{|\theta|^3}{\lambda^3(na_n)^{3/2}}\cdot\frac{|w(k/n)|^3}{(1-k/n)^3}\Big);$$
hence, by taking into account the limit for the variance in (11) and
$$\lim_{n\to\infty}\frac{1}{n\sqrt{na_n}}\sum_{k=0}^{n-1}\frac{|w(k/n)|^3}{(1-k/n)^3} = 0$$
(because $na_n \to \infty$ by (20) and, as explained in Remark 2.1, $\lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}\frac{|w(k/n)|^3}{(1-k/n)^3} = \int_0^1\frac{|w(x)|^3}{(1-x)^3}\,dx$ is finite because $\beta > \frac{2}{3}$), we obtain
$$\liminf_{n\to\infty} a_n\log E\Big[\exp\Big(\theta\sqrt{\tfrac{n}{a_n}}\,(C_n(w) - E[C_n(w)])\Big)\Big] \geq \frac{\sigma_w^2\theta^2}{2}.$$
Similarly, for the proof of (22), by (19) we have
$$\limsup_{n\to\infty} a_n\log E\Big[\exp\Big(\theta\sqrt{\tfrac{n}{a_n}}\,(C_n(w) - E[C_n(w)])\Big)\Big] \leq \limsup_{n\to\infty} a_n\sum_{k=0}^{n-1}\frac{v\theta^2}{\lambda^2 na_n}\cdot\frac{w^2(k/n)}{(1-k/n)^2} = \sigma_w^2 v\theta^2,$$
and we get (22) by letting $v$ go to $\frac{1}{2}$.

In the following remark we recall some typical features on moderate deviations.

Remark 3.3.
The class of LDPs in Proposition 3.3 fills the gap between two asymptotic regimes.
1. The almost sure convergence of $C_n(w)$ to $\mu_w$, which is equivalent (by (8)) to the almost sure convergence of $C_n(w) - E[C_n(w)]$ to zero.
2. The weak convergence of $\sqrt{n}\,(C_n(w) - E[C_n(w)])$ to the centered Normal distribution with variance $\sigma_w^2$ (see Remark 2.2).
Then we recover these two cases by taking the sequence of random variables in Proposition 3.3 with $a_n = \frac{1}{n}$ and $a_n = 1$, respectively; in both cases one condition in (20) holds, and the other one fails. Moreover we know that the LDP in Proposition 3.1, which concerns the almost sure convergence of $C_n(w)$ to $\mu_w$, is governed by the rate function $\Lambda^*_w(y)$ which uniquely vanishes at $y = \Lambda'_w(0)$ (see Remark 3.1), and $(\Lambda^*_w)''(\Lambda'_w(0)) = (\Lambda''_w(0))^{-1}$. So, since we can differentiate (twice) under the integral sign by Condition 2, we get (see also (11))
$$\Lambda''_w(0) = \frac{1}{\lambda^2}\int_0^1 h_w^2(x)\,dx = \frac{1}{\lambda^2}\int_0^1 \frac{w^2(x)}{(1-x)^2}\,dx = \sigma_w^2,$$
i.e. the variance of the weak limit law of $\sqrt{n}\,(C_n(w) - E[C_n(w)])$. In some sense we can say that we have an asymptotic normality result as a consequence of an LDP; an interesting discussion on this issue can be found in Bryc (1993).

We conclude with a lower bound for the variance $\sigma_w^2$ in Remark 3.3 (and in Remark 2.2).

Remark 3.4.
Here we assume that $\gamma_w := \int_0^1 \frac{w(x)}{1-x}\,dx \neq 0$. Then, by (11) and an easy application of Jensen's inequality, we have
$$\sigma_w^2 \geq \frac{1}{\lambda^2}\Big(\int_0^1 \frac{w(x)}{1-x}\,dx\Big)^2 = \frac{\gamma_w^2}{\lambda^2}.$$
So, if we consider the function $w_1$ in (13), the inequality turns into an equality if and only if
$$w(x) = \gamma_w w_1(x) = \gamma_w(1-x).$$
From now on we set $\ell_w(x) := \gamma_w(1-x)$; moreover we take $\gamma_w > 0$ and we follow the same lines of some parts of Remark 3.2. Firstly we have $\Lambda_{\ell_w}(\theta) = \Lambda_{w_1}(\theta\gamma_w)$ for all $\theta \in \mathbb{R}$ and
$$\Lambda^*_{\ell_w}(y) = \sup_{\theta\in\mathbb{R}}\{\theta y - \Lambda_{\ell_w}(\theta)\} = \sup_{\theta\in\mathbb{R}}\{\theta y - \Lambda_{w_1}(\theta\gamma_w)\} = \Lambda^*_{w_1}(y\gamma_w^{-1}) = \begin{cases} \lambda\gamma_w^{-1}y - 1 - \log(\lambda\gamma_w^{-1}y) & \text{if } y > 0, \\ \infty & \text{otherwise.} \end{cases}$$
Moreover $\Lambda^*_{\ell_w}$ is the rate function provided by the Cramér Theorem for a sequence of empirical means of i.i.d. and $\mathrm{Exp}(\lambda\gamma_w^{-1})$ distributed random variables; indeed we have
$$C_n(\ell_w) = \frac{\gamma_w}{n}\sum_{k=0}^{n-1}(n-k)(X_{k+1:n} - X_{k:n}) \quad (\text{for all } n \geq 1)$$
by (1) and the definition of $\ell_w$, and therefore $\{C_n(\ell_w) : n \geq 1\}$ is a sequence of such empirical means by the independence and the distributions of the spacings.

4 Connections with a moderate deviation result for $L$-statistics

In this section we discuss some connections between Theorem 4.8 in Gao and Zhao (2011), with $F$ as the exponential distribution function in (3), and Proposition 3.3 in this paper with $w(\cdot) = w(J;\cdot)$ as in (12).

Firstly, since $F$ is the exponential distribution function in (3), we can give the following formulas for $m(J,F)$ and $\sigma^2(J,F)$ in Theorem 4.8 in Gao and Zhao (2011):
$$m(J,F) := \int_0^\infty x J(1-e^{-\lambda x})\lambda e^{-\lambda x}\,dx;$$
$$\sigma^2(J,F) := \int_0^\infty\int_0^\infty J(1-e^{-\lambda x})J(1-e^{-\lambda y})\big\{1-e^{-\lambda(x\wedge y)} - (1-e^{-\lambda x})(1-e^{-\lambda y})\big\}\,dx\,dy.$$
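These formulas can be checked numerically. The sketch below (the trimmed score function $J$, the unit rate $\lambda = 1$ and the midpoint quadrature are illustrative assumptions of ours, not taken from the paper) evaluates $m(J,F)$ and the quantity $\mu_{w(J;\cdot)}$ of (7); their numerical agreement anticipates the identity $m(J,F) = \mu_{w(J;\cdot)}$ verified analytically below.

```python
import math

lam = 1.0  # assumption: unit rate lambda = 1, chosen only for this illustration

def J(u):
    # hypothetical trimmed score function: bounded and equal to
    # zero near u = 0 and u = 1, as required of score functions below
    return 1.0 if 0.2 <= u <= 0.8 else 0.0

def w_J(x):
    # w(J; x) = int_x^1 J(u) du, in closed form for this particular J
    return max(0.0, 0.8 - max(x, 0.2))

def midpoint(f, a, b, n=100_000):
    # plain midpoint-rule quadrature (illustrative, not from the paper)
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# m(J, F) = int_0^infty x J(1 - e^{-lam x}) lam e^{-lam x} dx
m = midpoint(lambda x: x * J(1.0 - math.exp(-lam * x)) * lam * math.exp(-lam * x),
             0.0, 10.0)

# mu_{w(J;.)} = (1/lam) int_0^1 w(J; r)/(1 - r) dr, as in (7)
mu = midpoint(lambda r: w_J(r) / (1.0 - r), 0.0, 1.0) / lam

assert abs(m - mu) < 1e-3  # numerical agreement with m(J, F) = mu_{w(J;.)}
```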
Then, under suitable hypotheses (some of them concern the score function $J$), Theorem 4.8 in Gao and Zhao (2011) allows us to say that, for any sequence $\{a(n) : n \geq 1\}$ of positive numbers such that
$$a(n) \to \infty \quad \text{and} \quad \frac{a(n)}{\sqrt{n}} \to 0 \quad (\text{as } n \to \infty), \qquad (23)$$
the sequence $\big\{\frac{\sqrt{n}}{a(n)}(C_n(w) - m(J,F)) : n \geq 1\big\}$ satisfies the LDP with speed $(a(n))^2$ and good rate function $I_L(y) := \frac{y^2}{2\sigma^2(J,F)}$.

So $(a(n))^{-2}$ in (23) plays the role of $a_n$ in (20); moreover, as typically happens for the results on moderate deviations, both rate functions $\tilde\Lambda^*_w$ in Proposition 3.3 and $I_L$ are quadratic functions that uniquely vanish at the origin $y = 0$. We remark that, if we compare $\big\{\frac{\sqrt{n}}{a(n)}(C_n(w) - m(J,F)) : n \geq 1\big\}$ and the sequence of random variables in Proposition 3.3 in this paper, by taking into account the limit (8) we expect to have $m(J,F) = \mu_{w(J;\cdot)}$. In fact, by considering the change of variable $r = 1 - e^{-\lambda x}$ and some computations with an integration by parts, we have
$$m(J,F) = \int_0^1 \frac{-\log(1-r)}{\lambda}\,J(r)\,\lambda(1-r)\,\frac{dr}{\lambda(1-r)} = -\frac{1}{\lambda}\int_0^1 \log(1-r)J(r)\,dr$$
$$= -\frac{1}{\lambda}\Big\{\big[-w(J;r)\log(1-r)\big]_{r=0}^{r=1} - \int_0^1 \frac{w(J;r)}{1-r}\,dr\Big\} = \frac{1}{\lambda}\int_0^1 \frac{w(J;r)}{1-r}\,dr = \mu_{w(J;\cdot)};$$
indeed $\big[-w(J;r)\log(1-r)\big]_{r=0}^{r=1} = 0$ because the score function $J$ is bounded, continuous and trimmed (i.e. it is equal to zero near $r = 0$ and $r = 1$).

We also remark that, if we compare the rate functions $\tilde\Lambda^*_w$ and $I_L$, we expect to have $\sigma^2(J,F) = \sigma^2_{w(J;\cdot)}$. In order to check this equality we start noting that the function inside the integral is symmetric with respect to $(x,y)$; therefore we have the integral over $\{(x,y) : 0 \leq x \leq y\}$ multiplied by 2 and, after some computations, we get
$$\sigma^2(J,F) = 2\int_0^\infty dy\,J(1-e^{-\lambda y})\,e^{-\lambda y}\int_0^y dx\,(1-e^{-\lambda x})J(1-e^{-\lambda x}).$$
Moreover we consider two further changes of variables: the first one is $r = 1 - e^{-\lambda x}$, and we obtain
$$\sigma^2(J,F) = 2\int_0^\infty dy\,e^{-\lambda y}J(1-e^{-\lambda y})\int_0^{1-e^{-\lambda y}}\frac{dr}{\lambda(1-r)}\,r J(r);$$
the second one is $s = 1 - e^{-\lambda y}$, and we get
$$\sigma^2(J,F) = 2\int_0^1 \frac{ds}{\lambda(1-s)}\,(1-s)J(s)\int_0^s \frac{dr}{\lambda(1-r)}\,r J(r) = \frac{2}{\lambda^2}\int_0^1 ds\,J(s)\int_0^s dr\,\frac{r}{1-r}J(r).$$
Finally we conclude with the following computations (again with integration by parts):
$$\sigma^2(J,F) = \frac{2}{\lambda^2}\Big\{\Big[-w(J;s)\int_0^s dr\,\frac{r}{1-r}J(r)\Big]_{s=0}^{s=1} + \int_0^1 w(J;s)J(s)\frac{s}{1-s}\,ds\Big\}$$
$$= \frac{2}{\lambda^2}\Big\{\Big[-\frac{(w(J;s))^2\,s}{2(1-s)}\Big]_{s=0}^{s=1} + \int_0^1 \frac{(w(J;s))^2}{2(1-s)^2}\,ds\Big\} = \frac{1}{\lambda^2}\int_0^1 \frac{(w(J;s))^2}{(1-s)^2}\,ds = \sigma^2_{w(J;\cdot)},$$
because $\big[-w(J;s)\int_0^s dr\,\frac{r}{1-r}J(r)\big]_{s=0}^{s=1} = 0$ and $\big[-\frac{(w(J;s))^2 s}{2(1-s)}\big]_{s=0}^{s=1} = 0$ by the hypotheses on the score function $J$ recalled above and, for the second equality, by Condition 2 with $\beta = 1$ (which refers to Condition 1) for $w(J;\cdot)$.

5 Some possible choices of the function $w$

A natural way to estimate a functional $\varphi(F)$ of a distribution function $F$ is to consider $\varphi(\hat F_n)$ where, given a sequence $\{X_n : n \geq 1\}$ of i.i.d. random variables with distribution function $F$ (possibly different from the one in (3)), $\{\hat F_n : n \geq 1\}$ is the sequence of the empirical distribution functions defined by
$$\hat F_n(x) := \frac{1}{n}\sum_{k=1}^n 1_{\{X_k \leq x\}}, \quad x \in \mathbb{R}.$$
In this section we concentrate our attention on functionals related to the concept of entropy and some other related items. We recall some preliminaries and we refer to Di Crescenzo and Longobardi (2009a) (see also the references cited therein). The cumulative entropy associated to an absolutely continuous distribution function $F$ is defined by
$$\mathcal{CE}(F) = -\int_0^\infty F(z)\log F(z)\,dz.$$
Then, given a sequence of i.i.d. random variables $\{X_n : n \geq 1\}$ with (common) distribution function $F$, we can consider the sequence of empirical cumulative entropies $\{\mathcal{CE}(\hat F_n) : n \geq 1\}$ defined by
$$\mathcal{CE}(\hat F_n) := -\int_0^\infty \hat F_n(z)\log \hat F_n(z)\,dz.$$
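Since $\hat F_n$ is a step function, the integral defining $\mathcal{CE}(\hat F_n)$ reduces exactly to a spacings sum of the form (1). A minimal sketch (the function name, the seed and the sample size are illustrative assumptions of ours):

```python
import math, random

def empirical_cumulative_entropy(sample):
    """CE(F_hat_n) computed through the spacings sum
    sum_{k=1}^{n-1} (-(k/n) log(k/n)) (X_{k+1:n} - X_{k:n})."""
    xs = sorted(sample)
    n = len(xs)
    return sum(-(k / n) * math.log(k / n) * (xs[k] - xs[k - 1])
               for k in range(1, n))

# Illustration on a hypothetical Exp(lam) sample: for the exponential
# distribution one has CE(F) = (pi^2/6 - 1)/lam, and CE(F_hat_n) -> CE(F) a.s.
random.seed(12345)  # assumption: seed chosen only for reproducibility
lam = 1.0
sample = [random.expovariate(lam) for _ in range(20_000)]
est = empirical_cumulative_entropy(sample)
exact = (math.pi ** 2 / 6 - 1) / lam  # est should be close to exact for large n
```

The limit value $(\pi^2/6 - 1)/\lambda$ is consistent with (7) applied to $w^{(1)}(x) = -x\log x$.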
It is known that $\mathcal{CE}(\hat F_n) \to \mathcal{CE}(F)$ a.s. as $n \to \infty$; see Proposition 2 in Di Crescenzo and Longobardi (2009b). However we can also refer to Proposition 2.1 with $w = w^{(1)}$, where $w^{(1)}(x) := -x\log x$; in fact $C_n(w^{(1)})$ in Proposition 2.1 coincides with $\mathcal{CE}(\hat F_n)$. It is easy to check that both Conditions 1 and 2 hold for the function $w^{(1)}$.

We can also consider the generalized cumulative entropy (see eq. (5.1) in Kayal (2016)) defined by
$$\mathcal{CE}_n(F) := \frac{1}{n!}\int_0^\infty F(z)(-\log F(z))^n\,dz \quad (\text{for all } n \geq 1),$$
which meets the cumulative entropy $\mathcal{CE}(F)$ defined above for $n = 1$. In this case we have to consider the function
$$w^{(n)}(x) := \frac{1}{n!}\,x(-\log x)^n$$
and, again, it is easy to check that both Conditions 1 and 2 hold for the function $w^{(n)}$.

In other cases we have a different situation. For instance we consider the cumulative residual entropy (see Rao et al. (2004))
$$\mathcal{E}(F) := -\int_0^\infty (1-F(z))\log(1-F(z))\,dz$$
or the fractional cumulative residual entropy (see eq. (5) in Xiong et al. (2019))
$$\mathcal{E}_q(F) := \int_0^\infty (1-F(x))(-\log(1-F(x)))^q\,dx \quad (\text{for all } q \in [0,1]),$$
which meets the cumulative residual entropy $\mathcal{E}(F)$ for $q = 1$. In this case we have to consider the function
$$w^{[q]}(x) := (1-x)(-\log(1-x))^q;$$
then the function $w^{[q]}$ satisfies Condition 1 with $\beta \in (0,1)$, and with $\beta = 1$ only if $q = 0$.

Funding

CC and ML are supported by Indam-GNAMPA and by MIUR-PRIN 2017 Project "Stochastic Models for Complex Systems" (No. 2017JFFHSH). CM and BP are supported by the MIUR Excellence Department Project awarded to the Department of Mathematics, University of Rome Tor Vergata (CUP E83C18000100006), by the University of Rome Tor Vergata (research program "Beyond Borders", project "Asymptotic Methods in Probability" (CUP E89C20000680005)) and by Indam-GNAMPA (research project "Stime asintotiche: principi di invarianza e grandi deviazioni").
Acknowledgements
We thank Prof. Gao for some discussions on Theorem 4.8 in Gao and Zhao (2011).
References