On cumulative entropies in terms of moments of order statistics
Narayanaswamy Balakrishnan, Francesco Buono, Maria Longobardi
N. Balakrishnan
McMaster University, Hamilton, Ontario, Canada
E-mail: [email protected]

F. Buono
Università degli Studi di Napoli Federico II, Naples, Italy
E-mail: [email protected]

M. Longobardi
Università degli Studi di Napoli Federico II, Naples, Italy
E-mail: [email protected]
Abstract
In this paper, relations among some kinds of cumulative entropies and moments of order statistics are presented. By using some characterizations and the symmetry of a non-negative and absolutely continuous random variable X, lower and upper bounds for entropies are obtained and examples are given.

Keywords
Cumulative Entropies · Order Statistics · Moments

AMS Subject Classification: 60E15, 62N05, 94A17
1 Introduction

In reliability theory, to describe and study the information related to a non-negative absolutely continuous random variable X, we use the Shannon entropy, or differential entropy, of X, defined by (Shannon, 1948)
$$H(X) = -E[\log f(X)] = -\int_0^{+\infty} f(x)\log f(x)\,\mathrm{d}x,$$
where log is the natural logarithm and f is the probability density function (pdf) of X. In the following, we use $F$ and $\overline{F}$ to denote the cumulative distribution function (cdf) and the survival function (sf) of X, respectively.
In the literature, there are several different versions of entropy, each one suitable for a specific situation. Rao et al. (2004) introduced the Cumulative Residual Entropy (CRE) of X as
$$\mathcal{E}(X) = -\int_0^{+\infty} \overline{F}(x)\log \overline{F}(x)\,\mathrm{d}x.$$
Di Crescenzo and Longobardi (2009) introduced the Cumulative Entropy (CE) of X as
$$\mathcal{CE}(X) = -\int_0^{+\infty} F(x)\log F(x)\,\mathrm{d}x. \qquad (1)$$
This information measure is suitable when uncertainty is related to the past, a dual concept of the cumulative residual entropy, which relates to uncertainty about the future lifetime of a system. Mirali et al. (2016) introduced the Weighted Cumulative Residual Entropy (WCRE) of X as
$$\mathcal{E}^w(X) = -\int_0^{+\infty} x\,(1-F(x))\log(1-F(x))\,\mathrm{d}x.$$
Mirali and Baratpour (2017) introduced the Weighted Cumulative Entropy (WCE) of X as
$$\mathcal{CE}^w(X) = -\int_0^{+\infty} x\,F(x)\log F(x)\,\mathrm{d}x.$$
Recently, various authors have discussed different versions of entropy and their applications (see, for instance, [3], [4], [5], [11]).

The paper is organized as follows. In Section 2, we study relations among some kinds of entropies and moments of order statistics and present various examples. In Section 3, bounds are obtained by using some characterizations and properties (such as the symmetry) of the random variable X, and examples and bounds for known distributions are given.

2 Cumulative entropies and moments of order statistics

We recall that, if we have n i.i.d. random variables $X_1, X_2, \ldots, X_n$, we can introduce the order statistics $X_{k:n}$, $k = 1, \ldots, n$. The k-th order statistic is equal to the k-th smallest value of the sample. The cdf of $X_{k:n}$ can be given in terms of the cdf of the parent distribution; in fact,
$$F_{k:n}(x) = \sum_{j=k}^{n} \binom{n}{j} [F(x)]^j [1-F(x)]^{n-j},$$
whereas the pdf of $X_{k:n}$ is expressed as
$$f_{k:n}(x) = \binom{n}{k} k\, [F(x)]^{k-1} [1-F(x)]^{n-k} f(x).$$
Choosing k = 1 and k = n, we get the smallest and the largest order statistics, respectively. Their cdf's and pdf's are given by
$$F_{1:n}(x) = 1 - [1-F(x)]^n, \qquad f_{1:n}(x) = n[1-F(x)]^{n-1} f(x),$$
$$F_{n:n}(x) = [F(x)]^n, \qquad f_{n:n}(x) = n[F(x)]^{n-1} f(x).$$
The CRE of X can be written in terms of order statistics, that is,
$$\begin{aligned}
\mathcal{E}(X) &= -\int_0^{+\infty} (1-F(x))\log(1-F(x))\,\mathrm{d}x \\
&= -x(1-F(x))\log(1-F(x))\Big|_0^{+\infty} - \int_0^{+\infty} x\log(1-F(x)) f(x)\,\mathrm{d}x - \int_0^{+\infty} x f(x)\,\mathrm{d}x \\
&= \int_0^{+\infty} x\,[-\log(1-F(x))] f(x)\,\mathrm{d}x - E(X) \\
&= \int_0^{+\infty} x \left[\sum_{n=1}^{+\infty} \frac{F(x)^n}{n}\right] f(x)\,\mathrm{d}x - E(X) \\
&= \sum_{n=1}^{+\infty} \frac{1}{n(n+1)}\,\mu_{n+1:n+1} - E(X), \qquad (2)
\end{aligned}$$
where E(X) is the expectation or mean of X, and $\mu_{n+1:n+1}$ is the mean of the largest order statistic in a sample of size n+1 from F, provided that $\lim_{x\to+\infty} -x(1-F(x))\log(1-F(x)) = 0$. We note that (2) can be rewritten as
$$\mathcal{E}(X) = \sum_{n=1}^{+\infty} \left(\frac{1}{n} - \frac{1}{n+1}\right) \mu_{n+1:n+1} - E(X). \qquad (3)$$

Example 1
Consider the standard exponential distribution with pdf $f(x) = e^{-x}$, $x > 0$. Then, it is known that $E(X) = 1$ and $E(X_{n:n}) = 1 + \frac{1}{2} + \cdots + \frac{1}{n}$. Then, from (3), we readily have
$$\begin{aligned}
\mathcal{E}(X) &= \left(1 - \frac{1}{2}\right)\mu_{2:2} + \left(\frac{1}{2} - \frac{1}{3}\right)\mu_{3:3} + \left(\frac{1}{3} - \frac{1}{4}\right)\mu_{4:4} + \cdots - E(X) \\
&= (\mu_{2:2} - E(X)) + \frac{1}{2}(\mu_{3:3} - \mu_{2:2}) + \frac{1}{3}(\mu_{4:4} - \mu_{3:3}) + \cdots \\
&= \frac{1}{2} + \frac{1}{2}\cdot\frac{1}{3} + \frac{1}{3}\cdot\frac{1}{4} + \cdots = \sum_{n=1}^{+\infty} \frac{1}{n(n+1)} = 1.
\end{aligned}$$
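A quick numerical check of Example 1 (an illustration of ours, with arbitrary truncation points): the partial sums of (3) are evaluated below with the harmonic-number expression for $\mu_{n+1:n+1}$ and increase towards the known value $\mathcal{E}(X) = 1$.

```python
# Partial sums of (3) for the standard exponential distribution, using
# mu_{n+1:n+1} = H_{n+1} = 1 + 1/2 + ... + 1/(n+1) and E(X) = 1.

def cre_exponential_partial_sum(terms: int) -> float:
    """sum_{n=1}^{terms} (1/n - 1/(n+1)) * mu_{n+1:n+1} - E(X)."""
    harmonic = 1.0                 # H_1 = mu_{1:1} = E(X)
    total = 0.0
    for n in range(1, terms + 1):
        harmonic += 1.0 / (n + 1)  # H_{n+1}, mean of the maximum of n+1 draws
        total += (1.0 / n - 1.0 / (n + 1)) * harmonic
    return total - 1.0             # subtract E(X) = 1

for m in (10, 100, 10_000):
    print(m, cre_exponential_partial_sum(m))   # increases towards 1
```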
Example 2

Consider the standard uniform distribution with pdf $f(x) = 1$, $0 < x < 1$. Then, it is known that $E(X) = \frac{1}{2}$ and $E(X_{n:n}) = \frac{n}{n+1}$. So, from (2), we readily find
$$\mathcal{E}(X) = \sum_{n=1}^{+\infty} \frac{1}{n(n+1)}\,\frac{n+1}{n+2} - \frac{1}{2} = \frac{1}{2}\sum_{n=1}^{+\infty}\left(\frac{1}{n} - \frac{1}{n+2}\right) - \frac{1}{2} = \frac{1}{2}\left(1 + \frac{1}{2}\right) - \frac{1}{2} = \frac{1}{4}.$$
The CE of X can be rewritten in terms of the mean of the minimum order statistic; integrating (1) by parts,
$$\begin{aligned}
\mathcal{CE}(X) &= -xF(x)\log F(x)\Big|_0^{+\infty} + \int_0^{+\infty} x\log F(x) f(x)\,\mathrm{d}x + \int_0^{+\infty} x f(x)\,\mathrm{d}x \\
&= \int_0^{+\infty} x \log[1 - (1-F(x))] f(x)\,\mathrm{d}x + E(X) \\
&= -\int_0^{+\infty} x \left[\sum_{n=1}^{+\infty} \frac{(1-F(x))^n}{n}\right] f(x)\,\mathrm{d}x + E(X) \\
&= -\sum_{n=1}^{+\infty} \frac{1}{n(n+1)}\,\mu_{1:n+1} + E(X), \qquad (4)
\end{aligned}$$
where $\mu_{1:n+1}$ is the mean of the smallest order statistic in a sample of size n+1 from F, provided that $\lim_{x\to+\infty} -xF(x)\log F(x) = 0$. We note that (4) can be rewritten as
$$\mathcal{CE}(X) = -\sum_{n=1}^{+\infty} \left(\frac{1}{n} - \frac{1}{n+1}\right) \mu_{1:n+1} + E(X). \qquad (5)$$
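Relation (4) also lends itself to a simulation check. The sketch below is ours, with an Exp(1) parent and the helper names `ce_from_minima` and `ce_by_integration` chosen purely for illustration: it estimates $\mu_{1:n+1}$ by Monte Carlo, truncates the series, and compares the result with a direct numerical evaluation of the integral in (1).

```python
import numpy as np

rng = np.random.default_rng(1)

def ce_from_minima(sampler, n_terms=100, reps=20_000):
    """Truncated version of (4): E(X) - sum_n mu_{1:n+1} / (n(n+1))."""
    ce = float(np.mean(sampler(reps)))                 # estimate of E(X)
    for n in range(1, n_terms + 1):
        minima = sampler((reps, n + 1)).min(axis=1)    # draws of X_{1:n+1}
        ce -= float(minima.mean()) / (n * (n + 1))
    return ce

def ce_by_integration(cdf, upper, grid=200_000):
    """Direct numerical evaluation of (1) by the trapezoidal rule."""
    x = np.linspace(1e-9, upper, grid)
    integrand = cdf(x) * np.log(cdf(x))
    return -float(np.sum((integrand[1:] + integrand[:-1]) / 2 * np.diff(x)))

sampler = lambda size: rng.exponential(1.0, size)
print(ce_from_minima(sampler))                             # Monte Carlo series
print(ce_by_integration(lambda x: 1 - np.exp(-x), 40.0))   # ~ pi^2/6 - 1 = 0.6449
```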
Example 3

For the standard exponential distribution, it is known that $\mu_{1:n} = \frac{1}{n}$, and so, from (5), we readily have
$$\mathcal{CE}(X) = -\sum_{n=1}^{+\infty}\left(\frac{1}{n} - \frac{1}{n+1}\right)\frac{1}{n+1} + 1 = -\sum_{n=1}^{+\infty}\frac{1}{n(n+1)} + \sum_{n=1}^{+\infty}\frac{1}{(n+1)^2} + 1 = \frac{\pi^2}{6} - 1,$$
by the use of Euler's identity.
Example 4

For the standard uniform distribution, using the fact that $\mu_{1:n} = \frac{1}{n+1}$, we obtain from (4) that
$$\mathcal{CE}(X) = -\sum_{n=1}^{+\infty}\frac{1}{n(n+1)(n+2)} + \frac{1}{2} = \frac{1}{2} - \frac{1}{4} = \frac{1}{4},$$
since, by the partial fraction decomposition $\frac{1}{n(n+1)(n+2)} = \frac{1}{2}\left(\frac{1}{n} - \frac{2}{n+1} + \frac{1}{n+2}\right)$, the series telescopes to $\frac{1}{4}$.
Remark 1

If the random variable X has finite mean μ and is symmetrically distributed about μ, then we know that $\mu_{n:n} - \mu = \mu - \mu_{1:n}$, and so the symmetry property $\mathcal{E}(X) = \mathcal{CE}(X)$ readily follows from (2) and (4).

The WCRE of X can be expressed as
$$\begin{aligned}
\mathcal{E}^w(X) &= -\frac{x^2}{2}(1-F(x))\log(1-F(x))\Big|_0^{+\infty} - \int_0^{+\infty} \frac{x^2}{2}\log(1-F(x)) f(x)\,\mathrm{d}x - \int_0^{+\infty} \frac{x^2}{2} f(x)\,\mathrm{d}x \\
&= \frac{1}{2}\int_0^{+\infty} x^2\,[-\log(1-F(x))] f(x)\,\mathrm{d}x - \frac{1}{2}E(X^2) \\
&= \frac{1}{2}\int_0^{+\infty} x^2 \left[\sum_{n=1}^{+\infty}\frac{F(x)^n}{n}\right] f(x)\,\mathrm{d}x - \frac{1}{2}E(X^2) \\
&= \frac{1}{2}\sum_{n=1}^{+\infty}\frac{1}{n(n+1)}\,\mu^{(2)}_{n+1:n+1} - \frac{1}{2}E(X^2), \qquad (6)
\end{aligned}$$
where $\mu^{(2)}_{n+1:n+1}$ is the second moment of the largest order statistic in a sample of size n+1, provided that $\lim_{x\to+\infty} -x^2(1-F(x))\log(1-F(x)) = 0$.
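A minimal numerical check of (6) in the uniform case (our sketch; Example 5 below derives the same value 5/36 in closed form), using $\mu^{(2)}_{n+1:n+1} = (n+1)/(n+3)$ and $E(X^2) = 1/3$:

```python
# Partial sums of (6) for the standard uniform distribution.

def wcre_uniform_partial_sum(terms: int) -> float:
    total = 0.0
    for n in range(1, terms + 1):
        mu2_max = (n + 1) / (n + 3)      # second moment of the maximum of n+1 draws
        total += mu2_max / (n * (n + 1))
    return 0.5 * total - 0.5 / 3.0       # subtract E(X^2)/2 = 1/6

print(wcre_uniform_partial_sum(10_000), 5 / 36)   # both ~ 0.138889
```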
Example 5

For the standard uniform distribution, using the fact that $\mu^{(2)}_{n+1:n+1} = \frac{n+1}{n+3}$ and $E(X^2) = \frac{1}{3}$, we obtain from (6) that
$$\mathcal{E}^w(X) = \frac{1}{2}\sum_{n=1}^{+\infty}\frac{1}{n(n+1)}\,\frac{n+1}{n+3} - \frac{1}{6} = \frac{1}{6}\sum_{n=1}^{+\infty}\left(\frac{1}{n} - \frac{1}{n+3}\right) - \frac{1}{6} = \frac{1}{6}\left(1 + \frac{1}{2} + \frac{1}{3}\right) - \frac{1}{6} = \frac{5}{36}.$$
Moreover, we can derive the Weighted Cumulative Entropy (WCE) of X in terms of the second moment of the minimum order statistic in the following way:
$$\begin{aligned}
\mathcal{CE}^w(X) &= -\frac{x^2}{2}F(x)\log F(x)\Big|_0^{+\infty} + \int_0^{+\infty}\frac{x^2}{2}\log F(x) f(x)\,\mathrm{d}x + \int_0^{+\infty}\frac{x^2}{2} f(x)\,\mathrm{d}x \\
&= \frac{1}{2}\int_0^{+\infty} x^2 \log[1-(1-F(x))] f(x)\,\mathrm{d}x + \frac{1}{2}E(X^2) \\
&= -\frac{1}{2}\int_0^{+\infty} x^2 \left[\sum_{n=1}^{+\infty}\frac{(1-F(x))^n}{n}\right] f(x)\,\mathrm{d}x + \frac{1}{2}E(X^2) \\
&= -\frac{1}{2}\sum_{n=1}^{+\infty}\frac{1}{n(n+1)}\,\mu^{(2)}_{1:n+1} + \frac{1}{2}E(X^2), \qquad (7)
\end{aligned}$$
where $\mu^{(2)}_{1:n+1}$ is the second moment of the smallest order statistic in a sample of size n+1, provided that $\lim_{x\to+\infty} -x^2 F(x)\log F(x) = 0$.

3 Bounds

Let us consider a sample with parent distribution X such that $E(X) = 0$ and $E(X^2) = 1$. Hartley and David (1954) and Gumbel (1954) have shown that
$$\mu_{n:n} \le \frac{n-1}{\sqrt{2n-1}}.$$
We relate $\mu_{n:n}$ to the mean of the largest order statistic from the standardized distribution. In fact, by normalizing the random variable X with mean μ and variance $\sigma^2$, we get $Z = \frac{X-\mu}{\sigma}$. Hence, the cdf $F_Z$ is given in terms of the cdf $F_X$ by
$$F_Z(x) = F_X(\sigma x + \mu).$$
Then, the cdf and pdf of the largest order statistic in a sample of size n are
$$F_{Z_{n:n}}(x) = F_X^n(\sigma x + \mu), \qquad f_{Z_{n:n}}(x) = n F_X^{n-1}(\sigma x + \mu)\, f_X(\sigma x + \mu)\,\sigma.$$
The mean of $X_{n:n}$ is given by
$$\mu_{n:n} = E(X_{n:n}) = n\int_0^{+\infty} x\, F_X^{n-1}(x) f_X(x)\,\mathrm{d}x.$$
The mean of the largest order statistic from Z is given by
$$\begin{aligned}
E(Z_{n:n}) &= n\sigma\int_{-\mu/\sigma}^{+\infty} x\, F_X^{n-1}(\sigma x + \mu) f_X(\sigma x + \mu)\,\mathrm{d}x \\
&= n\int_0^{+\infty} \frac{x-\mu}{\sigma}\, F_X^{n-1}(x) f_X(x)\,\mathrm{d}x \\
&= \frac{n}{\sigma}\int_0^{+\infty} x\, F_X^{n-1}(x) f_X(x)\,\mathrm{d}x - \frac{n\mu}{\sigma}\int_0^{+\infty} F_X^{n-1}(x) f_X(x)\,\mathrm{d}x \\
&= \frac{\mu_{n:n} - \mu}{\sigma}.
\end{aligned}$$
Using the Hartley-David-Gumbel bound for a non-negative parent distribution with mean μ and variance $\sigma^2$, we get
$$\mu_{n:n} = \sigma E(Z_{n:n}) + \mu \le \sigma\,\frac{n-1}{\sqrt{2n-1}} + \mu. \qquad (8)$$
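The bound (8) can be illustrated by simulation; in this sketch of ours, the two parent distributions, the sample size n = 10 and the replication count are arbitrary choices.

```python
# Hartley-David-Gumbel bound (8): E(X_{n:n}) <= sigma*(n-1)/sqrt(2n-1) + mu.
import math
import numpy as np

rng = np.random.default_rng(0)
n, reps = 10, 200_000

for name, sampler, mu, sigma in [
    ("Exp(1)", lambda s: rng.exponential(1.0, s), 1.0, 1.0),
    ("U(0,1)", lambda s: rng.uniform(0.0, 1.0, s), 0.5, 1 / math.sqrt(12)),
]:
    max_mean = sampler((reps, n)).max(axis=1).mean()   # Monte Carlo E(X_{n:n})
    bound = sigma * (n - 1) / math.sqrt(2 * n - 1) + mu
    print(f"{name}: {max_mean:.4f} <= {bound:.4f}")
```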
Theorem 1

Let X be a non-negative random variable with mean μ and variance $\sigma^2$. Then, we obtain an upper bound for the CRE of X:
$$\mathcal{E}(X) \le \sum_{n=1}^{+\infty}\frac{\sigma}{(n+1)\sqrt{2n+1}} \simeq 1.21\,\sigma. \qquad (9)$$

Proof
From (2) and (8) we get
$$\begin{aligned}
\mathcal{E}(X) &= \sum_{n=1}^{+\infty}\frac{1}{n(n+1)}\,\mu_{n+1:n+1} - E(X) \\
&\le \sum_{n=1}^{+\infty}\frac{1}{n(n+1)}\left(\sigma\,\frac{n}{\sqrt{2n+1}} + E(X)\right) - E(X) \\
&= \sum_{n=1}^{+\infty}\frac{\sigma}{(n+1)\sqrt{2n+1}} \simeq 1.21\,\sigma,
\end{aligned}$$
i.e., the upper bound given in (9).
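The numerical constant in (9) can be reproduced directly. The summands decay like $n^{-3/2}$, so a plain partial sum converges slowly; the sketch below (truncation point arbitrary) adds the integral tail estimate $\int_a^{+\infty} \frac{\mathrm{d}x}{(x+1)\sqrt{2x+1}} = \pi - 2\arctan\sqrt{2a+1}$.

```python
import math

N = 1_000_000
partial = sum(1.0 / ((n + 1) * math.sqrt(2 * n + 1)) for n in range(1, N + 1))
tail = math.pi - 2 * math.atan(math.sqrt(2 * (N + 0.5) + 1))   # integral tail
print(partial + tail)   # ~ 1.21
```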
Remark 2

Since X is non-negative, we have that $\mu_{n+1:n+1} \ge 0$ for all $n \in \mathbb{N}$. For this reason, using finite series approximations, we get lower bounds for $\mathcal{E}(X)$:
$$\mathcal{E}(X) \ge \sum_{n=1}^{m}\frac{1}{n(n+1)}\,\mu_{n+1:n+1} - E(X), \quad \text{for all } m \in \mathbb{N}.$$
Remark 3

Since X is non-negative, we have that $\mu_{1:n+1} \ge 0$ for all $n \in \mathbb{N}$. For this reason, using finite series approximations, we get upper bounds for $\mathcal{CE}(X)$:
$$\mathcal{CE}(X) \le -\sum_{n=1}^{m}\frac{1}{n(n+1)}\,\mu_{1:n+1} + E(X), \quad \text{for all } m \in \mathbb{N}.$$
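A small sketch of both remarks for an Exp(1) parent (an arbitrary choice), for which $\mathcal{E}(X) = 1$ and $\mathcal{CE}(X) = \pi^2/6 - 1$ are known from Examples 1 and 3: truncated series built from simulated order-statistic means give a lower bound for the CRE and an upper bound for the CE.

```python
import numpy as np

rng = np.random.default_rng(7)
reps, m = 100_000, 50
cre_lower, ce_upper = -1.0, 1.0      # start from -E(X) and +E(X), with E(X) = 1

for n in range(1, m + 1):
    samples = rng.exponential(1.0, (reps, n + 1))
    cre_lower += samples.max(axis=1).mean() / (n * (n + 1))   # Remark 2
    ce_upper -= samples.min(axis=1).mean() / (n * (n + 1))    # Remark 3

print(cre_lower, "<= CRE = 1")
print(ce_upper, ">= CE = pi^2/6 - 1 ~ 0.6449")
```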
Theorem 2

Let X be DFR (decreasing failure rate). Then, we have the following lower bound for $\mathcal{CE}(X)$:
$$\mathcal{CE}(X) \ge E(X) - \frac{\sqrt{E(X^2)}}{\sqrt{2}}\left(2 - \frac{\pi^2}{6}\right). \qquad (10)$$

Proof
Let X be DFR. From Theorem 12 of Rychlik (2001), we know that, for a sample of size n, if
$$\delta_j = \sum_{k=1}^{j}\frac{1}{(n+1-k)^2} \le 1, \qquad j \in \{1, \ldots, n\},$$
then
$$E(X_{j:n}) \le \sqrt{\frac{\delta_j}{2}}\,\sqrt{E(X^2)}.$$
For j = 1, we have $\delta_1 = \frac{1}{n^2} \le 1$ for all $n \in \mathbb{N}$, and we get
$$E(X_{1:n}) \le \frac{\sqrt{E(X^2)}}{\sqrt{2}\,n}.$$
Then, from (4), we get the following lower bound for $\mathcal{CE}(X)$:
$$\mathcal{CE}(X) \ge -\sum_{n=1}^{+\infty}\frac{1}{n(n+1)}\,\frac{\sqrt{E(X^2)}}{\sqrt{2}\,(n+1)} + E(X) = E(X) - \frac{\sqrt{E(X^2)}}{\sqrt{2}}\left(2 - \frac{\pi^2}{6}\right),$$
since $\sum_{n=1}^{+\infty}\frac{1}{n(n+1)^2} = 2 - \frac{\pi^2}{6}$ by Euler's identity.
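For the standard exponential distribution, which has constant failure rate and is thus a boundary case of the DFR family, $E(X) = 1$ and $E(X^2) = 2$, so the right-hand side of (10) equals $\pi^2/6 - 1$; by Example 3, the bound is then attained with equality. A two-line check:

```python
import math

ce_exact = math.pi ** 2 / 6 - 1                                   # Example 3
bound = 1 - math.sqrt(2) / math.sqrt(2) * (2 - math.pi ** 2 / 6)  # rhs of (10)
print(ce_exact, ">=", bound)                                      # both ~ 0.6449
```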
Remark 4

We note that we cannot provide an analogous bound for $\mathcal{E}(X)$, because the condition $\delta_n \le 1$ fails for $n \ge 2$; indeed, for the largest order statistic, $\delta_n = \sum_{k=1}^{n}\frac{1}{k^2} > 1$.

For a symmetric distribution, Arnold and Balakrishnan (1989) showed that, if we have a sample $X_1, \ldots, X_n$ with parent distribution X symmetric about 0 with variance 1 and bounded support, then
$$E(X_{n:n}) \le n\,c(n), \qquad (11)$$
where
$$c(n) = \left[\frac{1 - \binom{2n-2}{n-1}2^{-(2n-2)}}{2(2n-1)}\right]^{1/2}.$$
Using the bound (11) for a non-negative parent distribution symmetric about the mean μ, with bounded support and variance $\sigma^2$, we get
$$\mu_{n:n} = \sigma E(Z_{n:n}) + \mu \le \sigma n\,c(n) + \mu. \qquad (12)$$
Theorem 3

Let X be a symmetric non-negative random variable with bounded support, mean μ and variance $\sigma^2$. Then, we obtain an upper bound for the CRE of X:
$$\mathcal{E}(X) \le \sigma\sum_{n=1}^{+\infty}\frac{c(n+1)}{n}. \qquad (13)$$

Proof
From (2) and (12) we get
$$\begin{aligned}
\mathcal{E}(X) &= \sum_{n=1}^{+\infty}\frac{1}{n(n+1)}\,\mu_{n+1:n+1} - E(X) \\
&\le \sum_{n=1}^{+\infty}\frac{1}{n(n+1)}\Big(\sigma(n+1)\,c(n+1) + E(X)\Big) - E(X) = \sigma\sum_{n=1}^{+\infty}\frac{c(n+1)}{n},
\end{aligned}$$
i.e., the upper bound given in (13).

For a symmetric distribution, Arnold and Balakrishnan (1989) also showed that, if we have a sample $X_1, \ldots, X_n$ with parent distribution X symmetric about 0 with variance 1, then
$$E(X_{n:n}) \le \frac{n}{\sqrt{2}}\sqrt{\frac{1}{2n-1} - B(n,n)}, \qquad (14)$$
where B(n, n) is the complete beta function. Using the bound (14) for a non-negative parent distribution symmetric about the mean μ and with variance $\sigma^2$, we get
$$\mu_{n:n} = \sigma E(Z_{n:n}) + \mu \le \sigma\,\frac{n}{\sqrt{2}}\sqrt{\frac{1}{2n-1} - B(n,n)} + \mu. \qquad (15)$$
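A simulation sketch of (14) with parent $U(-\sqrt{3}, \sqrt{3})$, an arbitrary choice that is symmetric about 0 with variance 1; for n = 2 this uniform parent attains the bound.

```python
import math
import numpy as np

rng = np.random.default_rng(3)
a = math.sqrt(3.0)                                   # U(-a, a) has variance 1

for n in (2, 5, 10):
    maxima = rng.uniform(-a, a, (200_000, n)).max(axis=1)
    beta = math.gamma(n) ** 2 / math.gamma(2 * n)    # complete beta B(n, n)
    bound = (n / math.sqrt(2)) * math.sqrt(1 / (2 * n - 1) - beta)
    print(n, round(maxima.mean(), 4), "<=", round(bound, 4))
```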
Theorem 4

Let X be a symmetric non-negative random variable with mean μ and variance $\sigma^2$. Then, we obtain an upper bound for the CRE of X:
$$\mathcal{E}(X) \le \frac{\sigma}{\sqrt{2}}\sum_{n=1}^{+\infty}\frac{1}{n}\sqrt{\frac{1}{2n+1} - B(n+1, n+1)}. \qquad (16)$$

Proof
From (2) and (15) we get
$$\begin{aligned}
\mathcal{E}(X) &= \sum_{n=1}^{+\infty}\frac{1}{n(n+1)}\,\mu_{n+1:n+1} - E(X) \\
&\le \sum_{n=1}^{+\infty}\frac{1}{n(n+1)}\left(\sigma\,\frac{n+1}{\sqrt{2}}\sqrt{\frac{1}{2n+1} - B(n+1,n+1)} + E(X)\right) - E(X) \\
&= \frac{\sigma}{\sqrt{2}}\sum_{n=1}^{+\infty}\frac{1}{n}\sqrt{\frac{1}{2n+1} - B(n+1,n+1)},
\end{aligned}$$
i.e., the upper bound given in (16).
Example 6

Let us consider a sample with parent distribution $X \sim N(0, 1)$. Evaluating the series in (2) numerically, with the means of normal order statistics tabulated in Harter (1961), we obtain
$$\mathcal{E}(X) = \sum_{n=1}^{+\infty}\frac{1}{n(n+1)}\,\mu_{n+1:n+1} \simeq 0.90 < \frac{1}{\sqrt{2}}\sum_{n=1}^{+\infty}\frac{1}{n}\sqrt{\frac{1}{2n+1} - B(n+1,n+1)} \simeq 1.04.$$

From (2) and (4), we get the following expression for the sum of the cumulative residual entropy and the cumulative entropy:
$$\mathcal{E}(X) + \mathcal{CE}(X) = \sum_{n=1}^{+\infty}\frac{1}{n(n+1)}\left(\mu_{n+1:n+1} - \mu_{1:n+1}\right). \qquad (17)$$
Calì et al. (2017) showed a connection between (17) and the partition entropy studied by Bowden (2007).
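A numerical companion to Example 6 (our illustration; the grid and truncation points are arbitrary): the CRE of N(0, 1) is evaluated from its integral definition, and the constant of Theorem 4 is accumulated termwise.

```python
import math

def Phi(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# CRE of N(0,1): -int Fbar log Fbar dx over the real line (trapezoidal rule).
xs = [-10 + 20 * i / 400_000 for i in range(400_001)]
vals = []
for x in xs:
    fb = 1.0 - Phi(x)
    vals.append(-fb * math.log(fb) if 0.0 < fb < 1.0 else 0.0)
h = xs[1] - xs[0]
print("CRE ~", h * (sum(vals) - 0.5 * (vals[0] + vals[-1])))   # ~ 0.90

# Theorem 4 constant: (1/sqrt(2)) sum_n (1/n) sqrt(1/(2n+1) - B(n+1, n+1)).
total = 0.0
for n in range(1, 2_000_000):
    beta = math.exp(2 * math.lgamma(n + 1) - math.lgamma(2 * n + 2))
    total += math.sqrt(1 / (2 * n + 1) - beta) / n
print("bound ~", total / math.sqrt(2))                         # ~ 1.04
```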
Theorem 5

We have the following bound for the sum of the CRE and the CE:
$$\mathcal{E}(X) + \mathcal{CE}(X) \le \sum_{n=1}^{+\infty}\frac{\sqrt{2}\,\sigma}{n\sqrt{n+1}} \simeq 3.09\,\sigma. \qquad (18)$$

Proof
From Theorem 3.24 of Arnold and Balakrishnan (1989), we know the following bound for the difference between the expectations of the largest and the smallest order statistics from a sample of size n+1:
$$\mu_{n+1:n+1} - \mu_{1:n+1} \le \sigma\sqrt{2(n+1)}, \qquad (19)$$
and so, using (19) in (17), we get the following bound for the sum of the CRE and the CE:
$$\mathcal{E}(X) + \mathcal{CE}(X) \le \sum_{n=1}^{+\infty}\frac{\sigma\sqrt{2(n+1)}}{n(n+1)} = \sum_{n=1}^{+\infty}\frac{\sqrt{2}\,\sigma}{n\sqrt{n+1}} \simeq 3.09\,\sigma.$$

For a symmetric distribution, Arnold and Balakrishnan (1989) showed that, if we have a sample $X_1, \ldots, X_n$ with parent distribution X symmetric about the mean with variance 1, then
$$E(X_{n:n}) - E(X_{1:n}) \le n\sqrt{2}\sqrt{\frac{1}{2n-1} - B(n,n)}, \qquad (20)$$
where B(n, n) is the complete beta function. Using the bound (20) for a non-negative parent distribution symmetric about the mean μ and with variance $\sigma^2$, we get
$$\mu_{n:n} - \mu_{1:n} = \sigma\left(E(Z_{n:n}) - E(Z_{1:n})\right) \le \sigma n\sqrt{2}\sqrt{\frac{1}{2n-1} - B(n,n)}. \qquad (21)$$
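The constant in (18) can be reproduced like the one in (9); here the tail of the series is estimated through $\int_a^{+\infty} \frac{\mathrm{d}x}{x\sqrt{x+1}} = -\log\frac{\sqrt{a+1}-1}{\sqrt{a+1}+1}$ (truncation point arbitrary).

```python
import math

N = 1_000_000
partial = sum(1.0 / (n * math.sqrt(n + 1)) for n in range(1, N + 1))
u = math.sqrt(N + 0.5 + 1)
tail = -math.log((u - 1) / (u + 1))            # integral estimate of the tail
print(math.sqrt(2) * (partial + tail))         # ~ 3.09
```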
Table 1  Some bounds for known distributions.

Support        CDF                        $\mathcal{E}(X)$   Bound Thm. 1    $\mathcal{E}(X)+\mathcal{CE}(X)$   Bound Thm. 5
x > 0          1 − exp(−λx)               1/λ                1.21/λ          π²/(6λ)                            3.09/λ
0 < x < a      x/a                        a/4                1.21a/(2√3)     a/2                                3.09a/(2√3)
0 < x < 1      x exp(1 − x)               0.21               0.26            0.38                               0.67
x > 0          1 − (x+1)⁻³                0.75               1.05            1.11                               2.68
0 < x < 1      x²                         0.19               0.29            0.41                               0.73
x ∈ (0, +∞)    exp(−(eˣ − 1)⁻¹)           0.93               1.12            1.52                               2.86
Theorem 6

Let X be a symmetric non-negative random variable with mean μ and variance $\sigma^2$. Then, we obtain an upper bound for the sum of the CRE and the CE of X:
$$\mathcal{E}(X) + \mathcal{CE}(X) \le \sqrt{2}\,\sigma\sum_{n=1}^{+\infty}\frac{1}{n}\sqrt{\frac{1}{2n+1} - B(n+1, n+1)}. \qquad (22)$$

Proof
From (17) and (21) we get
$$\begin{aligned}
\mathcal{E}(X) + \mathcal{CE}(X) &= \sum_{n=1}^{+\infty}\frac{1}{n(n+1)}\left(\mu_{n+1:n+1} - \mu_{1:n+1}\right) \\
&\le \sum_{n=1}^{+\infty}\frac{1}{n(n+1)}\,\sigma(n+1)\sqrt{2}\sqrt{\frac{1}{2n+1} - B(n+1,n+1)} \\
&= \sqrt{2}\,\sigma\sum_{n=1}^{+\infty}\frac{1}{n}\sqrt{\frac{1}{2n+1} - B(n+1,n+1)},
\end{aligned}$$
i.e., the upper bound given in (22).

In Table 1, we present some applications of the bounds obtained in this section to important distributions in reliability theory.
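The first two rows of Table 1 can be checked in a few lines (a sketch of ours with λ = 1 and a = 1; the exact values $\mathcal{E} = 1$, $\mathcal{E} + \mathcal{CE} = \pi^2/6$ for the exponential and $1/4$, $1/2$ for the uniform follow from Examples 1-4).

```python
import math

rows = [
    # name, CRE, CRE + CE, sigma
    ("Exp(1)", 1.0, math.pi ** 2 / 6, 1.0),
    ("U(0,1)", 0.25, 0.5, 1 / math.sqrt(12)),
]
for name, cre, cre_plus_ce, sigma in rows:
    print(f"{name}: CRE = {cre:.3f} <= {1.21 * sigma:.3f}; "
          f"CRE+CE = {cre_plus_ce:.3f} <= {3.09 * sigma:.3f}")
```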
Acknowledgements

Francesco Buono and Maria Longobardi are partially supported by the GNAMPA research group of INdAM (Istituto Nazionale di Alta Matematica) and by MIUR-PRIN 2017, Project "Stochastic Models for Complex Systems" (No. 2017JFFHSH).
Conflict of interest
The authors declare that they have no conflict of interest.
References
1. Arnold, B. C., Balakrishnan, N. (1989). Relations, Bounds and Approximations for Order Statistics. Springer, New York, NY.
2. Bowden, R. (2007). Information, measure shifts and distribution diagnostics. Statistics, 41(2), 249–262.
3. Calì, C., Longobardi, M., Ahmadi, J. (2017). Some properties of cumulative Tsallis entropy. Physica A, 486, 1012–1021.
4. Calì, C., Longobardi, M., Navarro, J. (2020). Properties for generalized cumulative past measures of information. Probability in the Engineering and Informational Sciences, 34(1), 92–111.
5. Calì, C., Longobardi, M., Psarrakos, G. (2019). A family of weighted distributions based on the mean inactivity time and cumulative past entropies. Ricerche di Matematica, in press, doi:10.1007/s11587-019-00475-7.
6. David, H. A., Nagaraja, H. N. (2003). Order Statistics. John Wiley & Sons, Hoboken, NJ.
7. Di Crescenzo, A., Longobardi, M. (2009). On cumulative entropies. Journal of Statistical Planning and Inference, 139, 4072–4087.
8. Gumbel, E. J. (1954). The maxima of the mean largest value and of the range. The Annals of Mathematical Statistics, 25, 76–84.
9. Harter, H. L. (1961). Expected values of normal order statistics. Biometrika, 48(1/2), 151–165.
10. Hartley, H. O., David, H. A. (1954). Universal bounds for mean range and extreme observation. The Annals of Mathematical Statistics, 25, 85–99.
11. Longobardi, M. (2014). Cumulative measures of information and stochastic orders. Ricerche di Matematica, 63, 209–223.
12. Mirali, M., Baratpour, S., Fakoor, V. (2016). On weighted cumulative residual entropy. Communications in Statistics - Theory and Methods, 46(6), 2857–2869.
13. Mirali, M., Baratpour, S. (2017). Some results on weighted cumulative entropy. Journal of the Iranian Statistical Society, 16(2), 21–32.
14. Rao, M., Chen, Y., Vemuri, B., Wang, F. (2004). Cumulative residual entropy: a new measure of information. IEEE Transactions on Information Theory, 50(6), 1220–1228.
15. Rychlik, T. (2001). Projecting Statistical Functionals. Springer, New York, NY.
16. Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379–423, 623–656.