On estimation of the PMF and the CDF of a natural discrete one parameter polynomial exponential distribution
Indrani Mukherjee, Sudhansu S. Maiti∗ and Rama Shanker
Department of Statistics, Visva-Bharati University, Santiniketan-731 235, India
Department of Statistics, Assam University, Silchar-7880111, India
Abstract

In this article, a new natural discrete analog of the one parameter polynomial exponential (OPPE) distribution, constructed as a mixture of negative binomial distributions, is proposed and called the natural discrete one parameter polynomial exponential (NDOPPE) distribution. This distribution generalizes the natural discrete Lindley (NDL) distribution proposed and studied by Ahmed and Afify (2019). Two estimators, viz., the MLE and the UMVUE, of the PMF and the CDF of a NDOPPE distribution are derived. The estimators are compared with respect to their MSEs. A simulation study is conducted to verify the consistency of the estimators. A real data illustration is reported.
Keywords:
Maximum likelihood estimator; uniformly minimum variance unbiased estimator.
∗ Corresponding author e-mail: dssm1@rediffmail.com

Some standard discrete distributions and the estimators of their probability mass functions (PMF) and cumulative distribution functions (CDF) are studied in Maiti and Mukherjee (2017). Sometimes these models are too restrictive. For example, the Poisson model is often not appropriate because it imposes the restriction of equidispersion on the modeled data. Similarly, the binomial model imposes the restriction of under-dispersion. As a result, various less restrictive models have been prescribed using discrete concentration and discrete analog approaches (see Nakagawa and Osaki (1975), Famoye (1993), among others). The most recent discrete distributions are due to Stein and Dattero (1984), Roy (2002), Roy (2003), Roy (2004), Krishna and Pundir (2009), Jazi et al. (2010), Gomez-Deniz (2010), Gomez-Deniz and Calderin-Ojeda (2011), Maiti et al. (2018), among others.

To generate a natural discrete analog of the one parameter polynomial exponential (OPPE) distribution, we use the well-known fact that the geometric and the negative binomial distributions are the natural discrete analogs of the exponential and the gamma distributions, respectively. This is the motivation behind proposing a new natural discrete analog of the OPPE distribution by mixing discrete counterparts of the exponential and gamma distributions.

The PMF of a random variable X following a natural discrete one parameter polynomial exponential (NDOPPE) distribution can be written as

$$ f(x) = h(\theta)\, p(x)\, (1-\theta)^{x}, \qquad x = 0, 1, \ldots, \quad 0 < \theta < 1, \tag{1.1} $$

where $h(\theta) = \left[ \sum_{k=1}^{r} \dfrac{a_{k-1}\Gamma(k)}{\theta^{k}} \right]^{-1}$ and $p(x) = \sum_{k=1}^{r} a_{k-1}\Gamma(k) \binom{x+k-1}{x}$.
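As a quick numerical check of (1.1), the PMF can be evaluated directly from $h(\theta)$ and $p(x)$. The following is a minimal Python sketch, not the authors' code: the function names `h`, `p` and `ndoppe_pmf` and the convention of passing the constants as a list `a = [a_0, ..., a_{r-1}]` are assumptions made here for illustration.

```python
from math import comb, gamma

def h(theta, a):
    """Normalising constant: 1 / sum_k a_{k-1} Gamma(k) / theta^k, k = 1..r."""
    r = len(a)
    return 1.0 / sum(a[k - 1] * gamma(k) / theta**k for k in range(1, r + 1))

def p(x, a):
    """Polynomial part: sum_k a_{k-1} Gamma(k) C(x + k - 1, x), k = 1..r."""
    r = len(a)
    return sum(a[k - 1] * gamma(k) * comb(x + k - 1, x) for k in range(1, r + 1))

def ndoppe_pmf(x, theta, a):
    """PMF (1.1): h(theta) p(x) (1 - theta)^x for x = 0, 1, ..."""
    return h(theta, a) * p(x, a) * (1.0 - theta) ** x

# r = 1, a_0 = 1 should reproduce the geometric PMF theta (1 - theta)^x.
print(ndoppe_pmf(3, 0.4, [1.0]), 0.4 * 0.6**3)

# For an arbitrary (hypothetical) choice of constants the PMF should sum to 1.
print(sum(ndoppe_pmf(x, 0.3, [1.0, 1.0, 2.0]) for x in range(500)))
```

The truncation at 500 terms is only a convenience; the tail is negligible for the value of $\theta$ used here.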
The distribution can also be written as

$$ f(x) = h(\theta) \sum_{k=1}^{r} \frac{a_{k-1}\Gamma(k)}{\theta^{k}}\, f_{NB}(x; \theta, k), \tag{1.2} $$

where $f_{NB}(x; \theta, k) = \binom{x+k-1}{x}\theta^{k}(1-\theta)^{x}$ is the PMF of a negative binomial distribution with parameters $\theta$ and $k$, and the $a_{k-1}$'s are known non-negative constants.

The CDF is given by

$$ F(x) = h(\theta) \sum_{k=1}^{r} \frac{a_{k-1}\Gamma(k)}{\theta^{k}}\, I_{\theta}(k, x+1), \qquad x = 0, 1, \ldots, \quad 0 < \theta < 1, \tag{1.3} $$

where $I_{x}(a, b) = \dfrac{1}{B(a,b)} \int_{0}^{x} t^{a-1}(1-t)^{b-1}\, dt$.

Two special cases are given below:

1. $r = 1$, $a_0 = 1$ gives the geometric distribution,
2. $r = 2$, $a_0 = 1$, $a_1 = 1$ gives the natural discrete Lindley (NDL) distribution [cf. Ahmed and Afify (2019)]. The PMF and the CDF are given by

$$ f(x) = \frac{\theta^{2}}{1+\theta}\,(2+x)(1-\theta)^{x}, \qquad x = 0, 1, \ldots; \quad \theta \in (0, 1) $$

and

$$ F(x) = 1 - \frac{1 + 2\theta + \theta x}{1+\theta}\,(1-\theta)^{x+1}, \qquad x = 0, 1, \ldots; \quad \theta \in (0, 1), $$

respectively.

The problem of estimation of the PMF and the CDF is interesting for many reasons. For example, the PMF and the CDF can be used for estimation of differential entropy, Rényi entropy, Kullback-Leibler divergence, Fisher's information, cumulative residual entropy, the quantile function, the Lorenz curve, the hazard rate function, the mean remaining life function, etc.

Most of the time, emphasis is given to inference on the parameters of the distribution, and the study is concentrated on measuring the efficiency of these estimators. No such effort has been made to estimate the PMF and the CDF of these discrete random variables. Plugging in the MLE of the parameter gives, by the invariance property, the MLE of the PMF and the CDF. It is to be noted that all these estimators are biased. Mere substitution of the UMVUE of the parameter(s) will not provide the UMVUE of the PMF and that of the CDF.
Therefore, comparing the UMVUE of the PMF and that of the CDF (the only unbiased estimators in our study) with other estimators seems to be an interesting study.

Similar studies have appeared in recent literature for some continuous distributions. See Asrabadi (1990), Bagheri et al. (2014), Dixit and Jabbari (2010), Dixit and Jabbari (2011), Jabbari and Jabbari (2010), Maiti and Mukherjee (2017), Mukherjee and Maiti (2018), etc.

The article is organized as follows. Section 2 deals with the MLE of the PMF and the CDF of a NDOPPE distribution. Section 3 is devoted to finding the UMVUE of the PMF and the CDF. In Section 4, simulation study results are reported and comparisons are made. A real-life data set is analyzed in Section 5. In Section 6, concluding remarks are made based on the findings of this article.

MLE of the PMF and that of the CDF
Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ drawn from the PMF in (1.2). Here we find the MLE of $\theta$, denoted by $\tilde{\theta}$. The log-likelihood of $\theta$ is given by

$$ l(\theta) = \ln L(\theta \mid X) = n \ln h(\theta) + \sum_{i=1}^{n} \ln p(X_i) + \ln(1-\theta) \sum_{i=1}^{n} X_i. $$

Setting $\dfrac{dl(\theta)}{d\theta} = 0$ gives

$$ n \frac{d}{d\theta}\big(\ln h(\theta)\big) - \frac{1}{1-\theta} \sum_{i=1}^{n} X_i = 0, $$

i.e.

$$ h(\theta) \sum_{k=1}^{r} \frac{a_{k-1}\Gamma(k+1)}{\theta^{k+1}} = \frac{\bar{X}}{1-\theta}. \tag{2.4} $$

Since the above equation does not, in general, have a closed-form solution, we solve (2.4) numerically to obtain the MLE of $\theta$. By the invariance property, substituting $\tilde{\theta}$ for $\theta$ in (1.2) and (1.3) gives the MLEs of the PMF and the CDF. Theoretical expressions for the MSEs of the MLEs are not available; the MSEs are studied through simulation.

In this section, we obtain the UMVUE of the PMF and that of the CDF of a NDOPPE distribution. Also, we obtain the MSEs of these estimators.
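Returning briefly to the likelihood equation (2.4): its numerical solution can be sketched with a plain bisection. The code below is an illustrative sketch under stated assumptions, not the authors' implementation. The function names, the bracket `[1e-9, 1 - 1e-12]`, and the tolerance are choices made here; a moderate order $r$ is assumed so that the powers of $\theta$ stay within floating-point range. The closed form used as a cross-check for the NDL case is derived here by reducing (2.4) with $r = 2$, $a_0 = a_1 = 1$ to the quadratic $(1+\bar{X})\theta^{2} + (1+\bar{X})\theta - 2 = 0$; it is not stated in the text.

```python
from math import gamma, sqrt

def score_gap(theta, xbar, a):
    """Left side minus right side of the likelihood equation (2.4)."""
    r = len(a)
    denom = sum(a[k - 1] * gamma(k) / theta**k for k in range(1, r + 1))
    numer = sum(a[k - 1] * gamma(k + 1) / theta**(k + 1) for k in range(1, r + 1))
    return numer / denom - xbar / (1.0 - theta)

def mle_theta(sample, a, tol=1e-12):
    """Bisection root of (2.4); the gap is positive near 0 and negative near 1."""
    xbar = sum(sample) / len(sample)
    if xbar == 0:
        raise ValueError("all observations are zero: no interior root of (2.4)")
    lo, hi = 1e-9, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score_gap(mid, xbar, a) > 0:
            lo = mid      # root lies to the right of mid
        else:
            hi = mid      # root lies to the left of mid
    return 0.5 * (lo + hi)

def ndl_mle_closed_form(xbar):
    """Positive root of (1 + xbar) theta^2 + (1 + xbar) theta - 2 = 0 (derived here)."""
    c = 1.0 + xbar
    return (-c + sqrt(c * c + 8.0 * c)) / (2.0 * c)

sample = [1, 2, 3, 6]                      # hypothetical data, mean 3
print(mle_theta(sample, [1.0, 1.0]))       # numerical MLE for the NDL case
print(ndl_mle_closed_form(3.0))            # cross-check from the quadratic
```

Bisection is used here only because the left side of (2.4) minus the right side changes sign exactly once on $(0, 1)$ when $\bar{X} > 0$; any bracketing root finder would serve equally well.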
Theorem 3.1.
Let $X_1, X_2, \ldots, X_n \sim f_{NDOPPE}(x, \theta)$. Then the distribution of $T = X_1 + X_2 + \cdots + X_n$ is

$$ f(t) = h^{n}(\theta) \sum_{y_1} \cdots \sum_{y_r} c(n, y_1, \ldots, y_r) \binom{t + \sum_{k=1}^{r} k y_k - 1}{t} (1-\theta)^{t}, \qquad t = 0, 1, \ldots, $$

with $y_1 + \cdots + y_r = n$ and $c(n, y_1, \ldots, y_r) = \dfrac{n!}{y_1! \cdots y_r!} \prod_{k=1}^{r} \big(a_{k-1}\Gamma(k)\big)^{y_k}$.

Proof. The mgf of $T$ is

$$ M_T(s) = h^{n}(\theta) \left[ \sum_{k=1}^{r} \frac{a_{k-1}\Gamma(k)}{\theta^{k}} \cdot \frac{\theta^{k}}{\{1 - (1-\theta)e^{s}\}^{k}} \right]^{n} = h^{n}(\theta) \left[ \sum_{y_1} \cdots \sum_{y_r} c(n, y_1, \ldots, y_r) \frac{1}{\theta^{\sum_{k=1}^{r} k y_k}} \left\{ \frac{\theta}{1 - (1-\theta)e^{s}} \right\}^{\sum_{k=1}^{r} k y_k} \right]. $$

Each term in braces is the mgf of a negative binomial distribution with parameters $\theta$ and $\sum_{k=1}^{r} k y_k$. Hence, the distribution of $T$ is

$$ f(t) = h^{n}(\theta) \sum_{y_1} \cdots \sum_{y_r} c(n, y_1, \ldots, y_r) \binom{t + \sum_{k=1}^{r} k y_k - 1}{t} (1-\theta)^{t}, $$

where $c(n, y_1, \ldots, y_r) = \dfrac{n!}{y_1! \cdots y_r!} \prod_{k=1}^{r} \big(a_{k-1}\Gamma(k)\big)^{y_k}$.

Lemma 3.1.
The conditional distribution of $X_1$ given $X_1 + X_2 + \cdots + X_n = T$ is

$$ f_{X_1 \mid T}(x \mid t) = \frac{p(x)}{A_n(t)} \sum_{q_1} \cdots \sum_{q_r} c(n-1, q_1, \ldots, q_r) \binom{t - x + \sum_{k=1}^{r} k q_k - 1}{t - x}, \qquad x = 0, \ldots, t, $$

where

$$ A_n(t) = \sum_{y_1} \cdots \sum_{y_r} c(n, y_1, \ldots, y_r) \binom{t + \sum_{k=1}^{r} k y_k - 1}{t} $$

and

$$ c(n-1, q_1, \ldots, q_r) = \frac{(n-1)!}{q_1! \cdots q_r!} \prod_{k=1}^{r} \big(a_{k-1}\Gamma(k)\big)^{q_k}, $$

with $q_1 + q_2 + \cdots + q_r = n - 1$.

Proof.

$$ f_{X_1 \mid T}(x \mid t) = \frac{f(x)\, f(t-x)}{f(t)} = \frac{p(x)}{A_n(t)} \sum_{q_1} \cdots \sum_{q_r} c(n-1, q_1, \ldots, q_r) \binom{t - x + \sum_{k=1}^{r} k q_k - 1}{t - x}, $$

where $f(t-x)$ denotes the PMF of $X_2 + \cdots + X_n$ at $t - x$, obtained from Theorem 3.1 with $n$ replaced by $n - 1$.

Theorem 3.2. Let $T = t$ be given. Then

$$ \widehat{f}(x) = \frac{p(x)}{A_n(t)} \sum_{q_1} \cdots \sum_{q_r} c(n-1, q_1, \ldots, q_r) \binom{t - x + \sum_{k=1}^{r} k q_k - 1}{t - x}, \qquad x = 0, \ldots, t, \tag{3.5} $$

is the UMVUE for $f(x)$ and

$$ \widehat{F}(x) = \frac{1}{A_n(t)} \sum_{q_1} \cdots \sum_{q_r} c(n-1, q_1, \ldots, q_r) \sum_{w=0}^{x} p(w) \binom{t - w + \sum_{k=1}^{r} k q_k - 1}{t - w}, \qquad x = 0, \ldots, t, \tag{3.6} $$

is the UMVUE for $F(x)$, where $p(w) = \sum_{k=1}^{r} a_{k-1}\Gamma(k) \binom{w+k-1}{w}$.

Remark:
When $r = 2$, $a_0 = 1$, $a_1 = 1$, the above expressions (3.5) and (3.6) reduce to

$$ \widehat{f}(x) = \frac{(2+x)}{\sum_{k=0}^{n} \binom{n}{k} \binom{2n - k + t - 1}{t}} \sum_{k=0}^{n-1} \binom{n-1}{k} \binom{2n - 2 - k + t - x - 1}{t - x}, \qquad x = 0, \ldots, t, \tag{3.7} $$

and

$$ \widehat{F}(x) = \sum_{w=0}^{x} \frac{(2+w)}{\sum_{k=0}^{n} \binom{n}{k} \binom{2n - k + t - 1}{t}} \sum_{k=0}^{n-1} \binom{n-1}{k} \binom{2n - 2 - k + t - w - 1}{t - w}, \qquad x = 0, \ldots, t, \tag{3.8} $$

respectively.

The MSE of $\widehat{f}(x)$ is given by

$$ MSE(\widehat{f}(x)) = E\big(\widehat{f}(x)\big)^{2} - f^{2}(x) = \sum_{t=x}^{\infty} \left[ \frac{p(x)}{A_n(t)} \sum_{q_1} \sum_{q_2} \cdots \sum_{q_r} c(n-1, q_1, q_2, \ldots, q_r) \binom{t - x + \sum_{k=1}^{r} k q_k - 1}{t - x} \right]^{2} f(t) - f^{2}(x). \tag{3.9} $$

Using Theorem 3.1 and (1.2) in (3.9), we can get the value of the MSE of the UMVUE of the PMF.

The MSE of $\widehat{F}(x)$ is given by

$$ MSE(\widehat{F}(x)) = E\big(\widehat{F}(x)\big)^{2} - F^{2}(x) = \sum_{t=x}^{\infty} \left[ \sum_{w=0}^{x} \frac{p(w)}{A_n(t)} \sum_{q_1} \cdots \sum_{q_r} c(n-1, q_1, q_2, \ldots, q_r) \binom{t - w + \sum_{k=1}^{r} k q_k - 1}{t - w} \right]^{2} f(t) - F^{2}(x), \tag{3.10} $$

where $p(w) = \sum_{k=1}^{r} a_{k-1}\Gamma(k) \binom{w+k-1}{w}$. Similarly, using Theorem 3.1 and (1.3) in (3.10), we can get the value of the MSE of the UMVUE of the CDF.

The theoretical MSEs of the UMVUE of the PMF and that of the CDF are shown in Figure 1, taking $a_0 = 1$, $a_1 = 1$, $x = 2$, $\theta = 0.01$ and $r = 2$. It is clear from the graph that the MSE (in this case, the variance) decreases as the sample size increases.

Figure 1: Graph of theoretical MSE of the UMVUE of the PMF and that of the CDF of a NDL distribution for $x = 2$, $\theta = 0.01$ and $r = 2$. (Panels: Variance_fx against sample size, series mse.umvue_fx; Variance_Fx against sample size, series mse.umvue_Fx.)

A random sample $X_1, X_2, \ldots, X_n$ is generated by the following algorithm:

1. Generate $U_i \sim Uniform(0, 1)$, $i = 1(1)n$.
2. If $\dfrac{\sum_{k=1}^{j-1} a_{k-1}(k-1)!/\theta^{k}}{\sum_{k=1}^{r} a_{k-1}(k-1)!/\theta^{k}} < U_i \le \dfrac{\sum_{k=1}^{j} a_{k-1}(k-1)!/\theta^{k}}{\sum_{k=1}^{r} a_{k-1}(k-1)!/\theta^{k}}$, $j = 2, \ldots, r$, then set $X_i = V_i$, where $V_i \sim NB(j, \theta)$, and if $U_i \le \dfrac{a_0/\theta}{\sum_{k=1}^{r} a_{k-1}(k-1)!/\theta^{k}}$, then set $X_i = V_i$, where $V_i \sim geo(\theta)$. (Recall that $\Gamma(k) = (k-1)!$.)

A simulation study is carried out with $N$ repetitions. Here we choose $a_0 = 1$, $a_1 = 1$, $\theta = 0.01$, $x = 2$ and $r = 2$. We compute the MSE of the MLE and that of the UMVUE of the PMF and the CDF of a NDL distribution. From Figure 2, it is clear that the MSE decreases with increasing sample size, which shows the consistency property of the estimators.

Figure 2: Graph of simulated MSE of the MLE and UMVUE of the PMF and the CDF of a NDL distribution for $x = 2$, $\theta = 0.01$ and $r = 2$. (Panels: MSE_f(x) against sample size; MSE_F(x) against sample size; series mse_MLE and mse_UMVUE in each.)

We study a data set comprising the numbers of fires in forest districts of Greece for the period from 1 July 1998 to 31 August 1998. The total number of observations is 123. This data set is obtained from Bakouch et al. (2014) and is shown in Table 1. In Figure 3, we show the estimated PMF and the estimated CDF of a NDL distribution.

Table 1: Numbers of fires in Greece
Table 2: Model selection criterion
Estimator    Negative log-likelihood (NDL distribution)
MLE          340.0195
UMVUE        340.1765
We use the estimated negative log-likelihood as the model selection criterion. A lower value of the negative log-likelihood indicates a better fit. From Table 2, we find that the MLE is better than the UMVUE in the negative log-likelihood sense.

Figure 3: Graph of the estimated PMF and that of the CDF of a NDL distribution. (Panels: Estimated PMF against x, series MLE_fx and UMVUE_fx; Estimated CDF against x, series MLE_Fx and UMVUE_Fx.)
Concluding Remarks
In this article, two methods of estimation of the PMF and the CDF of a NDOPPE distribution have been considered. The MLE and the UMVUE have been derived. A simulation study is performed to compare the performances of the proposed methods of estimation. We have studied the performance of the estimators of the PMF and the CDF of a NDL distribution as a particular case of a NDOPPE distribution. From the simulation study results, it is found that the UMVUE is better than the MLE for the PMF and the MLE is better than the UMVUE for the CDF in the MSE sense. From the model selection criterion, it is noticed that the MLE is better than the UMVUE in the negative log-likelihood (i.e. AIC) sense. We analyze a data set to illustrate the fitting of the proposed distribution.
References

[1] Ahmed, A. N. and Afify, A. Z. (2019). A new discrete analog of the continuous Lindley distribution, with reliability applications. Proceedings of 62nd ISI World Statistics Congress, 5, 7-14.
[2] Asrabadi, B. R. (1990). Estimation in the Pareto distribution. Metrika, 37, 199-205.
[3] Bagheri, S. F., Alizadeh, M., Baloui, J. E., and Nadarajah, S. (2014). Evaluation and comparison of estimations in the generalized exponential-Poisson distribution. Journal of Statistical Computation and Simulation, 84(11), 2345-2360.
[4] Bakouch, H. S., Jazi, M. A., and Nadarajah, S. (2014). A new discrete distribution. Statistics, 48(1), 200-240.
[5] Dixit, U. J. and Jabbari, N. M. (2010). Efficient estimation in the Pareto distribution. Statistical Methodology, 7, 687-691.
[6] Dixit, U. J. and Jabbari, N. M. (2011). Efficient estimation in the Pareto distribution with the presence of outliers. Statistical Methodology, 8, 340-355.
[7] Famoye, F. (1993). Restricted generalized Poisson regression model. Communications in Statistics-Theory and Methods, 22(5), 1335-1354.
[8] Gomez-Deniz, E. (2010). Another generalization of the geometric distribution. TEST, 19, 399-415.
[9] Gomez-Deniz, E. and Calderin-Ojeda, E. (2011). The discrete Lindley distribution: properties and applications. Journal of Statistical Computation and Simulation, 81(11), 1405-1416.
[10] Jabbari, N. M. and Jabbari, N. H. (2010). Efficient estimation of PDF, CDF and rth moment for the exponentiated Pareto distribution in the presence of outliers. Statistics: A Journal of Theoretical and Applied Statistics, 44(4), 1-20.
[11] Jazi, M. A., Lai, C. D., and Alamatsaz, M. H. (2010). A discrete inverse Weibull distribution and estimation of its parameters. Statistical Methodology, 7(2), 121-132.
[12] Krishna, H. and Pundir, P. S. (2009). Discrete Burr and discrete Pareto distributions. Statistical Methodology, 6(2), 177-188.
[13] Maiti, S. S., Dey, M., and Sarkar (Mondal), S. (2018). Discrete xgamma distributions: properties, estimation and an application to the collective risk model. Journal of Reliability and Statistical Studies, 11(1), 117-132.
[14] Maiti, S. S. and Mukherjee, I. (2017). Estimation of the PMF and CDF of some standard discrete distributions useful in reliability modelling. International Journal of Agricultural and Statistical Sciences, 13(2), 735-751.
[15] Mukherjee, I., Maiti, S. S., and Das, M. (2018). On estimation of the PMF and CDF of the logarithmic series distribution. RASHI, 3(2), 34-44.
[16] Nakagawa, T. and Osaki, S. (1975). Discrete Weibull distribution. IEEE Transactions on Reliability, 24(5), 300-301.
[17] Roy, D. (2002). Discretization of continuous distributions with an application to stress-strength reliability. Calcutta Statistical Association Bulletin, 52, 297-314.
[18] Roy, D. (2003). The discrete normal distribution. Communications in Statistics-Theory and Methods, 32(10), 1871-1883.
[19] Roy, D. (2004). Discrete Rayleigh distribution. IEEE Transactions on Reliability, 53(2), 255-260.
[20] Stein, W. E. and Dattero, R. (1984). A new discrete Weibull distribution.