Comments on the estimate for Pareto Distribution
U. J. Dixit M. Jabbari Nooghabi Department of Statistics, University of Mumbai, Mumbai-India
Abstract
Dixit and Jabbari Nooghabi (2010) derived the MLE and UMVUE of the probability density function (pdf) and cumulative distribution function (cdf) of the Pareto distribution, and showed that the MLE is more efficient than the UMVUE. He, Zhou and Zhang (2014) derived the same estimators and remarked that the work of Dixit and Jabbari Nooghabi (2010) is not correct. In these comments we show, with detailed algebra, that our results are correct. We also give the R code.
Key Words: Pareto distribution, Maximum likelihood estimator, Uniform minimum variance unbiased estimator, Probability density function (pdf), Cumulative distribution function (cdf), Comments, R code.
The Pareto distribution has been used in connection with studies of income, property values, insurance risk, migration, sizes of cities and firms, word frequencies, business mortality, service times in queuing systems, etc. The objective of this paper is to discuss efficient estimation of the pdf and cdf of the Pareto distribution, which has been one of the most distinguished candidates for the honor of explaining the distribution of incomes, assets, etc.

We assume that the random variable X has a Pareto distribution with parameters α and k (known), with probability density function (pdf)

f_X(x) = αk^α / x^(α+1),  0 < k ≤ x,  α > 0,

and cumulative distribution function (cdf)

F_X(x) = 1 − (k/x)^α,  k ≤ x.

In economics, where this distribution is used as an income distribution, k is some minimum income with a known value. Asrabadi (1990) derived the uniformly minimum variance unbiased estimator (UMVUE) of the pdf, the cdf and the r-th moment.

In this paper we give the detailed algebra of Dixit and Jabbari Nooghabi's (2010) paper and show that our results are correct; we also give the R code. Dixit and Jabbari Nooghabi (2010) derived the MLE and UMVUE of the pdf and cdf and showed that the MLE is more efficient than the UMVUE. He, Zhou and Zhang (2014) derived the same estimators and remarked that the work of Dixit and Jabbari Nooghabi (2010) is not correct. We would like to make the following comments.

1. We have verified our results and they are correct. We give the detailed algebra and the R program; see the attachment.

2. The examples given by He et al. (2014) are not correct. One should note that the MSE of f̂(x) or F̂(x) is a function of the parameters; by itself this cannot prove anything. One can only calculate f̂(x) or F̂(x).

E-mail: [email protected]
E-mail: [email protected], [email protected]
3. According to the definition of the modified Bessel function in Olver, Lozier, Boisvert, et al. (2010), in Theorem 1 the notation K_(n−r)(2√(nrαz)) should be K_(r−n)(2√(nrαz)). Also, the notation K_(n)(2√(nrαz)) should be K_(−n)(2√(nrαz)).

4. After Theorem 2, the Kummer confluent hypergeometric function is wrong; the correct version is

U(a, b, c) = (1/Γ(a)) ∫_0^∞ t^(a−1) (1 + t)^(b−a−1) e^(−ct) dt.

In this section, we give the detailed algebra of the paper Dixit and Jabbari Nooghabi (2010). The details of the results of the second section of that paper are as follows. Let

X_1, …, X_n iid ~ f(x) = αk^α / x^(α+1),  0 < k ≤ x,  α > 0.

The likelihood is

L(x_1, …, x_n, α, k) = ( α^n k^(nα) / ∏_{i=1}^n x_i^(α+1) ) ∏_{i=1}^n I(x_i − k),

where I is the indicator function defined as I(y) = 1 if y > 0 and 0 otherwise. Then

ln L(x, α, k) = n ln(α) + nα ln(k) − (α + 1) ∑_{i=1}^n ln(x_i)

⇒ ∂ ln L(x, α)/∂α = n/α + n ln(k) − ∑_{i=1}^n ln(x_i) = 0

⇒ α̃ = MLE(α) = n / ∑_{i=1}^n ln(x_i/k),  with ∂² ln L/∂α² = −n/α² < 0.

By the invariance property, the MLE of f(x) is

f̃(x) = α̃ k^α̃ / x^(α̃+1),  α̃ > 0,  0 < k ≤ x,  (1)

⇒ F̃(x) = 1 − (k/x)^α̃,  α̃ > 0,  0 < k ≤ x.  (2)

Put y = ln(x/k) ⇒ dy = dx/x and x = k e^y, so that

f_Y(y) = k e^y · αk^α / (k e^y)^(α+1) = α e^(−αy),  y > 0,

i.e. Y ~ Γ(1, α). Hence

S = ∑_{i=1}^n Y_i ~ Γ(n, α),  g_S(s) = α^n s^(n−1) e^(−αs) / Γ(n),  s > 0.  (3)

Let w = α̃ as in (1), i.e. w = n/s ⇒ s = n/w ⇒ ds/dw = −n/w² ⇒ |Jacobian| = n/w², so

g(w) = ( (αn)^n / (Γ(n) w^(n+1)) ) exp(−αn/w),  w > 0,  α > 0.  (4)

Then

E(α̃) = E(W) = ∫_0^∞ w g(w) dw = ( (αn)^n / Γ(n) ) ∫_0^∞ w^(−n) e^(−αn/w) dw.

Put z = 1/w ⇒ dz = −dw/w², so

E(W) = ( (αn)^n / Γ(n) ) ∫_0^∞ z^(n−2) e^(−αnz) dz = ( (αn)^n / Γ(n) ) · ( Γ(n−1) / (αn)^(n−1) ).
⇒ E(α̃) = αn/(n−1).

Similarly,

E(α̃²) = E(W²) = ∫_0^∞ w² g(w) dw = ( (αn)^n / Γ(n) ) ∫_0^∞ w^(−n+1) e^(−αn/w) dw,

and as before, with z = 1/w,

E(W²) = ( (αn)^n / Γ(n) ) ∫_0^∞ z^(n−3) e^(−αnz) dz = ( (αn)^n / Γ(n) ) · ( Γ(n−2) / (αn)^(n−2) )

⇒ E(α̃²) = (αn)² / ((n−1)(n−2)).
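The results E(α̃) = αn/(n−1) and E(α̃²) = (αn)²/((n−1)(n−2)) can be checked by simulation. The following R sketch (the parameter values are illustrative, not taken from the paper) draws Pareto samples by inverting the cdf and compares the Monte Carlo mean of α̃ with its theoretical value.

```r
# Draw Pareto(alpha, k) samples by inverting the cdf, X = k * (1 - U)^(-1/alpha),
# then compare the Monte Carlo mean of the MLE alpha-tilde with alpha * n / (n - 1).
set.seed(1)
alpha <- 1.5; k <- 1; n <- 10; reps <- 20000

alphat <- replicate(reps, {
  x <- k * (1 - runif(n))^(-1 / alpha)   # inverse-cdf sampling
  n / sum(log(x / k))                    # MLE of alpha
})

mean(alphat)           # Monte Carlo estimate of E(alpha-tilde)
alpha * n / (n - 1)    # theoretical value, here 5/3
```

The upward bias of the MLE (E(α̃) > α) is visible directly in the Monte Carlo mean.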
MSE(W) = V(W) + (E(W) − α)² = E(W²) − 2αE(W) + α² = (αn)²/((n−1)(n−2)) − 2α(αn/(n−1)) + α²

⇒ MSE(α̃) = MSE(W) = α²(n+2) / ((n−1)(n−2)).

Proof of Theorem 1: (A)

E(f̃(x)) = ∫ f̃(x) g(w) dw = ∫_0^∞ ( w k^w / x^(w+1) ) ( (αn)^n / (Γ(n) w^(n+1)) ) e^(−αn/w) dw = ( (αn)^n / (Γ(n) x) ) ∫_0^∞ (k/x)^w e^(−αn/w) w^(−n) dw.

Put (k/x)^w = e^(w ln(k/x)), then

E(f̃(x)) = ( (αn)^n / (Γ(n) x) ) ∫_0^∞ e^(w ln(k/x)) e^(−αn/w) w^(−n) dw.
We know e^(w ln(k/x)) = ∑_{j=0}^∞ w^j (ln(k/x))^j / j!, so

E(f̃(x)) = ( (αn)^n / (Γ(n) x) ) ∑_{j=0}^∞ ( (ln(k/x))^j / j! ) ∫_0^∞ e^(−αn/w) w^(j−n) dw.

Put z = 1/w, then −dw/w² = dz, and

∫_0^∞ w^(j−n) e^(−αn/w) dw = ∫_0^∞ z^(n−j−2) e^(−αnz) dz = Γ(n−j−1) / (αn)^(n−j−1),

so

E(f̃(x)) = ( (αn)^n / (Γ(n) x) ) ∑_j ( (ln(k/x))^j / j! ) Γ(n−j−1) / (αn)^(n−j−1)

⇒ E(f̃(x)) = ( 1 / (Γ(n) x) ) ∑_{j=0}^{n−2} ( (nα)^(j+1) / j! ) Γ(n−j−1) (ln(k/x))^j.  (5)

(B)

E(F̃(x)) = ∫ F̃(x) g(w) dw = ∫_0^∞ [1 − (k/x)^w] ( (αn)^n / (Γ(n) w^(n+1)) ) e^(−αn/w) dw = 1 − ( (αn)^n / Γ(n) ) ∫_0^∞ e^(w ln(k/x)) e^(−αn/w) w^(−n−1) dw.

Expanding (k/x)^w = ∑_{j=0}^∞ w^j (ln(k/x))^j / j! and putting z = 1/w as before, ∫_0^∞ w^(j−n−1) e^(−αn/w) dw = Γ(n−j) / (αn)^(n−j), so

⇒ E(F̃(x)) = 1 − ( 1 / Γ(n) ) ∑_{j=0}^{n−1} ( (αn)^j / j! ) Γ(n−j) (ln(k/x))^j,  x ≥ k.  (6)

Proof of Theorem 2: (A) At first we should find E(f̃²(x)):

E(f̃²(x)) = ∫_0^∞ f̃²(x) g(w) dw = ∫_0^∞ ( w² k^(2w) / x^(2w+2) ) ( (αn)^n / (Γ(n) w^(n+1)) ) e^(−αn/w) dw = ( (αn)^n / (Γ(n) x²) ) ∫_0^∞ (k/x)^(2w) e^(−αn/w) w^(1−n) dw.

Similarly to the previous theorem,

E(f̃²(x)) = ( (αn)^n / (Γ(n) x²) ) ∫_0^∞ e^(2w ln(k/x)) e^(−αn/w) w^(1−n) dw.
Expanding e^(2w ln(k/x)) = ∑_{j=0}^∞ 2^j w^j (ln(k/x))^j / j! and substituting z = 1/w,

∫_0^∞ w^(j+1−n) e^(−αn/w) dw = ∫_0^∞ z^(n−j−3) e^(−αnz) dz = Γ(n−j−2) / (αn)^(n−j−2),

so

⇒ E(f̃²(x)) = ( 1 / (Γ(n) x²) ) ∑_{j=0}^{n−3} 2^j ( (ln(k/x))^j / j! ) Γ(n−j−2) (αn)^(j+2).

We know that V(f̃(x)) = E(f̃²(x)) − [E(f̃(x))]². Then

V(f̃(x)) = ( 1 / (Γ(n) x²) ) ∑_{j=0}^{n−3} 2^j ( (ln(k/x))^j / j! ) Γ(n−j−2) (αn)^(j+2) − [ ( 1 / (Γ(n) x) ) ∑_{j=0}^{n−2} ( (αn)^(j+1) / j! ) Γ(n−j−1) (ln(k/x))^j ]².

Therefore
MSE(f̃(x)) = V(f̃(x)) + (E(f̃(x)) − f(x))² = E(f̃²(x)) − 2 E(f̃(x)) f(x) + f²(x)

⇒ MSE(f̃(x)) = ( 1 / (Γ(n) x²) ) ∑_{j=0}^{n−3} 2^j ( (ln(k/x))^j / j! ) Γ(n−j−2) (αn)^(j+2) − 2 ( αk^α / x^(α+1) ) ( 1 / (Γ(n) x) ) ∑_{j=0}^{n−2} ( (nα)^(j+1) / j! ) Γ(n−j−1) (ln(k/x))^j + ( αk^α / x^(α+1) )².  (7)

(B) Same as in case (A),

E(F̃²(x)) = ∫_0^∞ F̃²(x) g(w) dw = ∫_0^∞ [1 − (k/x)^w]² ( (αn)^n / (Γ(n) w^(n+1)) ) e^(−αn/w) dw
= 1 − 2( (αn)^n / Γ(n) ) ∫_0^∞ e^(w ln(k/x)) e^(−αn/w) w^(−n−1) dw + ( (αn)^n / Γ(n) ) ∫_0^∞ e^(2w ln(k/x)) e^(−αn/w) w^(−n−1) dw.

Let e^(w ln(k/x)) = ∑_{j=0}^∞ w^j (ln(k/x))^j / j! and e^(2w ln(k/x)) = ∑_{j=0}^∞ 2^j w^j (ln(k/x))^j / j!. Substituting z = 1/w as before,

⇒ E(F̃²(x)) = 1 − ( 2 / Γ(n) ) ∑_{j=0}^{n−1} Γ(n−j) ( (αn)^j / j! ) (ln(k/x))^j + ( 1 / Γ(n) ) ∑_{j=0}^{n−1} Γ(n−j) 2^j ( (αn)^j / j! ) (ln(k/x))^j,  x ≥ k.

We have V(F̃(x)) = E(F̃²(x)) − [E(F̃(x))]², so

V(F̃(x)) = 1 − ( 2 / Γ(n) ) ∑_{j=0}^{n−1} Γ(n−j) ( (αn)^j / j! ) (ln(k/x))^j + ( 1 / Γ(n) ) ∑_{j=0}^{n−1} Γ(n−j) 2^j ( (αn)^j / j! ) (ln(k/x))^j − [ 1 − ( 1 / Γ(n) ) ∑_{j=0}^{n−1} Γ(n−j) ( (αn)^j / j! ) (ln(k/x))^j ]².

Further,

MSE(F̃(x)) = V(F̃(x)) + (E(F̃(x)) − F(x))² = E(F̃²(x)) − 2 F(x) E(F̃(x)) + F²(x),

so, with F(x) = 1 − (k/x)^α,

MSE(F̃(x)) = E(F̃²(x)) − 2 [1 − (k/x)^α] E(F̃(x)) + [1 − (k/x)^α]², which simplifies to the following.
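Since α̃ = n/S with S ~ Γ(n, α), expectations such as E(F̃(x)) are by definition integrals against the density g(w) in (4). This can be confirmed numerically by comparing quadrature over g(w) with a Monte Carlo average of F̃(x). A minimal R sketch with illustrative parameter values:

```r
# E(F-tilde(x)) two ways: quadrature against the exact density g(w) of the MLE,
# and Monte Carlo over samples of alpha-tilde = n / sum(log(x_i / k)).
set.seed(2)
alpha <- 2; k <- 1; n <- 6; x0 <- 1.5; reps <- 40000

g <- function(w) (alpha * n)^n / (gamma(n) * w^(n + 1)) * exp(-alpha * n / w)
quad <- integrate(function(w) (1 - (k / x0)^w) * g(w), 0, Inf)$value

w <- replicate(reps, n / sum(log(k * (1 - runif(n))^(-1 / alpha) / k)))
mc <- mean(1 - (k / x0)^w)

c(quadrature = quad, monte.carlo = mc)   # the two should agree
```

The same device (replacing the integrand) checks E(f̃(x)) or any other moment of the MLE-based estimators.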
MSE(F̃(x)) = ( 1 / Γ(n) ) ∑_{j=0}^{n−1} Γ(n−j) 2^j ( (αn)^j / j! ) (ln(k/x))^j − 2 (k/x)^α ( 1 / Γ(n) ) ∑_{j=0}^{n−1} Γ(n−j) ( (αn)^j / j! ) (ln(k/x))^j + (k/x)^(2α).  (8)

From Asrabadi (1990), with t = ∏_{i=1}^n x_i, we have

α̂ = (n−1) / ( ln(t) − n ln(k) ),  t ≥ k^n,  (9)

and the UMVUEs of f(x) and F(x) are

f̂(x) = (n−1) [ln(t) − ln(x) − (n−1) ln(k)]^(n−2) / ( x [ln(t) − n ln(k)]^(n−1) ),  k ≤ x < t k^(−(n−1)),  (10)

F̂(x) = 0 for x < k;
F̂(x) = 1 − [ln(t) − ln(x) − (n−1) ln(k)]^(n−1) / [ln(t) − n ln(k)]^(n−1) for k ≤ x ≤ t k^(−(n−1));
F̂(x) = 1 for x ≥ t k^(−(n−1)),  (11)

respectively. Also,

f(x) = αk^α / x^(α+1),  x ≥ k > 0,  α > 0,  F(x) = 1 − (k/x)^α,  k ≤ x.

Proof of Theorem 3: (A) It is obvious that E(f̂(x)) = f(x). So we should find

E(f̂²(x)) = ∫ f̂²(x) h*(t) dt,  where  h*(t) = α^n k^(nα) t^(−α−1) [ln(t) − n ln(k)]^(n−1) / (n−1)!,  t ≥ k^n,

is the density of T = ∏ X_i. Therefore

E(f̂²(x)) = ∫_{x k^(n−1)}^∞ ( (n−1)² [ln(t) − ln(x) − (n−1) ln(k)]^(2(n−2)) / ( x² [ln(t) − n ln(k)]^(2(n−1)) ) ) ( α^n k^(nα) t^(−α−1) [ln(t) − n ln(k)]^(n−1) / (n−1)! ) dt
= ( (n−1)² α^n k^(αn) / ( x² (n−1)! ) ) ∫_{x k^(n−1)}^∞ [ln(t) − ln(x) − (n−1) ln(k)]^(2(n−2)) [ln(t) − n ln(k)]^(−(n−1)) t^(−α−1) dt.

Let z = ln(t) − n ln(k) ⇒ dz = dt/t; then t^(−α) = k^(−nα) e^(−αz) and ln(t) − ln(x) − (n−1) ln(k) = z − ln(x/k), so

E(f̂²(x)) = ( (n−1)² α^n / ( x² (n−1)! ) ) ∫_{ln(x/k)}^∞ [z − ln(x/k)]^(2n−4) e^(−αz) z^(−(n−1)) dz.

We know that

[z − ln(x/k)]^(2n−4) = ∑_{j=0}^{2n−4} C(2n−4, j) (−ln(x/k))^j z^(2n−4−j),  C(n, k) = n! / (k!(n−k)!).

So

E(f̂²(x)) = ( (n−1)² α^n / ( x² (n−1)! ) ) ∑_j C(2n−4, j) (−ln(x/k))^j ∫_{ln(x/k)}^∞ z^(n−3−j) e^(−αz) dz.

The above integral is an incomplete Gamma function; for integer n−2−j > 0,

∫_{ln(x/k)}^∞ z^(n−3−j) e^(−αz) dz = ( Γ(n−2−j) / α^(n−2−j) ) ∑_{i=0}^{n−3−j} e^(−α ln(x/k)) (α ln(x/k))^i / i!.

Since the Gamma function requires a positive argument, the sum over j runs up to n−3, and with e^(−α ln(x/k)) = (k/x)^α we get

E(f̂²(x)) = ( (n−1) α² k^α / ( x^(α+2) (n−2)! ) ) ∑_{j=0}^{n−3} C(2n−4, j) α^j (−ln(x/k))^j Γ(n−2−j) ∑_{i=0}^{n−3−j} α^i (ln(x/k))^i / i!.
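Before computing the MSE, the estimator (11) itself can be sanity-checked: F̂(x) is unbiased for F(x). A minimal R sketch (the parameter values are illustrative, not from the paper):

```r
# UMVUE of F(x) from (11), as a function of the complete sufficient statistic
# t = prod(x_i); check E(F-hat(x0)) = F(x0) by Monte Carlo.
Fhat <- function(x, xs, k) {
  n <- length(xs)
  lt <- sum(log(xs))                      # log(t)
  if (x < k) return(0)
  num <- lt - log(x) - (n - 1) * log(k)   # log(t) - log(x) - (n-1) log(k)
  den <- lt - n * log(k)
  if (num <= 0) return(1)                 # the region x >= t * k^(-(n-1))
  1 - (num / den)^(n - 1)
}

set.seed(3)
alpha <- 2; k <- 1; n <- 8; x0 <- 1.5; reps <- 20000
est <- replicate(reps, Fhat(x0, k * (1 - runif(n))^(-1 / alpha), k))

mean(est)            # Monte Carlo mean of the UMVUE
1 - (k / x0)^alpha   # true F(x0)
```

The Monte Carlo mean reproduces F(x0) up to simulation error, as unbiasedness requires.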
MSE(f̂(x)) = V(f̂(x)) = E(f̂²(x)) − f²(x)
= ( (n−1) α² k^α / ( x^(α+2) Γ(n−1) ) ) ∑_{j=0}^{n−3} C(2n−4, j) α^j Γ(n−j−2) (−ln(x/k))^j ∑_{i=0}^{n−3−j} α^i (ln(x/k))^i / i! − ( αk^α / x^(α+1) )².  (12)

(B)

E(F̂²(x)) = ∫ F̂²(x) h*(t) dt = ∫_{x k^(n−1)}^∞ [1 − R(t)]² h*(t) dt + ∫_{k^n}^{x k^(n−1)} 1² · h*(t) dt,

where R(t) = [ln(t) − ln(x) − (n−1) ln(k)]^(n−1) / [ln(t) − n ln(k)]^(n−1). Expanding [1 − R(t)]² = 1 − 2R(t) + R²(t),

E(F̂²(x)) = ∫_{k^n}^∞ h*(t) dt − 2 ∫_{x k^(n−1)}^∞ R(t) h*(t) dt + ∫_{x k^(n−1)}^∞ R²(t) h*(t) dt
= ∫_{k^n}^∞ h*(t) dt − ( 2 α^n k^(αn) / (n−1)! ) ∫_{x k^(n−1)}^∞ [ln(t) − ln(x) − (n−1) ln(k)]^(n−1) t^(−α−1) dt
+ ( α^n k^(αn) / (n−1)! ) ∫_{x k^(n−1)}^∞ [ln(t) − ln(x) − (n−1) ln(k)]^(2(n−1)) [ln(t) − n ln(k)]^(−(n−1)) t^(−α−1) dt.
We know that ∫_{k^n}^∞ h*(t) dt = 1. For the second integral let z = ln(t) − ln(x) − (n−1) ln(k), and to solve the third integral put z = ln(t) − n ln(k). Then

E(F̂²(x)) = 1 − ( 2 α^n k^α / ( x^α (n−1)! ) ) ∫_0^∞ z^(n−1) e^(−αz) dz + ( α^n / (n−1)! ) ∫_{ln(x/k)}^∞ [z − ln(x/k)]^(2n−2) e^(−αz) z^(−(n−1)) dz.

Now ∫_0^∞ z^(n−1) e^(−αz) dz = Γ(n)/α^n, so the second term equals 2(k/x)^α. For the last integral we know

[z − ln(x/k)]^(2n−2) = ∑_{j=0}^{2n−2} C(2n−2, j) (−ln(x/k))^j z^(2n−2−j).

Therefore

E(F̂²(x)) = 1 − 2(k/x)^α + ( α^n / (n−1)! ) ∑_j C(2n−2, j) (−ln(x/k))^j ∫_{ln(x/k)}^∞ z^(n−1−j) e^(−αz) dz,

and the last integral is an incomplete Gamma function: for j ≤ n−1,

∫_{ln(x/k)}^∞ z^(n−1−j) e^(−αz) dz = ( Γ(n−j) / α^(n−j) ) ∑_{i=0}^{n−j−1} e^(−α ln(x/k)) (α ln(x/k))^i / i!.

Since the Gamma function is defined over positive values,

E(F̂²(x)) = 1 − 2(k/x)^α + ( k^α / ( x^α (n−1)! ) ) ∑_{j=0}^{n−1} C(2n−2, j) α^j Γ(n−j) (−ln(x/k))^j ∑_{i=0}^{n−j−1} (α ln(x/k))^i / i!.

Then
MSE(F̂(x)) = V(F̂(x)) = E(F̂²(x)) − [E(F̂(x))]² = E(F̂²(x)) − F²(x)
= 1 − 2(k/x)^α + ( k^α / ( x^α Γ(n) ) ) ∑_{j=0}^{n−1} C(2n−2, j) α^j Γ(n−j) (−ln(x/k))^j ∑_{i=0}^{n−j−1} (α ln(x/k))^i / i! − [ 1 − (k/x)^α ]².
MSE(F̂(x)) = ( k^α / ( Γ(n) x^α ) ) ∑_{j=0}^{n−1} C(2n−2, j) α^j Γ(n−j) (−ln(x/k))^j ∑_{i=0}^{n−j−1} α^i (ln(x/k))^i / i! − (k/x)^(2α).  (13)

2.1 The r-th moment of f̃(x) and F̃(x)

To find the r-th moment of f̃(x), we have

E(f̃^r(x)) = ∫_0^∞ f̃^r(x) g(w) dw = ∫_0^∞ ( w^r k^(rw) / x^(r(w+1)) ) ( (αn)^n / (Γ(n) w^(n+1)) ) e^(−αn/w) dw
= ( (αn)^n / (Γ(n) x^r) ) ∫_0^∞ (k/x)^(rw) e^(−αn/w) w^(r−n−1) dw
= ( (αn)^n / (Γ(n) x^r) ) ∑_{j=0}^∞ r^j ( (ln(k/x))^j / j! ) ∫_0^∞ w^(r+j−n−1) e^(−αn/w) dw.

With z = 1/w, ∫_0^∞ w^(r+j−n−1) e^(−αn/w) dw = ∫_0^∞ z^(n−r−j−1) e^(−αnz) dz = Γ(n−r−j) / (αn)^(n−r−j), so

⇒ E(f̃^r(x)) = ( 1 / (Γ(n) x^r) ) ∑_{j=0}^{n−r−1} r^j ( (ln(k/x))^j / j! ) Γ(n−r−j) (αn)^(j+r).  (14)

Also, the r-th moment of F̃(x) can be found by calculating the following integral:

E(F̃^r(x)) = ∫_0^∞ [1 − (k/x)^w]^r g(w) dw = ∫_0^∞ ∑_{j=0}^r C(r, j) (−(k/x)^w)^j ( (αn)^n / (Γ(n) w^(n+1)) ) e^(−αn/w) dw
= ( (αn)^n / Γ(n) ) ∑_{j=0}^r C(r, j) (−1)^j ∑_{i=0}^∞ ( (j ln(k/x))^i / i! ) ∫_0^∞ w^(i−n−1) e^(−αn/w) dw,

and ∫_0^∞ w^(i−n−1) e^(−αn/w) dw = Γ(n−i) / (αn)^(n−i). Then

E(F̃^r(x)) = ( 1 / Γ(n) ) ∑_{j=0}^r C(r, j) (−1)^j ∑_{i=0}^{n−1} ( (j ln(k/x))^i / i! ) Γ(n−i) (αn)^i.  (15)

2.2 The r-th moment of f̂(x) and F̂(x)

The r-th moment of f̂(x) is obtained as follows:

E(f̂^r(x)) = ∫ f̂^r(x) h*(t) dt
= ∫_{x k^(n−1)}^∞ ( (n−1)^r [ln(t) − ln(x) − (n−1) ln(k)]^(r(n−2)) / ( x^r [ln(t) − n ln(k)]^(r(n−1)) ) ) ( α^n k^(nα) t^(−α−1) [ln(t) − n ln(k)]^(n−1) / (n−1)! ) dt
= ( (n−1)^r α^n k^(αn) / ( x^r (n−1)! ) ) ∫_{x k^(n−1)}^∞ [ln(t) − ln(x) − (n−1) ln(k)]^(r(n−2)) [ln(t) − n ln(k)]^(−(r−1)(n−1)) t^(−α−1) dt.

Let z = ln(t) − n ln(k) ⇒ dz = dt/t; then

E(f̂^r(x)) = ( (n−1)^r α^n / ( x^r (n−1)! ) ) ∫_{ln(x/k)}^∞ [z − ln(x/k)]^(r(n−2)) e^(−αz) z^(−(r−1)(n−1)) dz
= ( (n−1)^r α^n / ( x^r (n−1)! ) ) ∑_{j=0}^{r(n−2)} C(r(n−2), j) (−ln(x/k))^j ∫_{ln(x/k)}^∞ z^(n−r−j−1) e^(−αz) dz,

since r(n−2) − j − (r−1)(n−1) = n − r − j − 1. The incomplete Gamma integral gives, for j ≤ n−r−1,

∫_{ln(x/k)}^∞ z^(n−r−j−1) e^(−αz) dz = ( Γ(n−r−j) / α^(n−r−j) ) ∑_{i=0}^{n−r−j−1} e^(−α ln(x/k)) (α ln(x/k))^i / i!.

Therefore

E(f̂^r(x)) = ( (n−1)^r α^r k^α / ( x^(α+r) (n−1)! ) ) ∑_{j=0}^{n−r−1} C(r(n−2), j) α^j (−ln(x/k))^j Γ(n−r−j) ∑_{i=0}^{n−r−j−1} α^i (ln(x/k))^i / i!.  (16)

Also, the r-th moment of F̂(x) is obtained similarly:

E(F̂^r(x)) = ∫ F̂^r(x) h*(t) dt = ∫_{x k^(n−1)}^∞ [1 − R(t)]^r h*(t) dt + ∫_{k^n}^{x k^(n−1)} 1^r · h*(t) dt,

with R(t) as before. Expanding [1 − R(t)]^r = ∑_{j=0}^r C(r, j) (−1)^j R^j(t) and putting z = ln(t) − n ln(k) in both integrals,

E(F̂^r(x)) = ( α^n / (n−1)! ) ∑_{j=0}^r C(r, j) (−1)^j ∫_{ln(x/k)}^∞ [z − ln(x/k)]^(j(n−1)) z^(−(j−1)(n−1)) e^(−αz) dz + ( α^n / (n−1)! ) ∫_0^{ln(x/k)} z^(n−1) e^(−αz) dz.

The last term is a lower incomplete Gamma function:

( α^n / (n−1)! ) ∫_0^{ln(x/k)} z^(n−1) e^(−αz) dz = 1 − (k/x)^α ∑_{i=0}^{n−1} (α ln(x/k))^i / i!.

Expanding [z − ln(x/k)]^(j(n−1)) = ∑_{i=0}^{j(n−1)} C(j(n−1), i) (−ln(x/k))^i z^(j(n−1)−i) and noting j(n−1) − i − (j−1)(n−1) = n − 1 − i, each remaining integral is

∫_{ln(x/k)}^∞ z^(n−1−i) e^(−αz) dz = ( Γ(n−i) / α^(n−i) ) ∑_{l=0}^{n−i−1} e^(−α ln(x/k)) (α ln(x/k))^l / l!.

Then

E(F̂^r(x)) = ( k^α / ( (n−1)! x^α ) ) ∑_{j=0}^r C(r, j) (−1)^j ∑_{i=0}^{j(n−1)} C(j(n−1), i) (−ln(x/k))^i α^i Γ(n−i) ∑_{l=0}^{n−i−1} (α ln(x/k))^l / l! + (k/x)^α [ (x/k)^α − ∑_{i=0}^{n−1} (α ln(x/k))^i / i! ].  (17)

The R code to compare the bias and MSE of the estimators is as follows.
sim <- function(t, n, k, alpha, r) {
  sfh <- 0
  sFh <- 0
  sft <- 0
  sFt <- 0
  for (l in 1:t) {
    x <- array(NA, c(1, n))
    for (i in 1:n) {
      u <- runif(1, 0, 1)
      x[i] <- k * (1 - u)^(-1 / alpha)
    }
    alphah <- n / sum(log(x) - log(k))
    fx <- alpha * k^alpha / x[1]^(alpha + 1)
    intB <- function(z) { z^(-n) * exp(-alpha * n / z) * (k / x[1])^z }
    B <- integrate(intB, lower = 0, upper = Inf)$value
    Eftildx <- (alpha * n)^n / factorial(n - 1) / x[1] * B
    intB1 <- function(z) { z^(-n + 1) * exp(-alpha * n / z) * (k / x[1])^(2 * z) }
    B1 <- integrate(intB1, lower = 0, upper = Inf)$value
    Eftildxs2 <- (alpha * n)^n / factorial(n - 1) / x[1]^2 * B1
    MSEftildx <- Eftildxs2 - 2 * fx * Eftildx + fx^2
    intA <- function(z) { z^(2 * n - 4) * exp(-alpha * z) / (z + log(x[1]) - log(k))^(n - 1) }
    A <- integrate(intA, lower = 0, upper = Inf)$value
    MSEfhx <- (n - 1) * alpha^n * k^alpha / x[1]^(alpha + 2) / factorial(n - 2) * A -
      alpha^2 * k^(2 * alpha) / x[1]^(2 * alpha + 2)
    Fx <- 1 - (k / x[1])^alpha
    intB2 <- function(w) { w^(-n - 1) * exp(w * log(k / x[1])) * exp(-alpha * n / w) }
    B2 <- integrate(intB2, lower = 0, upper = Inf)$value
    EFtildx <- 1 - (alpha * n)^n / factorial(n - 1) * B2
    intB3 <- function(w) { w^(-n - 1) * exp(2 * w * log(k / x[1])) * exp(-alpha * n / w) }
    B3 <- integrate(intB3, lower = 0, upper = Inf)$value
    EFtildxs2 <- 1 - 2 * (alpha * n)^n / factorial(n - 1) * B2 + (alpha * n)^n / factorial(n - 1) * B3
    MSEFtildx <- EFtildxs2 - 2 * Fx * EFtildx + Fx^2
    intA1 <- function(w) { (w - log(x[1] / k))^(2 * n - 2) * exp(-alpha * w) * w^(-n + 1) }
    A1 <- integrate(intA1, lower = log(x[1] / k), upper = Inf)$value
    MSEFhx <- 1 - 2 * (k / x[1])^alpha + alpha^n / factorial(n - 1) * A1 - (1 - (k / x[1])^alpha)^2
    sfh <- sfh + MSEfhx
    sft <- sft + MSEftildx
    sFh <- sFh + MSEFhx
    sFt <- sFt + MSEFtildx
  }
  mMSEfhx <- sfh / t
  mMSEftildx <- sft / t
  mMSEFhx <- sFh / t
  mMSEFtildx <- sFt / t
  return(c(mMSEfhx, mMSEftildx, mMSEFhx, mMSEFtildx))
}
sim(10, 5, 1, 5, 1)

sim1 <- function(t, k, alpha, r) {
  i <- seq(3, 35, 1)
  for (j in i) {
    sim(t, j, k, alpha, r)
  }
}
sim1(10, 1, 5, 1)

In order to get an idea of the efficiency of the two types of estimators, i.e. the MLE and the UMVUE, we have generated samples of size n = 4(1)15(5)100 from the Pareto distribution with α = 0.5(0.5)2 and k = 0.5(0.5)2. The tables are based on one thousand independent replications of each experiment. Table 1.
shows the MSEs of the estimators of the pdf, and the MSEs of the estimators of the cdf are shown in Table 2. In each table, the values in brackets are for the MLE and the values without brackets are for the UMVUE. From the tables it is seen that the MLEs of the pdf and cdf are more efficient than the UMVUEs. One should note that the UMVUE of α is better than the MLE of α.

Table 1. MSE of f̂(x) and f̃(x) for different values of α and k with respect to n

n      α=0.5,k=0.5    α=1,k=1        α=1.5,k=1.5  α=2,k=2    α=0.5,k=2  α=2,k=0.5
4      .4723690000    .4755806957    .476511    .476705    .029389    7.658800
       (.4551780000)  (.4572692552)  (.457605)  (.457264)  (.028230)  (7.374350)
5      .3172310000    .3207780187    .322473    .321924    .019827    5.157680
       (.2725650000)  (.2757554306)  (.277487)  (.276551)  (.017036)  (4.434020)
6      .2383780000    .2404067792    .241698    .242471    .014881    3.885290
       (.1881800000)  (.1896539273)  (.190811)  (.191536)  (.011743)  (3.071030)
7      .1914040000    .1929655509    .193791    .194454    .011901    3.105110
       (.1421410000)  (.1432299232)  (.143884)  (.144461)  (.008828)  (2.305270)
8      .1599510000    .1602595912    .161806    .161604    .009946    2.588020
       (.1133700000)  (.1133783291)  (.114652)  (.114415)  (.007042)  (1.832800)
9      .1372420000    .1380770369    .138780    .139051    .008601    2.213210
       (.0937400000)  (.0942468652)  (.094764)  (.094956)  (.005877)  (1.509400)
10     .1195130000    .1209664617    .121456    .121541    .007514    1.944830
       (.0791650000)  (.0801522609)  (.080488)  (.080528)  (.004982)  (1.288590)
11     .1063690000    .1075933345    .107972    .108015    .006661    1.728000
       (.0687560000)  (.0695600949)  (.069808)  (.069818)  (.004307)  (1.116900)
12     .0957350000    .0966263202    .096937    .097635    .005985    1.561640
       (.0606160000)  (.0611705833)  (.061363)  (.061865)  (.003789)  (.989450)
13     .0873630000    .0876848730    .088438    .088472    .005442    1.418000
       (.0543730000)  (.0545292384)  (.055038)  (.055049)  (.003386)  (.882550)
14     .0802730000    .0808533231    .081013    .081069    .005006    1.299260
       (.0492200000)  (.0495643965)  (.049654)  (.049681)  (.003069)  (.796410)
15     .0735700000    .0746156473    .074685    .074919    .004613    1.197330
       (.0444890000)  (.0451395324)  (.045168)  (.045318)  (.002790)  (.724140)
20     .0532991000    .0538278427    .053924    .054039    .003330    .865418
       (.0307735000)  (.0310780487)  (.031129)  (.031196)  (.001923)  (.499644)
25     .0418259000    .0420976064    .042261    .042273    .002612    .676949
       (.0234799000)  (.0236273808)  (.023720)  (.023724)  (.001466)  (.379933)
30     .0341756000    .0346115111    .034740    .034775    .002141    .556918
       (.0188200000)  (.0190619643)  (.019133)  (.019152)  (.001179)  (.306728)
35     .0290451000    .0294767054    .029444    .029448    .001821    .473159
       (.0157795000)  (.0160172951)  (.015996)  (.015996)  (.000989)  (.257090)
40     .0252562000    .0254701187    .025650    .025593    .001581    .411040
       (.0135817000)  (.0136957411)  (.013795)  (.013761)  (.000850)  (.221060)
45     .0224046000    .0225300956    .022630    .022658    .001395    .362473
       (.0119535000)  (.0120188221)  (.012073)  (.012088)  (.000744)  (.193366)
50     .0198892000    .0202331230    .020334    .020295    .001253    .324570
       (.0105411000)  (.0107253823)  (.010780)  (.010758)  (.000664)  (.172036)
55     .0181373000    .0183714478    .018323    .018382    .001130    .294111
       (.0095638000)  (.0096881397)  (.009661)  (.009692)  (.000596)  (.155077)
60     .0165965000    .0167227711    .016761    .016767    .001036    .269022
       (.0087132000)  (.0087790315)  (.008799)  (.008802)  (.000544)  (.141232)
65     .0152848000    .0153716098    .015452    .015489    .000954    .247479
       (.0079948000)  (.0080394109)  (.008082)  (.008102)  (.000499)  (.129436)
70     .0141436000    .0142593414    .014292    .014337    .000880    .228735
       (.0073741000)  (.0074341631)  (.007451)  (.007474)  (.000459)  (.119243)
75     .0131255000    .0132738528    .013345    .013349    .000820    .213970
       (.0068239000)  (.0069011346)  (.006938)  (.006940)  (.000427)  (.111250)
80     .0123438000    .0124114845    .012482    .012486    .000766    .199643
       (.0064025000)  (.0064370354)  (.006474)  (.006476)  (.000397)  (.103542)
85     .0115424000    .0117098555    .011725    .011738    .000722    .188222
       (.0059735000)  (.0060606327)  (.006068)  (.006075)  (.000374)  (.097417)
90     .0109372000    .0110042351    .011088    .011080    .000680    .177858
       (.0056499000)  (.0056841841)  (.005728)  (.005724)  (.000351)  (.091881)
95     .0102967000    .0104005835    .010482    .010493    .000644    .167683
       (.0053096000)  (.0053631148)  (.005405)  (.005411)  (.000332)  (.086471)
100    .0098069000    .0098648715    .009939    .009971    .000615    .159213
       (.0050495000)  (.0050790332)  (.005117)  (.005134)  (.000316)  (.081978)

The figures in brackets refer to the MSE of the MLE of f(x) (f̃(x)) and those without brackets to the MSE of the UMVUE of f(x) (f̂(x)).

Table 2. MSE of F̂(x) and F̃(x) for different values of α and k with respect to n

n      α=0.5,k=0.5    α=1,k=1        α=1.5,k=1.5  α=2,k=2    α=0.5,k=2  α=2,k=0.5
4      .1333908446    .1368410441    .038435    .115906    .121205    .157459
       (.0014763889)  (.0012562228)  (.001100)  (.000794)  (.001880)  (.002199)
5      .1118961062    .1799791911    .172372    .198062    .137497    .124812
       (.0072949912)  (.0356532489)  (.029683)  (.055201)  (.013072)  (.009786)
6      .1408169615    .1737093067    .101804    .188658    .143735    .134947
       (.0037596610)  (.0032023158)  (.001684)  (.005213)  (.003836)  (.005169)
7      .2191326212    .2065645792    .195758    .250334    .126943    .160016
       (.0229514291)  (.0165453192)  (.013057)  (.067911)  (.004378)  (.007159)
8      .2007151499    .2012409768    .195431    .219443    .197351    .203045
       (.0034259269)  (.0032392638)  (.004168)  (.000621)  (.002333)  (.007496)
9      .2576817407    .2512987261    .147364    .135785    .124131    .269592
       (.0275929634)  (.0205299145)  (.004077)  (.003471)  (.002910)  (.060899)
10     .1492978531    .1837933180    .150179    .220440    .134771    .213361
       (.0023002929)  (.0030831665)  (.003979)  (.005504)  (.001508)  (.007332)
11     .2623413820    .2259891179    .259195    .274008    .163600    .129209
       (.0139901431)  (.0076116749)  (.012447)  (.046732)  (.003893)  (.002419)
12     .1819439058    .1679376346    .197488    .176699    .147436    .205010
       (.0062366813)  (.0049937937)  (.004988)  (.004595)  (.004263)  (.005586)
13     .2177045578    .1674005776    .235847    .254445    .141234    .141272
       (.0058142184)  (.0033358805)  (.006945)  (.008379)  (.002349)  (.002350)
14     .2356903001    .0548103487    .227515    .193420    .188948    .136300
       (.0060898969)  (.0003127263)  (.006204)  (.004149)  (.006286)  (.001994)
15     .1076037454    .1555011068    .122969    .253257    .199792    .191199
       (.0011259936)  (.0024132218)  (.001482)  (.007139)  (.004118)  (.003743)
20     .2627256864    .2622509943    .248371    .232384    .164861    .168388
       (.0065292099)  (.0062066109)  (.005120)  (.004292)  (.002324)  (.002052)
25     .2488455377    .2598193829    .218770    .259375    .151849    .177311
       (.0041453670)  (.0052022850)  (.004877)  (.004883)  (.001275)  (.001795)
30     .2251732413    .2172670151    .257171    .211684    .113000    .118642
       (.0042208658)  (.0023800197)  (.004369)  (.003983)  (.000552)  (.000612)
35     .2458125666    .2072001232    .198559    .204366    .178507    .156018
       (.0028694984)  (.0018011252)  (.003254)  (.003352)  (.001266)  (.000935)
40     .1737078950    .2554091293    .251794    .152611    .116046    .156706
       (.0010316986)  (.0032353869)  (.002759)  (.002133)  (.001487)  (.000818)
45     .1940551896    .2429150460    .173837    .222621    .157868    .108663
       (.0011808928)  (.0021535385)  (.002250)  (.002867)  (.000733)  (.000325)
50     .1064113870    .2267825379    .179884    .228218    .125222    .151083
       (.0011050152)  (.0015767910)  (.000884)  (.002635)  (.000394)  (.000594)
55     .2299332896    .1855659085    .252088    .218652    .163238    .181364
       (.0024125373)  (.0008607060)  (.002056)  (.002338)  (.000639)  (.001967)
60     .1392457773    .1773309892    .127104    .227032    .170991    .108474
       (.0013392528)  (.0017724707)  (.001195)  (.002202)  (.000649)  (.000238)
65     .1986728058    .2322009300    .253793    .115857    .107674    .118045
       (.0018421105)  (.0012944937)  (.001897)  (.000989)  (.000215)  (.000262)
70     .2288138955    .1496013496    .251293    .108626    .134125    .123655
       (.0019021751)  (.0012705637)  (.001864)  (.000203)  (.000320)  (.000268)
75     .2524003249    .1656385542    .252058    .226321    .113991    .124311
       (.0017059572)  (.0013387601)  (.001537)  (.001769)  (.000209)  (.000252)
80     .2232959249    .1154621504    .238145    .123070    .159104    .104494
       (.0016486241)  (.0008181671)  (.001686)  (.000887)  (.000408)  (.000719)
85     .1569167081    .1116721695    .197303    .038435    .100752    .160423
       (.0003712149)  (.0007421171)  (.000637)  (.000164)  (.000141)  (.000959)
90     .2498099645    .1551047337    .154748    .084016    .080884    .071267
       (.0014551335)  (.0003408055)  (.000339)  (.000483)  (.000459)  (.000064)
95     .2525238629    .1423488455    .237976    .038663    .021378    .055160
       (.0013016097)  (.0002657662)  (.000956)  (.000151)  (.000005)  (.000255)
100    .1352354632    .0786840925    .105801    .064531    .055326    .026498
       (.0008108742)  (.0004021173)  (.000132)  (.000305)  (.000245)  (.000081)

The figures in brackets refer to the MSE of the MLE of F(x) (F̃(x)) and those without brackets to the MSE of the UMVUE of F(x) (F̂(x)).

References

[1] Asrabadi, B. R.: Estimation in the Pareto distribution, Metrika, 1990, Vol. 37, 199-205.
[2] Dixit, U.J. and Jabbari Nooghabi, M.: Efficient estimation in the Pareto distribution, Statistical Methodology, 2010, Vol. 7(6), 687-691.
[3] He, Hui, Zhou, Na and Zhang, R.: On estimation for the Pareto distribution, Statistical Methodology, 2014, Vol. 21, 49-58.
[4] Olver, F.W., Lozier, D.W., Boisvert, R.F. et al.: NIST Handbook of Mathematical Functions, Cambridge University Press, New York, 2010.

Statistical Methodology 7 (2010) 687–691
Statistical Methodology
Efficient estimation in the Pareto distribution
U.J. Dixit, M. Jabbari Nooghabi ∗

Department of Statistics, University of Mumbai, Mumbai, India
Article history:
Received 28 July 2009; accepted 27 April 2010
Keywords:
Pareto distribution; Maximum likelihood estimator; Uniform minimum variance unbiased estimator; Probability density function; Cumulative distribution function

Abstract
The maximum likelihood estimators (MLE) of the probability density function (pdf) and cumulative distribution function (CDF) are derived for the Pareto distribution. It has been shown that the MLEs are more efficient than the uniform minimum variance unbiased estimators of the pdf and CDF.
1. Introduction
The Pareto distribution has been used in connection with studies of income, property values, insurance risk, migration, sizes of cities and firms, word frequencies, business mortality, service time in queuing systems, etc. The objective of this paper is to discuss efficient estimation of the probability density function (pdf) and cumulative distribution function (CDF) of the Pareto distribution, which has been one of the most distinguished candidates for the honor of explaining the distribution of incomes, assets, etc. We assume that the random variable X has the Pareto distribution with parameters α and k (known) and its pdf is

f(x) = α k^α / x^(α+1),  0 < k ≤ x,  α > 0,

and its CDF is

F(x) = 1 − (k/x)^α,  k ≤ x.

∗ Corresponding author.
E-mail addresses: [email protected] (U.J. Dixit), [email protected] (M. Jabbari Nooghabi).1572-3127/$ – see front matter © U.J. Dixit, M. Jabbari Nooghabi / Statistical Methodology 7 (2010) 687–691
In economics, where this distribution is used as an income distribution, k is some minimum income with a known value. Asrabadi [1] derived the uniformly minimum variance unbiased estimators (UMVUE) of the pdf, the CDF and the r-th moment. In general, we expect UMVU estimators to be better than MLEs. For the Pareto distribution, we show that the UMVU estimator of the parameter α is more efficient than the MLE, but for the pdf and CDF the ML estimators are biased and yet more efficient than the UMVUEs.
2. Maximum likelihood estimator
Let X_1, X_2, ..., X_n be a random sample of size n from the Pareto distribution. According to the ML method we obtain the MLE of α as α̃, where

α̃ = n (Σ_{i=1}^n ln(x_i/k))^(−1).

Therefore, by the invariance property of the MLE we can obtain the estimators of the pdf and CDF by replacing α with α̃ in the pdf and CDF, respectively. Then

f̃(x) = α̃ k^α̃ / x^(α̃+1),  α̃ > 0,  0 < k ≤ x,  (1)

F̃(x) = 1 − (k/x)^α̃,  0 < k ≤ x,  α̃ > 0.  (2)

We know that the pdf of S = Σ_{i=1}^n ln(X_i/k) is

g(s) = α^n s^(n−1) e^(−αs) / Γ(n),  s > 0,  (3)

and by using some elementary algebra, we can find the distribution of w = α̃ = n/S as

g(w) = (αn)^n / (Γ(n) w^(n+1)) exp{−αn/w},  w > 0.  (4)

Note. It is clear that the MLE of α is biased and MSE(α̃) = α²(n² + n − 2)/((n−1)²(n−2)).

Theorem 1. (A) f̃(x) is a biased estimator of f(x) and

E(f̃(x)) = (1/(Γ(n) x)) Σ_{j=0}^{n−2} ((αn)^(j+1)/j!) Γ(n−j−1) (ln(k/x))^j.  (5)

(B) F̃(x) is a biased estimator of F(x) and

E(F̃(x)) = 1 − (1/Γ(n)) Σ_{j=0}^{n−1} ((αn)^j/j!) Γ(n−j) (ln(k/x))^j.  (6)

Proof.
In case (A), we can easily find the expectation of f̃(x) by substituting the expansion (k/x)^w = e^(w ln(k/x)) = Σ_{j=0}^∞ w^j (ln(k/x))^j / j!. Also, the Gamma function is defined only for arguments greater than zero, so j must be less than (n−1), and the proof is complete. In case (B), the proof is similar to case (A). □

Theorem 2. (A)

MSE(f̃(x)) = (1/(Γ(n) x²)) Σ_{j=0}^{n−3} (2^j (αn)^(j+2)/j!) Γ(n−j−2) (ln(k/x))^j − (2α k^α/(Γ(n) x^(α+2))) Σ_{j=0}^{n−2} ((αn)^(j+1)/j!) Γ(n−j−1) (ln(k/x))^j + (α k^α / x^(α+1))².  (7)

(B)

MSE(F̃(x)) = (1/Γ(n)) Σ_{j=0}^{n−1} (2^j (αn)^j/j!) Γ(n−j) (ln(k/x))^j − (2(k/x)^α/Γ(n)) Σ_{j=0}^{n−1} ((αn)^j/j!) Γ(n−j) (ln(k/x))^j + (k/x)^(2α).  (8)

Proof.
In cases (A) and (B) we should find E(f̃²(x)) and E(F̃²(x)) as in the previous theorem, respectively. The proof is then completed by some elementary algebra. □
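As a numerical companion to Section 2 (the comment paper supplies R code; the sketch below is a hypothetical Python version, not the authors' program), the MLE α̃ = n/Σ ln(x_i/k) and the plug-in estimators (1)–(2) can be computed as:

```python
import math

def pareto_mle_alpha(xs, k):
    """MLE of alpha: alpha~ = n / sum(ln(x_i / k))."""
    return len(xs) / sum(math.log(x / k) for x in xs)

def pareto_mle_pdf(x, xs, k):
    """Plug-in MLE of the pdf, eq. (1): alpha~ k^alpha~ / x^(alpha~ + 1)."""
    a = pareto_mle_alpha(xs, k)
    return a * k ** a / x ** (a + 1)

def pareto_mle_cdf(x, xs, k):
    """Plug-in MLE of the CDF, eq. (2): 1 - (k / x)^alpha~."""
    a = pareto_mle_alpha(xs, k)
    return 1.0 - (k / x) ** a

# Illustrative (hypothetical) sample with k = 1:
sample = [2.0, 4.0, 8.0]
alpha_t = pareto_mle_alpha(sample, 1.0)  # 3 / (6 ln 2), about 0.7213
```

Here the sample values and variable names are illustrative assumptions only; the invariance property used in the text is exactly what `pareto_mle_pdf` and `pareto_mle_cdf` implement.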
3. MSE of UMVU estimator
Asrabadi [1] derived the UMVUEs of α, f(x) and F(x). Here the UMVUEs of α, f(x) and F(x) are denoted by α̂, f̂(x) and F̂(x), respectively. So

α̂ = (n−1) (ln(t) − n ln(k))^(−1),  (9)

f̂(x) = (n−1) [ln(t) − ln(x) − (n−1) ln(k)]^(n−2) / (x [ln(t) − n ln(k)]^(n−1)),  (10)

and

F̂(x) = 1 − [ln(t) − ln(x) − (n−1) ln(k)]^(n−1) / [ln(t) − n ln(k)]^(n−1),  (11)

where k ≤ x ≤ t k^(−(n−1)) and t = Π_{i=1}^n x_i is the observed value of T.

Theorem 3. (A)

MSE(f̂(x)) = ((n−1) α² k^α/(Γ(n−1) x^(α+2))) Σ_{j=0}^{n−3} C(2(n−2), j) α^j Γ(n−j−2) (−ln(x/k))^j × Σ_{i=0}^{n−3−j} (α ln(x/k))^i / i! − (α k^α / x^(α+1))²,  (12)

(B)

MSE(F̂(x)) = (k^α/(Γ(n) x^α)) Σ_{j=0}^{n−1} C(2(n−1), j) α^j Γ(n−j) (−ln(x/k))^j × Σ_{i=0}^{n−1−j} (α ln(x/k))^i / i! − (k/x)^(2α),  (13)

where C(n, k) = n!/(k!(n−k)!).

Proof.
In cases (A) and (B), we can obtain E(f̂²(x)) and E(F̂²(x)) by using the pdf of T that is given in [1]. In the process of calculating the integral we should note that

∫_k^∞ (z^(n−1) α^n / Γ(n)) e^(−αz) dz = Σ_{i=0}^{n−1} ((αk)^i / i!) e^(−αk).

Hence, the proof is complete. □
Note. One should note that MSE(α̂) = α²/(n−2).
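The UMVUEs (9)–(11) translate directly into code; the following is an assumed Python helper (not the authors' R program), with the quantity ln(t) − n ln(k) equal to the statistic S of Section 2:

```python
import math

def pareto_umvue(xs, k, x):
    """Asrabadi's UMVUEs (9)-(11) at a point x, using t = prod(x_i).

    Valid for k <= x <= t * k**(-(n - 1)).
    """
    n = len(xs)
    ln_t = sum(math.log(v) for v in xs)               # ln t
    den = ln_t - n * math.log(k)                      # ln t - n ln k = S
    num = ln_t - math.log(x) - (n - 1) * math.log(k)  # ln t - ln x - (n-1) ln k
    alpha_hat = (n - 1) / den                                 # eq. (9)
    f_hat = (n - 1) * num ** (n - 2) / (x * den ** (n - 1))   # eq. (10)
    F_hat = 1.0 - (num / den) ** (n - 1)                      # eq. (11)
    return alpha_hat, f_hat, F_hat
```

Note that num = S − ln(x/k), which is what makes these estimators functions of the complete sufficient statistic alone.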
Fig. 1.
Comparison of the MSE of the estimators of the pdf and CDF with respect to observations generated from the Pareto distribution.
4. Comparison of MLE and UMVUE
It is obvious that the UMVU estimator of α is more efficient than the MLE for any value of n. Now, in order to get an idea of the relative efficiency of the MLE and UMVUE of the pdf and CDF, we have generated samples of sizes 4(1)15(5)100 from the Pareto distribution, with α and k each varying over a grid of values. We have given graphs based on one thousand independent replications of each experiment (Fig. 1). From the graphs, it is seen that the MLEs of the pdf and CDF are more efficient than the UMVUEs.
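The simulation above can be sketched as follows (a hypothetical Python reimplementation with inverse-CDF sampling; the original study used the authors' own code with one thousand replications):

```python
import math, random

def simulate_mse(n, alpha, k, x, reps=1000, seed=1):
    """Monte Carlo MSEs of the MLE and UMVUE of F(x) for Pareto(alpha, k)."""
    rng = random.Random(seed)
    F_true = 1.0 - (k / x) ** alpha
    se_mle = se_umvue = 0.0
    for _ in range(reps):
        # Inverse-CDF sampling: X = k * U**(-1/alpha)
        xs = [k * rng.random() ** (-1.0 / alpha) for _ in range(n)]
        s = sum(math.log(v / k) for v in xs)
        z = math.log(x / k)
        F_mle = 1.0 - (k / x) ** (n / s)                  # plug-in MLE
        # Asrabadi's UMVUE; clamped to 1 outside its support (an assumption)
        F_umvue = 1.0 - max(1.0 - z / s, 0.0) ** (n - 1)
        se_mle += (F_mle - F_true) ** 2
        se_umvue += (F_umvue - F_true) ** 2
    return se_mle / reps, se_umvue / reps

mse_mle, mse_umvue = simulate_mse(n=20, alpha=1.5, k=1.0, x=2.0)
```

The seed, parameter values and the clamp outside the UMVUE's support are illustrative assumptions, not choices documented in the paper.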
Acknowledgements
The authors are thankful to the referees for their valuable comments.
References

[1] B.R. Asrabadi, Estimation in the Pareto distribution, Metrika 37 (1990) 199–205.

Statistical Methodology 21 (2014) 49–58
Statistical Methodology
On estimation for the Pareto distribution
Hui He, Na Zhou, Ruiming Zhang ∗

College of Science, Northwest A&F University, Yangling, Shaanxi 712100, PR China
Article history:
Received 1 June 2013; received in revised form 22 January 2014; accepted 12 March 2014
Keywords:
Pareto distribution; MLE; UMVUE; Probability density function; Reliability function

Abstract
In this work, we obtain the r-th raw moments of the estimators of the probability density function (PDF) and reliability function (RF) for the Pareto distribution under maximum likelihood estimation (MLE) and uniform minimum variance unbiased estimation (UMVUE). We derive some large-sample properties of the estimators, the MLE and UMVUE of the PDF as well as the RF. Two examples are provided to compute the efficient estimations of PDF and RF numerically. Our results indicate that there are no absolute superiorities of the MLEs over the UMVUEs of PDF and RF, and vice versa.
1. Introduction
We consider a random variable X having the Pareto distribution (PD) with PDF

f(x) = α k^α / x^(α+1),  (1)

and RF

G(x) = Prob{X > x} = (k/x)^α,  (2)

where α is a shape parameter (α > 0), k is a scale parameter (known), and x > k > 0; k usually represents some minimum income with a known value, see [1]. PD was applied by Pareto [7] to model the allocation of wealth among individuals and the distribution of incomes. It has been widely used in economics, insurance (general liability, commercial

∗ Corresponding author. Tel.: +86 13032906582.
E-mail addresses: [email protected] (H. He), [email protected] (N. Zhou), [email protected],[email protected] (R. Zhang).http://dx.doi.org/10.1016/j.stamet.2014.03.0021572-3127/ © H. He et al. / Statistical Methodology 21 (2014) 49–58 auto [9]), geography (sizes of human settlements [8]), physical sciences (sizes of sand particles ormeteorites [8], clusters of Bose–Einstein condensate near absolute zero [5]), chemical sciences (dis-tributions of electrolytic powder production [4]). Asrabadi [1] established the UMVUEs for the PDFand cumulative distribution function (CDF) of PD. Based on the work of Asrabadi [1], Dixit and JabbariNooghabi [2] tried to study the mean square errors (MSEs) of the MLEs and UMVUEs for the PDF andCDF of PD and their results seem to show that the MLEs are more efficient than the UMVUEs of PDFand CDF. Unfortunately, their work are seriously flawed. Most of their main claims in [2] are wrong,and their conclusion, the MLEs are more efficient than the UMVUEs of PDF and CDF, is unreasonable.We present our main results in Section 2. Most of the results in Section 2.1 are corrected versions ofthe wrong results of [2]. We also notice that the exact expressions of the MSEs of estimators of PDF andRF may not be useful in case of large scale samples and large scale numerical computations. For thisreason we have derived the asymptotic expressions of the r -th raw moments and MSEs in Section 2.2.Two numerical examples are provided in Section 2.3 to show how to compute the efficient estimationsof PDF and RF. In Section 2.4 we expose the fatal errors in [2].
2. Main results
As a notational convenience, let z = z(x) = log(x/k) and z_x = z_x(x) = dz/dx = 1/x through the rest of this paper. It is known that the UMVUEs of f(x) and G(x) are given by [1]

f̂(x) = ((n−1)/s) z_x (1 − z/s)^(n−2),  Ĝ(x) = (1 − z/s)^(n−1),

where z < s, s = Σ_{i=1}^n z(x_i), and s follows the Gamma distribution Ga(n, α). Note that the UMVUE of α is α̂ = (n−1)/s. The MLEs of f(x) and G(x) can be computed easily; they are

f̃(x) = α̃ z_x e^(−α̃z),  G̃(x) = e^(−α̃z),

where α̃ = n/s is the MLE of α. Note that the PDF of s is given by

h(s) = α^n s^(n−1) exp(−αs) / Γ(n).
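In this notation, Asrabadi's UMVUE of the RF written via t = Π x_i (as in [1,2]) and the (z, s) form above coincide, since ln t − n ln k = s and ln t − ln x − (n−1) ln k = s − z. A small (assumed) Python check of that identity:

```python
import math

def G_umvue_zs(xs, k, x):
    """(z, s) form: (1 - z/s)^(n-1), z = log(x/k), s = sum log(x_i/k)."""
    n = len(xs)
    s = sum(math.log(v / k) for v in xs)
    z = math.log(x / k)
    return (1.0 - z / s) ** (n - 1)

def G_umvue_t(xs, k, x):
    """t form: ([ln t - ln x - (n-1) ln k] / [ln t - n ln k])^(n-1)."""
    n = len(xs)
    ln_t = sum(math.log(v) for v in xs)
    num = ln_t - math.log(x) - (n - 1) * math.log(k)
    den = ln_t - n * math.log(k)
    return (num / den) ** (n - 1)
```

The data values in the check are arbitrary illustrative numbers with k below the sample minimum.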
For n > r > 0, the r-th raw moments of f̃(x) and G̃(x) are given by

E(f̃(x))^r = (2/Γ(n)) (nα z_x)^r (√(nrαz))^(n−r) K_{n−r}(2√(nrαz)),  (3)

E(G̃(x))^r = (2/Γ(n)) (√(nrαz))^n K_n(2√(nrαz)),  (4)

where K_ν(x) is the modified Bessel function [6].

Proof.
For the proof we just need to note the well-known integral representation [6]

K_ν(x) = (1/2)(x/2)^ν ∫_0^∞ exp(−t − x²/(4t)) dt/t^(ν+1). □

Corollary 1.
The mean square errors of f̃(x) and G̃(x) are given by

MSE(f̃(x)) = (2(nα z_x)²/Γ(n)) (√(2nαz))^(n−2) K_{n−2}(2√(2nαz)) − (4nα z_x/Γ(n)) f(x) (√(nαz))^(n−1) K_{n−1}(2√(nαz)) + f²(x),  (5)

MSE(G̃(x)) = (2/Γ(n)) (√(2nαz))^n K_n(2√(2nαz)) − (4/Γ(n)) G(x) (√(nαz))^n K_n(2√(nαz)) + G²(x).  (6)

Theorem 2.
For n > r > 0, the r-th raw moments of f̂(x) and Ĝ(x) are given by

E(f̂(x))^r = (α z_x (n−1))^(r−1) f(x) (Γ(nr−2r+1)/Γ(n−1)) U(nr−n−r+1, r−n+1, αz),  (7)

E(Ĝ(x))^r = (Γ(nr−r+1)/Γ(n)) G(x) U(nr−n−r+1, 1−n, αz),  (8)

where U(a, b, c) is the Kummer confluent hypergeometric function [6].

Proof.
Note that the Kummer confluent hypergeometric function has the integral representation [6]

U(a, b, c) = (1/Γ(a)) ∫_0^∞ t^(a−1) (1+t)^(b−a−1) e^(−ct) dt,

and the proof is completed by applying the Kummer transformation [6]

U(a, b, c) = c^(1−b) U(1+a−b, 2−b, c). □

Corollary 2.
The mean square errors of f̂(x) and Ĝ(x) are given by

MSE(f̂(x)) = α z_x f(x) (n−1) (Γ(2n−3)/Γ(n−1)) U(n−1, 3−n, αz) − f²(x),  (9)

MSE(Ĝ(x)) = G(x) (Γ(2n−1)/Γ(n)) U(n−1, 1−n, αz) − G²(x).  (10)

Figs. 1 and 2 illustrate the efficient estimators between the MLEs and UMVUEs of PDF and RF for several values of n and k, 0 < α < 10 and k < x < 50. In the graphs, the black areas indicate that the MLEs of PDF and RF are more efficient than the UMVUEs, while the white areas mean the UMVUEs of PDF and RF are more efficient than the MLEs, and there is no evidence that the black areas or the white areas will disappear from the first quadrant. Thus we conclude that the MLEs are not generally more efficient than the UMVUEs of PDF/RF, and vice versa. We also notice that Corollaries 1 and 2 can help us to obtain more efficient estimations, see Example 1.
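The exact Bessel-function expressions above can be sanity-checked numerically. The sketch below is an assumed Python verification (not part of either paper), using a simple trapezoid quadrature for K_ν; it compares eq. (4) with r = 1 against direct numerical integration of E[e^(−α̃z)] over the Gamma density of s:

```python
import math

def bessel_k(nu, x, m=20000, t_max=30.0):
    """K_nu(x) = integral over t in (0, inf) of exp(-x cosh t) cosh(nu t)."""
    h = t_max / m
    def g(t):
        # cosh(nu t) exp(-x cosh t), written to avoid overflow for large nu*t
        c = x * math.cosh(t)
        return 0.5 * (math.exp(nu * t - c) + math.exp(-nu * t - c))
    return h * (0.5 * (g(0.0) + g(t_max)) + sum(g(i * h) for i in range(1, m)))

def EG_bessel(n, alpha, z):
    """E[G~(x)] from eq. (4) with r = 1: (2/Gamma(n)) (n a z)^(n/2) K_n(2 sqrt(n a z))."""
    b = n * alpha * z
    return 2.0 / math.gamma(n) * b ** (n / 2.0) * bessel_k(n, 2.0 * math.sqrt(b))

def EG_direct(n, alpha, z, m=20000, s_max=60.0):
    """E[exp(-n z / s)] integrated against h(s) = a^n s^(n-1) e^(-a s) / Gamma(n)."""
    h = s_max / m
    c = alpha ** n / math.gamma(n)
    def g(s):
        return math.exp(-n * z / s - alpha * s) * c * s ** (n - 1)
    # the integrand vanishes at s = 0, so the trapezoid starts at the first node
    return h * (0.5 * g(s_max) + sum(g(i * h) for i in range(1, m)))
```

The quadrature step sizes and truncation points are assumptions chosen for moderate n; they are not tuned for the large-n regime that motivates Section 2.2.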
Corollaries 1 and 2 can be expediently used to calculate the MSEs of the estimators for a small sample. However, in practice we find that the corollaries are not expedient for a large sample or for large-scale numerical computation. For reasonably large n, direct numerical evaluation of Γ(2n−3), K_n(2√(nαz)) and U(n−1, 1−n, αz) will incur either overflow or underflow, and for large-scale numerical computation, symbolic computation runs for a very long time. Therefore, it is necessary to study the large-sample properties of the r-th raw moments and the asymptotic behaviors of the MSEs.

Lemma 1.
For fixed y > 0 and a ≥ 0,

K_ν(2√(y(ν+a))) = e^(−ν−a−y) √π (ν+a)^(ν/2) (2ν y^ν)^(−1/2) Σ_{i=0}^∞ a_i(a, y)/ν^i,  (11)

as ν → +∞, where a_0(a, y) = 1 and the higher-order coefficients a_1(a, y), a_2(a, y) are polynomials in a and y.

Fig. 1. Efficient estimators between MLE and UMVUE of PDF: panels (a)–(f) shade, in the (x, α) plane, the region where MSE(f̃) − MSE(f̂) ≤ 0, for three sample sizes (up to n = 30) and k = 1, 5.

Fig. 2. Efficient estimators between MLE and UMVUE of RF: panels (a)–(f) shade, in the (x, α) plane, the region where MSE(G̃) − MSE(Ĝ) ≤ 0, for the same values of n and k.
Lemma 2.
For fixed y > 0, r > 1 and +∞ > a > −∞,

Γ(nr−n−r+1) U(nr−n−r+1, ar−r−n+2, y) = (r−1)^(nr−n−r+1/2) e^(y−ry) r^(ar−nr+1/2) √(2π) n^(−1/2) Σ_{i=0}^∞ b_i(a, r, y)/n^i,  (12)

as n → +∞, where b_0(a, r, y) = 1, and for r = 2 the coefficients b_1(a, 2, y) and b_2(a, 2, y) are polynomials in a and y. Note that the proofs of Lemmas 1 and 2 are very lengthy, so we only provide the results here. For interested readers, please see the related sections on Laplace's method in [10].
Theorem 3.
If n → ∞, then the following formulas hold: (a) E(f̃(x))^r = f^r(x) + O(n^(−1)). (b) E(G̃(x))^r = G^r(x) + O(n^(−1)). (c) E(f̂(x))^r = f^r(x) + O(n^(−1)). (d) E(Ĝ(x))^r = G^r(x) + O(n^(−1)).

Proof.
Cases (a) and (b) are obtained by applying Lemma 1 and Stirling's formula [6]; the proofs for cases (c) and (d) are similar, applying Lemma 2 instead of Lemma 1. □
Theorem 4.
For n → ∞, we have the following results:

(a) MSE(f̃(x)) = f²(x) [(αz−1)² n^(−1) + p_1(αz) n^(−2) + O(n^(−3))],
(b) MSE(G̃(x)) = G²(x) [(αz)² n^(−1) + q_1(αz) n^(−2) + O(n^(−3))],
(c) MSE(f̂(x)) = f²(x) [(αz−1)² n^(−1) + p_2(αz) n^(−2) + O(n^(−3))],
(d) MSE(Ĝ(x)) = G²(x) [(αz)² n^(−1) + q_2(αz) n^(−2) + O(n^(−3))],

where p_1, p_2 are polynomials of degree 4 and q_1, q_2 are polynomials of degree 3 in their argument x > 0, with coefficients determined by Lemmas 1 and 2.

Proof.
The proof of Theorem 4 is similar to the proof of Theorem 3. □
Next, we discuss the efficient estimations under large samples. By Theorem 4 we have

MSE(f̃(x)) − MSE(f̂(x)) ≈ (f²(x)/n²) p(αz),  MSE(G̃(x)) − MSE(Ĝ(x)) ≈ (G²(x)/n²) q(αz),

where p(αz) = p_1(αz) − p_2(αz) and q(αz) = q_1(αz) − q_2(αz) are polynomials of degree 4 and 3, respectively. We notice that the algebraic signs of p(αz) and q(αz) approximately determine the more efficient estimators of PDF and RF when n is large.

Corollary 3.
For sufficiently large n, we have the following results. On the two disjoint closed intervals of αz where p(αz) ≤ 0, the MLE is more efficient than the UMVUE of the PDF; on the complementary open intervals where p(αz) > 0, the UMVUE is more efficient than the MLE of the PDF. On the closed interval of αz where q(αz) ≤ 0, the MLE is more efficient than the UMVUE of the RF; where q(αz) > 0, the UMVUE is more efficient than the MLE of the RF.

Example 1.
Efficient estimator in the small sample.

We use the Dyer [3] annual wage data (in multiples of 10,000 US dollars) to illustrate our results. Here we suppose that the minimum wage is 10,000 US dollars, so that the pertinent quantities are n = 30, k = 1, and α taken as its MLE. Further, solving

MSE(f̃(x)) − MSE(f̂(x)) = 0

gives four positive roots x_1 < x_2 < x_3 < x_4. Therefore, if x ∈ [x_1, x_2] ∪ [x_3, x_4], f̃(x) is more efficient than f̂(x); if x ∈ (0, x_1) ∪ (x_2, x_3) ∪ (x_4, ∞), f̂(x) is more efficient than f̃(x). Similarly, solving

MSE(G̃(x)) − MSE(Ĝ(x)) = 0

gives two positive roots x_5 < x_6. Hence, when x ∈ [x_5, x_6], G̃(x) is more efficient than Ĝ(x); when x ∈ (0, x_5) ∪ (x_6, ∞), Ĝ(x) is more efficient than G̃(x).

Example 2.
Efficient estimator in the large sample.

To compare with our first example, we let k and α be the same as in the first example and apply Corollary 3. The conclusions mirror Example 1: f̃(x) is more efficient than f̂(x) on two closed intervals of x determined by the sign of p(αz), and f̂(x) is more efficient on the complementary open intervals; likewise, G̃(x) is more efficient than Ĝ(x) on the closed interval where q(αz) ≤ 0, and Ĝ(x) is more efficient elsewhere.

The errors of the main results of [2] can be seen clearly from the following simple numerical calculation. For several combinations of (n, k, α, x), the mathematical expectation and MSE values computed from the formulas of [2] are listed in Table 1, and some of them are negative, which is clearly absurd. It is not hard to see that Theorems 1–3 of [2] are all wrong: their Theorem 1 gives the mathematical expectation expressions of f̃(x) and F̃(x); their Theorem 2 gives the MSE expressions of f̃(x) and F̃(x); and their Theorem 3 gives the MSE expressions of f̂(x) and F̂(x). Furthermore, all of our numerical simulations (Figs. 1 and 2, Examples 1 and 2) also show that the main conclusion of [2], that the MLEs are more efficient than the UMVUEs of PDF and CDF, is false.
3. Conclusion
We have studied efficient estimation in the PD in this work. Our results show that the efficient estimations of the PDF and RF of the PD depend on four variables (n, k, α, x). Let g_1(x), ..., g_m(x) denote different estimations (MLE, UMVUE, Bayesian estimation, etc.) of the PDF or RF; we construct the following
Table 1
Numerical values of E(f̃(x)), E(F̃(x)), MSE(f̃(x)), MSE(F̃(x)), MSE(f̂(x)) and MSE(F̂(x)) for several combinations of (n, k, α, x). Note that the values are calculated by using the results of Dixit and Jabbari Nooghabi [2].

(n, k, α, x) | E(f̃) | E(F̃) | MSE(f̃) | MSE(F̃) | MSE(f̂) | MSE(F̂)

estimation

ḡ(x) = g_j(x),  if x ∈ {x | MSE(g_j(x)) ≤ MSE(g_i(x)) for all i = 1, ..., m},  j = 1, ..., m,

where the parameters n, k, α are given. As an estimator of the PDF or RF, it is more efficient than all of the g_i(x), i = 1, ..., m. It is also clear that ḡ(x) may have discontinuities.

Acknowledgments
The authors are thankful to the Editor-in-Chief, Associate Editor(s), and Reviewers for their valuable comments and suggestions.
Appendix. Proofs associated with Theorems 1–4 and Lemmas 1–2

Proof of Theorem 1.

E(f̃(x))^r = ∫_0^∞ (f̃(x))^r h(s) ds
= ∫_0^∞ ((n/s) z_x e^(−nz/s))^r (α^n s^(n−1) exp(−αs)/Γ(n)) ds
= ((αn z_x)^r/Γ(n)) ∫_0^∞ (αs)^(n−r−1) exp(−αs − αnzr/(αs)) d(αs)
= ((αn z_x)^r/Γ(n)) ∫_0^∞ t^(n−r−1) exp(−t − (√(αnzr))²/t) dt
= (2/Γ(n)) (nα z_x)^r (√(nrαz))^(n−r) K_{n−r}(2√(nrαz)).

Eq. (4) can be proved similarly, and the proof of Theorem 1 is complete.
Proof of Theorem 2.

E(f̂(x))^r = ∫_z^∞ (f̂(x))^r h(s) ds
= ∫_z^∞ (((n−1)/s) z_x (1 − z/s)^(n−2))^r (α^n s^(n−1) exp(−αs)/Γ(n)) ds
= ((α(n−1) z_x)^r/Γ(n)) ∫_z^∞ (αs)^(n−r−1) (1 − αz/(αs))^((n−2)r) exp(−αs) d(αs)
= ((α(n−1) z_x)^r/Γ(n)) ∫_{αz}^∞ t^((n−1)(1−r)) (t − αz)^((n−2)r) exp(−t) dt
= ((α(n−1) z_x)^r/Γ(n)) e^(−αz) ∫_0^∞ (t + αz)^((n−1)(1−r)) t^((n−2)r) exp(−t) dt
= ((α(n−1) z_x)^r/Γ(n)) e^(−αz) (αz)^(n−r) ∫_0^∞ (1 + t)^((n−1)(1−r)) t^((n−2)r) exp(−αz t) dt
= ((α(n−1) z_x)^r/Γ(n)) e^(−αz) (αz)^(n−r) Γ(nr−2r+1) U(nr−2r+1, n−r+1, αz)
= (α z_x (n−1))^(r−1) f(x) (Γ(nr−2r+1)/Γ(n−1)) U(nr−n−r+1, r−n+1, αz).

Eq. (8) can be proved similarly.
Proof of Lemma 1.

K_ν(2√(y(ν+a))) = (1/2)(y(ν+a))^(ν/2) ∫_0^∞ t^(−ν−1) exp(−t − y(ν+a)/t) dt
= (1/2)(y(ν+a))^(ν/2) ∫_0^∞ exp(−t − ya/t) exp(−ν(log t + y/t)) dt/t,

where y > 0, a ≥ 0, ν > 0. Let f(t) = log t + y/t; then f'(t) = 1/t − y/t² and f''(t) = −1/t² + 2y/t³, so f(t) has a unique minimum 1 + log y at t = y. We then obtain Eq. (11) by Laplace's method [10]. This completes the proof of Lemma 1.

Proof of Lemma 2.

Γ(nr−n−r+1) U(nr−n−r+1, ar−r−n+2, y) = ∫_0^∞ t^(nr−n−r) (1+t)^(ar−nr) e^(−yt) dt
= ∫_0^∞ t^(n(r−1)) (1+t)^(−nr) (1+t)^(ar) t^(−r) e^(−yt) dt
= ∫_0^∞ exp(−n((1−r) log t + r log(1+t))) (1+t)^(ar) t^(−r) e^(−yt) dt,

where y > 0, r > 1, +∞ > a > −∞. Let f(t) = (1−r) log t + r log(1+t); then f'(t) = r(1+t)^(−1) − (r−1)t^(−1). It is clear that f(t) has a unique minimum r log r − (r−1) log(r−1) at t = r−1. We then obtain Eq. (12) by Laplace's method [10]. This completes the proof of Lemma 2.
Proof of Theorem 3.
In cases (a) and (b), the results can be easily obtained by applying Lemma 1 and Stirling's formula

Γ(n) = √(2π) n^(n−1/2) e^(−n) Σ_{i=0}^∞ c_i/n^i,  c_0 = 1, c_1 = 1/12, c_2 = 1/288.

In cases (c) and (d), the results can be derived by using Lemma 2 and Stirling's formula.
Proof of Theorem 4.
The proof of Theorem 4 is similar to the proof of Theorem 3.
References

[1] B.R. Asrabadi, Estimation in the Pareto distribution, Metrika 37 (1990) 199–205.
[2] U.J. Dixit, M. Jabbari Nooghabi, Efficient estimation in the Pareto distribution, Stat. Methodol. 7 (2010) 687–691.
[3] D. Dyer, Structural probability bounds for the strong Pareto law, Canad. J. Statist. 9 (1981) 71–77.
[4] T.Z. Fahidy, Applying Pareto distribution theory to electrolytic powder production, Electrochem. Commun. 13 (2011) 262–264.
[5] Y. Ijiri, H.A. Simon, Some distributions associated with Bose–Einstein statistics, Proc. Natl. Acad. Sci. 72 (1975) 1654–1657.
[6] F.W. Olver, D.W. Lozier, R.F. Boisvert, et al., NIST Handbook of Mathematical Functions, Cambridge University Press, New York, 2010.
[7] V. Pareto, Cours d'Economie Politique, Librairie Droz, Geneva, 1964.
[8] W.J. Reed, M. Jorgensen, The double Pareto-lognormal distribution: a new parametric model for size distributions, Comm. Statist. Theory Methods 33 (2004) 1733–1753.
[9] H.L. Seal, Survival probabilities based on Pareto claim distributions, Astin Bull. 11 (1980) 61–72.
[10] R. Wong, Asymptotic Approximations of Integrals, Academic Press, Boston, 1989.