Truncated, Censored, and Actuarial Payment-type Moments for Robust Fitting of a Single-parameter Pareto Distribution
Chudamani Poudyal
Department of Mathematics, Tennessee Technological University
October 12, 2020
Abstract. Under some regularity conditions, maximum likelihood estimators (MLEs) always produce asymptotically optimal (in the sense of consistency, efficiency, sufficiency, and unbiasedness) estimators. In general, however, MLEs lead to non-robust statistical inference, for example, for pricing models and risk measures. Actuarial claim severity is continuous, right-skewed, and frequently heavy-tailed. The data sets that such models are usually fitted to contain outliers that are difficult to identify and separate from genuine data. Moreover, due to commonly used actuarial "loss control strategies" in the financial and insurance industries, the random variables we observe and wish to model are affected by truncation (due to deductibles), censoring (due to policy limits), scaling (due to coinsurance proportions), and other transformations. To alleviate the lack of robustness of MLE-based inference in risk modeling, in this paper we propose and develop a new method of estimation, the method of truncated moments (MTuM), and generalize it to different scenarios of loss control mechanisms. Various asymptotic properties of the resulting estimators are established using central limit theory. New connections between different estimators are found. A comparative study of the newly designed methods with the corresponding MLEs is performed. A detailed investigation, including a simulation study, is carried out for the single-parameter Pareto loss model.
Keywords & Phrases. Claim Severity; Deductible; Relative Efficiency; Loss Models; Robust Estimation; Truncated and Censored Moments.

Chudamani Poudyal, Ph.D., is an Assistant Professor in the Department of Mathematics, Tennessee Technological University, Cookeville, TN 38505, USA. e-mail: [email protected]

1 Introduction
The research leading to the results of this work is motivated by the search for trade-offs between robustness and efficiency of parametric estimators for ground-up continuous loss distributions. Parametric statistical loss models for insurance claim severity are continuous, right-skewed, and frequently heavy-tailed [15]. The data sets that such models are usually fitted to contain outliers that are difficult to identify and separate from genuine data. As a result, a small perturbation in the assumed model away from the unknown true underlying parametric model can produce a significant difference in statistical inference. In practice, due to commonly used loss control mechanisms in the financial and insurance industries [8], the random variables we observe and wish to model are affected by data truncation (due to deductibles), censoring (due to policy limits), and scaling (due to coinsurance factors). Maximum likelihood estimators (MLEs) typically result in sensitive loss severity models when there is a small perturbation in the underlying assumed model or when the observed sample comes from a contaminated distribution [23]. The implementation of MLE procedures, even on ground-up loss data, is computationally challenging [9, 16]. This issue is even more evident when one tries to fit complicated multi-parameter models such as mixtures of Erlangs [19, 25]. Thus, besides many ideas from the mainstream robust statistics literature [see, e.g., 10, 13, 14], actuaries have to deal with heavy-tailed and skewed distributions, data truncation and censoring, identification and recycling of outliers, aggregate losses, etc. Based on a general class of L-statistics [4], two broad classes of robust estimators, the method of trimmed moments (MTM) [1] and the method of winsorized moments (MWM) [26], have recently been developed with actuarial applications in view.
Therefore, it is appealing to search for estimation procedures that work directly with the loss control mechanisms mentioned above and are insensitive to outliers. If a truncated (singly or doubly) normal sample is available, then MLE procedures for such data were developed by [6], and method-of-truncated-moments estimators can be found in [7] and [22]. But the goal and motivation of this research work are different and were initially proposed by this author in [18]. That is, instead of truncated sample data, we assume that a complete ground-up sample of loss data is available, i.e., the data set is neither truncated nor censored, and we propose and develop robust estimation procedures for the corresponding ground-up loss severity models. Instead of trimming or winsorizing fixed lower (say, 2%) and upper (say, 3%) proportions of the observed sample, in this paper we develop a novel fixed lower and upper threshold method of truncated moments (MTuM) approach, in which the tail probabilities are random. Depending on the nature of the loss data mentioned above, some variants of MTuM, called the method of censored moments (MCM) and the method of actuarial payment-type moments (MTCM), will be defined for the single-parameter Pareto distribution; see Figure 3.3. Asymptotic distributional properties, such as normality and consistency, along with the asymptotic relative efficiency of these estimators with respect to the corresponding MLEs, are established. Several theoretical connections between the different approaches are also discovered. The newly designed procedures work like the standard method of moments, but instead of classical moments they use truncated or censored moments of a completely observed sample. Irrespective of the heaviness of the underlying distribution, threshold truncated and censored moments are always finite. The remainder of the paper is organized as follows.
In Section 2, the newly proposed MTuM estimation procedure is defined in general, and the corresponding asymptotic distributional properties are established. In Section 3, we develop specific formulas for the different estimators (including MTuM) when the underlying loss distribution is Pareto I, which is equivalent to an exponential distribution, and compare the asymptotic relative efficiency of all the estimators with respect to the corresponding MLEs for completely observed data. Several connections among the different estimators are established. Section 4 summarizes a detailed simulation study of the estimators developed in this paper. Concluding remarks are offered in Section 5. Finally, some additional results are provided in Appendices A and B.

2 Method of Truncated Moments (MTuM)

We assume that complete ground-up loss data are available, i.e., the data set is neither truncated nor censored. Then, instead of trimming or winsorizing fixed proportions from both tails of a completely observed sample, as investigated by [1, 26], in this approach to parametric estimation we truncate the data from below at a lower threshold and from above at an upper threshold, and then apply the method of moments to the remaining data. We call such an approach the method of truncated moments (MTuM).

2.1 Definition
Let $X_1, X_2, \ldots, X_n$ be i.i.d. random variables with common ground-up cdf $F(\cdot \mid \boldsymbol{\theta})$, where $\boldsymbol{\theta} := (\theta_1, \ldots, \theta_k)$, $k \geq 1$, is the parameter vector to be estimated. The truncated moment estimators of $\theta_1, \ldots, \theta_k$ are computed according to the following procedure.

(i) The sample truncated moments are computed as
$$ \widehat{\mu}_j = \frac{\sum_{i=1}^{n} h_j(X_i)\,\mathbf{1}\{d_j < X_i \leq u_j\}}{\sum_{i=1}^{n} \mathbf{1}\{d_j < X_i \leq u_j\}}, \qquad 1 \leq j \leq k, \tag{2.1} $$
where $\mathbf{1}\{\cdot\}$ denotes the indicator function. The $h_j$'s in (2.1) are specially chosen functions, and the thresholds $d_j$ and $u_j$ are chosen by the researcher. In general, it is reasonable to assume that $X_{1:n} \leq d_j < u_j \leq X_{n:n}$ for all $1 \leq j \leq k$, where $X_{1:n}$ and $X_{n:n}$ are, respectively, the smallest and the largest order statistics of the sample.

(ii) Derive the corresponding population truncated moments as
$$ \mu_j(\theta_1, \ldots, \theta_k) = \mathbb{E}[h_j(X) \mid d_j < X \leq u_j] = \frac{\mathbb{E}\big[h_j(X)\,\mathbf{1}\{d_j < X \leq u_j\}\big]}{\mathbb{P}(d_j < X \leq u_j)} = \frac{\int_{d_j}^{u_j} h_j(x) f(x \mid \boldsymbol{\theta})\, dx}{F(u_j \mid \boldsymbol{\theta}) - F(d_j \mid \boldsymbol{\theta})}, \qquad 1 \leq j \leq k. \tag{2.2} $$

(iii) Now, match the sample and population truncated moments from (2.1) and (2.2) to get the following system of equations for $\theta_1, \ldots, \theta_k$:
$$ \begin{cases} \mu_1(\theta_1, \ldots, \theta_k) = \widehat{\mu}_1, \\ \qquad \vdots \\ \mu_k(\theta_1, \ldots, \theta_k) = \widehat{\mu}_k. \end{cases} \tag{2.3} $$

Definition 2.1.
A solution to the system of equations (2.3), say $\widehat{\boldsymbol{\theta}} = \big(\widehat{\theta}_1, \ldots, \widehat{\theta}_k\big)$, if it exists, is called the method of truncated moments (MTuM) estimator of $\boldsymbol{\theta}$. Thus, $\widehat{\theta}_j =: g_j(\widehat{\mu}_1, \ldots, \widehat{\mu}_k)$, $1 \leq j \leq k$, are the MTuM estimators of $\theta_1, \ldots, \theta_k$. Note 2.1.
Obviously, it is possible that the system of equations (2.3) has no solution, or that it is difficult to solve even with numerical methods when $k$ is large. To mitigate this issue, the functions $h_j$ have to be chosen carefully. But most claim severity distributions have a small number $k$ of parameters, usually not exceeding three [see 15, Appendix A].

2.2 Asymptotic Properties

For $1 \leq j, j' \leq k$ and for any positive integer $n$, define $\mathbf{1}\{d_{jj'} < X \leq u_{jj'}\} := \mathbf{1}\{d_j < X \leq u_j\}\,\mathbf{1}\{d_{j'} < X \leq u_{j'}\}$ and consider the following additional notations:
$$ \begin{aligned} Z_j &:= h_j(X), & h_{jj'}(x) &:= h_j(x)\, h_{j'}(x), & p_j &:= F(u_j \mid \boldsymbol{\theta}) - F(d_j \mid \boldsymbol{\theta}), \\ Y_{jj'} &:= Y_j Y_{j'}, & Y_j &:= Z_j\,\mathbf{1}\{d_j < X \leq u_j\}, & p_{jj'} &:= F(u_{jj'} \mid \boldsymbol{\theta}) - F(d_{jj'} \mid \boldsymbol{\theta}), \\ r_j &:= h_j(d_j), & R_j &:= h_j(u_j), & W_{jj'} &:= Z_j\,\mathbf{1}\{d_{jj'} < X \leq u_{jj'}\}, \end{aligned} $$
and $p_{j,n} := F_n(u_j) - F_n(d_j)$, where $F_n(x) = \frac{1}{n}\sum_{i=1}^{n}\mathbf{1}\{X_i \leq x\}$ is the empirical distribution function. Note that $Y_{jj'} = Y_{j'j}$ but, in general, $W_{jj'} \neq W_{j'j}$ for $j \neq j'$. With these notations, the density of $Y_j$, $1 \leq j \leq k$, can be expressed as
$$ f_{Y_j}(y) = \begin{cases} 1 - F_{Z_j}(R_j \mid \boldsymbol{\theta}) + F_{Z_j}(r_j \mid \boldsymbol{\theta}), & y = 0; \\ f_{Z_j}(y \mid \boldsymbol{\theta}), & r_j < y < R_j; \\ 0, & \text{otherwise}. \end{cases} $$
The densities of the random variables $Y_{jj'} = Y_{j'j}$ and $W_{jj'}$ can be constructed from the four possible scenarios listed in Appendix A. To establish the asymptotic distribution of $\widehat{\boldsymbol{\mu}}$, we need the following lemma. Lemma 2.1.
For $1 \leq j, j' \leq k$,
$$ \mathrm{Cov}\big(Y_j, Y_{j'}\big) = \mu_{Y_{jj'}} - \mu_{Y_j}\mu_{Y_{j'}}, \qquad \mathrm{Cov}\big(Y_j, p_{j',1}\big) = \mu_{W_{jj'}} - \mu_{Y_j}\, p_{j'}, \qquad \mathrm{Cov}\big(p_{j,1}, p_{j',1}\big) = p_{jj'} - p_j\, p_{j'}. $$

Consider the $2k$-dimensional random vector $\mathbf{V} := (Y_1, \ldots, Y_k, p_{1,1}, \ldots, p_{k,1})$. Clearly, the mean vector of $\mathbf{V}$ is $\boldsymbol{\mu}_{\mathbf{V}} = (\mu_{Y_1}, \ldots, \mu_{Y_k}, p_1, \ldots, p_k)$ and, by Lemma 2.1, the variance-covariance matrix is $\boldsymbol{\Sigma}_{\mathbf{V}} = \big[\sigma_{\mathbf{V},jj'}\big]_{j,j'=1}^{2k}$, where
$$ \sigma_{\mathbf{V},jj'} = \begin{cases} \mu_{Y_{jj'}} - \mu_{Y_j}\mu_{Y_{j'}}, & 1 \leq j, j' \leq k; \\ \mu_{W_{j(j'-k)}} - \mu_{Y_j}\, p_{j'-k}, & 1 \leq j \leq k,\ k+1 \leq j' \leq 2k; \\ \mu_{W_{(j-k)j'}} - \mu_{Y_{j'}}\, p_{j-k}, & 1 \leq j' \leq k,\ k+1 \leq j \leq 2k; \\ p_{(j-k)(j'-k)} - p_{j-k}\, p_{j'-k}, & k+1 \leq j, j' \leq 2k. \end{cases} $$

Theorem 2.1.
The empirical estimator
$$ \widehat{\boldsymbol{\mu}}_{\mathbf{V}} := \Big( \tfrac{1}{n}\textstyle\sum_{i=1}^{n} Y_{1,i}, \ldots, \tfrac{1}{n}\sum_{i=1}^{n} Y_{k,i},\ \tfrac{1}{n}\sum_{i=1}^{n} p_{1,i}, \ldots, \tfrac{1}{n}\sum_{i=1}^{n} p_{k,i} \Big) = \big( \bar{Y}_{1,n}, \ldots, \bar{Y}_{k,n},\ p_{1,n}, \ldots, p_{k,n} \big) $$
of the mean vector $\boldsymbol{\mu}_{\mathbf{V}}$ satisfies $\widehat{\boldsymbol{\mu}}_{\mathbf{V}} \sim \mathcal{AN}\big(\boldsymbol{\mu}_{\mathbf{V}}, \tfrac{1}{n}\boldsymbol{\Sigma}_{\mathbf{V}}\big)$.

Proof. Let $\{\mathbf{V}_n\}$ be a sequence of i.i.d. copies of $\mathbf{V}$; then, by the multivariate central limit theorem [see, e.g., 21, Theorem B, p. 28], we have
$$ \big( \bar{Y}_{1,n}, \ldots, \bar{Y}_{k,n},\ p_{1,n}, \ldots, p_{k,n} \big) = \frac{1}{n}\sum_{i=1}^{n} \mathbf{V}_i \sim \mathcal{AN}\Big(\boldsymbol{\mu}_{\mathbf{V}}, \frac{1}{n}\boldsymbol{\Sigma}_{\mathbf{V}}\Big). $$

The system of MTuM equations (2.3) can now be written as
$$ \mu_1(\theta_1, \ldots, \theta_k) = \widehat{\mu}_1 = \frac{\bar{Y}_{1,n}}{p_{1,n}}, \quad \ldots, \quad \mu_k(\theta_1, \ldots, \theta_k) = \widehat{\mu}_k = \frac{\bar{Y}_{k,n}}{p_{k,n}}. \tag{2.4} $$

Lemma 2.2.
Consider the function $g_{\mathbf{V}} : \mathbb{R}^{2k} \to \mathbb{R}^{k}$ defined, for $\mathbf{x} = (x_1, x_2, \ldots, x_{2k})$, by
$$ g_{\mathbf{V}}(\mathbf{x}) = \big(g_1(\mathbf{x}), \ldots, g_k(\mathbf{x})\big) := \Big( \frac{x_1}{x_{k+1}}, \ldots, \frac{x_k}{x_{2k}} \Big), \qquad \text{where } x_i \neq 0 \text{ for } i = k+1, \ldots, 2k. $$
Then $g_{\mathbf{V}}$ is totally differentiable at any such point $\mathbf{x} \in \mathbb{R}^{2k}$.

Proof. The proof follows directly from [21, Lemma 1.12.2].

With the help of Theorem 2.1 and Lemma 2.2, we are now ready to state the asymptotic distribution of the truncated sample moment vector $\widehat{\boldsymbol{\mu}}$, whose proof can be found in Appendix B. Theorem 2.2.
The asymptotic joint distribution of the truncated sample moment vector $(\widehat{\mu}_1, \ldots, \widehat{\mu}_k)$ is $\mathcal{AN}\big(\boldsymbol{\mu}, \tfrac{1}{n}\boldsymbol{\Sigma}\big)$ with $\boldsymbol{\Sigma} = \mathbf{D}_{\mathbf{V}}\boldsymbol{\Sigma}_{\mathbf{V}}\mathbf{D}_{\mathbf{V}}' =: \big[\sigma_{jj'}\big]_{k \times k}$, where
$$ \sigma_{jj'} = \frac{1}{p_{j'}}\left( \frac{\mu_{Y_{jj'}} - \mu_{Y_j}\mu_{Y_{j'}}}{p_j} - \frac{\mu_{Y_j}\big(\mu_{W_{j'j}} - \mu_{Y_{j'}}\, p_j\big)}{p_j^2} \right) - \frac{\mu_{Y_{j'}}}{p_{j'}^2}\left( \frac{\mu_{W_{jj'}} - \mu_{Y_j}\, p_{j'}}{p_j} - \frac{\mu_{Y_j}\big(p_{jj'} - p_j\, p_{j'}\big)}{p_j^2} \right). $$

Now, with $\widehat{\boldsymbol{\mu}} = (\widehat{\mu}_1, \ldots, \widehat{\mu}_k)$ and $g_{\boldsymbol{\theta}}(\widehat{\boldsymbol{\mu}}) = \big(g_{1,\boldsymbol{\theta}}(\widehat{\boldsymbol{\mu}}), \ldots, g_{k,\boldsymbol{\theta}}(\widehat{\boldsymbol{\mu}})\big) = \widehat{\boldsymbol{\theta}}$, the delta method [see, e.g., 21, Theorem A, p. 122] yields the following main result of this section. Theorem 2.3.
The MTuM estimator $\widehat{\boldsymbol{\theta}}$ of $\boldsymbol{\theta}$ has the following asymptotic distribution:
$$ \widehat{\boldsymbol{\theta}} = \big(\widehat{\theta}_1, \ldots, \widehat{\theta}_k\big) \sim \mathcal{AN}\Big(\boldsymbol{\theta}, \frac{1}{n}\,\mathbf{D}\boldsymbol{\Sigma}\mathbf{D}'\Big), $$
where the Jacobian $\mathbf{D}$ is given by $\mathbf{D} = \Big[\frac{\partial g_{j,\boldsymbol{\theta}}}{\partial \widehat{\mu}_{j'}}\Big|_{\widehat{\boldsymbol{\mu}} = \boldsymbol{\mu}}\Big]_{k \times k} =: [d_{jj'}]_{k \times k}$, and the variance-covariance matrix $\boldsymbol{\Sigma}$ has the same form as in Theorem 2.2.

Note 2.2. In view of the above derivations, we notice that data trimming, and thus the method of trimmed moments (MTM) investigated by [1], can be interpreted as a special case of data truncation, and thus of MTuM. To see this, let $F$ be the distribution function of $X$. For $1 \leq j \leq k$, set $F(d_j \mid \boldsymbol{\theta}) = a_j$ and $F(u_j \mid \boldsymbol{\theta}) = 1 - b_j$. Then, using integration by substitution with $U = F(X)$, equation (2.2) becomes
$$ \mu_j(\theta_1, \ldots, \theta_k) = \frac{\int_{d_j}^{u_j} h_j(x) f(x \mid \boldsymbol{\theta})\, dx}{F(u_j \mid \boldsymbol{\theta}) - F(d_j \mid \boldsymbol{\theta})} = \frac{\int_{F(d_j \mid \boldsymbol{\theta})}^{F(u_j \mid \boldsymbol{\theta})} h_j\big(F^{-1}(u \mid \boldsymbol{\theta})\big)\, du}{F(u_j \mid \boldsymbol{\theta}) - F(d_j \mid \boldsymbol{\theta})} \tag{2.5a} $$
$$ = \frac{\int_{a_j}^{1-b_j} h_j\big(F^{-1}(u \mid \boldsymbol{\theta})\big)\, du}{1 - a_j - b_j}, \tag{2.5b} $$
which is equivalent to the corresponding population trimmed moment. Note 2.3.
For estimation purposes, these two approaches (i.e., MTM and MTuM) are very different. With the MTuM approach, the limits of integration as well as the denominator in equation (2.5a) are unknown, which creates technical complications when we want to assess the asymptotic properties of MTuM estimators. On the other hand, with the MTM approach, both the limits of integration and the denominator in equation (2.5b) are constants, which simplifies matters significantly. Indeed, as is evident from the complete data examples in [1] and [26], MTM leads to explicit formulas for all location-scale families and their variants, but that is not the case for MTuM. In view of this, we will consider the MTuM approach further only for some data scenarios, but not all.
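As a concrete illustration of steps (i)-(iii), the following Python sketch (with hypothetical helper names, not from the paper) computes the sample truncated moment (2.1) and solves the one-dimensional system (2.3) by bisection. For concreteness it assumes $k = 1$, $h(x) = x$, and an exponential model, for which the population moment (2.2) has a closed form; the monotonicity that makes bisection valid is established later, in Theorem 3.2.

```python
import math
import random

def sample_truncated_moment(xs, d, u, h=lambda x: x):
    """Sample truncated moment (2.1): average of h(X_i) over the X_i in (d, u]."""
    kept = [h(x) for x in xs if d < x <= u]
    return sum(kept) / len(kept)

def population_truncated_moment(theta, d, u):
    """Population truncated moment (2.2) for h(x) = x and X ~ Exp(theta):
    E[X | d < X <= u] = theta + (d e^{-d/theta} - u e^{-u/theta}) / (e^{-d/theta} - e^{-u/theta})."""
    ed, eu = math.exp(-d / theta), math.exp(-u / theta)
    return theta + (d * ed - u * eu) / (ed - eu)

def mtum_estimate(xs, d, u, lo=1e-3, hi=1e3):
    """Match sample and population truncated moments (step (iii)) by bisection;
    the population moment is strictly increasing in theta for this model."""
    target = sample_truncated_moment(xs, d, u)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if population_truncated_moment(mid, d, u) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

random.seed(1)
theta_true = 10.0
data = [random.expovariate(1.0 / theta_true) for _ in range(100_000)]
theta_hat = mtum_estimate(data, d=0.5, u=30.0)
```

With a large simulated sample, the recovered estimate is close to the true parameter, illustrating the consistency discussed in Section 2.2.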
3 Single-Parameter Pareto Distribution

Let $Y \sim \text{Pareto I}(\alpha, x_0)$ with distribution function $F_Y(y) = 1 - (x_0/y)^{\alpha}$, $y > x_0$, zero elsewhere, where $\alpha > 0$ is the shape (so-called tail) parameter and $x_0 > 0$ is a known left threshold. Then $X := \log(Y/x_0) \sim \text{Exp}(\theta = 1/\alpha)$ with distribution function $F_X(x) = 1 - e^{-x/\theta}$. Therefore, estimating $\alpha$ is equivalent to estimating the exponential parameter $\theta$, which is what we do for the rest of this section. MTuM will be derived, with asymptotic results, for a complete i.i.d. sample from an exponential distribution. For this particular distribution, we also explore two additional methods: the method of censored moments and insurance payment-type moment estimators. Several connections between the different approaches are established.

The asymptotic performance of the newly designed estimators will be measured via the asymptotic relative efficiency (ARE) with respect to the MLE, defined as [see, e.g., 21, 24]
$$ \text{ARE}\big(\mathcal{C}, \text{MLE}\big) = \frac{\text{asymptotic variance of the MLE}}{\text{asymptotic variance of the } \mathcal{C} \text{ estimator}}. \tag{3.1} $$
The main reason why the MLE should be used as a benchmark is its optimal asymptotic performance in terms of variability (of course, with the usual caveat of "under certain regularity conditions").

3.1 MTuM for the Exponential Distribution

In this section, we derive MTuM and related estimators for the parameter of the exponential distribution for completely observed data. Since there is a single parameter, $\theta$, to be estimated, we consider the function $h(x) = x$. Let $X_1, \ldots, X_n$ be i.i.d. random variables as in Definition 2.1, and let $d$ and $u$ be the left and right truncation points, respectively. Then the sample truncated moment is given by
$$ \widehat{\mu}_{\rm MTuM} := \frac{\big(\sum_{i=1}^{n} X_i\,\mathbf{1}\{d < X_i \leq u\}\big)/n}{\big(\sum_{i=1}^{n} \mathbf{1}\{d < X_i \leq u\}\big)/n} = \frac{\big(\sum_{i=1}^{n} Y_i\big)/n}{F_n(u) - F_n(d)} = \frac{\bar{Y}_n}{F_n(u) - F_n(d)} = \frac{\bar{Y}_n}{p_n}, $$
where $Y_1, Y_2, \ldots, Y_n \overset{\text{i.i.d.}}{\sim} Y := X\,\mathbf{1}\{d < X \leq u\}$, $p_n := F_n(u) - F_n(d)$, and $p \equiv p(\theta) = F(u \mid \theta) - F(d \mid \theta) = e^{-d/\theta} - e^{-u/\theta}$. Theorem 3.1.
The mean and the variance of the random variable $Y$ are, respectively, given by
$$ \mu_Y = \theta p + d e^{-d/\theta} - u e^{-u/\theta} \qquad \text{and} \qquad \sigma_Y^2 = 2\theta^2\left(\Gamma\Big(3; \frac{u}{\theta}\Big) - \Gamma\Big(3; \frac{d}{\theta}\Big)\right) - \mu_Y^2, $$
where $\Gamma(\alpha; x)$, with $\alpha > 0$, $x > 0$, is the (regularized) incomplete gamma function defined as
$$ \Gamma(\alpha; x) = \frac{1}{\Gamma(\alpha)}\int_0^x t^{\alpha-1} e^{-t}\, dt \qquad \text{with} \qquad \Gamma(\alpha) = \int_0^{\infty} t^{\alpha-1} e^{-t}\, dt. $$
Proof.
See Appendix B.

From Theorem 2.2, $\widehat{\mu}_{\rm MTuM} \sim \mathcal{AN}\Big(\frac{\mu_Y}{p}, \frac{1}{n}\Big(\frac{\sigma_Y^2}{p^2} - \frac{(1-p)\mu_Y^2}{p^3}\Big)\Big)$. Note that the asymptotic variance of $\widehat{\mu}_{\rm MTuM}$ is exactly equal to the approximation obtained through the second-order Taylor series expansion of the ratio of the asymptotic distributions of $\bar{Y}_n$ and $p_n$, as mentioned in [12]. The corresponding population version of $\widehat{\mu}_{\rm MTuM}$ is given by
$$ \mu_{\rm MTuM} := \mathbb{E}[X \mid d < X \leq u] = \frac{\mathbb{E}[Y]}{F(u \mid \theta) - F(d \mid \theta)} = \frac{\mu_Y}{p}. \tag{3.2} $$

Theorem 3.2. The equation $\mu_{\rm MTuM} = \widehat{\mu}_{\rm MTuM}$ has a unique solution $\widehat{\theta}$ provided that $d < \widehat{\mu}_{\rm MTuM} < \frac{d+u}{2}$. Otherwise, a solution does not exist.

Proof. It is clear that $d < \widehat{\mu}_{\rm MTuM} < u$. Also,
$$ \mu_{\rm MTuM}(\theta) = \frac{\mu_Y}{p} = \frac{e^{-d/\theta}(d + \theta) - e^{-u/\theta}(u + \theta)}{e^{-d/\theta} - e^{-u/\theta}}. $$
Then, in order to establish the result, it is enough to prove the following statements: (a) $\mu_{\rm MTuM}(\theta)$ is strictly increasing; (b) $\lim_{\theta \to 0} \mu_{\rm MTuM}(\theta) = d$; and (c) $\lim_{\theta \to \infty} \mu_{\rm MTuM}(\theta) = \frac{d+u}{2}$.

First of all, let us establish that $\mu_{\rm MTuM}(\theta)$ is strictly increasing. Writing $\mu_{\rm MTuM}(\theta) = \theta + \frac{d e^{-d/\theta} - u e^{-u/\theta}}{e^{-d/\theta} - e^{-u/\theta}}$ and differentiating with respect to $\theta$, the derivative simplifies to
$$ \mu_{\rm MTuM}'(\theta) = 1 - \left(\frac{u - d}{2\theta}\right)^2 \operatorname{csch}^2\left(\frac{u - d}{2\theta}\right). $$
Therefore, $\mu_{\rm MTuM}'(\theta) > 0$ if and only if $\big(\frac{u-d}{2\theta}\big)^2 < \sinh^2\big(\frac{u-d}{2\theta}\big)$, which is true since $x < \sinh x$ for all $x > 0$ and $x > \sinh x$ for all $x < 0$. Further,
$$ \lim_{\theta \to 0} \mu_{\rm MTuM}(\theta) = \lim_{\theta \to 0}\left[\theta + \frac{d e^{-d/\theta} - u e^{-u/\theta}}{e^{-d/\theta} - e^{-u/\theta}}\right] = \lim_{\theta \to 0} \frac{e^{-d/\theta}\big(d - u\, e^{(d-u)/\theta}\big)}{e^{-d/\theta}\big(1 - e^{(d-u)/\theta}\big)} = d. $$
Similarly, with the substitution $y := 1/\theta$ and two applications of L'Hopital's rule,
$$ \lim_{\theta \to \infty} \mu_{\rm MTuM}(\theta) = \lim_{y \to 0}\left[\frac{1}{y} + \frac{d e^{-dy} - u e^{-uy}}{e^{-dy} - e^{-uy}}\right] = d + \lim_{y \to 0} \frac{1 - \big(1 + (u-d)y\big) e^{-(u-d)y}}{y\big(1 - e^{-(u-d)y}\big)} = d + \frac{u-d}{2} = \frac{d+u}{2}. $$

Also, from (3.2), we have
$$ \theta' := \frac{d\theta}{d\mu_{\rm MTuM}} = \frac{p\theta^2}{d e^{-d/\theta}(\theta + d) - u e^{-u/\theta}(\theta + u) + p\theta^2 - \mu_{\rm MTuM}\big(d e^{-d/\theta} - u e^{-u/\theta}\big)} \tag{3.3} $$
$$ = \frac{p^2\theta^2}{p^2\theta^2 - e^{-(d+u)/\theta}(u-d)^2}. \tag{3.4} $$
Therefore, by the delta method, we get
$$ \widehat{\theta}_{\rm MTuM} \sim \mathcal{AN}\left(\theta, (\theta')^2\left(\frac{\sigma_Y^2}{np^2} - \frac{(1-p)\mu_Y^2}{np^3}\right)\right) = \mathcal{AN}\left(\theta, \frac{\theta^2}{n}\cdot\frac{p\theta^2}{p^2\theta^2 - e^{-(d+u)/\theta}(u-d)^2}\right), \tag{3.5} $$
and hence
$$ \text{ARE}\big(\widehat{\theta}_{\rm MTuM}, \widehat{\theta}_{\rm MLE}\big) = \frac{\theta^2 p^3}{(\theta')^2\big(p\sigma_Y^2 - (1-p)\mu_Y^2\big)} = \frac{p^2\theta^2 - e^{-(d+u)/\theta}(u-d)^2}{p\theta^2}. \tag{3.6} $$
Clearly, $\text{ARE}\big(\widehat{\theta}_{\rm MTuM}, \widehat{\theta}_{\rm MLE}\big)$ given by (3.6) is a function of the parameter $\theta$. Thus, it turns out that if we fix the left and right truncation thresholds, $d$ and $u$, and allow the tail probabilities $F(d \mid \theta)$ and $1 - F(u \mid \theta)$ to be random, then the corresponding asymptotic relative efficiency is not stable (see Figure 3.2). However, as in the method of trimmed moments (MTM) [see, e.g., 1, 26], if the tail probabilities $F(d \mid \theta)$ and $1 - F(u \mid \theta)$ are fixed, then we have the following result. Proposition 3.1.
Let $\theta_1 \neq \theta_2$ be two exponential parameters with corresponding left and right truncation thresholds $d_1, d_2$ and $u_1, u_2$, respectively. Assume $F(d_1 \mid \theta_1) = F(d_2 \mid \theta_2)$ and $F(u_1 \mid \theta_1) = F(u_2 \mid \theta_2)$; then it follows that
$$ \text{ARE}\big(\widehat{\theta}_1, \widehat{\theta}_{\rm MLE,1}\big) = \text{ARE}\big(\widehat{\theta}_2, \widehat{\theta}_{\rm MLE,2}\big). \tag{3.7} $$
Proof.
A proof immediately follows from (3.6).

Numerical values of $\text{ARE}\big(\widehat{\theta}_{\rm MTuM}, \widehat{\theta}_{\rm MLE}\big)$ given by (3.6), for selected values of the left and right truncation thresholds $d$ and $u$, are summarized in the first horizontal block of Table 3.1.

As mentioned above, if $Y \sim \text{Pareto I}(\alpha, x_0)$ with $x_0$ known, then $X := \log(Y/x_0) \sim \text{Exp}(1/\alpha =: \theta)$. So estimators of $\alpha$ of the single-parameter Pareto distribution share the same AREs as the estimators of $\text{Exp}(\theta)$, given that $h(y) = \log(y/x_0)$. The following result for the single-parameter Pareto has been partially derived in [5], but can easily be extended using the tools of this section. Theorem 3.3.
Let $d$ and $u$ be the left and right truncation points, respectively, for $Y \sim \text{Pareto I}(\alpha, x_0)$. Also, define
$$ A_{du} := u^{\alpha}\Big(1 - \alpha \log\frac{x_0}{d}\Big) - d^{\alpha}\Big(1 - \alpha \log\frac{x_0}{u}\Big) \qquad \text{and} \qquad g_{du}(\alpha) := \frac{A_{du}}{\alpha\,(u^{\alpha} - d^{\alpha})}. $$
Then the equation $\widehat{\mu}_{\rm MTuM} = \mu_{\rm MTuM}$ has a unique solution provided that
$$ \lim_{\alpha \to \infty} g_{du}(\alpha) < \widehat{\mu}_{\rm MTuM} < \lim_{\alpha \to 0} g_{du}(\alpha). $$

Proof. See Appendix B.

Note that, given truncated data, method-of-truncated-moments estimators for the parameters of a normal population can be found in [7] and [22].
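The Pareto-exponential reduction used throughout this section can be illustrated by simulation. The sketch below (with assumed illustrative values $\alpha = 2$, $x_0 = 5$, not taken from the paper) samples from Pareto I by inverse-cdf sampling, transforms the data by $X = \log(Y/x_0)$, and recovers $\alpha$ through the exponential parameter $\theta = 1/\alpha$:

```python
import math
import random

random.seed(7)
alpha, x0 = 2.0, 5.0   # assumed (illustrative) Pareto I parameters; x0 is known
n = 200_000

# Inverse-cdf sampling: F_Y(y) = 1 - (x0/y)^alpha  =>  Y = x0 * U^(-1/alpha), U ~ Uniform(0, 1]
ys = [x0 * (1.0 - random.random()) ** (-1.0 / alpha) for _ in range(n)]

# The transformed data X = log(Y/x0) behave like an Exp(theta = 1/alpha) sample,
# so the exponential MLE (the sample mean) back-transforms to an estimate of alpha.
xs = [math.log(y / x0) for y in ys]
theta_hat = sum(xs) / n
alpha_hat = 1.0 / theta_hat
```

The same transformation applies to any of the estimators of this section: fit $\theta$ on the log-transformed data, then set $\widehat{\alpha} = 1/\widehat{\theta}$.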
3.2 Method of Censored Moments

There are several versions of data censoring that occur in statistical modeling: interval censoring (which includes left and right censoring, depending on which end point of the interval is infinite), type I censoring, type II censoring, and random censoring. For actuarial work, the most relevant type is interval censoring. It occurs when complete sample observations are available within some interval, say $(d, u]$, but data outside the interval are only partially known; that is, counts are available but actual values are not. Thus, we observe the i.i.d. data
$$ Z_1, Z_2, \ldots, Z_n, \tag{3.8} $$
where each $Z$ equals the ground-up variable $X$ if $X$ falls between $d$ and $u$, and equals the corresponding end point of the interval if $X$ is beyond that point. That is,
$$ Z := \min\big\{\max(d, X), u\big\} = d\,\mathbf{1}\{X \leq d\} + X\,\mathbf{1}\{d < X \leq u\} + u\,\mathbf{1}\{X > u\} = \begin{cases} d, & X \leq d; \\ X, & d < X \leq u; \\ u, & X > u. \end{cases} $$
Therefore, instead of winsorizing fixed proportions of the lowest and highest order statistics of an observed sample [26], here we design a method of fixed-threshold censored moments for the exponential distribution.

Let $X_1, X_2, \ldots, X_n \overset{\text{i.i.d.}}{\sim} \text{Exp}(\theta)$. Then the sample censored mean is given by
$$ \widehat{\mu}_{\rm MCM} := \frac{d\sum_{i=1}^{n}\mathbf{1}\{X_i \leq d\} + \sum_{i=1}^{n} X_i\,\mathbf{1}\{d < X_i \leq u\} + u\sum_{i=1}^{n}\mathbf{1}\{X_i > u\}}{n}. $$
The corresponding population censored moments are
$$ \mu_{\rm MCM} := \mathbb{E}[Z] = d\big(1 - e^{-d/\theta}\big) + \mu_Y + u e^{-u/\theta} \qquad \text{and} \qquad \mu_{\rm MCM,2} := \mathbb{E}\big[Z^2\big] = d^2\big(1 - e^{-d/\theta}\big) + \mathbb{E}\big[Y^2\big] + u^2 e^{-u/\theta}, $$
where $Y := X\,\mathbf{1}\{d < X \leq u\}$ as in Section 3.1. Thus, $\sigma_{\rm MCM}^2 = \mu_{\rm MCM,2} - \mu_{\rm MCM}^2$. Moreover, setting $\mu_{\rm MCM} = \widehat{\mu}_{\rm MCM}$ implies $d + \theta\big(e^{-d/\theta} - e^{-u/\theta}\big) = \widehat{\mu}_{\rm MCM}$, which needs to be solved to obtain the method of censored moments (MCM) estimator, $\widehat{\theta}_{\rm MCM}$, of $\theta$.

Theorem 3.4. The equation $\widehat{\mu}_{\rm MCM} = \mu_{\rm MCM}$ has a unique solution $\widehat{\theta}_{\rm MCM}$ provided that $d < \widehat{\mu}_{\rm MCM} < u$.
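A minimal sketch of the MCM fitting step (assumed exponential model; the function name is illustrative, not from the paper). The equation $d + \theta(e^{-d/\theta} - e^{-u/\theta}) = \widehat{\mu}_{\rm MCM}$ is solved by bisection, which is justified because the left-hand side increases monotonically from $d$ (as $\theta \to 0$) to $u$ (as $\theta \to \infty$), the behavior behind Theorem 3.4:

```python
import math
import random

def mcm_estimate(xs, d, u, lo=1e-3, hi=1e3):
    """Solve d + theta (e^{-d/theta} - e^{-u/theta}) = censored sample mean
    for theta by bisection (monotone left-hand side, cf. Theorem 3.4)."""
    mu_hat = sum(min(max(d, x), u) for x in xs) / len(xs)   # sample censored mean
    if not (d < mu_hat < u):
        return None                                         # no solution exists
    pop = lambda t: d + t * (math.exp(-d / t) - math.exp(-u / t))
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if pop(mid) < mu_hat:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

random.seed(3)
theta_true = 10.0
data = [random.expovariate(1.0 / theta_true) for _ in range(100_000)]
theta_mcm = mcm_estimate(data, d=0.5, u=30.0)
```

Note that each observation enters only through $\min\{\max(d, X_i), u\}$, so the same code applies unchanged when the data arrive already interval-censored as in (3.8).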
Theorem 3.5. For the method of censored moments (MCM), with $a = F(d \mid \theta)$ and $b = 1 - F(u \mid \theta)$, the following result holds:
$$ \text{ARE}\big(\widehat{\theta}_{\rm MCM}, \widehat{\theta}_{\rm MLE}\big) = \text{ARE}\big(\widehat{\theta}_{\rm MTM}, \widehat{\theta}_{\rm MLE}\big). $$
Proof.
Following [1], we know that $\widehat{\theta}_{\rm MTM} \sim \mathcal{AN}\big(\theta, \frac{\theta^2}{n}\Delta\big)$, with $\Delta = J(a, 1-b)\big/[I(a, 1-b)]^2$, where
$$ I(a, 1-b) := \int_{a}^{1-b} \log(1-v)\, dv \qquad \text{and} \qquad J(a, 1-b) := \int_{a}^{1-b}\!\!\int_{a}^{1-b} \frac{\min\{v, w\} - vw}{(1-v)(1-w)}\, dw\, dv. $$
Therefore,
$$ \text{ARE}\big(\widehat{\theta}_{\rm MTM}, \widehat{\theta}_{\rm MLE}\big) = \frac{1}{\Delta} = \frac{[I(a, 1-b)]^2}{J(a, 1-b)}. $$
On the other hand, since $\mu_Y = p\theta + d e^{-d/\theta} - u e^{-u/\theta} = -\theta\, I(a, 1-b)$,
$$ \text{ARE}\big(\widehat{\theta}_{\rm MCM}, \widehat{\theta}_{\rm MLE}\big) = \frac{\big(p\theta + d e^{-d/\theta} - u e^{-u/\theta}\big)^2}{\sigma_{\rm MCM}^2} = \frac{\theta^2[I(a, 1-b)]^2}{\sigma_{\rm MCM}^2}. \tag{3.10} $$
Thus it remains to show that $J(a, 1-b) = \sigma_{\rm MCM}^2/\theta^2$. Substituting $1 - a = e^{-d/\theta}$ and $b = e^{-u/\theta}$ into $\sigma_{\rm MCM}^2 = \mu_{\rm MCM,2} - \mu_{\rm MCM}^2$ and simplifying gives
$$ \frac{\sigma_{\rm MCM}^2}{\theta^2} = 2(1-a-b) - (1-a-b)^2 - 2b\log\Big(\frac{1-a}{b}\Big). $$
On the other hand, by the symmetry of the integrand of $J$,
$$ J(a, 1-b) = 2\int_{a}^{1-b}\!\!\int_{a}^{w} \frac{v(1-w)}{(1-v)(1-w)}\, dv\, dw = 2\int_{a}^{1-b} \big[a + \log(1-a) - w - \log(1-w)\big]\, dw, $$
using $\int_a^w \frac{v}{1-v}\, dv = a + \log(1-a) - w - \log(1-w)$. A direct evaluation of the last integral yields the same expression, $2(1-a-b) - (1-a-b)^2 - 2b\log\big(\frac{1-a}{b}\big)$, which completes the proof.

Here is an important and new connection between trimmed and interval-censored population means for any $F \in \mathcal{F}$, where $\mathcal{F}$ is the family of continuous parametric distributions. Theorem 3.6.
Let $F \in \mathcal{F}$ be an arbitrary continuous ground-up cumulative distribution function (cdf). Let $d$ and $u$ be the lower and upper thresholds, respectively, and define $a := F(d)$ and $b := 1 - F(u)$. Let
$$ \mu_{\rm MCM} = dF(d) + \int_{d}^{u} z f(z)\, dz + u\big(1 - F(u)\big) \qquad \text{and} \qquad \mu_{\rm MTM} = \frac{1}{1-a-b}\int_{a}^{1-b} F^{-1}(v)\, dv $$
be, respectively, the fixed-threshold censored mean and the fixed-proportion trimmed mean of the same cdf $F$. Then
$$ \text{IF}(\mu_{\rm MCM}, x) = (1-a-b)\,\text{IF}(\mu_{\rm MTM}, x), \qquad -\infty < x < \infty, $$
where IF stands for the influence function.

Proof. The influence function of the trimmed mean is given by [see, e.g., 10, 14]:
$$ \text{IF}(\mu_{\rm MTM}, x) = \frac{1}{1-a-b}\int_{a}^{1-b} \left(\frac{d}{d\lambda} F_{\lambda}^{-1}(v)\right)\bigg|_{\lambda=0} dv = \frac{1}{1-a-b}\int_{a}^{1-b} \frac{v - \mathbf{1}\{F(x) \leq v\}}{f\big(F^{-1}(v)\big)}\, dv, \tag{3.11} $$
where $F_{\lambda} := (1-\lambda)F + \lambda\delta_x$ and $\delta_x$ is the point mass at $x$. Since $d$ and $u$ are the left and right censoring points, respectively, the censored mean is
$$ \mu_{\rm MCM}[F] = \int z\, dF_Z(z) = dF(d) + \mathbb{E}\big[X\,\mathbf{1}\{d < X < u\}\big] + u\big(1 - F(u)\big) = dF(d) + \int_{d}^{u} z\, dF(z) + u\big(1 - F(u)\big), $$
where $F_Z$ is the distribution function of $Z$, given by
$$ F_Z(z \mid d, u) = \mathbb{P}\big[\min\{\max(d, X), u\} \leq z\big] = \begin{cases} 0, & z < d; \\ F(z), & d \leq z < u; \\ 1, & z \geq u. \end{cases} \tag{3.12} $$
Further, $\mu_{\rm MCM}[F_{\lambda}] = dF_{\lambda}(d) + \int_{d}^{u} z\, dF_{\lambda}(z) + u\big(1 - F_{\lambda}(u)\big)$. Note that the influence function is just a special case of the first-order Gateaux derivative [see, e.g., 11, Section 2.3]. Thus, a simpler computational formula for the IF is [see, e.g., 21, Chapter 6]:
$$ \text{IF}(\mu_{\rm MCM}, x) = \frac{d\mu_{\rm MCM}[F_{\lambda}]}{d\lambda}\bigg|_{\lambda=0}. $$
It is clear that $\frac{dF_{\lambda}(d)}{d\lambda}\big|_{\lambda=0} = -F(d) + \delta_x(d)$ and, similarly, $\frac{d(1-F_{\lambda}(u))}{d\lambda}\big|_{\lambda=0} = F(u) - \delta_x(u)$.
Also, by using Leibniz's rule for differentiation under the integral sign, we get
$$ \frac{d}{d\lambda}\int_{d}^{u} z\, dF_{\lambda}(z) = \frac{d}{d\lambda}\int_{F_{\lambda}(d)}^{F_{\lambda}(u)} F_{\lambda}^{-1}(v)\, dv = u\,\frac{d}{d\lambda}F_{\lambda}(u) - d\,\frac{d}{d\lambda}F_{\lambda}(d) + \int_{F_{\lambda}(d)}^{F_{\lambda}(u)} \frac{d}{d\lambda}F_{\lambda}^{-1}(v)\, dv, $$
since $F_{\lambda}^{-1}\big(F_{\lambda}(u)\big) = u$ and $F_{\lambda}^{-1}\big(F_{\lambda}(d)\big) = d$. Evaluating at $\lambda = 0$,
$$ \frac{d}{d\lambda}\int_{d}^{u} z\, dF_{\lambda}(z)\bigg|_{\lambda=0} = u\,\frac{dF_{\lambda}(u)}{d\lambda}\bigg|_{\lambda=0} - d\,\frac{dF_{\lambda}(d)}{d\lambda}\bigg|_{\lambda=0} + \int_{a}^{1-b}\left(\frac{d}{d\lambda}F_{\lambda}^{-1}(v)\right)\bigg|_{\lambda=0} dv. $$
Therefore,
$$ \text{IF}(\mu_{\rm MCM}, x) = d\,\frac{dF_{\lambda}(d)}{d\lambda}\bigg|_{\lambda=0} + u\,\frac{d(1-F_{\lambda}(u))}{d\lambda}\bigg|_{\lambda=0} + \frac{d}{d\lambda}\int_{d}^{u} z\, dF_{\lambda}(z)\bigg|_{\lambda=0} = \int_{a}^{1-b}\left(\frac{d}{d\lambda}F_{\lambda}^{-1}(v)\right)\bigg|_{\lambda=0} dv, \tag{3.13} $$
since the $u$-terms cancel, $u\,\frac{d(1-F_{\lambda}(u))}{d\lambda}\big|_{\lambda=0} + u\,\frac{dF_{\lambda}(u)}{d\lambda}\big|_{\lambda=0} = 0$, and the $d$-terms cancel likewise. Thus, from equations (3.13) and (3.11), $\text{IF}(\mu_{\rm MCM}, x) = (1-a-b)\,\text{IF}(\mu_{\rm MTM}, x)$.

Figure 3.1:
Influence functions of trimmed mean (left panel) and censored mean (right panel).
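The identity of Theorem 3.6 can be checked numerically for the exponential model. In the sketch below (assumed illustrative values $a = b = 0.10$, $\theta = 10$), the censored-mean IF uses the fact that $\mu_{\rm MCM}$ is a linear functional of the cdf, so its Gateaux derivative in the direction of a point mass at $x$ is simply $\min\{\max(d, x), u\} - \mu_{\rm MCM}$, while the trimmed-mean IF is evaluated from (3.11) by a midpoint rule:

```python
import math

theta, a, b = 10.0, 0.10, 0.10          # assumed illustrative values
d = -theta * math.log(1.0 - a)          # F(d) = a
u = -theta * math.log(b)                # 1 - F(u) = b
p = 1.0 - a - b
mu_mcm = d + theta * (math.exp(-d / theta) - math.exp(-u / theta))  # censored mean

def if_mcm(x):
    """IF of the censored mean: mu_MCM is linear in the cdf, so
    IF(x) = min(max(d, x), u) - mu_MCM."""
    return min(max(d, x), u) - mu_mcm

def if_mtm(x, n=20_000):
    """IF (3.11) of the trimmed mean, evaluated by a midpoint rule;
    for Exp(theta), f(F^{-1}(v)) = (1 - v) / theta."""
    Fx = 1.0 - math.exp(-x / theta)
    h = (1.0 - b - a) / n
    total = 0.0
    for i in range(n):
        v = a + (i + 0.5) * h
        total += (v - (1.0 if Fx <= v else 0.0)) * theta / (1.0 - v) * h
    return total / p

# Theorem 3.6: IF_MCM(x) = (1 - a - b) * IF_MTM(x) for every x
xs = [0.5, 2.0, 7.5, 15.0, 40.0]
gaps = [abs(if_mcm(x) - p * if_mtm(x)) for x in xs]
```

The gaps are at the level of the integration error, and for $x < d$ both sides reduce exactly to $-\theta(1-a-b)$.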
The following two points are immediate consequences of Theorem 3.6.

(i) The censored mean is asymptotically more stable than the trimmed mean. This point is clear from Figure 3.1, as the graph of $\text{IF}(\mu_{\rm MCM}, x)$ is just the vertical contraction of the graph of $\text{IF}(\mu_{\rm MTM}, x)$ by the contracting factor $0 < 1-a-b < 1$, under the assumption $0 < a + b < 1$.

(ii) The asymptotic investigation of the censored mean could be quite challenging due to the thresholds. In this situation, one can assess the asymptotic distributional properties of the censored mean through the corresponding properties of the trimmed mean, as the asymptotic variance of an estimator is the expectation of the square of the corresponding IF [see, e.g., 10, 17].

3.3 Insurance Payment-Type Estimators

Insurance contracts have coverage modifications that need to be taken into account when modeling the underlying loss variable. Usually, coverage modifications such as deductibles, policy limits, and coinsurance are introduced as loss control strategies so that unfavorable policyholder behavioral effects (e.g., adverse selection) can be minimized. Therefore, actuarial loss data are left-truncated and right-censored in nature. Motivated by this nature of the loss data, here we design an estimation approach, called insurance payment-type estimators, which is essentially a left-truncated and right-censored method of moments.

Let $X_1, \ldots, X_n$ be i.i.d. random variables with common exponential cdf $F(\cdot \mid \theta)$. Define the left-truncated (at $d$) and right-censored (at $u$) sample moment as
$$ \widehat{\mu}_{\rm MTCM} := \frac{\sum_{i=1}^{n} X_i\,\mathbf{1}\{d < X_i \leq u\} + u\sum_{i=1}^{n}\mathbf{1}\{X_i > u\}}{\sum_{i=1}^{n}\mathbf{1}\{X_i > d\}} = \frac{\bar{W}_n}{\tau_n}, $$
where $W := X\,\mathbf{1}\{d < X \leq u\} + u\,\mathbf{1}\{X > u\}$, $\tau_n := 1 - F_n(d)$, and $\tau := 1 - F(d \mid \theta)$. The covariance of $W$ and the truncation indicator is given by $\sigma_{W\tau} = \mathrm{Cov}\big(W, \mathbf{1}\{X > d\}\big) = \mu_W(1-\tau)$, with $\mu_W = \mathbb{E}[W] = \mu_Y + u\big(1 - F(u \mid \theta)\big)$ and $\mathbb{E}\big[W^2\big] = \mathbb{E}\big[Y^2\big] + u^2\big(1 - F(u \mid \theta)\big)$, where $Y := X\,\mathbf{1}\{d < X \leq u\}$ as in Section 3.1.
Then, by the multivariate central limit theorem, we have
$$ \big(\bar{W}_n, \tau_n\big) \sim \mathcal{AN}\left((\mu_W, \tau), \frac{1}{n}\begin{bmatrix} \sigma_W^2 & \sigma_{W\tau} \\ \sigma_{W\tau} & \tau(1-\tau) \end{bmatrix}\right). $$
Then, by the delta method with the function $g(x_1, x_2) := x_1/x_2$, $x_2 \neq 0$, we have
$$ \widehat{\mu}_{\rm MTCM} = \frac{\bar{W}_n}{\tau_n} \sim \mathcal{AN}\left(\frac{\mu_W}{\tau}, \frac{1}{n}\left(\frac{\sigma_W^2}{\tau^2} - \frac{(1-\tau)\mu_W^2}{\tau^3}\right)\right). $$
The population version of $\widehat{\mu}_{\rm MTCM}$ is given by
$$ \mu_{\rm MTCM} = \frac{\mathbb{E}[W]}{1 - F(d \mid \theta)} = \frac{\theta\big(e^{-d/\theta} - e^{-u/\theta}\big) + d e^{-d/\theta}}{e^{-d/\theta}} = \frac{p\theta + d\tau}{\tau} \quad \Rightarrow \quad \theta' := \frac{d\theta}{d\mu_{\rm MTCM}} = \frac{\tau\theta}{p\theta - e^{-u/\theta}(u-d)}. $$
A solution of the equation $\widehat{\mu}_{\rm MTCM} = \mu_{\rm MTCM}$, say $\widehat{\theta}_{\rm MTCM}$, if it exists, is called the method of truncated and censored moments (MTCM) estimator of $\theta$. Let $b := e^{-u/\theta}$; then, by the delta method, the asymptotic distribution and the ARE are, respectively, given by
$$ \widehat{\theta}_{\rm MTCM} \sim \mathcal{AN}\left(\theta, (\theta')^2\,\frac{1}{n}\left(\frac{\sigma_W^2}{\tau^2} - \frac{(1-\tau)\mu_W^2}{\tau^3}\right)\right) = \mathcal{AN}\left(\theta, \frac{\theta^2}{n}\cdot\frac{p\theta^2(\tau+b) - 2\theta\tau b(u-d)}{\tau\big(p\theta - b(u-d)\big)^2}\right) \tag{3.14} $$
and
$$ \text{ARE}\big(\widehat{\theta}_{\rm MTCM}, \widehat{\theta}_{\rm MLE}\big) = \frac{\theta^2\tau^3}{(\theta')^2\big(\tau\sigma_W^2 - (1-\tau)\mu_W^2\big)} = \frac{\tau\big(p\theta - b(u-d)\big)^2}{p\theta^2(\tau+b) - 2\theta\tau b(u-d)}. \tag{3.15} $$

Figure 3.2: Graphs of $\text{ARE}\big(\widehat{\theta}_{\mathcal{C}}, \widehat{\theta}_{\rm MLE}\big)$, where $\mathcal{C} \in \{$MTuM, MCM, MTCM$\}$, with fixed thresholds $(d, u)$ and $\theta = 10$.

Similar to (3.6) and (3.10), $\text{ARE}\big(\widehat{\theta}_{\rm MTCM}, \widehat{\theta}_{\rm MLE}\big)$ given by (3.15) is a function of $\theta$. But if we fix the tail probabilities, then we have the following stability result. Proposition 3.2.
Let θ_1 and θ_2 be two exponential parameters with corresponding left and right truncation thresholds d_1, d_2 and u_1, u_2, respectively. Assume F(d_1|θ_1) = F(d_2|θ_2) and F(u_1|θ_1) = F(u_2|θ_2); then it follows that

  ARE( θ̂_{1,MTCM}, θ̂_{1,MLE} ) = ARE( θ̂_{2,MTCM}, θ̂_{2,MLE} ).    (3.16)

Proof.
With the given assumptions, we have e^{−d_1/θ_1} = e^{−d_2/θ_2} and e^{−u_1/θ_1} = e^{−u_2/θ_2}, which imply (u_1 − d_1)/θ_1 = (u_2 − d_2)/θ_2, and then the conclusion follows directly from (3.15).

Table 3.1: Numerical values of ARE(θ̂_C, θ̂_MLE), where C ∈ {MTuM, MCM, MTCM}, given respectively by (3.6), (3.10), and (3.15), for various values of the left and right truncation thresholds d and u from Exp(θ = 10). Rows are indexed by a = F(d|θ) and columns by b = 1 − F(u|θ); the thresholds d = F^{−1}(a) and u = F^{−1}(1 − b) are rounded to two decimal places, and "-" marks combinations that are not reported.

ARE(θ̂_MTuM, θ̂_MLE):
  a \ b    ∞(.00)  (.05)  (.10)  (.15)  (.25)  (.49)  (.70)  (.85)
  (.05)     .950   .443   .284   .193   .095   .016   .002   .000
  (.10)     .900   .408   .257   .172   .082   .012   .001   .000
  (.15)     .850   .373   .231   .152   .069   .009   .000    -
  (.25)     .750   .307   .182   .114   .047   .004   .000    -
  (.49)     .510   .161   .080   .042   .011   .000    -      -
  (.70)     .300   .057   .019   .006   .000    -      -      -
  (.85)     .150   .009   .001    -      -      -      -      -

ARE(θ̂_MCM, θ̂_MLE):
  a \ b    ∞(.00)  (.05)  (.10)  (.15)  (.25)  (.49)  (.70)  (.85)
  (.15)     .999   .918   .850   .787   .672   .436   .261    -
  (.25)     .995   .918   .851   .790   .679   .452   .285    -
  (.49)     .958   .897   .839   .786   .688   .487    -      -
  (.70)     .857   .824   .781   .738   .659    -      -      -
  (.85)     .681   .688   .663    -      -      -      -      -

ARE(θ̂_MTCM, θ̂_MLE):
  a \ b    ∞(.00)  (.05)  (.10)  (.15)  (.25)  (.49)  (.70)  (.85)
  (.05)     .950   .868   .798   .735   .619   .380   .197   .077
  (.10)     .900   .819   .750   .687   .572   .336   .157   .038
  (.15)     .850   .768   .700   .638   .525   .292   .116    -
  (.25)     .750   .670   .603   .542   .432   .208   .038    -
  (.49)     .510   .434   .371   .315   .216   .015    -      -
  (.70)     .300   .229   .173   .124   .039    -      -      -
  (.85)     .150   .087   .040    -      -      -      -      -

From Table 3.1, it follows evidently that

  ARE( θ̂_MTuM, θ̂_MLE ) ≤ ARE( θ̂_MTCM, θ̂_MLE ) ≤ ARE( θ̂_MCM, θ̂_MLE ).

This inequality is intuitive because MTuM is more robust than MCM and MTCM. As a result, MTuM estimators lose more efficiency and converge to the asymptotic results more slowly. For example, if the lower and upper truncation thresholds are, respectively, d = 0.51 and u = 29.96, then ARE(θ̂_MTuM, θ̂_MLE) = 0.443 and ARE(θ̂_MCM, θ̂_MLE) = 0.918. That is, we lose approximately 52% of the efficiency by going from MCM to MTuM.
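As an illustrative sketch (not code from the paper), formula (3.15) can be evaluated directly. The snippet below reproduces one MTCM entry of Table 3.1 and also demonstrates Proposition 3.2: when the thresholds are set through fixed tail probabilities, the ARE does not depend on θ. The chosen tail levels are just examples.

```python
import numpy as np

def are_mtcm_mle(theta, d, u):
    """ARE of (3.15): [p - b(u-d)/theta]^2 / [p(1 + b/tau) - 2b(u-d)/theta],
    with tau = exp(-d/theta), b = exp(-u/theta), p = tau - b (finite u)."""
    tau, b = np.exp(-d / theta), np.exp(-u / theta)
    p = tau - b
    r = b * (u - d) / theta
    return (p - r) ** 2 / (p * (1 + b / tau) - 2 * r)

# thresholds via tail probabilities: d = F^{-1}(0.05), u = F^{-1}(0.95)
theta = 10.0
d, u = -theta * np.log(0.95), -theta * np.log(0.05)
print(round(are_mtcm_mle(theta, d, u), 3))      # 0.868, as in Table 3.1

# Proposition 3.2: same tail probabilities, different theta, same ARE
theta2 = 3.0
d2, u2 = -theta2 * np.log(0.95), -theta2 * np.log(0.05)
print(round(are_mtcm_mle(theta2, d2, u2), 3))   # 0.868 again
```

The invariance holds because, with fixed tail probabilities, τ, b, p, and (u − d)/θ are all constant in θ.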
The reason that the MTuM relative efficiency is much lower than the corresponding MCM one is that the censored sample size is always fixed, whereas, even if we fix the truncation thresholds, the truncated sample size is random. Further, MTuM disregards the observations beyond the truncation thresholds in order to control the influence of extremes on statistical inference. MCM controls the influence of extremes differently: observations beyond the thresholds are adjusted to be equal to the corresponding thresholds, which increases the efficiency significantly. MTCM controls the influence of extremes by disregarding the observations below the lower threshold and adjusting the observations above the upper threshold to equal that threshold, which places the MTCM entries between the corresponding MTuM and MCM entries. Due to Theorem 3.5, the entries for ARE(θ̂_MCM, θ̂_MLE) are identical to the ARE(θ̂_MTM, θ̂_MLE) entries found in [1, Table 1].

Figure 3.3: Effects of MTuM (left panel), MCM (middle panel), and MTCM (right panel) on the underlying quantile function and thus on the data. MTuM focuses only on the data between the truncation thresholds; MCM is a threshold-censored form that also takes the values outside the lower and upper thresholds into account (orange area); and MTCM is a mixed version of both MTuM (left truncated) and MCM (right censored). (MTuM: method of truncated moments; MCM: method of censored moments; MTCM: method of left-truncated and right-censored moments.)
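The three data transformations described in Figure 3.3 can be mimicked on a sample directly. The following sketch (the thresholds d, u are hypothetical) shows how each method modifies the observations before moments are computed:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=10.0, size=1000)
d, u = 2.0, 15.0                      # hypothetical thresholds

truncated = x[(x > d) & (x <= u)]     # MTuM: discard observations outside (d, u]
censored  = np.clip(x, d, u)          # MCM: pull outside observations to the thresholds
payment   = np.minimum(x[x > d], u)   # MTCM: drop below d, cap at u

# MTuM works with a random subsample size; MCM keeps the full, fixed sample size
print(truncated.size, censored.size, payment.size)
```

This makes the efficiency ordering plausible: truncation throws information away, censoring retains a (bounded) contribution from every observation, and the payment-type transform sits in between.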
Theorem 3.7.
The equation μ̂_MTCM = μ_MTCM has a unique solution, θ̂_MTCM, provided that d < μ̂_MTCM < u. Otherwise, the solution does not exist.

Proof. A proof can be established similarly to that of Theorem 3.2.
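Numerically, the unique root guaranteed by Theorem 3.7 can be bracketed and found with a standard solver. A minimal sketch (the algebraic simplification in the comment follows from μ_MTCM = (pθ + dτ)/τ and is easy to verify; the threshold values in the usage line are hypothetical):

```python
import numpy as np
from scipy.optimize import brentq

def mu_mtcm(theta, d, u):
    # (p*theta + d*tau)/tau simplifies to theta*(1 - exp(-(u-d)/theta)) + d,
    # which increases from d (as theta -> 0) to u (as theta -> infinity)
    return theta * (1.0 - np.exp(-(u - d) / theta)) + d

def theta_hat_mtcm(mu_hat, d, u):
    """Unique root of mu_MTCM(theta) = mu_hat, assuming d < mu_hat < u."""
    return brentq(lambda t: mu_mtcm(t, d, u) - mu_hat, 1e-9, 1e9)
```

Feeding the population value back in recovers the parameter: theta_hat_mtcm(mu_mtcm(10.0, 0.51, 29.96), 0.51, 29.96) returns (approximately) 10.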
Simulation Study
This section supplements the theoretical results developed in Section 3 via simulation. The main goal is to assess the sample size needed for the estimators to be practically free of bias (given that the estimators are asymptotically unbiased), to verify the asymptotic normality, and to check that their finite-sample relative efficiencies (RE) approach the corresponding AREs. To compute the RE of the different estimators (MTuM, MCM, and MTCM) we use the MLE as a benchmark. Thus, the definition of asymptotic relative efficiency given by equation (3.1) translates, for finite-sample performance, to

  RE(C, MLE) = (asymptotic variance of the MLE estimator) / (small-sample variance of a competing estimator C),

where the denominator is the empirical mean squared error of the competing estimator C.

From Exp(θ = 10), we first examine the approximate normality of the MTuM, MCM, and MTCM estimators of θ given, respectively, by (3.5), (3.9), and (3.14), with (d, u) = (0.…, …) and finite sample sizes n = 30, …, 500. We generate 100 samples for each sample size and estimate θ from each sample via MTuM, MCM, and MTCM. We plot the histograms of those 100 estimated values of θ in Figure 4.1. Clearly, the histograms corresponding to MTuM are positively skewed for the smaller sample sizes (but turn out to be symmetric for n = 500), and hence the asymptotic normality of θ̂_MTuM given by (3.5) is achieved more slowly, i.e., only for bigger sample sizes, than for MCM and MTCM. On the other hand, the asymptotic normality of θ̂_MCM given by (3.9) and of θ̂_MTCM given by (3.14) is justified even for samples of size n = 30.

Second, again from the exponential distribution F(·|θ = 10), we generate … samples of a specified length n using Monte Carlo. For each sample we estimate the parameter of F via the MTuM, MCM, and MTCM estimators and then compute the average mean and the RE of those estimates. This process is repeated … times, and the average means and REs are again averaged and their standard errors are reported. Such repetitions are useful for assessing the standard errors of the estimated means and REs. Hence, our findings are essentially based on 100,000 samples. The standardized ratio θ̂/θ that we report is defined as the average of the estimates divided by the true value of the parameter being estimated. We observe the performance of the different estimation methods for the exponential distribution (see Section 3) with respect to the aspects listed below.
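The simulation loop just described can be sketched as follows for the MTCM estimator (a sketch, not the paper's code; the thresholds, sample size, and replication count here are illustrative and far smaller than the study's 100,000 samples):

```python
import numpy as np
from scipy.optimize import brentq

def simulate_re_mtcm(theta, d, u, n, reps, seed=0):
    """Monte Carlo RE of theta_hat_MTCM with the MLE as benchmark:
    RE = (theta^2 / n) / (empirical MSE of theta_hat_MTCM)."""
    rng = np.random.default_rng(seed)
    est = np.empty(reps)
    for r in range(reps):
        x = rng.exponential(scale=theta, size=n)
        w = np.where((x > d) & (x <= u), x, 0.0) + u * (x > u)
        mu_hat = w.mean() / np.mean(x > d)             # \bar{W}_n / tau_n
        # invert mu_MTCM(theta) = theta*(1 - exp(-(u-d)/theta)) + d
        est[r] = brentq(lambda t: t * (1 - np.exp(-(u - d) / t)) + d - mu_hat,
                        1e-9, 1e9)
    mse = np.mean((est - theta) ** 2)
    return theta ** 2 / n / mse

print(simulate_re_mtcm(10.0, 0.51, 29.96, n=250, reps=300))  # approaches the ARE of about 0.868
```

As n and the number of replications grow, the returned RE stabilizes near the analytic ARE from (3.15).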
Figure 4.1:
Histograms of 100 estimated values of the parameter θ of Exp(θ = 10) via MTuM, MCM, and MTCM with (d, u) = (0.…, …) and sample sizes n = 30, …, 500.

(i) Sample size: n = 50, 100, 250, 500, 1000.
(ii) Estimators of θ:
(a) MLE, which is a special case of all the others.
(b) MTuM, MCM, MTCM.
(c) For the selected proportions a = b = 0; a = b = 0.…; a = b = 0.…; a = b = 0.…; a = b = 0.…; a = 0.… and b = 0.…; and a = 0.… and b = 0.…, the left and right truncation (or censoring) thresholds d and u, rounded to two decimal places, are chosen as a = F(d) (i.e., d = F^{−1}(a)) and 1 − b = F(u) (i.e., u = F^{−1}(1 − b)).

Table 4.1: Finite-sample performance evaluation of the different estimators for
Exp(θ = 10). Each cell of the table reports, for a threshold pair (d(a), u(b)) and for n = 50, 100, 250, 500, and 1000, the standardized ratio θ̂/θ and the RE of the MTuM, MCM, and MTCM estimators, with standard errors in parentheses; the n → ∞ columns contain the analytic AREs. For example, the n → ∞ ARE triples (MTuM, MCM, MTCM) include (.442, .918, .868), (.257, .848, .749), (.152, .787, .638), (.047, .679, .433), (.001, .250, .156), and (.750, .995, .750).

Simulation results are recorded in Table 4.1. The entries are mean values (with standard errors in parentheses) based on 100,000 samples. The columns corresponding to n → ∞ represent the analytic ARE(θ̂_C, θ̂_MLE) results, with C ∈ {MTuM, MCM, MTCM}, found in Section 3, not from simulations. Among these three columns, the first corresponds to C = MTuM, the second to C = MCM, and the third to C = MTCM. As seen from Table 4.1, the ratio θ̂/θ of the exponential θ estimators converges to the true asymptotic value of 1 very fast. Besides the MTuM approach, the bias of all the other procedures disappears as soon as n ≥ …, and the estimators' REs practically reach their ARE levels already for n ≥ …. Some of the finite-sample entries in the MTuM columns of Table 4.1 are not reported, especially for the pair (d, u) = (1.…, …), because the corresponding threshold pair does not satisfy the necessary condition of Theorem 3.2 for at least one generated sample. It is evident from the entries that if the difference between the thresholds d and u is smaller, then the estimators converge more slowly to the true values. Overall, as expected, MCM performs best in terms of balancing efficiency and robustness. MTuM performs very poorly in terms of efficiency, especially for small sample sizes, but it produces highly robust estimators.

In this paper, we have developed the method of truncated (MTuM), censored (MCM), and insurance payment-type (MTCM) moments estimators for completely observed ground-up loss severity data. A series of theoretical results about the estimators' existence and asymptotic normality has been established. Our analysis has established new connections between data truncation, trimming, and censoring, which paves the way for more effective modeling of non-linearly transformed loss data. Further, as seen from Table 3.1, there are clear trade-offs between efficiency and robustness for the newly designed estimators relative to the corresponding MLEs when the sample size is large.
The finite-sample performance of all the estimators developed in this paper has been investigated in detail, for various sample sizes, for the single-parameter Pareto model via a simulation study. The results of this paper motivate open problems and generate several ideas for further research. First, most of the results of Section 3 (besides Theorem 3.6) are limited to completely observed exponentially (equivalently, single-parameter Pareto) distributed data, but they could be extended to more general situations and models. For example, similar estimation approaches could be designed for (log-)location-scale and exponential dispersion families, which could lead to more challenging non-linear equations to be solved (see Theorems 3.2, 3.4, and 3.7). Second, several contaminated loss severity models have been proposed in the literature [see, e.g., 2, 3, 20], so implementing the procedures developed in this paper on the body of the data, while fitting some heavier-tailed distribution (e.g., Pareto) to the right tail, could produce an even better model that still maintains a reasonable balance between efficiency and robustness. Further, it remains to be measured how the newly designed estimation procedures perform in different risk analyses in practice.
Acknowledgements
The author is very appreciative of the valuable comments provided by Prof. Dr. V. Brazauskas at the University of Wisconsin-Milwaukee and the constructive criticisms of the anonymous referee(s). These have led to many improvements in the paper. Further, a part of this work presented by the author won the "1st Place Prize among Student Presentation Competition" at the 52nd Actuarial Research Conference (ARC), Atlanta, GA, 2017. The author was also awarded the Committee on Knowledge Extension Research (CKER) ARC Travel Grant 2017 from the Society of Actuaries, Schaumburg, IL. Thus, the author gratefully acknowledges the support provided by the Society of Actuaries.
References

[1] Brazauskas, V., Jones, B.L., and Zitikis, R. (2009). Robust fitting of claim severity distributions and the method of trimmed moments. Journal of Statistical Planning and Inference, (6), 2028–2043.
[2] Brazauskas, V. and Kleefeld, A. (2016). Modeling severity and measuring tail risk of Norwegian fire claims. North American Actuarial Journal, (1), 1–16.
[3] Chan, J.S.K., Choy, S.T.B., Makov, U.E., and Landsman, Z. (2018). Modelling insurance losses using contaminated generalized beta type-II distribution. ASTIN Bulletin, (2), 871–904.
[4] Chernoff, H., Gastwirth, J.L., and Johns, Jr., M.V. (1967). Asymptotic distribution of linear combinations of functions of order statistics with applications to estimation. Annals of Mathematical Statistics, (1), 52–72.
[5] Clark, D.R. (2013). A note on the upper-truncated Pareto distribution. Casualty Actuarial Society E-Forum.
[6] Cohen, Jr., A.C. (1950). Estimating the mean and variance of normal populations from singly truncated and doubly truncated samples. Annals of Mathematical Statistics, (4), 557–569.
[7] Cohen, Jr., A.C. (1951). On estimating the mean and variance of singly truncated normal distributions from the first three sample moments. Annals of the Institute of Statistical Mathematics, Tokyo, 37–44.
[8] Ergashev, B., Pavlikov, K., Uryasev, S., and Sekeris, E. (2016). Estimation of truncated data samples in operational risk modeling. The Journal of Risk and Insurance, (3), 613–640.
[9] Frees, E. (2017). Insurance portfolio risk retention. North American Actuarial Journal, (4), 526–551.
[10] Hampel, F.R. (1974). The influence curve and its role in robust estimation. Journal of the American Statistical Association, 383–393.
[11] Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J., and Stahel, W.A. (1986). Robust Statistics: The Approach Based on Influence Functions. John Wiley & Sons, Inc., New York.
[12] Hayya, J., Armstrong, D., and Gressis, N. (1975). A note on the ratio of two normally distributed variables. Management Science, (11), 1338–1341.
[13] Huber, P.J. (1964). Robust estimation of a location parameter. Annals of Mathematical Statistics, (1), 73–101.
[14] Huber, P.J. and Ronchetti, E.M. (2009). Robust Statistics. Second edition. John Wiley & Sons, Inc., Hoboken, NJ.
[15] Klugman, S.A., Panjer, H.H., and Willmot, G.E. (2019). Loss Models: From Data to Decisions. Fifth edition. John Wiley & Sons, Hoboken, NJ.
[16] Lee, G.Y. (2017). General insurance deductible ratemaking. North American Actuarial Journal, (4), 620–638.
[17] Maronna, R.A., Martin, R.D., Yohai, V.J., and Salibián-Barrera, M. (2019). Robust Statistics: Theory and Methods (with R). Second edition. John Wiley & Sons, Inc., Hoboken, NJ.
[18] Poudyal, C. (2018). Robust Estimation of Parametric Models for Insurance Loss Data. ProQuest LLC, Ann Arbor, MI. Thesis (Ph.D.)–The University of Wisconsin-Milwaukee.
[19] Reynkens, T., Verbelen, R., Beirlant, J., and Antonio, K. (2017). Modelling censored losses using splicing: A global fit strategy with mixed Erlang and extreme value distributions. Insurance: Mathematics and Economics, 65–77.
[20] Scollnik, D.P.M. and Sun, C. (2012). Modeling with Weibull–Pareto models. North American Actuarial Journal, (2), 260–272.
[21] Serfling, R.J. (1980). Approximation Theorems of Mathematical Statistics. John Wiley & Sons, New York.
[22] Shah, S.M. and Jaiswal, M.C. (1966). Estimation of parameters of doubly truncated normal distribution from first four sample moments. Annals of the Institute of Statistical Mathematics, 107–111.
[23] Tukey, J.W. (1960). A survey of sampling from contaminated distributions. Contributions to Probability and Statistics, pages 448–485. Stanford University Press, Stanford, CA.
[24] van der Vaart, A.W. (1998). Asymptotic Statistics. Cambridge University Press, Cambridge.
[25] Verbelen, R., Gong, L., Antonio, K., Badescu, A., and Lin, S. (2015). Fitting mixtures of Erlangs to censored and truncated data using the EM algorithm. ASTIN Bulletin, (3), 729–758.
[26] Zhao, Q., Brazauskas, V., and Ghorai, J. (2018). Robust and efficient fitting of severity models and the method of Winsorized moments. ASTIN Bulletin, (1), 275–309.

Appendix A: All four possible scenarios for Section 2.2
Scenario 1: d_j ≤ d_{j′} < u_j ≤ u_{j′}. Then
  Y_{jj′} = h_{jj′}(X) 1{d_{jj′} < X ≤ u_{jj′}} = h_{jj′}(X) 1{d_{j′} < X ≤ u_j},
  W_{jj′} = Z_j 1{d_{j′} < X ≤ u_j}, and W_{j′j} = Z_{j′} 1{d_{j′} < X ≤ u_j}.

Scenario 2: d_j ≤ d_{j′} < u_{j′} ≤ u_j. Then
  Y_{jj′} = h_{jj′}(X) 1{d_{jj′} < X ≤ u_{jj′}} = h_{jj′}(X) 1{d_{j′} < X ≤ u_{j′}},
  W_{jj′} = Z_j 1{d_{j′} < X ≤ u_{j′}}, and W_{j′j} = Z_{j′} 1{d_{j′} < X ≤ u_{j′}}.

Scenario 3: d_{j′} ≤ d_j < u_j ≤ u_{j′}. Then
  Y_{jj′} = h_{jj′}(X) 1{d_{jj′} < X ≤ u_{jj′}} = h_{jj′}(X) 1{d_j < X ≤ u_j},
  W_{jj′} = Z_j 1{d_j < X ≤ u_j}, and W_{j′j} = Z_{j′} 1{d_j < X ≤ u_j}.

Scenario 4: d_{j′} ≤ d_j < u_{j′} ≤ u_j. Then
  Y_{jj′} = h_{jj′}(X) 1{d_{jj′} < X ≤ u_{jj′}} = h_{jj′}(X) 1{d_j < X ≤ u_{j′}},
  W_{jj′} = Z_j 1{d_j < X ≤ u_{j′}}, and W_{j′j} = Z_{j′} 1{d_j < X ≤ u_{j′}}.

That is, in every scenario the common indicator interval is (d_{jj′}, u_{jj′}] with d_{jj′} = max(d_j, d_{j′}) and u_{jj′} = min(u_j, u_{j′}). Therefore, depending on the scenario, the expected values are given by:

  μ_{Y_{jj′}} = E[Y_{jj′}] = ∫_{F(d_{jj′}|θ)}^{F(u_{jj′}|θ)} h_{jj′}( F^{−1}(v|θ) ) dv,    μ_{W_{jj′}} = E[W_{jj′}] = ∫_{F(d_{jj′}|θ)}^{F(u_{jj′}|θ)} h_j( F^{−1}(v|θ) ) dv.

Appendix B: Proofs

Proof of Theorem 2.2:
Clearly, g_V(μ_V) = ( μ_{Y_1}/p_1, …, μ_{Y_k}/p_k ) =: ( μ_1, …, μ_k ) =: μ. From Lemma 2.2, it follows that

  D_V := [ ∂g_j/∂x_{j′} |_{x = μ_V} ]_{k×2k} = [ d_{V,jj′} ]_{k×2k},

where

  d_{V,jj′} := 1/p_{j′}, if 1 ≤ j = j′ ≤ k;  −μ_{Y_j}/p_j², if j′ − j = k;  0, otherwise.

Now, with an application of the delta method corresponding to the function g_V above [see 21, §3.3, Theorem A], we have

  ( μ̂_1, …, μ̂_k ) ~ AN( g_V(μ_V) = μ, (1/n) D_V Σ_V D_V′ ).

Proof of Theorem 3.1:
The r.v. Y can be expressed in the form

  Y = X ∧ u − u 1{u < X < ∞} − X ∧ d + d 1{d < X < ∞}.

Define I_{a,b} := 1{a < X < b}. Therefore,

  μ_Y = E[Y] = E[X ∧ u] − E[u I_{u,∞}] − E[X ∧ d] + E[d I_{d,∞}]
      = θ(1 − e^{−u/θ}) − u e^{−u/θ} − θ(1 − e^{−d/θ}) + d e^{−d/θ}
      = θ( e^{−d/θ} − e^{−u/θ} ) + d e^{−d/θ} − u e^{−u/θ}.

Squaring the above representation of Y and simplifying (the cross terms collapse on each of the events {X ≤ d}, {d < X ≤ u}, and {X > u}) gives

  Y² = (X ∧ u)² − (X ∧ d)² − 2d[ X ∧ u − X ∧ d ] − u² I_{u,∞} − d² I_{d,∞} + 2d( X I_{d,u} + u I_{u,∞} ).

Therefore, for X ~ Exp(θ), μ_{Y²} := E[Y²] is computed as follows:

  μ_{Y²} = E[(X ∧ u)²] − E[(X ∧ d)²] − 2d( E[X ∧ u] − E[X ∧ d] ) − u² E[I_{u,∞}] − d² E[I_{d,∞}] + 2d( E[X I_{d,u}] + u E[I_{u,∞}] )
        = ( d² + 2θd + 2θ² ) e^{−d/θ} − ( u² + 2θu + 2θ² ) e^{−u/θ}
        = 2θ²( Γ_3(u/θ) − Γ_3(d/θ) ),

where Γ_3 denotes the cdf of the gamma distribution with shape parameter 3 and scale parameter 1. Therefore,

  σ_Y² = μ_{Y²} − μ_Y² = 2θ²( Γ_3(u/θ) − Γ_3(d/θ) ) − μ_Y².

Proof of Theorem 3.3:
Note that the parameter vector is given by θ = (α, x_0), with x_0 known in advance. The population version of μ̂_MTuM is given by

  μ_MTuM = E[ h(Y) | d < Y ≤ u ] = E[ h(Y) 1{d < Y ≤ u} ] / ( F(u|θ) − F(d|θ) ) = [ ∫_d^u h(y) f(y|θ) dy ] / ( F(u|θ) − F(d|θ) )
         = [ ∫_{F(d|θ)}^{F(u|θ)} h( F^{−1}(v|θ) ) dv ] / ( F(u|θ) − F(d|θ) )
         = −[ ∫_{F(d|θ)}^{F(u|θ)} log(1 − v) dv ] / ( α ( F(u|θ) − F(d|θ) ) )
         = [ (x_0/d)^α (1 − α log(x_0/d)) − (x_0/u)^α (1 − α log(x_0/u)) ] / ( α [ (x_0/d)^α − (x_0/u)^α ] )
         = [ u^α (1 − α log(x_0/d)) − d^α (1 − α log(x_0/u)) ] / ( α ( u^α − d^α ) ) =: g_{du}(α),

where we have used h( F^{−1}(v|θ) ) = −log(1 − v)/α and 1 − F(y|θ) = (x_0/y)^α, and the last step multiplies the numerator and denominator by (du)^α / x_0^α. Now, to establish the statement, it is enough to prove that the function g_{du} is strictly decreasing with respect to α. Differentiating,

  g′_{du}(α) = dg_{du}(α)/dα = [ (du)^α α² (log(u/d))² − ( u^α − d^α )² ] / ( α² ( u^α − d^α )² ).

In order to show that g′_{du}(α) < 0, it is enough to establish (du)^{α/2} α log(u/d) < u^α − d^α, which, after dividing both sides by (du)^{α/2}, is equivalent to

  α log(u/d) < (u/d)^{α/2} − (d/u)^{α/2} = 2 sinh( (α/2) log(u/d) ).

Setting ξ := (α/2) log(u/d) > 0, this reads ξ < sinh ξ. But we know that x < sinh x for all x > 0; therefore, g′_{du}(α) < 0 for all α > 0, which implies that g_{du} is strictly decreasing. Finally, note that

  lim_{α→0} g_{du}(α) = [ (log u)² − (log d)² − 2 log u · log(x_0/d) + 2 log d · log(x_0/u) ] / ( 2 log(u/d) ) = log( √(du) / x_0 ),

and lim_{α→∞} g_{du}(α) = −log(x_0/d) = log(d/x_0).
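The monotonicity of g_du and its two limits can be checked numerically. In this sketch (the values of d, u, x_0 are hypothetical), the formula is rescaled by u^α, which is algebraically equivalent but avoids overflow for large α:

```python
import numpy as np

def g_du(alpha, d, u, x0):
    """g_du(alpha) = [u^a(1 + a*log(d/x0)) - d^a(1 + a*log(u/x0))] / (a(u^a - d^a)),
    rewritten with r = (d/u)^a for numerical stability at large alpha."""
    r = (d / u) ** alpha
    num = (1 + alpha * np.log(d / x0)) - r * (1 + alpha * np.log(u / x0))
    return num / (alpha * (1 - r))

d, u, x0 = 2.0, 50.0, 1.0
alphas = np.linspace(0.1, 20.0, 400)
vals = g_du(alphas, d, u, x0)
print(np.all(np.diff(vals) < 0))                         # strictly decreasing: True
print(g_du(1e-4, d, u, x0), np.log(np.sqrt(d * u) / x0)) # close to log(sqrt(du)/x0)
print(g_du(500.0, d, u, x0), np.log(d / x0))             # close to log(d/x0)
```

The decreasing curve from log(√(du)/x_0) down to log(d/x_0) is exactly what the proof establishes, so any μ̂_MTuM strictly between these limits determines a unique α.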