Truncated, Censored, and Actuarial Payment-type Moments for Robust Fitting of a Single-parameter Pareto Distribution
Chudamani Poudyal
Department of Mathematics, Tennessee Technological University
October 12, 2020
Abstract. Under some regularity conditions, maximum likelihood estimators (MLEs) always produce asymptotically optimal (in the sense of consistency, efficiency, sufficiency, and unbiasedness) estimators. In general, however, MLEs lead to non-robust statistical inference, for example, for pricing models and risk measures. Actuarial claim severity is continuous, right-skewed, and frequently heavy-tailed. The data sets that such models are usually fitted to contain outliers that are difficult to identify and separate from genuine data. Moreover, due to commonly used actuarial "loss control strategies" in the financial and insurance industries, the random variables we observe and wish to model are affected by truncation (due to deductibles), censoring (due to policy limits), scaling (due to coinsurance proportions), and other transformations. To alleviate the lack of robustness of MLE-based inference in risk modeling, in this paper we propose and develop a new method of estimation, the method of truncated moments (MTuM), and generalize it to different scenarios of loss control mechanisms. Various asymptotic properties of the resulting estimators are established using central limit theory. New connections between different estimators are found. A comparative study of the newly designed methods with the corresponding MLEs is performed. A detailed investigation, including a simulation study, is carried out for the single-parameter Pareto loss model.
Keywords & Phrases. Claim Severity; Deductible; Relative Efficiency; Loss Models; Robust Estimation; Truncated and Censored Moments.

Chudamani Poudyal, Ph.D., is an Assistant Professor in the Department of Mathematics, Tennessee Technological University, Cookeville, TN 38505, USA. e-mail: [email protected]

1 Introduction
The research leading to the results of this work is motivated by the search for trade-offs between robustness and efficiency of parametric estimators for ground-up continuous loss distributions. Parametric statistical loss models for insurance claim severity are continuous, right-skewed, and frequently heavy-tailed [15]. The data sets that such models are usually fitted to contain outliers that are difficult to identify and separate from genuine data. As a result, a small perturbation in the assumed model away from the unknown true underlying parametric model can produce a significant difference in statistical inference. In practice, due to commonly used loss control mechanisms in the financial and insurance industries [8], the random variables we observe and wish to model are affected by data truncation (due to deductibles), censoring (due to policy limits), and scaling (due to coinsurance factors). Maximum likelihood estimators (MLEs) typically result in sensitive loss severity models when there is a small perturbation in the underlying assumed model or when the observed sample comes from a contaminated distribution [23]. The implementation of MLE procedures, even on ground-up loss data, is computationally challenging [9, 16]. This issue is even more evident when one tries to fit complicated multi-parameter models such as mixtures of Erlangs [19, 25]. Thus, besides many ideas from the mainstream robust statistics literature [see, e.g., 10, 13, 14], actuaries have to deal with heavy-tailed and skewed distributions, data truncation and censoring, identification and recycling of outliers, aggregate losses, etc. Based on a general class of L-statistics [4], two broad classes of robust estimators, the method of trimmed moments (MTM) [1] and the method of winsorized moments (MWM) [26], have recently been developed with actuarial applications in view.
Therefore, it is appealing to search for estimation procedures that work directly with the loss control mechanisms mentioned above and are insensitive to outliers. If a truncated (singly or doubly) normal sample is available, then MLE procedures for such data were developed by [6], and method-of-truncated-moments estimators can be found in [7] and [22]. But the goal and motivation of this research work are different and were initially proposed by this author in [18]. That is, instead of truncated sample data, we assume that a complete ground-up sample of loss data is available, i.e., the data set is neither truncated nor censored, and we propose and develop robust estimation procedures for the corresponding ground-up loss severity models. Instead of trimming or winsorizing fixed lower (say, 2%) and upper (say, 3%) proportions of the observed sample, in this paper we develop a novel fixed lower and upper threshold method of truncated moments (MTuM) approach, in which the tail probabilities are random. Depending on the nature of the loss data mentioned above, some variants of MTuM, called the method of censored moments (MCM) and the method of actuarial payment-type moments (MTCM), will be defined for the single-parameter Pareto distribution; see Figure 3.3. Asymptotic distributional properties, such as normality and consistency, along with the asymptotic relative efficiency of these estimators with respect to the corresponding MLEs, are established. Several theoretical connections between the different approaches are also discovered. The newly designed procedures work like the standard method of moments, but instead of classical moments they use truncated or censored moments of a completely observed sample. Irrespective of the heaviness of the underlying distribution, threshold truncated and censored moments are always finite. The remainder of the paper is organized as follows.
In Section 2, the newly proposed MTuM estimation procedure is defined in general, and the corresponding asymptotic distributional properties are established. In Section 3, we develop specific formulas for the different estimators (including MTuM) when the underlying loss distribution is Pareto I, which is equivalent to an exponential distribution, and compare the asymptotic relative efficiency of all the estimators with respect to the corresponding MLEs for completely observed data. Several connections among the different estimators are established. Section 4 summarizes a detailed simulation study of the estimators developed in this paper. Concluding remarks are offered in Section 5. Finally, some additional results are provided in Appendices A and B.

2 Method of Truncated Moments (MTuM)

We assume that complete ground-up loss data are available, i.e., the data set is neither truncated nor censored. Then, instead of trimming or winsorizing fixed proportions from both tails of a completely observed sample, as investigated by [1, 26], in this approach to parametric estimation we truncate the data from below at a lower threshold and from above at an upper threshold, and then apply the method of moments to the remaining data. We call such an approach the method of truncated moments (MTuM).

2.1 Definition
Let $X_1, X_2, \ldots, X_n$ be i.i.d. random variables with common ground-up cdf $F(\cdot \mid \boldsymbol{\theta})$, where $\boldsymbol{\theta} := (\theta_1, \ldots, \theta_k)$, $k \geq 1$, is the parameter vector to be estimated. The truncated moment estimators of $\theta_1, \ldots, \theta_k$ are computed according to the following procedure.

(i) The sample truncated moments are computed as
$$ \widehat{\mu}_j = \frac{\sum_{i=1}^{n} h_j(X_i)\,\mathbf{1}\{d_j < X_i \leq u_j\}}{\sum_{i=1}^{n} \mathbf{1}\{d_j < X_i \leq u_j\}}, \qquad 1 \leq j \leq k, \tag{2.1} $$
where $\mathbf{1}\{\cdot\}$ denotes the indicator function. The $h_j$'s in (2.1) are specially chosen functions, and the thresholds $d_j$ and $u_j$ are chosen by the researcher. In general, it is reasonable to assume that $X_{1:n} \leq d_j < u_j \leq X_{n:n}$ for all $1 \leq j \leq k$, where $X_{1:n}$ and $X_{n:n}$ are, respectively, the smallest and the largest order statistics of the sample.

(ii) Derive the corresponding population truncated moments as
$$ \mu_j(\theta_1, \ldots, \theta_k) = \mathbb{E}[h_j(X) \mid d_j < X \leq u_j] = \frac{\mathbb{E}\big[h_j(X)\,\mathbf{1}\{d_j < X \leq u_j\}\big]}{\mathbb{P}(d_j < X \leq u_j)} = \frac{\int_{d_j}^{u_j} h_j(x) f(x \mid \boldsymbol{\theta})\, dx}{F(u_j \mid \boldsymbol{\theta}) - F(d_j \mid \boldsymbol{\theta})}, \qquad 1 \leq j \leq k. \tag{2.2} $$

(iii) Now, match the sample and population truncated moments from (2.1) and (2.2) to get the following system of equations for $\theta_1, \ldots, \theta_k$:
$$ \begin{cases} \mu_1(\theta_1, \ldots, \theta_k) = \widehat{\mu}_1, \\ \qquad \vdots \\ \mu_k(\theta_1, \ldots, \theta_k) = \widehat{\mu}_k. \end{cases} \tag{2.3} $$

Definition 2.1.
A solution to the system of equations (2.3), say $\widehat{\boldsymbol{\theta}} = \big(\widehat{\theta}_1, \ldots, \widehat{\theta}_k\big)$, if it exists, is called the method of truncated moments (MTuM) estimator of $\boldsymbol{\theta}$. Thus, $\widehat{\theta}_j =: g_j(\widehat{\mu}_1, \ldots, \widehat{\mu}_k)$, $1 \leq j \leq k$, are the MTuM estimators of $\theta_1, \ldots, \theta_k$. Note 2.1.
Obviously, it is possible that the system of equations (2.3) has no solution, or that it is difficult to solve even with numerical methods when $k$ is large. To mitigate this issue, the functions $h_j$ have to be chosen carefully. But most claim severity distributions have a small number $k$ of parameters, usually not exceeding three [see 15, Appendix A].

2.2 Asymptotic Properties

For $1 \leq j, j' \leq k$ and for any positive integer $n$, define $\mathbf{1}\{d_{jj'} < X \leq u_{jj'}\} := \mathbf{1}\{d_j < X \leq u_j\}\,\mathbf{1}\{d_{j'} < X \leq u_{j'}\}$ and consider the following additional notations:
$$ \begin{aligned} Z_j &:= h_j(X), & h_{jj'}(x) &:= h_j(x)\, h_{j'}(x), & p_j &:= F(u_j \mid \boldsymbol{\theta}) - F(d_j \mid \boldsymbol{\theta}), \\ Y_{jj'} &:= Y_j Y_{j'}, & Y_j &:= Z_j\,\mathbf{1}\{d_j < X \leq u_j\}, & p_{jj'} &:= F(u_{jj'} \mid \boldsymbol{\theta}) - F(d_{jj'} \mid \boldsymbol{\theta}), \\ r_j &:= h_j(d_j), & R_j &:= h_j(u_j), & W_{jj'} &:= Z_j\,\mathbf{1}\{d_{jj'} < X \leq u_{jj'}\}, \end{aligned} $$
and $p_{j,n} := F_n(u_j) - F_n(d_j)$, where $F_n(x) = \frac{1}{n}\sum_{i=1}^{n}\mathbf{1}\{X_i \leq x\}$ is the empirical distribution function. Note that $Y_{jj'} = Y_{j'j}$ but, in general, $W_{jj'} \neq W_{j'j}$ for $j \neq j'$. With these notations, the density of $Y_j$, $1 \leq j \leq k$, can be expressed as
$$ f_{Y_j}(y) = \begin{cases} 1 - F_{Z_j}(R_j \mid \boldsymbol{\theta}) + F_{Z_j}(r_j \mid \boldsymbol{\theta}), & y = 0; \\ f_{Z_j}(y \mid \boldsymbol{\theta}), & r_j < y < R_j; \\ 0, & \text{otherwise}. \end{cases} $$
The densities of the random variables $Y_{jj'} = Y_{j'j}$ and $W_{jj'}$ can be constructed from the four possible scenarios listed in Appendix A. To establish the asymptotic distribution of $\widehat{\boldsymbol{\mu}}$, we need the following lemma. Lemma 2.1.
For $1 \leq j, j' \leq k$,
$$ \mathrm{Cov}\big(Y_j, Y_{j'}\big) = \mu_{Y_{jj'}} - \mu_{Y_j}\mu_{Y_{j'}}, \qquad \mathrm{Cov}\big(Y_j, p_{j',1}\big) = \mu_{W_{jj'}} - \mu_{Y_j}\, p_{j'}, \qquad \mathrm{Cov}\big(p_{j,1}, p_{j',1}\big) = p_{jj'} - p_j\, p_{j'}. $$

Consider the $2k$-dimensional random vector $\mathbf{V} := (Y_1, \ldots, Y_k, p_{1,1}, \ldots, p_{k,1})$. Clearly, the mean vector of $\mathbf{V}$ is $\boldsymbol{\mu}_{\mathbf{V}} = (\mu_{Y_1}, \ldots, \mu_{Y_k}, p_1, \ldots, p_k)$ and, by Lemma 2.1, the variance-covariance matrix is $\boldsymbol{\Sigma}_{\mathbf{V}} = \big[\sigma_{\mathbf{V},jj'}\big]_{j,j'=1}^{2k}$, where
$$ \sigma_{\mathbf{V},jj'} = \begin{cases} \mu_{Y_{jj'}} - \mu_{Y_j}\mu_{Y_{j'}}, & 1 \leq j, j' \leq k; \\ \mu_{W_{j(j'-k)}} - \mu_{Y_j}\, p_{j'-k}, & 1 \leq j \leq k,\ k+1 \leq j' \leq 2k; \\ \mu_{W_{(j-k)j'}} - \mu_{Y_{j'}}\, p_{j-k}, & 1 \leq j' \leq k,\ k+1 \leq j \leq 2k; \\ p_{(j-k)(j'-k)} - p_{j-k}\, p_{j'-k}, & k+1 \leq j, j' \leq 2k. \end{cases} $$

Theorem 2.1.
The empirical estimator
$$ \widehat{\boldsymbol{\mu}}_{\mathbf{V}} := \Big( \tfrac{1}{n}\textstyle\sum_{i=1}^{n} Y_{1,i}, \ldots, \tfrac{1}{n}\sum_{i=1}^{n} Y_{k,i},\ \tfrac{1}{n}\sum_{i=1}^{n} p_{1,i}, \ldots, \tfrac{1}{n}\sum_{i=1}^{n} p_{k,i} \Big) = \big( \bar{Y}_{1,n}, \ldots, \bar{Y}_{k,n},\ p_{1,n}, \ldots, p_{k,n} \big) $$
of the mean vector $\boldsymbol{\mu}_{\mathbf{V}}$ satisfies $\widehat{\boldsymbol{\mu}}_{\mathbf{V}} \sim \mathcal{AN}\big(\boldsymbol{\mu}_{\mathbf{V}}, \tfrac{1}{n}\boldsymbol{\Sigma}_{\mathbf{V}}\big)$.

Proof. Let $\{\mathbf{V}_n\}$ be a sequence of i.i.d. copies of $\mathbf{V}$; then, by the multivariate central limit theorem [see, e.g., 21, Theorem B, p. 28], we have
$$ \big( \bar{Y}_{1,n}, \ldots, \bar{Y}_{k,n},\ p_{1,n}, \ldots, p_{k,n} \big) = \frac{1}{n}\sum_{i=1}^{n} \mathbf{V}_i \sim \mathcal{AN}\Big(\boldsymbol{\mu}_{\mathbf{V}}, \frac{1}{n}\boldsymbol{\Sigma}_{\mathbf{V}}\Big). $$

The system of MTuM equations (2.3) can now be written as
$$ \mu_1(\theta_1, \ldots, \theta_k) = \widehat{\mu}_1 = \frac{\bar{Y}_{1,n}}{p_{1,n}}, \quad \ldots, \quad \mu_k(\theta_1, \ldots, \theta_k) = \widehat{\mu}_k = \frac{\bar{Y}_{k,n}}{p_{k,n}}. \tag{2.4} $$

Lemma 2.2.
Consider the function $g_{\mathbf{V}} : \mathbb{R}^{2k} \to \mathbb{R}^{k}$ defined, for $\mathbf{x} = (x_1, x_2, \ldots, x_{2k})$, by
$$ g_{\mathbf{V}}(\mathbf{x}) = \big(g_1(\mathbf{x}), \ldots, g_k(\mathbf{x})\big) := \Big( \frac{x_1}{x_{k+1}}, \ldots, \frac{x_k}{x_{2k}} \Big), \qquad \text{where } x_i \neq 0 \text{ for } i = k+1, \ldots, 2k. $$
Then $g_{\mathbf{V}}$ is totally differentiable at any such point $\mathbf{x} \in \mathbb{R}^{2k}$.

Proof. The proof follows directly from [21, Lemma 1.12.2].

With the help of Theorem 2.1 and Lemma 2.2, we are now ready to state the asymptotic distribution of the truncated sample moment vector $\widehat{\boldsymbol{\mu}}$, whose proof can be found in Appendix B. Theorem 2.2.
The asymptotic joint distribution of the truncated sample moment vector $(\widehat{\mu}_1, \ldots, \widehat{\mu}_k)$ is $\mathcal{AN}\big(\boldsymbol{\mu}, \tfrac{1}{n}\boldsymbol{\Sigma}\big)$ with $\boldsymbol{\Sigma} = \mathbf{D}_{\mathbf{V}}\boldsymbol{\Sigma}_{\mathbf{V}}\mathbf{D}_{\mathbf{V}}' =: \big[\sigma_{jj'}\big]_{k \times k}$, where
$$ \sigma_{jj'} = \frac{1}{p_{j'}}\left( \frac{\mu_{Y_{jj'}} - \mu_{Y_j}\mu_{Y_{j'}}}{p_j} - \frac{\mu_{Y_j}\big(\mu_{W_{j'j}} - \mu_{Y_{j'}}\, p_j\big)}{p_j^2} \right) - \frac{\mu_{Y_{j'}}}{p_{j'}^2}\left( \frac{\mu_{W_{jj'}} - \mu_{Y_j}\, p_{j'}}{p_j} - \frac{\mu_{Y_j}\big(p_{jj'} - p_j\, p_{j'}\big)}{p_j^2} \right). $$

Now, with $\widehat{\boldsymbol{\mu}} = (\widehat{\mu}_1, \ldots, \widehat{\mu}_k)$ and $g_{\boldsymbol{\theta}}(\widehat{\boldsymbol{\mu}}) = \big(g_{1,\boldsymbol{\theta}}(\widehat{\boldsymbol{\mu}}), \ldots, g_{k,\boldsymbol{\theta}}(\widehat{\boldsymbol{\mu}})\big) = \widehat{\boldsymbol{\theta}}$, the delta method [see, e.g., 21, Theorem A, p. 122] yields the following main result of this section. Theorem 2.3.
The MTuM estimator $\widehat{\boldsymbol{\theta}}$ of $\boldsymbol{\theta}$ has the following asymptotic distribution:
$$ \widehat{\boldsymbol{\theta}} = \big(\widehat{\theta}_1, \ldots, \widehat{\theta}_k\big) \sim \mathcal{AN}\Big(\boldsymbol{\theta}, \frac{1}{n}\,\mathbf{D}\boldsymbol{\Sigma}\mathbf{D}'\Big), $$
where the Jacobian $\mathbf{D}$ is given by $\mathbf{D} = \Big[\frac{\partial g_{j,\boldsymbol{\theta}}}{\partial \widehat{\mu}_{j'}}\Big|_{\widehat{\boldsymbol{\mu}} = \boldsymbol{\mu}}\Big]_{k \times k} =: [d_{jj'}]_{k \times k}$, and the variance-covariance matrix $\boldsymbol{\Sigma}$ has the same form as in Theorem 2.2.

Note 2.2. In view of the above derivations, we notice that data trimming, and thus the method of trimmed moments (MTM) investigated by [1], can be interpreted as a special case of data truncation, and thus of MTuM. To see this, let $F$ be the distribution function of $X$. For $1 \leq j \leq k$, set $F(d_j \mid \boldsymbol{\theta}) = a_j$ and $F(u_j \mid \boldsymbol{\theta}) = 1 - b_j$. Then, using integration by substitution with $U = F(X)$, equation (2.2) becomes
$$ \mu_j(\theta_1, \ldots, \theta_k) = \frac{\int_{d_j}^{u_j} h_j(x) f(x \mid \boldsymbol{\theta})\, dx}{F(u_j \mid \boldsymbol{\theta}) - F(d_j \mid \boldsymbol{\theta})} = \frac{\int_{F(d_j \mid \boldsymbol{\theta})}^{F(u_j \mid \boldsymbol{\theta})} h_j\big(F^{-1}(u \mid \boldsymbol{\theta})\big)\, du}{F(u_j \mid \boldsymbol{\theta}) - F(d_j \mid \boldsymbol{\theta})} \tag{2.5a} $$
$$ = \frac{\int_{a_j}^{1-b_j} h_j\big(F^{-1}(u \mid \boldsymbol{\theta})\big)\, du}{1 - a_j - b_j}, \tag{2.5b} $$
which is equivalent to the corresponding population trimmed moment. Note 2.3.
For estimation purposes, these two approaches (i.e., MTM and MTuM) are very different. With the MTuM approach, the limits of integration as well as the denominator in equation (2.5a) are unknown, which creates technical complications when we want to assess the asymptotic properties of MTuM estimators. On the other hand, with the MTM approach, both the limits of integration and the denominator in equation (2.5b) are constants, which simplifies matters significantly. Indeed, as is evident from the complete data examples in [1] and [26], MTM leads to explicit formulas for all location-scale families and their variants, but that is not the case for MTuM. In view of this, we will consider the MTuM approach further only for some data scenarios, but not all.
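As a concrete illustration of steps (i)-(iii), the following Python sketch (with hypothetical helper names, not from the paper) computes the sample truncated moment (2.1) and solves the one-dimensional system (2.3) by bisection. For concreteness it assumes $k = 1$, $h(x) = x$, and an exponential model, for which the population moment (2.2) has a closed form; the monotonicity that makes bisection valid is established later, in Theorem 3.2.

```python
import math
import random

def sample_truncated_moment(xs, d, u, h=lambda x: x):
    """Sample truncated moment (2.1): average of h(X_i) over the X_i in (d, u]."""
    kept = [h(x) for x in xs if d < x <= u]
    return sum(kept) / len(kept)

def population_truncated_moment(theta, d, u):
    """Population truncated moment (2.2) for h(x) = x and X ~ Exp(theta):
    E[X | d < X <= u] = theta + (d e^{-d/theta} - u e^{-u/theta}) / (e^{-d/theta} - e^{-u/theta})."""
    ed, eu = math.exp(-d / theta), math.exp(-u / theta)
    return theta + (d * ed - u * eu) / (ed - eu)

def mtum_estimate(xs, d, u, lo=1e-3, hi=1e3):
    """Match sample and population truncated moments (step (iii)) by bisection;
    the population moment is strictly increasing in theta for this model."""
    target = sample_truncated_moment(xs, d, u)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if population_truncated_moment(mid, d, u) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

random.seed(1)
theta_true = 10.0
data = [random.expovariate(1.0 / theta_true) for _ in range(100_000)]
theta_hat = mtum_estimate(data, d=0.5, u=30.0)
```

With a large simulated sample, the recovered estimate is close to the true parameter, illustrating the consistency discussed in Section 2.2.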
3 Single-Parameter Pareto Distribution

Let $Y \sim \text{Pareto I}(\alpha, x_0)$ with distribution function $F_Y(y) = 1 - (x_0/y)^{\alpha}$, $y > x_0$, zero elsewhere, where $\alpha > 0$ is the shape (so-called tail) parameter and $x_0 > 0$ is a known left threshold. Then $X := \log(Y/x_0) \sim \text{Exp}(\theta = 1/\alpha)$ with distribution function $F_X(x) = 1 - e^{-x/\theta}$. Therefore, estimating $\alpha$ is equivalent to estimating the exponential parameter $\theta$, which is what we do for the rest of this section. MTuM will be derived, with asymptotic results, for a complete i.i.d. sample from an exponential distribution. For this particular distribution, we also explore two additional methods: the method of censored moments and insurance payment-type moment estimators. Several connections between the different approaches are established.

The asymptotic performance of the newly designed estimators will be measured via the asymptotic relative efficiency (ARE) with respect to the MLE, defined as [see, e.g., 21, 24]
$$ \text{ARE}\big(\mathcal{C}, \text{MLE}\big) = \frac{\text{asymptotic variance of the MLE}}{\text{asymptotic variance of the } \mathcal{C} \text{ estimator}}. \tag{3.1} $$
The main reason why the MLE should be used as a benchmark is its optimal asymptotic performance in terms of variability (of course, with the usual caveat of "under certain regularity conditions").

3.1 MTuM for the Exponential Distribution

In this section, we derive MTuM and related estimators for the parameter of the exponential distribution for completely observed data. Since there is a single parameter, $\theta$, to be estimated, we consider the function $h(x) = x$. Let $X_1, \ldots, X_n$ be i.i.d. random variables as in Definition 2.1, and let $d$ and $u$ be the left and right truncation points, respectively. Then the sample truncated moment is given by
$$ \widehat{\mu}_{\rm MTuM} := \frac{\big(\sum_{i=1}^{n} X_i\,\mathbf{1}\{d < X_i \leq u\}\big)/n}{\big(\sum_{i=1}^{n} \mathbf{1}\{d < X_i \leq u\}\big)/n} = \frac{\big(\sum_{i=1}^{n} Y_i\big)/n}{F_n(u) - F_n(d)} = \frac{\bar{Y}_n}{F_n(u) - F_n(d)} = \frac{\bar{Y}_n}{p_n}, $$
where $Y_1, Y_2, \ldots, Y_n \overset{\text{i.i.d.}}{\sim} Y := X\,\mathbf{1}\{d < X \leq u\}$, $p_n := F_n(u) - F_n(d)$, and $p \equiv p(\theta) = F(u \mid \theta) - F(d \mid \theta) = e^{-d/\theta} - e^{-u/\theta}$. Theorem 3.1.
The mean and the variance of the random variable $Y$ are, respectively, given by
$$ \mu_Y = \theta p + d e^{-d/\theta} - u e^{-u/\theta} \qquad \text{and} \qquad \sigma_Y^2 = 2\theta^2\left(\Gamma\Big(3; \frac{u}{\theta}\Big) - \Gamma\Big(3; \frac{d}{\theta}\Big)\right) - \mu_Y^2, $$
where $\Gamma(\alpha; x)$, with $\alpha > 0$, $x > 0$, is the (regularized) incomplete gamma function defined as
$$ \Gamma(\alpha; x) = \frac{1}{\Gamma(\alpha)}\int_0^x t^{\alpha-1} e^{-t}\, dt \qquad \text{with} \qquad \Gamma(\alpha) = \int_0^{\infty} t^{\alpha-1} e^{-t}\, dt. $$
Proof.
See Appendix B.

From Theorem 2.2, $\widehat{\mu}_{\rm MTuM} \sim \mathcal{AN}\Big(\frac{\mu_Y}{p}, \frac{1}{n}\Big(\frac{\sigma_Y^2}{p^2} - \frac{(1-p)\mu_Y^2}{p^3}\Big)\Big)$. Note that the asymptotic variance of $\widehat{\mu}_{\rm MTuM}$ is exactly equal to the approximation obtained through the second-order Taylor series expansion of the ratio of the asymptotic distributions of $\bar{Y}_n$ and $p_n$, as mentioned in [12]. The corresponding population version of $\widehat{\mu}_{\rm MTuM}$ is given by
$$ \mu_{\rm MTuM} := \mathbb{E}[X \mid d < X \leq u] = \frac{\mathbb{E}[Y]}{F(u \mid \theta) - F(d \mid \theta)} = \frac{\mu_Y}{p}. \tag{3.2} $$

Theorem 3.2. The equation $\mu_{\rm MTuM} = \widehat{\mu}_{\rm MTuM}$ has a unique solution $\widehat{\theta}$ provided that $d < \widehat{\mu}_{\rm MTuM} < \frac{d+u}{2}$. Otherwise, a solution does not exist.

Proof. It is clear that $d < \widehat{\mu}_{\rm MTuM} < u$. Also,
$$ \mu_{\rm MTuM}(\theta) = \frac{\mu_Y}{p} = \frac{e^{-d/\theta}(d + \theta) - e^{-u/\theta}(u + \theta)}{e^{-d/\theta} - e^{-u/\theta}}. $$
Then, in order to establish the result, it is enough to prove the following statements: (a) $\mu_{\rm MTuM}(\theta)$ is strictly increasing; (b) $\lim_{\theta \to 0} \mu_{\rm MTuM}(\theta) = d$; and (c) $\lim_{\theta \to \infty} \mu_{\rm MTuM}(\theta) = \frac{d+u}{2}$.

First of all, let us establish that $\mu_{\rm MTuM}(\theta)$ is strictly increasing. Writing $\mu_{\rm MTuM}(\theta) = \theta + \frac{d e^{-d/\theta} - u e^{-u/\theta}}{e^{-d/\theta} - e^{-u/\theta}}$ and differentiating with respect to $\theta$, the derivative simplifies to
$$ \mu_{\rm MTuM}'(\theta) = 1 - \left(\frac{u - d}{2\theta}\right)^2 \operatorname{csch}^2\left(\frac{u - d}{2\theta}\right). $$
Therefore, $\mu_{\rm MTuM}'(\theta) > 0$ if and only if $\big(\frac{u-d}{2\theta}\big)^2 < \sinh^2\big(\frac{u-d}{2\theta}\big)$, which is true since $x < \sinh x$ for all $x > 0$ and $x > \sinh x$ for all $x < 0$. Further,
$$ \lim_{\theta \to 0} \mu_{\rm MTuM}(\theta) = \lim_{\theta \to 0}\left[\theta + \frac{d e^{-d/\theta} - u e^{-u/\theta}}{e^{-d/\theta} - e^{-u/\theta}}\right] = \lim_{\theta \to 0} \frac{e^{-d/\theta}\big(d - u\, e^{(d-u)/\theta}\big)}{e^{-d/\theta}\big(1 - e^{(d-u)/\theta}\big)} = d. $$
Similarly, with the substitution $y := 1/\theta$ and two applications of L'Hopital's rule,
$$ \lim_{\theta \to \infty} \mu_{\rm MTuM}(\theta) = \lim_{y \to 0}\left[\frac{1}{y} + \frac{d e^{-dy} - u e^{-uy}}{e^{-dy} - e^{-uy}}\right] = d + \lim_{y \to 0} \frac{1 - \big(1 + (u-d)y\big) e^{-(u-d)y}}{y\big(1 - e^{-(u-d)y}\big)} = d + \frac{u-d}{2} = \frac{d+u}{2}. $$

Also, from (3.2), we have
$$ \theta' := \frac{d\theta}{d\mu_{\rm MTuM}} = \frac{p\theta^2}{d e^{-d/\theta}(\theta + d) - u e^{-u/\theta}(\theta + u) + p\theta^2 - \mu_{\rm MTuM}\big(d e^{-d/\theta} - u e^{-u/\theta}\big)} \tag{3.3} $$
$$ = \frac{p^2\theta^2}{p^2\theta^2 - e^{-(d+u)/\theta}(u-d)^2}. \tag{3.4} $$
Therefore, by the delta method, we get
$$ \widehat{\theta}_{\rm MTuM} \sim \mathcal{AN}\left(\theta, (\theta')^2\left(\frac{\sigma_Y^2}{np^2} - \frac{(1-p)\mu_Y^2}{np^3}\right)\right) = \mathcal{AN}\left(\theta, \frac{\theta^2}{n}\cdot\frac{p\theta^2}{p^2\theta^2 - e^{-(d+u)/\theta}(u-d)^2}\right), \tag{3.5} $$
and hence
$$ \text{ARE}\big(\widehat{\theta}_{\rm MTuM}, \widehat{\theta}_{\rm MLE}\big) = \frac{\theta^2 p^3}{(\theta')^2\big(p\sigma_Y^2 - (1-p)\mu_Y^2\big)} = \frac{p^2\theta^2 - e^{-(d+u)/\theta}(u-d)^2}{p\theta^2}. \tag{3.6} $$
Clearly, $\text{ARE}\big(\widehat{\theta}_{\rm MTuM}, \widehat{\theta}_{\rm MLE}\big)$ given by (3.6) is a function of the parameter $\theta$. Thus, it turns out that if we fix the left and right truncation thresholds, $d$ and $u$, and allow the tail probabilities $F(d \mid \theta)$ and $1 - F(u \mid \theta)$ to be random, then the corresponding asymptotic relative efficiency is not stable (see Figure 3.2). However, as in the method of trimmed moments (MTM) [see, e.g., 1, 26], if the tail probabilities $F(d \mid \theta)$ and $1 - F(u \mid \theta)$ are fixed, then we have the following result. Proposition 3.1.
Let $\theta_1 \neq \theta_2$ be two exponential parameters with corresponding left and right truncation thresholds $d_1, d_2$ and $u_1, u_2$, respectively. Assume $F(d_1 \mid \theta_1) = F(d_2 \mid \theta_2)$ and $F(u_1 \mid \theta_1) = F(u_2 \mid \theta_2)$; then it follows that
$$ \text{ARE}\big(\widehat{\theta}_1, \widehat{\theta}_{\rm MLE,1}\big) = \text{ARE}\big(\widehat{\theta}_2, \widehat{\theta}_{\rm MLE,2}\big). \tag{3.7} $$
Proof.
A proof immediately follows from (3.6).

Numerical values of $\text{ARE}\big(\widehat{\theta}_{\rm MTuM}, \widehat{\theta}_{\rm MLE}\big)$ given by (3.6), for selected values of the left and right truncation thresholds $d$ and $u$, are summarized in the first horizontal block of Table 3.1.

As mentioned above, if $Y \sim \text{Pareto I}(\alpha, x_0)$ with $x_0$ known, then $X := \log(Y/x_0) \sim \text{Exp}(1/\alpha =: \theta)$. So estimators of $\alpha$ of the single-parameter Pareto distribution share the same AREs as the estimators of $\text{Exp}(\theta)$, given that $h(y) = \log(y/x_0)$. The following result for the single-parameter Pareto has been partially derived in [5], but can easily be extended using the tools of this section. Theorem 3.3.
Let $d$ and $u$ be the left and right truncation points, respectively, for $Y \sim \text{Pareto I}(\alpha, x_0)$. Also, define
$$ A_{du} := u^{\alpha}\Big(1 - \alpha \log\frac{x_0}{d}\Big) - d^{\alpha}\Big(1 - \alpha \log\frac{x_0}{u}\Big) \qquad \text{and} \qquad g_{du}(\alpha) := \frac{A_{du}}{\alpha\,(u^{\alpha} - d^{\alpha})}. $$
Then the equation $\widehat{\mu}_{\rm MTuM} = \mu_{\rm MTuM}$ has a unique solution provided that
$$ \lim_{\alpha \to \infty} g_{du}(\alpha) < \widehat{\mu}_{\rm MTuM} < \lim_{\alpha \to 0} g_{du}(\alpha). $$

Proof. See Appendix B.

Note that, given truncated data, method-of-truncated-moments estimators for the parameters of a normal population can be found in [7] and [22].
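The Pareto-exponential reduction used throughout this section can be illustrated by simulation. The sketch below (with assumed illustrative values $\alpha = 2$, $x_0 = 5$, not taken from the paper) samples from Pareto I by inverse-cdf sampling, transforms the data by $X = \log(Y/x_0)$, and recovers $\alpha$ through the exponential parameter $\theta = 1/\alpha$:

```python
import math
import random

random.seed(7)
alpha, x0 = 2.0, 5.0   # assumed (illustrative) Pareto I parameters; x0 is known
n = 200_000

# Inverse-cdf sampling: F_Y(y) = 1 - (x0/y)^alpha  =>  Y = x0 * U^(-1/alpha), U ~ Uniform(0, 1]
ys = [x0 * (1.0 - random.random()) ** (-1.0 / alpha) for _ in range(n)]

# The transformed data X = log(Y/x0) behave like an Exp(theta = 1/alpha) sample,
# so the exponential MLE (the sample mean) back-transforms to an estimate of alpha.
xs = [math.log(y / x0) for y in ys]
theta_hat = sum(xs) / n
alpha_hat = 1.0 / theta_hat
```

The same transformation applies to any of the estimators of this section: fit $\theta$ on the log-transformed data, then set $\widehat{\alpha} = 1/\widehat{\theta}$.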
3.2 Method of Censored Moments

There are several versions of data censoring that occur in statistical modeling: interval censoring (which includes left and right censoring, depending on which end point of the interval is infinite), type I censoring, type II censoring, and random censoring. For actuarial work, the most relevant type is interval censoring. It occurs when complete sample observations are available within some interval, say $(d, u]$, but data outside the interval are only partially known; that is, counts are available but actual values are not. Thus, we observe the i.i.d. data
$$ Z_1, Z_2, \ldots, Z_n, \tag{3.8} $$
where each $Z$ equals the ground-up variable $X$ if $X$ falls between $d$ and $u$, and equals the corresponding end point of the interval if $X$ is beyond that point. That is,
$$ Z := \min\big\{\max(d, X), u\big\} = d\,\mathbf{1}\{X \leq d\} + X\,\mathbf{1}\{d < X \leq u\} + u\,\mathbf{1}\{X > u\} = \begin{cases} d, & X \leq d; \\ X, & d < X \leq u; \\ u, & X > u. \end{cases} $$
Therefore, instead of winsorizing fixed proportions of the lowest and highest order statistics of an observed sample [26], here we design a method of fixed-threshold censored moments for the exponential distribution.

Let $X_1, X_2, \ldots, X_n \overset{\text{i.i.d.}}{\sim} \text{Exp}(\theta)$. Then the sample censored mean is given by
$$ \widehat{\mu}_{\rm MCM} := \frac{d\sum_{i=1}^{n}\mathbf{1}\{X_i \leq d\} + \sum_{i=1}^{n} X_i\,\mathbf{1}\{d < X_i \leq u\} + u\sum_{i=1}^{n}\mathbf{1}\{X_i > u\}}{n}. $$
The corresponding population censored moments are
$$ \mu_{\rm MCM} := \mathbb{E}[Z] = d\big(1 - e^{-d/\theta}\big) + \mu_Y + u e^{-u/\theta} \qquad \text{and} \qquad \mu_{\rm MCM,2} := \mathbb{E}\big[Z^2\big] = d^2\big(1 - e^{-d/\theta}\big) + \mathbb{E}\big[Y^2\big] + u^2 e^{-u/\theta}, $$
where $Y := X\,\mathbf{1}\{d < X \leq u\}$ as in Section 3.1. Thus, $\sigma_{\rm MCM}^2 = \mu_{\rm MCM,2} - \mu_{\rm MCM}^2$. Moreover, setting $\mu_{\rm MCM} = \widehat{\mu}_{\rm MCM}$ implies $d + \theta\big(e^{-d/\theta} - e^{-u/\theta}\big) = \widehat{\mu}_{\rm MCM}$, which needs to be solved to obtain the method of censored moments (MCM) estimator, $\widehat{\theta}_{\rm MCM}$, of $\theta$.

Theorem 3.4. The equation $\widehat{\mu}_{\rm MCM} = \mu_{\rm MCM}$ has a unique solution $\widehat{\theta}_{\rm MCM}$ provided that $d < \widehat{\mu}_{\rm MCM} < u$.
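A minimal sketch of the MCM fitting step (assumed exponential model; the function name is illustrative, not from the paper). The equation $d + \theta(e^{-d/\theta} - e^{-u/\theta}) = \widehat{\mu}_{\rm MCM}$ is solved by bisection, which is justified because the left-hand side increases monotonically from $d$ (as $\theta \to 0$) to $u$ (as $\theta \to \infty$), the behavior behind Theorem 3.4:

```python
import math
import random

def mcm_estimate(xs, d, u, lo=1e-3, hi=1e3):
    """Solve d + theta (e^{-d/theta} - e^{-u/theta}) = censored sample mean
    for theta by bisection (monotone left-hand side, cf. Theorem 3.4)."""
    mu_hat = sum(min(max(d, x), u) for x in xs) / len(xs)   # sample censored mean
    if not (d < mu_hat < u):
        return None                                         # no solution exists
    pop = lambda t: d + t * (math.exp(-d / t) - math.exp(-u / t))
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if pop(mid) < mu_hat:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

random.seed(3)
theta_true = 10.0
data = [random.expovariate(1.0 / theta_true) for _ in range(100_000)]
theta_mcm = mcm_estimate(data, d=0.5, u=30.0)
```

Note that each observation enters only through $\min\{\max(d, X_i), u\}$, so the same code applies unchanged when the data arrive already interval-censored as in (3.8).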
Theorem 3.5. For the method of censored moments (MCM), with $a = F(d \mid \theta)$ and $b = 1 - F(u \mid \theta)$, the following result holds:
$$ \text{ARE}\big(\widehat{\theta}_{\rm MCM}, \widehat{\theta}_{\rm MLE}\big) = \text{ARE}\big(\widehat{\theta}_{\rm MTM}, \widehat{\theta}_{\rm MLE}\big). $$
Proof.
Following [1], we know that $\widehat{\theta}_{\rm MTM} \sim \mathcal{AN}\big(\theta, \frac{\theta^2}{n}\Delta\big)$, with $\Delta = J(a, 1-b)\big/[I(a, 1-b)]^2$, where
$$ I(a, 1-b) := \int_{a}^{1-b} \log(1-v)\, dv \qquad \text{and} \qquad J(a, 1-b) := \int_{a}^{1-b}\!\!\int_{a}^{1-b} \frac{\min\{v, w\} - vw}{(1-v)(1-w)}\, dw\, dv. $$
Therefore,
$$ \text{ARE}\big(\widehat{\theta}_{\rm MTM}, \widehat{\theta}_{\rm MLE}\big) = \frac{1}{\Delta} = \frac{[I(a, 1-b)]^2}{J(a, 1-b)}. $$
On the other hand, since $\mu_Y = p\theta + d e^{-d/\theta} - u e^{-u/\theta} = -\theta\, I(a, 1-b)$,
$$ \text{ARE}\big(\widehat{\theta}_{\rm MCM}, \widehat{\theta}_{\rm MLE}\big) = \frac{\big(p\theta + d e^{-d/\theta} - u e^{-u/\theta}\big)^2}{\sigma_{\rm MCM}^2} = \frac{\theta^2[I(a, 1-b)]^2}{\sigma_{\rm MCM}^2}. \tag{3.10} $$
Thus it remains to show that $J(a, 1-b) = \sigma_{\rm MCM}^2/\theta^2$. Substituting $1 - a = e^{-d/\theta}$ and $b = e^{-u/\theta}$ into $\sigma_{\rm MCM}^2 = \mu_{\rm MCM,2} - \mu_{\rm MCM}^2$ and simplifying gives
$$ \frac{\sigma_{\rm MCM}^2}{\theta^2} = 2(1-a-b) - (1-a-b)^2 - 2b\log\Big(\frac{1-a}{b}\Big). $$
On the other hand, by the symmetry of the integrand of $J$,
$$ J(a, 1-b) = 2\int_{a}^{1-b}\!\!\int_{a}^{w} \frac{v(1-w)}{(1-v)(1-w)}\, dv\, dw = 2\int_{a}^{1-b} \big[a + \log(1-a) - w - \log(1-w)\big]\, dw, $$
using $\int_a^w \frac{v}{1-v}\, dv = a + \log(1-a) - w - \log(1-w)$. A direct evaluation of the last integral yields the same expression, $2(1-a-b) - (1-a-b)^2 - 2b\log\big(\frac{1-a}{b}\big)$, which completes the proof.

Here is an important and new connection between trimmed and interval-censored population means for any $F \in \mathcal{F}$, where $\mathcal{F}$ is the family of continuous parametric distributions. Theorem 3.6.
Let $F \in \mathcal{F}$ be an arbitrary continuous ground-up cumulative distribution function (cdf). Let $d$ and $u$ be the lower and upper thresholds, respectively, and define $a := F(d)$ and $b := 1 - F(u)$. Let
$$ \mu_{\rm MCM} = dF(d) + \int_{d}^{u} z f(z)\, dz + u\big(1 - F(u)\big) \qquad \text{and} \qquad \mu_{\rm MTM} = \frac{1}{1-a-b}\int_{a}^{1-b} F^{-1}(v)\, dv $$
be, respectively, the fixed-threshold censored mean and the fixed-proportion trimmed mean of the same cdf $F$. Then
$$ \text{IF}(\mu_{\rm MCM}, x) = (1-a-b)\,\text{IF}(\mu_{\rm MTM}, x), \qquad -\infty < x < \infty, $$
where IF stands for the influence function.

Proof. The influence function of the trimmed mean is given by [see, e.g., 10, 14]:
$$ \text{IF}(\mu_{\rm MTM}, x) = \frac{1}{1-a-b}\int_{a}^{1-b} \left(\frac{d}{d\lambda} F_{\lambda}^{-1}(v)\right)\bigg|_{\lambda=0} dv = \frac{1}{1-a-b}\int_{a}^{1-b} \frac{v - \mathbf{1}\{F(x) \leq v\}}{f\big(F^{-1}(v)\big)}\, dv, \tag{3.11} $$
where $F_{\lambda} := (1-\lambda)F + \lambda\delta_x$ and $\delta_x$ is the point mass at $x$. Since $d$ and $u$ are the left and right censoring points, respectively, the censored mean is
$$ \mu_{\rm MCM}[F] = \int z\, dF_Z(z) = dF(d) + \mathbb{E}\big[X\,\mathbf{1}\{d < X < u\}\big] + u\big(1 - F(u)\big) = dF(d) + \int_{d}^{u} z\, dF(z) + u\big(1 - F(u)\big), $$
where $F_Z$ is the distribution function of $Z$, given by
$$ F_Z(z \mid d, u) = \mathbb{P}\big[\min\{\max(d, X), u\} \leq z\big] = \begin{cases} 0, & z < d; \\ F(z), & d \leq z < u; \\ 1, & z \geq u. \end{cases} \tag{3.12} $$
Further, $\mu_{\rm MCM}[F_{\lambda}] = dF_{\lambda}(d) + \int_{d}^{u} z\, dF_{\lambda}(z) + u\big(1 - F_{\lambda}(u)\big)$. Note that the influence function is just a special case of the first-order Gateaux derivative [see, e.g., 11, Section 2.3]. Thus, a simpler computational formula for the IF is [see, e.g., 21, Chapter 6]:
$$ \text{IF}(\mu_{\rm MCM}, x) = \frac{d\mu_{\rm MCM}[F_{\lambda}]}{d\lambda}\bigg|_{\lambda=0}. $$
It is clear that $\frac{dF_{\lambda}(d)}{d\lambda}\big|_{\lambda=0} = -F(d) + \delta_x(d)$ and, similarly, $\frac{d(1-F_{\lambda}(u))}{d\lambda}\big|_{\lambda=0} = F(u) - \delta_x(u)$.
Also, by using Leibniz's rule for differentiation under the integral sign, we get
$$ \frac{d}{d\lambda}\int_{d}^{u} z\, dF_{\lambda}(z) = \frac{d}{d\lambda}\int_{F_{\lambda}(d)}^{F_{\lambda}(u)} F_{\lambda}^{-1}(v)\, dv = u\,\frac{d}{d\lambda}F_{\lambda}(u) - d\,\frac{d}{d\lambda}F_{\lambda}(d) + \int_{F_{\lambda}(d)}^{F_{\lambda}(u)} \frac{d}{d\lambda}F_{\lambda}^{-1}(v)\, dv, $$
since $F_{\lambda}^{-1}\big(F_{\lambda}(u)\big) = u$ and $F_{\lambda}^{-1}\big(F_{\lambda}(d)\big) = d$. Evaluating at $\lambda = 0$,
$$ \frac{d}{d\lambda}\int_{d}^{u} z\, dF_{\lambda}(z)\bigg|_{\lambda=0} = u\,\frac{dF_{\lambda}(u)}{d\lambda}\bigg|_{\lambda=0} - d\,\frac{dF_{\lambda}(d)}{d\lambda}\bigg|_{\lambda=0} + \int_{a}^{1-b}\left(\frac{d}{d\lambda}F_{\lambda}^{-1}(v)\right)\bigg|_{\lambda=0} dv. $$
Therefore,
$$ \text{IF}(\mu_{\rm MCM}, x) = d\,\frac{dF_{\lambda}(d)}{d\lambda}\bigg|_{\lambda=0} + u\,\frac{d(1-F_{\lambda}(u))}{d\lambda}\bigg|_{\lambda=0} + \frac{d}{d\lambda}\int_{d}^{u} z\, dF_{\lambda}(z)\bigg|_{\lambda=0} = \int_{a}^{1-b}\left(\frac{d}{d\lambda}F_{\lambda}^{-1}(v)\right)\bigg|_{\lambda=0} dv, \tag{3.13} $$
since the $u$-terms cancel, $u\,\frac{d(1-F_{\lambda}(u))}{d\lambda}\big|_{\lambda=0} + u\,\frac{dF_{\lambda}(u)}{d\lambda}\big|_{\lambda=0} = 0$, and the $d$-terms cancel likewise. Thus, from equations (3.13) and (3.11), $\text{IF}(\mu_{\rm MCM}, x) = (1-a-b)\,\text{IF}(\mu_{\rm MTM}, x)$.

Figure 3.1:
Influence functions of trimmed mean (left panel) and censored mean (right panel).
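The identity of Theorem 3.6 can be checked numerically for the exponential model. In the sketch below (assumed illustrative values $a = b = 0.10$, $\theta = 10$), the censored-mean IF uses the fact that $\mu_{\rm MCM}$ is a linear functional of the cdf, so its Gateaux derivative in the direction of a point mass at $x$ is simply $\min\{\max(d, x), u\} - \mu_{\rm MCM}$, while the trimmed-mean IF is evaluated from (3.11) by a midpoint rule:

```python
import math

theta, a, b = 10.0, 0.10, 0.10          # assumed illustrative values
d = -theta * math.log(1.0 - a)          # F(d) = a
u = -theta * math.log(b)                # 1 - F(u) = b
p = 1.0 - a - b
mu_mcm = d + theta * (math.exp(-d / theta) - math.exp(-u / theta))  # censored mean

def if_mcm(x):
    """IF of the censored mean: mu_MCM is linear in the cdf, so
    IF(x) = min(max(d, x), u) - mu_MCM."""
    return min(max(d, x), u) - mu_mcm

def if_mtm(x, n=20_000):
    """IF (3.11) of the trimmed mean, evaluated by a midpoint rule;
    for Exp(theta), f(F^{-1}(v)) = (1 - v) / theta."""
    Fx = 1.0 - math.exp(-x / theta)
    h = (1.0 - b - a) / n
    total = 0.0
    for i in range(n):
        v = a + (i + 0.5) * h
        total += (v - (1.0 if Fx <= v else 0.0)) * theta / (1.0 - v) * h
    return total / p

# Theorem 3.6: IF_MCM(x) = (1 - a - b) * IF_MTM(x) for every x
xs = [0.5, 2.0, 7.5, 15.0, 40.0]
gaps = [abs(if_mcm(x) - p * if_mtm(x)) for x in xs]
```

The gaps are at the level of the integration error, and for $x < d$ both sides reduce exactly to $-\theta(1-a-b)$.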
The following two points are immediate consequences of Theorem 3.6.

(i) The censored mean is asymptotically more stable than the trimmed mean. This point is clear from Figure 3.1, as the graph of $\text{IF}(\mu_{\rm MCM}, x)$ is just the vertical contraction of the graph of $\text{IF}(\mu_{\rm MTM}, x)$ by the contracting factor $0 < 1-a-b < 1$, under the assumption $0 < a + b < 1$.

(ii) The asymptotic investigation of the censored mean could be quite challenging due to the thresholds. In this situation, one can assess the asymptotic distributional properties of the censored mean through the corresponding properties of the trimmed mean, as the asymptotic variance of an estimator is the expectation of the square of the corresponding IF [see, e.g., 10, 17].

3.3 Insurance Payment-Type Estimators

Insurance contracts have coverage modifications that need to be taken into account when modeling the underlying loss variable. Usually, coverage modifications such as deductibles, policy limits, and coinsurance are introduced as loss control strategies so that unfavorable policyholder behavioral effects (e.g., adverse selection) can be minimized. Therefore, actuarial loss data are left-truncated and right-censored in nature. Motivated by this nature of the loss data, here we design an estimation approach, called insurance payment-type estimators, which is essentially a left-truncated and right-censored method of moments.

Let $X_1, \ldots, X_n$ be i.i.d. random variables with common exponential cdf $F(\cdot \mid \theta)$. Define the left-truncated (at $d$) and right-censored (at $u$) sample moment as
$$ \widehat{\mu}_{\rm MTCM} := \frac{\sum_{i=1}^{n} X_i\,\mathbf{1}\{d < X_i \leq u\} + u\sum_{i=1}^{n}\mathbf{1}\{X_i > u\}}{\sum_{i=1}^{n}\mathbf{1}\{X_i > d\}} = \frac{\bar{W}_n}{\tau_n}, $$
where $W := X\,\mathbf{1}\{d < X \leq u\} + u\,\mathbf{1}\{X > u\}$, $\tau_n := 1 - F_n(d)$, and $\tau := 1 - F(d \mid \theta)$. The covariance of $W$ and the truncation indicator is given by $\sigma_{W\tau} = \mathrm{Cov}\big(W, \mathbf{1}\{X > d\}\big) = \mu_W(1-\tau)$, with $\mu_W = \mathbb{E}[W] = \mu_Y + u\big(1 - F(u \mid \theta)\big)$ and $\mathbb{E}\big[W^2\big] = \mathbb{E}\big[Y^2\big] + u^2\big(1 - F(u \mid \theta)\big)$, where $Y := X\,\mathbf{1}\{d < X \leq u\}$ as in Section 3.1.
Then, by the multivariate central limit theorem, we have
$$ \big(\bar{W}_n, \tau_n\big) \sim \mathcal{AN}\left((\mu_W, \tau), \frac{1}{n}\begin{bmatrix} \sigma_W^2 & \sigma_{W\tau} \\ \sigma_{W\tau} & \tau(1-\tau) \end{bmatrix}\right). $$
Then, by the delta method with the function $g(x_1, x_2) := x_1/x_2$, $x_2 \neq 0$, we have
$$ \widehat{\mu}_{\rm MTCM} = \frac{\bar{W}_n}{\tau_n} \sim \mathcal{AN}\left(\frac{\mu_W}{\tau}, \frac{1}{n}\left(\frac{\sigma_W^2}{\tau^2} - \frac{(1-\tau)\mu_W^2}{\tau^3}\right)\right). $$
The population version of $\widehat{\mu}_{\rm MTCM}$ is given by
$$ \mu_{\rm MTCM} = \frac{\mathbb{E}[W]}{1 - F(d \mid \theta)} = \frac{\theta\big(e^{-d/\theta} - e^{-u/\theta}\big) + d e^{-d/\theta}}{e^{-d/\theta}} = \frac{p\theta + d\tau}{\tau} \quad \Rightarrow \quad \theta' := \frac{d\theta}{d\mu_{\rm MTCM}} = \frac{\tau\theta}{p\theta - e^{-u/\theta}(u-d)}. $$
A solution of the equation $\widehat{\mu}_{\rm MTCM} = \mu_{\rm MTCM}$, say $\widehat{\theta}_{\rm MTCM}$, if it exists, is called the method of truncated and censored moments (MTCM) estimator of $\theta$. Let $b := e^{-u/\theta}$; then, by the delta method, the asymptotic distribution and the ARE are, respectively, given by
$$ \widehat{\theta}_{\rm MTCM} \sim \mathcal{AN}\left(\theta, (\theta')^2\,\frac{1}{n}\left(\frac{\sigma_W^2}{\tau^2} - \frac{(1-\tau)\mu_W^2}{\tau^3}\right)\right) = \mathcal{AN}\left(\theta, \frac{\theta^2}{n}\cdot\frac{p\theta^2(\tau+b) - 2\theta\tau b(u-d)}{\tau\big(p\theta - b(u-d)\big)^2}\right) \tag{3.14} $$
and
$$ \text{ARE}\big(\widehat{\theta}_{\rm MTCM}, \widehat{\theta}_{\rm MLE}\big) = \frac{\theta^2\tau^3}{(\theta')^2\big(\tau\sigma_W^2 - (1-\tau)\mu_W^2\big)} = \frac{\tau\big(p\theta - b(u-d)\big)^2}{p\theta^2(\tau+b) - 2\theta\tau b(u-d)}. \tag{3.15} $$

Figure 3.2: Graphs of $\text{ARE}\big(\widehat{\theta}_{\mathcal{C}}, \widehat{\theta}_{\rm MLE}\big)$, where $\mathcal{C} \in \{$MTuM, MCM, MTCM$\}$, with fixed thresholds $(d, u)$ and $\theta = 10$.

Similar to (3.6) and (3.10), $\text{ARE}\big(\widehat{\theta}_{\rm MTCM}, \widehat{\theta}_{\rm MLE}\big)$ given by (3.15) is a function of $\theta$. But if we fix the tail probabilities, then we have the following stability result. Proposition 3.2.
Let θ_1 and θ_2 be two exponential parameters with corresponding left and right truncation thresholds d_1, d_2 and u_1, u_2, respectively. Assume F(d_1|θ_1) = F(d_2|θ_2) and F(u_1|θ_1) = F(u_2|θ_2); then it follows that

  ARE( θ̂_{1,MTCM}, θ̂_{1,MLE} ) = ARE( θ̂_{2,MTCM}, θ̂_{2,MLE} ).    (3.16)

Proof.
With the given assumptions, we have e^{−d_1/θ_1} = e^{−d_2/θ_2} and e^{−u_1/θ_1} = e^{−u_2/θ_2}, which imply (u_1 − d_1)/θ_1 = (u_2 − d_2)/θ_2, and then the conclusion follows directly from (3.15).

Table 3.1: Numerical values of ARE(θ̂_C, θ̂_MLE), where C ∈ {MTuM, MCM, MTCM}, given respectively by (3.6), (3.10), and (3.15), for various values of the left and right truncation thresholds d and u from Exp(θ = 10). Rows are indexed by a = F(d|θ) and columns by b = 1 − F(u|θ); the thresholds d = F^{−1}(a) and u = F^{−1}(1 − b) are rounded to two decimal places, and "-" marks combinations that are not reported.

ARE(θ̂_MTuM, θ̂_MLE):
  a \ b    ∞(.00)  (.05)  (.10)  (.15)  (.25)  (.49)  (.70)  (.85)
  (.05)     .950   .443   .284   .193   .095   .016   .002   .000
  (.10)     .900   .408   .257   .172   .082   .012   .001   .000
  (.15)     .850   .373   .231   .152   .069   .009   .000    -
  (.25)     .750   .307   .182   .114   .047   .004   .000    -
  (.49)     .510   .161   .080   .042   .011   .000    -      -
  (.70)     .300   .057   .019   .006   .000    -      -      -
  (.85)     .150   .009   .001    -      -      -      -      -

ARE(θ̂_MCM, θ̂_MLE):
  a \ b    ∞(.00)  (.05)  (.10)  (.15)  (.25)  (.49)  (.70)  (.85)
  (.15)     .999   .918   .850   .787   .672   .436   .261    -
  (.25)     .995   .918   .851   .790   .679   .452   .285    -
  (.49)     .958   .897   .839   .786   .688   .487    -      -
  (.70)     .857   .824   .781   .738   .659    -      -      -
  (.85)     .681   .688   .663    -      -      -      -      -

ARE(θ̂_MTCM, θ̂_MLE):
  a \ b    ∞(.00)  (.05)  (.10)  (.15)  (.25)  (.49)  (.70)  (.85)
  (.05)     .950   .868   .798   .735   .619   .380   .197   .077
  (.10)     .900   .819   .750   .687   .572   .336   .157   .038
  (.15)     .850   .768   .700   .638   .525   .292   .116    -
  (.25)     .750   .670   .603   .542   .432   .208   .038    -
  (.49)     .510   .434   .371   .315   .216   .015    -      -
  (.70)     .300   .229   .173   .124   .039    -      -      -
  (.85)     .150   .087   .040    -      -      -      -      -

From Table 3.1, it follows evidently that

  ARE( θ̂_MTuM, θ̂_MLE ) ≤ ARE( θ̂_MTCM, θ̂_MLE ) ≤ ARE( θ̂_MCM, θ̂_MLE ).

This inequality is intuitive because MTuM is more robust than MCM and MTCM. As a result, MTuM estimators lose more efficiency and converge to the asymptotic results more slowly. For example, if the lower and upper truncation thresholds are, respectively, d = 0.51 and u = 29.96, then ARE(θ̂_MTuM, θ̂_MLE) = 0.443 and ARE(θ̂_MCM, θ̂_MLE) = 0.918. That is, we lose approximately 52% of the efficiency by going from MCM to MTuM.
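As an illustrative sketch (not code from the paper), formula (3.15) can be evaluated directly. The snippet below reproduces one MTCM entry of Table 3.1 and also demonstrates Proposition 3.2: when the thresholds are set through fixed tail probabilities, the ARE does not depend on θ. The chosen tail levels are just examples.

```python
import numpy as np

def are_mtcm_mle(theta, d, u):
    """ARE of (3.15): [p - b(u-d)/theta]^2 / [p(1 + b/tau) - 2b(u-d)/theta],
    with tau = exp(-d/theta), b = exp(-u/theta), p = tau - b (finite u)."""
    tau, b = np.exp(-d / theta), np.exp(-u / theta)
    p = tau - b
    r = b * (u - d) / theta
    return (p - r) ** 2 / (p * (1 + b / tau) - 2 * r)

# thresholds via tail probabilities: d = F^{-1}(0.05), u = F^{-1}(0.95)
theta = 10.0
d, u = -theta * np.log(0.95), -theta * np.log(0.05)
print(round(are_mtcm_mle(theta, d, u), 3))      # 0.868, as in Table 3.1

# Proposition 3.2: same tail probabilities, different theta, same ARE
theta2 = 3.0
d2, u2 = -theta2 * np.log(0.95), -theta2 * np.log(0.05)
print(round(are_mtcm_mle(theta2, d2, u2), 3))   # 0.868 again
```

The invariance holds because, with fixed tail probabilities, τ, b, p, and (u − d)/θ are all constant in θ.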
The reason that the MTuM relative efficiency is much lower than the corresponding MCM one is that the censored sample size is always fixed, whereas, even if we fix the truncation thresholds, the truncated sample size is random. Further, MTuM disregards the observations beyond the truncation thresholds in order to control the influence of extremes on statistical inference. MCM controls the influence of extremes differently: observations beyond the thresholds are adjusted to be equal to the corresponding thresholds, which increases the efficiency significantly. MTCM controls the influence of extremes by disregarding the observations below the lower threshold and adjusting the observations above the upper threshold to equal that threshold, which places the MTCM entries between the corresponding MTuM and MCM entries. Due to Theorem 3.5, the entries for ARE(θ̂_MCM, θ̂_MLE) are identical to the ARE(θ̂_MTM, θ̂_MLE) entries found in [1, Table 1].

Figure 3.3: Effects of MTuM (left panel), MCM (middle panel), and MTCM (right panel) on the underlying quantile function and thus on the data. MTuM focuses only on the data between the truncation thresholds; MCM is a threshold-censored form that also takes the values outside the lower and upper thresholds into account (orange area); and MTCM is a mixed version of both MTuM (left truncated) and MCM (right censored). (MTuM: method of truncated moments; MCM: method of censored moments; MTCM: method of left-truncated and right-censored moments.)
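The three data transformations described in Figure 3.3 can be mimicked on a sample directly. The following sketch (the thresholds d, u are hypothetical) shows how each method modifies the observations before moments are computed:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=10.0, size=1000)
d, u = 2.0, 15.0                      # hypothetical thresholds

truncated = x[(x > d) & (x <= u)]     # MTuM: discard observations outside (d, u]
censored  = np.clip(x, d, u)          # MCM: pull outside observations to the thresholds
payment   = np.minimum(x[x > d], u)   # MTCM: drop below d, cap at u

# MTuM works with a random subsample size; MCM keeps the full, fixed sample size
print(truncated.size, censored.size, payment.size)
```

This makes the efficiency ordering plausible: truncation throws information away, censoring retains a (bounded) contribution from every observation, and the payment-type transform sits in between.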
Theorem 3.7.
The equation μ̂_MTCM = μ_MTCM has a unique solution, θ̂_MTCM, provided that d < μ̂_MTCM < u. Otherwise, the solution does not exist.

Proof. A proof can be established similarly to that of Theorem 3.2.
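Numerically, the unique root guaranteed by Theorem 3.7 can be bracketed and found with a standard solver. A minimal sketch (the algebraic simplification in the comment follows from μ_MTCM = (pθ + dτ)/τ and is easy to verify; the threshold values in the usage line are hypothetical):

```python
import numpy as np
from scipy.optimize import brentq

def mu_mtcm(theta, d, u):
    # (p*theta + d*tau)/tau simplifies to theta*(1 - exp(-(u-d)/theta)) + d,
    # which increases from d (as theta -> 0) to u (as theta -> infinity)
    return theta * (1.0 - np.exp(-(u - d) / theta)) + d

def theta_hat_mtcm(mu_hat, d, u):
    """Unique root of mu_MTCM(theta) = mu_hat, assuming d < mu_hat < u."""
    return brentq(lambda t: mu_mtcm(t, d, u) - mu_hat, 1e-9, 1e9)
```

Feeding the population value back in recovers the parameter: theta_hat_mtcm(mu_mtcm(10.0, 0.51, 29.96), 0.51, 29.96) returns (approximately) 10.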
Simulation Study
This section supplements the theoretical results developed in Section 3 via simulation. The main goal is to assess the sample size needed for the estimators to be practically free of bias (given that the estimators are asymptotically unbiased), to verify the asymptotic normality, and to check that their finite-sample relative efficiencies (RE) approach the corresponding AREs. To compute the RE of the different estimators (MTuM, MCM, and MTCM) we use the MLE as a benchmark. Thus, the definition of asymptotic relative efficiency given by equation (3.1) translates, for finite-sample performance, to

  RE(C, MLE) = (asymptotic variance of the MLE estimator) / (small-sample variance of a competing estimator C),

where the denominator is the empirical mean squared error of the competing estimator C.

From Exp(θ = 10), we first examine the approximate normality of the MTuM, MCM, and MTCM estimators of θ given, respectively, by (3.5), (3.9), and (3.14), with (d, u) = (0.…, …) and finite sample sizes n = 30, …, 500. We generate 100 samples for each sample size and estimate θ from each sample via MTuM, MCM, and MTCM. We plot the histograms of those 100 estimated values of θ in Figure 4.1. Clearly, the histograms corresponding to MTuM are positively skewed for the smaller sample sizes (but turn out to be symmetric for n = 500), and hence the asymptotic normality of θ̂_MTuM given by (3.5) is achieved more slowly, i.e., only for bigger sample sizes, than for MCM and MTCM. On the other hand, the asymptotic normality of θ̂_MCM given by (3.9) and of θ̂_MTCM given by (3.14) is justified even for samples of size n = 30.

Second, again from the exponential distribution F(·|θ = 10), we generate … samples of a specified length n using Monte Carlo. For each sample we estimate the parameter of F via the MTuM, MCM, and MTCM estimators and then compute the average mean and the RE of those estimates. This process is repeated … times, and the average means and REs are again averaged and their standard errors are reported. Such repetitions are useful for assessing the standard errors of the estimated means and REs. Hence, our findings are essentially based on 100,000 samples. The standardized ratio θ̂/θ that we report is defined as the average of the estimates divided by the true value of the parameter being estimated. We observe the performance of the different estimation methods for the exponential distribution (see Section 3) with respect to the aspects listed below.
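The simulation loop just described can be sketched as follows for the MTCM estimator (a sketch, not the paper's code; the thresholds, sample size, and replication count here are illustrative and far smaller than the study's 100,000 samples):

```python
import numpy as np
from scipy.optimize import brentq

def simulate_re_mtcm(theta, d, u, n, reps, seed=0):
    """Monte Carlo RE of theta_hat_MTCM with the MLE as benchmark:
    RE = (theta^2 / n) / (empirical MSE of theta_hat_MTCM)."""
    rng = np.random.default_rng(seed)
    est = np.empty(reps)
    for r in range(reps):
        x = rng.exponential(scale=theta, size=n)
        w = np.where((x > d) & (x <= u), x, 0.0) + u * (x > u)
        mu_hat = w.mean() / np.mean(x > d)             # \bar{W}_n / tau_n
        # invert mu_MTCM(theta) = theta*(1 - exp(-(u-d)/theta)) + d
        est[r] = brentq(lambda t: t * (1 - np.exp(-(u - d) / t)) + d - mu_hat,
                        1e-9, 1e9)
    mse = np.mean((est - theta) ** 2)
    return theta ** 2 / n / mse

print(simulate_re_mtcm(10.0, 0.51, 29.96, n=250, reps=300))  # approaches the ARE of about 0.868
```

As n and the number of replications grow, the returned RE stabilizes near the analytic ARE from (3.15).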
Figure 4.1:
Histograms of 100 estimated values of the parameter θ of Exp(θ = 10) via MTuM, MCM, and MTCM with (d, u) = (0.…, …) and sample sizes n = 30, …, 500.

(i) Sample size: n = 50, 100, 250, 500, 1000.
(ii) Estimators of θ:
(a) MLE, which is a special case of all the others.
(b) MTuM, MCM, MTCM.
(c) For the selected proportions a = b = 0; a = b = 0.…; a = b = 0.…; a = b = 0.…; a = b = 0.…; a = 0.… and b = 0.…; and a = 0.… and b = 0.…, the left and right truncation (or censoring) thresholds d and u, rounded to two decimal places, are chosen as a = F(d) (i.e., d = F^{−1}(a)) and 1 − b = F(u) (i.e., u = F^{−1}(1 − b)).

Table 4.1: Finite-sample performance evaluation of the different estimators for
Exp(θ = 10). Each cell of the table reports, for a threshold pair (d(a), u(b)) and for n = 50, 100, 250, 500, and 1000, the standardized ratio θ̂/θ and the RE of the MTuM, MCM, and MTCM estimators, with standard errors in parentheses; the n → ∞ columns contain the analytic AREs. For example, the n → ∞ ARE triples (MTuM, MCM, MTCM) include (.442, .918, .868), (.257, .848, .749), (.152, .787, .638), (.047, .679, .433), (.001, .250, .156), and (.750, .995, .750).

Simulation results are recorded in Table 4.1. The entries are mean values (with standard errors in parentheses) based on 100,000 samples. The columns corresponding to n → ∞ represent the analytic ARE(θ̂_C, θ̂_MLE) results, with C ∈ {MTuM, MCM, MTCM}, found in Section 3, not from simulations. Among these three columns, the first corresponds to C = MTuM, the second to C = MCM, and the third to C = MTCM. As seen from Table 4.1, the ratio θ̂/θ of the exponential θ estimators converges to the true asymptotic value of 1 very fast. Besides the MTuM approach, the bias of all the other procedures disappears as soon as n ≥ …, and the estimators' REs practically reach their ARE levels already for n ≥ …. Some of the finite-sample entries in the MTuM columns of Table 4.1 are not reported, especially for the pair (d, u) = (1.…, …), because the corresponding threshold pair does not satisfy the necessary condition of Theorem 3.2 for at least one generated sample. It is evident from the entries that if the difference between the thresholds d and u is smaller, then the estimators converge more slowly to the true values. Overall, as expected, MCM performs best in terms of balancing efficiency and robustness. MTuM performs very poorly in terms of efficiency, especially for small sample sizes, but it produces highly robust estimators.

In this paper, we have developed the method of truncated (MTuM), censored (MCM), and insurance payment-type (MTCM) moments estimators for completely observed ground-up loss severity data. A series of theoretical results about the estimators' existence and asymptotic normality has been established. Our analysis has established new connections between data truncation, trimming, and censoring, which paves the way for more effective modeling of non-linearly transformed loss data. Further, as seen from Table 3.1, there are clear trade-offs between efficiency and robustness for the newly designed estimators relative to the corresponding MLEs when the sample size is large.
The finite-sample performance of all the estimators developed in this paper has been investigated in detail, for various sample sizes, for the single-parameter Pareto model via a simulation study. The results of this paper motivate open problems and generate several ideas for further research. First, most of the results of Section 3 (besides Theorem 3.6) are limited to completely observed exponentially (equivalently, single-parameter Pareto) distributed data, but they could be extended to more general situations and models. For example, similar estimation approaches could be designed for (log-)location-scale and exponential dispersion families, which could lead to more challenging non-linear equations to be solved (see Theorems 3.2, 3.4, and 3.7). Second, several contaminated loss severity models have been proposed in the literature [see, e.g., 2, 3, 20], so implementing the procedures developed in this paper on the body of the data, while fitting some heavier-tailed distribution (e.g., Pareto) to the right tail, could produce an even better model that still maintains a reasonable balance between efficiency and robustness. Further, it remains to be measured how the newly designed estimation procedures perform in different risk analyses in practice.
Acknowledgements
The author is very appreciative of the valuable comments provided by Prof. Dr. V. Brazauskas at the University of Wisconsin-Milwaukee and the constructive criticisms of the anonymous referee(s). These have led to many improvements in the paper. Further, a part of this work presented by the author won the "1st Place Prize among Student Presentation Competition" at the 52nd Actuarial Research Conference (ARC), Atlanta, GA, 2017. The author was also awarded the Committee on Knowledge Extension Research (CKER) ARC Travel Grant 2017 from the Society of Actuaries, Schaumburg, IL. Thus, the author gratefully acknowledges the support provided by the Society of Actuaries.
References

[1] Brazauskas, V., Jones, B.L., and Zitikis, R. (2009). Robust fitting of claim severity distributions and the method of trimmed moments. Journal of Statistical Planning and Inference, (6), 2028–2043.
[2] Brazauskas, V. and Kleefeld, A. (2016). Modeling severity and measuring tail risk of Norwegian fire claims. North American Actuarial Journal, (1), 1–16.
[3] Chan, J.S.K., Choy, S.T.B., Makov, U.E., and Landsman, Z. (2018). Modelling insurance losses using contaminated generalized beta type-II distribution. ASTIN Bulletin, (2), 871–904.
[4] Chernoff, H., Gastwirth, J.L., and Johns, Jr., M.V. (1967). Asymptotic distribution of linear combinations of functions of order statistics with applications to estimation. Annals of Mathematical Statistics, (1), 52–72.
[5] Clark, D.R. (2013). A note on the upper-truncated Pareto distribution. Casualty Actuarial Society E-Forum.
[6] Cohen, Jr., A.C. (1950). Estimating the mean and variance of normal populations from singly truncated and doubly truncated samples. Annals of Mathematical Statistics, (4), 557–569.
[7] Cohen, Jr., A.C. (1951). On estimating the mean and variance of singly truncated normal distributions from the first three sample moments. Annals of the Institute of Statistical Mathematics, Tokyo, 37–44.
[8] Ergashev, B., Pavlikov, K., Uryasev, S., and Sekeris, E. (2016). Estimation of truncated data samples in operational risk modeling. The Journal of Risk and Insurance, (3), 613–640.
[9] Frees, E. (2017). Insurance portfolio risk retention. North American Actuarial Journal, (4), 526–551.
[10] Hampel, F.R. (1974). The influence curve and its role in robust estimation. Journal of the American Statistical Association, 383–393.
[11] Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J., and Stahel, W.A. (1986). Robust Statistics: The Approach Based on Influence Functions. John Wiley & Sons, Inc., New York.
[12] Hayya, J., Armstrong, D., and Gressis, N. (1975). A note on the ratio of two normally distributed variables. Management Science, (11), 1338–1341.
[13] Huber, P.J. (1964). Robust estimation of a location parameter. Annals of Mathematical Statistics, (1), 73–101.
[14] Huber, P.J. and Ronchetti, E.M. (2009). Robust Statistics. Second edition. John Wiley & Sons, Inc., Hoboken, NJ.
[15] Klugman, S.A., Panjer, H.H., and Willmot, G.E. (2019). Loss Models: From Data to Decisions. Fifth edition. John Wiley & Sons, Hoboken, NJ.
[16] Lee, G.Y. (2017). General insurance deductible ratemaking. North American Actuarial Journal, (4), 620–638.
[17] Maronna, R.A., Martin, R.D., Yohai, V.J., and Salibián-Barrera, M. (2019). Robust Statistics: Theory and Methods (with R). Second edition. John Wiley & Sons, Inc., Hoboken, NJ.
[18] Poudyal, C. (2018). Robust Estimation of Parametric Models for Insurance Loss Data. ProQuest LLC, Ann Arbor, MI. Thesis (Ph.D.)–The University of Wisconsin-Milwaukee.
[19] Reynkens, T., Verbelen, R., Beirlant, J., and Antonio, K. (2017). Modelling censored losses using splicing: A global fit strategy with mixed Erlang and extreme value distributions. Insurance: Mathematics and Economics, 65–77.
[20] Scollnik, D.P.M. and Sun, C. (2012). Modeling with Weibull–Pareto models. North American Actuarial Journal, (2), 260–272.
[21] Serfling, R.J. (1980). Approximation Theorems of Mathematical Statistics. John Wiley & Sons, New York.
[22] Shah, S.M. and Jaiswal, M.C. (1966). Estimation of parameters of doubly truncated normal distribution from first four sample moments. Annals of the Institute of Statistical Mathematics, 107–111.
[23] Tukey, J.W. (1960). A survey of sampling from contaminated distributions. Contributions to Probability and Statistics, pages 448–485. Stanford University Press, Stanford, CA.
[24] van der Vaart, A.W. (1998). Asymptotic Statistics. Cambridge University Press, Cambridge.
[25] Verbelen, R., Gong, L., Antonio, K., Badescu, A., and Lin, S. (2015). Fitting mixtures of Erlangs to censored and truncated data using the EM algorithm. ASTIN Bulletin, (3), 729–758.
[26] Zhao, Q., Brazauskas, V., and Ghorai, J. (2018). Robust and efficient fitting of severity models and the method of Winsorized moments. ASTIN Bulletin, (1), 275–309.

Appendix A: All four possible scenarios for Section 2.2
Scenario 1: d_j ≤ d_{j′} < u_j ≤ u_{j′}. Then
  Y_{jj′} = h_{jj′}(X) 1{d_{jj′} < X ≤ u_{jj′}} = h_{jj′}(X) 1{d_{j′} < X ≤ u_j},
  W_{jj′} = Z_j 1{d_{j′} < X ≤ u_j}, and W_{j′j} = Z_{j′} 1{d_{j′} < X ≤ u_j}.

Scenario 2: d_j ≤ d_{j′} < u_{j′} ≤ u_j. Then
  Y_{jj′} = h_{jj′}(X) 1{d_{jj′} < X ≤ u_{jj′}} = h_{jj′}(X) 1{d_{j′} < X ≤ u_{j′}},
  W_{jj′} = Z_j 1{d_{j′} < X ≤ u_{j′}}, and W_{j′j} = Z_{j′} 1{d_{j′} < X ≤ u_{j′}}.

Scenario 3: d_{j′} ≤ d_j < u_j ≤ u_{j′}. Then
  Y_{jj′} = h_{jj′}(X) 1{d_{jj′} < X ≤ u_{jj′}} = h_{jj′}(X) 1{d_j < X ≤ u_j},
  W_{jj′} = Z_j 1{d_j < X ≤ u_j}, and W_{j′j} = Z_{j′} 1{d_j < X ≤ u_j}.

Scenario 4: d_{j′} ≤ d_j < u_{j′} ≤ u_j. Then
  Y_{jj′} = h_{jj′}(X) 1{d_{jj′} < X ≤ u_{jj′}} = h_{jj′}(X) 1{d_j < X ≤ u_{j′}},
  W_{jj′} = Z_j 1{d_j < X ≤ u_{j′}}, and W_{j′j} = Z_{j′} 1{d_j < X ≤ u_{j′}}.

That is, in every scenario the common indicator interval is (d_{jj′}, u_{jj′}] with d_{jj′} = max(d_j, d_{j′}) and u_{jj′} = min(u_j, u_{j′}). Therefore, depending on the scenario, the expected values are given by:

  μ_{Y_{jj′}} = E[Y_{jj′}] = ∫_{F(d_{jj′}|θ)}^{F(u_{jj′}|θ)} h_{jj′}( F^{−1}(v|θ) ) dv,    μ_{W_{jj′}} = E[W_{jj′}] = ∫_{F(d_{jj′}|θ)}^{F(u_{jj′}|θ)} h_j( F^{−1}(v|θ) ) dv.

Appendix B: Proofs

Proof of Theorem 2.2:
Clearly, g_V(μ_V) = ( μ_{Y_1}/p_1, …, μ_{Y_k}/p_k ) =: ( μ_1, …, μ_k ) =: μ. From Lemma 2.2, it follows that

  D_V := [ ∂g_j/∂x_{j′} |_{x = μ_V} ]_{k×2k} = [ d_{V,jj′} ]_{k×2k},

where

  d_{V,jj′} := 1/p_{j′}, if 1 ≤ j = j′ ≤ k;  −μ_{Y_j}/p_j², if j′ − j = k;  0, otherwise.

Now, with an application of the delta method corresponding to the function g_V above [see 21, §3.3, Theorem A], we have

  ( μ̂_1, …, μ̂_k ) ~ AN( g_V(μ_V) = μ, (1/n) D_V Σ_V D_V′ ).

Proof of Theorem 3.1:
The r.v. Y can be expressed in the form

  Y = X ∧ u − u 1{u < X < ∞} − X ∧ d + d 1{d < X < ∞}.

Define I_{a,b} := 1{a < X < b}. Therefore,

  μ_Y = E[Y] = E[X ∧ u] − E[u I_{u,∞}] − E[X ∧ d] + E[d I_{d,∞}]
      = θ(1 − e^{−u/θ}) − u e^{−u/θ} − θ(1 − e^{−d/θ}) + d e^{−d/θ}
      = θ( e^{−d/θ} − e^{−u/θ} ) + d e^{−d/θ} − u e^{−u/θ}.

Squaring the above representation of Y and simplifying (the cross terms collapse on each of the events {X ≤ d}, {d < X ≤ u}, and {X > u}) gives

  Y² = (X ∧ u)² − (X ∧ d)² − 2d[ X ∧ u − X ∧ d ] − u² I_{u,∞} − d² I_{d,∞} + 2d( X I_{d,u} + u I_{u,∞} ).

Therefore, for X ~ Exp(θ), μ_{Y²} := E[Y²] is computed as follows:

  μ_{Y²} = E[(X ∧ u)²] − E[(X ∧ d)²] − 2d( E[X ∧ u] − E[X ∧ d] ) − u² E[I_{u,∞}] − d² E[I_{d,∞}] + 2d( E[X I_{d,u}] + u E[I_{u,∞}] )
        = ( d² + 2θd + 2θ² ) e^{−d/θ} − ( u² + 2θu + 2θ² ) e^{−u/θ}
        = 2θ²( Γ_3(u/θ) − Γ_3(d/θ) ),

where Γ_3 denotes the cdf of the gamma distribution with shape parameter 3 and scale parameter 1. Therefore,

  σ_Y² = μ_{Y²} − μ_Y² = 2θ²( Γ_3(u/θ) − Γ_3(d/θ) ) − μ_Y².

Proof of Theorem 3.3:
Note that the parameter vector is given by θ = (α, x_0), with x_0 known in advance. The population version of μ̂_MTuM is given by

  μ_MTuM = E[ h(Y) | d < Y ≤ u ] = E[ h(Y) 1{d < Y ≤ u} ] / ( F(u|θ) − F(d|θ) ) = [ ∫_d^u h(y) f(y|θ) dy ] / ( F(u|θ) − F(d|θ) )
         = [ ∫_{F(d|θ)}^{F(u|θ)} h( F^{−1}(v|θ) ) dv ] / ( F(u|θ) − F(d|θ) )
         = −[ ∫_{F(d|θ)}^{F(u|θ)} log(1 − v) dv ] / ( α ( F(u|θ) − F(d|θ) ) )
         = [ (x_0/d)^α (1 − α log(x_0/d)) − (x_0/u)^α (1 − α log(x_0/u)) ] / ( α [ (x_0/d)^α − (x_0/u)^α ] )
         = [ u^α (1 − α log(x_0/d)) − d^α (1 − α log(x_0/u)) ] / ( α ( u^α − d^α ) ) =: g_{du}(α),

where we have used h( F^{−1}(v|θ) ) = −log(1 − v)/α and 1 − F(y|θ) = (x_0/y)^α, and the last step multiplies the numerator and denominator by (du)^α / x_0^α. Now, to establish the statement, it is enough to prove that the function g_{du} is strictly decreasing with respect to α. Differentiating,

  g′_{du}(α) = dg_{du}(α)/dα = [ (du)^α α² (log(u/d))² − ( u^α − d^α )² ] / ( α² ( u^α − d^α )² ).

In order to show that g′_{du}(α) < 0, it is enough to establish (du)^{α/2} α log(u/d) < u^α − d^α, which, after dividing both sides by (du)^{α/2}, is equivalent to

  α log(u/d) < (u/d)^{α/2} − (d/u)^{α/2} = 2 sinh( (α/2) log(u/d) ).

Setting ξ := (α/2) log(u/d) > 0, this reads ξ < sinh ξ. But we know that x < sinh x for all x > 0; therefore, g′_{du}(α) < 0 for all α > 0, which implies that g_{du} is strictly decreasing. Finally, note that

  lim_{α→0} g_{du}(α) = [ (log u)² − (log d)² − 2 log u · log(x_0/d) + 2 log d · log(x_0/u) ] / ( 2 log(u/d) ) = log( √(du) / x_0 ),

and lim_{α→∞} g_{du}(α) = −log(x_0/d) = log(d/x_0).
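The monotonicity of g_du and its two limits can be checked numerically. In this sketch (the values of d, u, x_0 are hypothetical), the formula is rescaled by u^α, which is algebraically equivalent but avoids overflow for large α:

```python
import numpy as np

def g_du(alpha, d, u, x0):
    """g_du(alpha) = [u^a(1 + a*log(d/x0)) - d^a(1 + a*log(u/x0))] / (a(u^a - d^a)),
    rewritten with r = (d/u)^a for numerical stability at large alpha."""
    r = (d / u) ** alpha
    num = (1 + alpha * np.log(d / x0)) - r * (1 + alpha * np.log(u / x0))
    return num / (alpha * (1 - r))

d, u, x0 = 2.0, 50.0, 1.0
alphas = np.linspace(0.1, 20.0, 400)
vals = g_du(alphas, d, u, x0)
print(np.all(np.diff(vals) < 0))                         # strictly decreasing: True
print(g_du(1e-4, d, u, x0), np.log(np.sqrt(d * u) / x0)) # close to log(sqrt(du)/x0)
print(g_du(500.0, d, u, x0), np.log(d / x0))             # close to log(d/x0)
```

The decreasing curve from log(√(du)/x_0) down to log(d/x_0) is exactly what the proof establishes, so any μ̂_MTuM strictly between these limits determines a unique α.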