On unbalanced data and common shock models in stochastic loss reserving

Abstract

Introducing common shocks is a popular dependence modelling approach, with some recent applications in loss reserving. The main advantage of this approach is the ability to capture structural dependence coming from known relationships. In addition, it helps with the parsimonious construction of correlation matrices of large dimensions. However, complications arise in the presence of "unbalanced data", that is, when (expected) magnitude of observations over a single triangle, or between triangles, can vary substantially. Specifically, if a single common shock is applied to all of these cells, it can contribute insignificantly to the larger values and/or swamp the smaller ones, unless careful adjustments are made. This problem is further complicated in applications involving negative claim amounts. In this paper, we address this problem in the loss reserving context using a common shock Tweedie approach for unbalanced data. We show that the solution not only provides a much better balance of the common shock proportions relative to the unbalanced data, but it is also parsimonious. Finally, the common shock Tweedie model also provides distributional tractability.



Benjamin Avanzi a, Greg Taylor b, Phuong Anh Vu c,∗, Bernard Wong b
a Centre for Actuarial Studies, Department of Economics, University of Melbourne VIC 3010, Australia
b School of Risk and Actuarial Studies, Business School UNSW Sydney NSW 2052, Australia
c Taylor Fry, Level 22, 45 Clarence St, Sydney NSW 2000, Australia


Keywords:

Stochastic loss reserving; Common shock; Unbalanced data; Negative claims; Multivariate Tweedie distribution

MSC classes: 91G70, 91G60, 62P05, 62H12

1. Introduction

Outstanding claims reserves are typically some of the most critical components in the financial statement of a non-life insurer (Abdallah, Boucher and Cossette, 2015; Heberle and Thomas, 2016; Saluz and Gisler, 2014). When estimating reserves, the insurer often has to provide the central estimate as well as a risk margin to accommodate the stochastic nature of outstanding claims. The estimation of reserving variability is also required by many regulators (Gismondi, Janssen and Manca, 2012). For example, the Australian Prudential Regulation Authority (APRA) requires insurers to provide a risk margin calculated as the larger of a half of one standard deviation, and the difference between the 75th percentile and the expected value of the total outstanding claims distribution. The 99.5th percentile of the distribution of total outstanding claims is also an input in the calculation of risk based capital for solvency purposes in many regulatory frameworks, for example, Solvency II in Europe and APRA’s Prudential Standards in Australia. This has been one of the motivations for the development of stochastic reserving methodologies since the early 1980s. For general references on reserving, one can refer to Taylor (2000) and Wüthrich and Merz (2008). A recent strand of the literature focuses on the modelling of individual claims (see for example, Avanzi et al., 2016; Pinheiro et al., 2003; Wüthrich, 2018; Zhao et al., 2009). However, the focus of this paper is on the modelling of traditional aggregate data in the form of loss triangles.

∗ Correspondence to: Phuong Anh Vu, Taylor Fry, Level 22, 45 Clarence St, Sydney NSW 2000, Australia. E-mail: [email protected]

Non-life insurers typically operate in multiple lines or segments, and are required by regulators to estimate loss reserves and risk capital on an aggregate level. Different business lines within an insurer’s operation often lack a comonotonic dependence structure.
This allows the insurer to enjoy diversification benefits in the calculation of loss reserves and risk capital for their consolidated operation (Avanzi, Taylor and Wong, 2016a). It is hence essential to develop an accurate approach to the modelling of outstanding losses while allowing for dependencies (Côté, Genest and Abdallah, 2016; Shi, Basu and Meyers, 2012). This not only allows the insurer to accurately assess their performance, but also to hold an appropriate amount of reserves and capital to optimise its internal use while satisfying regulatory requirements (Ajne, 1994; Avanzi, Taylor and Wong, 2018).

Various multivariate approaches have been developed for stochastic loss reserving which take into account the dependency across business lines or segments. Some well-known non-parametric approaches include multivariate chain ladder frameworks in Braun (2004); Schmidt (2006); Merz and Wüthrich (2007); Zhang (2010) and the multivariate additive loss reserving framework in Hess, Schmidt and Zocher (2006); Merz and Wüthrich (2009). These approaches do not utilise any distributional assumptions. They also focus on specific cell-wise dependence (i.e. the dependence between cells that are in the same position) across loss triangles. Alternatively, parametric approaches utilising distributional assumptions can be used, see for example, Shi and Frees (2011); Zhang and Dukic (2013); De Jong (2012); Abdallah, Boucher and Cossette (2015); Shi (2014).

In this paper, we focus on the common shock approach to dependence modelling. Common shock approaches use common random factors to capture drivers of dependence across related variables. As a result, these drivers can be identified, as well as monitored if needed. The transparent dependence structures in common shock models can then be interpreted more easily.
This is indeed one of the four desirable properties of multivariate distributions considered in Joe (1997, Chapter 4), which include:
– interpretability,
– closure under the taking of marginals, meaning that the multivariate marginals belong to the same family (this is important if, in modelling, we need to first choose appropriate univariate marginals, then bivariate and sequentially to higher-order marginals),
– flexible and wide range of dependence,
– density and cumulative distribution function in closed form (if not, they are computationally feasible to work with).

Furthermore, the construction of correlation matrices can be facilitated. Correlation matrices are tools extensively used by practitioners to specify dependence in the aggregation of outstanding claims liabilities or risk-based capital. Explicit dependence structures captured using common shock approaches allow correlation matrices to be specified in a more disciplined and parsimonious manner (see e.g., Avanzi, Taylor and Wong, 2018).

Common shock approaches have been used to good effect. They are typically used to capture structural dependence, that is, “structural co-movements that are due to known relationships which can be accounted for in a modelling exercise” (International Actuarial Association, 2004). De Jong (2006) introduced three different models to capture dependence across development periods, accident periods and calendar periods respectively. Calendar period dependence is captured using common shock variables in the multivariate log-normal model of Shi, Basu and Meyers (2012). A common shock Tweedie framework was developed in Avanzi, Taylor, Vu and Wong (2016b) to capture cell-wise dependence across business lines. It is worth noting that these models are static models which assume a single development pattern for all accident years through the use of fixed effects.
A recent use of the common shock approach in evolutionary reserving models, which allow the claims development pattern to evolve, can be found in Avanzi et al. (2019). There are also various applications of common shock models outside of the reserving literature, including mortality modelling (Alai et al., 2013, 2016), capital modelling (Furman and Landsman, 2010) and claim counts modelling (Meyers, 2007).

Despite the benefits mentioned above, complications can arise in the application of common shock approaches to loss triangle data. This is due to the “unbalanced” feature of the data, where the expected magnitude of observations within a loss triangle as well as across triangles varies substantially. This feature represents the typical claim experience where the level of claim activity reaches a peak in early years then dies out as the development lag increases. The “unbalanced-ness” can also be observed in loss data that consists of multiple business lines. In particular, the speed of claims development can vary across business lines where some lines are longer tailed than others. As a result, the magnitude of claim observations in the same accident year and development year can vary across loss triangles. Because of this feature, we say that loss reserving data is an example of “unbalanced data”. If a single common shock is applied to these observations that are of different magnitudes, it can contribute insignificantly to the larger ones and/or swamp the smaller ones, unless careful adjustments are made. It is the aim of this paper to study and address this problem.

While this paper aims to examine the challenges for common shock models and propose a solution to address these challenges, a focus of the solution is placed on the Tweedie family of distributions. This is motivated by its popularity. This family is a major subclass of the exponential dispersion family (EDF) consisting of symmetric and non-symmetric, light-tailed and heavy-tailed distributions (Alai et al., 2015; Jørgensen, 1997).
Various members of it have been frequently used in the loss reserving literature, see for example, Alai and Wüthrich (2009); Boucher and Davidov (2011); England and Verrall (2002); Peters et al. (2009); Renshaw and Verrall (1998); Taylor (2009, 2015); Wüthrich (2003); Zhang et al. (2012). Avanzi et al. (2016a) developed a common shock Tweedie framework for reserving to allow for dependence across business lines while utilising the flexibility of this family of distributions. The solution proposed in this paper will be illustrated using this framework.

Another feature that is occasionally observed in loss triangles is negative claim amounts. These arise for various reasons, for example, salvage recoveries, or payments from third parties. Many commonly used distributions such as gamma distributions and log-normal distributions are unable to handle this feature due to their lack of support for negative values. A remarkably small area of literature has been devoted to the treatment of negative payments in a single business line. The existing methods include a three-parameter log-normal model in De Alba (2006) and a mixture model in Kunkler (2006). In the development of the new approach for unbalanced data, we will also consider a treatment for negative claims.

The organisation of this paper is as follows: Section 2 investigates the issue of unbalanced data for common shock models. A common shock Tweedie approach to unbalanced data is introduced in Section 3. Simulation illustrations are provided in Section 4, including an illustration using a portfolio of triangles with different tail lengths, and a comparison of the performances of the original common shock Tweedie approach and the modified Tweedie approach with treatment for unbalanced data. An illustration using real data is provided in Section 5 and Section 6 concludes the paper.

2. Unbalanced feature of reserving data and its challenges to common shock models

In this section, we examine the unbalanced feature of loss reserving data in detail. The general common shock framework developed in Avanzi et al. (2018) is then described. Challenges that arise in applying common shock models to reserving data due to its unbalanced feature are then discussed.

As described in Section 1, loss reserving data is typically unbalanced. We consider for illustration a real data set from a Canadian insurer collected from 2003 to 2012 (denoted by years 1-10). This data set is used for illustration in Côté et al. (2016) and is provided in Tables C.1 and C.2 in Appendix C. The two lines of business used for illustration are the Bodily Injury line and the Accident Benefits line (excluding Disability Income).

Figure 2.1 provides heat maps of incremental loss ratios on the left, and a plot of incremental loss ratios for accident year 2003 from the two lines of business on the right. For a given accident year, the loss ratio increment for development year j is defined as the ratio of incremental claim payments in that development year to the earned premium for the accident year. Within a single loss triangle, one can observe a quite significant variation in claim observations across development years for any particular accident year. As shown in the heat map in Figure 2.1, the claim activity for the Bodily Injury line is low in development year 0, then peaks in the next few years and dies out after the peak. This typical pattern is also shown in the plot of loss ratios on the right hand side of Figure 2.1 for accident year 2003. For the Accident Benefits line, the claim activity is the highest in development year 0 or 1, then drops quickly as we approach later development years. This pattern is also shown in the plot of loss ratios for accident year 2003 of the Accident Benefits line. The plot of ratios on the right hand side of Figure 2.1 also indicates the difference in development patterns for the two business lines. We can say that the Accident Benefits line is shorter-tailed than the Bodily Injury line. A variation can be observed across claim observations that come from the same accident year and the same development year, simply due to different claim development patterns across these business lines.
This is in addition to the variation between loss values in different development lags and from different loss triangles, such as cells in development year 1 from the Bodily Injury line and cells in development year 10 from the Accident Benefits line.

Overall, Figure 2.1 shows a large variation across claim observations in a loss reserving data set. Within a single loss triangle, there is variation across development years due to the development pattern of claims over time. Different claim development patterns can also result in variation between observations across loss triangles. Typically, one does not expect dependence between lines that have different tail lengths, for example, an Auto Property Damage line is often independent of a Workers Compensation line. However, lines with different tail lengths can still have some association. One such example is the portfolio of the Accident Benefits line and the Bodily Injury line in the above illustration.

With the variations between claim observations within and across triangles, we refer to loss reserving data as unbalanced data. This data feature creates a number of challenges in applying common shock models to reserving data, which will be discussed in the remainder of this section. For generality and completeness, the focus is placed on the unbalanced feature of data consisting of multiple lines of business.

Figure 2.1: (Colour online) Loss ratios from the Bodily Injury line and the Accident Benefits line (from a Canadian insurer). (a) Heat maps of the Bodily Injury line (top) and the Accident Benefits line (bottom). (b) Plot of loss ratios for accident year 2003 (horizontal axis: development year; vertical axis: loss ratio).
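The loss ratio increments defined above (incremental payments divided by the accident year's earned premium) can be sketched in a few lines of Python. The triangle and premiums below are hypothetical values chosen for illustration only; they are not the data in Appendix C, and the function name `incremental_loss_ratios` is an assumption of this sketch.

```python
# Hypothetical incremental paid claims (rows: accident years, columns: development years).
# These numbers are made up for illustration; they are not the Appendix C data.
incremental_paid = [
    [120.0, 300.0, 180.0, 60.0],
    [130.0, 330.0, 200.0],
    [110.0, 290.0],
    [140.0],
]
earned_premium = [1000.0, 1100.0, 1000.0, 1200.0]


def incremental_loss_ratios(paid, premium):
    """Return loss ratio increments paid[i][j] / premium[i] for every observed cell."""
    return [[cell / prem for cell in row] for row, prem in zip(paid, premium)]


ratios = incremental_loss_ratios(incremental_paid, earned_premium)

# A crude measure of how unbalanced one accident year is:
# the largest increment relative to the smallest one.
first_year = ratios[0]
spread = max(first_year) / min(first_year)
print([round(r, 3) for r in first_year])
print(round(spread, 2))
```

A large value of `spread` within a row, or between the same cell of two triangles, is what the text refers to as unbalanced data.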

Consider $N$ loss triangles of claim cells $Y^{(n)}_{i,j}$. The notation $Y^{(n)}_{i,j}$ can represent incremental claim payments or counts. We have the indices $i$ ($i = 1, ..., I$) representing the accident period, $j$ ($j = 0, ..., J$) representing the development period, and $n$ ($n = 1, ..., N$) representing the business line. It also follows that the claims $Y^{(n)}_{i,j}$ belong to the calendar period $t = i + j - 1$ ($t = 1, ..., T$).

Let $\mathcal{S}^{(n)} = \{\mathcal{S}^{(n)}_s; s = 1, ..., S\}$ be a partition of the set of all claims $Y^{(n)}_{i,j}$ from business line $n$. Also assume for simplicity that all partitions are the same for different lines $n$. Denote by $\pi(i,j) = s$ a unique mapping of claim $Y^{(n)}_{i,j}$ to a set $\mathcal{S}^{(n)}_s$ in the partition. For example, the partition $\mathcal{S}^{(n)} = \{\mathcal{S}^{(n)}_s; s = 1, ..., I\}$ where $\mathcal{S}^{(n)}_s = \{Y^{(n)}_{s,j}; j = 1, ..., J\}$ represents a partition of claims by accident year. The selection of the partition $\mathcal{S}^{(n)}$ is very flexible and can be specified for different types of dependence.

Many multivariate models with different types of dependence can be generalised by the common shock framework in Avanzi, Taylor and Wong (2018) with
$$Y^{(n)}_{i,j} = \kappa^{(n)}_{i,j} W_{\pi(i,j)} + \lambda^{(n)}_{i,j} U^{(n)}_{\pi(i,j)} + Z^{(n)}_{i,j}, \tag{2.1}$$
where $\pi(i,j) = s$ denotes the unique mapping of the claim $Y^{(n)}_{i,j}$ to the corresponding subset $\mathcal{S}^{(n)}_s$ in the partition $\mathcal{S}^{(n)}$, and $W_{\pi(i,j)}$, $U^{(n)}_{\pi(i,j)}$, $Z^{(n)}_{i,j}$ are independent stochastic variates. The common shock $W_{\pi(i,j)}$ introduces dependence across all business lines $n = 1, ..., N$ on claims that belong to the subsets $\mathcal{S}^{(n)}_s$. For example, accident year dependence across lines can be captured using the partition where $\mathcal{S}^{(n)}_s = \{Y^{(n)}_{s,j}; j = 1, ..., J\}$. The other common shock $U^{(n)}_{\pi(i,j)}$ introduces dependence across claims within the set $\mathcal{S}^{(n)}_s$ of business line $n$ only, such as development year dependence with the partition set specification $\mathcal{S}^{(n)}_s = \{Y^{(n)}_{i,s}; i = 1, ..., I\}$.
Overall, the flexibility of choice of the subsets $\mathcal{S}^{(n)}_s$ allows different dependence structures to be captured. Lastly, the idiosyncratic component, which is unique to the claim $Y^{(n)}_{i,j}$, is denoted by $Z^{(n)}_{i,j}$. Scaling factors, denoted by $\kappa^{(n)}_{i,j}$, $\lambda^{(n)}_{i,j}$, control the extent to which the set-wide common shocks contribute to individual members of the set. In this section, we have preserved the link to the general notation of Avanzi et al. (2018) through the use of the notation $\pi(i,j)$. This notation will be simplified in Section 2.3 for specific examples.

Remark 2.1.

There can be situations where variables $\{Y^{(n)}_{i,j}; \forall i, j; n > 2\}$ are pairwise dependent (i.e. the dependency between each pair of variables is driven by a different source). For example, there can be a portfolio of 3 lines of business (LoBs) where there are 3 independent common shocks that drive the dependence between each of the following three pairs: LoB 1 and LoB 2, LoB 2 and LoB 3, and LoB 3 and LoB 1, respectively. In such cases, one can consider having additional common shock variables that capture dependence across lines, for example, $W^{(1,2)}_{\pi(i,j)}$, $W^{(2,3)}_{\pi(i,j)}$, $W^{(3,1)}_{\pi(i,j)}$ for the above scenario of 3 LoBs. However, it is worth noting that these will result in more parameters required for the framework.

2.3. Balancing common shock proportions in loss reserving data

As a result of the unbalanced feature of reserving data, a common shock model can create problems in the absence of careful modelling.

Consider a special case of Equation (2.1) for dependence within a business line (i.e. $W_{\pi(i,j)} = 0$). Further specify accident period dependence (i.e. $\pi^{(n)}(i,j) = i$ for the mapping of subsets in the partition where $\mathcal{S}^{(n)}_s = \{Y^{(n)}_{s,j}; j = 1, ..., J\}$). This allows us to simplify $U^{(n)}_{\pi(i,j)} = X^{(n)}_i$. Hence the general framework is reduced to
$$Y^{(n)}_{i,j} = \lambda^{(n)}_{i,j} X^{(n)}_i + Z^{(n)}_{i,j}. \tag{2.2}$$
Consequently, the proportionate contribution of the common shock to the expected value of the total observation is
$$\frac{\lambda^{(n)}_{i,j} \mathrm{E}\left[X^{(n)}_i\right]}{\lambda^{(n)}_{i,j} \mathrm{E}\left[X^{(n)}_i\right] + \mathrm{E}\left[Z^{(n)}_{i,j}\right]}. \tag{2.3}$$
If the scaling factor is removed, i.e. $\lambda^{(n)}_{i,j} = 1$, this proportion has an inverse relationship with the mean of the idiosyncratic component $\mathrm{E}\left[Z^{(n)}_{i,j}\right]$. As a result, in a set of loss cells in a triangle that are dependent and share a common shock, the cells with large values have a smaller proportion of common shock contribution and vice versa.
This is because claims within the same accident period, or within the same calendar period, belong to different development periods. As explained in Section 2.1, their values can vary significantly due to the variation in claim activity across development periods. This issue can also be observed in the case of calendar period dependence (i.e. $\pi^{(n)}(i,j) = s$ for the mapping of subsets in the partition where $\mathcal{S}^{(n)}_s = \{Y^{(n)}_{i,s-i+1}; i = 1, ..., J\}$).

A similar issue is encountered for a portfolio of dependent business lines with differing tail lengths, such as the two business lines Bodily Injury and Accident Benefits in the illustration in Section 2.1. We consider a special case of Equation (2.1) that allows for dependence between business lines only (i.e. $U^{(n)}_{\pi(i,j)} = 0$). Further specify cell-wise dependence (i.e. partition mapping where $\mathcal{S}^{(n)}_{i,j} = \{Y^{(n)}_{i,j}\}$). This allows us to simplify $W_{\pi(i,j)} = V_{i,j}$. The contribution of the common shock to the total expected observation is then given by
$$\frac{\kappa^{(n)}_{i,j} \mathrm{E}\left[V_{i,j}\right]}{\kappa^{(n)}_{i,j} \mathrm{E}\left[V_{i,j}\right] + \mathrm{E}\left[Z^{(n)}_{i,j}\right]}. \tag{2.4}$$
If the scaling factor is removed, i.e. $\kappa^{(n)}_{i,j} = 1$, this proportion also has an inverse relationship with the mean of the idiosyncratic component $\mathrm{E}\left[Z^{(n)}_{i,j}\right]$. As explained in Section 2.1, values of claims in a portfolio of multiple triangles can vary in two main ways: across development years within a loss triangle, and across loss triangles. As a result, the proportion of common shock varies within and across loss triangles, wherein loss cells with larger values have smaller common shock contributions. In the case of pairwise dependence considered above, the disproportion is typically a result of varying tail lengths across business lines.
However, it is worth noting that unbalanced common shock proportions can also typically be observed for accident year dependence, or calendar dependence across business lines from the same cause.

We consider the case of accident year dependence across the two triangles illustrated in Section 2.1 (i.e. partition mapping $\mathcal{S}_s = \{Y^{(n)}_{s,j}; j = 1, ..., J; n = 1, ..., N\}$, and we can simplify $W_{\pi(i,j)} = V_i$). For illustration, the mean of the common shock $\mathrm{E}[V_i]$ is set to 5% of the loss ratios in the first development year of each accident year in the Bodily Injury line. The contributions of the common shock are shown in Figure 2.2 assuming no scaling terms. With accident year dependence across business lines, claims within the same accident year share the same common shock. These include claims from different development years within and across loss triangles. Because their values vary due to different claim activities within and across lines, their common shock proportions also vary. Specifically, common shock proportions are significantly smaller in areas with high claim activity, and larger in areas with low claim activity, as shown in Figure 2.2.

In general, quite significant variations in common shock proportions can be observed within and across segments in the absence of careful modelling as a result of the unbalanced nature of loss reserving data. One may wish to confine the relation of the common shock to total observations over the entire range of the triangles.

The most straightforward solution to balancing common shock proportions within and across triangles is to have cell-specific scaling factors $\kappa^{(n)}_{i,j}$, $\lambda^{(n)}_{i,j}$ to adjust the common shock effects for each total observation $Y^{(n)}_{i,j}$. However, this implies that $2IJN$ new parameters are required for the entire range of triangles of observed data and outstanding claims to be predicted.
Given that the variation in claim observations typically occurs across development periods, one may simplify the scaling factors to be column-specific, $\kappa^{(n)}_{i,j} = \kappa^{(n)}_j$, $\lambda^{(n)}_{i,j} = \lambda^{(n)}_j$. However, this still results in $2JN$ new parameters.

Loss triangle data typically has a small sample size. While the presence of scaling factors can mitigate the impact of the unbalanced nature of reserving data, it also adds many more parameters to the model. If scaling factors are not chosen carefully, it may result in over-fitting and the number of parameters to be estimated can even exceed the number of observations.

Figure 2.2: (Colour online) Heat maps of common shock contributions in the Bodily Injury line (top) and the Accident Benefits line (bottom) without using scaling terms

On some occasions, parameters $\lambda^{(n)}_{i,j}$ and $\kappa^{(n)}_{i,j}$ need to be specified such that the total observation $Y^{(n)}_{i,j}$ follows a specific distribution (Avanzi, Taylor and Wong, 2018). This is referred to as distributional tractability, or closure under the taking of marginals, which is considered in Joe (1997, Chapter 4) to be one of the four desirable properties of a multivariate model (see also Section 1).

Consider as an example the common shock Tweedie framework in Avanzi, Taylor, Vu and Wong (2016b). This framework is developed for cell-wise dependence across business lines (i.e. $\mathcal{S}^{(n)}_{i,j} = \{Y^{(n)}_{i,j}\}$, $U^{(n)}_{i,j} = 0$). Fitting this into the general common shock structure in Equation (2.1) and simplifying $W_{\pi(i,j)} = V_{i,j}$ we have
$$Y^{(n)}_{i,j} = \kappa^{(n)}_{i,j} V_{i,j} + Z^{(n)}_{i,j}, \tag{2.5}$$
where the two components $V_{i,j}$, $Z^{(n)}_{i,j}$ are assumed to be independent and have Tweedie distributions
$$V_{i,j} \sim \mathrm{Tweedie}_p(\alpha, \beta), \tag{2.6}$$
$$Z^{(n)}_{i,j} \sim \mathrm{Tweedie}_p\left(\eta^{(n)}_i \nu^{(n)}_j, \gamma^{(n)}\right). \tag{2.7}$$
Parameter $p$ is the power parameter which specifies a member of the Tweedie family, for example $p = 1$ corresponds to a Poisson distribution.
The representation of Tweedie distributions used is the reproductive representation (Jørgensen, 1997, Chapter 4). This representation specifies a Tweedie random variable using a location (or mean) parameter, and a dispersion parameter. In the above model specification, parameters $\alpha$ and $\eta^{(n)}_i \nu^{(n)}_j$ are the location parameters, and parameters $\beta$ and $\gamma^{(n)}$ are the dispersion parameters. The reproductive representation has a distinctive property wherein the weighted average of independent Tweedie variables with the same power parameter $p$ and the same location parameter is also a Tweedie variable with the same power and location parameters. The weighting factors are determined using dispersion parameters of the component variables in the weighted average.

It then follows that the mean and variance of the two components $V_{i,j}$, $Z^{(n)}_{i,j}$ are
$$\mathrm{E}\left[V_{i,j}\right] = \alpha, \qquad \mathrm{Var}\left[V_{i,j}\right] = \beta \alpha^p, \tag{2.8}$$
$$\mathrm{E}\left[Z^{(n)}_{i,j}\right] = \eta^{(n)}_i \nu^{(n)}_j, \qquad \mathrm{Var}\left[Z^{(n)}_{i,j}\right] = \gamma^{(n)} \left(\eta^{(n)}_i \nu^{(n)}_j\right)^p. \tag{2.9}$$
As stated in Remark 2.2 of Avanzi, Taylor, Vu and Wong (2016b), the simplest parametrisation is used for the common shock component $V_{i,j}$ with parameters $\alpha$ and $\beta$.

As mentioned earlier in this section, it can be desirable to maintain distributional tractability, or closure under the taking of marginals, for ease of interpretation. It follows from the form of closure under addition of the Tweedie family of distributions, as proven in Jørgensen (1997, Chapter 3), that a specific choice of $\kappa^{(n)}_{i,j}$ is required to ensure that $Y^{(n)}_{i,j}$ also has a Tweedie distribution. This choice is
$$\kappa^{(n)}_{i,j} = \left(\frac{\alpha}{\eta^{(n)}_i \nu^{(n)}_j}\right)^{1-p} \frac{\gamma^{(n)}}{\beta}. \tag{2.10}$$
The mean expression is given by
$$\mathrm{E}\left[Y^{(n)}_{i,j}\right] = \left(\frac{\alpha}{\eta^{(n)}_i \nu^{(n)}_j}\right)^{2-p} \frac{\gamma^{(n)}}{\beta}\, \eta^{(n)}_i \nu^{(n)}_j + \eta^{(n)}_i \nu^{(n)}_j, \tag{2.11}$$
where the first term in the summation is the contribution from the common shock and the second term is the contribution from the idiosyncratic component. The expected contribution of the common shock to the total expected observation is
$$\frac{\left(\frac{\alpha}{\eta^{(n)}_i \nu^{(n)}_j}\right)^{2-p} \frac{\gamma^{(n)}}{\beta}}{\left(\frac{\alpha}{\eta^{(n)}_i \nu^{(n)}_j}\right)^{2-p} \frac{\gamma^{(n)}}{\beta} + 1}. \tag{2.12}$$
The following observations can be made on the effect of the power parameter $p$:
– If $p < 2$: The above ratio increases as $\nu^{(n)}_j$ decreases. As a result, the proportion of common shock is understated in early development periods, and overstated in late development periods (Avanzi, Taylor and Wong, 2018). In a portfolio of segments with varying tail lengths, the larger the discrepancy between the tail lengths (i.e. between $\nu^{(n)}_j$ and $\nu^{(m)}_j$), the larger the variation in the common shock contributions. The behaviour of the above ratio has been examined with respect to the development factor $\nu^{(n)}_j$ in particular because variation within and across lines of business is mainly driven by the development pattern of claims as explained in Section 2.1. As a result, one would expect the development factors to vary the most.
– If $p > 2$: The opposite observation is made for the relationship between the above ratio and $\nu^{(n)}_j$ (i.e. the above ratio decreases as $\nu^{(n)}_j$ decreases).
– If $p = 2$: In this special case, the common shock contribution is simplified to
$$\frac{\gamma^{(n)}/\beta}{\gamma^{(n)}/\beta + 1}, \tag{2.13}$$
which is now independent of accident and development periods. Consequently, the common shock contributes proportionately to the total observations over the entire range of the triangles. It is also worth emphasising that specifying $p = 2$ gives the multivariate gamma case of the multivariate Tweedie framework.

The above analyses and examples show that the choices of scaling factors $\kappa^{(n)}_{i,j}$ and $\lambda^{(n)}_{i,j}$ are subject to many constraints. To accurately capture the dependence structure, these parameters are required to balance the common shock proportions within all claim observations over the entire range of the triangles. However, this can result in over-fitting, which can be a critical issue in loss reserving due to small sample size data. Furthermore, the specification of these parameters may need to be restricted in some cases for the purpose of preserving distributional tractability. It is then the aim of this paper to find a solution that compromises between these conflicting issues with a specific application to the common shock Tweedie approach in Avanzi, Taylor, Vu and Wong (2016b).
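The effect of the power parameter $p$ on the common shock proportion in Equation (2.12) can be checked numerically. The following Python sketch is an illustration under stated assumptions: the parameter values (`alpha`, `beta`, `gamma`, `eta`, `nu`) are hypothetical and chosen only to make the pattern visible, and the helper names `kappa` and `shock_proportion` are introduced here rather than taken from the paper.

```python
def kappa(alpha, beta, gamma, mean_z, p):
    """Scaling factor of Equation (2.10): (alpha / mean_z)**(1 - p) * gamma / beta."""
    return (alpha / mean_z) ** (1.0 - p) * gamma / beta


def shock_proportion(alpha, beta, gamma, mean_z, p):
    """Share of E[Y] contributed by the common shock, as in Equation (2.12)."""
    shock_mean = kappa(alpha, beta, gamma, mean_z, p) * alpha
    return shock_mean / (shock_mean + mean_z)


# Hypothetical parameters, for illustration only.
alpha, beta, gamma, eta = 0.05, 1.0, 0.5, 1.0
nu = [0.30, 0.20, 0.10, 0.02]  # development factors decreasing with lag

proportions = {
    p: [shock_proportion(alpha, beta, gamma, eta * v, p) for v in nu]
    for p in (1.5, 2.0, 2.5)
}
for p, props in proportions.items():
    print(p, [round(x, 4) for x in props])
```

With these inputs the proportion rises as $\nu^{(n)}_j$ falls when $p = 1.5$, falls when $p = 2.5$, and stays at $(\gamma/\beta)/(\gamma/\beta + 1)$ when $p = 2$, matching the three cases listed above.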

3. A common shock Tweedie approach to unbalanced data

In this section, we propose a solution that compromises between conflicting challenges encountered by common shock models when they are applied to reserving data due to the unbalanced feature of the data. The focus of this development is on a common shock Tweedie approach to unbalanced data. The estimation method for this approach is also given.

The multivariate Tweedie framework described in Section 2.5 is a typical example of an application of the common shock approach in stochastic loss reserving. It is of particular interest due to its various advantages. Developed on the Tweedie family of distributions, it offers flexible choices of marginal density that also include Tweedie’s compound Poisson density with the ability to deal with zero data points. The framework can also be generalised to more than two dimensions. In addition, the explicit common shock structure allows the correlation matrix to be obtained in closed form. Moment and cumulant generating functions can also be obtained analytically, enhancing the tractability of the model. Similar to other common shock models, this framework also encounters the issue of unbalanced data. As explained in Section 2, the selection of scaling coefficients for the common shock term in this framework is constrained by the need to balance common shock proportions while maintaining model parsimony and distributional tractability.

Claims are first standardised using a common unit of exposure, such as the number of claims or the total amount of premium collected, to ensure consistency across accident periods and business lines. Recall the specification of the common shock Tweedie model in Avanzi et al. (2016b) described in Section 2.5,
$$Y^{(n)}_{i,j} = \kappa^{(n)}_{i,j} V_{i,j} + Z^{(n)}_{i,j}, \tag{2.5}$$
where
$$V_{i,j} \sim \text{Tweedie}_p(\alpha, \beta), \tag{2.6}$$
$$Z^{(n)}_{i,j} \sim \text{Tweedie}_p\left(\eta^{(n)}_i \nu^{(n)}_j, \gamma^{(n)}\right), \tag{2.7}$$
$$\kappa^{(n)}_{i,j} = \left(\frac{\alpha}{\eta^{(n)}_i \nu^{(n)}_j}\right)^{1-p} \frac{\gamma^{(n)}}{\beta}. \tag{2.10}$$
Recall that $\alpha$ and $\eta^{(n)}_i \nu^{(n)}_j$ are location (mean) parameters, and $\beta$ and $\gamma^{(n)}$ are dispersion parameters of $V_{i,j}$ and $Z^{(n)}_{i,j}$ respectively.

As shown in Equation (2.10), the common shock scaling factor has to be specified in the above form that involves parameters of the common shock $V_{i,j}$ and the idiosyncratic component $Z^{(n)}_{i,j}$. However, due to the unbalanced feature of reserving data with $\nu^{(n)}_j$ varying across development lag $j$ and business line $n$, the common shock contributes disproportionately to the total observation $Y^{(n)}_{i,j}$. It is also desirable to maintain model parsimony.

Given the above considerations, we can replace the non-cell-specific parameter $\alpha$ in the scaling factor with the column-specific parameter
$$\alpha_j = \tilde{c} \left(\prod_n \mathrm{E}\left[Z^{(n)}_{i,j}\right]\right)^{1/N} = \tilde{c} \left(\prod_n \eta^{(n)}_i \nu^{(n)}_j\right)^{1/N} \tag{3.1}$$
$$\approx c \sqrt[N]{\nu^{(1)}_j \cdots \nu^{(N)}_j}. \tag{3.2}$$
The parameter $\alpha_j$ is also the location parameter of the common shock $V_{i,j}$. As a result, we approximately have
$$V_{i,j} \sim \text{Tweedie}_p(\alpha_j, \beta) = \text{Tweedie}_p\left(c \sqrt[N]{\nu^{(1)}_j \cdots \nu^{(N)}_j}, \beta\right). \tag{3.3}$$
Essentially, the common shock parameter $\alpha_j$ is proportional to the geometric average of idiosyncratic components of claims which share the same common shock component.
In this case, these are claims in the same accident period and development period, as the framework is used to capture cell-wise dependence. This geometric average can then be simplified by removing accident period factors because we can reasonably expect limited variation across accident periods as a result of claims standardisation, assuming no significant changes occur across accident periods.

The above specification of the scaling factor aims to balance the impact of the unbalanced feature in loss reserving data, which is mainly introduced by variations in development factors $\nu^{(n)}_j$. Using this specification, the common shock proportion is given by
$$\frac{\left(\dfrac{c \sqrt[N]{\nu^{(1)}_j \cdots \nu^{(N)}_j}}{\eta^{(n)}_i \nu^{(n)}_j}\right)^{2-p} \dfrac{\gamma^{(n)}}{\beta}}{\left(\dfrac{c \sqrt[N]{\nu^{(1)}_j \cdots \nu^{(N)}_j}}{\eta^{(n)}_i \nu^{(n)}_j}\right)^{2-p} \dfrac{\gamma^{(n)}}{\beta} + 1}. \tag{3.4}$$
This does not provide a complete balance of common shock proportions because the effect of $\nu^{(n)}_j$ is reduced by a factor $\sqrt[N]{\nu^{(n)}_j}$. However, it still provides quite a significant improvement over the original framework. This will be demonstrated in the simulation illustration in Section 4. This specification can also preserve distributional tractability of the framework. In addition, model parsimony is retained as the total number of parameters in the framework is unchanged. This can be considered an effective solution given the three constraints discussed in Section 2.

In addition to the above treatment for unbalanced data, we also introduce a treatment for negative claims
$$Y^{(n)}_{i,j} + \xi^{(n)} = \left(\frac{c \sqrt[N]{\nu^{(1)}_j \cdots \nu^{(N)}_j}}{\eta^{(n)}_i \nu^{(n)}_j}\right)^{1-p} \frac{\gamma^{(n)}}{\beta} V_{i,j} + Z^{(n)}_{i,j}, \tag{3.5}$$
where a translation factor is used and defined such that
$$\xi^{(n)} \begin{cases} = 0 & \text{if } \min\{Y^{(n)}_{i,j}, \forall i, j\} \geq 0, \\ \geq -\min\{Y^{(n)}_{i,j}\} & \text{if } \min\{Y^{(n)}_{i,j}, \forall i, j\} < 0. \end{cases} \tag{3.6}$$
The translation is only needed for a loss triangle if it contains at least one negative value, and it must be large enough to offset the smallest negative value.
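The lower bound in Equation (3.6) can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the use of `None` for unobserved cells and the small triangle are assumptions made here, not part of the paper.

```python
from typing import List, Optional

Triangle = List[List[Optional[float]]]  # None marks an unobserved cell


def translation_lower_bound(triangle: Triangle) -> float:
    """Smallest admissible xi^(n) for one triangle, following Equation (3.6)."""
    observed = [y for row in triangle for y in row if y is not None]
    smallest = min(observed)
    # No negative cell: no translation. Otherwise offset the smallest value.
    return 0.0 if smallest >= 0 else -smallest


def translate(triangle: Triangle, xi: float) -> Triangle:
    """Return the triangle of Y + xi, refusing a xi below the lower bound."""
    if xi < translation_lower_bound(triangle):
        raise ValueError("xi is too small to offset the smallest negative claim")
    return [[None if y is None else y + xi for y in row] for row in triangle]


# Hypothetical 3 x 3 triangle with one negative cell (illustration only).
example = [[10.0, 4.0, -1.5], [12.0, 5.0, None], [9.0, None, None]]
bound = translation_lower_bound(example)
shifted = translate(example, bound + 0.5)
```

The function only returns the deterministic bound; any value at or above it is admissible for a triangle with negative cells.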
It is worth emphasising that in this case, while its lower bound is deterministic, the actual value of $\xi^{(n)}$ still has to be estimated. The generalisation of this treatment to the general common shock framework in Avanzi, Taylor and Wong (2018) is straightforward.

Following from the above specification, the marginal density is then given by
$$Y^{(n)}_{i,j} + \xi^{(n)} \sim \text{Tweedie}_p\left(\eta^{(n)}_i \nu^{(n)}_j \left[\left(\frac{c \sqrt[N]{\nu^{(1)}_j \cdots \nu^{(N)}_j}}{\eta^{(n)}_i \nu^{(n)}_j}\right)^{2-p} \frac{\gamma^{(n)}}{\beta} + 1\right],\; \gamma^{(n)} \left[\left(\frac{c \sqrt[N]{\nu^{(1)}_j \cdots \nu^{(N)}_j}}{\eta^{(n)}_i \nu^{(n)}_j}\right)^{2-p} \frac{\gamma^{(n)}}{\beta} + 1\right]^{1-p}\right), \tag{3.7}$$
where the first parameter is the location parameter and also the mean of $Y^{(n)}_{i,j} + \xi^{(n)}$. The second parameter is the dispersion parameter. It follows that the vector of translated claims in the same position across all triangles
$${}_{\xi}\mathbf{Y}_{i,j} = \left(Y^{(1)}_{i,j} + \xi^{(1)},\; Y^{(2)}_{i,j} + \xi^{(2)},\; \ldots,\; Y^{(N)}_{i,j} + \xi^{(N)}\right)^{\top} \tag{3.8}$$
has a multivariate Tweedie distribution with the multivariate density
$$f_{{}_{\xi}\mathbf{Y}_{i,j}}\left(y^{(1)}_{i,j} + \xi^{(1)}, \ldots, y^{(N)}_{i,j} + \xi^{(N)}\right) = \int_0^{A_{i,j}} f_{V_{i,j}}(w_{i,j}) \prod_{n=1}^{N} f_{Z^{(n)}_{i,j}}\left(y^{(n)}_{i,j} + \xi^{(n)} - \left(\frac{c \sqrt[N]{\nu^{(1)}_j \cdots \nu^{(N)}_j}}{\eta^{(n)}_i \nu^{(n)}_j}\right)^{1-p} \frac{\gamma^{(n)}}{\beta} w_{i,j}\right) dw_{i,j}, \tag{3.9}$$
where
$$A_{i,j} = \min\left\{\left(\frac{\eta^{(1)}_i \nu^{(1)}_j}{c \sqrt[N]{\nu^{(1)}_j \cdots \nu^{(N)}_j}}\right)^{1-p} \frac{\beta}{\gamma^{(1)}} \left(y^{(1)}_{i,j} + \xi^{(1)}\right), \ldots, \left(\frac{\eta^{(N)}_i \nu^{(N)}_j}{c \sqrt[N]{\nu^{(1)}_j \cdots \nu^{(N)}_j}}\right)^{1-p} \frac{\beta}{\gamma^{(N)}} \left(y^{(N)}_{i,j} + \xi^{(N)}\right)\right\}, \tag{3.10}$$
and where $f(\cdot)$ is the Tweedie density in reproductive form (see also Jørgensen, 1997, Chapter 4).

Bayesian inference is used for model estimation. Bayesian estimation has gained popularity in the loss reserving literature due to rapid computing advancements and Markov Chain Monte Carlo (MCMC) methods that allow the calculation of intractable posterior densities to be performed significantly faster (Avanzi, Taylor, Vu and Wong, 2016b; Verrall, Hössjer and Björkwall, 2012).
In addition, the incorporation of prior densities in the calculation of posterior densities is a natural way to allow for parameter error in modelling (Shi, Basu and Meyers, 2012; England, Verrall and Wüthrich, 2012). Another aim of using a Bayesian set-up is to estimate the power parameter $p$ and translation parameter $\xi^{(n)}$ with allowance for parameter uncertainty. This formalises the estimation of these parameters, which are often estimated heuristically in practice. It is worth emphasising that the Bayesian structure is not integral to our model, but serves as a device for estimation.

A two-step procedure is used for estimation, similar to that in Avanzi, Taylor, Vu and Wong (2016b). The first stage is the estimation of all parameters except $c$ and $\beta$ of the common shock $V_{i,j}$. This stage, however, gives the estimate of a ratio of these parameters denoted as
$$\delta = \frac{c^{2-p}}{\beta}, \tag{3.11}$$
as can be observed from Equation (3.7). This is followed by the multivariate stage that estimates $c$ and $\beta$ conditional on estimates of other parameters from the first stage. The motivation for this procedure comes from properties of the common shock Tweedie framework. Claim observations in the same position across triangles in this framework follow a multivariate Tweedie distribution, and each observation itself also has a marginal Tweedie distribution. In addition, the multivariate density involves an integral, as shown in Equation (3.9). This can prolong the estimation of the posterior density, making the tuning and convergence of MCMC much more difficult.

A Bayesian set-up requires the specification of the likelihood functions, prior densities and, if posterior densities are not in closed form, computational algorithms used to approximate them. The likelihood functions follow from Equation (3.7) for the first stage and Equation (3.9) for the second stage.

Prior densities are then specified.
Prior densities can be chosen to be informative or uninformative. Uninformative priors assign equal possibilities to all values in the feasible set of parameter values, whereas informative priors convey some prior preference for certain values of the parameters. The use of informative priors can significantly improve the convergence rate, especially when the parameter dimension is large (Congdon, 2010). Parameter estimates from a univariate Tweedie model (Alai and Wüthrich, 2009) can assist in the specification of informative prior densities for parameters $\eta^{(n)}_i$, $\nu^{(n)}_j$ and $\gamma^{(n)}$. A preliminary analysis of the dependence structure can help select informative prior densities for the common shock parameters $c$ and $\beta$. Regarding the prior densities for $p$ and $\xi^{(n)}$, some constraints need to be taken into account. In particular, $p$ is not defined in $(0, 1)$, and $\xi^{(n)}$ has a lower bound as per its specification in Equation (3.6).

Putting together the likelihood and prior specifications, the posterior density in the first stage is given by
$$f_{\boldsymbol{\Omega} | \mathbf{Y}_U}(\boldsymbol{\Omega} | \mathbf{Y}_U) \propto \left[\prod_{i,j,n} f_{Y^{(n)}_{i,j} + \xi^{(n)}}\left(y^{(n)}_{i,j} + \xi^{(n)} \,\middle|\, \boldsymbol{\Omega}\right)\right] f_p(p)\, f_{\boldsymbol{\xi}}(\boldsymbol{\xi})\, f_{\delta}(\delta)\, f_{\boldsymbol{\eta}}(\boldsymbol{\eta})\, f_{\boldsymbol{\nu}}(\boldsymbol{\nu})\, f_{\boldsymbol{\gamma}}(\boldsymbol{\gamma}), \tag{3.12}$$
where
$$\boldsymbol{\Omega} = (p, \boldsymbol{\xi}, \delta, \boldsymbol{\eta}, \boldsymbol{\nu}, \boldsymbol{\gamma})^{\top}, \quad \boldsymbol{\xi} = (\xi^{(1)}, \xi^{(2)}, \ldots, \xi^{(N)})^{\top}, \quad \boldsymbol{\eta}_i = (\eta^{(1)}_i, \eta^{(2)}_i, \ldots, \eta^{(N)}_i)^{\top}, \quad \boldsymbol{\eta} = (\boldsymbol{\eta}_1, \boldsymbol{\eta}_2, \ldots, \boldsymbol{\eta}_I)^{\top},$$
$$\boldsymbol{\nu}_j = (\nu^{(1)}_j, \nu^{(2)}_j, \ldots, \nu^{(N)}_j)^{\top}, \quad \boldsymbol{\nu} = (\boldsymbol{\nu}_1, \boldsymbol{\nu}_2, \ldots, \boldsymbol{\nu}_J)^{\top}, \quad \boldsymbol{\gamma} = (\gamma^{(1)}, \gamma^{(2)}, \ldots, \gamma^{(N)})^{\top},$$
and where $\mathbf{Y}_U$ is a vector of claim observations in the upper claim triangles.

From the model structure in Equation (3.5), all claims $Y^{(n)}_{i,j}$ are independent conditional on the common shock. Hence, the joint likelihood can be written as a product of two separate parts: a product of the densities of claims conditional on the common shock, and the density of the common shock.
In the first stage of the estimation procedure, the likelihood obtained is the first part of the joint likelihood. As mentioned earlier, this stage provides the estimates of mean parameters $\boldsymbol{\nu}$, $\boldsymbol{\eta}$ and dispersion parameters $\boldsymbol{\gamma}$ of the idiosyncratic variables $Z^{(n)}_{i,j}$, translation parameters $\boldsymbol{\xi}$ and power parameter $p$. This stage also provides the estimate of $\delta$, which is a function of parameters $c$ and $\beta$ of the common shock $V_{i,j}$.

In the second estimation step, we work with the joint likelihood directly since common shock components are not observed. In this step, the estimation of $c$ and $\beta$ is carried out conditioning on estimates of other parameters in the first step, including $\delta$, which is a function of $c$ and $\beta$. The multivariate Tweedie density of $\mathbf{Y}_{i,j}$ is used to obtain the likelihood in this estimation. The posterior density in this step is given by
$$f_{c | \mathbf{Y}_U, \boldsymbol{\Omega}}(c | \mathbf{Y}_U, \boldsymbol{\Omega}) \propto \left[\prod_{i,j} f_{{}_{\xi}\mathbf{Y}_{i,j}}\left({}_{\xi}\mathbf{y}_{i,j} \,\middle|\, c, \boldsymbol{\Omega}\right)\right] f_c(c). \tag{3.13}$$
The posterior densities in both stages are not in recognisable forms, hence MCMC algorithms are required for the estimation. The MCMC algorithm used is Metropolis-Hastings, which is a popular class of MCMC algorithms when the posterior distribution is not in a recognisable form. Random walk Metropolis-Hastings algorithms are used for marginal estimation and multivariate estimation. Proposal densities are chosen (tuned) so that acceptance probabilities are within desirable ranges. The tuning process can be done manually using classical Metropolis-Hastings algorithms. Alternatively, it can be done automatically in adaptive Metropolis-Hastings algorithms using coerced acceptance rates (Haario, Saksman and Tamminen, 2001; Vihola, 2012).
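For readers less familiar with the sampler, the following is a minimal sketch of a random walk Metropolis-Hastings chain with burn-in and thinning, written with the Python standard library only. It is not the authors' implementation: the target is a toy Gamma(3, 2) posterior for a single positive parameter sampled on the log scale, and the step size, chain length and function names are assumptions chosen for illustration.

```python
import math
import random


def log_target(phi: float) -> float:
    """Unnormalised log posterior of phi = log(theta) for a toy Gamma(3, 2) target."""
    # The Gamma(3, 2) log density in theta is 2*log(theta) - 2*theta + const;
    # the Jacobian of theta = exp(phi) adds one more phi.
    return 3.0 * phi - 2.0 * math.exp(phi)


def random_walk_mh(log_post, start, step, n_iter, burn_in, thin, seed=0):
    """Random walk Metropolis-Hastings with a symmetric normal proposal."""
    rng = random.Random(seed)
    current, current_lp = start, log_post(start)
    kept, accepted = [], 0
    for it in range(n_iter):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_post(proposal)
        diff = proposal_lp - current_lp
        # Symmetric proposal: accept with probability min(1, exp(diff)).
        if diff >= 0.0 or rng.random() < math.exp(diff):
            current, current_lp = proposal, proposal_lp
            accepted += 1
        # Discard the burn-in, then keep every `thin`-th draw.
        if it >= burn_in and (it - burn_in) % thin == 0:
            kept.append(current)
    return kept, accepted / n_iter


samples, acceptance_rate = random_walk_mh(
    log_target, start=0.0, step=0.8, n_iter=20_000, burn_in=5_000, thin=5
)
theta = [math.exp(phi) for phi in samples]
posterior_mean = sum(theta) / len(theta)  # the toy target has mean 3 / 2
```

Manual tuning in this sketch amounts to adjusting `step` until `acceptance_rate` falls in the desired range; an adaptive scheme would update the step during the run instead.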

4. Simulation illustrations

Two illustrations are performed on two data sets. The first illustration, provided in Section 4.1, assesses the accuracy of the estimation procedure. Since true parameter values are known for simulated data, a comparison of their estimates with their true values gives an indication of the appropriateness of the estimation procedure. The second illustration, provided in Section 4.2, compares the performance of the common shock Tweedie approach with treatment for unbalanced data and the original common shock Tweedie approach in Avanzi, Taylor, Vu and Wong (2016b). This comparison focuses particularly on the contributions of common shock estimated from the two approaches.

A data set consisting of two business lines, one of which has a negative claim observation, is simulated. The two loss triangles are presented in Tables A.1 and A.2 in Appendix A. These two triangles consist of simulated claim observations. Each observation in the triangles is drawn from the multivariate Tweedie model for unbalanced data presented in Section 3. For simplicity, these observations are assumed to have been adjusted for changes in exposure across accident years.

The marginal fitting is performed first. Parameters are transformed using the log transformation, and uniform prior densities are used. 200,000 simulations are run and 100,000 simulations are discarded as the burn-in period. The sample chain is thinned by keeping every 5th iteration to reduce the serial dependence between iterations. MCMC paths of some parameters are given in Figure A.1. A similar procedure is performed for the multivariate estimation. The estimates of $c$ and $\beta$ are obtained from this step. Parameter estimates are provided in Table A.3 in Appendix A.

To evaluate the Bayesian inference used for estimation, we compare the true parameter values with 90% confidence intervals obtained from the posterior distributions of these parameters. The results show that the true values always lie within the corresponding 90% confidence intervals. This indicates the accuracy of the estimation procedure.

We have calibrated the model on the same simulated data set using sub-triangles of dimension 5 ×

Table 4.1: Comparison of outstanding claims forecasts

It can be observed from Table 4.1 that the true forecasts fall within the 90% confidence intervals of the balanced multivariate Tweedie model forecasts. The forecasts from our model are also closer to the true forecasts than those from the multivariate chain ladder model. However, it is worth noting that the simulated data was generated from the multivariate Tweedie model in this illustration.

We further assess bias in the resulting dependence structure by comparing the true cell-wise Pearson correlation coefficients for the outstanding claims and the cell-wise Pearson correlation coefficients calculated using the parameter estimates. Residual ratios, defined as ratios of estimated Pearson correlation coefficients to true Pearson correlation coefficients, are provided in Table 4.2. The ratios are close to 1, indicating that the cell-wise dependence in the data is well captured.

| Accident year \ Development year | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | | | | | | | | | | |
| 2 | | | | | | | | | | 1.07 |
| 3 | | | | | | | | | 0.99 | 0.99 |
| 4 | | | | | | | | 1.02 | 1.04 | 1.04 |
| 5 | | | | | | | 1.00 | 0.98 | 0.99 | 1.00 |
| 6 | | | | | | 1.03 | 1.02 | 1.01 | 1.02 | 1.03 |
| 7 | | | | | 1.02 | 1.03 | 1.03 | 1.01 | 1.03 | 1.03 |
| 8 | | | | 1.01 | 1.02 | 1.03 | 1.03 | 1.01 | 1.03 | 1.03 |
| 9 | | | 1.04 | 1.03 | 1.04 | 1.05 | 1.05 | 1.03 | 1.05 | 1.05 |
| 10 | | 1.00 | 0.99 | 0.98 | 0.99 | 1.00 | 1.00 | 0.98 | 1.00 | 1.00 |

Table 4.2: Residual ratios of estimated Pearson correlation coefficients to true Pearson correlation coefficients
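The cell-wise Pearson correlation behind these ratios can be sketched from the model moments. The code below is an illustration under stated assumptions, not the authors' computation: it takes the structure of Equation (3.5) with independent $V_{i,j}$ and $Z^{(n)}_{i,j}$, uses the reproductive-form variance (dispersion times mean to the power $p$), and all parameter values and names are hypothetical.

```python
import math


def cellwise_correlation(c, beta, p, eta, nu, gamma):
    """Pearson correlation between Y^(1)_{i,j} and Y^(2)_{i,j} for one cell.

    eta, nu and gamma hold the line-specific values eta_i^(n), nu_j^(n), gamma^(n).
    """
    alpha_j = c * math.prod(nu) ** (1.0 / len(nu))  # Equation (3.2)
    var_v = beta * alpha_j ** p                     # Var(V) = dispersion * mean^p
    kappas, variances = [], []
    for eta_n, nu_n, gamma_n in zip(eta, nu, gamma):
        mean_z = eta_n * nu_n
        kappa = (alpha_j / mean_z) ** (1.0 - p) * gamma_n / beta  # scaling in (3.5)
        kappas.append(kappa)
        variances.append(kappa ** 2 * var_v + gamma_n * mean_z ** p)
    covariance = kappas[0] * kappas[1] * var_v      # only V is shared by the lines
    return covariance / math.sqrt(variances[0] * variances[1])


# Hypothetical "true" and "estimated" parameter sets for a single cell.
true_params = dict(c=0.50, beta=0.20, p=1.5, eta=(1.0, 1.0), nu=(0.30, 0.10), gamma=(0.05, 0.08))
est_params = dict(c=0.52, beta=0.21, p=1.5, eta=(1.0, 1.0), nu=(0.29, 0.11), gamma=(0.05, 0.08))

rho_true = cellwise_correlation(**true_params)
rho_est = cellwise_correlation(**est_params)
residual_ratio = rho_est / rho_true
```

The translation $\xi^{(n)}$ does not appear because adding a constant leaves covariances and variances unchanged.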

We acknowledge that the use of the two-step Bayesian inference does not provide the full picture due to the dependence between the estimated parameters and the reserve being a non-linear function of these parameters. However, this calibration approach was selected due to a number of advantages as mentioned in Section 3. These include overcoming the difficulties in dealing with Tweedie densities which are not in tractable form, and enhancing computational speed. From the analyses provided above, we can conclude that:
– The calibration method can capture the dependence structure well.
– The resulting reserve predictions show no apparent bias and they are in line with the chain ladder predictions.
Therefore, even though we may not get the full picture, the above results give us confidence that this would not have a material impact on the performance of the calibration.

A natural question arises regarding the performance of the multivariate Tweedie approach for unbalanced data compared to the original multivariate Tweedie approach introduced in Avanzi, Taylor, Vu and Wong (2016b). To be able to assess their performances more accurately, this comparison is performed on a simulated data set whose underlying model is known. True common shock contributions are also known and these serve as the benchmark for the comparison.

To not put any particular framework at a disadvantage, the synthetic data used for this illustration is simulated from a mixture of models. We deliberately select an (extreme) data set to which neither of the frameworks is properly adapted. In particular, two loss triangles of ten development periods and ten accident periods are generated such that the dependence is strong in the first four development periods, and not as strong in the last six periods. The common shock components are generated with column-specific mean parameters $\alpha_j = c_j \sqrt{\nu^{(1)}_j \nu^{(2)}_j}$ with $c_j = 0.$… for $1 \leq j \leq 4$, and $c_j = 0.02$ for $5 \leq j \leq 10$. The second business line is also simulated to be longer-tailed than the first. Similar to the previous illustration, each observation in the two triangles is drawn from the multivariate Tweedie model for unbalanced data presented in Section 3. These observations are assumed to have been standardised for accident year effect for simplicity. The two loss triangles are presented in Tables B.1 and B.2 in Appendix B.

Heat maps of ratios of fitted common shock proportions to true proportions are given in Figure 4.3 for triangle 1. Fitted values are calculated using posterior medians of parameters and true values are calculated using true parameter values. The modified Tweedie model provides a very good fit for the first four development periods. The goodness of fit is considerably less satisfactory in the later development periods when the true common shock proportion drops. The discrepancy is more significant for the first business line, which has shorter tail development. The original common shock Tweedie model provides a poor goodness of fit overall, especially in early development periods. The proportions of common shock are underestimated in early development periods and overestimated in later periods. Even though not reported here, similar results are also observed in heat maps of ratios of fitted common shock proportions to true proportions for triangle 2.

Figure 4.3: (Colour online) Heat maps of ratios of fitted common shock proportions to true proportions for triangle 1 (top: Tweedie framework modified for unbalanced data, bottom: original common shock Tweedie framework)

… $c$ and the true model generating the data (with column-specific scaling term $c_j$). With this specification, it is not surprising that the earlier (large) development periods dominate the estimation of $c$. We do not expect good results because of model misspecification, but we can arrive at two main conclusions: the modified framework outperforms the original framework; and the common shock proportions are mis-estimated in the higher development periods, where amounts are small and do not contribute significantly to total liability. It is also worth noting that the poor estimation of common shock proportion does not affect mean forecasts, only dependency between the triangles, and then only where the magnitudes of the forecasts are small.

5. Illustration with real data

The data used for illustration is a set of two triangles from the Bodily Injury line (1) and the Accident Benefits (excluding Disability Income) line (2) from a Canadian insurance company provided in Côté, Genest and Abdallah (2016). These two triangles have also been used for illustrations in Sections 1 and 2 and their details can be found therein.

A preliminary analysis is performed to assess the suitability of this data set. This includes the assessment of the tails, as well as the dependence structure.

From the plots of loss ratios provided earlier in Figure 2.1, it can be observed that the Bodily Injury line has longer claims development than the Accident Benefits line. Tail lengths of the two business lines are also assessed using age-to-age development factors
$$f^{(n)}_j = \frac{\sum_{i=1}^{I-j} Y^{(n)}_{i,j+1}}{\sum_{i=1}^{I-j} Y^{(n)}_{i,j}}. \tag{5.1}$$
Results are given in Table 5.3. It can be observed that the development factors of the Bodily Injury line dominate those of the Accident Benefits line for all development periods, except in the final year. However, this blip may be a false signal, because the data are truncated at the last development period and only a single observation is available in this final year. Hence the Bodily Injury line is convincingly longer-tailed than the Accident Benefits line.

Table 5.3: Claims development factors for each development period (columns: $j$, $f^{(1)}_j$, $f^{(2)}_j$)

5.1.2. Explanatory dependence analysis

A heuristic dependence analysis is performed by fitting to each line a Tweedie GLM with a log-link and the chain ladder mean structure
$$a^{(n)}_i + b^{(n)}_j. \tag{5.2}$$
This is to remove fixed accident period and development period effects. Correlations between GLM Pearson residuals of the two lines are given in Table 5.4. The dependence between residuals is strong and significant after allowing for fixed accident period and development period effects.

| Pearson | Spearman | Kendall |
|---|---|---|
| 0.3659 (0.0060) | 0.3480 (0.0096) | 0.2525 (0.0065) |

Table 5.4: Correlation coefficients between cell-wise GLM residuals and their corresponding p-values

To examine whether this strong correlation comes from calendar year effects that can impact both lines simultaneously, we also perform another GLM analysis with an additional fixed calendar year effect in the mean structure
$$a^{(n)}_i + b^{(n)}_j + h^{(n)}_t. \tag{5.3}$$
Correlations between GLM Pearson residuals of the two lines are then given in Table 5.5. The correlation coefficients have been reduced, but only slightly.

| Pearson | Spearman | Kendall |
|---|---|---|
| 0.3416 (0.0107) | 0.3250 (0.0159) | 0.2202 (0.0176) |

Table 5.5: Correlation coefficients between cell-wise GLM residuals and their corresponding p-values after removing fixed calendar year effects

Heat maps of residual ratios are given in Figure C.2 in Appendix C. Residual ratios are defined as ratios of observed values to GLM fitted values with the mean structure specified in Equation (5.3). There are some common cell-wise patterns that are quite obvious from the heat maps, for example, low payments in development year 7 compensated by accelerated payments in years 8-9 in the first accident year, payment dips in accident year 4 and development year 2, and similar development patterns in certain accident years.

Results from the preliminary analysis show that this data set is suitable for illustrating the model.
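As a side note on the tail-length assessment, Equation (5.1) can be sketched as follows. The function applies the formula literally to whatever claim measure the triangle holds; the function name, the use of `None` for unobserved cells and the 3 x 3 example values are assumptions for illustration, not the Bodily Injury or Accident Benefits data.

```python
from typing import List, Optional


def age_to_age_factors(triangle: List[List[Optional[float]]]) -> List[float]:
    """Development factors f_j of Equation (5.1) for a square I x I triangle."""
    n_periods = len(triangle)
    factors = []
    for j in range(n_periods - 1):          # 0-based column j is development period j + 1
        rows = range(n_periods - j - 1)     # accident periods observed in both columns
        numerator = sum(triangle[i][j + 1] for i in rows)
        denominator = sum(triangle[i][j] for i in rows)
        factors.append(numerator / denominator)
    return factors


# Hypothetical triangle; None marks cells in the unobserved lower part.
example = [
    [100.0, 150.0, 165.0],
    [110.0, 160.0, None],
    [120.0, None, None],
]
factors = age_to_age_factors(example)
```

The last factor rests on a single accident period, which mirrors the caution in the text about the final development year.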

Bayesian inference is used for estimation. The marginal fitting is performed first. 400,000 simulations are run and 300,000 simulations are discarded as the burn-in period. The sample chain is thinned by keeping every 5th iteration to reduce the serial dependence between iterations. The multivariate fitting is then performed with 90,000 simulations, of which the first 30,000 are discarded as the burn-in period. The chain is then thinned by selecting every 3rd iteration. Summary statistics are then computed on these posterior samples. The results are given in Tables C.3 and C.4 of Appendix C.

Marginal and multivariate goodness of fit are assessed. Marginal goodness of fit is assessed using QQ plots of residuals in Figure 5.4. The plots show that the fit is quite off in the right tail of the Bodily Injury line, and slightly off in both tails of the Accident Benefits line. The goodness of fit in other regions, however, is reasonable. This may be a result of the restriction of using the same power parameter $p$ for both lines. However, the multivariate Tweedie framework still provides marginal flexibility with flexible choices of $p$. For comparison, similar QQ plots are produced for a common shock normal model in Figure 5.5. It can be observed that the Tweedie marginals provide a much better fit than the normal marginals (with power parameter $p = 0$).

Figure 5.4: QQ plots of residuals from common shock Tweedie model ($p = 1.$…) (panels: Bodily injury, Accident benefits; axes: theoretical quantiles against sample quantiles)

Figure 5.5: QQ plots of residuals from common shock normal model (panels: Bodily injury, Accident benefits; axes: theoretical quantiles against sample quantiles)

Multivariate goodness of fit is assessed by comparing the empirical bivariate marginals of real data observations and of back-fitted values. These are obtained using the empirical cumulative distribution functions of claim observations from each triangle. Because Bayesian inference is used, various sets of back-fitted data can be generated. A path is randomly chosen for illustration. Scatter plots of these empirical bivariate marginals are presented in Figure 5.6. It can be observed that the model can capture the general positive dependence structure in the data.

Figure 5.6: Plots of empirical bivariate marginals for observed values and back-fitted values (panels: Observed data, Fitted data; axes: Bodily injury against Accident benefits)

To look for any trace of dependence not captured by the model, we examine the residuals from model fitting. These residuals are obtained as the differences between observations and fitted values, where the latter are calculated using posterior estimates. The Pearson correlation coefficient of these residuals reduces to 0.1204 (p-value 0.3812). This is much weaker than the correlation coefficient of 0.3416 of GLM Pearson residuals in Section 5.1.2 and is also insignificant. The insignificant correlation indicates that our model has explained away most of the dependence in the data.

Predictive distributions of outstanding claim observations in the lower triangles can be calculated using predictive Bayesian inference. Using parameter estimates, the contributions of common shock within each cell in the two triangles are calculated and given in Tables 5.6 and 5.7. It can be observed that there is only a very mild variation in the common shock proportions within and across triangles. We can relate this result to the challenges, discussed in Section 2, that come from applying a common shock model to loss reserving data of an unbalanced nature. It shows that the proposed approach has provided a balance of common shock proportions across all loss cells within and across loss triangles.
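A minimal sketch of how such cell-wise proportions follow from the parameters is given below. It compares the original specification (constant $\alpha$, Equation (2.10)) with the column-specific $\alpha_j$ of Equations (3.2) and (3.4). All parameter values and function names are hypothetical, and the two development patterns are deliberately chosen to be broadly similar, which is the situation in which the geometric average helps most.

```python
import math


def proportion(alpha, mean_z, gamma, beta, p):
    """Common shock share of the expected total for one cell."""
    ratio = (alpha / mean_z) ** (2.0 - p) * gamma / beta
    return ratio / (ratio + 1.0)


def proportions_original(alpha, eta, nu_line, gamma, beta, p):
    """Shares under a single location parameter alpha for every column."""
    return [proportion(alpha, eta * nu, gamma, beta, p) for nu in nu_line]


def proportions_modified(c, eta, nu_by_line, line, gamma, beta, p):
    """Shares under alpha_j = c * geometric mean of nu_j across lines (Eq. 3.4)."""
    n_lines = len(nu_by_line)
    shares = []
    for j, nu in enumerate(nu_by_line[line]):
        alpha_j = c * math.prod(nus[j] for nus in nu_by_line) ** (1.0 / n_lines)
        shares.append(proportion(alpha_j, eta * nu, gamma, beta, p))
    return shares


# Hypothetical development patterns for two lines over four development periods.
nu_by_line = [[0.40, 0.30, 0.20, 0.10], [0.30, 0.30, 0.25, 0.15]]
settings = dict(gamma=0.05, beta=0.10, p=1.5)

original = proportions_original(alpha=0.25, eta=1.0, nu_line=nu_by_line[0], **settings)
modified = proportions_modified(c=1.0, eta=1.0, nu_by_line=nu_by_line, line=0, **settings)

spread_original = max(original) - min(original)
spread_modified = max(modified) - min(modified)
```

With these inputs the spread of proportions across development periods is smaller under the modified specification, though not zero, in line with the incomplete balance noted in Section 3.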

To obtain the distributions of the outstanding claims, posterior samples of parameters from the Bayesian inference are used to project claims in the lower triangles. This projection utilises the specification in Equations (2.7), (3.3) and (3.5). This gives a set of samples of future claims in the lower triangles. Using this set, summary statistics of the total outstanding claims distributions are given in Table 5.8 and kernel densities of outstanding claims are given in Figure 5.7. Summary statistics provided include the posterior mean, standard deviation and VaR at two levels of the distribution of total outstanding claims for each line, as well as for both lines.

Table 5.6: Proportions of common shock to the expected total observations calculated using parameter estimates - Bodily Injury (rows: accident year; columns: development years 1 to 10)

Table 5.7: Proportions of common shock to the expected total observations calculated using parameter estimates - Accident Benefits (rows: accident year; columns: development years 1 to 10)

Figure 5.7: Kernel densities of predictive distributions of total outstanding claims in each line of business and in the aggregate portfolio (horizontal axis: total unpaid losses; vertical axis: density; curves: Bodily injury, Accident benefits, Total)

Table 5.8: Summary statistics of outstanding claims distributions

The empirical bivariate marginals of total reserves are shown in Figure 5.8. For illustration purposes, we show the scatter plot of total reserves from 1,000 posterior samples. The plot shows a mild positive dependence structure in the total outstanding claims across the two lines. This is accompanied by a Pearson correlation of 0.0855 (p-value < …).

Figure 5.8: Plot of empirical bivariate marginals of total reserves (using 1,000 posterior samples; axes: Bodily Injury against Accident Benefits)

The two business lines do not have a comonotonic dependence structure, and this allows the insurer to gain some diversification benefits when setting risk margins. Using the specification of a risk margin under APRA's Prudential Standards GPS 340, we have the following definitions of Risk margin and Diversification Benefit (DB)
$$\text{Risk margin}_{\chi\%}[Y] = \max\left\{\text{VaR}_{\chi\%}[Y] - \mathrm{E}[Y];\; \tfrac{1}{2}\,\mathrm{SD}[Y]\right\}, \tag{5.4}$$
$$\text{DB} = \frac{\left(\text{Risk margin}_{\chi\%}[Y_1] + \text{Risk margin}_{\chi\%}[Y_2]\right) - \text{Risk margin}_{\chi\%}[Y_1 + Y_2]}{\text{Risk margin}_{\chi\%}[Y_1] + \text{Risk margin}_{\chi\%}[Y_2]} \times 100\%, \tag{5.5}$$
where $Y_1$ and $Y_2$ denote the total outstanding claims of the two lines. Risk margins at two levels, as well as the associated diversification benefits, are provided in Table 5.9. It can then be observed that diversification benefits can be gained as a result of allowing for (non-comonotonic) dependence across business lines.

Table 5.9: Risk margin and diversification benefits statistics (columns: Bodily Injury, Accident Benefits, Both lines, DB)
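Equations (5.4) and (5.5) can be evaluated directly on paired samples of total outstanding claims. The sketch below is illustrative only: the empirical quantile rule, the 75% level and the gamma-distributed samples with a shared shock are assumptions made here, not the paper's posterior samples or reported levels.

```python
import math
import random
import statistics


def empirical_var(samples, level):
    """Empirical quantile used as VaR at the given level (0 < level < 1)."""
    ordered = sorted(samples)
    index = max(0, math.ceil(level * len(ordered)) - 1)
    return ordered[index]


def risk_margin(samples, level):
    """Equation (5.4): the larger of (VaR - mean) and half a standard deviation."""
    mean = statistics.fmean(samples)
    return max(empirical_var(samples, level) - mean, 0.5 * statistics.pstdev(samples))


def diversification_benefit(line1, line2, level):
    """Equation (5.5), in percent, from paired samples of the two lines."""
    separate = risk_margin(line1, level) + risk_margin(line2, level)
    combined = risk_margin([a + b for a, b in zip(line1, line2)], level)
    return (separate - combined) / separate * 100.0


# Hypothetical paired reserve samples that share a common shock.
rng = random.Random(1)
shock = [rng.gammavariate(2.0, 1.0) for _ in range(10_000)]
line1 = [v + rng.gammavariate(5.0, 1.0) for v in shock]
line2 = [v + rng.gammavariate(5.0, 1.0) for v in shock]
db_percent = diversification_benefit(line1, line2, level=0.75)
```

The samples must be paired draw by draw so that the sum reflects the dependence between lines; for comonotonic lines the benefit computed this way is essentially zero.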

6. Conclusion

Common shock approaches can provide many benefits in the modelling of outstanding claims. However, they often require very careful parametrisation. This arises from the unbalanced nature of loss reserving data. It is often desirable to use scaling factors to adjust the common shock effects so that they can contribute proportionately to the total observations over the entire range of the triangles. However, an excessive use of scaling factors can result in over-parametrisation. In some cases, such as the common shock Tweedie framework developed in Avanzi, Taylor, Vu and Wong (2016b), it is also desirable to select scaling factors such that distributional tractability is preserved. These requirements place conflicting constraints on the specification of scaling factors in common shock models.

In this paper, we propose an approach which compromises between the various constraints mentioned above. This approach involves using careful and parsimonious parametrisation to develop a common shock Tweedie framework modified for unbalanced data. Additional modifications for negative claims are also undertaken under this framework. Illustrations using simulated and real data are presented. These illustrations show that while the proposed approach cannot fully eliminate the issue of unbalanced common shock proportions, the improvement over the original framework in Avanzi, Taylor, Vu and Wong (2016b) is quite substantial.

We examined a common shock Tweedie approach for cell-wise dependence in this paper. Future research could consider applications to other structures of dependence (such as calendar period dependence). This paper raises some potential issues of common shock models when they are applied to reserving data that has an unbalanced nature. These issues, however, might appear whenever common shock models are applied to heterogeneous data. These can include mortality data for different age groups, or capital modelling for different types of risks. The proposed solution could be extended to solve similar problems in other contexts. While this solution can reduce the problems of unbalanced data quite substantially, a complete balance in common shock proportions cannot be achieved. Future research could consider a better solution to this problem. Other multivariate models with explicit dependence structures, such as mixture models, could also be considered as they might be more applicable to unbalanced data.

Acknowledgements

Results in this paper were presented at The Australasian Actuarial Education and Research Symposium in 2017 and the 22nd International Congress on Insurance: Mathematics and Economics in 2018. The authors are grateful for constructive comments received from colleagues who attended these conferences. The authors are also thankful to the two anonymous reviewers for their constructive comments that helped significantly improve the paper.

This research was supported under Australian Research Council’s Linkage (LP130100723, with funding partners Allianz Australia Insurance Ltd, Insurance Australia Group Ltd, and Suncorp Metway Ltd) and Discovery (DP200101859) Projects funding schemes. Furthermore, Phuong Anh Vu acknowledges financial support from a University International Postgraduate Award/University Postgraduate Award and supplementary scholarships provided by the UNSW Business School. The views expressed herein are those of the authors and are not necessarily those of the supporting organisations.

References

Abdallah, A., Boucher, J.P., Cossette, H., 2015. Modeling dependence between loss triangles with hierarchical Archimedean copulas. ASTIN Bulletin 45, 577–599.

Ajne, B., 1994. Additivity of chain-ladder projections. ASTIN Bulletin 24, 311–318.

Alai, D.H., Landsman, Z., Sherris, M., 2013. Lifetime dependence modelling using a truncated multivariate gamma distribution. Insurance: Mathematics and Economics 52, 542–549.

Alai, D.H., Landsman, Z., Sherris, M., 2015. A multivariate Tweedie lifetime model: Censoring and truncation. Insurance: Mathematics and Economics 64, 203–213.

Alai, D.H., Landsman, Z., Sherris, M., 2016. Multivariate Tweedie lifetimes: The impact of dependence. Scandinavian Actuarial Journal 2016, 692–712.

Alai, D.H., Wüthrich, M.V., 2009. Taylor approximations for model uncertainty within the Tweedie exponential dispersion family. ASTIN Bulletin 39, 453.

Avanzi, B., Taylor, G., Vu, P.A., Wong, B., 2016b. Stochastic loss reserving with dependence: A flexible multivariate Tweedie approach. Insurance: Mathematics and Economics 71, 63–78.

Avanzi, B., Taylor, G., Vu, P.A., Wong, B., 2019. A multivariate evolutionary generalised linear model framework with adaptive estimation for claims reserving. Available at SSRN: https://ssrn.com/abstract=3413016.

Avanzi, B., Taylor, G., Wong, B., 2016a. Correlations between insurance lines of business: An illusion or a real phenomenon? Some methodological considerations. ASTIN Bulletin 46, 225–263.

Avanzi, B., Taylor, G., Wong, B., 2018. Common shock models for claim arrays. ASTIN Bulletin 48, 1–28.

Avanzi, B., Wong, B., Yang, X., 2016. A micro-level claim count model with overdispersion and reporting delays. Insurance: Mathematics and Economics 71, 1–14.

Boucher, J.P., Davidov, D., 2011. On the importance of dispersion modeling for claims reserving: An application with the Tweedie distribution. Variance 5, 158.

Braun, C., 2004. The prediction error of the chain ladder method applied to correlated run-off triangles. ASTIN Bulletin 34, 399–424.

Congdon, P.D., 2010. Applied Bayesian hierarchical methods. Chapman & Hall, Boca Raton.

Côté, M.P., Genest, C., Abdallah, A., 2016. Rank-based methods for modeling dependence between loss triangles. European Actuarial Journal 6, 377–408.

De Alba, E., 2006. Claims reserving when there are negative values in the runoff triangle: Bayesian analysis using the three-parameter log-normal distribution. North American Actuarial Journal 10, 45–59.

De Jong, P., 2006. Forecasting runoff triangles. North American Actuarial Journal 10, 28–38.

De Jong, P., 2012. Modeling dependence between loss triangles. North American Actuarial Journal 16, 74–86.

England, P.D., Verrall, R.J., 2002. Stochastic claims reserving in general insurance. British Actuarial Journal 8, 443–518.

England, P.D., Verrall, R.J., Wüthrich, M.V., 2012. Bayesian over-dispersed Poisson model and the Bornhuetter–Ferguson claims reserving method. Annals of Actuarial Science 6, 258–283.

Furman, E., Landsman, Z., 2010. Multivariate Tweedie distributions and some related capital-at-risk analyses. Insurance: Mathematics and Economics 46, 351–361.

Gismondi, F., Janssen, J., Manca, R., 2012. The construction of the claims reserve distribution by means of a semi-Markov backward simulation model. Annals of Actuarial Science 6, 23–64.

Haario, H., Saksman, E., Tamminen, J., 2001. An adaptive Metropolis algorithm. Bernoulli, 223–242.

Heberle, J., Thomas, A., 2016. The fuzzy Bornhuetter–Ferguson method: An approach with fuzzy numbers. Annals of Actuarial Science 10, 303–321.

Hess, K.T., Schmidt, K.D., Zocher, M., 2006. Multivariate loss prediction in the multivariate additive model. Insurance: Mathematics and Economics 39, 185–191.

International Actuarial Association, 2004. A global framework for insurer solvency assessment. Website. URL: . last accessed: 9/12/2018.

Joe, H., 1997. Multivariate models and dependence concepts. Chapman & Hall, New York.

Jørgensen, B., 1997. The theory of dispersion models. Chapman & Hall, London.

Kunkler, M., 2006. Modelling negatives in stochastic reserving models. Insurance: Mathematics and Economics 38, 540–555.

Merz, M., Wüthrich, M.V., 2007. Prediction error of the chain ladder reserving method applied to correlated run-off triangles. Annals of Actuarial Science 2, 25–50.

Merz, M., Wüthrich, M.V., 2009. Prediction error of the multivariate additive loss reserving method for dependent lines of business. Variance 3, 131–151.

Meyers, G.G., 2007. The common shock model for correlated insurance losses. Variance 1, 40–52.

Peters, G.W., Shevchenko, P., Wüthrich, M.V., 2009. Model uncertainty in claims reserving within Tweedie’s compound Poisson models. ASTIN Bulletin 39, 1–33.

Pinheiro, P.J.R., Andrade e Silva, J.M., de Lourdes Centeno, M., 2003. Bootstrap methodology in claim reserving. Journal of Risk and Insurance 70, 701–714.

Pröhl, C., Schmidt, K.D., 2005. Multivariate chain-ladder. Techn. Univ., Inst. für Mathematische Stochastik.

Renshaw, A.E., Verrall, R.J., 1998. A stochastic model underlying the chain-ladder technique. British Actuarial Journal 4, 903–923.

Saluz, A., Gisler, A., 2014. Best estimate reserves and the claims development results in consecutive calendar years. Annals of Actuarial Science 8, 351–373.

Schmidt, K.D., 2006. Optimal and additive loss reserving for dependent lines of business. Casualty Actuarial Society Forum (Fall), 319–351.

Shi, P., 2014. A copula regression for modeling multivariate loss triangles and quantifying reserving variability. ASTIN Bulletin 44, 85–102.

Shi, P., Basu, S., Meyers, G.G., 2012. A Bayesian log-normal model for multivariate loss reserving. North American Actuarial Journal 16, 29–51.

Shi, P., Frees, E.W., 2011. Dependent loss reserving using copulas. ASTIN Bulletin 41, 449–486.

Taylor, G., 2009. The chain ladder and Tweedie distributed claims data. Variance 3, 96–104.

Taylor, G., 2015. Bayesian chain ladder models. ASTIN Bulletin 45, 75–99.

Taylor, G.C., 2000. Loss reserving: An actuarial perspective. Kluwer Academic Publishers, Boston.

Verrall, R., Hössjer, O., Björkwall, S., 2012. Modelling claims run-off with reversible jump Markov chain Monte Carlo methods. ASTIN Bulletin 42, 35–58.

Vihola, M., 2012. Robust adaptive metropolis algorithm with coerced acceptance rate. Statistics and Computing 22, 997–1008.

Wüthrich, M.V., 2003. Claim reserving using Tweedie’s compound poisson model. ASTIN Bulletin 33, 331–346.

Wüthrich, M.V., 2018. Machine learning in individual claims reserving. Scandinavian Actuarial Journal 2018, 465–480.

Wüthrich, M.V., Merz, M., 2008. Stochastic claims reserving methods in insurance. John Wiley & Sons, Chichester.

Zhang, Y., 2010. A general multivariate chain ladder model. Insurance: Mathematics and Economics 46, 588–599.

Zhang, Y., Dukic, V., 2013. Predicting multivariate insurance loss payments under the Bayesian copula framework. The Journal of Risk and Insurance 80, 891–919.

Zhang, Y., Dukic, V., Guszcza, J., 2012. A Bayesian non-linear model for forecasting insurance loss payments. Journal of Royal Statistical Society 175, 637–656.

Zhao, X.B., Zhou, X., Wang, J.L., 2009. Semiparametric model for prediction of individual claim loss reserving. Insurance: Mathematics and Economics 45, 1–8.

Appendix A. Simulated data set 1

Table A.1: Simulated triangle 1 (data set 1). Rows: accident year; columns: development years 1 to 10.

Table A.2: Simulated triangle 2 (data set 1). Rows: accident year; columns: development years 1 to 10.

Table A.: Posterior statistics of parameters (data set ). Columns: true value, median, SD, % CI; rows: the parameters $\eta$, $\nu$, $\gamma$, $p$, $\xi$, $\delta$, $c$ and $\beta$.

Figure A.1: MCMC sample paths of some parameters (panels: $\eta$, $\nu$, $\gamma$, $\delta$, $\xi$, $p$)

Appendix B. Simulated data set 2

Table B.1: Simulated triangle 1 (data set 2). Rows: accident year; columns: development years 1 to 10.

Table B.2: Simulated triangle 2 (data set 2). Rows: accident year; columns: development years 1 to 10.

Appendix C. Real data set

This data set is drawn from Côté, Genest and Abdallah (2016).

Table C.1: Bodily Injury line (cumulative claims). Rows: accident year; columns: premium, development years 1 to 10.

Table C.2: Accident Benefits (cumulative claims). Rows: accident year; columns: premium, development years 1 to 10.

Figure C.2: (Colour online) Heat maps of ratios of observed values to GLM fitted values (top: Bodily Injury line, bottom: Accident Benefits)

Table C.3: Posterior statistics of parameters from marginal estimation. Parameters listed: $\eta^{(1)}_j, \eta^{(2)}_j$ for $j = 2, \dots, 10$; $\nu^{(1)}_j, \nu^{(2)}_j$ for $j = 1, \dots, 10$; $\gamma^{(1)}, \gamma^{(2)}$; $\delta$; $p$.