A multivariate evolutionary generalised linear model framework with adaptive estimation for claims reserving
Benjamin Avanzi, Gregory Clive Taylor, Phuong Anh Vu, Bernard Wong
a Centre for Actuarial Studies, Department of Economics, University of Melbourne VIC 3010, Australia
b School of Risk and Actuarial Studies, UNSW Australia Business School, UNSW Sydney NSW 2052, Australia
c Département de Mathématiques et de Statistique, Université de Montréal, Montréal QC H3T 1J4, Canada
Abstract
In this paper, we develop a multivariate evolutionary generalised linear model (GLM) framework for claims reserving, which allows for dynamic features of claims activity in conjunction with dependency across business lines to accurately assess claims reserves. We extend the traditional GLM reserving framework on two fronts: GLM fixed factors are allowed to evolve in a recursive manner, and dependence is incorporated in the specification of these factors using a common shock approach.

We consider factors that evolve across accident years in conjunction with factors that evolve across calendar years. This two-dimensional evolution of factors is unconventional, as a traditional evolutionary model typically considers the evolution in one single time dimension. This creates challenges for the estimation process, which we tackle in this paper. We develop the formulation of a particle filtering algorithm with a parameter learning procedure. This is an adaptive estimation approach which updates evolving factors of the framework recursively over time.

We implement and illustrate our model with a simulated data set, as well as a set of real data from a Canadian insurer.
Keywords:
Claims Reserving, Evolutionary GLM, Adaptive Reserving, Particle Learning, Common Shock Models
JEL code: G22
MSC classes: 91G70, 91G60, 62P05, 62H12
1. Introduction
A general insurer (also known as a non-life insurer, or property and casualty insurer) usually experiences delays between the occurrences of insured events, their reporting, and the actual payments of claims. Reserves for these outstanding claims are typically one of the largest liabilities on the balance sheet of the insurer, see for example, Alai and Wüthrich (2009); Saluz and Gisler (2014); Abdallah, Boucher and Cossette (2015). Therefore, they can have significant impacts on the emerging profit as well as the solvency of the insurer.

To accurately assess outstanding claims reserves, we argue in this paper that it is advantageous to incorporate both evolutionary and dependence modelling in the process, and we develop a new methodology for doing so. The term "evolutionary" refers to the use of flexible modelling structures that adapt to changes in the claims experience. In the subsections below, we discuss the motivations for our developments in detail and further describe the contributions of our paper.

∗ Corresponding author.
Email addresses: [email protected] (Benjamin Avanzi), [email protected] (Greg Taylor), [email protected] (Phuong Anh Vu), [email protected] (Bernard Wong)

1.1. Evolutionary modelling in claims reserving

The valuation of claims reserves is a predictive modelling activity where past information is used to predict outstanding claims for the future. It is, however, not at all unusual to observe changes in claims experience over time (Ghezzi, 2001; Renshaw, 1989; Gluck and Venter, 2009). These changes can be due to various reasons, such as legislative reforms, or changes to the internal operations of insurers. For example, the recent reform for Auto Bodily Injury covers in New South Wales (Australia) has resulted in faster claims resolution (State Insurance Regulatory Authority, 2018), and has thus reduced payment delays. Another example is the development patterns from the Commercial Auto line of an American insurer whose data is used for illustration in Shi and Frees (2011). Figure 1.1 provides plots of incremental loss ratios in four accident years 1990, 1991, 1994 and 1995 of this insurer. Incremental loss ratios are calculated as incremental claims divided by the total premium earned in the corresponding accident year. Changes in claims activity across accident years are evident in this figure, with variations in the development patterns across accident years.
Figure 1.1: Incremental loss ratios in the Commercial Auto line of an American insurer (Schedule P)
When changes in development patterns have taken place, it is no longer straightforward to project future trends (Zehnwirth, 1994; Ghezzi, 2001; Sims, 2012). Traditional reserving methods which assume similar claims activity across accident years, such as the traditional chain ladder algorithm and the traditional generalised linear model (GLM) framework, are no longer appropriate. In some cases, actuaries have to make many judgements to remove or reduce the effects of these changes before applying traditional reserving methods. An example of this procedure is the Berquist-Sherman technique by Berquist and Sherman (1977). These judgements are often difficult and time consuming to make, as well as to justify to management and peer reviewers (Sims, 2012), because of their inherently subjective nature. In other cases, when the model selected using earlier data no longer fits more recent data, there may be a need to revise its algebraic structure. This will result in a fundamental discontinuity in the sequence of estimates, such as the central estimates of outstanding claims (Taylor, McGuire and Greenfield, 2003).

A very elegant and plausible solution to this issue is to accommodate changes directly in the model structures by allowing model factors to evolve (De Jong and Zehnwirth, 1983; Zehnwirth, 1994; Taylor, McGuire and Greenfield, 2003; Gluck and Venter, 2009). These models have appeared in the literature under many names, including state space models (Alpuim and Ribeiro, 2003; Chukhrova and Johannssen, 2017), adaptive models (Taylor and McGuire, 2009), robotic models (McGuire, 2007), and what we call them in this paper, evolutionary models (Sims, 2012). These models not only can provide a better fit to the data, but are also more parsimonious.
Trends in the data can be captured using a simple but explicit structure, enhancing model interpretation and reducing the need for unrecognised parameters (Zehnwirth, 1994).

Evolutionary models belong to a larger group known as mixed effect models, where model factors are randomised, see for example, Guszcza (2008); Zhang et al. (2012); Shi et al. (2012); Gigante et al. (2013, 2019). The key feature of evolutionary models that distinguishes them from other mixed effect models is the evolution of the majority of random effects, especially development year effects, in a recursive manner. This evolution is typically specified using a time series structure. While some existing mixed effect models such as Zhang et al. (2012); Shi et al. (2012) use time series processes to capture the dynamics in one model aspect such as the residuals or the calendar year effect, other factors are typically static. These fixed effects include development year factors, with the underlying assumption of one single development pattern across all accident years. On the other hand, most factors in evolutionary models, especially development year and accident year effects, are allowed to evolve over time. Overall, the development of observations is explained by a series of associated random factors. These models are also commonly known as state space models, a very powerful model class in time series analysis (Harvey, 1990).

Random factors in evolutionary models can be estimated in a traditional manner which utilises the entire history of data at once, using methods such as maximum likelihood estimation or Bayesian estimation. It is worth noting that traditional Bayesian inference has been particularly favoured for mixed effect models such as Zhang et al. (2012); Shi et al. (2012). However, as mentioned previously, most factors in evolutionary models are dynamic, and the development of observations over time is explained by the associated random factors at that point in time.
To optimise the benefits from this nature of evolutionary models, a special calibration method is typically used for these models, known as adaptive estimation, "Bayesian filtering", or recursive Bayesian estimation (Durbin and Koopman, 2012). This is a special type of Bayesian inference. While traditional Bayesian inference uses the complete history of data for calibration of all factors at once, adaptive estimation recursively updates the estimates of random factors using new observations at each time point. Many benefits stem from this estimation technique. In this recursive Bayesian structure, more weight is given to recent data, which improves the responsiveness of the models to changes (Zehnwirth, 1994; Taylor, 2000; Alpuim and Ribeiro, 2003). By estimating underlying factors recursively over time rather than all at once, changes are recognised gradually. As a result, a clear picture of changes in the historical experience can be obtained, and estimates derived from these models are also smooth over time (Taylor, McGuire and Greenfield, 2003; Sims, 2012). Parameters are not estimated separately using scarce information for immature accident years but are projected recursively using the previous ones (Gluck and Venter, 2009). This also reduces the reliance on arbitrary modelling judgements, which can be particularly beneficial when dealing with a small dataset.

Evolutionary models have appeared in the loss reserving literature since as early as the 1980s. These models can be categorised into two groups on the grounds of the distributional assumption used. The first group consists of Gaussian models, which rely on the Gaussian assumption for observations and model factors. This is also the dominant group of evolutionary models in the literature. Examples of models in this group include De Jong and Zehnwirth (1983); Verrall (1989, 1994); Ntzoufras and Dellaportas (2002); Atherino, Pizzinga and Fernandes (2010); De Jong (2006).
The popularity of the evolutionary Gaussian approach comes from its tractability in the estimation process. The recursive estimates of model factors are available in closed form through the use of a very popular procedure called the Kalman filter. The estimates of factors obtained from this filter are the best linear estimators in terms of the mean square error. Detailed descriptions of the Kalman filter as well as its various applications can be found in the two comprehensive review books on evolutionary models (state space models) by Harvey (1990) and Durbin and Koopman (2012).

The second group of evolutionary models is the group of non-Gaussian models. This is a remarkably smaller group. When the distributional assumptions deviate from the Gaussian assumption, the Kalman filter no longer provides the best linear estimates of factors in terms of mean square error. Taylor and McGuire (2009) developed a univariate evolutionary GLM framework with Poisson and gamma errors. An estimation procedure called second-order Bayesian revision, developed in Taylor (2008), is used to provide an approximate solution to this framework through the use of second-order Taylor expansions. Sims (2011) then provided a simulation-based solution, commonly known as a particle filter, to the framework in Taylor and McGuire (2009). Dong and Chan (2013) developed an evolutionary framework using the generalised-beta II family, and used conventional Bayesian inference for model estimation.
An insurer typically operates in multiple business segments, the number of which can be up to 100 in some cases (Avanzi, Taylor and Wong, 2016b). These segments can be business lines or subsets of these. Dependency across different business segments is an important characteristic of claims for a typical general insurer. As defined in Avanzi, Taylor and Wong (2016b) using non-technical terms, it typically is the situation where the (departures from known trends of the) experience of one segment varies in sympathy with that of other segments. This dependency can arise from many causes. At the very least, some segments share elements in their reporting procedure (Shi and Frees, 2011; De Jong, 2012), and hence any changes in the operational system can affect these segments simultaneously. Similarly, legislative changes can also have impacts on some segments, such as the same business line in different geographical locations. There can also be claim-causing events such as hailstorms that give rise to claims in multiple business lines (for example, motor lines and property lines).

While business segments are likely dependent to some extent, it is quite rare to observe cases of co-monotonicity (Kirschner, Kerley and Isaacs, 2002). Due to the lack of a perfectly positive dependence structure across segments, the volatility of claims at the aggregate portfolio level is reduced compared to the aggregation of volatility at the individual segment level. This reduction is often known as a "diversification benefit" (Shi and Frees, 2011; De Jong, 2012; Côté, Genest and Abdallah, 2016; Avanzi, Taylor, Vu and Wong, 2016a). Ignoring (or underestimating) this effect can result in an over-estimation of risk margin and risk-based capital. Even if a certain degree of prudence is recommended, insurers should have reserves that are as correct as possible, not as large as possible (Ajne, 1994).
This is to ensure that capital is used parsimoniously while meeting solvency expectations (Avanzi, Taylor and Wong, 2016b). Indeed, many insurance regulatory frameworks enable insurers to enjoy their diversification benefits in assessing the risk margins for their outstanding claims liabilities, as well as risk capital for their consolidated operations (Avanzi, Taylor and Wong, 2016b).

In many cases, the dependence across business segments arises from some calendar year factors that affect claims in the same calendar year within and across segments simultaneously (Shi, Basu and Meyers, 2012; De Jong, 2012; Wüthrich, 2010). For example, a legislative change in a particular calendar year can speed up the claims settlement processes in all business segments. A subset of the existing literature on multivariate reserving focuses on modelling calendar year dependence, see for example, Wüthrich (2010); De Jong (2012); Shi, Basu and Meyers (2012); Abdallah, Boucher and Cossette (2015).

Besides copulas, which are popular dependence modelling tools in the reserving literature (Shi and Frees, 2011; De Jong, 2012; Zhang and Dukic, 2013; Abdallah, Boucher and Cossette, 2015; Côté, Genest and Abdallah, 2016), common shock approaches have also gained popularity recently (De Jong, 2006; Shi, Basu and Meyers, 2012; Avanzi, Taylor, Vu and Wong, 2016a; Avanzi, Taylor and Wong, 2018). The main advantage of these approaches stems from their ability to capture structural dependence arising from known relationships. In addition, they also allow some interpretability of the dependence structure, as well as a parsimonious construction of correlation matrices of large dimensions (Avanzi, Taylor and Wong, 2018).
In this paper, we allow for dynamic features of claims activity in conjunction with dependency across business lines to accurately assess claims reserves. To the best of our knowledge, this is the first paper that combines both elements and applies the methodology to real insurance data.

Specifically, we develop and illustrate a multivariate evolutionary GLM framework for claims reserving. It aims to extend the traditional GLM reserving framework on two fronts: GLM fixed factors are allowed to evolve in a recursive manner, and dependence is incorporated in the specification of these factors using a common shock approach. This is to utilise the flexibility of the GLM structure while being able to enjoy various benefits of common shock approaches. We also formulate adaptive approaches that provide recursive real-time estimates of random factors in the evolutionary GLM framework. The "evolutionarisation" of the GLM fixed effects is a natural step. In the loss reserving literature, the exponential dispersion family (EDF) and its sub-class, the Tweedie family of distributions, have been used frequently in univariate as well as multivariate models. A comprehensive review of these models can be found in Taylor and McGuire (2016). Models using the EDF are typically specified using the GLM framework, which allows explanatory fixed factors to be incorporated in a flexible manner.

We consider factors that evolve across accident years in conjunction with factors that evolve across calendar years. This two-dimensional evolution of factors is unconventional, as a traditional evolutionary model typically considers the evolution in one single time dimension. This creates challenges for the estimation process, which we tackle in this paper; see Section 3.1.

The paper is organised as follows. Section 2 introduces our model, a multivariate evolutionary GLM framework. An adaptive estimation procedure for this framework is developed in Section 3. A simulation illustration is provided in Section 4.
Section 5 provides an illustration using real data. We discuss those results, as well as some practical considerations, in Section 6. Section 7 concludes the paper.
2. A multivariate evolutionary GLM framework
In this section we develop a multivariate evolutionary GLM framework. Framework specifications are given in Section 2.1. In the existing literature of evolutionary modelling, models are often presented in matrix form, see for example, the review books by Harvey (1990); Durbin and Koopman (2012). This not only helps present the models more elegantly but is also useful in the presentation of model calibration. Therefore, we also introduce a matrix representation of our model in Section 2.2.
2.1. Framework specifications

The framework being introduced applies to loss triangles, the standard representation of loss reserving data (Taylor, 2000; Wüthrich and Merz, 2008). Consider a portfolio of N lines (or segments) of business, each with a corresponding loss triangle. Each triangle contains a set of incremental claims Y^{(n)}_{i,j} indexed by integers i (i = 1, ..., I), j (j = 1, ..., J) and n (n = 1, ..., N), which represent the accident year, development year and line of business, respectively. For simplicity it is assumed that I = J. However, this assumption can be relaxed to give the general case of loss trapeziums. It also follows that the claim represented by Y^{(n)}_{i,j} belongs to the calendar year t = i + j − 1 (t = 1, ..., T). Note that irregular data shapes as well as missing values can be accommodated in a robust manner in this framework.

The multivariate evolutionary GLM framework consists of two components: an observation component and a state component. The observation component specifies the relation between observations and random factors. These random factors are not observed and are often referred to as states in the literature of evolutionary models (Durbin and Koopman, 2012). The state component specifies the recursive evolution of random factors. These two components are described in more detail in the subsections below.

2.1.1. Observation component

Each incremental claim Y^{(n)}_{i,j} is specified in the observation component of the framework and may or may not be observed at the valuation date. It is specified in the same manner as the traditional GLM framework, with mean and variance such that

E[Y^{(n)}_{i,j}] = μ^{(n)}_{i,j}, (2.1)
Var[Y^{(n)}_{i,j}] = φ^{(n)} · V(μ^{(n)}_{i,j}), (2.2)

where μ^{(n)}_{i,j} is the mean parameter, φ^{(n)} is the dispersion parameter, and V(·) is an appropriate variance function.
When the distribution choice is restricted to a popular sub-family of the EDF, known as the Tweedie sub-family, the following variance function is specified with a power parameter p:

V(μ^{(n)}_{i,j}) = (μ^{(n)}_{i,j})^p. (2.3)

A specific value of p maps to a distribution from the Tweedie sub-family. Therefore, the Tweedie sub-family can be useful for modelling in the reserving context as it allows model uncertainty to be considered naturally through the estimation of the power parameter p (Alai and Wüthrich, 2009).

The mean structure (also known as the systematic component) relates the observations to a set of explanatory factors. In this framework, a modified Hoerl curve (a discrete version of a gamma curve) which allows for a calendar year effect is used. Hoerl curves are used frequently in reserving, especially evolutionary reserving, see for example, De Jong and Zehnwirth (1983); England and Verrall (2001); Taylor and McGuire (2009). A Hoerl curve is specified using a function of j and log(j) which aims to approximate and provide smoothing for a claims development pattern. Various benefits of a Hoerl curve include parsimonious modelling, robustness against fluctuations in observations, and extrapolation beyond the range of the observed development years. They are considered desirable in the evolutionary reserving context as they allow systematic changes in the claims experience over time.

Using a log-link on a Hoerl curve, the mean structure is specified such that

log(μ^{(n)}_{i,j}) = a^{(n)}_i + r^{(n)}_i · log(j) + s^{(n)}_i · j + h^{(n)}_t, (2.4)

where a^{(n)}_i is the accident year factor, r^{(n)}_i and s^{(n)}_i are factors of the Hoerl curve that specify the development pattern for accident year i, and h^{(n)}_t is the calendar year factor.
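To make the observation component concrete, the following sketch evaluates the Hoerl-curve mean structure (2.4) and the Tweedie mean-variance relationship (2.2)-(2.3) for a single line and accident year. All numerical values (the factors, dispersion and power parameter) are illustrative assumptions, not estimates from the paper.

```python
import numpy as np

def hoerl_log_mean(a_i, r_i, s_i, h_t, j):
    # log mu_{i,j} = a_i + r_i * log(j) + s_i * j + h_t   (Equation 2.4)
    return a_i + r_i * np.log(j) + s_i * j + h_t

def tweedie_variance(mu, phi, p):
    # Var[Y] = phi * mu^p   (Equations 2.2-2.3); 1 < p < 2 gives
    # the compound Poisson-gamma member of the Tweedie sub-family
    return phi * mu ** p

j = np.arange(1, 11)                                   # development years 1..10
log_mu = hoerl_log_mean(a_i=8.0, r_i=1.2, s_i=-0.5, h_t=0.1, j=j)
mu = np.exp(log_mu)                                    # log-link inverted
var = tweedie_variance(mu, phi=1.5, p=1.7)
print(np.round(mu[:3]), np.round(var[:3]))
```

With these assumed factors the fitted pattern rises over the first couple of development years and then decays, the typical Hoerl-curve shape.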
Note that these "factors" would often be called "parameters" in static models, but we choose to call them factors to avoid confusion with model parameters, which do not evolve; see, e.g., Figure 2.3. It is worth emphasising that factors a^{(n)}_i, r^{(n)}_i, s^{(n)}_i are accident-year-specific, whereas factor h^{(n)}_t is calendar-year-specific, as noted by their subscripts. The above mean structure as well as the link function can be modified on a case-by-case basis.

A special case of the multivariate evolutionary GLM framework is a multivariate Gaussian model with Gaussian assumptions on observations (to which a log-transformation can first be applied if observations are log-normal). This model extends the existing Gaussian evolutionary models in the literature (De Jong and Zehnwirth, 1983; Verrall, 1989; Ntzoufras and Dellaportas, 2002; Atherino et al., 2010) by allowing for calendar year dependence across business lines. The observation component in this case is specified such that

Y^{(n)}_{i,j} = a^{(n)}_i + r^{(n)}_i · log(j) + s^{(n)}_i · j + h^{(n)}_t + ς^{(n)}_{i,j}, ς^{(n)}_{i,j} ∼ Normal(0, σ²_{ς^{(n)}}). (2.5)

2.1.2. State component

The observation component specified in the previous section is a standard GLM structure in reserving, see the review by Taylor and McGuire (2016). In a standard structure, the same fixed (static) parameters are used to capture the development year effect (i.e. r^{(n)}_i = r^{(n)}, s^{(n)}_i = s^{(n)} in a Hoerl curve specification) across all accident years. This essentially implies that one single average development pattern is assumed for all accident years. The key features that differentiate the multivariate evolutionary GLM reserving framework from the traditional GLM reserving framework lie in the specification of the factors a^{(n)}_i, r^{(n)}_i, s^{(n)}_i and h^{(n)}_t of the mean structure. They are not unknown deterministic factors but are random and evolve over time. As a result, each accident year can have its own development pattern.
Their evolution can be specified using time series processes such as autoregressive moving average (ARMA) processes. For simplicity, we use random walk processes for the evolution of states.

The evolution of a^{(n)}_i, r^{(n)}_i, s^{(n)}_i is specified such that

a^{(n)}_i = a^{(n)}_{i−1} + aε^{(n)}_i, aε^{(n)}_i ∼ Normal(0, σ²_{aε^{(n)}}), (2.6)
r^{(n)}_i = r^{(n)}_{i−1} + rε^{(n)}_i, rε^{(n)}_i ∼ Normal(0, σ²_{rε^{(n)}}), (2.7)
s^{(n)}_i = s^{(n)}_{i−1} + sε^{(n)}_i, sε^{(n)}_i ∼ Normal(0, σ²_{sε^{(n)}}), (2.8)
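The random walks (2.6)-(2.8) can be simulated directly. The sketch below generates one path of the accident-year factors for a single line; the initial factors and disturbance variances are illustrative assumptions.

```python
import numpy as np

# Simulate the random-walk evolution (2.6)-(2.8) for one line of business.
rng = np.random.default_rng(42)
I = 10                                      # number of accident years
sigma = {"a": 0.05, "r": 0.02, "s": 0.01}   # assumed disturbance std devs

a = np.empty(I); r = np.empty(I); s = np.empty(I)
a[0], r[0], s[0] = 8.0, 1.2, -0.5           # assumed initial factors
for i in range(1, I):
    a[i] = a[i-1] + rng.normal(0, sigma["a"])   # a_i = a_{i-1} + disturbance
    r[i] = r[i-1] + rng.normal(0, sigma["r"])
    s[i] = s[i-1] + rng.normal(0, sigma["s"])

# each accident year i now carries its own development pattern (a_i, r_i, s_i)
print(np.round(a, 3))
```

Because each accident year inherits the previous year's factors plus a small disturbance, development patterns change gradually rather than being identical across accident years (the traditional GLM) or entirely unrelated.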
Figure 2.2: Evolution of random factors within a triangle (solid arrows represent evolution, dashed arrows represent one-to-one mapping, AY stands for accident year and DY stands for development year)

where σ²_{aε^{(n)}}, σ²_{rε^{(n)}}, σ²_{sε^{(n)}} are the variances of the disturbance terms aε^{(n)}_i, rε^{(n)}_i, sε^{(n)}_i, respectively, in the evolution. These variance terms are model parameters and need to be estimated.

The evolution of the calendar year factor h^{(n)}_t is specified using a modified random walk to incorporate calendar year dependence through a common shock approach:

h^{(n)}_t = h^{(n)}_{t−1} + hε^{(n)}_t + λ^{(n)} · h̃ε_t, hε^{(n)}_t ∼ Normal(0, σ²_{hε^{(n)}}), h̃ε_t ∼ Normal(0, σ²_{h̃ε}). (2.9)

There are two sources of disturbance in this evolution: the line-specific disturbance hε^{(n)}_t and the common shock disturbance h̃ε_t. The variances of these terms, σ²_{hε^{(n)}} and σ²_{h̃ε}, as well as the common shock coefficient λ^{(n)}, are model parameters. The calendar year dependence is induced by the common shock term h̃ε_t. This term can represent any changes in calendar year t that affect all lines simultaneously, such as superimposed inflation. The effects of this common shock on each line, however, are usually not uniform, as some lines can be more heavily affected than others. It is then desirable to use scaling factors λ^{(n)} to adjust the effects of the common shock on individual lines.

Remark 2.1.
In the specification of the framework, random walk processes are used for the evolution of states. This is chosen for the sake of simplicity, following the existing literature (De Jong and Zehnwirth, 1983; Verrall, 1989, 1994; Atherino et al., 2010; Taylor and McGuire, 2009). One also typically adheres to the simplest specification until there is reason to do otherwise. Further, the unpredictable nature of superimposed inflation in practice renders the random walk a reasonable representation. There can be cases where data exhibit an accident year trend and/or a calendar year trend, and in such specific cases a different time series structure such as an ARMA process can be used. However, it is worth noting that this comes at the cost of more extensive parametrisation.
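The dependence induced by the common shock in (2.9) can be checked numerically. The sketch below simulates the calendar factor evolution for two lines and measures the correlation of their increments; the scaling coefficients and variances are illustrative assumptions, and a long horizon is used purely to make the sample correlation stable.

```python
import numpy as np

# Common-shock calendar evolution (2.9) for N = 2 lines, illustrative values.
rng = np.random.default_rng(0)
T, N = 2000, 2
lam = np.array([1.0, 0.6])            # assumed common-shock scalings lambda^(n)
sig_line, sig_common = 0.03, 0.05     # assumed disturbance std devs

h = np.zeros((N, T))
increments = np.zeros((N, T - 1))
for t in range(1, T):
    common = rng.normal(0, sig_common)        # one shared shock per calendar year
    for n in range(N):
        eps = rng.normal(0, sig_line)         # line-specific disturbance
        h[n, t] = h[n, t - 1] + eps + lam[n] * common
        increments[n, t - 1] = h[n, t] - h[n, t - 1]

# the shared shock induces positive correlation between the lines' increments
print(round(np.corrcoef(increments)[0, 1], 2))
```

With these assumed values, the theoretical increment correlation is λ^(1)λ^(2)σ̃² / √((σ² + (λ^(1))²σ̃²)(σ² + (λ^(2))²σ̃²)) ≈ 0.6, and the sample correlation lands close to it; setting all λ^(n) = 0 removes the dependence entirely.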
The evolution of random factors in a single line is summarised in Figure 2.2. As demonstrated in this figure, we consider the evolution in the accident year dimension, meaning that random factors evolve and their estimates are updated as we proceed to the subsequent row of data in the loss triangles. This is to best utilise the available data in the first accident year to initialise the estimation process, which will be discussed in Section 3. It is worth noting that random factors a^{(n)}_i, r^{(n)}_i, s^{(n)}_i are indexed by i and they evolve from one accident year (i.e. row) to the next. Calendar factors h^{(n)}_t, on the other hand, evolve from one calendar year (i.e. diagonal) to the next and are indexed by t. However, when the process initiates at accident year 1, the effect of all calendar factors h^{(n)}_1, h^{(n)}_2, ..., h^{(n)}_I is present, as claims in the first accident year develop in these different calendar years. Note that the relationship between these calendar year factors themselves follows the evolution specified in Equation (2.9). These factors are mapped one to one with the calendar factors in the next row, h^{(n)}_2, ..., h^{(n)}_I, as we proceed to the second accident year, and so on. This is because claims within the same diagonal share the same calendar year effect.

Figure 2.3: Summary of framework structure
Figure 2.3 summarises this section with the structure of the framework. There are three levels: underlying parameters, random factors and observations. Underlying parameters do not evolve with time, and they specify the evolution of random factors as well as the dynamics of the observations. Random factors evolve over time and are explanatory factors of observations. Observations are directly observed from data and are specified by underlying parameters as well as random factors.

As shown in Figure 2.3, there are 6N + 1 parameters and (3I + T) × N random factors. This applies to a data set of N triangles, each of size I × J (a total of I × (J + 1)/2 × N observations). The number of random factors is similar to that of a typical GLM framework which has a Hoerl curve specification with the addition of a calendar year factor, see for example, Zehnwirth (1989); Wright (1990); England and Verrall (2002). Similar models in which cross-classified structures are used instead of a Hoerl curve can be found in, for example, Shi et al. (2012); Abdallah et al. (2015). However, in our framework, these random factors are specified to evolve recursively using previous random factors and some parameters. For immature accident years which have limited data, this structure can provide a better calibration of random factors for these years.

2.2. Matrix representation

In this section, we provide a matrix representation of the framework introduced in the previous section. A matrix representation of evolutionary models is often called a state space representation in the time series literature (Durbin and Koopman, 2012). It is a conventional representation of evolutionary models, and it aims to provide a compact representation of the relationships between observations and random factors, and of the development of random factors over time. Model estimation can then be presented more clearly using this representation.
This is particularly useful in a multi-dimensional setting, which typically applies in the context of this paper, where we have multiple observations at each time node and multiple underlying random factors.

2.2.1. Observation component
In the proposed framework, loss reserving data can be considered as a multivariate time series process, with each accident year being a point in the process at which new observations arrive. The vector of observations at accident year $i$ collects all $N \cdot (I-i+1)$ claims in that accident year, within and across triangles,
\[
\boldsymbol{Y}_i = \big( Y^{(1)}_{i,1}, \ldots, Y^{(1)}_{i,I-i+1}, Y^{(2)}_{i,1}, \ldots, Y^{(N)}_{i,I-i+1} \big)', \tag{2.10}
\]
which has the following mean and dispersion
\[
E[\boldsymbol{Y}_i] = \boldsymbol{\mu}_i = \big( \mu^{(1)}_{i,1}, \ldots, \mu^{(1)}_{i,I-i+1}, \mu^{(2)}_{i,1}, \ldots, \mu^{(N)}_{i,I-i+1} \big)', \qquad
\boldsymbol{\phi} = \big( \underbrace{\phi^{(1)}, \ldots, \phi^{(1)}}_{(I-i+1)\ \text{rows}}, \phi^{(2)}, \ldots, \phi^{(N)} \big)'. \tag{2.11}
\]
The mean structure is specified using a linear predictor with a log link,
\[
\log(\boldsymbol{\mu}_i) = A_i \cdot \boldsymbol{\gamma}_i + E_i \cdot \boldsymbol{\psi}_I, \tag{2.12}
\]
where, with the mean structure specified in Equation (2.4), we have
\[
\boldsymbol{\gamma}^{(n)}_i = \big( a^{(n)}_i, r^{(n)}_i, s^{(n)}_i \big)', \quad
\boldsymbol{\gamma}_i = \big( \boldsymbol{\gamma}^{(1)\prime}_i, \ldots, \boldsymbol{\gamma}^{(N)\prime}_i \big)', \quad
\boldsymbol{\psi}^{(n)}_I = \big( h^{(n)}_1, \ldots, h^{(n)}_I \big)', \quad
\boldsymbol{\psi}_I = \big( \boldsymbol{\psi}^{(1)\prime}_I, \ldots, \boldsymbol{\psi}^{(N)\prime}_I \big)', \tag{2.13}
\]
\[
A^{(n)}_i = \begin{pmatrix} 1 & \log(1) & 1 \\ \vdots & \vdots & \vdots \\ 1 & \log(I-i+1) & I-i+1 \end{pmatrix}, \qquad
A_i = \operatorname{diag}\big( A^{(1)}_i, A^{(2)}_i, \ldots, A^{(N)}_i \big), \tag{2.14}
\]
\[
E^{(n)}_i = \Big( \underbrace{\boldsymbol{0}}_{(i-1)\ \text{cols}} \;\; \underbrace{\boldsymbol{I}_{I-i+1}}_{(I-i+1)\ \text{cols}} \Big)
\quad \big((I-i+1)\ \text{rows}\big), \qquad
E_i = \operatorname{diag}\big( E^{(1)}_i, E^{(2)}_i, \ldots, E^{(N)}_i \big), \tag{2.15}
\]
so that development year $j$ of accident year $i$ loads on the calendar year factor $h^{(n)}_{i+j-1}$. In the special case of a multivariate Gaussian model, the observation equation can be specified in matrix form as
\[
\boldsymbol{Y}_i = A_i \cdot \boldsymbol{\gamma}_i + E_i \cdot \boldsymbol{\psi}_I + \boldsymbol{\varsigma}_i, \qquad \boldsymbol{\varsigma}_i \sim \text{Normal}(\boldsymbol{0}, H_i), \tag{2.16}
\]
where
\[
\boldsymbol{\varsigma}_i = \big( \varsigma^{(1)}_{i,1}, \ldots, \varsigma^{(1)}_{i,I-i+1}, \varsigma^{(2)}_{i,1}, \ldots, \varsigma^{(N)}_{i,I-i+1} \big)', \tag{2.17}
\]
\[
H^{(n)}_i = \sigma^2_{\varsigma(n)}\, \boldsymbol{I}_{I-i+1}, \tag{2.18}
\]
\[
H_i = \operatorname{diag}\big( H^{(1)}_i, H^{(2)}_i, \ldots, H^{(N)}_i \big). \tag{2.19}
\]

The random walk evolution of $\boldsymbol{\gamma}_i$ can be represented in matrix form as
\[
\boldsymbol{\gamma}_i = \boldsymbol{\gamma}_{i-1} + \boldsymbol{\gamma}_{\epsilon,i}, \qquad \boldsymbol{\gamma}_{\epsilon,i} \sim \text{Normal}(\boldsymbol{0}, Q_{\gamma\epsilon}), \tag{2.20}
\]
where
\[
\boldsymbol{\gamma}^{(n)}_{\epsilon,i} = \big( a^{(n)}_{\epsilon,i}, r^{(n)}_{\epsilon,i}, s^{(n)}_{\epsilon,i} \big)', \qquad
\boldsymbol{\gamma}_{\epsilon,i} = \big( \boldsymbol{\gamma}^{(1)\prime}_{\epsilon,i}, \ldots, \boldsymbol{\gamma}^{(N)\prime}_{\epsilon,i} \big)', \tag{2.21}
\]
\[
Q^{(n)}_{\gamma\epsilon} = \operatorname{diag}\big( \sigma^2_{a\epsilon(n)}, \sigma^2_{r\epsilon(n)}, \sigma^2_{s\epsilon(n)} \big), \qquad
Q_{\gamma\epsilon} = \operatorname{diag}\big( Q^{(1)}_{\gamma\epsilon}, Q^{(2)}_{\gamma\epsilon}, \ldots, Q^{(N)}_{\gamma\epsilon} \big). \tag{2.22}
\]
The evolution of the calendar year factors is
\[
\boldsymbol{\psi}_t = R_{t-1} \cdot \boldsymbol{\psi}_{t-1} + S_{t-1} \cdot \boldsymbol{h}_{\epsilon,t}, \qquad \boldsymbol{h}_{\epsilon,t} \sim \text{Normal}(\boldsymbol{0}, Q_{h\epsilon}), \tag{2.23}
\]
where
\[
\boldsymbol{\psi}^{(n)}_t = \big( h^{(n)}_1, \ldots, h^{(n)}_t \big)', \qquad
\boldsymbol{\psi}_t = \big( \boldsymbol{\psi}^{(1)\prime}_t, \ldots, \boldsymbol{\psi}^{(N)\prime}_t \big)', \qquad
\boldsymbol{h}_{\epsilon,t} = \big( h^{(1)}_{\epsilon,t}, \ldots, h^{(N)}_{\epsilon,t}, \tilde{h}_{\epsilon,t} \big)', \tag{2.24}
\]
\[
R^{(n)}_{t-1} = \begin{pmatrix} \boldsymbol{I}_{t-1} \\ 0 \;\cdots\; 0 \;\; 1 \end{pmatrix}
\quad \big(t\ \text{rows},\ t-1\ \text{cols}\big), \qquad
R_{t-1} = \operatorname{diag}\big( R^{(1)}_{t-1}, \ldots, R^{(N)}_{t-1} \big), \tag{2.25}
\]
so that $R^{(n)}_{t-1}$ carries $h^{(n)}_1, \ldots, h^{(n)}_{t-1}$ forward and initialises $h^{(n)}_t$ at $h^{(n)}_{t-1}$. Furthermore,
\[
Q_{h\epsilon} = \operatorname{diag}\big( \sigma^2_{h\epsilon(1)}, \ldots, \sigma^2_{h\epsilon(N)}, \sigma^2_{\tilde{h}\epsilon} \big), \qquad
\Lambda^{(n)} = \big( 0, \ldots, 0, \lambda^{(n)} \big)' \quad (t\ \text{rows}), \tag{2.26}
\]
\[
S^{(n)}_{t-1} = \big( 0, \ldots, 0, 1 \big)' \quad (t\ \text{rows}), \qquad
S_{t-1} = \begin{pmatrix} S^{(1)}_{t-1} & & & \Lambda^{(1)} \\ & \ddots & & \vdots \\ & & S^{(N)}_{t-1} & \Lambda^{(N)} \end{pmatrix}, \tag{2.27}
\]
so that the newest calendar year factor of line $n$ receives its idiosyncratic innovation $h^{(n)}_{\epsilon,t}$ plus $\lambda^{(n)}$ times the common shock $\tilde{h}_{\epsilon,t}$.
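To make the block structure of the observation equation concrete, the following sketch (our illustration, not the authors' code) builds the design matrices $A_i$ and $E_i$ with numpy for a chosen accident year, assuming the Hoerl covariates $[1, \log j, j]$ per development year and the calendar-year selection of (2.14)-(2.15); the function name `design_matrices` is ours.

```python
import numpy as np

def design_matrices(i, I, N):
    """Block-diagonal A_i ((I-i+1)N x 3N) and E_i ((I-i+1)N x IN) of (2.12).

    Assumes the Hoerl covariates [1, log j, j] for development year j, and
    the selection of calendar year factor h_{i+j-1} as in (2.14)-(2.15).
    """
    d = I - i + 1                               # observations per triangle
    j = np.arange(1, d + 1, dtype=float)        # development years 1..I-i+1
    A_n = np.column_stack([np.ones(d), np.log(j), j])
    E_n = np.zeros((d, I))
    E_n[:, i - 1:i - 1 + d] = np.eye(d)         # row j picks out h_{i+j-1}
    # identical block structure for all N triangles => Kronecker products
    return np.kron(np.eye(N), A_n), np.kron(np.eye(N), E_n)

A3, E3 = design_matrices(i=3, I=15, N=2)        # accident year 3, two triangles
```

For accident year 3 of a 15 × 15 bivariate data set, each triangle contributes 13 observations, so $A_3$ is 26 × 6 and $E_3$ is 26 × 30, with the first observation of accident year 3 loading on $h_3$.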
3. Adaptive estimation
This section covers the estimation of the random factors and underlying parameters in the multivariate evolutionary GLM framework. The state space matrix representation of the framework is used in the development of the estimation approach. The random factors include the non-calendar year factors $\boldsymbol{\gamma}_i$ ($i = 1, \ldots, I$) and the calendar year factors $\boldsymbol{\psi}_I$. The unknown parameters are denoted
\[
\boldsymbol{\Theta} = \big\{ \sigma^2_{a\epsilon(n)}, \sigma^2_{r\epsilon(n)}, \sigma^2_{s\epsilon(n)}, \sigma^2_{h\epsilon(n)}, \sigma^2_{\tilde{h}\epsilon}, \lambda^{(n)}, \phi^{(n)};\ n = 1, \ldots, N \big\}, \tag{3.1}
\]
where $\phi^{(1)}, \ldots, \phi^{(N)}$ are replaced by $\sigma^2_{\varsigma(1)}, \ldots, \sigma^2_{\varsigma(N)}$ when a Gaussian model is considered.

As mentioned in Section 1.1, adaptive estimation is typically used for evolutionary models to estimate random factors recursively. This is in contrast with conventional estimation approaches, which estimate model factors using all available information in the loss triangles at once. Many benefits arise from this estimation approach (Durbin and Koopman, 2012). The recursive Bayesian structure gives more weight to recent data, which improves the responsiveness of model predictions to recent experience. In addition, changes are recognised gradually over time, giving a clearer picture of changes in the historical experience. The recursive estimation, together with the framework structure in which random factors are specified recursively using previous factors and parameters, also improves the calibration of random factors when data are scarce. However, it is worth noting that, in order to obtain a good calibration of the model parameters, which feeds into the estimation of the random factors, it is important to have sufficient data in the early periods of the estimation process. Hence, our calibration uses the accident year dimension as the main time dimension, to utilise the greater availability of data in the early accident years.

Motivated by the above advantages, we adopt an adaptive estimation approach for our framework.
A simulation-based approach known as a particle filter with parameter learning is presented in Section 3.1. The adaptive estimates of random factors in the special case of Gaussian models are available in closed form and are given in Section 3.2.

In the adaptive estimation of evolutionary models, random factors are recursively estimated over time. Parameters, however, can be estimated in two ways: using off-line estimation methods or using on-line estimation methods (Kantas, Doucet, Singh and Maciejowski, 2009). Off-line estimation methods refer to the estimation of parameters using all observations at the end of the time series process. On-line estimation methods, on the other hand, integrate the estimation of parameters into the adaptive estimation of random factors. Hence parameters are estimated recursively, in the same manner as random factors, upon the arrival of new observations at each time point.

The adaptive estimation approach chosen for the multivariate evolutionary GLM framework is an on-line estimation method known as a particle filter with parameter learning. It is based on the approach developed in Liu and West (2001). This is a very popular method that has been used in various fields, including traffic tracking (Polson and Sokolov, 2015), medicine (Lamien, Rangel Barreto Orlande, Antonio Bermeo Varón, Leite Queiroga Basto, Enrique Eliçabe, Silva dos Santos and Machado Cotta, 2018), and finance (Rios and Lopes, 2013). Because it is an on-line estimation approach, parameters are updated recursively in the same manner as random factors, even though they are assumed to be fixed effects. Artificial dynamics are therefore added to their specifications in a specific way. Because parameters are estimated in a Bayesian manner, their errors can also be assessed.
The prior distributions of the parameters at the initialisation can also be chosen to reflect the level of knowledge of these parameters.

In this paper, we assume a time series process with accident years being the time dimension. However, as explained in Section 1.3, in the multivariate evolutionary GLM framework, while the random factors $\boldsymbol{\gamma}_i$ evolve by accident year, the factors $\boldsymbol{\psi}_I$ (the vector of random calendar year factors) evolve by diagonals (see Figure 2.2). To address this challenge, we treat calendar year factors in the same way as parameters in the adaptive estimation process. This is motivated by the fact that the effects of all calendar year factors are already present in the first accident year, when the estimation process is initialised. These factors are not evolutionary across accident years; as we proceed to the next accident year, we have more information about (some of) them, but their true values are not supposed to change. Hence they can be treated as un-evolving factors beyond the first accident year, and they have the same nature as parameters in the framework.

The notation used in the estimation process follows the state space matrix representation of the framework in Section 2.2. The key equations are the observation equation (2.12), and the random factor equations (2.20) and (2.23). The estimation process is as follows.

Particle filter with parameter learning algorithm

Step 1. Initialise.
At $i = 1$, for $m = 1, \ldots, M$, draw a sample (also called a particle) of parameters $\boldsymbol{\Theta}^{\{m;1\}}$ from their prior densities. The superscript 1 is the filter time (accident year) index and the superscript $m$ is the sample index.
Draw calendar factors $\boldsymbol{\psi}_I^{\{m;1\}}$ using their specification in Equation (2.9) and $\boldsymbol{\Theta}^{\{m;1\}}$. The vector $\boldsymbol{\psi}_I^{\{m;1\}}$ represents the $m$-th sample vector (particle) of all calendar factors (i.e. from calendar time 1 to calendar time $I$) at accident time 1.
Draw initial samples of the other factors at time 1,
\[
\boldsymbol{\gamma}_1^{\{m\}} \sim f_{\boldsymbol{\gamma}_1}\big( \boldsymbol{\gamma}_1;\ \boldsymbol{\Theta}^{\{m;1\}} \big). \tag{3.2}
\]
Calculate the importance weights at time 1,
\[
\omega_1^{\{m\}} = f_{\boldsymbol{Y}_1 | \boldsymbol{\gamma}_1; \boldsymbol{\psi}_I, \boldsymbol{\Theta}}\big( \boldsymbol{y}_1 \,\big|\, \boldsymbol{\gamma}_1^{\{m\}};\ \boldsymbol{\psi}_I^{\{m;1\}}, \boldsymbol{\Theta}^{\{m;1\}} \big). \tag{3.3}
\]
This assigns a weight to each sample of simulated random factors and parameters using the likelihood of the observations $\boldsymbol{Y}_1$ received at this time.

For $i = 2, \ldots, I$ and $m = 1, \ldots, M$:

Step 2. Compute look-ahead (projected) random factors and parameters at time $i$ recursively using their values at $i - 1$.
Calendar year factors and parameters at time $i$ are projected using
\[
\widetilde{\boldsymbol{\Theta}}^{\{m;i\}} = \xi \cdot \boldsymbol{\Theta}^{\{m;i-1\}} + (1 - \xi) \cdot \frac{1}{M} \sum_{r=1}^{M} \boldsymbol{\Theta}^{\{r;i-1\}}, \tag{3.4}
\]
\[
\widetilde{\boldsymbol{\psi}}_I^{\{m;i\}} = \xi \cdot \boldsymbol{\psi}_I^{\{m;i-1\}} + (1 - \xi) \cdot \frac{1}{M} \sum_{r=1}^{M} \boldsymbol{\psi}_I^{\{r;i-1\}}, \tag{3.5}
\]
where $\widetilde{\boldsymbol{\Theta}}^{\{m;i\}}$ is the $m$-th look-ahead estimate of $\boldsymbol{\Theta}$ at time $i$, $\xi$ is a shrinkage coefficient, and $\widetilde{\boldsymbol{\psi}}_I^{\{m;i\}}$ represents the $m$-th look-ahead sample of all calendar year factors at time $i$. Liu and West (2001) discuss how to choose an appropriate value for $\xi$.
The $m$-th look-ahead sample of non-calendar year factors at time $i$ is computed using
\[
\widetilde{\boldsymbol{\gamma}}_i^{\{m\}} = E\big[ \boldsymbol{\gamma}_i \,\big|\, \boldsymbol{\gamma}_{i-1}^{\{m\}}, \boldsymbol{\Theta}^{\{m;i-1\}} \big]. \tag{3.6}
\]
These samples are of the same nature as samples from prior distributions in the conventional Bayesian setting. Note that with a random walk formulation they are simply equal to their previous values at time $i - 1$.

Step 3. Compute the look-ahead raw and normalised importance weights at time $i$.
Raw importance weights are calculated recursively using the previous weights and the likelihood of the new observations $\boldsymbol{Y}_i$ received at $i$,
\[
\widetilde{\omega}_i^{\{m\}} = \omega_{i-1}^{\{m\}} \cdot f_{\boldsymbol{Y}_i | \boldsymbol{\gamma}_i, \boldsymbol{\psi}_I; \boldsymbol{\Theta}}\big( \boldsymbol{y}_i \,\big|\, \widetilde{\boldsymbol{\gamma}}_i^{\{m\}}, \widetilde{\boldsymbol{\psi}}_I^{\{m;i\}};\ \widetilde{\boldsymbol{\Theta}}^{\{m;i\}} \big), \tag{3.7}
\]
where the likelihood is evaluated at the projected random factors and parameters.
Normalise the importance weights,
\[
W_i^{\{m\}} = \frac{\widetilde{\omega}_i^{\{m\}}}{\sum_{m=1}^{M} \widetilde{\omega}_i^{\{m\}}}. \tag{3.8}
\]

Step 4. Re-sample.
Resample $M$ samples (particles) of $\{ \boldsymbol{\gamma}_{i-1}^{\{m\}}, \widetilde{\boldsymbol{\psi}}_I^{\{m;i\}}; \widetilde{\boldsymbol{\Theta}}^{\{m;i\}} \}_{m=1}^{M}$ with probabilities $\{ W_i^{\{m\}} \}_{m=1}^{M}$.
Overall, Steps 2-4 select and resample particles (samples) of random factors and parameters. Samples of random factors and parameters that give a higher likelihood to the new observations $\boldsymbol{Y}_i$ received at $i$ are resampled with higher probabilities.

Step 5. Draw filtered (posterior) random factors and parameters.
Filtered parameters and calendar year factors are sampled using
\[
\boldsymbol{\Theta}^{\{m;i\}} \sim \text{Normal}\big( \widetilde{\boldsymbol{\Theta}}^{\{m;i\}}, (1 - \xi^2)\, \Sigma_{\boldsymbol{\Theta}}^{\{i-1\}} \big), \tag{3.9}
\]
\[
\boldsymbol{\psi}_I^{\{m;i\}} \sim \text{Normal}\big( \widetilde{\boldsymbol{\psi}}_I^{\{m;i\}}, (1 - \xi^2)\, \Sigma_{\boldsymbol{\psi}_I}^{\{i-1\}} \big), \tag{3.10}
\]
where $\boldsymbol{\Theta}^{\{m;i\}}$ is the $m$-th filtered estimate of $\boldsymbol{\Theta}$ at time $i$, $\boldsymbol{\psi}_I^{\{m;i\}}$ is the $m$-th filtered sample of all calendar year factors at time $i$, $\Sigma_{\boldsymbol{\Theta}}^{\{i-1\}}$ is the sample covariance matrix of $\{ \boldsymbol{\Theta}^{\{m;i-1\}} \}_{m=1}^{M}$, and $\Sigma_{\boldsymbol{\psi}_I}^{\{i-1\}}$ is the sample covariance matrix of $\{ \boldsymbol{\psi}_I^{\{m;i-1\}} \}_{m=1}^{M}$.
Also draw samples of filtered values of the non-calendar factors at time $i$ according to the distribution
\[
\boldsymbol{\gamma}_i^{\{m\}} \sim f_{\boldsymbol{\gamma}_i | \boldsymbol{\gamma}_{i-1}}\big( \boldsymbol{\gamma}_i \,\big|\, \boldsymbol{\gamma}_{i-1}^{\{m\}};\ \boldsymbol{\Theta}^{\{m;i\}} \big). \tag{3.11}
\]
These filtered samples differ from the projected samples created in Step 2. Projected samples are drawn before the observations $\boldsymbol{Y}_i$ are received at $i$; filtered samples are drawn after the arrival of these observations. They are drawn directly from the look-ahead samples, which were resampled based on their likelihood of the new observations. Hence filtered samples can be considered as correction samples, or posterior samples in the conventional Bayesian setting, of random factors and parameters given the new observations.

Step 6. Calculate importance weights on filtered samples.
Importance weights are updated using
\[
\omega_i^{\{m\}} = \frac{ f_{\boldsymbol{Y}_i | \boldsymbol{\gamma}_i, \boldsymbol{\psi}_I; \boldsymbol{\Theta}}\big( \boldsymbol{y}_i \,\big|\, \boldsymbol{\gamma}_i^{\{m\}}, \boldsymbol{\psi}_I^{\{m;i\}};\ \boldsymbol{\Theta}^{\{m;i\}} \big) }{ f_{\boldsymbol{Y}_i | \boldsymbol{\gamma}_i, \boldsymbol{\psi}_I; \boldsymbol{\Theta}}\big( \boldsymbol{y}_i \,\big|\, \widetilde{\boldsymbol{\gamma}}_i^{\{m\}}, \widetilde{\boldsymbol{\psi}}_I^{\{m;i\}};\ \widetilde{\boldsymbol{\Theta}}^{\{m;i\}} \big) }. \tag{3.12}
\]
This update of the weights takes into account the mismatch between the likelihood at the actual samples of random factors $\boldsymbol{\gamma}_i^{\{m\}}$, calendar year factors $\boldsymbol{\psi}_I^{\{m;i\}}$ and parameters $\boldsymbol{\Theta}^{\{m;i\}}$, and the likelihood at their predicted values. Essentially, samples with higher predictive ability (lower mismatches) are given higher weights and are more likely to be resampled in the next time period.

Step 7. Repeat Steps 2-6 until $i = I$.

Remark 3.1.
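The mechanics of Steps 1-7 can be sketched on a toy problem. The following is our own minimal illustration, not the authors' implementation: a univariate Gaussian local-level model $y_i = g_i + e_i$, $g_i = g_{i-1} + w_i$, where the single static parameter is the observation standard deviation $\sigma$ (learned on the log scale, playing the role of $\Theta$), and the shrinkage coefficient `xi` corresponds to Liu and West's coefficient. All names and parameter values are ours.

```python
import numpy as np

rng = np.random.default_rng(0)
I, M, xi = 30, 2000, 0.98
true_sigma = 0.5
g_true = np.cumsum(rng.normal(0, 0.1, I))       # latent random walk
y = g_true + rng.normal(0, true_sigma, I)       # observations

def loglik(y_i, g, sigma):
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * (y_i - g)**2 / sigma**2

# Step 1: initialise particles of the state g and parameter log(sigma),
# and the time-1 importance weights (analogues of (3.2)-(3.3)).
g = rng.normal(y[0], 1.0, M)
ls = rng.normal(0.0, 0.5, M)                    # prior on log(sigma)
w = np.exp(loglik(y[0], g, np.exp(ls)))

for i in range(1, I):
    # Step 2: look-ahead values; the random-walk prediction of g is g itself,
    # parameters are shrunk towards their weighted mean ((3.4)-(3.5)).
    ls_bar = np.average(ls, weights=w)
    ls_la = xi * ls + (1 - xi) * ls_bar
    # Step 3: raw look-ahead weights (3.7) and their normalisation (3.8).
    wt = w * np.exp(loglik(y[i], g, np.exp(ls_la)))
    Wn = wt / wt.sum()
    # Step 4: resample particles with these probabilities.
    idx = rng.choice(M, size=M, p=Wn)
    g_prev, ls_la = g[idx], ls_la[idx]
    # Step 5: filtered draws -- kernel-smoothed parameters (cf. (3.9)) and
    # states propagated through the transition density (cf. (3.11)).
    var = np.average((ls - ls_bar) ** 2, weights=w)
    ls = rng.normal(ls_la, np.sqrt(max((1 - xi**2) * var, 1e-12)))
    g = rng.normal(g_prev, 0.1)
    # Step 6: weights correcting the look-ahead approximation (cf. (3.12)).
    w = np.exp(loglik(y[i], g, np.exp(ls)) - loglik(y[i], g_prev, np.exp(ls_la)))

sigma_hat = np.exp(np.average(ls, weights=w))   # filtered parameter estimate
```

The weighted particle cloud `(g, ls, w)` after the loop approximates the joint filtered distribution of the state and the learned parameter at time $I$.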
The filtering approach described in this section is a special type of particle filtering which incorporates parameter estimation. Particle filtering, in general, is considered an effective filtering tool for evolutionary models due to its flexibility in dealing with a wide range of models. This flexibility comes at a computational cost; however, with increasing computational power, it has found applications in various fields (Polson and Sokolov, 2015; Lamien et al., 2018; Rios and Lopes, 2013).

One often experiences convergence issues when implementing a particle filter; see, for example, Doucet, Godsill and Andrieu (2000); Andrieu, Doucet and Tadić (2005); Carvalho, Johannes, Lopes, Polson et al. (2010a); Creal (2012). In the filtering approach developed in this paper, we have taken steps to assist convergence. In particular, while a traditional particle filter places the evaluation step ahead of the re-sampling step, this order is reversed in the filtering approach used here. This, in conjunction with the somewhat informative priors/initial estimates used in the filter, can reduce the degeneracy issue (Doucet and Johansen, 2011; Creal, 2012; Cappé et al., 2007). These choices also help reduce the computational cost of the filter and improve the rate of convergence.

3.2. Dual Kalman filtering approach for Gaussian models
For Gaussian models, a (modified) Kalman filter can be used to estimate the random factors recursively. In this special case, estimates of the random factors can be obtained in analytical form, making it an interesting case to study. As mentioned in the previous section, because the calendar year factors $h_t$ behave differently from the other factors, the traditional Kalman filter cannot be applied without adjustment. Using a treatment similar to that of the previous section for the general case, we also consider calendar factors as static parameters to be updated beyond the first accident year. A modified version of the Kalman filter in the literature that fits this purpose well is the dual Kalman filter. It was developed by Nelson and Stear (1976) and has been used to provide sequential estimates of dynamic factors as well as static factors or parameters of Gaussian models in various fields, including civil engineering (Azam, Chatzi and Papadimitriou, 2015) and vehicle systems (Wenzel, Burnham, Blundell and Williams, 2006). The dual Kalman filter involves two filters that run in parallel: one for calendar factors, and one for non-calendar factors. The information from one filter flows into the other for continuing updates. Applying a dual Kalman filter to the Gaussian case, we proceed as follows.

Dual Kalman filter algorithm

Step 1. Initialise.
At $i = 1$, obtain initial estimates of the mean and variance of the calendar year factors,
\[
\widetilde{\boldsymbol{\psi}}_I^{\{1\}} = E\big[ \boldsymbol{\psi}_I^{\{1\}} \big], \tag{3.13}
\]
\[
{}^{h}\widetilde{P}_1 = \text{Cov}\big[ \boldsymbol{\psi}_I^{\{1\}} \big]. \tag{3.14}
\]
These can be obtained by simulating samples of $\boldsymbol{\psi}_I^{\{1\}}$ using their specification in Equation (2.9); the mean and variance estimates are then the sample mean and covariance matrix of these samples.
Also obtain initial estimates of the non-calendar year factors,
\[
\widetilde{\boldsymbol{\gamma}}_1 = E[ \boldsymbol{\gamma}_1 ], \tag{3.15}
\]
\[
{}^{\gamma}\widetilde{P}_1 = \text{Cov}[ \boldsymbol{\gamma}_1 ], \tag{3.16}
\]
which can be chosen using preliminary GLM analyses with fixed factors.

For $i = 1, \ldots, I$:

Step 2. Filter/update calendar year factors.
Calculate the Kalman gain ${}^{h}G_i$ for the calendar year factors,
\[
{}^{h}G_i = {}^{h}\widetilde{P}_i \cdot E_i' \cdot \big( E_i \cdot {}^{h}\widetilde{P}_i \cdot E_i' + H_i \big)^{-1}. \tag{3.17}
\]
Update the estimates of the calendar factors, including the conditional mean $\widehat{\boldsymbol{\psi}}_I^{\{i\}}$ and the conditional covariance matrix ${}^{h}\widehat{P}_i$, on the arrival of the observations $\boldsymbol{Y}_i$ at time $i$:
\[
\widehat{\boldsymbol{\psi}}_I^{\{i\}} = E\big[ \boldsymbol{\psi}_I^{\{i\}} \,\big|\, \boldsymbol{Y}_i \big] = \widetilde{\boldsymbol{\psi}}_I^{\{i\}} + {}^{h}G_i \cdot \big( \boldsymbol{Y}_i - A_i \cdot \widehat{\boldsymbol{\gamma}}_{i-1} - E_i \cdot \widetilde{\boldsymbol{\psi}}_I^{\{i\}} \big), \tag{3.18}
\]
\[
{}^{h}\widehat{P}_i = \text{Cov}\big[ \boldsymbol{\psi}_I^{\{i\}} \,\big|\, \boldsymbol{Y}_i \big] = {}^{h}\widetilde{P}_i - {}^{h}G_i \cdot E_i \cdot {}^{h}\widetilde{P}_i. \tag{3.19}
\]
These can be considered as the posterior mean and variance in the conventional Bayesian setting.

Step 3. Filter/update non-calendar year factors.
Calculate the Kalman gain ${}^{\gamma}G_i$ for the non-calendar year factors,
\[
{}^{\gamma}G_i = {}^{\gamma}\widetilde{P}_i \cdot A_i' \cdot \big( A_i \cdot {}^{\gamma}\widetilde{P}_i \cdot A_i' + H_i \big)^{-1}. \tag{3.20}
\]
Update the estimates of the other factors, including the conditional mean $\widehat{\boldsymbol{\gamma}}_i$ and the conditional covariance matrix ${}^{\gamma}\widehat{P}_i$, using the observations $\boldsymbol{Y}_i$ at time $i$:
\[
\widehat{\boldsymbol{\gamma}}_i = E[ \boldsymbol{\gamma}_i \,|\, \boldsymbol{Y}_i ] = \widetilde{\boldsymbol{\gamma}}_i + {}^{\gamma}G_i \cdot \big( \boldsymbol{Y}_i - A_i \cdot \widetilde{\boldsymbol{\gamma}}_i - E_i \cdot \widehat{\boldsymbol{\psi}}_I^{\{i\}} \big), \tag{3.21}
\]
\[
{}^{\gamma}\widehat{P}_i = \text{Cov}[ \boldsymbol{\gamma}_i \,|\, \boldsymbol{Y}_i ] = {}^{\gamma}\widetilde{P}_i - {}^{\gamma}G_i \cdot A_i \cdot {}^{\gamma}\widetilde{P}_i. \tag{3.22}
\]
These can be considered as the posterior mean and variance in the conventional Bayesian setting.

Step 4. Predict calendar year factors (time update).
Project the calendar year factors to the next period,
\[
\widetilde{\boldsymbol{\psi}}_I^{\{i+1\}} = E\big[ \boldsymbol{\psi}_I^{\{i+1\}} \,\big|\, \boldsymbol{Y}_i \big] = \widehat{\boldsymbol{\psi}}_I^{\{i\}}, \tag{3.23}
\]
and project the error covariance of these factors,
\[
{}^{h}\widetilde{P}_{i+1} = \text{Cov}\big[ \boldsymbol{\psi}_I^{\{i+1\}} \,\big|\, \boldsymbol{Y}_i \big] = {}^{h}\widehat{P}_i + Q_{hI\epsilon}, \tag{3.24}
\]
where $Q_{hI\epsilon}$ is the artificial dynamic added to the covariance specification. It can be chosen to reflect the level of uncertainty regarding the estimates of the calendar factors.
Greater uncertainty can be reflected by a larger artificial noise. The estimates obtained in this step can be considered as the prior mean and variance in the conventional Bayesian setting.

Step 5. Predict non-calendar year factors (time update).
Project the non-calendar year factors ahead,
\[
\widetilde{\boldsymbol{\gamma}}_{i+1} = E[ \boldsymbol{\gamma}_{i+1} \,|\, \boldsymbol{Y}_i ] = \widehat{\boldsymbol{\gamma}}_i, \tag{3.25}
\]
and project their error covariance ahead,
\[
{}^{\gamma}\widetilde{P}_{i+1} = \text{Cov}[ \boldsymbol{\gamma}_{i+1} \,|\, \boldsymbol{Y}_i ] = {}^{\gamma}\widehat{P}_i + Q_{\gamma\epsilon}. \tag{3.26}
\]

Step 6. Repeat Steps 2-5 until $i = I$.

The above algorithm is conditional on the values of the parameters $\boldsymbol{\Theta}$. Maximum likelihood estimation can be used to estimate these parameters. The log-likelihood function can be written as
\[
\log f_{\boldsymbol{Y}_{1:I}}( \boldsymbol{y}_{1:I};\ \boldsymbol{\Theta} ) = \sum_{i=1}^{I} \log f_{\boldsymbol{Y}_i | \boldsymbol{Y}_{1:i-1}}( \boldsymbol{y}_i \,|\, \boldsymbol{y}_{1:i-1};\ \boldsymbol{\Theta} ) \tag{3.27}
\]
\[
= -\frac{I(I+1)}{4} \log(2\pi) - \frac{1}{2} \sum_{i=1}^{I} \Big( \log \big| A_i \, {}^{\gamma}\widetilde{P}_i \, A_i' + E_i \, {}^{h}\widetilde{P}_i \, E_i' + H_i \big| + \big( \boldsymbol{y}_i - A_i \widetilde{\boldsymbol{\gamma}}_i - E_i \widetilde{\boldsymbol{\psi}}_I^{\{i\}} \big)' \big( A_i \, {}^{\gamma}\widetilde{P}_i \, A_i' + E_i \, {}^{h}\widetilde{P}_i \, E_i' + H_i \big)^{-1} \big( \boldsymbol{y}_i - A_i \widetilde{\boldsymbol{\gamma}}_i - E_i \widetilde{\boldsymbol{\psi}}_I^{\{i\}} \big) \Big),
\]
which can be maximised numerically to provide the maximum likelihood estimate of $\boldsymbol{\Theta}$. When maximum likelihood estimation is used for parameter estimation, bootstrapping is needed to assess the parameter uncertainty in the projection of the claims liability, due to the typically small sample size of loss data; in that case, the standard errors computed via maximum likelihood estimation are unreliable.
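The parallel structure of the two filters can be sketched on a toy scalar analogue. The following is our own illustration, not the authors' code: observations $y_i = g_i + h + e_i$, where $g_i$ is a random-walk state (the analogue of $\gamma_i$) and $h$ is a static level treated as a "parameter" filtered in parallel with an artificial dynamic `q_h` (the analogue of $Q_{hI\epsilon}$). All names and values are ours.

```python
import numpy as np

rng = np.random.default_rng(1)
I, sig_e, sig_g, q_h = 60, 0.3, 0.1, 1e-4
h_true = 1.0
g_true = np.cumsum(rng.normal(0, sig_g, I))
y = g_true + h_true + rng.normal(0, sig_e, I)

g_hat, Pg = 0.0, 1.0          # state filter (analogue of gamma)
h_hat, Ph = 0.0, 1.0          # parameter filter (analogue of psi_I)
for i in range(I):
    # Step 2: update h given the current state estimate (cf. (3.17)-(3.19)).
    Gh = Ph / (Ph + sig_e**2)
    h_hat = h_hat + Gh * (y[i] - g_hat - h_hat)
    Ph = Ph - Gh * Ph
    # Step 3: update g given the refreshed h (cf. (3.20)-(3.22)).
    Gg = Pg / (Pg + sig_e**2)
    g_hat = g_hat + Gg * (y[i] - g_hat - h_hat)
    Pg = Pg - Gg * Pg
    # Steps 4-5: time updates; h receives only the artificial noise q_h,
    # g receives its genuine random-walk variance (cf. (3.24), (3.26)).
    Ph = Ph + q_h
    Pg = Pg + sig_g**2
```

Because $g$ and $h$ are only jointly identified here, it is their sum that tracks the signal; the small artificial dynamic `q_h` keeps the "parameter" filter responsive without letting $h$ drift freely.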
4. Simulation illustration
A simulation illustration is performed for the evolutionary GLM framework to assess the performance of the estimation approach. The simulated data used for this illustration consist of two 15 × 15 triangles, given in Tables A and B in Appendix A.1. The data are simulated from a Tweedie GLM with random factors following random walk evolutions. The estimation approach developed in Section 3.1 is used to estimate the random factors and the unknown parameters. We use 50,000 samples for each time period and initialise the estimation using static GLM estimates.
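A generator of this type can be sketched as follows. This is our own illustration, not the authors' simulation code: it simulates one triangle from the gamma member of the Tweedie family ($p = 2$, for which numpy has a direct sampler), with random-walk factors and made-up parameter values.

```python
import numpy as np

rng = np.random.default_rng(3)
I, phi = 15, 0.05                                  # triangle size, dispersion
a = 8.0 + np.cumsum(rng.normal(0, 0.05, I))        # accident year levels a_i
r = 1.5 + np.cumsum(rng.normal(0, 0.02, I))        # Hoerl factors r_i
s = -0.4 + np.cumsum(rng.normal(0, 0.01, I))       # Hoerl factors s_i
h = np.cumsum(rng.normal(0, 0.02, I))              # calendar year factors h_t

Y = np.full((I, I), np.nan)                        # upper triangle only
for i in range(1, I + 1):
    for j in range(1, I - i + 2):
        # log-link mean: a_i + r_i*log(j) + s_i*j + h_{i+j-1}
        mu = np.exp(a[i-1] + r[i-1]*np.log(j) + s[i-1]*j + h[i+j-2])
        # gamma case (Tweedie p = 2): mean mu, variance phi * mu^2
        Y[i-1, j-1] = rng.gamma(shape=1.0/phi, scale=phi*mu)
```

Each factor series is a random walk, so development and calendar patterns drift smoothly across the triangle, which is the feature the adaptive filter is meant to track.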
Estimates of random factors are provided in Table C in Appendix A.1. Note that these are filtered estimates, or posterior estimates, meaning that the estimates of the factors in an accident period are obtained as the data in that accident period arrive. We provide plots of fitting ratios in Figure 4.4, calculated as ratios of filtered to true values. Note that each development pattern is fitted with a Hoerl curve governed by two random factors, $r^{(n)}_i$ and $s^{(n)}_i$. Therefore, the most direct way to assess the goodness-of-fit of the development pattern is to consider the fitting ratios of the mean and variance of the Hoerl curve calculated from these two factors. In particular, the mean of the Hoerl curve for accident period $i$ and line $n$ is calculated by
\[
\frac{r^{(n)}_i - 1}{-s^{(n)}_i}, \tag{4.1}
\]
and the variance by
\[
\frac{r^{(n)}_i - 1}{\big( s^{(n)}_i \big)^2}. \tag{4.2}
\]

Figure 4.4: Fitted ratios of random factors in the GLM framework illustration (filtered values to true values). [Panels show, for each of the two lines and years 1-15, the fitting ratios of the accident year factors $a^{(n)}_i$, the Hoerl curve means and variances of (4.1)-(4.2), and the calendar year factors $h^{(n)}_t$.]
It can be observed from Figure 4.4 that the goodness of fit is reasonable for the accident year factors $a^{(n)}_i$. The estimates of the means and variances of the Hoerl curves are also quite close to their actual values. Estimates of the calendar factors $h^{(n)}_t$ are consistently lower than their true values; however, the deviations remain within 20% of the actual values.

A compensation across random factors is evident from their estimates in Table C. Accident year factors $a^{(n)}_i$ are overestimated, and this is compensated by an underestimation of the calendar year factors. A consistent mis-allocation between these factors is observed from the relatively stable fitting ratios of these factors in Figure 4.4. This issue is not specific to evolutionary models; it is also very common in models that consider all three factors: accident year, calendar year and development year (Zehnwirth, 1994; Taylor, 2000; Barnett and Zehnwirth, 2000; Brydon and Verrall, 2009; Gluck and Venter, 2009; Venter, Gutkovich and Gao, 2019). Due to the collinearity between these factors, mis-allocations between them can largely offset to give an overall reasonable fit. Experienced practitioners also tend to be more interested in the combination of these factors than in their individual trends (Venter, Gutkovich and Gao, 2019). As the ultimate goal of any valuation task is to forecast outstanding claims, a question is then raised regarding the impact of this mis-allocation on the accuracy of the claims projection. McGuire, Taylor and Miller (2018) argued that extrapolating future trends of these factors using their corresponding estimates from the past would produce reasonably accurate future claims experience. This works well for cases with constant calendar year trends.
When the trends are not constant, one should proceed with caution, and some reasonableness checks of the trends can be useful.

To assess the calendar year dependence driven by common shocks, we calculate the Pearson correlation coefficient between the actual calendar year factors $h^{(1)}_t$ and $h^{(2)}_t$, and the correlation coefficient between the estimates of these factors obtained from the particle filter. The Pearson correlation coefficient between the filtered (posterior) values is 0.4132 with a 95% confidence interval (-0.1257; 0.7638). The actual Pearson correlation coefficient is 0.1463, which lies well within this interval.

Parameter estimates are provided in Table D in Appendix A.1. The results show that the true values of many parameters fall outside their respective confidence intervals, even though they appear close to their estimates. The narrow confidence intervals that we observe in the results are mainly due to the degeneracy issue of the particle filter used. Particle degeneracy refers to the situation where all but a few particles (samples) have negligible weights. As a result, the re-sampling step in the particle filtering with parameter learning algorithm focuses on multiplying the few particles with significant weights and abandons the majority of particles with negligible weights. The resultant sample then has a very low diversity of particles. This consequently results in smaller confidence intervals of parameter estimates in particle filtering with parameter learning (Rios and Lopes, 2013). As a result, the algorithm may not converge to the right parameters. This issue has been recognised for particle filtering with parameter learning specifically (see, for example, Carvalho, Johannes, Lopes, Polson et al., 2010b; Rios and Lopes, 2013), and for simulation-based estimation approaches in general (see, for example, Doucet, Godsill and Andrieu, 2000; Andrieu, Doucet and Tadić, 2005; Carvalho, Johannes, Lopes, Polson et al., 2010a; Creal, 2012).
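The interval reported for the filtered correlation is consistent with a standard Fisher z confidence interval. A sketch (our calculation, assuming n = 15 calendar year factors, the number of diagonals in a 15 × 15 triangle):

```python
import math

def pearson_ci(r, n):
    """Two-sided 95% Fisher z confidence interval for a Pearson correlation."""
    z = math.atanh(r)                  # Fisher z transform
    se = 1.0 / math.sqrt(n - 3)        # approximate standard error of z
    half = 1.959963985 * se            # 97.5% standard normal quantile
    return math.tanh(z - half), math.tanh(z + half)

lo, hi = pearson_ci(0.4132, 15)        # approximately (-0.126, 0.764)
```

With r = 0.4132 and n = 15, this reproduces the interval (-0.1257; 0.7638) quoted above to within rounding.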
This illustration also has a large vector of observations at each time period (up to 30 in the first year). This high dimension of observations makes the likelihood function very steep, which further contributes to the degeneracy problem (see, for example, Wang, Li, Sun and Corchado, 2017; Li, Sun, Sattar and Corchado, 2014).

Using informative prior distributions for the parameters and informative initial values for the random factors can help reduce this problem. In this illustration, we have used estimates from a static GLM fitting as guidance for selecting prior distributions and initial values for the random factors. It is also worth noting that, while it is desirable for prior distributions of parameters to have large variances, these prior distributions should not be too vague, to avoid particle degeneracy issues. We have also performed a number of trial runs to obtain a reasonable particle filtering path which does not suffer from severe particle degeneracy. This task is quite similar in nature to the "tuning" process of the Metropolis-Hastings algorithm in Bayesian inference. However, as mentioned in Remark 3.1, the degeneracy issue is unavoidable in particle filtering, and it is only possible to reduce it to a certain extent, not to eliminate it completely.

As shown in Table D, there appears to be some compensation between variance terms in this illustration as well. The dispersion parameters of the observations, including $\phi^{(1)}, \phi^{(2)}, p^{(1)}, p^{(2)}$, are overestimated, while the variance terms of the random factors are underestimated. An explanation can be that, while the framework has two components, observations and random factors, the random factors are latent and only the total volatility is observed. This may cause a mis-allocation across variance terms. This issue is unlikely to distort the projection of claims if it is handled with caution. For example, observed rapid changes in claims development patterns should be appropriately recognised with large variance estimates of the development factors.
As long as the overall goodness-of-fit is reasonable, the compensation across component volatilities should not have a material impact on the projection of future claims, because the total volatility of future claims is the subject of interest. A careful examination of data features, as well as any expert opinions, can help select more informative starting values and prior distributions for the variance parameters. As mentioned above, in this illustration we have based our initial estimates on static GLM estimates and performed trial runs to select appropriate prior distributions for the variance parameters.

We examine the performance of the particle filter by assessing how closely the estimated claim patterns track the observed patterns. Examples of this tracking for accident years 7 and 11 are given in Figure 4.5. There are significant changes in the claims development patterns from year 6 to year 7, and from year 10 to year 11, as shown in this figure. However, the particle filter is able to track these changes very closely.
Figure 4.5: Tracking of claims development patterns for some accident years in the simulation illustration. [Four panels: Triangles 1 and 2 for accident year 7 (year 7 observed, year 7 filtered, year 6 filtered) and for accident year 11 (year 11 observed, year 11 filtered, year 10 filtered), plotted against development year.]
5. Real data illustration
The data set used for this illustration is from Côté, Genest and Abdallah (2016). The two triangles chosen for this illustration are from the Accident Benefits covers of the Auto Insurance line in Ontario. One triangle is for Accident Benefits (AB) excluding Disability Income, and the other is for AB with Disability Income only. Incremental losses are given in Tables E and F in Appendix A.2. Claims are standardised using the total premium earned in the corresponding accident years to give incremental loss ratios.

Figure 5.6: Incremental loss ratios in real data from a Canadian insurer. [Two panels: Accident Benefits (excluding DI) and Accident Benefits (DI only); loss ratio against development year.]
A preliminary analysis is performed to assess the suitability of the data set. This includes an assessment of any changes in the development patterns, as well as of the dependence across the two lines.
Plots of loss ratios are provided in Figure 5.6. For each loss triangle, the top two values in each accident year are also highlighted to identify the peak in the development pattern. These are provided in Figure 5.7 for accident years 1-8.
Figure 5.7: Loss triangles with top two values highlighted for each accident year (top: Accident Benefits excluding Disability Income, bottom: Accident Benefits - Disability Income only)
Plots of loss ratios show variations in claims development patterns over time. In the Accident Benefits excluding Disability Income line, the peak in the claims development pattern shifts between development years 1-2 and 2-3. Hence it is desirable for the model to be able to capture this feature. In the Accident Benefits with Disability Income only line, the peak in the development shifts from development years 1-2 in the first two accident years to years 2-3 onwards. It could be tempting to model the first two accident years separately using a static modelling approach. However, using an evolutionary model allows the changes in the prediction of claims to be smoothed out over time. In addition, it can be observed that the level of variation between claims in development years 1 and 3 within the same accident year has also varied over time. This indicates some variation in the development patterns besides the peaks.
A heuristic dependence analysis is performed by fitting a Tweedie GLM to each line. The Tweedie family is a major subclass of the Exponential Dispersion Family, consisting of symmetric and non-symmetric, light-tailed and heavy-tailed distributions (Alai et al., 2016; Jørgensen, 1997). This subclass includes many distributions commonly used in the reserving literature; see, for example, Alai and Wüthrich (2009); Boucher and Davidov (2011); England and Verrall (2002); Peters et al. (2009). A key feature of this subclass is the relationship between the unit variance function $V(\mu)$ and the mean $\mu$, where $V(\mu) = \mu^p$. The power parameter $p$ identifies a member of the family; for example, $p = 1$ corresponds to a Poisson distribution and $p = 2$ corresponds to a gamma distribution. Through the flexibility of the power parameter $p$, model error is also allowed for.

The log link and a modified Hoerl curve are used in this GLM structure, with additional covariates for the first two development years:
\[
a^{(n)}_i + r^{(n)} \log(j) + s^{(n)} j + b^{(n)}_1 \mathbb{1}_{\{j=1\}} + b^{(n)}_2 \mathbb{1}_{\{j=2\}}. \tag{5.1}
\]
As the peak in the development pattern shifts across development years 1, 2 and 3, adding the two covariates for the first two development years can improve the goodness-of-fit of the Hoerl curve. Such a modification is quite common in applications of the Hoerl curve (England and Verrall, 2001). This static GLM fitting aims to remove fixed accident year and development year effects.

Pearson            Spearman           Kendall
0.2599 (0.0554)    0.3087 (0.0222)    0.2256 (0.0150)

Table 5.1: Measures of association between cell-wise GLM residuals and their corresponding p-values

Measures of association between pair-wise GLM Pearson residuals of the two lines are provided in Table 5.1. All coefficients are quite strong, and while the Pearson coefficient is not significant at 5%, more holistic indicators such as Spearman and Kendall are significant. Another GLM analysis is also performed with the chain ladder mean structure.
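This residual analysis can be sketched as follows. This is our own illustration, not the authors' code: each line is fitted with a Tweedie GLM (log link, variance function $\mu^p$) using the modified Hoerl predictor of (5.1) via a hand-rolled IRLS loop, and the cell-wise Pearson residuals of the two lines are then correlated. The data are simulated placeholders, not the Canadian triangles, and the function name `fit_tweedie_glm` is ours.

```python
import numpy as np

def fit_tweedie_glm(y, X, p=1.5, iters=30):
    """IRLS for a Tweedie GLM with log link and variance function mu^p.

    Returns the cell-wise Pearson residuals (y - mu) / sqrt(mu^p).
    """
    mu = np.full_like(y, y.mean())            # flat starting values
    for _ in range(iters):
        eta = np.log(mu)
        z = eta + (y - mu) / mu               # working response (log link)
        W = mu ** (2.0 - p)                   # working weights mu^2 / V(mu)
        WX = X * W[:, None]
        beta = np.linalg.solve(X.T @ WX, WX.T @ z)
        mu = np.exp(X @ beta)
    return (y - mu) / np.sqrt(mu ** p)

rng = np.random.default_rng(2)
I = 10
ij = [(i, j) for i in range(1, I + 1) for j in range(1, I - i + 2)]
i_idx = np.array([c[0] for c in ij])
j_idx = np.array([c[1] for c in ij], dtype=float)
# design of (5.1): accident year dummies, log(j), j, dummies for j = 1, 2
X = np.column_stack(
    [(i_idx == k).astype(float) for k in range(1, I + 1)]
    + [np.log(j_idx), j_idx,
       (j_idx == 1).astype(float), (j_idx == 2).astype(float)])

resid = []
for line in range(2):
    mu_true = np.exp(1.0 + 0.8 * np.log(j_idx) - 0.3 * j_idx)
    y = rng.gamma(shape=2.0, scale=mu_true / 2.0)   # positive gamma-like losses
    resid.append(fit_tweedie_glm(y, X))
r = np.corrcoef(resid[0], resid[1])[0, 1]           # cell-wise Pearson coefficient
```

Because the two simulated lines here are independent, `r` should be near zero; on dependent data the residual association survives the removal of the fixed accident year and development year effects, which is the point of the heuristic.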
The Pearson correlation coefficient of residuals from this GLM analysis, however, is significant. This further illustrates the conclusion in Avanzi, Taylor and Wong (2016b) that correlation coefficients are dependent on the methodology used. From these results, it is reasonable to conclude that there may be dependence retained in the data set after removing fixed accident year and development year effects. However, this dependence is not necessarily linear, and a Pearson correlation coefficient may not be an appropriate model for it. We also analyse heat maps of GLM Pearson residuals provided in Figure A in Appendix A.2 and observe some common patterns in the diagonals of the two triangles.
Figure 5.8: Plots of GLM residuals by calendar years
We now examine residuals by calendar years to find further trace of calendar year trends and dependence. The residual for each calendar year is calculated as the difference between the sum of observed values and the sum of fitted values for all cells in that year, standardised by the sum of fitted values. Plots of calendar year residuals for the two lines are given in Figure 5.8. Clear evidence of calendar year dependence is observed from this figure. In particular, there are some sympathetic movements in the calendar year trends of both lines, such as in calendar years 1 to 2, 4 to 5, and 7 to 9. This suggests some common effects that impact both lines, as well as idiosyncratic effects within individual lines. The Pearson correlation coefficient between the calendar year residuals of the two lines is 0.6976 (p-value 0.0249), which is strong and significant. However, the correlation is -0.0691 (p-value 0.8598) if CY1 is omitted. This highlights the limitations of using correlation to model dependence in this context; this is further discussed in Section 6.1.

The exploratory dependence analysis shows evidence of calendar year dependence across the two lines. Hence this data set is suitable for illustration of the multivariate evolutionary GLM framework.
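The calendar year residual just described, the difference between observed and fitted sums on a diagonal standardised by the fitted sum, can be sketched as follows. This is a stdlib-only Python illustration with hypothetical toy numbers; the indexing convention t = i + j - 1 for the calendar year of cell (i, j) is an assumption.

```python
def calendar_year_residuals(observed, fitted):
    """Calendar-year residuals: (sum observed - sum fitted) / sum fitted
    over all cells on each diagonal.  Triangles are dicts keyed by
    (accident year i, development year j), both 1-based, with calendar
    year t = i + j - 1 (an assumed indexing convention)."""
    totals = {}
    for (i, j), obs in observed.items():
        t = i + j - 1
        o, f = totals.get(t, (0.0, 0.0))
        totals[t] = (o + obs, f + fitted[(i, j)])
    return {t: (o - f) / f for t, (o, f) in sorted(totals.items())}

# Toy 2x2 upper triangle (hypothetical numbers, for illustration only).
obs = {(1, 1): 100.0, (1, 2): 60.0, (2, 1): 110.0}
fit = {(1, 1): 95.0, (1, 2): 55.0, (2, 1): 120.0}
res = calendar_year_residuals(obs, fit)
# Calendar year 1 holds cell (1, 1); calendar year 2 holds (1, 2) and (2, 1).
```

Plotting `res` for each line against t yields figures of the kind shown in Figure 5.8.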
Figure 5.9: Tracking of claims development patterns in real data illustration
A multivariate evolutionary GLM is fitted to the data set. We focus on the Tweedie sub-family of the EDF. For this data set, Tweedie distributions with different power parameters p are used, providing flexible dispersion modelling for the two lines. The mean structure used is

    a_i^{(n)} + r_i^{(n)} log(j) + s_i^{(n)} j + b_{i,1}^{(n)} 1_{j=1} + b_{i,2}^{(n)} 1_{j=2} + h_t^{(n)},    (5.2)

where a_i^{(n)}, r_i^{(n)}, s_i^{(n)}, b_{i,1}^{(n)}, b_{i,2}^{(n)}, h_t^{(n)} are random factors that follow random walk evolutions within their respective time dimensions.

We use 50,000 samples for each time step and initialise the filter using static GLM estimates with the above mean structure. We also examine the change in claims development pattern over time and perform a number of trial runs to select somewhat informative prior distributions for parameters to reduce the degeneracy issue. The filtered (posterior) values of random factors are given in Table G and parameter estimates are given in Table H in Appendix A.2.

The tracking of claims development patterns is provided in Figure 5.9. These plots show the observed claims development pattern for each year, together with the fitted patterns of that year and the previous year. Fitted patterns are calculated using filtered (posterior) estimates of random factors and parameters. These plots show that the particle filter can track changes in claims activity reasonably well overall, especially in the last few years. There are also quite dramatic changes in the claims development pattern in some years, for example from year 1 to 2 and year 2 to 3, which are captured well by the model. Changes within the years from year 3 to 6 are quite rapid, and these are captured by the model to some extent but not fully.
For years 4-6, the development pattern deviates from the shape of a Hoerl curve in the tail, and this is not captured well by the filter.

Using the parameter estimates in Table H, the Pearson correlation coefficient between calendar year factors h_t^{(1)} and h_t^{(2)} is 0.3198. This is the theoretical correlation coefficient calculated using the evolution assumptions of calendar year factors in Equation (2.9). The sample correlation coefficient between the filtered (posterior) values of these factors is 0.7537, with a wide 95% confidence interval (p-value 0.6996); in contrast to the correlation coefficient of 0.6976 for calendar year GLM residuals in Section 5.1.2, this correlation is insignificant. The main difference between the two correlations is the capture of calendar year effects. Since the correlation is now insignificant, it appears that the calendar year effect was the most significant dependence driver in the data, and that our model has explained away most of the dependence in the data.

The goodness-of-fit in this illustration is not as good as in the simulation illustration. This is expected, as the synthetic data set is simulated from a theoretical model, whereas the underlying model that generates the real data set is unknown. This is to say that there may be other factors in the data that are yet to be considered and captured in the model.

Figure 5.10: Heat maps of ratios of observed values to fitted values (top: Accident Benefits (excluding DI), bottom: Accident Benefits (DI only))

Remark 5.1.
The data set used in this illustration is only available at annual time intervals. When more data is available (or when data is available at a finer interval), one could consider back-testing the model to further assess its performance. However, it is also worth noting that the value of reserves observed in back-testing is only one possible outcome generated by the (unknown) underlying mechanism. As a result, we can only assess the predictive power of the model using these observed reserves with a certain level of confidence.
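The mean structure (5.2) and the random walk evolution of its factors can be sketched as follows. This is a minimal Python illustration with hypothetical parameter values and illustrative helper names (`linear_predictor`, `evolve`); in the paper the factors are estimated by particle filtering rather than set by hand.

```python
import math
import random

def linear_predictor(a, r, s, b1, b2, h, j):
    """Log-mean in the spirit of (5.2): a_i + r_i*log(j) + s_i*j
    + b_{i,1}*1{j=1} + b_{i,2}*1{j=2} + h_t, for development year j."""
    return (a + r * math.log(j) + s * j
            + (b1 if j == 1 else 0.0) + (b2 if j == 2 else 0.0) + h)

def evolve(factor, sigma, rng):
    """Random-walk evolution of a factor to the next period."""
    return factor + rng.gauss(0.0, sigma)

rng = random.Random(1)
# Hypothetical starting values and disturbance scale.
a, r, s, b1, b2, h = 6.9, 1.3, -0.8, 0.2, 0.1, 0.5
eta = [linear_predictor(a, r, s, b1, b2, h, j) for j in range(1, 11)]
mu = [math.exp(e) for e in eta]   # cell means under the log link
a_next = evolve(a, 0.05, rng)     # e.g. a evolves to the next accident year
```

The `mu` vector traces one accident year's development pattern; evolving the factors and recomputing it is what produces the year-on-year pattern changes tracked in Figure 5.9.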
Year   AB (excluding DI)       AB (DI only)            Both lines
       Mean        SD          Mean        SD          Mean        SD
1      187.49      193.56      38.30       44.31       225.79      197.85
2      579.92      361.71      216.96      142.92      796.89      388.72
3      1,448.14    698.41      526.54      299.49      1,974.68    764.69
4      2,299.10    1,117.06    940.59      446.71      3,239.70    1,188.88
5      5,504.30    2,482.17    3,695.25    1,644.89    9,199.55    2,806.42
6      13,128.14   5,506.59    5,456.64    2,618.54    18,584.78   6,155.75
7      17,550.89   8,319.55    6,530.34    3,742.70    24,081.23   9,194.94
8      19,898.82   11,057.94   7,026.08    6,163.77    26,924.90   12,297.21
9      31,035.09   20,539.99   14,891.23   24,332.31   45,926.32   32,660.82
Table 5.2: Outstanding claims statistics by accident year
Figure 5.11: Plots of residuals by accident year, development year and calendar year
To forecast outstanding claims in the lower triangles, we utilise samples from the filtering over the upper triangles. These are samples of random factors and parameters, and they are used to project future claims using the framework specification in Section 2. A set of simulated values of future claims can then be obtained, which is used to calculate summary statistics for the total outstanding claims liability. Future calendar year factors are simulated recursively in accordance with their random walk specification.

The means and standard deviations of the total outstanding claims by accident year for each line of business and for the total portfolio are summarised in Table 5.2. Summary statistics of the total outstanding claims distributions are given in Table 5.3 and their kernel densities are given in Figure 5.12. The summary statistics provided include the posterior means, standard deviations and VaRs of the distributions of total outstanding claims for each line, as well as for the total portfolio.

        AB (excluding DI)   AB (DI only)   Both lines
Mean    91,631.90           39,321.94      130,953.84
SD      28,960.41           26,251.88      40,495.76
VaR
Table 5.3: Summary statistics of outstanding claims distributions
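The projection step described above can be sketched as follows, simplified to the mean scale (the paper simulates full Tweedie outcomes, which is skipped here for brevity). All inputs and the function name `simulate_outstanding` are hypothetical.

```python
import math
import random
import statistics

def simulate_outstanding(h_last, sigma_h, future_cells, n_sims=10_000, seed=42):
    """Sketch of the projection step: future calendar year factors follow
    the random walk h_{t+1} = h_t + eps_t, eps_t ~ N(0, sigma_h^2), and
    each future cell contributes exp(eta + h_t) to outstanding claims,
    where eta is the cell's (log-scale) base predictor.  future_cells has
    one entry per future calendar year: the list of base predictors for
    the cells falling in that year."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        h, total = h_last, 0.0
        for cells in future_cells:
            h += rng.gauss(0.0, sigma_h)  # recursive random-walk evolution
            total += sum(math.exp(eta + h) for eta in cells)
        totals.append(total)
    totals.sort()
    var95 = totals[int(0.95 * len(totals)) - 1]  # empirical 95% VaR
    return statistics.mean(totals), statistics.stdev(totals), var95

# Hypothetical inputs: last filtered h_t, its disturbance scale, and base
# predictors for the cells of two future calendar years.
mean_os, sd_os, var95_os = simulate_outstanding(
    0.5, 0.05, [[2.0, 1.0], [1.5]], n_sims=2000)
```

Repeating this over the posterior samples of factors and parameters, rather than a single point estimate, yields predictive distributions of the kind summarised in Tables 5.2 and 5.3.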
The two lines, however, do not have a comonotonic dependence structure, and this allows the insurer to gain a diversification benefit when setting risk margins. We use the following definitions of risk margins and diversification benefits (D.B.), motivated by the regulatory requirements in Australia set by the Australian Prudential Regulation Authority (and other similar regimes around the world, such as Solvency II in Europe):

    Risk margin_χ%[Y] = max{ VaR_χ%[Y] − E[Y] ; (1/2) SD[Y] },    (5.3)

    D.B. = [ (Risk margin_χ%[Y_1] + Risk margin_χ%[Y_2]) − Risk margin_χ%[Y_1 + Y_2] ] / [ Risk margin_χ%[Y_1] + Risk margin_χ%[Y_2] ] × 100%.    (5.4)

The risk margins at the two levels considered, as well as their corresponding diversification benefits, are provided in Table 5.4. It can then be observed that quite significant diversification benefits can be gained as a result of allowing for (non-comonotonic) dependence across lines in the valuation of the total outstanding claims liability.

Figure 5.12: Kernel densities of predictive distributions of total outstanding claims in each line of business and in the aggregate portfolio

              AB (excluding DI)   AB (DI only)   Both lines   Diversification Benefit
Risk Margin
Table 5.4: Risk margin and diversification benefits statistics
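Definitions (5.3) and (5.4) translate directly into code. The sketch below computes both on joint samples of the two lines; the 75% sufficiency level is an assumed illustrative default (in the spirit of APRA's 75% requirement), not a value taken from Table 5.4.

```python
import statistics

def risk_margin(samples, chi=0.75):
    """Risk margin (5.3): max(VaR_chi[Y] - E[Y], 0.5 * SD[Y]), computed
    on a sample from the predictive distribution of outstanding claims Y.
    chi = 0.75 is an assumed illustrative sufficiency level."""
    s = sorted(samples)
    var_chi = s[min(int(chi * len(s)), len(s) - 1)]  # empirical quantile
    return max(var_chi - statistics.mean(s), 0.5 * statistics.stdev(s))

def diversification_benefit(y1, y2, chi=0.75):
    """D.B. (5.4): relative saving from margining the aggregate Y1 + Y2
    instead of the two lines separately.  y1 and y2 must be drawn jointly
    (scenario k of one line paired with scenario k of the other), so that
    y1[k] + y2[k] is a sample of the aggregate."""
    rm1, rm2 = risk_margin(y1, chi), risk_margin(y2, chi)
    rm_sum = risk_margin([a + b for a, b in zip(y1, y2)], chi)
    return (rm1 + rm2 - rm_sum) / (rm1 + rm2)
```

Note the sanity checks implied by (5.4): comonotonic lines yield a benefit of zero, and the benefit grows as the dependence weakens.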
6. Discussion and practical considerations
There are two aspects related to the modelling of dependence between triangles which support the use of our common shock model (as opposed to relying on correlation).

In Section 5.1.2 we observed that the Pearson correlation coefficient between calendar year residuals (about 70%) was significant when including the first year, but became insignificant when that first year was removed. This is probably because there was a structural change, and dealing with this issue is akin to dealing with outliers. When building a model, one then needs to either model the first year separately (which boils down to ignoring that data, as there is not enough of it to model separately), or make a judgmental assumption about correlation. Rather than forcing extreme approaches (include the first year fully or not at all), or requiring subjective input (an assumption about correlation), our model smoothes the potential structural break naturally and in an objective way: as time goes by, what happened in that first year becomes less and less influential.

Furthermore, we know from Section 5.3 that inclusion of calendar year effects in fact wipes away all correlation in the data. This further supports the idea that it is better to try to model the dependence drivers explicitly first, and to use correlation as a dependence model only if and when that is not entirely successful (if there is some unexplained dependence left in the residuals); see also Avanzi, Taylor and Wong (2016b). In our case, the common shock approach worked well.

6.2. Degeneracy issues
The closest model to ours in the recent literature is that of Sims (2011), who used particle filtering in a univariate evolutionary model. Our results appear much more satisfactory than the results for univariate evolutionary models therein. Sims (2011) used the traditional sequential Monte Carlo particle filter, with the variances of the disturbance terms chosen by an initial residual analysis, and noted that the particle filter could not always keep track of the changes; it also suffered from the degeneracy issue. The approach that we propose, however, is an advancement of the traditional particle filter that also incorporates parameter estimation. This allows the variances of the disturbance terms to be updated upon the arrival of new observations. In addition, the particle filter used in this paper is a modification of the auxiliary particle filter. This filter typically places the re-sampling ahead of the evaluation step, whereas the reverse order is typically performed in the traditional particle filter (see also the review in Doucet and Johansen, 2011). The re-sampling step utilises importance weights calculated using the look-ahead likelihood. This aims to reduce the degeneracy issue (Doucet and Johansen, 2011; Creal, 2012; Cappé, Godsill and Moulines, 2007). These features could explain the better performance of our particle filter in tracking changes.

While degeneracy issues are unavoidable in particle filtering with parameter learning, and in particle filtering in general (see, for example, Doucet, Godsill and Andrieu, 2000; Andrieu, Doucet and Tadić, 2005; Carvalho, Johannes, Lopes, Polson et al., 2010a; Creal, 2012), they can be more severe for data of a higher dimension, or when uninformative priors for parameters and initial values of random factors are used (Wang, Li, Sun and Corchado, 2017; Li, Sun, Sattar and Corchado, 2014). However, degeneracy appears to affect mainly the random parameters of the model in the simulation illustration of Section 4.
Confidence intervals of the parameters there are small to the point where they are essentially fixed parameters. While recognition of their randomness in confidence intervals would be preferable, the end result is of a quality equal to that of a model that specifies these as (almost) fixed parameters. There is no sign of particle degeneracy in relation to the actual fixed parameters of the model.
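The auxiliary-particle-filter ordering described above, with re-sampling placed ahead of the evaluation step using a look-ahead likelihood, can be illustrated on a toy Gaussian random-walk model. This is a generic sketch, not the paper's filter, and `apf_step` is a hypothetical name.

```python
import math
import random

def normal_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def apf_step(particles, y_next, sigma_state, sigma_obs, rng):
    """One auxiliary-particle-filter-style step for a toy random-walk
    state x_{t+1} = x_t + eps with observation y ~ N(x, sigma_obs^2).
    A fully-adjusted APF would also correct the final weights by the
    ratio of the actual to the look-ahead likelihood; that second stage
    is omitted here for brevity."""
    # 1. Look-ahead likelihood of the next observation at each particle's
    #    predicted location (the random-walk mean, i.e. the particle itself).
    look_ahead = [normal_pdf(y_next, x, sigma_obs) for x in particles]
    # 2. Re-sample AHEAD of the evaluation step using these weights.
    resampled = rng.choices(particles, weights=look_ahead, k=len(particles))
    # 3. Only then propagate through the state (random-walk) equation.
    return [x + rng.gauss(0.0, sigma_state) for x in resampled]

rng = random.Random(0)
particles = [rng.gauss(0.0, 1.0) for _ in range(1000)]
updated = apf_step(particles, y_next=2.0, sigma_state=0.1, sigma_obs=0.5, rng=rng)
```

Because particles are pre-selected by how well they anticipate the next observation, fewer samples end up with negligible weight, which is the mechanism by which this ordering mitigates degeneracy.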
The particle filter and the dual Kalman filter that we introduce in this paper are accident-year-based: they update estimates of random factors and parameters from one accident year to the next. This utilises the greater availability of data in the first accident year, which helps initialise the filters more accurately. However, given that new information typically arrives by calendar year, it can be better for risk management purposes to have the filters run by calendar years. This, however, creates additional complications due to the lack of data for initialisation. In addition, accident year factors and development year factors typically evolve by rows, which can create challenges for calendar-year-based filters. In this paper we apply a special treatment to calendar year factors in our accident-year-based filter. A similar treatment could be considered for accident year and development year factors in the development of calendar-year-based filters.
Compensation can occur across estimates of random factors, where one factor is consistently overestimated and this is offset by an underestimation of another factor. This issue can arise in any model that incorporates all three factors (accident year, development year and calendar year) due to the collinearity between them. The mis-allocations between these factors can largely offset each other to give an overall reasonable fit. However, caution needs to be taken in cases where changes in claims experience are sudden, and some reasonableness checks can be useful.
It is often desirable to perform back smoothing following filtering to obtain estimates that use all available information. Back smoothing is much easier to accomplish for Gaussian models because of the availability of closed-form estimates. For non-Gaussian models, particle smoothing can be used; however, this is not an easy task due to particle degeneracy (Doucet and Johansen, 2011). This problem is further escalated for the particle filtering with parameter learning algorithm of Section 3.1, as parameters are also incorporated in the on-line estimation. We do not perform back smoothing for the evolutionary GLM framework. Future research could further investigate this aspect, especially with regard to addressing the degeneracy problem.
To assess the size of triangles required for a good calibration, we ran the filter on the simulated data of Section 4, using sub-triangles of dimensions 6, 9 and 12, respectively. The results were then compared with those from the full triangle.

This study shows that the resultant fitted observations can track the actual observations quite closely even when triangles of a small size are used. It is worth noting that even though the random factors r_i^{(n)} and s_i^{(n)} for an undeveloped accident year i can be calibrated reasonably well, the accuracy of estimation would improve as more data is observed for accident year i. This is because these two factors formulate a Hoerl curve which captures the total claims development pattern in accident year i. The number of observations required for reliable estimation is likely to depend on the length of the payment tail.
7. Conclusion
Insurers typically experience changes in their claims experience over time, making the application of static models with deterministic parameters no longer straightforward. We capture this common data feature in a multivariate evolutionary GLM framework. This framework utilises the very popular and rich GLM structure, and hence provides great flexibility in modelling. We extend the traditional GLM framework in loss reserving on two fronts. Firstly, we allow parameters of the traditional GLM framework to evolve, hence enabling changes in claims experience to be captured naturally and in an elegant manner. This helps provide a clear picture of the historical experience. Secondly, we introduce dependence across lines using a common shock approach with an explicit and easy-to-interpret dependence structure. We specifically target calendar year dependence in the specification of our framework. This, however, can be modified easily to incorporate other sources of dependence, such as common accident year effects and common development year effects. The framework also specifies the use of random walks for the evolution of random factors for the sake of simplicity. This could be extended with more complex time series models, but this might come at the cost of more extensive parametrisation.

Together with the development of the multivariate evolutionary GLM framework, we also contribute to the literature with the formulation of two adaptive estimation approaches: a particle filter with parameter learning for the general framework, and a dual Kalman filter for the special case of Gaussian models. These filters are real-time devices that recursively provide posterior estimates of random factors and parameters upon the arrival of new information. They give more weight to more recent data, which increases the likelihood of a more accurate projection of future claims.
In the special structure of reserving data, with its two different time dimensions, the application of a typical adaptive estimation approach is not straightforward. We take this difficulty into account in the development of the filters. Specifically, calendar year factors are treated as fixed effects in the periods where they do not evolve, while we consider the evolution in the accident year dimension.

Theoretical developments are demonstrated using illustrations on a simulated data set and a real data set. The results show that factor estimates can adapt to changes in claims experience reasonably well. Some practical considerations are also raised through these illustrations. A common issue with particle filters is particle degeneracy, where the number of samples with non-negligible weight drops significantly after a few time periods. A mis-allocation across estimates of factors can also be observed in models that consider all three factors (accident year, development year and calendar year) due to the collinearity between them. A careful selection of priors and initial values, and frequent reasonableness checks, may be required to mitigate these issues. Future research could look into new developments of particle filtering for adaptive reserving that can overcome the issues of particle degeneracy.
R codes
Full R codes for Sections 4 and 5 are available upon request by contacting the authors.

Acknowledgements
Results in this paper were presented at a research seminar at UNSW in Sydney (Australia), at The Australasian Actuarial Education and Research Symposium in 2018, at the L Seminars in Lausanne (Switzerland) and at the 23rd International Congress on Insurance: Mathematics and Economics in Munich (Germany) in 2019. The authors are grateful for constructive comments received from colleagues who attended those presentations. The authors are also thankful to the two anonymous reviewers for constructive comments that helped improve the paper.

This research was supported under the Australian Research Council's Linkage (LP130100723, with funding partners Allianz Australia Insurance Ltd, Insurance Australia Group Ltd, and Suncorp Metway Ltd) and Discovery (DP200101859) Projects funding schemes. Furthermore, Phuong Anh Vu acknowledges financial support from a University International Postgraduate Award/University Postgraduate Award and supplementary scholarships provided by the UNSW Business School. The views expressed herein are those of the authors and are not necessarily those of the supporting organisations.
References
Abdallah, A., Boucher, J.P., Cossette, H., 2015. Modeling dependence between loss triangles with hierarchical Archimedean copulas. ASTIN Bulletin 45, 577–599.
Ajne, B., 1994. Additivity of chain-ladder projections. ASTIN Bulletin 24, 311–318.
Alai, D.H., Landsman, Z., Sherris, M., 2016. Multivariate Tweedie lifetimes: The impact of dependence. Scandinavian Actuarial Journal 8, 692–712.
Alai, D.H., Wüthrich, M.V., 2009. Taylor approximations for model uncertainty within the Tweedie exponential dispersion family. ASTIN Bulletin 39, 453–477.
Alpuim, T., Ribeiro, I., 2003. A state space model for run-off triangles. Applied Stochastic Models in Business and Industry 19, 105–120.
Andrieu, C., Doucet, A., Tadić, V.B., 2005. On-line parameter estimation in general state-space models, in: Proceedings of the 44th IEEE Conference on Decision and Control, pp. 332–337.
Atherino, R., Pizzinga, A., Fernandes, C., 2010. A row-wise stacking of the runoff triangle: State space alternatives for IBNR reserve prediction. ASTIN Bulletin 40, 917–946.
Avanzi, B., Taylor, G., Vu, P.A., Wong, B., 2016a. Stochastic loss reserving with dependence: A flexible multivariate Tweedie approach. Insurance: Mathematics and Economics 71, 63–78.
Avanzi, B., Taylor, G., Wong, B., 2018. Common shock models for claim arrays. ASTIN Bulletin 48, 1–28.
Avanzi, B., Taylor, G.C., Wong, B., 2016b. Correlations between insurance lines of business: An illusion or a real phenomenon? Some methodological considerations. ASTIN Bulletin 46, 225–263.
Azam, S.E., Chatzi, E., Papadimitriou, C., 2015. A dual Kalman filter approach for state estimation via output-only acceleration measurements. Mechanical Systems and Signal Processing 60, 866–886.
Barnett, G., Zehnwirth, B., 2000. Best estimates for reserves, in: Proceedings of the Casualty Actuarial Society, pp. 245–321.
Berquist, J.R., Sherman, R.E., 1977. Loss reserve adequacy testing: A comprehensive, systematic approach, in: Proceedings of the Casualty Actuarial Society, pp.
123–184.
Boucher, J.P., Davidov, D., 2011. On the importance of dispersion modeling for claims reserving: An application with the Tweedie distribution. Variance 5, 158–172.
Brydon, D., Verrall, R., 2009. Calendar year effects, claims inflation and the chain-ladder technique. Annals of Actuarial Science 4, 287–301.
Cappé, O., Godsill, S.J., Moulines, E., 2007. An overview of existing methods and recent advances in sequential Monte Carlo, in: Proceedings of the IEEE, pp. 899–924.
Carvalho, C.M., Johannes, M.S., Lopes, H.F., Polson, N.G., et al., 2010a. Particle learning and smoothing. Statistical Science 25, 88–106.
Carvalho, C.M., Johannes, M.S., Lopes, H.F., Polson, N.G., et al., 2010b. Particle learning and smoothing. Statistical Science 25, 88–106.
Chukhrova, N., Johannssen, A., 2017. State space models and the Kalman filter in stochastic claims reserving: Forecasting, filtering and smoothing. Risks 5, 30.
Côté, M.P., Genest, C., Abdallah, A., 2016. Rank-based methods for modeling dependence between loss triangles. European Actuarial Journal 6, 377–408.
Creal, D., 2012. A survey of sequential Monte Carlo methods for economics and finance. Econometric Reviews 31, 245–296.
De Jong, P., 2006. Forecasting runoff triangles. North American Actuarial Journal 10, 28–38.
De Jong, P., 2012. Modeling dependence between loss triangles. North American Actuarial Journal 16, 74–86.
De Jong, P., Zehnwirth, B., 1983. Claims reserving, state-space models and the Kalman filter. Journal of the Institute of Actuaries 110, 157–181.
Dong, A.X.D., Chan, J.S.K., 2013. Bayesian analysis of loss reserving using dynamic models with generalized beta distribution. Insurance: Mathematics and Economics 53, 355–365.
Doucet, A., Godsill, S., Andrieu, C., 2000. On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing 10, 197–208.
Doucet, A.D., Johansen, A.M., 2011. A tutorial on particle filtering and smoothing: 15 years later, in: Handbook of Nonlinear Filtering.
Oxford University Press. volume 12, pp. 656–704.
Durbin, J., Koopman, S.J., 2012. Time series analysis by state space methods. Oxford University Press.
England, P., Verrall, R., 2002. Stochastic claims reserving in general insurance. British Actuarial Journal.
England, P.D., Verrall, R.J., 2001. A flexible framework for stochastic claims reserving, in: Proceedings of the Casualty Actuarial Society, pp. 1–38.
Ghezzi, T.L., 2001. Loss reserving without loss development patterns: Beyond Berquist-Sherman, in: Casualty Actuarial Society E-Forum, pp. 43–104.
Gigante, P., Picech, L., Sigalotti, L., 2013. Claims reserving in the hierarchical generalized linear model framework. Insurance: Mathematics and Economics 52, 381–390.
Gigante, P., Picech, L., Sigalotti, L., 2019. Calendar year effect modeling for claims reserving in HGLM. ASTIN Bulletin 49, 763–786.
Gluck, S.M., Venter, G.G., 2009. Stochastic trend models in casualty and life insurance, in: Enterprise Risk Management Symposium.
Guszcza, J., 2008. Hierarchical growth curve models for loss reserving, in: Casualty Actuarial Society E-Forum, pp. 146–173.
Harvey, A.C., 1990. Forecasting, structural time series models and the Kalman filter. Cambridge University Press.
Jørgensen, B., 1997. The theory of dispersion models, in: Monographs on Statistics and Applied Probability. Chapman & Hall, London. 76.
Kantas, N., Doucet, A., Singh, S.S., Maciejowski, J.M., 2009. An overview of sequential Monte Carlo methods for parameter estimation in general state-space models, in: IFAC Proceedings, pp. 774–785.
Kirschner, G.S., Kerley, C., Isaacs, B., 2002. Two approaches to calculating correlated reserve indications across multiple lines of business, in: Proceedings of the Casualty Actuarial Society, pp. 211–246.
Lamien, B., Rangel Barreto Orlande, H., Antonio Bermeo Varón, L., Leite Queiroga Basto, R., Enrique Eliçabe, G., Silva dos Santos, D., Machado Cotta, R., 2018.
Estimation of the temperature field in laser-induced hyperthermia experiments with a phantom. International Journal of Hyperthermia 35, 279–290.
Li, T., Sun, S., Sattar, T.P., Corchado, J.M., 2014. Fight sample degeneracy and impoverishment in particle filters: A review of intelligent approaches. Expert Systems with Applications 41, 3944–3954.
Liu, J., West, M., 2001. Combined parameter and state estimation in simulation-based filtering, in: Doucet, A., de Freitas, N., Gordon, N. (Eds.), Sequential Monte Carlo Methods in Practice. Springer, pp. 197–223.
McGuire, G., 2007. Building a reserving robot, in: Biennial Convention - Actuaries Institute.
McGuire, G., Taylor, G., Miller, H., 2018. Self-assembling insurance claim models using regularized regression and machine learning. SSRN.
Nelson, L., Stear, E., 1976. The simultaneous on-line estimation of parameters and states in linear systems. IEEE Transactions on Automatic Control 21, 94–98.
Ntzoufras, I., Dellaportas, P., 2002. Bayesian modelling of outstanding liabilities incorporating claim count uncertainty. North American Actuarial Journal 6, 113–125.
Peters, G.W., Shevchenko, P., Wüthrich, M., 2009. Model uncertainty in claims reserving within Tweedie's compound Poisson models. ASTIN Bulletin 39, 1–33.
Polson, N., Sokolov, V., 2015. Bayesian analysis of traffic flow on interstate I-55: The LWR model. The Annals of Applied Statistics 9, 1864–1888.
Renshaw, A.E., 1989. Chain ladder and interactive modelling (Claims reserving and GLIM). Journal of the Institute of Actuaries 116, 559–587.
Rios, M.P., Lopes, H.F., 2013. The extended Liu and West filter: Parameter learning in Markov switching stochastic volatility models, in: Zeng, Y., Wu, S. (Eds.), State-Space Models: Applications in Economics and Finance. Springer, pp. 23–61.
Saluz, A., Gisler, A., 2014. Best estimate reserves and the claims development results in consecutive calendar years. Annals of Actuarial Science 8, 351–373.
Shi, P., Basu, S., Meyers, G.G., 2012.
A Bayesian log-normal model for multivariate loss reserving. North American Actuarial Journal 16, 29–51.
Shi, P., Frees, E.W., 2011. Dependent loss reserving using copulas. ASTIN Bulletin 41, 449–486.
Sims, J., 2011. Evolutionary reserving models - Are particle filters the way to go?, in: GIRO Conference.
Sims, J., 2012. Seeing the bigger picture in claims reserving, in: General Insurance Seminar, Actuaries Institute.
State Insurance Regulatory Authority, 2018. CTP green slip reforms. Website.
Taylor, G., 2000. Loss Reserving: An Actuarial Perspective. Huebner International Series on Risk, Insurance and Economic Security, Kluwer Academic Publishers.
Taylor, G., 2008. Second-order Bayesian revision of a generalised linear model. Scandinavian Actuarial Journal 2008, 202–242.
Taylor, G., McGuire, G., 2009. Adaptive reserving using Bayesian revision for the exponential dispersion family. Variance 3, 105–130.
Taylor, G., McGuire, G., 2016. Stochastic Loss Reserving Using Generalized Linear Models. Volume 3 of CAS Monograph Series. Arlington, USA: Casualty Actuarial Society.
Taylor, G., McGuire, G., Greenfield, A., 2003. Loss reserving: Past, present and future. University of Melbourne Research Paper.
Venter, G.G., Gutkovich, R., Gao, Q., 2019. Parameter reduction in actuarial triangle models. Variance 12, 142–160.
Verrall, R.J., 1989. A state space representation of the chain ladder linear model. Journal of the Institute of Actuaries 116.

A. Appendices
A.1. Simulated data set and estimation results

Table A: Simulated triangle

Table B: Simulated triangle
 (n)   i     a_i^(n)             r_i^(n)             s_i^(n)             h_i^(n)
             True     Estimate   True     Estimate   True     Estimate   True     Estimate
 (1)   1     6.9111   6.9772     1.2867   1.1637    -0.8014  -0.7669     0.5000   0.4500
       2     7.1203   7.0228     1.1831   1.2624    -0.7783  -0.7905     0.5025   0.4583
       3     6.8500   6.8797     1.2921   1.2786    -0.7982  -0.7882     0.4974   0.4569
       4     6.7755   6.8645     1.2128   1.1614    -0.7977  -0.7849     0.4985   0.4503
       5     6.7846   6.7775     1.1308   1.1068    -0.7965  -0.7809     0.4869   0.4636
       6     6.7512   6.7561     1.1458   1.1150    -0.8197  -0.8041     0.4824   0.4706
       7     6.6737   6.6867     1.0900   1.0812    -0.8394  -0.8245     0.4900   0.4667
       8     6.6937   6.8043     1.0746   1.0563    -0.7904  -0.7970     0.4872   0.4399
       9     6.8156   6.8286     1.1102   1.0903    -0.7663  -0.7491     0.4925   0.4729
      10     7.0346   7.1467     1.1100   1.0572    -0.7412  -0.7471     0.4923   0.4398
      11     7.1906   7.2977     1.2651   1.1593    -0.7347  -0.7133     0.5041   0.4394
      12     7.1507   7.2521     1.3664   1.1840    -0.7553  -0.7031     0.5028   0.4336
      13     7.1222   7.2113     1.4539   1.3373    -0.6913  -0.6700     0.4995   0.4447
      14     7.0539   7.0465     1.5131   1.3268    -0.7400  -0.6537     0.4948   0.4600
      15     6.9438   6.9839     1.4693   1.3363    -0.6674  -0.6564     0.4973   0.4403
 (2)   1     7.0908   7.1637     2.0212   1.9650    -0.4343  -0.4275     0.5000   0.4500
       2     6.9679   7.0941     2.0064   1.9393    -0.4422  -0.4376     0.5041   0.4533
       3     6.9701   7.0290     1.9750   1.9282    -0.4368  -0.4256     0.5117   0.4523
       4     6.8030   6.9095     1.9991   1.9200    -0.4769  -0.4635     0.5046   0.4666
       5     6.7276   6.8444     1.9401   1.9091    -0.4614  -0.4644     0.4993   0.4679
       6     6.6243   6.7794     1.9961   1.9261    -0.4351  -0.4328     0.4866   0.4651
       7     6.5513   6.6753     2.0254   2.0316    -0.3946  -0.4090     0.4991   0.4835
       8     6.5454   6.6182     2.0025   2.0224    -0.4018  -0.4114     0.4860   0.4748
       9     6.5506   6.6417     2.0265   2.0118    -0.3959  -0.4076     0.4883   0.4704
      10     6.5493   6.5592     2.0127   2.0912    -0.4040  -0.4261     0.4880   0.4647
      11     6.5508   6.5689     1.9884   2.0683    -0.3751  -0.3991     0.4911   0.4570
      12     6.4854   6.5595     2.0249   2.0520    -0.4015  -0.4111     0.4872   0.4364
      13     6.4746   6.5175     2.1207   2.0716    -0.4055  -0.4130     0.4843   0.4520
      14     6.5039   6.5914     2.1752   2.0839    -0.3687  -0.4179     0.4760   0.4413
      15     6.4701   6.5787     2.2322   2.0846    -0.3650  -0.4203     0.4736   0.4515

Table C: Filtered (posterior) estimates of random factors

φ^(1)  σ_ε^a(1)  σ_ε^r(1)  σ_ε^s(1)  σ_ε^h(1)  λ^(1)  p^(1)  φ^(2)  σ_ε^a(2)  σ_ε^r(2)  σ_ε^s(2)  σ_ε^h(2)  λ^(2)  p^(2)  σ_ε̃^h

Table D: Posterior estimates of parameters
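As a minimal sketch of how the accuracy of the filtered estimates in Table C might be summarised, the snippet below computes the mean absolute error between true and filtered values; the hard-coded pairs are the first five rows of the a_i^(1) column of Table C, and in practice one would load the full table rather than transcribe rows by hand.

```python
# Hypothetical accuracy summary for Table C (not part of the paper's method).
# Each pair is (true value, filtered estimate) of a_i^(1), i = 1..5.
pairs = [
    (6.9111, 6.9772),
    (7.1203, 7.0228),
    (6.8500, 6.8797),
    (6.7755, 6.8645),
    (6.7846, 6.7775),
]

# Mean absolute error of the filtered estimates against the true factors.
mae = sum(abs(t - e) for t, e in pairs) / len(pairs)
print(round(mae, 4))  # → 0.0579
```

The same summary could be produced per factor series (a, r, s, h) and per business line to compare how well the particle learning algorithm tracks each evolving factor.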
A.2. Empirical data set and estimation results
This data set is drawn from Côté, Genest and Abdallah (2016).

Table E: Accident Benefits excluding Disability Income (cumulative claims)