Volterra mortality model: Actuarial valuation and risk management with long-range dependence
arXiv: [q-fin.MF]
Ling Wang ∗ Mei Choi Chiu † Hoi Ying Wong ‡ September 22, 2020
Abstract
While abundant empirical studies support the long-range dependence (LRD) of mortality rates, the corresponding impact on mortality securities is largely unknown due to the lack of appropriate tractable models for valuation and risk management purposes. We propose a novel class of Volterra mortality models that incorporate LRD into the actuarial valuation, retain tractability, and are consistent with the existing continuous-time affine mortality models. We derive the survival probability in closed form by taking into account the historical health records. The flexibility and tractability of the models make them useful in valuing mortality-related products such as death benefits, annuities, longevity bonds, and many others, as well as offering optimal mean-variance mortality hedging rules. Numerical studies are conducted to examine the effect of incorporating LRD into mortality rates on various insurance products and hedging efficiency.
Keywords: Stochastic mortality; Long-range dependence; Affine Volterra processes; Valuation; Mean-variance hedging.

∗ Department of Statistics, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong.
† Department of Mathematics & Information Technology, The Education University of Hong Kong, Tai Po, N.T., Hong Kong.
‡ Corresponding author. Department of Statistics, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong.

Introduction
Actuaries rely heavily on mortality modeling for mortality prediction, actuarial valuation, and risk management. Accurate estimations and predictions of human mortality are the essential building blocks of both insurance contract pricing and pension policy. The first study of this can be dated back to Gompertz (1825).

Arguably the most well-received modern mortality model is the Lee and Carter (1992) model and its extensions using time series analysis. For instance, it has been generalized to multivariate populations with a common trend (Li and Lee, 2005), mortality forecasts using singular value decomposition (Renshaw and Haberman, 2003), joint modeling of different national populations (Antonio et al., 2015) and sub-populations (Villegas and Haberman, 2014), a multi-population stochastic mortality model (Danesi et al., 2015), a Poisson regression model (Brouhns et al., 2002), and stochastic period and cohort effects (Toczydlowska et al., 2017), among others. A key advantage of the Lee-Carter model and its variants is that statistical inferences from time series analysis can be applied or generalized to estimate and test with a real mortality data set.

By incorporating fractionally integrated time series analysis into the Lee-Carter model, Yan et al. (2018) empirically show the existence of long-range dependence (LRD) (also known as long-memory pattern or fractional persistence) across age groups, gender, and countries by using a dataset of 16 countries. When they apply their long-memory mortality model to forecast life expectancies, they find that a mortality model ignoring LRD tends to underestimate life expectancy, which has important implications for pension schemes and funding issues. Yan et al. (2020) further extend the model to incorporate multivariate cohorts and document the existence of LRD. Yaya et al. (2019) show a long-memory pattern in the infant mortality rates of G7 countries.
Delgado-Vences and Ornelas (2019) offer further empirical evidence that mortality rates exhibit LRD using a fractional Ornstein-Uhlenbeck (fOU) process with Italian population data from the 1950 to 2004 period.

Most stochastic mortality models focus on the mortality rate, or equivalently the Poisson intensity rate. We refer to the pioneering work of Milevsky and Promislow (2001), who introduced the Cox model to insurance applications. Biffis (2005) and Biffis and Millossovich (2006) further develop this idea of doubly stochastic mortality models with an affine feature for exploiting analytical tractability in actuarial valuation with both financial and mortality risks. Jevtić et al. (2013) extend it to cohort models, and Wong et al. (2017) introduce continuous-time cointegration into the multivariate mortality rates.

Blackburn and Sherris (2013) advocate the use of continuous-time affine mortality models for longevity pricing and hedging because of their tractability and consistency with the market data. Jevtić and Regis (2019) propose a calibration of multiple-population affine mortality models and demonstrate its empirical use with product price data. However, none of the aforementioned studies provide an analytically tractable dynamic mortality model with the LRD feature.

The primary contribution of this paper is the proposal of a novel class of dynamic stochastic mortality models that simultaneously offer tractable actuarial valuation and the LRD property. As the proposed models are based on Volterra processes, we call them Volterra mortality models. Inspired by the affine Volterra process (Abi Jaber et al., 2019), our model preserves the affine structure for general actuarial valuation but still captures LRD.
In terms of practical contributions, we use the model to derive closed-form solutions for the survival probability, death and survival benefits of insurance contracts, and longevity bonds, and then address the impact of LRD on these insurance products. To the best of our knowledge, the derived formulas constitute the first set of formulas for insurance products that are subject to the LRD feature of mortality rates.

This study also contributes to risk management with LRD mortality rates. We rigorously develop the mean-variance (MV) strategy for hedging longevity risk with a longevity security that is subject to LRD. This latter hedging strategy is highly non-trivial because the Volterra mortality rate is a non-Markovian and non-semimartingale process. Inspired by Han and Wong (2020), we derive the MV optimal hedging with the Volterra mortality models by means of linear-quadratic control with the backward stochastic differential equation (BSDE) framework similar to Wong et al. (2017). In contrast, Han and Wong (2020) solve the MV portfolio problem with rough volatility by constructing an auxiliary process. Our optimal hedging rule shows how to adjust the hedge for LRD of mortality rates.

The rest of this paper is organized as follows. Section 2 introduces the Volterra mortality model based on the doubly stochastic mortality models and explains how the model captures LRD. Section 3 offers some formulas for actuarial valuation. In Section 4, we formulate an optimal hedging problem under the Volterra mortality model and give an explicit solution. To compare the Volterra mortality model with LRD against the Markovian mortality model, numerical studies are conducted for both actuarial valuation and the hedging problem in Section 5. Section 6 gives our concluding remarks. Some details and additional proofs are given in the Appendix.

Consider a filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ where the filtration $\mathbb{F} = \{\mathcal{F}_t : 0 \le t \le T\}$ satisfies the usual properties.
We write $\mathcal{F}_t = \mathcal{G}_t \vee \mathcal{H}_t$, where $\mathcal{H}_t$ represents the flow of information available as time goes by, including the historical processes and the current states, and $\mathcal{G}_t$ contains the information on whether an individual has died. We interpret $\mathbb{P}$ as the physical probability measure. Alternatively, our model can be developed under a pricing measure so that the model parameters are calibrated to the insurance product prices available in the market. This enables actuarial valuation consistent with market prices. However, risk management strategies should be conducted under the physical probability measure. To avoid confusion, we denote the pricing measure by $\mathbb{Q}$ and discuss the relationship between $\mathbb{P}$ and $\mathbb{Q}$ in the next section. For the time being, we focus on the model development under $\mathbb{P}$.

We begin with the classic doubly stochastic mortality models. For simplicity, we consider a group of people with homogeneous features, although individual differences certainly exist within this group. A counting process $N$ is a doubly stochastic process driven by the subfiltration $\mathbb{G} = \{\mathcal{G}_t\}_{t \ge 0}$ of $\mathbb{F}$ and with $\mathbb{G}$-intensity $\mu_t$. Let $\tau$ be the first jump time of the process $N$ with intensity $\mu_t$. In actuarial applications, the process $\{N_t\}_{t \ge 0}$ records the number of deaths at each time $t \ge 0$. For any time $t \ge 0$ and state $\omega \in \Omega$ such that $\tau(\omega) > t$, we have

$$\mathbb{P}(\tau \le t + \Delta \mid \mathcal{F}_t) \cong \mu_t(\omega)\Delta, \tag{1}$$

for a trajectory of $\mu_t(\omega)$ and a fixed $\omega \in \Omega$. Thus, the counting process $N$ associated with $\tau$ becomes an inhomogeneous Poisson process with parameter $\int_0^{\cdot} \mu_s(\omega)\,ds$. In other words, for all $T \ge t \ge 0$ and nonnegative integers $k$,

$$\mathbb{P}(N_T - N_t = k \mid \mathcal{F}_t \vee \mathcal{G}_T) = \frac{\left(\int_t^T \mu_s(\omega)\,ds\right)^k}{k!}\, e^{-\int_t^T \mu_s(\omega)\,ds}.$$

By the law of iterated expectations, the time-$t$ survival probabilities over the time interval $(t, T]$ (for fixed $T \ge t \ge 0$) can be expressed as follows:

$$\mathbb{P}(\tau > T \mid \mathcal{F}_t) = \mathbb{E}\left[ e^{-\int_t^T \mu_s(\omega)\,ds} \,\middle|\, \mathcal{F}_t \right]. \tag{2}$$

If the intensity $\mu_t$ is a constant, then the doubly stochastic process reduces to the homogeneous Poisson process. However, the mortality modeling literature favours a stochastic intensity. Typically, the intensity is modeled through a stochastic differential equation (SDE). For instance, Biffis (2005) and Biffis and Millossovich (2006) postulate a Markovian process such that $\mu_t = f(X_t)$, where $f$ is a continuous function on $\mathbb{R}$,

$$dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t, \tag{3}$$

and $\{W_t\}_{t \ge 0}$ is the standard Brownian motion.

To incorporate LRD into the mortality rate, one can simply replace the Brownian motion in (3) with the fractional Brownian motion. In other words,

$$dX_t = b(X_t)\,dt + \sigma(X_t)\,dW^H_t, \tag{4}$$

where $W^H_t$ is a fractional Brownian motion (fBM) with the Hurst parameter $H \in [0.5, 1)$. For instance, one may take $f(X_t) = h_0 \exp(h_1 t + h_2 X_t)$, for constants $h_0, h_1, h_2 > 0$, together with a fOU process in the form of (4) such that the drift term $b(X_t)$ is a linear function of $X_t$ and $\sigma(X_t) \equiv \sigma$ is a constant. However, the fractional Brownian motion is analytically intractable for actuarial valuation.

We propose a stochastic mortality model incorporating LRD that retains the key advantages of the works of Biffis (2005), Delgado-Vences and Ornelas (2019), and Leonenko et al. (2019). More specifically, we maintain the affine nature of Biffis (2005), reflect LRD with fBM as in Delgado-Vences and Ornelas (2019), and offer explicit expressions for some important Fourier-Laplace functionals generalizing Leonenko et al. (2019) for actuarial valuation. Our model is highly inspired by the affine Volterra processes (Abi Jaber et al., 2019) and is hence called the Volterra mortality model.

In the one-dimensional case, Baudoin and Nualart (2003) show the equivalence between fBM and the Volterra process:

$$W^H_t = c_H \int_0^t (t - s)^{H - 1/2}\,dW(s),$$

where $c_H$ is a constant related to the Hurst parameter $H$, $W$ is the Wiener process, and the integral process on the right-hand side is a standard Volterra process. For simplicity and to be consistent with the literature, we postulate the mortality rate $\mu_t$ of a group:

$$\mu_t = m(t) + \eta X_t, \tag{5}$$

where $m(t)$ is a bounded continuous deterministic function and $\eta$ is a constant. In other words, we require that $f(X_t)$ is a linear function of $X_t$. In addition, $X_t$ follows a stochastic Volterra integral equation (SVIE):

$$X_t = X_0 + \int_0^t K(t - s)\,b(X_s)\,ds + \int_0^t K(t - s)\,\sigma(X_s)\,dW_s, \tag{6}$$

where $W = [W_1, \cdots, W_d]^\top$ is the standard $d$-dimensional Brownian motion under $\mathbb{P}$, and the coefficients $b$ and $\sigma$ are assumed to be continuous. The convolution kernel $K$ satisfies the following condition:

$K \in L^2_{\mathrm{loc}}(\mathbb{R}_+, \mathbb{R})$, $\int_0^h K(t)^2\,dt = O(h^\gamma)$ and $\int_0^T (K(t + h) - K(t))^2\,dt = O(h^\gamma)$ for some $\gamma \in (0, 2]$ and every $T < \infty$.

Although the process $X_t$ in (6) is generally high-dimensional, we would like to illustrate it in a one-dimensional case. Table 1 exhibits some useful kernels $K$ in the one-dimensional case. We obtain the fBM by choosing $K$ as the fractional kernel in Table 1 with a constant $\sigma(X_s)$ and $b = 0$ in (6). Therefore, the Volterra processes can be applied to a wider class of LRD noise terms. Note that the resolvent, or resolvent of the second kind, corresponding to the $K$ shown in Table 1 is defined as the kernel $R$ such that $K * R = R * K = K - R$. The convolutions $K * R$ and $R * K$, with $K$ a measurable function on $\mathbb{R}_+$ and $R$ a measure on $\mathbb{R}_+$ of locally bounded variation, are defined by

$$(K * R)(t) = \int_{[0,t]} K(t - s)\,R(ds), \qquad (R * K)(t) = \int_{[0,t]} R(ds)\,K(t - s)$$

for $t > 0$.

Remark 1.
According to Biffis (2005), the deterministic function $m(t)$ in (5) may represent (i) a best-estimate assumption on $\mu$ enforcing unbiased expectations about the future based on the available information, (ii) a pricing demographic basis, or (iii) an available mortality table for a population of insureds. In Section 5, we calibrate $m(t)$ to the table SIM92, a period table usually employed to price assurances.

$$
\begin{array}{l|cccc}
 & \text{Constant} & \text{Fractional} & \text{Exponential} & \text{Gamma} \\ \hline
K(t) & c & c\,\dfrac{t^{\alpha-1}}{\Gamma(\alpha)} & c\,e^{-\lambda t} & c\,e^{-\lambda t}\,\dfrac{t^{\alpha-1}}{\Gamma(\alpha)} \\
R(t) & c\,e^{-ct} & c\,t^{\alpha-1}E_{\alpha,\alpha}(-ct^{\alpha}) & c\,e^{-\lambda t}e^{-ct} & c\,e^{-\lambda t}\,t^{\alpha-1}E_{\alpha,\alpha}(-ct^{\alpha})
\end{array}
$$

Table 1: Examples of kernel function $K$ and the corresponding resolvent $R$. Here $E_{\alpha,\beta}(z) = \sum_{n=0}^{\infty} \frac{z^n}{\Gamma(\alpha n + \beta)}$ denotes the Mittag-Leffler function.

In addition, when the convolution kernel $K$ is set to a constant $c$ in (6), $X_t$ reduces to the solution of an SDE. Furthermore, once $b(X_t)$ is linear in $X_t$ and $\sigma(X_t)$ satisfies a certain affine property, our model in (6) becomes the affine stochastic mortality model of Biffis (2005). The possibly high-dimensional $X_t$ enables us to also incorporate multi-factor mortality modeling. However, we would like to highlight that the Volterra process in (6) is generally a non-Markovian and non-semimartingale process. The non-Markovian nature is obvious because the integrals in the SVIE take the whole realized sample path into account. The non-semimartingale feature is reflected by the fact that the time variable $t$ appears in both the integral limit and the kernel function, so that the stochastic integral cannot be treated as an ordinary Itô integral in $t$.

Fortunately, Abi Jaber et al. (2019) show that it is still possible to maintain the affine nature within (6). Let $a(x) = \sigma(x)\sigma(x)^\top$ be the covariance matrix.

Definition 1.
The SVIE (6) is called an affine process (Abi Jaber et al., 2019) if

$$a(x) = A^0 + x_1 A^1 + \cdots + x_d A^d, \qquad b(x) = b^0 + x_1 b^1 + \cdots + x_d b^d,$$

for some $d$-dimensional symmetric matrices $A^i$ and vectors $b^i$. For simplicity, we set $B = (b^1, \cdots, b^d)$ and $A(u) = (uA^1u^\top, \cdots, uA^du^\top)$ for any row vector $u \in \mathbb{C}^d$.

To draw insights from Definition 1, consider the one-dimensional case. When $b(x) = b^0 - b^1 x$, a linear function of $x$, and $a(x)$ is a constant, (6) is known as the Volterra type of the Vasicek (VV) model, which reduces to the classic Vasicek model by taking a constant kernel or, equivalently, $H = 1/2$. When $b(x)$ is linear in $x$ and $a(x)$ is directly proportional to $x$, our model in (6) reduces to the Volterra version of the CIR (VCIR) model.
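To make the VV special case concrete, the SVIE (6) can be discretized with a left-point Euler rule in which every past increment is re-weighted by the kernel at each step. The sketch below is illustrative only and is not the numerical method of this paper: the function names, the scalar coefficients `b0`, `b1`, `sigma` (standing for the drift $b(x) = b^0 - b^1 x$ and a constant volatility), and the choice of the fractional kernel from Table 1 with $c = 1$ are assumptions made for the example.

```python
import math
import random


def fractional_kernel(t, alpha):
    """Fractional kernel K(t) = t^(alpha-1) / Gamma(alpha); alpha = 1 gives K = 1."""
    return t ** (alpha - 1.0) / math.gamma(alpha)


def simulate_volterra_vasicek(x0, b0, b1, sigma, alpha, horizon, n_steps, seed=0):
    """Left-point Euler sketch of the one-dimensional Volterra Vasicek SVIE.

    X_{t_i} = X_0 + sum_{j<i} K(t_i - t_j) * [(b0 - b1 X_{t_j}) dt + sigma dW_j].
    Returns the simulated path and the Brownian increments that were used.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    dw = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_steps)]
    path = [x0]
    increments = []
    for i in range(1, n_steps + 1):
        # Increment generated on [t_{i-1}, t_i), coefficients frozen at t_{i-1}.
        increments.append((b0 - b1 * path[i - 1]) * dt + sigma * dw[i - 1])
        # The whole history is re-weighted by the kernel: this is what makes
        # the process non-Markovian when alpha != 1.
        x_i = x0 + sum(
            fractional_kernel((i - j) * dt, alpha) * increments[j]
            for j in range(i)
        )
        path.append(x_i)
    return path, dw


if __name__ == "__main__":
    path, _ = simulate_volterra_vasicek(
        x0=0.02, b0=0.004, b1=0.1, sigma=0.01, alpha=1.3, horizon=10.0, n_steps=200
    )
    print(path[-1])
```

With `alpha = 1` the kernel is identically one and the scheme collapses to the usual Euler recursion for the classic Vasicek model, which gives a simple sanity check. The cost is quadratic in the number of steps because the full history is revisited at every step.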
Although we focus on mortality modeling, actuarial valuation needs to specify the dynamics of the risk-free interest rate. We simply adopt a Markov affine model for the interest rate. Specifically, we adopt the short rate process $r$ that satisfies $\int_0^t |r_s|\,ds < \infty$ for $t \ge 0$, and we define the return of a risk-less asset as $\exp\left(\int_0^t r_s\,ds\right)$ for a unit dollar investment at time 0. In addition, the interest rate process is driven by the Markov affine process $Z$ in $\mathbb{R}^k$:

$$dZ_t = \tilde{b}(Z_t)\,dt + \tilde{\sigma}(Z_t)\,dW'_t, \tag{7}$$

where $W'$ is a $k$-dimensional standard Brownian motion. The coefficients $\tilde{b}(Z_t)$ and $\tilde{a}(Z_t) = \tilde{\sigma}(Z_t)\tilde{\sigma}^\top(Z_t)$ have affine dependence on $Z_t$ once they satisfy Definition 1 with the dimension $d$ replaced by $k$. Hence, the Markov affine feature coincides with the definition of a Markov affine process in Duffie et al. (2003). Furthermore, the short rate is $r_t \doteq r(t, Z_t) = \lambda_0(t) + \lambda_1(t) \cdot Z_t$, which is an affine function of $Z_t$ with coefficients $\lambda_0(t)$ and $\lambda_1(t)$ being bounded continuous functions on $[0, \infty)$. By the affine processes in Duffie et al. (2003) and Filipović (2005), at time $t$, we have

$$B(t, T) = \mathbb{E}\left[ e^{-\int_t^T r(s, Z_s)\,ds} \,\middle|\, \mathcal{F}_t \right] = e^{\tilde{\alpha}(t,T) + \tilde{\beta}(t,T) \cdot Z_t}, \tag{8}$$

where the functions $\tilde{\alpha}(\cdot, T)$ and $\tilde{\beta}(\cdot, T)$ are uniquely solved from the ordinary differential equations (ODEs) in Appendix A with boundary conditions $\tilde{\alpha}(T, T) = 0$ and $\tilde{\beta}(T, T) = 0$. If the interest rate model in (7) is defined under the pricing measure, i.e., $\mathbb{P} = \mathbb{Q}$, then the quantity $B(t, T)$ represents the price of a unit zero coupon bond.

We demonstrate the tractability of the proposed Volterra mortality model in actuarial valuation. Specifically, we derive closed-form solutions to the survival probability and prices of some standard life insurance products. The following theorem is the building block of the actuarial valuation.
Theorem 1.
If the mortality rate $\mu_t$ follows (5) and (6) and has the affine structure specified in Definition 1, then, for any constants $c_1$ and $c_2$ and $T > t$, we have

$$\mathbb{E}\left[ e^{-\int_t^T \mu_s\,ds}\,(c_1 + c_2\,\mu_T) \,\middle|\, \mathcal{F}^X_t \right] = c_1\,g(t, T) - c_2\,\frac{\partial g(t, T)}{\partial T}, \tag{9}$$

where

$$g(t, T) = e^{-\int_0^T m(s)\,ds}\, e^{\int_0^t \mu_s\,ds}\, \exp(Y_t(T)),$$

$$Y_t(T) = Y_0(T) + \int_0^t \psi(T - s)\,\sigma(X_s)\,dW_s - \frac{1}{2}\int_0^t \psi(T - s)\,a(X_s)\,\psi(T - s)^\top ds, \tag{10}$$

$$Y_0(T) = \int_0^T \left( -\eta X_0 + \psi(s)\,b(X_0) + \frac{1}{2}\,\psi(s)\,a(X_0)\,\psi(s)^\top \right) ds,$$

and $\psi \in L^2([0, T], \mathbb{C}^d)$ solves the Riccati-Volterra equation:

$$\psi = \left( -\eta + \psi B + \frac{1}{2} A(\psi) \right) * K, \tag{11}$$

with $A(\cdot)$ appearing in Definition 1. In addition, $Y_t(T)$ has an alternative expression:

$$Y_t(T) = -\eta \int_0^T \mathbb{E}[X_s \mid \mathcal{F}_t]\,ds + \frac{1}{2}\int_t^T \psi(T - s)\,a(\mathbb{E}[X_s \mid \mathcal{F}_t])\,\psi(T - s)^\top ds, \tag{12}$$

where

$$\mathbb{E}[X_T \mid \mathcal{F}_t] = \left( \mathrm{id} - \int_0^T R_B(s)\,ds \right) X_0 + \int_0^T E_B(T - s)\,b^0(s)\,ds + \int_0^t E_B(T - s)\,\sigma(X_s)\,dW_s \tag{13}$$

with $\mathrm{id}$ being the identity matrix, $R_B$ the resolvent of $-KB$, and $E_B = K - R_B * K$.

Proof. See Appendix A.
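In the one-dimensional VV case, where $a(x) \equiv \sigma^2$ and $b(x) = b^0 - b^1 x$, the Riccati-Volterra equation (11) becomes the linear Volterra equation $\psi = (-\eta - b^1\psi) * K$, and the time-0 survival probability in Theorem 1 is $g(0,T) = e^{-\int_0^T m(s)ds}\exp(Y_0(T))$. The following is a minimal numerical sketch under those assumptions, with $m \equiv 0$, a real-valued $\psi$, the fractional kernel of Table 1 with $c = 1$, and a simple left-rectangle rule for the convolution. The function names and the discretization are illustrative choices, not the scheme used in this paper.

```python
import math


def fractional_kernel(t, alpha):
    """Fractional kernel K(t) = t^(alpha-1) / Gamma(alpha); alpha = 1 gives K = 1."""
    return t ** (alpha - 1.0) / math.gamma(alpha)


def solve_riccati_volterra_vv(eta, b1, alpha, horizon, n_steps):
    """Solve psi = (-eta - b1 * psi) * K on a uniform grid (left-rectangle rule).

    psi(0) = 0 because the convolution over an empty interval vanishes.
    """
    dt = horizon / n_steps
    psi = [0.0]
    for i in range(1, n_steps + 1):
        psi.append(
            sum(
                fractional_kernel((i - j) * dt, alpha) * (-eta - b1 * psi[j]) * dt
                for j in range(i)
            )
        )
    return psi


def log_survival_at_zero(x0, eta, b0, b1, sigma, alpha, horizon, n_steps):
    """Y_0(T) of Theorem 1 for the VV model with m = 0 (trapezoidal rule in s)."""
    dt = horizon / n_steps
    psi = solve_riccati_volterra_vv(eta, b1, alpha, horizon, n_steps)
    integrand = [
        -eta * x0 + p * (b0 - b1 * x0) + 0.5 * sigma ** 2 * p * p for p in psi
    ]
    return dt * (0.5 * integrand[0] + sum(integrand[1:-1]) + 0.5 * integrand[-1])


if __name__ == "__main__":
    for alpha in (1.0, 1.3):
        y0 = log_survival_at_zero(0.03, 1.0, 0.01, 0.5, 0.01, alpha, 5.0, 500)
        print(alpha, math.exp(y0))  # time-0 survival probability to T = 5
```

For `alpha = 1` the recursion is the explicit Euler scheme for the ordinary Riccati equation, so the output can be compared with the classic Vasicek closed form as a consistency check. For `alpha > 1` the same code gives the LRD counterpart.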
Remark 2.
The partial derivative $\frac{\partial g(t,T)}{\partial T}$ does not admit a closed-form solution in general because the function $g(t, T)$ depends on $Y_t(T)$, which depends on $T$ through the $\psi$ solved from the Riccati-Volterra equation (11). Fortunately, the partial derivative appears in insurance products related to the death benefit through an integration. We can then avoid computing it by means of integration by parts.

We highlight that the expression in (10) implies that $Y_t(T)$ is a semimartingale, because all of the integrands in (10) are independent of $t$. This is important and interesting because it implies that insurance product prices can be expressed as SDEs even though the mortality rate with LRD cannot. This enables us to construct a hedging strategy for longevity risk using longevity securities in an LRD mortality environment, indicating the importance of longevity securitization. For the time being, we apply Theorem 1 to obtain the survival probability of the Volterra mortality model in closed form.

Corollary 1. (Survival Probability) Under the Volterra mortality model in (5), (6), and Definition 1, for any $t < T$, the survival probability reads

$$\mathbb{P}(\tau > T \mid \mathcal{F}_t) = \mathbb{E}\left[ e^{-\int_t^T \mu_s\,ds} \,\middle|\, \mathcal{F}_t \right] = g(t, T) = e^{-\int_0^T m(s)\,ds + \int_0^t \mu_s\,ds}\, \exp(Y_t(T)), \tag{14}$$

where $Y_t(T)$ is defined in (10) or, equivalently, (12).

Proof. The result follows by taking $c_1 = 1$ and $c_2 = 0$ in Theorem 1.

The survival probability in Corollary 1 captures LRD because it depends on the whole historical path of the mortality rate. This is reflected in the terms $e^{-\int_0^T m(s)\,ds + \int_0^t \mu_s\,ds}$ and $Y_t(T)$. However, when comparing our survival probability with LRD with that of the corresponding Markovian mortality model, we find them consistent.
Consider the case of the fractional kernel $K(t) = \frac{t^{\alpha-1}}{\Gamma(\alpha)}\,\mathrm{id}$, where $\alpha = H + 1/2$ and $H$ is the Hurst parameter. The process $X_t$ becomes

$$X_t = X_0 + \lambda \int_0^t \frac{(t - s)^{\alpha-1}}{\Gamma(\alpha)}\,(\theta - X_s)\,ds + \int_0^t \frac{(t - s)^{\alpha-1}}{\Gamma(\alpha)}\,\sigma(X_s)\,dW_s. \tag{15}$$

When $\alpha = 1$, $K(t) \equiv \mathrm{id}$ and

$$dX_t = \lambda(\theta - X_t)\,dt + \sigma(X_t)\,dW_t,$$

which is the Vasicek mortality rate model for a constant $\sigma(X_t)$ and the CIR model for $\sigma(X_t) = \sigma\sqrt{X_t}$. Both are investigated by Biffis (2005). In such a situation, a part of the $Y_t(T)$ in (10) cancels with $\int_0^t \mu_s\,ds$, and the Riccati-Volterra equation (11) reduces to the ordinary Riccati equation. This makes our solution the same as those in Biffis (2005) for $\alpha = 1$ or $H = 1/2$. However, once $\alpha > 1$, the process $X_t$ has the LRD feature. The empirical study in Yan et al. (2018) shows that the survival probability is underestimated when LRD is not taken into account.

To streamline the presentation, we assume that mortality rates are independent of the interest rate. Although this assumption could be considered mathematically restrictive, it is a common assumption in the actuarial and insurance literature. Two basic payoffs in insurance contracts are the survival benefit and the death benefit.

Let $C_T$ be a bounded random payoff for a survivor at time $T$, independent of the mortality. The time-$t$ fair value of the survival benefit $\mathrm{SB}_t(C_T; T)$ of the terminal amount $C_T$, with $0 \le t \le T$, under the pricing measure $\mathbb{Q}$ is given by

$$\mathrm{SB}_t(C_T; T) = \mathbf{1}_{\{\tau > t\}}\, \mathbb{E}^{\mathbb{Q}}\left[ e^{-\int_t^T r_s\,ds}\, C_T \,\middle|\, \mathcal{G}^Z_t \right] \mathbb{E}^{\mathbb{Q}}\left[ e^{-\int_t^T \mu_s\,ds} \,\middle|\, \mathcal{G}^X_t \right]. \tag{16}$$

To draw some insights from (16), let us consider the situation in which the mortality model of (5) and (6) and the interest rate process of (7) are constructed under the pricing measure $\mathbb{Q}$ or, equivalently, that $\mathbb{P} = \mathbb{Q}$ in Section 2. We refer to the results obtained under such an assumption as the baseline case in this paper, and the corresponding valuation becomes simple.

Proposition 1. (Survival Benefit: The Baseline Valuation.) If $\mathbb{P} = \mathbb{Q}$ and the mortality and interest rate are independent, then the Volterra mortality model of (5), (6), and Definition 1 and the affine interest rate model imply that

$$\mathrm{SB}_t(C_T; T) = \mathbf{1}_{\{\tau > t\}}\, B(t, T)\, \mathbb{E}^{\mathbb{Q}_T}\left[ C_T \mid \mathcal{G}^Z_t \right] g(t, T),$$

where $g(t, T)$ is presented in Theorem 1, $B(t, T)$ is the zero coupon bond price in (8), and $\mathbb{Q}_T$ is the forward pricing measure:

$$\left. \frac{d\mathbb{Q}_T}{d\mathbb{Q}} \right|_{\mathcal{F}_t} = \exp\left( -\frac{1}{2}\int_0^t \left| \tilde{\beta}(u, T)\,\tilde{\sigma}(Z_u) \right|^2 du - \int_0^t \tilde{\beta}(u, T)\,\tilde{\sigma}(Z_u)\,dW'_u \right).$$

Proof.
By Corollary 1,

$$\mathbb{E}^{\mathbb{Q}}\left[ e^{-\int_t^T \mu_s\,ds} \,\middle|\, \mathcal{G}^X_t \right] = g(t, T).$$

By the affine short-rate model (7) and Equation (8), we have

$$dB(t, T) = B(t, T)\,r_t\,dt - B(t, T)\,\tilde{\beta}(t, T)\,\tilde{\sigma}(Z_t)\,dW'_t,$$

which implies that

$$1 = B(T, T) = B(t, T)\, \exp\left( \int_t^T \left( r_u - \frac{1}{2}\left| \tilde{\beta}(u, T)\,\tilde{\sigma}(Z_u) \right|^2 \right) du - \int_t^T \tilde{\beta}(u, T)\,\tilde{\sigma}(Z_u)\,dW'_u \right).$$

Hence,

$$e^{-\int_t^T r_s\,ds} = B(t, T)\, \exp\left( -\frac{1}{2}\int_t^T \left| \tilde{\beta}(u, T)\,\tilde{\sigma}(Z_u) \right|^2 du - \int_t^T \tilde{\beta}(u, T)\,\tilde{\sigma}(Z_u)\,dW'_u \right).$$

An application of the Girsanov theorem shows that

$$\mathbb{E}^{\mathbb{Q}}\left[ e^{-\int_t^T r_s\,ds}\, C_T \,\middle|\, \mathcal{G}^Z_t \right] = B(t, T)\, \mathbb{E}^{\mathbb{Q}_T}\left[ C_T \mid \mathcal{G}^Z_t \right],$$

where the forward measure $\mathbb{Q}_T$ is presented in the Proposition.

Another important basic payoff is the death benefit. Let $C_t$ be a bounded $\mathcal{G}^Z$-predictable process, representing a cash flow stream independent of the mortality rate. Then, the time-$t$ fair value of the death benefit with a cash flow stream $C_t$, payable in case the insured dies before time $T$ and $0 \le t \le T$, is given by

$$\mathrm{DB}_t(C_\tau; T) = \mathbf{1}_{\{\tau > t\}} \int_t^T \mathbb{E}^{\mathbb{Q}}\left[ e^{-\int_t^u r_s\,ds}\, C_u \,\middle|\, \mathcal{G}^Z_t \right] \mathbb{E}^{\mathbb{Q}}\left[ e^{-\int_t^u \mu_s\,ds}\, \mu_u \,\middle|\, \mathcal{G}^X_t \right] du.$$

Then, we also have an explicit baseline valuation formula for the death benefit.
Proposition 2. (Death Benefit: The Baseline Valuation.) If $\mathbb{P} = \mathbb{Q}$ and the mortality and interest rate are independent, then the Volterra mortality model of (5), (6), and Definition 1 and the affine interest rate model imply that

$$\mathrm{DB}_t(C_T; T) = -\mathbf{1}_{\{\tau > t\}} \int_t^T B(t, u)\, \mathbb{E}^{\mathbb{Q}_u}\left[ C_u \mid \mathcal{G}^Z_t \right] \frac{\partial g(t, u)}{\partial u}\,du,$$

where $Y_t(u)$ is defined in (10), $B(t, T)$ in (8), $g(t, T)$ in Theorem 1, and the forward pricing measure $\mathbb{Q}_u$ in Proposition 1.

Proof. The proof is similar to that of Proposition 1 except for the second expectation appearing in the representation of $\mathrm{DB}_t(C_\tau; T)$. By Theorem 1, it is clear that

$$\mathbb{E}^{\mathbb{Q}}\left[ e^{-\int_t^u \mu_s\,ds}\, \mu_u \,\middle|\, \mathcal{G}^X_t \right] = -\frac{\partial g(t, u)}{\partial u}.$$

Applying integration by parts to DB in Proposition 2 yields an alternative expression:

$$\mathrm{DB}_t(C_T; T) = -\mathbf{1}_{\{\tau > t\}} \left\{ B(t, T)\, \mathbb{E}^{\mathbb{Q}_T}\left[ C_T \mid \mathcal{G}^Z_t \right] g(t, T) - \mathbb{E}^{\mathbb{Q}_t}\left[ C_t \mid \mathcal{G}^Z_t \right] - \int_t^T \frac{\partial \left( B(t, u)\, \mathbb{E}^{\mathbb{Q}_u}\left[ C_u \mid \mathcal{G}^Z_t \right] \right)}{\partial u}\, g(t, u)\,du \right\}. \tag{17}$$

In this way, as the interest rate model follows the Markovian affine model, the partial derivative term in (17) admits a closed-form solution in many cases, and we avoid computing the $T$-partial derivative of $g(t, T)$, which is considerably more complicated.

These formulas for survival and death benefits may still be considered abstract, so we apply them to some concrete insurance or pension products.

Longevity Bond: Consider a unit zero-coupon longevity bond which pays \$1 times $e^{-\int_t^T \mu_s\,ds}$, the percentage of survivors in a population during $t$ to $T$. Blake et al. (2006) show that the longevity bond takes the form

$$B_L(t, T) = \mathbb{E}^{\mathbb{Q}}\left[ e^{-\int_t^T (r_s + \mu_s)\,ds} \,\middle|\, \mathcal{F}_t \right].$$
Under the Volterra mortality model with LRD, Proposition 1 immediately implies that

$$B_L(t, T) = B(t, T)\,g(t, T),$$

by setting $C_T \equiv 1$.

Annuity: Consider a $t'$-year deferred annuity involving a continuous payment of an indexed benefit from time $t$ onwards, conditional on survival of the policyholder at that time. Suppose that the payoff is a unit amount each year. Denote by $x^*$ the maximum age humans can live. The fair value of such an annuity is given by

$$\mathrm{AN}_t(t') = \sum_{h=t'}^{x^*-t-1} \mathrm{SB}_t(1; t + h) = \sum_{T=t+t'}^{x^*-1} B(t, T)\,g(t, T) = \sum_{T=t+t'}^{x^*-1} e^{\tilde{\alpha}(t,T) + \tilde{\beta}(t,T) Z_t}\, e^{-\int_0^T m(s)\,ds + \int_0^t \mu_s\,ds}\, \exp(Y_t(T)), \tag{18}$$

where $Y_t(T)$ is defined in (10) and $\tilde{\alpha}(t, T)$ and $\tilde{\beta}(t, T)$ are as in (8).

Assurances:
Consider an assurance guaranteeing a unit amount benefit in case of death in the period $(t, T]$. By setting $C \equiv 1$ in (17), the fair value is

$$\mathrm{AS}_t(T) = 1 - B(t, T)\,g(t, T) + \int_t^T \frac{\partial B(t, u)}{\partial u}\, g(t, u)\,du,$$

where $B(t, T)$ is defined in (8) and $g(t, T)$ in Theorem 1.

Endowment:
Consider an endowment, given survival at time $t$, with maturity time $T$, which includes a survival benefit $C_1$ given survival at time $T$ and a death benefit $C_2$ in case of death in the period $(t, T]$. $C_1$ and $C_2$ are constants. By Propositions 1 and 2 and (17), the fair value of such an endowment is given by

$$\mathrm{EN}^T_t(C_1, C_2) = \mathrm{SB}_t(C_1; T) + \mathrm{DB}_t(C_2; T) = (C_1 - C_2)\,B(t, T)\,g(t, T) + C_2 \left( 1 + \int_t^T \frac{\partial B(t, u)}{\partial u}\, g(t, u)\,du \right),$$

where $B(t, T)$ is defined in (8) and $g(t, T)$ in Theorem 1.

Although Propositions 1 and 2 facilitate the model development under the pricing measure and the calibration to market prices of insurance products, in practice an insurer may not have sufficient market prices for such calibration. In addition, risk management requires the connection between the physical and pricing measures, as demonstrated in the next section. Therefore, we present two possible ways to link the measures $\mathbb{P}$ and $\mathbb{Q}$ with limited observed prices. For the time being, we focus on the situation in which the Volterra mortality model is estimated using a historical mortality table and hence built under the physical measure $\mathbb{P} \ne \mathbb{Q}$.

The first approach commonly used to identify a pricing measure in the actuarial literature is the Esscher transform. Chuang and Brockett (2014) apply the Esscher transform to the mortality rate to find a related martingale measure for pricing longevity derivatives. Wang et al. (2019) also use the Esscher transform for pricing longevity derivatives based on an improved Lee-Carter model. Although the mortality rate $\mu_t$ is non-Markovian and non-semimartingale under our framework, the advantage is that we have an explicit Laplace-Fourier functional representation in Theorem 1. For a random variable $\gamma$ with a well-defined moment-generating function (MGF) under $\mathbb{P}$, an equivalent probability measure $\mathbb{Q}(\theta)$ derived from the Esscher transform with parameter $\theta$ is defined as

$$\frac{d\mathbb{Q}(\theta)}{d\mathbb{P}} = \frac{e^{\theta\gamma}}{\mathbb{E}[e^{\theta\gamma}]}. \tag{19}$$

By setting $c_1 = 1$ and $c_2 = 0$ in Theorem 1, the MGF for the random variable $-\int_t^T \mu_s\,ds$ is well-defined and can be obtained in an explicit form. Specifically, as we assume $\mu_t = m(t) + \eta X_t$, the MGF is defined as

$$M(\theta_T) = \mathbb{E}\left[ e^{-\theta_T \int_t^T \mu_s\,ds} \right],$$

which corresponds to the $g(t, T)$ in Theorem 1 with the parameters $m(t)$ and $\eta$ replaced with $\theta_T m(t)$ and $\theta_T \eta$ for the constant $\theta_T$ and a fixed $T$. For instance, suppose we observe a risk-free zero coupon bond and a zero coupon longevity bond with the same maturity. Then, we can deduce the synthetic value of

$$\mathbb{E}^{\mathbb{Q}(\theta_T)}_t\left[ e^{-\int_t^T \mu_s\,ds} \right] = \frac{\mathbb{E}_t\left[ e^{-(\theta_T + 1)\int_t^T \mu_s\,ds} \right]}{\mathbb{E}_t\left[ e^{-\theta_T \int_t^T \mu_s\,ds} \right]} = \frac{M(\theta_T + 1)}{M(\theta_T)}. \tag{20}$$

Although the left-hand quantity is deduced from market prices, $M(\theta_T)$ has a closed-form solution from our model through Theorem 1. One can then calibrate $\theta_T$ to the term structure of longevity bonds, or longevity bond prices for different maturities $T$, after estimating the physical model parameters, including the LRD feature, using historical data.

From (20), when $\theta_T = 0$, the longevity bond is priced under $\mathbb{P}$ and our previous valuation formulas hold. For a nonzero $\theta_T$, a slight adjustment can be made through (20) as the MGF is explicitly known.

Although the Esscher transform provides us with a powerful and convenient framework to identify a pricing measure, it does not offer us an explicit stochastic process under the pricing measure. When we perform a risk management strategy, we need the stochastic process of the mortality rate under both $\mathbb{P}$ and $\mathbb{Q}$. It is desirable that the Volterra mortality model retains the affine nature in Definition 1. Therefore, we propose the following affine retaining transform based on the Girsanov theorem.

Definition 2.
Given an affine SVIE of (6) satisfying Definition 1, an affine retaining transform for measure change is based on shifting the Wiener process as follows:

$$dW^{\mathbb{Q}}_t = dW_t - \sigma(X_t)^\top \varphi(t)\,dt,$$

for a deterministic function $\varphi(t) \in \mathbb{R}^d$ satisfying

$$\mathbb{E}_t\left[ e^{\frac{1}{2}\int_0^T \left| \sigma(X_t)^\top \varphi(t) \right|^2 dt} \right] < \infty. \tag{21}$$

Under Definition 2, we identify a pricing measure $\mathbb{Q}$ equivalent to $\mathbb{P}$:

$$\frac{d\mathbb{Q}}{d\mathbb{P}} = \exp\left( -\frac{1}{2}\int_0^t \left| \sigma(X_s)^\top \varphi(s) \right|^2 ds + \int_0^t \varphi(s)^\top \sigma(X_s)\,dW_s \right),$$

where $\varphi(t)$ is calibrated to observed prices. In addition, the mortality process $\mu_t = m(t) + \eta X_t$ in (6) under $\mathbb{Q}$ has $X_t$ changed to

$$X_t = X_0 + \int_0^t K(t - s)\left( b(X_s) + a(X_s)\,\varphi(s) \right) ds + \int_0^t K(t - s)\,\sigma(X_s)\,dW^{\mathbb{Q}}_s, \tag{22}$$

where $b(X_s) + a(X_s)\varphi(s)$ and $a(X_s)$ still satisfy the affine nature in Definition 1. Hence, the pricing formulas of Propositions 1 and 2 remain the same except that $b(X_s)$ is replaced with $b(X_s) + a(X_s)\varphi(s)$ once the affine retaining transform in Definition 2 is adopted.

Remark 3.
Although the Esscher and affine retaining transforms presented in Sections 3.2 and 3.3 are applied to the Volterra mortality model, these techniques have been widely used in the actuarial science literature, including the measure change with the affine interest rate models. Therefore, we do not repeat the detailed case for the interest rate. We mention them to highlight the advantage of the proposed LRD mortality model in the sense of calibrating to the pricing measure.
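As a short check of why the transform in Definition 2 keeps the affine structure, the drift adjustment in (22) can be obtained by substituting the shifted Wiener process into (6). The derivation below uses only $a = \sigma\sigma^\top$ and the fact that $\varphi$ is deterministic; the superscript indexing $A^0, A^i, b^0, b^i$ of the coefficients in Definition 1 is assumed here for readability.

```latex
\begin{align*}
dW_s &= dW^{\mathbb{Q}}_s + \sigma(X_s)^\top \varphi(s)\,ds, \\
\sigma(X_s)\,dW_s
  &= \sigma(X_s)\,dW^{\mathbb{Q}}_s + \sigma(X_s)\sigma(X_s)^\top \varphi(s)\,ds
   = \sigma(X_s)\,dW^{\mathbb{Q}}_s + a(X_s)\,\varphi(s)\,ds, \\
X_t &= X_0 + \int_0^t K(t-s)\bigl(b(X_s) + a(X_s)\varphi(s)\bigr)\,ds
        + \int_0^t K(t-s)\,\sigma(X_s)\,dW^{\mathbb{Q}}_s, \\
b(x) + a(x)\varphi(s)
  &= \bigl(b^0 + A^0\varphi(s)\bigr) + \sum_{i=1}^{d} x_i \bigl(b^i + A^i\varphi(s)\bigr).
\end{align*}
```

The last line shows that the new drift is again affine in $x$, now with time-dependent coefficients, while the covariance $a(x)$ is unchanged.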
We further investigate optimal hedging with the proposed LRD mortality model, as hedging is a typical risk management task. The intent is to demonstrate the tractability of the LRD mortality model in hedging problems. As hedging should be performed under the physical probability measure $\mathbb{P}$, whereas longevity securities such as longevity bonds and swaps are valued under the market-implied pricing measure $\mathbb{Q}$, we adopt the affine retaining transform detailed in Section 3.3 to bridge the two probability measures in this section.

Let us sketch the conceptual framework prior to detailing the mathematics. As insurance product prices under the Volterra mortality model are semimartingales and hence can be expressed as SDEs, the insurer's wealth also satisfies an SDE with stochastic coefficients, which are possibly non-Markovian. According to stochastic control theory, the insurer's wealth plays the role of the state process. Therefore, the theory of backward SDEs (BSDEs) is useful for solving the stochastic optimal control problem for a state process with stochastic coefficients. Typically, the mean-variance (MV) hedging problem is closely related to the linear-quadratic (LQ) control problem under the classic formulation of the BSDE approach. In the following, we leverage this well-received theoretical result to show the application of the LRD mortality model, though the optimal hedging rule derived is novel and performs remarkably well in reducing risk with LRD mortality. The performance is, however, shown numerically in the next section.

Consider an insurer offering a pension scheme who wants to hedge the longevity risk using a longevity security. Specifically, the insurer allocates her capital among a bank account, a risk-free zero-coupon bond, and a zero-coupon longevity bond. Let us concentrate on the one-dimensional case so that $d = k = 1$ from now on.

To simplify the discussion, we adopt the VV mortality rate and assume $m(t) = 0$ and $\eta = 1$ in (5).
In other words, $\mu(t) = X(t)$ and
$$\mu_t = X_t = X_0 + \int_0^t K(t-s)\,(b^0 - b^1 X_s)\,ds + \int_0^t K(t-s)\,\sigma_\mu\,dW_s, \tag{23}$$
where $b^0$, $b^1$, and $\sigma_\mu$ are constants and $K$ is the Volterra kernel. In addition, the interest rate $r_t = Z_t$ follows the Vasicek model:
$$dr(t) = (\tilde b^0 - \tilde b^1 r_t)\,dt + \sigma_r\,dW'_t, \tag{24}$$
where $\tilde b^0$, $\tilde b^1$, and $\sigma_r$ are constant parameters. $W_t$ and $W'_t$ are independent Wiener processes under $\mathbb{P}$. Let $\mathbf{W}(t) = (W_t, W'_t)^\top$. Using the affine retaining transform in Definition 2, the Wiener processes under the pricing measure are given by
$$dW^{\mathbb{Q}}_t = dW_t - \frac{\sigma_\mu\varphi(t)}{\sigma_\mu}\,dt, \qquad dW'^{\mathbb{Q}}_t = dW'_t - \frac{\sigma_r\vartheta(t)}{\sigma_r}\,dt,$$
where $\vartheta$ and $\varphi$ are deterministic functions satisfying the condition (21). Under the pricing measure, the mortality and interest rates are, respectively,
$$X_t = X_0 + \int_0^t K(t-s)\,(b^0 + \varphi(s)\sigma_\mu - b^1 X_s)\,ds + \int_0^t K(t-s)\,\sigma_\mu\,dW^{\mathbb{Q}}_s;$$
$$dr(t) = (\tilde b^0 + \vartheta(t)\sigma_r - \tilde b^1 r_t)\,dt + \sigma_r\,dW'^{\mathbb{Q}}_t.$$
As the unit zero-coupon bond price takes the form
$$B(t,T) = E^{\mathbb{Q}}\!\left[e^{-\int_t^T r(s)\,ds}\,\middle|\,\mathcal{F}_t\right] = e^{\tilde\alpha(t,T) + \tilde\beta(t,T)\,r_t},$$
with $\tilde\alpha(t,T)$ and $\tilde\beta(t,T)$ as defined in Appendix A, the $\mathbb{P}$-dynamics of the bond read
$$dB(t,T) = B(t,T)\,(r(t) + \nu_B(t))\,dt + B(t,T)\,\sigma_b(t)\,dW'_t,$$
where $\nu_B = \vartheta(t)\sigma_b(t)$ and $\sigma_b(t) = -\tilde\beta(t,T)\sigma_r$. Similarly, using the expression for a zero-coupon longevity bond, i.e.,
$$B_L(t,T) = E^{\mathbb{Q}}\!\left[e^{-\int_t^T r(s)+\mu(s)\,ds}\,\middle|\,\mathcal{F}_t\right] = B(t,T)\,e^{\int_0^t \mu(s)\,ds}\exp(Y_t(T)),$$
where $Y_t(T)$ is equivalent to the $Y_t(T)$ in (10) with $b(x) = b^0 + \varphi(s)\sigma_\mu - b^1 x$, $\sigma(x) = \sigma_\mu$, and $W$ replaced by $W^{\mathbb{Q}}$, we obtain the $\mathbb{P}$-dynamics of the longevity bond price as follows:
$$dB_L(t,T) = B_L(t,T)\,(r(t) + \mu(t) + \nu_L(t))\,dt + B_L(t,T)\,\sigma_l(t)\,dW_t + B_L(t,T)\,\sigma_b\,dW'_t,$$
where $\nu_L = \nu_B + \varphi(t)\sigma_l$, $\sigma_l = -\psi(T-t)\sigma_\mu$, and $\psi \in L^2([0,T],\mathbb{C})$ is the solution of the Riccati equation $\psi = (-1 - b^1\psi) * K$.
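The Riccati equation above is a linear Volterra integral equation, so it can be approximated numerically when no closed form is at hand. The following is a minimal sketch, not the authors' implementation: it assumes the fractional kernel $K(t) = t^{\alpha-1}/\Gamma(\alpha)$ used later in the numerical study, writes the equation as $\psi(t) = \int_0^t K(t-s)\,(-1 - b_1\psi(s))\,ds$, and applies an explicit product-integration (rectangle) rule. The function name `riccati_volterra` and the arguments `b1` and `n` are illustrative choices.

```python
import math
import numpy as np

def riccati_volterra(alpha, b1, T, n):
    """Approximate psi(t) = int_0^t K(t-s) * (-1 - b1*psi(s)) ds on [0, T].

    Assumes the fractional kernel K(t) = t**(alpha-1) / Gamma(alpha).
    The kernel is integrated exactly over each grid cell, while the
    integrand -1 - b1*psi is frozen at the left endpoint of the cell.
    """
    t = np.linspace(0.0, T, n + 1)
    psi = np.zeros(n + 1)          # psi(0) = 0
    f = np.empty(n)                # f[j] = -1 - b1 * psi[j]
    for i in range(1, n + 1):
        f[i - 1] = -1.0 - b1 * psi[i - 1]
        lag_hi = t[i] - t[:i]          # t_i - t_j,     j = 0..i-1
        lag_lo = t[i] - t[1:i + 1]     # t_i - t_{j+1}, j = 0..i-1
        # Exact integral of K over [t_j, t_{j+1}] seen from t_i.
        w = (lag_hi ** alpha - lag_lo ** alpha) / math.gamma(alpha + 1.0)
        psi[i] = np.dot(w, f[:i])
    return t, psi
```

With `alpha = 1` the kernel is constant and the scheme reduces to the explicit Euler method for $\psi' = -1 - b_1\psi$, whose solution $\psi(t) = -(1 - e^{-b_1 t})/b_1$ offers a simple sanity check.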
As an investment of $B_L(t,T)$ in the longevity bond at time $t$ becomes $e^{-\int_t^\tau \mu(s)\,ds} B_L(\tau,T)$ at $\tau > t$, the value of holding one unit of the zero-coupon longevity bond, $B_L(t)$, satisfies
$$dB_L(t,T) = B_L(t,T)\,(r(t) + \nu_L(t))\,dt + B_L(t,T)\,\sigma_l(t)\,dW_t + B_L(t,T)\,\sigma_b\,dW'_t. \tag{25}$$
The quantities $\nu_L - \nu_B$ and $\nu_B$ are often known as the market prices of mortality and interest rate risks, respectively. From (25), the zero-coupon longevity bond price still satisfies an SDE due to the semimartingale nature of $Y_t(T)$. This fact enables us to deal with the optimal hedging problem with an LRD mortality rate. Note that the LRD feature is reflected by the volatility term of $B_L(t)$ through a Riccati-Volterra equation.

Let $u_0(t)$, $u_1(t)$, and $u_2(t)$ denote the investment amounts in the bank account, zero-coupon longevity bond, and zero-coupon bond, respectively. Denote $\tilde N(t)$ as a stochastic Poisson process with intensity $k_1\mu(t)$ and $\{z_i\}_{i=1}^{\infty}$ as independent identically distributed (iid) insurance claims. Consider a hedging horizon of $T_0 < T$. Then, the wealth process of the insurer reads
$$M(t) = u_0(t) + u_1(t) + u_2(t) - \sum_{i=1}^{\tilde N(t)} z_i - \Pi(t), \qquad t \in [0, T_0], \tag{26}$$
where $\Pi(t) = \int_0^t \pi(s)\,ds$, $t \in [0, T_0]$, and $\pi(t)$ is an $\mathcal{F}_t$-adapted, square-integrable process representing the pension annuity net cash outflow. We denote the filtration generated by $\{M(s) : 0 \le s \le t\}$ by $\tilde{\mathcal{H}}_t \supseteq \mathcal{F}_t$. The insurer's wealth $M(t)$ satisfies the following SDE:
$$dM(t) = \left(M(t)\,r(t) + u(t)^\top \nu(t) - \pi(t)\right)dt + u(t)^\top \sigma_S(t)^\top\,d\mathbf{W}(t) - z\,d\tilde N(t), \tag{27}$$
where $z$ has the same distribution as $z_1$, $u(t) = (u_1(t), u_2(t))^\top$, $\nu(t) = (\nu_L(t), \nu_B(t))^\top$, and
$$\sigma_S(t)^\top = \begin{pmatrix} \sigma_l & \sigma_b \\ 0 & \sigma_b \end{pmatrix}.$$
If a hedging strategy $u(t)$ is an $\mathcal{F}_t$-adapted process and $E\!\left[\int_0^{T_0} |u(s)|^2\,ds\right] < \infty$, then it is said to be admissible. We denote the set of admissible controls as $\mathcal{U}$.

Definition 3.
The classic mean-variance (MV) hedging problem is defined as
$$V(\phi) = \min_{u(\cdot)\in\mathcal{U}} \operatorname{Var}(M(T_0)) - \phi\,E[M(T_0)], \tag{28}$$
where the parameter $\phi$ measures the insurer's risk aversion.

When $\phi = 0$, problem (28) reduces to minimum-variance hedging. For any given $\bar M = E[M(T_0)]$,
$$E[(M(T_0) - \bar M)^2] - \phi\,E[M(T_0)] = E\!\left[\left(M(T_0) - \left(\bar M + \tfrac{\phi}{2}\right)\right)^2\right] - \phi\bar M - \frac{\phi^2}{4}.$$
Hence, the MV hedging problem can be embedded into a target-based objective. Specifically, problem (28) is equivalent to
$$\min_{\bar M \in \mathbb{R}}\ \min_{u(\cdot)\in\mathcal{U}} E[(M(T_0) - c)^2] - \phi\bar M - \frac{\phi^2}{4}, \tag{29}$$
where $c = \bar M + \frac{\phi}{2}$. The inner minimization problem is a target-based objective that aims to bring the wealth close to the target $c$.

Let $\pi(t) = k_2 e^{-\int_0^t \mu(s)\,ds}$ and $\Sigma(t) = \sigma_S(t)^\top \sigma_S(t)$. To solve the optimal hedging problem, we introduce two additional probability measures:
$$\frac{d\hat{\mathbb{P}}}{d\mathbb{P}} = \exp\!\left(-\int_0^t \xi(s)^\top d\mathbf{W}(s) - \frac12\int_0^t |\xi(s)|^2\,ds\right), \qquad \frac{d\acute{\mathbb{P}}}{d\mathbb{P}} = \exp\!\left(-\int_0^t \zeta(s)^\top d\mathbf{W}(s) - \frac12\int_0^t \zeta(s)^\top\zeta(s)\,ds\right),$$
with $\xi(t) = (2\varphi(t), 2\vartheta(t))^\top$ and $\zeta(t) = (\varphi(t), \vartheta(t))^\top$. By the Girsanov theorem, $\hat{\mathbf{W}}_t := \mathbf{W}_t + \int_0^t \xi(s)\,ds$ and $\acute{\mathbf{W}}_t := \mathbf{W}_t + \int_0^t \zeta(s)\,ds$ are Wiener processes under $\hat{\mathbb{P}}$ and $\acute{\mathbb{P}}$, respectively. Denote $\hat E[\cdot]$ and $\acute E[\cdot]$ as expectations under $\hat{\mathbb{P}}$ and $\acute{\mathbb{P}}$, respectively. By Theorem 1,
$$\acute E\!\left[e^{-\int_0^s \mu_\tau\,d\tau}\,\middle|\,\tilde{\mathcal{H}}_t\right] = \exp(Y_t(T)),$$
where $Y_t(T)$ is equivalent to the $Y_t(T)$ in (10) with $b(x) = b^0 - \varphi(s)\sigma_\mu - b^1 x$, $\sigma(x) = \sigma_\mu$, and $W$ replaced by $\acute W$; $\acute E[\mu_s|\tilde{\mathcal{H}}_t] = \acute E[X_s|\tilde{\mathcal{H}}_t]$ is equivalent to $E[X_s|\mathcal{F}_t]$ as defined in (13) with $B = -b^1$, $b^0(s)$ replaced by $b^0 - \varphi(s)\sigma_\mu$, and $W$ replaced by $\acute W$.
In addition, we have the following expressions:
$$\hat E\!\left[e^{-2\int_t^{T_0} r(s)\,ds}\,\middle|\,\mathcal{F}_t\right] = \exp(\alpha_1(t,T_0) + \beta_1(t,T_0)\,r(t)), \tag{30}$$
$$\acute B(t,s) = \acute E\!\left[e^{-\int_t^{s} r(u)\,du}\,\middle|\,\mathcal{F}_t\right] = \exp(\alpha_2(t,s) + \beta_2(t,s)\,r(t)), \tag{31}$$
where $\alpha_1(t,T_0)$, $\beta_1(t,T_0)$, $\alpha_2(t,s)$, and $\beta_2(t,s)$ solve the ODEs in Appendix A. The following theorem provides the optimal hedging strategy.

Theorem 2.
Consider two stochastic processes
$$P(t) = \frac{e^{-\int_t^{T_0} \vartheta(s)^2 + \varphi(s)^2\,ds}}{\hat E\!\left[e^{-2\int_t^{T_0} r(s)\,ds}\,\middle|\,\mathcal{F}_t\right]} \tag{32}$$
and
$$Q(t) = -P(t)\,[Q_1(t) + c\,\acute B(t,T_0)], \tag{33}$$
where
$$Q_1(t) = \int_t^{T_0} \acute B(t,s)\left(k_1 E[z]\,\acute E[\mu_s|\tilde{\mathcal{H}}_t] + k_2\,\acute E\!\left[e^{-\int_0^s \mu_\tau\,d\tau}\,\middle|\,\tilde{\mathcal{H}}_t\right]\right)ds, \qquad \acute B(t,s) = \acute E\!\left[e^{-\int_t^s r(u)\,du}\,\middle|\,\mathcal{F}_t\right], \quad 0 \le t \le s.$$
Once
$$dP(t) = \mu_P(t)\,dt + \eta_1^\top d\mathbf{W}(t) \quad\text{and}\quad dQ(t) = \mu_Q(t)\,dt + \eta_2^\top d\mathbf{W}(t) \tag{34}$$
under $\mathbb{P}$, the inner minimization problem in (29) has an optimal feedback control:
$$u_c^*(t) = -\Sigma(t)^{-1}\left[\left(\nu(t) + \frac{\sigma_S(t)^\top \eta_1(t)}{P(t)}\right)M(t) + \frac{Q(t)\,\nu(t) + \sigma_S(t)^\top \eta_2(t)}{P(t)}\right]. \tag{35}$$
In addition, the optimal objective value is $P(0)\left(M(0) + \frac{Q(0)}{P(0)}\right)^2 + I(0)$, where
$$I(t) = E\!\left[\int_t^{T_0} P\left(\mu z^2 + \left(\eta_2 - \frac{Q\eta_1}{P}\right)^{\!\top} \sigma^\perp \left(\eta_2 - \frac{Q\eta_1}{P}\right)\right)(s)\,ds\,\middle|\,\tilde{\mathcal{H}}_t\right] \tag{36}$$
in which $\sigma^\perp = \mathrm{id} - \sigma_S(t)\,\Sigma(t)^{-1}\sigma_S(t)^\top$.

Proof. See Appendix B.

Proposition 3.
The diffusion coefficients in (34) are $\eta_1 = (0, \eta_{12})^\top$, where $\eta_{12} = -P(t)\,\beta_1(t,T_0)\,\sigma_r$, and $\eta_2 = (\eta_{21}, \eta_{22})^\top$, in which
$$\eta_{21} = -P(t)\int_t^{T_0} \acute B(t,s)\left(k_1 E[z]\,E_B(s-t)\,\sigma_\mu + k_2\,\acute E\!\left[e^{-\int_0^s \mu_\tau\,d\tau}\,\middle|\,\tilde{\mathcal{H}}_t\right]\psi(s-t)\,\sigma_\mu\right)ds,$$
$$\eta_{22} = -P(t)\left\{\int_t^{T_0} \acute B(t,s)\left(k_1 E[z]\,\acute E[\mu_s|\tilde{\mathcal{H}}_t] + k_2\,\acute E\!\left[e^{-\int_0^s \mu_\tau\,d\tau}\,\middle|\,\tilde{\mathcal{H}}_t\right]\right)\beta_2(t,s)\,\sigma_r\,ds + c\,\acute B(t,T_0)\,\beta_2(t,T_0)\,\sigma_r\right\} + P(t)\,[Q_1(t) + c\,\acute B(t,T_0)]\,\beta_1(t,T_0)\,\sigma_r, \tag{37}$$
where $\beta_1(t,T_0)$ is defined in (30), $\beta_2(t,s)$ in (31), $E_B$ in Theorem 1 with $B = -b^1$, and $\psi \in L^2([0,s],\mathbb{C})$ solves the Riccati equation $\psi = (-1 - \psi b^1) * K$.

Proposition 4.
The optimal hedging strategy $u^*(t) = (u_1^*(t), u_2^*(t))^\top$ to problem (28) is given by
$$u_1^*(t) = -\frac{1}{\sigma_l(t)}\left\{\left[M(t) - Q_1(t) - \left(\bar M^* + \frac{\phi}{2}\right)\acute B(t,T_0)\right]\varphi(t) + \frac{\eta_{21}(t)}{P(t)}\right\}, \tag{38}$$
$$u_2^*(t) = -\frac{1}{\sigma_b(t)}\left\{\left[M(t) - Q_1(t) - \left(\bar M^* + \frac{\phi}{2}\right)\acute B(t,T_0)\right]\vartheta(t) + \frac{M(t)\,\eta_{12}(t) + \eta_{22}(t)}{P(t)}\right\} - u_1^*(t), \tag{39}$$
where
$$\bar M^* = \frac{\frac{\phi}{2}\left(1 - P(0)\,\acute B(0,T_0)^2\right) + P(0)\,\acute B(0,T_0)\,(M(0) - Q_1(0))}{P(0)\,\acute B(0,T_0)^2}.$$

The explicit optimal hedging strategy in Proposition 4 incorporates the LRD feature through $\eta_2$, which depends on the mortality rate path and the kernel $K$ as shown in Proposition 3. In addition, the Hurst parameter is contained in the kernel function $K$.

In this section, we numerically examine the impact of long-range dependence on the prices of insurance products and the hedging effectiveness. To do so, we contrast the LRD mortality model with its Markovian counterpart. For the latter case, the Hurst parameter $H$ is set to 1/2. As LRD appears when $H > 1/2$, we examine the effect when $H$ falls into this range.

As the basic quantity, we begin with the survival probability. Under the Volterra mortality model, we assume that process $X$ satisfies Equation (15), which is a Volterra type of Vasicek model. The Vasicek model is a special case with $\alpha = 1$ or $H = 1/2$. We compare the Vasicek and VV mortality models using two different values of $H$ while the other parameters are kept constant. Yan et al. (2018) empirically estimate that $H$ is around 0.83 for mortality data. Thus, we choose $\alpha = 1.33$ for the VV mortality model. Table 2 summarizes the remaining parameters used in this numerical study. The parameters chosen have similar magnitudes to those in Biffis (2005) for the Markovian model.

Projection   α      m(t)    η     λ     θ        σ      t    X
A            1.33   SIM92   0.2   0.5   0.0009   0.01   40   0.001
B            1      SIM92   0.2   0.5   0.0009   0.01   40   0.001

Table 2: Parameter values for the mortality model
Remark 4.
The SIM92 in Table 2 is a dataset from the Italian National Institute of Statistics (ISTAT) which reports Italian population life tables. SIM92 is usually employed to price assurance. Such a setting for $m(t)$ has been adopted in Biffis (2005). Specifically, after fixing the other parameter values, $m(t)$ is calibrated to fit the SIM92 table, so the functional form of $m(t)$ is not explicitly shown here.

Although parameter values are assigned in this numerical experiment, we stress that, in reality, the parameters can be calibrated to observed prices of actuarial products using the set of closed-form pricing formulas derived in this paper. In addition, the parameter $\theta$ in (14) or $b^0$ in Definition 1 can be set as a bounded measurable function of time $t$ rather than a constant as in our example.

In Table 2, the symbol $t$ stands for the age group. For example, $t = 40$ corresponds to the surviving population at the age of 40. In Figures 1(a) and 2(a), we simulate two different sample paths of $X$ for this group of individuals over the time interval $[0, t]$. Under the VV mortality model, the historical sample path of $X$ affects the estimated survival probability, whereas under the Vasicek model it does not because of the Markovian nature of that model. Given the parameters in Table 2 and (14), we directly calculate survival probabilities from the two models. By (14) and Theorem 1,
$$P(\tau > T\,|\,\mathcal{F}_t) = e^{-\int_t^T m(s)\,ds}\exp\!\left(-\eta\int_t^T E[X_s|\mathcal{F}_t]\,ds + \frac12\int_t^T \psi(T-s)\,a(E[X_s|\mathcal{F}_t])\,\psi(T-s)^\top ds\right), \tag{40}$$
for $T > t$. Under the Vasicek mortality model, the survival probability depends only on $X_t$ ($t = 40$) as $E[\mu_s|\mathcal{F}_t] = \mu_t$. However, under the VV mortality model, the expression of $E[\mu_s|\mathcal{F}_t]$ given in (13) depends on the whole historical path of $X$.
Based on the simulated sample paths, we calculate the survival probabilities for the interval $T \in [t, x^*]$, where we set the maximum age at $x^* = 109$.

Figures 1(b) and 2(b) show the survival probabilities that correspond to the historical records in Figures 1(a) and 2(a), respectively. The solid line is the survival probability curve with LRD and the dashed line is that of the Markovian model. Depending on the historical record, the LRD survival probability can be higher or lower than the Markovian survival probability. This indicates that the historical sample path affects the survival probability when LRD is present. The effect is more pronounced for the middle age group. This is reasonable because the young age group has a shorter historical record and the old age group may be restricted by the human age limit. This middle-age effect may have a significant impact on insurance pricing. We further examine it with a concrete insurance product.
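Historical paths of $X$ such as those used above can be generated with a simple discretization of the stochastic Volterra equation. The sketch below is an illustration under stated assumptions rather than the scheme used for the figures: it assumes a drift of the form $b_0 - b_1 X$, a constant volatility, and the fractional kernel $K(t) = t^{\alpha-1}/\Gamma(\alpha)$, and it uses a left-point Euler-type rule. The function name `simulate_volterra_vasicek` and all argument names are hypothetical.

```python
import math
import numpy as np

def simulate_volterra_vasicek(x0, b0, b1, sigma, alpha, T, n, rng=None):
    """Left-point Euler-type scheme for
    X_t = x0 + int_0^t K(t-s)(b0 - b1*X_s) ds + int_0^t K(t-s)*sigma dW_s,
    with the assumed kernel K(t) = t**(alpha-1) / Gamma(alpha), alpha >= 1.
    """
    rng = np.random.default_rng() if rng is None else rng
    h = T / n
    t = np.linspace(0.0, T, n + 1)
    dW = rng.normal(0.0, math.sqrt(h), size=n)   # Brownian increments
    x = np.full(n + 1, float(x0))
    for i in range(1, n + 1):
        lag = t[i] - t[:i]                        # strictly positive lags
        k = lag ** (alpha - 1.0) / math.gamma(alpha)
        drift = b0 - b1 * x[:i]
        # The whole past path enters every step: this is the memory effect.
        x[i] = x0 + h * np.dot(k, drift) + sigma * np.dot(k, dW[:i])
    return t, x
```

Setting `alpha = 1` gives a constant kernel and recovers the usual Euler scheme for the Markovian Vasicek model, so the same routine can produce both projections for comparison. Each step costs a sum over the full history, which is the price of the non-Markovian structure.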
Figure 1: A sample historical path of X that makes the survival probability with LRD higher than its Markovian counterpart. Panel (a): X against age; panel (b): survival probability against age for projections A and B.

Figure 2: A sample historical path of X that makes the survival probability with LRD lower than its Markovian counterpart. Panel (a): X against age; panel (b): survival probability against age for projections A and B.

5.2 Impact on annuity

To examine the effect of LRD on annuity prices, we compare the prices calculated by the two models. We are interested in annuities because they are popular insurance and pension products around the globe.

The numerical experiment is constructed as follows. Consider a 20-year deferred annuity that pays a unit amount each year. For simplicity, we assume that $\mathbb{Q} = \mathbb{P}$ in this part so that no additional effort is required to identify the pricing measure. The simulation and calculation are made with the parameters in Table 2. In addition, we specify the short interest rate $r_t = Z_t$ as follows:
$$dZ_t = (\tilde b^0 - \tilde b^1 Z_t)\,dt + \sigma_r\,dW',$$
where e b = 0 . e b = 0 . σ r = 0 .
3, and Z (40) = 0 .
01. Then we use (18) directly to calculate the price of the annuity with $t' = 20$.

Figure 3: (a) Examples of historical paths for X and (b) histogram of percentage difference in annuity prices between the two models (frequency against price difference, from −0.04 to 0.04).

To demonstrate the LRD effect, we generate 15,000 sample paths of $X$ over the time interval $[0, t]$. In Figure 3(a), we illustrate that the last two sample paths meet at time $t$. The classic Markovian model ignores how they come to this point and assigns the same price to the two scenarios, as explained in (40). However, our LRD mortality model takes the historical record into account and assigns two different prices, as shown in (18) and Theorem 1. The question is how large the difference between these two models is. Clearly, the difference is not a single number, as there are uncountably many ways to reach the same point. Therefore, we examine the distribution of the price difference for different historical paths.

To do so, Figure 3(b) plots a histogram of the percentage difference in annuity prices between the LRD and Markovian models. First, the mean of the distribution is near zero, implying that the Markovian mortality model offers an appropriate estimate of the average price even under the LRD feature. However, the dispersion of the histogram is still evident. The price difference between the two models can reach 4% even for a linear annuity product, which is not negligible in practice.
The discrepancy may be amplified for products with leveraging effects such as those with optionality. Even for this annuity product, the volatility could be higher than under the Markovian model, owing to incorrect predictions of the mortality rate, if the realized mortality has the LRD feature.

To illustrate the influence of LRD on products with optionality, consider a European call option on a zero-coupon longevity bond $B_L(t,T)$ with strike $D$ and expiration time $T_0$, where $T$ is the fixed maturity of the bond and $T_0$ is the expiration date of the option so that $0 \le t \le T_0 < T$. Specifically, the call option payoff reads $V(B_L(T_0,T)) = \max(B_L(T_0,T) - D, 0)$. For simplicity, we assume a constant interest rate $r$ and $m(\cdot) = 0$. By (15) and (25), we have
$$dB_L(t,T) = B_L(t,T)\left[r\,dt + \psi(T-t)\,\sigma\,dW^{\mathbb{Q}}_t\right], \tag{41}$$
under the pricing measure, where $\psi$ solves $\psi = (-\eta - \lambda\psi) * K$. As (25) gives the dynamics of $B_L(t,T)$ under $\mathbb{P}$, the corresponding $\mathbb{Q}$-dynamics in (41) are those in which the term $\nu_L$ in (25) is absorbed into the $\mathbb{P}$-Brownian motion to form a $\mathbb{Q}$-Brownian motion. Hence, the call value function $V(B_L, t)$ resembles the Black-Scholes formula. Specifically,
$$V(B_L, t) = \Phi(d_1)\,B_L(t,T) - \Phi(d_2)\,D\,e^{-r(T_0 - t)},$$
$$d_1 = \frac{1}{\psi(T-t)\,\sigma\sqrt{T_0 - t}}\left[\ln\!\left(\frac{B_L(t,T)}{D}\right) + \left(r + \frac12\psi(T-t)^2\sigma^2\right)(T_0 - t)\right],$$
$$d_2 = d_1 - \psi(T-t)\,\sigma\sqrt{T_0 - t},$$
where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution.

Let us make a numerical comparison in terms of the percentage difference in option price between the VV and Markovian models. Let r = 0 . T = 5, and $T_0 = 2$, and set the other parameters as in Table 2. Assume $B_L(t,T) = 0.8 =: D_0$, the benchmark at-the-money (ATM) strike, at the option issuance time. Note that the historical path of the mortality rate is subsumed into the longevity bond price $B_L(t,T)$. By varying the strike $D$ from 0.8 (ATM) to 0.832 (4% in-the-money), option prices under the two models are shown in Figure 4(a), while the percentage difference in price is shown in Figure 4(b). When the strike increases by 4%, the percentage difference in option price can reach 20%, which is quite significant. We mention the 4% increase in strike because the annuity price can differ by 4% in the preceding analysis. When the strike is set to make the option ATM, the difference in the longevity bond price results in a 4% difference in setting the ATM strike. This example shows that optionality may further amplify the pricing difference.

Figure 4: (a) Option prices and (b) difference of the prices under the two models. Panel (a): price against strike D for the Volterra and Markovian mortality models; panel (b): percentage difference of option prices against $D/D_0$.
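The Black-Scholes-type formula above is straightforward to evaluate. The following sketch is only an illustration: it treats the volatility factor $\psi(T-t)\sigma$ as a single number `vol` supplied by the caller, and the names `norm_cdf`, `longevity_bond_call`, and `tau` (time to option expiry) are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def longevity_bond_call(bond_price, strike, r, vol, tau):
    """Call value Phi(d1)*B_L - Phi(d2)*D*exp(-r*tau).

    vol stands for psi(T - t) * sigma and tau for the time to option expiry;
    both are inputs here and are not computed from a mortality model.
    """
    sd = vol * math.sqrt(tau)
    d1 = (math.log(bond_price / strike) + (r + 0.5 * vol ** 2) * tau) / sd
    d2 = d1 - sd
    return norm_cdf(d1) * bond_price - norm_cdf(d2) * strike * math.exp(-r * tau)

def percentage_difference(price_a, price_b):
    """Relative difference of price_a with respect to price_b."""
    return (price_a - price_b) / price_b
```

A comparison in the spirit of Figure 4 would call `longevity_bond_call` over a grid of strikes with two different values of `vol`, one from the Volterra model and one from the Markovian model, and then apply `percentage_difference` to the two price curves.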
We further examine hedging with LRD. In this part, we still consider the fractional kernel in (23) so that $K(t) = \frac{t^{\alpha-1}}{\Gamma(\alpha)}$. Again, we first simulate a pair of sample paths of mortality and interest rates as shown in Figure 5. The model parameters used are µ (0) = 0 . b = 0 . b = 0 . σ µ = 0 . r (0) = 0 . e b = 0.6, e b = 0 . σ r = 0 . T = 5, α = 1 . k = 1, k = 10, and E [ z ] = 2.

Figure 5: A pair of sample paths of (a) mortality rate and (b) interest rate, each plotted against t
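For reference, the fractional kernel can be coded in a few lines. This is a small sketch with hypothetical function names; the mapping $\alpha = H + 1/2$ is an assumption inferred from the pairs $(H, \alpha) = (1/2, 1)$ and $(0.83, 1.33)$ quoted in the numerical study, not a formula stated in this section.

```python
import math

def fractional_kernel(t, alpha):
    """K(t) = t**(alpha - 1) / Gamma(alpha) for t > 0."""
    if t <= 0.0:
        raise ValueError("t must be positive")
    return t ** (alpha - 1.0) / math.gamma(alpha)

def alpha_from_hurst(hurst):
    # Assumed mapping alpha = H + 1/2, consistent with (H, alpha) = (0.5, 1)
    # and (0.83, 1.33) as used in the numerical experiments.
    return hurst + 0.5
```

With `alpha = 1` the kernel equals 1 for every positive `t`, which is the Markovian case; any `alpha` between 1 and 2 gives a kernel that grows with the lag and so weights the distant past more heavily.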
We hedge with the following two models.
• Model 1: The above assumptions with $K(t) = \frac{t^{\alpha-1}}{\Gamma(\alpha)}$ (Volterra mortality model);
• Model 2: The above assumptions with $K(t) = 1$ (Markovian mortality model).

Our objective is to hedge with $\phi = 3000$ over a horizon of 5 years using a zero-coupon longevity bond and a zero-coupon bond with maturity $T = 15$. The initial value of the wealth process is set to 2000. The optimal hedging strategies are calculated according to (38) and (39). The longevity bond price and bond price are calculated by assuming constant market prices of risk ϕ = 0 . ϑ = 0 .

Although we set $\alpha = 1.33$ (or $H = 0.83$) in this numerical experiment, the value of $\alpha$ can be calibrated or estimated in practice by using the pricing formulas we provide. Therefore, this study offers the option of choosing between Volterra and Markovian mortality models when dealing with longevity hedging in reality. Our proposed model renders a practical, flexible approach to the choice of $\alpha$.

Figure 6: Optimal hedging strategy (a) $u_1(t)$ and (b) $u_2(t)$ under Model 1 and Model 2

Figure 7: Wealth processes under Model 1, Model 2, and no hedge
In this paper, we propose a tractable continuous-time mortality rate model that incorporates the LRD feature. Using our model, we derive novel closed-form solutions for the survival probability and the prices of several basic insurance products. In addition, our model enables us to investigate an optimal longevity hedging strategy via the BSDE framework. Therefore, the key advantages of our model are its tractability for pricing and risk management and its ability to capture the LRD feature. Our numerical experiments show that LRD has significant effects on insurance pricing and hedging. The new longevity hedging strategy improves the hedging effectiveness when the mortality rate exhibits the LRD feature.
A Transformation of Markov affine processes
We now give the ODEs solved by the coefficients $\tilde\alpha$ and $\tilde\beta$ appearing in Sections 2 and 4. An $\mathbb{R}^k$-valued affine diffusion $Z$ is an $\mathbb{F}$-Markovian process specified as the strong solution to the following SDE:
$$dZ_t = \tilde b(Z_t)\,dt + \tilde\sigma(Z_t)\,dW'_t,$$
where $W'_t$ is an $\mathbb{F}$-standard $k$-dimensional Brownian motion. We require the covariance matrix $\tilde a(Z) = \tilde\sigma(Z)\tilde\sigma(Z)^\top$ and the drift $\tilde b(Z)$ to have affine dependence on $Z$ as in Definition 1. That is,
$$\tilde a(Z) = \tilde A^0 + Z_1\tilde A^1 + \cdots + Z_k\tilde A^k, \qquad \tilde b(Z) = \tilde b^0 + Z_1\tilde b^1 + \cdots + Z_k\tilde b^k,$$
for some $k$-dimensional symmetric matrices $\tilde A^i$ and vectors $\tilde b^i$. For convenience, we set $\tilde A = (\tilde A^1, \cdots, \tilde A^k)$ and $\tilde b = (\tilde b^1, \cdots, \tilde b^k)$. As shown in Duffie et al. (2000), for any $c_1, c_2 \in \mathbb{C}^k$ and $c_3 \in \mathbb{C}$, given $T > t$ and an affine function $\Lambda(t,x) = \lambda_0(t) + \lambda_1(t)\cdot Z$ ($\lambda_0$ and $\lambda_1$ are bounded continuous functions), under technical conditions we have
$$E\!\left[e^{-\int_t^T \Lambda(s,Z_s)\,ds}\,e^{c_1\cdot Z_T}(c_2\cdot Z_T + c_3)\,\middle|\,\mathcal{F}_t\right] = e^{\tilde\alpha(t) + \tilde\beta(t)\cdot Z_t}\,[\hat\alpha(t) + \hat\beta(t)\cdot Z_t],$$
where the functions $\tilde\alpha(\cdot) := \tilde\alpha(\cdot, T)$ and $\tilde\beta(\cdot) := \tilde\beta(\cdot, T)$ solve the following ODEs:
$$\dot{\tilde\beta}(t) = \lambda_1(t) - \tilde b(t)^\top\tilde\beta(t) - \tilde\beta(t)^\top\tilde A(t)\,\tilde\beta(t),$$
$$\dot{\tilde\alpha}(t) = \lambda_0(t) - \tilde b^0(t)\cdot\tilde\beta(t) - \tilde\beta(t)^\top\tilde A^0(t)\,\tilde\beta(t),$$
with boundary conditions $\tilde\alpha(T) = 0$ and $\tilde\beta(T) = c_1$; the functions $\hat\alpha(\cdot) := \hat\alpha(\cdot\,; c_1, c_2, c_3, T)$ and $\hat\beta(\cdot) := \hat\beta(\cdot\,; c_1, c_2, c_3, T)$ are the solutions to the following ODEs:
$$\dot{\hat\beta}(t) = -\tilde b(t)^\top\hat\beta(t) - \tilde\beta(t)^\top\tilde A(t)\,\hat\beta(t),$$
$$\dot{\hat\alpha}(t) = -\tilde b^0(t)\cdot\hat\beta(t) - \tilde\beta(t)^\top\tilde A^0(t)\,\hat\beta(t),$$
with boundary conditions $\hat\alpha(T) = c_3$ and $\hat\beta(T) = c_2$.

B Some Proofs
Proof of Theorem 1
Under our model, from (5), E [ e − R Tt µ s ds |F t ] = E [ e − R Tt m ( s )+ ηX s ds |F t ] = e − R Tt m ( s ) ds E [ e − R Tt ηX s ds |F t ] . As X t has the affine structure specified in Definition 1, by application of Lemma 4.2 andTheorem 4.3 provided in Abi Jaber et al. (2019), we have E [ e − R Tt ηX s ds |F t ] = e R t ηX s ds E [ e − R T ηX s ds |F t ] = e R t ηX s ds exp( Y t ( T )) , where Y t ( T ) is the Markovian process defined in (10) or equivalently (12) in Theorem 1.Then, for T > t ≥
0, we have E [ e − R Tt µ s ds |F t ] = e − R Tt m ( s ) ds e R t ηX s ds exp( Y t ( T )) . Notice that − R Tt m ( s ) ds + R t ηX s ds = − R T m ( s ) ds + R t µ s ds . Hence, E [ e − R Tt µ s ds |F t ] = e − R T m ( s ) ds e R t µ s ds exp( Y t ( T )) = g ( t, T ) . (42)By taking the derivative of g ( t, T ) with respect to T , we get − ∂g ( t, T ) ∂T = E [ e − R Tt µ s ds µ T |F t ] , T > t. (43)Then, by combining the Equations (42) and (43), the result in (9) follows. roof of Theorem 2 and Proposition 3 For P ( t ), it is obvious that P ( t ) > P ( T ) = 1, and P − ( t ) = e R T t ϑ ( s )+ ϕ ( s ) ds ˆ E [ e − R T t r ( s ) ds |F t ]= e R T t ϑ ( s )+ ϕ ( s ) ds exp( α ( t, T ) + β ( t, T ) r ( t )) . Under our setting, ν ( t ) ⊤ Σ( t ) − ν ( t ) = ϑ ( t ) + ϕ ( t ). Then, by applying Itˆo’s formula, weget dP − ( t ) = P − ( t )(2 r ( t ) − ϑ ( t ) − ϕ ( t )) dt − P − ( t ) e η ( t ) ⊤ d ˆ W ( t )= P − ( t )(2 r ( t ) − ν ( t ) ⊤ Σ( t ) − ν ( t ) − ˜ η ( t ) ⊤ ξ ( t )) dt − P − ( t ) e η ( t ) ⊤ d W ( t ) , where e η = − β ( t, T ) σ r = η /P ( t ) and η ( t ) is defined in Proposition 3. Notice that ξ ( t ) = 2 σ S Σ( t ) − ν ( t ) and σ ⊥ e η = 0. Then by Itˆo’s lemma again, P ( t ) satisfies dP ( t ) = (cid:26) (cid:2) − r ( t ) + ν ( t ) ⊤ Σ( t ) − ν ( t ) (cid:3) P ( t ) + 2 ν ( t ) ⊤ Σ( t ) − σ S ( t ) ⊤ η ( t )+ η ( t ) ⊤ σ S ( t )Σ( t ) − σ S ( t ) ⊤ η ( t ) 1 P ( t ) (cid:27) dt + η ( t ) ⊤ d W ( t ) . For Q ( t ), it is obvious that Q ( T ) = − c and Q ( t ) P ( t ) = − [ Q ( t ) + c ´ B ( t, T )]. By applyingItˆo’s lemma to ´ E [ µ s | ˜ H t ] on time t , we have d (cid:16) ´ E [ µ s | ˜ H t ] (cid:17) = E B ( s − t ) σ µ d ´ W t , where E B is defined in Theorem 1 with B = − b . 
By applying Ito’s lemma to ´ E h e − R s µ τ dτ (cid:12)(cid:12)(cid:12) ˜ H t i =exp( Y t ( T )) on time t , we get d (cid:16) ´ E h e − R s µ τ dτ (cid:12)(cid:12)(cid:12) ˜ H t i(cid:17) = ´ E h e − R s µ τ dτ (cid:12)(cid:12)(cid:12) ˜ H t i ψ ( s − t ) σ µ d ´ W t with ψ ∈ L ([0 , s ] , C ) solving the Riccati equation ψ = ( − − ψ b ) ∗ K . From (31), d ´ B ( t, s ) = ´ B ( t, s ) r ( t ) dt − ´ B ( t, s ) β ( t, s ) σ r d ´ W ′ t . Then, by applying Itˆo’s lemma to Q ( t ) P ( t ) , wehave d (cid:20) Q ( t ) P ( t ) (cid:21) = (cid:20) Q ( t ) P ( t ) r ( t ) + k µ ( t ) z + π ( t ) (cid:21) dt + (cid:20)e η ( t ) ⊤ − Q ( t ) P ( t ) e η ( t ) ⊤ (cid:21) d ´ W ( t )= (cid:20) Q ( t ) P ( t ) r ( t ) + k µ ( t ) z + π ( t ) + e η ( t ) ⊤ ζ ( t ) − Q ( t ) P ( t ) e η ( t ) ⊤ ζ ( t ) dt (cid:21) + (cid:20)e η ( t ) ⊤ − Q ( t ) P ( t ) e η ( t ) ⊤ (cid:21) d W ( t ) , where e η = η /P ( t ) and η is shown in Proposition 3. Notice that ζ ( t ) = σ S Σ( t ) − ν ( t )and σ ⊥ e η = 0. Then, by Itˆo’s lemma again, Q ( t ) satisfies dQ ( t ) = (cid:26) (cid:20) − r ( t ) + ν ( t ) ⊤ Σ( t ) − (cid:18) ν ( t ) + σ S ( t ) ⊤ η ( t ) P ( t ) (cid:19)(cid:21) Q ( t ) + P ( t )( k µ ( t ) z + π ( t ))+ η ( t ) ⊤ σ S ( t )Σ( t ) − (cid:18) ν ( t ) + σ S ( t ) ⊤ η ( t ) P ( t ) (cid:19) (cid:27) dt + η ( t ) ⊤ d W ( t ) . inally, we consider the process P ( t ) (cid:16) M ( t ) + Q ( t ) P ( t ) (cid:17) + I ( t ). By Itˆo’s formula, we have d " P ( t ) (cid:18) M ( t ) + Q ( t ) P ( t ) (cid:19) + I ( t ) = d [ P ( t ) M ( t ) + 2 d [ Q ( t ) M ( t )] + d [ Q ( t ) P − ( t )] + dI ( t )= P ( t )( u ( t ) − u ∗ c ( t )) ⊤ σ S ( t ) ⊤ σ S ( t )( u ( t ) − u ∗ c ( t )) dt + {· · · } d W ( t ) + {· · · } d K ( t )= P ( t ) || σ S ( t )( u ( t ) − u ∗ c ( t )) || dt + {· · · } d W ( t ) + {· · · } d K ( t ) , where u ∗ c ( t ) is defined in (35) and K ( t ) = ˜ N ( t ) − k R t µ ( s ) ds is a martingale with respectto the filtration ˜ H t . 
Then, there exists an increasing sequence of stopping times { τ i } suchthat τ i ↑ T as i → ∞ and E " P ( T ∧ τ i ) (cid:18) M ( T ∧ τ i ) + Q ( T ∧ τ i ) P ( T ∧ τ i ) (cid:19) + I ( T ∧ τ i ) = P (0)( Y (0) + Q (0) P (0) ) + I (0) + E "Z T ∧ τ i P ( t ) || σ S ( t )( u ( t ) − u ∗ c ( t )) || dt . From (32) and (33), we can see P ( t ) and Q ( t ) are bounded. From (36), I ( t ) is alsobounded. As E [sup t ∈ [0 ,T ] | Y ( t ) | ] < ∞ , according to the Dominance Covergence Theo-rem and Monotone Convergence Theorem as i → ∞ , we have E " P ( T ) (cid:18) M ( T ) + Q ( T ) P ( T ) (cid:19) + I ( T ) = P (0)( Y (0) + Q (0) P (0) ) + I (0) + E "Z T P ( t ) || σ S ( t )( u ( t ) − u ∗ c ( t )) || dt . Thus, the objective function E [( M ( T ) − c ) ] = E (cid:20) P ( T ) (cid:16) M ( T ) + Q ( T ) P ( T ) (cid:17) + I ( T ) (cid:21) isminimized when u ( t ) = u ∗ t . P (0)( Y (0) + Q (0) P (0) ) + I (0) is the optimal objective value. Proof of Proposition 4
By Theorem 2, the optimal objective value is given by $P(0)\left(M(0) + \frac{Q(0)}{P(0)}\right)^2 + I(0)$ for any given $c$. Take $c = \bar M + \frac{\phi}{2}$ and substitute $Q(0) = -P(0)\,[Q_1(0) + c\,\acute B(0,T_0)]$; then the outer minimization problem in (29) becomes
$$\min_{\bar M \in \mathbb{R}}\ P(0)\left(M(0) - \left(\bar M + \frac{\phi}{2}\right)\acute B(0,T_0) - Q_1(0)\right)^2 + I(0) - \phi\bar M - \frac{\phi^2}{4},$$
which is a quadratic function attaining its minimum at
$$\bar M^* = \frac{\frac{\phi}{2}\left(1 - P(0)\,\acute B(0,T_0)^2\right) + P(0)\,\acute B(0,T_0)\,(M(0) - Q_1(0))}{P(0)\,\acute B(0,T_0)^2}.$$
By substituting $c = \bar M^* + \frac{\phi}{2}$, the result follows.

References

Abi Jaber, E., Larsson, M., Pulido, S. (2019). Affine Volterra processes.
The Annals of Applied Probability
European Actuarial Journal
Stochastic Processes and Their Applications
Insurance: Mathematics and Economics
Scandinavian Actuarial Journal
Insurance: Mathematics and Economics
Journal of Risk and Insurance.
Insurance: Mathematics and Economics, 31(3), 373-393.
Chuang, S. L., Brockett, P. L. (2014). Modeling and pricing longevity derivatives using stochastic mortality rates and the Esscher transform. North American Actuarial Journal, 18(1), 22-37.
Danesi, I. L., Haberman, S., Millossovich, P. (2015). Forecasting mortality in subpopulations using Lee-Carter type models: A comparison. Insurance: Mathematics and Economics 62, 151-161.
Delgado-Vences, F., Ornelas, A. (2019). Modelling Italian mortality rates with a geometric-type fractional Ornstein-Uhlenbeck process. arXiv preprint arXiv:1901.00795.
Duffie, D., Filipović, D., Schachermayer, W. (2003). Affine processes and applications in finance. The Annals of Applied Probability
Duffie, D., Pan, J., Singleton, K. (2000). Transform analysis and asset pricing for affine jump-diffusions. Econometrica
Stochastic Processes and Their Applications
Philosophical Transactions of the Royal Society of London (115), 513-583.
Han, B., Wong, H. Y. (2020). Mean-variance portfolio selection with Volterra Heston model. Applied Mathematics and Optimization https://doi.org/10.1007/s00245-020-09658-3.
Jevtić, P., Luciano, E., Vigna, E. (2013). Mortality surface by means of continuous time cohort models. Insurance: Mathematics and Economics
Insurance: Mathematics and Economics 88, 181-195.
Lee, R. D., Carter, L. R. (1992). Modeling and forecasting US mortality. Journal of the American Statistical Association
Journal of Applied Probability
Demography
Insurance: Mathematics and Economics
Insurance: Mathematics and Economics
Insurance: Mathematics and Economics
Risks
North American Actuarial Journal
Wang, Y., Zhang, N., Jin, Z., Ho, T. L. (2019). Pricing longevity-linked derivatives using a stochastic mortality model. Communications in Statistics-Theory and Methods, 48(24), 5923-5942.
Wong, T. W., Chiu, M. C., Wong, H. Y. (2017). Managing mortality risk with longevity bonds when mortality rates are cointegrated. Journal of Risk and Insurance
Annals of Actuarial Science.
Yan, H., Peters, G., Chan, J. (2020). Multivariate long memory cohort mortality models. ASTIN Bulletin
European Journal of Population 35, 675-694.