Mean-Variance Investment and Risk Control Strategies -- A Time-Consistent Approach via A Forward Auxiliary Process

Abstract

We consider an optimal investment and risk control problem for an insurer under the mean-variance (MV) criterion. By introducing a deterministic auxiliary process defined forward in time, we formulate an alternative time-consistent problem related to the original MV problem, and obtain the optimal strategy and the value function to the new problem in closed-form. We compare our formulation and optimal strategy to those under the precommitment and game-theoretic framework. Numerical studies show that, when the financial market is negatively correlated with the risk process, optimal investment may involve short selling the risky asset and, if that happens, a less risk averse insurer short sells more risky asset.



Yang Shen ∗ and Bin Zou †

First Version: July 5, 2020; This Version: November 29, 2020

Accepted for publication in Insurance: Mathematics and Economics


Keywords: Optimal Reinsurance; Jump Diffusion; Hamilton-Jacobi-Bellman Equation; Time-consistent Control; Precommitment

JEL Code : G11; G22; C61

∗ School of Risk and Actuarial Studies, University of New South Wales, Sydney, NSW 2052, Australia. Email: [email protected].

† Corresponding author. Department of Mathematics, University of Connecticut, 341 Mansfield Road U1009, Storrs 06269-1009, USA. Email: [email protected]. Phone: +1-860-486-3921.

Investing premiums in financial assets and managing risk from underwriting are two fundamental business decisions for an insurer. The close interaction between these two decisions motivates us to model a combined financial and insurance market for an insurer and to study the two decisions simultaneously. With this in mind, we set up such a combined market consisting of one risk-free asset, one risky asset, and one risk process $R$ representing the liabilities per unit (or per policy). We apply a jump-diffusion model for the risk process $R$, which generalizes both the diffusion approximation model and the classical Cramér-Lundberg (CL) model. This modeling framework is general in the sense that it covers the risk process models used in the related literature as special cases; see, e.g., Browne (1995) and Højgaard and Taksar (1998) for early works without jumps, and Zeng et al. (2013) and Zeng et al. (2016) for more recent works with jumps. The setup of the combined [...] $L$ times $R$. One can easily see that such an assumption is equivalent to allowing the insurer to purchase proportional reinsurance to manage her risk exposure from underwriting. Please refer to Højgaard and Taksar (1998) and Schmidli (2001) for an excellent introduction to optimal proportional reinsurance. The insurer under consideration then decides the amount $\pi$ to be invested in the risky asset and the liability units $L$ in the insurance business, in order to achieve her optimization objective.

In the control literature within actuarial science, there are three popular choices for the optimization objective: (1) utility maximization, cf.
Yang and Zhang (2005), Bai and Guo (2008), and Liang et al. (2011); (2) risk minimization (e.g., ruin probability, VaR, and CVaR), cf. Schmidli (2001), Liu and Yang (2004), and Promislow and Young (2005); and (3) the mean-variance (MV) criterion, which is the one used in this paper. Please see the recent review article Cai and Chi (2020), the monograph Albrecher et al. (2017), and the references therein for the rich literature on optimal reinsurance under the first two objectives. MV portfolio selection (without risk control or reinsurance) was first studied by Markowitz (1952) and is a cornerstone of modern portfolio theory. So many works extend Markowitz (1952) in so many directions that it is almost impossible to give credit to all of them, even in a review article. Here we only mention Li and Ng (2000), Zhou and Li (2000), Basak and Chabakauri (2010), Björk and Murgoci (2010), and Björk et al. (2014) as a short list. It is well known that a standard dynamic MV problem is a time-inconsistent control problem, in the sense that an optimal strategy obtained at time $t_1$ may cease to be optimal at a later time $t_2$. For such a strategy to be followed after the initial time, the agent needs to commit herself to it. Hence, this type of optimal strategy is called the precommitment strategy in the literature; see Zhou and Li (2000). However, a rational agent who is aware of the time-inconsistency issue should search for a time-consistent equilibrium strategy, from which she has no incentive to deviate once it is obtained; see Basak and Chabakauri (2010) and Kryger and Steffensen (2010). For a complete analysis of a more general framework in both discrete and continuous time models, we refer readers to the influential work of Björk and Murgoci (2010).
In the coming paragraph, we focus on optimal MV (proportional) reinsurance problems and provide a selective literature review on the topic.

Bäuerle (2005) is among the early contributions to optimal MV reinsurance problems. The author applies the embedding technique and the standard Hamilton-Jacobi-Bellman (HJB) approach to obtain the precommitment proportional reinsurance strategy under the CL model. Bai and Zhang (2008) further add a no short-selling constraint on the investment strategy and solve for the precommitment strategy under both the CL model and the diffusion model for the risk process. Shen and Zeng (2014) introduce delay into the controlled portfolio and apply a stochastic maximum principle to handle such a problem. Shen and Zeng (2015) propose an asset model in which both the appreciation rate and the volatility of the risky asset follow non-Markovian processes, and apply a backward stochastic differential equation (BSDE) approach to obtain the precommitment strategy. On the other hand, time-consistent MV investment-reinsurance problems are first studied by Zeng and Li (2011) under a standard Black-Scholes type financial market and a diffusion risk process. They apply the method from Björk and Murgoci (2010) and obtain the equilibrium strategy by solving an extended HJB system. Zeng and his collaborators further include jumps in both the risky assets and the risk process in Zeng et al. (2013) and ambiguity aversion in Zeng et al. (2016). Li and Li (2013) allow the risk aversion to be state dependent, similar to that in Björk et al. (2014). See Li et al. (2015) for an extension with stochastic interest rate and inflation risk. A recent paper by Cao et al. (2020) studies the problem in a contagious model where the risk process is given by a self-exciting Hawkes process. The above-mentioned papers consider Markovian, also called feedback or closed-loop, controls in the analysis. Two recent works, Wang et al.
(2019) and Yan and Wong (2020), consider open-loop controls under a model with random parameters and a stochastic volatility model, respectively.

This paper lies in the category of time-consistent strategies for MV investment and proportional reinsurance problems. We summarize the main results and contributions of this paper as follows.

• We propose an alternative time-consistent formulation of the insurer's original time-inconsistent MV problem, which is different from both the general approach of Björk and Murgoci (2010) and the special approach of Basak and Chabakauri (2010). To be precise, we introduce an auxiliary deterministic process $Y$ defined forward in time and use $Y$ to replace the conditional expectation of the insurer's wealth $X$ in the original objective $J$, leading to a modified MV objective $\mathcal{J}$ (see (2.11) for details). We then consider an alternative MV problem in which the pair $(X, Y)$ is taken as the state process and $\mathcal{J}$ is the optimization objective. Because of the introduction of $Y$ and the enlargement of the state space in the definition of $\mathcal{J}$ (which together "kill" the troubling variance term in $J$), the alternative MV problem is a time-consistent control problem and can be solved by the standard HJB method. In comparison, Björk and Murgoci (2010) introduce an auxiliary stochastic process defined backward in time to handle the square term in the MV problems, while Basak and Chabakauri (2010) make clever use of the total variance formula to derive a heuristic HJB equation. In terms of applicability to time-inconsistent problems, the approach of Björk and Murgoci (2010) is the most general one and the approach of Basak and Chabakauri (2010) is likely the most restrictive one, as it only applies to the MV problems. The approach of this paper is on the special side, like that of Basak and Chabakauri (2010), but it can also handle general non-linear terms involving conditional expectations in the objective, beyond the square term in the MV objective.
• The main approach used in this paper follows from Yang (2020), which considers a standard MV portfolio selection problem without risk control in a Black-Scholes market model. On the technical level, we extend the work of Yang (2020) by including an additional risk control strategy for an insurer whose risk process is modeled by a jump-diffusion process and is correlated with the risky asset in the financial market. Such an extension leads to many interesting findings along both analytical and numerical directions.

• We fully solve both the time-inconsistent and the time-consistent MV investment and risk control problems in the paper, i.e., we obtain both the precommitment and the (time-consistent) optimal strategies, and the efficient frontier and the value function of both problems in closed form. Utilizing these explicit results, we conduct a comprehensive analysis to compare our optimal and precommitment strategies with those obtained in the standard game-theoretic framework (see Björk and Murgoci (2010)) and the precommitted framework (see Zhou and Li (2000)).

• We also make contributions to the literature in the direction of economic analysis of the insurer's optimal strategies, which is missing or insufficient in many related works, including Yang (2020). We discuss how the model parameters and the insurer's risk profile affect the optimal strategies both analytically (see Section 3.3) and numerically (see Section 5). In particular, we find that the correlation between the financial market and the insurance market (risk process) plays a key role in the insurer's optimal decisions. Both analytical results (see Table 1) and numerical findings (see Figure 1) show that when the correlation coefficient $\rho$ is positive, the insurer increases her investment in the risky asset and holds more liabilities from underwriting as $\rho$ increases. When $\rho$ is negative, we obtain several interesting findings. First, the optimal liability strategy is no longer monotone with respect to $\rho$, but rather has a convex relation, decreasing first and then increasing. Next, when $\rho$ is very negative (close to $-1$), the optimal investment involves short selling the risky asset, and a less risk averse insurer short sells more of the risky asset.

We organize the rest of this paper as follows. In Section 2, we present the market model and state the insurer's MV investment and risk control problems. In Section 3, we obtain explicit solutions to the insurer's MV problems.
We then compare our formulation, approach, and optimal strategy with those under the game-theoretic and precommitted frameworks in Section 4. Section 5 contains our numerical studies, which focus on the impact of correlation and jumps on the optimal strategy. Our conclusions are summarized in Section 6. Appendix A collects technical proofs.

Let us fix a complete filtered probability space $(\Omega, \mathcal{F}, \mathbb{F} = (\mathcal{F}_t)_{t \in [0,T]}, \mathbb{P})$ over a finite time horizon $[0, T]$, with $T < \infty$. Here $\mathcal{F}_t$ contains all the information up to time $t$ and the filtration $\mathbb{F}$ satisfies the usual hypotheses. We interpret $\mathbb{P}$ as the physical probability measure. We assume all the stochastic processes below are well defined and adapted to the given filtration $\mathbb{F}$.

We consider an insurer who makes business decisions in a combined financial and insurance market, similar to the one introduced in Zeng and Li (2011) and Zou and Cadenillas (2014). In the financial market, there is one risk-free asset and one risky asset (a stock or an index) whose price dynamics are given respectively by

$$dS_0(t) = r S_0(t)\,dt, \quad S_0(0) = 1, \tag{2.1}$$

$$dS_1(t) = S_1(t)\big(\mu\,dt + \sigma\,dW_1(t)\big), \quad S_1(0) > 0, \tag{2.2}$$

where $r, \mu, \sigma > 0$ are constants and $W_1$ is a standard Brownian motion. Here, $r$ is the risk-free interest rate, and $\mu$ and $\sigma$ are the appreciation rate and volatility of the risky asset, respectively. The insurer can dynamically trade both assets without frictions and taxes. In the insurance market, we model the insurer's unit liabilities (risk) $R = (R(t))_{t \in [0,T]}$ by the following jump-diffusion process:

$$dR(t) = \alpha\,dt + \beta\Big(\rho\,dW_1(t) + \sqrt{1-\rho^2}\,dW_2(t)\Big) + \int_{\mathbb{R}} \gamma(t,z)\,N(dt,dz), \tag{2.3}$$

where $\alpha, \beta \ge 0$, $\rho \in [-1, 1]$, $W_2$ is another standard Brownian motion independent of $W_1$, $N$ is a Poisson random measure, and $\gamma(t,z) > 0$ for all $t$ and $z$. Here, $\rho$ captures the correlation between the financial market and the insurance market.

We impose several technical assumptions on the models (2.1)-(2.3) that will be enforced throughout the rest of the paper.
Suppose the compensated Poisson random measure $\widetilde{N}$ is given by

$$\widetilde{N}(dt,dz) = N(dt,dz) - \lambda\,dt\,dF_Z(z), \tag{2.4}$$

where $\lambda > 0$ and $F_Z$ is the distribution function of a random variable $Z$. In addition, $\gamma(t,\cdot) = \gamma(\cdot)$ is homogeneous and deterministic, and $\gamma(Z)$ has finite first and second moments, i.e.,

$$\gamma_1 := \int_{\mathbb{R}} \gamma(z)\,dF_Z(z) \in (0,\infty) \quad \text{and} \quad \gamma_2 := \int_{\mathbb{R}} \gamma^2(z)\,dF_Z(z) \in (0,\infty). \tag{2.5}$$

We assume $W_1$, $W_2$, $N$, and $Z$ are stochastically independent, and the filtration $\mathbb{F}$ is generated by them and augmented with $\mathbb{P}$-null sets.

In the financial market, the insurer chooses an investment strategy $\pi = (\pi(t))_{t \in [0,T]}$, where $\pi(t)$ denotes the amount of wealth invested in the risky asset at time $t$. In the insurance market, the insurer chooses a risk control strategy (or a liability strategy) $L = (L(t))_{t \in [0,T]}$, where $L(t)$ denotes the amount of liabilities in the underwriting at time $t$. Assume the unit premium rate, corresponding to the unit liabilities (risk) $R$, is given by $p$, where $p > 0$. For any fixed risk control strategy $L$, the gains from the insurance business evolve according to $L(t)\big(p\,dt - dR(t)\big)$.

Remark 2.1.

In the combined market model (2.1)-(2.3), we set the model parameters ($r$, $\mu$, $\sigma$, $\alpha$, $\beta$, and $\lambda$) to be positive constants. We comment that all the analysis and results in the sequel hold if these parameters are given by deterministic, bounded, and positive processes, and the volatility process $\sigma$ is bounded away from zero (i.e., there exists a positive constant $K$ such that $\sigma(t) \ge K > 0$). Similarly, if the deterministic function $\gamma(t,\cdot)$ is not time homogeneous, we need to replace $\gamma_i$ in (2.5) by $\gamma_i(t)$, and assume $\gamma_i(t)$ are bounded for all $t$, where $i = 1, 2$. Given that all the parameters are constants and the function $\gamma$ is homogeneous, we set the unit premium rate to be a positive constant $p$. The above assumptions on modeling, albeit strong, are standard and popular in the related literature; see, e.g., Schmidli (2001), Yang and Zhang (2005), Moore and Young (2006), and Zeng and Li (2011).

The risk model (2.3) incorporates several well known models in actuarial science. If we set $\alpha = \beta = 0$, then the model (2.3) is a generalization of the classical Cramér-Lundberg (CL) model. Recall that the risk process $R(t)$ in the CL model is given by $R(t) = \sum_{i=1}^{\widehat{N}(t)} \widehat{C}_i$, where $\widehat{N} = (\widehat{N}(t))_{t \in [0,T]}$ is a homogeneous Poisson process with constant intensity $\widehat{\lambda}$ and $(\widehat{C}_i)_{i=1,2,\cdots}$ is a series of independent and identically distributed random variables, also independent of $\widehat{N}$. Comparing (2.3) with the CL model shows that (i) $\lambda = \widehat{\lambda}$ and (ii) $\gamma(t, Z)$ and $\widehat{C}$ have the same distribution. If we set $\lambda = 0$, i.e., no jumps in (2.3), then the model (2.3) can be seen as a diffusion approximation to the CL model; see, e.g., Browne (1995), Højgaard and Taksar (1998), and Moore and Young (2006).
In such a case, we have $\alpha = \widehat{\lambda}\,\mathbb{E}[\widehat{C}]$ and $\beta^2 = \widehat{\lambda}\,\mathbb{E}[\widehat{C}^2]$.

In our framework, we interpret $L$ as the amount of liabilities the insurer decides to take in the insurance business, and $p$ as the premium rate the insurer receives from underwriting the policies against the risk $R$. This modeling choice follows from Stein (2012, Chapter 6) and its subsequent studies such as Zou and Cadenillas (2014), Peng and Wang (2016), and Bo and Wang (2017), where the motivation comes from the AIG case in the financial crisis of 2007-2008 and argues for a negative correlation $\rho < 0$.

It is clear that our risk control setup is consistent with the model of proportional reinsurance, an important topic in actuarial science. In the latter case, we should understand $R$ in (2.3) as the risk process of the insurer, and $p$ as the reinsurance premium paid by the insurer to the reinsurer. To manage risk, the insurer chooses proportional reinsurance, with $L$ denoting the retention proportion. The dynamics of the gains process are then given by $L(t)\,dR(t) - p(1 - L(t))\,dt$. Indeed, our setup (with $\lambda = 0$) is the same as that in Zeng and Li (2011) by taking $m = 0$ and $\sigma$ to be the negative value there.

In the combined market described above, let us introduce $u = (\pi, L)$ as a shorthand notation for the insurer's control or strategy. As usual, we consider self-financing strategies only in the analysis. For a fixed strategy $u$, we write $X^u = (X^u(t))_{t \in [0,T]}$ for the insurer's wealth process, and obtain the dynamics of $X^u$ as

$$dX^u(t) = \big(rX^u(t) + \bar{\mu}\pi(t) + \bar{p}L(t)\big)\,dt + \big(\sigma\pi(t) - \rho\beta L(t)\big)\,dW_1(t) - \beta\sqrt{1-\rho^2}\,L(t)\,dW_2(t) - L(t)\int_{\mathbb{R}} \gamma(z)\,N(dt,dz), \tag{2.6}$$

where the initial wealth $X(0)$ is a (positive) constant, and $\bar{\mu}$ and $\bar{p}$ are defined by

$$\bar{\mu} := \mu - r \quad \text{and} \quad \bar{p} := p - \alpha. \tag{2.7}$$

If we fix an initial state $X^u(t) = x$ at time $t$, where $t \in [0, T]$, and want to emphasize the dependence of $X^u$ on the initial state, we use the notation $(X^u_{t,x}(s))_{s \in [t,T]}$.

To make sure the stochastic differential equation (SDE) (2.6) admits a unique strong solution, we need to impose square integrability conditions on strategies. Given the nature of the Markovian framework, we consider Markov (feedback) controls, $\pi(t) = \widetilde{\pi}(t, X^u(t))$ and $L(t) = \widetilde{L}(t, X^u(t))$, for some deterministic functions $\widetilde{\pi}$ and $\widetilde{L}$. We are now ready to state the admissible set of strategies, denoted by $\mathcal{A}$, as follows.

Definition 2.2.

A strategy $u$ is called admissible if (1) $u$ is progressively measurable with respect to the underlying filtration $\mathbb{F}$, (2) $\mathbb{E}\big[\int_0^T \pi^2(t)\,dt\big] < \infty$ and $\mathbb{E}\big[\int_0^T L^2(t)\,dt\big] < \infty$, (3) $u$ is a feedback control, and (4) $X^u$ satisfies the SDE (2.6).

Remark 2.3.

In defining the admissible set, we do not impose $L \ge 0$. In other words, we allow $L < 0$, which corresponds to the case of taking a strategy greater than 1 in proportional reinsurance and is often interpreted as acquiring new business in the literature; see, e.g., Bäuerle (2005) and Zeng and Li (2011). If we insist on $L \ge 0$, we can impose an extra condition on the model parameters, so that the optimal liability strategy $L^*$ is always non-negative; see Eq. (6) in Zou and Cadenillas (2014).

We consider a representative insurer who is a mean-variance (MV) type agent. Namely, the insurer prefers higher mean and lower variance of her terminal wealth. Following the standard literature of Li and Ng (2000) and Zhou and Li (2000), we define the insurer's objective functional $J$ by

$$J(t, x; u) := \mathbb{E}_{t,x}[X^u(T)] - \frac{\theta}{2}\,\mathbb{V}_{t,x}[X^u(T)], \tag{2.8}$$

where $\theta > 0$ is the risk aversion parameter, $X^u$ is given by (2.6), and $\mathbb{E}_{t,x}$ (resp. $\mathbb{V}_{t,x}$) denotes taking conditional expectation (resp. variance) given $X^u(t) = x$ under the physical measure $\mathbb{P}$. We now state the insurer's MV investment and risk control problem as follows.

Problem 2.4 (A Time-Inconsistent MV Problem). The insurer seeks a strategy $u^{\mathrm{pre}} = (u^{\mathrm{pre}}(s))_{s \in [t,T]}$ to maximize the objective functional $J$ defined in (2.8), i.e., the insurer solves the following MV problem:

$$V(t, x) := \sup_{u \in \mathcal{A}} J(t, x; u). \tag{2.9}$$

We call $V$ the value function to Problem (2.9).

As is well known in the literature, a solution to Problem (2.9) is time-inconsistent, and hence the insurer has the incentive to deviate from such a strategy at a later time $s > t$. As such, a strategy solving Problem (2.9) will be followed over the remaining period $[t, T]$ only if the insurer commits to it. For this reason, we call a solution to Problem (2.9) a precommitment strategy, with notation $u^{\mathrm{pre}}$. Time-consistent formulations and approaches to the MV problems are then proposed and investigated, and most, if not all, of them follow the influential work of Björk and Murgoci (2010).
However, in this paper we take a different approach, following Yang (2020), which, although less general, is simple and sufficient to handle the problem under our framework.

Inspired by Yang (2020), for any $u = (\pi, L) \in \mathcal{A}$, we introduce an auxiliary process $(Y^u_{t,y}(s))_{s \in [t,T]}$, which is defined forward in time by

$$dY^u_{t,y}(s) = \big(rY^u_{t,y}(s) + \bar{\mu}\,\mathbb{E}_{t,y}[\pi(s)] + (\bar{p} - \lambda\gamma_1)\,\mathbb{E}_{t,y}[L(s)]\big)\,ds, \quad s \in [t, T], \tag{2.10}$$

where $Y^u_{t,y}(t) = y$ is the initial state for arbitrary but fixed $t \in [0, T]$ and $y \in \mathbb{R}$, $\gamma_1$ is given by (2.5), and $\bar{\mu}$ and $\bar{p}$ are defined in (2.7). Here, the initial state value $y$ may be different from that of $X^u_{t,x}(t) = x$. Using (2.6), we obtain that

$$
\begin{aligned}
\mathbb{E}_{t,y}[X^u_{t,y}(s)] &= y + \mathbb{E}_{t,y}\Big[\int_t^s \big(rX^u_{t,y}(v) + \bar{\mu}\pi(v) + (\bar{p} - \lambda\gamma_1)L(v)\big)\,dv\Big] + \mathbb{E}_{t,y}\Big[\int_t^s \big(\sigma\pi(v) - \beta\rho L(v)\big)\,dW_1(v)\Big] \\
&\quad - \mathbb{E}_{t,y}\Big[\int_t^s \beta\sqrt{1-\rho^2}\,L(v)\,dW_2(v)\Big] - \mathbb{E}_{t,y}\Big[\int_t^s\!\!\int_{\mathbb{R}} L(v)\gamma(z)\,\widetilde{N}(dv,dz)\Big] \\
&= y + \int_t^s \big(r\,\mathbb{E}_{t,y}[X^u_{t,y}(v)] + \bar{\mu}\,\mathbb{E}_{t,y}[\pi(v)] + (\bar{p} - \lambda\gamma_1)\,\mathbb{E}_{t,y}[L(v)]\big)\,dv, \quad \forall\, s \in [t, T],
\end{aligned}
$$

where $X^u_{t,y}(t) = y$, and $\widetilde{N}$ and $\gamma_1$ are defined respectively by (2.4) and (2.5). Here, we have used the square integrability conditions of $\pi$ and $L$ from Definition 2.2 and $\gamma_2 = \mathbb{E}[\gamma^2(Z)] < \infty$ in (2.5) to conclude that the (conditional) expectations of the two Itô integrals and the integral with respect to $\widetilde{N}$ are zero.
By comparing the above result with the dynamics of $Y$ in (2.10), we see that $Y^u_{t,y}(s) = \mathbb{E}_{t,y}[X^u_{t,y}(s)]$ for all $s \in [t, T]$ and $y \in \mathbb{R}$.

We now treat both $X$ and $Y$ as state processes and consider a modified objective functional $\mathcal{J}$ defined by

$$\mathcal{J}(t, x, y; u) = \mathbb{E}_{t,x,y}\Big[X^u_{t,x}(T) - \frac{\theta}{2}\big(X^u_{t,x}(T) - Y^u_{t,y}(T)\big)^2\Big], \tag{2.11}$$

where $\theta > 0$ is the risk aversion parameter, $X^u$ and $Y^u$ are given respectively by (2.6) and (2.10), and $\mathbb{E}_{t,x,y}$ denotes taking conditional expectation under $X^u_{t,x}(t) = x$ and $Y^u_{t,y}(t) = y$. Notice that there is a fundamental difference between $J$ in (2.8) and $\mathcal{J}$ in (2.11): we no longer have the "troubling" square term $(\mathbb{E}_{t,x}[X^u(T)])^2$, which causes time-inconsistency in Problem (2.9). Based on the new objective functional $\mathcal{J}$, we formulate a time-consistent version of the original Problem (2.9).

Problem 2.5 (A Time-Consistent MV Problem). The insurer seeks an optimal strategy $u^*$ to maximize the objective functional $\mathcal{J}$ defined in (2.11), i.e., the insurer solves the following MV problem:

$$\mathcal{V}(t, x, y) = \sup_{u \in \mathcal{A}} \mathcal{J}(t, x, y; u). \tag{2.12}$$

We call $\mathcal{V}$ the value function to Problem (2.12).

In this section, we first solve the insurer's time-inconsistent problem (Problem (2.9)) to obtain a precommitment strategy in Section 3.1 and then solve the insurer's time-consistent problem (Problem (2.12)) to obtain an optimal strategy in Section 3.2. We discuss the economic implications based on the results of Problem (2.12) in Section 3.3.

We impose a standing assumption for the subsequent analysis:

$$\beta^2(1 - \rho^2) + \lambda\gamma_2 \neq 0, \tag{3.1}$$

where $\gamma_2$ is given by (2.5). The assumption in (3.1) is rather weak and holds in most conditions. In fact, it only fails when there are (1) no jumps ($\lambda = 0$) and (2) no diffusion term ($\beta = 0$) or perfect correlation ($\rho = \pm 1$).

In this subsection, we solve the insurer's time-inconsistent MV problem, as formulated in Problem (2.9), and obtain explicit solutions in the following theorem.

Theorem 3.1.

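A precommitment strategy to Problem (2.9), denoted by $u^{\mathrm{pre}} = (\pi^{\mathrm{pre}}(s), L^{\mathrm{pre}}(s))_{s \in [t,T]}$, is given by

$$\pi^{\mathrm{pre}}(s) = -\kappa_1\Big(X^{\mathrm{pre}}(s) - \Big(x e^{r(T-t)} + \frac{1}{\theta} e^{\kappa_3(T-t)}\Big) e^{-r(T-s)}\Big) \quad \text{and} \quad L^{\mathrm{pre}}(s) = \frac{\kappa_2}{\kappa_1}\,\pi^{\mathrm{pre}}(s), \tag{3.2}$$

where $X^{\mathrm{pre}}$ is the wealth process under the precommitment strategy $u^{\mathrm{pre}}$ and the initial state $(t, x)$, and the constants $\kappa_i$, $i = 1, 2, 3$, are given by

$$\kappa_1 := \frac{\bar{\mu}(\beta^2 + \lambda\gamma_2) + \rho\beta\sigma(\bar{p} - \lambda\gamma_1)}{(\beta^2(1-\rho^2) + \lambda\gamma_2)\,\sigma^2}, \qquad \kappa_2 := \frac{\rho\beta\bar{\mu} + (\bar{p} - \lambda\gamma_1)\sigma}{(\beta^2(1-\rho^2) + \lambda\gamma_2)\,\sigma}, \tag{3.3}$$

$$\kappa_3 := \frac{(\beta^2 + \lambda\gamma_2)\bar{\mu}^2 + 2\rho\beta\sigma\bar{\mu}(\bar{p} - \lambda\gamma_1) + (\bar{p} - \lambda\gamma_1)^2\sigma^2}{(\beta^2(1-\rho^2) + \lambda\gamma_2)\,\sigma^2} = \frac{\bar{\mu}^2}{\sigma^2} + \frac{\big(\bar{p} - \lambda\gamma_1 + \rho\beta\bar{\mu}/\sigma\big)^2}{\beta^2(1-\rho^2) + \lambda\gamma_2}, \tag{3.4}$$

with $\gamma_1$ and $\gamma_2$ defined in (2.5), and $\bar{\mu}$ and $\bar{p}$ defined in (2.7).

Proof. Please refer to Appendix A for a proof.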
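As a rough numerical companion to the constants in (3.3) and (3.4), the sketch below evaluates $\kappa_1$, $\kappa_2$, $\kappa_3$ and the gap in the identity $\bar{\mu}\kappa_1 + (\bar{p} - \lambda\gamma_1)\kappa_2 = \kappa_3$ recorded later in (3.16). The function name `kappas`, the dictionary `PARAMS`, and all parameter values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def kappas(mu_bar, p_bar, sigma, beta, rho, lam, gamma1, gamma2):
    """Evaluate (kappa_1, kappa_2, kappa_3) from (3.3)-(3.4)."""
    p_tilde = p_bar - lam * gamma1                  # premium loading net of expected jumps
    b2 = beta ** 2 + lam * gamma2                   # beta^2 + lambda * gamma_2
    d = beta ** 2 * (1.0 - rho ** 2) + lam * gamma2 # quantity assumed non-zero in (3.1)
    if d == 0.0:
        raise ValueError("standing assumption (3.1) is violated")
    k1 = (mu_bar * b2 + rho * beta * sigma * p_tilde) / (d * sigma ** 2)
    k2 = (rho * beta * mu_bar + p_tilde * sigma) / (d * sigma)
    k3 = (b2 * mu_bar ** 2
          + 2.0 * rho * beta * sigma * mu_bar * p_tilde
          + (p_tilde * sigma) ** 2) / (d * sigma ** 2)
    return k1, k2, k3


# Hypothetical parameter values (illustration only, not from the paper).
PARAMS = dict(mu_bar=0.05, p_bar=0.15, sigma=0.25, beta=0.2,
              rho=-0.3, lam=0.5, gamma1=0.1, gamma2=0.02)

k1, k2, k3 = kappas(**PARAMS)
p_tilde = PARAMS["p_bar"] - PARAMS["lam"] * PARAMS["gamma1"]

# Gap in the identity mu_bar * kappa_1 + (p_bar - lambda * gamma_1) * kappa_2 = kappa_3.
identity_gap = PARAMS["mu_bar"] * k1 + p_tilde * k2 - k3
print(k1, k2, k3, identity_gap)
```

With these inputs the gap should vanish up to floating-point error, which gives a quick way to catch transcription slips when the constants are coded up for other parameter sets.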

Proposition 3.2.

Let $X^{\mathrm{pre}}$ be the insurer's wealth process under the precommitment strategy $u^{\mathrm{pre}}$ given by (3.2). We have

$$\mathbb{E}_{t,x}[X^{\mathrm{pre}}(T)] = x e^{r(T-t)} + \frac{e^{\kappa_3(T-t)} - 1}{\theta} \quad \text{and} \quad \mathbb{V}_{t,x}[X^{\mathrm{pre}}(T)] = \frac{e^{\kappa_3(T-t)} - 1}{\theta^2}, \tag{3.5}$$

where $\kappa_3$ is given in (3.4). The efficient frontier of Problem (2.9) is obtained by

$$\mathbb{V}_{t,x}[X^{\mathrm{pre}}(T)] = \frac{\big(\mathbb{E}_{t,x}[X^{\mathrm{pre}}(T)] - x e^{r(T-t)}\big)^2}{e^{\kappa_3(T-t)} - 1},$$

and the value function (mean-variance tradeoff) of Problem (2.9) is given by

$$V(t, x) = \mathbb{E}_{t,x}[X^{\mathrm{pre}}(T)] - \frac{\theta}{2}\,\mathbb{V}_{t,x}[X^{\mathrm{pre}}(T)] = x e^{r(T-t)} + \frac{e^{\kappa_3(T-t)} - 1}{2\theta}. \tag{3.6}$$

Proof. By plugging $u^{\mathrm{pre}}$ in (3.2) back into the SDE (2.6), we obtain the above explicit results.

Remark 3.3.

From (3.2), one can easily see that the strategy $u^{\mathrm{pre}} = (\pi^{\mathrm{pre}}, L^{\mathrm{pre}})$ strongly depends on the initial state $(t, x)$, so a more precise but also more cumbersome notation is to replace it by $u^{\mathrm{pre}}_{t,x} = (\pi^{\mathrm{pre}}_{t,x}, L^{\mathrm{pre}}_{t,x})$. Also it is obvious from (3.2) that $u^{\mathrm{pre}}$ is indeed time-inconsistent as claimed. To see this, let $t < t_1 < t_2 < T$ and let $X^{\mathrm{pre}}_{t,x}(t_1)$ be the corresponding wealth at time $t_1$ under the strategy $u^{\mathrm{pre}}$. Using (3.2), we have $u^{\mathrm{pre}}_{t,x}(t_2) \neq u^{\mathrm{pre}}_{t_1, X^{\mathrm{pre}}_{t,x}(t_1)}(t_2)$ in general. That means the "best" strategy for a future time $t_2$ found at the state $(t, x)$ is not the same as the one found at the state $(t_1, X^{\mathrm{pre}}_{t,x}(t_1))$. Here, by the "best" strategy, we mean a solution to Problem (2.9).

In Problem (2.9), $\theta$ is a free parameter, called the insurer's risk aversion parameter, which specifies the insurer's risk attitude towards the mean-variance tradeoff. Since $\kappa_3 > 0$ due to (3.4) and $\theta > 0$ by definition, we derive from (3.5) that there is a one-to-one relation between the target expected terminal wealth $m := \mathbb{E}_{t,x}[X^{\mathrm{pre}}(T)]$ and the risk aversion parameter $\theta$, for all $m > x e^{r(T-t)}$.

The key to solving Problem (2.12) is the standard Hamilton-Jacobi-Bellman (HJB) approach, as presented in Theorem 3.4. The proof of this theorem is rather standard in the literature and is based on the flow property of SDEs, the dynamic programming principle, and a verification theorem. We omit the proof here and refer readers to Theorems 3.3 and 3.4, and Proposition 3.5 in Chapter 4 of Yong and Zhou (1999) for a standard proof on a more general control problem.

Theorem 3.4.

Suppose there exists a classical solution $\mathcal{V}$ to Problem (2.12). Then $\mathcal{V}$ solves the following Hamilton-Jacobi-Bellman (HJB) equation:

$$
\begin{aligned}
\mathcal{V}_t(t,x,y) + \sup_{\pi, L \in \mathbb{R}} \Big\{ &(rx + \bar{\mu}\pi + \bar{p}L)\,\mathcal{V}_x(t,x,y) + \frac{1}{2}\big[(\sigma\pi - \rho\beta L)^2 + \beta^2(1-\rho^2)L^2\big]\,\mathcal{V}_{xx}(t,x,y) \\
&+ \big(ry + \bar{\mu}\pi + (\bar{p} - \lambda\gamma_1)L\big)\,\mathcal{V}_y(t,x,y) + \lambda\int_{\mathbb{R}} \big[\mathcal{V}(t, x - L\gamma(z), y) - \mathcal{V}(t,x,y)\big]\,dF_Z(z) \Big\} = 0,
\end{aligned} \tag{3.7}
$$

for all $(t, x, y) \in [0, T) \times \mathbb{R} \times \mathbb{R}$, and satisfies the terminal condition

$$\mathcal{V}(T, x, y) = x - \frac{\theta}{2}(x - y)^2, \quad \forall\, x, y \in \mathbb{R}. \tag{3.8}$$

By applying Theorem 3.4, we obtain explicit solutions to the optimal strategy and the value function of Problem (2.12), as summarized below.

Theorem 3.5.

An optimal strategy to Problem (2.12), denoted by $u^* = (\pi^*(s), L^*(s))_{s \in [t,T]}$, is given by

$$\pi^*(s) = \frac{\kappa_1}{\theta}\,e^{-r(T-s)} \quad \text{and} \quad L^*(s) = \frac{\kappa_2}{\theta}\,e^{-r(T-s)}, \quad s \in [t, T], \tag{3.9}$$

where $\kappa_1$ and $\kappa_2$ are defined in (3.3). The value function $\mathcal{V}(t, x, y)$ to Problem (2.12) is given by

$$\mathcal{V}(t, x, y) = -\frac{\theta}{2}\,e^{2r(T-t)}(x - y)^2 + e^{r(T-t)}x + \frac{\kappa_3}{2\theta}(T - t), \tag{3.10}$$

where $\kappa_3$ is defined in (3.4).

Proof. Please see Appendix A for a proof.
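The statement of Theorem 3.5 can be sanity-checked by plugging the candidate (3.10) into the HJB equation (3.7). The sketch below does this numerically under hypothetical parameter values. Because the candidate is quadratic in $x$, the jump integral in (3.7) reduces exactly to $\lambda\big(-L\gamma_1 \mathcal{V}_x + \tfrac{1}{2}L^2\gamma_2 \mathcal{V}_{xx}\big)$, so only the two moments in (2.5) are needed. The names `P`, `kappas`, `hjb_residual`, and the chosen state $(\tau, x, y)$ with $\tau = T - t$ are assumptions made for illustration.

```python
import math

# Hypothetical parameter values (illustration only, not from the paper).
P = dict(r=0.03, theta=2.0, mu_bar=0.05, p_bar=0.15, sigma=0.25,
         beta=0.2, rho=-0.3, lam=0.5, gamma1=0.1, gamma2=0.02)


def kappas(p):
    """kappa_1, kappa_2, kappa_3 from (3.3)-(3.4)."""
    pt = p["p_bar"] - p["lam"] * p["gamma1"]
    b2 = p["beta"] ** 2 + p["lam"] * p["gamma2"]
    d = p["beta"] ** 2 * (1.0 - p["rho"] ** 2) + p["lam"] * p["gamma2"]
    s, rb = p["sigma"], p["rho"] * p["beta"]
    k1 = (p["mu_bar"] * b2 + rb * s * pt) / (d * s ** 2)
    k2 = (rb * p["mu_bar"] + pt * s) / (d * s)
    k3 = (b2 * p["mu_bar"] ** 2 + 2.0 * rb * s * p["mu_bar"] * pt
          + (pt * s) ** 2) / (d * s ** 2)
    return k1, k2, k3


def hjb_residual(pi, L, tau, x, y, p):
    """V_t plus the bracket of (3.7) at controls (pi, L), using the candidate (3.10)."""
    r, theta = p["r"], p["theta"]
    e1, e2 = math.exp(r * tau), math.exp(2.0 * r * tau)
    k3 = kappas(p)[2]
    # Partial derivatives of (3.10); tau = T - t, hence d/dt = -d/dtau.
    v_t = theta * r * e2 * (x - y) ** 2 - r * e1 * x - k3 / (2.0 * theta)
    v_x = -theta * e2 * (x - y) + e1
    v_y = theta * e2 * (x - y)
    v_xx = -theta * e2
    pt = p["p_bar"] - p["lam"] * p["gamma1"]
    drift_x = r * x + p["mu_bar"] * pi + p["p_bar"] * L
    drift_y = r * y + p["mu_bar"] * pi + pt * L
    diff2 = ((p["sigma"] * pi - p["rho"] * p["beta"] * L) ** 2
             + p["beta"] ** 2 * (1.0 - p["rho"] ** 2) * L ** 2)
    # Jump integral in closed form, exact because the candidate is quadratic in x.
    jump = p["lam"] * (-L * p["gamma1"] * v_x + 0.5 * L ** 2 * p["gamma2"] * v_xx)
    return v_t + drift_x * v_x + 0.5 * diff2 * v_xx + drift_y * v_y + jump


tau, x, y = 1.0, 1.0, 0.8
k1, k2, _ = kappas(P)
pi_star = k1 / P["theta"] * math.exp(-P["r"] * tau)   # (3.9)
L_star = k2 / P["theta"] * math.exp(-P["r"] * tau)    # (3.9)

res_opt = hjb_residual(pi_star, L_star, tau, x, y, P)
res_off = hjb_residual(pi_star + 0.1, L_star - 0.1, tau, x, y, P)
print(res_opt, res_off)
```

If (3.9) and (3.10) are transcribed correctly, `res_opt` should be zero up to rounding and any perturbed control pair such as the one in `res_off` should give a strictly negative value, since the bracket is concave in $(\pi, L)$.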

Remark 3.6.

Since both the optimal investment strategy $\pi^*$ and the optimal liability strategy $L^*$ in (3.9) are independent of the initial state $(t, x)$, the optimal strategy $u^*$ in (3.9) is indeed time-consistent. We call $u^*$ an optimal strategy, instead of an equilibrium strategy, since the alternatively formulated Problem (2.12) is a standard time-consistent stochastic control problem.

We next comment on possible generalizations of the model (2.1)-(2.3). First, as mentioned in Remark 2.1, it is straightforward to extend to the case when all model parameters are deterministic and bounded processes. Indeed, we simply replace all the constant parameters in $\kappa_i$ by their corresponding (deterministic) process versions in $\kappa_i(t)$, where $i = 1, 2, 3$, and each product with a time argument by an appropriate integral, e.g., we change $r(T - t)$ to $\int_t^T r(s)\,ds$ and $\kappa_3(T - t)$ to $\int_t^T \kappa_3(s)\,ds$ in Theorems 3.1 and 3.5. Second, we consider only one risky asset in our model, but the analysis and key results in Theorems 3.1 and 3.5 apply to the case of multiple risky assets in a parallel way once we use appropriate matrix notation. In fact, our technique is adequate to handle an incomplete market with $n$ risky assets driven by $d$ independent Brownian motions, where $n \le d$. In such a case, we modify the assumption that $\sigma > 0$ to $\sigma(t)\sigma(t)^{\top} \ge K I_{n \times n}$ for some positive $K$ and all $t \in [0, T]$, where $\top$ denotes the transpose operator and $I_{n \times n}$ is an $n \times n$ identity matrix.

We next derive the efficient frontier of Problem (2.12) and present the results in the proposition below.

Proposition 3.7.

Let $X^*$ be the insurer's wealth process under the optimal strategy $u^*$ given by (3.9). We obtain the dynamic efficient frontier of Problem (2.12) by

$$\mathbb{V}_{t,x}[X^*(s)] = \frac{\big(\mathbb{E}_{t,x}[X^*(s)] - x e^{r(s-t)}\big)^2}{\kappa_3 (s - t)}, \quad t < s \le T, \tag{3.11}$$

and the mean-variance tradeoff by

$$\mathbb{E}_{t,x}[X^*(s)] - \frac{\theta}{2}\,\mathbb{V}_{t,x}[X^*(s)] = x e^{r(s-t)} + \frac{\kappa_3}{\theta}\,e^{-r(T-s)}\Big(1 - \frac{1}{2}e^{-r(T-s)}\Big)(s - t), \quad t \le s \le T. \tag{3.12}$$

Proof.

By plugging (3.9) into (2.6), we obtain e r ( T − s ) X ∗ ( s ) = e r ( T − t ) X ∗ ( t ) + ¯ µκ + (¯ p − λγ ) κ θ ( s − t ) + σκ − ρβκ θ (cid:0) W ( s ) − W ( t ) (cid:1) (3.13) − β p − ρ κ θ (cid:0) W ( s ) − W ( t ) (cid:1) − κ θ Z st Z R γ ( z ) e N (d v, d z ) , where κ and κ are deﬁned in (3.3). Taking expectation and variance on (3.13) given the initial state( t, X ∗ ( t ) = x ), we get E t,x [ X ∗ ( s )] = x e r ( s − t ) + ¯ µκ + (¯ p − λγ ) κ θ e − r ( T − s ) ( s − t )(3.14) = x e r ( s − t ) + κ θ e − r ( T − s ) ( s − t ) , t,x [ X ∗ ( s )] = e − r ( T − s ) ( s − t ) θ h ( σκ − ρβκ ) + β (1 − ρ ) κ + λγ κ i (3.15) = κ θ e − r ( T − s ) ( s − t ) , where we have used the following result¯ µκ + (¯ p − λγ ) κ = ( σκ − ρβκ ) + β (1 − ρ ) κ + λγ κ = κ > . (3.16)We obtain (3.16) by recalling the deﬁnitions of κ i , i = 1 , ,

3, in (3.3) and (3.4). Combining (3.14), (3.15),and (3.16) leads to the above dynamic eﬃcient frontier of Problem (2.12).With explicit results obtained in (3.14) and (3.15), we can provide a further veriﬁcation of the optimalinvestment strategy u ∗ in (3.9). To this end, let us recall the deﬁnition of Y u in (2.10) and the objectivefunctional J in (2.11). Denote X ∗ and Y ∗ the corresponding processes under the optimal strategy u ∗ . Weobtain J ( t, x, y ; u ∗ ) = E t,x,y (cid:20) X ∗ t,x ( T ) − θ (cid:0) X ∗ t,x ( T ) − Y ∗ t,y ( T ) (cid:1) (cid:21) = E t,x (cid:2) X ∗ t,x ( T ) (cid:3) − θ V t,x (cid:2) X ∗ t,x ( T ) (cid:3) − θ (cid:0) E t,x (cid:2) X ∗ t,x ( T ) (cid:3) − E t,y (cid:2) Y ∗ t,y ( T ) (cid:3)(cid:1) = x e r ( T − t ) + κ θ ( T − t ) − κ θ ( T − t ) − θ (cid:16) x e r ( T − t ) − y e r ( T − t ) (cid:17) = V ( t, x, y ) derived in (3.10) . (3.17)Hence, u ∗ given by (3.9) is optimal to Problem (2.12). In this subsection, we present economic discussions on the explicit results of Problem (2.12) in Theorem3.5 and Proposition 3.7.We ﬁrst derive two corollaries when the insurer has no access to the ﬁnancial market or decides notto take any insurance business. To this purpose, we directly follow the same arguments in the proof ofTheorem 3.5 by setting π ≡ L ≡

0) and obtain the optimal strategy and the value function in eachcase. We skip the cumbersome computations and report the results below.
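Before turning to the corollaries, the moment formulas (3.14)-(3.15), the identity (3.16), and the frontier (3.11) can be checked numerically. The following is a minimal sketch, not part of the original analysis. It assumes that $\kappa_1$ and $\kappa_2$ solve the first-order system stated in the proof of Theorem 3.5, that `g1` and `g2` stand for the first and second moments of the jump size, and that all parameter values are hypothetical and chosen only for illustration.

```python
import math

def kappas(mu_bar, sigma, p_bar, beta, rho, lam, g1, g2):
    """Return (kappa1, kappa2, kappa3) from the first-order conditions.

    g1, g2 are assumed to be the first and second moments of gamma(Z).
    """
    q = p_bar - lam * g1                         # premium net of expected jumps
    den = beta**2 * (1.0 - rho**2) + lam * g2    # assumed nonzero
    k1 = (mu_bar * (beta**2 + lam * g2) + rho * beta * sigma * q) / (sigma**2 * den)
    k2 = (sigma * q + rho * beta * mu_bar) / (sigma * den)
    k3 = mu_bar * k1 + q * k2
    return k1, k2, k3

def moments(x, r, theta, k3, t, s, T):
    """Mean (3.14) and variance (3.15) of X*(s) given X*(t) = x."""
    mean = x * math.exp(r * (s - t)) + k3 / theta * math.exp(-r * (T - s)) * (s - t)
    var = k3 / theta**2 * math.exp(-2.0 * r * (T - s)) * (s - t)
    return mean, var

# Hypothetical parameters (not the values of Table 2).
P = dict(mu_bar=0.05, sigma=0.25, p_bar=0.07, beta=0.10, rho=0.3,
         lam=0.1, g1=0.2, g2=0.04)
k1, k2, k3 = kappas(**P)

# Middle expression of (3.16): variance rate of the discounted wealth.
quad = ((P["sigma"] * k1 - P["rho"] * P["beta"] * k2) ** 2
        + P["beta"] ** 2 * (1.0 - P["rho"] ** 2) * k2 ** 2
        + P["lam"] * P["g2"] * k2 ** 2)

x, r, theta, t, s, T = 1.0, 0.05, 2.0, 0.0, 0.5, 1.0
mean, var = moments(x, r, theta, k3, t, s, T)
frontier = (mean - x * math.exp(r * (s - t))) ** 2 / (k3 * (s - t))  # (3.11)

print("(3.16) holds:", abs(quad - k3) < 1e-12)
print("(3.11) holds:", abs(frontier - var) < 1e-12)
```

Both checks are algebraic identities, so they should hold for any admissible parameter set, not only the one used here.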

Corollary 3.8.

If the insurer has no access to the financial market ($\pi(s) \equiv 0$ for all $s \in [t, T]$), the optimal risk control strategy $L^*$ to Problem (2.12) is given by

$$L^*(s)\Big|_{\pi \equiv 0} = \frac{\bar p - \lambda\gamma_1}{(\beta^2 + \lambda\gamma_2)\theta}\, e^{-r(T-s)}, \qquad \forall\, s \in [t, T],$$

and the value function is given by

$$V(t, x, y)\Big|_{\pi \equiv 0} = -\frac{\theta}{2} e^{2r(T-t)}(x-y)^2 + e^{r(T-t)} x + \frac{\kappa_4}{2\theta}(T-t), \qquad \text{where } \kappa_4 = \frac{(\bar p - \lambda\gamma_1)^2}{\beta^2 + \lambda\gamma_2}.$$

Here $\gamma_1$ and $\gamma_2$ denote the first and second moments of the jump size $\gamma(Z)$. The loss due to no access to investing in the financial market is measured by

$$V(t, x, y) - V(t, x, y)\Big|_{\pi \equiv 0} = \frac{\big(\bar\mu(\beta^2 + \lambda\gamma_2) + \rho\beta\sigma(\bar p - \lambda\gamma_1)\big)^2}{\sigma^2(\beta^2 + \lambda\gamma_2)\big(\beta^2(1-\rho^2) + \lambda\gamma_2\big)} \cdot \frac{T-t}{2\theta} > 0.$$

Corollary 3.9.

If the insurer sets $L(s) \equiv 0$ for all $s \in [t, T]$, the optimal investment strategy $\pi^*$ to Problem (2.12) is given by

$$\pi^*(s)\Big|_{L \equiv 0} = \frac{\bar\mu}{\theta\sigma^2}\, e^{-r(T-s)}, \qquad \forall\, s \in [t, T], \tag{3.18}$$

and the value function is given by

$$V(t, x, y)\Big|_{L \equiv 0} = -\frac{\theta}{2} e^{2r(T-t)}(x-y)^2 + e^{r(T-t)} x + \frac{\bar\mu^2}{2\theta\sigma^2}(T-t).$$

The loss due to not taking insurance business is measured by

$$V(t, x, y) - V(t, x, y)\Big|_{L \equiv 0} = \frac{\big(\bar p - \lambda\gamma_1 + \rho\beta\bar\mu/\sigma\big)^2}{\beta^2(1-\rho^2) + \lambda\gamma_2} \cdot \frac{T-t}{2\theta} > 0,$$

where the cross term carries the sign implied by the first-order conditions of the HJB equation (3.7).

Several important remarks and explanations are due regarding Theorem 3.5 and Corollaries 3.8 and 3.9. First, the optimal strategy depends on both the financial market and the insurance market. Namely, $\pi^*$ also depends on the risk model (2.3) and $\pi^* \neq 0$ in general, while $L^*$ depends on the price models (2.1)-(2.2) and $L^* \neq 0$ either. This observation testifies to the importance of considering a combined financial and insurance market in the risk management study for an insurer. Corollary 3.8 further shows that the loss in the value function is strictly positive for all $t \in [0, T)$; not having access to the financial market makes the insurer worse off. On the other hand, when the insurer does not engage in the insurance business (i.e., $L \equiv 0$), the insurer invests as if she were a utility maximizer equipped with an exponential utility $U(x) = -e^{-\theta x}$ (see Remark 1 in Basak and Chabakauri (2010) for further discussion of the connection with exponential utility maximization). In this case, the loss in the value function is also strictly positive, as shown in Corollary 3.9, even when the insurance policy is "cheap", i.e., when $\bar p = \lambda\gamma_1$ ($p\,\mathrm{d}t = \mathbb{E}[\mathrm{d}R(t)]$). Along this line, given $\bar p \le \lambda\gamma_1$ (insurance policies are underpriced), the optimal risk control strategy $L^*$ may still be positive if $\rho\beta\bar\mu > 0$, that is, if $\rho > 0$, given $\bar\mu > 0$. This finding is certainly interesting, as the insurer takes positive shares in a "losing" insurance business, which seems an irrational decision at first thought. But a second thought reveals that such a business provides a natural hedge to the risky asset, making it still desirable to hold positive shares in the portfolio. Another plausible explanation is that the insurer sees underwriting policies as a financing tool to raise capital for investment purposes. If the gain from investing premiums in the financial market outweighs the shortfall due to underpricing policies, the insurer indeed has the incentive to sell underpriced insurance policies.

Next, we investigate the impact of the correlation between the two markets on the optimal strategy. In the special case of $\rho = 0$, i.e., when the two markets are independent, we have

$$\pi^*(s)\Big|_{\rho = 0} = \frac{\bar\mu}{\theta\sigma^2}\, e^{-r(T-s)} \quad \text{and} \quad L^*(s)\Big|_{\rho = 0} = \frac{\bar p - \lambda\gamma_1}{\theta(\beta^2 + \lambda\gamma_2)}\, e^{-r(T-s)}, \qquad \forall\, s \in [t, T].$$

In particular, $\pi^*|_{\rho = 0} = \pi^*|_{L \equiv 0}$ is the same as the optimal strategy in the standard Merton problem with an exponential utility $U(x) = -e^{-\theta x}$. For the given risk process $R$ in (2.3), if $p \le \alpha + \lambda\gamma_1$ (i.e., $\bar p \le \lambda\gamma_1$), ruin occurs for sure to the insurer. With that in mind, let us suppose the following conditions hold for the rest of the section:

$$\bar p - \lambda\gamma_1 > 0 \quad \text{and} \quad \bar\mu = \mu - r > 0, \tag{3.19}$$

where the second inequality means the risky asset has a higher return than the risk-free asset. Let us denote by $\pi^*|_{\rho > 0}$ the optimal investment in the risky asset under a positive correlation $\rho$. We rewrite the optimal strategy in (3.9) as follows:

$$\pi^*(s) = \frac{\beta^2 + \lambda\gamma_2}{\beta^2(1-\rho^2) + \lambda\gamma_2} \cdot \pi^*(s)\Big|_{\rho = 0} + \rho\,\frac{\beta(\bar p - \lambda\gamma_1)}{\big(\beta^2(1-\rho^2) + \lambda\gamma_2\big)\sigma\theta}\, e^{-r(T-s)}, \tag{3.20}$$

from which we deduce $\pi^*(s)|_{\rho > 0} > \pi^*(s)|_{\rho = 0}$.
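The two loss formulas in Corollaries 3.8 and 3.9 and the decomposition (3.20) can be cross-checked against the first-order system of the HJB equation. The sketch below is illustrative only: `kappas` solves that system directly, `g1` and `g2` are assumed to be the first and second moments of the jump size, and every numerical value is hypothetical. Losses are reported per unit of $(T-t)/(2\theta)$.

```python
def kappas(mu_bar, sigma, q, beta, rho, lam_g2):
    """Solve sigma^2*k1 - rho*beta*sigma*k2 = mu_bar and
    -rho*beta*sigma*k1 + (beta^2 + lam_g2)*k2 = q by Cramer's rule."""
    det = sigma**2 * (beta**2 + lam_g2) - (rho * beta * sigma) ** 2
    k1 = (mu_bar * (beta**2 + lam_g2) + rho * beta * sigma * q) / det
    k2 = (sigma**2 * q + rho * beta * sigma * mu_bar) / det
    return k1, k2, mu_bar * k1 + q * k2

# Hypothetical parameters; q = p_bar - lam * g1 > 0 as required by (3.19).
mu_bar, sigma, beta, rho, lam, g1, g2 = 0.05, 0.25, 0.10, -0.4, 0.1, 0.2, 0.04
q, lam_g2 = 0.07 - lam * g1, lam * g2
den = beta**2 * (1.0 - rho**2) + lam_g2
k1, k2, k3 = kappas(mu_bar, sigma, q, beta, rho, lam_g2)

# Loss from having no financial market (Corollary 3.8): kappa3 - kappa4.
loss_no_invest = k3 - q**2 / (beta**2 + lam_g2)
closed_no_invest = ((mu_bar * (beta**2 + lam_g2) + rho * beta * sigma * q) ** 2
                    / (sigma**2 * (beta**2 + lam_g2) * den))

# Loss from taking no insurance business (Corollary 3.9): kappa3 - mu_bar^2/sigma^2.
loss_no_insur = k3 - mu_bar**2 / sigma**2
closed_no_insur = (q + rho * beta * mu_bar / sigma) ** 2 / den

# Decomposition (3.20), written for kappa1 (the common factor e^{-r(T-s)}/theta is dropped).
k1_split = ((beta**2 + lam_g2) / den * mu_bar / sigma**2
            + rho * beta * q / (den * sigma))

# Underpriced insurance (q < 0) can still give kappa2 > 0 when rho > 0.
_, k2_underpriced, _ = kappas(mu_bar, sigma, -0.005, beta, 0.3, lam_g2)

print(loss_no_invest, closed_no_invest)
print(loss_no_insur, closed_no_insur)
print(k1, k1_split, k2_underpriced)
```

The last line illustrates the remark on "losing" insurance business: with the assumed numbers, the hedging term $\rho\beta\bar\mu$ outweighs the shortfall $\sigma(\lambda\gamma_1 - \bar p)$, so the liability position stays positive.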
The economic meaning is that an insurer whose risk is positively correlated with the risky asset invests more aggressively in the risky asset, as if she were less risk averse. This makes perfect sense, since, with $\rho > 0$, a decrease in the price of the risky asset is accompanied by a decrease in the liabilities to be paid out, making the risky asset less risky to the insurer. However, when these two markets are negatively correlated, less can be said: although the factor in the first term in (3.20) is still greater than 1, the second term in (3.20) is negative. In consequence, we expect different parameter values to lead to different monotonicity results. One could carry out the same analysis on the optimal risk control strategy $L^*$, and the findings are the same. We point out that the numerical analysis in Zou and Cadenillas (2014) shows that $\pi^*$ is increasing with respect to $\rho \in (-1, 1)$, while $L^*$ seems to exhibit a "smile" shape (decreasing first and then increasing) as a function of negative $\rho$.

We continue to study the impact of other model parameters on the optimal strategy. After careful computations and analysis, we summarize the impact of all the model parameters on the optimal strategy and the value function in Table 1. Note the results are obtained under the additional conditions in (3.19). As the excess return $\bar\mu$ (recall $\bar\mu = \mu - r$) increases, the optimal investment strategy $\pi^*$ increases, and the optimal liability strategy $L^*$ increases (resp. decreases) if and only if $\rho > 0$ (resp. $\rho < 0$). The impact of the excess premium $\bar p$ (recall $\bar p = p - \alpha$) on the optimal strategy is exactly the opposite of that of $\bar\mu$. However, how the rest of the parameters affect the optimal strategy is less clear, with only partial results as presented in Table 1. As already discussed in detail in the preceding paragraph, when $\rho > 0$, the financial and insurance markets provide a natural hedge to each other, which allows us to gain full insight into the optimal strategy. When the two markets are negatively correlated, argued in Stein (2012) to be a main contributor to the failure of AIG during the financial crisis, the change of a parameter (e.g., $\beta$ and $\lambda$) results in opposite reactions from the two markets. To better explain this result, let us consider the impact of the asset volatility $\sigma$ on the optimal investment $\pi^*$. From (3.20), it is clear that, as $\sigma$ increases, the myopic component $\bar\mu/(\theta\sigma^2)$ decreases, but the second hedging component increases if $\rho < 0$. In summary, in the case of negative correlation $\rho < 0$, the net effect of a parameter change is ambiguous.

Table 1: Impact of the model parameters on the optimal strategy and the value function

| Parameter | Optimal Investment $\pi^*$ | Optimal Liability $L^*$ | Value Function $V$ |
|---|---|---|---|
| correlation $\rho$ | $\partial\pi^*/\partial\rho > 0$ if $\rho > 0$ | $\partial L^*/\partial\rho > 0$ if $\rho > 0$ | $\partial V/\partial\rho < 0$ if $\rho < 0$ |
| excess return $\bar\mu$ | $\partial\pi^*/\partial\bar\mu > 0$ | $\mathrm{sign}(\partial L^*/\partial\bar\mu) = \mathrm{sign}(\rho)$ | $\partial V/\partial\bar\mu > 0$ if $\rho < 0$ |
| asset volatility $\sigma$ | $\partial\pi^*/\partial\sigma < 0$ if $\rho > 0$ | $\mathrm{sign}(\partial L^*/\partial\sigma) = \mathrm{sign}(\rho)$ | $\partial V/\partial\sigma > 0$ if $\rho < 0$ |
| excess premium $\bar p$ | $\mathrm{sign}(\partial\pi^*/\partial\bar p) = \mathrm{sign}(\rho)$ | $\partial L^*/\partial\bar p > 0$ | $\partial V/\partial\bar p < 0$ if $\rho < 0$ |
| risk volatility $\beta$ | $\partial\pi^*/\partial\beta > 0$ if $\rho > 0$ | $\partial L^*/\partial\beta > 0$ if $\rho = 0$ | $\partial V/\partial\beta < 0$ if $\rho < 0$ |
| jump intensity $\lambda$ | $\partial\pi^*/\partial\lambda < 0$ if $\rho > 0$ | $\partial L^*/\partial\lambda < 0$ if $\rho > 0$ | $\partial V/\partial\lambda < 0$ if $\rho > 0$ |
| jump size $\gamma$ | $\partial\pi^*/\partial\gamma < 0$ if $\rho > 0$ | $\partial L^*/\partial\gamma < 0$ if $\rho > 0$ | $\partial V/\partial\gamma < 0$ if $\rho > 0$ |

Note. In the last row of "jump size $\gamma$", we derive the results assuming $\gamma(\cdot) \equiv \gamma$ is a positive constant.

From the efficient frontier (3.11) of Problem (2.12), one immediate observation is the so-called security market line (SML),

$$\mathbb{E}_{t,x}[X^*(s)] = x e^{r(s-t)} + \sqrt{\kappa_3 (s-t)} \times \sqrt{\mathbb{V}_{t,x}[X^*(s)]}.$$

Since the dependence of the value function $V(t, x, y)$ on the model parameters (except $r$) is only through the coefficient $\kappa_3$, the last column of Table 1 offers some monotonicity results regarding the slope of the SML. For instance, when $\rho < 0$, the slope of the SML is increasing with respect to (w.r.t.) the excess return $\bar\mu$ and the asset volatility $\sigma$, and decreasing w.r.t. the correlation coefficient $\rho$, the excess premium $\bar p$, and the risk volatility $\beta$.

Finally, according to Theorem 3.5, both the optimal investment strategy $\pi^*$ and the optimal liability strategy $L^*$ are independent of the wealth, but are increasing at an exponential rate with respect to the time variable. Setting $y = x$, we have $V_x(t, x, x) > 0$ and $V_t(t, x, x) < 0$. That is, with higher initial wealth $x$ or a longer investment horizon $T - t$, the insurer is able to derive a higher expected utility, which fits our intuition perfectly.

The goal of this section is to compare our approach and optimal strategy in Section 3 with those under the game-theoretic and precommitted frameworks.

4.1 Comparison Analysis with Game-Theoretic Strategies

If we consider a standard MV portfolio selection problem without risk control strategies, the optimal investment $\pi^*|_{L \equiv 0}$ is obtained in (3.18), which is the same as that in Basak and Chabakauri (2010) (without stochastic factor), Björk and Murgoci (2010), Björk et al. (2014) (under constant risk aversion), and Kryger et al. (2020). But as witnessed in Section 3, we arrive at the same result via a different analysis.

Next, let us compare our optimal strategy in (3.9) to those in the time-consistent MV investment-reinsurance literature. If we set $\rho = 0$ and $\gamma \equiv 0$, i.e., the financial market is independent of the insurance market and the risk process $R$ is a diffusion approximation process, then we recover the same results as in Zeng and Li (2011); see Theorem 2 therein. If we set $\rho = 0$, then our optimal strategy is consistent with that in Zeng et al. (2013) and Zeng et al. (2016) without ambiguity aversion. (There is a slight difference in the setup of risk management between ours and that in the works of Zeng and his collaborators; see Remark 2.1. For instance, using our notation, $m$ and $\theta$ in Zeng and Li (2011) are 0 and $\bar p - \lambda\gamma_1$.)

There is a key difference between our optimal strategy and those discussed above. Under (3.19), the optimal strategy in the above papers is always positive (excluding the wealth-dependent risk aversion case in Björk et al. (2014)) and does not involve short selling. In our model, the optimal strategy is positive if $\rho \ge 0$. However, if $\rho < 0$, it is possible that $\kappa_1$ and $\kappa_2$ are negative at the same time, or that they have different signs. In other words, our optimal strategy may involve short selling when $\rho < 0$; see Figure 1.

Our optimal strategy is independent of the wealth process $X$, since the risk aversion parameter $\theta$ is taken to be a constant in the analysis. For the same problem but under a wealth-dependent risk aversion (e.g., $\theta(x) = \text{constant}/x$), the optimal strategy is likely to be proportional to the wealth level; see Björk et al. (2014), Dai et al. (2020), and Kryger et al. (2020).

A standard approach to tackle the time-inconsistency issue of Problem (2.9) is to formulate the problem under a game-theoretic framework. That is, one sets up a game between the current self of an MV-type agent and her future incarnations (selves). A strategy $\hat u$ is "optimal", more often called an equilibrium (the term used hereafter in this subsection), if all her future incarnations living in $[t + \epsilon, T]$ will follow this strategy and there is no gain for her to choose a different strategy $u$ at time $t$ (lasting from $t$ to $t + \epsilon$ for an infinitesimal period $\epsilon$). The above "informal" definition of an equilibrium strategy $\hat u$ is indeed rigorous in a discrete-time model. To see this, suppose $\hat u = (\hat u(s))_{s \in [t, T]}$ is an equilibrium strategy and consider a "perturbed" strategy $u = (u(s))_{s \in [t, T]}$, where $u(t) = u \neq \hat u(t)$ and $u(s) = \hat u(s)$ for all $s = t+1, \cdots, T$. By the definition of an equilibrium strategy, we have $J(t, x; \hat u) \ge J(t, x; u)$ for any $\mathcal{F}_t$-measurable $u$ in the admissible domain. However, extending the same idea to a continuous-time model becomes problematic, since a deviation from $\hat u$ at time $t$ is only effective for an infinitesimal period of time (a set with Lebesgue measure zero) and hence its impact on the objective functional is negligible in most important problems (including the MV problem).
A strategy $\hat u$ is called an equilibrium strategy if

$$\liminf_{\epsilon \downarrow 0} \frac{J(t, x; \hat u) - J(t, x; u)}{\epsilon} \ge 0 \tag{4.1}$$

holds for all perturbed strategies $u$ within the admissible set, where we assume the goal is to maximize the objective $J$ in the problem. The definition (4.1) is proposed by Ekeland and Lazrak (2006) to study an optimization problem with a non-exponential discounting function. The equilibrium definition (4.1) is also fundamental and used in almost all the subsequent works on time-consistent MV and MV-reinsurance problems; see, e.g., Björk and Murgoci (2010), Kryger and Steffensen (2010), Zeng and Li (2011), Zeng et al. (2013), Björk et al. (2014), and many others.

However, the equilibrium condition (4.1) is only a necessary condition, not a sufficient condition. That brings an immediate problem: a strategy $\hat u$ satisfying (4.1) with an equality may fail to dominate another admissible strategy (see Remark 7.1 in Björk and Murgoci (2010)). Huang and Zhou (2020) construct a counterexample (see Example 4.3 therein) in which an equilibrium strategy $\hat u$ satisfying (4.1) is strictly worse than a different admissible strategy $u'$, i.e., $J(t, x; u') > J(t, x; \hat u)$. Namely, there exist cases where an agent has the incentive to deviate from an equilibrium strategy $\hat u$, as characterized by (4.1), which contradicts the very definition of equilibrium.

In comparison, the problem considered in this paper, Problem (2.12), is a standard time-consistent control problem. Namely, once a solution $u^*$ is obtained to Problem (2.12) at time $t$, the insurer will follow the strategy $u^*$ over $[t, T]$ and the optimality of $u^*$ holds trivially by definition, i.e.,

$$J(t, x, y; u^*) \ge J(t, x, y; u), \qquad \forall\, u \in \mathcal{A}. \tag{4.2}$$

In fact, our analysis in Section 3 does find an optimal strategy $u^*$ such that $J(t, x, y; u^*) = \sup_{u \in \mathcal{A}} J(t, x, y; u)$; see Theorem 3.5 and (3.17).
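The optimality statement (4.2) can be illustrated inside a restricted family of strategies. The sketch below is not a verification over the full admissible set: it only considers deterministic strategies of the form $\pi(s) = a\,e^{-r(T-s)}$, $L(s) = b\,e^{-r(T-s)}$, for which the mean and variance of terminal wealth are available in closed form. It assumes that the auxiliary process satisfies $\mathbb{E}_{t,x}[X^u(T)] - Y^u(T) = (x - y)e^{r(T-t)}$, as is the case in (3.17), and all numbers are hypothetical.

```python
import math

def objective(a, b, x, y, tau, r, theta, mu_bar, sigma, q, beta, rho, lam_g2):
    """J(t, x, y; u) for pi(s) = a*exp(-r(T-s)), L(s) = b*exp(-r(T-s)).

    q is the premium net of expected jumps; lam_g2 is lambda times the
    second moment of the jump size (both are assumptions of this sketch).
    """
    drift = mu_bar * a + q * b
    quad = (sigma**2 * a**2 - 2.0 * rho * beta * sigma * a * b
            + (beta**2 + lam_g2) * b**2)
    mean = x * math.exp(r * tau) + drift * tau
    var = quad * tau
    gap = (x - y) * math.exp(r * tau)   # E[X(T)] - Y(T), assumed as in (3.17)
    return mean - 0.5 * theta * (var + gap**2)

# Hypothetical parameters.
args = dict(x=1.0, y=0.9, tau=1.0, r=0.05, theta=2.0, mu_bar=0.05,
            sigma=0.25, q=0.05, beta=0.10, rho=-0.4, lam_g2=0.004)

# Candidate optimum: (kappa1, kappa2)/theta from the first-order system.
det = args["sigma"]**2 * (args["beta"]**2 + args["lam_g2"]) \
      - (args["rho"] * args["beta"] * args["sigma"])**2
k1 = (args["mu_bar"] * (args["beta"]**2 + args["lam_g2"])
      + args["rho"] * args["beta"] * args["sigma"] * args["q"]) / det
k2 = (args["sigma"]**2 * args["q"]
      + args["rho"] * args["beta"] * args["sigma"] * args["mu_bar"]) / det
k3 = args["mu_bar"] * k1 + args["q"] * k2
a_star, b_star = k1 / args["theta"], k2 / args["theta"]

j_star = objective(a_star, b_star, **args)
value = (-0.5 * args["theta"] * math.exp(2 * args["r"] * args["tau"])
         * (args["x"] - args["y"])**2
         + math.exp(args["r"] * args["tau"]) * args["x"]
         + k3 / (2.0 * args["theta"]) * args["tau"])

# Perturb the candidate in both coordinates; the objective should not improve.
worse = all(objective(a_star + da, b_star + db, **args) <= j_star + 1e-15
            for da in (-0.2, 0.0, 0.2) for db in (-0.5, 0.0, 0.5))
print(j_star, value, worse)
```

Within this family the objective is a concave quadratic in $(a, b)$, so the stationary point is the unique maximizer and its value coincides with the closed-form value function.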
The key differences between the equilibrium definition of Björk and Murgoci (2010) in (4.1) and the optimality definition in (4.2) are as follows: (1) we introduce an auxiliary process $Y$ (with initial value $y$) and consider a modified objective $J$, in which $Y$ replaces the conditional expectation in the original objective $J$; and (2) we seek an optimal strategy $u^*$ that maximizes $J$ over all admissible strategies. Because of these differences, our alternative formulation in (2.12) always leads to a well-defined optimal strategy (for a modified objective $J$), while the same cannot be said in general under the definition (4.1).

We discuss two differences between our approach and those in Björk and Murgoci (2010) and its follow-up works, such as Zeng and Li (2011) and Björk et al. (2014). The first one comes from the auxiliary process, while the second one comes from the HJB equation.

To facilitate the presentation, let us restate the MV problem considered in Björk and Murgoci (2010) and Björk et al. (2014) as follows:

$$\widehat V(t, x) = \sup_{u \in \mathcal{A}} \widehat J(t, x; u) = \sup_{u \in \mathcal{A}} \Big\{\mathbb{E}_{t,x}[X^u(T)] - \frac{\theta}{2}\mathbb{V}_{t,x}[X^u(T)]\Big\}, \qquad \theta > 0.$$

In Björk and Murgoci (2010), an essential technique to handle the "troubling" square term is to introduce an auxiliary process $g^u$ defined by

$$g^u(t, x) = \mathbb{E}_{t,x}[X^u(T)], \quad t < T, \qquad \text{and} \qquad g^u(T, x) = x. \tag{4.3}$$

Notice that $g^u$ in (4.3) is defined in a backward way and its dynamics $g^u(t, X^u(t))$ are stochastic, while our auxiliary process $Y^u$ is defined in a forward way and its dynamics $Y^u(t)$ are deterministic; see (2.10). Due to the deterministic nature of the process $Y^u$, we do not have $V_{yy}$ and $V_{xy}$ terms in our HJB equation (3.7).

For any admissible control $u \in \mathcal{A}$, define the associated Dynkin operator of the process $X^u$ in (2.6) by $\mathcal{L}^u$. As $u \in \mathcal{A}$, $X^u$ is the unique strong solution to the SDE (2.6) and $X^u(T)$ is square integrable.
It then follows from (4.3) that $g^u(t, X^u(t))$ is a martingale and we have $\mathcal{L}^u g^u(t, x) = 0$ for all $u \in \mathcal{A}$ by Dynkin's formula. Assuming an equilibrium strategy $\hat u$ as defined in (4.1) exists, Björk and Murgoci (2010) derive an extended system of HJB equations satisfied by the pair $(\widehat V(t, x), g^{\hat u}(t, x))$ and solve the system to obtain $\hat u$; see Theorems 2.1 and 7.4 therein. In our formulation, Problem (2.12) is a standard control problem with two state variables, and the HJB equation (3.7) is about the value function $V(t, x, y)$ only. To summarize, the approach of Björk and Murgoci (2010) ends up with solving a system of two one-dimensional partial differential equations (PDEs), while our approach leads to a two-dimensional PDE.

In Section 3, we obtain the precommitment strategy $u^{\mathrm{pre}}$ to Problem (2.9) in (3.2) and the optimal strategy $u^*$ to Problem (2.12) in (3.9). Since $u^{\mathrm{pre}}$ and $u^*$ are solutions to two different stochastic control problems, a direct side-by-side comparison has little mathematical meaning. However, since Problem (2.12) is an alternative time-consistent formulation of Problem (2.9), and both $u^{\mathrm{pre}}$ and $u^*$ are available investment and risk control strategies to the insurer, comparing the end results of these strategies makes economic sense.

First, we fix the same risk aversion parameter $\theta$ for both Problems (2.9) and (2.12), and investigate three important results: the mean $\mathbb{E}_{t,x}[X^u(T)]$, the variance $\mathbb{V}_{t,x}[X^u(T)]$, and the objective $J$ defined in (2.8). Let $X^{\mathrm{pre}}$ and $X^*$ denote the insurer's wealth process (2.6) under the precommitment strategy $u^{\mathrm{pre}}$ and the optimal strategy $u^*$, respectively. By comparing (3.5)-(3.6) with (3.14)-(3.15) and (3.12), we obtain:

$$\mathbb{E}_{t,x}[X^{\mathrm{pre}}(T)] > \mathbb{E}_{t,x}[X^*(T)], \qquad \mathbb{V}_{t,x}[X^{\mathrm{pre}}(T)] > \mathbb{V}_{t,x}[X^*(T)], \qquad J(t, x; u^{\mathrm{pre}}) > J(t, x; u^*). \tag{4.4}$$

From (4.4), we conclude that the optimal strategy $u^*$ is more conservative than the precommitment strategy $u^{\mathrm{pre}}$, leading to a smaller risk at the cost of a lower performance (mean). Note that to derive (4.4), we take the parameter $\theta$ to be the same for Problems (2.9) and (2.12). However, the results in (4.4) clearly reveal different risk attitudes in terms of both mean and variance. As such, in the next step, we fix the same target $\mathbb{E}_{t,x}[X^u(T)]$ for the insurer and study how the two different formulations achieve the same target.

Second, let us fix $m := \mathbb{E}_{t,x}[X^u(T)] > x e^{r(T-t)}$ for the insurer, and reconsider Problems (2.9) and (2.12) under a constrained admissible set $\mathcal{A}_m$, defined by

$$\mathcal{A}_m := \{u \in \mathcal{A} : \mathbb{E}_{t,x}[X^u(T)] = m\}.$$

Denote the corresponding solutions by $u^{\mathrm{pre}}_m$ and $u^*_m$. We apply Theorems 3.1 and 3.5 to obtain $u^{\mathrm{pre}}_m$ and $u^*_m$ in the next two corollaries and then compare them.

Corollary 4.1.

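Let a target level $m := \mathbb{E}_{t,x}[X^u(T)] > x e^{r(T-t)}$ be given. A precommitment strategy, denoted by $u^{\mathrm{pre}}_m = (\pi^{\mathrm{pre}}_m(s), L^{\mathrm{pre}}_m(s))_{s \in [t, T)}$, to Problem (2.9) over the admissible set $\mathcal{A}_m$ is given by

$$\pi^{\mathrm{pre}}_m(s) = -\kappa_1\left(X^{\mathrm{pre}}_m(s) - \frac{m - x e^{(r - \kappa_3)(T-t)}}{1 - e^{-\kappa_3(T-t)}}\, e^{-r(T-s)}\right) \quad \text{and} \quad L^{\mathrm{pre}}_m(s) = \frac{\kappa_2}{\kappa_1}\, \pi^{\mathrm{pre}}_m(s), \tag{4.5}$$

where $X^{\mathrm{pre}}_m$ denotes the wealth process under the precommitment strategy $u^{\mathrm{pre}}_m$. The ratio $\kappa_2/\kappa_1$ follows from (A.3), in which $\pi^{\mathrm{pre}}$ and $L^{\mathrm{pre}}$ share the same wealth-dependent factor.

Proof.

Recall $\mathbb{E}_{t,x}[X^{\mathrm{pre}}(T)]$ is obtained in (3.5). By equating $\mathbb{E}_{t,x}[X^{\mathrm{pre}}(T)] = m$, we get

$$\theta^{\mathrm{pre}}(m) = \frac{e^{\kappa_3(T-t)} - 1}{m - x e^{r(T-t)}} > 0.$$

Substituting the free parameter $\theta$ by the above $\theta^{\mathrm{pre}}(m)$ in (3.2) leads to the desired results in (4.5).

Corollary 4.2.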

Let a target level $m := \mathbb{E}_{t,x}[X^u(T)] > x e^{r(T-t)}$ be given. A precommitment strategy, $u^*_m = (\pi^*_m(s), L^*_m(s))_{s \in [t, T)}$, to Problem (2.12) over the admissible set $\mathcal{A}_m$ is given by

$$\pi^*_m(s) = \frac{\kappa_1\big(m - x e^{r(T-t)}\big)}{\kappa_3 (T-t)}\, e^{-r(T-s)} \quad \text{and} \quad L^*_m(s) = \frac{\kappa_2\big(m - x e^{r(T-t)}\big)}{\kappa_3 (T-t)}\, e^{-r(T-s)}. \tag{4.6}$$

Proof.

Recall $\mathbb{E}_{t,x}[X^*(s)]$ is obtained in (3.14). By equating $\mathbb{E}_{t,x}[X^*(T)] = m$, we get

$$\theta^*(m) = \frac{\kappa_3 (T-t)}{m - x e^{r(T-t)}} > 0.$$

Substituting the free parameter $\theta$ by the above $\theta^*(m)$ in (3.9) leads to the desired results in (4.6).

Despite successfully obtaining $\pi^{\mathrm{pre}}_m$ and $\pi^*_m$ in closed form, we cannot directly compare them at an arbitrary time $s$ ($t \le s < T$). However, when $s = t$, by using (4.5) and (4.6), and $e^{\kappa_3(T-t)} - 1 > \kappa_3(T-t)$, we obtain

$$\pi^{\mathrm{pre}}_m(t) > \pi^*_m(t) \quad \text{and} \quad L^{\mathrm{pre}}_m(t) > L^*_m(t). \tag{4.7}$$

(4.7) implies that, to achieve the same target $m$, the precommitment framework yields a riskier strategy in both investment and liability at the initial time $t$ than the alternative time-consistent formulation.

Note that we call $u^*_m$ in Corollary 4.2 a precommitment strategy, since both $\pi^*_m$ and $L^*_m$ depend on the initial state $(t, x)$ as seen in (4.6). That means, by restricting the admissible set from $\mathcal{A}$ to $\mathcal{A}_m$, the alternative time-consistent formulation in Problem (2.12) becomes time-inconsistent. In fact, $u^*_m$ being a precommitment strategy is not surprising at all, because the restriction to $\mathcal{A}_m$ leads to a state-dependent risk aversion $\theta^*(m)$, which is a well-known contributor to time-inconsistent problems (see Björk et al. (2014)).

Given the explicit results on both the optimal strategy and the value function in Theorem 3.5, we discuss their analytical properties with respect to the model parameters, and summarize the main findings in Table 1. However, most findings there are partial and limited to the assumption of $\rho > 0$. That motivates us to conduct numerical studies in this section to further investigate how the model parameters and risk aversion affect the insurer's decisions.

We follow Zou and Cadenillas (2014) to set the default model parameters in (2.1)-(2.3), and present them in Table 2, where we assume $\gamma(t, z) \equiv \gamma$ for all $t \in [0, T]$ and $z \in \mathbb{R}$. Note that we leave both $\rho$ (correlation coefficient) and $\theta$ (risk aversion) unspecified in the default setting, as they are the most significant factors of the optimal strategy.

Table 2: Default model parameters ($r$, $\mu$, $\sigma$, $\alpha$, $\beta$, $\lambda$, $\gamma$, $p$)

We first study the impact of $\rho$ (correlation coefficient) and $\theta$ (risk aversion) on the optimal strategy. We plot the optimal investment strategy $\pi^*(t)$ and the optimal liability strategy $L^*(t)$ against all $\rho \in [-1, 1]$ for $\theta = 1, 2, 5, 10$ in Figure 1. Several important observations and explanations are due as follows.

• The optimal investment in the risky asset may be negative (i.e., short selling may be optimal), but only when $\rho$ is close to $-1$. We further observe that a less risk averse insurer short sells more in a combined market with extreme negative correlation. To understand these results, we look at the extreme case of $\rho = -1$, in which the movements of the Brownian motion $W_1$ cause the risky asset $S$ and the risk process $R$ to affect the insurer in exactly the same direction, if she holds positive positions. For instance, a decrease of $W_1$ leads to a lower price of $S$ and more liabilities from underwriting policies. In turn, the volatility due to $W_1$ is amplified and results in a more volatile market in the insurer's view. A strategy to counter such an amplification effect is to short sell the risky asset. A less risk averse insurer is less prone to the amplification effect, and then short sells more of the risky asset when $\rho$ is close to $-1$.

• The optimal investment $\pi^*$ is an increasing function of $\rho$, and the curves for different $\theta$ share the same intersection point at $\pi^* = 0$ (recall $\pi^* = 0$ iff $\kappa_1 = 0$ by (3.9) and $\kappa_1$ is independent of $\theta$ by (3.3)). The optimal liability $L^*$ first decreases and then increases with respect to (w.r.t.) $\rho$, revealing a convex relation. However, if $\rho > 0$, $L^*$ is an increasing function of $\rho$, as already shown in Table 1. Under the given parameters in Table 2, we calculate that $L^*$ attains its minimum at a negative value of $\rho$. When $\rho > 0$, the financial market and the insurance market provide a natural hedge to each other, making the combined market less volatile to the insurer, and hence, both $\pi^*$ and $L^*$ increase w.r.t. $\rho$ when $\rho > 0$.

• As risk aversion $\theta$ increases, we observe that the optimal liability strategy $L^*$ taken by the insurer decreases, and is almost insensitive to $\rho$ when $\theta$ is large enough. A similar statement holds true for the optimal investment $\pi^*$ (in absolute value).

Figure 1: Impact of $\rho$ and $\theta$ on the Optimal Strategies $\pi^*$ (left) and $L^*$ (right). Notes. We plot $\pi^*(t)$ and $L^*(t)$ under the default model parameters in Table 2, $x = 1$, and $T - t = 1$.

We next study how the jump intensity $\lambda$ of the risk process $R$ affects the insurer's optimal strategy. To this end, we allow $\lambda$ to vary over a range starting from 0. We no longer fix $p = 0.15$ as now $\lambda$ is changing, but instead apply the expected value principle with loading $\eta = 40\%$ to calculate the premium $p$ by $p = (1 + \eta) \times (\alpha + \lambda\gamma)$. We set $\theta = 2$ and consider three correlation levels $\rho = -0.5, 0, 0.5$. The remaining model parameters are the same as in Table 2. We then plot $\pi^*(t)$ and $L^*(t)$, with $x = 1$ and $T - t = 1$, against $\lambda$ in Figure 2; compare the results on $\lambda$ in Table 1.

Figure 2: Impact of $\lambda$ on the Optimal Strategies $\pi^*$ (left) and $L^*$ (right). Notes. We calculate the premium $p$ by the expected value principle with loading $\eta = 40\%$. We set $\theta = 2$, $x = 1$, $T - t = 1$, and other parameters as in Table 2.

When $\rho = 0$, the flat solid line (in black) in the left panel of Figure 2 represents the Merton ratio. The impact of $\lambda$ on $L^*$ is more direct and easier to understand. As $\lambda$ increases, the liabilities per unit increase and the insurer responds by taking fewer units (policies) in the insurance business. However, the impact of $\lambda$ on $\pi^*$ is indirect and more complex. We notice a decreasing relation when $\rho = 0.5$, but an increasing relation when $\rho = -0.5$. In a positively correlated combined market, the risk process $R$ acts as a hedge to the risky asset. But, as $\lambda$ increases, this hedging effect weakens; or, putting it differently, the hedging tool itself becomes too volatile to serve its hedging purpose. Upon understanding that, the insurer reacts by reducing risky investment. Less can be said in general regarding the relation between $\pi^*$ and $\lambda$ when $\rho$ is negative. We know that, with $\rho < 0$, the increase of $\lambda$ leads to a shrinking effect on the risky asset's volatility $\sigma$, making the insurer willing to hold more of the risky asset, but at the same time a bigger $\lambda$ makes the insurance business more volatile, causing the insurer to be cautious about making risky investments. The former effect outweighs the latter one, leading to the positive relation found in Figure 2. We remark that the opposite may happen if a different set of model parameters is taken.

We study an optimal investment and risk control optimization problem for an insurer with mean-variance (MV) preference. The insurer has access to a combined market with financial assets and insurance business opportunities. The financial market is given by a standard Black-Scholes model, and the risk process (liabilities) per unit is given by a jump-diffusion process. The insurer seeks an optimal investment and risk control strategy under the MV preference, which is a well-known time-inconsistent problem. We introduce a deterministic auxiliary process to replicate the conditional expectation of the insurer's wealth and include it as a second state process. We then formulate an alternative time-consistent problem which is not only intimately related to the original problem but also can be solved by the standard dynamic programming method. We obtain the optimal strategy, the efficient frontier, and the value function, all in closed form, to the alternative problem. We conduct analytical studies to compare our formulation and optimal strategy with those under the game-theoretic and precommitted frameworks. Among many findings, we emphasize that the correlation between the financial market and the risk process plays a key role in the optimal strategy. When the correlation coefficient $\rho$ is positive, both the optimal investment strategy $\pi^*$ and the optimal liability strategy $L^*$ are increasing functions of $\rho$. However, when $\rho$ is negative, $L^*$ first decreases and then increases with respect to $\rho$, showing a convex U-shaped relation.
As $\rho$ moves towards $-1$, $\pi^*$ becomes negative, suggesting the optimal investment decision is to short sell the risky asset. Lastly, we report that the jump intensity of the risk process also has a major impact on the optimal strategy.

Acknowledgments

We would like to thank the anonymous referees and the editor for their careful reading and many insightful comments that helped us improve the quality of an early version of this paper. Bin Zou is partially supported by a start-up grant from the University of Connecticut. Yang Shen is partially supported by the Discovery Early Career Researcher Award (Grant No. DE200101266) from the Australian Research Council.

Declarations of interest: none.

Proofs

Problem (2.9) is a standard MV type control problem that is well studied in the literature; see, e.g., Markowitz (1952) and Li and Ng (2000) in discrete time, and Zhou and Li (2000) and the book Yong and Zhou (1999) in continuous time. There are various approaches to finding the precommitment strategy, and most of them rely on the so-called embedding technique, first used in Li and Ng (2000) and Zhou and Li (2000). We denote a constrained admissible set $\mathcal{A}_m$ by $\mathcal{A}_m := \{u \in \mathcal{A} : \mathbb{E}_{t,x}[X^u(T)] = m\}$, where $m > x e^{r(T-t)}$, and explain such a technique using the (equivalence) relations as follows:

$$\begin{aligned} \sup_{u \in \mathcal{A}} J(t, x; u) &\Leftrightarrow \sup_{m \in \mathbb{R}}\, \sup_{u \in \mathcal{A}_m} \Big\{J(t, x; u) = m - \frac{\theta}{2}\mathbb{E}_{t,x}\big[(X^u(T) - m)^2\big]\Big\} \\ &\Leftrightarrow \inf_{u \in \mathcal{A}_m} \mathbb{V}_{t,x}[X^u(T)] \\ &\Leftrightarrow \inf_{u \in \mathcal{A}} \mathbb{V}_{t,x}[X^u(T)] - 2w\big(\mathbb{E}_{t,x}[X^u(T)] - m\big) \\ &\Leftrightarrow \inf_{u \in \mathcal{A}} \mathbb{E}_{t,x}\big[(X^u(T) - \xi)^2\big] - (\xi - m)^2, \qquad \xi = m + w \ \text{($w$: Lagrange multiplier)}. \end{aligned} \tag{A.1}$$

The above relation shows that the key to solving Problem (2.9) is finding an optimal solution to the following problem:

$$W(t, x; \xi) := \inf_{u \in \mathcal{A}} \mathbb{E}_{t,x}\big[(X^u(T) - \xi)^2\big], \qquad \text{where } \xi \in \mathbb{R}. \tag{A.2}$$

Theorem A.1.

Suppose $\beta^2(1-\rho^2) + \lambda\gamma_2 \neq 0$. An optimal solution to Problem (A.2) is given by

$$\pi^{\mathrm{pre}}(s; \xi) = -\kappa_1\Big(X^{\mathrm{pre}}(s; \xi) - \xi e^{-r(T-s)}\Big) \quad \text{and} \quad L^{\mathrm{pre}}(s; \xi) = -\kappa_2\Big(X^{\mathrm{pre}}(s; \xi) - \xi e^{-r(T-s)}\Big), \tag{A.3}$$

where $\kappa_1$ and $\kappa_2$ are defined in (3.3) and $X^{\mathrm{pre}}(\cdot\,; \xi)$ is the corresponding wealth process under the initial state $(t, x)$. The value function to Problem (A.2) is given by

$$W(t, x; \xi) = \Big(x e^{r(T-t)} - \xi\Big)^2 e^{-\kappa_3(T-t)}, \tag{A.4}$$

where $\kappa_3$ is defined in (3.4).

Proof.

Since Problem (A.2) is a standard control problem, by using the dynamic programming principle, we obtain that $W$ solves the following HJB equation (assuming $W$ satisfies the needed regularity conditions), where the optimization is an infimum because (A.2) is a minimization problem:

$$W_t(t, x; \xi) + \inf_{(\pi, L) \in \mathbb{R}^2} \Big\{(rx + \bar\mu\pi + \bar p L)\, W_x(t, x; \xi) + \frac{1}{2}\big(\sigma^2\pi^2 - 2\rho\beta\sigma\pi L + \beta^2 L^2\big) W_{xx}(t, x; \xi) + \lambda\int_{\mathbb{R}} \big(W(t, x - L\gamma(z); \xi) - W(t, x; \xi)\big)\,\mathrm{d}F_Z(z)\Big\} = 0,$$

along with the terminal condition $W(T, x; \xi) = (x - \xi)^2$. We then make an educated guess for the value function in the form of

$$W(t, x; \xi) = \Big(x e^{r(T-t)} - \xi\Big)^2 f(t), \qquad f(T) = 1.$$

Straightforward computations and verification then yield the desired results in (A.3) and (A.4).

We next apply the above theorem and the road map in (A.1) to solve Problem (2.9).
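The two outer optimizations in the road map (A.1), over $\xi$ for a fixed target $m$ and then over $m$, are one-dimensional and can be checked numerically once (A.4) is available. The sketch below is illustrative only: it takes a hypothetical value of $\kappa_3$ as given, writes $a = x e^{r(T-t)}$ and $c = e^{-\kappa_3(T-t)}$ for brevity (names introduced here, not in the paper), and compares the closed-form maximizers with nearby perturbations.

```python
import math

def inner_value(xi, m, a, c):
    """W(t, x; xi) - (xi - m)^2, with W = (a - xi)^2 * c taken from (A.4)."""
    return (a - xi) ** 2 * c - (xi - m) ** 2

def xi_star(m, a, c):
    """Maximizer over xi, from the first-order condition c*(xi - a) = xi - m."""
    return (m - c * a) / (1.0 - c)

def w_hat(m, a, c):
    """Maximal inner value; equals (m - a)^2 / (exp(kappa3*(T-t)) - 1)."""
    return (m - a) ** 2 * c / (1.0 - c)

def outer_value(m, a, c, theta):
    """Mean-variance objective m - (theta/2) * w_hat as a function of m."""
    return m - 0.5 * theta * w_hat(m, a, c)

def m_star(a, c, theta):
    """Optimal target: a + (exp(kappa3*(T-t)) - 1) / theta."""
    return a + (1.0 / c - 1.0) / theta

# Hypothetical inputs.
x, r, tau, theta, kappa3 = 1.0, 0.05, 1.0, 2.0, 0.28
a, c = x * math.exp(r * tau), math.exp(-kappa3 * tau)

m = 1.2                                   # any target above a
xs = xi_star(m, a, c)
ms = m_star(a, c, theta)

inner_ok = all(inner_value(xs + d, m, a, c) <= inner_value(xs, m, a, c)
               for d in (-0.1, -0.01, 0.01, 0.1))
outer_ok = all(outer_value(ms + d, a, c, theta) <= outer_value(ms, a, c, theta)
               for d in (-0.1, -0.01, 0.01, 0.1))
print(xs, ms, inner_ok, outer_ok)
print(xi_star(ms, a, c), a + math.exp(kappa3 * tau) / theta)
```

The last line compares $\xi^*(m^*)$ with $x e^{r(T-t)} + e^{\kappa_3(T-t)}/\theta$, the level that enters (A.3) once the optimal target is substituted.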

Proof of Theorem 3.1.

Using (A.4), we consider the problem below

$$
\widehat{W}(t,x;m) = \sup_{\xi \in \mathbb{R}} \; W(t,x;\xi) - (\xi - m)^2
$$

and easily obtain the optimal solution and the value function, under the given expectation level $m$, by

$$
\xi^*(m) = \frac{m - x e^{(r-\kappa)(T-t)}}{1 - e^{-\kappa(T-t)}}
\quad \text{and} \quad
\widehat{W}(t,x;m) = \frac{\big( m - x e^{r(T-t)} \big)^2}{e^{\kappa(T-t)} - 1}. \tag{A.5}
$$

Next, we solve

$$
\sup_{m \in \mathbb{R}} \; m - \frac{\theta}{2}\, \widehat{W}(t,x;m)
$$

and obtain the optimal target level $m^*$ by

$$
m^* = x e^{r(T-t)} + \frac{e^{\kappa(T-t)} - 1}{\theta}. \tag{A.6}
$$

Finally, we substitute $\xi$ in (A.3) by $\xi^*(m^*)$, where $\xi^*(\cdot)$ and $m^*$ are given in (A.5) and (A.6), and obtain the desired results after tedious computations.

Proof of Theorem 3.5.

From the terminal condition (3.8) and the nature of stochastic linear-quadratic problems, we guess an ansatz to the value function in the form of

$$
V(t,x,y) = A(t)(x - y)^2 + B(t)\,x + C(t),
$$

where $A$, $B$ and $C$ are yet to be determined and $A(t) < 0$ for all $t$. Note that with $A(t) < 0$, the above $V$ is concave in both $x$ and $y$ arguments. It is clear from (3.8) that

$$
A(T) = -\frac{\theta}{2}, \qquad B(T) = 1, \qquad C(T) = 0.
$$

Using the above ansatz and the HJB equation (3.7), we derive that a candidate optimal strategy $u^* = (\pi^*, L^*)$ should solve the following system of equations

$$
\begin{cases}
\sigma^2 \cdot \pi^* - \rho\beta\sigma \cdot L^* = -\bar{\mu}\, \dfrac{B(t)}{2A(t)}, \\[2mm]
\rho\beta\sigma \cdot \pi^* - \big( \beta^2 + \lambda\, \mathbb{E}[\gamma^2(Z)] \big) \cdot L^* = \big( \bar{p} - \lambda\, \mathbb{E}[\gamma(Z)] \big) \dfrac{B(t)}{2A(t)},
\end{cases}
$$

which, under the assumption $\beta^2(1 - \rho^2) + \lambda\, \mathbb{E}[\gamma^2(Z)] \neq 0$ in (3.1), admits a unique solution

$$
\pi^* = -\kappa_1 \frac{B(t)}{2A(t)} \quad \text{and} \quad L^* = -\kappa_2 \frac{B(t)}{2A(t)},
$$

where $\kappa_1$ and $\kappa_2$ are defined in (3.3).

We plug the above $(\pi^*, L^*)$ into the HJB equation (3.7) and simplify to get

$$
(x - y)^2 A'(t) + x B'(t) + C'(t) + 2r (x - y)^2 A(t) + r x B(t) - \kappa\, \frac{B^2(t)}{4A(t)} = 0,
$$

where $\kappa$ is defined by (3.4). Since the above equation holds for all $x, y \in \mathbb{R}$, we obtain

$$
\begin{cases}
A'(t) + 2r A(t) = 0, & A(T) = -\dfrac{\theta}{2}, \\[2mm]
B'(t) + r B(t) = 0, & B(T) = 1, \\[2mm]
C'(t) - \kappa\, \dfrac{B^2(t)}{4A(t)} = 0, & C(T) = 0,
\end{cases}
$$

which leads to the solutions

$$
A(t) = -\frac{\theta}{2}\, e^{2r(T-t)} < 0, \qquad B(t) = e^{r(T-t)} > 0, \qquad C(t) = \frac{\kappa}{2\theta}(T - t) > 0.
$$

It is straightforward to verify that $V$ given by (3.10) is smooth ($V \in C^{1,2,1}$) and, by construction, satisfies the HJB equation (3.7). As a result, the value function to Problem (2.12) is indeed given by (3.10). The strategy $(\pi^*, L^*)$ given by (3.9) solves the supremum problem uniquely in the HJB equation (3.7), which is guaranteed by $V_{xx}(t,x,y) = 2A(t) < 0$. By Definition 2.2, $(\pi^*, L^*)$ is admissible. Hence, we conclude that the strategy given by (3.9) is optimal to Problem (2.12). The proof is now complete.

References

Albrecher, H., Beirlant, J., and Teugels, J. L. (2017). Reinsurance: Actuarial and Statistical Aspects. John Wiley & Sons.

Bai, L. and Guo, J. (2008). Optimal proportional reinsurance and investment with multiple risky assets and no-shorting constraint. Insurance: Mathematics and Economics, 42(3):968–975.

Bai, L. and Zhang, H. (2008). Dynamic mean-variance problem with constrained risk control for the insurers. Mathematical Methods of Operations Research, 68(1):181–205.

Basak, S. and Chabakauri, G. (2010). Dynamic mean-variance asset allocation. Review of Financial Studies, 23(8):2970–3016.

Bäuerle, N. (2005). Benchmark and mean-variance problems for insurers. Mathematical Methods of Operations Research, 62(1):159–165.

Björk, T. and Murgoci, A. (2010). A general theory of Markovian time inconsistent stochastic control problems. SSRN Preprint https://ssrn.com/abstract=1694759.

Björk, T., Murgoci, A., and Zhou, X. Y. (2014). Mean–variance portfolio optimization with state-dependent risk aversion. Mathematical Finance, 24(1):1–24.

Bo, L. and Wang, S. (2017). Optimal investment and risk control for an insurer with stochastic factor. Operations Research Letters, 45(3):259–265.

Browne, S. (1995). Optimal investment policies for a firm with a random risk process: Exponential utility and minimizing the probability of ruin. Mathematics of Operations Research, 20(4):937–958.

Cai, J. and Chi, Y. (2020). Optimal reinsurance designs based on risk measures: A review. Statistical Theory and Related Fields, 4(1):1–13.

Cao, J., Landriault, D., and Li, B. (2020). Optimal reinsurance-investment strategy for a dynamic contagion claim model. Insurance: Mathematics and Economics, 93:206–215.

Dai, M., Jin, H., Kou, S., and Xu, Y. (2020). A dynamic mean-variance analysis for log returns. Management Science, forthcoming.

Ekeland, I. and Lazrak, A. (2006). Being serious about non-commitment: Subgame perfect equilibrium in continuous time. arXiv preprint https://arxiv.org/abs/math/0604264.

Højgaard, B. and Taksar, M. (1998). Optimal proportional reinsurance policies for diffusion models. Scandinavian Actuarial Journal, 1998(2):166–180.

Huang, Y.-J. and Zhou, Z. (2020). Strong and weak equilibria for time-inconsistent stochastic control in continuous time. Mathematics of Operations Research, forthcoming.

Kryger, E., Nordfang, M.-B., and Steffensen, M. (2020). Optimal control of an objective functional with non-linearity between the conditional expectations: Solutions to a class of time-inconsistent portfolio problems. Mathematical Methods of Operations Research, 91(3):405–438.

Kryger, E. M. and Steffensen, M. (2010). Some solvable portfolio problems with quadratic and collective objectives. SSRN Preprint https://ssrn.com/abstract=1577265.

Li, D. and Ng, W.-L. (2000). Optimal dynamic portfolio selection: Multiperiod mean-variance formulation. Mathematical Finance, 10(3):387–406.

Li, D., Rong, X., and Zhao, H. (2015). Time-consistent reinsurance–investment strategy for a mean–variance insurer under stochastic interest rate model and inflation risk. Insurance: Mathematics and Economics, 64:28–44.

Li, Y. and Li, Z. (2013). Optimal time-consistent investment and reinsurance strategies for mean–variance insurers with state dependent risk aversion. Insurance: Mathematics and Economics, 53(1):86–97.

Liang, Z., Yuen, K. C., and Guo, J. (2011). Optimal proportional reinsurance and investment in a stock market with Ornstein–Uhlenbeck process. Insurance: Mathematics and Economics, 49(2):207–215.

Liu, C. S. and Yang, H. (2004). Optimal investment for an insurer to minimize its probability of ruin. North American Actuarial Journal, 8(2):11–31.

Markowitz, H. (1952). Portfolio selection. Journal of Finance, 7(1):77–91.

Moore, K. S. and Young, V. R. (2006). Optimal insurance in a continuous-time model. Insurance: Mathematics and Economics, 39(1):47–68.

Peng, X. and Wang, W. (2016). Optimal investment and risk control for an insurer under inside information. Insurance: Mathematics and Economics, 69:104–116.

Promislow, D. S. and Young, V. R. (2005). Minimizing the probability of ruin when claims follow Brownian motion with drift. North American Actuarial Journal, 9(3):110–128.

Schmidli, H. (2001). Optimal proportional reinsurance policies in a dynamic setting. Scandinavian Actuarial Journal, 2001(1):55–68.

Shen, Y. and Zeng, Y. (2014). Optimal investment–reinsurance with delay for mean–variance insurers: A maximum principle approach. Insurance: Mathematics and Economics, 57:1–12.

Shen, Y. and Zeng, Y. (2015). Optimal investment–reinsurance strategy for mean–variance insurers with square-root factor process. Insurance: Mathematics and Economics, 62:118–137.

Stein, J. L. (2012). Stochastic Optimal Control and the US Financial Debt Crisis. Springer.

Wang, H., Wang, R., and Wei, J. (2019). Time-consistent investment-proportional reinsurance strategy with random coefficients for mean–variance insurers. Insurance: Mathematics and Economics, 85:104–114.

Yan, T. and Wong, H. Y. (2020). Open-loop equilibrium reinsurance-investment strategy under mean–variance criterion with stochastic volatility. Insurance: Mathematics and Economics, 90:105–119.

Yang, H. and Zhang, L. (2005). Optimal investment for insurer with jump-diffusion risk process. Insurance: Mathematics and Economics, 37(3):615–634.

Yang, S. (2020). Bellman type strategy for the continuous time mean-variance model. arXiv preprint https://arxiv.org/abs/2005.01904.

Yong, J. and Zhou, X. Y. (1999). Stochastic Controls: Hamiltonian Systems and HJB Equations. Springer Science & Business Media.

Zeng, Y., Li, D., and Gu, A. (2016). Robust equilibrium reinsurance-investment strategy for a mean–variance insurer in a model with jumps. Insurance: Mathematics and Economics, 66:138–152.

Zeng, Y. and Li, Z. (2011). Optimal time-consistent investment and reinsurance policies for mean-variance insurers. Insurance: Mathematics and Economics, 49(1):145–154.

Zeng, Y., Li, Z., and Lai, Y. (2013). Time-consistent investment and reinsurance strategies for mean–variance insurers with jumps. Insurance: Mathematics and Economics, 52(3):498–507.

Zhou, X. Y. and Li, D. (2000). Continuous-time mean-variance portfolio selection: A stochastic LQ framework. Applied Mathematics and Optimization, 42(1):19–33.

Zou, B. and Cadenillas, A. (2014). Optimal investment and risk control policies for an insurer: Expected utility maximization.