A hybrid stochastic differential reinsurance and investment game with bounded memory

Abstract

This paper investigates a hybrid stochastic differential reinsurance and investment game between one reinsurer and two insurers, including a stochastic Stackelberg differential subgame and a non-zero-sum stochastic differential subgame. The reinsurer, as the leader of the Stackelberg game, can price reinsurance premium and invest its wealth in a financial market that contains a risk-free asset and a risky asset. The two insurers, as the followers of the Stackelberg game, can purchase proportional reinsurance from the reinsurer and invest in the same financial market. The competitive relationship between two insurers is modeled by the non-zero-sum game, and their decision making will consider the relative performance measured by the difference in their terminal wealth. We consider wealth processes with delay to characterize the bounded memory feature. This paper aims to find the equilibrium strategy for the reinsurer and insurers by maximizing the expected utility of the reinsurer's terminal wealth with delay and maximizing the expected utility of the combination of insurers' terminal wealth and the relative performance with delay. By using the idea of backward induction and the dynamic programming approach, we derive the equilibrium strategy and value functions explicitly. Then, we provide the corresponding verification theorem. Finally, some numerical examples and sensitivity analysis are presented to demonstrate the effects of model parameters on the equilibrium strategy. We find the delay factor discourages or stimulates investment depending on the length of delay. Moreover, competitive factors between two insurers make their optimal reinsurance-investment strategy interact, and reduce reinsurance demand and reinsurance premium price.


Yanfei Bai, Zhongbao Zhou†, Helu Xiao, Rui Gao, Feimin Zhong

School of Business Administration, Hunan University, Changsha 410082, China
School of Business, Hunan Normal University, Changsha 410081, China
College of Mathematics and Econometrics, Hunan University, Changsha 410082, China


Keywords:

Decision analysis; Stochastic differential games; Reinsurance contract design; Investment; Delay

Insurers and reinsurers, as special financial institutions, not only face investment risks in the financial market but also have to manage the risk of random claims in the insurance market. Insurers can sign reinsurance contracts with the reinsurer and transfer part of the claim risk to it, because the reinsurer is more risk-seeking than insurers. Research on optimal reinsurance and investment strategies has been an important part of mainstream study in the actuarial field. In recent decades, many scholars have studied the reinsurance and investment optimization problem under different objectives, for example, minimizing the probability of ruin (Browne (1995), Chen et al. (2010), Li et al. (2015), etc.), maximizing the expected utility of the terminal wealth (Liang et al. (2011), Li et al. (2012), Huang et al. (2016), Zhao and Rong (2017), etc.), and maximizing the expected terminal surplus while minimizing the variance of the terminal surplus (Bi et al. (2014), Zhou et al. (2019a), Zhou et al. (2019b), etc.).

∗ This research is supported by the National Natural Science Foundation of China (Nos. 71771082, 71801091) and Hunan Provincial Natural Science Foundation of China (No. 2017JJ1012).
† Corresponding author. E-mail: [email protected]; [email protected].
arXiv [q-fin.MF], Oct

However, the majority of these studies examine the reinsurance and investment optimization problem only from the unilateral perspective of insurers, while the interest of the reinsurer is generally ignored. Since the setting of the reinsurance contract depends on the mutual agreement between the insurer and the reinsurer, a reinsurance contract that only considers the interest of one party may be unacceptable to the other party. That is, the reinsurance contract should be designed to take into account the interests of both the insurer and the reinsurer.
In view of the monopoly position of the reinsurer and the competitive relationship between insurers in the market, we consider a reinsurer and two insurers as the leader and the followers of a stochastic Stackelberg differential game, respectively, and use a non-zero-sum stochastic differential game to describe the competitive relationship. We investigate the reinsurer's premium pricing and investment optimization problem as well as the insurers' reinsurance and investment optimization problem. That is, we consider the mutual interests of the reinsurer and two insurers as well as the competition between insurers.

As far as we know, the game problem in the insurance market has attracted some scholars' attention. With respect to maximizing the expected utility of the relative performance, Bensoussan et al. (2014) studied a non-zero-sum stochastic differential investment and reinsurance game between two insurers whose surplus processes were modulated by continuous-time Markov chains; Deng et al. (2018) investigated the implications of strategic interaction between two constant absolute risk aversion (CARA) insurers on their reinsurance-investment policies with default risk under the framework of a non-zero-sum stochastic differential game. There are many other studies on the non-zero-sum stochastic differential reinsurance-investment game problem, such as Meng et al. (2015), Pun and Wong (2016), Guan and Liang (2016), Yan et al. (2017), Zhu et al. (2018), etc.

Obviously, the above-mentioned stochastic differential game models of the reinsurance-investment problem do not consider the interest of the reinsurer. Chen and Shen (2018) first proposed a stochastic Stackelberg differential reinsurance game model to depict the leader-follower relationship between the reinsurer and insurer in the insurance market, and analyzed the optimal reinsurance strategy from the joint interests of the insurer and the reinsurer.
Chen and Shen (2019) studied stochastic Stackelberg differential reinsurance games under a time-inconsistent mean-variance framework. Inspired by these insights, in this paper we build a more realistic stochastic differential reinsurance and investment game model, i.e., the hybrid stochastic differential reinsurance and investment game model, which takes into account both the leader-follower relationship between the reinsurer and the insurers and the competitive relationship between the insurers.

Traditionally, most research on optimal reinsurance-investment decision making is based on current information, ignoring the performance of past wealth. However, decisions often rely on past information in real systems, and delays arise naturally. This feature is commonly referred to as the delay feature or bounded memory feature (see Chang et al. (2011) and Federico (2011)). Shen and Zeng (2014) first introduced the bounded memory feature of wealth into the optimal investment-reinsurance problem for mean-variance insurers and obtained the optimal strategy under certain conditions. Since then, A and Li (2015) considered an optimal investment and excess-of-loss reinsurance problem with delay for an insurer under Heston's stochastic volatility model. Yang et al. (2017) researched an optimal proportional reinsurance problem for the compound Poisson risk model with delay under the mean-variance criterion. A et al. (2018) studied an optimal investment and excess-of-loss reinsurance problem with delay and a jump-diffusion risk process. The delay feature of the wealth process influences the decision making of the reinsurer and insurers, so it is more practical to consider such a delay period. Therefore, this paper also considers the delay feature of the wealth processes under the framework of the hybrid stochastic differential game.

The main work of this article is summarized as follows.
We build a hybrid stochastic differential reinsurance and investment game model, including a stochastic Stackelberg differential subgame and a non-zero-sum stochastic differential subgame. One reinsurer and two insurers are the three players in the hybrid game and are the leader and the followers in the Stackelberg game, respectively. The competitive relationship between the two insurers is modeled by the non-zero-sum game, and their decision making considers the relative performance measured by the difference in their terminal wealth. Furthermore, the effects of delay on wealth processes are considered. By using the idea of backward induction and the dynamic programming approach, we derive the equilibrium strategy and value functions explicitly. Then, we establish the corresponding verification theorem for the optimality of the given strategy. The equilibrium strategy indicates that the optimal reinsurance-investment strategies of the two insurers interact with each other and reflect the herd effect. Moreover, the optimal reinsurance strategies of insurers depend on the reinsurer's optimal premium strategy. We study several special cases of the model and find that the delay factor discourages or stimulates investment depending on the length of the delay, and that the optimal reinsurance premium in the intermediate case follows the variance premium principle when there is only one insurer and one reinsurer in the insurance market. Finally, we present some numerical examples and sensitivity analysis to demonstrate the effects of the model parameters on the equilibrium strategy. Through the analysis and numerical simulation of the equilibrium strategy, we find that competitive factors between the two insurers reduce the demand for reinsurance and the price of the reinsurance premium. In addition, we find that the effect of the delay weight on the equilibrium strategy is related to the length of delay.

Different from the existing literature, our work has the following four contributions.
(1) We first construct a hybrid stochastic differential reinsurance and investment game model including a stochastic Stackelberg differential subgame and a non-zero-sum stochastic differential subgame. That is, we are the first to consider the leader-follower relationship between the reinsurer and insurers and the competitive relationship between insurers at the same time, which is closer to the actual situation.
(2) We consider the tripartite game between one reinsurer and two insurers, and the joint interests of the reinsurer and insurers are considered, while the majority of the existing research on non-zero-sum reinsurance and investment games only focuses on the insurer's interest and generally ignores the reinsurer's interest.
(3) Since reinsurance and investment are important risk management tools, we study the stochastic differential reinsurance and investment game problem, while Chen and Shen (2018) and Chen and Shen (2019) only studied the reinsurance optimization problem under the framework of the Stackelberg game.
(4) We consider the effect of the bounded memory feature of wealth processes under the framework of the hybrid stochastic differential game, which has rarely been studied in the past.

The remainder of this paper is organized as follows. Section 2 formulates the hybrid stochastic differential reinsurance and investment game between one reinsurer and two insurers with delay. In Section 3, we derive the equilibrium strategy and value functions for the hybrid game problem by using the idea of backward induction and the dynamic programming approach. Then, we establish a verification theorem for the optimality of the equilibrium strategy and study some special cases of our model. Section 4 provides some numerical examples and sensitivity analysis to demonstrate the effects of the model parameters on the equilibrium strategy. Section 5 concludes the paper.

In this section, we describe the model in detail.
Let $[0,T]$ be a continuous-time finite horizon over which reinsurance and investment behavior can occur. The uncertainty in markets is represented by a complete probability space $(\Omega, \mathcal{F}, \mathrm{P})$, which is equipped with a filtration $\mathbb{F} = \{\mathcal{F}_t\}_{0 \le t \le T}$ satisfying the usual conditions.

We consider an insurance market containing two competing insurers and one reinsurer. The surplus process of insurer $i \in \{1,2\}$, denoted by $\{X_i(t)\}_{t \ge 0}$, is described by the classic Cramér-Lundberg risk model:

$$X_i(t) = x_i + c_i t - \sum_{n=1}^{N_i(t)+N(t)} \tilde{Y}_i^n, \quad i \in \{1,2\}, \tag{2.1}$$

where $x_i > 0$ and $c_i \ge 0$ are the initial surplus and the premium rate of insurer $i$, respectively; $N_i(t) + N(t)$ represents the number of claims up to time $t$; $\{\tilde{Y}_i^n \ge 0, n \ge 1\}$ is a sequence of independent identically distributed (i.i.d.) random variables with distribution function $F_i(\tilde{Y})$; $\tilde{Y}_i^n$ represents the amount of the $n$-th claim of insurer $i$. We assume that
(i) $\{\tilde{Y}_i^n \ge 0, n \ge 1\}$ has finite first moment $\mu_i$ $(0 < \mu_i < +\infty)$ and finite second moment $(\tilde{\sigma}_i)^2 < +\infty$;
(ii) $\{\tilde{Y}_i^n \ge 0, n \ge 1\}$ is independent of $N_i(t)$ and $N(t)$;
(iii) $N_1(t)$, $N_2(t)$ and $N(t)$ are three mutually independent Poisson processes with intensities $\tilde{\lambda}_1 > 0$, $\tilde{\lambda}_2 > 0$ and $\tilde{\lambda} > 0$, respectively.

The surplus process $\{X_i(t)\}_{t \ge 0}$ indicates that insurer 1 and insurer 2 are subject to a common impact that is represented by $\{N(t)\}_{t \ge 0}$. Following Grandell (1977), Browne (1995), Gerber and Shiu (2006), Bai and Guo (2008), Chen et al. (2018), etc., the classic Cramér-Lundberg model (2.1) can be approximated by the following diffusion process:

$$\mathrm{d}X_i(t) = c_i\,\mathrm{d}t - \lambda_i \mu_i\,\mathrm{d}t + \sqrt{\lambda_i (\tilde{\sigma}_i)^2}\,\mathrm{d}W_i(t), \quad X_i(0) = x_i, \quad i \in \{1,2\}, \tag{2.2}$$

where $\lambda_i = \tilde{\lambda}_i + \tilde{\lambda}$ and $\{W_i(t), t \ge 0\}$ is a standard $\mathbb{F}$-Brownian motion. Moreover, $\mathrm{d}\langle W_1(t), W_2(t) \rangle = \rho\,\mathrm{d}t$, where $\rho = \dfrac{\tilde{\lambda} \mu_1 \mu_2}{\sqrt{\lambda_1 \lambda_2}\,\tilde{\sigma}_1 \tilde{\sigma}_2}$. According to the expected value premium principle, $c_i = (1 + \theta_i)\lambda_i \mu_i$, where $\theta_i > 0$ is the relative safety loading of insurer $i$.

Both insurers can manage their claim risk by continuously purchasing proportional reinsurance from the reinsurer. The reinsurance strategy of insurer $i$ is characterized by $\{q_i(t), t \ge 0\}$ satisfying $q_i(t) \in [0,1]$. That is, the reinsurer covers $(1 - q_i(t))100\%$ of the claims of insurer $i$ while insurer $i$ covers the remainder at time $t$. The price of the reinsurance premium at time $t$ is $p(t) \in [c_F, \bar{c}]$, where $c_F = \max\{c_1, c_2\}$, $\bar{c} = (1 + \bar{\theta})\lambda_F \mu_F$, $\bar{\theta}$ is an upper bound of the reinsurer's relative safety loading, $\bar{\theta} > \max\{\theta_1, \theta_2\}$ and $\lambda_F \mu_F = \max\{\lambda_1 \mu_1, \lambda_2 \mu_2\}$. Introducing the proportional reinsurance strategy $q_i(t)$ into equation (2.2) gives

$$\mathrm{d}X_i(t) = [\theta_i a_i - (p(t) - a_i)(1 - q_i(t))]\,\mathrm{d}t + q_i(t)\sigma_i\,\mathrm{d}W_i(t), \quad i \in \{1,2\}, \tag{2.3}$$

where $a_i = \lambda_i \mu_i$ and $\sigma_i = \sqrt{\lambda_i (\tilde{\sigma}_i)^2}$. The surplus process of the reinsurer is as follows:

$$\mathrm{d}X_L(t) = [(p(t) - a_1)(1 - q_1(t)) + (p(t) - a_2)(1 - q_2(t))]\,\mathrm{d}t + (1 - q_1(t))\sigma_1\,\mathrm{d}W_1(t) + (1 - q_2(t))\sigma_2\,\mathrm{d}W_2(t), \quad X_L(0) = x_L. \tag{2.4}$$

Assume that both the reinsurer and the two insurers can invest in a financial market that contains one risk-free asset and one risky asset.
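The quantities entering the diffusion approximation (2.2) to (2.3) can be computed directly from the claim model. The following is a minimal Python sketch under stated assumptions: the function names and the numerical inputs are hypothetical and chosen only for illustration, while the formulas are those given above for $\lambda_i$, $a_i$, $\sigma_i$, $\rho$, $c_F$, $\bar{c}$ and the drift and volatility of (2.3).

```python
import math


def diffusion_parameters(lam_own, lam_common, mu, second_moment):
    """Return (lambda_i, a_i, sigma_i) for one insurer.

    lam_own is the intensity of the insurer-specific Poisson process,
    lam_common the intensity of the common shock, mu the first moment
    and second_moment the second moment of the claim size.
    """
    lam = lam_own + lam_common              # lambda_i = own + common intensity
    a = lam * mu                            # a_i = lambda_i * mu_i
    sigma = math.sqrt(lam * second_moment)  # sigma_i = sqrt(lambda_i * second moment)
    return lam, a, sigma


def claim_correlation(lam_common, lam1, lam2, mu1, mu2, m2_1, m2_2):
    """Correlation rho between W_1 and W_2 induced by the common shock."""
    return lam_common * mu1 * mu2 / math.sqrt(lam1 * lam2 * m2_1 * m2_2)


def premium_bounds(thetas, a_values, theta_bar):
    """Return (c_F, c_bar), the lower and upper bound of the premium p(t)."""
    c_f = max((1.0 + th) * a for th, a in zip(thetas, a_values))
    c_bar = (1.0 + theta_bar) * max(a_values)
    return c_f, c_bar


def insurer_surplus_coefficients(theta, a, sigma, p, q):
    """Drift and volatility of equation (2.3) for retention level q."""
    drift = theta * a - (p - a) * (1.0 - q)
    vol = q * sigma
    return drift, vol
```

With full retention ($q = 1$) the drift reduces to $\theta_i a_i$, and with full cession ($q = 0$) the insurer carries no claim volatility, which matches the reading of (2.3).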
The price process of the risk-free asset, $\{S_0(t)\}_{t \ge 0}$, is given by the following ordinary differential equation (ODE):

$$\mathrm{d}S_0(t) = r_0 S_0(t)\,\mathrm{d}t, \quad S_0(0) = 1, \tag{2.5}$$

where $r_0$ is the constant risk-free interest rate. The price process of the risky asset, $\{S(t)\}_{t \ge 0}$, is described by the constant elasticity of variance (CEV) model:

$$\mathrm{d}S(t) = S(t)\left[r\,\mathrm{d}t + \sigma S^{\beta}(t)\,\mathrm{d}W(t)\right], \quad S(0) = s_0, \tag{2.6}$$

where $r$, $\sigma S^{\beta}(t)$ and $\beta$ denote the expected return rate, the volatility and the constant elasticity parameter of the risky asset, respectively; $r > r_0 > 0$ and $\sigma > 0$; $\{W(t), t \ge 0\}$ is a standard $\mathbb{F}$-Brownian motion and is independent of $\{W_1(t), t \ge 0\}$ and $\{W_2(t), t \ge 0\}$. The CEV model reduces to a geometric Brownian motion (GBM) when $\beta = 0$. If $\beta < 0$, the volatility $\sigma S^{\beta}(t)$ increases as the stock price decreases, and a distribution with a fatter left tail can be generated. If $\beta > 0$, the volatility $\sigma S^{\beta}(t)$ increases as the stock price increases.

Suppose that there are no transaction costs or taxes for investment and reinsurance, and that short-selling of the risky asset is allowed. Let $\{b_L(t), t \ge 0\}$, $\{b_1(t), t \ge 0\}$ and $\{b_2(t), t \ge 0\}$ be measurable processes valued in $\mathbb{R}$ representing the amount invested in the risky asset by the reinsurer, insurer 1 and insurer 2 at time $t$, respectively. Then, the remaining wealth $X_L(t) - b_L(t)$, $X_1(t) - b_1(t)$ and $X_2(t) - b_2(t)$ is invested in the risk-free asset. Let $\pi_L(t) = (p(t), b_L(t))$, $\pi_1(t) = (q_1(t), b_1(t))$ and $\pi_2(t) = (q_2(t), b_2(t))$. Then, with strategy $\{\pi_L(t), t \ge 0\}$, the wealth process of the reinsurer, denoted by $\{X_L^{\pi_L}(t)\}_{t \ge 0}$, can be expressed as:

$$\begin{aligned} \mathrm{d}X_L^{\pi_L}(t) = {} & [(p(t) - a_1)(1 - q_1(t)) + (p(t) - a_2)(1 - q_2(t)) + r_0 X_L^{\pi_L}(t) + (r - r_0) b_L(t)]\,\mathrm{d}t \\ & + (1 - q_1(t))\sigma_1\,\mathrm{d}W_1(t) + (1 - q_2(t))\sigma_2\,\mathrm{d}W_2(t) + b_L(t)\sigma S^{\beta}(t)\,\mathrm{d}W(t). \end{aligned} \tag{2.7}$$

With strategy $\{\pi_i(t), t \ge 0\}$, the wealth process of insurer $i$, denoted by $\{X_i^{\pi_i}(t)\}_{t \ge 0}$, can be expressed as:

$$\begin{aligned} \mathrm{d}X_i^{\pi_i}(t) = {} & [\theta_i a_i - (p(t) - a_i)(1 - q_i(t)) + r_0 X_i^{\pi_i}(t) + (r - r_0) b_i(t)]\,\mathrm{d}t \\ & + q_i(t)\sigma_i\,\mathrm{d}W_i(t) + b_i(t)\sigma S^{\beta}(t)\,\mathrm{d}W(t), \quad i \in \{1,2\}. \end{aligned} \tag{2.8}$$

In fact, due to the bounded memory feature, the reinsurer's and insurers' strategies depend on the exogenous capital instantaneously flowing into or out of current wealth. Following A and Li (2015), let $Y_L(t)$ and $Z_L(t)$ be the integrated and pointwise delayed information of the reinsurer's wealth process in the past horizon $[t - h_L, t]$, respectively. Correspondingly, let $Y_i(t)$ and $Z_i(t)$ be the integrated and pointwise delayed information of insurer $i$'s wealth process in the past horizon $[t - h_i, t]$, respectively.
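To make the role of the elasticity parameter $\beta$ in (2.6) concrete, the following is a minimal simulation sketch. It uses a plain Euler-Maruyama discretization, which is an illustrative choice and not a scheme taken from the paper; the function name, the positivity floor and all parameter values are assumptions made for this example.

```python
import math
import random


def simulate_cev_path(s0, r, sigma, beta, horizon, n_steps, seed=0):
    """Euler-Maruyama path of dS = S [r dt + sigma * S**beta dW].

    Returns a list of n_steps + 1 prices starting at s0.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    s = s0
    path = [s]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))    # Brownian increment over dt
        local_vol = sigma * (s ** beta)       # CEV volatility sigma * S^beta
        s = s + s * (r * dt + local_vol * dw)
        s = max(s, 1e-12)  # assumed floor: a discrete step can overshoot below zero
        path.append(s)
    return path
```

Setting `beta=0.0` gives a discretized GBM, while a negative `beta` makes `local_vol` grow as the simulated price falls, in line with the description above.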
That is, for all $t \in [0,T]$,

$$Y_L(t) = \int_{-h_L}^{0} e^{\alpha_L s} X_L^{\pi_L}(t+s)\,\mathrm{d}s, \quad Z_L(t) = X_L^{\pi_L}(t - h_L), \tag{2.9}$$

$$Y_i(t) = \int_{-h_i}^{0} e^{\alpha_i s} X_i^{\pi_i}(t+s)\,\mathrm{d}s, \quad Z_i(t) = X_i^{\pi_i}(t - h_i), \quad i \in \{1,2\}, \tag{2.10}$$

where $\alpha_L > 0$ and $\alpha_i > 0$ are average parameters, and $h_L > 0$ and $h_i > 0$ are delay parameters. Let $f_L(t, X_L(t) - Y_L(t), X_L(t) - Z_L(t))$ and $f_i(t, X_i(t) - Y_i(t), X_i(t) - Z_i(t))$ represent the capital inflow/outflow amount of the reinsurer and insurer $i$, respectively, where $X_L(t) - Y_L(t)$ and $X_i(t) - Y_i(t)$ represent the average performance and $X_L(t) - Z_L(t)$ and $X_i(t) - Z_i(t)$ represent the absolute performance. Such capital inflow/outflow, which is related to the past performance of the wealth, may arise in various situations. For example, a good past performance may bring the company more gain, and the company can then pay part of the gain as a dividend to its shareholders. Conversely, a poor past performance forces the company to seek further capital injection to cover the loss so that the final performance objective is still achievable. To make the problem solvable, we assume

$$f_L(t, X_L(t) - Y_L(t), X_L(t) - Z_L(t)) = B_L (X_L(t) - Y_L(t)) + C_L (X_L(t) - Z_L(t)), \tag{2.11}$$

$$f_i(t, X_i(t) - Y_i(t), X_i(t) - Z_i(t)) = B_i (X_i(t) - Y_i(t)) + C_i (X_i(t) - Z_i(t)), \quad i \in \{1,2\}, \tag{2.12}$$

where $B_L$, $C_L$, $B_i$ and $C_i$ are nonnegative constants. In other words, the amount of the capital inflow/outflow is the linear weighted sum of the average performance and the absolute performance.
Then, considering the capital inflow/outflow functions $f_L(t, X_L(t) - Y_L(t), X_L(t) - Z_L(t))$ and $f_i(t, X_i(t) - Y_i(t), X_i(t) - Z_i(t))$, the wealth processes of the reinsurer and insurer $i$ are governed by the following stochastic differential delay equations (SDDEs), respectively:

$$\begin{aligned} \mathrm{d}X_L^{\pi_L}(t) = {} & \left[(p(t) - a_1)(1 - q_1(t)) + (p(t) - a_2)(1 - q_2(t)) + A_L X_L^{\pi_L}(t) + (r - r_0) b_L(t) + B_L Y_L(t) + C_L Z_L(t)\right]\mathrm{d}t \\ & + (1 - q_1(t))\sigma_1\,\mathrm{d}W_1(t) + (1 - q_2(t))\sigma_2\,\mathrm{d}W_2(t) + b_L(t)\sigma S^{\beta}(t)\,\mathrm{d}W(t), \end{aligned} \tag{2.13}$$

$$\begin{aligned} \mathrm{d}X_i^{\pi_i}(t) = {} & \left[\theta_i a_i - (p(t) - a_i)(1 - q_i(t)) + A_i X_i^{\pi_i}(t) + B_i Y_i(t) + C_i Z_i(t) + (r - r_0) b_i(t)\right]\mathrm{d}t \\ & + q_i(t)\sigma_i\,\mathrm{d}W_i(t) + b_i(t)\sigma S^{\beta}(t)\,\mathrm{d}W(t), \quad i \in \{1,2\}, \end{aligned} \tag{2.14}$$

where $A_L = r_0 - B_L - C_L$ and $A_i = r_0 - B_i - C_i$. In addition, we assume that insurer $i$, $i \in \{1,2\}$, is endowed with the initial wealth $x_i$ at time $-h_i$ and does not start the business (insurance/reinsurance/investment) until time 0, i.e., $X_i(t) = x_i > 0$ for all $t \in [-h_i, 0]$. Similarly, $X_L(t) = x_L > 0$ for all $t \in [-h_L, 0]$. Then $Y_L(0) = \dfrac{x_L}{\alpha_L}(1 - e^{-\alpha_L h_L})$ and $Y_i(0) = \dfrac{x_i}{\alpha_i}(1 - e^{-\alpha_i h_i})$. For any fixed $t \in [0,T]$, denote $X_L^{\pi_L}(t) = x_L$, $Y_L(t) = y_L$, $Z_L(t) = z_L$, $X_i^{\pi_i}(t) = x_i$, $Y_i(t) = y_i$, $Z_i(t) = z_i$ and $S(t) = s$.
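Two steps in the passage above can be checked numerically: the closed form of the initial integrated delay $Y(0)$ for a constant wealth history, and the way subtracting the linear capital flow (2.11) from the drift $r_0 X$ produces the coefficient $A = r_0 - B - C$ in (2.13) and (2.14). The sketch below is only an illustration; the function names, the midpoint quadrature and the numerical values are assumptions, not part of the paper.

```python
import math


def integrated_delay(alpha, h, wealth, t, n=10_000):
    """Midpoint-rule approximation of Y(t) = int_{-h}^{0} e^{alpha s} X(t+s) ds.

    wealth is a callable u -> X(u) defined on [t - h, t].
    """
    ds = h / n
    total = 0.0
    for k in range(n):
        s = -h + (k + 0.5) * ds   # midpoint of the k-th subinterval
        total += math.exp(alpha * s) * wealth(t + s)
    return total * ds


def initial_integrated_delay(x0, alpha, h):
    """Closed form Y(0) = x0 / alpha * (1 - e^{-alpha h}) for constant history x0."""
    return x0 / alpha * (1.0 - math.exp(-alpha * h))


def capital_flow(x, y, z, B, C):
    """Linear capital inflow/outflow f = B (x - y) + C (x - z), as in (2.11)."""
    return B * (x - y) + C * (x - z)


def delayed_drift(x, y, z, r0, B, C):
    """Wealth-dependent drift A x + B y + C z with A = r0 - B - C."""
    A = r0 - B - C
    return A * x + B * y + C * z
```

For any state `(x, y, z)`, `r0 * x - capital_flow(x, y, z, B, C)` agrees with `delayed_drift(x, y, z, r0, B, C)`, which is the rearrangement that turns (2.7) and (2.8) into the delay equations.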
Then, we define the admissible strategy as follows.

Definition 1 (Admissible strategy) $\pi(\cdot) = \pi_L(\cdot) \times \pi_1(\cdot) \times \pi_2(\cdot) = (p(\cdot), b_L(\cdot)) \times (q_1(\cdot), b_1(\cdot)) \times (q_2(\cdot), b_2(\cdot))$ is said to be admissible, if
(i) $\{\pi_L(t)\}_{t \in [0,T]}$, $\{\pi_1(t)\}_{t \in [0,T]}$ and $\{\pi_2(t)\}_{t \in [0,T]}$ are $\mathbb{F}$-progressively measurable processes, such that $p(t) \in [c_F, \bar{c}]$, $q_1(t) \in [0,1]$ and $q_2(t) \in [0,1]$ for any $t \in [0,T]$;
(ii) $\mathrm{E}\left[\int_t^T [(b_L(\ell))^2 + (p(\ell))^2]\,\mathrm{d}\ell\right] < +\infty$ and $\mathrm{E}\left[\int_t^T [(b_i(\ell))^2 + (q_i(\ell))^2]\,\mathrm{d}\ell\right] < +\infty$, $\forall \ell \in [t,T]$, $i \in \{1,2\}$;
(iii) the equation (2.13) associated with $\pi(\cdot)$ has a unique solution $X_L^{\pi_L}(\cdot)$, which satisfies $\mathrm{E}_{t,x_L,y_L,s}\left[\sup |X_L^{\pi_L}(\ell)|^2\right] < +\infty$, for all $(t, x_L, y_L, s) \in [0,T] \times \mathbb{R} \times \mathbb{R} \times \mathbb{R}$, $\forall \ell \in [t,T]$;
(iv) the equation (2.14) associated with $\pi(\cdot)$ has a unique solution $X_i^{\pi_i}(\cdot)$, which satisfies $\mathrm{E}_{t,x_i,y_i,s}\left[\sup |X_i^{\pi_i}(\ell)|^2\right] < +\infty$, for all $(t, x_i, y_i, s) \in [0,T] \times \mathbb{R} \times \mathbb{R} \times \mathbb{R}$, $\forall \ell \in [t,T]$, $i \in \{1,2\}$.

Let $\Pi = \Pi_L \times \Pi_1 \times \Pi_2$ be the set of all admissible strategies, where $\Pi_L$, $\Pi_1$ and $\Pi_2$ denote the sets of all admissible strategies of the reinsurer, insurer 1 and insurer 2, respectively.

In view of the monopoly position of the reinsurer and the competitive relationship between insurers in the market, we investigate a hybrid stochastic differential reinsurance and investment game with delay. The reinsurer and the two insurers are the three players in the hybrid game. The game between the reinsurer and the two insurers is a stochastic Stackelberg differential game, in which the reinsurer is the leader and the two insurers are the followers. The game between the two insurers is a non-zero-sum stochastic differential game, in which they have equal status.
More intuitively, the relationships between these three companies are shown in Figure 1.∗

Figure 1: Relationships between three companies.

The goal of the hybrid game is to seek the equilibrium by solving the optimization problems of the three parties. Following Chen and Shen (2018), Chen and Shen (2019) and Asmussen et al. (2019), the procedure for solving the Stackelberg game is to solve the leader's and followers' optimization problems sequentially, based on the idea of backward induction. To be more specific, the procedure can be divided into the following three steps:

• Step 1: The leader (i.e., the reinsurer) moves first by announcing any admissible strategy $(p(\cdot), b_L(\cdot)) \in \Pi_L$;
• Step 2: The followers (i.e., the two insurers) observe the reinsurer's strategy and obtain their optimal strategies $q_1^*(\cdot) = \alpha_1^*(\cdot, p(\cdot), b_L(\cdot))$, $b_1^*(\cdot) = \beta_1^*(\cdot, p(\cdot), b_L(\cdot))$, $q_2^*(\cdot) = \alpha_2^*(\cdot, p(\cdot), b_L(\cdot))$, $b_2^*(\cdot) = \beta_2^*(\cdot, p(\cdot), b_L(\cdot))$ by solving their optimization problems;
• Step 3: Knowing that the two insurers would execute $\alpha_1^*(\cdot, p(\cdot), b_L(\cdot))$, $\beta_1^*(\cdot, p(\cdot), b_L(\cdot))$, $\alpha_2^*(\cdot, p(\cdot), b_L(\cdot))$ and $\beta_2^*(\cdot, p(\cdot), b_L(\cdot))$, the reinsurer then decides on its optimal strategy $(p^*(\cdot), b_L^*(\cdot))$ by solving its own optimization problem.

∗ The non-zero-sum game between two insurers can also be regarded as the subgame of the Stackelberg game.

Due to the reinsurer's bounded memory feature, we suppose that the reinsurer is concerned about not only the terminal wealth $X_L^{\pi_L}(T)$, but also the integrated delayed information over the period $[T - h_L, T]$, i.e., $Y_L(T)$. In other words, the objective of the reinsurer is to find the premium strategy and investment strategy that maximize the expected utility of $X_L^{\pi_L}(T) + \eta_L Y_L(T)$, where the constant $\eta_L \in (0,1)$ represents the sensitivity of the reinsurer to past wealth.

Due to the fierce competition in the insurance market, insurer $i$ ($i \in \{1,2\}$) should consider not only the bounded memory feature of its own wealth, but also the wealth gap between itself and insurer $j$ ($j \neq i \in \{1,2\}$) at the terminal time $T$. In other words, the objective of insurer $i$ is to find the optimal reinsurance strategy and investment strategy such that the expected utility of the combination of its terminal wealth and the relative performance with delay is maximized. That is, insurer $i$ will choose a reinsurance-investment strategy $\pi_i(\cdot) = (q_i(\cdot), b_i(\cdot)) \in \Pi_i$ such that

$$\begin{aligned} & \mathrm{E}\left[U_i\left((1 - k_i)(X_i^{\pi_i}(T) + \eta_i Y_i(T)) + k_i\left((X_i^{\pi_i}(T) + \eta_i Y_i(T)) - (X_j^{\pi_j}(T) + \eta_j Y_j(T))\right)\right)\right] \\ & \quad = \mathrm{E}\left[U_i\left((X_i^{\pi_i}(T) + \eta_i Y_i(T)) - k_i (X_j^{\pi_j}(T) + \eta_j Y_j(T))\right)\right], \quad i \neq j \in \{1,2\}, \end{aligned} \tag{2.15}$$

is maximized. Here, $U_i$ ($i \in \{1,2\}$) is the utility function of insurer $i$, $\eta_i \in (0,1)$ is the weight of $Y_i(T)$, and $k_i \in [0,1]$ measures the sensitivity of insurer $i$ to the performance of insurer $j$ ($j \neq i \in \{1,2\}$).

Following Bensoussan et al. (2014), Yan et al. (2017) and Deng et al. (2018), the non-zero-sum game problem is to find an equilibrium reinsurance-investment strategy $(\pi_1^*, \pi_2^*) \in \Pi_1 \times \Pi_2$ such that for any $(\pi_1, \pi_2) \in \Pi_1 \times \Pi_2$, the following inequalities hold simultaneously:

$$\mathrm{E}\left[U_1\left((X_1^{\pi_1}(T) + \eta_1 Y_1(T)) - k_1 (X_2^{\pi_2^*}(T) + \eta_2 Y_2(T))\right)\right] \le \mathrm{E}\left[U_1\left((X_1^{\pi_1^*}(T) + \eta_1 Y_1(T)) - k_1 (X_2^{\pi_2^*}(T) + \eta_2 Y_2(T))\right)\right], \tag{2.16}$$

$$\mathrm{E}\left[U_2\left((X_2^{\pi_2}(T) + \eta_2 Y_2(T)) - k_2 (X_1^{\pi_1^*}(T) + \eta_1 Y_1(T))\right)\right] \le \mathrm{E}\left[U_2\left((X_2^{\pi_2^*}(T) + \eta_2 Y_2(T)) - k_2 (X_1^{\pi_1^*}(T) + \eta_1 Y_1(T))\right)\right]. \tag{2.17}$$

The way to solve the non-zero-sum game is to solve the optimization problems of both insurers at the same time. That is, in Step 2, we have to solve the optimization problems for both insurers simultaneously. For convenience, we denote $\hat{X}_i^{\pi_i}(t) = X_i^{\pi_i}(t) - k_i X_j^{\pi_j}(t)$, for $i \neq j \in \{1,2\}$. Then, from (2.14), we have

$$\begin{aligned} \mathrm{d}\hat{X}_i^{\pi_i}(t) = {} & \Big[\theta_i a_i - k_i \theta_j a_j - (p(t) - a_i)(1 - q_i(t)) + k_i (p(t) - a_j)(1 - q_j(t)) + A_i X_i^{\pi_i}(t) - k_i A_j X_j^{\pi_j}(t) \\ & + B_i Y_i(t) - k_i B_j Y_j(t) + C_i Z_i(t) - k_i C_j Z_j(t) + (r - r_0)(b_i(t) - k_i b_j(t))\Big]\mathrm{d}t \\ & + q_i(t)\sigma_i\,\mathrm{d}W_i(t) - k_i q_j(t)\sigma_j\,\mathrm{d}W_j(t) + (b_i(t) - k_i b_j(t))\sigma S^{\beta}(t)\,\mathrm{d}W(t), \quad i \neq j \in \{1,2\}, \end{aligned} \tag{2.18}$$

with $\hat{X}_i^{\pi_i}(0) = X_i^{\pi_i}(0) - k_i X_j^{\pi_j}(0) = x_i - k_i x_j \doteq \hat{x}_i$. For any fixed $t \in [0,T]$, let $\hat{X}_i^{\pi_i}(t) = X_i^{\pi_i}(t) - k_i X_j^{\pi_j}(t) = x_i - k_i x_j \doteq \hat{x}_i$. Then, the hybrid stochastic differential reinsurance and investment game problem can be described as the following problem.
Problem The problem of insurer $i$ ($i \in \{1,2\}$) is the following optimization problem: for any $\pi_L(\cdot) = (p(\cdot), b_L(\cdot)) \in \Pi_L$, find a map $(q_i^*(\cdot), b_i^*(\cdot)) = (\alpha_i^*(\cdot, p(\cdot), b_L(\cdot)), \beta_i^*(\cdot, p(\cdot), b_L(\cdot))) : [0,T] \times \Omega \times \Pi_L \to \Pi_i$ such that the following value function holds.

$$\begin{aligned} & V_{F_i}(t, \hat{x}_i, y_i, y_j, s; p(\cdot), b_L(\cdot), \alpha_i^*(\cdot, p(\cdot), b_L(\cdot)), \beta_i^*(\cdot, p(\cdot), b_L(\cdot))) \\ & \quad = \sup_{(q_i(\cdot), b_i(\cdot)) \in \Pi_i} V_{F_i}(t, \hat{x}_i, y_i, y_j, s; p(\cdot), b_L(\cdot), q_i(\cdot), b_i(\cdot)) \\ & \quad = \sup_{(q_i(\cdot), b_i(\cdot)) \in \Pi_i} \mathrm{E}_{t,\hat{x}_i,y_i,y_j,s}\left[U_i\left(X_i^{\pi_i}(T) + \eta_i Y_i(T) - k_i (X_j^{\pi_j^*}(T) + \eta_j Y_j(T))\right)\right], \quad i \neq j \in \{1,2\}. \end{aligned} \tag{2.19}$$

The reinsurer's problem is the following optimization problem: find the optimal strategy $(p^*(\cdot), b_L^*(\cdot)) \in \Pi_L$ such that the following value function holds.

$$\begin{aligned} & V_L\left(t, x_L, y_L, s; p^*(\cdot), b_L^*(\cdot), \alpha_1^*(\cdot, p^*(\cdot), b_L^*(\cdot)), \beta_1^*(\cdot, p^*(\cdot), b_L^*(\cdot)), \alpha_2^*(\cdot, p^*(\cdot), b_L^*(\cdot)), \beta_2^*(\cdot, p^*(\cdot), b_L^*(\cdot))\right) \\ & \quad = \sup_{(p(\cdot), b_L(\cdot)) \in \Pi_L} V_L\left(t, x_L, y_L, s; p(\cdot), b_L(\cdot), \alpha_1^*(\cdot, p(\cdot), b_L(\cdot)), \beta_1^*(\cdot, p(\cdot), b_L(\cdot)), \alpha_2^*(\cdot, p(\cdot), b_L(\cdot)), \beta_2^*(\cdot, p(\cdot), b_L(\cdot))\right) \\ & \quad = \sup_{(p(\cdot), b_L(\cdot)) \in \Pi_L} \mathrm{E}_{t,x_L,y_L,s}\left[U_L\left(X_L^{\pi_L}(T) + \eta_L Y_L(T)\right)\right], \end{aligned} \tag{2.20}$$

where $U_L$ is the utility function of the reinsurer.

Definition 2 The tuple $\left(p^*(\cdot), b_L^*(\cdot), \alpha_1^*(\cdot, p^*(\cdot), b_L^*(\cdot)), \beta_1^*(\cdot, p^*(\cdot), b_L^*(\cdot)), \alpha_2^*(\cdot, p^*(\cdot), b_L^*(\cdot)), \beta_2^*(\cdot, p^*(\cdot), b_L^*(\cdot))\right)$ is called an equilibrium strategy of the hybrid game. Furthermore, if there is no risk of confusion, when the equilibrium strategy of the hybrid game is adopted, $V_{F_i}(t, \hat{x}_i, y_i, y_j, s; p^*(\cdot), b_L^*(\cdot), \alpha_i^*(\cdot, p^*(\cdot), b_L^*(\cdot)), \beta_i^*(\cdot, p^*(\cdot), b_L^*(\cdot)))$ is also called the value function of insurer $i$.

Assume that both the reinsurer and the insurers are constant absolute risk aversion (CARA) agents, i.e., the reinsurer and insurer $i$ ($i \in \{1,2\}$) have exponential utility functions:

$$U_L(x_L + \eta_L y_L) = -\frac{1}{\gamma_L} \exp(-\gamma_L (x_L + \eta_L y_L)), \tag{3.21}$$

$$U_i(\hat{x}_i + \eta_i y_i - k_i \eta_j y_j) = -\frac{1}{\gamma_i} \exp(-\gamma_i (\hat{x}_i + \eta_i y_i - k_i \eta_j y_j)), \quad i \neq j \in \{1,2\}, \tag{3.22}$$

where $\gamma_L > 0$ and $\gamma_i > 0$ are the constant absolute risk aversion coefficients of the reinsurer and insurer $i$, respectively. According to the existing literature, the optimal control problem with delay is infinite-dimensional in general. To make the problem solvable and finite-dimensional, we assume the following conditions on the parameters: $C_L = \eta_L e^{-\alpha_L h_L}$, $B_L e^{-\alpha_L h_L} = (\alpha_L + A_L + \eta_L) C_L$, $C_i = \eta_i e^{-\alpha_i h_i}$, $B_i e^{-\alpha_i h_i} = (\alpha_i + A_i + \eta_i) C_i$, $i \in \{1,2\}$.

By using the idea of backward induction mentioned in Section 2.4 and dynamic programming techniques, we solve the hybrid game problem and obtain the following theorem.
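The parameter conditions and utility functions above can be turned into a small consistency check. The sketch below is a hedged illustration: it assumes the relation $A = r_0 - B - C$ stated after (2.14), substitutes it into $B e^{-\alpha h} = (\alpha + A + \eta) C$ and solves the resulting linear equation for $B$; the function names and numerical values are hypothetical and not taken from the paper.

```python
import math


def delay_weights(eta, alpha, h, r0):
    """Return (A, B, C) satisfying C = eta e^{-alpha h},
    B e^{-alpha h} = (alpha + A + eta) C and A = r0 - B - C."""
    decay = math.exp(-alpha * h)
    C = eta * decay
    # Substitute A = r0 - B - C and solve the linear equation for B:
    # B * (decay + C) = (alpha + r0 + eta - C) * C
    B = (alpha + r0 + eta - C) * C / (decay + C)
    A = r0 - B - C
    return A, B, C


def cara_utility(w, gamma):
    """Exponential utility U(w) = -(1/gamma) exp(-gamma w), as in (3.21)."""
    return -math.exp(-gamma * w) / gamma


def relative_performance_wealth(w_i, w_j, k_i):
    """Left-hand combination in (2.15): (1 - k_i) w_i + k_i (w_i - w_j)."""
    return (1.0 - k_i) * w_i + k_i * (w_i - w_j)
```

Here `w_i` and `w_j` stand for the delay-adjusted terminal wealth $X(T) + \eta Y(T)$ of the two insurers; `relative_performance_wealth` returns the same value as `w_i - k_i * w_j`, which is the simplification used on the right-hand side of (2.15).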

Theorem 1

Suppose that $A_1 + \eta_1 = A_2 + \eta_2$, $k_1 k_2 < 1$ and $k_1 k_2 \rho^2 < 1$. The equilibrium strategy of the Stackelberg game problem is $(p^*(t), b_L^*(t), q_1^*(t), b_1^*(t), q_2^*(t), b_2^*(t))$, where $b_L^*(t)$, $b_1^*(t)$ and $b_2^*(t)$ are given by

$$ b_L^*(t) = \frac{s^{-2\beta}}{\gamma_L \varphi_L(t)} \Big[ \frac{r - r_0}{\sigma^2} - 2\beta g_1(t) \Big], \tag{3.23} $$

$$ b_1^*(t) = \frac{s^{-2\beta}}{(1 - k_1 k_2)\varphi_{F_1}(t)} \Big( \frac{1}{\gamma_1} + \frac{k_1}{\gamma_2} \Big) \Big[ \frac{r - r_0}{\sigma^2} - 2\beta g_1(t) \Big], \tag{3.24} $$

$$ b_2^*(t) = \frac{s^{-2\beta}}{(1 - k_1 k_2)\varphi_{F_2}(t)} \Big( \frac{1}{\gamma_2} + \frac{k_2}{\gamma_1} \Big) \Big[ \frac{r - r_0}{\sigma^2} - 2\beta g_1(t) \Big]; \tag{3.25} $$

$p^*(t)$, $q_1^*(t)$ and $q_2^*(t)$ under different cases are given by the following:

Case (1): If $N^{c}_{F_1}(t) + \frac{k_1 \rho \sigma_2}{\sigma_1} \geq 1$ and $N^{c}_{F_2}(t) + \frac{k_2 \rho \sigma_1}{\sigma_2} \geq 1$, then $p^*(t) = p$, $q_1^*(t) = 1$, $q_2^*(t) = 1$, where $\forall p \in [c_F, \bar c]$;

Case (2): If $K\big[N^{\bar c}_{F_1}(t) + \frac{k_1 \rho \sigma_2}{\sigma_1} N^{\bar c}_{F_2}(t)\big] \geq 1$ and $N^{\bar c}_{F_2}(t) \leq \big(1 - \frac{k_2 \rho \sigma_1}{\sigma_2}\big) M_{F_2}(t)$, then $p^*(t) = \bar c$, $q_1^*(t) = 1$, $q_2^*(t) = N^{\bar c}_{F_2}(t) + \frac{k_2 \rho \sigma_1}{\sigma_2}$;

Case (3): If $K\big[N^{c}_{F_1}(t) + \frac{k_1 \rho \sigma_2}{\sigma_1} N^{c}_{F_2}(t)\big] \geq 1$ and $\big(1 - \frac{k_2 \rho \sigma_1}{\sigma_2}\big) M_{F_2}(t) \leq N^{c}_{F_2}(t) < 1 - \frac{k_2 \rho \sigma_1}{\sigma_2}$, then $p^*(t) = c_F$, $q_1^*(t) = 1$, $q_2^*(t) = N^{c}_{F_2}(t) + \frac{k_2 \rho \sigma_1}{\sigma_2}$;

Case (4): If $K\big[N^{a}_{F_1}(t) + K_{\tilde F_2} M_{F_2}(t)\big] \geq 1$ and $N^{c}_{F_2}(t) < \big(1 - \frac{k_2 \rho \sigma_1}{\sigma_2}\big) M_{F_2}(t) < \min\big\{N^{\bar c}_{F_2}(t),\, 1 - \frac{k_2 \rho \sigma_1}{\sigma_2}\big\}$, then $p^*(t) = a_2 + \gamma_2 \sigma_2^2 \varphi_{F_2}(t)\big(1 - \frac{k_2 \rho \sigma_1}{\sigma_2}\big) M_{F_2}(t)$, $q_1^*(t) = 1$, $q_2^*(t) = \big(1 - \frac{k_2 \rho \sigma_1}{\sigma_2}\big) M_{F_2}(t) + \frac{k_2 \rho \sigma_1}{\sigma_2}$;

Case (5): If $K\big[N^{\bar c}_{F_2}(t) + \frac{k_2 \rho \sigma_1}{\sigma_2} N^{\bar c}_{F_1}(t)\big] \geq 1$ and $N^{\bar c}_{F_1}(t) \leq \big(1 - \frac{k_1 \rho \sigma_2}{\sigma_1}\big) M_{F_1}(t)$, then $p^*(t) = \bar c$, $q_1^*(t) = N^{\bar c}_{F_1}(t) + \frac{k_1 \rho \sigma_2}{\sigma_1}$, $q_2^*(t) = 1$;

Case (6): If $K\big[N^{c}_{F_2}(t) + \frac{k_2 \rho \sigma_1}{\sigma_2} N^{c}_{F_1}(t)\big] \geq 1$ and $\big(1 - \frac{k_1 \rho \sigma_2}{\sigma_1}\big) M_{F_1}(t) \leq N^{c}_{F_1}(t) < 1 - \frac{k_1 \rho \sigma_2}{\sigma_1}$, then $p^*(t) = c_F$, $q_1^*(t) = N^{c}_{F_1}(t) + \frac{k_1 \rho \sigma_2}{\sigma_1}$, $q_2^*(t) = 1$;

Case (7): If $K\big[N^{a}_{F_2}(t) + K_{\tilde F_1} M_{F_1}(t)\big] \geq 1$ and $N^{c}_{F_1}(t) < \big(1 - \frac{k_1 \rho \sigma_2}{\sigma_1}\big) M_{F_1}(t) < \min\big\{N^{\bar c}_{F_1}(t),\, 1 - \frac{k_1 \rho \sigma_2}{\sigma_1}\big\}$, then $p^*(t) = a_1 + \gamma_1 \sigma_1^2 \varphi_{F_1}(t)\big(1 - \frac{k_1 \rho \sigma_2}{\sigma_1}\big) M_{F_1}(t)$, $q_1^*(t) = \big(1 - \frac{k_1 \rho \sigma_2}{\sigma_1}\big) M_{F_1}(t) + \frac{k_1 \rho \sigma_2}{\sigma_1}$, $q_2^*(t) = 1$;

Case (8): If $K\big[N^{\bar c}_{F_i}(t) + \frac{k_i \rho \sigma_j}{\sigma_i} N^{\bar c}_{F_j}(t)\big] < 1$ for $i \neq j \in \{1,2\}$ and $\frac{P_N(t)}{P_D(t)} \geq \bar c$, then $p^*(t) = \bar c$ and $q_i^*(t) = K\big[N^{\bar c}_{F_i}(t) + \frac{k_i \rho \sigma_j}{\sigma_i} N^{\bar c}_{F_j}(t)\big]$, $i \neq j \in \{1,2\}$;

Case (9): If $K\big[N^{c}_{F_i}(t) + \frac{k_i \rho \sigma_j}{\sigma_i} N^{c}_{F_j}(t)\big] < 1$ for $i \neq j \in \{1,2\}$ and $\frac{P_N(t)}{P_D(t)} \leq c_F$, then $p^*(t) = c_F$ and $q_i^*(t) = K\big[N^{c}_{F_i}(t) + \frac{k_i \rho \sigma_j}{\sigma_i} N^{c}_{F_j}(t)\big]$, $i \neq j \in \{1,2\}$;

Case (10): If $K\Big[\frac{P_N(t)/P_D(t) - a_i}{\gamma_i \sigma_i^2 \varphi_{F_i}(t)} + \frac{k_i \rho\,(P_N(t)/P_D(t) - a_j)}{\gamma_j \sigma_1 \sigma_2 \varphi_{F_j}(t)}\Big] < 1$ for $i \neq j \in \{1,2\}$ and $c_F < \frac{P_N(t)}{P_D(t)} < \bar c$, then $p^*(t) = \frac{P_N(t)}{P_D(t)}$ and $q_i^*(t) = K\Big[\frac{P_N(t)/P_D(t) - a_i}{\gamma_i \sigma_i^2 \varphi_{F_i}(t)} + \frac{k_i \rho\,(P_N(t)/P_D(t) - a_j)}{\gamma_j \sigma_1 \sigma_2 \varphi_{F_j}(t)}\Big]$, $i \neq j \in \{1,2\}$.

Here $\varphi_L(t)$, $\varphi_{F_i}(t)$ and $g_1(t)$ are given by (A.67), (A.51) and (A.54), respectively; $K$, $K_{\tilde F_i}$, $K_{F_i}$, $N^{c}_{F_i}(t)$, $N^{\bar c}_{F_i}(t)$, $N^{a}_{F_i}(t)$ and $M_{F_i}(t)$ are given by (A.70); $P_N(t)$ and $P_D(t)$ are given by (A.91) and (A.92), respectively.

The value function of the reinsurer is given by

$$ V^{L}(t, x_L, y_L, s) = -\frac{1}{\gamma_L} \exp\big\{ -\gamma_L \varphi_L(t)(x_L + \eta_L y_L) + g_1(t) s^{-2\beta} + g_L(t) \big\}, \tag{3.26} $$

the value function of insurer 1 is given by

$$ V^{F_1}(t, \hat x_1, y_1, y_2, s) = -\frac{1}{\gamma_1} \exp\big\{ -\gamma_1 \varphi_{F_1}(t)(\hat x_1 + \eta_1 y_1 - k_1 \eta_2 y_2) + g_1(t) s^{-2\beta} + g_{F_1}(t) \big\}, \tag{3.27} $$

and the value function of insurer 2 is given by

$$ V^{F_2}(t, \hat x_2, y_2, y_1, s) = -\frac{1}{\gamma_2} \exp\big\{ -\gamma_2 \varphi_{F_2}(t)(\hat x_2 + \eta_2 y_2 - k_2 \eta_1 y_1) + g_1(t) s^{-2\beta} + g_{F_2}(t) \big\}, \tag{3.28} $$

where $g_L(t)$, $g_{F_1}(t)$ and $g_{F_2}(t)$ under different cases are given by Table 1. The subscripts $a$, $b$ and $c$ refer to Cases (La), (Lb) and (Lc) of Appendix A, and a tilde marks the insurer that purchases reinsurance in Case (Lb).

Table 1: $g_L(t)$, $g_{F_1}(t)$ and $g_{F_2}(t)$ under different cases, listed by the appendix equation that defines each function.

| Cases | $g_L(t)$ | $g_{F_1}(t)$ | $g_{F_2}(t)$ |
|---|---|---|---|
| Case (1) | $g_{La}(t)$, (A.73) | $g_{F_1 a}(t)$, (A.59) | $g_{F_2 a}(t)$, (A.59) |
| Case (2) | (A.77) with $j = 2$ | (A.78) with $i = 1$ | (A.79) with $j = 2$ |
| Case (3) | (A.81) with $j = 2$ | (A.82) with $i = 1$ | (A.83) with $j = 2$ |
| Case (4) | (A.85) with $j = 2$ | (A.86) with $i = 1$ | (A.87) with $j = 2$ |
| Case (5) | (A.77) with $j = 1$ | (A.79) with $j = 1$ | (A.78) with $i = 2$ |
| Case (6) | (A.81) with $j = 1$ | (A.83) with $j = 1$ | (A.82) with $i = 2$ |
| Case (7) | (A.85) with $j = 1$ | (A.87) with $j = 1$ | (A.86) with $i = 2$ |
| Case (8) | (A.94) | (A.95) with $i = 1$ | (A.95) with $i = 2$ |
| Case (9) | (A.97) | (A.98) with $i = 1$ | (A.98) with $i = 2$ |
| Case (10) | (A.100) | (A.101) with $i = 1$ | (A.101) with $i = 2$ |

Proof:

See Appendix A.
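The interior Cases (8)-(10) of Theorem 1 can be illustrated numerically. The following minimal Python sketch solves the two insurers' coupled retention decisions for given values of $N_{F_1}$ and $N_{F_2}$, using the capped best-response relation (3.29) of Corollary 1, and compares the result with the closed form $q_i^* = K[N_{F_i} + \frac{k_i\rho\sigma_j}{\sigma_i} N_{F_j}]$. It assumes $K = 1/(1 - k_1 k_2 \rho^2)$, which is what solving the two linear relations yields; the definition in (A.70) is not reproduced here. All function names and numerical inputs are hypothetical.

```python
def interior_retention(n1, n2, k1, k2, rho, sigma1, sigma2):
    """Closed-form interior solution q_i = K [N_i + (k_i rho sigma_j / sigma_i) N_j]."""
    c1 = k1 * rho * sigma2 / sigma1
    c2 = k2 * rho * sigma1 / sigma2
    big_k = 1.0 / (1.0 - k1 * k2 * rho ** 2)  # assumed form of K, see lead-in
    return big_k * (n1 + c1 * n2), big_k * (n2 + c2 * n1)


def best_response_retention(n1, n2, k1, k2, rho, sigma1, sigma2,
                            tol=1e-12, max_iter=10_000):
    """Iterate q_i = min(N_i + (k_i rho sigma_j / sigma_i) q_j, 1) to a fixed point."""
    c1 = k1 * rho * sigma2 / sigma1
    c2 = k2 * rho * sigma1 / sigma2
    q1, q2 = 1.0, 1.0  # start from full retention
    for _ in range(max_iter):
        new_q1 = min(n1 + c1 * q2, 1.0)
        new_q2 = min(n2 + c2 * new_q1, 1.0)
        if abs(new_q1 - q1) < tol and abs(new_q2 - q2) < tol:
            return new_q1, new_q2
        q1, q2 = new_q1, new_q2
    raise RuntimeError("best-response iteration did not converge")


# Hypothetical inputs: both insurers end up in the interior (Cases (8)-(10)).
params = dict(k1=0.5, k2=0.5, rho=0.5, sigma1=1.0, sigma2=2.0)
print(interior_retention(0.3, 0.4, **params))
print(best_response_retention(0.3, 0.4, **params))

# Hypothetical inputs: insurer 1 hits the cap q_1 = 1 (Cases (2)-(4)).
print(best_response_retention(0.9, 0.2, **params))
```

The iteration converges because the composite best-response map contracts with factor $k_1 k_2 \rho^2 < 1$, which is the condition imposed in Theorem 1.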

Remark 1

Case (1) in Theorem 1 corresponds to Case (La) of Appendix A. In this situation, neither insurer signs a reinsurance contract, and each bears all of its claims itself. Case (2), Case (3) and Case (4) in Theorem 1 correspond to the case of $i = 1$, $j = 2$ in Case (Lb) of Appendix A. In this situation, insurer 1 bears all of its claim risk by itself, and insurer 2 adopts a reinsurance strategy to spread its claim risk. Case (5), Case (6) and Case (7) in Theorem 1 correspond to the case of $i = 2$, $j = 1$ in Case (Lb) of Appendix A. That is, insurer 2 bears all of its claim risk by itself, and insurer 1 adopts a reinsurance strategy to spread its claim risk. Case (8), Case (9) and Case (10) in Theorem 1 correspond to Case (Lc) of Appendix A. In this case, both insurers sign reinsurance contracts to spread their claim risk.

Remark 2

More generally, if there is one reinsurer and $n$ insurers in the insurance market, the optimal premium strategy and the optimal reinsurance strategies will have $C_n^0 + 3(C_n^1 + C_n^2 + \cdots + C_n^n) = 1 + 3(2^n - 1)$ situations. For $n = 2$ this gives the ten cases of Theorem 1.

Corollary 1

The optimal reinsurance strategy of insurer $i$ ($i \in \{1,2\}$) can be expressed in terms of the optimal premium price strategy and the optimal reinsurance strategy of insurer $j$ ($j \neq i \in \{1,2\}$). That is,

$$ q_i^*(t) = \Big[ \frac{p^*(t) - a_i}{\gamma_i \sigma_i^2 \varphi_{F_i}(t)} + \frac{k_i \rho \sigma_j q_j^*(t)}{\sigma_i} \Big] \wedge 1, \quad i \neq j \in \{1,2\}. \tag{3.29} $$

The optimal investment strategy of insurer $i$ ($i \in \{1,2\}$) can be expressed in terms of the optimal investment strategy of insurer $j$ ($j \neq i \in \{1,2\}$). That is,

$$ b_i^*(t) = k_i b_j^*(t) + \frac{1}{\gamma_i \varphi_{F_i}(t) s^{2\beta}} \Big[ \frac{r - r_0}{\sigma^2} - 2\beta g_1(t) \Big], \quad i \neq j \in \{1,2\}. \tag{3.30} $$

Moreover, we have

$$ \frac{\partial q_i^*(t)}{\partial p^*(t)} > 0, \quad \frac{\partial q_i^*(t)}{\partial q_j^*(t)} > 0, \quad \frac{\partial b_i^*(t)}{\partial b_j^*(t)} > 0, \quad i \neq j \in \{1,2\}. \tag{3.31} $$

Proof: (3.29), (3.30) and (3.31) follow directly from Theorem 1, and we omit the proof here.

Through Corollary 1, we find that the equilibrium reinsurance-investment strategies of the two insurers interact with each other and exhibit a herd effect. That is, insurer $i$ ($i \in \{1,2\}$) imitates the reinsurance-investment strategy of insurer $j$ ($j \neq i \in \{1,2\}$). More specifically, the amount that insurer $i$ invests in the risky asset increases with the amount that insurer $j$ invests in the risky asset, and the reserve proportion of insurer $i$ increases with the reserve proportion of insurer $j$. Furthermore, the optimal reinsurance strategies of the insurers depend on the optimal premium strategy. The reserve proportion of the insurers increases with the reinsurance premium price, that is, a high reinsurance premium price reduces the reinsurance demand. These findings illustrate that considering the leader-follower relationship and the competitive relationship at the same time makes decisions more rational and more realistic.

Similar to Lin and Qian (2015), we find that the reinsurer's investment strategy $b_L^*(t)$ in Theorem 1 contains two parts. The first part, $\frac{1}{\gamma_L \varphi_L(t) s^{2\beta}} \frac{r - r_0}{\sigma^2}$, has an updated instantaneous volatility at the current time $t$, while the second part, $-\frac{2\beta}{\gamma_L \varphi_L(t) s^{2\beta}} g_1(t)$, results from the fact that the reinsurer tries to hedge its portfolio against the additional volatility risk. When $\beta > 0$, we have $e^{-2 r_0 \beta (T - t)} < 1$ and $-\frac{2\beta}{\gamma_L \varphi_L(t) s^{2\beta}} g_1(t) > 0$, which causes a positive deviation from the classical result (i.e., the investment strategy when the price of the risky asset obeys the GBM model †). Conversely, when $\beta < 0$, we have $e^{-2 r_0 \beta (T - t)} > 1$ and $-\frac{2\beta}{\gamma_L \varphi_L(t) s^{2\beta}} g_1(t) < 0$, which causes a negative deviation from the classical result. The investment strategies of the two insurers admit a similar analysis.
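The two-part decomposition of $b_L^*(t)$ can be sketched numerically. The Python snippet below splits the reinsurer's investment into the first (GBM-like) part and the hedging part. It assumes the closed form $g_1(t) = \frac{(r - r_0)^2}{4\beta r_0 \sigma^2}\big(e^{-2 r_0 \beta (T - t)} - 1\big)$, which is the standard CEV expression and stands in for (A.54), not reproduced here; it also treats $\varphi_L(t)$ as a plain input. The function names and all numerical values are hypothetical.

```python
import math


def g1(t, T, r, r0, sigma, beta):
    """Assumed closed form of g_1(t); a stand-in for (A.54), which is not shown here."""
    if beta == 0:
        # Limit of the expression below as beta -> 0.
        return -(r - r0) ** 2 * (T - t) / (2 * sigma ** 2)
    return ((r - r0) ** 2 / (4 * beta * r0 * sigma ** 2)
            * (math.exp(-2 * r0 * beta * (T - t)) - 1))


def investment_parts(s, t, T, r, r0, sigma, beta, gamma, phi):
    """Return (first part, hedging part) of b_L^*(t) as in (3.23).

    phi is the value of varphi_L(t), supplied by the caller (hypothetical input).
    """
    scale = s ** (-2 * beta) / (gamma * phi)
    first_part = scale * (r - r0) / sigma ** 2
    hedging_part = -2 * beta * g1(t, T, r, r0, sigma, beta) * scale
    return first_part, hedging_part


# Hypothetical parameter values, for illustration only.
base = dict(s=2.0, t=0.0, T=10.0, r=0.08, r0=0.05, sigma=0.3, gamma=0.5, phi=1.6)
for beta in (-0.5, 0.0, 0.5):
    first, hedge = investment_parts(beta=beta, **base)
    print(beta, first, hedge, first + hedge)
```

Under these assumptions the hedging part is positive for $\beta > 0$, vanishes for $\beta = 0$ (the GBM case) and is negative for $\beta < 0$, in line with the discussion above.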

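Corollary 2 If $\beta \geq 0$, some properties of $b_L^*(t)$ and $b_i^*(t)$ ($i = 1, 2$) are given in Table 2, (3.32) and (3.33).

Table 2: The properties of $b_L^*(t)$ and $b_i^*(t)$.

| $\frac{\partial b_L^*(t)}{\partial \gamma_L}$ | $\frac{\partial b_L^*(t)}{\partial h_L}$ | $\frac{\partial b_i^*(t)}{\partial \gamma_i}$ | $\frac{\partial b_i^*(t)}{\partial k_i}$ | $\frac{\partial b_i^*(t)}{\partial h_i}$ |
|---|---|---|---|---|
| $-$ | $-$ | $-$ | $+$ | $-$ |

$$
\frac{\partial b_L^*(t)}{\partial \alpha_L}
\begin{cases}
> 0, & \alpha_L > -\frac{1}{h_L} \ln \frac{1}{h_L}; \\
= 0, & \alpha_L = -\frac{1}{h_L} \ln \frac{1}{h_L}; \\
< 0, & \alpha_L < -\frac{1}{h_L} \ln \frac{1}{h_L}.
\end{cases}
\qquad
\frac{\partial b_i^*(t)}{\partial \alpha_i}
\begin{cases}
> 0, & \alpha_i > -\frac{1}{h_i} \ln \frac{1}{h_i}; \\
= 0, & \alpha_i = -\frac{1}{h_i} \ln \frac{1}{h_i}; \\
< 0, & \alpha_i < -\frac{1}{h_i} \ln \frac{1}{h_i}.
\end{cases}
\tag{3.32}
$$

Furthermore, if $r_0 + \alpha_L < 1$ and $r_0 + \alpha_i < 1$, $i = 1, 2$, then

$$
\frac{\partial b_L^*(t)}{\partial \eta_L}
\begin{cases}
> 0, & h_L < -\frac{1}{\alpha_L} \ln (1 - r_0 - \alpha_L); \\
= 0, & h_L = -\frac{1}{\alpha_L} \ln (1 - r_0 - \alpha_L); \\
< 0, & h_L > -\frac{1}{\alpha_L} \ln (1 - r_0 - \alpha_L).
\end{cases}
\qquad
\frac{\partial b_i^*(t)}{\partial \eta_i}
\begin{cases}
> 0, & h_i < -\frac{1}{\alpha_i} \ln (1 - r_0 - \alpha_i); \\
= 0, & h_i = -\frac{1}{\alpha_i} \ln (1 - r_0 - \alpha_i); \\
< 0, & h_i > -\frac{1}{\alpha_i} \ln (1 - r_0 - \alpha_i).
\end{cases}
\tag{3.33}
$$

Proof: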

See Appendix B.

† If $\beta = 0$, the CEV model reduces to the GBM model. Then, the optimal investment strategies of the reinsurer and the insurers are given by $b_L^*(t) = \frac{1}{\gamma_L \varphi_L(t)} \frac{r - r_0}{\sigma^2}$ and $b_i^*(t) = \frac{1}{(1 - k_1 k_2)\varphi_{F_i}(t)} \big( \frac{1}{\gamma_i} + \frac{k_i}{\gamma_j} \big) \frac{r - r_0}{\sigma^2}$, $i \neq j \in \{1,2\}$.

When the delay time $h_L > -\frac{1}{\alpha_L} \ln (1 - r_0 - \alpha_L)$, the greater the delay weight, the less money is invested in the risky asset. On the contrary, when the delay time $h_L < -\frac{1}{\alpha_L} \ln (1 - r_0 - \alpha_L)$, the greater the delay weight, the more money is invested in the risky asset. The optimal investment strategies of the insurers follow similar rules. In other words, when the delay time selected by the reinsurer (insurer $i$) is relatively long, the greater the weight of the integrated past performance, the more conservative the strategy of the reinsurer (insurer $i$). On the contrary, when the memory time is relatively short, the smaller the weight of the integrated past performance, the more conservative the investment strategy. This also illustrates that the reinsurer and the insurers manage investment risk according to the relevant delay parameters.

In order to prove that the equilibrium strategy given in Theorem 1 is indeed optimal for all three parties of the hybrid game, we give a verification theorem in this section.
For $i \neq j \in \{1,2\}$, let

$$
\begin{aligned}
\mathcal{A}^{F_i} V^{F_i}(t, \hat x_i, y_i, y_j, s)
= {} & V^{F_i}_t + V^{F_i}_{\hat x_i} \big[ \theta_i a_i - k_i \theta_j a_j - (p(t) - a_i)(1 - q_i(t)) + k_i (p(t) - a_j)(1 - q_j^*(t)) \\
& + A_i x_i - k_i A_j x_j + B_i y_i - k_i B_j y_j + C_i z_i - k_i C_j z_j + (r - r_0)(b_i(t) - k_i b_j^*(t)) \big] \\
& + \frac{1}{2} \big[ (q_i(t)\sigma_i)^2 + (k_i q_j^*(t)\sigma_j)^2 - 2 q_i(t)\sigma_i k_i q_j^*(t)\sigma_j \rho + (b_i(t) - k_i b_j^*(t))^2 \sigma^2 s^{2\beta} \big] V^{F_i}_{\hat x_i \hat x_i} \\
& + (x_i - \alpha_i y_i - e^{-\alpha_i h_i} z_i) V^{F_i}_{y_i} + (x_j - \alpha_j y_j - e^{-\alpha_j h_j} z_j) V^{F_i}_{y_j} \\
& + r s V^{F_i}_s + \frac{1}{2} \sigma^2 s^{2\beta + 2} V^{F_i}_{ss} + (b_i(t) - k_i b_j^*(t)) \sigma^2 s^{2\beta + 1} V^{F_i}_{\hat x_i s},
\end{aligned}
\tag{3.34}
$$

$$
\begin{aligned}
\mathcal{A}^{L} V^{L}(t, x_L, y_L, s)
= {} & V^{L}_t + V^{L}_{x_L} \big[ (p(t) - a_1)(1 - q_1^*(t)) + (p(t) - a_2)(1 - q_2^*(t)) + (r - r_0) b_L(t) + A_L x_L + B_L y_L + C_L z_L \big] \\
& + \frac{1}{2} \big[ (1 - q_1^*(t))^2 \sigma_1^2 + (1 - q_2^*(t))^2 \sigma_2^2 + (b_L(t))^2 \sigma^2 s^{2\beta} + 2 (1 - q_1^*(t))(1 - q_2^*(t)) \sigma_1 \sigma_2 \rho \big] V^{L}_{x_L x_L} \\
& + (x_L - \alpha_L y_L - e^{-\alpha_L h_L} z_L) V^{L}_{y_L} + r s V^{L}_s + \frac{1}{2} \sigma^2 s^{2\beta + 2} V^{L}_{ss} + b_L(t) \sigma^2 s^{2\beta + 1} V^{L}_{x_L s}.
\end{aligned}
\tag{3.35}
$$

We first give the following lemmas:

Lemma 1

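Let $\mathcal{M}^{F_i} = \mathbb{R} \times \mathbb{R}_+ \times \mathbb{R}_+ \times \mathbb{R}_+$, $i \in \{1,2\}$. Take a sequence of bounded open sets $\mathcal{M}^{F_i}_1, \mathcal{M}^{F_i}_2, \mathcal{M}^{F_i}_3, \cdots$, with $\mathcal{M}^{F_i}_n \subset \mathcal{M}^{F_i}_{n+1} \subset \mathcal{M}^{F_i}$, $n = 1, 2, \cdots$, and $\mathcal{M}^{F_i} = \cup_n \mathcal{M}^{F_i}_n$. For $(\hat x_i, y_i, y_j, s) \in \mathcal{M}^{F_i}_n$, let $\tau_n$ be the exit time of $(\hat X_i(t), Y_i(t), Y_j(t), S(t))$ from $\mathcal{M}^{F_i}_n$. Then, for $n = 1, 2, \cdots$,

$$ \mathrm{E}_{t, \hat x_i, y_i, y_j, s} \Big\{ \big[ V^{F_i}\big(\tau_n \wedge T, \hat X_i(\tau_n \wedge T), Y_i(\tau_n \wedge T), Y_j(\tau_n \wedge T), S(\tau_n \wedge T)\big) \big]^2 \Big\} < +\infty. $$

Proof: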

See Appendix C.

Lemma 2

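Let $\mathcal{M}^{L} = \mathbb{R}_+ \times \mathbb{R}_+ \times \mathbb{R}_+$. Take a sequence of bounded open sets $\mathcal{M}^{L}_1, \mathcal{M}^{L}_2, \mathcal{M}^{L}_3, \cdots$, with $\mathcal{M}^{L}_n \subset \mathcal{M}^{L}_{n+1} \subset \mathcal{M}^{L}$, $n = 1, 2, \cdots$, and $\mathcal{M}^{L} = \cup_n \mathcal{M}^{L}_n$. For $(x_L, y_L, s) \in \mathcal{M}^{L}_n$, let $\tau_n$ be the exit time of $(X_L(t), Y_L(t), S(t))$ from $\mathcal{M}^{L}_n$. Then, for $n = 1, 2, \cdots$,

$$ \mathrm{E}_{t, x_L, y_L, s} \Big\{ \big[ V^{L}\big(\tau_n \wedge T, X_L(\tau_n \wedge T), Y_L(\tau_n \wedge T), S(\tau_n \wedge T)\big) \big]^2 \Big\} < +\infty. $$

Proof: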

The proof of this lemma is similar to that of Lemma 1.

Theorem 2 (Verification theorem) The equilibrium strategy $(p^*(t), b_L^*(t), q_1^*(t), b_1^*(t), q_2^*(t), b_2^*(t))$ described in Theorem 1 achieves optimality in $\Pi_L \times \Pi_1 \times \Pi_2$.

Proof:

See Appendix D.

3.3 Special cases

In what follows, we present several special cases of our model.

Special case 1:

If the reinsurer and two insurers do not consider the eﬀect of the bounded memory, i.e., η L = η i = h L = h i = α L = α i = 0, then B L = B i = C L = C i = 0 and A L = A i = r , i ∈ { , } . The optimal investmentstrategies, denoted as ˆ b ∗ L ( t ), ˆ b ∗ i ( t ) , i ∈ { , } , are given byˆ b ∗ L ( t ) = e − r ( T − t ) s − β γ L (cid:2) ( r − r ) σ − βg ( t ) (cid:3) = b ∗ L ( t ) e ηL ηL (1 − r − α L − e − αLhL )( T − t ) , (3.36)ˆ b ∗ i ( t ) = e − r ( T − t ) s − β (1 − k k ) (cid:0) γ i + k i γ j (cid:1)(cid:2) r − r σ − βg ( t ) (cid:3) = b ∗ i ( t ) e ηi ηi (1 − r − α i − e − αihi )( T − t ) , (3.37)where g ( t ) is given by (A.54). The optimal reinsurance premium strategy and the optimal reinsurance strategies in theinterior case (i.e., Case (10) in Theorem 1) becomeˆ p ∗ ( t ) = ˆ P N ( t )ˆ P D ( t ) , ˆ q ∗ i ( t ) = e − r ( T − t ) − k k ρ (cid:104) ˆ P N ( t )ˆ P D ( t ) − a i γ i σ i + k i ρ ( ˆ P N ( t )ˆ P D ( t ) − a j ) γ j σ j σ i (cid:105) , i (cid:54) = j ∈ { , } , (3.38)whereˆ P N ( t ) = e r ( T − t ) ( σ σ ) ( γ ˆ D F + γ ˆ D F ) + 2 a σ γ ˆ D ˜ F + 2 a σ γ ˆ D ˜ F + ( a + a ) ρσ σ ˆ D F , ˆ P D ( t ) =2 σ γ ˆ D ˜ F + 2 σ γ ˆ D ˜ F + 2 ρσ σ ˆ D F , ˆ D F i = γ i (1 − k k ρ ) + γ L [1 + k j ρ + σ j ρσ i (1 + k j )] , i (cid:54) = j ∈ { , } , ˆ D ˜ F i =1 + γ L (1 + ( k j ρ ) + 2 k j ρ )2 γ i (1 − k k ρ ) , i (cid:54) = j ∈ { , } , ˆ D F = k γ + k γ + γ L (1 + k + k + k k ρ )1 − k k ρ . From (3.36) and (3.37), we can ﬁnd that when 1 − r − α L − e − α L h L ≥

0, then $\hat b_L^*(t) \geq b_L^*(t)$. That is, when the delay time satisfies $h_L \geq -\frac{1}{\alpha_L} \ln (1 - r_0 - \alpha_L)$, the amount invested in the risky asset without delay is at least as large as that with delay, so delay makes the investment strategy more conservative in this case. On the contrary, when $1 - r_0 - \alpha_L - e^{-\alpha_L h_L} < 0$, then $\hat b_L^*(t) < b_L^*(t)$. That is, when the reinsurer's delay time $h_L$ is less than $-\frac{1}{\alpha_L} \ln (1 - r_0 - \alpha_L)$, the amount invested in the risky asset with delay is larger than that without delay, i.e., the delay factor stimulates investment in this case. The investment strategies of the insurers admit a similar analysis. Generally speaking, the delay factor discourages or stimulates investment depending on the length of the delay.

Corollary 3

For $i \in \{1,2\}$, we have

$$ \lim_{t \to T} [\hat b_L^*(t) - b_L^*(t)] = 0, \quad \lim_{t \to T} [\hat b_i^*(t) - b_i^*(t)] = 0, \quad \lim_{t \to T} [\hat p^*(t) - p^*(t)] = 0, \quad \lim_{t \to T} [\hat q_i^*(t) - q_i^*(t)] = 0, \tag{3.39} $$

where $p^*(t)$ and $q_i^*(t)$, $i \in \{1,2\}$, are the optimal reinsurance premium strategy and the optimal reinsurance strategies, respectively, in Case (10) of Theorem 1.

Proof:

This corollary follows easily from (3.23), (3.24), (3.25), (A.99), (3.36), (3.37) and (3.38), and we omit the proof here.

Corollary 3 indicates that as time $t$ tends to the terminal time $T$, the equilibrium strategies with and without delay tend to coincide. In particular, the equilibrium strategy with delay equals that without delay at the terminal time $T$.

Special case 2:

We study a stochastic diﬀerential reinsurance-investment game between one reinsurer and oneinsurer, i.e., i = j = 1. Then, ρ = 1, c F = c = (1 + θ ) a , k i = k j = 0. At this point, the hybrid game becomes a13ure Stackelberg game problem. Using the method similar to that in Section 3.1, we can get the equilibrium strategy(˜ p ∗ ( t ) , ˜ b ∗ L ( t ) , ˜ q ∗ ( t ) , ˜ b ∗ ( t )) and value functions ˜ V L ( t, x L , y L , s ), ˜ V F ( t, x , y , s ). ˜ b ∗ L ( t ) and ˜ b ∗ ( t ) are given by˜ b ∗ L ( t ) = 1 γ L ϕ L ( t ) s β (cid:104) ( r − r ) σ − βg ( t ) (cid:105) , ˜ b ∗ ( t ) = 1 γ ϕ F ( t ) s β (cid:104) ( r − r ) σ − βg ( t ) (cid:105) . ˜ p ∗ ( t ) and ˜ q ∗ ( t ) under diﬀerent cases are given by Table 3, where ϕ L ( t ), ϕ F ( t ) and g ( t ) are given by (A.67), (A.51) and Table 3: The optimal premium strategy and the optimal reinsurance strategy under diﬀerent cases.

Cases p ∗ ( t ) q ∗ ( t )(1) N cF ( t ) ≥ ∀ p ∈ [ c F , ¯ c ] 1(2) N ¯ cF ( t ) ≤ M F ( t ) ¯ c N ¯ cF ( t )(3) M F ( t ) ≤ N cF ( t ) < c F N cF ( t )(4) N cF ( t ) < M F ( t ) < N ¯ cF ( t ) a + M F ( t ) γ σ ϕ F ( t ) M F ( t )(A.54), respectively; N cF ( t ), N ¯ cF ( t ) and M F ( t ) are given by (A.70). The value function of the reinsurer is given by˜ V L ( t, x L , y L , s ) = − γ L exp {− γ L ϕ L ( t )( x L + η L y L ) + g ( t ) s − β + g L ( t ) } , the value function of the insurer is given by˜ V F ( t, x , y , s ) = − γ exp {− γ ϕ F ( t )( x + η y ) + g ( t ) s − β + g F ( t ) } , where g L ( t ) and g F ( t ) under diﬀerent cases are given by Table 4, g ( t ) is given by (A.58), Table 4: g L ( t ) and g F ( t ) under diﬀerent cases. Cases g L ( t ) g F ( t ) Case (1) g La ( t ) g F a ( t ) Case (2) g Lb ( t ) g F b ( t ) Case (3) g Lb ( t ) g F b ( t ) Case (4) g Lb ( t ) g F b ( t ) g La ( t ) = g ( t ) ,g Lb ( t ) = g ( t ) + γ L σ A L + η L ) [( ϕ L ( t )) −

1] + (cid:90) tT ¯ θa γ L ϕ L ( s )[1 + γ L ϕ L ( s ) γ ϕ F ( s ) ] ds − (cid:90) tT (¯ θa ) γ L ϕ L ( s ) σ γ ϕ F ( s ) [1 + γ L ϕ L ( s )2 γ ϕ F ( s ) ] ds,g Lb ( t ) = g ( t ) + γ L σ A L + η L ) [( ϕ L ( t )) −

1] + (cid:90) tT θ a γ L ϕ L ( s )[1 + γ L ϕ L ( s ) γ ϕ F ( s ) ] ds − (cid:90) tT ( θ a ) γ L ϕ L ( s ) σ γ ϕ F ( s ) [1 + γ L ϕ L ( s )2 γ ϕ F ( s ) ] ds, Lb ( t ) = g ( t ) + γ L σ A L + η L ) [( ϕ L ( t )) −

1] + σ (cid:90) tT γ L ϕ L ( s )( γ ϕ F ( s ) + γ L ϕ L ( s )) M F ( s ) ds,g F a ( t ) = g ( t ) − γ θ a A + η [ ϕ F ( t ) −

1] + γ σ A + η ) [( ϕ F ( t )) − ,g F b ( t ) = g ( t ) + γ (¯ θ − θ ) a A + η [ ϕ F ( t ) − − (¯ θa ) σ ( T − t ) ,g F b ( t ) = g ( t ) − ( θ a ) σ ( T − t ) ,g F b ( t ) = g ( t ) − γ θ a A + η [ ϕ F ( t ) − − γ σ (cid:90) tT ( ϕ F ( s )) M F ( s ) ds + 12 γ σ (cid:90) tT ( ϕ F ( s )) ( M F ( s )) ds. Similar to Chen and Shen (2018), we can get that when the equilibrium is achieved in the interior case (i.e., Case(4) in Table 3), the optimal reinsurance premium follows the variance premium principle. In other words, for every oneunit of risk, the total instantaneous reinsurance premium associated with the ceded proportion (1 − ˜ q ∗ ( t ))100% can bewritten as ˜ p ∗ ( t )(1 − ˜ q ∗ ( t )) = a (1 − ˜ q ∗ ( t )) + [ γ ϕ F ( t ) + γ L ϕ L ( t )] σ (1 − ˜ q ∗ ( t )) , where the ﬁrst term accounts for the mean component, and the second for the variance component. Corollary 4 If β ≥ , some properties of the optimal investment strategies (i.e., ˜ b ∗ L ( t ) , ˜ b ∗ ( t ) ), optimal premium strategy(i.e., ˜ p ∗ ( t ) ) and optimal reinsurance strategy (i.e., ˜ q ∗ ( t ) ) of Case 4 in Table 3 are given in Table 5, (3.40) , (3.41) , (3.42) and (3.43) . Table 5: The properties of (˜ p ∗ ( t ) , ˜ b ∗ L ( t ) , ˜ q ∗ ( t ) , ˜ b ∗ ( t )). ∂ ˜ b ∗ L ( t ) ∂γ L ∂ ˜ b ∗ L ( t ) ∂h L ∂ ˜ b ∗ ( t ) ∂γ ∂ ˜ b ∗ ( t ) ∂h ∂ ˜ p ∗ ( t ) ∂γ L ∂ ˜ p ∗ ( t ) ∂h L ∂ ˜ q ∗ ( t ) ∂γ ∂ ˜ q ∗ ( t ) ∂h − − − − + + − − ∂ ˜ b ∗ L ( t ) ∂α L =  > , α L > − h L ln h L ;= 0 , α L = − h L ln h L ; < , α L < − h L ln h L . ∂ ˜ b ∗ ( t ) ∂α =  > , α > − h ln h ;= 0 , α = − h ln h ; < , α < − h ln h . (3.40) ∂ ˜ p ∗ ( t ) ∂α L =  < , α L > − h L ln h L ;= 0 , α L = − h L ln h L ; > , α L < − h L ln h L . ∂ ˜ q ∗ ( t ) ∂α =  > , α > − h ln h ;= 0 , α = − h ln h ; < , α < − h ln h . (3.41) If r + α L < and r + α < , then ∂ ˜ b ∗ L ( t ) ∂η L =  > , h L < − α L ln (1 − r − α L );= 0 , h L = − α L ln (1 − r − α L ); < , h L > − α L ln (1 − r − α L ) . 
∂ ˜ b ∗ ( t ) ∂η =  > , h < − α ln (1 − r − α );= 0 , h = − α ln (1 − r − α ); < , h > − α ln (1 − r − α ) . (3.42) ∂ ˜ p ∗ ( t ) ∂η L =  < , h L < − α L ln (1 − r − α L );= 0 , h L = − α L ln (1 − r − α L ); > , h L > − α L ln (1 − r − α L ) . ∂ ˜ q ∗ ( t ) ∂η =  > , h < − α ln (1 − r − α );= 0 , h = − α ln (1 − r − α ); < , h > − α ln (1 − r − α ) . (3.43) Proof:

The proof of this corollary is similar to that of Corollary 2.

Further, if the reinsurer and the insurer do not consider the effect of the delay, i.e., $\eta_L = \eta_1 = h_L = h_1 = \alpha_L = \alpha_1 = 0$, then $B_L = B_1 = C_L = C_1 = 0$ and $A_L = A_1 = r_0$. The optimal reinsurance premium and the optimal reinsurance strategy are then the same as those in the case of $\rho_L = \rho_F$ in Chen and Shen (2018).

4 Sensitivity analysis

To illustrate the sensitivities of the equilibrium strategy ( p ∗ ( · ) , b ∗ L ( · ); q ∗ ( · ) , b ∗ ( · ); q ∗ ( · ) , b ∗ ( · )) with respect to the modelparameters, we conduct numerical experiments in this section. Throughout this section, unless stated otherwise, thebasic model parameters are given in Table 6, Table 7 and Table8. ‡ Table 6: The parameter values of the ﬁnancial assets. r r σ β s T .

05 0 . . Table 7: The parameter values of the reinsurer. ¯ θ h L α L η L γ L . .

05 0 . Table 8: The parameter values of insurers.

The parameter values of insurer 1 The parameter values of insurer 2Parameter Value Parameter Value λ µ σ θ h α η γ k ρ . . . . . . λ µ σ θ h α η γ k / . / . / Figure 2 shows the change in the risky asset price over time and the optimal investment strategies with and withoutdelay over time. From Theorem 1, we can ﬁnd that ∂b ∗ L ( t ) ∂s = − βs b ∗ L ( t ) and ∂b ∗ i ( t ) ∂s = − βs b ∗ i ( t ) , i ∈ { , } . It is easy toﬁnd that ∂b ∗ L ( t ) ∂s < ∂b ∗ i ( t ) ∂s <

0 when $\beta = 1$. That is, the wealth invested in the risky asset is negatively correlated with the price of the risky asset, which is consistent with the trend of the curves in Figure 2. Under the parameter setting of Table 6, Table 7 and Table 8, we have $h_L \geq -\frac{1}{\alpha_L} \ln (1 - r_0 - \alpha_L)$ and $h_i \geq -\frac{1}{\alpha_i} \ln (1 - r_0 - \alpha_i)$, $i \in \{1,2\}$. According to Special case 1, the amount invested in the risky asset with delay is then lower than that without delay, which is consistent with Figure 2. That is, when the delay time exceeds a certain threshold, the delay factor urges the investor to shrink the position in the risky asset and makes the investment strategy more conservative. Furthermore, the gap between the investment strategy with delay and the investment strategy without delay decreases as $t$ increases, and the two strategies coincide at the terminal time $T$.

‡ For $i \neq j \in \{1,2\}$, we can get that $A_i = \frac{1}{1 + \eta_i} \big[ r_0 - (\alpha_i + \eta_i)\eta_i - \eta_i e^{-\alpha_i h_i} \big]$ due to $A_i = r_0 - B_i - C_i$, $C_i = \eta_i e^{-\alpha_i h_i}$ and $B_i e^{-\alpha_i h_i} = (\alpha_i + A_i + \eta_i) C_i$. From the condition $A_1 + \eta_1 = A_2 + \eta_2$ in Theorem 1, we can get that $\eta_j = \frac{(r_0 - 1 + e^{-\alpha_i h_i} + \alpha_i)\eta_i}{(r_0 - 1 + e^{-\alpha_j h_j} + \alpha_j) + (\alpha_j - \alpha_i + e^{-\alpha_j h_j} - e^{-\alpha_i h_i})\eta_i}$.

Figure 2: $b_L^*(t)$, $b_1^*(t)$ and $b_2^*(t)$ with and without delay, shown together with the risky asset price $S(t)$.

Figure 3: Effects of $\beta$ on $b_L^*(0)$, $b_1^*(0)$ and $b_2^*(0)$. Panels: (a) $\beta < 0$; (b) $\beta > 0$.
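The delay threshold used in this comparison can be checked numerically. The Python sketch below computes the threshold $-\frac{1}{\alpha}\ln(1 - r_0 - \alpha)$ from (3.33) and the ratio $\hat b^*(t)/b^*(t)$ from (3.36), and cross-checks that ratio against $A + \eta - r_0$ with $A$ taken from the footnote relations. The function names and the parameter values are hypothetical, and the sketch assumes $\alpha > 0$ and $r_0 + \alpha < 1$.

```python
import math


def growth_rate(r0, alpha, eta, h):
    """A + eta, with A = [r0 - (alpha + eta) eta - eta exp(-alpha h)] / (1 + eta)."""
    a = (r0 - (alpha + eta) * eta - eta * math.exp(-alpha * h)) / (1 + eta)
    return a + eta


def delay_threshold(r0, alpha):
    """Delay time at which the effect of the delay weight changes sign, as in (3.33)."""
    return -math.log(1 - r0 - alpha) / alpha


def no_delay_ratio(r0, alpha, eta, h, time_to_go):
    """Ratio of the strategy without delay to the strategy with delay, as in (3.36)."""
    exponent = eta / (1 + eta) * (1 - r0 - alpha - math.exp(-alpha * h))
    return math.exp(exponent * time_to_go)


# Hypothetical parameter values, for illustration only.
r0, alpha, eta, time_to_go = 0.05, 0.1, 0.1, 10.0
h_star = delay_threshold(r0, alpha)
for h in (1.0, h_star, 3.0):
    print(h, no_delay_ratio(r0, alpha, eta, h, time_to_go))
```

A ratio above one means that the strategy without delay invests more than the strategy with delay, which under these assumptions happens exactly when the delay time exceeds the threshold.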
Figure 4: Effects of risk aversion coefficients and sensitivity coefficients on optimal investment strategies. Panels: (a) effect of $\gamma_L$ on $b_L^*(0)$; (b) effects of $\gamma_1$ and $k_1$ on $b_1^*(0)$; (c) effects of $\gamma_2$ and $k_2$ on $b_2^*(0)$.

Figure 3 indicates the influence of the constant elasticity parameter $\beta$ on the optimal investment strategies at the initial moment, for both $\beta < 0$ and $\beta > 0$. In Figure 3, both the reinsurer's and the insurers' investment strategies increase as $\beta$ increases. The investment amount is negative when $\beta < 0$ and positive when $\beta > 0$. In other words, a positive elasticity parameter results in a positive hedging demand, while the hedging demand is negative for a negative elasticity parameter, which is consistent with the description in Section 3.1.

Figure 4 indicates the impacts of the risk aversion coefficients (i.e., $\gamma_L$, $\gamma_1$, $\gamma_2$) and the sensitivity coefficients (i.e., $k_1$, $k_2$) on the optimal investment strategies at the initial moment. Subfigures 4(a), 4(b) and 4(c) all show that the greater the risk aversion coefficient, the less is invested in the risky asset, which is consistent with the actual situation. Subfigures 4(b) and 4(c) show that for insurer $i$ ($i \in \{1,2\}$), the greater the sensitivity coefficient $k_i$, the more insurer $i$ invests in the risky asset. The sensitivity coefficient $k_i$ reflects the degree to which insurer $i$ cares about the terminal wealth of its competitor (i.e., insurer $j$, $j \neq i \in \{1,2\}$): the larger $k_i$, the more insurer $i$ cares about the performance of its opponent. Therefore, when $k_i$ is larger, insurer $i$ is more inclined to invest in the risky asset in order to increase its wealth.

Figure 5 indicates the effects of the delay parameters (i.e., $h_L$, $\eta_L$, $\alpha_L$, $h_1$, $\eta_1$, $\alpha_1$, $h_2$, $\eta_2$ and $\alpha_2$) on the optimal investment strategies (i.e., $b_L^*(0)$, $b_1^*(0)$ and $b_2^*(0)$). Subfigures 5(a), 5(b) and 5(c) all show that, for a fixed delay weight, the longer the delay time, the less money is invested in the risky asset. That is, when the delay time is longer, the reinsurer and the insurers adopt more conservative and robust investment strategies to control their risk. As can be seen from 5(a), for the reinsurer, when the delay time is within a certain range, the greater the delay weight, the more money is invested in the risky asset; otherwise, the opposite occurs. From the range of the abscissa in 5(b) and 5(c), we know that $h_1 > -\frac{1}{\alpha_1} \ln (1 - r_0 - \alpha_1)$ and $h_2 > -\frac{1}{\alpha_2} \ln (1 - r_0 - \alpha_2)$, where both right-hand sides lie between 1 and 2 under the parameter setting. § For insurer $i$, $i \in \{1,2\}$, when the delay time is greater than this threshold, the greater the delay weight $\eta_i$, the less insurer $i$ invests in the risky asset, i.e., $\frac{\partial b_i^*(t)}{\partial \eta_i} < 0$, which is consistent with the case of $h_i > -\frac{1}{\alpha_i} \ln (1 - r_0 - \alpha_i)$ in equation (3.33). Subfigures 5(d), 5(e) and 5(f) show the effects of $\alpha_L$, $\alpha_1$ and $\alpha_2$ on $b_L^*(0)$, $b_1^*(0)$ and $b_2^*(0)$, respectively. These three subfigures illustrate that the impact of the average parameter on the optimal investment strategy changes with its size, which is consistent with equation (3.32).

§ Note: Because $\eta_j = \frac{(r_0 - 1 + e^{-\alpha_i h_i} + \alpha_i)\eta_i}{(r_0 - 1 + e^{-\alpha_j h_j} + \alpha_j) + (\alpha_j - \alpha_i + e^{-\alpha_j h_j} - e^{-\alpha_i h_i})\eta_i}$ for $i \neq j \in \{1,2\}$, the selection of the abscissa, the delay time parameters and the delay weight parameters should ensure that $\eta_i \in (0, 1)$, $i \in \{1,2\}$.

Figure 5: Effects of delay parameters on optimal investment strategies. Panels: (a) effects of $\eta_L$ and $h_L$ on $b_L^*(0)$; (b) effects of $\eta_1$ and $h_1$ on $b_1^*(0)$; (c) effects of $\eta_2$ and $h_2$ on $b_2^*(0)$; (d) effect of $\alpha_L$ on $b_L^*(0)$; (e) effect of $\alpha_1$ on $b_1^*(0)$; (f) effect of $\alpha_2$ on $b_2^*(0)$.

Figure 6 shows the changes over time of the reinsurer's optimal premium strategy and the insurers' optimal reinsurance strategies with and without delay. Table 9 shows the numerical results corresponding to Figure 6. When $t \leq 6$, the conditions in Case (8) of Theorem 1 are satisfied. In this case, the reinsurer's optimal premium strategy is $p^*(t) = \bar c = (1 + \bar\theta)\lambda_F \mu_F = 12$; the optimal reserve proportion of insurer $i$ ($i \in \{1,2\}$) is $q_i^*(t) = K\big[N^{\bar c}_{F_i}(t) + \frac{k_i \rho \sigma_j}{\sigma_i} N^{\bar c}_{F_j}(t)\big]$, which is an increasing function of $t$. When $t \geq 7$, the conditions in Case (10) of Theorem 1 are satisfied. In this case, the reinsurer's optimal premium strategy is $p^*(t) = \frac{P_N(t)}{P_D(t)}$, which is a decreasing function of $t$; the optimal reserve proportion of insurer $i$ ($i \in \{1,2\}$) is $q_i^*(t) = K\Big[\frac{P_N(t)/P_D(t) - a_i}{\gamma_i \sigma_i^2 \varphi_{F_i}(t)} + \frac{k_i \rho\,(P_N(t)/P_D(t) - a_j)}{\gamma_j \sigma_1 \sigma_2 \varphi_{F_j}(t)}\Big]$. Furthermore, from Figure 6, we find that the reinsurance premium price with delay is not lower than that without delay, and the reserve proportion with delay is not higher than that without delay. This indicates that, under the parameter setting of Table 6, Table 7 and Table 8, delay factors urge the reinsurer and the insurers to hedge risk by raising the reinsurance premium price or reducing the reserve proportion. Furthermore, in Case (10), the gap between the reinsurance premium price with delay and that without delay shrinks as time goes on, and the two are equal at the terminal time $T$.

Table 9: Numerical results corresponding to Figure 6.

| $t$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| $p^*(t)$, with delay | 12 | 12 | 12 | 12 | 12 | 12 | 12 | 11.831 | 11.419 | 11.030 | 10.661 |
| $q_1^*(t)$, with delay | 0.294 | 0.310 | 0.327 | 0.345 | 0.364 | 0.384 | 0.406 | 0.419 | 0.419 | 0.419 | 0.419 |
| $q_2^*(t)$, with delay | 0.429 | 0.452 | 0.477 | 0.504 | 0.532 | 0.561 | 0.592 | 0.612 | 0.612 | 0.612 | 0.612 |
| $p^*(t)$, without delay | 12 | 12 | 12 | 12 | 12 | 12 | 12 | 11.739 | 11.361 | 11.002 | 10.661 |
| $q_1^*(t)$, without delay | 0.305 | 0.321 | 0.337 | 0.355 | 0.373 | 0.392 | 0.412 | 0.419 | 0.419 | 0.419 | 0.419 |
| $q_2^*(t)$, without delay | 0.446 | 0.468 | 0.492 | 0.518 | 0.544 | 0.572 | 0.601 | 0.612 | 0.612 | 0.612 | 0.612 |

Figure 6: $p^*(t)$, $q_1^*(t)$ and $q_2^*(t)$ with and without delay. Each panel marks the time range of Case (8) and of Case (10).

Since Case (10) in Theorem 1 is the most general situation in this paper, we analyze the premium strategy and the reinsurance strategies in Case (10) below. For convenience, we choose the strategies at $t = 9$ for analysis.

Figure 7: Effects of sensitivity parameters on optimal reinsurance strategies.
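The qualitative statements drawn from Figure 6 can be checked directly against the values reported in Table 9. The short Python sketch below does this; the time grid $t = 0, 1, \ldots, 10$ is an assumption inferred from the eleven reported values and the split at $t \leq 6$ and $t \geq 7$, and the variable names are hypothetical.

```python
# Values copied from Table 9; the grid t = 0, ..., 10 is an assumption.
t_grid = list(range(11))
p_delay = [12.0] * 7 + [11.831, 11.419, 11.030, 10.661]
p_plain = [12.0] * 7 + [11.739, 11.361, 11.002, 10.661]
q1_delay = [0.294, 0.310, 0.327, 0.345, 0.364, 0.384, 0.406] + [0.419] * 4
q1_plain = [0.305, 0.321, 0.337, 0.355, 0.373, 0.392, 0.412] + [0.419] * 4
q2_delay = [0.429, 0.452, 0.477, 0.504, 0.532, 0.561, 0.592] + [0.612] * 4
q2_plain = [0.446, 0.468, 0.492, 0.518, 0.544, 0.572, 0.601] + [0.612] * 4

# Premium with delay is never below the premium without delay.
premium_not_lower = all(d >= n for d, n in zip(p_delay, p_plain))

# Reserve proportions with delay are never above those without delay.
reserve_not_higher = all(
    d <= n for d, n in zip(q1_delay + q2_delay, q1_plain + q2_plain)
)

# In Case (10), t >= 7, the premium gap shrinks and vanishes at T.
premium_gap = [d - n for d, n in zip(p_delay[7:], p_plain[7:])]
gap_shrinks = all(a > b for a, b in zip(premium_gap, premium_gap[1:]))

print(premium_not_lower, reserve_not_higher, gap_shrinks, premium_gap[-1])
```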

Figure 7 illustrates the effects of the sensitivity parameters (i.e., $k_1$, $k_2$) on the optimal reinsurance strategies (i.e., $q_1^*(9)$, $q_2^*(9)$) and the optimal premium strategy (i.e., $p^*(9)$). The more sensitive insurer $i$ ($i \in \{1,2\}$) is to the wealth of insurer $j$ ($j \neq i \in \{1,2\}$), that is, the larger $k_i$ is, the higher the reserve proportion of insurer $i$ and the lower the premium price. This is because the more sensitive an insurer is to the wealth level of its competitor, the stronger its psychology of comparison becomes, which makes the insurer more eager to widen the wealth gap with its opponent. This comparing mentality makes it more willing to bear the risk of random claims than to spend money on reinsurance contracts, which in turn lowers the premium price. In summary, competitive factors between the two insurers reduce the demand for reinsurance and the price of reinsurance premiums.

The subgraphs in Figure 8 illustrate the sensitivity of $p^*(9)$, $q_1^*(9)$ and $q_2^*(9)$ to $\gamma_L$, $\gamma_1$ and $\gamma_2$, respectively. Subfigures 8(a), 8(b) and 8(c) indicate that the reinsurance premium price increases with the risk aversion coefficients. This is understandable: the more risk-averse the reinsurer is, the more inclined it is to reduce its risk by raising the premium price; the more risk-averse the insurers are, the greater their demand for reinsurance, which leads to an increase in the reinsurance premium price. From 8(d), we find that the reserve proportions of the two insurers increase with the reinsurer's risk aversion coefficient. This is because the more risk-averse the reinsurer is, the higher the reinsurance premium price will be, which reduces reinsurance demand and raises the reserve level of the insurers. Subfigures 8(e) and 8(f) show that the reserve proportion of insurer $i$ ($i \in \{1,2\}$) decreases with its own risk aversion coefficient $\gamma_i$ and increases with its competitor's (i.e., insurer $j$'s, $j \neq i \in \{1,2\}$) risk aversion coefficient $\gamma_j$. The reason is that a more risk-averse insurer tends to sign a reinsurance contract to transfer part of its random claim risk, whereas when its competitor is more risk-averse, the insurer prefers to increase its claim reserve ratio and, owing to the psychology of comparison, is unwilling to spend money on reinsurance contracts. To sum up, risk aversion factors increase the price of reinsurance premiums; the reinsurer's risk aversion reduces the insurers' reinsurance demand; an insurer's risk aversion increases its own reinsurance demand and reduces the reinsurance demand of its competitor.

Figure 8: Effects of risk aversion coefficients on optimal strategies. Panels: (a) effect of $\gamma_L$ on $p^*(9)$; (b) effect of $\gamma_1$ on $p^*(9)$; (c) effect of $\gamma_2$ on $p^*(9)$; (d) effects of $\gamma_L$ on $q_1^*(9)$ and $q_2^*(9)$; (e) effects of $\gamma_1$ on $q_1^*(9)$ and $q_2^*(9)$; (f) effects of $\gamma_2$ on $q_1^*(9)$ and $q_2^*(9)$.

Figure 9 shows the effects of the delay parameters (i.e., $h_L$, $\eta_L$, $\alpha_L$, $h_1$, $\eta_1$, $\alpha_1$, $h_2$, $\eta_2$ and $\alpha_2$) on the optimal premium strategy (i.e., $p^*(9)$) and the reinsurance strategies (i.e., $q_1^*(9)$, $q_2^*(9)$). As can be seen from 9(a), when the delay weight is fixed, the premium price increases with the delay time. Furthermore, when the delay time is less than a certain value, the larger the delay weight, the lower the premium price; when the delay time is greater than this value, the larger the delay weight, the higher the premium price.
Therefore, for the reinsurer, the effect of the delay weight on the optimal premium strategy depends on the length of the delay time in Case (10) of Theorem 1. From 9(b) and 9(c), we can see that, for the insurers, when the delay time lies in a certain interval, the longer the delay time, the lower the reserve level, and the larger the delay weight, the lower the reserve level. Subfigures 9(d), 9(e) and 9(f) show the effects of $\alpha_L$, $\alpha_1$ and $\alpha_2$ on $p^*(9)$, $q_1^*(9)$ and $q_2^*(9)$, respectively. These three subgraphs illustrate that the impact of the average parameter on the optimal premium strategy (or optimal reinsurance strategy) varies with the size of the parameter.

Figure 9: Effects of delay parameters on optimal strategies. Panels: (a) effects of $\eta_L$ and $h_L$ on $p^*(9)$; (b) effects of $\eta_1$ and $h_1$ on $q_1^*(9)$; (c) effects of $\eta_2$ and $h_2$ on $q_2^*(9)$; (d) effect of $\alpha_L$ on $p^*(9)$; (e) effect of $\alpha_1$ on $q_1^*(9)$; (f) effect of $\alpha_2$ on $q_2^*(9)$. Panels (a) to (c) plot curves for delay weights 0.05, 0.1 and 0.15.

In this study, we investigate a hybrid stochastic differential reinsurance and investment game, which includes a stochastic Stackelberg differential subgame and a non-zero-sum stochastic differential subgame. One reinsurer and two insurers are the three players in the hybrid game. In view of the monopoly position of the reinsurer and the competitive relationship between insurers in the market, we treat the reinsurer and the two insurers as the leader and the followers of the Stackelberg game, respectively, and model the competition between the insurers as a non-zero-sum game. We investigate the reinsurer's premium pricing and investment optimization problem as well as the insurers' reinsurance and investment optimization problems. Taking the performance-related capital inflow/outflow into account, the wealth processes of the reinsurer and the insurers are described by SDDEs.
We derive the equilibrium strategy and value functions explicitly by using the idea of backward induction and the dynamic programming approach. Then, we establish a verification theorem for the optimality of the given strategy. Furthermore, we study several special cases of our model. Moreover, some numerical examples and sensitivity analysis are presented to demonstrate the effects of the model parameters on the equilibrium strategy.

The main findings are as follows: (1) The optimal reinsurance-investment strategies of the two insurers interact with each other and reflect the herd effect. (2) The optimal reinsurance strategies of the insurers depend on the optimal premium strategy. (3) Competitive factors between the two insurers reduce the demand for reinsurance and the price of the reinsurance premium. (4) The delay factor discourages or stimulates investment depending on the length of the delay. When the delay time is greater than a certain value, the delay factor makes investment more conservative; when the delay time is less than this value, the delay factor stimulates investment. (5) The effect of the delay weight on the equilibrium strategy depends on the length of the delay. When the delay time is greater than a certain value, the optimal investment strategies and the optimal reinsurance strategies are negatively correlated with the corresponding delay weight parameters, and the optimal premium strategy is positively correlated with the reinsurer's delay weight. Conversely, when the delay time is less than this value, the opposite holds.

Stochastic differential games between reinsurers and insurers are a very common social phenomenon and an important current research issue in economics and finance. In future work, this study can be extended in the following directions: one is to introduce multi-asset investment, which is closer to reality; the other is to consider regime switching to better describe the stochastic market.

References

A, C., Lai, Y., and Shao, Y. (2018). Optimal excess-of-loss reinsurance and investment problem with delay and jump-diffusion risk process under the CEV model. Journal of Computational and Applied Mathematics, 342:317–336.

A, C. and Li, Z. (2015). Optimal investment and excess-of-loss reinsurance problem with delay for an insurer under Heston's SV model. Insurance: Mathematics and Economics, 61:181–196.

Asmussen, S., Christensen, B. J., and Thøgersen, J. (2019). Stackelberg equilibrium premium strategies for push-pull competition in a non-life insurance market with product differentiation. Risks, 7(2).

Bai, L. and Guo, J. (2008). Optimal proportional reinsurance and investment with multiple risky assets and no-shorting constraint. Insurance: Mathematics and Economics, 42(3):968–975.

Bensoussan, A., Siu, C. C., Yam, S. C. P., and Yang, H. (2014). A class of non-zero-sum stochastic differential investment and reinsurance games. Automatica, 50(8):2025–2037.

Bi, J., Meng, Q., and Zhang, Y. (2014). Dynamic mean-variance and optimal reinsurance problems under the no-bankruptcy constraint for an insurer. Annals of Operations Research, 212(1):43–59.

Browne, S. (1995). Optimal investment policies for a firm with a random risk process: Exponential utility and minimizing the probability of ruin. Mathematics of Operations Research, 20(4):937–958.

Chang, M.-H., Pang, T., and Yang, Y. (2011). A stochastic portfolio optimization model with bounded memory. Mathematics of Operations Research, 36(4):604–619.

Chen, L. and Shen, Y. (2018). On a new paradigm of optimal reinsurance: A stochastic Stackelberg differential game between an insurer and a reinsurer. ASTIN Bulletin, 48(02):905–960.

Chen, L. and Shen, Y. (2019). Stochastic Stackelberg differential reinsurance games under time-inconsistent mean-variance framework. Insurance: Mathematics and Economics, 88:120–137.

Chen, S., Li, Z., and Li, K. (2010). Optimal investment-reinsurance policy for an insurance company with VaR constraint. Insurance: Mathematics and Economics, 47(2):144–153.

Chen, S., Yang, H., and Zeng, Y. (2018). Stochastic differential games between two insurers with generalized mean-variance premium principle. ASTIN Bulletin, 48(01):413–434.

Deng, C., Zeng, X., and Zhu, H. (2018). Non-zero-sum stochastic differential reinsurance and investment games with default risk. European Journal of Operational Research, 264(3):1144–1158.

Federico, S. (2011). A stochastic control problem with delay arising in a pension fund model. Finance and Stochastics, 15(3):421–459.

Gerber, H. U. and Shiu, E. S. W. (2006). On optimal dividends: From reflection to refraction. Journal of Computational and Applied Mathematics, 186(1):4–22.

Grandell, J. (1977). A class of approximations of ruin probabilities. Scandinavian Actuarial Journal, 1977(sup1):37–52.

Guan, G. and Liang, Z. (2016). A stochastic Nash equilibrium portfolio game between two DC pension funds. Insurance: Mathematics and Economics, 70:237–244.

Huang, Y., Yang, X., and Zhou, J. (2016). Optimal investment and proportional reinsurance for a jump-diffusion risk model with constrained control variables. Journal of Computational and Applied Mathematics, 296:443–461.

Li, P., Zhao, W., and Zhou, W. (2015). Ruin probabilities and optimal investment when the stock price follows an exponential Lévy process. Applied Mathematics and Computation, 259:1030–1045.

Li, Z., Zeng, Y., and Lai, Y. (2012). Optimal time-consistent investment and reinsurance strategies for insurers under Heston's SV model. Insurance: Mathematics and Economics, 51(1):191–203.

Liang, Z., Bai, L., and Guo, J. (2011). Optimal investment and proportional reinsurance with constrained control variables. Optimal Control Applications and Methods, 32:587–608.

Lin, X. and Qian, Y. (2015). Time-consistent mean-variance reinsurance-investment strategy for insurers under CEV model. Scandinavian Actuarial Journal, 2016(7):646–671.

Meng, H., Li, S., and Jin, Z. (2015). A reinsurance game between two insurance companies with nonlinear risk processes. Insurance: Mathematics and Economics, 62:91–97.

Pun, C. S. and Wong, H. Y. (2016). Robust non-zero-sum stochastic differential reinsurance game. Insurance: Mathematics and Economics, 68:169–177.

Shen, Y. and Zeng, Y. (2014). Optimal investment-reinsurance with delay for mean-variance insurers: A maximum principle approach. Insurance: Mathematics and Economics, 57:1–12.

Yan, M., Peng, F., and Zhang, S. (2017). A reinsurance and investment game between two insurance companies with the different opinions about some extra information. Insurance: Mathematics and Economics, 75:58–70.

Yang, X., Liang, Z., and Zhang, C. (2017). Optimal mean-variance reinsurance with delay and multiple classes of dependent risks. SCIENTIA SINICA Mathematica, 47(6):723–756.

Zeng, X. and Taksar, M. (2013). A stochastic volatility model and optimal portfolio selection. Quantitative Finance, 13(10):1547–1558.

Zhao, H. and Rong, X. (2017). On the constant elasticity of variance model for the utility maximization problem with multiple risky assets. IMA Journal of Management Mathematics, 28(2):299–320.

Zhou, J., Zhang, X., Huang, Y., Xiang, X., and Deng, Y. (2019a). Optimal investment and risk control policies for an insurer in an incomplete market. Optimization, 0(0):1–28.

Zhou, Z., Ren, T., Xiao, H., and Liu, W. (2019b). Time-consistent investment and reinsurance strategies for insurers under multi-period mean-variance formulation with generalized correlated returns. Journal of Management Science and Engineering.

Zhu, H., Cao, M., and Zhang, C. (2018). Time-consistent investment and reinsurance strategies for mean-variance insurers with relative performance concerns under the Heston model. Finance Research Letters.

Appendix A Proof of Theorem 1

Proof:

We solve the Stackelberg game problem by using the idea of backward induction mentioned in Section 2.4 and standard dynamic programming techniques.
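To make the backward-induction idea concrete, the sketch below works through a stripped-down, static version of the leader-follower step for the situation labelled Subcase (Lb3) later in this proof, where insurer $i$ retains everything ($q_i^* = 1$) and insurer $j$ responds linearly to the premium. The follower's best response is computed first; the leader then optimizes the $p$-dependent part of the bracket in (A.74) over a grid, and the grid optimum is compared with the interior closed form in (A.75). Because the conjectured value function is negative, the sketch treats the leader as minimizing that bracket, which is an interpretive assumption of this illustration. The parameter values, the names and the grid search are all hypothetical and not part of the paper.

```python
# Hypothetical parameter values for illustration only.
P = dict(a_j=1.0, gamma_j=0.5, gamma_L=0.3, sigma_i=1.5, sigma_j=1.8,
         phi_j=1.2, phi_L=1.1, k_j=0.4, rho=0.5)


def follower_response(p, P):
    """Follower j's interior best response to premium p when q_i = 1."""
    own = (p - P["a_j"]) / (P["gamma_j"] * P["sigma_j"] ** 2 * P["phi_j"])
    return own + P["k_j"] * P["rho"] * P["sigma_i"] / P["sigma_j"]


def leader_objective(p, P):
    """p-dependent part of the leader's bracket, given the follower's response."""
    ceded = 1.0 - follower_response(p, P)      # share of risk j passes to the leader
    gl = P["gamma_L"] * P["phi_L"]
    income_term = -gl * (p - P["a_j"]) * ceded
    risk_term = 0.5 * (gl * P["sigma_j"] * ceded) ** 2
    return income_term + risk_term


def closed_form_premium(P):
    """Interior premium a_j + gamma_j sigma_j^2 phi_j (1 - k_j rho sigma_i/sigma_j) M."""
    gj = P["gamma_j"] * P["phi_j"]
    gl = P["gamma_L"] * P["phi_L"]
    m = (gj + gl) / (2.0 * gj + gl)
    scale = P["gamma_j"] * P["sigma_j"] ** 2 * P["phi_j"]
    return P["a_j"] + scale * (1.0 - P["k_j"] * P["rho"] * P["sigma_i"] / P["sigma_j"]) * m


# Leader's step: search a hypothetical premium interval [1.0, 3.0] on a fine grid.
grid = [1.0 + 1e-4 * n for n in range(20001)]
p_grid = min(grid, key=lambda p: leader_objective(p, P))
p_closed = closed_form_premium(P)
print("grid optimum:", p_grid, "closed form:", p_closed)
print("follower response at optimum:", follower_response(p_closed, P))
```

The order of the two functions mirrors the proof: the follower's response is fixed as a function of the premium before the leader's problem is solved, so the leader's objective already embeds the follower's reaction.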

Step 1

In the stochastic Stackelberg differential game, the reinsurer moves first by announcing an arbitrary admissible strategy $(p(\cdot), b_L(\cdot)) \in \Pi_L$.

Step 2

Based on the reinsurer’s strategy ( p ( · ) , b L ( · )) ∈ Π L , we solve two insurers’ optimization problems (i.e., (2.19)for i = 1 ,

2) under the CARA preference simultaneously. i (cid:54) = j ∈ { , } in this step.For the value function of insurer i , we conjecture that V F i ( t, ˆ x i , y i , y j , s ) = − γ i exp {− γ i ϕ F i ( t )(ˆ x i + η i y i − k i η j y j ) + g F i ( t ) s − β + g F i ( t ) } , (A.44)where ϕ F i ( t ), g F i ( t ) and g F i ( t ) are deterministic, continuously diﬀerentiable functions with boundary conditions ϕ F i ( T ) =1, g F i ( T ) = 0 and g F i ( T ) = 0. For insurer i , the HJB equation is0 = sup ( q i ( · ) ,b i ( · )) ∈ Π i (cid:8) V F i t + V F i ˆ x i (cid:2) θ i a i − k i θ j a j − ( p ( t ) − a i )(1 − q i ( t )) + k i ( p ( t ) − a j )(1 − q ∗ j ( t ))+ A i x i − k i A j x j + B i y i − k i B j y j + C i z i − k i C j z j + ( r − r )( b i ( t ) − k i b ∗ j ( t )) (cid:3) + 12 (cid:2) ( q i ( t ) σ i ) + ( k i q ∗ j ( t ) σ j ) − q i ( t ) σ i k i q ∗ j ( t ) σ j ρ + ( b i ( t ) − k i b ∗ j ( t )) σ s β (cid:3) V F i ˆ x i ˆ x i + ( x i − α i y i − e − α i h i z i ) V F i y i + ( x j − α j y j − e − α j h j z j ) V F i y j + rsV F i s + 12 σ s β +2 V F i ss + ( b i ( t ) − k i b ∗ j ( t )) σ s β +1 V F i ˆ x i s (cid:9) . (A.45)The ﬁrst-order condition for maximizing the value in (A.45) gives that0 = V F i ˆ x i [ p ( t ) − a i ] + [ q ∗ i ( t ) σ i − σ i k i q ∗ j ( t ) σ j ρ ] V F i ˆ x i ˆ x i , (A.46)0 =( r − r ) V F i ˆ x i + [ b ∗ i ( t ) − k i b ∗ j ( t )] σ s β V F i ˆ x i ˆ x i + σ s β +1 V F i ˆ x i s . (A.47)From (A.44), (A.46) and (A.47), we know that reinsurance strategy and investment strategy are independent. Theinvestment strategy of insurer i is independent of the investment strategy of the reinsurer. Then, α ∗ i ( · , p ( · ) , b L ( · )) and β ∗ i ( · , p ( · ) , b L ( · )) could be written as α ∗ i ( · , p ( · )) and β ∗ i ( · ), respectively. 
Due to q i ( t ) ∈ [0 , q ∗ i ( t, p ( t )) = α ∗ i ( t, p ( t )) = (cid:104) − ( p ( t ) − a i ) V F i ˆ x i σ i V F i ˆ x i ˆ x i + k i σ j ρq ∗ j ( t ) σ i (cid:105) ∨ ∧ (cid:104) p ( t ) − a i γ i σ i ϕ F i ( t ) + k i ρσ j q ∗ j ( t ) σ i (cid:105) ∨ ∧ , (A.48) b ∗ i ( t ) = β ∗ ( t ) = k i b ∗ j ( t ) − sV F i ˆ x i s V F i ˆ x i ˆ x i − ( r − r ) V F i ˆ x i σ s β V F i ˆ x i ˆ x i = k i b ∗ j ( t ) + 1 γ i ϕ F i ( t ) s β (cid:2) r − r σ − βg F i ( t ) (cid:3) . (A.49)Substitute the investment strategy (A.49) into the HJB equation (A.45). Then, we have0 = V F i (cid:8) γ i x i (cid:2) − ϕ F i t − ϕ F i ( t ) A i − ϕ F i ( t ) η i (cid:3) + k i γ i x j (cid:2) ϕ F i t + ϕ F i ( t ) A j + ϕ F i ( t ) η j (cid:3) + γ i y i (cid:2) − ϕ F i t η i − ϕ F i ( t ) B i + ϕ F i ( t ) η i α i (cid:3) + k i γ i y j (cid:2) ϕ F i t η j + ϕ F i ( t ) B j − ϕ F i ( t ) η j α j (cid:3) + γ i ϕ F i ( t ) z i (cid:2) − C i + η i e − α i h i (cid:3) + k i γ i ϕ F i ( t ) z j (cid:2) C j − η j e − α j h j (cid:3) + s − β (cid:0) dg F i ( t ) dt − βr g F i ( t ) −

12 ( r − r ) σ (cid:1)(cid:9) + V F i (cid:8) dg F i ( t ) dt + β (2 β + 1) σ g F i ( t ) − γ i ϕ F i ( t ) (cid:2) θ i a i − k i θ j a j − ( p ( t ) − a i )(1 − q ∗ i ( t )) + k i ( p ( t ) − a j )(1 − q ∗ j ( t )) (cid:3) + 12 ( γ i ϕ F i ( t )) (cid:2) ( q ∗ i ( t ) σ i ) + ( k i q ∗ j ( t ) σ j ) − q ∗ i ( t ) σ i k i q ∗ j ( t ) σ j ρ (cid:3)(cid:9) . (A.50)26bviously, q ∗ i ( t, p ( t )) does not depend on the state variables x i , x j , y i , y j and s − β . Due to C i = η i e − α i h i , B i e − α i h i =( α i + A i + η i ) C i , C j = η j e − α j h j , B j e − α j h j = ( α j + A j + η j ) C j , ϕ F i ( T ) = 1 and g F i ( T ) = 0, we have ϕ F i ( t ) = exp { ( A i + η i )( T − t ) } = exp { ( A j + η j )( T − t ) } = ϕ F j ( t ) , (A.51) g F i ( t ) = g ( t ) , (A.52)and 0 = V F i (cid:8) − γ i ϕ F i ( t ) (cid:2) θ i a i − k i θ j a j − ( p ( t ) − a i )(1 − q ∗ i ( t )) + k i ( p ( t ) − a j )(1 − q ∗ j ( t )) (cid:3) + 12 ( γ i ϕ F i ( t )) (cid:2) ( q ∗ i ( t ) σ i ) + ( k i q ∗ j ( t ) σ j ) − q ∗ i ( t ) σ i k i q ∗ j ( t ) σ j ρ (cid:3) + dg F i ( t ) dt + β (2 β + 1) σ g F i ( t ) (cid:9) , (A.53)where g ( t ) = − βr ( r − r σ ) [1 − exp {− βr ( T − t ) } ] . (A.54)Then, we have ϕ F ( t ) = ϕ F ( t ) and g F ( t ) = g F ( t ) = g ( t ). Assume that k k <

1, by (A.49), we can get that b ∗ i ( t ) = s − β (1 − k k ) ϕ F i ( t ) ( 1 γ i + k i γ j ) (cid:2) r − r σ − βg ( t ) (cid:3) . (A.55)Due to p ( t ) ∈ [ c F , ¯ c ] and (A.48), we know that p ( t ) − a i γ i σ i ϕ Fi ( t ) + k i ρσ j q ∗ j ( t ) σ i >

0. Then, q ∗ i ( t, p ( t )) = α ∗ i ( t, p ( t )) = (cid:104) p ( t ) − a i γ i σ i ϕ F i ( t ) + k i ρσ j q ∗ j ( t ) σ i (cid:105) ∧ . (A.56)Assume that k k ρ <

1. According to Bensoussan et al. (2014) and Deng et al. (2018), let ˜ q i ( t, p ( t )) be the solutionof the following system of equations: (cid:40) ˜ q ( t, p ( t )) = p ( t ) − a γ σ ϕ F ( t ) + k ρσ ˜ q ( t ) σ , ˜ q ( t, p ( t )) = p ( t ) − a γ σ ϕ F ( t ) + k ρσ ˜ q ( t ) σ . (A.57)Denote g ( t ) = − (2 β + 1)( r − r ) r (cid:2) ( T − t ) + 12 βr (exp {− βr ( T − t ) } − (cid:3) . (A.58)Then, we will discuss the following situations: • Case (Fa)

If ˜ q i ( t, p ( t )) ≥ i ∈ { , } , we have q ∗ i ( t, p ( t )) = 1. Substituting q ∗ i ( t, p ( t )) into (A.53) and integratingfrom T to t gives g F i ( t ) = g F i a ( t ) . = g ( t ) − γ i ( θ i a i − k i θ j a j ) A i + η i [ ϕ F i ( t ) −

1] + γ i (cid:2) σ i + k i σ j − σ i k i σ j ρ (cid:3) A i + η i ) [( ϕ F i ( t )) − . (A.59) • Case (Fb)

If ˜ q i ( t, p ( t )) ≥ q j ( t, p ( t )) < i (cid:54) = j ∈ { , } , we have q ∗ i ( t, p ( t )) = 1 and q ∗ j ( t, p ( t )) = p ( t ) − a j γ j σ j ϕ Fj ( t ) + k j ρσ i σ j . Substituting q ∗ i ( t, p ( t )) and q ∗ j ( t, p ( t )) into (A.53) and integrating from T to t gives g F i ( t ) = g ( t ) − γ i ( θ i a i − k i θ j a j ) A i + η i [ ϕ F i ( t ) −

1] + γ i σ i [1 + ( k k ρ ) − k k ρ ]4( A i + η i ) [( ϕ F i ( t )) − k i γ i γ j σ j [1 + k i γ i γ j ] (cid:90) tT ( p ( s ) − a j ) ds − k i γ i [ − k j ρσ i σ j + k k ργ i σ i γ j σ j − ρσ i γ i γ j σ j ] (cid:90) tT ( p ( s ) − a j ) ϕ F i ( s ) ds, (A.60)and g F j ( t ) = g ( t ) − γ j ( θ j a j − k j θ i a i ) A j + η j [ ϕ F j ( t ) −

1] + ( k j γ j σ i ) (1 − ρ )4( A j + η j ) [( ϕ F j ( t )) −

1] + 12( σ j ) (cid:90) tT ( p ( s ) − a j ) ds − γ j (1 − k j ρσ i σ j ) (cid:90) tT ( p ( s ) − a j ) ϕ F j ( s ) ds. (A.61) • Case (Fc)

If ˜ q i ( t, p ( t )) < i ∈ { , } , we have q ∗ i ( t, p ( t )) = ˜ q i ( t, p ( t )). Substituting q ∗ i ( t, p ( t )) into (A.53) andintegrating from T to t gives g F i ( t ) = g ( t ) − γ i ( θ i a i − k i θ j a j ) A i + η i [ ϕ F i ( t ) − − γ i (cid:90) tT ϕ F i ( s )( p ( s ) − a i ) ds + k i γ i (cid:90) tT ϕ F i ( s )( p ( s ) − a j ) ds + 1 − ( k k ρ ) − k k ρ ) ( σ i ) (cid:90) tT ( p ( s ) − a i ) ds − k i γ i [2 γ j (1 − k k ρ ) + k i γ i (1 − ρ )]2(1 − k k ρ ) ( γ j σ j ) (cid:90) tT ( p ( s ) − a j ) ds − k i ρ [ − γ i (1 − k k ) + k j γ j (1 − k k ρ )](1 − k k ρ ) σ σ γ j (cid:90) tT ( p ( s ) − a i )( p ( s ) − a j ) ds. (A.62) Step 3

Knowing that insurer i would execute its reinsurance strategy and investment strategy according to (A.48)and (A.55), the reinsurer then decides on its optimal strategy ( p ∗ ( · ) , b ∗ L ( · )) ∈ Π L .For the value function of the reinsurer, we conjecture that V L ( t, x L , y L , s ) = − γ L exp {− γ L ϕ L ( t )( x L + η L y L ) + g L ( t ) s − β + g L ( t ) } , (A.63)where ϕ L ( t ), g L ( t ) and g L ( t ) are deterministic, continuously diﬀerentiable functions with boundary conditions ϕ L ( T ) =1, g L ( T ) = 0 and g L ( T ) = 0. The HJB equation of the reinsurer is0 = sup ( p ( · ) ,b L ( · )) ∈ Π L (cid:8) V Lt + V Lx L (cid:2) ( p ( t ) − a )(1 − q ∗ ( t )) + ( p ( t ) − a )(1 − q ∗ ( t )) + ( r − r ) b L ( t ) + A L x L + B L y L + C L z L (cid:3) + 12 (cid:2) (1 − q ∗ ( t )) ( σ ) + (1 − q ∗ ( t )) ( σ ) + ( b L ( t )) σ s β + 2(1 − q ∗ ( t ))(1 − q ∗ ( t )) σ σ ρ (cid:3) V Lx L x L + ( x L − α L y L − e − α L h L z L ) V Ly L + rsV Ls + 12 σ s β +2 V Lss + b L ( t ) σ s β +1 V Lx L s (cid:9) . (A.64)The ﬁrst-order condition about b L ( t ) for maximizing the value in (A.64) gives that b ∗ L ( t ) = − ( r − r ) σ s β V Lx L V Lx L x L − s V Lx L s V Lx L x L = s − β γ L ϕ L ( t ) (cid:20) ( r − r ) σ − βg L ( t ) (cid:21) . (A.65)Substitute the investment strategy (A.65) into the HJB equation (A.64). Then, we have0 = sup p ( · ) ∈ [ c F , ¯ c ] V L (cid:8) γ L x L [ − ϕ Lt − ϕ L ( t ) A L − ϕ L ( t ) η L ] + γ L y L [ − ϕ Lt η L − ϕ L ( t ) B L + ϕ L ( t ) η L α L ] − γ L ϕ L ( t )[ C L z L − η L e − α L h L z L ] + s − β (cid:2) dg L ( t ) dt − βr g L ( t ) −

12 ( r − r σ ) (cid:3) + dg L ( t ) dt + β (2 β + 1) σ g L ( t ) − γ L ϕ L ( t )[( p ( t ) − a )(1 − q ∗ ( t )) + ( p ( t ) − a )(1 − q ∗ ( t ))]+ 12 γ L ( ϕ L ( t )) [(1 − q ∗ ( t )) σ + (1 − q ∗ ( t )) σ + 2(1 − q ∗ ( t ))(1 − q ∗ ( t )) σ σ ρ ] (cid:9) . (A.66)28bviously, p ( t ) does not depend on the state variables x L , y L and s − β . Then, we have ϕ L ( t ) = exp { ( A L + η L )( T − t ) } , (A.67) g L ( t ) = g ( t ) , (A.68)0 = sup p ( · ) ∈ [ c F , ¯ c ] (cid:8) dg L ( t ) dt + β (2 β + 1) σ g L ( t ) − γ L ϕ L ( t )[( p ( t ) − a )(1 − q ∗ ( t )) + ( p ( t ) − a )(1 − q ∗ ( t ))]+ 12 γ L ( ϕ L ( t )) [(1 − q ∗ ( t )) σ + (1 − q ∗ ( t )) σ + 2(1 − q ∗ ( t ))(1 − q ∗ ( t )) σ σ ρ ] (cid:9) . (A.69)To simplify our presentation, for i (cid:54) = j ∈ { , } , we denote K ˜ F i = ( γ j σ j γ i σ i + k i ρσ j σ i )(1 − k j ρσ i σ j ) , K F i = (1 + k i ργ i σ i γ j σ j )(1 − k i ρσ j σ i ) , N cF i ( t ) = c F − a i γ i σ i ϕ F i ( t ) ,N ¯ cF i ( t ) = ¯ c − a i γ i σ i ϕ F i ( t ) , N aF i ( t ) = a j − a i γ i σ i ϕ F i ( t ) , M F i ( t ) = γ i ϕ F i ( t ) + γ L ϕ L ( t )2 γ i ϕ F i ( t ) + γ L ϕ L ( t ) , K = 11 − k k ρ . (A.70)For the premium strategy p ( t ), we discuss it in the following situations: • Case (La)

If ˜ q i ( t ) ≥ i = 1 ,

2, we have q ∗ ( t, p ( t )) = 1 and q ∗ ( t, p ( t )) = 1. Substituting q ∗ ( t, p ( t )) and q ∗ ( t, p ( t ))into (A.69), we can get that 0 = sup p ( · ) ∈ [ c F , ¯ c ] (cid:8) dg L ( t ) dt + β (2 β + 1) σ g L ( t ) (cid:9) . (A.71)Then, p ∗ ( t ) = p, q ∗ ( t, p ∗ ( t )) = 1 , q ∗ ( t, p ∗ ( t )) = 1 , (A.72)where p is an arbitrary value in the interval [ c F , ¯ c ]. The precondition ˜ q i ( t ) ≥ i = 1 ,

2) becomes N cF i ( t )+ k i ρσ j σ i ≥ i (cid:54) = j ∈ { , } . By equation (A.71) and g L ( T ) = 0, we can get that g L ( t ) = g La ( t ) . = g ( t ) . (A.73) • Case (Lb)

If ˜ q i ( t ) ≥

1, ˜ q j ( t ) < i (cid:54) = j ∈ { , } , we have q ∗ i ( t, p ( t )) = 1 and q ∗ j ( t, p ( t )) = p ( t ) − a j γ j σ j ϕ Fj ( t ) + k j ρσ i σ j .Substituting q ∗ i ( t, p ( t )) and q ∗ j ( t, p ( t )) into (A.69) and simplifying gives0 = sup p ( · ) ∈ [ c F , ¯ c ] (cid:8) dg L ( t ) dt + β (2 β + 1) σ g L ( t ) − ( p ( t ) − a j ) γ L ϕ L ( t )(1 − k j ρσ i σ j )[1 + γ L ϕ L ( t ) γ j ϕ F j ( t ) ]+ 12 ( γ L ϕ L ( t )) ( σ j − k j ρσ i ) + ( p ( t ) − a j ) γ L ϕ L ( t ) γ j σ j ϕ F j ( t ) [1 + 12 γ L ϕ L ( t ) γ j ϕ F j ( t ) ] (cid:9) . (A.74)The ﬁrst-order condition about p ( t ) for maximizing the value in equation (A.74) gives p ∗ ( t ) = [ a j + γ j σ j ϕ F j ( t )(1 − k j ρσ i σ j ) M F j ( t )] ∨ c F ∧ ¯ c. (A.75) • Subcase (Lb1) If a j + γ j σ j ϕ F j ( t )(1 − k j ρσ i σ j ) M F j ( t ) ≥ ¯ c , we have p ∗ ( t ) = ¯ c, q ∗ i ( t, p ∗ ( t )) = 1 , q ∗ j ( t, p ∗ ( t )) = N ¯ cF j ( t ) + k j ρσ i σ j . (A.76)29he preconditions become K [ N ¯ cF i ( t ) + k i ρσ j σ i N ¯ cF j ( t )] ≥ N ¯ cF j ( t ) + k j ρσ i σ j <

1. Then, substituting p ∗ ( t )into (A.74) and integrating from T to t gives g L ( t ) = g Ljb ( t ) . = g ( t ) + γ L ( σ j − k j ρσ i ) A L + η L ) [( ϕ L ( t )) −

1] + (¯ c − a j ) γ L (1 − k j ρσ i σ j ) (cid:2) (cid:90) tT ϕ L ( s ) ds + γ L γ j (cid:90) tT ( ϕ L ( s )) ϕ F j ( s ) ds (cid:3) − (¯ c − a j ) γ L γ j σ j (cid:2) (cid:90) tT ϕ L ( s ) ϕ F j ( s ) ds + γ L γ j (cid:90) tT ( ϕ L ( s ) ϕ F j ( s ) ) ds (cid:3) . (A.77)Equations (A.60) and (A.61) become g F i ( t ) = g F i b ( t ) . = g ( t ) + γ i ( ϕ F i ( t ) − A i + η i (cid:2) k i (¯ c − a j )( − k j ρσ i σ j + k k ργ i σ i γ j σ j − ρσ i γ i γ j σ j ) − ( θ i a i − k i θ j a j ) (cid:3) + γ i σ i [1 + ( k k ρ ) − k k ρ ]4( A i + η i ) [( ϕ F i ( t )) −

1] + k i γ i γ j σ j [1 + k i γ i γ j ](¯ c − a j ) ( T − t ) , (A.78)and g F j ( t ) = g ˜ F j b ( t ) . = g ( t ) + γ j ( ϕ F j ( t ) − A j + η j (cid:2) (1 − k j ρσ i σ j )(¯ c − a j ) − ( θ j a j − k j θ i a i ) (cid:3) + ( k j γ j σ i ) (1 − ρ )4( A j + η j ) [( ϕ F j ( t )) − − σ j ) (¯ c − a j ) ( T − t ) . (A.79) • Subcase (Lb2) If a j + γ j σ j ϕ F j ( t )(1 − k j ρσ i σ j ) M F j ( t ) ≤ c F , we have p ∗ ( t ) = c F , q ∗ i ( t, p ∗ ( t )) = 1 , q ∗ j ( t, p ∗ ( t )) = N cF j ( t ) + k j ρσ i σ j . (A.80)The preconditions become K [ N cF i ( t ) + k i ρσ j σ i N cF j ( t )] ≥ N cF j ( t ) + k j ρσ i σ j <

1. Then, substituting p ∗ ( t )into (A.74) and integrating from T to t gives g L ( t ) = g Ljb ( t ) . = g ( t ) + γ L ( σ j − k j ρσ i ) A L + η L ) [( ϕ L ( t )) −

1] + ( c F − a j ) γ L (1 − k j ρσ i σ j ) (cid:2) (cid:90) tT ϕ L ( s ) ds + γ L γ j (cid:90) tT ( ϕ L ( s )) ϕ F j ( s ) ds (cid:3) − ( c F − a j ) γ L γ j σ j (cid:2) (cid:90) tT ϕ L ( s ) ϕ F j ( s ) ds + γ L γ j (cid:90) tT ( ϕ L ( s ) ϕ F j ( s ) ) ds (cid:3) . (A.81)Equations (A.60) and (A.61) become g F i ( t ) = g F i b ( t ) . = g ( t ) + γ i ( ϕ F i ( t ) − A i + η i (cid:2) k i ( c F − a j )( − k j ρσ i σ j + k k ργ i σ i γ j σ j − ρσ i γ i γ j σ j ) − ( θ i a i − k i θ j a j ) (cid:3) + γ i σ i [1 + ( k k ρ ) − k k ρ ]4( A i + η i ) [( ϕ F i ( t )) −

1] + k i γ i γ j σ j [1 + k i γ i γ j ]( c F − a j ) ( T − t ) , (A.82)and g F j ( t ) = g ˜ F j b ( t ) . = g ( t ) + γ j ( ϕ F j ( t ) − A j + η j (cid:2) (1 − k j ρσ i σ j )( c F − a j ) − ( θ j a j − k j θ i a i ) (cid:3) + ( k j γ j σ i ) (1 − ρ )4( A j + η j ) [( ϕ F j ( t )) − − σ j ) ( c F − a j ) ( T − t ) . (A.83)30 Subcase (Lb3) If c F < a j + γ j σ j ϕ F j ( t )(1 − k j ρσ i σ j ) M F j ( t ) < ¯ c , we have p ∗ ( t ) = a j + γ j σ j ϕ F j ( t )(1 − k j ρσ i σ j ) M F j ( t ) ,q ∗ i ( t, p ∗ ( t )) = 1 , q ∗ j ( t, p ∗ ( t )) = (1 − k j ρσ i σ j ) M F j ( t ) + k j ρσ i σ j . (A.84)The preconditions become K [ N aF i ( t ) + K ˜ F i M F j ( t )] ≥ − k j ρσ i σ j ) M F j ( t ) + k j ρσ i σ j <

1. Substituting p ∗ ( t ) into (A.74) and integrating from T to t gives g L ( t ) = g Ljb ( t ) . = g ( t ) + γ L ( σ j − k j ρσ i ) A L + η L ) [( ϕ L ( t )) − γ L ( σ j − k j ρσ i ) (cid:90) tT ϕ L ( s )( γ j ϕ F j ( s ) + γ L ϕ L ( s )) M F j ( s ) ds. (A.85)Equations (A.60) and (A.61) become g F i ( t ) = g F i b ( t ) . = g ( t ) − γ i ( θ i a i − k i θ j a j ) A i + η i [ ϕ F i ( t ) −

1] + γ i σ i [1 + ( k k ρ ) − k k ρ ]4( A i + η i ) [( ϕ F i ( t )) − − k i γ i ( σ j − k j ρσ i )[ − γ j σ j + k j ργ j σ i + k k ργ i σ i − ρσ i γ i ] (cid:90) tT ϕ F i ( s ) ϕ F j ( s ) M F j ( s ) ds − k i γ i (2 γ j + k i γ i )( σ j − k j ρσ i ) (cid:90) tT ( ϕ F j ( s ) M F j ( s )) ds, (A.86)and g F j ( t ) = g ˜ F j b ( t ) . = g ( t ) − γ j ( θ j a j − k j θ i a i ) A j + η j [ ϕ F j ( t ) −

1] + ( k j γ j σ i ) (1 − ρ )4( A j + η j ) [( ϕ F j ( t )) − − [ γ j ( σ j − k j ρσ i )] (cid:90) tT ( ϕ F j ( s )) M F j ( s ) [3 γ j ϕ F j ( s ) + γ L ϕ L ( s )]2[2 γ j ϕ F j ( s ) + γ L ϕ L ( s )] ds. (A.87) • Case (Lc)

If ˜ q i ( t, p ( t )) < i = 1 ,

2, we have q ∗ i ( t, p ( t )) = ˜ q i ( t, p ( t )). Substituting q ∗ i ( t, p ( t )) into (A.69) andsimplifying gives0 = sup p ( · ) ∈ [ c F , ¯ c ] (cid:8) dg L ( t ) dt + β (2 β + 1) σ g L ( t ) + 12 ( γ L ϕ L ( t )) [( σ ) + ( σ ) + 2 σ σ ρ ]+ Kγ L (cid:2) − ϕ L ( t ) D F ( t ) γ ϕ F ( t ) ( p ( t ) − a ) − ϕ L ( t ) D F ( t ) γ ϕ F ( t ) ( p ( t ) − a ) + ϕ L ( t ) D ˜ F ( t ) γ σ ϕ F ( t ) ( p ( t ) − a ) + ϕ L ( t ) D ˜ F ( t ) γ σ ϕ F ( t ) ( p ( t ) − a ) + ρϕ L ( t ) D F ( t ) γ γ σ σ ϕ F ( t ) ϕ F ( t ) ( p ( t ) − a )( p ( t ) − a ) (cid:3)(cid:9) , (A.88)where D F i ( t ) = 1 K γ i ϕ F i ( t ) + γ L ϕ L ( t )[1 + k j ρ + σ j ρσ i (1 + k j )] , i (cid:54) = j ∈ { , } ,D ˜ F i ( t ) =1 + K (1 + ( k j ρ ) + 2 k j ρ ) γ L ϕ L ( t )2 γ i ϕ F i ( t ) , i (cid:54) = j ∈ { , } ,D F ( t ) = k γ ϕ F ( t ) + k γ ϕ F ( t ) + K (1 + k + k + k k ρ ) γ L ϕ L ( t ) . (A.89)31he ﬁrst-order condition about p ( t ) for maximizing the value in equation (A.88) gives that p ∗ ( t ) = P N ( t ) P D ( t ) ∨ c F ∧ ¯ c, (A.90)where P N ( t ) =( σ σ ) [ γ ϕ F ( t ) D F ( t ) + γ ϕ F ( t ) D F ( t )] + 2 a σ γ ϕ F ( t ) D ˜ F ( t )+ 2 a σ γ ϕ F ( t ) D ˜ F ( t ) + ( a + a ) ρσ σ D F ( t ) , (A.91) P D ( t ) =2 σ γ ϕ F ( t ) D ˜ F ( t ) + 2 σ γ ϕ F ( t ) D ˜ F ( t ) + 2 ρσ σ D F ( t ) . (A.92) • Subcase (Lc1) If P N ( t ) P D ( t ) ≥ ¯ c , then p ∗ ( t ) = ¯ c, q ∗ i ( t, p ∗ ( t )) = K [ N ¯ cF i ( t ) + k i ρσ j σ i N ¯ cF j ( t )] . (A.93)The precondition becomes K [ N ¯ cF i ( t ) + k i ρσ j σ i N ¯ cF j ( t )] < i = 1 ,

2. Substituting p ∗ ( t ) into the equation(A.88) and integrating from T to t gives g L ( t ) = g Lc ( t ) . = g ( t ) + γ L [( σ ) + ( σ ) + 2 σ σ ρ ]4( A L + η L ) [( ϕ L ( t )) −

1] + Kγ L (cid:110) (¯ c − a ) γ (cid:90) tT ϕ L ( s ) D F ( s ) ϕ F ( s ) ds + (¯ c − a ) γ (cid:90) tT ϕ L ( s ) D F ( s ) ϕ F ( s ) ds − (¯ c − a ) γ σ (cid:90) tT ϕ L ( s ) D ˜ F ( s ) ϕ F ( s ) ds − (¯ c − a ) γ σ (cid:90) tT ϕ L ( s ) D ˜ F ( s ) ϕ F ( s ) ds − ρ (¯ c − a )(¯ c − a ) γ γ σ σ (cid:90) tT ϕ L ( s ) D F ( s ) ϕ F ( s ) ϕ F ( s ) ds (cid:111) . (A.94)Equation (A.62) becomes g F i ( t ) = g F i c ( t ) . = g ( t ) + γ i A i + η i [¯ c − a i − k i (¯ c − a j ) − θ i a i + k i θ j a j ][ ϕ F i ( t ) − K ( T − t )2 (cid:8) − [1 − ( k k ρ ) ]( σ i ) (¯ c − a i ) + k i γ i [2 γ j − k k γ j ρ + k i γ i (1 − ρ )]( γ j σ j ) (¯ c − a j ) + 2 k i ρ [ − γ i (1 − k k ) + k j γ j (1 − k k ρ )] σ i σ j γ j (¯ c − a i )(¯ c − a j ) (cid:9) . (A.95) • Subcase (Lc2) If P N ( t ) P D ( t ) ≤ c F , then p ∗ ( t ) = c F , q ∗ i ( t, p ∗ ( t )) = K [ N cF i ( t ) + k i ρσ j σ i N cF j ( t )] . (A.96)The precondition becomes K [ N cF i ( t )+ k i ρσ j σ i N cF j ( t )] < i = 1 ,

2. Substituting p ∗ ( t ) into the equation(A.88)and integrating from T to t gives g L ( t ) = g Lc ( t ) . = g ( t ) + γ L [( σ ) + ( σ ) + 2 σ σ ρ ]4( A L + η L ) [( ϕ L ( t )) −

1] + Kγ L (cid:110) ( c F − a ) γ (cid:90) tT ϕ L ( s ) D F ( s ) ϕ F ( s ) ds + ( c F − a ) γ (cid:90) tT ϕ L ( s ) D F ( s ) ϕ F ( s ) ds − ( c F − a ) γ σ (cid:90) tT ϕ L ( s ) D ˜ F ( s ) ϕ F ( s ) ds ( c F − a ) γ σ (cid:90) tT ϕ L ( s ) D ˜ F ( s ) ϕ F ( s ) ds − ρ ( c F − a )( c F − a ) γ γ σ σ (cid:90) tT ϕ L ( s ) D F ( s ) ϕ F ( s ) ϕ F ( s ) ds (cid:111) . (A.97)Equation (A.62) becomes g F i ( t ) = g F i c ( t ) . = g ( t ) + γ i A i + η i [ c F − a i − k i ( c F − a j ) − θ i a i + k i θ j a j ][ ϕ F i ( t ) − K ( T − t )2 (cid:8) − [1 − ( k k ρ ) ]( σ i ) ( c F − a i ) + k i γ i [2 γ j − k k γ j ρ + k i γ i (1 − ρ )]( γ j σ j ) ( c F − a j ) + 2 k i ρ [ − γ i (1 − k k ) + k j γ j (1 − k k ρ )] σ i σ j γ j ( c F − a i )( c F − a j ) (cid:9) . (A.98) • Subcase (Lc3) If c F < P N ( t ) P D ( t ) < ¯ c , then p ∗ ( t ) = P N ( t ) P D ( t ) , q ∗ i ( t, p ∗ ( t )) = K (cid:104) P N ( t ) P D ( t ) − a i γ i σ i ϕ F i ( t ) + k i ρ ( P N ( t ) P D ( t ) − a j ) γ j σ j σ i ϕ F j ( t ) (cid:105) . (A.99)The precondition becomes K [ PN ( t ) PD ( t ) − a i γ i σ i ϕ Fi ( t ) + k i ρ ( PN ( t ) PD ( t ) − a j ) γ j σ j σ i ϕ Fj ( t ) ] < i = 1 ,

2. Substituting p ∗ ( t ) into the equa-tion(A.88) and integrating from T to t gives g L ( t ) = g Lc ( t ) . = g ( t ) + γ L [( σ ) + ( σ ) + 2 σ σ ρ ]4( A L + η L ) [( ϕ L ( t )) −

1] + Kγ L (cid:8) (cid:90) tT ϕ L ( s ) D F ( s ) γ ϕ F ( s ) ( P N ( s ) P D ( s ) − a ) ds + (cid:90) tT ϕ L ( s ) D F ( s ) γ ϕ F ( s ) ( P N ( s ) P D ( s ) − a ) ds − (cid:90) tT ϕ L ( s ) D ˜ F ( s ) γ σ ϕ F ( s ) ( P N ( s ) P D ( s ) − a ) ds − (cid:90) tT ϕ L ( s ) D ˜ F ( s ) γ σ ϕ F ( s ) ( P N ( s ) P D ( s ) − a ) ds − (cid:90) tT ρϕ L ( s ) D F ( s ) γ γ σ σ ϕ F ( s ) ϕ F ( s ) ( P N ( s ) P D ( s ) − a )( P N ( s ) P D ( s ) − a ) ds (cid:9) . (A.100)Equation (A.62) becomes g F i ( t ) = g F i c ( t ) . = g ( t ) − γ i ( θ i a i − k i θ j a j ) A i + η i [ ϕ F i ( t ) − − γ i (cid:90) tT ϕ F i ( s )( P N ( s ) P D ( s ) − a i ) ds + k i γ i (cid:90) tT ϕ F i ( s )( P N ( s ) P D ( s ) − a j ) ds + K (cid:8) − ( k k ρ ) ( σ i ) (cid:90) tT ( P N ( s ) P D ( s ) − a i ) ds − k i γ i [2 γ j − k k γ j ρ + k i γ i (1 − ρ )]( γ j σ j ) (cid:90) tT ( P N ( s ) P D ( s ) − a j ) ds − k i ρ [ − γ i (1 − k k ) + k j γ j (1 − k k ρ )] σ i σ j γ j ] (cid:90) tT ( P N ( s ) P D ( s ) − a i )( P N ( s ) P D ( s ) − a j ) ds (cid:9) . (A.101)Summing up the above processes, we can get Theorem 1. Appendix B Proof of Corollary 2

Proof:

The conclusions in Table 2 can be obtained by taking partial derivatives of $b_L^*(t)$ and $b_i^*(t)$ with respect to the corresponding variables. From

$$
A_L = r_0 - B_L - C_L, \qquad C_L = \eta_L e^{-\alpha_L h_L}, \qquad B_L e^{-\alpha_L h_L} = (\alpha_L + A_L + \eta_L)\, C_L,
$$

we get

$$
A_L = \frac{1}{1+\eta_L}\left[r_0 - (\alpha_L + \eta_L)\eta_L - \eta_L e^{-\alpha_L h_L}\right].
$$

Substituting $A_L$ into equation (3.23) and taking the derivative with respect to $\eta_L$ and $\alpha_L$, respectively, we get

$$
\frac{\partial b_L^*(t)}{\partial \eta_L} = b_L^*(t)(T-t)\frac{1}{(1+\eta_L)^2}\left[r_0 + \alpha_L + e^{-\alpha_L h_L} - 1\right], \qquad
\frac{\partial b_L^*(t)}{\partial \alpha_L} = b_L^*(t)(T-t)\frac{\eta_L}{1+\eta_L}\left[1 - h_L e^{-\alpha_L h_L}\right].
$$

Note that $b_L^*(t) > 0$ for $\beta \ge 0$. Thus, the left side of equations (3.32) and (3.33) is established. Following similar derivations, we can show that the right side of equations (3.32) and (3.33) is also true.
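The closed form for $A_L$ and the two sensitivities above can be checked numerically. The following Python sketch is only an illustration: since equation (3.23) is not restated in this appendix, it assumes that $b_L^*(t)$ depends on $\eta_L$ and $\alpha_L$ only through a factor $\exp\{-(A_L+\eta_L)(T-t)\}$. All function names and parameter values are hypothetical.

```python
import math


def a_l(eta, alpha, h, r0):
    """A_L solved from A_L = r0 - B_L - C_L, with C_L = eta*exp(-alpha*h)
    and B_L*exp(-alpha*h) = (alpha + A_L + eta)*C_L."""
    return (r0 - (alpha + eta) * eta - eta * math.exp(-alpha * h)) / (1.0 + eta)


def growth_rate(eta, alpha, h, r0):
    """A_L + eta_L, the rate in the ASSUMED factor exp(-(A_L + eta_L)(T - t))."""
    return a_l(eta, alpha, h, r0) + eta


def rel_sensitivity_eta(eta, alpha, h, r0, tau):
    """(1/b) * db/d(eta) for tau = T - t, as stated in the proof."""
    return tau * (r0 + alpha + math.exp(-alpha * h) - 1.0) / (1.0 + eta) ** 2


def rel_sensitivity_alpha(eta, alpha, h, r0, tau):
    """(1/b) * db/d(alpha) for tau = T - t, as stated in the proof."""
    return tau * eta / (1.0 + eta) * (1.0 - h * math.exp(-alpha * h))


def central_diff(f, x, step=1e-6):
    """Symmetric finite-difference approximation of f'(x)."""
    return (f(x + step) - f(x - step)) / (2.0 * step)


def log_b(eta, alpha, h, r0, tau):
    """log of the ASSUMED eta/alpha-dependent factor of b_L^*(t)."""
    return -growth_rate(eta, alpha, h, r0) * tau
```

Under the stated assumption, `central_diff` applied to `log_b` in `eta` or `alpha` should agree with `rel_sensitivity_eta` and `rel_sensitivity_alpha`, so the sign of each derivative is governed by the bracketed terms $r_0 + \alpha_L + e^{-\alpha_L h_L} - 1$ and $1 - h_L e^{-\alpha_L h_L}$.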

Appendix C Proof of Lemma 1

Proof:

According to (2.6) and (2.18), applying Itô's formula gives

$$
\begin{aligned}
d\big(V^{F_i}(t)\big)^2 ={}& 2V^{F_i}(t)\Big\{\mathcal{A}^{F_i}V^{F_i}(t,\hat{x}_i,y_i,y_j,s)\big|_{(q_1^*(\cdot),\,b_1^*(\cdot),\,q_2^*(\cdot),\,b_2^*(\cdot))}\,dt
+ V^{F_i}_{\hat{x}_i}\big[q_i^*(t)\sigma_i\,dW_i(t) - k_i q_j^*(t)\sigma_j\,dW_j(t)\big] \\
&\quad + \big[V^{F_i}_{\hat{x}_i}\big(b_i^*(t) - k_i b_j^*(t)\big) + sV^{F_i}_s\big]\sigma s^{\beta}\,dW(t)\Big\} \\
&+ \Big\{\big(V^{F_i}_{\hat{x}_i}\big)^2\big[(q_i^*(t)\sigma_i)^2 + (k_i q_j^*(t)\sigma_j)^2 + \big(b_i^*(t) - k_i b_j^*(t)\big)^2\sigma^2 s^{2\beta} - 2q_i^*(t)\sigma_i k_i q_j^*(t)\sigma_j\rho\big] \\
&\quad + \sigma^2 s^{2\beta+2}\big(V^{F_i}_s\big)^2 + 2V^{F_i}_{\hat{x}_i}V^{F_i}_s\big(b_i^*(t) - k_i b_j^*(t)\big)\sigma^2 s^{2\beta+1}\Big\}\,dt.
\end{aligned}
$$

From Section 3.1, we know that $\mathcal{A}^{F_i}V^{F_i}(t,\hat{x}_i,y_i,y_j,s)\big|_{(q_1^*(\cdot),\,b_1^*(\cdot),\,q_2^*(\cdot),\,b_2^*(\cdot))} = 0$. By plugging the expressions of $V^{F_i}_{\hat{x}_i}$, $V^{F_i}_s$, $b_1^*(\cdot)$ and $b_2^*(\cdot)$ into the above equation, we obtain

$$
\begin{aligned}
\frac{d\big(V^{F_i}(t)\big)^2}{\big(V^{F_i}(t)\big)^2}
={}& -2\Big\{\gamma_i\phi_{F_i}(t)\big[q_i^*(t)\sigma_i\,dW_i(t) - k_i q_j^*(t)\sigma_j\,dW_j(t)\big] + \frac{r_1 - r_0}{\sigma s^{\beta}}\,dW(t)\Big\} \\
&+ \Big\{\big(\gamma_i\phi_{F_i}(t)\big)^2\big[(q_i^*(t)\sigma_i)^2 + (k_i q_j^*(t)\sigma_j)^2 - 2q_i^*(t)\sigma_i k_i q_j^*(t)\sigma_j\rho\big] + \Big(\frac{r_1 - r_0}{\sigma s^{\beta}}\Big)^2\Big\}\,dt \\
={}& \Theta^{F_i}_1(t)\,dt + \Theta^{F_i}_2(t)\,dW_i(t) + \Theta^{F_i}_3(t)\,dW_j(t) - \frac{2(r_1 - r_0)}{\sigma s^{\beta}}\,dW(t),
\end{aligned}
\tag{C.102}
$$

where

$$
\Theta^{F_i}_1(t) = \big(\gamma_i\phi_{F_i}(t)\big)^2\big[(q_i^*(t)\sigma_i)^2 + (k_i q_j^*(t)\sigma_j)^2 - 2q_i^*(t)\sigma_i k_i q_j^*(t)\sigma_j\rho\big] + \Big(\frac{r_1 - r_0}{\sigma s^{\beta}}\Big)^2,
$$

$$
\Theta^{F_i}_2(t) = -2\gamma_i\phi_{F_i}(t)q_i^*(t)\sigma_i, \qquad
\Theta^{F_i}_3(t) = 2\gamma_i\phi_{F_i}(t)k_i q_j^*(t)\sigma_j.
$$

The forms of $q_i^*(t)$ and $q_j^*(t)$ in different cases are given by Theorem 1. Thus, the solution to equation (C.102) is

$$
\begin{aligned}
\frac{\big(V^{F_i}(t)\big)^2}{\big(V^{F_i}(0)\big)^2}
={}& \exp\Big\{\int_0^t \Theta^{F_i}_1(\iota)\,d\iota\Big\}
\cdot \exp\Big\{\int_0^t\Big[-\frac{1}{2}\big(\Theta^{F_i}_2(\iota)\big)^2 - \frac{1}{2}\big(\Theta^{F_i}_3(\iota)\big)^2\Big]d\iota + \int_0^t \Theta^{F_i}_2(\iota)\,dW_i(\iota) + \int_0^t \Theta^{F_i}_3(\iota)\,dW_j(\iota)\Big\} \\
&\cdot \exp\Big\{-\int_0^t \frac{2(r_1 - r_0)^2}{\sigma^2}\big(S(\iota)\big)^{-2\beta}d\iota - \int_0^t \frac{2(r_1 - r_0)}{\sigma}\big(S(\iota)\big)^{-\beta}dW(\iota)\Big\}.
\end{aligned}
\tag{C.103}
$$

By virtue of Novikov's condition, we know that

$$
\exp\Big\{-\frac{1}{2}\int_0^t \big(\Theta^{F_i}_2(\iota)\big)^2 d\iota + \int_0^t \Theta^{F_i}_2(\iota)\,dW_i(\iota)\Big\}
\quad\text{and}\quad
\exp\Big\{-\frac{1}{2}\int_0^t \big(\Theta^{F_i}_3(\iota)\big)^2 d\iota + \int_0^t \Theta^{F_i}_3(\iota)\,dW_j(\iota)\Big\}
$$

are two martingales. In addition, according to (2.6) and Itô's formula, we can derive

$$
d\big(S(t)\big)^{-2\beta} = \big[\beta(2\beta+1)\sigma^2 - 2\beta r_1\big(S(t)\big)^{-2\beta}\big]dt - 2\beta\sigma\big(S(t)\big)^{-\beta}dW(t). \tag{C.104}
$$

By using Lemma 4.3 and Theorem 5.1 of Zeng and Taksar (2013), we can verify that

$$
\exp\Big\{-\int_0^t \frac{2(r_1 - r_0)^2}{\sigma^2}\big(S(\iota)\big)^{-2\beta}d\iota - \int_0^t \frac{2(r_1 - r_0)}{\sigma}\big(S(\iota)\big)^{-\beta}dW(\iota)\Big\}
$$

is a martingale. From the range of parameters and the form of $\phi_{F_i}(t)$, we know that $\Theta^{F_i}_1(t) < \infty$. Thus, taking expectations on both sides of equation (C.103), we obtain

$$
E\big[\big(V^{F_i}(t)\big)^2\big] = \big(V^{F_i}(0)\big)^2\exp\Big\{\int_0^t \Theta^{F_i}_1(\iota)\,d\iota\Big\} < +\infty.
$$

Therefore, for $n = 1, 2, \cdots$, we have

$$
E_{t,\hat{x}_i,y_i,y_j,s}\Big\{\big[V^{F_i}\big(\tau_n\wedge T, \hat{X}_i(\tau_n\wedge T), Y_i(\tau_n\wedge T), Y_j(\tau_n\wedge T), S(\tau_n\wedge T)\big)\big]^2\Big\} < +\infty.
$$

Then, the proof is complete.

Appendix D Proof of Theorem 2

Proof:

Clearly, the strategy $(p^*(t), b_L^*(t), q_1^*(t), b_1^*(t), q_2^*(t), b_2^*(t))$ obtained in Theorem 1 is admissible, i.e., $(p^*(t), b_L^*(t), q_1^*(t), b_1^*(t), q_2^*(t), b_2^*(t)) \in \Pi_L \times \Pi_1 \times \Pi_2$. Next, we show the optimality of $(p^*(t), b_L^*(t), q_1^*(t), b_1^*(t), q_2^*(t), b_2^*(t))$ in $\Pi_L \times \Pi_1 \times \Pi_2$. From the construction of $\tau_n$, we know that $\tau_n \wedge T \to T$ when $n \to +\infty$. For any $(p(t), b_L(t), q_1(t), b_1(t), q_2(t), b_2(t)) \in \Pi_L \times \Pi_1 \times \Pi_2$ and any $\iota \in [t, T]$, we apply Itô's formula to $V^{F_i}(t, \hat{x}_i, y_i, y_j, s)$ and deduce

$$
\begin{aligned}
& V^{F_i}\big(\tau_n\wedge T, \hat{X}_i(\tau_n\wedge T), Y_i(\tau_n\wedge T), Y_j(\tau_n\wedge T), S(\tau_n\wedge T)\big) \\
&= V^{F_i}(t, \hat{x}_i, y_i, y_j, s) + \int_t^{\tau_n\wedge T} \mathcal{A}^{F_i}V^{F_i}\big(\iota, \hat{X}_i(\iota), Y_i(\iota), Y_j(\iota), S(\iota)\big)\,d\iota \\
&\quad + \int_t^{\tau_n\wedge T} V^{F_i}_{\hat{x}_i}\big(\iota, \hat{X}_i(\iota), Y_i(\iota), Y_j(\iota), S(\iota)\big)\,q_i(\iota)\sigma_i\,dW_i(\iota) \\
&\quad - \int_t^{\tau_n\wedge T} V^{F_i}_{\hat{x}_i}\big(\iota, \hat{X}_i(\iota), Y_i(\iota), Y_j(\iota), S(\iota)\big)\,k_i q_j(\iota)\sigma_j\,dW_j(\iota) \\
&\quad + \int_t^{\tau_n\wedge T} V^{F_i}_{\hat{x}_i}\big(\iota, \hat{X}_i(\iota), Y_i(\iota), Y_j(\iota), S(\iota)\big)\big(b_i(\iota) - k_i b_j(\iota)\big)\sigma\big(S(\iota)\big)^{\beta}\,dW(\iota) \\
&\quad + \int_t^{\tau_n\wedge T} V^{F_i}_{s}\big(\iota, \hat{X}_i(\iota), Y_i(\iota), Y_j(\iota), S(\iota)\big)\,\sigma\big(S(\iota)\big)^{\beta+1}\,dW(\iota).
\end{aligned}
$$

Because the last four terms are square-integrable martingales with zero expectations, taking expectations on both sides of the above equation conditional on $\hat{X}_i(t) = \hat{x}_i$, $Y_i(t) = y_i$, $Y_j(t) = y_j$ and $S(t) = s$, we have

$$
\begin{aligned}
& E_{t,\hat{x}_i,y_i,y_j,s}\big[V^{F_i}\big(\tau_n\wedge T, \hat{X}_i(\tau_n\wedge T), Y_i(\tau_n\wedge T), Y_j(\tau_n\wedge T), S(\tau_n\wedge T)\big)\big] \\
&= V^{F_i}(t, \hat{x}_i, y_i, y_j, s) + E_{t,\hat{x}_i,y_i,y_j,s}\Big[\int_t^{\tau_n\wedge T} \mathcal{A}^{F_i}V^{F_i}\big(\iota, \hat{X}_i(\iota), Y_i(\iota), Y_j(\iota), S(\iota)\big)\,d\iota\Big]
\le V^{F_i}(t, \hat{x}_i, y_i, y_j, s).
\end{aligned}
$$

By Lemma 1, the uniform integrability of $V^{F_i}\big(\tau_n\wedge T, \hat{X}_i(\tau_n\wedge T), Y_i(\tau_n\wedge T), Y_j(\tau_n\wedge T), S(\tau_n\wedge T)\big)$ yields

$$
\begin{aligned}
& \sup_{(q_i(\cdot),\,b_i(\cdot)) \in \Pi_i} E_{t,\hat{x}_i,y_i,y_j,s}\big[U_i\big(\hat{X}_i^{\pi_i}(T) + \eta_i Y_i(T) - k_i\eta_j Y_j(T)\big)\big] \\
&= \lim_{n\to+\infty} E_{t,\hat{x}_i,y_i,y_j,s}\big[V^{F_i}\big(\tau_n\wedge T, \hat{X}_i(\tau_n\wedge T), Y_i(\tau_n\wedge T), Y_j(\tau_n\wedge T), S(\tau_n\wedge T)\big)\big]
\le V^{F_i}(t, \hat{x}_i, y_i, y_j, s).
\end{aligned}
$$

When $(q_i(\cdot), b_i(\cdot)) = (q_i^*(\cdot), b_i^*(\cdot))$, the inequality in the above formula becomes an equality, and thus

$$
\sup_{(q_i(\cdot),\,b_i(\cdot)) \in \Pi_i} E_{t,\hat{x}_i,y_i,y_j,s}\big[U_i\big(\hat{X}_i^{\pi_i}(T) + \eta_i Y_i(T) - k_i\eta_j Y_j(T)\big)\big] = V^{F_i}(t, \hat{x}_i, y_i, y_j, s).
$$

Following similar derivations, we can obtain

$$
\sup_{(p(\cdot),\,b_L(\cdot)) \in \Pi_L} E_{t,x_L,y_L,s}\big[U_L\big(X_L^{\pi_L}(T) + \eta_L Y_L(T)\big)\big] \le V^{L}(t, x_L, y_L, s).
$$

And when ( p ( · ) , b L ( · )) = ( p ∗ ( · ) , b ∗ L ( ··