Dynamic Reinsurance in Discrete Time Minimizing the Insurer's Cost of Capital

Abstract

In the classical static optimal reinsurance problem, the cost of capital for the insurer's risk exposure determined by a monetary risk measure is minimized over the class of reinsurance treaties represented by increasing Lipschitz retained loss functions. In this paper, we consider a dynamic extension of this reinsurance problem in discrete time which can be viewed as a risk-sensitive Markov Decision Process. The model allows for both insurance claims and premium income to be stochastic and operates with general risk measures and premium principles. We derive the Bellman equation and show the existence of a Markovian optimal reinsurance policy. Under an infinite planning horizon, the model is shown to be contractive and the optimal reinsurance policy to be stationary. The results are illustrated with examples where the optimal policy can be determined explicitly.


arXiv [q-fin.RM]

ALEXANDER GLAUNER ∗

Key words: Optimal reinsurance, risk measure, cost of capital, risk-sensitive Markov Decision Process

AMS subject classifications: 91G05, 91G70, 90C40

1. Introduction

Reinsurance is an important instrument for insurance companies to reduce their risk exposure. The optimal design of reinsurance contracts has been studied extensively in the actuarial literature for more than half a century. In their pioneering works, Borch (1960) and Arrow (1963) considered the variance of the retained risk and the exponential utility of terminal wealth as optimality criteria. Since then, a wide range of objectives for the optimal use of reinsurance has been discussed. For a comprehensive literature overview we refer the interested reader to Chapter 8 of Albrecher et al. (2017).

Developments in the insurance sector's regulatory framework like Solvency II or the Swiss Solvency Test led to a special interest in monetary risk measures like Value-at-Risk and Expected Shortfall as objective functionals. Economically speaking, the target is to minimize the capital requirement or equivalently the cost of capital for the effective risk after reinsurance which is calculated by the respective monetary risk measure. This line of research has been initiated by Cai and Tan (2007) who optimized the retention levels of stop-loss contracts under the expected premium principle with respect to Value-at-Risk and Expected Shortfall. Cai et al. (2008) extended the same setting to the non-parametric class of increasing convex reinsurance treaties. A further step to general premium principles has been made by Chi and Tan (2013) who considered the now standard class of increasing Lipschitz reinsurance contracts. Subsequently, more general risk measures were considered. Cui et al. (2013) were the first to study arbitrary distortion risk measures. Similar findings were reached by Zhuang et al. (2016) using their less technical marginal indemnification function approach. Other extensions of the cost of capital minimization problem concerned additional constraints, see e.g. Lo (2017), or multidimensional settings induced by a macroeconomic perspective, see Bäuerle and Glauner (2018).

Research on the cost of capital minimization problem has so far been focused on static single-period models. Other optimality criteria for the choice of reinsurance have however been considered in dynamic setups. Schäl (2004) studied the control of an insurer's surplus process in discrete time by means of investment and reinsurance using general parametric contracts. Optimality criteria were maximization of lifetime dividends and minimization of the ruin probability. The continuous-time versions of these problems gained greater attention in the literature. For an overview we refer to Albrecher and Thonhauser (2009) and the books Schmidli (2008), Azcue and Muler (2014). Several authors used Value-at-Risk and Expected Shortfall based solvency constraints in continuous time reinsurance models, see Chen et al. (2010) and Liu et al. (2013) for two early contributions.

The only study of solvency capital requirements or corresponding cost of capital as optimization target in a dynamic setup is the recent paper by Bäuerle and Glauner (2020c). They minimized the cost of capital for the discrete-time total discounted loss determined by a general spectral risk measure over the class of increasing Lipschitz retained loss functions from which the insurer selects a treaty in every period depending on the current surplus. Since reinsurance treaties are typically written for one year (Albrecher et al., 2017), only modeling in discrete time is realistic. Continuous time models are typically used when the insurer's surplus is managed by both reinsurance and capital market instruments. They realistically describe the financial market, but with regard to reinsurance they are only a compromise.

The aim of this paper is to introduce another dynamic extension in discrete time of the static cost of capital minimization problem. We propose a recursive approach where in the terminal period the insurer faces the static problem and in any earlier period calculates the capital requirement taking into account the period's retained loss, the cost of reinsurance, and the future cost of capital. The latter is a random quantity depending on the development of the future surplus. The recursive minimization of risk measures has been studied by Asienkiewicz and Jaśkiewicz (2017) for an abstract Markov Decision Process and by Bäuerle and Jaśkiewicz (2017, 2018) for a dividend and an optimal growth problem specifically using the entropic risk measure. This choice is motivated by recursive utilities studied extensively in the economic literature since the entropic risk measure happens to be the certainty equivalent of an exponential utility. Bäuerle and Glauner (2020b) generalized the recursive approach to axiomatically characterized general monetary risk measures in an abstract Markov decision model. Here, we show that the approach is well-suited for a dynamic cost of capital optimization.

The paper is structured as follows: In Section 2 we recall some important facts about risk measures and premium principles. Then, we introduce the dynamic reinsurance model. The recursive cost of capital minimization problem is solved in Section 4 under a finite planning horizon. We derive a Bellman equation for the optimal cost of capital and show that there exists a Markovian optimal reinsurance policy only requiring that the monetary risk measures have the Fatou property. Under an infinite planning horizon, we additionally need coherence and can then show that the model is contractive and the optimal reinsurance policy stationary. Addressing a criticism by Albrecher et al. (2017, Sec. 8.4), who question the suitability of cost of capital minimization as a business objective, we show in Section 6 that the recursive cost of capital minimization is consistent with profit maximization, the primary target of any company. In Section 7, we illustrate our results with analytic examples. We specifically consider Value-at-Risk due to its practical relevance with regard to Solvency II.

∗ Department of Mathematics, Karlsruhe Institute of Technology (KIT), D-76128 Karlsruhe, Germany, [email protected].

2. Risk measures and premium principles

Let a probability space $(\Omega, \mathcal{A}, \mathbb{P})$ and a real number $p \in [1, \infty)$ be fixed. With $q \in (1, \infty]$ we denote the conjugate index satisfying $\frac{1}{p} + \frac{1}{q} = 1$ under the convention $\frac{1}{\infty} = 0$. Henceforth, $L^p = L^p(\Omega, \mathcal{A}, \mathbb{P})$ denotes the vector space of real-valued random variables which have an integrable $p$-th moment. $L^p_+$ is the subset of non-negative random variables. We follow the convention of the actuarial literature that positive realizations of random variables represent losses and negative ones gains. A risk measure is a functional $\rho : L^p \to \bar{\mathbb{R}}$. The notion of a premium principle $\pi : L^p_+ \to \bar{\mathbb{R}}$ is mathematically closely related but the applications are different. While the former determines the necessary solvency capital to bear a risk, the latter gives the price of (re)insuring it. In contrast to general financial risks, insurance risks are typically non-negative. Hence, it suffices to consider premium principles on $L^p_+$. The properties of risk measures discussed in the sequel apply to premium principles analogously.

Definition 2.1.

A risk measure $\rho : L^p \to \bar{\mathbb{R}}$ is
a) law-invariant if $\rho(X) = \rho(Y)$ for $X, Y$ with the same distribution.
b) monotone if $X \le Y$ implies $\rho(X) \le \rho(Y)$.
c) translation invariant if $\rho(X + m) = \rho(X) + m$ for all $m \in \mathbb{R}$.
d) normalized if $\rho(0) = 0$.
e) finite if $\rho(L^p) \subseteq \mathbb{R}$.
f) positive homogeneous if $\rho(\lambda X) = \lambda \rho(X)$ for all $\lambda \in \mathbb{R}_+$.
g) convex if $\rho(\lambda X + (1 - \lambda) Y) \le \lambda \rho(X) + (1 - \lambda) \rho(Y)$ for $\lambda \in [0, 1]$.
h) subadditive if $\rho(X + Y) \le \rho(X) + \rho(Y)$ for all $X, Y$.
i) said to have the Fatou property if for every sequence $\{X_n\}_{n \in \mathbb{N}} \subseteq L^p$ with $|X_n| \le Y$ $\mathbb{P}$-a.s. for some $Y \in L^p$ and $X_n \to X$ $\mathbb{P}$-a.s. for some $X \in L^p$ it holds
$$\liminf_{n \to \infty} \rho(X_n) \ge \rho(X).$$

Throughout, we only consider law-invariant risk measures and premium principles. A risk measure is called monetary if it is monotone and translation invariant. It appears to be consensus in the literature that these two properties are a necessary minimal requirement for any risk measure. However, the attribute monetary is rather unusual for premium principles since most of them are monotone but often not translation invariant. Monetary risk measures which are additionally positive homogeneous and subadditive are referred to as coherent. Note that positive homogeneity implies normalization and makes convexity and subadditivity equivalent. The Fatou property means that the risk measure is lower semicontinuous w.r.t. dominated convergence. The following result can be found in Rüschendorf (2013) as Theorem 7.24.

Lemma 2.2.

Finite and convex monetary risk measures have the Fatou property.
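The properties in Definition 2.1 can be checked by hand on small discrete distributions. The following Python sketch is an illustration that is not part of the paper: it takes the lower quantile as Value-at-Risk and a hypothetical two-point loss to show that a monotone and translation invariant risk measure need not be subadditive. The helper names `quantile` and `convolve` and all numbers are assumptions chosen for this example.

```python
from itertools import product


def quantile(dist, u):
    """Lower quantile inf{x : F(x) >= u} of a finite distribution {value: prob}."""
    cumulative = 0.0
    for value in sorted(dist):
        cumulative += dist[value]
        if cumulative >= u - 1e-12:  # small tolerance for rounding
            return value
    return max(dist)


def convolve(dist_x, dist_y):
    """Distribution of X + Y for independent X and Y."""
    out = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        out[x + y] = out.get(x + y, 0.0) + px * py
    return out


# Hypothetical loss: 100 with probability 4 %, otherwise 0.
loss = {0.0: 0.96, 100.0: 0.04}
alpha = 0.95

var_single = quantile(loss, alpha)
var_sum = quantile(convolve(loss, loss), alpha)

# Translation invariance: shifting the loss by m shifts the quantile by m.
shifted = {value + 7.0: prob for value, prob in loss.items()}
var_shifted = quantile(shifted, alpha)

print("VaR of one loss:", var_single)
print("VaR of the sum of two independent copies:", var_sum)
print("subadditive here?", var_sum <= 2 * var_single)
```

Each single loss stays below the 95 % level with probability 0.96, but the sum of two independent copies does so only with probability 0.9216, which is why the quantile of the sum exceeds the sum of the quantiles in this example.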

Pichler (2013) showed that coherent risk measures satisfy a triangular inequality.

Lemma 2.3.

For a coherent risk measure $\rho$ and $X, Y \in L^p$ it holds
$$|\rho(X) - \rho(Y)| \le \rho(|X - Y|).$$

We denote by $\mathcal{M}(\Omega, \mathcal{A}, \mathbb{P})$ the set of probability measures on $(\Omega, \mathcal{A})$ which are absolutely continuous with respect to $\mathbb{P}$ and define
$$\mathcal{M}^q(\Omega, \mathcal{A}, \mathbb{P}) = \left\{ \mathbb{Q} \in \mathcal{M}(\Omega, \mathcal{A}, \mathbb{P}) : \frac{d\mathbb{Q}}{d\mathbb{P}} \in L^q(\Omega, \mathcal{A}, \mathbb{P}) \right\}.$$
Recall that an extended real-valued convex functional is called proper if it never attains $-\infty$ and is strictly smaller than $+\infty$ in at least one point. Coherent risk measures have the following dual or robust representation, cf. Theorem 7.20 in Rüschendorf (2013).

Proposition 2.4.

A functional $\rho : L^p \to \bar{\mathbb{R}}$ is a proper coherent risk measure with the Fatou property if and only if there exists a subset $\mathcal{Q} \subseteq \mathcal{M}^q(\Omega, \mathcal{A}, \mathbb{P})$ such that
$$\rho(X) = \sup_{\mathbb{Q} \in \mathcal{Q}} \mathbb{E}^{\mathbb{Q}}[X], \quad X \in L^p.$$
The supremum is attained since the subset $\mathcal{Q} \subseteq \mathcal{M}^q(\Omega, \mathcal{A}, \mathbb{P})$ can be chosen $\sigma(L^q, L^p)$-compact and the functional $\mathbb{Q} \mapsto \mathbb{E}^{\mathbb{Q}}[X]$ is $\sigma(L^q, L^p)$-continuous.

With the dual representation, Bäuerle and Glauner (2020b) derived a complementary inequality to subadditivity.

Lemma 2.5.

A proper coherent risk measure with the Fatou property $\rho : L^p \to \bar{\mathbb{R}}$ satisfies
$$\rho(X + Y) \ge \rho(X) - \rho(-Y) \quad \text{for all } X, Y \in L^p.$$
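As a numerical sanity check of Lemma 2.3 and Lemma 2.5, the following Python sketch evaluates Expected Shortfall on a finite sample space with equiprobable scenarios. It is only an illustration under assumptions that are not in the paper: the scenario values are simulated with an arbitrary seed, and `expected_shortfall` is a hypothetical helper that integrates the empirical quantile function exactly.

```python
import random


def expected_shortfall(sample, alpha):
    """ES_alpha = 1/(1-alpha) * integral over [alpha, 1] of the quantile function,
    evaluated exactly for equiprobable scenarios."""
    xs = sorted(sample)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs):
        lo, hi = i / n, (i + 1) / n  # the quantile function equals x on (lo, hi]
        total += x * max(0.0, hi - max(lo, alpha))
    return total / (1.0 - alpha)


random.seed(1)  # arbitrary seed, only for reproducibility
alpha = 0.9
x = [random.gauss(0.0, 1.0) for _ in range(1000)]  # scenario-wise values of X
y = [random.gauss(0.5, 2.0) for _ in range(1000)]  # scenario-wise values of Y

es_x = expected_shortfall(x, alpha)
es_y = expected_shortfall(y, alpha)
es_abs_diff = expected_shortfall([abs(a - b) for a, b in zip(x, y)], alpha)
es_sum = expected_shortfall([a + b for a, b in zip(x, y)], alpha)
es_neg_y = expected_shortfall([-b for b in y], alpha)

# Lemma 2.3: |rho(X) - rho(Y)| <= rho(|X - Y|)
lemma_2_3_holds = abs(es_x - es_y) <= es_abs_diff + 1e-9
# Lemma 2.5: rho(X + Y) >= rho(X) - rho(-Y)
lemma_2_5_holds = es_sum >= es_x - es_neg_y - 1e-9

print(lemma_2_3_holds, lemma_2_5_holds)
```

Both inequalities only use subadditivity and monotonicity, so they hold for every choice of scenario values and not just for this simulated sample.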

In the following, $F_X(x) = \mathbb{P}(X \le x)$ denotes the distribution function, $S_X(x) = 1 - F_X(x)$, $x \in \mathbb{R}$, the survival function and $F_X^{-1}(u) = \inf\{x \in \mathbb{R} : F_X(x) \ge u\}$, $u \in [0, 1]$, the quantile function of a random variable $X$. Many established risk measures belong to the large class of distortion risk measures.

Definition 2.6.

a) An increasing function $g : [0, 1] \to [0, 1]$ with $g(0) = 0$ and $g(1) = 1$ is called distortion function.
b) The distortion risk measure w.r.t. a distortion function $g$ is defined by $\rho_g : L^p \to \bar{\mathbb{R}}$,
$$\rho_g(X) = \int_0^\infty g(S_X(x)) \, dx - \int_{-\infty}^0 1 - g(S_X(x)) \, dx$$
whenever at least one of the integrals is finite.
c) The Wang premium principle w.r.t. a distortion function $g$ is defined by $\pi_g : L^p_+ \to \bar{\mathbb{R}}$,
$$\pi_g(X) = (1 + \theta) \int_0^\infty g(S_X(x)) \, dx, \quad \theta \ge 0.$$

Distortion risk measures have many of the properties introduced in Definition 2.1, see e.g. Sereda et al. (2010).

Lemma 2.7.

a) Distortion risk measures are law invariant, monotone, translation invariant, normalized and positive homogeneous.
b) A distortion risk measure is subadditive if and only if the distortion function $g$ is concave.

Wang premium principles share these properties apart from translation invariance. Moreover, they have the Fatou property which has so far not been investigated in the literature.
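For a non-negative loss with finitely many values, the survival function is a step function and the integral in Definition 2.6 reduces to a finite sum. The following Python sketch illustrates this. It is not taken from the paper: the function name `distortion_integral`, the three-point loss distribution and the parameter values are assumptions made for the example.

```python
def distortion_integral(dist, g, theta=0.0):
    """(1 + theta) * integral over [0, inf) of g(S_X(x)) dx for a finite,
    non-negative distribution given as {value: probability}."""
    values = sorted(dist)
    if values[0] < 0:
        raise ValueError("only non-negative losses are supported")
    total, previous = 0.0, 0.0
    for v in values:
        # On [previous, v) the survival function equals P(X >= v).
        survival = sum(p for x, p in dist.items() if x >= v)
        total += (v - previous) * g(survival)
        previous = v
    return (1.0 + theta) * total


# Hypothetical loss distribution.
loss = {0.0: 0.5, 10.0: 0.3, 50.0: 0.2}

# g(u) = u recovers the expectation.
expectation = distortion_integral(loss, lambda u: u)

# Wang premium with g(u) = u ** 0.5 and safety loading theta = 0.1.
ph_premium = distortion_integral(loss, lambda u: u ** 0.5, theta=0.1)

# Value-at-Risk at level 0.75 via the indicator distortion of (1 - alpha, 1].
alpha = 0.75
var_075 = distortion_integral(loss, lambda u: 1.0 if u > 1.0 - alpha else 0.0)

print(expectation, ph_premium, var_075)
```

Since the concave distortion $g(u) = u^{0.5}$ dominates the identity on $[0, 1]$, the resulting premium is at least the expected loss even before the loading is applied.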

Lemma 2.8.

For a left-continuous distortion function $g$, the Wang premium principle has the Fatou property.

Proof. Let $\{X_n\}_{n \in \mathbb{N}} \subseteq L^p_+$ with $X_n \le Y$ $\mathbb{P}$-a.s. for some $Y \in L^p_+$ and $X_n \to X$ $\mathbb{P}$-a.s. for $X \in L^p_+$. Especially, $X_n \to X$ in distribution. Therefore, $S_{X_n}(x) \to S_X(x)$ for almost every $x \in \mathbb{R}_+$. Since $g$ is left-continuous and increasing it is lower semicontinuous, i.e. $\liminf_{n \to \infty} g(S_{X_n}(x)) \ge g(S_X(x))$ for almost every $x \in \mathbb{R}_+$. Finally, Fatou's lemma yields with
$$\liminf_{n \to \infty} \pi(X_n) = \liminf_{n \to \infty} (1 + \theta) \int_0^\infty g(S_{X_n}(x)) \, dx \ge (1 + \theta) \int_0^\infty g(S_X(x)) \, dx = \pi(X)$$
the assertion. □

There is an alternative representation of distortion risk measures in terms of Lebesgue-Stieltjes integrals based on the quantile function in lieu of the survival function of the risk $X$.

Remark 2.9.

For a distortion risk measure $\rho_g$ with left-continuous distortion function $g$ it holds
$$\rho_g(X) = \int_0^1 F_X^{-1}(u) \, d\bar{g}(u), \tag{2.1}$$
where $\bar{g}(u) = 1 - g(1 - u)$, $u \in [0, 1]$, is the dual distortion function, cf. Dhaene et al. (2012). For a continuous and concave distortion function $g : [0, 1] \to [0, 1]$ the dual distortion function $\bar{g} : [0, 1] \to [0, 1]$ is continuous and convex. It can thus be written as $\bar{g}(x) = \int_0^x \phi(s) \, ds$ for an increasing right-continuous function $\phi : [0, 1] \to \mathbb{R}_+$, which is called spectrum. By the properties of the Lebesgue-Stieltjes integral, (2.1) can then be written as
$$\rho_g(X) = \rho_\phi(X) = \int_0^1 F_X^{-1}(u) \phi(u) \, du. \tag{2.2}$$
Therefore, distortion risk measures with continuous concave distortion function are referred to as spectral risk measures. Note that the continuity of $g$ is an additional requirement only in 0, since an increasing concave function on $[0, 1]$ is already continuous on $(0, 1]$.
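For a loss with finitely many values the quantile function is a step function, so the Lebesgue-Stieltjes integral in (2.1) becomes a sum of quantile values weighted by increments of the dual distortion function. The Python sketch below illustrates this for Expected Shortfall. The function names, the loss distribution and the level are assumptions made for the example and do not appear in the paper.

```python
def spectral_risk(dist, dual_distortion):
    """Evaluate (2.1) for a finite distribution {value: probability}:
    sum of x_i * (g_bar(F(x_i)) - g_bar(F(x_{i-1})))."""
    total, cumulative = 0.0, 0.0
    previous_weight = dual_distortion(0.0)
    for value in sorted(dist):
        cumulative = min(1.0, cumulative + dist[value])
        weight = dual_distortion(cumulative)
        total += value * (weight - previous_weight)
        previous_weight = weight
    return total


def es_dual_distortion(alpha):
    """Dual distortion g_bar(u) = 1 - g(1 - u) for g(u) = min(u / (1 - alpha), 1)."""
    return lambda u: max(0.0, (u - alpha) / (1.0 - alpha))


# Hypothetical loss distribution.
loss = {0.0: 0.5, 10.0: 0.3, 50.0: 0.2}

es_06 = spectral_risk(loss, es_dual_distortion(0.6))
mean = spectral_risk(loss, lambda u: u)  # g_bar(u) = u gives the expectation

print(es_06, mean)
```

At level 0.6 the worst 40 % of the probability mass consists of the value 50 with weight 0.2 and the value 10 with weight 0.2, so the result is the average of these two values.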

Due to Hölder's inequality, spectral risk measures $\rho_\phi : L^p \to \bar{\mathbb{R}}$ with spectrum $\phi \in L^q$ fulfill
$$|\rho_\phi(X)| = \left| \int_0^1 F_X^{-1}(u) \phi(u) \, du \right| \le \int_0^1 |F_X^{-1}(u)| \, \phi(u) \, du \le \big( \mathbb{E} |F_X^{-1}(U)|^p \big)^{1/p} \big( \mathbb{E} |\phi(U)|^q \big)^{1/q} < \infty,$$
where $U \sim \mathcal{U}([0, 1])$.

Example 2.10.

a) The most widely used risk measure in finance and insurance, Value-at-Risk
$$\mathrm{VaR}_\alpha(X) = F_X^{-1}(\alpha), \quad \alpha \in (0, 1),$$
is a distortion risk measure with distortion function $g(u) = \mathbb{1}_{(1 - \alpha, 1]}(u)$. Since the distortion function is not concave, Value-at-Risk is not coherent and especially not spectral. Bäuerle and Glauner (2020b) have shown that VaR has the Fatou property.
b) The lack of coherence can be overcome by using Expected Shortfall
$$\mathrm{ES}_\alpha(X) = \frac{1}{1 - \alpha} \int_\alpha^1 F_X^{-1}(u) \, du, \quad \alpha \in [0, 1).$$
The corresponding distortion function $g(u) = \min\{\frac{u}{1 - \alpha}, 1\}$ is concave and ES thus coherent. It is also spectral with $\phi(u) = \frac{1}{1 - \alpha} \mathbb{1}_{[\alpha, 1]}(u)$. Due to the bounded spectrum, ES has the Fatou property.
c) The Proportional Hazard (PH) premium principle
$$\pi(X) = (1 + \theta) \int_0^\infty S_X(x)^\gamma \, dx, \quad \theta \ge 0, \ \gamma \in (0, 1],$$
is an example from the class of Wang premium principles. Note that the distortion function $g(x) = x^\gamma$, $\gamma \in (0, 1]$, is continuous and concave. For $\gamma = 1$, the widely used Expected premium principle $\pi(X) = (1 + \theta) \mathbb{E}[X]$, $\theta \ge 0$, is a special case.
d) The entropic risk measure
$$\rho_\gamma(X) = \frac{1}{\gamma} \log \mathbb{E}\big[ e^{\gamma X} \big], \quad \gamma > 0,$$
is also known as exponential premium principle. It is a law-invariant and convex monetary risk measure which does not belong to the distortion class. For random variables with existing moment-generating function it has the Fatou property directly by dominated convergence.

3. Dynamic reinsurance model

The aim of this paper is to introduce and solve a dynamic extension of the static optimal reinsurance problem
$$\min_{f \in \mathcal{F}} \; r_{CoC} \cdot \rho\big( f(Y) + \pi_R(f) \big), \tag{3.1}$$
which has been studied extensively in the literature starting with Cai and Tan (2007) and generalizations i.a. by Chi and Tan (2013), Cui et al. (2013), Lo (2017) and Bäuerle and Glauner (2018). In this setting, an insurance company incurs a loss $Y \in L^p_+$ at the end of a fixed period due to insurance claims. In order to reduce its risk, the insurer may cede a portion of it to a reinsurance company and retain only $f(Y)$. Here, the reinsurance treaty $f$ determines the retained loss $f(Y(\omega))$ in each scenario $\omega \in \Omega$. For the risk transfer, the insurer has to compensate the reinsurer with a reinsurance premium $\pi_R(f) = \pi_R(Y - f(Y))$ determined by a premium principle $\pi_R : L^p_+ \to \bar{\mathbb{R}}$. In order to preclude moral hazard, it is standard in the actuarial literature to assume that both $f$ and the ceded loss function $\mathrm{id}_{\mathbb{R}_+} - f$ are increasing, meaning that both the insurer and the reinsurer suffer from higher claims. Otherwise, the insurer might have an incentive to misreport losses or accept unjustified claims. Hence, the set of admissible retained loss functions is
$$\mathcal{F} = \{ f : \mathbb{R}_+ \to \mathbb{R}_+ \mid f(t) \le t \ \forall t \in \mathbb{R}_+, \ f \text{ increasing}, \ \mathrm{id}_{\mathbb{R}_+} - f \text{ increasing} \}.$$
The insurer's target is to minimize its cost of solvency capital which is calculated as the cost of capital rate $r_{CoC} \in (0, 1]$ times the solvency capital requirement determined by applying the risk measure $\rho$ to the insurer's effective risk after reinsurance.

It is natural to model a dynamic extension of (3.1) in discrete time since reinsurance treaties are typically written for one year (Albrecher et al., 2017) and we will focus on the management of the insurer's surplus by means of reinsurance, neglecting the possible use of capital market instruments.

In our model, the insurer is endowed with an initial capital $x \in \mathbb{R}$. At the end of each period $[n, n+1)$, $n \in \mathbb{N}_0$, it incurs aggregate claims $Y_{n+1} \in L^p_+$ for that period and receives the total premium income $Z_{n+1} \in L^\infty_+$ for the next period. Both quantities are allowed to be stochastic and $(Y_n, Z_n)_{n \in \mathbb{N}}$ is assumed to be an independent sequence of random vectors defined on a common probability space $(\Omega, \mathcal{A}, \mathbb{P})$. Requiring that the aggregate losses are independent and fulfill some integrability condition is standard in actuarial science. Often, the premium income is assumed to be deterministic. Here, we allow for some uncertainty or fluctuation but one will at least know an upper bound (complete and timely payment by all insurants). The insurer's uncontrolled surplus process is given recursively by
$$X_0 = x, \quad X_{n+1} = X_n - Y_{n+1} + Z_{n+1}.$$
Note that a negative surplus is possible. In order to reduce the downside risk of its surplus process, the insurance company can underwrite a reinsurance treaty represented by a retained loss function $f_n \in \mathcal{F}$ at the beginning of each period $[n, n+1)$. When purchasing reinsurance $f_n$ at time $n$, the insurance company retains the portion $f_n(Y_{n+1})$ of the claims $Y_{n+1}$ arriving at time $n + 1$ and the reinsurer covers $Y_{n+1} - f_n(Y_{n+1})$. In return, the insurer has to pay the reinsurance premium $\pi_{R,n}(f_n) = \pi_{R,n}\big( Y_{n+1} - f_n(Y_{n+1}) \big)$. Throughout, we require the premium principles to have the following standard properties.

Assumption 3.1.

$\pi_{R,n} : L^p_+ \to \bar{\mathbb{R}}$ is a law-invariant, monotone and normalized premium principle with the Fatou property satisfying $\pi_{R,n}(Y_{n+1}) < \infty$ for all $n \in \mathbb{N}_0$.

The condition $\pi_{R,n}(Y_{n+1}) < \infty$ means that the risk can be fully ceded at each stage which is natural for a model with a passive reinsurer.

Additionally, we may impose a budget constraint such that for a current capital $x \in \mathbb{R}$ the available reinsurance contracts at time $n$ are
$$D_n(x) = \{ f \in \mathcal{F} : \pi_{R,n}(f) \le x^+ \}.$$
This precludes the insurer from purchasing reinsurance on credit.

For $n \in \mathbb{N}_0$ we denote by $H_n$ the set of feasible histories of the surplus process up to time $n$
$$h_n = \begin{cases} x_0, & \text{if } n = 0, \\ (x_0, f_0, x_1, \dots, f_{n-1}, x_n), & \text{if } n \ge 1, \end{cases}$$
where $f_k \in D_k(x_k)$ for $k \in \mathbb{N}_0$. The decision making at the beginning of each period has to be based on the information available at that time, i.e. the decisions must be functions of the history of the surplus process.

Definition 3.2.

a) A measurable mapping $d_n : H_n \to \mathcal{F}$ with $d_n(h_n) \in D_n(x_n)$ for every $h_n \in H_n$ is called decision rule at time $n$. A finite sequence $\pi = (d_0, \dots, d_{N-1})$ is called $N$-stage policy and a sequence $\pi = (d_0, d_1, \dots)$ is called policy.
b) A decision rule at time $n$ is called Markov if it depends on the current state only, i.e. $d_n(h_n) = d_n(x_n)$ for all $h_n \in H_n$. If all decision rules are Markov, the ($N$-stage) policy is called Markov.
c) An ($N$-stage) policy $\pi$ is called stationary if $\pi = (d, \dots, d)$ or $\pi = (d, d, \dots)$, respectively, for some Markov decision rule $d$.

With $\Pi \supseteq \Pi^M \supseteq \Pi^S$ we denote the sets of all policies, Markov policies and stationary policies. It will be clear from the context if $N$-stage or infinite stage policies are meant. An admissible policy always exists since full retention is feasible in any scenario.
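The admissible set $\mathcal{F}$ and the budget constraint $D_n(x)$ can be made concrete with two standard treaties. The Python sketch below is an illustration under assumptions that are not in the paper: it uses the expected premium principle with loading `theta`, a hypothetical three-point claim distribution, and a grid-based check `looks_admissible` that can only refute membership in $\mathcal{F}$, not prove it.

```python
def stop_loss(retention):
    """Retained loss f(t) = min(t, retention)."""
    return lambda t: min(t, retention)


def quota_share(share):
    """Retained loss f(t) = share * t with share in [0, 1]."""
    return lambda t: share * t


def looks_admissible(f, grid):
    """Check f(t) <= t, f increasing and id - f increasing on an increasing grid."""
    pairs = list(zip(grid, grid[1:]))
    below_identity = all(0.0 <= f(t) <= t + 1e-12 for t in grid)
    f_increasing = all(f(s) <= f(t) + 1e-12 for s, t in pairs)
    ceded_increasing = all(s - f(s) <= t - f(t) + 1e-12 for s, t in pairs)
    return below_identity and f_increasing and ceded_increasing


def reinsurance_premium(f, claims, theta):
    """Expected premium principle (1 + theta) * E[Y - f(Y)] for claims {value: prob}."""
    return (1.0 + theta) * sum(p * (y - f(y)) for y, p in claims.items())


def within_budget(f, claims, theta, capital):
    """Budget constraint: premium must not exceed the positive part of the capital."""
    return reinsurance_premium(f, claims, theta) <= max(capital, 0.0) + 1e-12


grid = [0.5 * k for k in range(201)]
claims = {0.0: 0.5, 10.0: 0.3, 50.0: 0.2}  # hypothetical claim distribution
theta = 0.2

treaty = stop_loss(20.0)
premium = reinsurance_premium(treaty, claims, theta)

# A retained loss that jumps from 0 to t at t = 20 makes the ceded loss decrease.
jump_treaty = lambda t: t if t > 20.0 else 0.0

print(looks_admissible(treaty, grid), looks_admissible(quota_share(0.7), grid))
print(looks_admissible(jump_treaty, grid))
print(premium, within_budget(treaty, claims, theta, capital=5.0))
```

Full retention, i.e. the identity, has premium zero and therefore satisfies the budget constraint even for negative capital, which matches the remark that an admissible policy always exists.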
The dynamics of the controlled surplus process under a policy $\pi \in \Pi$ are given by
$$X_0^\pi = x, \quad X_{n+1}^\pi = X_n^\pi - d_n(h_n)(Y_{n+1}) - \pi_{R,n}(d_n(h_n)) + Z_{n+1}. \tag{3.2}$$
This setting defines a non-stationary Markov Decision Process (MDP) with the following data:
• The state space is the real line $\mathbb{R}$ with Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R})$.
• The action space is $\mathcal{F}$ with Borel $\sigma$-algebra $\mathcal{B}(\mathcal{F})$. The topology is given in Lemma 3.3 below.
• The independent disturbances are $(Y_n, Z_n)_{n \in \mathbb{N}}$.
• The transition function $T_n : \mathbb{R} \times \mathcal{F} \times \mathbb{R}_+ \times \mathbb{R}_+ \to \mathbb{R}$ at time $n$ is given by $T_n(x, f, y, z) = x - f(y) - \pi_{R,n}(f) + z$.
• Regarding the admissible actions $D_n(x)$ at time $n$ in state $x \in \mathbb{R}$ we have two cases:
  Unconstrained: $D_n(x) = \mathcal{F}$ for all $x \in \mathbb{R}$.
  Budget-constrained: $D_n(x) = \{ f \in \mathcal{F} : \pi_{R,n}(f) \le x^+ \}$ for all $x \in \mathbb{R}$.
  The set of admissible state-action combinations is $D_n = \{ (x, f) \in \mathbb{R} \times \mathcal{F} : f \in D_n(x) \}$. It contains the graph of the constant measurable map $\mathbb{R} \ni x \mapsto \mathrm{id}_{\mathbb{R}_+}$.
• The one-stage cost function $c : D \times \mathbb{R} \to \mathbb{R}$ is given by $c(x, f, x') = -x'$, where $x'$ denotes the next state of the surplus process, see Section 4.

The next lemma summarizes some properties of the dynamic reinsurance model which will be relevant in the following sections.

Lemma 3.3.

a) The retained loss functions $f \in \mathcal{F}$ are Lipschitz continuous with constant $L \le 1$. Moreover, $\mathcal{F}$ is a Borel space as a compact subset of the metric space $(C(\mathbb{R}_+), m)$ of continuous real-valued functions on $\mathbb{R}_+$ with the metric of compact convergence
$$m(f_1, f_2) = \sum_{j=1}^\infty 2^{-j} \, \frac{\max_{0 \le t \le j} |f_1(t) - f_2(t)|}{1 + \max_{0 \le t \le j} |f_1(t) - f_2(t)|}.$$
b) The functional $\pi_{R,n} : \mathcal{F} \to \mathbb{R}_+$, $f \mapsto \pi_{R,n}(f)$ is lower semicontinuous.
c) $D_n(x)$ is a compact subset of $\mathcal{F}$ for all $x \in \mathbb{R}$ and the set-valued mapping $\mathbb{R} \ni x \mapsto D_n(x)$ is upper semicontinuous, i.e. if $x_k \to x$ and $a_k \in D_n(x_k)$, $k \in \mathbb{N}$, then $\{a_k\}_{k \in \mathbb{N}}$ has an accumulation point in $D_n(x)$.
d) The transition function $T_n$ is upper semicontinuous and the one-stage cost $D_n \ni (x, f) \mapsto c(x, f, T_n(x, f, y, z)) = -T_n(x, f, y, z)$ is lower semicontinuous.

Proof. a) Let $f \in \mathcal{F}$. Since $\mathrm{id}_{\mathbb{R}_+} - f$ is increasing, it holds for $0 \le x \le y$ that $x - f(x) \le y - f(y)$. Rearranging and using that $f$ is increasing, too, yields with
$$|f(x) - f(y)| = f(y) - f(x) \le y - x = |x - y|$$
the Lipschitz continuity with common constant $L = 1$. Moreover, $\mathcal{F}$ is pointwise bounded by $\mathrm{id}_{\mathbb{R}_+}$ and closed under pointwise convergence. Hence, $(\mathcal{F}, m)$ is a compact metric space by the Arzelà-Ascoli theorem and as such also complete and separable, i.e. a Borel space.

b) Let $\{f_k\}_{k \in \mathbb{N}}$ be a sequence in $\mathcal{F}$ such that $f_k \to f \in \mathcal{F}$. Especially, it holds $f_k(x) \to f(x)$ for all $x \in \mathbb{R}_+$ and $Y - f_k(Y) \to Y - f(Y)$ $\mathbb{P}$-a.s. Since $Y - f_k(Y) \le Y \in L^p$ for all $k \in \mathbb{N}$, the Fatou property of $\pi_{R,n}$ implies
$$\liminf_{k \to \infty} \pi_{R,n}(f_k) = \liminf_{k \to \infty} \pi_{R,n}\big( Y - f_k(Y) \big) \ge \pi_{R,n}\big( Y - f(Y) \big) = \pi_{R,n}(f).$$

c) Due to a), we only have to consider the budget-constrained case. Since $\mathcal{F}$ is compact it suffices to show that $D_n(x) = \{ f \in \mathcal{F} : \pi_{R,n}(f) \le x^+ \}$ is closed. This is the case since $D_n(x)$ is a sublevel set of the lower semicontinuous function $\pi_{R,n} : \mathcal{F} \to \mathbb{R}_+$. Furthermore, we show that $D_n$ is closed to obtain the upper semicontinuity from Lemma A.2.2 in Bäuerle and Rieder (2011). From the lower semicontinuity of $\pi_{R,n}$ it follows that the epigraph
$$\mathrm{epi}(\pi_{R,n}) = \{ (f, x) \in \mathcal{F} \times \mathbb{R}_+ : \pi_{R,n}(f) \le x \}$$
is closed. Thus, $D_n = \{ (x, f) : (f, x) \in \mathrm{epi}(\pi_{R,n}) \} \cup (\mathbb{R}_- \times D_n(0))$ is closed, too.

d) We show that the mapping $\mathcal{F} \times \mathbb{R}_+ \ni (f, y) \mapsto f(y)$ is continuous. Then, the transition function $T_n$ is upper semicontinuous as a sum of upper semicontinuous functions due to part b) and the one-stage cost $c(x, f, T_n(x, f, y, z)) = -T_n(x, f, y, z)$ is lower semicontinuous. Let $\{(f_k, y_k)\}_{k \in \mathbb{N}}$ be a convergent sequence in $\mathcal{F} \times \mathbb{R}_+$ with limit $(f, y)$. Since convergence w.r.t. the metric $m$ implies pointwise convergence and all $f_k$ have the Lipschitz constant $L = 1$, it follows
$$|f_k(y_k) - f(y)| \le |f_k(y_k) - f_k(y)| + |f_k(y) - f(y)| \le |y_k - y| + |f_k(y) - f(y)| \to 0,$$
i.e. $\mathcal{F} \times \mathbb{R}_+ \ni (f, y) \mapsto f(y)$ is continuous. □

4. Cost of capital minimization with finite planning horizon

Let $N \in \mathbb{N}$ be the finite planning horizon and $\pi = (d_0, \dots, d_{N-1}) \in \Pi$ a policy of the insurance company. At the beginning of the terminal period $[N-1, N)$, the insurer faces the same situation as in the static reinsurance problem (3.1). The cost of solvency capital is calculated for the period's loss, which equals the negative surplus at time $N$:
$$V_{N-1\pi}(h_{N-1}) = r_{CoC} \cdot \rho_{N-1}\Big( d_{N-1}(h_{N-1})(Y_N) + \pi_{R,N-1}(d_{N-1}(h_{N-1})) - Z_N - x_{N-1} \Big).$$
In any earlier period $[n, n+1)$, the effective risk relevant for the cost of capital calculation consists of the risk for that period plus the discounted future cost of capital. The latter is a random variable as a measurable function of the next state of the surplus process, see Remark 4.2. I.e. the cost of capital is given by
$$V_{n\pi}(h_n) = r_{CoC} \cdot \rho_n\Big( d_n(h_n)(Y_{n+1}) + \pi_{R,n}(d_n(h_n)) - Z_{n+1} - x_n + \beta V_{n+1\pi}\big( h_n, d_n(h_n), x_n + Z_{n+1} - d_n(h_n)(Y_{n+1}) - \pi_{R,n}(d_n(h_n)) \big) \Big),$$
where $\beta \in (0, 1]$ is a discount factor. To simplify the notation, we assume that the cost of capital rate $r_{CoC}$ is included in the discount factor, meaning that $\beta$ is of the form
$$\beta = r_{CoC} \cdot \frac{1}{1 + r},$$
where $r \in (0, 1]$ is the risk-free interest rate per period. Hence, the discount factor still is a quantity in $(0, 1]$ and our simplification of the notation entails no restriction. In the first period, one has to multiply once more with the cost of capital rate in order to obtain the overall recursive cost of capital, but for the minimization this is of course not relevant. Hence, we can define the value of a policy $\pi = (d_0, \dots, d_{N-1}) \in \Pi$, i.e. the cost of capital under this policy, recursively as
$$V_{N\pi}(h_N) = 0,$$
$$V_{n\pi}(h_n) = \rho_n\Big( d_n(h_n)(Y_{n+1}) + \pi_{R,n}(d_n(h_n)) - Z_{n+1} - x_n + \beta V_{n+1\pi}\big( h_n, d_n(h_n), x_n + Z_{n+1} - d_n(h_n)(Y_{n+1}) - \pi_{R,n}(d_n(h_n)) \big) \Big).$$
The corresponding value functions are
$$V_n(h_n) = \inf_{\pi \in \Pi} V_{n\pi}(h_n), \quad h_n \in H_n,$$
and the optimization objective is to determine the optimal recursive cost of solvency capital
$$V_0(x) = \inf_{\pi \in \Pi} V_{0\pi}(x), \quad x \in \mathbb{R}. \tag{4.1}$$
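The recursive definition of the policy value can be evaluated by brute force when claims take finitely many values. The following Python sketch does this for one deliberately simple case and is not the paper's method: it assumes a state-independent stop-loss treaty in every period, a deterministic premium income, the expected premium principle for the reinsurer, and Expected Shortfall at a common level `alpha` as the risk measure in each period (with `alpha = 0` giving the risk-neutral case). All names and numbers are hypothetical.

```python
def expected_shortfall(outcomes, alpha):
    """ES_alpha of a finite distribution given as (value, probability) pairs:
    the average over the worst 1 - alpha of the probability mass."""
    tail = 1.0 - alpha
    remaining, total = tail, 0.0
    for value, prob in sorted(outcomes, reverse=True):
        weight = min(prob, remaining)
        total += value * weight
        remaining -= weight
        if remaining <= 1e-15:
            break
    return total / tail


def policy_value(n, x, horizon, claims, income, retention, theta, beta, alpha):
    """V_n(x) under the fixed stop-loss policy, with terminal value V_N = 0."""
    if n == horizon:
        return 0.0
    ceded_mean = sum(p * max(y - retention, 0.0) for y, p in claims)
    reins_premium = (1.0 + theta) * ceded_mean
    outcomes = []
    for y, p in claims:
        # Retained loss plus reinsurance premium minus premium income.
        period_loss = min(y, retention) + reins_premium - income
        next_x = x - period_loss  # next state of the surplus process
        future = policy_value(n + 1, next_x, horizon, claims, income,
                              retention, theta, beta, alpha)
        outcomes.append((period_loss - x + beta * future, p))
    return expected_shortfall(outcomes, alpha)


claims = [(0.0, 0.5), (10.0, 0.3), (50.0, 0.2)]  # hypothetical (claim, probability)
params = dict(horizon=2, claims=claims, income=15.0, retention=20.0,
              theta=0.2, beta=0.9)

v_neutral = policy_value(0, 5.0, alpha=0.0, **params)
v_es = policy_value(0, 5.0, alpha=0.9, **params)
print(v_neutral, v_es)
```

The enumeration grows exponentially in the horizon, so this is only meant for very short horizons. In the risk-neutral case the negative value reflects that the initial capital and the premium income exceed the expected retained losses and reinsurance costs, in line with the actuarial interpretation of the optimality criterion.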

For the actuarial interpretation of the optimality criterion one should note that the capital requirement for the possible claims is set off against the insurer's capital and income at each stage. Hence, a negative cost of capital at time zero means that a part of the initial capital can be used for other investments or dividend payments without compromising solvency up to the planning horizon. On the other hand, a positive cost of capital represents the opportunity cost of the additional capital needed to make the risk up to the planning horizon acceptable.

We make the following assumption in this section.

Assumption 4.1.

(i) $\rho_0, \dots, \rho_{N-1} : L^p \to \bar{\mathbb{R}}$ are law-invariant and normalized monetary risk measures with the Fatou property.
(ii) It holds $\rho_n(\lambda Y_{n+1}) < \infty$ for all $\lambda \in \mathbb{R}_+$ and $n = 0, \dots, N-1$, or $\rho_n(Y_{n+1}) < \infty$ if the risk measure is additionally positive homogeneous.

Remark 4.2.

For the recursive definition of the policy values to be meaningful, we need to make sure that the risk measures are applied to elements of $L^p(\Omega, \mathcal{A}, \mathbb{P})$. This has two aspects: integrability will be ensured by Lemma 4.3, but first of all $V_{n\pi}$ needs to be a measurable function for all $\pi \in \Pi$ and $n = 0, \dots, N$. For most risk measures with practical relevance, this is fulfilled. To see this, we proceed by backward induction. For $n = N$ there is nothing to show and if $V_{n+1\pi}$ is measurable, the function
$$\psi(h_n, y, z) = d_n(h_n)(y) + \pi_{R,n}(d_n(h_n)) - z - x_n + \beta V_{n+1\pi}\big( h_n, d_n(h_n), x_n + z - d_n(h_n)(y) - \pi_{R,n}(d_n(h_n)) \big)$$
is measurable, too, as a composition of measurable maps. Now let us distinguish different risk measures.
• In the risk-neutral case, i.e. for $\rho = \mathbb{E}$, and also for the entropic risk measure $\rho_\gamma$ the measurability of $V_{n\pi}(h_n) = \rho(\psi(h_n, Y_{n+1}, Z_{n+1}))$ follows from Fubini's theorem.
• For distortion risk measures, the measurability is guaranteed, too. Here, Fubini's theorem yields that the survival function of $\psi(h_n, Y_{n+1}, Z_{n+1})$
$$S(t \mid h_n) = \int \mathbb{1}\big\{ \psi(h_n, Y_{n+1}(\omega), Z_{n+1}(\omega)) > t \big\} \, \mathbb{P}(d\omega)$$
is measurable. A distortion function $g$ is increasing and hence measurable. So again by Fubini's theorem we obtain the measurability of
$$V_{n\pi}(h_n) = \rho_g(\psi(h_n, Y_{n+1}, Z_{n+1})) = \int_0^\infty g(S(t \mid h_n)) \, dt - \int_{-\infty}^0 1 - g(S(t \mid h_n)) \, dt$$
since the integrands are non-negative and compositions of measurable maps.
• For proper coherent risk measures with the Fatou property one can insert the dual representation of Proposition 2.4
$$V_{n\pi}(h_n) = \sup_{\mathbb{Q} \in \mathcal{Q}} \mathbb{E}^{\mathbb{Q}}[\psi(h_n, Y_{n+1}, Z_{n+1})].$$
Then, an optimal measurable selection argument as in Theorem 3.6 in Bäuerle and Glauner (2020a) yields the measurability.

Throughout, it is implicitly assumed that the risk measures are chosen such that all policy values are measurable.

Lemma 4.3.

There exist decreasing bounding functions $\underline{b}_n : \mathbb{R} \to \mathbb{R}_-$ and $\bar{b}_n : \mathbb{R} \to \mathbb{R}_+$ such that it holds for all $\pi \in \Pi$ and $n = 0, \dots, N-1$
$$\underline{b}_n(x_n) \le V_{n\pi}(h_n) \le \bar{b}_n(x_n), \qquad h_n \in \mathcal{H}_n.$$
The bounding functions are given by
$$\underline{b}_n(x) = -\underline{c}_n - a_n x^+, \qquad \bar{b}_n(x) = \bar{c}_n + a_n x^-$$
with recursively defined, non-negative coefficients
$$\begin{aligned}
\underline{c}_{N-1} &= \operatorname{ess\,sup}(Z_N), & \underline{c}_n &= (1 + \beta a_{n+1}) \operatorname{ess\,sup}(Z_{n+1}) + \beta \underline{c}_{n+1},\\
\bar{c}_{N-1} &= \rho_{N-1}(Y_N) + \pi_{R,N-1}(Y_N), & \bar{c}_n &= \rho_n\big((1 + \beta a_{n+1}) Y_{n+1}\big) + (1 + \beta a_{n+1})\, \pi_{R,n}(Y_{n+1}) + \beta \bar{c}_{n+1},\\
a_{N-1} &= 1, & a_n &= 1 + \beta a_{n+1}.
\end{aligned}$$

Proof. We proceed by backward induction. At time $N-1$, it follows from the monotonicity, translation invariance and normalization of $\rho_{N-1}$ that for any policy $\pi \in \Pi$
$$\begin{aligned}
V_{N-1\,\pi}(h_{N-1}) &= \rho_{N-1}\big(d_{N-1}(h_{N-1})(Y_N) + \pi_{R,N-1}(d_{N-1}(h_{N-1})) - Z_N - x_{N-1}\big)\\
&\ge \rho_{N-1}\big(-\operatorname{ess\,sup}(Z_N) - x_{N-1}\big) = -\operatorname{ess\,sup}(Z_N) - x_{N-1} \ge \underline{b}_{N-1}(x_{N-1}),\\
V_{N-1\,\pi}(h_{N-1}) &= \rho_{N-1}\big(d_{N-1}(h_{N-1})(Y_N) + \pi_{R,N-1}(d_{N-1}(h_{N-1})) - Z_N - x_{N-1}\big)\\
&\le \rho_{N-1}\big(Y_N + \pi_{R,N-1}(Y_N) - x_{N-1}\big) = \rho_{N-1}(Y_N) + \pi_{R,N-1}(Y_N) - x_{N-1} \le \bar{b}_{N-1}(x_{N-1}).
\end{aligned}$$
Now assume the assertion holds for $n+1$. For any policy $\pi \in \Pi$ it follows from the properties of $\rho_n$ and the monotonicity of $\underline{b}_{n+1}, \bar{b}_{n+1}$ that
$$\begin{aligned}
V_{n\pi}(h_n) &\ge \rho_n\Big(d_n(h_n)(Y_{n+1}) + \pi_{R,n}(d_n(h_n)) - Z_{n+1} - x_n + \beta \underline{b}_{n+1}\big(x_n + Z_{n+1} - d_n(h_n)(Y_{n+1}) - \pi_{R,n}(d_n(h_n))\big)\Big)\\
&\ge \rho_n\Big(-\operatorname{ess\,sup}(Z_{n+1}) - x_n + \beta \underline{b}_{n+1}\big(\operatorname{ess\,sup}(Z_{n+1}) + x_n\big)\Big)\\
&\ge -\operatorname{ess\,sup}(Z_{n+1}) - x_n^+ + \beta\big(-\underline{c}_{n+1} - a_{n+1}(\operatorname{ess\,sup}(Z_{n+1}) + x_n^+)\big) = \underline{b}_n(x_n),\\
V_{n\pi}(h_n) &\le \rho_n\Big(d_n(h_n)(Y_{n+1}) + \pi_{R,n}(d_n(h_n)) - Z_{n+1} - x_n + \beta \bar{b}_{n+1}\big(x_n + Z_{n+1} - d_n(h_n)(Y_{n+1}) - \pi_{R,n}(d_n(h_n))\big)\Big)\\
&\le \rho_n\Big(Y_{n+1} + \pi_{R,n}(Y_{n+1}) - x_n + \beta \bar{b}_{n+1}\big(x_n - Y_{n+1} - \pi_{R,n}(Y_{n+1})\big)\Big)\\
&\le \rho_n\Big(Y_{n+1} + \pi_{R,n}(Y_{n+1}) + x_n^- + \beta\big(\bar{c}_{n+1} + a_{n+1}(x_n^- + Y_{n+1} + \pi_{R,n}(Y_{n+1}))\big)\Big)\\
&= \rho_n\big((1 + \beta a_{n+1}) Y_{n+1}\big) + (1 + \beta a_{n+1})\, \pi_{R,n}(Y_{n+1}) + \beta \bar{c}_{n+1} + (1 + \beta a_{n+1})\, x_n^- = \bar{b}_n(x_n). \qquad \square
\end{aligned}$$

Let us now consider specifically Markov policies $\pi \in \Pi^M$ of the insurance company. Then,
$$\mathcal{B}_n = \{ v : \mathbb{R} \to \mathbb{R} \mid v \text{ decreasing and lower semicontinuous with } \underline{b}_n(x) \le v(x) \le \bar{b}_n(x)\ \forall x \in \mathbb{R} \}$$
turns out to be the set of potential value functions at time $n$ under such policies.
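The backward recursion for the coefficients in Lemma 4.3 can be sketched in a few lines. This is an illustration under stated assumptions rather than part of the paper: the risk measures are taken to be positively homogeneous, so that $\rho_n((1+\beta a_{n+1})Y_{n+1}) = (1+\beta a_{n+1})\rho_n(Y_{n+1})$, and the quantities $\operatorname{ess\,sup}(Z_{n+1})$, $\rho_n(Y_{n+1})$ and $\pi_{R,n}(Y_{n+1})$ are supplied as plain numbers. All function and argument names are hypothetical.

```python
def bounding_coefficients(beta, ess_sup_z, rho_y, prem_y):
    """Backward recursion for a_n, lower c_n and upper c_n of Lemma 4.3.

    Inputs are lists indexed by n = 0..N-1 with
      ess_sup_z[n] = ess sup(Z_{n+1}),
      rho_y[n]     = rho_n(Y_{n+1}),
      prem_y[n]    = pi_{R,n}(Y_{n+1}).
    Assumption: rho_n is positively homogeneous (e.g. coherent).
    """
    N = len(ess_sup_z)
    a = [0.0] * N
    c_low = [0.0] * N
    c_up = [0.0] * N
    # Terminal coefficients at time N-1.
    a[N - 1] = 1.0
    c_low[N - 1] = ess_sup_z[N - 1]
    c_up[N - 1] = rho_y[N - 1] + prem_y[N - 1]
    for n in range(N - 2, -1, -1):
        a[n] = 1.0 + beta * a[n + 1]
        c_low[n] = a[n] * ess_sup_z[n] + beta * c_low[n + 1]
        # Positive homogeneity is used here to pull a[n] out of rho_n.
        c_up[n] = a[n] * (rho_y[n] + prem_y[n]) + beta * c_up[n + 1]
    return a, c_low, c_up


def lower_bound(x, a_n, c_low_n):
    """Lower bounding function: -c_n - a_n * x^+."""
    return -c_low_n - a_n * max(x, 0.0)


def upper_bound(x, a_n, c_up_n):
    """Upper bounding function: c_n + a_n * x^-."""
    return c_up_n + a_n * max(-x, 0.0)


a, c_low, c_up = bounding_coefficients(0.5, [1, 1, 1], [2, 2, 2], [1, 1, 1])
```

In the stationary case the coefficients grow monotonically as $n$ decreases, which is what makes the limits $1/(1-\beta)$ and $\eta/(1-\beta)^2$ appear in the infinite-horizon bounds.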
In order to simplify the notation, we define the following operators on these sets.

Definition 4.4. For $n = 0, \dots, N-1$, $v \in \mathcal{B}_{n+1}$, $x \in \mathbb{R}$, $f \in D_n(x)$ and a Markov decision rule $d$ let
$$\begin{aligned}
L_n v(x, f) &= \rho_n\Big(f(Y_{n+1}) + \pi_{R,n}(f) - Z_{n+1} - x + \beta v\big(x + Z_{n+1} - f(Y_{n+1}) - \pi_{R,n}(f)\big)\Big),\\
\mathcal{T}_{n d} v(x) &= L_n v(x, d(x)),\\
\mathcal{T}_n v(x) &= \inf_{f \in D_n(x)} L_n v(x, f).
\end{aligned}$$
Note that the operators are monotone in $v$. Under a Markov policy $\pi = (d_0, \dots, d_{N-1}) \in \Pi^M$, the value iteration can be expressed with the operators. In order to distinguish from the history-dependent case, we denote policy values here with $J$. Setting $J_{N\pi} \equiv 0$, we obtain for $n = 0, \dots, N-1$ and $x \in \mathbb{R}$
$$J_{n\pi}(x) = \rho_n\Big(d_n(x)(Y_{n+1}) + \pi_{R,n}(d_n(x)) - Z_{n+1} - x + \beta J_{n+1\,\pi}\big(x + Z_{n+1} - d_n(x)(Y_{n+1}) - \pi_{R,n}(d_n(x))\big)\Big) = \mathcal{T}_{n d_n} J_{n+1\,\pi}(x).$$
Let us further define for $n = 0, \dots, N-1$ the Markov value function
$$J_n(x) = \inf_{\pi \in \Pi^M} J_{n\pi}(x), \qquad x \in \mathbb{R}.$$
The next result shows that $V_n$ satisfies a Bellman equation and proves that an optimal policy exists and is Markov.

Theorem 4.5.

For $n = 0, \dots, N$ the value function $V_n$ only depends on $x_n$, i.e. $V_n(h_n) = J_n(x_n)$ for all $h_n \in \mathcal{H}_n$, lies in $\mathcal{B}_n$ and satisfies the Bellman equation
$$J_N(x) = 0, \qquad J_n(x) = \mathcal{T}_n J_{n+1}(x), \qquad x \in \mathbb{R}.$$
Furthermore, for $n = 0, \dots, N-1$ there exist Markov decision rules $d_n^*$ with $\mathcal{T}_{n d_n^*} J_{n+1} = \mathcal{T}_n J_{n+1}$ and every sequence of such minimizers constitutes an optimal policy $\pi^* = (d_0^*, \dots, d_{N-1}^*)$.

Proof. The proof is by backward induction. At time $N$ we have $V_N = J_N \equiv 0 \in \mathcal{B}_N$. Assuming the assertion holds at time $n+1$, we have at time $n$:
$$\begin{aligned}
V_n(h_n) &= \inf_{\pi \in \Pi} V_{n\pi}(h_n)\\
&= \inf_{\pi \in \Pi} \rho_n\Big(d_n(h_n)(Y_{n+1}) + \pi_{R,n}(d_n(h_n)) - Z_{n+1} - x_n + \beta V_{n+1\,\pi}\big(h_n, d_n(h_n), x_n + Z_{n+1} - d_n(h_n)(Y_{n+1}) - \pi_{R,n}(d_n(h_n))\big)\Big)\\
&\ge \inf_{\pi \in \Pi} \rho_n\Big(d_n(h_n)(Y_{n+1}) + \pi_{R,n}(d_n(h_n)) - Z_{n+1} - x_n + \beta V_{n+1}\big(h_n, d_n(h_n), x_n + Z_{n+1} - d_n(h_n)(Y_{n+1}) - \pi_{R,n}(d_n(h_n))\big)\Big)\\
&= \inf_{\pi \in \Pi} \rho_n\Big(d_n(h_n)(Y_{n+1}) + \pi_{R,n}(d_n(h_n)) - Z_{n+1} - x_n + \beta J_{n+1}\big(x_n + Z_{n+1} - d_n(h_n)(Y_{n+1}) - \pi_{R,n}(d_n(h_n))\big)\Big)\\
&= \inf_{f \in D_n(x_n)} \rho_n\Big(f(Y_{n+1}) + \pi_{R,n}(f) - Z_{n+1} - x_n + \beta J_{n+1}\big(x_n + Z_{n+1} - f(Y_{n+1}) - \pi_{R,n}(f)\big)\Big).
\end{aligned}$$
The last equality holds since the minimization does not depend on the entire policy but only on $f = d_n(h_n)$. Here, objective and constraint depend on the history of the process only through $x_n$. Thus, given existence of a minimizing Markov decision rule $d_n^*$, the last line equals $\mathcal{T}_{n d_n^*} J_{n+1}(x_n)$. Again by the induction hypothesis there exists an optimal Markov policy $\pi^* \in \Pi^M$ such that $J_{n+1} = J_{n+1\,\pi^*}$. Hence, we have
$$V_n(h_n) \ge \mathcal{T}_{n d_n^*} J_{n+1}(x_n) = \mathcal{T}_{n d_n^*} J_{n+1\,\pi^*}(x_n) = J_{n\pi^*}(x_n) \ge J_n(x_n) \ge V_n(h_n).$$
It remains to show the existence of a minimizing Markov decision rule $d_n^*$ and that $J_n \in \mathcal{B}_n$. We want to apply Proposition 2.4.3 in Bäuerle and Rieder (2011). The set-valued mapping $\mathbb{R} \ni x \mapsto D_n(x)$ is compact-valued and upper semicontinuous by Lemma 3.3 c). Next, we show that $D_n \ni (x, f) \mapsto L_n v(x, f)$ is lower semicontinuous for every $v \in \mathcal{B}_{n+1}$. Let $\{(x_k, f_k)\}_{k \in \mathbb{N}}$ be a convergent sequence in $D_n$ with limit $(x^*, f^*) \in D_n$. As a convergent real sequence, $\{x_k\}_{k \in \mathbb{N}}$ is bounded above, say by $\bar{x} \ge 0$. Convergence w.r.t. the metric $m$ implies pointwise convergence, i.e. $Y_{n+1} - f_k(Y_{n+1}) \to Y_{n+1} - f^*(Y_{n+1})$ a.s. By the properties of $\mathcal{F}$ it holds $0 \le Y_{n+1} - f_k(Y_{n+1}) \le Y_{n+1} \in L^p$ for all $k \in \mathbb{N}$. Note that the sequence $\{\inf_{\ell \ge k} \pi_{R,n}(f_\ell)\}_{k \in \mathbb{N}}$ is increasing and bounded from above by $\pi_{R,n}(Y_{n+1})$, i.e. convergent. Now, the Fatou property of $\pi_{R,n}$ implies
$$\hat{\pi} = \lim_{k \to \infty} \inf_{\ell \ge k} \pi_{R,n}(f_\ell) = \liminf_{k \to \infty} \pi_{R,n}(f_k) \ge \pi_{R,n}(f^*). \tag{4.2}$$
Further, we have for every $k \in \mathbb{N}$ by the triangle inequality
$$\Big| x_k + Z_{n+1} - f_k(Y_{n+1}) - \inf_{\ell \ge k} \pi_{R,n}(f_\ell) \Big| \le \bar{x} + Z_{n+1} + Y_{n+1} + \pi_{R,n}(Y_{n+1}) \in L^p.$$
Again, since convergence w.r.t. the metric $m$ implies pointwise convergence, it holds
$$x_k + Z_{n+1} - f_k(Y_{n+1}) - \inf_{\ell \ge k} \pi_{R,n}(f_\ell) \to x^* + Z_{n+1} - f^*(Y_{n+1}) - \hat{\pi} \quad \text{a.s. for } k \to \infty. \tag{4.3}$$
Since $v \in \mathcal{B}_{n+1}$, Lemma 4.3 yields that
$$\begin{aligned}
\Big| v\big(x_k + Z_{n+1} - f_k(Y_{n+1}) - \inf_{\ell \ge k} \pi_{R,n}(f_\ell)\big) \Big|
&\le \max\Big\{ \big|\underline{b}_{n+1}\big(x_k + Z_{n+1} - f_k(Y_{n+1}) - \inf_{\ell \ge k} \pi_{R,n}(f_\ell)\big)\big|,\ \big|\bar{b}_{n+1}\big(x_k + Z_{n+1} - f_k(Y_{n+1}) - \inf_{\ell \ge k} \pi_{R,n}(f_\ell)\big)\big| \Big\}\\
&\le \max\{\underline{c}_{n+1}, \bar{c}_{n+1}\} + a_{n+1}\big(\bar{x} + Z_{n+1} + Y_{n+1} + \pi_{R,n}(Y_{n+1})\big),
\end{aligned}$$
which is in $L^p$. The sequence
$$\Big\{ \inf_{\ell \ge k} v\big(x_\ell + Z_{n+1}(\omega) - f_\ell(Y_{n+1}(\omega)) - \inf_{m \ge \ell} \pi_{R,n}(f_m)\big) \Big\}_{k \in \mathbb{N}} \tag{4.4}$$
is increasing and bounded from above, i.e. convergent for almost all $\omega \in \Omega$. Let us denote the almost sure limit by $V \in L^p$. The lower semicontinuity of $v$ implies a.s.
$$\begin{aligned}
V &= \lim_{k \to \infty} \inf_{\ell \ge k} v\big(x_\ell + Z_{n+1} - f_\ell(Y_{n+1}) - \inf_{m \ge \ell} \pi_{R,n}(f_m)\big)
= \liminf_{k \to \infty} v\big(x_k + Z_{n+1} - f_k(Y_{n+1}) - \inf_{m \ge k} \pi_{R,n}(f_m)\big)\\
&\ge v\big(x^* + Z_{n+1} - f^*(Y_{n+1}) - \hat{\pi}\big) \ge v\big(x^* + Z_{n+1} - f^*(Y_{n+1}) - \pi_{R,n}(f^*)\big).
\end{aligned} \tag{4.5}$$
The last inequality holds by (4.2) since $v$ is decreasing. Now, we get
$$\begin{aligned}
\liminf_{k \to \infty} L_n v(x_k, f_k) &= \liminf_{k \to \infty} \rho_n\Big(f_k(Y_{n+1}) + \pi_{R,n}(f_k) - Z_{n+1} - x_k + \beta v\big(x_k + Z_{n+1} - f_k(Y_{n+1}) - \pi_{R,n}(f_k)\big)\Big)\\
&\ge \liminf_{k \to \infty} \rho_n\Big(f_k(Y_{n+1}) + \inf_{\ell \ge k} \pi_{R,n}(f_\ell) - Z_{n+1} - x_k + \beta \inf_{\ell \ge k} v\big(x_\ell + Z_{n+1} - f_\ell(Y_{n+1}) - \inf_{m \ge \ell} \pi_{R,n}(f_m)\big)\Big)\\
&\ge \rho_n\big(f^*(Y_{n+1}) + \hat{\pi} - Z_{n+1} - x^* + \beta V\big)\\
&\ge \rho_n\Big(f^*(Y_{n+1}) + \pi_{R,n}(f^*) - Z_{n+1} - x^* + \beta v\big(x^* + Z_{n+1} - f^*(Y_{n+1}) - \pi_{R,n}(f^*)\big)\Big) = L_n v(x^*, f^*).
\end{aligned}$$
The first inequality is by the monotonicity of $\rho_n$ and $v$, the second one is by the almost sure convergence of the sequences (4.3) and (4.4) together with the Fatou property of $\rho_n$, and the third one is by (4.2), (4.5) together with the monotonicity of $\rho_n$. Hence, $D_n \ni (x, f) \mapsto L_n v(x, f)$ is lower semicontinuous for every $v \in \mathcal{B}_{n+1}$. Proposition 2.4.3 in Bäuerle and Rieder (2011) yields the existence of a minimizing Markov decision rule $d_n^*$ and that $J_n = \mathcal{T}_n J_{n+1}$ is lower semicontinuous. $J_n$ is also decreasing since the monotonicity of $\rho_n$ and $J_{n+1}$ makes $x \mapsto L_n J_{n+1}(x, f)$ decreasing for every $f \in D_n(x)$, implying for $x_1 \le x_2$
$$J_n(x_1) = \inf_{f \in D_n(x_1)} L_n J_{n+1}(x_1, f) \ge \inf_{f \in D_n(x_1)} L_n J_{n+1}(x_2, f) \ge \inf_{f \in D_n(x_2)} L_n J_{n+1}(x_2, f) = J_n(x_2)$$
since $D_n(x_1) \subseteq D_n(x_2)$. Furthermore, $J_n$ is bounded by $\underline{b}_n, \bar{b}_n$ according to Lemma 4.3, i.e. $J_n \in \mathcal{B}_n$ and the proof is complete. $\square$

Remark 4.6.

When using general monetary risk measures, we need the claims $Y_1, Y_2, \dots \in L^p$ to be integrable since our existence results are based on the Fatou property. If however Value-at-Risk is used in each period, we can relax this assumption and consider any real-valued random variables $Y_1, Y_2, \dots \in L^0$. In particular, heavy-tailed claim size distributions may be used. Firstly, Assumption 4.1 (ii) is satisfied, since for $\alpha \in (0, 1)$ the quantile $\mathrm{VaR}_\alpha(Y)$ is finite for any $Y \in L^0$. Secondly, recourse to the Fatou property is not needed as Value-at-Risk is lower semicontinuous even w.r.t. convergence in distribution, cf. the proof of Lemma 2.10 in Bäuerle and Glauner (2020b). That is, where we have used dominated convergence in the proof of Theorem 4.5 we can dispense with the majorants and argue only with almost sure convergence.

5. Cost of capital minimization with infinite planning horizon

We now consider the optimal reinsurance problem under an infinite planning horizon. This is reasonable if the terminal period is unknown or if one wants to approximate a model with a large but finite planning horizon. Solving the infinite horizon problem will turn out to be easier since it admits a stationary optimal policy. The model is now required to be stationary, meaning that the disturbances $(Y_n, Z_n)_{n \in \mathbb{N}}$ are an i.i.d. sequence with representative $(Y, Z)$, the premium principle $\pi_R$ and the risk measure $\rho$ do not vary over time and we have strict discounting by $\beta \in (0, 1)$.

Assumption 5.1. (i) $\rho : L^p \to \bar{\mathbb{R}}$ is a law-invariant, proper and coherent risk measure with the Fatou property.
(ii) It holds $\rho(Y) < \infty$.

Since the model with infinite planning horizon will be derived as a limit of the one with finite horizon, the consideration can be restricted to Markov policies $\pi = (d_0, d_1, \dots) \in \Pi^M$ due to Theorem 4.5. When calculating limits, it is more convenient to index the value functions with the distance to the time horizon rather than the point in time. This is also referred to as forward form of the value iteration and only possible under Markov policies in a stationary model. The value of a policy $\pi = (d_0, d_1, \dots) \in \Pi^M$ up to a planning horizon $N \in \mathbb{N}$ now is
$$J_{N\pi}(x) = \mathcal{T}_{d_0} \circ \dots \circ \mathcal{T}_{d_{N-1}} 0(x) = \mathcal{T}_{d_0} J_{N-1\,\vec{\pi}}(x), \qquad x \in \mathbb{R}, \tag{5.1}$$
with $\vec{\pi} = (d_1, d_2, \dots)$. Hence, it holds $J_n^{\text{non-stat}} = J_{N-n}^{\text{stat}}$, $n = 0, \dots, N$. Note that the operators from Definition 4.4 do not depend on the time index in a stationary model. The value function under planning horizon $N \in \mathbb{N}$ is given by
$$J_N(x) = \inf_{\pi \in \Pi^M} J_{N\pi}(x), \qquad x \in \mathbb{R}.$$
By Theorem 4.5, the value function satisfies the Bellman equation
$$J_N(x) = \mathcal{T} J_{N-1}(x) = \mathcal{T}^N 0(x), \qquad x \in \mathbb{R}. \tag{5.2}$$
When the planning horizon is infinite, we define the value of a policy $\pi \in \Pi^M$ as
$$J_{\infty\pi}(x) = \lim_{N \to \infty} J_{N\pi}(x), \qquad x \in \mathbb{R}. \tag{5.3}$$
Hence, the optimality criterion considered in this section is
$$J_\infty(x) = \inf_{\pi \in \Pi^M} J_{\infty\pi}(x), \qquad x \in \mathbb{R}. \tag{5.4}$$
In the stationary model, also the bounding functions no longer depend on the time index.

Lemma 5.2.

The decreasing functions $\underline{b} : \mathbb{R} \to \mathbb{R}_-$ and $\bar{b} : \mathbb{R} \to \mathbb{R}_+$ defined by
$$\underline{b}(x) = -\frac{x^+}{1-\beta} - \frac{\underline{\eta}}{(1-\beta)^2}, \quad \underline{\eta} = \operatorname{ess\,sup}(Z), \qquad \bar{b}(x) = \frac{x^-}{1-\beta} + \frac{\bar{\eta}}{(1-\beta)^2}, \quad \bar{\eta} = \rho(Y) + \pi_R(Y),$$
satisfy for every planning horizon $N \in \mathbb{N}$ and policy $\pi \in \Pi^M$
$$\underline{b}(x) \le J_{N\pi}(x) \le \bar{b}(x), \qquad x \in \mathbb{R}.$$

Proof. We proceed by induction. For $N = 0$ there is nothing to show. Assuming the assertion holds for $N-1$, it follows
$$\begin{aligned}
J_{N\pi}(x) = \mathcal{T}_{d_0} J_{N-1\,\vec{\pi}}(x) &\ge \mathcal{T}_{d_0} \underline{b}(x)\\
&= \rho\Big(d_0(x)(Y) + \pi_R(d_0(x)) - Z - x + \beta \underline{b}\big(x + Z - d_0(x)(Y) - \pi_R(d_0(x))\big)\Big)\\
&\ge -\underline{\eta} - x + \beta \underline{b}(x + \underline{\eta})
\ge -\underline{\eta} - x^+ + \beta\Big(-\frac{\underline{\eta}}{(1-\beta)^2} - \frac{x^+ + \underline{\eta}}{1-\beta}\Big)\\
&= -\underline{\eta}\Big(1 + \frac{\beta}{1-\beta} + \frac{\beta}{(1-\beta)^2}\Big) - x^+\Big(1 + \frac{\beta}{1-\beta}\Big) = \underline{b}(x).
\end{aligned}$$
The first inequality is by the monotonicity of $\mathcal{T}_{d_0}$ and the second one by the monotonicity, translation invariance and normalization of $\rho$. Regarding the upper bound we have
$$\begin{aligned}
J_{N\pi}(x) = \mathcal{T}_{d_0} J_{N-1\,\vec{\pi}}(x) &\le \mathcal{T}_{d_0} \bar{b}(x)\\
&= \rho\Big(d_0(x)(Y) + \pi_R(d_0(x)) - Z - x + \beta \bar{b}\big(x + Z - d_0(x)(Y) - \pi_R(d_0(x))\big)\Big)\\
&\le \rho\Big(Y + \pi_R(Y) - x + \beta \bar{b}\big(x - Y - \pi_R(Y)\big)\Big)\\
&\le \rho(Y) + \pi_R(Y) + x^- + \beta \rho\Big(\frac{\bar{\eta}}{(1-\beta)^2} + \frac{x^- + Y + \pi_R(Y)}{1-\beta}\Big)\\
&= \bar{\eta}\Big(1 + \frac{\beta}{1-\beta} + \frac{\beta}{(1-\beta)^2}\Big) + x^-\Big(1 + \frac{\beta}{1-\beta}\Big) = \bar{b}(x).
\end{aligned}$$
The first inequality is again by the monotonicity of $\mathcal{T}_{d_0}$, the second one by the monotonicity of $\rho$ and the third one by subadditivity and translation invariance. $\square$

With analogous arguments as in the proof of Lemma 5.2 and using the fact that
$$\frac{1}{1-\beta} = 1 + \frac{\beta}{1-\beta} \qquad \text{and} \qquad 1 + \frac{\beta}{1-\beta} + \frac{\beta}{(1-\beta)^2} = \frac{1}{(1-\beta)^2}$$
we obtain for $(x, f) \in D$ the two inequalities
$$\begin{aligned}
-\rho\big(-\underline{b}(x + Z - f(Y) - \pi_R(f))\big) &\ge \underline{b}(x + \underline{\eta}) \ge -\frac{\underline{\eta}}{(1-\beta)^2} - \frac{x^+ + \underline{\eta}}{1-\beta}\\
&= -\frac{x^+}{\beta}\Big(\frac{1}{1-\beta} - 1\Big) - \frac{\underline{\eta}}{\beta}\Big(\frac{1}{(1-\beta)^2} - 1\Big)
= \frac{1}{\beta}\big(\underline{b}(x) + x^+ + \underline{\eta}\big)\\
&\ge \frac{1}{\beta}\big(\underline{b}(x) + (1-\beta) x^+ + \underline{\eta}\big) = \frac{1 - (1-\beta)^2}{\beta}\, \underline{b}(x),
\end{aligned} \tag{5.5}$$
$$\begin{aligned}
\rho\big(\bar{b}(x + Z - f(Y) - \pi_R(f))\big) &\le \rho\big(\bar{b}(x - Y - \pi_R(Y))\big) \le \rho\Big(\frac{\bar{\eta}}{(1-\beta)^2} + \frac{Y + \pi_R(Y) + x^-}{1-\beta}\Big)\\
&= \frac{x^-}{1-\beta} + \bar{\eta}\Big(\frac{1}{1-\beta} + \frac{1}{(1-\beta)^2}\Big)
= \frac{x^-}{\beta}\Big(\frac{1}{1-\beta} - 1\Big) + \frac{\bar{\eta}}{\beta}\Big(\frac{1}{(1-\beta)^2} - 1\Big)\\
&= \frac{1}{\beta}\big(\bar{b}(x) - x^- - \bar{\eta}\big) \le \frac{1}{\beta}\big(\bar{b}(x) - (1-\beta) x^- - \bar{\eta}\big) = \frac{1 - (1-\beta)^2}{\beta}\, \bar{b}(x),
\end{aligned} \tag{5.6}$$
which will be relevant in subsequent proofs.

The next lemma shows that the infinite horizon policy values (5.3) and infinite horizon value function (5.4) are well-defined.

Lemma 5.3.

The sequence $\{J_{N\pi}\}_{N \in \mathbb{N}}$ converges pointwise for every Markov policy $\pi \in \Pi^M$ and the limit function $J_{\infty\pi}$ is lower semicontinuous, decreasing and bounded by $\underline{b}, \bar{b}$.

Proof. First, we show by induction that for all $N \in \mathbb{N}$
$$J_{N\pi}(x) \ge J_{N-1\,\pi}(x) + \big(1 - (1-\beta)^2\big)^{N-1}\, \underline{b}(x), \qquad x \in \mathbb{R}. \tag{5.7}$$
For $N = 1$ we have by Lemma 5.2
$$J_{1\pi}(x) \ge \underline{b}(x) = J_{0\pi}(x) + \big(1 - (1-\beta)^2\big)^0\, \underline{b}(x).$$
For $N \ge 2$ it follows
$$\begin{aligned}
J_{N\pi}(x) &= \rho\Big(d_0(x)(Y) + \pi_R(d_0(x)) - Z - x + \beta J_{N-1\,\vec{\pi}}\big(x + Z - d_0(x)(Y) - \pi_R(d_0(x))\big)\Big)\\
&\ge \rho\Big(d_0(x)(Y) + \pi_R(d_0(x)) - Z - x + \beta J_{N-2\,\vec{\pi}}\big(x + Z - d_0(x)(Y) - \pi_R(d_0(x))\big)\\
&\qquad\quad + \beta \big(1 - (1-\beta)^2\big)^{N-2}\, \underline{b}\big(x + Z - d_0(x)(Y) - \pi_R(d_0(x))\big)\Big)\\
&\ge J_{N-1\,\pi}(x) - \beta \big(1 - (1-\beta)^2\big)^{N-2} \rho\Big(-\underline{b}\big(x + Z - d_0(x)(Y) - \pi_R(d_0(x))\big)\Big)\\
&\ge J_{N-1\,\pi}(x) + \big(1 - (1-\beta)^2\big)^{N-1}\, \underline{b}(x).
\end{aligned}$$
The first inequality is by the induction hypothesis, the second one is by Lemma 2.5 together with the positive homogeneity of $\rho$ and the third one is due to (5.5). Thus, (5.7) holds. Applying this inequality repeatedly for $N, N-1, \dots, m$ yields
$$J_{N\pi}(x) \ge J_{m\pi}(x) + \sum_{k=m}^{N-1} \big(1 - (1-\beta)^2\big)^k\, \underline{b}(x) \ge J_{m\pi}(x) + \delta_m(x), \tag{5.8}$$
where
$$\delta_m : \mathbb{R} \to (-\infty, 0], \qquad \delta_m(x) = \underline{b}(x) \sum_{k=m}^{\infty} \big(1 - (1-\beta)^2\big)^k, \qquad m \in \mathbb{N},$$
are non-positive functions with $\lim_{m \to \infty} \delta_m(x) = 0$ for all $x \in \mathbb{R}$. Hence, the sequence of functions $\{J_{N\pi}\}_{N \in \mathbb{N}}$ is weakly increasing and by Lemma A.1.4 in Bäuerle and Rieder (2011) convergent to a limit function $J_{\infty\pi}$ which is lower semicontinuous and decreasing. The bounds from Lemma 5.2 also apply to the limit. $\square$

Since $\underline{b} \le 0$ and $\bar{b} \ge 0$, the finite and infinite horizon value functions are bounded in absolute value by
$$b : \mathbb{R} \to \mathbb{R}_+, \qquad b(x) = \bar{b}(x) - \underline{b}(x) = \frac{1}{1-\beta}\,|x| + \frac{1}{(1-\beta)^2}\,\eta, \qquad \eta = \rho(Y) + \pi_R(Y) + \operatorname{ess\,sup}(Z),$$
and therefore contained in the set
$$\mathcal{B} = \{ v : \mathbb{R} \to \mathbb{R} \mid v \text{ lower semicontinuous and decreasing with } \lambda \in \mathbb{R}_+ \text{ s.t. } |v| \le \lambda b \}.$$
Endowing it with the weighted supremum norm $\|v\|_b = \sup_{x \in \mathbb{R}} \frac{|v(x)|}{b(x)}$ makes $(\mathcal{B}, \|\cdot\|_b)$ a complete metric space, cf. Proposition 7.2.1 in Hernández-Lerma and Lasserre (1999).

Lemma 5.4.

The Bellman operator $\mathcal{T}$ is a contraction on $\mathcal{B}$ with modulus $1 - (1-\beta)^2 \in (0, 1)$.

Proof. Let $v \in \mathcal{B}$. It follows as in the proof of Theorem 4.5 that $\mathcal{T} v$ is lower semicontinuous and decreasing. Furthermore,
$$\begin{aligned}
|\mathcal{T} v(x)| &= \inf_{f \in D(x)} \Big| \rho\Big(f(Y) + \pi_R(f) - Z - x + \beta v\big(x + Z - f(Y) - \pi_R(f)\big)\Big) \Big|\\
&\le \inf_{f \in D(x)} \rho\big(|f(Y) + \pi_R(f) - Z - x|\big) + \beta \rho\Big(\big|v\big(x + Z - f(Y) - \pi_R(f)\big)\big|\Big)\\
&\le \inf_{f \in D(x)} \rho\big(|f(Y) + \pi_R(f) - Z - x|\big) + \beta \rho\Big(\|v\|_b\, b\big(x + Z - f(Y) - \pi_R(f)\big)\Big)\\
&\le \rho\big(Y + \pi_R(Y) + Z + |x|\big) + \beta \|v\|_b\, \rho\Big(\frac{\eta}{(1-\beta)^2} + \frac{1}{1-\beta}\big(|x| + Z + Y + \pi_R(Y)\big)\Big)\\
&\le \max\{\|v\|_b, 1\} \Big(\eta + |x| + \beta\Big(\frac{\eta}{(1-\beta)^2} + \frac{|x| + \eta}{1-\beta}\Big)\Big) = \max\{\|v\|_b, 1\}\, b(x).
\end{aligned}$$
The first inequality is by Lemma 2.3 and subadditivity. Hence, the operator $\mathcal{T}$ is an endofunction on $\mathcal{B}$ and it remains to verify the Lipschitz constant $1 - (1-\beta)^2$. For $v_1, v_2 \in \mathcal{B}$ it holds
$$\begin{aligned}
|\mathcal{T} v_1(x) - \mathcal{T} v_2(x)| &\le \sup_{f \in D(x)} \Big| \rho\Big(f(Y) + \pi_R(f) - Z - x + \beta v_1\big(x + Z - f(Y) - \pi_R(f)\big)\Big)\\
&\qquad\qquad - \rho\Big(f(Y) + \pi_R(f) - Z - x + \beta v_2\big(x + Z - f(Y) - \pi_R(f)\big)\Big) \Big|\\
&\le \beta \sup_{f \in D(x)} \rho\Big(\big|v_1\big(x + Z - f(Y) - \pi_R(f)\big) - v_2\big(x + Z - f(Y) - \pi_R(f)\big)\big|\Big)\\
&\le \beta \|v_1 - v_2\|_b \sup_{f \in D(x)} \rho\Big(b\big(x + Z - f(Y) - \pi_R(f)\big)\Big)\\
&\le \beta \|v_1 - v_2\|_b \sup_{f \in D(x)} \Big[\rho\Big(\bar{b}\big(x + Z - f(Y) - \pi_R(f)\big)\Big) + \rho\Big(-\underline{b}\big(x + Z - f(Y) - \pi_R(f)\big)\Big)\Big]\\
&\le \big(1 - (1-\beta)^2\big) \|v_1 - v_2\|_b\, \big[\bar{b}(x) - \underline{b}(x)\big] = \big(1 - (1-\beta)^2\big) \|v_1 - v_2\|_b\, b(x).
\end{aligned}$$
Dividing by $b(x)$ and taking the supremum over $x \in \mathbb{R}$ on the left hand side completes the proof. Note that the second inequality holds by Lemma 2.3, the fourth one due to $b = \bar{b} - \underline{b}$ and the subadditivity of $\rho$, and the last one is by (5.5), (5.6). $\square$

Under a finite planning horizon $N \in \mathbb{N}$ we have characterized the value function with the Bellman equation (5.2). We will show that this is compatible with the optimality criterion of the infinite horizon model (5.4). To this end, we define the limit value function
$$J(x) = \lim_{N \to \infty} J_N(x), \qquad x \in \mathbb{R}.$$
Note that the limit exists since it follows from (5.8) that $J_N \ge J_m + \delta_m$ for all $N \ge m$, making $J_N$ weakly increasing.

Theorem 5.5. a) The limit value function $J$ is the unique fixed point of the Bellman operator $\mathcal{T}$ in $\mathcal{B}$.
b) There exists a Markov decision rule $d^*$ such that $\mathcal{T}_{d^*} J(x) = \mathcal{T} J(x)$, $x \in \mathbb{R}$.
c) Each stationary reinsurance policy $\pi^* = (d^*, d^*, \dots)$ induced by a Markov decision rule $d^*$ as in part b) is optimal for optimization problem (5.4) and it holds $J_\infty = J$.

Proof. a) The fact that $J$ is the unique fixed point of the operator $\mathcal{T}$ in $\mathcal{B}$ follows directly from Banach's Fixed Point Theorem using Lemma 5.4.
b) The existence of a minimizing Markov decision rule follows from the respective result in the finite horizon case, cf. Theorem 4.5.
c) Let $d^*$ be a Markov decision rule as in part b) and $\pi^* = (d^*, d^*, \dots)$. Then it holds
$$J(x) \le J_\infty(x) \le J_{\infty\pi^*}(x), \qquad x \in \mathbb{R}.$$
The second inequality is by definition. Regarding the first one note that for any $\pi \in \Pi^M$ we have $J_N \le J_{N\pi}$ for all $N \in \mathbb{N}$. Letting $N \to \infty$ yields $J \le J_{\infty\pi}$. Since $\pi \in \Pi^M$ was arbitrary, we get $J \le \inf_{\pi \in \Pi^M} J_{\infty\pi} = J_\infty$. It remains to show
$$J_{\infty\pi^*}(x) \le J(x), \qquad x \in \mathbb{R}. \tag{5.9}$$
To that end, we will prove by induction that for all $N \in \mathbb{N}$ and $x \in \mathbb{R}$
$$J(x) \ge J_{N\pi^*}(x) + \big(1 - (1-\beta)^2\big)^N\, \underline{b}(x). \tag{5.10}$$
Letting $N \to \infty$ in (5.10) yields (5.9) and concludes the proof. For $N = 0$ equation (5.10) reduces to $J(x) \ge \underline{b}(x)$, which holds by Lemma 5.2. For $N \ge 1$ it follows
$$\begin{aligned}
J(x) = \mathcal{T}_{d^*} J(x) &\ge \mathcal{T}_{d^*}\Big(J_{N-1\,\pi^*} + \big(1 - (1-\beta)^2\big)^{N-1}\, \underline{b}\Big)(x)\\
&\ge \rho\Big(d^*(x)(Y) + \pi_R(d^*(x)) - Z - x + \beta J_{N-1\,\pi^*}\big(x + Z - d^*(x)(Y) - \pi_R(d^*(x))\big)\Big)\\
&\qquad - \beta \big(1 - (1-\beta)^2\big)^{N-1} \rho\Big(-\underline{b}\big(x + Z - d^*(x)(Y) - \pi_R(d^*(x))\big)\Big)\\
&\ge J_{N\pi^*}(x) + \big(1 - (1-\beta)^2\big)^N\, \underline{b}(x).
\end{aligned}$$
The second inequality is by Lemma 2.5 together with the positive homogeneity of $\rho$ and the last one is by (5.5). $\square$

6. Connection to profit maximization

In this section, we show that recursive cost of solvency capital minimization is in accordance with the primary target of any insurance company: profit maximization. For notational convenience we consider the stationary model regardless of the planning horizon and use forward indexing for the value functions. Let $\rho$ be a law-invariant, proper and coherent risk measure with the Fatou property satisfying $\rho(Y) < \infty$. By inserting the dual representation of Proposition 2.4 in the Bellman equation, we get
$$J_0(x) = 0, \qquad J_N(x) = \inf_{f \in D(x)} \sup_{Q \in \mathcal{Q}} \mathbb{E}^Q\Big[f(Y) + \pi_R(f) - Z - x + \beta J_{N-1}\big(x + Z - f(Y) - \pi_R(f)\big)\Big], \qquad x \in \mathbb{R},$$
i.e. the Bellman equation of a distributionally robust Markov Decision Process as considered in Bäuerle and Glauner (2020a). Through this connection, we can derive a closed form expression for our recursively defined optimality criterion which turns out to be related to profit maximization.

Since the claims $Y_1, Y_2, \dots$ and premium income $Z_1, Z_2, \dots$ are i.i.d. we can w.l.o.g. assume that the probability space has a product structure
$$(\Omega, \mathcal{A}, \mathbb{P}) = \bigotimes_{n=1}^{\infty} (\Omega_1, \mathcal{A}_1, \mathbb{P}_1)$$
with $(Y_n, Z_n)(\omega) = (Y_n, Z_n)(\omega_n)$ only depending on component $\omega_n$ of $\omega = (\omega_1, \omega_2, \dots) \in \Omega$. Besides, the probability measure $\mathbb{P}_1$ on $(\Omega_1, \mathcal{A}_1)$ can w.l.o.g. be assumed to be separable by constructing the random variables canonically, since $\mathcal{B}(\mathbb{R})$ is countably generated which makes any probability measure on it separable, see Bogachev (2007, 1.12). Furthermore, note that
$$\begin{aligned}
\rho\big((x + Z - f(Y) - \pi_R(f))^+\big) &\le x^+ + \operatorname{ess\,sup}(Z) \le -\underline{b}(x),\\
\rho\big((x + Z - f(Y) - \pi_R(f))^-\big) &\le \rho(Y) + \pi_R(Y) + x^- \le \bar{b}(x)
\end{aligned}$$
holds for all $(x, f) \in D$. This implies that the assumptions of Theorem 6.1 in Bäuerle and Glauner (2020b) are satisfied and we can conclude the following:

Proposition 6.1.

For a Markov policy $\pi = (d_0, d_1, \dots) \in \Pi^M$ and $\gamma = (\gamma_0, \gamma_1, \dots)$, where $\gamma_n : D \to \mathcal{Q}$ is measurable, we define the transition kernel
$$Q_n^{\pi\gamma}(B \mid x) = \int \mathbf{1}_B\big(x + Z_{n+1}(\omega_{n+1}) - d_n(x)(Y_{n+1}(\omega_{n+1})) - \pi_R(d_n(x))\big)\, \gamma_n(\mathrm{d}\omega_{n+1} \mid x, d_n(x)), \qquad B \in \mathcal{B}(\mathbb{R}),\ x \in \mathbb{R},$$
and the law of motion $Q_x^{\pi\gamma} = \delta_x \otimes Q_0^{\pi\gamma} \otimes Q_1^{\pi\gamma} \otimes \dots$ of the surplus process induced by the Theorem of Ionescu-Tulcea. The set of all possible laws of motion under policy $\pi \in \Pi^M$ is denoted by $\mathfrak{Q}^\pi = \{Q_x^{\pi\gamma} : \gamma \in \Gamma\}$ with $\Gamma$ being the set of all possible $\gamma$. Then it holds for $N \in \mathbb{N} \cup \{\infty\}$
$$J_N(x) = \inf_{\pi \in \Pi^M} \sup_{Q \in \mathfrak{Q}^\pi} \mathbb{E}^Q\Bigg[\sum_{n=0}^{N-1} \beta^n \big(d_n(X_n^\pi)(Y_{n+1}) + \pi_R(d_n(X_n^\pi)) - Z_{n+1} - X_n^\pi\big)\Bigg]. \tag{6.1}$$

Proof. The assertion follows directly from Theorem 6.1 in Bäuerle and Glauner (2020b). $\square$
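The sum inside (6.1) can be regrouped path by path once the surplus dynamics are inserted. The following minimal sketch checks this numerically for a single fixed path; it is an illustration only, and the per-period gain `G_{n+1} = Z_{n+1} - d_n(X_n)(Y_{n+1}) - pi_R(d_n(X_n))` as well as the function names are shorthand assumed here, not notation from the paper. With $X_{n+1} = X_n + G_{n+1}$, each summand of (6.1) equals $-\beta^n (X_n + G_{n+1})$.

```python
def discounted_cost(x, gains, beta):
    """Pathwise value of sum_{n<N} beta^n * (-(X_n + G_{n+1})),
    where X_0 = x and X_{n+1} = X_n + G_{n+1} (gains[n] plays the role of G_{n+1})."""
    total = 0.0
    surplus = x
    for n, gain in enumerate(gains):
        surplus = surplus + gain          # X_{n+1} = X_n + G_{n+1}
        total -= beta ** n * surplus      # summand of (6.1) equals -beta^n * X_{n+1}
    return total


def regrouped_cost(x, gains, beta):
    """Same quantity after exchanging the order of summation:
    -(sum_n beta^n) * x - sum_n (sum_{j>=n} beta^j) * G_{n+1}."""
    N = len(gains)
    weights = [sum(beta ** j for j in range(n, N)) for n in range(N)]
    initial_part = sum(beta ** n for n in range(N)) * x
    profit_part = sum(w * g for w, g in zip(weights, gains))
    return -initial_part - profit_part


direct = discounted_cost(3.0, [1.0, -2.0, 0.5], 0.5)
regrouped = regrouped_cost(3.0, [1.0, -2.0, 0.5], 0.5)
```

Both functions return the same number for any path, which shows that the gain of period $n$ carries the weight $\sum_{j \ge n} \beta^j$, so earlier periods are weighted more heavily.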

To interpret this closed form expression for the value functions let us reformulate (6.1) to
$$\begin{aligned}
J_N(x) &= -\sup_{\pi \in \Pi^M} \inf_{Q \in \mathfrak{Q}^\pi} \mathbb{E}^Q\Bigg[\sum_{n=0}^{N-1} \beta^n \Big(X_n^\pi + Z_{n+1} - d_n(X_n^\pi)(Y_{n+1}) - \pi_R(d_n(X_n^\pi))\Big)\Bigg]\\
&= -\sum_{n=0}^{N-1} \beta^n x - \sup_{\pi \in \Pi^M} \inf_{Q \in \mathfrak{Q}^\pi} \mathbb{E}^Q\Bigg[\sum_{n=0}^{N-1} \Big(\sum_{j=n}^{N-1} \beta^j\Big)\Big(Z_{n+1} - d_n(X_n^\pi)(Y_{n+1}) - \pi_R(d_n(X_n^\pi))\Big)\Bigg].
\end{aligned}$$
The second equality can be obtained inductively by inserting the dynamics of the surplus process (3.2): the surplus $X_n^\pi$ is the initial capital $x$ plus the profits of all periods before $n$, so after exchanging the order of summation the profit of period $n$ collects the weight $\sum_{j=n}^{N-1} \beta^j$. Here, we have a robust maximization of total profit with higher weights on earlier periods. This addresses a fundamental criticism of cost of capital minimization as an optimality criterion for reinsurance design by Albrecher et al. (2017, Sec. 8.4). The authors state that if the minimization of the cost of capital was the driving criterion of the insurer, it would be optimal in the long run to stay out of business altogether and thereby achieve zero cost of capital. This viewpoint brings the suitability of the recursive optimality criterion into question since it would be applied over several periods. However, under a coherent risk measure the calculations above show that the recursive criterion is indeed in accordance with profit maximization.

7. Examples

In this section, we present two examples where the optimal reinsurance policy can be determined analytically. Moreover, we give an example illustrating the difficulties of the general case. The first example studies Value-at-Risk as a concrete choice for the risk measure $\rho$. This has particular practical relevance with regard to Solvency II.

Example 7.1.

Let $\rho_n = \mathrm{VaR}_{\alpha_n}$ and the insurer's premium income be deterministic, i.e. $Z_{n+1} \equiv z_{n+1} \in \mathbb{R}_+$, $n = 0, \dots, N-1$. That is, we focus on the risk in the insurance claims and neglect potential fluctuations of premium payments from the insurer's customers. Using Theorem 4.5, we have to solve the Bellman equation
$$\begin{aligned}
J_n(x) &= \inf_{f \in D_n(x)} \mathrm{VaR}_{\alpha_n}\Big(f(Y_{n+1}) + \pi_{R,n}(f) - z_{n+1} - x + \beta J_{n+1}\big(x + z_{n+1} - f(Y_{n+1}) - \pi_{R,n}(f)\big)\Big)\\
&= \inf_{f \in D_n(x)} \mathrm{VaR}_{\alpha_n}\Big(\phi\big(f(Y_{n+1}) + \pi_{R,n}(f) - z_{n+1} - x\big)\Big)\\
&= \inf_{f \in D_n(x)} \phi\Big(\mathrm{VaR}_{\alpha_n}\big(f(Y_{n+1})\big) + \pi_{R,n}(f) - z_{n+1} - x\Big).
\end{aligned}$$
Here, we used that quantiles can be interchanged with the increasing lower semicontinuous (i.e. left-continuous) function $\phi(x) = x + \beta J_{n+1}(-x)$, see e.g. Proposition 2.2 in Bäuerle and Glauner (2018). The increasing transformation $\phi$ can be dropped and it remains to solve
$$\inf_{f \in D_n(x)} \mathrm{VaR}_{\alpha_n}(f(Y_{n+1})) + \pi_{R,n}(f). \tag{7.1}$$
This is a static optimal reinsurance problem with budget constraint. That is, the dynamic reinsurance problem (4.1) possesses a myopic optimal policy under Value-at-Risk.

Extending an approach used in Chi and Tan (2013) and Bäuerle and Glauner (2018) to problems with constraints, (7.1) can be reduced to a finite dimensional problem. By repeating the interchange argument, the minimization problem can be rewritten as
$$\inf_{f \in D_n(x)} f\big(\mathrm{VaR}_{\alpha_n}(Y_{n+1})\big) + \pi_{R,n}(f).$$
Now define
$$h_a(x) = \max\big\{\min\{a, x\},\ x - \mathrm{VaR}_{\alpha_n}(Y_{n+1}) + a\big\}, \qquad x \in \mathbb{R}_+,\ 0 \le a \le \mathrm{VaR}_{\alpha_n}(Y_{n+1}).$$
This is the retained loss function corresponding to a layer reinsurance treaty with deductible $a$ and upper bound $\mathrm{VaR}_{\alpha_n}(Y_{n+1}) - a$. Clearly, $h_a \in \mathcal{F}$ for all $a \in [0, \mathrm{VaR}_{\alpha_n}(Y_{n+1})]$. Fix $f \in \mathcal{F}$. We write $h_f$ as shorthand for $h_a$ when $a = f(\mathrm{VaR}_{\alpha_n}(Y_{n+1}))$. Observe that $f(\mathrm{VaR}_{\alpha_n}(Y_{n+1})) \in [0, \mathrm{VaR}_{\alpha_n}(Y_{n+1})]$. Simply by inserting we find that
$$h_f(\mathrm{VaR}_{\alpha_n}(Y_{n+1})) = f(\mathrm{VaR}_{\alpha_n}(Y_{n+1})).$$
Moreover, it holds $\pi_{R,n}(h_f) \le \pi_{R,n}(f)$. This can be seen as follows. If $0 \le x < f(\mathrm{VaR}_{\alpha_n}(Y_{n+1}))$, then $h_f(x) = x \ge f(x)$ as $f$ is bounded by the identity. If $f(\mathrm{VaR}_{\alpha_n}(Y_{n+1})) \le x < \mathrm{VaR}_{\alpha_n}(Y_{n+1})$, then $h_f(x) = f(\mathrm{VaR}_{\alpha_n}(Y_{n+1})) \ge f(x)$ since $f$ is increasing. Finally if $x \ge \mathrm{VaR}_{\alpha_n}(Y_{n+1})$, then $h_f(x) = x - \mathrm{VaR}_{\alpha_n}(Y_{n+1}) + f(\mathrm{VaR}_{\alpha_n}(Y_{n+1})) \ge f(x)$ as $f$ is 1-Lipschitz, cf. Lemma 3.3 a). Consequently, $Y_{n+1} - h_f(Y_{n+1}) \le Y_{n+1} - f(Y_{n+1})$ and by monotonicity $\pi_{R,n}(h_f) \le \pi_{R,n}(f)$. That is, $h_f$ is weakly better than $f$ with respect to the objective function and satisfies the constraint if $f$ does. Therefore, it suffices to consider the reduced problem
$$\inf_{0 \le a \le \mathrm{VaR}_{\alpha_n}(Y_{n+1})} a + \pi_{R,n}(h_a) \quad \text{such that} \quad \pi_{R,n}(h_a) \le x^+. \tag{7.2}$$
For solving (7.2) explicitly, one has to specify the premium principle. As an example, we consider a Wang premium principle
$$\pi_{R,n}(X) = (1+\theta) \int_0^\infty g(S_X(x))\, \mathrm{d}x, \qquad \theta \ge 0,$$
where we only assume that the distortion function $g$ is left-continuous. This includes any PH premium, especially the expected premium principle. In order to calculate the premium for $h_a$, we have to determine the survival function of $Y_{n+1} - h_a(Y_{n+1}) = \min\{(Y_{n+1} - a)^+, \mathrm{VaR}_{\alpha_n}(Y_{n+1}) - a\}$:
$$\mathbb{P}\big(Y_{n+1} - h_a(Y_{n+1}) > y\big) = \begin{cases} \mathbb{P}\big((Y_{n+1} - a)^+ > y\big) = S_{Y_{n+1}}(y + a), & 0 \le y < \mathrm{VaR}_{\alpha_n}(Y_{n+1}) - a,\\ 0, & y \ge \mathrm{VaR}_{\alpha_n}(Y_{n+1}) - a. \end{cases}$$
It follows
$$\pi_{R,n}(h_a) = (1+\theta) \int_0^{\mathrm{VaR}_{\alpha_n}(Y_{n+1}) - a} g\big(S_{Y_{n+1}}(y + a)\big)\, \mathrm{d}y = (1+\theta) \int_a^{\mathrm{VaR}_{\alpha_n}(Y_{n+1})} g\big(S_{Y_{n+1}}(y)\big)\, \mathrm{d}y.$$
The derivative of the objective function
$$\psi(a) = a + (1+\theta) \int_a^{\mathrm{VaR}_{\alpha_n}(Y_{n+1})} g\big(S_{Y_{n+1}}(y)\big)\, \mathrm{d}y, \qquad 0 \le a \le \mathrm{VaR}_{\alpha_n}(Y_{n+1}),$$
is given by $\psi'(a) = 1 - (1+\theta)\, g(S_{Y_{n+1}}(a))$. Since the distortion function $g$ is left-continuous, $g \circ S_{Y_{n+1}}$ is itself a survival function. Thus, $\psi'$ is increasing and right-continuous. That is, its generalized inverse $\psi'^{-1}(z) = \inf\{a \in [0, \mathrm{VaR}_{\alpha_n}(Y_{n+1})] : \psi'(a) \ge z\}$ is well-defined for every $z$ in the range of $\psi'$. Let us distinguish two cases:

Case 1: $g(1 - \alpha_n) < \frac{1}{1+\theta}$.
By definition of a quantile, we have $S_{Y_{n+1}}(\mathrm{VaR}_{\alpha_n}(Y_{n+1})) \le 1 - \alpha_n$. Since $g$ is increasing it follows
$$\psi'(\mathrm{VaR}_{\alpha_n}(Y_{n+1})) = 1 - (1+\theta)\, g\big(S_{Y_{n+1}}(\mathrm{VaR}_{\alpha_n}(Y_{n+1}))\big) \ge 1 - (1+\theta)\, g(1 - \alpha_n) > 0.$$
Hence, $\psi$ is strictly increasing on $[\psi'^{-1}(0), \mathrm{VaR}_{\alpha_n}(Y_{n+1})]$.

Case 2: $g(1 - \alpha_n) \ge \frac{1}{1+\theta}$.
Let $a < \mathrm{VaR}_{\alpha_n}(Y_{n+1})$. Then $S_{Y_{n+1}}(a) > 1 - \alpha_n$ and as $g$ is increasing
$$\psi'(a) = 1 - (1+\theta)\, g\big(S_{Y_{n+1}}(a)\big) \le 1 - (1+\theta)\, g(1 - \alpha_n) \le 0.$$
That is, $\psi$ is decreasing on $[0, \mathrm{VaR}_{\alpha_n}(Y_{n+1})]$.

Note that in practice $\alpha_n$ is chosen very close to 1 and $\theta$ smaller than 1, so only the first case is actually relevant. Let us define
$$a_n^* = \begin{cases} \psi'^{-1}(0), & \text{if } g(1 - \alpha_n) < \frac{1}{1+\theta},\\ \mathrm{VaR}_{\alpha_n}(Y_{n+1}), & \text{otherwise.} \end{cases}$$
Note that $a = \mathrm{VaR}_{\alpha_n}(Y_{n+1})$ is always feasible for optimization problem (7.2) and that $a \mapsto \pi_{R,n}(h_a)$ is a continuous mapping. Therefore, taking into account the budget constraint we obtain as an optimal solution of (7.2):
$$a_n(x) = \min\{a \in [a_n^*, \mathrm{VaR}_{\alpha_n}(Y_{n+1})] : \pi_{R,n}(h_a) \le x^+\}.$$
Consequently, an optimal reinsurance policy $\pi = (d_0^*, \dots, d_{N-1}^*)$ is given by $d_n^*(x) = h_{a_n(x)}$. That is, it is an optimal policy to buy a layer reinsurance treaty in each period where the single parameter is chosen as close to the optimal parameter of the corresponding problem without constraint as the current surplus allows. For a low surplus, this means that the insurer should invest all capital in reinsurance to mitigate future insurance claims rather than saving capital to pay for them himself.

Remark 7.2.

Since the value functions are decreasing, translation invariance of the risk measure implies that for a sufficiently high initial capital the recursive cost of capital is non-positive at each stage, cf. Theorem 4.5. In this case, we get an upper bound for the probability of ruin before the planning horizon $N$ in the setting of Example 7.1. Fix a sufficiently large $x > 0$, let $f_n^*$ be the optimal reinsurance contract at time $n$ in state $x$ and

$$U_n = f_n^*(Y_{n+1}) + \pi_{R,n}(f_n^*) - z_{n+1} - x + \beta J_{n+1}\big(x + z_{n+1} - f_n^*(Y_{n+1}) - \pi_{R,n}(f_n^*)\big).$$

Then it follows

$$0 \ge J_n(x) = \mathrm{VaR}_{\alpha_n}(U_n) = \inf\{t \in \mathbb{R} : P(U_n > t) \le 1 - \alpha_n\}$$

and in particular $P(U_n > 0) \le 1 - \alpha_n$. This is the probability of a ruin in period $[n, n+1)$ if the future cost of capital is taken into account. Now, Bonferroni's inequality yields that the probability of a ruin before the planning horizon $N$ is bounded by $\sum_{n=0}^{N-1} (1 - \alpha_n)$ under the optimal reinsurance policy if the recursive cost of capital is non-positive at each stage. Note that we are discussing here an imputed ruin taking into account future costs. Omitting the future cost part in the Bellman equation at each stage, one finds that the probability of a ruin in the classical sense (negative surplus) under the so-obtained optimal policy is also bounded by $\sum_{n=0}^{N-1} (1 - \alpha_n)$.

As a second example we consider the dynamic reinsurance problem with general risk measures but without a budget constraint. This turns out to simplify the problem significantly.

Example 7.3.

Let the model be stationary since we consider both finite and infinite planning horizons. In case of no budget constraint, $D_n(x) = \mathcal{F}$ for all $x \in \mathbb{R}$, the dynamic optimization problems (4.1) and (5.4) reduce to static problems and there is a constant optimal action. This can be seen by backward induction. At time $N-1$, the Bellman equation reads due to the translation invariance of $\rho$

$$J_{N-1}(x) = \min_{f \in \mathcal{F}} \rho\big(f(Y) - Z\big) + \pi_R(f) - x,$$

i.e. the minimization does not depend on the state of the surplus process $x$. Therefore, the value function is of the form $J_{N-1}(x) = c - x$ with a constant $c = \min_{f \in \mathcal{F}} \rho(f(Y) - Z) + \pi_R(f)$ and the optimal decision rule $d_{N-1}^*$ is constant:

$$d_{N-1}^*(x) = \operatorname*{argmin}_{f \in \mathcal{F}} \rho(f(Y) - Z) + \pi_R(f) =: f^*, \qquad x \in \mathbb{R}.$$

Proceeding to the previous time step, we get due to translation invariance and positive homogeneity of $\rho$

$$\begin{aligned}
J_{N-2}(x) &= \min_{f \in \mathcal{F}} \rho\Big(f(Y) + \pi_R(f) - Z - x + \beta J_{N-1}\big(x + Z - f(Y) - \pi_R(f)\big)\Big) \\
&= \min_{f \in \mathcal{F}} \rho\Big(f(Y) + \pi_R(f) - Z - x + \beta\big(c + f(Y) + \pi_R(f) - Z - x\big)\Big) \\
&= \min_{f \in \mathcal{F}} (1+\beta)\Big(\rho\big(f(Y) - Z\big) + \pi_R(f)\Big) - (1+\beta)x + \beta c.
\end{aligned}$$

Again, the minimization does not depend on $x$, the value function is given by

$$J_{N-2}(x) = (1 + 2\beta)c - (1+\beta)x$$

and the optimal decision rule is $d_{N-2}^* \equiv f^*$. Continuing with the induction, one finds that the value functions are affine and structurally related to the bounding functions:

$$\begin{aligned}
J_n(x) &= c \sum_{k=0}^{N-n-1} (k+1)\beta^k - x \sum_{k=0}^{N-n-1} \beta^k, \qquad x \in \mathbb{R}, \ n = 0, \dots, N-1, \\
J_\infty(x) &= \frac{c}{(1-\beta)^2} - \frac{x}{1-\beta}, \qquad x \in \mathbb{R}.
\end{aligned}$$

Moreover, there is a retained loss function $f^* \in \mathcal{F}$ which is optimal at each point in time independently from the state of the surplus process. It can be determined by solving the classical static optimal reinsurance problem

$$\min_{f \in \mathcal{F}} \rho\big(f(Y) - Z\big) + \pi_R(f). \tag{7.3}$$

In order to prove this, it remains to verify the induction step.
Due to translation invariance and positive homogeneity of $\rho$ it follows

$$\begin{aligned}
J_n(x) &= \min_{f \in \mathcal{F}} \rho\Big(f(Y) + \pi_R(f) - Z - x + \beta J_{n+1}\big(x + Z - f(Y) - \pi_R(f)\big)\Big) \\
&= \min_{f \in \mathcal{F}} \rho\Big(f(Y) + \pi_R(f) - Z - x + c\beta \sum_{k=0}^{N-n-2} (k+1)\beta^k + \beta\big(f(Y) + \pi_R(f) - Z - x\big) \sum_{k=0}^{N-n-2} \beta^k\Big) \\
&= \min_{f \in \mathcal{F}} \Big(\rho\big(f(Y) - Z\big) + \pi_R(f)\Big) \sum_{k=0}^{N-n-1} \beta^k + c\beta \sum_{k=0}^{N-n-2} (k+1)\beta^k - x \sum_{k=0}^{N-n-1} \beta^k \\
&= c\left(\sum_{k=0}^{N-n-1} \beta^k + \beta \sum_{k=0}^{N-n-2} (k+1)\beta^k\right) - x \sum_{k=0}^{N-n-1} \beta^k \\
&= c \sum_{k=0}^{N-n-1} (k+1)\beta^k - x \sum_{k=0}^{N-n-1} \beta^k.
\end{aligned}$$

If the premium income is deterministic, the optimal retained loss function in (7.3) is known for several risk measures. Under Value-at-Risk, layer reinsurance contracts are optimal as we have shown in Example 7.1 and this remains true under Expected Shortfall with the possibility of a degeneration to a stop-loss treaty, see Chi and Tan (2013). More complicated multi-layer treaties are known to be optimal under general distortion risk measures and Wang premium principles, see e.g. Cui et al. (2013) and Zhuang et al. (2016).

In Examples 7.1 and 7.3 the optimal reinsurance policy turned out to be myopic. The following example shows that in general this is not the case if there is a budget constraint.
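Before turning to that example, the affine form derived in Example 7.3 can be checked numerically. The following Python sketch is illustrative and not part of the paper's derivation. It assumes a zero terminal value function $J_N \equiv 0$ (consistent with the Bellman equation at time $N-1$ in Example 7.3), writes $J_n(x) = A_n c - B_n x$, and iterates the coefficient recursion $B_n = 1 + \beta B_{n+1}$, $A_n = B_n + \beta A_{n+1}$ implied by the induction step. The function names and parameter values are hypothetical.

```python
from math import isclose


def affine_coefficients(beta: float, horizon: int) -> list[tuple[float, float]]:
    """Backward recursion for J_n(x) = A_n * c - B_n * x, assuming J_N = 0.

    One Bellman step with translation invariance and positive homogeneity
    gives B_n = 1 + beta * B_{n+1} and A_n = B_n + beta * A_{n+1}.
    """
    a_next, b_next = 0.0, 0.0          # assumed terminal value function J_N = 0
    coeffs = []
    for _ in range(horizon):
        b_n = 1.0 + beta * b_next
        a_n = b_n + beta * a_next
        coeffs.append((a_n, b_n))
        a_next, b_next = a_n, b_n
    coeffs.reverse()                   # coeffs[n] now belongs to time n
    return coeffs


def closed_form(beta: float, horizon: int, n: int) -> tuple[float, float]:
    """Coefficient sums of the affine value function stated in Example 7.3."""
    m = horizon - n                    # number of remaining periods
    a_n = sum((k + 1) * beta**k for k in range(m))
    b_n = sum(beta**k for k in range(m))
    return a_n, b_n


if __name__ == "__main__":
    beta, horizon = 0.95, 10           # illustrative values, not from the paper
    recursive = affine_coefficients(beta, horizon)
    for n in range(horizon):
        a_cf, b_cf = closed_form(beta, horizon, n)
        assert isclose(recursive[n][0], a_cf) and isclose(recursive[n][1], b_cf)
    print("recursion and closed form agree for n = 0, ...,", horizon - 1)
```

For a long horizon the coefficients at time 0 approach $1/(1-\beta)^2$ and $1/(1-\beta)$, which are the infinite-horizon values in $J_\infty$.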

Example 7.4.

Consider a stationary model with $N = 2$, $Y \sim U(0,1)$, $Z \equiv z \in \mathbb{R}_+$, the expected premium principle $\pi_R = (1+\theta)\mathbb{E}$ and Expected Shortfall as risk measure $\rho = \mathrm{ES}_\alpha$. Realistically chosen parameters satisfy $\frac{1}{1-\alpha} \ge 1+\theta$. In the terminal period, we have to solve the Bellman equation

$$\min_{f \in D(x)} \mathrm{ES}_\alpha(f(Y)) + (1+\theta)\mathbb{E}[Y - f(Y)] - z - x. \tag{7.4}$$

Here, we used Theorem 4.5 and the translation invariance of $\rho$. We now show that an optimal retained loss function can be found in the class of stop-loss treaties $f(x) = \min\{x, a\}$, $a \in [0,1]$. To this end, we verify that for every $f \in \mathcal{F}$ we can choose $a_f \in [0,1]$ such that

$$\min\{Y, a_f\} \le_{cx} f(Y). \tag{7.5}$$

To see this, note that the mapping $[0,1] \to \mathbb{R}_+$, $a \mapsto \mathbb{E}[\min\{Y, a\}]$ is continuous by dominated convergence and $\mathbb{E}[\min\{Y, 0\}] \le \mathbb{E}[f(Y)] \le \mathbb{E}[\min\{Y, 1\}]$. Hence, by the intermediate value theorem there is an $a_f \in [0,1]$ such that $\mathbb{E}[f(Y)] = \mathbb{E}[\min\{Y, a_f\}]$. Let us compare the survival functions

$$\begin{aligned}
S_{\min\{Y, a_f\}}(y) &= P(\min\{Y, a_f\} > y) = P(Y > y)\,\mathbf{1}\{a_f > y\}, \\
S_{f(Y)}(y) &= P(f(Y) > y) \le P(Y > y).
\end{aligned}$$

The inequality holds since $f \le \mathrm{id}_{\mathbb{R}_+}$. Hence, we have $S_{\min\{Y, a_f\}}(y) \ge S_{f(Y)}(y)$ for $y < a_f$ and $S_{\min\{Y, a_f\}}(y) \le S_{f(Y)}(y)$ for $y \ge a_f$. The cut criterion 1.5.17 in Müller and Stoyan (2002) implies $\min\{Y, a_f\} \le_{icx} f(Y)$ and due to the equality in expectation follows (7.5), cf. Theorem 1.5.3 in Müller and Stoyan (2002). Note that Expected Shortfall preserves the convex order $\le_{cx}$, see Theorem 4.3 in Bäuerle and Müller (2006), and we have equality in expectation. Thus, $y \mapsto \min\{y, a_f\}$ is weakly better than $f$ w.r.t. the objective function (7.4) and satisfies the budget constraint if $f$ does. Therefore, the problem is reduced to finding the optimal parameter of a stop-loss treaty.

Interchanging quantiles with the increasing and continuous function $y \mapsto \min\{y, a\}$ as in Example 7.1, we can reformulate (7.4) to

$$\min_{a \in [0,1]} \frac{1}{1-\alpha} \int_\alpha^1 \min\{u, a\} \, \mathrm{d}u + (1+\theta)\left(\frac{1}{2} - \int_0^1 \min\{u, a\} \, \mathrm{d}u\right) - z - x \tag{7.6}$$

with the constraint

$$(1+\theta)\left(\frac{1}{2} - \int_0^1 \min\{u, a\} \, \mathrm{d}u\right) = \frac{1+\theta}{2}(1-a)^2 \le x^+. \tag{7.7}$$

If we consider (7.6) without constraint, it follows from $\frac{1}{1-\alpha} \ge 1+\theta$ that the optimal $a$ must be smaller than $\alpha$, which reduces the objective function to

$$\min_{a \in [0,\alpha]} a + \frac{1+\theta}{2}(1-a)^2 - z - x$$

with minimizer $\hat{a} = \min\left\{\frac{\theta}{1+\theta}, \alpha\right\}$. Since the left-hand side of the constraint (7.7) is decreasing in $a$, the inequality can be transformed to

$$a \ge \left(1 - \sqrt{\frac{2x^+}{1+\theta}}\right)^+.$$

Consequently, the optimal parameter of problem (7.6) with constraint (7.7) is

$$a^*(x) = \max\left\{\min\left\{\frac{\theta}{1+\theta}, \alpha\right\}, \left(1 - \sqrt{\frac{2x^+}{1+\theta}}\right)^+\right\}.$$

That is, the value function at time $n = 1$ is

$$J_1(x) = a^*(x) + \frac{1+\theta}{2}\big(1 - a^*(x)\big)^2 - z - x,$$

resulting in a structurally different optimization problem at time $n = 0$ and a non-myopic optimal reinsurance policy.

Acknowledgments.

The author would like to thank Nicole Bäuerle for inspiring discussions and valuable comments.
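Returning to Example 7.4, the state dependence of the terminal-period retention can be made concrete with a small Python sketch. It is illustrative only: the function names and parameter values are hypothetical. For $a \le \alpha$ the Expected Shortfall of $\min\{Y, a\}$ equals $a$, as used in the example; for $a > \alpha$ the sketch evaluates the integral in (7.6) directly, which is an assumption of this sketch rather than a formula stated in the example.

```python
from math import sqrt


def optimal_retention(x: float, alpha: float, theta: float) -> float:
    """Stop-loss parameter a*(x) of Example 7.4 for surplus x."""
    unconstrained = min(theta / (1.0 + theta), alpha)   # minimizer without budget constraint
    x_plus = max(x, 0.0)                                # positive part of the surplus
    budget_floor = max(1.0 - sqrt(2.0 * x_plus / (1.0 + theta)), 0.0)
    return max(unconstrained, budget_floor)


def reinsurance_premium(a: float, theta: float) -> float:
    """Premium (1 + theta) * E[Y - min{Y, a}] for Y ~ U(0, 1), cf. (7.7)."""
    return 0.5 * (1.0 + theta) * (1.0 - a) ** 2


def expected_shortfall_retained(a: float, alpha: float) -> float:
    """ES_alpha of min{Y, a}: average of min{u, a} over u in [alpha, 1]."""
    if a <= alpha:
        return a
    # Direct evaluation of the integral in (7.6) for a > alpha (sketch assumption).
    return (0.5 * (a * a - alpha * alpha) + a * (1.0 - a)) / (1.0 - alpha)


def terminal_cost(a: float, x: float, z: float, alpha: float, theta: float) -> float:
    """Objective (7.6) for a stop-loss treaty with parameter a."""
    return expected_shortfall_retained(a, alpha) + reinsurance_premium(a, theta) - z - x


if __name__ == "__main__":
    alpha, theta, z = 0.99, 0.2, 0.6    # illustrative values with 1/(1-alpha) >= 1+theta
    grid = [i / 1000 for i in range(1001)]
    for x in (0.0, 0.05, 0.1, 0.3, 1.0):
        a_star = optimal_retention(x, alpha, theta)
        # Brute-force comparison over grid points that satisfy the budget constraint.
        feasible = [a for a in grid if reinsurance_premium(a, theta) <= max(x, 0.0)]
        best_on_grid = min(terminal_cost(a, x, z, alpha, theta) for a in feasible)
        cost = terminal_cost(a_star, x, z, alpha, theta)
        assert cost <= best_on_grid + 1e-12
        print(f"x={x:.2f}  a*(x)={a_star:.4f}  cost={cost:.4f}")
```

With these hypothetical parameters the retention stays at $\theta/(1+\theta)$ for a large surplus and rises toward 1 as the surplus shrinks, which is the state dependence behind the non-myopic policy.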

References

Albrecher, H., Beirlant, J., and Teugels, J. L. (2017). Reinsurance: Actuarial and Statistical Aspects. John Wiley & Sons, Hoboken, NJ.

Albrecher, H. and Thonhauser, S. (2009). Optimality results for dividend problems in insurance. Revista de la Real Academia de Ciencias Exactas, Fisicas y Naturales. Serie A. Matematicas, 103(2):295–320.

Arrow, K. (1963). Uncertainty and the welfare economics of medical care. The American Economic Review, 53(5):941–973.

Asienkiewicz, H. and Jaśkiewicz, A. (2017). A note on a new class of recursive utilities in Markov decision processes. Applicationes Mathematicae, 44(2):149–161.

Azcue, P. and Muler, N. (2014). Stochastic Optimization in Insurance: A Dynamic Programming Approach. Springer, New York.

Bäuerle, N. and Glauner, A. (2018). Optimal risk allocation in reinsurance networks. Insurance: Mathematics and Economics, 82:37–47.

Bäuerle, N. and Glauner, A. (2020a). Distributionally robust Markov decision processes and their connection to risk measures. arXiv:2007.13103.

Bäuerle, N. and Glauner, A. (2020b). Markov decision processes with recursive risk measures. arXiv:2010.07220.

Bäuerle, N. and Glauner, A. (2020c). Minimizing spectral risk measures applied to Markov decision processes. arXiv:2012.04521.

Bäuerle, N. and Jaśkiewicz, A. (2017). Optimal dividend payout model with risk sensitive preferences. Insurance: Mathematics and Economics, 73:82–93.

Bäuerle, N. and Jaśkiewicz, A. (2018). Stochastic optimal growth model with risk sensitive preferences. Journal of Economic Theory, 173:181–200.

Bäuerle, N. and Müller, A. (2006). Stochastic orders and risk measures: Consistency and bounds. Insurance: Mathematics and Economics, 38(1):132–148.

Bäuerle, N. and Rieder, U. (2011). Markov Decision Processes with Applications to Finance. Springer-Verlag, Berlin Heidelberg.

Bogachev, V. I. (2007). Measure Theory, volume I. Springer-Verlag, Berlin Heidelberg.

Borch, K. (1960). An attempt to determine the optimum amount of stop loss reinsurance. In Transactions of the XVI International Congress of Actuaries, volume I, pages 597–610.

Cai, J. and Tan, K. S. (2007). Optimal retention for a stop-loss reinsurance under the VaR and CTE risk measures. ASTIN Bulletin, 37(1):93–112.

Cai, J., Tan, K. S., Weng, C., and Zhang, Y. (2008). Optimal reinsurance under VaR and CTE risk measures. Insurance: Mathematics and Economics, 43(1):185–196.

Chen, S., Li, Z., and Li, K. (2010). Optimal investment–reinsurance policy for an insurance company with VaR constraint. Insurance: Mathematics and Economics, 47(2):144–153.

Chi, Y. and Tan, K. S. (2013). Optimal reinsurance with general premium principles. Insurance: Mathematics and Economics, 52(2):180–189.

Cui, W., Yang, J., and Wu, L. (2013). Optimal reinsurance minimizing the distortion risk measure under general reinsurance premium principles. Insurance: Mathematics and Economics, 53(1):74–85.

Dhaene, J., Kukush, A., Linders, D., and Tang, Q. (2012). Remarks on quantiles and distortion risk measures. European Actuarial Journal, 2(2):319–328.

Hernández-Lerma, O. and Lasserre, J. B. (1999). Further Topics on Discrete-Time Markov Control Processes. Springer-Verlag, New York.

Liu, J., Yiu, K.-F. C., Siu, T. K., and Ching, W.-K. (2013). Optimal investment-reinsurance with dynamic risk constraint and regime switching. Scandinavian Actuarial Journal, 2013(4):263–285.

Lo, A. (2017). A Neyman-Pearson perspective on optimal reinsurance with constraints. ASTIN Bulletin, 47(2):467–499.

Müller, A. and Stoyan, D. (2002). Comparison Methods for Stochastic Models and Risks. John Wiley & Sons, Chichester.

Pichler, A. (2013). The natural Banach space for version independent risk measures. Insurance: Mathematics and Economics, 53(2):405–415.

Rüschendorf, L. (2013). Mathematical Risk Analysis: Dependence, Risk Bounds, Optimal Allocations and Portfolios. Springer-Verlag, Berlin Heidelberg.

Schäl, M. (2004). On discrete-time dynamic programming in insurance: Exponential utility and minimizing the ruin probability. Scandinavian Actuarial Journal, 2004(3):189–210.

Schmidli, H. (2008). Stochastic Control in Insurance. Springer-Verlag, London.

Sereda, E. N., Bronshtein, E. M., Rachev, S. T., Fabozzi, F. J., Sun, W., and Stoyanov, S. V. (2010). Distortion risk measures in portfolio optimization. In Guerard, J. B., editor, Handbook of Portfolio Construction, chapter 25, pages 649–673. Springer, New York.

Zhuang, S. C., Weng, C., Tan, K. S., and Assa, H. (2016). Marginal indemnification function formulation for optimal reinsurance.