Adjusted Expected Shortfall

Abstract

We introduce and study the main properties of a class of convex risk measures that refine Expected Shortfall by simultaneously controlling the expected losses associated with different portions of the tail distribution. The corresponding adjusted Expected Shortfalls quantify risk as the minimum amount of capital that has to be raised and injected into a financial position $X$ to ensure that Expected Shortfall $\mathrm{ES}_p(X)$ does not exceed a pre-specified threshold $g(p)$ for every probability level $p \in [0,1]$. Through the choice of the benchmark risk profile $g$ one can tailor the risk assessment to the specific application of interest. We devote special attention to the study of risk profiles defined by the Expected Shortfall of a benchmark random loss, in which case our risk measures are intimately linked to second-order stochastic dominance.


Matteo Burzoni

Department of Mathematics, University of Milan, Italy

Cosimo Munari

Center for Finance and Insurance and Swiss Finance Institute, University of Zurich, Switzerland

Ruodu Wang

Department of Statistics and Actuarial Science, University of Waterloo, Canada

July 20, 2020

In this paper we introduce and discuss the main properties of a new class of risk measures based on Expected Shortfall. Following the seminal paper by Artzner et al. (1999), we view a risk measure as a capital requirement rule. More precisely, we quantify risk as the minimal amount of capital that has to be raised and invested in a pre-specified financial instrument (which is typically taken to be risk free) to confine future losses within a pre-specified acceptable level of security. Value at Risk (VaR) and Expected Shortfall (ES) are the most prominent examples of risk measures in the above sense. If we fix a probability level $p \in [0,1]$ and model the future value of a financial position by a random variable $X$, then the VaR and ES of $X$ at level $p$ are respectively given by

$$\mathrm{VaR}_p(X) := \begin{cases} \inf\{x \in \mathbb{R} \mid \mathbb{P}(X \le x) \ge p\} & \text{if } p \in (0,1], \\ \operatorname{ess\,inf} X & \text{if } p = 0, \end{cases} \qquad \mathrm{ES}_p(X) := \begin{cases} \dfrac{1}{1-p} \displaystyle\int_p^1 \mathrm{VaR}_q(X)\,\mathrm{d}q & \text{if } p \in [0,1), \\ \operatorname{ess\,sup} X & \text{if } p = 1. \end{cases}$$

Here, we have adopted the convention to assign positive values to losses. In particular, note that $\mathrm{VaR}_p(X)$ is nothing but the (left) $p$-quantile of $X$. In line with our interpretation, the risk measures $\mathrm{VaR}_p(X)$ and $\mathrm{ES}_p(X)$ represent the minimal amount of cash that has to be raised and injected into $X$ in order to ensure the following target solvency condition (for $0 < p < 1$):

$$\mathrm{VaR}_p(X) \le 0 \iff \mathbb{P}(X \le 0) \ge p,$$

$$\mathrm{ES}_p(X) \le 0 \iff \int_p^1 \mathrm{VaR}_q(X)\,\mathrm{d}q \le 0 \overset{F_X \text{ is continuous}}{\iff} \mathbb{E}[X \mid X \ge \mathrm{VaR}_p(X)] \le 0,$$

where $F_X$ is the cumulative distribution function of $X$. In words, the VaR solvency condition requires that the loss probability of $X$ is capped by $1-p$, whereas the ES solvency condition requires that there is no loss on average beyond the (left) $p$-quantile of $X$. In the banking sector, the Basel Committee has recently decided to move from VaR at level 99% to ES at level 97.5% for the measurement of financial market risk. In the European insurance sector, VaR at level 99.5% is the reference risk measure in the Solvency II framework while ES at level 99% is the reference risk measure in the Swiss Solvency Test framework.

Ruodu Wang is supported by the Natural Sciences and Engineering Research Council of Canada (RGPIN-2018-03823, RGPAS-2018-522590).

In the past 20 years, an impressive body of research has investigated the relative merits and drawbacks of VaR and ES at both a theoretical and a practical level. This investigation led to a better understanding of the properties of the above two risk measures but also triggered many brand new research questions that go beyond VaR and ES themselves. We refer to early work on ES in Acerbi and Tasche (2002), Acerbi (2002), Frey and McNeil (2002), and Rockafellar and Uryasev (2002) (where ES was called Conditional VaR). Some recent contributions to the broad investigation on whether and to what extent VaR and ES meet regulatory objectives are Koch-Medina and Munari (2016), Embrechts et al. (2018), Weber (2018), Bignozzi et al. (2020), and Baes et al. (2020). For robustness problems concerning VaR and ES, see, e.g., Cont et al. (2010) and Krätschmer et al. (2014), and for their backtesting, see, e.g., Ziegel (2016), Du and Escanciano (2017), and Kratz et al. (2018).

A fundamental difference between VaR and ES is that, by definition, VaR is completely blind to the behaviour of the loss tail beyond the reference quantile whereas ES depends on the whole tail beyond it. It is often argued that this difference, together with the convexity property, makes ES a superior risk measure compared to VaR. In fact, this is the main motivation that led the Basel Committee to shift from VaR to ES in their market risk framework; see BCBS (2012). In the spirit of Bignozzi et al. (2020) and Mao and Wang (2020), the aim of this paper is to enhance our understanding of how tail risk is captured by ES. More specifically, being essentially an average beyond a given quantile, ES can only provide an aggregate estimation of risk which, by its very definition, does not distinguish across different tail behaviors with the same mean.
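The last point can be checked on a toy example. The following is a minimal sketch, assuming equally weighted empirical samples; the helper `es` and the two samples `loss_a` and `loss_b` are hypothetical and only serve to illustrate the definition of $\mathrm{ES}_p$ recalled above. Both samples have the same ES at level 80%, although their tails beyond the 90% level differ.

```python
def es(sample, p):
    """Exact ES_p of the empirical distribution of `sample` (equal weights).

    Uses ES_p = 1/(1-p) * integral of VaR_q over q in (p, 1), where VaR_q is
    the left q-quantile, and ES_1 = largest observation (ess sup).
    """
    xs = sorted(sample)
    n = len(xs)
    if p >= 1:
        return xs[-1]
    total = 0.0
    for k in range(1, n + 1):
        # The left quantile VaR_q equals xs[k-1] for q in ((k-1)/n, k/n].
        lo = max(p, (k - 1) / n)
        hi = k / n
        if hi > lo:
            total += xs[k - 1] * (hi - lo)
    return total / (1 - p)


# Two hypothetical loss samples with identical average loss beyond the 80% level.
loss_a = [0] * 8 + [2, 2]
loss_b = [0] * 8 + [1, 3]

es_a_80, es_b_80 = es(loss_a, 0.8), es(loss_b, 0.8)  # both close to 2
es_a_90, es_b_90 = es(loss_a, 0.9), es(loss_b, 0.9)  # close to 2 and 3
```

A single ES level cannot separate the two samples, whereas looking at a second, higher level does.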
We show how to capture this dimension of tail risk by introducing a new class of convex risk measures that includes ES as a particular case.

To best describe this new class, we start from the following simple example. Consider two normally distributed random variables $X_i \sim N(\mu_i, \sigma_i^2)$, with $\mu_1 = 1$, $\mu_2 = 0$, $\sigma_1 = 0.\dots$, $\sigma_2 = 0.5$. For every probability level $p \in (0,1)$ we have

$$\mathrm{ES}_p(X_i) = \mu_i + \sigma_i \frac{\varphi(\Phi^{-1}(p))}{1-p},$$

where $\varphi$ and $\Phi$ are, respectively, the probability density and the cumulative distribution function of a standard normal random variable. For $p = 99\%$ the ES of both random variables is approximately equal to 1.33. In Figure 1 we plot the two distribution functions. Although they have the same ES, the two risks are quite different, mainly because of their different variance: the potential losses of $X_1$ tend to accumulate around its mean whereas those of $X_2$ are more dispersed and can be significantly higher (compare the tails in Figure 2). A closer look at the ES profile of both random variables, i.e., at the function $p \mapsto \mathrm{ES}_p(X_i)$, shows that the ES profile of $X_1$ is more stable than that of $X_2$ (see Figure 2).

Figure 1: Above: Distribution of $X_1$ (blue) and $X_2$ (black). The vertical lines correspond to the respective 99% quantile. Below: Tails of $X_1$ (blue) and $X_2$ (black) beyond the 99% quantile.

In order to distinguish the above two risks, we introduce a comparative criterion that looks at the tail in a different way compared to ES. More specifically, we want a risk measure that is sensitive to changes in (any pre-specified portion of) the ES profile $p \mapsto \mathrm{ES}_p(X)$ of a random variable $X$. To this end, we "adjust" ES by considering the new risk measure

$$\mathrm{ES}^g(X) := \sup_{p \in [0,1]} \{\mathrm{ES}_p(X) - g(p)\},$$

where $g : [0,1] \to (-\infty, \infty]$ is a given increasing function. The choice of $g$ allows one to take into account any desired portion of the ES profile of $X$. The above adjusted ES is a monetary risk measure in the sense of Artzner et al. (1999). Indeed, the quantity $\mathrm{ES}^g(X)$ can be interpreted as the minimal amount of cash that has to be raised and injected into $X$ in order to ensure the following target solvency condition:

$$\mathrm{ES}^g(X) \le 0 \iff \mathrm{ES}_p(X) \le g(p) \text{ for every } p \in [0,1].$$

To return to the above example, a simple way to distinguish $X_1$ and $X_2$ while, at the same time, focusing on average losses beyond the 99% quantile is to consider, e.g., the function

$$g(p) = \begin{cases} 0 & \text{if } p \in [0, 0.99], \\ 0.1 & \text{if } p \in (0.99, 0.9975], \\ \infty & \text{if } p \in (0.9975, 1]. \end{cases}$$

Figure 2: ES profile of $X_1$ (blue) and $X_2$ (black) for $p \ge 0.99$.

This yields

$$\mathrm{ES}^g(X_i) = \max\{\mathrm{ES}_{0.99}(X_i),\ \mathrm{ES}_{0.9975}(X_i) - 0.1\} = \begin{cases} \mathrm{ES}_{0.99}(X_1) \approx 1.33 & \text{for } i = 1, \\ \mathrm{ES}_{0.9975}(X_2) - 0.1 \approx 1.45 & \text{for } i = 2. \end{cases} \tag{1}$$

The focus of $\mathrm{ES}^g$ is still on the tail beyond the 99% quantile. However, the risk measure $\mathrm{ES}^g$ is able to detect the heavier tail of $X_2$ and penalize it with a higher capital requirement. This is because $\mathrm{ES}^g$ is additionally sensitive to the tail beyond the 99.75% quantile and penalizes any risk whose average loss on this far region of the tail is too large.

For the sake of illustration, we use a similar target risk profile to compare the behavior of the classical ES and the adjusted ES on real data. We collect the S&P 500 and the NASDAQ Composite indices daily log-returns (using closing prices) from January 01, 1999 to June 30, 2020. Each index has 5406 data points (publicly available from Yahoo Finance). We simply produce empirical values of the risk measures using a rolling window of one year without assuming any time-series models. More specifically, at each day (starting from Jan 2000), the risk measures are estimated from the empirical distribution of the negative log-return (log-loss) in the preceding year. We consider

$$g(p) = \begin{cases} 0 & \text{if } p \in [0, 0.95], \\ 0.01 & \text{if } p \in (0.95, 0.99], \\ \infty & \text{if } p \in (0.99, 1], \end{cases}$$

which yields

$$\mathrm{ES}^g(X) = \max\{\mathrm{ES}_{0.95}(X),\ \mathrm{ES}_{0.99}(X) - 0.01\},$$

similar to (1) in a different context. The numbers 0.95, 0.99, and 0.01 that appear in $g$ are chosen for the ease of illustration only. The 20-year empirical values of ES at level 95% and $\mathrm{ES}^g$, as well as those of VaR at level 95%, are plotted in Figure 3.

Figure 3: Empirical $\mathrm{ES}_{0.95}$, $\mathrm{ES}^g$ and $\mathrm{VaR}_{0.95}$ for S&P 500 and NASDAQ (two panels, "S&P 500 Index daily log-return" and "NASDAQ Composite daily log-return", 2000 to 2020).

As we can see from the numerical results on both S&P 500 and NASDAQ, the estimated values of $\mathrm{ES}^g$ and the reference ES agree with each other during most of the considered time horizon. However, during periods of significant financial stress, such as the dot-com bubble in 2000, the subprime crisis in 2008, and the COVID-19 crisis in early 2020, $\mathrm{ES}^g$ is visibly larger than the reference ES. This illustrates that $\mathrm{ES}^g$ may capture tail risk in a more appropriate way than ES, especially under financial stress.

As illustrated above, a key feature of adjusted ES is the flexibility in the choice of the target risk profile $g$. Indeed, the same random loss can be considered more or less relevant depending on a variety of factors, including the availability of hedging strategies or other risk mitigation tools in the underlying business sector. The choice of $g$ can therefore be tailored to the particular area of application by assigning different weights to different portions of the reference tail. We illustrate this feature by discussing a simple stylized example in the context of cyber risk. Differently from other operational risks, cyber risk has a strong geographical component. The empirical study Biener et al. (2015), which takes into account 22,075 incidents reported between March 1971 and September 2009, reveals that "Northern America has some of the lowest mean cyber risk and non-cyber risk losses, whereas Europe and Asia have much higher average losses despite Northern American companies experience more than twice as many (51.9 per cent) cyber risk incidents than European firms (23.2 per cent) and even more than twice as many as firms located on other continents". A possible reason is that North American companies may be better equipped to protect themselves against such events.
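The flexibility in the choice of $g$ can be made concrete for step profiles such as the one in (1). Since $p \mapsto \mathrm{ES}_p(X)$ is increasing, the supremum over each interval on which $g$ is constant is attained at the right endpoint, so the adjusted ES reduces to a maximum over finitely many levels. The following is a minimal sketch under that observation; the function names `es_normal` and `adjusted_es_step` are hypothetical, and only the random variable $X_2 \sim N(0, 0.5^2)$ from the example is evaluated.

```python
from statistics import NormalDist


def es_normal(mu, sigma, p):
    """Closed-form ES at level p in (0, 1) of a normal loss N(mu, sigma^2)."""
    std = NormalDist()
    z = std.inv_cdf(p)
    return mu + sigma * std.pdf(z) / (1 - p)


def adjusted_es_step(es_at, steps):
    """Adjusted ES for a step profile g.

    `steps` is a list of (level, threshold) pairs with increasing levels and
    thresholds: g equals the threshold on the interval ending at that level
    and is +infinity beyond the last level.
    """
    return max(es_at(level) - threshold for level, threshold in steps)


# Profile from (1): g = 0 on [0, 0.99], 0.1 on (0.99, 0.9975], infinity afterwards.
profile = [(0.99, 0.0), (0.9975, 0.1)]

es_99_x2 = es_normal(0.0, 0.5, 0.99)
adj_x2 = adjusted_es_step(lambda p: es_normal(0.0, 0.5, p), profile)
```

Under these assumptions `es_99_x2` is close to 1.33 and `adj_x2` is close to 1.45, in line with the values reported in (1) for $X_2$.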
Cyber risk cannot be properly managed by a simple frequency-severity analysis. In the qualitative analysis of Refsdal et al. (2015), many additional factors are identified including:

• Ease of discovery: How easy is it for a group of attackers to discover a given vulnerability?
• Ease of exploit: How easy is it for a group of attackers to actually exploit a given vulnerability?
• Awareness: How well known is a given vulnerability to a group of attackers?
• Intrusion detection: How likely is an exploit to be detected?

The answers may very well depend on the specific sector if not on the specific firms under consideration. The choice of different reference risk profiles $g$ across companies might be a way to apply the theory of risk measures in the spirit of Artzner et al. (1999) to the rather complex analysis of this type of risk. For example, it would be possible to set

$$g(p) = \begin{cases} \mathrm{ES}_{0.99}(Z_1) & \text{if } p \in [0, 0.99], \\ \mathrm{ES}_{0.999}(Z_2) & \text{if } p \in (0.99, 0.999], \\ \mathrm{ES}_{0.999999}(Z_3) & \text{if } p \in (0.999, 0.999999], \\ \infty & \text{otherwise}, \end{cases}$$

where $Z_1$, $Z_2$, $Z_3$ are suitable benchmark random losses. The resulting adjusted ES is

$$\mathrm{ES}^g(X) = \max\{\mathrm{ES}_{0.99}(X) - \mathrm{ES}_{0.99}(Z_1),\ \mathrm{ES}_{0.999}(X) - \mathrm{ES}_{0.999}(Z_2),\ \mathrm{ES}_{0.999999}(X) - \mathrm{ES}_{0.999999}(Z_3)\}.$$

The associated target solvency condition is given by

$$\mathrm{ES}^g(X) \le 0 \iff \begin{cases} \mathrm{ES}_{0.99}(X) \le \mathrm{ES}_{0.99}(Z_1), \\ \mathrm{ES}_{0.999}(X) \le \mathrm{ES}_{0.999}(Z_2), \\ \mathrm{ES}_{0.999999}(X) \le \mathrm{ES}_{0.999999}(Z_3). \end{cases}$$

The choice of $g$ should be motivated by specific cyber risk events (see Refsdal et al. (2015) for a categorization of likelihood/severity for different cyber attacks): the one in a hundred times event could be the malfunctioning of the server, the one in a thousand times event the stealing of the profile data of the clients, the one in a million times event the stealing of the credit card details of the customers. Note that it is possible to choose a single benchmark random loss or a different benchmark random loss for each considered incident.
This choice could also be company specific so as to reflect the company's ability to react to the different types of cyber attacks. This is in line with Biener et al. (2015), which says that "Regarding size (of the average loss per event), we observe a U-shaped relation, that is, smaller and larger firms have higher costs than medium-sized. Possibly, smaller firms are less aware of and less able to deal with cyber risk, while large firms may suffer from complexity".

The goal of this paper is to introduce the class of adjusted ES's and discuss their main theoretical properties. In Section 2 we provide a formal definition and a useful representation of adjusted ES's together with a broad discussion on some basic properties. A special interesting case is when the risk profile $g$ is given by the ES of a benchmark random variable. We focus on this situation in Section 3 and show that such special adjusted ES's are strongly linked with second-order stochastic dominance. More precisely, they coincide with the monetary risk measures for which acceptability is defined in terms of carrying less risk, in the sense of second-order stochastic dominance, than a given benchmark random variable. Despite the importance of such a concept, we are not aware of earlier attempts to explicitly construct monetary risk measures based on second-order stochastic dominance. In Section 4 we focus on a variety of optimization problems featuring risk functionals either in the objective function or in the optimization domain and study the existence of optimal solutions in the presence of this type of risk measures. In each case of interest we are able to establish explicit optimal solutions.

Throughout the paper we fix an atomless probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and denote by $\mathcal{X}$ the space of (equivalence classes with respect to $\mathbb{P}$-almost sure equality of) $\mathbb{P}$-integrable random variables.
For any two random variables $X, Y \in \mathcal{X}$ we write $X \sim Y$ whenever $X$ and $Y$ are identically distributed. We adopt the convention that positive values of $X \in \mathcal{X}$ correspond to losses. Recall that VaR and ES are defined as

$$\mathrm{VaR}_p(X) := \begin{cases} \inf\{x \in \mathbb{R} \mid \mathbb{P}(X \le x) \ge p\} & \text{if } p \in (0,1], \\ \operatorname{ess\,inf} X & \text{if } p = 0, \end{cases} \qquad \mathrm{ES}_p(X) := \begin{cases} \dfrac{1}{1-p} \displaystyle\int_p^1 \mathrm{VaR}_q(X)\,\mathrm{d}q & \text{if } p \in [0,1), \\ \operatorname{ess\,sup} X & \text{if } p = 1. \end{cases}$$

The focus of the paper is on the following class of risk measures. Here and in the sequel, we denote by $\mathcal{G}$ the set of all functions $g : [0,1] \to (-\infty, \infty]$ that are increasing (in the non-strict sense) and not identically $\infty$. Moreover, we use the convention $\infty - \infty = -\infty$.

Definition 2.1.

Consider a function $g \in \mathcal{G}$ and define the set

$$\mathcal{A}_g := \{X \in \mathcal{X} \mid \forall p \in [0,1],\ \mathrm{ES}_p(X) \le g(p)\}.$$

The functional $\mathrm{ES}^g : \mathcal{X} \to (-\infty, \infty]$ defined by

$$\mathrm{ES}^g(X) := \inf\{m \in \mathbb{R} \mid X - m \in \mathcal{A}_g\}$$

is called the $g$-adjusted Expected Shortfall ($g$-adjusted ES).

To best appreciate the financial meaning of the above risk measures, it is useful to consider the ES profile associated with a random variable $X \in \mathcal{X}$, i.e., the function $p \mapsto \mathrm{ES}_p(X)$. From this perspective, the function $g$ in the preceding definition can be interpreted as a threshold between acceptable (safe) and unacceptable (risky) ES profiles. In this sense, the set $\mathcal{A}_g$ consists of all the positions with acceptable ES profile and the quantity $\mathrm{ES}^g(X)$ represents the minimal amount of capital that has to be injected into $X$ in order to align its ES profile with the chosen acceptability profile. For this reason, we will sometimes refer to $g$ as the target risk profile.

The above risk measure is a natural extension of ES. Indeed, if for each $p \in [0,1]$ we define

$$g(q) = \begin{cases} 0 & \text{if } q \in [0,p], \\ \infty & \text{if } q \in (p,1], \end{cases}$$

we obtain that $\mathrm{ES}_p(X) = \mathrm{ES}^g(X)$ for every random variable $X \in \mathcal{X}$.

The next proposition highlights an equivalent but operationally preferable formulation of adjusted ES's which also justifies the chosen terminology.

Proposition 2.2.

For every risk profile $g \in \mathcal{G}$ and for every $X \in \mathcal{X}$ we have

$$\mathrm{ES}^g(X) = \sup_{p \in [0,1]} \{\mathrm{ES}_p(X) - g(p)\}.$$

Proof. Fix $X \in \mathcal{X}$ and note that for every $m \in \mathbb{R}$ the condition $X - m \in \mathcal{A}_g$ is equivalent to

$$\mathrm{ES}_p(X) - m = \mathrm{ES}_p(X - m) \le g(p)$$

for every $p \in [0,1]$. Note that, for $p = 1$, both sides could be equal to $\infty$. However, in view of our convention $\infty - \infty = -\infty$, the above inequality holds if and only if $m \ge \mathrm{ES}_p(X) - g(p)$ for every $p \in [0,1]$. Taking the infimum over all such $m$ yields the claim.

Remark 2.3. (i) The above definition is reminiscent of the definition of Loss Value at Risk (LVaR) in Bignozzi et al. (2020). In that case, one takes an increasing and right-continuous function $\alpha : [0, \infty) \to [0,1]$ (the so-called benchmark loss distribution) and defines the acceptance set by

$$\mathcal{A}_\alpha := \{X \in \mathcal{X} \mid \mathbb{P}(X > x) \le \alpha(x),\ \forall x \ge 0\}.$$

The corresponding LVaR is given by

$$\mathrm{LVaR}_\alpha(X) := \inf\{m \in \mathbb{R} \mid X - m \in \mathcal{A}_\alpha\}.$$

The quantity $\mathrm{LVaR}_\alpha(X)$ represents the minimal amount of capital that has to be injected into the position $X$ in order to ensure that, for each loss level $x$, the probability of exceeding a loss of size $x$ is controlled by $\alpha(x)$. According to Proposition 3.6 in the cited paper, we can equivalently write

$$\mathrm{LVaR}_\alpha(X) = \sup_{p \in [0,1]} \{\mathrm{VaR}_p(X) - \alpha^{-1}(p)\}, \tag{2}$$

where $\alpha^{-1}$ is the right inverse of $\alpha$. This highlights the similarity with adjusted ES's.

(ii) It is clear, see also below, that $\mathrm{ES}^g$ is monotonic with respect to second-order stochastic dominance. This implies that $\mathrm{ES}^g$ belongs to the class of consistent risk measures as defined in Mao and Wang (2020). In fact, Theorem 3.1 in that paper shows that any consistent risk measure can be expressed as an infimum of $\mathrm{ES}^g$'s over a suitable class of risk profiles $g$. In this sense, adjusted ES's can be seen as the building blocks for risk measures that are consistent with second-order stochastic dominance. This class is very large and includes all law-invariant convex risk measures.

The remainder of this section is devoted to discussing some basic properties of adjusted ES's. It follows immediately from our definition that every adjusted ES is a monetary risk measure in the sense of Föllmer and Schied (2016), i.e., is monotone and cash additive. The other properties listed below are automatically inherited from the corresponding properties of ES.
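The equivalence between Definition 2.1 and the supremum formula in Proposition 2.2, as well as cash additivity, can be checked numerically on a small example. The following is a minimal sketch, assuming an equally weighted empirical sample and a finite grid of levels that contains all jump points of $g$ (so that the grid maximum is exact for this particular step profile); the names `es`, `adjusted_es`, `acceptable` and the sample values are hypothetical.

```python
import math


def es(sample, p):
    """Exact ES_p of the empirical distribution of `sample` (equal weights)."""
    xs = sorted(sample)
    n = len(xs)
    if p >= 1:
        return xs[-1]
    total = 0.0
    for k in range(1, n + 1):
        lo = max(p, (k - 1) / n)  # left quantile is xs[k-1] on ((k-1)/n, k/n]
        hi = k / n
        if hi > lo:
            total += xs[k - 1] * (hi - lo)
    return total / (1 - p)


def adjusted_es(sample, g, grid):
    """Supremum formula of Proposition 2.2, restricted to a finite grid."""
    return max(es(sample, p) - g(p) for p in grid)


def acceptable(sample, g, grid):
    """Membership in the acceptance set of Definition 2.1, checked on the grid."""
    return all(es(sample, p) <= g(p) + 1e-12 for p in grid)


def g(p):
    # Hypothetical step profile: 0 on [0, 0.5], 0.25 on (0.5, 0.75], infinity afterwards.
    if p <= 0.5:
        return 0.0
    return 0.25 if p <= 0.75 else math.inf


grid = [i / 100 for i in range(101)]
sample = [-1.0, 0.0, 1.0, 2.0]

m = adjusted_es(sample, g, grid)  # max{ES_0.5, ES_0.75 - 0.25} = max{1.5, 1.75}
ok_at_m = acceptable([x - m for x in sample], g, grid)
ok_below_m = acceptable([x - (m - 0.01) for x in sample], g, grid)
shifted = adjusted_es([x + 3.0 for x in sample], g, grid)
```

Injecting exactly `m` makes the position acceptable while any smaller amount does not, and shifting the sample by a constant shifts the adjusted ES by the same constant.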
For any random variables $X, Y \in \mathcal{X}$ we say that $X$ dominates $Y$ with respect to second-order stochastic dominance, written $X \ge_{SSD} Y$, whenever $\mathbb{E}[u(-X)] \ge \mathbb{E}[u(-Y)]$ for every increasing and concave function $u : \mathbb{R} \to \mathbb{R}$. In the language of utility theory, this means that $X$ is preferred to $Y$ by every risk-averse agent (recall that positive values of $X$ and $Y$ represent losses).

Proposition 2.4.

For every risk profile $g \in \mathcal{G}$ the risk measure $\mathrm{ES}^g$ satisfies the following properties:

(1) monotonicity: $\mathrm{ES}^g(X) \le \mathrm{ES}^g(Y)$ for all $X, Y \in \mathcal{X}$ such that $X \le Y$.
(2) cash additivity: $\mathrm{ES}^g(X + m) = \mathrm{ES}^g(X) + m$ for all $X \in \mathcal{X}$ and $m \in \mathbb{R}$.
(3) law invariance: $\mathrm{ES}^g(X) = \mathrm{ES}^g(Y)$ for all $X, Y \in \mathcal{X}$ such that $X \sim Y$.
(4) convexity: $\mathrm{ES}^g(\lambda X + (1-\lambda) Y) \le \lambda\, \mathrm{ES}^g(X) + (1-\lambda)\, \mathrm{ES}^g(Y)$ for all $X, Y \in \mathcal{X}$ and $\lambda \in [0,1]$.
(5) consistency with $\ge_{SSD}$: $\mathrm{ES}^g(X) \le \mathrm{ES}^g(Y)$ for all $X, Y \in \mathcal{X}$ such that $X \ge_{SSD} Y$.
(6) normalization: $\mathrm{ES}^g(0) = 0$ if and only if $g(0) = 0$.

It is well known that, in addition to convexity, ES satisfies positive homogeneity. This qualifies it as a coherent risk measure in the sense of Artzner et al. (1999). In the next proposition we show that $\mathrm{ES}^g$ satisfies positive homogeneity, i.e.

$$\mathrm{ES}^g(\lambda X) = \lambda\, \mathrm{ES}^g(X) \text{ for all } X \in \mathcal{X} \text{ and } \lambda \in (0, \infty),$$

only in the case where it coincides with some ES. In other words, with the exception of ES, the class of adjusted ES's consists of monetary risk measures that are convex but not coherent.

Proposition 2.5.

For every risk profile $g \in \mathcal{G}$ the following statements are equivalent:

(a) $\mathrm{ES}^g$ is positively homogeneous.
(b) $g(0) = 0$ and $g(p) \in (0, \infty)$ for at most one $p \in (0,1]$.
(c) $\mathrm{ES}^g = \mathrm{ES}_p$ where $p = \sup\{q \in [0,1] \mid g(q) = 0\}$.

Proof. "(a) ⇒ (b)": Since $\mathrm{ES}^g$ is positively homogeneous we have

$$\lambda g(0) = -\lambda\, \mathrm{ES}^g(0) = -\mathrm{ES}^g(\lambda 0) = -\mathrm{ES}^g(0) = g(0)$$

for every $\lambda \in (0, \infty)$. As $g(0) < \infty$ by our assumptions on the class $\mathcal{G}$, we must have $g(0) = 0$. Now, assume by way of contradiction that $0 < g(p_1) \le g(p_2) < \infty$ for some $0 < p_1 < p_2 \le 1$. Take now $q \in (p_1, p_2)$ and $b \in (0, g(p_1))$ and set

$$a = \min\left\{ -\frac{(1-q)b}{p_2 - p_1},\ \inf_{p \in [0, p_1)} \frac{(1-p)g(p) - b(1-q)}{q - p} \right\}.$$

Note that $a \in (-\infty, 0)$ and take $X \in \mathcal{X}$ such that

$$F_X(x) = \begin{cases} 0 & \text{if } x \in (-\infty, a), \\ q & \text{if } x \in [a, b), \\ 1 & \text{if } x \in [b, \infty). \end{cases}$$

Note that, for every $p \in [0, p_1)$, the definition of $a$ implies

$$\frac{(1-p)g(p) - b(1-q)}{q - p} \ge a.$$

Moreover, for every $p \in [p_1, q)$, the choice of $b$ implies

$$\frac{(1-p)g(p) - b(1-q)}{q - p} \ge \frac{(1-p)g(p_1) - b(1-q)}{q - p} \ge \frac{(1-p)b - b(1-q)}{q - p} = b \ge a.$$

As a result, for every $p \in [0, q)$ we obtain

$$\mathrm{ES}_p(X) = \frac{a(q-p) + b(1-q)}{1-p} \le g(p).$$

Similarly, for every $p \in [q, 1]$ we easily see that

$$\mathrm{ES}_p(X) = b < g(p_1) \le g(q) \le g(p).$$

This yields $\mathrm{ES}^g(X) \le 0$. However, taking $\lambda \in (0, \infty)$ large enough delivers

$$\mathrm{ES}^g(\lambda X) = \sup_{p \in [0,1]} \{\lambda\, \mathrm{ES}_p(X) - g(p)\} \ge \lambda\, \mathrm{ES}_q(X) - g(q) = \lambda b - g(q) > 0,$$

which contradicts positive homogeneity. Hence $g(p) \in (0, \infty)$ can hold for at most one $p$, and thus (b) holds.

"(b) ⇒ (c)": Set $q = \sup\{p \in [0,1] \mid g(p) = 0\}$. Note that $q \in [0,1]$ and that $g(p) = 0$ for every $p \in [0, q)$ and $g(p) = \infty$ for every $p \in (q, 1]$ by assumption. Then, for every $X \in \mathcal{X}$ we get

$$\mathrm{ES}^g(X) = \sup_{p \in [0,q]} \{\mathrm{ES}_p(X) - g(p)\} = \sup_{p \in [0,q]} \mathrm{ES}_p(X) = \mathrm{ES}_q(X)$$

by the continuity of $\mathrm{ES}_p(X)$ in $p$.

"(c) ⇒ (a)": The implication is clear.

Even if positive homogeneity is generally not fulfilled, the following weaker scaling property is satisfied by every adjusted ES because of convexity.

Proposition 2.6.

Consider a risk profile $g \in \mathcal{G}$ with $g(0) = 0$. Then, for every $X \in \mathcal{X}$ the following statements hold:

(i) $\mathrm{ES}^g(\lambda X) \ge \lambda\, \mathrm{ES}^g(X)$ for every $\lambda \in [1, \infty)$.
(ii) $\mathrm{ES}^g(\lambda X) \le \lambda\, \mathrm{ES}^g(X)$ for every $\lambda \in [0, 1]$.

Let $p \in (0,1)$. We say that a functional $\rho : \mathcal{X} \to (-\infty, \infty]$ satisfies the $p$-tail property if $\rho(X) = \rho(Y)$ for all random variables $X, Y \in \mathcal{X}$ such that $\mathrm{VaR}_q(X) = \mathrm{VaR}_q(Y)$ for every $q \in [p, 1]$. The $p$-tail property says that $\rho$ is solely determined by the tail distribution of random losses, and it does not distinguish between two random losses having the same (left) quantiles beyond the level $p$. This property was introduced by Liu and Wang (2020) to formalize the consideration of tail risk of BCBS (2012). For instance, $\mathrm{ES}_p$ satisfies the $p$-tail property, and the risk measure $\mathrm{ES}^g$ in (1) satisfies the $0.99$-tail property. For adjusted ES's, the $p$-tail property can be characterized in terms of $g$.

Proposition 2.7.

Consider a risk profile $g \in \mathcal{G}$. For every $p \in (0,1)$ the following statements are equivalent:

(i) $\mathrm{ES}^g$ satisfies the $p$-tail property.
(ii) $g$ is constant on $(0, p)$.

Proof. To show that (i) implies (ii), assume that $g$ is not constant on $(0, p)$ so that $g(0) < g(r)$ for some $r \in (0, p)$. Without loss of generality we can assume that $g(0) < g(r - \varepsilon)$ for a suitable $\varepsilon > 0$. Set $X = 1$ and $Y = 1_E$ for some event $E \in \mathcal{F}$ satisfying $\mathbb{P}(E) = 1 - r$. Note that $\mathrm{VaR}_q(X) = \mathrm{VaR}_q(Y) = 1$ for every $q \in [p, 1]$. Moreover, we have

$$\mathrm{ES}^g(X) = 1 - g(0), \qquad \mathrm{ES}^g(Y) = \max\left\{ \sup_{q \in [0, r)} \left\{ \frac{1-r}{1-q} - g(q) \right\},\ 1 - g(r) \right\}.$$

Note that $1 - g(r) < 1 - g(0)$. Moreover, we have

$$\sup_{q \in [0, r-\varepsilon]} \left\{ \frac{1-r}{1-q} - g(q) \right\} \le \frac{1-r}{1-(r-\varepsilon)} - g(0) < 1 - g(0),$$

$$\sup_{q \in (r-\varepsilon, r)} \left\{ \frac{1-r}{1-q} - g(q) \right\} \le 1 - g(r - \varepsilon) < 1 - g(0).$$

This shows that $\mathrm{ES}^g(X) \ne \mathrm{ES}^g(Y)$ and, hence, $\mathrm{ES}^g$ fails to satisfy the $p$-tail property.

To show that (ii) implies (i), assume that $g$ is constant on $(0, p)$. Then, we readily see that

$$\mathrm{ES}^g(X) = \max\left\{ \sup_{q \in [0, p)} \{\mathrm{ES}_q(X) - g(0)\},\ \sup_{q \in [p, 1]} \{\mathrm{ES}_q(X) - g(q)\} \right\} = \max\left\{ \mathrm{ES}_p(X) - g(0),\ \sup_{q \in [p, 1]} \{\mathrm{ES}_q(X) - g(q)\} \right\}$$

for every $X \in \mathcal{X}$. This shows that $\mathrm{ES}^g$ satisfies the $p$-tail property.

As a next step, we focus on infimal convolutions of adjusted ES's. Infimal convolutions arise naturally in the study of optimal risk sharing and capital allocation problems and have been extensively investigated in the risk measure literature; see, e.g., Barrieu and El Karoui (2005), Burgert and Rüschendorf (2008), Filipović and Svindland (2008) for results in the convex world and Embrechts et al. (2018) for results outside the convex world.

Definition 2.8.

Let $n \in \mathbb{N}$ and consider the functionals $\rho_1, \dots, \rho_n : \mathcal{X} \to (-\infty, \infty]$. For every $X \in \mathcal{X}$ we set

$$\mathcal{S}_n(X) := \left\{ (X_1, \dots, X_n) \in \mathcal{X}^n \ \middle|\ \sum_{i=1}^n X_i = X \right\}.$$

The map $\mathop{\square}_{i=1}^n \rho_i : \mathcal{X} \to [-\infty, \infty]$ defined by

$$\mathop{\square}_{i=1}^n \rho_i(X) := \inf\left\{ \sum_{i=1}^n \rho_i(X_i) \ \middle|\ (X_1, \dots, X_n) \in \mathcal{S}_n(X) \right\}$$

is called the inf-convolution of $\{\rho_1, \dots, \rho_n\}$. For $n = 2$ we simply write $\rho_1 \,\square\, \rho_2$.

Remark 2.9.

It is not difficult to see that, if $\rho_1, \dots, \rho_n : \mathcal{X} \to (-\infty, \infty]$ are monetary risk measures, then for every $X \in \mathcal{X}$ we have

$$\mathop{\square}_{i=1}^n \rho_i(X) = \inf\{m \in \mathbb{R} \mid X - m \in \mathcal{A}_1 + \dots + \mathcal{A}_n\},$$

where $\mathcal{A}_i = \{X \in \mathcal{X} \mid \rho_i(X) \le 0\}$ is the acceptance set induced by $\rho_i$ for $i \in \{1, \dots, n\}$. This shows that the infimal convolution of monetary risk measures is also a monetary risk measure.

We start by establishing a general inequality for inf-convolutions. More precisely, we show that any inf-convolution of adjusted ES's can be controlled from below by a suitable adjusted ES.

Lemma 2.10.

Let $n \in \mathbb{N}$ and consider the risk profiles $g_1, \dots, g_n \in \mathcal{G}$. For every $X \in \mathcal{X}$ we have

$$\mathop{\square}_{i=1}^n \mathrm{ES}^{g_i}(X) \ge \mathrm{ES}^{\sum_{i=1}^n g_i}(X).$$

Proof.

Clearly, it suffices to focus on the case $n = 2$. For all $Y \in \mathcal{X}$ and $p \in [0,1]$ we have

$$\mathrm{ES}^{g_1}(Y) + \mathrm{ES}^{g_2}(X - Y) \ge \mathrm{ES}_p(Y) - g_1(p) + \mathrm{ES}_p(X - Y) - g_2(p) \ge \mathrm{ES}_p(X) - (g_1 + g_2)(p)$$

by the subadditivity of ES. Taking the supremum over $p$ and then the infimum over $Y$ delivers the desired inequality.

The preceding general inequality can be used to derive a formula for the inf-convolution of an adjusted ES with itself.

Proposition 2.11. Let $n \in \mathbb{N}$ and consider a risk profile $g \in \mathcal{G}$. For every $X \in \mathcal{X}$ we have

$$\mathop{\square}_{i=1}^n \mathrm{ES}^g(X) = \mathrm{ES}^{ng}(X).$$

Proof.

The inequality "$\ge$" follows directly from Lemma 2.10. To show the inequality "$\le$", take an arbitrary $X \in \mathcal{X}$ and observe that

$$\mathrm{ES}^g\left(\frac{1}{n} X\right) = \frac{1}{n} \sup_{p \in [0,1]} \{\mathrm{ES}_p(X) - n g(p)\} = \frac{1}{n} \mathrm{ES}^{ng}(X).$$

As a result, we infer that

$$\mathop{\square}_{i=1}^n \mathrm{ES}^g(X) \le \sum_{i=1}^n \mathrm{ES}^g\left(\frac{1}{n} X\right) = \mathrm{ES}^{ng}(X).$$

This yields the desired inequality and concludes the proof.

The preceding formula allows us to infer that adjusted ES's exhibit limited regulatory arbitrage in the sense of Wang (2016). As a preliminary step, we recall the notion of regulatory arbitrage in the next definition. (The original definition was in the context of bounded positions and finite risk measures.) Recall our convention $\infty - \infty = -\infty$.

Definition 2.12.

Consider a functional $\rho : \mathcal{X} \to (-\infty, \infty]$ and for every $X \in \mathcal{X}$ set

$$\tilde{\rho}(X) := \inf_{n \in \mathbb{N}} \mathop{\square}_{i=1}^n \rho(X).$$

We say that:

(1) $\rho$ is free of regulatory arbitrage if $\rho(X) - \tilde{\rho}(X) = 0$ for every $X \in \mathcal{X}$.
(2) $\rho$ has limited regulatory arbitrage if $\rho(X) - \tilde{\rho}(X) < \infty$ for every $X \in \mathcal{X}$.
(3) $\rho$ has infinite regulatory arbitrage if $\rho(X) - \tilde{\rho}(X) = \infty$ for every $X \in \mathcal{X}$.

It is clear that a risk measure will always exhibit regulatory arbitrage unless it is subadditive. If subadditivity is violated, then the risk measure may incentivize the (internal) reallocation of risk with the only purpose of reaching a lower level of capital requirements. It follows from Proposition 2.5 that adjusted ES's are not subadditive in general and, hence, they will allow for regulatory arbitrage. The next proposition shows that this happens only in a limited form. (The statement for bounded positions follows from Corollary 3.5 in Wang (2016). Note that, in a bounded setting, $\mathrm{ES}^g$ is always finite.)

Proposition 2.13.

Consider a risk profile $g \in \mathcal{G}$ such that $g(0) = 0$. The following statements hold:

(i) $\mathrm{ES}^g(X) - \widetilde{\mathrm{ES}}^g(X) < \infty$ for every $X \in \mathcal{X}$ with $\mathrm{ES}^g(X) < \infty$.
(ii) $\mathrm{ES}^g(X) - \widetilde{\mathrm{ES}}^g(X) = \infty$ for every $X \in \mathcal{X}$ with $\mathrm{ES}^g(X) = \infty$.

Proof. Let $X \in \mathcal{X}$. It follows from Proposition 2.11 that

$$\widetilde{\mathrm{ES}}^g(X) = \inf_{n \in \mathbb{N}} \mathrm{ES}^{ng}(X) \ge \mathrm{ES}_0(X) = \mathbb{E}[X] > -\infty,$$

where we used that $g(0) = 0$. This delivers the desired statements.

The ability to express a risk measure in dual terms as a supremum of affine functionals proves a very useful tool in many applications, notably optimization problems; see the general discussion in Rockafellar (1974) and the results on risk measures in Föllmer and Schied (2016). We conclude this section by establishing a dual representation of adjusted ES's. In what follows we denote by $\mathcal{P}$ the set of probability measures on $(\Omega, \mathcal{F})$ and we use the standard notation for Radon-Nikodym derivatives.

Proposition 2.14.

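Consider a risk profile $g \in \mathcal{G}$. For every $X \in \mathcal{X}$ we have

$$\mathrm{ES}^g(X) = \sup_{\mathbb{Q} \in \mathcal{P}^\infty_{\mathbb{P}}} \left\{ \mathbb{E}_{\mathbb{Q}}[X] - g\left( 1 - \left\| \frac{\mathrm{d}\mathbb{Q}}{\mathrm{d}\mathbb{P}} \right\|_\infty^{-1} \right) \right\},$$

where $\mathcal{P}^\infty_{\mathbb{P}} = \{\mathbb{Q} \in \mathcal{P} \mid \mathbb{Q} \ll \mathbb{P},\ \mathrm{d}\mathbb{Q}/\mathrm{d}\mathbb{P} \in L^\infty\}$.

Proof. For notational convenience, for every $\mathbb{Q} \in \mathcal{P}^\infty_{\mathbb{P}}$ set

$$D(\mathbb{Q}) = \left\{ p \in [0,1] \ \middle|\ \frac{\mathrm{d}\mathbb{Q}}{\mathrm{d}\mathbb{P}} \le \frac{1}{1-p} \right\} = \left[ 1 - \left\| \frac{\mathrm{d}\mathbb{Q}}{\mathrm{d}\mathbb{P}} \right\|_\infty^{-1},\ 1 \right].$$

Take $X \in \mathcal{X}$. The well-known dual representation of ES states that

$$\mathrm{ES}_p(X) = \sup\left\{ \mathbb{E}_{\mathbb{Q}}[X] \ \middle|\ \mathbb{Q} \in \mathcal{P}^\infty_{\mathbb{P}},\ \frac{\mathrm{d}\mathbb{Q}}{\mathrm{d}\mathbb{P}} \le \frac{1}{1-p} \right\}$$

for every $p \in [0,1]$. As a consequence, we obtain

$$\mathrm{ES}^g(X) = \sup_{p \in [0,1]} \left\{ \sup_{\mathbb{Q} \in \mathcal{P}^\infty_{\mathbb{P}},\ p \in D(\mathbb{Q})} \{\mathbb{E}_{\mathbb{Q}}[X] - g(p)\} \right\} = \sup_{\mathbb{Q} \in \mathcal{P}^\infty_{\mathbb{P}}} \left\{ \sup_{p \in D(\mathbb{Q})} \{\mathbb{E}_{\mathbb{Q}}[X] - g(p)\} \right\} = \sup_{\mathbb{Q} \in \mathcal{P}^\infty_{\mathbb{P}}} \left\{ \mathbb{E}_{\mathbb{Q}}[X] - \inf_{p \in D(\mathbb{Q})} g(p) \right\}.$$

It remains to observe that the above infimum equals $g(1 - \|\mathrm{d}\mathbb{Q}/\mathrm{d}\mathbb{P}\|_\infty^{-1})$ by monotonicity of $g$.

3 Adjusting ES via benchmark ES profiles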

In this section we focus on a special class of adjusted ES's for which the target risk profiles are expressed in terms of the ES profile of a reference random loss. As shown below, these special adjusted ES's are intimately linked with monetary risk measures induced by second-order stochastic dominance.

Definition 3.1.

Consider a functional $\rho : \mathcal{X} \to (-\infty, \infty]$.

(1) $\rho$ is called a benchmark-adjusted ES if there exists $Z \in \mathcal{X}$ such that for every $X \in \mathcal{X}$
$$\rho(X) = \sup_{p \in [0,1]} \{ \mathrm{ES}_p(X) - \mathrm{ES}_p(Z) \}.$$

(2) $\rho$ is called an SSD-based risk measure if there exists $Z \in \mathcal{X}$ such that for every $X \in \mathcal{X}$
$$\rho(X) = \inf \{ m \in \mathbb{R} \mid X - m \geq_{\mathrm{SSD}} Z \}.$$

It is clear that benchmark-adjusted ES's are special instances of adjusted ES's for which the target risk profile is defined in terms of the ES profile of a benchmark random loss. The distribution of this random loss may correspond, for example, to the (stressed) historical loss distribution of the underlying position or to a target (risk-class specific) loss distribution. It is also clear that SSD-based risk measures are nothing but monetary risk measures associated with acceptance sets defined through second-order stochastic dominance.

The classical characterization of second-order stochastic dominance in terms of ES can be used to show that benchmark-adjusted ES's coincide with SSD-based risk measures. In addition, we provide a simple characterization of this class of risk measures.
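The two functionals in Definition 3.1 can be compared numerically on toy data. The following is a minimal sketch, not taken from the paper, under the assumption that $X$ and $Z$ are replaced by empirical distributions with the same number $n$ of equally weighted loss observations. In that case $\mathrm{ES}_{k/n}$ is the mean of the $n-k$ largest losses, and it suffices to inspect the levels $p = k/n$ because $(1-p)\mathrm{ES}_p$ is piecewise linear with those knots for both samples. The function names `es_profile`, `benchmark_adjusted_es` and `ssd_capital` are hypothetical.

```python
import numpy as np

def es_profile(losses):
    """Empirical ES_p at levels p = k/n, k = 0..n-1 (mean of the n-k largest losses)."""
    x = np.sort(np.asarray(losses, dtype=float))[::-1]      # losses in descending order
    tail_means = np.cumsum(x) / np.arange(1, x.size + 1)    # mean of the j largest, j = 1..n
    return tail_means[::-1]                                 # entry k is the mean of the n-k largest

def benchmark_adjusted_es(x_losses, z_losses):
    """sup_p {ES_p(X) - ES_p(Z)} over the grid p = k/n (samples of equal size assumed)."""
    ex, ez = es_profile(x_losses), es_profile(z_losses)
    if ex.size != ez.size:
        raise ValueError("this sketch assumes samples of equal size")
    return float(np.max(ex - ez))

def ssd_capital(x_losses, z_losses, iterations=200):
    """inf{m : ES_p(X - m) <= ES_p(Z) on the grid}, located by bisection."""
    x = np.asarray(x_losses, dtype=float)
    z = np.asarray(z_losses, dtype=float)
    if x.size != z.size:
        raise ValueError("this sketch assumes samples of equal size")
    ez = es_profile(z)
    lo = x.min() - z.max() - 1.0   # X - lo violates the ES constraint at every level
    hi = x.max() - z.min()         # X - hi satisfies the ES constraint at every level
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if np.all(es_profile(x - mid) <= ez):
            hi = mid               # mid is acceptable, so the infimum is at most mid
        else:
            lo = mid               # mid is not acceptable, so the infimum exceeds mid
    return hi

x = [0.0, 1.0, 2.0, 10.0]   # toy losses of the position
z = [1.0, 2.0, 3.0, 4.0]    # toy losses of the benchmark
print(benchmark_adjusted_es(x, z), ssd_capital(x, z))
```

On this toy data both functions return the same capital amount up to numerical tolerance, and the binding level is the largest grid point, where the worst loss of the position is compared with the worst loss of the benchmark.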

Theorem 3.2.

For a monetary risk measure $\rho : \mathcal{X} \to (-\infty, \infty]$ the following statements are equivalent:

(i) $\rho$ is a benchmark-adjusted ES.

(ii) $\rho$ is an SSD-based risk measure.

(iii) $\rho$ is consistent with $\geq_{\mathrm{SSD}}$ and the set $\{ X \in \mathcal{X} \mid \rho(X) \leq 0 \}$ has an $\geq_{\mathrm{SSD}}$-minimum element.

Proof.

Recall that for all $X \in \mathcal{X}$ and $Z \in \mathcal{X}$ we have $X \geq_{\mathrm{SSD}} Z$ if and only if $\mathrm{ES}_p(X) \leq \mathrm{ES}_p(Z)$ for every $p \in [0,1]$. We set $\mathcal{A} = \{ X \in \mathcal{X} \mid \rho(X) \leq 0 \}$. To show that (i) implies (ii), assume that $\rho$ is a benchmark-adjusted ES with respect to $Z \in \mathcal{X}$. Then, for every $X \in \mathcal{X}$
$$\rho(X) = \inf \{ m \in \mathbb{R} \mid X - m \in \mathcal{A} \} = \inf \{ m \in \mathbb{R} \mid \mathrm{ES}_p(X) - m \leq \mathrm{ES}_p(Z),\ \forall p \in [0,1] \} = \inf \{ m \in \mathbb{R} \mid X - m \geq_{\mathrm{SSD}} Z \}.$$
To show that (ii) implies (i), assume that $\rho$ is SSD-based with respect to $Z \in \mathcal{X}$. Then, we have
$$\rho(X) = \inf \{ m \in \mathbb{R} \mid X - m \geq_{\mathrm{SSD}} Z \} = \inf \{ m \in \mathbb{R} \mid \mathrm{ES}_p(X) - m \leq \mathrm{ES}_p(Z),\ \forall p \in [0,1] \} = \sup_{p \in [0,1]} \{ \mathrm{ES}_p(X) - \mathrm{ES}_p(Z) \}.$$
It is clear that (iii) implies (ii). Finally, to show that (ii) implies (iii), assume that $\rho$ is an SSD-based risk measure with respect to $Z \in \mathcal{X}$. It is clear that $Z \in \mathcal{A}$. Now, take an arbitrary $X \in \mathcal{A}$. We find a sequence $(m_n) \subset \mathbb{R}$ such that $m_n \downarrow \rho(X)$ and $X - m_n \geq_{\mathrm{SSD}} Z$ for every $n \in \mathbb{N}$. This implies that $X - \rho(X) \geq_{\mathrm{SSD}} Z$. Since $\rho(X) \leq 0$, we infer that $X \geq_{\mathrm{SSD}} Z$ as well. This shows that $\mathcal{A}$ has an SSD-minimum element. To establish that $\rho$ is consistent with $\geq_{\mathrm{SSD}}$, take arbitrary $X, Y \in \mathcal{X}$ satisfying $X \geq_{\mathrm{SSD}} Y$. For every $m \in \mathbb{R}$ such that $Y - m \geq_{\mathrm{SSD}} Z$ we clearly have that $X - m \geq_{\mathrm{SSD}} Y - m \geq_{\mathrm{SSD}} Z$. This implies that $\rho(X) \leq \rho(Y)$ and concludes the proof of the desired implication.

Remark 3.3. (i) Let $\mathcal{L}$ be the family of all (nonconstant) convex and increasing functions $\ell : \mathbb{R} \to \mathbb{R}$. The monetary risk measure associated to $\ell \in \mathcal{L}$ is defined for a given $\alpha \in \mathbb{R}$ by
$$\rho_{\ell,\alpha}(X) := \inf \{ m \in \mathbb{R} \mid \mathbb{E}[\ell(X - m)] \leq \alpha \}, \quad X \in \mathcal{X}.$$
Consider the risk profile $g(p) = \mathrm{ES}_p(Z)$ for every $p \in [0,1]$, where $Z$ is a given $\mathbb{P}$-essentially bounded random variable. Then, it is not difficult to verify that
$$\mathrm{ES}^g(X) = \sup_{\ell \in \mathcal{L}} \rho_{\ell, \mathbb{E}[\ell(Z)]}(X)$$
for every $X \in \mathcal{X}$. In particular, $\mathrm{ES}^g$ is more conservative than any shortfall risk measure with reference level $\mathbb{E}[\ell(Z)]$.

(ii) If second-order stochastic dominance is replaced by first-order stochastic dominance in the above theorem, then one arrives at a characterization of LVaR in (2) with $\alpha$ being a distribution function.

We are interested in characterizing when the acceptable risk profile $g$ of an adjusted ES can be expressed in terms of an ES profile. To this effect, it is convenient to introduce the following additional class of risk measures, which will be shown to contain all benchmark-adjusted ES's.

Definition 3.4.

A functional $\rho : \mathcal{X} \to (-\infty, \infty]$ is called a quantile-adjusted ES if there exists $Z \in L^0$ such that for every $X \in \mathcal{X}$
$$\rho(X) = \sup_{p \in [0,1]} \{ \mathrm{ES}_p(X) - \mathrm{VaR}_p(Z) \}.$$

To establish our desired characterization, for a risk profile $g \in \mathcal{G}$ we define $h_g : [0,1] \to (-\infty, \infty]$ by
$$h_g(p) := (1 - p)\, g(p).$$
Here, we set $0 \cdot \infty = 0$ so that $h_g(1) = 0$. Moreover, we introduce the following sets:
$$\mathcal{G}_{\mathrm{VaR}} := \{ g \in \mathcal{G} \mid g \text{ is finite on } [0,1) \},$$
$$\mathcal{G}_{\mathrm{ES}} := \{ g \in \mathcal{G}_{\mathrm{VaR}} \mid h_g \text{ is concave on } (0,1) \text{ and left-continuous at } 1 \}.$$

Lemma 3.5.

For every risk profile $g \in \mathcal{G}$ the following statements hold:

(i) $g \in \mathcal{G}_{\mathrm{VaR}}$ if and only if there exists a random variable $Z \in L^0$ that is bounded from below and satisfies $g(p) = \mathrm{VaR}_p(Z)$ for every $p \in [0,1]$.

(ii) $g \in \mathcal{G}_{\mathrm{ES}}$ if and only if there exists a random variable $Z \in \mathcal{X}$ such that $g(p) = \mathrm{ES}_p(Z)$ for every $p \in [0,1]$.

Proof. (i) The "if" part is clear. For the "only if" part, let $U$ be a uniform random variable on $[0,1]$ and set $Z = g(U)$. Then, it is well known that $\mathrm{VaR}_p(Z) = g(p)$ for every $p \in [0,1]$. Since $g(0) > -\infty$, we see that $Z$ is bounded from below.

(ii) The "if" part is straightforward. For the "only if" part, let $U$ be a uniform random variable on $[0,1]$ and denote by $h'_g$ the left derivative of $h_g$. Then, for every $p \in [0,1)$ we have
$$\mathrm{ES}_p(-h'_g(U)) = -\frac{1}{1-p} \int_p^1 h'_g(u)\, du = -\frac{h_g(1) - h_g(p)}{1-p} = g(p).$$
This shows that, by taking $Z = -h'_g(U)$, we have $g(p) = \mathrm{ES}_p(Z)$ for every $p \in [0,1)$. The left-continuity of $g$ and $\mathrm{ES}_\cdot(Z)$ at 1 gives the same equality for $p = 1$.

As a direct consequence of the previous lemma we derive a characterization of quantile- and benchmark-adjusted ES's in terms of the underlying risk profile.

Theorem 3.6. For every risk profile $g \in \mathcal{G}$ the following statements hold:

(i) There exists $Z \in L^0$ that is bounded from below and such that $\mathrm{ES}^g$ is a quantile-adjusted ES with respect to $Z$ if and only if $g \in \mathcal{G}_{\mathrm{VaR}}$.

(ii) There exists $Z \in \mathcal{X}$ such that $\mathrm{ES}^g$ is a benchmark-adjusted ES with respect to $Z$ if and only if $g \in \mathcal{G}_{\mathrm{ES}}$.

Since we clearly have $\mathcal{G}_{\mathrm{ES}} \subset \mathcal{G}_{\mathrm{VaR}}$, it follows from the above results that every benchmark-adjusted ES is also a quantile-adjusted ES. In particular, this implies that, for every random variable $Z \in \mathcal{X}$, we can always find a random variable $W \in L^0$ such that $\mathrm{VaR}_p(W) = \mathrm{ES}_p(Z)$ for every $p \in [0,1]$.

Proposition 3.7. (i) There exists $g \in \mathcal{G}$ such that $\mathrm{ES}^g \neq \mathrm{ES}^h$ for every $h \in \mathcal{G}_{\mathrm{VaR}}$.

(ii) There exists $g \in \mathcal{G}_{\mathrm{VaR}}$ such that $\mathrm{ES}^g \neq \mathrm{ES}^h$ for every $h \in \mathcal{G}_{\mathrm{ES}}$.

Proof. The second assertion follows immediately from Theorem 3.6 and the fact that the inclusion $\mathcal{G}_{\mathrm{ES}} \subset \mathcal{G}_{\mathrm{VaR}}$ is strict. To establish the first assertion, fix $q \in (0,1)$ and define $g \in \mathcal{G}$ by setting
$$g(p) = \begin{cases} 0 & \text{if } p \in [0,q], \\ \infty & \text{if } p \in (q,1]. \end{cases}$$
It follows that
$$\mathrm{ES}^g(X) = \sup_{p \in [0,q]} \{ \mathrm{ES}_p(X) \} = \mathrm{ES}_q(X)$$
for every $X \in \mathcal{X}$. We claim that $\mathrm{ES}^g$ is not a quantile-adjusted ES. To the contrary, suppose that there exists a random variable $Z \in L^0$ that is bounded from below and satisfies
$$\mathrm{ES}_q(X) = \mathrm{ES}^g(X) = \sup_{p \in [0,1]} \{ \mathrm{ES}_p(X) - \mathrm{VaR}_p(Z) \}$$
for every $X \in \mathcal{X}$. Take $r \in (q,1)$ and $X \in \mathcal{X}$ such that $\mathrm{ES}_r(X) > \mathrm{ES}_q(X)$. Then, for each $\lambda > 0$,
$$\mathrm{ES}_q(X) = \frac{1}{\lambda} \mathrm{ES}_q(\lambda X) = \frac{1}{\lambda} \sup_{p \in [0,1]} \{ \mathrm{ES}_p(\lambda X) - \mathrm{VaR}_p(Z) \} \geq \frac{1}{\lambda} \left( \mathrm{ES}_r(\lambda X) - \mathrm{VaR}_r(Z) \right) = \mathrm{ES}_r(X) - \frac{1}{\lambda} \mathrm{VaR}_r(Z).$$
By sending $\lambda \to \infty$, we obtain $\mathrm{ES}_q(X) \geq \mathrm{ES}_r(X)$, which contradicts our assumption on $X$.

Note that ES is always finite on our domain. Here, we are interested in discussing the finiteness of adjusted ES's associated with risk profiles in the classes $\mathcal{G}_{\mathrm{VaR}}$ and $\mathcal{G}_{\mathrm{ES}}$. We show that finiteness on the whole reference space $\mathcal{X}$ can never hold in the presence of a risk profile in $\mathcal{G}_{\mathrm{ES}}$ while it can hold if we take a risk profile in $\mathcal{G}_{\mathrm{VaR}}$.

Proposition 3.8.

Consider a risk profile $g \in \mathcal{G}$. If $g \in \mathcal{G}_{\mathrm{VaR}}$, then $\mathrm{ES}^g$ can be finite on $\mathcal{X}$. If $g \in \mathcal{G}_{\mathrm{ES}}$, then $\mathrm{ES}^g$ cannot be finite on $\mathcal{X}$.

Proof. To show the first part of the assertion, set $g(p) = \frac{1}{1-p}$ for every $p \in [0,1]$ (with the convention $\frac{1}{0} = \infty$). Note that $g \in \mathcal{G}_{\mathrm{VaR}}$. Fix $X \in \mathcal{X}$ and note that there exists $q \in (0,1)$ such that
$$\sup_{p \in [q,1)} \int_p^1 \mathrm{VaR}_r(X)\, dr < 1.$$
It follows that
$$\sup_{p \in [q,1)} \left\{ \mathrm{ES}_p(X) - \frac{1}{1-p} \right\} = \sup_{p \in [q,1)} \left\{ \frac{1}{1-p} \left( \int_p^1 \mathrm{VaR}_r(X)\, dr - 1 \right) \right\} \leq 0.$$
Therefore,
$$\mathrm{ES}^g(X) \leq \max \left\{ \sup_{p \in [0,q]} \left\{ \mathrm{ES}_p(X) - \frac{1}{1-p} \right\},\ 0 \right\} \leq \max \{ \mathrm{ES}_q(X),\ 0 \} < \infty.$$
This shows that $\mathrm{ES}^g$ is finite on the entire $\mathcal{X}$. To establish the second part of the assertion, take $Z \in \mathcal{X}$ and set $g(p) = \mathrm{ES}_p(Z)$ for every $p \in [0,1]$. Note that $g \in \mathcal{G}_{\mathrm{ES}}$ by Lemma 3.5. If $Z$ is bounded from above, then take $X \in \mathcal{X}$ that is unbounded from above. In this case, it follows that
$$\mathrm{ES}^g(X) \geq \mathrm{ES}_1(X) - \mathrm{ES}_1(Z) = \infty.$$
If $Z$ is unbounded from above, then take $X = 2Z \in \mathcal{X}$. In this case, we have
$$\mathrm{ES}^g(X) \geq \mathrm{ES}_1(2Z) - \mathrm{ES}_1(Z) = \mathrm{ES}_1(Z) = \infty.$$
Hence, we see that $\mathrm{ES}^g$ is never finite on $\mathcal{X}$.

The next result improves Lemma 2.10 by showing that the inf-convolution of benchmark-adjusted ES's can still be expressed as an adjusted ES.

Proposition 3.9.

Let $n \in \mathbb{N}$ and consider the risk profiles $g_1, \dots, g_n \in \mathcal{G}_{\mathrm{ES}}$. For every $X \in \mathcal{X}$
$$\mathop{\square}_{i=1}^n \mathrm{ES}^{g_i}(X) = \mathrm{ES}^{\sum_{i=1}^n g_i}(X).$$

Proof. The inequality "$\geq$" follows from Lemma 2.10. To show the inequality "$\leq$", note that there exist $Z_1, \dots, Z_n \in \mathcal{X}$ such that $\mathcal{A}_{g_i} = \{ X \in \mathcal{X} \mid X \geq_{\mathrm{SSD}} Z_i \}$ by Theorem 3.6. We prove that
$$\mathcal{A} := \left\{ X \in \mathcal{X} \,;\ \mathrm{ES}^{\sum_{i=1}^n g_i}(X) \leq 0 \right\} \subset \sum_{i=1}^n \mathcal{A}_{g_i},$$
which, together with Remark 2.9, yields the desired inequality. Let $U$ be a uniform random variable and, for any $X \in \mathcal{X}$, denote by $F_X^{-1}$ the (left) quantile function of $X$. Take $i \in \{1, \dots, n\}$ and note that $F_{Z_i}^{-1}(U) \sim Z_i$. It follows from the law invariance of ES that $\mathrm{ES}_p(F_{Z_i}^{-1}(U)) = \mathrm{ES}_p(Z_i)$ for every $p \in [0,1]$. Hence, $F_{Z_i}^{-1}(U) \in \mathcal{A}_{g_i}$. Since the random variables $F_{Z_i}^{-1}(U)$ are comonotonic, we also have $\sum_{i=1}^n \mathrm{ES}_p(Z_i) = \sum_{i=1}^n \mathrm{ES}_p(F_{Z_i}^{-1}(U)) = \mathrm{ES}_p(Z)$ with $Z = \sum_{i=1}^n F_{Z_i}^{-1}(U)$. We deduce that each $X \in \mathcal{A}$ satisfies $\mathrm{ES}_p(X) \leq \mathrm{ES}_p(Z)$ for every $p \in [0,1]$, that is, $X \geq_{\mathrm{SSD}} Z$. Note that $Z \in \sum_{i=1}^n \mathcal{A}_{g_i}$ so that $\mathop{\square}_{i=1}^n \mathrm{ES}^{g_i}(Z) \leq 0$. Since the inf-convolution is consistent with $\geq_{\mathrm{SSD}}$, as shown in Theorem 4.1 by Mao and Wang (2020), we have $\mathop{\square}_{i=1}^n \mathrm{ES}^{g_i}(X) \leq \mathop{\square}_{i=1}^n \mathrm{ES}^{g_i}(Z) \leq 0$, which implies $X \in \sum_{i=1}^n \mathcal{A}_{g_i}$ as desired.

Using the characterization of benchmark-adjusted ES's established in Theorem 3.2, many optimization problems related to benchmark-adjusted ES's or, equivalently, SSD-based risk measures can be solved explicitly. In this section, we focus on risk minimization and utility maximization problems in the context of a multi-period frictionless market that is complete and arbitrage free. The interest rate is set to be zero for simplicity. As is commonly done in the literature, optimization problems of this type, which are naturally expressed in terms of dynamic investment strategies, can be converted into static optimization problems by way of martingale methods. Below we focus directly on their static counterparts. For more details we refer, e.g., to Schied et al. (2009) or Föllmer and Schied (2016). In addition, to ensure that all our problems are well defined, we assume throughout that $\mathcal{X}$ consists of $\mathbb{P}$-bounded random variables.

In the sequel, we denote by $\mathbb{Q}$ the risk-neutral pricing measure (whose existence and uniqueness in our setting are ensured by the Fundamental Theorem of Asset Pricing), by $w \in \mathbb{R}$ a fixed level of initial wealth, by $x \in \mathbb{R}$ a real number representing a constraint, by $u : \mathbb{R} \to \mathbb{R} \cup \{-\infty\}$ a concave and increasing function that is continuous (at the point where it potentially jumps to $-\infty$) and satisfies $u(-\infty) < x < u(\infty)$, and by $\rho : \mathcal{X} \to (-\infty, \infty]$ a risk functional. We focus on the following five optimization problems:

(i) Risk minimization with a budget constraint: minimize $\rho(X)$ over $X \in \mathcal{X}$ subject to $\mathbb{E}_{\mathbb{Q}}[w - X] \leq x$.

(ii) Price minimization with controlled risk: minimize $\mathbb{E}_{\mathbb{Q}}[w - X]$ over $X \in \mathcal{X}$ subject to $\rho(X) \leq x$.

(iii) Risk minimization with a target utility level: minimize $\rho(X)$ over $X \in \mathcal{X}$ subject to $\mathbb{E}[u(w - X)] = x$.
(iv) Worst-case utility with a reference risk assessment: minimize $\mathbb{E}[u(w - X)]$ over $X \in \mathcal{X}$ subject to $\rho(X) = x$.

(v) Worst-case risk with a reference risk assessment: maximize $\rho'(X)$ over $X \in \mathcal{X}$ subject to $\rho(X) = x$, where $\rho'$ is an SSD-consistent functional that is continuous with respect to the $L^\infty$-norm.

Problem (i) is an optimal investment problem minimizing the risk given a budget constraint. Conversely, problem (ii) aims at minimizing the cost given a controlled risk level. Problem (iii) is about minimizing the risk exposure with a target utility level, similar to the mean-variance problem of Markowitz (1952). The interpretation of problems (iv) and (v) is different from that of the first three problems: they are not about optimization over risk, but about ambiguity, i.e., in these problems the main concern is model risk. Indeed, the set $\mathcal{X}$ may represent the class of plausible models for the distribution of a certain financial position of interest. In the case of problem (iv), the assumption is that the only available information for $X$ is the risk figure $\rho(X)$, evaluated, e.g., by an expert or another decision maker. In this context, we are interested in determining the worst-case utility among all possible models which agree with the evaluation $\rho(X) = x$ (see also Example 5.3 of Wang et al. (2019)). A similar interpretation can be given for problem (v).

Proposition 4.1.

Each of the optimization problems (i)-(v) relative to a benchmark-adjusted ES $\rho = \mathrm{ES}^g$ for $g \in \mathcal{G}_{\mathrm{ES}}$ admits an optimal solution of the explicit form $Z + z$ where $Z \in \mathcal{X}$ has the ES profile $g$ and $z \in \mathbb{R}$. Moreover, $Z$ is comonotonic with $d\mathbb{Q}/d\mathbb{P}$ in (i)-(ii), and the (binding) constraint uniquely determines $z$ in each problem.

Proof. The result for the optimization problem (i) is a direct consequence of Proposition 5.2 in Mao and Wang (2020). Let $Z$ be a random variable with ES profile $g$ that is comonotonic with $d\mathbb{Q}/d\mathbb{P}$; comonotonicity is only relevant in problems (i) and (ii). Note that $\rho(Z) = 0$. For any random variable $X \in \mathcal{X}$, we set $Y_X = Z + \rho(X)$. It is clear that $\rho(Y_X) = \rho(X)$ and
$$\mathrm{ES}_p(Y_X) = g(p) + \rho(X) = g(p) + \sup_{q \in [0,1]} \{ \mathrm{ES}_q(X) - g(q) \} \geq \mathrm{ES}_p(X).$$
Hence, $X \geq_{\mathrm{SSD}} Y_X$. This observation will be useful in the analysis below.

(a) We first look at problem (ii). Since both $X \mapsto \mathbb{E}_{\mathbb{Q}}[X]$ and $\rho$ are translation-invariant, the condition $\rho(X) \leq x$ is binding, and problem (ii) is equivalent to maximizing $\mathbb{E}_{\mathbb{Q}}[X]$ over $X \in \mathcal{X}$ such that $\rho(X) = x$. Let $X \in \mathcal{X}$ be any random variable with $\rho(X) = x$ and let $X'$ be identically distributed as $X$ and comonotonic with $d\mathbb{Q}/d\mathbb{P}$. Since $X' \sim X$, by the Hardy–Littlewood inequality (e.g., Remark 3.25 of Rüschendorf (2013)), we have $\mathbb{E}_{\mathbb{Q}}[X] \leq \mathbb{E}_{\mathbb{Q}}[X']$. Moreover, for any random variable $Y \in \mathcal{X}$ that is comonotonic with $d\mathbb{Q}/d\mathbb{P}$, we can write (see, e.g., (A.8) of Mao and Wang (2020))
$$\mathbb{E}_{\mathbb{Q}}[Y] = \int_0^1 \mathrm{ES}_p(Y)\, d\mu(p)$$
for some Borel probability measure $\mu$ on $[0,1]$. Hence, $X' \geq_{\mathrm{SSD}} Y_X$ implies $\mathbb{E}_{\mathbb{Q}}[X'] \leq \mathbb{E}_{\mathbb{Q}}[Y_X]$, and we obtain
$$\mathbb{E}_{\mathbb{Q}}[X] \leq \mathbb{E}_{\mathbb{Q}}[X'] \leq \mathbb{E}_{\mathbb{Q}}[Y_X].$$
Note also that $\rho(Y_X) = \rho(X) = x$. Hence, for any random variable $X \in \mathcal{X}$, there exists $Z + z$ for some $z \in \mathbb{R}$ which dominates $X$ for problem (ii).
Since both the constraint and the objective are continuous in $z \in \mathbb{R}$, an optimizer of the form $Z + z$ exists.

(b) We next look at problem (iii). Let $X \in \mathcal{X}$ be any random variable such that $\mathbb{E}[u(w - X)] = x$ and write $Y = Z + \rho(X)$. The aforementioned fact $X \geq_{\mathrm{SSD}} Y$ implies that $\mathbb{E}[u(w - Y)] \leq \mathbb{E}[u(w - X)] = x$ since $u$ is a concave utility function. Therefore, there exists $\varepsilon \in [0, \infty)$ such that $\mathbb{E}[u(w - (Y - \varepsilon))] = x$, and we take the largest $\varepsilon$ satisfying this equality, which is obviously finite. Let $z = \rho(X) - \varepsilon$. It is then clear that $\mathbb{E}[u(w - (Z + z))] = \mathbb{E}[u(w - X)] = x$ and $\rho(Z + z) = \rho(Y - \varepsilon) = \rho(X) - \varepsilon \leq \rho(X)$. Hence, $Z + z$ dominates $X$ as an optimizer for problem (iii). Since both the constraint and the objective are continuous in $z \in \mathbb{R}$, an optimizer of the form $Z + z$ exists.

(c) Problems (iv) and (v) are similar, and they can be shown via arguments similar to those of the above cases.

Remark 4.2. (i) It should be clear that the classical ES does not belong to the class of SSD-based risk measures as the associated risk profile is not in $\mathcal{G}_{\mathrm{ES}}$. As a consequence, the results in this section do not directly apply to ES. In particular, although ES is consistent with SSD, its acceptance set does not have a minimum SSD element as required by Theorem 3.2. We refer to Wang and Zitikis (2020) for a different characterization of ES.

(ii) In the context of decision theory and, specifically, portfolio selection, it is sometimes argued that (second-order) stochastic dominance is too extreme in the sense that it ranks risks according to the simultaneous preferences of every risk-averse agent, thus including utility functions that may lead to counterintuitive outcomes. A typical example is the one proposed by Leshno and Levy (2002). Consider a portfolio that pays one million dollars in 99% of cases and nothing otherwise and another portfolio that pays one dollar with certainty.
According to the sign convention adopted in this paper, the corresponding payoffs are given by
$$X = \begin{cases} -10^6 & \text{with probability } 99\%, \\ 0 & \text{with probability } 1\%, \end{cases} \qquad \text{and} \qquad Y = -1.$$
Even though $X$ does not dominate $Y$ with respect to SSD, most agents prefer $X$ to $Y$. Thus, the authors argue for the necessity of relaxing SSD in favor of a more reasonable notion. We point out that our approach yields a novel and reasonable generalization of SSD. First, consider the risk profile defined by $g(p) = \mathrm{ES}_p(Y) = -1$ for every $p \in [0,1]$ and note that $X$ is acceptable under $\mathrm{ES}^g$ precisely when $X \geq_{\mathrm{SSD}} Y$. Note also that
$$\mathrm{ES}_p(X) \leq g(p) \iff p \leq \bar{p} := 1 - \frac{1}{10^{2} - 10^{-4}} \approx 1 - 10^{-2}.$$
This fact has two implications. On the one hand, it confirms that $X$ does not dominate $Y$ with respect to SSD and highlights that this failure is due to the behavior of $X$ in the far region of its left tail. On the other hand, it suggests that it is enough to consider the new risk profile defined by $h(p) = g(p)$ for $p \leq \bar{p}$ and $h(p) = \infty$ otherwise to make $X$ acceptable under $\mathrm{ES}^h$. In other words, moving from $g$ to $h$ is equivalent to moving from SSD to a relaxed form of SSD that enlarges the spectrum of acceptability in portfolio selection problems. However, note that $\mathrm{ES}^h$ is not an SSD-based risk measure and, hence, the existence results obtained above do not apply to it. A systematic study of optimization problems under constraints of $\mathrm{ES}^h$ type requires further research.

References

Acerbi, C. (2002). Spectral measures of risk: A coherent representation of subjective risk aversion. Journal of Banking & Finance, 26(7):1505–1518.
Acerbi, C. and Tasche, D. (2002). On the coherence of expected shortfall. Journal of Banking & Finance, 26(7):1487–1503.
Artzner, P., Delbaen, F., Eber, J.-M., and Heath, D. (1999). Coherent measures of risk. Mathematical Finance, 9(3):203–228.
Baes, M., Koch-Medina, P., and Munari, C. (2020). Existence, uniqueness, and stability of optimal payoffs of eligible assets. Mathematical Finance.
Barrieu, P. and El Karoui, N. (2005). Inf-convolution of risk measures and optimal risk transfer. Finance and Stochastics, 9(2):269–298.
BCBS (2012). Consultative Document May 2012. Fundamental review of the trading book. Basel Committee on Banking Supervision. Basel: Bank for International Settlements.
Biener, C., Eling, M., and Wirfs, J. (2015). Insurability of cyber risk: An empirical analysis. Geneva Papers on Risk and Insurance - Issues and Practice, 40:131–158.
Bignozzi, V., Burzoni, M., and Munari, C. (2020). Risk measures based on benchmark loss distributions. Journal of Risk and Insurance, 87(2):437–475.
Burgert, C. and Rüschendorf, L. (2008). Allocation of risks and equilibrium in markets with finitely many traders. Insurance: Mathematics and Economics, 42(1):177–188.
Cont, R., Deguest, R., and Scandolo, G. (2010). Robustness and sensitivity analysis of risk measurement procedures. Quantitative Finance, 10(6):593–606.
Du, Z. and Escanciano, J. C. (2017). Backtesting expected shortfall: accounting for tail risk. Management Science, 63(4):940–958.
Embrechts, P., Liu, H., and Wang, R. (2018). Quantile-based risk sharing. Operations Research, 66(4):936–949.
Filipović, D. and Svindland, G. (2008). Optimal capital and risk allocations for law- and cash-invariant convex functions. Finance and Stochastics, 12(3):423–439.
Föllmer, H. and Schied, A. (2016). Stochastic Finance: An Introduction in Discrete Time. De Gruyter, Berlin, 4th edition.
Frey, R. and McNeil, A. J. (2002). VaR and expected shortfall in portfolios of dependent credit risks: conceptual and practical insights. Journal of Banking & Finance, 26(7):1317–1334.
Koch-Medina, P. and Munari, C. (2016). Unexpected shortfalls of expected shortfall: Extreme default profiles and regulatory arbitrage. Journal of Banking & Finance, 62:141–151.
Krätschmer, V., Schied, A., and Zähle, H. (2014). Comparative and qualitative robustness for law-invariant risk measures. Finance and Stochastics, 18(2):271–295.
Kratz, M., Lok, Y. H., and McNeil, A. J. (2018). Multinomial VaR backtests: A simple implicit approach to backtesting expected shortfall. Journal of Banking & Finance, 88:393–407.
Leshno, M. and Levy, H. (2002). Preferred by all and preferred by most decision makers: Almost stochastic dominance. Management Science, 48(8):1074–1085.
Liu, F. and Wang, R. (2020). A theory for measures of tail risk. Mathematics of Operations Research, forthcoming.
Mao, T. and Wang, R. (2020). Risk aversion in regulatory capital principles. SIAM Journal on Financial Mathematics, 11(1):169–200.
Markowitz, H. (1952). Portfolio selection. Journal of Finance, 7:77–91.
Refsdal, A., Solhaug, B., and Stølen, K. (2015). Cyber-risk Management, pages 33–47. Springer International Publishing, Cham.
Rockafellar, R. T. (1974). Conjugate duality and optimization. SIAM.
Rockafellar, R. T. and Uryasev, S. (2002). Conditional value-at-risk for general loss distributions. Journal of Banking & Finance, 26(7):1443–1471.
Rüschendorf, L. (2013). Mathematical risk analysis. Springer Ser. Oper. Res. Financ. Eng. Springer, Heidelberg.
Schied, A., Föllmer, H., and Weber, S. (2009). Robust preferences and robust portfolio choice. Handbook of Numerical Analysis, 15:29–87.
Shaked, M. and Shanthikumar, J. (2007). Stochastic Orders. Springer Series in Statistics. Springer New York.
Wang, R. (2016). Regulatory arbitrage of risk measures. Quantitative Finance, 16(3):337–347.
Wang, R., Xu, Z. Q., and Zhou, X. Y. (2019). Dual utilities on risk aggregation under dependence uncertainty. Finance and Stochastics, 23(4):1025–1048.
Wang, R. and Zitikis, R. (2020). An axiomatic foundation for the expected shortfall. Management Science, forthcoming.
Weber, S. (2018). Solvency II, or how to sweep the downside risk under the carpet. Insurance: Mathematics and Economics, 82:191–200.
Ziegel, J. F. (2016). Coherence and elicitability. Mathematical Finance, 26(4):901–918.