XVA Analysis From the Balance Sheet

Abstract

XVAs denote various counterparty risk related valuation adjustments that are applied to financial derivatives since the 2007--09 crisis. We root a cost-of-capital XVA strategy in a balance sheet perspective which is key in identifying the economic meaning of the XVA terms. Our approach is first detailed in a static setup that is solved explicitly. It is then plugged in the dynamic and trade incremental context of a real derivative banking portfolio. The corresponding cost-of-capital XVA strategy ensures to bank shareholders a submartingale equity process corresponding to a target hurdle rate on their capital at risk, consistently between and throughout deals. Set on a forward/backward SDE formulation, this strategy can be solved efficiently using GPU computing combined with deep learning regression methods in a whole bank balance sheet context. A numerical case study emphasizes the workability and added value of the ensuing pathwise XVA computations.



Claudio Albanese, Stéphane Crépey, Rodney Hoskinson, and Bouazza Saadeddine

September 2, 2020


Keywords:

Counterparty risk, balance sheet of a bank, market incompleteness, wealth transfer, X-valuation adjustment (XVA), deep learning, quantile regression.

Mathematics Subject Classification:

JEL Classification:

D52, G13, G24, G28, G33, M41.

XVAs, with X as C for credit, D for debt, F for funding, M for margin, or K for capital, are post-2007–09 crisis valuation adjustments for financial derivatives. In broad terms to be detailed later in the paper (cf. Table 1 in Section 2), CVA is what the bank expects to lose due to [...]

Global Valuation ltd, London, United Kingdom
LaMME, Univ Evry, CNRS, Université Paris-Saclay
ANZ Banking Group, Singapore
Quantitative Research GMD/GMT, Credit Agricole CIB, Paris

Acknowledgement:

This article has been accepted for publication in Quantitative Finance, published by Taylor & Francis. We are grateful for useful discussions to Lokman Abbas-Turki, Agostino Capponi, Karl-Theodor Eisele, Chris Kenyon, Marek Rutkowski, and Michael Schmutz. The PhD thesis of Bouazza Saadeddine is funded by a CIFRE grant from CA-CIB and French ANRT.

Disclaimers:

The views expressed herein by Rodney Hoskinson and co-authors are their personal views and do not reflect the views of ANZ Banking Group Limited ("ANZ"). No liability shall be accepted by ANZ whatsoever for any direct or consequential loss from any use of this paper and the information, opinions and materials contained herein.

[...]

$$
\mathrm{CVA} - \mathrm{DVA} \tag{1}
$$

for the valuation of bilateral counterparty risk on a swap, assuming risk-free funding. This formula, rediscovered and generalized by others since the 2008–09 financial crisis (cf. e.g. Brigo and Capponi (2010)), is symmetrical, i.e. it is the negative of the analogous quantity considered from the point of view of the counterparty, consistent with the law of one price and the Modigliani and Miller (1958) theorem.

Around 2010, the materiality of the DVA windfall benefit of a bank at its own default time became the topic of intense debates in the quant and academic communities. At least, it seemed reasonable to admit that, if the own default risk of the bank was accounted for in the modeling, in the form of a DVA benefit, then the cost of funding (FVA) implication of this risk should be included as well, leading to the modified formula (CVA − DVA + FVA). See for instance Burgard and Kjaer (2011, 2013, 2017), Crépey (2015), Brigo and Pallavicini (2014), or Bichuch, Capponi, and Sturm (2018). See also Bielecki and Rutkowski (2015) for an abstract funding framework (without explicit reference to XVAs), generalizing Piterbarg (2010) to a nonlinear setup.

Then Hull and White (2012) objected that the FVA was only the compensator of another windfall benefit of the bank at its own default, corresponding to the non-reimbursement by the bank of its funding debt. Accounting for the corresponding "DVA2" (akin to the FDA in this paper) brings us back to the original firm valuation formula:

$$
\mathrm{CVA} - \mathrm{DVA} + \mathrm{FVA} - \mathrm{FDA} = \mathrm{CVA} - \mathrm{DVA},
$$

as FVA = FDA (assuming risky funding fairly priced as we will see).

However, their argument implicitly assumes that the bank can perfectly hedge its own default: cf. Burgard and Kjaer (2013, end of Section 3.1) and see Section 3.5 below. As a bank is an intrinsically leveraged entity, this is not the case in practice.
One can mention the related corporate finance notion of debt overhang in Myers (1977), by which a project valuable for the firm as a whole may be rejected by shareholders because the project is mainly valuable to bondholders. But, until recently, such considerations were hardly taken into account in the field of derivative pricing.

The first ones to recast the XVA debate in the perspective of the balance sheet of the bank were Burgard and Kjaer (2011), to explain that an appropriately hedged derivative position has no impact on the dealer's funding costs. Also relying on balance sheet models of a dealer bank, Castagna (2014) and Andersen, Duffie, and Song (2019) end up with conflicting conclusions, namely that the FVA should, respectively should not, be included in the valuation of financial derivatives. Adding the KVA, but in a replication framework, Green, Kenyon, and Dennis (2014) conclude that both the FVA and the KVA should be included as add-ons in entry prices and as liabilities in the balance sheet.

Our key premise is that counterparty risk entails two distinct but intertwined sources of market incompleteness:

• A bank cannot perfectly hedge counterparty default losses, by lack of sufficiently liquid CDS markets;
• A bank can even less hedge its own jump-to-default exposure, because this would mean selling protection on its own default, which is impractical and, under certain jurisdictions, even legally forbidden (see Section 2).

We specify the banking XVA metrics that align derivative entry prices to shareholder interest, given this impossibility for a bank to replicate the jump-to-default related cash flows.
We develop a cost-of-capital XVA approach consistent with the accounting standards set out in IFRS 4 Phase II (see International Financial Reporting Standards (2013)), inspired by the Swiss solvency test and Solvency II insurance regulatory frameworks (see Swiss Federal Office of Private Insurance (2006) and Committee of European Insurance and Occupational Pensions Supervisors (2010)), which so far has no analogue in the banking domain. Under this approach, the valuation (CL) of the so-called contra-liabilities and the cost of capital (KVA) are sourced from clients at trade inceptions, on top of the (CVA − DVA) complete market valuation of counterparty risk, in order to compensate bank shareholders for wealth transfer and risk on their capital.

The cost of the corresponding collateralization, accounting, and dividend policy is, by contrast with the complete market valuation (CVA − DVA) of counterparty risk,

$$
\mathrm{CVA} + \mathrm{FVA} + \mathrm{KVA}, \tag{2}
$$

computed unilaterally in a certain sense (even though we do crucially include the default of the bank itself in our modeling), and charged to clients on an incremental run-off basis at every new deal.

All in all, our cost-of-capital XVA strategy makes shareholder equity a submartingale with drift corresponding to a hurdle rate $h$ on shareholder capital at risk, consistently between and throughout deals. Thus we arrive at a sustainable strategy for profits retention, much like in the above-mentioned insurance regulation, but in a consistent continuous-time and banking framework.

Last but not least, our approach can be solved efficiently using GPU computing combined with deep learning regression methods in a whole bank balance sheet context.

Section 2 sets a financial stage where a bank is split across several trading desks and entails different stakeholders. Section 3 develops our cost-of-capital XVA approach in a one-period static setup.
Section 4 revisits the approach at the dynamic and trade incremental level. Section 5 is a numerical case study on large, multi-counterparty portfolios of interest rate swaps, based on the continuous-time XVA equations for bilateral trade portfolios recalled in Section A. (See also Remark 2.1 regarding the meaning of the FVA in (2).)

The main contributions of the paper are:

• The one-period static XVA model of Section 3, with explicit formulas for all the quantities at hand, offering a concrete grasp on the related wealth transfer and risk premium issues;
• Proposition 4.1, which establishes the connections between XVAs and the core equity tier 1 capital of the bank, respectively bank shareholder equity;
• Proposition 4.2, which establishes that, under the XVA policy represented by the balance conditions (4) between deals and the counterparty risk add-on (43) throughout deals, bank shareholder equity is a submartingale with drift corresponding to a target hurdle rate $h$ on shareholder capital at risk. This perspective solves the puzzle according to which, on the one hand, XVA computations are performed on a run-off portfolio basis, while, on the other hand, they are used for computing pricing add-ons to new deals;
• The XVA deep learning (quantile) regression computational strategy of Section 4.4;
• The numerical case study of Section 5, which emphasizes the materiality of refined, pathwise XVA computations, as compared to more simplistic XVA approaches.

From a broader point of view, this paper reflects a shift of paradigm regarding the pricing and risk management of financial derivatives, from hedging to balance sheet optimization, as quantified by relevant XVA metrics.
In particular (compare with the last paragraph before Section 1.1), our approach implies that the FVA (and also the MVA, see Remark 2.1) should be included as an add-on in entry prices and as a liability in the balance sheet; the KVA should be included as an add-on in entry prices, but not as a liability in the balance sheet.

From a computational point of view, this paper opens the way to second generation XVA GPU implementation. The first generation consisted of nested Monte Carlo implemented by explicit CUDA programming on GPUs (see Albanese, Caenazzo, and Crépey (2017), Abbas-Turki, Diallo, and Crépey (2018)). The second generation leverages GPUs via pre-coded CUDA/AAD deep learning packages that are used for the XVA embedded regression and quantile regression task. Compared to a regulatory capital based KVA approach, an economic capital based KVA approach is then not only conceptually more satisfying, but also simpler to implement.
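To make the quantile regression task mentioned above more concrete, here is a minimal NumPy sketch. It is not the authors' implementation (which relies on deep learning packages on GPUs), and it is unconditional rather than pathwise. It only illustrates, on simulated standard normal losses (a hypothetical choice), that an empirical quantile minimizes the pinball loss used in quantile regression, and how an expected shortfall can then be estimated as the average loss beyond that quantile. The function names are illustrative assumptions; the 97.5% level is the one used later in the paper for economic capital.

```python
import numpy as np


def pinball_loss(q, losses, alpha):
    """Average pinball (quantile) loss of the constant prediction q at level alpha."""
    diff = losses - q
    return float(np.mean(np.maximum(alpha * diff, (alpha - 1.0) * diff)))


def value_at_risk(losses, alpha):
    """Empirical alpha-quantile of the losses."""
    return float(np.quantile(losses, alpha))


def expected_shortfall(losses, alpha):
    """Average of the losses at or beyond the empirical alpha-quantile."""
    var = value_at_risk(losses, alpha)
    return float(np.mean(losses[losses >= var]))


rng = np.random.default_rng(seed=0)
losses = rng.standard_normal(100_000)  # hypothetical loss sample, not from the paper
alpha = 0.975

var = value_at_risk(losses, alpha)
es = expected_shortfall(losses, alpha)

# The empirical quantile minimizes the pinball loss: shifting it cannot lower the loss.
for shift in (-0.2, -0.05, 0.05, 0.2):
    assert pinball_loss(var, losses, alpha) <= pinball_loss(var + shift, losses, alpha)

print(f"VaR = {var:.3f}, ES = {es:.3f}")
```

In a regression setting, the constant `q` would be replaced by a parametric function of the risk factors (for instance a neural network) trained on the same loss, which is the idea behind conditional, pathwise quantile estimates.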

We consider a dealer bank, which is a market maker involved in bilateral derivative portfolios. For simplicity, we only consider European derivatives. The bank has two kinds of stakeholders, shareholders and bondholders. The shareholders have the control of the bank and are solely responsible for investment decisions before bank default. The bondholders represent the senior creditors of the bank, who have no decision power until bank default, but are protected by laws, of the pari-passu type, forbidding trades that would trigger value away from them to shareholders during the default resolution process of the bank. The bank also has junior creditors, represented in our framework by an external funder, who can lend unsecured to the bank and is assumed to suffer an exogenously given loss-given-default in case of default of the bank.

We consider three kinds of business units within the bank (see Figure 1 for the corresponding picture of the bank balance sheet and refer to Table 1 for a list of the main financial acronyms used in the paper): the CA desks, i.e. the CVA desk and the FVA desk (or Treasury) of the bank, in charge of contra-assets, i.e. of counterparty risk and its funding implications for the bank; the clean desks, who focus on the market risk of the contracts in their respective business lines; the management of the bank, in charge of the dividend release policy of the bank.

| Acronym | Name | Where introduced |
|---|---|---|
| Amounts on dedicated cash accounts of the bank: | | |
| CM | Clean margin | Definition 2.1 and Assumption 2.1 |
| RC | Reserve capital | Definition 2.1 and Assumption 2.1 |
| RM | Risk margin | Definition 2.1 and Assumption 2.1 |
| UC | Uninvested capital | Definition 2.1 and Assumption 2.1 |
| Valuations: | | |
| CA | Contra-assets valuation | (3), (16), and (51) |
| CL | Contra-liabilities valuation | Definition 2.1 and (18), (35), and (43) |
| CVA | Credit valuation adjustment | (17), (16), (52), and (60)–(61) |
| DVA | Debt valuation adjustment | (18) and (17) |
| FDA | Funding debt adjustment | (18) and (23) |
| FV | Firm valuation of counterparty risk | (21) and (23) |
| FVA | Funding valuation adjustment | Remark 2.1, (17), (16), and (52) |
| KVA | Capital valuation adjustment | (4), (26), and (56) |
| MtM | Mark-to-market | (4) and (15) |
| MVA | Margin valuation adjustment | Remark 2.1, (33), (52), and (62) |
| XVA | Generic "X" valuation adjustment | First paragraph |
| Also: | | |
| CR | Capital at risk | (54) |
| CET1 | Core equity tier I capital | (3) and (40) |
| EC | Economic capital | Definitions 3.2 and A.1 |
| FTP | Funds transfer price | (43) |
| SHC | Shareholder capital (or equity) | (3) and (41) |
| SCR | Shareholder capital at risk | Assumption 2.1 and (25) |

Table 1: Main financial acronyms and place where they are introduced conceptually and/or specified mathematically in the paper, as relevant.

[Figure 1: diagram with an ASSETS column and a LIABILITIES column, showing reserve capital (RC), risk margin (RM = KVA), shareholder capital at risk (SCR), uninvested capital (UC), core equity tier I capital (CET1), capital at risk (CR), accounting equity, contra-assets (CA: CVA, FVA), contra-liabilities (CL: DVA, FDA = FVA), the mark-to-market of the portfolio receivables and payables, the collateral posted and received by the clean desks, the CVA desk, FVA desk (Treasury), KVA desk (management), CA desks and clean desks, and loss arrows labeled yr1, yr39, yr40.]

Figure 1: Balance sheet of a dealer bank. Contra-liability valuation (CL) at the top is shown in dotted boxes because it is only value to the bondholders (see Section 3.5). Mark-to-market valuation (MtM) of the derivative portfolio of the bank by the clean desks, as well as the corresponding collateral (clean margin CM), are shown in dashed boxes at the bottom. Their role will essentially vanish in our setup, where we assume a perfect clean hedge by the bank. The arrows in the left column represent trading losses of the CA desks in "normal years 1 to 39" and in an "exceptional year 40" with full depletion (i.e. refill via UC, under Assumption 2.1.ii) of RC, RM, and SCR. The numberings yr1 to yr40 are fictitious yearly scenarios in line with a 97.5% expected shortfall of the one-year-ahead trading losses of the bank that we use for defining its economic capital. The arrows in the right column symbolize the average depreciation in time of contra-assets between deals. The collateral between the bank and its counterparties is not shown, to keep the picture simple.

Collateral means cash or liquid assets that are posted to guarantee a netted set of transactions against defaults. It comes in two forms: variation margin, which is re-hypothecable, i.e. fungible across netting sets, and initial margin, which is segregated. We assume cash only collateral. Posted collateral is supposed to be remunerated at the risk-free rate (assumed to exist, with overnight index swap rates as a best market proxy).

Remark 2.1

To lighten the notation, in this conceptual section of the paper, we only consider an FVA as the global cost of raising collateral for the bank, as opposed to a distinction, in the industry and in later sections in the paper, between an FVA, in the strict sense of the cost of raising variation margin, and an MVA for the cost of raising initial margin.

The CA desks guarantee the trading of the clean desks against counterparty defaults, through a clean margin account, which can be seen as (re-hypothecable) collateral exchanged between the CA desks and the clean desks. The corresponding clean margin amount (CM) also plays the role of the funding debt of the clean desks put at their disposal at a risk-free cost by the Treasury of the bank. This is at least the case when CM > 0; when CM < 0, (−CM) corresponds to excess cash generated by the trading of the clean desks, usable by the Treasury for its other funding purposes. See the bottom, dashed boxes in Figure 1.

In addition, the CA desks value the contra-assets (future counterparty default losses and funding expenditures), charge them to the (corporate) clients at deal inception, deposit the corresponding payments in a reserve capital account, and then are exposed to the corresponding payoffs. As time proceeds, contra-assets realize and are covered by the CA desks with the reserve capital account.

On top of reserve capital, the so-called risk margin is sourced by the management of the bank from the clients at deal inception, deposited into a risk margin account, and then gradually released as KVA payments into the shareholder dividend stream.

Another account contains the shareholder capital at risk earmarked by the bank to deal with exceptional trading losses (beyond the expected losses that are already accounted for by reserve capital).

Last, there is one more bank account with shareholder uninvested capital.

All cash accounts are remunerated at the risk-free rate.

Definition 2.1

We write CM, RC, RM, SCR, and UC for the respective (risk-free discounted) amounts on the clean margin, reserve capital, risk margin, shareholder capital at risk, and uninvested capital accounts of the bank. We also define

$$
\mathrm{SHC} = \mathrm{SCR} + \mathrm{UC}, \qquad \mathrm{CET1} = \mathrm{RM} + \mathrm{SCR} + \mathrm{UC}. \tag{3}
$$

From a financial interpretation point of view, before bank default, SHC corresponds to shareholder capital (or equity); CET1 is the core equity tier I capital of the bank, representing the financial strength of the bank assessed from a regulatory, structural solvency point of view, i.e. the sum of shareholder capital and the risk margin (which is also loss-absorbing), but excluding the value CL of the so-called contra-liabilities (see Figure 1). Indeed, the latter only benefits the bondholders (cf. Section 3.5), hence it only enters accounting equity. Before the default of the bank, shareholder wealth and bondholder wealth are respectively given by $\mathrm{SHC} + \mathrm{RM}^{sh}$ and $\mathrm{CL} + \mathrm{RM}^{bh}$, for shareholder and bondholder components of RM to be detailed in Remark 3.3; shareholder and bondholder wealths sum up to the accounting equity RM + SCR + UC + CL, i.e. the wealth of the firm as a whole (see Figure 1).

Remark 2.2

The purpose of our capital structure model of the bank is not to model the default of the bank, like in a Merton (1974) model, as the point of negative equity (i.e. CET1 < 0) [...] $\tau$ calibrated to the credit default swap (CDS) curve referencing the bank. Indeed we view the latter as the most reliable and informative credit data regarding anticipations of market participants about future recapitalization, government intervention, bail-in, and other bank failure resolution policies.

The aim of our capital structure model, instead, is to put in a balance sheet perspective the contra-assets and contra-liabilities of a dealer bank, items which are not present in the Merton model and play a key role in our XVA analysis.

In line with the Volcker rule banning proprietary trading for a bank, we assume a perfect market hedge of the derivative portfolio of the bank by the clean desks, in a sense to be specified below in the respective static and continuous-time setups. By contrast, as jump-to-default exposures (own jump-to-default exposure, in particular) cannot be hedged by the bank (cf. Section 1.1), we conservatively assume no XVA hedge.

We work on a measurable space $(\Omega, \mathcal{A})$ endowed with a probability measure $\mathbb{Q}^*$, with $\mathbb{Q}^*$ expectation denoted by $\mathbb{E}^*$, which is used for the linear valuation task, using the risk-free asset as our numéraire everywhere.

Remark 2.3

Regarding the nature of our reference probability measure $\mathbb{Q}^*$, "physical or risk-neutral", one should view it as a blend between the two. For instance, even if we do not use this explicitly in the paper, one could conceptually think of $\mathbb{Q}^*$ as the probability measure introduced by Dybvig (1992) to deal with incomplete markets that are a mix of financial traded risk factors and unhedgeable ones (jumps to default, in our setup), recently revisited in a finance and insurance context by Artzner, Eisele, and Schmidt (2020). Namely, one could think of $\mathbb{Q}^*$ as the unique probability measure on $\mathcal{A}$ (see Artzner, Eisele, and Schmidt (2020, Proposition 2.1) for a proof) that coincides (i) with a given risk-neutral pricing measure on the financial σ-algebra (a sub-σ-algebra of $\mathcal{A}$), and (ii) with the physical probability measure conditional on the financial σ-algebra (the risk-neutral and physical measures being assumed equivalent on the financial σ-algebra). The risk-neutral pricing measure (hence, in view of (i), $\mathbb{Q}^*$ itself) is calibrated to prices of fully collateralized transactions for which counterparty risk is immaterial. The physical probability measure expresses user views on the unhedgeable risk factors. The uncertainty about $\mathbb{Q}^*$ can be dealt with by a Bayesian variation on our baseline XVA approach, whereby paths of alternative, co-calibrated models are combined in a global simulation (cf. Hoeting, Madigan, Raftery, and Volinsky (1999)).

Until Section 4.2, we consider the case of a portfolio held on a run-off basis, i.e. set up at time 0 and such that no new unplanned trades enter the portfolio in the future.

The trading cash flows of the bank (cumulative cash flow streams starting from 0 at time 0) then consist of

• the contractually promised cash flows $P$ from counterparties,
• the counterparty credit cash flows $C$ to counterparties (i.e., because of counterparty risk, the effective cash flows from counterparties are $P - C$),
• the risky funding cash flows $F$ to the external funder, and
• the hedging cash flows $H$ of the clean desks to financial hedging markets

(note that all cash flow differentials can be positive or negative). See Section 3.1 and (49)–(50) for concrete specifications in respective one-period and continuous-time setups.

Assumption 2.1

i. (Self-financing condition) RC + RM + SCR + UC − CM evolves like the received trading cash flows $P - C - F - H$.

ii. (Mark-to-model) The amounts on all the accounts but UC are marked-to-model (hence the last, residual amount, UC, plays the role of an adjustment variable). Specifically, we assume that the following shareholder balance conditions hold at all times:

$$
\mathrm{CM} = \mathrm{MtM}, \qquad \mathrm{RC} = \mathrm{CA}, \qquad \mathrm{RM} = \mathrm{KVA}, \tag{4}
$$

for theoretical target levels MtM, CA, and KVA to be specified in later sections of the paper (which will also determine the theoretical target level for SCR).

iii. (Agents) The initial amounts $\mathrm{MtM}_0$, $\mathrm{CA}_0$, and $\mathrm{KVA}_0$ are provided by the clients at portfolio inception time 0. Resets between time 0 and the bank default time $\tau$ (excluded) are on bank shareholders. At the (positive) bank default time $\tau$, the property of the residual amount on the reserve capital and risk margin accounts is transferred from the shareholders to the bondholders of the bank.

Remark 2.4

In an asymmetric setup with a price maker and a price taker, the price maker passes his costs to the price taker. Accordingly, in our setup, the (corporate) clients provide all the amounts to the clean margin, reserve capital, and risk margin accounts of the bank required for resetting the accounts to their theoretical target levels (4) corresponding to the updated portfolio.

Under a cost-of-capital XVA approach, we define valuation so as to make shareholder trading losses (that include marked-to-model liability fluctuations) centered, then we add a KVA risk premium in order to ensure to bank shareholders some positive hurdle rate $h$ on their capital at risk.

In what follows, such an approach is developed, first, in a static setup, which can be solved explicitly, and then, in a dynamic and trade incremental setup, as suitable for dealing with a real derivative banking portfolio.

In this section, we apply the cost-of-capital XVA approach to a portfolio made of a single deal, $P$ (random variable promised to the bank), between a bank and a client, without prior endowment, in an elementary one-period (one year) setup. All the trading cash flows $P$, $C$, $F$, and $H$ are then random variables (as opposed to processes in a multi-period setup later in the paper). We first assume no collateral exchanged between the bank and its client (but collateral exchanged as always between the CA and the clean desks as well as collateral on the market hedge of the bank, the way explained after the respective Remarks 2.1 and 2.2). Risky funding assets are assumed fairly priced by the market, in the sense that $\mathbb{E}^* F = 0$.

The bank and client are both default prone with zero recovery to each other. The bank also has zero recovery to its external funder. We denote by $J$ and $J_1$ the survival indicators (random variables) of the bank and client at time 1, with default probability of the bank $\mathbb{Q}^*(J = 0) = \gamma$. Since prices and XVAs only matter at time 0 in a one-period setup, we identify all the XVA processes, as well as the mark-to-market (valuation by the clean desks) MtM of the deal, with their values at time 0.

For any random variable $\mathcal{Y}$, we define $\mathcal{Y}^\circ = J\mathcal{Y}$ and $\mathcal{Y}^\bullet = -(1 - J)\mathcal{Y}$, hence

$$
\mathcal{Y} = \mathcal{Y}^\circ - \mathcal{Y}^\bullet. \tag{5}
$$

Let $\mathbb{E}$ denote the expectation with respect to the bank survival measure, say $\mathbb{Q}$, associated with $\mathbb{Q}^*$, i.e., for any random variable $\mathcal{Y}$,

$$
\mathbb{E}\mathcal{Y} = (1 - \gamma)^{-1}\, \mathbb{E}^*(\mathcal{Y}^\circ) \tag{6}
$$

(which is also equal to $\mathbb{E}\mathcal{Y}^\circ$). The notion of bank survival measure was introduced in greater generality by Schönbucher (2004). In the present static setup, (6) is nothing but the $\mathbb{Q}^*$ expectation of $\mathcal{Y}$ conditional on the survival of the bank (note that, whenever $\mathcal{Y}$ is independent from $J$, the right-hand side in (6) coincides with $\mathbb{E}^*\mathcal{Y}$).

Lemma 3.1

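For any random variable $\mathcal{Y}$ and constant $Y$, we have

$$
Y = \mathbb{E}^*\big(\mathcal{Y}^\circ + (1 - J)Y\big) \iff Y = \mathbb{E}\mathcal{Y}. \tag{7}
$$

Proof. Indeed,

$$
Y = \mathbb{E}^*\big(J\mathcal{Y} + (1 - J)Y\big) \iff \mathbb{E}^*\big(J(\mathcal{Y} - Y)\big) = 0 \iff \mathbb{E}(\mathcal{Y} - Y) = 0 \iff Y = \mathbb{E}\mathcal{Y}, \tag{8}
$$

where the equivalence in the middle is justified by (6).

Remark 3.1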

For simplicity in a first stage, we will ignore the possibility of using capital at risk for funding purposes, only considering in this respect reserve capital RC = CA (cf. (4)). The additional free funding source provided by capital at risk will be introduced later, as well as collateral between bank and client, in Section 3.4.
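The survival expectation (6) and the fixed-point characterization of Lemma 3.1 can be checked numerically on a toy example. The following Python sketch is only an illustration under stated assumptions: the four scenarios, their probabilities, and the function names are hypothetical and do not come from the paper. The random variable is deliberately made dependent on the bank survival indicator $J$, so that its survival expectation differs from its plain $\mathbb{Q}^*$ expectation.

```python
# Hypothetical finite model under Q*: each scenario is (probability, J, y),
# where J is the bank survival indicator and y the value of the random variable.
SCENARIOS = [
    (0.60, 1, 2.0),
    (0.30, 1, -1.0),
    (0.04, 0, 5.0),
    (0.06, 0, 0.0),
]


def expect_star(scenarios, f):
    """Q*-expectation of f(J, y) over the finite scenario list."""
    return sum(p * f(j, y) for p, j, y in scenarios)


def survival_expectation(scenarios):
    """Formula (6): E Y = (1 - gamma)^{-1} E*(J Y), with gamma = Q*(J = 0)."""
    gamma = expect_star(scenarios, lambda j, y: 1 - j)
    return expect_star(scenarios, lambda j, y: j * y) / (1.0 - gamma)


def fixed_point(scenarios, n_iter=50):
    """Iterate c <- E*(J Y + (1 - J) c), i.e. the left-hand equation in (7)."""
    c = 0.0
    for _ in range(n_iter):
        c = expect_star(scenarios, lambda j, y, c=c: j * y + (1 - j) * c)
    return c


e_star = expect_star(SCENARIOS, lambda j, y: y)  # plain Q*-expectation
e_surv = survival_expectation(SCENARIOS)         # survival expectation (6)
c_star = fixed_point(SCENARIOS)                  # solution of the fixed-point equation

# Lemma 3.1: the fixed point coincides with the survival expectation.
assert abs(c_star - e_surv) < 1e-12
print(f"E*Y = {e_star:.4f}, EY = {e_surv:.4f}, fixed point = {c_star:.4f}")
```

The iteration is a contraction with factor $\gamma$ (here 0.1), which is why a few dozen iterations are more than enough; in closed form the limit is simply $\mathbb{E}^*(J\mathcal{Y})/(1-\gamma)$.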

Lemma 3.2

Given the (to be specified) MtM and CA amounts (cf. Assumption 2.1.ii), the credit and funding cash flows $C$ and $F$ of the bank and its trading loss (and profit) $L$ are such that

$$
\begin{aligned}
& C^\circ = J(1 - J_1)P^+, \qquad F^\circ = J\gamma(\mathrm{MtM} - \mathrm{CA})^+ \\
& C^\bullet = (1 - J)\big(P^- - (1 - J_1)P^+\big), \qquad F^\bullet = (1 - J)\big((\mathrm{MtM} - \mathrm{CA})^+ - \gamma(\mathrm{MtM} - \mathrm{CA})^+\big) \\
& L^\circ = C^\circ + F^\circ - J\,\mathrm{CA}, \qquad L^\bullet = C^\bullet + F^\bullet + (1 - J)\,\mathrm{CA}, \qquad L = C + F - \mathrm{CA}.
\end{aligned} \tag{9}
$$

Proof. For the deal to occur, the bank needs to borrow $(\mathrm{MtM} - \mathrm{CA})^+$ unsecured or invest $(\mathrm{MtM} - \mathrm{CA})^-$ risk-free (cf. Remark 3.1). Having assumed zero recovery to the external funder, unsecured borrowing is fairly priced as $\gamma$ × the amount borrowed by the bank (in line with our assumption that $\mathbb{E}^* F = 0$), i.e. the bank must pay for its risky funding the amount $\gamma(\mathrm{MtM} - \mathrm{CA})^+$. Moreover, at time 1, under zero recovery upon defaults:

• If the bank is not in default (i.e. $J = 1$), then the bank closes its position with the client while receiving $P$ from its client if the latter is not in default (i.e. $J_1 = 1$), whereas the bank pays $P^-$ to its client if the latter is in default (i.e. $J_1 = 0$). In addition, the bank reimburses its funding debt $(\mathrm{MtM} - \mathrm{CA})^+$ or receives back the amount $(\mathrm{MtM} - \mathrm{CA})^-$ it had lent at time 0;
• If the bank is in default (i.e. $J = 0$), then the bank receives back $J_1 P^+$ on the derivative as well as the amount $(\mathrm{MtM} - \mathrm{CA})^-$ it had lent at time 0.

Also accounting for the hedging loss $H$, the trading loss of the bank over the year is

$$
\begin{aligned}
L = {} & \gamma(\mathrm{MtM} - \mathrm{CA})^+ - J\big(J_1 P - (1 - J_1)P^- - (\mathrm{MtM} - \mathrm{CA})^+ + (\mathrm{MtM} - \mathrm{CA})^-\big) \\
& - (1 - J)\big(J_1 P^+ + (\mathrm{MtM} - \mathrm{CA})^-\big) + H.
\end{aligned} \tag{10}
$$

In the static setup, the perfect clean hedge condition (see after Remark 2.2) writes $H = P - \mathrm{MtM}$. Inserting this into the above yields

$$
L = (1 - J_1)P^+ + \gamma(\mathrm{MtM} - \mathrm{CA})^+ - \mathrm{CA} - (1 - J)\big(P^- + (\mathrm{MtM} - \mathrm{CA})^+\big), \tag{11}
$$

as easily checked for each of the four possible values of the pair $(J, J_1)$. That is,

$$
\begin{aligned}
L^\circ &= \underbrace{J(1 - J_1)P^+}_{C^\circ} + \underbrace{J\gamma(\mathrm{MtM} - \mathrm{CA})^+}_{F^\circ} - J\,\mathrm{CA} \\
L^\bullet &= \underbrace{(1 - J)\big(P^- - (1 - J_1)P^+\big)}_{C^\bullet} + \underbrace{(1 - J)\big((\mathrm{MtM} - \mathrm{CA})^+ - \gamma(\mathrm{MtM} - \mathrm{CA})^+\big)}_{F^\bullet} + (1 - J)\,\mathrm{CA},
\end{aligned} \tag{12}
$$

where the identification of the different terms as part of $C$ or $F$ follows from their financial interpretation.

Remark 3.2

The derivation (10) implicitly allows for negative equity (that arises whenever $L^\circ > \mathrm{CET1}$, cf. (3)), which is interpreted as recapitalization. In a variant of the model excluding both recapitalization and negative equity, the default of the bank would be modeled in a structural fashion as the event $\{\mathcal{L} = \mathrm{CET1}\}$, where

$$
\mathcal{L} = \big((1 - J_1)P^+ + \gamma(\mathrm{MtM} - \mathrm{CA})^+ - \mathrm{CA}\big) \wedge \mathrm{CET1}, \tag{13}
$$

and we would obtain, instead of (11), the following trading loss for the bank:

$$
\mathbb{1}_{\{\mathrm{CET1} > \mathcal{L}\}}\, \mathcal{L} + \mathbb{1}_{\{\mathrm{CET1} = \mathcal{L}\}}\big(\mathrm{CET1} - P^- - (\mathrm{MtM} - \mathrm{CA})^+\big). \tag{14}
$$

In this paper we consider a model with recapitalization for the reasons explained in Remark 2.2.

Structural XVA approaches in a static setup have been proposed in Andersen, Duffie, and Song (2019) (without KVA) and Kjaer (2019) (including the KVA). Their marginal, limiting results as a new deal size goes to zero are comparable to some of the results that we have here. But then, instead of developing a continuous time version of their corporate finance model and taking the small trade limit, these papers start the development of the continuous time model from the single period small trade limit model. By contrast, in our framework, we have end to end development in the continuous time model of Section 4 and in the present single period model.

3.2 Contra-assets and Contra-liabilities

To make shareholder trading losses centered (cf. the next-to-last paragraph of Section 2), clean and CA desks value by $\mathbb{Q}^*$ expectation their shareholder sensitive cash flows. These include, in case of default of the bank, the transfer of property from the CA desks to the clean desks of the collateral amount MtM on the clean margin account, as well as (cf. Assumptions 2.1.ii and iii) the transfer from shareholders to bondholders of the residual value RC = CA on the reserve capital account. Accordingly:

Definition 3.1

We let

$$
\mathrm{MtM} = \mathbb{E}^*\big(P^\circ + (1 - J)\,\mathrm{MtM}\big) \tag{15}
$$

and

$$
\mathrm{CA} = \mathrm{CVA} + \mathrm{FVA}, \tag{16}
$$

where

$$
\begin{aligned}
\mathrm{CVA} &= \mathbb{E}^*\big(C^\circ + (1 - J)\,\mathrm{CVA}\big) \\
\mathrm{FVA} &= \mathbb{E}^*\big(F^\circ + (1 - J)\,\mathrm{FVA}\big),
\end{aligned} \tag{17}
$$

hence $\mathrm{CA} = \mathbb{E}^*\big(C^\circ + F^\circ + (1 - J)\,\mathrm{CA}\big)$. We also define the contra-liabilities value

$$
\mathrm{CL} = \mathrm{DVA} + \mathrm{FDA}, \tag{18}
$$

where

$$
\mathrm{DVA} = \mathbb{E}^*\big(C^\bullet + (1 - J)\,\mathrm{CVA}\big) \tag{19}
$$

$$
\mathrm{FDA} = \mathbb{E}^*\big(F^\bullet + (1 - J)\,\mathrm{FVA}\big). \tag{20}
$$

Finally we define the firm valuation of counterparty risk,

$$
\mathrm{FV} = \mathbb{E}^*(C + F). \tag{21}
$$

The definitions of MtM, CVA, and FVA are in fact fixed-point equations. However, the following result shows that these equations are well-posed and yields explicit formulas for all the quantities at hand.

Proposition 3.1

We have

$$\begin{aligned}
\mathrm{MtM} &= \mathbb{E} P^\circ \\
\mathrm{CVA} &= \mathbb{E}\big((1 - J_1) P^+\big) \\
\mathrm{FVA} &= \gamma (\mathrm{MtM} - \mathrm{CA})^+ = \frac{\gamma}{1+\gamma} (\mathrm{MtM} - \mathrm{CVA})^+
\end{aligned} \tag{22}$$

and

$$\begin{aligned}
\mathbb{E}^* L^\circ &= \mathbb{E} L = 0 \\
\mathrm{FDA} &= \mathrm{FVA} \\
\mathrm{FV} &= \mathbb{E}^* C = \mathrm{CVA} - \mathrm{DVA} = \mathrm{CA} - \mathrm{CL}.
\end{aligned} \tag{23}$$

Proof. The first identities in each line of (22) follow from Definition 3.1 by Lemma 3.1 and definition of the involved cash flows in Lemma 3.2. Given (16), the formula $\mathrm{FVA} = \gamma (\mathrm{MtM} - \mathrm{CA})^+$ in (22) is in fact a semi-linear equation

$$\mathrm{FVA} = \gamma (\mathrm{MtM} - \mathrm{CVA} - \mathrm{FVA})^+. \tag{24}$$

But, as $\gamma$ (a probability) is nonnegative, this equation has the unique solution given by the right-hand side in the third line of (22). Indeed, if $\mathrm{MtM} \le \mathrm{CVA}$, then the only solution is $\mathrm{FVA} = 0$; otherwise the positive part is active and $\mathrm{FVA} = \gamma(\mathrm{MtM} - \mathrm{CVA} - \mathrm{FVA})$ gives $\mathrm{FVA} = \frac{\gamma}{1+\gamma}(\mathrm{MtM} - \mathrm{CVA})$.

Regarding (23), we have

$$\mathbb{E}^* L^\circ = (1 - \gamma)\, \mathbb{E}\big((1 - J_1) P^+ + \gamma (\mathrm{MtM} - \mathrm{CA})^+ - \mathrm{CA}\big) = 0,$$

by application of (6), the first line in (12), (22), and (16). Hence, using (6) again,

$$\mathbb{E} L = (1 - \gamma)^{-1}\, \mathbb{E}^* L^\circ = 0.$$

This is the first line in (23), which implies the following ones by definition of the involved quantities and from the assumption that $\mathbb{E}^* F = 0$.

Note that $\mathrm{MtM} = \mathbb{E} P^\circ$ also coincides with $\mathbb{E} P$ (cf. (22) and the parenthesis following (6)). In practice $P^\circ$ has fewer terms than $P$ (which also includes cash flows from bank default onward), which is why we favor the formulation $\mathbb{E} P^\circ$ in (22). The alternative formulation $\mathbb{E} P$ may seem more in line with the intuition of MtM as a value devoid of any credit/funding considerations. However, as the measure underlying $\mathbb{E}$ is the survival one (see before Lemma 3.1), this intuition is in fact simplistic and only strictly correct in the case without wrong way risk between credit and market (cf. the parenthesis preceding Lemma 3.1).

Economic capital (EC) is the level of capital at risk that a regulator would like to see on an economic, structural basis. Risk calculations are typically performed by banks “on a going concern”, i.e. assuming that the bank itself does not default. Accordingly:

Deﬁnition 3.2

The economic capital (EC) of the bank is given by the 97.5% expected shortfall of the bank trading loss $L$ under $\mathbb{Q}$, which we denote by $\mathbb{ES}(L^\circ)$.

The risk margin (sized by the to-be-defined KVA in our setup) is also loss-absorbing, i.e. part of capital at risk, and the KVA is originally sourced from the client (see Assumption 2.1.iii). Hence, shareholder capital at risk only consists of the difference between the (total) capital at risk and the KVA. Accordingly (and also accounting, regarding (26), for the last part in Assumption 2.1.iii):

Definition 3.3

The capital at risk (CR) of the bank is given by $\max(\mathrm{EC}, \mathrm{KVA})$ and the ensuing shareholder capital at risk (SCR) by

$$\mathrm{SCR} = \max(\mathrm{EC}, \mathrm{KVA}) - \mathrm{KVA} = (\mathrm{EC} - \mathrm{KVA})^+, \tag{25}$$

where, given some hurdle rate (target return-on-equity) $h$,

$$\mathrm{KVA} = \mathbb{E}^*\big(h\, \mathrm{SCR}^\circ + (1 - J)\,\mathrm{KVA}\big). \tag{26}$$

(On expected shortfall, see e.g. Föllmer and Schied (2016, Section 4.4). Note that, by definition of $\mathbb{Q}$, the quantity $\mathbb{ES}(L^\circ)$ does not depend on $L^\bullet$.)

Remark 3.3

In view of (26) and of the last balance condition in (4), we have

$$\mathrm{RM}^{sh} = \mathbb{E}^*\big(h\, \mathrm{SCR}^\circ\big), \qquad \mathrm{RM}^{bh} = \mathbb{E}^*\big((1 - J)\,\mathrm{KVA}\big). \tag{27}$$

We refer the reader to the last bullet point in Albanese and Crépey (2020, Definition A.1) for the analogous split of RM between shareholder and bondholder wealth in a dynamic, continuous-time setup.

Proposition 3.2

We have

$$\mathrm{KVA} = h\, \mathrm{SCR} = \frac{h}{1+h}\, \mathrm{EC} = \frac{h}{1+h}\, \mathbb{ES}(L^\circ). \tag{28}$$

Proof. The first identity follows from Lemma 3.1. The resulting KVA semi-linear equation (in view of (25)) is solved similarly to the FVA equation (24).

The KVA formula (28) (as well as its continuous-time analog (56)) can be used either in the direct mode, for computing the KVA corresponding to a given $h$, or in the reverse-engineering mode, for defining the “implied hurdle rate” associated with the actual level on the risk margin account of the bank. Cost of capital proxies have always been used to estimate return-on-equity. The KVA is a refinement, fine-tuned for derivative portfolios, but the base return-on-equity concept itself is far older than even the CVA. In particular, the KVA is very useful in the context of collateral and capital optimization.

KVA Risk Premium and Indifference Pricing Interpretation

The CA component of the FTP corresponds to the expected costs for the shareholders of concluding the deal. This CA component makes the shareholder trading loss $L^\circ$ centered (cf. the first line in (23)). On top of expected shareholder costs, the bank charges to the clients a risk margin (RM). Assume the bank shareholders endowed with a utility function $U$ on $\mathbb{R}$ such that $U(0) = 0$. In a shareholder indifference pricing framework, the risk margin arises as per the following equation:

$$\mathbb{E}^* U\big(J(\mathrm{RM} - L)\big) = \mathbb{E}^* U(0) = 0 \tag{29}$$

(the expected utility of the bank shareholders without the deal), where

$$\mathbb{E}^* U\big(J(\mathrm{RM} - L)\big) = \mathbb{E}^*\big(J\, U(\mathrm{RM} - L)\big) = (1 - \gamma)\, \mathbb{E}\, U(\mathrm{RM} - L),$$

by (6). Hence

$$\mathbb{E}\, U(\mathrm{RM} - L) = 0. \tag{30}$$

The corresponding RM is interpreted as the minimal admissible risk margin for the deal to occur, seen from bank shareholders’ perspective.

Taking for concreteness $U(-\ell) = \frac{1 - e^{\rho \ell}}{\rho}$, for some risk aversion parameter $\rho$, (30) yields $\mathrm{RM} = \rho^{-1} \ln \mathbb{E}\, e^{\rho L} = \rho^{-1} \ln \mathbb{E}\, e^{\rho L^\circ}$, by the observation following (6). In the limiting case where the shareholder risk aversion parameter $\rho \to 0$, we have $\mathbb{E}\, U(-L) \to -\mathbb{E}(L) = 0$ (by the first line in (23)), hence $\mathrm{RM} \to 0$. In view of (4) and (28), the corresponding implied KVA and hurdle rate $h$ are such that

$$\mathrm{KVA} = \rho^{-1} \ln \mathbb{E}\, e^{\rho L^\circ}, \qquad \frac{h}{1+h} = \frac{\rho^{-1} \ln \mathbb{E}\, e^{\rho L^\circ}}{\mathbb{ES}(L^\circ)}. \tag{31}$$

Hence, “for $h$ and $\rho$ small”,

$$h \approx \frac{\mathbb{V}\mathrm{ar}(L^\circ)}{2\, \mathbb{ES}(L^\circ)}\, \rho \tag{32}$$

(as $\mathbb{E}(L^\circ) = 0$), where $\mathbb{V}\mathrm{ar}$ is the $\mathbb{Q}$ variance operator. The hurdle rate $h$ in our KVA setup plays the role of a risk aversion parameter, like $\rho$ in the exponential utility framework.

An indifference price has a competitive interpretation. Assume that the bank is competing for the client with other banks. Then, in the limit of a continuum of competing banks with a continuum of indifference prices, whenever a bank makes a deal, this can only be at its indifference price.
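The relations (31) and (32) can be illustrated numerically. The following is a minimal sketch, not the computation used in the paper: it assumes a hypothetical centered Gaussian loss sample, uses plain empirical estimators, and the function names (`risk_margin`, `expected_shortfall`, `implied_hurdle_rate`, `small_rho_hurdle_rate`) are illustrative only.

```python
import math
import random


def risk_margin(losses, rho):
    """Indifference risk margin RM = ln(E[exp(rho * L)]) / rho (exponential utility)."""
    mean_exp = sum(math.exp(rho * x) for x in losses) / len(losses)
    return math.log(mean_exp) / rho


def expected_shortfall(losses, alpha=0.975):
    """Empirical expected shortfall: average of the worst (1 - alpha) share of losses."""
    ordered = sorted(losses)
    tail = ordered[int(alpha * len(ordered)):]
    return sum(tail) / len(tail)


def implied_hurdle_rate(losses, rho, alpha=0.975):
    """Solve h / (1 + h) = RM / ES for h, following (31)."""
    ratio = risk_margin(losses, rho) / expected_shortfall(losses, alpha)
    return ratio / (1.0 - ratio)


def small_rho_hurdle_rate(losses, rho, alpha=0.975):
    """First-order proxy (32): h ~ Var(L) * rho / (2 * ES(L)) for a centered loss."""
    variance = sum(x * x for x in losses) / len(losses)
    return variance * rho / (2.0 * expected_shortfall(losses, alpha))


# Hypothetical loss sample: standard Gaussian draws, recentered so that E[L] = 0.
rng = random.Random(0)
raw = [rng.gauss(0.0, 1.0) for _ in range(200_000)]
mean_raw = sum(raw) / len(raw)
losses = [x - mean_raw for x in raw]

rho = 0.05  # assumed risk aversion parameter
rm = risk_margin(losses, rho)
h_implied = implied_hurdle_rate(losses, rho)
h_proxy = small_rho_hurdle_rate(losses, rho)
print(f"RM = {rm:.5f}, implied h = {h_implied:.5f}, proxy h = {h_proxy:.5f}")
```

For a small risk aversion parameter the implied hurdle rate and its first-order proxy should be close, which is the content of (32); for larger values of `rho` the gap widens.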
Our stylized indifference pricing model of a KVA defined by a constant hurdle rate $h$ exogenizes (by comparison with the endogenous hurdle rate $h$ in (31)) the impact on pricing of the competition between banks. It does so in a way that generalizes smoothly to a dynamic setup (see Section 4), as required to deal with a real derivative banking portfolio. It then provides a refined notion of return-on-equity for derivative portfolios, where a full-fledged optimization approach would be impractical.

In case of variation margin (VM) that would be exchanged between the bank and its client, and of initial margin that would be received (RIM) and posted (PIM) by the bank, at the level of, say, some $\mathbb{Q}$ value-at-risk of $\pm(P - \mathrm{VM})$, then

• $P$ needs be replaced by $(P - \mathrm{VM} - \mathrm{RIM})$ everywhere in the above, whence an accordingly modified (in principle: diminished) CVA,

• an additional initial margin related cash flow in $F^\circ$ given as $J \gamma\, \mathrm{PIM}$, triggering an additional adjustment MVA in CA, where

$$\mathrm{MVA} = \mathbb{E}^*\big(J \gamma\, \mathrm{PIM} + (1 - J)\,\mathrm{MVA}\big) = \gamma\, \mathrm{PIM}; \tag{33}$$

• additional initial margin related cash flows in $F^\bullet$ given as $(1 - J)(\mathrm{PIM} - \gamma\, \mathrm{PIM})$ and $(1 - J)\,\mathrm{MVA}$, triggering an additional adjustment MDA = MVA in CL;

• the second FVA formula in (22) modified into $\mathrm{FVA} = \frac{\gamma}{1+\gamma} (\mathrm{MtM} - \mathrm{VM} - \mathrm{CVA} - \mathrm{MVA})^+$.

Accounting further for the additional free funding source provided by capital at risk (cf. Remark 3.1), then, in view of the specification given in the first sentence of Definition 3.3 for the latter, one needs replace $(\mathrm{MtM} - \mathrm{CA})^\pm$ by $(\mathrm{MtM} - \mathrm{VM} - \mathrm{CA} - \max(\mathrm{EC}, \mathrm{KVA}))^\pm$ everywhere before. This results in the same CVA and MVA as in the bullet points above, but in the following system for the random variable $L^\circ$ and the FVA and the KVA numbers (cf. the corresponding lines in (12), (22), (28), and recall (16)):

$$\begin{aligned}
L^\circ &= J (1 - J_1) P^+ + J \gamma \big(\mathrm{MtM} - \mathrm{VM} - \mathrm{CA} - \max(\mathrm{EC}, \mathrm{KVA})\big)^+ + J \gamma\, \mathrm{PIM} - J\, \mathrm{CA} \\
\mathrm{FVA} &= \gamma \big(\mathrm{MtM} - \mathrm{VM} - \mathrm{CA} - \max(\mathrm{EC}, \mathrm{KVA})\big)^+ \\
\mathrm{KVA} &= \frac{h}{1+h}\, \mathbb{ES}(L^\circ).
\end{aligned} \tag{34}$$

This system entails a coupled dependence between, on the one hand, the FVA and KVA numbers and, on the other hand, the shareholder loss process $L^\circ$. However, once CVA, PIM, RIM, and MVA are computed as in the above, the system (34) can be addressed numerically by Picard iteration, starting from, say, $L^{(0)} = \mathrm{KVA}^{(0)} = 0$ and $\mathrm{FVA}^{(0)} = \frac{\gamma}{1+\gamma} (\mathrm{MtM} - \mathrm{VM} - \mathrm{CVA} - \mathrm{MVA})^+$ (cf. the last line in (22)), and then iterating in (34) until numerical convergence.

Remark 3.4

The rationale for funding FVA but not MVA from $\mathrm{CA} + \max(\mathrm{EC}, \mathrm{KVA})$ is set out before Equation (15) in Albanese, Caenazzo, and Crépey (2017).
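The Picard iteration on the system (34) can be sketched as follows. This is an illustrative sketch under simplifying assumptions that are not in the text: no margins (VM = RIM = PIM = MVA = 0), samples drawn under the bank survival measure (so that the bank survival indicator $J$ equals 1), a plain empirical expected shortfall, and hypothetical numerical inputs. The names `picard_static` and `expected_shortfall` are illustrative.

```python
import random


def expected_shortfall(losses, alpha=0.975):
    """Empirical expected shortfall: average of the worst (1 - alpha) share of losses."""
    ordered = sorted(losses)
    tail = ordered[int(alpha * len(ordered)):]
    return sum(tail) / len(tail)


def picard_static(credit_losses, mtm, cva, gamma, h, tol=1e-10, max_iter=100):
    """Picard iteration on (34) without margins (VM = RIM = PIM = MVA = 0).

    credit_losses holds samples of (1 - J_1) P^+ under the bank survival
    measure, where the bank survival indicator J equals 1 (an assumption
    of this sketch).
    """
    fva = gamma / (1.0 + gamma) * max(mtm - cva, 0.0)  # FVA^(0)
    kva, ec = 0.0, 0.0                                  # KVA^(0) = 0, ES(L^(0)) = 0
    for iteration in range(1, max_iter + 1):
        ca = cva + fva
        funding = gamma * max(mtm - ca - max(ec, kva), 0.0)
        loss = [x + funding - ca for x in credit_losses]          # first line of (34)
        ec = expected_shortfall(loss)
        new_kva = h / (1.0 + h) * ec                              # third line of (34)
        new_fva = gamma * max(mtm - ca - max(ec, new_kva), 0.0)   # second line of (34)
        change = abs(new_fva - fva) + abs(new_kva - kva)
        fva, kva = new_fva, new_kva
        if change < tol:
            return fva, kva, ec, iteration
    raise RuntimeError("Picard iteration did not converge")


# Hypothetical inputs: one client with a 1% default probability and a Gaussian exposure.
rng = random.Random(1)
n = 50_000
mtm, gamma, h, client_pd = 100.0, 0.02, 0.10, 0.01
credit_losses = [
    max(rng.gauss(mtm, 20.0), 0.0) if rng.random() < client_pd else 0.0
    for _ in range(n)
]
cva = sum(credit_losses) / n  # CVA = E[(1 - J_1) P^+], as in (22)

fva, kva, ec, n_iter = picard_static(credit_losses, mtm, cva, gamma, h)
print(f"CVA = {cva:.4f}, FVA = {fva:.4f}, EC = {ec:.4f}, KVA = {kva:.4f}, iterations = {n_iter}")
```

Under these assumptions the funding cash flow and its FVA valuation offset each other in $L^\circ$ at the fixed point, so the limit can be cross-checked against $\mathrm{FVA} = \frac{\gamma}{1+\gamma}(\mathrm{MtM} - \mathrm{CVA} - \mathrm{EC})^+$ with EC equal to the expected shortfall of the credit loss minus the CVA. In the dynamic setup no such shortcut is available and the iteration is run on simulated paths.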

The funds transfer price (all-inclusive XVA rebate to MtM) aligning the deal to shareholder interest (in the sense of a given hurdle rate $h$, cf. the next-to-last paragraph of Section 2) is

$$\begin{aligned}
\mathrm{FTP} &= \underbrace{\mathrm{CVA} + \mathrm{FVA}}_{\text{Expected shareholder costs CA}} + \underbrace{\mathrm{KVA}}_{\text{Shareholder risk premium}} \\
&= \underbrace{\mathrm{CVA} - \mathrm{DVA}}_{\text{Firm valuation FV}} + \underbrace{\mathrm{DVA} + \mathrm{FDA}}_{\text{Wealth transfer CL}} + \underbrace{\mathrm{KVA}}_{\text{Shareholder risk premium}},
\end{aligned} \tag{35}$$

where all terms are explicitly given in Propositions 3.1 and 3.2 (or the corresponding variants of Section 3.4 in the refined setup considered there).

Wealth Transfer Analysis

The above results implicitly assumed that the bank cannot hedge jump-to-default cash flows (cf. Section 1.1). To understand this, let us temporarily suppose, for the sake of the argument, that the bank would be able to hedge its own jump-to-default through a further deal, whereby the bank would deliver a payment $L^\bullet$ at time 1 in exchange of a fee fairly valued as

$$\mathrm{CL} = \mathbb{E}^* L^\bullet = \mathrm{DVA} + \mathrm{FDA}, \tag{36}$$

deposited in the reserve capital account of the bank at time 0.

We include this hedge and assume that the client would now contribute at the level of $\mathrm{FV} = \mathrm{CA} - \mathrm{CL}$ (cf. (23)), instead of CA before, to the reserve capital account of the bank at time 0. Then the amount that needs be borrowed by the bank for implementing its strategy is still $\gamma (\mathrm{MtM} - \mathrm{CA})^+$ as before (back to the baseline funding setup of Remark 3.1). But the trading loss of the bank becomes, instead of $L$ before,

$$C + F - \mathrm{FV} + (L^\bullet - \mathrm{CL}) = C + F - \mathrm{CA} + L^\bullet = L + L^\bullet = L^\circ, \tag{37}$$

where the last line in (23) and the last identity in (9) were used in the first and second equality. By comparison with the situation from previous sections without own-default hedge by the bank:

• the shareholders are still indifferent to the deal in expected counterparty default and funding expenses terms,

• the recovery of the bondholders becomes zero,

• the client is better off by the amount $\mathrm{CA} - \mathrm{FV} = \mathrm{CL}$.

The CL originating cash flow $L^\bullet$ has been hedged and monetized by the shareholders, who have passed the corresponding benefit to the client.

Under a cost-of-capital pricing approach, the bank would still charge to its client a KVA add-on $\frac{h}{1+h}\, \mathbb{ES}(L^\circ)$, as risk compensation for the nonvanishing shareholder trading loss $L^\circ$ still triggered by the deal. If, however, the bank could also hedge the (zero-valued, by the first line in (23)) loss $L^\circ$, hence the totality of $L = L^\circ - L^\bullet$ (instead of $L^\bullet$ only in the above), then the trading loss and the KVA would vanish. As a result, the all-inclusive XVA add-on (rebate from MtM valuation) would boil down to $\mathrm{FV} = \mathrm{CVA} - \mathrm{DVA}$ (cf. (1)), the value of counterparty risk and funding to the bank as a whole.
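The arithmetic of (35) and of the wealth transfer discussion can be made concrete with a few lines of code. This is a sketch with hypothetical input numbers; the class name `StaticXva` and its method names are illustrative and not part of the paper. It only encodes the identities CA = CVA + FVA, CL = DVA + FDA, FDA = FVA and FV = CVA − DVA stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticXva:
    """Static XVA numbers (hypothetical inputs) and the identities of (16), (18), (23), (35)."""
    cva: float
    dva: float
    fva: float
    kva: float

    @property
    def fda(self) -> float:
        return self.fva                      # FDA = FVA, second line of (23)

    @property
    def ca(self) -> float:
        return self.cva + self.fva           # contra-assets, (16)

    @property
    def cl(self) -> float:
        return self.dva + self.fda           # contra-liabilities, (18)

    @property
    def fv(self) -> float:
        return self.cva - self.dva           # firm valuation, last line of (23)

    def ftp(self) -> float:
        """No hedge of jump-to-default cash flows: FTP = CA + KVA, as in (35)."""
        return self.ca + self.kva

    def ftp_with_own_default_hedge(self) -> float:
        """Own jump-to-default hedged: the client contributes FV instead of CA, plus the KVA."""
        return self.fv + self.kva

    def ftp_fully_hedged(self) -> float:
        """Whole trading loss hedged: the add-on boils down to FV."""
        return self.fv


# Hypothetical numbers, for illustration only.
xva = StaticXva(cva=1.0, dva=0.4, fva=1.2, kva=3.5)
print("FTP (no hedge)            :", round(xva.ftp(), 6))
print("FTP (own-default hedge)   :", round(xva.ftp_with_own_default_hedge(), 6))
print("FTP (fully hedged)        :", round(xva.ftp_fully_hedged(), 6))
print("client gain from the hedge:", round(xva.ftp() - xva.ftp_with_own_default_hedge(), 6))
```

The difference between the first two prices equals CL, the amount by which the client is better off in the discussion above.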

Connection With the Modigliani-Miller Theory

The Modigliani-Miller invariance result, with Modigliani and Miller (1958) as a seminal reference, consists in various facets of a broad statement that the funding and capital structure policies of a firm are irrelevant to the profitability of its investment decisions. Modigliani-Miller (MM) irrelevance, as we put it for brevity hereafter, was initially seen as a pure arbitrage result. However, it was later understood that there may be market incompleteness issues with it. So quoting Duffie and Sharer (1986, page 9), “generically, shareholders find the span of incomplete markets a binding constraint [...] shareholders are not indifferent to the financial policy of the firm if it can change the span of markets (which is typically the case in incomplete markets)”; or Gottardi (1995, page 197): “When there are derivative securities and markets are incomplete the financial decisions of the firm have generally real effects”.

A situation where shareholders may “find the span of incomplete markets a binding constraint” is when market completion is legally forbidden. This corresponds to the XVA case, which is also at the crossing between market incompleteness and the presence of derivatives pointed out above as the MM non-irrelevance case in Gottardi (1995). Specifically, the contra-assets and contra-liabilities that emerge endogenously from the impact of counterparty risk on the derivative portfolio of a bank cannot be “undone” by shareholders, because jump-to-default risk cannot be replicated by a bank.

As a consequence, MM irrelevance is expected to break down in the XVA setup. In fact, as visible on the trade incremental FTP (counterparty risk pricing) formula (35) (cf. also (43) and Proposition 4.2 in a dynamic and trade incremental setup below), cost of funding and cost of capital are material to banks and need be reflected in entry prices for ensuring shareholder indifference to the trades, i.e. preserving their hurdle rate throughout trades.

We now consider a dynamic, continuous-time setup, with model filtration $\mathbb{G}$ and a (positive) bank default time $\tau$ endowed with an intensity $\gamma$. The bank survival probability measure associated with the measure $\mathbb{Q}^*$ is then the probability measure $\mathbb{Q}$ with $(\mathbb{G}, \mathbb{Q}^*)$ density process $J e^{\int_0^\cdot \gamma_s\, ds}$ (assumed integrable), where $J = \mathbb{1}_{[0,\tau)}$ is the bank survival indicator process (cf. Schönbucher (2004) and Collin-Dufresne, Goldstein, and Hugonnier (2004)). In particular, writing $Y^\circ = JY + (1 - J) Y_{\tau-}$, for any left-limited process $Y$, we have by application of the results of Crépey and Song (2017) (cf. the condition (A) there):

Lemma 4.1

For every $\mathbb{Q}$ (resp. sub-, resp. super-) martingale $Y$, the process $Y^\circ$ is a $\mathbb{Q}^*$ (resp. sub-, resp. super-) martingale.

Remark 4.1

In the dynamic setup, the survival measure formulation is a light presentation, sufficient for the purpose of the present paper (skipping the related integrability issues), of an underlying reduction of filtration setup, which is detailed in the above-mentioned reference (regarding Lemma 4.1, cf. also Collin-Dufresne, Goldstein, and Hugonnier (2004, Lemma 1)).

4.1 Case of a Run-Off Portfolio

First, we consider the case of a portfolio held on a run-off basis (cf. Section 2.1). We denote by $T$ the final maturity of the portfolio and we assume that all prices and XVAs vanish at time $T$ if $T < \tau$. Then the results of Albanese and Crépey (2020) show that all the qualitative insights provided by the one-period XVA analysis of Section 3 are still valid. The trading loss of the bank is now given by the process

$$L = C + F + \mathrm{CA} - \mathrm{CA}_0 \tag{38}$$

and the bank shareholder trading loss by the $\mathbb{Q}$ (hence $\mathbb{Q}^*$, by Lemma 4.1) martingale

$$L^\circ = C^\circ + F^\circ + \mathrm{CA}^\circ - \mathrm{CA}_0. \tag{39}$$

In (38)-(39), we have CA = CVA + FVA as in (16); the processes $C$, $F$, CVA, and FVA are continuous-time analogs, detailed in the case of bilateral trade portfolios in Sections A.1-A.2, of the eponymous quantities in Section 3 (which were constants or random variables there).

Proposition 4.1

The core equity tier 1 capital of the bank is given by

$$\mathrm{CET1} = \mathrm{CET1}_0 - L. \tag{40}$$

Shareholder equity is given by

$$\mathrm{SHC} = \mathrm{SHC}_0 - (L + \mathrm{KVA} - \mathrm{KVA}_0). \tag{41}$$

Proof. In the continuous-time setup, Assumption 2.1.i is written as

$$\mathrm{RC} + \mathrm{RM} + \mathrm{SCR} + \mathrm{UC} - \mathrm{CM} - (\mathrm{RC} + \mathrm{RM} + \mathrm{SCR} + \mathrm{UC} - \mathrm{CM})_0 = P - (C + F + H).$$

Given the definition of CET1 in (3), the perfect clean hedge condition (see after Remark 2.2) written in the dynamic setup as $P + \mathrm{MtM} - \mathrm{MtM}_0 - H = 0$, and the balance conditions (4), this is equivalent to

$$\mathrm{CA} + \mathrm{CET1} - (\mathrm{CA} + \mathrm{CET1})_0 = -(C + F).$$

In view of (38), we obtain (40). As SHC = CET1 − RM (cf. (3)), we have by (40):

$$\mathrm{SHC} = \mathrm{CET1}_0 - L - \mathrm{RM} = \mathrm{CET1}_0 - \mathrm{RM}_0 - (L + \mathrm{RM} - \mathrm{RM}_0),$$

which, by the third balance condition in (4), yields (41).

Moreover, by Lemma 4.1, the continuous-time process $\mathrm{KVA}^\circ$ that stems from (54)-(55) is a $\mathbb{Q}^*$ supermartingale with terminal condition $\mathrm{KVA}^\circ_T = 0$ on $\{T < \tau\}$ and drift coefficient $h\, \mathrm{SCR}$, where SCR is given as in (25), but for EC there dynamically defined as the time-$t$ conditional, 97.5% expected shortfall of $(L^\circ_{t+1} - L^\circ_t)$ under $\mathbb{Q}$, killed at $\tau$.

Remark 4.2

It is only before $\tau$ that the right-hand sides in the definitions (3) really deserve the respective interpretations of shareholder equity of the bank and core equity tier 1 capital. Hence, it is only the parts of (40) and (41) stopped before $\tau$, i.e.

$$\mathrm{CET1}^\circ = \mathrm{CET1}_0 - L^\circ, \qquad \mathrm{SHC}^\circ = \mathrm{SHC}_0 - (L^\circ + \mathrm{KVA}^\circ - \mathrm{KVA}_0), \tag{42}$$

which are interesting financially.

4.2 Trade Incremental Cost-of-Capital XVA Strategy

In Albanese and Crépey (2020) and in Section 4.1 above, the derivative portfolio of the bank is assumed held on a run-off basis. By contrast, real-life derivative portfolios are incremental. Assume a new deal shows up at time $\theta \in (0, \tau)$. We denote by $\Delta\,\cdot$, for any portfolio related process, the difference between the time $\theta$ values of this process for the run-off versions of the portfolio with and without the new deal.

Definition 4.1

We apply the following trade incremental pricing and accounting policy:

• The clean desks pay $\Delta\mathrm{MtM}$ to the client and the CA desks add an amount $\Delta\mathrm{MtM}$ on the clean margin account;

• The CA desks charge to the client an amount $\Delta\mathrm{CA}$ and add it on the reserve capital account;

• The management of the bank charges the amount $\Delta\mathrm{KVA}$ to the client and adds it on the risk margin account.

The funds transfer price of a deal is the all-inclusive XVA add-on charged by the bank to the client in the form of a rebate with respect to the mark-to-market $\Delta\mathrm{MtM}$ of the deal. Under the above scheme, the overall price charged to the client for the deal is $\Delta\mathrm{MtM} - \Delta\mathrm{CA} - \Delta\mathrm{KVA}$, i.e.

$$\begin{aligned}
\mathrm{FTP} &= \Delta\mathrm{CA} + \Delta\mathrm{KVA} = \Delta\mathrm{CVA} + \Delta\mathrm{FVA} + \Delta\mathrm{KVA} \\
&= \Delta\mathrm{FV} + \Delta\mathrm{CL} + \Delta\mathrm{KVA},
\end{aligned} \tag{43}$$

by (16) and the last line in (23) (which still hold in continuous time, see Albanese and Crépey (2020, Equations (1) and (66))) applied to the portfolios with and without the new deal.

Remark 4.3

As opposed to the $\Delta\mathrm{XVA}$ terms, which entail portfolio-wide computations, $\Delta\mathrm{MtM}$ reduces to the so-called clean valuation of the new deal, by trade-additivity of MtM (as follows from Albanese and Crépey (2020, Equations (25) and (37))).

Obviously, the legacy portfolio of the bank has a key impact on the FTP. It may very well happen that the new deal is risk-reducing with respect to the portfolio, in which case $\mathrm{FTP} < 0$, i.e. the overall, XVA-inclusive price charged by the bank to the client would be $\Delta\mathrm{MtM} - \mathrm{FTP} > \Delta\mathrm{MtM}$ (subject of course to the commercial attitude adopted by the bank under such circumstance).

In order to exclude for simplicity jumps of our $L$ and KVA processes at $\theta$ (the ones related to the initial portfolio, but also those, starting at time $\theta$, corresponding to the augmented portfolio), we assume a quasi-left continuous model filtration $\mathbb{G}$ and a $\mathbb{G}$ predictable stopping time $\theta$. The first assumption excludes that martingales can jump at predictable times. It is satisfied in all practical models and, in particular, in all models with Lévy or Markov chain driven jumps. The second assumption is reasonable regarding the time at which a financial contract is concluded. Note that it was actually already assumed regarding the (fixed) time 0 at which the portfolio of the bank is supposed to have been set up in the first place.

(Footnotes to the three bullet points of Definition 4.1, in order: i.e. remove $(-\Delta\mathrm{MtM})$ from, if $\Delta\mathrm{MtM} < 0$; i.e. remove $(-\Delta\mathrm{CA})$ from, if $\Delta\mathrm{CA} < 0$; i.e. removes $(-\Delta\mathrm{KVA})$ from, if $\Delta\mathrm{KVA} < 0$.)

Lemma 4.2

Assuming the new trade at time $\theta$ handled by the trade incremental policy of Definition 4.1 after the balance conditions (4) have been held before $\theta$, then shareholder equity $\mathrm{SHC}^\circ$ (see Remark 4.2) is a $\mathbb{Q}^*$ submartingale on $[0, \theta] \cap \mathbb{R}_+$, with drift coefficient $h\, \mathrm{SCR}$ killed at $\tau$.

Proof. In the case of a trade incremental portfolio, a priori, the second identity in (42) is only guaranteed to hold before $\theta$. However, in view of the observation made in Remark 2.4 and because, under our (harmless) technical assumptions, there can be no dividends arising from the portfolio expanded with the new deal (i.e. jumps in the related processes $L$ and KVA, defined on $[\theta, +\infty)$) at time $\theta$ itself, the process SHC does not jump at $\theta$. The processes $L$ and KVA related to the legacy portfolio cannot jump at $\theta$ either. As a result, the second identity in (42) still holds at $\theta$. It is therefore valid on $[0, \theta] \cap \mathbb{R}_+$. The result then follows from the respective martingale and supermartingale properties of the (original) processes $L^\circ$ and $\mathrm{KVA}^\circ$ recalled before and after Proposition 4.1.

The above XVA strategy can be iterated between and throughout every new trade. We call this approach the trade incremental cost-of-capital XVA strategy. By an iterated application of Lemma 4.2 at every new trade, we obtain the following:

Proposition 4.2

Under a dynamic and trade incremental cost-of-capital XVA strategy, shareholder equity $\mathrm{SHC}^\circ$ is a $\mathbb{Q}^*$ submartingale on $\mathbb{R}_+$, with drift coefficient $h\, \mathrm{SCR}$ killed at $\tau$.

Thus, a trade incremental cost-of-capital XVA strategy results in a sustainable strategy for profits retention, both between and throughout deals, which was already the key principle behind Solvency II (see Section 1.1). Note that, without the KVA (i.e. for $h = 0$), the (risk-free discounted) shareholder equity process $\mathrm{SHC}^\circ$ would only be a $\mathbb{Q}^*$ martingale, which could only be acceptable to shareholders without risk aversion (cf. Section 3.3).

Figure 2 yields a graphical representation, in the form of a corresponding XVA dependence tree, of the continuous-time XVA equations.

For concreteness, we restrict ourselves to the case of bilateral trading in what follows, referring the reader to Albanese, Armenti, and Crépey (2020, Section 6.2) for the more general and realistic situation of a bank also involved in centrally cleared trading. As visible from the corresponding equations in Section A, the CVA of the bank can then be computed as the sum of its CVAs restricted to each netting set (or counterparty $i$ of the bank, with default time denoted by $\tau_i$ in Figure 2). The initial margins and the MVA are also most accurately calculated at each netting set level. By contrast, the FVA is defined in terms of a semilinear equation that can only be solved at the level of the overall derivative portfolio of the bank. The KVA can only be computed at the level of the overall portfolio and relies on conditional risk measures of future fluctuations of the shareholder trading loss process $L^\circ$, which itself involves future fluctuations of the other XVA processes (as these are part of the bank liabilities).

Moreover, the fungibility of capital at risk with variation margin (cf. Remark 3.4) induces a coupling between, on the one hand, the “backward” FVA and KVA processes and, on the other hand, the “forward” shareholder loss process $L^\circ$. As in the static case of Section 3.4 (cf. the last paragraph there), the ensuing forward backward system can be decoupled by Picard iteration.

These are heavy computations encompassing all the derivative contracts of the bank. Yet these computations require accuracy so that trade incremental XVA computations, which are required as XVA add-ons to derivative entry prices (cf. Section 4.2), are not in the numerical noise of the machinery.

[Figure 2: The XVA equations dependence tree (Source: Abbas-Turki, Diallo, and Crépey (2018)).]

As developed in Abbas-Turki, Diallo, and Crépey (2018, Section 3.2), computational strategies for (each Picard iteration of) the XVA equations involve a mix of nested Monte Carlo (NMC) and of simulation/regression schemes, optimally implemented on GPUs. In view of Figure 2, a pure NMC approach would involve five nested layers of simulation (with respective numbers of paths $M_{xva} \sim \sqrt{M_{mtm}}$, see Abbas-Turki, Diallo, and Crépey (2018, Section 3.3)). Moreover, nested Monte Carlo implies intensive repricing of the mark-to-market cube, i.e. pathwise MtM valuation for each netting set, or/and high dimensional interpolation. In this work, we use no nested Monte Carlo or conditional repricing of future MtM cubes: beyond the base MtM layer in the XVA dependence tree, each successive layer (from right to left in Figure 2, at each Picard iteration) will be “learned” instead.

We denote by $\mathbb{E}_t$, $\mathbb{V}\mathrm{a}\mathbb{R}_t$, and $\mathbb{ES}_t$ (and simply, in case $t = 0$, $\mathbb{E}$, $\mathbb{V}\mathrm{a}\mathbb{R}$, and $\mathbb{ES}$) the time-$t$ conditional expectation, value-at-risk, and expected shortfall with respect to the bank survival measure $\mathbb{Q}$.

We compute the mark-to-market cube using CUDA routines. The pathwise XVAs are obtained by deep learning regression, i.e. extension of Longstaff and Schwartz (2001) kind of schemes to deep neural network regression bases as also considered in Huré, Pham, and Warin (2020) or Beck, Becker, Cheridito, Jentzen, and Neufeld (2019), based on the classical quadratic (also known as mean square error, MSE) loss function. The conditional value-at-risks and expected shortfalls involved in the embedded pathwise EC and IM computations are obtained by deep quantile regression, as follows.

Given features $X$ and labels $Y$ (random variables), we want to compute the conditional value-at-risk and expected shortfall functions $q(\cdot)$ and $s(\cdot)$ such that $\mathbb{V}\mathrm{a}\mathbb{R}(Y|X) = q(X)$ and $\mathbb{ES}(Y|X) = s(X)$.
Recall from Fissler, Ziegel, and Gneiting (2016) and Fissler and Ziegel (2016) that value-at-risk is elicitable, expected shortfall is not, but their pair is jointly elicitable. Specifically, we consider loss functions $\rho$ of the form (where in our notation $Y$ is a signed loss, whereas it is a signed gain in their paper)

$$\begin{aligned}
\rho_\alpha\big(q(\cdot), s(\cdot); X, Y\big) ={}& (1 - \alpha)^{-1} \big(f(Y) - f(q(X))\big)^+ + f(q(X)) \\
&+ g(s(X)) - \dot{g}(s(X)) \Big(s(X) - q(X) - (1 - \alpha)^{-1} \big(Y - q(X)\big)^+\Big).
\end{aligned} \tag{44}$$

One can show (cf. also Dimitriadis and Bayer (2019)) that, for a suitable choice of the functions $f$, $g$ including $f(z) = z$ and $g(z) = -\ln(1 + e^{-z})$ (our choice in our numerics), the pair of the conditional value-at-risk and expected shortfall functions is the minimizer, over all measurable pair-functions $(q(\cdot), s(\cdot))$, of the error

$$\mathbb{E}\, \rho\big(q(\cdot), s(\cdot); X, Y\big). \tag{45}$$

In practice, one minimizes numerically the error (45), based on $m$ independent simulated values of $(X, Y)$, over a parametrized family of functions $(q, s)(x) \equiv (q, s)_\theta(x)$. Dimitriadis and Bayer (2019) restrict themselves to multilinear functions. In our case we use a feedforward neural network parameterization (see e.g. Goodfellow, Bengio, and Courville (2016)). The minimizing pair $(q, s)_{\widehat{\theta}}$ then represents the two scalar neural network approximations of the conditional value-at-risk and expected shortfall functions pair.

The left and right panels of Figure 3 show the respective deep neural networks for pathwise value-at-risk/expected shortfall (with error (45)) and pathwise XVAs (with classical quadratic norm error). Deep learning methods often show particularly good generalization and scalability performances (cf. Section 5.5).
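The joint loss (44) can be illustrated in the simplest, unconditional case, where the candidate functions $q$ and $s$ are constants rather than neural networks. The sketch below is an assumption-laden illustration, not the deep quantile regression of the paper: it uses a hypothetical standard Gaussian loss sample, the choices $f(z) = z$ and $g(z) = -\ln(1 + e^{-z})$ quoted above, and illustrative function names. It checks that the empirical value-at-risk and expected shortfall attain a lower average loss than perturbed pairs.

```python
import math
import random


def g(z):
    """g(z) = -ln(1 + exp(-z)), the choice quoted in the text."""
    return -math.log1p(math.exp(-z))


def g_dot(z):
    """Derivative of g: exp(-z) / (1 + exp(-z)) = 1 / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(z))


def mean_joint_loss(q, s, ys, alpha):
    """Sample average of the loss (44) with f(z) = z, for constant candidates (q, s)."""
    scaled_excess = sum(max(y - q, 0.0) for y in ys) / (len(ys) * (1.0 - alpha))
    return scaled_excess + q + g(s) - g_dot(s) * (s - q - scaled_excess)


alpha = 0.975
rng = random.Random(42)
ys = sorted(rng.gauss(0.0, 1.0) for _ in range(100_000))  # hypothetical signed losses

# Empirical value-at-risk (an order statistic) and expected shortfall (tail average).
k = math.ceil(alpha * len(ys))
q_hat = ys[k - 1]
s_hat = q_hat + sum(y - q_hat for y in ys[k:]) / (len(ys) * (1.0 - alpha))

base = mean_joint_loss(q_hat, s_hat, ys, alpha)
print(f"VaR = {q_hat:.4f}, ES = {s_hat:.4f}, average loss = {base:.6f}")
for shift in (-0.2, 0.2):
    print(f"shift {shift:+.1f} in q:", mean_joint_loss(q_hat + shift, s_hat, ys, alpha) - base)
    print(f"shift {shift:+.1f} in s:", mean_joint_loss(q_hat, s_hat + shift, ys, alpha) - base)
```

All four printed differences should be positive. In the conditional setting of the paper the constants are replaced by functions of the features and the sample average is minimized over the network parameters.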
In the case of conditional value-at-risk and expected shortfall computations, deep learning quantile regression is also easier to implement than more naive methods, such as the resimulation and sort-based scheme of Barrera, Crépey, Diallo, Fort, Gobet, and Stazhynski (2019) for the value-at-risk and expected shortfall at each outer node of a nested Monte Carlo simulation.

[Figure 3: Neural networks with state variables (realizations of the risk factors at the considered pricing time) as features. (Left) Joint value-at-risk/expected shortfall neural network: output is joint estimate of pathwise conditional value-at-risk and expected shortfall, at a selected confidence level, of the label (inputs to initial margin or economic capital) given the features. (Right) XVAs neural network: output is estimate of pathwise conditional mean of the label (XVA generating cash flows) given the features.]

The neural network topology and hyper-parameters used by default in our examples (see Section 5) are detailed in Table 2. We use hyperbolic tangent activation functions in all cases. Algorithm 1 yields our fully (time and space) discrete scheme for simulating the Picard iteration (58) until numerical convergence to the XVA processes. Note that, as opposed to more rudimentary, expected exposure based XVA computational approaches (see Section 1 in Abbas-Turki, Diallo, and Crépey (2018)), this algorithm requires the simulation of the counterparty defaults.

| | CVA | FVA | IM | MVA | Gap CVA | EC | KVA |
|---|---|---|---|---|---|---|---|
| Hidden Layers | 3 | 5 | 3 | 3 | 3 | 3 | 3 |
| Hidden Layer Size | 20 | 6 | 20 | 20 | 20 | 20 | 20 |
| Learning Rate | 0.025 | 0.025 | 0.05 | 0.1 | 0.1 | 0.025 | 0.1 |
| Momentum | 0.95 | 0.95 | 0.5 | 0.5 | 0.5 | 0.95 | 0.5 |
| Iterations | 100 | 50 | 150 | 100 | 100 | 100 | 100 |
| Loss Function | MSE | MSE | (44) | MSE | (44) | (44) | MSE |
| Application | netting set | portf. | netting set | netting set | netting set | portf. | portf. |

Table 2: Neural network topology and learning parameters used by default in our numerics (portf. ≡ overall derivative portfolio of the bank).

Algorithm 1

Deep XVAs algorithm.

• Simulate forward $m$ realizations (Euler paths) of the market risk factor processes and of the counterparty survival indicator processes (i.e. default times) on a refined time grid;

• For each pricing time $t = t_i$ of a pricing time grid, with coarser time step denoted by $h$, and for each counterparty $c$:

  – Learn the corresponding $\mathbb{V}\mathrm{a}\mathbb{R}_t$ and $\mathbb{ES}_t$ terms visible in (59) or (under the time-discretized outer integral in) (61);

  – Learn the corresponding $\mathbb{E}_t$ terms visible in (60) through (62);

  – Compute the ensuing pathwise CVA and MVA as per (60)–(62);

• For $\mathrm{FVA}^{(0)}$, consider the following time discretization of (57) (in which $\lambda$ is the risky funding spread process of the bank) with time step $h$:

$$\mathrm{FVA}^{(0)}_t \approx \mathbb{E}_t\big[\mathrm{FVA}^{(0)}_{t+h}\big] + h \lambda_t \Big(\sum_c J^c_t \big(P^c_t - \mathrm{VM}^c_t\big) - \mathrm{CVA}_t - \mathrm{MVA}_t - \mathrm{FVA}^{(0)}_t\Big)^+ \tag{46}$$

  and, for each $t = t_i$, learn the corresponding $\mathbb{E}_t$ in (46), then solve the semi-linear equation for $\mathrm{FVA}^{(0)}_t$;

• For each Picard iteration $k$ (until numerical convergence), simulate forward $L^{(k)}$ as per the first line in (58) (which only uses known or already learned quantities), and:

  – For economic capital $\mathrm{EC}^{(k)}$, for each $t = t_i$, learn $\mathbb{ES}_t\big((L^{(k)})^\circ_{t+1} - (L^{(k)})^\circ_t\big)$ (cf. Definition A.1);

  – $\mathrm{KVA}^{(k)}$ and $\mathrm{FVA}^{(k)}$ then require a backward recursion solved by deep learning approximation much like the one for $\mathrm{FVA}^{(0)}$ above.

5 Swap Portfolio Case Study

We consider an interest rate swap portfolio case study with counterparties in different economies, first involving 10 one-factor Hull-White interest rates, 9 Black-Scholes exchange rates, and 11 Cox-Ingersoll-Ross default intensity processes. The default times of the counterparties and the bank itself are jointly modeled by a “common shock” or dynamic Marshall-Olkin copula model as per Crépey, Bielecki, and Brigo (2014, Chapt. 8–10) and Crépey and Song (2016) (see also Elouerkhaoui (2007, 2017)). This whole setup results in about 40 risk factors used as deep learning features (including the counterparty default indicators).

In this model we consider a bank portfolio of 10K randomly generated swap trades, with

• trade currency and counterparty both uniform on [1, …],
• notional uniform on [10K, …],
• collateralization (cf. Section A.4): either “no CSA counterparty” without initial margin (IM) nor variation margin (VM), or “CSA counterparty” with VM = MtM and posted initial margin (PIM) pledged at 99% gap risk value-at-risk, received initial margin (RIM) covering 75% gap risk and leaving excess as residual gap CVA,
• for economic capital, 97.5% expected shortfall of 1-year ahead trading loss of the bank shareholders.

By default we use Monte Carlo simulation with 50K paths of 16 coarse (pricing) and 32 fine (risk factors) time steps per year.
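The two-grid simulation setup described above (32 fine risk-factor steps and 16 coarse pricing steps per year) can be sketched as follows. This is only an illustrative toy under stated assumptions: one short rate with a constant mean-reversion level (a simplification of Hull-White), one CIR intensity, independent drivers instead of the common shock model, and hypothetical parameter values and function names.

```python
import numpy as np

def simulate_paths(n_paths=1000, years=1.0, fine_per_year=32, coarse_per_year=16, seed=0):
    """Euler paths of a short rate and a CIR intensity, sampled on the coarse grid."""
    rng = np.random.default_rng(seed)
    n_fine = int(round(years * fine_per_year))
    dt = 1.0 / fine_per_year
    # Hypothetical parameters, for illustration only.
    a, theta, sigma_r = 0.1, 0.02, 0.01   # short rate: speed, level, volatility
    kappa, mu, sigma_g = 0.5, 0.03, 0.1   # intensity: speed, level, volatility
    r = np.empty((n_fine + 1, n_paths))
    g = np.empty((n_fine + 1, n_paths))
    r[0], g[0] = 0.02, 0.03
    for i in range(n_fine):
        dw = rng.standard_normal((2, n_paths)) * np.sqrt(dt)
        r[i + 1] = r[i] + a * (theta - r[i]) * dt + sigma_r * dw[0]
        g_pos = np.maximum(g[i], 0.0)     # truncation keeps the square root defined
        g[i + 1] = g[i] + kappa * (mu - g_pos) * dt + sigma_g * np.sqrt(g_pos) * dw[1]
    # Default occurs when the integrated intensity crosses an Exp(1) threshold.
    cum_intensity = np.cumsum(np.maximum(g[:-1], 0.0) * dt, axis=0)
    threshold = rng.exponential(size=n_paths)
    survival = np.vstack([np.ones(n_paths, dtype=bool), cum_intensity < threshold])
    # Keep every (fine/coarse)-th point to obtain the pricing grid.
    idx = np.arange(0, n_fine + 1, fine_per_year // coarse_per_year)
    return r[idx], g[idx], survival[idx]

rates, intensities, survival = simulate_paths()
```

The survival indicators returned on the coarse grid play the role of the processes $J^c$ used as features and in the cash flows; a production setup would of course simulate all risk factors jointly.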

The validation of our deep learning methodology is done in the setup of a portfolio of swaps issued at par, with final maturity T = 10 years, without initial margin (IM) nor variation margin (VM).

We first focus on the CVA, as the latter is amenable to validation by a standard nested Monte Carlo (“NMC”) methodology. Figures 4, 5 and 6 show that the learned CVA is consistent with that obtained from a nested Monte Carlo simulation. Regarding Figure 6 (and also later below), note the equivalence of optimising the mean quadratic error

• between the ANN learned estimator $h(X)$ and the labels $Y$ (“MSE”), $\mathbb{E}\big[(h(X) - Y)^2\big]$, and
• between the ANN learned estimator and the conditional expectation $\mathbb{E}[Y|X]$ (in our case estimated by NMC), $\mathbb{E}\big[(h(X) - \mathbb{E}[Y|X])^2\big]$.

The equivalence stems from the following identities, which hold for any random variables $X$, $Y$ and hypothesis function $h$ such that $Y$ and $h(X)$ are square integrable:

$$\begin{aligned}
\mathbb{E}\big[(h(X) - Y)^2\big] &= \mathbb{E}\big[(h(X) - \mathbb{E}[Y|X])^2\big] + \mathbb{E}\big[(\mathbb{E}[Y|X] - Y)^2\big] \\
&\quad + 2\,\mathbb{E}\big[(h(X) - \mathbb{E}[Y|X])\,(\mathbb{E}[Y|X] - Y)\big] \\
&= \mathbb{E}\big[(h(X) - \mathbb{E}[Y|X])^2\big] + \mathbb{E}\big[\mathrm{Var}(Y|X)\big]
\end{aligned} \tag{47}$$

(as the second line vanishes), where $\mathbb{E}[\mathrm{Var}(Y|X)]$ does not depend on $h$.

The CVA error profile on Figure 6 reveals slightly more difficulty in learning the earlier CVAs. This is because of a higher variance of the corresponding cash flows (integrated over longer time frames) in conjunction with a lower variance of the features (risk factors diffused over shorter time horizons).

Figure 4: Random variables $\mathrm{CVA}^c_1$ and $\mathrm{CVA}^c_7$ (in the case of a no CSA netting set $c$, respectively observed after 1 and 7 years) obtained by learning (blue histogram) versus nested Monte Carlo (orange histogram). All histograms are based on out-of-sample paths.

Figure 5: QQ-plot of learned versus nested Monte Carlo CVA for the random variables $\mathrm{CVA}^c_1$ (left) and $\mathrm{CVA}^c_7$ (right). Paths are out-of-sample.

Table 3 shows the computational cost and accuracy of the nested Monte Carlo method for different numbers of inner paths, using 32768 outer paths. The convergence is already achieved for approximately 128 inner paths, in line with the NMC square root rule that is recalled in an XVA setup in Abbas-Turki, Diallo, and Crépey (2018, Section 3.3). Figure 7 and Table 4 show that a good accuracy can be achieved through learning at a lower computational cost than through nested Monte Carlo, while also enjoying the advantages of the approach being parametric. Indeed, once the CVA is learned, one would pay only the cost of inference later on, which is generally negligible compared to training time. By contrast, a nested Monte Carlo approach would require relaunching the nested simulations every time the CVA estimator is needed on new paths. Early stopping could be used to help reduce training time further while improving regularization.

More generally, in the presence of multiple XVA layers (cf. Figure 2), a purely nested Monte Carlo approach would require multiple layers of nested simulations, which would amount to a computational time that is exponential in the number of XVA layers, while the …

Figure 6: Empirical quadratic loss of each CVA estimator at all coarse time-steps. The lower, the closer to the true conditional expectation (cf. (47)). Since the nested Monte Carlo method is computationally expensive, it was carried out only once every 10 coarse time-steps. (Axes: t (years) versus MSE. Curves: learned CVA, out-of-sample; unconditional mean; nested Monte-Carlo CVA, out-of-sample.)
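The decomposition (47) can be checked numerically on a toy example. The sketch below is not taken from the paper: the data-generating model (a sine conditional mean with heteroscedastic noise) and the deliberately imperfect hypothesis `h` are assumptions chosen so that $\mathbb{E}[Y|X]$ and $\mathrm{Var}(Y|X)$ are known in closed form.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
x = rng.standard_normal(n)
cond_std = 1.0 + 0.5 * np.abs(x)            # assumed conditional standard deviation of Y given X
y = np.sin(x) + cond_std * rng.standard_normal(n)
cond_exp = np.sin(x)                        # E[Y|X] by construction

def h(x):
    """An arbitrary, deliberately imperfect hypothesis function."""
    return 0.8 * x

mse_labels = np.mean((h(x) - y) ** 2)       # left-hand side of (47)
mse_cond = np.mean((h(x) - cond_exp) ** 2)  # distance to the conditional expectation
mean_cond_var = np.mean(cond_std ** 2)      # E[Var(Y|X)], independent of h
cross = 2.0 * np.mean((h(x) - cond_exp) * (cond_exp - y))  # vanishing cross term

print(mse_labels, mse_cond + mean_cond_var, cross)
```

Up to Monte Carlo error, the first two printed numbers agree and the third is close to zero, which is why minimising the MSE against the labels also minimises the distance to the conditional expectation.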

Figure 7: Speed versus accuracy in the case of a CVA at a given pricing time. We kept varying the number of inner paths for the nested Monte Carlo estimator and the number of epochs for the learning approach and recorded the computation time and the empirical quadratic loss. (Axes: computation time (sec) versus MSE loss, standardized by labels variance. Curves: nested Monte-Carlo; learning, in-sample; learning, out-of-sample.)

[Table: columns MSE (vs NMC CVA), MSE (vs labels), Simulation time, Training time]

Figure 8: Empirical quadratic loss during CVA learning at time-step t = 5 years, standardized by the variance of the labels. (Bottom) Paths are in-sample. (Top) Paths are out-of-sample. (Curves: N layers = 1, 2, 3 combined with N units = 11, 16, 22.)

… $\mathrm{FVA}^{(0)}$ profile as per (46). The orange FVA curve represents the mean FVA originating cash flows, which, in principle as on the picture, matches the blue mean FVA itself learned from these cash flows. The 5th and 95th percentiles FVA estimates are a bit less smooth in time than the mean profiles, as expected.

Figure 9: Learned $\mathrm{FVA}^{(0)}$. (Axes: t (years). Curves: $\mathbb{E}\big[\mathrm{FVA}^{(0)}_t\big]$; $\mathbb{E}\Big[\int_t^T \lambda_s\big(\sum_c J^c_s (P^c_s - \mathrm{VM}^c_s) - \mathrm{CVA}_s - \mathrm{MVA}_s - \mathrm{FVA}^{(0)}_s\big)^+ ds\Big]$; percentiles of $\mathrm{FVA}^{(0)}_t$.)

Figure 10 (left) is a sanity check that the profiles of the successive iterates $L^{(k)}$ of the shareholder trading loss process $L^\circ$ in Algorithm 1 converge rapidly with $k$. Figure 10 (right) shows the loss process $L^{(3)}$, displayed as its mean and mean ± 2 standard errors. … $L^{(3)}$ appears numerically centered around zero. The latter holds, at least, beyond t ∼ …

Figure 10: (Left) Profiles of the processes $L^{(k)}$, for $k = 1, \ldots$ (Right) Mean ± 2 standard errors of $L^{(3)}$. (Panel titles: “Liability-Heavy Bank No CSA Mean Loss Process Convergence”, with curves Loss Iter 0, Loss Iter 1, Loss Iter 2; “Liability-Heavy Bank Loss Process - No CSA”, with curves Loss Mean, Loss Mean+2SE, Loss Mean-2SE. Axes: Year versus Domestic Currency Units.)

For the financial case study that follows, we consider

• swap rates uniformly distributed on [0.…, 0.05] (hence swaps already in-the-money or out-of-the-money at time 0),
• number of six-monthly coupon resets uniform on [5, …, 60] (final maturity of the portfolio T = 30 years),
• portfolio direction: either “asset-heavy” bank mostly in the receivables in the future, or “liability-heavy” bank mostly in the payables in the future (respectively corresponding, with our data, to a bank 75% likely to pay fixed in the swaps, or 75% likely to receive fixed).

The figures that follow only display profiles, i.e. term structures, that is, expectations as a function of time of the corresponding processes. But all these processes are computed pathwise, based on the deep learning regression and quantile regression methodology of Section 4.4, allowing for all XVA inter-dependencies. Of course, XVA profiles (or pathwise XVAs if wished) are much more informative for traders than the spot XVA values (or time 0 confidence intervals) returned by most XVA systems.

Assuming 10 counterparties, Figure 11 shows the GPU generated profiles of

$$\mathrm{MtM} = \sum_c P^c\, \mathbb{1}_{[0, \tau^\delta_c)} \tag{48}$$

in the case of the asset-heavy portfolio and of the liability-heavy portfolio.

Figure 11: MtM profiles. (Left) Asset-heavy portfolio. (Right) Liability-heavy portfolio. (Panel titles: “Swaps Portfolio MtM: Mainly Payer (Asset-Heavy)”; “Swaps Portfolio MtM: Mainly Receiver (Liability-Heavy)”. Axes: Years versus Domestic Currency Units.)

Figure 12 shows the portfolio-wide XVA profiles of the asset-heavy (top) vs. liability-heavy (bottom) portfolio and of the no CSA (left) vs. CSA portfolio (right). Obviously, asset-heavy or no CSA means more CVA. The corresponding curves also emphasize the transfer from counterparty credit into liquidity funding risk prompted by extensive collateralisation. Yet FVA/MVA risk is ignored in current derivatives capital regulation.

Figure 12: (Top left) Asset-heavy portfolio, no CSA. (Top right) Asset-heavy portfolio under CSA. (Bottom left) Liability-heavy portfolio, no CSA. (Bottom right) Liability-heavy portfolio under CSA. (Curves: CVA, FVA, KVA in the no CSA panels; CVA, MVA, KVA in the IM CSA panels. Axes: Years versus Domestic Currency Units.)

Figure 13 shows that (top left) capital at risk as funding (cf. Section 3.4) has a material impact on the already (reserve capital as funding) reduced FVA, (top right) treating KVA as a risk margin (cf. (26)) gives a huge discounting impact, (bottom left) deep learning detects material initial margin convexity in the asset-heavy CSA portfolio, and (bottom right) deep learning detects material economic capital convexity in the asset-heavy no CSA portfolio.
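The gap between an unconditional risk measure and the average of its pathwise (conditional) counterpart, referred to above as convexity, can be illustrated on a toy model. The sketch below is an assumption-laden illustration and not the paper's computation: the loss is Gaussian given a hypothetical two-state “calm or stressed” variable, so that the conditional 99% value-at-risk is known exactly.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(1)
n = 400_000
level = 0.99

# Hypothetical time-t state: calm (vol 0.5) or stressed (vol 2.0), equally likely.
sigma = rng.choice([0.5, 2.0], size=n)
loss = sigma * rng.standard_normal(n)        # centred loss given the state

# Unconditional VaR: one quantile of the pooled loss distribution.
unconditional_var = np.quantile(loss, level)

# Pathwise VaR: exact conditional quantile given the state, then averaged.
z = NormalDist().inv_cdf(level)
mean_conditional_var = np.mean(sigma * z)

print(unconditional_var, mean_conditional_var)
```

In this toy the pooled quantile is dominated by the stressed state and is markedly larger than the average conditional quantile; the sign and size of the gap in a real portfolio depend on the data, which is why the pathwise computation matters.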

Figure 13: (Top left) FVA ignoring the off-setting impact of reserve capital and capital at risk, cf. Section 3.4 (blue), FVA as per (57) accounting for the off-setting impact of reserve capital but ignoring the one of capital at risk (green), refined FVA as per (52) accounting for both impacts (red). (Top right) KVA ignoring the off-setting impact of the risk margin, i.e. with CR instead of (CR − KVA) in (56) (red), refined KVA as per (54)–(55) (blue). (Bottom left) In the case of the asset-heavy portfolio under CSA, unconditional PIM profile, i.e. with $\mathbb{V}a\mathbb{R}_t$ replaced by $\mathbb{V}a\mathbb{R}$ in (59) (blue), vs. pathwise PIM profile, i.e. mean of the pathwise PIM process as per (59) (red). (Bottom right) In the asset-heavy portfolio no CSA case, unconditional economic capital profile, i.e. EC profile ignoring the words “time-t conditional” in Definition A.1 (blue), vs. pathwise economic capital profile, i.e. mean of the pathwise EC process as per Definition A.1 (red). (Panel titles: “Swaps Portfolio Liability-Heavy - FVA offsets - no CSA”, with curves FVA No Offset - Bank level FCA, FVA CA Offset, FVA CA EC Offset; “Swaps Portfolio Asset-Heavy - KVA Discounting no CSA”, with curves Discount OIS+h, Discount OIS; “Swaps Portfolio Asset-Heavy - Posted IM Unconditional vs Average Conditional”; “Swaps Portfolio Asset-Heavy - Convexity ES(L) Unconditional vs Average Conditional: no CSA”. Axes: Years versus Domestic Currency Units.)

The above findings demonstrate the necessity of pathwise capital and margin calculations for accurate FVA, MVA, and KVA calculations.

Next, we consider, on top of the previous portfolios, an incremental trade given as a par 30 year (receive fix or pay fix) swap with 100K notional. Figure 14 shows the trade incremental XVA profiles produced by our deep learning approach. Note that, for obtaining such smooth incremental profiles, it has been key to use common random numbers, as much as possible, between the original portfolio XVA computations and the ones regarding the portfolio expanded with the new trade.
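The benefit of common random numbers for incremental quantities can be seen on a toy example. In the sketch below (an illustration under assumptions, not the paper's implementation), a hypothetical exposure-like metric is evaluated for a base portfolio and for the portfolio expanded with a small trade, either on the same simulated scenarios or on independently seeded ones.

```python
import numpy as np

def portfolio_metric(z, extra_notional=0.0):
    """Toy stand-in for an XVA-like metric: mean positive exposure."""
    exposure = (100.0 + extra_notional) * z   # hypothetical one-factor exposure
    return np.mean(np.maximum(exposure, 0.0))

def incremental_metric(n_paths, seed_base, seed_new, extra_notional=1.0):
    """Metric of the expanded portfolio minus metric of the base portfolio."""
    z_base = np.random.default_rng(seed_base).standard_normal(n_paths)
    z_new = np.random.default_rng(seed_new).standard_normal(n_paths)
    return portfolio_metric(z_new, extra_notional) - portfolio_metric(z_base)

n_paths, n_rep = 5_000, 200
# Same seed for both runs: common random numbers.
crn = np.array([incremental_metric(n_paths, s, s) for s in range(n_rep)])
# Different seeds: independent scenarios for the two runs.
indep = np.array([incremental_metric(n_paths, s, s + 10_000) for s in range(n_rep)])

print(crn.mean(), crn.std(), indep.std())
```

With common random numbers the Monte Carlo noise of the large base portfolio cancels in the difference, so the incremental estimate is far less noisy than with independent scenarios, consistently with the smooth incremental profiles mentioned above.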

Our model assumes the market risk of trades to be fully hedged (see the paragraph following Remark 2.2 and the proofs of Lemma 3.2 and Proposition 4.1). In the previous subsection, the new swap was implicitly meant to be hedged, in terms of market risk, by the clean desks, through an accordingly modified hedging loss process $H$ (see Section 2.1). Here we consider an alternative situation where the market risk of the new swap is back-to-back hedged via a financial hedge counterparty. Specifically, we deal with

• 10 counterparties: 8 no CSA clients and 2 bilateral VM/IM CSA hedge counterparties,
• portfolios of 5K randomly generated swap trades as before, plus 5K corresponding hedge trades,
• an incremental trade given as a par 30 year swap with 100K notional, along with the corresponding hedge trade.

In particular, MtM = 0 (cf. (48)), in both portfolios excluding or including the new swap. In case a client or hedge counterparty defaults, the corresponding market hedge is assumed to be rewired through the clean desks via an accordingly modified hedging loss process $H$.

Figure 14: (Top left) Asset-heavy portfolio, no CSA. Incremental receive fix trade. (Top right) Liability-heavy portfolio, no CSA. Incremental pay fix trade. (Bottom left) Asset-heavy portfolio under CSA. Incremental pay fix trade. (Bottom right) Liability-heavy portfolio under CSA. Incremental receive fix trade. (Curves: CVA, FVA, KVA in the no CSA panels; CVA, MVA, KVA in the IM CSA panels. Axes: Years versus Domestic Currency Units.)

The 8 no CSA counterparties are primarily asset or liability heavy. One bilateral CSA hedge counterparty is asset-heavy and one liability-heavy. Figure 15 provides the trade incremental XVA profiles of the bilateral hedge alternatives in combination with those for the initial counterparty trade. The main XVA impact of the hedge is then a corresponding incremental MVA term, which can contribute to make the global FTP related to the trade+hedge package more or less positive or negative, depending on the data (cf. the four panels in Figure 15), as can only be inferred by a refined XVA computation.

Remark 5.1 In the above, we do not include the XVA costs/benefits of the bilateral hedge counterparty itself. Given Remark 2.4, in different circumstances it may be possible to attribute them to client trades of the original or hedge bank. Space is lacking for a fuller discussion of economics of XVA trading in different setups. In particular, many hedge trades now face central instead of bilateral counterparties. This occurs at additional XVA costs for the client of the initial swap that can be computed the way explained in Albanese, Armenti, and Crépey (2020).

Figure 15: (Top left) XVA-reducing trade + XVA-increasing bilateral hedge. (Top right) XVA-increasing trade + XVA-increasing bilateral hedge. (Bottom left) XVA-reducing trade + XVA-reducing bilateral hedge. (Bottom right) XVA-increasing trade + XVA-reducing bilateral hedge. (Panel titles: “Swaps Portfolio: XVA-reducing no-CSA CP Trade - Incremental 30Y pay fix swap + XVA-increasing IM CP hedge”; “Swaps Portfolio: XVA-increasing no-CSA CP Trade - Incremental 30Y receive fix swap + XVA-increasing IM CP hedge”; “Swaps Portfolio: XVA-reducing no-CSA CP Trade - Incremental 30Y pay fix swap + XVA-reducing IM CP hedge”; “Swaps Portfolio: XVA-increasing no-CSA CP Trade - Incremental 30Y receive fix swap + XVA-reducing IM CP hedge”. Curves: CVA, KVA, MVA, FVA. Axes: Years versus Domestic Currency Units.)

Our deep learning XVA implementation uses CNTK, the Microsoft Cognitive Toolkit. CNTKis written in core C++/CUDA (with wrappers for Python, C

| | 10 CP, 40 risk factors, No CSA | 10 CP, 40 risk factors, IM CSA | 20 CP, 80 risk factors, No CSA | 20 CP, 80 risk factors, IM CSA |
|---|---|---|---|---|
| Initial risk factor & trade pricing simulation (Cuda) | 352 | 352 | 426 | 426 |
| Counterparty and bank level learning calculations | 4,529 | 13,466 | 19,154 | 59,342 |
| Total initial batch | 4,881 | 13,818 | 19,580 | 59,768 |
| Re-simulate 1 counterparty trade pricing (Cuda) | 40 | 40 | 51 | 51 |
| Counterparty and bank level learning calculations | 2,785 | 2,736 | 7,694 | 6,628 |
| Total incremental trade | 2,825 | 2,776 | 7,745 | 6,679 |

Table 5: XVA deep learning computation timings (seconds).

All these results were based on 50K simulation paths, 32 time steps per year for risk factor simulation, and 16 time steps per year for all XVA calculations and deep learning. They were computed on a Lenovo P52 laptop with NVidia Quadro P3200 GPU @ 5.5 Teraflops peak FP32 performance, and 14 streaming multiprocessors.

The computations for 20 counterparties took more than twice as long as those for 10 counterparties. However, our deep learning calculations achieved around 80 to 90% Cuda occupancy for 10 counterparties and at times fell to half that level for 20 counterparties. Scaling to realistically high dimensions should be achievable, but acceptable trade incremental pricing performance in production would require server-grade GPU hardware, performance tuning for high GPU utilisation, and, possibly, caching computations.

A Continuous-Time XVA Equations

We recall from Crépey, Sabbagh, and Song (2020) the continuous-time XVA equations for bilateral trade portfolios when capital at risk is deemed fungible with variation margin, also adding here initial margin and MVA as in the refined static setup of Section 3.4.

We write $\delta_\eta(dt) = d\mathbb{1}_{\{\eta \le t\}}$ for the Dirac measure at a random time $\eta$.

A.1 Cash Flows

We suppose that the derivative portfolio of the bank is partitioned into bilateral netting sets of contracts which are jointly collateralized and liquidated upon default of the bank or of a counterparty (whether a client or a market hedge counterparty). Given a netting set $c$ of the bank portfolio, we denote by:

• $\mathcal{P}^c$ and $P^c$, the corresponding contractually promised cash flows and clean value processes;
• $\tau_c$, $J^c$, and $R_c$, the corresponding default times, survival indicators, and recovery rates, whereas $\tau$, $J$, and $R$ are the analogous data regarding the bank itself, with bank credit spread process $\lambda = (1 - R)\gamma$ taken as a proxy of its risky funding spread process;
• $\tau^\delta_c = \tau_c + \delta$ and $\tau^\delta = \tau + \delta$, where $\delta$ is a positive margin period of risk, in the sense that the liquidation of the netting set $c$ happens at time $\tau^\delta_c \wedge \tau^\delta$;
• $\mathrm{VM}^c$, the variation margin (re-hypothecable collateral) exchanged between the bank and counterparty $c$, counted positively when received by the bank;
• $\mathrm{PIM}^c$ and $\mathrm{RIM}^c$, the related initial margin (segregated collateral) posted and received by the bank;
• RC and CR, the reserve capital and capital at risk of the bank.

The contractually promised cash flows are supposed to be hedged out by the bank but one conservatively assumes no XVA hedge, so that the bank is left with the following trading cash flows $\mathcal{C}$ and $\mathcal{F}$ (cf. (38) and see Albanese and Crépey (2020, Lemmas 5.1 and 5.2) for detailed derivations of analogous equations in a slightly simplified setup):

• The (counterparty) credit cash flows

$$\begin{aligned}
d\mathcal{C}_t &= \sum_{c;\, \tau_c \le \tau^\delta} (1 - R_c)\Big((P^c + \mathcal{P}^c)_{\tau^\delta_c \wedge \tau^\delta} - (\mathcal{P}^c + \mathrm{VM}^c + \mathrm{RIM}^c)_{(\tau_c \wedge \tau)-}\Big)^+ \delta_{\tau^\delta_c \wedge \tau^\delta}(dt) \\
&\quad - (1 - R) \sum_{c;\, \tau \le \tau^\delta_c} \Big((P^c + \mathcal{P}^c)_{\tau^\delta \wedge \tau^\delta_c} - (\mathcal{P}^c + \mathrm{VM}^c - \mathrm{PIM}^c)_{(\tau \wedge \tau_c)-}\Big)^- \delta_{\tau^\delta \wedge \tau^\delta_c}(dt);
\end{aligned} \tag{49}$$

• The (risky) funding cash flows

$$\begin{aligned}
d\mathcal{F}_t &= J_t \lambda_t \Big(\sum_c J^c (P^c - \mathrm{VM}^c) - \mathrm{RC} - \mathrm{CR}\Big)^+_t dt - (1 - R)\Big(\sum_c J^c (P^c - \mathrm{VM}^c) - \mathrm{RC} - \mathrm{CR}\Big)^+_{\tau-} \delta_\tau(dt) \\
&\quad + J_t \lambda_t \sum_c J^c_t\, \mathrm{PIM}^c_t\, dt - (1 - R) \sum_c J^c_{\tau-}\, \mathrm{PIM}^c_{\tau-}\, \delta_\tau(dt),
\end{aligned} \tag{50}$$

where the RC and CR terms account for the fungibility of reserve capital and capital at risk with variation margin.

(See Albanese, Armenti, and Crépey (2020, Section 5) for the discussion of cheaper funding schemes for initial margin.)

A.2 Valuation

Here (as in our numerics) we distinguish between a (strict) FVA, in the strict sense of the cost of raising variation margin, and an MVA for the cost of raising initial margin (see Remark 2.1). The (other than K)VA equations are then

$$\mathrm{RC} = \mathrm{CA} = \mathrm{CVA} + \mathrm{FVA} + \mathrm{MVA}, \tag{51}$$

the so-called “contra-assets valuation” sourced from the clients and deposited in the reserve capital account of the bank, where, for $t < \tau$,

$$\begin{aligned}
\mathrm{CVA}_t &= \mathbb{E}_t \sum_{t < \tau^\delta_c} (1 - R_c)\Big((P^c + \mathcal{P}^c)_{\tau^\delta_c} - (\mathcal{P}^c + \mathrm{VM}^c + \mathrm{RIM}^c)_{\tau_c -}\Big)^+ \\
\mathrm{FVA}_t &= \mathbb{E}_t \int_t^T \lambda_s \Big(\sum_c J^c (P^c - \mathrm{VM}^c) - \mathrm{CA} - \mathrm{CR}\Big)^+_s ds \\
\mathrm{MVA}_t &= \mathbb{E}_t \int_t^T \lambda_s \sum_c J^c_s\, \mathrm{PIM}^c_s\, ds.
\end{aligned} \tag{52}$$

The corresponding trading loss and profit process $L$ of the bank is such that $L_0 = 0$ and, for $t < \tau$,

$$\begin{aligned}
dL_t &= \sum_c (1 - R_c)\Big((P^c + \mathcal{P}^c)_{\tau^\delta_c} - (\mathcal{P}^c + \mathrm{VM}^c + \mathrm{RIM}^c)_{\tau_c -}\Big)^+ \delta_{\tau^\delta_c}(dt) \\
&\quad + \lambda_t \Big(\sum_c J^c (P^c - \mathrm{VM}^c) - \mathrm{CA} - \mathrm{CR}\Big)^+_t dt + \lambda_t \sum_c J^c_t\, \mathrm{PIM}^c_t\, dt + d\mathrm{CA}_t,
\end{aligned} \tag{53}$$

so that $L$ is a $\mathbb{Q}$ martingale, hence (by Lemma 4.1) $L^\circ$ is a $\mathbb{Q}^*$ martingale.

By the same rationale as Definitions 3.2 and 3.3 in the static setup:

Definition A.1 $\mathrm{EC}_t$ is the time-$t$ conditional 97.5% expected shortfall of $(L^\circ_{t+1} - L^\circ_t)$ under $\mathbb{Q}$.

Given a positive target hurdle rate $h$:

Definition A.2 We set

$$\mathrm{CR} = \max(\mathrm{EC}, \mathrm{KVA}), \tag{54}$$

for a KVA process such that, for $t < \tau$,

$$\mathrm{KVA}_t = \mathbb{E}_t \Big[\int_t^T h\,\big(\mathrm{CR}_s - \mathrm{KVA}_s\big)\, ds\Big]. \tag{55}$$

Hence, for $t < \tau$,

$$\begin{aligned}
\mathrm{KVA}_t &= \mathbb{E}_t \Big[\int_t^T h e^{-h(s-t)}\, \mathrm{CR}_s\, ds\Big] \\
&= \mathbb{E}_t \Big[\int_t^T h e^{-h(s-t)} \max\big(\mathrm{EC}_s, \mathrm{KVA}_s\big)\, ds\Big].
\end{aligned} \tag{56}$$

The next-to-last identity is the continuous-time analog of the risk margin formula under the Swiss solvency test cost of capital methodology: see Swiss Federal Office of Private Insurance (2006, Section 6, middle of page 86 and top of page 88).

A.3 The XVA Equations are Well-Posed

In view of (51), the second line in (52) is in fact an FVA equation. Likewise, the second line in (56) is a KVA equation. Moreover, as capital at risk is fungible with variation margin (cf. Section 3.4), i.e. in consideration of the CR term in (52)-(53), where CR = max(EC, KVA), we actually deal with an (FVA, KVA) system, and even, as EC depends on $L$ (cf. Definition A.1), with a forward backward system for the forward loss process $L$ and the backward pair (FVA, KVA).

However, as in the refined static setup of Section 3.4, the coupling between (FVA, KVA) and $L$ can be disentangled by the following Picard iteration:

• Let CVA and MVA be as in (52), $L^{(0)} = \mathrm{KVA}^{(0)} = 0$, and, for $t < \tau$,

$$\mathrm{FVA}^{(0)}_t = \mathbb{E}_t \int_t^T \lambda_s \Big(\sum_c J^c (P^c - \mathrm{VM}^c) - \mathrm{CA}^{(0)}\Big)^+_s ds, \tag{57}$$

where $\mathrm{CA}^{(0)} = \mathrm{CVA} + \mathrm{FVA}^{(0)} + \mathrm{MVA}$;

• For $k \ge 1$, writing explicitly EC = EC($L$) to emphasize the dependence of EC on $L$, let $L^{(k)}_0 = 0$ and, for $t < \tau$,

$$\begin{aligned}
dL^{(k)}_t &= \sum_c (1 - R_c)\Big((P^c + \mathcal{P}^c)_{\tau^\delta_c} - (\mathcal{P}^c + \mathrm{VM}^c + \mathrm{RIM}^c)_{\tau_c -}\Big)^+ \delta_{\tau^\delta_c}(dt) \\
&\quad + \lambda_t \Big(\sum_c J^c (P^c - \mathrm{VM}^c) - \mathrm{CA}^{(k-1)} - \max\big(\mathrm{EC}(L^{(k-1)}), \mathrm{KVA}^{(k-1)}\big)\Big)^+_t dt \\
&\quad + \lambda_t \sum_c J^c_t\, \mathrm{PIM}^c_t\, dt + d\mathrm{CA}^{(k-1)}_t, \\
\mathrm{KVA}^{(k)}_t &= h\, \mathbb{E}_t \int_t^T e^{-h(s-t)} \max\big(\mathrm{EC}_s(L^{(k)}), \mathrm{KVA}^{(k)}_s\big)\, ds, \\
\mathrm{CA}^{(k)}_t &= \mathrm{CVA}_t + \mathrm{FVA}^{(k)}_t + \mathrm{MVA}_t, \text{ where} \\
\mathrm{FVA}^{(k)}_t &= \mathbb{E}_t \int_t^T \lambda_s \Big(\sum_c J^c (P^c - \mathrm{VM}^c) - \mathrm{CA}^{(k)} - \max\big(\mathrm{EC}(L^{(k)}), \mathrm{KVA}^{(k)}\big)\Big)^+_s ds.
\end{aligned} \tag{58}$$

Theorem 4.1 in Crépey, Sabbagh, and Song (2020) Assuming square integrable data, the XVA equations are well-posed within square integrable solutions (including when one accounts for the fact that capital at risk can be used for funding variation margin). Moreover, the above Picard iteration converges to the unique square integrable solution of the XVA equations.
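Once (57) is discretized in time as in (46), each backward step is a scalar semi-linear equation of the form $y = a + b\,(x - y)^+$ with $b = h\lambda_t \ge 0$, whose unique solution is $y = a + \frac{b}{1+b}(x - a)^+$. The sketch below illustrates this step in a deliberately simplified setting where all inputs are deterministic, so that the conditional expectation is trivial; in the actual algorithm the continuation value would be the learned conditional expectation. The function names and the numerical inputs are hypothetical.

```python
import numpy as np

def fva_step(cont, funding_need, lam, h):
    """Solve y = cont + h*lam*(funding_need - y)^+ for y, elementwise."""
    b = h * lam
    return cont + b / (1.0 + b) * np.maximum(funding_need - cont, 0.0)

def fva0_profile(funding_need, lam, h):
    """Backward recursion in the spirit of (46) with deterministic inputs.

    funding_need[i] stands for sum_c J^c (P^c - VM^c) - CVA - MVA at time t_i.
    """
    n = len(funding_need)
    fva = np.zeros(n + 1)                 # terminal condition FVA_T = 0
    for i in range(n - 1, -1, -1):
        fva[i] = fva_step(fva[i + 1], funding_need[i], lam[i], h)
    return fva

# Hypothetical inputs: 10 years, 16 pricing steps per year, 1% funding spread.
h = 1.0 / 16
t = np.arange(160) * h
need = 1e6 * np.sin(np.pi * t / 10.0)     # hump-shaped funding need
lam = np.full_like(t, 0.01)
fva = fva0_profile(need, lam, h)

# Residual of the semi-linear equation at every pricing time.
residual = fva[:-1] - (fva[1:] + h * lam * np.maximum(need - fva[:-1], 0.0))
```

The closed form avoids an inner fixed-point loop at each pricing time; the residual array is zero up to floating-point error.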

A.4 Collateralization Schemes

We denote by $\Delta^c_t = \mathcal{P}^c_t - \mathcal{P}^c_{(t-\delta)-}$ the cumulative contractual cash flows with the counterparty $c$ accumulated over a past period of length $\delta$. In our case study, we consider both “no CSA” netting sets $c$, with VM = RIM = PIM = 0, and “(VM/IM) CSA” netting sets $c$, with $\mathrm{VM}^c_t = P^c_t$ and, for $t \le \tau_c$,

$$\mathrm{RIM}^c_t = \mathbb{V}a\mathbb{R}_t\Big((P^c_{t^\delta} + \Delta^c_{t^\delta}) - P^c_t\Big), \qquad \mathrm{PIM}^c_t = \mathbb{V}a\mathbb{R}_t\Big(-(P^c_{t^\delta} + \Delta^c_{t^\delta}) + P^c_t\Big), \tag{59}$$

for some PIM and RIM quantile levels $a_{pim}$ and $a_{rim}$ (and $t^\delta = t + \delta$).

The following result can be derived by similar computations as the ones in Albanese, Armenti, and Crépey (2020, Section A).

Proposition A.1 In a common shock default model of the counterparties and the bank itself (see the beginning of Section 5), with pre-default intensity processes $\gamma^c$ of the counterparties and $\gamma$ of the bank, then $\mathrm{CVA} = \mathrm{CVA}^{nocsa} + \mathrm{CVA}^{csa}$, where, for $t < \tau$,

$$\mathrm{CVA}^{nocsa}_t = \sum_{c\ \mathrm{nocsa}} \mathbb{1}_{t < \tau_c} (1 - R_c)\, \mathbb{E}_t \int_t^T (P^c_{s^\delta} + \Delta^c_{s^\delta})^+ \gamma^c_s\, e^{-\int_t^s \gamma^c_u du}\, ds + \sum_{c\ \mathrm{nocsa}} \cdots$$

…

$$\mathrm{MVA}^{csa}_t = \sum_{c\ \mathrm{csa}} J^c_t\, \mathbb{E}_t \int_t^T (1 - R)\, \gamma_s\, \mathrm{PIM}^c_s\, e^{-\int_t^s \gamma^c_u du}\, ds. \tag{62}$$

References

Abbas-Turki, L., B. Diallo, and S. Crépey (2018). XVA principles, nested Monte Carlo strategies, and GPU optimizations. International Journal of Theoretical and Applied Finance 21, 1850030.

Albanese, C., Y. Armenti, and S. Crépey (2020). XVA Metrics for CCP optimisation. Statistics & Risk Modeling 37(1-2), 25–53.

Albanese, C., S. Caenazzo, and S. Crépey (2017). Credit, funding, margin, and capital valuation adjustments for bilateral portfolios. Probability, Uncertainty and Quantitative Risk 2(7), 26 pages.

Albanese, C. and S. Crépey (2020). The cost-of-capital XVA approach in continuous time. Working paper available on https://math.maths.univ-evry.fr/crepey.

Andersen, L., D. Duffie, and Y. Song (2019). Funding value adjustments. Journal of Finance 74(1), 145–192.

Artzner, P., K.-T. Eisele, and T. Schmidt (2020). No arbitrage in insurance and the QP-rule. Working paper available as arXiv:2005.11022.

Barrera, D., S. Crépey, B. Diallo, G. Fort, E. Gobet, and U. Stazhynski (2019). Stochastic approximation schemes for economic capital and risk margin computations. ESAIM: Proceedings and Surveys 65, 182–218.

Beck, C., S. Becker, P. Cheridito, A. Jentzen, and A. Neufeld (2019). Deep splitting method for parabolic PDEs. arXiv:1907.03452.

Bichuch, M., A. Capponi, and S. Sturm (2018). Arbitrage-free XVA. Mathematical Finance 28(2), 582–620.

Bielecki, T. and M. Rutkowski (2002). Credit Risk: Modeling, Valuation and Hedging. Springer Finance, Berlin.

Bielecki, T. R. and M. Rutkowski (2015). Valuation and hedging of contracts with funding costs and collateralization. SIAM Journal on Financial Mathematics 6, 594–655.

Brigo, D. and A. Capponi (2010). Bilateral counterparty risk with application to CDSs. Risk Magazine, March 85–90. Preprint version available at https://arxiv.org/abs/0812.3705.

Brigo, D. and A. Pallavicini (2014). Nonlinear consistent valuation of CCP cleared or CSA bilateral trades with initial margins under credit, funding and wrong-way risks. Journal of Financial Engineering 1, 1–60.

Burgard, C. and M. Kjaer (2011). In the balance. Risk Magazine, October 72–75.

Burgard, C. and M. Kjaer (2013). Funding costs, funding strategies. Risk Magazine, December 82–87. Preprint version available at https://ssrn.com/abstract=2027195.

Burgard, C. and M. Kjaer (2017). Derivatives funding, netting and accounting. Risk Magazine, March 100–104. Preprint version available at https://ssrn.com/abstract=2534011.

Castagna, A. (2014). Towards a theory of internal valuation and transfer pricing of products in a bank: Funding, credit risk and economic capital. Available at http://ssrn.com/abstract=2392772.

Collin-Dufresne, P., R. Goldstein, and J. Hugonnier (2004). A general formula for valuing defaultable securities. Econometrica 72(5), 1377–1407.

Committee of European Insurance and Occupational Pensions Supervisors (2010). QIS5 technical specifications. https://eiopa.europa.eu/Publications/QIS/QIS5-technical specifications 20100706.pdf.

Crépey, S. (2015). Bilateral counterparty risk under funding constraints. Part I: Pricing, followed by Part II: CVA. Mathematical Finance 25(1), 1–22 and 23–50. First published online on 12 December 2012.

Crépey, S., T. R. Bielecki, and D. Brigo (2014). Counterparty Risk and Funding: A Tale of Two Puzzles. Chapman & Hall/CRC Financial Mathematics Series.

Crépey, S., W. Sabbagh, and S. Song (2020). When capital is a funding source: The anticipated backward stochastic differential equations of X-Value Adjustments. SIAM Journal on Financial Mathematics 11(1), 99–130.

Crépey, S. and S. Song (2016). Counterparty risk and funding: Immersion and beyond. Finance and Stochastics 20(4), 901–930.

Crépey, S. and S. Song (2017). Invariance times. The Annals of Probability 45(6B), 4632–4674.

Dimitriadis, T. and S. Bayer (2019). A joint quantile and expected shortfall regression framework. Electronic Journal of Statistics 13(1), 1823–1871.

Duffie, D. and M. Huang (1996). Swap rates and credit quality. Journal of Finance 51 …

… Options: recent advances in theory and practice, Volume 2, pp. 13–24. Manchester University Press.

Elouerkhaoui, Y. (2007). Pricing and hedging in a dynamic credit model. International Journal of Theoretical and Applied Finance 10(4), 703–731.

Elouerkhaoui, Y. (2017). Credit Correlation: Theory and Practice. Palgrave Macmillan.

Fissler, T. and J. Ziegel (2016). Higher order elicitability and Osband’s principle. The Annals of Statistics 44(4), 1680–1707.

Fissler, T., J. Ziegel, and T. Gneiting (2016). Expected Shortfall is jointly elicitable with Value at Risk - Implications for backtesting. Risk Magazine, January.

Föllmer, H. and A. Schied (2016). Stochastic Finance: An Introduction in Discrete Time (4th ed.). De Gruyter Graduate.

Goodfellow, I., Y. Bengio, and A. Courville (2016). Deep Learning. MIT Press.

Gottardi, P. (1995). An analysis of the conditions for the validity of Modigliani-Miller Theorem with incomplete markets. Economic Theory 5, 191–207.

Green, A., C. Kenyon, and C. Dennis (2014). KVA: capital valuation adjustment by replication. Risk Magazine, December 82–87. Preprint version “KVA: capital valuation adjustment” available at ssrn.2400324.

Hoeting, J. A., D. Madigan, A. E. Raftery, and C. T. Volinsky (1999). Bayesian model averaging: A tutorial. Statistical Science 14(4), 382–417.

Hull, J. and A. White (2012). The FVA debate, followed by The FVA debate continued. Risk Magazine, July 83–85 and October 52.

Huré, C., H. Pham, and C. Warin (2020). Some machine learning schemes for high-dimensional nonlinear PDEs. Mathematics of Computation 89(324), 1547–1580.

International Financial Reporting Standards (2013). IFRS 4 insurance contracts exposure draft.

Kjaer, M. (2019). In the balance redux. Risk Magazine (November).

Longstaff, F. A. and E. S. Schwartz (2001). Valuing American options by simulation: A simple least-squares approach. The Review of Financial Studies 14(1), 113–147.

Merton, R. (1974). On the pricing of corporate debt: the risk structure of interest rates. The Journal of Finance 29, 449–470.

Modigliani, F. and M. Miller (1958). The cost of capital, corporation finance and the theory of investment. Economic Review 48, 261–297.

Myers, S. (1977). Determinants of corporate borrowing. Journal of Financial Economics 5, 147–175.

Piterbarg, V. (2010). Funding beyond discounting: collateral agreements and derivatives pricing. Risk Magazine, August 57–63.

Schönbucher, P. (2004). A measure of survival.