XVA Analysis From the Balance Sheet
Claudio Albanese, Stephane Crepey, Rodney Hoskinson, Bouazza Saadeddine
September 2, 2020
Abstract
XVAs denote various counterparty risk related valuation adjustments that have been applied to financial derivatives since the 2007–09 crisis. We root a cost-of-capital XVA strategy in a balance sheet perspective, which is key to identifying the economic meaning of the XVA terms. Our approach is first detailed in a static setup that is solved explicitly. It is then plugged into the dynamic and trade incremental context of a real derivative banking portfolio. The corresponding cost-of-capital XVA strategy ensures to bank shareholders a submartingale equity process corresponding to a target hurdle rate on their capital at risk, consistently between and throughout deals. Set on a forward/backward SDE formulation, this strategy can be solved efficiently using GPU computing combined with deep learning regression methods in a whole bank balance sheet context. A numerical case study emphasizes the workability and added value of the ensuing pathwise XVA computations.
Keywords:
Counterparty risk, balance sheet of a bank, market incompleteness, wealth transfer, X-valuation adjustment (XVA), deep learning, quantile regression.
Mathematics Subject Classification:
JEL Classification:
D52, G13, G24, G28, G33, M41.
XVAs, with X as C for credit, D for debt, F for funding, M for margin, or K for capital, are post-2007–09 crisis valuation adjustments for financial derivatives. In broad terms to be detailed later in the paper (cf. Table 1 in Section 2), CVA is what the bank expects to lose due to
Global Valuation Ltd, London, United Kingdom
LaMME, Univ Evry, CNRS, Université Paris-Saclay
ANZ Banking Group, Singapore
Quantitative Research GMD/GMT, Credit Agricole CIB, Paris
Acknowledgement:
This article has been accepted for publication in Quantitative Finance, published by Taylor & Francis. We are grateful for useful discussions to Lokman Abbas-Turki, Agostino Capponi, Karl-Theodor Eisele, Chris Kenyon, Marek Rutkowski, and Michael Schmutz. The PhD thesis of Bouazza Saadeddine is funded by a CIFRE grant from CA-CIB and French ANRT.
Disclaimers:
The views expressed herein by Rodney Hoskinson and co-authors are their personal views and do not reflect the views of ANZ Banking Group Limited ("ANZ"). No liability shall be accepted by ANZ whatsoever for any direct or consequential loss from any use of this paper and the information, opinions and materials contained herein.
− DVA (1)
for the valuation of bilateral counterparty risk on a swap, assuming risk-free funding. This formula, rediscovered and generalized by others since the 2008–09 financial crisis (cf. e.g. Brigo and Capponi (2010)), is symmetrical, i.e. it is the negative of the analogous quantity considered from the point of view of the counterparty, consistent with the law of one price and the Modigliani and Miller (1958) theorem.
Around 2010, the materiality of the DVA windfall benefit of a bank at its own default time became the topic of intense debates in the quant and academic communities. At the very least, it seemed reasonable to accept that, if the own default risk of the bank was accounted for in the modeling, in the form of a DVA benefit, then the cost of funding (FVA) implication of this risk should be included as well, leading to the modified formula (CVA − DVA + FVA). See for instance Burgard and Kjaer (2011, 2013, 2017), Crépey (2015), Brigo and Pallavicini (2014), or Bichuch, Capponi, and Sturm (2018). See also Bielecki and Rutkowski (2015) for an abstract funding framework (without explicit reference to XVAs), generalizing Piterbarg (2010) to a nonlinear setup.
Then Hull and White (2012) objected that the FVA was only the compensator of another windfall benefit of the bank at its own default, corresponding to the non-reimbursement by the bank of its funding debt. Accounting for the corresponding "DVA2" (akin to the FDA in this paper) leads back to the original firm valuation formula:
CVA − DVA + FVA − FDA = CVA − DVA, as FVA = FDA (assuming risky funding fairly priced, as we will see).
However, their argument implicitly assumes that the bank can perfectly hedge its own default: cf. Burgard and Kjaer (2013, end of Section 3.1) and see Section 3.5 below. As a bank is an intrinsically leveraged entity, this is not the case in practice.
One can mention the related corporate finance notion of debt overhang in Myers (1977), by which a project valuable for the firm as a whole may be rejected by shareholders because the project is mainly valuable to bondholders. But, until recently, such considerations were hardly taken into account in the field of derivative pricing.
The first ones to recast the XVA debate in the perspective of the balance sheet of the bank were Burgard and Kjaer (2011), to explain that an appropriately hedged derivative position has no impact on the dealer's funding costs. Also relying on balance sheet models of a dealer bank, Castagna (2014) and Andersen, Duffie, and Song (2019) end up with conflicting conclusions, namely that the FVA should, respectively should not, be included in the valuation of financial derivatives. Adding the KVA, but in a replication framework, Green, Kenyon, and Dennis (2014) conclude that both the FVA and the KVA should be included as add-ons in entry prices and as liabilities in the balance sheet.
Our key premise is that counterparty risk entails two distinct but intertwined sources of market incompleteness:
• A bank cannot perfectly hedge counterparty default losses, for lack of sufficiently liquid CDS markets;
• A bank can even less hedge its own jump-to-default exposure, because this would mean selling protection on its own default, which is impractical and, under certain jurisdictions, even legally forbidden (see Section 2).
We specify the banking XVA metrics that align derivative entry prices to shareholder interest, given this impossibility for a bank to replicate the jump-to-default related cash flows.
We develop a cost-of-capital XVA approach consistent with the accounting standards set out in IFRS 4 Phase II (see International Financial Reporting Standards (2013)), inspired by the Swiss solvency test and Solvency II insurance regulatory frameworks (see Swiss Federal Office of Private Insurance (2006) and Committee of European Insurance and Occupational Pensions Supervisors (2010)), which so far has no analogue in the banking domain. Under this approach, the valuation (CL) of the so-called contra-liabilities and the cost of capital (KVA) are sourced from clients at trade inceptions, on top of the (CVA − DVA) complete market valuation of counterparty risk, in order to compensate bank shareholders for wealth transfer and risk on their capital.
The cost of the corresponding collateralization, accounting, and dividend policy is, by contrast with the complete market valuation (CVA − DVA) of counterparty risk,
CVA + FVA + KVA, (2)
computed unilaterally in a certain sense (even though we do crucially include the default of the bank itself in our modeling), and charged to clients on an incremental run-off basis at every new deal.
All in all, our cost-of-capital XVA strategy makes shareholder equity a submartingale with drift corresponding to a hurdle rate h on shareholder capital at risk, consistently between and throughout deals. Thus we arrive at a sustainable strategy for profits retention, much like in the above-mentioned insurance regulation, but in a consistent continuous-time and banking framework.
Last but not least, our approach can be solved efficiently using GPU computing combined with deep learning regression methods in a whole bank balance sheet context.
Section 2 sets a financial stage where a bank is split across several trading desks and entails different stakeholders. Section 3 develops our cost-of-capital XVA approach in a one-period static setup.
Section 4 revisits the approach at the dynamic and trade incremental level. Section 5 is a numerical case study on large, multi-counterparty portfolios of interest rate swaps, based on the continuous-time XVA equations for bilateral trade portfolios recalled in Section A.
(See also Remark 2.1 regarding the meaning of the FVA in (2).)
The main contributions of the paper are:
• The one-period static XVA model of Section 3, with explicit formulas for all the quantities at hand, offering a concrete grasp on the related wealth transfer and risk premium issues;
• Proposition 4.1, which establishes the connections between XVAs and the core equity tier 1 capital of the bank, respectively bank shareholder equity;
• Proposition 4.2, which establishes that, under the XVA policy represented by the balance conditions (4) between deals and the counterparty risk add-on (43) throughout deals, bank shareholder equity is a submartingale with drift corresponding to a target hurdle rate h on shareholder capital at risk. This perspective solves the puzzle according to which, on the one hand, XVA computations are performed on a run-off portfolio basis, while, on the other hand, they are used for computing pricing add-ons to new deals;
• The XVA deep learning (quantile) regression computational strategy of Section 4.4;
• The numerical case study of Section 5, which emphasizes the materiality of refined, pathwise XVA computations, as compared to more simplistic XVA approaches.
From a broader point of view, this paper reflects a shift of paradigm regarding the pricing and risk management of financial derivatives, from hedging to balance sheet optimization, as quantified by relevant XVA metrics.
In particular (compare with the last paragraph before Section 1.1), our approach implies that the FVA (and also the MVA, see Remark 2.1) should be included as an add-on in entry prices and as a liability in the balance sheet; the KVA should be included as an add-on in entry prices, but not as a liability in the balance sheet.
From a computational point of view, this paper opens the way to second generation XVA GPU implementation. The first generation consisted of nested Monte Carlo implemented by explicit CUDA programming on GPUs (see Albanese, Caenazzo, and Crépey (2017), Abbas-Turki, Diallo, and Crépey (2018)). The second generation takes advantage of GPUs via pre-coded CUDA/AAD deep learning packages that are used for the XVA embedded regression and quantile regression tasks. Compared to a regulatory capital based KVA approach, an economic capital based KVA approach is then not only conceptually more satisfying, but also simpler to implement.
We consider a dealer bank, which is a market maker involved in bilateral derivative portfolios. For simplicity, we only consider European derivatives. The bank has two kinds of stakeholders, shareholders and bondholders. The shareholders have the control of the bank and are solely responsible for investment decisions before bank default. The bondholders represent the senior creditors of the bank, who have no decision power until bank default, but are protected by laws, of the pari-passu type, forbidding trades that would trigger value away from them to shareholders during the default resolution process of the bank. The bank also has junior creditors, represented in our framework by an external funder, who can lend unsecured to the bank and is assumed to suffer an exogenously given loss-given-default in case of default of the bank.
We consider three kinds of business units within the bank (see Figure 1 for the corresponding picture of the bank balance sheet and refer to Table 1 for a list of the main financial acronyms used in the paper): the
CA desks, i.e. the CVA desk and the FVA desk (or Treasury) of the bank, in charge of contra-assets, i.e. of counterparty risk and its funding implications for the bank; the clean desks, who focus on the market risk of the contracts in their respective business lines; the management of the bank, in charge of the dividend release policy of the bank.

Acronym   Meaning                                Where introduced
Amounts on dedicated cash accounts of the bank:
CM        Clean margin                           Definition 2.1 and Assumption 2.1
RC        Reserve capital                        Definition 2.1 and Assumption 2.1
RM        Risk margin                            Definition 2.1 and Assumption 2.1
UC        Uninvested capital                     Definition 2.1 and Assumption 2.1
Valuations:
CA        Contra-assets valuation                (3), (16), and (51)
CL        Contra-liabilities valuation           Definition 2.1 and (18), (35), and (43)
CVA       Credit valuation adjustment            (17), (16), (52), and (60)–(61)
DVA       Debt valuation adjustment              (18) and (17)
FDA       Funding debt adjustment                (18) and (23)
FV        Firm valuation of counterparty risk    (21) and (23)
FVA       Funding valuation adjustment           Remark 2.1, (17), (16), and (52)
KVA       Capital valuation adjustment           (4), (26), and (56)
MtM       Mark-to-market                         (4) and (15)
MVA       Margin valuation adjustment            Remark 2.1, (33), (52), and (62)
XVA       Generic "X" valuation adjustment       First paragraph
Also:
CR        Capital at risk                        (54)
CET1      Core equity tier I capital             (3) and (40)
EC        Economic capital                       Definitions 3.2 and A.1
FTP       Funds transfer price                   (43)
SHC       Shareholder capital (or equity)        (3) and (41)
SCR       Shareholder capital at risk            Assumption 2.1 and (25)

Table 1: Main financial acronyms and place where they are introduced conceptually and/or specified mathematically in the paper, as relevant.

[Figure 1 (diagram not reproduced). Assets and liabilities columns of the bank balance sheet, showing reserve capital (RC), risk margin (RM = KVA), shareholder capital at risk (SCR), uninvested capital (UC), core equity tier I capital (CET1), capital at risk (CR), accounting equity, contra-assets (CA) with CVA and FVA, contra-liabilities (CL) with DVA and FDA = FVA, the mark-to-market of the portfolio receivables and payables, and the collateral posted and received by the clean desks, together with the CVA desk, the FVA desk (Treasury), the clean desks, and the KVA desk (management).]

Figure 1: Balance sheet of a dealer bank. Contra-liability valuation (CL) at the top is shown in dotted boxes because it is of value only to the bondholders (see Section 3.5). Mark-to-market valuation (MtM) of the derivative portfolio of the bank by the clean desks, as well as the corresponding collateral (clean margin CM), are shown in dashed boxes at the bottom. Their role will essentially vanish in our setup, where we assume a perfect clean hedge by the bank. The arrows in the left column represent trading losses of the CA desks in "normal years 1 to 39" and in an "exceptional year 40" with full depletion (i.e. refill via UC, under Assumption 2.1.ii) of RC, RM, and SCR. The numberings yr1 to yr40 are fictitious yearly scenarios in line with a 97.5% expected shortfall of the one-year-ahead trading losses of the bank that we use for defining its economic capital. The arrows in the right column symbolize the average depreciation in time of contra-assets between deals. The collateral between the bank and its counterparties is not shown, to keep the picture simple.

Collateral means cash or liquid assets that are posted to guarantee a netted set of transactions against defaults. It comes in two forms: variation margin, which is re-hypothecable, i.e. fungible across netting sets, and initial margin, which is segregated. We assume cash only collateral. Posted collateral is supposed to be remunerated at the risk-free rate (assumed to exist, with overnight index swap rates as a best market proxy).
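The 97.5% expected shortfall referred to in the caption of Figure 1 can be illustrated on 40 equally likely yearly scenarios, for which the tail of mass 2.5% is exactly the single worst year (the "exceptional year 40"). The following is a minimal sketch only, not the implementation used in the paper: the helper name `expected_shortfall`, the sample-based estimator, and the loss figures are assumptions made for illustration.

```python
import numpy as np

def expected_shortfall(losses, alpha=0.975):
    """Empirical expected shortfall: average of the worst (1 - alpha) fraction
    of equally weighted loss scenarios (hypothetical helper, for illustration)."""
    x = np.sort(np.asarray(losses, dtype=float))[::-1]   # largest losses first
    tail = (1.0 - alpha) * x.size                        # tail mass, in number of scenarios
    k = int(np.floor(tail + 1e-9))                       # scenarios lying fully inside the tail
    total = x[:k].sum()
    if k < x.size and tail - k > 1e-9:                   # partial weight on the next scenario
        total += (tail - k) * x[k]
    return total / tail

# Made-up yearly trading losses: 39 "normal" years and one exceptional year.
yearly_losses = np.concatenate([np.linspace(-5.0, 5.0, 39), [60.0]])
ec_proxy = expected_shortfall(yearly_losses, alpha=0.975)
```

With 40 equiprobable scenarios, `ec_proxy` coincides (up to rounding) with the loss of the worst year, which is the sense in which one year out of forty matches a 97.5% confidence level.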
Remark 2.1
To alleviate the notation, in this conceptual section of the paper, we only consider an FVA as the global cost of raising collateral for the bank, as opposed to a distinction, in the industry and in later sections in the paper, between an FVA, in the strict sense of the cost of raising variation margin, and an MVA for the cost of raising initial margin.
The CA desks guarantee the trading of the clean desks against counterparty defaults, through a clean margin account, which can be seen as (re-hypothecable) collateral exchanged between the CA desks and the clean desks. The corresponding clean margin amount (CM) also plays the role of the funding debt of the clean desks put at their disposal at a risk-free cost by the Treasury of the bank. This is at least the case when CM > 0. When CM < 0, (−CM) corresponds to excess cash generated by the trading of the clean desks, usable by the Treasury for its other funding purposes. See the bottom, dashed boxes in Figure 1.
In addition, the CA desks value the contra-assets (future counterparty default losses and funding expenditures), charge them to the (corporate) clients at deal inception, deposit the corresponding payments in a reserve capital account, and then are exposed to the corresponding payoffs. As time proceeds, contra-assets realize and are covered by the CA desks with the reserve capital account.
On top of reserve capital, the so-called risk margin is sourced by the management of the bank from the clients at deal inception, deposited into a risk margin account, and then gradually released as KVA payments into the shareholder dividend stream.
Another account contains the shareholder capital at risk earmarked by the bank to deal with exceptional trading losses (beyond the expected losses that are already accounted for by reserve capital).
Last, there is one more bank account with shareholder uninvested capital.
All cash accounts are remunerated at the risk-free rate.
Definition 2.1
We write CM, RC, RM, SCR, and UC for the respective (risk-free discounted) amounts on the clean margin, reserve capital, risk margin, shareholder capital at risk, and uninvested capital accounts of the bank. We also define
$$\mathrm{SHC} = \mathrm{SCR} + \mathrm{UC}, \qquad \mathrm{CET1} = \mathrm{RM} + \mathrm{SCR} + \mathrm{UC}. \tag{3}$$
From a financial interpretation point of view, before bank default, SHC corresponds to shareholder capital (or equity); CET1 is the core equity tier I capital of the bank, representing the financial strength of the bank assessed from a regulatory, structural solvency point of view, i.e. the sum of shareholder capital and the risk margin (which is also loss-absorbing), but excluding the value CL of the so-called contra-liabilities (see Figure 1). Indeed, the latter only benefits the bondholders (cf. Section 3.5), hence it only enters accounting equity. Before the default of the bank, shareholder wealth and bondholder wealth are respectively given by $\mathrm{SHC} + \mathrm{RM}^{sh}$ and $\mathrm{CL} + \mathrm{RM}^{bh}$, for shareholder and bondholder components of RM to be detailed in Remark 3.3; shareholder and bondholder wealths sum up to the accounting equity RM + SCR + UC + CL, i.e. the wealth of the firm as a whole (see Figure 1).
Remark 2.2
The purpose of our capital structure model of the bank is not to model the default of the bank, like in a Merton (1974) model, as the point of negative equity (i.e. CET1 < 0). Instead, we model the default of the bank as occurring at a time $\tau$ calibrated to the credit default swap (CDS) curve referencing the bank. Indeed we view the latter as the most reliable and informative credit data regarding anticipations of market participants about future recapitalization, government intervention, bail-in, and other bank failure resolution policies.
The aim of our capital structure model, instead, is to put in a balance sheet perspective the contra-assets and contra-liabilities of a dealer bank, items which are not present in the Merton model and play a key role in our XVA analysis.
In line with the Volcker rule banning proprietary trading for a bank, we assume a perfect market hedge of the derivative portfolio of the bank by the clean desks, in a sense to be specified below in the respective static and continuous-time setups. By contrast, as jump-to-default exposures (own jump-to-default exposure, in particular) cannot be hedged by the bank (cf. Section 1.1), we conservatively assume no XVA hedge.
We work on a measurable space $(\Omega, \mathcal{A})$ endowed with a probability measure $\mathbb{Q}^*$, with $\mathbb{Q}^*$ expectation denoted by $\mathbb{E}^*$, which is used for the linear valuation task, using the risk-free asset as our numéraire everywhere.
Remark 2.3
Regarding the nature of our reference probability measure $\mathbb{Q}^*$, "physical or risk-neutral", one should view it as a blend between the two. For instance, even if we do not use this explicitly in the paper, one could conceptually think of $\mathbb{Q}^*$ as the probability measure introduced by Dybvig (1992) to deal with incomplete markets that are a mix of financial traded risk factors and unhedgeable ones (jumps to default, in our setup), recently revisited in a finance and insurance context by Artzner, Eisele, and Schmidt (2020). Namely, one could think of $\mathbb{Q}^*$ as the unique probability measure on $\mathcal{A}$ that coincides (i) with a given risk-neutral pricing measure on the financial σ-algebra (a sub-σ-algebra of $\mathcal{A}$), and (ii) with the physical probability measure conditional on the financial σ-algebra (the risk-neutral and physical measures being assumed equivalent on the financial σ-algebra). See Artzner, Eisele, and Schmidt (2020, Proposition 2.1) for a proof. The risk-neutral pricing measure (hence, in view of (i), $\mathbb{Q}^*$ itself) is calibrated to prices of fully collateralized transactions for which counterparty risk is immaterial. The physical probability measure expresses user views on the unhedgeable risk factors. The uncertainty about $\mathbb{Q}^*$ can be dealt with by a Bayesian variation on our baseline XVA approach, whereby paths of alternative, co-calibrated models are combined in a global simulation (cf. Hoeting, Madigan, Raftery, and Volinsky (1999)).
Until Section 4.2, we consider the case of a portfolio held on a run-off basis, i.e. set up at time 0 and such that no new unplanned trades enter the portfolio in the future.
The trading cash flows of the bank (cumulative cash flow streams starting from 0 at time 0) then consist of
• the contractually promised cash flows P from counterparties,
• the counterparty credit cash flows C to counterparties (i.e., because of counterparty risk, the effective cash flows from counterparties are P − C),
• the risky funding cash flows F to the external funder, and
• the hedging cash flows H of the clean desks to financial hedging markets
(note that all cash flow differentials can be positive or negative). See Section 3.1 and (49)–(50) for concrete specifications in the respective one-period and continuous-time setups.
Assumption 2.1
i. (Self-financing condition) RC + RM + SCR + UC − CM evolves like the received trading cash flows P − C − F − H.
ii. (Mark-to-model) The amounts on all the accounts but UC are marked-to-model (hence the last, residual amount, UC, plays the role of an adjustment variable). Specifically, we assume that the following shareholder balance conditions hold at all times:
$$\mathrm{CM} = \mathrm{MtM}, \qquad \mathrm{RC} = \mathrm{CA}, \qquad \mathrm{RM} = \mathrm{KVA}, \tag{4}$$
for theoretical target levels MtM, CA, and KVA to be specified in later sections of the paper (which will also determine the theoretical target level for SCR).
iii. (Agents) The initial amounts $\mathrm{MtM}_0$, $\mathrm{CA}_0$, and $\mathrm{KVA}_0$ are provided by the clients at portfolio inception time 0. Resets between time 0 and the bank default time $\tau$ (excluded) are on bank shareholders. At the (positive) bank default time $\tau$, the property of the residual amount on the reserve capital and risk margin accounts is transferred from the shareholders to the bondholders of the bank.
Remark 2.4
In an asymmetric setup with a price maker and a price taker, the price maker passes his costs to the price taker. Accordingly, in our setup, the (corporate) clients provide all the amounts to the clean margin, reserve capital, and risk margin accounts of the bank required for resetting the accounts to their theoretical target levels (4) corresponding to the updated portfolio.
Under a cost-of-capital XVA approach, we define valuation so as to make shareholder trading losses (that include marked-to-model liability fluctuations) centered, then we add a KVA risk premium in order to ensure to bank shareholders some positive hurdle rate h on their capital at risk.
In what follows, such an approach is developed, first, in a static setup, which can be solved explicitly, and then, in a dynamic and trade incremental setup, as suitable for dealing with a real derivative banking portfolio.
In this section, we apply the cost-of-capital XVA approach to a portfolio made of a single deal, P (random variable promised to the bank), between a bank and a client, without prior endowment, in an elementary one-period (one year) setup. All the trading cash flows P, C, F, and H are then random variables (as opposed to processes in a multi-period setup later in the paper). We first assume no collateral exchanged between the bank and its client (but collateral exchanged as always between the CA and the clean desks as well as collateral on the market hedge of the bank, as explained after Remarks 2.1 and 2.2, respectively). Risky funding assets are assumed fairly priced by the market, in the sense that $\mathbb{E}^* F = 0$.
The bank and client are both default prone with zero recovery to each other. The bank also has zero recovery to its external funder.
We denote by $J$ and $J_1$ the survival indicators (random variables) of the bank and client at time 1, with default probability of the bank $\mathbb{Q}^*(J = 0) = \gamma$.
Since prices and XVAs only matter at time 0 in a one-period setup, we identify all the XVA processes, as well as the mark-to-market (valuation by the clean desks) MtM of the deal, with their values at time 0.
For any random variable $\mathcal{Y}$, we define
$$\mathcal{Y}^\circ = J\mathcal{Y} \quad\text{and}\quad \mathcal{Y}^\bullet = -(1 - J)\mathcal{Y}, \quad\text{hence}\quad \mathcal{Y} = \mathcal{Y}^\circ - \mathcal{Y}^\bullet. \tag{5}$$
Let $\mathbb{E}$ denote the expectation with respect to the bank survival measure, say $\mathbb{Q}$, associated with $\mathbb{Q}^*$, i.e., for any random variable $\mathcal{Y}$,
$$\mathbb{E}\mathcal{Y} = (1 - \gamma)^{-1}\,\mathbb{E}^*(\mathcal{Y}^\circ) \tag{6}$$
(which is also equal to $\mathbb{E}\mathcal{Y}^\circ$). The notion of bank survival measure was introduced in greater generality by Schönbucher (2004). In the present static setup, (6) is nothing but the $\mathbb{Q}^*$ expectation of $\mathcal{Y}$ conditional on the survival of the bank (note that, whenever $\mathcal{Y}$ is independent from $J$, the right-hand side in (6) coincides with $\mathbb{E}^*\mathcal{Y}$).
Lemma 3.1
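For any random variable $\mathcal{Y}$ and constant $Y$, we have
$$Y = \mathbb{E}^*\big(\mathcal{Y}^\circ + (1 - J)Y\big) \iff Y = \mathbb{E}\mathcal{Y}. \tag{7}$$
Proof. Indeed,
$$Y = \mathbb{E}^*\big(J\mathcal{Y} + (1 - J)Y\big) \iff \mathbb{E}^*\big(J(\mathcal{Y} - Y)\big) = 0 \iff \mathbb{E}(\mathcal{Y} - Y) = 0 \iff Y = \mathbb{E}\mathcal{Y}, \tag{8}$$
where the equivalence in the middle is justified by (6).
Remark 3.1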
For simplicity in a first stage, we will ignore the possibility of using capital at risk for funding purposes, only considering in this respect reserve capital RC = CA (cf. (4)). The additional free funding source provided by capital at risk will be introduced later, as well as collateral between bank and client, in Section 3.4.
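The survival expectation (6) and the fixed-point characterization of Lemma 3.1 can be checked numerically on a small finite probability space. The sketch below is illustrative only: the four states, their probabilities, and the payoffs are made-up assumptions, and the helper `e_star` is a hypothetical name for the expectation under the reference measure.

```python
import numpy as np

# Hypothetical four-state example; columns: probability under Q*, bank survival
# indicator J, and payoff Y (made-up numbers, with Y depending on J).
states = np.array([
    [0.60, 1.0, 10.0],
    [0.30, 1.0, -4.0],
    [0.06, 0.0, 25.0],
    [0.04, 0.0, -8.0],
])
prob, J, Y = states.T
gamma = prob[J == 0].sum()            # bank default probability Q*(J = 0)

def e_star(x):
    """Expectation under the reference measure Q* on this finite space."""
    return float(np.dot(prob, x))

# Formula (6): survival expectation as a rescaled Q* expectation of J * Y ...
survival_mean = e_star(J * Y) / (1.0 - gamma)
# ... which is the Q* expectation of Y conditional on the survival of the bank.
conditional_mean = float(np.dot(prob[J == 1], Y[J == 1]) / prob[J == 1].sum())

# Lemma 3.1: the constant y solving y = E*(J Y + (1 - J) y) is the survival mean.
# Picard iteration converges because the map is a contraction with factor gamma < 1.
y = 0.0
for _ in range(200):
    y = e_star(J * Y + (1.0 - J) * y)

plain_mean = e_star(Y)                # differs from survival_mean since Y depends on J
```

Because the payoff here is not independent of the bank survival indicator, `plain_mean` and `survival_mean` differ, in line with the parenthetical remark following (6).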
Lemma 3.2
Given the (to be specified) MtM and CA amounts (cf. Assumption 2.1.ii), the credit and funding cash flows C and F of the bank and its trading loss (and profit) L are such that
$$\begin{aligned}
&C^\circ = J(1 - J_1)P^+, && F^\circ = J\gamma(\mathrm{MtM} - \mathrm{CA})^+,\\
&C^\bullet = (1 - J)\big(P^- - (1 - J_1)P^+\big), && F^\bullet = (1 - J)\big((\mathrm{MtM} - \mathrm{CA})^+ - \gamma(\mathrm{MtM} - \mathrm{CA})^+\big),\\
&L^\circ = C^\circ + F^\circ - J\,\mathrm{CA}, && L^\bullet = C^\bullet + F^\bullet + (1 - J)\mathrm{CA},\\
&L = C + F - \mathrm{CA}.
\end{aligned} \tag{9}$$
Proof. For the deal to occur, the bank needs to borrow $(\mathrm{MtM} - \mathrm{CA})^+$ unsecured or invest $(\mathrm{MtM} - \mathrm{CA})^-$ risk-free (cf. Remark 3.1). Having assumed zero recovery to the external funder, unsecured borrowing is fairly priced as $\gamma$ × the amount borrowed by the bank (in line with our assumption that $\mathbb{E}^* F = 0$), i.e. the bank must pay for its risky funding the amount $\gamma(\mathrm{MtM} - \mathrm{CA})^+$. Moreover, at time 1, under zero recovery upon defaults:
• If the bank is not in default (i.e. $J = 1$), then the bank closes its position with the client while receiving $P$ from its client if the latter is not in default (i.e. $J_1 = 1$), whereas the bank pays $P^-$ to its client if the latter is in default (i.e. $J_1 = 0$). In addition, the bank reimburses its funding debt $(\mathrm{MtM} - \mathrm{CA})^+$ or receives back the amount $(\mathrm{MtM} - \mathrm{CA})^-$ it had lent at time 0;
• If the bank is in default (i.e. $J = 0$), then the bank receives back $J_1 P^+$ on the derivative as well as the amount $(\mathrm{MtM} - \mathrm{CA})^-$ it had lent at time 0.
Also accounting for the hedging loss H, the trading loss of the bank over the year is
$$L = \gamma(\mathrm{MtM} - \mathrm{CA})^+ - J\big(J_1 P - (1 - J_1)P^- - (\mathrm{MtM} - \mathrm{CA})^+ + (\mathrm{MtM} - \mathrm{CA})^-\big) - (1 - J)\big(J_1 P^+ + (\mathrm{MtM} - \mathrm{CA})^-\big) + H. \tag{10}$$
In the static setup, the perfect clean hedge condition (see after Remark 2.2) writes $H = P - \mathrm{MtM}$. Inserting this into the above yields
$$L = (1 - J_1)P^+ + \gamma(\mathrm{MtM} - \mathrm{CA})^+ - \mathrm{CA} - (1 - J)\big(P^- + (\mathrm{MtM} - \mathrm{CA})^+\big), \tag{11}$$
as easily checked for each of the four possible values of the pair $(J, J_1)$. That is,
$$\begin{aligned}
L^\circ &= \underbrace{J(1 - J_1)P^+}_{C^\circ} + \underbrace{J\gamma(\mathrm{MtM} - \mathrm{CA})^+}_{F^\circ} - J\,\mathrm{CA},\\
L^\bullet &= \underbrace{(1 - J)\big(P^- - (1 - J_1)P^+\big)}_{C^\bullet} + \underbrace{(1 - J)\big((\mathrm{MtM} - \mathrm{CA})^+ - \gamma(\mathrm{MtM} - \mathrm{CA})^+\big)}_{F^\bullet} + (1 - J)\mathrm{CA},
\end{aligned} \tag{12}$$
where the identification of the different terms as part of C or F follows from their financial interpretation.
Remark 3.2
The derivation (10) implicitly allows for negative equity (that arises whenever $L^\circ > \mathrm{CET1}$, cf. (3)), which is interpreted as recapitalization. In a variant of the model excluding both recapitalization and negative equity, the default of the bank would be modeled in a structural fashion as the event $\{\mathcal{L} = \mathrm{CET1}\}$, where
$$\mathcal{L} = \big((1 - J_1)P^+ + \gamma(\mathrm{MtM} - \mathrm{CA})^+ - \mathrm{CA}\big) \wedge \mathrm{CET1}, \tag{13}$$
and we would obtain, instead of (11), the following trading loss for the bank:
$$\mathbb{1}_{\{\mathrm{CET1} > \mathcal{L}\}}\,\mathcal{L} + \mathbb{1}_{\{\mathrm{CET1} = \mathcal{L}\}}\big(\mathrm{CET1} - P^- - (\mathrm{MtM} - \mathrm{CA})^+\big). \tag{14}$$
In this paper we consider a model with recapitalization for the reasons explained in Remark 2.2.
Structural XVA approaches in a static setup have been proposed in Andersen, Duffie, and Song (2019) (without KVA) and Kjaer (2019) (including the KVA). Their marginal, limiting results as a new deal size goes to zero are comparable to some of the results that we have here. But then, instead of developing a continuous-time version of their corporate finance model and taking the small trade limit, these papers start the development of the continuous-time model from the single period small trade limit model. By contrast, our framework is developed end to end, both in the continuous-time model of Section 4 and in the present single period model.

3.2 Contra-assets and Contra-liabilities
To make shareholder trading losses centered (cf. the next-to-last paragraph of Section 2), clean and CA desks value by $\mathbb{Q}^*$ expectation their shareholder sensitive cash flows. These include, in case of default of the bank, the transfer of property from the CA desks to the clean desks of the collateral amount MtM on the clean margin account, as well as (cf. Assumptions 2.1.ii and iii) the transfer from shareholders to bondholders of the residual value RC = CA on the reserve capital account. Accordingly:
Definition 3.1
We let
$$\mathrm{MtM} = \mathbb{E}^*\big(P^\circ + (1 - J)\mathrm{MtM}\big) \tag{15}$$
and
$$\mathrm{CA} = \mathrm{CVA} + \mathrm{FVA}, \tag{16}$$
where
$$\mathrm{CVA} = \mathbb{E}^*\big(C^\circ + (1 - J)\mathrm{CVA}\big), \qquad \mathrm{FVA} = \mathbb{E}^*\big(F^\circ + (1 - J)\mathrm{FVA}\big), \tag{17}$$
hence $\mathrm{CA} = \mathbb{E}^*\big(C^\circ + F^\circ + (1 - J)\mathrm{CA}\big)$. We also define the contra-liabilities value
$$\mathrm{CL} = \mathrm{DVA} + \mathrm{FDA}, \tag{18}$$
where
$$\mathrm{DVA} = \mathbb{E}^*\big(C^\bullet + (1 - J)\mathrm{CVA}\big), \tag{19}$$
$$\mathrm{FDA} = \mathbb{E}^*\big(F^\bullet + (1 - J)\mathrm{FVA}\big). \tag{20}$$
Finally we define the firm valuation of counterparty risk,
$$\mathrm{FV} = \mathbb{E}^*(C + F). \tag{21}$$
The definitions of MtM, CVA, and FVA are in fact fixed-point equations. However, the following result shows that these equations are well-posed and yields explicit formulas for all the quantities at hand.
Proposition 3.1
We have
$$\begin{aligned}
\mathrm{MtM} &= \mathbb{E}P^\circ,\\
\mathrm{CVA} &= \mathbb{E}\big((1 - J_1)P^+\big),\\
\mathrm{FVA} &= \gamma(\mathrm{MtM} - \mathrm{CA})^+ = \frac{\gamma}{1 + \gamma}(\mathrm{MtM} - \mathrm{CVA})^+
\end{aligned} \tag{22}$$
and
$$\begin{aligned}
&\mathbb{E}^*L^\circ = \mathbb{E}L = 0,\\
&\mathrm{FDA} = \mathrm{FVA},\\
&\mathrm{FV} = \mathbb{E}^*C = \mathrm{CVA} - \mathrm{DVA} = \mathrm{CA} - \mathrm{CL}.
\end{aligned} \tag{23}$$
Proof. The first identities in each line of (22) follow from Definition 3.1 by Lemma 3.1 and definition of the involved cash flows in Lemma 3.2. Given (16), the formula $\mathrm{FVA} = \gamma(\mathrm{MtM} - \mathrm{CA})^+$ in (22) is in fact a semi-linear equation
$$\mathrm{FVA} = \gamma(\mathrm{MtM} - \mathrm{CVA} - \mathrm{FVA})^+. \tag{24}$$
But, as $\gamma$ (a probability) is nonnegative, this equation has the unique solution given by the right-hand side in the third line of (22).
Regarding (23), we have
$$\mathbb{E}^*L^\circ = (1 - \gamma)\,\mathbb{E}\big((1 - J_1)P^+ + \gamma(\mathrm{MtM} - \mathrm{CA})^+ - \mathrm{CA}\big) = 0,$$
by application of (6), the first line in (12), (22), and (16). Hence, using (6) again,
$$\mathbb{E}L = (1 - \gamma)^{-1}\,\mathbb{E}^*L^\circ = 0.$$
This is the first line in (23), which implies the following ones by definition of the involved quantities and from the assumption that $\mathbb{E}^*F = 0$.
Note that $\mathrm{MtM} = \mathbb{E}P^\circ$ also coincides with $\mathbb{E}P$ (cf. (22) and the parenthesis following (6)). In practice $P^\circ$ has fewer terms than $P$ (that also includes cash flows from bank default onward), which is why we favor the formulation $\mathbb{E}P^\circ$ in (22). The alternative formulation $\mathbb{E}P$ may seem more in line with the intuition of MtM as value deprived from any credit/funding considerations. However, as the measure underlying $\mathbb{E}$ is the survival one (see before Lemma 3.1), this intuition is in fact simplistic and only strictly correct in the case without wrong way risk between credit and market (cf. the parenthesis preceding Lemma 3.1).

Economic capital (EC) is the level of capital at risk that a regulator would like to see on an economic, structural basis. Risk calculations are typically performed by banks "on a going concern", i.e. assuming that the bank itself does not default. Accordingly:
Definition 3.2
The economic capital (EC) of the bank is given by the 97.5% expected shortfall of the bank trading loss $L$ under $\mathbb{Q}$, which we denote by $\mathbb{ES}(L^\circ)$.

The risk margin (sized by the to-be-defined KVA in our setup) is also loss-absorbing, i.e. part of capital at risk, and the KVA is originally sourced from the client (see Assumption 2.1.iii). Hence, shareholder capital at risk only consists of the difference between the (total) capital at risk and the KVA. Accordingly (and also accounting, regarding (26), for the last part in Assumption 2.1.iii):

Definition 3.3
The capital at risk (CR) of the bank is given by $\max(\mathrm{EC}, \mathrm{KVA})$ and the ensuing shareholder capital at risk (SCR) by
$$\mathrm{SCR} = \max(\mathrm{EC}, \mathrm{KVA}) - \mathrm{KVA} = (\mathrm{EC}-\mathrm{KVA})^+, \tag{25}$$
where, given some hurdle rate (target return-on-equity) $h$,
$$\mathrm{KVA} = \mathbb{E}^*\big(h\,\mathrm{SCR}^\circ + (1-J)\,\mathrm{KVA}\big). \tag{26}$$

(Footnotes to Definition 3.2: on expected shortfall, see e.g. Föllmer and Schied (2016, Section 4.4). Note that, by definition of $\mathbb{Q}$, the quantity $\mathbb{ES}(L^\circ)$ does not depend on $L^\bullet$.)

Remark 3.3 In view of (26) and of the last balance condition in (4), we have
$$\mathrm{RM}^{sh} = \mathbb{E}^*\big(h\,\mathrm{SCR}^\circ\big), \qquad \mathrm{RM}^{bh} = \mathbb{E}^*\big((1-J)\,\mathrm{KVA}\big). \tag{27}$$
We refer the reader to the last bullet point in Albanese and Crépey (2020, Definition A.1) for the analogous split of RM between shareholder and bondholder wealth in a dynamic, continuous-time setup.

Proposition 3.2
We have
$$\mathrm{KVA} = h\,\mathrm{SCR} = \frac{h}{1+h}\,\mathrm{EC} = \frac{h}{1+h}\,\mathbb{ES}(L^\circ). \tag{28}$$

Proof. The first identity follows from Lemma 3.1. The resulting KVA semi-linear equation (in view of (25)) is solved similarly to the FVA equation (24).

The KVA formula (28) (as well as its continuous-time analog (56)) can be used either in the direct mode, for computing the KVA corresponding to a given $h$, or in the reverse-engineering mode, for defining the "implied hurdle rate" associated with the actual level on the risk margin account of the bank. Cost of capital proxies have always been used to estimate return-on-equity. The KVA is a refinement, fine-tuned for derivative portfolios, but the base return-on-equity concept itself is far older than even the CVA. In particular, the KVA is very useful in the context of collateral and capital optimization.

KVA Risk Premium and Indifference Pricing Interpretation
The CA component of the FTP corresponds to the expected costs for the shareholders of concluding the deal. This CA component makes the shareholder trading loss $L^\circ$ centered (cf. the first line in (23)). On top of expected shareholder costs, the bank charges to the clients a risk margin (RM). Assume the bank shareholders endowed with a utility function $U$ on $\mathbb{R}$ such that $U(0) = 0$. In a shareholder indifference pricing framework, the risk margin arises as per the following equation:
$$\mathbb{E}^* U\big(J(\mathrm{RM}-L)\big) = \mathbb{E}^* U(0) = 0 \tag{29}$$
(the expected utility of the bank shareholders without the deal), where
$$\mathbb{E}^* U\big(J(\mathrm{RM}-L)\big) = \mathbb{E}^*\big(J\,U(\mathrm{RM}-L)\big) = (1-\gamma)\,\mathbb{E}\,U(\mathrm{RM}-L),$$
by (6). Hence
$$\mathbb{E}\,U(\mathrm{RM}-L) = 0. \tag{30}$$
The corresponding RM is interpreted as the minimal admissible risk margin for the deal to occur, seen from the bank shareholders' perspective.

Taking for concreteness $U(-\ell) = \frac{1-e^{\rho\ell}}{\rho}$, for some risk aversion parameter $\rho$, (30) yields $\mathrm{RM} = \rho^{-1}\ln\mathbb{E}\,e^{\rho L} = \rho^{-1}\ln\mathbb{E}\,e^{\rho L^\circ}$, by the observation following (6). In the limiting case where the shareholder risk aversion parameter $\rho \to 0$ and $\mathbb{E}\,U(-L) \to -\mathbb{E}(L) = 0$ (by the first line in (23)), then $\mathrm{RM} \to 0$. In view of (4) and (28), the corresponding implied KVA and hurdle rate $h$ are such that
$$\mathrm{KVA} = \rho^{-1}\ln\mathbb{E}\,e^{\rho L^\circ}, \qquad \frac{h}{1+h} = \frac{\rho^{-1}\ln\mathbb{E}\,e^{\rho L^\circ}}{\mathbb{ES}(L^\circ)}. \tag{31}$$
Hence, "for $h$ and $\rho$ small",
$$h \approx \frac{\mathbb{V}\mathrm{ar}(L^\circ)}{2\,\mathbb{ES}(L^\circ)}\,\rho \tag{32}$$
(as $\mathbb{E}(L^\circ) = 0$), where $\mathbb{V}\mathrm{ar}$ is the $\mathbb{Q}$ variance operator. The hurdle rate $h$ in our KVA setup plays the role of a risk aversion parameter, like $\rho$ in the exponential utility framework.

An indifference price has a competitive interpretation. Assume that the bank is competing for the client with other banks. Then, in the limit of a continuum of competing banks with a continuum of indifference prices, whenever a bank makes a deal, this can only be at its indifference price.
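The relations (31) and (32) can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical centered Gaussian sample for the shareholder loss $L^\circ$ under $\mathbb{Q}$ and an empirical tail average as expected shortfall estimator; the function names and parameter values are illustrative and not taken from the paper.

```python
import numpy as np

def expected_shortfall(sample, alpha=0.975):
    """Empirical expected shortfall: average of the worst (1 - alpha) share of losses."""
    losses = np.sort(np.asarray(sample, dtype=float))
    n_tail = max(1, int(np.ceil((1.0 - alpha) * losses.size)))
    return float(losses[-n_tail:].mean())

def indifference_risk_margin(loss, rho):
    """RM = (1/rho) * ln E[exp(rho * L)], computed with a stable log-mean-exp."""
    z = rho * np.asarray(loss, dtype=float)
    z_max = z.max()
    return float((z_max + np.log(np.mean(np.exp(z - z_max)))) / rho)

# Hypothetical input: Gaussian loss sample, centered as implied by the first line of (23).
rng = np.random.default_rng(0)
loss = rng.normal(0.0, 1.0, size=200_000)
loss -= loss.mean()

rho = 0.01                                   # assumed (small) risk aversion parameter
rm = indifference_risk_margin(loss, rho)     # implied KVA in (31)
es = expected_shortfall(loss)                # ES(L) at the 97.5% level

ratio = rm / es                              # h / (1 + h) in (31)
h_implied = ratio / (1.0 - ratio)            # exact implied hurdle rate
h_approx = loss.var() * rho / (2.0 * es)     # small-rho approximation (32)

print(f"RM = {rm:.6f}, implied h = {h_implied:.6f}, approximation (32) = {h_approx:.6f}")
```

For small $\rho$ the two hurdle rates should agree closely, since $\rho^{-1}\ln\mathbb{E}\,e^{\rho L^\circ} \approx \rho\,\mathbb{V}\mathrm{ar}(L^\circ)/2$ when $L^\circ$ is centered.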
Our stylized indifference pricing model of a KVA defined by a constant hurdle rate $h$ exogenizes (by comparison with the endogenous hurdle rate $h$ in (31)) the impact on pricing of the competition between banks. It does so in a way that generalizes smoothly to a dynamic setup (see Section 4), as required to deal with a real derivative banking portfolio. It then provides a refined notion of return-on-equity for derivative portfolios, where a full-fledged optimization approach would be impractical.

In case of variation margin (VM) that would be exchanged between the bank and its client, and of initial margin that would be received (RIM) and posted (PIM) by the bank, at the level of, say, some $\mathbb{Q}$ value-at-risk of $\pm(P - \mathrm{VM})$, then

• $P$ needs to be replaced by $(P - \mathrm{VM} - \mathrm{RIM})$ everywhere in the above, whence an accordingly modified (in principle: diminished) CVA,

• an additional initial margin related cash flow in $F^\circ$ given as $J\gamma\,\mathrm{PIM}$, triggering an additional adjustment MVA in CA, where
$$\mathrm{MVA} = \mathbb{E}^*\big(J\gamma\,\mathrm{PIM} + (1-J)\,\mathrm{MVA}\big) = \gamma\,\mathrm{PIM}; \tag{33}$$

• additional initial margin related cash flows in $F^\bullet$ given as $(1-J)(\mathrm{PIM} - \gamma\,\mathrm{PIM})$ and $(1-J)\,\mathrm{MVA}$, triggering an additional adjustment $\mathrm{MDA} = \mathrm{MVA}$ in CL;

• the second FVA formula in (22) modified into $\mathrm{FVA} = \frac{\gamma}{1+\gamma}(\mathrm{MtM} - \mathrm{VM} - \mathrm{CVA} - \mathrm{MVA})^+$.

Accounting further for the additional free funding source provided by capital at risk (cf. Remark 3.1), then, in view of the specification given in the first sentence of Definition 3.3 for the latter, one needs to replace $(\mathrm{MtM} - \mathrm{CA})^\pm$ by $(\mathrm{MtM} - \mathrm{VM} - \mathrm{CA} - \max(\mathrm{EC}, \mathrm{KVA}))^\pm$ everywhere before. This results in the same CVA and MVA as in the bullet points above, but in the following system for the random variable $L^\circ$ and the FVA and the KVA numbers (cf. the corresponding lines in (12), (22), (28), and recall (16)):
$$\begin{aligned}
&L^\circ = J(1-J_1)P^+ + J\gamma\,(\mathrm{MtM} - \mathrm{VM} - \mathrm{CA} - \max(\mathrm{EC}, \mathrm{KVA}))^+ + J\gamma\,\mathrm{PIM} - J\,\mathrm{CA}\\
&\mathrm{FVA} = \gamma\,(\mathrm{MtM} - \mathrm{VM} - \mathrm{CA} - \max(\mathrm{EC}, \mathrm{KVA}))^+\\
&\mathrm{KVA} = \frac{h}{1+h}\,\mathbb{ES}(L^\circ).
\end{aligned} \tag{34}$$
This system entails a coupled dependence between, on the one hand, the FVA and KVA numbers and, on the other hand, the shareholder loss process $L^\circ$. However, once CVA, PIM, RIM, and MVA are computed as in the above, the system (34) can be addressed numerically by Picard iteration, starting from, say, $L^{(0)} = \mathrm{KVA}^{(0)} = 0$ and $\mathrm{FVA}^{(0)} = \frac{\gamma}{1+\gamma}(\mathrm{MtM} - \mathrm{VM} - \mathrm{CVA} - \mathrm{MVA})^+$ (cf. the last line in (22)), and then iterating in (34) until numerical convergence.

Remark 3.4
The rationale for funding FVA but not MVA from $\mathrm{CA} + \max(\mathrm{EC}, \mathrm{KVA})$ is set out before Equation (15) in Albanese, Caenazzo, and Crépey (2017).
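The Picard iteration for the system (34) can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: no margins (VM = PIM = RIM = 0, so that CA = CVA + FVA as in (16)), simulation directly under the survival measure (so that $J = 1$), a hypothetical Gaussian exposure independent of the client default, and one particular update order in which the FVA semi-linear equation is solved in closed form at each iteration. All names and parameter values are illustrative.

```python
import numpy as np

def expected_shortfall(sample, alpha=0.975):
    """Empirical expected shortfall: average of the worst (1 - alpha) share of losses."""
    losses = np.sort(np.asarray(sample, dtype=float))
    n_tail = max(1, int(np.ceil((1.0 - alpha) * losses.size)))
    return float(losses[-n_tail:].mean())

def picard_static_xva(credit_loss, mtm, gamma, hurdle, alpha=0.975,
                      max_iter=50, tol=1e-10):
    """Picard iteration for (FVA, EC, KVA) in the no-margin case of system (34)."""
    cva = float(np.mean(credit_loss))                    # CVA as in (22)
    fva = gamma / (1.0 + gamma) * max(mtm - cva, 0.0)    # FVA^(0), last line of (22)
    ec, kva = 0.0, 0.0                                   # EC^(0) = KVA^(0) = 0
    for iteration in range(1, max_iter + 1):
        ca = cva + fva                                   # (16)
        funding = gamma * max(mtm - ca - max(ec, kva), 0.0)
        loss = credit_loss + funding - ca                # first line of (34) on {J = 1}
        ec_new = expected_shortfall(loss, alpha)
        kva_new = hurdle / (1.0 + hurdle) * ec_new       # third line of (34)
        # Closed-form solution of FVA = gamma * (MtM - CVA - FVA - max(EC, KVA))^+
        fva_new = gamma / (1.0 + gamma) * max(mtm - cva - max(ec_new, kva_new), 0.0)
        gap = max(abs(fva_new - fva), abs(ec_new - ec), abs(kva_new - kva))
        fva, ec, kva = fva_new, ec_new, kva_new
        if gap < tol:
            break
    return {"CVA": cva, "FVA": fva, "EC": ec, "KVA": kva,
            "loss": loss, "iterations": iteration}

# Hypothetical toy inputs (not from the paper).
rng = np.random.default_rng(1)
n_paths = 200_000
exposure = rng.normal(100.0, 20.0, size=n_paths)         # P under the survival measure
client_default = rng.random(n_paths) < 0.01              # 1 - J_1, independent of P
credit_loss = client_default * np.maximum(exposure, 0.0) # (1 - J_1) P^+
mtm = float(exposure.mean())
gamma, hurdle = 0.02, 0.10

result = picard_static_xva(credit_loss, mtm, gamma, hurdle)
print({k: v for k, v in result.items() if k != "loss"})
```

In this static toy setup the funding term and the FVA offset each other in the loss, so the iteration stabilizes after very few steps; the coupling is more substantial in the dynamic setup, where FVA and KVA are processes.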
The funds transfer price (all-inclusive XVA rebate to MtM) aligning the deal to shareholder interest (in the sense of a given hurdle rate $h$, cf. the next-to-last paragraph of Section 2) is
$$\begin{aligned}
\mathrm{FTP} &= \underbrace{\mathrm{CVA} + \mathrm{FVA}}_{\text{Expected shareholder costs CA}} + \underbrace{\mathrm{KVA}}_{\text{Shareholder risk premium}}\\
&= \underbrace{\mathrm{CVA} - \mathrm{DVA}}_{\text{Firm valuation FV}} + \underbrace{\mathrm{DVA} + \mathrm{FDA}}_{\text{Wealth transfer CL}} + \underbrace{\mathrm{KVA}}_{\text{Shareholder risk premium}},
\end{aligned} \tag{35}$$
where all terms are explicitly given in Propositions 3.1 and 3.2 (or the corresponding variants of Section 3.4 in the refined setup considered there).

Wealth Transfer Analysis
The above results implicitly assumed that the bank cannot hedge jump-to-default cash flows (cf. Section 1.1). To understand this, let us temporarily suppose, for the sake of the argument, that the bank would be able to hedge its own jump-to-default through a further deal, whereby the bank would deliver a payment $L^\bullet$ at time 1 in exchange for a fee fairly valued as
$$\mathrm{CL} = \mathbb{E}^* L^\bullet = \mathrm{DVA} + \mathrm{FDA}, \tag{36}$$
deposited in the reserve capital account of the bank at time 0.

We include this hedge and assume that the client would now contribute at the level of $\mathrm{FV} = \mathrm{CA} - \mathrm{CL}$ (cf. (23)), instead of CA before, to the reserve capital account of the bank at time 0. Then the amount that needs to be borrowed by the bank for implementing its strategy is still $\gamma(\mathrm{MtM} - \mathrm{CA})^+$ as before (back to the baseline funding setup of Remark 3.1). But the trading loss of the bank becomes, instead of $L$ before,
$$C + F - \mathrm{FV} + (L^\bullet - \mathrm{CL}) = C + F - \mathrm{CA} + L^\bullet = L + L^\bullet = L^\circ, \tag{37}$$
where the last line in (23) and the last identity in (9) were used in the first and second equality. By comparison with the situation from previous sections without own-default hedge by the bank:

• the shareholders are still indifferent to the deal in expected counterparty default and funding expenses terms,

• the recovery of the bondholders becomes zero,

• the client is better off by the amount $\mathrm{CA} - \mathrm{FV} = \mathrm{CL}$.

The CL originating cash flow $L^\bullet$ has been hedged and monetized by the shareholders, who have passed the corresponding benefit to the client.

Under a cost-of-capital pricing approach, the bank would still charge to its client a KVA add-on $\frac{h}{1+h}\mathbb{ES}(L^\circ)$, as risk compensation for the nonvanishing shareholder trading loss $L^\circ$ still triggered by the deal. If, however, the bank could also hedge the (zero-valued, by the first line in (23)) loss $L^\circ$, hence the totality of $L = L^\circ - L^\bullet$ (instead of $L^\bullet$ only in the above), then the trading loss and the KVA would vanish. As a result, the all-inclusive XVA add-on (rebate from MtM valuation) would boil down to
$$\mathrm{FV} = \mathrm{CVA} - \mathrm{DVA}$$
(cf. (1)), the value of counterparty risk and funding to the bank as a whole.
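The three pricing levels discussed in this wealth transfer analysis can be made concrete with the closed-form static formulas (16), (18), (22), (23), (28) and (35). The snippet below is a worked arithmetic sketch; the input numbers (MtM, CVA, DVA, $\gamma$, EC, $h$) are hypothetical and the function name is illustrative.

```python
def static_xva(mtm, cva, dva, gamma, ec, hurdle):
    """Closed-form static XVA numbers in the baseline one-period setup."""
    fva = gamma / (1.0 + gamma) * max(mtm - cva, 0.0)   # third line of (22)
    fda = fva                                           # second line of (23)
    kva = hurdle / (1.0 + hurdle) * ec                  # (28)
    ca = cva + fva                                      # (16)
    cl = dva + fda                                      # (18)
    fv = cva - dva                                      # third line of (23)
    return {
        "FVA": fva, "FDA": fda, "KVA": kva, "CA": ca, "CL": cl, "FV": fv,
        "FTP": ca + kva,                      # (35): no own-default hedge
        "FTP_own_default_hedged": fv + kva,   # client contributes FV instead of CA
        "FTP_fully_hedged": fv,               # trading loss and KVA vanish
    }

# Hypothetical inputs, for illustration only.
x = static_xva(mtm=100.0, cva=3.0, dva=1.0, gamma=0.02, ec=20.0, hurdle=0.10)
for name, value in x.items():
    print(f"{name:>24s} = {value:8.4f}")
```

The difference between the first two price levels is CL, the amount by which the client is better off when the bank can hedge its own jump-to-default, and the difference between the last two is the KVA.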
Connection With the Modigliani-Miller Theory
The Modigliani-Miller invariance result, with Modigliani and Miller (1958) as a seminal reference, consists of various facets of a broad statement that the funding and capital structure policies of a firm are irrelevant to the profitability of its investment decisions. Modigliani-Miller (MM) irrelevance, as we put it for brevity hereafter, was initially seen as a pure arbitrage result. However, it was later understood that there may be market incompleteness issues with it. So, quoting Duffie and Sharer (1986, page 9), "generically, shareholders find the span of incomplete markets a binding constraint [...] shareholders are not indifferent to the financial policy of the firm if it can change the span of markets (which is typically the case in incomplete markets)"; or Gottardi (1995, page 197): "When there are derivative securities and markets are incomplete the financial decisions of the firm have generally real effects".

A situation where shareholders may "find the span of incomplete markets a binding constraint" is when market completion is legally forbidden. This corresponds to the XVA case, which is also at the crossing between market incompleteness and the presence of derivatives pointed out above as the MM non-irrelevance case in Gottardi (1995). Specifically, the contra-assets and contra-liabilities that emerge endogenously from the impact of counterparty risk on the derivative portfolio of a bank cannot be "undone" by shareholders, because jump-to-default risk cannot be replicated by a bank.

As a consequence, MM irrelevance is expected to break down in the XVA setup. In fact, as visible on the trade incremental FTP (counterparty risk pricing) formula (35) (cf. also (43) and Proposition 4.2 in a dynamic and trade incremental setup below), cost of funding and cost of capital are material to banks and need to be reflected in entry prices for ensuring shareholder indifference to the trades, i.e. preserving their hurdle rate throughout trades.
We now consider a dynamic, continuous-time setup, with model filtration $\mathbb{G}$ and a (positive) bank default time $\tau$ endowed with an intensity $\gamma$. The bank survival probability measure associated with the measure $\mathbb{Q}^*$ is then the probability measure $\mathbb{Q}$ with $(\mathbb{G}, \mathbb{Q}^*)$ density process $J e^{\int_0^\cdot \gamma_s\,ds}$ (assumed integrable), where $J = \mathbb{1}_{[0,\tau)}$ is the bank survival indicator process (cf. Schönbucher (2004) and Collin-Dufresne, Goldstein, and Hugonnier (2004)). In particular, writing $Y^\circ = JY + (1-J)Y_{\tau-}$, for any left-limited process $Y$, we have by application of the results of Crépey and Song (2017) (cf. the condition (A) there):

Lemma 4.1
For every $\mathbb{Q}$ (resp. sub-, resp. super-) martingale $Y$, the process $Y^\circ$ is a $\mathbb{Q}^*$ (resp. sub-, resp. super-) martingale.

Remark 4.1
In the dynamic setup, the survival measure formulation is a light presentation, sufficient for the purpose of the present paper (skipping the related integrability issues), of an underlying reduction of filtration setup, which is detailed in the above-mentioned reference (regarding Lemma 4.1, cf. also Collin-Dufresne, Goldstein, and Hugonnier (2004, Lemma 1)).

4.1 Case of a Run-Off Portfolio
First, we consider the case of a portfolio held on a run-off basis (cf. Section 2.1). We denote by $T$ the final maturity of the portfolio and we assume that all prices and XVAs vanish at time $T$ if $T < \tau$. Then the results of Albanese and Crépey (2020) show that all the qualitative insights provided by the one-period XVA analysis of Section 3 are still valid. The trading loss of the bank is now given by the process
$$L = C + F + \mathrm{CA} - \mathrm{CA}_0 \tag{38}$$
and the bank shareholder trading loss by the $\mathbb{Q}$ (hence $\mathbb{Q}^*$, by Lemma 4.1) martingale
$$L^\circ = C^\circ + F^\circ + \mathrm{CA}^\circ - \mathrm{CA}_0. \tag{39}$$
In (38)-(39), we have $\mathrm{CA} = \mathrm{CVA} + \mathrm{FVA}$ as in (16); the processes $C$, $F$, CVA, and FVA are continuous-time analogs, detailed in the case of bilateral trade portfolios in Sections A.1-A.2, of the eponymous quantities in Section 3 (which were constants or random variables there).

Proposition 4.1
The core equity tier 1 capital of the bank is given by
$$\mathrm{CET1} = \mathrm{CET1}_0 - L. \tag{40}$$
Shareholder equity is given by
$$\mathrm{SHC} = \mathrm{SHC}_0 - (L + \mathrm{KVA} - \mathrm{KVA}_0). \tag{41}$$

Proof. In the continuous-time setup, Assumption 2.1.i is written as
$$\mathrm{RC} + \mathrm{RM} + \mathrm{SCR} + \mathrm{UC} - \mathrm{CM} - (\mathrm{RC} + \mathrm{RM} + \mathrm{SCR} + \mathrm{UC} - \mathrm{CM})_0 = P - (C + F + H).$$
Given the definition of CET1 in (3), the perfect clean hedge condition (see after Remark 2.2) written in the dynamic setup as $P + \mathrm{MtM} - \mathrm{MtM}_0 - H = 0$, and the balance conditions (4), this is equivalent to
$$\mathrm{CA} + \mathrm{CET1} - (\mathrm{CA} + \mathrm{CET1})_0 = -(C + F).$$
In view of (38), we obtain (40). As $\mathrm{SHC} = \mathrm{CET1} - \mathrm{RM}$ (cf. (3)), we have by (40):
$$\mathrm{SHC} = \mathrm{CET1}_0 - L - \mathrm{RM} = \mathrm{CET1}_0 - \mathrm{RM}_0 - (L + \mathrm{RM} - \mathrm{RM}_0),$$
which, by the third balance condition in (4), yields (41).

Moreover, by Lemma 4.1, the continuous-time process $\mathrm{KVA}^\circ$ that stems from (54)-(55) is a $\mathbb{Q}^*$ supermartingale with terminal condition $\mathrm{KVA}^\circ_T = 0$ on $\{T < \tau\}$ and drift coefficient $h\,\mathrm{SCR}$, where SCR is given as in (25), but for EC there dynamically defined as the time-$t$ conditional, 97.5% expected shortfall of $(L^\circ_{t+1} - L^\circ_t)$ under $\mathbb{Q}$, killed at $\tau$.

Remark 4.2
It is only before $\tau$ that the right-hand sides in the definitions (3) really deserve the respective interpretations of shareholder equity of the bank and core equity tier 1 capital. Hence, it is only the parts of (40) and (41) stopped before $\tau$, i.e.
$$\mathrm{CET1}^\circ = \mathrm{CET1}_0 - L^\circ, \qquad \mathrm{SHC}^\circ = \mathrm{SHC}_0 - (L^\circ + \mathrm{KVA}^\circ - \mathrm{KVA}_0), \tag{42}$$
which are interesting financially.

4.2 Trade Incremental Cost-of-Capital XVA Strategy

In Albanese and Crépey (2020) and in Section 4.1 above, the derivative portfolio of the bank is assumed held on a run-off basis. By contrast, real-life derivative portfolios are incremental. Assume a new deal shows up at time $\theta \in (0, \tau)$. We denote by $\Delta\cdot$, for any portfolio related process, the difference between the time $\theta$ values of this process for the run-off versions of the portfolio with and without the new deal.

Definition 4.1
We apply the following trade incremental pricing and accounting policy:

• The clean desks pay $\Delta\mathrm{MtM}$ to the client and the CA desks add an amount $\Delta\mathrm{MtM}$ on the clean margin account;

• The CA desks charge to the client an amount $\Delta\mathrm{CA}$ and add it on the reserve capital account;

• The management of the bank charges the amount $\Delta\mathrm{KVA}$ to the client and adds it on the risk margin account.

The funds transfer price of a deal is the all-inclusive XVA add-on charged by the bank to the client in the form of a rebate with respect to the mark-to-market $\Delta\mathrm{MtM}$ of the deal. Under the above scheme, the overall price charged to the client for the deal is $\Delta\mathrm{MtM} - \Delta\mathrm{CA} - \Delta\mathrm{KVA}$, i.e.
$$\mathrm{FTP} = \Delta\mathrm{CA} + \Delta\mathrm{KVA} = \Delta\mathrm{CVA} + \Delta\mathrm{FVA} + \Delta\mathrm{KVA} = \Delta\mathrm{FV} + \Delta\mathrm{CL} + \Delta\mathrm{KVA}, \tag{43}$$
by (16) and the last line in (23) (which still hold in continuous time, see Albanese and Crépey (2020, Equations (1) and (66))) applied to the portfolios with and without the new deal.

Remark 4.3
As opposed to the $\Delta\mathrm{XVA}$ terms, which entail portfolio-wide computations, $\Delta\mathrm{MtM}$ reduces to the so-called clean valuation of the new deal, by trade-additivity of MtM (as follows from Albanese and Crépey (2020, Equations (25) and (37))).

Obviously, the legacy portfolio of the bank has a key impact on the FTP. It may very well happen that the new deal is risk-reducing with respect to the portfolio, in which case $\mathrm{FTP} < 0$, i.e. the overall, XVA-inclusive price charged by the bank to the client would be $\Delta\mathrm{MtM} - \mathrm{FTP} > \Delta\mathrm{MtM}$ (subject of course to the commercial attitude adopted by the bank under such circumstance).

In order to exclude for simplicity jumps of our $L$ and KVA processes at $\theta$ (the ones related to the initial portfolio, but also those, starting at time $\theta$, corresponding to the augmented portfolio), we assume a quasi-left continuous model filtration $\mathbb{G}$ and a $\mathbb{G}$ predictable stopping time $\theta$. The first assumption excludes that martingales can jump at predictable times. It is satisfied in all practical models and, in particular, in all models with Lévy or Markov chain driven jumps. The second assumption is reasonable regarding the time at which a financial contract is concluded. Note that it was actually already assumed regarding the (fixed) time 0 at which the portfolio of the bank is supposed to have been set up in the first place.

(Footnotes to Definition 4.1: adding $\Delta\mathrm{MtM}$ on the clean margin account means removing $(-\Delta\mathrm{MtM})$ from it if $\Delta\mathrm{MtM} < 0$; adding $\Delta\mathrm{CA}$ on the reserve capital account means removing $(-\Delta\mathrm{CA})$ from it if $\Delta\mathrm{CA} < 0$; adding $\Delta\mathrm{KVA}$ on the risk margin account means removing $(-\Delta\mathrm{KVA})$ from it if $\Delta\mathrm{KVA} < 0$.)

Lemma 4.2 Assuming the new trade at time $\theta$ is handled by the trade incremental policy of Definition 4.1, after the balance conditions (4) have been held before $\theta$, then shareholder equity $\mathrm{SHC}^\circ$ (see Remark 4.2) is a $\mathbb{Q}^*$ submartingale on $[0, \theta] \cap \mathbb{R}_+$, with drift coefficient $h\,\mathrm{SCR}$ killed at $\tau$.

Proof. In the case of a trade incremental portfolio, a priori, the second identity in (42) is only guaranteed to hold before $\theta$. However, in view of the observation made in Remark 2.4 and because, under our (harmless) technical assumptions, there can be no dividends arising from the portfolio expanded with the new deal (i.e. jumps in the related processes $L$ and KVA, defined on $[\theta, +\infty)$) at time $\theta$ itself, the process SHC does not jump at $\theta$. The processes $L$ and KVA related to the legacy portfolio cannot jump at $\theta$ either. As a result, the second identity in (42) still holds at $\theta$. It is therefore valid on $[0, \theta] \cap \mathbb{R}_+$.
The result then follows from the respective martingale and supermartingale properties of the (original) processes $L^\circ$ and $\mathrm{KVA}^\circ$ recalled before and after Proposition 4.1.

The above XVA strategy can be iterated between and throughout every new trade. We call this approach the trade incremental cost-of-capital XVA strategy. By an iterated application of Lemma 4.2 at every new trade, we obtain the following:

Proposition 4.2 Under a dynamic and trade incremental cost-of-capital XVA strategy, shareholder equity $\mathrm{SHC}^\circ$ is a $\mathbb{Q}^*$ submartingale on $\mathbb{R}_+$, with drift coefficient $h\,\mathrm{SCR}$ killed at $\tau$.

Thus, a trade incremental cost-of-capital XVA strategy results in a sustainable strategy for profits retention, both between and throughout deals, which was already the key principle behind Solvency II (see Section 1.1). Note that, without the KVA (i.e. for $h = 0$), the (risk-free discounted) shareholder equity process $\mathrm{SHC}^\circ$ would only be a $\mathbb{Q}^*$ martingale, which could only be acceptable to shareholders without risk aversion (cf. Section 3.3).

Figure 2 yields a pictorial representation, in the form of a corresponding XVA dependence tree, of the continuous-time XVA equations.

For concreteness, we restrict ourselves to the case of bilateral trading in what follows, referring the reader to Albanese, Armenti, and Crépey (2020, Section 6.2) for the more general and realistic situation of a bank also involved in centrally cleared trading. As visible from the corresponding equations in Section A, the CVA of the bank can then be computed as the sum of its CVAs restricted to each netting set (or counterparty $i$ of the bank, with default time denoted by $\tau_i$ in Figure 2). The initial margins and the MVA are also most accurately calculated at each netting set level. By contrast, the FVA is defined in terms of a semilinear equation that can only be solved at the level of the overall derivative portfolio of the bank. The KVA can only be computed at the level of the overall portfolio and relies on conditional risk measures of future fluctuations of the shareholder trading loss process $L^\circ$, which itself involves future fluctuations of the other XVA processes (as these are part of the bank liabilities).

Moreover, the fungibility of capital at risk with variation margin (cf. Remark 3.4) induces a coupling between, on the one hand, the "backward" FVA and KVA processes and, on the other hand, the "forward" shareholder loss process $L^\circ$. As in the static case of Section 3.4 (cf. the last paragraph there), the ensuing forward backward system can be decoupled by Picard iteration.

These are heavy computations encompassing all the derivative contracts of the bank. Yet these computations require accuracy so that trade incremental XVA computations, which are required as XVA add-ons to derivative entry prices (cf. Section 4.2), are not in the numerical noise of the machinery.

[Figure 2 diagram: dependence tree with nodes KVA, EC, FVA, CVA, MVA, IM and MtM over nested time grids, and per-layer path numbers $M_{kva}$, $M_{ec}$, $M_{fva}$, $M_{cva}$, $M_{im}$, $M_{mtm}$ along the depth axis.]

Figure 2: The XVA equations dependence tree (Source: Abbas-Turki, Diallo, and Crépey (2018)).

As developed in Abbas-Turki, Diallo, and Crépey (2018, Section 3.2), computational strategies for (each Picard iteration of) the XVA equations involve a mix of nested Monte Carlo (NMC) and of simulation/regression schemes, optimally implemented on GPUs. In view of Figure 2, a pure NMC approach would involve five nested layers of simulation (with respective numbers of paths $M_{xva} \sim \sqrt{M_{mtm}}$, see Abbas-Turki, Diallo, and Crépey (2018, Section 3.3)). Moreover, nested Monte Carlo implies intensive repricing of the mark-to-market cube, i.e. pathwise MtM valuation for each netting set, and/or high dimensional interpolation. In this work, we use no nested Monte Carlo or conditional repricing of future MtM cubes: beyond the base MtM layer in the XVA dependence tree, each successive layer (from right to left in Figure 2, at each Picard iteration) will be "learned" instead.

We denote by $\mathbb{E}_t$, $\mathbb{V}\mathrm{a}\mathbb{R}_t$, and $\mathbb{ES}_t$ (and simply, in case $t = 0$, $\mathbb{E}$, $\mathbb{V}\mathrm{a}\mathbb{R}$, and $\mathbb{ES}$) the time-$t$ conditional expectation, value-at-risk, and expected shortfall with respect to the bank survival measure $\mathbb{Q}$.

We compute the mark-to-market cube using CUDA routines. The pathwise XVAs are obtained by deep learning regression, i.e. extension of Longstaff and Schwartz (2001) kind of schemes to deep neural network regression bases as also considered in Huré, Pham, and Warin (2020) or Beck, Becker, Cheridito, Jentzen, and Neufeld (2019), based on the classical quadratic (also known as mean square error, MSE) loss function. The conditional value-at-risks and expected shortfalls involved in the embedded pathwise EC and IM computations are obtained by deep quantile regression, as follows.

Given features $X$ and labels $Y$ (random variables), we want to compute the conditional value-at-risk and expected shortfall functions $q(\cdot)$ and $s(\cdot)$ such that $\mathbb{V}\mathrm{a}\mathbb{R}(Y|X) = q(X)$ and $\mathbb{ES}(Y|X) = s(X)$.
Recall from Fissler, Ziegel, and Gneiting (2016) and Fissler and Ziegel (2016) that value-at-risk is elicitable, expected shortfall is not, but their pair is jointly elicitable. Specifically, we consider loss functions $\rho$ of the form (where in our notation $Y$ is a signed loss, whereas it is a signed gain in their paper)
$$\begin{aligned}
\rho_\alpha\big(q(\cdot), s(\cdot); X, Y\big) ={}& (1-\alpha)^{-1}\big(f(Y) - f(q(X))\big)^+ + f(q(X))\\
&+ g(s(X)) - \dot g(s(X))\Big(s(X) - q(X) - (1-\alpha)^{-1}\big(Y - q(X)\big)^+\Big).
\end{aligned} \tag{44}$$
One can show (cf. also Dimitriadis and Bayer (2019)) that, for a suitable choice of the functions $f$, $g$ including $f(z) = z$ and $g(z) = -\ln(1 + e^{-z})$ (our choice in our numerics), the pair of the conditional value-at-risk and expected shortfall functions is the minimizer, over all measurable pair-functions $(q(\cdot), s(\cdot))$, of the error
$$\mathbb{E}\,\rho_\alpha\big(q(\cdot), s(\cdot); X, Y\big). \tag{45}$$
In practice, one minimizes numerically the error (45), based on $m$ independent simulated values of $(X, Y)$, over a parametrized family of functions $(q, s)(x) \equiv (q, s)_\theta(x)$. Dimitriadis and Bayer (2019) restrict themselves to multilinear functions. In our case we use a feedforward neural network parameterization (see e.g. Goodfellow, Bengio, and Courville (2016)). The minimizing pair $(q, s)_{\hat\theta}$ then represents the two scalar neural network approximations of the conditional value-at-risk and expected shortfall functions pair.

The left and right panels of Figure 3 show the respective deep neural networks for pathwise value-at-risk/expected shortfall (with error (45)) and pathwise XVAs (with classical quadratic norm error). Deep learning methods often show particularly good generalization and scalability performances (cf. Section 5.5).
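The joint loss (44) with $f(z) = z$ and $g(z) = -\ln(1 + e^{-z})$ can be written in a few lines. The sketch below is a simplified illustration rather than the neural network scheme of the paper: it treats the unconditional case (constant $q$ and $s$), assumes hypothetical standard Gaussian labels, and only checks that the empirical error (45) is smaller at the true (value-at-risk, expected shortfall) pair than at perturbed pairs. The function name is illustrative.

```python
import numpy as np
from statistics import NormalDist

def joint_var_es_loss(q, s, y, alpha=0.975):
    """Empirical mean of the loss (44) with f(z) = z and g(z) = -ln(1 + exp(-z))."""
    y = np.asarray(y, dtype=float)
    tail = np.maximum(y - q, 0.0) / (1.0 - alpha)   # (1 - alpha)^{-1} (Y - q)^+
    g = -np.logaddexp(0.0, -s)                      # g(s), evaluated stably
    g_dot = np.exp(-np.logaddexp(0.0, s))           # g'(s) = 1 / (1 + exp(s))
    return float(np.mean(tail + q + g - g_dot * (s - q - tail)))

alpha = 0.975
normal = NormalDist()
var_true = normal.inv_cdf(alpha)                    # Gaussian value-at-risk
es_true = normal.pdf(var_true) / (1.0 - alpha)      # Gaussian expected shortfall

# Hypothetical labels: standard Gaussian losses.
rng = np.random.default_rng(0)
y = rng.normal(size=200_000)

base = joint_var_es_loss(var_true, es_true, y, alpha)
perturbed = {
    "q - 0.3": joint_var_es_loss(var_true - 0.3, es_true, y, alpha),
    "q + 0.3": joint_var_es_loss(var_true + 0.3, es_true, y, alpha),
    "s - 1.0": joint_var_es_loss(var_true, es_true - 1.0, y, alpha),
    "s + 1.0": joint_var_es_loss(var_true, es_true + 1.0, y, alpha),
}
print(f"loss at true pair: {base:.5f}")
for label, value in perturbed.items():
    print(f"loss with {label}: {value:.5f}")
```

In a conditional setting, `q` and `s` would be arrays of network outputs evaluated at the features, and the same function would serve as the training objective.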
In the case of conditional value-at-risk and expected shortfall computations, deep learning quantile regression is also easier to implement than more naive methods, such as the resimulation and sort-based scheme of Barrera, Crépey, Diallo, Fort, Gobet, and Stazhynski (2019) for the value-at-risk and expected shortfall at each outer node of a nested Monte Carlo simulation.

[Figure 3 diagrams: two feedforward networks, each with an input layer, hidden layers and an output layer.]

Figure 3: Neural networks with state variables (realizations of the risk factors at the considered pricing time) as features. (Left) Joint value-at-risk/expected shortfall neural network: output is joint estimate of pathwise conditional value-at-risk and expected shortfall, at a selected confidence level, of the label (inputs to initial margin or economic capital) given the features. (Right) XVAs neural network: output is estimate of pathwise conditional mean of the label (XVA generating cash flows) given the features.

The neural network topology and hyper-parameters used by default in our examples (see Section 5.) are detailed in Table 2. We use hyperbolic tangent activation functions in all cases.

\begin{tabular}{l|ccccccc}
 & CVA & FVA & IM & MVA & Gap CVA & EC & KVA \\ \hline
Hidden Layers & 3 & 5 & 3 & 3 & 3 & 3 & 3 \\
Hidden Layer Size & 20 & 6 & 20 & 20 & 20 & 20 & 20 \\
Learning Rate & 0.025 & 0.025 & 0.05 & 0.1 & 0.1 & 0.025 & 0.1 \\
Momentum & 0.95 & 0.95 & 0.5 & 0.5 & 0.5 & 0.95 & 0.5 \\
Iterations & 100 & 50 & 150 & 100 & 100 & 100 & 100 \\
Loss Function & MSE & MSE & (44) & MSE & (44) & (44) & MSE \\
Application & netting set & portf. & netting set & netting set & netting set & portf. & portf.
\end{tabular}

Table 2: Neural network topology and learning parameters used by default in our numerics (portf. ≡ overall derivative portfolio of the bank).

Algorithm 1 yields our fully (time and space) discrete scheme for simulating the Picard iteration (58) until numerical convergence to the XVA processes. Note that, as opposed to more rudimentary, expected exposure based XVA computational approaches (see Section 1 in Abbas-Turki, Diallo, and Crépey (2018)), this algorithm requires the simulation of the counterparty defaults.

Algorithm 1
Deep XVAs algorithm.

• Simulate forward $m$ realizations (Euler paths) of the market risk factor processes and of the counterparty survival indicator processes (i.e. default times) on a refined time grid;

• For each pricing time $t = t_i$ of a pricing time grid, with coarser time step denoted by $h$, and for each counterparty $c$:

  – Learn the corresponding $\mathbb{V}\mathrm{a}\mathbb{R}_t$ and $\mathbb{ES}_t$ terms visible in (59) or (under the time-discretized outer integral in) (61);

  – Learn the corresponding $\mathbb{E}_t$ terms visible in (60) through (62);

  – Compute the ensuing pathwise CVA and MVA as per (60)–(62);

• For $\mathrm{FVA}^{(0)}$, consider the following time discretization of (57) (in which $\lambda$ is the risky funding spread process of the bank) with time step $h$:
$$\mathrm{FVA}^{(0)}_t \approx \mathbb{E}_t\big[\mathrm{FVA}^{(0)}_{t+h}\big] + h\lambda_t\Big(\sum_c J^c_t\,(P^c_t - \mathrm{VM}^c_t) - \mathrm{CVA}_t - \mathrm{MVA}_t - \mathrm{FVA}^{(0)}_t\Big)^+ \tag{46}$$
and, for each $t = t_i$, learn the corresponding $\mathbb{E}_t$ in (46), then solve the semi-linear equation for $\mathrm{FVA}^{(0)}_t$;

• For each Picard iteration $k$ (until numerical convergence), simulate forward $L^{(k)}$ as per the first line in (58) (which only uses known or already learned quantities), and:

  – For economic capital $\mathrm{EC}^{(k)}$, for each $t = t_i$, learn $\mathbb{ES}_t\big((L^{(k)})^\circ_{t+1} - (L^{(k)})^\circ_t\big)$ (cf. Definition A.1);

  – $\mathrm{KVA}^{(k)}$ and $\mathrm{FVA}^{(k)}$ then require a backward recursion solved by deep learning approximation much like the one for $\mathrm{FVA}^{(0)}$ above.

Swap Portfolio Case Study
We consider an interest rate swap portfolio case study with counterparties in different economies, first involving 10 one-factor Hull-White interest rates, 9 Black-Scholes exchange rates, and 11 Cox-Ingersoll-Ross default intensity processes. The default times of the counterparties and the bank itself are jointly modeled by a "common shock" or dynamic Marshall-Olkin copula model as per Crépey, Bielecki, and Brigo (2014, Chapt. 8–10) and Crépey and Song (2016) (see also Elouerkhaoui (2007, 2017)). This whole setup results in about 40 risk factors used as deep learning features (including the counterparty default indicators).

In this model we consider a bank portfolio of 10K randomly generated swap trades, with

• trade currency and counterparty both uniform on [1, ...],

• notional uniform on [10K, ...],

• collateralization (cf. Section A.4): either "no CSA counterparty" without initial margin (IM) nor variation margin (VM), or "CSA counterparty" with VM = MtM and posted initial margin (PIM) pledged at 99% gap risk value-at-risk, received initial margin (RIM) covering 75% gap risk and leaving excess as residual gap CVA,

• for economic capital, 97.5% expected shortfall of 1-year ahead trading loss of the bank shareholders.

By default we use Monte Carlo simulation with 50K paths of 16 coarse (pricing) and 32 fine (risk factors) time steps per year.
The validation of our deep learning methodology is done in the setup of a portfolio of swaps issued at par, with final maturity $T = 10$ years, without initial margin (IM) nor variation margin (VM).

We first focus on the CVA, as the latter is amenable to validation by a standard nested Monte Carlo ("NMC") methodology. Figures 4, 5 and 6 show that the learned CVA is consistent with that obtained from a nested Monte Carlo simulation. Regarding Figure 6 (and also later below), note the equivalence of optimising the mean quadratic error

• between the ANN learned estimator $h(X)$ and the labels $Y$ ("MSE"), $\mathbb{E}\big[(h(X) - Y)^2\big]$, and

• between the ANN learned estimator and the conditional expectation $\mathbb{E}[Y|X]$ (in our case estimated by NMC), $\mathbb{E}\big[(h(X) - \mathbb{E}[Y|X])^2\big]$.

The equivalence stems from the following identities, which hold for any random variables $X$, $Y$ and hypothesis function $h$ such that $Y$ and $h(X)$ are square integrable:
$$\begin{aligned}
\mathbb{E}\big[(h(X) - Y)^2\big] ={}& \mathbb{E}\big[(h(X) - \mathbb{E}[Y|X])^2\big] + \mathbb{E}\big[(\mathbb{E}[Y|X] - Y)^2\big]\\
&+ 2\,\mathbb{E}\big[(h(X) - \mathbb{E}[Y|X])\,(\mathbb{E}[Y|X] - Y)\big]\\
={}& \mathbb{E}\big[(h(X) - \mathbb{E}[Y|X])^2\big] + \mathbb{E}\big[\mathbb{V}\mathrm{ar}(Y|X)\big]
\end{aligned} \tag{47}$$
(as the second line vanishes), where $\mathbb{E}[\mathbb{V}\mathrm{ar}(Y|X)]$ does not depend on $h$.

The CVA error profile on Figure 6 reveals slightly more difficulty in learning the earlier CVAs. This is because of a higher variance of the corresponding cash flows (integrated over longer time frames) in conjunction with a lower variance of the features (risk factors diffused over shorter time horizons).

Figure 4: Random variables $\mathrm{CVA}^c_1$ and $\mathrm{CVA}^c_7$ (in the case of a no CSA netting set $c$, respectively observed after 1 and 7 years) obtained by learning (blue histogram) versus nested Monte Carlo (orange histogram). All histograms are based on out-of-sample paths.

Figure 5: QQ-plot of learned versus nested Monte Carlo CVA for the random variables $\mathrm{CVA}^c_1$ (left) and $\mathrm{CVA}^c_7$ (right).
Paths are out-of-sample.Table 3 shows the computational cost and accuracy of the nested Monte Carlo method fordifferent number of inner paths, using 32768 outer paths. The convergence is already achievedfor approximately 128 inner paths, in line with the NMC square root rule that is recalled inan XVA setup in Abbas-Turki, Diallo, and Cr´epey (2018, Section 3.3). Figure 7 and Table 4show that a good accuracy can be achieved through learning at a lower computational costthan through nested Monte Carlo, while also enjoying the advantages of the approach beingparametric. Indeed, once the CVA is learned, one would pay only the cost of inference lateron, which is generally negligible compared to training time. By contrast, a nested Monte Carloapproach would require to relaunch the nested simulations every time the CVA estimator isneeded on new paths. Early stopping could be used to help reduce training time further whileimproving regularization.More generally, in the presence of a multiple number of XVA layers (cf. Figure 2), a purelynested Monte Carlo approach would require multiple layers of nested simulations, which wouldamount to a computational time that is exponential in the number of XVA layers, while the25 t (years)05001000150020002500 M S E learned CVA, out-of-sampleunconditional meanNested Monte-Carlo CVA, out-of-sample Figure 6: Empirical quadratic loss of each CVA estimator at all coarse time-steps. The lower,the closer to the true conditional expectation (cf. (47)). Since the nested Monte Carlo methodis computationally expensive, it was carried out only once every 10 coarse time-steps.
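The decomposition (47) can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: the conditional mean `sin(x)`, the heteroscedastic noise level and the estimator `h(x) = 0.8 x` are all hypothetical choices, picked so that $\mathbb{E}[Y|X]$ and $\mathbb{V}ar(Y|X)$ are known in closed form and no nested simulation is needed.

```python
import math
import random


def mse_decomposition(n_samples=200_000, seed=0):
    """Return (MSE vs labels, MSE vs conditional mean, mean conditional variance)."""
    rng = random.Random(seed)
    mse_vs_labels = mse_vs_cond_mean = mean_cond_var = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)
        cond_mean = math.sin(x)        # E[Y | X], known in closed form in this toy model
        cond_std = 0.5 + 0.1 * abs(x)  # heteroscedastic noise level, so Var(Y | X) = cond_std**2
        y = cond_mean + cond_std * rng.gauss(0.0, 1.0)  # noisy label
        h = 0.8 * x                    # arbitrary, deliberately imperfect estimator h(X)
        mse_vs_labels += (h - y) ** 2
        mse_vs_cond_mean += (h - cond_mean) ** 2
        mean_cond_var += cond_std ** 2
    return (
        mse_vs_labels / n_samples,
        mse_vs_cond_mean / n_samples,
        mean_cond_var / n_samples,
    )


mse_labels, mse_cond, noise_floor = mse_decomposition()
print(f"MSE vs labels           : {mse_labels:.4f}")
print(f"MSE vs E[Y|X] + E[Var]  : {mse_cond + noise_floor:.4f}")
```

Up to Monte Carlo error the two printed numbers agree. The MSE against the labels is therefore the MSE against the conditional expectation shifted by a constant that does not depend on the estimator, which is why both criteria rank estimators identically.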
Figure 7: Speed versus accuracy in the case of a CVA at a given pricing time (x-axis: computation time in seconds; y-axis: MSE loss standardized by the variance of the labels; curves: nested Monte Carlo, learning in-sample, learning out-of-sample). We kept varying the number of inner paths for the nested Monte Carlo estimator and the number of epochs for the learning approach and recorded the computation time and the empirical quadratic loss.

Table columns: MSE (vs NMC CVA), MSE (vs labels), Simulation time, Training time.

$\mathrm{FVA}^{(0)}$ profile as per (46). The orange FVA curve represents the mean FVA originating cash flows, which, in principle as on the picture, matches the blue mean FVA itself learned from these cash flows. The 5th and 95th percentiles FVA estimates are a bit less smooth in time than the mean profiles, as expected.

Figure 10 (left) is a sanity check that the profiles of the successive iterates $L^{(k)}$ of the shareholder trading loss process $L^\circ$ in Algorithm 1 converge rapidly with $k$. Figure 10 (right) shows the loss process $L^{(3)}$, displayed as its mean and mean ± 2SE. $L^{(3)}$ appears numerically centered around zero. The latter holds, at least, beyond t ∼

Figure 8: Empirical quadratic loss during CVA learning at time-step t = 5 years, standardized by the variance of the labels (curves: N layers in {1, 2, 3} combined with N units in {11, 16, 22}). (Bottom) Paths are in-sample. (Top) Paths are out-of-sample.

Figure 9: Learned $\mathrm{FVA}^{(0)}$ (x-axis: t in years; curves: $\mathbb{E}\big[\mathrm{FVA}^{(0)}_t\big]$ and $\mathbb{E}\big[\int_t^T \lambda_s \big(\sum_c J^c_s (P^c_s - \mathrm{VM}^c_s) - \mathrm{CVA}_s - \mathrm{MVA}_s - \mathrm{FVA}^{(0)}_s\big)^+ ds\big]$).

Figure 10: (Left) Profiles of the processes $L^{(k)}$, for k = 1, 2, 3 (liability-heavy bank, no CSA, mean loss process convergence). (Right) Mean ± 2SE of $L^{(3)}$ (liability-heavy bank, no CSA). Axes: year vs. domestic currency units.

For the financial case study that follows, we consider
• swap rates uniformly distributed on [0. , .05] (hence swaps already in-the-money or out-of-the-money at time 0),
• number of six-monthly coupon resets uniform on [5 . . . 60] (final maturity of the portfolio T = 30 years),
• portfolio direction: either "asset-heavy" bank mostly in the receivables in the future, or "liability-heavy" bank mostly in the payables in the future (respectively corresponding, with our data, to a bank 75% likely to pay fixed in the swaps, or 75% likely to receive fixed).

The figures that follow only display profiles, i.e. term structures, that is, expectations as a function of time of the corresponding processes. But all these processes are computed pathwise, based on the deep learning regression and quantile regression methodology of Section 4.4, allowing for all XVA inter-dependencies. Of course, XVA profiles (or pathwise XVAs if wished) are much more informative for traders than the spot XVA values (or time 0 confidence intervals) returned by most XVA systems.

Assuming 10 counterparties, Figure 11 shows the GPU generated profiles of

$$
\mathrm{MtM} = \sum_c P^c\, \mathbb{1}_{[0,\tau^\delta_c)} \tag{48}
$$

in the case of the asset-heavy portfolio and of the liability-heavy portfolio.

Figure 11: MtM profiles. (Left) Asset-heavy portfolio. (Right) Liability-heavy portfolio. Axes: years vs. domestic currency units.

Figure 12 shows the portfolio-wide XVA profiles of the asset-heavy (top) vs. liability-heavy (bottom) portfolio and of the no CSA (left) vs. CSA portfolio (right). Obviously, asset-heavy or no CSA means more CVA. The corresponding curves also emphasize the transfer from counterparty credit into liquidity funding risk prompted by extensive collateralisation. Yet FVA/MVA risk is ignored in current derivatives capital regulation.

Figure 12: (Top left) Asset-heavy portfolio, no CSA. (Top right) Asset-heavy portfolio under CSA. (Bottom left) Liability-heavy portfolio, no CSA. (Bottom right) Liability-heavy portfolio under CSA. Curves: CVA, FVA, KVA (no CSA panels) and CVA, MVA, KVA (IM CSA panels). Axes: years vs. domestic currency units.

Figure 13 shows that (top left) capital at risk as funding (cf. Section 3.4) has a material impact on the already (reserve capital as funding) reduced FVA, (top right) treating KVA as a risk margin (cf. (26)) gives a huge discounting impact, (bottom left) deep learning detects material initial margin convexity in the asset-heavy CSA portfolio, and (bottom right) deep learning detects material economic capital convexity in the asset-heavy no CSA portfolio.
Figure 13: (Top left) FVA ignoring the off-setting impact of reserve capital and capital at risk, cf. Section 3.4 (blue), FVA as per (57) accounting for the off-setting impact of reserve capital but ignoring the one of capital at risk (green), refined FVA as per (52) accounting for both impacts (red). (Top right) KVA ignoring the off-setting impact of the risk margin, i.e. with CR instead of (CR − KVA) in (56) (red), refined KVA as per (54)–(55) (blue). (Bottom left) In the case of the asset-heavy portfolio under CSA, unconditional PIM profile, i.e. with $\mathbb{V}a\mathbb{R}_t$ replaced by $\mathbb{V}a\mathbb{R}$ in (59) (blue), vs. pathwise PIM profile, i.e. mean of the pathwise PIM process as per (59) (red). (Bottom right) In the asset-heavy portfolio no CSA case, unconditional economic capital profile, i.e. EC profile ignoring the words "time-t conditional" in Definition A.1 (blue), vs. pathwise economic capital profile, i.e. mean of the pathwise EC process as per Definition A.1 (red). Axes: years vs. domestic currency units.

The above findings demonstrate the necessity of pathwise capital and margin calculations for accurate FVA, MVA, and KVA calculations.

Next, we consider, on top of the previous portfolios, an incremental trade given as a par 30 year (receive fix or pay fix) swap with 100K notional. Figure 14 shows the trade incremental XVA profiles produced by our deep learning approach. Note that, for obtaining such smooth incremental profiles, it has been key to use common random numbers, as much as possible, between the original portfolio XVA computations and the ones regarding the portfolio expanded with the new trade.
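The benefit of common random numbers for incremental quantities can be illustrated on a toy Monte Carlo problem. In the sketch below, every function name and payoff is hypothetical (a large "portfolio" payoff and a small "new trade" payoff driven by one Gaussian factor); it has nothing of the XVA machinery and only shows why reusing the same scenarios before and after the new trade gives a much less noisy difference.

```python
import random
import statistics


def portfolio_value(z):
    """Toy payoff of the existing portfolio on one scenario z (hypothetical)."""
    return 100.0 * max(z, 0.0)


def new_trade_value(z):
    """Toy payoff of the incremental trade on the same scenario (hypothetical)."""
    return max(z - 0.1, 0.0)


def incremental_estimate(n_paths, seed_before, seed_after):
    """Monte Carlo estimate of E[portfolio + new trade] - E[portfolio]."""
    rng_before = random.Random(seed_before)
    rng_after = random.Random(seed_after)
    before = statistics.fmean(
        portfolio_value(rng_before.gauss(0.0, 1.0)) for _ in range(n_paths)
    )
    scenarios_after = [rng_after.gauss(0.0, 1.0) for _ in range(n_paths)]
    after = statistics.fmean(
        portfolio_value(z) + new_trade_value(z) for z in scenarios_after
    )
    return after - before


def estimator_spread(common_random_numbers, n_repeats=50, n_paths=2_000):
    """Standard deviation of the incremental estimate across independent repetitions."""
    estimates = []
    for r in range(n_repeats):
        seed_before = 2 * r
        # Same seed, hence same scenarios, under common random numbers.
        seed_after = 2 * r if common_random_numbers else 2 * r + 1
        estimates.append(incremental_estimate(n_paths, seed_before, seed_after))
    return statistics.stdev(estimates)


spread_crn = estimator_spread(common_random_numbers=True)
spread_independent = estimator_spread(common_random_numbers=False)
print(f"CRN: {spread_crn:.4f}, independent seeds: {spread_independent:.4f}")
```

With shared scenarios the portfolio noise cancels in the difference, so only the (small) noise of the new trade remains. With independent seeds the noise of the whole portfolio enters twice and swamps the incremental effect.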
Figure 14: (Top left) Asset-heavy portfolio, no CSA. Incremental receive fix trade. (Top right) Liability-heavy portfolio, no CSA. Incremental pay fix trade. (Bottom left) Asset-heavy portfolio under CSA. Incremental pay fix trade. (Bottom right) Liability-heavy portfolio under CSA. Incremental receive fix trade. Curves: CVA, FVA, KVA (no CSA panels) and CVA, MVA, KVA (IM CSA panels). Axes: years vs. domestic currency units.

Our model assumes the market risk of trades to be fully hedged (see the paragraph following Remark 2.2 and the proofs of Lemma 3.2 and Proposition 4.1). In the previous subsection, the new swap was implicitly meant to be hedged, in terms of market risk, by the clean desks, through an accordingly modified hedging loss process H (see Section 2.1). Here we consider an alternative situation where the market risk of the new swap is back-to-back hedged via a financial, hedge counterparty. Specifically, we deal with
• 10 counterparties: 8 no CSA clients and 2 bilateral VM/IM CSA hedge counterparties,
• portfolios of 5K randomly generated swap trades as before, plus 5K corresponding hedge trades,
• an incremental trade given as a par 30 year swap with 100K notional, along with the corresponding hedge trade.

In particular, MtM = 0 (cf. (48)), in both portfolios excluding or including the new swap. In case a client or hedge counterparty defaults, the corresponding market hedge is assumed to be rewired through the clean desks via an accordingly modified hedging loss process H.

The 8 no CSA counterparties are primarily asset or liability heavy. One bilateral CSA hedge counterparty is asset-heavy and one liability-heavy. Figure 15 provides the trade incremental XVA profiles of the bilateral hedge alternatives in combination with those for the initial counterparty trade. The main XVA impact of the hedge is then a corresponding incremental MVA term, which can contribute to make the global FTP related to the trade+hedge package more or less positive or negative, depending on the data (cf. the four panels in Figure 15), as can only be inferred by a refined XVA computation.

Figure 15: (Top left) XVA-reducing trade + XVA-increasing bilateral hedge. (Top right) XVA-increasing trade + XVA-increasing bilateral hedge. (Bottom left) XVA-reducing trade + XVA-reducing bilateral hedge. (Bottom right) XVA-increasing trade + XVA-reducing bilateral hedge. Curves: CVA, KVA, MVA, FVA. Axes: years vs. domestic currency units.

Remark 5.1
In the above, we do not include the XVA costs/benefits of the bilateral hedge counterparty itself. Given Remark 2.4, in different circumstances it may be possible to attribute them to client trades of the original or hedge bank. Space is lacking for a fuller discussion of economics of XVA trading in different setups. In particular, many hedge trades now face central instead of bilateral counterparties. This occurs at additional XVA costs for the client of the initial swap that can be computed the way explained in Albanese, Armenti, and Crépey (2020).
Our deep learning XVA implementation uses CNTK, the Microsoft Cognitive Toolkit. CNTKis written in core C++/CUDA (with wrappers for Python, C
| | 10 CP, 40 risk factors, No CSA | 10 CP, 40 risk factors, IM CSA | 20 CP, 80 risk factors, No CSA | 20 CP, 80 risk factors, IM CSA |
|---|---|---|---|---|
| Initial risk factor & trade pricing simulation (Cuda) | 352 | 352 | 426 | 426 |
| Counterparty and bank level learning calculations | 4,529 | 13,466 | 19,154 | 59,342 |
| Total initial batch | 4,881 | 13,818 | 19,580 | 59,768 |
| Re-simulate 1 counterparty trade pricing (Cuda) | 40 | 40 | 51 | 51 |
| Counterparty and bank level learning calculations | 2,785 | 2,736 | 7,694 | 6,628 |
| Total incremental trade | 2,825 | 2,776 | 7,745 | 6,679 |

Table 5: XVA deep learning computation timings (seconds).

All these results were based on 50K simulation paths, 32 time steps per year for risk factor simulation, and 16 time steps per year for all XVA calculations and deep learning. They were computed on a Lenovo P52 laptop with NVidia Quadro P3200 GPU @ 5.5 Teraflops peak FP32 performance, and 14 streaming multiprocessors.

The computations for 20 counterparties took more than twice as long as those for 10 counterparties. However, our deep learning calculations achieved around 80 to 90% Cuda occupancy for 10 counterparties and at times fell to half that level for 20 counterparties. Scaling to realistically high dimensions should be achievable, but acceptable trade incremental pricing performance in production would require server-grade GPU hardware, performance tuning for high GPU utilisation, and, possibly, caching computations.
A Continuous-Time XVA Equations
We recall from Crépey, Sabbagh, and Song (2020) the continuous-time XVA equations for bilateral trade portfolios when capital at risk is deemed fungible with variation margin, also adding here initial margin and MVA as in the refined static setup of Section 3.4.

We write $\delta_\eta(dt) = d\mathbb{1}_{\{\eta \le t\}}$ for the Dirac measure at a random time $\eta$.

A.1 Cash Flows
We suppose that the derivative portfolio of the bank is partitioned into bilateral netting sets of contracts which are jointly collateralized and liquidated upon bank or counterparties (whether these are clients or market hedge counterparties) default. Given a netting set $c$ of the bank portfolio, we denote by:
• $\mathcal{P}^c$ and $P^c$, the corresponding contractually promised cash flows and clean value processes;
• $\tau_c$, $J^c$, and $R_c$, the corresponding default times, survival indicators, and recovery rates, whereas $\tau$, $J$, and $R$ are the analogous data regarding the bank itself, with bank credit spread process $\lambda = (1 - R)\gamma$ taken as a proxy of its risky funding spread process;
• $\tau^\delta_c = \tau_c + \delta$ and $\tau^\delta = \tau + \delta$, where $\delta$ is a positive margin period of risk, in the sense that the liquidation of the netting set $c$ happens at time $\tau^\delta_c \wedge \tau^\delta$;
• $\mathrm{VM}^c$, the variation margin (re-hypothecable collateral) exchanged between the bank and counterparty $c$, counted positively when received by the bank;
• $\mathrm{PIM}^c$ and $\mathrm{RIM}^c$, the related initial margin (segregated collateral) posted and received by the bank;
• RC and CR, the reserve capital and capital at risk of the bank.

The contractually promised cash flows are supposed to be hedged out by the bank but one conservatively assumes no XVA hedge, so that the bank is left with the following trading cash flows $\mathcal{C}$ and $\mathcal{F}$ (cf. (38) and see Albanese and Crépey (2020, Lemmas 5.1 and 5.2) for detailed derivations of analogous equations in a slightly simplified setup):

(See Albanese, Armenti, and Crépey (2020, Section 5) for the discussion of cheaper funding schemes for initial margin.)

• The (counterparty) credit cash flows

$$
\begin{aligned}
d\mathcal{C}_t &= \sum_{c;\,\tau_c \le \tau^\delta} (1 - R_c) \Big( (P^c + \mathcal{P}^c)_{\tau^\delta_c \wedge \tau^\delta} - (\mathcal{P}^c + \mathrm{VM}^c + \mathrm{RIM}^c)_{(\tau_c \wedge \tau)-} \Big)^+ \delta_{\tau^\delta_c \wedge \tau^\delta}(dt) \\
&\quad - (1 - R) \sum_{c;\,\tau \le \tau^\delta_c} \Big( (P^c + \mathcal{P}^c)_{\tau^\delta \wedge \tau^\delta_c} - (\mathcal{P}^c + \mathrm{VM}^c - \mathrm{PIM}^c)_{(\tau \wedge \tau_c)-} \Big)^- \delta_{\tau^\delta \wedge \tau^\delta_c}(dt);
\end{aligned}
\tag{49}
$$

• The (risky) funding cash flows

$$
\begin{aligned}
d\mathcal{F}_t &= J_t \lambda_t \Big( \sum_c J^c (P^c - \mathrm{VM}^c) - \mathrm{RC} - \mathrm{CR} \Big)^+_t dt \\
&\quad - (1 - R) \Big( \sum_c J^c (P^c - \mathrm{VM}^c) - \mathrm{RC} - \mathrm{CR} \Big)^+_{\tau-} \delta_\tau(dt) \\
&\quad + J_t \lambda_t \sum_c J^c_t\, \mathrm{PIM}^c_t\, dt - (1 - R) \sum_c J^c_{\tau-} \mathrm{PIM}^c_{\tau-}\, \delta_\tau(dt),
\end{aligned}
\tag{50}
$$

where the RC and CR terms account for the fungibility of reserve capital and capital at risk with variation margin.

A.2 Valuation
Here (as in our numerics) we distinguish between a (strict) FVA, in the strict sense of the cost of raising variation margin, and an MVA for the cost of raising initial margin (see Remark 2.1). The (other than K)VA equations are then

$$
\mathrm{RC} = \mathrm{CA} = \mathrm{CVA} + \mathrm{FVA} + \mathrm{MVA}, \tag{51}
$$

the so-called "contra-assets valuation" sourced from the clients and deposited in the reserve capital account of the bank, where, for $t < \tau$,

$$
\begin{aligned}
\mathrm{CVA}_t &= \mathbb{E}_t \sum_{c;\,t < \tau^\delta_c} (1 - R_c) \Big( (P^c + \mathcal{P}^c)_{\tau^\delta_c} - (\mathcal{P}^c + \mathrm{VM}^c + \mathrm{RIM}^c)_{\tau_c-} \Big)^+ \\
\mathrm{FVA}_t &= \mathbb{E}_t \int_t^T \lambda_s \Big( \sum_c J^c (P^c - \mathrm{VM}^c) - \mathrm{CA} - \mathrm{CR} \Big)^+_s ds \\
\mathrm{MVA}_t &= \mathbb{E}_t \int_t^T \lambda_s \sum_c J^c_s\, \mathrm{PIM}^c_s\, ds.
\end{aligned}
\tag{52}
$$

The corresponding trading loss and profit process $L$ of the bank is such that $L_0 = 0$ and, for $t < \tau$,

$$
\begin{aligned}
dL_t &= \sum_c (1 - R_c) \Big( (P^c + \mathcal{P}^c)_{\tau^\delta_c} - (\mathcal{P}^c + \mathrm{VM}^c + \mathrm{RIM}^c)_{\tau_c-} \Big)^+ \delta_{\tau^\delta_c}(dt) \\
&\quad + \lambda_t \Big( \sum_c J^c (P^c - \mathrm{VM}^c) - \mathrm{CA} - \mathrm{CR} \Big)^+_t dt \\
&\quad + \lambda_t \sum_c J^c_t\, \mathrm{PIM}^c_t\, dt + d\mathrm{CA}_t,
\end{aligned}
\tag{53}
$$

so that $L$ is a $\mathbb{Q}$ martingale, hence (by Lemma 4.1) $L^\circ$ is a $\mathbb{Q}^*$ martingale.

By the same rationale as Definitions 3.2 and 3.3 in the static setup:

Definition A.1
$\mathrm{EC}_t$ is the time-$t$ conditional 97.5% expected shortfall of $(L^\circ_{t+1} - L^\circ_t)$ under $\mathbb{Q}$.

Given a positive target hurdle rate $h$:

Definition A.2
We set

$$
\mathrm{CR} = \max(\mathrm{EC}, \mathrm{KVA}), \tag{54}
$$

for a KVA process such that, for $t < \tau$,

$$
\mathrm{KVA}_t = \mathbb{E}_t \Big[ \int_t^T h \big( \mathrm{CR}_s - \mathrm{KVA}_s \big) ds \Big]. \tag{55}
$$

Hence, for $t < \tau$,

$$
\mathrm{KVA}_t = \mathbb{E}_t \Big[ \int_t^T h e^{-h(s-t)}\, \mathrm{CR}_s\, ds \Big] = \mathbb{E}_t \Big[ \int_t^T h e^{-h(s-t)} \max\big(\mathrm{EC}_s, \mathrm{KVA}_s\big)\, ds \Big]. \tag{56}
$$

The next-to-last identity is the continuous-time analog of the risk margin formula under the Swiss solvency test cost of capital methodology: see Swiss Federal Office of Private Insurance (2006, Section 6, middle of page 86 and top of page 88).

A.3 The XVA Equations are Well-Posed
In view of (51), the second line in (52) is in fact an FVA equation. Likewise, the second line in (56) is a KVA equation. Moreover, as capital at risk is fungible with variation margin (cf. Section 3.4), i.e. in consideration of the CR term in (52)-(53), where CR = max(EC, KVA), we actually deal with an (FVA, KVA) system, and even, as EC depends on $L$ (cf. Definition A.1), with a forward backward system for the forward loss process $L$ and the backward pair (FVA, KVA).

However, as in the refined static setup of Section 3.4, the coupling between (FVA, KVA) and $L$ can be disentangled by the following Picard iteration:
• Let CVA and MVA be as in (52), $L^{(0)} = \mathrm{KVA}^{(0)} = 0$, and, for $t < \tau$,

$$
\mathrm{FVA}^{(0)}_t = \mathbb{E}_t \int_t^T \lambda_s \Big( \sum_c J^c (P^c - \mathrm{VM}^c) - \mathrm{CA}^{(0)} \Big)^+_s ds, \tag{57}
$$

where $\mathrm{CA}^{(0)} = \mathrm{CVA} + \mathrm{FVA}^{(0)} + \mathrm{MVA}$;
• For $k \ge 1$, writing explicitly $\mathrm{EC} = \mathrm{EC}(L)$ to emphasize the dependence of EC on $L$, let $L^{(k)}_0 = 0$ and, for $t < \tau$,

$$
\begin{aligned}
dL^{(k)}_t &= \sum_c (1 - R_c) \Big( (P^c + \mathcal{P}^c)_{\tau^\delta_c} - (\mathcal{P}^c + \mathrm{VM}^c + \mathrm{RIM}^c)_{\tau_c-} \Big)^+ \delta_{\tau^\delta_c}(dt) \\
&\quad + \lambda_t \Big( \sum_c J^c (P^c - \mathrm{VM}^c) - \mathrm{CA}^{(k-1)} - \max\big( \mathrm{EC}(L^{(k-1)}), \mathrm{KVA}^{(k-1)} \big) \Big)^+_t dt \\
&\quad + \lambda_t \sum_c J^c_t\, \mathrm{PIM}^c_t\, dt + d\mathrm{CA}^{(k-1)}_t, \\
\mathrm{KVA}^{(k)}_t &= h\, \mathbb{E}_t \int_t^T e^{-h(s-t)} \max\big( \mathrm{EC}_s(L^{(k)}), \mathrm{KVA}^{(k)}_s \big)\, ds, \\
\mathrm{CA}^{(k)}_t &= \mathrm{CVA}_t + \mathrm{FVA}^{(k)}_t + \mathrm{MVA}_t \quad \text{where} \\
\mathrm{FVA}^{(k)}_t &= \mathbb{E}_t \int_t^T \lambda_s \Big( \sum_c J^c (P^c - \mathrm{VM}^c) - \mathrm{CA}^{(k)} - \max\big( \mathrm{EC}(L^{(k)}), \mathrm{KVA}^{(k)} \big) \Big)^+_s ds.
\end{aligned}
\tag{58}
$$

Theorem 4.1 in Crépey, Sabbagh, and Song (2020)
Assuming square integrable data, the XVA equations are well-posed within square integrable solutions (including when one accounts for the fact that capital at risk can be used for funding variation margin). Moreover, the above Picard iteration converges to the unique square integrable solution of the XVA equations.
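To make the KVA equation concrete in isolation, here is a minimal deterministic sketch. It assumes a hypothetical, given economic capital profile `ec` (no randomness, no default, no coupling with FVA or with the loss process), so that (55) reduces to the backward ODE $d\mathrm{KVA}_t = -h(\mathrm{CR}_t - \mathrm{KVA}_t)dt$ with $\mathrm{KVA}_T = 0$ and CR = max(EC, KVA) as in (54). The function names and the explicit Euler time stepping are illustrative choices, not the paper's numerical scheme.

```python
import math


def kva_profile(ec, h, dt):
    """Explicit Euler scheme, stepping backward from KVA_T = 0, for the ODE form of (55)."""
    n = len(ec) - 1
    kva = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        cr_next = max(ec[i + 1], kva[i + 1])  # capital at risk, as in (54)
        kva[i] = kva[i + 1] + h * dt * (cr_next - kva[i + 1])
    return kva


def kva_discounted_form(ec, kva, h, dt, i):
    """Quadrature of h * int_t^T exp(-h (s - t)) max(EC_s, KVA_s) ds at t = i * dt, cf. (56)."""
    return sum(
        h * math.exp(-h * (j - i) * dt) * max(ec[j], kva[j]) * dt
        for j in range(i + 1, len(ec))
    )


# Hypothetical inputs: 10% hurdle rate, 30 year horizon, EC null up to year 5, then hump-shaped.
h, dt, n_steps = 0.10, 0.01, 3_000
ec = [
    0.0 if j <= 500 else 100.0 * math.sin(math.pi * (j - 500) / (n_steps - 500))
    for j in range(n_steps + 1)
]
kva = kva_profile(ec, h, dt)
kva_0_discounted = kva_discounted_form(ec, kva, h, dt, 0)
print(f"KVA_0 via (55): {kva[0]:.3f}, via (56): {kva_0_discounted:.3f}")
```

Both printed values agree up to discretization error, in line with the passage from (55) to (56). In this toy setup KVA stays flat over the first five years: EC is zero there, so CR = KVA and the integrand in (55) vanishes.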
A.4 Collateralization Schemes
We denote by $\Delta^c_t = \mathcal{P}^c_t - \mathcal{P}^c_{(t-\delta)-}$ the cumulative contractual cash flows with the counterparty $c$ accumulated over a past period of length $\delta$. In our case study, we consider both "no CSA" netting sets $c$, with VM = RIM = PIM = 0, and "(VM/IM) CSA" netting sets $c$, with $\mathrm{VM}^c_t = P^c_t$ and, for $t \le \tau_c$,

$$
\mathrm{RIM}^c_t = \mathbb{V}a\mathbb{R}_t \Big( (P^c_{t^\delta} + \Delta^c_{t^\delta}) - P^c_t \Big), \qquad
\mathrm{PIM}^c_t = \mathbb{V}a\mathbb{R}_t \Big( -(P^c_{t^\delta} + \Delta^c_{t^\delta}) + P^c_t \Big), \tag{59}
$$

for some PIM and RIM quantile levels $a_{pim}$ and $a_{rim}$ (and $t^\delta = t + \delta$).

The following result can be derived by similar computations as the ones in Albanese, Armenti, and Crépey (2020, Section A).

Proposition A.1
In a common shock default model of the counterparties and the bank itself (see the beginning of Section 5), with pre-default intensity processes $\gamma^c$ of the counterparties and $\gamma$ of the bank, then $\mathrm{CVA} = \mathrm{CVA}^{nocsa} + \mathrm{CVA}^{csa}$, where, for $t < \tau$,

$$
\mathrm{CVA}^{nocsa}_t = \sum_{c\ \mathrm{nocsa}} \mathbb{1}_{t < \tau_c} (1 - R_c)\, \mathbb{E}_t \int_t^T (P^c_{s^\delta} + \Delta^c_{s^\delta})^+ \gamma^c_s e^{-\int_t^s \gamma^c_u du}\, ds + \sum_{c\ \mathrm{nocsa}} \mathbb{1}_{\tau_c\, \cdots}\ \cdots
$$

$$
\mathrm{MVA}^{csa}_t = \sum_{c\ \mathrm{csa}} J^c_t\, \mathbb{E}_t \int_t^T (1 - R) \gamma_s\, \mathrm{PIM}^c_s\, e^{-\int_t^s \gamma^c_u du}\, ds. \tag{62}
$$

References

Abbas-Turki, L., B. Diallo, and S. Crépey (2018). XVA principles, nested Monte Carlo strategies, and GPU optimizations. International Journal of Theoretical and Applied Finance 21, 1850030.
Albanese, C., Y. Armenti, and S. Crépey (2020). XVA Metrics for CCP optimisation. Statistics & Risk Modeling 37 (1-2), 25–53.
Albanese, C., S. Caenazzo, and S. Crépey (2017). Credit, funding, margin, and capital valuation adjustments for bilateral portfolios. Probability, Uncertainty and Quantitative Risk 2 (7), 26 pages.
Albanese, C. and S. Crépey (2020). The cost-of-capital XVA approach in continuous time. Working paper available on https://math.maths.univ-evry.fr/crepey.
Andersen, L., D. Duffie, and Y. Song (2019). Funding value adjustments. Journal of Finance 74 (1), 145–192.
Artzner, P., K.-T. Eisele, and T. Schmidt (2020). No arbitrage in insurance and the QP-rule. Working paper available as arXiv:2005.11022.
Barrera, D., S. Crépey, B. Diallo, G. Fort, E. Gobet, and U. Stazhynski (2019). Stochastic approximation schemes for economic capital and risk margin computations. ESAIM: Proceedings and Surveys 65, 182–218.
Beck, C., S. Becker, P. Cheridito, A. Jentzen, and A. Neufeld (2019). Deep splitting method for parabolic PDEs. arXiv:1907.03452.
Bichuch, M., A. Capponi, and S. Sturm (2018). Arbitrage-free XVA. Mathematical Finance 28 (2), 582–620.
Bielecki, T. and M. Rutkowski (2002). Credit Risk: Modeling, Valuation and Hedging. Springer Finance, Berlin.
Bielecki, T. R. and M. Rutkowski (2015). Valuation and hedging of contracts with funding costs and collateralization. SIAM Journal on Financial Mathematics 6, 594–655.
Brigo, D. and A. Capponi (2010). Bilateral counterparty risk with application to CDSs. Risk Magazine, March 85–90. Preprint version available at https://arxiv.org/abs/0812.3705.
Brigo, D. and A. Pallavicini (2014). Nonlinear consistent valuation of CCP cleared or CSA bilateral trades with initial margins under credit, funding and wrong-way risks. Journal of Financial Engineering 1, 1–60.
Burgard, C. and M. Kjaer (2011). In the balance. Risk Magazine, October 72–75.
Burgard, C. and M. Kjaer (2013). Funding costs, funding strategies. Risk Magazine, December 82–87. Preprint version available at https://ssrn.com/abstract=2027195.
Burgard, C. and M. Kjaer (2017). Derivatives funding, netting and accounting. Risk Magazine, March 100–104. Preprint version available at https://ssrn.com/abstract=2534011.
Castagna, A. (2014). Towards a theory of internal valuation and transfer pricing of products in a bank: Funding, credit risk and economic capital. Available at http://ssrn.com/abstract=2392772.
Collin-Dufresne, P., R. Goldstein, and J. Hugonnier (2004). A general formula for valuing defaultable securities. Econometrica 72 (5), 1377–1407.
Committee of European Insurance and Occupational Pensions Supervisors (2010). QIS5 technical specifications. https://eiopa.europa.eu/Publications/QIS/QIS5-technical specifications 20100706.pdf.
Crépey, S. (2015). Bilateral counterparty risk under funding constraints. Part I: Pricing, followed by Part II: CVA. Mathematical Finance 25 (1), 1–22 and 23–50. First published online on 12 December 2012.
Crépey, S., T. R. Bielecki, and D. Brigo (2014). Counterparty Risk and Funding: A Tale of Two Puzzles. Chapman & Hall/CRC Financial Mathematics Series.
Crépey, S., W. Sabbagh, and S. Song (2020). When capital is a funding source: The anticipated backward stochastic differential equations of X-Value Adjustments. SIAM Journal on Financial Mathematics 11 (1), 99–130.
Crépey, S. and S. Song (2016). Counterparty risk and funding: Immersion and beyond. Finance and Stochastics 20 (4), 901–930.
Crépey, S. and S. Song (2017). Invariance times. The Annals of Probability 45 (6B), 4632–4674.
Dimitriadis, T. and S. Bayer (2019). A joint quantile and expected shortfall regression framework. Electronic Journal of Statistics 13 (1), 1823–1871.
Duffie, D. and M. Huang (1996). Swap rates and credit quality. Journal of Finance 51 Options: recent advances in theory and practice, Volume 2, pp. 13–24. Manchester University Press.
Elouerkhaoui, Y. (2007). Pricing and hedging in a dynamic credit model. International Journal of Theoretical and Applied Finance 10 (4), 703–731.
Elouerkhaoui, Y. (2017). Credit Correlation: Theory and Practice. Palgrave Macmillan.
Fissler, T. and J. Ziegel (2016). Higher order elicitability and Osband's principle. The Annals of Statistics 44 (4), 1680–1707.
Fissler, T., J. Ziegel, and T. Gneiting (2016). Expected Shortfall is jointly elicitable with Value at Risk - Implications for backtesting. Risk Magazine, January.
Föllmer, H. and A. Schied (2016). Stochastic Finance: An Introduction in Discrete Time (4th ed.). De Gruyter Graduate.
Goodfellow, I., Y. Bengio, and A. Courville (2016). Deep Learning. MIT Press.
Gottardi, P. (1995). An analysis of the conditions for the validity of Modigliani-Miller Theorem with incomplete markets. Economic Theory 5, 191–207.
Green, A., C. Kenyon, and C. Dennis (2014). KVA: capital valuation adjustment by replication. Risk Magazine, December 82–87. Preprint version "KVA: capital valuation adjustment" available at ssrn.2400324.
Hoeting, J. A., D. Madigan, A. E. Raftery, and C. T. Volinsky (1999). Bayesian model averaging: A tutorial. Statistical Science 14 (4), 382–417.
Hull, J. and A. White (2012). The FVA debate, followed by The FVA debate continued. Risk Magazine, July 83–85 and October 52.
Huré, C., H. Pham, and C. Warin (2020). Some machine learning schemes for high-dimensional nonlinear PDEs. Mathematics of Computation 89 (324), 1547–1580.
International Financial Reporting Standards (2013). IFRS 4 insurance contracts exposure draft.
Kjaer, M. (2019). In the balance redux. Risk Magazine (November).
Longstaff, F. A. and E. S. Schwartz (2001). Valuing American options by simulation: A simple least-squares approach. The Review of Financial Studies 14 (1), 113–147.
Merton, R. (1974). On the pricing of corporate debt: the risk structure of interest rates. The Journal of Finance 29, 449–470.
Modigliani, F. and M. Miller (1958). The cost of capital, corporation finance and the theory of investment. Economic Review 48, 261–297.
Myers, S. (1977). Determinants of corporate borrowing. Journal of Financial Economics 5, 147–175.
Piterbarg, V. (2010). Funding beyond discounting: collateral agreements and derivatives pricing. Risk Magazine, August 57–63.
Schönbucher, P. (2004). A measure of survival.