A sparse grid approach to balance sheet risk measurement
Cyril Bénézet, Jérémie Bonnefoy, Jean-François Chassagneux, Shuoqing Deng, Camilo Garcia Trillos, Lionel Lenôtre
CEMRACS

November 22, 2018
Abstract
In this work, we present a numerical method based on a sparse grid approximation to compute the loss distribution of the balance sheet of a financial or an insurance company. We first describe, in a stylised way, the assets and liabilities dynamics that are used for the numerical estimation of the balance sheet distribution. For the pricing and hedging model, we chose a classical Black & Scholes model with a stochastic interest rate following a Hull & White model. The risk management model describing the evolution of the parameters of the pricing and hedging model is a Gaussian model. The new numerical method is compared with the traditional nested simulation approach. We review the convergence of both methods to estimate the risk indicators under consideration. Finally, we provide numerical results showing that the sparse grid approach is extremely competitive for models with moderate dimension.
∗ Paris Diderot University, LPSM. † Group Risk Management, GIE AXA. ‡ Paris Dauphine University, CEREMADE. § University College London. ¶ Ecole Polytechnique, CMAP.

The goal of this paper is to present a robust and efficient method to numerically assess risks on the balance sheet distribution of, say, an insurance company, at a given horizon. In practice, the horizon is chosen to be one year, consistently with the Solvency 2 regulation, the prudential framework for assessing the required solvency capital of a European insurance company.

On a filtered probability space (Ω, 𝒜, P, (F_t)_{t≥0}), the balance sheet of the company is a random process summarised, at any time t ≥ 0, by the value of the assets of the company (A_t)_{t≥0} and the value of the liabilities (L_t)_{t≥0}. The quantity of interest is the Profit and Loss (PnL in the sequel) associated to the balance sheet, which is given by P_t = L_t − A_t, t ≥ 0. By convention, and adopting the point of view of risk management, we measure the loss as a positive quantity.

On the liability side, the insurance company has sold a structured financial product which depends on the evolution of a one-dimensional stock price (S_t) and the risk-free interest rate (r_t). Several insurance products could be of this type, in particular Unit-Linked contracts (with or without financial guarantees) and Variable Annuity contracts. For those contracts, the client's money is invested in equity and bond markets, while the insurance company might also provide financial guarantees similar to long-term put options. The long maturity of those contracts requires the introduction of a model for interest rates, as they are very sensitive to interest rate curve movements. The value L_1 is just the price of this product, taking into account the value of some risk factors X_1 (stock price, interest rate curve, etc.) at time t = 1 used to calibrate the pricing model.

On the asset side, the insurance company manages some assets to hedge the risk associated to the product sale. The pricing actually includes a margin which is secured through hedging. The hedging assets are the stock and swaps of several maturities, in practice mostly concentrated on the long term. In practice, bond futures are also sometimes included. The hedging portfolio is typically rebalanced on a weekly basis, and the hedging quantities are determined by a financial model, taken to be the same as the liability pricing model, whose inputs are the risk factors X_t at the time t when the hedge is computed.

We describe precisely in Section 2 the pricing and hedging model, the dynamics of the risk factor X and the value of the asset and liability sides of the balance sheet. Let us stress that the risk factor model is given under the so-called real-world probability measure P, which might be objectively calibrated using time series of financial markets or represent the management view. This real-world model may be, and most of the time is, completely different from the pricing and hedging model, which might be simplified for runtime/tractability purposes, prudent (pricing and hedging include a margin) or constrained by regulation.

Our goal is then to compute various risk indicators for the loss distribution of the balance sheet at one year, namely the distribution of P_1 under the real-world probability measure P, that we denote hereafter η. Precisely, we measure the risk associated to η using a (law invariant) risk measure defined over the class of square integrable measures, ϱ : P_2(R) → R. First, we consider for ϱ the so-called Value-at-Risk (V@R), which is defined by the left-side quantile:

V@R_p(η) = inf{ q ∈ R | η((−∞, q]) ≥ p }. (1.1)

We will also work with the class of spectral risk measures: a spectral risk measure is defined as

ϱ_h(η) = ∫_0^1 V@R_p(η) h(p) dp, (1.2)

where h is a non-decreasing probability density on [0, 1].
In the numerics, we will focus on the Average Value-at-Risk (AV@R), which is given by

AV@R_α(η) = (1/(1 − α)) ∫_α^1 V@R_p(η) dp, (1.3)

and is a special case of a spectral risk measure.

For a law invariant risk measure ϱ, we denote by R its "lift" on L²(Ω, 𝒜, P; R) =: L², namely R[X] = ϱ([X]) for any X ∈ L², where [X] denotes the law of X. The lift R_h of a spectral risk measure ϱ_h satisfies the following properties:

1. Monotonicity: R_h[X] ≤ R_h[Y], for X ≤ Y ∈ L²;
2. Cash invariance: R_h[X + c] = R_h[X] + c for X ∈ L² and c ∈ R;
3. Positive homogeneity: R_h[tX] = t R_h[X], t ≥ 0 and X ∈ L²;
4. Convexity: R_h[tX + (1 − t)Y] ≤ t R_h[X] + (1 − t) R_h[Y], whenever 0 ≤ t ≤ 1, for X, Y ∈ L².

Let us stress the fact that V@R only satisfies 1-3. We refer to [7] and the references therein for more insights on risk measures and spectral risk measures.

In our setting, the loss distribution η of the balance sheet PnL is obtained through the following expression: η = p♯ν, where ♯ denotes the push-forward operator, p : R^γ → R is the function describing the PnL in terms of the risk factors, and ν stands for the distribution of the risk factors X̄. In practice, the estimation of ϱ(η) requires to sample from η. In turn, this demands a sample of the model parameter distribution ν and a numerical approximation of p. In this note, we compare two main approaches to form the sample of η given one of ν.

The first one is known as the nested simulation approach: it is a two-step method. First, a set of "outer simulations", describing the random values of the risk factors, is drawn. Then, for each value of the risk factors, a sample of "inner simulations" is drawn to compute the various hedges and prices. In this approach, all computations are realised "online".
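Whichever approach produces the loss sample, the estimation step is the same: evaluate (1.1) and (1.3) on the empirical measure of the sample. The following is a minimal numpy sketch (function names are ours, not from the paper); the AV@R routine integrates the piecewise-constant empirical quantile function exactly:

```python
import numpy as np

def var_p(sample, p):
    """Left-side p-quantile (1.1) of the empirical loss distribution."""
    xs = np.sort(np.asarray(sample, dtype=float))
    n = len(xs)
    # smallest index k with empirical CDF (k+1)/n >= p
    k = int(np.ceil(p * n)) - 1
    return xs[max(k, 0)]

def avar_alpha(sample, alpha):
    """AV@R (1.3): (1/(1-alpha)) * integral_alpha^1 V@R_p dp, computed
    exactly for the piecewise-constant empirical quantile function."""
    xs = np.sort(np.asarray(sample, dtype=float))
    n = len(xs)
    ps = np.arange(1, n + 1) / n              # empirical CDF levels
    lower = np.concatenate(([0.0], ps[:-1]))  # V@R_p = xs[i] on (lower[i], ps[i]]
    weights = np.clip(ps, alpha, None) - np.clip(lower, alpha, None)
    return float(weights @ xs) / (1.0 - alpha)
```

For instance, on the sample {1, 2, 3, 4}, AV@R at level 0.5 is the average of the upper half, 3.5.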
The main advantage of this approach is its simplicity to implement in practice, as described in the first paragraph of Subsection 3.1. However, it is well known that this approach is quite greedy in computational resources, even if optimised as in [6]. We also want to stress the fact that, when computing the η-sample, no information about p is stored for future work: for example, if ν is modified, due to time or a model change, a full recalculation would be required.

The other approach we chose to adopt, and would like to promote, is a grid approach where the approximation of p is made "offline", by a Monte Carlo approach, and then stored. The numerical computation is then done through a (multi-linear) interpolation on a grid. The main drawback of this approach is that the size of the grid, in high dimension, can become intractable, especially if one uses a regular grid. To partially circumvent this difficulty, we introduce a sparse grid [2], which drastically reduces the number of points to be used (equivalently, values to be stored) with only a small reduction of the accuracy of the method.

We prove that, for a spectral risk measure, the two approaches give an estimation of ϱ(η) which converges to the true value, see Theorem 3.1. Furthermore, we show in the numerical Section 4 that using the grid approach together with a sparse grid of low level allows one to get a good approximation of the loss distribution η, and of some related risk measures, while drastically reducing the computational time and allowing to keep information about the balance sheet function p. Last, this permits to numerically quantify uncertainty. Indeed, since the computations on the grid are stored, the computation of the distribution of the PnL under other distributions for the parameters is almost instantaneous and can be compared with the results obtained with the initial one. An application to uncertainty estimation is given in the last numerical application.

The rest of the paper is organised as follows.
In Section 2, we first describe the mathematical models that are used to describe the evolution of the prices under the risk-neutral measure Q. We then describe precisely how A and L are specified. In Section 3, we describe the two numerical methods used to compute L_t and A_t at any given time t ≥ 0. In particular, we show how to efficiently compute, at time t, the quantities to hold in the hedging portfolio, which are expressed in terms of the derivatives of the claim's price. We also explain how to compute the price of the product and of the assets used to construct the hedging portfolio, leading to the computation of L_t and A_t. We show how to obtain an approximation of the distribution of P_1 under the physical measure P, and we prove an upper bound for the mean square error of the overall procedure. Finally, in Section 4, we present our numerical results, comparing the two methods.

In this section, we give the precise specification of the asset and liability sides of the balance sheet. We also present the risk-neutral model and the real-world model that are used.
Let us assume that a company sells a contingent claim at time t = 0 which is a (discretely) path-dependent option with a payoff function G paid at the maturity T > 0, depending upon the evolution of a one-dimensional risky asset's price S. We focus here on a put lookback option, that is a discretely path-dependent option whose strike at maturity T is given by the maximum of the asset's price S over the times t ∈ {τ_0 = 0, τ_1, ..., τ_κ = T}, where κ ≥ 1:

G(S_{τ_0}, ..., S_{τ_κ}) = ( max_{0≤ℓ≤κ} S_{τ_ℓ} ) − S_T. (2.1)

Remark. The proxy provided above is close to financial guarantees offered in Variable Annuity contracts. Those contracts are structured insurance products composed of a fund investment on top of which both insurance and financial protection are added. In our case, the contract is a
Guaranteed Minimum Accumulation Benefit including a ratchet mechanism. At time t = 0, the customer invests his/her money in the underlying fund and will receive at a given maturity the maximum between the terminal fund value and the terminal benefit base, in case she is still alive. The terminal benefit base is equal to the maximum of the underlying fund values observed at each anniversary date of the contract (ratchet mechanism). We do not consider the modelling of death/survival in this proceedings, nor the possibility that the client can surrender at any time during the life of the contract.

We assume that all pricing and hedging is done with a market risk-neutral measure Q. The derivative with a payoff function G as above depends upon a one-dimensional stock price S = (S_t)_{t∈[0,T]}. We assume here that the dynamics of the asset under Q are of the Black & Scholes type, as described in Section 2.2.2, with a stochastic interest rate r = (r_t)_{t∈[0,T]} which follows a Hull & White model. As the payoff G is a proxy of a Variable Annuity guarantee, which is a long-term savings product (in practice, maturity ranges from 10 to 30 years depending on the product type), the modelling of interest rates is essential, as the product, and therefore the overall balance sheet of the company, is very sensitive to this risk.

Let Θ ∈ R^d (d := 3 in the sequel) be a set of parameters representing some market observations. The short rate evolution is governed by the Hull & White dynamics

r_s^{t,Θ} = r_t^{t,Θ} + ∫_t^s a (μ_u^{t,Θ} − r_u^{t,Θ}) du + b (B_s − B_t), s ∈ [t, T], (2.2)

where B is a Q-Brownian motion, a and b are real constants and μ^{t,Θ} : [t, T] → R is a function. We refer to [1] for a more complete analysis of the Hull & White short rate model. The parameter μ^{t,Θ} is calibrated using the market observations Θ, so that the model reproduces the interest rate curve observed on the market. It is given by
μ_s^{t,Θ} = f_Θ(t, s) + (1/a) ∂f_Θ(t, s)/∂s + (b²/(2a²)) (1 − e^{−2a(s−t)}), s ∈ [t, T]. (2.3)

We refer to the Appendix for a derivation of (2.3). As a consequence, the Θ parameter must be chosen in order to represent adequately the forward rate curve observed on the market.

We suppose here that the forward rate curve f_Θ(t, ·) is directly observed and is a linear combination of three elementary functions h_{t,1}, h_{t,2}, h_{t,3} from [t, T] to R, given by

h_{t,1}(s) := h_1(s − t), h_{t,2}(s) := h_2(s − t), and h_{t,3}(s) := h_3(s − t), s ∈ [t, T],

where, for u ∈ [0, T]:

h_1(u) = 1 if u ≤ t_1, (t_2 − u)/(t_2 − t_1) if u ∈ [t_1, t_2], 0 otherwise,

h_2(u) = 0 if u ≤ t_3, (u − t_3)/(t_4 − t_3) if u ∈ [t_3, t_4], 1 otherwise,

and h_3(u) = 1 − h_1(u) − h_2(u), where 0 ≤ t_1 < t_2 < t_3 < t_4 ≤ T are four fixed real numbers. The function h_{t,1} (resp. h_{t,3}, h_{t,2}) models the short (resp. middle, long) term structure of the interest rate curve.

Figure 1: Building blocks for the forward interest rate curve: the elementary functions h_1, h_2, h_3 with t_1 = 0, t_2 = 6, t_3 = 16, t_4 = 30.

In a nutshell, the short rate model is determined by:

1. the time of observation t ∈ [0, T],
2. the three-dimensional parameter Θ := (θ_1, θ_2, θ_3) ∈ R³, where θ_1, θ_2, θ_3 are such that

f_Θ(t, ·) = θ_1 h_{t,1} + θ_2 h_{t,2} + θ_3 h_{t,3}, (2.4)

f_Θ(t, ·) being the observed forward rate curve.

As a result, given an observation (t, Θ) as above, the short rate process r^{t,Θ} has the dynamics (2.2), where μ^{t,Θ} is computed using (2.3).

Remark. In practice, the parameters a, b appearing in (2.2) can be calibrated so that the model reproduces the prices, observed on the market, of some contracts such as swaps or swaptions. We could more generally allow the parameters a, b of the Hull & White model to depend upon the market observations Θ. The parameter Θ should then live in a higher-dimensional space to take into account the observed swap(tion) prices. Regular recalibration of parameters is largely performed by practitioners, in particular when they perform dynamic hedging.

There are several reasons explaining the choice of this model. First of all, it is quite simple to calibrate using the data. In fact, the function μ is directly given as a function of the forward rate curve. We should note again that the choice of keeping a, b fixed through time simplifies the calibration. Secondly, we will see later in Proposition 3.1 that this short rate model, associated to the stock model described below, leads to an exact simulation under the risk-neutral measure. Lastly, closed and easily tractable formulas can be obtained for the prices of the zero-coupon bonds and swaps, which are the products used to construct the hedging portfolio and then to compute the value of the company's assets A. These prices are as follows.

Proposition 2.1.
Let (t, Θ) ∈ [0, 1] × R³ be a market observation, and consider the process (r_s^{t,Θ})_{s∈[t,T]} given by (2.2), where the parameter μ^{t,Θ} is defined with (2.3) and (2.4).

1. The price at time t of a zero-coupon bond maturing at time u ∈ [t, T] is given by:

P_{t,Θ,u} = exp( −∫_t^u f_Θ(t, s) ds ), (2.5)

and its derivatives with respect to Θ := (θ_1, θ_2, θ_3) are given by:

∂P/∂θ_i (t, Θ, u) = −P_{t,Θ,u} ∫_t^u h_{t,i}(s) ds. (2.6)

2. Let (0, Θ_0) be the observation made at time 0. Consider a swap contract issued at s = 0, with maturity M > 0, rate R > 0, with coupons paid at every time i ∈ {1, ..., M}. Then, the price of this contract at time t is given by:

SW_{t,Θ,M,R} = P_{t,Θ,1} / P_{0,Θ_0,1} − P_{t,Θ,M} − R Σ_{i=1}^M P_{t,Θ,i}, (2.7)

and its derivatives with respect to Θ are given by:

∂SW/∂θ_j (t, Θ, M, R) = −(P_{t,Θ,1} / P_{0,Θ_0,1}) ∫_t^1 h_{t,j}(s) ds + P_{t,Θ,M} ∫_t^M h_{t,j}(s) ds + R Σ_{i=1}^M ( P_{t,Θ,i} ∫_t^i h_{t,j}(s) ds ). (2.8)

Given the observation Θ of the interest rate factors and the risky asset's price x ∈ (0, ∞), the evolution of the price under the risk-neutral measure Q is given by

S_s^{t,x,Θ} = x + ∫_t^s r_u^{t,Θ} S_u^{t,x,Θ} du + ∫_t^s σ S_u^{t,x,Θ} dW̃_u, s ∈ [t, T], (2.9)

where σ > 0 and W̃ is another Q-Brownian motion, whose quadratic covariation with B is given by ⟨B, W̃⟩_t := ρt, t ∈ [0, T], where ρ ∈ [−1, 1]. Equivalently, S^{t,x,Θ} can be written as:

S_s^{t,x,Θ} = x + ∫_t^s r_u^{t,Θ} S_u^{t,x,Θ} du + ∫_t^s ρσ S_u^{t,x,Θ} dB_u + ∫_t^s √(1 − ρ²) σ S_u^{t,x,Θ} dW_u, s ∈ [t, T], (2.10)

where W is a Q-Brownian motion, independent of B.

Remark. In practice, the parameter σ appearing in (2.10) can be calibrated so that the model reproduces the prices of some derivatives on the risky asset.
This can be taken into account by increasing the dimension of the space where Θ lives, and by adding this calibration procedure.

Remark. Naturally, the general sparse grid approach can be applied to different models and functional representations. We made the choice of using a Black & Scholes model for the stock value and a Hull & White model with the given functional representation in terms of h_1, h_2, h_3 for the short rate, since they are convenient to obtain explicit pricing and sensitivities formulae, as we show in the following.

The key point for us is to approximate the distribution of the balance sheet of an insurance company at time t = 1 (here a year) given the market observations at t = 0. As mentioned in the introduction, the PnL is a process P which can be decomposed as

P_t = L_t − A_t, t ∈ [0, 1], (2.11)

where L is the value of the liabilities of the company and A is the value of the assets. We assume that at time t = 0 the balance sheet is clear, meaning that the company has no asset nor liability, that is L_0 = A_0 = 0. We describe precisely in the two subsequent sections how these quantities are defined. Importantly, we denote by X̄_t := (S̄_t, Θ̄_t), 0 ≤ t ≤ 1, the stochastic process representing the random evolution of the market parameters under the real-world measure P. Namely, S̄ is the stock price and Θ̄ the interest rate curve parameters, as described above in Section 2.2.1. It is important to have in mind that the models chosen for the stock price S (under Q) and S̄ (under P) will be completely different, as they do not serve the same purpose (pricing-hedging on one hand, risk management of the balance sheet or regulatory assessment of required capital on the other hand).

2.3.1 Liability side

For any market observation X̄_t := (S̄_t, Θ̄_t), the value L_t =: ℓ(t, S̄_t, Θ̄_t) of the liabilities has to be computed, especially at time t = 1 in our application. The company's liabilities are reduced to one derivative product sold at t = 0.
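The curve and instrument prices of Proposition 2.1, which are used repeatedly below, can be made concrete with a short sketch. It implements the forward curve (2.4) and the prices (2.5) and (2.7), using the knot values t_1 = 0, t_2 = 6, t_3 = 16, t_4 = 30 of Figure 1 and the basis functions as reconstructed in Section 2.2.1; the integral in (2.5) is computed with a trapezoidal rule rather than in closed form, and all function names are ours:

```python
import numpy as np

T1, T2, T3, T4 = 0.0, 6.0, 16.0, 30.0   # knots t1 < t2 < t3 < t4 (Figure 1)

def h1(u):  # short-end block: 1, linear decay on [t1, t2], then 0
    return np.clip((T2 - np.asarray(u, dtype=float)) / (T2 - T1), 0.0, 1.0)

def h2(u):  # long-end block: 0, linear rise on [t3, t4], then 1
    return np.clip((np.asarray(u, dtype=float) - T3) / (T4 - T3), 0.0, 1.0)

def h3(u):  # middle block, so that h1 + h2 + h3 = 1
    return 1.0 - h1(u) - h2(u)

def forward_curve(theta, t, s):
    """f_Theta(t, s) = theta_1 h1(s - t) + theta_2 h2(s - t) + theta_3 h3(s - t)."""
    u = np.asarray(s, dtype=float) - t
    return theta[0] * h1(u) + theta[1] * h2(u) + theta[2] * h3(u)

def zero_coupon(theta, t, u, n=4001):
    """P_{t,Theta,u} = exp(-int_t^u f_Theta(t, s) ds), cf. (2.5), trapezoidal rule."""
    s = np.linspace(t, u, n)
    f = forward_curve(theta, t, s)
    integral = 0.5 * ((f[1:] + f[:-1]) * np.diff(s)).sum()
    return np.exp(-integral)

def swap_price(theta, theta0, t, M, R):
    """SW_{t,Theta,M,R} of (2.7): swap issued at time 0, annual coupons at 1..M."""
    floating = zero_coupon(theta, t, 1.0) / zero_coupon(theta0, 0.0, 1.0) - zero_coupon(theta, t, M)
    fixed = R * sum(zero_coupon(theta, t, float(i)) for i in range(1, int(M) + 1))
    return floating - fixed
```

Since h1 + h2 + h3 = 1, a constant parameter vector Θ = (c, c, c) produces the flat curve f ≡ c, which gives a quick sanity check of the implementation.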
In our setting, the contingent claim's price is simply given by:

ℓ(t, x, Θ) = E^Q[ e^{−∫_t^T r_s^{t,Θ} ds} G(S^{t,x,Θ}) ], (2.12)

where (r^{t,Θ}, S^{t,x,Θ}) are the risk-neutral dynamics of the short rate and stock price, see Sections 2.2.1 and 2.2.2, calibrated from the observed market parameters (x, Θ) at time t. We recall that the payoff G depends on S^{t,x,Θ} only through the values S_τ^{t,x,Θ}, τ ∈ Γ_G := {τ_0, ..., τ_κ} (κ ≥ 1), see (2.1).

As explained in more detail below, the computation of P_1 first requires to approximate ℓ(t, x, Θ) for (t, x, Θ) on a (possibly stochastic) discrete grid of [0, 1] × (0, ∞) × R³. This approximation of ℓ at any point (t, x, Θ) ∈ [0, 1] × (0, ∞) × R³ basically follows from the simulation of the processes r^{t,Θ} and S^{t,x,Θ} under the risk-neutral measure Q and a Monte Carlo procedure. We will see in Section 3 that the simulation can be done in an exact manner in our model.

Remark. A classical approach to compute ℓ would be to use a dynamic programming principle. Step by step, it requires:

1. to numerically obtain ℓ(1, x, Θ) for all (x, Θ) on the grid with (2.12),

2. then to iteratively compute ℓ(t_k, x, Θ) for all (x, Θ) on the grid using

ℓ(t_k, x, Θ) = E^Q[ e^{−∫_{t_k}^{t_{k+1}} r_s^{t_k,Θ} ds} E^Q[ e^{−∫_{t_{k+1}}^T r_s^{t_k,Θ} ds} G(S^{t_k,x,Θ}) | F_{t_{k+1}} ] ]. (2.13)

However, the inner conditional expectation is not of the form ℓ(t_{k+1}, x, Θ) for x > 0 and Θ ∈ R³. In fact, the time of the market observation Θ is still t_k, while the discount factor goes only from t_{k+1} to T, in contrast with (2.12). Therefore, using the dynamic programming equation (2.13) would require to introduce some additional and artificial parameters, namely the time of calibration and the value of r^{t_k,Θ} at time t_{k+1}. This would make the overall procedure heavier, which is why we compute ℓ using simply (2.12).
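To illustrate the structure of (2.12), here is a deliberately simplified toy version of the Monte Carlo liability pricer: the short rate is frozen at a constant r (instead of the Hull & White dynamics of Section 2.2.1), so that only the Black & Scholes stock needs to be simulated at the monitoring dates; the exact joint simulation actually used in the paper is given in Section 3. The function name and defaults are ours:

```python
import numpy as np

def liability_price_mc(x, r, sigma, dates, M=20000, rng=None):
    """Monte Carlo estimate of (2.12) in a simplified setting: constant short
    rate r and Black & Scholes stock monitored at `dates` (tau_0=0,...,tau_k=T).
    Returns the discounted expectation of the lookback payoff (2.1)."""
    rng = np.random.default_rng(rng)
    dates = np.asarray(dates, dtype=float)
    dt = np.diff(dates)                      # steps between monitoring dates
    T = dates[-1]
    # exact log-normal increments of S at the monitoring dates, M paths at once
    z = rng.standard_normal((M, len(dt)))
    logS = np.log(x) + np.cumsum((r - 0.5 * sigma**2) * dt
                                 + sigma * np.sqrt(dt) * z, axis=1)
    S = np.concatenate([np.full((M, 1), x), np.exp(logS)], axis=1)
    payoff = S.max(axis=1) - S[:, -1]        # payoff (2.1): running max minus S_T
    return np.exp(-r * T) * payoff.mean()
```

With σ = 0 the path is deterministic and increasing (for r > 0), so the running maximum equals S_T and the price is exactly zero, which is a convenient consistency check.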
2.3.2 Asset side

The company wants to replicate the product with payoff G. The classical theory of mathematical finance ensures that it is equivalent, in theory, to possess a portfolio which negates the variations of the price of the product with respect to the evolution of the underlying parameters. In our context, the insurer wants to be protected against the variations with respect to the stock price S_t and the interest rate curve, which is modelled through the parameter Θ.

The dynamic hedging portfolio is constructed and rebalanced in discrete time, on the time grid Γ := {t_0 = 0 < t_1 < · · · < t_n = 1} (in practice, the portfolio will be rebalanced up to the maturity of the product, but in our setting, we are only interested in the portfolio's value up to t = 1). At each time t ∈ Γ, the insurer computes the derivatives of the price with respect to S_t and Θ_t, and then buys some financial assets (the stock and swaps) in order to construct a portfolio whose derivatives match those computed. To model this framework, we decompose the hedging portfolio's value A in two parts:

A_t = A_t^Δ + A_t^ρ. (2.14)

The process A^Δ is the value of the portfolio obtained to cancel the variations of the price with respect to S, while A^ρ is defined to deal with the variations with respect to Θ.

Remark. Obviously, since in practice the hedging is done in discrete time and some underlying parameters are not considered, the payoff G is not exactly replicated, nor super-replicated. Therefore the PnL of the company is not null, nor always positive, and the goal of this proceedings is precisely to propose a new numerical method to estimate the distribution of this quantity at time t = 1. In practice, the pricing embeds a margin, and the objective of replicating the payoff G is to secure it.
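A single rebalancing step can be sketched as follows: given the price sensitivities at the current market observation, the stock quantity is the delta, and the three swap quantities solve the 3×3 linear system derived below. In practice, the inputs would come from the Monte Carlo sensitivity estimates of Section 3; the function name and argument layout are ours:

```python
import numpy as np

def rebalance(dl_dx, dl_dtheta, dsw_dtheta):
    """One hedge rebalancing: Delta := dl/dx, and (rho_1, rho_2, rho_3) solve
    sum_i rho_i * dSW_i/dtheta_j = dl/dtheta_j for j = 1, 2, 3.
    `dsw_dtheta[j, i]` holds dSW_i/dtheta_j at the current (t, x, Theta)."""
    rho = np.linalg.solve(np.asarray(dsw_dtheta, dtype=float),
                          np.asarray(dl_dtheta, dtype=float))
    return float(dl_dx), rho
```

The system is solvable whenever the three swaps have linearly independent sensitivities to (θ_1, θ_2, θ_3), which is the point of choosing short, medium and long maturities.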
To construct the hedging portfolio, the insurer can buy the underlying stock, together with three swap contracts, defined by some rates R_1, R_2, R_3 > 0 and maturity dates T_1, T_2, T_3 ∈ [1, T]. Interest rate hedging is performed so that the portfolio is insensitive to the variations of the main maturities of the interest rate curve. For long-term products, this means building a hedging portfolio containing several different maturities, from 1 year to 30 years. Here, only 3 maturities representing the short, medium and long-term parts of the curve are considered for simplicity. The formula for their price SW_{t,Θ,T_i,R_i}, i = 1, 2, 3, is given in Proposition 2.1 above.

We now describe how to compute the quantities of stock and swaps to buy at a time t ∈ Γ, to rebalance the hedging portfolio. Denote by Δ (resp. ρ_i, i = 1, 2, 3) the quantity of stock (resp. of the swap with rate R_i and maturity date T_i, i = 1, 2, 3). Then the value of the portfolio of the company is given by:

Π_t = Δ S_t + Σ_{i=1}^3 ρ_i SW_{t,Θ_t,T_i,R_i} − ℓ(t, S_t, Θ_t). (2.15)

By Itô's formula, assuming a semimartingale decomposition for the process Θ under P, we get:

dΠ_t = (Δ − Δ(t, S_t, Θ_t)) dS_t + Σ_{j=1}^3 ( Σ_{i=1}^3 ρ_i ∂SW/∂θ_j (t, Θ_t, T_i, R_i) − ∂ℓ/∂θ_j (t, S_t, Θ_t) ) d(θ_j)_t + dt terms.

To cancel the risks induced by the variations of the stock price and the interest rate curve, the first four terms in the previous equation must cancel. Those considerations lead to the following construction for the hedging portfolio A = A^Δ + A^ρ:

Δ-hedging: The value of A^Δ at time 1 is

A_1^Δ = Σ_{i=0}^{n−1} Δ(t_i, S̄_{t_i}, Θ̄_{t_i}) (S̄_{t_{i+1}} − S̄_{t_i}), where Δ(t, x, Θ) := ∂ℓ/∂x (t, x, Θ).
(2.16)

Note that the function Δ has to be computed at each time t_i, i = 0, ..., n, and for any market situation (x, Θ) at this time. A method leading to a numerical estimation of Δ is proposed in Section 3.

ρ-hedging: The value of A^ρ at time 1 is

A_1^ρ = Σ_{i=0}^{n−1} Σ_{j=1}^3 ρ_j(t_i, S̄_{t_i}, Θ̄_{t_i}) ( SW_{t_{i+1},Θ̄_{t_{i+1}},T_j,R_j} − SW_{t_i,Θ̄_{t_i},T_j,R_j} ). (2.17)

At time t ∈ [0, T], for a market at (x, Θ), the quantities ρ_i(t, x, Θ), i = 1, 2, 3, of each swap contract required for the hedging are given by the solution of the linear system

Σ_{i=1}^3 ρ_i ∂SW/∂θ_j (t, Θ, T_i, R_i) = ∂ℓ/∂θ_j (t, x, Θ), j = 1, 2, 3. (2.18)

One key quantity to compute for us in this setting is thus the vector of sensitivities (∂ℓ/∂θ_i (t, x, Θ)), i = 1, 2, 3.

Remark. (i) We choose to always use the same swap contracts, issued at t = 0, as hedging instruments. We could have decided to enter for free into swaps (at the swap rate) at each rebalancing time. However, this strategy requires to keep the memory of the swap rate in order to compute the swap price at the next rebalancing date. (ii) The T_i, R_i should be chosen so that they represent some liquid contracts.

2.3.3 The global PnL function

From the previous two sections, we conclude that the PnL of the balance sheet at time 1 can be expressed as P_1 = p((t, S̄_t, Θ̄_t)_{t∈Γ}), where (S̄, Θ̄) are the market parameters (risk factors) and the PnL function p : R^γ → R, with γ = 4 × (n + 1), is given by

p((t_i, x_i, Θ_i)_{0≤i≤n}) = ℓ(t_n, x_n, Θ_n) − Σ_{i=0}^{n−1} Δ(t_i, x_i, Θ_i)(x_{i+1} − x_i) − Σ_{i=0}^{n−1} Σ_{j=1}^3 ρ_j(t_i, x_i, Θ_i) ( SW_{t_{i+1},Θ_{i+1},T_j,R_j} − SW_{t_i,Θ_i,T_j,R_j} ).
(2.19)

In the next section, we describe the model we will consider for the market parameters (S̄, Θ̄).

2.4 The real-world measure

We describe here the model that will be used for the simulation of the market parameters in the numerical part. Let us insist that this real-world measure P might represent the view of the management on the evolution of the market parameters over the period [0, 1]. As already mentioned, it can be completely different from the model used for the risk-neutral pricing.

We assume that we know, or at least are able to estimate, the first two moments of the distribution of (X_1 := log(S̄_1), Θ̄_1) = (X_1, (θ_1)_1, (θ_2)_1, (θ_3)_1) under P. More precisely, we assume that under P, this random vector has mean and covariance matrix given by

μ = (μ_X, μ_1, μ_2, μ_3), (2.20)

V = (V_{ij})_{i,j=0,1,2,3}. (2.21)

To model the random process (X_t, (θ_1)_t, (θ_2)_t, (θ_3)_t)_{t∈[0,1]} under P, we assume that its dynamics are given by

X_t = X_0 + b_0 t + c_{00} W_t^0 + c_{01} W_t^1 + c_{02} W_t^2 + c_{03} W_t^3, (2.22)

(θ_1)_t = (θ_1)_0 + b_1 t + c_{11} W_t^1 + c_{12} W_t^2 + c_{13} W_t^3, (2.23)

(θ_2)_t = (θ_2)_0 + b_2 t + c_{22} W_t^2 + c_{23} W_t^3, (2.24)

(θ_3)_t = (θ_3)_0 + b_3 t + c_{33} W_t^3, (2.25)

where W^i, i = 0, 1, 2, 3, are independent P-Brownian motions.

Proposition 2.2.
There exists at most one set of coefficients b_i, c_{ij}, 0 ≤ i ≤ j ≤ 3, as in (2.22)-(2.25), such that the random vector (X_1, (θ_1)_1, (θ_2)_1, (θ_3)_1) has mean μ and covariance matrix V.

Proof. We refer to the appendix for a proof, cf. Proposition 5.1.
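The construction behind Proposition 2.2 can be sketched by matching moments directly: the drifts b_i match the mean at t = 1, and the triangular loading pattern of (2.22)-(2.25) means the matrix c = (c_{ij}) must be an upper-triangular square root of V, obtained here from a reversed Cholesky factorisation. This is a sketch under the usual convention of positive diagonal entries; the helper name is ours:

```python
import numpy as np

def real_world_coefficients(init, mu, V):
    """Coefficients of (2.22)-(2.25): drifts b_i so that the mean at t = 1 is mu,
    and an upper-triangular c with c c^T = V (row i loads on W^i, ..., W^3)."""
    init, mu, V = map(np.asarray, (init, mu, V))
    b = mu - init                                # mean at t = 1 is init + b * 1
    J = np.eye(len(mu))[::-1]                    # index-reversal matrix
    c = J @ np.linalg.cholesky(J @ V @ J) @ J    # upper-triangular factor of V
    return b, c
```

Since numpy's Cholesky routine returns a lower-triangular factor, conjugating by the reversal matrix J turns it into the upper-triangular factor required by the cascading structure of (2.22)-(2.25).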
Remark. It is well known that it is difficult to estimate accurately the drift parameter in a Black-Scholes model. This makes our computation subject to model risk. We leave it to future research work to find a robust way to approximate the law of P_1 under P. Nevertheless, let us point out that the grid approach allows us to compute, with minimal re-computation, risk measures for different approximations of the law of P_1. This is one of the advantages of this method with respect to using "nested simulations", as illustrated in Section 4.

In this section, we describe the two numerical methods that we use to compute the risk indicators on the balance sheet PnL. The first one is known as the nested simulation approach and is already used in the industry, see the seminal paper [6]. The second one is a sparse grid approach and is designed to be more efficient than the nested simulation approach in the case of moderate dimensions. In the next section, we present numerical simulations that confirm this fact for the model with moderate dimension that we consider here.

3.1 Estimating the risk measure
Given a risk measure ϱ and the loss distribution η of the balance sheet at one year, we estimate the quantity of interest ϱ(η) by simulating a sample of N i.i.d. random variables (Ψ_j)_{1≤j≤N} with law η and then computing simply ϱ(η_N), using the formulae (1.1), (1.2) and (1.3) with η_N instead of η. Here, η_N stands for the empirical measure associated to the Ψ_j, i.e.

η_N = (1/N) Σ_{j=1}^N δ_{Ψ_j},

where δ_x is the Dirac mass at the point x. Recall that, in our work, the loss distribution η is obtained through the following expression: η = p♯ν, p being described in (2.19), while ν stands for the distribution of the market parameters. Namely, ν is the law of the random variable

X̄ := (S̄_t, Θ̄_t)_{t∈Γ} (3.1)

under the real-world probability measure P. In order to estimate ϱ(η) for a chosen risk measure, we need to be able to sample from η, which implies two steps in our setting. First, we need to be able to sample X̄, and then we use an approximation p_υ of p, υ ∈ {N, S}, namely:

• p_N if one chooses the nested simulation approach;
• p_S if one chooses the sparse grid approach.

Eventually, the estimator of ϱ(η) is given by

R_υ := ϱ(p_υ ♯ ν_N), for υ ∈ {N, S}. (3.2)

To summarise, the two numerical methods have the following steps.

Nested simulation approach
1. Outer step: simulate the model parameters (X̄_j)_{j=1,...,N}.

2. Inner step: compute Ψ_j = p_N(X̄_j) using Monte Carlo simulations. This requires to compute the option prices with Monte Carlo estimates, the interest rate derivative prices, and the various sensitivities of the price, see Subsection 3.2.2. All these computations are done using closed-form formulae that are derived below.

3. Estimate the risk measure.

All computations are made "online".

Sparse grid approach
1. Fix a (sparse) grid V and compute the approximation p_S at each required value on the grid by a Monte Carlo simulation. This involves exactly the same computations as step 2 above.

2. Simulate the N model parameter samples (X̄_j) and evaluate Ψ_j = p_S(X̄_j).

3. Estimate the risk measure.

Computations at step 1 are done "offline". The next two steps are done "online". We now describe precisely how to compute p_N and p_S.

3.2 Nested simulation approach

In the nested simulation approach, once the market parameters X̄ have been simulated, the function p_N has itself to be computed. Recalling (2.19), this requires to evaluate the functions ℓ_N, Δ_N, ρ_N (approximations of ℓ, Δ, ρ) at the value of the market parameters. This computation is made using again a Monte Carlo approach. In order to compute the risk-neutral expectations in the above formulae, one has to sample the short rate process and compute its integral over [t, T], and also to simulate the stock price S at least at the times τ_ℓ ∈ Γ_G ∩ [t, T]. Under the model described in Section 2.2, the simulation is exact, as described in the two following statements, whose proofs are postponed to the appendix.

Let (t, x, Θ) ∈ [0, T] × (0, ∞) × R³ be a market observation, and consider the processes (r_s^{t,Θ})_{s∈[t,T]} and (S_s^{t,x,Θ})_{s∈[t,T]} defined by (2.2) and (2.10). We first start with an easy and well known observation.

Lemma 3.1.
We have the following decomposition for the short rate process:
$$r^{t,\Theta} = \xi^t + \alpha^{t,\Theta}, \qquad (3.3)$$
where $(\alpha^{t,\Theta}_s)_{s\in[t,T]}$ is the deterministic function of time
$$\alpha^{t,\Theta}_s = f^\Theta(t,s) + \frac{b^2}{2a^2}\big(1-e^{-a(s-t)}\big)^2, \quad s\in[t,T], \qquad (3.4)$$
and $(\xi^t_s)_{s\in[t,T]}$ is the solution of the SDE
$$d\xi^t_s = -a\,\xi^t_s\,ds + b\,dB_s, \quad s\in[t,T], \qquad (3.5)$$
$$\xi^t_t = 0. \qquad (3.6)$$

The following proposition provides a recursive way to produce sample paths of the triplet $(\xi^t, A^t, X^{t,x,\Theta})$ on the discrete grid $\{t\}\cup(\Gamma_G\cap[t,T])$. We are thus in a position to simulate the evolution of $(r^{t,\Theta}, S^{t,x,\Theta})$ and the discount factor $\beta^{t,\Theta}_T := e^{-\int_t^T r^{t,\Theta}_s\,ds}$ under the risk-neutral measure $Q$.

Proposition 3.1.
Fix $t\le s\le u\le T$. Conditionally upon $\mathcal F_s$, the triplet
$$\Big(\xi^t_u,\; A^{t,s}_u := \int_s^u \xi^t_r\,dr,\; X^{t,x,\Theta}_u := \log\big(S^{t,x,\Theta}_u\big)\Big)$$
is a Gaussian vector of dimension $3$, with mean vector and covariance matrix respectively given by
$$\mathbb E_s\big[\xi^t_u\big] = \xi^t_s e^{-a(u-s)}, \qquad \mathbb E_s\big[A^{t,s}_u\big] = \frac{\xi^t_s}{a}\big(1-e^{-a(u-s)}\big),$$
$$\mathbb E_s\big[X^{t,x,\Theta}_u\big] = X^{t,x,\Theta}_s + \int_s^u \alpha^{t,\Theta}_r\,dr - \frac{\sigma^2}{2}(u-s) + \frac{\xi^t_s}{a}\big(1-e^{-a(u-s)}\big), \qquad (3.7)$$
and
$$\mathrm{Var}_s(\xi^t_u) = \frac{b^2}{2a}\big(1-e^{-2a(u-s)}\big),$$
$$\mathrm{Cov}_s(\xi^t_u, A^{t,s}_u) = \frac{b^2}{a^2}\big(1-e^{-a(u-s)}\big) - \frac{b^2}{2a^2}\big(1-e^{-2a(u-s)}\big),$$
$$\mathrm{Var}_s(A^{t,s}_u) = \frac{b^2}{a^2}(u-s) - \frac{2b^2}{a^3}\big(1-e^{-a(u-s)}\big) + \frac{b^2}{2a^3}\big(1-e^{-2a(u-s)}\big),$$
$$\mathrm{Cov}_s(\xi^t_u, X^{t,x,\Theta}_u) = \frac{b}{a}\Big(\frac{b}{a}+\rho\sigma\Big)\big(1-e^{-a(u-s)}\big) - \frac{b^2}{2a^2}\big(1-e^{-2a(u-s)}\big),$$
$$\mathrm{Cov}_s(A^{t,s}_u, X^{t,x,\Theta}_u) = -\frac1a\,\mathrm{Cov}_s(\xi^t_u, X^{t,x,\Theta}_u) + \frac{b}{a}\Big(\frac{b}{a}+\rho\sigma\Big)(u-s) - \frac{b^2}{a^3}\big(1-e^{-a(u-s)}\big),$$
$$\mathrm{Var}_s(X^{t,x,\Theta}_u) = \Big(\rho\sigma+\frac{b}{a}\Big)^2(u-s) + (1-\rho^2)\sigma^2(u-s) - \frac{2b}{a^2}\Big(\rho\sigma+\frac{b}{a}\Big)\big(1-e^{-a(u-s)}\big) + \frac{b^2}{2a^3}\big(1-e^{-2a(u-s)}\big). \qquad (3.8)$$

3.2.1 Approximation of the liability side

With the above results, the approximation at any time $t$ of the liability function $\ell$, denoted $\ell_{\mathcal N}$, is straightforward. It is given by
$$\ell_{\mathcal N}(t,x,\Theta) = \frac1M\sum_{k=1}^M \beta^{t,\Theta,k}_T G\big(S^{t,x,\Theta,k}\big), \qquad (3.9)$$
where $\big((\beta^{t,\Theta,k}_\tau, S^{t,x,\Theta,k}_\tau)_{\tau\in\Gamma_G}\big)_{1\le k\le M}$ are i.i.d. realisations of $(\beta^{t,\Theta}_\tau, S^{t,x,\Theta}_\tau)_{\tau\in\Gamma_G}$, recall (2.1).

The approximation of the asset side is slightly more involved, as it requires the computation of sensitivities with respect to the model parameters: $\frac{\partial\ell}{\partial x}$ and $\frac{\partial\ell}{\partial\theta_i}$, $i=1,2,3$, see (2.16), (2.17) and (2.18). We choose to compute the sensitivities using a "weight" approach, namely we express them as the expectation of the discounted payoff times a well-chosen random weight. Note that other techniques are available to compute these sensitivities, e.g. automatic differentiation.
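The "weight" idea can be illustrated on a much simpler toy example than the Hull & White setting of this paper. The sketch below (our own illustration; the function name `lr_delta_bs` and all parameter values are ours) computes the delta of a put in a plain Black & Scholes model with constant rate, where the likelihood-ratio weight is available in closed form, and compares it with the analytical delta.

```python
import numpy as np
from math import log, sqrt
from statistics import NormalDist

def lr_delta_bs(x, r, sigma, T, payoff, n_paths=200_000, seed=0):
    """Price and delta of E[e^{-rT} payoff(S_T)] in Black & Scholes.
    The delta uses a likelihood-ratio ("weight") estimator: the same
    Monte Carlo sample is reused, each payoff multiplied by a weight."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_paths)
    s_T = x * np.exp((r - 0.5 * sigma**2) * T + sigma * sqrt(T) * z)
    disc_payoff = np.exp(-r * T) * payoff(s_T)
    weight = z / (x * sigma * sqrt(T))  # = d/dx of the log-density of S_T
    return disc_payoff.mean(), (disc_payoff * weight).mean()

# European put: compare with the closed-form Black & Scholes delta
x, K, r, sigma, T = 100.0, 100.0, 0.02, 0.2, 1.0
price, delta = lr_delta_bs(x, r, sigma, T, lambda s: np.maximum(K - s, 0.0))
d1 = (log(x / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
exact_delta = NormalDist().cdf(d1) - 1.0
```

As in Propositions 3.2 and 3.3 below, price and sensitivity come out of a single Monte Carlo sample, which is the feature exploited by the nested simulation approach.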
In our context, we have compared the two methods and observed that the weight approach is more efficient, see Section 5.4 in the Appendix. We now describe how to obtain the sensitivities in a convenient form for Monte Carlo simulation. Recall that we consider a liability function $\ell$ of the form:
$$\ell(t,x,\Theta) = \mathbb E^Q\Big[e^{-\int_t^T r^{t,\Theta}_s\,ds}\,G\big(S^{t,x,\Theta}\big)\Big], \qquad (3.10)$$
where $G$ depends upon $S^{t,x,\Theta}$ through its values on a finite set $\Gamma_G = \{\tau_0 = 0,\dots,\tau_\kappa = T\}\subset[0,T]$. In the following, we assume for simplicity that $\tau_1 > t$. Otherwise, there are deterministic terms to be added, but the method remains the same.

The Delta:
We want to compute:
$$\frac{\partial\ell}{\partial x}(t,x,\Theta) = \frac{\partial}{\partial x}\,\mathbb E^Q\Big[e^{-\int_t^T r^{t,\Theta}_s\,ds}\,G\big(S^{t,x,\Theta}\big)\Big] \qquad (3.11)$$
and we have the following result.

Proposition 3.2.
For all $(t,x,\Theta)\in[0,T]\times(0,\infty)\times\mathbb R^3$, the following holds:
$$\frac{\partial\ell}{\partial x}(t,x,\Theta) = \mathbb E^Q\Big[e^{-\int_t^T r^{t,\Theta}_s\,ds}\,G\big(S^{t,x,\Theta}\big)\,H^{x,\Theta}\Big(\int_t^{\tau_1}\xi^t_s\,ds,\; S^{t,x,\Theta}_{\tau_1}\Big)\Big], \qquad (3.12)$$
with
$$H^{x,\Theta}(a,y) = \frac1x\Big(\Sigma^{-1}_{2,1}\,a + \Sigma^{-1}_{2,2}\Big(\log(y/x) - \int_t^{\tau_1}\alpha^{t,\Theta}_r\,dr + \frac{\sigma^2}{2}(\tau_1-t)\Big)\Big), \qquad (3.13)$$
where $\Sigma$ is the covariance matrix of $\big(A^t_{\tau_1}, X^{t,x,\Theta}_{\tau_1}\big)$, see (3.18).

Proof. We write the expectation as an integral, remembering that we know the law of the triplet $(\xi^t_u, A^t_u, X^{t,x,\Theta}_u)$ conditionally upon $\mathcal F_s$:
$$\mathbb E^Q\Big[e^{-\int_t^T r^{t,\Theta}_s\,ds}\,G\big(S^{t,x,\Theta}\big)\Big] \qquad (3.14)$$
$$= e^{-\int_t^T\alpha^{t,\Theta}_s\,ds}\,\mathbb E^Q\Big[e^{-\int_t^{\tau_1}\xi^t_s\,ds}\prod_{\ell=1}^{\kappa-1} e^{-\int_{\tau_\ell}^{\tau_{\ell+1}}\xi^t_s\,ds}\,G\big(S^{t,x,\Theta}\big)\Big]$$
$$= e^{-\int_t^T\alpha^{t,\Theta}_s\,ds}\int_{\mathbb R^{2\kappa}} e^{-a_1}\prod_{\ell=1}^{\kappa-1} e^{-a_{\ell+1}}\,G(e^{x_1},\dots,e^{x_\kappa})\; p^\Theta(t,0,\log(x);\tau_1,a_1,x_1)\times\dots\times p^\Theta(\tau_{\kappa-1},a_{\kappa-1},x_{\kappa-1};\tau_\kappa,a_\kappa,x_\kappa)\,da_1\cdots da_\kappa\,dx_1\cdots dx_\kappa,$$
where $p^\Theta(s,a,x;u,\cdot,\cdot)$ is the density of the couple $(A^{t,a}_u, X^{t,x,\Theta}_u)$, conditionally upon $\mathcal F_s$. We have previously seen that it is a Gaussian vector with explicit mean vector and covariance matrix. Thus, using Fubini's theorem, we get, since there is no dependence on $x$ except in the first density:
$$\frac{\partial\ell(t,x,\Theta)}{\partial x} = e^{-\int_t^T\alpha^{t,\Theta}_s\,ds}\int_{\mathbb R^{2\kappa}} e^{-a_1}\prod_{\ell=1}^{\kappa-1} e^{-a_{\ell+1}}\,G(e^{x_1},\dots,e^{x_\kappa})\,\frac{\partial p^\Theta(t,0,\log(x);\tau_1,a_1,x_1)}{\partial x}\times\dots\times p^\Theta(\tau_{\kappa-1},a_{\kappa-1},x_{\kappa-1};\tau_\kappa,a_\kappa,x_\kappa)\,da_1\cdots da_\kappa\,dx_1\cdots dx_\kappa. \qquad (3.15)$$
Consequently, the sensitivity of the discounted price with respect to the initial stock price is computed only by differentiating the first density with respect to $x$. We have
$$p^\Theta(t,0,\log(x);s,a,y) = \frac{1}{\sqrt{\det(2\pi\Sigma)}}\exp\Big(-\frac12\big((a,y)-\mu\big)\Sigma^{-1}\big((a,y)-\mu\big)^\top\Big), \qquad (3.16)$$
where, thanks to (3.7)-(3.8),
$$\mu = \big(\mathbb E_t\big[A^t_{\tau_1}\big],\; \mathbb E_t\big[X^{t,x,\Theta}_{\tau_1}\big]\big) \qquad (3.17)$$
and
$$\Sigma = \begin{pmatrix} \mathrm{Var}_t(A^t_{\tau_1}) & \mathrm{Cov}_t(A^t_{\tau_1}, X^{t,x,\Theta}_{\tau_1}) \\ \mathrm{Cov}_t(A^t_{\tau_1}, X^{t,x,\Theta}_{\tau_1}) & \mathrm{Var}_t(X^{t,x,\Theta}_{\tau_1}) \end{pmatrix}. \qquad (3.18)$$
Still by (3.7)-(3.8), we see that only $\mathbb E_t\big[X^{t,x,\Theta}_{\tau_1}\big]$ depends upon $x$. Thus we get:
$$\frac{\partial p^\Theta(t,0,\log(x);s,a,y)}{\partial x} = \frac1x\Big(\Sigma^{-1}_{2,1}\,a + \Sigma^{-1}_{2,2}\Big(\log(y/x) - \int_t^{\tau_1}\alpha^{t,\Theta}_r\,dr + \frac{\sigma^2}{2}(\tau_1-t)\Big)\Big)\, p^\Theta(t,0,\log(x);s,a,y). \qquad (3.19)$$
Reinjecting this equality into (3.15) and rewriting the result in terms of expectations, we finally get the result. □

The function $\Delta(t_i,\cdot)$ is computed using the Monte Carlo estimator given in (3.12), as in (3.9), where we simulate in addition the weight $H$.

Sensitivities with respect to the interest rates curve.
We now consider the derivatives with respect to the interest rates curve. For $i = 1,2,3$, we want to compute:
$$\frac{\partial\ell}{\partial\theta_i}(t,x,\Theta) = \frac{\partial}{\partial\theta_i}\,\mathbb E^Q\Big[e^{-\int_t^T r^{t,\Theta}_s\,ds}\,G\big(S^{t,x,\Theta}\big)\Big].$$

Proposition 3.3.
For all $(t,x,\Theta)\in[0,T]\times(0,\infty)\times\mathbb R^3$ and all $i=1,2,3$, we have the following identity (where we have set $\tau_0 = t$):
$$\frac{\partial\ell}{\partial\theta_i}(t,x,\Theta) = \mathbb E^Q\Big[e^{-\int_t^T r^{t,\Theta}_s\,ds}\,G\big(S^{t,x,\Theta}\big)\,H^{x,\Theta}_i\Big(\big(\xi^t_{\tau_l},\; \textstyle\int_{\tau_{l-1}}^{\tau_l}\xi^t_u\,du,\; S^{t,x,\Theta}_{\tau_l}\big)_{l=1,\dots,\kappa}\Big)\Big], \qquad (3.20)$$
with
$$H^{x,\Theta}_i\big((r_\ell,a_\ell,s_\ell)_{\ell=1,\dots,\kappa}\big) = -\int_t^{\tau_\kappa} h^{t,i}_s\,ds + \sum_{\ell=1}^{\kappa}\Big(\int_{\tau_{\ell-1}}^{\tau_\ell} h^{t,i}_s\,ds\Big)\Big(\big(\Sigma^{-1}_{\tau_{\ell-1},\tau_\ell}\big)_{3,1}\big(r_\ell-\mu^1_{\tau_{\ell-1},\tau_\ell}\big) + \big(\Sigma^{-1}_{\tau_{\ell-1},\tau_\ell}\big)_{3,2}\big(a_\ell-\mu^2_{\tau_{\ell-1},\tau_\ell}\big) + \big(\Sigma^{-1}_{\tau_{\ell-1},\tau_\ell}\big)_{3,3}\big(\log(s_\ell)-\mu^3_{\tau_{\ell-1},\tau_\ell}\big)\Big), \qquad (3.21)$$
where $\mu_{s,u}$ and $\Sigma_{s,u}$ are the mean and the covariance matrix of the Gaussian vector $(\xi^t_u, A^t_u, X^{t,x,\Theta}_u)$ conditionally upon $\mathcal F_s$.

Proof. Performing an analysis similar to the one above,
$$\frac{\partial\ell}{\partial\theta_i}(t,x,\Theta) = \frac{\partial e^{-\int_t^T\alpha^{t,\Theta}_s\,ds}}{\partial\theta_i}\,\mathbb E^Q\Big[e^{-\int_t^T\xi^t_s\,ds}\,G\big(S^{t,x,\Theta}\big)\Big] + e^{-\int_t^T\alpha^{t,\Theta}_s\,ds}\int_{\mathbb R^{2\kappa}} e^{-a_1}\prod_{\ell=1}^{\kappa-1}e^{-a_{\ell+1}}\,G(e^{x_1},\dots,e^{x_\kappa})\,\frac{\partial\big(p^\Theta(t,0,\log(x);\tau_1,a_1,x_1)\cdots p^\Theta(\tau_{\kappa-1},a_{\kappa-1},x_{\kappa-1};\tau_\kappa,a_\kappa,x_\kappa)\big)}{\partial\theta_i}\,da_1\cdots da_\kappa\,dx_1\cdots dx_\kappa. \qquad (3.22)$$
Here, the computations are more involved, since every density depends upon $\theta_i$, but the idea is the same as before. The only difference is that, when we use (3.7)-(3.8) to differentiate, the short rate itself appears in the formulae. This is not a problem, as we can rewrite the previous integral as an integral over $\mathbb R^{3\kappa}$, with the short rate process taken at times $\tau_l$, $l=1,\dots,\kappa$, as new variables to integrate against. □

The quantity $\frac{\partial\ell}{\partial\theta_i}$ ($i=1,2,3$) is computed using the Monte Carlo estimator of formula (3.20).
Then solving the system (2.18) allows us to obtain the coefficients $\rho_1$, $\rho_2$, $\rho_3$. This method of calculating derivatives allows us to compute the function $\ell(t,x,\Theta)$ and its four derivatives with only one Monte Carlo simulation. Furthermore, given a risk-neutral scenario, each quantity involved in formulae (3.13) and (3.21) can be computed exactly by integrating the elementary functions $h^{t,i}$ and by inverting real symmetric matrices of size $3\times3$. Thus, the weight functions $H^{t,x,\Theta}$, $H^{t,x,\Theta}_i$ are easily and accurately computed.

The nested simulation approach requires the approximation of the function $\ell$ and its derivatives $\frac{\partial\ell}{\partial x}$, $\frac{\partial\ell}{\partial\theta_i}$, $i=1,2,3$, for each path $(\bar S^j_t,\bar\Theta^j_t)_{t\in\Gamma}$ of the market parameters. These values are computed on the fly, which is quite time consuming. We suggest here an alternative method, which pre-computes the quantities of interest ($\ell$ and its derivatives) and stores them. The requested value for a given market parameter is then obtained by an interpolation procedure.

A first simple approach is to consider an equidistant grid on the domain $A := \prod_{l=1}^d [m_l, M_l]$, which is a truncation of the support of $X$ ($\subset\mathbb R^d$, $d=4$ in our setting). One can then use multilinear interpolation to reconstruct the function on the whole space. If one sets $2^p$ points in one dimension, the total number of points will be $2^{dp}$ for one function at one given time, and overall $(n+1)2^{dp}$ values to store (here $d=4$). This rapidly becomes too big, especially if one allows the number of market parameters to grow. This is a typical example of the "curse of dimensionality" encountered in numerical analysis when dealing with high-dimensional problems.

Instead of considering a regular grid, we shall rely on sparse grids, which lower the number of points required to store the numerical approximation of the function. We now briefly present the main concepts linked to sparse grids; see [2] for a comprehensive survey.
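The hierarchical construction underlying sparse grids is easiest to see in dimension one. The sketch below (our own illustration, not taken from StOpt; the function names are ours) interpolates a function on $[0,1]$ with hat functions, adding at each level only the nodes with odd indices and storing the hierarchical surpluses as coefficients.

```python
import numpy as np

def hat(y):
    """Standard hat function: 1 - |y| on [-1, 1], zero elsewhere."""
    return np.maximum(0.0, 1.0 - np.abs(y))

def hierarchical_interpolant(f, level):
    """1D hierarchical hat-function interpolation of f on [0, 1].
    Level 0 carries the two endpoints; level l > 0 adds only the nodes
    j * 2^-l with j odd. Coefficients are hierarchical surpluses."""
    nodes, surpluses = [], []
    def current(y):
        return sum(s * hat((y - x) / h) for (x, h), s in zip(nodes, surpluses))
    for x in (0.0, 1.0):                    # level 0: endpoints, support 1
        nodes.append((x, 1.0)); surpluses.append(f(x) - current(x))
    for l in range(1, level + 1):
        h = 2.0 ** (-l)
        for j in range(1, 2 ** l, 2):       # odd indices only
            x = j * h
            nodes.append((x, h)); surpluses.append(f(x) - current(x))
    return current

f = lambda y: np.sin(np.pi * y)
approx = hierarchical_interpolant(f, level=6)
ys = np.linspace(0.0, 1.0, 257)
err = max(abs(f(y) - approx(y)) for y in ys)
```

The multi-dimensional sparse grid below is a tensorised version of this construction, in which only the level combinations with small total level are retained.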
In our numerical examples (see Section 4), the sparse grid is computed using the StOpt C++ library [5].

For each multi-index $\mathbf k\le\mathbf l$, we define a grid mesh $h_k = 2^{-k}$ and grid points
$$\check y_{\mathbf k,\mathbf i} = \big(m_1 + i_1(M_1-m_1)h_{k_1},\,\dots,\, m_d + i_d(M_d-m_d)h_{k_d}\big), \quad 0\le\mathbf i\le 2^{\mathbf k}.$$
Using the hat function
$$y\in\mathbb R \mapsto \varphi(y) := \begin{cases} 1-|y| & \text{if } y\in[-1,1],\\ 0 & \text{otherwise},\end{cases} \qquad (3.23)$$
we can associate to the previous grid a set of nodal basis functions:
$$y\in\mathbb R^d \mapsto \phi_{\mathbf k,\mathbf i}(y;A) = \prod_{l=1}^d \varphi\Big(\frac{y_l-\check y_{k_l,i_l}}{(M_l-m_l)h_{k_l}}\Big).$$
When using a "full" linear interpolation, the function is approximated using the whole set of nodal basis functions at the finest level. Instead, we consider the sparse grid nodal space of order $\kappa$ defined by
$$V_\kappa := \mathrm{span}\{\phi_{\mathbf l,\mathbf j} \,;\, (\mathbf l,\mathbf j)\in\mathcal I_\kappa(A)\},$$
where
$$\mathcal I_\kappa := \Big\{(\mathbf l,\mathbf j)\,:\, 0\le\sum_{i=1}^d l_i\le\kappa\,;\; 0\le\mathbf j\le 2^{\mathbf l}\,;\; (l_i>0 \text{ and } j_i \text{ is odd}) \text{ or } (l_i=0),\ \text{for } i=1,\dots,d\Big\}. \qquad (3.24)$$
For a function $\psi : A\to\mathbb R$ with support in $A$, we define its associated $V_\kappa$-interpolator by
$$\pi^A_{V_\kappa}[\psi](y) := \sum_{(\mathbf l,\mathbf j)\in\mathcal I_\kappa(A)} \theta_{\mathbf l,\mathbf j}(\psi;A)\,\phi_{\mathbf l,\mathbf j}(y;A), \qquad (3.25)$$
where the operator $\theta_{\mathbf l,\mathbf j}$ can be defined recursively in terms of $r$, the dimension of $\mathbf l$, by:
$$\theta_{\mathbf l,\mathbf j}(\psi;A) = \begin{cases} \psi(\check y_{\mathbf l,\mathbf j}) & r=0,\\[2pt] \theta_{\mathbf l^-,\mathbf j^-}\big(\psi(\cdot,\check y_{l_r,j_r});A^-\big) & l_r=0,\\[2pt] \theta_{\mathbf l^-,\mathbf j^-}\big(\psi(\cdot,\check y_{l_r,j_r});A^-\big) - \tfrac12\,\theta_{\mathbf l^-,\mathbf j^-}\big(\psi(\cdot,\check y_{l_r,j_r-1});A^-\big) - \tfrac12\,\theta_{\mathbf l^-,\mathbf j^-}\big(\psi(\cdot,\check y_{l_r,j_r+1});A^-\big) & l_r>0,\end{cases} \qquad (3.26)$$
where, for a hypercube $A = \prod_{l=1}^d [m_l,M_l]$, $A^- := \prod_{l=1}^{d-1}[m_l,M_l]$, and for a multi-index $\mathbf k$ of dimension $r\ge1$, $\mathbf k^- = (k_1,\dots,k_{r-1})$.

Let us now introduce the approximation that we use to compute the loss distribution, namely
$$\ell_{\mathcal S}(\cdot) := \pi_{V_\kappa}[\ell_{\mathcal N}](\cdot), \qquad \Big(\frac{\partial\ell}{\partial x}(t,\cdot)\Big)_{\mathcal S} := \pi_{V_\kappa}\Big[\Big(\frac{\partial\ell}{\partial x}(t,\cdot)\Big)_{\mathcal N}\Big](\cdot) \qquad (3.27)$$
and
$$\Big(\frac{\partial\ell}{\partial\theta_i}(t,\cdot)\Big)_{\mathcal S} := \pi_{V_\kappa}\Big[\Big(\frac{\partial\ell}{\partial\theta_i}(t,\cdot)\Big)_{\mathcal N}\Big](\cdot), \quad i\in\{1,2,3\},\; t\in\Gamma.$$
These functions are built by computing the coefficients appearing in (3.25), which are then stored in memory. For $\ell_{\mathcal S}$, say, this amounts to computing the function $\ell_{\mathcal N}$ on the sparse grid $V_\kappa$, which is done by Monte Carlo simulation as in the previous approach (recall the definition of $\ell_{\mathcal N}$ in (3.9)).

Complexity
The main limitation of this method is the memory usage and the time needed to pre-compute the functions, which are proportional to the number of points in the grid. This number can be estimated to be of order $O\big(2^{\kappa-d+1}\frac{(\kappa-d+1)^{d-1}}{(d-1)!}\big)$, see Proposition 4.1 in [2]. Let us insist on the fact that this is done "offline", in contrast with the nested simulation approach. For the "online" computations, the main effort lies in the evaluation of the function, which is slightly more involved than a linear interpolation and is of order $O(\kappa)$, where $\kappa$ is the maximum level chosen.

Remark 3.1. The computations at each point being independent, this sparse grid approach can easily be parallelised, hence further improving the gain in time observed in Subsection 4.1.
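To make the cardinality estimate concrete, one can simply enumerate the index set (3.24) and compare with a full tensor grid. The helper below is our own sketch (it assumes the convention that level $0$ carries the two endpoints and that level $l_i > 0$ carries the $2^{l_i-1}$ odd-index nodes); it is not part of the StOpt library.

```python
def level_multi_indices(d, kappa):
    """All level multi-indices l in N^d with |l|_1 <= kappa."""
    if d == 0:
        yield ()
        return
    for l0 in range(kappa + 1):
        for rest in level_multi_indices(d - 1, kappa - l0):
            yield (l0,) + rest

def sparse_grid_size(d, kappa):
    """Number of points in the sparse index set (3.24): two endpoints at
    level 0, and 2^(l-1) odd-index nodes at each level l > 0."""
    total = 0
    for l in level_multi_indices(d, kappa):
        pts = 1
        for li in l:
            pts *= 2 if li == 0 else 2 ** (li - 1)
        total += pts
    return total

def full_grid_size(d, kappa):
    """Full tensor grid with mesh 2^-kappa in each direction."""
    return (2 ** kappa + 1) ** d

d, kappa = 4, 6
sparse, full = sparse_grid_size(d, kappa), full_grid_size(d, kappa)
```

For $d = 4$ and $\kappa = 6$ the sparse grid contains only a few thousand points, against the $65^4 \approx 1.8\times10^7$ points of the full grid, which is the whole point of the construction.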
The goal is to obtain, in an efficient way, a reasonable approximation of the risk associated to the loss distribution of the balance sheet. In this section, we explain why the methods introduced above are indeed good approximations of the risk indicators. We also study theoretically the numerical complexity of both methods in terms of memory and time consumption.

3.4.1 Error analysis
For the risk estimation, we investigate a root mean square error (rMSE) of the following form:
$$\epsilon_\upsilon := \mathbb E\big[|\varrho(p_\sharp\eta) - \varrho((p_\upsilon)_\sharp\eta_N)|^2\big]^{1/2}, \quad \text{for } \upsilon\in\{\mathcal N,\mathcal S\}.$$
The expectation operator $\mathbb E[\cdot]$ acts under $P^{\otimes N}\otimes Q^{\otimes M}$, namely it averages both over the simulation of the market parameters under the real-world measure used for calibration and over the risk-neutral evolution of the market model under the pricing measure.

The first observation is that, under reasonable assumptions on the risk measure used in the risk indicator, the error made in the numerical simulation can be separated into two main contributions: the error due to the sampling of the loss distribution, coming from the sampling of the market parameters, and the error made when approximating the different pricing and hedging functions.

Lemma 3.2.
Assume that $\varrho$ has a monotonic and cash-invariant lift $\Re$. Then
$$\epsilon_\upsilon \le \mathbb E\big[|\varrho(p_\sharp\eta) - \varrho(p_\sharp\eta_N)|^2\big]^{1/2} + \mathbb E\Big[\sup_{1\le j\le N}|p(X_j) - p_\upsilon(X_j)|^2\Big]^{1/2}.$$

Proof.
We denote by $\hat X_N(\omega)$ a random variable with distribution $\eta_N(\omega)$. Note that
$$p(\hat X_N(\omega)) \le p_\upsilon(\hat X_N(\omega)) + \sup_{1\le j\le N}|p(X_j(\omega)) - p_\upsilon(X_j(\omega))|, \qquad (3.28)$$
which, by monotonicity and cash invariance, leads to
$$\Re\big(p(\hat X_N(\omega))\big) \le \Re\big(p_\upsilon(\hat X_N(\omega))\big) + \sup_{1\le j\le N}|p(X_j(\omega)) - p_\upsilon(X_j(\omega))|.$$
By symmetry, we easily get
$$|\varrho(p_\sharp\eta) - \varrho((p_\upsilon)_\sharp\eta_N(\omega))| \le |\varrho(p_\sharp\eta) - \varrho(p_\sharp\eta_N(\omega))| + \sup_{1\le j\le N}|p(X_j(\omega)) - p_\upsilon(X_j(\omega))|, \qquad (3.29)$$
and the proof is then concluded using the Minkowski inequality. □

The error due to the approximation of the function $p$ is well understood when the function is smooth enough. Note that the asset side of the function is quite involved, and we will not attempt to obtain conditions for smoothness of the overall function $p$. We will now simply review the error made on the liability part $\ell(1,\cdot)$, assuming that the mapping $G$ is bounded and
$$(x,\Theta)\mapsto \beta^{1,\Theta}_T\,G\big(S^{1,x,\Theta}\big) \in C_b. \qquad (3.30)$$
Even though this cannot be almost surely true in the model presented above, we will assume in the discussion below that $\beta^{1,\Theta}_T$ is bounded. A more precise analysis should take care of these extreme events arising with small probability. Another possibility would be to force the interest rate to be non-negative, by truncation or by considering a CIR-type model in (2.2).

Lemma 3.3.
Assume that (3.30) holds true, and recall the definition of $\ell_{\mathcal N}$ in (3.9). Then
$$\mathbb E\Big[\max_{1\le j\le N}|\ell(X_j) - \ell_{\mathcal N}(X_j)|\Big] \le C\sqrt{\frac{\log(N)}{M}}. \qquad (3.31)$$

Proof.
We denote by $c$ the bound on the mapping $(x,\Theta)\mapsto\beta^{1,\Theta}_T G(S^{1,x,\Theta})$ (recall the discussion after equation (3.30)), and thus the bound on $\ell$. For the reader's convenience, we introduce
$$\Sigma^j_M := \sum_{k=1}^M\Big(\ell(X_j) - \beta^{1,X_j,k}_T\,G\big(S^{1,X_j,k}\big)\Big),$$
and observe that $\mathbb E[\Sigma^j_M] = 0$; recall also that the $(\Sigma^j_M)_j$ are i.i.d. Since each summand is bounded by $2c$, Hoeffding's inequality gives
$$P\big(|\Sigma^j_M| > z\big) \le 2\exp\Big(-\frac{z^2}{8c^2M}\Big).$$
Using the independence property, we obtain
$$P\Big(\max_j|\Sigma^j_M| \le z\Big) \ge \Big(1 - 2\exp\Big(-\frac{z^2}{8c^2M}\Big)\Big)^N. \qquad (3.32)$$
Now set $\xi := 2c\sqrt{2M\log(2N)}$ and compute
$$\mathbb E\Big[\max_{1\le j\le N}|\Sigma^j_M|\Big] = \int_0^\infty P\Big(\max_j|\Sigma^j_M| > z\Big)\,dz \le \xi + \int_\xi^\infty\Big\{1 - \Big(1 - 2\exp\Big(-\frac{z^2}{8c^2M}\Big)\Big)^N\Big\}\,dz.$$
Now, we observe that $2\exp(-\frac{z^2}{8c^2M}) \le 1$ for $z\ge\xi$. Using the fact that $1-(1-u)^N \le Nu$ for $u\in[0,1]$, we get
$$\mathbb E\Big[\max_{1\le j\le N}|\Sigma^j_M|\Big] \le \xi + 2N\int_\xi^\infty \exp\Big(-\frac{z^2}{8c^2M}\Big)\,dz \le \xi + 2c\sqrt M \le C\sqrt{M\log(N)},$$
for $N\ge2$. Since $\ell(X_j) - \ell_{\mathcal N}(X_j) = \Sigma^j_M/M$, dividing by $M$ yields (3.31) and concludes the proof. □

We conclude this section by giving the overall estimation error induced by the numerical procedure above. We will admit that the upper bound given for $\ell(1,\cdot)$ in Lemma 3.3 holds true for the PnL function $p(1,\cdot)$, with a scaling by $n$ coming from the number of rebalancing dates.

Theorem 3.1.
Assume that $\varrho$ is a spectral risk measure. Then the following holds, for some $\alpha > 0$:

1. for the nested simulation approach,
$$\epsilon_{\mathcal N} \le C\Big(N^{-\alpha} + n\sqrt{\frac{\log(N)}{M}}\Big); \qquad (3.33)$$

2. for the sparse grid approach with maximum level $\kappa$,
$$\epsilon_{\mathcal S} \le C\Big(N^{-\alpha} + n\Big\{\sqrt{\frac{\log(N)}{M}} + 2^{-\kappa}(\kappa-d+1)^{d-1}\Big\}\Big). \qquad (3.34)$$

Proof.
1. We first show that
$$\mathbb E\Big[\max_{1\le j\le N}|\ell(X_j) - \ell_{\mathcal S}(X_j)|\Big] \le C\Big(\sqrt{\frac{\log(N)}{M}} + 2^{-\kappa}\kappa^{d-1}\Big). \qquad (3.35)$$
Indeed, we have
$$\ell_{\mathcal S} = \pi_{V_\kappa}[\ell_{\mathcal N}] = \ell_{\mathcal N} + \pi_{V_\kappa}[\ell_{\mathcal N}] - \ell_{\mathcal N},$$
and we observe that
$$\pi_{V_\kappa}[\ell_{\mathcal N}] - \ell_{\mathcal N} = \frac1M\sum_{k=1}^M\Big(\pi_{V_\kappa}\big[e^{-\int_0^T r^{1,\cdot,k}_s\,ds}G(S^{1,\cdot,k})\big] - e^{-\int_0^T r^{1,\cdot,k}_s\,ds}G(S^{1,\cdot,k})\Big).$$
Let us denote by $(x,\Theta)\mapsto\varphi_k(x,\Theta) = e^{-\int_0^T r^{1,\Theta,k}_s\,ds}G(S^{1,x,\Theta,k})$, which is a random function, as it depends on the random realisation of the $(r,S)$ process. Under (3.30), $\varphi_k$ is smooth enough to apply the results of Proposition 4.1 in [2], and we obtain
$$|\pi_{V_\kappa}[\ell_{\mathcal N}] - \ell_{\mathcal N}|_\infty \le \frac1M\sum_{k=1}^M|\pi_{V_\kappa}[\varphi_k] - \varphi_k|_\infty \le C\,2^{-\kappa}\kappa^{d-1}. \qquad (3.36)$$
We then observe that
$$\mathbb E\Big[\max_{1\le j\le N}|\ell(X_j) - \ell_{\mathcal S}(X_j)|\Big] \le C\Big(\mathbb E\Big[\max_{1\le j\le N}|\ell(X_j) - \ell_{\mathcal N}(X_j)|\Big] + |\pi_{V_\kappa}[\ell_{\mathcal N}] - \ell_{\mathcal N}|_\infty\Big).$$
The proof of (3.35) is concluded by combining the above inequality with (3.36) and Lemma 3.3.

2. We now prove (3.33). Applying Lemma 3.2, we obtain
$$\epsilon_{\mathcal N} \le \mathbb E\big[|\varrho(p_\sharp\eta) - \varrho(p_\sharp\eta_N)|^2\big]^{1/2} + \mathbb E\Big[\max_{1\le j\le N}|\ell(X_j) - \ell_{\mathcal N}(X_j)|^2\Big]^{1/2}. \qquad (3.37)$$
The second term on the right-hand side of the above inequality is controlled by using Lemma 3.3. We now study the first term on the right-hand side, which is the error introduced by the sampling of the loss distribution. Applying Corollary 11 in [7] to the spectral risk measure, we first get
$$\mathbb E\big[|\varrho(p_\sharp\eta) - \varrho(p_\sharp\eta_N)|^2\big]^{1/2} \le C\,\mathbb E\big[W_1(\eta,\eta_N)^2\big]^{1/2}.$$
We then use Theorem 1 in [4] to bound the Wasserstein distance, which concludes the proof for this step.

3. To prove (3.34), we follow similar arguments as in step 2,
but using (3.35) instead of invoking Lemma 3.3. □
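The $\sqrt{\log(N)/M}$ factor of Lemma 3.3 is easy to check empirically on a toy example: the sketch below (the helper name and the sampling distribution are ours) averages, over a few replications, the worst error among $N$ independent Monte Carlo means of $M$ bounded mean-zero samples, and verifies that quadrupling $M$ roughly halves it.

```python
import numpy as np

def max_mc_error(N, M, n_rep=20, seed=0):
    """Average over replications of max_{1<=j<=N} |empirical mean|, for N
    independent estimators, each built from M i.i.d. Uniform(-1,1) draws."""
    rng = np.random.default_rng(seed)
    worst = []
    for _ in range(n_rep):
        samples = rng.uniform(-1.0, 1.0, size=(N, M))
        worst.append(np.abs(samples.mean(axis=1)).max())
    return float(np.mean(worst))

e_base = max_mc_error(N=100, M=400)
e_finer = max_mc_error(N=100, M=1600)  # 4x the inner sample size
ratio = e_base / e_finer               # expected to be close to 2
```

The $\log N$ factor is only visible when $N$ grows; at fixed $N$, the maximal error inherits the usual $M^{-1/2}$ Monte Carlo rate, as the ratio above illustrates.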
Remark 3.2. We can compare the bound obtained for the nested simulation with the ones in [6]. Using a different approach, the authors prove a very nice bound on the overall error, given by
$$C\Big(\frac{1}{\sqrt N} + \frac1M\Big),$$
for the V@R (which is not a spectral risk measure) and the AV@R. Note that the $\frac1M$ term is obtained by cancellation of the first-order term through an error expansion. It would be interesting to understand under which assumptions their bound can be retrieved in our setting of general spectral risk measures. This topic is left for further research.

We conclude this section with a short account of the numerical complexity of the two methods. The nested simulation approach is a pure "online" method, which is very simple to implement but has a huge drawback in terms of running time. Each time an estimation is requested, the overall numerical complexity is of order $nNM$, where, recall, $n$ is the number of rebalancing dates, $M$ the number of samples for the risk-neutral simulation and $N$ the number of samples for the real-world simulation. The memory requirement comes only from the estimation of the loss distribution and is of order $N$.

As already mentioned, the sparse grid approach is both an "online" and an "offline" method. In terms of memory requirements, it is thus greedier than the nested simulation approach. On top of the memory needed to store the sample distribution (of order $N$), memory is also needed to store the sparse grid approximation $p_{\mathcal S}$; the requirement is of order $O\big(n\,2^{\kappa-d+1}\frac{(\kappa-d+1)^{d-1}}{(d-1)!}\big)$. In terms of running time, the gain is important, as the complexity of evaluating $p_{\mathcal S}$ is of $O(\kappa)$ only, where $\kappa$ is the maximum level used.

In the numerical applications below, we compare the loss distributions obtained via our two numerical procedures by computing the Wasserstein distance between the two empirical distributions.
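For one-dimensional empirical distributions, the Wasserstein distance and the risk measures all reduce to computations on order statistics. The short sketch below implements the estimators of (4.2), (4.3) and (4.4) given in the next section (the function names are ours).

```python
import numpy as np

def w1_empirical(x, y):
    """Wasserstein-1 distance between two empirical laws with the same
    number of atoms, via (4.2): mean gap between order statistics."""
    return float(np.mean(np.abs(np.sort(x) - np.sort(y))))

def var_avar(x, alpha):
    """Empirical V@R and AV@R at level alpha, as in (4.3)-(4.4)."""
    x = np.sort(x)
    N = len(x)
    i_alpha = int(np.ceil(alpha * N))  # (i_alpha - 1)/N < alpha <= i_alpha/N
    var = x[i_alpha - 1]               # x_(i_alpha), 1-based order statistic
    avar = ((i_alpha / N - alpha) * var + x[i_alpha:].sum() / N) / (1.0 - alpha)
    return float(var), float(avar)

# sanity check on the loss sample {1, ..., 10}
losses = np.arange(1.0, 11.0)
var75, avar75 = var_avar(losses, 0.75)  # V@R = 8, AV@R = 9.2
```

Sorting dominates the cost, so estimating all these indicators on a sample of size $N$ is an $O(N\log N)$ operation, negligible compared with producing the sample itself.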
Since the loss distribution is one-dimensional, we use the following formula [8]: for two probability distributions $\eta$ and $\tilde\eta$ on $\mathbb R$,
$$W_1(\eta,\tilde\eta) = \int_0^1 \big|F^{-1}_\eta(u) - F^{-1}_{\tilde\eta}(u)\big|\,du. \qquad (4.1)$$
In the setting of empirical distributions, the above distance is easily computed. Suppose $\eta = \frac1N\sum_{i=1}^N\delta_{x_i}$ and $\tilde\eta = \frac1N\sum_{i=1}^N\delta_{y_i}$. We straightforwardly compute
$$W_1(\eta,\tilde\eta) = \sum_{i=1}^N\int_{\frac{i-1}N}^{\frac iN}\big|F^{-1}_\eta(u) - F^{-1}_{\tilde\eta}(u)\big|\,du = \frac1N\sum_{i=1}^N|x_{(i)} - y_{(i)}|, \qquad (4.2)$$
where the subscript $(i)$ refers to the $i$-th order statistic of the sample, since $x_{(i)}$ (resp. $y_{(i)}$) is simply the $\frac iN$-quantile of $\eta$ (resp. $\tilde\eta$).

Besides the Wasserstein distance between the two empirical distributions, we will also compare the estimated V@R and AV@R, which are computed in a similar way. Indeed, for $\alpha\in(0,1)$, we have:
$$V@R_\alpha(\eta) = F^{-1}_\eta(\alpha) = x_{(i_\alpha)}, \qquad (4.3)$$
where $\frac{i_\alpha-1}N < \alpha \le \frac{i_\alpha}N$, $i_\alpha\in\{1,\dots,N\}$. For a given $\alpha\in(0,1)$, we observe that
$$AV@R_\alpha(\eta) = \frac1{1-\alpha}\int_\alpha^1 V@R_p(\eta)\,dp = \frac1{1-\alpha}\Big(\int_\alpha^{\frac{i_\alpha}N} V@R_p(\eta)\,dp + \sum_{i=i_\alpha}^{N-1}\int_{\frac iN}^{\frac{i+1}N} V@R_p(\eta)\,dp\Big),$$
which leads to
$$AV@R_\alpha(\eta) = \frac1{1-\alpha}\Big(\Big\{\frac{i_\alpha}N - \alpha\Big\}x_{(i_\alpha)} + \frac1N\sum_{i=i_\alpha}^{N-1}x_{(i+1)}\Big). \qquad (4.4)$$
Using the formulae (4.2), (4.3) and (4.4), we will now present numerical results showing the efficiency and usefulness of the sparse grid approach. We first start with a comparison with the classically used nested simulation approach.

We computed the empirical distribution of the PnL at horizon 1 year using the nested simulation approach (recall Section 3.2) and the sparse grid approach (recall Section 3.3). For both methods, we used a sample of size $N = 11000$ describing the real-world evolution of $S$ and $\Theta$.
$M$ | 100 | 500 | 1000 | 2000 | 10000
Wasserstein distance | 0.73 | 0.20 | 0.09 | 0.05 | 0

Table 1: Wasserstein distance, for the nested simulation approach, with respect to the distribution obtained for $M = 10000$.

A natural first choice for the number of inner simulations is $M \simeq \sqrt N$, that is $M = 100$. In practice however, we observe that convergence has not occurred yet for this choice, and we observe non-negligible changes in the risk measures taken into account, see Table 1. In this table, we compute the Wasserstein distance with respect to the distribution obtained for $M = 10000$. Operational constraints do not allow us to change the size of the sample used for the Monte Carlo simulations under $P$. We thus consider a sample of size $M = 2000$ for the risk-neutral Monte Carlo simulations.

Following Proposition 2.2, we calibrated a Gaussian model such that $(X_1, (\theta_1)_1, (\theta_2)_1, (\theta_3)_1)$ has the mean vector $\mu$ and covariance matrix $V$ given in (4.5) and (4.6). The risk-neutral simulations were computed by a Monte Carlo procedure, using the exact formulae available in the Hull & White and Black & Scholes setting we used, recall Proposition 3.1. The volatility parameter used in the Black & Scholes model is set to $\sigma = 0.$, while the parameters defining the Hull & White model are set to $a = 0.$ and $b = 0.$. Last, the correlation parameter between the two Brownian motions is set to $\rho = 0$.

In this setting, the nested simulations method was tested with the Put Lookback option described in Section 2.1, with maturity $T = 30$ years. Figure 2 shows the resulting PnL distribution.
Figure 2: PnL distribution, nested simulations

We next looked at the grid method. Figure 3 shows the resulting PnL distributions for sparse grids of levels 1, 2 and 3. For each level, we chose the number of risk-neutral simulations $M$ so that the error induced by the Monte Carlo estimation is small compared with the sparse interpolation error, meaning that we can run the program multiple times without changing the outcome significantly. Furthermore, as in the nested simulations case, we choose $M$ so that increasing $M$ has no effect on the distribution. Empirically, we chose $M = 20000$ for the sparse grids of levels 1, 2 and 3. Figure 4 compares the distribution obtained with nested simulations with the distribution obtained with the sparse grid of level 3. Table 2 shows a comparison of computational times, and Table 3 a V@R and AV@R comparison for the empirical distributions obtained in each case.

We observe that the computational time on the sparse grid of level 3 with $M = 20000$ is similar to the one for the nested simulations with only $M = 2000$. Moreover, a significant gain in time is obtained by the use of the sparse grid of level 1 only, which already gives good results, see Table 2. As already observed in Remark 3.1, this gain in time can be further improved by parallelisation of the computations. In addition, we observe that, once the computations on the grid are done, the PnL distribution is obtained almost straightforwardly. This is a key feature of the method, since the computations on the grid are to be kept: if one needs to change the distribution of $(S,\Theta)$ under $P$, say because the view of the risk management on the evolution of the market parameters has changed, then they can be re-used easily. In the next section, we give an application in this direction.
Figure 3: PnL distribution, sparse grids

Level of the sparse grid | $l=1$ | $l=2$ | $l=3$ | Nested simulations
Computations on the grid | 9 min 30 sec | 35 min 10 sec | 1h 58 min | n/a
Computation of the PnL distribution | 2 sec | 4 sec | 8 sec | 2h 15 min

Table 2: Computational times

Level of the sparse grid | $l=1$ | $l=2$ | $l=3$ | Nested simulations
Wasserstein distance | 0.13 | 0.06 | 0.05 | 0

Table 3: Wasserstein distance, V@R and AV@R comparison
Figure 4: PnL distribution, nested simulations versus sparse grid method
As mentioned above, an interesting feature of the sparse grid approach developed in this paper is the ability to change the distribution of the processes $X = \log(S)$ and $\Theta$ under the real-world probability $P$. In this section, we use the model described in Section 2.4 to simulate a first sample. Then, we consider some uncertainty on the estimated moments $\mu, V$ of $(X_1,\Theta_1)$ used to calibrate the Gaussian model: we only assume that the true moments lie in centered intervals around the estimates. In practice, we consider intervals of the form $[m\times c_-,\, m\times c_+]$, where $m$ is the estimated moment under consideration and the multiplicative factors $c_- < 1 < c_+$ are close to one.

To better understand the risk associated with this uncertainty under $P$, we simulate two "extreme" new samples of $(X,\Theta)$, where every moment taken into account to calibrate the model is multiplied by $c_-$ (resp. $c_+$), and, thanks to the grid computations done before with the initial model, we are in a position to compute almost instantaneously the empirical PnL distributions associated with these two new samples. Table 4 shows the Wasserstein distance between the initial distribution and the two obtained for the shifted parameters, as well as the V@R and AV@R obtained at different quantile levels. We observe that with these small changes the distributions are quite close to each other. The main discrepancies are obtained for the diminished moments.

Model | Initial model | Diminished moments | Augmented moments
Wasserstein distance | 0 | 0.030 | 0.0070

Table 4: Wasserstein distance and risk measures for the shifted models

Appendix

5.1 Proof of Proposition 2.1
In this subsection, we give the proof of Proposition 2.1 for completeness. We recall that, in the Hull & White model, the dynamics of the short rate are given by:
$$dr^{t,\Theta}_s = a\big(\mu^{t,\Theta}_s - r^{t,\Theta}_s\big)\,ds + b\,dB_s, \qquad (5.1)$$
with $a, b\in\mathbb R$. We will prove that the mean-reversion level $\mu^{t,\Theta}_s$ can be calibrated to the forward interest rate curve $f^\Theta(t,s)$ by:
$$\mu^{t,\Theta}_s = f^\Theta(t,s) + \frac1a\frac{\partial f^\Theta(t,s)}{\partial s} + \frac{b^2}{2a^2}\big(1 - e^{-2a(s-t)}\big). \qquad (5.2)$$
The method is to express the price of the zero-coupon bond $P(t,s)$ in the following way:
$$\mathbb E\Big[\exp\Big(-\int_t^s r^{t,\Theta}_u\,du\Big)\Big] = P(t,s) = \exp\Big(-\int_t^s f^\Theta(t,u)\,du\Big). \qquad (5.3)$$
Then, by comparing both sides, we can determine $f^\Theta(t,s)$. First, it is easy to find that the solution to (5.1) is
$$r^{t,\Theta}_s = r^{t,\Theta}_t e^{-a(s-t)} + a\int_t^s\mu^{t,\Theta}_u e^{-a(s-u)}\,du + b\int_t^s e^{-a(s-u)}\,dB_u.$$
Then, by a straightforward calculation, we have that
$$\int_t^s r^{t,\Theta}_u\,du = \frac{r^{t,\Theta}_t}{a}\big(1-e^{-a(s-t)}\big) + \int_t^s\mu^{t,\Theta}_u\big(1-e^{-a(s-u)}\big)\,du + \frac ba\int_t^s\big(1-e^{-a(s-u)}\big)\,dB_u.$$
So $\int_t^s r^{t,\Theta}_u\,du$ follows a normal distribution with mean
$$\mathbb E\Big[\int_t^s r^{t,\Theta}_u\,du\Big] = \frac{r^{t,\Theta}_t}{a}\big(1-e^{-a(s-t)}\big) + \int_t^s\mu^{t,\Theta}_u\big(1-e^{-a(s-u)}\big)\,du$$
and variance
$$\mathbb V\Big[\int_t^s r^{t,\Theta}_u\,du\Big] = \frac{b^2}{a^2}\int_t^s\big(1-e^{-a(s-u)}\big)^2\,du.$$
Now, comparing both sides of (5.3), we have that
$$\int_t^s f^\Theta(t,u)\,du = \mathbb E\Big[\int_t^s r^{t,\Theta}_u\,du\Big] - \frac12\,\mathbb V\Big[\int_t^s r^{t,\Theta}_u\,du\Big].$$
Thus
$$f^\Theta(t,s) = \frac{\partial}{\partial s}\mathbb E\Big[\int_t^s r^{t,\Theta}_u\,du\Big] - \frac12\frac{\partial}{\partial s}\mathbb V\Big[\int_t^s r^{t,\Theta}_u\,du\Big] = r^{t,\Theta}_t e^{-a(s-t)} + a\int_t^s\mu^{t,\Theta}_u e^{-a(s-u)}\,du - \frac{b^2}{a}\int_t^s\big(e^{-a(s-u)} - e^{-2a(s-u)}\big)\,du$$
$$= r^{t,\Theta}_t e^{-a(s-t)} + a\int_t^s\mu^{t,\Theta}_u e^{-a(s-u)}\,du - \frac{b^2}{2a^2}\big(1-e^{-a(s-t)}\big)^2. \qquad (5.4)$$
By straightforward differentiation, we have
$$\frac{\partial}{\partial s}f^\Theta(t,s) = -a\,r^{t,\Theta}_t e^{-a(s-t)} - a^2\int_t^s\mu^{t,\Theta}_u e^{-a(s-u)}\,du + a\,\mu^{t,\Theta}_s - \frac{b^2}{a}\big(e^{-a(s-t)} - e^{-2a(s-t)}\big). \qquad (5.5)$$
Now, by (5.4) and (5.5), we can easily verify that (5.2) is valid. □

5.2 Proof of Proposition 2.2

We provide a recursive proof of Proposition 2.2, which allows the coefficients defining the processes to be computed effectively. Suppose, more generally, that a vector $\mu\in\mathbb R^n$ and a covariance matrix $V\in\mathbb R^{n\times n}$ are given. We look for $n$ processes $X^i$ ($i = 1,\dots,n$) defined by:
$$X^i_t = X^i_0 + b_i t + \sum_{j=1}^n c_{ij}W^j_t, \qquad (5.6)$$
where the $W^j$ ($j = 1,\dots,n$) are $n$ independent Brownian motions, and $b\in\mathbb R^n$, $C = (c_{ij})\in\mathbb R^{n\times n}$.

Proposition 5.1.
There is at most one $(b, C) \in \mathbb{R}^n \times \mathbb{R}^{n \times n}$ such that:
• $c^{ij} = 0$ whenever $i > j$;
• $\mathbb{E}[X^i_1] = \mu^i$ ($i = 1, \dots, n$);
• $\mathrm{Cov}(X^i_1, X^j_1) = V^{ij}$ ($i, j = 1, \dots, n$).

Proof. We have $\mathbb{E}[X^i_1] = X^i_0 + b^i$, so setting $b^i := \mu^i - X^i_0$ ensures $\mathbb{E}[X^i_1] = \mu^i$ for all $i$. We next determine the matrix $C$ thanks to a recursive algorithm.

Ascending step:
Let $i, l \in \{1, \dots, n\}$, and assume that the $c^{ik}$, $k > l$, and the $c^{lk}$, $k \geq l$, are determined. Then we can determine $c^{il}$. Indeed, if $i > l$, we set $c^{il} = 0$. If $i < l$, we have:
$$ V^{il} = \mathrm{Cov}(X^i_1, X^l_1) \qquad (5.7) $$
$$ = \sum_{j=1}^{n} c^{ij} c^{lj} \qquad (5.8) $$
$$ = \sum_{j=l}^{n} c^{ij} c^{lj} \qquad (5.9) $$
$$ = c^{il} c^{ll} + \sum_{j>l} c^{ij} c^{lj}. \qquad (5.10) $$
Thus we set:
$$ c^{il} = \frac{1}{c^{ll}}\Big( V^{il} - \sum_{j>l} c^{ij} c^{lj} \Big). \qquad (5.11) $$

Back step:
Let $l \in \{1, \dots, n\}$ and assume that the $c^{lk}$ are determined for $k > l$. Then we can determine $c^{ll}$. Indeed:
$$ V^{ll} = \mathbb{V}(X^l_1) = \sum_{j=1}^{n} (c^{lj})^2 = \sum_{j=l}^{n} (c^{lj})^2 = (c^{ll})^2 + \sum_{j>l} (c^{lj})^2. \qquad (5.12) $$
Thus we set:
$$ c^{ll} = \sqrt{\, V^{ll} - \sum_{j>l} (c^{lj})^2 \,}. \qquad (5.13) $$
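In matrix form, the two steps above construct an upper-triangular matrix $C$ with $C C^{\top} = V$, filling the columns from the last to the first. A minimal Python sketch of the recursion (the function name and the example covariance matrix are ours, for illustration only):

```python
import numpy as np

def upper_triangular_factor(V):
    """Recursion of the proof: build upper-triangular C with C @ C.T = V,
    filling each column from the last one to the first (back step for the
    diagonal entry, ascending step for the entries above it)."""
    n = V.shape[0]
    C = np.zeros((n, n))
    for l in range(n - 1, -1, -1):
        # Back step, eq. (5.13): diagonal entry of column l
        C[l, l] = np.sqrt(V[l, l] - np.sum(C[l, l + 1:] ** 2))
        for i in range(l - 1, -1, -1):
            # Ascending step, eq. (5.11): entries above the diagonal
            C[i, l] = (V[i, l] - np.sum(C[i, l + 1:] * C[l, l + 1:])) / C[l, l]
    return C

# Example with a randomly generated covariance matrix
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
V = A @ A.T
C = upper_triangular_factor(V)
assert np.allclose(C @ C.T, V)        # covariances are matched
assert np.allclose(C, np.triu(C))     # c^{ij} = 0 whenever i > j
```

The drift is then simply $b = \mu - X_0$, componentwise, as in the first part of the proof.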
5.3 Proofs of Lemma 3.1 and Proposition 3.1

We prove here Lemma 3.1 and Proposition 3.1, which give a recursive procedure to simulate exactly under $\mathbb{Q}$.

Proof of Lemma 3.1. Let $(t, \Theta) \in [0, T] \times \mathbb{R}$ and consider the process $r^{t,\Theta} = (r^{t,\Theta}_s)_{s \in [t,T]}$ defined by (2.2)-(2.3). Let $s \in [t, T]$. An application of Itô's formula gives:
$$ e^{as} r^{t,\Theta}_s = e^{at} r^{t,\Theta}_t + a\int_t^s e^{au} \mu^{t,\Theta}_u\,\mathrm{d}u + b\int_t^s e^{au}\,\mathrm{d}B_u, \qquad (5.14) $$
and an easy computation using equality (2.3) shows that:
$$ a\int_t^s e^{-a(s-u)} \mu^{t,\Theta}_u\,\mathrm{d}u = \alpha^{t,\Theta}_s - \alpha^{t,\Theta}_t e^{-a(s-t)}, \qquad (5.15) $$
where $\alpha^{t,\Theta}$ is defined by (3.4). In addition, if $\xi^t$ is defined by (3.5), applying Itô's formula again gives:
$$ \xi^t_s = b\int_t^s e^{-a(s-u)}\,\mathrm{d}B_u. \qquad (5.16) $$
Thus:
$$ r^{t,\Theta}_s = e^{-a(s-t)} r^{t,\Theta}_t + \alpha^{t,\Theta}_s - e^{-a(s-t)} \alpha^{t,\Theta}_t + \xi^t_s, \qquad (5.17) $$
which ends the proof, as $r^{t,\Theta}_t = \alpha^{t,\Theta}_t$ by (5.4).
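Formula (5.16) is what makes exact simulation possible: $\xi^t$ is a Wiener integral, so by the Itô isometry its law over a step of any size is Gaussian, with variance $b^2\big(1 - e^{-2a(u-s)}\big)/(2a)$ conditionally on $\mathcal{F}_s$. The Python sanity check below (parameter values are illustrative, not the paper's) compares one exact draw over a large step with a fine Euler scheme on the same dynamics:

```python
import numpy as np

# Illustrative parameters (not from the paper)
a, b, T, n_paths = 0.5, 0.1, 2.0, 200_000
rng = np.random.default_rng(1)

# Exact one-step sampling of xi_T given xi_0 = 0: by the Ito isometry,
# xi_T ~ N(0, b^2 (1 - e^{-2aT}) / (2a))
std = b * np.sqrt((1 - np.exp(-2 * a * T)) / (2 * a))
xi_exact = std * rng.standard_normal(n_paths)

# Fine Euler scheme for d xi = -a xi dt + b dB, for comparison
n_steps = 500
dt = T / n_steps
xi = np.zeros(n_paths)
for _ in range(n_steps):
    xi += -a * xi * dt + b * np.sqrt(dt) * rng.standard_normal(n_paths)

assert abs(xi_exact.std() - xi.std()) < 3e-3
```

The exact scheme needs a single Gaussian draw per step, whatever the step size, which is the point of the recursive procedure.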
We now turn to the proof of Proposition 3.1.

Proof of Proposition 3.1. Let $t \leq s \leq u \leq T$. Itô's formula implies that the triplet $(\xi^t_r, A^{t,s}_r, X^{t,x,\Theta}_r)_{r \in [s,u]}$ is the solution of the following linear stochastic differential equation:
$$ \mathrm{d}\begin{pmatrix} \xi_r \\ A_r \\ X_r \end{pmatrix} = \left[ \begin{pmatrix} -a & 0 & 0 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix} \begin{pmatrix} \xi_r \\ A_r \\ X_r \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ \alpha^{t,\Theta}_r - \frac{\sigma^2}{2} \end{pmatrix} \right] \mathrm{d}r + \begin{pmatrix} b & 0 \\ 0 & 0 \\ \sigma\rho & \sigma\sqrt{1-\rho^2} \end{pmatrix} \begin{pmatrix} \mathrm{d}B_r \\ \mathrm{d}W_r \end{pmatrix}, \quad r \in [s, u], \qquad (5.18) $$
with the initial conditions $\xi_s = \xi^t_s$, $A_s = 0$, $X_s = X^{t,x,\Theta}_s$. This linear equation has a closed-form solution, and we find:
$$ \xi^t_u = e^{-a(u-s)} \xi^t_s + b\int_s^u e^{-a(u-r)}\,\mathrm{d}B_r, \qquad (5.19) $$
$$ A^{t,s}_u = \frac{\xi^t_s}{a}\big(1 - e^{-a(u-s)}\big) + \frac{b}{a}\int_s^u \big(1 - e^{-a(u-r)}\big)\,\mathrm{d}B_r, \qquad (5.20) $$
$$ X^{t,x,\Theta}_u = X^{t,x,\Theta}_s + \int_s^u \alpha^{t,\Theta}_r\,\mathrm{d}r - \frac{\sigma^2}{2}(u-s) + \frac{\xi^t_s}{a}\big(1 - e^{-a(u-s)}\big) + \sigma\sqrt{1-\rho^2}\int_s^u \mathrm{d}W_r + \frac{b}{a}\int_s^u \big(1 - e^{-a(u-r)}\big)\,\mathrm{d}B_r + \sigma\rho\int_s^u \mathrm{d}B_r. \qquad (5.21) $$
Conditionally upon $\mathcal{F}_s$, the vector $(\xi^t_u, A^{t,s}_u, X^{t,x,\Theta}_u)$ is Gaussian, and the expectations and covariances given in the Proposition are easily computed thanks to the above formulae.

5.4 Comparison with Automatic Differentiation

We use the stan math C++ library [3], which allows to easily implement a reverse-mode Automatic Differentiation procedure in order to deduce the derivatives directly from the Monte Carlo computation of the function $L$. We compare the results obtained with the weights method developed here with the results obtained by Automatic Differentiation, and we also compare the computational times. Precisely, we computed the derivatives of $\ell$ with respect to the variables $(x, \theta^1, \theta^2, \theta^3)$ at a family of points $(x_i, \theta^1_i, \theta^2_i, \theta^3_i)$. For the Automatic Differentiation, the risk-neutral simulations are used to compute $\ell$ only, while for the approach involving the computation of weights, they are used to compute $\ell$ and its four derivatives. Table 5 sums up the time taken for the computations. Clearly, the gain in time resulting from using the weights algorithm is really significant.
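To illustrate the idea behind derivative weights on a textbook case, the sketch below computes the delta of a plain Black-Scholes put by multiplying the discounted payoff from the same risk-neutral simulations by the classical likelihood-ratio weight $Z/(S_0 \sigma \sqrt{T})$. This is a generic stand-in for the technique, not the paper's actual weight computation or product, and all parameter values are illustrative:

```python
import numpy as np
from scipy.stats import norm

# Illustrative Black-Scholes parameters (not the paper's model or product)
S0, K, r, sigma, T = 100.0, 100.0, 0.02, 0.2, 1.0
n = 1_000_000
rng = np.random.default_rng(42)

Z = rng.standard_normal(n)
ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)
payoff = np.exp(-r * T) * np.maximum(K - ST, 0.0)   # discounted put payoff

# "Weight" estimator of the delta: the SAME simulations used for the price,
# each multiplied by the likelihood-ratio weight Z / (S0 sigma sqrt(T)),
# so no differentiation of the payoff is needed
delta_weights = np.mean(payoff * Z / (S0 * sigma * np.sqrt(T)))

# Closed-form Black-Scholes put delta, for reference
d1 = (np.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
delta_exact = norm.cdf(d1) - 1.0

assert abs(delta_weights - delta_exact) < 1e-2
```

Because the price and all its derivatives reuse a single batch of simulations, the marginal cost of each extra derivative is one multiplication per path, which is consistent with the timings reported above.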
Additionally, Figure 5 shows the accuracy of the computation using the weights derivatives in comparison with Automatic Differentiation.

Algorithm                   Lookback put option
Automatic Differentiation   179 sec
Weights                     97 sec

Table 5: Computational times

Figure 5: Results from the grid method versus results from Automatic Differentiation.
Acknowledgements

The authors would like to thank the organizers of the 2017 CEMRACS for the opportunity to study at CIRM. This work was partially funded in the scope of the research project "Advanced techniques for non-linear pricing and risk management of derivatives", under the aegis of the Europlace Institute of Finance, with the support of the AXA Research Fund. Finally, we would like to thank all the funding sources that supported us during CEMRACS, in particular University Paris Diderot.

References

[1] Damiano Brigo and Fabio Mercurio. Interest Rate Models - Theory and Practice: With Smile, Inflation and Credit. Springer Science & Business Media, 2007.
[2] Hans-Joachim Bungartz and Michael Griebel. Sparse grids. Acta Numerica, 13:147-269, 2004.
[3] Bob Carpenter, Matthew D. Hoffman, Marcus Brubaker, Daniel Lee, Peter Li, and Michael Betancourt. The Stan math library: Reverse-mode automatic differentiation in C++. arXiv preprint arXiv:1509.07164, 2015.
[4] Nicolas Fournier and Arnaud Guillin. On the rate of convergence in Wasserstein distance of the empirical measure. Probability Theory and Related Fields, 162(3-4):707-738, 2015.
[5] Hugo Gevret, Nicolas Langrené, Jerome Lelong, Xavier Warin, and Aditya Maheshwari. STochastic OPTimization library in C++. Research report, EDF Lab, May 2018.
[6] Michael B. Gordy and Sandeep Juneja. Nested simulation in portfolio risk measurement. Management Science, 56(10):1833-1848, 2010.
[7] Alois Pichler. Evaluations of risk measures for different probability measures. SIAM Journal on Optimization, 23(1):530-551, 2013.
[8] Yu. V. Prokhorov. Convergence of random processes and limit theorems in probability theory.