Calibration of Local-Stochastic and Path-Dependent Volatility Models to Vanilla and No-Touch Options
CCalibration of Local-Stochastic and Path-Dependent VolatilityModels to Vanilla and No-Touch Options alan bain ∗ matthieu mariapragassam † (cid:12) christoph reisinger † Abstract
We propose a generic calibration framework to both vanilla and no-touch options for a large classof continuous semi-martingale models. The method builds upon the forward partial integro-differentialequation (PIDE) derived in
Hambly et al. (2016), A forward equation for barrier options under theBrunick & Shreve Markovian projection, Quant. Finance, 16 (6), 827–838 , which allows fast computationof up-and-out call prices for the complete set of strikes, barriers and maturities. It also utilises a noveltwo-state particle method to estimate the Markovian projection of the variance onto the spot and runningmaximum. We detail a step-by-step procedure for a Heston-type local-stochastic volatility model with localvol-of-vol, as well as two path-dependent volatility models where the local volatility component depends onthe running maximum. In numerical tests we benchmark these new models against standard models for aset of EURUSD market data, all three models are seen to calibrate well within the market no-touch bid–ask.
For derivative pricing models to be useful in practice, they need to allow calibration to the market prices ofliquid contracts, as well as exhibit a dynamic behaviour consistent with that of the underlying and with futureoptions quotes. Vanilla options prices provide a snapshot of the market implied distributions of the underlyingwhich is the key ingredient for pricing European options, but they provide limited information about the jointlaw of the underlying observed at different times, which is needed for pricing path dependent options. There isevidence (see, e.g., [2]) that the market prices of contracts with barrier features contain additional informationon the dynamic behaviour of the volatility surface not already seen in vanilla quotes. The topic of this paperis hence the simultaneous calibration of volatility models to European call and no-touch (or, more generally,barrier) options.The case of calibration to vanilla options, i.e. European calls and puts, has been considered extensivelyin the literature. The seminal work of Dupire [13] gives a constructive solution to the calibration problemfor local volatility (LV) models, which can perfectly match call prices for any strike and maturity. Nowadays,local-stochastic volatility (LSV) models are in widespread use in financial institutions because of their ability tocalibrate exactly to vanilla options due to the local volatility component while embedding a stochastic variancecomponent, which improves the dynamic properties. The calibration problem of LSV models to vanilla quotesis reviewed already in [34], and is addressed, more recently, in the works of Guyon and Henry-Labord`ere [22, 23]by a particle method, and in [36] by solution of a nonlinear Fokker-Planck PDE; see also [10, Section 6.8].In addition to call options, practitioners are increasingly interested in including the quotes of touch options inthe set of calibration instruments, which will improve the pricing and risk-management of exotic contracts with ∗ BNP Paribas, 10 Harewood Avenue, London, NW1 6AA, United Kingdom † Mathematical Institute and Oxford-Man Institute of Quantitative Finance, University of Oxford, WoodstockRoad, Oxford, OX2 6GG, United Kingdom , [email protected], [email protected] second author gratefully acknowledges financial support from the
Oxford-Man Institute and
BNP Paribas . a r X i v : . [ q -f i n . M F ] N ov arrier features. In some markets, for example in foreign exchange, the next most visible layer of option pricesafter the European vanilla prices are the American barrier options (for example products such as one-touch,double no-touch and vanilla knock-out options). Observation and model parameter adjustment derived fromthese prices is a well-established part of model calibration for short-dated FX options.A few published works already address this question for different model classes: Crosby and Carr [9] considera particular class of jump models which gives a calibration to both vanillas and barriers; Pironneau [33] provesthat an adaptation of the Dupire equation is valid for a given barrier level, under the local volatility model.This paper addresses the simultaneous calibration to vanilla and barrier (specifically, no-touch) optionssystematically for a wide class of volatility models. The focus is less on the calibration of a particular model –although we do calibrate three different new models – but on a methodology which allows the efficient calibrationof any volatility model.We assume that interest rates are deterministic. We take the Brunick–Shreve mimicking point of viewfrom [6] that the joint law of a stock price and its running maximum, or equivalently, barrier prices for allstrikes, barriers levels and maturities, can be reproduced by a one-factor model with a deterministic volatilityfunction of the spot, the running maximum and time. This is a natural extension of Gy¨ongy’s result in [24]that the stock price distribution, or equivalently, call prices for all strikes and maturities, can be reproducedwith a deterministic volatility function of the spot and time. In the latter case, this volatility function isthe expectation of the variance process conditional on the spot price, while in the path-dependent case theexpectation is also conditional on the path-dependent quantity, here the running maximum. These conditionalexpectations are often referred to as Markovian projections.In the vanilla case, exact calibration is guaranteed if the Markovian projection of the instantaneous varianceonto the spot coincides with the squared local volatility function derived from vanilla quotes by Dupire’sformula. Conversely, given the local volatility, model prices can be computed by the forward Dupire PDE,formulated in strike and maturity. The estimation of the local volatility from observed prices is an ill-posedinverse problem, and regularisation approaches have been proposed, e.g., in [27], [14], or [12]. If the underlyingmodel to be calibrated is not itself a local volatility model, the Dupire PDE can still be used to compute themodel prices by utilising the mimicking result, i.e., the Dupire PDE with the Markovian projection of theinstantaneous variance onto the spot as diffusion coefficient gives the correct model prices. The conditionalexpectation of the stochastic variance under the desired model can be estimated, e.g., by the particle methodin [23, 22].The natural extension of the forward Dupire PDE for calls to a forward equation for barrier option prices,with strike, barrier level and maturity as independent variables, is the forward PIDE derived in [26]. It has asdiffusion coefficient a volatility function of spot, running maximum, and time, which we view as a ‘code book’for barrier option prices, a name coined in [8] for local volatility and European options. We re-iterate that theunderlying diffusion can be a general continuous stochastic process. Specifically, the variance process does notneed to contain the running maximum in its parametrisation. The link between the original model and thispath-dependent volatility is given by Corollary 3.10 in [6] (see also (2.9) below).We investigate as example a Heston-type LSV model with a local volatility component as well as a stochasticvolatility with simple parametric, spot- and time-dependent vol-of-vol (LSV-LVV). Hence, we can perform abest-fit of the vol-of-vol function to no-touch options at each quoted maturity while ensuring perfect calibrationto vanilla options through the local volatility function. The tests show that the calibrated LSV-LVV modelprices no-touch options well within the market bid–ask spread for all barrier levels and maturities. The approachcan easily be generalised to other types of stochastic volatility diffusions.We also construct a “local maximum volatility”, i.e. a spot and running maximum dependent volatilityfunction (LMV) consistent with market prices of calls and no-touches by solving an inverse problem for thePIDE discussed above using regularisation. The calibration of this maximum-dependent local volatility functionusing the forward PIDE is inspired by, and extends, the literature on the local volatility model calibration,see e.g. [27, 14, 12]. We then consider an extension of the model in the spirit of LSV models, i.e., a localmaximum-dependent volatility function (LMSV) multiplied by a stochastic volatility. These two models fall2nto the class of path-dependent volatility models and can be useful to replicate a market’s spot-volatilitydynamics as explained in [21].In the calibration of the LSV-LVV and LMSV models, one can compute the Markovian projection of thestochastic variance by either extending the particle method introduced in [23, 22] or by solving the Kolmogorovforward PDE for the joint density of ( S t , M t , V t ), the spot, maximum and volatility, numerically. In ourapproach, we rely on a two-dimensional particle method (in ( S t , M t )) as it offers a straightforward extension toadditional stochastic factors. The computationally most expensive part is, as often, retrieval of the neighbouringparticles, for which we propose a binary tree search, specifically on a k -d tree. The use of k -d trees for particlemethod calibration is a novel approach which can easily be generalised to higher-dimensional state spaces.The remainder of this paper is organised as follows. In Section 2, we define the models and calibrationcondition for up-and-out barrier option quotes. Section 3 presents an efficient numerical solution of theforward PIDE for barrier options, which is central for the algorithms in this paper. In Section 4, we present apossible calibration algorithm for the path-dependent volatility (LMV) model by forward PIDE and regularisedgradient-based optimisation. Then, in Section 4.3 specifically, we use again a particle method to calibrate theLMSV model by Markovian projection. Section 5 makes use of a Markovian projection with a two-dimensionalconditional state of spot and running maximum for the LSV-LVV model and combines it with the forwardPIDE for barrier options in order to best-fit no-touch quotes while perfectly calibrating vanilla market prices.The calibration results for all these models are presented and compared in Section 6. Section 7 concludes witha brief discussion. We consider a spot exchange rate S t associated with the currency pair FORDOM, which is the amount ofunits of domestic currency DOM needed to buy one unit of foreign currency FOR at time t . We assume theexistence of a filtered probability space (Ω, F , {F t } t ≥ , Q d ) with domestic risk-neutral measure Q d , underwhich S follows the SDE dS t S t = (cid:0) r d ( t ) − r f ( t ) (cid:1) dt + Y t dW t , (2.1)where W is a one-dimensional F t -adapted standard Brownian motion, Y is a continuous and positive F t -adaptedsemi-martingale, where E Q d (cid:20) ˆ t Y u S u du (cid:21) < ∞ , (2.2)and the domestic and foreign short rates, r d and r f , are deterministic functions of time, such that r d ( t ) = − ∂ ln P d (0 , t ) ∂t , r f ( t ) = − ∂ ln P f (0 , t ) ∂t , with P d / f (0 , T ) the market zero-coupon bond prices for the domestic and the foreign money market accounts,respectively (see Chapter 9.1 in [31]). The domestic and foreign discount factors are then defined as D d ( t ) = e − ´ t r d ( u ) du , D f ( t ) = e − ´ t r f ( u ) du . (2.3)A model widely used in the industry is the Heston-type LSV model dS t S t = (cid:0) r d ( t ) − r f ( t ) (cid:1) dt + σ ( S t , t ) √ V t dW t dV t = κ ( θ − V t ) dt + βξ √ V t dW Vt , (2.4)where W and W V are standard Brownian motions with constant correlation ρ . Moreover, v , the a prioriunknown initial value of V , and κ, θ, ξ, β are non-negative scalar parameters, while the local volatility component3 : R + × [0 , T ] → R + , is assumed to be bounded and locally Lipschitz in S . This ensures, alongside (2.2), thenecessary conditions to use the forward equation from [26] referenced below in (2.10). Here, the parameter β is redundant (as only the product βξ appears) and can be set to 1 for the time being; it will be used inSection 5 to interpolate between the pure local volatility model ( β = 0, κ = 0, v = 1) and the Heston model( β = 1, σ = 1). Similarly, although we will calibrate θ (alongside κ, ξ, v , ρ ) to vanilla options using a pureHeston model, we note that the use of σ makes θ a redundant parameter because of the scaling properties ofthe model; one could fix θ = 1 instead.Extending (2.4), we introduce a Heston-type local-stochastic volatility model with local vol-of-vol (LSV-LVV), dS t S t = (cid:0) r d ( t ) − r f ( t ) (cid:1) dt + σ ( S t , t ) √ V t dW t dV t = κ ( θ − V t ) dt + ξ ( S t , t ) √ V t dW Vt , (2.5)with ξ : R + × [0 , T ] → R + . The motivation for this model is the freedom gained through the local volatilityfunction σ and the vol-of-vol function ξ for the calibration to two classes of options. In particular, while σ plays a similar role to that in (2.4) and enables calibration to calls with different strikes and maturities, we willuse ξ to match (no-)touch option quotes with different barrier levels and maturities.The choice of ξ as a function of S reflects the fact that S is an observable quantity. One could argue forother choices, for instance have the extra function ξ depend on V , however, absolute levels of V lack financialinterpretation (they will be adjusted for in the σ ( ., )). One could also have a spot-vol correlation that dependson S , however, local correlation models as in [35, 28] can be fragile as there are tight limits on the range of thecorrelation (for example, by requirements that the correlation matrix be positive semi-definite).As calibration instruments, in addition to vanillas, we will consider one-touch options, which pay 1 atmaturity, in one of the currencies, if the FX rate breaches the up-barrier B during the product lifespan(with continuous monitoring). We note that on the market, touch options paying either foreign or domesticnotional are quoted. We convert the market quotes for foreign one-touches denominated in the foreign currencynumeraire, FOT, to foreign no-touch options denominated in domestic currency numeraire, FNT, with thefollowing formula, FNT( B, T ) = D d ( T ) E Q d [ S T M T
B, T ) (cid:1) . (2.6)In the following, if no specification of notional currency is given, the price of a no-touch is defined as in (2.6).No-touches and vanilla calls are two special cases of barrier calls, and we therefore work under this moregeneral framework. In the remainder of this section, we give a calibration condition for up-and-out call pricesunder model (2.1). The up-and-out call price under model (2.1) for a notional of one unit of FOR is C ( K, B, T ) = D d ( T ) E Q d (cid:104) ( S T − K ) + M T
0) = ( S − K ) + S C ( K, S , T ) = 0 , K ≤ S , T > . The equation is degenerate at K = 0 due to the factors K and K in the first and third line of (2.10), andno boundary condition is needed (the process (cid:98) S in (2.8) does not attain the zero boundary). Moreover, due tothe nature of the integral term, the solution C ( · , B, T ) is fully determined without any asymptotic boundarycondition for large B , hence (2.10) is solved up to the largest barrier level needed.We describe the numerical solution of (2.10) by finite differences in Section 3 and the estimation of (2.9) byparticle method in Section 5.1. We hereby describe the available data that we use throughout the paper for the different calibration routines.These are market quotes from 28 / / .We use at-the-money (spot or forward) volatility, 10 and 25 delta smile-strangles and risk-reversals, i.e. foreach maturity 5 volatilities on a delta scale (spot-delta convention up to 1Y included, forward-delta conventionafterwards), denoted as 10D-Put , , , , , and the following maturities, relative to28/03/2013: 3M, 6M, 1Y, 2Y, 3Y, 4Y, 5Y . The implied volatility is plotted in Figure 2.1 on a strike scale fordifferent maturities.Additionally, we will perform calibration on quotes for foreign one-touch options for the following maturities,relative to 28/03/2013: 3M, 6M, 1Y, 2Y, 3Y, 4Y, 5Y , and barrier levels B chosen such that the discountedforeign no-touch-up probabilities are approximately 50% , , , , . In this section, we introduce a second order accurate and empirically stable numerical scheme for the PIDE(2.10). More specifically, we construct a tailored non-uniform spatial mesh, combined with finite differencesfor the derivatives and quadrature of the integral term, and a backward differentiation formula (BDF) on anon-uniform time mesh, which is shown in tests to have better stability than the usual Crank-Nicolson scheme.In this section, for simplicity of notation, we drop the LMV subscript from σ LMV in (2.8). The call option prices and no-touch prices were provided by Markit. The zero-coupon rates for both EUR and USD curveswere retrieved from Bloomberg. igure 2.1: Market volatility surface for EURUSD on 28/03/2013. The spot value was S = 1 . . We define M + 1 time points T m , N + 1 strike points K i and P + 1 barrier points B j , leading to the followingimplicit definition of the non-uniform step sizes functions ∆ T , ∆ K , ∆ B , respectively: T m = (cid:80) m − m (cid:48) =0 ∆ T ( m (cid:48) ) , ≤ m ≤ M,K i = (cid:80) i − i (cid:48) =0 ∆ K ( i (cid:48) ) , ≤ i ≤ N,B j = S + (cid:80) j − j (cid:48) =0 ∆ B ( j (cid:48) ) , ≤ j ≤ P. We denote by N the node such that K N = S . For simplicity, we relate the step sizes functions ∆ K and∆ B by ∆ K ( i ) = ∆ B ( i − N ) for any i with N ≤ i ≤ N . This will ensure that for any B j , the correspondingmesh row ( · , B j , T m ) will contain all ( B u , B j , T m ) for all u < j , which is useful for the following algorithm. Wedenote the discrete solution vector in such a row ( · , B j , T m ) by u m.,j = (cid:2) C ( K , B j , T m ) , . . . , C ( B j , B j , T m ) (cid:3) (cid:48) ∈ R n j , where n j = N + j + 1 and (cid:48) denotes the transpose. Define I n the identity matrix of size n × n .Derivatives are approximated by centered finite differences at each space point except at K = 0 and K = B j ,where they are computed, respectively, by three-point forward and backward one-sided differences. We allow anon-uniform grid and rely on the algorithm in [15] to define two finite difference operators δ K u mi,j and δ KK u mi,j (acting on the index i ) as well as the corresponding matrix derivative operators D and D respectively (see[26] for more details).The integral term F ( K i , B j , T m ) = ˆ B j S ∨ K g ( K i , b, T m ) d b
6s computed using the trapezoidal quadrature rule on the non-uniform grid, with g ( K, b, T ) = − K ∂ C ( K, b, T ) ∂K ∂σ ( K, b, T ) ∂b , and where we recall that we dropped for simplicity the subscript LMV from σ . We define¯ g ( K i , B j , T m ) = (cid:40) − K i δ KK u mi,j ∂σ ( K i ,B j ,T m ) ∂B , K i < B j , , K i = B j , since (see [26]) ∂ C ( B, B, T ) ∂K = 0 . Let j ≥ · , B j (cid:48) , T m ) j (cid:48) T > ∂ C∂K ∂B ( B, B, T ) = − ∂ C∂K ( B, B, T ) (3.4)= D d ( T ) φ ( B, B, T ) , (3.5)where φ ( · , · , T ) is the joint density of ( S T , M T ) at time T . We can compute a second order, five pointapproximation to the third derivative on the right-hand side of (3.4) with the algorithm from [15] and denote theleft-sided difference operator by δ − KKK and the discretisation matrix (both obtained by numerical computation)by Φ . For uniformly spaced grids, the discretisation matrix is given by Φ = 12∆ K . . . − 14 24 − 18 5... (0) ... ... ... ... ... ... ...0 . . . − 14 24 − 18 5 . The PIDE algorithm involves solving one-dimensional PDEs for different values of the barrier at everyrow j of the discretisation. The interconnection between each of these “layers” is given through the integralterm F . The first row j = 0 is for the barrier level B = S ∨ K . As a requirement for stability, we foundempirically that the five grid points used for the approximation Φ need to be on the right-hand side of S .7n order to ensure this, we do not compute the solution for B j with j ∈ { , , , } , and for j > 4, we startthe summation in (3.2) at j (cid:48) = 5. No error is introduced if ∂σ∂b ( K, b, T ) = 0 for any b ∈ [ S , B ] (which will besatisfied by our construction of σ ), in fact, this allows to start the induction over B j in (3.3) at the barrierlevel B first = inf { b : ∂σ∂b ( · , b, t ) (cid:54) = 0 } , since ˆ B first S ∨ K K ∂ C ( K, b, T ) ∂K ∂σ ( K, b, T ) ∂b d b = 0 . Please note that B first is not linked to any market conventions, but only a numerical convenience. We will referto the skipped rows 1 to 4 as the “blank layers” in the remainder of the section (see Fig. 3.1 for an illustration).We note that the same effect is obtained if the strike and barrier discretisations are decoupled and that the firstbarrier level is chosen such that, at least five grid points used for the approximation Φ are on the right-handside of S .Finally, the complete surface of barrier option prices can be retrieved by cubic spline interpolation in bothstrike and barrier (in particular also for B < B first by interpolation between B = S , where C ( K, S , T ) = 0,and B = B first ), and constant extrapolation for large barriers. The latter is a consistent assumption since, forany B > B max , the value will be close to that of a vanilla option. Remark. We recall that in [26] we were only able to use a first order accurate approximation of the boundaryderivative due to stability issues. As this term is present in the discretised equation for all interior mesh points,the scheme we proposed in [26] had a consistency order reduced to one in ∆ K . We find that using “blank layers”and a second-order BDF time scheme as described below in Section 3.3 instead of Crank-Nicolson allows touse a second order accurate approximation and preserve stability. Overall, we obtain an order two consistentspatial approximation for smooth enough meshes. Here, we explain how prices of vanilla contracts can be obtained efficiently as a by-product of the solutionof the forward PIDE (2.10) for barrier options. Note that standard PDE pricing approaches are not directlyapplicable due to the dependence of the volatility on the running maximum, such that a two-dimensionalbackward PDE would be required for each strike (see [26]).The straightforward approach to vanillas with the forward PIDE (2.10) is to set the maximum up-and-outbarrier B max very high. This requires a very large number of mesh rows in the B -direction, which increases thecomputational time drastically. Therefore, we make and exploit the assumption that the volatility becomesconstant in the running maximum dimension above a given level B last . Then ∂σ ( K, b, T ) /∂b = 0 for any b ≥ B last and all K, T ≥ 0, such that (similar to the situation for small B in Section 3.1) ˆ ∞ B last K ∂ C ( K, b, T ) ∂K ∂σ ( K, b, T ) ∂b d b = 0 . (3.6)Moreover, it seems reasonable to assume thatlim B →∞ σ ( B, B, T ) B ( B − K ) ∂ C ( B, B, T ) ∂K ∂B = 0 , since by (3.5) the term in the limit is proportional to the joint density function of ( S T , M T ) at ( B, B ), and weconjecture here that it goes to 0 faster than B − . In other words, no error is made by jumping from B = B last to a large B = B max in the solution of (2.10).The PIDE (2.10) then becomes ∂C ( K, B max , T ) ∂T + r f ( T ) C ( K, B max , T ) = − (cid:16) r d ( T ) − r f ( T ) (cid:17) K ∂C ( K, B max , T ) ∂K (3.7)+ 12 σ ( K, B max , T ) K ∂ C ( K, B max , T ) ∂K − ˆ B last S ∨ K K ∂ C ( K, b, T ) ∂K ∂σ ( K, b, T ) ∂b d b , ith boundary conditions C ( K, B max , 0) = ( S − K ) + , T = 0 ,∂ C ( K max , B max , T ) ∂K = 0 , K = K max , for some large enough K max (cid:28) B max . One will then compute all mesh rows up to B last , and one additional“vanilla layer” for C ( K, B max , T ) for an arbitrarily large level B max by the PDE (3.7). The main difficulty in the time discretisation of the forward PIDE (2.10) arises from the term12 σ ( B, B, T ) B ( B − K ) ∂ C ( B, B, T ) ∂K ∂B , which, as per (3.5), contains the joint density φ of the process ( S T , M T ) and becomes a Dirac delta point sourceat ( S , S ) for T = 0. This potentially causes instabilities for all B close to S for short-term options.In order to tackle the problem, we first subdivide the initial time step and perform 4 fully implicit steps ofa quarter step-size. For the definition of the BDF2 scheme, a single initial fully implicit step would suffice, butfor better comparison with the Crank-Nicholson scheme we adopt the Rannacher startup with four steps (see[16]) in both cases. Also in both cases, we make use of the “blank layers” described in Section 3.1If we take into account the finite difference approximations and quadrature rule for the integral, it is nowpossible to give a discretised PIDE, for a given triplet ( i, j, m ), 1 ≤ i ≤ N , 1 ≤ j ≤ P , 1 ≤ m ≤ M , by δ T u mi,j + r f ( T m ) u mi,j + (cid:0) r d ( T m ) − r f ( T m ) (cid:1) K i δ K u mi,j − (cid:18) σ ( K i , B j , T m ) − ∂σ ∂B ( K i , B j , T m )∆ B ( j ) (cid:19) K i δ KK u mi,j − σ ( B j , B j , T m ) B j ( B j − K i ) + δ − KKK u mn j ,j = f ( K i , B j , T m ) , (3.8)where δ T is a time difference operator, and specify the coefficient matrices A m.,j = (cid:0) r d ( T m ) − r f ( T m ) (cid:1) diag (cid:0) K , ..., K n j (cid:1) , B m.,j = − 12 diag (cid:32)(cid:18) σ ( K i , B j , T m ) K i − K i ∂σ ( K i , B j , T m ) ∂B ∆ B ( j ) (cid:19) ≤ i ≤ n j − (cid:33) , C m.,j = − 12 diag (cid:18)(cid:16) σ ( B j , B j , T m ) B j ( B j − K i ) + (cid:17) ≤ i ≤ n j − (cid:19) . Under fully implicit time stepping, the complete scheme can be more compactly written as u m.,j − u m − .,j ∆ T ( m ) + L m.,j u m.,j = f m.,j , L m.,j = r f ( T m ) I n j + A m.,j D + B m.,j D + C m.,j Φ . (3.9)To define the BDF scheme for variable step size, we denote C ( K, B, T m ) by C m and write Newton’s interpolationpolynomial in time as C ( T ) = C m + [ C m, C m − ]( T − T m ) + [ C m, C m − , C m − ]( T − T m )( T − T m − ) , ., . ] and [ ., ., . ] are divided differences. Taking the derivative with respect to T , evaluated at T m , ∂C∂T ( T m ) = C m − C m − ∆ T + (∆ T ( m )) ∆ T ( m ) + ∆ T ( m − (cid:20) C m − C m − ∆ T ( m ) − C m − − C m − ∆ T ( m − (cid:21) . This yields a linear system of equations for each time step, (cid:18) I n j + ∆ T ( m )1 + γ m L m.,j (cid:19) u m.,j = (cid:18) T ( m )∆ T ( m − γ m γ m (cid:19) u m − .,j − (cid:18) ∆ T ( m )∆ T ( m − γ m γ m (cid:19) u m − .,j + ∆ T ( m )1 + γ m f m.,j , with γ m = ∆ T ( m )∆ T ( m ) + ∆ T ( m − , which defines an implicit second-order multi-step method. Assuming there exists a smooth bijective mappingbetween the non-uniform and a uniform time mesh, the method is consistent of order 2 and stability is preservedif the step-size ratio is bounded as follows (see [25]),0 < ∆ T ( m )∆ T ( m − < √ , which is guaranteed by a smooth change of the step-size. In order to get the best accuracy, we refine the mesh around ( K, B ) = ( S , S ), for two reasons. First, this willadd more barrier mesh rows where ∂C∂B is high and efficiently capture the change in call prices as well as reducean eventual error generated by the “blank layers”. Second, on the strike scale, the solution is mainly convexaround K = S as seen in Figure 3.7, and therefore higher accuracy becomes important in that specific zone.We use a hyperbolic mesh as in [11], where we require S to be a node and with η = 0 . 05 (in their notation)chosen according to our numerical experiments. In Figure 3.1, we display the generated mesh with N = 50points in each direction, including the initial “blank layers” corresponding to the vertical gap in the mesh. Figure 3.1: Hyperbolic mesh on the domain { ( K, B ) : 0 < K ∨ S < B } , initialised with “blank layers”. 10n order to perform numerical tests, we calibrate a pure local volatility model to the set of vanilla optionspresented in Section 2.1 and obtain a local volatility function σ LV . Our implementation follows a fixed-pointalgorithm as described in [11] based on the work of [39]. Other methods to retrieve the local volatility functioncould have been used as well. From this calibrated local volatility we define a hypothetical volatility functionof the form σ ( s, m, t ) = (cid:112) σ LV ( s, t ) σ LV ( m, t ) m ≥ s , defined on a mesh of strikes, barriers and maturities ( K T i ,j , B T i ,k , T i ), with 1 ≤ i ≤ 10 , 1 ≤ j ≤ ≤ k ≤ B T i, ,k = K T i ,k and interpolated with cubic splines in space and constant backwards intime. The initial spot value is S = 1 . N = 700and M = 100, as a function of B for T = 1 with and without “blank layers” in Figure 3.2 and 3.3 respectively.Evaluating the term φ ( B, B, T ) from (2.10) accurately is necessary since the value of the foreign no touchoption is directly linked to it by ∂C (0 , B, T ) ∂T + r f ( T ) C (0 , B, T ) = − σ ( B, B, T ) B φ ( B, B, T ) ∀ ( B, T ) ∈ ( S , + ∞ ) × R + ∗ . Finally, to highlight the importance of a smoothing scheme for the time stepping, we plot φ ( B, B, T ) with “blanklayers” combined with BDF2 in Figure 3.4 and Crank–Nicolson in Figure 3.5, which shows that Crank–Nicolsoncan generate instabilities if the number of time steps is too small, even when using Rannacher initialisation. Figure 3.2: ∂ C/∂K ∂B along the diagonal K = B for T = 1, initialised with “blank layers”. 700 strike steps,100 BDF2 time steps + Rannacher initialisation. Figure 3.3: ∂ C/∂K ∂B along the diagonal K = B for T = 1, initialised without “blank layers”. 700 strike steps,100 BDF2 time steps + Rannacher initialisation. In order to numerically verify the PIDE solution, we compute the price of an up-and-out call option for K = 80% × S , B = 110% × S and T = 1 with the forward PIDE and crude Monte Carlo combined with theBrownian bridge (BB) technique as in Chapter 6 of [18].The results are shown in Table 3.1. 11 igure 3.4: ∂ C/∂K ∂B along the diagonal, for T = 1,initialised with “blank layers”. 700 strike steps, 10 BDF2time steps + Rannacher initialisation. Figure 3.5: ∂ C/∂K ∂B along the diagonal for T = 1,initialised with “blank layers”. 700 strike steps, 10 Crank–Nicolson time steps + Rannacher initialisation. Forward PIDE Monte Carlo (with 95% conf. int.)0.15823 0.15825 (0.15821, 0.15828) Table 3.1: Price of an up-and-out call option for K = 80% × S , B = 110% × S , computed with the forward PIDE (700spot steps and 100 time steps) and Monte Carlo (5 × paths and 500 time steps with Brownian bridge interpolation). In Table 3.2, we give the convergence order as a function of the number of strike points. More precisely, theerror e n in row n is the absolute difference between the value with 2 × N K and N K strike points, N K = 150 × n .The order displayed in row n is then ln( e n /e n +1 ) / ln 2. We notice that while the convergence order is close to 3for a smaller number of strike points, the asymptotic order is indeed 2. A similar approach is used for Table3.3, where convergence of order 2 is obtained after 240 time steps per year. However, even for a smaller numberof time steps, the error is small and the price is accurate. N K Error Order300 9 . × − . × − . × − . × − – Table 3.2: Convergence order in number of strike steps N K for N T = 60 time steps. N T Error Order60 7 . × − . × − -0.99240 6 . × − . × − . × − – Table 3.3: Convergence order in number of time steps N T for N K = 1200 strike steps. We re-iterate the non-standard nature of the PIDE and that it was only by a careful adaptation of standardtechniques that high accuracy and stability was achieved.Finally, we show in Figure 3.7 up-and-out call prices for T = 1 as a function of K and B where we have S = 1 . igure 3.6: Volatility function used for numerical testwith smooth transition at the boundary. The constructionis such that σ ( s, m, t ) = (cid:112) σ LV ( s, t ) σ LV ( m, t ). Figure 3.7: Up-and-out call prices computed with theforward PIDE for different values of strikes and barriersand T = 1. For the calibration to options with barrier features, whose payoff thus depends on the running maximum ofthe underlying asset, it seems natural to also consider models where the volatility depends explicitly on themaximum. This leads to models from the class of path-dependent volatility models proposed in [21].To this end, let the range of possible ( S t , M t ) values be D = (cid:8) ( s, m ) ∈ R : 0 ≤ s ∨ S ≤ m (cid:9) . First, we define a “local maximum volatility” (LMV) model by dS t S t = (cid:0) r d ( t ) − r f ( t ) (cid:1) dt + σ LMV ( S t , M t , t ) dW t M t = max ≤ u ≤ t S u , (4.1)where σ LMV : D × [0 , T ] → R + is assumed to be bounded, locally Lipschitz in S and continuously differentiablein M . This implies that ( S t , M t ) t ≥ is Markovian (see [6]). The construction is motivated by the ability of themodel to mimick the joint distribution of S t and M t for any diffusion model, as shown in [6], and therefore itcan fit up-barrier option prices simultaneously for all maturities, strikes, and barrier levels by construction.Secondly, we propose a “local maximum stochastic volatility” (LMSV) model defined as dS t S t = (cid:0) r d ( t ) − r f ( t ) (cid:1) dt + σ ( S t , M t , t ) √ V t dW t dV t = κ ( θ − V t ) dt + ξ √ V t dW Vt M t = max ≤ u ≤ t S u , (4.2)with σ : D × [0 , T ] → R + a local volatility function which also depends on the running maximum. This modelextends (4.1) in the same way that the LSV model extends the LV model. One might also wish to incorporatea mixing factor β as in (2.4). This would fit naturally into the calibration proposed below.Note that the class of models in (2.1) includes (4.1) and (4.2).In the following, we discuss the calibration of the path-dependent volatility model (4.1) by forward PIDE(2.10) to vanilla and no-touch options. 13 .1 LMV model calibration with regularised best-fit algorithm As the LMV model represents the natural extension of the Dupire local volatility model for European calls toup-and-out barriers, the approach taken here is motivated by the literature on calibration and regularisation oflocal volatility. Our goal is to encode the market prices of both vanilla options and no-touches in one model.Moreover, it represents a building block to the calibration of the LMSV model in Section 4.3 by particle method.The local maximum volatility function σ LMV is calibrated directly to these quotes and cannot be expected tobe unique, as the marginal distributions of S (from the vanillas) and M (from the no-touches) do not uniquelyidentify their joint distribution. How much they restrict the joint distribution is an interesting reseach question.Specifically, we will minimise a functional consisting of the least-squares model error for vanillas and no-touches and a penalty term which steers the optimisation algorithm to a local minimum with certain regularity.The optimisation is performed over volatility surfaces which are parameterised with a finite dimensionalparameter vector Λ i for each ( T i − , T i ).In our tests, the volatility function is defined by quadratic splines in spot and running maximum andpiecewise constant in time. For each quoted maturity, we choose a grid of points formed by the N K = 5 quotedstrikes in the spot direction and N B = 4 nodes ( M T i ,k ) ≤ k ≤ N B in the running maximum direction, uniformlyspaced on the interval (cid:34) S + (cid:0) B T i , − S (cid:1) , B T i , (cid:35) , where B T i , and B T i , are, respectively, the quoted up-and-out barriers for the corresponding 50% and90% no-touch probabilities for maturity T i . The optimisation will be performed over the ( N K × N B ) matrixΛ i = (cid:104) σ i,j,k LMV (cid:105) j,k , with σ i,j,k LMV = σ LMV ( K T i ,j , M T i ,k , T i ), with 1 ≤ i ≤ N Mat , 1 ≤ j ≤ N K , 1 ≤ k ≤ N B . For a given maturity T i ,the volatility is extrapolated asymptotically constant as described in Appendix A.4, outside [ K T i , , K T i ,N K ] × [ M T i , , K T i ,N B ].We define the objective function ¯ e for Q K quoted strikes, Q B quoted barrier levels and each maturity T i as¯ e (Λ i ) = e (Λ i )(1 + P (Λ i )) e (Λ i ) = Q B (cid:88) l =1 (cid:0) e FNT l (Λ i ) (cid:1) + γ Q K (cid:88) l =1 (cid:0) e Σ l (Λ i ) (cid:1) ,e FNT l (Λ i ) = FNT Model ( B T i ,l , T i , Λ i ) − FNT Market ( B T i ,l , T i ) (4.3) e Σ l (Λ i ) = Σ Model ( K T i ,l , T i , Λ i ) − Σ Market ( K T i ,l , T i ) , (4.4)with Σ Market the market Black–Scholes implied volatility, Σ Model the model implied volatility, γ ∈ R and P (Λ i ) = 1 N K N B N K (cid:88) l =1 N B (cid:88) m =1 (cid:18) ∂ σ LMV ( K T i ,l , M T i ,m , T i ) ∂K (cid:19) h (cid:18) − ∂ σ LMV ( K T i ,l , M T i ,m , T i ) ∂K , . , . (cid:19) (4.5) h ( x, x , (cid:15) ) = 1 + tanh(2 ( x − x ) (cid:15) )2 , where the second derivative is obtained by differentiation of the interpolant. Note that ¯ e , e and P at T i arefunctions of (Λ i ) j ≤ i , but in writing e (Λ i ) etc, we focus on the dependence on Λ i for the inductive calibration.The penalisation function P is a Tikhonov-type regularisation which reduces the number of local minimaand improves the stability of the volatility surface. For a pure local volatility model, Tikhonov regularisationhas been shown to provide well-posedness, under the condition that the local volatility does not depend on14ime, for the one maturity vanilla calibration problem in [14]. A similar approach is also used in [12] forthe pure local volatility model. The parametric form (4.5) for the penalisation was chosen empirically. Weacknowledge that the penalised error ¯ e (Λ i ) as defined in (4.3) does not prevent over-parametrisation if themarket data fit perfectly, i.e. when e (Λ i ) = 0, as the penalisation is multiplicative. However, during the iterativeoptimisation we found e (Λ i ) to be always strictly positive and the penalisation will favour smoother, convexshapes of the volatility in the strike direction; see also Section 4.1. Here, h acts as a smoothed step functionto ensure differentiability with respect to the parameters Λ i (for the BFGS routine used below). Setting, (cid:15) = 0 . x = 0 . 5, is such that we get h (0 , . , . ≈ . 02, a small positive amount in order to penalisemainly concave solutions while lightly penalising close-to linear solutions as well. In that sense, h helps minimisethe impact of small values of the second order derivative.The calibration algorithm is described in Algorithm 1 in Appendix B for N Mat maturity pillars where weuse the calibrated local volatility (i.e., σ LMV ( S t , M t , t ) = σ LV ( S t , t ) independent of M t ) as a first guess for thefirst maturity pillar.The calibration algorithm uses the bounded L-BFGS routine described in [42], where the gradient of theobjective function ¯ e needs to be computed. Compared to the gradient-free Nelder–Mead [32] algorithm discussedin Section 5, the L-BFGS optimisation is considerably faster to converge to a local-minimum if a gradient canbe obtained efficiently. For instance, for parameter σ i,j,k LMV , we can write ∂e (Λ i ) ∂σ i,j,k LMV = 2 Q B (cid:88) l =1 e FNT l (Λ i ) ∂ FNT Model ( T i , B T i ,l , Λ i ) ∂σ i,j,k LMV + 2 γ Q K (cid:88) l =1 e Σ l (Λ i ) V (cid:16) T i , K T i ,l , σ i,j,k LMV (cid:17) ∂ Call Model ( T i , K T i ,l , Λ i ) ∂σ i,j,k LMV , where V ( T, K, σ ) is the standard Black–Scholes vega for maturity T , strike K and volatility σ . The computationof the gradient of the model up-and-out call price C ( K, B, T, Λ i ) (including calls and no-touches) with respectto the parameter vector Λ i is described in Section 4.2.The model prices were computed using the PIDE (2.10) discretised with 1200 strike steps, 40 time steps inbetween each quoted maturity T i , which is sufficient to guarantee good accuracy.In order to emphasise the importance of the penalisation function, we calibrate the LMV model withoutregularisation, i.e. P ≡ 0, and plot the resulting LMV function in Figures 4.3 and 4.4, with the same axis rangeas for the regularised solution in Figures 4.1 and 4.2, which highlights different possible solutions, especially forlonger maturities. We used γ = 5 in (4.3).The calibration process shows the existence of a few local minima in the objective function. This is notsurprising as the path-dependent volatility model is, in principle, able to calibrate perfectly a discrete set ofup-and-out call options (which includes vanilla options), hence by only providing call and no-touch prices, thecalibration problem is underdetermined.In the present setting, with five vanilla and five no-touch quotes per maturity, and 20 parameters, we findmany surfaces which fit the data. However, starting the iterative optimisation procedure from the Dupirevolatility (i.e., no dependence on the maximum), the regularisation steers the approximate minimiser towards acalibrated surface with small penalty term.We will see in Section 6, specifically the first column of Table 6.2, that the calibration is very precise, withan absolute error for no-touches never higher than 0 . 03% in price. In order to perform a best-fit algorithm, knowledge of the gradient with respect to the model parameters isrequired for the chosen (gradient-based) optimisation process. Assume that the volatility in ( T i , T i +1 ) is afunction of N parameters Λ i = ( σ i, , ..., σ i,N ), and constant between quoted maturities, where we drop thesubscript ‘LMV’ for brevity.So, we need to compute ∇ C ( σ i, , ..., σ i,N ) , where ∇ is the gradient operator with respect to Λ i .15 igure 4.1: Local maximum volatility function T = 1 Y with regularisation. Figure 4.2: Local maximum volatility function T = 5 Y with regularisation. Figure 4.3: Local maximum volatility function T = 1 Y with no regularisation. Figure 4.4: Local maximum volatility function T = 5 Y with no regularisation. Equation (2.10) can be written as ∂C (Λ i ) ∂T + r f ( T ) C (Λ i ) + (cid:16) r d ( T ) − r f ( T ) (cid:17) K ∂C (Λ i ) ∂K − σ (Λ i ) K ∂ C (Λ i ) ∂K = − σ (Λ i ) (cid:5) K = B B ( B − K ) ∂ C (Λ i ) ∂K ∂B (cid:23) K = B − ˆ BS ∨ K K ∂ C (Λ i ) ∂K σ (Λ i ) ∂σ (Λ i ) ∂b d b . We can differentiate with respect to each of the parameter vectors Λ i , which gives ∂ ∇ C (Λ i ) ∂T + r f ( T ) ∇ C (Λ i ) + (cid:16) r d ( T ) − r f ( T ) (cid:17) K ∂ ∇ C (Λ i ) ∂K − σ (Λ i ) K ∂ ∇ C (Λ i ) ∂K = − σ (Λ i ) (cid:5) K = B B ( B − K ) ∂ ∇ C (Λ i ) ∂K ∂B (cid:23) K = B − ˆ BS ∨ K K ∂ ∇ C (Λ i ) ∂K σ (Λ i ) ∂σ (Λ i ) ∂b d b + R (Λ i ) , ith R (Λ i ) = σ (Λ i )( ∇ σ (Λ i )) K ∂ C (Λ i ) ∂K − ( σ (Λ i )( ∇ σ (Λ i ))) (cid:99) K = B B ( B − K ) ∂ C (Λ i ) ∂K ∂B (cid:23) K = B − ˆ BS ∨ K K ∂ C (Λ i ) ∂K (cid:18) ∇ ( σ (Λ i )) ∂σ (Λ i ) ∂b + σ (Λ i ) ∇ (cid:18) ∂σ (Λ i ) ∂b (cid:19)(cid:19) d b . Hence, ∇ C (Λ i ) follows the same PDE as C , but with an inhomogeneous term which is a function of C and itsspatial derivatives, with initial condition ∇ C (Λ i )( K, B, 0) = 0 0 ≤ K, K ∨ S ≤ B, and with boundary conditions ∇ C (Λ i )( B, B, T ) = 0 , S ≤ B, ∇ C (Λ i )( K, S , T ) = 0 , K ≤ S , which match the Dirichlet boundary conditions for C ( K, B, T ). This useful property confirms that we can usethe same discretised linear operator for both C and ∇ C (Λ i ). The additional source term R on the right-handside is fully known since the solution for C is computed beforehand. Only one costly LU factorisation is neededto compute both C and ∇ C (Λ i ) at each implicit time step. Solving the linear systems of N + 1 equations isthen fast by forward and backward substitution. Additionally, since the volatility is assumed piecewise constantin maturity, the set of parameters Λ i +1 has no impact on the values of C ( K, B, T ) for any T ≤ T i . Hence wealso have ∇ C (Λ i )( K, B, T j ) = 0 , j < i . Some numerical experiments for a volatility defined on a grid of 5 × In this section, we discuss a possible calibration algorithm for the LMSV model (4.2). We assume that acalibrated LMV volatility function σ LMV is at our disposal, e.g. obtained as in Section 4.1.With the Heston parameters ( κ, ξ, θ ) and σ LMV fixed, σ in (4.2) can be found from the calibration condition(see (2.9) and thereafter) σ ( K, B, T ) E Q d (cid:2) V T | S T = K, M T = B (cid:3) = σ ( K, B, T ) . (4.6)Through the conditional expectation, the function σ in (4.6) depends on the distribution of the joint process X = ( X t ) t ≥ = ( S t , M t , V t ) t ≥ . If we insert σ expressed from (4.6) in (4.2) for a model calibrated to vanillaand barrier quotes, via σ LMV , the resulting process thus falls in the class of McKean-Vlasov processes [30].The particle method for the estimation of conditional expectations was introduced in [30], and is discussedin detail in [38]; it was applied to LSV model calibration in [22, 23]. More details about stochastic filteringproblems, as well as a literature review, can also be found in [4].We consider N -sample paths (cid:0) X it (cid:1) ≤ i ≤ N = (cid:0) S it , M it , V it (cid:1) ≤ i ≤ N , t ≥ 0, i.e. N independent realisations of X ,and write for brevity X · = (cid:0) X i · (cid:1) ≤ i ≤ N . 17he (3 × N )-dimensional SDE driving the system X in the case of the LMSV model can be approximated by d ˆ S it ˆ S it = (cid:0) r d ( t ) − r f ( t ) (cid:1) dt + ˆ σ N (cid:16) ˆ S it , ˆ M it , t ; X (cid:17)(cid:112) V it dW it dV it = κ (cid:0) θ − V it (cid:1) dt + ξ (cid:112) V it dW V,it ˆ M it = max ≤ u ≤ t ˆ S iu , where ( W i · , W V,i · ), 1 ≤ i ≤ N , are independent samples of the two correlated driving Brownian motions, ˆ σ N is anestimator for σ to be defined below, and ˆ X t = (cid:16) ˆ X it (cid:17) ≤ i ≤ N = (cid:16) ˆ S it , ˆ M it , V it (cid:17) ≤ i ≤ N , t ≥ 0, with ˆ M it = sup s ≤ t ˆ S it .We use an extension of the QE-scheme [1] (see also Section 5.1) where the volatility now depends on therunning maximum as well as the spot. The Brownian increments are generated with a pseudorandom numbergenerator.s The running maximum is sampled approximately with a Brownian bridge technique as described inChapter 6 of [18], with σ kept constant in time between timesteps, G t = S t + S t +∆ t + (cid:113) ( S t + S t +∆ t ) − (cid:0) S t σ ( S t , t ) √ V t (cid:1) ∆ t log( U t )2 M t +∆ t = max( G t , M t ) , where U t is an independent draw from the uniform distribution U (0 , t .The accuracy of the integration of the SDE would be of lesser importance if the same scheme were used incalibration and pricing, since the calibration will be to the conditional law of the (approximate) model. Here,we discuss calibration by PIDE and pricing by MC and hence use an accurate timestepping scheme.We refer to [22] and [38] for more extensive details about the particle method and conditions for itsconvergence (which, to the best of our knowledge, are not proven for the present case).For the construction of the LMSV by particle method, we estimate the Markovian projection (2.9) asˆ p N ( K, B, T ; X T ) = N (cid:80) Ni =1 V iT δ N (cid:0) S iT − K, M iT − B, T (cid:1) + 2 θξ(cid:15) N (cid:80) Ni =1 δ N (cid:0) S iT − K, M iT − B, T (cid:1) + ξ(cid:15) , (4.7)with δ N an anisotropic bi-variate Gaussian kernel δ N ( x, y, T ) = exp (cid:16) − ζ ( T )1 − ρ xy ( T ) (cid:17) γ ( T ) (4.8) ζ ( T ) = x h x ( T ) + y h y ( T ) − ρ xy ( T ) xyh x ( T ) h y ( T ) γ ( T ) = 2 πh x ( T ) h y ( T ) (cid:113) − ρ xy ( T ) , where h x , h y , ρ xy as well as the specific bandwidth details are again given in Appendix A.1. When we used thenormal Silverman rule bandwidth with ρ xy ( T ) = 0, the accuracy was found to drop drastically and becameunsatisfactory for the considered number of particles in the range 100 000 − θξ(cid:15) and ξ(cid:15) in (4.7) serve as a smooth extrapolation rule for areas containing only afew particles. for all t , then the calibration algorithm recovers, as expected, the calibrated path-dependent(mimicking) volatility σ LMV . Similarly, if ξ = 0, then V is deterministic and a single particle is sufficient tocalibrate the LMSV model. 18hen, σ can be estimated by ˆ σ N ( K, B, T ; X ) = (cid:115) σ ( K, B, T )ˆ p N ( K, B, T ; X ) , with ˆ p N ( K, B, T ; X ) given in (4.7) and X · = ( S i · , V i · , M i · ) ≤ i ≤ N .The step-by-step calibration is detailed in Algorithm 2 in Appendix B.The computational complexity of a direct evaluation of σ LMV by ˆ p N (cid:0) S it , M it , t ; X t (cid:1) for all particles isquadratic in N . To reduce the cost, we first approximate σ LMV on a mesh in ( S, M ) and then interpolate bysplines for the actual evaluation; see Appendix A.3 for details. We do not use any regularisation for σ here.Moreover, for a given point ( K, B ) in (4.7), only a small number of particles in the vicinity contributessignificantly to the sum due to the fast decay of the kernel function. We perform an efficient search for thoseparticles on a k -d tree as described in Appendix A.1, which reduces the complexity per time step from O ( N )to O ( N S N M log N ), if the 2D spline has N S N M nodes.Finally, we note that alternative approaches could be used in order to estimate the conditional expectation.One example would be to bucket the particles and then locally linearly regress on the variables, e.g., S and M .This bucketing approach is used frequently for applications such as Monte Carlo valuation of American optionsand for CVA computations, to avoid having to use higher order polynomial terms as in the Longstaff–Schwartzalgorithm [29].We calibrated the model to the vanilla and no-touch quotes from Section 2.1. We plot in Figure 4.6 thecalibrated local volatility function for T = 1 with 350 time steps per year and 2 000 000 particles. The calibrationfit is compared to other models in Section 6. Figure 4.5: Joint density φ ( S, M, T ) of the spot andrunning maximum after calibration at time T = 1. Figure 4.6: Calibrated local volatility function σ ( S, M, T ) for the LMSV model at time T = 1. In this section, we describe the calibration of the LSV-LVV model (2.5) to both vanilla and no-touch options.Model (2.5) generalises the Heston-type LSV model (2.4) and we briefly discuss the prevalent approach to thecalibration of this model first.The calibration can be based on two different methods. On the one hand, if prices of exotic products arecomputed by Monte Carlo, it is possible to rely on a full Monte Carlo calibration approach. Accurate re-pricing19f vanilla options will be ensured by computation of the particle estimator (5.2), while keeping track of therunning maximum for each particle will allow to compute no-touch prices for all maturities and barrier levels.An optimisation algorithm can then be used to calibrate no-touch options. However, this approach leads toinaccuracies and parameter instabilities for longer maturities and higher barrier levels. Therefore, if one wishesto use PDE techniques in order to price a set of derivative products, a full Monte Carlo calibration becomes farless suitable. On the other hand, and in order to provide consistent and stable calibration for both PDE andMonte Carlo pricing, we propose a calibration method where no-touch prices are computed with PIDE (2.10),coupled with the LMV volatility calculated by a particle estimator we describe hereafter. This PIDE basedcalibration approach is described in the remainder of this section.For any set of parameters v , κ, θ, ξ, ρ in the Heston-LSV model ( β = 1), a sufficient condition on the localvolatility function σ (see, e.g., [23]) such that the model gives a perfect fit to arbitrage-free vanilla quotes is σ ( K, T ) = σ LV ( K, T ) (cid:112) E Q d [ V T | S T = K ] , (5.1)where σ LV is a local volatility function (i.e., calibrated to vanilla quotes). Notice here that the right-hand sidedepends on σ ( · , t ) for t < T through the conditional expectation.One approach to the calibration is to consider the case σ ( S t , t ) = 1 independent of S t and t and calibrate aHeston type model with β = 1 to the vanilla quotes by choice of v , θ, ξ, κ, ρ . Then, having established thischoice of these parameters, β is adjusted and for each choice the local volatility σ ( S, t ) is recalibrated. Thechoice of β is made for the best match to the barrier option prices (while by construction maintaining thecalibration to the vanilla options).The parameter β is commonly within the range [0 , 1] and called the “mixing factor” (see [10]): the market isbelieved to stand in between pure local volatility models, i.e. β = 0, and full Heston LSV models, i.e. β = 1.We show calibration results which support this claim in Section 6.To improve the calibration accuracy to no-touch options, whose payoff depends on the running maximum,the vol-of-vol “local volatility function” is made spot- and time-dependent in our model (2.5). σ We satisfy the calibration condition (5.1) by a modificaiton of the particle method from Section 4.3.We consider N -sample paths (cid:0) X it (cid:1) ≤ i ≤ N = (cid:0) S it , M it , V it (cid:1) ≤ i ≤ N , t ≥ X , and write for brevity X · = (cid:0) X i · (cid:1) ≤ i ≤ N . Then σ can be estimated byˆ σ N ( K, T ; X T ) = σ LV ( K, T ) (cid:112) ˆ p N ( K, T ; X T ) , with ˆ p N ( K, T ; X T ) = N (cid:80) Ni =1 V iT δ SN (cid:0) S iT − K, T (cid:1) + 2 θξ(cid:15) N (cid:80) Ni =1 δ SN (cid:0) S iT − K, T (cid:1) + ξ(cid:15) , (5.2)with δ SN a one-dimensional kernel function, specifically, δ SN ( x, T ) = 1 √ πh x ( T ) exp (cid:18) − x h x ( T ) (cid:19) , (5.3)where h x as well as the specific bandwidth details, constructed heuristically, are given in Appendix A.1. In ourtests we pick (cid:15) = 10 − . 20he (2 × N )-dimensional SDE approximating the system X is in the case of the LSV model d ˆ S it ˆ S it = (cid:0) r d ( t ) − r f ( t ) (cid:1) dt + ˆ σ N (cid:16) ˆ S it , t ; X t (cid:17)(cid:112) V it dW it dV it = κ (cid:0) θ − V it (cid:1) dt + ξ (cid:112) V it dW V,it , where ( W i · , W V,i · ) ≤ i ≤ N are N independent samples of the two correlated driving Brownian motions. For theLSV-LVV model, ξ ≡ ξ ( S it , t ), where the function ξ is assumed as given for now.For application of the forward PIDE (2.10), we also require the Markovian projection (2.9), and we estimatethis again as ˆ p N ( K, B, T ; X T ) = N (cid:80) Ni =1 V iT δ N (cid:0) S iT − K, M iT − B, T (cid:1) + 2 θξ(cid:15) N (cid:80) Ni =1 δ N (cid:0) S iT − K, M iT − B, T (cid:1) + ξ(cid:15) , with δ N an anisotropic bi-variate Gaussian kernel as earlier and bandwidth details given in Appendix A.1. ξ We recall that the available data is described in Section 2.1 as they inform the parametric form of ξ . Giventhe scarcity of the data, and to avoid over-fitting, we will consider two simple parametric vol-of-vol functions:one which is constant in the spot variable and piecewise constant in time, and one which is linear for a rangeof spot values (but capped above and below, i.e., piecewise linear in the spot) and constant in time betweenquoted maturities. More precisely, we write (cid:40) ξ ( S, T ) = ¯ ξ ( q ( S ) , T )¯ ξ ( S, T ) = max(( a n +1 ( S − S ) + b n +1 ) , ξ low ) , T ∈ [ T n , T n +1 ) , where we set T = 0, ξ low = 0 . 01 and with q defined as in (A.2). The construction performs a smooth,asymptotically constant extrapolation of ¯ ξ outside the interval [ S , B max ], where B max is the largest quotedno-touch barrier for the last quoted maturity T N Mat .Note that there are (only) two parameters per maturity, compared to one global vol-of-vol parameter forthe Heston model, and one parameter per maturity for the purely time-dependent case. For the followingdiscussion of the calibration, we focus on the piecewise linear example as the other one is a special case. First, we calibrate a pure Heston model to vanilla options only by minimising the mean square error for thedifference between market and Heston implied volatilities. For research purposes only, we use a Basin-Hoppingglobal optimisation [40] .As shown by the results in [10], the Heston-type LSV model with the Heston parameters calibrated tovanilla options overestimates the no-touch prices. A rule-of-thumb [10] suggests that dividing ξ calibrated tovanillas by two, i.e. taking the so-called “mixing factor” in (2.4) to be β = 0 . 5, provides a good starting point tofitting no-touch options; see also the beginning of Section 5. The parameters used are thus as displayed inTable 5.1 where the vol-of-vol calibrated to vanilla options, has been scaled by 0 . ξ .A local volatility function σ LV is calibrated with the procedure presented in the Appendix of [11], but anystable method, e.g., based on Dupire’s formula or a regularisation approach would be adequate. We thenminimise an error measure ¯ e over the parameters a and b for the fit to no-touch quotes with the model’s localvolatility component σ chosen to accurately fit vanilla options. To this end, we use the particle method from For equity options, a variance swap-based calibration may be preferable (see [20] for details). able 5.1: The calibrated Heston parameters. v θ κ ρ ξ a and b for the best-fitto no-touch quotes. For a proof of concept, we use the Nelder–Mead gradient free optimisation algorithm[32], for which sufficient convergence is obtained in 10 to 20 iterations in our tests. There is clearly room forimprovement by a faster optimisation procedure. Remark (Mixing factor) . Applying this approach to (2.4), i.e. best-fitting the (constant) mixing factor β tono-touch options with all other parameters best-fitted to vanilla options (see Table 5.1, and with ξ = 0 . β = 0 . S and V ), three-state ( S , V and M ) model,it would be possible to numerically solve the Kolmogorov forward PDE for φ , the joint density of the spot,stochastic variance and running-maximum, and compute the Markovian projection by quadrature as E Q d [ V T | S T = K, M T = B ] = ´ ∞ vφ ( K, B, v, T ) dv ´ ∞ φ ( K, B, v, T ) dv . The main reason why we estimate the conditional expectation by a particle method is that it allows a morestraightforward extension to higher-dimensional models, such as those with stochastic rates (or stochasticvol-of-vol or stochastic correlation).As discussed in the introduction of Section 5, another possible approach is to compute the no-touch pricesdirectly with the simulated paths and perform the optimisation process. The particle estimator to compute(5.2) is then still needed to compute the local volatility function σ . We will refer to this approach as “fullMonte Carlo” as the forward PIDE is not required anymore. We denote by T Mat n , 1 ≤ n ≤ N Mat , the quotedmaturities, by T m , m ≤ N T , the time grid, which is constructed to contain all quoted maturities T Mat n , and by N the number of particles. The step-by-step calibration is detailed in Algorithm 3 in Appendix B. For “pureMonte Carlo”, one can remove line 16 and line 17 of Algorithm 3 and replace line 19 by “compute model foreignno-touch price FNT Model for maturity T Mat n from the simulated Monte Carlo particles”. We carry out three “full Monte Carlo” calibrations as detailed at the end of Section 5.3, with 100 000 and500 000 particles, both with 100 time steps per year, and one with 1 000 000 particles and 300 time steps peryear, and one calibration using the forward PIDE with 100 000 particles, 100 time steps per year and 200 strikepoints. We pick (cid:15) = 10 − in (5.2) and (4.7). We found 15 × 10 spline nodes for the estimation of σ LMV toprovide a good trade-off between accuracy and smoothness. While having N S and N M too small will lead toaccuracy problems, choosing them too large will make the surface rougher due to over-fitting.The error measure we use for this comparison is the relative error for one-touch prices, rel = 100 × Q B (cid:88) l =1 (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) FOT Model (cid:0) B T Mat n ,l , T Mat n (cid:1) − FOT Market (cid:0) B T Mat n ,l , T Mat n (cid:1) FOT Market (cid:0) B T Mat n ,l , T Mat n (cid:1) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) , N = 100 000 MC N = 100 000 MC N = 500 000 MC N = 1 000 000 N K = 200, N T = 100 N T = 100 N T = 100 N T = 300Average relative error 0.474% 1.437% 0.797% 0.621%Timing factor Θ 1 0.3 1.5 10.6 Table 5.2: Average relative error for one-touch prices in comparison Full Monte Carlo vs PIDE. “Timing factor” is thetime needed for the calibration, relative to the PIDE approach computational time. N T is the number of time steps peryear, N the number of particles and N K the number of strike points in the PIDE scheme. where the foreign one-touch price FOT can be computed from the foreign no-touch price FNT with (2.6). Arelative error is best suited in order to compare numerical methods for different levels of barriers. Let τ PIDE be the computational time needed to perform the PIDE calibration with 100 000 particles, 200strike space points and 100 time steps per year. If we denote by τ MC the timing for a full Monte Carlocalibration, the “timing factor” Θ is defined as Θ = τ MC τ PIDE . In Table 5.2, we display the average relative error over all barrier levels and maturities as well as the relativecomputational times Θ with respect to the PIDE approach. Table 5.2 allows us to conclude that the fullMonte Carlo calibration error becomes comparable to the PIDE calibration error only with more than 1 000 000particles and 300 time steps per year. This makes the full Monte Carlo method ten times slower than the PIDEapproach.According to our numerical experiments, the two-dimensional particle method to compute σ and σ LMV takesapproximately two to three times as long as the one-dimensional particle method used for vanilla calibration, i.e.the computation of σ alone (including the computational time for the particle scheme evolution). Additionally,we also need to solve the forward PIDE at each time step, a task that has a comparable computational time assolving the two-dimensional Heston pricing PDE.The results plotted in Figure 5.3 show the relative error for one short (left) and one long maturity (right),where we notice that the full Monte Carlo approach, even with 1 000 000 particles and 300 times steps, haslarger relative error for the longer maturity. Monte Carlo pricing of barrier options is numerically challengingas it translates into integrating a discontinuous payoff function, a problem that becomes more pronounced forhigher levels of no-touch barriers as only a few particles will breach the barrier. Estimating σ LMV and pricingwith the forward PIDE does not suffer from this problem in the same way as the knock-out feature is simplytreated as a Dirichlet boundary condition.Additionally, we plot the calibrated local vol-of-vol functions for both the forward PIDE approach with100 000 particles in Figure 5.4 and the full Monte Carlo approach with 500 000 particles in Figure 5.5, bothwith 100 time steps per year, and notice that the use of the forward PIDE leads to more stable parameters.Hence, we conclude that combining the forward PIDE with the particle method provides a more efficientsolution for the calibration problem compared to the full Monte Carlo technique, as seen by comparing tothe “fully” converged surface in Figure 5.2 with more points and particles. However, the full Monte Carloapproach is a good alternative if one is not willing to implement the finite difference discretisation of the forwardPIDE. A key benefit of the full Monte Carlo calibration is that one calibrates exactly the model simulated forpricing. Additionally, the calibration can be done at the same time as the pricing. Thus, the accuracy of thenumerical discretisation of the model becomes less of a concern. The calibrated model is therefore the chosendiscretisation scheme. This means, however, that it becomes important to perform tests on the implementationto be sure that the discretisation has properties close to the desired model, for example by pricing moments of From a practitioner perspective, the absolute difference is more relevant. For calibration results expressed in terms of absolutedifference, see Section 6. σ and in Figure 5.2 the calibratedlocal vol-of-vol ξ from 3 months onward, obtained with 100 time steps per year, 500 000 particles and 900strike points for the PIDE. For these numerical parameters, the calibration error on the implied volatility is onaverage smaller than 2bps in absolute volatility. The fit to market data, especially regarding no-touch options,is discussed in detail in Section 6.In our tests, a n is negative and lies inside [ − . , 0] and b n is usually in the interval [ ξ H / , ξ H / ξ H is the vol-of-vol of a pure Heston model calibrated to vanilla prices.As seen in Figure 5.2 from the resulting shape of ξ , both parameters are stable from one maturity to thenext and make thus a good first guess for the next quoted pillar. Hence, for the shortest quoted maturity, westart the optimisation with a = − b = ξ H and then use the calibrated ( a n , b n ) of T Mat n as a first guessfor the iterative solver in the calibration to no-touch quotes at T Mat n +1 . Figure 5.1: Calibrated local volatility function σ for theLSV-LVV model (2.5). Figure 5.2: Calibrated local vol-of-vol ξ for the LSV-LVVmodel (2.5). Figure 5.3: Full Monte Carlo vs forward PIDE calibration comparison for a short- and long-term maturity pillar, N T is the number of time steps per year and N the number of particles. igure 5.4: Local vol-of-vol ξ ( S, t ) function calib-rated by forward PIDE 200 strike points, 100 000particles and 100 time steps per year. Figure 5.5: Local vol-of-vol function ξ ( S, t ) calibratedby full Monte Carlo, 500 000 particles and 100 time stepsper year. In this section, we present the calibration fit for all models in this paper, benchmarked against some widelyused models. The calibration error for vanilla options is given in Table 6.1, and for foreign no-touch optionsin Table 6.2, for both path-dependent models, i.e. the LMV model (4.1) and LMSV model (4.2), as well asthe LSV-LVV model (2.5) and the standard LSV model (2.4) with mixing factor β . As a benchmark, we alsoinclude the pure local volatility model, i.e. (2.4) with β = 0, and the LSV Heston model, i.e. (2.4) with β = 1,calibrated to vanilla options and where the Heston parameters are also calibrated to call options.The average error in absolute implied volatility for vanillas is 0 . . . T LMV LMSV LSV-LVV LSV mix.=0.55 LV LSV Heston0.26 Table 6.1: Average absolute error of implied volatilities in % for the vanilla options for all models. For the no-touch prices, we display the average absolute error in % for each maturity as e = 1 Q B Q B (cid:88) l =1 (cid:12)(cid:12)(cid:12) FNT Model (cid:0) B T Mat ,l , T Mat (cid:1) − FNT Market (cid:0) B T Mat ,l , T Mat (cid:1)(cid:12)(cid:12)(cid:12) in Table 6.2, where Q B is the number of quoted barriers and B T Mat ,l , 1 ≤ l ≤ Q B , the set of quoted barriers(e.g., for an average absolute error of 0 . 1% and for a market no-touch probability of 70%, the model will price25t at 70% ± . 1% on average). The fit for the two path-dependent models should theoretically be perfect forall vanilla and barrier contracts, and any mismatches consist in numerical errors and penalisation, while theLSV-LVV is by construction calibrated to vanilla options but has only two further free parameters per maturityto fit five no-touch prices.In Figure 6.1, we plot as a function of B , for fixed T , the error e ( B, T ) = (cid:16) FNT Model ( B, T ) − FNT Market ( B, T ) (cid:17) . All the model prices are computed with 1 000 000 Monte Carlo paths and 365 time steps per year, whereBrownian increments are generated with Sobol sequences and Brownian bridge construction [7]. The runningmaximum is sampled with the Brownian bridge technique as in Chapter 6 of [18]. We note that to reach fasterconvergence for pricing, Sobol sequences and Brownian bridge construction can be used naturally as each pathis simulated independently. T LMV LMSV LSV-LVV LSV mix.=0.55 LV LSV Heston0.26 Table 6.2: Average absolute error in % for foreign no-touch quotes for all models; see Fig. 6.1 for error plots. Onthe left hand-side are models calibrated to no-touches (LMV, LMSV, LSV-LVV); on the right-hand side models notcalibrated on no-touches (LSV mix.=0.55, LV, LSV Heston). As Figure 6.1 suggests, if no-touch options are not included in the set of calibration instruments, moreclassical models like the Heston LSV model can largely mis-price the no-touch probability (in fact, a mis-pricingsignificantly higher than 3% is common). These results show that the calibration of no-touch options is ofparamount importance in order to incorporate the information about the distribution of the running maximumprocess provided by the market.The LSV-LVV, LMV and LMSV models, calibrated to both vanilla and foreign no-touch options, performsignificantly better for the valuation of no-touch options than the LV or the Heston LSV model calibrated tovanilla options only. The inclusion of a constant mixing factor improves the fit for longer maturities, but stilldoes not allow calibration within the bid–ask spread. This is almost achieved by a time-dependent mixingfactor, and fully achieved with a time-dependent vol-of-vol which is also a linear function of the spot FX rate(LSV-LVV model of Section 5). In this work, we demonstrated on the example of three volatility models the calibration to two traded productclasses, namely, vanilla and no-touch options.We introduced a new LSV-LVV model, an extension of the classic Heston-type LSV model with a localvol-of-vol. Due to the small number of degrees-of-freedom, the chosen LSV-LVV parametrisation cannot matchno-touch mid-prices perfectly, however, the fit is very satisfactory as the model price lies well within the marketspread across all quoted barrier levels and maturities.We also studied a model based directly on the local maximum mimicking diffusion as the natural extension tothe Dupire local volatility framework, namely the LMV model. Then, the addition of a Heston-type stochastic26 igure 6.1: Calibration fit to foreign no-touch options for all models, as a function of barrier level B ; see Table 6.2 fortabulated errors. volatility on top of the maximum-dependent volatility leads to a new LMSV model with a potentially moreinteresting spot-vol dynamics.Two approaches were proposed for the calibration; one based a two-dimensional particle method to computethe Markovian projection onto the two-dimensional state space ( S, M ), and the other using the numericalsolution of a forward PIDE for barrier option prices.An interesting extension will be to compare the volatility dynamics implied by the three models throughthe pricing of forward start options. One would then be able to understand to which extent the calibration totouch options, and barriers in general, is compatible with the market smile dynamics. Acknowledgements Simon McNamara (now at UBS London ) and the first author originally proposed the LSV-LVV model inworkshop sessions held at BNP Paribas London . The authors thank Marek Musiela from the Oxford-ManInstitute for insightful comments. 27 Implementation details A.1 Construction of kernel for particle method The bandwidth of the one-dimensional Gaussian kernel in (5.3) is given by h x ( T ) = ηS σ LV ( S , T ) (cid:112) max( T, T min ) N − , (A.1)where σ LV ( S , T ) is replaced by σ LMV ( S , S , T ) for the LMV model, and η = 1 . , T min = 0 . , N = 180 . In the two-dimensional case of (4.7), (4.8), h y ( T ) = h x ( T ) ρ xy ( T ) = ( ρ max − ˆ ρ ( T )) exp (cid:18) − k N T T max (cid:19) + ˆ ρ ( T ) k = 2 T max ln(2) N , where T max is the last quoted maturity, ˆ ρ ( T ) the correlation between S T and M T estimated with the sampledparticles at time T , and ρ max = 0 . 98. To speed up the computation, it is enough to update ˆ ρ ( T ) once everyyear or half-year. The use of an appropriate bandwidth was found to have a significant impact on the accuracyof the method. This bandwidth is inspired by a Silverman-type rule (see [37], [22]) for the values of h x and h y and an experimental definition of the correlation partA heuristic analysis suggests that if the number of time steps is low, the running maximum with theBrownian bridge technique, as described in Chapter 6 of [18], tends to be underestimated in the important areawhere spot and its running maximum are around the initial values ( S , S ), since, as seen in Figure 4.6, thelocal volatility function is an increasing function of the running maximum in this region. Indeed, between twotime steps t and t + ∆ t , the Brownian bridge technique freezes the value of the volatility function σ LMV from(4.2)for a running maximum M at time t smaller than the value of M along [ t, t + ∆ t ], and therefore, σ LMV isunderestimated on average such that M t +∆ t computed by Brownian bridge is smaller than its exact value.The intuition behind the specification of the correlation ρ xy ( T ) is as follows: if the number of time steps islow, the running maximum will tend to be underestimated, hence, we give more importance to the particleswhere the running maximum is higher (which happens when the spot increases); if the number of time stepsis large enough, i.e. more than one per day, the running maximum bias is lower and we rely on the samplecorrelation. It is important to mention that in practice, ˆ ρ ( T ) is around 80% ± 5% and does not changesignificantly over time. Also, the closer we get to the boundary S = M , the higher the correlation, which caneasily reach 95%. This is due to the fact that for particles where M is large, it is highly probable that S is largeas well, therefore, most of the particles will gather around the diagonal boundary. Conversely, if M is around S , most particles are for spots going downwards. This is easily observed from the ( S t , M t ) joint density wherethe mass aggregates around the area ( S , S ), the line S = M and the line M = S as displayed in Figure 4.5where we plot the density computed with a finite element method. A.2 Particle search on tree For a given level of spot and running maximum, K and B , we only want to compute the kernel function in(4.7) for particles which give a significant contribution to the sum, i.e. particles close enough with respect tothe metric implied by δ N . We measure this by δ N ( x, y, T ) ≤ (cid:15) for some (cid:15) > 0, i.e. all the points contained inthe ellipse E defined by x + y − ρ xy ( T ) xy − H ( T ) = 0 , H ( T ) = − h x ( T ) ln( γ ( T ) (cid:15) ) . where γ is defined in (4.8). The canonical form of ellipse E expressed in the coordinate system defined by itsprincipal axes is (see [3]) x − D/ ( λ λ ) + y − D/ ( λ λ ) = 1 , with D = − H ( T ) (cid:0) − ρ xy ( T ) (cid:1) , λ = 1 − ρ xy ( T ) and λ = 1 + ρ xy ( T ), the roots of λ → λ − λ + (cid:0) − ρ xy ( T ) (cid:1) .Since ρ xy ( T ) ≥ R ( T ) = (cid:113) − D/ ( λ λ ) = (cid:115) H ( T )1 − ρ xy ( T ) . Working for simplicity with the Euclidean distance, the particles with a significant contribution to the valueof ˆ p N ( K, B, T ) are contained in the ball with center ( K, B ) and radius R ( T ). We use (cid:15) = ( ξ(cid:15) ) / 10 in our tests.A large value of (cid:15) leads to a fast computation of ˆ p N but also reduces the accuracy of the result.In order to perform a distance query efficiently, we build a k -d tree as described in [5], with a worst casecomplexity of O (2 N log N ) and in which we can perform a binary search with complexity O (log N ) on average,to find all particles with distance less than (cid:15) . This provides fast access to the nearest neighbours, and, as aconsequence, allows to list the particles contained in the ball of centre ( K, B ) and radius R ( T ). Remark. In order to further improve the performance, we could define a distance function d ( x , x ) = (cid:113) ( s − s ) + ( m − m ) − ρ ( s − s )( m − m ) , where the coordinates of a given particle i are given by x i = [ s i , m i ] T , with i ∈ { , } in this example. Then d can be combined with a metric tree algorithm [41] for an optimal nearest neighbours search, by only keepingparticles within a distance H ( T ) to the point ( K, B ) of interest. It is straightforward to check that d is in facta metric. A.3 Spline interpolation of the volatility surface We describe here the spline parameterisation of the volatility surfaces for a given time. Let N T be the numberof time steps. For a given time T m , the function σ is approximated on a rectangle [ S m min , S m max ] × [ S , S m max ],with S m min < S < S m max , by bi-variate quadratic splines in the spot and running maximum directions (andpiecewise constant in time), and is extrapolated outside these bounds as detailed in Appendix A.4. There are N T + 1 volatility “slices” in total such that we denote the m -th time slice, i.e., ( x, y ) → σ ( x, y, T m ), by σ m .The surface construction starts by defining a spot grid where we need more grid points around the forwardvalue and less around S m min and S m max . We then use a hyperbolic grid (see [11] for more details) refined aroundthe forward value F m = S e ´ Tm ( r d ( t ) − r f ( t ) ) dt with S m min = F m e − σ F ( T m ) √ T m , S m max = F m e σ F ( T m ) √ T m , where σ F ( T m ) is the at-the-money forward volatility of the market for maturity T m . The spot grid is denotedby ( S m,j ) m ≤ N T , j ≤ N S . The creation of the running maximum grid is done selecting the nodes of the spot gridabove the initial spot S , which leads to N B ( m ) ≤ N S , where N B is now time dependent This particularconstruction is crucial for accuracy as it ensures that the diagonal where S = M is part of the grid, with theassociated maximum grid points ( M m,k ) m ≤ N T , k ≤ N M ( m ) . Each of the grid values can be seen as a parameterand we denote them by ( σ m,j,k ) m ≤ N T , j ≤ N S ,k ≤ N M ( m ) .29 .4 Smooth volatility extrapolation Here, we describe how we extrapolate volatility functions smoothly to be asymptotically constant in the spatialcoordinates from a rectangle [ x min , x max ] × [ y min , y max ]. For σ ( · , · , t ) defined on [ x min , x max ] × [ y min , y max ], wefirst extend the function to R by constant extrapolation, σ ( x, y, t ) = σ (¯ q ( x ) , ¯ q ( y ) , t ) , ( x, y ) ∈ R , with ¯ q ( x ) = x max x ≥ x max + x 1, most of the transition will happen inside the domain. This behaviourcan seem attractive at first, however, it will give rise to issues if the function needs to take a specific shapeinside the domain, a local volatility for instance. In both cases, we have q ( x max ) ≈ x max and q ( x min ) ≈ x min .Finally, we reach a trade-off when η = 0. This is the value we pick in our implementation. The new volatility isthen ¯ σ ( q ( x ) , q ( y ) , t ) . An example is shown in Figure 3.6. B Algorithms lgorithm 1 Calibration of the local maximum volatility model σ LMV ( x, · , T ) = σ LV ( x, T ) , ∀ x for ( i = 1 ; i ≤ N Mat ; i ++) dowhile ¯ e > − and (cid:107)∇ ¯ e (cid:107) > − dosolve forward PIDE (2.10) on [ T i − , T i ] (with T = 0) compute model implied vol Σ Model for T i from computed up-and-out call prices C ( K, B max , T i ) compute model foreign no-touch price FNT Model for maturity T i from the computed up-and-out callprices C (0 , B, T i ) /S compute the objective function ¯ e (Λ i ) = e (Λ i )(1 + P (Λ i ))as in (4.3) compute the objective function gradient ∇ ¯ e (Λ i ) using the forward PIDE solution for ∇ C (Λ i ) update volatility surface points Λ i with the L-BFGS-B algorithm end whileset σ LMV ( x, y, T i +1 ) = σ LMV ( x, y, T i ) , ∀ ( x, y ) end forAlgorithm 2 Calibration of path-dependent σ with 2D particle method σ ( S, M, T = 0) = σ LMV ( S,M, √ v for each time point ( m = 0 ; m ≤ N T − m ++) do generate ( Z, Z v ) i ≤ N and ( U v , U max ) i ≤ N , i.e. 2 × N independent draws from N (0 , 1) and 2 × N drawsfrom U ([0 , evolve the 2-factor 3-state particle system from T m to T m +1 with the QE − Scheme ( Z, Z v , U v ) where therunning maximum is computed by Brownian bridge ( U max ) and where σ ( S, M, [ T m , T m +1 [) = σ ( S, M, T m ) build the particles k -d tree partitioning on the position values (cid:0) S i , M i (cid:1) as described in Appendix A.1. set T = T m +1 for each maximum level ( k = 1 ; k ≤ N M ; k ++) do for each spot level ( j = 1 ; j ≤ N S ; j ++) do set K = S m +1 ,j ; B = M m +1 ,k select using the k -d tree, a set of selected significant particles I ( K, B ) if ( K ≤ B ) then compute and update ˆ p N = | I ( K,B ) | (cid:80) i ∈ I ( K,B ) V iT δ N (cid:0) S iT − K, M iT − B, T (cid:1) + 2 θξ(cid:15) | I ( K,B ) | (cid:80) i ∈ I ( K,B ) δ N (cid:0) S iT − K, M iT − B, T (cid:1) + ξ(cid:15) end if compute σ m +1 ,j,k = σ LMV ( K, B, T ) √ ˆ p N end for end for end for lgorithm 3 Calibration of LSV-LVV ( σ, ξ ) with 2D particle method, forward PIDE and inner iterations σ ( S, 0) = σ LV ( S, √ v σ LMV ( S, M, 0) = σ LV ( S, N down T = 0 set ( a , b ) = ( − , ξ H ) for each maturity ( n = 1 ; n ≤ N Mat T ; n ++) do set ( a n , b n ) = ( a n − , b n − ) find N up T such that T N up T = T Mat n set the optimisation variable convergence = false while convergence is false do for each time point ( m = N down T ; m < N up T ; m ++) do generate ( Z, Z v ) i ≤ N and ( U v , U max ) i ≤ N , i.e., 2 × N independent draws from N (0 , 1) and U ([0 , evolve the 2-factor 3-state particle system of model (2.5) from T m to T m +1 with the QE-Scheme,where the maximum is computed by Brownian bridge ( U max ) and σ ( S, [ T m , T m +1 [) = σ ( S, T m ) build , as in Appendix A.2, the k -d tree of the particles by their positions (cid:0) S i , M i (cid:1) . This allows todefine a set of significant particles I ( K, B ) to use in (B.1). set T = T m +1 for each K on a grid, compute as in [22, 11] for a set of selected significant particles I ( K ) σ ( K, T ) = σ LV ( K, T ) (cid:113)(cid:80) i ∈ I ( K ) δ SN (cid:0) S iT − K, T (cid:1) + 2 θξ(cid:15) (cid:113)(cid:80) i ∈ I ( K ) V iT δ SN (cid:0) S iT − K, T (cid:1) + ξ(cid:15) , with δ SN a one-dimensional kernel function for each ( K, B ) on a grid, compute as in Section 4.3 for a set of selected significant particles I ( K, B )as in Section 4.3 σ LMV ( K, B, T ) = σ ( K, T ) (cid:113) | I ( K,B ) | (cid:80) i ∈ I ( K,B ) V iT δ N (cid:0) S iT − K, M iT − B, T (cid:1) + 2 θξ(cid:15) (cid:113) | I ( K,B ) | (cid:80) i ∈ I ( K,B ) δ N (cid:0) S iT − K, M iT − B, T (cid:1) + ξ(cid:15) , (B.1)where δ N is a two-dimensional kernel function solve the forward PIDE for barriers (2.10) by BDF2 implicit step on [ T m , T m +1 ] with a volatility of σ LMV ( K, B, T m +1 ) end for compute model foreign no-touch price FNT Model for maturity T Mat n from the up-and-out call prices C (cid:0) , B, T Mat n (cid:1) computed with the PIDE compute the objective function¯ e ( a n , b n ) = Q B (cid:88) l =1 (cid:12)(cid:12)(cid:12) FNT Model (cid:0) B T Mat n ,l , T Mat n , a n , b n (cid:1) − FNT Market (cid:0) B T Mat n ,l , T Mat n (cid:1)(cid:12)(cid:12)(cid:12) update ( a n , b n ) guess with the Nelder–Mead algorithm [32] check difference with the previous iteration and set convergence = true if converged end while end for eferences [1] L. Andersen. Simple and efficient simulation of the Heston stochastic volatility model. The Journal ofComputational Finance , 11(3):1–42, 2008.[2] E. Ayache, P. Henrotte, S. Nassar, and X. Wang. Can anyone solve the smile problem. Wilmott Journal ,January 2004.[3] A. B. Ayoub. The central conic sections revisited. Mathematics Magazine , 66(5):322–325, 1993.[4] A. Bain and D. Crisan. Fundamentals of Stochastic Filtering . Stochastic Modelling and Applied Probability.Springer New York, 2008.[5] R. A. Brown. Building a balanced k -d tree in o ( kn log n ) time. Journal of Computer Graphics Techniques ,4(1):50–68, 2015.[6] G. Brunick and S. Shreve. Mimicking an Itˆo process by a solution of a stochastic differential equation. Annals of Applied Probability , 23(4):1584–1628, 2013.[7] R. E. Caflisch, W. Morokoff, and A. B. Owen. Valuation of mortgage-backed securities using Brownianbridges to reduce effective dimension. Journal of Computational Finance , 1:27–46, 1997.[8] R. Carmona and S. Nadtochiy. Local volatility dynamic models. Finance and Stochastics , 13(1):1–48,2009.[9] P. Carr and J. Crosby. A class of L´evy process models with almost exact calibration to both barrier andvanilla FX options. Quantitative Finance , 10(10):1115–1136, 2010.[10] I. J. Clark. Foreign Exchange Option Pricing: A Practitioner’s Guide . John Wiley & Sons, Chichester,UK, 2010.[11] A. Cozma, M. Mariapragassam, and C. Reisinger. Calibration of a four-factor hybrid local-stochasticvolatility model with a new control variate particle method. preprint, arXiv:1701.06001, 2017.[12] S. Cr´epey. Tikhonov regularization. In R. Cont, editor, Encyclopedia of Quantitative Finance , pages1807–1812. John Wiley & Sons, Chichester, UK, 2010.[13] B. Dupire. A unified theory of volatility. In P. Carr, editor, Derivatives Pricing: The Classic Collection .Risk Books, London, 2004.[14] H. Egger and H. W. Engl. Tikhonov regularization applied to the inverse problem of option pricing:convergence analysis and rates. Inverse Problems , 21(3):1027, 2005.[15] B. Fornberg. Generation of finite difference formulas on arbitrarily spaced grids. Mathematics ofComputation , 51(184):699–706, 1988.[16] M. Giles and R. Carter. Convergence analysis of Crank-Nicolson and Rannacher time-marching. Journalof Computational Finance , 9(4):89–112, 2006.[17] M. Giles and P. Glasserman. Smoking adjoints: Fast Monte Carlo Greeks. Risk , 19(1):88–92, 2006.[18] P. Glasserman. Monte Carlo Methods in Financial Engineering , volume 53 of Stochastic Modelling andApplied Probability . Springer, 2003.[19] A. Griewank and A. Walther. Evaluating derivatives: principles and techniques of algorithmic differentiation .SIAM, second edition, 2008. 3320] F. Guillaume and W. Schoutens. Heston model: The variance swap calibration. Journal of OptimizationTheory and Applications , 161(1):76–89, 2014.[21] J. Guyon. Path-dependent volatility. Risk Magazine , September, 2014.[22] J. Guyon and P. Henry-Labord`ere. Being particular about calibration. Risk , 25(1):88, 2012.[23] J. Guyon and P. Henry-Labord`ere. Nonlinear Option Pricing . Chapman and Hall/CRC FinancialMathematics. Chapman and Hall/CRC, Boca Raton, USA, 2013.[24] I. Gy¨ongy. Mimicking the one-dimensional marginal distributions of processes having an Itˆo differential. Probability Theory and Related Fields , 71(4):501–516, 1986.[25] E. Hairer, G. Wanner, and S. P. N¨ı¿œrsett. Solving Ordinary Differential Equations I: Nonstiff Problems .Springer Series in Computational Mathematics 8. Springer-Verlag, Berlin Heidelberg, 2 edition, 1993.[26] B. Hambly, M. Mariapragassam, and C. Reisinger. A forward equation for barrier options under theBrunick & Shreve Markovian projection. Quantitative Finance , 16(6):827–838, 2016.[27] N. Jackson, E. S¨uli, and S. Howison. Computation of deterministic volatility surfaces. Journal ofComputational Finance , 2:5–32, 1998.[28] A. Langnau. A dynamic model for correlation. Risk , 23(4):74–78, 2010.[29] F. A. Longstaff and E. S. Schwartz. Valuing american options by simulation: A simple least-squaresapproach. Review of Financial Studies , pages 113–147, 2001.[30] H. P. McKean. A class of Markov processes associated with nonlinear parabolic equations. Proceedings ofthe National Academy of Sciences of the United States of America , 56:1907–1911, 1966.[31] M. Musiela and M. Rutkowski. Martingale methods in financial modelling . Springer Berlin, 2009.[32] J. A. Nelder and R. Mead. A simplex method for function minimization. The Computer Journal , 7(4):308,1965.[33] O. Pironneau. Dupire-like identities for complex options. Compte rendu de l’acad¨ı¿œmie des sciences I ,344:127–133, 2007.[34] V. Piterbarg. Markovian projection method for volatility calibration. SSRN preprint 906473, 2006.[35] A. Reghai. Breaking correlation breaks. Risk , 23(10):92–97, 2010.[36] Y. Ren, D. Madan, and M. Qian Qian. Calibrating and pricing with embedded local volatility models. Risk Magazine , September, 2007.[37] B. W. Silverman. Density estimation for statistics and data analysis. Biometrical Journal , 30(7):876–877,1988.[38] A.-S. Sznitman. Ecole d’Et´e de Probabilit´es de Saint-Flour XIX , chapter Topics in propagation of chaos,pages 165–251. Springer, 1991.[39] L. Tur. Local volatility calibration with fixed-point algorithm. GDF Suez Trading , Informal discussion,2014.[40] D. J. Wales and J. P. K. Doye. Global optimization by basin-hopping and the lowest energy structures ofLennard-Jones clusters containing up to 110 atoms. The Journal of Physical Chemistry A , 101(28):5111–5116, 1997. 3441] P. N. Yianilos. Data structures and algorithms for nearest neighbor search in general metric spaces. In Proceedings of the Fifth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) , 1993.[42] C. Zhu, R. H. Byrd, P. Lu, and J. Nocedal. Algorithm 778: L-BFGS-B: Fortran subroutines for large-scalebound-constrained optimization.