Weak error rates for option pricing under the rough Bergomi model
CHRISTIAN BAYER, ERIC JOSEPH HALL, AND RAÚL TEMPONE

Abstract.
In quantitative finance, modeling the volatility structure of underlying assets is a key component in the pricing of options. Rough stochastic volatility models, such as the rough Bergomi model [Bayer, Friz, Gatheral, Quantitative Finance 16(6), 887-904, 2016], seek to fit observed market data based on the observation that the log-realized variance behaves like a fractional Brownian motion with small Hurst parameter, H < 1/2, over reasonable timescales. Both time series data of asset prices and option-derived price data indicate that H often takes values close to 0.1 or smaller. We prove that the Euler-Maruyama (left-point) scheme for this model converges with weak rate H + 1/2.

1. Introduction
Rough stochastic volatility models form an increasingly popular paradigm in quantitative finance, as they simultaneously address two empirical challenges. Firstly, time series of realized variance indicate that variance is rough in the sense of having Hölder regularity H ≪ 1/2, see [13, 6, 12]. Secondly, rough volatility models recover the power-law explosion of the at-the-money implied volatility skew of the form τ^(−γ), for some 0 < γ < 1/2, as τ → 0. In fact, these two constants are linked by γ = 1/2 − H, giving further evidence of the regularity H being small, say around 0.1. We refer to [4] for the pricing perspective.

To fix notation, we consider a rough stochastic volatility model for an asset price process S_t of the form

  dS_t = √(v_t) S_t dZ_t,

where Z is a Brownian motion (Bm). There are two classes of rough volatility models which differ in the specification of the instantaneous variance component v_t. The rough Heston model ([11]) is an example of one kind, with v_t given as the solution to a Volterra stochastic differential equation (SDE) with a power-law kernel K(r) ∼ r^(H−1/2), r > 0. This paper considers an alternative where the variance process is an explicit function of a fractional Brownian motion (fBm) W_t^H, which need not be the classical fBm. For instance, the rough Bergomi model ([4]) is specified by the choice

(1.1)  v_t := ξ(t) exp( η W_t^H − (η²/2) t^(2H) ),

where ξ(t) denotes the forward variance and W_t^H denotes the Riemann–Liouville fBm given by

(1.2)  W_t^H := ∫_0^t K(t − s) dW_s,  K(r) := √(2H) r^(H−1/2),

for a Bm W with correlation ρ with Z.

The modelling advantages gained by addressing these two empirical challenges using a rough stochastic volatility model are paid for on both the theoretical and the numerical side. Indeed, rough stochastic volatility models are neither semi-martingales nor Markov processes. Despite the former, rough volatility

WIAS, Mohrenstr. 39, 10117 Berlin, Germany.
RWTH Aachen University, Chair of Mathematics for Uncertainty Quantification, Kackertstr. 9, 52072 Aachen, Germany.
University of Dundee, Division of Mathematics, Nethergate, Dundee, DD1 4HN, UK.
King Abdullah University of Science and Technology (KAUST), Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), Thuwal 23955-6900, Saudi Arabia.
E-mail addresses: [email protected], [email protected], [email protected].
2010 Mathematics Subject Classification.
Key words and phrases. Rough volatility, option pricing, weak error, Euler–Maruyama, non-Markovian dynamics.

models do not violate the no-arbitrage condition, as the asset price process itself is a martingale. On the other hand, the difficulties caused by the lack of the Markov property are more severe. In particular, there is no finite-dimensional pricing PDE anymore (although we refer to [19] for an implementation of an infinite-dimensional pricing PDE based on machine learning). For some rough volatility models of affine Volterra type, for instance the rough Heston model, there is still a semi-explicit formula for the asset price's characteristic function in terms of a deterministic fractional ODE. Otherwise, the rough stochastic volatility approach necessitates simulation-based methods.

On the numerical side, W^H at times t_1, ..., t_N can be sampled exactly at discrete time points since W^H is a Gaussian process with known covariance function. (The hybrid scheme of [7] is a popular alternative to exact simulation, sacrificing accuracy for speed.) However, simulation of S_t requires discretization of a stochastic integral, even in simple cases like the rough Bergomi model. As we shall see in further detail later, we essentially need to compute stochastic integrals of the form

(1.3)  ∫_0^T ψ(t, W_t^H) dW_t,

for some deterministic, "nice" function ψ. In particular, note that the integrand is adapted and square-integrable (under appropriate conditions). Hence, the stochastic integral exists in the classical Itô sense, and strong convergence of the numerical scheme

(1.4)  Σ_{i=0}^{n−1} ψ(t_i, W_{t_i}^H)(W_{t_{i+1}} − W_{t_i})

is also classical. The speed of convergence is considerably less clear.
Indeed, Neuenkirch and Shalaiko [21] proved strong convergence with rate H for a very similar problem, i.e., phrased in terms of classical fBm, and strong rate H is widely expected to hold also for the approximation scheme (1.4) to (1.3). Using techniques from regularity structures, in particular renormalization by an exploding constant, [3] proved essentially the same strong rate for a Wong-Zakai type approximation of (1.3).

Combining our observations, namely that volatility is rough (H ≈ 0.1) and that typical schemes converge with strong rate H, we run into problems, as the rate of convergence is so small as to make it indistinguishable from lack of convergence in many cases of practical importance. Indeed, suppose that H = 0.1 and that we need n time steps to reach an error tolerance ε. If we now decrease our tolerance by a factor of ten, i.e., we require one additional significant digit, then the number of time steps needed increases by a factor of 10^(1/H) = 10^10 in the asymptotic regime.

For most applications we really require weak as opposed to strong convergence of the numerical scheme. For instance, the price of a European option with payoff ϕ is E[ϕ(S_T)], and its computation relies on weak convergence of the scheme. Weak approximation of stochastic integrals is often much faster than strong approximation. Consider the Euler scheme for standard SDEs (the case H = 1/2), where the weak rate is one while the strong rate is, in general, only 1/2. This poses the interesting question of the relation between the Hölder regularity and the convergence rates. The strong rate for (1.4) is expected to be H, but there are several plausible candidates for the weak rate: 2H, H + 1/2, and 1 (independent of H).¹ We stress that only the last two alternatives allow for feasible numerical simulations in the truly rough regime. Bluntly put, if the true weak error only decays proportionally to n^(−H) in the number of time steps n, then simulation methods are not viable numerical methods for option pricing in rough volatility models.

Despite the importance of the problem of determining the weak rate, only little work has been done. Horvath, Jacquier and Muguruza [17] study a Donsker theorem for a rough volatility model, which translates into a weak tree-type approximation. The rate of convergence of their method is H in the number of time steps. At this stage, we should note that the trees are non-recombining, implying that the memory load increases exponentially in the number of time steps. To the best of our knowledge, this work provides the only rigorous weak convergence result in the literature on rough volatility models. Indeed, it is worth pointing out that standard proof techniques for diffusions, see [24], strongly rely on the Markov property and are, hence, not applicable in this setting.

At the same time, discretization-based simulation methods are often used in the literature, with great success. While convergence is rarely considered (not even empirically), we would expect to see difficulties emerge in the very rough cases H ≈ 0.1 if the weak rate were H or 2H. In fact, the few available empirical studies (for instance, [5]) indicate a much larger weak rate of convergence. Indeed, the authors of [5] observe a weak rate of one which is stable enough to allow accelerated convergence by Richardson extrapolation.

¹ Anecdotally, we asked several experts on stochastic numerics in early stages of working on this problem, and all three possibilities were put forward.
In this paper we prove novel weak rates for the convergence of the left-hand rule (1.4) to (1.3):
Theorem 1.1.
The left-point approximation (1.4) to the rough stochastic integral (1.3) converges with weak rate H + 1/2 for ψ(t, W_t^H) = W_t^H. In the case that the payoff ϕ is a quadratic polynomial, the convergence is of weak rate one.

We refer to Theorems 2.1 and 4.1 for more precise statements. Some remarks are in order:
• The problem of weak convergence in this setting is very subtle; if we restrict ourselves to quadratic polynomials as payoff functions, then the weak rate of convergence is actually one, see Lemma 4.2. This implies that the rate of convergence for payoff functions ϕ which can be well approximated by quadratic polynomials, as seen from the law of the solution, may be hard to distinguish from rate one empirically, due to the prevalence of higher-order terms (see Figure 3).
• We do not have a lower bound establishing that the weak rate of convergence cannot be better than H + 1/2.
• We do not doubt that the proof extends to the general case of non-linear ψ, and a complete proof of this extension is presently a work in progress.

Our proof of Theorem 1.1 relies on deriving Taylor expansions for the weak error using an affine Markovian representation of the underlying. The basic flavor of this approach, i.e., obtaining a Markovian extended-variable system to facilitate analysis, is a strategy utilized for other non-Markovian stochastic dynamical systems such as the generalized Langevin equation (see, e.g., [14, 10]) and open Hamiltonian systems ([22]). In the context of rough volatility models, Markovian approximations were also used in [1].

Outline of the paper.
In Section 2 we provide the setting and the main result and discuss the general strategy of the proof. Section 3 introduces auxiliary, Markovian approximations to both (1.3) and (1.4) based on [8]. This high-dimensional Markovian problem will serve as a surrogate problem for most of the convergence analysis. Section 4 considers the special case of quadratic payoff functions, for which the general proof strategy simplifies considerably. We contrast this with a specific proof only applicable to quadratic payoffs, which also works for general non-linear ψ. The proof of Theorem 1.1 (and Theorem 2.1) is then carried out in Section 5.

2. Problem setting: weak rate of convergence for the Euler scheme is H + 1/2

We consider E[ϕ(X_T)] for an underlying

(2.1)  X_t := ∫_0^t ψ(s, W_s^H) dW_s,

where W_t^H is a Riemann–Liouville fBm given by (1.2) with Hurst parameter H ∈ (0, 1/2). The X_t in (2.1) is non-Markovian as W_t^H, and hence ψ(t, W_t^H), depends on the full history of (W_s)_{s∈[0,t]} (cf. the relation of ψ to the instantaneous variance v_t in (1.1)). In fact, for the purposes of European options, the rough Bergomi model can be reduced to (2.1) in the following way (often attributed to [23]). First, Itô's formula implies that

  S_T = S_0 exp( −(1/2) ∫_0^T v_s ds + ∫_0^T √(v_s) dZ_s ).

We can now replace the Bm Z by ρW + √(1 − ρ²) W^⊥ for an independent Bm W^⊥. Conditionally on W, S_T has a log-normal distribution with parameters

  μ := log S_0 − (1/2) ∫_0^T v_s ds + ρ ∫_0^T √(v_s) dW_s,  σ² := (1 − ρ²) ∫_0^T v_s ds.

If we denote the Black–Scholes price for the payoff function ϕ at maturity T by C_BS(S_0, σ_BS² T, ϕ), for interest rate r = 0 and volatility σ_BS, then we get

(2.2)  E[ϕ(S_T)] = E[ C_BS( S_0 exp( −(ρ²/2) ∫_0^T v_s ds + ρ ∫_0^T √(v_s) dW_s ), (1 − ρ²) ∫_0^T v_s ds, ϕ ) ].

Computation of the right-hand side of (2.2) requires simulation of the Lebesgue integral ∫_0^T v_s ds as well as simulation of

(2.3)  ∫_0^T √(v_s) dW_s = ∫_0^T √(ξ(s)) exp( (η/2) W_s^H − (η²/4) s^(2H) ) dW_s,

which is of the form (2.1).
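The conditional Black–Scholes representation (2.2) is easy to exploit once the two integrals ∫_0^T v_s ds and ∫_0^T √(v_s) dW_s have been simulated. The sketch below (function names are ours, not the paper's) evaluates one conditional Black–Scholes price; as a sanity check, for ρ = 0 it reduces to the plain Black–Scholes price with total variance ∫_0^T v_s ds, whatever the W-path contributed.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S0, K, total_var):
    """Black-Scholes call with interest rate r = 0, parametrized by the
    total variance sigma_BS^2 * T."""
    if total_var <= 0.0:
        return max(S0 - K, 0.0)
    s = math.sqrt(total_var)
    d1 = (math.log(S0 / K) + 0.5 * total_var) / s
    return S0 * norm_cdf(d1) - K * norm_cdf(d1 - s)

def mixing_price(S0, K, rho, int_v, int_sqrt_v_dW):
    """One conditional Black-Scholes evaluation in (2.2): given sampled
    values of int_0^T v_s ds and int_0^T sqrt(v_s) dW_s, the conditional
    price is a BS price with shifted spot and reduced total variance."""
    spot = S0 * math.exp(-0.5 * rho ** 2 * int_v + rho * int_sqrt_v_dW)
    return bs_call(spot, K, (1.0 - rho ** 2) * int_v)

# Sanity check: for rho = 0 the mixing price is the plain BS price,
# independently of the simulated W-integral.
p_mix = mixing_price(1.0, 1.0, 0.0, 0.04, 0.123)
p_bs = bs_call(1.0, 1.0, 0.04)
```

In a Monte Carlo pricer one averages `mixing_price` over many sampled pairs of the two integrals; the smoothing provided by the conditional Black–Scholes price is exactly the point of (2.2).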
Presently, we derive weak rates of convergence,

(2.4)  |E[ϕ(X_T) − ϕ(X_T^∆t)]| = O(∆t^γ),

for the left-point scheme (1.4) with step size ∆t such that n∆t = T. Restricting to the case ψ(s, W_s^H) = W_s^H, the main finding of this work, in Theorem 2.1 (and implying the first statement in Theorem 1.1), is that the weak rate is γ = H + 1/2, significantly better than the strong rate H.

Theorem 2.1 (Weak rate). For general ϕ ∈ C_b^η, for a sufficiently large integer η depending on H, and ψ(s, W_s^H) = W_s^H, we have

  |Err(T, ∆t)| = |E[ϕ(X_T) − ϕ(X_T^∆t)]| = O(∆t^(H+1/2)),

i.e. the Euler method has weak rate H + 1/2.

The proof of Theorem 2.1 is presented in Section 5. Before diving into the machinery needed for the proof, we first consider some numerical evidence that supports the rates in Theorem 1.1 and the accompanying remarks. Details of the implementation are outlined in Appendix A.

The first group of numerical experiments, in Figure 1, provides support for rate H + 1/2, significantly better than strong rate H, for the general (i.e. non-quadratic) payoff functions ϕ(x) = x³ and ϕ(x) = Heaviside(x). Indeed, the best fits (least squares) of the weak error to ∆t, as well as the extremes suggested by the upper and lower 95% confidence intervals for the mean based on M samples, are consistent with the rate H + 1/2. Comparing Figure 1a to Figure 1b, the rate increases (by approximately 0.1) as H increases from H = 0.05 to 0.15. Although the function ϕ(x) = Heaviside(x) is not continuous and therefore does not fit precisely into our theory, the consistency of the observed rates in Figure 1 hints at the generality of the findings in Theorem 2.1 to, e.g., digital call options.

Figure 1.
For small Hurst parameters, (a) H = 0.05 and (b) H = 0.15, the best-fit slope for the weak error of scheme (1.4), together with the extremes suggested by the 95% CI based on M observations, is consistent with the rate H + 1/2 for several payoffs ϕ. In particular, the rate holds for the discontinuous ϕ(x) = Heaviside(x), suggesting our findings are robust. Here ∆t ranges over dyadic meshes and the reference solution is computed on a finer dyadic mesh ∆t_ref.

In Figure 2b, we observe that for H = 0.50, i.e. standard Brownian motion, the best fit of the weak error rate is consistent with the known weak rate one for general payoff functions. However, in contrast to the rates observed in Figure 1, the behavior of quadratic payoffs looks decidedly different. We observe in Figure 2a that the weak rate for the quadratic ϕ(x) = x² appears to be γ = 1 even for small H = 0.05 and H = 0.15. Weak rate one for quadratic payoff functions is recorded in Theorem 4.1 and Lemma 4.2 in Section 4; this surprising finding, that the rate depends on the payoff function, will be readily explained using the asymptotic expansions that are at the center of our approach.

Finally, in Figure 3, we observe that the best fit of the weak rate to ∆t for the shifted cubic ϕ(x) = (x + c)³, for a fixed shift c, is consistent with rate 1 even for small H = 0.05 and H = 0.15 (cf. the rates in Figure 3 with those for the cubic payoff ϕ(x) = x³ in Figures 1a and 1b). As seen from the law of the solution, the shifted cubic is better approximated by quadratic polynomials and therefore its rate of convergence is much harder to distinguish from rate one. This numerical experiment not only drives home the subtlety of the problem of deriving weak rates for the rough Bergomi model, but also leads us to be optimistic that efficient numerical methods can be obtained for a wide array of real-world problems where the effective rate of convergence is not as bad as the theoretical rate.

Figure 2. (a) Surprisingly, the best-fit line for the weak error of scheme (1.4) for the quadratic payoff ϕ(x) = x² is consistent with weak rate one even for small H, as found in Theorem 4.1. (b) For Hurst parameter H = 1/2, the weak rate in Theorem 2.1 for scheme (1.4) is consistent with the expected rate one (for standard Bm), as illustrated by the best-fit slope for the weak error for ϕ(x) = x³ and ϕ(x) = Heaviside(x) (cf. weak rate H + 1/2 for small H). Here ∆t ranges over dyadic meshes and the reference solution is computed on a finer dyadic mesh ∆t_ref.

Figure 3.
The weak error of scheme (1.4) for the shifted cubic payoff ϕ(x) = (x + c)³ achieves a higher rate than ϕ(x) = x³, as the shifted cubic is better approximated by a quadratic on the support of the distribution of the underlying. Here ∆t ranges over dyadic meshes and the reference solution is computed on a finer dyadic mesh ∆t_ref.

Remark 2.2. Although the assumptions of Theorem 2.1 seem extremely strong, they do reflect meaningful financial situations. In particular, note that (2.2) allows us to replace the (generally non-smooth) payoff functions of European options by their smooth Black–Scholes prices. Additionally, put-call parity may allow us to assume bounded payoffs. Linearity of ψ is, admittedly, a very strong assumption, which should be seen as the first stepping stone to the general result. We conjecture that Theorem 2.1 holds in the setting of the rough Bergomi model, i.e., for non-linear ψ as given in (2.3).
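The best-fit slopes reported in Figures 1-3 amount to a least-squares fit of log |Err| against log ∆t; a minimal sketch (the synthetic data below decay at the proven rate H + 1/2 with H = 0.1 and are illustrative only):

```python
import math

def fit_rate(dts, errs):
    """Least-squares slope of log(err) against log(dt): the empirical
    weak rate gamma in |Err(T, dt)| ~ C * dt^gamma."""
    xs = [math.log(d) for d in dts]
    ys = [math.log(e) for e in errs]
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    num = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    den = sum((x - xbar) ** 2 for x in xs)
    return num / den

# Synthetic errors decaying at rate H + 1/2 = 0.6 on dyadic meshes:
dts = [2.0 ** (-k) for k in range(4, 10)]
errs = [3.0 * dt ** 0.6 for dt in dts]
rate = fit_rate(dts, errs)
```

With real Monte Carlo data the fitted slope also carries statistical error, which is what the 95% confidence intervals in the figures quantify.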
Remark 2.3. For the simple model problem (1.3), the numerical integration scheme (1.4) is the left-point approximation. If the problem were not trivialized to a stochastic integral, then in general X_T^∆t would correspond to the Euler–Maruyama approximation of the underlying SDE, and we refer to the scheme interchangeably as both.

In the next section, we introduce the notation and concepts that will be used to derive asymptotic expansions for the weak error in powers of ∆t. In particular, we first use these expansions to derive weak rate one for quadratic payoffs, see Theorem 4.1, in Section 4. Finally, in Section 5, a proof following the approach used for Theorem 4.1 as a guide is given for Theorem 2.1, obtaining weak rate H + 1/2.

3. Markovian extended state space formulation
We first consider a well-known affine representation for the driving fBm. Discretizing this affine representation yields an extended state space for the dynamics of the underlying. A novelty of our approach is to utilize this formulation to obtain asymptotic expansions for the weak error. In particular, we utilize the Markovian structure of the extended state space to show that (3.7) admits a Taylor expansion in ∆t where the coefficients can be controlled independently of the choice of parameters used to obtain the extended state space formulation.

3.1. Affine representations for small Hurst index.
Over the Hurst parameter regime of interest, the fBm (1.2) admits an affine representation as a linear functional of an infinite-dimensional family of Ornstein–Uhlenbeck (OU) processes ([8]).
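The representation below hinges on writing the power kernel as a Laplace transform, (t − s)^(H−1/2) = Γ(1/2 − H)^(−1) ∫_0^∞ θ^(−(H+1/2)) e^(−θ(t−s)) dθ. This Gamma-integral identity can be verified numerically; the substitution θ = u^(2/(1−2H)) used in the sketch below removes the singularity at θ = 0 (the quadrature parameters are illustrative choices):

```python
import math

def kernel_via_laplace(H, r, n=20000, U=4.0):
    """Numerically evaluate Gamma(1/2 - H)^{-1} *
    int_0^inf theta^{-(H + 1/2)} e^{-theta r} dtheta.

    Substituting theta = u^{2/(1-2H)} turns the integrand into the smooth
    function (2/(1-2H)) * exp(-u^{2/(1-2H)} * r), integrated here by the
    trapezoid rule on [0, U].
    """
    p = 2.0 / (1.0 - 2.0 * H)
    h = U / n
    total = 0.5 * (1.0 + math.exp(-(U ** p) * r))   # endpoint terms
    for k in range(1, n):
        total += math.exp(-((k * h) ** p) * r)
    integral = (2.0 / (1.0 - 2.0 * H)) * total * h
    return integral / math.gamma(0.5 - H)

H, r = 0.1, 0.7
lhs = kernel_via_laplace(H, r)
rhs = r ** (H - 0.5)   # the kernel power (t - s)^{H - 1/2} with r = t - s
```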
Lemma 3.1 (Affine representation). For 0 < H < 1/2,

(3.1)  W_t^H = c̃_H ∫_0^∞ Ỹ_t(θ) θ^(−(H+1/2)) dθ,

where Ỹ_t(θ) = ∫_0^t e^(−θ(t−s)) dW_s and c̃_H is a positive and finite constant depending on H.

Although this statement is well known, we provide key details of the proof, which will be referenced later, for the convenience of the reader. The full proof can be found in, e.g., [8, 16] (see also [9, 20, 15], where [9] gives a Markovian representation for H > 1/2 and [20] a time-homogeneous Markovian representation that is also defined for t ∈ (−∞, ∞)).

Proof.
Writing the kernel appearing in (1.2) as a Laplace transform,

  (t − s)^(H−1/2) = (1/Γ(1/2 − H)) ∫_0^∞ θ^(−(H+1/2)) e^(−θ(t−s)) dθ,

and then using stochastic Fubini, one obtains the desired result,

  W_t^H = ∫_0^t ( √(2H)/Γ(1/2 − H) ) ∫_0^∞ θ^(−(H+1/2)) e^(−θ(t−s)) dθ dW_s
        = ∫_0^∞ c̃_H ( ∫_0^t e^(−θ(t−s)) dW_s ) θ^(−(H+1/2)) dθ
        = c̃_H ∫_0^∞ Ỹ_t(θ) θ^(−(H+1/2)) dθ,

where c̃_H := √(2H)/Γ(1/2 − H) < ∞. □

A key tool in our proof of the weak rates will be to utilize the Markovian structure of a projection of the fBm obtained by discretizing the affine representation of Lemma 3.1. We observe that the integral in (3.1) has a singularity at θ = 0, but behaves essentially like θ^(−(H+1/2)) before Ỹ_t(θ) vanishes in the limit of large θ. To make (3.1) more amenable to quadrature, we remove the singularity by introducing the change of variable ϑ = θ^((1−2H)/2), i.e., θ = ϑ^(2/(1−2H)), thereby obtaining the representation

(3.2)  W_t^H = c_H ∫_0^∞ Ỹ_t(ϑ^(2/(1−2H))) dϑ = c_H ∫_0^∞ Y_t(θ) dθ,

where the constant,

  c_H := c̃_H · 2/(1 − 2H) = 2√(2H) / ( (1 − 2H) Γ(1/2 − H) ),

is an increasing function of H ∈ (0, 1/2) such that 0 < c_H < 1. In (3.2),

(3.3)  Y_t(θ) = ∫_0^t e^(−(t−s)θ^p) dW_s

is an OU process with speed of mean-reversion given by θ^p with the positive power

(3.4)  p := 2/(1 − 2H) > 2.

One realization of Y_t(θ) is plotted in Figure 4, together with an envelope illustrating plus/minus two standard deviations of Y_t(θ), computed using the formula for the covariance, i.e.,

  Cov(Y_t(θ), Y_t(η)) = (1 − e^(−(θ^p + η^p)t)) / (θ^p + η^p).

Replacing the integral in (3.2) with a quadrature rule in the parameter θ yields a projection of the fBm onto a finite state space.

Figure 4.
A sample of Y_t(θ) in (3.3), at left, with speed of mean reversion θ^(2/(1−2H)) for H = 0.07, plotted together with an envelope demonstrating plus/minus two standard deviations, at right (cf. time series data of asset prices and option-derived price data indicate that H often takes values close to 0.1). Y_t(θ) is a smooth, analytic function of θ, and discretizing in θ yields an extended-variable state space which we utilize in our analysis.

Lemma 3.2 (Approximate affine representation). For 0 < H < 1/2, let

(3.5)  Ŵ_t^H = c_H Σ_{l=1}^{N_L} Y_t^l ∆θ_l =: S(Y_t),

depend on N_L degrees of freedom Y_t = (Y_t^1, ..., Y_t^{N_L}), where Y_t^l := Y_t(θ_l) are OU processes as in (3.3) with speed of mean-reversion θ_l^p for p in (3.4). Then Ŵ^H converges to W^H as L, N_L → ∞ in L²(Ω; C([0, T])).

Proof. We obtain an approximate affine representation of the fBm by discretizing (3.2) in two steps. First, we divide the integral in (3.2) into two parts,

  W_t^H = c_H ∫_0^L Y_t(θ) dθ + c_H ∫_L^∞ Y_t(θ) dθ,  R_L(Y_t) := c_H ∫_L^∞ Y_t(θ) dθ,

where R_L denotes the error in restricting the integral to a fixed computational domain L > 1. Second, we consider a quadrature rule

  W_t^H = c_H Σ_{l=1}^{N_L} Y_t(θ_l) ∆θ_l + R_{N_L}(Y_t) + R_L(Y_t),

with points 0 ≤ θ_1 < ··· < θ_{N_L} ≤ L and weights ∆θ_l = θ_{l+1} − θ_l, where R_{N_L} denotes the quadrature truncation error.

That Ŵ^H converges to W^H in the limit of L, N_L essentially follows from the "strong rates" in [15]. The error R_{N_L} can be made arbitrarily small as Y_t(θ) is a smooth, bounded (even analytic) function of θ (e.g. see Figure 4), i.e. the regularity in θ allows one to approximate efficiently using arbitrarily high-order quadrature rules if desired ([15]). The error R_L(Y_t) is a mean-zero Gaussian process; for all δ ∈ [0, H),

  sup_{L∈[1,∞]} L^δ ‖ sup_{t∈[0,T]} |R_L(Y_t)| ‖_{L^p(Ω)} < ∞,

which guarantees integrability ([15, Lemma 1(b)]). Then R_L can be made arbitrarily small for sufficiently large L by observing that the variance,

  Var[R_L(Y_t)] = c_H² ∫_L^∞ ∫_L^∞ Cov(Y_t(θ), Y_t(η)) dθ dη ≤ c_H² (π/2) ∫_L^∞ θ^(1−p) dθ = c_H² π L^(2−p) / (2(p − 2)),

decays in L since p > 2 (here we bounded the inner integral by ∫_0^∞ dη/(θ^p + η^p) ≤ (π/2) θ^(1−p)). □

We split the weak error (2.4) using the representations in Lemmas 3.1 and 3.2,

(3.6)  E[ϕ(X_T(W^H)) − ϕ(X_T(Ŵ^H))] + E[ϕ(X_T(Ŵ^H)) − ϕ(X_T^∆t(Ŵ^H))] + E[ϕ(X_T^∆t(Ŵ^H)) − ϕ(X_T^∆t(W^H))],

where we emphasize the dependence of the underlying on the driving process. The first and third terms both correspond to the error in approximating W^H with Ŵ^H and therefore vanish by Lemma 3.2. Indeed, we have the following result.

Lemma 3.3.
Assume that ϕ and ψ are Lipschitz, the latter uniformly in time, with Lipschitz constants K_ϕ and K_ψ, respectively. Then the first and the third terms of (3.6) converge to zero as N_L, L → ∞. Regarding the third term, the convergence is uniform with respect to ∆t.

Proof. We consider the third term first. By basic probabilistic estimates using the Lipschitz property of ϕ and ψ, we have

  |E[ϕ(X_T^∆t(Ŵ^H)) − ϕ(X_T^∆t(W^H))]|
    ≤ K_ϕ E[ (X_T^∆t(Ŵ^H) − X_T^∆t(W^H))² ]^(1/2)
    = K_ϕ E[ ( Σ_{i=0}^{n−1} ( ψ(s_i, Ŵ_{s_i}^H) − ψ(s_i, W_{s_i}^H) ) W_{s_i,s_{i+1}} )² ]^(1/2)
    = K_ϕ ( Σ_{i=0}^{n−1} E[ ( ψ(s_i, Ŵ_{s_i}^H) − ψ(s_i, W_{s_i}^H) )² ] (s_{i+1} − s_i) )^(1/2)
    ≤ K_ϕ K_ψ ( Σ_{i=0}^{n−1} E[ ( Ŵ_{s_i}^H − W_{s_i}^H )² ] (s_{i+1} − s_i) )^(1/2)
    ≤ K_ϕ K_ψ ‖Ŵ^H − W^H‖_{L²(Ω; C([0,T]))} T^(1/2) → 0

as N_L, L → ∞ by Lemma 3.2. The result for the first term follows in the same manner. □

Remark (Rates in N_L and L). Following [15, Theorem 1], convergence rates in N_L and L for the first and third terms of (3.6) could undoubtedly be established. Keep in mind, however, that we only use the scheme X_T^∆t(Ŵ^H) as a tool for the analysis of the scheme X_T^∆t(W^H), i.e., with exact simulation of W^H. Consequently, rates of convergence in N_L and L are not required to get rates of convergence of X_T^∆t(W^H) in terms of ∆t. Indeed, with respect to the actual scheme X_T^∆t(W^H) analyzed in this paper, the error contributions from the first and third terms vanish.

The sole remaining term in (3.6),

(3.7)  Err(T, ∆t) := E[ϕ(X_T(Ŵ^H))] − E[ϕ(X_T^∆t(Ŵ^H))],

which gives the weak error of the Euler scheme, depends on the approximate affine representation in Lemma 3.2. Indeed, suppose that we are given an error tolerance ε.
By Lemma 3.3, we can find L = L(ε, H, T, ϕ, ψ) and N_L = N_L(ε, H, T, ϕ, ψ) such that the first and third terms of (3.6) are bounded by ε/3, uniformly in ∆t. Our task is now to choose time steps ∆t such that also the second term is bounded by ε/3, uniformly in L, N_L. In the next section, we will obtain an extended-variable system for the dynamics of the underlying that we will use to obtain asymptotic expansions for (3.7).

Remark. In the interest of keeping our arguments constructive, we first fixed a computational domain L and then introduced a quadrature based on N_L points without specifying the precise rule. One could also choose, e.g., a Gauss–Laguerre quadrature suitable for the half-line, thereby reducing the number of parameters to one (see also [15]). The splitting (3.7) still holds.
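The quality of such a quadrature is easy to monitor: since the representation (3.2) is exact, the variance of W_t^H, namely t^(2H), equals c_H² ∬ Cov(Y_t(θ), Y_t(η)) dθ dη, and a discretized double integral over [0, L]² should reproduce t^(2H). A sketch with an illustrative uniform (midpoint) grid, not the paper's choice of rule:

```python
import math

def fbm_variance_from_ou(H, t, L=8.0, N=400):
    """Approximate Var[W_t^H] = c_H^2 * int int Cov(Y_t(a), Y_t(b)) da db
    on a uniform midpoint grid over [0, L]^2, using the closed-form OU
    covariance Cov = (1 - e^{-(a^p + b^p) t}) / (a^p + b^p).
    Grid parameters are illustrative, not the paper's choices.
    """
    p = 2.0 / (1.0 - 2.0 * H)                              # (3.4)
    cH = 2.0 * math.sqrt(2.0 * H) / ((1.0 - 2.0 * H) * math.gamma(0.5 - H))
    h = L / N
    nodes = [(l + 0.5) * h for l in range(N)]
    acc = 0.0
    for a in nodes:
        ap = a ** p
        for b in nodes:
            lam = ap + b ** p
            acc += -math.expm1(-lam * t) / lam             # (1 - e^{-lam t}) / lam
    return cH ** 2 * acc * h * h

v = fbm_variance_from_ou(0.3, 1.0)   # exact value: 1.0^{2 * 0.3} = 1
```

The truncation and quadrature errors here are exactly the R_L and R_{N_L} of Lemma 3.2; enlarging L and N drives `v` toward t^(2H).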
3.2. Forward Euler scheme for the extended variable system.
Substituting (3.5) into the underlying (2.1) yields

(3.8)  X̂_t := ∫_0^t ψ(s, Ŵ_s^H) dW_s = ∫_0^t ψ(s, S(Y_s)) dW_s,

a finite-dimensional Markovian approximation X̂_t = X_t(Ŵ^H) of the underlying X_t that appears in the weak error (3.7). The dynamics of (3.8) are described by

  Z = (X̂, Y^1, ..., Y^{N_L}),

a d-dimensional extended-variable state space (d = N_L + 1), solving the system

(3.9)  dZ_t = −b(Z_t, t) dt + σ(Z_t, t) dW_t,  Z_0 = 0,

with d-vectors b and σ given by

  b(Z_t, t) := (0, Y_t^1 θ_1^p, ..., Y_t^{N_L} θ_{N_L}^p),
  σ(Z_t, t) := ( ψ(t, S(Y_t)), 1, ..., 1 ),

where there is a single driving Brownian motion, i.e. (3.9) is a degenerate system.

For the interval [0, T], we define the uniform time-grid discretization t_i := i∆t for i = 0, ..., n with n := T/∆t, and consider the Euler–Maruyama scheme for the underlying,

(3.10)  X_{t_{i+1}} = X_{t_i} + ψ(t_i, S(Y_{t_i})) ∆W_{t_i},  X_{t_0} = 0,

where

  ∆W_{t_i} := W_{t_i,t_{i+1}} = W_{t_{i+1}} − W_{t_i}

are the increments of the driving Wiener process and where at each time step the vector of extended variables Y_· = (Y_·^l)_{l=1,...,N_L} is sampled exactly. That is, for the Euler update at t_{i+1}, one can form the joint distribution

(3.11)  (Y_τ, ∆W_τ)_{τ = t_0, ..., t_i},

an (N_L + 1) × (i + 1)-dimensional Gaussian. The variance-covariance matrix for (3.11) can be obtained using the known covariances Cov(Y_{t_i}^k, Y_{t_j}^l), Cov(Y_{t_i}^k, ∆W_{t_j}), and Cov(∆W_{t_i}, ∆W_{t_j}), and then the target variables Y_{t_i} required in (3.10) can be sampled exactly from the joint distribution, e.g., using the Cholesky decomposition. We extend X in (3.10) to all t ∈ [0, T] by the interpolation

(3.12)  X(t) = ∫_0^t ψ(s, Ŵ_{κ_s}^H) dW_s = ∫_0^t ψ( s, c_H Σ_{l=1}^{N_L} ∆θ_l Y_{κ_s}^l ) dW_s,

where κ_s = t_i if s ∈ [t_i, t_{i+1}) for each i = 0, ..., n − 1, with dynamics Z̄ = (X, Y^1, ..., Y^{N_L}) embedded in the SDE

(3.13)  dZ̄_s = −b(Z̄_s) ds + σ̄(Z̄_s) dW_s,  s ∈ [0, T],  Z̄_0 = 0,

with coefficients

  b(Z̄_s) = (0, Y_s^1 θ_1^p, ..., Y_s^{N_L} θ_{N_L}^p) = b(Z̄_s, s),
  σ̄(Z̄_s) = ( ψ(s, S(Y_{κ_s})), 1, ..., 1 ) = σ(Z̄_{κ_s}, s).

For the system (3.9) of Section 3.2, we are able to formulate a corresponding Kolmogorov backward equation. For a smooth and bounded payoff ϕ(Z_T) := ϕ(X̂_T), we consider the value function

(3.14)  u(z, t) := E[ϕ(Z_T) | Z_t = z] = E[ϕ(X̂_T) | Z_t = z],

that is, the conditional expected value of the payoff at time t < T given the starting value Z_t = z for z = (z_1, ..., z_d) = (x, y_1, ..., y_{N_L}). The associated Kolmogorov backward equation is given by

(3.15)  ∂_t u(z, t) − b_j(z, t) D_j u(z, t) + (1/2) A_{jk}(z, t) D_{jk} u(z, t) = 0,  t < T, z ∈ R^d,
        u(z, T) = ϕ(z_1),

where repeated indices indicate summation (over 1, ..., d), b_j = b_j(z, t) is the j-th component of the d-vector

  b(z, t) := (0, z_2 θ_1^p, ..., z_{N_L+1} θ_{N_L}^p),

and A_{jk} = A_{jk}(z, t) are elements of the d × d matrix A = σσ*,

(3.16)  A_{11} = ψ²(t, S(z)),  A_{1j} = A_{j1} = ψ(t, S(z)) for j > 1,  A_{jk} = 1 for j, k > 1,

i.e. A contains ones except along the first row and column, where we slightly abuse notation,

  S(z) := 0 · z_1 + c_H Σ_{j=2}^d z_j ∆θ_{j−1} = c_H Σ_{l=1}^{N_L} y_l ∆θ_l.

Remark. In the presentation of the Kolmogorov backward equation, we assume necessary regularity conditions on ϕ, i.e. smoothness and boundedness, as a matter of convenience. From the context of the problem this is not a strong assumption, see Remark 2.2.

3.3. Local weak error representation.
Throughout the remainder of this work we consider the case ψ(s, W_s^H) = W_s^H in (2.1). Returning to (3.7), we obtain a representation for the weak error in terms of local errors. First, we write the discretization error as a telescoping sequence in the value function (3.14),

(3.17)  Err(T, ∆t) = E[ϕ(X̂_T) − ϕ(X_{t_n})] = −( E u(Z̄_{t_n}, T) − E u(Z̄_{t_0}, 0) )
              = −Σ_{i=0}^{n−1} E[ u(Z̄_{t_{i+1}}, t_{i+1}) − u(Z̄_{t_i}, t_i) ],

using that

  E ϕ(X_{t_n}) = E[ E[ϕ(Z_T) | Z_T = Z̄_{t_n}] ] = E u(Z̄_{t_n}, T)

and

  E ϕ(X̂_T) = E[ E[ϕ(Z_T) | Z_0 = Z̄_{t_0}] ] = E u(Z̄_{t_0}, 0).

We then represent each difference appearing in (3.17) as a stochastic differential over a small time increment. Over the interval [t_i, t_{i+1}), we have that

  E[ u(Z̄_{t_{i+1}}, t_{i+1}) − u(Z̄_{t_i}, t_i) ] = E ∫_{t_i}^{t_{i+1}} du(Z̄_s, s)
    = E ∫_{t_i}^{t_{i+1}} ( ∂_t u(Z̄_s, s) − b_j(Z̄_s) D_j u(Z̄_s, s) + (1/2) A_{jk}(Z̄_{t_i}) D_{jk} u(Z̄_s, s) ) ds,

using Itô's formula applied to (3.13), where repeated indices indicate summation over 1, ..., d. Subtracting off the Kolmogorov backward equation (3.15) evaluated at (Z̄_s, s) then yields

(3.18)  E[ u(Z̄_{t_{i+1}}, t_{i+1}) − u(Z̄_{t_i}, t_i) ] = (1/2) E ∫_{t_i}^{t_{i+1}} ( A_{jk}(Z̄_{t_i}) − A_{jk}(Z̄_s) ) D_{jk} u(Z̄_s, s) ds.
We note that the non-zero terms correspond to differences along the first row and column of A in (3.16), and thus (3.18) simplifies to the following expression for the local weak error in the value function,

(3.19)  E[ u(Z̄_{t_{i+1}}, t_{i+1}) − u(Z̄_{t_i}, t_i) ]
    = (1/2) E ∫_{t_i}^{t_{i+1}} ( A_{11}(Z̄_{t_i}) − A_{11}(Z̄_s) ) D_{11} u(Z̄_s, s) ds
      + E ∫_{t_i}^{t_{i+1}} Σ_{j=2}^d ( A_{1j}(Z̄_{t_i}) − A_{1j}(Z̄_s) ) D_{1j} u(Z̄_s, s) ds
    = −(1/2) E ∫_{t_i}^{t_{i+1}} S(Y_·)²_{t_i,s} D_{11} u(Z̄_s, s) ds
      − E ∫_{t_i}^{t_{i+1}} S(Y_·)_{t_i,s} Σ_{j=2}^d D_{1j} u(Z̄_s, s) ds,

where we express the differences in components of A in terms of the increments of the (powers of the) approximate fBm,

  S(Y_·)^k_{t_i,s} := S(Y_s)^k − S(Y_{t_i})^k,  k = 1, 2.

Observe that the derivatives of the value function appearing in (3.19) can be further resolved by directly computing fluxes. Here we consider the simplification ψ(s, Ŵ_s^H) = Ŵ_s^H in (2.1) (i.e. "linear ψ").

Lemma 3.7 (Fluxes). Let ψ(s, W_s^H) = W_s^H. For the value function u(z, t) defined in (3.14),

(3.20)  D_β u(z, t) = c_H^(|β|−β_1) E[ ϕ^(|β|)(X̂_T) Π_{j=1}^{N_L} (∆θ_j M_{t,T}^j)^(β_{j+1}) | Z_t = z ],

for a multi-index β = (β_1, ..., β_d), where

  M_{t,T}^j := ∫_t^T e^(−(r−t)θ_j^p) dW_r,  j = 1, ..., N_L.
Proof.
Let Z t, z s be the Markov process started at Z t = z with components Z t, z s = ( b X t,xs , Y t, y s ) = ( b X t,xs , Y t,y s , . . . , Y t,y NL s ) ;here we drop the index Y t,y l = Y l ; t,y l when the index is clear from the initial condition. We recall that Y t,y l s started at the value y l at time t is given by, Y t,y l s = e − ( s − t ) θ pl y l + Z st e − ( s − r ) θ pl d W r , t < s , (3.21)and, similarly, that b X t,xs is given by, b X t,xs = x + Z st S ( Y t, y r )d W r = x + Z st c H N L X l =1 Y t,y l r ∆ θ l d W r , t < s . Working directly with (3.21) and Section 3.3, derivatives of the underlying with respect to the initialconditions are given by ∂ b X t,xs ∂x = 1 , and, for l = 1 , . . . , N L , ∂ b X t,xs ∂y l = c H ∆ θ l Z st e − ( r − t ) θ pl d W r =: c H ∆ θ l M lt,s . The formula (3.20) follows as all higher derivatives of b X t,xs vanish. (cid:3) Returning to our expression for the local weak error in the value function (3.19) we apply (3.20) therebyobtaining, E [ u ( Z t i +1 , t i +1 ) − u ( Z t i , t i )] = − E Z t i +1 t i S ( Y · ) t i ,s E [ ϕ ( b X T ) | Z s = Z s ]d s − N L X l =1 E Z t i +1 t i S ( Y · ) t i ,s E [ ϕ ( b X T ) c H ∆ θ l M ls,T | Z s = Z s ]d s . (3.22)We introduce deterministic functions of z , ν ( z , s ) := E [ ϕ ( b X T ) | Z s = z ] , and e ν ( z , s ) := E h ϕ ( b X T ) (cid:0) c H P l M ls,T ∆ θ l (cid:1) | Z s = z i . Rewriting (3.22) with this new notation leads to an expression for the local weak error in the valuefunction, E [ u ( Z t i +1 , t i +1 ) − u ( Z t i , t i )] = − E Z t i +1 t i S ( Y · ) t i ,s ν ( Z s , s )d s − E Z t i +1 t i S ( Y · ) t i ,s e ν ( Z s , s )d s =: J + e J , (3.23)that will serve as our starting point for the convergence rates. Next we derive an expansion for (3.23) inpowers of ∆ t from which we obtain convergence rates.3.4. Taylor expansions and conditional independence.
Starting with the local weak error (3.23),we derive asymptotic expansions for J (and e J ) in in powers of ∆ t by Taylor expanding ν (and e ν ) at Z t i and applying a conditioning argument.We observe that Z = ( X, Y ) in (3.13), i.e. the interpolation (3.12) together with the exact dynamicsof the OU extended variables, is linear with respect to the increment over [ t i , s ],(3.24) Z s − Z t i = (cid:0) S ( Y t i ) W t i ,s , Y t i ,s , . . . , Y N L t i ,s (cid:1) where W t i ,s := W s − W t i and Y lt i ,s := Y ls − Y lt i , for s ∈ [ t i , t i +1 ) , are increments of the driving Brownian motion and extended variable OU processes, respectively. Usingthe linearization (3.24), the Taylor expansion of ν at Z t i is,(3.25) ν ( Z s , s ) = ν ( Z t i , s ) + X β ∈I κ D β ν ( Z t i , s ) · ( S ( Y t i ) W t i ,s ) β ( Y t i ,s ) ˆ β + R κ ( ν ) , for sums over multiindices in the set I κ = (cid:8) β = ( β , ˆ β ) = ( β , β , . . . , β d ) : 1 ≤ | β | ≤ κ − (cid:9) EAK ERROR RATES ROUGH BERGOMI 12 where ( Y t i ,s ) ˆ β = N L Y l =1 ( Y lt i ,s ) β l +1 = ( Y t i ,s ) β · · · ( Y N L t i ,s ) β d and the remainder is given in integral form by, R κ ( ν ) = 1 κ ! X | β | = κ ( S ( Y t i ) W t i ,s ) β ( Y t i ,s ) ˆ β Z D β ν ( ξ τ , s )d τ ,ξ τ := Z t i + τ (cid:0) S ( Y t i ) W t i ,s , Y t i ,s , . . . , Y N L t i ,s (cid:1) . (3.26)In (3.25), the terms ν and D β ν are deterministic functions of the F t i -measurable random variable Z t i and are therefore F t i -measurable. Analogous expressions hold for (3.25) and (3.26) with e ν in place of ν .Plugging the ν -expansion (3.25) into (3.23), we obtain an asymptotic expansion for J ,2 J = − E Z t i +1 t i ν ( Z t i , s ) E (cid:2) S ( Y · ) t i ,s | F t i (cid:3) d s − X β ∈I κ E Z t i +1 t i D β ν ( Z t i , s ) S ( Y t i ) β E (cid:2) S ( Y · ) t i ,s ( W t i ,s ) β ( Y t i ,s ) ˆ β | F t i (cid:3) d s − E Z t i +1 t i E (cid:2) S ( Y · ) t i ,s R κ ( ν ) | F t i (cid:3) d s , (3.27)by conditional independence. 
An expansion analogous to (3.27) holds for $\widetilde{J}$ with $\widetilde{\nu}$ in place of $\nu$. The only terms that depend on $\Delta t$ in (3.27) (and in the analogous expansion for $\widetilde{J}$) are the conditional expectations involving products of the increments of $S(Y_\cdot)$ and of $S(Y_\cdot)^2$ together with powers of $W_{t_i,s}$ and $Y_{t_i,s}$. The key to obtaining weak error rates will be to show, after isolating the order in $\Delta t$ using these expansions, that the expansion coefficients, which depend on the extended state space variables $Y$, are controlled with respect to summation in the parameter(s) $\theta$.

In the next section, we observe that for quadratic payoffs the expansions in $\nu$ and $\widetilde{\nu}$ truncate after the first term, since $\nu$ and $\widetilde{\nu}$ already depend on two derivatives of $\varphi$. In this special case, we derive weak rate one in Theorem 4.1. In Section 5, we prove that in general the weak rate is $H + 1/2$, as reported in Theorem 2.1, again using the asymptotic expansion approach.

4. Weak rate one for quadratic payoffs
Using the preceding machinery, we will now derive rates of convergence for the weak error via Taylor expansions in powers of $\Delta t$ such that all terms stay integrable in $\theta$. For quadratic payoff functions, we obtain that the weak error is $O(\Delta t)$, i.e. rate one, in Theorem 4.1 below, which is supported by numerical evidence; recall Figure 2a. The mechanism by which rate one is achieved can be observed in the expansions: the expansion coefficients depend on derivatives of the payoff function, and the higher-order terms that reduce the rate vanish when $\varphi$ is quadratic.

4.1. Asymptotic expansion approach to weak rate one.
Returning to the increment of the value function (3.23): if $\varphi \in \mathcal{P}_2$, that is, a quadratic polynomial, then the derivatives of $\nu$ and $\widetilde{\nu}$ (as defined in Section 3.3) vanish and only the first terms in the expansion (3.25) remain. Then
\[
J + \widetilde{J} = - \mathbb{E} \int_{t_i}^{t_{i+1}} \nu(Z_{t_i}, s)\, \mathbb{E}\big[ S(Y_\cdot)^2_{t_i,s} \mid \mathcal{F}_{t_i} \big]\, \mathrm{d}s - \mathbb{E} \int_{t_i}^{t_{i+1}} \widetilde{\nu}(Z_{t_i}, s)\, \mathbb{E}\big[ S(Y_\cdot)_{t_i,s} \mid \mathcal{F}_{t_i} \big]\, \mathrm{d}s =: J_0 + \widetilde{J}_0, \tag{4.1}
\]
and estimating $J_0$ and $\widetilde{J}_0$ yields the weak rate corresponding to quadratic payoff functions.

Theorem 4.1 (Weak rate, quadratic payoff). Let $\varphi \in \mathcal{P}_2$ and let $\psi(s, \widehat{W}^H_s) = \widehat{W}^H_s$. Then
\[
\mathrm{Err}(T, \Delta t, \varphi) := \mathbb{E}\big[ \varphi(\widehat{X}_T) - \varphi(X_{t_n}) \big] \lesssim \Delta t,
\]
i.e. the Euler method is weak rate one.

Proof. We estimate the terms $J_0$ and $\widetilde{J}_0$ in (4.1), beginning with $\widetilde{J}_0$. Working directly with the increments of the extended variables,
\[
\mathbb{E}\big[ S(Y_\cdot)_{t_i,s} \mid \mathcal{F}_{t_i} \big] = - c_H \sum_{l=1}^{N_L} \mathbb{E}\big[ Y^l_{t_i,s} \mid \mathcal{F}_{t_i} \big] \Delta\theta_l = 0,
\]
and thus we conclude
\[
\widetilde{J}_0 = - \mathbb{E} \int_{t_i}^{t_{i+1}} \widetilde{\nu}(Z_{t_i}, s)\, \mathbb{E}\big[ S(Y_\cdot)_{t_i,s} \mid \mathcal{F}_{t_i} \big]\, \mathrm{d}s = 0.
\]
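The covariance of the extended OU variables, which drives the estimates that follow, can be sanity-checked by simulation. The sketch below uses hypothetical parameter values ($\theta_l = 1$, $\theta_k = 3$, $t = 1$) and a fixed seed, and compares a common-noise Euler simulation of two OU factors against the closed form $\mathbb{E}[Y^l_t Y^k_t] = (1 - e^{-t(\theta_l + \theta_k)})/(\theta_l + \theta_k)$ that the Itô isometry gives when both factors are driven by the same Brownian motion.

```python
import math
import random

def ou_cov_exact(t, th_l, th_k):
    """E[Y^l_t Y^k_t] = (1 - exp(-t(th_l + th_k))) / (th_l + th_k),
    by the Ito isometry, since both OU factors share the driving W."""
    s = th_l + th_k
    return (1.0 - math.exp(-t * s)) / s

def ou_cov_mc(t, th_l, th_k, n_steps=200, n_paths=4000, seed=7):
    """Euler simulation of the two OU factors sharing one noise increment."""
    rng = random.Random(seed)
    dt = t / n_steps
    sq = math.sqrt(dt)
    acc = 0.0
    for _ in range(n_paths):
        yl = yk = 0.0
        for _ in range(n_steps):
            dw = sq * rng.gauss(0.0, 1.0)  # common increment of W
            yl += -th_l * yl * dt + dw
            yk += -th_k * yk * dt + dw
        acc += yl * yk
    return acc / n_paths
```

The agreement is only statistical (Euler bias plus Monte Carlo noise), hence the loose tolerance in the comparison.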
In the case of quadratic ϕ , we observe that k ν k ∞ = O (1) and deterministic. From the definition of Y s (e.g. see (3.21)), working again directly with the increments of the extended variables we have that S ( Y · ) t i ,s = − c H N L X k,l =1 ( Y kt i Y lt i − Y ks Y ls )∆ θ k ∆ θ l = − c H N L X k,l =1 n (1 − e − ( s − t i )( θ pk + θ pl ) ) Y kt i Y lt i − e − ( s − t i ) θ pk Y kt i Z st i e − ( s − r ) θ pl d W r − e − ( s − t i ) θ pl Y lt i Z st i e − ( s − r ) θ pk d W r − Z st i e − ( s − r ) θ pk d W r Z st i e − ( s − r ) θ pl d W r o ∆ θ k ∆ θ l . From this the key conditional expectation term reduces to E (cid:2) S ( Y · ) t i ,s | F t i (cid:3) = − c H N L X l,k =1 ∆ θ l ∆ θ k (1 − e − ( s − t i )( θ pl + θ pk ) ) (cid:0) Y lt i Y kt i − θ pl + θ pk (cid:1) . (4.2)Only the component 1 − e − ∆ t ( θ pl + θ pk ) will contribute to the estimate for the weak rate, provided that thesums over θ in (4.2) converge.Using (4.2) we find J = − c H Z t i +1 t i N L X l,k =1 ∆ θ l ∆ θ k E [ Y lt i Y kt i − θ pl + θ pk ](1 − e − ( s − t i )( θ pl + θ pk ) )d s = − c H N L X l,k =1 ∆ θ l ∆ θ k E (cid:2) Y lt i Y kt i − θ pl + θ pk (cid:3) Z ∆ t g ( s )d s . (4.3)The integrand, g ( s ) := 1 − e − s ( θ pl + θ pk ) , is a function of s ∈ [0 , ∆ t ) such that g (0) = 0 and the associated Lipschitz constant K is given by, K = ∂∂s g ( s ) (cid:12)(cid:12)(cid:12) s =0 = θ pl + θ pk . Then, since | g ( s ) − g (0) | ≤ Ks , we have | J | ≤ c H N L X l,k =1 ∆ θ l ∆ θ k (cid:12)(cid:12) E (cid:2) Y lt i Y kt i ( θ pl + θ pk ) − (cid:3)(cid:12)(cid:12) Z ∆ t s d s . Computing the covariance appearing above, E [ Y lt i Y kt i ] = Z t i e − ( t i − r )( θ pl + θ pk ) d r = 1 − e − t i ( θ pl + θ pk ) θ pl + θ pk , we see that(4.4) | J | ≤ c H (cid:16) N L X l =1 ∆ θ l e − t i θ pl (cid:17) ∆ t ≤ c H p Γ (cid:0) p (cid:1) t − /pi ∆ t (cid:46) O (∆ t ) , since N L X l =1 ∆ θ l e − t i θ pl ≤ Z L e − t i θ pl d θ l ≤ Z ∞ e − t i θ pl d θ l = 1 p Γ (cid:0) p (cid:1) t − /pi . 
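The final bound above rests on evaluating the Laplace-type integral exactly: substituting $u = t x^p$ gives $\int_0^\infty e^{-t x^p}\, \mathrm{d}x = \Gamma(1/p)\, t^{-1/p}/p$. A small sketch (the values of $p$ and $t$ are illustrative) compares this closed form against direct quadrature.

```python
import math

def tail_integral_exact(t, p):
    """Closed form: \\int_0^inf exp(-t x^p) dx = Gamma(1/p) t^{-1/p} / p."""
    return math.gamma(1.0 / p) * t ** (-1.0 / p) / p

def tail_integral_quad(t, p, x_max=60.0, n=60000):
    """Trapezoidal quadrature of \\int_0^{x_max} exp(-t x^p) dx;
    x_max is chosen so the truncated tail is negligible."""
    h = x_max / n
    total = 0.5 * (1.0 + math.exp(-t * x_max ** p))  # endpoints, f(0) = 1
    for i in range(1, n):
        x = i * h
        total += math.exp(-t * x ** p)
    return total * h
```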
Importantly, $t^{-1/p}$ is integrable on $[0, T]$ (since $p > 1$), so the factor $t_i^{-1/p}$ appearing in (4.4) remains uniformly bounded after summing over the $t_i$ in (3.17).
Turning now to the telescoping representation of the weak error (3.17) and using (4.4), we obtain the desired rate:
\[
|\mathrm{Err}(T, \Delta t, \varphi)|
= \Big| \sum_{i=0}^{n-1} \mathbb{E}\big[ u(Z_{t_{i+1}}, t_{i+1}) - u(Z_{t_i}, t_i) \big] \Big|
= \Big| \sum_{i=0}^{n-1} \mathbb{E} \int_{t_i}^{t_{i+1}} \nu(Z_{t_i}, s)\, \mathbb{E}\big[ S(Y_\cdot)^2_{t_i,s} \mid \mathcal{F}_{t_i} \big]\, \mathrm{d}s \Big|
\leq \frac{c_H^2}{p}\, \Gamma\Big(\frac{1}{p}\Big) \sum_{i=0}^{n-1} \Delta t\, t_i^{-1/p}
\lesssim T^{(p-1)/p}\, \Delta t \lesssim T^{2H}\, \Delta t. \qquad \square
\]

Although it is initially surprising that the rate depends on the payoff, the Taylor expansion for the weak error provides insight into this behavior. The expansion coefficients in (3.25) depend on increasingly higher-order derivatives of the payoff function $\varphi$ through derivatives of $\nu$ and $\widetilde{\nu}$ (defined in Section 3.3). Unlike for quadratic $\varphi$, where terms in (3.25) vanish, the higher-order derivative terms persist for general payoff functions. Notably, it is only the next higher term (compared to those involved in the expansion for quadratic $\varphi$) that reduces the overall rate for general $\varphi$. In this context, one might hope to obtain an effective rate that is independent of $H$ for payoff functions well approximated by quadratic polynomials.
Before moving on to the proof of the main result in Section 5, we first present a simpler proof of weak rate one for quadratic payoff functions that is also applicable to nonlinear $\psi$. This proof was communicated to us by A. Neuenkirch.

Lemma 4.2.
Suppose that $\psi \in C^1$ with $\psi$, $\partial_t \psi$, $\partial_x \psi$ of polynomial growth, and let $\varphi(x) = x^2$. Then
\[
\big| \mathbb{E}\big[ \varphi(X_T) - \varphi(\widetilde{X}_T) \big] \big| = O(\Delta t), \qquad \text{where } \widetilde{X}_T := \int_0^T \psi(\kappa_s, W^H_{\kappa_s})\, \mathrm{d}W_s,
\]
for $\kappa_s = t_i$ if $s \in [t_i, t_{i+1})$, for each $i = 0, \ldots, n-1$.

Proof. By the Itô isometry, we have
\[
\mathbb{E}[\varphi(X_T)] = \int_0^T \mathbb{E}\big[ \psi(s, s^H V)^2 \big]\, \mathrm{d}s = \int_0^T g(s)\, \mathrm{d}s, \qquad
\mathbb{E}\big[ \varphi(\widetilde{X}_T) \big] = \int_0^T \mathbb{E}\big[ \psi(\kappa_s, \kappa_s^H V)^2 \big]\, \mathrm{d}s = \int_0^T g(\kappa_s)\, \mathrm{d}s,
\]
where $g(s) := \mathbb{E}\big[ \psi(s, s^H V)^2 \big]$, $V \sim \mathcal{N}(0, 1)$. Note that $g$ is differentiable with integrable derivative. Indeed,
\[
g'(t) = 2\, \mathbb{E}\big[ \psi(t, t^H V)\, \partial_t \psi(t, t^H V) \big] + 2 H t^{H-1}\, \mathbb{E}\big[ \psi(t, t^H V)\, \partial_x \psi(t, t^H V)\, V \big],
\]
which is of order $t^{H-1}$ and, hence, integrable. Setting $\zeta_t := \min\{ t_i \mid t_i \geq t \}$, we conclude with
\[
\Big| \int_0^T g(s)\, \mathrm{d}s - \int_0^T g(\kappa_s)\, \mathrm{d}s \Big|
\leq \int_0^T \Big| g(\kappa_s) + \int_{\kappa_s}^s g'(t)\, \mathrm{d}t - g(\kappa_s) \Big|\, \mathrm{d}s
\leq \int_0^T \int_{\kappa_s}^s |g'(t)|\, \mathrm{d}t\, \mathrm{d}s
= \int_0^T \Big( \int_t^{\zeta_t} \mathrm{d}s \Big) |g'(t)|\, \mathrm{d}t
\leq \max_{i=0,\ldots,n-1} |t_{i+1} - t_i|\, \| g' \|_{L^1([0,T])}. \qquad \square
\]
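The mechanism of Lemma 4.2 can be seen in a scalar computation: for $\psi(s, x) = x$ and $\varphi(x) = x^2$, the Itô isometry reduces both expectations to deterministic integrals, and the weak error is exactly the left-endpoint quadrature error for $g(s) = s^{2H}$. A sketch (with $H = 0.1$ as an illustrative value) confirms that the observed rate is close to one.

```python
def weak_error(n, T=1.0, H=0.1):
    """|E phi(X_T) - E phi(X~_T)| for psi(s, x) = x and phi(x) = x^2:
    by the Ito isometry this equals the left-endpoint Riemann-sum error
    |int_0^T s^{2H} ds - sum_i t_i^{2H} dt|, which is O(dt) since the
    derivative of g(s) = s^{2H} is integrable on [0, T]."""
    dt = T / n
    exact = T ** (2 * H + 1) / (2 * H + 1)
    frozen = sum((i * dt) ** (2 * H) for i in range(n)) * dt
    return abs(exact - frozen)
```

Halving $\Delta t$ should roughly halve the error; the ratio of successive errors is near two, i.e. rate one.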
For simplicity, we assumed that $\psi(s, \widehat{W}^H_s) = \widehat{W}^H_s$ in Theorem 4.1. In contrast, Lemma 4.2 is applicable to any $\psi$, including nonlinear functions. However, it is not clear how to extend the approach of the simple proof of Lemma 4.2 to more general payoff functions $\varphi$. In the next section we use the asymptotic expansions to prove the main result, Theorem 2.1, obtaining the weak rate $H + 1/2$ for $\psi(s, \widehat{W}^H_s) = \widehat{W}^H_s$.

5. Proof of Theorem 2.1
Our proof of Theorem 2.1 follows the expansion approach used to obtain rate one for quadratic payoffsin Section 4.1. We derive asymptotic expansions in powers of ∆ t for increments of the value functionin (3.23), i.e., for J and e J . For the case of general payoff functions, this requires two rounds of Taylorexpansions. The first round expands ν and e ν at Z t i using (3.25), as was done in Section 4.1. Then afterapplying a conditioning argument, we explicitly deal with correlations by expressing our expansions interms of the extended variables and making a second round of Taylor expansions with respect to selectcomponents of Y t i . Again, a key point in the proof is that all the terms in the expansions are controlledwith respect to θ .Returning to the local weak error (3.19), we use the ν -expansion (3.25) to find that J , the termcorresponding to the increment S ( Y · ) t i ,s , up to order κ = 3 is given by,2 J = − E Z t i +1 t i ν ( Z t i , s ) E (cid:2) S ( Y · ) t i ,s | F t i (cid:3) d s − E Z t i +1 t i D ν ( Z t i , s ) S ( Y t i ) E (cid:2) S ( Y · ) t i ,s W t i ,s | F t i (cid:3) d s − d X j =2 E Z t i +1 t i D j ν ( Z t i , s ) E (cid:2) S ( Y · ) t i ,s Y jt i ,s | F t i (cid:3) d s − E Z t i +1 t i D ν ( Z t i , s ) S ( Y t i ) E (cid:2) S ( Y · ) t i ,s ( W t i ,s ) | F t i (cid:3) d s − d X j =2 E Z t i +1 t i D j ν ( Z t i , s ) S ( Y t i ) E (cid:2) S ( Y · ) t i ,s W t i ,s Y j − t i ,s | F t i (cid:3) d s − d X j,k =2 E Z t i +1 t i D jk ν ( Z t i , s ) (cid:2) S ( Y · ) t i ,s Y j − t i ,s Y k − t i ,s | F t i (cid:3) d s − E Z t i +1 t i E (cid:2) S ( Y · ) t i ,s R ( ν ) | F t i (cid:3) d s =: J + J , + J , + J , + J , + J , − E Z t i +1 t i E (cid:2) S ( Y · ) t i ,s R ( ν ) | F t i (cid:3) d s . 
(5.1)
Here $J_0$ is as before, and the $J_{k,i}$ involve $k$th-order derivatives of $\nu$ that do not necessarily vanish for general payoff functions $\varphi$ (the second index $i \leq k$ is the number of derivatives taken in extended-variable directions, and hence the number of sums over extended-variable indices). Analogously, for $\widetilde{J}$, the term corresponding to the increment $S(Y_\cdot)_{t_i,s}$, we have
\[
\widetilde{J} = \widetilde{J}_0 + \widetilde{J}_{1,0} + \widetilde{J}_{1,1} + \widetilde{J}_{2,0} + \widetilde{J}_{2,1} + \widetilde{J}_{2,2} - \mathbb{E} \int_{t_i}^{t_{i+1}} \mathbb{E}\big[ S(Y_\cdot)_{t_i,s}\, R_3(\widetilde{\nu}) \mid \mathcal{F}_{t_i} \big]\, \mathrm{d}s, \tag{5.2}
\]
with $\widetilde{\nu}$ in place of $\nu$ and the increment $S(Y_\cdot)_{t_i,s}$ in place of $S(Y_\cdot)^2_{t_i,s}$ compared to (5.1). In the sequel, we will simply write
\[
J_k = \sum_{i=0}^{k} J_{k,i} \qquad \text{and} \qquad \widetilde{J}_k = \sum_{i=0}^{k} \widetilde{J}_{k,i}, \tag{5.3}
\]
for the sum of all terms involving $k$th-order derivatives of $\nu$ and $\widetilde{\nu}$, noting that the sums over $i$ will be of different sizes for each $k$. Curiously, we will observe that only the terms $J_{1,\cdot}$ and $\widetilde{J}_{1,\cdot}$, corresponding to the first-order derivatives of $\nu$ and $\widetilde{\nu}$, reduce the overall rate. In what follows, we first take the fBm view and assume that $\| D\nu \|_\infty$ is deterministic and $O(1)$, and similarly for $\widetilde{\nu}$. Since the terms corresponding to $\nu$ and $\widetilde{\nu}$ contribute only to the constant and not to the rate, this assumption allows us to easily deduce the order in $\Delta t$. Expressing everything in terms of the extended variables, as in (4.3), it is then possible to carry out a second round of Taylor expansions to demonstrate that the constants are controlled.

5.1.
Estimate: J is again O (∆ t ) . In Section 4.1, ν is deterministic and O (1) since it depends on twoderivatives of the quadratic payoff ϕ . Continuing as in (4.3), our starting point for the full estimate for J is J = − c H E Z t i +1 t i ν ( X t i , Y t i ; s ) N L X k,l =1 ∆ θ k ∆ θ l E (cid:2) Y kt i Y lt i − θ pk + θ pl (cid:3) g ( s )d s , (5.4)where we emphasize the dependence of ν on Y t i . We define an auxiliary function,(5.5) f kls ( Y kt i , Y lt i ) := E (cid:2) ν ( X t i , Y t i ; s ) | Y kt i , Y lt i (cid:3) , and expand f kls in a Taylor series at zero,(5.6) f kls ( Y kt i , Y lt i ) = f kls ( ) + X α ∈J α max | α | ! ∂ αkl f kls ( ) (cid:0) Y kt i Y lt i (cid:1) α + R α max ( Y kt i , Y lt i ) , for a set of multiindices J α max = { α = ( α , α ) : 1 ≤ | α | < α max } . We use the notation, ∂ αl ...l j = ∂ α l . . . ∂ α j l j , α = ( α , . . . , α j ) , (as opposed to D ) to emphasize that the derivatives are taken with respect to y l directions only. Theremainder closing the Taylor expansion in the auxiliary function f is given by,(5.7) R α max ( Y kt i , Y lt i ) := X | α | = α max α max ! ∂ αkl f kls ( ξ k , ξ l )( Y kt i Y lt i ) α , for an intermediate point ( ξ k , ξ l ).Plugging this second round of Taylor expansions (5.6) into (5.4), yields J = − c H N L X k,l =1 E (cid:2) Y kt i Y lt i − θ pk + θ pl (cid:3) Z t i +1 t i f kls ( ) g ( s )d s ∆ θ k ∆ θ l + X α ∈J | α | ! N L X k,l =1 E (cid:2) ( Y kt i ) α ( Y lt i ) α (cid:0) Y kt i Y lt i − θ pk + θ pl (cid:1)(cid:3) Z t i +1 t i ∂ αkl f kls ( ) g ( s )d s ∆ θ k ∆ θ l + N L X k,l =1 E (cid:2) R α max ( Y kt i , Y lt i ) (cid:0) Y kt i Y lt i − θ pk + θ pl (cid:1)(cid:3) ∆ θ k ∆ θ l Z t i +1 t i g ( s )d s ! . 
(5.8)For the remainder term in (5.8), we use the Hölder regularity of the fBm to estimate the derivative ofthe auxiliary function evaluated at an intermediate point, N L X k,l =1 E (cid:2) | R α max ( Y kt i , Y lt i ) || Y kt i || Y lt i | (cid:3) ∼ E (cid:2) | W Ht i ,t i +1 | α max (cid:3) (cid:46) ∆ t Hα max E | W Ht i | α max (cid:46) ∆ t Hα max . From this last expression, the number of terms in the auxiliary expansion, α max , is finite and the contri-bution from the remainder can be made to be order one by choosing α max := d H e . For the remaining terms in (5.8), the f kl can be estimated by the payoff ϕ (see Lemma B.1 in Appendix B)and therefore we write for convenience that f kl and all derivatives are bounded by a constant Q ,(5.9) f kl ∈ C α max b and k D α f kl k ∞ ≤ Q a.s. . Then recalling the Lipschitz argument, used in the proof of weak rate one for quadratic payoffs inSections 4.1 and 4.1, together with (5.9) we find that the first term in (5.8) is estimated by, c H Q ∆ t N L X k,l =1 | E [ Y kt i Y lt i ( θ pk + θ pl ) − | ∆ θ k ∆ θ l ≤ c H p Q Γ (cid:0) p ) t − /pi ∆ t , for a constant proportional to t − /pi as in (4.4). For the higher order terms in α in (5.8), we use Isserlis’theorem to expand the expectations into products of covariances of the extended variables. For example,for | α | = 1, i.e. when α = 1 , α = 0 or α = 0 , α = 0, then the term is zero by Isserlis’. For | α | = 2, weapply Isserlis’ theorem to obtain three terms, E [( Y kt i ) ] (cid:0) E [ Y kt i Y lt i ( θ pk + θ pl ) − (cid:1) ( α = 2 , α = 0) , E [( Y lt i ) ] (cid:0) E [ Y kt i Y lt i ( θ pk + θ pl ) − (cid:1) ( α = 0 , α = 2) , E [ Y kt i Y lt i ] (cid:0) E [ Y kt i Y lt i ( θ pk + θ pl ) − (cid:1) + 2 E [( Y kt i ) ] E [( Y lt i ) ] ( α = 1 , α = 1) , EAK ERROR RATES ROUGH BERGOMI 17 where we note there is no cancellation and ( θ pk + θ pl ) comes from the Lipschitz constant. 
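The repeated reduction of Gaussian moments to products of covariances uses Isserlis' (Wick's) theorem. A minimal recursive implementation (pair the first index with each remaining index, then recurse on the rest) can be checked against hand-computed moments of a hypothetical two-dimensional Gaussian.

```python
def isserlis(indices, cov):
    """E[X_{i_1} ... X_{i_k}] for a zero-mean Gaussian vector with
    covariance matrix cov, via Isserlis' theorem: sum over all pair
    partitions of the product of pairwise covariances. Odd-order
    moments vanish."""
    if len(indices) % 2 == 1:
        return 0.0
    if not indices:
        return 1.0
    first, rest = indices[0], list(indices[1:])
    total = 0.0
    for j in range(len(rest)):
        sub = rest[:j] + rest[j + 1:]
        total += cov[first][rest[j]] * isserlis(sub, cov)
    return total
```

For example, with $\operatorname{Var}(X) = 2$, $\operatorname{Var}(Y) = 1$, $\operatorname{Cov}(X, Y) = 1/2$, the theorem gives $\mathbb{E}[X^4] = 3 \cdot 2^2 = 12$ and $\mathbb{E}[X^2 Y^2] = 2 \cdot 1 + 2 \cdot (1/2)^2 = 5/2$, matching the three-pairing case enumerated above.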
Considering onlythe first term, E [( Y kt i ) ] (cid:0) E [ Y kt i Y lt i ( θ pk + θ pl ) − (cid:1) = 1 − e − θ pk t i θ pk (2 − e − ( θ pk + θ pl ) t i ) , we are essentially interested in the behavior of the integral Z Z − e − tx p x p (1 − e − t ( x p + y p ) )d y d x = Z − e − tx p x p (cid:16) p e − tx p t − /p Γ( p , ty p ) + y + C y (cid:17) d x = ( y + C y )( e − tx p − x − p p − − ( y + C y ) t − /p Γ( p , tx p ) p − xt − /p (cid:0) Γ( p , tx p ) − ( p − /p t − /p Γ( p , tx p ) (cid:1) ( p − p Γ( p , y p )+ x − p t − /p e − tx p ( e tx p − p − p Γ( p , y p ) + C x , where C y , C x are constants. The integral goes to C x as x, y → ∞ and to a constant C x − C y t − /p Γ( p ) / ( p −
1) as x, y →
0. We conclude that the definite integral is converging to a constant depending on H and t i and therefore the sum over θ k and θ l converges. The higher order terms in α are also summable since theapplication of Isserlis’ theorem results in expressions that are either zero or of the form above. Altogether,our full estimate for J is then(5.10) | J | (cid:46) C ( H, Q, α max ) t − /pi ∆ t (cid:46) O (∆ t ) , yielding the same rate as the estimate in (4.4), albeit with a different constant C ( H, Q, α max ).In the next section, we consider estimates for the terms J , · and e J , · in (5.1) and (5.2). Inspired by theobservation that we obtain the same rate in (5.10) as in (4.4) with only the constant changing, we firstwork directly with the fBm view to easily ascertain the rate. However, to obtain the full estimate, weproceed as in this section by utilizing the structure of the affine approximation to make a second roundof Taylor expansions and subsequently observing that the coefficients are controlled.5.2. Estimate: J and e J are O (∆ t H +3 / ) . For the term J , in (5.1), J , = − E Z t i +1 t i D ν ( X t i , Y t i ; s ) S ( Y t i ) E (cid:2) S ( Y · ) t i ,s W t i ,s | F t i (cid:3) , we first determine the order in ∆ t which arises from the conditional expectation. Working directly withthe increment, S ( Y · ) t i ,s = c W Ht i ,s ( c W Hs + c W Ht i ) = c W Ht i ,s ( c W Ht i ,s + 2 c W Ht i ) , we take the fBm view and express c W Ht i ,s ∼ W Ht i ,s = Z t i K ( s − r ) − K ( t i − r )d W r | {z } := V Hti ( s ) + Z st i K ( s − r )d W r , using the power law kernel, i.e. K ( r ) = √ Hr H − / , (inverse discrete Laplace transform). 
We observe E [( W Ht i ,s ) W t i ,s | F t i ] = 2 V Ht i ( s ) Z st i K ( s − r )d r , E [ W Ht i ,s W t i ,s | F t i ] = Z st i K ( s − r )d r , and (cid:12)(cid:12) E [ W Ht i V Ht i ( s )] (cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12)Z t i K ( t i − r ) K ( s − r ) − K ( t i − r )d r (cid:12)(cid:12)(cid:12)(cid:12) ≤ (cid:12)(cid:12)(cid:12)(cid:12)Z t i K ( t i − r )d r (cid:12)(cid:12)(cid:12)(cid:12) ∼ t Hi ≤ . EAK ERROR RATES ROUGH BERGOMI 18
Then, J , = − E Z t i +1 t i D ν ( X t i , Y t i ; s ) c W Ht i E (cid:2)c W Ht i ,s ( c W Ht i ,s + 2 c W Ht i ) W t i ,s | F t i (cid:3) d s = − E Z t i +1 t i D ν ( X t i , Y t i ; s ) W Ht i (cid:0) E [( W Ht i ,s ) W t i ,s | F t i ] + 2 W Ht i E [ W Ht i ,s W t i ,s | F t i ] (cid:1) d s = − E Z t i +1 t i D ν ( X t i , Y t i ; s ) W Ht i ( V Ht i ( s ) + W Ht i ) Z st i K ( s − r )d r | {z } ∝ ( s − t i ) H +1 / d s , (5.11)which suggests J , = O (∆ t H +3 / ) provided the coefficient is controlled. The order of J , in ∆ t isasymptotically exact in (5.11), i.e. there is only estimation of the constant.In general, D ν is random and depends on Y t i . This changes the coefficient in the estimate but not theorder in ∆ t , as was observed by comparing (5.10) to (4.4). Following Section 5.1, we proceed to estimatethe coefficient by making a second round of Taylor expansions by introducing an auxiliary function f . Inthis spirit, we start directly from (5.11), noting that V Ht i ( s ) ∼ b V Ht i ( s ) := c H N L X l =1 Y lt i ( e − θ pl ( s − t i ) − θ l , since Z t i K ( s − r )d W r = Z t i √ H ( s − r ) H − / d W r = Z t i √ H Γ( − H ) Z ∞ θ − ( H +1 / e − θ ( s − r ) d θ d W r = e c H Z ∞ θ − ( H +1 / e − θ ( s − t i ) Z t i e − θ ( t i − r ) d W r d θ = e c H Z ∞ θ − ( H +1 / e − θ ( s − t i ) e Y t i d θ ≈ c H N L X l =1 Y lt i e − θ pl ( s − t i ) ∆ θ l . Rewriting J , directly in terms of the extended variables, we obtain J , = − E Z t i +1 t i D ν ( X t i , Y t i ; s ) c W Ht i ( b V Ht i + c W Ht i ) g ( s )d s = − c H N L X k,l =1 Z t i +1 t i E (cid:2) D ν ( X t i , Y t i ; s ) Y kt i Y lt i (cid:3) e − θ pl ( s − t i ) g ( s )d s ∆ θ k ∆ θ l , where g ( s ) := Z st i K ( s − r )d r ∝ ( s − t i ) H +1 / . 
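The chain of identities above is the sum-of-exponentials (Markovian) representation of the singular kernel. The following sketch approximates $K(r) = \sqrt{2H}\, r^{H-1/2}$ by a finite $\theta$-quadrature of its Laplace-transform representation; the geometric $\theta$-grid and truncation bounds are illustrative choices, not the quadrature analyzed in the paper.

```python
import math

def K(r, H):
    """Power-law kernel of the Riemann-Liouville fBm."""
    return math.sqrt(2.0 * H) * r ** (H - 0.5)

def K_markov(r, H, n=4000, theta_min=1e-8, theta_max=1e4):
    """Approximate K(r) by a weighted sum of exponentials, discretizing
    r^{H-1/2} = (1/Gamma(1/2-H)) \\int_0^inf theta^{-(H+1/2)} e^{-theta r} dtheta
    by the trapezoidal rule on a geometric theta-grid (log substitution)."""
    c = math.sqrt(2.0 * H) / math.gamma(0.5 - H)
    du = math.log(theta_max / theta_min) / n
    total = 0.0
    for i in range(n + 1):
        theta = theta_min * math.exp(i * du)
        w = 0.5 if i in (0, n) else 1.0
        # substitution u = log(theta): theta^{-(H+1/2)} dtheta = theta^{1/2-H} du
        total += w * theta ** (0.5 - H) * math.exp(-theta * r)
    return c * total * du
```

With a few thousand nodes, the relative error stays below one percent over several decades of $r$, illustrating why the extended-variable approximation can be controlled uniformly in the grid.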
Recalling that ν ( Z t i , s ) = ν ( X t i , Y t i ; s ) is a deterministic function of Z t i , we further write E [ D ν ( X t i , Y t i ; s ) Y kt i Y lt i ] = E (cid:2) E [ D ν ( X t i , Y t i ; s ) | Y kt i , Y lt i ] Y kt i , Y lt i (cid:3) , where we define a new auxiliary function (here ϕ is already higher order compared to (5.5)) f kls ( Y kt i , Y lt i ) := E (cid:2) D ν ( X t i , Y t i ; s ) | Y kt i , Y lt i (cid:3) . We Taylor expand f kl at zero, yielding J , = − c H N L X k,l =1 E [ Y kt i Y lt i ] Z t i +1 t i f kls ( ) e − θ pl ( s − t i ) g ( s )d s ∆ θ k ∆ θ l + X α ∈J α max | α | ! N L X k,l =1 E (cid:2) ( Y kt i ) α +1 ( Y lt i ) α +1 (cid:3) Z t i +1 t i ∂ αkl f kls ( ) e − θ pl ( s − t i ) g ( s )d s ∆ θ k ∆ θ l + N L X k,l =1 E (cid:2) R α max ( Y kt i , Y lt i ) Y kt i Y lt i (cid:3) Z t i +1 t i e − θ pl ( s − t i ) g ( s )d s ∆ θ k ∆ θ l ! EAK ERROR RATES ROUGH BERGOMI 19 where J α max = { α = ( α , α ) : 1 ≤ | α | < α max } and the remainder R α max is given in the Lagrange formas in (5.7). Since f kl is bounded by ϕ (see Lemma B.1 in Appendix B), we write the bound in terms of Q , as in (5.9), to obtain | J , | (cid:46) C ( H, Q, α max ) t − /pi ∆ t H +3 / (cid:46) O (∆ t H +3 / ) , where we use Isserlis’ theorem and the explicit form of the covariances, e.g., N L X k,l =1 E [ Y kt i Y lt i ]∆ θ k ∆ θ l = N L X k,l =1 ∆ θ k ∆ θ l θ pk + θ pl (1 − e − ( θ pk + θ pl ) t i ) ≤ π Z ∞ θ − p (1 − e − θ p t i )d θ ≤ π p − p ) t − /pi , to observe that the sums over θ k and θ l converge independently of L, N L .The term J , includes increments of OU processes arising from the extended variables. 
Taking thefBm view using the power law kernel, J , = − N L X j =1 E Z t i +1 t i D j ν ( X t i , Y t i ; s ) E (cid:2)c W Ht i ,s ( c W Ht i ,s + 2 c W Ht i ) Y jt i ,s | F t i (cid:3) d s = − N L X j =1 E Z t i +1 t i D j ν ( X t i , Y t i ; s ) (cid:0) ( V Ht i ( s )) + 2 W Ht i V Ht i ( s ) (cid:1) Y jt i ( e − θ pj ( s − t i ) − s (5.12a) − N L X j =1 E Z t i +1 t i D j ν ( X t i , Y t i ; s )(2 V Ht i ( s ) + 2 W Ht i ) Z st i e − θ pj ( s − r ) K ( s − r )d r d s (5.12b) − N L X j =1 E Z t i +1 t i D j ν ( X t i , Y t i ; s ) Y jt i ( e − θ pj ( s − t i ) − Z st i K ( s − r )d r d s , (5.12c)since E [( W Ht i ,s ) Y jt i ,s | F t i ] = Y jt i ( e − θ pj ( s − t i ) − Z st i K ( s − r )d r + 2 V Ht i ( s ) Z st i K ( s − r ) e − θ pj ( s − r ) d r +( V Ht i ( s )) Y jt i ( e − θ pj ( s − t i ) − E [ W Ht i ,s Y jt i ,s | F t i ] = Z st i e − θ pj ( s − r ) K ( s − r )d r + V Ht i ( s ) Y jt i ( e − θ pj ( s − t i ) − . We examine the contributions to the rate in ∆ t from each of the terms (5.12a) to (5.12c); after integratingover s we find that (5.12a) yields O (∆ t ), (5.12b) yields O (∆ t H +3 / ), and (5.12c) yields O (∆ t H +2 ). Since(5.12a) and (5.12c) are higher order in ∆ t we examine only (5.12b), where we note Z st i e θ pj ( s − t i ) K ( s − r )d r = e − θ pj s √ H Z st i e θ pj r ( s − r ) H − / d r = √ Hθ − ( H +1 / pj h Γ (cid:0) H + , ( s − r ) θ pj (cid:1)i r = sr = t i = √ Hθ − ( H +1 / pj h e s − r ( s − r ) H +1 / Γ (cid:0) H + 1 / , θ pj (cid:1)i r = sr = t i = −√ Hθ − ( H +1 / pj Γ (cid:0) H + , θ pj (cid:1) e s − t i ( s − t i ) H +1 / (5.13)by a change of variable in the argument of the incomplete gamma function where q := ( H + 1 / p = 2 H + 11 − H > , H ∈ (0 , / . 
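The incomplete-gamma identity behind (5.13) can be verified directly: substituting $v = \theta u$ gives $\int_0^{\Delta} e^{-\theta u} K(u)\, \mathrm{d}u = \sqrt{2H}\, \theta^{-(H+1/2)}\, \gamma(H + 1/2, \theta \Delta)$, where $\gamma$ is the lower incomplete gamma function. In the sketch below (illustrative values of $H$, $\theta$, and $\Delta$), this closed form is compared against quadrature of the left-hand side after a substitution that removes the $u^{H-1/2}$ singularity at $u = 0$.

```python
import math

def gamma_lower(a, x, tol=1e-13):
    """Lower incomplete gamma via the standard series
    gamma(a, x) = x^a e^{-x} sum_n x^n / (a (a+1) ... (a+n))."""
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > tol * abs(total):
        n += 1
        term *= x / (a + n)
        total += term
    return x ** a * math.exp(-x) * total

def closed_form(H, theta, delta):
    """sqrt(2H) theta^{-(H+1/2)} gamma(H+1/2, theta*delta)."""
    a = H + 0.5
    return math.sqrt(2.0 * H) * theta ** (-a) * gamma_lower(a, theta * delta)

def quadrature(H, theta, delta, n=20000):
    """Direct quadrature of \\int_0^delta e^{-theta u} K(u) du;
    the substitution u = w^{1/(H+1/2)} makes the integrand q*e^{-theta w^q}
    with q = 1/(H+1/2), removing the singularity at u = 0."""
    q = 1.0 / (H + 0.5)
    b = delta ** (H + 0.5)
    h = b / n
    total = 0.5 * (1.0 + math.exp(-theta * delta))  # endpoint terms
    for i in range(1, n):
        w = i * h
        total += math.exp(-theta * w ** q)
    return math.sqrt(2.0 * H) * q * total * h
```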
This suggests J , = O (∆ t H +3 / ) since the sum over θ j converges (also helpful to note Γ (cid:0) H + , θ pj (cid:1) → θ j → ∞ ).For the full estimate of J , , we note D j ν ( X t i Y t i ; s ) = c H ∆ θ j E (cid:2) ϕ (3) ( b X T ) M js,T | ( b X s , Y s ) = ( X t i , Y t i ) (cid:3) , EAK ERROR RATES ROUGH BERGOMI 20 where we assume that M js,T < ∞ (see (3.22) for definition of M ). Omitting the higher order terms in ∆ t in (5.12a) and (5.12c), we return directly to (5.12b), J , = − c H d X j =2 E Z t i +1 t i E (cid:2) ϕ (3) ( b X T ) M js,T | ( b X s , Y s ) = ( X t i , Y t i ) (cid:3) ( b V Ht i ( s ) + c W Ht i ) g ( s ) h ( θ j )d s ∆ θ j , where we let g ( s ) := −√ He s − t i ( s − t i ) H +1 / and h ( θ j ) := θ − qj Γ (cid:0) H + , θ pj (cid:1) . We define a new auxiliary function f ks ( Y kt i ) := E (cid:2) E [ ϕ (3) ( b X T ) M js,T | ( b X s , Y s ) = ( X t i , Y t i )] | Y kt i (cid:3) , where, although the inner expectation depends implicitly on j , here the index k refers to the generalcomponent Y kt i that we are conditioning against. Expanding f k at zero, we find J , = − c H N L X j,k =1 h ( θ j ) Z t i +1 t i E (cid:2) f ks ( Y kt i ) Y kt i (cid:3) e − θ pk ( s − t i ) g ( s )d s ∆ θ j ∆ θ k = − c H α max − X α =1 α ! N L X j,k =1 h ( θ j ) E (cid:2) ( Y kt i ) α +1 (cid:3) e − θ pk ( s − t i ) Z t i +1 t i ∂ α k f ks ( ) g ( s )d s ∆ θ j ∆ θ k − c H N L X j,k =1 h ( θ j ) E (cid:2) R α max ( Y kt i ) Y kt i (cid:3) e − θ pk ( s − t i ) Z t i +1 t i g ( s )d s ∆ θ j ∆ θ k . Following from the boundedness of f and its derivatives (as in (5.9)), we obtain the full estimate for J , , | J , | (cid:46) C ( H, Q, α max ) t − /pi ∆ t H +3 / (cid:46) O (∆ t H +3 / ) , where we use Isserlis’ theorem and the representation of the covariance for the extended variables todetermine the coefficient depending on t i (which we note is also summable over i ).The estimation of e J , and e J , follows the program above. 
For e J , we begin by taking the fBm viewwith the kernel K , | e J , | = (cid:12)(cid:12)(cid:12) E Z t i +1 t i D e ν ( X t i , Y t i ; s ) W Ht i E (cid:2) W Ht i ,s W t i ,s | F t i (cid:3) d s (cid:12)(cid:12)(cid:12) = c H (cid:12)(cid:12)(cid:12) N L X k =1 Z t i +1 t i D e ν ( X t i , Y t i ; s ) W Ht i Z st i K ( s − r )d r | {z } ( s − t i ) H +1 / d s (cid:12)(cid:12)(cid:12) which suggests J , = O (∆ t H +3 / ). We recall that D e ν ( Z t i , s ) = E (cid:2) ϕ (3) ( b X T ) (cid:0) c H P l M ls,T ∆ θ l (cid:1) | Z s = Z t i (cid:3) , where we assume that c H N L X l =1 M ls,T ∆ θ l < ∞ , and then define an auxiliary function f ks ( Y kt i ) := E h E (cid:2) ϕ (3) ( b X T ) (cid:0) c H P l M ls,T ∆ θ l (cid:1) | ( b X s , Y s ) = ( X t i , Y t i ) (cid:3) | Y kt i i . Finally, Taylor expanding f ks at zero we encounter terms similar to those estimated previously yieldingthe full estimate, | e J , | (cid:46) O (∆ t H +3 / ) . Likewise for e J , , taking the fBm view with the conditional expectation term E [ c W Ht i ,s Y jt i ,s | F t i ] yields | e J , | = (cid:12)(cid:12)(cid:12) d X j =2 E Z t i +1 t i D j e ν ( X t i , Y t i ; s ) Z st i e − θ pj ( s − r ) K ( s − r )d r d s (cid:12)(cid:12)(cid:12) , where the estimate (5.13) (encountered in J , ) suggest the rate J , = O (∆ t H +3 / ). For the full estimate,we recall that D j e ν ( Z t i , s ) = E (cid:2) ϕ (3) ( b X T ) (cid:0) c H P l M ls,T ∆ θ l (cid:1) | Z s = Z t i (cid:3) and expand in a second round of Taylor expansions for an appropriate auxiliary function thereby obtaining | e J , | (cid:46) O (∆ t H +3 / ) . EAK ERROR RATES ROUGH BERGOMI 21
Taken together, the estimates of Section 5.2 imply that the terms corresponding to first-order derivatives of $\nu$ and $\widetilde{\nu}$ in (5.1) and (5.2), respectively, yield
\[
|J_1| + |\widetilde{J}_1| = O(\Delta t^{H + 3/2}),
\]
using the notation in (5.3).

5.3. Estimate: remaining terms are higher order.
Additional terms (5.1) and (5.2) appearing inthe expansion are higher order than H + 3 /
2. For example, We find J , = O (∆ t H ) which can beseen by once again taking the fractional view, E (cid:2) S ( Y · ) t i ,s ( W t i ,s ) | F t i (cid:3) = E (cid:2)c W Ht i ,s ( c W Ht i ,s + 2 c W Ht i )( W t i ,s ) (cid:3) = E (cid:2) ( c W Ht i ,s ) ( W t i ,s ) (cid:3) + 2 c W Ht i E (cid:2)c W Ht i ,s ( W t i ,s ) (cid:3)| {z } Isserlis’ = ⇒ = E [( c W Ht i ,s ) ] E [( W t i ,s ) ] + 2( E [ c W Ht i ,s W t i ,s ]) = ∆ t H ∆ t − t H +1 / ) =: g (∆ t ) , where g ( s ) ∝ ( s − t i ) H +1 and then expanding in a second round of Taylor expansions for a suitableauxiliary function and checking the control of the coefficient with respect to θ , | J , | (cid:46) (cid:12)(cid:12)(cid:12) E h ( W Ht i ) Z ∆ t D ν ( Z t i , s + t i ) s H +1 d s i(cid:12)(cid:12)(cid:12) = O (∆ t H ) . The term e J , vanishes, | e J , | = (cid:12)(cid:12)(cid:12) E hZ t i +1 t i D e ν ( Z t i , s )( c W Ht i ) E (cid:2)c W Ht i ,s ( W t i ,s ) (cid:3)| {z } Isserlis’ = ⇒ d s i(cid:12)(cid:12)(cid:12) = 0 , by Isserlis’ theorem (although one could argue that the integrand is formally contributing rate H + 2 herefor e J , ). The terms involving increments of the OU processes yield similar results (constants obtainedwould be better behaved owing to the additional decay).5.4. Closure argument to finish proof of Theorem 2.1.
Thus far we have estimated the first few terms of (5.1) and (5.2) arising from the Taylor expansion in powers of $\Delta t$. Returning to the telescoping sum (3.17), we summarize our estimates up to order $\kappa$,
$$ \mathrm{Err}(T,\Delta t) = \mathbb{E}\big[\varphi(\widehat X_T) - \varphi(X_T)\big] = \sum_{i=0}^{n-1}\Big(\sum_{k=0}^{\kappa-1}\big(J_k + \widetilde J_k\big) + R_\kappa(\nu) + R_\kappa(\widetilde\nu)\Big) \le \sum_{i=0}^{n-1}\Big(C_0(t_i)\,\Delta t^{H+3/2} + C_1(t_i)\,\Delta t^2 + O(\Delta t^{2H+2}) + R_\kappa(\nu) + R_\kappa(\widetilde\nu)\Big), \quad (5.14) $$
using the notation in (5.3). Here the constants $C_k$, for $k = 0, \dots, \kappa-1$, depending on $t_i$, $H$, and $Q$, are the coefficients that appear in the estimates (5.10) and those of Section 5.2, etc., for $J_\cdot$, and likewise for $\widetilde J_\cdot$. Importantly, each of these constants was obtained independently of $N_L$ and $L$, that is, independently of the choice of parameters $\theta$ arising in the Markovian extended-variable state-space formulation. Moreover, each constant is summable in $t_i$. Interchanging the order of summation in (5.14), we obtain
$$ \mathrm{Err}(T,\Delta t,\varphi) \lesssim C_0\,\Delta t^{H+1/2} + C_1\,\Delta t + O(\Delta t^{2H+1}) - \sum_{i=1}^{n-1} \mathbb{E}\int_{t_i}^{t_{i+1}} \Big(\mathbb{E}\big[S(Y_\cdot)_{t_i,s} R_\kappa(\nu) \mid \mathcal{F}_{t_i}\big] + \mathbb{E}\big[S(Y_\cdot)_{t_i,s} R_\kappa(\widetilde\nu) \mid \mathcal{F}_{t_i}\big]\Big)\,\mathrm{d}s, \quad (5.15) $$
with new constants
$$ C_j = \sum_{i=1}^{n-1} C_j(t_i). $$
The remainders have an integral form (3.26), and thus the conditional expectations in (5.15) are
$$ \mathbb{E}\big[S(Y_\cdot)_{t_i,s} R_\kappa(\nu) \mid \mathcal{F}_{t_i}\big] = \frac{1}{\kappa!}\sum_{|\beta|=\kappa} (\widehat W^H_{t_i})^{\beta_1}\, \mathbb{E}\Big[(\widehat W^H_{t_i,s})^2\, (W_{t_i,s})^{\beta_2}\, (Y_{t_i,s})^{\hat\beta} \int_0^1 D^\beta\nu(\xi_\tau, s)\,\mathrm{d}\tau\Big] $$
and
$$ \mathbb{E}\big[S(Y_\cdot)_{t_i,s} R_\kappa(\widetilde\nu) \mid \mathcal{F}_{t_i}\big] = \frac{1}{\kappa!}\sum_{|\beta|=\kappa} (\widehat W^H_{t_i})^{\beta_1}\, \mathbb{E}\Big[\widehat W^H_{t_i,s}\, (W_{t_i,s})^{\beta_2}\, (Y_{t_i,s})^{\hat\beta} \int_0^1 D^\beta\widetilde\nu(\xi_\tau, s)\,\mathrm{d}\tau\Big], $$
where $\mathbb{E}[S(Y_\cdot)_{t_i,s} R_\kappa(\nu) \mid \mathcal{F}_{t_i}] \sim (\widehat W^H_{t_i,s})^{\kappa}$ and $\mathbb{E}[S(Y_\cdot)_{t_i,s} R_\kappa(\widetilde\nu) \mid \mathcal{F}_{t_i}] \sim (\widehat W^H_{t_i,s})^{\kappa}$. We recall that
$$ \mathbb{E}\,|W^H_{t_i,t_{i+1}}|^{\gamma} \lesssim \Delta t^{\gamma H}, $$
by the self-similarity and Hölder continuity of the sample paths. Then, by applying Cauchy–Schwarz, we obtain in (5.15) that the remainder terms yield $O(\Delta t^{\kappa H})$. Thus, $\kappa$ can be chosen sufficiently large, depending on $H$, to yield a large but finite expansion for general payoff functions $\varphi$. The error is then given by
$$ \mathrm{Err}(T,\Delta t,\varphi) \lesssim C_0\,\Delta t^{H+1/2} + C_1\,\Delta t + O(\Delta t^{2H+1}), $$
with weak error rate $H + 1/2$.

Remark.
In the proofs of the main result Theorem 2.1 and of Theorem 4.1, the specific form of $K$ in Section 3.1 and (1.2) is not essential; the proofs carry over in the same spirit to any suitable $L^2$ kernel, provided the corresponding integrability conditions are checked.

6. Conclusions and outlook
The rough Bergomi model (rBergomi) belongs to a class of increasingly popular rough stochastic volatility models for option pricing in quantitative finance. On the one hand, rBergomi overcomes empirical challenges by delivering predictions consistent with observed market data. On the other hand, the non-Markovian nature of rBergomi, due to the fractional Brownian motion (fBm) driver, is an impediment to both theory and numerics. Despite the widespread use of discretization-based simulation methods for option pricing under rBergomi, few works have studied the weak convergence rates that underpin this practice.

We prove that the weak convergence of the Euler scheme depends on the Hurst parameter $H$ of the fBm driver, obtaining the weak rate $H + 1/2$. Extending this result to more general payoff functions $\psi$ is the subject of ongoing work.

7. Acknowledgments
This work was supported by the KAUST Office of Sponsored Research (OSR) under Award No. URF/1/2584-01-01 and the Alexander von Humboldt Foundation. R. Tempone is a member of the KAUST SRI Center for Uncertainty Quantification in Computational Science and Engineering. A portion of this work was carried out while E. Hall was a Postdoctoral Research Scientist in the Chair of Mathematics for Uncertainty Quantification at RWTH Aachen University. C. Bayer gratefully acknowledges support by the German Research Council (DFG) via the Research Unit FOR 2402.

We are grateful to Andreas Neuenkirch for pointing out the simpler proof in the case of quadratic payoffs presented in Section 4.2.
References

[1] E. Abi Jaber and O. El Euch, “Multifactor approximation of rough volatility models”, SIAM Journal on Financial Mathematics, 10 (2019), pp. 309–349.
[2] C. Bayer, P. K. Friz, A. Gulisashvili, B. Horvath, and B. Stemper, “Short-time near-the-money skew in rough fractional volatility models”, Quant. Financ., 19 (2019), pp. 779–798.
[3] C. Bayer, P. K. Friz, P. Gassiat, J. Martin, and B. Stemper, “A regularity structure for rough volatility”, Mathematical Finance, 30 (2020), pp. 782–832.
[4] C. Bayer, P. K. Friz, and J. Gatheral, “Pricing under rough volatility”, Quant. Financ., 16 (2016), pp. 887–904.
[5] C. Bayer, C. B. Hammouda, and R. Tempone, “Hierarchical adaptive sparse grids and quasi-Monte Carlo for option pricing under the rough Bergomi model”, Quant. Financ. (2020), pp. 1–17.
[6] M. Bennedsen, A. Lunde, and M. S. Pakkanen, “Decoupling the short- and long-term behavior of stochastic volatility”, July 2017, arXiv preprint.
[7] M. Bennedsen, A. Lunde, and M. S. Pakkanen, “Hybrid scheme for Brownian semistationary processes”, Finance Stoch., 21 (2017), pp. 931–965.
[8] P. Carmona and L. Coutin, “Fractional Brownian Motion and the Markov Property”, Electron. Commun. Probab., 3 (1998), pp. 95–107.
[9] P. Carmona, L. Coutin, and G. Montseny, “Approximation of Some Gaussian Processes”, Stat. Inference Stoch. Process., 3 (2000), pp. 161–171.
[10] G. Didier, S. A. McKinley, D. B. Hill, and J. Fricks, “Statistical challenges in microrheology”, J. Time Series Anal., 33 (2012), pp. 724–743.
[11] O. El Euch and M. Rosenbaum, “The characteristic function of rough Heston models”, Mathematical Finance, 29 (2019), pp. 3–38.
[12] M. Fukasawa, T. Takabatake, and R. Westphal, “Is Volatility Rough?”, May 2019, arXiv preprint.
[13] J. Gatheral, T. Jaisson, and M. Rosenbaum, “Volatility is Rough”, Quant. Financ., 18 (2018), pp. 933–949.
[14] E. J. Hall, M. A. Katsoulakis, and L. Rey-Bellet, “Uncertainty quantification for generalized Langevin dynamics”, J. Chem. Phys., 145 (2016), p. 224108.
[15] P. Harms, “Strong convergence rates for Markovian representations of fractional Brownian motion”, Feb. 2019, arXiv preprint.
[16] P. Harms and D. Stefanovits, “Affine representations of fractional processes with applications in mathematical finance”, Stochastic Process. Appl., 129 (2019), pp. 1185–1228.
[17] B. Horvath, A. J. Jacquier, and A. Muguruza, “Functional central limit theorems for rough volatility”, May 2019, arXiv preprint.
[18] J. Huntley, Generation of Random Variates, MATLAB Central File Exchange, 2020 (visited on 01/01/2020).
[19] A. J. Jacquier and M. Oumgari, “Deep PPDEs for rough local stochastic volatility”, 2019, arXiv preprint.
[20] A. A. Muravlev, “Representation of a fractional Brownian motion in terms of an infinite-dimensional Ornstein–Uhlenbeck process”, Russian Math. Surveys, 66 (2011), pp. 439–441, doi: 10.1070/rm2011v066n02abeh004746.
[21] A. Neuenkirch and T. Shalaiko, “The order barrier for strong approximation of rough volatility models”, June 2016, arXiv preprint.
[22] L. Rey-Bellet, “Open classical systems”, in Open Quantum Systems II, ed. by S. Attal, A. Joye, and C. Pillet, Lecture Notes in Mathematics vol. 1881, Berlin: Springer, 2006.
[23] M. Romano and N. Touzi, “Contingent claims and market completeness in a stochastic volatility model”, Mathematical Finance, 7 (1997), pp. 399–412.
[24] D. Talay and L. Tubaro, “Expansion of the global error for numerical schemes solving stochastic differential equations”, Stochastic Anal. Appl., 8 (1990), pp. 483–509.

Appendix A. Details of numerical implementation
We compute the weak error (here we assume $T = 1$)
$$ \big|\, \mathbb{E}\big[\varphi(X^{\mathrm{ref}}_T) - \varphi(X^{\Delta t}_T)\big] \,\big|, $$
for various payoff functions $\varphi$, by the left-point scheme
$$ (\mathrm{A.1}) \qquad X_T(n) = \sum_{i=0}^{n-1} W^H_{t_i}\big(W_{t_{i+1}} - W_{t_i}\big), $$
using a reference solution $X^{\mathrm{ref}}_T := X_T(2^{\,\cdot})$ on a fine mesh and computations $X^{\Delta t}_T = X_T(n)$ for a range of dyadic $n$. We sample the paths $(W^H_{t_i}, W_{t_i})_{i\in[0:n]}$ required for the Monte Carlo approximation of (A.1) at points of the reference mesh using the Cholesky decomposition method. For each $\{H, T, n\}$ we first form the eigenvalue decomposition $(L, D)$ of the full covariance matrix
$$ \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}, $$
with blocks $(\Sigma_{11})_{ij} = \mathrm{Cov}(W^H_{t_i}, W^H_{t_j})$, $(\Sigma_{12})_{ij} = \mathrm{Cov}(W^H_{t_i}, W_{t_j})$, and $(\Sigma_{22})_{ij} = \mathrm{Cov}(W_{t_i}, W_{t_j})$. We refer to, e.g., Lemma 4.1 of [2] for the covariance function of the Riemann–Liouville fBm and to pfq.m from [18] to compute hypergeometric functions:

    % H, gam, T, and n are assumed set in the workspace
    t = T/n:T/n:T;
    % requires pfq.m from the MATLAB File Exchange [18]
    G = @(x) 2.0*H*( x.^(-gam)/(1.0-gam) + (gam*x.^(-1.0-gam)./(1.0-gam)) .* ...
        pfq([1.0, 1.0+gam], 3.0-gam, x.^(-1.0))/(2.0-gam) );
    disp('Computing blocks S11 S12 S22')
    [X,Y] = meshgrid(t,t);
    Gmat = G((tril(Y./X)+tril(Y./X)') - eye(size(Y,1)).*diag(Y./X));
    Gmat = Gmat.*~eye(size(Gmat)) + eye(size(Gmat)); % diag is 1 b/c G(1)=1
    S11 = ((tril(X)+tril(X)') - eye(size(X,1)).*diag(X)).^(2*H) .* Gmat;
    S12 = sqrt(2*H)*(Y.^(H+0.5)-(Y-min(Y,X)).^(H+0.5))./(H+0.5);
    S22 = min(X,Y);
    disp('Finished blocks S11 S12 S22')
    % eigenvalue (LDL-type) factorization used in place of Cholesky
    disp('Computing L D via eig')
    [L,D] = eig([S11 S12; S12.' S22]);

We then generate M samples using the LDL eigenvalue decomposition:

    z = randn(2*n,M);
    X = real(L*(D^0.5)*z);
    WH = [zeros(1,M); X(1:n, :)];
    W  = [zeros(1,M); X(n+1:end, :)];

and note that for $H = 0.5$ the $W^H$ generated by this method is identical to $W$.
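As a cross-check of the covariance construction above, the same joint sampling can be sketched in Python (illustrative only; this is not the paper's code, and the function names are ours). Here the $\Sigma_{11}$ block is obtained by direct numerical integration of $\mathrm{Cov}(W^H_s, W^H_t) = 2H\int_0^{s\wedge t}(t-r)^{H-1/2}(s-r)^{H-1/2}\,\mathrm{d}r$, which follows from the definition (1.2), rather than via the hypergeometric closed form used in the MATLAB listing; the eigenvalue factorization plays the role of `eig`:

```python
import numpy as np

def rl_fbm_cov(s, t, H, m=20001):
    # Cov(W^H_s, W^H_t) = 2H \int_0^{min(s,t)} (t-r)^{H-1/2} (s-r)^{H-1/2} dr;
    # the substitution s - r = w^2 removes the endpoint singularity at r = s
    s, t = min(s, t), max(s, t)
    if s == 0.0:
        return 0.0
    if s == t:
        return t ** (2.0 * H)  # closed form: 2H * t^{2H} / (2H)
    w = np.linspace(0.0, np.sqrt(s), m)
    f = 4.0 * H * w ** (2.0 * H) * (t - s + w ** 2) ** (H - 0.5)
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(w)))  # trapezoidal rule

def sample_paths(n, T, H, M, rng):
    # exact joint samples of (W^H, W) on the grid t_i = i T / n, i = 1..n
    t = np.linspace(T / n, T, n)
    S11 = np.array([[rl_fbm_cov(a, b, H) for b in t] for a in t])
    S12 = np.sqrt(2.0 * H) / (H + 0.5) * np.array(
        [[a ** (H + 0.5) - (a - min(a, b)) ** (H + 0.5) for b in t] for a in t])
    S22 = np.minimum.outer(t, t)
    Sigma = np.block([[S11, S12], [S12.T, S22]])
    lam, V = np.linalg.eigh(Sigma)           # eigenvalue factorization of Sigma
    L = V * np.sqrt(np.clip(lam, 0.0, None))  # clip guards tiny negative eigenvalues
    X = L @ rng.standard_normal((2 * n, M))
    return X[:n], X[n:]                       # (W^H_{t_i}), (W_{t_i})
```

For $H = 1/2$ all three blocks coincide with $\min(t_i, t_j)$, matching the remark above that $W^H$ then reproduces $W$.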
One sample of $X^{\mathrm{ref}}_T$ and $X^{\Delta t}_T$ at the final time can then be computed by summing the appropriate terms using the reference paths:

    % numDt and gap are assumed set in the workspace
    XTref(1,:) = sum(WH(1:end-1,:) .* diff(W(1:end,:)));
    XTdt = nan(numDt,M);
    for j = 1:numDt
        XTdt(j,:) = sum(WH(1:2^(j+gap-1):end-1,:) .* diff(W(1:2^(j+gap-1):end,:)));
    end

The code referenced above can be found in the git repository: https://bitbucket.org/datainformeduq/rbwc_code.

Appendix B. Bound auxiliary functions $f$ by $\varphi$

Lemma B.1.
Let $\varphi(x)$ be the payoff function, and define
$$ f_s\big(Y^{l_1}_{t_i}, \dots, Y^{l_k}_{t_i}\big) := \mathbb{E}\big[\nu(Z_{t_i}, s) \mid Y^{l_1}_{t_i}, \dots, Y^{l_k}_{t_i}\big] = \mathbb{E}\big[\mathbb{E}\big[\varphi^{(m)}(\widehat X_T) \mid Z_s = (X_{t_i}, Y_{t_i})\big] \mid Y^{l_1}_{t_i}, \dots, Y^{l_k}_{t_i}\big], $$
for components $(Y^{l_1}_{t_i}, \dots, Y^{l_k}_{t_i}) \subset Y_{t_i}$. Then for a multi-index $\alpha = (\alpha_1, \dots, \alpha_k)$,
$$ |\partial^\alpha f_s(0)| \lesssim \sum_{j=0}^{|\alpha|} \|\varphi^{(m+j)}\|_\infty. $$

Proof. (Lemma B.1) Recall that $\nu(Z_{t_i}, s)$ is a deterministic function of the jointly Gaussian $\eta$-dimensional vector (with $\eta := (N_L + 1)\times(i+1)$)
$$ \xi = (Y_\tau, \Delta W_\tau)_{\tau = t_0, \dots, t_i}, $$
(cf. (3.11)). Denoting the density by $\varrho(\xi)$, we note $\xi$ is mean zero and has variance–covariance $\Sigma$ given by the known quantities $\mathrm{Cov}(Y^l_{t_i}, Y^k_{t_j})$, $\mathrm{Cov}(Y^l_{t_i}, \Delta W_{t_j})$, and $\mathrm{Cov}(\Delta W_{t_i}, \Delta W_{t_j})$, which have closed-form expressions. For each $s$, the variable $\nu(Z_{t_i}, s)$ has a density proportional to $\varrho(\xi)$,
$$ \mathrm{d}P_\nu \propto \varrho(\xi)\,\mathrm{d}\xi. $$
Writing $y = (Y^{l_1}_{t_i}, \dots, Y^{l_k}_{t_i})$, a subset of components of $Y_{t_i}$ of size $k = |y|$ such that $k \le N_L$, the function
$$ f_s(y) = \mathbb{E}[\nu(Z_{t_i}, s) \mid y], $$
a deterministic function of $y$, can be expressed in terms of a conditional Gaussian density. Partitioning $\xi = (\widetilde\xi, y)$ and, likewise, the covariance matrix $\Sigma$ into components $\Sigma_{11}$ corresponding to $\widetilde\xi$, $\Sigma_{22}$ to $y$, and $\Sigma_{12}$ to the mixed terms, the conditional Gaussian density $\varrho(\widetilde\xi \mid y)$ has conditional mean
$$ \widetilde\mu = \Sigma_{12}\Sigma_{22}^{-1} y, $$
which is linear in $y$, and conditional variance–covariance matrix
$$ \widetilde\Sigma = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}^{\top}, $$
which does not depend on $y$. Rewritten in terms of this conditional density, $f$ is given by
$$ f_s(y) = \int_{\mathbb{R}^{\eta-k}} \nu(X_{t_i}, \widetilde\xi, y; s)\, \varrho(\widetilde\xi \mid y)\,\mathrm{d}\widetilde\xi, \qquad \text{where} \qquad \nu(X_{t_i}, \widetilde\xi, y; s) = \mathbb{E}\big[\varphi^{(m)}(\widehat X_T) \mid Z_s = (X_{t_i}, \widetilde\xi, y)\big]. $$
For a multi-index $\alpha = (\alpha_1, \dots, \alpha_k)$ corresponding to the components of $y$,
$$ \partial^\alpha = \frac{\partial^{\alpha}}{\partial y^{\alpha}} = \frac{\partial^{\alpha_1}}{\partial y_{l_1}^{\alpha_1}} \cdots \frac{\partial^{\alpha_k}}{\partial y_{l_k}^{\alpha_k}}; $$
taking the derivative inside the integral, we obtain
$$ \partial^\alpha f_s(y) = \int_{\mathbb{R}^{\eta-k}} \Big\{ \big(\partial^\alpha \nu(X_{t_i}, \widetilde\xi, y)\big)\,\varrho(\widetilde\xi \mid y) + \nu(X_{t_i}, \widetilde\xi, y)\,\partial^\alpha \varrho(\widetilde\xi \mid y) \Big\}\,\mathrm{d}\widetilde\xi = \int_{\mathbb{R}^{\eta-k}} \Big\{ \partial^\alpha \nu(X_{t_i}, \widetilde\xi, y) + \nu(X_{t_i}, \widetilde\xi, y)\, P_k(y) \Big\}\, \varrho(\widetilde\xi \mid y)\,\mathrm{d}\widetilde\xi, \quad (\mathrm{B.1}) $$
where $\partial^\alpha \varrho(\widetilde\xi \mid y) = P_k(y)\,\varrho(\widetilde\xi \mid y)$, for $P_k$ a polynomial of degree $k$, since
$$ \varrho(\widetilde\xi \mid y) \propto \exp\big[-\tfrac{1}{2}(\widetilde\xi - \widetilde\mu)^{\top}\widetilde\Sigma^{-1}(\widetilde\xi - \widetilde\mu)\big] = \exp\big[-\tfrac{1}{2}(\widetilde\xi - \Sigma_{12}\Sigma_{22}^{-1}y)^{\top}\widetilde\Sigma^{-1}(\widetilde\xi - \Sigma_{12}\Sigma_{22}^{-1}y)\big]. $$
The remaining derivative in (B.1) follows similarly to the computation of the fluxes in Lemma 3.7,
$$ \partial^\alpha \nu(X_{t_i}, \widetilde\xi, y; s) = c_H^{|\alpha|}\, \mathbb{E}\Big[\varphi^{(|\alpha|+m)}(\widehat X_T) \prod_{j=1}^{|\alpha|} \big(\Delta\theta_{l_j} M^{l_j}_{s,T}\big)^{\alpha_j} \,\Big|\, Z_s = (X_{t_i}, \widetilde\xi, y)\Big], $$
and the estimate follows.
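The partitioned-Gaussian computation at the heart of the proof — the conditional mean $\widetilde\mu = \Sigma_{12}\Sigma_{22}^{-1}y$, linear in $y$, and the conditional covariance $\widetilde\Sigma = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}^{\top}$, independent of $y$ — can be sketched in a few lines of Python (illustrative only; the function name is ours, not the paper's):

```python
import numpy as np

def gaussian_conditional(Sigma, k):
    # Condition a zero-mean Gaussian with covariance Sigma on its last k
    # components y; returns the mean map y -> S12 S22^{-1} y (linear in y)
    # and the Schur complement S11 - S12 S22^{-1} S12^T (independent of y).
    S11 = Sigma[:-k, :-k]
    S12 = Sigma[:-k, -k:]
    S22 = Sigma[-k:, -k:]
    A = S12 @ np.linalg.inv(S22)
    return (lambda y: A @ y), S11 - A @ S12.T

# bivariate example with unit variances and correlation rho: conditioning on
# the second component gives mean rho * y and variance 1 - rho^2
rho = 0.6
mean_map, cond_cov = gaussian_conditional(np.array([[1.0, rho], [rho, 1.0]]), 1)
```

In the proof above the same structure is what makes $P_k(y)$ polynomial: differentiating the conditional density in $y$ only acts on the exponent through the linear map $y \mapsto \Sigma_{12}\Sigma_{22}^{-1}y$.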