Weak error rates for option pricing under the rough Bergomi model
CHRISTIAN BAYER, ERIC JOSEPH HALL, AND RAÚL TEMPONE

Abstract.
In quantitative finance, modeling the volatility structure of underlying assets is a key component in the pricing of options. Rough stochastic volatility models, such as the rough Bergomi model [Bayer, Friz, Gatheral, Quantitative Finance 16(6), 887-904, 2016], seek to fit observed market data based on the observation that the log-realized variance behaves like a fractional Brownian motion with small Hurst parameter, H < 1/2, over reasonable timescales. Both time series data of asset prices and option-derived price data indicate that H often takes values close to 0.1 or smaller. We prove that the Euler-Maruyama (left-point) scheme for this model converges with weak rate H + 1/2.

1. Introduction
Rough stochastic volatility models form an increasingly popular paradigm in quantitative finance, as they simultaneously address two empirical challenges. Firstly, time series of realized variance indicate that variance is rough in the sense of having Hölder regularity H ≪ 1/2, see [13, 6, 12]. Secondly, rough volatility models recover the power-law explosion of the at-the-money implied volatility skew of the form τ^(−γ), for some 0 < γ < 1/2, as τ → 0. In fact, these two constants are linked by γ = 1/2 − H, giving further evidence of the regularity H being small, say around 0.1. We refer to [4] for the pricing perspective.

To fix notation, we consider a rough stochastic volatility model for an asset price process S_t of the form

  dS_t = √(v_t) S_t dZ_t,

where Z is a Brownian motion (Bm). There are two classes of rough volatility models which differ in the specification of the instantaneous variance component v_t. The rough Heston model ([11]) is an example of one kind, with v_t given as the solution to a Volterra stochastic differential equation (SDE) with a power-law kernel K(r) ∼ r^(H−1/2), r > 0. This paper considers an alternative where the variance process is an explicit function of a fractional Brownian motion (fBm) W_t^H, which need not be the classical fBm. For instance, the rough Bergomi model ([4]) is specified by the choice

(1.1)  v_t := ξ(t) exp( η W_t^H − (η²/2) t^(2H) ),

where ξ(t) denotes the forward variance and W_t^H denotes the Riemann–Liouville fBm given by

(1.2)  W_t^H := ∫_0^t K(t − s) dW_s,  K(r) := √(2H) r^(H−1/2),

for a Bm W with correlation ρ with Z.

The modelling advantages gained by addressing these two empirical challenges using a rough stochastic volatility model are paid for on both the theoretical and the numerical side. Indeed, rough stochastic volatility models are neither semi-martingales nor Markov processes. Despite the former, rough volatility

WIAS, Mohrenstr. 39, 10117 Berlin, Germany.
RWTH Aachen University, Chair of Mathematics for Uncertainty Quantification, Kackertstr. 9, 52072 Aachen, Germany.
University of Dundee, Division of Mathematics, Nethergate, Dundee, DD1 4HN, UK.
King Abdullah University of Science and Technology (KAUST), Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), Thuwal 23955-6900, Saudi Arabia.
E-mail addresses: [email protected], [email protected], [email protected].
2010 Mathematics Subject Classification.
Key words and phrases. Rough volatility, option pricing, weak error, Euler–Maruyama, non-Markovian dynamics.

models do not violate the no-arbitrage condition, as the asset price process itself is a martingale. On the other hand, the difficulties caused by the lack of the Markov property are more severe. In particular, there is no finite-dimensional pricing PDE anymore (although we refer to [19] for an implementation of an infinite-dimensional pricing PDE based on machine learning). For some rough volatility models of affine Volterra type, for instance the rough Heston model, there is still a semi-explicit formula for the asset price's characteristic function in terms of a deterministic fractional ODE. Otherwise, the rough stochastic volatility approach necessitates simulation-based methods.

On the numerical side, W^H at times t_1, ..., t_N can be sampled exactly at discrete time points since W^H is a Gaussian process with known covariance function. (The hybrid scheme of [7] is a popular alternative to exact simulation, sacrificing accuracy for speed.) However, simulation of S_t requires discretization of a stochastic integral, even in simple cases like the rough Bergomi model. As we shall see in further detail later, we essentially need to compute stochastic integrals of the form

(1.3)  ∫_0^T ψ(t, W_t^H) dW_t,

for some deterministic, "nice" function ψ. In particular, note that the integrand is adapted and square-integrable (under appropriate conditions). Hence, the stochastic integral exists in the classical Itô sense, and strong convergence of the numerical scheme

(1.4)  Σ_{i=0}^{n−1} ψ(t_i, W_{t_i}^H)(W_{t_{i+1}} − W_{t_i})

is also classical. The speed of convergence is considerably less clear.
Indeed, Neuenkirch and Shalaiko [21] proved strong convergence with rate H for a very similar problem, i.e., phrased in terms of classical fBm, and strong rate H is widely expected to hold also for the approximation scheme (1.4) to (1.3). Using techniques from regularity structures, in particular renormalization by an exploding constant, [3] proved essentially the same strong rate for a Wong-Zakai type approximation of (1.3).

Combining our observations, namely that volatility is rough (H ≈ 0.1) and that typical schemes converge with strong rate H, we run into problems, as the rate of convergence is so small as to make it indistinguishable from lack of convergence in many cases of practical importance. Indeed, suppose that H = 0.1 and that we need n time steps to reach an error tolerance ε. If we now decrease our tolerance by a factor of ten, i.e., we require one additional significant digit, then the number of time steps needed increases by a factor of 10^(1/H) = 10^10 in the asymptotic regime.

For most applications we really require weak as opposed to strong convergence of the numerical scheme. For instance, the price of a European option with payoff ϕ is E[ϕ(S_T)], and its computation relies on weak convergence of the scheme. Weak approximation of stochastic integrals is often much faster than strong approximation. Consider the Euler scheme for standard SDEs (the case H = 1/2), where the weak rate is one while the strong rate is, in general, only 1/2. This poses the interesting question of the relation between the Hölder regularity and the convergence rates. The strong rate for (1.4) is expected to be H, but there are several plausible candidates for the weak rate: 2H, H + 1/2, and 1 (independent of H).¹ We stress that only the last two alternatives allow for feasible numerical simulations in the truly rough regime. Bluntly put, if the true weak error only decays proportionally to n^(−H) in the number of time steps n, then simulation methods are not viable numerical methods for option pricing in rough volatility models.

Despite the importance of the problem of determining the weak rate, only little work has been done. Horvath, Jacquier and Muguruza [17] study a Donsker theorem for a rough volatility model, which translates into a weak tree-type approximation. The rate of convergence of their method is H in the number of time steps. At this stage, we should note that the trees are non-recombining, implying that the memory load increases exponentially in the number of time steps. To the best of our knowledge, this work provides the only rigorous weak convergence result in the literature on rough volatility models. Indeed, it is worth pointing out that standard proof techniques for diffusions, see [24], strongly rely on the Markov property and are, hence, not applicable in this setting.

At the same time, discretization-based simulation methods are often used in the literature, with great success. While convergence is rarely considered (not even empirically), we would expect to see difficulties emerge in the very rough cases H ≈ 0.1 if the weak rate were H or 2H. In fact, the few available empirical studies (for instance, [5]) indicate a much larger weak rate of convergence. Indeed, the authors of [5] observe a weak rate of one which is stable enough to allow accelerated convergence by Richardson extrapolation.

¹ Anecdotally, we asked several experts on stochastic numerics in early stages of working on this problem, and all three possibilities were put forward.
In this paper we prove novel weak rates for the convergence of the left-hand rule (1.4) to (1.3):
Theorem 1.1.
The left-point approximation (1.4) to the rough stochastic integral (1.3) converges with weak rate H + 1/2 for ψ(t, W_t^H) = W_t^H. In the case that the payoff ϕ is a quadratic polynomial, the convergence is of weak rate one.

We refer to Theorems 2.1 and 4.1 for more precise statements. Some remarks are in order:
• The problem of weak convergence in this setting is very subtle; if we restrict ourselves to quadratic polynomials as payoff functions, then the weak rate of convergence is actually one, see Lemma 4.2. This implies that the rate of convergence for payoff functions ϕ which can be well approximated by quadratic polynomials, as seen from the law of the solution, may be hard to distinguish from rate one empirically, due to the prevalence of higher-order terms (see Figure 3).
• We do not have a lower bound establishing that the weak rate of convergence cannot be better than H + 1/2.
• We do not doubt that the proof extends to the general case of non-linear ψ, and a complete proof of this extension is presently a work in progress.

Our proof of Theorem 1.1 relies on deriving Taylor expansions for the weak error using an affine Markovian representation of the underlying. The basic flavor of this approach, i.e., obtaining a Markovian extended-variable system to facilitate analysis, is a strategy utilized for other non-Markovian stochastic dynamical systems such as the generalized Langevin equation (see, e.g., [14, 10]) and open Hamiltonian systems ([22]). In the context of rough volatility models, Markovian approximations were also used in [1].

Outline of the paper.
In Section 2 we provide the setting and the main result and discuss the general strategy of the proof. Section 3 introduces auxiliary, Markovian approximations to both (1.3) and (1.4) based on [8]. This high-dimensional Markovian problem will serve as a surrogate problem for most of the convergence analysis. Section 4 considers the special case of quadratic payoff functions, for which the general proof strategy simplifies considerably. We contrast this with a specific proof only applicable to quadratic payoffs, which also works for general non-linear ψ. The proof of Theorem 1.1 (and Theorem 2.1) is then carried out in Section 5.

2. Problem setting: weak rate of convergence for the Euler scheme is H + 1/2

We consider E[ϕ(X_T)] for an underlying

(2.1)  X_t := ∫_0^t ψ(s, W_s^H) dW_s,

where W_t^H is a Riemann–Liouville fBm given by (1.2) with Hurst parameter H ∈ (0, 1/2). The X_t in (2.1) is non-Markovian as W_t^H, and hence ψ(t, W_t^H), depends on the full history of (W_s)_{s∈[0,t]} (cf. the relation of ψ to the instantaneous variance v_t in (1.1)). In fact, for the purposes of European options, the rough Bergomi model can be reduced to (2.1) in the following way (often attributed to [23]). First, Itô's formula implies that

  S_T = S_0 exp( −(1/2) ∫_0^T v_s ds + ∫_0^T √(v_s) dZ_s ).

We can now replace the Bm Z by ρW + √(1 − ρ²) W^⊥ for an independent Bm W^⊥. Conditionally on W, S_T has a log-normal distribution with parameters

  μ := log S_0 − (1/2) ∫_0^T v_s ds + ρ ∫_0^T √(v_s) dW_s,  σ² := (1 − ρ²) ∫_0^T v_s ds.

If we denote the Black–Scholes price for the payoff function ϕ at maturity T by C_BS(S_0, σ_BS² T, ϕ), for interest rate r = 0 and volatility σ_BS, then we get

(2.2)  E[ϕ(S_T)] = E[ C_BS( S_0 exp( −(ρ²/2) ∫_0^T v_s ds + ρ ∫_0^T √(v_s) dW_s ), (1 − ρ²) ∫_0^T v_s ds, ϕ ) ].

Computation of the right-hand side of (2.2) requires simulation of the Lebesgue integral ∫_0^T v_s ds as well as simulation of

(2.3)  ∫_0^T √(v_s) dW_s = ∫_0^T √(ξ(s)) exp( (η/2) W_s^H − (η²/4) s^(2H) ) dW_s,

which is of the form (2.1).
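The conditional Black–Scholes representation (2.2) is easy to exploit once the two integrals ∫_0^T v_s ds and ∫_0^T √(v_s) dW_s have been simulated. The sketch below (function names are ours, not the paper's) evaluates one conditional Black–Scholes price; as a sanity check, for ρ = 0 it reduces to the plain Black–Scholes price with total variance ∫_0^T v_s ds, whatever the W-path contributed.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S0, K, total_var):
    """Black-Scholes call with interest rate r = 0, parametrized by the
    total variance sigma_BS^2 * T."""
    if total_var <= 0.0:
        return max(S0 - K, 0.0)
    s = math.sqrt(total_var)
    d1 = (math.log(S0 / K) + 0.5 * total_var) / s
    return S0 * norm_cdf(d1) - K * norm_cdf(d1 - s)

def mixing_price(S0, K, rho, int_v, int_sqrt_v_dW):
    """One conditional Black-Scholes evaluation in (2.2): given sampled
    values of int_0^T v_s ds and int_0^T sqrt(v_s) dW_s, the conditional
    price is a BS price with shifted spot and reduced total variance."""
    spot = S0 * math.exp(-0.5 * rho ** 2 * int_v + rho * int_sqrt_v_dW)
    return bs_call(spot, K, (1.0 - rho ** 2) * int_v)

# Sanity check: for rho = 0 the mixing price is the plain BS price,
# independently of the simulated W-integral.
p_mix = mixing_price(1.0, 1.0, 0.0, 0.04, 0.123)
p_bs = bs_call(1.0, 1.0, 0.04)
```

In a Monte Carlo pricer one averages `mixing_price` over many sampled pairs of the two integrals; the smoothing provided by the conditional Black–Scholes price is exactly the point of (2.2).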
Presently, we derive weak rates of convergence,

(2.4)  |E[ϕ(X_T) − ϕ(X_T^∆t)]| = O(∆t^γ),

for the left-point scheme (1.4) with step size ∆t such that n∆t = T. Restricting to the case ψ(s, W_s^H) = W_s^H, the main finding of this work, in Theorem 2.1 (and implying the first statement in Theorem 1.1), is that the weak rate is γ = H + 1/2, significantly better than the strong rate H.

Theorem 2.1 (Weak rate). For general ϕ ∈ C_b^η, for a sufficiently large integer η depending on H, and ψ(s, W_s^H) = W_s^H, we have

  |Err(T, ∆t)| = |E[ϕ(X_T) − ϕ(X_T^∆t)]| = O(∆t^(H+1/2)),

i.e. the Euler method has weak rate H + 1/2.

The proof of Theorem 2.1 is presented in Section 5. Before diving into the machinery needed for the proof, we first consider some numerical evidence that supports the rates in Theorem 1.1 and the accompanying remarks. Details of the implementation are outlined in Appendix A.

The first group of numerical experiments, in Figure 1, provides support for rate H + 1/2, significantly better than strong rate H, for the general (i.e. non-quadratic) payoff functions ϕ(x) = x³ and ϕ(x) = Heaviside(x). Indeed, the best fits (least squares) of the weak error to ∆t, as well as the extremes suggested by the upper and lower 95% confidence intervals for the mean based on M samples, are consistent with the rate H + 1/2. Comparing Figure 1a to Figure 1b, the rate increases (by approximately 0.1) as H increases from H = 0.05 to 0.15. Although the function ϕ(x) = Heaviside(x) is not continuous and therefore does not fit precisely into our theory, the consistency of the observed rates in Figure 1 hints at the generality of the findings in Theorem 2.1 to, e.g., digital call options.

Figure 1.
For small Hurst parameters, (a) H = 0.05 and (b) H = 0.15, the best-fit slope for the weak error of scheme (1.4), together with the extremes suggested by the 95% CI based on M observations, is consistent with the rate H + 1/2 for several payoffs ϕ. In particular, the rate holds for the discontinuous ϕ(x) = Heaviside(x), suggesting our findings are robust. Here ∆t ranges over dyadic meshes and the reference solution is computed on a finer dyadic mesh ∆t_ref.

In Figure 2b, we observe that for H = 0.50, i.e. standard Brownian motion, the best fit of the weak error rate is consistent with the known weak rate one for general payoff functions. However, in contrast to the rates observed in Figure 1, the behavior of quadratic payoffs looks decidedly different. We observe in Figure 2a that the weak rate for the quadratic ϕ(x) = x² appears to be γ = 1 even for small H = 0.05 and H = 0.15. Weak rate one for quadratic payoff functions is recorded in Theorem 4.1 and Lemma 4.2 in Section 4; this surprising finding, that the rate depends on the payoff function, will be readily explained using the asymptotic expansions that are at the center of our approach.

Finally, in Figure 3, we observe that the best fit of the weak rate to ∆t for the shifted cubic ϕ(x) = (x + c)³, for a fixed shift c, is consistent with rate 1 even for small H = 0.05 and H = 0.15 (cf. the rates in Figure 3 with those for the cubic payoff ϕ(x) = x³ in Figures 1a and 1b). As seen from the law of the solution, the shifted cubic is better approximated by quadratic polynomials and therefore its rate of convergence is much harder to distinguish from rate one. This numerical experiment not only drives home the subtlety of the problem of deriving weak rates for the rough Bergomi model, but also leads us to be optimistic that efficient numerical methods can be obtained for a wide array of real-world problems where the effective rate of convergence is not as bad as the theoretical rate.

Figure 2. (a) Surprisingly, the best-fit line for the weak error of scheme (1.4) for the quadratic payoff ϕ(x) = x² is consistent with weak rate one even for small H, as found in Theorem 4.1. (b) For Hurst parameter H = 1/2, the weak rate in Theorem 2.1 for scheme (1.4) is consistent with the expected rate one (for standard Bm), as illustrated by the best-fit slope for the weak error for ϕ(x) = x³ and ϕ(x) = Heaviside(x) (cf. weak rate H + 1/2 for small H). Here ∆t ranges over dyadic meshes and the reference solution is computed on a finer dyadic mesh ∆t_ref.

Figure 3.
The weak error of scheme (1.4) for the shifted cubic payoff ϕ(x) = (x + c)³ achieves a higher rate than ϕ(x) = x³, as the shifted cubic is better approximated by a quadratic on the support of the distribution of the underlying. Here ∆t ranges over dyadic meshes and the reference solution is computed on a finer dyadic mesh ∆t_ref.

Remark 2.2. Although the assumptions of Theorem 2.1 seem extremely strong, they do reflect meaningful financial situations. In particular, note that (2.2) allows us to replace the (generally non-smooth) payoff functions of European options by their smooth Black–Scholes prices. Additionally, put-call parity may allow us to assume bounded payoffs. Linearity of ψ is, admittedly, a very strong assumption, which should be seen as the first stepping stone to the general result. We conjecture that Theorem 2.1 holds in the setting of the rough Bergomi model, i.e., for non-linear ψ as given in (2.3).
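The best-fit slopes reported in Figures 1-3 amount to a least-squares fit of log |Err| against log ∆t; a minimal sketch (the synthetic data below decay at the proven rate H + 1/2 with H = 0.1 and are illustrative only):

```python
import math

def fit_rate(dts, errs):
    """Least-squares slope of log(err) against log(dt): the empirical
    weak rate gamma in |Err(T, dt)| ~ C * dt^gamma."""
    xs = [math.log(d) for d in dts]
    ys = [math.log(e) for e in errs]
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    num = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    den = sum((x - xbar) ** 2 for x in xs)
    return num / den

# Synthetic errors decaying at rate H + 1/2 = 0.6 on dyadic meshes:
dts = [2.0 ** (-k) for k in range(4, 10)]
errs = [3.0 * dt ** 0.6 for dt in dts]
rate = fit_rate(dts, errs)
```

With real Monte Carlo data the fitted slope also carries statistical error, which is what the 95% confidence intervals in the figures quantify.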
Remark 2.3. For the simple model problem (1.3), the numerical integration scheme (1.4) is the left-point approximation. If the problem were not trivialized to a stochastic integral, then in general X_T^∆t would correspond to the Euler–Maruyama approximation of the underlying SDE, and we refer to the scheme interchangeably as both.

In the next section, we introduce the notation and concepts that will be used to derive asymptotic expansions for the weak error in powers of ∆t. In particular, we first use these expansions to derive weak rate one for quadratic payoffs, see Theorem 4.1, in Section 4. Finally, in Section 5, a proof following the approach used for Theorem 4.1 as a guide is given for Theorem 2.1, obtaining weak rate H + 1/2.

3. Markovian extended state space formulation
We first consider a well-known affine representation for the driving fBm. Discretizing this affine representation yields an extended state space for the dynamics of the underlying. A novelty of our approach is to utilize this formulation to obtain asymptotic expansions for the weak error. In particular, we utilize the Markovian structure of the extended state space to show that (3.7) admits a Taylor expansion in ∆t where the coefficients can be controlled independently of the choice of parameters used to obtain the extended state space formulation.

3.1. Affine representations for small Hurst index.
Over the Hurst parameter regime of interest, the fBm (1.2) admits an affine representation as a linear functional of an infinite-dimensional family of Ornstein–Uhlenbeck (OU) processes ([8]).
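The representation below hinges on writing the power kernel as a Laplace transform, (t − s)^(H−1/2) = Γ(1/2 − H)^(−1) ∫_0^∞ θ^(−(H+1/2)) e^(−θ(t−s)) dθ. This Gamma-integral identity can be verified numerically; the substitution θ = u^(2/(1−2H)) used in the sketch below removes the singularity at θ = 0 (the quadrature parameters are illustrative choices):

```python
import math

def kernel_via_laplace(H, r, n=20000, U=4.0):
    """Numerically evaluate Gamma(1/2 - H)^{-1} *
    int_0^inf theta^{-(H + 1/2)} e^{-theta r} dtheta.

    Substituting theta = u^{2/(1-2H)} turns the integrand into the smooth
    function (2/(1-2H)) * exp(-u^{2/(1-2H)} * r), integrated here by the
    trapezoid rule on [0, U].
    """
    p = 2.0 / (1.0 - 2.0 * H)
    h = U / n
    total = 0.5 * (1.0 + math.exp(-(U ** p) * r))   # endpoint terms
    for k in range(1, n):
        total += math.exp(-((k * h) ** p) * r)
    integral = (2.0 / (1.0 - 2.0 * H)) * total * h
    return integral / math.gamma(0.5 - H)

H, r = 0.1, 0.7
lhs = kernel_via_laplace(H, r)
rhs = r ** (H - 0.5)   # the kernel power (t - s)^{H - 1/2} with r = t - s
```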
Lemma 3.1 (Affine representation). For 0 < H < 1/2,

(3.1)  W_t^H = c̃_H ∫_0^∞ Ỹ_t(θ) θ^(−(H+1/2)) dθ,

where Ỹ_t(θ) = ∫_0^t e^(−θ(t−s)) dW_s and c̃_H is a positive and finite constant depending on H.

Although this statement is well known, we provide key details of the proof, which will be referenced later, for the convenience of the reader. The full proof can be found in, e.g., [8, 16] (see also [9, 20, 15], where [9] gives a Markovian representation for H > 1/2 and [20] a time-homogeneous Markovian representation that is also defined for t ∈ (−∞, ∞)).

Proof.
Writing the kernel appearing in (1.2) as a Laplace transform,

  (t − s)^(H−1/2) = (1/Γ(1/2 − H)) ∫_0^∞ θ^(−(H+1/2)) e^(−θ(t−s)) dθ,

and then using stochastic Fubini, one obtains the desired result,

  W_t^H = ∫_0^t ( √(2H)/Γ(1/2 − H) ) ∫_0^∞ θ^(−(H+1/2)) e^(−θ(t−s)) dθ dW_s
        = ∫_0^∞ c̃_H ( ∫_0^t e^(−θ(t−s)) dW_s ) θ^(−(H+1/2)) dθ
        = c̃_H ∫_0^∞ Ỹ_t(θ) θ^(−(H+1/2)) dθ,

where c̃_H := √(2H)/Γ(1/2 − H) < ∞. □

A key tool in our proof of the weak rates will be to utilize the Markovian structure of a projection of the fBm obtained by discretizing the affine representation of Lemma 3.1. We observe that the integral in (3.1) has a singularity at θ = 0, but behaves essentially like θ^(−(H+1/2)) before Ỹ_t(θ) vanishes in the limit of large θ. To make (3.1) more amenable to quadrature, we remove the singularity by introducing the change of variable ϑ = θ^((1−2H)/2), i.e., θ = ϑ^(2/(1−2H)), thereby obtaining the representation

(3.2)  W_t^H = c_H ∫_0^∞ Ỹ_t(ϑ^(2/(1−2H))) dϑ = c_H ∫_0^∞ Y_t(θ) dθ,

where the constant,

  c_H := c̃_H · 2/(1 − 2H) = 2√(2H) / ( (1 − 2H) Γ(1/2 − H) ),

is an increasing function of H ∈ (0, 1/2) such that 0 < c_H < 1. In (3.2),

(3.3)  Y_t(θ) = ∫_0^t e^(−(t−s)θ^p) dW_s

is an OU process with speed of mean-reversion given by θ^p with the positive power

(3.4)  p := 2/(1 − 2H) > 2.

One realization of Y_t(θ) is plotted in Figure 4, together with an envelope illustrating plus/minus two standard deviations of Y_t(θ), computed using the formula for the covariance, i.e.,

  Cov(Y_t(θ), Y_t(η)) = (1 − e^(−(θ^p + η^p)t)) / (θ^p + η^p).

Replacing the integral in (3.2) with a quadrature rule in the parameter θ yields a projection of the fBm onto a finite state space.

Figure 4.
A sample of Y_t(θ) in (3.3), at left, with speed of mean reversion θ^(2/(1−2H)) for H = 0.07, plotted together with an envelope demonstrating plus/minus two standard deviations, at right (cf. time series data of asset prices and option-derived price data indicate that H often takes values close to 0.1). Y_t(θ) is a smooth, analytic function of θ, and discretizing in θ yields an extended-variable state space which we utilize in our analysis.

Lemma 3.2 (Approximate affine representation). For 0 < H < 1/2, let

(3.5)  Ŵ_t^H = c_H Σ_{l=1}^{N_L} Y_t^l ∆θ_l =: S(Y_t),

depend on N_L degrees of freedom Y_t = (Y_t^1, ..., Y_t^{N_L}), where Y_t^l := Y_t(θ_l) are OU processes as in (3.3) with speed of mean-reversion θ_l^p for p in (3.4). Then Ŵ^H converges to W^H as L, N_L → ∞ in L²(Ω; C([0, T])).

Proof. We obtain an approximate affine representation of the fBm by discretizing (3.2) in two steps. First, we divide the integral in (3.2) into two parts,

  W_t^H = c_H ∫_0^L Y_t(θ) dθ + c_H ∫_L^∞ Y_t(θ) dθ,  R_L(Y_t) := c_H ∫_L^∞ Y_t(θ) dθ,

where R_L denotes the error in restricting the integral to a fixed computational domain L > 1. Second, we consider a quadrature rule

  W_t^H = c_H Σ_{l=1}^{N_L} Y_t(θ_l) ∆θ_l + R_{N_L}(Y_t) + R_L(Y_t),

with points 0 ≤ θ_1 < ··· < θ_{N_L} ≤ L and weights ∆θ_l = θ_{l+1} − θ_l, where R_{N_L} denotes the quadrature truncation error.

That Ŵ^H converges to W^H in the limit of L, N_L essentially follows from the "strong rates" in [15]. The error R_{N_L} can be made arbitrarily small as Y_t(θ) is a smooth, bounded (even analytic) function of θ (e.g. see Figure 4), i.e. the regularity in θ allows one to approximate efficiently using arbitrarily high-order quadrature rules if desired ([15]). The error R_L(Y_t) is a mean-zero Gaussian process; for all δ ∈ [0, H),

  sup_{L∈[1,∞]} L^δ ‖ sup_{t∈[0,T]} |R_L(Y_t)| ‖_{L^p(Ω)} < ∞,

which guarantees integrability ([15, Lemma 1(b)]). Then R_L can be made arbitrarily small for sufficiently large L by observing that the variance,

  Var[R_L(Y_t)] = c_H² ∫_L^∞ ∫_L^∞ Cov(Y_t(θ), Y_t(η)) dθ dη ≤ c_H² (π/2) ∫_L^∞ θ^(1−p) dθ = c_H² π L^(2−p) / (2(p − 2)),

decays in L since p > 2 (here we bounded the inner integral by ∫_0^∞ dη/(θ^p + η^p) ≤ (π/2) θ^(1−p)). □

We split the weak error (2.4) using the representations in Lemmas 3.1 and 3.2,

(3.6)  E[ϕ(X_T(W^H)) − ϕ(X_T(Ŵ^H))] + E[ϕ(X_T(Ŵ^H)) − ϕ(X_T^∆t(Ŵ^H))] + E[ϕ(X_T^∆t(Ŵ^H)) − ϕ(X_T^∆t(W^H))],

where we emphasize the dependence of the underlying on the driving process. The first and third terms both correspond to the error in approximating W^H with Ŵ^H and therefore vanish by Lemma 3.2. Indeed, we have the following result.

Lemma 3.3.
Assume that ϕ and ψ are Lipschitz, the latter uniformly in time, with Lipschitz constants K_ϕ and K_ψ, respectively. Then the first and the third terms of (3.6) converge to zero as N_L, L → ∞. Regarding the third term, the convergence is uniform with respect to ∆t.

Proof. We consider the third term first. By basic probabilistic estimates using the Lipschitz property of ϕ and ψ, we have

  |E[ϕ(X_T^∆t(Ŵ^H)) − ϕ(X_T^∆t(W^H))]|
    ≤ K_ϕ E[ (X_T^∆t(Ŵ^H) − X_T^∆t(W^H))² ]^(1/2)
    = K_ϕ E[ ( Σ_{i=0}^{n−1} ( ψ(s_i, Ŵ_{s_i}^H) − ψ(s_i, W_{s_i}^H) ) W_{s_i,s_{i+1}} )² ]^(1/2)
    = K_ϕ ( Σ_{i=0}^{n−1} E[ ( ψ(s_i, Ŵ_{s_i}^H) − ψ(s_i, W_{s_i}^H) )² ] (s_{i+1} − s_i) )^(1/2)
    ≤ K_ϕ K_ψ ( Σ_{i=0}^{n−1} E[ ( Ŵ_{s_i}^H − W_{s_i}^H )² ] (s_{i+1} − s_i) )^(1/2)
    ≤ K_ϕ K_ψ ‖Ŵ^H − W^H‖_{L²(Ω; C([0,T]))} T^(1/2) → 0

as N_L, L → ∞ by Lemma 3.2. The result for the first term follows in the same manner. □

Remark (Rates in N_L and L). Following [15, Theorem 1], convergence rates in N_L and L for the first and third terms of (3.6) could undoubtedly be established. Keep in mind, however, that we only use the scheme X_T^∆t(Ŵ^H) as a tool for the analysis of the scheme X_T^∆t(W^H), i.e., with exact simulation of W^H. Consequently, rates of convergence in N_L and L are not required to get rates of convergence of X_T^∆t(W^H) in terms of ∆t. Indeed, with respect to the actual scheme X_T^∆t(W^H) analyzed in this paper, the error contributions from the first and third terms vanish.

The sole remaining term in (3.6),

(3.7)  Err(T, ∆t) := E[ϕ(X_T(Ŵ^H))] − E[ϕ(X_T^∆t(Ŵ^H))],

which gives the weak error of the Euler scheme, depends on the approximate affine representation in Lemma 3.2. Indeed, suppose that we are given an error tolerance ε.
By Lemma 3.3, we can find L = L(ε, H, T, ϕ, ψ) and N_L = N_L(ε, H, T, ϕ, ψ) such that the first and third terms of (3.6) are bounded by ε/3, uniformly in ∆t. Our task is now to choose time steps ∆t such that also the second term is bounded by ε/3, uniformly in L, N_L. In the next section, we will obtain an extended-variable system for the dynamics of the underlying that we will use to obtain asymptotic expansions for (3.7).

Remark. In the interest of keeping our arguments constructive, we first fixed a computational domain L and then introduced a quadrature based on N_L points without specifying the precise rule. One could also choose, e.g., a Gauss–Laguerre quadrature suitable for the half-line, thereby reducing the number of parameters to one (see also [15]). The splitting (3.7) still holds.
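The quality of such a quadrature is easy to monitor: since the representation (3.2) is exact, the variance of W_t^H, namely t^(2H), equals c_H² ∬ Cov(Y_t(θ), Y_t(η)) dθ dη, and a discretized double integral over [0, L]² should reproduce t^(2H). A sketch with an illustrative uniform (midpoint) grid, not the paper's choice of rule:

```python
import math

def fbm_variance_from_ou(H, t, L=8.0, N=400):
    """Approximate Var[W_t^H] = c_H^2 * int int Cov(Y_t(a), Y_t(b)) da db
    on a uniform midpoint grid over [0, L]^2, using the closed-form OU
    covariance Cov = (1 - e^{-(a^p + b^p) t}) / (a^p + b^p).
    Grid parameters are illustrative, not the paper's choices.
    """
    p = 2.0 / (1.0 - 2.0 * H)                              # (3.4)
    cH = 2.0 * math.sqrt(2.0 * H) / ((1.0 - 2.0 * H) * math.gamma(0.5 - H))
    h = L / N
    nodes = [(l + 0.5) * h for l in range(N)]
    acc = 0.0
    for a in nodes:
        ap = a ** p
        for b in nodes:
            lam = ap + b ** p
            acc += -math.expm1(-lam * t) / lam             # (1 - e^{-lam t}) / lam
    return cH ** 2 * acc * h * h

v = fbm_variance_from_ou(0.3, 1.0)   # exact value: 1.0^{2 * 0.3} = 1
```

The truncation and quadrature errors here are exactly the R_L and R_{N_L} of Lemma 3.2; enlarging L and N drives `v` toward t^(2H).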
3.2. Forward Euler scheme for the extended variable system.
Substituting (3.5) into the underlying (2.1) yields

(3.8)  X̂_t := ∫_0^t ψ(s, Ŵ_s^H) dW_s = ∫_0^t ψ(s, S(Y_s)) dW_s,

a finite-dimensional Markovian approximation X̂_t = X_t(Ŵ^H) of the underlying X_t that appears in the weak error (3.7). The dynamics of (3.8) are described by

  Z = (X̂, Y^1, ..., Y^{N_L}),

a d-dimensional extended-variable state space (d = N_L + 1), solving the system

(3.9)  dZ_t = −b(Z_t, t) dt + σ(Z_t, t) dW_t,  Z_0 = 0,

with d-vectors b and σ given by

  b(Z_t, t) := (0, Y_t^1 θ_1^p, ..., Y_t^{N_L} θ_{N_L}^p),
  σ(Z_t, t) := ( ψ(t, S(Y_t)), 1, ..., 1 ),

where there is a single driving Brownian motion, i.e. (3.9) is a degenerate system.

For the interval [0, T], we define the uniform time-grid discretization t_i := i∆t for i = 0, ..., n with n := T/∆t, and consider the Euler–Maruyama scheme for the underlying,

(3.10)  X_{t_{i+1}} = X_{t_i} + ψ(t_i, S(Y_{t_i})) ∆W_{t_i},  X_{t_0} = 0,

where

  ∆W_{t_i} := W_{t_i,t_{i+1}} = W_{t_{i+1}} − W_{t_i}

are the increments of the driving Wiener process and where at each time step the vector of extended variables Y_· = (Y_·^l)_{l=1,...,N_L} is sampled exactly. That is, for the Euler update at t_{i+1}, one can form the joint distribution

(3.11)  (Y_τ, ∆W_τ)_{τ = t_0, ..., t_i},

an (N_L + 1) × (i + 1)-dimensional Gaussian. The variance-covariance matrix for (3.11) can be obtained using the known covariances Cov(Y_{t_i}^k, Y_{t_j}^l), Cov(Y_{t_i}^k, ∆W_{t_j}), and Cov(∆W_{t_i}, ∆W_{t_j}), and then the target variables Y_{t_i} required in (3.10) can be sampled exactly from the joint distribution, e.g., using the Cholesky decomposition. We extend X in (3.10) to all t ∈ [0, T] by the interpolation

(3.12)  X(t) = ∫_0^t ψ(s, Ŵ_{κ_s}^H) dW_s = ∫_0^t ψ( s, c_H Σ_{l=1}^{N_L} ∆θ_l Y_{κ_s}^l ) dW_s,

where κ_s = t_i if s ∈ [t_i, t_{i+1}) for each i = 0, ..., n − 1, with dynamics Z̄ = (X, Y^1, ..., Y^{N_L}) embedded in the SDE

(3.13)  dZ̄_s = −b(Z̄_s) ds + σ̄(Z̄_s) dW_s,  s ∈ [0, T],  Z̄_0 = 0,

with coefficients

  b(Z̄_s) = (0, Y_s^1 θ_1^p, ..., Y_s^{N_L} θ_{N_L}^p) = b(Z̄_s, s),
  σ̄(Z̄_s) = ( ψ(s, S(Y_{κ_s})), 1, ..., 1 ) = σ(Z̄_{κ_s}, s).

For the system (3.9) of Section 3.2, we are able to formulate a corresponding Kolmogorov backward equation. For a smooth and bounded payoff ϕ(Z_T) := ϕ(X̂_T), we consider the value function

(3.14)  u(z, t) := E[ϕ(Z_T) | Z_t = z] = E[ϕ(X̂_T) | Z_t = z],

that is, the conditional expected value of the payoff at time t < T given the starting value Z_t = z for z = (z_1, ..., z_d) = (x, y_1, ..., y_{N_L}). The associated Kolmogorov backward equation is given by

(3.15)  ∂_t u(z, t) − b_j(z, t) D_j u(z, t) + (1/2) A_{jk}(z, t) D_{jk} u(z, t) = 0,  t < T, z ∈ R^d,
        u(z, T) = ϕ(z_1),

where repeated indices indicate summation (over 1, ..., d), b_j = b_j(z, t) is the j-th component of the d-vector

  b(z, t) := (0, z_2 θ_1^p, ..., z_{N_L+1} θ_{N_L}^p),

and A_{jk} = A_{jk}(z, t) are elements of the d × d matrix A = σσ*,

(3.16)  A_{11} = ψ²(t, S(z)),  A_{1j} = A_{j1} = ψ(t, S(z)) for j > 1,  A_{jk} = 1 for j, k > 1,

i.e. A contains ones except along the first row and column, where we slightly abuse notation,

  S(z) := 0 · z_1 + c_H Σ_{j=2}^d z_j ∆θ_{j−1} = c_H Σ_{l=1}^{N_L} y_l ∆θ_l.

Remark. In the presentation of the Kolmogorov backward equation, we assume necessary regularity conditions on ϕ, i.e. smoothness and boundedness, as a matter of convenience. From the context of the problem this is not a strong assumption, see Remark 2.2.

3.3. Local weak error representation.
Throughout the remainder of this work we consider the case ψ(s, W_s^H) = W_s^H in (2.1). Returning to (3.7), we obtain a representation for the weak error in terms of local errors. First, we write the discretization error as a telescoping sequence in the value function (3.14),

(3.17)  Err(T, ∆t) = E[ϕ(X̂_T) − ϕ(X_{t_n})] = −( E u(Z̄_{t_n}, T) − E u(Z̄_{t_0}, 0) )
              = −Σ_{i=0}^{n−1} E[ u(Z̄_{t_{i+1}}, t_{i+1}) − u(Z̄_{t_i}, t_i) ],

using that

  E ϕ(X_{t_n}) = E[ E[ϕ(Z_T) | Z_T = Z̄_{t_n}] ] = E u(Z̄_{t_n}, T)

and

  E ϕ(X̂_T) = E[ E[ϕ(Z_T) | Z_0 = Z̄_{t_0}] ] = E u(Z̄_{t_0}, 0).

We then represent each difference appearing in (3.17) as a stochastic differential over a small time increment. Over the interval [t_i, t_{i+1}), we have that

  E[ u(Z̄_{t_{i+1}}, t_{i+1}) − u(Z̄_{t_i}, t_i) ] = E ∫_{t_i}^{t_{i+1}} du(Z̄_s, s)
    = E ∫_{t_i}^{t_{i+1}} ( ∂_t u(Z̄_s, s) − b_j(Z̄_s) D_j u(Z̄_s, s) + (1/2) A_{jk}(Z̄_{t_i}) D_{jk} u(Z̄_s, s) ) ds,

using Itô's formula applied to (3.13), where repeated indices indicate summation over 1, ..., d. Subtracting off the Kolmogorov backward equation (3.15) evaluated at (Z̄_s, s) then yields

(3.18)  E[ u(Z̄_{t_{i+1}}, t_{i+1}) − u(Z̄_{t_i}, t_i) ] = (1/2) E ∫_{t_i}^{t_{i+1}} ( A_{jk}(Z̄_{t_i}) − A_{jk}(Z̄_s) ) D_{jk} u(Z̄_s, s) ds.
We note that the non-zero terms correspond to differences along the first row and column of A in (3.16), and thus (3.18) simplifies to the following expression for the local weak error in the value function,

(3.19)  E[ u(Z̄_{t_{i+1}}, t_{i+1}) − u(Z̄_{t_i}, t_i) ]
    = (1/2) E ∫_{t_i}^{t_{i+1}} ( A_{11}(Z̄_{t_i}) − A_{11}(Z̄_s) ) D_{11} u(Z̄_s, s) ds
      + E ∫_{t_i}^{t_{i+1}} Σ_{j=2}^d ( A_{1j}(Z̄_{t_i}) − A_{1j}(Z̄_s) ) D_{1j} u(Z̄_s, s) ds
    = −(1/2) E ∫_{t_i}^{t_{i+1}} S(Y_·)²_{t_i,s} D_{11} u(Z̄_s, s) ds
      − E ∫_{t_i}^{t_{i+1}} S(Y_·)_{t_i,s} Σ_{j=2}^d D_{1j} u(Z̄_s, s) ds,

where we express the differences in components of A in terms of the increments of the (powers of the) approximate fBm,

  S(Y_·)^k_{t_i,s} := S(Y_s)^k − S(Y_{t_i})^k,  k = 1, 2.

Observe that the derivatives of the value function appearing in (3.19) can be further resolved by directly computing fluxes. Here we consider the simplification ψ(s, Ŵ_s^H) = Ŵ_s^H in (2.1) (i.e. "linear ψ").

Lemma 3.7 (Fluxes). Let ψ(s, W_s^H) = W_s^H. For the value function u(z, t) defined in (3.14),

(3.20)  D_β u(z, t) = c_H^(|β|−β_1) E[ ϕ^(|β|)(X̂_T) Π_{j=1}^{N_L} (∆θ_j M_{t,T}^j)^(β_{j+1}) | Z_t = z ],

for a multi-index β = (β_1, ..., β_d), where

  M_{t,T}^j := ∫_t^T e^(−(r−t)θ_j^p) dW_r,  j = 1, ..., N_L.
Proof.
Let Z t, z s be the Markov process started at Z t = z with components Z t, z s = ( b X t,xs , Y t, y s ) = ( b X t,xs , Y t,y s , . . . , Y t,y NL s ) ;here we drop the index Y t,y l = Y l ; t,y l when the index is clear from the initial condition. We recall that Y t,y l s started at the value y l at time t is given by, Y t,y l s = e − ( s − t ) θ pl y l + Z st e − ( s − r ) θ pl d W r , t < s , (3.21)and, similarly, that b X t,xs is given by, b X t,xs = x + Z st S ( Y t, y r )d W r = x + Z st c H N L X l =1 Y t,y l r ∆ θ l d W r , t < s . Working directly with (3.21) and Section 3.3, derivatives of the underlying with respect to the initialconditions are given by ∂ b X t,xs ∂x = 1 , and, for l = 1 , . . . , N L , ∂ b X t,xs ∂y l = c H ∆ θ l Z st e − ( r − t ) θ pl d W r =: c H ∆ θ l M lt,s . The formula (3.20) follows as all higher derivatives of b X t,xs vanish. (cid:3) Returning to our expression for the local weak error in the value function (3.19) we apply (3.20) therebyobtaining, E [ u ( Z t i +1 , t i +1 ) − u ( Z t i , t i )] = − E Z t i +1 t i S ( Y · ) t i ,s E [ ϕ ( b X T ) | Z s = Z s ]d s − N L X l =1 E Z t i +1 t i S ( Y · ) t i ,s E [ ϕ ( b X T ) c H ∆ θ l M ls,T | Z s = Z s ]d s . (3.22)We introduce deterministic functions of z , ν ( z , s ) := E [ ϕ ( b X T ) | Z s = z ] , and e ν ( z , s ) := E h ϕ ( b X T ) (cid:0) c H P l M ls,T ∆ θ l (cid:1) | Z s = z i . Rewriting (3.22) with this new notation leads to an expression for the local weak error in the valuefunction, E [ u ( Z t i +1 , t i +1 ) − u ( Z t i , t i )] = − E Z t i +1 t i S ( Y · ) t i ,s ν ( Z s , s )d s − E Z t i +1 t i S ( Y · ) t i ,s e ν ( Z s , s )d s =: J + e J , (3.23)that will serve as our starting point for the convergence rates. Next we derive an expansion for (3.23) inpowers of ∆ t from which we obtain convergence rates.3.4. Taylor expansions and conditional independence.
Starting with the local weak error (3.23),we derive asymptotic expansions for J (and e J ) in in powers of ∆ t by Taylor expanding ν (and e ν ) at Z t i and applying a conditioning argument.We observe that Z = ( X, Y ) in (3.13), i.e. the interpolation (3.12) together with the exact dynamicsof the OU extended variables, is linear with respect to the increment over [ t i , s ],(3.24) Z s − Z t i = (cid:0) S ( Y t i ) W t i ,s , Y t i ,s , . . . , Y N L t i ,s (cid:1) where W t i ,s := W s − W t i and Y lt i ,s := Y ls − Y lt i , for s ∈ [ t i , t i +1 ) , are increments of the driving Brownian motion and extended variable OU processes, respectively. Usingthe linearization (3.24), the Taylor expansion of ν at Z t i is,(3.25) ν ( Z s , s ) = ν ( Z t i , s ) + X β ∈I κ D β ν ( Z t i , s ) · ( S ( Y t i ) W t i ,s ) β ( Y t i ,s ) ˆ β + R κ ( ν ) , for sums over multiindices in the set I κ = (cid:8) β = ( β , ˆ β ) = ( β , β , . . . , β d ) : 1 ≤ | β | ≤ κ − (cid:9) EAK ERROR RATES ROUGH BERGOMI 12 where ( Y t i ,s ) ˆ β = N L Y l =1 ( Y lt i ,s ) β l +1 = ( Y t i ,s ) β · · · ( Y N L t i ,s ) β d and the remainder is given in integral form by, R κ ( ν ) = 1 κ ! X | β | = κ ( S ( Y t i ) W t i ,s ) β ( Y t i ,s ) ˆ β Z D β ν ( ξ τ , s )d τ ,ξ τ := Z t i + τ (cid:0) S ( Y t i ) W t i ,s , Y t i ,s , . . . , Y N L t i ,s (cid:1) . (3.26)In (3.25), the terms ν and D β ν are deterministic functions of the F t i -measurable random variable Z t i and are therefore F t i -measurable. Analogous expressions hold for (3.25) and (3.26) with e ν in place of ν .Plugging the ν -expansion (3.25) into (3.23), we obtain an asymptotic expansion for J ,2 J = − E Z t i +1 t i ν ( Z t i , s ) E (cid:2) S ( Y · ) t i ,s | F t i (cid:3) d s − X β ∈I κ E Z t i +1 t i D β ν ( Z t i , s ) S ( Y t i ) β E (cid:2) S ( Y · ) t i ,s ( W t i ,s ) β ( Y t i ,s ) ˆ β | F t i (cid:3) d s − E Z t i +1 t i E (cid:2) S ( Y · ) t i ,s R κ ( ν ) | F t i (cid:3) d s , (3.27)by conditional independence. 
An expansion analogous to (3.27) holds for $\widetilde{J}$ with $\widetilde{\nu}$ in place of $\nu$. The only terms that depend on $\Delta t$ in (3.27) (and in the analogous expansion for $\widetilde{J}$) are the conditional expectations involving products of the increments of $S(Y_\cdot)$ and of $S(Y_\cdot)^2$ together with powers of $W_{t_i,s}$ and $Y_{t_i,s}$. The key to obtaining weak error rates will be to show, after isolating the order in $\Delta t$ using these expansions, that the expansion coefficients, which depend on the extended state space variables $Y$, are controlled with respect to summation in the parameter(s) $\theta$.

In the next section, we observe that for quadratic payoffs the expansions in $\nu$ and $\widetilde{\nu}$ truncate after the first term, since $\nu$ and $\widetilde{\nu}$ already depend on two derivatives of $\varphi$. In this special case, we derive weak rate one in Theorem 4.1. In Section 5, we prove that in general the weak rate is $H + 1/2$, as reported in Theorem 2.1, again using the asymptotic expansion approach.

4. Weak rate one for quadratic payoffs
Using the preceding machinery, we will now derive rates of convergence for the weak error via Taylor expansions in powers of $\Delta t$ such that all terms stay integrable in $\theta$. For quadratic payoff functions, we obtain that the weak error is $O(\Delta t)$, i.e. rate one, in Theorem 4.1 below, which is supported by numerical evidence; recall Figure 2a. The mechanism by which rate one is achieved can be observed in the expansions: the expansion coefficients depend on derivatives of the payoff function, and the higher-order terms that reduce the rate vanish when $\varphi$ is quadratic.

4.1. Asymptotic expansion approach to weak rate one.
Returning to the increment of the value function (3.23): if $\varphi \in \mathcal{P}_2$, that is, a quadratic polynomial, then the derivatives of $\nu$ and $\widetilde{\nu}$ (as defined in Section 3.3) vanish and only the first terms in the expansion (3.25) remain. Then
\[
J + \widetilde{J} = - \mathbb{E} \int_{t_i}^{t_{i+1}} \nu(Z_{t_i}, s)\, \mathbb{E}\big[ S(Y_\cdot)^2_{t_i,s} \mid \mathcal{F}_{t_i} \big]\, \mathrm{d}s - \mathbb{E} \int_{t_i}^{t_{i+1}} \widetilde{\nu}(Z_{t_i}, s)\, \mathbb{E}\big[ S(Y_\cdot)_{t_i,s} \mid \mathcal{F}_{t_i} \big]\, \mathrm{d}s =: J_0 + \widetilde{J}_0, \tag{4.1}
\]
and estimating $J_0$ and $\widetilde{J}_0$ yields the weak rate corresponding to quadratic payoff functions.

Theorem 4.1 (Weak rate, quadratic payoff). Let $\varphi \in \mathcal{P}_2$ and let $\psi(s, \widehat{W}^H_s) = \widehat{W}^H_s$. Then
\[
\mathrm{Err}(T, \Delta t, \varphi) := \mathbb{E}\big[ \varphi(\widehat{X}_T) - \varphi(X_{t_n}) \big] \lesssim \Delta t,
\]
i.e. the Euler method is weak rate one.

Proof. We estimate the terms $J_0$ and $\widetilde{J}_0$ in (4.1), beginning with $\widetilde{J}_0$. Working directly with the increments of the extended variables,
\[
\mathbb{E}\big[ S(Y_\cdot)_{t_i,s} \mid \mathcal{F}_{t_i} \big] = - c_H \sum_{l=1}^{N_L} \mathbb{E}\big[ Y^l_{t_i,s} \mid \mathcal{F}_{t_i} \big] \Delta\theta_l = 0,
\]
and thus we conclude
\[
\widetilde{J}_0 = - \mathbb{E} \int_{t_i}^{t_{i+1}} \widetilde{\nu}(Z_{t_i}, s)\, \mathbb{E}\big[ S(Y_\cdot)_{t_i,s} \mid \mathcal{F}_{t_i} \big]\, \mathrm{d}s = 0.
\]
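The covariance of the extended OU variables, which drives the estimates that follow, can be sanity-checked by simulation. The sketch below uses hypothetical parameter values ($\theta_l = 1$, $\theta_k = 3$, $t = 1$) and a fixed seed, and compares a common-noise Euler simulation of two OU factors against the closed form $\mathbb{E}[Y^l_t Y^k_t] = (1 - e^{-t(\theta_l + \theta_k)})/(\theta_l + \theta_k)$ that the Itô isometry gives when both factors are driven by the same Brownian motion.

```python
import math
import random

def ou_cov_exact(t, th_l, th_k):
    """E[Y^l_t Y^k_t] = (1 - exp(-t(th_l + th_k))) / (th_l + th_k),
    by the Ito isometry, since both OU factors share the driving W."""
    s = th_l + th_k
    return (1.0 - math.exp(-t * s)) / s

def ou_cov_mc(t, th_l, th_k, n_steps=200, n_paths=4000, seed=7):
    """Euler simulation of the two OU factors sharing one noise increment."""
    rng = random.Random(seed)
    dt = t / n_steps
    sq = math.sqrt(dt)
    acc = 0.0
    for _ in range(n_paths):
        yl = yk = 0.0
        for _ in range(n_steps):
            dw = sq * rng.gauss(0.0, 1.0)  # common increment of W
            yl += -th_l * yl * dt + dw
            yk += -th_k * yk * dt + dw
        acc += yl * yk
    return acc / n_paths
```

The agreement is only statistical (Euler bias plus Monte Carlo noise), hence the loose tolerance in the comparison.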
In the case of quadratic ϕ , we observe that k ν k ∞ = O (1) and deterministic. From the definition of Y s (e.g. see (3.21)), working again directly with the increments of the extended variables we have that S ( Y · ) t i ,s = − c H N L X k,l =1 ( Y kt i Y lt i − Y ks Y ls )∆ θ k ∆ θ l = − c H N L X k,l =1 n (1 − e − ( s − t i )( θ pk + θ pl ) ) Y kt i Y lt i − e − ( s − t i ) θ pk Y kt i Z st i e − ( s − r ) θ pl d W r − e − ( s − t i ) θ pl Y lt i Z st i e − ( s − r ) θ pk d W r − Z st i e − ( s − r ) θ pk d W r Z st i e − ( s − r ) θ pl d W r o ∆ θ k ∆ θ l . From this the key conditional expectation term reduces to E (cid:2) S ( Y · ) t i ,s | F t i (cid:3) = − c H N L X l,k =1 ∆ θ l ∆ θ k (1 − e − ( s − t i )( θ pl + θ pk ) ) (cid:0) Y lt i Y kt i − θ pl + θ pk (cid:1) . (4.2)Only the component 1 − e − ∆ t ( θ pl + θ pk ) will contribute to the estimate for the weak rate, provided that thesums over θ in (4.2) converge.Using (4.2) we find J = − c H Z t i +1 t i N L X l,k =1 ∆ θ l ∆ θ k E [ Y lt i Y kt i − θ pl + θ pk ](1 − e − ( s − t i )( θ pl + θ pk ) )d s = − c H N L X l,k =1 ∆ θ l ∆ θ k E (cid:2) Y lt i Y kt i − θ pl + θ pk (cid:3) Z ∆ t g ( s )d s . (4.3)The integrand, g ( s ) := 1 − e − s ( θ pl + θ pk ) , is a function of s ∈ [0 , ∆ t ) such that g (0) = 0 and the associated Lipschitz constant K is given by, K = ∂∂s g ( s ) (cid:12)(cid:12)(cid:12) s =0 = θ pl + θ pk . Then, since | g ( s ) − g (0) | ≤ Ks , we have | J | ≤ c H N L X l,k =1 ∆ θ l ∆ θ k (cid:12)(cid:12) E (cid:2) Y lt i Y kt i ( θ pl + θ pk ) − (cid:3)(cid:12)(cid:12) Z ∆ t s d s . Computing the covariance appearing above, E [ Y lt i Y kt i ] = Z t i e − ( t i − r )( θ pl + θ pk ) d r = 1 − e − t i ( θ pl + θ pk ) θ pl + θ pk , we see that(4.4) | J | ≤ c H (cid:16) N L X l =1 ∆ θ l e − t i θ pl (cid:17) ∆ t ≤ c H p Γ (cid:0) p (cid:1) t − /pi ∆ t (cid:46) O (∆ t ) , since N L X l =1 ∆ θ l e − t i θ pl ≤ Z L e − t i θ pl d θ l ≤ Z ∞ e − t i θ pl d θ l = 1 p Γ (cid:0) p (cid:1) t − /pi . 
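The final bound above rests on evaluating the Laplace-type integral exactly: substituting $u = t x^p$ gives $\int_0^\infty e^{-t x^p}\, \mathrm{d}x = \Gamma(1/p)\, t^{-1/p}/p$. A small sketch (the values of $p$ and $t$ are illustrative) compares this closed form against direct quadrature.

```python
import math

def tail_integral_exact(t, p):
    """Closed form: \\int_0^inf exp(-t x^p) dx = Gamma(1/p) t^{-1/p} / p."""
    return math.gamma(1.0 / p) * t ** (-1.0 / p) / p

def tail_integral_quad(t, p, x_max=60.0, n=60000):
    """Trapezoidal quadrature of \\int_0^{x_max} exp(-t x^p) dx;
    x_max is chosen so the truncated tail is negligible."""
    h = x_max / n
    total = 0.5 * (1.0 + math.exp(-t * x_max ** p))  # endpoints, f(0) = 1
    for i in range(1, n):
        x = i * h
        total += math.exp(-t * x ** p)
    return total * h
```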
Importantly, $t^{-1/p}$ is integrable on $[0, T]$ (since $p > 1$), so the factor $t_i^{-1/p}$ appearing in (4.4) remains uniformly bounded after summing over the $t_i$ in (3.17).
Turning now to the telescoping representation of the weak error (3.17) and using (4.4), we obtain the desired rate:
\[
|\mathrm{Err}(T, \Delta t, \varphi)|
= \Big| \sum_{i=0}^{n-1} \mathbb{E}\big[ u(Z_{t_{i+1}}, t_{i+1}) - u(Z_{t_i}, t_i) \big] \Big|
= \Big| \sum_{i=0}^{n-1} \mathbb{E} \int_{t_i}^{t_{i+1}} \nu(Z_{t_i}, s)\, \mathbb{E}\big[ S(Y_\cdot)^2_{t_i,s} \mid \mathcal{F}_{t_i} \big]\, \mathrm{d}s \Big|
\leq \frac{c_H^2}{p}\, \Gamma\Big(\frac{1}{p}\Big) \sum_{i=0}^{n-1} \Delta t\, t_i^{-1/p}
\lesssim T^{(p-1)/p}\, \Delta t \lesssim T^{2H}\, \Delta t. \qquad \square
\]

Although it is initially surprising that the rate depends on the payoff, the Taylor expansion for the weak error provides insight into this behavior. The expansion coefficients in (3.25) depend on increasingly higher-order derivatives of the payoff function $\varphi$ through derivatives of $\nu$ and $\widetilde{\nu}$ (defined in Section 3.3). Unlike for quadratic $\varphi$, where terms in (3.25) vanish, the higher-order derivative terms persist for general payoff functions. Notably, it is only the next higher term (compared to those involved in the expansion for quadratic $\varphi$) that reduces the overall rate for general $\varphi$. In this context, one might hope to obtain an effective rate that is independent of $H$ for payoff functions well approximated by quadratic polynomials.
Before moving on to the proof of the main result in Section 5, we first present a simpler proof of weak rate one for quadratic payoff functions that is also applicable to nonlinear $\psi$. This proof was communicated to us by A. Neuenkirch.

Lemma 4.2.
Suppose that $\psi \in C^1$ with $\psi$, $\partial_t \psi$, $\partial_x \psi$ of polynomial growth, and let $\varphi(x) = x^2$. Then
\[
\big| \mathbb{E}\big[ \varphi(X_T) - \varphi(\widetilde{X}_T) \big] \big| = O(\Delta t), \qquad \text{where } \widetilde{X}_T := \int_0^T \psi(\kappa_s, W^H_{\kappa_s})\, \mathrm{d}W_s,
\]
for $\kappa_s = t_i$ if $s \in [t_i, t_{i+1})$, for each $i = 0, \ldots, n-1$.

Proof. By the Itô isometry, we have
\[
\mathbb{E}[\varphi(X_T)] = \int_0^T \mathbb{E}\big[ \psi(s, s^H V)^2 \big]\, \mathrm{d}s = \int_0^T g(s)\, \mathrm{d}s, \qquad
\mathbb{E}\big[ \varphi(\widetilde{X}_T) \big] = \int_0^T \mathbb{E}\big[ \psi(\kappa_s, \kappa_s^H V)^2 \big]\, \mathrm{d}s = \int_0^T g(\kappa_s)\, \mathrm{d}s,
\]
where $g(s) := \mathbb{E}\big[ \psi(s, s^H V)^2 \big]$, $V \sim \mathcal{N}(0, 1)$. Note that $g$ is differentiable with integrable derivative. Indeed,
\[
g'(t) = 2\, \mathbb{E}\big[ \psi(t, t^H V)\, \partial_t \psi(t, t^H V) \big] + 2 H t^{H-1}\, \mathbb{E}\big[ \psi(t, t^H V)\, \partial_x \psi(t, t^H V)\, V \big],
\]
which is of order $t^{H-1}$ and, hence, integrable. Setting $\zeta_t := \min\{ t_i \mid t_i \geq t \}$, we conclude with
\[
\Big| \int_0^T g(s)\, \mathrm{d}s - \int_0^T g(\kappa_s)\, \mathrm{d}s \Big|
\leq \int_0^T \Big| g(\kappa_s) + \int_{\kappa_s}^s g'(t)\, \mathrm{d}t - g(\kappa_s) \Big|\, \mathrm{d}s
\leq \int_0^T \int_{\kappa_s}^s |g'(t)|\, \mathrm{d}t\, \mathrm{d}s
= \int_0^T \Big( \int_t^{\zeta_t} \mathrm{d}s \Big) |g'(t)|\, \mathrm{d}t
\leq \max_{i=0,\ldots,n-1} |t_{i+1} - t_i|\, \| g' \|_{L^1([0,T])}. \qquad \square
\]
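The mechanism of Lemma 4.2 can be seen in a scalar computation: for $\psi(s, x) = x$ and $\varphi(x) = x^2$, the Itô isometry reduces both expectations to deterministic integrals, and the weak error is exactly the left-endpoint quadrature error for $g(s) = s^{2H}$. A sketch (with $H = 0.1$ as an illustrative value) confirms that the observed rate is close to one.

```python
def weak_error(n, T=1.0, H=0.1):
    """|E phi(X_T) - E phi(X~_T)| for psi(s, x) = x and phi(x) = x^2:
    by the Ito isometry this equals the left-endpoint Riemann-sum error
    |int_0^T s^{2H} ds - sum_i t_i^{2H} dt|, which is O(dt) since the
    derivative of g(s) = s^{2H} is integrable on [0, T]."""
    dt = T / n
    exact = T ** (2 * H + 1) / (2 * H + 1)
    frozen = sum((i * dt) ** (2 * H) for i in range(n)) * dt
    return abs(exact - frozen)
```

Halving $\Delta t$ should roughly halve the error; the ratio of successive errors is near two, i.e. rate one.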
For simplicity, we assumed that $\psi(s, \widehat{W}^H_s) = \widehat{W}^H_s$ in Theorem 4.1. In contrast, Lemma 4.2 is applicable to any $\psi$, including nonlinear functions. However, it is not clear how to extend the approach of the simple proof of Lemma 4.2 to more general payoff functions $\varphi$. In the next section we use the asymptotic expansions to prove the main result, Theorem 2.1, obtaining the weak rate $H + 1/2$ for $\psi(s, \widehat{W}^H_s) = \widehat{W}^H_s$.

5. Proof of Theorem 2.1
Our proof of Theorem 2.1 follows the expansion approach used to obtain rate one for quadratic payoffsin Section 4.1. We derive asymptotic expansions in powers of ∆ t for increments of the value functionin (3.23), i.e., for J and e J . For the case of general payoff functions, this requires two rounds of Taylorexpansions. The first round expands ν and e ν at Z t i using (3.25), as was done in Section 4.1. Then afterapplying a conditioning argument, we explicitly deal with correlations by expressing our expansions interms of the extended variables and making a second round of Taylor expansions with respect to selectcomponents of Y t i . Again, a key point in the proof is that all the terms in the expansions are controlledwith respect to θ .Returning to the local weak error (3.19), we use the ν -expansion (3.25) to find that J , the termcorresponding to the increment S ( Y · ) t i ,s , up to order κ = 3 is given by,2 J = − E Z t i +1 t i ν ( Z t i , s ) E (cid:2) S ( Y · ) t i ,s | F t i (cid:3) d s − E Z t i +1 t i D ν ( Z t i , s ) S ( Y t i ) E (cid:2) S ( Y · ) t i ,s W t i ,s | F t i (cid:3) d s − d X j =2 E Z t i +1 t i D j ν ( Z t i , s ) E (cid:2) S ( Y · ) t i ,s Y jt i ,s | F t i (cid:3) d s − E Z t i +1 t i D ν ( Z t i , s ) S ( Y t i ) E (cid:2) S ( Y · ) t i ,s ( W t i ,s ) | F t i (cid:3) d s − d X j =2 E Z t i +1 t i D j ν ( Z t i , s ) S ( Y t i ) E (cid:2) S ( Y · ) t i ,s W t i ,s Y j − t i ,s | F t i (cid:3) d s − d X j,k =2 E Z t i +1 t i D jk ν ( Z t i , s ) (cid:2) S ( Y · ) t i ,s Y j − t i ,s Y k − t i ,s | F t i (cid:3) d s − E Z t i +1 t i E (cid:2) S ( Y · ) t i ,s R ( ν ) | F t i (cid:3) d s =: J + J , + J , + J , + J , + J , − E Z t i +1 t i E (cid:2) S ( Y · ) t i ,s R ( ν ) | F t i (cid:3) d s . 
(5.1)
Here $J_0$ is as before, and the $J_{k,i}$ involve $k$th-order derivatives of $\nu$ that do not necessarily vanish for general payoff functions $\varphi$ (the second index $i \leq k$ is the number of derivatives taken in extended-variable directions, and hence the number of sums over extended-variable indices). Analogously, for $\widetilde{J}$, the term corresponding to the increment $S(Y_\cdot)_{t_i,s}$, we have
\[
\widetilde{J} = \widetilde{J}_0 + \widetilde{J}_{1,0} + \widetilde{J}_{1,1} + \widetilde{J}_{2,0} + \widetilde{J}_{2,1} + \widetilde{J}_{2,2} - \mathbb{E} \int_{t_i}^{t_{i+1}} \mathbb{E}\big[ S(Y_\cdot)_{t_i,s}\, R_3(\widetilde{\nu}) \mid \mathcal{F}_{t_i} \big]\, \mathrm{d}s, \tag{5.2}
\]
with $\widetilde{\nu}$ in place of $\nu$ and the increment $S(Y_\cdot)_{t_i,s}$ in place of $S(Y_\cdot)^2_{t_i,s}$ compared to (5.1). In the sequel, we will simply write
\[
J_k = \sum_{i=0}^{k} J_{k,i} \qquad \text{and} \qquad \widetilde{J}_k = \sum_{i=0}^{k} \widetilde{J}_{k,i}, \tag{5.3}
\]
for the sum of all terms involving $k$th-order derivatives of $\nu$ and $\widetilde{\nu}$, noting that the sums over $i$ will be of different sizes for each $k$. Curiously, we will observe that only the terms $J_{1,\cdot}$ and $\widetilde{J}_{1,\cdot}$, corresponding to the first-order derivatives of $\nu$ and $\widetilde{\nu}$, reduce the overall rate. In what follows, we first take the fBm view and assume that $\| D\nu \|_\infty$ is deterministic and $O(1)$, and similarly for $\widetilde{\nu}$. Since the terms corresponding to $\nu$ and $\widetilde{\nu}$ contribute only to the constant and not to the rate, this assumption allows us to easily deduce the order in $\Delta t$. Expressing everything in terms of the extended variables, as in (4.3), it is then possible to carry out a second round of Taylor expansions to demonstrate that the constants are controlled.

5.1.
Estimate: J is again O (∆ t ) . In Section 4.1, ν is deterministic and O (1) since it depends on twoderivatives of the quadratic payoff ϕ . Continuing as in (4.3), our starting point for the full estimate for J is J = − c H E Z t i +1 t i ν ( X t i , Y t i ; s ) N L X k,l =1 ∆ θ k ∆ θ l E (cid:2) Y kt i Y lt i − θ pk + θ pl (cid:3) g ( s )d s , (5.4)where we emphasize the dependence of ν on Y t i . We define an auxiliary function,(5.5) f kls ( Y kt i , Y lt i ) := E (cid:2) ν ( X t i , Y t i ; s ) | Y kt i , Y lt i (cid:3) , and expand f kls in a Taylor series at zero,(5.6) f kls ( Y kt i , Y lt i ) = f kls ( ) + X α ∈J α max | α | ! ∂ αkl f kls ( ) (cid:0) Y kt i Y lt i (cid:1) α + R α max ( Y kt i , Y lt i ) , for a set of multiindices J α max = { α = ( α , α ) : 1 ≤ | α | < α max } . We use the notation, ∂ αl ...l j = ∂ α l . . . ∂ α j l j , α = ( α , . . . , α j ) , (as opposed to D ) to emphasize that the derivatives are taken with respect to y l directions only. Theremainder closing the Taylor expansion in the auxiliary function f is given by,(5.7) R α max ( Y kt i , Y lt i ) := X | α | = α max α max ! ∂ αkl f kls ( ξ k , ξ l )( Y kt i Y lt i ) α , for an intermediate point ( ξ k , ξ l ).Plugging this second round of Taylor expansions (5.6) into (5.4), yields J = − c H N L X k,l =1 E (cid:2) Y kt i Y lt i − θ pk + θ pl (cid:3) Z t i +1 t i f kls ( ) g ( s )d s ∆ θ k ∆ θ l + X α ∈J | α | ! N L X k,l =1 E (cid:2) ( Y kt i ) α ( Y lt i ) α (cid:0) Y kt i Y lt i − θ pk + θ pl (cid:1)(cid:3) Z t i +1 t i ∂ αkl f kls ( ) g ( s )d s ∆ θ k ∆ θ l + N L X k,l =1 E (cid:2) R α max ( Y kt i , Y lt i ) (cid:0) Y kt i Y lt i − θ pk + θ pl (cid:1)(cid:3) ∆ θ k ∆ θ l Z t i +1 t i g ( s )d s ! . 
(5.8)For the remainder term in (5.8), we use the Hölder regularity of the fBm to estimate the derivative ofthe auxiliary function evaluated at an intermediate point, N L X k,l =1 E (cid:2) | R α max ( Y kt i , Y lt i ) || Y kt i || Y lt i | (cid:3) ∼ E (cid:2) | W Ht i ,t i +1 | α max (cid:3) (cid:46) ∆ t Hα max E | W Ht i | α max (cid:46) ∆ t Hα max . From this last expression, the number of terms in the auxiliary expansion, α max , is finite and the contri-bution from the remainder can be made to be order one by choosing α max := d H e . For the remaining terms in (5.8), the f kl can be estimated by the payoff ϕ (see Lemma B.1 in Appendix B)and therefore we write for convenience that f kl and all derivatives are bounded by a constant Q ,(5.9) f kl ∈ C α max b and k D α f kl k ∞ ≤ Q a.s. . Then recalling the Lipschitz argument, used in the proof of weak rate one for quadratic payoffs inSections 4.1 and 4.1, together with (5.9) we find that the first term in (5.8) is estimated by, c H Q ∆ t N L X k,l =1 | E [ Y kt i Y lt i ( θ pk + θ pl ) − | ∆ θ k ∆ θ l ≤ c H p Q Γ (cid:0) p ) t − /pi ∆ t , for a constant proportional to t − /pi as in (4.4). For the higher order terms in α in (5.8), we use Isserlis’theorem to expand the expectations into products of covariances of the extended variables. For example,for | α | = 1, i.e. when α = 1 , α = 0 or α = 0 , α = 0, then the term is zero by Isserlis’. For | α | = 2, weapply Isserlis’ theorem to obtain three terms, E [( Y kt i ) ] (cid:0) E [ Y kt i Y lt i ( θ pk + θ pl ) − (cid:1) ( α = 2 , α = 0) , E [( Y lt i ) ] (cid:0) E [ Y kt i Y lt i ( θ pk + θ pl ) − (cid:1) ( α = 0 , α = 2) , E [ Y kt i Y lt i ] (cid:0) E [ Y kt i Y lt i ( θ pk + θ pl ) − (cid:1) + 2 E [( Y kt i ) ] E [( Y lt i ) ] ( α = 1 , α = 1) , EAK ERROR RATES ROUGH BERGOMI 17 where we note there is no cancellation and ( θ pk + θ pl ) comes from the Lipschitz constant. 
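The repeated reduction of Gaussian moments to products of covariances uses Isserlis' (Wick's) theorem. A minimal recursive implementation (pair the first index with each remaining index, then recurse on the rest) can be checked against hand-computed moments of a hypothetical two-dimensional Gaussian.

```python
def isserlis(indices, cov):
    """E[X_{i_1} ... X_{i_k}] for a zero-mean Gaussian vector with
    covariance matrix cov, via Isserlis' theorem: sum over all pair
    partitions of the product of pairwise covariances. Odd-order
    moments vanish."""
    if len(indices) % 2 == 1:
        return 0.0
    if not indices:
        return 1.0
    first, rest = indices[0], list(indices[1:])
    total = 0.0
    for j in range(len(rest)):
        sub = rest[:j] + rest[j + 1:]
        total += cov[first][rest[j]] * isserlis(sub, cov)
    return total
```

For example, with $\operatorname{Var}(X) = 2$, $\operatorname{Var}(Y) = 1$, $\operatorname{Cov}(X, Y) = 1/2$, the theorem gives $\mathbb{E}[X^4] = 3 \cdot 2^2 = 12$ and $\mathbb{E}[X^2 Y^2] = 2 \cdot 1 + 2 \cdot (1/2)^2 = 5/2$, matching the three-pairing case enumerated above.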
Considering onlythe first term, E [( Y kt i ) ] (cid:0) E [ Y kt i Y lt i ( θ pk + θ pl ) − (cid:1) = 1 − e − θ pk t i θ pk (2 − e − ( θ pk + θ pl ) t i ) , we are essentially interested in the behavior of the integral Z Z − e − tx p x p (1 − e − t ( x p + y p ) )d y d x = Z − e − tx p x p (cid:16) p e − tx p t − /p Γ( p , ty p ) + y + C y (cid:17) d x = ( y + C y )( e − tx p − x − p p − − ( y + C y ) t − /p Γ( p , tx p ) p − xt − /p (cid:0) Γ( p , tx p ) − ( p − /p t − /p Γ( p , tx p ) (cid:1) ( p − p Γ( p , y p )+ x − p t − /p e − tx p ( e tx p − p − p Γ( p , y p ) + C x , where C y , C x are constants. The integral goes to C x as x, y → ∞ and to a constant C x − C y t − /p Γ( p ) / ( p −
1) as x, y →
0. We conclude that the definite integral is converging to a constant depending on H and t i and therefore the sum over θ k and θ l converges. The higher order terms in α are also summable since theapplication of Isserlis’ theorem results in expressions that are either zero or of the form above. Altogether,our full estimate for J is then(5.10) | J | (cid:46) C ( H, Q, α max ) t − /pi ∆ t (cid:46) O (∆ t ) , yielding the same rate as the estimate in (4.4), albeit with a different constant C ( H, Q, α max ).In the next section, we consider estimates for the terms J , · and e J , · in (5.1) and (5.2). Inspired by theobservation that we obtain the same rate in (5.10) as in (4.4) with only the constant changing, we firstwork directly with the fBm view to easily ascertain the rate. However, to obtain the full estimate, weproceed as in this section by utilizing the structure of the affine approximation to make a second roundof Taylor expansions and subsequently observing that the coefficients are controlled.5.2. Estimate: J and e J are O (∆ t H +3 / ) . For the term J , in (5.1), J , = − E Z t i +1 t i D ν ( X t i , Y t i ; s ) S ( Y t i ) E (cid:2) S ( Y · ) t i ,s W t i ,s | F t i (cid:3) , we first determine the order in ∆ t which arises from the conditional expectation. Working directly withthe increment, S ( Y · ) t i ,s = c W Ht i ,s ( c W Hs + c W Ht i ) = c W Ht i ,s ( c W Ht i ,s + 2 c W Ht i ) , we take the fBm view and express c W Ht i ,s ∼ W Ht i ,s = Z t i K ( s − r ) − K ( t i − r )d W r | {z } := V Hti ( s ) + Z st i K ( s − r )d W r , using the power law kernel, i.e. K ( r ) = √ Hr H − / , (inverse discrete Laplace transform). 
We observe E [( W Ht i ,s ) W t i ,s | F t i ] = 2 V Ht i ( s ) Z st i K ( s − r )d r , E [ W Ht i ,s W t i ,s | F t i ] = Z st i K ( s − r )d r , and (cid:12)(cid:12) E [ W Ht i V Ht i ( s )] (cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12)Z t i K ( t i − r ) K ( s − r ) − K ( t i − r )d r (cid:12)(cid:12)(cid:12)(cid:12) ≤ (cid:12)(cid:12)(cid:12)(cid:12)Z t i K ( t i − r )d r (cid:12)(cid:12)(cid:12)(cid:12) ∼ t Hi ≤ . EAK ERROR RATES ROUGH BERGOMI 18
Then, J , = − E Z t i +1 t i D ν ( X t i , Y t i ; s ) c W Ht i E (cid:2)c W Ht i ,s ( c W Ht i ,s + 2 c W Ht i ) W t i ,s | F t i (cid:3) d s = − E Z t i +1 t i D ν ( X t i , Y t i ; s ) W Ht i (cid:0) E [( W Ht i ,s ) W t i ,s | F t i ] + 2 W Ht i E [ W Ht i ,s W t i ,s | F t i ] (cid:1) d s = − E Z t i +1 t i D ν ( X t i , Y t i ; s ) W Ht i ( V Ht i ( s ) + W Ht i ) Z st i K ( s − r )d r | {z } ∝ ( s − t i ) H +1 / d s , (5.11)which suggests J , = O (∆ t H +3 / ) provided the coefficient is controlled. The order of J , in ∆ t isasymptotically exact in (5.11), i.e. there is only estimation of the constant.In general, D ν is random and depends on Y t i . This changes the coefficient in the estimate but not theorder in ∆ t , as was observed by comparing (5.10) to (4.4). Following Section 5.1, we proceed to estimatethe coefficient by making a second round of Taylor expansions by introducing an auxiliary function f . Inthis spirit, we start directly from (5.11), noting that V Ht i ( s ) ∼ b V Ht i ( s ) := c H N L X l =1 Y lt i ( e − θ pl ( s − t i ) − θ l , since Z t i K ( s − r )d W r = Z t i √ H ( s − r ) H − / d W r = Z t i √ H Γ( − H ) Z ∞ θ − ( H +1 / e − θ ( s − r ) d θ d W r = e c H Z ∞ θ − ( H +1 / e − θ ( s − t i ) Z t i e − θ ( t i − r ) d W r d θ = e c H Z ∞ θ − ( H +1 / e − θ ( s − t i ) e Y t i d θ ≈ c H N L X l =1 Y lt i e − θ pl ( s − t i ) ∆ θ l . Rewriting J , directly in terms of the extended variables, we obtain J , = − E Z t i +1 t i D ν ( X t i , Y t i ; s ) c W Ht i ( b V Ht i + c W Ht i ) g ( s )d s = − c H N L X k,l =1 Z t i +1 t i E (cid:2) D ν ( X t i , Y t i ; s ) Y kt i Y lt i (cid:3) e − θ pl ( s − t i ) g ( s )d s ∆ θ k ∆ θ l , where g ( s ) := Z st i K ( s − r )d r ∝ ( s − t i ) H +1 / . 
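The chain of identities above is the sum-of-exponentials (Markovian) representation of the singular kernel. The following sketch approximates $K(r) = \sqrt{2H}\, r^{H-1/2}$ by a finite $\theta$-quadrature of its Laplace-transform representation; the geometric $\theta$-grid and truncation bounds are illustrative choices, not the quadrature analyzed in the paper.

```python
import math

def K(r, H):
    """Power-law kernel of the Riemann-Liouville fBm."""
    return math.sqrt(2.0 * H) * r ** (H - 0.5)

def K_markov(r, H, n=4000, theta_min=1e-8, theta_max=1e4):
    """Approximate K(r) by a weighted sum of exponentials, discretizing
    r^{H-1/2} = (1/Gamma(1/2-H)) \\int_0^inf theta^{-(H+1/2)} e^{-theta r} dtheta
    by the trapezoidal rule on a geometric theta-grid (log substitution)."""
    c = math.sqrt(2.0 * H) / math.gamma(0.5 - H)
    du = math.log(theta_max / theta_min) / n
    total = 0.0
    for i in range(n + 1):
        theta = theta_min * math.exp(i * du)
        w = 0.5 if i in (0, n) else 1.0
        # substitution u = log(theta): theta^{-(H+1/2)} dtheta = theta^{1/2-H} du
        total += w * theta ** (0.5 - H) * math.exp(-theta * r)
    return c * total * du
```

With a few thousand nodes, the relative error stays below one percent over several decades of $r$, illustrating why the extended-variable approximation can be controlled uniformly in the grid.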
Recalling that ν ( Z t i , s ) = ν ( X t i , Y t i ; s ) is a deterministic function of Z t i , we further write E [ D ν ( X t i , Y t i ; s ) Y kt i Y lt i ] = E (cid:2) E [ D ν ( X t i , Y t i ; s ) | Y kt i , Y lt i ] Y kt i , Y lt i (cid:3) , where we define a new auxiliary function (here ϕ is already higher order compared to (5.5)) f kls ( Y kt i , Y lt i ) := E (cid:2) D ν ( X t i , Y t i ; s ) | Y kt i , Y lt i (cid:3) . We Taylor expand f kl at zero, yielding J , = − c H N L X k,l =1 E [ Y kt i Y lt i ] Z t i +1 t i f kls ( ) e − θ pl ( s − t i ) g ( s )d s ∆ θ k ∆ θ l + X α ∈J α max | α | ! N L X k,l =1 E (cid:2) ( Y kt i ) α +1 ( Y lt i ) α +1 (cid:3) Z t i +1 t i ∂ αkl f kls ( ) e − θ pl ( s − t i ) g ( s )d s ∆ θ k ∆ θ l + N L X k,l =1 E (cid:2) R α max ( Y kt i , Y lt i ) Y kt i Y lt i (cid:3) Z t i +1 t i e − θ pl ( s − t i ) g ( s )d s ∆ θ k ∆ θ l ! EAK ERROR RATES ROUGH BERGOMI 19 where J α max = { α = ( α , α ) : 1 ≤ | α | < α max } and the remainder R α max is given in the Lagrange formas in (5.7). Since f kl is bounded by ϕ (see Lemma B.1 in Appendix B), we write the bound in terms of Q , as in (5.9), to obtain | J , | (cid:46) C ( H, Q, α max ) t − /pi ∆ t H +3 / (cid:46) O (∆ t H +3 / ) , where we use Isserlis’ theorem and the explicit form of the covariances, e.g., N L X k,l =1 E [ Y kt i Y lt i ]∆ θ k ∆ θ l = N L X k,l =1 ∆ θ k ∆ θ l θ pk + θ pl (1 − e − ( θ pk + θ pl ) t i ) ≤ π Z ∞ θ − p (1 − e − θ p t i )d θ ≤ π p − p ) t − /pi , to observe that the sums over θ k and θ l converge independently of L, N L .The term J , includes increments of OU processes arising from the extended variables. 
Taking thefBm view using the power law kernel, J , = − N L X j =1 E Z t i +1 t i D j ν ( X t i , Y t i ; s ) E (cid:2)c W Ht i ,s ( c W Ht i ,s + 2 c W Ht i ) Y jt i ,s | F t i (cid:3) d s = − N L X j =1 E Z t i +1 t i D j ν ( X t i , Y t i ; s ) (cid:0) ( V Ht i ( s )) + 2 W Ht i V Ht i ( s ) (cid:1) Y jt i ( e − θ pj ( s − t i ) − s (5.12a) − N L X j =1 E Z t i +1 t i D j ν ( X t i , Y t i ; s )(2 V Ht i ( s ) + 2 W Ht i ) Z st i e − θ pj ( s − r ) K ( s − r )d r d s (5.12b) − N L X j =1 E Z t i +1 t i D j ν ( X t i , Y t i ; s ) Y jt i ( e − θ pj ( s − t i ) − Z st i K ( s − r )d r d s , (5.12c)since E [( W Ht i ,s ) Y jt i ,s | F t i ] = Y jt i ( e − θ pj ( s − t i ) − Z st i K ( s − r )d r + 2 V Ht i ( s ) Z st i K ( s − r ) e − θ pj ( s − r ) d r +( V Ht i ( s )) Y jt i ( e − θ pj ( s − t i ) − E [ W Ht i ,s Y jt i ,s | F t i ] = Z st i e − θ pj ( s − r ) K ( s − r )d r + V Ht i ( s ) Y jt i ( e − θ pj ( s − t i ) − . We examine the contributions to the rate in ∆ t from each of the terms (5.12a) to (5.12c); after integratingover s we find that (5.12a) yields O (∆ t ), (5.12b) yields O (∆ t H +3 / ), and (5.12c) yields O (∆ t H +2 ). Since(5.12a) and (5.12c) are higher order in ∆ t we examine only (5.12b), where we note Z st i e θ pj ( s − t i ) K ( s − r )d r = e − θ pj s √ H Z st i e θ pj r ( s − r ) H − / d r = √ Hθ − ( H +1 / pj h Γ (cid:0) H + , ( s − r ) θ pj (cid:1)i r = sr = t i = √ Hθ − ( H +1 / pj h e s − r ( s − r ) H +1 / Γ (cid:0) H + 1 / , θ pj (cid:1)i r = sr = t i = −√ Hθ − ( H +1 / pj Γ (cid:0) H + , θ pj (cid:1) e s − t i ( s − t i ) H +1 / (5.13)by a change of variable in the argument of the incomplete gamma function where q := ( H + 1 / p = 2 H + 11 − H > , H ∈ (0 , / . 
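The incomplete-gamma identity behind (5.13) can be verified directly: substituting $v = \theta u$ gives $\int_0^{\Delta} e^{-\theta u} K(u)\, \mathrm{d}u = \sqrt{2H}\, \theta^{-(H+1/2)}\, \gamma(H + 1/2, \theta \Delta)$, where $\gamma$ is the lower incomplete gamma function. In the sketch below (illustrative values of $H$, $\theta$, and $\Delta$), this closed form is compared against quadrature of the left-hand side after a substitution that removes the $u^{H-1/2}$ singularity at $u = 0$.

```python
import math

def gamma_lower(a, x, tol=1e-13):
    """Lower incomplete gamma via the standard series
    gamma(a, x) = x^a e^{-x} sum_n x^n / (a (a+1) ... (a+n))."""
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > tol * abs(total):
        n += 1
        term *= x / (a + n)
        total += term
    return x ** a * math.exp(-x) * total

def closed_form(H, theta, delta):
    """sqrt(2H) theta^{-(H+1/2)} gamma(H+1/2, theta*delta)."""
    a = H + 0.5
    return math.sqrt(2.0 * H) * theta ** (-a) * gamma_lower(a, theta * delta)

def quadrature(H, theta, delta, n=20000):
    """Direct quadrature of \\int_0^delta e^{-theta u} K(u) du;
    the substitution u = w^{1/(H+1/2)} makes the integrand q*e^{-theta w^q}
    with q = 1/(H+1/2), removing the singularity at u = 0."""
    q = 1.0 / (H + 0.5)
    b = delta ** (H + 0.5)
    h = b / n
    total = 0.5 * (1.0 + math.exp(-theta * delta))  # endpoint terms
    for i in range(1, n):
        w = i * h
        total += math.exp(-theta * w ** q)
    return math.sqrt(2.0 * H) * q * total * h
```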
This suggests J , = O (∆ t H +3 / ) since the sum over θ j converges (also helpful to note Γ (cid:0) H + , θ pj (cid:1) → θ j → ∞ ).For the full estimate of J , , we note D j ν ( X t i Y t i ; s ) = c H ∆ θ j E (cid:2) ϕ (3) ( b X T ) M js,T | ( b X s , Y s ) = ( X t i , Y t i ) (cid:3) , EAK ERROR RATES ROUGH BERGOMI 20 where we assume that M js,T < ∞ (see (3.22) for definition of M ). Omitting the higher order terms in ∆ t in (5.12a) and (5.12c), we return directly to (5.12b), J , = − c H d X j =2 E Z t i +1 t i E (cid:2) ϕ (3) ( b X T ) M js,T | ( b X s , Y s ) = ( X t i , Y t i ) (cid:3) ( b V Ht i ( s ) + c W Ht i ) g ( s ) h ( θ j )d s ∆ θ j , where we let g ( s ) := −√ He s − t i ( s − t i ) H +1 / and h ( θ j ) := θ − qj Γ (cid:0) H + , θ pj (cid:1) . We define a new auxiliary function f ks ( Y kt i ) := E (cid:2) E [ ϕ (3) ( b X T ) M js,T | ( b X s , Y s ) = ( X t i , Y t i )] | Y kt i (cid:3) , where, although the inner expectation depends implicitly on j , here the index k refers to the generalcomponent Y kt i that we are conditioning against. Expanding f k at zero, we find J , = − c H N L X j,k =1 h ( θ j ) Z t i +1 t i E (cid:2) f ks ( Y kt i ) Y kt i (cid:3) e − θ pk ( s − t i ) g ( s )d s ∆ θ j ∆ θ k = − c H α max − X α =1 α ! N L X j,k =1 h ( θ j ) E (cid:2) ( Y kt i ) α +1 (cid:3) e − θ pk ( s − t i ) Z t i +1 t i ∂ α k f ks ( ) g ( s )d s ∆ θ j ∆ θ k − c H N L X j,k =1 h ( θ j ) E (cid:2) R α max ( Y kt i ) Y kt i (cid:3) e − θ pk ( s − t i ) Z t i +1 t i g ( s )d s ∆ θ j ∆ θ k . Following from the boundedness of f and its derivatives (as in (5.9)), we obtain the full estimate for J , , | J , | (cid:46) C ( H, Q, α max ) t − /pi ∆ t H +3 / (cid:46) O (∆ t H +3 / ) , where we use Isserlis’ theorem and the representation of the covariance for the extended variables todetermine the coefficient depending on t i (which we note is also summable over i ).The estimation of e J , and e J , follows the program above. 
For e J , we begin by taking the fBm viewwith the kernel K , | e J , | = (cid:12)(cid:12)(cid:12) E Z t i +1 t i D e ν ( X t i , Y t i ; s ) W Ht i E (cid:2) W Ht i ,s W t i ,s | F t i (cid:3) d s (cid:12)(cid:12)(cid:12) = c H (cid:12)(cid:12)(cid:12) N L X k =1 Z t i +1 t i D e ν ( X t i , Y t i ; s ) W Ht i Z st i K ( s − r )d r | {z } ( s − t i ) H +1 / d s (cid:12)(cid:12)(cid:12) which suggests J , = O (∆ t H +3 / ). We recall that D e ν ( Z t i , s ) = E (cid:2) ϕ (3) ( b X T ) (cid:0) c H P l M ls,T ∆ θ l (cid:1) | Z s = Z t i (cid:3) , where we assume that c H N L X l =1 M ls,T ∆ θ l < ∞ , and then define an auxiliary function f ks ( Y kt i ) := E h E (cid:2) ϕ (3) ( b X T ) (cid:0) c H P l M ls,T ∆ θ l (cid:1) | ( b X s , Y s ) = ( X t i , Y t i ) (cid:3) | Y kt i i . Finally, Taylor expanding f ks at zero we encounter terms similar to those estimated previously yieldingthe full estimate, | e J , | (cid:46) O (∆ t H +3 / ) . Likewise for e J , , taking the fBm view with the conditional expectation term E [ c W Ht i ,s Y jt i ,s | F t i ] yields | e J , | = (cid:12)(cid:12)(cid:12) d X j =2 E Z t i +1 t i D j e ν ( X t i , Y t i ; s ) Z st i e − θ pj ( s − r ) K ( s − r )d r d s (cid:12)(cid:12)(cid:12) , where the estimate (5.13) (encountered in J , ) suggest the rate J , = O (∆ t H +3 / ). For the full estimate,we recall that D j e ν ( Z t i , s ) = E (cid:2) ϕ (3) ( b X T ) (cid:0) c H P l M ls,T ∆ θ l (cid:1) | Z s = Z t i (cid:3) and expand in a second round of Taylor expansions for an appropriate auxiliary function thereby obtaining | e J , | (cid:46) O (∆ t H +3 / ) . EAK ERROR RATES ROUGH BERGOMI 21
Taken together, the estimates of Section 5.2 imply that the terms corresponding to first-order derivatives of $\nu$ and $\widetilde{\nu}$ in (5.1) and (5.2), respectively, yield
\[
|J_1| + |\widetilde{J}_1| = O(\Delta t^{H + 3/2}),
\]
using the notation in (5.3).

5.3. Estimate: remaining terms are higher order.
Additional terms (5.1) and (5.2) appearing inthe expansion are higher order than H + 3 /
2. For example, We find J , = O (∆ t H ) which can beseen by once again taking the fractional view, E (cid:2) S ( Y · ) t i ,s ( W t i ,s ) | F t i (cid:3) = E (cid:2)c W Ht i ,s ( c W Ht i ,s + 2 c W Ht i )( W t i ,s ) (cid:3) = E (cid:2) ( c W Ht i ,s ) ( W t i ,s ) (cid:3) + 2 c W Ht i E (cid:2)c W Ht i ,s ( W t i ,s ) (cid:3)| {z } Isserlis’ = ⇒ = E [( c W Ht i ,s ) ] E [( W t i ,s ) ] + 2( E [ c W Ht i ,s W t i ,s ]) = ∆ t H ∆ t − t H +1 / ) =: g (∆ t ) , where g ( s ) ∝ ( s − t i ) H +1 and then expanding in a second round of Taylor expansions for a suitableauxiliary function and checking the control of the coefficient with respect to θ , | J , | (cid:46) (cid:12)(cid:12)(cid:12) E h ( W Ht i ) Z ∆ t D ν ( Z t i , s + t i ) s H +1 d s i(cid:12)(cid:12)(cid:12) = O (∆ t H ) . The term e J , vanishes, | e J , | = (cid:12)(cid:12)(cid:12) E hZ t i +1 t i D e ν ( Z t i , s )( c W Ht i ) E (cid:2)c W Ht i ,s ( W t i ,s ) (cid:3)| {z } Isserlis’ = ⇒ d s i(cid:12)(cid:12)(cid:12) = 0 , by Isserlis’ theorem (although one could argue that the integrand is formally contributing rate H + 2 herefor e J , ). The terms involving increments of the OU processes yield similar results (constants obtainedwould be better behaved owing to the additional decay).5.4. Closure argument to finish proof of Theorem 2.1.
Thus far we have estimated the first few terms of (5.1) and (5.2) arising from the Taylor expansion in powers of $\Delta t$. Returning to the telescoping sum (3.17), we summarize our estimates up to order $\kappa$,
$$ \mathrm{Err}(T,\Delta t) = \mathbb{E}\big[\varphi(\widehat X_T) - \varphi(X_T)\big] = \sum_{i=0}^{n-1}\Big(\sum_{k=0}^{\kappa-1}\big(J_k + \widetilde J_k\big) + R_\kappa(\nu) + R_\kappa(\widetilde\nu)\Big) \le \sum_{i=0}^{n-1}\Big(C_0(t_i)\,\Delta t^{H+3/2} + C_1(t_i)\,\Delta t^2 + O(\Delta t^{2H+2}) + R_\kappa(\nu) + R_\kappa(\widetilde\nu)\Big), \quad (5.14) $$
using the notation in (5.3). Here the constants $C_k$, for $k = 0, \dots, \kappa-1$, depending on $t_i$, $H$, and $Q$, are the coefficients that appear in the estimates (5.10) and those of Section 5.2, etc., for $J_\cdot$, and likewise for $\widetilde J_\cdot$. Importantly, each of these constants was obtained independently of $N_L$ and $L$, that is, independently of the choice of parameters $\theta$ arising in the Markovian extended-variable state-space formulation. Moreover, each constant is summable in $t_i$. Interchanging the order of summation in (5.14), we obtain
$$ \mathrm{Err}(T,\Delta t,\varphi) \lesssim C_0\,\Delta t^{H+1/2} + C_1\,\Delta t + O(\Delta t^{2H+1}) - \sum_{i=1}^{n-1} \mathbb{E}\int_{t_i}^{t_{i+1}} \Big(\mathbb{E}\big[S(Y_\cdot)_{t_i,s} R_\kappa(\nu) \mid \mathcal{F}_{t_i}\big] + \mathbb{E}\big[S(Y_\cdot)_{t_i,s} R_\kappa(\widetilde\nu) \mid \mathcal{F}_{t_i}\big]\Big)\,\mathrm{d}s, \quad (5.15) $$
with new constants
$$ C_j = \sum_{i=1}^{n-1} C_j(t_i). $$
The remainders have an integral form (3.26), and thus the conditional expectations in (5.15) are
$$ \mathbb{E}\big[S(Y_\cdot)_{t_i,s} R_\kappa(\nu) \mid \mathcal{F}_{t_i}\big] = \frac{1}{\kappa!}\sum_{|\beta|=\kappa} (\widehat W^H_{t_i})^{\beta_1}\, \mathbb{E}\Big[(\widehat W^H_{t_i,s})^2\, (W_{t_i,s})^{\beta_2}\, (Y_{t_i,s})^{\hat\beta} \int_0^1 D^\beta\nu(\xi_\tau, s)\,\mathrm{d}\tau\Big] $$
and
$$ \mathbb{E}\big[S(Y_\cdot)_{t_i,s} R_\kappa(\widetilde\nu) \mid \mathcal{F}_{t_i}\big] = \frac{1}{\kappa!}\sum_{|\beta|=\kappa} (\widehat W^H_{t_i})^{\beta_1}\, \mathbb{E}\Big[\widehat W^H_{t_i,s}\, (W_{t_i,s})^{\beta_2}\, (Y_{t_i,s})^{\hat\beta} \int_0^1 D^\beta\widetilde\nu(\xi_\tau, s)\,\mathrm{d}\tau\Big], $$
where $\mathbb{E}[S(Y_\cdot)_{t_i,s} R_\kappa(\nu) \mid \mathcal{F}_{t_i}] \sim (\widehat W^H_{t_i,s})^{\kappa}$ and $\mathbb{E}[S(Y_\cdot)_{t_i,s} R_\kappa(\widetilde\nu) \mid \mathcal{F}_{t_i}] \sim (\widehat W^H_{t_i,s})^{\kappa}$. We recall that
$$ \mathbb{E}\,|W^H_{t_i,t_{i+1}}|^{\gamma} \lesssim \Delta t^{\gamma H}, $$
by the self-similarity and Hölder continuity of the sample paths. Then, by applying Cauchy–Schwarz, we obtain in (5.15) that the remainder terms yield $O(\Delta t^{\kappa H})$. Thus, $\kappa$ can be chosen sufficiently large, depending on $H$, to yield a large but finite expansion for general payoff functions $\varphi$. The error is then given by
$$ \mathrm{Err}(T,\Delta t,\varphi) \lesssim C_0\,\Delta t^{H+1/2} + C_1\,\Delta t + O(\Delta t^{2H+1}), $$
with weak error rate $H + 1/2$.

Remark.
In the proofs of the main result Theorem 2.1 and of Theorem 4.1, the specific form of $K$ in Section 3.1 and (1.2) is not essential; the proofs carry over in the same spirit to any suitable $L^2$ kernel, provided the corresponding integrability conditions are checked.

6. Conclusions and outlook
The rough Bergomi model (rBergomi) belongs to a class of increasingly popular rough stochastic volatility models for option pricing in quantitative finance. On the one hand, rBergomi overcomes empirical challenges by delivering predictions consistent with observed market data. On the other hand, the non-Markovian nature of rBergomi, due to the fractional Brownian motion (fBm) driver, is an impediment to both theory and numerics. Despite the widespread use of discretization-based simulation methods for option pricing under rBergomi, few works have studied the weak convergence rates that underpin this practice.

We prove that the weak convergence of the Euler scheme depends on the Hurst parameter $H$ of the fBm driver, obtaining the weak rate $H + 1/2$. Extending this result to more general payoff functions $\psi$ is the subject of ongoing work.

7. Acknowledgments
This work was supported by the KAUST Office of Sponsored Research (OSR) under Award No. URF/1/2584-01-01 and the Alexander von Humboldt Foundation. R. Tempone is a member of the KAUST SRI Center for Uncertainty Quantification in Computational Science and Engineering. A portion of this work was carried out while E. Hall was a Postdoctoral Research Scientist in the Chair of Mathematics for Uncertainty Quantification at RWTH Aachen University. C. Bayer gratefully acknowledges support by the German Research Council (DFG) via the Research Unit FOR 2402.

We are grateful to Andreas Neuenkirch for pointing out the simpler proof in the case of quadratic payoffs presented in Section 4.2.
References

[1] E. Abi Jaber and O. El Euch, “Multifactor approximation of rough volatility models”, SIAM Journal on Financial Mathematics, 10 (2019), pp. 309–349.
[2] C. Bayer, P. K. Friz, A. Gulisashvili, B. Horvath, and B. Stemper, “Short-time near-the-money skew in rough fractional volatility models”, Quant. Financ., 19 (2019), pp. 779–798.
[3] C. Bayer, P. K. Friz, P. Gassiat, J. Martin, and B. Stemper, “A regularity structure for rough volatility”, Mathematical Finance, 30 (2020), pp. 782–832.
[4] C. Bayer, P. K. Friz, and J. Gatheral, “Pricing under rough volatility”, Quant. Financ., 16 (2016), pp. 887–904.
[5] C. Bayer, C. B. Hammouda, and R. Tempone, “Hierarchical adaptive sparse grids and quasi-Monte Carlo for option pricing under the rough Bergomi model”, Quant. Financ. (2020), pp. 1–17.
[6] M. Bennedsen, A. Lunde, and M. S. Pakkanen, “Decoupling the short- and long-term behavior of stochastic volatility”, July 2017, arXiv preprint.
[7] M. Bennedsen, A. Lunde, and M. S. Pakkanen, “Hybrid scheme for Brownian semistationary processes”, Finance Stoch., 21 (2017), pp. 931–965.
[8] P. Carmona and L. Coutin, “Fractional Brownian Motion and the Markov Property”, Electron. Commun. Probab., 3 (1998), pp. 95–107.
[9] P. Carmona, L. Coutin, and G. Montseny, “Approximation of Some Gaussian Processes”, Stat. Inference Stoch. Process., 3 (2000), pp. 161–171.
[10] G. Didier, S. A. McKinley, D. B. Hill, and J. Fricks, “Statistical challenges in microrheology”, J. Time Series Anal., 33 (2012), pp. 724–743.
[11] O. El Euch and M. Rosenbaum, “The characteristic function of rough Heston models”, Mathematical Finance, 29 (2019), pp. 3–38.
[12] M. Fukasawa, T. Takabatake, and R. Westphal, “Is Volatility Rough?”, May 2019, arXiv preprint.
[13] J. Gatheral, T. Jaisson, and M. Rosenbaum, “Volatility is Rough”, Quant. Financ., 18 (2018), pp. 933–949.
[14] E. J. Hall, M. A. Katsoulakis, and L. Rey-Bellet, “Uncertainty quantification for generalized Langevin dynamics”, J. Chem. Phys., 145 (2016), p. 224108.
[15] P. Harms, “Strong convergence rates for Markovian representations of fractional Brownian motion”, Feb. 2019, arXiv preprint.
[16] P. Harms and D. Stefanovits, “Affine representations of fractional processes with applications in mathematical finance”, Stochastic Process. Appl., 129 (2019), pp. 1185–1228.
[17] B. Horvath, A. J. Jacquier, and A. Muguruza, “Functional central limit theorems for rough volatility”, May 2019, arXiv preprint.
[18] J. Huntley, Generation of Random Variates, MATLAB Central File Exchange, 2020 (visited on 01/01/2020).
[19] A. J. Jacquier and M. Oumgari, “Deep PPDEs for rough local stochastic volatility”, 2019, arXiv preprint.
[20] A. A. Muravlev, “Representation of a fractional Brownian motion in terms of an infinite-dimensional Ornstein–Uhlenbeck process”, Russian Math. Surveys, 66 (2011), pp. 439–441, doi: 10.1070/rm2011v066n02abeh004746.
[21] A. Neuenkirch and T. Shalaiko, “The order barrier for strong approximation of rough volatility models”, June 2016, arXiv preprint.
[22] L. Rey-Bellet, “Open classical systems”, in Open Quantum Systems II, ed. by S. Attal, A. Joye, and C. Pillet, Lecture Notes in Mathematics vol. 1881, Berlin: Springer, 2006.
[23] M. Romano and N. Touzi, “Contingent claims and market completeness in a stochastic volatility model”, Mathematical Finance, 7 (1997), pp. 399–412.
[24] D. Talay and L. Tubaro, “Expansion of the global error for numerical schemes solving stochastic differential equations”, Stochastic Anal. Appl., 8 (1990), pp. 483–509.

Appendix A. Details of numerical implementation
We compute the weak error (here we assume $T = 1$)
$$ \big|\, \mathbb{E}\big[\varphi(X^{\mathrm{ref}}_T) - \varphi(X^{\Delta t}_T)\big] \,\big|, $$
for various payoff functions $\varphi$, by the left-point scheme
$$ (\mathrm{A.1}) \qquad X_T(n) = \sum_{i=0}^{n-1} W^H_{t_i}\big(W_{t_{i+1}} - W_{t_i}\big), $$
using a reference solution $X^{\mathrm{ref}}_T := X_T(2^{\,\cdot})$ on a fine mesh and computations $X^{\Delta t}_T = X_T(n)$ for a range of dyadic $n$. We sample the paths $(W^H_{t_i}, W_{t_i})_{i\in[0:n]}$ required for the Monte Carlo approximation of (A.1) at points of the reference mesh using the Cholesky decomposition method. For each $\{H, T, n\}$ we first form the eigenvalue decomposition $(L, D)$ of the full covariance matrix
$$ \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}, $$
with blocks $(\Sigma_{11})_{ij} = \mathrm{Cov}(W^H_{t_i}, W^H_{t_j})$, $(\Sigma_{12})_{ij} = \mathrm{Cov}(W^H_{t_i}, W_{t_j})$, and $(\Sigma_{22})_{ij} = \mathrm{Cov}(W_{t_i}, W_{t_j})$. We refer to, e.g., Lemma 4.1 of [2] for the covariance function of the Riemann–Liouville fBm and to pfq.m from [18] to compute hypergeometric functions:

    % H, gam, T, and n are assumed set in the workspace
    t = T/n:T/n:T;
    % requires pfq.m from the MATLAB File Exchange [18]
    G = @(x) 2.0*H*( x.^(-gam)/(1.0-gam) + (gam*x.^(-1.0-gam)./(1.0-gam)) .* ...
        pfq([1.0, 1.0+gam], 3.0-gam, x.^(-1.0))/(2.0-gam) );
    disp('Computing blocks S11 S12 S22')
    [X,Y] = meshgrid(t,t);
    Gmat = G((tril(Y./X)+tril(Y./X)') - eye(size(Y,1)).*diag(Y./X));
    Gmat = Gmat.*~eye(size(Gmat)) + eye(size(Gmat)); % diag is 1 b/c G(1)=1
    S11 = ((tril(X)+tril(X)') - eye(size(X,1)).*diag(X)).^(2*H) .* Gmat;
    S12 = sqrt(2*H)*(Y.^(H+0.5)-(Y-min(Y,X)).^(H+0.5))./(H+0.5);
    S22 = min(X,Y);
    disp('Finished blocks S11 S12 S22')
    % eigenvalue (LDL-type) factorization used in place of Cholesky
    disp('Computing L D via eig')
    [L,D] = eig([S11 S12; S12.' S22]);

We then generate M samples using the LDL eigenvalue decomposition:

    z = randn(2*n,M);
    X = real(L*(D^0.5)*z);
    WH = [zeros(1,M); X(1:n, :)];
    W  = [zeros(1,M); X(n+1:end, :)];

and note that for $H = 0.5$ the $W^H$ generated by this method is identical to $W$.
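As a cross-check of the covariance construction above, the same joint sampling can be sketched in Python (illustrative only; this is not the paper's code, and the function names are ours). Here the $\Sigma_{11}$ block is obtained by direct numerical integration of $\mathrm{Cov}(W^H_s, W^H_t) = 2H\int_0^{s\wedge t}(t-r)^{H-1/2}(s-r)^{H-1/2}\,\mathrm{d}r$, which follows from the definition (1.2), rather than via the hypergeometric closed form used in the MATLAB listing; the eigenvalue factorization plays the role of `eig`:

```python
import numpy as np

def rl_fbm_cov(s, t, H, m=20001):
    # Cov(W^H_s, W^H_t) = 2H \int_0^{min(s,t)} (t-r)^{H-1/2} (s-r)^{H-1/2} dr;
    # the substitution s - r = w^2 removes the endpoint singularity at r = s
    s, t = min(s, t), max(s, t)
    if s == 0.0:
        return 0.0
    if s == t:
        return t ** (2.0 * H)  # closed form: 2H * t^{2H} / (2H)
    w = np.linspace(0.0, np.sqrt(s), m)
    f = 4.0 * H * w ** (2.0 * H) * (t - s + w ** 2) ** (H - 0.5)
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(w)))  # trapezoidal rule

def sample_paths(n, T, H, M, rng):
    # exact joint samples of (W^H, W) on the grid t_i = i T / n, i = 1..n
    t = np.linspace(T / n, T, n)
    S11 = np.array([[rl_fbm_cov(a, b, H) for b in t] for a in t])
    S12 = np.sqrt(2.0 * H) / (H + 0.5) * np.array(
        [[a ** (H + 0.5) - (a - min(a, b)) ** (H + 0.5) for b in t] for a in t])
    S22 = np.minimum.outer(t, t)
    Sigma = np.block([[S11, S12], [S12.T, S22]])
    lam, V = np.linalg.eigh(Sigma)           # eigenvalue factorization of Sigma
    L = V * np.sqrt(np.clip(lam, 0.0, None))  # clip guards tiny negative eigenvalues
    X = L @ rng.standard_normal((2 * n, M))
    return X[:n], X[n:]                       # (W^H_{t_i}), (W_{t_i})
```

For $H = 1/2$ all three blocks coincide with $\min(t_i, t_j)$, matching the remark above that $W^H$ then reproduces $W$.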
One sample of $X^{\mathrm{ref}}_T$ and $X^{\Delta t}_T$ at the final time can then be computed by summing the appropriate terms using the reference paths:

    % numDt and gap are assumed set in the workspace
    XTref(1,:) = sum(WH(1:end-1,:) .* diff(W(1:end,:)));
    XTdt = nan(numDt,M);
    for j = 1:numDt
        XTdt(j,:) = sum(WH(1:2^(j+gap-1):end-1,:) .* diff(W(1:2^(j+gap-1):end,:)));
    end

The code referenced above can be found in the git repository: https://bitbucket.org/datainformeduq/rbwc_code.

Appendix B. Bound auxiliary functions $f$ by $\varphi$

Lemma B.1.
Let $\varphi(x)$ be the payoff function, and define
$$ f_s\big(Y^{l_1}_{t_i}, \dots, Y^{l_k}_{t_i}\big) := \mathbb{E}\big[\nu(Z_{t_i}, s) \mid Y^{l_1}_{t_i}, \dots, Y^{l_k}_{t_i}\big] = \mathbb{E}\big[\mathbb{E}\big[\varphi^{(m)}(\widehat X_T) \mid Z_s = (X_{t_i}, Y_{t_i})\big] \mid Y^{l_1}_{t_i}, \dots, Y^{l_k}_{t_i}\big], $$
for components $(Y^{l_1}_{t_i}, \dots, Y^{l_k}_{t_i}) \subset Y_{t_i}$. Then for a multi-index $\alpha = (\alpha_1, \dots, \alpha_k)$,
$$ |\partial^\alpha f_s(0)| \lesssim \sum_{j=0}^{|\alpha|} \|\varphi^{(m+j)}\|_\infty. $$

Proof. (Lemma B.1) Recall that $\nu(Z_{t_i}, s)$ is a deterministic function of the jointly Gaussian $\eta$-dimensional vector (with $\eta := (N_L + 1)\times(i+1)$)
$$ \xi = (Y_\tau, \Delta W_\tau)_{\tau = t_0, \dots, t_i}, $$
(cf. (3.11)). Denoting the density by $\varrho(\xi)$, we note $\xi$ is mean zero and has variance–covariance $\Sigma$ given by the known quantities $\mathrm{Cov}(Y^l_{t_i}, Y^k_{t_j})$, $\mathrm{Cov}(Y^l_{t_i}, \Delta W_{t_j})$, and $\mathrm{Cov}(\Delta W_{t_i}, \Delta W_{t_j})$, which have closed-form expressions. For each $s$, the variable $\nu(Z_{t_i}, s)$ has a density proportional to $\varrho(\xi)$,
$$ \mathrm{d}P_\nu \propto \varrho(\xi)\,\mathrm{d}\xi. $$
Writing $y = (Y^{l_1}_{t_i}, \dots, Y^{l_k}_{t_i})$, a subset of components of $Y_{t_i}$ of size $k = |y|$ such that $k \le N_L$, the function
$$ f_s(y) = \mathbb{E}[\nu(Z_{t_i}, s) \mid y], $$
a deterministic function of $y$, can be expressed in terms of a conditional Gaussian density. Partitioning $\xi = (\widetilde\xi, y)$ and, likewise, the covariance matrix $\Sigma$ into components $\Sigma_{11}$ corresponding to $\widetilde\xi$, $\Sigma_{22}$ to $y$, and $\Sigma_{12}$ to the mixed terms, the conditional Gaussian density $\varrho(\widetilde\xi \mid y)$ has conditional mean
$$ \widetilde\mu = \Sigma_{12}\Sigma_{22}^{-1} y, $$
which is linear in $y$, and conditional variance–covariance matrix
$$ \widetilde\Sigma = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}^{\top}, $$
which does not depend on $y$. Rewritten in terms of this conditional density, $f$ is given by
$$ f_s(y) = \int_{\mathbb{R}^{\eta-k}} \nu(X_{t_i}, \widetilde\xi, y; s)\, \varrho(\widetilde\xi \mid y)\,\mathrm{d}\widetilde\xi, \qquad \text{where} \qquad \nu(X_{t_i}, \widetilde\xi, y; s) = \mathbb{E}\big[\varphi^{(m)}(\widehat X_T) \mid Z_s = (X_{t_i}, \widetilde\xi, y)\big]. $$
For a multi-index $\alpha = (\alpha_1, \dots, \alpha_k)$ corresponding to the components of $y$,
$$ \partial^\alpha = \frac{\partial^{\alpha}}{\partial y^{\alpha}} = \frac{\partial^{\alpha_1}}{\partial y_{l_1}^{\alpha_1}} \cdots \frac{\partial^{\alpha_k}}{\partial y_{l_k}^{\alpha_k}}; $$
taking the derivative inside the integral, we obtain
$$ \partial^\alpha f_s(y) = \int_{\mathbb{R}^{\eta-k}} \Big\{ \big(\partial^\alpha \nu(X_{t_i}, \widetilde\xi, y)\big)\,\varrho(\widetilde\xi \mid y) + \nu(X_{t_i}, \widetilde\xi, y)\,\partial^\alpha \varrho(\widetilde\xi \mid y) \Big\}\,\mathrm{d}\widetilde\xi = \int_{\mathbb{R}^{\eta-k}} \Big\{ \partial^\alpha \nu(X_{t_i}, \widetilde\xi, y) + \nu(X_{t_i}, \widetilde\xi, y)\, P_k(y) \Big\}\, \varrho(\widetilde\xi \mid y)\,\mathrm{d}\widetilde\xi, \quad (\mathrm{B.1}) $$
where $\partial^\alpha \varrho(\widetilde\xi \mid y) = P_k(y)\,\varrho(\widetilde\xi \mid y)$, for $P_k$ a polynomial of degree $k$, since
$$ \varrho(\widetilde\xi \mid y) \propto \exp\big[-\tfrac{1}{2}(\widetilde\xi - \widetilde\mu)^{\top}\widetilde\Sigma^{-1}(\widetilde\xi - \widetilde\mu)\big] = \exp\big[-\tfrac{1}{2}(\widetilde\xi - \Sigma_{12}\Sigma_{22}^{-1}y)^{\top}\widetilde\Sigma^{-1}(\widetilde\xi - \Sigma_{12}\Sigma_{22}^{-1}y)\big]. $$
The remaining derivative in (B.1) follows similarly to the computation of the fluxes in Lemma 3.7,
$$ \partial^\alpha \nu(X_{t_i}, \widetilde\xi, y; s) = c_H^{|\alpha|}\, \mathbb{E}\Big[\varphi^{(|\alpha|+m)}(\widehat X_T) \prod_{j=1}^{|\alpha|} \big(\Delta\theta_{l_j} M^{l_j}_{s,T}\big)^{\alpha_j} \,\Big|\, Z_s = (X_{t_i}, \widetilde\xi, y)\Big], $$
and the estimate follows.
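The partitioned-Gaussian computation at the heart of the proof — the conditional mean $\widetilde\mu = \Sigma_{12}\Sigma_{22}^{-1}y$, linear in $y$, and the conditional covariance $\widetilde\Sigma = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}^{\top}$, independent of $y$ — can be sketched in a few lines of Python (illustrative only; the function name is ours, not the paper's):

```python
import numpy as np

def gaussian_conditional(Sigma, k):
    # Condition a zero-mean Gaussian with covariance Sigma on its last k
    # components y; returns the mean map y -> S12 S22^{-1} y (linear in y)
    # and the Schur complement S11 - S12 S22^{-1} S12^T (independent of y).
    S11 = Sigma[:-k, :-k]
    S12 = Sigma[:-k, -k:]
    S22 = Sigma[-k:, -k:]
    A = S12 @ np.linalg.inv(S22)
    return (lambda y: A @ y), S11 - A @ S12.T

# bivariate example with unit variances and correlation rho: conditioning on
# the second component gives mean rho * y and variance 1 - rho^2
rho = 0.6
mean_map, cond_cov = gaussian_conditional(np.array([[1.0, rho], [rho, 1.0]]), 1)
```

In the proof above the same structure is what makes $P_k(y)$ polynomial: differentiating the conditional density in $y$ only acts on the exponent through the linear map $y \mapsto \Sigma_{12}\Sigma_{22}^{-1}y$.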