A regularity structure for rough volatility
Christian Bayer, Peter K. Friz, Paul Gassiat, Joerg Martin, Benjamin Stemper
Abstract. A new paradigm has recently emerged in financial modelling: rough (stochastic) volatility. First observed by Gatheral et al. in high-frequency data, and subsequently derived within market microstructure models, it also turned out to capture parsimoniously key stylized facts of the entire implied volatility surface, including extreme skews that were thought to be outside the scope of stochastic volatility. On the mathematical side, Markovianity and, partially, semimartingality are lost. In this paper we show that Hairer's regularity structures, a major extension of rough path theory which caused a revolution in the field of stochastic partial differential equations, also provide a new and powerful tool to analyze rough volatility models.
Dedicated to Professor Jim Gatheral on the occasion of his 60th birthday.
Contents
1. Introduction
1.1. Markovian stochastic volatility models
1.2. Complications with rough volatility
1.3. Description of main results
1.4. Lessons from KPZ and singular SPDE theory
2. Reduction of Theorems 1.1 and 1.3
3. The rough pricing regularity structure
3.1. Basic pricing setup
3.2. Approximation and renormalization theory
3.3. The case of the Haar basis
4. The full rough volatility regularity structure
4.1. Basic setup
4.2. Small noise model large deviation
5. Rough Volterra dynamics for volatility
5.1. Motivation from market micro-structure
5.2. Regularity structure approach
5.3. Solving for rough volatility
6. Numerical results
Appendix A. Approximation and renormalization (Proofs)
Appendix B. Large deviations proofs
Appendix C. Proofs of Section 5
References
Date: October 23, 2017.

1. Introduction
We are interested in stochastic volatility (SV) models given in Itô differential form
(1.1) dS_t/S_t = σ_t dB_t ≡ √(v_t(ω)) dB_t.
Here, B is a standard Brownian motion and σ_t (resp. v_t) is known as the stochastic volatility (resp. variance) process. Many classical Markovian asset price models fall in this framework, including Dupire's local volatility model and the SABR, Stein–Stein and Heston models. In all named SV models, one has Markovian dynamics for the variance process, of the form
(1.2) dv_t = g(v_t) dW_t + h(v_t) dt;
constant correlation ρ := d⟨B, W⟩_t/dt is incorporated by working with a 2D standard Brownian motion (W, W̄),
B := ρW + ρ̄W̄ ≡ ρW + √(1−ρ²) W̄.
This paper is concerned with an important class of non-Markovian (fractional) SV models, dubbed rough volatility (RV) models, in which σ_t (equivalently: v_t ≡ σ_t²) is modelled via a fractional Brownian motion (fBm) in the regime H ∈ (0, 1/2). The terminology "rough" stems from the fact that in such models stochastic volatility (variance) sample paths are H⁻-Hölder, i.e. Hölder continuous of any order less than H, hence "rougher" than Brownian paths. Note the stark contrast to the idea of "trending" fractional volatility, which amounts to taking H > 1/2. The evidence for the rough regime (recent calibrations suggest H as low as 0.05) is now overwhelming, both under the physical and the pricing measure, see e.g. [1, 24, 25, 27, 4, 19, 42]. Much attention in these references has in fact been given to "simple" rough volatility models, by which we mean models of the form
(1.3) σ_t := f(Ŵ_t) … "simple rough volatility (RV)"
(1.4) Ŵ_t = ∫₀ᵗ K(s,t) dW_s,
(1.5) K(s,t) = √(2H) |t−s|^{H−1/2} 1_{t>s}, H ∈ (0, 1/2].
In other words, volatility is a function of a fractional Brownian motion, with (fixed) Hurst parameter. (Following [4] we work with the Volterra or Riemann–Liouville fBm; other choices, such as the Mandelbrot–van Ness fBm with suitably modified kernel K, are possible.) Note that, in contrast even to classical SV models, the stochastic volatility is explicitly given, and no rough/stochastic differential equation needs to be solved (hence "simple"). Rough volatility not only provides remarkable fits to both time series and option pricing problems, it also has a market microstructure justification: starting with a Hawkes process model, Rosenbaum and coworkers [16, 17, 18] find in the scaling limit f, g, h such that
(1.6) σ_t := f(Z_t) … "non-simple rough volatility (RV)"
(1.7) Z_t = z + ∫₀ᵗ K(s,t) g(Z_s) ds + ∫₀ᵗ K(s,t) h(Z_s) dW_s,
with stochastic Volterra dynamics that provide a natural generalization of simple rough volatility. (Volatility is not a traded asset, hence its non-semimartingality, when H ≠ 1/2, does not imply arbitrage.)
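The simple model (1.3)–(1.5) lends itself to direct simulation. The following sketch (our own crude discretization, not a scheme advocated in this paper, with the hypothetical choice f = exp for illustration) approximates Ŵ_T by a left-point Riemann sum of the kernel integral and checks that the normalization √(2H) indeed gives Var(Ŵ_T) = T^{2H}:

```python
import math
import random

# Crude illustration (our own discretization, not the paper's scheme):
# approximate W_hat_T = sqrt(2H) * int_0^T (T-s)^(H-1/2) dW_s by a left-point
# Riemann sum.  The sqrt(2H) normalization makes Var(W_hat_T) = T^(2H),
# which we check by Monte Carlo; f = exp below is a hypothetical volatility
# function, for illustration only.

H = 0.3                       # Hurst parameter, rough regime H < 1/2
T = 1.0
n_steps, n_paths = 1000, 2000
dt = T / n_steps
sqrt_dt = math.sqrt(dt)

random.seed(42)
# kernel weights sqrt(2H) * (T - t_i)^(H - 1/2) at left endpoints t_i = i*dt
weights = [math.sqrt(2 * H) * (T - i * dt) ** (H - 0.5) for i in range(n_steps)]

what_T = []                   # Monte Carlo samples of W_hat_T
for _ in range(n_paths):
    acc = 0.0
    for w in weights:
        acc += w * random.gauss(0.0, 1.0) * sqrt_dt
    what_T.append(acc)

mean = sum(what_T) / n_paths
var = sum((x - mean) ** 2 for x in what_T) / (n_paths - 1)
print("empirical Var:", var, " theoretical T^(2H):", T ** (2 * H))

sigma_T = math.exp(what_T[0])   # a "simple rough volatility" sample, f = exp
```

The left-point sum is biased near the kernel singularity at s = T; the bias vanishes with the mesh, which is why a generous tolerance is used in the check.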
1.1. Markovian stochastic volatility models.
For comparison with rough volatility, Section 1.2 below, we first mention a selection of tools and methods well-known for Markovian SV models.
• PDE methods are ubiquitous in (low-dimensional) pricing problems, as are
• Monte Carlo methods, noting that knowledge of the strong (resp. weak) rate, 1/2 (resp. 1) for the Euler scheme, is crucial for the design of efficient simulation schemes;
• Quasi Monte Carlo (QMC) methods are widely used; related in spirit we have the Kusuoka–Lyons–Victoir cubature approach, popularized in the form of the Ninomiya–Victoir (NV) splitting scheme, nowadays available in standard software packages;
• Freidlin–Wentzell theory of small noise large deviations is essentially immediately applicable, as are various "strong" large deviations (a.k.a. exact asymptotics) results, used e.g. to derive the famous SABR formula.
For several reasons it can be useful to write model dynamics in
Stratonovich form: From a PDE perspective, the operators then take sum-of-squares form, which can be exploited in many ways (Hörmander theory, naturally linked to Malliavin calculus, ...). From a numerical perspective, we note that the cubature / NV scheme [43] also requires the full dynamics to be rewritten in Stratonovich form. In fact, viewing NV as level-5 cubature, in the sense of [40], its level-3 simplification is nothing but the familiar Wong–Zakai approximation result for diffusions. Another financial example that requires a Stratonovich formulation comes from interest rate model validation [13], based on the Stroock–Varadhan support theorem. We further note that QMC (e.g. Sobol') works particularly well if the noise has a multiscale decomposition, as obtained by interpreting a (piecewise) linear Wong–Zakai approximation as a Haar wavelet expansion of the driving white noise.
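The link between piecewise-linear Wong–Zakai approximation and the Haar wavelet expansion can be made concrete via the Lévy–Ciesielski (Schauder) construction of Brownian motion on [0,1]. The sketch below (our own, for illustration) checks that the level-n partial sum is piecewise linear, and that adding finer levels never changes the values already fixed at the level-n dyadic points:

```python
import math
import random

# Levy-Ciesielski construction: B(t) = g0*t + sum_{j,k} g_{j,k} * Lambda_{j,k}(t),
# where Lambda_{j,k} is the triangular Schauder function (the integral of the
# Haar wavelet h_{j,k}).  Truncating at level n gives exactly a piecewise
# linear path: a Wong-Zakai type approximation of the limiting Brownian path.

random.seed(7)
MAX_LEVEL = 6
g0 = random.gauss(0.0, 1.0)
coeff = {(j, k): random.gauss(0.0, 1.0)
         for j in range(MAX_LEVEL + 1) for k in range(2 ** j)}

def schauder(j, k, t):
    """Tent function: support [k/2^j, (k+1)/2^j], slopes +-2^(j/2)."""
    a, b = k / 2 ** j, (k + 1) / 2 ** j
    m = 0.5 * (a + b)
    if t <= a or t >= b:
        return 0.0
    return 2 ** (j / 2) * ((t - a) if t <= m else (b - t))

def partial_sum(t, level):
    s = g0 * t
    for j in range(level + 1):
        for k in range(2 ** j):
            s += coeff[(j, k)] * schauder(j, k, t)
    return s

# Finer Schauder functions vanish at coarser dyadic points, so the multiscale
# decomposition refines "between" previously fixed values:
print(partial_sum(0.25, 2), partial_sum(0.25, MAX_LEVEL))
```

Between consecutive dyadics of level n+1 the level-n partial sum is affine, which is exactly the piecewise-linear Wong–Zakai interpolation referred to in the text.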
1.2. Complications with rough volatility.
Due to the loss of Markovianity, PDE methods are not applicable, and neither are (off-the-shelf) Freidlin–Wentzell large deviation estimates (but see [19]). Moreover, rough volatility is not a semimartingale, which complicates, to say the least, the use of several established stochastic analysis tools. In particular, rough volatility admits no Stratonovich form. Closely related, one lacks a (Wong–Zakai type) approximation theory for rough volatility. To see this, focus on the "simple" situation, that is (1.1), (1.3), so that
(1.8) S_t/S_0 = 𝓔(∫₀^· f(Ŵ_s) dB_s)(t).
Inside the (classical) stochastic exponential 𝓔(M)(t) = exp(M_t − ½[M]_t) we have the martingale term
(1.9) ∫₀ᵗ f(Ŵ) dB = ρ ∫₀ᵗ f(Ŵ) dW + ρ̄ ∫₀ᵗ f(Ŵ) dW̄,
and, in essence, the trouble is due to the first, innocent-looking Itô integral. Indeed, any naive attempt to put it in Stratonovich form,
(1.10) "∫₀ᵗ f(Ŵ) ∘ dW := ∫₀ᵗ f(Ŵ) dW + (Itô–Stratonovich correction)",
or, in the spirit of Wong–Zakai approximations,
(1.11) "∫₀ᵗ f(Ŵ) ∘ dW := lim_{ε→0} ∫₀ᵗ f(Ŵ^ε) dW^ε",
must fail whenever H < 1/2. The Itô–Stratonovich correction is given by the quadratic covariation, defined (whenever possible) as the limit, in probability, of
(1.12) ∑_{[u,v]∈π} (f(Ŵ_v) − f(Ŵ_u))(W_v − W_u),
along any sequence (π_n) of partitions with mesh-size tending to zero. But, disregarding trivial situations, this limit does not exist. For instance, when f(x) = x, fractional scaling immediately gives divergence (at rate |π|^{H−1/2}) of the above bracket approximation. This issue also arises in the context of option pricing, which in fact is readily reduced (Theorem 1.3 and Section 6) to the sampling of stochastic integrals of the aforementioned type, i.e. with integrands on a fractional scale. All these problems remain present, of course, for the more complicated situation of "non-simple" rough volatility (Section 5).
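For f(x) = x the failure is already visible in expectation: by the Itô isometry, each cell of the bracket approximation (1.12) contributes E[(Ŵ_v − Ŵ_u)(W_v − W_u)] = √(2H)(v−u)^{H+1/2}/(H+1/2) under the Riemann–Liouville kernel normalization, so on a uniform mesh ε of [0,T] the expected bracket equals T√(2H)/(H+1/2)·ε^{H−1/2}, blowing up at rate ε^{H−1/2}. A quick check of this scaling (our own illustration):

```python
import math

# Expected value of the bracket approximation (1.12) for f(x) = x on a
# uniform partition of [0, T] with n cells (mesh eps = T/n):
#   per cell:  sqrt(2H)/(H + 1/2) * eps^(H + 1/2)        (Ito isometry)
#   total:     T * sqrt(2H)/(H + 1/2) * eps^(H - 1/2)    -> infinity if H < 1/2

H, T = 0.1, 1.0

def expected_bracket(n):
    eps = T / n
    per_cell = math.sqrt(2 * H) / (H + 0.5) * eps ** (H + 0.5)
    return n * per_cell

def closed_form(eps):
    return T * math.sqrt(2 * H) / (H + 0.5) * eps ** (H - 0.5)

for n in (10, 100, 1000):
    print(n, expected_bracket(n), closed_form(T / n))
```

Refining the mesh by a factor 100 multiplies the expected bracket by 100^{1/2−H}, so for small H the divergence is pronounced.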
1.3. Description of main results.
With motivation from singular SPDE theory, such as Hairer's work on KPZ [32] and the Hairer–Pardoux "renormalized" Wong–Zakai theorem [35], we provide the closest there is to a satisfactory approximation theory for rough volatility. This starts with the remark that rough path theory, despite its very purpose of dealing with low-regularity paths, is not applicable here. To state our basic approximation results, write Ẇ^ε ≡ ∂_t W^ε for a suitable (details below) approximation at scale ε to white noise, with induced approximation to fBm denoted by Ŵ^ε. Throughout, the Hurst parameter H ∈ (0, 1/2] is fixed and f is a smooth function such that (1.8) is a (local) martingale, as required by modern financial theory.

Theorem 1.1.
Consider simple rough volatility with dynamics dS_t/S_t = f(Ŵ_t) dB_t, i.e. driven by Brownian motions B and W with constant correlation ρ. There exist ε-periodic functions C^ε = C^ε(t), with diverging averages C̄^ε, such that a Wong–Zakai result holds of the form S̃^ε → S, in probability and uniformly on compacts, where
∂_t S̃^ε_t / S̃^ε_t = f(Ŵ^ε) Ḃ^ε − ρ C^ε(t) f′(Ŵ^ε) − ½ f²(Ŵ^ε), S̃^ε_0 = S_0.
Similar results hold for more general ("non-simple") RV models.

Remark 1.2. When H = 1/2, this result is an easy consequence of Itô–Stratonovich conversion formulae. In the case H < 1/2, the renormalization term is genuinely needed whenever ρ is non-zero. This is the case in equity (and many other) markets [4]. Also note that naive approximations S^ε_t, without subtracting the C^ε-term, will in general diverge.

In order to formulate implications for option pricing, define the Black–Scholes pricing function
(1.13) C_BS(S_0, K; σ²T) := E(S_0 exp(σ√T Z − ½σ²T) − K)₊,
where Z denotes a standard normal random variable. We then have
Theorem 1.3. With C^ε = C^ε(t) as in Theorem 1.1, define the renormalized integral approximation
(1.14) Ĩ^ε := Ĩ^ε_f(T) := ∫₀ᵀ f(Ŵ^ε) dW^ε − ∫₀ᵀ C^ε(t) f′(Ŵ^ε_t) dt
and also the approximate total variance, V^ε := V^ε_f(T) := ∫₀ᵀ f²(Ŵ^ε_t) dt. Then the price of a European call option, under the pricing model (1.1), (1.3), struck at K with time T to maturity, is given as lim_{ε→0} E[Ψ(Ĩ^ε, V^ε)], where
(1.15) Ψ(I, V) := C_BS(S_0 exp(ρ I − ½ρ² V), K; ρ̄² V).
Similar results hold for more general ("non-simple") RV models.
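A minimal sketch of the pricing recipe (1.13)/(1.15) (our own implementation; the normal-CDF expression for C_BS is the standard Black–Scholes formula with zero rates, written as a function of total variance v = σ²T):

```python
import math

# (1.13): Black-Scholes call price as a function of total variance v = sigma^2*T,
# (1.15): Psi(I, V) = C_BS(S0*exp(rho*I - 0.5*rho^2*V), K; (1 - rho^2)*V).
# Sketch implementation (ours); zero interest rates throughout.

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def C_BS(S0, K, v):
    if v <= 0.0:
        return max(S0 - K, 0.0)
    d1 = (math.log(S0 / K) + 0.5 * v) / math.sqrt(v)
    d2 = d1 - math.sqrt(v)
    return S0 * norm_cdf(d1) - K * norm_cdf(d2)

def Psi(I, V, S0=1.0, K=1.0, rho=-0.7):
    rho_bar_sq = 1.0 - rho * rho
    return C_BS(S0 * math.exp(rho * I - 0.5 * rho ** 2 * V), K, rho_bar_sq * V)

# at the money, total variance 0.04 (sigma = 20%, T = 1):
atm = C_BS(1.0, 1.0, 0.04)
print("ATM call:", atm)                               # about 0.0797
print("rho = 0 collapses Psi to C_BS:", Psi(0.3, 0.04, rho=0.0))
```

The approximation of Theorem 1.3 then amounts to Monte Carlo averaging of Ψ(Ĩ^ε, V^ε) over samples of the renormalized pair.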
From a mathematical perspective, the key issue in proving the above theorems is to establish convergence of the renormalized approximate integrals
(1.16) Ĩ^ε = ∫₀ᵀ f(Ŵ^ε) dW^ε − ∫₀ᵀ C^ε(t) f′(Ŵ^ε_t) dt → (Itô-integral).
It is here that we find much inspiration from singular SPDE theory, which also requires renormalized approximations for convergence to the correct Itô-object. Specifically, we see that the theory of regularity structures [31], which essentially emerged from rough paths and Hairer's KPZ analysis (see [23] for a discussion and references), is a very appropriate tool for us. This adds to the existing instances of regularity structures (polynomials, rough paths, many singular SPDEs ...) an interesting new class of examples which on the one hand avoids all considerations related to spatial structure (notably multi-level Schauder estimates; cf. [31, Ch. 5]), yet comes with the genuine need for renormalization. In fact, since we do not restrict to mollifier approximations (this would rule out wavelet approximation of white noise!), our analysis naturally leads us to renormalization functions. In the case of mollifier approximations, i.e. when Ẇ^ε is the ε-mollification obtained by convolution of Ẇ with a rescaled mollifier, say δ^ε(x,y) = ε^{−1}ρ(ε^{−1}(y−x)), which is the usual choice of Hairer and coworkers [32, 31, 11], the renormalization function turns out to be constant (since Ẇ^ε is still stationary); in this case
C^ε(t) ≡ C̄^ε = c ε^{H−1/2},
with c = c(ρ) explicitly given as an integral, cf. (3.13). If, on the other hand, we consider a Haar wavelet approximation of white noise, very natural from a numerical point of view,
(1.17) C^ε(t) = (√(2H)/(H+1/2)) |t − ⌊t/ε⌋ε|^{H+1/2} / ε², with mean C̄^ε = (√(2H)/((H+1/2)(H+3/2))) ε^{H−1/2}.
It is natural to ask if C^ε(t) can be replaced, after all, by its (since H < 1/2: diverging) mean C̄^ε. For H > 1/4 this is possible, while matters are more delicate for H ≤ 1/4, cf. Section 3.2.

From a numerical simulation perspective, Theorem 1.3 is a step forward as it avoids any sampling related to the other factor W̄. A brute-force approach then consists in simulating a scalar Brownian motion W, followed by computing Ŵ = ∫ K dW by Itô/Riemann–Stieltjes approximations of (I, V). However, given the singularity of the Volterra kernel K, this is not advisable, and it is
[Footnote: Other wavelet choices are possible. In particular, in the case of fractional noise, Alpert–Rokhlin (AR) wavelets have been suggested for improved numerical behaviour; cf. [28], where this is attributed to a series of works of A. Majda and coworkers. A theoretical and numerical study of AR wavelets in the rough vol context is left to future work.]
preferable to simulate the two-dimensional Gaussian process (W_t, Ŵ_t : 0 ≤ t ≤ T), whose covariance is readily available. A remaining problem is that the rate of convergence of
∑ f(Ŵ_s) W_{s,t} → (Itô-integral),
with [s,t] taken in a partition of mesh-size ∼ 1/n, is very slow, since Ŵ has little regularity when H is small (Gatheral and co-authors [27, 4] report H ≈ 0.05). It is here that higher-order approximations come to help, and we have included quantitative estimates, more precisely strong rates, throughout. An analysis of weak rates will be conducted elsewhere, as is the investigation of multi-level algorithms (cf. [6] for MLMC for general Gaussian rough differential equations); recall that the design of MLMC algorithms requires knowledge of strong rates. Numerical aspects are further explored in Section 6.

The second set of results concerns large deviations for rough volatility. Thanks to the contraction principle and fundamental continuity properties of Hairer's reconstruction map, the problem is reduced to understanding an LDP for a suitable enhancement of the noise. This approach requires (sufficiently) smooth coefficients, but comes with no growth restrictions, which is indeed quite suitable for financial modelling: we improve the Forde–Zhang (simple rough vol) short-time large deviations [19] so as to include f of exponential type, a defining feature in the works of Gatheral and coauthors [27, 4]. (Such an extension is also the subject of a recent preprint [38] and forthcoming work [30].)

Theorem 1.4.
Let X_t = log(S_t/S_0) be the log-price under simple rough SV, i.e. (1.1), (1.3). Then (t^{H−1/2} X_t : t ≥ 0) satisfies a short time large deviation principle with speed t^{2H} and rate function given by
(1.18) I(y) = inf_{h∈L²([0,1])} { ½‖h‖²_{L²} + (y − ρ I₁(h))² / (2 ρ̄² I₂(h)) },
with I₁(h) = ∫₀¹ f(ĥ(t)) h(t) dt, I₂(h) = ∫₀¹ f²(ĥ(t)) dt, where ĥ(t) = ∫₀ᵗ K(s,t) h(s) ds.

Remark 1.5. A potential shortcoming is the non-explicit form of the rate function, in the sense that the geometric or Hamiltonian descriptions of the rate function (classical in the Markovian setting, see e.g. [3, 8, 14, 15, 7]), which led to the famous SABR volatility smile formula, are lost. A partial remedy here is to move from large deviations to (higher order) moderate deviations, which restores analytic tractability and still captures the main features of the volatility smile close to the money. This method was introduced in a Markovian setting in [20]; the extension to simple rough volatility was given in [5], relying either on [19] or the above Theorem 1.4.

We next turn to non-simple rough volatility, motivated by Rosenbaum and coworkers [16, 17, 18], and consider the stochastic Itô–Volterra equation
Z_t = z + ∫₀ᵗ K(s,t)(u(Z_s) dW_s + v(Z_s) ds)
with corresponding rough SV log-price process given by
X_t = ∫₀ᵗ f(Z_s)(ρ dW_s + ρ̄ dW̄_s) − ½∫₀ᵗ f²(Z_s) ds.
(For simplicity, we here consider f, u, v to be bounded, with bounded derivatives of all orders.) For h ∈ L²([0,T]), let z_h be the unique solution to the integral equation
z_h(t) = z + ∫₀ᵗ K(s,t) u(z_h(s)) h(s) ds,
and define I₁(h) = ∫ f(z_h(s)) h(s) ds and I₂(h) = ∫ f²(z_h(s)) ds. Then we have the following extension of Theorem 1.4 (and also of [19, 38, 30]) to non-simple rough volatility:

Theorem 1.6.
Let X_t = log(S_t/S_0) be the log-price under non-simple rough SV. Then t^{H−1/2} X_t satisfies an LDP with speed t^{2H} and rate function given by
(1.19) I(x) = inf_{h∈L²([0,T])} { ½‖h‖²_{L²} + (x − ρ I₁(h))² / (2 ρ̄² I₂(h)) }.

Remark 1.7. We showed in [5, Cor. 11] (but see related results by Alòs et al. [2] and Fukasawa [24, 25]) that in the previously considered simple rough volatility models, now writing σ(·) instead of f(·), the implied volatility skew behaves, in the short time limit, as
∼ ρ (σ′(0)/σ(0)) ⟨K, 1⟩ t^{H−1/2},
where ⟨K, 1⟩ in our setting computes to c_H := √(2H)/((H+1/2)(H+3/2)). (Note the blowup t^{H−1/2} as t → 0.) A formal "freezing" approximation Z_t ≈ z + u(z)∫₀ᵗ K(s,t) dW_s = z + u(z)Ŵ_t =: σ(Ŵ_t) then suggests a skew formula in the non-simple rough volatility case of the form
ρ u(z) (f′(z)/f(z)) c_H t^{H−1/2}.
Following the approach of [5], Theorem 1.6 allows not only for a rigorous justification but also for the computation of higher order smile features, though this is not pursued in this article. In the case of classical (Markovian) stochastic volatility, H = 1/2, and specializing further to f(x) ≡ x, so that Z (resp. z) models stochastic (resp. spot) volatility, this reduces precisely to the popular skew formula of Gatheral's book [26, (7.6)], attributed therein to Medvedev–Scaillet. In the case of rough Heston, where Z models stochastic variance, cf. (5.1), we have f = √·, u = η√·, and this leads to the following (rough Heston, implied volatility) short-dated skew formula:
(ρη/(2√v)) c_H t^{H−1/2}
(multiply by 2√v to get the implied variance skew, again in agreement with Gatheral [26, p. 35]); this may be independently verified via the characteristic function obtained in [17].

Structure of the article.
In Section 2 we reduce the proofs of Theorems 1.1 and 1.3 to the key convergence issue, the subject of Section 3. In Section 4 we consider the structure for two-dimensional noise, necessary to study the asset price process. Section 5 then discusses the case of non-trivial dynamics for rough volatility. Some numerical results are presented in Section 6, followed by several appendices with technical details. From Section 3 on, all our work relies on the framework of Hairer's regularity structures. There seems to be no point in repeating all the necessary definitions and terminology, which the reader can find in [32, 31, 33, 23] and a variety of survey papers on the subject. Instead, we find it more instructive to substantiate our KPZ inspiration and, in the next section, introduce informally all relevant objects from regularity structures in this context.
1.4. Lessons from KPZ and singular SPDE theory.
The absence of a good approximation theory is a defining feature of all singular SPDEs recently considered by Hairer, Gubinelli et al. (and now many others). In particular, approximation of the noise (say, ε-mollification for the sake of argument) typically does not give rise to convergent approximations. To be specific, it is instructive to recall the universal model for fluctuations of interface growth given by the Kardar–Parisi–Zhang (KPZ) equation
∂_t u = ∂²_x u + |∂_x u|² + ξ,
with space-time white noise ξ = ξ(x,t;ω). As a matter of fact, and without going into further detail, there is a well-defined ("Cole–Hopf") Itô solution u = u(t,x;ω), but if one considers the equation with ε-mollified noise, then u = u^ε diverges as ε → 0. In this sense, there is a fundamental lack of approximation theory, and no Stratonovich solution to KPZ exists. To see the problem, take u₀ ≡ 0, so that
u = H ⋆ (|∂_x u|² + ξ),
with space-time convolution ⋆ and heat kernel
H(t,x) = (4πt)^{−1/2} exp(−x²/(4t)) 1_{t>0}.
One can proceed with Picard iteration, u = H ⋆ ξ + H ⋆ ((H′ ⋆ ξ)²) + ..., where H′ denotes the spatial derivative of the heat kernel, but there is an immediate problem with (H′ ⋆ ξ)², (naively) defined as the ε-to-zero limit of (H′ ⋆ ξ^ε)², which does not exist. However, there exists a diverging sequence (C^ε) such that, in probability, the limit
lim_{ε→0} [(H′ ⋆ ξ^ε)² − C^ε] =: (H′ ⋆ ξ)^{⋄2}
exists and defines a new object. The idea of Hairer, following the philosophy of rough paths, was then to accept
H ⋆ ξ, (H′ ⋆ ξ)^{⋄2} (and a few more)
as an enhancement of the noise (a "model"), upon which the solution depends in a pathwise robust fashion. This unlocks the seemingly fixed (and here even non-sensical) relation H ⋆ ξ → ξ → (H′ ⋆ ξ)². Loosely speaking, one has
Theorem 1.8 (Hairer). There exist diverging constants C^ε such that a Wong–Zakai result holds of the form ũ^ε → u, in probability and uniformly on compacts, where
∂_t ũ^ε = ∂²_x ũ^ε + |∂_x ũ^ε|² − C^ε + ξ^ε.
Similar results hold for a number of other singular semilinear SPDEs.
In a sense, this can be traced back to the Milstein scheme for SDEs and then rough paths: consider dY = f(Y) dW, with Y₀ = 0 for simplicity, and the second-order (Milstein) approximation
Y_{t_{i+1}} ≈ Y_{t_i} + f(Y_{t_i}) W_{t_i,t_{i+1}} + f f′(Y_{t_i}) ∫_{t_i}^{t_{i+1}} W_{t_i,s} Ẇ_s ds.
One has to unlock the seemingly fixed relation
W → Ẇ → ∫ W Ẇ ds =: 𝕎.
[Footnote: Hairer–Pardoux [35] derive the KPZ result as a special case of a Wong–Zakai result for Itô SPDEs.]
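As a concrete finite-dimensional warm-up (our own test case, not from the paper; we take Y₀ = 1 rather than 0, since the zero initial condition is degenerate here): for f(y) = σy, i.e. geometric Brownian motion, the exact Itô solution is available in closed form, and the Milstein correction, which corresponds to the Itô choice of the iterated integral above, visibly beats the Euler scheme in the strong sense:

```python
import math
import random

# Euler vs Milstein for dY = f(Y) dW with f(y) = sigma*y (our test case:
# geometric Brownian motion, where Y_t = Y0*exp(sigma*W_t - 0.5*sigma^2*t)
# is exact).  The Milstein term 0.5*f*f'*((dW)^2 - dt) is the Ito choice of
# the iterated integral discussed in the text.

random.seed(1)
sigma, Y0, T = 0.5, 1.0, 1.0
n_steps, n_paths = 200, 400
dt = T / n_steps
sqrt_dt = math.sqrt(dt)

err_euler = err_milstein = 0.0
for _ in range(n_paths):
    y_e = y_m = Y0
    w = 0.0
    for _ in range(n_steps):
        dw = random.gauss(0.0, sqrt_dt)
        w += dw
        y_e += sigma * y_e * dw
        y_m += sigma * y_m * dw + 0.5 * sigma ** 2 * y_m * (dw * dw - dt)
    exact = Y0 * math.exp(sigma * w - 0.5 * sigma ** 2 * T)
    err_euler += abs(y_e - exact)
    err_milstein += abs(y_m - exact)

err_euler /= n_paths
err_milstein /= n_paths
print("strong error  Euler:", err_euler, " Milstein:", err_milstein)
```

Halving the step size would shrink the Euler error roughly by √2 and the Milstein error roughly by 2, reflecting strong rates 1/2 and 1, respectively.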
There is a choice to be made here: for instance, the last term can be understood as the Itô integral ∫ W dW or as the Stratonovich integral ∫ W ∘ dW (and in fact there are many other choices, see e.g. the discussion in [23]). It suffices to take this thought one step further to arrive at rough path theory: accept 𝕎 as a new (analytic) object, which leads to the main (rough path) insight
SDE theory = analysis based on (W, 𝕎).
In comparison, SPDE theory à la Hairer = analysis based on the (renormalized) enhanced noise (ξ, ...).

Inside Hairer's theory: As motivation, note that the Taylor expansion (at x) of a real-valued smooth function,
f(y) = f(x) + f′(x)(y−x) + ½ f″(x)(y−x)² + ...,
can be written as an abstract polynomial ("jet") at x,
F(x) := f(x) 1 + g(x) X + h(x) X² + ...,
with, necessarily, g = f′, h = f″/2, .... If we "realize" these abstract symbols again as honest monomials, i.e. Π_x : X^k ↦ (·−x)^k, and extend Π_x linearly, then we recover the full Taylor expansion:
Π_x[F(x)](·) = f(x) + g(x)(·−x) + h(x)(·−x)² + ....
Hairer looks for solutions of this form: at every space-time point a jet is attached, which in the case of KPZ turns out, after solving an abstract fixed point problem, to be of the form
U(x,s) = u(x,s) 1 + ⋯ + v(x,s) X + ⋯,
where the omitted terms involve the (coloured) tree symbols representing the enhanced noise. As before, every symbol is given concrete meaning by "realizing" it as an honest function (or Schwartz distribution). In particular, the noise symbol is realized as
(1.20) H ⋆ ξ^ε (mollified noise), or H ⋆ ξ (noise),
and then, more interestingly, the enhanced-noise symbol as
(1.21) H ⋆ ((H′ ⋆ ξ^ε)²) (canonically enhanced mollified noise), or H ⋆ [(H′ ⋆ ξ^ε)² − C^ε] (renormalized ∼), or H ⋆ (H′ ⋆ ξ)^{⋄2} (renormalized enhanced noise).
This realization map is called a "model" and captures exactly a typical, but otherwise fixed, realization of the noise (mollified or not) together with some enhancement thereof, renormalized or not. For instance, writing Π̂_{x,s} for the realization map for the renormalized enhanced noise, one has
Π̂_{x,s}[U(x,s)](·) = u(x,s) + H ⋆ ξ|₍∗₎ + H ⋆ (H′ ⋆ ξ)^{⋄2}|₍∗₎ + ...,
where (∗) indicates suitable centering at (x,s). Mind that U takes values in a (finite-dimensional) linear space spanned by (sufficiently many) symbols,
U(x,s) ∈ ⟨..., 1, X, ...⟩ =: T.
[Footnote: In this section only, following [23], symbols are coloured.]
The map (x,s) ↦ U(x,s) is an example of a modelled distribution; the precise definition is a mix of suitable analytic and algebraic conditions (similar to the notion of a controlled rough path). The analysis requires keeping track of the degree (a.k.a. homogeneity) of each symbol: the symbol representing H ⋆ ξ, for instance, has degree 1/2 − κ (related to the Hölder regularity of the realized object one has in mind), |X| = 1, |X²| = 2, etc. All these degrees are collected in an index set. Last but not least, in order to compare jets at different points (think of re-expanding (·−x)^k around another basepoint), one uses a group of linear maps on T, called the structure group; and the reconstruction map uniquely maps modelled distributions to functions / Schwartz distributions. (This can be seen as a generalization of the sewing lemma, the essence of rough integration, see e.g. [23], which turns a collection of sufficiently compatible local expansions into one function / Schwartz distribution.) In the KPZ context, the (Cole–Hopf Itô) solution is then indeed obtained as reconstruction of the abstract (modelled distribution) solution U.

Acknowledgment:
The authors acknowledge financial support from DFG research grants BA5484/1 (CB, BS) and FR2943/2 (PKF, BS), the ERC via Grant CoG-683166 (PKF), the ANR via Grant ANR-16-CE40-0020-01 (PG) and the DFG Research Training Group RTG 1845 (JM). Participants of Global Derivatives 2017 (Barcelona) and the Gatheral 60th Birthday conference (CIMS, NYU) are thanked for their feedback.
2. Reduction of Theorems 1.1 and 1.3
In the context of these theorems, we have
(2.1) S_t = S_0 exp[∫₀ᵗ f(Ŵ_s) dB_s − ½∫₀ᵗ f²(Ŵ_s) ds],
where we recall that
∫₀ᵗ f(Ŵ) dB = ρ ∫₀ᵗ f(Ŵ) dW + ρ̄ ∫₀ᵗ f(Ŵ) dW̄.
All approximations W^ε, W̄^ε and B^ε ≡ ρW^ε + ρ̄W̄^ε converge uniformly to the obvious limits, so that it suffices to understand the convergence of the stochastic integrals. Note that Ŵ is heavily correlated with W but independent of W̄. The difficult, interesting part is then indeed (1.16), i.e.
(2.2) ∫₀ᵗ f(Ŵ^ε) dW^ε − ∫₀ᵗ C^ε(s) f′(Ŵ^ε_s) ds → ∫₀ᵗ f(Ŵ) dW,
which is the purpose of Theorem 3.24. For the other part, due to independence, no correction terms arise and we have (with details left to the reader) ∫₀ᵗ f(Ŵ^ε) dW̄^ε → ∫₀ᵗ f(Ŵ) dW̄, with convergence in probability and uniformly on compacts in t. The convergence result of Theorem 1.1 then follows readily.
As for pricing, Theorem 1.3, consider the call payoff (S_0 exp[∫₀ᵀ σ(t,ω) dB_t − ½∫₀ᵀ σ²(t,ω) dt] − K)₊. An elementary conditioning argument (first used by Romano–Touzi in the context of Markovian SV models) w.r.t. W then shows that the call price is given as the expectation of
C_BS(S_0 exp(ρ ∫₀ᵀ σ(t,ω) dW − ½ρ² ∫₀ᵀ σ²(t,ω) dt), K; ρ̄² ∫₀ᵀ σ²(t,ω) dt).
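The conditioning argument is easy to check numerically in the degenerate case σ(t,ω) ≡ σ₀ (plain Black–Scholes; our own test case, not a rough model): averaging the conditional Black–Scholes prices over the W-path must return the unconditional Black–Scholes price, since the total variance ρ²σ₀²T + ρ̄²σ₀²T = σ₀²T is recovered:

```python
import math
import random

# Romano-Touzi mixing check in the flat case sigma(t, omega) == sigma0 (our
# test case, for illustration): conditional Black-Scholes prices, averaged
# over W, recover C_BS(S0, K; sigma0^2 * T) in the large-sample limit.

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def C_BS(S0, K, v):  # call price, zero rates, total variance v
    if v <= 0.0:
        return max(S0 - K, 0.0)
    d1 = (math.log(S0 / K) + 0.5 * v) / math.sqrt(v)
    return S0 * norm_cdf(d1) - K * norm_cdf(d1 - math.sqrt(v))

random.seed(11)
S0 = K = 1.0
sigma0, T, rho = 0.2, 1.0, -0.7
rho_bar_sq = 1.0 - rho * rho

n_paths = 20000
acc = 0.0
for _ in range(n_paths):
    WT = random.gauss(0.0, math.sqrt(T))
    spot = S0 * math.exp(rho * sigma0 * WT - 0.5 * rho ** 2 * sigma0 ** 2 * T)
    acc += C_BS(spot, K, rho_bar_sq * sigma0 ** 2 * T)
mixed = acc / n_paths

print("mixed estimator:", mixed, " Black-Scholes:", C_BS(S0, K, sigma0 ** 2 * T))
```

Note the variance-reduction benefit alluded to in the text: only W is sampled, while the W̄-randomness is integrated out analytically.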
Specializing to the case σ = f(Ŵ), in combination with Theorem 3.24, this yields Theorem 1.3. Remark that extensions to non-simple RV are immediate from suitable extensions of Theorem 3.24, as discussed in Section 5.2.

3. The rough pricing regularity structure
In this section we develop the approximation theory for integrals of the type ∫ f(Ŵ) dW. In the first part we present the regularity structure and the associated models we will use. In the second part we apply the reconstruction theorem from regularity structures to conclude our main result, Theorem 3.24.

3.1. Basic pricing setup.
We are given a Hurst parameter H ∈ (0, 1/2] for the fBm Ŵ, and fix an arbitrary κ ∈ (0, H) and an integer M ≥ max{m ∈ ℕ | m(H−κ) − 1/2 − κ ≤ 0}, so that
(3.1) (M+1)(H−κ) − 1/2 − κ > 0.
At this stage, we can introduce the "level-(M+1)" model space
(3.2) T = ⟨{Ξ, ΞI(Ξ), ..., ΞI(Ξ)^M, 1, I(Ξ), ..., I(Ξ)^M}⟩,
where ⟨...⟩ denotes the vector space generated by the (purely abstract) symbols in {...}. We will sometimes write S = S(M) := {Ξ, ΞI(Ξ), ..., ΞI(Ξ)^M, 1, I(Ξ), ..., I(Ξ)^M}, so that T = T(M) = ⊕_{τ∈S} ℝτ.

Remark 3.1. It is useful here and in the sequel to consider as a sanity check the special case H = 1/2: with α := 1/2 − κ < 1/2 (and M = 1), condition (3.1) is precisely the familiar rough path condition α > 1/3.

The intuitive meaning of the symbols in S is as follows: Ξ should be understood as an abstract representation of the white noise ξ belonging to the Brownian motion W, i.e. ξ = Ẇ, where the derivative is taken in the distributional sense. (Note that since we set W(x) = 0 for x ≤ 0, we have ξ(ϕ) = 0 for ϕ ∈ C_c^∞((−∞, 0)).) The symbol I(...) has the intuitive meaning "integration against the Volterra kernel", so that I(Ξ) represents the integration of white noise against the Volterra kernel,
√(2H) ∫₀ᵗ |t−r|^{H−1/2} dW(r),
which is nothing but the fractional Brownian motion Ŵ(t). Symbols like ΞI(Ξ)^m = Ξ·I(Ξ)·...·I(Ξ) or I(Ξ)^m = I(Ξ)·...·I(Ξ) should be read as products of the objects above. These interpretations of the symbols generating T will be made rigorous by the model (Π, Γ) in the next subsection. Every symbol in S is assigned a homogeneity, which we define by
|ΞI(Ξ)^m| = −1/2 − κ + m(H−κ), m ≥ 0,
|I(Ξ)^m| = m(H−κ), m > 0,
|1| = 0.
We collect the homogeneities of elements of S in a set A := {|τ| : τ ∈ S}, whose minimum is |Ξ| = −1/2 − κ.
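Condition (3.1) and the homogeneities above are easy to tabulate. The following sketch (ours; symbol names are plain-text stand-ins) computes the minimal level M for given (H, κ) and lists S(M) with degrees; note how lower H forces a larger model space, while H = 1/2 recovers the familiar rough-path level M = 1:

```python
# Minimal level M from condition (3.1), and the homogeneities of the symbols
# in S(M), cf. (3.2).  Illustration only; symbol names stand in for Xi, I(Xi).

def level(H, kappa):
    """Largest m in N with m*(H - kappa) - 1/2 - kappa <= 0, i.e. the
    minimal admissible M in (3.1)."""
    m = 0
    while (m + 1) * (H - kappa) - 0.5 - kappa <= 0:
        m += 1
    return m

def homogeneities(H, kappa):
    M = level(H, kappa)
    degs = {"1": 0.0}
    for m in range(M + 1):                    # Xi I(Xi)^m, m >= 0
        degs["Xi I(Xi)^%d" % m] = -0.5 - kappa + m * (H - kappa)
    for m in range(1, M + 1):                 # I(Xi)^m, m > 0
        degs["I(Xi)^%d" % m] = m * (H - kappa)
    return degs

print(level(0.5, 0.01))    # Brownian sanity check: M = 1
print(level(0.1, 0.01))    # rough case: more symbols needed
print(sorted(homogeneities(0.1, 0.01).values()))
```

The returned M also satisfies (3.1) by construction: (M+1)(H−κ) − 1/2 − κ > 0.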
Note that the homogeneities are multiplicative in the sense that |τ·τ′| = |τ| + |τ′| for τ, τ′ ∈ S. At last, our regularity structure comes with a structure group G, an (abstract) group of linear operators on the model space T which should satisfy Γτ − τ ∈ ⊕_{τ′∈S: |τ′|<|τ|} ℝτ′ and Γ1 = 1, for τ ∈ S and Γ ∈ G. We will choose G = {Γ_h | h ∈ (ℝ, +)}, given by
Γ_h 1 = 1, Γ_h Ξ = Ξ, Γ_h I(Ξ) = I(Ξ) + h 1,
and Γ_h(τ′·τ) = Γ_h τ′ · Γ_h τ for τ′, τ ∈ S for which τ·τ′ ∈ S is defined.

The limiting model (Π, Γ). Let W be a Brownian motion on ℝ₊ and extend it to all of ℝ by requiring W(x) = 0 for x ≤ 0.
We will frequently use the notations
(3.3) ∫₀ᵗ f(t) dW(t), ∫₀ᵗ f(t) ⋄ dW(t),
which denote the Itô integral and the Skorokhod integral, respectively (the latter boils down to an Itô integral whenever the integrand is adapted). From W we now construct the fractional Riemann–Liouville Brownian motion Ŵ with Hurst index H ∈ (0, 1/2] as
Ŵ(t) = Ẇ ⋆ K(t) = √(2H) ∫₀ᵗ (t−r)^{H−1/2} dW(r),
where K(t) = √(2H) 1_{t>0} t^{H−1/2} denotes the Volterra kernel. We also write K(s,t) := K(t−s). To give a meaning to the product terms ΞI(Ξ)^k we follow the ideas from rough paths and define an "iterated integral" for s, t ∈ ℝ, s ≤ t, as
(3.4) 𝕎^m(s,t) = ∫ₛᵗ (Ŵ(r) − Ŵ(s))^m dW(r).
𝕎^m(s,t) satisfies a modification of Chen's relation:

Lemma 3.2. 𝕎^m as defined in (3.4) satisfies
(3.5) 𝕎^m(s,t) = 𝕎^m(s,u) + ∑_{l=0}^m \binom{m}{l} (Ŵ(u) − Ŵ(s))^l 𝕎^{m−l}(u,t)
for s, u, t ∈ ℝ, s ≤ u ≤ t.
Proof. Direct consequence of the binomial theorem. □
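Lemma 3.2 can also be checked numerically: replacing the Itô integral in (3.4) by a left-point Riemann sum on a fixed grid (our own discretization, for illustration), the relation (3.5) becomes an exact algebraic identity, by the same binomial-theorem argument:

```python
import math
import random

# Discrete analogue of (3.4):  WW(m, a, b) = sum_{a <= j < b} (WH_j - WH_a)^m dW_j
# on a grid, with WH a left-point discretization of the Riemann-Liouville fBm.
# The modified Chen relation (3.5) then holds exactly (binomial theorem).

random.seed(3)
H, n = 0.3, 256
dt = 1.0 / n
t = [i * dt for i in range(n + 1)]

dW = [random.gauss(0.0, math.sqrt(dt)) for _ in range(n)]
WH = [sum(math.sqrt(2 * H) * (t[i] - t[j]) ** (H - 0.5) * dW[j] for j in range(i))
      for i in range(n + 1)]

def WW(m, a, b):
    return sum((WH[j] - WH[a]) ** m * dW[j] for j in range(a, b))

m, a, u, b = 3, 10, 100, 200          # grid indices for s <= u <= t
lhs = WW(m, a, b)
rhs = WW(m, a, u) + sum(math.comb(m, l) * (WH[u] - WH[a]) ** l * WW(m - l, u, b)
                        for l in range(m + 1))
print(lhs, rhs)
```

The l = m term uses 𝕎⁰(u,t) = W_t − W_u, exactly as in (3.5); agreement holds up to floating-point roundoff.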
We extend the domain of $\mathbb{W}^m$ to all of $\mathbb{R}^2$ by imposing Chen's relation for all $s,u,t \in \mathbb{R}$, i.e. we set for $t \le s$
$$\mathbb{W}^m(s,t) = -\sum_{l=0}^m \binom{m}{l}\big(\widehat{W}(t) - \widehat{W}(s)\big)^l\, \mathbb{W}^{m-l}(t,s). \tag{3.6}$$
We are now in the position to define a model $(\Pi,\Gamma)$ that gives a rigorous meaning to the interpretation we gave above for $\Xi, \mathcal{I}(\Xi), \Xi\mathcal{I}(\Xi), \dots$. Recall that in the theory of regularity structures a model is a collection of linear maps $\Pi_s : T \to C_c^\infty(\mathbb{R})'$ and operators $\Gamma_{st} \in G$, for indices $s,t \in \mathbb{R}$, that satisfy
$$\Pi_t = \Pi_s\Gamma_{st}, \tag{3.7}$$
$$|\Pi_s\tau(\varphi_s^\lambda)| \lesssim \lambda^{|\tau|}, \tag{3.8}$$
$$\Gamma_{st}\tau = \tau + \sum_{\tau' \in S:\ |\tau'| < |\tau|} c_{\tau'}(s,t)\,\tau', \qquad |c_{\tau'}(s,t)| \lesssim |s-t|^{|\tau| - |\tau'|}, \tag{3.9}$$
where the bounds hold uniformly for $\tau \in S$, for $s,t$ in a compact set, and for $\varphi_s^\lambda := \lambda^{-1}\varphi(\lambda^{-1}(\cdot - s))$ with $\lambda \in (0,1]$ and $\varphi \in C^1$ with compact support in the ball $B(0,1)$. We now define the Itô model $(\Pi,\Gamma)$, and (occasionally) write $(\Pi^{\text{Itô}}, \Gamma^{\text{Itô}})$ to avoid confusion with a generic model, also denoted by $(\Pi,\Gamma)$. It renders precise our interpretations of the elements of $S$:
$$\Pi_s \mathbf{1} = 1, \qquad\qquad \Gamma_{ts}\mathbf{1} = \mathbf{1},$$
$$\Pi_s \Xi = \dot{W}, \qquad\qquad \Gamma_{ts}\Xi = \Xi,$$
$$\Pi_s \mathcal{I}(\Xi)^m = \big(\widehat{W}(\cdot) - \widehat{W}(s)\big)^m, \qquad \Gamma_{ts}\mathcal{I}(\Xi) = \mathcal{I}(\Xi) + \big(\widehat{W}(t) - \widehat{W}(s)\big)\mathbf{1},$$
$$\Pi_s \Xi\mathcal{I}(\Xi)^m = \Big\{ t \mapsto \tfrac{d}{dt}\mathbb{W}^m(s,t) \Big\}, \qquad \Gamma_{ts}(\tau\tau') = \Gamma_{ts}\tau \cdot \Gamma_{ts}\tau', \ \text{ for } \tau,\tau' \in S \text{ with } \tau\tau' \in S.$$
We extend both maps from $S$ to $T$ by imposing linearity.

Lemma 3.3.
The pair $(\Pi,\Gamma)$ as defined above defines (a.s.) a model on $(T,A)$.

Proof. The only symbol in $S$ on which (3.7) is not straightforward is $\Xi\mathcal{I}(\Xi)^m$, where the statement follows by Chen's relation. The bounds (3.8) and (3.9) follow for $\mathbf{1}$ trivially, and for $\mathcal{I}(\Xi)^m$ by the $(H-\kappa')$-Hölder regularity of $\widehat{W}$, $\kappa' \in (0,H)$. It is further straightforward to check condition (3.9) by using the rule $\Gamma_{ts}(\tau\tau') = \Gamma_{ts}\tau\cdot\Gamma_{ts}\tau'$, so that we are only left with the task to bound $\Pi_s\Xi\mathcal{I}(\Xi)^m(\varphi^\lambda_s)$. Following along the lines of the proof of [23, Theorem 3.1] it follows that $|\mathbb{W}^m(s,t)| \le C|s-t|^{mH+1/2-(m+1)\kappa}$ (where $C>0$ is a random constant with $C \in \bigcap_{p<\infty}L^p$), so that
$$\big|\Pi_s\Xi\mathcal{I}(\Xi)^m(\varphi^\lambda_s)\big| = \Big|\int \big(\varphi^\lambda_s\big)'(t)\,\mathbb{W}^m(s,t)\,dt\Big| \le C\int \lambda^{-2}\big|\varphi'\big(\lambda^{-1}(t-s)\big)\big|\,|s-t|^{mH+1/2-(m+1)\kappa}\,dt \le C\lambda^{mH-1/2-(m+1)\kappa} = C\lambda^{|\Xi\mathcal{I}(\Xi)^m|}. \qquad\square$$

As we will see below in Subsection 3.2, this model is the toolbox from which we can build pathwise Itô integrals of the type $\int_0^t f(r,\widehat{W}(r))\,dW(r)$. For an approximation theory for such expressions we are in need of a comparable setup that describes approximations, which will be achieved by introducing a model $(\Pi^\varepsilon,\Gamma^\varepsilon)$.

The approximating model $(\Pi^\varepsilon,\Gamma^\varepsilon)$. The whole definition of the model $(\Pi,\Gamma)$ is based on the object $\dot{W}$. It is therefore natural to build an approximating model by replacing $\dot{W}$ by some modification $\dot{W}^\varepsilon$ that converges (as a distribution) to $\dot{W}$ as $\varepsilon \to 0$. The construction of $\dot{W}^\varepsilon$ will be based on an object $\delta_\varepsilon$ which should be thought of as an approximation to the Dirac delta distribution. Our purpose is to build $\delta_\varepsilon$ from wavelets, which can be as irregular as the Haar functions. We find it therefore convenient to allow $\delta_\varepsilon$ to take values in the Besov space $B^\beta_{1,\infty}(\mathbb{R})$, $\beta > 1/2+\kappa$, which covers functions like $\mathbf{1}_{[0,1]} \in B^1_{1,\infty}(\mathbb{R})$.

Remark 3.4. We shortly recall the definition of the Besov space $B^\beta_{1,\infty}(\mathbb{R})$ (see for example [41]), although this will here only be used explicitly in the proof of Lemma 3.16 in the appendix. Given a compactly supported wavelet basis
$$\phi_y = \phi(\cdot - y),\ \ y \in \mathbb{Z}, \qquad \psi_{jy} = 2^{j/2}\psi\big(2^j(\cdot - y)\big),\ \ j \ge 0,\ y \in 2^{-j}\mathbb{Z},$$
we set
$$\|g\|_{B^\beta_{1,\infty}} := \sum_{y\in\mathbb{Z}}\big|(g,\phi_y)_{L^2}\big| + \sup_{j\ge 0}\, 2^{j\beta}\sum_{y\in 2^{-j}\mathbb{Z}} 2^{-j/2}\,\big|(g,\psi_{jy})_{L^2}\big|$$
and define $B^\beta_{1,\infty}(\mathbb{R})$ to be the set of those $L^1$ functions $g$ (or distributions in $(C_c^{-\lceil\beta\rceil+1}(\mathbb{R}))'$ if $\beta \le 0$) for which this norm is finite.
Definition 3.5.
In the following we call $\delta_\varepsilon : \mathbb{R}^2 \to \mathbb{R}$ a measurable, bounded function with the following properties:
• $\delta_\varepsilon(x,y) = \delta_\varepsilon(y,x)$ for all $x,y \in \mathbb{R}$;
• the map $\mathbb{R} \ni x \mapsto \delta_\varepsilon(x,\cdot) \in B^\beta_{1,\infty}(\mathbb{R})$ is bounded and measurable for some $\beta > -|\Xi| = 1/2 + \kappa$;
• $\int_{\mathbb{R}} \delta_\varepsilon(x,y)\,dy = 1$;
• $\sup_{\mathbb{R}^2} |\delta_\varepsilon| \lesssim \varepsilon^{-1}$;
• $\operatorname{supp}\delta_\varepsilon(x,\cdot) \subseteq B(x, c\cdot\varepsilon)$ for any $x \in \mathbb{R}$ and some $c > 0$.

Example 3.6.
There are two examples which are of particular interest for our purposes.
• We say that $\delta_\varepsilon$ "comes from a mollifier", by which we mean that there is a symmetric, compactly supported $L^\infty \cap B^\beta_{1,\infty}(\mathbb{R})$-function $\rho$, which integrates to $1$, such that
$$\delta_\varepsilon(x,y) = \varepsilon^{-1}\cdot\rho\big(\varepsilon^{-1}(y-x)\big).$$
• A further interesting example is the case where $\delta_\varepsilon$ "comes from a wavelet basis". Consider only $\varepsilon = 2^{-N}$ and choose compactly supported $L^\infty\cap B^\beta_{1,\infty}$-valued father wavelets $(\phi_{k,N})_{k\in\mathbb{Z}}$ (e.g. the Haar father wavelets $\phi_{k,N} = 2^{N/2}\cdot\mathbf{1}_{[k2^{-N},(k+1)2^{-N})}$) and set
$$\delta_\varepsilon(x,y) = \sum_{k\in\mathbb{Z}}\phi_{k,N}(x)\phi_{k,N}(y).$$
Note that we could also add some generations of mother wavelets in this choice.

Note that (locally) $\dot{W}$ is contained in $B^{|\Xi|}_{\infty,\infty}(\mathbb{R})$ (recall: $|\Xi| = -1/2-\kappa$), so that due to $B^{|\Xi|}_{\infty,\infty}(\mathbb{R}) \subseteq (B^\beta_{1,\infty}(\mathbb{R}))'$ we can set
$$\dot{W}^\varepsilon(t) := \langle \dot{W}, \delta_\varepsilon(t,\cdot)\rangle\,\mathbf{1}_{\mathbb{R}_+}(t),$$
which is a Gaussian process, pathwise measurable and locally bounded. For (possibly stochastic) integrands $f$ we introduce the notations
$$\int_0^t f(r)\,dW^\varepsilon(r) := \int_0^t f(r)\,\dot{W}^\varepsilon(r)\,dr$$
and, if $f$ takes values in some (non-homogeneous) Wiener chaos induced by $\dot{W}$, we also introduce
$$\int_0^t f(r)\diamond dW^\varepsilon(r) := \int_0^t f(r)\diamond\dot{W}^\varepsilon(r)\,dr, \tag{3.10}$$
where $\diamond$ denotes the Wick product. Note that these two objects do in general not coincide. The motive for using the same symbol "$\diamond$" as in (3.3) is that (3.10) can be seen as the Skorokhod integral with respect to the Gaussian stochastic measure induced by the Gaussian process $\dot{W}^\varepsilon$ (for the notion of Wick products and Skorokhod integrals and their links see e.g. [39]).
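For the Haar choice in Example 3.6 the defining properties of Definition 3.5 can be verified directly. The snippet below (our own check, with $N = 4$, hence $\varepsilon = 2^{-4}$) tests symmetry, unit mass, the sup bound $\varepsilon^{-1}$ and the support condition on a grid:

```python
import numpy as np

N = 4
eps = 2.0 ** (-N)

def delta_eps(x, y):
    """Haar-basis kernel: delta_eps(x,y) = sum_k phi_kN(x) phi_kN(y)
    = 2^N * 1{x and y lie in the same dyadic interval of length 2^-N}."""
    return 2.0 ** N * (np.floor(x * 2 ** N) == np.floor(y * 2 ** N))

x = 0.3
y = (np.arange(100_000) + 0.5) / 100_000              # midpoint grid on [0, 1]

print(np.mean(delta_eps(x, y)))                       # -> 1.0   (unit mass)
print(delta_eps(0.3, 0.26) == delta_eps(0.26, 0.3))   # -> True  (symmetry)
print(np.max(delta_eps(x, y)) <= 1.0 / eps)           # -> True  (sup bound eps^-1)
supp = y[delta_eps(x, y) > 0]
print(np.max(np.abs(supp - x)) <= eps)                # -> True  (support in B(x, c*eps), c = 1)
```

The same checks pass for any mollifier-based $\delta_\varepsilon$ of the first kind, with $c$ determined by the support of $\rho$.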
We now define an approximate fractional Brownian motion by setting
$$\widehat{W}^\varepsilon(t) = \big(K \star \dot{W}^\varepsilon\big)(t) = \sqrt{2H}\int_0^t |t-r|^{H-1/2}\,dW^\varepsilon(r),$$
which has the expected regularity, as shown in the following lemma.

Lemma 3.7. On every compact time interval $[0,T]$ we have the estimates
$$\big|\widehat{W}^\varepsilon(t) - \widehat{W}^\varepsilon(s)\big| \lesssim C_\varepsilon\,|t-s|^{H-\kappa'},$$
$$\big|\widehat{W}^\varepsilon(t) - \widehat{W}^\varepsilon(s) - \big(\widehat{W}(t) - \widehat{W}(s)\big)\big| \lesssim C\,|t-s|^{(1-\delta)(H-\kappa')}\,\varepsilon^{\delta\kappa'},$$
uniformly in $\varepsilon \in (0,1]$, for any $\delta \in (0,1)$ and $\kappa' \in (0,H)$, and where $C_\varepsilon, C > 0$ are random constants that are (uniformly) bounded in $L^p$ for $p \in [1,\infty)$.

Proof. The proof is elementary but a bit bulky and therefore postponed to the appendix. $\square$
Finally we can give the definition of the approximating model $(\Pi^\varepsilon,\Gamma^\varepsilon)$, the "canonical" model built from the approximate (and hence regular) noise $\dot{W}^\varepsilon$:
$$\Pi^\varepsilon_s \mathbf{1} = 1, \qquad\qquad \Gamma^\varepsilon_{st}\mathbf{1} = \mathbf{1},$$
$$\Pi^\varepsilon_s \Xi = \dot{W}^\varepsilon, \qquad\qquad \Gamma^\varepsilon_{st}\Xi = \Xi,$$
$$\Pi^\varepsilon_s \mathcal{I}(\Xi)^m = \big(\widehat{W}^\varepsilon(\cdot) - \widehat{W}^\varepsilon(s)\big)^m, \qquad \Gamma^\varepsilon_{st}\mathcal{I}(\Xi) = \mathcal{I}(\Xi) + \big(\widehat{W}^\varepsilon(t) - \widehat{W}^\varepsilon(s)\big)\mathbf{1},$$
$$\Pi^\varepsilon_s \Xi\mathcal{I}(\Xi)^m = \big(\widehat{W}^\varepsilon(\cdot) - \widehat{W}^\varepsilon(s)\big)^m\,\dot{W}^\varepsilon(\cdot), \qquad \Gamma^\varepsilon_{st}(\tau\tau') = \Gamma^\varepsilon_{st}\tau\cdot\Gamma^\varepsilon_{st}\tau',\ \ \tau,\tau',\tau\cdot\tau' \in S.$$

Lemma 3.8. The pair $(\Pi^\varepsilon,\Gamma^\varepsilon)$ as defined above is a model on $(T,A)$.

Proof. The identity $\Pi^\varepsilon_t = \Pi^\varepsilon_s\Gamma^\varepsilon_{st}$ is straightforward to check. The bounds (3.8) and (3.9) on $\Gamma^\varepsilon_{st}$ and on $\Pi^\varepsilon_s\mathcal{I}(\Xi)^m$ follow from the regularity of $\widehat{W}^\varepsilon$ as proved in Lemma 3.7. The blow-up of $\Pi^\varepsilon_s\Xi\mathcal{I}(\Xi)^m(\varphi^\lambda_s)$, however, is even better than we need, since by the choice of $\delta_\varepsilon$ we have $|\dot{W}^\varepsilon| \le C_\varepsilon$, for some random constant $C_\varepsilon$, on compact sets. $\square$

The definition of this model is justified by the fact that application of the reconstruction operator (as in Lemma 3.22) yields integrals
$$\int_0^t f\big(r, \widehat{W}^\varepsilon(r)\big)\,dW^\varepsilon(r). \tag{3.11}$$
As we pointed out already in Section 1, there is no hope that integrals of this type converge as $\varepsilon \to 0$ when $H < 1/2$. This can be cured by working with a renormalized model $(\widehat{\Pi}^\varepsilon,\Gamma^\varepsilon)$ instead.

The renormalized model $\widehat{\Pi}^\varepsilon$. From the perspective of regularity structures the fundamental reason why integrals like (3.11) fail to converge to
$$\int_0^t f\big(r, \widehat{W}(r)\big)\,dW(r)$$
lies in the fact that the corresponding models do not satisfy $(\Pi^\varepsilon,\Gamma^\varepsilon) \to (\Pi,\Gamma)$ in a suitable norm. To see what is going on we will first rewrite $\Pi_s\Xi\mathcal{I}(\Xi)^k$.

Lemma 3.9.
For $\varphi \in C_c^\infty(\mathbb{R})$, $s \in \mathbb{R}$, $m \in \{0,\dots,M\}$ we have
$$\Pi_s\Xi\mathcal{I}(\Xi)^m(\varphi) = \int_0^\infty \varphi(t)\big(\widehat{W}(t) - \widehat{W}(s)\big)^m \diamond dW(t) \ -\ m\int_0^\infty \varphi(t)\,K(s-t)\big(\widehat{W}(t) - \widehat{W}(s)\big)^{m-1}\,dt,$$
where $\diamond$ denotes the Skorokhod integral and $K(t) = \sqrt{2H}\,\mathbf{1}_{t>0}\,t^{H-1/2}$ denotes the Volterra kernel. Note that in the second term the domain of integration is actually $(0,s)$.

Remark 3.10. Our notation reflects a close relation between the Skorokhod integral and the Wick product. Indeed, when $g = \sum X_s\mathbf{1}_{[s,t]}$, with summation over a finite partition of $[0,T]$, and each $X_s$ a (non-adapted) random variable in a finite Wiener–Itô chaos, it follows from [39, Thm 7.40] that $\int g\,\delta W = \sum X_s \diamond W_{s,t}$. Passage to $L^2$-limits is then standard. See also [44] and the references therein.

Proof. We prove this by re-expressing $\mathbb{W}^k(s,t)$. For $s < t$ we have already
$$\mathbb{W}^k(s,t) = \int_s^t dW(r) \diamond \big(\widehat{W}(r) - \widehat{W}(s)\big)^k,$$
so that it remains to see what happens for $t < s$. With relation (3.6) we have in this case
$$\mathbb{W}^k(s,t) = -\sum_{l=0}^k \binom{k}{l}\big(\widehat{W}(t) - \widehat{W}(s)\big)^l \cdot \int_t^s dr\, \dot{W}(r)\diamond\big(\widehat{W}(r) - \widehat{W}(t)\big)^{k-l};$$
expanding the Wick products and differentiating in $t$, then testing against $\varphi$, yields the claimed identity. $\square$
Lemma 3.11.
For $\varphi \in C_c^\infty(\mathbb{R})$, $s \in \mathbb{R}$, $m \in \{0,\dots,M\}$ we have
$$\Pi^\varepsilon_s\Xi\mathcal{I}(\Xi)^m(\varphi) = \int_0^\infty \varphi(t)\big(\widehat{W}^\varepsilon(t) - \widehat{W}^\varepsilon(s)\big)^m \diamond dW^\varepsilon(t) \ -\ m\int_0^\infty \varphi(t)\,K^\varepsilon(s,t)\big(\widehat{W}^\varepsilon(t) - \widehat{W}^\varepsilon(s)\big)^{m-1}\,dt$$
$$\qquad\qquad +\ m\int_0^\infty \varphi(t)\,K^\varepsilon(t,t)\big(\widehat{W}^\varepsilon(t) - \widehat{W}^\varepsilon(s)\big)^{m-1}\,dt,$$
where $\diamond$ is defined as in (3.10) and where
$$K^\varepsilon(u,v) := \mathbb{E}\big[\widehat{W}^\varepsilon(u)\,\dot{W}^\varepsilon(v)\big] = \mathbf{1}_{u,v\ge 0}\int_0^\infty\!\!\int_0^\infty \delta_\varepsilon(v,x_1)\,\delta_\varepsilon(x_1,x_2)\,K(u-x_2)\,dx_1\,dx_2. \tag{3.13}$$
Proof.
Using that for jointly Gaussian $V, U$ we have $VU^m = V\diamond U^m + m\,\mathbb{E}[VU]\,U^{m-1}$ (a special case of (3.12)), we can rewrite
$$\Pi^\varepsilon_s\Xi\mathcal{I}(\Xi)^m(\varphi) = \int_0^\infty \varphi(t)\big(\widehat{W}^\varepsilon(t) - \widehat{W}^\varepsilon(s)\big)^m \diamond dW^\varepsilon(t)$$
$$\qquad +\ m\int_0^\infty dt\,\varphi(t)\,\mathbb{E}\big[\dot{W}^\varepsilon(t)\big(\widehat{W}^\varepsilon(t) - \widehat{W}^\varepsilon(s)\big)\big]\big(\widehat{W}^\varepsilon(t) - \widehat{W}^\varepsilon(s)\big)^{m-1}.$$
Inserting $\mathbb{E}[\dot{W}^\varepsilon(t)(\widehat{W}^\varepsilon(t) - \widehat{W}^\varepsilon(s))] = K^\varepsilon(t,t) - K^\varepsilon(s,t)$ shows the identity. $\square$

Comparing the expressions in Lemmas 3.11 and 3.9 we see that we morally have to subtract
$$m\int \varphi(t)\,K^\varepsilon(t,t)\big(\widehat{W}^\varepsilon(t) - \widehat{W}^\varepsilon(s)\big)^{m-1}\,dt$$
from the model, which will give us a new model $\widehat{\Pi}^\varepsilon$. Of course we have to be careful that this step preserves "Chen's relation" $\widehat{\Pi}^\varepsilon_t\Gamma^\varepsilon_{ts} = \widehat{\Pi}^\varepsilon_s$; see Theorem 3.13 below. If we interpret $K^\varepsilon$ as an approximation to the Volterra kernel, we see that the expression
$$C_\varepsilon(t) := K^\varepsilon(t,t), \quad t \ge 0,$$
should diverge like "$\varepsilon^{H-1/2} = \infty$" in the limit $\varepsilon \to 0$. We have indeed the following upper bound.
Lemma 3.12.
For all $s,t \in \mathbb{R}$ we have $|K^\varepsilon(s,t)| \lesssim \varepsilon^{H-1/2}$.

Proof. $|K^\varepsilon(s,t)| \lesssim \varepsilon^{-2}\int_{B(t,c\varepsilon)}dx\int_{B(x,c\varepsilon)}du\,|s-u|^{H-1/2} \lesssim \varepsilon^{H-1/2}$. $\square$

Our hope is now that the new model $\widehat{\Pi}^\varepsilon$ converges to $\Pi$ in a suitable sense. Similar to [31, (2.17)] we define the distance between two models $(\Pi,\Gamma)$ and $(\widetilde{\Pi},\widetilde{\Gamma})$ on a compact time interval $[0,T]$ as
$$|||(\Pi,\Gamma);(\widetilde{\Pi},\widetilde{\Gamma})|||_T := \sup_{\substack{\operatorname{supp}\varphi\subseteq B(0,1),\,\lambda\in(0,1],\\ s\in[0,T],\,\tau\in S}} \lambda^{-|\tau|}\big|(\Pi_s - \widetilde{\Pi}_s)\tau(\varphi^\lambda_s)\big| \ +\ \sup_{\substack{t,s\in[0,T],\,\tau\in S,\\ A\ni\beta<|\tau|}} \frac{\big|\Gamma_{ts}\tau - \widetilde{\Gamma}_{ts}\tau\big|_\beta}{|t-s|^{|\tau|-\beta}}, \tag{3.14}$$
where $|\cdot|_\beta$ denotes the absolute value of the coefficient of the symbol $\tau'$ with $|\tau'| = \beta$, and where the first supremum runs over $\varphi \in C^1_c$ with $\|\varphi\|_{C^1} \le 1$. We will also need
$$\|\Pi\|_T = \sup_{\substack{\operatorname{supp}\varphi\subseteq B(0,1),\,\lambda\in(0,1],\\ s\in[0,T],\,\tau\in S}} \lambda^{-|\tau|}\,\big|\Pi_s\tau(\varphi^\lambda_s)\big|.$$
We are now ready to give the fundamental result of this subsection, which plays a key role in our approximation theory. Recall that the (minimal) homogeneity is $|\Xi| = -1/2-\kappa$, which corresponds to $W$ being Hölder continuous with exponent $1/2-\kappa$.

Theorem 3.13.
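The $\varepsilon^{H-1/2}$ blow-up of Lemma 3.12 is easy to observe numerically. Below we take $\delta_\varepsilon$ from the uniform mollifier $\rho = \mathbf{1}_{[-1/2,1/2]}$ and evaluate the diagonal $K^\varepsilon(t,t)$ of (3.13) by a midpoint rule (all discretization choices are ours); since the double integral scales exactly in $\varepsilon$, the rescaled quantity $\varepsilon^{1/2-H}K^\varepsilon(t,t)$ is constant:

```python
import numpy as np

H = 0.3
K = lambda x: np.sqrt(2 * H) * np.where(x > 0, x, 1.0) ** (H - 0.5) * (x > 0)

def Keps_diag(eps, m=400):
    """Midpoint rule for K^eps(t,t) from (3.13) with
    delta_eps(x,y) = eps^-1 rho((y-x)/eps), rho = 1_[-1/2,1/2].
    By translation invariance this is independent of t (for t > eps)."""
    xi1 = (np.arange(m) + 0.5) / m - 0.5                # nodes for x1
    xi2 = (np.arange(m + 1) + 0.5) / (m + 1) - 0.5      # nodes for x2 (offset grid)
    s = xi1[:, None] + xi2[None, :]
    # t - x2 = -eps*(xi1 + xi2); averaging = integrating against the two
    # normalized uniform densities
    return np.mean(K(-eps * s))

for eps in (0.1, 0.025, 0.00625):
    print(eps, Keps_diag(eps) * eps ** (0.5 - H))   # rescaled value: constant in eps
```

The two quadrature grids are deliberately offset so that no node hits the kernel singularity exactly.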
Define, for every $s \in [0,T]$, the linear map $\widehat{\Pi}^\varepsilon_s : T \to C_c^\infty(\mathbb{R})'$ given by, for $m \in \{1,\dots,M\}$,
$$\widehat{\Pi}^\varepsilon_s\Xi\mathcal{I}(\Xi)^m = \Pi^\varepsilon_s\Xi\mathcal{I}(\Xi)^m - m\,C_\varepsilon(\cdot)\,\Pi^\varepsilon_s\big(\mathcal{I}(\Xi)^{m-1}\big),$$
and $\widehat{\Pi}^\varepsilon_s = \Pi^\varepsilon_s$ on all remaining symbols in $S$. Then $(\widehat{\Pi}^\varepsilon,\widehat{\Gamma}^\varepsilon) := (\widehat{\Pi}^\varepsilon,\Gamma^\varepsilon)$ defines a ("renormalized") model on $(T,A)$, and on compact time intervals we have
$$\Big\|\, |||(\widehat{\Pi}^\varepsilon,\widehat{\Gamma}^\varepsilon);(\Pi,\Gamma)|||_T \,\Big\|_{L^p} \lesssim \varepsilon^{\delta\kappa} \tag{3.15}$$
for any $\delta \in (0,1)$ and $p \in [1,\infty)$. In particular, we have "almost rate $H$" for $M = M(\kappa,H)$ large enough.

Remark 3.14. In the special case of the level-2 Brownian rough path (i.e. $H = 1/2$, $M = 1$) the above result is in precise agreement with known results (even though the situation here is simpler since we are dealing with a scalar Brownian motion). More specifically, we do not see the usual (strong) rate "almost $1/2$" directly: the models are compared in a rough-path type metric built on the homogeneity $1/2 - \kappa$, which exactly leads to the rate "almost $\kappa$". Since $M = 1$ entails the condition $1/2 - \kappa > 1/3$, we see that $\kappa < 1/6$, exactly as given e.g. in [23, Ex. 10.14]. A better rate can be achieved by working with a higher-level rough path (here: $M > 1$), and indeed the special case of $H = 1/2$, but general $M$, can be seen as a consequence of [21]: at the price of working with $\sim 1/(1/2-\kappa)$ levels, one can choose $\kappa$ arbitrarily close to $1/2$. The same mechanism is at work here for $H < 1/2$.

Proof.
Since, due to Lemma 3.12, we have for fixed $\varepsilon$ that $\sup_{t\in[0,T]}|C_\varepsilon(t)| < \infty$ and $|\Pi^\varepsilon_s\mathcal{I}(\Xi)^m| \lesssim |\cdot - s|^{mH}$, the bound (3.8) is still satisfied. The modification $\widehat{\Pi}^\varepsilon_s\Xi\mathcal{I}(\Xi)^m - \Pi^\varepsilon_s\Xi\mathcal{I}(\Xi)^m$ does not lead to a violation of "Chen's relation". Indeed, using validity of (3.7) for the original model, we have
$$\widehat{\Pi}^\varepsilon_t\Gamma^\varepsilon_{ts}\big(\Xi\mathcal{I}(\Xi)^k\big) = \widehat{\Pi}^\varepsilon_t\Big(\sum_{l=0}^k\binom{k}{l}\big(\widehat{W}^\varepsilon(t)-\widehat{W}^\varepsilon(s)\big)^l\,\Xi\mathcal{I}(\Xi)^{k-l}\Big)$$
$$= \Pi^\varepsilon_s\big(\Xi\mathcal{I}(\Xi)^k\big) - \sum_{l=0}^k\binom{k}{l}\big(\widehat{W}^\varepsilon(t)-\widehat{W}^\varepsilon(s)\big)^l\,(k-l)\,C_\varepsilon(\cdot)\big(\widehat{W}^\varepsilon(\cdot)-\widehat{W}^\varepsilon(t)\big)^{k-l-1}$$
$$= \Pi^\varepsilon_s\big(\Xi\mathcal{I}(\Xi)^k\big) - k\,C_\varepsilon(\cdot)\sum_{l=0}^{k-1}\binom{k-1}{l}\big(\widehat{W}^\varepsilon(t)-\widehat{W}^\varepsilon(s)\big)^l\big(\widehat{W}^\varepsilon(\cdot)-\widehat{W}^\varepsilon(t)\big)^{k-1-l}$$
$$= \Pi^\varepsilon_s\big(\Xi\mathcal{I}(\Xi)^k\big) - k\,C_\varepsilon(\cdot)\big(\widehat{W}^\varepsilon(\cdot)-\widehat{W}^\varepsilon(s)\big)^{k-1} = \widehat{\Pi}^\varepsilon_s\big(\Xi\mathcal{I}(\Xi)^k\big).$$
We thus see that (3.7) is also satisfied after our modification, and then easily conclude that $(\widehat{\Pi}^\varepsilon,\Gamma^\varepsilon)$ is still a model on $(T,A)$. At last, the bound (3.15) is a bit technical and left to Appendix A. $\square$

3.2. Approximation and renormalization theory.
We now address the central question of how the integral $\int_0^t f(\widehat{W}^\varepsilon(r),r)\,dW^\varepsilon(r)$ has to be modified to make it converge to $\int_0^t f(\widehat{W}(r),r)\,dW(r)$. The key idea is to combine the convergence result from Theorem 3.13 with Hairer's reconstruction theorem, which we state below. We first recall the notion of a modelled distribution, compare [31, Definition 3.1]. We say that a map $F: \mathbb{R} \to T$ is in the space $\mathcal{D}^\gamma_T(\Gamma)$, $\gamma > 0$, $T > 0$, if
$$\|F\|_{\mathcal{D}^\gamma_T(\Gamma)} := \sup_{A\ni\beta<\gamma,\ s\in[0,T]}|F(s)|_\beta \ +\ \sup_{A\ni\beta<\gamma,\ s,t\in[0,T],\ s\ne t}\frac{\big|F(t) - \Gamma_{ts}F(s)\big|_\beta}{|t-s|^{\gamma-\beta}} < \infty, \tag{3.16}$$
where, as above, $|\cdot|_\beta$ denotes the absolute value of the coefficient of the symbol $\tau$ with $|\tau| = \beta$. Given two models $(\Pi,\Gamma)$ and $(\overline{\Pi},\overline{\Gamma})$ and two maps $F, \overline{F}: \mathbb{R} \to T$ it is also useful to have the notion of a distance
$$|||F;\overline{F}|||_{\mathcal{D}^\gamma_T(\Gamma),\mathcal{D}^\gamma_T(\overline{\Gamma})} := \sup_{A\ni\beta<\gamma,\ t\in[0,T]}\big|F(t) - \overline{F}(t)\big|_\beta \ +\ \sup_{A\ni\beta<\gamma,\ s,t\in[0,T],\ s\ne t}\frac{\big|F(t) - \Gamma_{ts}F(s) - \big(\overline{F}(t) - \overline{\Gamma}_{ts}\overline{F}(s)\big)\big|_\beta}{|t-s|^{\gamma-\beta}}.$$
The reconstruction theorem now states that for $\gamma > 0$ any $F \in \mathcal{D}^\gamma_T(\Gamma)$ can be uniquely identified with a distribution that behaves locally like $\Pi_\cdot F(\cdot)$.

Theorem 3.15 ([31, Theorem 3.10]).
Given a model $(\Pi,\Gamma)$, $\gamma > 0$ and $T > 0$, there is a unique continuous operator $\mathcal{R}: \mathcal{D}^\gamma_T(\Gamma) \to \mathcal{C}^{|\Xi|}(\mathbb{R})$ such that for any $s \in [0,T]$ and $\varphi \in C^1_c(B(0,1))$,
$$\big|\big(\mathcal{R}F - \Pi_sF(s)\big)(\varphi^\lambda_s)\big| \lesssim \|\Pi\|_T\,\lambda^\gamma. \tag{3.17}$$
For two different models $(\Pi,\Gamma)$ and $(\overline{\Pi},\overline{\Gamma})$ we further have
$$\big|\big(\mathcal{R}F - \Pi_sF(s) - (\overline{\mathcal{R}}\overline{F} - \overline{\Pi}_s\overline{F}(s))\big)(\varphi^\lambda_s)\big| \lesssim \lambda^\gamma\Big(\|F\|_{\mathcal{D}^\gamma_T(\Gamma)}\,|||(\Pi,\Gamma);(\overline{\Pi},\overline{\Gamma})|||_T + \|\overline{\Pi}\|_T\,|||F;\overline{F}|||_{\mathcal{D}^\gamma_T(\Gamma);\mathcal{D}^\gamma_T(\overline{\Gamma})}\Big) \tag{3.18}$$
for $F \in \mathcal{D}^\gamma_T(\Gamma)$, $\overline{F} \in \mathcal{D}^\gamma_T(\overline{\Gamma})$.

As mentioned earlier, we want to allow ourselves to work with compactly supported functions $\varphi \in B^\beta_{1,\infty}(\mathbb{R})$, $\beta > -|\Xi|$, which includes objects like the Haar wavelets. The following lemma allows us to carry over all bounds.

Lemma 3.16.
The bounds (3.8), (3.14), (3.17) and (3.18) still hold for $\varphi \in B^\beta_{1,\infty}(\mathbb{R})$, $\beta > -|\Xi|$, with compact support in $B(0,1)$ (after a change of constants).

Remark 3.17. This covers in particular functions like $\mathbf{1}_{[0,1]} \in B^1_{1,\infty}(\mathbb{R})$.

Proof.
We prove this via wavelet methods in the appendix. $\square$
In the following, by the notation $X^{(\varepsilon)}$ we mean both $X$ and $X^\varepsilon$. To study objects like $\int_0^t f(\widehat{W}^{(\varepsilon)}(r),r)\,dW^{(\varepsilon)}(r)$ with the reconstruction theorem, we first "expand" the integrand $f(\widehat{W}^{(\varepsilon)}(r),r)$ in the regularity structure $T$:
$$F^{(\varepsilon)}(s) := \sum_{m=0}^M \frac{1}{m!}\,\partial_1^m f\big(\widehat{W}^{(\varepsilon)}(s), s\big)\,\mathcal{I}(\Xi)^m.$$
On the level of the regularity structure these objects can be multiplied with the "noise" $\Xi$, which gives a modelled distribution on $T$. We will analyze $F^{(\varepsilon)}$ by writing it as a composition of a (random) modelled distribution with the smooth function $f$. To this end we need

Lemma 3.18.
On the regularity structure $(T,A,G)$ introduced in Section 3.1, consider a model $(\Pi,\Gamma)$ which is admissible in the sense
$$\Pi_t\mathcal{I}(\Xi) = \big(K \ast \Pi_t\Xi\big)(\cdot) - \big(K \ast \Pi_t\Xi\big)(t).$$
Then
$$\mathcal{K}\Xi(t) := \mathcal{I}(\Xi) + \big(K \ast \Pi_t\Xi\big)(t)\,\mathbf{1} \tag{3.19}$$
defines a modelled distribution. More precisely, $\mathcal{K}\Xi \in \mathcal{D}^\infty_T := \bigcap_{\gamma<\infty}\mathcal{D}^\gamma_T$.

Remark 3.19. Our notion of admissibility mimics [31, Def. 5.9], which however is not directly applicable here (due to failure of Assumption 5.4 in [31]).

(Footnote to Theorem 3.15: $\mathcal{C}^{|\Xi|}(\mathbb{R})$ denotes the space of distributions that are locally in the Besov space $B^{|\Xi|}_{\infty,\infty}(\mathbb{R})$, cf. [31, Remark 3.8].)
Proof.
By definition of the modelled distribution space we need to understand the action of $\Gamma_{st}$ on all constituting symbols. Since $\{\mathbf{1}, \mathcal{I}(\Xi)\}$ spans a sector, i.e. a subspace invariant under the action of the structure group, it is clear that
$$\Gamma_{st}\mathcal{I}(\Xi) = \mathcal{I}(\Xi) + (\dots)\,\mathbf{1}.$$
Application of the realization map $\Pi_s$, followed by evaluation at $s$, immediately identifies $(\dots)$ as
$$\Pi_t\mathcal{I}(\Xi)(s) - \Pi_s\mathcal{I}(\Xi)(s) = \Pi_t\mathcal{I}(\Xi)(s) = \big(K\ast\Pi_s\Xi\big)(s) - \big(K\ast\Pi_t\Xi\big)(t),$$
where we used admissibility and $\Pi_s\Xi = \Pi_t\Xi$ in the last step, a general fact due to the trivial action of the structure group on the symbol with lowest degree. As a consequence $\Gamma_{st}\mathcal{K}\Xi(t) \equiv \mathcal{K}\Xi(s)$, so that, trivially, $\mathcal{K}\Xi \in \mathcal{D}^\gamma_T$ for any $\gamma < \infty$. $\square$

For a given (sufficiently smooth) function $f$, and a generic model $(\Pi,\Gamma)$ on our regularity structure, define
$$F_\Pi : s \mapsto \sum_{m=0}^M \frac{1}{m!}\,\partial_1^m f\big(\mathcal{R}\mathcal{K}\Xi(s), s\big)\,\mathcal{I}(\Xi)^m.$$
Remark that $\mathcal{K}\Xi(s)$ is function-like, i.e. with values in the span of symbols with non-negative degree. From [31, Prop. 3.28] we then have
$$\mathcal{R}\mathcal{K}\Xi(s) = \langle\mathcal{K}\Xi(s), \mathbf{1}\rangle = \big(K\ast\Pi_s\Xi\big)(s).$$
(In particular, we see that $F^{(\varepsilon)}(s)$ coincides with $F_\Pi$ when $\Pi$ is taken as either the approximate or the renormalized approximate model.) We can also define $\Xi F_\Pi$, simply obtained by multiplying with $\Xi$. The properties of $F_\Pi$ and $\Xi F_\Pi$ are summarized in the following lemma.

Lemma 3.20.
Given $f \in C^{M+3}_b([0,T]\times\mathbb{R})$, there exists $N > 0$ such that, for all $\gamma \in (1/2+\kappa, 1]$,
$$\|F_\Pi\|_{\mathcal{D}^\gamma_T(\Gamma)} \lesssim \|\Pi\|^N_T, \qquad \|\Xi F_\Pi\|_{\mathcal{D}^{\gamma+|\Xi|}_T(\Gamma)} \lesssim \|\Pi\|^N_T.$$
We have further, for two given models $(\Pi,\Gamma)$ and $(\Pi',\Gamma')$,
$$|||F_\Pi; F_{\Pi'}|||_{\mathcal{D}^\gamma_T(\Gamma);\mathcal{D}^\gamma_T(\Gamma')} \lesssim \big(\|\Pi\|^N_T + \|\Pi'\|^N_T\big)\,|||(\Pi,\Gamma);(\Pi',\Gamma')|||_T, \tag{3.20}$$
$$|||\Xi F_\Pi; \Xi F_{\Pi'}|||_{\mathcal{D}^{\gamma+|\Xi|}_T(\Gamma);\mathcal{D}^{\gamma+|\Xi|}_T(\Gamma')} \lesssim \big(\|\Pi\|^N_T + \|\Pi'\|^N_T\big)\,|||(\Pi,\Gamma);(\Pi',\Gamma')|||_T, \tag{3.21}$$
where the proportionality constants are, in particular, uniform over all $f$ with bounded $C^{M+3}$-norm.

Proof. The map $F_\Pi$ is simply the composition (in the sense of [31, Sec. 4.2]) of the function $f$ with the — thanks to the previous lemma — modelled distributions $\mathcal{K}\Xi$ and $s \mapsto s$. The result then follows from [31, Thm 4.16] (polynomial dependence on $\|\Pi\|_T$ is not stated there but is clear from the proof). $\square$

Remark 3.21. In the case when $f \in C^{M+3}$ but with no global bounds, the result still holds, since we only consider the values of $f$ on the range of the continuous function $\mathcal{R}\mathcal{K}\Xi$: if the latter is bounded by some $R \ge 0$, only $\|f\|_{C^{M+3}(B_R\times[0,T])}$ enters the estimates.
In the case of the Itô model $(\Pi,\Gamma)$ (resp. the renormalized approximating models $(\widehat{\Pi}^\varepsilon,\Gamma^\varepsilon)$) we simply denote $F_\Pi$ by $F$ (resp. $F^\varepsilon$). We are then allowed to apply Hairer's reconstruction Theorem 3.15. Note that, since we have two models, we have two reconstruction operators $\mathcal{R}$ and $\mathcal{R}^\varepsilon$. The objects $\mathcal{R}^{(\varepsilon)}\Xi F^{(\varepsilon)}$ can be written down explicitly.

Lemma 3.22.
We have (a.s.)
$$\mathcal{R}\big(\Xi F\big)(\varphi) = \int \varphi(t)\,f\big(\widehat{W}(t),t\big)\,dW(t),$$
$$\mathcal{R}^\varepsilon\big(\Xi F^\varepsilon\big)(\varphi) = \int \varphi(t)\,f\big(\widehat{W}^\varepsilon(t),t\big)\,dW^\varepsilon(t) - \int K^\varepsilon(t,t)\,\partial_1 f\big(\widehat{W}^\varepsilon(t),t\big)\,\varphi(t)\,dt.$$

Proof.
The proof is in the appendix. $\square$
If we take $\varphi = \mathbf{1}_{[0,T)}$ we obtain $\mathcal{R}(\Xi F)(\mathbf{1}_{[0,T)}) = \int_0^T f(\widehat{W}(t),t)\,dW(t)$, so that it is natural to choose $\widetilde{\mathcal{I}}^\varepsilon_f(T) = \mathcal{R}^\varepsilon(\Xi F^\varepsilon)(\mathbf{1}_{[0,T)})$ as an approximation. However, note that the key property of the reconstruction operator $\mathcal{R}^{(\varepsilon)}$ is that it is locally close to the corresponding model $\Pi^{(\varepsilon)}$, so that we have in fact two natural approximations:

Definition 3.23.
For $F, F^\varepsilon$ as in Lemma 3.20 and $t \ge 0$, set
$$\widetilde{\mathcal{I}}^\varepsilon_f(t) := \mathcal{R}^\varepsilon\big(\Xi F^\varepsilon\big)(\mathbf{1}_{[0,t]}) = \int_0^t f\big(\widehat{W}^\varepsilon(r),r\big)\,dW^\varepsilon(r) - \int_0^t C_\varepsilon(r)\,\partial_1 f\big(\widehat{W}^\varepsilon(r),r\big)\,dr.$$
For a (fixed) partition $\{[t^\varepsilon_l, t^\varepsilon_{l+1})\}$ of $[0,t)$ with $|t^\varepsilon_{l+1} - t^\varepsilon_l| \lesssim \varepsilon$, we further set
$$\widetilde{\mathcal{J}}^\varepsilon_{f,M}(t) = \sum_{[t^\varepsilon_l,t^\varepsilon_{l+1})} \widehat{\Pi}^\varepsilon_{t_l}\,\Xi F^\varepsilon_{t_l}\big(\mathbf{1}_{[t^\varepsilon_l,t^\varepsilon_{l+1})}\big)$$
$$= \sum_{[t^\varepsilon_l,t^\varepsilon_{l+1})}\Bigg[\sum_{m=0}^M \frac{1}{m!}\,\partial_1^m f\big(\widehat{W}^\varepsilon(t^\varepsilon_l), t^\varepsilon_l\big)\int_{t^\varepsilon_l}^{t^\varepsilon_{l+1}}\big(\widehat{W}^\varepsilon(r) - \widehat{W}^\varepsilon(t^\varepsilon_l)\big)^m\,dW^\varepsilon(r)$$
$$\qquad\qquad -\ \sum_{m=1}^M \frac{1}{(m-1)!}\,\partial_1^m f\big(\widehat{W}^\varepsilon(t^\varepsilon_l), t^\varepsilon_l\big)\int_{t^\varepsilon_l}^{t^\varepsilon_{l+1}}C_\varepsilon(r)\big(\widehat{W}^\varepsilon(r) - \widehat{W}^\varepsilon(t^\varepsilon_l)\big)^{m-1}\,dr\Bigg].$$
We might drop the indices $f$ and $f,M$ on $\widetilde{\mathcal{I}}^\varepsilon$ and $\widetilde{\mathcal{J}}^\varepsilon$ if there is no risk of confusion.

The following theorem, which can be seen as the fundamental theorem of our regularity structure approach to rough pricing, shows that these approximations both converge.

Theorem 3.24.
Fix $T > 0$. For $f$ smooth, bounded with bounded derivatives, and $\widetilde{\mathcal{I}}^\varepsilon_f$, $\widetilde{\mathcal{J}}^\varepsilon_{f,M}$ as in Definition 3.23, we have:

(i) for any $\delta \in (0,1)$ and any $p < \infty$ there exists $C$ such that
$$\bigg\|\sup_{t\in[0,T]}\Big|\widetilde{\mathcal{I}}^\varepsilon_f(t) - \int_0^t f\big(\widehat{W}(r),r\big)\,dW(r)\Big|\bigg\|_{L^p} \le C\varepsilon^{\delta H}; \tag{3.22}$$

(ii) for every $\delta \in (0,1)$ we can pick $M = M(\delta,H)$ large enough such that, for any $p < \infty$, there exists $C$ such that
$$\bigg\|\sup_{t\in[0,T]}\Big|\widetilde{\mathcal{J}}^\varepsilon_{f,M}(t) - \int_0^t f\big(\widehat{W}(r),r\big)\,dW(r)\Big|\bigg\|_{L^p} \le C\varepsilon^{\delta H}. \tag{3.23}$$

Remark 3.25. With regard to (i): although $\widetilde{\mathcal{I}}^\varepsilon_f(t)$ does not depend on any choice of $M$, and nor does its (Itô) limit, the choice of $M$ affects the entire regularity structure and so, implicitly, also the reconstruction operator $\mathcal{R}^\varepsilon$ used in the definition of $\widetilde{\mathcal{I}}^\varepsilon_f$, as well as the modelled distribution $F^\varepsilon$. The latter, in turn, requires $f \in C^M$ for the construction to make sense. If $\delta$ is chosen arbitrarily close to one, $f$ needs to have derivatives of arbitrary order, hence our smoothness assumption.

Remark 3.26 ($f$ of exponential form; [27]). By an easy localization argument one shows that for $f$ smooth (but without any further bounds) one still has
$$\sup_{\varepsilon\in(0,1]}\,\mathbb{P}\bigg(\sup_{t\in[0,T]}\Big|\widetilde{\mathcal{I}}^\varepsilon_f(t) - \int_0^t f\big(\widehat{W}(r),r\big)\,dW(r)\Big| \le C\varepsilon^{\delta H}\bigg) \to 1 \quad\text{as } C \to \infty.$$
The original rough volatility model due to [27] makes a point that $f$ should be of exponential form. Now, the result with $L^p$-estimates still holds, since we only consider the values of $f$ on the range of the continuous function $\mathcal{R}\mathcal{K}\Xi$: if the latter is bounded by some $R \ge 0$, only $\|f\|_{C^{M+2}(B_R\times[0,T])}$ enters. Since, for us, $(\Pi,\Gamma)$ is always a Gaussian model, $\mathcal{R}\mathcal{K}\Xi$ is a Gaussian process (say, $\widehat{W}$ or $\widehat{W}^\varepsilon$), hence we have (Fernique) Gaussian concentration for $\sup_{t\in[0,T]}|\mathcal{R}\mathcal{K}\Xi(t)|$. So, for instance, if $f$ and its derivatives have exponential growth, we do have the $L^p$ bounds of the above theorem, for all $p < \infty$. This remark justifies in particular the choice $f(x) = \exp(x)$ and $p = 2$ in the numerical discussion of Section 6.

Proof.
Without loss of generality $T \le 1$; otherwise split $[0,T]$ into subintervals. Let us show (3.22). We have
$$\widetilde{\mathcal{I}}^\varepsilon_f(t) - \int_0^t f\big(\widehat{W}(r),r\big)\,dW(r) = \big(\mathcal{R}^\varepsilon(\Xi F^\varepsilon) - \mathcal{R}(\Xi F)\big)(\mathbf{1}_{[0,t]})$$
$$= \big(\widehat{\Pi}^\varepsilon_0\,\Xi F^\varepsilon(0) - \Pi_0\,\Xi F(0)\big)(\mathbf{1}_{[0,t]}) + \big(\mathcal{R}^\varepsilon\Xi F^\varepsilon - \widehat{\Pi}^\varepsilon_0\,\Xi F^\varepsilon(0) - \big(\mathcal{R}\Xi F - \Pi_0\,\Xi F(0)\big)\big)(\mathbf{1}_{[0,t)}).$$
We then obtain the rate $\varepsilon^{\delta\kappa}$, $\delta \in (0,1)$, using Theorem 3.13, Lemma 3.20 and (3.14) for the first term, and in addition Theorem 3.15 for the second term. Letting $\kappa \uparrow H$ and $M \uparrow \infty$, the total rate can be chosen arbitrarily close to $H$. To obtain the second estimate we bound $\widetilde{\mathcal{I}}^\varepsilon_f(t) - \widetilde{\mathcal{J}}^\varepsilon_{f,M}(t)$ with the first inequality in Theorem 3.15. $\square$

Non-constant vs. constant renormalization. If $\delta_\varepsilon$ comes from a mollifier (cf. Example 3.6), the renormalization $C_\varepsilon = K^\varepsilon(\cdot,\cdot)$ that was applied in Theorem 3.13, and thus in Definition 3.23, is a constant, which is the familiar concept one encounters in the study of singular SPDEs [32, 31, 11]. If $\delta_\varepsilon$ comes from wavelets such as the Haar basis, $K^\varepsilon(\cdot,\cdot)$ is usually not constant but a periodic function with period $\varepsilon$. Thus we see that our analysis gives rise to a "non-constant renormalization". It is natural to ask if one can do with constant renormalization after all. For the sake of argument, consider $C_\varepsilon$, periodic with period $\varepsilon$, with mean
$$\overline{C}_\varepsilon = \frac{1}{\varepsilon}\int_0^\varepsilon C_\varepsilon(t)\,dt.$$
From Lemma 3.12 it follows that $C_\varepsilon$ (and its mean) are bounded by $\varepsilon^{H-1/2}$, uniformly in $t$. Putting all this together, it easily follows that $|\langle C_\varepsilon - \overline{C}_\varepsilon, \varphi\rangle| \lesssim \varepsilon^{\alpha+H-1/2}$, uniformly over all $\varphi$ bounded in $C^\alpha$, with convergence to zero when $\alpha > 1/2 - H$. As a consequence, taking $\varphi(t) = f(\widehat{W}^\varepsilon)$ for smooth $f$, we can clearly apply this with any $\alpha < H$. Hence, by equating the constraints on $\alpha$, we arrive at $H > 1/4$. The practical consequence, with focus on the convergence stated in part (i) of Theorem 3.24, is that we can indeed replace non-constant renormalization by a constant one, however at the price of restricting to $H > 1/4$. We suspect that this restriction is of technical nature and that constant renormalization should work for all $H > 0$. While we have refrained from investigating this (technical) point further, we can understand the mechanism at work by looking at the following toy example. Consider the Itô integral $\int W^H\,dW$, where $W^H$ is a fBM, but now with Hurst parameter $H > 1/2$, built, say, as a Volterra process over $W$. Using Young integration theory, one can give a pathwise argument which shows that the Riemann–Stieltjes approximations converge a.s. (with vanishing rate as $H \to 1/2^+$). However, we know from stochastic integration theory (Itô) that this convergence takes place in $L^2$ (and then in probability) for any $H > 0$. We would thus expect that, when $H \in (0,1/4]$, constant renormalization still yields convergent approximations, albeit in a weaker probabilistic sense.

3.3. The case of the Haar basis.
The following special case of the above approximations to $\int_0^t f(\widehat{W}(r),r)\,dW(r)$ is of particular interest for our purposes. We here collect some more concrete formulas that arise in this case. Let $\varepsilon = 2^{-N}$, $\phi := \mathbf{1}_{[0,1)}$ and $\phi_{l,N} = 2^{N/2}\phi(2^N\cdot - l)$, $l \in \mathbb{Z}$; the corresponding $\delta_\varepsilon$ coming from this wavelet is then, for $x,y \in \mathbb{R}$,
$$\delta_\varepsilon(x,y) = \sum_{l\in\mathbb{Z}}\phi_{l,N}(x)\phi_{l,N}(y) = 2^N\,\mathbf{1}_{[\lfloor x2^N\rfloor 2^{-N},\,(\lfloor x2^N\rfloor+1)2^{-N})}(y).$$
The mollified Volterra kernel (3.13) then takes the form
$$K^\varepsilon(u,v) = \int_0^\infty\!\!\int_0^\infty \delta_\varepsilon(v,x_1)\,\delta_\varepsilon(x_1,x_2)\,K(u-x_2)\,dx_1\,dx_2 = \sqrt{2H}\cdot 2^N\int_{[\lfloor v2^N\rfloor 2^{-N},\,(\lfloor v2^N\rfloor+1)2^{-N}\wedge u)}|u-x|^{H-1/2}\,dx\ \mathbf{1}_{\lfloor v2^N\rfloor 2^{-N}\le u}$$
$$= \frac{\sqrt{2H}}{1/2+H}\,2^N\,\Big(\big|u - \lfloor v2^N\rfloor 2^{-N}\big|^{1/2+H} - \big|u - \big((\lfloor v2^N\rfloor+1)2^{-N}\wedge u\big)\big|^{1/2+H}\Big)\,\mathbf{1}_{\lfloor v2^N\rfloor 2^{-N}\le u}.$$
A special role is played by the diagonal, which acts as renormalization function:
$$C_\varepsilon(t) = K^\varepsilon(t,t) = \frac{\sqrt{2H}}{1/2+H}\,2^N\,\big|t - \lfloor t2^N\rfloor 2^{-N}\big|^{1/2+H}. \tag{3.24}$$

(Footnote: Some computations led us to believe that the question of constant renormalization can be settled with the aid of mixed $(1,\rho)$-variation of the covariance function of the Volterra process, cf. [22], which we expect to hold uniformly over the approximations. However, the amount of work seems in no relation to the main theme of this article.)
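The closed-form Haar kernel and the diagonal (3.24) lend themselves to direct numerical checks. The sketch below (our own code; $H$, $N$ are illustrative choices) verifies the local mean of $C_\varepsilon$ over one period against the constant used for constant renormalization at the end of this subsection, and synthesizes a path of $\widehat{W}^\varepsilon$ from i.i.d. block variables $Z_l$ (cf. the expansion of $\widehat{W}^\varepsilon$ in this subsection):

```python
import numpy as np

H, N = 0.3, 8
eps = 2.0 ** (-N)
rng = np.random.default_rng(7)

def Keps(u, v):
    """Closed-form Haar kernel: sqrt(2H)/(H+1/2) * 2^N *
    ((u-a)_+^(H+1/2) - (u-a-eps)_+^(H+1/2)) * 1{u >= a}, a = floor(v 2^N) 2^-N."""
    a = np.floor(v * 2 ** N) * eps
    g = lambda x: np.maximum(x, 0.0) ** (H + 0.5)
    return np.sqrt(2 * H) / (H + 0.5) * 2 ** N * (g(u - a) - g(u - a - eps)) * (u >= a)

# local mean of the diagonal C_eps(t) = Keps(t,t) over one period, vs closed form
r = (np.arange(100_000) + 0.5) * eps / 100_000
mean_num = 2 ** N * np.mean(Keps(r, r)) * eps
mean_closed = np.sqrt(2 * H) / ((H + 0.5) * (H + 1.5)) * 2 ** (N * (0.5 - H))
print(mean_num, mean_closed)        # agree up to quadrature error

# sample path of What^eps via the Z_l expansion
Z = rng.standard_normal(2 ** N)     # Z_l = <dot W, phi_lN>, i.i.d. N(0,1)
l = np.arange(2 ** N)

def What_eps(t):
    return np.sum(2.0 ** (-N / 2) * Keps(t, l * eps) * Z)

print(What_eps(0.0))                # -> 0.0
print(Keps(0.25, 0.5))              # -> 0.0 (causality: later blocks do not act)
```

The vanishing of $K^\varepsilon(t, l2^{-N})$ for $t < l2^{-N}$ is what truncates the expansion of $\widehat{W}^\varepsilon(t)$ at $l = \lfloor t2^N\rfloor$.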
We have moreover
$$\widehat{W}^\varepsilon(t) = \int_0^t K(t-r)\,dW^\varepsilon(r) = \sum_{l=0}^\infty Z_l\int_0^t K(t-r)\,\phi_{l,N}(r)\,dr = \sum_{l=0}^\infty 2^{-N/2}\,K^\varepsilon\big(t, l\,2^{-N}\big)\,Z_l = \sum_{l=0}^{\lfloor t2^N\rfloor} 2^{-N/2}\,K^\varepsilon\big(t, l\,2^{-N}\big)\,Z_l,$$
where $Z_l = \langle\dot{W}, \phi_{l,N}\rangle$ are i.i.d. $\mathcal{N}(0,1)$ variables. As approximation we can finally take $\widetilde{\mathcal{I}}^\varepsilon_f(t)$ from Definition 3.23 with partition $\{[t_l, t_{l+1})\} = \{[l\,2^{-N}, (l+1)2^{-N}\wedge t)\}$, which gives us
$$\widetilde{\mathcal{J}}^\varepsilon_{f,M}(t) = \sum_{l=0}^{\lceil t2^N\rceil-1}\Bigg[\sum_{m=0}^M\frac{1}{m!}\,\partial_1^m f\big(\widehat{W}^\varepsilon(t_l), t_l\big)\,2^{N/2}Z_l\int_{t_l}^{t_{l+1}}\big(\widehat{W}^\varepsilon(r) - \widehat{W}^\varepsilon(t_l)\big)^m\,dr$$
$$\qquad\qquad -\ \sum_{m=1}^M \frac{1}{(m-1)!}\,\partial_1^m f\big(\widehat{W}^\varepsilon(t_l), t_l\big)\int_{t_l}^{t_{l+1}}C_\varepsilon(r)\big(\widehat{W}^\varepsilon(r) - \widehat{W}^\varepsilon(t_l)\big)^{m-1}\,dr\Bigg]$$
and
$$\widetilde{\mathcal{I}}^\varepsilon_f(t) = \sum_{l=0}^{\lceil t2^N\rceil-1}\int_{t_l}^{t_{l+1}}\Big[2^{N/2}Z_l\,f\big(\widehat{W}^\varepsilon(r),r\big) - C_\varepsilon(r)\,\partial_1 f\big(\widehat{W}^\varepsilon(r),r\big)\Big]\,dr.$$
As explained at the end of the last subsection, $C_\varepsilon(r)$ in these formulas could be replaced by its local mean, the constant
$$2^N\int_0^{2^{-N}}C_\varepsilon(r)\,dr = \frac{\sqrt{2H}}{(H+1/2)(H+3/2)}\,2^{N(1/2-H)}.$$

4. The full rough volatility regularity structure
4.1. Basic setup.
We want to add an independent Brownian motion $\overline{W}$, and so take an additional symbol $\overline{\Xi}$. We again fix $M$ and define a (larger) collection of symbols $\widetilde{S}$, with $S \subset \widetilde{S}$, and then
$$\widetilde{T} = \bigoplus_{\tau\in\widetilde{S}}\mathbb{R}\tau \ \cong\ T \oplus \big\langle\{\overline{\Xi},\ \overline{\Xi}\mathcal{I}(\Xi),\ \dots,\ \overline{\Xi}\mathcal{I}(\Xi)^M\}\big\rangle. \tag{4.1}$$
Again we fix $|\overline{\Xi}| = -1/2-\kappa$, and the homogeneities of the other symbols are defined multiplicatively as before. Also as before, we set $\widehat{W}_t = \int_0^t K(s,t)\,dW_s$ with $K(s,t) = \sqrt{2H}\,|t-s|^{H-1/2}\mathbf{1}_{t>s}$, where $W$ and $\overline{W}$ are independent Brownian motions. We extend the canonical model $(\Pi,\Gamma)$ to this regularity structure by defining
$$\Pi_s\overline{\Xi}\mathcal{I}(\Xi)^m = \bigg\{t\mapsto \frac{d}{dt}\Big(\int_s^t\big(\widehat{W}(u) - \widehat{W}(s)\big)^m\,d\overline{W}(u)\Big)\bigg\}$$
(the above integral being in the Itô sense), and $\Gamma_{ts}\big(\overline{\Xi}\mathcal{I}(\Xi)^m\big) = \overline{\Xi}\,\Gamma_{ts}\big(\mathcal{I}(\Xi)^m\big)$. Arguments similar to the proof of Lemma 3.8 show that this indeed defines a model on $\widetilde{T}$. (Footnote: upon setting $\Gamma_{ts}\overline{\Xi} = \overline{\Xi}$, the given relation is precisely implied by multiplicativity of $\Gamma$.)
4.2. Small noise model large deviation.
Given $\delta > 0$, consider the model $(\Pi^\delta, \Gamma^\delta)$ on $\tilde T$ obtained by replacing $W, \bar W$ by $\delta W, \delta \bar W$, which simply means that
$$
\Pi^\delta \mathbf 1 = \mathbf 1,\quad \Pi^\delta \mathcal I(\Xi)^m = \delta^m\, \Pi\, \mathcal I(\Xi)^m,\quad \Pi^\delta\, \Xi\,\mathcal I(\Xi)^m = \delta^{m+1}\, \Pi\, \Xi\,\mathcal I(\Xi)^m,\quad \Pi^\delta\, \bar\Xi\,\mathcal I(\Xi)^m = \delta^{m+1}\, \Pi\, \bar\Xi\,\mathcal I(\Xi)^m,
$$
and
$$
\Gamma^\delta_{ts}\mathbf 1 = \mathbf 1,\quad \Gamma^\delta_{ts}\Xi = \Xi,\quad \Gamma^\delta_{ts}\bar\Xi = \bar\Xi,\quad \Gamma^\delta_{ts}\mathcal I(\Xi) = \mathcal I(\Xi) + \delta\big(\widehat W(t) - \widehat W(s)\big)\mathbf 1,\quad \Gamma^\delta_{ts}(\tau\tau') = \Gamma^\delta_{ts}\tau \cdot \Gamma^\delta_{ts}\tau' \ \text{ for } \tau, \tau' \in S.
$$
Finally, for $h = (h, \bar h)$ in $\mathcal H := L^2([0,T])^2$, we consider the deterministic model $(\Pi^h, \Gamma^h)$ defined by
$$
\Pi^h \mathbf 1 = \mathbf 1,\quad \Pi^h_s \Xi = h,\quad \Pi^h_s \bar\Xi = \bar h,\quad \Pi^h_s \mathcal I(\Xi)(t) = \int_0^{t\vee s} \big(K(u,t) - K(u,s)\big) h(u)\,\mathrm du,\quad \Pi^h(\tau\tau') = \Pi^h\tau\cdot\Pi^h\tau' \ \text{ for } \tau,\tau' \in S,
$$
and
$$
\Gamma^h_{ts}\mathbf 1 = \mathbf 1,\quad \Gamma^h_{ts}\Xi = \Xi,\quad \Gamma^h_{ts}\bar\Xi = \bar\Xi,\quad \Gamma^h_{ts}\mathcal I(\Xi) = \mathcal I(\Xi) + \Big(\int_0^{t\vee s} \big(K(u,t)-K(u,s)\big) h(u)\,\mathrm du\Big)\mathbf 1,\quad \Gamma^h_{ts}(\tau\tau') = \Gamma^h_{ts}\tau\cdot\Gamma^h_{ts}\tau' \ \text{ for } \tau,\tau'\in S.
$$
The following lemma and theorem are proved in Appendix B.
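The function-like part of the deterministic model above, $\Pi^h_s\,\mathcal I(\Xi)(t)$, is an increment of the fractional Cameron-Martin-type shift $t \mapsto \int_0^t K(u,t)\,h(u)\,\mathrm du$. As a quick numerical illustration (our own sketch; the midpoint quadrature and resolution are arbitrary choices):

```python
import math

def cm_shift(h, H, t, n=2000):
    """Evaluate int_0^t K(u,t) h(u) du with K(u,t) = sqrt(2H) (t-u)^(H-1/2),
    by midpoint quadrature (the singularity at u = t is integrable)."""
    du = t / n
    return sum(
        math.sqrt(2 * H) * (t - (k + 0.5) * du) ** (H - 0.5) * h((k + 0.5) * du) * du
        for k in range(n)
    )
```

For $h \equiv 1$ this equals $\sqrt{2H}\,t^{H+1/2}/(H+1/2)$ exactly, which makes a convenient correctness check.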
Lemma 4.1.
For each $h \in \mathcal H$, $\Pi^h$ does define a model. In addition, the map $h \in \mathcal H \mapsto \Pi^h$ is continuous.

Theorem 4.2. The models $\Pi^\delta$ satisfy a large deviation principle (LDP) in the space of models, with speed $\delta^2$ and rate function given by
$$
J(\Pi) = \begin{cases} \frac12 \|h\|^2_{\mathcal H} & \text{if } \Pi = \Pi^h \text{ for some } h \in \mathcal H,\\[2pt] +\infty & \text{otherwise.}\end{cases}
$$
As an immediate corollary we have
Corollary 4.3.
For $\delta$ small, $\mathbf P(Y^\delta \approx y) \approx \exp\big[-I(y)/\delta^2\big]$, in the precise sense of a large deviation principle (LDP) for
$$
Y^\delta := \int_0^1 f\big(\delta \widehat W_s\big)\,\delta\big(\rho\, \mathrm dW_s + \bar\rho\, \mathrm d\bar W_s\big)
$$
with speed $\delta^2$, and rate function given by
$$
I(y) = \inf_{h \in L^2([0,1])} \Big\{ \tfrac12 \|h\|^2_{L^2} + \frac{\big(y - I_1(h)\big)^2}{2\, I_2(h)} \Big\} \tag{4.2}
$$
where
$$
I_1(h) = \rho \int_0^1 f\Big(\int_0^s K(u,s)\,h(u)\,\mathrm du\Big) h(s)\,\mathrm ds, \qquad I_2(h) = \bar\rho^2 \int_0^1 f^2\Big(\int_0^s K(u,s)\,h(u)\,\mathrm du\Big)\,\mathrm ds.
$$

Remark. This improves a similar result in [19], in the sense that $f$ of exponential form, as required in rough volatility modelling [27, 4, 5], is now covered.

Proof.
Note that $Y^\delta = \big\langle \mathcal R^\delta\big(F^\delta \cdot (\rho\,\Xi + \bar\rho\,\bar\Xi)\big),\, \mathbf 1_{[0,1]}\big\rangle$, where $F^\delta \equiv F_{\Pi^\delta}$ as defined in Lemma 3.20. By the contraction principle and the continuity estimate from Theorem 3.15, it holds that $Y^\delta$ satisfies a LDP, with rate function given by
$$
I(y) = \inf\Big\{ \tfrac12\big(\|h\|^2_{L^2} + \|\bar h\|^2_{L^2}\big) :\; y = \big\langle \mathcal R^h\big(F^h\cdot(\rho\,\Xi + \bar\rho\,\bar\Xi)\big),\, \mathbf 1_{[0,1]}\big\rangle \Big\},
$$
where we used $F^h \equiv F_{\Pi^h}$. It then suffices to note that
$$
\big\langle \mathcal R^h\big(F^h\cdot(\rho\,\Xi+\bar\rho\,\bar\Xi)\big),\, \mathbf 1_{[0,1]}\big\rangle = \int_0^1 f\Big(\int_0^s K(u,s)\,h(u)\,\mathrm du\Big)\big(\rho\, h(s) + \bar\rho\, \bar h(s)\big)\,\mathrm ds,
$$
and optimizing over $\bar h$ for fixed $h$ we obtain (4.2). $\square$

We note that, thanks to Brownian resp. fractional Brownian scaling, small noise large deviations translate immediately to short time large deviations, cf. [19]. Although the rate function here is not given in a very useful form, it is possible [5] to expand it in small $y$ and so compute (explicitly in terms of the model parameters) higher order moderate deviations, which relate to implied volatility skew expansions.

5. Rough Volterra dynamics for volatility
5.1. Motivation from market micro-structure.
Rosenbaum and coworkers [16, 17, 18] show that stylized facts of modern market microstructure naturally give rise to fractional dynamics and leverage effects. Specifically, they construct a sequence of Hawkes processes, suitably rescaled in time and space, that converges in law to a rough volatility model of rough Heston form
$$
\mathrm dS_t/S_t = \sqrt{v_t}\, \mathrm dB_t \equiv \sqrt{v_t}\,\big(\rho\, \mathrm dW_t + \bar\rho\, \mathrm d\bar W_t\big), \tag{5.1}
$$
$$
v_t = v_0 + \int_0^t \frac{a - b\, v_s}{(t-s)^{1/2-H}}\, \mathrm ds + \int_0^t \frac{c\,\sqrt{v_s}}{(t-s)^{1/2-H}}\, \mathrm dW_s.
$$
(As earlier, $W, \bar W$ are independent Brownian motions.) Similar to the case of the classical Heston model, the square root provides both pain (with regard to any methods that rely on sufficiently smooth coefficients) and comfort (an affine structure, here infinite-dimensional, which allows for closed-form computation of moment-generating functions). Arguably, there is no real financial reason for the square-root dynamics (this is also a frequent remark for the classical Heston model), and ongoing work attempts to modify the above square-root dynamics so as to obtain (something close to) log-normal volatility, put forward as an important rough volatility
feature by Gatheral et al. [27]. This motivates the study of more general dynamic rough volatility models of the form
$$
\mathrm dS_t/S_t = f(Z_t)\, \mathrm dB_t \equiv f(Z_t)\big(\rho\, \mathrm dW_t + \bar\rho\, \mathrm d\bar W_t\big), \tag{5.2}
$$
$$
Z_t = z_0 + \int_0^t K(s,t)\, v(Z_s)\, \mathrm ds + \int_0^t K(s,t)\, u(Z_s)\, \mathrm dW_s, \tag{5.3}
$$
with sufficiently nice functions $f, u, v$. (While $f(x) = \sqrt{x}$ is still OK in what follows, we assume $u, v \in C^3$ for a local solution theory and then in fact impose $u, v \in C^3_b$ for global existence. One clearly expects non-explosion under e.g. linear growth, but in order not to stray too far from our main line of investigation we refrain from a discussion.) Remark that $f(z_0)$ plays the role of spot volatility. Further note that the choice $z_0 = 0$, $v \equiv 0$, $u \equiv 1$ gives $f(Z_t) = f(\widehat W_t)$, as considered in earlier sections. With some good will, equation (5.2) fits into the existing theory of stochastic Volterra equations with singular kernels (e.g. [46] or [12]).

5.2. Regularity structure approach.
We insist that (5.2) is not a classical Itô SDE (solutions will not be semimartingales), nor a rough differential equation (in the sense of rough paths, driven by a Gaussian rough path as in [23, Ch. 10]). If rough paths have established themselves as a powerful tool to analyze classical Itô SDEs, we here make the point that Hairer's theory is an equally powerful tool to analyze stochastic Volterra (resp. mixed Itô-Volterra) equations in the singular regime of interest.

As a preliminary step, we have to find the correct model space, spanned by symbols which arise by formal Picard iteration. To this end, rewrite (5.2) formally, or as an equation for modelled distributions,
$$
Z = \mathcal I\big(U(Z)\cdot \Xi\big) + (\dots), \tag{5.4}
$$
from which one can guess (or formally derive along [31, Sec. 8.1]) the need for the symbols
$$
\mathbf 1,\ \mathcal I(\Xi),\ \mathcal I(\Xi)^2,\ \mathcal I\big(\Xi\,\mathcal I(\Xi)\big),\ \dots
$$
We have degrees $|\mathbf 1| = 0$, $|\mathcal I(\Xi)| = H - \kappa$, and then, for subsequent symbols, the degree is computed as
$$
(1/2 + H)\times\{\text{number of } \mathcal I\,\} + (-1/2-\kappa)\times\{\text{number of } \Xi\}.
$$
For a modelled distribution, $Z(t)$ takes values in the linear span of sufficiently many symbols, the (minimal) number of which is dictated by the Hurst parameter $H$. Loosely speaking, $Z \in \mathcal D^\gamma$ indicates an expansion with $\gamma$-error estimate, in practice easy to see from the degree of the lowest-degree symbols that do not figure in the expansion. For example, in the case of a "level-2 expansion" we can expect
$$
Z(t) = (\dots)\,\mathbf 1 + (\dots)\,\mathcal I(\Xi) \in \mathcal D^{2(H-\kappa)}_0, \quad \text{since } |\mathcal I(\Xi)^2| = |\mathcal I(\Xi\,\mathcal I(\Xi))| = 2(H-\kappa).
$$
It follows from general theory [31, Thm 4.16] that if
$Z \in \mathcal D^\gamma$, then so is $U(Z)$, the composition with a smooth function, and by [31, Thm 4.7] the product with $\Xi \in \mathcal D^\infty_{-1/2-\kappa}$ is a modelled distribution in $\mathcal D^{\gamma - 1/2-\kappa}$. For both reconstruction and convolution with singular kernels, one needs modelled distributions with positive degree $\gamma - 1/2 - \kappa > 0$. (Footnote: We are not aware of any literature on mixed Itô-Volterra systems, although we expect no difficulties. Here of course, it suffices to first solve for $Z$ and then construct $S$ as a stochastic exponential.) Given $H \in (0, 1/2]$ we can then determine which symbols (up to which degree) are required in the expansion. As earlier, fix an integer $M \ge \max\{m \in \mathbf N \,|\, m(H-\kappa) - 1/2 - \kappa \le 0\}$ (so that $(M+1)(H-\kappa) - 1/2 - \kappa > 0$) and see that $Z \in \mathcal D^{(M+1)(H-\kappa)}_0$ will do. When $H > 1/4$ (and $\kappa > 0$ small enough), $M = 1$ will do. That is, the symbols required to describe $Z$ are $\{\mathbf 1, \mathcal I(\Xi)\}$, and if one adds the symbols required to describe the right-hand side, one ends up with the level-2 model space spanned by
$$
\{\Xi,\ \Xi\,\mathcal I(\Xi),\ \mathbf 1,\ \mathcal I(\Xi)\},
$$
which is exactly the model space for the "simple" rough pricing regularity structure, (3.2) in case $M = 1$. When $H \le 1/4$, say $H \in (1/6, 1/4]$ with $M = 2$ accordingly, solving (5.3) on the level of modelled distributions will require a ("level-3") model space given by
$$
\big\langle \Xi,\ \Xi\,\mathcal I(\Xi),\ \Xi\,\mathcal I(\Xi)^2,\ \Xi\,\mathcal I(\Xi\,\mathcal I(\Xi)),\ \mathbf 1,\ \mathcal I(\Xi),\ \mathcal I(\Xi)^2,\ \mathcal I(\Xi\,\mathcal I(\Xi))\big\rangle,
$$
which is strictly larger than the corresponding level-3 simple model space given in (3.2). In general, one needs to consider an extended model space $\widehat T = \langle \widehat S\rangle$, so as to have
$$
\tau \in \widehat S \ \Rightarrow\ \Xi\,\mathcal I(\tau)^m,\ \mathcal I(\tau)^m \in \widehat S, \quad m \ge 0
$$
(with the understanding that only finitely many such symbols are needed, depending on $H$ as explained above). As a result, symbols such as
$$
\Xi\,\mathcal I\big(\Xi(\mathcal I(\Xi))^m\big),\ m \ge 0, \qquad \mathcal I\big(\Xi(\mathcal I(\Xi(\mathcal I(\Xi))^m))^{m'}\big),\ m, m' \ge 0,\ \dots
$$
will appear. At this stage a tree notation (omnipresent in the works of Hairer) would come in handy, and we refer to [9] (and the references therein) for a recent attempt to reconcile the tree formalism of branched rough paths [29, 34] and the most recent algebraic formalism of regularity structures. (In a nutshell, the simple case (3.2) corresponds to trees where one node has $m$ branches; in the present non-simple case, branching can happen everywhere.) Carrying out the following construction in the general case, $H > 0$, is certainly possible. However, the algebraic complexity is essentially that of branched rough paths, and hence the general case requires a Hopf-algebraic (Connes-Kreimer, Grossman-Larson, ...) construction of the structure group (a.k.a. positive renormalization). Although this, and negative renormalization, is well understood ([31, 10], also [9] for a rough path perspective), a complete exposition would lead us too far astray from the main topic of this paper. Hence, for simplicity only, we shall restrict from here on to the level-2 case $H > 1/4$ (with $M = 1$ accordingly), but will mention general results whenever useful.

5.3. Solving for rough volatility.
We rewrite (5.3) as an equation for modelled distributions in $\mathcal D^\gamma$:
$$
Z = z_0\, \mathbf 1 + \mathcal K\big(U(Z)\cdot\Xi + V(Z)\big). \tag{5.5}
$$
(Here $U, V$ are the operators associated to composition with $u, v \in C^{M+2}$, respectively.) We also impose $\gamma > 1/2 + \kappa$, so as to have $U(Z)\cdot\Xi$ in a modelled distribution space of positive parameter, so that reconstruction, convolution etc. make sense. Let
$H > 1/4$, $M = 1$, and pick $\kappa > 0$ small enough that $(M+1)(H-\kappa) - 1/2 - \kappa > 0$. As explained in the previous section, this exactly allows us to work in the familiar structure of Section 3.1. That is, with $M = 1$,
$$
T = \langle \Xi,\ \Xi\,\mathcal I(\Xi),\ \mathbf 1,\ \mathcal I(\Xi) \rangle,
$$
with index set and structure group as given in that section. This structure is equipped with the Itô model and its (renormalization) approximations. Equation (5.5) critically involves the convolution operator $\mathcal K$ acting on $\mathcal D^\gamma$. The general construction [31, Sec. 5] is among the most technical in Hairer's work, and in fact not directly applicable (our kernel $K$, although $\beta$-regularizing with $\beta = 1/2 + H$, fails Assumption 5.4 in [31]), so we shall be rather explicit.

(Footnote: We note that, as $H \to 0$, the number of symbols tends to infinity. In comparison, as far as we know, among all recently studied singular SPDEs, only the sine-Gordon equation [36] exhibits arbitrarily many symbols.)

Lemma 5.1.
On the regularity structure $(T, A, G)$ of Section 3.1 with $M = 1$, consider a model $(\Pi, \Gamma)$ which is admissible in the sense
$$
\Pi_t\, \mathcal I(\Xi) = (K * \Pi_t \Xi)(\cdot) - (K * \Pi_t \Xi)(t).
$$
Let $\gamma > 0$, $F \in \mathcal D^\gamma$, and set
$$
\mathcal K F :\ s \in [0,T] \mapsto \mathcal I\big(F(s)\big) + \big((K * \mathcal R F)(s)\big)\,\mathbf 1.
$$
Then (i) $\mathcal K$ maps $\mathcal D^\gamma \to \mathcal D^{\min\{\gamma+\beta,\,1\}}$, and (ii) $\mathcal R(\mathcal K F) = K * \mathcal R F$, i.e. convolution commutes with reconstruction.

Remark. [31, Thm 5.2] suggests the estimate that $\mathcal K$ maps $\mathcal D^\gamma \to \mathcal D^{\gamma+\beta}$. The difference to our baby Schauder estimate stems from the fact that, unlike Assumption 5.3 in [31, p. 64], we do not assume that our regularity structure contains the polynomial structure.

Proof. (Sketch) The special case $F \equiv \Xi \in \mathcal D^\infty$ was already treated in Lemma 3.18. We only show that, in the general case, $\mathcal K$ necessarily has the stated form, but will not check the properties. It is enough to consider $F$ with values in $\langle \Xi, \Xi\,\mathcal I\Xi\rangle$ and make the ansatz
$$
(\mathcal K F)(s) := \mathcal I F(s) + (\dots)\,\mathbf 1.
$$
Applying reconstruction, together with [31, Prop. 3.28], we see that $\mathcal R(\mathcal K F) \equiv (\dots)$, which in turn must equal $K * \mathcal R F$, provided we postulate validity of (ii). This is the given definition of $\mathcal K F$. $\square$

We return to our goal of solving
$$
Z = z_0\,\mathbf 1 + \mathcal K\big(U(Z)\cdot\Xi + V(Z)\big), \tag{5.6}
$$
noting perhaps that $U(Z)$ makes sense for every function-like modelled distribution, say $F(t) = F_0(t)\,\mathbf 1 + \sum_{k=1}^M F_k(t)\,(\mathcal I\Xi)^k \in T_+ := \langle \mathbf 1, \mathcal I(\Xi), \dots, (\mathcal I\Xi)^M\rangle$, in which case
$$
U(F)(t) = u\big(F_0(t)\big)\,\mathbf 1 + u'\big(F_0(t)\big)\sum_{k=1}^M F_k(t)\,\mathcal I(\Xi)^k. \tag{5.7}
$$
(Similar remarks apply to $V$, the composition operator associated to $v \in C^{M+2}$.) Recall $M = 1$.

Theorem 5.3.
For any admissible model $(\Pi,\Gamma)$ and $u, v \in C^{M+2}_b(\mathbf R)$, for any $T > 0$, the equation (5.6) has a unique solution $\mathcal Z$ in $\mathcal D^\gamma(T_+)$, and the map $(u,v,\Pi) \mapsto \mathcal Z$ is locally Lipschitz, in the sense that if $\mathcal Z$ and $\tilde{\mathcal Z}$ are the solutions corresponding respectively to $(u,v,\Pi)$ and $(\tilde u, \tilde v, \tilde\Pi)$,
$$
|||\mathcal Z; \tilde{\mathcal Z}|||_{\mathcal D^\gamma_T} \lesssim \|u - \tilde u\|_{C^{M+2}_b} + \|v - \tilde v\|_{C^{M+2}_b} + |||(\Pi,\Gamma); (\tilde\Pi,\tilde\Gamma)|||_T,
$$
with the proportionality constant being bounded when the (resp. $C^{M+2}_b$ and model) norms of the arguments stay bounded. In addition, if $(\Pi,\Gamma)$ is the canonical Itô model (associated to Brownian resp. fractional Brownian motion, $H > 1/4$), then $Z = \mathcal R\mathcal Z$ solves (5.2) in the Itô sense.

(Footnote: $\mathcal I$ is extended linearly to all of $T$ by taking $\mathcal I\tau = 0$ for symbols $\tau \neq \Xi$.)

Remark. $Z = \mathcal R\mathcal Z$ is clearly the (unique) reconstruction of the (unique) solution to the abstract problem. We also checked that $Z$ is indeed a solution of the Itô-Volterra equation. However, if one desires to know that $Z$ is the unique strong solution to the stochastic Itô-Volterra equation, it is clear that one has to resort to uniqueness results of the stochastic theory, see e.g. [12].

Proof.
The well-posedness and continuous dependence on the parameters essentially follow from results of [31]; the details are spelled out in Appendix C. The fact that the reconstruction of the solution solves the Itô equation can be obtained by considering approximations, as is done in [35, Thm 6.2] or [23, Ch. 5]. $\square$
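Although the solution theory above is pathwise, a naive Euler-type discretization of the Volterra dynamics (5.3) is useful for intuition. The following rough sketch (our own illustration, ignoring the renormalization issues discussed in this paper) uses rough-Heston-type coefficients, e.g. $u(z) = c\sqrt{z^+}$ and $v(z) = a - bz$, as arbitrary test choices:

```python
import math
import random

def euler_volterra(z0, u, v, H, n_steps, T=1.0, seed=0):
    """Naive Euler scheme for Z_t = z0 + int_0^t K(s,t) v(Z_s) ds
                                      + int_0^t K(s,t) u(Z_s) dW_s,
    with K(s,t) = sqrt(2H) (t-s)^(H-1/2); the kernel is evaluated at
    cell midpoints to avoid the singularity at s = t."""
    rng = random.Random(seed)
    dt = T / n_steps
    dW = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_steps)]
    z = [z0]
    for k in range(1, n_steps + 1):
        t = k * dt
        acc = z0
        for j in range(k):
            kern = math.sqrt(2 * H) * (t - (j + 0.5) * dt) ** (H - 0.5)
            acc += kern * (v(z[j]) * dt + u(z[j]) * dW[j])
        z.append(acc)
    return z
```

Usage, with square-root volatility-of-volatility truncated at zero so that the coefficient stays defined: `euler_volterra(0.04, lambda x: 0.1 * math.sqrt(max(x, 0.0)), lambda x: 0.02 - 0.3 * x, 0.1, 200)`.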
Using the large deviation results obtained in the previous subsection, we can directly obtain a LDP for the log-price
$$
X_t = \int_0^t f(Z_s)\big(\rho\, \mathrm dW_s + \bar\rho\, \mathrm d\bar W_s\big) - \frac12\int_0^t f^2(Z_s)\,\mathrm ds.
$$
For square-integrable $h$, let $z^h$ be the unique solution to the integral equation
$$
z^h(t) = z_0 + \int_0^t K(s,t)\, u\big(z^h(s)\big)\, h(s)\,\mathrm ds.
$$

Corollary 5.5.
Let $H \in (0, 1/2]$ and $f$ smooth (without boundedness assumption). Then $t^{H-1/2} X_t$ satisfies a LDP with speed $t^{2H}$ and rate function given by
$$
I(x) = \inf_{h \in L^2([0,1])}\Big\{ \tfrac12 \|h\|^2_{L^2} + \frac{\big(x - I_1^z(h)\big)^2}{2\, I_2^z(h)}\Big\} \tag{5.8}
$$
where
$$
I_1^z(h) = \rho\int_0^1 f\big(z^h(s)\big)\,h(s)\,\mathrm ds, \qquad I_2^z(h) = \bar\rho^2 \int_0^1 f^2\big(z^h(s)\big)\,\mathrm ds.
$$

Remark. Despite our previous limitation to $H > 1/4$, the approach extends to any $H > 0$.
Proof.
Ignoring the second part $-\frac12\int_0^t f^2(Z_s)\,\mathrm ds$ in $X_t$, which is $O(t)$ and hence negligible after the rescaling since $f$ is bounded, we let $\widehat X_t = \int_0^t f(Z_s)(\rho\, \mathrm dW_s + \bar\rho\, \mathrm d\bar W_s)$, and by scaling we see that $t^{H-1/2}\widehat X_t \stackrel{d}{=} \widehat X^\delta_1$, where $\delta = t^H$ and $\widehat X^\delta, Z^\delta$ are defined in the same way as $\widehat X, Z$, with $W, \bar W$ replaced by $\delta W, \delta\bar W$ and $v$ replaced by a rescaled $v^\delta$ (vanishing as $\delta \to 0$). We then note that
$$
\widehat X^\delta_1 = \big\langle \mathcal R^\delta\big(F(Z^\delta)(\rho\,\Xi + \bar\rho\,\bar\Xi)\big),\, \mathbf 1_{[0,1]}\big\rangle =: \Psi(\Pi^\delta, v^\delta),
$$
where $\Psi$ is locally Lipschitz by Theorem 5.3. We can then directly use the fact that the $\Pi^\delta$ satisfy a LDP (Theorem 4.2), together with a contraction principle such as Lemma 3.3 in [37], to obtain that $\widehat X^\delta_1$ satisfies a LDP with rate function
$$
I(x) = \inf\Big\{ \tfrac12\big(\|h\|^2_{L^2} + \|\bar h\|^2_{L^2}\big) :\; x = \Psi\big(\Pi^{(h,\bar h)}, 0\big)\Big\}.
$$
It then suffices to note that $z^h$ is exactly $\mathcal R\mathcal Z$ for $\mathcal Z$ the solution to (5.6) corresponding to the model $\Pi^{(h,\bar h)}$ and with $v \equiv 0$, and to optimize separately over $\bar h$ as in the proof of Corollary 4.3. $\square$

We also have an approximation result:
Corollary 5.7.
Let $H > 1/4$ (for simplicity, but see remark below). Then $Z = \lim Z^\varepsilon$, uniformly on compacts and in probability, where
$$
Z^\varepsilon_t = z_0 + \int_0^t K(s,t)\Big( u(Z^\varepsilon_s)\,\mathrm dW^\varepsilon_s + \big( v(Z^\varepsilon_s) - C^\varepsilon(s)\, u u'(Z^\varepsilon_s)\big)\,\mathrm ds\Big). \tag{5.9}
$$

Remark. Replacing the renormalization function $C^\varepsilon$ by its mean is possible, provided $H$ is sufficiently large.

Remark. In contrast to the previous statement, the above result is more involved for $H \in (0, 1/4]$.

Proof.
Thanks to Theorem 3.13 and Theorem 5.3, it follows from continuity of reconstruction that
$$
Z = \mathcal R\mathcal Z = \lim_{\varepsilon\to 0} \mathcal R^\varepsilon \mathcal Z^\varepsilon,
$$
so that the only thing left to do is to check that $Z^\varepsilon$ solves (5.9). Note that (5.6) implies that one has (omitting the superscript $\varepsilon$ on all normal and calligraphic $Z$'s)
$$
\mathcal Z(t) = Z_t\, \mathbf 1 + u(Z_t)\, \mathcal I(\Xi),
$$
and, with (5.7),
$$
U(\mathcal Z(t))\,\Xi = u(Z_t)\,\Xi + u'u(Z_t)\, \mathcal I(\Xi)\,\Xi.
$$
But then, since $\widehat\Pi^\varepsilon$ is a "smooth" model, in the sense of Remark 3.15 in [31], one has
$$
\mathcal R^\varepsilon\big(U(\mathcal Z^\varepsilon)\Xi\big)(t) = \widehat\Pi^\varepsilon_t\big(U(\mathcal Z^\varepsilon(t))\Xi\big)(t) = u(Z^\varepsilon_t)\big(\widehat\Pi^\varepsilon_t \Xi\big)(t) + u'u(Z^\varepsilon_t)\big(\widehat\Pi^\varepsilon_t\, \Xi\,\mathcal I(\Xi)\big)(t) = u(Z^\varepsilon_t)\,\dot W^\varepsilon(t) - u'u(Z^\varepsilon_t)\, K^\varepsilon(t,t).
$$
Since convolution commutes with reconstruction, cf. Lemma 5.1, it follows that $Z^\varepsilon$ is indeed a solution to (5.9). $\square$

6. Numerical results
We now resume where we left off in Section 3.3 and revisit the case of European option pricing under rough volatility. Building on the theoretical underpinnings of Section 3, we present a concise description of the central algorithm of this paper (for simplicity restricted to the unit time interval) and complement the theoretical convergence rates obtained in previous sections with numerical counterparts. The code used to run the simulations has been made available on .

6.1. Concise description.
Without loss of generality, set time to maturity $T = 1$. We are interested in pricing a European call option with spot $S_0$ and strike $K$ under rough volatility. By conditioning on the path of $W$, the price is given by
$$
C(S_0, K, 1) = \mathbf E\Big[ C_{BS}\Big( S_0 \exp\Big(\rho\, \mathcal I - \frac{\rho^2}{2}\mathcal V\Big),\, K,\, \bar\rho^2\, \mathcal V\Big)\Big], \tag{6.1}
$$
where the computational challenge obviously lies in the efficient simulation of
$$
(\mathcal I, \mathcal V) = \Big( \int_0^1 f\big(\widehat W_t, t\big)\,\mathrm dW_t,\ \int_0^1 f^2\big(\widehat W_t, t\big)\,\mathrm dt \Big).
$$
As explored in Subsection 3.3, we take a Wong-Zakai-style approach to simulating $\mathcal I$; that is, we approximate the white noise process $\dot W$ on the Haar grid as follows. Let $\{Z_i\}_{i=0,\dots,2^N-1} \sim$ iid $N(0,1)$ and choose a Haar grid level $N \in \mathbf N$, so that the step size of the Haar grid is $\varepsilon = 2^{-N}$. Then, for all $t \in [0,1]$ and $i = 0, \dots, 2^N - 1$, we set
$$
\dot W^\varepsilon(t) = \sum_{i=0}^{2^N-1} Z_i\, e^\varepsilon_i(t) \quad\text{where}\quad e^\varepsilon_i(t) = 2^{N/2}\,\mathbf 1_{[i 2^{-N},\, (i+1)2^{-N})}(t), \tag{6.2}
$$
which induces an approximation of the fBm
$$
\widehat W^\varepsilon(t) = \sum_{i=0}^{2^N-1} Z_i\, \widehat e^\varepsilon_i(t), \tag{6.3}
$$
where
$$
\widehat e^\varepsilon_i(t) = \mathbf 1_{t > i2^{-N}}\,\frac{\sqrt{2H}\, 2^{N/2}}{H + 1/2}\Big( \big|t - i2^{-N}\big|^{H+1/2} - \big|t - \min\big((i+1)2^{-N}, t\big)\big|^{H+1/2} \Big). \tag{6.4}
$$
As outlined before, the central issue is that the object $\int_0^1 f(\widehat W^\varepsilon(t), t)\,\dot W^\varepsilon(t)\,\mathrm dt$ does not converge in an appropriate sense to the object of interest $\mathcal I$ as $\varepsilon \to 0$. This is overcome by renormalizing the object, two possible approaches to which were explored in Subsection 3.3. For the remainder, we will consider the 'simpler' renormalized object given by
$$
\tilde{\mathcal I}^\varepsilon = \int_0^1 f\big(\widehat W^\varepsilon(t), t\big)\,\dot W^\varepsilon(t)\,\mathrm dt - \int_0^1 C^\varepsilon(t)\,\partial_1 f\big(\widehat W^\varepsilon(t), t\big)\,\mathrm dt, \tag{6.5}
$$
where the renormalization function $C^\varepsilon(t)$ can be one of
$$
C^\varepsilon(t) = \begin{cases} 2^N\, \dfrac{\sqrt{2H}}{H+1/2}\, \big| t - \lfloor t\, 2^N \rfloor 2^{-N}\big|^{H+1/2} & \text{(non-constant)},\\[6pt] \dfrac{\sqrt{2H}}{(H+1/2)(H+3/2)}\, 2^{N(1/2-H)} & \text{(constant).} \end{cases} \tag{6.6}
$$
Coming back to the original question of simulating $(\mathcal I, \mathcal V)$, we just argued that what we really need to simulate, to achieve convergence in a suitable sense, is the object $(\tilde{\mathcal I}^\varepsilon, \mathcal V^\varepsilon)$, the expressions of which are collected below (note that, under the non-constant renormalization, the expression (6.5) for $\tilde{\mathcal I}^\varepsilon$ has been rewritten in a form more suitable for efficient simulation):
$$
\tilde{\mathcal I}^\varepsilon = \sum_{i=0}^{2^N-1} \int_{i2^{-N}}^{(i+1)2^{-N}} \Big[ Z_i\, 2^{N/2}\, f\big(\widehat W^\varepsilon(t), t\big) - \frac{\sqrt{2H}\,2^N}{H+1/2}\,\big|t - i2^{-N}\big|^{H+1/2}\,\partial_1 f\big(\widehat W^\varepsilon(t), t\big)\Big]\,\mathrm dt, \tag{6.7}
$$
$$
\mathcal V^\varepsilon = \sum_{i=0}^{2^N-1} \int_{i2^{-N}}^{(i+1)2^{-N}} f^2\big(\widehat W^\varepsilon(t), t\big)\,\mathrm dt. \tag{6.8}
$$

6.2. Numerical convergence rates.
Algorithm 1: Simulation of $M$ samples of $(\tilde{\mathcal I}^\varepsilon, \mathcal V^\varepsilon)$

Parameters: $M \in \mathbf N$: number of Monte Carlo samples; $N \in \mathbf N$: Haar grid 'level' such that $\varepsilon = 2^{-N}$; $d \in \mathbf N$: number of discretization points per Haar subinterval.
Output: $M$ samples of the bivariate object $(\tilde{\mathcal I}^\varepsilon, \mathcal V^\varepsilon)$.

    initialize $\tilde{\mathcal I}^\varepsilon = \mathcal V^\varepsilon = 0 \in \mathbf R^M$;
    simulate an array $Z \in \mathbf R^{M \times 2^N}$ of iid standard normals;
    for each Haar subinterval $[i 2^{-N}, (i+1) 2^{-N})$, where $i \in \{0, \dots, 2^N - 1\}$, do
        choose a discretization grid $D_i$ with $d$ points on the Haar subinterval;
        evaluate the functions $\widehat e^\varepsilon_k$, $k = 0, \dots, i$, from (6.4) on $D_i$ to obtain $\widehat e^\varepsilon \in \mathbf R^{(i+1)\times d}$;
        compute $\widehat W^\varepsilon = Z^* \times \widehat e^\varepsilon \in \mathbf R^{M\times d}$, where $Z^* \in \mathbf R^{M\times(i+1)}$ is the truncation of $Z$ to its first $i+1$ columns, such that $\widehat W^\varepsilon$ is an approximation of the fBm on $D_i$;
        evaluate the integrands from equations (6.7), (6.8) on $D_i$ using $\widehat W^\varepsilon$ and the last column of $Z^*$;
        approximate the respective integrals on the subinterval by the trapezoidal rule;
        add the obtained estimates to the running sums $\tilde{\mathcal I}^\varepsilon$ and $\mathcal V^\varepsilon$;
    end
    return $\tilde{\mathcal I}^\varepsilon, \mathcal V^\varepsilon$

In this subsection, we discuss strong convergence of the approximative object $\tilde{\mathcal I}^\varepsilon$ to the actual object of interest $\mathcal I$, as well as weak convergence of the option price itself, as the Haar grid interval size $\varepsilon \to 0$. Specifically, we will be looking at Monte Carlo estimates of our errors; that is, in order to approximate some quantity $\mathbf E[X]$ for some random variable $X$, we will instead be looking at $\frac1M \sum_{i=1}^M X_i$, where the $X_i$ are $M$ iid samples drawn from the same distribution as $X$. In other words, we need to generate $M$ realisations of the bivariate stochastic object $(\tilde{\mathcal I}^\varepsilon, \mathcal V^\varepsilon)$, a task that can be vectorized as described above, thus avoiding expensive looping through realisations.

Strong convergence. We verify Theorem 3.24 (i) numerically, albeit in the $L^2(\Omega)$-sense and, for simplicity, with $f(x,t) = \exp(x)$, i.e. with no explicit time dependence. That is, we are concerned with Monte Carlo approximations of
$$
\Big\| \tilde{\mathcal I}^\varepsilon - \int_0^1 \exp\big(\widehat W_t\big)\,\mathrm dW_t \Big\|_{L^2(\Omega)},
$$
and we expect an error almost of order $\varepsilon^H$.

Remark. We choose $f(x,t) = \exp(x)$ because this closely resembles the rough Bergomi model (see [4] and below). Also, for the simplest non-trivial choice, $f(x,t) = x$, the discretization error is overshadowed by the Monte Carlo error, even for very coarse grids.

Since $(W, \widehat W)$ is a two-dimensional Gaussian process with known covariance structure, it is possible to use the Cholesky algorithm (cf. [4, 5]) to simulate the joint paths on some grid and then use standard Riemann sums to approximate the integral. The value obtained in this way could serve as a reference value for our scheme. However, for strong convergence, we need both objects to be based on the same stochastic sample. For this reason, we find it easier to construct a reference value by the wavelet-based scheme itself, i.e. we simply pick some $\varepsilon' \ll \varepsilon$ and consider
$$
\big\| \tilde{\mathcal I}^\varepsilon - \tilde{\mathcal I}^{\varepsilon'} \big\|_{L^2(\Omega)} \tag{6.9}
$$
as $\varepsilon \to \varepsilon'$. As can be seen in Figures 1 and 2, both renormalization approaches stated in (6.6) are consistent with a theoretical strong rate of almost $H$, across the full range of $0 < H < 1/2$.

Figure 1. Empirical strong errors (6.9) on a log-log scale under the non-constant renormalization, plotted against $\varepsilon = 2^{-N}$ and obtained through Monte Carlo sampling with a fixed trapezoidal rule step $\Delta$ and reference Haar grid fineness $\varepsilon'$. Solid lines visualize the empirical rates of convergence obtained by least squares regression ($H = 0.3$: strong rate 0.35; $H = 0.2$: strong rate 0.25); dashed lines provide visual reference rates (0.3 resp. 0.2). Shaded colour bands show interpolated 95% confidence levels based on normality of the Monte Carlo estimator.

Figure 2. Empirical strong errors (6.9) on a log-log scale under the constant renormalization, obtained in the same way ($H = 0.3$: strong rate 0.33 vs. reference rate 0.3; $H = 0.2$: strong rate 0.24 vs. reference rate 0.2).

Remark. In the absence of a Markovian structure, a proper weak convergence analysis proves to be subtle; that is, an analysis that, for suitable test functions $\varphi$, yields a rate of convergence for
$$
\Big| \mathbf E\big[\varphi\big(\tilde{\mathcal I}^\varepsilon\big)\big] - \mathbf E\Big[\varphi\Big(\int_0^1 \exp\big(\widehat W_t\big)\,\mathrm dW_t\Big)\Big]\Big|,
$$
as $\varepsilon \to 0$, remains an open problem. However, picking $\varphi(x) = x^2$, Itô's isometry yields
$$
\mathbf E\Big[\Big(\int_0^1 \exp\big(\widehat W_t\big)\,\mathrm dW_t\Big)^2\Big] = \int_0^1 \mathbf E\big[\exp\big(2\widehat W_t\big)\big]\,\mathrm dt = \int_0^1 \exp\big(2\, t^{2H}\big)\,\mathrm dt, \tag{6.10}
$$
which can be approximated numerically. So we can consider
$$
\Big| \mathbf E\big[\big(\tilde{\mathcal I}^\varepsilon\big)^2\big] - \int_0^1 \exp\big(2t^{2H}\big)\,\mathrm dt \Big| \tag{6.11}
$$
as $\varepsilon \to 0$. Our preliminary results indicate that, for both renormalization approaches, the weak rate seems to be around the strong rate $H$.

Option pricing. We pick a simplified version of the rough Bergomi model [4], where the instantaneous variance is given by
$$
f(x) = \sigma_0 \exp(\eta x),
$$
with $\sigma_0$ and $\eta$ denoting spot volatility and volatility of volatility, respectively. Let $C^\varepsilon$ denote the approximation of the call price (6.1) based on $(\tilde{\mathcal I}^\varepsilon, \mathcal V^\varepsilon)$, fix some $\varepsilon' \ll \varepsilon$ and consider
$$
\big| C^\varepsilon(S_0, K, T{=}1) - C^{\varepsilon'}(S_0, K, T{=}1)\big| \tag{6.12}
$$
as $\varepsilon \to \varepsilon'$. Empirical results displayed in Figure 3 indicate a weak rate of $2H$ across the full range of $0 < H < 1/2$.

Figure 3. Empirical weak errors (6.12) on a log-log scale as $\varepsilon \to \varepsilon'$, obtained through MC samples with spot $S_0 = 1$, strike $K = 1$, correlation $\rho = -0.8$, spot vol $\sigma_0 = 0.2$, vvol $\eta = 2$ and a fixed trapezoidal rule step $\Delta$. Dashed lines represent least squares estimates for rate estimation ($H = 0.4$: rate 0.83; $H = 0.3$: rate 0.62; $H = 0.2$: rate 0.44; $H = 0.1$: rate 0.30); shaded colour bands show confidence levels based on normality of the Monte Carlo estimator.
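For concreteness, the sampling step of Algorithm 1 and the reference value (6.10) can be sketched in plain Python. This is our own unvectorized transcription, not the authors' code: midpoint quadrature on each Haar cell replaces the trapezoidal rule for brevity, and `f`, `df` are passed in as functions of $x$ only (as in the experiments above, where $f(x,t) = \exp(x)$):

```python
import math
import random

def sample_IV(M, N, d, H, f, df, seed=0):
    """Monte Carlo samples of (I_tilde^eps, V^eps) in the spirit of Algorithm 1,
    under the non-constant renormalization of (6.6)-(6.8), for f = f(x) with
    derivative df. Unvectorized sketch with midpoint quadrature per Haar cell."""
    rng = random.Random(seed)
    n_cells = 2 ** N
    c = math.sqrt(2 * H) / (H + 0.5)
    Z = [[rng.gauss(0.0, 1.0) for _ in range(n_cells)] for _ in range(M)]
    I = [0.0] * M
    V = [0.0] * M
    dt = 2.0 ** (-N) / d
    for i in range(n_cells):
        left = i * 2.0 ** (-N)
        grid = [left + (k + 0.5) * dt for k in range(d)]
        for m in range(M):
            for t in grid:
                # hat W^eps(t) = sum_k Z_k * hat e^eps_k(t), cf. (6.3)-(6.4)
                W = 0.0
                for k in range(i + 1):
                    a = t - k * 2.0 ** (-N)
                    b = max(t - (k + 1) * 2.0 ** (-N), 0.0)
                    W += Z[m][k] * c * 2.0 ** (N / 2) * (a ** (H + 0.5) - b ** (H + 0.5))
                # integrands of (6.7) and (6.8); C is the non-constant choice in (6.6)
                C = 2.0 ** N * c * (t - left) ** (H + 0.5)
                I[m] += (Z[m][i] * 2.0 ** (N / 2) * f(W) - C * df(W)) * dt
                V[m] += f(W) ** 2 * dt
    return I, V

def second_moment_reference(H, n=100000):
    """Ito-isometry reference value (6.10): int_0^1 exp(2 t^(2H)) dt,
    using Var(hat W_t) = t^(2H); midpoint rule."""
    dt = 1.0 / n
    return sum(math.exp(2.0 * ((k + 0.5) * dt) ** (2 * H)) * dt for k in range(n))
```

Comparing the Monte Carlo mean of $(\tilde{\mathcal I}^\varepsilon)^2$ over many samples against `second_moment_reference(H)` then reproduces the weak-error diagnostic (6.11).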
Appendix A. Approximation and renormalization (Proofs)
Lemma A.1.
For $a, b > 0$ and $\delta \in [0,1]$, we have for $x \notin [0,1]$
$$
|a^x - b^x| \le 2^{1-\delta}\, |x|^\delta\, \big(a^{x-\delta} \vee b^{x-\delta}\big)\cdot |a-b|^\delta,
$$
and for $x \in (0,1)$
$$
|a^x - b^x| \le 2^{1-\delta}\, |x|^\delta\, \big(a^{(x-1)\delta}\, b^{x(1-\delta)} \vee b^{(x-1)\delta}\, a^{x(1-\delta)}\big)\cdot |a-b|^\delta.
$$
This follows from interpolation between
$$
|a^x - b^x| \le |x| \sup_{z \in [a\wedge b,\, a\vee b]} z^{x-1}\, |a-b| \le |x|\,\big(a^{x-1} \vee b^{x-1}\big)\, |a-b|
$$
and $|a^x - b^x| \le a^x + b^x \le 2\,(a^x \vee b^x)$. $\square$

Proof of Lemma 3.7.
Rewriting
$$
\widehat W^\varepsilon(t) = \sqrt{2H} \int_{-\infty}^{\infty} \mathrm dW(u) \int_{-\infty}^{\infty} \mathrm dr\, \delta^\varepsilon(r,u)\, |t-r|^{H-1/2}\,\mathbf 1_{r<t},
$$
Lemma A.2.
For $c$ as in Definition 3.5, $t > c\varepsilon$ and $s \in \mathbf R$, we have for $\kappa' \in (0, H)$
$$
\big|K(s-t) - K^\varepsilon(s,t)\big| \lesssim |s-t|^{H - 1/2 - \kappa'}\, \varepsilon^{\kappa'}.
$$
If $2c\varepsilon \ge |s-t|/2$,
$$
\big|K(s-t) - K^\varepsilon(s,t)\big| = \Big| \int_{-\infty}^{\infty} \mathrm du\, \delta^{0,\varepsilon}(t,u)\,(t
We restrict ourselves to the proof of (3.17); the other three inequalities follow by basically the same arguments. We fix a wavelet basis $\phi_y = \phi(\cdot - y)$, $y \in \mathbf Z$, $\psi^j_y = 2^{j/2}\psi(2^j(\cdot - y))$, $j \ge 0$, $y \in 2^{-j}\mathbf Z$, and use in the following the notation $\phi^j_y = 2^{j/2}\phi(2^j(\cdot - y))$, $j \ge 0$, $y \in 2^{-j}\mathbf Z$. Within this basis we can express the $B^\beta_{1,\infty}$ regularity of $\varphi$ by
$$
\sum_{y \in \mathbf Z} \big|(\varphi, \phi_y)_{L^2}\big| + \sup_{j \ge 0} 2^{j\beta} \sum_{y \in 2^{-j}\mathbf Z} 2^{-dj/2} \big|(\varphi, \psi^j_y)_{L^2}\big| \lesssim \|\varphi\|_{B^\beta_{1,\infty}}.
$$
Without loss of generality we can assume that $\lambda = 2^{-j_0}$ is dyadic, so that by scaling
$$
\sum_{y \in 2^{-j_0}\mathbf Z} \big|(\varphi^\lambda_s, \phi^{j_0}_y)_{L^2}\big| + \sup_{j \ge j_0} 2^{(j-j_0)\beta} \sum_{y \in 2^{-j}\mathbf Z} 2^{-(j-j_0)d/2} \big|(\varphi^\lambda_s, \psi^j_y)_{L^2}\big| \lesssim 2^{j_0 d/2}\, \|\varphi\|_{B^\beta_{1,\infty}}. \tag{A.4}
$$
We can now rewrite
$$
(\mathcal R F - \Pi_s F_s)(\varphi^\lambda_s) = \sum_{y \in 2^{-j_0}\mathbf Z} (\mathcal R F - \Pi_s F_s)(\phi^{j_0}_y)\cdot (\phi^{j_0}_y, \varphi^\lambda_s)_{L^2} + \sum_{j \ge j_0} \sum_{y \in 2^{-j}\mathbf Z} (\mathcal R F - \Pi_s F_s)(\psi^j_y)\cdot (\psi^j_y, \varphi^\lambda_s)_{L^2}
$$
$$
= \sum_{y \in 2^{-j_0}\mathbf Z} (\mathcal R F - \Pi_y F_y)(\phi^{j_0}_y)\, (\phi^{j_0}_y, \varphi^\lambda_s)_{L^2} + \sum_{y \in 2^{-j_0}\mathbf Z} \Pi_y (F_y - \Gamma_{ys} F_s)(\phi^{j_0}_y)\, (\phi^{j_0}_y, \varphi^\lambda_s)_{L^2} \tag{A.5}
$$
$$
+ \sum_{j \ge j_0,\, y \in 2^{-j}\mathbf Z} (\mathcal R F - \Pi_y F_y)(\psi^j_y)\, (\psi^j_y, \varphi^\lambda_s)_{L^2} + \sum_{j \ge j_0,\, y \in 2^{-j}\mathbf Z} \Pi_y (F_y - \Gamma_{ys} F_s)(\psi^j_y)\, (\psi^j_y, \varphi^\lambda_s)_{L^2}. \tag{A.6}
$$
Only finitely many terms in (A.5) contribute, which can all be bounded (up to a constant) by $2^{-j_0\gamma} = \lambda^\gamma$. Moreover,
$$
(A.6) \lesssim \sum_{j \ge j_0} 2^{-j\gamma} + \sum_{j\ge j_0} \sum_{A \ni \alpha < \gamma} 2^{-j\alpha}\, 2^{-(\gamma-\alpha)j_0} \sum_{y \in 2^{-j}\mathbf Z} 2^{jd/2} \big|(\varphi^\lambda_s, \psi^j_y)_{L^2}\big| \lesssim \sum_{j\ge j_0} 2^{-j\gamma} + 2^{-\gamma j_0} \sum_{A\ni\alpha<\gamma}\, \sum_{j\ge j_0} 2^{-(j-j_0)\alpha}\, 2^{-(j-j_0)\beta} \lesssim 2^{-j_0\gamma} = \lambda^\gamma,
$$
where we used $\beta + \alpha > 0$, $\alpha \in A$, in the last line. $\square$

Proof of Lemma 3.22.
Note first that, via Taylor's formula, it is easy to check that for scaled Haar wavelets $\varphi^\lambda_s$ and $\gamma \in (0, (M+1)H)$
$$
\mathbf E\Big[ \Big| \int \varphi^\lambda_s(t)\, f\big(\widehat W(t), t\big)\,\mathrm dW(t) - \Pi_s F_\Xi(s)(\varphi^\lambda_s)\Big|^2\Big]^{1/2} \lesssim \lambda^{\gamma - 1/2 - \kappa} \tag{A.7}
$$
uniformly for $s$ in compact sets. The same argument as in the proof of Lemma 3.16 then implies that (A.7) actually holds for compactly supported smooth functions $\varphi$ (or even compactly supported functions in $B^\beta_{1,\infty}(\mathbf R^d)$). Proceeding now as in [31], we choose test functions $\eta, \psi \in C^\infty_c$ with $\eta$ even, $\operatorname{supp}\eta \subseteq B(0,1)$ and $\int \eta(t)\,\mathrm dt = 1$. We then obtain, for $\psi^\delta(s) = \langle \psi, \eta^\delta_s\rangle$,
$$
\mathbf E\Big[\Big| \mathcal R F_\Xi(\psi^\delta) - \int \psi^\delta(t)\, f\big(\widehat W(t),t\big)\,\mathrm dW(t)\Big|^2\Big]^{1/2} = \mathbf E\Big[\Big| \int \mathrm dx\, \psi(x) \Big( \mathcal R F_\Xi(\eta^\delta_x) - \int \eta^\delta_x(t)\, f\big(\widehat W(t),t\big)\,\mathrm dW(t)\Big)\Big|^2\Big]^{1/2} \lesssim \int \mathrm dx\, \psi(x)\, \delta^{\gamma - 1/2 - \kappa} \xrightarrow{\delta \to 0} 0,
$$
where we used (A.7) in the second step. It remains to note that
$$
\int \psi^\delta(t)\, f\big(\widehat W(t),t\big)\,\mathrm dW(t) \xrightarrow{\delta\to 0} \int \psi(t)\, f\big(\widehat W(t),t\big)\,\mathrm dW(t)
$$
in $L^2(\mathbf P)$, and further $\mathcal R F_\Xi(\psi^\delta) \to \mathcal R F_\Xi(\psi)$ a.s. and thus in $L^2(\mathbf P)$. Putting everything together we obtain
$$
\mathbf E\Big[\Big|\mathcal R F_\Xi(\psi) - \int \psi(t)\, f\big(\widehat W(t), t\big)\,\mathrm dW(t)\Big|^2\Big] = 0,
$$
which implies the first statement. For the second identity we proceed in the same way, but making use of Lemma A.3. $\square$

Lemma A.3.
For $F \in L^2(\mathbf P \times \mathrm{Leb})$ we have
$$
\mathbf E\Big[\Big| \int F(t)\,\mathrm dW^\varepsilon(t)\Big|^2\Big] \lesssim \int \mathbf E\big[|F(t)|^2\big]\,\mathrm dt.
$$

Proof.
As a consequence of Definition 3.5, $\int |\delta^\varepsilon(x,y)|\,\mathrm dx$ is bounded uniformly in $\varepsilon$ and $y$. We can therefore normalize $|\delta^\varepsilon(\cdot, r)|$ to a probability density and apply Itô's isometry and Jensen's inequality to
$$
\int F(t)\,\mathrm dW^\varepsilon(t) = \iint \delta^\varepsilon(t,r)\, F(t)\,\mathrm dt\,\mathrm dW(r). \qquad \square
$$

Appendix B. Large deviations proofs
Proof of Lemma 4.1.
The fact that Π h satisfies the algebraic constraints is obvious so we focus onthe analytic ones. The Sobolev embedding L ⊂ C − / yields that ΠΞ, ΠΞ satisfy the right bounds. EGULARITY STRUCTURES & ROUGH VOL 41
Noting that (by e.g. [47, Section 3.1]) $\|K * h\|_{\mathcal{C}^H} \leq C \|h\|_{\mathcal{C}^{-1/2}}$ gives the bound for $\Pi^h \mathcal{I}(\Xi)^m$. Finally, we note that, using the Cauchy--Schwarz inequality,
\[
\Big| \Big\langle \Pi^h_t \Xi \mathcal{I}(\Xi)^m, \varphi^\lambda_x \Big\rangle \Big| = \Big| \int h(s) \big( K * h(s) - K * h(t) \big)^m \varphi^\lambda_x(s)\, ds \Big|
\leq \Big( \sup_{|s - t| \leq \lambda} |K * h(s) - K * h(t)| \Big)^m \|h\|_{L^2} \|\varphi^\lambda_x\|_{L^2} \lesssim \lambda^{mH - 1/2}.
\]
The inequality for $\Pi^h \bar{\Xi} \mathcal{I}(\Xi)^m$ follows in the same way, and the bounds for $\Gamma^h$ also follow. Continuity in $h$ is proved by similar arguments, which we leave to the reader. $\square$

Proof of Theorem 4.2.
The theorem is a special case of results of Hairer--Weber [37] on large deviations for Banach-valued Gaussian polynomials. Let us recall the setting. Let $(B, \mathcal{H}, \mu)$ be an abstract Wiener space, let $\xi$ denote the associated $B$-valued Gaussian random variable, and let $(e_i)$ be an orthonormal basis of $\mathcal{H}$ with $e_i \in B^*$. For a multi-index $\alpha \in \mathbb{N}^{\mathbb{N}}$ with only finitely many nonzero entries, define $H_\alpha(\xi) = \prod_{i \geq 1} H_{\alpha_i}(\langle \xi, e_i \rangle)$, where the $H_n$, $n \geq 0$, are the Hermite polynomials. Given a separable Banach space $E$, the homogeneous Wiener chaos $\mathcal{H}^{(k)}(E)$ is defined as the closure in $L^2(E, \mu)$ of the linear space generated by elements of the form
\[
H_\alpha(\xi)\, y, \qquad |\alpha| = k, \quad y \in E.
\]
Also define the inhomogeneous Wiener chaos $\mathcal{H}_k(E) = \oplus_{i=0}^{k} \mathcal{H}^{(i)}(E)$. Finally, for $\Psi \in \mathcal{H}^{(k)}(E)$ and $h \in \mathcal{H}$ we define $\Psi^{\mathrm{hom}}(h) = \int \Psi(\xi + h)\, \mu(d\xi)$, and for $\Psi = \sum_{i \leq k} \Psi_i \in \mathcal{H}_k(E)$ we let $\Psi^{\mathrm{hom}} = (\Psi_k)^{\mathrm{hom}}$.

Now let $E = \oplus_{\tau \in \mathcal{W}} E_\tau$, where $\mathcal{W}$ is a finite set and each $E_\tau$ is a separable Banach space. Let $\Psi = \oplus_{\tau \in \mathcal{W}} \Psi_\tau$ be a random variable such that each $\Psi_\tau$ is in $\mathcal{H}_{K_\tau}(E_\tau)$. Letting $\Psi^\delta = \oplus_\tau \delta^{K_\tau} \Psi_\tau$, Theorem 3.5 in [37] states that $\Psi^\delta$ satisfies an LDP with rate function given by
\[
I(\Psi) = \inf \Big\{ \tfrac{1}{2} \|h\|^2_{\mathcal{H}} : \Psi = \oplus_{\tau \in \mathcal{W}} \Psi^{\mathrm{hom}}_\tau(h) \Big\}.
\]
In our case, we apply this result with $\mathcal{W} = \big\{ \Xi \mathcal{I}(\Xi)^m, \bar{\Xi} \mathcal{I}(\Xi)^m, \; 0 \leq m \leq M \big\}$, and each $E_\tau$ is the closure of smooth functions $(t, s) \mapsto \Pi_t \tau(s)$ under the norm
\[
\|\Pi \tau\| = \sup_{\lambda, t, \varphi} \lambda^{-|\tau|} \Big| \Big\langle \Pi_t \tau, \varphi^\lambda_t \Big\rangle \Big|.
\]
In order to obtain Theorem 4.2, it then suffices to identify $(\Pi \tau)^{\mathrm{hom}}(h)$, which is done in the following lemma. $\square$

Lemma B.1.
For each $\tau \in \mathcal{W}$ and $h \in \mathcal{H}$, $(\Pi \tau)^{\mathrm{hom}}(h) = \Pi^h \tau$.

Proof. We prove it for $\tau = \Xi \mathcal{I}(\Xi)^m$; the other cases are similar. Note that $\Psi \mapsto \Psi^{\mathrm{hom}}(h)$ is continuous from $\mathcal{H}_k$ to $\mathbb{R}$ for fixed $h$ (by an application of the Cameron--Martin formula), and so it is enough to prove that
\[
\lim_{\varepsilon \to 0} \big( \widehat{\Pi}^\varepsilon \tau \big)^{\mathrm{hom}}(h) = \Pi^h \tau, \tag{B.1}
\]
where $\widehat{\Pi}^\varepsilon$ corresponds to the (renormalized) model with piecewise linear approximation of $\xi$. For any test function $\varphi$, by definition one has $\langle \widehat{\Pi}^\varepsilon_t \tau, \varphi \rangle = -\langle I^\varepsilon, \varphi' \rangle$, where
\[
I^\varepsilon(s) = \int_t^s \big( (K * \xi^\varepsilon)(u) - (K * \xi^\varepsilon)(t) \big)^m \xi^\varepsilon(u)\, du - C^\varepsilon R^\varepsilon_m,
\]
where $R^\varepsilon_m$ is a renormalization term valued in the lower-order chaos $\mathcal{H}_m$, so that by definition it does not play a role in the value of $(\Pi \tau)^{\mathrm{hom}}$. Now note that if $\Phi$ is a Wiener polynomial whose leading-order term is given by $\prod_{i=1}^k \langle \xi, g_i \rangle$ (where the $g_i$ are in $\mathcal{H}$), then $\Phi^{\mathrm{hom}}(h) = \prod_{i=1}^k \langle h, g_i \rangle$. In our case this means that
\[
(I^\varepsilon)^{\mathrm{hom}}(s) = \int_t^s \big( (K * h^\varepsilon)(u) - (K * h^\varepsilon)(t) \big)^m h^\varepsilon(u)\, du,
\]
where $h^\varepsilon = \rho_\varepsilon * h$. In other words, $(\widehat{\Pi}^\varepsilon \tau)^{\mathrm{hom}} = \Pi^{h^\varepsilon} \tau$, and by continuity of $h \mapsto \Pi^h$ we obtain (B.1). $\square$

Appendix C. Proofs of Section 5
The proof of Theorem 5.3 follows from the estimates in the lemmas below, using the standard procedure of taking a time horizon $T$ small enough to obtain a contraction and then iterating. Note that, due to the global boundedness of $u$ and $v$, the estimates are uniform in the starting point $z$, so that one obtains global existence (unlike the typical situation in SPDEs, where the theory only gives local-in-time existence). By translating $u$ and $v$ we can assume w.l.o.g. that the initial condition is $z = 0$. Then the solution will take values in
\[
\mathcal{D}^{\gamma}_{0,T}(\Gamma) := \big\{ F \in \mathcal{D}^{\gamma}_T(\Gamma) : F(0) = 0 \big\}.
\]

Lemma C.1.
For each $F$ and $\widetilde{F}$ in $\mathcal{D}^{\gamma}_{0,T}$ for the respective models $(\Pi, \Gamma)$ and $(\widetilde{\Pi}, \widetilde{\Gamma})$, and for each $\gamma < 1$ and $T \in (0, 1]$, one has
\[
\big|\big|\big| \mathcal{K} F; \mathcal{K} \widetilde{F} \big|\big|\big|_{\mathcal{D}^{\gamma}_T(\Gamma), \mathcal{D}^{\gamma}_T(\widetilde{\Gamma})} \lesssim T^{\eta}\, \big|\big|\big| F; \widetilde{F} \big|\big|\big|_{\mathcal{D}^{\gamma + |\Xi|}_T(\Gamma), \mathcal{D}^{\gamma + |\Xi|}_T(\widetilde{\Gamma})}
\]
for some $\eta > 0$, the proportionality constants depending only on $\gamma$ and the norms of $(\Pi, \Gamma)$ and $(\widetilde{\Pi}, \widetilde{\Gamma})$.

Proof. Note first that if $F$ belongs to $\mathcal{D}^{\gamma}_{0,T}$, then so does $\mathcal{K} F$. Since $K$ is a regularizing kernel of order $\beta := \tfrac{1}{2} + H$ in the sense of [31], it follows along the lines of [31, Sec. 5] that
\[
\big|\big|\big| \mathcal{K} F; \mathcal{K} \widetilde{F} \big|\big|\big|_{\mathcal{D}^{\bar{\gamma}}_T(\Gamma), \mathcal{D}^{\bar{\gamma}}_T(\widetilde{\Gamma})} \lesssim \big|\big|\big| F; \widetilde{F} \big|\big|\big|_{\mathcal{D}^{\gamma + |\Xi|}_T(\Gamma), \mathcal{D}^{\gamma + |\Xi|}_T(\widetilde{\Gamma})},
\]
where we pick $\bar{\gamma} \in (\gamma, 1)$ such that $\bar{\gamma} \leq \gamma + |\Xi| + \beta = \gamma + H - \kappa$. On the other hand, it is clear from the definition of $|||\cdot\,; \cdot|||$ that, since $\mathcal{K} F$ and $\mathcal{K} \widetilde{F}$ vanish at $t = 0$, it holds that
\[
\big|\big|\big| \mathcal{K} F; \mathcal{K} \widetilde{F} \big|\big|\big|_{\mathcal{D}^{\gamma}_T(\Gamma), \mathcal{D}^{\gamma}_T(\widetilde{\Gamma})} \lesssim T^{\eta}\, \big|\big|\big| \mathcal{K} F; \mathcal{K} \widetilde{F} \big|\big|\big|_{\mathcal{D}^{\bar{\gamma}}_T(\Gamma), \mathcal{D}^{\bar{\gamma}}_T(\widetilde{\Gamma})}
\]
for $\eta = \bar{\gamma} - \gamma$. $\square$

Lemma C.2.
Let $G$ (resp. $\widetilde{G}$) be the composition operator corresponding to $g$ (resp. $\widetilde{g}$) $\in \mathcal{C}^{M+2}_b$. Then one has
\[
\big|\big|\big| G(F); \widetilde{G}(\widetilde{F}) \big|\big|\big|_{\mathcal{D}^{\gamma}_T(\Gamma), \mathcal{D}^{\gamma}_T(\widetilde{\Gamma})} \lesssim \| g - \widetilde{g} \|_{\mathcal{C}^{M+2}} + \big|\big|\big| F; \widetilde{F} \big|\big|\big|_{\mathcal{D}^{\gamma}_T(\Gamma), \mathcal{D}^{\gamma}_T(\widetilde{\Gamma})},
\]
the proportionality constants depending only on $\gamma$ and the norms of $(\Pi, \Gamma)$, $(\widetilde{\Pi}, \widetilde{\Gamma})$, $F$, $\widetilde{F}$, $g$, $\widetilde{g}$.

Proof. This follows from the estimate in [31, Theorem 4.16]. The joint continuity is not stated there but is clear from the triangle inequality. $\square$
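The small-horizon contraction scheme invoked at the start of this appendix can be illustrated on a deterministic toy version of the Volterra fixed point, $v(t) = v_0 + \int_0^t (t-s)^{H - 1/2}\, g(v(s))\, ds$ with bounded $g$. The sketch below is entirely ours (function names, the left-point grid scheme, and all parameter choices are illustrative assumptions, and the stochastic and regularity-structure content is ignored); it runs the Picard iteration on a uniform grid and records the sup-distance between successive iterates, which decays geometrically on a short horizon.

```python
import numpy as np

def picard_volterra(v0=1.0, H=0.1, T=0.25, n=200, n_iter=25, g=np.tanh):
    """Picard iteration for v(t) = v0 + int_0^t (t-s)^(H-1/2) g(v(s)) ds.

    Left-point Riemann sums on a uniform grid; the singular kernel
    (t-s)^(H-1/2) is integrable since H > 0. Returns the final iterate
    and the sup-distances between successive iterates.
    """
    dt = T / n
    t = np.linspace(0.0, T, n + 1)
    v = np.full(n + 1, v0)  # start the iteration from the constant v0
    gaps = []
    for _ in range(n_iter):
        gv = g(v)
        v_new = np.full(n + 1, v0)
        for j in range(1, n + 1):
            # kernel weights K(t_j - t_i) for i < j (left-point rule)
            k = (t[j] - t[:j]) ** (H - 0.5)
            v_new[j] = v0 + np.dot(k, gv[:j]) * dt
        gaps.append(float(np.max(np.abs(v_new - v))))
        v = v_new
    return v, gaps
```

On the short horizon $T$ the map is a contraction, so `gaps` shrinks geometrically; restarting from $v(T)$ then extends the solution interval by interval, mirroring the "contract, then iterate" procedure above.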
References

[1] Elisa Alòs, Jorge A. León, and Josep Vives. On the short-time behavior of the implied volatility for jump-diffusion models with stochastic volatility. Finance and Stochastics, 11(4):571–589, 2007.
[2] Elisa Alòs, Jorge A. León, and Josep Vives. On the short-time behavior of the implied volatility for jump-diffusion models with stochastic volatility. Finance and Stochastics, 11(4):571–589, 2007.
[3] Marco Avellaneda, Dash Boyer-Olson, Jérôme Busca, and Peter Friz. Application of large deviation methods to the pricing of index options in finance. Comptes Rendus Mathématique, 336(3):263–266, 2003.
[4] Christian Bayer, Peter Friz, and Jim Gatheral. Pricing under rough volatility. Quantitative Finance, 16(6):887–904, 2016.
[5] Christian Bayer, Peter K. Friz, Archil Gulisashvili, Blanka Horvath, and Benjamin Stemper. Short-time near-the-money skew in rough fractional volatility models. arXiv preprint arXiv:1703.05132, 2017.
[6] Christian Bayer, Peter K. Friz, Sebastian Riedel, and John Schoenmakers. From rough path estimates to multilevel Monte Carlo. SIAM Journal on Numerical Analysis, 54(3):1449–1483, 2016.
[7] Christian Bayer and Peter Laurence. Asymptotics beats Monte Carlo: The case of correlated local vol baskets. Communications on Pure and Applied Mathematics, 67(10):1618–1657, 2014.
[8] Henri Berestycki, Jérôme Busca, and Igor Florent. Computing the implied volatility in stochastic volatility models. Communications on Pure and Applied Mathematics, 57(10):1352–1373, 2004.
[9] Y. Bruned, I. Chevyrev, P. K. Friz, and R. Preiss. A rough path perspective on renormalization. ArXiv e-prints, January 2017.
[10] Y. Bruned, M. Hairer, and L. Zambotti. Algebraic renormalisation of regularity structures. ArXiv e-prints, October 2016.
[11] Ajay Chandra and Martin Hairer. An analytic BPHZ theorem for regularity structures. arXiv preprint arXiv:1612.08138, 2016.
[12] L. Coutin and L. Decreusefond. Stochastic Volterra Equations with Singular Kernels, pages 39–50. Birkhäuser Boston, Boston, MA, 2001.
[13] Mark H. A. Davis and Vicente Mataix-Pastor. Negative Libor rates in the swap market model. Finance and Stochastics, 11(2):181–193, 2007.
[14] J. D. Deuschel, P. K. Friz, A. Jacquier, and S. Violante. Marginal density expansions for diffusions and stochastic volatility I: Theoretical foundations. Comm. Pure Appl. Math., 67(1):40–82, 2014.
[15] J. D. Deuschel, P. K. Friz, A. Jacquier, and S. Violante. Marginal density expansions for diffusions and stochastic volatility II: Applications. Comm. Pure Appl. Math., 67(2):321–350, 2014.
[16] Omar El Euch, Masaaki Fukasawa, and Mathieu Rosenbaum. The microstructural foundations of leverage effect and rough volatility. ArXiv e-prints, September 2016.
[17] Omar El Euch and Mathieu Rosenbaum. The characteristic function of rough Heston models. arXiv preprint arXiv:1609.02108, 2016.
[18] Omar El Euch and Mathieu Rosenbaum. Perfect hedging in rough Heston models. arXiv preprint arXiv:1703.05049, 2017.
[19] Martin Forde and Hongzhong Zhang. Asymptotics for rough stochastic volatility models. SIAM Journal on Financial Mathematics, 8(1):114–145, 2017.
[20] Peter Friz, Stefan Gerhold, and Arpad Pinter. Option pricing in the moderate deviations regime. Mathematical Finance, 2017.
[21] Peter Friz and Sebastian Riedel. Convergence rates for the full Brownian rough paths with applications to limit theorems for stochastic flows. Bulletin des Sciences Mathématiques, 135(6):613–628, 2011.
[22] Peter K. Friz, Benjamin Gess, Archil Gulisashvili, and Sebastian Riedel. The Jain–Monrad criterion for rough paths and applications to random Fourier series and non-Markovian Hörmander theory. Ann. Probab., 44(1):684–738, 2016.
[23] Peter K. Friz and Martin Hairer. A Course on Rough Paths: With an Introduction to Regularity Structures. Springer International Publishing, Cham, 2014.
[24] Masaaki Fukasawa. Asymptotic analysis for stochastic volatility: martingale expansion. Finance and Stochastics, 15(4):635–654, 2011.
[25] Masaaki Fukasawa. Short-time at-the-money skew and rough fractional volatility. Quantitative Finance, 17(2):189–198, 2017.
[26] J. Gatheral and N. N. Taleb. The Volatility Surface: A Practitioner's Guide. Wiley Finance. Wiley, 2006.
[27] Jim Gatheral, Thibault Jaisson, and Mathieu Rosenbaum. Volatility is rough. Preprint, 2014. arXiv:1410.3394.
[28] Denis S. Grebenkov, Dmitry Belyaev, and Peter W. Jones. A multiscale guide to Brownian motion. Journal of Physics A: Mathematical and Theoretical, 49(4):043001, 2016.
[29] Massimiliano Gubinelli. Ramification of rough paths. Journal of Differential Equations, 248(4):693–721, 2010.
[30] Archil Gulisashvili. Large deviation principle for Volterra type fractional stochastic volatility models. In preparation, 2017.
[31] M. Hairer. A theory of regularity structures. Inventiones mathematicae, 198(2):269–504, 2014.
[32] Martin Hairer. Solving the KPZ equation. Ann. of Math. (2), 178(2):559–664, 2013.
[33] Martin Hairer. Introduction to regularity structures. Brazilian Journal of Probability and Statistics, 29(2):175–210, 2015.
[34] Martin Hairer and David Kelly. Geometric versus non-geometric rough paths. Ann. Inst. H. Poincaré Probab. Statist., 51(1):207–251, 2015.
[35] Martin Hairer and Étienne Pardoux. A Wong–Zakai theorem for stochastic PDEs. J. Math. Soc. Japan, 67(4):1551–1604, 2015.
[36] Martin Hairer and Hao Shen. The dynamical sine-Gordon model. Communications in Mathematical Physics, 341(3):933–989, 2016.
[37] Martin Hairer and Hendrik Weber. Large deviations for white-noise driven, nonlinear stochastic PDEs in two and three dimensions. Ann. Fac. Sci. Toulouse Math. (6), 24(1):55–92, 2015.
[38] A. Jacquier, M. S. Pakkanen, and H. Stone. Pathwise large deviations for the rough Bergomi model. ArXiv e-prints, June 2017.
[39] S. Janson. Gaussian Hilbert Spaces. Cambridge Tracts in Mathematics. Cambridge University Press, 1997.
[40] Terry Lyons and Nicolas Victoir. Cubature on Wiener space. Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, 460(2041):169–198, 2004.
[41] Y. Meyer and D. H. Salinger. Wavelets and Operators, volume 1 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, 1995.
[42] Aleksandar Mijatović and Peter Tankov. A new look at short-term implied volatility in asset price models with jumps. Math. Finance, 26(1):149–183, 2016.
[43] Syoiti Ninomiya and Nicolas Victoir. Weak approximation of stochastic differential equations and application to derivative pricing. Applied Mathematical Finance, 15(2):107–121, 2008.
[44] David Nualart. The Malliavin Calculus and Related Topics. Springer Science & Business Media, 2013.
[45] David Nualart and Étienne Pardoux. Stochastic calculus with anticipating integrands. Probability Theory and Related Fields, 78(4):535–581, 1988.
[46] Etienne Pardoux and Philip Protter. Stochastic Volterra equations with anticipating coefficients. Ann. Probab., 18(4):1635–1655, 1990.
[47] Stefan G. Samko, Anatoly A. Kilbas, and Oleg I. Marichev. Fractional Integrals and Derivatives. Gordon and Breach Science Publishers, Yverdon, 1993. Theory and applications, edited and with a foreword by S. M. Nikol'skiĭ, translated from the 1987 Russian original, revised by the authors.