Small-time, large-time and H→0 asymptotics for the Rough Heston model

Abstract

We characterize the behaviour of the Rough Heston model introduced by Jaisson \& Rosenbaum \cite{JR16} in the small-time, large-time and $\alpha\to 1/2$ (i.e. $H\to 0$) limits. We show that the short-maturity smile scales in qualitatively the same way as a general rough stochastic volatility model (cf.\ \cite{FZ17}, \cite{FGP18a} et al.), and the rate function is equal to the Fenchel-Legendre transform of a simple transformation of the solution to the same Volterra integral equation (VIE) that appears in \cite{ER19}, but with the drift and mean reversion terms removed. The solution to this VIE satisfies a space-time scaling property which means we only need to solve this equation for the moment values of $p=1$ and $p=-1$, so the rate function can be efficiently computed using an Adams scheme or a power series, and we compute a power series in the log-moneyness variable for the asymptotic implied volatility which yields tractable expressions for the implied vol skew and convexity. The limiting asymptotic smile in the large-maturity regime is obtained via a stability analysis of the fixed points of the VIE, and is the same as for the standard Heston model in \cite{FJ11}. Finally, using Lévy's convergence theorem, we show that the log stock price $X_t$ tends weakly to a non-symmetric random variable $X^{(1/2)}_t$ as $\alpha\to 1/2$ (i.e. $H\to 0$) whose mgf is also the solution to the Rough Heston VIE with $\alpha=1/2$, and we show that $X^{(1/2)}_t/\sqrt{t}$ tends weakly to a non-symmetric random variable as $t\to 0$, which leads to a non-flat non-symmetric asymptotic smile in the Edgeworth regime. We also show that the third moment of the log stock price tends to a finite constant as $H\to 0$ (in contrast to the Rough Bergomi model discussed in \cite{FFGS20} where the skew flattens or blows up) and the $V$ process converges on pathspace to a random tempered distribution.


arXiv [q-fin.PR]

Small-time and large-time smile behaviour for the Rough Heston model

Martin Forde ∗ Stefan Gerhold † Benjamin Smith ‡ June 24, 2019

Abstract

We characterize the asymptotic small-time and large-time implied volatility smile for the popular Rough Heston model introduced by Jaisson \& Rosenbaum [JR16]. We show that the asymptotic short-maturity smile scales in qualitatively the same way as a general rough stochastic volatility model (cf. [FZ17], [FGP18a] et al.), and the rate function is equal to the Fenchel-Legendre transform of a simple transformation of the solution to the same Volterra integral equation (VIE) that appears in [ER19], but with the drift and mean reversion terms removed. The solution to this VIE satisfies a space-time scaling property which means we only need to solve this equation for the moment values of $p = 1$ and $p = -1$ [...]

[JR16] introduced the Rough Heston stochastic volatility model and showed that the model arises naturally as the large-time limit of a high frequency market microstructure model driven by two nearly unstable self-exciting Poisson processes (otherwise known as Hawkes processes) with a Mittag-Leffler kernel which drives buy and sell orders (a Hawkes process is a generalized Poisson process where the intensity is itself stochastic and depends on the jump history via the kernel). The microstructure model captures the effects of endogeneity of the market, no-arbitrage, buying/selling asymmetry and the presence of metaorders. [ER19] show that the characteristic function of the log stock price for the Rough Heston model is the solution to a fractional Riccati equation which is non-linear (see also [EFR18] and [ER18]), and the variance curve for the model evolves as $d\xi_t(u) = \kappa(u-t)\sqrt{V_t}\,dW_t$, where $\kappa(t)$ is the kernel for the $V_t$ process itself multiplied by a Mittag-Leffler function (see Proposition 2.2 below for a proof of this). Theorem 2.1 in [ER18] shows that a Rough Heston model conditioned on its history up to some time is still a Rough Heston model, but with a time-dependent mean reversion level $\theta(t)$ which depends on the history of the $V$ process.
Using Fréchet derivatives, [ER18] also show that one can replicate a call option under the Rough Heston model if we have tradeable variance swaps at all maturities. More generally, we can replicate any Malliavin differentiable contingent claim under any two-dimensional rough stochastic volatility model with dynamic trading in the stock and dynamic trading in a forward variance contract, using the Clark-Ocone formula for two-dimensional Brownian motion (explicit calculations in this respect are much easier for e.g. the Rough Bergomi and fractional Stein-Stein models than the Rough Heston model, since the latter is defined implicitly).

[EGR18] derive a quick and dirty (albeit useful) trick for approximating the Rough Heston model with a standard Heston model with the vol-of-vol parameter appropriately re-scaled, which comes from matching the second moment of the integrated variance for the two models. [GR18] propose a global Padé-type rational function approximation to the true solution of the Rough Heston VIE which asymptotically agrees with the true solution at small and large maturities, and option pricing via Fourier inversion using this approximation is reported as being fast and accurate.

[GK18] consider the more general class of affine forward variance (AFV) models of the form $d\xi_t(u) = \kappa(u-t)\sqrt{V_t}\,dW_t$ (for which the Rough Heston model is a special case). They show that AFV models arise naturally as the weak limit of a so-called affine forward intensity (AFI) model, where order flow is driven by two generalized Hawkes-type processes with an arbitrary jump size distribution, and we exogenously specify the evolution of the conditional expectation of the intensity at different maturities in the future, akin to a variance curve model. The weak limit here involves letting the jump size tend to zero as the jump intensity tends to infinity in a certain way, and one can argue that an AFI model is more realistic than the bivariate Hawkes model in [ER19], since the latter only allows for jumps of a single magnitude (which correspond to buy/sell orders). Using martingale arguments (which do not require considering a Hawkes process as in the aforementioned El Euch \& Rosenbaum articles) they show that the mgf of the log stock price for the affine variance model satisfies a convolution Riccati equation, or equivalently is a non-linear function of the solution to a VIE. Formally at least, one can also compute the next order term associated with the [GK18] convergence result, which we can view as an expansion around the limiting AFV model; the correction term satisfies a linear VIE and Fourier inversion has to be applied to the correction term for e.g. pricing a call option.

[GGP18] use comparison principle arguments for VIEs to show that the moment explosion time for the Rough Heston model is finite if and only if it is finite for the standard Heston model. [GGP18] also establish upper and lower bounds for the explosion time, show that the critical moments are finite for all maturities, and formally derive refined tail asymptotics for the Rough Heston model using Laplace's method. A recent talk by M. Keller-Ressel (joint work with Majid) states an alternate upper bound for the moment explosion time for the Rough Heston model, based on a comparison with a (deterministic) time-change of the standard Heston model, which they claim is usually sharper than the bound in [GGP18].

Corollary 7.1 in [FGP18a] provides a sharp small-time expansion in the [FZ17] large deviations regime (valid for $x$-values in some interval) for a general class of rough stochastic volatility models using regularity structures, which provides the next order correction to the leading order behaviour obtained in [FZ17], and some earlier intermediate results in Bayer et al. [BFGHS18].

∗ Dept. Mathematics, King’s College London, Strand, London, WC2R 2LS.
† TU Wien, Financial and Actuarial Mathematics, Wiedner Hauptstraße 8/105-1, A-1040 Vienna, Austria.
‡ Dept. Mathematics, King’s College London, Strand, London, WC2R 2LS.

[EFGR18] derive a higher order Edgeworth expansion for implied volatility in the central limit theorem regime where the log moneyness scales as $k\sqrt{t}$ as $t \to 0$ [...] $O(T^{H})$ term, which itself contains an at-the-money, convexity and higher order correction term, which are important effects to capture in practice.

In this article, we establish small-time and large-time large deviation principles for the Rough Heston model, via the solution to a VIE, and we translate these results into asymptotic estimates for call options and implied volatility. The solution to the VIE satisfies a certain scaling property which means we only have to solve the VIE for the moment values of $p = +1$ and $-1$, rather than solving an entire family of VIEs. Using the Lagrange inversion theorem, we also compute the first three terms in the power series for the asymptotic implied volatility $\hat\sigma(x)$. We later derive formal asymptotics for the small-time moderate deviations regime and a formal saddlepoint approximation for European call options in the original [FZ17] large deviations regime which goes to higher order than previous works for rough models, and captures the effect of the mean reversion term and the drift of the log stock price, and we discuss practical issues and limitations of this result. Our higher order expansion is of qualitatively the same form as the higher order expansion for a general model in Theorem 6 in [FGP18a] (their expansion is not known to hold for large $x$-values since in their more general setup there are additional complications with focal points, proving non-degeneracy etc.). For the large time, large log-moneyness regime, we show that the asymptotic smile is the same as for the standard Heston model as in [FJ11], and we briefly outline how one could go about computing the next order term using a saddlepoint approximation, in the same spirit as [FJM11].

In this section, we recall the definition and basic properties and origins of the Rough Heston model, and more general affine and non-affine forward variance models. Most of the results in this section are given in various locations in [ER18], [ER19] and [GK18], but for pedagogical purposes we found it instructive to collate them together in one place.

Let $(\Omega, \mathcal{F}, \mathbb{P})$ denote a probability space with filtration $(\mathcal{F}_t)_{t\ge 0}$ which satisfies the usual conditions, and consider the Rough Heston model for a log stock price process $X_t$ introduced in [JR16]:
\[
dX_t = -\tfrac12 V_t\,dt + \sqrt{V_t}\,dB_t, \qquad
V_t = V_0 + \frac{1}{\Gamma(\alpha)}\int_0^t (t-s)^{\alpha-1}\lambda(\theta - V_s)\,ds + \frac{1}{\Gamma(\alpha)}\int_0^t (t-s)^{\alpha-1}\nu\sqrt{V_s}\,dW_s \tag{1}
\]
for $\alpha \in (\tfrac12, 1)$, $\theta > 0$, $\lambda \ge 0$ and $\nu > 0$, where $W$, $B$ are two $\mathcal{F}_t$-Brownian motions with correlation $\rho \in (-1,1)$. We assume $X_0 = 0$ and zero interest rate without loss of generality, since the law of $X_t - X_0$ is independent of $X_0$.

$\mathbb{E}(V_t)$

Proposition 2.1
\[
\mathbb{E}(V_t) = V_0 - (V_0 - \theta)\int_0^t f^{\alpha,\lambda}(s)\,ds \tag{2}
\]
where $f^{\alpha,\lambda}(t) := \lambda t^{\alpha-1}E_{\alpha,\alpha}(-\lambda t^{\alpha})$, and $E_{\alpha,\beta}(z) := \sum_{n=0}^{\infty}\frac{z^n}{\Gamma(\alpha n+\beta)}$ denotes the Mittag-Leffler function.

Proof. (see also page 7 in [GK18], and Proposition 3.1 in [ER18] for an alternate proof). Let $r(t) = f^{\alpha,\lambda}(t)$. Taking expectations of (1) and using that the expectation of the stochastic integral term is zero, we see that
\[
\mathbb{E}(V_t) = V_0 + \frac{1}{\Gamma(\alpha)}\int_0^t (t-s)^{\alpha-1}\lambda(\theta - \mathbb{E}(V_s))\,ds . \tag{3}
\]
Let $k(t) := \frac{\lambda t^{\alpha-1}}{\Gamma(\alpha)}$ and $f(t) := \mathbb{E}(V_t) - \theta$. Then we can re-write (3) as
\[
f(t) = (V_0 - \theta) - k * f(t) \tag{4}
\]
where $*$ denotes convolution. Now define the resolvent $r(t)$ as the unique function which satisfies $r = k - k * r$. Then we claim that $f(t) = (V_0-\theta) - r * (V_0-\theta)$. To verify the claim, we substitute this expression into (4) to get:
\[
(V_0-\theta) - k * [(V_0-\theta) - r * (V_0-\theta)] = (V_0-\theta) - (V_0-\theta) * (k - k * r)(t) = (V_0-\theta) - (V_0-\theta) * r(t)
\]
so $(V_0-\theta) - k * f(t) = (V_0-\theta) - (V_0-\theta) * r(t) = f(t)$, which is precisely the integral equation we are trying to solve. Taking Laplace transforms of both sides of $k - k * r = r$ we obtain $\hat r = \hat k - \hat k\hat r$, which we can re-arrange as
\[
\hat r = \frac{\hat k}{1+\hat k} = \frac{\lambda z^{-\alpha}}{1+\lambda z^{-\alpha}} = \frac{\lambda}{z^{\alpha}+\lambda}
\]
and the inverse Laplace transform of $\hat r$ is $r(t) = \lambda t^{\alpha-1}E_{\alpha,\alpha}(-\lambda t^{\alpha})$.

$\mathbb{E}(V_u|\mathcal{F}_t)$

Now let $\xi_t(u) := \mathbb{E}(V_u|\mathcal{F}_t)$. Then $\xi_t(u)$ is an $\mathcal{F}_t$-martingale, and
\[
\xi_t(u) = V_0 + \frac{1}{\Gamma(\alpha)}\int_0^u (u-s)^{\alpha-1}\lambda(\theta - \mathbb{E}(V_s|\mathcal{F}_t))\,ds + \frac{1}{\Gamma(\alpha)}\int_0^t (u-s)^{\alpha-1}\nu\sqrt{V_s}\,dW_s .
\]
If $\lambda = 0$, we can re-write this expression as
\[
d\xi_t(u) = \frac{1}{\Gamma(\alpha)}(u-t)^{\alpha-1}\nu\sqrt{V_t}\,dW_t .
\]

Proposition 2.2 (see [ER19]).
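For $\lambda > 0$,
\[
d\xi_t(u) = \kappa(u-t)\sqrt{V_t}\,dW_t = \kappa(u-t)\sqrt{\xi_t(t)}\,dW_t \tag{5}
\]
where $\kappa$ is the inverse Laplace transform of $\hat\kappa(z) = \frac{\nu z^{-\alpha}}{1+\lambda z^{-\alpha}}$, which is given explicitly by
\[
\kappa(x) = \nu x^{\alpha-1}E_{\alpha,\alpha}(-\lambda x^{\alpha}) \sim \frac{1}{\Gamma(\alpha)}\nu x^{\alpha-1} \quad \text{as } x \to 0
\]
(see also page 6 in [GK18] and page 29 in [ER18]).

Proof.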

See Appendix A.
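As a small numerical illustration of Proposition 2.1, the sketch below evaluates $\mathbb{E}(V_t)$ from (2). It relies on the term-by-term integration identity $\int_0^t f^{\alpha,\lambda}(s)\,ds = 1 - E_{\alpha,1}(-\lambda t^{\alpha})$, and it evaluates the Mittag-Leffler function by naive truncation of its defining series, which is only a reasonable choice for moderate values of $\lambda t^{\alpha}$. The function names `mittag_leffler` and `mean_variance` are hypothetical and are not taken from the paper.

```python
import math

def mittag_leffler(alpha, beta, z, terms=200):
    """Truncated series E_{alpha,beta}(z) = sum_n z^n / Gamma(alpha*n + beta).

    Naive summation: adequate for moderate |z| only, since the alternating
    series suffers from cancellation when z is large and negative.
    """
    total = 0.0
    for n in range(terms):
        arg = alpha * n + beta
        if arg > 170.0:  # math.gamma overflows for arguments beyond about 171
            break
        total += z ** n / math.gamma(arg)
    return total

def mean_variance(t, v0, theta, lam, alpha):
    """E(V_t) from (2), using int_0^t f(s) ds = 1 - E_{alpha,1}(-lam * t^alpha)."""
    integral_f = 1.0 - mittag_leffler(alpha, 1.0, -lam * t ** alpha)
    return v0 - (v0 - theta) * integral_f

if __name__ == "__main__":
    # For alpha = 1 the formula reduces to the classical Heston mean
    # theta + (v0 - theta) * exp(-lam * t).
    print(mean_variance(0.5, 0.04, 0.09, 2.0, 1.0))
    print(mean_variance(0.5, 0.04, 0.09, 2.0, 0.6))
```

For larger arguments a dedicated Mittag-Leffler routine would be preferable to the truncated series used here.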

Remark 2.1

From (5), we see that $\xi_t(.)$ is Markov in $\xi_t(.)$. However $V$ is not Markov in itself. We simulate the variance curve at time $t > 0$ using
\[
\xi_t(u) = \xi_0(u) + \int_0^t \kappa(u-s)\sqrt{V_s}\,dW_s
\]
and substituting the expression for $\xi_0(t) = \mathbb{E}(V_t)$ in (2) and the expression for $\kappa(t)$ in Proposition 2.2 (which are both expressed in terms of the Mittag-Leffler function).

2.4 The characteristic function of the log stock price

From Corollary 3.1 in [ER19] (see also Section 5 in [GGP18]), we know that for all $t \ge 0$
\[
\mathbb{E}(e^{pX_t}) = e^{V_0 I^{1-\alpha}f(p,t) + \lambda\theta I^{1}f(p,t)} \tag{6}
\]
for $p$ in some open interval $I \supset [0,1]$, where $f(p,t)$ satisfies
\[
D^{\alpha}f(p,t) = \tfrac12(p^2 - p) + (p\rho\nu - \lambda)f(p,t) + \tfrac12\nu^2 f(p,t)^2 \tag{7}
\]
with initial condition $f(p,0) = 0$, where $I^{\alpha}f$ denotes the fractional integral operator of order $\alpha$ (see e.g. page 16 in [ER19] for definition) and $D^{\alpha}$ denotes the fractional derivative operator of order $\alpha$ (see page 17 in [ER19] for definition).

If we now replace the constant $\theta$ with a time-dependent function $\theta(t)$, then
\[
\mathbb{E}(V_t) = V_0 + \frac{1}{\Gamma(\alpha)}\int_0^t (t-s)^{\alpha-1}\lambda(\theta(s) - \mathbb{E}(V_s))\,ds,
\]
which we can re-arrange as
\[
\mathbb{E}(V_t) - V_0 + \lambda I^{\alpha}\mathbb{E}(V_t) = \lambda I^{\alpha}\theta(t)
\]
so to make this generalized model consistent with a given initial variance curve $\mathbb{E}(V_t)$, we set
\[
\theta(t) = \frac{1}{\lambda}D^{\alpha}\big(\mathbb{E}(V_t) - V_0 + \lambda I^{\alpha}\mathbb{E}(V_t)\big) = \frac{1}{\lambda}D^{\alpha}\big(\mathbb{E}(V_t) - V_0\big) + \mathbb{E}(V_t)
\]
(see also Remark 3.2, Theorem 3.2 and Corollary 3.2 in [ER18]).

We can also consider other models which are not Rough Heston but for which (5) is still satisfied, and models of this form are known as affine forward variance (AFV) models (see [GK18] for an excellent treatise on such models and how to obtain the Rough Heston model as the limit of a market microstructure model driven by a generalized Hawkes process in the small-jump, high jump intensity limit). We can of course integrate (5) and set $u = t$ to get
\[
V_t = \xi_0(t) + \int_0^t \kappa(t-s)\sqrt{V_s}\,dW_s \tag{8}
\]
which generalizes the Rough Heston model. Another well known (and non-affine) variance curve model is the Rough Bergomi model, for which $d\xi_t(u) = \eta(u-t)^{H-\frac12}\xi_t(u)\,dW_t$, or the standard Bergomi model $d\xi_t(u) = \eta e^{-\lambda(u-t)}\xi_t(u)\,dW_t$.

The canonical $n$-dimensional Hawkes process is a generalized Poisson process $(N_t)_{t\ge 0}$ with stochastic intensity given by $\lambda_t = \mu_t + \int_0^t \phi(t-s).dN_s$. Such processes are useful for modelling contagion in finance, and $N_t$ can also be interpreted as a branching process where immigrants arrive at a rate $\mu_t$, immigrants give birth to children at a rate $\phi(t)$, and children give birth to further children at a rate $\phi(t)$.
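To make the intensity $\lambda_t = \mu_t + \int_0^t \phi(t-s)\,dN_s$ concrete, here is a minimal one-dimensional simulation sketch using Ogata's thinning method. It assumes a constant baseline $\mu$ and an exponential kernel $\phi(t) = \eta\,\beta\,e^{-\beta t}$ (so that $\|\phi\|_1 = \eta$); this kernel is chosen purely for simplicity and is not the Mittag-Leffler kernel used in [JR16] and [ER19]. The name `simulate_hawkes` and its parameters are hypothetical.

```python
import math
import random

def simulate_hawkes(mu, eta, decay, horizon, seed=0):
    """Event times of a 1-d Hawkes process on [0, horizon] via Ogata thinning.

    Intensity: mu + sum over past events t_i of eta * decay * exp(-decay * (t - t_i)).
    The kernel has L1 norm eta, so eta < 1 is needed for a non-explosive process.
    """
    rng = random.Random(seed)
    events = []
    t = 0.0
    excitation = 0.0  # self-exciting part of the intensity at the current time t
    while True:
        # Between events the intensity only decays, so its current value is an upper bound.
        lam_bar = mu + excitation
        wait = rng.expovariate(lam_bar)
        if t + wait > horizon:
            break
        t += wait
        excitation *= math.exp(-decay * wait)
        # Accept the candidate with probability (true intensity) / (upper bound).
        if rng.random() * lam_bar <= mu + excitation:
            events.append(t)
            excitation += eta * decay  # jump of the intensity at an accepted event
    return events

if __name__ == "__main__":
    times = simulate_hawkes(mu=1.0, eta=0.5, decay=2.0, horizon=2000.0)
    # The long-run average intensity of a stationary Hawkes process is mu / (1 - eta).
    print(len(times) / 2000.0)
```

In the branching interpretation, accepted events triggered by the baseline $\mu$ are the immigrants and those triggered by the excitation term are their descendants.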
With this interpretation, the average number of descendants of a particular immigrant is the $L^1$ norm of the kernel $\|\phi\|_1$, which is also the average proportion of the population who are children as opposed to immigrants. If this norm is $\ge$ [...] $\|\phi\| <$ [...] $N_t$ is:
\[
\mathbb{E}(e^{iaN_t}) = e^{\int_0^t (C(a,t-s)-1)\mu(s)\,ds}
\]
where $C(a,t)$ satisfies the non-linear integral equation $C(a,t) = e^{ia + \int_0^t \phi(s)C(a,t-s)\,ds}$. Moreover, with a certain choice of parameters these processes can generate price processes which display observed and conjectured stylized features of financial time series such as market endogeneity and buy/sell asymmetry, and one can control the proportion of orders that are so called “metaorders”. The intensity is chosen to satisfy:
\[
\lambda^T_t = \begin{pmatrix}\lambda^{T,+}_t\\ \lambda^{T,-}_t\end{pmatrix} = \hat\mu_T(t)\begin{pmatrix}1\\ 1\end{pmatrix} + \int_0^t \phi^T(t-s).dN^T_s \tag{9}
\]
where $\beta > 0$, $\alpha \in (\tfrac12, 1)$, $\lambda > 0$, $\mu > 0$, $\xi > 0$, $a_T = 1 - \lambda T^{-\alpha}$, $\phi^T = \varphi^T\chi$, $\chi = \frac{1}{\beta+1}\begin{pmatrix}1 & \beta\\ 1 & \beta\end{pmatrix}$, $\varphi^T = a_T\varphi$, $\varphi = f^{\alpha,1}$,
\[
\hat\mu_T(t) = \mu T^{\alpha-1} + \xi\mu T^{\alpha-1}\Big(\frac{1}{1-a_T}\Big(1 - \int_0^t \varphi^T(s)\,ds\Big) - \int_0^t \varphi^T(s)\,ds\Big),
\]
where $f^{\alpha,1}$ is the Mittag-Leffler density function defined in the appendix of [ER19]. Returning to the branching interpretation, the exogenous orders are the immigrants and the endogenous orders are the children. The fact that $1 > a_T \to 1$ [...] $t \in [0,1]$ [...]
\[
X^T_t = \frac{1-a_T}{T^{\alpha}\mu}N^T_{tT}, \qquad \Lambda^T_t = \frac{1-a_T}{T^{\alpha}\mu}\int_0^{tT}\lambda^T_s\,ds, \qquad Z^T_t = \sqrt{\frac{T^{\alpha}\mu}{1-a_T}}\,(X^T_t - \Lambda^T_t).
\]
Building on [JR16], [ER19] show that the processes $(\Lambda^T_t, X^T_t, Z^T_t)_{t\in[0,1]}$ converge in law under the Skorokhod topology to
\[
\Lambda_t = X_t = \int_0^t Y_s\,ds\begin{pmatrix}1\\ 1\end{pmatrix}, \qquad Z_t = \int_0^t \sqrt{Y_s}\begin{pmatrix}dB^1_s\\ dB^2_s\end{pmatrix}
\]
and $Y$ is the unique solution of the rough stochastic differential equation
\[
Y_t = \xi + \frac{1}{\Gamma(\alpha)}\int_0^t (t-s)^{\alpha-1}\lambda(1-Y_s)\,ds + \lambda\sqrt{\frac{1+\beta^2}{\lambda\mu(1+\beta)^2}}\,\frac{1}{\Gamma(\alpha)}\int_0^t (t-s)^{\alpha-1}\sqrt{Y_s}\,dB_s
\]
where $B = \frac{B^1+\beta B^2}{\sqrt{1+\beta^2}}$ and $(B^1, B^2)$ is 2-dimensional Brownian motion. As a corollary, for $\theta > 0$, if $V = \theta Y$ and
\[
P^T_t = \sqrt{\frac{\theta}{2}}\sqrt{\frac{1-a_T}{T^{\alpha}\mu}}\,(N^{T,+}_{tT} - N^{T,-}_{tT}) - \frac{\theta}{2}\,\frac{1-a_T}{T^{\alpha}\mu}\,N^{T,+}_{tT}
\]
then we have similar convergence in distribution of $P^T_{(.)}$ to $P_t = \int_0^t \sqrt{V_s}\,dW_s - \frac12\int_0^t V_s\,ds$, so the Rough Heston model is recovered. The time scale $T$ is of the order of the reciprocal of the price tick size, hence as $T \to \infty$, the price moves more frequently with smaller size. The equality $\mu_T = \mu >$ [...] $/T$ occur $\mu T$ times on average in a unit time interval, maintaining a non-zero contribution from exogenous activity in the limit.

To simplify calculations, we make the following assumption throughout this section:

Assumption 3.1 $\lambda = 0$.

Remark 3.1

The formal higher order Laplace asymptotics in subsection 3.4 indicate that $\lambda$ will not affect the leading order small-time asymptotics, i.e. $\lambda$ will not affect the rate function, as we would expect from previous works on small-time asymptotics for rough stochastic volatility models. The assumption that $\lambda = 0$ is relaxed in the next section where we consider large-time asymptotics.

We now state the main small-time result in the article (recall that $\alpha = H + \frac12$):

Theorem 3.2

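For the Rough Heston model defined in (1), we have
\[
\lim_{t\to 0} t^{2H}\log\mathbb{E}\big(e^{\frac{p}{t^{\alpha}}X_t}\big) = \lim_{t\to 0} t^{2H}\log\mathbb{E}\big(e^{\frac{p}{t^{2H}}\frac{X_t}{t^{1/2-H}}}\big) = \begin{cases}\bar\Lambda(p) & \text{if } T^*(p) > 1\\ +\infty & \text{if } T^*(p) \le 1\end{cases} \tag{10}
\]
where $\bar\Lambda(p) := V_0\Lambda(p)$, $\Lambda(p) := \Lambda(p,1)$, $\Lambda(p,t) := I^{1-\alpha}\psi(p,t)$ and $\psi(p,t)$ satisfies the Volterra differential equation
\[
D^{\alpha}\psi(p,t) = \tfrac12 p^2 + p\rho\nu\psi(p,t) + \tfrac12\nu^2\psi(p,t)^2 \tag{11}
\]
with initial condition $\psi(p,0) = 0$, where $T^*(p) > 0$ is the explosion time for $\psi(p,t)$ which is finite for all $p \ne 0$ (assuming $\nu > 0$). Moreover, the scaling relation in Lemma 3.3 and its Corollary 3.4 inside the main proof below shows that $\Lambda(p) = |p|^{\frac{2H}{\alpha}}\Lambda(\mathrm{sgn}(p), |p|^{\frac{1}{\alpha}})$, so in fact we only need to solve (11) for $p = \pm 1$, and we can re-write (10) in more familiar form as
\[
\lim_{t\to 0} t^{2H}\log\mathbb{E}\big(e^{\frac{p}{t^{\alpha}}X_t}\big) = \lim_{t\to 0} t^{2H}\log\mathbb{E}\big(e^{\frac{p}{t^{2H}}\frac{X_t}{t^{1/2-H}}}\big) = \begin{cases}\bar\Lambda(p) & p \in (p_-, p_+)\\ +\infty & p \notin (p_-, p_+)\end{cases}
\]
where $p_{\pm} = \pm(T^*(\pm 1))^{\alpha}$, so $p_+ > 0$ and $p_- < 0$. Then $X_t/t^{\frac12-H}$ satisfies the LDP as $t \to 0$ with speed $t^{-2H}$ and good rate function $I(x)$ equal to the Fenchel-Legendre transform of $\bar\Lambda$.

Proof. We first consider the following family of re-scaled Rough Heston models:
\[
dX^{\varepsilon}_t = -\tfrac12\varepsilon V^{\varepsilon}_t\,dt + \sqrt{\varepsilon}\sqrt{V^{\varepsilon}_t}\,dB_t, \qquad
V^{\varepsilon}_t = V_0 + \frac{\varepsilon^{\alpha}}{\Gamma(\alpha)}\int_0^t (t-s)^{H-\frac12}\lambda(\theta - V^{\varepsilon}_s)\,ds + \frac{\varepsilon^{H}}{\Gamma(\alpha)}\int_0^t (t-s)^{H-\frac12}\nu\sqrt{V^{\varepsilon}_s}\,dW_s \tag{12}
\]
with $X^{\varepsilon}_0 = 0$, where $H = \alpha - \frac12 \in (0, \frac12]$. Then from Appendix B we know that
\[
(X^{\varepsilon}_{(.)}, V^{\varepsilon}_{(.)}) \stackrel{(d)}{=} (X_{\varepsilon(.)}, V_{\varepsilon(.)}) \tag{13}
\]
(note this actually holds for all $\lambda > 0$, but from here on we set $\lambda = 0$). Proceeding along similar lines to Theorem 4.1 in [FZ17], we let $\tilde X^{\varepsilon}_t$ denote the solution to
\[
d\tilde X^{\varepsilon}_t = \sqrt{\varepsilon}\sqrt{V^{\varepsilon}_t}\,dB_t \tag{14}
\]
with $\tilde X^{\varepsilon}_0 = 0$. From Eq 8 in [ER18] we know that
\[
\mathbb{E}(e^{p\tilde X_t}) = \mathbb{E}^{\mathbb{Q}_p}\big(e^{\frac12 p^2\int_0^t V_s\,ds}\big)
\]
where $\mathbb{Q}_p$ is defined as in [ER18], but under $\mathbb{Q}_p$ the value of the mean reversion speed changes from zero to $\bar\lambda = \rho p\nu$, so $\mathbb{E}(e^{p\tilde X_t}) = e^{V_0 I^{1-\alpha}g(p,t)}$ on some non-empty interval $[0, T^*(p))$, where
\[
D^{\alpha}g(p,t) = \tfrac12 p^2 + p\rho\nu g(p,t) + \tfrac12\nu^2 g(p,t)^2
\]
with $g(p,0) = 0$. Existence and uniqueness of solutions to this kind of fractional differential equation (FDE) is standard (as is their equivalence to VIEs), see [GGP18] for details and references. Now define
\[
g_{\varepsilon}(p,t) := \varepsilon^{1-\alpha}g(p,\varepsilon t).
\]
Setting $s = \varepsilon u$, we see that
\[
I^{1-\alpha}g_{\varepsilon}(p,t) = \frac{1}{\Gamma(1-\alpha)}\int_0^t (t-u)^{-\alpha}\varepsilon^{1-\alpha}g(p,\varepsilon u)\,du = \frac{1}{\Gamma(1-\alpha)}\int_0^{\varepsilon t}(t-s/\varepsilon)^{-\alpha}\varepsilon^{-\alpha}g(p,s)\,ds = \frac{\varepsilon^{\alpha}}{\Gamma(1-\alpha)}\int_0^{\varepsilon t}(\varepsilon t-s)^{-\alpha}\varepsilon^{-\alpha}g(p,s)\,ds = (I^{1-\alpha}g(p,.))(\varepsilon t) \tag{15}
\]
and
\[
\varepsilon^{\alpha}I^{1}g_{\varepsilon}(p,t) = \varepsilon^{\alpha}\int_0^t \varepsilon^{1-\alpha}g(p,\varepsilon u)\,du = \varepsilon^{\alpha}\int_0^{\varepsilon t}\varepsilon^{-\alpha}g(p,s)\,ds = (I^{1}g(p,.))(\varepsilon t). \tag{16}
\]
Thus when $\lambda = 0$, replacing $g(p,t)$ with $g_{\varepsilon}(p,t)$ is tantamount to changing the maturity to $\varepsilon t$ (as opposed to $t$). Combining this observation with the results of Section 5 of [GGP18], we see that
\[
\mathbb{E}(e^{p\tilde X^{\varepsilon}_t}) = \mathbb{E}(e^{p\tilde X_{\varepsilon t}}) = e^{V_0 I^{1-\alpha}g_{\varepsilon}(p,t)} \tag{17}
\]
on some non-empty interval $[0, T^*_{\varepsilon}(p))$. Moreover
\begin{align*}
g_{\varepsilon}(p,t) &= \frac{\varepsilon^{1-\alpha}}{\Gamma(\alpha)}\int_0^{\varepsilon t}(\varepsilon t-s)^{\alpha-1}\big(\tfrac12 p^2 + p\rho\nu g(p,s) + \tfrac12\nu^2 g(p,s)^2\big)\,ds\\
&= \frac{\varepsilon^{1-\alpha}}{\Gamma(\alpha)}\int_0^{t}(\varepsilon t-\varepsilon u)^{\alpha-1}\big(\tfrac12 p^2 + p\rho\nu g(p,\varepsilon u) + \tfrac12\nu^2 g(p,\varepsilon u)^2\big)\,\varepsilon\,du\\
&= \frac{\varepsilon}{\Gamma(\alpha)}\int_0^{t}(t-u)^{\alpha-1}\big(\tfrac12 p^2 + p\rho\nu\varepsilon^{\alpha-1}g_{\varepsilon}(p,u) + \tfrac12\nu^2\varepsilon^{2\alpha-2}g_{\varepsilon}(p,u)^2\big)\,du\\
&= \frac{1}{\Gamma(\alpha)}\int_0^{t}(t-u)^{\alpha-1}\big(\tfrac12\varepsilon p^2 + p\rho\nu\varepsilon^{\alpha}g_{\varepsilon}(p,u) + \tfrac12\nu^2\varepsilon^{2H}g_{\varepsilon}(p,u)^2\big)\,du \tag{18}
\end{align*}
so we see that $g_{\varepsilon}(p,t)$ satisfies
\[
D^{\alpha}g_{\varepsilon}(p,t) = \tfrac12\varepsilon p^2 + \varepsilon^{\alpha}p\rho\nu g_{\varepsilon}(p,t) + \tfrac12\varepsilon^{2H}\nu^2 g_{\varepsilon}(p,t)^2 \tag{19}
\]
with initial condition $g_{\varepsilon}(p,0) = 0$. Now set
\[
g_{\varepsilon}\Big(\frac{p}{\varepsilon^{\alpha}}, t\Big) = \frac{\psi(p,t)}{\varepsilon^{2H}} \tag{20}
\]
(in fact this relationship clearly holds for any function $g$). Setting $p \mapsto \frac{p}{\varepsilon^{\alpha}}$, substituting for $g_{\varepsilon}(\frac{p}{\varepsilon^{\alpha}}, t)$ in (19) and multiplying by $\varepsilon^{2H}$, we find that
\[
D^{\alpha}\psi(p,t) = \tfrac12 p^2 + p\rho\nu\psi(p,t) + \tfrac12\nu^2\psi(p,t)^2 \tag{21}
\]
with $\psi(p,0) = 0$. Moreover, from Propositions 3.2 and 3.4 in [GGP18], we know that $\psi(p,t)$ blows up at some finite time $T^*(p) > 0$, since $\lambda = 0$ so the quadratic $G(p,w) = \tfrac12 p^2 + p\rho\nu w + \tfrac12\nu^2 w^2$ has no real roots (i.e. case A or B in the [GGP18] classification). Moreover, $\psi(p,t)$ is independent of $\varepsilon$ so $T^*_{\varepsilon}(\frac{p}{\varepsilon^{\alpha}})$ (where $T^*_{\varepsilon}(p)$ is defined above) is equal to $T^*(p)$. Thus we see that
\[
\mathbb{E}\big(e^{\frac{p}{\varepsilon^{\alpha}}\tilde X^{\varepsilon}_t}\big) = e^{\frac{1}{\varepsilon^{2H}}V_0 I^{1-\alpha}\psi(p,t)} \tag{22}
\]
for all $t \in [0, T^*(p))$, which we can re-write as
\[
\mathbb{E}\big(e^{\frac{p}{t^{\alpha}}\tilde X_t}\big) = e^{\frac{\bar\Lambda(p)}{t^{2H}}}.
\]
Thus we see that
\[
\lim_{\varepsilon\to 0}\varepsilon^{2H}\log\mathbb{E}\big(e^{\frac{p}{\varepsilon^{\alpha}}\tilde X^{\varepsilon}_t}\big) = V_0 I^{1-\alpha}\psi(p,t) = \bar\Lambda(p,t)
\]
and $\Lambda(p) := \Lambda(p,1) < \infty$ if and only if $T^*(p) > 1$. Thus $\mathbb{E}(e^{\frac{p}{\varepsilon^{\alpha}}\tilde X^{\varepsilon}_1}) = \infty$ if $T^*(p) \le 1$, and otherwise (by (13)) we see that
\[
\lim_{t\to 0}t^{2H}\log\mathbb{E}\big(e^{\frac{p}{t^{\alpha}}\tilde X_t}\big) = \lim_{\varepsilon\to 0}\varepsilon^{2H}\log\mathbb{E}\big(e^{\frac{p}{\varepsilon^{\alpha}}\tilde X^{\varepsilon}_1}\big) = V_0 I^{1-\alpha}\psi(p,1) = \bar\Lambda(p).
\]

Lemma 3.3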

We have the scaling relation for $t \in [0, T^*(p)]$:
\[
\Lambda(p,t) = t^{-2H}\Lambda(pt^{\alpha}, 1) = t^{-2H}\Lambda(pt^{\alpha}). \tag{23}
\]

Proof.

See Appendix C (as a sanity check we note that (23) is satisfied by the function $\Lambda(p,t) = \frac12 p^2 t$, i.e. the solution when $\nu = 0$).

Corollary 3.4
\[
\Lambda(q) = t^{2H}\Lambda\Big(\frac{q}{t^{\alpha}}, t\Big) = (t^*_q)^{2H}\Lambda(\mathrm{sgn}(q), t^*_q) = |q|^{\frac{2H}{\alpha}}\Lambda(\mathrm{sgn}(q), |q|^{\frac{1}{\alpha}}) \tag{24}
\]
where we have set $|p| = 1 = \frac{|q|}{t^{\alpha}}$, and $t^*_q = |q|^{\frac{1}{\alpha}}$.

Remark 3.2

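This implies that $\Lambda(p) \to \infty$ as $p \to p_{\pm} := \pm(T^*(\pm 1))^{\alpha}$, and more generally
\[
pT^*(p)^{\alpha} = 1_{p>0}\,p_+ + 1_{p<0}\,p_- . \tag{25}
\]

To prove the LDP, we first prove the corresponding LDP for $\tilde X_t$. From Lemma 2.3.9 in [DZ98], we know that $\lim_{t\to 0}t^{2H}\log\mathbb{E}(e^{\frac{p}{t^{\alpha}}\tilde X_t}) = \bar\Lambda(p) = V_0\Lambda(p,1) = V_0 I^{1-\alpha}\psi(p,t)|_{t=1}$ is convex in $p$, and from (21) we know that
\[
\frac{d}{dt}\Lambda(p,t) = \tfrac12 p^2 + p\rho\nu\psi(p,t) + \tfrac12\nu^2\psi(p,t)^2
\]
which shows that $\Lambda(p,t)$ is also differentiable in $t$, and thus from the scaling property in (23), $\Lambda(p) = \Lambda(p,1)$ is differentiable in $p$. We also know that $\psi(p,t) \to \infty$ as $t \to T^*(p)$ (see Propositions 3.2 and 3.4 in [GGP18]), so $\Lambda(p,t) = I^{1-\alpha}\psi(p,t)$ also explodes at $T^*(p)$ by Lemma 3.8 in [GGP18]. Then from Corollary 3.4, we know that $\Lambda(p) = |p|^{\frac{2H}{\alpha}}\Lambda(\mathrm{sgn}(p), |p|^{\frac{1}{\alpha}})$, so $\Lambda(p) \to \infty$ as $p \to p_{\pm} = \pm(T^*(\pm 1))^{\alpha}$ and (by convexity and differentiability) $\Lambda$ is also essentially smooth, so by the Gärtner-Ellis theorem from large deviations theory (see Theorem 2.3.6 in [DZ98]), $\tilde X^{\varepsilon}_1/\varepsilon^{\frac12-H}$ satisfies the LDP as $\varepsilon \to 0$ with speed $\varepsilon^{-2H}$ and rate function $I(x)$.

Moreover, using that
\[
\mathbb{E}\big(e^{\frac{p}{\varepsilon^{2\alpha}}\varepsilon\int_0^1 V^{\varepsilon}_s\,ds}\big) = \mathbb{E}\big(e^{\frac{p}{\varepsilon^{2H}}\int_0^1 V^{\varepsilon}_s\,ds}\big) = e^{\frac{1}{\varepsilon^{2H}}V_0 I^{1-\alpha}\phi(p,1)}
\]
for $p \in (-\infty, \hat p)$ (and infinity otherwise), where $\hat p$ is the value of $\frac12 p_+^2$ for $\rho = 0$ and $D^{\alpha}\phi(p,t) = p + \tfrac12\nu^2\phi(p,t)^2$ with $\phi(p,0) = 0$ (see also (22) and Theorem 3.2 in [ER18]), we find that
\[
J(p) = \lim_{\varepsilon\to 0}\varepsilon^{2H}\log\mathbb{E}\big(e^{\frac{p}{\varepsilon^{2\alpha}}\varepsilon\int_0^1 V^{\varepsilon}_s\,ds}\big) = \lim_{\varepsilon\to 0}\varepsilon^{2H}\log\mathbb{E}\big(e^{\frac{p}{\varepsilon^{2H}}\int_0^1 V^{\varepsilon}_s\,ds}\big)
\]
so (again using part a) of the Gärtner-Ellis theorem in Theorem 2.3.6 in [DZ98]), $A_{\varepsilon} := \int_0^1 V^{\varepsilon}_s\,ds$ satisfies the upper bound LDP as $\varepsilon \to 0$ with speed $\varepsilon^{-2H}$ and good rate function $J^*$ equal to the FL transform of $J$. But we also know that $X^{\varepsilon}_1 - \tilde X^{\varepsilon}_1 = -\frac12\varepsilon A_{\varepsilon}$, and for any $a > 0$, $\delta > 0$ and $\delta_1 > 0$
\[
\mathbb{P}\Big(\Big|\frac{X^{\varepsilon}_1}{\varepsilon^{\frac12-H}} - \frac{\tilde X^{\varepsilon}_1}{\varepsilon^{\frac12-H}}\Big| > \delta\Big) = \mathbb{P}\big(\tfrac12\varepsilon^{\frac12+H}A_{\varepsilon} > \delta\big) = \mathbb{P}\Big(A_{\varepsilon} > \frac{2\delta}{\varepsilon^{\frac12+H}}\Big) \le \mathbb{P}(A_{\varepsilon} > a) \le e^{-\frac{\inf_{a'\ge a}J^*(a') - \delta_1}{\varepsilon^{2H}}}
\]
for $\varepsilon$ sufficiently small, where we have used the upper bound LDP for $A_{\varepsilon}$ to obtain the final inequality. Thus
\[
\limsup_{\varepsilon\to 0}\varepsilon^{2H}\log\mathbb{P}\Big(\Big|\frac{X^{\varepsilon}_1}{\varepsilon^{\frac12-H}} - \frac{\tilde X^{\varepsilon}_1}{\varepsilon^{\frac12-H}}\Big| > \delta\Big) \le -\inf_{a'\ge a}J^*(a')
\]
but $a$ is arbitrary and (from Lemma 2.3.9 in [DZ98]) $J^*$ is a good rate function, so in fact
\[
\limsup_{\varepsilon\to 0}\varepsilon^{2H}\log\mathbb{P}\Big(\Big|\frac{X^{\varepsilon}_1}{\varepsilon^{\frac12-H}} - \frac{\tilde X^{\varepsilon}_1}{\varepsilon^{\frac12-H}}\Big| > \delta\Big) = -\infty .
\]
Thus $X^{\varepsilon}_1/\varepsilon^{\frac12-H}$ and $\tilde X^{\varepsilon}_1/\varepsilon^{\frac12-H}$ are exponentially equivalent in the sense of Definition 4.2.10 in [DZ98], so (by Theorem 4.2.13 in [DZ98]) $X^{\varepsilon}_1/\varepsilon^{\frac12-H}$ satisfies the same LDP as $\tilde X^{\varepsilon}_1/\varepsilon^{\frac12-H}$.

Corollary 3.5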

We have the following limiting behaviour for the price of a European call option with maturity $t$ and log-strike $t^{\frac12-H}x$, with $x > 0$ fixed:
\[
\lim_{t\to 0}t^{2H}\log\mathbb{E}\big((e^{X_t} - e^{xt^{\frac12-H}})^+\big) = -I(x).
\]

Proof.

The lower estimate follows from the exact same argument used in Appendix C in [FZ17] (see also Theorem 6.3 in [FGP18b]). The proof of the upper estimate is the same as in Theorem 6.3 in [FGP18b].

Corollary 3.6

For $x \ne 0$ fixed, the implied volatility satisfies
\[
\hat\sigma(x) := \lim_{t\to 0}\hat\sigma_t(t^{\frac12-H}x) = \frac{|x|}{\sqrt{2I(x)}}. \tag{26}
\]

Proof.

Follows from Corollary 7.2 in [GL14]. See also the proof of Corollary 4.1 in [FGP18b] for details on this, but the present situation is simpler, as we only require the leading order term here.
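The chain from the fractional equation (11) to the asymptotic smile (26) can be sketched numerically as follows. This is a rough illustration under stated assumptions, not the authors' implementation: it solves (11) by a truncated fractional power series $\psi(p,t) = \sum_{n\ge 1}c_n t^{n\alpha}$ (matching powers of $t^{\alpha}$ and using $D^{\alpha}t^{n\alpha} = \frac{\Gamma(n\alpha+1)}{\Gamma((n-1)\alpha+1)}t^{(n-1)\alpha}$), evaluates $\Lambda(p) = (I^{1-\alpha}\psi)(p,1)$ term by term, and computes the Fenchel-Legendre transform by a ternary search over a bounded interval of $p$. The series has a finite radius of convergence, so the sketch is only meaningful for small $|x|$ and for $p$ well inside $(p_-, p_+)$; all function names and the bound `p_max` are hypothetical.

```python
import math

def psi_coeffs(p, rho, nu, alpha, n_terms=40):
    """Coefficients c[1..n_terms] of psi(p,t) = sum_n c[n] * t^(n*alpha) solving (11)."""
    c = [0.0] * (n_terms + 1)  # c[0] is unused
    c[1] = 0.5 * p * p / math.gamma(alpha + 1.0)
    for n in range(1, n_terms):
        # Coefficient of t^(n*alpha) on the right-hand side of (11).
        conv = sum(c[k] * c[n - k] for k in range(1, n))
        rhs = p * rho * nu * c[n] + 0.5 * nu * nu * conv
        ratio = math.exp(math.lgamma(n * alpha + 1.0) - math.lgamma((n + 1) * alpha + 1.0))
        c[n + 1] = rhs * ratio
    return c

def big_lambda(p, rho, nu, alpha, n_terms=40):
    """Lambda(p) = (I^{1-alpha} psi)(p, 1), integrating the series term by term."""
    c = psi_coeffs(p, rho, nu, alpha, n_terms)
    return sum(
        c[n] * math.exp(math.lgamma(n * alpha + 1.0) - math.lgamma(n * alpha + 2.0 - alpha))
        for n in range(1, n_terms + 1)
    )

def rate_function(x, v0, rho, nu, alpha, p_max=1.0, iters=200):
    """I(x) = sup_p (p*x - v0*Lambda(p)), by ternary search on [-p_max, p_max]."""
    def objective(p):
        return p * x - v0 * big_lambda(p, rho, nu, alpha)
    lo, hi = -p_max, p_max
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective(0.5 * (lo + hi))

def implied_vol(x, v0, rho, nu, alpha):
    """Asymptotic implied volatility |x| / sqrt(2 I(x)) from (26), for x != 0."""
    return abs(x) / math.sqrt(2.0 * rate_function(x, v0, rho, nu, alpha))

if __name__ == "__main__":
    for x in (-0.02, -0.01, 0.01, 0.02):
        print(x, implied_vol(x, v0=0.04, rho=-0.6, nu=0.3, alpha=0.6))
```

When $\nu = 0$ the series reduces to $\Lambda(p) = \frac12 p^2$, so the sketch returns the flat smile $\sqrt{V_0}$, which is a convenient sanity check.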

Proceeding as in Lemma 7.2 in [GGP18], we can compute a fractional power series for $\psi(p,t)$ (and hence $\Lambda(p,t)$) and then using (24), we find that
\[
\bar\Lambda(p) = \frac{2V_0}{\nu^2}\sum_{n=1}^{\infty}\frac{a_n(1)\,p^{n+1}\,\Gamma(\alpha n+1)}{\Gamma(2+(n-1)\alpha)}
\]
where the $a_n = a_n(u)$ coefficients are defined (recursively) as in [GGP18] except for our application here (based on (21)) we have to set $\lambda = 0$, and $c_1 = \frac12 u^2$ instead of $\frac12 u(u-1)$ (note this series will have a finite radius of convergence). Using the Lagrange inversion theorem, we can then derive a power series for $I(x)$, and hence for $\hat\sigma(x)$, which takes the form
\[
\hat\sigma(x) = \sqrt{V_0} + \frac{\rho\nu}{2\Gamma(2+\alpha)\sqrt{V_0}}\,x + \frac{\nu^2\Big[\Gamma(1+2\alpha) + 2\rho^2\Gamma(1+\alpha)^2\Big(2 - \frac{3\Gamma(2+2\alpha)}{\Gamma(2+\alpha)^2}\Big)\Big]}{8V_0^{3/2}\,\Gamma(1+\alpha)^2\,\Gamma(2+2\alpha)}\,x^2 + O(x^3) \tag{27}
\]
(compare this to Theorem 3.6 in [BFGHS18] for a general class of rough models and Theorem 4.1 in [FJ11b] for a Markovian local-stochastic volatility model). We can re-write this expansion more concisely in dimensionless form as
\[
\hat\sigma(x) = \sqrt{V_0}\Big[1 + \frac{\rho}{2\Gamma(2+\alpha)}\,z + \frac{\Gamma(1+2\alpha) + 2\rho^2\Gamma(1+\alpha)^2\Big(2 - \frac{3\Gamma(2+2\alpha)}{\Gamma(2+\alpha)^2}\Big)}{8\Gamma(1+\alpha)^2\,\Gamma(2+2\alpha)}\,z^2 + O(z^3)\Big]
\]
where the dimensionless quantity $z = \frac{\nu x}{V_0}$.

Remark 3.3

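In principle one can use (27) to calibrate $V_0$, $\rho$ and $\nu$ to observed/estimated values of $\hat\sigma(0)$, $\hat\sigma'(0)$ and $\hat\sigma''(0)$ (i.e. the short-end implied vol level, skew and convexity respectively). From Eq 3.2 in [RO96], we expect that
\[
\psi(p,t) \approx \frac{\text{const.}}{(T^*(p)-t)^{\alpha}}
\]
as $t \to T^*(p)$ and thus
\[
\Lambda(p,t) = I^{1-\alpha}\psi(p,t) \approx \frac{\text{const.}}{(T^*(p)-t)^{2\alpha-1}}
\]
as $t \to T^*(p)$. Assuming this is consistent with the $p$-asymptotics, then (by (25)) we have
\[
\Lambda(p) = \Lambda(p,1) \approx \frac{\text{const.}}{(T^*(p)-1)^{2\alpha-1}} = \frac{\text{const.}}{\big((\frac{p_+}{p})^{1/\alpha}-1\big)^{2\alpha-1}} \sim \frac{\text{const.}}{(p_+-p)^{2\alpha-1}} \qquad (p \to p_+)
\]
so $p^*(x)$ in $I(x) = \sup_p(px - V_0\Lambda(p))$ satisfies $p^*(x) \approx p_+ - \text{const.}\cdot x^{-\frac{1}{2\alpha}}$, so $I(x) \approx p_+x + \text{const.}\cdot x^{1-\frac{1}{2\alpha}}$ as $x \to \infty$.

3.4 Higher order Laplace asymptotics

If we now relax the assumption that $\lambda = 0$, and work with the original $X^{\varepsilon}$ process in (12) (as opposed to the driftless $\tilde X^{\varepsilon}$ process in (14)), then we know that
\[
\mathbb{E}(e^{pX^{\varepsilon}_t}) = \mathbb{E}(e^{pX_{\varepsilon t}}) = e^{V_0 I^{1-\alpha}g_{\varepsilon}(p,t) + \varepsilon^{\alpha}\lambda\theta I^{1}g_{\varepsilon}(p,t)}
\]
for $t$ in some non-empty interval $[0, T^*_{\varepsilon}(p))$, where now $g_{\varepsilon}(p,t)$ satisfies
\[
D^{\alpha}g_{\varepsilon}(p,t) = \tfrac12\varepsilon(p^2-p) + (p\rho\nu-\lambda)\varepsilon^{\alpha}g_{\varepsilon}(p,t) + \tfrac12\varepsilon^{2H}\nu^2 g_{\varepsilon}(p,t)^2 \tag{28}
\]
with initial condition $g_{\varepsilon}(p,0) = 0$. Setting
\[
g_{\varepsilon}\Big(\frac{p}{\varepsilon^{\alpha}}, t\Big) = \frac{\psi_{\varepsilon}(p,t)}{\varepsilon^{2H}} \tag{29}
\]
and setting $p \mapsto \frac{p}{\varepsilon^{\alpha}}$, and substituting for $g_{\varepsilon}(\frac{p}{\varepsilon^{\alpha}}, t)$ in (28) and multiplying by $\varepsilon^{2H}$ as before, we find that
\[
D^{\alpha}\psi_{\varepsilon}(p,t) = \tfrac12 p^2 + p\rho\nu\psi_{\varepsilon}(p,t) + \tfrac12\nu^2\psi_{\varepsilon}(p,t)^2 - \varepsilon^{\alpha}\big(\tfrac12 p + \lambda\psi_{\varepsilon}(p,t)\big)
\]
with $\psi_{\varepsilon}(p,0) = 0$. If we now formally try a higher order series approximation of the form $\psi_{\varepsilon}(p,t) = \psi(p,t) + \varepsilon^{\frac12+H}\psi_1(p,t)$, we find that $\psi_1(p,t)$ must satisfy
\[
D^{\alpha}\psi_1(p,t) = -\tfrac12 p - \lambda\psi(p,t) + p\rho\nu\psi_1(p,t) + \nu^2\psi(p,t)\psi_1(p,t)
\]
with $\psi_1(p,0) = 0$, which is a linear FDE for $\psi_1(p,t)$.

Remark 3.4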

Setting ψ ( p, t ) = P ∞ n =1 b n ( p ) t αn we see that ∞ X n =1 nα Γ( nα )Γ(1 + ( n − α ) b n ( p ) t ( n − α = − p − λ ∞ X n =1 ¯ a n ( p ) t αn + pρν ∞ X n =1 b n ( p ) t αn + ν ∞ X n =1 ¯ a n ( p ) t αn ∞ X m =1 b m ( p ) t αm where ¯ a n ( p ) = ν a n ( p ), and we have set λ = 0 and c = p in computing the a n ( p ) coeﬃcients, so α Γ( α ) b ( p ) = − p , ( n + 1) α Γ(( n + 1) α )Γ(1 + nα ) b n +1 ( p ) = − λ ¯ a n ( p ) + ρpνb n ( p ) + ν n − X k =1 a k ( p ) b n − k ( p )so we have fractional power series for ψ ( p, t ) on some ﬁnite radius of convergence.Returning now to the main calculation, we see that if p ε ( x ) denotes the density of X ε ε α , then p ε ( xε H ) ∼ π Z ∞−∞ e − ikxε H e ε H ( F ( k )+ ε

12 + H G ( k )+ εαε H λθ ( F ( k )+ ε

12 + H G ( k )) dk where F ( k ) := V I − α ψ ( ik, G ( k ) := V I − α ψ ( ik, F := I ψ ( ik,

1) and G := I ψ ( ik, k ∗ = k ∗ ( x ) = ip ∗ ( x ) of ¯ F ( k ) = − ikx + F ( k ) satisﬁes ¯ F ′ ( k ∗ ) = 0 which always falls on the imaginary axis (and in our case p ∗ ( x ) ∈ (0 , p + ) when x > p ∗ ( x ) < ∈ ( p − ,

0) when x > F ( k ) ≈ ¯ F ( k ∗ ) + 12 F ′′ ( k ∗ )( k − k ∗ ) = ¯ F ( k ∗ ) −

12 ¯Λ ′′ ( p ∗ )( k − k ∗ ) (recall that ¯Λ( p ) = F ( − ip )) and p ∗ = ik ∗ ∈ ( p − , p + ). Then proceeding along similar lines to [FJL12] and using Laplace’smethod we have p ε ( xε H ) ∼ π Z ∞−∞ e ε H ( ¯ F ( k )+ ε

12 + H G ( k ))+ ε − H λθ ( F ( k )+ ε

12 + H G ( k )) dk ∼ π e ε − H ( G ( k ∗ )+ λθF ( k ∗ )) Z ∞−∞ e ε H ( ¯ F ( k ∗ ) − ¯Λ ′′ ( p ∗ )( k − k ∗ ) ) dk ∼ π e ε − H ( G ( k ∗ )+ λθF ( k ∗ )) e − I ( x ) ε H Z ∞−∞ e − ε H ¯Λ ′′ ( p ∗ )( k − k ∗ ) dk ∼ ε H e − I ( x ) ε H p π ¯Λ ′′ ( p ∗ ) [1 + ε − H ( G ( k ∗ ) + λθF ( k ∗ )) + O ( ε (1 − H ) ∧ H )]9 as well, 2nd term in denominator if H ∈ (0 , ), where the O ( ε H ) part of the error terms comes from the next order termin Theorem 7.1 in chapter 4 in [Olv74], and the ε (1 − H ) term comes from the 2nd order term in expanding the exponential.Then letting z = kε α , we see that C ε ( x ) = E (( e X ε − e xε − H ) + ) = 12 π e xε − H Z − ip ∗ + ∞− ip ∗ −∞ Re( e − izxε − H − iz − z E ( e izX ε )) dz = 12 π e xε − H Z − ip ∗ + ∞− ip ∗ −∞ Re( e − i kε H x − i kε α − ( kε α ) E ( e i kεα X ε )) d kε α (30) ∼ ε − α π e xε − H Z − ip ∗ + ∞− ip ∗ −∞ Re( e i kε H x ( − ε α k − i ε α k + O ( ε α )) E ( e i kεα X ε )) dk ∼ ε +2 H e − I ( x ) ε H ( p ∗ ) p π ¯Λ ′′ ( p ∗ ) [1 + ε − H ( x + G ( k ∗ ) + λθF ( k ∗ )) + O ( ε (1 − H ) ∧ H )] . (31)The ε -dependence of the leading order term here is exactly the same as in Corollary 7.1 in the recent article of Friz etal.[FGP18a] (in [FGP18a] ε = t whereas here ε = t ) which deals with a general class of rough stochastic volatility models(which excludes Rough Heston).More generally, we can formally substitute a fractional power series of the form ψ ε ( p, t ) = P ∞ n =0 ψ n ( p, t ) ε ( n +1) α (where ψ ( p, t ) := ψ ( p, t )), and we ﬁnd that ( ψ n ) n ≥ satisﬁes a nested sequence of linear fractional diﬀerential equations: D α ψ ( p, t ) = − p − λψ ( p, t ) + pρνψ ( p, t ) + ν ψ ( p, t ) ψ ( p, t ) D α ψ ( p, t ) = − λψ ( p, t ) + pρνψ ( p, t ) + ν ψ ( p, t ) ψ ( p, t ) + 12 ν ψ ( p, t ) ...D nα ψ n ( p, t ) = − λψ n − ( p, t ) + pρνψ n ( p, t ) + 12 ν [ n X k =0 ψ k ( p, t ) ψ n − k ( p, t ) + 1 n ∈ N · ψ n ( p, t ) ] (32)with ψ n ( p,

0) = 0, and in principle we can then compute fractional power series expansions for each $\psi_n(p,t)$ of the form $\psi_n(p,t) = \sum_{m=1}^{\infty} a_{m,n}(p)\, t^{\alpha m}$, as in Remark 3.4 above.

(31) is of little use in practice, since the leading order Laplace approximation ignores the variation in $k$ of the remaining factors in the integrand. Even if we partially take account of this effect by going to next order with Laplace's method using the formula in Theorem 7.1 in chapter 4 in [Olv74] (which we have checked and tried), it still frequently gives a worse estimate than the leading order estimate $\hat\sigma(x)$, because the higher order error terms being ignored are too large and, since $H$ is usually very small in practice, $t^{H}$ converges very slowly to zero.

If we instead compute an approximate call price using the Fourier integral along the horizontal contour going through the saddlepoint in (30) (using e.g. the NIntegrate command in Mathematica), use our higher order asymptotic estimate $\psi(ik,t) + \varepsilon^{\frac12+H}\psi_1(ik,t)$ for $\log \mathbb{E}(e^{i\frac{k}{\varepsilon^{\alpha}}X_\varepsilon})$, and then compute the exact implied volatility associated with this price (which avoids the problems with the Laplace approximation), then (for the parameters we considered) we found this approximation to be an order of magnitude closer to the Monte Carlo value than the leading order approximation $\hat\sigma(x)$ (see graph and tables below). See [LK07] for more on computing the optimal contour of integration for such problems.

In principle one can use Corollary 7.2 in Gao & Lee [GL14] to translate (31) into an asymptotic estimate for implied volatility. This yields a cumbersome expression which shows that $\hat\sigma_t(x) = \hat\sigma(x) + O(t^{H}\log t)$, but again we have found this approximation to be of little practical use, since the error terms which are ignored are typically too large.
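The final step described above, recovering the exact implied volatility from a numerically computed call price, can be sketched as follows. This is only a minimal illustration and not the authors' Mathematica code: it assumes zero interest rates, a normalised spot $S_0 = 1$ and plain bisection, and the names `bs_call` and `implied_vol` as well as the sample inputs are hypothetical.

```python
import math

def norm_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bs_call(x, sigma, t):
    """Black-Scholes call price with S0 = 1, zero rates, strike e^x, maturity t."""
    if sigma <= 0.0:
        return max(1.0 - math.exp(x), 0.0)
    s = sigma * math.sqrt(t)
    d1 = -x / s + 0.5 * s
    d2 = d1 - s
    return norm_cdf(d1) - math.exp(x) * norm_cdf(d2)

def implied_vol(price, x, t, lo=1e-8, hi=5.0, tol=1e-12):
    """Invert bs_call in sigma by bisection (the price is increasing in sigma)."""
    if not (bs_call(x, lo, t) <= price <= bs_call(x, hi, t)):
        raise ValueError("price outside the range covered by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(x, mid, t) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip on illustrative inputs (assumed values, not taken from the tables).
x, t, sigma = 0.05, 0.5, 0.2
recovered = implied_vol(bs_call(x, sigma, t), x, t)
```

In practice `price` would be the output of the Fourier integral in (30); for very short maturities and strikes away from the money the call price becomes extremely small, so a more careful inversion than this sketch may be needed.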
Inspired by [BFGHS18], if we replace (29) with g ε ( pε q , t ) = ψ ε ( p, t ) ε H − β where q = − H + β , then we find that D α ψ ε ( p, t ) = 12 p − pε − H + β + pε − H +3 β ρνψ ε ( p, t ) − ε − H +4 β λψ ε ( p, t ) + 12 ε − H +6 β ν ψ ε ( p, t ) and we see that all non-constant terms on the right hand side are o (1) as ε → β ∈ ( H, H ) and H ∈ (0 , ). Following similar calculations as above, we formally obtain that lim t → t H − β log E ( e pt H − β Xttq ) = V I − α I α ( p ) = V p for all p ∈ R , which (modulo some rigour) implies that X t /t q satisfies the LDP with speed t H − β and Gaussian rate function I ( x ) = x /V . Note that β = H corresponds to the central limit or Edgeworth regime, see [FSV19] for details.

Figure 1 (panels: Case A, Case B, Case C, Case D): Here we have plotted the quadratic function $G(p,w)$ as a function of $w$ for the four cases described in [GGP18]. In cases A and B there are no roots and the solution $\psi(p,t)$ to (21) increases without bound, whereas in cases C and D we have a stable fixed point (the lesser of the two roots) and an unstable root, so a solution starting at the origin increases (decreases) until it reaches the stable fixed point. For Case D we have also drawn the curve arising from the reflection transformation used in the proof in Appendix D.

Figure 2: Here we have solved for the solution $f(p,t)$ to (7) numerically by discretizing the VIE with 2000 time steps, and plotted $f(p,t)$ as a function of $t$ and the corresponding quadratic function $G(p,w)$ as a function of $w$ with $p$ fixed. In the first case α = . λ = 2, ρ = − . ν = . p = 2 and $f(p,t)$ tends to a finite constant, and in the second case α = . λ = 1, ρ = 0 . ν = 1 and p = 5 and we see that $f(p,t)$ has an explosion time at some T ∗ ( p ) ≈ .

Figure 3: On the left we have plotted $\Lambda(p)$ using an Adams scheme to numerically solve the VIE in (21) with 2000 time steps combined with Corollary 3.4, for α = 0 . V = . ν = . ρ = − .02, and we find that p + = T ∗ (1) ≈ . p − = T ∗ ( − ≈ .25. On the right we have plotted the corresponding asymptotic small-maturity smile $\hat\sigma(x)$ (in blue) versus the higher order approximation using Eq (30) (red "+" signs), and the smile points obtained from a simple Euler-type Monte Carlo scheme with maturity T = . simulations and 1000 time steps in Matlab (grey crosses); Matlab and Mathematica code available on request. We did not use the Adams scheme to compute $\hat\sigma(x)$; rather, we have used the first 15 terms in the series expansion for $\bar\Lambda(p)$ in subsection 3.3, then numerically computed its Fenchel-Legendre transform, and used this to compute $I(x)$ and hence $\hat\sigma(x)$. We see that the Monte Carlo and higher order smile points can barely be distinguished by the naked eye. For $|x|$ small, we have found this method of computing $\hat\sigma(x)$ to be far superior to using an Adams scheme, since the numerical computation of the fractional integral $I^{1-\alpha} f(p,t)$ for | t | ≪ p, 1) close to x = 0.

Figure 4: On the left here we have the same plot as above but with T = .005 and for the right plot T = .005 and α = . H = 0 . α , or larger values of T , | x | or | ρ | , e.g. ρ = − .

65 reported in e.g. [GR18], but the point here is really just to verify the correctness of the asymptotic formula in (26), and give a starting point for other authors/practitioners who wish to test refinements/variants of our formula. We have not repeated numerical results for the large-time case at the current time, since it is intuitively fairly clear that our large maturity formula is correct (since it just boils down to computing the stable fixed point of the VIE) and for maturities ≈ O ( N ) for a rough model (where N is the number of time steps), and it is difficult to verify the formula numerically even for the standard Heston model.

\begin{tabular}{r|ccccc}
$x$ & $\hat\sigma(x)$ & Higher order (Fig. 3, right) & Monte Carlo (Fig. 3, right) & Higher order (Fig. 4, left) & Monte Carlo (Fig. 4, left) \\ \hline
-0.10 & 20.2068\% & 20.2023\% & 20.2020\% & 20.1615\% & 20.1589\% \\
-0.08 & 20.141\% & 20.1364\% & 20.1363\% & 20.0953\% & 20.0931\% \\
-0.06 & 20.0869\% & 20.0822\% & 20.0824\% & 20.0407\% & 20.0388\% \\
-0.04 & 20.045\% & 20.0404\% & 20.0407\% & 19.9986\% & 19.9968\% \\
-0.02 & 20.016\% & 20.0113\% & 20.0119\% & 19.9693\% & 19.9676\% \\
0.00 & 20.0000\% & - & 19.9942\% & - & 19.9513\% \\
0.02 & 19.9973\% & 19.9926\% & 19.9921\% & 19.9503\% & 19.9509\% \\
0.04 & 20.0079\% & 20.0033\% & 20.0029\% & 19.9610\% & 19.9613\% \\
0.06 & 20.0316\% & 20.0270\% & 20.0266\% & 19.9850\% & 19.9850\% \\
0.08 & 20.068\% & 20.0634\% & 20.0629\% & 20.0218\% & 20.0213\% \\
0.10 & 20.1166\% & 20.1120\% & 20.1114\% & 20.0709\% & 20.0699\%
\end{tabular}

Table of numerical results corresponding to the right plot in Figure 3 and the left plot in Figure 4.

In this section, we derive large-time large deviation asymptotics for the Rough Heston model, and we make the following assumption throughout this section:

Assumption 4.1 $\lambda > 0$, $\rho \le 0$.

Recall that $f(p,t)$ in (6) satisfies
\[
D^{\alpha} f(p,t) = H(p, f(p,t)) \tag{33}
\]
subject to $f(p,0) = 0$, where $H(p,w) := \frac12 p^2 - \frac12 p + (p\rho\nu - \lambda) w + \frac12 \nu^2 w^2$. We write
\[
U(p) := \frac{1}{\nu^2}\Big[\lambda - p\rho\nu - \sqrt{\lambda^2 - 2\lambda\rho\nu p + \nu^2 p\,(1 - p\bar\rho^2)}\Big]
\]
for the smallest root of $H(p,\cdot)$, and note that $U(p)$ is real if and only if $p \in [\underline{p}, \bar{p}]$, where
\[
\underline{p} := \frac{\nu - 2\lambda\rho - \sqrt{4\lambda^2 + \nu^2 - 4\lambda\rho\nu}}{2\nu(1-\rho^2)}, \qquad
\bar{p} := \frac{\nu - 2\lambda\rho + \sqrt{4\lambda^2 + \nu^2 - 4\lambda\rho\nu}}{2\nu(1-\rho^2)}.
\]

Proposition 4.2
\[
V(p) := \lim_{t\to\infty} \frac{1}{t} \log \mathbb{E}(e^{pX_t}) =
\begin{cases}
\lambda\theta\, U(p) & p \in [\underline{p}, \bar{p}], \\
+\infty & p \notin [\underline{p}, \bar{p}].
\end{cases}
\]

Proof. [GGP18] show that the explosion time for the Rough Heston model satisfies $T^*(p) < \infty$ if and only if $T^*(p) < \infty$ for the corresponding standard Heston model (i.e. the case $\alpha = 1$).

From the usual quadratic solution formula $\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$, we know that $H(p,\cdot)$ has two distinct real roots (or a single root) if and only if
\[
(\lambda - \rho p \nu)^2 \ge (p^2 - p)\nu^2 \tag{34}
\]
which is the same as the condition $e_1(p) \ge 0$. Since $\rho \le 0$ and $\lambda > 0$ we have $\lambda > \rho\nu$, and we note that $\bar{p}$, $\underline{p}$ are the zeros of $e_1(p)$.

We now have to verify that, under our assumptions that $\lambda > 0$ and $\rho \le 0$, $T^*(p) < \infty$ if and only if $e_1(p) < 0$. We have two cases to consider to verify this claim:

• Suppose $e_1(p) \ge 0$. Then case B in [GGP18] is impossible by definition, $p \in [\underline{p}, \bar{p}]$, and Eq (3.5) in [FJ11] is satisfied. Eq (3.4) in [FJ11] is $\lambda \ge \rho\nu p$ in our current notation, and by the assertion on p.769 in [FJ11] that "(3.4) is implied by (3.5)", we see that it holds, which is equivalent to $e_0(p) < 0$. Therefore, case A is impossible. So we are in the non-explosive cases C or D of the [GGP18] classification. Case C is by definition equivalent now to $c_1(p) > 0$, and by an easy calculation this is equivalent to $U(p) > 0$.

• Suppose $e_1(p) < 0$. By definition we are not in case C. And we have $p \notin [\underline{p}, \bar{p}]$, but from p.769 in [FJ11], we know the interval $[0,1]$ is strictly contained in $[\underline{p}, \bar{p}]$. Hence, case D is also impossible, and we are in the explosive cases A or B.

Hence our claim is verified. We can now re-write (33) in integral form as
\[
f(p,t) = \frac{1}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1} H(p, f(p,s))\, ds.
\]
Clearly, we have $H(p,w) \searrow 0$ as $w \nearrow U(p)$. Assume to begin with that $U(p) > 0$, in which case $0 \le f(p,t) \le U(p)$ (see [GGP18]). Moreover, $w^* = U(p)$ is the smallest root of $H(p,w)$, so $H(p,w) \ge H_\delta := H(p, U(p) - \delta)$ for $w \le U(p) - \delta$ and $\delta \in (0, U(p))$; hence we must have
\[
\frac{H_\delta}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1}\, 1_{f(p,s) \le U(p) - \delta}\, ds < U(p)
\]
for all $t > 0$. This implies that $\frac{H_\delta}{\Gamma(\alpha)} (t-1)^{\alpha-1} \int_1^t 1_{f(p,s) \le U(p)-\delta}\, ds < U(p)$, or equivalently
\[
t - 1 - \int_1^t 1_{f(p,s) > U(p) - \delta}\, ds \le \frac{\Gamma(\alpha)}{H_\delta}\, U(p)\, (t-1)^{1-\alpha}.
\]
Then we see that
\[
\frac{1}{t}\int_0^t f(p,s)\, ds \ge \frac{1}{t}\int_1^t f(p,s)\, ds \ge \frac{1}{t}\int_1^t f(p,s)\, 1_{f(p,s) > U(p)-\delta}\, ds
\ge \frac{1}{t}\,(U(p) - \delta)\Big(t - 1 - \frac{\Gamma(\alpha)}{H_\delta}\, U(p)\,(t-1)^{1-\alpha}\Big) \ge U(p) - 2\delta
\]
for $t$ sufficiently large. Thus $U(p) - 2\delta \le \frac{1}{t}\int_0^t f(p,s)\, ds \le U(p)$, so $\frac{1}{t}\int_0^t f(p,s)\, ds \to U(p)$ as $t \to \infty$. Then using that
\[
\log \mathbb{E}(e^{pX_t}) = V_0\, I^{1-\alpha} f(p,t) + \lambda\theta\, I^{1} f(p,t)
\]
and that $f(p,t)$ is bounded, the result follows. We proceed similarly for the case $U(p) < 0$.

Corollary 4.3 $X_t/t$ satisfies the LDP as $t \to \infty$ with speed $t$ and rate function $V^*(x)$ equal to the Fenchel-Legendre transform of $V(p)$, as for the standard Heston model.

Proof.

Since $U'(p) \to +\infty$ as $p \to \bar{p}$ and $U'(p) \to -\infty$ as $p \to \underline{p}$, the function $\lambda\theta U(p)$ is essentially smooth, so the stated LDP follows from the Gärtner-Ellis theorem in large deviations theory.

Remark 4.1 We can easily add stochastic interest rates into this model by modelling the short rate $r_t$ by an independent Rough Heston process, and proceeding as in [FK16] (we omit the details); see also [F11].

Note that we have not proved that $f(p,t) \to U(p)$, but to establish the leading order behaviour in Proposition 4.2 this is not necessary; rather, we only needed to show that $I^{1-\alpha} f(p,t) \sim tU(p)$. Nevertheless, this convergence would be required to go to higher order, so for completeness we prove this property as well, as a special case of the following general result:

Lemma 4.4 Consider functions $G(y)$ and $K(z)$ which satisfy the following:
• $G(y)$ is analytic and increasing on $[0, y_0]$ and decreasing on $[y_0, \infty)$, where $y_0 \ge 0$;
• $G(0) \ge 0$;
• $K(z)$ is positive, continuous and strictly decreasing for $z > 0$;
• $\int_0^t K(z)\, dz$ is finite for each $t > 0$ and diverges as $t \to \infty$;
• $K(z+\alpha)/K(z)$ is strictly increasing in $z$ for each fixed $\alpha$ greater than zero.
Then the solution to $y(t) = \int_0^t K(t-s)\, G(y(s))\, ds$ is monotonically increasing, and if $G$ has at least one positive root then $y(t)$ converges to the smallest positive root of $G$ as $t \to \infty$.

Proof. See Appendix D.

This lemma can be applied to both cases C and D. As shown in [GGP18], the solution in case C is bounded between zero and the smallest positive root of $G$ (denoted $a$ in that paper), so $G$ need only satisfy the conditions of the above lemma on the interval $[0, a]$, which it does with $y_0 = 0$. For case D, multiplying the defining integral equation by $-1$ and setting $-y(t) \to y(t)$ and $-G(-y(t)) \to G(y(t))$ (see final plot in Figure 1), we recover an integral equation of the desired form (again $G$ need only satisfy the conditions of the lemma over the corresponding interval $[0, a]$).

4.1 Asymptotics for call options and implied volatility

Corollary 4.5

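We have the following large-time asymptotic behaviour for European put/call options in the large-time, large log-moneyness regime:
\begin{align*}
-\lim_{t\to\infty} \frac{1}{t} \log \mathbb{E}(S_t - S_0 e^{xt})^+ &= V^*(x) - x && (x \ge \tfrac12 \bar\theta), \\
-\lim_{t\to\infty} \frac{1}{t} \log\big(S_0 - \mathbb{E}(S_t - S_0 e^{xt})^+\big) &= V^*(x) - x && (-\tfrac12 \theta \le x \le \tfrac12 \bar\theta), \\
-\lim_{t\to\infty} \frac{1}{t} \log \mathbb{E}(S_0 e^{xt} - S_t)^+ &= V^*(x) - x && (x \le -\tfrac12 \theta),
\end{align*}
where $\bar\theta = \frac{\lambda\theta}{\lambda - \rho\nu}$.

Proof.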

See Corollary 2.4 in [FJ11].
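As a numerical sanity check on Proposition 4.2 and Corollaries 4.3 and 4.5, the following sketch evaluates $U(p)$, the endpoints $\underline{p}$, $\bar{p}$, and the Fenchel-Legendre transform $V^*(x) = \sup_{p \in [\underline{p},\bar{p}]} (px - V(p))$. The parameter values are illustrative assumptions (not those used for the figures), the function names are hypothetical, and the supremum is located by a simple ternary search, which relies on $p \mapsto px - V(p)$ being concave on $[\underline{p}, \bar{p}]$.

```python
import math

# Illustrative parameters (assumptions) satisfying Assumption 4.1: lam > 0, rho <= 0.
lam, theta, nu, rho = 1.0, 0.04, 0.5, -0.5
rho_bar_sq = 1.0 - rho * rho

def U(p):
    """Smallest root of H(p, .); real only for p in [p_lo, p_hi]."""
    disc = lam**2 - 2.0 * lam * rho * nu * p + nu**2 * p * (1.0 - p * rho_bar_sq)
    # max(...) guards against tiny negative rounding errors at the endpoints.
    return (lam - p * rho * nu - math.sqrt(max(disc, 0.0))) / nu**2

# Endpoints of the interval on which U(p) is real.
root = math.sqrt(4.0 * lam**2 + nu**2 - 4.0 * lam * rho * nu)
p_lo = (nu - 2.0 * lam * rho - root) / (2.0 * nu * rho_bar_sq)
p_hi = (nu - 2.0 * lam * rho + root) / (2.0 * nu * rho_bar_sq)

def V(p):
    """Limiting cumulant generating function on [p_lo, p_hi]."""
    return lam * theta * U(p)

def V_star(x, iters=200):
    """Fenchel-Legendre transform of V by ternary search over [p_lo, p_hi]."""
    a, b = p_lo, p_hi
    for _ in range(iters):
        m1 = a + (b - a) / 3.0
        m2 = b - (b - a) / 3.0
        if m1 * x - V(m1) < m2 * x - V(m2):
            a = m1
        else:
            b = m2
    p = 0.5 * (a + b)
    return p * x - V(p)

theta_bar = lam * theta / (lam - rho * nu)
```

Since $U(0) = U(1) = 0$, with $V'(0) = -\frac12\theta$ and $V'(1) = \frac12\bar\theta$, the rate function vanishes at $x = -\frac12\theta$ and $V^*(x) - x$ vanishes at $x = \frac12\bar\theta$, which are exactly the regime boundaries in Corollary 4.5; `V_star(-theta / 2)` and `V_star(theta_bar / 2) - theta_bar / 2` should therefore both be numerically zero.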

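Corollary 4.6 We have the following asymptotic behaviour in the large-time, large log-moneyness regime, where $\hat\sigma_t(xt)$ is the implied volatility of a European put/call option with strike $S_0 e^{xt}$:
\[
\hat\sigma^2_\infty(x) = \lim_{t\to\infty} \hat\sigma^2_t(xt) = \frac{\omega_1}{2}\Big(1 + \omega_2 \rho x + \sqrt{(\omega_2 x + \rho)^2 + \bar\rho^2}\Big)
\]
where
\[
\omega_1 = \frac{4\lambda\theta}{\nu^2\bar\rho^2}\Big[\sqrt{(2\lambda - \rho\nu)^2 + \nu^2\bar\rho^2} - (2\lambda - \rho\nu)\Big], \qquad \omega_2 = \frac{\nu}{\lambda\theta}.
\]

Proof.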

See Proposition 1 in [GJ11] (note that for the Rough Heston model $\lambda$ has to be replaced with $\frac{\lambda}{\Gamma(\alpha)}$ and $\nu$ replaced with $\frac{\nu}{\Gamma(\alpha)}$, but the effect of the $\alpha$ here cancels out in the final formula for $\hat\sigma_\infty(x)$).

We can formally try going to higher order; indeed, using the ansatz $f(p,t) = U(p) + U_1(p)\, t^{-\alpha}(1 + o(1))$ for $p \in [\underline{p}, \bar{p}]$, we find that
\[
U_1(p) = \frac{-U(p)}{(\lambda - U(p)\nu^2 - p\rho\nu)\,\Gamma(1-\alpha)}
\]
but if we try to go to higher order again, the fractional derivative on the left hand side of (7) does not exist. Using the same approach as in [FJM11], one should be able to use this to compute a higher order large-time saddlepoint approximation for call options. For the sake of brevity, we defer the details of this to future work.

References

[BFGHS18] Bayer, C., P.K.Friz, A.Gulisashvili, B.Horvath, B.Stemper, "Short-Time Near-The-Money Skew In Rough Fractional Volatility Models", to appear in Quantitative Finance.
[DZ98] Dembo, A. and O.Zeitouni, "Large deviations techniques and applications", Jones and Bartlett publishers, Boston, 1998.
[Dur10] Durrleman, V., "From Implied to Spot Volatilities", Finance and Stochastics, 14(2):157-17, 2010.
[EFGR18] El Euch, O., M.Fukasawa, J.Gatheral and M.Rosenbaum, "Short-term at-the-money asymptotics under stochastic volatility models", to appear in SIAM Journal on Financial Mathematics.
[EFR18] El Euch, O., M.Fukasawa, and M.Rosenbaum, "The microstructural foundations of leverage effect and rough volatility", Finance and Stochastics, 12 (6), p. 241-280, 2009.
[EGR18] El Euch, O., Gatheral, J. and M.Rosenbaum, "Roughening Heston", to appear in Mathematical Finance.
[ER18] El Euch, O. and M.Rosenbaum, "Perfect hedging in Rough Heston models", The Annals of Applied Probability, 28(6), 3813-3856, 2018.
[ER19] El Euch, O. and M.Rosenbaum, "The characteristic function of Rough Heston models", Mathematical Finance, 29(1), 3-38, 2019.
[F11] Forde, M., "Large-time asymptotics for an uncorrelated stochastic volatility model", Statistics & Probability Letters, 81(8), 1230-1232, 2011.
[FJ11] Forde, M. and A.Jacquier, "The Large-maturity smile for the Heston model", Finance and Stochastics, 15, 755-780, 2011.
[FJ11b] Forde, M. and A.Jacquier, "Small-time asymptotics for an uncorrelated Local-Stochastic volatility model", Appl. Math. Finance, 18, 517-535, 2011.
[FJL12] Forde, M., A.Jacquier and R.Lee, "The small-time smile and term structure of implied volatility under the Heston model", SIAM J. Finan. Math., 3, 690-708, 2012.
[FJM11] Forde, M., A.Jacquier and A.Mijatovic, "A note on essential smoothness in the Heston model", Finance and Stochastics, 15, 781-784, 2011.
[FS18] Forde, M., and B.Smith, "Optimal signal trading with temporary, resilient and power law price impact - existence/uniqueness for mean-field predictive FBSDEs", preprint, 2018.
[FSV19] Forde, M., B.Smith and L.Viitasaari, "Rough volatility with CGMY jumps conditioned on a finite history and the Rough Heston model - small-time Edgeworth expansions and the prediction formula for the Riemann-Liouville process", preprint, 2019.
[FZ17] Forde, M. and H.Zhang, "Asymptotics for rough stochastic volatility models", SIAM J. Finan. Math., 8, 114-145, 2017.
[FGP18a] Friz, P.K, P.Gassiat and P.Pigato, "Precise Asymptotics: Robust Stochastic Volatility Models", preprint.
[FGP18b] Friz, P., S.Gerhold and A.Pinter, "Option Pricing in the Moderate Deviations Regime", Mathematical Finance, 28(3), 962-988, 2018.
[FK16] Forde, M. and R.Kumar, "Large-time option pricing using the Donsker-Varadhan LDP - correlated stochastic volatility with stochastic interest rates and jumps", Annals of Applied Probability, 6, 3699-3726, 2016.
[Fuk17] Fukasawa, M., "Short-time at-the-money skew and rough fractional volatility", Quantitative Finance, 17(2), 189-198, 2017.
[GL14] Gao, K., and Lee, R., "Asymptotics of implied volatility to arbitrary order", Finance Stoch., 18, 349-392, 2014.
[GGP18] Gerhold, S., C.Gerstenecker and A.Pinter, "Moment Explosions In The Rough Heston Model", preprint.
[GJ11] Gatheral, J. and A.Jacquier, "Convergence of Heston to SVI", Quant. Finance, 11(8), 1129-1132, 2011.
[GK18] Gatheral, J. and M.Keller-Ressel, "Affine forward variance models", preprint, 2018.
[GR18] Gatheral, J. and R.Radoičić, "Rational approximation of the Rough Heston solution", preprint.
[JR16] Jaisson, T. and M.Rosenbaum, "Rough fractional diffusions as scaling limits of nearly unstable heavy tailed Hawkes processes", The Annals of Applied Probability, 26 (5), 2860-2882, 2016.
[LK07] Lord, R. and C.Kahl, "Optimal Fourier Inversion in Semi-Analytical Option Pricing", Tinbergen Institute Discussion Paper No. 2006-066/2.
[MW51] Mann, W.R. and F.Wolf, "Heat transfer between solids and gases under nonlinear boundary conditions", Quarterly of Applied Mathematics, Vol. 9, No. 2, pp. 163-184, 1951.
[MF71] Miller, R.K. and A.Feldstein, "Smoothness of solutions of Volterra Integral Equations with weakly singular kernels", SIAM J. Math. Anal., Vol. 2, No. 2, 1971.
[Olv74] Olver, F.W., "Asymptotics and Special Functions", Academic Press, 1974.
[RO96] Roberts, C.A., and Olmstead, W.E., "Growth rates for blow-up solutions of nonlinear Volterra equations", Quart. Appl. Math., 54(1): 153-159, 1996.

A Computing the kernel for the Rough Heston variance curve

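Let $Z_t = \int_0^t \sqrt{V_s}\, dW_s$, and we recall that
\begin{align*}
V_t &= V_0 + \frac{1}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1} \lambda(\theta - V_s)\, ds + \frac{1}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1} \nu \sqrt{V_s}\, dW_s \\
&= \tilde\xi(t) - \frac{\lambda}{\nu} (\varphi * V) + \varphi * dZ
\end{align*}
where $*$ denotes the convolution of two functions, $\varphi * dZ = \int_0^t \varphi(t-s)\, dZ_s$, $\tilde\xi(t) = V_0 + \frac{1}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1} \lambda\theta\, ds = V_0 + \frac{\lambda\theta}{\alpha\Gamma(\alpha)}\, t^{\alpha}$, and $\varphi(t) = \frac{\nu}{\Gamma(\alpha)}\, t^{\alpha-1}$. Now define $\kappa$ to be the unique function which satisfies
\[
\kappa = \varphi - \frac{\lambda}{\nu} (\varphi * \kappa). \tag{A-1}
\]
Such a $\kappa$ exists and is known as the resolvent of $\varphi$. Then we see that
\begin{align*}
V_t - \frac{\lambda}{\nu}\, \kappa * V_t &= \tilde\xi(t) - \frac{\lambda}{\nu}\, \varphi * V + \varphi * dZ - \frac{\lambda}{\nu}\, \kappa * \Big[\tilde\xi(t) - \frac{\lambda}{\nu}\, \varphi * V + \varphi * dZ\Big] \\
&= \xi(t) - \frac{\lambda}{\nu} \Big(\varphi - \frac{\lambda}{\nu}\, \kappa * \varphi\Big) * V + \Big(\varphi - \frac{\lambda}{\nu}\, \kappa * \varphi\Big) * dZ \\
&= \xi(t) - \frac{\lambda}{\nu}\, \kappa * V + \kappa * dZ
\end{align*}
where $\xi(t) = \tilde\xi(t) - \frac{\lambda}{\nu}\, \kappa * \tilde\xi(t)$, and we have used (A-1) in the final line. Cancelling the $-\frac{\lambda}{\nu}\, \kappa * V$ terms, we see that
\[
V_t = \xi(t) + \kappa * dZ = \xi(t) + \int_0^t \kappa(t-s) \sqrt{V_s}\, dW_s
\quad\Rightarrow\quad
\xi_t(u) = \mathbb{E}(V_u \,|\, \mathcal{F}_t) = \xi(u) + \int_0^t \kappa(u-s) \sqrt{V_s}\, dW_s
\]
and thus $d\xi_t(u) = \kappa(u-t) \sqrt{V_t}\, dW_t$, i.e. the correct $\kappa$ function is the solution to (A-1). If we take the Laplace transform of (A-1), we get
\[
\hat\kappa(z) = \hat\varphi(z) - \frac{\lambda}{\nu}\, \hat\varphi(z)\, \hat\kappa(z) \tag{A-2}
\]
and (A-2) is now just an algebraic equation, which we can solve explicitly to get $\hat\kappa(z) = \frac{\hat\varphi(z)}{1 + \frac{\lambda}{\nu} \hat\varphi(z)}$. But we know that $\varphi(t) = \frac{\nu}{\Gamma(\alpha)}\, t^{\alpha-1}$, whose Laplace transform is $\hat\varphi(z) = \nu z^{-\alpha}$, so $\hat\kappa(z)$ evaluates to
\[
\hat\kappa(z) = \frac{\nu z^{-\alpha}}{1 + \lambda z^{-\alpha}}.
\]
Then the inverse Laplace transform of $\hat\kappa(z)$ is given by
\[
\kappa(x) = \nu x^{\alpha-1} E_{\alpha,\alpha}(-\lambda x^{\alpha}).
\]

B The re-scaled model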

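We first let
\begin{align*}
dX^{\varepsilon}_t &= -\tfrac12\, \varepsilon V^{\varepsilon}_t\, dt + \sqrt{\varepsilon} \sqrt{V^{\varepsilon}_t}\, dW_t \\
V^{\varepsilon}_t - V_0 &= \frac{\varepsilon^{\gamma}}{\Gamma(\alpha)} \int_0^t (t-s)^{H-\frac12} \lambda(\theta - V^{\varepsilon}_s)\, ds + \frac{\varepsilon^{H}}{\Gamma(\alpha)} \int_0^t (t-s)^{H-\frac12} \nu \sqrt{V^{\varepsilon}_s}\, dW_s \\
&\overset{(d)}{=} \frac{\varepsilon^{\gamma}}{\Gamma(\alpha)} \int_0^t (t-s)^{H-\frac12} \lambda(\theta - V^{\varepsilon}_s)\, ds + \frac{\varepsilon^{H-\frac12}}{\Gamma(\alpha)} \int_0^t (t-s)^{H-\frac12} \nu \sqrt{V^{\varepsilon}_s}\, dW_{\varepsilon s} \\
&= \frac{\varepsilon^{\gamma}}{\Gamma(\alpha)} \int_0^{\varepsilon t} \Big(t - \frac{u}{\varepsilon}\Big)^{H-\frac12} \lambda(\theta - V^{\varepsilon}_{u/\varepsilon})\, \frac{1}{\varepsilon}\, du + \frac{\varepsilon^{H-\frac12}}{\Gamma(\alpha)} \int_0^{\varepsilon t} \Big(t - \frac{u}{\varepsilon}\Big)^{H-\frac12} \nu \sqrt{V^{\varepsilon}_{u/\varepsilon}}\, dW_u
\end{align*}
where we have set $u = \varepsilon s$. Now set $V'_{\varepsilon t} = V^{\varepsilon}_t$. Then
\begin{align*}
V'_{\varepsilon t} - V_0 &= \frac{\varepsilon^{\gamma-1}}{\Gamma(\alpha)} \int_0^{\varepsilon t} \Big(t - \frac{u}{\varepsilon}\Big)^{H-\frac12} \lambda(\theta - V'_u)\, du + \frac{\varepsilon^{H-\frac12}}{\Gamma(\alpha)} \int_0^{\varepsilon t} \Big(t - \frac{u}{\varepsilon}\Big)^{H-\frac12} \nu \sqrt{V'_u}\, dW_u \\
&= \frac{\varepsilon^{\gamma-1} \varepsilon^{\frac12-H}}{\Gamma(\alpha)} \int_0^{\varepsilon t} (\varepsilon t - u)^{H-\frac12} \lambda(\theta - V'_u)\, du + \frac{\varepsilon^{H-\frac12} \varepsilon^{\frac12-H}}{\Gamma(\alpha)} \int_0^{\varepsilon t} (\varepsilon t - u)^{H-\frac12} \nu \sqrt{V'_u}\, dW_u \\
&= \frac{1}{\Gamma(\alpha)} \int_0^{\varepsilon t} (\varepsilon t - u)^{H-\frac12} \lambda(\theta - V'_u)\, du + \frac{1}{\Gamma(\alpha)} \int_0^{\varepsilon t} (\varepsilon t - u)^{H-\frac12} \nu \sqrt{V'_u}\, dW_u
\end{align*}
where the last line follows on setting $\gamma - \frac12 - H = 0$, i.e. $\gamma = \alpha$. Thus for this choice of $\gamma$, $V^{\varepsilon}_{(\cdot)} \overset{(d)}{=} V_{\varepsilon(\cdot)}$.

C Proof of the scaling property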

Recall from (22) that E ( e pεα ˜ X εt ) = e ε H V I − α ψ ( p,t ) = e V p,t ) ε H i.e. E ( e ptα ˜ X t ) = e V p,t ) t H where ˜ X t = ˜ X t . ThenΛ( p, t ) = ε H log E ( e pεα ˜ X εt ) = t − H ( εt ) H log E ( e ptα ( εt ) α ˜ X εt ) = t − H ˆ ε H log E ( e ptα ˆ εα ˜ X εt ) = t − H V Λ( pt α , ε = εt . 17 Proof of monotonicity of the solution for a general class of Volterraintegral equations

Recall that y ( t ) satisﬁes y ( t ) = Z t K ( t − s ) G ( y ( s )) ds One can easily verify that the kernel used for the Rough Heston model satisﬁes the stated properties in Lemma 4.4.In the classical case K ( t ) ≡ y ( t )is analytic for t >

0. This is proved for the kernel relevant to the Rough Heston model in [MF71] (Theorem 6), see also theend of page 14 in [GGP18].What follows is a natural extension of the technique used in [MW51] (Theorem 8). Using the properties of convolutionand diﬀerentiating under the integral sign, we have: y ( t ) = Z t K ( t − s ) G ( y ( s )) ds = Z t K ( s ) G ( y ( t − s )) ds (D-1) y ′ ( t ) = K ( t ) G (0) + Z t K ( s ) G ′ ( y ( t − s )) y ′ ( t − s ) ds (D-2)= K ( t ) G (0) + Z t K ( t − s ) G ′ ( y ( s )) y ′ ( s ) ds (D-3) G (0) > y ′ ( t ) → + ∞ as t → + and since G ( y ) is increasing for y ≤ y we have that y ′ ( t ) > y ( t ) reaches y i.e. the solution increases. For y ≥ y , G ( y ) is decreasing and suppose that y ( t ) ceases to be increasing at some point.This implies (assuming a continuous derivative) the existence of a t and an interval I = [ t , t ] such that y ′ ( t ) = 0 and y ′ ( t ) < t ∈ I (if y ( t ) and hence y ′ ( t ) is analytic then the zeros of the derivative are isolated and a suﬃcientlysmall interval I exists). Using the integral equation for y ′ ( t ): y ′ ( t ) = K ( t ) G (0) + Z t K ( t − s ) G ′ ( y ( s )) y ′ ( s ) ds = 0 (D-4) y ′ ( t ) = K ( t ) G (0) + Z t K ( t − s ) G ′ ( y ( s )) y ′ ( s ) ds + Z t t K ( t − s ) G ′ ( y ( s )) y ′ ( s ) ds We can re-write the kernels in the ﬁrst and second terms of the expression for y ′ ( t ) as: K ( t ) = K ( t ) K ( t ) K ( t ) , K ( t − s ) = K ( t − s ) K ( t − s ) K ( t − s )and we can easily check that the quotient in the second expression here decreases monotonically from K ( t ) /K ( t ) to zero.By the mean value theorem for deﬁnite integrals there exists a τ ∈ (0 , t ) such that: Z t K ( t − s ) K ( t − s ) K ( t − s ) G ′ ( y ( s )) y ′ ( s ) ds = K ( t − τ ) K ( t − τ ) Z t K ( t − s ) G ′ ( y ( s )) y ′ ( s ) ds = − K ( t − τ ) K ( t − τ ) K ( t ) G (0) (D-5)where the second equality follows from (D-4). 
Substituting this into our expression for y ′ ( t ): y ′ ( t ) = K ( t ) K ( t ) K ( t ) G (0) + K ( t − τ ) K ( t − τ ) Z t K ( t − s ) G ′ ( y ( s )) y ′ ( s ) ds + Z t t K ( t − s ) G ′ ( y ( s )) y ′ ( s ) ds = K ( t ) G (0) ( K ( t ) K ( t ) − K ( t − τ ) K ( t − τ ) ) | {z } > + Z t t K ( t − s ) G ′ ( y ( s )) y ′ ( s ) | {z } > ds > G ( y ) = ( y − θ ) + θ i.e. a quadratic with positive leading coeﬃcient (for simplicity set to 1here) and minimum of θ obtained at y = θ . Depending on the values of { θ , θ } the following cases due to [GGP18] aredistinguished: • (C) G (0) > θ > θ < (D) G (0) ≤ y = 0. In case D, applying the transformation y ( t ) → − y ( t ) and − G ( − y ( t )) → G ( y ( t )) (reﬂecting in the x and then y axis) yields a function G ( y ) which is a quadratic with negative leadingcoeﬃcient and thus increases until it reaches it’s maximum after which it decreases which is of the type considered here.As shown in Propositions 3.6 and 3.7 in [GGP18], solutions to this integral equation must be bounded between 0 and a where a is the ﬁrst positive root of G ( y ), and monotonicity of y ( t ) implies that y ( t ) → a as t → ∞ (since if y ( t ) were totend to a constant c with 0 < c ≤ a , then G ( y ( t )) will be bounded below by some G ∗ >

0, so y ( t ) = Z t K ( t − s ) G ( y ( s )) ds ≥ G ∗ Z t K ( t − s ) ds → ∞ which contradicts the boundedness of y ( tt