Affine forward variance models

Abstract

We introduce the class of affine forward variance (AFV) models, of which both the conventional Heston model and the rough Heston model are special cases. We show that AFV models can be characterized by the affine form of their cumulant generating function, which can be obtained as the solution of a convolution Riccati equation. We further introduce the class of affine forward order flow intensity (AFI) models, which are structurally similar to AFV models, but driven by jump processes, and which include Hawkes-type models. We show that the cumulant generating function of an AFI model satisfies a generalized convolution Riccati equation and that a high-frequency limit of AFI models converges in distribution to the AFV model.


Jim Gatheral, Baruch College, CUNY
Martin Keller-Ressel, TU Dresden

October 31, 2018


MKR gratefully acknowledges financial support from DFG grants ZUK 64 and KE 1736/1-1. We thank Masaaki Fukasawa and two anonymous referees for their insightful comments.

The class of affine processes, introduced in [5], consists of all continuous-time Markov processes taking values in $\mathbb{R}^m_{\ge 0} \times \mathbb{R}^n$, whose log-characteristic function depends in an affine way on the initial state vector of the process. Affine processes have proved particularly convenient for financial modeling, typically giving rise to models with tractable formulae for the values of financial claims; the perennially popular Heston model [15] is just one (and perhaps the most famous) example of such a model.

In this paper, we introduce the class of affine forward variance (AFV) models, of which classical Markovian affine stochastic volatility models turn out to be a special case. By writing our model in forward variance form, we are able to provide a unique characterization of a much wider class of affine stochastic volatility models, which includes non-Markovian models, such as the rough Heston model of [7] or, more generally, stochastic volatility models driven by affine Volterra processes in the sense of [1]. Our contribution is to provide necessary and sufficient conditions for a (non-Markovian) stochastic volatility model to be affine, thus adding a reverse direction to the results of [1], and with a simpler proof. In essence, the rough Heston model is, up to a choice of kernel, the only stochastic volatility model with an affine moment generating function.

Inspired by the original derivation [7] of the rough Heston model as a limit of simple pure jump models of order flow, we further introduce the class of affine forward order flow intensity (AFI) models. These models are structurally similar to affine forward variance models and generalize the simple order flow model of [7], by allowing arbitrary order size distributions and more general decay of the self-excitation of order flow. We define a high-frequency limit in which such models give rise to continuous affine forward variance models. In so doing, we generalize and simplify previous such derivations.
Moreover, there is a clear structural analogy between the results we prove for AFV and AFI models, adding insight to and generalizing the connection between microstructural models of order flow and stochastic volatility models first brought to light in [16] and [17].

Footnote (on the simplified derivations): Some of these simplifications are due to the fact that we limit ourselves to the (real-valued) moment generating function, as opposed to the (complex-valued) characteristic function studied in [7].

Our paper proceeds as follows. In Section 2, we introduce the class of affine forward variance models and show that a forward variance model has an affine cumulant generating function (CGF) if and only if it can be written in a very specific form. We further show that the CGF can be obtained as the unique global solution of a convolution Riccati equation closely related to the Volterra-Riccati equations of [1]. In Section 3, we introduce the class of AFI models, showing that the CGF of such models solves a generalized convolution Riccati equation. In Section 4, we show that AFI models become AFV models in a high-frequency limit, where order arrivals are extremely frequent and order sizes extremely small.

Let a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with right-continuous filtration $(\mathcal{F}_t)_{t \ge 0}$ and two independent, adapted Brownian motions $W$ and $W^\perp$ be given. The filtration generated by $W$ only is denoted by $(\mathcal{F}^W_t)_{t \ge 0}$. Our starting point is a generic stochastic volatility model $(S, V)$, where the spot variance $V$ is modeled by an $\mathcal{F}^W$-adapted continuous, integrable, and non-negative process and the price process $S$ by
\[ dS_t = S_t \sqrt{V_t} \left( \rho\, dW_t + \sqrt{1 - \rho^2}\, dW^\perp_t \right), \tag{2.1} \]
with $\rho \in [-1, 1]$. Crucially, we do not assume that $V$ is an Itô process or even a semimartingale. Instead, we focus on the family (indexed by $T > 0$) of forward variance processes
\[ \xi_t(T) := \mathbb{E}[V_T \mid \mathcal{F}_t] = \mathbb{E}\left[V_T \mid \mathcal{F}^W_t\right], \tag{2.2} \]
which are, by definition, $\mathcal{F}^W$-adapted martingales with terminal values $\xi_T(T) = V_T$. By the martingale representation theorem, there exists, for each $T > 0$, a predictable process $\eta_t(T)$ with $\int_0^T \eta_s(T)^2\, ds < \infty$ a.s., such that
\[ d\xi_t(T) = \eta_t(T)\, dW_t, \qquad t \in [0, T]. \tag{2.3} \]
We refer to (2.1) together with (2.3) as a stochastic volatility model in forward variance form, or simply as a forward variance model. It is often convenient to use the log-price $X = \log S$ instead of $S$, and we note that it follows from (2.1) that $X$ satisfies the SDE
\[ dX_t = -\tfrac{1}{2} V_t\, dt + \sqrt{V_t} \left( \rho\, dW_t + \sqrt{1 - \rho^2}\, dW^\perp_t \right). \tag{2.4} \]
We also refer to $X$ together with the family of processes $(\xi_\cdot(T))_{T > 0}$ as a forward variance model and denote it by $(X, \xi)$. Moreover, we use the following convention: for $t > T$, define
\[ \xi_t(T) := V_T, \qquad \eta_t(T) := 0, \]
which is consistent with (2.2) and allows us to extend (2.3) to all $t \ge 0$. We impose the following assumption on $\eta_\cdot(T)$:

Assumption 2.1.
(a) For $dt \otimes d\mathbb{P}$-almost all $(t, \omega)$ it holds that $\tau \mapsto \eta_t(t + \tau, \omega)$ is non-negative, decreasing and continuous on $(0, \infty)$.
(b) For any $T > 0$,
\[ \int_0^T \left( \int_0^T \eta_r(s)^2\, dr \right)^{1/2} ds < \infty \tag{2.5} \]
holds almost surely.

We will show that in the class of affine forward variance models, the integrand $\eta_t(T)$ must factor as $\eta_t(T) = \sqrt{V_t}\, \kappa(T - t)$, where $\kappa$ is a deterministic function. To describe the admissible functions we introduce the following:

Definition 2.2.

For $1 \le p < \infty$, an $L^p$-kernel is a function $\kappa: \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge 0}$ which is continuous on $(0, \infty)$ and satisfies $\int_0^T \kappa(t)^p\, dt < \infty$ for all $T > 0$.

Remark 2.3.
(a) If $\eta_t(T)$ factors as $\eta_t(T) = Z_t\, \kappa(T - t)$ into a non-negative continuous stochastic process $Z$ and a deterministic function $\kappa$, then Assumption 2.1 is equivalent to $\kappa$ being a decreasing $L^2$-kernel in the sense of Definition 2.2.
(b) By (2.1), $S$ is a non-negative local martingale and therefore a supermartingale, so that $\mathbb{E}[S_t] = \mathbb{E}\left[e^{X_t}\right] \le S_0$. This implies by Jensen's inequality that the moments $\mathbb{E}[S_t^u]$ are finite for all $u \in [0, 1]$, a fact that will be used subsequently.

Definition 2.4.

We say that a forward variance model $(X, \xi)$ has an affine cumulant generating function determined by $g(t, u)$, if its conditional cumulant generating function is of the form
\[ \log \mathbb{E}\left[ e^{u(X_T - X_t)} \,\middle|\, \mathcal{F}_t \right] = \int_t^T g(T - s, u)\, \xi_t(s)\, ds, \tag{2.6} \]
for all $u \in [0, 1]$ and $0 \le t \le T$, where $g(\cdot, u)$ is $\mathbb{R}_{\le 0}$-valued and continuous on $[0, T]$ for all $T > 0$ and $u \in [0, 1]$.

Remark 2.5. Alternatively, we could consider (2.6) with imaginary parameter $u = iz$ for $z \in \mathbb{R}$, i.e. an affine log-characteristic function as in [7]. This is of particular interest for applications of Fourier pricing to the model $(X, \xi)$. To show results on distributional convergence, which will be the subject of Section 4, using the cumulant generating function is sufficient, and it will turn out that restricting to real parameters simplifies many of the mathematical arguments.

Convolution integrals, as in the exponent of (2.6), will appear frequently in the following calculations and so it is natural to introduce the convolution operation
\[ (f \star g)(t) := \int_0^t f(t - s)\, g(s)\, ds. \]
For functions with multiple arguments or subscripts, we use the convention that convolution acts on the first argument, excluding subscripts. Other arguments or subscripts are passed on to the result. With this convention (2.6) can be written succinctly as
\[ \mathbb{E}\left[ e^{u(X_T - X_t)} \,\middle|\, \mathcal{F}_t \right] = \exp\left( (g \star \xi)_t(T, u) \right). \tag{2.7} \]
The following result gives a characterization of all forward variance models with affine CGF. Its proof is given in Section 2.4, with some parts relegated to Appendix B.

Theorem 2.6. Under Assumption 2.1 a forward variance model $(X, \xi)$ has an affine CGF if and only if
\[ \eta_t(T) = \sqrt{V_t}\, \kappa(T - t), \tag{2.8} \]
for a deterministic, decreasing $L^2$-kernel $\kappa$. Moreover, $g(\cdot, u): \mathbb{R}_{\ge 0} \to \mathbb{R}_{\le 0}$ in (2.6) is the unique global continuous solution of the convolution Riccati equation
\[ g(t, u) = R_V\left( u, \int_0^t \kappa(t - s)\, g(s, u)\, ds \right) = R_V\left( u, (\kappa \star g)(t, u) \right), \qquad t \ge 0, \tag{2.9} \]
where
\[ R_V(u, w) = \tfrac{1}{2}(u^2 - u) + \rho u w + \tfrac{1}{2} w^2. \tag{2.10} \]

Remark 2.7.
Alternatively, $g(t, u)$ can be written as $g(t, u) = R_V(u, f(t, u))$, where $f(t, u)$ is the unique global continuous solution of the non-linear Volterra equation
\[ f(t, u) = \int_0^t \kappa(t - s)\, R_V(u, f(s, u))\, ds. \tag{2.11} \]
See Appendix A for further discussion of non-linear Volterra equations and for the equivalence of equations (2.9) and (2.11).

We introduce the useful notion of a $\gamma$-resolvent kernel:

Lemma 2.8. Let $1 \le p < \infty$ and let $\kappa$ be an $L^p$-kernel. Then for any $\gamma \ge 0$, there exists a unique $L^p$-kernel $r$ such that
\[ r - \kappa = \gamma\, (\kappa \star r). \tag{2.12} \]
If $\log \kappa$ is convex, then the assertion also holds for $\gamma < 0$. We call $r$ the $\gamma$-resolvent of $\kappa$.

Proof. If $\gamma = 0$, then $r = \kappa$, and the lemma becomes trivial. In all other cases $-\gamma r$ is the so-called 'resolvent of the second kind' (see [13]) of $-\gamma \kappa$, and the properties of $r$ follow directly from Thm. 2.3.1 (existence and uniqueness), Thm. 2.3.5 (local $L^p$-integrability), Prop. 9.5.7 (continuity), Prop. 9.8.1 (positivity for $\gamma \ge 0$) and Prop. 9.8.8 (positivity under log-convexity for $\gamma < 0$).

In terms of the Laplace transforms $\hat\kappa, \hat r$ of $\kappa, r$, the resolvent equation (2.12) becomes
\[ \hat r - \hat\kappa = \gamma\, (\hat\kappa \cdot \hat r), \tag{2.13} \]
which can be useful for determining $r$ explicitly from a given $\kappa$.

Remark 2.9. Note that the instantaneous variance process $V_t = \xi_t(t)$ of an AFV model can be written as
\[ V_t = \xi_0(t) + \int_0^t \kappa(t - s) \sqrt{V_s}\, dW_s, \tag{2.14} \]
which shows that $V$ is an affine Volterra process in the sense of [1]. We emphasize that the representation (2.14) is not unique. Indeed, let $\lambda > 0$ and let $\varphi$ be the $\lambda$-resolvent of $\kappa$. Then, using e.g. [1, Lem. 2.5], it follows that
\[ V_t = \tilde\xi_0(t) - \lambda \int_0^t \varphi(t - s)\, V_s\, ds + \int_0^t \varphi(t - s) \sqrt{V_s}\, dW_s, \tag{2.15} \]
with $\tilde\xi_0 = \xi_0 + \lambda\, (\varphi \star \xi_0)$, and a mean-reverting drift term appears. Conversely, if a process of the form (2.15) is given, and if $\varphi$ has a $(-\lambda)$-resolvent $\kappa$, then the forward variance is of the form
\[ \xi_t(T) = \mathbb{E}[V_T \mid \mathcal{F}_t] = \xi_0(T) + \int_0^t \kappa(T - s) \sqrt{V_s}\, dW_s, \]
with $\xi_0 = \tilde\xi_0 - \lambda\, (\kappa \star \tilde\xi_0)$, cf. [1, Lem. 2.5].

2.3 Two examples: Heston and rough Heston models

Example 2.10. The Heston model [15] is given by
\[ dS_t = S_t \sqrt{V_t} \left( \rho\, dW_t + \sqrt{1 - \rho^2}\, dW^\perp_t \right), \tag{2.16a} \]
\[ dV_t = -\lambda (V_t - \theta)\, dt + \zeta \sqrt{V_t}\, dW_t. \tag{2.16b} \]
A simple calculation shows that
\[ \xi_t(T) = \mathbb{E}[V_T \mid \mathcal{F}_t] = \theta \left( 1 - e^{-\lambda (T - t)} \right) + e^{-\lambda (T - t)} V_t. \]
Hence,
\[ d\xi_t(T) = \zeta e^{-\lambda (T - t)} \sqrt{V_t}\, dW_t, \]
and it follows that the Heston model can be written as an affine forward variance model with kernel $\kappa(x) = \zeta e^{-\lambda x}$ and initial forward variance
\[ \xi_0(T) = \theta \left( 1 - e^{-\lambda T} \right) + e^{-\lambda T} V_0 = V_0 + (\theta - V_0)\, \frac{\lambda}{\zeta} \int_0^T \kappa(s)\, ds. \]
Note that $\kappa$ is the $(-\lambda/\zeta)$-resolvent of the constant kernel $\zeta$, in accordance with Remark 2.9. To obtain the Riccati ODEs for the Heston model in the usual form (see e.g. [18]), let $\psi(\cdot, u)$ be a $C^1$-function such that
\[ g(t, u) = \frac{\partial}{\partial t} \psi(t, u) + \lambda\, \psi(t, u), \qquad \psi(0, u) = 0. \]
By partial integration we obtain
\[ (\kappa \star g)(t, u) = \zeta \int_0^t e^{-\lambda (t - s)}\, g(s, u)\, ds = \zeta\, \psi(t, u). \]
Inserting into the convolution Riccati equation (2.9) yields
\[ \frac{\partial}{\partial t} \psi(t, u) = \tfrac{1}{2}(u^2 - u) + (\zeta \rho u - \lambda)\, \psi(t, u) + \frac{\zeta^2}{2}\, \psi(t, u)^2, \qquad \psi(0, u) = 0, \]
in accordance with [18]. Furthermore, it is straightforward to show that
\[ \int_t^T g(T - s, u)\, \xi_t(s)\, ds = \phi(T - t, u) + V_t\, \psi(T - t, u), \]
with $\phi(t, u) = \lambda \theta \int_0^t \psi(s, u)\, ds$. $\diamond$

Example 2.11. In the rough Heston model, introduced in [7], (2.16b) is replaced by
\[ V_t = V_0 + \frac{1}{\Gamma(\alpha)} \int_0^t (t - s)^{\alpha - 1} \lambda (\theta - V_s)\, ds + \frac{\zeta}{\Gamma(\alpha)} \int_0^t (t - s)^{\alpha - 1} \sqrt{V_s}\, dW_s, \tag{2.17} \]
where $\alpha \in (1/2, 1)$ is related to the 'roughness' of the paths of $V$. Note that this is an affine Volterra process (2.15) with power-law kernel $\varphi_{\mathrm{pow}}(t) = \zeta\, t^{\alpha - 1} / \Gamma(\alpha)$. In [8] it is shown that the forward variance in the rough Heston model satisfies
\[ d\xi_t(T) = \kappa(T - t) \sqrt{V_t}\, dW_t, \]
with the kernel
\[ \kappa(x) = \zeta\, x^{\alpha - 1} E_{\alpha, \alpha}(-\lambda x^\alpha), \]
where $E_{\alpha, \beta}(x)$ denotes the generalized Mittag-Leffler function (cf. [9], [19, Sec. 1.2]). Thus, the rough Heston model is an affine forward variance model in the sense of Theorem 2.6.
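As a numerical illustration of Theorem 2.6, the following minimal sketch solves the non-linear Volterra equation (2.11), $f = \kappa \star R_V(u, f)$, on a uniform grid with a left-rectangle rule and checks the result for the Heston kernel $\kappa(x) = \zeta e^{-\lambda x}$ against the Riccati ODE of Example 2.10, using the identity $f = \kappa \star g = \zeta \psi$. The discretization scheme, the function names, and the parameter values are illustrative assumptions and are not taken from the paper. The scheme never evaluates the kernel at zero, so it can in principle be pointed at a singular kernel such as the rough Heston kernel, although a more careful quadrature would then be advisable.

```python
import math

def riccati_rhs(u, w, rho):
    """R_V(u, w) from (2.10)."""
    return 0.5 * (u * u - u) + rho * u * w + 0.5 * w * w

def solve_volterra(kernel, u, rho, horizon, n):
    """Left-rectangle scheme for f(t) = int_0^t kernel(t - s) R_V(u, f(s)) ds, cf. (2.11).

    Returns the grid values of f and of g = R_V(u, f).
    """
    h = horizon / n
    f = [0.0] * (n + 1)
    g = [riccati_rhs(u, 0.0, rho)] + [0.0] * n
    # kernel values at h, 2h, ..., nh; index 0 is a placeholder that is never used
    kvals = [0.0] + [kernel(j * h) for j in range(1, n + 1)]
    for i in range(1, n + 1):
        f[i] = h * sum(kvals[i - k] * g[k] for k in range(i))
        g[i] = riccati_rhs(u, f[i], rho)
    return f, g

def heston_psi(u, rho, lam, zeta, horizon, n):
    """Classical RK4 for the Heston Riccati ODE of Example 2.10 with psi(0) = 0."""
    def rhs(p):
        return 0.5 * (u * u - u) + (zeta * rho * u - lam) * p + 0.5 * zeta ** 2 * p * p
    h = horizon / n
    psi = [0.0] * (n + 1)
    for i in range(n):
        k1 = rhs(psi[i])
        k2 = rhs(psi[i] + 0.5 * h * k1)
        k3 = rhs(psi[i] + 0.5 * h * k2)
        k4 = rhs(psi[i] + h * k3)
        psi[i + 1] = psi[i] + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return psi

# Illustrative (assumed) parameters.
lam, zeta, rho, u = 1.0, 0.5, -0.7, 0.5
horizon, n = 1.0, 400

f, g = solve_volterra(lambda x: zeta * math.exp(-lam * x), u, rho, horizon, n)
psi = heston_psi(u, rho, lam, zeta, horizon, n)

# In the Heston case kappa * g = zeta * psi, so f should match zeta * psi.
max_err = max(abs(fi - zeta * p) for fi, p in zip(f, psi))
print(f"max |f - zeta * psi| on the grid: {max_err:.2e}")
```

The left-rectangle rule is only first-order accurate; it is used here because it is explicit and keeps the sketch short.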
The initial forward variance is given by (cf. [8, Prop. 3.1])
\[ \xi_0(T) = V_0 + (\theta - V_0)\, \frac{\lambda}{\zeta} \int_0^T \kappa(s)\, ds. \]
To obtain the fractional Riccati equation (cf. [7, Eq. (24)]) for the rough Heston model, set
\[ \psi(t, u) = \frac{1}{\zeta} (\kappa \star g)(t, u) = \int_0^t (t - s)^{\alpha - 1} E_{\alpha, \alpha}\left( -\lambda (t - s)^\alpha \right) g(s, u)\, ds. \]
By [7, Lem. A.2], $\psi(t, u)$ satisfies
\[ D^\alpha \psi(t, u) + \lambda\, \psi(t, u) = g(t, u), \]
where $D^\alpha$ denotes the Riemann-Liouville fractional derivative of order $\alpha$. Inserting into the convolution Riccati equation (2.9) yields
\[ D^\alpha \psi(t, u) = \tfrac{1}{2}(u^2 - u) + (\zeta \rho u - \lambda)\, \psi(t, u) + \frac{\zeta^2}{2}\, \psi(t, u)^2, \qquad \psi(0, u) = 0, \]
in accordance with [7, Eq. (24)]. Denote by $I^\alpha f(t) = \frac{1}{\Gamma(\alpha)} \int_0^t (t - s)^{\alpha - 1} f(s)\, ds$ the Riemann-Liouville fractional integral of order $\alpha$ and write $\mathbf{1}$ for the function of constant value one. The exponent in (2.6) can be transformed as follows:
\begin{align*}
\int_0^{\cdot} g(\cdot - s, u)\, \xi_0(s)\, ds
&= V_0\, (g \star \mathbf{1}) + (\theta - V_0)\, \frac{\lambda}{\zeta}\, (g \star \kappa \star \mathbf{1}) \\
&= V_0\, \big( (g - \lambda \psi) \star \mathbf{1} \big) + \theta \lambda\, (\psi \star \mathbf{1}) \\
&= V_0\, (D^\alpha \psi) \star \mathbf{1} + \theta \lambda \int_0^{\cdot} \psi(s, u)\, ds \\
&= V_0\, I^{1 - \alpha} \psi + \theta \lambda \int_0^{\cdot} \psi(s, u)\, ds,
\end{align*}
which is the same as [7, Eq. (23)]. $\diamond$

Footnote (on the kernel of Example 2.11): Alternatively, one can check that the Laplace transforms $\hat\varphi_{\mathrm{pow}}(z) = \zeta z^{-\alpha}$ and $\hat\kappa(z) = \zeta / (z^\alpha + \lambda)$ satisfy the relation (2.13) with $\gamma = -\lambda/\zeta$.

2.4 Proving the characterization result

To prepare for the proof of Theorem 2.6, we introduce the following notation: given a family $(g(\cdot, u))_{u \in [0, 1]}$ of continuous functions from $\mathbb{R}_{\ge 0}$ to $\mathbb{R}$, we set
\[ G_t = (g \star \xi)_t(T, u) = \int_t^T g(T - s, u)\, \xi_t(s)\, ds, \tag{2.18} \]
\[ M_t = \exp(u X_t + G_t). \tag{2.19} \]
If $(X, \xi)$ has an affine CGF determined by $g(t, u)$, then it follows from (2.6) that $M$ is a martingale. Conversely, if $M$ is a martingale, then (2.6) follows by taking conditional expectations. Hence, the affine property of $(X, \xi)$ can be characterized in terms of the martingale property of $M$. In order to apply Itô's formula to $M$ we represent $G$ as an Itô process. The calculation is analogous to the drift computation in the Heath-Jarrow-Morton model (cf. [10, Ch. 6]) and uses the stochastic Fubini theorem to interchange stochastic integral and Lebesgue integral.

Lemma 2.12.
Let $G$ and $M$ be given as in (2.18), (2.19) and let Assumption 2.1 on the forward variance model $(X, \xi)$ hold. Then $G$ can be written in Itô process form as
\[ G_t = \int_0^T g(T - s, u)\, \xi_0(s)\, ds - \int_0^t g(T - s, u)\, V_s\, ds + \int_0^t h_s(T, u)\, dW_s, \]
where
\[ h_t(T, u) = (g \star \eta)_t(T, u) = \int_t^T g(T - r, u)\, \eta_t(r)\, dr. \tag{2.20} \]
Moreover, $M$ is an Itô process, which can be decomposed as
\[ \frac{dM_t}{M_t} = \text{loc. mg.} + \left\{ \tfrac{1}{2}(u^2 - u)\, V_t - g(T - t, u)\, V_t + u \rho \sqrt{V_t}\, h_t(T, u) + \tfrac{1}{2} h_t(T, u)^2 \right\} dt. \tag{2.21} \]

Proof.

Fix $T > 0$ and $u \in [0, 1]$. Following [10, p. 94] closely, we compute
\begin{align*}
G_t &= \int_t^T g(T - s, u)\, \xi_t(s)\, ds \\
&= \int_t^T g(T - s, u)\, \xi_0(s)\, ds + \int_t^T \int_0^t g(T - s, u)\, \eta_r(s)\, dW_r\, ds \\
&\overset{\text{stoch. Fub.}}{=} \int_t^T g(T - s, u)\, \xi_0(s)\, ds + \int_0^t \int_t^T g(T - s, u)\, \eta_r(s)\, ds\, dW_r \\
&= \int_0^T g(T - s, u)\, \xi_0(s)\, ds + \int_0^t \int_r^T g(T - s, u)\, \eta_r(s)\, ds\, dW_r \\
&\qquad - \int_0^t g(T - s, u)\, \xi_0(s)\, ds - \int_0^t \int_r^t g(T - s, u)\, \eta_r(s)\, ds\, dW_r \\
&\overset{\text{stoch. Fub.}}{=} \int_0^T g(T - s, u)\, \xi_0(s)\, ds + \int_0^t \int_r^T g(T - s, u)\, \eta_r(s)\, ds\, dW_r \\
&\qquad - \int_0^t g(T - s, u) \underbrace{\left( \xi_0(s) + \int_0^s \eta_r(s)\, dW_r \right)}_{= V_s} ds.
\end{align*}
To justify the application of the stochastic Fubini theorem in the form of [21, Thm. 2.2], we have to check whether
\[ I_t := \int_t^T \left( \int_0^t g(T - s, u)^2\, \eta_r(s)^2\, dr \right)^{1/2} ds \]
is finite for all $t \in [0, T]$. Since $g(\cdot, u)$ is continuous, there is a finite constant $g^*(u) := \sup_{t \in [0, T]} |g(t, u)|$. Using Assumption 2.1 we find that
\[ I_t \le g^*(u) \int_0^T \left( \int_0^T \eta_r(s)^2\, dr \right)^{1/2} ds < \infty \]
for all $t \in [0, T]$, and the application of the stochastic Fubini theorem is justified.

To show (2.21), we apply Itô's formula to $M_t = \exp(u X_t + G_t)$ and obtain, using (2.1) and (2.3), that
\begin{align*}
\frac{dM_t}{M_t} &= u\, dX_t + dG_t + \frac{u^2}{2}\, d[X, X]_t + u\, d[X, G]_t + \frac{1}{2}\, d[G, G]_t \tag{2.22} \\
&= \text{loc. mg.} + \left\{ \tfrac{1}{2}(u^2 - u)\, V_t - g(T - t, u)\, V_t + u \rho \sqrt{V_t}\, h_t(T, u) + \tfrac{1}{2} h_t(T, u)^2 \right\} dt,
\end{align*}
as claimed.

Lemma 2.13. Let $\kappa$ be a decreasing $L^2$-kernel. Then for any $u \in [0, 1]$, the convolution Riccati equation (2.9) has a unique global continuous $\mathbb{R}_{\le 0}$-valued solution $g(\cdot, u)$. Moreover, $g(t, u) = R_V(u, f(t, u))$, where $f(\cdot, u)$ is the unique global continuous solution of (2.11).

Proof. Fix $u \in (0, 1)$ and set $H_u(w) = R_V(u, w)$, with $R_V$ given by (2.10).
It is easy to check that $H_u$ satisfies all conditions of Corollary A.7, i.e., that
• $H_u(w)$ is a finite, strictly convex function on $(-\infty, 0]$ and satisfies $H_u(0) < 0$;
• $H_u(w)$ has a single root $H_u(w^*(u)) = 0$ in $(-\infty, 0]$.
Thus, we conclude from Corollary A.7 the existence of a unique global continuous solution $g(t, u)$ to (2.9) for all $u \in (0, 1)$. Moreover, $g(t, u) \le 0$ for all $(t, u) \in \mathbb{R}_{\ge 0} \times (0, 1)$ by estimate (A.11). Adding the boundary cases $u \in \{0, 1\}$ is trivial: observe that $g(t, u) \equiv 0$ solves (2.9) for $u \in \{0, 1\}$, and this solution must be unique by [13, Thm. 13.1.2]. The representation as $g(t, u) = R_V(u, f(t, u))$ also follows directly from Corollary A.7.

We are now prepared to prove the first part of Theorem 2.6.

Theorem 2.6, 'if' part.

Let $\eta_t(T) = \sqrt{V_t}\, \kappa(T - t)$ and fix some $(T, u) \in (0, \infty) \times (0, 1)$. By Lemma 2.13, the convolution Riccati equation (2.9) associated to this kernel $\kappa$ has a unique global continuous $\mathbb{R}_{\le 0}$-valued solution $g(\cdot, u)$. Using this solution $g(\cdot, u)$ we define the processes $G$ and $M$ as in (2.18) and (2.19). Applying Lemma 2.12, we see that equation (2.20) for $h_t(T, u)$ can be simplified, due to the factorization of $\eta_t(T)$, to
\[ h_t(T, u) = (g \star \eta)_t(T, u) = \sqrt{V_t} \int_0^{T - t} g(T - t - s, u)\, \kappa(s)\, ds = \sqrt{V_t} \cdot (g \star \kappa)(T - t). \]
Inserting into (2.21) we obtain
\[ \frac{dM_t}{M_t} = \text{loc. mg.} + \Big\{ R_V\big( u, (g \star \kappa)(T - t) \big) - g(T - t, u) \Big\} V_t\, dt. \tag{2.23} \]
Since $g(\cdot, u)$ solves the convolution Riccati equation (2.9), the $dt$-term vanishes and we have shown that $M$ is a local martingale. It remains to show that $M$ is a true martingale. To this end, observe that, since $G_t \le 0$ (because $g \le 0$ and $\xi \ge 0$),
\[ |M_t|^{1/u} = \exp\left( \frac{1}{u} (u X_t + G_t) \right) \le \exp(X_t) = S_t \tag{2.24} \]
for all $t \in [0, T]$. But $S$ is a supermartingale, so that
\[ \mathbb{E}\left[ |M_t|^{1/u} \right] \le \mathbb{E}[S_t] \le S_0. \]
Setting $p = 1/u > 1$, this is the $L^p$-criterion for uniform integrability of $M$. We conclude that $M$ is a true martingale, and hence that
\[ \mathbb{E}\left[ e^{u X_T} \,\middle|\, \mathcal{F}_t \right] = \mathbb{E}[M_T \mid \mathcal{F}_t] = M_t = \exp\left( u X_t + \int_t^T g(T - s, u)\, \xi_t(s)\, ds \right), \tag{2.25} \]
showing (2.6) for all $u \in (0, 1)$. Adding the boundary cases $u \in \{0, 1\}$ is trivial, since $g(t, u) \equiv 0$ […].

[…] $\rho \le 0$ and $\rho > 0$. The proof follows the same structure in both cases, but $\rho > 0$ […].

Theorem 2.6, 'only if' part in the case $\rho \le 0$. By assumption, the forward variance model $(X, \xi)$ has an affine CGF in the sense of Definition 2.4. Using the associated function $g(\cdot, u)$, we define the processes $G$ and $M$ as in (2.18) and (2.19). Observe that due to (2.6), $M$ has to be a martingale. In addition, note that Assumption 2.1, together with $g(\cdot, u) \le 0$, implies that
\[ h_t(T, u) = (g \star \eta)_t(T, u) \le 0 \]
for all $t, T \ge 0$, $u \in [0, 1]$. Applying Lemma 2.12 and using the fact that $M$ is a martingale, we see that the $dt$-term in (2.21) has to vanish, i.e. that
\[ \tfrac{1}{2}(u^2 - u)\, V_t - g(T - t, u)\, V_t + u \rho \sqrt{V_t}\, h_t(T, u) + \tfrac{1}{2} h_t(T, u)^2 = 0, \tag{2.26} \]
up to a $(dt \times d\mathbb{P})$-nullset. This is a quadratic equation in the variable $h_t(T, u)$, and due to the assumption $\rho \le 0$ its unique non-positive solution is
\[ h_t(T, u) = \sqrt{V_t}\, q_-(T - t, u) := \sqrt{V_t} \left( -\rho u - \sqrt{u^2 (\rho^2 - 1) + u + 2 g(T - t, u)} \right). \]
Inserting the definition of $h_t(T, u)$ from (2.20) and setting $\tau = T - t$ we obtain
\[ \int_0^\tau g(\tau - s, u)\, \eta_t(t + s)\, ds = \sqrt{V_t}\, q_-(\tau, u), \qquad \forall\, \tau \ge 0. \tag{2.27} \]
This is a Volterra integral equation of the first kind (cf. [13, Sec. 5.5]) for the unknown function $\eta_t(t + \cdot)$. From (2.26) it can easily be seen that $g(0, u) = \tfrac{1}{2}(u^2 - u) < 0$. Therefore, by [13, Ex. 5.25], see also [12, Thm. 1], there exists a (locally finite Borel) measure $\pi(d\tau, u)$, the resolvent of the first kind of $g(\cdot, u)$ (not to be confused with the $\gamma$-resolvent of Lemma 2.8), such that $\int_0^\tau g(\tau - s, u)\, \pi(ds, u) = 1$ for all $\tau > 0$. Convolving (2.27) with $\pi(d\tau, \cdot)$, the unique solution of (2.27) can be expressed as, cf. [13, Thm. 5.3],
\[ \eta_t(t + \tau) = \sqrt{V_t} \cdot \frac{\partial}{\partial \tau} \left( \int_0^\tau q_-(\tau - s, u)\, \pi(ds, u) \right). \tag{2.28} \]
Denoting the last factor by $\kappa(\tau)$ and taking into account Assumption 2.1, the desired decomposition (2.8) follows.

We now introduce a class of models for market order flow, which are structurally similar to forward variance models. These models consist of a log-price $X$ and a forward intensity process $\xi_t(T)$, which models the expectation (at time $t$) of the future intensity of order flow (at time $T$). The forward intensity $\xi_t(T)$ has a role similar to the forward variance, and we call the resulting model an affine forward order flow intensity (AFI) model. (The strong empirical correlation between order volume, as a proxy for intensity, and return variance is well-documented in the literature, see e.g. [11]. Therefore the parallels between AFV and AFI models should not come as a complete surprise.) The AFI model is driven by two pure jump semimartingales $J^+_t, J^-_t$ of finite activity and with common intensity $\lambda_{t-}$. As in (2.2), we assume that $\xi_\cdot(T)$ and $\lambda$ are connected by
\[ \xi_t(T) := \mathbb{E}[\lambda_{T-} \mid \mathcal{F}_t]. \]
The driving processes $J^\pm$ jump only upwards and represent the arrival of buy and sell orders respectively. Their jump height distributions are given by two probability measures $\zeta_\pm(dx)$ on $\mathbb{R}_{\ge 0}$ for buy and for sell orders. We assume that $\int e^x\, \zeta_\pm(dx) < \infty$; in particular, the first moments
\[ m_\pm := \int x\, \zeta_\pm(dx) \]
also exist. In addition, we assume that the order flow processes are self-exciting, in the sense that each arriving order positively impacts the intensity process. This impact can be asymmetric, i.e. the degree of self-excitement may be different for buy and sell orders. Together this leads to the specification of the AFI model as
\[ dX_t = -\lambda_{t-}\, m_X\, dt + dJ^+_t - dJ^-_t, \tag{3.1a} \]
\[ d\xi_t(T) = \kappa(T - t) \left( \gamma_+\, d\tilde J^+_t + \gamma_-\, d\tilde J^-_t \right), \tag{3.1b} \]
where $\kappa$ is an $L^1$-kernel in the sense of Definition 2.2, $\gamma_\pm$ are positive constants, $m_X$ is determined by the martingale condition on $S = e^X$, and $\tilde J^\pm_t$ denote the compensated order flow processes, i.e. $\tilde J^\pm_t := J^\pm_t - m_\pm \int_0^t \lambda_{s-}\, ds$. Setting
\[ J^X_t = J^+_t - J^-_t, \qquad J^\lambda_t = \gamma_+ J^+_t + \gamma_- J^-_t \tag{3.2} \]
and denoting by $\tilde J^\lambda$ the compensated counterpart of $J^\lambda$, we can rewrite (3.1) as
\[ dX_t = -\lambda_{t-}\, m_X\, dt + dJ^X_t, \qquad d\xi_t(T) = \kappa(T - t)\, d\tilde J^\lambda_t. \]
We proceed to discuss the jump processes and the compensators of their random jump measures in more detail. The random jumps of $J^\pm$ are compensated by
\[ d\nu^\pm_t(dx) = \lambda_{t-}\, \zeta_\pm(dx)\, dt, \]
where $x$ represents jump size. While $J^+$ and $J^-$ are independent, given $\lambda$, it is important to note that $J^X_t$ and $J^\lambda_t$ are not. Instead, they move by simultaneous jumps. Thus, the predictable compensator of the jump measure of $(J^X, J^\lambda)$ is given by
\[ d\nu^{(X, \lambda)}_t(dx, dy) = \lambda_{t-}\, \chi(dx, dy)\, dt, \]
where
\[ \chi(dx, dy) = \Big( \mathbf{1}_{\{x \ge 0\}} \mathbf{1}_{\{y = \gamma_+ x\}}\, \zeta_+(dx) + \mathbf{1}_{\{x \le 0\}} \mathbf{1}_{\{y = -\gamma_- x\}}\, \zeta_-(-dx) \Big). \]
The measure $\chi(dx, dy)$ is concentrated on the line segments $y = \gamma_+ x$ $(x \ge 0)$ and $y = -\gamma_- x$ $(x \le 0)$ due to the simultaneity of jumps. In addition, we define
\[ \psi_\pm(u) = \int (e^{ux} - 1)\, \zeta_\pm(dx) \tag{3.3} \]
and calculate
\[ \int_{\mathbb{R} \times \mathbb{R}_{\ge 0}} \left( e^{ux + wy} - 1 \right) \chi(dx, dy) = \psi_+(u + w \gamma_+) + \psi_-(-u + w \gamma_-). \]
Applying Itô's formula for jump processes to $e^X$ it is easy to see that the martingale condition implies that
\[ m_X = \psi_+(1) + \psi_-(-1). \]
The following theorem is the analogue of Theorem 2.6 and shows the structural similarity between affine forward variance models and AFI models.

Theorem 3.1.

The AFI model (3.1) has an affine CGF in the sense of Definition 2.4. Moreover, $g(\cdot, u): \mathbb{R}_{\ge 0} \to \mathbb{R}_{\le 0}$ in (2.6) is the unique global solution of the generalized convolution Riccati equation
\[ g(t, u) = R_\lambda\left( u, \int_0^t \kappa(t - s)\, g(s, u)\, ds \right) = R_\lambda\left( u, (\kappa \star g)(t, u) \right), \tag{3.4} \]
where
\[ R_\lambda(u, w) = \psi_+(u + w \gamma_+) + \psi_-(-u + w \gamma_-) - u\, m_X - w \left( \gamma_+ m_+ + \gamma_- m_- \right), \tag{3.5} \]
with $\psi_\pm$ as in (3.3).

Proof. Essentially, we proceed as in the 'if' part of the proof of Theorem 2.6. Let $G$ be defined as in (2.18) and set $M_t = \exp(u X_t + G_t)$. Applying the same argument as in the proof of Lemma 2.12, but replacing Brownian motion by the pure-jump martingale $\tilde J^\lambda$, we obtain
\[ G_t = \int_0^T g(T - s, u)\, \xi_0(s)\, ds - \int_0^t g(T - s, u)\, \lambda_{s-}\, ds + \int_0^t h_s(T, u)\, d\tilde J^\lambda_s, \]
where, since $d\xi_t(T) = \kappa(T - t)\, d\tilde J^\lambda_t$,
\[ h_t(T, u) = \int_t^T \kappa(r - t)\, g(T - r, u)\, dr = (g \star \kappa)(T - t, u). \]
Applying the Itô formula with jumps to $M$ we obtain
\[ M_t = M_0 + \int_0^t M_{s-} (u\, dX_s + dG_s) + \sum_{0 \le s \le t} M_{s-} \left( e^{u \Delta X_s + \Delta G_s} - 1 - u \Delta X_s - \Delta G_s \right), \]
and hence
\begin{align*}
\frac{dM_t}{M_{t-}} &= \text{loc. mg.} - \lambda_{t-}\, u\, m_X\, dt - \lambda_{t-} \left( \gamma_+ m_+ + \gamma_- m_- \right) h_t(T, u)\, dt - g(T - t, u)\, \lambda_{t-}\, dt \\
&\qquad + \lambda_{t-} \int_{\mathbb{R} \times \mathbb{R}_{\ge 0}} \left( e^{ux + y\, h_t(T, u)} - 1 \right) \chi(dx, dy)\, dt \\
&= \text{loc. mg.} + \lambda_{t-} \Big\{ R_\lambda\big( u, (g \star \kappa)(T - t, u) \big) - g(T - t, u) \Big\}\, dt, \tag{3.6}
\end{align*}
where 'loc. mg.' denotes a local martingale part that we need not compute explicitly. We see that the $dt$-terms vanish if
\[ g(\tau, u) = R_\lambda\left( u, \int_0^\tau \kappa(\tau - s)\, g(s, u)\, ds \right), \]
i.e. if the generalized convolution Riccati equation (3.4) has a solution for $0 \le \tau \le T - t$.

To show that there exists a unique global continuous solution of (3.4), we proceed as in the proof of Lemma 2.13, i.e., we set $H_u(w) = R_\lambda(u, w)$, $u \in (0, 1)$, and show that $H_u$ satisfies the conditions of Corollary A.7. In particular, for all $u \in (0, 1)$,
• $H_u(w)$ is a finite, strictly convex function on $(-\infty, 0]$ and satisfies $H_u(0) < 0$;
• $H_u(w)$ has a single root $H_u(w^*(u)) = 0$ in $(-\infty, 0]$.
Indeed, note that strict convexity is inherited from $\psi_\pm$, cf. (3.3).
In addition, convexity of the exponential function implies, for $u \in (0, 1)$, that
\[ e^{ux} = e^{u \cdot x + (1 - u) \cdot 0} \le u e^x + (1 - u) e^0 = u e^x + 1 - u, \]
with strict inequality for $x \ne 0$, and hence, using $m_X = \psi_+(1) + \psi_-(-1)$, that
\[ H_u(0) = \int \left( e^{ux} - 1 - u (e^x - 1) \right) \zeta_+(dx) + \int \left( e^{-ux} - 1 - u (e^{-x} - 1) \right) \zeta_-(dx) < 0. \]
Finally, the existence of the root $w^*(u)$ follows from the fact that
\[ \lim_{w \to -\infty} \int \left( e^{(\pm u + \gamma_\pm w) x} - 1 - \gamma_\pm w x \right) \zeta_\pm(dx) = +\infty, \]
which implies that $\lim_{w \to -\infty} H_u(w) = +\infty$. In summary, $H_u$ satisfies all conditions of Cor. A.7 and we conclude the existence of a unique global solution $g(t, u)$ of the Riccati equation for all $u \in (0, 1)$. Moreover, $g(t, u) \le 0$ for all $(t, u) \in \mathbb{R}_{\ge 0} \times (0, 1)$ by estimate (A.11). We can add the boundary cases $u \in \{0, 1\}$, observing that they yield the constant global solution $g(t, u) \equiv 0$. With this solution $g$, (3.6) shows that $M_t = \exp(u X_t + G_t)$ is a local martingale. By the same arguments as in (2.24) and (2.25) it follows that $M$ is a true martingale, and hence that $(X, \xi)$ has an affine CGF.

Example 3.2. Consider (3.1a), driven by a bivariate Hawkes process $(J^+, J^-)$ with unit jump size (i.e., $\zeta_\pm(dx) = \delta_1(dx)$), common kernel $\varphi$, and common intensity $\lambda_t$, given by
\[ \lambda_t = \mu + \int_0^t \varphi(t - s) \left( \gamma_+\, dJ^+_s + \gamma_-\, dJ^-_s \right), \]
as in [7, Sec. 2]. Set $\hat\gamma := \gamma_+ + \gamma_-$ and let $\kappa$ be the $\hat\gamma$-resolvent of $\varphi$ in the sense of (2.12). In terms of $\kappa$, the Hawkes intensity $\lambda$ has the martingale representation (cf. [2, Eq. (45)])
\[ \lambda_t = \mu + \mu \hat\gamma \int_0^t \kappa(t - u)\, du + \int_0^t \kappa(t - u)\, d\tilde J^\lambda_u, \]
with the last integral now driven by a compensated jump process. Taking conditional expectations and using the martingale property of $\tilde J^\lambda$ yields
\[ \mathbb{E}[\lambda_T \mid \mathcal{F}_t] = \mu + \mu \hat\gamma \int_0^T \kappa(T - u)\, du + \int_0^t \kappa(T - u)\, d\tilde J^\lambda_u, \]
and hence
\[ d\xi_t(T) = d\, \mathbb{E}[\lambda_T \mid \mathcal{F}_t] = \kappa(T - t)\, d\tilde J^\lambda_t, \]
which shows that the model can be cast as an AFI model with kernel $\kappa$. For concrete specifications of $\varphi$, we can take Laplace transforms and use the relation (2.13) to determine the corresponding $\kappa$. Consider, for example,
\[ \varphi(x) = e^{-(\lambda + \hat\gamma) x} \quad \text{with Laplace transform} \quad \hat\varphi(z) = \frac{1}{z + \lambda + \hat\gamma}. \tag{3.7} \]
For this $\varphi$ we obtain from (2.13) the $\hat\gamma$-resolvent
\[ \kappa(x) = e^{-\lambda x} \quad \text{with Laplace transform} \quad \hat\kappa(z) = \frac{1}{z + \lambda}, \]
i.e., the kernel of the Heston model in forward variance form; see Example 2.10. Furthermore, the Hawkes kernel
\[ \varphi(x) = x^{\alpha - 1} E_{\alpha, \alpha}\left( -(\lambda + \hat\gamma)\, x^\alpha \right) \tag{3.8} \]
has Laplace transform $\hat\varphi(z) = 1 / (z^\alpha + \lambda + \hat\gamma)$ (cf. [14, Eq. (7.5)]). Thus its $\hat\gamma$-resolvent is
\[ \kappa(x) = x^{\alpha - 1} E_{\alpha, \alpha}(-\lambda x^\alpha), \tag{3.9} \]
the kernel of the rough Heston model in forward variance form; see Example 2.11.
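The exponential case (3.7) can also be checked directly in the time domain, without Laplace transforms. The following minimal sketch evaluates the resolvent relation (2.12) in the form $\kappa - \varphi = \hat\gamma\, (\varphi \star \kappa)$ with a trapezoidal rule. The helper name `convolve`, the parameter values, and the grid size are illustrative assumptions and are not part of the paper.

```python
import math

def convolve(f, g, t, n=2000):
    """Trapezoidal approximation of (f * g)(t) = int_0^t f(t - s) g(s) ds."""
    if t == 0.0:
        return 0.0
    h = t / n
    total = 0.5 * (f(t) * g(0.0) + f(0.0) * g(t))
    total += sum(f(t - k * h) * g(k * h) for k in range(1, n))
    return h * total

# Illustrative (assumed) parameters; gamma_hat = gamma_plus + gamma_minus as in Example 3.2.
lam, gamma_plus, gamma_minus = 1.0, 1.0, 0.5
gamma_hat = gamma_plus + gamma_minus

def phi(x):
    """Exponential Hawkes kernel (3.7)."""
    return math.exp(-(lam + gamma_hat) * x)

def kappa(x):
    """Candidate gamma_hat-resolvent of phi, i.e. the Heston-type kernel."""
    return math.exp(-lam * x)

# Residual of kappa - phi = gamma_hat * (phi * kappa) at a few time points.
residuals = [
    abs(kappa(t) - phi(t) - gamma_hat * convolve(phi, kappa, t))
    for t in (0.1, 0.5, 1.0, 2.0)
]
print(f"largest residual of the resolvent equation: {max(residuals):.2e}")
```

The residual is limited only by the quadrature error, since in this case the convolution can be computed in closed form: $(\varphi \star \kappa)(t) = \left( e^{-\lambda t} - e^{-(\lambda + \hat\gamma) t} \right) / \hat\gamma$.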

We proceed to show that the AFV model is the high-frequency limit of the AFI model. This limit is closely related to the limits of 'nearly unstable' Hawkes processes considered in [16, 17, 7], see Example 4.4 below.

4.1 A first convergence result

Our starting point is the AFI model (3.1a). We assume that the buy/sell order size distributions $\zeta_\pm$ are normalized in the sense that
\[ \int x^2\, \zeta_+(dx) + \int x^2\, \zeta_-(dx) = 1, \tag{4.1} \]
and we denote by
\[ p := \int x^2\, \zeta_+(dx) \in [0, 1] \tag{4.2} \]
the variance of buy orders relative to sell orders. We introduce a small parameter $\epsilon$ and rescale (3.1) as
\[ dX^\epsilon_t = -\lambda^\epsilon_{t-}\, m^\epsilon_X\, dt + dJ^{\epsilon, +}_t - dJ^{\epsilon, -}_t, \tag{4.3a} \]
\[ d\xi^\epsilon_t(T) = \kappa^\epsilon(T - t) \left( \gamma_+\, d\tilde J^{\epsilon, +}_t + \gamma_-\, d\tilde J^{\epsilon, -}_t \right), \tag{4.3b} \]
where $J^{\epsilon, \pm}$ are pure jump semimartingales, independent given their common intensity $\lambda^\epsilon_t = \xi^\epsilon_t(t)$ and with jump height distribution
\[ \zeta^\epsilon_\pm(dx) = \zeta_\pm(dx / \sqrt{\epsilon}). \]
Moreover, the kernels are scaled as
\[ \kappa^\epsilon(x) = \frac{1}{\epsilon}\, \kappa(x). \]
Thus, as $\epsilon \downarrow 0$, the frequency of jumps increases proportional to $1/\epsilon$, while the size of jumps shrinks proportional to $\sqrt{\epsilon}$. The initial conditions of (4.3) are given by $X^\epsilon_0 = X_0$ and $\xi^\epsilon_0(T) = \frac{1}{\epsilon}\, \xi_0(T)$. Under the given scaling, the quantities from (3.3) and below transform as
\[ \psi^\epsilon_\pm(u) = \psi_\pm(\sqrt{\epsilon}\, u), \qquad m^\epsilon_X = \psi_+(\sqrt{\epsilon}) + \psi_-(-\sqrt{\epsilon}), \qquad m^\epsilon_\pm = \sqrt{\epsilon}\, m_\pm, \]
and we write
\[ R^\epsilon(u, w) = \psi^\epsilon_+(u + w \gamma_+) + \psi^\epsilon_-(-u + w \gamma_-) - u\, m^\epsilon_X - w \left( \gamma_+ m^\epsilon_+ + \gamma_- m^\epsilon_- \right). \]

Lemma 4.1. Given $\gamma_\pm > 0$ and the jump height distributions $\zeta_\pm(dx)$, define $c > 0$ and $\rho \in [-1, 1]$ by
\[ c = \sqrt{p \gamma_+^2 + (1 - p) \gamma_-^2}, \qquad \rho = \frac{1}{c} \left( p \gamma_+ - (1 - p) \gamma_- \right). \tag{4.4} \]
Then
\[ \lim_{\epsilon \to 0} \frac{1}{\epsilon} R^\epsilon(u, w) = \tfrac{1}{2}(u^2 - u) + c \rho u w + \frac{c^2}{2} w^2 = R_V(u, c w), \]
with $R_V(u, w)$ as in (2.10). Moreover, the partial derivatives with respect to $u$ and $w$ also converge, i.e.
\[ \lim_{\epsilon \to 0} \frac{1}{\epsilon} \frac{\partial R^\epsilon}{\partial u}(u, w) = \frac{\partial R_V}{\partial u}(u, c w) = u - \tfrac{1}{2} + c \rho w, \]
\[ \lim_{\epsilon \to 0} \frac{1}{\epsilon} \frac{\partial R^\epsilon}{\partial w}(u, w) = \frac{\partial R_V}{\partial w}(u, c w) = c w + \rho u. \]

Proof.

We can write
\[ R^\epsilon(u, w) = \int b^\epsilon_+(x; u, w)\, \zeta_+(dx) + \int b^\epsilon_-(x; u, w)\, \zeta_-(dx), \tag{4.5} \]
where
\[ b^\epsilon_\pm(x; u, w) = -u \left( e^{\pm \sqrt{\epsilon} x} - 1 \right) - w \gamma_\pm \sqrt{\epsilon}\, x + \exp\left( (\pm u + w \gamma_\pm) \sqrt{\epsilon}\, x \right) - 1. \]
Expanding in powers of $\sqrt{\epsilon}\, x$ yields
\[ b^\epsilon_\pm(x; u, w) = \epsilon x^2 \left( \tfrac{1}{2}(u^2 - u) \pm u w \gamma_\pm + \tfrac{1}{2} w^2 \gamma_\pm^2 \right) + O(\epsilon^{3/2} x^3). \]
Hence, using (4.1) and (4.2), it follows that
\begin{align*}
\lim_{\epsilon \to 0} \frac{1}{\epsilon} R^\epsilon(u, w) &= \tfrac{1}{2}(u^2 - u) + u w \left( p \gamma_+ - (1 - p) \gamma_- \right) + \frac{w^2}{2} \left( p \gamma_+^2 + (1 - p) \gamma_-^2 \right) \\
&= \tfrac{1}{2}(u^2 - u) + c \rho u w + \frac{c^2}{2} w^2 = R_V(u, c w),
\end{align*}
where exchanging limit and integral is justified by dominated convergence and the integrability condition $\int e^x\, \zeta_\pm(dx) < \infty$.

To show the convergence of partial derivatives, we take partial derivatives in (4.5) to obtain
\[ \frac{\partial R^\epsilon}{\partial u}(u, w) = \int \frac{\partial b^\epsilon_+}{\partial u}(x; u, w)\, \zeta_+(dx) + \int \frac{\partial b^\epsilon_-}{\partial u}(x; u, w)\, \zeta_-(dx). \]
Since $R^\epsilon$ is convex, its difference quotients converge monotonically, and monotone convergence can be used to exchange derivative and integral. Expanding $\frac{\partial b^\epsilon_\pm}{\partial u}(x; u, w)$ in powers of $\sqrt{\epsilon}\, x$, a direct calculation yields the desired limit. The proof for the $\frac{\partial}{\partial w}$-derivative is analogous.

Remark 4.2. Equation (4.4) gives important insights on the dependence of the leverage parameter $\rho$ on the microstructural parameters $p$ (asymmetry of order sizes) and $\gamma_\pm$ (asymmetry of self-excitement). Consider first the case of symmetric order size distributions $p = \tfrac{1}{2}$ as in [6]. In this case
\[ \rho = \frac{\gamma_+ - \gamma_-}{\sqrt{2 (\gamma_+^2 + \gamma_-^2)}}, \]
which takes values in $(-1/\sqrt{2}, 1/\sqrt{2}) \approx (-0.707, 0.707)$, with boundaries attained in the limiting cases $\gamma_\pm \to \infty$. When asymmetry of order sizes is also allowed, then $\rho$ can be represented as a scalar product of unit length vectors
\[ \rho = \left( \sqrt{p}, \sqrt{1 - p} \right) \cdot \begin{pmatrix} \gamma_+ \sqrt{p} / c \\ -\gamma_- \sqrt{1 - p} / c \end{pmatrix}, \]
which confirms that $\rho \in (-1, 1)$. In addition, the boundary cases $\rho = \pm 1$ are approached in the limits $p \to 1$ and $p \to 0$, respectively.

Proposition 4.3. Let $(X^\epsilon, \xi^\epsilon)$ be the rescaled AFI model (4.3).
Define $c, \rho$ as in Lemma 4.1 and set $\kappa_V(x) = c\,\kappa(x)$. Then, for any $t \ge 0$,

$$X^\epsilon_t \xrightarrow{\;\epsilon \to 0\;} X_t \quad \text{in distribution}, \tag{4.6}$$

where $(X,\xi)$ is a forward variance model with correlation parameter $\rho$ and kernel $\kappa_V$.

Proof. By Theorem 3.1, $g^\epsilon(t,u)$ in the CGF (2.6) of $X^\epsilon$ is the unique global solution of the generalized convolution Riccati equation (3.4) and hence satisfies

$$\frac{1}{\epsilon}g^\epsilon(t,u) = \frac{1}{\epsilon}R^\epsilon\left(u, \kappa^\epsilon \star g^\epsilon(t,u)\right) = \frac{1}{\epsilon}R^\epsilon\left(u, \kappa \star \left(\tfrac{1}{\epsilon}g^\epsilon\right)(t,u)\right). \tag{4.7}$$

Note that $\frac{1}{\epsilon}R^\epsilon(u,w)$ is jointly continuous in all variables, and by Lemma 4.1 converges to $R_V(u,cw)$ as $\epsilon \to 0$. By Corollary A.7, equation (4.7) can be transformed into a non-linear Volterra equation of type (A.6), whose solution depends jointly continuously on $(t,\epsilon,u)$ by [13, Thm. 13.1.1]. We conclude that $\frac{1}{\epsilon}g^\epsilon(t,u)$ converges, uniformly for $(t,u)$ in compacts, to $g(t,u)$ as $\epsilon \to 0$, where $g(t,u)$ is the unique solution (cf. Theorem 2.6) of

$$g(t,u) = R_V\left(u, c\,\kappa \star g(t,u)\right) = R_V\left(u, \kappa_V \star g(t,u)\right).$$

Using Theorems 2.6 and 3.1, we conclude that

$$E\left[e^{uX^\epsilon_t}\right] = \exp\left(uX_0 + \int_0^t g^\epsilon(t-s,u)\,\xi^\epsilon_0(s)\,ds\right) \to \exp\left(uX_0 + \int_0^t g(t-s,u)\,\xi_0(s)\,ds\right) = E\left[e^{uX_t}\right],$$

as $\epsilon \to 0$, i.e., the moment generating function of $X^\epsilon_t$ converges to the moment generating function of $X_t$ on $u \in [0,1]$. By [4, Prob. 30.4], convergence of moment generating functions on a (non-empty) interval implies the convergence in distribution in (4.6).

The following example shows that the scaling in (4.3) is related to the 'nearly unstable' limit of Hawkes models in [7].

Example 4.4. We continue Example 3.2 and consider the bivariate Hawkes process from [7] with Mittag-Leffler kernel (3.8). Introduce a small parameter $\epsilon$ and scale the kernel as

$$\varphi^\epsilon(x) = \frac{1}{\epsilon}\,x^{\alpha-1}E_{\alpha,\alpha}\left(-\left(\lambda + \frac{\tilde\gamma}{\epsilon}\right)x^\alpha\right).$$

In terms of its Laplace transform, this scaling becomes $\hat\varphi^\epsilon(z) = 1/\left(\epsilon(z^\alpha + \lambda) + \tilde\gamma\right)$. In particular, we have

$$\tilde\gamma\int_0^\infty \varphi^\epsilon(x)\,dx = \tilde\gamma\,\hat\varphi^\epsilon(0) = \frac{\tilde\gamma}{\epsilon\lambda + \tilde\gamma} \to 1,$$

i.e., as $\epsilon \to 0$ the Hawkes process becomes nearly unstable. The Laplace transform of the resolvent kernel $\kappa^\epsilon(x)$ can be determined as

$$\hat\kappa^\epsilon(z) = \frac{\hat\varphi^\epsilon(z)}{1 - \tilde\gamma\,\hat\varphi^\epsilon(z)} = \frac{1/\epsilon}{z^\alpha + \lambda}$$

and thus the resolvent kernel is given by

$$\kappa^\epsilon(x) = \frac{1}{\epsilon}\,x^{\alpha-1}E_{\alpha,\alpha}(-\lambda x^\alpha) = \frac{1}{\epsilon}\,\kappa(x).$$

Together with square-root scaling of the jump size we are exactly in the setting of (4.3) and conclude from Proposition 4.3 that the (univariate) marginal distributions of $X^\epsilon$ converge to those of the corresponding AFV model, i.e. the rough Heston model (cf. Example 2.11). Theorem 4.10 below strengthens this result to convergence of all finite-dimensional marginal distributions. $\diamond$

In this subsection, we derive results on the joint moment generating function of log-price and forward variance and of the finite-dimensional marginal distributions of $X$.

Assumption 4.5.

We assume that $(X,\xi)$ is either an AFV model or an AFI model, and we write $R(u,w)$ for the corresponding function $R_V(u,w)$ or $R_\lambda(u,w)$. In addition we denote, for any $u \in (0,1)$, by $w_*(u)$ the unique root where $R(u, w_*(u)) = 0$ and $w_*(u) < 0$.

Note that the function $R(u,w)$ has already been studied in the context of affine stochastic volatility models in [18, Lem. 3.2ff]. In particular, we note that $R(u,w)$ and $w_*(u)$ are convex functions for $u \in [0,1]$, $w \le 0$.

Proposition 4.6. Let $(X,\xi)$ be an AFV or an AFI model and let $R(u,w)$ and $w_*(u)$ be defined as in Assumption 4.5. Let $\Delta > 0$, $T' = T + \Delta$, and let $h$ be a piecewise continuous $\mathbb{R}_{\le 0}$-valued function on $[0,\Delta]$, such that $w_*(u) < \int_0^\Delta \kappa(\Delta - s)h(s)\,ds$. Then

$$E\left[\exp\left(uX_T + \int_T^{T'} h(T'-s)\,\xi_T(s)\,ds\right)\,\middle|\,\mathcal{F}_t\right] = \exp\left(uX_t + \int_t^{T'} g(T'-s,u,h)\,\xi_t(s)\,ds\right), \tag{4.8}$$

where $g(\cdot,u,h): \mathbb{R}_{\ge 0} \to \mathbb{R}_{\le 0}$ is the unique continuous solution of the (generalized) convolution Riccati equation

$$g(t,u,h) = R\left(u, \int_0^t \kappa(t-s)\,g(s,u,h)\,ds\right), \qquad t \in [\Delta,\infty), \tag{4.9}$$

with initial condition

$$g(t,u,h) = h(t), \qquad t \in [0,\Delta). \tag{4.10}$$

Remark 4.7. Note that the expression (4.8) for the joint moment generating function does not correspond to the exponential-affine transform formula (4.6) of [1]. Specifically, $h$ constant in (4.8) would give the joint moment generating function of $X_T$ and the forward variance swap $\int_T^{T'}\xi_T(s)\,ds$. In contrast, $f$ constant in (4.6) of [1] would give the joint moment generating function of $X_T$ and quadratic variation $\int_0^T V_s\,ds$.

Proof.

The existence of a unique, $\mathbb{R}_{\le 0}$-valued solution to (4.9) with initial condition (4.10) follows from an application of Corollary A.8 with $H_u(w) = R(u,w)$. In the proofs of Theorem 2.6 and Theorem 3.1, we have already established that $H_u$ satisfies the necessary conditions to apply the corollary. Next, we define $G^\Delta_t = \int_t^{T'} g(T'-s,u,h)\,\xi_t(s)\,ds$ and specialize to the forward variance case. By Lemma 2.12, it holds that

$$dG^\Delta_t = -g(T'-t,u,h)\,V_t\,dt + \left(\int_t^{T'}\kappa(r-t)\,g(T'-r,u,h)\,dr\right)\sqrt{V_t}\,dW_t$$

and that $M^\Delta_t$ is a local martingale on $[0,T]$ if

$$g(T'-t,u,h) = R\left(u, \int_t^{T'}\kappa(r-t)\,g(T'-r,u,h)\,dr\right).$$

Setting $\tau = T'-t \in [\Delta,T']$ this is exactly (4.9). We conclude that $M^\Delta_t$ is a local martingale on $[0,T]$ and, repeating the argument following (2.24), even a true martingale. Using the initial condition (4.10), we observe that

$$E\left[\exp\left(uX_T + \int_T^{T'} h(T'-s)\,\xi_T(s)\,ds\right)\,\middle|\,\mathcal{F}_t\right] = E\left[M^\Delta_T \,\middle|\, \mathcal{F}_t\right] = M^\Delta_t = \exp\left(uX_t + \int_t^{T'} g(T'-s,u,h)\,\xi_t(s)\,ds\right),$$

showing (4.8). The proof in the AFI case is analogous with the following modifications: $W_t$ has to be substituted by the pure-jump martingale $\tilde{J}^X_t$ and $V_t$ by the intensity $\lambda_{t-}$. Itô's formula for jump processes can then be applied as in the proof of Theorem 3.1.

Proposition 4.8. Let $(X,\xi)$ be an AFV or an AFI model and let $R(u,w)$ and $w_*(u)$ be defined as in Assumption 4.5. Let $t_0 \le t_1 \le \cdots \le t_n = T$ and $u = (u_0,\ldots,u_{n-1}) \in (0,1)^n$ be such that $w_*(u_0) \le w_*(u_1) \le \cdots \le w_*(u_{n-1})$. Then, for all $k \in \{0,\ldots,n-1\}$,

$$E\left[\exp\left(u_k(X_{t_{k+1}} - X_{t_k}) + \cdots + u_{n-1}(X_T - X_{t_{n-1}})\right)\,\middle|\,\mathcal{F}_{t_k}\right] = \exp\left(\int_{t_k}^T g_k(T-s,u)\,\xi_{t_k}(s)\,ds\right), \tag{4.11}$$

where the functions $g_k$ are defined by backward recursion as the solutions of the convolution Riccati equations

$$g_k(t,u) = R\left(u_k, \int_0^t \kappa(t-s)\,g_k(s,u)\,ds\right), \qquad t \in [T - t_{k+1}, T - t_k), \tag{4.12}$$

with initial conditions

$$g_k(t,u) = g_{k+1}(t,u), \qquad t \in [0, T - t_{k+1}). \tag{4.13}$$

Remark 4.9. Note that for $k = n-1$, equation (4.12) is a convolution Riccati equation without initial condition and (4.13) becomes void (i.e. a condition over an empty set).

Proof.

We show the result by backward induction on $k$: For $k = n-1$ the claim reduces to Theorem 2.6, when $(X,\xi)$ is an AFV model, and to Theorem 3.1, when $(X,\xi)$ is an AFI model. Setting $\Delta_k := T - t_k$, we obtain from (A.15) in Corollary A.8 that

$$w_*(u_{n-1}) < \int_0^{\Delta_{n-1}}\kappa(\Delta_{n-1} - s)\,g_{n-1}(s,u)\,ds. \tag{4.14}$$

For the induction step assume that (4.11) has been shown for a certain $k$ and that (4.14) holds with $n-1$ replaced by $k$. Writing

$$Z_{k-1} := \exp\left(u_{k-1}(X_{t_k} - X_{t_{k-1}}) + \cdots + u_{n-1}(X_T - X_{t_{n-1}})\right),$$

we obtain

$$E\left[Z_{k-1}\,\middle|\,\mathcal{F}_{t_{k-1}}\right] = E\left[\exp\left(u_{k-1}(X_{t_k} - X_{t_{k-1}})\right)\cdot E\left[Z_k\,\middle|\,\mathcal{F}_{t_k}\right]\,\middle|\,\mathcal{F}_{t_{k-1}}\right] = E\left[\exp\left(u_{k-1}(X_{t_k} - X_{t_{k-1}}) + \int_{t_k}^T g_k(T-s,u)\,\xi_{t_k}(s)\,ds\right)\,\middle|\,\mathcal{F}_{t_{k-1}}\right].$$

Since

$$w_*(u_{k-1}) \le w_*(u_k) < \int_0^{\Delta_k}\kappa(\Delta_k - s)\,g_k(s,u)\,ds$$

we may apply Proposition 4.6 with $\Delta = \Delta_k$ and obtain (4.11) with $g_{k-1}$ as solution of (4.12) with initial condition (4.13). Finally, (4.14) holds with $n-1$ replaced by $k-1$, using the estimate (A.15) from Corollary A.8.

Theorem 4.10.

Let $(X^\epsilon,\xi^\epsilon)$ be the rescaled AFI model (4.3), define $c, \rho$ as in Lemma 4.1 and set $\kappa_V(x) = c\,\kappa(x)$. Then, for any $n \in \mathbb{N}$ and $0 = t_0 \le t_1 \le \cdots \le t_n = T$,

$$\left(X^\epsilon_{t_1},\ldots,X^\epsilon_{t_n}\right) \xrightarrow{\;\epsilon \to 0\;} \left(X_{t_1},\ldots,X_{t_n}\right) \quad \text{in distribution}, \tag{4.15}$$

where $(X,\xi)$ is the AFV model with correlation parameter $\rho$ and kernel $\kappa_V$.

Proof. By Lemma 4.1, $\frac{1}{\epsilon}R^\epsilon(u,w)$ converges to $R_V(u,cw)$ and the same holds true for the partial derivatives with respect to $u$ and $w$. Therefore, by the implicit function theorem, also $w^\epsilon_*(u)$ and $\frac{\partial}{\partial u}w^\epsilon_*(u)$ converge to $\frac{1}{c}w_*(u)$ and $\frac{1}{c}w_*'(u)$ as $\epsilon \to 0$, for all $u \in (0,1)$. Moreover, since the $w^\epsilon_*$ are convex functions of $u$, the convergence is uniform on compacts (cf. [20, Thm. 10.8]). The limit $\frac{1}{c}w_*(u)$ can be calculated explicitly and is given by

$$\frac{1}{c}w_*(u) = \frac{1}{c}\left(-\rho u - \sqrt{\rho^2u^2 + (u - u^2)}\right).$$

It is easy to see that $w_*$ is decreasing on $(0,u_*)$ and increasing on $(u_*,1)$, where

$$u_* := \begin{cases}\dfrac{1}{2}\,\dfrac{1-|\rho|}{1-\rho^2} & \text{if } \rho \in (-1,1),\\[2mm] \dfrac{1}{4} & \text{if } |\rho| = 1.\end{cases}$$

We conclude that there is $N \in \mathbb{N}$ and a closed interval $I \subset (0,u_*)$ with non-empty interior, such that $u \mapsto w_*(u)$ and $u \mapsto w^\epsilon_*(u)$ are decreasing on $I$ for all $\epsilon \le 1/N$. Introduce the set

$$D := \{u \in I^n : u_0 \ge u_1 \ge \cdots \ge u_{n-1}\} \subset (0,1)^n$$

and note that also $D$ is closed with non-empty interior. In addition, $w^\epsilon_*(u_0) \le w^\epsilon_*(u_1) \le \cdots \le w^\epsilon_*(u_{n-1})$ for all $u = (u_0,\ldots,u_{n-1}) \in D$ and $\epsilon \le 1/N$, and the same holds for $w_*$. From Proposition 4.8 we conclude that the joint moment generating function of the increments $(X^\epsilon_{t_1} - X^\epsilon_{t_0}, X^\epsilon_{t_2} - X^\epsilon_{t_1}, \ldots, X^\epsilon_{t_n} - X^\epsilon_{t_{n-1}})$ is of the form

$$Z^\epsilon(u) := E\left[\exp\left(u_0(X^\epsilon_{t_1} - X^\epsilon_{t_0}) + \cdots + u_{n-1}(X^\epsilon_T - X^\epsilon_{t_{n-1}})\right)\right] = \exp\left(\int_0^T g^\epsilon_0(T-s,u)\,\xi^\epsilon_0(s)\,ds\right),$$

for any $u \in D$ and $\epsilon \le 1/N$, where $g^\epsilon_0$ satisfies the iterated Riccati convolution equations (4.12) with $R(u,w) = R^\epsilon(u,w)$. By Corollary A.8 each of these equations can be transformed into a non-linear Volterra equation, whose solution depends continuously on $(\epsilon,t,u)$ by [13, Thm. 13.1.1]. In addition, Lemma 4.1 yields the convergence $\frac{1}{\epsilon}R^\epsilon(u,w) \to R_V(u,cw)$. Hence we conclude, as in the proof of Proposition 4.3, that $\frac{1}{\epsilon}g^\epsilon_0(t,u)$ converges, uniformly for $(t,u)$ in compacts, to $g_0(t,u)$ as $\epsilon \to 0$, where $g_0(t,u)$ is the unique solution of the iterated Riccati convolution equations (4.12) with $R(u,w) = R_V(u,cw)$.

Consider now the joint moment generating function $Z(u)$ of the increments $(X_{t_1} - X_{t_0}, X_{t_2} - X_{t_1}, \ldots, X_{t_n} - X_{t_{n-1}})$ of the AFV model with parameter $\rho$ and kernel $\kappa_V = c\,\kappa$. The convergence $\frac{1}{\epsilon}g^\epsilon_0(t,u) \to g_0(t,u)$ together with Proposition 4.8 yields

$$Z^\epsilon(u) = \exp\left(\int_0^T g^\epsilon_0(T-s,u)\,\xi^\epsilon_0(s)\,ds\right) \to \exp\left(\int_0^T g_0(T-s,u)\,\xi_0(s)\,ds\right) = Z(u)$$

for all $u \in D$. By [4, Thm. 29.4 and Prob. 30.4] convergence of the moment generating function on a set with non-empty interior implies convergence in distribution, and (4.15) follows.

A Some results on Volterra equations with convex non-linearity
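As a numerical orientation for this appendix, the following minimal sketch solves a Volterra equation of the form (A.6) and checks the two-sided bound of case (a) of Theorem A.5 at the final grid point. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the non-linearity is $H(w) = R_V(u,w)$ with $u = 0.5$, $\rho = -0.7$ (so that $w_{\max} = 0$ and $H$ is decreasing on $(-\infty, 0]$), the kernel is $\kappa(x) = e^{-x}$, the forcing function is $a \equiv 0$, and the discretization is a simple explicit left-point rule.

```python
import math

U, RHO = 0.5, -0.7  # illustrative parameters (assumption, not from the paper)

def H(w):
    # Convex non-linearity H(w) = R_V(u, w) = (u^2 - u)/2 + rho*u*w + w^2/2
    return 0.5 * (U * U - U) + RHO * U * w + 0.5 * w * w

def negative_root():
    # Root w_* < 0 of H, obtained from the quadratic formula
    return -RHO * U - math.sqrt(RHO * RHO * U * U + U - U * U)

def kernel(x):
    # Decreasing kernel kappa(x) = exp(-x) (illustrative choice)
    return math.exp(-x)

def kernel_integral(t):
    # K(t) = int_0^t kappa(s) ds for the exponential kernel
    return 1.0 - math.exp(-t)

def solve_volterra(a, T=10.0, n=1000):
    """Left-point rule for f(t) = a(t) + int_0^t kappa(t-s) H(f(s)) ds."""
    h = T / n
    t = [i * h for i in range(n + 1)]
    f = [a(0.0)]
    for i in range(1, n + 1):
        acc = sum(kernel(t[i] - t[j]) * H(f[j]) for j in range(i))
        f.append(a(t[i]) + h * acc)
    return t, f

def lower_bound(a0, t):
    """r(t) = rho(K(t)), where rho solves d(rho)/dx = H(rho), rho(0) = a0.

    This is the inverse of w -> -int_w^a dz/H(z) evaluated at K(t),
    approximated here by explicit Euler steps in the variable x = K(t).
    """
    r = [a0]
    for i in range(1, len(t)):
        dx = kernel_integral(t[i]) - kernel_integral(t[i - 1])
        r.append(r[-1] + dx * H(r[-1]))
    return r

w_star = negative_root()
t, f = solve_volterra(lambda s: 0.0)
r = lower_bound(0.0, t)
print(f"w_* = {w_star:.4f}, r(T) = {r[-1]:.4f}, f(T) = {f[-1]:.4f}")
```

For these parameters the solution decreases from $0$ towards the fixed point of $f = H(f)$, staying strictly between $w_*$ and $a \equiv 0$. Near $t = 0$ the solution and its lower bound agree to second order, so a coarse grid may not resolve their ordering there; the comparison is only meaningful away from the origin.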

We show some results on Volterra equations with convex non-linearity, of the type appearing in Theorems 2.6 and 3.1. On the non-linearity we impose the following assumptions:

Assumption A.1.

The function $H: (-\infty, w_{\max}] \to \mathbb{R}$ is continuously differentiable and convex with a unique root $H(w_*) = 0$ in $(-\infty, w_{\max}]$. Moreover, $H'(w_*) < 0$ and $H(w_{\max}) < 0$.

For a function $H$ satisfying Assumption A.1, we set

$$w_0 = \operatorname*{argmin}_{w \in (-\infty, w_{\max}]} H(w);$$

if the minimum is not unique (i.e., if $H$ has a flat part), then $w_0$ shall denote the leftmost minimizer. Note that either

• $w_0 = w_{\max}$, in which case $H$ is strictly decreasing on $(-\infty, w_{\max}]$; or
• $w_0 < w_{\max}$, in which case $H$ is strictly decreasing on $(-\infty, w_0)$ and increasing on $[w_0, w_{\max}]$.

In any case, $w_* < w_0 \le w_{\max}$ holds true. Also the following definition will be useful:

Definition A.2.

Let $H$ be a function satisfying Assumption A.1. The decreasing envelope of $H$ is defined as

$$\underline{H}(w) := \begin{cases} H(w), & w \le w_0,\\ H(w_0), & w \in [w_0, w_{\max}].\end{cases} \tag{A.1}$$

Clearly $\underline{H}$ also satisfies Assumption A.1, but is in addition decreasing and satisfies $\underline{H} \le H$. Both Assumption A.1 and Definition A.2 are illustrated in Figure 1.

Figure 1: Illustration of two convex functions $H_1$, $H_2$ satisfying Assumption A.1. While $H_1$ is monotone decreasing, $H_2$ is not, and its decreasing envelope $\underline{H}_2$ is also shown.

Lemma A.3. Let $H: (-\infty, w_{\max}] \to \mathbb{R}$ be a convex function that satisfies Assumption A.1; in particular it has a root $H(w_*) = 0$. Then

(a) For any $a \in (w_*, w_{\max}]$ the function

$$w \mapsto Q_1(w,a) = -\int_w^a \frac{d\zeta}{H(\zeta)} \tag{A.2}$$

maps $(w_*, a]$ onto $[0,\infty)$, is strictly decreasing, and has an inverse $Q_1^{-1}(r,a)$, which maps $[0,\infty)$ onto $(w_*, a]$.

(b) For any $a \in (-\infty, w_*)$ the function

$$w \mapsto Q_2(w,a) = \int_a^w \frac{d\zeta}{H(\zeta)} \tag{A.3}$$

maps $[a, w_*)$ onto $[0,\infty)$, is strictly increasing, and has an inverse $Q_2^{-1}(r,a)$, which maps $[0,\infty)$ onto $[a, w_*)$.

Remark A.4. Analogous to (A.2), we denote by $\underline{Q}_1$ the function

$$w \mapsto \underline{Q}_1(w,a) = -\int_w^a \frac{d\zeta}{\underline{H}(\zeta)}, \tag{A.4}$$

where $\underline{H}$ is the decreasing envelope of $H$.

Proof.

To show (a), note that the integrand $-1/H(\zeta)$ is strictly positive on $(w_*, a)$. It follows that $Q_1(\cdot,a)$ is strictly decreasing and maps $(w_*, a]$ into $[0,\infty)$. It remains to show that the range of this map covers all of $[0,\infty)$. To this end, observe that by convexity we have

$$H(w) \ge H'(w_*)(w - w_*), \qquad \text{for all } w \in (-\infty, w_{\max}], \tag{A.5}$$

and $H'(w_*) < 0$. Thus, we obtain

$$\lim_{w \downarrow w_*} Q_1(w,a) = -\int_{w_*}^a \frac{d\zeta}{H(\zeta)} \ge -\frac{1}{H'(w_*)}\int_{w_*}^a \frac{d\zeta}{\zeta - w_*} = +\infty.$$

The proof of (b) is analogous; only the different sign of $H$ on $(-\infty, w_*)$ has to be taken into account.

Theorem A.5. Let $\kappa$ be an L-kernel in the sense of Definition 2.2 and let $H$ be a convex function that satisfies Assumption A.1; in particular $H(w_*) = 0$ is its unique root in $(-\infty, w_{\max}]$. For any continuous function $a: \mathbb{R}_{\ge 0} \to (-\infty, w_{\max}]$ consider the non-linear Volterra equation

$$f(t) = a(t) + \int_0^t \kappa(t-s)\,H(f(s))\,ds, \qquad t \in \mathbb{R}_{\ge 0}. \tag{A.6}$$

(a) If $a$ is increasing with values in $(w_*, w_0]$ then (A.6) has a unique global solution $f$ which satisfies

$$w_* < r(t) \le f(t) < a(t), \qquad \forall\, t > 0, \tag{A.7}$$

where $r(t) = Q_1^{-1}\left(\int_0^t \kappa(s)\,ds,\ a(0)\right)$ and $Q_1$ is given by (A.2).

(b) If $a \equiv w_*$ then $f \equiv w_*$ is the unique global solution of (A.6).

(c) If $a$ is decreasing with values in $(-\infty, w_*)$ then (A.6) has a unique global solution $f$ which satisfies

$$a(t) < f(t) \le r(t) < w_*, \qquad \forall\, t > 0, \tag{A.8}$$

where $r(t) = Q_2^{-1}\left(\int_0^t \kappa(s)\,ds,\ a(0)\right)$ and $Q_2$ is given by (A.3).

In addition, case (a) can be extended to the following more general statement:

(a') If $a$ is increasing with values in $(w_*, w_{\max}]$ then (A.6) has a unique global solution $f$ which satisfies

$$w_* < r(t) \le f(t) < a(t), \qquad \forall\, t > 0, \tag{A.9}$$

where $r(t) = \underline{Q}_1^{-1}\left(\int_0^t \kappa(s)\,ds,\ a(0)\right)$ and $\underline{Q}_1$ is given by (A.4).

Remark A.6. Clearly, if $H$ is decreasing (and hence $w_0 = w_{\max}$), cases (a) and (a') coincide. In the general case (a) gives better bounds on $f$ than (a'), but is more restrictive in its assumption on the function $a$.

Before proving the theorem, we add two corollaries that are used in the proofs of Theorems 2.6, 3.1 and 4.10.

Corollary A.7. Under the assumptions of Theorem A.5, consider the non-linear integral equation

$$g(t) = H\left(a(t) + \int_0^t \kappa(t-s)\,g(s)\,ds\right), \qquad t \in \mathbb{R}_{\ge 0}. \tag{A.10}$$

(a) If $a$ is increasing with values in $(w_*, w_0]$ then (A.10) has a unique global solution $g$ which satisfies

$$H(a(t)) < g(t) \le H(r(t)) < 0, \qquad \forall\, t > 0. \tag{A.11}$$

(b) If $a \equiv w_*$ then $g \equiv 0$ is the unique global solution of (A.10).

(c) If $a$ is decreasing with values in $(-\infty, w_*)$ then (A.10) has a unique global solution $g$ which satisfies

$$0 < H(r(t)) \le g(t) < H(a(t)), \qquad \forall\, t > 0. \tag{A.12}$$

In addition, case (a) can be extended to:

(a') If $a$ is increasing with values in $(w_*, w_{\max}]$ then (A.10) has a unique global solution $g$ which satisfies

$$g(t) < 0, \qquad \forall\, t > 0. \tag{A.13}$$

In any of the above cases, $g(t) = H(f(t))$, where $f$ is the solution of (A.6).

Corollary A.8. Let the assumptions of Theorem A.5 hold with $w_{\max} = 0$. Let $\Delta > 0$ and let $h$ be a piecewise continuous function from $[0,\Delta)$ to $\mathbb{R}_{\le 0}$. Consider the non-linear integral equation

$$g(t) = H\left(\int_0^t \kappa(t-s)\,g(s)\,ds\right), \qquad t \in [\Delta,\infty), \tag{A.14}$$

with initial condition $g(t) = h(t)$, $t \in [0,\Delta)$. If $w_* < \int_0^\Delta \kappa(\Delta - s)h(s)\,ds$, then (A.14) has a unique global solution $g$ taking values in $\mathbb{R}_{\le 0}$, which satisfies

$$w_* < \int_0^t \kappa(t-s)\,g(s)\,ds \qquad \text{for all } t \ge \Delta. \tag{A.15}$$

We start with the proof of Theorem A.5, which closely follows the account of Lakshmikantham's comparison method in [3, Sec. II.7].

Proof of Theorem A.5.

Clearly, $H$ can be extended to a continuous function on all of $\mathbb{R}$ and thus it follows from [13, Thm. 12.1.1] that (A.6) has a local continuous solution $f$ on an interval $[0, T_{\max})$ with $T_{\max} > 0$. In addition, $T_{\max}$ can be chosen maximal, in the sense that the solution cannot be continued beyond $[0, T_{\max})$.

Case (a): By assumption, $a$ is increasing and takes values in $(w_*, w_0]$. Set

$$T_* := \inf\{t \in (0, T_{\max}) : f(t) = w_* \text{ or } f(t) = a(T_{\max})\} \tag{A.16}$$

and note that $T_* > 0$. From (A.6) it is clear that

$$f(t) = a(t) + \int_0^t \kappa(t-s)\,H(f(s))\,ds < a(t) \le a(T_{\max}), \qquad \forall\, t \in [0, T_*), \tag{A.17}$$

i.e. the lower bound $w_*$ in (A.16) is always hit before the upper bound $a(T_{\max})$. In addition, using that the kernel $\kappa$ is decreasing, we obtain that

$$f(t) = a(t) + \int_0^t \kappa(t-s)\,H(f(s))\,ds \le a(T) + \int_0^t \kappa(T-s)\,H(f(s))\,ds =: v(t,T) \tag{A.18}$$

for all $0 \le t \le T \le T_*$. The function $v(t,T)$, which we have just defined, satisfies

$$v(t,t) = f(t), \tag{A.19}$$

$$v(0,T) = a(T) \ge a(0), \tag{A.20}$$

and the differential inequality

$$\frac{\partial}{\partial t}v(t,T) = \kappa(T-t)\,H(f(t)) \ge \kappa(T-t)\,H(v(t,T)). \tag{A.21}$$

Here, we have used (A.18) and the fact that $H$ is decreasing on $(w_*, w_0]$. Together with the initial estimate (A.20), a standard comparison principle for differential inequalities (cf. [22, II. §]) yields

$$v(t,T) \ge r(t,T), \tag{A.22}$$

where

$$\frac{\partial}{\partial t}r(t,T) = \kappa(T-t)\,H(r(t,T)), \qquad r(0,T) = a(0). \tag{A.23}$$

We claim that the differential equation (A.23) is solved by

$$r(t,T) = Q_1^{-1}\left(\int_0^t \kappa(T-s)\,ds,\ a(0)\right). \tag{A.24}$$

Indeed, applying $Q_1(\cdot, a(0))$ to both sides of (A.24) yields

$$\int_0^t \kappa(T-s)\,ds = Q_1(r(t,T), a(0)) = -\int_{r(t,T)}^{a(0)} \frac{d\zeta}{H(\zeta)}.$$

Taking $\frac{\partial}{\partial t}$-derivatives, we obtain

$$\kappa(T-t) = \frac{1}{H(r(t,T))}\,\frac{\partial}{\partial t}r(t,T),$$

which is equivalent to (A.23). From (A.18), (A.19) and (A.22) we obtain the bound

$$r(t) := \lim_{T \downarrow t} r(t,T) \le \lim_{T \downarrow t} v(t,T) = f(t) \tag{A.25}$$

for all $t \in [0, T_*)$. This implies that

$$\lim_{t \to T_*} f(t) \ge r(T_*) > w_*, \tag{A.26}$$

which, in light of (A.16), means that $T_* = T_{\max}$, i.e. we have shown the bounds (A.7) to hold for all $t \in [0, T_{\max})$. However, by [13, Thm. 12.1.1], $\lim_{t \to T_{\max}} |f(t)| = \infty$ whenever $T_{\max} < \infty$. We conclude that $T_{\max} = \infty$, and hence that $f$ is a global solution of (A.6). Uniqueness follows from [13, Thm. 13.1.2].

Case (b):

By assumption, $a \equiv w_*$. Since $H(w_*) = 0$ it is clear that $f(t) \equiv w_*$ is a global solution of (A.6). Uniqueness follows from [13, Thm. 13.1.2].

Case (c): By assumption, $a$ is decreasing and takes values in $(-\infty, w_*)$. This case can be handled analogously to case (a) with the following adaptations: The inequality signs in equations (A.17) to (A.22) have to be reversed. In (A.24) $Q_1$ has to be substituted by $Q_2$ and also in (A.25) and (A.26) the inequalities have to be reversed.

Case (a'): The proof of Case (a) applies, except for the following modification: (A.21) holds only when $v(t,T) \le w_0$, since $H$ is decreasing only on $(-\infty, w_0]$. However, when $v(t,T) > w_0$, we can use the trivial estimate

$$\frac{\partial}{\partial t}v(t,T) = \kappa(T-t)\,H(f(t)) \ge \kappa(T-t)\,H(w_0),$$

which can be combined with (A.21) into

$$\frac{\partial}{\partial t}v(t,T) = \kappa(T-t)\,H(f(t)) \ge \kappa(T-t)\,\underline{H}(v(t,T)),$$

where $\underline{H}$ is the decreasing envelope of $H$ from Definition A.2. The remaining proof of Case (a) applies after substituting $H$ by $\underline{H}$ and $Q_1$ by $\underline{Q}_1$.

Proof of Corollary A.7. Let $f$ be the global solution of (A.6). Applying $H$ to both sides of (A.6), we see that $g(t) := H(f(t))$ is a global solution of (A.10). Next, we show uniqueness. To this end, assume that $\tilde g$ is a local solution of (A.10) on $[0,T)$, different from $g$, and define

$$\tilde f(t) := a(t) + \int_0^t \kappa(t-s)\,\tilde g(s)\,ds.$$

Clearly, $\tilde g(t) = H(\tilde f(t))$ on $[0,T)$, and hence $\tilde f$ is a local solution of (A.6). By [13, Thm. 13.1.2], this solution is unique, and we conclude that $\tilde f = f$, and hence also $\tilde g = g$. Finally, applying $H$, which is decreasing on $(-\infty, w_0]$, to the inequalities (A.7) and (A.8) yields (A.11) and (A.12). In case (a') monotonicity of $H$ is lost, but $H(w) < 0$ for all $w \in (w_*, w_{\max}]$ yields (A.13).

Proof of Corollary A.8.

Set

$$a(t) := \int_0^\Delta \kappa(t + \Delta - s)\,h(s)\,ds$$

and note that $a$ is increasing with values in $(w_*, 0]$. Consider the non-linear Volterra equation

$$f(t) = a(t) + \int_0^t \kappa(t-s)\,H(f(s))\,ds, \tag{A.27}$$

which has a unique global solution $f$ by Theorem A.5 (a) or (a'). For $t \in \mathbb{R}_{\ge 0}$ set

$$g(t) = \begin{cases} H(f(t - \Delta)), & t \in [\Delta, \infty),\\ h(t), & t \in [0,\Delta).\end{cases}$$

For $t \ge \Delta$ we have

$$g(t) = H(f(t-\Delta)) = H\left(\int_0^\Delta \kappa(t-s)\,h(s)\,ds + \int_\Delta^t \kappa(t-s)\,g(s)\,ds\right) = H\left(\int_0^t \kappa(t-s)\,g(s)\,ds\right),$$

showing that $g$ is a global solution of (A.14). From cases (a) or (a') of Theorem A.5, we obtain the bound

$$w_* < f(t - \Delta) = \int_0^t \kappa(t-s)\,g(s)\,ds,$$

as claimed. To show uniqueness, assume that $\tilde g$ is a solution of (A.14), different from $g$. Setting

$$\tilde f(t) := a(t) + \int_0^t \kappa(t-s)\,\tilde g(s + \Delta)\,ds,$$

we see that $\tilde f$ is a solution of (A.27) and conclude from Theorem A.5 that $\tilde f = f$ and hence also $\tilde g = g$.

B Theorem 2.6 in the case $\rho > 0$

We provide the remaining part of the proof of Theorem 2.6 in the case $\rho > 0$. In this case, additional arguments are needed, since the quadratic equation (B.1) may have two negative solutions.
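The argument that follows hinges on the two roots $q_\pm$ of the quadratic equation (B.1) and on the level of $g$ at which they collide. A minimal numerical sketch of these objects is given here; the function names and the parameter values $u = 0.5$, $\rho = 0.3$ are illustrative assumptions and do not come from the paper.

```python
import math

def quadratic_roots(u, rho, g):
    """Roots (q_minus, q_plus) of (u^2-u)/2 - g + u*rho*x + x^2/2 = 0, cf. (B.1)-(B.2)."""
    disc = u * u * (rho * rho - 1.0) + u + 2.0 * g
    if disc < -1e-12:
        raise ValueError("no real roots: g lies below the collision level")
    s = math.sqrt(max(disc, 0.0))  # clamp tiny negative rounding errors
    return -rho * u - s, -rho * u + s

def residual(u, rho, g, x):
    # Left-hand side of (B.1), with x in place of the convolution (g * k)_t
    return 0.5 * (u * u - u) - g + u * rho * x + 0.5 * x * x

def collision_level(u, rho):
    # Value of g at which the discriminant in (B.2) vanishes, i.e. q_plus == q_minus
    return 0.5 * (u * u - u) - 0.5 * u * u * rho * rho

u, rho = 0.5, 0.3  # illustrative values with rho > 0

# At tau = 0 we have g(0, u) = (u^2 - u)/2; the convolution vanishes there,
# which is matched by q_plus (equal to 0) and not by q_minus (equal to -2*rho*u).
g_start = 0.5 * (u * u - u)
q_minus_start, q_plus_start = quadratic_roots(u, rho, g_start)

# At the collision level both roots coincide at -rho*u.
g_coll = collision_level(u, rho)
q_minus_coll, q_plus_coll = quadratic_roots(u, rho, g_coll)

print(q_minus_start, q_plus_start)
print(q_minus_coll, q_plus_coll)
```

The sketch only illustrates the algebra of (B.1) and (B.2); it does not solve the Riccati equation for $g$ itself.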

Theorem 2.6, 'only if' part in the case $\rho > 0$. On the set $S = \{(t,\omega) : V_t(\omega) \neq 0\}$ we set

$$k_t(\tau,\omega) = V_t(\omega)^{-1/2}\,\eta_t(t+\tau,\omega), \qquad \text{for } \tau \ge 0.$$

Note that by Assumption 2.1, $\tau \mapsto k_t(\tau,\omega)$ must be a decreasing L-kernel for a.e. $(t,\omega)$. Since (2.8) holds trivially if $S$ is a $dt \otimes d\mathbb{P}$-nullset, we may, without loss of generality, assume that $S$ is not a nullset and consider only $(t,\omega) \in S$ in the remainder of the proof. Inserting into (2.20) yields

$$h_t(t+\tau,u) = \sqrt{V_t}\int_0^\tau g(\tau-s,u)\,k_t(s)\,ds = \sqrt{V_t}\cdot(g \star k)_t(\tau,u).$$

Plugging into (2.26) and eliminating $V_t$ gives

$$\frac{1}{2}(u^2 - u) - g(\tau,u) + u\rho\,(g \star k)_t(\tau,u) + \frac{1}{2}(g \star k)_t(\tau,u)^2 = 0, \tag{B.1}$$

which is a quadratic equation in the variable $(g \star k)_t(\tau,u)$ with two solutions

$$q_\pm(\tau,u) = -\rho u \pm \sqrt{u^2(\rho^2 - 1) + u + 2g(\tau,u)}, \tag{B.2}$$

both of which may be negative. However, using continuity of $g(\cdot,u)$ and evaluating (B.1) at $\tau \downarrow 0$ yields $g(0,u) = \frac{1}{2}(u^2 - u)$. Inserting into (B.2) and using $(g \star k)_t(0,u) = 0$ singles out $q_+$ and shows that

$$(g \star k)_t(\tau,u) = q_+(\tau,u), \qquad \text{for all } \tau \in [0, T_*(u)),$$

where $T_*(u)$ is the first collision time of $q_+$ and $q_-$, i.e.,

$$T_*(u) := \inf\left\{\tau > 0 : g(\tau,u) = \frac{1}{2}(u^2 - u) - \frac{1}{2}u^2\rho^2\right\}.$$

On the interval $[0, T_*(u))$ we can proceed as in the case of $\rho \le 0$ and obtain

$$\eta_t(t+\tau) = \sqrt{V_t}\,k_t(\tau) = \sqrt{V_t}\,\frac{\partial}{\partial\tau}\left(\int_0^\tau q_+(\tau-s,u)\,\pi(ds,u)\right), \qquad \tau \in [0, T_*(u)), \tag{B.3}$$

where $\pi$ is the resolvent of the first kind of $g(\tau,u)$. Therefore, to complete the proof it suffices to show that $T_*(u)$ can be made arbitrarily large by choosing a suitable $u \in (0,1)$. To this end, note that (B.1) is a convolution Riccati equation for $g(\cdot,u)$ with the kernel $k_t(\cdot)$, i.e.,

$$g(\tau,u) = R_V\left(u, (g \star k)_t(\tau,u)\right).$$

The function $g(\cdot,u)$ is its unique continuous solution, which, by Corollary A.7, can be written as $g(\tau,u) = R_V(u, f(\tau,u))$, where $f(\tau,u)$ solves

$$f(\tau,u) = \int_0^\tau k_t(\tau-s)\,R_V(u, f(s,u))\,ds. \tag{B.4}$$

Moreover, the collision time can be represented in terms of $f$ as

$$T_*(u) := \inf\{t > 0 : f(t,u) = -\rho u\}. \tag{B.5}$$

By convexity, we can estimate $R_V(u,w)$ from below as

$$R_V(u,w) \ge w\rho + \frac{1}{2}(u^2 - u).$$
Hence, from (B.4) we obtain the estimate

$$f(\tau,u) \ge \rho\,(k_t \star f)(\tau,u) + \frac{1}{2}(u^2 - u)\int_0^\tau k_t(s)\,ds. \tag{B.6}$$

Let $r_t$ be the $\rho$-resolvent of $k_t$, and note that $r_t$ is again an L-kernel (in particular non-negative) by Lemma 2.8. By the generalized Gronwall Lemma of [13, Lem. 9.8.2] it follows that $f(\tau,u) \ge l(\tau,u)$, where $l$ solves the linear Volterra equation

$$l(\tau,u) = \rho\,(k_t \star l)(\tau,u) + \frac{1}{2}(u^2 - u)\int_0^\tau k_t(s)\,ds.$$

Moreover, using [13, Thm. 2.3.5], we can express $l(\tau,u)$ in terms of the $\rho$-resolvent $r_t$ and obtain

$$f(\tau,u) \ge l(\tau,u) = \frac{1}{2}(u^2 - u)\int_0^\tau r_t(s)\,ds$$

for all $u \in (0,1)$. Combining with (B.5), we finally obtain

$$\int_0^{T_*(u)} r_t(s)\,ds \ge \frac{-\rho u}{\frac{1}{2}(u^2 - u)} = \frac{2\rho}{1-u}.$$

Sending $u \uparrow 1$ we obtain $\lim_{u \uparrow 1} T_*(u) = +\infty$, which together with (B.3) completes the proof.

References

[1] Eduardo Abi Jaber, Martin Larsson, and Sergio Pulido. Affine Volterra processes. arXiv:1708.08796, 2017.
[2] Emmanuel Bacry, Iacopo Mastromatteo, and Jean-François Muzy. Hawkes processes in finance.

Market Microstructure and Liquidity, 1(01), 2015.
[3] Drumi D Bainov and Pavel S Simeonov. Integral inequalities and applications, volume 57. Springer Science & Business Media, 2013.
[4] Patrick Billingsley. Probability and measure. John Wiley & Sons, 2nd edition, 1986.
[5] Darrell Duffie, Damir Filipović, and Walter Schachermayer. Affine processes and applications in finance. Annals of Applied Probability, pages 984–1053, 2003.
[6] Omar El Euch, Masaaki Fukasawa, and Mathieu Rosenbaum. The microstructural foundations of leverage effect and rough volatility. Finance and Stochastics, 22(2):241–280, 2018.
[7] Omar El Euch and Mathieu Rosenbaum. The characteristic function of rough Heston models. Mathematical Finance, forthcoming, 2018.
[8] Omar El Euch and Mathieu Rosenbaum. Perfect hedging in rough Heston models. The Annals of Applied Probability, 28(6):3813–3856, 2018.
[9] Arthur Erdélyi, Wilhelm Magnus, Fritz Oberhettinger, and Francesco G Tricomi. Higher transcendental functions, Vol. III, based on notes left by Harry Bateman, reprint of the 1955 original. Robert E. Krieger Publishing Co., Inc., Melbourne, Fla, 1981.
[10] D. Filipovic. Term-Structure Models. Springer, 2009.
[11] A Ronald Gallant, Peter E Rossi, and George Tauchen. Stock prices and volume. The Review of Financial Studies, 5(2):199–242, 1992.
[12] Gustaf Gripenberg. On Volterra equations of the first kind. Integral Equations and Operator Theory, 3(4):473–488, 1980.
[13] Gustaf Gripenberg, Stig-Olof Londen, and Olof Staffans. Volterra integral and functional equations, volume 34. Cambridge University Press, 1990.
[14] Hans J Haubold, Arak M Mathai, and Ram K Saxena. Mittag-Leffler functions and their applications. Journal of Applied Mathematics, 2011.
[15] Steven L Heston. A closed-form solution for options with stochastic volatility with applications to bond and currency options. The Review of Financial Studies, 6(2):327–343, 1993.
[16] Thibault Jaisson and Mathieu Rosenbaum. Limit theorems for nearly unstable Hawkes processes. The Annals of Applied Probability, 25(2):600–631, 2015.
[17] Thibault Jaisson and Mathieu Rosenbaum. Rough fractional diffusions as scaling limits of nearly unstable heavy tailed Hawkes processes. The Annals of Applied Probability, 26(5):2860–2882, 2016.
[18] Martin Keller-Ressel. Moment explosions and long-term behavior of affine stochastic volatility models. Mathematical Finance, 21(1):73–98, 2011.
[19] Igor Podlubny. Fractional differential equations, volume 198 of Mathematics in Science and Engineering. Academic Press, 1998.
[20] R. Tyrrell Rockafellar. Convex Analysis. Princeton University Press, 1970.
[21] Mark Veraar. The stochastic Fubini theorem revisited. Stochastics, 84(4):543–551, 2012.
[22] Wolfgang Walter.