Approximation of Stochastic Volterra Equations with kernels of completely monotone type
AURÉLIEN ALFONSI AND AHMED KEBAIER
Abstract. In this work, we develop a multi-factor approximation for Stochastic Volterra Equations with Lipschitz coefficients and kernels of completely monotone type that may be singular. Our approach consists in truncating and then discretizing the integral defining the kernel, which corresponds to a classical Stochastic Differential Equation. We prove strong convergence results for this approximation. For the particular rough kernel case with Hurst parameter lying in $(0,1/2)$, we propose several discretization procedures and give their precise rates of convergence.

1. Introduction
In recent years, there has been significant and growing interest in studying Stochastic Volterra Equations (SVE) since they arise in many applications such as mathematical finance, biology, physics, and engineering. Several studies have investigated the SVE under regular kernels, see e.g. Berger and Mizel [8, 9], Protter [30], Pardoux and Protter [29], and under non-regular kernels as well, see e.g. Cochran et al. [12], Coutin and Decreusefond [13], Decreusefond [14], Wang [34], Zhang [37], and the references therein. More recently, much attention in quantitative finance has centered on using the SVE with a fractional kernel having a small Hurst parameter $H$.

Date: March 1, 2021.
2010 Mathematics Subject Classification.
Key words and phrases. Stochastic Volterra Equation, Strong error, Fractional kernel, Rough volatility model.
This work benefited from the support of the “chaire Risques financiers”, Fondation du Risque, and of the Laboratory of Excellence MME-DII, Grant no. ANR11LBX-0023-01 (http://labex-mme-dii.u-cergy.fr/).
Kusuoka [24], Ninomiya and Victoir [28], Alfonsi [2] and Shinozaki [32]), Multilevel Monte Carlo methods (see e.g. Giles [19], Ben Alaya and Kebaier [6], Lemaire and Pagès [26]), the variance reduction techniques (see e.g. Newton [27], Jourdain and Lelong [23], Lemaire and Pagès [25], Belomestny et al. [5]), etc. This gives more flexibility for the approximation setting. In this paper, we are interested in approximating the SVE in a general form given by
$$X_t = x + \int_0^t G_1(t-s)\, b(X_s)\, ds + \int_0^t G_2(t-s)\, \sigma(X_s)\, dW_s, \quad t \ge 0, \qquad (1.1)$$
where $x \in \mathbb{R}^d$, $b : \mathbb{R}^d \to \mathbb{R}^d$, $\sigma : \mathbb{R}^d \to \mathcal{M}_d(\mathbb{R})$ are globally Lipschitz continuous coefficients, $W$ is a standard Brownian motion in $\mathbb{R}^d$ and $G_1, G_2 : \mathbb{R}_+^* \to \mathcal{M}_d(\mathbb{R})$ are kernels of the form
$$G_j(t) = \int_{\mathbb{R}_+} e^{-\rho t} M_j(\rho)\, \lambda(d\rho), \quad \text{for } t \in\, ]0, +\infty[,$$
with bounded measurable functions $M_1, M_2 : \mathbb{R}_+ \to \mathcal{M}_d(\mathbb{R})$ and a measure $\lambda$ on $\mathbb{R}_+$ satisfying $\int_{\mathbb{R}_+} e^{-\rho t}\, \lambda(d\rho) < +\infty$ for all $t>0$. Note that when $M_1$ and $M_2$ are non-negative scalar functions, $G_1$ and $G_2$ are known in the literature as completely monotone kernels. In particular, the singular fractional kernel with Hurst parameter that lies in $(0, 1/
2)$ is covered within this framework. More precisely, we approximate the solution to (1.1) by a multi-factor approximation that corresponds to a stochastic differential equation in a higher dimension. For this setting, we prove a strong convergence error for our multi-factor approximation scheme. To do so, we proceed in two steps: first we truncate the integrals defining $G_j$, and second we discretize the measure $\lambda$ on the truncated interval $[0, K]$. We denote respectively by $X^K$ and $\hat{X}^K$ the corresponding SVE processes. Thus, in Section 3 we derive a first strong convergence result on the error between the process $X$ and its truncated version $X^K$, and a second one for the error between $X^K$ and $\hat{X}^K$. Section 4 is devoted to the study of the rough kernel, where we propose various discretization procedures and give their precise rates of convergence. Then we illustrate in Section 5 our theoretical results and give a financial application with the celebrated rough Bergomi model. It is worth noticing that the obtained strong error gives only an asymptotic rate, but the approximation error may be significantly impacted by some unknown multiplicative constant. Thus, we develop adjustments useful in practice to reduce this constant significantly and get efficient methods.

2. General Framework and preliminary results
We consider the SVE in a general form given by
$$X_t = x + \int_0^t G_1(t-s)\, b(X_s)\, ds + \int_0^t G_2(t-s)\, \sigma(X_s)\, dW_s, \quad t \ge 0, \qquad (2.1)$$
where $x \in \mathbb{R}^d$, $b : \mathbb{R}^d \to \mathbb{R}^d$, $\sigma : \mathbb{R}^d \to \mathcal{M}_d(\mathbb{R})$ are globally Lipschitz continuous coefficients, i.e.
$$\exists L > 0, \ \forall x, y \in \mathbb{R}^d, \quad |b(x) - b(y)| + \|\sigma(x) - \sigma(y)\| \le L\, |x - y|, \qquad (2.2)$$
$W$ is a standard Brownian motion in $\mathbb{R}^d$ and $G_1, G_2 : \mathbb{R}_+^* \to \mathcal{M}_d(\mathbb{R})$ are kernels that satisfy
$$\int_0^T \|G_1(s)\|^2 + \|G_2(s)\|^2\, ds < \infty, \quad \text{for every } T \in \mathbb{R}_+. \qquad (2.3)$$
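The completely monotone structure $G_j(t) = \int_{\mathbb{R}_+} e^{-\rho t} M_j(\rho)\, \lambda(d\rho)$ underlying (2.1) can be checked numerically on the fractional example given later in (2.8). Below is a minimal, stdlib-only sketch of this check; the substitution $\rho = e^u$ and the trapezoidal grid are our implementation choices, not from the paper.

```python
import math

def frac_kernel(t, H):
    # G(t) = t^{H-1/2} / Gamma(H+1/2), the fractional kernel of (2.8)
    return t ** (H - 0.5) / math.gamma(H + 0.5)

def frac_kernel_from_measure(t, H, lo=-60.0, hi=30.0, n=200000):
    # int_0^inf e^{-rho t} c_H rho^{-H-1/2} drho, computed after the
    # substitution rho = e^u with a trapezoidal rule on [lo, hi]
    c_H = 1.0 / (math.gamma(H + 0.5) * math.gamma(0.5 - H))
    h = (hi - lo) / n
    acc = 0.0
    for k in range(n + 1):
        u = lo + k * h
        w = 0.5 if k in (0, n) else 1.0
        # integrand after substitution: exp(-t e^u + (1/2 - H) u)
        acc += w * math.exp(-t * math.exp(u) + (0.5 - H) * u)
    return c_H * h * acc

H, t = 0.1, 0.7
assert abs(frac_kernel(t, H) - frac_kernel_from_measure(t, H)) < 1e-5
```

The two evaluations agree to quadrature accuracy, which is exactly the identity $t^{H-1/2}/\Gamma(H+1/2) = \int_{\mathbb{R}_+} e^{-\rho t}\, \lambda_H(d\rho)$ exploited throughout the paper.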
Then, we can apply Theorem 3.1 of [37] and get that there exists a unique strong solution to (2.1). Note that if $\int_0^T \|G_1(s)\|^2 + \|G_2(s)\|^2\, ds < \infty$ for some $T > 0$, then there exists a unique strong solution $(X_t, t \in [0,T])$ up to time $T$. Obviously, those conditions do not depend on the choice of the norms $|\cdot|$ and $\|\cdot\|$ on $\mathbb{R}^d$ and $\mathcal{M}_d(\mathbb{R})$. In this paper, we will use the Euclidean norm on $\mathbb{R}^d$ and the Frobenius norm on $\mathcal{M}_d(\mathbb{R})$, and we recall that we have
$$\forall A, B \in \mathcal{M}_d(\mathbb{R}), \ \forall x \in \mathbb{R}^d, \quad \|AB\| \le \|A\|\, \|B\| \ \text{ and } \ |Ax| \le \|A\|\, |x|. \qquad (2.4)$$
In this paper, we are interested in the approximation of (2.1) when there exist bounded measurable functions $M_1, M_2 : \mathbb{R}_+ \to \mathcal{M}_d(\mathbb{R})$ and a measure $\lambda$ on $\mathbb{R}_+$ satisfying
$$\forall t > 0, \quad \bar{G}(t) = \int_{\mathbb{R}_+} e^{-\rho t}\, \lambda(d\rho) < +\infty, \qquad (2.5)$$
such that
$$G_j(t) = \int_{\mathbb{R}_+} e^{-\rho t} M_j(\rho)\, \lambda(d\rho), \quad \text{for } t \in\, ]0, +\infty[. \qquad (2.6)$$
We note $\bar{M}_j = \sup_{\rho \ge 0} \|M_j(\rho)\|$ and trivially have $\|G_j(t)\| \le \bar{M}_j\, \bar{G}(t)$. We will assume through the paper that $\bar{G} \in L^2(\mathbb{R}_+^*, \mathbb{R}_+)$, i.e.
$$\forall T > 0, \quad \int_0^T \bar{G}(t)^2\, dt < \infty, \qquad (2.7)$$
and therefore condition (2.3) is satisfied. In the one-dimensional case, the kernel $G_j$ is completely monotone when $M_j \ge 0$. In particular, the fractional kernel $G^{\lambda_H}(t) = \frac{t^{H-1/2}}{\Gamma(H+1/2)}$ with parameter $H \in (0,1/2)$ falls into this framework, since $G^{\lambda_H}(t) = \int_{\mathbb{R}_+} e^{-\rho t}\, \lambda_H(d\rho)$ with
$$\lambda_H(d\rho) = c_H\, \rho^{-H-1/2}\, d\rho, \quad \text{and} \quad c_H := \frac{1}{\Gamma(H+1/2)\, \Gamma(1/2-H)}. \qquad (2.8)$$
The principle of the approximation is rather simple. We approximate the measure $\lambda$ by a finite discrete measure. Then, the next proposition ensures that the Stochastic Volterra Equation (2.1) can be obtained from the solution of a classical SDE, for which many numerical methods have been developed. Thus, the goal of the paper is to analyze the error made when replacing the measure $\lambda$ by a finite discrete measure. We will focus in this paper on strong error estimates.

Proposition 2.1.
Let us assume that $\lambda(d\rho) = \sum_{i=1}^n \alpha_i\, \delta_{\rho_i}(d\rho)$ with $\alpha_i \ge 0$ and $0 \le \rho_1 < \cdots < \rho_n$.
(1) Let us assume $M_1 = M_2 = M$ and $\mathrm{rank}([\alpha_1 M(\rho_1)\ \ldots\ \alpha_n M(\rho_n)]) = d$, so that there exist $x_1, \ldots, x_n \in \mathbb{R}^d$ such that $\sum_{i=1}^n \alpha_i M(\rho_i)\, x_i = x$. Then, the solution of (2.1) is given by $\sum_{i=1}^n \alpha_i M(\rho_i)\, X_t^{\rho_i}$, where $(X_t^{\rho_1}, \ldots, X_t^{\rho_n})$ is the solution of the $(n \times d)$-dimensional Stochastic Differential Equation defined by
$$X_t^{\rho_i} = x_i - \int_0^t \rho_i\, (X_s^{\rho_i} - x_i)\, ds + \int_0^t b\Big(\sum_{i=1}^n \alpha_i M(\rho_i) X_s^{\rho_i}\Big)\, ds + \int_0^t \sigma\Big(\sum_{i=1}^n \alpha_i M(\rho_i) X_s^{\rho_i}\Big)\, dW_s. \qquad (2.9)$$
(2)
Let us assume $\mathrm{rank}([\alpha_1 M_1(\rho_1)\ \ldots\ \alpha_n M_1(\rho_n)\ \alpha_1 M_2(\rho_1)\ \ldots\ \alpha_n M_2(\rho_n)]) = d$, so that there exist $x_1, \ldots, x_n, y_1, \ldots, y_n \in \mathbb{R}^d$ such that $\sum_{i=1}^n \alpha_i [M_1(\rho_i)\, x_i + M_2(\rho_i)\, y_i] = x$. Then, the solution of (2.1) is given by $X_t = \sum_{i=1}^n \alpha_i M_1(\rho_i)\, X_t^{\rho_i} + \sum_{i=1}^n \alpha_i M_2(\rho_i)\, Y_t^{\rho_i}$, where $(X_t^{\rho_1}, Y_t^{\rho_1}, \ldots, X_t^{\rho_n}, Y_t^{\rho_n})$ is the solution of the $(2n \times d)$-dimensional Stochastic Differential Equation defined by
$$X_t^{\rho_i} = x_i - \int_0^t \rho_i\, (X_s^{\rho_i} - x_i)\, ds + \int_0^t b\Big(\sum_{i=1}^n \alpha_i M_1(\rho_i) X_s^{\rho_i} + \sum_{i=1}^n \alpha_i M_2(\rho_i) Y_s^{\rho_i}\Big)\, ds,$$
$$Y_t^{\rho_i} = y_i - \int_0^t \rho_i\, (Y_s^{\rho_i} - y_i)\, ds + \int_0^t \sigma\Big(\sum_{i=1}^n \alpha_i M_1(\rho_i) X_s^{\rho_i} + \sum_{i=1}^n \alpha_i M_2(\rho_i) Y_s^{\rho_i}\Big)\, dW_s. \qquad (2.10)$$

Proof.
Let us first consider the case $M_1 = M_2 = M$. The SDE (2.9) has Lipschitz coefficients and therefore has a unique strong solution. Since
$$d\big(e^{\rho_i t}(X_t^{\rho_i} - x_i)\big) = e^{\rho_i t}\, b\Big(\sum_{i=1}^n \alpha_i M(\rho_i) X_t^{\rho_i}\Big)\, dt + e^{\rho_i t}\, \sigma\Big(\sum_{i=1}^n \alpha_i M(\rho_i) X_t^{\rho_i}\Big)\, dW_t,$$
we get
$$X_t^{\rho_i} = x_i + \int_0^t e^{-\rho_i(t-s)}\, b\Big(\sum_{i=1}^n \alpha_i M(\rho_i) X_s^{\rho_i}\Big)\, ds + \int_0^t e^{-\rho_i(t-s)}\, \sigma\Big(\sum_{i=1}^n \alpha_i M(\rho_i) X_s^{\rho_i}\Big)\, dW_s.$$
We left multiply this equation by $\alpha_i M(\rho_i)$ and then sum over $i$ to obtain that $\sum_{i=1}^n \alpha_i M(\rho_i) X_t^{\rho_i}$ solves (2.1). The strong uniqueness result (Theorem 3.1 of [37]) gives the claim.
In the general case, we similarly get
$$X_t^{\rho_i} = x_i + \int_0^t e^{-\rho_i(t-s)}\, b\Big(\sum_{i=1}^n \alpha_i M_1(\rho_i) X_s^{\rho_i} + \sum_{i=1}^n \alpha_i M_2(\rho_i) Y_s^{\rho_i}\Big)\, ds,$$
$$Y_t^{\rho_i} = y_i + \int_0^t e^{-\rho_i(t-s)}\, \sigma\Big(\sum_{i=1}^n \alpha_i M_1(\rho_i) X_s^{\rho_i} + \sum_{i=1}^n \alpha_i M_2(\rho_i) Y_s^{\rho_i}\Big)\, dW_s.$$
We then left multiply the first equation by $\alpha_i M_1(\rho_i)$ and the second equation by $\alpha_i M_2(\rho_i)$, and sum over $i$ to get the claim. □

3. Strong error analysis for the approximation
To analyse the error between the SVE and its approximation by using kernels $\hat{G}_j(t)$ supported by a finite discrete measure (as in Proposition 2.1), we proceed in two steps. First, we analyse the truncation error made when replacing the kernels by the kernels obtained by truncating the measure $\lambda$ in (2.6). Second, we analyse the error between the SVE with the truncated kernels and the one with the approximating kernels.
For any $K > 0$, we introduce then the truncated convolution kernels $G_j^K : \mathbb{R}_+ \to \mathcal{M}_d(\mathbb{R})$, $j \in \{1,2\}$, that are defined as follows:
$$G_j^K(t) = \int_{[0,K)} e^{-\rho t} M_j(\rho)\, \lambda(d\rho), \quad \text{for all } t \ge 0. \qquad (3.1)$$
Thus, the kernel $G_j^K$ approximates the kernel $G_j$ defined by (2.6) as $K \to +\infty$. Since $\bar{M}_j = \sup_{\rho \ge 0} \|M_j(\rho)\| < \infty$, we have the following uniform bound:
$$\forall K > 0, \quad \|G_j^K(t)\| \le \bar{M}_j\, \bar{G}(t) \quad \text{with} \quad \bar{G}(t) = \int_{\mathbb{R}_+} e^{-\rho t}\, \lambda(d\rho). \qquad (3.2)$$
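The effect of the truncation (3.1) can be observed directly on the scalar fractional kernel ($M_j \equiv 1$), where the kernel truncation error $\Delta^K(t) = \int_{[K,+\infty)} e^{-\rho t}\, \lambda_H(d\rho)$ is a one-dimensional integral. A stdlib-only sketch (the substitution $\rho = K e^u$ and the grid sizes are our choices):

```python
import math

def tail_kernel(t, K, H, n=200000, hi=30.0):
    # Delta^K(t) = int_[K,inf) e^{-rho t} c_H rho^{-H-1/2} drho, i.e. the
    # truncation error on the kernel itself when M_j == 1;
    # substitution rho = K e^u, trapezoidal rule on u in [0, hi]
    c_H = 1.0 / (math.gamma(H + 0.5) * math.gamma(0.5 - H))
    h = hi / n
    acc = 0.0
    for k in range(n + 1):
        u = k * h
        w = 0.5 if k in (0, n) else 1.0
        acc += w * math.exp(-t * K * math.exp(u) + (0.5 - H) * u)
    return c_H * K ** (0.5 - H) * h * acc

H, t = 0.1, 1.0
d1, d2, d3 = tail_kernel(t, 10.0, H), tail_kernel(t, 20.0, H), tail_kernel(t, 40.0, H)
# the truncated kernel G^K increases monotonically to G as K grows
assert d1 > d2 > d3 > 0.0
```

For fixed $t > 0$ the tail decays exponentially fast in $K$; the slower, polynomial rate $r(K) = O(K^{-2H})$ of Lemma 3.1 comes from integrating the squared tail down to $t = 0$, where the kernel is singular.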
We introduce the stochastic convolution equation $X^K$ associated to the kernels $G_j^K$, $j \in \{1,2\}$, given by
$$X_t^K = x + \int_0^t G_1^K(t-s)\, b(X_s^K)\, ds + \int_0^t G_2^K(t-s)\, \sigma(X_s^K)\, dW_s, \quad x \in \mathbb{R}^d. \qquad (3.3)$$
We also consider, for $c > 0$, the resolvent of the second kind $E_c(t)$ that solves the equation
$$E_c(t) = \bar{G}(t)^2 + \int_0^t c\, \bar{G}(t-s)^2\, E_c(s)\, ds. \qquad (3.4)$$
Since $\bar{G}^2 \in L^1(\mathbb{R}_+^*, \mathbb{R}_+)$ by (2.7), we get that $E_c(t)$ is well defined and belongs also to $L^1(\mathbb{R}_+^*, \mathbb{R}_+)$ (see Subsection A.3 of [1] and Theorem 2.3.1 of [20]).

Proposition 3.1.
Let $\lambda$ be a positive measure such that
$$\forall K > 0, \quad r(K) := \int_{[K,+\infty)} \int_{[K,+\infty)} \frac{\lambda(d\rho_1)\, \lambda(d\rho_2)}{\rho_1 + \rho_2} < \infty. \qquad \text{(H1)}$$
Then, for any $T > 0$, there exists a positive constant $C$ that depends on $T$, $\lambda$, $\bar{M}_1$, $\bar{M}_2$, $L$, $|b(0)|$ and $\|\sigma(0)\|$ such that
$$\forall t \in [0,T], \quad \mathbb{E}\big[|X_t - X_t^K|^2\big] \le C \times r(K). \qquad (3.5)$$

Proof.
We note $\Delta_j^K(t) = G_j(t) - G_j^K(t)$. We have for all $t \ge 0$:
$$|X_t - X_t^K|^2 \le 4\Big(\Big|\int_0^t \Delta_1^K(t-s)\, b(X_s)\, ds\Big|^2 + \Big|\int_0^t \Delta_2^K(t-s)\, \sigma(X_s)\, dW_s\Big|^2 + \Big|\int_0^t G_1^K(t-s)\big[b(X_s) - b(X_s^K)\big]\, ds\Big|^2 + \Big|\int_0^t G_2^K(t-s)\big[\sigma(X_s) - \sigma(X_s^K)\big]\, dW_s\Big|^2\Big),$$
by using the inequality $(a+b+c+d)^2 \le 4(a^2+b^2+c^2+d^2)$. Then, we get by using Jensen's inequality, the Itô isometry and (2.4):
$$\mathbb{E}\big[|X_t - X_t^K|^2\big] \le 4t \int_0^t \|\Delta_1^K(t-s)\|^2\, \mathbb{E}[|b(X_s)|^2]\, ds + 4 \int_0^t \|\Delta_2^K(t-s)\|^2\, \mathbb{E}[\|\sigma(X_s)\|^2]\, ds + 4t \int_0^t \|G_1^K(t-s)\|^2\, \mathbb{E}\big[|b(X_s) - b(X_s^K)|^2\big]\, ds + 4 \int_0^t \|G_2^K(t-s)\|^2\, \mathbb{E}\big[\|\sigma(X_s) - \sigma(X_s^K)\|^2\big]\, ds.$$
Then, by using the Lipschitz property (2.2), we get for $c_0 := 8(T \vee 1)\big(|b(0)|^2 \vee \|\sigma(0)\|^2 + L^2 \sup_{t \in [0,T]} \mathbb{E}[|X_t|^2]\big)$ and $c_1 := 4L^2(\bar{M}_1^2 T + \bar{M}_2^2)$:
$$\mathbb{E}\big[|X_t - X_t^K|^2\big] \le c_0 \int_0^t \|\Delta_1^K(t-s)\|^2 + \|\Delta_2^K(t-s)\|^2\, ds + c_1 \int_0^t \bar{G}(t-s)^2\, \mathbb{E}\big[|X_s - X_s^K|^2\big]\, ds.$$
Hence, we use the generalized Gronwall Lemma (see e.g. [20, Theorem 9.8.2]) to get
$$\mathbb{E}\big[|X_t - X_t^K|^2\big] \le c_0 \Big(\int_0^t \|\Delta_1^K(t-s)\|^2 + \|\Delta_2^K(t-s)\|^2\, ds\Big)\Big(1 + c_1 \int_0^T E_{c_1}(s)\, ds\Big),$$
where $E_{c_1}$ is defined by (3.4). (Note that if $\lambda(\mathbb{R}_+) < \infty$, then $\bar{G}(t-s) \le \lambda(\mathbb{R}_+)$ and we can use the classical Gronwall lemma. This argument cannot be applied for the rough kernels.)
Since $\Delta_j^K(t-s) = \int_{[K,+\infty)} M_j(\rho)\, e^{-\rho(t-s)}\, \lambda(d\rho)$, we have $\|\Delta_j^K(t-s)\| \le \bar{M}_j \int_{[K,+\infty)} e^{-\rho(t-s)}\, \lambda(d\rho)$ and thus
$$\int_0^t \|\Delta_j^K(t-s)\|^2\, ds \le \bar{M}_j^2 \int_0^t \int_{[K,+\infty)} \int_{[K,+\infty)} e^{-(\rho_1+\rho_2)(t-s)}\, \lambda(d\rho_1)\, \lambda(d\rho_2)\, ds = \bar{M}_j^2 \int_{[K,+\infty)} \int_{[K,+\infty)} \frac{1 - e^{-(\rho_1+\rho_2)t}}{\rho_1 + \rho_2}\, \lambda(d\rho_1)\, \lambda(d\rho_2) \le \bar{M}_j^2\, r(K).$$
We therefore get (3.5) with $C = c_0 (\bar{M}_1^2 + \bar{M}_2^2)\big(1 + c_1 \int_0^T E_{c_1}(s)\, ds\big)$. □

Remark 3.1. (Non-asymptotic estimates) One interest of working with truncation is that the families $G_1^K$ and $G_2^K$ are uniformly bounded in $L^2([0,T])$. Suppose now that $\hat{G}_1$ and $\hat{G}_2$ are two kernels such that
$$\exists \bar{C} \in \mathbb{R}_+^*, \ \forall j \in \{1,2\},\ t \in [0,T], \quad \|\hat{G}_j(t)\| \le \bar{C}\, (1 + \|G_j(t)\|).$$
Then, we have $\int_0^T \|\hat{G}_1(t)\|^2 + \|\hat{G}_2(t)\|^2\, dt < \infty$ and by Theorem 3.1 of [37], there exists a unique solution to
$$\hat{X}_t = x + \int_0^t \hat{G}_1(t-s)\, b(\hat{X}_s)\, ds + \int_0^t \hat{G}_2(t-s)\, \sigma(\hat{X}_s)\, dW_s, \quad t \in [0,T].$$
By the same arguments as in the proof of Proposition 3.1, we get the existence of a constant $C \in \mathbb{R}_+^*$ (depending on $b$, $\sigma$, $G_1$ and $G_2$) such that, with $\hat{\Delta}_j = \hat{G}_j - G_j$,
$$\mathbb{E}\big[|\hat{X}_t - X_t|^2\big] \le C \Big(\int_0^t \|\hat{\Delta}_1(s)\|^2 + \|\hat{\Delta}_2(s)\|^2\, ds\Big). \qquad (3.6)$$

Lemma 3.1.
Under the assumptions of the above proposition, we have $r(K) \le \frac{1}{2}\Big(\int_{[K,+\infty)} \frac{\lambda(d\rho)}{\sqrt{\rho}}\Big)^2$. If $\lambda(d\rho) = f(\rho)\, d\rho$ with $f(\rho) =_{\rho \to \infty} O(\rho^{-\eta-1/2})$ for some $\eta > 0$, we have $r(K) =_{K \to \infty} O(K^{-2\eta})$.
Proof. The upper bound is obtained from the standard inequality $\rho_1 + \rho_2 \ge 2\sqrt{\rho_1 \rho_2}$. For $\lambda(d\rho) = f(\rho)\, d\rho$ with $f(\rho) = O(\rho^{-\eta-1/2})$ and $\eta > 0$, there exist $C, \rho_0 > 0$ such that $f(\rho) \le C \rho^{-\eta-1/2}$ for $\rho \ge \rho_0$, so that $\int_{[K,+\infty)} \frac{\lambda(d\rho)}{\sqrt{\rho}} \le \frac{C}{\eta} K^{-\eta}$ for $K \ge \rho_0$, which gives the claim. □
We define, for $t \in [0,T]$,
$$\hat{\Delta}_j^K(t) = \hat{G}_j^K(t) - G_j^K(t),$$
and assume the following bound:
$$\exists \bar{\Delta} : [0,T] \to \mathbb{R}_+ \ \text{s.t.} \ \int_0^T \bar{\Delta}(t)^2\, dt < \infty, \quad \forall K > 0, \ \forall t \in [0,T], \quad \|\hat{\Delta}_j^K(t)\| \le \bar{\Delta}(t). \qquad \text{(H2)}$$
Note that in our examples, we will use $\bar{\Delta}$ as a constant function, but we keep it general for the presentation of the results. The assumption (H2) implies $\int_0^T \|\hat{G}_1^K(t)\|^2 + \|\hat{G}_2^K(t)\|^2\, dt < \infty$, and we know from Theorem 3.1 of [37] that there exists a unique strong solution $(\hat{X}_t^K, t \in [0,T])$ of the SVE
$$\hat{X}_t^K = x + \int_0^t \hat{G}_1^K(t-s)\, b(\hat{X}_s^K)\, ds + \int_0^t \hat{G}_2^K(t-s)\, \sigma(\hat{X}_s^K)\, dW_s.$$
The key property of (H2) is that the bound is uniform in $K$. This enables us to get the following result.

Lemma 3.2. (Uniform estimate on $\hat{X}^K$) Let (H2) hold. Then, there exists $C \in \mathbb{R}_+^*$ (depending on $|x|$, $T$, $|b(0)|$, $\|\sigma(0)\|$, $L$, $\bar{M}_1$ and $\bar{M}_2$) such that
$$\forall K > 0, \ \forall t \in [0,T], \quad \mathbb{E}[|\hat{X}_t^K|^2] \le C.$$

Proof.
We have, by using Jensen's inequality and Itô's isometry, for $t \in [0,T]$:
$$\mathbb{E}[|\hat{X}_t^K|^2] \le 3|x|^2 + 3t \int_0^t \|\hat{G}_1^K(t-s)\|^2\, \mathbb{E}[|b(\hat{X}_s^K)|^2]\, ds + 3 \int_0^t \|\hat{G}_2^K(t-s)\|^2\, \mathbb{E}[\|\sigma(\hat{X}_s^K)\|^2]\, ds.$$
On the one hand, we use that $|b(x)| \le |b(0)| + L|x|$ and $\|\sigma(x)\| \le \|\sigma(0)\| + L|x|$. On the other hand, we get from (H2) and (3.2) that $\|\hat{G}_j^K(t)\|^2 \le 2\big(\bar{\Delta}(t)^2 + \|G_j^K(t)\|^2\big) \le 2\big(\bar{\Delta}(t)^2 + \bar{M}_j^2\, \bar{G}(t)^2\big)$. Since $\int_0^T \bar{\Delta}(t)^2 + \bar{G}(t)^2\, dt < \infty$, this leads to the existence of a constant $C \in \mathbb{R}_+^*$ that depends on $|x|$, $T$, $|b(0)|$, $\|\sigma(0)\|$, $L$, $\bar{M}_1$ and $\bar{M}_2$ such that
$$\mathbb{E}[|\hat{X}_t^K|^2] \le C + C \int_0^t \big(\bar{\Delta}(t-s)^2 + \bar{G}(t-s)^2\big)\, \mathbb{E}[|\hat{X}_s^K|^2]\, ds.$$
For $c > 0$, let $(\tilde{E}_c(t), t \in [0,T])$ be defined as the solution of the equation
$$\tilde{E}_c(t) = \bar{\Delta}(t)^2 + \bar{G}(t)^2 + \int_0^t c\big(\bar{\Delta}(t-s)^2 + \bar{G}(t-s)^2\big)\, \tilde{E}_c(s)\, ds. \qquad (3.7)$$
Since $\bar{\Delta}^2 + \bar{G}^2 \in L^1((0,T), \mathbb{R}_+)$, we get that $\tilde{E}_c(t)$ is well defined and belongs also to $L^1((0,T), \mathbb{R}_+)$ by applying the results of Subsection A.3 of [1] and Theorem 2.3.1 of [20] to the kernel $\mathbf{1}_{(0,T)}(t)\big[\bar{\Delta}(t)^2 + \bar{G}(t)^2\big]$. We then get from [1, Lemma A.4] or [20, Lemma 9.8.2]
$$\forall t \in [0,T], \quad \mathbb{E}[|\hat{X}_t^K|^2] \le C \Big(1 + C \int_0^T \tilde{E}_C(t)\, dt\Big),$$
which gives the claim. □

Proposition 3.2.
Let $T > 0$. Suppose that for any $K > 0$, there are kernels $\hat{G}_1^K, \hat{G}_2^K : [0,T] \to \mathcal{M}_d(\mathbb{R})$ such that (H2) holds. Then, there is a constant $C$ (depending on $|x|$, $T$, $|b(0)|$, $\|\sigma(0)\|$, $L$, $\bar{M}_1$ and $\bar{M}_2$) such that
$$\forall t \in [0,T], \quad \mathbb{E}\big[|\hat{X}_t^K - X_t^K|^2\big] \le C \Big(\int_0^t \big[\|\hat{\Delta}_1^K(s)\|^2 + \|\hat{\Delta}_2^K(s)\|^2\big]\, ds\Big).$$

Proof.
We repeat the same arguments as in the proof of Proposition 3.1 and get
$$\mathbb{E}\big[|\hat{X}_t^K - X_t^K|^2\big] \le 4t \int_0^t \|\hat{\Delta}_1^K(t-s)\|^2\, \mathbb{E}[|b(\hat{X}_s^K)|^2]\, ds + 4 \int_0^t \|\hat{\Delta}_2^K(t-s)\|^2\, \mathbb{E}[\|\sigma(\hat{X}_s^K)\|^2]\, ds + 4t \int_0^t \|G_1^K(t-s)\|^2\, \mathbb{E}\big[|b(\hat{X}_s^K) - b(X_s^K)|^2\big]\, ds + 4 \int_0^t \|G_2^K(t-s)\|^2\, \mathbb{E}\big[\|\sigma(\hat{X}_s^K) - \sigma(X_s^K)\|^2\big]\, ds.$$
From Lemma 3.2, we get the existence of a constant $C \in \mathbb{R}_+^*$ such that $\sup_{K>0} \sup_{t \in [0,T]} \mathbb{E}[|\hat{X}_t^K|^2] \le C$. Then, we set, similarly as in the proof of Proposition 3.1, $c_0 := 8(T \vee 1)\big(|b(0)|^2 \vee \|\sigma(0)\|^2 + L^2 C\big)$, $c_1 := 4L^2(\bar{M}_1^2 T + \bar{M}_2^2)$, and we get
$$\mathbb{E}\big[|\hat{X}_t^K - X_t^K|^2\big] \le c_0 \int_0^t \|\hat{\Delta}_1^K(s)\|^2 + \|\hat{\Delta}_2^K(s)\|^2\, ds + c_1 \int_0^t \bar{G}(t-s)^2\, \mathbb{E}\big[|\hat{X}_s^K - X_s^K|^2\big]\, ds.$$
Hence, we use the generalized Gronwall Lemma (see e.g. [20, Lemma 9.8.2]) to get
$$\mathbb{E}\big[|\hat{X}_t^K - X_t^K|^2\big] \le c_0 \Big(1 + c_1 \int_0^T E_{c_1}(s)\, ds\Big)\Big(\int_0^t \big[\|\hat{\Delta}_1^K(s)\|^2 + \|\hat{\Delta}_2^K(s)\|^2\big]\, ds\Big),$$
where $E_{c_1}$ is defined by (3.4). □

Combining Propositions 3.1 and 3.2, we easily obtain our main result.
Theorem 3.1.
Let us assume that $\lambda$ satisfies (H1) and that (H2) holds. Then, there exists a constant $C \in \mathbb{R}_+^*$ such that
$$\forall t \in [0,T], \quad \mathbb{E}\big[|X_t - \hat{X}_t^K|^2\big] \le C \Big(r(K) + \int_0^t \big[\|\hat{\Delta}_1^K(s)\|^2 + \|\hat{\Delta}_2^K(s)\|^2\big]\, ds\Big).$$
The term $r(K)$ and the integral in the right-hand side correspond respectively to the truncation and the discretization error. When using a Riemann discretization, we get the following general result.

Corollary 3.1.
Let us assume that $\lambda$ satisfies (H1), and that the functions $M_j : \mathbb{R}_+ \to \mathcal{M}_d(\mathbb{R})$ are Lipschitz continuous:
$$\exists \bar{L} > 0, \ \forall j \in \{1,2\}, \ \forall \rho, \rho' \ge 0, \quad \|M_j(\rho) - M_j(\rho')\| \le \bar{L}\, |\rho - \rho'|.$$
Let $n \in \mathbb{N}^*$, $I_{i,n}^K = \big[\frac{i-1}{n}K, \frac{i}{n}K\big)$ for $1 \le i \le n$ and $\rho_{i,n}^K \in I_{i,n}^K$. Let us define the kernels
$$\hat{G}_j^K(t) = \sum_{i=1}^n \lambda\big(I_{i,n}^K\big)\, M_j(\rho_{i,n}^K)\, e^{-\rho_{i,n}^K t}, \quad j \in \{1,2\},$$
that correspond to the measure
$$\hat{\lambda}(d\rho) = \sum_{i=1}^n \lambda\big(I_{i,n}^K\big)\, \delta_{\rho_{i,n}^K}(d\rho). \qquad (3.8)$$
Then, there exists a constant $C \in \mathbb{R}_+^*$ such that for $n \ge K \lambda([0,K))$, we have
$$\forall t \in [0,T], \quad \mathbb{E}\big[|X_t - \hat{X}_t^K|^2\big] \le C \Big(r(K) + \frac{K^2}{n^2}\, \lambda([0,K))^2\Big).$$
This corollary indicates the theoretical optimal choice for $n$ when $K \to +\infty$. Namely, one has to take $n$ proportional to $\frac{K \lambda([0,K))}{\sqrt{r(K)}}$ in order to equalize both terms, i.e. the error due to the truncation and the one due to the approximation.
Proof.
We have $\hat{G}_j^K(t) - G_j^K(t) = \sum_{i=1}^n \int_{I_{i,n}^K} \big[M_j(\rho_{i,n}^K)\, e^{-\rho_{i,n}^K t} - M_j(\rho)\, e^{-\rho t}\big]\, \lambda(d\rho)$. From the triangular inequality, we get for $t \in [0,T]$ and $\rho \in I_{i,n}^K$:
$$\|M_j(\rho_{i,n}^K)\, e^{-\rho_{i,n}^K t} - M_j(\rho)\, e^{-\rho t}\| \le \|M_j(\rho_{i,n}^K) - M_j(\rho)\|\, e^{-\rho_{i,n}^K t} + \|M_j(\rho)\|\, |e^{-\rho_{i,n}^K t} - e^{-\rho t}| \le (\bar{L} + \bar{M}_j t)\, |\rho - \rho_{i,n}^K| \le (\bar{L} + \bar{M}_j T)\, \frac{K}{n}.$$
This yields $\|\hat{G}_j^K(t) - G_j^K(t)\| \le (\bar{L} + \bar{M}_j T)\, \lambda([0,K))\, \frac{K}{n}$, for any $t \in [0,T]$. In particular, (H2) holds with a constant function $\bar{\Delta}$ for $n \ge K \lambda([0,K))$. We can thus apply Theorem 3.1 and get the result. □
Corollary 3.1 gives a general result on the approximation of SVE by SDE. Obviously, it is possible to derive many variations and refinements of this result by assuming more regularity on the functions $M_j$ or on the measure $\lambda$. In the next section, we investigate some of these refinements when $\lambda$ is given by (2.8).

4. More approximation results for the rough kernels
Let us start by applying the result of Corollary 3.1 to the measure $\lambda$ defined in Equation (2.8). We have $\lambda([0,K)) = \frac{c_H}{1/2 - H}\, K^{1/2-H}$ and $r(K) = O(K^{-2H})$ by Lemma 3.1, which gives
$$\forall t \in [0,T], \quad \mathbb{E}\big[|X_t - \hat{X}_t^K|^2\big] \le C \Big(K^{-2H} + \frac{K^{3-2H}}{n^2}\Big).$$
By taking $n = K^{3/2}$, or equivalently $K = n^{2/3}$, we get
$$\mathbb{E}\big[|X_t - \hat{X}_t^K|^2\big] =_{n \to \infty} O\big(n^{-2H \times \frac{2}{3}}\big). \qquad (4.1)$$
Let us recall that $n$ is the number of points weighted by the approximating measure $\hat{\lambda}$. By Proposition 2.1, $n$ scales as the dimension of the SDE that approximates the SVE, and therefore as the computation time needed to simulate the SDE. The goal of this section is to improve this rate by assuming more regularity on the functions $M_j$.
To get a better approximation, we assume more regularity on the functions $M_1$ and $M_2$. To approximate $G_j^K(t) = \int_0^K e^{-\rho t} M_j(\rho)\, c_H \rho^{-H-1/2}\, d\rho$, we use the same type of approximation on $[0, K^\beta]$ with $0 < \beta < 1$, and Simpson's rule on $[K^\beta, K]$, with $K > 1$.

Proposition 4.1.
Suppose that $\lambda$ is given by (2.8). Let us assume that the functions $M_1$ and $M_2$ are $C^4$ with bounded derivatives. Let $\beta \in (0,1)$ and $\hat{G}_j^K(t) = \int_{\mathbb{R}_+} e^{-\rho t} M_j(\rho)\, \hat{\lambda}_S(d\rho)$ with
$$\hat{\lambda}_S(d\rho) = \sum_{i=1}^n \lambda\big(I_{i,n}^{K^\beta}\big)\, \delta_{\rho_{i,n}^{K^\beta}} + \frac{c_H (K - K^\beta)}{6n} \sum_{i=1}^n \Big[(\rho_{i,n,0}^K)^{-H-1/2}\, \delta_{\rho_{i,n,0}^K} + 4\, (\rho_{i,n,1}^K)^{-H-1/2}\, \delta_{\rho_{i,n,1}^K} + (\rho_{i,n,2}^K)^{-H-1/2}\, \delta_{\rho_{i,n,2}^K}\Big],$$
where $I_{i,n}^{K^\beta} = \big[\frac{i-1}{n} K^\beta, \frac{i}{n} K^\beta\big)$, $\rho_{i,n}^{K^\beta} \in I_{i,n}^{K^\beta}$,
$$\rho_{i,n,0}^K = K^\beta + \frac{(i-1)(K - K^\beta)}{n}, \quad \rho_{i,n,1}^K = K^\beta + \frac{(2i-1)(K - K^\beta)}{2n} \quad \text{and} \quad \rho_{i,n,2}^K = K^\beta + \frac{i(K - K^\beta)}{n}.$$
With $\beta = \frac{10-6H}{13-6H}$ and $K \sim n^{\frac{13-6H}{15-6H}}$, there exists a constant $C \in \mathbb{R}_+^*$ such that
$$\forall t \in [0,T], \quad \mathbb{E}\big[|\hat{X}_t^K - X_t|^2\big] \le C\, n^{-2H \times \frac{13-6H}{15-6H}}.$$
We clearly have $\frac{2}{3} \le \frac{13-6H}{15-6H}$ for $H \in (0,1/2)$, and notice that $\hat{\lambda}_S$ weights $3n + 1 = O(n)$ different points. Thus, the approximation given by Proposition 4.1 is asymptotically better than the one given by Corollary 3.1.

Proof.
We aim at applying Theorem 3.1. We have
$$\hat{G}_j^K(t) - G_j^K(t) = \sum_{i=1}^n \int_{I_{i,n}^{K^\beta}} \Big[M_j(\rho_{i,n}^{K^\beta})\, e^{-\rho_{i,n}^{K^\beta} t} - M_j(\rho)\, e^{-\rho t}\Big]\, \lambda(d\rho) - \int_{K^\beta}^K M_j(\rho)\, e^{-\rho t}\, c_H \rho^{-H-1/2}\, d\rho + \frac{c_H (K - K^\beta)}{6n} \sum_{i=1}^n \Big[(\rho_{i,n,0}^K)^{-H-1/2} M_j(\rho_{i,n,0}^K)\, e^{-\rho_{i,n,0}^K t} + 4\, (\rho_{i,n,1}^K)^{-H-1/2} M_j(\rho_{i,n,1}^K)\, e^{-\rho_{i,n,1}^K t} + (\rho_{i,n,2}^K)^{-H-1/2} M_j(\rho_{i,n,2}^K)\, e^{-\rho_{i,n,2}^K t}\Big].$$
The norm of the first sum can be upper bounded by $O\big(\lambda([0,K^\beta))\, \frac{K^\beta}{n}\big) = O\big(\frac{K^{\beta(3/2-H)}}{n}\big)$, as in the proof of Corollary 3.1. For the other terms, we work componentwise and may assume w.l.o.g. that $M_j$ is real-valued. Let $\psi_t(\rho) = c_H M_j(\rho)\, \rho^{-H-1/2}\, e^{-\rho t}$. The well-known convergence result on Simpson's rule (see e.g. [22], p. 339) allows us to upper bound the norm of the other terms by
$$n\, \frac{\sup_{\rho \in [K^\beta, K]} |\psi_t^{(4)}(\rho)|}{90} \Big(\frac{K - K^\beta}{2n}\Big)^5.$$
We get that $\sup_{t \in [0,T]} \sup_{\rho \in [K^\beta, K]} |\psi_t^{(4)}(\rho)| = O(K^{-\beta(H+1/2)})$ by using that the derivatives of $M_j$ are bounded and $0 \le e^{-\rho t} \le 1$. This leads to
$$\forall t \in [0,T], \quad \|\hat{G}_j^K(t) - G_j^K(t)\| \le C \Big(\frac{K^{\beta(3/2-H)}}{n} + \frac{K^{5 - \beta(H+1/2)}}{n^4}\Big). \qquad (4.2)$$
Note that (H2) is then satisfied for $n \ge \max\big(K^{\beta(3/2-H)}, K^{[5 - \beta(H+1/2)]/4}\big)$. Then, by Theorem 3.1 and Lemma 3.1, we get
$$\forall t \in [0,T], \quad \sqrt{\mathbb{E}\big[|\hat{X}_t^K - X_t|^2\big]} \le C \Big(K^{-H} + \frac{K^{\beta(3/2-H)}}{n} + \frac{K^{5 - \beta(H+1/2)}}{n^4}\Big).$$
By taking $\beta = \frac{10-6H}{13-6H}$ and $K \sim n^{\frac{13-6H}{15-6H}}$, we equalize the three terms and get the claim. □
We can now go further and use higher-order numerical integration algorithms such as the Newton–Cotes method, which for any even number $J \in \mathbb{N}$ and any smooth function $f : [a,b] \to \mathbb{R}$ gives (see e.g. [22, Theorem 1, p. 310])
$$\int_a^b f(x)\, dx = (b-a) \sum_{j=0}^J c_j^J\, f\Big(a + j\, \frac{b-a}{J}\Big) + \tilde{c}_J\, (b-a)^{J+3}\, f^{(J+2)}(\xi), \quad \text{with } \xi \in (a,b),$$
where the coefficients $(c_j^J)_{0 \le j \le J}$ and $\tilde{c}_J$ are known explicitly. We recover Simpson's rule by taking $J = 2$. Hence, one can use the Newton–Cotes method on the interval $[K^\beta, K]$. This leads to a new measure
$$\hat{\lambda}_{NC}(d\rho) = \sum_{i=1}^n \lambda\big(I_{i,n}^{K^\beta}\big)\, \delta_{\rho_{i,n}^{K^\beta}} + \frac{c_H (K - K^\beta)}{n} \sum_{i=1}^n \sum_{j=0}^J c_j^J\, (\rho_{i,n,j}^{K,J})^{-H-1/2}\, \delta_{\rho_{i,n,j}^{K,J}} \qquad (4.3)$$
with $\rho_{i,n,j}^{K,J} = K^\beta + \frac{K - K^\beta}{n}\big(i - 1 + \frac{j}{J}\big)$.
Proposition 4.2.
Suppose that $\lambda$ is given by (2.8). Let us assume that the functions $M_1$ and $M_2$ are $C^\infty$ with bounded derivatives. Let $J \in \mathbb{N}$ be even and $\hat{G}_j^K(t) = \int_{\mathbb{R}_+} e^{-\rho t} M_j(\rho)\, \hat{\lambda}_{NC}(d\rho)$ with $\hat{\lambda}_{NC}$ defined by (4.3). With $\beta = \frac{2(J+3) - 2(J+1)H}{3J+7 - 2(J+1)H}$ and $K \sim n^{\frac{3J+7 - 2(J+1)H}{3J+9 - 2(J+1)H}}$, there exists a constant $C \in \mathbb{R}_+^*$ such that
$$\forall t \in [0,T], \quad \mathbb{E}\big[|\hat{X}_t^K - X_t|^2\big] \le C\, n^{-2H \times \frac{3J+7-2(J+1)H}{3J+9-2(J+1)H}}.$$
For any $\varepsilon \in (0,1)$, there exists $J$ such that $\sup_{t \in [0,T]} \mathbb{E}\big[|\hat{X}_t^K - X_t|^2\big] = O\big(n^{-2H \times (1-\varepsilon)}\big)$.
We note that we get back Proposition 4.1 in the case $J = 2$.

Proof.
We follow the same arguments as in the proof of Proposition 4.1. The terms corresponding to the Newton–Cotes method can be upper bounded by $|\tilde{c}_J| \sup_{\rho \in [K^\beta, K]} |\psi_t^{(J+2)}(\rho)|\, \frac{(K-K^\beta)^{J+3}}{n^{J+2}}$, that is uniformly $O\big(\frac{K^{J+3-\beta(H+1/2)}}{n^{J+2}}\big)$ in $t \in [0,T]$. We get
$$\forall t \in [0,T], \quad \|\hat{G}_j^K(t) - G_j^K(t)\| \le C \Big(\frac{K^{\beta(3/2-H)}}{n} + \frac{K^{J+3-\beta(H+1/2)}}{n^{J+2}}\Big),$$
and then by Theorem 3.1 and Lemma 3.1, we obtain
$$\forall t \in [0,T], \quad \sqrt{\mathbb{E}\big[|\hat{X}_t^K - X_t|^2\big]} \le C \Big(K^{-H} + \frac{K^{\beta(3/2-H)}}{n} + \frac{K^{J+3-\beta(H+1/2)}}{n^{J+2}}\Big). \qquad (4.4)$$
With $\beta = \frac{2(J+3)-2(J+1)H}{3J+7-2(J+1)H}$ and $K \sim n^{\frac{3J+7-2(J+1)H}{3J+9-2(J+1)H}}$, the three terms are of the same order and we get the first claim. We get the second claim by noticing that $\frac{3J+7-2(J+1)H}{3J+9-2(J+1)H} \to 1$ as $J \to +\infty$. □
In dimension $d = 1$ with $M_1 = M_2 \equiv 1$, it is possible to take a particular value for $\rho_{i,n}^K$ in $I_{i,n}^K$ that improves the rate of convergence. This is stated in the next proposition.
Proposition 4.3.
Let us assume that $d = 1$ and $M_1 = M_2 \equiv 1$. Let us define $\rho_{i,n}^K = \frac{\int_{I_{i,n}^K} \rho\, \lambda(d\rho)}{\lambda(I_{i,n}^K)}$.
(1) Let $\hat{\lambda}(d\rho)$ be defined by (3.8) with these particular values for $\rho_{i,n}^K$. Then, the approximation $\hat{G}_j^K(t) = \int e^{-\rho t}\, \hat{\lambda}(d\rho)$ with $K \sim n^{4/5}$ leads to
$$\exists C > 0, \ \forall t \in [0,T], \quad \mathbb{E}\big[|\hat{X}_t^K - X_t|^2\big] \le C\, n^{-2H \times \frac{4}{5}}.$$
(2) Let $\hat{\lambda}_{NC}(d\rho)$ be defined by (4.3) with these particular values for $\rho_{i,n}^{K^\beta}$. Then, the approximation $\hat{G}_j^K(t) = \int e^{-\rho t}\, \hat{\lambda}_{NC}(d\rho)$ with $\beta = \frac{4J+12-2HJ}{5J+12-2HJ}$ and $K = n^{\frac{4}{5\beta + 2H(1-\beta)}}$ leads to
$$\exists C > 0, \ \forall t \in [0,T], \quad \mathbb{E}\big[|\hat{X}_t^K - X_t|^2\big] \le C\, n^{-2H \times \frac{5J+12-2HJ}{5J+15-2HJ}}.$$
In particular, for Simpson's rule ($\hat{\lambda}_S$), we get $\sup_{t \in [0,T]} \mathbb{E}\big[|\hat{X}_t^K - X_t|^2\big] = O\big(n^{-2H \times \frac{22-4H}{25-4H}}\big)$.
It is worth noticing that the rate of convergence with factor $\frac{4}{5}$ obtained in the first statement is the same as the one obtained by Abi Jaber and El Euch [1] on the kernels $G_j$ and their discrete approximating kernels $\hat{G}_j^K$. Here, we get in addition a strong estimation error on the processes with the same rate. Note that the factor $\frac{4}{5}$ improves the factor $\frac{2}{3}$ obtained in (4.1), when the values of $\rho_{i,n}^K$ are only assumed to be in $I_{i,n}^K$. Similarly, we notice that
$$\frac{3J+7-2(J+1)H}{3J+9-2(J+1)H} < \frac{5J+12-2HJ}{5J+15-2HJ} < 1,$$
which shows that the convergence rate is improved with respect to Proposition 4.2, but the factor still remains under 1.

Proof.
For the first assertion, we remark that
$$|G_j^K(t) - \hat{G}_j^K(t)| = \Big|\sum_{i=1}^n \int_{I_{i,n}^K} \big(e^{-\rho t} - e^{-\rho_{i,n}^K t}\big)\, \lambda(d\rho)\Big| \le \sum_{i=1}^n \Big|\int_{I_{i,n}^K} \big(e^{-\rho t} - e^{-\rho_{i,n}^K t}\big)\, \lambda(d\rho)\Big|.$$
From a Taylor expansion, we get
$$e^{-\rho t} - e^{-\rho_{i,n}^K t} = -t\, (\rho - \rho_{i,n}^K)\, e^{-\rho_{i,n}^K t} + \int_{\rho_{i,n}^K}^{\rho} t^2\, e^{-xt}\, (\rho - x)\, dx.$$
When integrating with respect to $\lambda$ over $I_{i,n}^K$, the first term vanishes by the definition of $\rho_{i,n}^K$, and we get
$$\Big|\int_{I_{i,n}^K} \big(e^{-\rho t} - e^{-\rho_{i,n}^K t}\big)\, \lambda(d\rho)\Big| = \Big|\int_{I_{i,n}^K} \int_{\rho_{i,n}^K}^{\rho} t^2\, e^{-xt}\, (\rho - x)\, dx\, \lambda(d\rho)\Big| \le t^2 \int_{I_{i,n}^K} \Big|\int_{\rho_{i,n}^K}^{\rho} |\rho - x|\, dx\Big|\, \lambda(d\rho) = \frac{t^2}{2} \int_{I_{i,n}^K} (\rho - \rho_{i,n}^K)^2\, \lambda(d\rho) \le \frac{t^2}{2}\, \frac{K^2}{n^2}\, \lambda\big(I_{i,n}^K\big),$$
since $\rho_{i,n}^K \in I_{i,n}^K$. Summing over $i$, we get
$$|G_j^K(t) - \hat{G}_j^K(t)| \le \frac{t^2}{2}\, \frac{K^2}{n^2}\, \lambda([0,K]). \qquad (4.5)$$
Thus, (H2) holds for $n \ge K \sqrt{\lambda([0,K])}$. By Theorem 3.1 and Lemma 3.1, we get the existence of $C \in \mathbb{R}_+^*$ such that
$$\forall t \in [0,T], \quad \mathbb{E}\big[|\hat{X}_t^K - X_t|^2\big] \le C \Big(K^{-2H} + \frac{K^{5-2H}}{n^4}\Big).$$
This leads to the claim with $K \sim n^{4/5}$.
For the proof of the second point, we use the result of the first point and repeat the arguments of the proof of Proposition 4.2. We thus get
$$\forall t \in [0,T], \quad \sqrt{\mathbb{E}\big[|\hat{X}_t^K - X_t|^2\big]} \le C \Big(K^{-H} + \frac{K^{\beta(5/2-H)}}{n^2} + \frac{K^{J+3-\beta(H+1/2)}}{n^{J+2}}\Big)$$
instead of (4.4). Taking $\beta = \frac{4J+12-2HJ}{5J+12-2HJ}$ and $K = n^{\frac{4}{5\beta+2H(1-\beta)}}$ makes the three terms of the same order and leads to the result. The case $J = 2$ corresponds to Simpson's rule. □

5. Numerical experiments
5.1. Validation of the theoretical results.
The aim of this section is to illustrate the different convergence rates on a very simple example for the rough kernel (2.8). Namely, we take $b(x) = 0$, $\sigma(x) = 1$, which means that
$$X_t = X_0 + \frac{1}{\Gamma(H+1/2)} \int_0^t (t-s)^{H-1/2}\, dW_s.$$
For this process, we have implemented the four following approximations.
(1) $\hat{X}_t^{1,n}$, the approximation given by Corollary 3.1 with $K = n^{2/3}$ and $\rho_{i,n}^K = \frac{i-1/2}{n} K$. From (4.1), the theoretical rate of convergence is $\mathbb{E}[|\hat{X}_t^{1,n} - X_t|^2] = O(n^{-2H \times \frac{2}{3}})$.
(2) $\hat{X}_t^{2,n}$, the approximation given by Proposition 4.3 with $\hat{\lambda}$ and $K = n^{4/5}$. The theoretical rate of convergence is $\mathbb{E}[|\hat{X}_t^{2,n} - X_t|^2] = O(n^{-2H \times \frac{4}{5}})$.
(3) $\hat{X}_t^{3,n}$, the approximation given by Proposition 4.1 with $K = n^{\frac{13-6H}{15-6H}}$ and $\rho_{i,n}^{K^\beta} = \frac{i-1/2}{n} K^\beta$. The theoretical rate of convergence is $\mathbb{E}[|\hat{X}_t^{3,n} - X_t|^2] = O(n^{-2H \times \frac{13-6H}{15-6H}})$.
(4) $\hat{X}_t^{4,n}$, the approximation given by Proposition 4.3 with $\hat{\lambda}_S$ and $K = n^{\frac{22-4H}{25-4H}}$. The theoretical rate of convergence is $\mathbb{E}[|\hat{X}_t^{4,n} - X_t|^2] = O(n^{-2H \times \frac{22-4H}{25-4H}})$.
Note that for $0 \le \rho_1 < \cdots < \rho_n$, it is possible to simulate exactly the Gaussian vector
$$\Big(\int_0^t \exp(-\rho_1(t-s))\, dW_s, \ldots, \int_0^t \exp(-\rho_n(t-s))\, dW_s, \frac{1}{\Gamma(H+1/2)} \int_0^t (t-s)^{H-1/2}\, dW_s\Big).$$
It is centered with covariance matrix $\Sigma$ such that
$$\Sigma_{i,j} = \frac{1 - \exp(-(\rho_i + \rho_j)t)}{\rho_i + \rho_j} \ \text{ for } 1 \le i,j \le n, \quad \Sigma_{n+1,n+1} = \frac{t^{2H}}{2H\, \Gamma(H+1/2)^2}, \qquad (5.1)$$
$$\Sigma_{i,n+1} = \frac{\rho_i^{-H-1/2}}{\Gamma(H+1/2)} \int_0^{\rho_i t} s^{H-1/2}\, e^{-s}\, ds.$$
The last quantity involves the incomplete gamma function, which can be calculated efficiently. For each $j \in \{1, \ldots, 4\}$, we have calculated, using the following basic lemma, the quantity
$$\zeta_t^{j,n} := \mathbb{E}\big[|\hat{X}_t^{j,n} - X_t|^2\big].$$
We reported the obtained results in Tables 1–4.
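Approximations (2) and (4) above use the barycentric nodes $\rho_{i,n}^K = \int_{I_{i,n}^K} \rho\, \lambda(d\rho)/\lambda(I_{i,n}^K)$ of Proposition 4.3, which are available in closed form for $\lambda_H$. A stdlib-only sketch (parameter values are illustrative), with a check of the property that makes these nodes special: the discrete measure matches the first moment of $\lambda_H$ on $[0,K]$ exactly, which is what kills the first-order Taylor term in the proof of Proposition 4.3.

```python
import math

def barycentric_measure(K, n, H):
    """Weights lambda(I_i) and nodes rho_i = int_{I_i} rho lambda(drho) / lambda(I_i)
    of Proposition 4.3, in closed form for the fractional measure lambda_H."""
    c_H = 1.0 / (math.gamma(H + 0.5) * math.gamma(0.5 - H))
    p, q = 0.5 - H, 1.5 - H
    wts, nodes = [], []
    for i in range(1, n + 1):
        a, b = (i - 1) * K / n, i * K / n
        m0 = (b ** p - a ** p) / p      # int_{I_i} rho^{-H-1/2} drho
        m1 = (b ** q - a ** q) / q      # int_{I_i} rho^{1/2-H} drho
        wts.append(c_H * m0)
        nodes.append(m1 / m0)
    return wts, nodes

H, K, n = 0.2, 5.0, 50                  # illustrative parameters
wts, nodes = barycentric_measure(K, n, H)
# each node lies in its interval, and the first moment of lambda_H on [0, K]
# is matched exactly: sum_i w_i rho_i = c_H K^{3/2-H} / (3/2 - H)
assert all((i * K / n) >= r >= ((i - 1) * K / n) for i, r in enumerate(nodes, 1))
c_H = 1.0 / (math.gamma(0.7) * math.gamma(0.3))
assert abs(sum(w * r for w, r in zip(wts, nodes)) - c_H * K ** 1.3 / 1.3) < 1e-9
```

The weights are the exact interval masses, so only the node placement differs from the midpoint choice of approximation (1); this is what upgrades the rate factor from $2/3$ to $4/5$.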
Lemma 5.1.
Let $0 \le \rho_1 < \dots < \rho_n$ and $\alpha_1, \dots, \alpha_n \in \mathbb{R}$. Then,
$$\sum_{i=1}^n \alpha_i\int_0^t \exp(-\rho_i(t-s))\,dW_s - \frac{1}{\Gamma(H+1/2)}\int_0^t (t-s)^{H-1/2}\,dW_s$$
is a centered Gaussian random variable with variance
$$\int_0^t \Big(\frac{(t-s)^{H-1/2}}{\Gamma(H+1/2)} - \sum_{i=1}^n \alpha_i\exp(-\rho_i(t-s))\Big)^2 ds = v^\top\Sigma v,$$
where $\Sigma$ is defined by (5.1) and $v \in \mathbb{R}^{n+1}$ is defined by $v_i = \alpha_i$ for $1 \le i \le n$ and $v_{n+1} = -1$.

We have calculated $\zeta_t^{j,n}$ with $n = 50$ and $n = 100$ for $j \in \{1, 2\}$, and with $n = 16$ and $n = 32$ for $j \in \{3, 4\}$. Since the measure $\hat\lambda_S$ weights $3n + 1$ points, this corresponds to approximating with SDEs of dimension 49 and 97, making the comparison with the case $j \in \{1, 2\}$ relevant. We have also calculated
$$\hat\gamma_t^{j,n} = \frac{1}{2H\log(2)}\log\big(\zeta_t^{j,n}/\zeta_t^{j,2n}\big)$$
as a numerical estimation of the speed-of-convergence factor. Indeed, if we had $\mathbb{E}[|\hat X_t^{j,n} - X_t|^2] \sim_{n\to\infty} cn^{-2H\times\gamma}$ for some constants $c, \gamma > 0$, then $\hat\gamma_t^{j,n}$ would estimate the factor $\gamma$. In our work, we have obtained $\mathbb{E}[|\hat X_t^{j,n} - X_t|^2] =_{n\to\infty} O(n^{-2H\times\gamma})$, and we have reported this theoretical value of $\gamma$ in the tables below.

[Table 1: convergence results for the first approximation, with $n = 50$ (columns $H = 0.45$, $0.25$, $0.05$; rows $\zeta_t^{1,n}$, $\zeta_t^{1,2n}$, $\hat\gamma_t^{1,n}$ and the theoretical factor $\gamma$).]
[Table 2: convergence results for the second approximation, with $n = 50$ (same layout, for $\zeta_t^{2,n}$, $\zeta_t^{2,2n}$, $\hat\gamma_t^{2,n}$).]
[Table 3: convergence results for the third approximation, with $n = 16$ (same layout, for $\zeta_t^{3,n}$, $\zeta_t^{3,2n}$, $\hat\gamma_t^{3,n}$).]
[Table 4: convergence results for the fourth approximation, with $n = 16$ (same layout, for $\zeta_t^{4,n}$, $\zeta_t^{4,2n}$, $\hat\gamma_t^{4,n}$).]
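The estimator $\hat\gamma_t^{j,n}$ can be sanity-checked on synthetic errors that follow the model $\zeta^n = c\,n^{-2H\gamma}$ exactly, in which case it recovers $\gamma$; a minimal sketch (the function name is ours):

```python
import numpy as np

def gamma_hat(zeta_n, zeta_2n, H):
    """Empirical speed-of-convergence factor: if zeta_n ~ c * n^(-2H*gamma),
    then log(zeta_n / zeta_2n) / (2H log 2) estimates gamma."""
    return np.log(zeta_n / zeta_2n) / (2.0 * H * np.log(2.0))
```

Applied to the measured pairs $(\zeta_t^{j,n}, \zeta_t^{j,2n})$, this yields the $\hat\gamma_t^{j,n}$ values reported in the tables.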
From these numerical results, we observe the following facts:
• For each method, the quality of the approximation degrades as $H$ gets closer to 0. For $H = 0.05$, even if we observe empirical rates of convergence that are in line with our theoretical results, the approximation error is around 2 for all methods, which is clearly too large for practical use. The next subsection presents significant improvements on this issue.
• We notice that the numerical estimation of the speed-of-convergence factor is always above the theoretical value of $\gamma$. These values coincide quite well for the one-dimensional methods (2nd and 4th methods) and, for all methods, in the case $H = 0.05$. For the approximations 1 and 3 and the values $H = 0.45$ and $0.25$, the theoretical value of the speed-of-convergence factor seems to be slightly pessimistic.
• The improvement due to the particular choice of $\rho_{i,n}^K$ in dimension 1 is significant. The values of $\zeta^{2,n}$ and $\zeta^{2,2n}$ (resp. $\zeta^{4,n}$ and $\zeta^{4,2n}$) are significantly smaller than those of $\zeta^{1,n}$ and $\zeta^{1,2n}$ (resp. $\zeta^{3,n}$ and $\zeta^{3,2n}$).
• The asymptotic acceleration of convergence obtained by Simpson's rule (i.e. by using approximation 3 (resp. 4) instead of 1 (resp. 2)) is not yet observed for these values of $n$: the approximation 1 (resp. 2) with $n = 50$ gives a slightly better result than approximation 3 (resp. 4) with $n = 16$.

5.2. Improvement of the approximations for the Rough kernel.
In practice, the method provided by truncating and discretizing the integral $\int_0^{+\infty} e^{-\rho t}M(\rho)\,\lambda(d\rho)$ is only partly satisfactory. Its advantage is that it is systematic, and it may lead to good rates of convergence when $\lambda(d\rho)$ has a thin tail and under smoothness assumptions. For the rough kernel, $\lambda_H(d\rho) = c_H\rho^{-H-1/2}\,d\rho$ is not smooth close to the origin and has fat tails, which makes the truncation error large. Thus, the convergences that we obtain in Section 4 are quite slow, especially when $H$ is close to zero. Here, we present a systematic way to correct this by truncating at a higher level.

A systematic approach.
The principle is the following. All the methods that we have presented consist in truncating the integral $\int_{\mathbb{R}_+} e^{-\rho t}\,\lambda_H(d\rho)$ at some level $K = n^{\gamma_H}$ for some $\gamma_H > 0$ and then discretizing it on $[0, K]$. Here, in addition, we take $A > 1$ and discretize also $[K, A^nK)$ by using the same discretization rule on each interval $[A^{i-1}K, A^iK)$ for $i = 1, \dots, n$. Since the size of these intervals does not go to zero, we do not expect to improve the asymptotic rate of convergence: the goal is rather to reduce the truncation error.
For simplicity, we present this idea only on the approximation $\hat\lambda$ given by Proposition 4.3. Namely, let $K > 0$ and, for $i \in \{1, \dots, 2n\}$,
$$I_{i,n}^{K,A} = \Big[\frac{i-1}{n}K, \frac{i}{n}K\Big) \ \text{ for } i \le n, \qquad I_{i,n}^{K,A} = \big[KA^{i-n-1}, KA^{i-n}\big) \ \text{ for } n+1 \le i \le 2n. \quad (5.2)$$
We then consider, for $i \le 2n$,
$$\rho_{i,n}^{K,A} = \frac{\int_{I_{i,n}^{K,A}} \rho\,\lambda_H(d\rho)}{\int_{I_{i,n}^{K,A}} \lambda_H(d\rho)},$$
which can be calculated exactly since
$$\frac{\int_{[a,b]} \rho\,\lambda_H(d\rho)}{\int_{[a,b]} \lambda_H(d\rho)} = \frac{1/2-H}{3/2-H}\times\frac{b^{3/2-H}-a^{3/2-H}}{b^{1/2-H}-a^{1/2-H}} \ \text{ for } 0 \le a < b.$$
Last, we define the corresponding approximating measure $\hat\lambda^A$ by
$$\hat\lambda^A(d\rho) = \sum_{i=1}^{2n}\lambda_H\big(I_{i,n}^{K,A}\big)\,\delta_{\rho_{i,n}^{K,A}}(d\rho), \quad (5.3)$$
and $\hat G^{K,A}(t) = \int_{\mathbb{R}_+} e^{-\rho t}\,\hat\lambda^A(d\rho)$. We have the following simple but interesting result.
Let $\hat\lambda^A(d\rho)$ be defined by (5.3), let $\hat\lambda(d\rho) = \sum_{i=1}^n \lambda_H(I_{i,n}^K)\,\delta_{\rho_{i,n}^K}(d\rho)$ be the measure introduced in Proposition 4.3 and let $\hat G^K(t) = \int_{\mathbb{R}_+} e^{-\rho t}\,\hat\lambda(d\rho)$. Then, we have
$$\hat G^K(t) \le \hat G^{K,A}(t) \le G_{\lambda_H}(t).$$
If $X$ (resp. $\hat X^{K,A}$) denotes the solution of $X_t = x + \int_0^t G_{\lambda_H}(t-s)b(X_s)\,ds + \int_0^t G_{\lambda_H}(t-s)\sigma(X_s)\,dW_s$ (resp. $\hat X_t^{K,A} = x + \int_0^t \hat G^{K,A}(t-s)b(\hat X_s^{K,A})\,ds + \int_0^t \hat G^{K,A}(t-s)\sigma(\hat X_s^{K,A})\,dW_s$), we have $\mathbb{E}[|\hat X_t^{K,A} - X_t|^2] = O(n^{-2H\times 4/5})$ if $K \sim_{n\to\infty} cn^{4/5}$ for some $c \in \mathbb{R}_+^*$.

Proof. The first inequality is obvious. The second one is a consequence of Jensen's inequality, which gives $\int_I e^{-\rho t}\,\lambda_H(d\rho) \ge \lambda_H(I)\exp\Big(-\frac{\int_I \rho\,\lambda_H(d\rho)}{\int_I \lambda_H(d\rho)}\,t\Big)$ on any interval $I$, since $\rho \mapsto e^{-\rho t}$ is a convex function. We then get $0 \le G_{\lambda_H}(t) - \hat G^{K,A}(t) \le G_{\lambda_H}(t) - \hat G^K(t)$ and thus $\int_0^T (G_{\lambda_H}(t) - \hat G^{K,A}(t))^2\,dt \le \int_0^T (G_{\lambda_H}(t) - \hat G^K(t))^2\,dt$ for any $T > 0$. This gives the rate of convergence by (4.5), Theorem 3.1 and Lemma 3.1. $\square$
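To illustrate the construction (5.2)–(5.3), here is a short sketch (assuming NumPy/SciPy; the function names are ours) that builds the nodes and weights of $\hat\lambda^A$ for the rough-kernel measure $\lambda_H(d\rho) = c_H\rho^{-H-1/2}\,d\rho$ and evaluates $\hat G^{K,A}$; the inequalities of Proposition 5.1 provide an immediate sanity check, since the output must stay below $G_{\lambda_H}(t) = t^{H-1/2}/\Gamma(H+1/2)$.

```python
import numpy as np
from scipy.special import gamma

def lambda_hat_A(H, K, n, A):
    """Nodes (barycenters) and weights of hat-lambda^A in (5.2)-(5.3) for
    lambda_H(d rho) = c_H rho^{-H-1/2} d rho, c_H = 1/(Gamma(H+1/2)Gamma(1/2-H))."""
    c_H = 1.0 / (gamma(H + 0.5) * gamma(0.5 - H))
    # interval edges: uniform on [0, K], then geometric K*A, ..., K*A^n
    edges = np.concatenate([np.linspace(0.0, K, n + 1), K * A ** np.arange(1, n + 1)])
    a, b = edges[:-1], edges[1:]
    e = 0.5 - H
    weights = c_H * (b ** e - a ** e) / e                 # lambda_H([a, b))
    # closed-form barycenter: (1/2-H)/(3/2-H) * (b^{3/2-H}-a^{3/2-H})/(b^{1/2-H}-a^{1/2-H})
    nodes = (e / (e + 1.0)) * (b ** (e + 1.0) - a ** (e + 1.0)) / (b ** e - a ** e)
    return nodes, weights

def G_hat(t, nodes, weights):
    """hat G^{K,A}(t) = sum_i w_i exp(-rho_i t), for an array of times t."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    return np.exp(-np.outer(t, nodes)) @ weights
```

One can then check numerically that $\hat G^K(t) \le \hat G^{K,A}(t) \le G_{\lambda_H}(t)$, the truncated kernel $\hat G^K$ corresponding to the first $n$ nodes only.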
Note that Proposition 5.1 gives the same asymptotic rate of convergence as Proposition 4.3. This is confirmed by our numerical experiments: we have indicated in Table 5 the $L^2$-errors obtained with $K = n^{4/5}$ and $A = 3$, and the estimated rate of convergence $\hat\gamma$ is close to the theoretical one of $4/5$. However, comparing with Table 2 (approximation by $\hat G^K$), we see that the error is significantly reduced: for $n = 50$ and $H = 0.05$, the squared error falls well below the value of about 2 observed without the additional intervals. Thus, if the rate of convergence is not improved with respect to the approximation given by $\hat\lambda$, the approximation given by $\hat\lambda^A$ significantly reduces the approximation error. This suggests that the kernel with the constant $A$ improves the multiplicative constant in the rate of convergence.

[Table 5: columns $H = 0.45$, $0.25$, $0.05$; rows: values of $\zeta_t^n$ for successive $n$, then $\hat\gamma := \frac{1}{2H\log(2)}\log(\zeta^n/\zeta^{2n})$ = 0.819, 0.841, 0.806, and the theoretical factor 0.8, 0.8, 0.8.]
Table 5. Convergence results for $\zeta_t^n = \mathbb{E}[|\hat X_t^{K,A} - X_t|^2]$, with $A = 3$ and $t = 1$.

Now, we discuss the choice of $A$. By Remark 3.1, $\int_0^T (G_{\lambda_H}(t) - \hat G^{K,A}(t))^2\,dt$ is a natural criterion to assess the quality of the approximation. Besides, we know by Lemma 5.1 that this quantity can be calculated easily. Thus, it is natural to find $A^*$ that minimizes $\int_0^T (G_{\lambda_H}(t) - \hat G^{K,A}(t))^2\,dt$. This can be done in practice by using a one-dimensional optimization routine.

[Figure 1: panels (a) $H = 0.45$, $n = 5$; (b) $H = 0.45$, $n = 10$; (c) $H = 0.25$, $n = 5$; (d) $H = 0.25$, $n = 10$; (e) $H = 0.05$, $n = 10$; (f) $H = 0.05$, $n = 40$.]
Figure 1.
Plots of $G_{\lambda_H}(t) = \frac{t^{H-1/2}}{\Gamma(H+1/2)}$ (black), $\hat G^{n^{4/5}}(t)$ (blue), $\hat G^{n^{4/5},A^*}(t)$ (red) and $\xi^*\hat G^{n^{4/5},A^*}(t)$ (magenta) for different values of $H$ and $n$.

[Table 6: for each $H$ and $n$, the $L^2$ errors $\sqrt{\int_0^T\big(\frac{t^{H-1/2}}{\Gamma(H+1/2)} - \xi^*\hat G^{n^{4/5},A_n^*}(t)\big)^2 dt}$ and $\sqrt{\int_0^T\big(\frac{t^{H-1/2}}{\Gamma(H+1/2)} - \hat G^{n,a^*}(t)\big)^2 dt}$.]
Table 6.
Table giving the value of the $L^2$ error for $\xi^*\hat G^{n^{4/5},A^*}$ and $\hat G^{n,a^*}$.

Last, once $A^*$ has been calculated, we still notice that we have $\hat G^{K,A^*}(t) \le G_{\lambda_H}(t)$ by Proposition 5.1. Therefore, there exists $\xi^* \ge 1$ that minimizes $\int_0^T (G_{\lambda_H}(t) - \xi\hat G^{K,A^*}(t))^2\,dt$, namely
$$\xi^* = \frac{\int_0^T G_{\lambda_H}(t)\,\hat G^{K,A^*}(t)\,dt}{\int_0^T \big(\hat G^{K,A^*}(t)\big)^2\,dt},$$
which can, similarly as in Lemma 5.1, be calculated exactly by means of the incomplete gamma function. Let us note that, with this last adjustment, the approximation $\xi^*\hat G^{K,A^*}$ is still completely monotone, which may be an interesting property to preserve.
Figure 1 illustrates, for different values of $H$, the different approximations of the rough kernel. It shows the interest of the progressive steps of our approximations, from $\hat G^{n^{4/5}}(t)$ to $\hat G^{n^{4/5},A^*}(t)$ and then to $\xi^*\hat G^{n^{4/5},A^*}(t)$. We first observe that the approximation $\hat G^{n^{4/5}}(t)$ provided by Proposition 4.3 is not accurate close to time zero, due to the truncation. For $H = 0.45$ (resp. $H = 0.25$), the approximations $\hat G^{n^{4/5},A^*}(t)$ and $\xi^*\hat G^{n^{4/5},A^*}(t)$ are quite perfect for $n = 5$ (resp. $n = 10$). For $H = 0.05$ and $n = 10$, one better observes the role of the parameter $\xi^*$, which shifts the approximation upward so that it crosses $G_{\lambda_H}$ at some optimal point to minimize the $L^2$ error. For $n = 40$, the approximation of the rough kernel is quite perfect.

A more heuristic approach.
For a fixed $n \in \mathbb{N}^*$, we are thus interested in minimizing, with respect to $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{R}^n$ and $0 \le \rho_1 < \dots < \rho_n$, the quantity
$$\int_0^T \Big(\frac{t^{H-1/2}}{\Gamma(H+1/2)} - \sum_{i=1}^n \alpha_i e^{-\rho_i t}\Big)^2 dt = v^\top\Sigma v,$$
where $\Sigma$ is the matrix defined by (5.1) and $v^\top = (\alpha_1, \dots, \alpha_n, -1)$. When $0 \le \rho_1 < \dots < \rho_n$ is fixed, the minimization in $\alpha$ is simply given by $\alpha(\rho) = ((\Sigma_{i,j})_{1\le i,j\le n})^{-1}(\Sigma_{n+1,i})_{1\le i\le n}$. We denote $v(\rho)^\top = (\alpha(\rho)^\top, -1)$ and minimize $\rho \mapsto v(\rho)^\top\Sigma v(\rho)$ on $0 \le \rho_1 < \dots < \rho_n$. To simplify, we restrict the minimization to $\rho_i(a) = a^{i-1-\kappa n}$ with $\kappa \in (0,1)$ fixed and then perform the one-dimensional minimization with respect to $a > 1$. In our simulation of Table 6, we have taken $\kappa = 0.2$. Heuristically, taking powers of $a$ for the values of $\rho_i$ enables to select different time-scales. We denote $\hat G^{n,a}(t) = \sum_{i=1}^n \alpha_i(\rho(a))\,e^{-\rho_i(a)t}$. We take as a comparison $\xi^*\hat G^{K,A^*}(t)$, defined as in Proposition 5.1 with $K = n^{4/5}$. We have indicated in Table 6 the $L^2$ errors provided by the two methods. Table 6 shows that the quality of the fit given by $\hat G^{n,a^*}$ and by $\xi^*\hat G^{n^{4/5},A^*}$ is of the same magnitude. Let us note that $\hat G^{n,a^*}$ (resp. $\xi^*\hat G^{n^{4/5},A^*}$) is a combination of $n$ (resp. $2n$) exponential functions. Therefore, the SDE associated to $\hat G^{n,a^*}$ has a dimension twice smaller than the one associated to $\xi^*\hat G^{n^{4/5},A^*}$, which is an interesting advantage. However, the approximation $\hat G^{n,a^*}$ is no longer completely monotone, which may be a property that one would like to preserve in some applications. Second, the values of $\alpha$ and $\rho$ defining $\hat G^{n,a^*}$ may take very large positive and negative values, leading to an SDE with large coefficients that is thus difficult to approximate. For $H = 0.45$ and $n = 5$, this is not the case, since the coefficients $\alpha_1, \dots, \alpha_5$ that we obtain are all of order one. However, for $H = 0.05$ and $n = 20$ or $n = 40$, we have positive and negative coefficients with very large absolute values. For this reason, the heuristic approach may lead to unstable results, especially for $H$ close to zero. We have noticed this kind of instability in some of our numerical tests when using this approximation for the rough Bergomi model. Thus, we recommend in general the use of $\xi^*\hat G^{n^{4/5},A^*}$.

5.3. Application to the Rough Bergomi model.
In this subsection, we give a practical application and consider the pricing of European call options in the rough Bergomi model. Namely, we consider a two-dimensional Brownian motion $W$ and the following dynamics:
$$S_t = S_0\exp\Big(\int_0^t \sqrt{\nu_s}\,\big(\rho\,dW_s^1 + \sqrt{1-\rho^2}\,dW_s^2\big) - \frac12\int_0^t \nu_s\,ds\Big), \quad (5.4)$$
$$\nu_t = \nu_0\exp\Big(\eta\sqrt{2H}\int_0^t (t-s)^{H-1/2}\,dW_s^1 - \frac{\eta^2}{2}t^{2H}\Big). \quad (5.5)$$
We first describe the algorithm of Bayer et al. [3]. It consists in discretizing the time interval $[0, T]$ with $N$ time steps. Thus, one has to simulate the Gaussian vector $\big(\int_0^{lT/N}(lT/N - s)^{H-1/2}\,dW_s^1, W_{lT/N}^1\big)_{l=1,\dots,N}$ by computing a Cholesky decomposition of the covariance matrix. Then, the values of $\nu_{lT/N}$ are sampled exactly, and one approximates $S$ with the following scheme, for $l \in \{1, \dots, N\}$:
$$\hat S_{\frac{lT}{N}} = \hat S_{\frac{(l-1)T}{N}}\exp\Big(\sqrt{\nu_{\frac{(l-1)T}{N}}}\Big(\rho\big(W_{\frac{lT}{N}}^1 - W_{\frac{(l-1)T}{N}}^1\big) + \sqrt{1-\rho^2}\big(W_{\frac{lT}{N}}^2 - W_{\frac{(l-1)T}{N}}^2\big)\Big) - \frac12\nu_{\frac{(l-1)T}{N}}\frac{T}{N}\Big).$$
Here, we furthermore approximate $\nu$ by using an approximation of the rough kernel. Namely, we use that
$$\sqrt{2H}\int_0^t (t-s)^{H-1/2}\,dW_s^1 = \sqrt{2H}\,\Gamma(H+1/2)\int_0^t G_{\lambda_H}(t-s)\,dW_s^1 \approx \sqrt{2H}\,\Gamma(H+1/2)\int_0^t \xi^*\hat G^{n^{4/5},A^*}(t-s)\,dW_s^1.$$
Since the approximation is a combination of exponential functions, we can simulate it exactly by the Gaussian vector $\big(\int_0^{lT/N}\exp(-\rho_i(lT/N - s))\,dW_s^1, W_{lT/N}^1\big)_{l\in\{1,\dots,N\},\,i}$, again by computing a Cholesky decomposition of the covariance matrix. Then, we define the following approximation of $\nu$ with $\bar c = \eta\sqrt{2H}\,\Gamma(H+1/2)\,\xi^*$:
$$\hat\nu_{\frac{lT}{N}} = \nu_0\exp\Big(\bar c\int_0^t \hat G^{n^{4/5},A^*}(t-s)\,dW_s^1 - \frac12\bar c^2\int_0^t \hat G^{n^{4/5},A^*}(t-s)^2\,ds\Big), \quad t = \frac{lT}{N}. \quad (5.6)$$
Note that the integral $\int_0^t \hat G^{n^{4/5},A^*}(t-s)^2\,ds$ can be easily calculated exactly. We notice that it is important in numerical applications to compute it, instead of using the compensator $\frac{\eta^2}{2}(lT/N)^{2H}$ of the exact kernel, which introduces some bias. This slight modification improves significantly the numerical results in approximating the smile curve.

Figure 2.
Implied volatility of the call option with strike $e^k$ obtained by the Monte Carlo estimator of $\mathbb{E}[(S_T - e^k)_+]$: the method of Bayer et al. [3] in blue, our proposed approximation in red. Respective 95% confidence intervals in yellow and green, delimited with dotted lines of the same color. Parameters: $S_0 = 1$.

The approximation that we propose is very close to the smile produced by the method proposed in [3], which shows its relevance. Note that, on this specific example, there is no particular advantage to using our approximation rather than the one of Bayer et al. [3], since everything can be sampled exactly. However, if one uses for the volatility a more involved Volterra SDE with the rough kernel, exact sampling is no longer possible, while our approximations can still be used since they correspond to a classical SDE in a higher dimension.
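The multi-factor scheme of this subsection can be sketched as follows (a hypothetical implementation, not the authors' code: the nodes and weights stand for a completely monotone approximation $\hat G(u) = \sum_i w_i e^{-\rho_i u}$ such as $\xi^*\hat G^{n^{4/5},A^*}$, and the factors $Y_i(t) = \int_0^t e^{-\rho_i(t-s)}\,dW_s^1$ are updated by an exact per-step recursion instead of one global Cholesky factorization over the whole path):

```python
import numpy as np
from scipy.special import gamma

def simulate_rough_bergomi(nodes, weights, S0, v0, eta, rho, H, T, N, n_paths, rng):
    """Simulates S_T in (5.4)-(5.5), the rough kernel being replaced by
    G_hat(u) = sum_i w_i exp(-rho_i u); hypothetical sketch, not the paper's code."""
    h = T / N
    m = len(nodes)
    cbar = eta * np.sqrt(2.0 * H) * gamma(H + 0.5)      # prefactor bar-c (taking xi* = 1)
    R = np.add.outer(nodes, nodes)
    # exact one-step covariance of (int_t^{t+h} e^{-rho_i(t+h-s)} dW^1_s, W^1_{t+h}-W^1_t)
    C = np.empty((m + 1, m + 1))
    C[:m, :m] = (1.0 - np.exp(-R * h)) / R
    C[:m, m] = C[m, :m] = (1.0 - np.exp(-nodes * h)) / nodes
    C[m, m] = h
    Lc = np.linalg.cholesky(C)
    decay = np.exp(-nodes * h)
    WW = np.outer(weights, weights)
    Y = np.zeros((n_paths, m))                          # Y_i(t) = int_0^t e^{-rho_i(t-s)} dW^1_s
    logS = np.full(n_paths, np.log(S0))
    for l in range(N):
        t = l * h                                       # left grid point
        var_t = np.sum(WW * (1.0 - np.exp(-R * t)) / R) # Var(int_0^t G_hat(t-s) dW^1_s)
        v = v0 * np.exp(cbar * (Y @ weights) - 0.5 * cbar ** 2 * var_t)
        Z = rng.standard_normal((n_paths, m + 1)) @ Lc.T
        dW1 = Z[:, m]
        dWperp = np.sqrt(h) * rng.standard_normal(n_paths)
        logS += np.sqrt(v) * (rho * dW1 + np.sqrt(1.0 - rho ** 2) * dWperp) - 0.5 * v * h
        Y = Y * decay + Z[:, :m]
    return np.exp(logS)
```

Since the variance compensator is computed from $\hat G$ itself, the scheme keeps $\mathbb{E}[\hat\nu_t] = \nu_0$ and $(S_t)$ a martingale, which is the bias correction discussed above.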
References [1] E. Abi Jaber and O. El Euch. Multifactor approximation of rough volatility models.
SIAM J. FinancialMath. , 10(2):309–349, 2019.[2] A. Alfonsi. High order discretization schemes for the CIR process: application to affine term structureand Heston models.
Math. Comp. , 79(269):209–237, 2010.[3] C. Bayer, P. Friz, and J. Gatheral. Pricing under rough volatility.
Quant. Finance , 16(6):887–904, 2016.[4] C. Bayer, P. K. Friz, A. Gulisashvili, B. Horvath, and B. Stemper. Short-time near-the-money skew inrough fractional volatility models.
Quant. Finance , 19(5):779–798, 2019.[5] D. Belomestny, S. H¨afner, T. Nagapetyan, and M. Urusov. Variance reduction for discretised diffusionsvia regression.
J. Math. Anal. Appl. , 458(1):393–418, 2018.[6] M. Ben Alaya and A. Kebaier. Central limit theorem for the multilevel Monte Carlo Euler method.
Ann.Appl. Probab. , 25(1):211–234, 2015.[7] M. Bennedsen, A. Lunde, and M. S. Pakkanen. Hybrid scheme for Brownian semistationary processes.
Finance Stoch. , 21(4):931–965, 2017.[8] M. A. Berger and V. J. Mizel. Volterra equations with Ito integrals - I.
J. Integral Equations , 2:187–245,1980.[9] M. A. Berger and V. J. Mizel. Volterra equations with Ito integrals - II.
J. Integral Equations , 2:319–337,1980.[10] P. Carmona and L. Coutin. Fractional Brownian motion and the Markov property.
Electron. Commun. Probab., 3:95–107, 1998.
[11] P. Carmona, L. Coutin, and G. Montseny. Approximation of some Gaussian processes.
Stat. Inference Stoch. Pro-cess. , 3(1-2):161–171, 2000.[12] W. G. Cochran, J.-S. Lee, and J. Potthoff. Stochastic Volterra equations with singular kernels.
StochasticProcesses Appl. , 56(2):337–349, 1995.[13] L. Coutin and L. Decreusefond. Stochastic Volterra equations with singular kernels. In
Stochastic analysis and mathematical physics, pages 39–50. Boston: Birkhäuser, 2001.
[14] L. Decreusefond. Regularity properties of some stochastic Volterra integrals with singular kernel.
PotentialAnal. , 16(2):139–149, 2002.[15] P. K. Friz, P. Gassiat, and P. Pigato. Short dated smile under rough volatility: asymptotics and numerics,2020.[16] M. Fukasawa. Short-time at-the-money skew and rough fractional volatility.
Quant. Finance , 17(2):189–198, 2017.[17] M. Fukasawa. Volatility has to be rough, 2020.[18] J. Gatheral, T. Jaisson, and M. Rosenbaum. Volatility is rough.
Quant. Finance , 18(6):933–949, 2018.[19] M. B. Giles. Multilevel Monte Carlo path simulation.
Oper. Res. , 56(3):607–617, 2008.[20] G. Gripenberg, S.-O. Londen, and O. Staffans.
Volterra integral and functional equations , volume 34 of
Encyclopedia of Mathematics and its Applications . Cambridge University Press, Cambridge, 1990.[21] P. Harms and D. Stefanovits. Affine representations of fractional processes with applications in mathe-matical finance.
Stochastic Processes Appl. , 129(4):1185–1228, 2019.[22] E. Isaacson and H. B. Keller.
Analysis of numerical methods. Dover Publications, Inc., New York, 1994. Corrected reprint of the 1966 original [Wiley, New York].
[23] B. Jourdain and J. Lelong. Robust adaptive importance sampling for normal random vectors. Ann. Appl. Probab., 19(5):1687–1718, 2009.
[24] S. Kusuoka. Approximation of expectation of diffusion processes based on Lie algebra and Malliavin calculus. In
Advances in mathematical economics. Vol. 6 , volume 6 of
Adv. Math. Econ., pages 69–83. Springer, Tokyo, 2004.
[25] V. Lemaire and G. Pagès. Unconstrained recursive importance sampling.
Ann. Appl. Probab., 20(3):1029–1067, 2010.
[26] V. Lemaire and G. Pagès. Multilevel Richardson–Romberg extrapolation.
Bernoulli , 23(4A):2643–2692,2017.[27] N. J. Newton. Variance reduction for simulated diffusions.
SIAM J. Appl. Math. , 54(6):1780–1805, 1994.[28] S. Ninomiya and N. Victoir. Weak approximation of stochastic differential equations and application toderivative pricing.
Appl. Math. Finance , 15(1-2):107–121, 2008. [29] E. Pardoux and P. Protter. Stochastic Volterra equations with anticipating coefficients.
Ann. Probab. ,18(4):1635–1655, 1990.[30] P. Protter. Volterra equations driven by semimartingales.
Ann. Probab. , 13:519–530, 1985.[31] A. Richard, X. Tan, and F. Yang. Discrete-time simulation of stochastic volterra equations, 2020.[32] Y. Shinozaki. Construction of a third-order K-scheme and its application to financial models.
SIAM J.Financial Math. , 8(1):901–932, 2017.[33] D. Talay and L. Tubaro. Expansion of the global error for numerical schemes solving stochastic differentialequations.
Stochastic Anal. Appl. , 8(4):483–509 (1991), 1990.[34] Z. Wang. Existence and uniqueness of solutions to stochastic Volterra equations with singular kernels andnon-Lipschitz coefficients.
Stat. Probab. Lett. , 78(9):1062–1071, 2008.[35] D. V. Widder.
The Laplace Transform . Princeton Mathematical Series, v. 6. Princeton University Press,Princeton, N. J., 1941.[36] X. Zhang. Euler schemes and large deviations for stochastic Volterra equations with singular kernels.
J.Differ. Equations , 244(9):2226–2250, 2008.[37] X. Zhang. Stochastic Volterra equations in Banach spaces and stochastic partial differential equation.
J.Funct. Anal. , 258(4):1361–1425, 2010.
Aurélien Alfonsi, CERMICS, École des Ponts, Marne-la-Vallée, France. MathRisk, Inria, Paris, France.
Email address: [email protected]

Ahmed Kebaier, Université Sorbonne Paris Nord, LAGA, CNRS (UMR 7539), F-93430 Villetaneuse, France.
Email address: