Approximation of Stochastic Volterra Equations with kernels of completely monotone type
AURÉLIEN ALFONSI AND AHMED KEBAIER
Abstract. In this work, we develop a multi-factor approximation for Stochastic Volterra Equations with Lipschitz coefficients and kernels of completely monotone type that may be singular. Our approach consists in truncating and then discretizing the integral defining the kernel, which corresponds to a classical Stochastic Differential Equation. We prove strong convergence results for this approximation. For the particular rough kernel case with Hurst parameter lying in $(0,1/2)$, we propose several discretization procedures and give their precise rates of convergence.

1. Introduction
In recent years, there has been significant and growing interest in studying Stochastic Volterra Equations (SVE) since they arise in many applications such as mathematical finance, biology, physics, and engineering. Several studies have investigated the SVE under regular kernels, see e.g. Berger and Mizel [8, 9], Protter [30], Pardoux and Protter [29], and under non-regular kernels as well, see e.g. Cochran et al. [12], Coutin and Decreusefond [13], Decreusefond [14], Wang [34], Zhang [37], and the references therein. More recently, much attention in quantitative finance has centered on using the SVE with a fractional kernel having a small Hurst parameter $H$.

Date: March 1, 2021.
2010 Mathematics Subject Classification.
Key words and phrases. Stochastic Volterra Equation, Strong error, Fractional kernel, Rough volatility model.
This work benefited from the support of the “chaire Risques financiers”, Fondation du Risque, and of the Laboratory of Excellence MME-DII, Grant no. ANR11LBX-0023-01 (http://labex-mme-dii.u-cergy.fr/).
Kusuoka [24], Ninomiya and Victoir [28], Alfonsi [2] and Shinozaki [32]), Multilevel Monte Carlo methods (see e.g. Giles [19], Ben Alaya and Kebaier [6], Lemaire and Pagès [26]), the variance reduction techniques (see e.g. Newton [27], Jourdain and Lelong [23], Lemaire and Pagès [25], Belomestny et al. [5]), etc. This gives more flexibility for the approximation setting. In this paper, we are interested in approximating the SVE in a general form given by
$$X_t = x + \int_0^t G_1(t-s)\, b(X_s)\, ds + \int_0^t G_2(t-s)\, \sigma(X_s)\, dW_s, \quad t \ge 0, \qquad (1.1)$$
where $x \in \mathbb{R}^d$, $b : \mathbb{R}^d \to \mathbb{R}^d$, $\sigma : \mathbb{R}^d \to \mathcal{M}_d(\mathbb{R})$ are globally Lipschitz continuous coefficients, $W$ is a standard Brownian motion in $\mathbb{R}^d$ and $G_1, G_2 : \mathbb{R}_+^* \to \mathcal{M}_d(\mathbb{R})$ are kernels of the form
$$G_j(t) = \int_{\mathbb{R}_+} e^{-\rho t} M_j(\rho)\, \lambda(d\rho), \quad \text{for } t \in\, ]0, +\infty[,$$
with bounded measurable functions $M_1, M_2 : \mathbb{R}_+ \to \mathcal{M}_d(\mathbb{R})$ and a measure $\lambda$ on $\mathbb{R}_+$ satisfying $\int_{\mathbb{R}_+} e^{-\rho t}\, \lambda(d\rho) < +\infty$ for all $t>0$. Note that when $M_1$ and $M_2$ are non-negative scalar functions, $G_1$ and $G_2$ are known in the literature as completely monotone kernels. In particular, the singular fractional kernel with Hurst parameter that lies in $(0, 1/
2)$ is covered within this framework. More precisely, we approximate the solution to (1.1) by a multi-factor approximation that corresponds to a stochastic differential equation in a higher dimension. For this setting, we prove a strong convergence error for our multi-factor approximation scheme. To do so, we proceed in two steps: first we truncate the integrals defining $G_j$, and second we discretize the measure $\lambda$ on the truncated interval $[0, K]$. We denote respectively by $X^K$ and $\hat{X}^K$ the corresponding SVE processes. Thus, in Section 3 we derive a first strong convergence result on the error between the process $X$ and its truncated version $X^K$, and a second one for the error between $X^K$ and $\hat{X}^K$. Section 4 is devoted to the study of the rough kernel, where we propose various discretization procedures and give their precise rates of convergence. Then we illustrate in Section 5 our theoretical results and give a financial application with the celebrated rough Bergomi model. It is worth noticing that the obtained strong error gives only an asymptotic rate, but the approximation error may be significantly impacted by some unknown multiplicative constant. Thus, we develop adjustments useful in practice to reduce this constant significantly and get efficient methods.

2. General Framework and preliminary results
We consider the SVE in a general form given by
$$X_t = x + \int_0^t G_1(t-s)\, b(X_s)\, ds + \int_0^t G_2(t-s)\, \sigma(X_s)\, dW_s, \quad t \ge 0, \qquad (2.1)$$
where $x \in \mathbb{R}^d$, $b : \mathbb{R}^d \to \mathbb{R}^d$, $\sigma : \mathbb{R}^d \to \mathcal{M}_d(\mathbb{R})$ are globally Lipschitz continuous coefficients, i.e.
$$\exists L > 0, \ \forall x, y \in \mathbb{R}^d, \quad |b(x) - b(y)| + \|\sigma(x) - \sigma(y)\| \le L\, |x - y|, \qquad (2.2)$$
$W$ is a standard Brownian motion in $\mathbb{R}^d$ and $G_1, G_2 : \mathbb{R}_+^* \to \mathcal{M}_d(\mathbb{R})$ are kernels that satisfy
$$\int_0^T \|G_1(s)\|^2 + \|G_2(s)\|^2\, ds < \infty, \quad \text{for every } T \in \mathbb{R}_+. \qquad (2.3)$$
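The completely monotone structure $G_j(t) = \int_{\mathbb{R}_+} e^{-\rho t} M_j(\rho)\, \lambda(d\rho)$ underlying (2.1) can be checked numerically on the fractional example given later in (2.8). Below is a minimal, stdlib-only sketch of this check; the substitution $\rho = e^u$ and the trapezoidal grid are our implementation choices, not from the paper.

```python
import math

def frac_kernel(t, H):
    # G(t) = t^{H-1/2} / Gamma(H+1/2), the fractional kernel of (2.8)
    return t ** (H - 0.5) / math.gamma(H + 0.5)

def frac_kernel_from_measure(t, H, lo=-60.0, hi=30.0, n=200000):
    # int_0^inf e^{-rho t} c_H rho^{-H-1/2} drho, computed after the
    # substitution rho = e^u with a trapezoidal rule on [lo, hi]
    c_H = 1.0 / (math.gamma(H + 0.5) * math.gamma(0.5 - H))
    h = (hi - lo) / n
    acc = 0.0
    for k in range(n + 1):
        u = lo + k * h
        w = 0.5 if k in (0, n) else 1.0
        # integrand after substitution: exp(-t e^u + (1/2 - H) u)
        acc += w * math.exp(-t * math.exp(u) + (0.5 - H) * u)
    return c_H * h * acc

H, t = 0.1, 0.7
assert abs(frac_kernel(t, H) - frac_kernel_from_measure(t, H)) < 1e-5
```

The two evaluations agree to quadrature accuracy, which is exactly the identity $t^{H-1/2}/\Gamma(H+1/2) = \int_{\mathbb{R}_+} e^{-\rho t}\, \lambda_H(d\rho)$ exploited throughout the paper.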
Then, we can apply Theorem 3.1 of [37] and get that there exists a unique strong solution to (2.1). Note that if $\int_0^T \|G_1(s)\|^2 + \|G_2(s)\|^2\, ds < \infty$ for some $T > 0$, then there exists a unique strong solution $(X_t, t \in [0,T])$ up to time $T$. Obviously, those conditions do not depend on the choice of the norms $|\cdot|$ and $\|\cdot\|$ on $\mathbb{R}^d$ and $\mathcal{M}_d(\mathbb{R})$. In this paper, we will use the Euclidean norm on $\mathbb{R}^d$ and the Frobenius norm on $\mathcal{M}_d(\mathbb{R})$, and we recall that we have
$$\forall A, B \in \mathcal{M}_d(\mathbb{R}), \ \forall x \in \mathbb{R}^d, \quad \|AB\| \le \|A\|\, \|B\| \ \text{ and } \ |Ax| \le \|A\|\, |x|. \qquad (2.4)$$
In this paper, we are interested in the approximation of (2.1) when there exist bounded measurable functions $M_1, M_2 : \mathbb{R}_+ \to \mathcal{M}_d(\mathbb{R})$ and a measure $\lambda$ on $\mathbb{R}_+$ satisfying
$$\forall t > 0, \quad \bar{G}(t) = \int_{\mathbb{R}_+} e^{-\rho t}\, \lambda(d\rho) < +\infty, \qquad (2.5)$$
such that
$$G_j(t) = \int_{\mathbb{R}_+} e^{-\rho t} M_j(\rho)\, \lambda(d\rho), \quad \text{for } t \in\, ]0, +\infty[. \qquad (2.6)$$
We note $\bar{M}_j = \sup_{\rho \ge 0} \|M_j(\rho)\|$ and trivially have $\|G_j(t)\| \le \bar{M}_j\, \bar{G}(t)$. We will assume through the paper that $\bar{G} \in L^2(\mathbb{R}_+^*, \mathbb{R}_+)$, i.e.
$$\forall T > 0, \quad \int_0^T \bar{G}(t)^2\, dt < \infty, \qquad (2.7)$$
and therefore condition (2.3) is satisfied. In the one-dimensional case, the kernel $G_j$ is completely monotone when $M_j \ge 0$. In particular, the fractional kernel $G^{\lambda_H}(t) = \frac{t^{H-1/2}}{\Gamma(H+1/2)}$ with parameter $H \in (0,1/2)$ falls into this framework, since $G^{\lambda_H}(t) = \int_{\mathbb{R}_+} e^{-\rho t}\, \lambda_H(d\rho)$ with
$$\lambda_H(d\rho) = c_H\, \rho^{-H-1/2}\, d\rho, \quad \text{and} \quad c_H := \frac{1}{\Gamma(H+1/2)\, \Gamma(1/2-H)}. \qquad (2.8)$$
The principle of the approximation is rather simple. We approximate the measure $\lambda$ by a finite discrete measure. Then, the next proposition ensures that the Stochastic Volterra Equation (2.1) can be obtained from the solution of a classical SDE, for which many numerical methods have been developed. Thus, the goal of the paper is to analyze the error made when replacing the measure $\lambda$ by a finite discrete measure. We will focus in this paper on strong error estimates.

Proposition 2.1.
Let us assume that $\lambda(d\rho) = \sum_{i=1}^n \alpha_i\, \delta_{\rho_i}(d\rho)$ with $\alpha_i \ge 0$ and $0 \le \rho_1 < \cdots < \rho_n$.
(1) Let us assume $M_1 = M_2 = M$ and $\mathrm{rank}([\alpha_1 M(\rho_1)\ \ldots\ \alpha_n M(\rho_n)]) = d$, so that there exist $x_1, \ldots, x_n \in \mathbb{R}^d$ such that $\sum_{i=1}^n \alpha_i M(\rho_i)\, x_i = x$. Then, the solution of (2.1) is given by $\sum_{i=1}^n \alpha_i M(\rho_i)\, X_t^{\rho_i}$, where $(X_t^{\rho_1}, \ldots, X_t^{\rho_n})$ is the solution of the $(n \times d)$-dimensional Stochastic Differential Equation defined by
$$X_t^{\rho_i} = x_i - \int_0^t \rho_i\, (X_s^{\rho_i} - x_i)\, ds + \int_0^t b\Big(\sum_{i=1}^n \alpha_i M(\rho_i) X_s^{\rho_i}\Big)\, ds + \int_0^t \sigma\Big(\sum_{i=1}^n \alpha_i M(\rho_i) X_s^{\rho_i}\Big)\, dW_s. \qquad (2.9)$$
(2)
Let us assume $\mathrm{rank}([\alpha_1 M_1(\rho_1)\ \ldots\ \alpha_n M_1(\rho_n)\ \alpha_1 M_2(\rho_1)\ \ldots\ \alpha_n M_2(\rho_n)]) = d$, so that there exist $x_1, \ldots, x_n, y_1, \ldots, y_n \in \mathbb{R}^d$ such that $\sum_{i=1}^n \alpha_i [M_1(\rho_i)\, x_i + M_2(\rho_i)\, y_i] = x$. Then, the solution of (2.1) is given by $X_t = \sum_{i=1}^n \alpha_i M_1(\rho_i)\, X_t^{\rho_i} + \sum_{i=1}^n \alpha_i M_2(\rho_i)\, Y_t^{\rho_i}$, where $(X_t^{\rho_1}, Y_t^{\rho_1}, \ldots, X_t^{\rho_n}, Y_t^{\rho_n})$ is the solution of the $(2n \times d)$-dimensional Stochastic Differential Equation defined by
$$X_t^{\rho_i} = x_i - \int_0^t \rho_i\, (X_s^{\rho_i} - x_i)\, ds + \int_0^t b\Big(\sum_{i=1}^n \alpha_i M_1(\rho_i) X_s^{\rho_i} + \sum_{i=1}^n \alpha_i M_2(\rho_i) Y_s^{\rho_i}\Big)\, ds,$$
$$Y_t^{\rho_i} = y_i - \int_0^t \rho_i\, (Y_s^{\rho_i} - y_i)\, ds + \int_0^t \sigma\Big(\sum_{i=1}^n \alpha_i M_1(\rho_i) X_s^{\rho_i} + \sum_{i=1}^n \alpha_i M_2(\rho_i) Y_s^{\rho_i}\Big)\, dW_s. \qquad (2.10)$$

Proof.
Let us first consider the case $M_1 = M_2 = M$. The SDE (2.9) has Lipschitz coefficients and therefore has a unique strong solution. Since
$$d\big(e^{\rho_i t}(X_t^{\rho_i} - x_i)\big) = e^{\rho_i t}\, b\Big(\sum_{i=1}^n \alpha_i M(\rho_i) X_t^{\rho_i}\Big)\, dt + e^{\rho_i t}\, \sigma\Big(\sum_{i=1}^n \alpha_i M(\rho_i) X_t^{\rho_i}\Big)\, dW_t,$$
we get
$$X_t^{\rho_i} = x_i + \int_0^t e^{-\rho_i(t-s)}\, b\Big(\sum_{i=1}^n \alpha_i M(\rho_i) X_s^{\rho_i}\Big)\, ds + \int_0^t e^{-\rho_i(t-s)}\, \sigma\Big(\sum_{i=1}^n \alpha_i M(\rho_i) X_s^{\rho_i}\Big)\, dW_s.$$
We left multiply this equation by $\alpha_i M(\rho_i)$ and then sum over $i$ to obtain that $\sum_{i=1}^n \alpha_i M(\rho_i) X_t^{\rho_i}$ solves (2.1). The strong uniqueness result (Theorem 3.1 of [37]) gives the claim.
In the general case, we similarly get
$$X_t^{\rho_i} = x_i + \int_0^t e^{-\rho_i(t-s)}\, b\Big(\sum_{i=1}^n \alpha_i M_1(\rho_i) X_s^{\rho_i} + \sum_{i=1}^n \alpha_i M_2(\rho_i) Y_s^{\rho_i}\Big)\, ds,$$
$$Y_t^{\rho_i} = y_i + \int_0^t e^{-\rho_i(t-s)}\, \sigma\Big(\sum_{i=1}^n \alpha_i M_1(\rho_i) X_s^{\rho_i} + \sum_{i=1}^n \alpha_i M_2(\rho_i) Y_s^{\rho_i}\Big)\, dW_s.$$
We then left multiply the first equation by $\alpha_i M_1(\rho_i)$ and the second equation by $\alpha_i M_2(\rho_i)$, and sum over $i$ to get the claim. □

3. Strong error analysis for the approximation
To analyse the error between the SVE and its approximation by using kernels $\hat{G}_j(t)$ supported by a finite discrete measure (as in Proposition 2.1), we proceed in two steps. First, we analyse the truncation error made when replacing the kernels by the kernels obtained by truncating the measure $\lambda$ in (2.6). Second, we analyse the error between the SVE with the truncated kernels and the one with the approximating kernels.
For any $K > 0$, we introduce then the truncated convolution kernels $G_j^K : \mathbb{R}_+ \to \mathcal{M}_d(\mathbb{R})$, $j \in \{1,2\}$, that are defined as follows:
$$G_j^K(t) = \int_{[0,K)} e^{-\rho t} M_j(\rho)\, \lambda(d\rho), \quad \text{for all } t \ge 0. \qquad (3.1)$$
Thus, the kernel $G_j^K$ approximates the kernel $G_j$ defined by (2.6) as $K \to +\infty$. Since $\bar{M}_j = \sup_{\rho \ge 0} \|M_j(\rho)\| < \infty$, we have the following uniform bound:
$$\forall K > 0, \quad \|G_j^K(t)\| \le \bar{M}_j\, \bar{G}(t) \quad \text{with} \quad \bar{G}(t) = \int_{\mathbb{R}_+} e^{-\rho t}\, \lambda(d\rho). \qquad (3.2)$$
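The effect of the truncation (3.1) can be observed directly on the scalar fractional kernel ($M_j \equiv 1$), where the kernel truncation error $\Delta^K(t) = \int_{[K,+\infty)} e^{-\rho t}\, \lambda_H(d\rho)$ is a one-dimensional integral. A stdlib-only sketch (the substitution $\rho = K e^u$ and the grid sizes are our choices):

```python
import math

def tail_kernel(t, K, H, n=200000, hi=30.0):
    # Delta^K(t) = int_[K,inf) e^{-rho t} c_H rho^{-H-1/2} drho, i.e. the
    # truncation error on the kernel itself when M_j == 1;
    # substitution rho = K e^u, trapezoidal rule on u in [0, hi]
    c_H = 1.0 / (math.gamma(H + 0.5) * math.gamma(0.5 - H))
    h = hi / n
    acc = 0.0
    for k in range(n + 1):
        u = k * h
        w = 0.5 if k in (0, n) else 1.0
        acc += w * math.exp(-t * K * math.exp(u) + (0.5 - H) * u)
    return c_H * K ** (0.5 - H) * h * acc

H, t = 0.1, 1.0
d1, d2, d3 = tail_kernel(t, 10.0, H), tail_kernel(t, 20.0, H), tail_kernel(t, 40.0, H)
# the truncated kernel G^K increases monotonically to G as K grows
assert d1 > d2 > d3 > 0.0
```

For fixed $t > 0$ the tail decays exponentially fast in $K$; the slower, polynomial rate $r(K) = O(K^{-2H})$ of Lemma 3.1 comes from integrating the squared tail down to $t = 0$, where the kernel is singular.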
We introduce the stochastic convolution equation $X^K$ associated to the kernels $G_j^K$, $j \in \{1,2\}$, given by
$$X_t^K = x + \int_0^t G_1^K(t-s)\, b(X_s^K)\, ds + \int_0^t G_2^K(t-s)\, \sigma(X_s^K)\, dW_s, \quad x \in \mathbb{R}^d. \qquad (3.3)$$
We also consider, for $c > 0$, the resolvent of the second kind $E_c(t)$ that solves the equation
$$E_c(t) = \bar{G}(t)^2 + \int_0^t c\, \bar{G}(t-s)^2\, E_c(s)\, ds. \qquad (3.4)$$
Since $\bar{G}^2 \in L^1(\mathbb{R}_+^*, \mathbb{R}_+)$ by (2.7), we get that $E_c(t)$ is well defined and belongs also to $L^1(\mathbb{R}_+^*, \mathbb{R}_+)$ (see Subsection A.3 of [1] and Theorem 2.3.1 of [20]).

Proposition 3.1.
Let $\lambda$ be a positive measure such that
$$\forall K > 0, \quad r(K) := \int_{[K,+\infty)} \int_{[K,+\infty)} \frac{\lambda(d\rho_1)\, \lambda(d\rho_2)}{\rho_1 + \rho_2} < \infty. \qquad \text{(H1)}$$
Then, for any $T > 0$, there exists a positive constant $C$ that depends on $T$, $\lambda$, $\bar{M}_1$, $\bar{M}_2$, $L$, $|b(0)|$ and $\|\sigma(0)\|$ such that
$$\forall t \in [0,T], \quad \mathbb{E}\big[|X_t - X_t^K|^2\big] \le C \times r(K). \qquad (3.5)$$

Proof.
We note $\Delta_j^K(t) = G_j(t) - G_j^K(t)$. We have for all $t \ge 0$:
$$|X_t - X_t^K|^2 \le 4\Big(\Big|\int_0^t \Delta_1^K(t-s)\, b(X_s)\, ds\Big|^2 + \Big|\int_0^t \Delta_2^K(t-s)\, \sigma(X_s)\, dW_s\Big|^2 + \Big|\int_0^t G_1^K(t-s)\big[b(X_s) - b(X_s^K)\big]\, ds\Big|^2 + \Big|\int_0^t G_2^K(t-s)\big[\sigma(X_s) - \sigma(X_s^K)\big]\, dW_s\Big|^2\Big),$$
by using the inequality $(a+b+c+d)^2 \le 4(a^2+b^2+c^2+d^2)$. Then, we get by using Jensen's inequality, the Itô isometry and (2.4):
$$\mathbb{E}\big[|X_t - X_t^K|^2\big] \le 4t \int_0^t \|\Delta_1^K(t-s)\|^2\, \mathbb{E}[|b(X_s)|^2]\, ds + 4 \int_0^t \|\Delta_2^K(t-s)\|^2\, \mathbb{E}[\|\sigma(X_s)\|^2]\, ds + 4t \int_0^t \|G_1^K(t-s)\|^2\, \mathbb{E}\big[|b(X_s) - b(X_s^K)|^2\big]\, ds + 4 \int_0^t \|G_2^K(t-s)\|^2\, \mathbb{E}\big[\|\sigma(X_s) - \sigma(X_s^K)\|^2\big]\, ds.$$
Then, by using the Lipschitz property (2.2), we get for $c_0 := 8(T \vee 1)\big(|b(0)|^2 \vee \|\sigma(0)\|^2 + L^2 \sup_{t \in [0,T]} \mathbb{E}[|X_t|^2]\big)$ and $c_1 := 4L^2(\bar{M}_1^2 T + \bar{M}_2^2)$:
$$\mathbb{E}\big[|X_t - X_t^K|^2\big] \le c_0 \int_0^t \|\Delta_1^K(t-s)\|^2 + \|\Delta_2^K(t-s)\|^2\, ds + c_1 \int_0^t \bar{G}(t-s)^2\, \mathbb{E}\big[|X_s - X_s^K|^2\big]\, ds.$$
Hence, we use the generalized Gronwall Lemma (see e.g. [20, Theorem 9.8.2]) to get
$$\mathbb{E}\big[|X_t - X_t^K|^2\big] \le c_0 \Big(\int_0^t \|\Delta_1^K(t-s)\|^2 + \|\Delta_2^K(t-s)\|^2\, ds\Big)\Big(1 + c_1 \int_0^T E_{c_1}(s)\, ds\Big),$$
where $E_{c_1}$ is defined by (3.4). (Note that if $\lambda(\mathbb{R}_+) < \infty$, then $\bar{G}(t-s) \le \lambda(\mathbb{R}_+)$ and we can use the classical Gronwall lemma. This argument cannot be applied for the rough kernels.)
Since $\Delta_j^K(t-s) = \int_{[K,+\infty)} M_j(\rho)\, e^{-\rho(t-s)}\, \lambda(d\rho)$, we have $\|\Delta_j^K(t-s)\| \le \bar{M}_j \int_{[K,+\infty)} e^{-\rho(t-s)}\, \lambda(d\rho)$ and thus
$$\int_0^t \|\Delta_j^K(t-s)\|^2\, ds \le \bar{M}_j^2 \int_0^t \int_{[K,+\infty)} \int_{[K,+\infty)} e^{-(\rho_1+\rho_2)(t-s)}\, \lambda(d\rho_1)\, \lambda(d\rho_2)\, ds = \bar{M}_j^2 \int_{[K,+\infty)} \int_{[K,+\infty)} \frac{1 - e^{-(\rho_1+\rho_2)t}}{\rho_1 + \rho_2}\, \lambda(d\rho_1)\, \lambda(d\rho_2) \le \bar{M}_j^2\, r(K).$$
We therefore get (3.5) with $C = c_0 (\bar{M}_1^2 + \bar{M}_2^2)\big(1 + c_1 \int_0^T E_{c_1}(s)\, ds\big)$. □

Remark 3.1. (Non-asymptotic estimates) One interest of working with truncation is that the families $G_1^K$ and $G_2^K$ are uniformly bounded in $L^2([0,T])$. Suppose now that $\hat{G}_1$ and $\hat{G}_2$ are two kernels such that
$$\exists \bar{C} \in \mathbb{R}_+^*, \ \forall j \in \{1,2\},\ t \in [0,T], \quad \|\hat{G}_j(t)\| \le \bar{C}\, (1 + \|G_j(t)\|).$$
Then, we have $\int_0^T \|\hat{G}_1(t)\|^2 + \|\hat{G}_2(t)\|^2\, dt < \infty$ and by Theorem 3.1 of [37], there exists a unique solution to
$$\hat{X}_t = x + \int_0^t \hat{G}_1(t-s)\, b(\hat{X}_s)\, ds + \int_0^t \hat{G}_2(t-s)\, \sigma(\hat{X}_s)\, dW_s, \quad t \in [0,T].$$
By the same arguments as in the proof of Proposition 3.1, we get the existence of a constant $C \in \mathbb{R}_+^*$ (depending on $b$, $\sigma$, $G_1$ and $G_2$) such that, with $\hat{\Delta}_j = \hat{G}_j - G_j$,
$$\mathbb{E}\big[|\hat{X}_t - X_t|^2\big] \le C \Big(\int_0^t \|\hat{\Delta}_1(s)\|^2 + \|\hat{\Delta}_2(s)\|^2\, ds\Big). \qquad (3.6)$$

Lemma 3.1.
Under the assumptions of the above proposition, we have $r(K) \le \frac{1}{2}\Big(\int_{[K,+\infty)} \frac{\lambda(d\rho)}{\sqrt{\rho}}\Big)^2$. If $\lambda(d\rho) = f(\rho)\, d\rho$ with $f(\rho) =_{\rho \to \infty} O(\rho^{-\eta-1/2})$ for some $\eta > 0$, we have $r(K) =_{K \to \infty} O(K^{-2\eta})$.
Proof. The upper bound is obtained from the standard inequality $\rho_1 + \rho_2 \ge 2\sqrt{\rho_1 \rho_2}$. For $\lambda(d\rho) = f(\rho)\, d\rho$ with $f(\rho) = O(\rho^{-\eta-1/2})$ and $\eta > 0$, there exist $C, \rho_0 > 0$ such that $f(\rho) \le C \rho^{-\eta-1/2}$ for $\rho \ge \rho_0$, so that $\int_{[K,+\infty)} \frac{\lambda(d\rho)}{\sqrt{\rho}} \le \frac{C}{\eta} K^{-\eta}$ for $K \ge \rho_0$, which gives the claim. □
We define, for $t \in [0,T]$,
$$\hat{\Delta}_j^K(t) = \hat{G}_j^K(t) - G_j^K(t),$$
and assume the following bound:
$$\exists \bar{\Delta} : [0,T] \to \mathbb{R}_+ \ \text{s.t.} \ \int_0^T \bar{\Delta}(t)^2\, dt < \infty, \quad \forall K > 0, \ \forall t \in [0,T], \quad \|\hat{\Delta}_j^K(t)\| \le \bar{\Delta}(t). \qquad \text{(H2)}$$
Note that in our examples, we will use $\bar{\Delta}$ as a constant function, but we keep it general for the presentation of the results. The assumption (H2) implies $\int_0^T \|\hat{G}_1^K(t)\|^2 + \|\hat{G}_2^K(t)\|^2\, dt < \infty$, and we know from Theorem 3.1 of [37] that there exists a unique strong solution $(\hat{X}_t^K, t \in [0,T])$ of the SVE
$$\hat{X}_t^K = x + \int_0^t \hat{G}_1^K(t-s)\, b(\hat{X}_s^K)\, ds + \int_0^t \hat{G}_2^K(t-s)\, \sigma(\hat{X}_s^K)\, dW_s.$$
The key property of (H2) is that the bound is uniform in $K$. This enables us to get the following result.

Lemma 3.2. (Uniform estimate on $\hat{X}^K$) Let (H2) hold. Then, there exists $C \in \mathbb{R}_+^*$ (depending on $|x|$, $T$, $|b(0)|$, $\|\sigma(0)\|$, $L$, $\bar{M}_1$ and $\bar{M}_2$) such that
$$\forall K > 0, \ \forall t \in [0,T], \quad \mathbb{E}[|\hat{X}_t^K|^2] \le C.$$

Proof.
We have, by using Jensen's inequality and Itô's isometry, for $t \in [0,T]$:
$$\mathbb{E}[|\hat{X}_t^K|^2] \le 3|x|^2 + 3t \int_0^t \|\hat{G}_1^K(t-s)\|^2\, \mathbb{E}[|b(\hat{X}_s^K)|^2]\, ds + 3 \int_0^t \|\hat{G}_2^K(t-s)\|^2\, \mathbb{E}[\|\sigma(\hat{X}_s^K)\|^2]\, ds.$$
On the one hand, we use that $|b(x)| \le |b(0)| + L|x|$ and $\|\sigma(x)\| \le \|\sigma(0)\| + L|x|$. On the other hand, we get from (H2) and (3.2) that $\|\hat{G}_j^K(t)\|^2 \le 2\big(\bar{\Delta}(t)^2 + \|G_j^K(t)\|^2\big) \le 2\big(\bar{\Delta}(t)^2 + \bar{M}_j^2\, \bar{G}(t)^2\big)$. Since $\int_0^T \bar{\Delta}(t)^2 + \bar{G}(t)^2\, dt < \infty$, this leads to the existence of a constant $C \in \mathbb{R}_+^*$ that depends on $|x|$, $T$, $|b(0)|$, $\|\sigma(0)\|$, $L$, $\bar{M}_1$ and $\bar{M}_2$ such that
$$\mathbb{E}[|\hat{X}_t^K|^2] \le C + C \int_0^t \big(\bar{\Delta}(t-s)^2 + \bar{G}(t-s)^2\big)\, \mathbb{E}[|\hat{X}_s^K|^2]\, ds.$$
For $c > 0$, let $(\tilde{E}_c(t), t \in [0,T])$ be defined as the solution of the equation
$$\tilde{E}_c(t) = \bar{\Delta}(t)^2 + \bar{G}(t)^2 + \int_0^t c\big(\bar{\Delta}(t-s)^2 + \bar{G}(t-s)^2\big)\, \tilde{E}_c(s)\, ds. \qquad (3.7)$$
Since $\bar{\Delta}^2 + \bar{G}^2 \in L^1((0,T), \mathbb{R}_+)$, we get that $\tilde{E}_c(t)$ is well defined and belongs also to $L^1((0,T), \mathbb{R}_+)$ by applying the results of Subsection A.3 of [1] and Theorem 2.3.1 of [20] to the kernel $\mathbf{1}_{(0,T)}(t)\big[\bar{\Delta}(t)^2 + \bar{G}(t)^2\big]$. We then get from [1, Lemma A.4] or [20, Lemma 9.8.2]
$$\forall t \in [0,T], \quad \mathbb{E}[|\hat{X}_t^K|^2] \le C \Big(1 + C \int_0^T \tilde{E}_C(t)\, dt\Big),$$
which gives the claim. □

Proposition 3.2.
Let $T > 0$. Suppose that for any $K > 0$, there are kernels $\hat{G}_1^K, \hat{G}_2^K : [0,T] \to \mathcal{M}_d(\mathbb{R})$ such that (H2) holds. Then, there is a constant $C$ (depending on $|x|$, $T$, $|b(0)|$, $\|\sigma(0)\|$, $L$, $\bar{M}_1$ and $\bar{M}_2$) such that
$$\forall t \in [0,T], \quad \mathbb{E}\big[|\hat{X}_t^K - X_t^K|^2\big] \le C \Big(\int_0^t \big[\|\hat{\Delta}_1^K(s)\|^2 + \|\hat{\Delta}_2^K(s)\|^2\big]\, ds\Big).$$

Proof.
We repeat the same arguments as in the proof of Proposition 3.1 and get
$$\mathbb{E}\big[|\hat{X}_t^K - X_t^K|^2\big] \le 4t \int_0^t \|\hat{\Delta}_1^K(t-s)\|^2\, \mathbb{E}[|b(\hat{X}_s^K)|^2]\, ds + 4 \int_0^t \|\hat{\Delta}_2^K(t-s)\|^2\, \mathbb{E}[\|\sigma(\hat{X}_s^K)\|^2]\, ds + 4t \int_0^t \|G_1^K(t-s)\|^2\, \mathbb{E}\big[|b(\hat{X}_s^K) - b(X_s^K)|^2\big]\, ds + 4 \int_0^t \|G_2^K(t-s)\|^2\, \mathbb{E}\big[\|\sigma(\hat{X}_s^K) - \sigma(X_s^K)\|^2\big]\, ds.$$
From Lemma 3.2, we get the existence of a constant $C \in \mathbb{R}_+^*$ such that $\sup_{K>0} \sup_{t \in [0,T]} \mathbb{E}[|\hat{X}_t^K|^2] \le C$. Then, we set, similarly as in the proof of Proposition 3.1, $c_0 := 8(T \vee 1)\big(|b(0)|^2 \vee \|\sigma(0)\|^2 + L^2 C\big)$, $c_1 := 4L^2(\bar{M}_1^2 T + \bar{M}_2^2)$, and we get
$$\mathbb{E}\big[|\hat{X}_t^K - X_t^K|^2\big] \le c_0 \int_0^t \|\hat{\Delta}_1^K(s)\|^2 + \|\hat{\Delta}_2^K(s)\|^2\, ds + c_1 \int_0^t \bar{G}(t-s)^2\, \mathbb{E}\big[|\hat{X}_s^K - X_s^K|^2\big]\, ds.$$
Hence, we use the generalized Gronwall Lemma (see e.g. [20, Lemma 9.8.2]) to get
$$\mathbb{E}\big[|\hat{X}_t^K - X_t^K|^2\big] \le c_0 \Big(1 + c_1 \int_0^T E_{c_1}(s)\, ds\Big)\Big(\int_0^t \big[\|\hat{\Delta}_1^K(s)\|^2 + \|\hat{\Delta}_2^K(s)\|^2\big]\, ds\Big),$$
where $E_{c_1}$ is defined by (3.4). □

Combining Propositions 3.1 and 3.2, we easily obtain our main result.
Theorem 3.1.
Let us assume that $\lambda$ satisfies (H1) and that (H2) holds. Then, there exists a constant $C \in \mathbb{R}_+^*$ such that
$$\forall t \in [0,T], \quad \mathbb{E}\big[|X_t - \hat{X}_t^K|^2\big] \le C \Big(r(K) + \int_0^t \big[\|\hat{\Delta}_1^K(s)\|^2 + \|\hat{\Delta}_2^K(s)\|^2\big]\, ds\Big).$$
The term $r(K)$ and the integral in the right-hand side correspond respectively to the truncation and the discretization error. When using a Riemann discretization, we get the following general result.

Corollary 3.1.
Let us assume that $\lambda$ satisfies (H1), and that the functions $M_j : \mathbb{R}_+ \to \mathcal{M}_d(\mathbb{R})$ are Lipschitz continuous:
$$\exists \bar{L} > 0, \ \forall j \in \{1,2\}, \ \forall \rho, \rho' \ge 0, \quad \|M_j(\rho) - M_j(\rho')\| \le \bar{L}\, |\rho - \rho'|.$$
Let $n \in \mathbb{N}^*$, $I_{i,n}^K = \big[\frac{i-1}{n}K, \frac{i}{n}K\big)$ for $1 \le i \le n$ and $\rho_{i,n}^K \in I_{i,n}^K$. Let us define the kernels
$$\hat{G}_j^K(t) = \sum_{i=1}^n \lambda\big(I_{i,n}^K\big)\, M_j(\rho_{i,n}^K)\, e^{-\rho_{i,n}^K t}, \quad j \in \{1,2\},$$
that correspond to the measure
$$\hat{\lambda}(d\rho) = \sum_{i=1}^n \lambda\big(I_{i,n}^K\big)\, \delta_{\rho_{i,n}^K}(d\rho). \qquad (3.8)$$
Then, there exists a constant $C \in \mathbb{R}_+^*$ such that for $n \ge K \lambda([0,K))$, we have
$$\forall t \in [0,T], \quad \mathbb{E}\big[|X_t - \hat{X}_t^K|^2\big] \le C \Big(r(K) + \frac{K^2}{n^2}\, \lambda([0,K))^2\Big).$$
This corollary indicates the theoretical optimal choice for $n$ when $K \to +\infty$. Namely, one has to take $n$ proportional to $\frac{K \lambda([0,K))}{\sqrt{r(K)}}$ in order to equalize both terms, i.e. the error due to the truncation and the one due to the approximation.
Proof.
We have $\hat{G}_j^K(t) - G_j^K(t) = \sum_{i=1}^n \int_{I_{i,n}^K} \big[M_j(\rho_{i,n}^K)\, e^{-\rho_{i,n}^K t} - M_j(\rho)\, e^{-\rho t}\big]\, \lambda(d\rho)$. From the triangular inequality, we get for $t \in [0,T]$ and $\rho \in I_{i,n}^K$:
$$\|M_j(\rho_{i,n}^K)\, e^{-\rho_{i,n}^K t} - M_j(\rho)\, e^{-\rho t}\| \le \|M_j(\rho_{i,n}^K) - M_j(\rho)\|\, e^{-\rho_{i,n}^K t} + \|M_j(\rho)\|\, |e^{-\rho_{i,n}^K t} - e^{-\rho t}| \le (\bar{L} + \bar{M}_j t)\, |\rho - \rho_{i,n}^K| \le (\bar{L} + \bar{M}_j T)\, \frac{K}{n}.$$
This yields $\|\hat{G}_j^K(t) - G_j^K(t)\| \le (\bar{L} + \bar{M}_j T)\, \lambda([0,K))\, \frac{K}{n}$, for any $t \in [0,T]$. In particular, (H2) holds with a constant function $\bar{\Delta}$ for $n \ge K \lambda([0,K))$. We can thus apply Theorem 3.1 and get the result. □
Corollary 3.1 gives a general result on the approximation of SVE by SDE. Obviously, it is possible to derive many variations and refinements of this result by assuming more regularity on the functions $M_j$ or on the measure $\lambda$. In the next section, we investigate some of these refinements when $\lambda$ is given by (2.8).

4. More approximation results for the rough kernels
Let us start by applying the result of Corollary 3.1 to the measure $\lambda$ defined in Equation (2.8). We have $\lambda([0,K)) = \frac{c_H}{1/2 - H}\, K^{1/2-H}$ and $r(K) = O(K^{-2H})$ by Lemma 3.1, which gives
$$\forall t \in [0,T], \quad \mathbb{E}\big[|X_t - \hat{X}_t^K|^2\big] \le C \Big(K^{-2H} + \frac{K^{3-2H}}{n^2}\Big).$$
By taking $n = K^{3/2}$, or equivalently $K = n^{2/3}$, we get
$$\mathbb{E}\big[|X_t - \hat{X}_t^K|^2\big] =_{n \to \infty} O\big(n^{-2H \times \frac{2}{3}}\big). \qquad (4.1)$$
Let us recall that $n$ is the number of points weighted by the approximating measure $\hat{\lambda}$. By Proposition 2.1, $n$ scales as the dimension of the SDE that approximates the SVE, and therefore as the computation time needed to simulate the SDE. The goal of this section is to improve this rate by assuming more regularity on the functions $M_j$.
To get a better approximation, we assume more regularity on the functions $M_1$ and $M_2$. To approximate $G_j^K(t) = \int_0^K e^{-\rho t} M_j(\rho)\, c_H \rho^{-H-1/2}\, d\rho$, we use the same type of approximation on $[0, K^\beta]$ with $0 < \beta < 1$, and Simpson's rule on $[K^\beta, K]$, with $K > 1$.

Proposition 4.1.
Suppose that $\lambda$ is given by (2.8). Let us assume that the functions $M_1$ and $M_2$ are $C^4$ with bounded derivatives. Let $\beta \in (0,1)$ and $\hat{G}_j^K(t) = \int_{\mathbb{R}_+} e^{-\rho t} M_j(\rho)\, \hat{\lambda}_S(d\rho)$ with
$$\hat{\lambda}_S(d\rho) = \sum_{i=1}^n \lambda\big(I_{i,n}^{K^\beta}\big)\, \delta_{\rho_{i,n}^{K^\beta}} + \frac{c_H (K - K^\beta)}{6n} \sum_{i=1}^n \Big[(\rho_{i,n,0}^K)^{-H-1/2}\, \delta_{\rho_{i,n,0}^K} + 4\, (\rho_{i,n,1}^K)^{-H-1/2}\, \delta_{\rho_{i,n,1}^K} + (\rho_{i,n,2}^K)^{-H-1/2}\, \delta_{\rho_{i,n,2}^K}\Big],$$
where $I_{i,n}^{K^\beta} = \big[\frac{i-1}{n} K^\beta, \frac{i}{n} K^\beta\big)$, $\rho_{i,n}^{K^\beta} \in I_{i,n}^{K^\beta}$,
$$\rho_{i,n,0}^K = K^\beta + \frac{(i-1)(K - K^\beta)}{n}, \quad \rho_{i,n,1}^K = K^\beta + \frac{(2i-1)(K - K^\beta)}{2n} \quad \text{and} \quad \rho_{i,n,2}^K = K^\beta + \frac{i(K - K^\beta)}{n}.$$
With $\beta = \frac{10-6H}{13-6H}$ and $K \sim n^{\frac{13-6H}{15-6H}}$, there exists a constant $C \in \mathbb{R}_+^*$ such that
$$\forall t \in [0,T], \quad \mathbb{E}\big[|\hat{X}_t^K - X_t|^2\big] \le C\, n^{-2H \times \frac{13-6H}{15-6H}}.$$
We clearly have $\frac{2}{3} \le \frac{13-6H}{15-6H}$ for $H \in (0,1/2)$, and notice that $\hat{\lambda}_S$ weights $3n + 1 = O(n)$ different points. Thus, the approximation given by Proposition 4.1 is asymptotically better than the one given by Corollary 3.1.

Proof.
We aim at applying Theorem 3.1. We have
$$\hat{G}_j^K(t) - G_j^K(t) = \sum_{i=1}^n \int_{I_{i,n}^{K^\beta}} \Big[M_j(\rho_{i,n}^{K^\beta})\, e^{-\rho_{i,n}^{K^\beta} t} - M_j(\rho)\, e^{-\rho t}\Big]\, \lambda(d\rho) - \int_{K^\beta}^K M_j(\rho)\, e^{-\rho t}\, c_H \rho^{-H-1/2}\, d\rho + \frac{c_H (K - K^\beta)}{6n} \sum_{i=1}^n \Big[(\rho_{i,n,0}^K)^{-H-1/2} M_j(\rho_{i,n,0}^K)\, e^{-\rho_{i,n,0}^K t} + 4\, (\rho_{i,n,1}^K)^{-H-1/2} M_j(\rho_{i,n,1}^K)\, e^{-\rho_{i,n,1}^K t} + (\rho_{i,n,2}^K)^{-H-1/2} M_j(\rho_{i,n,2}^K)\, e^{-\rho_{i,n,2}^K t}\Big].$$
The norm of the first sum can be upper bounded by $O\big(\lambda([0,K^\beta))\, \frac{K^\beta}{n}\big) = O\big(\frac{K^{\beta(3/2-H)}}{n}\big)$, as in the proof of Corollary 3.1. For the other terms, we work componentwise and may assume w.l.o.g. that $M_j$ is real-valued. Let $\psi_t(\rho) = c_H M_j(\rho)\, \rho^{-H-1/2}\, e^{-\rho t}$. The well-known convergence result on Simpson's rule (see e.g. [22], p. 339) allows us to upper bound the norm of the other terms by
$$n\, \frac{\sup_{\rho \in [K^\beta, K]} |\psi_t^{(4)}(\rho)|}{90} \Big(\frac{K - K^\beta}{2n}\Big)^5.$$
We get that $\sup_{t \in [0,T]} \sup_{\rho \in [K^\beta, K]} |\psi_t^{(4)}(\rho)| = O(K^{-\beta(H+1/2)})$ by using that the derivatives of $M_j$ are bounded and $0 \le e^{-\rho t} \le 1$. This leads to
$$\forall t \in [0,T], \quad \|\hat{G}_j^K(t) - G_j^K(t)\| \le C \Big(\frac{K^{\beta(3/2-H)}}{n} + \frac{K^{5 - \beta(H+1/2)}}{n^4}\Big). \qquad (4.2)$$
Note that (H2) is then satisfied for $n \ge \max\big(K^{\beta(3/2-H)}, K^{[5 - \beta(H+1/2)]/4}\big)$. Then, by Theorem 3.1 and Lemma 3.1, we get
$$\forall t \in [0,T], \quad \sqrt{\mathbb{E}\big[|\hat{X}_t^K - X_t|^2\big]} \le C \Big(K^{-H} + \frac{K^{\beta(3/2-H)}}{n} + \frac{K^{5 - \beta(H+1/2)}}{n^4}\Big).$$
By taking $\beta = \frac{10-6H}{13-6H}$ and $K \sim n^{\frac{13-6H}{15-6H}}$, we equalize the three terms and get the claim. □
We can now go further and use higher-order numerical integration algorithms such as the Newton–Cotes method, which for any even number $J \in \mathbb{N}$ and any smooth function $f : [a,b] \to \mathbb{R}$ gives (see e.g. [22, Theorem 1, p. 310])
$$\int_a^b f(x)\, dx = (b-a) \sum_{j=0}^J c_j^J\, f\Big(a + j\, \frac{b-a}{J}\Big) + \tilde{c}_J\, (b-a)^{J+3}\, f^{(J+2)}(\xi), \quad \text{with } \xi \in (a,b),$$
where the coefficients $(c_j^J)_{0 \le j \le J}$ and $\tilde{c}_J$ are known explicitly. We recover Simpson's rule by taking $J = 2$. Hence, one can use the Newton–Cotes method on the interval $[K^\beta, K]$. This leads to a new measure
$$\hat{\lambda}_{NC}(d\rho) = \sum_{i=1}^n \lambda\big(I_{i,n}^{K^\beta}\big)\, \delta_{\rho_{i,n}^{K^\beta}} + \frac{c_H (K - K^\beta)}{n} \sum_{i=1}^n \sum_{j=0}^J c_j^J\, (\rho_{i,n,j}^{K,J})^{-H-1/2}\, \delta_{\rho_{i,n,j}^{K,J}} \qquad (4.3)$$
with $\rho_{i,n,j}^{K,J} = K^\beta + \frac{K - K^\beta}{n}\big(i - 1 + \frac{j}{J}\big)$.
Proposition 4.2.
Suppose that $\lambda$ is given by (2.8). Let us assume that the functions $M_1$ and $M_2$ are $C^\infty$ with bounded derivatives. Let $J \in \mathbb{N}$ be even and $\hat{G}_j^K(t) = \int_{\mathbb{R}_+} e^{-\rho t} M_j(\rho)\, \hat{\lambda}_{NC}(d\rho)$ with $\hat{\lambda}_{NC}$ defined by (4.3). With $\beta = \frac{2(J+3) - 2(J+1)H}{3J+7 - 2(J+1)H}$ and $K \sim n^{\frac{3J+7 - 2(J+1)H}{3J+9 - 2(J+1)H}}$, there exists a constant $C \in \mathbb{R}_+^*$ such that
$$\forall t \in [0,T], \quad \mathbb{E}\big[|\hat{X}_t^K - X_t|^2\big] \le C\, n^{-2H \times \frac{3J+7-2(J+1)H}{3J+9-2(J+1)H}}.$$
For any $\varepsilon \in (0,1)$, there exists $J$ such that $\sup_{t \in [0,T]} \mathbb{E}\big[|\hat{X}_t^K - X_t|^2\big] = O\big(n^{-2H \times (1-\varepsilon)}\big)$.
We note that we get back Proposition 4.1 in the case $J = 2$.

Proof.
We follow the same arguments as in the proof of Proposition 4.1. The terms corresponding to the Newton–Cotes method can be upper bounded by $|\tilde{c}_J| \sup_{\rho \in [K^\beta, K]} |\psi_t^{(J+2)}(\rho)|\, \frac{(K-K^\beta)^{J+3}}{n^{J+2}}$, that is uniformly $O\big(\frac{K^{J+3-\beta(H+1/2)}}{n^{J+2}}\big)$ in $t \in [0,T]$. We get
$$\forall t \in [0,T], \quad \|\hat{G}_j^K(t) - G_j^K(t)\| \le C \Big(\frac{K^{\beta(3/2-H)}}{n} + \frac{K^{J+3-\beta(H+1/2)}}{n^{J+2}}\Big),$$
and then by Theorem 3.1 and Lemma 3.1, we obtain
$$\forall t \in [0,T], \quad \sqrt{\mathbb{E}\big[|\hat{X}_t^K - X_t|^2\big]} \le C \Big(K^{-H} + \frac{K^{\beta(3/2-H)}}{n} + \frac{K^{J+3-\beta(H+1/2)}}{n^{J+2}}\Big). \qquad (4.4)$$
With $\beta = \frac{2(J+3)-2(J+1)H}{3J+7-2(J+1)H}$ and $K \sim n^{\frac{3J+7-2(J+1)H}{3J+9-2(J+1)H}}$, the three terms are of the same order and we get the first claim. We get the second claim by noticing that $\frac{3J+7-2(J+1)H}{3J+9-2(J+1)H} \to 1$ as $J \to +\infty$. □
In dimension $d = 1$ with $M_1 = M_2 \equiv 1$, it is possible to take a particular value for $\rho_{i,n}^K$ in $I_{i,n}^K$ that improves the rate of convergence. This is stated in the next proposition.
Proposition 4.3.
Let us assume that $d = 1$ and $M_1 = M_2 \equiv 1$. Let us define $\rho_{i,n}^K = \frac{\int_{I_{i,n}^K} \rho\, \lambda(d\rho)}{\lambda(I_{i,n}^K)}$.
(1) Let $\hat{\lambda}(d\rho)$ be defined by (3.8) with these particular values for $\rho_{i,n}^K$. Then, the approximation $\hat{G}_j^K(t) = \int e^{-\rho t}\, \hat{\lambda}(d\rho)$ with $K \sim n^{4/5}$ leads to
$$\exists C > 0, \ \forall t \in [0,T], \quad \mathbb{E}\big[|\hat{X}_t^K - X_t|^2\big] \le C\, n^{-2H \times \frac{4}{5}}.$$
(2) Let $\hat{\lambda}_{NC}(d\rho)$ be defined by (4.3) with these particular values for $\rho_{i,n}^{K^\beta}$. Then, the approximation $\hat{G}_j^K(t) = \int e^{-\rho t}\, \hat{\lambda}_{NC}(d\rho)$ with $\beta = \frac{4J+12-2HJ}{5J+12-2HJ}$ and $K = n^{\frac{4}{5\beta + 2H(1-\beta)}}$ leads to
$$\exists C > 0, \ \forall t \in [0,T], \quad \mathbb{E}\big[|\hat{X}_t^K - X_t|^2\big] \le C\, n^{-2H \times \frac{5J+12-2HJ}{5J+15-2HJ}}.$$
In particular, for Simpson's rule ($\hat{\lambda}_S$), we get $\sup_{t \in [0,T]} \mathbb{E}\big[|\hat{X}_t^K - X_t|^2\big] = O\big(n^{-2H \times \frac{22-4H}{25-4H}}\big)$.
It is worth noticing that the rate of convergence with factor $\frac{4}{5}$ obtained in the first statement is the same as the one obtained by Abi Jaber and El Euch [1] on the kernels $G_j$ and their discrete approximating kernels $\hat{G}_j^K$. Here, we get in addition a strong estimation error on the processes with the same rate. Note that the factor $\frac{4}{5}$ improves the factor $\frac{2}{3}$ obtained in (4.1), when the values of $\rho_{i,n}^K$ are only assumed to be in $I_{i,n}^K$. Similarly, we notice that
$$\frac{3J+7-2(J+1)H}{3J+9-2(J+1)H} < \frac{5J+12-2HJ}{5J+15-2HJ} < 1,$$
which shows that the convergence rate is improved with respect to Proposition 4.2, but the factor still remains under 1.

Proof.
For the first assertion, we remark that
$$|G_j^K(t) - \hat{G}_j^K(t)| = \Big|\sum_{i=1}^n \int_{I_{i,n}^K} \big(e^{-\rho t} - e^{-\rho_{i,n}^K t}\big)\, \lambda(d\rho)\Big| \le \sum_{i=1}^n \Big|\int_{I_{i,n}^K} \big(e^{-\rho t} - e^{-\rho_{i,n}^K t}\big)\, \lambda(d\rho)\Big|.$$
From a Taylor expansion, we get
$$e^{-\rho t} - e^{-\rho_{i,n}^K t} = -t\, (\rho - \rho_{i,n}^K)\, e^{-\rho_{i,n}^K t} + \int_{\rho_{i,n}^K}^{\rho} t^2\, e^{-xt}\, (\rho - x)\, dx.$$
When integrating with respect to $\lambda$ over $I_{i,n}^K$, the first term vanishes by the definition of $\rho_{i,n}^K$, and we get
$$\Big|\int_{I_{i,n}^K} \big(e^{-\rho t} - e^{-\rho_{i,n}^K t}\big)\, \lambda(d\rho)\Big| = \Big|\int_{I_{i,n}^K} \int_{\rho_{i,n}^K}^{\rho} t^2\, e^{-xt}\, (\rho - x)\, dx\, \lambda(d\rho)\Big| \le t^2 \int_{I_{i,n}^K} \Big|\int_{\rho_{i,n}^K}^{\rho} |\rho - x|\, dx\Big|\, \lambda(d\rho) = \frac{t^2}{2} \int_{I_{i,n}^K} (\rho - \rho_{i,n}^K)^2\, \lambda(d\rho) \le \frac{t^2}{2}\, \frac{K^2}{n^2}\, \lambda\big(I_{i,n}^K\big),$$
since $\rho_{i,n}^K \in I_{i,n}^K$. Summing over $i$, we get
$$|G_j^K(t) - \hat{G}_j^K(t)| \le \frac{t^2}{2}\, \frac{K^2}{n^2}\, \lambda([0,K]). \qquad (4.5)$$
Thus, (H2) holds for $n \ge K \sqrt{\lambda([0,K])}$. By Theorem 3.1 and Lemma 3.1, we get the existence of $C \in \mathbb{R}_+^*$ such that
$$\forall t \in [0,T], \quad \mathbb{E}\big[|\hat{X}_t^K - X_t|^2\big] \le C \Big(K^{-2H} + \frac{K^{5-2H}}{n^4}\Big).$$
This leads to the claim with $K \sim n^{4/5}$.
For the proof of the second point, we use the result of the first point and repeat the arguments of the proof of Proposition 4.2. We thus get
$$\forall t \in [0,T], \quad \sqrt{\mathbb{E}\big[|\hat{X}_t^K - X_t|^2\big]} \le C \Big(K^{-H} + \frac{K^{\beta(5/2-H)}}{n^2} + \frac{K^{J+3-\beta(H+1/2)}}{n^{J+2}}\Big)$$
instead of (4.4). Taking $\beta = \frac{4J+12-2HJ}{5J+12-2HJ}$ and $K = n^{\frac{4}{5\beta+2H(1-\beta)}}$ makes the three terms of the same order and leads to the result. The case $J = 2$ corresponds to Simpson's rule. □

5. Numerical experiments
5.1. Validation of the theoretical results.
The aim of this section is to illustrate the different convergence rates on a very simple example for the rough kernel (2.8). Namely, we take $b(x) = 0$, $\sigma(x) = 1$, which means that
$$X_t = X_0 + \frac{1}{\Gamma(H+1/2)} \int_0^t (t-s)^{H-1/2}\, dW_s.$$
For this process, we have implemented the four following approximations.
(1) $\hat{X}_t^{1,n}$, the approximation given by Corollary 3.1 with $K = n^{2/3}$ and $\rho_{i,n}^K = \frac{i-1/2}{n} K$. From (4.1), the theoretical rate of convergence is $\mathbb{E}[|\hat{X}_t^{1,n} - X_t|^2] = O(n^{-2H \times \frac{2}{3}})$.
(2) $\hat{X}_t^{2,n}$, the approximation given by Proposition 4.3 with $\hat{\lambda}$ and $K = n^{4/5}$. The theoretical rate of convergence is $\mathbb{E}[|\hat{X}_t^{2,n} - X_t|^2] = O(n^{-2H \times \frac{4}{5}})$.
(3) $\hat{X}_t^{3,n}$, the approximation given by Proposition 4.1 with $K = n^{\frac{13-6H}{15-6H}}$ and $\rho_{i,n}^{K^\beta} = \frac{i-1/2}{n} K^\beta$. The theoretical rate of convergence is $\mathbb{E}[|\hat{X}_t^{3,n} - X_t|^2] = O(n^{-2H \times \frac{13-6H}{15-6H}})$.
(4) $\hat{X}_t^{4,n}$, the approximation given by Proposition 4.3 with $\hat{\lambda}_S$ and $K = n^{\frac{22-4H}{25-4H}}$. The theoretical rate of convergence is $\mathbb{E}[|\hat{X}_t^{4,n} - X_t|^2] = O(n^{-2H \times \frac{22-4H}{25-4H}})$.
Note that for $0 \le \rho_1 < \cdots < \rho_n$, it is possible to simulate exactly the Gaussian vector
$$\Big(\int_0^t \exp(-\rho_1(t-s))\, dW_s, \ldots, \int_0^t \exp(-\rho_n(t-s))\, dW_s, \frac{1}{\Gamma(H+1/2)} \int_0^t (t-s)^{H-1/2}\, dW_s\Big).$$
It is centered with covariance matrix $\Sigma$ such that
$$\Sigma_{i,j} = \frac{1 - \exp(-(\rho_i + \rho_j)t)}{\rho_i + \rho_j} \ \text{ for } 1 \le i,j \le n, \quad \Sigma_{n+1,n+1} = \frac{t^{2H}}{2H\, \Gamma(H+1/2)^2}, \qquad (5.1)$$
$$\Sigma_{i,n+1} = \frac{\rho_i^{-H-1/2}}{\Gamma(H+1/2)} \int_0^{\rho_i t} s^{H-1/2}\, e^{-s}\, ds.$$
The last quantity involves the incomplete gamma function, which can be calculated efficiently. For each $j \in \{1, \ldots, 4\}$, we have calculated, using the following basic lemma, the quantity
$$\zeta_t^{j,n} := \mathbb{E}\big[|\hat{X}_t^{j,n} - X_t|^2\big].$$
We reported the obtained results in Tables 1–4.
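Approximations (2) and (4) above use the barycentric nodes $\rho_{i,n}^K = \int_{I_{i,n}^K} \rho\, \lambda(d\rho)/\lambda(I_{i,n}^K)$ of Proposition 4.3, which are available in closed form for $\lambda_H$. A stdlib-only sketch (parameter values are illustrative), with a check of the property that makes these nodes special: the discrete measure matches the first moment of $\lambda_H$ on $[0,K]$ exactly, which is what kills the first-order Taylor term in the proof of Proposition 4.3.

```python
import math

def barycentric_measure(K, n, H):
    """Weights lambda(I_i) and nodes rho_i = int_{I_i} rho lambda(drho) / lambda(I_i)
    of Proposition 4.3, in closed form for the fractional measure lambda_H."""
    c_H = 1.0 / (math.gamma(H + 0.5) * math.gamma(0.5 - H))
    p, q = 0.5 - H, 1.5 - H
    wts, nodes = [], []
    for i in range(1, n + 1):
        a, b = (i - 1) * K / n, i * K / n
        m0 = (b ** p - a ** p) / p      # int_{I_i} rho^{-H-1/2} drho
        m1 = (b ** q - a ** q) / q      # int_{I_i} rho^{1/2-H} drho
        wts.append(c_H * m0)
        nodes.append(m1 / m0)
    return wts, nodes

H, K, n = 0.2, 5.0, 50                  # illustrative parameters
wts, nodes = barycentric_measure(K, n, H)
# each node lies in its interval, and the first moment of lambda_H on [0, K]
# is matched exactly: sum_i w_i rho_i = c_H K^{3/2-H} / (3/2 - H)
assert all((i * K / n) >= r >= ((i - 1) * K / n) for i, r in enumerate(nodes, 1))
c_H = 1.0 / (math.gamma(0.7) * math.gamma(0.3))
assert abs(sum(w * r for w, r in zip(wts, nodes)) - c_H * K ** 1.3 / 1.3) < 1e-9
```

The weights are the exact interval masses, so only the node placement differs from the midpoint choice of approximation (1); this is what upgrades the rate factor from $2/3$ to $4/5$.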
Lemma 5.1.
Let $0 \le \rho_1 < \dots < \rho_n$ and $\alpha_1, \dots, \alpha_n \in \mathbb{R}$. Then,
$$\sum_{i=1}^n \alpha_i\int_0^t \exp(-\rho_i(t-s))\,dW_s - \frac{1}{\Gamma(H+1/2)}\int_0^t (t-s)^{H-1/2}\,dW_s$$
is a centered Gaussian random variable with variance
$$\int_0^t \Big(\frac{(t-s)^{H-1/2}}{\Gamma(H+1/2)} - \sum_{i=1}^n \alpha_i\exp(-\rho_i(t-s))\Big)^2 ds = v^\top\Sigma v,$$
where $\Sigma$ is defined by (5.1) and $v \in \mathbb{R}^{n+1}$ is defined by $v_i = \alpha_i$ for $1 \le i \le n$ and $v_{n+1} = -1$.

We have calculated $\zeta_t^{j,n}$ with $n = 50$ and $n = 100$ for $j \in \{1, 2\}$, and with $n = 16$ and $n = 32$ for $j \in \{3, 4\}$. Since the measure $\hat\lambda_S$ weights $3n + 1$ points, this corresponds to approximating with SDEs of dimension 49 and 97, making the comparison with the case $j \in \{1, 2\}$ relevant. We have also calculated
$$\hat\gamma_t^{j,n} = \frac{1}{2H\log(2)}\log\big(\zeta_t^{j,n}/\zeta_t^{j,2n}\big)$$
as a numerical estimation of the speed-of-convergence factor. Indeed, if we had $\mathbb{E}[|\hat X_t^{j,n} - X_t|^2] \sim_{n\to\infty} cn^{-2H\times\gamma}$ for some constants $c, \gamma > 0$, then $\hat\gamma_t^{j,n}$ would estimate the factor $\gamma$. In our work, we have obtained $\mathbb{E}[|\hat X_t^{j,n} - X_t|^2] =_{n\to\infty} O(n^{-2H\times\gamma})$, and we have reported this theoretical value of $\gamma$ in the tables below.

[Table 1: convergence results for the first approximation, with $n = 50$ (columns $H = 0.45$, $0.25$, $0.05$; rows $\zeta_t^{1,n}$, $\zeta_t^{1,2n}$, $\hat\gamma_t^{1,n}$ and the theoretical factor $\gamma$).]
[Table 2: convergence results for the second approximation, with $n = 50$ (same layout, for $\zeta_t^{2,n}$, $\zeta_t^{2,2n}$, $\hat\gamma_t^{2,n}$).]
[Table 3: convergence results for the third approximation, with $n = 16$ (same layout, for $\zeta_t^{3,n}$, $\zeta_t^{3,2n}$, $\hat\gamma_t^{3,n}$).]
[Table 4: convergence results for the fourth approximation, with $n = 16$ (same layout, for $\zeta_t^{4,n}$, $\zeta_t^{4,2n}$, $\hat\gamma_t^{4,n}$).]
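The estimator $\hat\gamma_t^{j,n}$ can be sanity-checked on synthetic errors that follow the model $\zeta^n = c\,n^{-2H\gamma}$ exactly, in which case it recovers $\gamma$; a minimal sketch (the function name is ours):

```python
import numpy as np

def gamma_hat(zeta_n, zeta_2n, H):
    """Empirical speed-of-convergence factor: if zeta_n ~ c * n^(-2H*gamma),
    then log(zeta_n / zeta_2n) / (2H log 2) estimates gamma."""
    return np.log(zeta_n / zeta_2n) / (2.0 * H * np.log(2.0))
```

Applied to the measured pairs $(\zeta_t^{j,n}, \zeta_t^{j,2n})$, this yields the $\hat\gamma_t^{j,n}$ values reported in the tables.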
From these numerical results, we observe the following facts:
• For each method, the quality of the approximation degrades as $H$ gets closer to 0. For $H = 0.05$, even if we observe empirical rates of convergence that are in line with our theoretical results, the approximation error is around 2 for all methods, which is clearly too large for practical use. The next subsection presents significant improvements on this issue.
• We notice that the numerical estimation of the speed-of-convergence factor is always above the theoretical value of $\gamma$. These values coincide quite well for the one-dimensional methods (2nd and 4th methods) and, for all methods, in the case $H = 0.05$. For the approximations 1 and 3 and the values $H = 0.45$ and $0.25$, the theoretical value of the speed-of-convergence factor seems to be slightly pessimistic.
• The improvement due to the particular choice of $\rho_{i,n}^K$ in dimension 1 is significant. The values of $\zeta^{2,n}$ and $\zeta^{2,2n}$ (resp. $\zeta^{4,n}$ and $\zeta^{4,2n}$) are significantly smaller than those of $\zeta^{1,n}$ and $\zeta^{1,2n}$ (resp. $\zeta^{3,n}$ and $\zeta^{3,2n}$).
• The asymptotic acceleration of convergence obtained by Simpson's rule (i.e. by using approximation 3 (resp. 4) instead of 1 (resp. 2)) is not yet observed for these values of $n$: the approximation 1 (resp. 2) with $n = 50$ gives a slightly better result than approximation 3 (resp. 4) with $n = 16$.

5.2. Improvement of the approximations for the Rough kernel.
In practice, the method provided by truncating and discretizing the integral $\int_0^{+\infty} e^{-\rho t}M(\rho)\,\lambda(d\rho)$ is only partly satisfactory. Its advantage is that it is systematic, and it may lead to good rates of convergence when $\lambda(d\rho)$ has a thin tail and under smoothness assumptions. For the rough kernel, $\lambda_H(d\rho) = c_H\rho^{-H-1/2}\,d\rho$ is not smooth close to the origin and has fat tails, which makes the truncation error large. Thus, the convergences that we obtain in Section 4 are quite slow, especially when $H$ is close to zero. Here, we present a systematic way to correct this by truncating at a higher level.

A systematic approach.
The principle is the following. All the methods that we have presented consist in truncating the integral $\int_{\mathbb{R}_+} e^{-\rho t}\,\lambda_H(d\rho)$ at some level $K = n^{\gamma_H}$ for some $\gamma_H > 0$ and then discretizing it on $[0, K]$. Here, in addition, we take $A > 1$ and discretize also $[K, A^nK)$ by using the same discretization rule on each interval $[A^{i-1}K, A^iK)$ for $i = 1, \dots, n$. Since the size of these intervals does not go to zero, we do not expect to improve the asymptotic rate of convergence: the goal is rather to reduce the truncation error.
For simplicity, we present this idea only on the approximation $\hat\lambda$ given by Proposition 4.3. Namely, let $K > 0$ and, for $i \in \{1, \dots, 2n\}$,
$$I_{i,n}^{K,A} = \Big[\frac{i-1}{n}K, \frac{i}{n}K\Big) \ \text{ for } i \le n, \qquad I_{i,n}^{K,A} = \big[KA^{i-n-1}, KA^{i-n}\big) \ \text{ for } n+1 \le i \le 2n. \quad (5.2)$$
We then consider, for $i \le 2n$,
$$\rho_{i,n}^{K,A} = \frac{\int_{I_{i,n}^{K,A}} \rho\,\lambda_H(d\rho)}{\int_{I_{i,n}^{K,A}} \lambda_H(d\rho)},$$
which can be calculated exactly since
$$\frac{\int_{[a,b]} \rho\,\lambda_H(d\rho)}{\int_{[a,b]} \lambda_H(d\rho)} = \frac{1/2-H}{3/2-H}\times\frac{b^{3/2-H}-a^{3/2-H}}{b^{1/2-H}-a^{1/2-H}} \ \text{ for } 0 \le a < b.$$
Last, we define the corresponding approximating measure $\hat\lambda^A$ by
$$\hat\lambda^A(d\rho) = \sum_{i=1}^{2n}\lambda_H\big(I_{i,n}^{K,A}\big)\,\delta_{\rho_{i,n}^{K,A}}(d\rho), \quad (5.3)$$
and $\hat G^{K,A}(t) = \int_{\mathbb{R}_+} e^{-\rho t}\,\hat\lambda^A(d\rho)$. We have the following simple but interesting result.
Let $\hat\lambda^A(d\rho)$ be defined by (5.3), let $\hat\lambda(d\rho) = \sum_{i=1}^n \lambda_H(I_{i,n}^K)\,\delta_{\rho_{i,n}^K}(d\rho)$ be the measure introduced in Proposition 4.3 and let $\hat G^K(t) = \int_{\mathbb{R}_+} e^{-\rho t}\,\hat\lambda(d\rho)$. Then, we have
$$\hat G^K(t) \le \hat G^{K,A}(t) \le G_{\lambda_H}(t).$$
If $X$ (resp. $\hat X^{K,A}$) denotes the solution of $X_t = x + \int_0^t G_{\lambda_H}(t-s)b(X_s)\,ds + \int_0^t G_{\lambda_H}(t-s)\sigma(X_s)\,dW_s$ (resp. $\hat X_t^{K,A} = x + \int_0^t \hat G^{K,A}(t-s)b(\hat X_s^{K,A})\,ds + \int_0^t \hat G^{K,A}(t-s)\sigma(\hat X_s^{K,A})\,dW_s$), we have $\mathbb{E}[|\hat X_t^{K,A} - X_t|^2] = O(n^{-2H\times 4/5})$ if $K \sim_{n\to\infty} cn^{4/5}$ for some $c \in \mathbb{R}_+^*$.

Proof. The first inequality is obvious. The second one is a consequence of Jensen's inequality, which gives $\int_I e^{-\rho t}\,\lambda_H(d\rho) \ge \lambda_H(I)\exp\Big(-\frac{\int_I \rho\,\lambda_H(d\rho)}{\int_I \lambda_H(d\rho)}\,t\Big)$ on any interval $I$, since $\rho \mapsto e^{-\rho t}$ is a convex function. We then get $0 \le G_{\lambda_H}(t) - \hat G^{K,A}(t) \le G_{\lambda_H}(t) - \hat G^K(t)$ and thus $\int_0^T (G_{\lambda_H}(t) - \hat G^{K,A}(t))^2\,dt \le \int_0^T (G_{\lambda_H}(t) - \hat G^K(t))^2\,dt$ for any $T > 0$. This gives the rate of convergence by (4.5), Theorem 3.1 and Lemma 3.1. $\square$
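To illustrate the construction (5.2)–(5.3), here is a short sketch (assuming NumPy/SciPy; the function names are ours) that builds the nodes and weights of $\hat\lambda^A$ for the rough-kernel measure $\lambda_H(d\rho) = c_H\rho^{-H-1/2}\,d\rho$ and evaluates $\hat G^{K,A}$; the inequalities of Proposition 5.1 provide an immediate sanity check, since the output must stay below $G_{\lambda_H}(t) = t^{H-1/2}/\Gamma(H+1/2)$.

```python
import numpy as np
from scipy.special import gamma

def lambda_hat_A(H, K, n, A):
    """Nodes (barycenters) and weights of hat-lambda^A in (5.2)-(5.3) for
    lambda_H(d rho) = c_H rho^{-H-1/2} d rho, c_H = 1/(Gamma(H+1/2)Gamma(1/2-H))."""
    c_H = 1.0 / (gamma(H + 0.5) * gamma(0.5 - H))
    # interval edges: uniform on [0, K], then geometric K*A, ..., K*A^n
    edges = np.concatenate([np.linspace(0.0, K, n + 1), K * A ** np.arange(1, n + 1)])
    a, b = edges[:-1], edges[1:]
    e = 0.5 - H
    weights = c_H * (b ** e - a ** e) / e                 # lambda_H([a, b))
    # closed-form barycenter: (1/2-H)/(3/2-H) * (b^{3/2-H}-a^{3/2-H})/(b^{1/2-H}-a^{1/2-H})
    nodes = (e / (e + 1.0)) * (b ** (e + 1.0) - a ** (e + 1.0)) / (b ** e - a ** e)
    return nodes, weights

def G_hat(t, nodes, weights):
    """hat G^{K,A}(t) = sum_i w_i exp(-rho_i t), for an array of times t."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    return np.exp(-np.outer(t, nodes)) @ weights
```

One can then check numerically that $\hat G^K(t) \le \hat G^{K,A}(t) \le G_{\lambda_H}(t)$, the truncated kernel $\hat G^K$ corresponding to the first $n$ nodes only.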
Note that Proposition 5.1 gives the same asymptotic rate of convergence as Proposition 4.3. This is confirmed by our numerical experiments: we have indicated in Table 5 the $L^2$-errors obtained with $K = n^{4/5}$ and $A = 3$, and the estimated rate of convergence $\hat\gamma$ is close to the theoretical one of $4/5$. However, comparing with Table 2 (approximation by $\hat G^K$), we see that the error is significantly reduced: for $n = 50$ and $H = 0.05$, the squared error falls well below the value of about 2 observed without the additional intervals. Thus, if the rate of convergence is not improved with respect to the approximation given by $\hat\lambda$, the approximation given by $\hat\lambda^A$ significantly reduces the approximation error. This suggests that the kernel with the constant $A$ improves the multiplicative constant in the rate of convergence.

[Table 5: columns $H = 0.45$, $0.25$, $0.05$; rows: values of $\zeta_t^n$ for successive $n$, then $\hat\gamma := \frac{1}{2H\log(2)}\log(\zeta^n/\zeta^{2n})$ = 0.819, 0.841, 0.806, and the theoretical factor 0.8, 0.8, 0.8.]
Table 5. Convergence results for $\zeta_t^n = \mathbb{E}[|\hat X_t^{K,A} - X_t|^2]$, with $A = 3$ and $t = 1$.

Now, we discuss the choice of $A$. By Remark 3.1, $\int_0^T (G_{\lambda_H}(t) - \hat G^{K,A}(t))^2\,dt$ is a natural criterion to assess the quality of the approximation. Besides, we know by Lemma 5.1 that this quantity can be calculated easily. Thus, it is natural to find $A^*$ that minimizes $\int_0^T (G_{\lambda_H}(t) - \hat G^{K,A}(t))^2\,dt$. This can be done in practice by using a one-dimensional optimization routine.

[Figure 1: panels (a) $H = 0.45$, $n = 5$; (b) $H = 0.45$, $n = 10$; (c) $H = 0.25$, $n = 5$; (d) $H = 0.25$, $n = 10$; (e) $H = 0.05$, $n = 10$; (f) $H = 0.05$, $n = 40$.]
Figure 1.
Plots of $G_{\lambda_H}(t) = \frac{t^{H-1/2}}{\Gamma(H+1/2)}$ (black), $\hat G^{n^{4/5}}(t)$ (blue), $\hat G^{n^{4/5},A^*}(t)$ (red) and $\xi^*\hat G^{n^{4/5},A^*}(t)$ (magenta) for different values of $H$ and $n$.

[Table 6: for each $H$ and $n$, the $L^2$ errors $\sqrt{\int_0^T\big(\frac{t^{H-1/2}}{\Gamma(H+1/2)} - \xi^*\hat G^{n^{4/5},A_n^*}(t)\big)^2 dt}$ and $\sqrt{\int_0^T\big(\frac{t^{H-1/2}}{\Gamma(H+1/2)} - \hat G^{n,a^*}(t)\big)^2 dt}$.]
Table 6.
Table giving the value of the $L^2$ error for $\xi^*\hat G^{n^{4/5},A^*}$ and $\hat G^{n,a^*}$.

Last, once $A^*$ has been calculated, we still notice that we have $\hat G^{K,A^*}(t) \le G_{\lambda_H}(t)$ by Proposition 5.1. Therefore, there exists $\xi^* \ge 1$ that minimizes $\int_0^T (G_{\lambda_H}(t) - \xi\hat G^{K,A^*}(t))^2\,dt$, namely
$$\xi^* = \frac{\int_0^T G_{\lambda_H}(t)\,\hat G^{K,A^*}(t)\,dt}{\int_0^T \big(\hat G^{K,A^*}(t)\big)^2\,dt},$$
which can, similarly as in Lemma 5.1, be calculated exactly by means of the incomplete gamma function. Let us note that, with this last adjustment, the approximation $\xi^*\hat G^{K,A^*}$ is still completely monotone, which may be an interesting property to preserve.
Figure 1 illustrates, for different values of $H$, the different approximations of the rough kernel. It shows the interest of the progressive steps of our approximations, from $\hat G^{n^{4/5}}(t)$ to $\hat G^{n^{4/5},A^*}(t)$ and then to $\xi^*\hat G^{n^{4/5},A^*}(t)$. We first observe that the approximation $\hat G^{n^{4/5}}(t)$ provided by Proposition 4.3 is not accurate close to time zero, due to the truncation. For $H = 0.45$ (resp. $H = 0.25$), the approximations $\hat G^{n^{4/5},A^*}(t)$ and $\xi^*\hat G^{n^{4/5},A^*}(t)$ are quite perfect for $n = 5$ (resp. $n = 10$). For $H = 0.05$ and $n = 10$, one better observes the role of the parameter $\xi^*$, which shifts the approximation upward so that it crosses $G_{\lambda_H}$ at some optimal point to minimize the $L^2$ error. For $n = 40$, the approximation of the rough kernel is quite perfect.

A more heuristic approach.
For a fixed $n \in \mathbb{N}^*$, we are thus interested in minimizing, with respect to $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{R}^n$ and $0 \le \rho_1 < \dots < \rho_n$, the quantity
$$\int_0^T \Big(\frac{t^{H-1/2}}{\Gamma(H+1/2)} - \sum_{i=1}^n \alpha_i e^{-\rho_i t}\Big)^2 dt = v^\top\Sigma v,$$
where $\Sigma$ is the matrix defined by (5.1) and $v^\top = (\alpha_1, \dots, \alpha_n, -1)$. When $0 \le \rho_1 < \dots < \rho_n$ is fixed, the minimization in $\alpha$ is simply given by $\alpha(\rho) = ((\Sigma_{i,j})_{1\le i,j\le n})^{-1}(\Sigma_{n+1,i})_{1\le i\le n}$. We denote $v(\rho)^\top = (\alpha(\rho)^\top, -1)$ and minimize $\rho \mapsto v(\rho)^\top\Sigma v(\rho)$ on $0 \le \rho_1 < \dots < \rho_n$. To simplify, we restrict the minimization to $\rho_i(a) = a^{i-1-\kappa n}$ with $\kappa \in (0,1)$ fixed and then perform the one-dimensional minimization with respect to $a > 1$. In our simulation of Table 6, we have taken $\kappa = 0.2$. Heuristically, taking powers of $a$ for the values of $\rho_i$ enables to select different time-scales. We denote $\hat G^{n,a}(t) = \sum_{i=1}^n \alpha_i(\rho(a))\,e^{-\rho_i(a)t}$. We take as a comparison $\xi^*\hat G^{K,A^*}(t)$, defined as in Proposition 5.1 with $K = n^{4/5}$. We have indicated in Table 6 the $L^2$ errors provided by the two methods. Table 6 shows that the quality of the fit given by $\hat G^{n,a^*}$ and by $\xi^*\hat G^{n^{4/5},A^*}$ is of the same magnitude. Let us note that $\hat G^{n,a^*}$ (resp. $\xi^*\hat G^{n^{4/5},A^*}$) is a combination of $n$ (resp. $2n$) exponential functions. Therefore, the SDE associated to $\hat G^{n,a^*}$ has a dimension twice smaller than the one associated to $\xi^*\hat G^{n^{4/5},A^*}$, which is an interesting advantage. However, the approximation $\hat G^{n,a^*}$ is no longer completely monotone, which may be a property that one would like to preserve in some applications. Second, the values of $\alpha$ and $\rho$ defining $\hat G^{n,a^*}$ may take very large positive and negative values, leading to an SDE with large coefficients that is thus difficult to approximate. For $H = 0.45$ and $n = 5$, this is not the case, since the coefficients $\alpha_1, \dots, \alpha_5$ that we obtain are all of order one. However, for $H = 0.05$ and $n = 20$ or $n = 40$, we have positive and negative coefficients with very large absolute values. For this reason, the heuristic approach may lead to unstable results, especially for $H$ close to zero. We have noticed this kind of instability in some of our numerical tests when using this approximation for the rough Bergomi model. Thus, we recommend in general the use of $\xi^*\hat G^{n^{4/5},A^*}$.

5.3. Application to the Rough Bergomi model.
In this subsection, we give a practical application and consider the pricing of European call options in the rough Bergomi model. Namely, we consider a two-dimensional Brownian motion $W$ and the following dynamics:
$$S_t = S_0\exp\Big(\int_0^t \sqrt{\nu_s}\,\big(\rho\,dW_s^1 + \sqrt{1-\rho^2}\,dW_s^2\big) - \frac12\int_0^t \nu_s\,ds\Big), \quad (5.4)$$
$$\nu_t = \nu_0\exp\Big(\eta\sqrt{2H}\int_0^t (t-s)^{H-1/2}\,dW_s^1 - \frac{\eta^2}{2}t^{2H}\Big). \quad (5.5)$$
We first describe the algorithm of Bayer et al. [3]. It consists in discretizing the time interval $[0, T]$ with $N$ time steps. Thus, one has to simulate the Gaussian vector $\big(\int_0^{lT/N}(lT/N - s)^{H-1/2}\,dW_s^1, W_{lT/N}^1\big)_{l=1,\dots,N}$ by computing a Cholesky decomposition of the covariance matrix. Then, the values of $\nu_{lT/N}$ are sampled exactly, and one approximates $S$ with the following scheme, for $l \in \{1, \dots, N\}$:
$$\hat S_{\frac{lT}{N}} = \hat S_{\frac{(l-1)T}{N}}\exp\Big(\sqrt{\nu_{\frac{(l-1)T}{N}}}\Big(\rho\big(W_{\frac{lT}{N}}^1 - W_{\frac{(l-1)T}{N}}^1\big) + \sqrt{1-\rho^2}\big(W_{\frac{lT}{N}}^2 - W_{\frac{(l-1)T}{N}}^2\big)\Big) - \frac12\nu_{\frac{(l-1)T}{N}}\frac{T}{N}\Big).$$
Here, we furthermore approximate $\nu$ by using an approximation of the rough kernel. Namely, we use that
$$\sqrt{2H}\int_0^t (t-s)^{H-1/2}\,dW_s^1 = \sqrt{2H}\,\Gamma(H+1/2)\int_0^t G_{\lambda_H}(t-s)\,dW_s^1 \approx \sqrt{2H}\,\Gamma(H+1/2)\int_0^t \xi^*\hat G^{n^{4/5},A^*}(t-s)\,dW_s^1.$$
Since the approximation is a combination of exponential functions, we can simulate it exactly by the Gaussian vector $\big(\int_0^{lT/N}\exp(-\rho_i(lT/N - s))\,dW_s^1, W_{lT/N}^1\big)_{l\in\{1,\dots,N\},\,i}$, again by computing a Cholesky decomposition of the covariance matrix. Then, we define the following approximation of $\nu$ with $\bar c = \eta\sqrt{2H}\,\Gamma(H+1/2)\,\xi^*$:
$$\hat\nu_{\frac{lT}{N}} = \nu_0\exp\Big(\bar c\int_0^t \hat G^{n^{4/5},A^*}(t-s)\,dW_s^1 - \frac12\bar c^2\int_0^t \hat G^{n^{4/5},A^*}(t-s)^2\,ds\Big), \quad t = \frac{lT}{N}. \quad (5.6)$$
Note that the integral $\int_0^t \hat G^{n^{4/5},A^*}(t-s)^2\,ds$ can be easily calculated exactly. We notice that it is important in numerical applications to compute it, instead of using the compensator $\frac{\eta^2}{2}(lT/N)^{2H}$ of the exact kernel, which introduces some bias. This slight modification improves significantly the numerical results in approximating the smile curve.

Figure 2.
Implied volatility of the call option with strike $e^k$ obtained by the Monte Carlo estimator of $\mathbb{E}[(S_T - e^k)_+]$: the method of Bayer et al. [3] in blue, our proposed approximation in red. Respective 95% confidence intervals in yellow and green, delimited with dotted lines of the same color. Parameters: $S_0 = 1$.

The approximation that we propose is very close to the smile produced by the method proposed in [3], which shows its relevance. Note that, on this specific example, there is no particular advantage to using our approximation rather than the one of Bayer et al. [3], since everything can be sampled exactly. However, if one uses for the volatility a more involved Volterra SDE with the rough kernel, exact sampling is no longer possible, while our approximations can still be used since they correspond to a classical SDE in a higher dimension.
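The multi-factor scheme of this subsection can be sketched as follows (a hypothetical implementation, not the authors' code: the nodes and weights stand for a completely monotone approximation $\hat G(u) = \sum_i w_i e^{-\rho_i u}$ such as $\xi^*\hat G^{n^{4/5},A^*}$, and the factors $Y_i(t) = \int_0^t e^{-\rho_i(t-s)}\,dW_s^1$ are updated by an exact per-step recursion instead of one global Cholesky factorization over the whole path):

```python
import numpy as np
from scipy.special import gamma

def simulate_rough_bergomi(nodes, weights, S0, v0, eta, rho, H, T, N, n_paths, rng):
    """Simulates S_T in (5.4)-(5.5), the rough kernel being replaced by
    G_hat(u) = sum_i w_i exp(-rho_i u); hypothetical sketch, not the paper's code."""
    h = T / N
    m = len(nodes)
    cbar = eta * np.sqrt(2.0 * H) * gamma(H + 0.5)      # prefactor bar-c (taking xi* = 1)
    R = np.add.outer(nodes, nodes)
    # exact one-step covariance of (int_t^{t+h} e^{-rho_i(t+h-s)} dW^1_s, W^1_{t+h}-W^1_t)
    C = np.empty((m + 1, m + 1))
    C[:m, :m] = (1.0 - np.exp(-R * h)) / R
    C[:m, m] = C[m, :m] = (1.0 - np.exp(-nodes * h)) / nodes
    C[m, m] = h
    Lc = np.linalg.cholesky(C)
    decay = np.exp(-nodes * h)
    WW = np.outer(weights, weights)
    Y = np.zeros((n_paths, m))                          # Y_i(t) = int_0^t e^{-rho_i(t-s)} dW^1_s
    logS = np.full(n_paths, np.log(S0))
    for l in range(N):
        t = l * h                                       # left grid point
        var_t = np.sum(WW * (1.0 - np.exp(-R * t)) / R) # Var(int_0^t G_hat(t-s) dW^1_s)
        v = v0 * np.exp(cbar * (Y @ weights) - 0.5 * cbar ** 2 * var_t)
        Z = rng.standard_normal((n_paths, m + 1)) @ Lc.T
        dW1 = Z[:, m]
        dWperp = np.sqrt(h) * rng.standard_normal(n_paths)
        logS += np.sqrt(v) * (rho * dW1 + np.sqrt(1.0 - rho ** 2) * dWperp) - 0.5 * v * h
        Y = Y * decay + Z[:, :m]
    return np.exp(logS)
```

Since the variance compensator is computed from $\hat G$ itself, the scheme keeps $\mathbb{E}[\hat\nu_t] = \nu_0$ and $(S_t)$ a martingale, which is the bias correction discussed above.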
References [1] E. Abi Jaber and O. El Euch. Multifactor approximation of rough volatility models.
SIAM J. FinancialMath. , 10(2):309–349, 2019.[2] A. Alfonsi. High order discretization schemes for the CIR process: application to affine term structureand Heston models.
Math. Comp. , 79(269):209–237, 2010.[3] C. Bayer, P. Friz, and J. Gatheral. Pricing under rough volatility.
Quant. Finance , 16(6):887–904, 2016.[4] C. Bayer, P. K. Friz, A. Gulisashvili, B. Horvath, and B. Stemper. Short-time near-the-money skew inrough fractional volatility models.
Quant. Finance , 19(5):779–798, 2019.[5] D. Belomestny, S. H¨afner, T. Nagapetyan, and M. Urusov. Variance reduction for discretised diffusionsvia regression.
J. Math. Anal. Appl. , 458(1):393–418, 2018.[6] M. Ben Alaya and A. Kebaier. Central limit theorem for the multilevel Monte Carlo Euler method.
Ann.Appl. Probab. , 25(1):211–234, 2015.[7] M. Bennedsen, A. Lunde, and M. S. Pakkanen. Hybrid scheme for Brownian semistationary processes.
Finance Stoch. , 21(4):931–965, 2017.[8] M. A. Berger and V. J. Mizel. Volterra equations with Ito integrals - I.
J. Integral Equations , 2:187–245,1980.[9] M. A. Berger and V. J. Mizel. Volterra equations with Ito integrals - II.
J. Integral Equations , 2:319–337,1980.[10] P. Carmona and L. Coutin. Fractional Brownian motion and the Markov property.
Electron. Commun. Probab., 3:95–107, 1998.
[11] P. Carmona, L. Coutin, and G. Montseny. Approximation of some Gaussian processes.
Stat. Inference Stoch. Pro-cess. , 3(1-2):161–171, 2000.[12] W. G. Cochran, J.-S. Lee, and J. Potthoff. Stochastic Volterra equations with singular kernels.
StochasticProcesses Appl. , 56(2):337–349, 1995.[13] L. Coutin and L. Decreusefond. Stochastic Volterra equations with singular kernels. In
Stochastic analysis and mathematical physics, pages 39–50. Boston: Birkhäuser, 2001.
[14] L. Decreusefond. Regularity properties of some stochastic Volterra integrals with singular kernel.
PotentialAnal. , 16(2):139–149, 2002.[15] P. K. Friz, P. Gassiat, and P. Pigato. Short dated smile under rough volatility: asymptotics and numerics,2020.[16] M. Fukasawa. Short-time at-the-money skew and rough fractional volatility.
Quant. Finance , 17(2):189–198, 2017.[17] M. Fukasawa. Volatility has to be rough, 2020.[18] J. Gatheral, T. Jaisson, and M. Rosenbaum. Volatility is rough.
Quant. Finance , 18(6):933–949, 2018.[19] M. B. Giles. Multilevel Monte Carlo path simulation.
Oper. Res. , 56(3):607–617, 2008.[20] G. Gripenberg, S.-O. Londen, and O. Staffans.
Volterra integral and functional equations , volume 34 of
Encyclopedia of Mathematics and its Applications . Cambridge University Press, Cambridge, 1990.[21] P. Harms and D. Stefanovits. Affine representations of fractional processes with applications in mathe-matical finance.
Stochastic Processes Appl. , 129(4):1185–1228, 2019.[22] E. Isaacson and H. B. Keller.
Analysis of numerical methods. Dover Publications, Inc., New York, 1994. Corrected reprint of the 1966 original [Wiley, New York].
[23] B. Jourdain and J. Lelong. Robust adaptive importance sampling for normal random vectors. Ann. Appl. Probab., 19(5):1687–1718, 2009.
[24] S. Kusuoka. Approximation of expectation of diffusion processes based on Lie algebra and Malliavin calculus. In
Advances in mathematical economics. Vol. 6 , volume 6 of
Adv. Math. Econ., pages 69–83. Springer, Tokyo, 2004.
[25] V. Lemaire and G. Pagès. Unconstrained recursive importance sampling.
Ann. Appl. Probab., 20(3):1029–1067, 2010.
[26] V. Lemaire and G. Pagès. Multilevel Richardson–Romberg extrapolation.
Bernoulli , 23(4A):2643–2692,2017.[27] N. J. Newton. Variance reduction for simulated diffusions.
SIAM J. Appl. Math. , 54(6):1780–1805, 1994.[28] S. Ninomiya and N. Victoir. Weak approximation of stochastic differential equations and application toderivative pricing.
Appl. Math. Finance , 15(1-2):107–121, 2008. [29] E. Pardoux and P. Protter. Stochastic Volterra equations with anticipating coefficients.
Ann. Probab. ,18(4):1635–1655, 1990.[30] P. Protter. Volterra equations driven by semimartingales.
Ann. Probab. , 13:519–530, 1985.[31] A. Richard, X. Tan, and F. Yang. Discrete-time simulation of stochastic volterra equations, 2020.[32] Y. Shinozaki. Construction of a third-order K-scheme and its application to financial models.
SIAM J.Financial Math. , 8(1):901–932, 2017.[33] D. Talay and L. Tubaro. Expansion of the global error for numerical schemes solving stochastic differentialequations.
Stochastic Anal. Appl. , 8(4):483–509 (1991), 1990.[34] Z. Wang. Existence and uniqueness of solutions to stochastic Volterra equations with singular kernels andnon-Lipschitz coefficients.
Stat. Probab. Lett. , 78(9):1062–1071, 2008.[35] D. V. Widder.
The Laplace Transform . Princeton Mathematical Series, v. 6. Princeton University Press,Princeton, N. J., 1941.[36] X. Zhang. Euler schemes and large deviations for stochastic Volterra equations with singular kernels.
J.Differ. Equations , 244(9):2226–2250, 2008.[37] X. Zhang. Stochastic Volterra equations in Banach spaces and stochastic partial differential equation.
J.Funct. Anal. , 258(4):1361–1425, 2010.
Aurélien Alfonsi, CERMICS, École des Ponts, Marne-la-Vallée, France. MathRisk, Inria, Paris, France.
Email address: [email protected]

Ahmed Kebaier, Université Sorbonne Paris Nord, LAGA, CNRS (UMR 7539), F-93430 Villetaneuse, France.
Email address: