Truncated control variates for weak approximation schemes
DENIS BELOMESTNY, STEFAN HÄFNER, AND MIKHAIL URUSOV
Abstract.
In this paper we present an enhancement of the regression-based variance reduction approaches recently proposed in Belomestny et al. [1] and [4]. This enhancement is based on a truncation of the control variate and allows for a significant reduction of the computing time, while the complexity stays of the same order. The performances of the proposed truncated algorithms are illustrated by a numerical example.
Keywords.
Control variates; Monte Carlo methods; regression methods; stochastic differential equations; weak schemes.
Mathematics Subject Classification (2010).
Introduction
Let
$T > 0$ be a fixed time horizon. Consider a $d$-dimensional diffusion process $(X_t)_{t\in[0,T]}$ defined by the Itô stochastic differential equation
$$dX_t = \mu(X_t)\,dt + \sigma(X_t)\,dW_t, \quad X_0 = x_0 \in \mathbb{R}^d, \tag{0.1}$$
for Lipschitz continuous functions $\mu\colon\mathbb{R}^d\to\mathbb{R}^d$ and $\sigma\colon\mathbb{R}^d\to\mathbb{R}^{d\times m}$, where $(W_t)_{t\in[0,T]}$ is a standard $m$-dimensional Brownian motion. Our aim is to compute the expectation
$$u(t,x) := E[f(X_T^{t,x})], \tag{0.2}$$
for some $f\colon\mathbb{R}^d\to\mathbb{R}$, where $X^{t,x}$ denotes the solution to (0.1) started at time $t$ in point $x$. The standard Monte Carlo (SMC) estimate for $u(0,x_0)$ at a fixed point $x_0\in\mathbb{R}^d$ has the form
$$V_N := \frac{1}{N}\sum_{i=1}^N f\bigl(X_T^{(i)}\bigr) \tag{0.3}$$
for some $N\in\mathbb{N}$, where $X_T$ is an approximation for $X_T^{0,x_0}$ constructed via a time discretisation of (0.1) (we refer to [7] for a nice overview of various discretisation schemes). In the computation of $u(0,x_0) = E[f(X_T^{0,x_0})]$ by the SMC approach there are two types of error inherent: the (deterministic) discretisation error $E[f(X_T^{0,x_0})] - E[f(X_T)]$ and the Monte Carlo (statistical) error, which results from the substitution of $E[f(X_T)]$ with the sample average $V_N$. The aim of variance reduction methods is to reduce the latter statistical error. For example, in the so-called control variate variance reduction approach one looks for a random variable $\xi$ with $E\xi = 0$, which can be simulated, such that the variance of the difference $f(X_T) - \xi$ is minimised, that is, $\mathrm{Var}[f(X_T) - \xi] \to \min$ under $E\xi = 0$. Then one uses the sample average
$$V_N^{CV} := \frac{1}{N}\sum_{i=1}^N \bigl[f\bigl(X_T^{(i)}\bigr) - \xi^{(i)}\bigr] \tag{0.4}$$
instead of (0.3) to approximate $E[f(X_T)]$. The use of control variates for computing expectations of functionals of diffusion processes via Monte Carlo was initiated by Newton [11] and further developed in Milstein and Tretyakov [8].
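To make the control variate idea concrete, here is a minimal sketch of (0.3) versus (0.4) on a toy problem: a linear control variate $\xi = c\,W_1$ with $E\xi = 0$ is used to reduce the variance of a plain Monte Carlo estimate of $E[\exp(W_1)]$. The target function and the linear form of $\xi$ are illustrative assumptions only and are not the regression-based construction of this paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000
W = rng.standard_normal(N)           # W_1 ~ N(0, 1)
f = np.exp(W)                        # f(X_T) with X = W and f = exp (toy choice)

# control variate xi = c*W with E[xi] = 0; c estimated as Cov(f, W)/Var(W)
c = np.cov(f, W)[0, 1] / np.var(W)
xi = c * W

V_plain = f.mean()                   # estimator (0.3)
V_cv = (f - xi).mean()               # estimator (0.4)
print(V_plain, V_cv)                 # both approximate E[exp(W_1)] = e^{1/2}
print(f.var(), (f - xi).var())       # the CV sample has smaller variance
```

Both estimators are unbiased for $E[\exp(W_1)]$; subtracting $\xi$ only shrinks the statistical error.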
In Belomestny et al. [1] a novel regression-based approach for the construction of control variates, which reduces the variance of the approximated functional $f(X_T)$, was proposed. As shown in [1], the "Monte Carlo approach with the Regression-based Control Variate" (abbreviated below as "RCV approach") is able to achieve a higher order convergence of the resulting variance to zero, which in turn leads to a significant complexity reduction as compared to the SMC algorithm. Other prominent examples of algorithms with this property are the multilevel Monte Carlo (MLMC) algorithm of [5] and quadrature-based algorithms of [9] and [10]. The RCV approach becomes especially simple in the case of the so-called weak approximation schemes, i.e., the schemes where simple random variables are used in place of Brownian increments, and which became quite popular in recent years. However, due to the fact that a lot of computations are required for implementing the RCV approach, its numerical efficiency is not convincing in higher-dimensional examples. The same applies also to the SRCV algorithm of [4]. In this paper we further enhance the performances of the RCV and SRCV algorithms by truncating the control variates, leading to a reduction from $2^m - 1$ to $m$ terms at each time point in case of the weak Euler scheme and a reduction from $3^m\,2^{m(m-1)/2} - 1$ to $m(m+1) = O(m^2)$ terms at each time point in case of the second order weak scheme. It turns out that, while the computing time is reduced significantly, we still have a sufficient variance reduction effect such that the complexity is of the same order as for the original RCV and SRCV approaches.

The paper is organised as follows. In Section 1 we present a smoothness theorem for a general class of discretisation schemes.

The work of Denis Belomestny is supported by the Russian Science Foundation project 14-50-00150.
Section 2 recalls the construction of control variates for weak schemes of the first and the second order. The main truncation results are derived in Section 3. In Section 4 we describe a generic regression algorithm. Section 5 deals with a complexity analysis for the algorithm that is based on the truncated control variate. Section 6 is devoted to a simulation study. Finally, all proofs are collected in Section 7.

1. Smoothness theorem for discretisation schemes
In this section we present a technical result for discretisation schemes, which will be very important in the sequel. To begin with, let $J\in\mathbb{N}$ denote the time discretisation parameter; we set $\Delta := T/J$ and consider discretisation schemes defined on the grid $\{j\Delta : j = 0,\ldots,J\}$.

Let us consider a scheme, where $d$-dimensional approximations $X_{\Delta,j\Delta}$, $j = 0,\ldots,J$, satisfy $X_{\Delta,0} = x_0$ and
$$X_{\Delta,j\Delta} = \Phi_\Delta\bigl(X_{\Delta,(j-1)\Delta}, \xi_j\bigr), \quad j = 1,\ldots,J, \tag{1.1}$$
for some Borel measurable functions $\Phi_\Delta\colon\mathbb{R}^{d+\tilde m}\to\mathbb{R}^d$, where $\tilde m \ge m$, and for $\tilde m$-dimensional i.i.d. random vectors $\xi_j = (\xi^1_j,\ldots,\xi^{\tilde m}_j)^\top$ with independent coordinates satisfying $E[\xi^i_j] = 0$ and $\mathrm{Var}[\xi^i_j] = 1$ for all $i = 1,\ldots,\tilde m$, $j = 1,\ldots,J$. Moreover, let $\mathcal{G}_0$ be the trivial $\sigma$-field and $\mathcal{G}_j = \sigma(\xi_1,\ldots,\xi_j)$, $j = 1,\ldots,J$. In the chapters below we will focus on different kinds of discretisation schemes, resulting in different convergence behaviour.

We now define the random function $G_{l,j}(x)$ for $J \ge l \ge j \ge 0$, $x\in\mathbb{R}^d$, as follows:
$$G_{l,j}(x) \equiv \Phi_{\Delta,l}\circ\Phi_{\Delta,l-1}\circ\ldots\circ\Phi_{\Delta,j+1}(x), \quad l > j, \qquad G_{l,j}(x) \equiv x, \quad l = j, \tag{1.2}$$
where $\Phi_{\Delta,l}(x) := \Phi_\Delta(x,\xi_l)$ for $l = 1,\ldots,J$. By $\Phi^k_{\Delta,l}$, $k\in\{1,\ldots,d\}$, we denote the $k$-th component of the function $\Phi_{\Delta,l}$. Note that it holds
$$q_j(x) := E[f(X_{\Delta,T}) \mid X_{\Delta,j\Delta} = x] = E[f(G_{J,j}(x))]. \tag{1.3}$$
Let us define the operator $D^\alpha$ as follows:
$$D^\alpha g(x) := \frac{\partial^{|\alpha|} g(x)}{\partial x_1^{\alpha_1}\cdots\partial x_d^{\alpha_d}}, \tag{1.4}$$
where $g$ is a real-valued function, $\alpha\in\mathbb{N}_0^d$ and $|\alpha| = \alpha_1 + \ldots + \alpha_d$ ($\mathbb{N}_0 := \mathbb{N}\cup\{0\}$). In the next theorem we present some smoothness conditions on $q_j$, which will be used several times in the chapters below.

Theorem 1.1.
Let $K\in\{1,2,3\}$. Suppose that $f$ is $K$ times continuously differentiable with bounded partial derivatives up to order $K$, $\Phi_\Delta(\cdot,\xi)$ is $K$ times continuously differentiable (for any fixed $\xi$), and that, for any $n\in\mathbb{N}$, $l \ge j$, $k\in\{1,\ldots,d\}$, $\alpha\in\mathbb{N}_0^d$ with $1\le|\alpha|\le K$, it holds
$$\Bigl| E\Bigl[\bigl(D^\alpha \Phi^k_{\Delta,l+1}(G_{l,j}(x))\bigr)^n \,\Big|\, \mathcal{G}_l\Bigr]\Bigr| \le \begin{cases} 1 + A_n\Delta, & |\alpha| = \alpha_k = 1,\\ B_n\Delta, & (|\alpha| > 1) \vee (\alpha_k \ne 1), \end{cases} \tag{1.5}$$
with probability one for some constants $A_n > 0$, $B_n > 0$. Moreover, suppose that for any $n_1, n_2\in\mathbb{N}$, $\alpha,\beta\in\mathbb{N}_0^d$, with $|\alpha| = 1$, $1\le|\beta|\le K$, $\alpha\ne\beta$, it holds
$$\Bigl| E\Bigl[\bigl(D^\alpha \Phi^k_{\Delta,l+1}(G_{l,j}(x))\bigr)^{n_1}\bigl(D^\beta \Phi^k_{\Delta,l+1}(G_{l,j}(x))\bigr)^{n_2} \,\Big|\, \mathcal{G}_l\Bigr]\Bigr| \le E_{n_1,n_2}\Delta \tag{1.6}$$
for some constants $E_{n_1,n_2} > 0$. Then we obtain for all $j\in\{0,\ldots,J\}$ that $q_j$ is $K$ times continuously differentiable with bounded partial derivatives up to order $K$.

2. Representations for weak approximation schemes
Below we focus on weak schemes of the first and the second order.

2.1. Weak Euler scheme.
In this subsection we treat weak schemes of order 1. Let us consider a scheme, where $d$-dimensional approximations $X_{\Delta,j\Delta}$, $j = 0,\ldots,J$, satisfy $X_{\Delta,0} = x_0$ and
$$X_{\Delta,j\Delta} = \Phi_\Delta(X_{\Delta,(j-1)\Delta}, \xi_j), \quad j = 1,\ldots,J, \tag{2.1}$$
for some functions $\Phi_\Delta\colon\mathbb{R}^{d+m}\to\mathbb{R}^d$, with $\xi_j = (\xi^1_j,\ldots,\xi^m_j)^\top$, $j = 1,\ldots,J$, being $m$-dimensional i.i.d. random vectors with i.i.d. coordinates such that
$$P\bigl(\xi^k_j = \pm 1\bigr) = \tfrac12, \quad k = 1,\ldots,m.$$
That is, relating to the framework in Section 1, we have $\tilde m = m$ and use the discrete increments $\xi^i_j$, $i = 1,\ldots,m$. A particular case is the weak Euler scheme (also called the simplified weak Euler scheme in [7, Section 14.1]) of order 1, which is given by
$$\Phi_\Delta(x,y) = x + \mu(x)\,\Delta + \sigma(x)\,y\sqrt{\Delta}. \tag{2.2}$$
Let us recall the functions (cf. (1.3)) $q_j(x) = E[f(X_{\Delta,T}) \mid X_{\Delta,j\Delta} = x]$. The proposition below summarises important representations for the weak Euler scheme, which were derived in [1].
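A one-step implementation of the simplified weak Euler scheme (2.2) with Rademacher increments can be sketched as follows; the Ornstein-Uhlenbeck test equation and all numerical parameters are toy assumptions used only to check the scheme against a known expectation.

```python
import numpy as np

def weak_euler_paths(mu, sigma, x0, T, J, N, rng):
    """Simulate N terminal values of the simplified weak Euler scheme (2.2)
    for a one-dimensional SDE, with discrete increments xi = ±1 (prob 1/2)."""
    dt = T / J
    X = np.full(N, float(x0))
    for _ in range(J):
        xi = rng.choice([-1.0, 1.0], size=N)           # Rademacher increment
        X = X + mu(X) * dt + sigma(X) * xi * np.sqrt(dt)
    return X

# toy example: dX = -X dt + dW (Ornstein-Uhlenbeck), so E[X_1] = x0 * e^{-1}
rng = np.random.default_rng(1)
XT = weak_euler_paths(lambda x: -x, lambda x: np.ones_like(x),
                      x0=1.0, T=1.0, J=100, N=200_000, rng=rng)
print(XT.mean())   # close to e^{-1} ≈ 0.368 (weak order 1 bias + MC error)
```

The discrete increments match the first two moments of the Brownian increments, which is all that weak order 1 convergence requires.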
Proposition 2.1.
The following representations hold:
$$f(X_{\Delta,T}) = E f(X_{\Delta,T}) + \sum_{j=1}^{J}\sum_{r=1}^{m}\sum_{1\le s_1<\ldots<s_r\le m} a_{j,r,s}(X_{\Delta,(j-1)\Delta})\prod_{i=1}^{r}\xi^{s_i}_j \tag{2.3}$$
with $s = (s_1,\ldots,s_r)$, where the coefficients $a_{j,r,s}\colon\mathbb{R}^d\to\mathbb{R}$ can be computed by the formula
$$a_{j,r,s}(x) = E\Bigl[f(X_{\Delta,T})\prod_{i=1}^{r}\xi^{s_i}_j \,\Big|\, X_{\Delta,(j-1)\Delta} = x\Bigr]. \tag{2.4}$$
Moreover, we have for each $j\in\{1,\ldots,J\}$
$$q_{j-1}(x) = E[q_j(X_{\Delta,j\Delta}) \mid X_{\Delta,(j-1)\Delta} = x] = \frac{1}{2^m}\sum_{(y^1,\ldots,y^m)\in\{-1,1\}^m} q_j(\Phi_\Delta(x,y)). \tag{2.5}$$
The perfect control variate corresponding to (2.3) is
$$M^{(1)}_{\Delta,T} := \sum_{j=1}^{J}\sum_{r=1}^{m}\sum_{1\le s_1<\ldots<s_r\le m} a_{j,r,s}(X_{\Delta,(j-1)\Delta})\prod_{i=1}^{r}\xi^{s_i}_j. \tag{2.6}$$
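The representation above can be checked directly on a tiny example: for $m = d = 1$ the coefficients (2.4) can be computed exactly by enumerating all $2^{J-j}$ future increment patterns, and the resulting control variate reproduces $f(X_{\Delta,T}) - E f(X_{\Delta,T})$ pathwise. The drift, diffusion coefficient and functional below are arbitrary toy choices.

```python
import math
from itertools import product

# weak Euler scheme with d = m = 1 (all concrete functions are assumptions):
# X_j = X_{j-1} + mu(X)*dt + sig(X)*xi_j*sqrt(dt),  xi_j = ±1
mu = lambda x: -x
sig = lambda x: 0.5
f = lambda x: x * x
x0, T, J = 1.0, 1.0, 4
dt = T / J

def step(x, xi):
    return x + mu(x) * dt + sig(x) * xi * math.sqrt(dt)

def q(j, x):
    """q_j(x) = E[f(X_T) | X_{j*dt} = x], by enumerating all 2^(J-j) futures."""
    if j == J:
        return f(x)
    return 0.5 * (q(j + 1, step(x, 1.0)) + q(j + 1, step(x, -1.0)))

def a(j, x):
    """a_j(x) = E[f(X_T) * xi_j | X_{(j-1)*dt} = x], formula (2.4) for m = 1."""
    return 0.5 * (q(j, step(x, 1.0)) - q(j, step(x, -1.0)))

# the control variate M = sum_j a_j(X_{j-1}) * xi_j is perfect:
# f(X_T) - M equals E[f(X_T)] on every one of the 2^J paths
Ef = q(0, x0)
max_dev = 0.0
for xis in product([-1.0, 1.0], repeat=J):
    X, M = x0, 0.0
    for j, xi in enumerate(xis, start=1):
        M += a(j, X) * xi
        X = step(X, xi)
    max_dev = max(max_dev, abs(f(X) - M - Ef))
print(max_dev)   # numerically zero
```

This is exactly the telescoping identity $q_j(X_{\Delta,j\Delta}) - q_{j-1}(X_{\Delta,(j-1)\Delta}) = a_j(X_{\Delta,(j-1)\Delta})\,\xi_j$ summed over $j$.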
Proposition 2.2.
Assume that $\mu$ and $\sigma$ in (0.1) are Lipschitz continuous with components $\mu^i, \sigma^{i,r}\colon\mathbb{R}^d\to\mathbb{R}$, $i = 1,\ldots,d$, $r = 1,\ldots,m$, being 4 times continuously differentiable with their partial derivatives of order up to 4 having polynomial growth. Let $f\colon\mathbb{R}^d\to\mathbb{R}$ be 4 times continuously differentiable with partial derivatives of order up to 4 having polynomial growth. Provided that (2.2) holds and that, for sufficiently large $p\in\mathbb{N}$, the expectations $E|X_{\Delta,j\Delta}|^p$ are uniformly bounded in $J$ and $j = 0,\ldots,J$, we have for this "simplified weak Euler scheme"
$$|E[f(X_T) - f(X_{\Delta,T})]| \le c\,\Delta,$$
where the constant $c$ does not depend on $\Delta$. Moreover, it holds $\mathrm{Var}\bigl[f(X_{\Delta,T}) - M^{(1)}_{\Delta,T}\bigr] = 0$.
In order to use the control variate $M^{(1)}_{\Delta,T}$ in practice, we need to estimate the unknown coefficients $a_{j,r,s}$. Thus, practically implementable control variates $\widetilde M^{(1)}_{\Delta,T}$ have the form (2.6) with some estimated functions $\tilde a_{j,r,s}\colon\mathbb{R}^d\to\mathbb{R}$. Notice that they remain valid control variates, i.e. we still have $E\bigl[\widetilde M^{(1)}_{\Delta,T}\bigr] = 0$, which is due to the martingale transform structure in (2.6).¹

2.2. Second order weak scheme.
Now we treat weak schemes of order 2. We consider a scheme, where $d$-dimensional approximations $X_{\Delta,j\Delta}$, $j = 0,\ldots,J$, satisfy $X_{\Delta,0} = x_0$ and
$$X_{\Delta,j\Delta} = \Phi_\Delta(X_{\Delta,(j-1)\Delta}, \xi_j, V_j), \quad j = 1,\ldots,J, \tag{2.7}$$
for some functions $\Phi_\Delta\colon\mathbb{R}^{d+m+m\times m}\to\mathbb{R}^d$. Here,
(S1) $\xi_j = (\xi^i_j)_{i=1}^m$ are $m$-dimensional random vectors,
(S2) $V_j = (V^{il}_j)_{i,l=1}^m$ are random $m\times m$-matrices,
(S3) the pairs $(\xi_j, V_j)$, $j = 1,\ldots,J$, are i.i.d.,
(S4) for each $j$, the random elements $\xi_j$ and $V_j$ are independent,
(S5) for each $j$, the random variables $\xi^i_j$, $i = 1,\ldots,m$, are i.i.d. with $P(\xi^i_j = \pm\sqrt3) = \frac16$, $P(\xi^i_j = 0) = \frac23$,
(S6) for each $j$, the random variables $V^{il}_j$, $1\le i<l\le m$, are i.i.d. with $P(V^{il}_j = \pm 1) = \frac12$,
(S7) $V^{li}_j = -V^{il}_j$, $1\le i<l\le m$, $j = 1,\ldots,J$,
(S8) $V^{ii}_j = -1$, $i = 1,\ldots,m$, $j = 1,\ldots,J$.
Hence, the matrices $V_j$ can be generated by means of $\frac{m(m-1)}{2}$ i.i.d. random variables. That is, relating to the framework in Section 1, we have $\tilde m$-dimensional random vectors $\tilde\xi_j := \bigl((\xi^i_j)_{i=1,\ldots,m}, (V^{il}_j)_{1\le i<l\le m}\bigr)$ with $\tilde m = \frac{m(m+1)}{2}$.

In order to obtain a second order weak scheme in the multidimensional case, we need to incorporate additional random elements $V_j$ into the structure of the scheme. This is the reason why we now consider (2.7) instead of (2.1). For instance, to get the second order weak scheme (also called the simplified order 2 weak Taylor scheme) of [7, Section 14.2] in the multidimensional case, we need to define the functions $\Phi_\Delta(x,y,z)$, $x\in\mathbb{R}^d$, $y\in\mathbb{R}^m$, $z\in\mathbb{R}^{m\times m}$, as explained below. First we define the function $\Sigma\colon\mathbb{R}^d\to\mathbb{R}^{d\times d}$ by the formula $\Sigma(x) = \sigma(x)\sigma(x)^\top$ and recall that the coordinates of vectors and matrices are denoted by superscripts, e.g. $\Sigma(x) = (\Sigma^{kl}(x))_{k,l=1}^d$, $\Phi_\Delta(x,y,z) = (\Phi^k_\Delta(x,y,z))_{k=1}^d$. Let us introduce the operators $L^r$, $r = 0,\ldots,m$, that act on sufficiently smooth functions $g\colon\mathbb{R}^d\to\mathbb{R}$ as follows:
$$L^0 g(x) := \sum_{k=1}^d \mu^k(x)\frac{\partial g}{\partial x^k}(x) + \frac12\sum_{k,l=1}^d \Sigma^{kl}(x)\frac{\partial^2 g}{\partial x^l\,\partial x^k}(x), \qquad L^r g(x) := \sum_{k=1}^d \sigma^{kr}(x)\frac{\partial g}{\partial x^k}(x), \quad r = 1,\ldots,m.$$
The $r$-th coordinate $\Phi^r_\Delta$, $r = 1,\ldots,d$, in the simplified order 2 weak Taylor scheme of [7, Section 14.2] is now given by the formula
$$\Phi^r_\Delta(x,y,z) = x^r + \sum_{k=1}^m \sigma^{rk}(x)\,y^k\sqrt{\Delta} + \Bigl[\mu^r(x) + \frac12\sum_{k,l=1}^m L^k\sigma^{rl}(x)\bigl(y^k y^l + z^{kl}\bigr)\Bigr]\Delta + \frac12\sum_{k=1}^m \bigl[L^0\sigma^{rk}(x) + L^k\mu^r(x)\bigr]\,y^k\,\Delta^{3/2} + \frac12 L^0\mu^r(x)\,\Delta^2, \tag{2.8}$$
provided the coefficients $\mu$ and $\sigma$ of (0.1) are sufficiently smooth. We will need to work explicitly with (2.8) at some point, but all results in this subsection assume structure (2.7) only.

Let us define the index sets $I_1 = \{1,\ldots,m\}$, $I_2 = \bigl\{(k,l)\in I_1^2 : k < l\bigr\}$ and the system
$$\mathcal{A} = \bigl\{(U_1, U_2)\in\mathcal{P}(I_1)\times\mathcal{P}(I_2) : U_1\cup U_2\ne\emptyset\bigr\},$$
where $\mathcal{P}(I)$ denotes the set of all subsets of a set $I$. For any $U\subseteq I_1$ and $o\in\{1,2\}^U$, we write $o$ as $o = (o_r)_{r\in U}$. Below we use the convention that a product over the empty set is always one. For $k\in\mathbb{N}_0$, $H_k\colon\mathbb{R}\to\mathbb{R}$ stands for the (normalised) $k$-th Hermite polynomial, i.e.
$$H_k(x) := \frac{(-1)^k}{\sqrt{k!}}\,e^{x^2/2}\,\frac{d^k}{dx^k}\,e^{-x^2/2}, \quad x\in\mathbb{R}.$$
We remark that, in particular, $H_0\equiv 1$, $H_1(x) = x$ and $H_2(x) = \frac{x^2-1}{\sqrt2}$. As in Subsection 2.1, we summarise important representations from [1] below.

¹ This phrase means that the discrete-time process $\tilde M = (\tilde M_l)_{l=0,\ldots,J}$, where $\tilde M_0 = 0$ and $\tilde M_l$ is defined like the right-hand side of (2.6) but with $\sum_{j=1}^J$ being replaced by $\sum_{j=1}^l$ and $a_{j,r,s}$ by $\tilde a_{j,r,s}$, is a martingale, which is a straightforward calculation.

Proposition 2.4.
It holds
$$f(X_{\Delta,T}) = E f(X_{\Delta,T}) + \sum_{j=1}^{J}\sum_{(U_1,U_2)\in\mathcal{A}}\sum_{o\in\{1,2\}^{U_1}} a_{j,o,U_1,U_2}(X_{\Delta,(j-1)\Delta})\prod_{r\in U_1} H_{o_r}(\xi^r_j)\prod_{(k,l)\in U_2} V^{kl}_j, \tag{2.9}$$
where the coefficients $a_{j,o,U_1,U_2}\colon\mathbb{R}^d\to\mathbb{R}$ can be computed by the formula
$$a_{j,o,U_1,U_2}(x) = E\Bigl[f(X_{\Delta,T})\prod_{r\in U_1} H_{o_r}(\xi^r_j)\prod_{(k,l)\in U_2} V^{kl}_j \,\Big|\, X_{\Delta,(j-1)\Delta} = x\Bigr]. \tag{2.10}$$
Moreover, we have for each $j\in\{1,\ldots,J\}$
$$q_{j-1}(x) = E[q_j(X_{\Delta,j\Delta}) \mid X_{\Delta,(j-1)\Delta} = x] = E\bigl[q_j(\Phi_\Delta(x,\xi_j,V_j))\bigr], \tag{2.11}$$
which is a finite weighted sum over the $3^m\,2^{m(m-1)/2}$ possible outcomes $(y,z)$ of $(\xi_j, V_j)$, with $(y^1,\ldots,y^m)\in\{-\sqrt3,0,\sqrt3\}^m$ and $(z^{uv})_{1\le u<v\le m}\in\{-1,1\}^{m(m-1)/2}$.

Proposition 2.5. Assume that $\mu$ and $\sigma$ in (0.1) are Lipschitz continuous with components $\mu^i, \sigma^{i,r}\colon\mathbb{R}^d\to\mathbb{R}$, $i = 1,\ldots,d$, $r = 1,\ldots,m$, being 6 times continuously differentiable with their partial derivatives of order up to 6 having polynomial growth. Let $f\colon\mathbb{R}^d\to\mathbb{R}$ be 6 times continuously differentiable with partial derivatives of order up to 6 having polynomial growth. Provided that (2.8) holds and that, for sufficiently large $p\in\mathbb{N}$, the expectations $E|X_{\Delta,j\Delta}|^p$ are uniformly bounded in $J$ and $j = 0,\ldots,J$, we have for this "simplified second order weak Taylor scheme"
$$|E[f(X_T) - f(X_{\Delta,T})]| \le c\,\Delta^2,$$
where the constant $c$ does not depend on $\Delta$. Moreover, we have $\mathrm{Var}\bigl[f(X_{\Delta,T}) - M^{(2)}_{\Delta,T}\bigr] = 0$ for the control variate
$$M^{(2)}_{\Delta,T} := \sum_{j=1}^{J}\sum_{(U_1,U_2)\in\mathcal{A}}\sum_{o\in\{1,2\}^{U_1}} a_{j,o,U_1,U_2}(X_{\Delta,(j-1)\Delta})\prod_{r\in U_1} H_{o_r}(\xi^r_j)\prod_{(k,l)\in U_2} V^{kl}_j, \tag{2.12}$$
where the coefficients $a_{j,o,U_1,U_2}(x)$ are defined in (2.10).

3. Truncated control variates for weak approximation schemes

Below we recall the assumptions from [1], suggest sufficient conditions for them in terms of the functions $f, \mu, \sigma$, and then suggest some stronger conditions that will justify the use of truncated control variates.

3.1. Weak Euler scheme.
Note that we considered only the second order weak scheme in terms of the regression and complexity analyses in [1]. However, analogous assumptions for the weak Euler scheme are as follows (cf. Proposition 2.1): fix some $j\in\{1,\ldots,J\}$, $r\in\{1,\ldots,m\}$, $s = (s_1,\ldots,s_r)$ with $1\le s_1<\ldots<s_r\le m$, set $\zeta_{j,r,s} := f(X_{\Delta,T})\prod_{i=1}^r \xi^{s_i}_j$ and remark that $a_{j,r,s}(x) = E[\zeta_{j,r,s} \mid X_{\Delta,(j-1)\Delta} = x]$. We assume that, for some positive constants $\Sigma$, $A$, it holds:
(A1) $\sup_{x\in\mathbb{R}^d}\mathrm{Var}[\zeta_{j,r,s} \mid X_{\Delta,(j-1)\Delta} = x] \le \Sigma < \infty$,
(A2) $\sup_{x\in\mathbb{R}^d}|a_{j,r,s}(x)| \le A\sqrt{\Delta} < \infty$.
In the following theorem we suggest sufficient conditions for the above assumptions.

Theorem 3.1. (i) Let $f$ be bounded. Then (A1) holds.
(ii) Let all the functions $\sigma^{ki}$, $k\in\{1,\ldots,d\}$, $i\in\{1,\ldots,m\}$, be bounded and all the functions $f, \mu^k, \sigma^{ki}$ be continuously differentiable with bounded partial derivatives. Then (A2) holds.

Next we suggest some stronger conditions that give us somewhat more than (A2).

Theorem 3.2. Let all the functions $\sigma^{ki}$, $k\in\{1,\ldots,d\}$, $i\in\{1,\ldots,m\}$, be bounded and all the functions $f, \mu^k, \sigma^{ki}$ be twice continuously differentiable with bounded partial derivatives up to order 2. Then it holds
(A3) $\sup_{x\in\mathbb{R}^d}|a_{j,r,s}(x)| \lesssim \Delta$, whenever $r > 1$.

Remark 3.3. As a generalisation of Theorem 3.2, it is natural to expect that it holds, under additional smoothness conditions on $f, \mu, \sigma$,
$$\sup_{x\in\mathbb{R}^d}|a_{j,r,s}(x)| \lesssim \Delta^{r/2}$$
for all $j\in\{1,\ldots,J\}$, $r\in\{1,\ldots,m\}$ and $1\le s_1<\ldots<s_r\le m$.

Let us define the "truncated control variate"
$$M^{(1),\mathrm{trunc}}_{\Delta,T} := \sum_{j=1}^{J}\sum_{i=1}^{m} a_{j,1,e_i}(X_{\Delta,(j-1)\Delta})\,\xi^i_j, \tag{3.1}$$
where $e_i\in\mathbb{R}^m$ denotes the $i$-th unit vector in $\mathbb{R}^m$ and $a_{j,1,e_i}$ is given by (cf. (2.4))
$$a_{j,1,e_i}(x) = E\bigl[f(X_{\Delta,T})\,\xi^i_j \,\big|\, X_{\Delta,(j-1)\Delta} = x\bigr].$$
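The price of dropping the higher-order terms can be checked numerically on a toy example (in the spirit of Theorem 3.4 below): for $m = 2$, $d = 1$ the exact coefficients are computable by enumeration, the full control variate has zero variance, and the truncated one (keeping only the $r = 1$ terms) leaves a small residual variance. All concrete functions below are made-up toy choices.

```python
import math
from itertools import product

# weak Euler with d = 1, m = 2 (toy assumptions):
# X_j = X_{j-1} + mu(X)*dt + (s1(X)*xi1 + s2(X)*xi2)*sqrt(dt), xi1, xi2 = ±1
mu = lambda x: -x
s1 = lambda x: 0.3
s2 = lambda x: 0.2 * math.cos(x)
f = lambda x: x * x
x0, T, J = 1.0, 1.0, 3
dt = T / J
XI = list(product([-1.0, 1.0], repeat=2))   # the 4 outcomes of (xi1, xi2)

def step(x, xi):
    return x + mu(x) * dt + (s1(x) * xi[0] + s2(x) * xi[1]) * math.sqrt(dt)

def q(j, x):   # E[f(X_T) | X_{j*dt} = x], enumerating all futures
    if j == J:
        return f(x)
    return sum(q(j + 1, step(x, xi)) for xi in XI) / 4.0

def coeff(j, x, w):   # a_{j,r,s}(x) = E[f(X_T) * w(xi_j) | X_{(j-1)*dt} = x]
    return sum(w(xi) * q(j, step(x, xi)) for xi in XI) / 4.0

Ef = q(0, x0)
var_full = var_trunc = 0.0
for path in product(XI, repeat=J):          # exact average over all 4^J paths
    X, M_full, M_trunc = x0, 0.0, 0.0
    for j, xi in enumerate(path, start=1):
        a1 = coeff(j, X, lambda u: u[0])            # r = 1 term with xi^1 (kept)
        a2 = coeff(j, X, lambda u: u[1])            # r = 1 term with xi^2 (kept)
        a12 = coeff(j, X, lambda u: u[0] * u[1])    # r = 2 term (truncated away)
        M_full += a1 * xi[0] + a2 * xi[1] + a12 * xi[0] * xi[1]
        M_trunc += a1 * xi[0] + a2 * xi[1]
        X = step(X, xi)
    var_full += (f(X) - M_full - Ef) ** 2 / 4 ** J
    var_trunc += (f(X) - M_trunc - Ef) ** 2 / 4 ** J
print(var_full, var_trunc)   # zero variance vs. a small truncation residual
```

The dropped coefficient $a_{j,2,(1,2)}$ is of order $\Delta$, so the residual variance is of order $J\Delta^2 = T\Delta$, consistent with the bound stated below.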
Note that the superscript "trunc" comes from "truncated". That is, we consider in $M^{(1),\mathrm{trunc}}_{\Delta,T}$ only the terms of the control variate $M^{(1)}_{\Delta,T}$ for which $r = 1$ (cf. (2.6)). Next we study the truncation error that arises from replacing $M^{(1)}_{\Delta,T}$ by $M^{(1),\mathrm{trunc}}_{\Delta,T}$.

Theorem 3.4. Suppose that all the functions $\sigma^{ki}$, $k\in\{1,\ldots,d\}$, $i\in\{1,\ldots,m\}$, are bounded and all the functions $f, \mu^k, \sigma^{ki}$ are twice continuously differentiable with bounded partial derivatives up to order 2. Then it holds (cf. Proposition 2.2)
$$\mathrm{Var}\bigl[f(X_{\Delta,T}) - M^{(1),\mathrm{trunc}}_{\Delta,T}\bigr] \lesssim \Delta. \tag{3.2}$$
Notice that under Assumption (A2) alone the variance in (3.2) would have been $O(1)$.

3.2. Second order weak scheme. First we recall some of the required assumptions in [1]: let us fix some $j\in\{1,\ldots,J\}$, $(U_1,U_2)\in\mathcal{A}$, $o\in\{1,2\}^{U_1}$, set
$$\zeta_{j,o,U_1,U_2} := f(X_{\Delta,T})\prod_{r\in U_1} H_{o_r}(\xi^r_j)\prod_{(k,l)\in U_2} V^{kl}_j$$
and remark that $a_{j,o,U_1,U_2}(x) = E[\zeta_{j,o,U_1,U_2} \mid X_{\Delta,(j-1)\Delta} = x]$. We assume that, for some positive constants $\Sigma$, $A$, it holds:
(B1) $\sup_{x\in\mathbb{R}^d}\mathrm{Var}[\zeta_{j,o,U_1,U_2} \mid X_{\Delta,(j-1)\Delta} = x] \le \Sigma < \infty$,
(B2) $\sup_{x\in\mathbb{R}^d}|a_{j,o,U_1,U_2}(x)| \le A\sqrt{\Delta} < \infty$.
Below we verify the above assumptions.

Theorem 3.5. (i) Let $f$ be bounded. Then (B1) holds.
(ii) Let all the functions $\mu^k$ and $\sigma^{ki}$, $k\in\{1,\ldots,d\}$, $i\in\{1,\ldots,m\}$, be bounded, the function $f$ be continuously differentiable with bounded partial derivatives, and all the functions $\mu^k, \sigma^{ki}$ be three times continuously differentiable with bounded partial derivatives up to order 3. Then (B2) holds.

Let us define the index sets
$$K_1 := \{r\in U_1 : o_r = 1\}, \quad K_2 := \{r\in U_1 : o_r = 2\}.$$
In the following theorem we provide some stronger conditions that give us more than (B2).

Theorem 3.6. (i) Let all the functions $\mu^k$ and $\sigma^{ki}$, $k\in\{1,\ldots,d\}$, $i\in\{1,\ldots,m\}$, be bounded, the function $f$ be twice continuously differentiable with bounded partial derivatives up to order 2, and all the functions $\mu^k, \sigma^{ki}$ be four times continuously differentiable with bounded partial derivatives up to order 4. Then it holds
(B3) $\sup_{x\in\mathbb{R}^d}|a_{j,o,U_1,U_2}(x)| \lesssim \Delta$, whenever $|U_2| + \frac{|K_1|}{2} + |K_2| \ge 1$.
(ii) Let all the functions $\mu^k$ and $\sigma^{ki}$, $k\in\{1,\ldots,d\}$, $i\in\{1,\ldots,m\}$, be bounded, the function $f$ be three times continuously differentiable with bounded partial derivatives up to order 3, and all the functions $\mu^k, \sigma^{ki}$ be five times continuously differentiable with bounded partial derivatives up to order 5. Then it holds
(B4) $\sup_{x\in\mathbb{R}^d}|a_{j,o,U_1,U_2}(x)| \lesssim \Delta^{3/2}$, whenever $|U_2| + \frac{|K_1|}{2} + |K_2| > 1$.

Remark 3.7. (i) As a generalisation of Theorem 3.6, it is natural to expect that it holds, under additional smoothness conditions on $f, \mu, \sigma$,
$$\sup_{x\in\mathbb{R}^d}|a_{j,o,U_1,U_2}(x)| \lesssim \Delta^{|U_2| + \frac{|K_1|}{2} + |K_2|}$$
for all $j\in\{1,\ldots,J\}$, $(U_1,U_2)\in\mathcal{A}$ and $o\in\{1,2\}^{U_1}$.
(ii) Define
$$\Delta_{U_1,U_2} := \begin{cases} \Delta^{|U_2| + \frac{|K_1|}{2} + |K_2|} & \text{if } |U_2| + \frac{|K_1|}{2} + |K_2| \le \frac32,\\[2pt] \Delta^{3/2} & \text{otherwise}. \end{cases} \tag{3.3}$$
An equivalent reformulation of assumptions (B2)-(B4) is as follows: there exists some positive constant $\tilde A$ such that it holds
$$\sup_{x\in\mathbb{R}^d}|a_{j,o,U_1,U_2}(x)| \le \tilde A\,\Delta_{U_1,U_2} \tag{3.4}$$
for all $j, o, U_1, U_2$.

Similar to Section 3.1, let us define a truncated control variate through
$$M^{(2),\mathrm{trunc}}_{\Delta,T} := \sum_{j=1}^{J}\sum_{\substack{(U_1,U_2)\in\mathcal{A}\\ |U_2| + |K_1|/2 + |K_2| \le 1}}\sum_{o\in\{1,2\}^{U_1}} a_{j,o,U_1,U_2}(X_{\Delta,(j-1)\Delta})\prod_{r\in U_1} H_{o_r}(\xi^r_j)\prod_{(k,l)\in U_2} V^{kl}_j. \tag{3.5}$$
Next we derive the truncation error that arises from replacing $M^{(2)}_{\Delta,T}$ by $M^{(2),\mathrm{trunc}}_{\Delta,T}$.

Theorem 3.8. Suppose that all the functions $\mu^k$ and $\sigma^{ki}$, $k\in\{1,\ldots,d\}$, $i\in\{1,\ldots,m\}$, are bounded, the function $f$ is three times continuously differentiable with bounded partial derivatives up to order 3, and all the functions $\mu^k, \sigma^{ki}$ are five times continuously differentiable with bounded partial derivatives up to order 5. Then it holds (cf. Proposition 2.5)
$$\mathrm{Var}\bigl[f(X_{\Delta,T}) - M^{(2),\mathrm{trunc}}_{\Delta,T}\bigr] \lesssim \Delta^2. \tag{3.6}$$

4. Generic regression algorithm

In the previous sections we have given several representations for control variates. Now we discuss how to compute the coefficients in these representations via regression. For the sake of clarity, we focus on second order schemes and control variate (3.5) with coefficients given by (2.10).

4.1. Monte Carlo regression. Fix a $Q$-dimensional vector of real-valued functions $\psi = (\psi_1,\ldots,\psi_Q)$ on $\mathbb{R}^d$. Simulate a big number $N_{\mathrm{tr}}$ of independent "training paths" of the discretised diffusion $X_{\Delta,j\Delta}$, $j = 0,\ldots,J$. In what follows these $N_{\mathrm{tr}}$ training paths are denoted by $D^{\mathrm{tr}}_{N_{\mathrm{tr}}}$:
$$D^{\mathrm{tr}}_{N_{\mathrm{tr}}} := \bigl\{(X^{\mathrm{tr},(i)}_{\Delta,j\Delta})_{j=0,\ldots,J} : i = 1,\ldots,N_{\mathrm{tr}}\bigr\}.$$
Let $\alpha_{j,o,U_1,U_2} = (\alpha^1_{j,o,U_1,U_2},\ldots,\alpha^Q_{j,o,U_1,U_2})$, where $j\in\{1,\ldots,J\}$, $(U_1,U_2)\in\mathcal{A}$, $|U_2| + |K_1|/2 + |K_2| \le 1$, $o\in\{1,2\}^{U_1}$, be a solution of the following least squares optimisation problem:
$$\operatorname*{argmin}_{\alpha\in\mathbb{R}^Q}\sum_{i=1}^{N_{\mathrm{tr}}}\Bigl[\zeta^{\mathrm{tr},(i)}_{j,o,U_1,U_2} - \alpha^1\psi_1\bigl(X^{\mathrm{tr},(i)}_{\Delta,(j-1)\Delta}\bigr) - \ldots - \alpha^Q\psi_Q\bigl(X^{\mathrm{tr},(i)}_{\Delta,(j-1)\Delta}\bigr)\Bigr]^2$$
with
$$\zeta^{\mathrm{tr},(i)}_{j,o,U_1,U_2} := f\bigl(X^{\mathrm{tr},(i)}_{\Delta,T}\bigr)\prod_{r\in U_1} H_{o_r}\bigl((\xi^{\mathrm{tr},(i)}_j)^r\bigr)\prod_{(k,l)\in U_2}\bigl(V^{\mathrm{tr},(i)}_j\bigr)^{kl}.$$
Define an estimate for the coefficient function $a_{j,o,U_1,U_2}$ via
$$\hat a_{j,o,U_1,U_2}(x) := \hat a_{j,o,U_1,U_2}\bigl(x, D^{\mathrm{tr}}_{N_{\mathrm{tr}}}\bigr) := \alpha^1_{j,o,U_1,U_2}\psi_1(x) + \ldots + \alpha^Q_{j,o,U_1,U_2}\psi_Q(x), \quad x\in\mathbb{R}^d.$$
The intermediate expression $\hat a_{j,o,U_1,U_2}(x, D^{\mathrm{tr}}_{N_{\mathrm{tr}}})$ in the above formula emphasises that the estimates $\hat a_{j,o,U_1,U_2}$ of the functions $a_{j,o,U_1,U_2}$ are random in that they depend on the simulated training paths.
² In the complexity analysis below we show how large $N_{\mathrm{tr}}$ is required to be in order to provide an estimate within some given tolerance.

The cost of computing $\alpha_{j,o,U_1,U_2}$ is of order $O(N_{\mathrm{tr}}Q^2)$, since each $\alpha_{j,o,U_1,U_2}$ is of the form $\alpha_{j,o,U_1,U_2} = B^{-1}b$ with
$$B^{k,l} := \frac{1}{N_{\mathrm{tr}}}\sum_{i=1}^{N_{\mathrm{tr}}}\psi_k\bigl(X^{\mathrm{tr},(i)}_{\Delta,(j-1)\Delta}\bigr)\psi_l\bigl(X^{\mathrm{tr},(i)}_{\Delta,(j-1)\Delta}\bigr) \tag{4.1}$$
and
$$b^k := \frac{1}{N_{\mathrm{tr}}}\sum_{i=1}^{N_{\mathrm{tr}}}\psi_k\bigl(X^{\mathrm{tr},(i)}_{\Delta,(j-1)\Delta}\bigr)\zeta^{\mathrm{tr},(i)}_{j,o,U_1,U_2}, \quad k,l\in\{1,\ldots,Q\}.$$
The cost of approximating the family of the coefficient functions $a_{j,o,U_1,U_2}$, $j\in\{1,\ldots,J\}$, $(U_1,U_2)\in\mathcal{A}$, $|U_2| + |K_1|/2 + |K_2| \le 1$, $o\in\{1,2\}^{U_1}$, is of order $O\bigl(J\,m(m+1)\,N_{\mathrm{tr}}\,Q^2\bigr)$.

4.2. Summary of the algorithm. The algorithm consists of two phases: training phase and testing phase. In the training phase, we simulate $N_{\mathrm{tr}}$ independent training paths $D^{\mathrm{tr}}_{N_{\mathrm{tr}}}$ and construct regression estimates $\hat a_{j,o,U_1,U_2}(\cdot, D^{\mathrm{tr}}_{N_{\mathrm{tr}}})$ for the coefficients $a_{j,o,U_1,U_2}(\cdot)$. In the testing phase, independently from $D^{\mathrm{tr}}_{N_{\mathrm{tr}}}$ we simulate $N$ independent testing paths $(X^{(i)}_{\Delta,j\Delta})_{j=0,\ldots,J}$, $i = 1,\ldots,N$, and build the Monte Carlo estimator for $E[f(X_T)]$ as
$$\mathcal{E} = \frac{1}{N}\sum_{i=1}^{N}\Bigl(f\bigl(X^{(i)}_{\Delta,T}\bigr) - \widehat M^{(2),\mathrm{trunc},(i)}_{\Delta,T}\Bigr), \tag{4.2}$$
where
$$\widehat M^{(2),\mathrm{trunc},(i)}_{\Delta,T} := \sum_{j=1}^{J}\sum_{\substack{(U_1,U_2)\in\mathcal{A}\\ |U_2| + |K_1|/2 + |K_2| \le 1}}\sum_{o\in\{1,2\}^{U_1}} \hat a_{j,o,U_1,U_2}\bigl(X^{(i)}_{\Delta,(j-1)\Delta}, D^{\mathrm{tr}}_{N_{\mathrm{tr}}}\bigr)\prod_{r\in U_1} H_{o_r}\bigl(\xi^{r,(i)}_j\bigr)\prod_{(k,l)\in U_2} V^{kl,(i)}_j \tag{4.3}$$
(cf. with (2.12)). Due to the martingale transform structure in (4.3) (recall footnote 1 on page 4), we have $E\bigl[\widehat M^{(2),\mathrm{trunc},(i)}_{\Delta,T} \mid D^{\mathrm{tr}}_{N_{\mathrm{tr}}}\bigr] = 0$, hence $E[\mathcal{E} \mid D^{\mathrm{tr}}_{N_{\mathrm{tr}}}] = E\bigl[f(X^{(i)}_{\Delta,T}) - \widehat M^{(2),\mathrm{trunc},(i)}_{\Delta,T} \mid D^{\mathrm{tr}}_{N_{\mathrm{tr}}}\bigr] = E[f(X_{\Delta,T})]$, and we obtain (cf. (3.6))
$$\mathrm{Var}[\mathcal{E}] = E\bigl[\mathrm{Var}\bigl(\mathcal{E} \mid D^{\mathrm{tr}}_{N_{\mathrm{tr}}}\bigr)\bigr] + \mathrm{Var}\bigl[E\bigl(\mathcal{E} \mid D^{\mathrm{tr}}_{N_{\mathrm{tr}}}\bigr)\bigr] = E\bigl[\mathrm{Var}\bigl(\mathcal{E} \mid D^{\mathrm{tr}}_{N_{\mathrm{tr}}}\bigr)\bigr] = \frac{1}{N}\,E\Bigl[\mathrm{Var}\Bigl(f\bigl(X^{(1)}_{\Delta,T}\bigr) - \widehat M^{(2),\mathrm{trunc},(1)}_{\Delta,T} \,\Big|\, D^{\mathrm{tr}}_{N_{\mathrm{tr}}}\Bigr)\Bigr] = \frac{1}{N}\,\mathrm{Var}\Bigl[f\bigl(X^{(1)}_{\Delta,T}\bigr) - \widehat M^{(2),\mathrm{trunc},(1)}_{\Delta,T}\Bigr].$$
Summarising, we have
$$E[\mathcal{E}] = E[f(X_{\Delta,T})], \quad \mathrm{Var}[\mathcal{E}] = \frac{1}{N}\,\mathrm{Var}\Bigl[f\bigl(X^{(1)}_{\Delta,T}\bigr) - \widehat M^{(2),\mathrm{trunc},(1)}_{\Delta,T}\Bigr]. \tag{4.4}$$
Notice that the result of (4.4) indeed requires the computations above and cannot be stated right from the outset, because the summands in (4.2) are dependent (through $D^{\mathrm{tr}}_{N_{\mathrm{tr}}}$). This concludes the description of the generic regression algorithm for constructing the control variate. Further details, such as bounds for the right-hand side of (4.4), depend on a particular implementation, i.e. on the quality of the chosen basis functions.

5. Complexity analysis

In this section we extend the complexity analysis presented in [1] to the case of the "TRCV" (truncated RCV) algorithm. Below we only sketch the main results for the second order schemes. We make the following assumption (cf. [2] and [4]):
(B5) The functions $a_{j,o,U_1,U_2}(x)$ can be well approximated by the functions from $\Psi_Q := \mathrm{span}(\{\psi_1,\ldots,\psi_Q\})$, in the sense that there are constants $\kappa > 0$ and $C_\kappa > 0$ such that
$$\inf_{g\in\Psi_Q}\int_{\mathbb{R}^d}\bigl(a_{j,o,U_1,U_2}(x) - g(x)\bigr)^2\,P_{\Delta,j-1}(dx) \le \frac{C_\kappa}{Q^\kappa},$$
where $P_{\Delta,j-1}$ denotes the distribution of $X_{\Delta,(j-1)\Delta}$.

Remark 5.1. Note that (B5) is a natural condition to be satisfied for good choices of $\Psi_Q$.
For instance, under appropriate assumptions, in the case of piecewise polynomial regression as described in [1], (B5) is satisfied with $\kappa = \frac{2\nu(p+1)}{2d(p+1)+d\nu}$, where the parameters $p$ and $\nu$ are explained in [1].

In Lemma 5.2 below we present an $L^2$-upper bound for the estimation error of the TRCV algorithm. To this end, we need to describe more precisely how exactly the regression-based approximations $\tilde a_{j,o,U_1,U_2}$ are constructed. Let the functions $\hat a_{j,o,U_1,U_2}(x)$ be obtained by regression onto the set of basis functions $\{\psi_1,\ldots,\psi_Q\}$, while the approximations $\tilde a_{j,o,U_1,U_2}(x)$ of the TRCV algorithm are the truncated estimates, which are defined as follows:
$$\tilde a_{j,o,U_1,U_2}(x) := T_{\tilde A\Delta_{U_1,U_2}}\hat a_{j,o,U_1,U_2}(x) := \begin{cases}\hat a_{j,o,U_1,U_2}(x) & \text{if } |\hat a_{j,o,U_1,U_2}(x)| \le \tilde A\Delta_{U_1,U_2},\\ \tilde A\Delta_{U_1,U_2}\operatorname{sgn}\hat a_{j,o,U_1,U_2}(x) & \text{otherwise},\end{cases} \tag{5.1}$$
where $\Delta_{U_1,U_2}$ and $\tilde A$ are given in (3.3) and (3.4).

Lemma 5.2. Under (B1)-(B5), we have
$$E\bigl\|\tilde a_{j,o,U_1,U_2} - a_{j,o,U_1,U_2}\bigr\|^2_{L^2(P_{\Delta,j-1})} \le \tilde c\,\bigl(\Sigma + \tilde A^2\Delta^2_{U_1,U_2}(\log N_{\mathrm{tr}} + 1)\bigr)\frac{Q}{N_{\mathrm{tr}}} + 8\,\frac{C_\kappa}{Q^\kappa}, \tag{5.2}$$
where $\tilde c$ is a universal constant. Notice that the expectation in the left-hand side of (5.2) means averaging over the randomness in $D^{\mathrm{tr}}_{N_{\mathrm{tr}}}$.

Let $(X_{\Delta,j\Delta})_{j=0,\ldots,J}$ be a testing path, which is independent of the training paths $D^{\mathrm{tr}}_{N_{\mathrm{tr}}}$. We define
$$\widetilde M^{(2),\mathrm{trunc}}_{\Delta,T} := \sum_{j=1}^{J}\sum_{\substack{(U_1,U_2)\in\mathcal{A}\\ |U_2| + |K_1|/2 + |K_2| \le 1}}\sum_{o\in\{1,2\}^{U_1}} \tilde a_{j,o,U_1,U_2}\bigl(X_{\Delta,(j-1)\Delta}, D^{\mathrm{tr}}_{N_{\mathrm{tr}}}\bigr)\prod_{r\in U_1} H_{o_r}(\xi^r_j)\prod_{(k,l)\in U_2} V^{kl}_j \tag{5.3}$$
(cf. (3.5)). Lemma 5.2 now allows to bound the variance $\mathrm{Var}\bigl[f(X_{\Delta,T}) - \widetilde M^{(2),\mathrm{trunc}}_{\Delta,T}\bigr]$ from above.

Theorem 5.3. Under (B1)-(B5), it holds
$$\mathrm{Var}\bigl[f(X_{\Delta,T}) - \widetilde M^{(2),\mathrm{trunc}}_{\Delta,T}\bigr] \lesssim \Delta^2 + J\,m(m+1)\Bigl(\tilde c\,\bigl(\Sigma + \tilde A^2\Delta(\log N_{\mathrm{tr}} + 1)\bigr)\frac{Q}{N_{\mathrm{tr}}} + 8\,\frac{C_\kappa}{Q^\kappa}\Bigr).$$

5.1. Complexity of the TRCV approach.
Let us study the complexity of the TRCV approach. The overall cost is of order $JQ\max\{N_{\mathrm{tr}}Q, N\}$, provided that we only track the constants which tend to $\infty$ when $\varepsilon\searrow 0$, with $\varepsilon$ being the accuracy to be achieved. That is, the constants, such as $d, m, \kappa, C_\kappa$, are ignored. We have the following constraints:
$$\max\Bigl\{J^{-4},\; \frac{1}{J^2 N},\; \frac{JQ}{N_{\mathrm{tr}}N},\; \frac{J}{Q^\kappa N}\Bigr\} \lesssim \varepsilon^2, \tag{5.4}$$
where the first term comes from the squared bias of the estimator and the remaining three ones come from the variance of the estimator (see Theorem 5.3 as well as footnote 3 on page 10). We get the following result.

Theorem 5.4. For the TRCV approach with the second order weak schemes under (B1)-(B5), it is optimal to choose the orders of parameters as follows (cf. [4]):
$$J \asymp \varepsilon^{-1/2}, \quad Q \asymp \varepsilon^{-\frac{5}{4\kappa+4}}, \quad N_{\mathrm{tr}} \asymp \varepsilon^{-5/4}, \quad N \asymp N_{\mathrm{tr}}Q \asymp \varepsilon^{-\frac{5\kappa+10}{4\kappa+4}},$$

³ Notice that the variance of the TRCV estimate $\frac{1}{N}\sum_{i=1}^{N}\bigl[f\bigl(X^{(i)}_{\Delta,T}\bigr) - \widetilde M^{(2),\mathrm{trunc},(i)}_{\Delta,T}\bigr]$ with $N$ testing paths is $\frac{1}{N}\mathrm{Var}\bigl[f(X_{\Delta,T}) - \widetilde M^{(2),\mathrm{trunc}}_{\Delta,T}\bigr]$ (cf. (4.4)).
That is why we truncated M (2) ,trunc ∆ ,T in (3.5) at the level | U | + |K | + |K | ≤ . For instance, if we had used a control variate of the form(cf. (3.1)) J (cid:88) j =1 (cid:88) ( U ,U ) ∈A| U | + |K | + |K | = (cid:88) o ∈{ , } U a j,o,U ,U ( X ∆ , ( j − ) (cid:89) r ∈ U H o r ( ξ rj ) (cid:89) ( k,l ) ∈ U V klj = J (cid:88) j =1 m (cid:88) i =1 a j, ,i, ∅ ( X ∆ , ( j − ) ξ ij with a j, ,i, ∅ ( x ) = E (cid:104) f ( X ∆ ,T ) ξ ij | X ∆ , ( j − = x (cid:105) , the bound for the variance in (3.6) would have beenof order ∆ and due to the resulting constraint JN (cid:46) ε , we would have obtained worse complexitiesthan ε − , since C T RCV (cid:38) J N . 6. Numerical results The results below are based on program codes written and vectorised in MATLAB and running ona Linux 64-bit operating system.Let us consider the following SDE for d = m = 5 (cf. [1]) dX it = − sin (cid:0) X it (cid:1) cos (cid:0) X it (cid:1) dt + cos (cid:0) X it (cid:1) dW it , X i = 0 , i ∈ { , , , } ,dX t = (cid:88) i =1 (cid:20) − 12 sin (cid:0) X it (cid:1) cos (cid:0) X it (cid:1) dt + cos (cid:0) X it (cid:1) dW it (cid:21) + dW t , X = 0 . (6.1)The solution of (6.1) is given by X it = arctan (cid:0) W it (cid:1) , i ∈ { , , , } ,X t = (cid:88) i =1 arsinh (cid:0) W it (cid:1) + W t . for t ∈ [0 , . Further, we consider the functional f ( x ) = cos (cid:32) (cid:88) i =1 x i (cid:33) − (cid:88) i =1 sin (cid:0) x i (cid:1) , Performing the full complexity analysis via Lagrange multipliers one can see that these parameter values are not optimal if κ ≤ (a Lagrange multiplier corresponding to a “ ≤ ” constraint is negative). Recall that in the case ofpiecewise polynomial regression (see [1] and recall Remark 5.1) we have κ = ν ( p +1)2 d ( p +1)+ dν . Let us note that in [1] it isrequired to choose the parameters p and ν according to p > d − and ν > d ( p +1)2( p +1) − d , which implies that κ > , for κ expressed via p and ν by the above formula. 
that is, we have
$$E[f(X_1)]=\big(E\big[\cos\big(\arctan(W^1_1)+\operatorname{arsinh}(W^1_1)\big)\big]\big)^4\,E\big[\cos(W^5_1)\big]\approx0.002$$
(the sine terms do not contribute, since they have zero expectation by symmetry). Here we consider weak schemes of the second order and compare the numerical performances of the SMC, MLMC, RCV, TRCV and TSRCV approaches. The latter one is the truncated version of the SRCV approach of [4]. Like the RCV algorithm, the SRCV one is based on (2.12); the difference is only in how the approximations of the coefficients $a_{j,o,U_1,U_2}$ are implemented in practice (while the RCV algorithm is a direct Monte Carlo regression, in the SRCV algorithm the regression is combined with a kind of "stratification"; see [4] for more detail). Therefore, the idea of the truncation (i.e. replacing (2.12) with (3.5)) applies also to the SRCV approach and gives us the TSRCV one.

For simplicity we implemented a global regression for the RCV, TRCV and TSRCV approaches (i.e. the one without considering the truncation operator in (5.1), as a part of the general description in Section 4). More precisely, we use quadratic polynomials (that is, $\prod_{i=1}^5x_i^{l_i}$, where $l_1,\dots,l_5\in\{0,1,2\}$ and $\sum_{i=1}^5l_i\le2$) as well as $f$ as basis functions; hence $\Psi_Q$ consists of $Q=\binom72+1=22$ basis functions. Note that we do not need to consider the random variables $V^{kl}_j$ in the second order weak scheme, since $L^k\sigma^{rl}(x)=0$ for $k\ne l$ (see (2.8)). This gives us fewer terms for the RCV approach, namely $3^m-1$ rather than $3^m2^{m(m-1)/2}-1$ terms in (2.12) (the factor $2^{m(m-1)/2}$ is no longer present). As for the TRCV and TSRCV approaches, this gives us only $\frac{m(m+3)}2=20$ compared to $m(m+1)=30$ terms in (3.5). We choose $\kappa=1.2$, which is related to the piecewise polynomial regression with polynomial degree $p=2$ (comparable to our setting) and the limiting case $\nu\to\infty$ (see footnote 4 on page 11).
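Both the basis count and the reference value can be reproduced with a few lines of standard-library Python; the quadrature tolerance and grid below are ad hoc choices, not taken from the paper:

```python
import math
from itertools import combinations_with_replacement

# Number of basis functions: monomials of total degree <= 2 in d = 5
# variables (including the constant monomial) plus f itself.
d = 5
monomials = [c for k in range(3)
             for c in combinations_with_replacement(range(d), k)]
Q = len(monomials) + 1          # 21 monomials + f

# Reference value E[f(X_1)] = (E[cos(arctan Z + arsinh Z)])^4 * E[cos Z],
# Z ~ N(0,1), via simple trapezoidal quadrature against the normal density.
def gauss_expect(g, lo=-8.0, hi=8.0, n=16000):
    h = (hi - lo) / n
    s = 0.0
    for i in range(n + 1):
        z = lo + i * h
        w = 0.5 if i in (0, n) else 1.0
        s += w * g(z) * math.exp(-0.5 * z * z)
    return s * h / math.sqrt(2 * math.pi)

e1 = gauss_expect(lambda z: math.cos(math.atan(z) + math.asinh(z)))
ref = e1 ** 4 * gauss_expect(math.cos)   # note E[cos Z] = exp(-1/2)
print(Q, round(ref, 5))
```

The sine terms of $f$ drop out of the expectation by symmetry, so only the product of the two one-dimensional integrals above remains.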
Moreover, for each considered accuracy level $\varepsilon=2^{-i}$ we set the parameters $J$, $N_r$ and $N$ for the RCV, TRCV and TSRCV approaches as follows (compare with the formulas in Subsection 5.1): $J=\lceil\varepsilon^{-0.5}\rceil$, while $N_r$ and $N$ are chosen proportional to the corresponding powers of $\varepsilon^{-1}$ from Subsection 5.1, with the factors 2048 (in $N_r$) and 512 (in $N$) included for stability purposes. For the TRCV and SMC algorithms we additionally consider one smaller value of $\varepsilon$, which produces a picture with approximately equal maximal computational time (that is, the time corresponding to the best accuracy) for all algorithms. Next we estimate the numerical complexity for the RCV, TRCV and TSRCV approaches by means of 100 independent simulations and compare it with the one for the SMC and MLMC approaches, for which we use the same output as in [1]. As can be seen from Figure 1, the estimated numerical complexity of each approach is of the form $\mathrm{RMSE}^{-\gamma}$, where the exponent $\gamma$ is obtained by regressing the log-time (logarithmic computing time of the whole algorithm in seconds) vs. the log-RMSE. Beyond the numerical complexities we observe that the truncation effect from the RCV algorithm to its truncated versions is huge. While we obtain poor results for the RCV approach (as in [1]), i.e. in this region of $\varepsilon$-values the RCV approach is numerically outperformed by the other ones, the TRCV and TSRCV approaches work best (even better than the SMC and MLMC approaches).

7. Proofs

Proof of Theorem 1.1. We begin with the following remark.
Assumptions (1.5) and (1.6) together with the conditional Cauchy-Schwarz inequality $|E[XY\mid\mathcal G]|\le\sqrt{E[X^2\mid\mathcal G]\,E[Y^2\mid\mathcal G]}$ imply that the following generalisation of (1.6) is satisfied: for any $n_1,n_2\in\mathbb N$ and $\alpha,\beta\in\mathbb N_0^d$ with $1\le|\alpha|\le K$, $1\le|\beta|\le K$, $\alpha\ne\beta$, it holds
$$\Big|E\Big[\big(D^\alpha\Phi^k_{\Delta,l+1}(G_{l,j}(x))\big)^{n_1}\big(D^\beta\Phi^k_{\Delta,l+1}(G_{l,j}(x))\big)^{n_2}\,\Big|\,\mathcal G_l\Big]\Big|\le C_{n_1,n_2}\Delta\tag{7.1}$$
for some appropriate constants $C_{n_1,n_2}>0$.

Let us begin with the case $K=1$. We have for some $k,r\in\{1,\dots,d\}$
$$\frac{\partial}{\partial x_r}G^k_{l+1,j}(x)=\sum_{s=1}^d\frac{\partial}{\partial x_s}\Phi^k_{\Delta,l+1}(G_{l,j}(x))\,\frac{\partial}{\partial x_r}G^s_{l,j}(x)=:\sum_{s=1}^d\gamma_s$$

[Figure 1. Numerical complexities of the RCV, TRCV, TSRCV, SMC and MLMC approaches.]

and $\frac{\partial}{\partial x_r}G^s_{j+1,j}(x)=\frac{\partial}{\partial x_r}\Phi^s_\Delta(x,\xi_{j+1})$, where $G^s_{l+1,j}$ and $\Phi^s_\Delta$, $s\in\{1,\dots,d\}$, denote the $s$-th components of the functions $G_{l+1,j}$ and $\Phi_\Delta$. Hence
$$E\bigg[\Big(\frac{\partial}{\partial x_r}G^k_{l+1,j}(x)\Big)^2\bigg]\le E\bigg[\gamma_k^2+\sum_{s\colon s\ne k}\big(2\gamma_k\gamma_s+(d-1)\gamma_s^2\big)\bigg].$$
For an arbitrary $j\in\{0,\dots,J-1\}$, denote $\rho^{r,s}_{l+1,n,1}:=E\big[\big(\frac{\partial}{\partial x_r}G^s_{l+1,j}(x)\big)^n\big]$; then, due to (1.5) and (7.1), we get for $l=j,\dots,J-1$
$$\rho^{r,k}_{l+1,2,1}\le(1+A\Delta)\rho^{r,k}_{l,2,1}+\sum_{s\colon s\ne k}\Big(C_{1,1}\Delta\big(\rho^{r,k}_{l,2,1}+\rho^{r,s}_{l,2,1}\big)+(d-1)B\Delta\,\rho^{r,s}_{l,2,1}\Big).$$
Further, denote $\rho^r_{l+1,n,1}:=\sum_{s=1}^d\rho^{r,s}_{l+1,n,1}$; then we get
$$\rho^r_{l+1,2,1}\le(1+A\Delta)\rho^r_{l,2,1}+2(d-1)C_{1,1}\Delta\,\rho^r_{l,2,1}+(d-1)^2B\Delta\,\rho^r_{l,2,1}.$$
This gives us $\rho^r_{l+1,2,1}\le(1+\kappa_1\Delta)\rho^r_{l,2,1}$ for some constant $\kappa_1>0$, leading to
$$\rho^r_{l,2,1}\le(1+\kappa_1\Delta)^{l-j-1}\rho^r_{j+1,2,1},\qquad l=j+1,\dots,J,\tag{7.2}$$
where $\rho^r_{j+1,2,1}=\sum_{s=1}^dE\big[\big(\frac{\partial}{\partial x_r}\Phi^s_\Delta(x,\xi_{j+1})\big)^2\big]$, which is bounded due to (1.5).
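The growth factor $(1+\kappa_1\Delta)^{l-j-1}$ appearing here (and repeatedly below) is bounded uniformly in $J$ because $\Delta=T/J$, so that $(1+\kappa_1T/J)^J\le e^{\kappa_1T}$. A quick numerical check of this elementary fact (the values of $T$ and the constant are arbitrary choices for illustration):

```python
import math

# The recursions rho_{l+1} <= (1 + kappa1 * Delta) * rho_l are iterated at
# most J times with Delta = T / J, so the accumulated growth factor is
# uniformly bounded: (1 + kappa1 * T / J)^J <= exp(kappa1 * T) for every J.
T, kappa1 = 1.0, 3.0
factors = [(1 + kappa1 * T / J) ** J for J in (1, 10, 100, 10000)]
bound = math.exp(kappa1 * T)
print([round(f, 4) for f in factors], round(bound, 4))
```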
Together with (7.2) we obtain the boundedness of $\{\rho^r_{J,2,1}:J\in\mathbb N\}$ and hence the boundedness of
$$\Big|\frac{\partial}{\partial x_r}q_j(x)\Big|\le\sum_{s=1}^dE\Big|\frac{\partial f}{\partial x_s}(G_{J,j}(x))\,\frac{\partial}{\partial x_r}G^s_{J,j}(x)\Big|\le\sum_{s=1}^d\sqrt{E\Big[\Big(\frac{\partial f}{\partial x_s}(G_{J,j}(x))\Big)^2\Big]\rho^{r,s}_{J,2,1}}\le\sqrt{d\sum_{s=1}^dE\Big[\Big(\frac{\partial f}{\partial x_s}(G_{J,j}(x))\Big)^2\Big]\rho^{r,s}_{J,2,1}}\le\mathrm{const}\sqrt{\rho^r_{J,2,1}}$$
for all $r\in\{1,\dots,d\}$, since $f$ is assumed to be continuously differentiable with bounded partial derivatives.

Let us proceed with the case $K=2$. We have, due to $\big(\sum_{k=1}^da_k\big)^n\le d^{n-1}\sum_{k=1}^da_k^n$,
$$E\bigg[\Big(\frac{\partial}{\partial x_r}G^k_{l+1,j}(x)\Big)^4\bigg]\le E\bigg[\gamma_k^4+\sum_{s\colon s\ne k}\big(4\gamma_k^3\gamma_s+6(d-1)\gamma_k^2\gamma_s^2+4(d-1)^2\gamma_k\gamma_s^3+(d-1)^3\gamma_s^4\big)\bigg]$$
and thus, due to $a^3b\le\frac{3a^4+b^4}4$ and $a^2b^2\le\frac{a^4+b^4}2$,
$$\rho^{r,k}_{l+1,4,1}\le(1+A\Delta)\rho^{r,k}_{l,4,1}+\sum_{s\colon s\ne k}\Big(C_{3,1}\Delta\big(3\rho^{r,k}_{l,4,1}+\rho^{r,s}_{l,4,1}\big)+3(d-1)C_{2,2}\Delta\big(\rho^{r,k}_{l,4,1}+\rho^{r,s}_{l,4,1}\big)+(d-1)^2C_{1,3}\Delta\big(\rho^{r,k}_{l,4,1}+3\rho^{r,s}_{l,4,1}\big)+(d-1)^3B\Delta\,\rho^{r,s}_{l,4,1}\Big).$$
This gives us
$$\rho^r_{l+1,4,1}\le(1+A\Delta)\rho^r_{l,4,1}+4(d-1)C_{3,1}\Delta\,\rho^r_{l,4,1}+6(d-1)^2C_{2,2}\Delta\,\rho^r_{l,4,1}+4(d-1)^3C_{1,3}\Delta\,\rho^r_{l,4,1}+(d-1)^4B\Delta\,\rho^r_{l,4,1}.$$
Hence, we obtain $\rho^r_{l+1,4,1}\le(1+\kappa_1\Delta)\rho^r_{l,4,1}$ for some constant $\kappa_1>0$, leading to
$$\rho^r_{l,4,1}\le(1+\kappa_1\Delta)^{l-j-1}\rho^r_{j+1,4,1},\qquad l=j+1,\dots,J,$$
where $\rho^r_{j+1,4,1}=\sum_{s=1}^dE\big[\big(\frac{\partial}{\partial x_r}\Phi^s_\Delta(x,\xi_{j+1})\big)^4\big]$.

Next, we have for some $k,o,r\in\{1,\dots,d\}$
$$\frac{\partial^2}{\partial x_r\partial x_o}G^k_{l+1,j}(x)=\sum_{s=1}^d\frac{\partial}{\partial x_s}\Phi^k_{\Delta,l+1}(G_{l,j}(x))\,\frac{\partial^2}{\partial x_r\partial x_o}G^s_{l,j}(x)+\sum_{s,u=1}^d\frac{\partial^2}{\partial x_s\partial x_u}\Phi^k_{\Delta,l+1}(G_{l,j}(x))\,\frac{\partial}{\partial x_r}G^s_{l,j}(x)\,\frac{\partial}{\partial x_o}G^u_{l,j}(x)=:\sum_{s=1}^d\eta_{1,s}+\sum_{s,u=1}^d\eta_{2,s,u}$$
and $\frac{\partial^2}{\partial x_r\partial x_o}G^s_{j+1,j}(x)=\frac{\partial^2}{\partial x_r\partial x_o}\Phi^s_\Delta(x,\xi_{j+1})$. Hence
$$E\bigg[\Big(\frac{\partial^2}{\partial x_r\partial x_o}G^k_{l+1,j}(x)\Big)^2\bigg]\le E\bigg[\eta_{1,k}^2+\sum_{s\colon s\ne k}\big(2\eta_{1,k}\eta_{1,s}+(d-1)\eta_{1,s}^2\big)+2\sum_{s,u,v=1}^d\eta_{1,v}\eta_{2,s,u}+d^2\sum_{s,u=1}^d\eta_{2,s,u}^2\bigg].$$
Denote $\rho^{r,o,s}_{l+1,n,2}:=E\big[\big(\frac{\partial^2}{\partial x_r\partial x_o}G^s_{l+1,j}(x)\big)^n\big]$; then we get, due to
$$E[XYZ]\le\sqrt{E[X^2]}\,\sqrt[4]{E[Y^4]}\,\sqrt[4]{E[Z^4]}\le\frac12E[X^2]+\frac12\sqrt{E[Y^4]}\sqrt{E[Z^4]}\le\frac12E[X^2]+\frac14\big(E[Y^4]+E[Z^4]\big),$$
as well as (1.5) and (7.1),
$$\rho^{r,o,k}_{l+1,2,2}\le(1+A\Delta)\rho^{r,o,k}_{l,2,2}+\sum_{s\colon s\ne k}\Big(C_{1,1}\Delta\big(\rho^{r,o,k}_{l,2,2}+\rho^{r,o,s}_{l,2,2}\big)+(d-1)B\Delta\,\rho^{r,o,s}_{l,2,2}\Big)+\sum_{s,u,v=1}^dC_{1,2}\Delta\Big(\rho^{r,o,v}_{l,2,2}+\frac12\big(\rho^{r,s}_{l,4,1}+\rho^{o,u}_{l,4,1}\big)\Big)+d^2\sum_{s,u=1}^dB\Delta\,\frac12\big(\rho^{r,s}_{l,4,1}+\rho^{o,u}_{l,4,1}\big).$$
Further, denote $\rho^{r,o}_{l+1,n,2}:=\sum_{s=1}^d\rho^{r,o,s}_{l+1,n,2}$; then we get for $l=j+1,\dots,J-1$
$$\rho^{r,o}_{l+1,2,2}\le(1+A\Delta)\rho^{r,o}_{l,2,2}+2(d-1)C_{1,1}\Delta\,\rho^{r,o}_{l,2,2}+(d-1)^2B\Delta\,\rho^{r,o}_{l,2,2}+d^3C_{1,2}\Delta\Big(\rho^{r,o}_{l,2,2}+\frac12\big(\rho^r_{l,4,1}+\rho^o_{l,4,1}\big)\Big)+d^4B\Delta\,\frac12\big(\rho^r_{l,4,1}+\rho^o_{l,4,1}\big).$$
This gives us $\rho^{r,o}_{l+1,2,2}\le(1+\kappa_2\Delta)\rho^{r,o}_{l,2,2}+\kappa_3\Delta$ for some constants $\kappa_2,\kappa_3>0$, leading to
$$\rho^{r,o}_{l,2,2}\le(1+\kappa_2\Delta)^{l-j-1}\rho^{r,o}_{j+1,2,2}+\kappa_4,\qquad l=j+1,\dots,J,$$
where $\kappa_4>0$ and $\rho^{r,o}_{j+1,2,2}=\sum_{s=1}^dE\big[\big(\frac{\partial^2}{\partial x_r\partial x_o}\Phi^s_\Delta(x,\xi_{j+1})\big)^2\big]$.
Thus, we obtain the boundedness of
$$\Big|\frac{\partial^2}{\partial x_r\partial x_o}q_j(x)\Big|\le\sum_{s=1}^dE\Big|\frac{\partial f}{\partial x_s}(G_{J,j}(x))\,\frac{\partial^2}{\partial x_r\partial x_o}G^s_{J,j}(x)\Big|+\sum_{s,u=1}^dE\Big|\frac{\partial^2f}{\partial x_s\partial x_u}(G_{J,j}(x))\,\frac{\partial}{\partial x_r}G^s_{J,j}(x)\,\frac{\partial}{\partial x_o}G^u_{J,j}(x)\Big|\le\sum_{s=1}^d\sqrt{E\Big[\Big(\frac{\partial f}{\partial x_s}(G_{J,j}(x))\Big)^2\Big]\rho^{r,o,s}_{J,2,2}}+\sum_{s,u=1}^d\sqrt{E\Big[\Big(\frac{\partial^2f}{\partial x_s\partial x_u}(G_{J,j}(x))\Big)^2\Big]}\,\big(\rho^{r,s}_{J,4,1}\,\rho^{o,u}_{J,4,1}\big)^{1/4}$$
for all $r,o\in\{1,\dots,d\}$, since $f$ is assumed to be twice continuously differentiable with bounded partial derivatives up to order 2.

Let us proceed with the final case $K=3$. We have
$$E\bigg[\Big(\frac{\partial}{\partial x_r}G^k_{l+1,j}(x)\Big)^6\bigg]\le E\bigg[\gamma_k^6+\sum_{s\colon s\ne k}\big(6\gamma_k^5\gamma_s+15(d-1)\gamma_k^4\gamma_s^2+20(d-1)^2\gamma_k^3\gamma_s^3+15(d-1)^3\gamma_k^2\gamma_s^4+6(d-1)^4\gamma_k\gamma_s^5+(d-1)^5\gamma_s^6\big)\bigg]$$
and thus, due to $a^5b\le\frac{5a^6+b^6}6$, $a^4b^2\le\frac{2a^6+b^6}3$ and $a^3b^3\le\frac{a^6+b^6}2$,
$$\rho^{r,k}_{l+1,6,1}\le(1+A\Delta)\rho^{r,k}_{l,6,1}+\sum_{s\colon s\ne k}\Big(C_{5,1}\Delta\big(5\rho^{r,k}_{l,6,1}+\rho^{r,s}_{l,6,1}\big)+5(d-1)C_{4,2}\Delta\big(2\rho^{r,k}_{l,6,1}+\rho^{r,s}_{l,6,1}\big)+10(d-1)^2C_{3,3}\Delta\big(\rho^{r,k}_{l,6,1}+\rho^{r,s}_{l,6,1}\big)+5(d-1)^3C_{2,4}\Delta\big(\rho^{r,k}_{l,6,1}+2\rho^{r,s}_{l,6,1}\big)+(d-1)^4C_{1,5}\Delta\big(\rho^{r,k}_{l,6,1}+5\rho^{r,s}_{l,6,1}\big)+(d-1)^5B\Delta\,\rho^{r,s}_{l,6,1}\Big).$$
This gives us
$$\rho^r_{l+1,6,1}\le(1+A\Delta)\rho^r_{l,6,1}+6(d-1)C_{5,1}\Delta\,\rho^r_{l,6,1}+15(d-1)^2C_{4,2}\Delta\,\rho^r_{l,6,1}+20(d-1)^3C_{3,3}\Delta\,\rho^r_{l,6,1}+15(d-1)^4C_{2,4}\Delta\,\rho^r_{l,6,1}+6(d-1)^5C_{1,5}\Delta\,\rho^r_{l,6,1}+(d-1)^6B\Delta\,\rho^r_{l,6,1}.$$
Hence, we obtain $\rho^r_{l+1,6,1}\le(1+\kappa_1\Delta)\rho^r_{l,6,1}$ for some constant $\kappa_1>0$, leading to
$$\rho^r_{l,6,1}\le(1+\kappa_1\Delta)^{l-j-1}\rho^r_{j+1,6,1},\qquad l=j+1,\dots,J,$$
where $\rho^r_{j+1,6,1}=\sum_{s=1}^dE\big[\big(\frac{\partial}{\partial x_r}\Phi^s_\Delta(x,\xi_{j+1})\big)^6\big]$.

Moreover, we have
$$E\bigg[\Big(\frac{\partial}{\partial x_r}G^k_{l+1,j}(x)\Big)^8\bigg]\le E\bigg[\gamma_k^8+\sum_{s\colon s\ne k}\big(8\gamma_k^7\gamma_s+28(d-1)\gamma_k^6\gamma_s^2+56(d-1)^2\gamma_k^5\gamma_s^3+70(d-1)^3\gamma_k^4\gamma_s^4+56(d-1)^4\gamma_k^3\gamma_s^5+28(d-1)^5\gamma_k^2\gamma_s^6+8(d-1)^6\gamma_k\gamma_s^7+(d-1)^7\gamma_s^8\big)\bigg]$$
and thus, due to $a^7b\le\frac{7a^8+b^8}8$, $a^6b^2\le\frac{3a^8+b^8}4$, $a^5b^3\le\frac{5a^8+3b^8}8$ and $a^4b^4\le\frac{a^8+b^8}2$,
$$\rho^{r,k}_{l+1,8,1}\le(1+A\Delta)\rho^{r,k}_{l,8,1}+\sum_{s\colon s\ne k}\Big(C_{7,1}\Delta\big(7\rho^{r,k}_{l,8,1}+\rho^{r,s}_{l,8,1}\big)+7(d-1)C_{6,2}\Delta\big(3\rho^{r,k}_{l,8,1}+\rho^{r,s}_{l,8,1}\big)+7(d-1)^2C_{5,3}\Delta\big(5\rho^{r,k}_{l,8,1}+3\rho^{r,s}_{l,8,1}\big)+35(d-1)^3C_{4,4}\Delta\big(\rho^{r,k}_{l,8,1}+\rho^{r,s}_{l,8,1}\big)+7(d-1)^4C_{3,5}\Delta\big(3\rho^{r,k}_{l,8,1}+5\rho^{r,s}_{l,8,1}\big)+7(d-1)^5C_{2,6}\Delta\big(\rho^{r,k}_{l,8,1}+3\rho^{r,s}_{l,8,1}\big)+(d-1)^6C_{1,7}\Delta\big(\rho^{r,k}_{l,8,1}+7\rho^{r,s}_{l,8,1}\big)+(d-1)^7B\Delta\,\rho^{r,s}_{l,8,1}\Big).$$
This gives us
$$\rho^r_{l+1,8,1}\le(1+A\Delta)\rho^r_{l,8,1}+8(d-1)C_{7,1}\Delta\,\rho^r_{l,8,1}+28(d-1)^2C_{6,2}\Delta\,\rho^r_{l,8,1}+56(d-1)^3C_{5,3}\Delta\,\rho^r_{l,8,1}+70(d-1)^4C_{4,4}\Delta\,\rho^r_{l,8,1}+56(d-1)^5C_{3,5}\Delta\,\rho^r_{l,8,1}+28(d-1)^6C_{2,6}\Delta\,\rho^r_{l,8,1}+8(d-1)^7C_{1,7}\Delta\,\rho^r_{l,8,1}+(d-1)^8B\Delta\,\rho^r_{l,8,1}.$$
Hence, we obtain $\rho^r_{l+1,8,1}\le(1+\kappa_1\Delta)\rho^r_{l,8,1}$ for some constant $\kappa_1>0$, leading to
$$\rho^r_{l,8,1}\le(1+\kappa_1\Delta)^{l-j-1}\rho^r_{j+1,8,1},\qquad l=j+1,\dots,J,$$
where $\rho^r_{j+1,8,1}=\sum_{s=1}^dE\big[\big(\frac{\partial}{\partial x_r}\Phi^s_\Delta(x,\xi_{j+1})\big)^8\big]$.
Moreover, we have
$$E\bigg[\Big(\frac{\partial^2}{\partial x_r\partial x_o}G^k_{l+1,j}(x)\Big)^4\bigg]\le E\bigg[\eta_{1,k}^4+\sum_{s\colon s\ne k}\big(4\eta_{1,k}^3\eta_{1,s}+6(d-1)\eta_{1,k}^2\eta_{1,s}^2+4(d-1)^2\eta_{1,k}\eta_{1,s}^3+(d-1)^3\eta_{1,s}^4\big)+\sum_{s,u,v=1}^d\big(4d^2\eta_{1,v}^3\eta_{2,s,u}+6d^3\eta_{1,v}^2\eta_{2,s,u}^2+4d^4\eta_{1,v}\eta_{2,s,u}^3\big)+d^6\sum_{s,u=1}^d\eta_{2,s,u}^4\bigg]$$
and thus, due to $a^3bc\le\frac34a^4+\frac18\big(b^8+c^8\big)$, $a^2b^2c^2\le\frac12a^4+\frac14\big(b^8+c^8\big)$ and $ab^3c^3\le\frac14a^4+\frac38\big(b^8+c^8\big)$,
$$\rho^{r,o,k}_{l+1,4,2}\le(1+A\Delta)\rho^{r,o,k}_{l,4,2}+\sum_{s\colon s\ne k}\Big(C_{3,1}\Delta\big(3\rho^{r,o,k}_{l,4,2}+\rho^{r,o,s}_{l,4,2}\big)+3(d-1)C_{2,2}\Delta\big(\rho^{r,o,k}_{l,4,2}+\rho^{r,o,s}_{l,4,2}\big)+(d-1)^2C_{1,3}\Delta\big(\rho^{r,o,k}_{l,4,2}+3\rho^{r,o,s}_{l,4,2}\big)+(d-1)^3B\Delta\,\rho^{r,o,s}_{l,4,2}\Big)$$
$$+\sum_{s,u,v=1}^d\bigg(d^2C_{3,1}\Delta\Big(3\rho^{r,o,v}_{l,4,2}+\frac12\big(\rho^{r,s}_{l,8,1}+\rho^{o,u}_{l,8,1}\big)\Big)+3d^3C_{2,2}\Delta\Big(\rho^{r,o,v}_{l,4,2}+\frac12\big(\rho^{r,s}_{l,8,1}+\rho^{o,u}_{l,8,1}\big)\Big)+d^4C_{1,3}\Delta\Big(\rho^{r,o,v}_{l,4,2}+\frac32\big(\rho^{r,s}_{l,8,1}+\rho^{o,u}_{l,8,1}\big)\Big)\bigg)+d^6\sum_{s,u=1}^dB\Delta\,\frac12\big(\rho^{r,s}_{l,8,1}+\rho^{o,u}_{l,8,1}\big).$$
This gives us
$$\rho^{r,o}_{l+1,4,2}\le(1+A\Delta)\rho^{r,o}_{l,4,2}+4(d-1)C_{3,1}\Delta\,\rho^{r,o}_{l,4,2}+6(d-1)^2C_{2,2}\Delta\,\rho^{r,o}_{l,4,2}+4(d-1)^3C_{1,3}\Delta\,\rho^{r,o}_{l,4,2}+(d-1)^4B\Delta\,\rho^{r,o}_{l,4,2}+d^5C_{3,1}\Delta\Big(3\rho^{r,o}_{l,4,2}+\frac12\big(\rho^r_{l,8,1}+\rho^o_{l,8,1}\big)\Big)+3d^6C_{2,2}\Delta\Big(\rho^{r,o}_{l,4,2}+\frac12\big(\rho^r_{l,8,1}+\rho^o_{l,8,1}\big)\Big)+d^7C_{1,3}\Delta\Big(\rho^{r,o}_{l,4,2}+\frac32\big(\rho^r_{l,8,1}+\rho^o_{l,8,1}\big)\Big)+d^8B\Delta\,\frac12\big(\rho^r_{l,8,1}+\rho^o_{l,8,1}\big).$$
Hence, we obtain $\rho^{r,o}_{l+1,4,2}\le(1+\kappa_2\Delta)\rho^{r,o}_{l,4,2}+\kappa_3\Delta$ for some constants $\kappa_2,\kappa_3>0$, leading to
$$\rho^{r,o}_{l,4,2}\le(1+\kappa_2\Delta)^{l-j-1}\rho^{r,o}_{j+1,4,2}+\kappa_4,\qquad l=j+1,\dots,J,$$
where $\kappa_4>0$ and $\rho^{r,o}_{j+1,4,2}=\sum_{s=1}^dE\big[\big(\frac{\partial^2}{\partial x_r\partial x_o}\Phi^s_\Delta(x,\xi_{j+1})\big)^4\big]$.

Next, we have for some $k,o,r,z\in\{1,\dots,d\}$
$$\frac{\partial^3}{\partial x_r\partial x_o\partial x_z}G^k_{l+1,j}(x)=\sum_{s=1}^d\frac{\partial}{\partial x_s}\Phi^k_{\Delta,l+1}(G_{l,j}(x))\,\frac{\partial^3}{\partial x_r\partial x_o\partial x_z}G^s_{l,j}(x)+\sum_{s,u=1}^d\frac{\partial^2}{\partial x_s\partial x_u}\Phi^k_{\Delta,l+1}(G_{l,j}(x))\Big(\frac{\partial^2G^s_{l,j}}{\partial x_r\partial x_o}\frac{\partial G^u_{l,j}}{\partial x_z}+\frac{\partial^2G^s_{l,j}}{\partial x_r\partial x_z}\frac{\partial G^u_{l,j}}{\partial x_o}+\frac{\partial G^s_{l,j}}{\partial x_r}\frac{\partial^2G^u_{l,j}}{\partial x_o\partial x_z}\Big)+\sum_{s,u,v=1}^d\frac{\partial^3}{\partial x_s\partial x_u\partial x_v}\Phi^k_{\Delta,l+1}(G_{l,j}(x))\,\frac{\partial G^s_{l,j}}{\partial x_r}\frac{\partial G^u_{l,j}}{\partial x_o}\frac{\partial G^v_{l,j}}{\partial x_z}=:\sum_{s=1}^d\psi_{1,s}+\sum_{s,u=1}^d\psi_{2,s,u}+\sum_{s,u,v=1}^d\psi_{3,s,u,v}$$
and $\frac{\partial^3}{\partial x_r\partial x_o\partial x_z}G^s_{j+1,j}(x)=\frac{\partial^3}{\partial x_r\partial x_o\partial x_z}\Phi^s_\Delta(x,\xi_{j+1})$. Hence
$$E\bigg[\Big(\frac{\partial^3}{\partial x_r\partial x_o\partial x_z}G^k_{l+1,j}(x)\Big)^2\bigg]\le E\bigg[\psi_{1,k}^2+\sum_{s\colon s\ne k}\big(2\psi_{1,k}\psi_{1,s}+(d-1)\psi_{1,s}^2\big)+2\sum_{s,u,v=1}^d\psi_{1,v}\psi_{2,s,u}+2\sum_{s,u,v,w=1}^d\psi_{1,w}\psi_{3,s,u,v}+2d^2\sum_{s,u=1}^d\psi_{2,s,u}^2+2d^3\sum_{s,u,v=1}^d\psi_{3,s,u,v}^2\bigg].$$
Denote $\rho^{r,o,z,s}_{l+1,n,3}:=E\big[\big(\frac{\partial^3}{\partial x_r\partial x_o\partial x_z}G^s_{l+1,j}(x)\big)^n\big]$; then we get, due to $a^2b^2c^2\le\frac13\big(a^6+b^6+c^6\big)$ and
$$E[XYZU]\le\sqrt{E[X^2]}\,\sqrt[6]{E[Y^6]}\,\sqrt[6]{E[Z^6]}\,\sqrt[6]{E[U^6]}\le\frac12E[X^2]+\frac12\big(E[Y^6]E[Z^6]E[U^6]\big)^{1/3}\le\frac12E[X^2]+\frac16\big(E[Y^6]+E[Z^6]+E[U^6]\big),$$
as well as (1.5) and (7.1),
$$\rho^{r,o,z,k}_{l+1,2,3}\le(1+A\Delta)\rho^{r,o,z,k}_{l,2,3}+\sum_{s\colon s\ne k}\Big(C_{1,1}\Delta\big(\rho^{r,o,z,k}_{l,2,3}+\rho^{r,o,z,s}_{l,2,3}\big)+(d-1)B\Delta\,\rho^{r,o,z,s}_{l,2,3}\Big)+\sum_{s,u,v=1}^dC_{1,2}\Delta\Big(\rho^{r,o,z,v}_{l,2,3}+\frac12\big(\rho^{r,s}_{l,4,1}+\rho^{o,u}_{l,4,1}+\rho^{z,u}_{l,4,1}+\rho^{r,o,s}_{l,4,2}+\rho^{r,z,s}_{l,4,2}+\rho^{o,z,u}_{l,4,2}\big)\Big)+\sum_{s,u,v,w=1}^dC_{1,3}\Delta\Big(\rho^{r,o,z,w}_{l,2,3}+\frac13\big(\rho^{r,s}_{l,6,1}+\rho^{o,u}_{l,6,1}+\rho^{z,v}_{l,6,1}\big)\Big)+3d^2\sum_{s,u=1}^dB\Delta\big(\rho^{r,s}_{l,4,1}+\rho^{o,u}_{l,4,1}+\rho^{z,u}_{l,4,1}+\rho^{r,o,s}_{l,4,2}+\rho^{r,z,s}_{l,4,2}+\rho^{o,z,u}_{l,4,2}\big)+2d^3\sum_{s,u,v=1}^dB\Delta\,\frac13\big(\rho^{r,s}_{l,6,1}+\rho^{o,u}_{l,6,1}+\rho^{z,v}_{l,6,1}\big).$$
Further, denote $\rho^{r,o,z}_{l+1,2,3}:=\sum_{s=1}^d\rho^{r,o,z,s}_{l+1,2,3}$; then we get
$$\rho^{r,o,z}_{l+1,2,3}\le(1+A\Delta)\rho^{r,o,z}_{l,2,3}+2(d-1)C_{1,1}\Delta\,\rho^{r,o,z}_{l,2,3}+(d-1)^2B\Delta\,\rho^{r,o,z}_{l,2,3}+d^3C_{1,2}\Delta\Big(\rho^{r,o,z}_{l,2,3}+\frac12\big(\rho^r_{l,4,1}+\rho^o_{l,4,1}+\rho^z_{l,4,1}+\rho^{r,o}_{l,4,2}+\rho^{r,z}_{l,4,2}+\rho^{o,z}_{l,4,2}\big)\Big)+d^4C_{1,3}\Delta\Big(\rho^{r,o,z}_{l,2,3}+\frac13\big(\rho^r_{l,6,1}+\rho^o_{l,6,1}+\rho^z_{l,6,1}\big)\Big)+3d^4B\Delta\big(\rho^r_{l,4,1}+\rho^o_{l,4,1}+\rho^z_{l,4,1}+\rho^{r,o}_{l,4,2}+\rho^{r,z}_{l,4,2}+\rho^{o,z}_{l,4,2}\big)+2d^6B\Delta\,\frac13\big(\rho^r_{l,6,1}+\rho^o_{l,6,1}+\rho^z_{l,6,1}\big).$$
This gives us $\rho^{r,o,z}_{l+1,2,3}\le(1+\kappa_2\Delta)\rho^{r,o,z}_{l,2,3}+\kappa_3\Delta$ for some constants $\kappa_2,\kappa_3>0$, leading to
$$\rho^{r,o,z}_{l,2,3}\le(1+\kappa_2\Delta)^{l-j-1}\rho^{r,o,z}_{j+1,2,3}+\kappa_4,\qquad l=j+1,\dots,J,$$
where $\kappa_4>0$ and $\rho^{r,o,z}_{j+1,2,3}=\sum_{s=1}^dE\big[\big(\frac{\partial^3}{\partial x_r\partial x_o\partial x_z}\Phi^s_\Delta(x,\xi_{j+1})\big)^2\big]$. Thus, we obtain the boundedness of
$$\Big|\frac{\partial^3}{\partial x_r\partial x_o\partial x_z}q_j(x)\Big|\le\sum_{s=1}^dE\Big|\frac{\partial f}{\partial x_s}(G_{J,j}(x))\,\frac{\partial^3}{\partial x_r\partial x_o\partial x_z}G^s_{J,j}(x)\Big|+\sum_{s,u=1}^dE\Big|\frac{\partial^2f}{\partial x_s\partial x_u}(G_{J,j}(x))\Big(\frac{\partial^2G^s_{J,j}}{\partial x_r\partial x_o}\frac{\partial G^u_{J,j}}{\partial x_z}+\frac{\partial^2G^s_{J,j}}{\partial x_r\partial x_z}\frac{\partial G^u_{J,j}}{\partial x_o}+\frac{\partial G^s_{J,j}}{\partial x_r}\frac{\partial^2G^u_{J,j}}{\partial x_o\partial x_z}\Big)\Big|+\sum_{s,u,v=1}^dE\Big|\frac{\partial^3f}{\partial x_s\partial x_u\partial x_v}(G_{J,j}(x))\,\frac{\partial G^s_{J,j}}{\partial x_r}\frac{\partial G^u_{J,j}}{\partial x_o}\frac{\partial G^v_{J,j}}{\partial x_z}\Big|$$
$$\le\sum_{s=1}^d\sqrt{E\Big[\Big(\frac{\partial f}{\partial x_s}(G_{J,j}(x))\Big)^2\Big]\rho^{r,o,z,s}_{J,2,3}}+\sum_{s,u=1}^d\sqrt{E\Big[\Big(\frac{\partial^2f}{\partial x_s\partial x_u}(G_{J,j}(x))\Big)^2\Big]}\Big(\big(\rho^{r,o,s}_{J,4,2}\,\rho^{z,u}_{J,4,1}\big)^{1/4}+\big(\rho^{r,z,s}_{J,4,2}\,\rho^{o,u}_{J,4,1}\big)^{1/4}+\big(\rho^{r,s}_{J,4,1}\,\rho^{o,z,u}_{J,4,2}\big)^{1/4}\Big)+\sum_{s,u,v=1}^d\sqrt{E\Big[\Big(\frac{\partial^3f}{\partial x_s\partial x_u\partial x_v}(G_{J,j}(x))\Big)^2\Big]}\,\big(\rho^{r,s}_{J,6,1}\,\rho^{o,u}_{J,6,1}\,\rho^{z,v}_{J,6,1}\big)^{1/6}$$
for all $r,o,z\in\{1,\dots,d\}$, since $f$ is assumed to be three times continuously differentiable with bounded partial derivatives up to order 3.

7.1. Proof of Theorem 3.1. (i) Straightforward.

(ii) Let us define $\mu_\Delta(x):=x+\mu(x)\Delta$. Then we obtain via Taylor's theorem (cf. (2.2))
$$q_j(\Phi_\Delta(x,y))=q_j(\mu_\Delta(x))+\sqrt\Delta\sum_{k=1}^d\sum_{i=1}^m\sigma^{ki}(x)\,y^i\int_0^1\frac{\partial q_j}{\partial x_k}\big(\mu_\Delta(x)+t\sigma(x)\sqrt\Delta\,y\big)\,dt.$$
This gives us (see (2.5))
$$a_{j,r,s}(x)=\frac1{2^m}\sum_{y\in\{-1,1\}^m}q_j(\Phi_\Delta(x,y))\prod_{o=1}^ry^{s_o}=\frac{\sqrt\Delta}{2^m}\sum_{y\in\{-1,1\}^m}\Big(\prod_{o=1}^ry^{s_o}\Big)\sum_{k=1}^d\sum_{i=1}^m\sigma^{ki}(x)\,y^i\int_0^1\frac{\partial q_j}{\partial x_k}\big(\mu_\Delta(x)+t\sigma(x)\sqrt\Delta\,y\big)\,dt,\tag{7.3}$$
since
$$\frac1{2^m}\sum_{y\in\{-1,1\}^m}\prod_{o=1}^ry^{s_o}=E\bigg[\prod_{o=1}^r\xi^{s_o}_j\bigg]=0.$$
Next we apply Theorem 1.1 for the case $K=1$ to get that all the functions $q_j$ are continuously differentiable with bounded partial derivatives. Clearly, the assumptions in this theorem hold when all the functions $f,\mu^k,\sigma^{ki}$, $k\in\{1,\dots,d\}$, $i\in\{1,\dots,m\}$, are continuously differentiable with bounded derivatives. Together with the assumption that all the functions $\sigma^{ki}$ are bounded, we get from (7.3) that $a_{j,r,s}$ is of order $\sqrt\Delta$ for all $j,r,s$.

7.2. Proof of Theorem 3.2. Let us consider a higher order Taylor expansion compared to the proof of Theorem 3.1 and recall that $\mu_\Delta(x)=x+\mu(x)\Delta$. We have for any $y\in\{-1,1\}^m$
$$q_j(\Phi_\Delta(x,y))=q_j(\mu_\Delta(x))+\sqrt\Delta\sum_{k=1}^d\frac{\partial}{\partial x_k}q_j(\mu_\Delta(x))\sum_{i=1}^m\sigma^{ki}(x)\,y^i+\Delta\sum_{1\le k\le l\le d}(2-\delta_{k,l})\int_0^1(1-t)\frac{\partial^2}{\partial x_k\partial x_l}q_j\big(\mu_\Delta(x)+t\sqrt\Delta\,\sigma(x)y\big)\,dt\,\sum_{i=1}^m\sigma^{ki}(x)\,y^i\sum_{i=1}^m\sigma^{li}(x)\,y^i,\tag{7.4}$$
where $\delta_{\cdot,\cdot}$ is the Kronecker delta. This gives us for $r\ge2$ (cf. (2.5))
$$a_{j,r,s}(x)=\frac1{2^m}\sum_{y\in\{-1,1\}^m}q_j(\Phi_\Delta(x,y))\prod_{o=1}^ry^{s_o}=\frac\Delta{2^m}\sum_{1\le k\le l\le d}(2-\delta_{k,l})\sum_{y\in\{-1,1\}^m}\bigg(\sum_{i=1}^m\sigma^{ki}(x)\,y^i\sum_{i=1}^m\sigma^{li}(x)\,y^i\prod_{o=1}^ry^{s_o}\int_0^1(1-t)\frac{\partial^2}{\partial x_k\partial x_l}q_j\big(\mu_\Delta(x)+t\sqrt\Delta\,\sigma(x)y\big)\,dt\bigg),\tag{7.5}$$
due to (cf. (7.4))
$$\frac1{2^m}\sum_{y\in\{-1,1\}^m}y^i\prod_{o=1}^ry^{s_o}=E\bigg[\xi^i_j\prod_{o=1}^r\xi^{s_o}_j\bigg]=0\tag{7.6}$$
for all $i\in\{1,\dots,m\}$. (Note that (7.6) does not hold for $r=1$.) Applying Theorem 1.1 (case $K=2$), we get that $q_j$ is twice continuously differentiable with bounded partial derivatives up to order 2, provided that all the functions $f,\mu^k,\sigma^{ki}$ are twice continuously differentiable with bounded partial derivatives up to order 2. Together with the assumption that all the functions $\sigma^{ki}$ are bounded, we get from (7.5) that $a_{j,r,s}$ is of order $\Delta$ for all $j,r,s$ with $r\ge2$.

7.3. Proof of Theorem 3.4. Here we apply Theorem 3.2, which gives us (cf. (2.6))
$$\operatorname{Var}\big[f(X_{\Delta,T})-M^{(1),\mathrm{trunc}}_{\Delta,T}\big]=\operatorname{Var}\big[M^{(1)}_{\Delta,T}-M^{(1),\mathrm{trunc}}_{\Delta,T}\big]=\operatorname{Var}\bigg[\sum_{j=1}^J\sum_{r=2}^m\sum_{1\le s_1<\dots<s_r\le m}a_{j,r,s}(X_{\Delta,(j-1)\Delta})\prod_{o=1}^r\xi^{s_o}_j\bigg]=\sum_{j=1}^J\sum_{r=2}^m\sum_{1\le s_1<\dots<s_r\le m}E\big[a^2_{j,r,s}(X_{\Delta,(j-1)\Delta})\big]\lesssim J\Delta^2\asymp\Delta,$$
since, by Theorem 3.2, all coefficients $a_{j,r,s}$ with $r\ge2$ are of order $\Delta$.

7.4. Proof of Theorem 3.5. The proof works similarly to the one of Theorem 3.1. More precisely, here we define (cf. (2.8))
$$\mu_\Delta(x):=x+\mu(x)\Delta+\frac12L\mu(x)\Delta^2.$$
Then we derive the zero-order Taylor expansion for $q_j(\Phi_\Delta(x,y,z))$ around $\mu_\Delta(x)$, use that
$$E\bigg[\prod_{r\in U_1}H_{o_r}(\xi^r_j)\prod_{(k,l)\in U_2}V^{kl}_j\bigg]=0,$$
and observe that all components $\widetilde\Phi^k_\Delta(x,y,z):=\Phi^k_\Delta(x,y,z)-\mu^k_\Delta(x)$, $k\in\{1,\dots,d\}$ (as an analogue of $\sqrt\Delta\sum_{i=1}^m\sigma^{ki}(x)y^i$ in the case of the weak Euler scheme), are of order $\sqrt\Delta$ under less strict assumptions than required in the present theorem.
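The $\sqrt\Delta$-scaling of the first-order coefficients established above can be checked numerically in a scalar toy example. All concrete choices below ($q=\sin$, $\mu(x)=-x$, $\sigma\equiv1$, the point $x_0=0.3$) are hypothetical illustrations, not objects from the paper; the coefficient is computed exactly as the average over $y\in\{-1,1\}$:

```python
import math

# One-dimensional illustration of the sqrt(Delta) order of the regression
# coefficients (hypothetical choices: q = sin, mu(x) = -x, sigma = 1).
def phi(x, dt, y):                       # weak Euler step, y in {-1, +1}
    return x + (-x) * dt + math.sqrt(dt) * y

def coeff(x, dt, q=math.sin):            # a(x) = E[q(Phi_Delta(x, xi)) * xi]
    return 0.5 * (q(phi(x, dt, 1.0)) - q(phi(x, dt, -1.0)))

x0 = 0.3
ratios = [coeff(x0, 2.0 ** -k) / math.sqrt(2.0 ** -k) for k in (4, 8, 12, 16)]
print([round(r, 6) for r in ratios])     # stabilises near sigma * q'(x0) = cos(0.3)
```

Here $a(x,\Delta)=\cos(x-x\Delta)\sin(\sqrt\Delta)$ in closed form, so $a/\sqrt\Delta\to\cos(x)$ as $\Delta\to0$, in line with the $\sqrt\Delta$ order of $a_{j,r,s}$.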
Finally we apply Theorem 1.1 (case $K=1$), which gives us that $q_j$ is continuously differentiable with bounded partial derivatives under the assumptions that all functions $\mu^k$ and $\sigma^{ki}$ are bounded and all the functions $f,\mu^k,\sigma^{ki}$ are three times continuously differentiable with bounded partial derivatives up to order 3. Consequently, all the functions $a_{j,o,U_1,U_2}$ are of order $\sqrt\Delta$.

7.5. Proof of Theorem 3.6. (i) The proof works similarly to the one of Theorem 3.2, that is, we consider a Taylor expansion for $q_j(\Phi_\Delta(x,y,z))$ of order 1, around the same point $\mu_\Delta(x)$ as in the proof of Theorem 3.5. Then we use
$$E\bigg[\widetilde\Phi^k_\Delta(x,\xi_j,V_j)\prod_{r\in U_1}H_{o_r}(\xi^r_j)\prod_{(k',l')\in U_2}V^{k'l'}_j\bigg]=0,\qquad k\in\{1,\dots,d\},$$
whenever $|U_1|+|\mathcal K_1|+|\mathcal K_2|\ge2$ (where again $\widetilde\Phi^k_\Delta(x,y,z)=\Phi^k_\Delta(x,y,z)-\mu^k_\Delta(x)$). Then we apply Theorem 1.1 (case $K=2$), which gives us that $q_j$ is twice continuously differentiable with bounded partial derivatives up to order 2 under the assumptions that all functions $\mu^k$ and $\sigma^{ki}$ are bounded and all the functions $f,\mu^k,\sigma^{ki}$ are four times continuously differentiable with bounded partial derivatives up to order 4. Finally, we get that all the functions $a_{j,o,U_1,U_2}$ are of order $\Delta$, since the product of any two functions $\widetilde\Phi^k_\Delta(x,y,z)\widetilde\Phi^l_\Delta(x,y,z)$, $k,l\in\{1,\dots,d\}$, is of order $\Delta$ under the above assumptions.

(ii) Here we consider the Taylor expansion of order 2, that is,
$$q_j(\Phi_\Delta(x,y,z))=q_j(\mu_\Delta(x))+\sum_{k=1}^d\frac{\partial}{\partial x_k}q_j(\mu_\Delta(x))\,\widetilde\Phi^k_\Delta(x,y,z)+\sum_{1\le k\le l\le d}\frac12(2-\delta_{k,l})\frac{\partial^2}{\partial x_k\partial x_l}q_j(\mu_\Delta(x))\,\widetilde\Phi^k_\Delta(x,y,z)\widetilde\Phi^l_\Delta(x,y,z)+\sum_{1\le k\le l\le n\le d}\Big(3-\frac32\big(\delta_{k,l}+\delta_{k,n}+\delta_{l,n}\big)+2\delta_{k,l}\delta_{k,n}\delta_{l,n}\Big)\widetilde\Phi^k_\Delta(x,y,z)\widetilde\Phi^l_\Delta(x,y,z)\widetilde\Phi^n_\Delta(x,y,z)\int_0^1(1-t)^2\frac{\partial^3}{\partial x_k\partial x_l\partial x_n}q_j\big(\mu_\Delta(x)+t\widetilde\Phi_\Delta(x,y,z)\big)\,dt.$$
Next we use
$$E\bigg[\widetilde\Phi^k_\Delta(x,\xi_j,V_j)\,\widetilde\Phi^l_\Delta(x,\xi_j,V_j)\prod_{r\in U_1}H_{o_r}(\xi^r_j)\prod_{(k',l')\in U_2}V^{k'l'}_j\bigg]=0,\qquad k,l\in\{1,\dots,d\},$$
whenever $|U_1|+|\mathcal K_1|+|\mathcal K_2|>2$, and thus we obtain (cf. (2.11))
$$a_{j,o,U_1,U_2}(x)=\frac1{6^m\,2^{m(m-1)/2}}\sum_{y\in\{-\sqrt3,0,\sqrt3\}^m}\ \sum_{z\in\{-1,1\}^{m(m-1)/2}}4^{\sum_{i=1}^mI(y_i=0)}\prod_{r\in U_1}H_{o_r}(y^r)\prod_{(k,l)\in U_2}z^{kl}\;q_j(\Phi_\Delta(x,y,z))$$
$$=\frac1{6^m\,2^{m(m-1)/2}}\sum_{y\in\{-\sqrt3,0,\sqrt3\}^m}\ \sum_{z\in\{-1,1\}^{m(m-1)/2}}4^{\sum_{i=1}^mI(y_i=0)}\prod_{r\in U_1}H_{o_r}(y^r)\prod_{(k,l)\in U_2}z^{kl}\sum_{1\le k'\le l'\le n'\le d}\Big(3-\frac32\big(\delta_{k',l'}+\delta_{k',n'}+\delta_{l',n'}\big)+2\delta_{k',l'}\delta_{k',n'}\delta_{l',n'}\Big)\widetilde\Phi^{k'}_\Delta(x,y,z)\widetilde\Phi^{l'}_\Delta(x,y,z)\widetilde\Phi^{n'}_\Delta(x,y,z)\int_0^1(1-t)^2\frac{\partial^3}{\partial x_{k'}\partial x_{l'}\partial x_{n'}}q_j\big(\mu_\Delta(x)+t\widetilde\Phi_\Delta(x,y,z)\big)\,dt.$$
Then we apply Theorem 1.1 (case $K=3$), which gives us that $q_j$ is three times continuously differentiable with bounded partial derivatives up to order 3 under the assumptions that all functions $\mu^k$ and $\sigma^{ki}$ are bounded and all the functions $f,\mu^k,\sigma^{ki}$ are five times continuously differentiable with bounded partial derivatives up to order 5. Finally, we get that all the functions $a_{j,o,U_1,U_2}$ are of order $\Delta^{3/2}$, since the product of any three functions $\widetilde\Phi^k_\Delta(x,y,z)\widetilde\Phi^l_\Delta(x,y,z)\widetilde\Phi^n_\Delta(x,y,z)$, $k,l,n\in\{1,\dots,d\}$, is of order $\Delta^{3/2}$ under the above assumptions.

7.6. Proof of Theorem 3.8. The proof is similar to the one of Theorem 3.4.

7.7. Proof of Lemma 5.2. We refer to Theorem 11.3 in [6]. When applying it, we obtain actually
$$E\|\tilde a_{j,o,U_1,U_2}-a_{j,o,U_1,U_2}\|^2_{L^2(P_{\Delta,(j-1)\Delta})}\le\tilde c\max\big\{\Sigma,\;\tilde A\Delta_{U_1,U_2}\big\}\frac{(\log N_r+1)\,Q}{N_r}+8C_\kappa Q^{-\kappa}.\tag{7.7}$$
However, the maximum in (7.7) is in fact a sum of the two terms $\Sigma$ and $\tilde A\Delta_{U_1,U_2}(\log N_r+1)$, so that the logarithm is only included in one term (see the proof of Theorem 11.3 in [6]).

7.8. Proof of Theorem 5.3.
Using the martingale transform structure in (2.12) and (3.5) (recall footnote 1 on page 4) together with the orthonormality of the system $\prod_{r\in U_1}H_{o_r}(\xi^r_j)\prod_{(k,l)\in U_2}V^{kl}_j$, we get by (3.6) and (5.2)
$$\operatorname{Var}\big[f(X_{\Delta,T})-\widetilde M^{(2),\mathrm{trunc}}_{\Delta,T}\big]=\operatorname{Var}\big[f(X_{\Delta,T})-M^{(2),\mathrm{trunc}}_{\Delta,T}\big]+\operatorname{Var}\big[M^{(2),\mathrm{trunc}}_{\Delta,T}-\widetilde M^{(2),\mathrm{trunc}}_{\Delta,T}\big]\lesssim\Delta^2+\sum_{j=1}^J\ \sum_{\substack{(U_1,U_2)\in\mathcal A\\|U_1|+|\mathcal K_1|+|\mathcal K_2|\le2}}\ \sum_{o\in\{1,2\}^{U_1}}E\|\tilde a_{j,o,U_1,U_2}-a_{j,o,U_1,U_2}\|^2_{L^2(P_{\Delta,(j-1)\Delta})}\le\Delta^2+Jm(m+1)\bigg(\tilde c\,\big(\Sigma+\tilde A\Delta(\log N_r+1)\big)\frac Q{N_r}+8C_\kappa Q^{-\kappa}\bigg),$$
since $\Delta_{U_1,U_2}\le\Delta$.

7.9. Proof of Theorem 5.4. The proof is similar to the complexity analysis performed in [3].

References

[1] D. Belomestny, S. Häfner, T. Nagapetyan, and M. Urusov. Variance reduction for discretised diffusions via regression. Preprint, arXiv:1510.03141v3, 2016.
[2] D. Belomestny, S. Häfner, and M. Urusov. Regression-based complexity reduction of the dual nested Monte Carlo methods. Preprint, arXiv:1611.06344, 2016.
[3] D. Belomestny, S. Häfner, and M. Urusov. Regression-based variance reduction approach for strong approximation schemes. Preprint, arXiv:1612.03407v2, 2017.
[4] D. Belomestny, S. Häfner, and M. Urusov. Stratified regression-based variance reduction approach for weak approximation schemes. Preprint, arXiv:1612.05255v2, 2017.
[5] M. B. Giles. Multilevel Monte Carlo path simulation. Operations Research, 56(3):607-617, 2008.
[6] L. Györfi, M. Kohler, A. Krzyżak, and H. Walk. A Distribution-Free Theory of Nonparametric Regression. Springer Series in Statistics. Springer-Verlag, New York, 2002.
[7] P. Kloeden and E. Platen. Numerical Solution of Stochastic Differential Equations, volume 23. Springer, 1992.
[8] G. N. Milstein and M. V. Tretyakov. Practical variance reduction via regression for simulating diffusions. SIAM Journal on Numerical Analysis, 47(2):887-910, 2009.
[9] T. Müller-Gronbach, K. Ritter, and L. Yaroslavtseva. On the complexity of computing quadrature formulas for marginal distributions of SDEs. Journal of Complexity, 31(1):110-145, 2015.
[10] T. Müller-Gronbach and L. Yaroslavtseva. Deterministic quadrature formulas for SDEs based on simplified weak Itô-Taylor steps. Foundations of Computational Mathematics, 16(5):1325-1366, 2016.
[11] N. J. Newton. Variance reduction for simulated diffusions. SIAM Journal on Applied Mathematics, 54(6):1780-1805, 1994.

University of Duisburg-Essen, Essen, Germany and IITP RAS, Moscow, Russia
E-mail address: [email protected]

PricewaterhouseCoopers GmbH, Frankfurt, Germany
E-mail address: [email protected]

University of Duisburg-Essen, Essen, Germany
E-mail address: [email protected]