Linear multi-step schemes for BSDEs
Jean-François CHASSAGNEUX ∗ This version: October 2012. (submitted)
Abstract
We study the convergence rate of a class of linear multi-step methods for BSDEs. We show that, under a sufficient condition on the coefficients, the schemes enjoy a fundamental stability property. Coupling this result with an analysis of the truncation error allows us to design approximations with an arbitrary order of convergence. Contrary to the analysis performed in [22], we consider a general diffusion model and BSDEs with a driver depending on $z$. The class of methods we consider contains well-known methods from the ODE framework, such as the Nyström, Milne or Adams methods. We also study a class of Predictor-Corrector methods based on Adams methods. Finally, we provide a numerical illustration of the convergence of some methods.

Key words:
Backward SDEs, High order discretization, Linear multi-step methods.
MSC Classification (2000):
In this paper, we are interested in the discrete-time approximation of solutions of (decoupled) Backward Stochastic Differential Equations (BSDEs), i.e. a triplet $(X, Y, Z)$ satisfying
$$X_t = X_0 + \int_0^t b(X_s)\,\mathrm{d}s + \int_0^t \sigma(X_s)\,\mathrm{d}W_s, \qquad(1.1)$$
$$Y_t = g(X_T) + \int_t^T f(Y_s, Z_s)\,\mathrm{d}s - \int_t^T Z_s\,\mathrm{d}W_s. \qquad(1.2)$$
The functions $(b, \sigma) : \mathbb{R}^d \mapsto \mathbb{R}^d \times \mathcal{M}^d$ and $f : \mathbb{R} \times \mathbb{R}^d \mapsto \mathbb{R}$ are Lipschitz-continuous, and $g : \mathbb{R}^d \mapsto \mathbb{R}$ is differentiable with continuous and bounded first derivative.¹ The positive constant $T$ is given and $W$ is a Brownian motion supported by a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{0 \le t \le T}, \mathbb{P})$. The process $Y$ is a one-dimensional stochastic process, the processes $X$ and $Z$ are valued in $\mathbb{R}^d$, and $Z$ is written, by convention, as a row vector. Under the Lipschitz assumption on the coefficients, the processes $X$ and $Y$ belong to the set $\mathcal{S}^2$ of continuous adapted processes with square-integrable supremum, and $Z$ belongs to $\mathcal{H}^2$, the set of progressively measurable processes satisfying $\mathbb{E}\big[\int_0^T |Z_s|^2\,\mathrm{d}s\big] < \infty$. The existence and uniqueness of solutions of the system (1.1)-(1.2) was first addressed by Pardoux and Peng in [16].

∗ Department of Mathematics, Imperial College London. [email protected]
¹ These assumptions will be strengthened in the following sections.
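As a toy instance of (1.1)-(1.2) (our own illustrative choice, not an example from the paper): when $f \equiv 0$, equation (1.2) gives $Y_t = \mathbb{E}_t[g(X_T)]$; taking $X = W$ in dimension one and $g(x) = x^2$, one gets $Y_t = W_t^2 + (T - t)$ and $Z_t = 2W_t$. A quick Monte Carlo sanity check of $Y_0 = T$ (starting from $W_0 = 0$):

```python
import math, random

# Driver-free BSDE (f = 0) with X = W, g(x) = x^2 and W_0 = 0:
# Y_0 = E[g(W_T)] = T, matching u(0, 0) with u(t, x) = x^2 + T - t.
random.seed(0)
T, n = 1.0, 200_000
est = sum(random.gauss(0.0, math.sqrt(T)) ** 2 for _ in range(n)) / n

# Monte Carlo error ~ sqrt(2) * T / sqrt(n), far below the tolerance used here
assert abs(est - T) < 0.02
```

Such closed-form examples are convenient later for testing discretization schemes, since the exact pair $(Y, Z)$ is available.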
Moreover, in [17], they show that
$$Y_t = u(t, X_t), \qquad Z_t = \nabla u^\top(t, X_t)\,\sigma(X_t), \qquad t \in [0, T],$$
where $u \in C^{1,2}([0,T] \times \mathbb{R}^d)$ is the solution of the final value Cauchy problem
$$L^{(0)} u(t,x) = -f\big(u(t,x),\ \nabla u^\top(t,x)\,\sigma(x)\big), \qquad t \in [0,T),\ x \in \mathbb{R}^d, \qquad(1.3)$$
$$u(T,x) = g(x), \qquad x \in \mathbb{R}^d, \qquad(1.4)$$
with $L^{(0)}$ defined to be the second order differential operator
$$L^{(0)} = \partial_t + \sum_{i=1}^d b_i\,\partial_{x_i} + \frac{1}{2}\sum_{i,j=1}^d a_{ij}\,\partial_{x_i}\partial_{x_j}, \qquad(1.5)$$
and $a = (a_{ij}) = \sigma\sigma^\top$.

To approximate (1.1)-(1.2), one has to come up with an approximation of the SDE part and of the BSDE part. Obtaining approximations of the distribution of the forward component $X$ is a problem that has been largely resolved over the last thirty years. There is a large literature on the subject and one can refer to [15] and the references therein for a systematic study of numerical methods approximating $X$.

Here, we focus instead on the approximation of $(Y, Z)$. Numerical methods approximating this backward component have already been proposed. They are mainly based on a Euler approximation, see [3, 23, 13, 8] and the references therein. These methods have been successfully extended to a broader class of BSDEs: reflected BSDEs [1, 6], BSDEs with jumps [2], BSDEs with drivers of quadratic growth [18], see also the references therein. In a very specific framework, [20, 19, 21, 22] proposed some high order methods to approximate the solution of the BSDE. Recently, high order methods of Runge-Kutta type have been studied [9, 7] in the general framework of (1.1)-(1.2). In this paper, we consider another type of high order method, very well known for ODEs, namely linear multi-step methods.

The approximations presented below are associated to an arbitrary, but fixed, partition $\pi$ of the interval $[0, T]$,
$$\pi = \{t_0 = 0 < \cdots < t_i < t_{i+1} < \cdots < t_n = T\}.$$
We define $h_i := t_{i+1} - t_i$, $i = 0, \dots, n-1$, and $|\pi| := \max_i h_i$, and denote by $(Y_i, Z_i)$ the approximation of $(Y_{t_i}, Z_{t_i})$ for $i = 0, \dots, n$. The construction of the approximating process is done in a recursive manner, backwards in time. We describe in the following the salient features of the class of approximations considered in this paper.

Definition 1.1. (Linear multi-step methods)
(i) To initialise the scheme with $r$ steps, $r \ge 1$, we are given $r$ terminal conditions $(Y_{n-j}, Z_{n-j})$, $\mathcal{F}_{t_{n-j}}$-measurable square-integrable random variables, $0 \le j \le r-1$.
(ii) For $i \le n-r$, the computation of $(Y_i, Z_i)$ involves $r$ steps and is given by
$$Y_i = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r a_j\,Y_{i+j} + h_i\sum_{j=0}^r b_{i,j}\,f(Y_{i+j}, Z_{i+j})\Big],$$
$$Z_i = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r \alpha_j\,H^Y_{i,j}\,Y_{i+j} + h_i\sum_{j=1}^r \beta_{i,j}\,H^f_{i,j}\,f(Y_{i+j}, Z_{i+j})\Big],$$
where $a_j$, $b_{i,j}$, $\alpha_j$, $\beta_{i,j}$ are real numbers satisfying $|a_j| + |b_{i,j}| + |\alpha_j| + |\beta_{i,j}| \le \Lambda$, $0 \le i \le n-r$, $0 \le j \le r$, and $\Lambda$ is a positive constant. We impose the so-called pre-consistency condition, i.e.
$$\sum_{j=1}^r a_j = \sum_{j=1}^r \alpha_j = 1.$$
The coefficients $H^Y_{i,j}$, $H^f_{i,j}$, $0 \le i \le n-r$, $1 \le j \le r$, are $\mathcal{F}_{t_{i+j}}$-measurable random variables satisfying, for all $j$,
$$h_i\,\mathbb{E}\big[|H^Y_{i,j}|^2 + |H^f_{i,j}|^2\big] \le \Lambda \quad\text{and}\quad \mathbb{E}_{t_i}\big[H^Y_{i,j}\big] = \mathbb{E}_{t_i}\big[H^f_{i,j}\big] = 0.$$

Remark 1.1. (i) The value $(Y_n, Z_n)$ is generally given by $(g(X_T), \nabla g^\top(X_T)\sigma(X_T))$. If $r > 1$, one needs to specify the other initialisation values. This choice is important because it will impact the global rate of convergence. One can use Runge-Kutta type schemes [7] with a high order of convergence.
(ii) When $r = 1$, the schemes of Definition 1.1 are one-step schemes. See [3, 23, 13, 9, 7] and the references therein for a study of these schemes.
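A degenerate but instructive reduction (our own illustration, not part of the paper's analysis): when $\sigma \equiv 0$ and $f$ does not depend on $z$, the conditional expectations in Definition 1.1(ii) disappear and the $Y$-part becomes a deterministic backward multi-step recursion for the ODE $y' = -f(y)$, $y(T) = g$. A minimal sketch, here with the leap-frog coefficients $a = (0, 1)$, $b = (0, 2, 0)$ of Example 2.1 below and exact initialisation:

```python
import math

def backward_multistep(a, b, f, y_exact, T, n):
    """Y_i = sum_j a[j-1] Y_{i+j} + h * sum_j b[j] f(Y_{i+j})  (explicit: b[0] = 0),
    initialised with the r exact terminal values (cf. Remark 1.1)."""
    r, h = len(a), T / n
    y = [0.0] * (n + 1)
    for j in range(r):
        y[n - j] = y_exact((n - j) * h)          # exact initialisation
    for i in range(n - r, -1, -1):               # backward recursion
        y[i] = sum(a[j - 1] * y[i + j] for j in range(1, r + 1)) \
             + h * sum(b[j] * f(y[i + j]) for j in range(1, r + 1))
    return y[0]

# Leap-frog Y-part: Y_i = Y_{i+2} + 2h f(Y_{i+1}); with f(y) = y the exact
# solution of y' = -f(y), y(T) = 1 (T = 1) is y(t) = e^{1-t}, so y(0) = e.
f = lambda y: y
exact = lambda t: math.exp(1.0 - t)
err = lambda n: abs(backward_multistep([0.0, 1.0], [0.0, 2.0, 0.0], f, exact, 1.0, n) - math.e)
ratio = err(100) / err(200)   # ~ 2^2 = 4 for an order-2 scheme
assert 3.0 < ratio < 5.0
```

Halving the mesh divides the error by roughly $4$, the behaviour an order-2 method should exhibit.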
The global error we investigate here is a time discretization error and is, given a grid $\pi$, $(\mathcal{E}_Y(\pi), \mathcal{E}_Z(\pi))$ with
$$\mathcal{E}_Y(\pi) := \max_i \mathbb{E}\big[|Y_{t_i} - Y_i|^2\big]^{1/2} \quad\text{and}\quad \mathcal{E}_Z(\pi) := \Big(\sum_i h_i\,\mathbb{E}\big[|Z_{t_i} - Z_i|^2\big]\Big)^{1/2}.$$
To implement high order schemes in practice, we need to specify a particular form for the $H$-coefficients appearing in Definition 1.1 above. Let us first introduce a special class of random variables, which was already considered in [7].

Definition 1.2. (i) For $m \ge 0$, we denote by $\mathcal{B}^m[0,1]$ the set of bounded measurable functions $\psi : [0,1] \to \mathbb{R}$ satisfying
$$\int_0^1 \psi(u)\,\mathrm{d}u = 1 \quad\text{and, if } m \ge 1,\quad \int_0^1 \psi(u)\,u^k\,\mathrm{d}u = 0, \qquad 1 \le k \le m.$$
(ii) Let $(\psi^\ell)_{1 \le \ell \le d} \in \mathcal{B}^m[0,1]$; for $t \in [0,T]$ and $h > 0$ s.t. $t + h \le T$, we define
$$H^\psi_{t,h} := \Big(\frac{1}{h}\int_t^{t+h} \psi^\ell\Big(\frac{u-t}{h}\Big)\,\mathrm{d}W^\ell_u\Big)_{1 \le \ell \le d},$$
which is a row vector. By convention, we set $H^\psi_{t,0} = 0$.

In the sequel, when studying the order of convergence of the scheme, and depending on the order we want to retrieve, we will assume that, for $1 \le j \le r$,
$$H^Y_{i,j} := H^\psi_{t_i, jh} \quad\text{and}\quad H^f_{i,j} := H^\phi_{t_i, jh} \qquad(1.6)$$
for some functions $\psi$ and $\phi$ in $\mathcal{B}^m$, $m \ge 0$, see Theorem 2.1 below.

The convergence analysis is done in a classical way. We first prove a fundamental stability property for the schemes, under a reasonable sufficient condition, see Proposition 2.1. Then, assuming smoothness of the value function $u$ given by (1.3)-(1.4), we study the truncation error associated to the above methods. We prove a sufficient condition on the coefficients to retrieve methods of any order. These two steps allow us to prove general convergence and to design new high order methods for BSDEs. Contrary to the analysis performed in [22], we work with the general diffusion model given by (1.1) and BSDEs with driver depending on $z$.
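Returning to Definition 1.2: the classes $\mathcal{B}^m[0,1]$ contain, in particular, polynomials of degree $m$. Writing $\psi(u) = \sum_{i=0}^m c_i u^i$, the moment constraints become the linear (Hilbert-type) system $\sum_i c_i/(i+k+1) = \mathbf{1}_{\{k=0\}}$, $0 \le k \le m$. A sketch of how such a $\psi$ can be computed exactly (our own illustration):

```python
from fractions import Fraction as F

def bm_polynomial(m):
    """Coefficients c_0..c_m of psi(u) = sum c_i u^i with
    int_0^1 psi = 1 and int_0^1 psi(u) u^k du = 0 for 1 <= k <= m."""
    # augmented moment matrix: M[k][i] = int_0^1 u^{i+k} du = 1/(i+k+1)
    M = [[F(1, i + k + 1) for i in range(m + 1)] + [F(1 if k == 0 else 0)]
         for k in range(m + 1)]
    for col in range(m + 1):                      # exact Gauss-Jordan elimination
        piv = next(r for r in range(col, m + 1) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(m + 1):
            if r != col and M[r][col] != 0:
                M[r] = [x - M[r][col] * y for x, y in zip(M[r], M[col])]
    return [row[-1] for row in M]

assert bm_polynomial(0) == [1]        # psi = 1 is the basic B^0 weight
assert bm_polynomial(1) == [4, -6]    # psi(u) = 4 - 6u lies in B^1[0,1]
```

For $\psi \equiv 1 \in \mathcal{B}^0[0,1]$, the weight $H^\psi_{t,h}$ reduces to the classical $(W_{t+h} - W_t)/h$.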
As an example of application, we extend some classical schemes used in the ODE framework and then proceed with the study of Adams type methods. Based on these methods, we also design Predictor-Corrector methods and study their convergence. To the best of our knowledge, it is the first time that these methods are considered for BSDEs. Finally, we illustrate our theoretical results with some numerical experiments showing empirical convergence rates.

The rest of this paper is organised as follows. In Section 2, we prove our general convergence result, which relies heavily on a stability property. In Section 3, we study Adams methods and Predictor-Corrector methods in the context of BSDEs. The main results are stated in the multi-dimensional case but, for the reader's convenience, the proofs are done with $d = 1$. Finally, in Section 4, we provide a numerical example.

Notations. We denote by $\mathcal{M}^d$ the set of matrices with $d$ lines and $d$ columns. For a matrix $A \in \mathcal{M}^d$, $\mathrm{Tr}[A]$ denotes its trace, $A^{.j}$ its $j$-th column, $A^{i.}$ its $i$-th row, and $A^{ij}$ the $i$-th term of $A^{.j}$. $I_d$ is the identity matrix of $\mathcal{M}^d$. The transpose of a matrix or a vector $y$ will be denoted $y^\top$. The sup-norm for both vectors and matrices is denoted $|\cdot|_\infty$. In the sequel, $C$ is a positive constant whose value may change from line to line, depending on $T$, $d$, $\Lambda$, $X_0$, but which does not depend on $\pi$. We write $C_p$ if it depends on some positive parameter $p$. For $t \in \pi$, $R$ a random variable and $r$ a real number, the notation $R = O_t(r)$ means that $|R| \le \lambda^\pi_t\,r$, where $\lambda^\pi_t$ is a positive random variable satisfying $\mathbb{E}[|\lambda^\pi_t|^p] \le C_p$, for all $p > 0$, $t \in \pi$ and all grids $\pi$.

In this part, we study the convergence properties of the schemes given in Definition 1.1. We first establish a stability property for the schemes. We then state a sufficient condition on the coefficients which allows us to retrieve high order schemes.
2.1 $L^2$-stability

To investigate the stability of the schemes given in Definition 1.1, we introduce a perturbed scheme
$$\tilde Y_i = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r a_j\,\tilde Y_{i+j} + h_i\sum_{j=0}^r b_{i,j}\,f(\tilde Y_{i+j}, \tilde Z_{i+j})\Big] + \zeta^Y_i,$$
$$\tilde Z_i = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r \alpha_j\,H^Y_{i,j}\,\tilde Y_{i+j} + h_i\sum_{j=1}^r \beta_{i,j}\,H^f_{i,j}\,f(\tilde Y_{i+j}, \tilde Z_{i+j})\Big] + \zeta^Z_i, \qquad(2.1)$$
where $\zeta^Y_i$, $\zeta^Z_i$ are random variables belonging to $L^2(\mathcal{F}_{t_i})$, for $i \le n-r$. The notion of stability we consider here is the following, where $\delta Y_i := Y_i - \tilde Y_i$ and $\delta Z_i := Z_i - \tilde Z_i$.

Definition 2.1. ($L^2$-Stability) The scheme given in Definition 1.1 is said to be $L^2$-stable if
$$\max_{0 \le i \le n-r}\mathbb{E}\big[|\delta Y_i|^2\big] + \sum_{i=0}^{n-r} h_i\,\mathbb{E}\big[|\delta Z_i|^2\big] \le C\Big(\max_{0 \le j \le r-1}\mathbb{E}\big[|\delta Y_{n-j}|^2 + |\pi|\,|\delta Z_{n-j}|^2\big] + \frac{1}{|\pi|}\sum_{i=0}^{n-r}\mathbb{E}\Big[\frac{|\zeta^Y_i|^2}{h_i} + |\zeta^Z_i|^2\Big]\Big)$$
for all sequences $\zeta^Y_i$, $\zeta^Z_i$ of $L^2(\mathcal{F}_{t_i})$-random variables, $i \le n-r$, and terminal values $(Y_{n-j}, Z_{n-j})$, $(\tilde Y_{n-j}, \tilde Z_{n-j})$ belonging to $L^2(\mathcal{F}_{t_{n-j}})$, $0 \le j \le r-1$.

Proposition 2.1. Assume that the following holds:
$(\mathbf{H_c})$ The coefficients $(a_j)$ are non-negative, $\sum_{j=1}^r a_j = 1$ and, for $1 \le j \le r$, $a_j = 0 \Rightarrow \alpha_j = 0$.
Then the scheme given in Definition 1.1 is $L^2$-stable, recalling Definition 2.1.

Proof.
We define $U_i = (Y_i, \dots, Y_{i+r-1})^\top$, $\tilde U_i = (\tilde Y_i, \dots, \tilde Y_{i+r-1})^\top$, and the $\mathbb{R}^r$-valued vectors
$$\Phi^Y_i = \begin{pmatrix}\sum_{j=0}^r b_{i,j}\,f(Y_{i+j}, Z_{i+j})\\ 0\\ \vdots\\ 0\end{pmatrix}, \qquad \tilde\Phi^Y_i = \begin{pmatrix}\sum_{j=0}^r b_{i,j}\,f(\tilde Y_{i+j}, \tilde Z_{i+j})\\ 0\\ \vdots\\ 0\end{pmatrix}, \qquad \Theta^Y_i = \begin{pmatrix}\zeta^Y_i\\ 0\\ \vdots\\ 0\end{pmatrix},$$
and denote $\delta U_i = U_i - \tilde U_i$, $\delta\Phi^Y_i = \Phi^Y_i - \tilde\Phi^Y_i$. Set $a = (a_1, \dots, a_r)$, $\alpha = (\alpha_1, \dots, \alpha_r)$ and the $r \times r$ companion matrix
$$A = \begin{pmatrix} a_1, \dots, a_{r-1} & a_r\\ I_{r-1} & 0\end{pmatrix}.$$
The scheme and the perturbed scheme then rewrite, for the $Y$-part,
$$\mathbb{E}_{t_i}[U_i] = \mathbb{E}_{t_i}\big[A\,U_{i+1} + h_i\,\Phi^Y_i\big] \quad\text{and}\quad \mathbb{E}_{t_i}\big[\tilde U_i\big] = \mathbb{E}_{t_i}\big[A\,\tilde U_{i+1} + h_i\,\tilde\Phi^Y_i + \Theta^Y_i\big].$$
1.a For $i \le j \le n-r$, we compute that
$$|\mathbb{E}_{t_i}[\delta U_j]|_\infty \le |A|_\infty\,|\mathbb{E}_{t_i}[\delta U_{j+1}]|_\infty + h_j\,|\mathbb{E}_{t_i}[\delta\Phi^Y_j]|_\infty + |\mathbb{E}_{t_i}[\Theta^Y_j]|_\infty.$$
Under $(\mathbf{H_c})$, we observe that $|A|_\infty = 1$ and we get
$$|\mathbb{E}_{t_i}[\delta U_j]|_\infty \le |\mathbb{E}_{t_i}[\delta U_{j+1}]|_\infty + h_j\,|\mathbb{E}_{t_i}[\delta\Phi^Y_j]|_\infty + |\mathbb{E}_{t_i}[\Theta^Y_j]|_\infty.$$
Iterating on $j$, we compute that
$$|\mathbb{E}_{t_i}[\delta U_j]|_\infty \le |\mathbb{E}_{t_i}[\delta U_{n-r+1}]|_\infty + \sum_{k=j}^{n-r} h_k\,|\mathbb{E}_{t_i}[\delta\Phi^Y_k]|_\infty + \sum_{k=j}^{n-r} |\mathbb{E}_{t_i}[\Theta^Y_k]|_\infty.$$
In particular, we have, for $i = j$ and $|\pi|$ small enough,
$$|\delta Y_i| \le C\Big(\sum_{k=i}^{n-r} h_k\,\mathbb{E}_{t_i}\big[|\delta Y_k| + |\delta Z_k|\big] + \sum_{k=i}^{n-r}\mathbb{E}_{t_i}\big[|\zeta^Y_k|\big] + \sum_{k=n-r+1}^{n}\mathbb{E}_{t_i}\big[|\delta Y_k|\big]\Big). \qquad(2.2)$$
We then compute, using the Cauchy-Schwarz inequality,
$$\mathbb{E}\big[|\delta Y_i|^2\big] \le C\Big(|\pi|\sum_{k=i}^{n-r}\mathbb{E}\big[|\delta Y_k|^2\big] + \sum_{k=i}^{n-r} h_k\,\mathbb{E}\big[|\delta Z_k|^2\big] + \sum_{k=i}^{n-r}\frac{1}{h_k}\,\mathbb{E}\big[|\zeta^Y_k|^2\big] + \sum_{k=n-r+1}^{n}\mathbb{E}\big[|\delta Y_k|^2\big]\Big). \qquad(2.3)$$
1.b We will now control the term $\sum_{k=i}^{n-r} h_k\,\mathbb{E}\big[|\delta Z_k|^2\big]$ appearing in (2.3). Using the Cauchy-Schwarz inequality, we obtain that, if $a_j \ne 0$, then
$$\big|\mathbb{E}_{t_i}\big[\alpha_j\,H^Y_{i,j}\,\delta Y_{i+j}\big]\big|^2 \le \frac{C}{h_i}\,a_j\Big(\mathbb{E}_{t_i}\big[|\delta Y_{i+j}|^2\big] - \big|\mathbb{E}_{t_i}[\delta Y_{i+j}]\big|^2\Big),$$
which leads, under $(\mathbf{H_c})$, to
$$h_i\,\mathbb{E}\big[|\delta Z_i|^2\big] \le C\Big(\sum_{j=1}^r a_j\,\mathbb{E}\big[|\delta Y_{i+j}|^2\big] - \sum_{j=1}^r \mathbb{E}\big[a_j\,|\mathbb{E}_{t_i}[\delta Y_{i+j}]|^2\big] + |\pi|\sum_{j=1}^r \mathbb{E}\big[|\delta Y_{i+j}|^2 + |\delta Z_{i+j}|^2\big] + |\pi|\,\mathbb{E}\big[|\zeta^Z_i|^2\big]\Big). \qquad(2.4)$$
Under $(\mathbf{H_c})$, we have that
$$-\sum_{j=1}^r \mathbb{E}\big[a_j\,|\mathbb{E}_{t_i}[\delta Y_{i+j}]|^2\big] \le -\mathbb{E}\Big[\Big|\sum_{j=1}^r \mathbb{E}_{t_i}[a_j\,\delta Y_{i+j}]\Big|^2\Big].$$
Then, recalling that
$$\sum_{j=1}^r \mathbb{E}_{t_i}[a_j\,\delta Y_{i+j}] = \delta Y_i - h_i\sum_{j=0}^r b_{i,j}\,\mathbb{E}_{t_i}\big[f(Y_{i+j}, Z_{i+j}) - f(\tilde Y_{i+j}, \tilde Z_{i+j})\big] - \zeta^Y_i,$$
we compute
$$-\sum_{j=1}^r \mathbb{E}\big[a_j\,|\mathbb{E}_{t_i}[\delta Y_{i+j}]|^2\big] \le -\mathbb{E}\big[|\delta Y_i|^2\big] + C|\pi|\,\mathbb{E}\Big[|\delta Y_i|\sum_{j=0}^r \mathbb{E}_{t_i}\big[|\delta Y_{i+j}| + |\delta Z_{i+j}|\big]\Big] + C\,\mathbb{E}\big[|\delta Y_i|\,|\zeta^Y_i|\big],$$
which leads, for $0 < \epsilon \le 1$ to be fixed later on, to
$$-\sum_{j=1}^r \mathbb{E}\big[a_j\,|\mathbb{E}_{t_i}[\delta Y_{i+j}]|^2\big] \le -\mathbb{E}\big[|\delta Y_i|^2\big] + C|\pi|\sum_{j=0}^r \mathbb{E}\Big[\frac{1}{\epsilon}|\delta Y_{i+j}|^2 + \epsilon\,|\delta Z_{i+j}|^2\Big] + \frac{C}{h_i}\,\mathbb{E}\big[|\zeta^Y_i|^2\big].$$
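For completeness, here is one standard form of the discrete Gronwall lemma invoked in step 2.a below (the precise constants chosen here are ours):

```latex
% If (delta_i), (theta_i) are non-negative and
%     delta_i - delta_{i+1} <= C |pi| delta_i + C theta_i   for i <= N,
% then, as soon as C|pi| <= 1/2, iterating
\[
\delta_i \le (1 - C|\pi|)^{-1}\bigl(\delta_{i+1} + C\theta_i\bigr)
          \le e^{2C|\pi|}\bigl(\delta_{i+1} + C\theta_i\bigr)
\]
% yields
\[
\delta_i \le e^{2C(N+1-i)|\pi|}\Bigl(\delta_{N+1} + C\sum_{k=i}^{N}\theta_k\Bigr),
\]
% using (1-x)^{-1} <= e^{2x} for 0 <= x <= 1/2.
```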
Summing over $i$, we obtain, for $|\pi|$ small enough,
$$\sum_{k=i}^{n-r} h_k\,\mathbb{E}\big[|\delta Z_k|^2\big] \le C\Big(\sum_{k=i}^{n-r}\Big(\sum_{j=1}^r a_j\,\mathbb{E}\big[|\delta Y_{k+j}|^2\big] - \mathbb{E}\big[|\delta Y_k|^2\big]\Big) + C\Big(1 + \frac{1}{\epsilon}\Big)|\pi|\sum_{k=n-r+1}^{n}\mathbb{E}\big[|\delta Y_k|^2 + |\delta Z_k|^2\big] + C\Big(1 + \frac{1}{\epsilon}\Big)|\pi|\sum_{k=i}^{n-r}\mathbb{E}\big[|\delta Y_k|^2\big] + C\epsilon|\pi|\sum_{k=i}^{n-r}\mathbb{E}\big[|\delta Z_k|^2\big] + |\pi|\sum_{k=i}^{n-r}\mathbb{E}\big[|\zeta^Z_k|^2\big] + \sum_{k=i}^{n-r}\frac{C}{h_k}\,\mathbb{E}\big[|\zeta^Y_k|^2\big]\Big).$$
Using $(\mathbf{H_c})$ and setting $\epsilon := \frac{1}{2C}$, we then obtain
$$\sum_{k=i}^{n-r} h_k\,\mathbb{E}\big[|\delta Z_k|^2\big] \le C\Big(\sum_{k=n-r+1}^{n}\mathbb{E}\big[|\delta Y_k|^2 + |\pi|\,|\delta Z_k|^2\big] + |\pi|\sum_{k=i}^{n-r}\mathbb{E}\big[|\delta Y_k|^2\big] + \sum_{k=i}^{n-r}\mathbb{E}\Big[\frac{|\zeta^Y_k|^2}{h_k} + h_k\,|\zeta^Z_k|^2\Big]\Big). \qquad(2.5)$$
1.c Combining the last inequality with (2.3), we get
$$\mathbb{E}\big[|\delta Y_i|^2\big] \le C\Big(|\pi|\sum_{j=i}^{n-r}\mathbb{E}\big[|\delta Y_j|^2\big] + \frac{1}{|\pi|}\sum_{k=i}^{n-r}\mathbb{E}\Big[\frac{|\zeta^Y_k|^2}{h_k} + |\zeta^Z_k|^2\Big] + \sum_{k=n-r+1}^{n}\mathbb{E}\big[|\delta Y_k|^2 + |\pi|\,|\delta Z_k|^2\big]\Big). \qquad(2.6)$$
2.a Let us define
$$\delta_i := \sum_{j=i}^{n-r}\mathbb{E}\big[|\delta Y_j|^2\big], \qquad \theta_i := \frac{1}{|\pi|}\sum_{k=i}^{n-r}\mathbb{E}\Big[\frac{|\zeta^Y_k|^2}{h_k} + |\zeta^Z_k|^2\Big] + \sum_{k=n-r+1}^{n}\mathbb{E}\big[|\delta Y_k|^2 + |\pi|\,|\delta Z_k|^2\big].$$
Equation (2.6) then reads
$$\delta_i - \delta_{i+1} \le C|\pi|\,\delta_i + C\,\theta_i. \qquad(2.7)$$
Using a discrete version of Gronwall's Lemma, we then compute
$$\delta_i \le C\Big(\delta_{n-r} + \sum_{k=i}^{n-r}\theta_k\,e^{C(n-r-k)|\pi|}\Big).$$
Since $\delta_{n-r} \le C\theta_i$ (by (2.6) with $i = n-r$) and $\theta_k \le \theta_i$ for $k \ge i$, we compute
$$\delta_i \le \frac{C\,\theta_i}{1 - e^{-C|\pi|}}.$$
This last inequality, combined with (2.7), leads to $\mathbb{E}\big[|\delta Y_i|^2\big] \le C\theta_i$, which proves the result for the $Y$-part.
2.b For the $Z$-part, the proof is concluded by plugging the last inequality into (2.5), with $i = 0$ in this equation. □

Remark 2.1.
It is easily checked that $(\mathbf{H_c})$ implies that the roots of the polynomial equation
$$y^{r+1} - \sum_{j=1}^r a_j\,y^{r-j+1} = 0$$
lie in the closed unit disc and that the multiple roots lie in the open unit disc. It is known that, in the ODE framework, this is a necessary and sufficient condition for the stability of linear multi-step schemes, see e.g. [5, 11]. In our context, this condition is only necessary: we have to impose $(\mathbf{H_c})$ essentially because we need to deal with the new process $Z$.

Remark 2.2.
Proposition 2.1 is generic in the sense that we use neither a particular property of the probability space nor the fact that $(\mathcal{F}_t)_{t \in [0,T]}$ is a Brownian filtration. We will use this property in the last section of this paper.

To study the order of the schemes, we use the following definition of truncation errors. The local truncation error for the pair $(Y, Z)$ is defined as $\eta_i := \eta^Y_i + \eta^Z_i$,
$$(\eta^Y_i, \eta^Z_i) := \Big(h_i\,\mathbb{E}\big[|Y_{t_i} - \check Y_{t_i}|^2\big]^{1/2},\ \mathbb{E}\big[|Z_{t_i} - \check Z_{t_i}|^2\big]^{1/2}\Big), \qquad i \le n-r, \qquad(2.8)$$
with
$$\check Y_{t_i} = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r a_j\,Y_{t_{i+j}} + h_i\sum_{j=1}^r b_{i,j}\,f(Y_{t_{i+j}}, Z_{t_{i+j}}) + h_i\,b_{i,0}\,f(\check Y_{t_i}, \check Z_{t_i})\Big], \qquad(2.9)$$
$$\check Z_{t_i} = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r \alpha_j\,H^\psi_{t_i,jh}\,Y_{t_{i+j}} + h_i\sum_{j=1}^r \beta_{i,j}\,H^\phi_{t_i,jh}\,f(Y_{t_{i+j}}, Z_{t_{i+j}})\Big], \qquad(2.10)$$
where $\psi$, $\phi$ belong to $\mathcal{B}^0$. The global truncation error for a given grid $\pi$ is given by $\mathcal{T}(\pi) := \mathcal{T}_Y(\pi) + \mathcal{T}_Z(\pi)$,
$$(\mathcal{T}_Y(\pi), \mathcal{T}_Z(\pi)) := \Big(\sum_{i=0}^{n-r} h_i\,\eta^Y_i,\ \sum_{i=0}^{n-r} h_i\,\eta^Z_i\Big), \qquad(2.11)$$
where $\mathcal{T}_Y$ is the global truncation error for $Y$ and $\mathcal{T}_Z$ is the global truncation error for $Z$, defined as above.

Definition 2.2.
An approximation is said to have a global truncation error of order $m$ if we have
$$\mathcal{T}(\pi) \le C|\pi|^m$$
for all sufficiently smooth solutions to (1.3) and all partitions $\pi$ with sufficiently small mesh size.

We study the order of the methods given in Definition 1.1 using Itô-Taylor expansions [15]. This requires smoothness of the value function $u$ introduced in (1.3)-(1.4). In order to state these assumptions precisely, we recall some notations from Chapter 5 (see Section 5.4) of [15]. Let
$$\mathcal{M} := \{\emptyset\} \cup \bigcup_{m=1}^{\infty}\{0, \dots, d\}^m$$
be the set of multi-indices with entries in $\{0, \dots, d\}$, endowed with the length $\ell$ of a multi-index ($\ell(\emptyset) = 0$ by convention). We introduce the concatenation operator $*$ on $\mathcal{M}$ for multi-indices with finite length: if $\alpha = (\alpha_1, \dots, \alpha_p)$ and $\beta = (\beta_1, \dots, \beta_q)$, then $\alpha * \beta = (\alpha_1, \dots, \alpha_p, \beta_1, \dots, \beta_q)$. A non-empty subset $\mathcal{A} \subset \mathcal{M}$ is called a hierarchical set if
$$\sup_{\alpha \in \mathcal{A}}\ell(\alpha) < \infty \quad\text{and}\quad -\alpha \in \mathcal{A},\ \forall\,\alpha \in \mathcal{A}\setminus\{\emptyset\},$$
where $-\alpha$ denotes the multi-index obtained from $\alpha$ by deleting its first entry. For any hierarchical set $\mathcal{A}$, we consider the remainder set $\mathcal{B}(\mathcal{A})$ given by
$$\mathcal{B}(\mathcal{A}) := \{\alpha \in \mathcal{M}\setminus\mathcal{A}\ |\ -\alpha \in \mathcal{A}\}.$$
We will use in the sequel the following sets of multi-indices, for $n \ge 0$: $\mathcal{A}_n := \{\alpha\ |\ \ell(\alpha) \le n\}$, and observe that $\mathcal{B}(\mathcal{A}_n) = \mathcal{A}_{n+1}\setminus\mathcal{A}_n$. The required regularity assumptions will be stated in the Theorems below.

For $j \in \{1, \dots, d\}$, we consider the operators
$$L^{(j)} = \sum_{k=1}^d \sigma^{kj}\,\partial_{x_k}.$$
For a multi-index $\alpha = (\alpha_1, \dots, \alpha_p)$, the iteration of these operators has to be understood in the following sense:
$$L^\alpha := L^{(\alpha_1)} \circ \cdots \circ L^{(\alpha_p)}.$$
By convention, $L^{\emptyset}$ is the identity operator; recall also the definition of $L^{(0)}$ given in (1.5). One can observe that $L^{\alpha*\beta} = L^\alpha \circ L^\beta$. For a multi-index $\alpha$ with finite length, we consider the set $\mathcal{G}^\alpha$ of functions $v : [0,T] \times \mathbb{R}^d \to \mathbb{R}$ for which $L^\alpha v$ is well defined and continuous. We also introduce $\mathcal{G}^\alpha_b$, the subset of functions $v \in \mathcal{G}^\alpha$ such that the function $L^\alpha v$ is bounded.
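To make these operators concrete in the simplest setting (our own illustration): for $d = 1$, $L^{(0)} = \partial_t + b\,\partial_x + \frac{1}{2}\sigma^2\partial^2_{xx}$ and $L^{(1)} = \sigma\,\partial_x$. With $X = W$ (i.e. $b = 0$, $\sigma = 1$), $v(t,x) = x^2$ and the constant weight $\psi \equiv 1 \in \mathcal{B}^0[0,1]$ (so that $H^\psi_{t,h} = (W_{t+h} - W_t)/h$), one has exactly $\mathbb{E}_t[v(t+h, X_{t+h})] = x^2 + h = v + h\,L^{(0)}v$ and $\mathbb{E}_t[H^\psi_{t,h}\,v(t+h, X_{t+h})] = 2x = L^{(1)}v$, the leading terms of the Itô-Taylor expansions stated below. A Monte Carlo check:

```python
import math, random

# d = 1, X = W: L0 v = dv/dt + (1/2) d2v/dx2, L1 v = dv/dx; take v(t, x) = x^2.
random.seed(1)
x0, h, n = 0.3, 0.05, 200_000
s = math.sqrt(h)

acc_v, acc_h = 0.0, 0.0
for _ in range(n):
    dw = random.gauss(0.0, s)            # W_{t+h} - W_t
    acc_v += (x0 + dw) ** 2              # v(t+h, W_{t+h})
    acc_h += (dw / h) * (x0 + dw) ** 2   # H^psi * v(t+h, W_{t+h}), psi = 1

assert abs(acc_v / n - (x0**2 + h)) < 0.01   # matches v + h * L0 v
assert abs(acc_h / n - 2 * x0) < 0.05        # matches L1 v
```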
For $v \in \mathcal{G}^\alpha$, we denote $L^\alpha v$ by $v^\alpha$. Finally, for $n \ge 1$, we define the set $\mathcal{G}^n_b$ of functions $u$ such that $u^\alpha \in \mathcal{G}^\alpha_b$ for all $\alpha \in \mathcal{A}_n\setminus\{\emptyset\}$. The two following Propositions are key results to prove the high order rate of convergence of the schemes. We refer to [7] for the proofs.

Proposition 2.2.
Assume $d = 1$. Let $m \ge 0$. Then, for a function $v \in \mathcal{G}^{m+1}_b$, we have that
$$\mathbb{E}_t\big[v(t+h, X_{t+h})\big] = v_t + h\,v^{(0)}_t + \frac{h^2}{2!}\,v^{(0,0)}_t + \cdots + \frac{h^m}{m!}\,v^{(0)^m}_t + O_t(h^{m+1}),$$
where $v^\alpha_t$ stands for $L^\alpha v(t, X_t)$ and $(0)^m$ denotes the multi-index $(0, \dots, 0)$ of length $m$.

Proposition 2.3.
Assume $d = 1$.
(i) Let $m \ge 0$; for $\psi \in \mathcal{B}^m[0,1]$, assuming that $v \in \mathcal{G}^{m+2}_b$, we have
$$\mathbb{E}_t\big[H^\psi_{t,h}\,v(t+h, X_{t+h})\big] = v^{(1)}_t + h\,v^{(1,0)}_t + \cdots + \frac{h^m}{m!}\,v^{(1)*(0)^m}_t + O_t(h^{m+1}).$$
(ii) For $\psi \in \mathcal{B}^0[0,1]$, assuming that $v \in \mathcal{G}^2_b$, we have
$$\mathbb{E}_t\big[H^\psi_{t,h}\,v(t+h, X_{t+h})\big] = O_t(1).$$
(iii) If $L^{(0)} \circ L^{(1)} = L^{(1)} \circ L^{(0)}$, then the expansion of (i) holds true with $\psi \equiv 1$.

2.2.3 Sufficient condition for order m

For the reader's convenience, we assume in this paragraph a constant time step for the grid $\pi$, i.e. $h_i = h = |\pi| := \frac{T}{n}$ for all $i$, and that the coefficients $b$, $\beta$ do not depend on $i$. Under these conditions, the scheme given in Definition 1.1, recalling (1.6), rewrites, for $i \le n-r$,
$$Y_i = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r a_j\,Y_{i+j} + h\sum_{j=0}^r b_j\,f(Y_{i+j}, Z_{i+j})\Big],$$
$$Z_i = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r \alpha_j\,H^\psi_{t_i,jh}\,Y_{i+j} + h\sum_{j=1}^r \beta_j\,H^\phi_{t_i,jh}\,f(Y_{i+j}, Z_{i+j})\Big]. \qquad(2.12)$$

Proposition 2.4. (Order m) For $m \ge 1$, assume that the following holds:
$$(\mathbf{C_Y})_m:\quad \sum_{j=1}^r a_j\,j^p - p\sum_{j=0}^r b_j\,j^{p-1} = 0, \qquad 1 \le p \le m,$$
and
$$(\mathbf{C_Z})_m:\quad \sum_{j=1}^r \alpha_j\,j^p - p\sum_{j=1}^r \beta_j\,j^{p-1} = 0, \qquad 1 \le p \le m-1,$$
and that $u \in \mathcal{G}^{m+1}_b$. Then we have
$$\mathcal{T}_Y(\pi) + \mathcal{T}_Z(\pi) \le C|\pi|^m,$$
provided that $\psi \in \mathcal{B}^{m-1}[0,1]$ and $\phi \in \mathcal{B}^{m-2}[0,1]$, recalling (2.12).

Proof.
1. We first study the truncation error for the $Z$-part. We have that
$$\check Z_{t_i} = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r \alpha_j\,H^\psi_{t_i,jh}\,Y_{t_{i+j}} + h\sum_{j=1}^r \beta_j\,H^\phi_{t_i,jh}\,f(Y_{t_{i+j}}, Z_{t_{i+j}})\Big] = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r \alpha_j\,H^\psi_{t_i,jh}\,u(t_{i+j}, X_{t_{i+j}}) - h\sum_{j=1}^r \beta_j\,H^\phi_{t_i,jh}\,u^{(0)}(t_{i+j}, X_{t_{i+j}})\Big],$$
where we used that, by (1.3), $f(Y_t, Z_t) = -u^{(0)}(t, X_t)$. Using Proposition 2.3, we compute
$$\check Z_{t_i} = \sum_{p=0}^{m-1}\sum_{j=1}^r \alpha_j\,\frac{j^p h^p}{p!}\,u^{(1)*(0)^p}(t_i, X_{t_i}) - \sum_{p=0}^{m-2}\sum_{j=1}^r \beta_j\,\frac{j^p h^{p+1}}{p!}\,u^{(1)*(0)^{p+1}}(t_i, X_{t_i}) + O_{t_i}(|\pi|^m),$$
which leads to
$$\check Z_{t_i} - Z_{t_i} = \Big(\sum_{j=1}^r \alpha_j - 1\Big)u^{(1)}(t_i, X_{t_i}) + \sum_{p=1}^{m-1}\frac{h^p}{p!}\,u^{(1)*(0)^p}(t_i, X_{t_i})\Big(\sum_{j=1}^r \alpha_j\,j^p - p\sum_{j=1}^r \beta_j\,j^{p-1}\Big) + O_{t_i}(|\pi|^m).$$
Under the pre-consistency condition and $(\mathbf{C_Z})_m$, we obtain $\check Z_{t_i} - Z_{t_i} = O_{t_i}(|\pi|^m)$, which leads directly to
$$\eta^Z_i = O(|\pi|^m), \qquad i \le n-r. \qquad(2.13)$$
2.a We now study the truncation error for the $Y$-part. Let us introduce
$$\bar Y_{t_i} = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r a_j\,Y_{t_{i+j}} + h\sum_{j=0}^r b_j\,f(Y_{t_{i+j}}, Z_{t_{i+j}})\Big].$$
We have that
$$\check Y_{t_i} = \bar Y_{t_i} + h\,b_0\big(f(\check Y_{t_i}, \check Z_{t_i}) - f(Y_{t_i}, Z_{t_i})\big).$$
Since $f$ is Lipschitz-continuous, we get that, for $|\pi|$ small enough,
$$\check Y_{t_i} - Y_{t_i} = O_{t_i}(\bar Y_{t_i} - Y_{t_i}) + |\pi|\,O_{t_i}(\check Z_{t_i} - Z_{t_i}). \qquad(2.14)$$
2.b Now observe that
$$\bar Y_{t_i} = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r a_j\,u(t_{i+j}, X_{t_{i+j}}) - h\sum_{j=0}^r b_j\,u^{(0)}(t_{i+j}, X_{t_{i+j}})\Big].$$
Using Proposition 2.2, we compute
$$\bar Y_{t_i} = \sum_{p=0}^{m}\sum_{j=1}^r a_j\,\frac{j^p h^p}{p!}\,u^{(0)^p}(t_i, X_{t_i}) - \sum_{p=0}^{m-1}\sum_{j=0}^r b_j\,\frac{j^p h^{p+1}}{p!}\,u^{(0)^{p+1}}(t_i, X_{t_i}) + O_{t_i}(|\pi|^{m+1}),$$
which leads to
$$\bar Y_{t_i} - Y_{t_i} = \Big(\sum_{j=1}^r a_j - 1\Big)u(t_i, X_{t_i}) + \sum_{p=1}^{m}\frac{h^p}{p!}\,u^{(0)^p}(t_i, X_{t_i})\Big(\sum_{j=1}^r a_j\,j^p - p\sum_{j=0}^r b_j\,j^{p-1}\Big) + O_{t_i}(|\pi|^{m+1}).$$
Under $(\mathbf{C_Y})_m$, we thus get $\bar Y_{t_i} - Y_{t_i} = O_{t_i}(|\pi|^{m+1})$ and, using (2.14),
$$\eta^Y_i = O(|\pi|^{m+2}), \qquad i \le n-r.$$
3. Combining the last equation with (2.13), we conclude that $\mathcal{T}(\pi) = O(|\pi|^m)$, and so the scheme is of order $m$. □

2.3 Convergence results and examples of high order methods

Theorem 2.1.
Under $(\mathbf{H_c})$, assuming that the scheme is of order $m$ according to Definition 2.2 and that
$$\max_{0 \le j \le r-1}\mathbb{E}\big[|Y_{t_{n-j}} - Y_{n-j}|^2 + h\,|Z_{t_{n-j}} - Z_{n-j}|^2\big]^{1/2} \le C|\pi|^m, \qquad(2.15)$$
we have
$$\mathcal{E}_Y(\pi) + \mathcal{E}_Z(\pi) \le C|\pi|^m.$$

Proof.
We simply observe that the solution $(Y, Z)$ of the BSDE is also the solution of a perturbed scheme with $\zeta^Y_i := \check Y_{t_i} - Y_{t_i}$ and $\zeta^Z_i := \check Z_{t_i} - Z_{t_i}$. The proof then follows directly from Proposition 2.1. □

In particular, in the special setting of paragraph 2.2.3, a straightforward application of Theorem 2.1 and Proposition 2.4 leads to
Corollary 2.1.
Under $(\mathbf{H_c})$ and $(\mathbf{C_Y})_m$-$(\mathbf{C_Z})_m$, assuming that (2.15) holds, we have
$$\mathcal{E}_Y(\pi) + \mathcal{E}_Z(\pi) \le C|\pi|^m,$$
provided $u \in \mathcal{G}^{m+1}_b$ and $\psi \in \mathcal{B}^{m-1}[0,1]$, $\phi \in \mathcal{B}^{m-2}[0,1]$.

To illustrate the previous results, we conclude this section by giving two examples of high order methods which can be designed using Corollary 2.1.
Example 2.1. (Nyström's method) The following scheme is, for the $Y$-part, inspired by the leap-frog (or Nyström) method for ODEs, namely
$$Y_i = \mathbb{E}_{t_i}\big[Y_{i+2} + 2h\,f(Y_{i+1}, Z_{i+1})\big],$$
$$Z_i = \mathbb{E}_{t_i}\big[H^\psi_{t_i,2h}\,Y_{i+2} + 2h\,H^\phi_{t_i,2h}\,f(Y_{i+2}, Z_{i+2})\big].$$
This 2-step method is convergent and the rate of convergence is at least of order 2, assuming that $u \in \mathcal{G}^3_b$ and $\psi \in \mathcal{B}^1[0,1]$, $\phi \in \mathcal{B}^0[0,1]$.

Example 2.2. (Milne's method) The second scheme we propose is inspired, for the $Y$-part, by Milne's method for ODEs, namely
$$Y_i = \mathbb{E}_{t_i}\Big[Y_{i+4} + h\Big(\frac{8}{3}\,f(Y_{i+1}, Z_{i+1}) - \frac{4}{3}\,f(Y_{i+2}, Z_{i+2}) + \frac{8}{3}\,f(Y_{i+3}, Z_{i+3})\Big)\Big],$$
$$Z_i = \mathbb{E}_{t_i}\Big[H^\psi_{t_i,4h}\,Y_{i+4} + h\Big(\frac{8}{3}\,H^\phi_{t_i,h}\,f(Y_{i+1}, Z_{i+1}) - \frac{4}{3}\,H^\phi_{t_i,2h}\,f(Y_{i+2}, Z_{i+2}) + \frac{8}{3}\,H^\phi_{t_i,3h}\,f(Y_{i+3}, Z_{i+3})\Big)\Big].$$
This 4-step method is convergent and the rate of convergence is at least of order 4, assuming that $u \in \mathcal{G}^5_b$ and $\psi \in \mathcal{B}^3[0,1]$, $\phi \in \mathcal{B}^2[0,1]$.

3 Adams methods and Predictor-Corrector methods

In this section, we introduce methods for BSDEs inspired by Adams methods from the ODE framework. These methods are of two kinds: explicit methods, also called Adams-Bashforth, and implicit methods, also called Adams-Moulton. The schemes introduced in Definition 1.1 are always explicit for the $Z$-part but may be implicit for the $Y$-part. So, for the $Z$-part, we use an Adams-Bashforth approximation, which may then be combined with an explicit or implicit approximation for the $Y$-part.

We first study methods combining an Adams-Moulton type approximation for the $Y$-part and an Adams-Bashforth type approximation for the $Z$-part. We show that these methods are really efficient because a high order rate of convergence can be achieved, assuming smoothness of the value function. We then quickly discuss the case of explicit methods, i.e. an Adams-Bashforth type approximation for both the $Y$-part and the $Z$-part. At the end of this section, we use these Adams type approximations to design Predictor-Corrector methods for BSDEs.
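Before turning to Adams methods, note that the $Y$-part order conditions $(\mathbf{C_Y})_m$ of Proposition 2.4 can be verified mechanically for any candidate coefficient set. A sketch in exact rational arithmetic (using the leap-frog coefficients of Example 2.1 and the classical Milne weights $\frac{8}{3}, -\frac{4}{3}, \frac{8}{3}$ as inputs):

```python
from fractions import Fraction as F

def y_part_order(a, b):
    """Largest m such that (C_Y)_p holds for all 1 <= p <= m, i.e.
    sum_{j=1}^r a_j j^p - p sum_{j=0}^r b_j j^{p-1} = 0;
    a = [a_1..a_r], b = [b_0..b_r] (convention 0^0 = 1 for the j = 0 term)."""
    p = 0
    while True:
        p += 1
        lhs = sum(F(a[j - 1]) * j**p for j in range(1, len(a) + 1))
        rhs = p * sum(F(b[j]) * F(j) ** (p - 1) for j in range(len(b)))
        if lhs != rhs:
            return p - 1

# Leap-frog Y-part (Example 2.1): Y_i = Y_{i+2} + 2h f(Y_{i+1})
assert y_part_order([0, 1], [0, 2, 0]) == 2
# Milne Y-part: Y_i = Y_{i+4} + h (8/3 f_{i+1} - 4/3 f_{i+2} + 8/3 f_{i+3})
assert y_part_order([0, 0, 0, 1], [0, F(8, 3), F(-4, 3), F(8, 3), 0]) == 4
```

The returned orders match the claims of Examples 2.1 and 2.2 for the $Y$-part; the $Z$-part conditions $(\mathbf{C_Z})_m$ can be checked the same way.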
3.1 Adams-Moulton-Bashforth $(AMB)_r$ methods

These methods are inspired by the Adams-Moulton method for the $Y$-part and by the Adams-Bashforth method for the $Z$-part. They have the following form, for $i \le n-r$:
$$(AMB)_r:\quad \begin{cases} Y_i = \mathbb{E}_{t_i}\Big[Y_{i+1} + h_i\sum_{j=0}^r b_{i,j,r}\,f(Y_{i+j}, Z_{i+j})\Big]\\[1ex] Z_i = \mathbb{E}_{t_i}\Big[H^\psi_{t_i,h}\,Y_{i+1} + h_i\sum_{j=1}^r \beta_{i,j,r}\,H^\phi_{t_i,jh}\,f(Y_{i+j}, Z_{i+j})\Big]\end{cases}$$
where $\psi, \phi \in \mathcal{B}^0[0,1]$. The coefficients for the $Y$-part are given by
$$b_{i,j,r} = \frac{1}{h_i}\int_{t_i}^{t_{i+1}} L_{i,j,r}(s)\,\mathrm{d}s, \quad\text{with}\quad L_{i,j,r}(t) = \prod_{k=0,\,k\ne j}^{r}\frac{t - t_{i+k}}{t_{i+j} - t_{i+k}}, \qquad 0 \le j \le r. \qquad(3.1)$$
The Lagrange polynomials $L_{i,j,r}$ are of degree $r$ and satisfy $L_{i,j,r}(t_{i+j}) = 1$, which implies
$$\sum_{j=0}^r (t_{i+j} - t_i)^k\,L_{i,j,r}(t) = (t - t_i)^k, \qquad 0 \le k \le r. \qquad(3.2)$$
The definition of the $b$-coefficients means that
$$Y_i = \mathbb{E}_{t_i}\Big[Y_{i+1} + \int_{t_i}^{t_{i+1}} Q^Y_{i,r}(t)\,\mathrm{d}t\Big],$$
where $Q^Y_{i,r}$ is the polynomial of degree at most $r$ satisfying $Q^Y_{i,r}(t_{i+j}) = f(Y_{i+j}, Z_{i+j})$, $0 \le j \le r$. In the case where the time step is constant, the coefficients do not depend on $i$ and are given by
$$b_{j,r} = \int_0^1 \ell_{j,r}(s)\,\mathrm{d}s, \quad\text{with}\quad \ell_{j,r}(s) = \prod_{k=0,\,k\ne j}^{r}\frac{s - k}{j - k}, \qquad 0 \le j \le r.$$
The coefficients for the $Z$-part are given by
$$\beta_{i,j,r} = \frac{1}{h_i}\int_{t_i}^{t_{i+1}}\tilde L_{i,j,r}(s)\,\mathrm{d}s, \qquad 1 \le j \le r, \quad\text{with}\quad \tilde L_{i,j,r}(t) = \prod_{k=1,\,k\ne j}^{r}\frac{t - t_{i+k}}{t_{i+j} - t_{i+k}}. \qquad(3.3)$$
The Lagrange polynomials $\tilde L_{i,j,r}$ are of degree $r-1$ and satisfy $\tilde L_{i,j,r}(t_{i+j}) = 1$, which implies
$$\sum_{j=1}^r (t_{i+j} - t_i)^k\,\tilde L_{i,j,r}(t) = (t - t_i)^k, \qquad 0 \le k \le r-1. \qquad(3.4)$$
The definition of the $\beta$-coefficients means that
$$Z_i = \mathbb{E}_{t_i}\Big[H^\psi_{t_i,h}\,Y_{i+1} + \int_{t_i}^{t_{i+1}} Q^Z_{i,r}(t)\,\mathrm{d}t\Big],$$
where $Q^Z_{i,r}$ is the polynomial of degree at most $r-1$ satisfying $Q^Z_{i,r}(t_{i+j}) = H^\phi_{t_i,jh}\,f(Y_{i+j}, Z_{i+j})$, $1 \le j \le r$.
In the case where the time step is constant, the coefficients do not depend on $i$ and are given by
$$\beta_{j,r} = \int_0^1 \tilde\ell_{j,r}(s)\,\mathrm{d}s, \quad\text{with}\quad \tilde\ell_{j,r}(s) = \prod_{k=1,\,k\ne j}^{r}\frac{s - k}{j - k}.$$
When the time step is constant, the table below gives the $b$-coefficients and $\beta$-coefficients for $r \le 3$:
 r |  b_{0,r}   b_{1,r}   b_{2,r}   b_{3,r} |  beta_{1,r}   beta_{2,r}   beta_{3,r}
 1 |   1/2       1/2                        |   1
 2 |   5/12      8/12     -1/12             |   3/2         -1/2
 3 |   9/24     19/24     -5/24     1/24    |   23/12       -16/12        5/12

Proposition 3.1. The $(AMB)_r$ method is convergent and at least of order $r+1$, provided that $\psi \in \mathcal{B}^r$, $\phi \in \mathcal{B}^{r-1}$ and $u \in \mathcal{G}^{r+2}_b$.

Proof.
1. The stability of the scheme comes from a direct application of Proposition 2.1, since $(\mathbf{H_c})$ obviously holds for $(AMB)_r$. Following Theorem 2.1, we only have to study the order of the method.
2.a We first study the error for the $Z$-part. Observe that, recalling (2.10),
$$\check Z_{t_i} := \mathbb{E}_{t_i}\Big[H^\psi_{t_i,h}\,Y_{t_{i+1}} + \sum_{j=1}^r H^\phi_{t_i,jh}\,f(Y_{t_{i+j}}, Z_{t_{i+j}})\int_{t_i}^{t_{i+1}}\tilde L_{i,j,r}(t)\,\mathrm{d}t\Big] = \mathbb{E}_{t_i}\Big[H^\psi_{t_i,h}\,u(t_{i+1}, X_{t_{i+1}}) - \sum_{j=1}^r H^\phi_{t_i,jh}\,u^{(0)}(t_{i+j}, X_{t_{i+j}})\int_{t_i}^{t_{i+1}}\tilde L_{i,j,r}(t)\,\mathrm{d}t\Big].$$
Using Proposition 2.3, we get
$$\check Z_{t_i} - Z_{t_i} = \sum_{k=1}^{r}\frac{h_i^k}{k!}\,u^{(1)*(0)^k}(t_i, X_{t_i}) - \sum_{j=1}^r\int_{t_i}^{t_{i+1}}\tilde L_{i,j,r}(t)\,\mathrm{d}t\,\sum_{k=0}^{r-1}\frac{u^{(1)*(0)^{k+1}}(t_i, X_{t_i})}{k!}\,(t_{i+j} - t_i)^k + O_{t_i}(|\pi|^{r+1}),$$
which reads also
$$\check Z_{t_i} - Z_{t_i} = \sum_{k=1}^{r}\frac{h_i^k}{k!}\,u^{(1)*(0)^k}(t_i, X_{t_i}) - \sum_{k=0}^{r-1}\frac{u^{(1)*(0)^{k+1}}(t_i, X_{t_i})}{k!}\int_{t_i}^{t_{i+1}}\sum_{j=1}^r (t_{i+j} - t_i)^k\,\tilde L_{i,j,r}(t)\,\mathrm{d}t + O_{t_i}(|\pi|^{r+1}).$$
Using (3.4), we obtain
$$\check Z_{t_i} - Z_{t_i} = \sum_{k=1}^{r}\Big(\frac{h_i^k}{k!} - \frac{1}{(k-1)!}\int_{t_i}^{t_{i+1}}(t - t_i)^{k-1}\,\mathrm{d}t\Big)\,u^{(1)*(0)^k}(t_i, X_{t_i}) + O_{t_i}(|\pi|^{r+1}) = O_{t_i}(|\pi|^{r+1}).$$
2.b We then study the error for the $Y$-part. Let us define
$$\bar Y_{t_i} := \mathbb{E}_{t_i}\Big[Y_{t_{i+1}} + \sum_{j=0}^r f(Y_{t_{i+j}}, Z_{t_{i+j}})\int_{t_i}^{t_{i+1}} L_{i,j,r}(t)\,\mathrm{d}t\Big]$$
and observe that
$$\check Y_{t_i} := \bar Y_{t_i} + \big(f(\check Y_{t_i}, \check Z_{t_i}) - f(Y_{t_i}, Z_{t_i})\big)\int_{t_i}^{t_{i+1}} L_{i,0,r}(t)\,\mathrm{d}t,$$
which leads, since $f$ is Lipschitz continuous, for $|\pi|$ small enough, to
$$\check Y_{t_i} - Y_{t_i} = O_{t_i}(|\bar Y_{t_i} - Y_{t_i}|) + |\pi|\,O_{t_i}(|\check Z_{t_i} - Z_{t_i}|). \qquad(3.5)$$
Now,
$$\bar Y_{t_i} = \mathbb{E}_{t_i}\Big[u(t_{i+1}, X_{t_{i+1}}) - \sum_{j=0}^r u^{(0)}(t_{i+j}, X_{t_{i+j}})\int_{t_i}^{t_{i+1}} L_{i,j,r}(t)\,\mathrm{d}t\Big].$$
Using Proposition 2.2, we get
$$\bar Y_{t_i} - Y_{t_i} = \sum_{k=1}^{r+1}\frac{h_i^k}{k!}\,u^{(0)^k}(t_i, X_{t_i}) - \sum_{j=0}^r\int_{t_i}^{t_{i+1}} L_{i,j,r}(t)\,\mathrm{d}t\,\sum_{k=0}^{r}\frac{u^{(0)^{k+1}}(t_i, X_{t_i})}{k!}\,(t_{i+j} - t_i)^k + O_{t_i}(|\pi|^{r+2}),$$
which reads also
$$\bar Y_{t_i} - Y_{t_i} = \sum_{k=1}^{r+1}\frac{h_i^k}{k!}\,u^{(0)^k}(t_i, X_{t_i}) - \sum_{k=0}^{r}\frac{u^{(0)^{k+1}}(t_i, X_{t_i})}{k!}\int_{t_i}^{t_{i+1}}\sum_{j=0}^r (t_{i+j} - t_i)^k\,L_{i,j,r}(t)\,\mathrm{d}t + O_{t_i}(|\pi|^{r+2}).$$
Using (3.2), we obtain
$$\bar Y_{t_i} - Y_{t_i} = \sum_{k=1}^{r+1}\Big(\frac{h_i^k}{k!} - \frac{1}{(k-1)!}\int_{t_i}^{t_{i+1}}(t - t_i)^{k-1}\,\mathrm{d}t\Big)\,u^{(0)^k}(t_i, X_{t_i}) + O_{t_i}(|\pi|^{r+2}) = O_{t_i}(|\pi|^{r+2}).$$
Thus, using (3.5), $\check Y_{t_i} - Y_{t_i} = O_{t_i}(|\pi|^{r+2})$ and finally
$$\mathcal{T}(\pi) = O(|\pi|^{r+1}),$$
which concludes the proof. □

3.2 Adams-Bashforth $(ABB)_r$ methods

These methods are inspired by the Adams-Bashforth method for both the $Y$-part and the $Z$-part:
$$(ABB)_r:\quad \begin{cases} Y_i = \mathbb{E}_{t_i}\Big[Y_{i+1} + h_i\sum_{j=1}^r b_{i,j,r}\,f(Y_{i+j}, Z_{i+j})\Big]\\[1ex] Z_i = \mathbb{E}_{t_i}\Big[H^\psi_{t_i,h}\,Y_{i+1} + h_i\sum_{j=1}^r \beta_{i,j,r}\,H^\phi_{t_i,jh}\,f(Y_{i+j}, Z_{i+j})\Big]\end{cases}$$
where $\psi, \phi \in \mathcal{B}^0[0,1]$. Now, the coefficients for the $Y$-part are given by
$$b_{i,j,r} = \frac{1}{h_i}\int_{t_i}^{t_{i+1}}\tilde L_{i,j,r}(s)\,\mathrm{d}s,$$
recalling (3.3). From the proof of Proposition 3.1, step 2.a, we know that we can obtain a truncation error for the $Z$-part s.t. $\check Z_{t_i} - Z_{t_i} = O_{t_i}(|\pi|^{r+1})$. But here, due to the explicit feature of the $Y$-part, and thus a global error of order $r$ only, we only need to retrieve an error of order $r$ for the $Z$-part as well. This simply means that the scheme has, for the $Z$-part, one more coefficient than needed. So one can set $\beta_{i,j,r} = b_{i,j,r}$, or $\beta_{i,r,r} = 0$ and $\beta_{i,j,r} = b_{i,j,r-1}$, $1 \le j \le r-1$. Following the arguments of the proof of Proposition 3.1, one obtains
Proposition 3.2.
The $(ABB)_r$ method is convergent and at least of order $r$, provided that $\psi \in \mathcal B_{r-1}$, $\phi \in \mathcal B_{r-1}$ and $u \in \mathcal G^{r+1}_b$.

These methods are fully explicit but have a better convergence rate than the $(ABB)_r$ methods presented above. Nevertheless, they require the computation of one more conditional expectation per step. This has to be compared in practice to the Picard iteration required by the $(AMB)_r$ approximation.

Definition 3.1.
\[
(PC)_r \quad
\begin{cases}
Z_i = \mathbb{E}_{t_i}\big[ H^\psi_{t_i,h} Y_{i+1} + h_i \sum_{j=1}^r \beta_{i,j,r}\, H^\phi_{t_i,jh}\, f(Y_{i+j}, Z_{i+j}) \big] \\
{}^p Y_i = \mathbb{E}_{t_i}\big[ Y_{i+1} + h_i \sum_{j=1}^r \beta_{i,j,r}\, f(Y_{i+j}, Z_{i+j}) \big] \\
Y_i = \mathbb{E}_{t_i}\big[ Y_{i+1} + h_i \sum_{j=1}^r b_{i,j,r}\, f(Y_{i+j}, Z_{i+j}) \big] + h_i\, b_{i,0,r}\, f({}^p Y_i, Z_i)
\end{cases} \tag{3.6}
\]
where $\psi, \phi \in \mathcal B$. The $b$-coefficients are given by (3.1) and the $\beta$-coefficients by (3.3).

Theorem 3.1.
The $(PC)_r$ method is convergent and at least of order $r+1$, provided that $\psi \in \mathcal B_r$, $\phi \in \mathcal B_{r-1}$ and $u \in \mathcal G^{r+2}_b$.

As usual, the proof of this theorem is split into two steps below. We first study the stability of the above schemes and then their truncation errors.

3.3.1 Stability
To study the stability of the method (3.6), we first introduce a perturbed version of the scheme:
\[
\begin{cases}
\tilde Z_i = \mathbb{E}_{t_i}\big[ H^\psi_{t_i,h} \tilde Y_{i+1} + h_i \sum_{j=1}^r \beta_{i,j,r}\, H^\phi_{t_i,jh}\, f(\tilde Y_{i+j}, \tilde Z_{i+j}) + \zeta^Z_i \big] \\
{}^p \tilde Y_i = \mathbb{E}_{t_i}\big[ \tilde Y_{i+1} + h_i \sum_{j=1}^r b_{i,j,r}\, f(\tilde Y_{i+j}, \tilde Z_{i+j}) \big] \\
\tilde Y_i = \mathbb{E}_{t_i}\big[ \tilde Y_{i+1} + h_i \sum_{j=1}^r b_{i,j,r}\, f(\tilde Y_{i+j}, \tilde Z_{i+j}) + h_i\, b_{i,0,r}\, f({}^p \tilde Y_i, \tilde Z_i) + \zeta^Y_i \big]
\end{cases}
\]
where $\zeta^Y_i, \zeta^Z_i$ are random variables belonging to $L^2(\mathcal F_{t_i})$, for $i \le n-r$.

Proposition 3.3.
The scheme given in (3.6) is $L^2$-stable, recalling Definition 2.1.

Proof.
For $|\pi|$ small enough, we compute, denoting ${}^p\delta Y = {}^p Y - {}^p\tilde Y$,
\[
\mathbb{E}\big[ |{}^p\delta Y_i|^2 \big] \le (1 + C|\pi|)\, \mathbb{E}\big[ |\delta Y_{i+1}|^2 \big] + C \sum_{j=1}^r h_i\, \mathbb{E}\big[ |\delta Y_{i+j}|^2 + |\delta Z_{i+j}|^2 \big], \tag{3.7}
\]
\[
\mathbb{E}\big[ |\delta Y_i|^2 \big] \le (1 + C|\pi|)\, \mathbb{E}\big[ |\delta Y_{i+1}|^2 \big] + C \Big( h_i\, \mathbb{E}\big[ |{}^p\delta Y_i|^2 \big] + \sum_{j=1}^r h_i\, \mathbb{E}\big[ |\delta Y_{i+j}|^2 + |\delta Z_{i+j}|^2 \big] \Big) + C h_i\, \mathbb{E}\big[ |\zeta^Y_i|^2 \big]. \tag{3.8}
\]
Plugging (3.7) into (3.8) and using the discrete version of Gronwall's Lemma, we obtain
\[
\mathbb{E}\big[ |\delta Y_i|^2 \big] \le C \Big( |\pi| \sum_{k=i}^{n-r} \mathbb{E}\big[ |\delta Y_k|^2 \big] + \sum_{k=i}^{n-r} h_k\, \mathbb{E}\big[ |\delta Z_k|^2 \big] + \sum_{k=i}^{n-r} h_k\, \mathbb{E}\big[ |\zeta^Y_k|^2 \big] + \sum_{k=n-r+1}^{n} \mathbb{E}\big[ |\delta Y_k|^2 \big] \Big). \tag{3.9}
\]
Using the same arguments as in step 1.b of the proof of Proposition 2.1, we retrieve that
\[
\sum_{k=i}^{n-r} h_k\, \mathbb{E}\big[ |\delta Z_k|^2 \big] \le C \Big( \sum_{k=n-r+1}^{n} \mathbb{E}\big[ |\delta Y_k|^2 + |\pi| |\delta Z_k|^2 \big] + |\pi| \sum_{k=i}^{n-r} \mathbb{E}\big[ |\delta Y_k|^2 \big] + \sum_{k=i}^{n-r} \mathbb{E}\big[ h_k |\zeta^Y_k|^2 + h_k |\zeta^Z_k|^2 \big] \Big). \tag{3.10}
\]
This leads, using (3.9), to
\[
\mathbb{E}\big[ |\delta Y_i|^2 \big] \le C \Big( |\pi| \sum_{j=i}^{n-r} \mathbb{E}\big[ |\delta Y_j|^2 \big] + \sum_{k=i}^{n-r} |\pi|\, \mathbb{E}\big[ h_k |\mathbb{E}_{t_k}[\zeta^Y_k]|^2 + |\mathbb{E}_{t_k}[\zeta^Z_k]|^2 \big] + \sum_{k=n-r+1}^{n} \mathbb{E}\big[ |\delta Y_k|^2 + |\pi| |\delta Z_k|^2 \big] \Big), \tag{3.11}
\]
which corresponds to (2.6). The proof is then concluded using the same arguments as in the proof of Proposition 2.1. $\Box$

3.3.2 Truncation error

Proposition 3.4. The scheme given in Definition 3.1 is at least of order $r+1$, provided that $\psi \in \mathcal B_r$, $\phi \in \mathcal B_{r-1}$ and $u \in \mathcal G^{r+2}_b$.

Proof.
1. The truncation error for the $Z$-part is the same as that of the $(AMB)_r$ method. From the proof of Proposition 3.1, step 1, we get $\check Z_{t_i} - Z_{t_i} = O_{t_i}(|\pi|^{r+1})$.
2. The study of the truncation error for the $Y$-part is slightly more involved. Let us define
\[
Y^*_{t_i} := \mathbb{E}_{t_i}\Big[ Y_{t_{i+1}} + h_i \sum_{j=1}^r b_{i,j,r}\, f(Y_{t_{i+j}}, Z_{t_{i+j}}) + h_i\, b_{i,0,r}\, f(Y^*_{t_i}, \hat Z_{t_i}) \Big].
\]
Using the proof of Proposition 3.1, step 2, we know that
\[
Y^*_{t_i} - Y_{t_i} = O_{t_i}(|\pi|^{r+2}); \tag{3.12}
\]
this quantity represents the truncation error for the $Y$-part of the Adams-Moulton method. We also define
\[
{}^p\check Y_{t_i} = \mathbb{E}_{t_i}\Big[ Y_{t_{i+1}} + h_i \sum_{j=1}^r \beta_{i,j,r}\, f(Y_{t_{i+j}}, Z_{t_{i+j}}) \Big],
\qquad
\check Y_{t_i} = \mathbb{E}_{t_i}\Big[ Y_{t_{i+1}} + h_i \sum_{j=1}^r b_{i,j,r}\, f(Y_{t_{i+j}}, Z_{t_{i+j}}) \Big] + h_i\, b_{i,0,r}\, f({}^p\check Y_{t_i}, \check Z_{t_i}).
\]
The term ${}^p\check Y_{t_i} - Y_{t_i}$ then represents the truncation error of the predictor part; adapting the arguments of Proposition 3.1, step 1, we have
\[
{}^p\check Y_{t_i} - Y_{t_i} = O_{t_i}(|\pi|^{r+1}). \tag{3.13}
\]
The term $\check Y_{t_i} - Y_{t_i}$ is the truncation error we are interested in. We then observe that
\[
\check Y_{t_i} = Y^*_{t_i} + h_i\, b_{i,0,r} \big( f({}^p\check Y_{t_i}, \check Z_{t_i}) - f(Y^*_{t_i}, \check Z_{t_i}) \big)
= Y^*_{t_i} + h_i\, b_{i,0,r} \big( f({}^p\check Y_{t_i}, \check Z_{t_i}) - f(Y_{t_i}, \check Z_{t_i}) + f(Y_{t_i}, \check Z_{t_i}) - f(Y^*_{t_i}, \check Z_{t_i}) \big).
\]
Since $f$ is Lipschitz continuous, we obtain
\[
\check Y_{t_i} - Y_{t_i} = O_{t_i}(|Y^*_{t_i} - Y_{t_i}|) + |\pi|\, O_{t_i}(|{}^p\check Y_{t_i} - Y_{t_i}|).
\]
Combining (3.12) and (3.13), we then obtain $\check Y_{t_i} - Y_{t_i} = O_{t_i}(|\pi|^{r+2})$.
3. From steps 1 and 2 above, we obtain that $T(\pi) = O(|\pi|^{r+1})$, which concludes the proof. $\Box$

In this part, we provide a numerical illustration of the results presented above. The scheme given in Definition 1.1 is still a theoretical one, because in practice one has to compute the conditional expectations involved. Many methods have already been studied in the context of BSDEs: regression methods [14], quantization methods [1, 10], Malliavin calculus methods [3, 4] and tree based methods, e.g. cubature methods [9]. To illustrate our previous results, i.e. the order of the time discretization error, we focus on the simple case where $d = 1$ and $X = W$, in the spirit of [22]. Obviously, further numerical experiments are needed, especially in high dimension. Because we are looking towards high order approximation, it seems reasonable to combine the present multi-step schemes with cubature methods [9]. This is left for further research.

In the sequel, we also assume that the $r$ terminal conditions are perfectly known. Generally, this won't be the case, but it is not really a problem, see Remark 1.1 (i). We explain below how the Brownian motion is approximated and give the expression of the numerical scheme which is implemented in practice. We show that this scheme is convergent and characterise its convergence order. The error we are dealing with is now composed of the discrete-time error and the space discretization error. Finally, we provide some numerical results, where we compute the empirical convergence rate.

In order to define the scheme implemented in practice, we use a multinomial approximation of the Brownian motion. Let us consider a discrete random variable $\xi$ matching the moments of a Gaussian variable $G$ up to order $K$, i.e.
\[
\mathbb{E}\big[ \xi^k \big] = \mathbb{E}\big[ G^k \big], \quad k \le K.
\]
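Such a $\xi$ can be obtained, for instance, from a Gauss-Hermite quadrature rule: an $n$-point rule yields a discrete variable matching the Gaussian moments up to order $K = 2n-1$. A minimal sketch (our own illustration; the function name is ours):

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss  # probabilists' weight exp(-x^2/2)

def multinomial_xi(npts):
    """Discrete variable (values, probabilities) matching the N(0,1)
    moments up to order 2*npts - 1, built from Gauss-Hermite quadrature."""
    x, w = hermegauss(npts)
    return x, w / w.sum()  # raw weights sum to sqrt(2*pi): normalise to probabilities

vals, probs = multinomial_xi(3)  # 3 points: order-5 moment matching
moments = [float(probs @ vals**k) for k in range(6)]
# moments of N(0,1) up to order 5: 1, 0, 1, 0, 3, 0
```

In this construction, the order 3, 5 and 7 approximations used in the experiments below correspond to 2-, 3- and 4-point rules respectively.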
In dimension 1, an efficient way to construct $\xi$ is to use quadrature formulas. On a (discrete, but big enough) probability space $(\widehat\Omega, \widehat{\mathbb{P}})$, we are then given $(\xi_i)_{i \le n}$, i.i.d. random variables with the same law as $\xi$, and define
\[
\widehat W^{x,0}_{t_i} = x + \sum_{j=1}^i \sqrt{t_j - t_{j-1}}\, \xi_j, \quad \forall\, t_i \in \pi. \tag{4.1}
\]
For later use, we say that $\widehat W^{x,0}$ is an order-$K$ approximation of the Brownian motion. We also denote by $(\widehat{\mathcal F}_t)_{t \in \pi}$ the filtration generated by $(\widehat W^{x,0}_t)_{t \in \pi}$ and by $\widehat{\mathbb{E}}_t[\,\cdot\,]$ the related conditional expectation.

We can now define the numerical scheme which is used in practice.

Definition 4.1. (Linear multi-step)
(i) To initialise the scheme with $r$ steps, $r \ge 1$, we set, for $j \le r-1$,
\[
(Y_{n-j}, Z_{n-j}) = \big( u(t_{n-j}, \widehat W^{x,0}_{t_{n-j}}),\ \partial_x u(t_{n-j}, \widehat W^{x,0}_{t_{n-j}}) \big).
\]
(ii) For $i \le n-r$, the computation of $(Y_i, Z_i)$ involves $r$ steps and is given by
\[
\begin{cases}
\widehat Y_i = \widehat{\mathbb{E}}_{t_i}\big[ \sum_{j=1}^r a_j \widehat Y_{i+j} + h_i \sum_{j=0}^r b_j\, f(\widehat Y_{i+j}, \widehat Z_{i+j}) \big] \\
\widehat Z_i = \widehat{\mathbb{E}}_{t_i}\big[ \sum_{j=1}^r \widehat H_{i,j} \big( \alpha_j \widehat Y_{i+j} + h_i \beta_j\, f(\widehat Y_{i+j}, \widehat Z_{i+j}) \big) \big]
\end{cases} \tag{4.2}
\]
The coefficients $\widehat H_{i,j}$ are the discrete version of the coefficients given in (1.6). From Proposition 2.3 (iii), we observe that, in the case $X = W$, they can simply be defined as approximations of the Brownian increment, i.e.
\[
\widehat H_{i,j} = \frac{\widehat W^{x,0}_{t_{i+j}} - \widehat W^{x,0}_{t_i}}{t_{i+j} - t_i}. \tag{4.3}
\]
When implementing Predictor-Corrector methods, we use

Definition 4.2. (Predictor-Corrector)
(i) To initialise the scheme with $r$ steps, $r \ge 1$, we set, for $j \le r-1$,
\[
(Y_{n-j}, Z_{n-j}) = \big( u(t_{n-j}, \widehat W^{x,0}_{t_{n-j}}),\ \partial_x u(t_{n-j}, \widehat W^{x,0}_{t_{n-j}}) \big).
\]
(ii) For $i \le n-r$, the computation of $(Y_i, Z_i)$ involves $r$ steps and is given by
\[
(PC)_r \quad
\begin{cases}
\widehat Z_i = \widehat{\mathbb{E}}_{t_i}\big[ \widehat H_{i,1} \widehat Y_{i+1} + h_i \sum_{j=1}^r \widehat H_{i,j}\, \beta_j\, f(\widehat Y_{i+j}, \widehat Z_{i+j}) \big] \\
{}^p\widehat Y_i = \widehat{\mathbb{E}}_{t_i}\big[ \widehat Y_{i+1} + h_i \sum_{j=1}^r \beta_{i,j,r}\, f(\widehat Y_{i+j}, \widehat Z_{i+j}) \big] \\
\widehat Y_i = \widehat{\mathbb{E}}_{t_i}\big[ \widehat Y_{i+1} + h_i \sum_{j=1}^r b_{i,j,r}\, f(\widehat Y_{i+j}, \widehat Z_{i+j}) \big] + h_i\, b_{i,0,r}\, f({}^p\widehat Y_i, \widehat Z_i)
\end{cases}
\]
where the $b$-coefficients are given by (3.1), the $\beta$-coefficients by (3.3) and the $\widehat H$-coefficients by (4.3).

Proposition 4.1. (i) In Definition 4.1, if we assume that the method given by the coefficients $a$, $b$, $\alpha$, $\beta$ is of order $m$, according to Definition 2.2, and that the multinomial approximation of the Brownian motion is of order $K = 2m+1$, then we have
\[
Y_0 - \widehat Y_0 = O(h^m),
\]
provided that the coefficient $f$ and the value function $u$ are smooth enough.
(ii) In Definition 4.2, for the $(PC)_r$ method, if we assume that the multinomial approximation of the Brownian motion is of order $K = 2r+3$, then we have
\[
Y_0 - \widehat Y_0 = O(h^{r+1}),
\]
provided that the coefficient $f$ and the value function $u$ are smooth enough.

The proof of this proposition is postponed to the end of this section. We can now turn to a concrete example which illustrates the above order of convergence.
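On a uniform grid, the $b$- and $\beta$-coefficients referenced above are integrals of Lagrange basis polynomials over one time step (cf. (3.1) and (3.3)) and reduce to classical Adams weights. A minimal sketch of their computation (our own helper, not from the text):

```python
import numpy as np

def adams_weights(nodes, a, b):
    """w_j = (1/(b-a)) * int_a^b L_j(t) dt, where L_j is the Lagrange basis
    polynomial attached to `nodes` (L_j(nodes[j]) = 1, zero at the others)."""
    w = []
    for j, tj in enumerate(nodes):
        others = [t for k, t in enumerate(nodes) if k != j]
        num = np.poly(others)                      # monic polynomial vanishing at the other nodes
        den = np.prod([tj - t for t in others])
        P = np.polyint(num / den)                  # antiderivative of L_j
        w.append((np.polyval(P, b) - np.polyval(P, a)) / (b - a))
    return np.array(w)

# integrate over [t_i, t_{i+1}] = [0, 1] with unit step:
adams_weights([1.0, 2.0], 0.0, 1.0)       # explicit 2-step: [3/2, -1/2]
adams_weights([0.0, 1.0, 2.0], 0.0, 1.0)  # implicit 2-step: [5/12, 2/3, -1/12]
```

Interpolating at the forward nodes $t_{i+1}, \dots, t_{i+r}$ gives the explicit (Bashforth-type) weights; adding the node $t_i$ gives the implicit (Moulton-type) ones.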
As in [22], we consider the process, on $[0, T]$,
\[
(X_t, Y_t, Z_t) = \Big( W_t,\ \frac{1}{1 + \exp(-W_t - t)},\ \frac{\exp(-W_t - t)}{(1 + \exp(-W_t - t))^2} \Big).
\]
This process is the solution of the (decoupled) FBSDE
\[
X_t = W_t, \qquad
Y_t = g_T(W_T) + \int_t^T f(Y_s, Z_s)\,\mathrm{d}s - \int_t^T Z_s\,\mathrm{d}W_s,
\]
where the driver $f$ is given by
\[
f(y, z) = -z \Big( \frac{3}{2} - y \Big), \quad \text{and} \quad g_T(x) = \frac{1}{1 + \exp(-x - T)}. \tag{4.4}
\]
To approximate the value of $Y_0$, we consider the following methods:
1. Implicit Euler approximation, coupled with an order 3 Brownian approximation.
2. Crank-Nicolson approximation, coupled with an order 5 Brownian approximation.
3. Explicit two-step Adams method, coupled with an order 5 Brownian approximation.
4. Implicit two-step Adams method, coupled with an order 7 Brownian approximation.
5. Heun method, which is a Predictor-Corrector method, coupled with an order 5 Brownian approximation.
The log-log graph in Fig. 1 plots the error against the number of time steps for these five methods; the fitted slopes are approximately $-1.00$, $-1.95$, $-2.02$, $-2.05$ and $-2.82$, in line with the theoretical orders $1$, $2$, $2$, $2$ and $3$.
Figure 1: Illustration of the convergence rate

The graph in Fig. 2 shows the effect of the space discretization on the Crank-Nicolson method, coupled with Brownian approximations of orders 3, 5 and 7: the fitted slopes are approximately $-1.03$, $-2.27$ and $-2.08$ respectively, so an order 3 approximation degrades the observed rate, while orders 5 and 7 preserve the theoretical rate 2.
Figure 2: Impact of space discretization
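A toy version of this experiment can be written in a few lines. The sketch below is our own simplified setup, not one of the schemes benchmarked above: it solves the logistic example by backward induction on the recombining 3-point lattice of (4.1) (order-5 moment matching), with a plain one-step explicit Euler scheme, so the expected rate is only 1:

```python
import numpy as np

# exact solution u(t,x) = 1/(1+exp(-x-t)); the driver f(y,z) = -z*(3/2 - y)
# is chosen consistent with this u (so that u_t + u_xx/2 + f(u, u_x) = 0)
def u(t, x):
    return 1.0 / (1.0 + np.exp(-x - t))

def f(y, z):
    return -z * (1.5 - y)

def solve_bsde(n, T=1.0):
    h = T / n
    dx = np.sqrt(3.0 * h)                 # 3-point lattice: moves -dx, 0, +dx
    p = np.array([1/6, 2/3, 1/6])         # matches N(0,h) moments up to order 5
    dW = np.array([-dx, 0.0, dx])
    x = dx * np.arange(-n, n + 1)         # terminal lattice: 2n+1 states
    Y = u(T, x)                           # terminal condition g_T
    Z = Y * (1.0 - Y)                     # u_x = u(1-u)
    for i in range(n - 1, -1, -1):
        Ynxt = np.stack([Y[:-2], Y[1:-1], Y[2:]])   # the 3 children of each state
        Znxt = np.stack([Z[:-2], Z[1:-1], Z[2:]])
        Znew = (p * dW) @ Ynxt / h                  # Z_i = E_i[Y_{i+1} dW]/h
        Ynew = p @ (Ynxt + h * f(Ynxt, Znxt))       # explicit Euler step
        Y, Z = Ynew, Znew
    return float(Y[0]), float(Z[0])

y0, z0 = solve_bsde(100)
# compare with the exact values u(0,0) = 0.5 and u_x(0,0) = 0.25
```

Doubling $n$ should roughly halve the error at $t = 0$, in line with an order 1 method.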
We only provide the proof of (i); the proof of (ii) follows from the same arguments, using the proofs of Proposition 3.3 and Proposition 3.4.

1. Notations
We first need to consider 'functional' versions of the schemes above. Let us introduce the following operators, related to the theoretical scheme given in Definition 1.1:
\[
R^Z_{i,j} : (\mathcal C_b)^2 \to \mathcal C_b, \qquad
R^Z_{i,j}[\varphi^Y, \varphi^Z](x) = \mathbb{E}\Big[ H^{0,x}_{t_i,jh} \Big( \alpha_j\, \varphi^Y(W^{x,t_i}_{t_{i+j}}) + h \beta_j\, f\big( \varphi^Y(W^{x,t_i}_{t_{i+j}}), \varphi^Z(W^{x,t_i}_{t_{i+j}}) \big) \Big) \Big],
\]
\[
R^Y_{i,j} : (\mathcal C_b)^2 \to \mathcal C_b, \qquad
R^Y_{i,j}[\varphi^Y, \varphi^Z](x) = \mathbb{E}\Big[ a_j\, \varphi^Y(W^{x,t_i}_{t_{i+j}}) + h b_j\, f\big( \varphi^Y(W^{x,t_i}_{t_{i+j}}), \varphi^Z(W^{x,t_i}_{t_{i+j}}) \big) \Big],
\]
\[
\widehat R^Z_{i,j} : (\mathcal C_b)^2 \to \mathcal C_b, \qquad
\widehat R^Z_{i,j}[\varphi^Y, \varphi^Z](x) = \widehat{\mathbb{E}}\Big[ \widehat H_{i,j} \Big( \alpha_j\, \varphi^Y(\widehat W^{x,t_i}_{t_{i+j}}) + h \beta_j\, f\big( \varphi^Y(\widehat W^{x,t_i}_{t_{i+j}}), \varphi^Z(\widehat W^{x,t_i}_{t_{i+j}}) \big) \Big) \Big],
\]
\[
\widehat R^Y_{i,j} : (\mathcal C_b)^2 \to \mathcal C_b, \qquad
\widehat R^Y_{i,j}[\varphi^Y, \varphi^Z](x) = \widehat{\mathbb{E}}\Big[ a_j\, \varphi^Y(\widehat W^{x,t_i}_{t_{i+j}}) + h b_j\, f\big( \varphi^Y(\widehat W^{x,t_i}_{t_{i+j}}), \varphi^Z(\widehat W^{x,t_i}_{t_{i+j}}) \big) \Big].
\]
The functional version of the scheme given in Definition 4.1 then reads, for $i \le n-r$,
\[
\widehat y_i(x) = \sum_{j=1}^r \widehat R^Y_{i,j}[\widehat y_{i+j}, \widehat z_{i+j}](x), \qquad
\widehat z_i(x) = \sum_{j=1}^r \widehat R^Z_{i,j}[\widehat y_{i+j}, \widehat z_{i+j}](x),
\]
given the $r$ initial data $(\widehat y_{n-j}, \widehat z_{n-j}) = (u(t_{n-j}, \cdot), \partial_x u(t_{n-j}, \cdot))$, $j \le r-1$.
Due to the Markov property of the discrete process $(\widehat W^{x,0}_t)_{t \in \pi}$, it is easily checked that $\widehat Y_i = \widehat y_i(\widehat W^{x,0}_{t_i})$ and $\widehat Z_i = \widehat z_i(\widehat W^{x,0}_{t_i})$. Finally, we define $\widetilde Y_i = u(t_i, \widehat W^{x,0}_{t_i})$ and $\widetilde Z_i = \partial_x u(t_i, \widehat W^{x,0}_{t_i})$. Observe that $\widetilde Y_0 = u(0, x)$ and that, for $j \le r-1$, $(\widehat Y_{n-j}, \widehat Z_{n-j}) = (\widetilde Y_{n-j}, \widetilde Z_{n-j})$.

2. Stability
The key observation here is that $(\widetilde Y_i, \widetilde Z_i)$ can be seen as a perturbed version of the scheme given in (4.2), namely
\[
\begin{cases}
\widetilde Y_i = \widehat{\mathbb{E}}_{t_i}\big[ \sum_{j=1}^r a_j \widetilde Y_{i+j} + h \sum_{j=0}^r b_j\, f(\widetilde Y_{i+j}, \widetilde Z_{i+j}) \big] + {}^t\zeta^Y_i + {}^s\zeta^Y_i \\
\widetilde Z_i = \widehat{\mathbb{E}}_{t_i}\big[ \sum_{j=1}^r \widehat H_{i,j} \big( \alpha_j \widetilde Y_{i+j} + h \beta_j\, f(\widetilde Y_{i+j}, \widetilde Z_{i+j}) \big) \big] + {}^t\zeta^Z_i + {}^s\zeta^Z_i
\end{cases}
\]
where the local error due to the time discretization is
\[
{}^t\zeta^Y_i = \mathbb{E}\big[ Y_{t_i} - \check Y_{t_i} \,\big|\, X_{t_i} = \widehat W^{x,0}_{t_i} \big], \qquad
{}^t\zeta^Z_i = \mathbb{E}\big[ Z_{t_i} - \check Z_{t_i} \,\big|\, X_{t_i} = \widehat W^{x,0}_{t_i} \big], \tag{4.5}
\]
recalling (2.9)-(2.10), and the local error due to the 'space discretization' is
\[
\begin{cases}
{}^s\zeta^Y_i = \sum_{j=1}^r \big( R^Y_{i,j} - \widehat R^Y_{i,j} \big)[u(t_{i+j}, \cdot), \partial_x u(t_{i+j}, \cdot)](\widehat W^{x,0}_{t_i}) \\
{}^s\zeta^Z_i = \sum_{j=1}^r \big( R^Z_{i,j} - \widehat R^Z_{i,j} \big)[u(t_{i+j}, \cdot), \partial_x u(t_{i+j}, \cdot)](\widehat W^{x,0}_{t_i}).
\end{cases} \tag{4.6}
\]
Now, we can apply Proposition 2.1, recalling Remark 2.2, to obtain in particular that
\[
|\widetilde Y_0 - \widehat Y_0| \le C|\pi| \sum_{i=0}^{n-r} \widehat{\mathbb{E}}\big[ h_i\, |{}^t\zeta^Y_i + {}^s\zeta^Y_i| + |{}^t\zeta^Z_i + {}^s\zeta^Z_i| \big]. \tag{4.7}
\]

3. Study of the local error
We now turn to the study of the local errors $(\zeta^Y_i, \zeta^Z_i)_{i \le n-r}$. Assuming that the functions are smooth enough, we compute the following expansion:
\[
\big( R^Y_{i,j} - \widehat R^Y_{i,j} \big)[u(t_{i+j}, \cdot), u^{(1)}(t_{i+j}, \cdot)](x)
= \sum_{k=0}^K \frac{1}{k!}\, \chi^{(k)}_{i,j}(x)\, \mathbb{E}\big[ (W^{0,t_i}_{t_{i+j}})^k \big] - \sum_{k=0}^K \frac{1}{k!}\, \chi^{(k)}_{i,j}(x)\, \widehat{\mathbb{E}}\big[ (\widehat W^{0,t_i}_{t_{i+j}})^k \big] + O\big( |\pi|^{\frac{K+1}{2}} \big),
\]
where the $\chi_{i,j}$ are functions depending on $f$, $u$ and the coefficients of the methods. Using the moment matching property of $\widehat W^{0,t_i}$, we easily obtain that
\[
\widehat{\mathbb{E}}\Big[ \big| \big( R^Y_{i,j} - \widehat R^Y_{i,j} \big)[u(t_{i+j}, \cdot), u^{(1)}(t_{i+j}, \cdot)](\widehat W^{x,0}_{t_i}) \big| \Big] \le C |\pi|^{\frac{K+1}{2}}.
\]
For the $Z$ part, we have
\[
\big( R^Z_{i,j} - \widehat R^Z_{i,j} \big)[u(t_{i+j}, \cdot), u^{(1)}(t_{i+j}, \cdot)](x)
= \sum_{k=0}^{K-1} \frac{1}{k!}\, \chi^{(k)}_{i,j}(x)\, \mathbb{E}\big[ H_{i,j} (W^{0,t_i}_{t_{i+j}})^k \big] - \sum_{k=0}^{K-1} \frac{1}{k!}\, \chi^{(k)}_{i,j}(x)\, \widehat{\mathbb{E}}\big[ \widehat H_{i,j} (\widehat W^{0,t_i}_{t_{i+j}})^k \big] + O\big( |\pi|^{\frac{K-1}{2}} \big).
\]
Using the moment matching property of $\widehat W^{0,t_i}$, we easily obtain that
\[
\widehat{\mathbb{E}}\Big[ \big| \big( R^Z_{i,j} - \widehat R^Z_{i,j} \big)[u(t_{i+j}, \cdot), u^{(1)}(t_{i+j}, \cdot)](\widehat W^{x,0}_{t_i}) \big| \Big] \le C |\pi|^{\frac{K-1}{2}}.
\]
Combining the above estimates with (4.7) and the fact that the discrete-time error is of order $m$ leads to
\[
|\widetilde Y_0 - \widehat Y_0| \le C |\pi|^m + C |\pi|^{\frac{K-1}{2}},
\]
which concludes the proof, since $K = 2m+1$. $\Box$

References

[1]
Bally, V. and G. Pagès (2003), Error analysis of the quantization algorithm for obstacle problems, Stochastic Processes and their Applications, 1-40.

[2] Bouchard, B. and R. Elie (2008), Discrete-time approximation of decoupled forward-backward SDE with jumps, Stochastic Processes and their Applications, 53-75.

[3] Bouchard, B. and N. Touzi (2004), Discrete-time approximation and Monte-Carlo simulation of Backward Stochastic Differential Equations, Stochastic Processes and their Applications, (2), 175-206.

[4] Bouchard, B. and X. Warin (2011), Monte-Carlo valorisation of American options: facts and new algorithms to improve existing methods, to appear in Numerical Methods in Finance, Springer Proceedings in Mathematics, ed. R. Carmona, P. Del Moral, P. Hu and N. Oudjane.

[5] Butcher, J. C. (2008), Numerical Methods for Ordinary Differential Equations, Second Edition, Wiley.

[6] Chassagneux, J.-F. (2008), Processus réfléchis en finance et probabilité numérique, PhD thesis, Université Paris Diderot - Paris 7.

[7] Chassagneux, J.-F. and D. Crisan (2012), Runge-Kutta scheme for BSDEs, preprint.

[8] Crisan, D. and K. Manolarakis (2009), Solving Backward Stochastic Differential Equations using the Cubature Method, preprint.

[9] Crisan, D. and K. Manolarakis (2010), Second order discretization of a Backward SDE and simulation with the cubature method, preprint.

[10] Delarue, F. and S. Menozzi (2006), A forward-backward algorithm for quasi-linear PDEs, Annals of Applied Probability, 140-184.

[11] Demailly, J.-P., Analyse numérique et équations différentielles, 3e édition, EDP Sciences.

[12] El Karoui, N., S. Peng and M.C. Quenez (1997), Backward Stochastic Differential Equations in finance, Mathematical Finance, (1), 1-71.

[13] Gobet, E. and C. Labart (2007), Error expansion for the discretization of backward stochastic differential equations, Stochastic Processes and their Applications, 803-829.

[14] Gobet, E., J.-P. Lemor and X. Warin (2006), Rate of convergence of an empirical regression method for solving generalized backward stochastic differential equations, Bernoulli, (5), 889-916.

[15] Kloeden, P. E. and E. Platen (1992), Numerical Solution of Stochastic Differential Equations, Applied Math. 23, Springer, Berlin.

[16] Pardoux, E. and S. Peng (1990), Adapted solution of a backward stochastic differential equation, Systems and Control Letters, 55-61.

[17] Pardoux, E. and S. Peng (1992), Backward stochastic differential equations and quasilinear parabolic partial differential equations, in Stochastic Partial Differential Equations and Their Applications (Charlotte, NC, 1991), 200-217, volume 176 of Lecture Notes in Control and Information Sciences, Springer, Berlin.

[18] Richou, A. (2010), Numerical simulation of BSDEs with drivers of quadratic growth, forthcoming in The Annals of Applied Probability.

[19] Zhao, W., L. Chen and S. Peng (2006), A new kind of accurate numerical method for backward stochastic differential equations, SIAM J. Sci. Comput., 1563-1581.

[20] Zhao, W., Y. Li and G. Zhang (2012), A generalized theta-scheme for solving backward stochastic differential equations, Dis. Cont. Dyn. Sys. B, 1585-1603.

[21] Zhao, W., J. Wang and S. Peng (2009), Error estimates of the theta-scheme for backward stochastic differential equations, Dis. Cont. Dyn. Sys. B, 905-924.

[22] Zhao, W., G. Zhang and L. Ju (2010), A stable multistep scheme for solving Backward Stochastic Differential Equations, SIAM J. Numer. Anal., (4), 1369-1394.

[23] Zhang, J. (2004), A numerical scheme for backward stochastic differential equations, Annals of Applied Probability, 14.