Linear multi-step schemes for BSDEs
Jean-François CHASSAGNEUX ∗ This version: October 2012. (submitted)
Abstract
We study the convergence rate of a class of linear multi-step methods for BSDEs. We show that, under a sufficient condition on the coefficients, the schemes enjoy a fundamental stability property. Coupling this result with an analysis of the truncation error allows us to design approximations with an arbitrary order of convergence. Contrary to the analysis performed in [22], we consider a general diffusion model and BSDEs with a driver depending on $z$. The class of methods we consider contains well-known methods from the ODE framework, such as the Nyström, Milne or Adams methods. We also study a class of Predictor-Corrector methods based on Adams methods. Finally, we provide a numerical illustration of the convergence of some methods.

Key words:
Backward SDEs, High order discretization, Linear multi-step methods.
MSC Classification (2000):
In this paper, we are interested in the discrete-time approximation of solutions of (decoupled) Backward Stochastic Differential Equations (BSDEs), i.e. a triplet $(X, Y, Z)$ satisfying
$$X_t = X_0 + \int_0^t b(X_s)\,\mathrm{d}s + \int_0^t \sigma(X_s)\,\mathrm{d}W_s, \qquad(1.1)$$
$$Y_t = g(X_T) + \int_t^T f(Y_s, Z_s)\,\mathrm{d}s - \int_t^T Z_s\,\mathrm{d}W_s. \qquad(1.2)$$
The functions $(b, \sigma) : \mathbb{R}^d \mapsto \mathbb{R}^d \times \mathcal{M}^d$ and $f : \mathbb{R} \times \mathbb{R}^d \mapsto \mathbb{R}$ are Lipschitz-continuous, and $g : \mathbb{R}^d \mapsto \mathbb{R}$ is differentiable with continuous and bounded first derivative.¹ The positive constant $T$ is given and $W$ is a Brownian motion supported by a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{0 \le t \le T}, \mathbb{P})$. The process $Y$ is a one-dimensional stochastic process, the processes $X$ and $Z$ are valued in $\mathbb{R}^d$, and $Z$ is written, by convention, as a row vector. Under the Lipschitz assumption on the coefficients, the processes $X$ and $Y$ belong to the set $\mathcal{S}^2$ of continuous adapted processes with square-integrable supremum, and $Z$ belongs to $\mathcal{H}^2$, the set of progressively measurable processes satisfying $\mathbb{E}\big[\int_0^T |Z_s|^2\,\mathrm{d}s\big] < \infty$. The existence and uniqueness of solutions of the system (1.1)-(1.2) was first addressed by Pardoux and Peng in [16].

∗ Department of Mathematics, Imperial College London. [email protected]
¹ These assumptions will be strengthened in the following sections.
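As a toy instance of (1.1)-(1.2) (our own illustrative choice, not an example from the paper): when $f \equiv 0$, equation (1.2) gives $Y_t = \mathbb{E}_t[g(X_T)]$; taking $X = W$ in dimension one and $g(x) = x^2$, one gets $Y_t = W_t^2 + (T - t)$ and $Z_t = 2W_t$. A quick Monte Carlo sanity check of $Y_0 = T$ (starting from $W_0 = 0$):

```python
import math, random

# Driver-free BSDE (f = 0) with X = W, g(x) = x^2 and W_0 = 0:
# Y_0 = E[g(W_T)] = T, matching u(0, 0) with u(t, x) = x^2 + T - t.
random.seed(0)
T, n = 1.0, 200_000
est = sum(random.gauss(0.0, math.sqrt(T)) ** 2 for _ in range(n)) / n

# Monte Carlo error ~ sqrt(2) * T / sqrt(n), far below the tolerance used here
assert abs(est - T) < 0.02
```

Such closed-form examples are convenient later for testing discretization schemes, since the exact pair $(Y, Z)$ is available.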
Moreover, in [17], they show that
$$Y_t = u(t, X_t), \qquad Z_t = \nabla u^\top(t, X_t)\,\sigma(X_t), \qquad t \in [0, T],$$
where $u \in C^{1,2}([0,T] \times \mathbb{R}^d)$ is the solution of the final value Cauchy problem
$$L^{(0)} u(t,x) = -f\big(u(t,x),\ \nabla u^\top(t,x)\,\sigma(x)\big), \qquad t \in [0,T),\ x \in \mathbb{R}^d, \qquad(1.3)$$
$$u(T,x) = g(x), \qquad x \in \mathbb{R}^d, \qquad(1.4)$$
with $L^{(0)}$ defined to be the second order differential operator
$$L^{(0)} = \partial_t + \sum_{i=1}^d b_i\,\partial_{x_i} + \frac{1}{2}\sum_{i,j=1}^d a_{ij}\,\partial_{x_i}\partial_{x_j}, \qquad(1.5)$$
and $a = (a_{ij}) = \sigma\sigma^\top$.

To approximate (1.1)-(1.2), one has to come up with an approximation of the SDE part and of the BSDE part. Obtaining approximations of the distribution of the forward component $X$ is a problem that has been largely resolved over the last thirty years. There is a large literature on the subject and one can refer to [15] and the references therein for a systematic study of numerical methods approximating $X$.

Here, we focus instead on the approximation of $(Y, Z)$. Numerical methods approximating this backward component have already been proposed. They are mainly based on a Euler approximation, see [3, 23, 13, 8] and the references therein. These methods have been successfully extended to a broader class of BSDEs: reflected BSDEs [1, 6], BSDEs with jumps [2], BSDEs with drivers of quadratic growth [18], see also the references therein. In a very specific framework, [20, 19, 21, 22] proposed some high order methods to approximate the solution of the BSDE. Recently, high order methods of Runge-Kutta type have been studied [9, 7] in the general framework of (1.1)-(1.2). In this paper, we consider another type of high order method, very well known for ODEs, namely linear multi-step methods.

The approximations presented below are associated to an arbitrary, but fixed, partition $\pi$ of the interval $[0, T]$,
$$\pi = \{t_0 = 0 < \cdots < t_i < t_{i+1} < \cdots < t_n = T\}.$$
We define $h_i := t_{i+1} - t_i$, $i = 0, \dots, n-1$, and $|\pi| := \max_i h_i$, and denote by $(Y_i, Z_i)$ the approximation of $(Y_{t_i}, Z_{t_i})$ for $i = 0, \dots, n$. The construction of the approximating process is done in a recursive manner, backwards in time. We describe in the following the salient features of the class of approximations considered in this paper.

Definition 1.1. (Linear multi-step methods)
(i) To initialise the scheme with $r$ steps, $r \ge 1$, we are given $r$ terminal conditions $(Y_{n-j}, Z_{n-j})$, $\mathcal{F}_{t_{n-j}}$-measurable square-integrable random variables, $0 \le j \le r-1$.
(ii) For $i \le n-r$, the computation of $(Y_i, Z_i)$ involves $r$ steps and is given by
$$Y_i = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r a_j\,Y_{i+j} + h_i\sum_{j=0}^r b_{i,j}\,f(Y_{i+j}, Z_{i+j})\Big],$$
$$Z_i = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r \alpha_j\,H^Y_{i,j}\,Y_{i+j} + h_i\sum_{j=1}^r \beta_{i,j}\,H^f_{i,j}\,f(Y_{i+j}, Z_{i+j})\Big],$$
where $a_j$, $b_{i,j}$, $\alpha_j$, $\beta_{i,j}$ are real numbers satisfying $|a_j| + |b_{i,j}| + |\alpha_j| + |\beta_{i,j}| \le \Lambda$, $0 \le i \le n-r$, $0 \le j \le r$, and $\Lambda$ is a positive constant. We impose the so-called pre-consistency condition, i.e.
$$\sum_{j=1}^r a_j = \sum_{j=1}^r \alpha_j = 1.$$
The coefficients $H^Y_{i,j}$, $H^f_{i,j}$, $0 \le i \le n-r$, $1 \le j \le r$, are $\mathcal{F}_{t_{i+j}}$-measurable random variables satisfying, for all $j$,
$$h_i\,\mathbb{E}\big[|H^Y_{i,j}|^2 + |H^f_{i,j}|^2\big] \le \Lambda \quad\text{and}\quad \mathbb{E}_{t_i}\big[H^Y_{i,j}\big] = \mathbb{E}_{t_i}\big[H^f_{i,j}\big] = 0.$$

Remark 1.1. (i) The value $(Y_n, Z_n)$ is generally given by $(g(X_T), \nabla g^\top(X_T)\sigma(X_T))$. If $r > 1$, one needs to specify the other initialisation values. This choice is important because it will impact the global rate of convergence. One can use Runge-Kutta type schemes [7] with a high order of convergence.
(ii) When $r = 1$, the schemes of Definition 1.1 are one-step schemes. See [3, 23, 13, 9, 7] and the references therein for a study of these schemes.
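A degenerate but instructive reduction (our own illustration, not part of the paper's analysis): when $\sigma \equiv 0$ and $f$ does not depend on $z$, the conditional expectations in Definition 1.1(ii) disappear and the $Y$-part becomes a deterministic backward multi-step recursion for the ODE $y' = -f(y)$, $y(T) = g$. A minimal sketch, here with the leap-frog coefficients $a = (0, 1)$, $b = (0, 2, 0)$ of Example 2.1 below and exact initialisation:

```python
import math

def backward_multistep(a, b, f, y_exact, T, n):
    """Y_i = sum_j a[j-1] Y_{i+j} + h * sum_j b[j] f(Y_{i+j})  (explicit: b[0] = 0),
    initialised with the r exact terminal values (cf. Remark 1.1)."""
    r, h = len(a), T / n
    y = [0.0] * (n + 1)
    for j in range(r):
        y[n - j] = y_exact((n - j) * h)          # exact initialisation
    for i in range(n - r, -1, -1):               # backward recursion
        y[i] = sum(a[j - 1] * y[i + j] for j in range(1, r + 1)) \
             + h * sum(b[j] * f(y[i + j]) for j in range(1, r + 1))
    return y[0]

# Leap-frog Y-part: Y_i = Y_{i+2} + 2h f(Y_{i+1}); with f(y) = y the exact
# solution of y' = -f(y), y(T) = 1 (T = 1) is y(t) = e^{1-t}, so y(0) = e.
f = lambda y: y
exact = lambda t: math.exp(1.0 - t)
err = lambda n: abs(backward_multistep([0.0, 1.0], [0.0, 2.0, 0.0], f, exact, 1.0, n) - math.e)
ratio = err(100) / err(200)   # ~ 2^2 = 4 for an order-2 scheme
assert 3.0 < ratio < 5.0
```

Halving the mesh divides the error by roughly $4$, the behaviour an order-2 method should exhibit.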
The global error we investigate here is a time discretization error and is, given a grid $\pi$, $(\mathcal{E}_Y(\pi), \mathcal{E}_Z(\pi))$ with
$$\mathcal{E}_Y(\pi) := \max_i \mathbb{E}\big[|Y_{t_i} - Y_i|^2\big]^{1/2} \quad\text{and}\quad \mathcal{E}_Z(\pi) := \Big(\sum_i h_i\,\mathbb{E}\big[|Z_{t_i} - Z_i|^2\big]\Big)^{1/2}.$$
To implement high order schemes in practice, we need to specify a particular form for the $H$-coefficients appearing in Definition 1.1 above. Let us first introduce a special class of random variables, which was already considered in [7].

Definition 1.2. (i) For $m \ge 0$, we denote by $\mathcal{B}^m[0,1]$ the set of bounded measurable functions $\psi : [0,1] \to \mathbb{R}$ satisfying
$$\int_0^1 \psi(u)\,\mathrm{d}u = 1 \quad\text{and, if } m \ge 1,\quad \int_0^1 \psi(u)\,u^k\,\mathrm{d}u = 0, \qquad 1 \le k \le m.$$
(ii) Let $(\psi^\ell)_{1 \le \ell \le d} \in \mathcal{B}^m[0,1]$; for $t \in [0,T]$ and $h > 0$ s.t. $t + h \le T$, we define
$$H^\psi_{t,h} := \Big(\frac{1}{h}\int_t^{t+h} \psi^\ell\Big(\frac{u-t}{h}\Big)\,\mathrm{d}W^\ell_u\Big)_{1 \le \ell \le d},$$
which is a row vector. By convention, we set $H^\psi_{t,0} = 0$.

In the sequel, when studying the order of convergence of the scheme, and depending on the order we want to retrieve, we will assume that, for $1 \le j \le r$,
$$H^Y_{i,j} := H^\psi_{t_i, jh} \quad\text{and}\quad H^f_{i,j} := H^\phi_{t_i, jh} \qquad(1.6)$$
for some functions $\psi$ and $\phi$ in $\mathcal{B}^m$, $m \ge 0$, see Theorem 2.1 below.

The convergence analysis is done in a classical way. We first prove a fundamental stability property for the schemes, under a reasonable sufficient condition, see Proposition 2.1. Then, assuming smoothness of the value function $u$ given by (1.3)-(1.4), we study the truncation error associated to the above methods. We prove a sufficient condition on the coefficients to retrieve methods of any order. These two steps allow us to prove general convergence and to design new high order methods for BSDEs. Contrary to the analysis performed in [22], we work with the general diffusion model given by (1.1) and BSDEs with driver depending on $z$.
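Returning to Definition 1.2: the classes $\mathcal{B}^m[0,1]$ contain, in particular, polynomials of degree $m$. Writing $\psi(u) = \sum_{i=0}^m c_i u^i$, the moment constraints become the linear (Hilbert-type) system $\sum_i c_i/(i+k+1) = \mathbf{1}_{\{k=0\}}$, $0 \le k \le m$. A sketch of how such a $\psi$ can be computed exactly (our own illustration):

```python
from fractions import Fraction as F

def bm_polynomial(m):
    """Coefficients c_0..c_m of psi(u) = sum c_i u^i with
    int_0^1 psi = 1 and int_0^1 psi(u) u^k du = 0 for 1 <= k <= m."""
    # augmented moment matrix: M[k][i] = int_0^1 u^{i+k} du = 1/(i+k+1)
    M = [[F(1, i + k + 1) for i in range(m + 1)] + [F(1 if k == 0 else 0)]
         for k in range(m + 1)]
    for col in range(m + 1):                      # exact Gauss-Jordan elimination
        piv = next(r for r in range(col, m + 1) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(m + 1):
            if r != col and M[r][col] != 0:
                M[r] = [x - M[r][col] * y for x, y in zip(M[r], M[col])]
    return [row[-1] for row in M]

assert bm_polynomial(0) == [1]        # psi = 1 is the basic B^0 weight
assert bm_polynomial(1) == [4, -6]    # psi(u) = 4 - 6u lies in B^1[0,1]
```

For $\psi \equiv 1 \in \mathcal{B}^0[0,1]$, the weight $H^\psi_{t,h}$ reduces to the classical $(W_{t+h} - W_t)/h$.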
As an example of application, we extend some classical schemes used in the ODE framework and then proceed with the study of Adams type methods. Based on these methods, we also design Predictor-Corrector methods and study their convergence. To the best of our knowledge, it is the first time that these methods are considered for BSDEs. Finally, we illustrate our theoretical results with some numerical experiments showing empirical convergence rates.

The rest of this paper is organised as follows. In Section 2, we prove our general convergence result, which relies heavily on a stability property. In Section 3, we study Adams methods and Predictor-Corrector methods in the context of BSDEs. The main results are stated in the multi-dimensional case but, for the reader's convenience, the proofs are done with $d = 1$. Finally, in Section 4, we provide a numerical example.

Notations. We denote by $\mathcal{M}^d$ the set of matrices with $d$ lines and $d$ columns. For a matrix $A \in \mathcal{M}^d$, $\mathrm{Tr}[A]$ denotes its trace, $A^{.j}$ its $j$-th column, $A^{i.}$ its $i$-th row, and $A^{ij}$ the $i$-th term of $A^{.j}$. $I_d$ is the identity matrix of $\mathcal{M}^d$. The transpose of a matrix or a vector $y$ will be denoted $y^\top$. The sup-norm for both vectors and matrices is denoted $|\cdot|_\infty$. In the sequel, $C$ is a positive constant whose value may change from line to line, depending on $T$, $d$, $\Lambda$, $X_0$, but which does not depend on $\pi$. We write $C_p$ if it depends on some positive parameter $p$. For $t \in \pi$, $R$ a random variable and $r$ a real number, the notation $R = O_t(r)$ means that $|R| \le \lambda^\pi_t\,r$, where $\lambda^\pi_t$ is a positive random variable satisfying $\mathbb{E}[|\lambda^\pi_t|^p] \le C_p$, for all $p > 0$, $t \in \pi$ and all grids $\pi$.

In this part, we study the convergence properties of the schemes given in Definition 1.1. We first establish a stability property for the schemes. We then state a sufficient condition on the coefficients which allows us to retrieve high order schemes.
2.1 $L^2$-stability

To investigate the stability of the schemes given in Definition 1.1, we introduce a perturbed scheme
$$\tilde Y_i = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r a_j\,\tilde Y_{i+j} + h_i\sum_{j=0}^r b_{i,j}\,f(\tilde Y_{i+j}, \tilde Z_{i+j})\Big] + \zeta^Y_i,$$
$$\tilde Z_i = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r \alpha_j\,H^Y_{i,j}\,\tilde Y_{i+j} + h_i\sum_{j=1}^r \beta_{i,j}\,H^f_{i,j}\,f(\tilde Y_{i+j}, \tilde Z_{i+j})\Big] + \zeta^Z_i, \qquad(2.1)$$
where $\zeta^Y_i$, $\zeta^Z_i$ are random variables belonging to $L^2(\mathcal{F}_{t_i})$, for $i \le n-r$. The notion of stability we consider here is the following, where $\delta Y_i := Y_i - \tilde Y_i$ and $\delta Z_i := Z_i - \tilde Z_i$.

Definition 2.1. ($L^2$-Stability) The scheme given in Definition 1.1 is said to be $L^2$-stable if
$$\max_{0 \le i \le n-r}\mathbb{E}\big[|\delta Y_i|^2\big] + \sum_{i=0}^{n-r} h_i\,\mathbb{E}\big[|\delta Z_i|^2\big] \le C\Big(\max_{0 \le j \le r-1}\mathbb{E}\big[|\delta Y_{n-j}|^2 + |\pi|\,|\delta Z_{n-j}|^2\big] + \frac{1}{|\pi|}\sum_{i=0}^{n-r}\mathbb{E}\Big[\frac{|\zeta^Y_i|^2}{h_i} + |\zeta^Z_i|^2\Big]\Big)$$
for all sequences $\zeta^Y_i$, $\zeta^Z_i$ of $L^2(\mathcal{F}_{t_i})$-random variables, $i \le n-r$, and terminal values $(Y_{n-j}, Z_{n-j})$, $(\tilde Y_{n-j}, \tilde Z_{n-j})$ belonging to $L^2(\mathcal{F}_{t_{n-j}})$, $0 \le j \le r-1$.

Proposition 2.1. Assume that the following holds:
$(\mathbf{H_c})$ The coefficients $(a_j)$ are non-negative, $\sum_{j=1}^r a_j = 1$ and, for $1 \le j \le r$, $a_j = 0 \Rightarrow \alpha_j = 0$.
Then the scheme given in Definition 1.1 is $L^2$-stable, recalling Definition 2.1.

Proof.
We define $U_i = (Y_i, \dots, Y_{i+r-1})^\top$, $\tilde U_i = (\tilde Y_i, \dots, \tilde Y_{i+r-1})^\top$, and the $\mathbb{R}^r$-valued vectors
$$\Phi^Y_i = \begin{pmatrix}\sum_{j=0}^r b_{i,j}\,f(Y_{i+j}, Z_{i+j})\\ 0\\ \vdots\\ 0\end{pmatrix}, \qquad \tilde\Phi^Y_i = \begin{pmatrix}\sum_{j=0}^r b_{i,j}\,f(\tilde Y_{i+j}, \tilde Z_{i+j})\\ 0\\ \vdots\\ 0\end{pmatrix}, \qquad \Theta^Y_i = \begin{pmatrix}\zeta^Y_i\\ 0\\ \vdots\\ 0\end{pmatrix},$$
and denote $\delta U_i = U_i - \tilde U_i$, $\delta\Phi^Y_i = \Phi^Y_i - \tilde\Phi^Y_i$. Set $a = (a_1, \dots, a_r)$, $\alpha = (\alpha_1, \dots, \alpha_r)$ and the $r \times r$ companion matrix
$$A = \begin{pmatrix} a_1, \dots, a_{r-1} & a_r\\ I_{r-1} & 0\end{pmatrix}.$$
The scheme and the perturbed scheme then rewrite, for the $Y$-part,
$$\mathbb{E}_{t_i}[U_i] = \mathbb{E}_{t_i}\big[A\,U_{i+1} + h_i\,\Phi^Y_i\big] \quad\text{and}\quad \mathbb{E}_{t_i}\big[\tilde U_i\big] = \mathbb{E}_{t_i}\big[A\,\tilde U_{i+1} + h_i\,\tilde\Phi^Y_i + \Theta^Y_i\big].$$
1.a For $i \le j \le n-r$, we compute that
$$|\mathbb{E}_{t_i}[\delta U_j]|_\infty \le |A|_\infty\,|\mathbb{E}_{t_i}[\delta U_{j+1}]|_\infty + h_j\,|\mathbb{E}_{t_i}[\delta\Phi^Y_j]|_\infty + |\mathbb{E}_{t_i}[\Theta^Y_j]|_\infty.$$
Under $(\mathbf{H_c})$, we observe that $|A|_\infty = 1$ and we get
$$|\mathbb{E}_{t_i}[\delta U_j]|_\infty \le |\mathbb{E}_{t_i}[\delta U_{j+1}]|_\infty + h_j\,|\mathbb{E}_{t_i}[\delta\Phi^Y_j]|_\infty + |\mathbb{E}_{t_i}[\Theta^Y_j]|_\infty.$$
Iterating on $j$, we compute that
$$|\mathbb{E}_{t_i}[\delta U_j]|_\infty \le |\mathbb{E}_{t_i}[\delta U_{n-r+1}]|_\infty + \sum_{k=j}^{n-r} h_k\,|\mathbb{E}_{t_i}[\delta\Phi^Y_k]|_\infty + \sum_{k=j}^{n-r} |\mathbb{E}_{t_i}[\Theta^Y_k]|_\infty.$$
In particular, we have, for $i = j$ and $|\pi|$ small enough,
$$|\delta Y_i| \le C\Big(\sum_{k=i}^{n-r} h_k\,\mathbb{E}_{t_i}\big[|\delta Y_k| + |\delta Z_k|\big] + \sum_{k=i}^{n-r}\mathbb{E}_{t_i}\big[|\zeta^Y_k|\big] + \sum_{k=n-r+1}^{n}\mathbb{E}_{t_i}\big[|\delta Y_k|\big]\Big). \qquad(2.2)$$
We then compute, using the Cauchy-Schwarz inequality,
$$\mathbb{E}\big[|\delta Y_i|^2\big] \le C\Big(|\pi|\sum_{k=i}^{n-r}\mathbb{E}\big[|\delta Y_k|^2\big] + \sum_{k=i}^{n-r} h_k\,\mathbb{E}\big[|\delta Z_k|^2\big] + \sum_{k=i}^{n-r}\frac{1}{h_k}\,\mathbb{E}\big[|\zeta^Y_k|^2\big] + \sum_{k=n-r+1}^{n}\mathbb{E}\big[|\delta Y_k|^2\big]\Big). \qquad(2.3)$$
1.b We will now control the term $\sum_{k=i}^{n-r} h_k\,\mathbb{E}\big[|\delta Z_k|^2\big]$ appearing in (2.3). Using the Cauchy-Schwarz inequality, we obtain that, if $a_j \ne 0$, then
$$\big|\mathbb{E}_{t_i}\big[\alpha_j\,H^Y_{i,j}\,\delta Y_{i+j}\big]\big|^2 \le \frac{C}{h_i}\,a_j\Big(\mathbb{E}_{t_i}\big[|\delta Y_{i+j}|^2\big] - \big|\mathbb{E}_{t_i}[\delta Y_{i+j}]\big|^2\Big),$$
which leads, under $(\mathbf{H_c})$, to
$$h_i\,\mathbb{E}\big[|\delta Z_i|^2\big] \le C\Big(\sum_{j=1}^r a_j\,\mathbb{E}\big[|\delta Y_{i+j}|^2\big] - \sum_{j=1}^r \mathbb{E}\big[a_j\,|\mathbb{E}_{t_i}[\delta Y_{i+j}]|^2\big] + |\pi|\sum_{j=1}^r \mathbb{E}\big[|\delta Y_{i+j}|^2 + |\delta Z_{i+j}|^2\big] + |\pi|\,\mathbb{E}\big[|\zeta^Z_i|^2\big]\Big). \qquad(2.4)$$
Under $(\mathbf{H_c})$, we have that
$$-\sum_{j=1}^r \mathbb{E}\big[a_j\,|\mathbb{E}_{t_i}[\delta Y_{i+j}]|^2\big] \le -\mathbb{E}\Big[\Big|\sum_{j=1}^r \mathbb{E}_{t_i}[a_j\,\delta Y_{i+j}]\Big|^2\Big].$$
Then, recalling that
$$\sum_{j=1}^r \mathbb{E}_{t_i}[a_j\,\delta Y_{i+j}] = \delta Y_i - h_i\sum_{j=0}^r b_{i,j}\,\mathbb{E}_{t_i}\big[f(Y_{i+j}, Z_{i+j}) - f(\tilde Y_{i+j}, \tilde Z_{i+j})\big] - \zeta^Y_i,$$
we compute
$$-\sum_{j=1}^r \mathbb{E}\big[a_j\,|\mathbb{E}_{t_i}[\delta Y_{i+j}]|^2\big] \le -\mathbb{E}\big[|\delta Y_i|^2\big] + C|\pi|\,\mathbb{E}\Big[|\delta Y_i|\sum_{j=0}^r \mathbb{E}_{t_i}\big[|\delta Y_{i+j}| + |\delta Z_{i+j}|\big]\Big] + C\,\mathbb{E}\big[|\delta Y_i|\,|\zeta^Y_i|\big],$$
which leads, for $0 < \epsilon \le 1$ to be fixed later on, to
$$-\sum_{j=1}^r \mathbb{E}\big[a_j\,|\mathbb{E}_{t_i}[\delta Y_{i+j}]|^2\big] \le -\mathbb{E}\big[|\delta Y_i|^2\big] + C|\pi|\sum_{j=0}^r \mathbb{E}\Big[\frac{1}{\epsilon}|\delta Y_{i+j}|^2 + \epsilon\,|\delta Z_{i+j}|^2\Big] + \frac{C}{h_i}\,\mathbb{E}\big[|\zeta^Y_i|^2\big].$$
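For completeness, here is one standard form of the discrete Gronwall lemma invoked in step 2.a below (the precise constants chosen here are ours):

```latex
% If (delta_i), (theta_i) are non-negative and
%     delta_i - delta_{i+1} <= C |pi| delta_i + C theta_i   for i <= N,
% then, as soon as C|pi| <= 1/2, iterating
\[
\delta_i \le (1 - C|\pi|)^{-1}\bigl(\delta_{i+1} + C\theta_i\bigr)
          \le e^{2C|\pi|}\bigl(\delta_{i+1} + C\theta_i\bigr)
\]
% yields
\[
\delta_i \le e^{2C(N+1-i)|\pi|}\Bigl(\delta_{N+1} + C\sum_{k=i}^{N}\theta_k\Bigr),
\]
% using (1-x)^{-1} <= e^{2x} for 0 <= x <= 1/2.
```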
Summing over $i$, we obtain, for $|\pi|$ small enough,
$$\sum_{k=i}^{n-r} h_k\,\mathbb{E}\big[|\delta Z_k|^2\big] \le C\Big(\sum_{k=i}^{n-r}\Big(\sum_{j=1}^r a_j\,\mathbb{E}\big[|\delta Y_{k+j}|^2\big] - \mathbb{E}\big[|\delta Y_k|^2\big]\Big) + C\Big(1 + \frac{1}{\epsilon}\Big)|\pi|\sum_{k=n-r+1}^{n}\mathbb{E}\big[|\delta Y_k|^2 + |\delta Z_k|^2\big] + C\Big(1 + \frac{1}{\epsilon}\Big)|\pi|\sum_{k=i}^{n-r}\mathbb{E}\big[|\delta Y_k|^2\big] + C\epsilon|\pi|\sum_{k=i}^{n-r}\mathbb{E}\big[|\delta Z_k|^2\big] + |\pi|\sum_{k=i}^{n-r}\mathbb{E}\big[|\zeta^Z_k|^2\big] + \sum_{k=i}^{n-r}\frac{C}{h_k}\,\mathbb{E}\big[|\zeta^Y_k|^2\big]\Big).$$
Using $(\mathbf{H_c})$ and setting $\epsilon := \frac{1}{2C}$, we then obtain
$$\sum_{k=i}^{n-r} h_k\,\mathbb{E}\big[|\delta Z_k|^2\big] \le C\Big(\sum_{k=n-r+1}^{n}\mathbb{E}\big[|\delta Y_k|^2 + |\pi|\,|\delta Z_k|^2\big] + |\pi|\sum_{k=i}^{n-r}\mathbb{E}\big[|\delta Y_k|^2\big] + \sum_{k=i}^{n-r}\mathbb{E}\Big[\frac{|\zeta^Y_k|^2}{h_k} + h_k\,|\zeta^Z_k|^2\Big]\Big). \qquad(2.5)$$
1.c Combining the last inequality with (2.3), we get
$$\mathbb{E}\big[|\delta Y_i|^2\big] \le C\Big(|\pi|\sum_{j=i}^{n-r}\mathbb{E}\big[|\delta Y_j|^2\big] + \frac{1}{|\pi|}\sum_{k=i}^{n-r}\mathbb{E}\Big[\frac{|\zeta^Y_k|^2}{h_k} + |\zeta^Z_k|^2\Big] + \sum_{k=n-r+1}^{n}\mathbb{E}\big[|\delta Y_k|^2 + |\pi|\,|\delta Z_k|^2\big]\Big). \qquad(2.6)$$
2.a Let us define
$$\delta_i := \sum_{j=i}^{n-r}\mathbb{E}\big[|\delta Y_j|^2\big], \qquad \theta_i := \frac{1}{|\pi|}\sum_{k=i}^{n-r}\mathbb{E}\Big[\frac{|\zeta^Y_k|^2}{h_k} + |\zeta^Z_k|^2\Big] + \sum_{k=n-r+1}^{n}\mathbb{E}\big[|\delta Y_k|^2 + |\pi|\,|\delta Z_k|^2\big].$$
Equation (2.6) then reads
$$\delta_i - \delta_{i+1} \le C|\pi|\,\delta_i + C\,\theta_i. \qquad(2.7)$$
Using a discrete version of Gronwall's Lemma, we then compute
$$\delta_i \le C\Big(\delta_{n-r} + \sum_{k=i}^{n-r}\theta_k\,e^{C(n-r-k)|\pi|}\Big).$$
Since $\delta_{n-r} \le C\theta_i$ (by (2.6) with $i = n-r$) and $\theta_k \le \theta_i$ for $k \ge i$, we compute
$$\delta_i \le \frac{C\,\theta_i}{1 - e^{-C|\pi|}}.$$
This last inequality, combined with (2.7), leads to $\mathbb{E}\big[|\delta Y_i|^2\big] \le C\theta_i$, which proves the result for the $Y$-part.
2.b For the $Z$-part, the proof is concluded by plugging the last inequality into (2.5), with $i = 0$ in this equation. □

Remark 2.1.
It is easily checked that $(\mathbf{H_c})$ implies that the roots of the polynomial equation
$$y^{r+1} - \sum_{j=1}^r a_j\,y^{r-j+1} = 0$$
lie in the closed unit disc and that the multiple roots lie in the open unit disc. It is known that, in the ODE framework, this is a necessary and sufficient condition for the stability of linear multi-step schemes, see e.g. [5, 11]. In our context, this condition is only necessary: we have to impose $(\mathbf{H_c})$ essentially because we need to deal with the new process $Z$.

Remark 2.2.
Proposition 2.1 is generic in the sense that we use neither a particular property of the probability space nor the fact that $(\mathcal{F}_t)_{t \in [0,T]}$ is a Brownian filtration. We will use this property in the last section of this paper.

To study the order of the schemes, we use the following definition of truncation errors. The local truncation error for the pair $(Y, Z)$ is defined as $\eta_i := \eta^Y_i + \eta^Z_i$,
$$(\eta^Y_i, \eta^Z_i) := \Big(h_i\,\mathbb{E}\big[|Y_{t_i} - \check Y_{t_i}|^2\big]^{1/2},\ \mathbb{E}\big[|Z_{t_i} - \check Z_{t_i}|^2\big]^{1/2}\Big), \qquad i \le n-r, \qquad(2.8)$$
with
$$\check Y_{t_i} = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r a_j\,Y_{t_{i+j}} + h_i\sum_{j=1}^r b_{i,j}\,f(Y_{t_{i+j}}, Z_{t_{i+j}}) + h_i\,b_{i,0}\,f(\check Y_{t_i}, \check Z_{t_i})\Big], \qquad(2.9)$$
$$\check Z_{t_i} = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r \alpha_j\,H^\psi_{t_i,jh}\,Y_{t_{i+j}} + h_i\sum_{j=1}^r \beta_{i,j}\,H^\phi_{t_i,jh}\,f(Y_{t_{i+j}}, Z_{t_{i+j}})\Big], \qquad(2.10)$$
where $\psi$, $\phi$ belong to $\mathcal{B}^0$. The global truncation error for a given grid $\pi$ is given by $\mathcal{T}(\pi) := \mathcal{T}_Y(\pi) + \mathcal{T}_Z(\pi)$,
$$(\mathcal{T}_Y(\pi), \mathcal{T}_Z(\pi)) := \Big(\sum_{i=0}^{n-r} h_i\,\eta^Y_i,\ \sum_{i=0}^{n-r} h_i\,\eta^Z_i\Big), \qquad(2.11)$$
where $\mathcal{T}_Y$ is the global truncation error for $Y$ and $\mathcal{T}_Z$ is the global truncation error for $Z$, defined as above.

Definition 2.2.
An approximation is said to have a global truncation error of order $m$ if we have
$$\mathcal{T}(\pi) \le C|\pi|^m$$
for all sufficiently smooth solutions to (1.3) and all partitions $\pi$ with sufficiently small mesh size.

We study the order of the methods given in Definition 1.1 using Itô-Taylor expansions [15]. This requires smoothness of the value function $u$ introduced in (1.3)-(1.4). In order to state these assumptions precisely, we recall some notations from Chapter 5 (see Section 5.4) of [15]. Let
$$\mathcal{M} := \{\emptyset\} \cup \bigcup_{m=1}^{\infty}\{0, \dots, d\}^m$$
be the set of multi-indices with entries in $\{0, \dots, d\}$, endowed with the length $\ell$ of a multi-index ($\ell(\emptyset) = 0$ by convention). We introduce the concatenation operator $*$ on $\mathcal{M}$ for multi-indices with finite length: if $\alpha = (\alpha_1, \dots, \alpha_p)$ and $\beta = (\beta_1, \dots, \beta_q)$, then $\alpha * \beta = (\alpha_1, \dots, \alpha_p, \beta_1, \dots, \beta_q)$. A non-empty subset $\mathcal{A} \subset \mathcal{M}$ is called a hierarchical set if
$$\sup_{\alpha \in \mathcal{A}}\ell(\alpha) < \infty \quad\text{and}\quad -\alpha \in \mathcal{A},\ \forall\,\alpha \in \mathcal{A}\setminus\{\emptyset\},$$
where $-\alpha$ denotes the multi-index obtained from $\alpha$ by deleting its first entry. For any hierarchical set $\mathcal{A}$, we consider the remainder set $\mathcal{B}(\mathcal{A})$ given by
$$\mathcal{B}(\mathcal{A}) := \{\alpha \in \mathcal{M}\setminus\mathcal{A}\ |\ -\alpha \in \mathcal{A}\}.$$
We will use in the sequel the following sets of multi-indices, for $n \ge 0$: $\mathcal{A}_n := \{\alpha\ |\ \ell(\alpha) \le n\}$, and observe that $\mathcal{B}(\mathcal{A}_n) = \mathcal{A}_{n+1}\setminus\mathcal{A}_n$. The required regularity assumptions will be stated in the Theorems below.

For $j \in \{1, \dots, d\}$, we consider the operators
$$L^{(j)} = \sum_{k=1}^d \sigma^{kj}\,\partial_{x_k}.$$
For a multi-index $\alpha = (\alpha_1, \dots, \alpha_p)$, the iteration of these operators has to be understood in the following sense:
$$L^\alpha := L^{(\alpha_1)} \circ \cdots \circ L^{(\alpha_p)}.$$
By convention, $L^{\emptyset}$ is the identity operator; recall also the definition of $L^{(0)}$ given in (1.5). One can observe that $L^{\alpha*\beta} = L^\alpha \circ L^\beta$. For a multi-index $\alpha$ with finite length, we consider the set $\mathcal{G}^\alpha$ of functions $v : [0,T] \times \mathbb{R}^d \to \mathbb{R}$ for which $L^\alpha v$ is well defined and continuous. We also introduce $\mathcal{G}^\alpha_b$, the subset of functions $v \in \mathcal{G}^\alpha$ such that the function $L^\alpha v$ is bounded.
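To make these operators concrete in the simplest setting (our own illustration): for $d = 1$, $L^{(0)} = \partial_t + b\,\partial_x + \frac{1}{2}\sigma^2\partial^2_{xx}$ and $L^{(1)} = \sigma\,\partial_x$. With $X = W$ (i.e. $b = 0$, $\sigma = 1$), $v(t,x) = x^2$ and the constant weight $\psi \equiv 1 \in \mathcal{B}^0[0,1]$ (so that $H^\psi_{t,h} = (W_{t+h} - W_t)/h$), one has exactly $\mathbb{E}_t[v(t+h, X_{t+h})] = x^2 + h = v + h\,L^{(0)}v$ and $\mathbb{E}_t[H^\psi_{t,h}\,v(t+h, X_{t+h})] = 2x = L^{(1)}v$, the leading terms of the Itô-Taylor expansions stated below. A Monte Carlo check:

```python
import math, random

# d = 1, X = W: L0 v = dv/dt + (1/2) d2v/dx2, L1 v = dv/dx; take v(t, x) = x^2.
random.seed(1)
x0, h, n = 0.3, 0.05, 200_000
s = math.sqrt(h)

acc_v, acc_h = 0.0, 0.0
for _ in range(n):
    dw = random.gauss(0.0, s)            # W_{t+h} - W_t
    acc_v += (x0 + dw) ** 2              # v(t+h, W_{t+h})
    acc_h += (dw / h) * (x0 + dw) ** 2   # H^psi * v(t+h, W_{t+h}), psi = 1

assert abs(acc_v / n - (x0**2 + h)) < 0.01   # matches v + h * L0 v
assert abs(acc_h / n - 2 * x0) < 0.05        # matches L1 v
```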
For $v \in \mathcal{G}^\alpha$, we denote $L^\alpha v$ by $v^\alpha$. Finally, for $n \ge 1$, we define the set $\mathcal{G}^n_b$ of functions $u$ such that $u^\alpha \in \mathcal{G}^\alpha_b$ for all $\alpha \in \mathcal{A}_n\setminus\{\emptyset\}$. The two following Propositions are key results to prove the high order rate of convergence of the schemes. We refer to [7] for the proofs.

Proposition 2.2.
Assume $d = 1$. Let $m \ge 0$. Then, for a function $v \in \mathcal{G}^{m+1}_b$, we have that
$$\mathbb{E}_t\big[v(t+h, X_{t+h})\big] = v_t + h\,v^{(0)}_t + \frac{h^2}{2!}\,v^{(0,0)}_t + \cdots + \frac{h^m}{m!}\,v^{(0)^m}_t + O_t(h^{m+1}),$$
where $v^\alpha_t$ stands for $L^\alpha v(t, X_t)$ and $(0)^m$ denotes the multi-index $(0, \dots, 0)$ of length $m$.

Proposition 2.3.
Assume $d = 1$.
(i) Let $m \ge 0$; for $\psi \in \mathcal{B}^m[0,1]$, assuming that $v \in \mathcal{G}^{m+2}_b$, we have
$$\mathbb{E}_t\big[H^\psi_{t,h}\,v(t+h, X_{t+h})\big] = v^{(1)}_t + h\,v^{(1,0)}_t + \cdots + \frac{h^m}{m!}\,v^{(1)*(0)^m}_t + O_t(h^{m+1}).$$
(ii) For $\psi \in \mathcal{B}^0[0,1]$, assuming that $v \in \mathcal{G}^2_b$, we have
$$\mathbb{E}_t\big[H^\psi_{t,h}\,v(t+h, X_{t+h})\big] = O_t(1).$$
(iii) If $L^{(0)} \circ L^{(1)} = L^{(1)} \circ L^{(0)}$, then the expansion of (i) holds true with $\psi \equiv 1$.

2.2.3 Sufficient condition for order m

For the reader's convenience, we assume in this paragraph a constant time step for the grid $\pi$, i.e. $h_i = h = |\pi| := \frac{T}{n}$ for all $i$, and that the coefficients $b$, $\beta$ do not depend on $i$. Under these conditions, the scheme given in Definition 1.1, recalling (1.6), rewrites, for $i \le n-r$,
$$Y_i = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r a_j\,Y_{i+j} + h\sum_{j=0}^r b_j\,f(Y_{i+j}, Z_{i+j})\Big],$$
$$Z_i = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r \alpha_j\,H^\psi_{t_i,jh}\,Y_{i+j} + h\sum_{j=1}^r \beta_j\,H^\phi_{t_i,jh}\,f(Y_{i+j}, Z_{i+j})\Big]. \qquad(2.12)$$

Proposition 2.4. (Order m) For $m \ge 1$, assume that the following holds:
$$(\mathbf{C_Y})_m:\quad \sum_{j=1}^r a_j\,j^p - p\sum_{j=0}^r b_j\,j^{p-1} = 0, \qquad 1 \le p \le m,$$
and
$$(\mathbf{C_Z})_m:\quad \sum_{j=1}^r \alpha_j\,j^p - p\sum_{j=1}^r \beta_j\,j^{p-1} = 0, \qquad 1 \le p \le m-1,$$
and that $u \in \mathcal{G}^{m+1}_b$. Then we have
$$\mathcal{T}_Y(\pi) + \mathcal{T}_Z(\pi) \le C|\pi|^m,$$
provided that $\psi \in \mathcal{B}^{m-1}[0,1]$ and $\phi \in \mathcal{B}^{m-2}[0,1]$, recalling (2.12).

Proof.
1. We first study the truncation error for the $Z$-part. We have that
$$\check Z_{t_i} = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r \alpha_j\,H^\psi_{t_i,jh}\,Y_{t_{i+j}} + h\sum_{j=1}^r \beta_j\,H^\phi_{t_i,jh}\,f(Y_{t_{i+j}}, Z_{t_{i+j}})\Big] = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r \alpha_j\,H^\psi_{t_i,jh}\,u(t_{i+j}, X_{t_{i+j}}) - h\sum_{j=1}^r \beta_j\,H^\phi_{t_i,jh}\,u^{(0)}(t_{i+j}, X_{t_{i+j}})\Big],$$
where we used that, by (1.3), $f(Y_t, Z_t) = -u^{(0)}(t, X_t)$. Using Proposition 2.3, we compute
$$\check Z_{t_i} = \sum_{p=0}^{m-1}\sum_{j=1}^r \alpha_j\,\frac{j^p h^p}{p!}\,u^{(1)*(0)^p}(t_i, X_{t_i}) - \sum_{p=0}^{m-2}\sum_{j=1}^r \beta_j\,\frac{j^p h^{p+1}}{p!}\,u^{(1)*(0)^{p+1}}(t_i, X_{t_i}) + O_{t_i}(|\pi|^m),$$
which leads to
$$\check Z_{t_i} - Z_{t_i} = \Big(\sum_{j=1}^r \alpha_j - 1\Big)u^{(1)}(t_i, X_{t_i}) + \sum_{p=1}^{m-1}\frac{h^p}{p!}\,u^{(1)*(0)^p}(t_i, X_{t_i})\Big(\sum_{j=1}^r \alpha_j\,j^p - p\sum_{j=1}^r \beta_j\,j^{p-1}\Big) + O_{t_i}(|\pi|^m).$$
Under the pre-consistency condition and $(\mathbf{C_Z})_m$, we obtain $\check Z_{t_i} - Z_{t_i} = O_{t_i}(|\pi|^m)$, which leads directly to
$$\eta^Z_i = O(|\pi|^m), \qquad i \le n-r. \qquad(2.13)$$
2.a We now study the truncation error for the $Y$-part. Let us introduce
$$\bar Y_{t_i} = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r a_j\,Y_{t_{i+j}} + h\sum_{j=0}^r b_j\,f(Y_{t_{i+j}}, Z_{t_{i+j}})\Big].$$
We have that
$$\check Y_{t_i} = \bar Y_{t_i} + h\,b_0\big(f(\check Y_{t_i}, \check Z_{t_i}) - f(Y_{t_i}, Z_{t_i})\big).$$
Since $f$ is Lipschitz-continuous, we get that, for $|\pi|$ small enough,
$$\check Y_{t_i} - Y_{t_i} = O_{t_i}(\bar Y_{t_i} - Y_{t_i}) + |\pi|\,O_{t_i}(\check Z_{t_i} - Z_{t_i}). \qquad(2.14)$$
2.b Now observe that
$$\bar Y_{t_i} = \mathbb{E}_{t_i}\Big[\sum_{j=1}^r a_j\,u(t_{i+j}, X_{t_{i+j}}) - h\sum_{j=0}^r b_j\,u^{(0)}(t_{i+j}, X_{t_{i+j}})\Big].$$
Using Proposition 2.2, we compute
$$\bar Y_{t_i} = \sum_{p=0}^{m}\sum_{j=1}^r a_j\,\frac{j^p h^p}{p!}\,u^{(0)^p}(t_i, X_{t_i}) - \sum_{p=0}^{m-1}\sum_{j=0}^r b_j\,\frac{j^p h^{p+1}}{p!}\,u^{(0)^{p+1}}(t_i, X_{t_i}) + O_{t_i}(|\pi|^{m+1}),$$
which leads to
$$\bar Y_{t_i} - Y_{t_i} = \Big(\sum_{j=1}^r a_j - 1\Big)u(t_i, X_{t_i}) + \sum_{p=1}^{m}\frac{h^p}{p!}\,u^{(0)^p}(t_i, X_{t_i})\Big(\sum_{j=1}^r a_j\,j^p - p\sum_{j=0}^r b_j\,j^{p-1}\Big) + O_{t_i}(|\pi|^{m+1}).$$
Under $(\mathbf{C_Y})_m$, we thus get $\bar Y_{t_i} - Y_{t_i} = O_{t_i}(|\pi|^{m+1})$ and, using (2.14),
$$\eta^Y_i = O(|\pi|^{m+2}), \qquad i \le n-r.$$
3. Combining the last equation with (2.13), we conclude that $\mathcal{T}(\pi) = O(|\pi|^m)$, and so the scheme is of order $m$. □

2.3 Convergence results and examples of high order methods

Theorem 2.1.
Under $(\mathbf{H_c})$, assuming that the scheme is of order $m$ according to Definition 2.2 and that
$$\max_{0 \le j \le r-1}\mathbb{E}\big[|Y_{t_{n-j}} - Y_{n-j}|^2 + h\,|Z_{t_{n-j}} - Z_{n-j}|^2\big]^{1/2} \le C|\pi|^m, \qquad(2.15)$$
we have
$$\mathcal{E}_Y(\pi) + \mathcal{E}_Z(\pi) \le C|\pi|^m.$$

Proof.
We simply observe that the solution $(Y, Z)$ of the BSDE is also the solution of a perturbed scheme with $\zeta^Y_i := \check Y_{t_i} - Y_{t_i}$ and $\zeta^Z_i := \check Z_{t_i} - Z_{t_i}$. The proof then follows directly from Proposition 2.1. □

In particular, in the special setting of paragraph 2.2.3, a straightforward application of Theorem 2.1 and Proposition 2.4 leads to
Corollary 2.1.
Under $(\mathbf{H_c})$ and $(\mathbf{C_Y})_m$-$(\mathbf{C_Z})_m$, assuming that (2.15) holds, we have
$$\mathcal{E}_Y(\pi) + \mathcal{E}_Z(\pi) \le C|\pi|^m,$$
provided $u \in \mathcal{G}^{m+1}_b$ and $\psi \in \mathcal{B}^{m-1}[0,1]$, $\phi \in \mathcal{B}^{m-2}[0,1]$.

To illustrate the previous results, we conclude this section by giving two examples of high order methods which can be designed using Corollary 2.1.
Example 2.1. (Nyström's method) The following scheme is, for the $Y$-part, inspired by the leap-frog (or Nyström) method for ODEs, namely
$$Y_i = \mathbb{E}_{t_i}\big[Y_{i+2} + 2h\,f(Y_{i+1}, Z_{i+1})\big],$$
$$Z_i = \mathbb{E}_{t_i}\big[H^\psi_{t_i,2h}\,Y_{i+2} + 2h\,H^\phi_{t_i,2h}\,f(Y_{i+2}, Z_{i+2})\big].$$
This 2-step method is convergent and the rate of convergence is at least of order 2, assuming that $u \in \mathcal{G}^3_b$ and $\psi \in \mathcal{B}^1[0,1]$, $\phi \in \mathcal{B}^0[0,1]$.

Example 2.2. (Milne's method) The second scheme we propose is inspired, for the $Y$-part, by Milne's method for ODEs, namely
$$Y_i = \mathbb{E}_{t_i}\Big[Y_{i+4} + h\Big(\frac{8}{3}\,f(Y_{i+1}, Z_{i+1}) - \frac{4}{3}\,f(Y_{i+2}, Z_{i+2}) + \frac{8}{3}\,f(Y_{i+3}, Z_{i+3})\Big)\Big],$$
$$Z_i = \mathbb{E}_{t_i}\Big[H^\psi_{t_i,4h}\,Y_{i+4} + h\Big(\frac{8}{3}\,H^\phi_{t_i,h}\,f(Y_{i+1}, Z_{i+1}) - \frac{4}{3}\,H^\phi_{t_i,2h}\,f(Y_{i+2}, Z_{i+2}) + \frac{8}{3}\,H^\phi_{t_i,3h}\,f(Y_{i+3}, Z_{i+3})\Big)\Big].$$
This 4-step method is convergent and the rate of convergence is at least of order 4, assuming that $u \in \mathcal{G}^5_b$ and $\psi \in \mathcal{B}^3[0,1]$, $\phi \in \mathcal{B}^2[0,1]$.

3 Adams methods and Predictor-Corrector methods

In this section, we introduce methods for BSDEs inspired by Adams methods from the ODE framework. These methods are of two kinds: explicit methods, also called Adams-Bashforth, and implicit methods, also called Adams-Moulton. The schemes introduced in Definition 1.1 are always explicit for the $Z$-part but may be implicit for the $Y$-part. So, for the $Z$-part, we use an Adams-Bashforth approximation, which may then be combined with an explicit or implicit approximation for the $Y$-part.

We first study methods combining an Adams-Moulton type approximation for the $Y$-part and an Adams-Bashforth type approximation for the $Z$-part. We show that these methods are really efficient because a high order rate of convergence can be achieved, assuming smoothness of the value function. We then quickly discuss the case of explicit methods, i.e. an Adams-Bashforth type approximation for both the $Y$-part and the $Z$-part. At the end of this section, we use these Adams type approximations to design Predictor-Corrector methods for BSDEs.
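Before turning to Adams methods, note that the $Y$-part order conditions $(\mathbf{C_Y})_m$ of Proposition 2.4 can be verified mechanically for any candidate coefficient set. A sketch in exact rational arithmetic (using the leap-frog coefficients of Example 2.1 and the classical Milne weights $\frac{8}{3}, -\frac{4}{3}, \frac{8}{3}$ as inputs):

```python
from fractions import Fraction as F

def y_part_order(a, b):
    """Largest m such that (C_Y)_p holds for all 1 <= p <= m, i.e.
    sum_{j=1}^r a_j j^p - p sum_{j=0}^r b_j j^{p-1} = 0;
    a = [a_1..a_r], b = [b_0..b_r] (convention 0^0 = 1 for the j = 0 term)."""
    p = 0
    while True:
        p += 1
        lhs = sum(F(a[j - 1]) * j**p for j in range(1, len(a) + 1))
        rhs = p * sum(F(b[j]) * F(j) ** (p - 1) for j in range(len(b)))
        if lhs != rhs:
            return p - 1

# Leap-frog Y-part (Example 2.1): Y_i = Y_{i+2} + 2h f(Y_{i+1})
assert y_part_order([0, 1], [0, 2, 0]) == 2
# Milne Y-part: Y_i = Y_{i+4} + h (8/3 f_{i+1} - 4/3 f_{i+2} + 8/3 f_{i+3})
assert y_part_order([0, 0, 0, 1], [0, F(8, 3), F(-4, 3), F(8, 3), 0]) == 4
```

The returned orders match the claims of Examples 2.1 and 2.2 for the $Y$-part; the $Z$-part conditions $(\mathbf{C_Z})_m$ can be checked the same way.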
3.1 Adams-Moulton-Bashforth $(AMB)_r$ methods

These methods are inspired by the Adams-Moulton method for the $Y$-part and by the Adams-Bashforth method for the $Z$-part. They have the following form, for $i \le n-r$:
$$(AMB)_r:\quad \begin{cases} Y_i = \mathbb{E}_{t_i}\Big[Y_{i+1} + h_i\sum_{j=0}^r b_{i,j,r}\,f(Y_{i+j}, Z_{i+j})\Big]\\[1ex] Z_i = \mathbb{E}_{t_i}\Big[H^\psi_{t_i,h}\,Y_{i+1} + h_i\sum_{j=1}^r \beta_{i,j,r}\,H^\phi_{t_i,jh}\,f(Y_{i+j}, Z_{i+j})\Big]\end{cases}$$
where $\psi, \phi \in \mathcal{B}^0[0,1]$. The coefficients for the $Y$-part are given by
$$b_{i,j,r} = \frac{1}{h_i}\int_{t_i}^{t_{i+1}} L_{i,j,r}(s)\,\mathrm{d}s, \quad\text{with}\quad L_{i,j,r}(t) = \prod_{k=0,\,k\ne j}^{r}\frac{t - t_{i+k}}{t_{i+j} - t_{i+k}}, \qquad 0 \le j \le r. \qquad(3.1)$$
The Lagrange polynomials $L_{i,j,r}$ are of degree $r$ and satisfy $L_{i,j,r}(t_{i+j}) = 1$, which implies
$$\sum_{j=0}^r (t_{i+j} - t_i)^k\,L_{i,j,r}(t) = (t - t_i)^k, \qquad 0 \le k \le r. \qquad(3.2)$$
The definition of the $b$-coefficients means that
$$Y_i = \mathbb{E}_{t_i}\Big[Y_{i+1} + \int_{t_i}^{t_{i+1}} Q^Y_{i,r}(t)\,\mathrm{d}t\Big],$$
where $Q^Y_{i,r}$ is the polynomial of degree at most $r$ satisfying $Q^Y_{i,r}(t_{i+j}) = f(Y_{i+j}, Z_{i+j})$, $0 \le j \le r$. In the case where the time step is constant, the coefficients do not depend on $i$ and are given by
$$b_{j,r} = \int_0^1 \ell_{j,r}(s)\,\mathrm{d}s, \quad\text{with}\quad \ell_{j,r}(s) = \prod_{k=0,\,k\ne j}^{r}\frac{s - k}{j - k}, \qquad 0 \le j \le r.$$
The coefficients for the $Z$-part are given by
$$\beta_{i,j,r} = \frac{1}{h_i}\int_{t_i}^{t_{i+1}}\tilde L_{i,j,r}(s)\,\mathrm{d}s, \qquad 1 \le j \le r, \quad\text{with}\quad \tilde L_{i,j,r}(t) = \prod_{k=1,\,k\ne j}^{r}\frac{t - t_{i+k}}{t_{i+j} - t_{i+k}}. \qquad(3.3)$$
The Lagrange polynomials $\tilde L_{i,j,r}$ are of degree $r-1$ and satisfy $\tilde L_{i,j,r}(t_{i+j}) = 1$, which implies
$$\sum_{j=1}^r (t_{i+j} - t_i)^k\,\tilde L_{i,j,r}(t) = (t - t_i)^k, \qquad 0 \le k \le r-1. \qquad(3.4)$$
The definition of the $\beta$-coefficients means that
$$Z_i = \mathbb{E}_{t_i}\Big[H^\psi_{t_i,h}\,Y_{i+1} + \int_{t_i}^{t_{i+1}} Q^Z_{i,r}(t)\,\mathrm{d}t\Big],$$
where $Q^Z_{i,r}$ is the polynomial of degree at most $r-1$ satisfying $Q^Z_{i,r}(t_{i+j}) = H^\phi_{t_i,jh}\,f(Y_{i+j}, Z_{i+j})$, $1 \le j \le r$.
In the case where the time step is constant, the coefficients do not depend on $i$ and are given by
$$\beta_{j,r} = \int_0^1 \tilde\ell_{j,r}(s)\,\mathrm{d}s, \quad\text{with}\quad \tilde\ell_{j,r}(s) = \prod_{k=1,\,k\ne j}^{r}\frac{s - k}{j - k}.$$
When the time step is constant, the table below gives the $b$-coefficients and $\beta$-coefficients for $r \le 3$:
 r |  b_{0,r}   b_{1,r}   b_{2,r}   b_{3,r} |  beta_{1,r}   beta_{2,r}   beta_{3,r}
 1 |   1/2       1/2                        |   1
 2 |   5/12      8/12     -1/12             |   3/2         -1/2
 3 |   9/24     19/24     -5/24     1/24    |   23/12       -16/12        5/12

Proposition 3.1. The $(AMB)_r$ method is convergent and at least of order $r+1$, provided that $\psi \in \mathcal{B}^r$, $\phi \in \mathcal{B}^{r-1}$ and $u \in \mathcal{G}^{r+2}_b$.

Proof.
1. The stability of the scheme comes from a direct application of Proposition 2.1, since $(\mathbf{H_c})$ obviously holds for $(AMB)_r$. Following Theorem 2.1, we only have to study the order of the method.
2.a We first study the error for the $Z$-part. Observe that, recalling (2.10),
$$\check Z_{t_i} := \mathbb{E}_{t_i}\Big[H^\psi_{t_i,h}\,Y_{t_{i+1}} + \sum_{j=1}^r H^\phi_{t_i,jh}\,f(Y_{t_{i+j}}, Z_{t_{i+j}})\int_{t_i}^{t_{i+1}}\tilde L_{i,j,r}(t)\,\mathrm{d}t\Big] = \mathbb{E}_{t_i}\Big[H^\psi_{t_i,h}\,u(t_{i+1}, X_{t_{i+1}}) - \sum_{j=1}^r H^\phi_{t_i,jh}\,u^{(0)}(t_{i+j}, X_{t_{i+j}})\int_{t_i}^{t_{i+1}}\tilde L_{i,j,r}(t)\,\mathrm{d}t\Big].$$
Using Proposition 2.3, we get
$$\check Z_{t_i} - Z_{t_i} = \sum_{k=1}^{r}\frac{h_i^k}{k!}\,u^{(1)*(0)^k}(t_i, X_{t_i}) - \sum_{j=1}^r\int_{t_i}^{t_{i+1}}\tilde L_{i,j,r}(t)\,\mathrm{d}t\,\sum_{k=0}^{r-1}\frac{u^{(1)*(0)^{k+1}}(t_i, X_{t_i})}{k!}\,(t_{i+j} - t_i)^k + O_{t_i}(|\pi|^{r+1}),$$
which reads also
$$\check Z_{t_i} - Z_{t_i} = \sum_{k=1}^{r}\frac{h_i^k}{k!}\,u^{(1)*(0)^k}(t_i, X_{t_i}) - \sum_{k=0}^{r-1}\frac{u^{(1)*(0)^{k+1}}(t_i, X_{t_i})}{k!}\int_{t_i}^{t_{i+1}}\sum_{j=1}^r (t_{i+j} - t_i)^k\,\tilde L_{i,j,r}(t)\,\mathrm{d}t + O_{t_i}(|\pi|^{r+1}).$$
Using (3.4), we obtain
$$\check Z_{t_i} - Z_{t_i} = \sum_{k=1}^{r}\Big(\frac{h_i^k}{k!} - \frac{1}{(k-1)!}\int_{t_i}^{t_{i+1}}(t - t_i)^{k-1}\,\mathrm{d}t\Big)\,u^{(1)*(0)^k}(t_i, X_{t_i}) + O_{t_i}(|\pi|^{r+1}) = O_{t_i}(|\pi|^{r+1}).$$
2.b We then study the error for the $Y$-part. Let us define
$$\bar Y_{t_i} := \mathbb{E}_{t_i}\Big[Y_{t_{i+1}} + \sum_{j=0}^r f(Y_{t_{i+j}}, Z_{t_{i+j}})\int_{t_i}^{t_{i+1}} L_{i,j,r}(t)\,\mathrm{d}t\Big]$$
and observe that
$$\check Y_{t_i} := \bar Y_{t_i} + \big(f(\check Y_{t_i}, \check Z_{t_i}) - f(Y_{t_i}, Z_{t_i})\big)\int_{t_i}^{t_{i+1}} L_{i,0,r}(t)\,\mathrm{d}t,$$
which leads, since $f$ is Lipschitz continuous, for $|\pi|$ small enough, to
$$\check Y_{t_i} - Y_{t_i} = O_{t_i}(|\bar Y_{t_i} - Y_{t_i}|) + |\pi|\,O_{t_i}(|\check Z_{t_i} - Z_{t_i}|). \qquad(3.5)$$
Now,
$$\bar Y_{t_i} = \mathbb{E}_{t_i}\Big[u(t_{i+1}, X_{t_{i+1}}) - \sum_{j=0}^r u^{(0)}(t_{i+j}, X_{t_{i+j}})\int_{t_i}^{t_{i+1}} L_{i,j,r}(t)\,\mathrm{d}t\Big].$$
Using Proposition 2.2, we get
$$\bar Y_{t_i} - Y_{t_i} = \sum_{k=1}^{r+1}\frac{h_i^k}{k!}\,u^{(0)^k}(t_i, X_{t_i}) - \sum_{j=0}^r\int_{t_i}^{t_{i+1}} L_{i,j,r}(t)\,\mathrm{d}t\,\sum_{k=0}^{r}\frac{u^{(0)^{k+1}}(t_i, X_{t_i})}{k!}\,(t_{i+j} - t_i)^k + O_{t_i}(|\pi|^{r+2}),$$
which reads also
$$\bar Y_{t_i} - Y_{t_i} = \sum_{k=1}^{r+1}\frac{h_i^k}{k!}\,u^{(0)^k}(t_i, X_{t_i}) - \sum_{k=0}^{r}\frac{u^{(0)^{k+1}}(t_i, X_{t_i})}{k!}\int_{t_i}^{t_{i+1}}\sum_{j=0}^r (t_{i+j} - t_i)^k\,L_{i,j,r}(t)\,\mathrm{d}t + O_{t_i}(|\pi|^{r+2}).$$
Using (3.2), we obtain
$$\bar Y_{t_i} - Y_{t_i} = \sum_{k=1}^{r+1}\Big(\frac{h_i^k}{k!} - \frac{1}{(k-1)!}\int_{t_i}^{t_{i+1}}(t - t_i)^{k-1}\,\mathrm{d}t\Big)\,u^{(0)^k}(t_i, X_{t_i}) + O_{t_i}(|\pi|^{r+2}) = O_{t_i}(|\pi|^{r+2}).$$
Thus, using (3.5), $\check Y_{t_i} - Y_{t_i} = O_{t_i}(|\pi|^{r+2})$ and finally
$$\mathcal{T}(\pi) = O(|\pi|^{r+1}),$$
which concludes the proof. □

3.2 Adams-Bashforth $(ABB)_r$ methods

These methods are inspired by the Adams-Bashforth method for both the $Y$-part and the $Z$-part:
$$(ABB)_r:\quad \begin{cases} Y_i = \mathbb{E}_{t_i}\Big[Y_{i+1} + h_i\sum_{j=1}^r b_{i,j,r}\,f(Y_{i+j}, Z_{i+j})\Big]\\[1ex] Z_i = \mathbb{E}_{t_i}\Big[H^\psi_{t_i,h}\,Y_{i+1} + h_i\sum_{j=1}^r \beta_{i,j,r}\,H^\phi_{t_i,jh}\,f(Y_{i+j}, Z_{i+j})\Big]\end{cases}$$
where $\psi, \phi \in \mathcal{B}^0[0,1]$. Now, the coefficients for the $Y$-part are given by
$$b_{i,j,r} = \frac{1}{h_i}\int_{t_i}^{t_{i+1}}\tilde L_{i,j,r}(s)\,\mathrm{d}s,$$
recalling (3.3). From the proof of Proposition 3.1, step 2.a, we know that we can obtain a truncation error for the $Z$-part s.t. $\check Z_{t_i} - Z_{t_i} = O_{t_i}(|\pi|^{r+1})$. But here, due to the explicit feature of the $Y$-part, and thus a global error of order $r$ only, we only need to retrieve an error of order $r$ for the $Z$-part as well. This simply means that the scheme has, for the $Z$-part, one more coefficient than needed. So one can set $\beta_{i,j,r} = b_{i,j,r}$, or $\beta_{i,r,r} = 0$ and $\beta_{i,j,r} = b_{i,j,r-1}$, $1 \le j \le r-1$. Following the arguments of the proof of Proposition 3.1, one obtains
Proposition 3.2.
The $(ABB)_r$ method is convergent and at least of order $r$, provided that $\psi \in \mathcal B_{r-1}$, $\phi \in \mathcal B_{r-1}$ and $u \in \mathcal G^{r+1}_b$.

These methods are fully explicit but have a better convergence rate than the $(ABB)_r$ methods presented above. Nevertheless, they require the computation of one more conditional expectation per step. This has to be compared in practice to the Picard iteration required by the $(AMB)_r$ approximation.

Definition 3.1.
\[
(PC)_r \quad
\begin{cases}
Z_i = \mathbb{E}_{t_i}\big[ H^\psi_{t_i,h} Y_{i+1} + h_i \sum_{j=1}^r \beta_{i,j,r}\, H^\phi_{t_i,jh}\, f(Y_{i+j}, Z_{i+j}) \big] \\
{}^p Y_i = \mathbb{E}_{t_i}\big[ Y_{i+1} + h_i \sum_{j=1}^r \beta_{i,j,r}\, f(Y_{i+j}, Z_{i+j}) \big] \\
Y_i = \mathbb{E}_{t_i}\big[ Y_{i+1} + h_i \sum_{j=1}^r b_{i,j,r}\, f(Y_{i+j}, Z_{i+j}) \big] + h_i\, b_{i,0,r}\, f({}^p Y_i, Z_i)
\end{cases} \tag{3.6}
\]
where $\psi, \phi \in \mathcal B$. The $b$-coefficients are given by (3.1) and the $\beta$-coefficients by (3.3).

Theorem 3.1.
The $(PC)_r$ method is convergent and at least of order $r+1$, provided that $\psi \in \mathcal B_r$, $\phi \in \mathcal B_{r-1}$ and $u \in \mathcal G^{r+2}_b$.

As usual, the proof of this theorem is split into two steps below. We first study the stability of the above schemes and then their truncation errors.

3.3.1 Stability
To study the stability of the method (3.6), we first introduce a perturbed version of the scheme:
\[
\begin{cases}
\tilde Z_i = \mathbb{E}_{t_i}\big[ H^\psi_{t_i,h} \tilde Y_{i+1} + h_i \sum_{j=1}^r \beta_{i,j,r}\, H^\phi_{t_i,jh}\, f(\tilde Y_{i+j}, \tilde Z_{i+j}) + \zeta^Z_i \big] \\
{}^p \tilde Y_i = \mathbb{E}_{t_i}\big[ \tilde Y_{i+1} + h_i \sum_{j=1}^r b_{i,j,r}\, f(\tilde Y_{i+j}, \tilde Z_{i+j}) \big] \\
\tilde Y_i = \mathbb{E}_{t_i}\big[ \tilde Y_{i+1} + h_i \sum_{j=1}^r b_{i,j,r}\, f(\tilde Y_{i+j}, \tilde Z_{i+j}) + h_i\, b_{i,0,r}\, f({}^p \tilde Y_i, \tilde Z_i) + \zeta^Y_i \big]
\end{cases}
\]
where $\zeta^Y_i, \zeta^Z_i$ are random variables belonging to $L^2(\mathcal F_{t_i})$, for $i \le n-r$.

Proposition 3.3.
The scheme given in (3.6) is $L^2$-stable, recalling Definition 2.1.

Proof.
For $|\pi|$ small enough, we compute, denoting ${}^p\delta Y = {}^p Y - {}^p\tilde Y$,
\[
\mathbb{E}\big[ |{}^p\delta Y_i|^2 \big] \le (1 + C|\pi|)\, \mathbb{E}\big[ |\delta Y_{i+1}|^2 \big] + C \sum_{j=1}^r h_i\, \mathbb{E}\big[ |\delta Y_{i+j}|^2 + |\delta Z_{i+j}|^2 \big], \tag{3.7}
\]
\[
\mathbb{E}\big[ |\delta Y_i|^2 \big] \le (1 + C|\pi|)\, \mathbb{E}\big[ |\delta Y_{i+1}|^2 \big] + C \Big( h_i\, \mathbb{E}\big[ |{}^p\delta Y_i|^2 \big] + \sum_{j=1}^r h_i\, \mathbb{E}\big[ |\delta Y_{i+j}|^2 + |\delta Z_{i+j}|^2 \big] \Big) + C h_i\, \mathbb{E}\big[ |\zeta^Y_i|^2 \big]. \tag{3.8}
\]
Plugging (3.7) into (3.8) and using the discrete version of Gronwall's Lemma, we obtain
\[
\mathbb{E}\big[ |\delta Y_i|^2 \big] \le C \Big( |\pi| \sum_{k=i}^{n-r} \mathbb{E}\big[ |\delta Y_k|^2 \big] + \sum_{k=i}^{n-r} h_k\, \mathbb{E}\big[ |\delta Z_k|^2 \big] + \sum_{k=i}^{n-r} h_k\, \mathbb{E}\big[ |\zeta^Y_k|^2 \big] + \sum_{k=n-r+1}^{n} \mathbb{E}\big[ |\delta Y_k|^2 \big] \Big). \tag{3.9}
\]
Using the same arguments as in step 1.b of the proof of Proposition 2.1, we retrieve that
\[
\sum_{k=i}^{n-r} h_k\, \mathbb{E}\big[ |\delta Z_k|^2 \big] \le C \Big( \sum_{k=n-r+1}^{n} \mathbb{E}\big[ |\delta Y_k|^2 + |\pi| |\delta Z_k|^2 \big] + |\pi| \sum_{k=i}^{n-r} \mathbb{E}\big[ |\delta Y_k|^2 \big] + \sum_{k=i}^{n-r} \mathbb{E}\big[ h_k |\zeta^Y_k|^2 + h_k |\zeta^Z_k|^2 \big] \Big). \tag{3.10}
\]
This leads, using (3.9), to
\[
\mathbb{E}\big[ |\delta Y_i|^2 \big] \le C \Big( |\pi| \sum_{j=i}^{n-r} \mathbb{E}\big[ |\delta Y_j|^2 \big] + \sum_{k=i}^{n-r} |\pi|\, \mathbb{E}\big[ h_k |\mathbb{E}_{t_k}[\zeta^Y_k]|^2 + |\mathbb{E}_{t_k}[\zeta^Z_k]|^2 \big] + \sum_{k=n-r+1}^{n} \mathbb{E}\big[ |\delta Y_k|^2 + |\pi| |\delta Z_k|^2 \big] \Big), \tag{3.11}
\]
which corresponds to (2.6). The proof is then concluded using the same arguments as in the proof of Proposition 2.1. $\Box$

3.3.2 Truncation error

Proposition 3.4. The scheme given in Definition 3.1 is at least of order $r+1$, provided that $\psi \in \mathcal B_r$, $\phi \in \mathcal B_{r-1}$ and $u \in \mathcal G^{r+2}_b$.

Proof.
1. The truncation error for the $Z$-part is the same as that of the $(AMB)_r$ method. From the proof of Proposition 3.1, step 1, we get $\check Z_{t_i} - Z_{t_i} = O_{t_i}(|\pi|^{r+1})$.
2. The study of the truncation error for the $Y$-part is slightly more involved. Let us define
\[
Y^*_{t_i} := \mathbb{E}_{t_i}\Big[ Y_{t_{i+1}} + h_i \sum_{j=1}^r b_{i,j,r}\, f(Y_{t_{i+j}}, Z_{t_{i+j}}) + h_i\, b_{i,0,r}\, f(Y^*_{t_i}, \hat Z_{t_i}) \Big].
\]
Using the proof of Proposition 3.1, step 2, we know that
\[
Y^*_{t_i} - Y_{t_i} = O_{t_i}(|\pi|^{r+2}); \tag{3.12}
\]
this quantity represents the truncation error for the $Y$-part of the Adams-Moulton method. We also define
\[
{}^p\check Y_{t_i} = \mathbb{E}_{t_i}\Big[ Y_{t_{i+1}} + h_i \sum_{j=1}^r \beta_{i,j,r}\, f(Y_{t_{i+j}}, Z_{t_{i+j}}) \Big],
\qquad
\check Y_{t_i} = \mathbb{E}_{t_i}\Big[ Y_{t_{i+1}} + h_i \sum_{j=1}^r b_{i,j,r}\, f(Y_{t_{i+j}}, Z_{t_{i+j}}) \Big] + h_i\, b_{i,0,r}\, f({}^p\check Y_{t_i}, \check Z_{t_i}).
\]
The term ${}^p\check Y_{t_i} - Y_{t_i}$ then represents the truncation error of the predictor part; adapting the arguments of Proposition 3.1, step 1, we have
\[
{}^p\check Y_{t_i} - Y_{t_i} = O_{t_i}(|\pi|^{r+1}). \tag{3.13}
\]
The term $\check Y_{t_i} - Y_{t_i}$ is the truncation error we are interested in. We then observe that
\[
\check Y_{t_i} = Y^*_{t_i} + h_i\, b_{i,0,r} \big( f({}^p\check Y_{t_i}, \check Z_{t_i}) - f(Y^*_{t_i}, \check Z_{t_i}) \big)
= Y^*_{t_i} + h_i\, b_{i,0,r} \big( f({}^p\check Y_{t_i}, \check Z_{t_i}) - f(Y_{t_i}, \check Z_{t_i}) + f(Y_{t_i}, \check Z_{t_i}) - f(Y^*_{t_i}, \check Z_{t_i}) \big).
\]
Since $f$ is Lipschitz continuous, we obtain
\[
\check Y_{t_i} - Y_{t_i} = O_{t_i}(|Y^*_{t_i} - Y_{t_i}|) + |\pi|\, O_{t_i}(|{}^p\check Y_{t_i} - Y_{t_i}|).
\]
Combining (3.12) and (3.13), we then obtain $\check Y_{t_i} - Y_{t_i} = O_{t_i}(|\pi|^{r+2})$.
3. From steps 1 and 2 above, we obtain that $T(\pi) = O(|\pi|^{r+1})$, which concludes the proof. $\Box$

In this part, we provide a numerical illustration of the results presented above. The scheme given in Definition 1.1 is still a theoretical one, because in practice one has to compute the conditional expectations involved. Many methods have already been studied in the context of BSDEs: regression methods [14], quantization methods [1, 10], Malliavin calculus methods [3, 4] and tree based methods, e.g. cubature methods [9]. To illustrate our previous results, i.e. the order of the time discretization error, we focus on the simple case where $d = 1$ and $X = W$, in the spirit of [22]. Obviously, further numerical experiments are needed, especially in high dimension. Because we are looking towards high order approximation, it seems reasonable to combine the present multi-step schemes with cubature methods [9]. This is left for further research.

In the sequel, we also assume that the $r$ terminal conditions are perfectly known. Generally, this won't be the case, but it is not really a problem, see Remark 1.1 (i). We explain below how the Brownian motion is approximated and give the expression of the numerical scheme which is implemented in practice. We show that this scheme is convergent and characterise its convergence order. The error we are dealing with is now composed of the discrete-time error and the space discretization error. Finally, we provide some numerical results, where we compute the empirical convergence rate.

In order to define the scheme implemented in practice, we use a multinomial approximation of the Brownian motion. Let us consider a discrete random variable $\xi$ matching the moments of a Gaussian variable $G$ up to order $K$, i.e.
\[
\mathbb{E}\big[ \xi^k \big] = \mathbb{E}\big[ G^k \big], \quad k \le K.
\]
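Such a $\xi$ can be obtained, for instance, from a Gauss-Hermite quadrature rule: an $n$-point rule yields a discrete variable matching the Gaussian moments up to order $K = 2n-1$. A minimal sketch (our own illustration; the function name is ours):

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss  # probabilists' weight exp(-x^2/2)

def multinomial_xi(npts):
    """Discrete variable (values, probabilities) matching the N(0,1)
    moments up to order 2*npts - 1, built from Gauss-Hermite quadrature."""
    x, w = hermegauss(npts)
    return x, w / w.sum()  # raw weights sum to sqrt(2*pi): normalise to probabilities

vals, probs = multinomial_xi(3)  # 3 points: order-5 moment matching
moments = [float(probs @ vals**k) for k in range(6)]
# moments of N(0,1) up to order 5: 1, 0, 1, 0, 3, 0
```

In this construction, the order 3, 5 and 7 approximations used in the experiments below correspond to 2-, 3- and 4-point rules respectively.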
In dimension 1, an efficient way to construct $\xi$ is to use quadrature formulas. On a (discrete, but big enough) probability space $(\widehat\Omega, \widehat{\mathbb{P}})$, we are then given $(\xi_i)_{i \le n}$, i.i.d. random variables with the same law as $\xi$, and define
\[
\widehat W^{x,0}_{t_i} = x + \sum_{j=1}^i \sqrt{t_j - t_{j-1}}\, \xi_j, \quad \forall\, t_i \in \pi. \tag{4.1}
\]
For later use, we say that $\widehat W^{x,0}$ is an order-$K$ approximation of the Brownian motion. We also denote by $(\widehat{\mathcal F}_t)_{t \in \pi}$ the filtration generated by $(\widehat W^{x,0}_t)_{t \in \pi}$ and by $\widehat{\mathbb{E}}_t[\,\cdot\,]$ the related conditional expectation.

We can now define the numerical scheme which is used in practice.

Definition 4.1. (Linear multi-step)
(i) To initialise the scheme with $r$ steps, $r \ge 1$, we set, for $j \le r-1$,
\[
(Y_{n-j}, Z_{n-j}) = \big( u(t_{n-j}, \widehat W^{x,0}_{t_{n-j}}),\ \partial_x u(t_{n-j}, \widehat W^{x,0}_{t_{n-j}}) \big).
\]
(ii) For $i \le n-r$, the computation of $(Y_i, Z_i)$ involves $r$ steps and is given by
\[
\begin{cases}
\widehat Y_i = \widehat{\mathbb{E}}_{t_i}\big[ \sum_{j=1}^r a_j \widehat Y_{i+j} + h_i \sum_{j=0}^r b_j\, f(\widehat Y_{i+j}, \widehat Z_{i+j}) \big] \\
\widehat Z_i = \widehat{\mathbb{E}}_{t_i}\big[ \sum_{j=1}^r \widehat H_{i,j} \big( \alpha_j \widehat Y_{i+j} + h_i \beta_j\, f(\widehat Y_{i+j}, \widehat Z_{i+j}) \big) \big]
\end{cases} \tag{4.2}
\]
The coefficients $\widehat H_{i,j}$ are the discrete version of the coefficients given in (1.6). From Proposition 2.3 (iii), we observe that, in the case $X = W$, they can simply be defined as approximations of the Brownian increment, i.e.
\[
\widehat H_{i,j} = \frac{\widehat W^{x,0}_{t_{i+j}} - \widehat W^{x,0}_{t_i}}{t_{i+j} - t_i}. \tag{4.3}
\]
When implementing Predictor-Corrector methods, we use

Definition 4.2. (Predictor-Corrector)
(i) To initialise the scheme with $r$ steps, $r \ge 1$, we set, for $j \le r-1$,
\[
(Y_{n-j}, Z_{n-j}) = \big( u(t_{n-j}, \widehat W^{x,0}_{t_{n-j}}),\ \partial_x u(t_{n-j}, \widehat W^{x,0}_{t_{n-j}}) \big).
\]
(ii) For $i \le n-r$, the computation of $(Y_i, Z_i)$ involves $r$ steps and is given by
\[
(PC)_r \quad
\begin{cases}
\widehat Z_i = \widehat{\mathbb{E}}_{t_i}\big[ \widehat H_{i,1} \widehat Y_{i+1} + h_i \sum_{j=1}^r \widehat H_{i,j}\, \beta_j\, f(\widehat Y_{i+j}, \widehat Z_{i+j}) \big] \\
{}^p\widehat Y_i = \widehat{\mathbb{E}}_{t_i}\big[ \widehat Y_{i+1} + h_i \sum_{j=1}^r \beta_{i,j,r}\, f(\widehat Y_{i+j}, \widehat Z_{i+j}) \big] \\
\widehat Y_i = \widehat{\mathbb{E}}_{t_i}\big[ \widehat Y_{i+1} + h_i \sum_{j=1}^r b_{i,j,r}\, f(\widehat Y_{i+j}, \widehat Z_{i+j}) \big] + h_i\, b_{i,0,r}\, f({}^p\widehat Y_i, \widehat Z_i)
\end{cases}
\]
where the $b$-coefficients are given by (3.1), the $\beta$-coefficients by (3.3) and the $\widehat H$-coefficients by (4.3).

Proposition 4.1. (i) In Definition 4.1, if we assume that the method given by the coefficients $a$, $b$, $\alpha$, $\beta$ is of order $m$, according to Definition 2.2, and that the multinomial approximation of the Brownian motion is of order $K = 2m+1$, then we have
\[
Y_0 - \widehat Y_0 = O(h^m),
\]
provided that the coefficient $f$ and the value function $u$ are smooth enough.
(ii) In Definition 4.2, for the $(PC)_r$ method, if we assume that the multinomial approximation of the Brownian motion is of order $K = 2r+3$, then we have
\[
Y_0 - \widehat Y_0 = O(h^{r+1}),
\]
provided that the coefficient $f$ and the value function $u$ are smooth enough.

The proof of this proposition is postponed to the end of this section. We can now turn to a concrete example which illustrates the above order of convergence.
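On a uniform grid, the $b$- and $\beta$-coefficients referenced above are integrals of Lagrange basis polynomials over one time step (cf. (3.1) and (3.3)) and reduce to classical Adams weights. A minimal sketch of their computation (our own helper, not from the text):

```python
import numpy as np

def adams_weights(nodes, a, b):
    """w_j = (1/(b-a)) * int_a^b L_j(t) dt, where L_j is the Lagrange basis
    polynomial attached to `nodes` (L_j(nodes[j]) = 1, zero at the others)."""
    w = []
    for j, tj in enumerate(nodes):
        others = [t for k, t in enumerate(nodes) if k != j]
        num = np.poly(others)                      # monic polynomial vanishing at the other nodes
        den = np.prod([tj - t for t in others])
        P = np.polyint(num / den)                  # antiderivative of L_j
        w.append((np.polyval(P, b) - np.polyval(P, a)) / (b - a))
    return np.array(w)

# integrate over [t_i, t_{i+1}] = [0, 1] with unit step:
adams_weights([1.0, 2.0], 0.0, 1.0)       # explicit 2-step: [3/2, -1/2]
adams_weights([0.0, 1.0, 2.0], 0.0, 1.0)  # implicit 2-step: [5/12, 2/3, -1/12]
```

Interpolating at the forward nodes $t_{i+1}, \dots, t_{i+r}$ gives the explicit (Bashforth-type) weights; adding the node $t_i$ gives the implicit (Moulton-type) ones.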
As in [22], we consider the process, on $[0, T]$,
\[
(X_t, Y_t, Z_t) = \Big( W_t,\ \frac{1}{1 + \exp(-W_t - t)},\ \frac{\exp(-W_t - t)}{(1 + \exp(-W_t - t))^2} \Big).
\]
This process is the solution of the (decoupled) FBSDE
\[
X_t = W_t, \qquad
Y_t = g_T(W_T) + \int_t^T f(Y_s, Z_s)\,\mathrm{d}s - \int_t^T Z_s\,\mathrm{d}W_s,
\]
where the driver $f$ is given by
\[
f(y, z) = -z \Big( \frac{3}{2} - y \Big), \quad \text{and} \quad g_T(x) = \frac{1}{1 + \exp(-x - T)}. \tag{4.4}
\]
To approximate the value of $Y_0$, we consider the following methods:
1. Implicit Euler approximation, coupled with an order 3 Brownian approximation.
2. Crank-Nicolson approximation, coupled with an order 5 Brownian approximation.
3. Explicit two-step Adams method, coupled with an order 5 Brownian approximation.
4. Implicit two-step Adams method, coupled with an order 7 Brownian approximation.
5. Heun method, which is a Predictor-Corrector method, coupled with an order 5 Brownian approximation.
The log-log graph in Fig. 1 plots the error against the number of time steps for these five methods; the fitted slopes are approximately $-1.00$, $-1.95$, $-2.02$, $-2.05$ and $-2.82$, in line with the theoretical orders $1$, $2$, $2$, $2$ and $3$.
Figure 1: Illustration of the convergence rate

The graph in Fig. 2 shows the effect of the space discretization on the Crank-Nicolson method, coupled with Brownian approximations of orders 3, 5 and 7: the fitted slopes are approximately $-1.03$, $-2.27$ and $-2.08$ respectively, so an order 3 approximation degrades the observed rate, while orders 5 and 7 preserve the theoretical rate 2.
Figure 2: Impact of space discretization
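A toy version of this experiment can be written in a few lines. The sketch below is our own simplified setup, not one of the schemes benchmarked above: it solves the logistic example by backward induction on the recombining 3-point lattice of (4.1) (order-5 moment matching), with a plain one-step explicit Euler scheme, so the expected rate is only 1:

```python
import numpy as np

# exact solution u(t,x) = 1/(1+exp(-x-t)); the driver f(y,z) = -z*(3/2 - y)
# is chosen consistent with this u (so that u_t + u_xx/2 + f(u, u_x) = 0)
def u(t, x):
    return 1.0 / (1.0 + np.exp(-x - t))

def f(y, z):
    return -z * (1.5 - y)

def solve_bsde(n, T=1.0):
    h = T / n
    dx = np.sqrt(3.0 * h)                 # 3-point lattice: moves -dx, 0, +dx
    p = np.array([1/6, 2/3, 1/6])         # matches N(0,h) moments up to order 5
    dW = np.array([-dx, 0.0, dx])
    x = dx * np.arange(-n, n + 1)         # terminal lattice: 2n+1 states
    Y = u(T, x)                           # terminal condition g_T
    Z = Y * (1.0 - Y)                     # u_x = u(1-u)
    for i in range(n - 1, -1, -1):
        Ynxt = np.stack([Y[:-2], Y[1:-1], Y[2:]])   # the 3 children of each state
        Znxt = np.stack([Z[:-2], Z[1:-1], Z[2:]])
        Znew = (p * dW) @ Ynxt / h                  # Z_i = E_i[Y_{i+1} dW]/h
        Ynew = p @ (Ynxt + h * f(Ynxt, Znxt))       # explicit Euler step
        Y, Z = Ynew, Znew
    return float(Y[0]), float(Z[0])

y0, z0 = solve_bsde(100)
# compare with the exact values u(0,0) = 0.5 and u_x(0,0) = 0.25
```

Doubling $n$ should roughly halve the error at $t = 0$, in line with an order 1 method.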
We only provide the proof of (i); the proof of (ii) follows from the same arguments, using the proofs of Proposition 3.3 and Proposition 3.4.

1. Notations
We first need to consider 'functional' versions of the schemes above. Let us introduce the following operators, related to the theoretical scheme given in Definition 1.1:
\[
R^Z_{i,j} : (\mathcal C_b)^2 \to \mathcal C_b, \qquad
R^Z_{i,j}[\varphi^Y, \varphi^Z](x) = \mathbb{E}\Big[ H^{0,x}_{t_i,jh} \Big( \alpha_j\, \varphi^Y(W^{x,t_i}_{t_{i+j}}) + h \beta_j\, f\big( \varphi^Y(W^{x,t_i}_{t_{i+j}}), \varphi^Z(W^{x,t_i}_{t_{i+j}}) \big) \Big) \Big],
\]
\[
R^Y_{i,j} : (\mathcal C_b)^2 \to \mathcal C_b, \qquad
R^Y_{i,j}[\varphi^Y, \varphi^Z](x) = \mathbb{E}\Big[ a_j\, \varphi^Y(W^{x,t_i}_{t_{i+j}}) + h b_j\, f\big( \varphi^Y(W^{x,t_i}_{t_{i+j}}), \varphi^Z(W^{x,t_i}_{t_{i+j}}) \big) \Big],
\]
\[
\widehat R^Z_{i,j} : (\mathcal C_b)^2 \to \mathcal C_b, \qquad
\widehat R^Z_{i,j}[\varphi^Y, \varphi^Z](x) = \widehat{\mathbb{E}}\Big[ \widehat H_{i,j} \Big( \alpha_j\, \varphi^Y(\widehat W^{x,t_i}_{t_{i+j}}) + h \beta_j\, f\big( \varphi^Y(\widehat W^{x,t_i}_{t_{i+j}}), \varphi^Z(\widehat W^{x,t_i}_{t_{i+j}}) \big) \Big) \Big],
\]
\[
\widehat R^Y_{i,j} : (\mathcal C_b)^2 \to \mathcal C_b, \qquad
\widehat R^Y_{i,j}[\varphi^Y, \varphi^Z](x) = \widehat{\mathbb{E}}\Big[ a_j\, \varphi^Y(\widehat W^{x,t_i}_{t_{i+j}}) + h b_j\, f\big( \varphi^Y(\widehat W^{x,t_i}_{t_{i+j}}), \varphi^Z(\widehat W^{x,t_i}_{t_{i+j}}) \big) \Big].
\]
The functional version of the scheme given in Definition 4.1 then reads, for $i \le n-r$,
\[
\widehat y_i(x) = \sum_{j=1}^r \widehat R^Y_{i,j}[\widehat y_{i+j}, \widehat z_{i+j}](x), \qquad
\widehat z_i(x) = \sum_{j=1}^r \widehat R^Z_{i,j}[\widehat y_{i+j}, \widehat z_{i+j}](x),
\]
given the $r$ initial data $(\widehat y_{n-j}, \widehat z_{n-j}) = (u(t_{n-j}, \cdot), \partial_x u(t_{n-j}, \cdot))$, $j \le r-1$.
Due to the Markov property of the discrete process $(\widehat W^{x,0}_t)_{t \in \pi}$, it is easily checked that $\widehat Y_i = \widehat y_i(\widehat W^{x,0}_{t_i})$ and $\widehat Z_i = \widehat z_i(\widehat W^{x,0}_{t_i})$. Finally, we define $\widetilde Y_i = u(t_i, \widehat W^{x,0}_{t_i})$ and $\widetilde Z_i = \partial_x u(t_i, \widehat W^{x,0}_{t_i})$. Observe that $\widetilde Y_0 = u(0, x)$ and that, for $j \le r-1$, $(\widehat Y_{n-j}, \widehat Z_{n-j}) = (\widetilde Y_{n-j}, \widetilde Z_{n-j})$.

2. Stability
The key observation here is that $(\widetilde Y_i, \widetilde Z_i)$ can be seen as a perturbed version of the scheme given in (4.2), namely
\[
\begin{cases}
\widetilde Y_i = \widehat{\mathbb{E}}_{t_i}\big[ \sum_{j=1}^r a_j \widetilde Y_{i+j} + h \sum_{j=0}^r b_j\, f(\widetilde Y_{i+j}, \widetilde Z_{i+j}) \big] + {}^t\zeta^Y_i + {}^s\zeta^Y_i \\
\widetilde Z_i = \widehat{\mathbb{E}}_{t_i}\big[ \sum_{j=1}^r \widehat H_{i,j} \big( \alpha_j \widetilde Y_{i+j} + h \beta_j\, f(\widetilde Y_{i+j}, \widetilde Z_{i+j}) \big) \big] + {}^t\zeta^Z_i + {}^s\zeta^Z_i
\end{cases}
\]
where the local error due to the time discretization is
\[
{}^t\zeta^Y_i = \mathbb{E}\big[ Y_{t_i} - \check Y_{t_i} \,\big|\, X_{t_i} = \widehat W^{x,0}_{t_i} \big], \qquad
{}^t\zeta^Z_i = \mathbb{E}\big[ Z_{t_i} - \check Z_{t_i} \,\big|\, X_{t_i} = \widehat W^{x,0}_{t_i} \big], \tag{4.5}
\]
recalling (2.9)-(2.10), and the local error due to the 'space discretization' is
\[
\begin{cases}
{}^s\zeta^Y_i = \sum_{j=1}^r \big( R^Y_{i,j} - \widehat R^Y_{i,j} \big)[u(t_{i+j}, \cdot), \partial_x u(t_{i+j}, \cdot)](\widehat W^{x,0}_{t_i}) \\
{}^s\zeta^Z_i = \sum_{j=1}^r \big( R^Z_{i,j} - \widehat R^Z_{i,j} \big)[u(t_{i+j}, \cdot), \partial_x u(t_{i+j}, \cdot)](\widehat W^{x,0}_{t_i}).
\end{cases} \tag{4.6}
\]
Now, we can apply Proposition 2.1, recalling Remark 2.2, to obtain in particular that
\[
|\widetilde Y_0 - \widehat Y_0| \le C|\pi| \sum_{i=0}^{n-r} \widehat{\mathbb{E}}\big[ h_i\, |{}^t\zeta^Y_i + {}^s\zeta^Y_i| + |{}^t\zeta^Z_i + {}^s\zeta^Z_i| \big]. \tag{4.7}
\]

3. Study of the local error
We now turn to the study of the local errors $(\zeta^Y_i, \zeta^Z_i)_{i \le n-r}$. Assuming that the functions are smooth enough, we compute the following expansion:
\[
\big( R^Y_{i,j} - \widehat R^Y_{i,j} \big)[u(t_{i+j}, \cdot), u^{(1)}(t_{i+j}, \cdot)](x)
= \sum_{k=0}^K \frac{1}{k!}\, \chi^{(k)}_{i,j}(x)\, \mathbb{E}\big[ (W^{0,t_i}_{t_{i+j}})^k \big] - \sum_{k=0}^K \frac{1}{k!}\, \chi^{(k)}_{i,j}(x)\, \widehat{\mathbb{E}}\big[ (\widehat W^{0,t_i}_{t_{i+j}})^k \big] + O\big( |\pi|^{\frac{K+1}{2}} \big),
\]
where the $\chi_{i,j}$ are functions depending on $f$, $u$ and the coefficients of the methods. Using the moment matching property of $\widehat W^{0,t_i}$, we easily obtain that
\[
\widehat{\mathbb{E}}\Big[ \big| \big( R^Y_{i,j} - \widehat R^Y_{i,j} \big)[u(t_{i+j}, \cdot), u^{(1)}(t_{i+j}, \cdot)](\widehat W^{x,0}_{t_i}) \big| \Big] \le C |\pi|^{\frac{K+1}{2}}.
\]
For the $Z$ part, we have
\[
\big( R^Z_{i,j} - \widehat R^Z_{i,j} \big)[u(t_{i+j}, \cdot), u^{(1)}(t_{i+j}, \cdot)](x)
= \sum_{k=0}^{K-1} \frac{1}{k!}\, \chi^{(k)}_{i,j}(x)\, \mathbb{E}\big[ H_{i,j} (W^{0,t_i}_{t_{i+j}})^k \big] - \sum_{k=0}^{K-1} \frac{1}{k!}\, \chi^{(k)}_{i,j}(x)\, \widehat{\mathbb{E}}\big[ \widehat H_{i,j} (\widehat W^{0,t_i}_{t_{i+j}})^k \big] + O\big( |\pi|^{\frac{K-1}{2}} \big).
\]
Using the moment matching property of $\widehat W^{0,t_i}$, we easily obtain that
\[
\widehat{\mathbb{E}}\Big[ \big| \big( R^Z_{i,j} - \widehat R^Z_{i,j} \big)[u(t_{i+j}, \cdot), u^{(1)}(t_{i+j}, \cdot)](\widehat W^{x,0}_{t_i}) \big| \Big] \le C |\pi|^{\frac{K-1}{2}}.
\]
Combining the above estimates with (4.7) and the fact that the discrete-time error is of order $m$ leads to
\[
|\widetilde Y_0 - \widehat Y_0| \le C |\pi|^m + C |\pi|^{\frac{K-1}{2}},
\]
which concludes the proof, since $K = 2m+1$. $\Box$

References

[1]
Bally, V. and G. Pagès (2003), Error analysis of the quantization algorithm for obstacle problems, Stochastic Processes and their Applications, 1-40.

[2] Bouchard, B. and R. Elie (2008), Discrete-time approximation of decoupled forward-backward SDE with jumps, Stochastic Processes and their Applications, 53-75.

[3] Bouchard, B. and N. Touzi (2004), Discrete-time approximation and Monte-Carlo simulation of Backward Stochastic Differential Equations, Stochastic Processes and their Applications, (2), 175-206.

[4] Bouchard, B. and X. Warin (2011), Monte-Carlo valorisation of American options: facts and new algorithms to improve existing methods, to appear in Numerical Methods in Finance, Springer Proceedings in Mathematics, ed. R. Carmona, P. Del Moral, P. Hu and N. Oudjane.

[5] Butcher, J. C. (2008), Numerical Methods for Ordinary Differential Equations, Second Edition, Wiley.

[6] Chassagneux, J.-F. (2008), Processus réfléchis en finance et probabilité numérique, PhD thesis, Université Paris Diderot - Paris 7.

[7] Chassagneux, J.-F. and D. Crisan (2012), Runge-Kutta scheme for BSDEs, preprint.

[8] Crisan, D. and K. Manolarakis (2009), Solving Backward Stochastic Differential Equations using the Cubature Method, preprint.

[9] Crisan, D. and K. Manolarakis (2010), Second order discretization of a Backward SDE and simulation with the cubature method, preprint.

[10] Delarue, F. and S. Menozzi (2006), A forward-backward algorithm for quasi-linear PDEs, Annals of Applied Probability, 140-184.

[11] Demailly, J.-P., Analyse numérique et équations différentielles, 3e édition, EDP Sciences.

[12] El Karoui, N., S. Peng and M.C. Quenez (1997), Backward Stochastic Differential Equations in finance, Mathematical Finance, (1), 1-71.

[13] Gobet, E. and C. Labart (2007), Error expansion for the discretization of backward stochastic differential equations, Stochastic Processes and their Applications, 803-829.

[14] Gobet, E., J.-P. Lemor and X. Warin (2006), Rate of convergence of an empirical regression method for solving generalized backward stochastic differential equations, Bernoulli, (5), 889-916.

[15] Kloeden, P. E. and E. Platen (1992), Numerical Solution of Stochastic Differential Equations, Applied Math. 23, Springer, Berlin.

[16] Pardoux, E. and S. Peng (1990), Adapted solution of a backward stochastic differential equation, Systems and Control Letters, 55-61.

[17] Pardoux, E. and S. Peng (1992), Backward stochastic differential equations and quasilinear parabolic partial differential equations, in Stochastic Partial Differential Equations and Their Applications (Charlotte, NC, 1991), 200-217, volume 176 of Lecture Notes in Control and Information Sciences, Springer, Berlin.

[18] Richou, A. (2010), Numerical simulation of BSDEs with drivers of quadratic growth, forthcoming in The Annals of Applied Probability.

[19] Zhao, W., L. Chen and S. Peng (2006), A new kind of accurate numerical method for backward stochastic differential equations, SIAM J. Sci. Comput., 1563-1581.

[20] Zhao, W., Y. Li and G. Zhang (2012), A generalized theta-scheme for solving backward stochastic differential equations, Dis. Cont. Dyn. Sys. B, 1585-1603.

[21] Zhao, W., J. Wang and S. Peng (2009), Error estimates of the theta-scheme for backward stochastic differential equations, Dis. Cont. Dyn. Sys. B, 905-924.

[22] Zhao, W., G. Zhang and L. Ju (2010), A stable multistep scheme for solving Backward Stochastic Differential Equations, SIAM J. Numer. Anal., (4), 1369-1394.

[23] Zhang, J. (2004), A numerical scheme for backward stochastic differential equations, Annals of Applied Probability, 14.