Backward Deep BSDE Methods and Applications to Nonlinear Problems

Abstract

In this paper, we present a backward deep BSDE method applied to Forward Backward Stochastic Differential Equations (FBSDE) with a given terminal condition at maturity that time-steps the BSDE backwards. We present an application of this method to a nonlinear pricing problem - the differential rates problem. To time-step the BSDE backward, one needs to solve a nonlinear problem. For the differential rates problem, we derive an exact solution of this time-step problem and a Taylor-based approximation. Previous backward deep BSDE methods only treated zero or linear generators. A Taylor approach for nonlinear generators was previously mentioned but had not been implemented or applied; we apply our method to nonlinear generators, derive the details, and present results. Likewise, previous backward deep BSDE methods were presented for fixed initial risk factor values $X_0$ only, while we present a version with random $X_0$ and a version that learns portfolio values at intermediate times as well. The method is able to solve nonlinear FBSDE problems in high dimensions.


Yajie Yu, Bernhard Hientzsch, and Narayan Ganesan
Corporate Model Risk, Wells Fargo
arXiv preprint [q-fin.CP]

As proposed in E, Han and Jentzen [EHJ17], deep learning (DL) and deep neural network (DNN) methods can be used to solve high-dimensional nonlinear PDEs by converting them to Forward Backward Stochastic Differential Equations (FBSDE) and building neural networks to learn the control and initial value of the corresponding stochastic control problem. One example in that paper uses their proposed forward deep BSDE method to price a combination of two call options with differential rates (different borrowing and lending interest rates) as studied in Mercurio [Mer15]. Hientzsch [Hie19] also gives an overview of pricing different instruments in quantitative finance via deep BSDE and FBSDE.
Ganesan, Yu and Hientzsch [GYH20] show how to price barrier options with deep BSDE and FBSDE.

Han, Jentzen and E [HJE18] propose time-stepping both the forward and the backward SDE forward in time and transform the final value problem into a stochastic control problem in which the objective function measures how well the given final value has been approximated. We call their method the "forward deep BSDE" method since it time-steps the BSDE forward. Wang et al. [WCS+18] consider a BSDE with zero drift term, which can be trivially time-stepped backwards, and propose and demonstrate forward and backward methods with fixed $X_0$, describing the first backward deep BSDE method. Liang, Xu and Li [LXL19] solve BSDEs with linear generators with both forward and backward methods in their examples. They indicate that a nonlinear generator could be handled with a Taylor expansion approach, but they neither work out nor implement the case of the nonlinear generator in the backward method. In this paper, we describe both the general approach and its application to the differential rates setting for two variants of the backward method, demonstrating, to the best of our knowledge, the first application of the backward method to nonlinear problems.

The main idea of the backward method is that the BSDE is started at maturity with the given final value and then time-stepped backwards until a given initial time $t_0$. In the case of the dynamics of $X$ being started at $t_0$ at a fixed value $X_0$, with the right trading strategy, for the time-continuous case, all realizations of $Y_{t_0}$ (spot price of the derivative at initial time) should have the same value at time $t_0$. Thus, a measure of the size of the range of $Y_{t_0}$ (the variance, in this particular case) is picked as the objective function, and the variance over the mini-batch is used in mini-batch stochastic gradient descent. For the case of random $X_0$, we minimize the squared distance from an also to-be-determined function $\mathrm{yinit}(X_0)$ represented by a DNN. This function is also the predictable adapted $L^2$-projection of the values obtained from the pathwise roll-back.

The particular nonlinear pricing problem that we consider is the case of differential rates together with Black-Scholes forward dynamics for a European option pricing problem involving, for example, a linear combination of two calls with coefficients of opposite signs.
Differential rates mean that positive cash balances in the trading strategy accrue interest at a lower lending rate while negative cash balances (debts/loans) accrue interest at a higher borrowing rate. Standard self-financing trading strategy arguments lead to a nonlinear BSDE.

For the differential rates problem, E, Han and Jentzen [EHJ17] present a nonlinear PDE which can be solved by appropriate nonlinear PDE solvers in small dimensions (see, for instance, Forsyth and Labahn [FL07]). For a more general setting, Mercurio [Mer15] presents PDEs and proposes PDE solution or binomial tree methods. None of these methods works in higher dimensions due to the curse of dimensionality. All these methods require a problem-specific implementation of a nonlinear PDE or tree solver.

Standard Monte-Carlo approaches that simulate, discount, and average cannot handle nonlinear pricing or control problems that depend on the solution or its gradient.

There are some other approaches to such nonlinear problems in high dimensions, such as Warin [War18] or Huré, Pham and Warin [HPW19]. However, they use nested Monte-Carlo or more elaborate methods rather than a path-wise approach (and they do not treat the differential rates problem as an example).

In this paper, we first introduce FBSDE for general nonlinear problems, with particular details for the differential rates problem, time-discretize them, and then derive exact and Taylor approximations for the backward step. We then quickly describe the forward and backward deep BSDE approaches that we consider - both the batch-variance variant already described in the literature and the novel initial variable and network versions, the last one for random $X_0$, together with the computational graphs for the implementations. Then we apply these methods to the differential rates problem for the call combination case from Han, Jentzen and E [HJE18] and for the straddle case from Forsyth and Labahn [FL07].
We compare the results for a case with fixed $X_0$ and for a case with varying $X_0$ with the results from Forsyth and Labahn [FL07] and see that they agree well. We visualize and discuss some of the results. Finally, we conclude.

One type of nonlinear PDE that we are interested in solving has the general form

$$u_t(t,x) + \mathcal{L}_t u(t,x) + f\big(t, x, u(t,x), \nabla u(t,x)\big) = 0, \tag{1}$$

with

$$\mathcal{L}_t u(t,x) := \tfrac{1}{2}\,\mathrm{Tr}\big(\sigma_N \sigma_N^T(t,x)\,(\mathrm{Hess}_x u)(t,x)\big) + \mu(t,x)\,\nabla u(t,x), \tag{2}$$

where $\mathrm{Hess}_x u$ is the Hessian matrix, with terminal condition at maturity given as

$$u(T,x) = g(x). \tag{3}$$

A nonlinear Feynman-Kac theorem shows that the solution of the above PDE also satisfies the following FBSDE system under appropriate assumptions.

The forward SDE (FSDE) for the underlying assets is

$$dX_t = \mu(t, X_t)\,dt + \sigma_N(t, X_t)\,dW_t, \tag{4}$$

and the backward SDE (BSDE), in terms of the coefficient of the Brownian motion $Z_t$, is

$$-dY_t = f_Z(t, X_t, Y_t, Z_t)\,dt - Z_t^T\,dW_t, \tag{5}$$

or, in terms of values $\Pi_t$,

$$-dY_t = f(t, X_t, Y_t, \Pi_t)\,dt - \Pi_t^T \sigma_{LN}(t, X_t)\,dW_t, \tag{6}$$

with terminal condition

$$Y_T = g(X_T), \tag{7}$$

where

$$Y_t = u(t, X_t), \quad \Pi_t = \nabla_X u(t, X_t), \quad Z_t = \sigma_{LN}^T(t, X_t)\,\Pi_t. \tag{8}$$

In terms of pricing applications in finance, $g(X_T)$ is the final payoff of the European option that one tries to replicate with a self-financing portfolio in the underlying asset(s) $X_t$ and a remaining cash position. That portfolio will contain $\pi_i(t)$ worth of $X_i(t)$ ($\Pi_t$ being the vector of $\pi_i(t)$). The portfolio (including the cash position) is worth $Y_t$ at time $t$.

Now $Y_t$, or equivalently $u(t, X_t)$, represents the wealth needed at $t$ to exactly or approximately replicate the payoff when starting at $X_t$ at time $t$. This gives one of the possible ways to define price (pricing by replication): $\mathrm{price}(t, X_t; X_T \mapsto g(X_T))$ as the solution of the FBSDE and/or the nonlinear PDE.
Linear pricing satisfies (among other things)

$$\mathrm{price}(t, X_t; X_T \mapsto g(X_T)) = -\,\mathrm{price}(t, X_t; X_T \mapsto -g(X_T)). \tag{9}$$

In nonlinear pricing in general (as, for instance, for differential rates, as we will see in examples later), these two prices are no longer necessarily the same but will give an upper and a lower price.

Using the Euler-Maruyama method to discretize forward in time for both $X_t$ and $Y_t$, we have

$$X_{t_{i+1}} = X_{t_i} + \mu(t_i, X_{t_i})\,\Delta t_i + \sigma(t_i, X_{t_i})\,\Delta W_i \tag{10}$$

and

$$Y_{t_{i+1}} = Y_{t_i} - f(t_i, X_{t_i}, Y_{t_i}, \Pi_{t_i})\,\Delta t_i + \Pi_{t_i}^T \sigma(t_i, X_{t_i})\,\Delta W_i. \tag{11}$$

Notes: If $\Pi_t$ measures the hedging delta in the portfolio, the stochastic term of the $Y$ BSDE would contain $\sigma_N$ rather than $\sigma_{LN}$, where $\sigma_N(t, X) = \sigma_{LN}(t, X)\,X$. The portfolio contains $\pi_i(t)$ worth of $X_i(t)$ if measured by value, or $\pi_i(t) X_i(t)$ worth of $X_i(t)$ if measured by delta/size.

Backward Time-stepping

To step backward in time, we rewrite (11) as

$$Y_{t_i} - f(t_i, X_{t_i}, Y_{t_i}, \Pi_{t_i})\,\Delta t_i = Y_{t_{i+1}} - \Pi_{t_i}^T \sigma(t_i, X_{t_i})\,\Delta W_i \tag{12}$$

and solve for $Y_{t_i}$.

For a differential rates setup in a risk-neutral measure, the generator function $f$ in the BSDE is

$$f(t, X_t, Y_t, \Pi_t) = -r_l(t)\,Y_t + \big(r_b(t) - r_l(t)\big)\Big(\sum_{i=1}^n \pi_i(t) - Y_t\Big)^+. \tag{13}$$

This driver expresses that all assets $X_i(t)$ and positive cash balances grow at a risk-neutral rate $r_l(t)$ unless the cash position $Y_t - \sum_{i=1}^n \pi_i(t)$ is negative, in which case the negative cash balance grows at a rate $r_b(t)$ corresponding to the higher borrowing rate as compared to the lower or equal lending rate.

There are two cases for equation (13):

1) If $\sum_{i=1}^n \pi_i(t) > Y_t$:

$$f(t, X_t, Y_t, \Pi_t) = -r_l(t)\,Y_t + \big(r_b(t) - r_l(t)\big)\Big(\sum_{i=1}^n \pi_i(t) - Y_t\Big). \tag{14}$$

Inserting this into equation (12) and solving, we obtain

$$Y_{t_i} = \frac{Y_{t_{i+1}} + \big(r_b(t_i) - r_l(t_i)\big)\Big(\sum_{j=1}^n \pi_j(t_i)\Big)\Delta t_i - \Pi_{t_i}^T \sigma(t_i, X_{t_i})\,\Delta W_i}{1 + r_b(t_i)\,\Delta t_i}. \tag{15}$$

2) If $\sum_{i=1}^n \pi_i(t) \le Y_t$:

$$f(t, X_t, Y_t, \Pi_t) = -r_l(t)\,Y_t. \tag{16}$$

Inserting this into equation (12) and solving, we obtain

$$Y_{t_i} = \frac{Y_{t_{i+1}} - \Pi_{t_i}^T \sigma(t_i, X_{t_i})\,\Delta W_i}{1 + r_l(t_i)\,\Delta t_i}. \tag{17}$$

However, we do not know $Y_{t_i}$ before solving the nonlinear equation (12) for it. From (15) and (17) and the conditions involving $Y_{t_i}$, we obtain that $Y_{t_i} < \sum_{j=1}^n \pi_j(t_i)$ is equivalent to

$$Y_{t_{i+1}} < \Big(\sum_{j=1}^n \pi_j(t_i)\Big)\big(1 + r_l(t_i)\,\Delta t_i\big) + \Pi_{t_i}^T \sigma(t_i, X_{t_i})\,\Delta W_i, \tag{18}$$

and the same for the relation with $\ge$. Indeed, multiplying (17) by $1 + r_l(t_i)\Delta t_i$ shows that $Y_{t_i} < \sum_j \pi_j(t_i)$ holds exactly when (18) does, and (15) leads to the same inequality. Thus, if (18) is satisfied, we use (15), otherwise (17).

For a first-order Taylor expansion, we have

$$f\big(t_i, X_{t_i}, Y_{t_i}, \Pi_{t_i}^T \sigma(t_i, X_{t_i})\big) \approx f\big(t_i, X_{t_i}, Y_{t_{i+1}}, \Pi_{t_i}^T \sigma(t_i, X_{t_i})\big) - \frac{\partial f}{\partial Y}\big(t_i, X_{t_i}, Y_{t_{i+1}}, \Pi_{t_i}^T \sigma(t_i, X_{t_i})\big)\,\big(Y_{t_{i+1}} - Y_{t_i}\big). \tag{19}$$

Inserting this into equation (12) and solving for $Y_{t_i}$, we have

$$Y_{t_i} = Y_{t_{i+1}} + \frac{f\big(t_i, X_{t_i}, Y_{t_{i+1}}, \Pi_{t_i}^T \sigma(t_i, X_{t_i})\big)\,\Delta t_i - \Pi_{t_i}^T \sigma(t_i, X_{t_i})\,\Delta W_i}{1 - \frac{\partial f}{\partial Y}\big(t_i, X_{t_i}, Y_{t_{i+1}}, \Pi_{t_i}^T \sigma(t_i, X_{t_i})\big)\,\Delta t_i}. \tag{20}$$

Note that $f$ and $\frac{\partial f}{\partial Y}$ are evaluated at $Y_{t_{i+1}}$.

With the same setup for the differential rates problem, it is clear that there are only two possible forms for $f$:

1) If $\sum_{j=1}^n \pi_j(t_i) > Y_{t_{i+1}}$:

$$f(t_i, X_{t_i}, Y_{t_{i+1}}, \Pi_{t_i}) = -r_l(t_i)\,Y_{t_{i+1}} + \big(r_b(t_i) - r_l(t_i)\big)\Big(\sum_{j=1}^n \pi_j(t_i) - Y_{t_{i+1}}\Big) \tag{21}$$

and

$$\frac{\partial f}{\partial Y} = -r_b(t_i). \tag{22}$$

Inserting this into equation (20), we obtain

$$Y_{t_i} = \frac{Y_{t_{i+1}} + \big(r_b(t_i) - r_l(t_i)\big)\Big(\sum_{j=1}^n \pi_j(t_i)\Big)\Delta t_i - \Pi_{t_i}^T \sigma(t_i, X_{t_i})\,\Delta W_i}{1 + r_b(t_i)\,\Delta t_i}. \tag{23}$$

2) If $\sum_{j=1}^n \pi_j(t_i) \le Y_{t_{i+1}}$:

$$f(t_i, X_{t_i}, Y_{t_{i+1}}, \Pi_{t_i}) = -r_l(t_i)\,Y_{t_{i+1}} \tag{24}$$

and

$$\frac{\partial f}{\partial Y} = -r_l(t_i). \tag{25}$$

Inserting this into equation (20), we have

$$Y_{t_i} = \frac{Y_{t_{i+1}} - \Pi_{t_i}^T \sigma(t_i, X_{t_i})\,\Delta W_i}{1 + r_l(t_i)\,\Delta t_i}. \tag{26}$$

Notice that (15) and (23) are the same and that (17) and (26) are the same. The only difference lies in the conditions under which they are applied.

As introduced in E, Han and Jentzen [EHJ17], with the forward time-stepped equations (10) and (11), one minimizes the loss function

$$E\big(\|Y_T - g(X_N)\|^2\big). \tag{27}$$

The initial portfolio value $Y_0$ is a parameter of the minimization problem, as are all the parameters of the DNN functions $\pi_i(t_i, X_{t_i})$ treated as functions of $X_{t_i}$ (that give the stochastic vector process $\Pi_t$ as value or size of the holdings of the risky underlying securities in the portfolio). Since $X_0$ is fixed, instead of learning a function $\pi_0(X_0)$, one learns a parameter $\pi_0$. Alternatively, one can learn a single function $\pi(t_i, X_{t_i})$ as a function of $t_i$ and $X_{t_i}$, which means that all the parts of the computational graph that represent the evaluation of $\pi(t, x)$ share the same DNN parameters. The minimization problem is then solved with standard deep learning approaches such as mini-batch stochastic gradient methods, using approaches such as Adam, pre-scaling and/or batch-normalization, etc.

For the case of random $X_0$, one also learns the initial value of $Y_0$ as a function $\mathrm{Yinit}(X_0)$ of $X_0$, using the same loss function. Han, Jentzen, and E [HJE18] mention this approach for random $X_0$ on page 8509. We are not aware of any publication presenting results or implementations of the random $X_0$ approach except in our own work.
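The backward steps derived above for the differential rates generator can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the argument `pi_sum` (standing for $\sum_j \pi_j(t_i)$), the argument `z_dw` (standing for $\Pi_{t_i}^T \sigma(t_i, X_{t_i}) \Delta W_i$), and all numerical values are hypothetical. The round trip at the end checks that the exact backward step inverts the forward Euler step (11).

```python
import numpy as np

def generator(y, pi_sum, r_l, r_b):
    """Differential rates generator f of equation (13)."""
    return -r_l * y + (r_b - r_l) * np.maximum(pi_sum - y, 0.0)

def forward_step(y, pi_sum, z_dw, dt, r_l, r_b):
    """Forward Euler step (11); z_dw stands for Pi^T sigma dW."""
    return y - generator(y, pi_sum, r_l, r_b) * dt + z_dw

def exact_backstep(y_next, pi_sum, z_dw, dt, r_l, r_b):
    """Exact solution of (12): (15) when borrowing, (17) when lending."""
    # Borrowing test, written directly in terms of the known Y_{t_{i+1}}.
    borrow = y_next < pi_sum * (1.0 + r_l * dt) + z_dw
    y_borrow = (y_next + (r_b - r_l) * pi_sum * dt - z_dw) / (1.0 + r_b * dt)
    y_lend = (y_next - z_dw) / (1.0 + r_l * dt)
    return np.where(borrow, y_borrow, y_lend)

def taylor_backstep(y_next, pi_sum, z_dw, dt, r_l, r_b):
    """First-order Taylor step (20), with f and df/dY evaluated at y_next."""
    f = generator(y_next, pi_sum, r_l, r_b)
    df_dy = np.where(pi_sum > y_next, -r_b, -r_l)  # equations (22) and (25)
    return y_next + (f * dt - z_dw) / (1.0 - df_dy * dt)

# Round trip with illustrative (assumed) inputs: forward step, then exact backward step.
rng = np.random.default_rng(0)
y = rng.uniform(0.0, 50.0, size=1000)
pi_sum = rng.uniform(-50.0, 100.0, size=1000)
z_dw = rng.normal(0.0, 2.0, size=1000)
y_next = forward_step(y, pi_sum, z_dw, 0.01, 0.03, 0.05)
recovered = exact_backstep(y_next, pi_sum, z_dw, 0.01, 0.03, 0.05)
print(np.max(np.abs(recovered - y)))  # round-trip error, at floating-point level
```

The Taylor variant uses the same two closed-form branches but selects between them by comparing `pi_sum` with `y_next` rather than with the unknown $Y_{t_i}$, which is the only place where the two steps can differ.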
In the backward approach, one time-steps equation (10) forward but time-steps equation (11) backward, starting from $Y_T = g(X_T)$. As discussed in the previous section, one can use an analytical solution of (12) or some Taylor expansion approach. Using either approach, one will obtain an expression or implementation

$$Y_{t_i} = \mathrm{ybackstep}(t_i, Y_{t_{i+1}}, X_{t_i}, \Pi_{t_i}, \Delta W_i). \tag{28}$$

For fixed $X_0$, the loss function will be

$$\mathrm{var}(Y_0). \tag{29}$$

For the mini-batch stochastic gradient step, the loss function will be the mini-batch variance

$$E\big(\|Y_0 - \bar{Y}_0\|^2\big), \tag{30}$$

where $\bar{Y}_0$ will be the mean over the mini-batch. For MC (Monte Carlo) estimates for $Y_0$, one can use the last mini-batch mean or one can compute the mean of $Y_0$ over a larger sample of paths (but fixing the trading strategy).

Instead of using the mini-batch mean in the loss function, one can learn $\bar{Y}_0$ as a parameter/variable (resulting in the same loss function but with a different meaning of $\bar{Y}_0 = \mathrm{Yinit}$).

Once $X_0$ is random, one can no longer use batch variance in a straightforward way. Instead (and inspired by the parameter version just discussed), one uses a loss function

$$E\big(\|Y_0 - \mathrm{Yinit}(X_0)\|^2\big), \tag{31}$$

where $\mathrm{Yinit}(X_0)$ is a function represented by a DNN which is learned as part of the DL approaches.

Similarly, one can introduce additional terms

$$E\big(\|Y_{t_i} - \mathrm{Ylearned}_i(X_{t_i})\|^2\big) \tag{32}$$

at some (or all) intermediate times $t_i$ to learn some approximations for the solution function $\mathrm{Ylearned}_i(X_{t_i}) = u(t_i, X_{t_i})$ (or one could learn the trading strategy and intermediate solution functions in stages in a roll-back fashion).

All the methods except the one using batch variance are novel, to the best of our knowledge.

For these different backward approaches, the first time step translates into the three different computational graphs shown in figures 1, 2, and 3. The general time step for all three methods is shown in figure 4, while the last time step is shown in figure 5.
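The pathwise roll-back and the loss functions (30) and (31) can be illustrated with the following NumPy sketch for a single asset. It is only a sketch under assumptions: the lognormal coefficients $\mu x$ and $\sigma x$, all parameter values, and the helper names are hypothetical, and the strategy is a fixed placeholder function where the method described here would use a trainable DNN.

```python
import numpy as np

def exact_backstep(y_next, pi, z_dw, dt, r_l, r_b):
    """One-asset exact backward step for differential rates, equations (15), (17), (18)."""
    borrow = y_next < pi * (1.0 + r_l * dt) + z_dw
    y_borrow = (y_next + (r_b - r_l) * pi * dt - z_dw) / (1.0 + r_b * dt)
    y_lend = (y_next - z_dw) / (1.0 + r_l * dt)
    return np.where(borrow, y_borrow, y_lend)

def simulate_paths(x0, mu, sigma, dt, n_steps, rng):
    """Forward Euler-Maruyama paths (10), assuming coefficients mu * x and sigma * x."""
    dw = rng.normal(0.0, np.sqrt(dt), size=(n_steps, x0.shape[0]))
    x = np.empty((n_steps + 1, x0.shape[0]))
    x[0] = x0
    for i in range(n_steps):
        x[i + 1] = x[i] * (1.0 + mu * dt + sigma * dw[i])
    return x, dw

def roll_back(x, dw, payoff, strategy, sigma, dt, r_l, r_b):
    """Start from Y_T = g(X_T) and apply the backward step (28) down to the initial time."""
    y = payoff(x[-1])
    for i in reversed(range(dw.shape[0])):
        pi = strategy(i * dt, x[i])  # value held in the risky asset (assumed convention)
        y = exact_backstep(y, pi, pi * sigma * dw[i], dt, r_l, r_b)
    return y

def batch_variance_loss(y0):
    """Mini-batch variance, equation (30)."""
    return np.mean((y0 - np.mean(y0)) ** 2)

def initial_network_loss(y0, x0, y_init):
    """Squared distance to a function of X_0, equation (31)."""
    return np.mean((y0 - y_init(x0)) ** 2)

# Illustrative run; every number below is an assumed value.
rng = np.random.default_rng(0)
x0 = np.full(512, 100.0)
x, dw = simulate_paths(x0, mu=0.05, sigma=0.2, dt=0.01, n_steps=100, rng=rng)
straddle = lambda s: np.abs(s - 100.0)
no_hedge = lambda t, s: np.zeros_like(s)  # placeholder strategy, not a trained DNN
y0 = roll_back(x, dw, straddle, no_hedge, 0.2, 0.01, 0.03, 0.05)
print(batch_variance_loss(y0))
```

With the placeholder strategy the rolled-back values still vary from path to path, so the batch variance is far from zero. Training the strategy aims to drive this variance down, and the batch mean of `y0` then serves as the MC estimate of $Y_0$.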
First, one would simulate $X$ forward in time. Then $Y_T$ is set to $g(X_T)$, and backward steps ybstep are taken, proceeding through intermediate steps, until one reaches back at the first time step. In the figures, gray boxes are given implementations/operations that do not change, pink boxes (circles) are networks (variables/parameters) to be trained, blue circles are random, and green circles are input parameters.

Note: There are many introductions to DL, DNN, and common forms of DNN. For a minimal one geared towards deep BSDE, see Hientzsch [Hie19].

We present results on two financial derivatives treated in the literature so that we can compare our results more easily with those of others. The two financial derivatives are a call combination (long a call on the maximum across assets with strike 120.0 and short two calls on the maximum with strike 150.0, with maturity 0.5 years) as in E, Han and Jentzen [EHJ17] and a straddle on the maximum (both long and short a straddle with strike 100.0 with maturity 1 year) as in Forsyth and Labahn [FL07].

For ease of visualization, testing, and presentation, we present results for the one-dimensional case (which is also the case treated in Forsyth and Labahn [FL07]).
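The two terminal payoffs $g(X_T)$ described above can be written down directly. The sketch below assumes that `x_T` is an array of shape (batch, dim) holding terminal asset values; the function names and the `long` flag are hypothetical, and the strikes are the ones quoted in the text.

```python
import numpy as np

def call_combination(x_T, k_long=120.0, k_short=150.0):
    """Long one call on the maximum (strike 120.0), short two calls on the maximum (strike 150.0)."""
    m = np.max(x_T, axis=-1)  # maximum across assets for each path
    return np.maximum(m - k_long, 0.0) - 2.0 * np.maximum(m - k_short, 0.0)

def straddle_on_max(x_T, strike=100.0, long=True):
    """Straddle on the maximum across assets; the short position has the negated payoff."""
    m = np.max(x_T, axis=-1)
    payoff = np.abs(m - strike)
    return payoff if long else -payoff

# One-dimensional example with three terminal values.
x_T = np.array([[130.0], [160.0], [100.0]])
print(call_combination(x_T))   # [10. 20.  0.]
print(straddle_on_max(x_T))    # [30. 60.  0.]
```

Pricing the short position with the negated payoff and flipping the sign of the result gives the second of the two prices in (9), which for nonlinear generators need not coincide with the first.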

For the E, Han, and Jentzen example, we picked $\sigma = 0.$[…], $\mu = 0.$[…], $r_l = 0.04$, and $r_b = 0.$[…]. […] For the fixed $X_0$ case, we picked $X_0 = 120.0$. For the random/varying $X_0$ case, we picked a uniform distribution in $[70,$ […]. […] $\mathrm{dim} + 10$ (dim is the dimension of the PDE we are trying to solve) neurons and ELU activation functions for the first two layers and then identity on the output layer.

We first show results for the fixed $X_0$ case for the call combination. Figure 6 shows how the loss function values behave over the number of mini-batches. Figures 7 and 8 show the trading strategy and the portfolio value $Y$. The batch variance method uses shared parameters for the risky portfolio vector functions $\pi(t, x)$ while the initial parameter method uses separate DNNs with separate parameters for different times.

We next show results for the random $X_0$ case for the call combination. Figure 9 shows the loss functions. Figure 10 shows the initial Yinit network and rollback and the Z network. Figure 11 shows the strategy DNN outputs and the path $Y$ values. This example uses separate DNNs.

Forsyth and Labahn picked the settings $\sigma = 0.$[…], $\mu = r_b = 0.05$, and $r_l = 0.03$ [FL07, Table 1 on page 28 in hjb.pdf]. We used 100 time steps (one of the numbers of time steps for which results are given in tables in Forsyth and Labahn [FL07]). The strike for the straddle is 100.0.

We first consider the fixed $X_0$ case. Like Forsyth and Labahn [FL07], we pick the initial spot $X_0$ to be 100.0.

We plot the $Y_0$ estimates or parameters for different backward and forward methods in figure 12 for the exact and figure 13 for the Taylor backward step, for both long and short straddles, together with a more detailed view. We see that the method that learns the […]

Figure 1: First time step in backward method when using MC mean for $Y_0$ for a fixed $X_0$
Figure 2: First time step in backward method when learning $Y_0$ for a fixed $X_0$
Figure 3: First time step in backward method when learning $Y_0$ as a network from a random $X_0$
Figure 4: General step in backward method
Figure 5: Step for last time step in backward method

[…] $X_0$ case, we pick $X_0$ uniformly within $[50,$ […] $X_0$ for different batch sizes, all for long and short positions. We can see that for increasing batch sizes, the agreement is improving. It can be seen that the results for the exact backward step and the Taylor backward step are very close.

Lastly, for the exact backward step for batch size 1024, we show RiskyPortfolio size (delta), RiskyPortfolio value (delta times stock price), cash position value for long and short straddle, and the regions with borrowing (red) and lending (black) in figures 17 and 18 respectively.

Note: [FL07, Figure 1] does not give the number of space or time steps used for the plot.
Figure 6: Loss function over 20000 mini-batches for different backward methods (fixed $X_0$) for the call combination example. Panels: (a) Batch variance method - exact; (b) Batch variance method - Taylor; (c) Method with $Y_0$ as a parameter - exact; (d) Method with $Y_0$ as a parameter - Taylor.

Figure 7: Strategy DNN (delta) outputs at 20000 mini-batches for different backward methods (fixed $X_0$) for the call combination example. Panels: (a) Batch variance - exact; (b) Batch variance - Taylor; (c) Method with $Y_0$ as a parameter - exact; (d) Method with $Y_0$ as a parameter - Taylor.

Figure 8: $Y$ path values at 20000 mini-batches for different backward methods (fixed $X_0$) for the call combination example. Panels: (a) Batch variance - exact; (b) Batch variance - Taylor; (c) Method with $Y_0$ as a parameter - exact; (d) Method with $Y_0$ as a parameter - Taylor.

Figure 9: Loss function over 20000 mini-batches for different backward methods (random $X_0$) for the call combination example. Panels: (a) Initial network - exact; (b) Initial network - Taylor.

Figure 10: $Y$ initial network, roll-back, and $\pi$ initial network at 20000 mini-batches for different backward methods (random $X_0$) for the call combination example. Panels: (a) $Y$ network vs roll-back - exact; (b) $Y$ network vs roll-back - Taylor; (c) $\pi$ network - exact; (d) $\pi$ network - Taylor.

Figure 11: Strategy deltas and $Y$ path values at 20000 mini-batches for different backward methods (random $X_0$) for the call combination example. Panels: (a) Strategy - exact; (b) Strategy - Taylor; (c) $Y$ path values - exact; (d) $Y$ path values - Taylor.

Figure 12: $Y_0$ estimates or parameters for the straddle case - over 20000 mini-batches and detail for 5000-1000 minibatches - exact backward step - batch size 512

Figure 13: $Y_0$ estimates or parameters for the straddle case - over 20000 mini-batches and detail for 5000-1000 minibatches - Taylor backward step - batch size 256

Method                                          Result
Results from Forsyth and Labahn - 101 nodes
  Fully Implicit HJB PDE (implicit control)     24.02047
  Crank-Nicolson HJB PDE (implicit control)     24.0512
  Fully Implicit HJB PDE (pwc policy)           24.01163
  Crank-Nicolson HJB PDE (pwc policy)           24.0652
Forward deep BSDE - 20000 batches, size 256
  Learned $y$ […]

Figure 14: $\mathrm{Yinit}(X_0)$ for various backward methods with exact backward step plotted over Forsyth and Labahn curves. Panels: (a) Batch size 256; (b) Batch size 512; (c) Batch size 1024.

Figure 15: $\mathrm{Yinit}(X_0)$ for various backward methods with Taylor backward step plotted over Forsyth and Labahn curves. Panels: (a) Batch size 256; (b) Batch size 512; (c) Batch size 1024.

Figure 16: $\mathrm{Yinit}(X_0)$ for various forward methods plotted over Forsyth and Labahn curves. Panels: (a) Batch size 256; (b) Batch size 512; (c) Batch size 1024.

Figure 17: Risky portfolio size, risky portfolio value, cash position value, and locations where strategy borrows (red) or lends (black) for random $X_0$, exact backward step, batch size 1024, long position/upper price. Panels: (a) Risky portfolio size (delta); (b) Risky portfolio value (delta times stock price); (c) Cash position value; (d) Where strategy borrows/lends.

Figure 18: Risky portfolio size, risky portfolio value, cash position value, and locations where strategy borrows (red) or lends (black) for random $X_0$, exact backward step, batch size 1024, short position/lower price. Panels: (a) Risky portfolio size (delta); (b) Risky portfolio value (delta times stock price); (c) Cash position value; (d) Where strategy borrows/lends.

Acknowledgements

The authors would like to thank Orcan Ogetbil, Daniel Weingard and Xin Wang for proofreading the draft and giving helpful feedback; Vijayan Nair for discussion regarding methods, presentation, and results and for reviewing the paper; and Agus Sudjianto for supporting this research.

In this paper, we first introduced FBSDE for general nonlinear problems, with particular details for the differential rates problem, time-discretized them, and then derived exact and Taylor approximations for the backward step. We then quickly described the forward and backward deep BSDE approaches that we consider - both the batch-variance variant already described in the literature and the novel initial variable and network versions, the last one for random $X_0$. Then we applied these methods to the differential rates problem for the call combination case from Han, Jentzen and E [HJE18] and for the straddle case from Forsyth and Labahn [FL07]. We compared the results for a case with fixed $X_0$ and for a case with varying $X_0$ with the results from Forsyth and Labahn [FL07] and saw that they agree well. We also compared methods for the exact backward step and the Taylor backward step; in the straddle and call combination examples that we ran, the results seem to be very close. We also visualized some of the results to show what they mean in terms of trading strategy and borrowing and lending.

The deep BSDE methods described in this paper use a very different approach from the PDE methods by Forsyth and Labahn [FL07], but they give results very close to those published there. That makes us confident that these methods can be used to generically and efficiently approximate solutions to such nonlinear pricing problems, even with relatively small batch sizes such as 512 or 1024.

Any opinions, findings and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of Wells Fargo Bank, N.A., its parent company, affiliates and subsidiaries.

References

[EHJ17] Weinan E, Jiequn Han, and Arnulf Jentzen. Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations. Communications in Mathematics and Statistics, 5(4):349-380, 2017. arXiv:1706.04702.

[FL07] Peter A Forsyth and George Labahn. Numerical methods for controlled Hamilton-Jacobi-Bellman PDEs in finance. Journal of Computational Finance, 11(2):1-44, 2007. Preprint version available online for instance at: https://cs.uwaterloo.ca/~paforsyt/hjb.pdf.

[GYH20] Narayan Ganesan, Yajie Yu, and Bernhard Hientzsch. Pricing barrier options with deepBSDEs. arXiv preprint arXiv:2005.10966, May 2020.

[Hie19] Bernhard Hientzsch. Introduction to solving quant finance problems with time-stepped FBSDE and deep learning. arXiv preprint arXiv:1911.12231, Nov 2019. Also available at SSRN: https://ssrn.com/abstract=3494359 or http://dx.doi.org/10.2139/ssrn.3494359.

[HJE18] Jiequn Han, Arnulf Jentzen, and Weinan E. Solving high-dimensional partial differential equations using deep learning. Proceedings of the National Academy of Sciences, 115(34):8505-8510, 2018.

[HPW19] Côme Huré, Huyên Pham, and Xavier Warin. Some machine learning schemes for high-dimensional nonlinear PDEs. arXiv preprint arXiv:1902.01599, 2019.

[LXL19] Jian Liang, Zhe Xu, and Peter Li. Deep learning-based least square forward-backward stochastic differential equation solver for high-dimensional derivative pricing. arXiv preprint arXiv:1907.10578, 2019. Also available at SSRN: https://ssrn.com/abstract=3381794 or http://dx.doi.org/10.2139/ssrn.3381794.

[Mer15] Fabio Mercurio. Bergman, Piterbarg, and beyond: pricing derivatives under collateralization and differential rates. In Actuarial Sciences and Quantitative Finance, pages 65-95. Springer, 2015. Also available at SSRN: https://ssrn.com/abstract=2326581 or http://dx.doi.org/10.2139/ssrn.2326581.

[War18] Xavier Warin. Nesting Monte Carlo for high-dimensional non linear PDEs. arXiv preprint arXiv:1804.08432, 2018.

[WCS+18] Haojie Wang, Han Chen, Agus Sudjianto, Richard Liu, and Qi Shen. Deep learning-based BSDE solver for LIBOR market model with application to Bermudan swaption pricing and hedging. arXiv preprint arXiv:1807.06622, 2018.