Pricing Barrier Options with DeepBSDEs
Narayan Ganesan, Yajie Yu, and Bernhard Hientzsch
Corporate Model Risk, Wells Fargo
Abstract
This paper presents a novel and direct approach to price boundary and final-value problems, corresponding to barrier options, using forward deep learning to solve forward-backward stochastic differential equations (FBSDEs). Barrier instruments are instruments that expire or transform into another instrument if a barrier condition is satisfied before maturity; otherwise they perform like the instrument without the barrier condition. In the PDE formulation, this corresponds to adding boundary conditions to the final value problem. The deep BSDE methods developed so far have not addressed barrier/boundary conditions directly. We extend the forward deep BSDE to the barrier condition case by adding nodes to the computational graph to explicitly monitor the barrier conditions for each realization of the dynamics, as well as nodes that preserve the time, state variables, and trading strategy value at barrier breach or at maturity otherwise. Given these additional nodes in the computational graph, the forward loss function quantifies the replication of the barrier or final payoff according to a chosen risk measure such as squared sum of differences. The proposed method can handle any barrier condition in the FBSDE set-up and any Dirichlet boundary conditions in the PDE set-up, both in low and high dimensions.
arXiv [q-fin.CP], May

Deep learning and deep neural networks have been applied to numerically solve high-dimensional nonlinear PDEs via the use of Forward-Backward Stochastic Differential Equations or FBSDEs (see [HJE18]). In particular, [HJE18] applied the method to a quantitative finance pricing problem: pricing a combination of two call options under differential rates (different lending and borrowing interest rates), a nonlinear problem that was also studied in [Mer15]. In related work, [CWNMW19] and [Rai18] also showed the applicability of deep learning in solving FBSDEs. The overview paper of deep BSDE methods [Hie19] introduces how FBSDE and certain deep learning methods can be used to solve quantitative finance problems beyond options pricing, extending to general contingent claims, hedging, and initial margin problems.

This paper presents a novel approach to price boundary and final value problems (corresponding to barrier options) with forward deep learning approaches (“forward deep BSDE”). The approach adds nodes to the computational graph of the deep neural network to explicitly monitor the barrier conditions for each realization of the dynamics. In case of barrier breach, the nodes record the time and underlying state variables; otherwise they record the final time and value at maturity, which is used to determine the final payoff. The forward loss function then quantifies the replication of the barrier or final payoff according to a chosen risk measure such as squared sum of differences.

The simplest forms of barrier options are single-underlier or basket knock-in or knock-out options in which the barrier condition only involves the current value of the underlier or the basket and compares it to a given barrier level, which is often a given constant. Let $S_t$ be that underlier or the value of the basket.
Examples shown in Table 1 include the variants of simple barrier calls in this setting, for strike price $K$, time to maturity $T$, barrier positions $U_t$ and $L_t$ (possibly time dependent), and the vanilla call price $V(S_t, K, T-t)$. Put versions of these examples are obtained by replacing calls and call prices by puts and put prices everywhere.

| Name | Barrier Condition | Knocked-in Instrument Rebate | Knocked-in Instrument Value | Final Payoff if not breached |
| Up-and-Out Call | $S_t \ge U_t$ (Upper Barrier Position) | $G_t$ | | $\max(S_T - K, 0)$ |
| Down-and-Out Call | $S_t \le L_t$ (Lower Barrier Position) | $G_t$ | | $\max(S_T - K, 0)$ |
| Up-and-In Call | $S_t \ge U_t$ (Upper Barrier Position) | 0 | $V(S_t, K, T-t)$ | 0 |
| Down-and-In Call | $S_t \le L_t$ (Lower Barrier Position) | 0 | $V(S_t, K, T-t)$ | 0 |

Table 1: Summary of basic Barrier Call instruments

The simplest barrier options are those with a barrier at some constant level which is active for the entire life of the instrument. Two standard examples: A standard up-and-out barrier call option with upper barrier $B$ will pay the final call payoff unless the underlier $S$ of the option was observed at a level $S \ge B$ during the life of the option, in which case it will pay a rebate $G$. A standard up-and-in barrier call option with upper barrier $B$ will pay the final call payoff only if the underlier $S$ of the option was observed at a level $S \ge B$ during the life of the option and otherwise will pay a rebate $G$.

Given a knocked-in instrument, a model, and a valuation approach applied to that model, the knocked-in instrument can be replaced by an immediate payment of the value of the knocked-in instrument according to the model. In this way, one can restrict oneself to the case of knock-out instruments with rebates.

There are many instruments with barrier features in quantitative finance. There are also many other applications in natural sciences, engineering, and economics that involve bounded domains and boundaries in stochastic analysis and stochastic processes.
Similarly, PDE models in natural sciences and engineering are often posed in bounded domains with boundary conditions imposed at the boundary. The extensions presented in this paper to the forward deep BSDE method allow this new methodology to be applied to all these situations and applications.

The rest of the paper is organized as follows: Section 2 presents a brief overview of Forward-Backward Stochastic Differential Equations as applied to options pricing. Section 3 discusses the technique behind the DeepBSDE approach to solving the traditional option pricing problem and describes the approach to problems with barriers. Section 4 discusses other approaches that are currently used or proposed for pricing barrier options and outlines their limitations. Section 5 describes our approach, which employs barrier tracking variables as a part of the DeepBSDE network. Section 6 presents the results of pricing different types of barrier options using this approach, together with error convergence. The paper concludes with some remarks.

For a general introduction to FBSDE and their relations to nonlinear parabolic PDE and/or quantitative finance, see [EKPQ97] and [Per10]. [HJE18] presented the first application of the DeepBSDE method to the pricing of European options under differential rates. [Hie19] provides an overview of the types of problems in quantitative finance that can be solved using deepBSDE and FBSDE. In this paper, we will discuss only specific aspects of PDEs and FBSDEs relevant to our setting.

We will first discuss how a semilinear parabolic PDE can give rise to a BSDE. A semilinear parabolic PDE is a PDE of the form
\[ \frac{\partial u}{\partial t}(t,x) + Lu(t,x) + f\big(t, x, u(t,x), a^T(t,x)\nabla_x u(t,x)\big) = 0 \tag{1} \]
\[ Lu(t,x) := \frac{1}{2}\,\mathrm{Tr}\big(a(t,x)\,a^T(t,x)\,\nabla_x^2 u(t,x)\big) + \nabla_x u(t,x)\, b(t,x) \]
defined on a domain $D \subset \mathbb{R}^d$ and $x \in D$. In quantitative finance and other applications, these PDE are often posed with terminal conditions, $u(T,x) = g(T,x)$ for a given function $g$.
However, the case where boundary/barrier conditions
\[ u(t,x) = g_B(t,x) \quad \text{for } (t,x) \in B \tag{2} \]
are imposed is also of interest. Final value conditions can be included so that $g_B(T,x) = g(T,x)$ with the previously defined $g$. We assume that this is done.

Consider an Itô process $X$ given by
\[ dX_t = b(t, X_t)\,dt + a(t, X_t)\,dW_t \tag{3} \]
where $X_t \in \mathbb{R}^d$, $dW_t \in \mathbb{R}^d$, $b(t,X) \in \mathbb{R}^d$, and $a(t,X) \in \mathbb{R}^{d \times d}$, with uncorrelated $dW$ and correlations expressed through $a(t,X)$. For every $(t,x)$ we will consider a version of $X$ that starts (or arrives) at time $t$ in $x$ and call it $X^{t,x}$. Let $Y_t$ be $u(t, X_t)$. Then $Y_t$ satisfies the Backward Stochastic Differential Equation
\[ dY_t = -f\big(t, X_t, Y_t, a^T(t,X_t)\nabla u(t,X_t)\big)\,dt + \big(\nabla u(t,X_t)\big)^T a(t,X_t)\,dW_t \tag{4} \]
along with the equation for $X_t$, satisfying (3). A terminal condition $u(T,x) = g(T,x)$ for the PDE translates into a terminal condition for the BSDE, $Y_T = g(T, X_T)$.

With $\pi_t = \nabla_x u(t, X_t)$, the BSDE (4) can be written
\[ -dY_t = f\big(t, X_t, Y_t, a^T(t,X_t)\pi_t\big)\,dt - \pi_t^T a(t,X_t)\,dW_t \tag{5} \]
Equations (3) and (5) together describe the dynamics of the solution $u(t,x)$ of PDE (1) within the framework of FBSDE. So, instead of solving the PDE (1) for $u$, we can solve the FBSDE (3) and (5) by finding stochastic processes $Y_t$ and $\pi_t$. Instead of finding a process $\pi_t$ we can also try to find a function $\pi(t,x)$ (or different functions for different $t$) such that $\pi_t = \pi(t, X_t)$, as inspired by the expression for $\pi_t$ given above.

Notice that the FBSDE as written in (3) and (5) can be more general than the original PDE. For instance, the terminal condition could be a random variable rather than a function, depending on the path of $X_t$ rather than just the final value.
Then, solution functions $u$ and portfolio functions $\pi$ would also have more general forms and more arguments and might also depend on additional processes that capture the path-dependency of the terminal condition.

The above relation between PDE and FBSDE can be illustrated explicitly, for instance in the case of risk-neutral valuation of simple derivatives which obey the Black-Scholes-Merton equation
\[ \frac{\partial u}{\partial t} + r(t,x)\,x\,\frac{\partial u}{\partial x} + \frac{\sigma^2(t,x)}{2}\,x^2\,\frac{\partial^2 u}{\partial x^2} - r(t,x)\,u(t,x) = 0 \tag{6} \]
with the underlying $X$ driven by an Itô process in the risk-neutral measure [Shr04], under which $X$ follows
\[ dX = r(t,x)\,X\,dt + \sigma(t,x)\,X\,dW, \tag{7} \]
where $r(t,x)$ is the risk-free rate and $\sigma(t,x)$ is the volatility of the underlier. Under this setup, for any function $y(t,x)$, by Itô's lemma,
\[ dy(t,x) = \left(\frac{\partial y}{\partial t} + r(t,x)\,x\,\frac{\partial y}{\partial x} + \frac{\sigma^2(t,x)}{2}\,x^2\,\frac{\partial^2 y}{\partial x^2}\right)dt + \sigma(t,x)\,x\,\frac{\partial y}{\partial x}\,dW \tag{8} \]
If $y(t,x)$ is to describe the behavior of the value of the derivative $u(t,x)$, it has to obey equation (6), along with the terminal condition $y_T = g(T, X_T)$. Under this assumption, the term in parentheses on the right side of equation (8) must obey
\[ \frac{\partial y}{\partial t} + r(t,x)\,x\,\frac{\partial y}{\partial x} + \frac{\sigma^2(t,x)}{2}\,x^2\,\frac{\partial^2 y}{\partial x^2} = r(t,x)\,y(t,x) \tag{9} \]
Substituting (9) into equation (8) leads to the update equation for $y(t,x)$:
\[ dy(t,x) = r(t,x)\,y(t,x)\,dt + \sigma(t,x)\,x\,\frac{\partial y}{\partial x}\,dW \tag{10} \]
Now denoting $Y_t = y(t, X_t)$, $\pi_t = \frac{\partial y}{\partial x}(t, X_t)$, $a(t, X_t) = \sigma(t, X_t) X_t$, and $f(t, X_t, Y_t, Z_t) = -r(t, X_t) Y_t$, this update equation corresponds to (5).

To include the barrier condition in the FBSDE formulation, the process $Y_t$ that we are trying to determine will follow the FBSDE (3) and (5) when $X_t$ is outside of $B$, while the value of the process $Y_t$ is directly given by $g_B(t, X_t)$ inside of $B$.

There are many additional variants, such as multiple barrier conditions, barrier conditions only active at certain discrete times or in certain time intervals, etc.
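As a rough numerical check of the update equation (10), the sketch below steps $X$ by an Euler discretization of (7) and $Y$ by the corresponding discretization of (10), using the closed-form Black-Scholes call delta in place of $\pi_t$. Everything specific here is an assumption made for illustration: constant $r$ and $\sigma$, the parameter values, NumPy instead of a deep learning framework, and the hypothetical helper names `bs_call_price_delta` and `replicate_call`. If (10) holds, $Y_T$ should be close to the call payoff, up to time-discretization error.

```python
import math
import numpy as np

def bs_call_price_delta(x, k, r, sigma, tau):
    """Closed-form Black-Scholes call price and delta for time to maturity tau > 0."""
    d1 = (np.log(x / k) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    erf = np.vectorize(math.erf)
    cdf_d1 = 0.5 * (1.0 + erf(d1 / math.sqrt(2.0)))
    cdf_d2 = 0.5 * (1.0 + erf(d2 / math.sqrt(2.0)))
    price = x * cdf_d1 - k * math.exp(-r * tau) * cdf_d2
    return price, cdf_d1

def replicate_call(x0=100.0, k=100.0, r=0.05, sigma=0.2, T=0.5,
                   n_steps=200, n_paths=2000, seed=0):
    """Return the replication error Y_T - max(X_T - K, 0) for each path."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.full(n_paths, x0)
    y, _ = bs_call_price_delta(x, k, r, sigma, T)      # Y_0 = u(0, X_0)
    for i in range(n_steps):
        tau = T - i * dt                               # remaining time to maturity
        _, delta = bs_call_price_delta(x, k, r, sigma, tau)  # stands in for pi_t
        dw = rng.normal(0.0, math.sqrt(dt), n_paths)
        y = y + r * y * dt + sigma * x * delta * dw    # discretized eq. (10)
        x = x + r * x * dt + sigma * x * dw            # discretized eq. (7)
    return y - np.maximum(x - k, 0.0)

errors = replicate_call()
print("mean error:", errors.mean(), "mean abs error:", np.abs(errors).mean())
```

The residual error comes from hedging at discrete times only; it should shrink as `n_steps` grows.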
[Footnote: We will not discuss the behavior of $u$ or $Y$ close to the boundary; we will just assume that boundary conditions are consistent and the appropriate conditions are satisfied so that $u$ and/or $Y$ will satisfy those boundary conditions in the appropriate sense.]

At a more general level, a FBSDE problem with barrier conditions can be stated as follows: Define a condition $C_t$ (e.g., the barrier breach) as a random variable with values true and false which is measurable (computable) given the information about the dynamics $X_s$, $s \le t$. At the time the condition turns true, it is ‘stopped’ and will stay true for all subsequent times.

i. When the condition $C_t$ is false, $X$ will follow (3) and $Y$ will follow the BSDE (5).
ii. If $C_T$ is false, $Y_T$ is equal to or approximates a given final value $G_T$ (in general, a random variable measurable at time $T$).
iii. If $C_t$ is true, $Y_t$ is equal to or approximates a given barrier value $G_t$ (in general, a random variable measurable as of time $t$).
iv. For the up-and-out barrier mentioned above for a single underlier, $S_t = X_t$, the condition $C_t$ is whether $S_t \ge U_t$ is true or false (which gives the barrier region as the domain where $S_t \ge U_t$ and the boundary as $S_t = U_t$), $G_t$ is zero, and $G_T$ is the standard call payoff.

In the case that the final value $G_T$ is given as a function of $X_T$, and barrier conditions $C_t$ and values $G_t$ are given as functions of time $t$ and the ‘state variables’ $X_t$ at that time, this time-continuous FBSDE problem, under appropriate conditions, is equivalent to a PDE boundary/final value problem where the final values are given by that function $g$, the boundary values are given by that function $g_B$, and the boundary region is defined by the (boundary of the) domain in which $C(t, X_t)$ is true.

For the up-and-out barrier mentioned previously, these conditions are satisfied and one obtains the same PDE as before, only with a zero boundary condition enforced on $S_t = U_t$.
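The ‘stopped’ condition and items i. to iv. above can be illustrated on a time grid with a short sketch. It assumes an up-and-out call with zero barrier value, a constant upper barrier that is checked at every grid time including $T$, and pre-simulated paths; the function name `stopped_targets` and the NumPy formulation are hypothetical choices for illustration, not part of the method as described.

```python
import numpy as np

def stopped_targets(paths, times, upper, strike):
    """Evaluate a stopped barrier condition on a grid.

    paths: array (n_paths, n_times) of underlier values; times: array (n_times,).
    Returns the stopped condition per time, the stopping time and state,
    and the target value that Y should match at the stopping time.
    """
    cond = paths >= upper                                # C_t evaluated pointwise
    stopped = np.logical_or.accumulate(cond, axis=1)     # once true, stays true
    breached = stopped[:, -1]
    last = paths.shape[1] - 1
    # index of the first breach, or of maturity if the barrier was never hit
    first_idx = np.where(breached, np.argmax(cond, axis=1), last)
    stop_time = times[first_idx]
    stop_state = paths[np.arange(paths.shape[0]), first_idx]
    call_payoff = np.maximum(paths[:, -1] - strike, 0.0)  # G_T
    target = np.where(breached, 0.0, call_payoff)         # G_t = 0 on breach
    return stopped, stop_time, stop_state, target

paths = np.array([[100.0, 120.0, 160.0, 140.0, 130.0],
                  [100.0, 110.0, 120.0, 130.0, 140.0]])
times = np.linspace(0.0, 0.5, 5)
stopped, stop_time, stop_state, target = stopped_targets(paths, times, 150.0, 100.0)
print(stopped[:, -1], stop_time, stop_state, target)
```

In this toy input the first path breaches the barrier at the third grid time and stays stopped afterwards, while the second path runs to maturity and receives the call payoff.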
For any general FBSDE problem as in Equations (3) and (5), applying a simple Euler-Maruyama discretization for both $X_t$ and $Y_t$, we obtain
\[ X_{t_{i+1}} = X_{t_i} + b(t_i, X_{t_i})\,\Delta t_i + a(t_i, X_{t_i})\,\Delta W_i \tag{11} \]
\[ Y_{t_{i+1}} = Y_{t_i} - f\big(t_i, X_{t_i}, Y_{t_i}, a^T(t_i, X_{t_i})\pi_{t_i}\big)\,\Delta t_i + \pi_{t_i}^T a(t_i, X_{t_i})\,\Delta W_i \tag{12} \]
This can be used to time-step both $X_t$ and $Y_t$ forward.

We will first discuss the application to European option pricing (see [HJE18]). The FBSDE is first discretized in time as in (11) and (12). The portfolio process $\pi_t$ is represented by functions $p_n(X_{t_n}) = \pi(t_n, X_{t_n})$, one for each time $t_n$, and each function $p_n(X_{t_n})$ is given as a deep neural network. The initial value of the portfolio $Y_0$ at the fixed $X_0$ and the initial portfolio composition $p_0$ are given as constants (parameters to be optimized).

To generate a realization/training sample, one first generates a realization of $X_{t_n}$ by (11), and then, given the current parameter values for $Y_0$ and all the $\pi_{t_n} = p_n$ networks, one computes $Y_{t_n}$ and finally $Y_T = Y_{t_N}$ step by step by (12). The $L^2$ norm of the replication error $Y_{t_N} - g(t_N, X_{t_N})$ was used as a loss function.

Instead of using a constant $X_0$, one can start with a randomly generated $X_0$. Under those circumstances, one determines the parameters of the network representing the function $Y_0(X_0)$ rather than a single value $Y_0$ (and also a $\pi_0 = p_0$ network and its parameters). We have also implemented that method. [HJE18] mention the forward approach for random $X_0$ as a possibility on page 8509, but to the best of our knowledge there is no implementation of this method besides our own. One can also use other risk measures defined from the replication error in the optimization.

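The forward pass through (11) and (12) can be sketched generically in $d$ dimensions. The sketch below is illustrative only: `forward_pass` is a hypothetical helper, plain Python callables stand in for the $\pi$ networks and the initial value, and NumPy is used instead of a deep learning framework, so nothing is trained here. In a training setting, the returned $Y_T$ would be compared with $g(T, X_T)$ in the loss.

```python
import numpy as np

def forward_pass(x0, y0, b, a, f, pi, T, n_steps, rng):
    """Step X and Y forward with the Euler-Maruyama scheme (11)-(12).

    x0: (n_paths, d); y0: (n_paths,)
    b(t, x) -> (n_paths, d); a(t, x) -> (n_paths, d, d)
    f(t, x, y, z) -> (n_paths,); pi(i, t, x) -> (n_paths, d)
    """
    dt = T / n_steps
    x, y = x0.copy(), y0.copy()
    for i in range(n_steps):
        t = i * dt
        dw = rng.normal(0.0, np.sqrt(dt), size=x.shape)
        a_t = a(t, x)
        p = pi(i, t, x)
        z = np.einsum("nij,ni->nj", a_t, p)        # a^T pi
        a_dw = np.einsum("nij,nj->ni", a_t, dw)    # a dW
        y = y - f(t, x, y, z) * dt + np.sum(p * a_dw, axis=1)  # eq. (12)
        x = x + b(t, x) * dt + a_dw                             # eq. (11)
    return x, y

# Illustrative use: two uncorrelated assets with proportional volatility,
# discounting-only generator, and a zero portfolio as placeholder for the pi networks.
rng = np.random.default_rng(0)
n_paths, d, r = 4, 2, 0.05
x_T, y_T = forward_pass(
    x0=np.full((n_paths, d), 100.0),
    y0=np.ones(n_paths),
    b=lambda t, x: r * x,
    a=lambda t, x: 0.2 * x[:, :, None] * np.eye(d),
    f=lambda t, x, y, z: -r * y,
    pi=lambda i, t, x: np.zeros_like(x),
    T=0.5, n_steps=50, rng=rng)
print(x_T.shape, y_T)
```

With a zero portfolio the $Y$ update reduces to compounding at the rate $r$, which gives a simple sanity check of the time stepping.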
Finally, the loss function is optimized with stochastic optimization methods such as mini-batch stochastic gradient descent combined with Adam optimizers or other appropriate deep learning methods to determine the parameters of the $p_n$ and $Y_0$ networks.

Figure 1 shows the computational graph for the forward method for European options with random $X_0$.

[Figure 1 (diagram): a $Y_0$ network and one $\pi$ network per time step; nodes for $X$, $dW$, and $Y$ at each time step feed a Payoff/Loss Function node.]
Figure 1: DeepBSDE network

As seen earlier, in the time-continuous case, $\pi_t = \nabla_x u(t, X_t)$, which in financial terms corresponds to delta-hedging according to some value function $u$. In the case that the FBSDE, time-continuous or not, is expressed as a minimization, the $\pi_t$ determined during the optimization does not necessarily have to be equal to $\nabla_x u(t, X_t)$, although it often approximates it to a certain extent. In the time-discrete case, the analytical solution is no longer guaranteed to minimize the replication error, but it provides a good benchmark and reflects what one would achieve by delta-hedging with that value function. Later, in section 6.1, we will compare the hedging/replication strategy given by the optimized DNN with the one implied by the analytical solution, for the barrier case.

We now present the fundamental idea for the barrier case and will leave the implementation details to a later section. The fundamental idea to extend the forward method to the barrier case is that the barrier breach time and place (together with maturity) takes the place of maturity as the first time and place at which the value of $Y_t$ is known, and any approximation of $Y$ should try to approximate that first known value well, regardless of whether it was specified in the problem as a final value or a barrier condition value.

The first time at which the dynamics $X_t$ enters the barrier domain $B$ or satisfies the barrier condition $C_t$ is called the barrier touch/breach time. With the version $X^{t,x}$ of $X$ started in $x$ at time $t$, we denote by $\tau^{t,x}$ the first time when the barrier is breached (when $C_t$ is first true). As previously mentioned, we add the set $t = T$ to the barrier domain, set the barrier value function to the final value function at maturity, and denote the resulting stopping time as $\tau_T^{t,x}$ (since both barrier domain and stopping time now depend on $T$). Note that $\tau_T^{t,x}$ is the minimum of $\tau^{t,x}$ and $T$.

Thus, the loss function to be minimized, according to some norm/risk measure (for instance, the $L^2$ norm), is
\[ Y_{\tau_T^{t,x}} - g_B\Big(\tau_T^{t,x}, X_{\tau_T^{t,x}}\Big) \tag{13} \]

We briefly review some of the techniques used in practice to price such barrier options.
PDE Based Approaches:
Options pricing via the solution of PDE with finite differences, finite elements, or other standard approaches in high dimensions (many underlying assets) poses difficulties due to the extreme storage and computational requirements to compute and store the values on a grid that covers the domain. Accuracy and stability requirements often require larger grids than what is achievable with given resources. In traditional PDE schemes, the PDE grid is generated with sufficiently small spacing for finite difference methods in the region of interest, which includes the final time $T$ and the barrier positions. The finite difference time-stepping proceeds from the final time $T$ along the time axis, spanning a domain bounded on one or both sides by barriers and/or other appropriate boundary conditions. The time-stepping scheme is then applied to determine the values at grid points going backward in time. PDE techniques are well studied and widely applied; however, their applicability is limited by the storage and computational requirements caused by dimensionality, so they are best suited to pricing derivatives on a small number of underliers. On the other hand, PDE techniques can model at least some nonlinear pricing.

Monte-Carlo Based Approaches:
Barrier options in high dimensions are often priced by Monte-Carlo approaches. This involves generating multiple independent sample paths of the underliers in order to compute the realized payoff at maturity or barrier breach along each path, discounting it to the present time, and averaging the discounted payoffs to determine the value of the option. While generating sample paths in high dimensions certainly increases in difficulty and resource requirements with the dimension, these requirements do not depend exponentially on the number of dimensions, unlike the grid size in PDE methods. In the case when there are rare events with high impact on the price (such as a knock-out condition on a short or far out barrier for an otherwise high payoff), simulation-based approaches such as Monte-Carlo need to sample those rare events sufficiently well to approximate the price of the instrument well. One of the main limitations of standard Monte-Carlo based approaches is that they can only take into account risk-neutral discounting or other linear pricing approaches and cannot be used to model nonlinear pricing such as differential rates, in which borrowing and lending attract different interest rates [Hie19].
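The Monte-Carlo procedure described above can be sketched for a single-underlier up-and-out call. The assumptions are all illustrative: geometric Brownian motion with constant $r$ and $\sigma$, zero rebate, a barrier checked only at the grid times (discrete monitoring), and the hypothetical function name `mc_up_and_out_call`.

```python
import numpy as np

def mc_up_and_out_call(x0, k, barrier, r, sigma, T, n_steps, n_paths, seed=0):
    """Monte-Carlo price and standard error of a discretely monitored
    up-and-out call with zero rebate under geometric Brownian motion."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    z = rng.standard_normal((n_paths, n_steps))
    # exact log-normal increments for constant r and sigma
    log_incr = (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    paths = x0 * np.exp(np.cumsum(log_incr, axis=1))
    alive = paths.max(axis=1) < barrier            # never at or above the barrier
    payoff = np.where(alive, np.maximum(paths[:, -1] - k, 0.0), 0.0)
    discounted = np.exp(-r * T) * payoff           # risk-neutral discounting
    return discounted.mean(), discounted.std(ddof=1) / np.sqrt(n_paths)

price, std_err = mc_up_and_out_call(100.0, 100.0, 150.0, 0.05, 0.2, 0.5,
                                    n_steps=100, n_paths=20000)
print("price:", price, "standard error:", std_err)
```

Because the payoff is simply averaged after discounting, this estimator is tied to linear (risk-neutral) pricing, which is the limitation noted in the paragraph above.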
DeepBSDE Based Approaches:
DeepBSDE based approaches as originally proposed in [HJE18] alleviate the problem of dimensionality by converting the high-dimensional PDEs into Backward Stochastic Differential Equations. The SDEs are then solved as a stochastic optimal control problem by approximating the co-state and initial value of the control problem using deep neural networks. In the present work we apply this technique to price barrier options.

Barrier options require the handling of the barrier boundary conditions, as they correspond to ‘knock-in’ and ‘knock-out’ events at any time before maturity, and such problems are not treated in [HJE18].

For risk-neutral barrier option pricing, the following expression holds:
\[ u(t,x) = E\left[ g_B\big(\tau_T^{t,x}, X^{t,x}_{\tau_T^{t,x}}\big)\, e^{-\int_t^{\tau_T^{t,x}} r(s, X_s^{t,x})\,ds} \right] \tag{14} \]
where the expectation is taken with respect to the risk-neutral measure [Shr04]. Under the risk-neutral measure, the value of $X$ at time $t$ is equal to the expectation of $X_T$ with $T > t$ discounted back to $t$ with the risk-free interest rate. Under this risk-neutral measure $X$ follows (7) with functions as defined in Section 2, $\tau_T^{t,x}$ denotes the earlier of barrier breach and maturity, and $g_B$ combines both final values and barrier values, as defined in Section 3. For knock-out options with zero rebate, $g_B(t,x)$ will be zero unless $t = T$. Therefore, (14) will in this case simplify to
\[ u(t,x) = E\left[ 1_{\tau^{t,x} \ge T}\, g(T, X_T^{t,x})\, e^{-\int_t^T r(s, X_s^{t,x})\,ds} \right] = E\left[ P\big(\tau^{t,x} \ge T\big)\, g(T, X_T^{t,x})\, e^{-\int_t^T r(s, X_s^{t,x})\,ds} \right] \tag{15} \]
This corresponds to an instrument with final payoff
\[ P\big(\tau^{t,x} \ge T\big)\, g(T, X_T^{t,x}), \tag{16} \]
which is measurable as of time $T$ knowing $X$ up to that time $T$. In the case that $P(\tau^{t,x} \ge T)$ can be written as a (relatively simple and explicit) function of $X_t = x$ and $X_T$, this will give a (different) final value problem for each $t$ and $x$.
Informally, the final payoff function is adapted via a Brownian bridge based method to include the probability of breaching the barrier before option maturity.

In recent work, [YXS19] have approached barrier option pricing as described by the above formula. The computation of the probability of breach $P(\tau^{t,x} \le T)$ as a function of $x$, $t$, $X_T$, and $T$ is analytically tractable for constant barriers and simple risk models (constant drift and volatility of the underlying assets), with the help of results from Brownian bridge probabilities. Once $x$ and $t$ are fixed, these are standard final value problems that can be solved with the deep BSDE methods for European options, as demonstrated in [YXS19].

A slight generalization can be derived along similar lines for the case in which touching the barrier does not lead to an immediate rebate but only to a changed final payoff, which we denote by $g_{Br}(T, x)$. Proceeding similarly to above, one obtains an expectation that can be interpreted as a European option on the final payoff
\[ P\big(\tau^{t,x} \ge T\big)\, g(T, X_T^{t,x}) + P\big(\tau^{t,x} < T\big)\, g_{Br}(T, X_T^{t,x}) \tag{17} \]
However, in general, time-varying barriers and stochastic risk models (stochastic interest rate and volatility of underliers) require Monte-Carlo based approaches to compute the probability. For instance, for a single barrier with different levels in multiple time periods, the probability computation can be extended as follows:

Consider an up-and-out barrier option with piecewise constant barriers with $M$ values within $0 \le t \le T$, defined by increasing times $t_0 = 0, t_1, \ldots, t_M = T$:
\[ B(t) = \{ B_i \ \text{for}\ t_i \le t < t_{i+1} \} \tag{18} \]
The total probability of no-breach can be computed as the product of the probabilities of no-breach within each time period, informally written as:
\[ P(\tau^{t,x} \ge T) = \mathrm{PNoBreach}\big(t_0, X(t_0); T, X(T)\big) = \prod_{i=0}^{M-1} \mathrm{PNoBreach}\big(t_i, X(t_i); t_{i+1}, X(t_{i+1})\big) \tag{19} \]
where each of the probability values can be computed via Brownian bridge for simple risk models. However, now one needs to integrate over and/or otherwise sample over the intermediate positions at intermediate times to obtain the barrier probabilities as a function of only the initial and final risk factor values. Some other popular varieties of barrier options, including multiple barriers (up and down barriers), interacting barriers, and barriers specified as a function of asset price (maximum drawdown), require more involved Monte-Carlo approaches to compute the probability. Monte-Carlo approaches in higher dimensions to compute these probabilities are computationally expensive and incur higher costs than pricing the option in higher dimensions as outlined above (and therefore will not lead to an efficient method to compute barrier option values through final value problems).

In this work, we propose an original approach using DeepBSDEs that explicitly monitors whether the underliers breach the barrier before maturity and records the state of the barrier option at every time step via nodes in the computational graph, which is then used to determine the appropriate payoff conditions. The technique proposed here is applicable to solving general semilinear parabolic PDEs or FBSDE in any discipline (in addition to quantitative finance) in which boundary conditions are specified. To the best of our knowledge, there is no prior work that handles boundary conditions explicitly in the context of DeepBSDEs, which is important in solving a variety of PDEs and FBSDE with standard and non-standard boundary conditions.
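For reference, the product in (19) can be sketched for the simplest risk model. The sketch assumes geometric Brownian motion with constant volatility and an upper barrier that is constant on each period, and it uses the standard Brownian bridge result for the log-price: given both endpoints below the barrier $B$, the probability of not touching $B$ over a period of length $\Delta t$ is $1 - \exp\big(-2 \ln(B/x_0)\ln(B/x_1) / (\sigma^2 \Delta t)\big)$. The function names are hypothetical, and the intermediate values $X(t_i)$ are taken as given inputs, which is precisely what one would still have to sample or integrate over.

```python
import numpy as np

def p_no_breach(x_start, x_end, barrier, sigma, dt):
    """Brownian bridge probability that a GBM log-price stays below an upper
    barrier over a period of length dt, given both endpoint values."""
    up_start = np.maximum(np.log(barrier / x_start), 0.0)
    up_end = np.maximum(np.log(barrier / x_end), 0.0)
    # evaluates to 0 whenever an endpoint is already at or above the barrier
    return 1.0 - np.exp(-2.0 * up_start * up_end / (sigma**2 * dt))

def p_no_breach_piecewise(times, x_at_times, barriers, sigma):
    """Product over periods as in (19); barriers[i] applies on [times[i], times[i+1])."""
    prob = 1.0
    for i, b_i in enumerate(barriers):
        prob = prob * p_no_breach(x_at_times[i], x_at_times[i + 1], b_i,
                                  sigma, times[i + 1] - times[i])
    return prob

p_single = p_no_breach(100.0, 100.0, 150.0, 0.2, 0.5)
p_out = p_no_breach(100.0, 160.0, 150.0, 0.2, 0.5)
p_two = p_no_breach_piecewise([0.0, 0.25, 0.5], [100.0, 120.0, 110.0],
                              [150.0, 130.0], 0.2)
print(p_single, p_out, p_two)
```

The closed form is available only because the barrier is constant within each period and the volatility is constant; this is the restriction that motivates monitoring the barrier directly along each path instead.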
As mentioned before, we need to keep track of variables that detect whether the barrier was breached (the barrier condition satisfied) and preserve the values of $X$, $t$, and $Y$ at the time of the first barrier breach. In general, we will call these “conditional” variables, or “conditional” tensors (since variables are represented as and called tensors in TensorFlow).

In the time-continuous setting, we need the values of $\tau_T^{t,x}$, $X_{\tau_T^{t,x}}$, and $Y_{\tau_T^{t,x}}$ to compute the loss function (13) or some other risk measure that evaluates how the trading strategy replicates the appropriate payoff. In the time-discrete case, we need to keep track of time-discrete counterparts, which we call tFP ($t$ for payoff), XFP ($X$ for payoff), and YFP ($Y$ for payoff). To write updates for these variables, we need XTrig, which turns from false (0.0) to true (1.0) once the barrier has been breached. For barrier condition functions $C_t(t, X_t)$, this can be defined as
\[ \mathrm{XTrig}_i = \begin{cases} \mathrm{XTrig}_{i-1} & \text{if } \mathrm{XTrig}_{i-1} \\ C_{t_i}(t_i, X_{t_i}) & \text{else.} \end{cases} \tag{20} \]
For a problem with a single upper barrier at level $U$ active during the entire time to maturity, this would read
\[ \mathrm{XTrig}_i = \begin{cases} \mathrm{XTrig}_{i-1} & \text{if } \mathrm{XTrig}_{i-1} \\ X_{t_i} \ge U & \text{else.} \end{cases} \tag{21} \]
XTrig would be appropriately initialized depending on whether the barrier condition will be checked at time $t_0$ and could possibly be true there, or whether it will only be checked starting at the next time step.

Having defined XTrig, we can define tFP, XFP, and YFP as follows:
\[ \mathrm{tFP}_i = t_i \times (1.0 - \mathrm{XTrig}_{i-1}) + \mathrm{tFP}_{i-1} \times \mathrm{XTrig}_{i-1} \tag{22} \]
\[ \mathrm{XFP}_i = X_{t_i} \times (1.0 - \mathrm{XTrig}_{i-1}) + \mathrm{XFP}_{i-1} \times \mathrm{XTrig}_{i-1} \tag{23} \]
\[ \mathrm{YFP}_i = Y_{t_i} \times (1.0 - \mathrm{XTrig}_{i-1}) + \mathrm{YFP}_{i-1} \times \mathrm{XTrig}_{i-1} \tag{24} \]
initialized with $\mathrm{tFP}_0 = t_0$, $\mathrm{XFP}_0 = X_{t_0}$, and $\mathrm{YFP}_0 = Y_{t_0}$. These update equations keep copying the underlying $X$, $Y$, and $t$ until the barrier is hit and then stop at the next step, so that the values of the corresponding variables at barrier breach stay in the *FP versions. If the barrier is never breached, the values of $t$, $X$, and $Y$ at $t = T = t_N$ will be in $\mathrm{tFP}_N = t_N$, $\mathrm{XFP}_N = X_{t_N}$, and $\mathrm{YFP}_N = Y_{t_N}$. Thus, $\mathrm{tFP}_N$, $\mathrm{XFP}_N$, and $\mathrm{YFP}_N$ are the appropriate time-discrete analogues of $\tau_T^{t,x}$, $X_{\tau_T^{t,x}}$, and $Y_{\tau_T^{t,x}}$, and the loss function for the computational graph implementing this time-discrete approach would be
\[ \big\| \mathrm{YFP}_N - g_B(\mathrm{tFP}_N, \mathrm{XFP}_N) \big\|^2 \tag{25} \]
or any other appropriate risk measure.

Notice also that time-discretization leads to well-known biases in barrier options. If approximation of a certain time-continuous problem with continuously enforced barriers is desired, there are approaches such as barrier level correction (solving the time-discrete problem with appropriately shifted barrier levels) that will minimize such bias and lead to faster convergence to the solution of the time-continuous problem.

Figure 2 shows the computational graph/network for the approach just described. The variables/tensors that have been added or changed for the barrier case are shown as color shaded nodes, and the nodes from the original forward deep BSDE network are shown unshaded. As discussed above, these additional tensors are needed to preserve the state at barrier breach or maturity and serve to constrain the $Y_{t_i}$ appropriately so that it follows the FBSDE if not in the barrier domain but approximates the given barrier domain value once it touches the barrier domain.

The function $g_B$ will encode the specific barrier option treated. For a knock-out barrier call option, the barrier and final payoff is given by
\[ g_B(t, X_t) = \begin{cases} \max(X_t - K, 0.0) & \text{if } t = T \\ 0.0 & \text{if } t < T. \end{cases} \tag{26} \]
A knock-in barrier call option would be given by
\[ g_B(t, X_t) = \begin{cases} 0.0 & \text{if } t = T \\ V(X_t, K, T - t) & \text{if } t < T \end{cases} \tag{27} \]
with $V(X_t, K, T - t)$ the vanilla call price (as also used in the introduction section). For both examples, we assume that the barrier is not active at $T$. Note that the given $g_B$ is the same regardless of the level or number of barriers and can thus be used for up-and-out respectively up-and-in, down-and-out respectively down-and-in, and the double barrier variants of each.

[Figure 2 (diagram): the network of Figure 1 extended with initialization nodes for XTrig, tFP, XFP, and YFP and, at each time step, nodes XTrig, tFP, XFP, and YFP that feed the Payoff/Loss Function node.]
Figure 2: DeepBSDE network with conditional barrier triggers

Figure 3 shows the values of these barrier breach tracking variables for a case with 400 time steps, $X_0 = 100$, and an upper barrier at level 150, for the same parameters of the $X$ dynamics as defined in Table 2, in the next section. The upper panel shows $X_{t_i}$ while the lower panel shows $\mathrm{XTrig}_i$ (scale on the left axis) and $\mathrm{XFP}_i$ (scale on the right axis). The solid lines show several realizations of XFP corresponding to the same colored realization of $X$ in the upper panel. One can see that XFP stays at the barrier level (“has been stopped”) once the barrier has been touched (for the orange path shortly before time step 100, for the green path shortly before time step 300). If the barrier is never touched,
XFP is just $X$. Correspondingly, the dashed lines reflect XTrig: the orange line jumps to 1.0 (true) when the orange path touches the barrier shortly before time step 100, the green line jumps to 1.0 (true) shortly before time step 300, and none of the other lines jump. In the case with several barriers or more complicated barrier domains and conditions, $\mathrm{tFP}_N$ and $\mathrm{XFP}_N$ would identify the time and place the barrier was hit and thereby identify the barrier.

Figure 3: Behavior of conditional triggers

We present the results of the forward deep BSDE barrier pricer for an up-and-out call option with a single underlier, with the parameters of the dynamics and the instrument as shown in Table 2, along with the initial value for sample paths, $X_0$, chosen uniformly within 50 and 150.

| $\sigma$ | $r$ | $T$ | $B$ (Barrier Position) | $K$ (Strike) |
| 0.2 | 0.05 | 0.5 years | 150.0 | 100.0 |

Table 2: Parameters for the X dynamics, the generator, and the instrument

We use the generator $f(t, X_t, Y_t, Z_t) = -r(t, X_t) Y_t$ for the risk-neutral (discounting-only) case, as shown in section 2. However, any other generator, such as a generator for differential rates or other nonlinear pricing settings, could easily be used instead.

The hyper-parameters of the computational graph and the embedded neural networks include (a) the number of time steps in the discretization, (b) the number ($n$) of layers for each $\pi_{t_i}$ network for $1 \le i \le N$ and for the $Y_0$ network, and (c) the number of units per layer ($u$). The number of layers ($n$) and units per layer ($u$) for both the $Y_0$ network and the $\pi_{t_i}$ networks refer to the hidden layers. The input layer will always have as many units as the dimension of the problem, the output layer for the $\pi_{t_i}$ networks will be the same size as the input layer, and the output layer for the $Y_0$ network will only have one unit. We also tested different mini-batch sizes (sample paths per mini-batch, $b = 256, 512, 1024$). The configuration with $n = 5$, $u = 5$, and 200 time steps was used for subsequent tests.
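Returning to the tracking variables, the updates (21) to (24), the knock-out payoff (26), and the loss (25) can be sketched directly. Several things here are assumptions for illustration: NumPy arrays replace the TensorFlow tensors, XTrig is initialized to 0.0 (barrier not checked at $t_0$), a single constant upper barrier is used, the loss is taken as a mean of squared differences, and the names `track_barrier` and `g_b_knock_out_call` are hypothetical. In the actual method these updates are nodes of the computational graph applied step by step while $X$ and $Y$ are generated; here the paths are passed in precomputed.

```python
import numpy as np

def track_barrier(x_paths, y_paths, times, upper):
    """Apply the updates (21)-(24) along precomputed paths.

    x_paths, y_paths: arrays (n_paths, N + 1); times: array (N + 1,).
    Returns XTrig_N, tFP_N, XFP_N, YFP_N.
    """
    n_paths = x_paths.shape[0]
    x_trig = np.zeros(n_paths)             # assumed: barrier not checked at t_0
    t_fp = np.full(n_paths, times[0])
    x_fp = x_paths[:, 0].copy()
    y_fp = y_paths[:, 0].copy()
    for i in range(1, len(times)):
        prev = x_trig                       # XTrig_{i-1}
        t_fp = times[i] * (1.0 - prev) + t_fp * prev          # eq. (22)
        x_fp = x_paths[:, i] * (1.0 - prev) + x_fp * prev     # eq. (23)
        y_fp = y_paths[:, i] * (1.0 - prev) + y_fp * prev     # eq. (24)
        hit = (x_paths[:, i] >= upper).astype(float)
        x_trig = np.where(prev > 0.5, prev, hit)              # eq. (21)
    return x_trig, t_fp, x_fp, y_fp

def g_b_knock_out_call(t, x, strike, maturity):
    """Knock-out call payoff (26): call payoff at maturity, zero before."""
    return np.where(np.isclose(t, maturity), np.maximum(x - strike, 0.0), 0.0)

x_paths = np.array([[100.0, 120.0, 160.0, 140.0, 130.0],
                    [100.0, 110.0, 120.0, 130.0, 140.0]])
y_paths = np.array([[5.0, 6.0, 7.0, 8.0, 9.0],
                    [1.0, 2.0, 3.0, 4.0, 5.0]])
times = np.linspace(0.0, 0.5, 5)
x_trig, t_fp, x_fp, y_fp = track_barrier(x_paths, y_paths, times, upper=150.0)
loss = np.mean((y_fp - g_b_knock_out_call(t_fp, x_fp, 100.0, times[-1])) ** 2)
print(x_trig, t_fp, x_fp, y_fp, loss)
```

Because each *FP update uses XTrig from the previous step, the values at the breach step itself are still copied once and are frozen from the following step onward, which is the behavior described for Figure 3.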
Loss function values after 20,000 mini-batch steps

Hyperparameters    time steps = 50          time steps = 100         time steps = 200
                   b=256   b=512   b=1024   b=256   b=512   b=1024   b=256   b=512   b=1024
n=3   u=3          21.09   19.04   15.71    19.43   14.38   21.35    23.45   19.45   22.43
      u=5          12.46   17.01   11.82    10.65   13.67   8.79     12.5    12.86   12.64
n=5   u=3          11.32   9.89    8.99     8.9     11.49   10.2     12.09   9.35    8.75
      u=5          11.86   7.5     12.19    8.52    8.78    7.6      8.93    7.87    6.4
n=7   u=3          17.45   13.07   16.93    27.06   29.55   23.79    35.19   21.99   40.46
      u=5          9.59    13.37   29.14    16.63   31.26   23.43    26.31   17.08   27.44

Table 3: Summary of loss minimization with respect to network hyper-parameters

The price of the option at time 0, as learned by the initial network $Y_0$, is compared to the analytic solution of the time-continuous problem in Figure 4. The two are in close agreement, and the approximation improves with a larger number of training mini-batches. The results show that this methodology is successful in replicating the appropriate payoff, as shown in Figures 4c and 4d. The payoff approximation is shown for sample paths that have not breached the barrier (blue) as well as for those that have knocked out (black). The loss (measured as the MSE of the difference between $\mathrm{YFP}$ and the payoff), as computed from the mini-batch, and its standard deviation are shown in Figure 5a; the loss clearly decreases as the number of iterations grows.

Furthermore, another neural network design with batch normalization and 5 hidden layers with 7 units per layer was trained, and the results are shown in Figure 7. As mentioned earlier, for the time-continuous case and an exact solution of the minimization problem, $\pi_i$ would be $\partial u(t_i, x) / \partial x$ and $\pi_0$ would represent the delta at time 0. Thus, it is compared to the analytical delta of the option as given by the known closed-form solution. The differences between the analytical and learned values are shown in the plots on the right of the panel. As also discussed earlier, one would not expect arbitrarily close agreement between analytical and learned values for the time-discrete case.
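For reference, the analytic benchmark for this kind of comparison can be sketched as follows. This is the standard textbook closed form for a continuously monitored up-and-out call under Black-Scholes dynamics, assuming constant $r$ and $\sigma$, no dividends, no rebate, and a barrier above the strike; it is not necessarily the exact expression or implementation used for the figures, and the function names are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def vanilla_call(s, k, r, sigma, t):
    """Black-Scholes price of a European call without dividends."""
    vol = sigma * math.sqrt(t)
    d1 = (math.log(s / k) + (r + 0.5 * sigma**2) * t) / vol
    return s * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d1 - vol)


def up_and_out_call(s, k, barrier, r, sigma, t):
    """Up-and-out call via in-out parity; assumes barrier > k and no rebate."""
    if s >= barrier:
        return 0.0  # already knocked out
    vol = sigma * math.sqrt(t)
    lam = (r + 0.5 * sigma**2) / sigma**2
    x1 = math.log(s / barrier) / vol + lam * vol
    y = math.log(barrier**2 / (s * k)) / vol + lam * vol
    y1 = math.log(barrier / s) / vol + lam * vol
    disc_k = k * math.exp(-r * t)
    ratio = barrier / s
    up_and_in = (
        s * norm_cdf(x1)
        - disc_k * norm_cdf(x1 - vol)
        - s * ratio ** (2 * lam) * (norm_cdf(-y) - norm_cdf(-y1))
        + disc_k * ratio ** (2 * lam - 2) * (norm_cdf(-y + vol) - norm_cdf(-y1 + vol))
    )
    # In-out parity: knock-out price = vanilla price - knock-in price.
    return vanilla_call(s, k, r, sigma, t) - up_and_in


if __name__ == "__main__":
    # Parameters of Table 2: sigma = 0.2, r = 0.05, T = 0.5, B = 150, K = 100.
    for s0 in (50.0, 75.0, 100.0, 125.0, 149.0):
        print(s0, up_and_out_call(s0, 100.0, 150.0, 0.05, 0.2, 0.5))
```

Evaluating `up_and_out_call` on a grid of initial values between 50 and 150 gives the curve against which a learned $Y_0$ network can be plotted. Since the formula assumes continuous monitoring, a small gap relative to a time-discretized solution is to be expected.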
As discussed earlier, the $\pi_i$ networks give the number of units of each underlier held in the trading strategy. For the time-continuous case and appropriate risk measures such as the $L^2$ error, that trading strategy would be the delta-hedging strategy $\partial u(t, x) / \partial x$ corresponding to a given analytical solution $u$. It is standard to implement time-discrete delta-hedging trading strategies that adjust the portfolio to $\partial u(t_i, x) / \partial x$ units of underliers at given discrete time instances $t_i$.

Figure 4: Approximation of initial value (analytical solution) and payoffs. (a) Analytical vs BSDE solution, 5000 mini-batches. (b) Analytical vs BSDE solution, 20000 mini-batches. (c) Approximation of the payoffs, 5000 mini-batches. (d) Approximation of the payoffs, 20000 mini-batches.

Figure 5: Loss history and trading strategy (how many units of underlier in portfolio). (a) Loss history, 20000 mini-batches. (b) Trading strategy.

Figure 6: The approximation of final payoff with number of mini-batches. (a) Hedging PnL after 5000 training mini-batches. (b) Hedging PnL after 20000 training mini-batches.

Figure 7: Comparison of analytical solution to DeepBSDE for network 2 (different architecture)

The position in the portfolio for the underlier (corresponding to Delta) as learned by the $\pi_i$ networks is shown in Figure 5b as a 3D plot, as a function of the time $t_i$ and the underlier.

The difference between the value of the trading strategy at maturity or barrier and the payoff specified by the instrument can be understood as the hedging PnL resulting from such hedging at discrete times. This hedging PnL, as obtained from the deep BSDE solution at different numbers of mini-batches during training, is shown in Figure 6. Since the hedging is only performed at discrete times, it is expected to deviate from the theoretical PnL of zero for continuous hedging. One can see that as the training proceeds over more and more mini-batches, the hedging PnL becomes more strongly peaked at 0 and has fewer outliers.

It is also interesting to compare the trading strategy obtained from the deepBSDE method to the trading strategy implied by delta-hedging according to the known analytical solution for the barrier problem in the time-continuous case (which is obtained by setting $\pi_i$ to the gradient of the known analytical solution evaluated at time $t_i$).

The histogram of the hedging PnL (final value of the strategy versus required payoff) for the hedging strategy from the trained $\pi_i$ networks is shown in Figure 8a, for both the active and the knocked-out case.
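The notion of hedging PnL used here (strategy value at barrier breach or maturity minus the required payoff) can be illustrated with a minimal sketch. It assumes geometric Brownian motion with drift $r$, a self-financing portfolio whose cash part accrues at $r$, and no rebate on knock-out. All function names are hypothetical, and a vanilla Black-Scholes call delta with an infinite barrier stands in for a learned $\pi_i$ network purely to keep the example short.

```python
import math
import random
import statistics


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_price(x, strike, r, sigma, tau):
    vol = sigma * math.sqrt(tau)
    d1 = (math.log(x / strike) + (r + 0.5 * sigma**2) * tau) / vol
    return x * norm_cdf(d1) - strike * math.exp(-r * tau) * norm_cdf(d1 - vol)


def bs_call_delta(x, strike, r, sigma, tau):
    """Vanilla call delta; a stand-in for a learned pi(t, x)."""
    vol = sigma * math.sqrt(tau)
    return norm_cdf((math.log(x / strike) + (r + 0.5 * sigma**2) * tau) / vol)


def hedging_pnl(y0, x0, strategy, payoff, barrier, r, sigma, maturity, n_steps, rng):
    """Strategy value minus required payoff at barrier breach or maturity (one path)."""
    dt = maturity / n_steps
    x, y = x0, y0
    for i in range(n_steps):
        units = strategy(i * dt, x)  # rebalance only at the discrete times t_i
        z = rng.gauss(0.0, 1.0)
        x_next = x * math.exp((r - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * z)
        # Self-financing update: cash accrues at r, `units` of underlier are held.
        y = (y - units * x) * math.exp(r * dt) + units * x_next
        x = x_next
        if x >= barrier:
            return y  # knocked out: required payoff is zero (no rebate assumed)
    return y - payoff(x)


def pnl_std(n_steps, n_paths=300, seed=0):
    """Standard deviation of hedging PnL for a vanilla call (barrier at infinity)."""
    rng = random.Random(seed)
    strike, r, sigma, maturity, x0 = 100.0, 0.05, 0.2, 0.5, 100.0

    def strategy(t, x):
        return bs_call_delta(x, strike, r, sigma, maturity - t)

    def payoff(x):
        return max(x - strike, 0.0)

    y0 = bs_call_price(x0, strike, r, sigma, maturity)
    pnls = [
        hedging_pnl(y0, x0, strategy, payoff, math.inf, r, sigma, maturity, n_steps, rng)
        for _ in range(n_paths)
    ]
    return statistics.pstdev(pnls)


if __name__ == "__main__":
    print("PnL std, 10 hedging dates: ", pnl_std(10))
    print("PnL std, 200 hedging dates:", pnl_std(200))
```

Replacing `strategy` by a trained network or by the gradient of an analytical barrier solution, and `barrier` by a finite level, yields per-path PnL samples of the kind that can be collected into histograms.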
In contrast, the hedging PnL obtained by using the analytical values of the option Deltas is shown in Figure 8b.

It can be seen that hedging with the analytical option Deltas yields a small PnL in the majority of cases, but performs poorly in a number of cases where the PnL assumes a large relative value (-30 to 30) due to imperfections in discrete-time hedging. In contrast, the deep learning based approach yields a very small PnL in a comparatively smaller number of cases, but has better overall risk performance (a generally small PnL) and smaller extreme errors than hedging with the analytic Delta.

Figure 8: Comparison of hedging PnL between discrete-time analytic and DeepBSDE solution. (a) Payoff approximation error using DeepBSDE Delta for active and knocked-out case. (b) Payoff approximation error using analytical option Deltas for active and knocked-out case.

The authors would like to thank Fernando Cela Diaz and Pallavi Abhang for their help in setting up and running large-scale hyper-parameter optimization on distributed computing infrastructure, Orcan Ogetbil for discussions on barrier option pricing, Vijayan Nair for discussions regarding methods, presentation, and results, and Agus Sudjianto for supporting this research.
We proposed a novel deep neural network based architecture to price barrier options via Deep BSDE. The architecture captures whether or not the underliers' price movement touched the barrier, so that the appropriate payoff conditions are modeled, and it minimizes the payoff error in order to learn the price of the option at the initial time. We also demonstrated the effectiveness of the hedging strategy given by the Deltas learned via Deep BSDE at discrete time instances.
References

[CWNMW19] Quentin Chan-Wai-Nam, Joseph Mikael, and Xavier Warin. Machine learning for semi linear PDEs. Journal of Scientific Computing, 79(3):1667–1712, 2019. arXiv:1809.07609.

[EKPQ97] Nicole El Karoui, Shige Peng, and Marie Claire Quenez. Backward stochastic differential equations in finance. Mathematical Finance, 7(1):1–71, 1997. Also available on semanticscholar.org.

[Hie19] Bernhard Hientzsch. Introduction to solving quant finance problems with time-stepped FBSDE and deep learning. arXiv preprint arXiv:1911.12231, Nov 2019. Also available at SSRN: https://ssrn.com/abstract=3494359 or http://dx.doi.org/10.2139/ssrn.3494359.

[HJE18] Jiequn Han, Arnulf Jentzen, and Weinan E. Solving high-dimensional partial differential equations using deep learning. Proceedings of the National Academy of Sciences, 115(34):8505–8510, 2018.

[Mer15] Fabio Mercurio. Bergman, Piterbarg, and beyond: pricing derivatives under collateralization and differential rates. In Actuarial Sciences and Quantitative Finance, pages 65–95. Springer, 2015. Also available at SSRN: https://ssrn.com/abstract=2326581 or http://dx.doi.org/10.2139/ssrn.2326581.

[Per10] Nicolas Perkowski. Backward Stochastic Differential Equations: an Introduction, 2010. Available on semanticscholar.org.

[Rai18] Maziar Raissi. Forward-backward stochastic neural networks: Deep learning of high-dimensional partial differential equations. arXiv preprint arXiv:1804.07010, 2018.

[Shr04] Steven Shreve. Stochastic Calculus for Finance II. Springer-Verlag New York, 2004.

[YXS19] Bing Yu, Xiaojing Xing, and Agus Sudjianto. Deep-learning based numerical BSDE method for Barrier options. arXiv preprint arXiv:1904.05921.