Pricing spread option with liquidity adjustments
Kevin S. Zhang and Traian A. Pirvu
Abstract.
We study the pricing and hedging of European spread options on correlated assets when, in contrast to the standard framework and consistent with imperfectly liquid markets, trading in the stock market has a direct impact on stock prices. We consider a partial-impact model and a full-impact model; in the latter, the price impact is caused by every trading strategy in the market. The generalized Black-Scholes pricing partial differential equations (PDEs) are obtained and analysed. We perform a numerical analysis to exhibit the illiquidity effect on the replication strategy of the European spread option. Compared to the Black-Scholes model or a partial impact model, the trader in the full impact model buys more stock to replicate the option, and this leads to a higher option price.
Key words.
Spread option, price impact, XVA, illiquid market, deep learning, Deep Galerkin Method, transfer learning
AMS subject classifications.
1. Introduction.
A spread option gives its holder the right, but not the obligation, to purchase the difference between two assets at a cost. Its payoff at maturity $T$ is $\big(S_1(T) - S_2(T) - k\big)^+$, where $k$ is the strike of the option, i.e. the cost. While it can be written on all varieties of assets, such as equities, bonds, and currencies, it has a unique role in the commodity market. The commodity market consists of many sectors such as agriculture, energy, and petroleum. In the agricultural market, the Crush Spread, Johnson et al. (1991) [14], allows for the exchange of unrefined soybeans for a combination of soybean oil and soybean meal. In the energy market, the Spark Spread, Girma and Paulson (1998) [10], pays the spread between natural gas and electricity. In the petroleum market, the Crack Spread provides utility for the differential between the price of crude oil and petroleum products. Information on various Crack Spreads can be found in the NYMEX Rulebook (2020) [11].

Valuation of a spread option involves solving a two-dimensional Black–Scholes PDE of the form:
\[
\begin{cases}
rV = V_t + \tfrac12\sigma_1^2 s_1^2 V_{s_1 s_1} + \rho\sigma_1\sigma_2 s_1 s_2 V_{s_1 s_2} + \tfrac12\sigma_2^2 s_2^2 V_{s_2 s_2} + r s_1 V_{s_1} + r s_2 V_{s_2},\\
V(T, s_1, s_2) = (s_1 - s_2 - k)^+,
\end{cases}
\quad 0 < s_1, s_2 < \infty,\ 0 \le t \le T. \tag{1.1}
\]
The terminal condition of the above equation reflects the maturity payoff of the spread option. A closed form solution of (1.1) does not exist, so spread option pricing relies on numerical approaches. Various numerical methods have been developed for approximating the price of these options; for example, [13, 5, 8] are well established methods.

Underpinning classical pricing theory is the assumption of perfect market liquidity, namely that trading activity has no effect on asset prices. Relaxing this assumption affects the asset prices and, subsequently, the value of derivative contracts written on those assets. This difference in price against the classical model was named the liquidity valuation adjustment (LVA) in Pirvu and Zhang (2020) [21].

Preprint. This manuscript is for review purposes only.
Funding: This work was funded by NSERC grant 5-36700.
Kevin S. Zhang and Traian A. Pirvu: Department of Mathematics and Statistics, McMaster University, Hamilton, ON, Canada.

Willmot (2000) [20] studied various price impact models arising from different trading strategies. In particular, Pirvu et al. [1], [16] investigated spread option impact models under Delta Hedging strategies. The study yielded both a partial impact model and a full impact model. The key difference between the two models lies in the replication process. The partial impact model uses the Delta of the impactless model, while the full impact model uses the Delta of the model with price impact. As a result, the full impact model has a non-linear pricing PDE.

In this paper, we further explore the LVA market model of Pirvu and Zhang (2020) [21]. The model consists of two risky assets whose prices are driven by a pair of stochastic differential equations (SDEs). The illiquid asset price is modified to include full or partial price impact from trading. This gives rise to two distinct pricing PDEs corresponding to the partial impact and full impact models. Existence and uniqueness of solutions to the asset price SDEs are established. The approach used to derive the option price PDE is the replicating portfolio methodology. The option price PDEs are investigated in both the partial impact and full impact models. We present a novel technique to solve these PDEs numerically. Motivated by [18, 2], we apply the Deep Galerkin Method (DGM) for PDEs. It relies on approximating the solution of a PDE with a deep neural network. The network is trained to satisfy the PDE's differential operator, its initial condition, as well as the boundary conditions. Because it is mesh free, this numerical routine is not affected by the PDE dimension. We perform numerical experiments and analyse the results. We learn from our numerical experiments that the option hedging strategy in the full impact model has a higher financing cost, and this results in a higher option price when compared to a partial impact or no impact model.

The remainder of this paper is organized as follows. Section 2 presents the financial market model. The partial impact and full impact models are treated in Subsections 3.1 and 3.2. The DGM method is presented in Section 4 and numerical experiments in Section 5. The paper ends with a conclusion and an appendix.
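Since the impactless problem (1.1) has no closed-form solution, a simple numerical baseline helps when checking any solver. The sketch below is not the method used in this paper: it is a plain Monte Carlo estimator of the discounted payoff $(S_1(T) - S_2(T) - k)^+$ under correlated geometric Brownian motions. The function names and all parameter values are illustrative assumptions. For $k = 0$ the spread option reduces to an exchange option, for which Margrabe's formula gives a closed-form check.

```python
import math
import random


def mc_spread_price(s1, s2, k, r, sigma1, sigma2, rho, T,
                    n_paths=100_000, seed=0):
    """Monte Carlo estimate of exp(-rT) * E[(S1(T) - S2(T) - k)^+]
    under risk-neutral correlated geometric Brownian motions."""
    rng = random.Random(seed)
    sqrt_T = math.sqrt(T)
    total = 0.0
    for _ in range(n_paths):
        z1 = rng.gauss(0.0, 1.0)
        # Build a second normal with correlation rho to z1.
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        s1_T = s1 * math.exp((r - 0.5 * sigma1 ** 2) * T + sigma1 * sqrt_T * z1)
        s2_T = s2 * math.exp((r - 0.5 * sigma2 ** 2) * T + sigma2 * sqrt_T * z2)
        total += max(s1_T - s2_T - k, 0.0)
    return math.exp(-r * T) * total / n_paths


def margrabe_price(s1, s2, sigma1, sigma2, rho, T):
    """Closed-form exchange option price (the k = 0 case, no dividends)."""
    sigma = math.sqrt(sigma1 ** 2 + sigma2 ** 2 - 2.0 * rho * sigma1 * sigma2)
    d1 = (math.log(s1 / s2) + 0.5 * sigma ** 2 * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)

    def norm_cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return s1 * norm_cdf(d1) - s2 * norm_cdf(d2)
```

For a non-zero strike the Monte Carlo estimate can serve as a rough cross-check of a PDE or FFT solution on the same parameters.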
2. Model Framework.
In this study, we adopt the basic model framework of [21] for exchange options, and apply modifications to suit spread options. Our model of a market is based on a filtered probability space $(\Omega, \{\mathcal{F}_t\}_{t\in[0,T]}, \mathbb{P})$ that satisfies the usual conditions, and consists of two assets. The asset prices are assumed to follow an $\{\mathcal{F}_t\}_{t\ge 0}$-adapted two-dimensional Itô process $S(t) = (S_1(t), S_2(t))$. The randomness in this market is driven by a two-dimensional Brownian motion $W(t) = (W_1(t), W_2(t))$. The asset price dynamics are given by the following stochastic differential equations
\[
\begin{aligned}
dS_1(t) &= \mu_1(t) S_1(t)\,dt + \sigma_1 S_1(t)\,dW_1(t) + \lambda(t, S_1, S_2)\,df(t, S_1, S_2),\\
dS_2(t) &= \mu_2(t) S_2(t)\,dt + \sigma_2 S_2(t)\,dW_2(t),
\end{aligned} \tag{2.1}
\]
where $\lambda(t, S_1, S_2)$ and $f(t, S_1, S_2)$ are a price impact function and the corresponding trading strategy respectively. The price impact function is given by
\[
\bar\lambda(t, s_1, s_2) =
\begin{cases}
\epsilon\big(1 - e^{-\beta(T-t)}\big) & \text{if } \underline{S} < s_1, s_2 < \overline{S},\\
0 & \text{otherwise},
\end{cases} \tag{2.2}
\]
where $\underline{S}$ and $\overline{S}$ represent a trading floor and cap of the assets respectively. This causes the trading price impact to be truncated within the floor and cap. As for the other parameters, $\epsilon$ is the price impact per share, and $\beta$ is a decay constant. It is important to emphasize that $\bar\lambda(t, s_1, s_2)$ is used for numerical approximation. The theoretical $\lambda(t, s_1, s_2)$ is a function with bounded derivative, and can be obtained through standard mollifying of $\bar\lambda(t, s_1, s_2)$.

The trading strategy $f(t, s_1, s_2)$ can be a large trader's strategy (partial impact), or any trading strategy (full impact). As in [16], the SDEs of (2.1) can be rewritten as:
\[
\begin{aligned}
dS_1(t) &= \bar\mu_1\big(S(t)\big)\,dt + \bar\sigma_{11}\big(S(t)\big)\,dW_1(t) + \bar\sigma_{12}\big(S(t)\big)\,dW_2(t),\\
dS_2(t) &= \bar\mu_2(t)\,dt + \bar\sigma_{21}(t)\,dW_1(t) + \bar\sigma_{22}(t)\,dW_2(t),
\end{aligned} \tag{2.3}
\]
where the drift and diffusion functions are:
\[
\begin{aligned}
\bar\mu_1(t, s_1, s_2) &= \frac{1}{1 - \lambda f_{s_1}}\bigg(\mu_1 s_1 + \lambda\Big[f_t + s_2\mu_2 f_{s_2} + \frac{f_{s_1 s_2}\big(\rho\sigma_1\sigma_2 s_1 s_2 + \sigma_2^2 s_2^2\lambda f_{s_2}\big)}{1 - \lambda f_{s_1}}\\
&\qquad + \frac{f_{s_1 s_1}\big(\sigma_1^2 s_1^2 + \sigma_2^2 s_2^2\lambda^2 f_{s_2}^2 + 2\rho\sigma_1\sigma_2 s_1 s_2\lambda f_{s_2}\big)}{2\big(1 - \lambda f_{s_1}\big)^2} + \tfrac12\sigma_2^2 s_2^2 f_{s_2 s_2}\Big]\bigg),\\
\bar\mu_2(t) &= \mu_2 s_2,\\
\bar\sigma_{11}(t, s_1, s_2) &= \frac{\sigma_1 s_1}{1 - \lambda f_{s_1}}, \qquad
\bar\sigma_{12}(t, s_1, s_2) = \frac{\sigma_2 s_2\lambda f_{s_2}}{1 - \lambda f_{s_1}},\\
\bar\sigma_{21}(t) &= \sigma_2 s_2\rho, \qquad
\bar\sigma_{22}(t) = \sigma_2 s_2\sqrt{1 - \rho^2}.
\end{aligned}
\]
In Section 3.1 and Section 3.2 we specialize the above SDEs to reflect partial and full impact. For the purposes of pricing and hedging, we look for a risk-neutral pricing probability measure. The existence of such a probability measure is shown in Theorem 2.1.
Theorem 2.1 (Finite Liquidity Risk-Neutral Measure). There exists a unique risk-neutral measure $\tilde{\mathbb{P}}$ for the finite liquidity market model (FLMM), given by
\[
\tilde{\mathbb{P}}(A) = \int_A Z(\omega)\,d\mathbb{P}(\omega) \quad \text{for all } A \in \mathcal{F}_T,
\]
where
\[
Z(t) = \exp\Big(-\int_0^t \langle\Theta(u), dW(u)\rangle - \tfrac12\int_0^t \|\Theta(u)\|^2\,du\Big),
\]
with the vector-valued market price of risk generator process
\[
\Theta\big(t, S_1(t), S_2(t)\big) = \frac{1}{\bar\sigma_{11}\bar\sigma_{22} - \bar\sigma_{12}\bar\sigma_{21}}
\begin{bmatrix} \bar\sigma_{22} & -\bar\sigma_{12}\\ -\bar\sigma_{21} & \bar\sigma_{11} \end{bmatrix}
\begin{bmatrix} \bar\mu_1 - r\\ \bar\mu_2 - r \end{bmatrix}.
\]
Proof. Please see Appendix 7.1.

The following proposition states the existence of a unique solution for the SDE system driving the asset prices, based on a standard result.

Proposition (Finite Liquidity SDE Existence). Suppose the diffusion functions $\bar\sigma_{11}(t, s_1, s_2)$ and $\bar\sigma_{12}(t, s_1, s_2)$ of (2.3) are uniformly Lipschitz continuous in $(s_1, s_2) \in (\mathbb{R}^+)^2$. Then the SDE system (2.3) under $\tilde{\mathbb{P}}$ becomes
\[
\begin{aligned}
dS_1(t) &= r\,dt + \bar\sigma_{11}\big(S(t)\big)\,d\tilde W_1(t) + \bar\sigma_{12}\big(S(t)\big)\,d\tilde W_2(t),\\
dS_2(t) &= r\,dt + \bar\sigma_{21}(t)\,d\tilde W_1(t) + \bar\sigma_{22}(t)\,d\tilde W_2(t).
\end{aligned} \tag{2.4}
\]
Furthermore, it has a unique strong solution.
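The truncated impact function (2.2) and the impacted diffusion coefficients $\bar\sigma_{11}$, $\bar\sigma_{12}$ of (2.3) are simple enough to write down directly. The following is a minimal sketch; the function names, argument order, and the guard on the denominator are assumptions made for illustration and are not taken from the paper.

```python
import math


def price_impact(t, s1, s2, eps, beta, T, floor, cap):
    """Truncated price impact of (2.2): eps * (1 - exp(-beta (T - t)))
    while both prices lie strictly between the floor and the cap, else 0."""
    inside = floor < s1 < cap and floor < s2 < cap
    if not inside:
        return 0.0
    return eps * (1.0 - math.exp(-beta * (T - t)))


def impacted_diffusion(s1, s2, sigma1, sigma2, lam, f_s1, f_s2):
    """Diffusion coefficients (sigma_bar_11, sigma_bar_12) of (2.3).

    f_s1 and f_s2 are the partial derivatives of the trading strategy f
    with respect to s1 and s2, supplied by the caller."""
    denom = 1.0 - lam * f_s1
    if abs(denom) < 1e-12:
        # The model needs 1 - lambda * f_{s1} bounded away from zero.
        raise ValueError("1 - lambda * f_s1 is too close to zero")
    sigma_bar_11 = sigma1 * s1 / denom
    sigma_bar_12 = sigma2 * s2 * lam * f_s2 / denom
    return sigma_bar_11, sigma_bar_12
```

With `lam = 0` the coefficients reduce to the impactless values $(\sigma_1 s_1, 0)$, which is a quick sanity check on any implementation.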
3. Market Specification.
The purpose of this section is to describe the full and partial impact models in a manner that is clear and concise while still maintaining mathematical rigour. For each of the partial and full impact models, we establish existence and uniqueness not only for the market SDEs, but also for the option pricing PDEs. We also explain the market conditions that admit full price impact.
3.1. Partial Impact. In the case of partial price impact, only the trading activities of big institutional traders will affect market prices. If we assume all these big players are delta hedgers using the delta from the BS model, then the risk-neutral dynamics of the FLMM SDE system become
\[
\begin{aligned}
dS_1(t) &= r\,dt + \bar\sigma^{*}_{11}\big(S(t)\big)\,d\tilde W_1(t) + \bar\sigma^{*}_{12}\big(S(t)\big)\,d\tilde W_2(t),\\
dS_2(t) &= r\,dt + \bar\sigma_{21}(t)\,d\tilde W_1(t) + \bar\sigma_{22}(t)\,d\tilde W_2(t),
\end{aligned} \tag{3.1}
\]
and the diffusion functions are:
\[
\bar\sigma^{*}_{11}(t, s_1, s_2) = \frac{\sigma_1 s_1}{1 - \lambda V^{(BS)}_{s_1 s_1}}, \qquad
\bar\sigma^{*}_{12}(t, s_1, s_2) = \frac{\sigma_2 s_2\lambda V^{(BS)}_{s_1 s_2}}{1 - \lambda V^{(BS)}_{s_1 s_1}},
\]
where $V^{(BS)}_{s_1 s_1}$ and $V^{(BS)}_{s_1 s_2}$ are second order Greeks of the spread option under the BS model. As one can see, the study of Greeks for the spread option becomes crucial in further analysing the model. There have been extensive studies on this in Li and Deng (2008) [15]. We derive our Greeks analysis from a Fourier transform method developed by Hurd and Zhou (2010) [13]. More details on the Greeks are available in Appendix 7.2. Based on this, we establish the existence and uniqueness of asset prices in the next theorem.

Theorem 3.1 (Finite Liquidity Existence III). The SDE system (3.1) of the FLMM under partial impact has a unique solution.
Proof.
Please refer to Appendix 7.3.

The replicating portfolio argument is fundamental in the derivation of the BS type PDE characterizing the spread option price. In this scenario, the portfolio used for replication has two assets and one cash account. The full argument can be found in Pirvu and Yazdanian (2016) [1]. The resulting PDE is linear and parabolic:
\[
\begin{cases}
rV = V_t + \dfrac{V_{s_1 s_1}}{2\big(1 - \lambda V^{(BS)}_{s_1 s_1}\big)^2}\Big(\sigma_1^2 s_1^2 + \sigma_2^2 s_2^2\lambda^2\big(V^{(BS)}_{s_1 s_2}\big)^2 + 2\rho\sigma_1\sigma_2 s_1 s_2\lambda V^{(BS)}_{s_1 s_2}\Big)\\
\qquad + \dfrac{V_{s_1 s_2}}{1 - \lambda V^{(BS)}_{s_1 s_1}}\Big(\rho\sigma_1\sigma_2 s_1 s_2 + \sigma_2^2 s_2^2\lambda V^{(BS)}_{s_1 s_2}\Big) + \tfrac12 V_{s_2 s_2}\sigma_2^2 s_2^2 + r s_1 V_{s_1} + r s_2 V_{s_2},\\
V(T, s_1, s_2) = h(s_1, s_2),
\end{cases}
\quad 0 < s_1, s_2 < \infty,\ 0 \le t \le T. \tag{3.2}
\]
To show that the above PDE admits a unique classical solution, one may refer to Chapter 4 of Friedman (1975) [9]. In fact, the PDE (3.2) has a unique classical solution whenever $1 - \lambda f_{s_1}$ satisfies condition (3) of Theorem 3.1.

3.2. Full Impact. In the case of full price impact, the interaction of every market participant (big or small) has a direct impact on the price of market constituents. If we assume all these participants apply a delta hedging strategy, then option sensitivities will affect $S_1$. We showcase the risk-neutral dynamics of the FLMM SDE system in (3.3):
\[
\begin{aligned}
dS_1(t) &= r\,dt + \bar\sigma^{**}_{11}\big(S(t)\big)\,d\tilde W_1(t) + \bar\sigma^{**}_{12}\big(S(t)\big)\,d\tilde W_2(t),\\
dS_2(t) &= r\,dt + \bar\sigma_{21}(t)\,d\tilde W_1(t) + \bar\sigma_{22}(t)\,d\tilde W_2(t),
\end{aligned} \tag{3.3}
\]
where the diffusion functions are
\[
\bar\sigma^{**}_{11}(t, s_1, s_2) = \frac{\sigma_1 s_1}{1 - \lambda V_{s_1 s_1}}, \qquad
\bar\sigma^{**}_{12}(t, s_1, s_2) = \frac{\sigma_2 s_2\lambda V_{s_1 s_2}}{1 - \lambda V_{s_1 s_1}}.
\]
Applying the canonical two-asset replicating portfolio argument, the result is the following non-linear BS-like PDE:
\[
\begin{cases}
rV = V_t + \dfrac{V_{s_1 s_1}}{2\big(1 - \lambda V_{s_1 s_1}\big)^2}\Big(\sigma_1^2 s_1^2 + \lambda^2 V_{s_1 s_2}^2\sigma_2^2 s_2^2 + 2\lambda V_{s_1 s_2}\rho\sigma_1\sigma_2 s_1 s_2\Big)\\
\qquad + \dfrac{V_{s_1 s_2}}{1 - \lambda V_{s_1 s_1}}\Big(\rho\sigma_1\sigma_2 s_1 s_2 + \lambda V_{s_1 s_2}\sigma_2^2 s_2^2\Big) + \tfrac12 V_{s_2 s_2}\sigma_2^2 s_2^2 + r s_1 V_{s_1} + r s_2 V_{s_2},\\
V(T, s_1, s_2) = h(s_1, s_2),
\end{cases}
\quad 0 < s_1, s_2 < \infty,\ 0 \le t \le T. \tag{3.4}
\]
This non-linearity is the major contrast between the partial and full price impact models, and it makes existence and uniqueness harder to establish.

We establish existence and uniqueness for the market model (3.3) by first showing that the PDE (3.4) has a certain class of smooth solutions, and then extending existence and uniqueness to the SDE system (3.3) in a similar manner as outlined in Theorem 3.1. This procedure is summarized in the following theorem.

Theorem (Finite Liquidity Existence IV). The SDE system (3.3) of the FLMM with the full impact assumption has a strong solution under $\tilde{\mathbb{P}}$.
Proof. Please refer to Appendix 7.4.
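The pricing PDEs (3.2) and (3.4) share the same second-order structure and differ only in which Greeks enter the coefficients: the BS Greeks $V^{(BS)}_{s_1 s_1}$, $V^{(BS)}_{s_1 s_2}$ for partial impact, and the Greeks of $V$ itself for full impact. The sketch below evaluates those coefficients for supplied Greek values. The function name and the convention of writing the PDE as $rV = V_t + a_{11}V_{s_1 s_1} + a_{12}V_{s_1 s_2} + a_{22}V_{s_2 s_2} + r s_1 V_{s_1} + r s_2 V_{s_2}$ are assumptions made for illustration.

```python
def fl_pde_coefficients(s1, s2, sigma1, sigma2, rho, lam, gamma11, gamma12):
    """Second-order coefficients (a11, a12, a22) of the FLMM pricing PDE.

    gamma11 and gamma12 stand for the Greeks d^2V/ds1^2 and d^2V/ds1ds2:
    pass the BS Greeks for the partial impact PDE (3.2), or the Greeks of
    the current solution for the full impact PDE (3.4)."""
    denom = 1.0 - lam * gamma11
    if abs(denom) < 1e-12:
        raise ValueError("1 - lambda * gamma11 is too close to zero")
    quad_var_s1 = (sigma1 ** 2 * s1 ** 2
                   + sigma2 ** 2 * s2 ** 2 * lam ** 2 * gamma12 ** 2
                   + 2.0 * rho * sigma1 * sigma2 * s1 * s2 * lam * gamma12)
    a11 = quad_var_s1 / (2.0 * denom ** 2)
    a12 = (rho * sigma1 * sigma2 * s1 * s2
           + sigma2 ** 2 * s2 ** 2 * lam * gamma12) / denom
    a22 = 0.5 * sigma2 ** 2 * s2 ** 2
    return a11, a12, a22
```

Setting `lam = 0` recovers the coefficients of the impactless PDE (1.1), so the same routine can serve all three models.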
4. Deep Galerkin Method.
The curse of dimensionality is a common issue when attempting to solve high dimensional PDEs. In high dimensions, methods such as finite difference and finite element not only become costly, but are often unstable. Multi-asset option pricing PDEs are affected by the curse of dimensionality. The Deep Galerkin Method (DGM), developed by Sirignano and Spiliopoulos (2018) [18], has the potential to address these issues. From a high level, DGM can be viewed as a deep learning approach to solving weak formulation problems for PDEs, but without the need to construct a mesh.

In this section, we describe the DGM approach to solving two-asset pricing PDEs. We also outline the steps required to solve (1.1), (3.2) and (3.4). These solutions are then compared against various benchmarks, and insights on price impacts are subsequently derived.
Consider a two-asset option pricing PDE:
\[
\begin{cases}
\mathcal{L}V(t, s_1, s_2) = 0, & (t, s_1, s_2) \in [0, T]\times(\mathbb{R}^+)^2, & \text{(interior)}\\
V(T, s_1, s_2) = h(s_1, s_2), & (s_1, s_2) \in (\mathbb{R}^+)^2, & \text{(terminal)}\\
V(t, s_1, s_2) = g(t, s_1, s_2), & (t, s_1, s_2) \in [0, T]\times\partial(\mathbb{R}^+)^2. & \text{(boundary)}
\end{cases} \tag{4.1}
\]
For a Sobolev space $H$ of functions on $[0, T]\times(\mathbb{R}^+)^2$, the equivalent weak formulation of (4.1) is:
\[
\begin{cases}
\langle\mathcal{L}V, u\rangle = 0 & \forall u \in H,\\
\langle V - h, v\rangle = 0 & \forall v \in \bar H,\\
\langle V - g, w\rangle = 0 & \forall w \in \partial H.
\end{cases} \tag{4.2}
\]
The regular Galerkin Method requires the careful selection of a set of basis functions $(\phi_1, \phi_2, \ldots, \phi_N)$ that characterizes a finite approximation space $E_N \subset H$. A unique best approximation $\hat V_N$ of $V$ can be determined by projecting the PDE onto $E_N$. As the dimension of the approximation space increases, the projection theorem gives a unique best approximation $\hat V_i$ in each approximation space $E_i$. The result is a sequence of approximators $\{\hat V_i\}_{i = N, N+1, \ldots}$ that converges to $V$ by the completeness of $H$. The rigorous formulation of this method can be found in [12].

DGM deviates from the regular Galerkin Method by assuming a neural network $f(t, s_1, s_2; \theta) : \mathbb{R}^3 \to \mathbb{R}$ has the potential to capture the behavior of $V$. The network is subsequently initialized and trained with information gathered from the domain of $V$. This requires the selection of a meaningful objective function. The objective function suggested in [18] closely resembles the weak formulation (4.2) with the $L^2$ inner product. The choice of the $L^2$ norm is supported by literature such as [3]. The construction of the objective function proceeds as follows. For unit vectors $u \in H$, $v \in \bar H$, and $w \in \partial H$, we apply the Cauchy-Schwarz inequality to the weak formulation equations in (4.2). Next, we sum the resulting terms, thus producing the objective function defined by
\[
J = \|\mathcal{L}V\|^2_{[0,T]\times(\mathbb{R}^+)^2} + \|V - h\|^2_{(\mathbb{R}^+)^2} + \|V - g\|^2_{[0,T]\times\partial(\mathbb{R}^+)^2}. \tag{4.3}
\]
During the implementation stage, distributions are selected to generate points in the domain.
This gives rise to an $L^2$ distributional norm $\|f(x)\|^2_{\mathcal{D},\phi} = \int_{\mathcal{D}} |f(x)|^2\phi(x)\,dx$, where $\phi(x)$ is a probability density function on the domain. Thus, the choice of $\phi(x)$ will significantly impact the performance of this method. A suitable objective function can be reformulated for our option pricing problems as:
\[
J(\theta) = J_1(\theta) + J_2(\theta) + J_3(\theta), \tag{4.4}
\]
where $J_i(\theta)$, for $i = 1, 2, 3$, are defined as follows:
\[
J_1(\theta) = \|\mathcal{L}f(t, s_1, s_2; \theta)\|^2_{[0,T]\times(\mathbb{R}^+)^2,\,\phi_1}. \tag{4.5}
\]
This objective function measures how well the network satisfies the pricing PDE's differential operator.
\[
J_2(\theta) = \|f(T, s_1, s_2; \theta) - h(s_1, s_2)\|^2_{(\mathbb{R}^+)^2,\,\phi_2}. \tag{4.6}
\]
\[
J_3(\theta) = \|f(t, s_1, s_2; \theta) - g(t, s_1, s_2)\|^2_{[0,T]\times\partial(\mathbb{R}^+)^2,\,\phi_3}. \tag{4.7}
\]
The last objective function characterizes boundary conditions, or artificially created boundaries from asymptotics. For European style option pricing PDEs, these boundaries exist when underlying prices reach 0. The asymptotics appear when a price cap is imposed on the underlyings.

The training data are generated as a tuple $(x^{(i)}, x^{(T)}, x^{(b)})$, where $x^{(i)} = (t, s_1, s_2) \sim \phi_1$, $x^{(T)} = (T, s_1, s_2) \sim \phi_2$ and $x^{(b)} = (t, s_1^{(b)}, s_2^{(b)}) \sim \phi_3$. In particular, $x^{(i)}$, $x^{(T)}$ and $x^{(b)}$ are generated from the interior, terminal and boundary (or artificial boundary) of the PDE respectively. The generated data are used to compute the objective function (4.4).

In the next phase, we apply a gradient descent algorithm, in the hope of eventually finding a set of parameters $\theta$ for $f(t, s_1, s_2; \theta)$ that produces a minimum of the objective function. In fact, Correia et al. (2019) [2] mention that DGM is strictly an optimization problem. Validation is unnecessary because the objective function directly characterizes the weak formulation of the PDE. This also means a network that produces a zero-valued objective function is the analytical solution of the PDE.

The network architecture adopted in [18] contains 1 dense layer and 3 DGM layers, all embedded with the tanh activation function. We modify the structure and incorporate the swish activation function [17]. Detailed arguments on the effectiveness of using swish may be found in [6]. A summary of the different types of activation functions used in our network architecture is included in Table 1.

Table 1: Activation Functions
Sigmoid    $\sigma(x) = \dfrac{1}{1 + e^{-x}}$
Tanh       $\sigma(x) = \dfrac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$
Swish      $\sigma(x) = \dfrac{x}{1 + e^{-x}}$

Figure 1 captures the DGM network structure.
Fig. 1: DGM Network Architecture (input $X$, a swish dense layer, three DGM layers, linear output $Y$)
Our modified DGM layer is inspired by the Gated Recurrent Unit of Chung et al. (2014) [7]. Figure 2 captures the structure of each modified DGM layer.
Fig. 2: Modified DGM Layer (inputs $H_i$ and $X$; gates $F$, $U$, $Q$, $O$; output $H_{i+1}$). Legend: Sigmoid layer, Tanh layer, Swish layer, $+$ tensor addition, $\times$ Hadamard product.

The mathematical operation behind our entire DGM network can be represented by the following set of equations:
\[
\begin{aligned}
H_1 &= \mathrm{Swish}\big(W_1 X + b_1\big),\\
F_l &= \mathrm{Sigmoid}\big(W_{fx,l} X + W_{fh,l} H_l + b_{f,l}\big), \quad \text{for } l = 1, 2, 3,\\
U_l &= \mathrm{Sigmoid}\big(W_{ux,l} X + W_{uh,l} H_l + b_{u,l}\big),\\
Q_l &= \mathrm{Tanh}\big(W_{qx,l} X + W_{qh,l} H_l + b_{q,l}\big),\\
O_l &= \mathrm{Swish}\big(W_{ox,l}(U_l \circ Q_l) + b_{o,l}\big),\\
H_{l+1} &= F_l \circ H_l + O_l,\\
Y &= H_4 W_y + b_y,
\end{aligned}
\]
where $W$ are the weights, $b$ are the biases and $\circ$ is the Hadamard product.

For iteration size $I$ and batch size $B$, we provide a general overview of the implementation of DGM in Algorithm 4.1.

Algorithm 4.1 Deep Galerkin Method for Option Pricing
  Initialize learning rate $\alpha$ and network parameters $\theta$
  for $i = 1$ to $I$ do
    Generate interior sample points $x^{(i)}_i = [x^{(i)}_{i1}, x^{(i)}_{i2}, \ldots, x^{(i)}_{iB}]$ from $\phi_1$
    Generate terminal sample points $x^{(T)}_i = [x^{(T)}_{i1}, x^{(T)}_{i2}, \ldots, x^{(T)}_{iB}]$ from $\phi_2$
    Generate boundary sample points $x^{(b)}_i = [x^{(b)}_{i1}, x^{(b)}_{i2}, \ldots, x^{(b)}_{iB}]$ from $\phi_3$
    Compute the loss function: $J(\theta) = \|\mathcal{L}f(x^{(i)}_i; \theta)\|^2 + \|f(x^{(T)}_i; \theta) - h(x^{(T)}_i)\|^2 + \|f(x^{(b)}_i; \theta) - g(x^{(b)}_i)\|^2$
    Take a descent step: $\theta^{(new)} = \theta^{(old)} - \alpha\,\dfrac{\partial J(\theta)}{\partial\theta^{(old)}}$
    Apply decay to the learning rate $\alpha$
  end for
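The layer equations above translate into a short forward pass. The sketch below is a dependency-free illustration with plain Python lists, not the authors' implementation. It assumes that the output gate acts on the product of $U_l$ and the candidate $Q_l$, and the parameter dictionary layout, the helper names, and the initialisation scheme are all hypothetical.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def swish(x):
    return x * sigmoid(x)


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gate(fn, Wx, X, Wh, H, b):
    """fn(Wx X + Wh H + b), applied element-wise."""
    return [fn(a + c + d) for a, c, d in zip(matvec(Wx, X), matvec(Wh, H), b)]


def dgm_layer(X, H, p):
    F = gate(sigmoid, p["Wfx"], X, p["Wfh"], H, p["bf"])
    U = gate(sigmoid, p["Wux"], X, p["Wuh"], H, p["bu"])
    Q = gate(math.tanh, p["Wqx"], X, p["Wqh"], H, p["bq"])
    UQ = [u * q for u, q in zip(U, Q)]          # Hadamard product
    O = [swish(z + b) for z, b in zip(matvec(p["Wo"], UQ), p["bo"])]
    return [f * h + o for f, h, o in zip(F, H, O)]


def dgm_forward(X, params):
    """Swish dense layer, stacked DGM layers, then a linear output."""
    H = [swish(z + b) for z, b in zip(matvec(params["W1"], X), params["b1"])]
    for p in params["layers"]:
        H = dgm_layer(X, H, p)
    return sum(w * h for w, h in zip(params["Wy"], H)) + params["by"]


def init_params(n_in=3, width=4, n_layers=3, scale=0.1, seed=0):
    """Hypothetical Gaussian initialisation with zero biases."""
    rng = random.Random(seed)
    mat = lambda rows, cols: [[scale * rng.gauss(0.0, 1.0) for _ in range(cols)]
                              for _ in range(rows)]
    layers = []
    for _ in range(n_layers):
        p = {"Wo": mat(width, width), "bo": [0.0] * width}
        for g in ("f", "u", "q"):
            p["W" + g + "x"] = mat(width, n_in)
            p["W" + g + "h"] = mat(width, width)
            p["b" + g] = [0.0] * width
        layers.append(p)
    return {"W1": mat(width, n_in), "b1": [0.0] * width, "layers": layers,
            "Wy": [scale * rng.gauss(0.0, 1.0) for _ in range(width)], "by": 0.0}
```

A call such as `dgm_forward([t, s1, s2], init_params())` returns a scalar, which plays the role of $f(t, s_1, s_2; \theta)$ in Algorithm 4.1. Training would additionally need gradients with respect to the parameters, which a deep learning framework provides.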
5. Experiments.
We run several experiments here to learn the partial (3.2) and full impact (3.4) option pricing PDEs. The adoption of transfer learning is justified by the similarity of these PDEs.

Suppose an undergraduate student is given the task of learning graduate material. It is unlikely the student will perform very well. However, if that same student were to learn the prerequisite knowledge beforehand and then reattempt, the student would stand a better chance. In machine learning, this concept is often referred to as transfer learning. It is the method of applying prior knowledge to related problems that are difficult to solve directly. Bengio (2012) [4] goes into extensive detail on transfer learning. Weiss et al. (2016) [19] provide a formal definition for this method in terms of a domain $\mathcal{D} = \{\mathcal{X}, \phi_{\mathcal{X}}\}$ and a learning task $\mathcal{T} = \{\mathcal{Y}, f(\cdot)\}$ ($\mathcal{X}$: feature space, $\phi_{\mathcal{X}}$: feature distribution, $\mathcal{Y}$: label space, $f(\cdot)$: predictive function).

Definition (Transfer Learning). Consider a source domain and learning task pair $\mathcal{D}_s = \{\mathcal{X}_s, \phi_{\mathcal{X}_s}\}$, $\mathcal{T}_s = \{\mathcal{Y}_s, f_s(\cdot)\}$, and a target domain and learning task $\mathcal{D}_t = \{\mathcal{X}_t, \phi_{\mathcal{X}_t}\}$, $\mathcal{T}_t = \{\mathcal{Y}_t, f_t(\cdot)\}$. Transfer learning is the process of using relevant information of $f_s(\cdot)$ to improve the predictive capability of $f_t(\cdot)$.

By adopting transfer learning, we may train DGM nets to learn the FLMM pricing PDEs (3.2) and (3.4). These PDEs reduce to the 2-dimensional BS PDE for the spread option (1.1) when the price impact vanishes. Therefore, we should first train an initial DGM net to learn the relatively simpler BS PDE (1.1). Subsequently, we may modify the objective function (4.4) in accordance with the more complicated PDE with price impact, and then further train the network to learn (3.2) and (3.4).

For some asset price cap $C$, we restrict the domain to the finite cube $[0, T]\times[0, C]^2$. This allows us to impose asymptotics as boundary conditions (see Section 7.5 for more details), and in turn get faster convergence. During implementation, we use the mean squared error (MSE) as an estimator for the $L^2$ norms in (4.4).
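The benefit of warm-starting from a related problem can be seen even on a toy objective. The sketch below is only an illustration of the idea and is unrelated to the actual DGM training: it minimises two nearby quadratic objectives by gradient descent and compares the number of iterations needed on the target problem from a cold start and from the parameters learned on the source problem. The objectives, learning rate, and tolerance are arbitrary assumptions.

```python
def gradient_descent(grad, theta, lr=0.1, tol=1e-6, max_iter=10_000):
    """Plain gradient descent; returns the final parameter and the step count."""
    for step in range(max_iter):
        g = grad(theta)
        if abs(g) < tol:
            return theta, step
        theta -= lr * g
    return theta, max_iter


def transfer_demo(source_opt=5.0, target_opt=5.2, theta0=0.0):
    """Compare cold-start and warm-start iteration counts on the target task.

    Source objective: (theta - source_opt)^2; target: (theta - target_opt)^2."""
    source_grad = lambda th: 2.0 * (th - source_opt)
    target_grad = lambda th: 2.0 * (th - target_opt)
    theta_source, _ = gradient_descent(source_grad, theta0)
    _, cold_steps = gradient_descent(target_grad, theta0)
    _, warm_steps = gradient_descent(target_grad, theta_source)
    return cold_steps, warm_steps
```

In the DGM setting the analogue is to keep the network weights trained on (1.1) and continue training after swapping the interior term of the objective for that of the PDE with price impact.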
In the calculation of the MSE, $N$ is the mini-batch size for training; it should be large to ensure the accuracy of the estimator.

To implement DGM for the PDEs (1.1), (3.2) and (3.4), we can follow Algorithm 4.1 and define a distinct objective function for each of the PDEs. The functions $J_2(\theta)$ and $J_3(\theta)$ are shared amongst these objective functions:
\[
\begin{aligned}
\hat J^{(b)}(\theta) &= \hat J^{(b)}_1(\theta) + \hat J_2(\theta) + \hat J_3(\theta), && \text{(BS model)}\\
\hat J^{(p)}(\theta) &= \hat J^{(p)}_1(\theta) + \hat J_2(\theta) + \hat J_3(\theta), && \text{(FLMM with partial impact)}\\
\hat J^{(f)}(\theta) &= \hat J^{(f)}_1(\theta) + \hat J_2(\theta) + \hat J_3(\theta). && \text{(FLMM with full impact)}
\end{aligned} \tag{5.1}
\]
Details on the objective functions $\hat J^{(b)}_1(\theta)$, $\hat J^{(p)}_1(\theta)$, $\hat J^{(f)}_1(\theta)$, $\hat J_2(\theta)$ and $\hat J_3(\theta)$ can be found in Section 7.5.

The sampling method is entirely problem dependent; the user should focus on sampling from a sub-domain of highly probable option input parameters. In the case of the spread option, we noticed that the objective function converges faster when we choose sampling distributions that produce more non-zero option values. We present details on our sampling distributions for each scenario in Table 2.
Table 2: Sampling Method
Density          Support                                      Sampling Distribution
$\hat\phi_1$     $(t, s_1, s_2) \in [0, T]\times[0, C]^2$     $t \sim U(0, T)$, $s_1 \sim C\beta(\cdot,\cdot)$, $s_2 \sim C\beta(\cdot,\cdot)$
$\hat\phi_2$     $(s_1, s_2) \in [0, C]^2$                    $s_1 \sim C\beta(\cdot,\cdot)$, $s_2 \sim C\beta(\cdot,\cdot)$
$\hat\phi_3$     $(t, s_1) \in [0, T]\times[0, C]$            $t \sim U(0, T)$, $s_1 \sim C\beta(\cdot,\cdot)$
$\hat\phi_4$     $(t, s_2) \in [0, T]\times[0, C]$            $t \sim U(0, T)$, $s_2 \sim C\beta(\cdot,\cdot)$

We present histograms for our sampling method in Figure 3. One may notice we sample $t$ uniformly; this is because we want option prices spread evenly across a span of times to maturity. For the assets, we adopted two beta distributions, one slightly more centered than the other. The reason is that we want the option value to be non-zero, so with high probability the first asset should have a greater price than the second.

Fig. 3: Sampling Method

For all 3 PDEs ((1.1), (3.2) and (3.4)), the shared option parameters we used are: $k = \ldots$, $r = \ldots$, $\rho = \ldots$, $\sigma_1 = \ldots$, $\sigma_2 = \ldots$, $T = \ldots$. For $(s_1, s_2) \in [\ldots, \ldots]^2$, the asset price cap is set to $C = \ldots$.
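The interior sampling scheme of Table 2 (uniform in time, scaled beta distributions for the asset prices) can be sketched with the standard library. The beta shape parameters used below are hypothetical placeholders chosen so that the first asset tends to be priced above the second; they are not the values used in the experiments, and the function name is likewise an assumption.

```python
import random


def sample_interior(n, T, C, shape1=(3.0, 2.0), shape2=(2.0, 3.0), seed=0):
    """Draw n interior points (t, s1, s2) with t ~ U(0, T) and
    s_j ~ C * Beta(a_j, b_j). The shape parameters are illustrative only."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        t = rng.uniform(0.0, T)
        s1 = C * rng.betavariate(*shape1)
        s2 = C * rng.betavariate(*shape2)
        points.append((t, s1, s2))
    return points
```

Skewing the two beta distributions in opposite directions makes $s_1 > s_2 + k$ more likely, so more sampled points carry a non-zero payoff.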
*Benchmarked against FFT with grid size N=512.
The trained net matches the FFT results extremely well. For this particular PDE, our modified DGM net outperforms the canonical DGM net. Although the trained net performs well on $[0, C]^2$, we should not expect the same level of performance to extend to $(\mathbb{R}^+)^2$.

Next, we take the previously trained model and apply transfer learning by switching the loss function to $\hat J^{(p)}(\theta)$ of (5.1). The results are illustrated in Figure 5.
Fig. 5: Spread Option (Partial Impact FLMM) *Benchmarked against FFT with grid size N=512.
From this experiment, we observe a price premium for the partial impact model. The premium is the result of illiquidity; it is greatest for at-the-money options with high underlying asset prices. Furthermore, the liquidity premium increases as the cost-per-share parameter $\epsilon$ increases.

For the full impact model, we fetch the pre-trained network for the regular BS model and switch the loss function to $\hat J^{(f)}(\theta)$ of (5.1). The results are illustrated in Figure 6.
*Benchmarked against FFT with grid size N=512.
From Figure 6, we observe a price premium for the full impact model. Once again, this price premium is greatest for at-the-money options with high underlying asset prices. The liquidity premium also increases as the cost-per-share parameter $\epsilon$ increases.

Fig. 7: Spread Option (Partial vs Full)

It is evident from Figure 7 that the full impact model carries a greater liquidity premium than the partial impact model. This is not a surprise because, under full impact assumptions, all market trading activity has an impact on asset prices.
6. Conclusions.
We extended the FLMM of Pirvu and Zhang (2020) [21] to price spread options. We established existence and uniqueness for the full and partial impact model SDEs driving the underlying asset prices and for the PDEs characterizing the spread option prices. We developed a variation of the DGM method, essentially a long short-term memory (LSTM) network with the swish activation function, to numerically solve the option pricing PDEs.

Our DGM network has the ability to learn the PDEs (3.4) and (3.2). The learning speed can be improved by first learning the PDE of the impactless spread option (1.1) and then applying transfer learning. Our results indicate that the full impact model requires a greater liquidity value adjustment than the partial impact model. This finding is consistent because the full impact model includes all market trading activities. This paper may be useful for commodity traders who deal with illiquid underlyings.
7. Appendix.
This section includes some of the formulas and proofs left out from the main body.
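Proof. Suppose there exists an equivalent measure $\tilde{\mathbb{P}}$ generated by some process $\Theta(t)$ such that, under $\tilde{\mathbb{P}}$, $dS_1(t)$ and $dS_2(t)$ have the risk-less drift, with
\[
Z(t) = \exp\Big(-\int_0^t \langle\Theta(u), dW(u)\rangle - \tfrac12\int_0^t \|\Theta(u)\|^2\,du\Big).
\]
Then under $\tilde{\mathbb{P}}$
\[
d\tilde W_1(t) = dW_1(t) + \Theta_1(t)\,dt, \quad \text{and} \quad d\tilde W_2(t) = dW_2(t) + \Theta_2(t)\,dt.
\]
Under $\tilde{\mathbb{P}}$ we have the following dynamics:
\[
\begin{aligned}
dS_1(t) &= \Big(\bar\mu_1\big(S(t)\big) - \bar\sigma_{11}\big(S(t)\big)\Theta_1(t) - \bar\sigma_{12}\big(S(t)\big)\Theta_2(t)\Big)dt + \bar\sigma_{11}\big(S(t)\big)\,d\tilde W_1(t) + \bar\sigma_{12}\big(S(t)\big)\,d\tilde W_2(t),\\
dS_2(t) &= \Big(\bar\mu_2(t) - \bar\sigma_{21}(t)\Theta_1(t) - \bar\sigma_{22}(t)\Theta_2(t)\Big)dt + \bar\sigma_{21}\,d\tilde W_1(t) + \bar\sigma_{22}\,d\tilde W_2(t).
\end{aligned}
\]
Imposing the risk-less return rate under $\tilde{\mathbb{P}}$ leads to the following linear system:
\[
\begin{bmatrix} \bar\sigma_{11} & \bar\sigma_{12}\\ \bar\sigma_{21} & \bar\sigma_{22} \end{bmatrix}
\begin{bmatrix} \Theta_1(t)\\ \Theta_2(t) \end{bmatrix}
=
\begin{bmatrix} \bar\mu_1 - r\\ \bar\mu_2 - r \end{bmatrix}.
\]
The system has a unique solution $\tilde{\mathbb{P}}$ almost surely when the determinant is not zero, that is $\bar\sigma_{11}(t)\bar\sigma_{22}(t) - \bar\sigma_{12}(t)\bar\sigma_{21}(t) \ne 0$, $\tilde{\mathbb{P}}\otimes dt$ almost surely. Due to the continuity of our processes, it is sufficient to conclude that for all $t$ we have $\bar\sigma_{11}(t)\bar\sigma_{22}(t) - \bar\sigma_{12}(t)\bar\sigma_{21}(t) \ne 0$, $\tilde{\mathbb{P}}$ almost surely. Therefore, a necessary condition for the finite liquidity market model to be complete is:
\[
\frac{S_1(t)}{S_2(t)} \ne \frac{\sigma_2\rho}{\sigma_1\sqrt{1 - \rho^2}}\,\lambda\big(t, S_1(t), S_2(t)\big) f_{s_2}, \quad \tilde{\mathbb{P}} \text{ almost surely}.
\]
This condition is met in light of the continuous distribution of our processes.

Hurd and Zhou (2010) [13] pricing formula for the spread option in the BS model. Let $x = \big(\log(s_1), \log(s_2)\big)$ be the log initial asset prices. Then
\[
V^{(BS)}(t, s_1, s_2) = \frac{k e^{-r(T-t)}}{(2\pi)^2}\iint_{\mathbb{R}^2 + i\epsilon} e^{i u x}\,\hat P(u)\,\Phi_x(u, \tau)\,du, \tag{7.1}
\]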
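where
\[
\hat P(u) = \frac{\Gamma\big(i(u_1 + u_2) - 1\big)\,\Gamma(-iu_2)}{\Gamma(iu_1 + 1)}, \qquad
\Phi_x(u, \tau) = \exp\Big\{ i u\big(r - \tfrac12\mathrm{tr}(\Sigma)\big)\tau - \tfrac12 u\Sigma u^{\top}\tau \Big\}.
\]
The $\epsilon = (\epsilon_1, \epsilon_2)$ term is a dampening factor with the restrictions $\epsilon_2 > 0$ and $\epsilon_1 + \epsilon_2 < -1$.

First Order Greek
\[
\Delta(t) = \begin{bmatrix} \Delta_1\\ \Delta_2 \end{bmatrix}(t)
= \frac{k e^{-r(T-t)}}{(2\pi)^2}\,T^{(1)}\bar\Delta(t)
= \frac{k e^{-r(T-t)}}{(2\pi)^2}\begin{bmatrix} \frac{1}{s_1}\bar\Delta_1\\ \frac{1}{s_2}\bar\Delta_2 \end{bmatrix}(t)
\]
Second Order Greek
\[
\Gamma(t) = \begin{bmatrix} \Gamma_{11} & \Gamma_{12}\\ \Gamma_{21} & \Gamma_{22} \end{bmatrix}(t)
= -\frac{k e^{-r(T-t)}}{(2\pi)^2}\Big(T^{(2)}\bar\Delta(t) + T^{(1)}\bar\Gamma(t)T^{(1)}\Big)
= -\frac{k e^{-r(T-t)}}{(2\pi)^2}
\begin{bmatrix}
\frac{1}{s_1^2}\big(\bar\Delta_1 + \bar\Gamma_{11}\big) & \frac{1}{s_1 s_2}\bar\Gamma_{12}\\
\frac{1}{s_1 s_2}\bar\Gamma_{21} & \frac{1}{s_2^2}\big(\bar\Delta_2 + \bar\Gamma_{22}\big)
\end{bmatrix}(t)
\]
Third Order Greek
\[
\mathrm{Spd}(t) = \left\{
\begin{bmatrix} \mathrm{Spd}_{111} & \mathrm{Spd}_{121}\\ \mathrm{Spd}_{211} & \mathrm{Spd}_{221} \end{bmatrix},
\begin{bmatrix} \mathrm{Spd}_{112} & \mathrm{Spd}_{122}\\ \mathrm{Spd}_{212} & \mathrm{Spd}_{222} \end{bmatrix}
\right\}(t)
= \frac{k e^{-r(T-t)}}{(2\pi)^2}\Big(2T^{(3)}\bar\Delta(t) + 3T^{(2)}\bar\Gamma(t)T^{(1)} - T^{(1)}\,\overline{\mathrm{Spd}}(t)\big(T^{(1)}\big)^2\Big)
\]
\[
= \frac{k e^{-r(T-t)}}{(2\pi)^2}\left\{
\begin{bmatrix}
\frac{1}{s_1^3}\big(2\bar\Delta_1 + 3\bar\Gamma_{11} - \overline{\mathrm{Spd}}_{111}\big) & -\frac{1}{s_1^2 s_2}\overline{\mathrm{Spd}}_{121}\\
\frac{1}{s_1^2 s_2}\big(\bar\Gamma_{21} - \overline{\mathrm{Spd}}_{211}\big) & -\frac{1}{s_1 s_2^2}\overline{\mathrm{Spd}}_{221}
\end{bmatrix},
\begin{bmatrix}
-\frac{1}{s_1^2 s_2}\overline{\mathrm{Spd}}_{112} & \frac{1}{s_1 s_2^2}\big(\bar\Gamma_{12} - \overline{\mathrm{Spd}}_{122}\big)\\
-\frac{1}{s_1 s_2^2}\overline{\mathrm{Spd}}_{212} & \frac{1}{s_2^3}\big(2\bar\Delta_2 + 3\bar\Gamma_{22} - \overline{\mathrm{Spd}}_{222}\big)
\end{bmatrix}
\right\}(t)
\]
To derive the spread option Delta, recall $x = \log(s)$ and
\[
\Delta(t) = \frac{\partial V^{(BS)}(t, s)}{\partial s} = \frac{\partial x}{\partial s}\,\frac{\partial V^{(BS)}(t, s)}{\partial x}.
\]
If we let
\[
T^{(1)} = \frac{\partial x}{\partial s} = \begin{bmatrix} \frac{1}{s_1} & 0\\ 0 & \frac{1}{s_2} \end{bmatrix},
\]
then by taking the matrix derivative of (7.1), we have
\[
\Delta(t) = \frac{k e^{-r\tau}}{(2\pi)^2}\,T^{(1)}\,\frac{\partial}{\partial x}\Big(\iint_{\mathbb{R}^2 + i\epsilon} e^{i u x}\,\Phi(u, \tau)\,\hat P(u)\,du\Big).
\]
By the Dominated Convergence Theorem, the order of differentiation and integration can be switched, and it follows that
\[
\Delta(t) = \frac{k e^{-r\tau}}{(2\pi)^2}\,T^{(1)}\iint_{\mathbb{R}^2 + i\epsilon} \frac{\partial}{\partial x}\big(e^{i u x}\big)\,\Phi(u, \tau)\,\hat P(u)\,du
= \frac{k e^{-r\tau}}{(2\pi)^2}\,T^{(1)}\iint_{\mathbb{R}^2 + i\epsilon} i u\,e^{i u x}\,\Phi(u, \tau)\,\hat P(u)\,du.
\]
If we let
\[
\bar\Delta(t) = \iint_{\mathbb{R}^2 + i\epsilon} i u\,e^{i u x}\,\Phi(u, \tau)\,\hat P(u)\,du,
\]
then we have
\[
\Delta(t) = \frac{k e^{-r\tau}}{(2\pi)^2}\,T^{(1)}\bar\Delta(t) = \frac{k e^{-r\tau}}{(2\pi)^2}\begin{bmatrix} \frac{1}{s_1}\bar\Delta_1\\ \frac{1}{s_2}\bar\Delta_2 \end{bmatrix}(t).
\]
The spread option Gamma can be defined as:
\[
\Gamma(t) = \frac{\partial\Delta(t)}{\partial s} = \frac{k e^{-r(T-t)}}{(2\pi)^2}\Big(\frac{\partial T^{(1)}}{\partial s}\bar\Delta(t) + T^{(1)}\frac{\partial\bar\Delta(t)}{\partial x}\frac{\partial x}{\partial s}\Big).
\]
Let
\[
\bar\Gamma(t) = \iint_{\mathbb{R}^2 + i\epsilon} (u\otimes u)\,e^{i u x}\,\Phi(u, \tau)\,\hat P(u)\,du, \quad \text{and} \quad
T^{(2)} = \left\{ \begin{bmatrix} \frac{1}{s_1^2} & 0\\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0\\ 0 & \frac{1}{s_2^2} \end{bmatrix} \right\},
\]
then we have
\[
\begin{aligned}
\Gamma(t) &= \frac{k e^{-r(T-t)}}{(2\pi)^2}\Big\{ -T^{(2)}\iint_{\mathbb{R}^2 + i\epsilon} i u\,e^{i u x}\,\Phi(u, \tau)\,\hat P(u)\,du - T^{(1)}\Big(\iint_{\mathbb{R}^2 + i\epsilon} (u\otimes u)\,e^{i u x}\,\Phi(u, \tau)\,\hat P(u)\,du\Big)T^{(1)} \Big\}\\
&= -\frac{k e^{-r(T-t)}}{(2\pi)^2}\Big(T^{(2)}\bar\Delta(t) + T^{(1)}\bar\Gamma(t)T^{(1)}\Big)\\
&= -\frac{k e^{-r(T-t)}}{(2\pi)^2}
\begin{bmatrix}
\frac{1}{s_1^2}\big(\bar\Delta_1 + \bar\Gamma_{11}\big) & \frac{1}{s_1 s_2}\bar\Gamma_{12}\\
\frac{1}{s_1 s_2}\bar\Gamma_{21} & \frac{1}{s_2^2}\big(\bar\Delta_2 + \bar\Gamma_{22}\big)
\end{bmatrix}(t).
\end{aligned}
\]
The spread option Spd can be defined as:
\[
\mathrm{Spd}(t) = \frac{\partial\Gamma(t)}{\partial s} = -\frac{k e^{-r(T-t)}}{(2\pi)^2}\Big\{ \frac{\partial T^{(2)}}{\partial s}\bar\Delta(t) + T^{(2)}\frac{\partial\bar\Delta(t)}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial T^{(1)}}{\partial s}\bar\Gamma(t)T^{(1)} + T^{(1)}\Big(\frac{\partial\bar\Gamma(t)}{\partial x}\frac{\partial x}{\partial s}T^{(1)} + \bar\Gamma(t)\frac{\partial T^{(1)}}{\partial s}\Big) \Big\}.
\]
Let
\[
\overline{\mathrm{Spd}}(t) = \iint_{\mathbb{R}^2 + i\epsilon} i(u\otimes u\otimes u)\,e^{i u x}\,\Phi(u, \tau)\,\hat P(u)\,du, \quad \text{and} \quad
T^{(3)} = \left\{ \left( \begin{bmatrix} \frac{1}{s_1^3} & 0\\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0\\ 0 & 0 \end{bmatrix} \right), \left( \begin{bmatrix} 0 & 0\\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0\\ 0 & \frac{1}{s_2^3} \end{bmatrix} \right) \right\},
\]
then
\[
\begin{aligned}
\mathrm{Spd}(t) &= -\frac{k e^{-r\tau}}{(2\pi)^2}\Big\{ -2T^{(3)}\bar\Delta(t) - T^{(2)}\bar\Gamma(t)T^{(1)} - T^{(2)}\bar\Gamma(t)T^{(1)} + T^{(1)}\Big(\overline{\mathrm{Spd}}(t)\big(T^{(1)}\big)^2 - \bar\Gamma(t)T^{(2)}\Big) \Big\}\\
&= \frac{k e^{-r\tau}}{(2\pi)^2}\Big(2T^{(3)}\bar\Delta(t) + 3T^{(2)}\bar\Gamma(t)T^{(1)} - T^{(1)}\,\overline{\mathrm{Spd}}(t)\big(T^{(1)}\big)^2\Big)\\
&= \frac{k e^{-r\tau}}{(2\pi)^2}\left\{
\begin{bmatrix}
\frac{1}{s_1^3}\big(2\bar\Delta_1 + 3\bar\Gamma_{11} - \overline{\mathrm{Spd}}_{111}\big) & -\frac{1}{s_1^2 s_2}\overline{\mathrm{Spd}}_{121}\\
\frac{1}{s_1^2 s_2}\big(\bar\Gamma_{21} - \overline{\mathrm{Spd}}_{211}\big) & -\frac{1}{s_1 s_2^2}\overline{\mathrm{Spd}}_{221}
\end{bmatrix},
\begin{bmatrix}
-\frac{1}{s_1^2 s_2}\overline{\mathrm{Spd}}_{112} & \frac{1}{s_1 s_2^2}\big(\bar\Gamma_{12} - \overline{\mathrm{Spd}}_{122}\big)\\
-\frac{1}{s_1 s_2^2}\overline{\mathrm{Spd}}_{212} & \frac{1}{s_2^3}\big(2\bar\Delta_2 + 3\bar\Gamma_{22} - \overline{\mathrm{Spd}}_{222}\big)
\end{bmatrix}
\right\}(t).
\end{aligned}
\]
The procedure for determining higher order Greeks is repetitive. All the Greeks are linear combinations of contour integrals of the particular form:
\[
\overline{\mathrm{Greek}}(t, s_1, s_2) = \iint_{\mathbb{R}^2 + i\epsilon} f^{\otimes}(u)\,e^{i u x}\,\Phi(u, \tau)\,\hat P(u)\,du, \tag{7.2}
\]
where $f^{\otimes}(u)$ is some complex tensor polynomial function. For example, $\Gamma(t)$ is a linear combination of the contour integrals $\bar\Delta(t)$ and $\bar\Gamma(t)$, with respective tensor polynomial functions $iu$ and $u\otimes u$.

Proof.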
According to Proposition (2.4), the system of SDEs in (3.2) emit weak so-lutions when the diffusion functions σ ˚ p t, s , s q and σ ˚ p t, s , s q are uniformly Lipshitz continuous. We can invoke The-orem 2.1 of Pirvu and Zhang (2020) [21], and check whether the regularity conditions(1)-(3) are satisfied. We reinstate the conditions: p q } λ p s f s s ` s f s s ` f s ` s f s ` s f s s ` s f s s ` s f s s q} ă 8 , p q } ` λ s ` λ s ˘` s f s ` s f s ` s f s ˘ } ă 8 , p q ||| ´ λf s ||| ą δ , for some δ ą . To achieve this, first recall from (7.2) that all the Greeks are just linear combina-tions of the form: Ğ Greek p t, s , s q “ ż ż R ` i(cid:15) f b p u q e i u x Φ p u , τ q ˆ P p u q d u “ e ´ (cid:15) x ż ż R f b p u ` i(cid:15) q e i (cid:60) p u q x Φ p u ` i(cid:15), τ q ˆ P p u ` i(cid:15) q d u “ s (cid:15) s (cid:15) ż ż R f b p u ` i(cid:15) q e i (cid:60) p u q x Φ p u ` i(cid:15), τ q ˆ P p u ` i(cid:15) q d u “ s (cid:15) s (cid:15) Ğ Greek (cid:60) p t, s , s q . Here we use Ğ Greek (cid:60) p t, s , s q to distinguish between contour and real integrals forms.The term e i (cid:60) p u q x lays on the complex unit circle, this results in } Ğ Greek (cid:60) p t, s , s q} ă This manuscript is for review purposes only. KEVIN S. ZHANG, TRAIAN A. PIRVU for all Greeks. Then proving the regularity conditions only boils down to the terms s (cid:15) s (cid:15) .When we rewrite the counter integral as real integrals and substitute the BS SpreadGreeks into Condition (1), we get: λ p t, s , s q ` ke ´ rτ p π q ˘` s ` s s ` (cid:15) s (cid:15) p (cid:60) ` ¯Γ (cid:60) ´ Ě Spd (cid:60) q ´ s ` s s ` (cid:15) s ` (cid:15) Ě Spd (cid:60) ´ ` s s ` (cid:15) s ` (cid:15) ¯Γ (cid:60) ´ ` s s ` (cid:15) s ` (cid:15) Ě Spd (cid:60) ˘ . 
By dropping the constants and bounded real integral terms, we have λ p t, s , s q ` s ` s s ` (cid:15) s (cid:15) ´ s ` s s ` (cid:15) s ` (cid:15) ´ ` s s ` (cid:15) s ` (cid:15) ´ ` s s ` (cid:15) s ` (cid:15) ˘ “ λ p t, s , s q ´ s ´ s ´ s s s ` (cid:15) s ` (cid:15) ¯ . (7.3)Since λ p t, s , s q is only non-zero between S and S , we conclude Expression (7.3) isbounded.Substitute the BS Spread Greeks into Condition (2) and adopting real integrals,we get: ` λ s ` λ s ˘` ke ´ rτ p π q ˘` s ` s s ` (cid:15) s (cid:15) p ¯∆ (cid:60) ` ¯Γ (cid:60) q ` s s ` (cid:15) s ` (cid:15) ¯Γ (cid:60) ˘ . By dropping the constants and bounded real integral terms, we have ` λ s ` λ s ˘` s s ` s s ` (cid:15) s ` (cid:15) ˘ . By the same argument of Condition (1), this term is also bounded.Condition (3) is bounded in s and s by the same logic as Condition (1) and(2). For the t dimension of Condition p q , Γ p t, s , s q diverges as t Ñ T . By design λ p t, s , s q has a higher order decaying that ensures λ p t, s , s q Γ p t, s , s q to stayfinite. Therefore we can always find a δ ą p q to p q hold under r P measure, we can concludethe system of SDE (3.2) emit strong solution under r P . Proof.
By similar argument as 3.1, the system of SDEs in (3.4) emits strong r P solu-tion when the diffusion functions σ ˚˚ p t, s , s q and σ ˚˚ p t, s , s q are uniformly Lipshitzcontinuous. Since these diffusion functions contain partial derivatives of V p t, s , s q (the option with full impact), we need to establish existence of solution for the PDE(3.4).Define Ω “ tp t, x, y q|p t, x, y q P r , T s ˆ p ,
8q ˆ p , , and let X and Y be: X “ (cid:32) V P C , , p Ω q | s.t λV s s , λV s s , λV s s s and λV s s s are bounded on Ω ,and conditions p q , p q , p q are met u , Y “ (cid:61) ` F p V p ε q , ε q ˘ . This manuscript is for review purposes only.
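\[
F(V(\varepsilon), \varepsilon) = V_t + \frac{V_{s_1 s_1}}{2\,(1 - \lambda V_{s_1 s_1})^2}\Big( \sigma_1^2 s_1^2 + \lambda^2 V_{s_1 s_2}^2 \sigma_2^2 s_2^2 + 2\lambda V_{s_1 s_2}\,\rho\,\sigma_1 \sigma_2 s_1 s_2 \Big)
+ \frac{V_{s_1 s_2}}{1 - \lambda V_{s_1 s_1}}\Big( \rho\,\sigma_1 \sigma_2 s_1 s_2 + \lambda V_{s_1 s_2} \sigma_2^2 s_2^2 \Big)
+ \frac{1}{2} V_{s_2 s_2} \sigma_2^2 s_2^2 + r s_1 V_{s_1} + r s_2 V_{s_2} - rV.
\]
When we set $\varepsilon_0 = 0$, $V_0 = V^{(BS)}$ (i.e. the solution of the BS PDE without price impact). According to the Implicit Function Theorem, given that the conditions

i. $F(V_0, \varepsilon_0) = 0$,
ii. $F_V(V_0, \varepsilon_0) : X \to Y$ (the Gateaux derivative of $F$) is bijective,

are met, then there exists a neighborhood $\mathcal{V}$ of $V_0$ and a neighborhood $\Lambda$ of $\varepsilon_0$ such that for every $\varepsilon$ in that neighbourhood, there is a unique element $V(\varepsilon)$ such that $F(V(\varepsilon), \varepsilon) =$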
0. Moreover the mapping Λ Q ε Ñ V p ε q is of class C . Condition (i) istrivially satisfied since F p V , ε q “ V P X (the proof, based on standard arguments, is omitted).Now we are going to argue that the linear mapping F V p V , ε q : X Ñ Y is bijective.According to definition, the Gateaux derivative of F at V in the direction V is F V p V , ε q V “ lim τ Ñ F p V ` τ V, ε q ´ F p V , ε q τ “ V t ` σ s V s s ` σ s V s s ` σ σ s s ρV s s ` rs V s ` rs V s ´ rV. Thus, the operator L “ F V p V , ε q is L “ BB t ` σ s B B s ` σ s B B s ` σ σ s s ρ B B s B s ` rs BB s ` rs BB s ´ r. Condition (ii) boils down to showing that the equation L V “ g has a unique solution V P X for every g P Y . The proof of this, based on standard arguments, is omitted.The proof up to this point ensures the PDE 3.4 has a solution in C , , p Ω q . It remainsto show the Lipshitz requirements of σ ˚˚ p t, s , s q and σ ˚˚ p t, s , s q , which boils downto p q } λ p s V s s s ` s V s s s ` V s s ` s V s s ` s V s s s ` s V s s s ` s V s s s q} ă 8 , p q } ` λ s ` λ s ˘` s V s s ` s V s s ` s V s s ˘ } ă 8 , p q ||| ´ λV s s ||| ą δ , for some δ ą . Since V p t, s , s q is C , , p Ω q , conditions (1) and (2) are satisfied by the piece-wiseproperty of λ p t, s , s q . Condition (3) is also satisfied because λ has an order of O p τ q ,which has a decaying effect on ||| ´ λV s s ||| as t Ñ T . This manuscript is for review purposes only. KEVIN S. ZHANG, TRAIAN A. PIRVU
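As a rough illustration of how sampled mean-squared-error losses of the kind listed in this appendix can be evaluated, the following Python sketch computes an interior residual loss for the classical two-asset Black-Scholes operator and a terminal-payoff loss on uniformly sampled points. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`bs_operator`, `interior_loss`, `terminal_loss`), the parameter values, the uniform sampling, and the use of central finite differences in place of automatic differentiation of a neural network are all hypothetical choices made for illustration. The candidate function $s_1 - s_2 - k e^{-r(T-t)}$ is used as a check because it satisfies the Black-Scholes PDE exactly, so its interior loss should be close to zero.

```python
import numpy as np


def bs_operator(f, t, s1, s2, r, sig1, sig2, rho, h=1e-2):
    """Apply the two-asset Black-Scholes operator to f via central differences.

    Assumption: finite differences stand in for the automatic
    differentiation a neural-network solver would normally use.
    """
    f0 = f(t, s1, s2)
    f_t = (f(t + h, s1, s2) - f(t - h, s1, s2)) / (2 * h)
    f_1 = (f(t, s1 + h, s2) - f(t, s1 - h, s2)) / (2 * h)
    f_2 = (f(t, s1, s2 + h) - f(t, s1, s2 - h)) / (2 * h)
    f_11 = (f(t, s1 + h, s2) - 2 * f0 + f(t, s1 - h, s2)) / h**2
    f_22 = (f(t, s1, s2 + h) - 2 * f0 + f(t, s1, s2 - h)) / h**2
    f_12 = (f(t, s1 + h, s2 + h) - f(t, s1 + h, s2 - h)
            - f(t, s1 - h, s2 + h) + f(t, s1 - h, s2 - h)) / (4 * h**2)
    return (f_t
            + 0.5 * sig1**2 * s1**2 * f_11
            + rho * sig1 * sig2 * s1 * s2 * f_12
            + 0.5 * sig2**2 * s2**2 * f_22
            + r * s1 * f_1 + r * s2 * f_2 - r * f0)


def interior_loss(f, t, s1, s2, r, sig1, sig2, rho):
    """Sample mean of the squared PDE residual over interior points."""
    return np.mean(bs_operator(f, t, s1, s2, r, sig1, sig2, rho) ** 2)


def terminal_loss(f, s1, s2, T, k):
    """Sample mean of the squared mismatch with the payoff (s1 - s2 - k)^+."""
    payoff = np.maximum(s1 - s2 - k, 0.0)
    return np.mean((payoff - f(T, s1, s2)) ** 2)


# Hypothetical parameters and sampling domain [0, T] x [0, C]^2.
r, sig1, sig2, rho, k, T, C = 0.05, 0.2, 0.3, 0.5, 5.0, 1.0, 100.0
rng = np.random.default_rng(0)
N = 1000
t = rng.uniform(0.0, T, N)
s1 = rng.uniform(0.0, C, N)
s2 = rng.uniform(0.0, C, N)


def linear_solution(t, s1, s2):
    """An exact solution of the Black-Scholes PDE, used as a sanity check."""
    return s1 - s2 - k * np.exp(-r * (T - t))


def exact_payoff(t, s1, s2):
    """A candidate that matches the terminal payoff exactly."""
    return np.maximum(s1 - s2 - k, 0.0)


loss_int = interior_loss(linear_solution, t, s1, s2, r, sig1, sig2, rho)
loss_term = terminal_loss(exact_payoff, s1, s2, T, k)
print("interior loss:", loss_int)
print("terminal loss:", loss_term)
```

In a Deep Galerkin style solver the candidate `f` would be a neural network with parameters $\theta$, and the total objective would add boundary terms to these two losses; the sketch above only shows how each sampled average is formed.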
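These functions are the MSE estimators for (4.5), (4.6) and (4.7).
\[
\hat{J}^{(b)}_1(\theta) = \frac{1}{N_1}\sum^{N_1}_{(t,s_1,s_2)\sim\hat{\varphi}_1}\Big( \mathcal{L}^{(b)} f(t, s_1, s_2; \theta) \Big)^2,
\]
\[
\mathcal{L}^{(b)} = \frac{\partial}{\partial t} + \frac{1}{2}\sigma_1^2 s_1^2 \frac{\partial^2}{\partial s_1^2} + \sigma_1 \sigma_2 s_1 s_2 \rho \frac{\partial^2}{\partial s_1 \partial s_2} + \frac{1}{2}\sigma_2^2 s_2^2 \frac{\partial^2}{\partial s_2^2} + r s_1 \frac{\partial}{\partial s_1} + r s_2 \frac{\partial}{\partial s_2} - r,
\]
\[
\hat{J}^{(p)}_1(\theta) = \frac{1}{N_1}\sum^{N_1}_{(t,s_1,s_2)\sim\hat{\varphi}_1}\Big( \mathcal{L}^{(p)} f(t, s_1, s_2; \theta) \Big)^2,
\]
\[
\mathcal{L}^{(p)} = \frac{\partial}{\partial t}
+ \frac{\sigma_1^2 s_1^2 + \sigma_2^2 s_2^2 \lambda^2 \big(V^{(BS)}_{s_1 s_2}\big)^2 + 2\rho\,\sigma_1 \sigma_2 s_1 s_2 \lambda V^{(BS)}_{s_1 s_2}}{2\big(1 - \lambda V^{(BS)}_{s_1 s_1}\big)^2}\frac{\partial^2}{\partial s_1^2}
+ \frac{\rho\,\sigma_1 \sigma_2 s_1 s_2 + \sigma_2^2 s_2^2 \lambda V^{(BS)}_{s_1 s_2}}{1 - \lambda V^{(BS)}_{s_1 s_1}}\frac{\partial^2}{\partial s_1 \partial s_2}
+ \frac{1}{2}\sigma_2^2 s_2^2 \frac{\partial^2}{\partial s_2^2} + r s_1 \frac{\partial}{\partial s_1} + r s_2 \frac{\partial}{\partial s_2} - r,
\]
\[
\hat{J}^{(f)}_1(\theta) = \frac{1}{N_1}\sum^{N_1}_{(t,s_1,s_2)\sim\hat{\varphi}_1}\Big( \mathcal{L}^{(f)} f(t, s_1, s_2; \theta) \Big)^2,
\]
\[
\mathcal{L}^{(f)} = \frac{\partial}{\partial t}
+ \frac{\sigma_1^2 s_1^2 + \sigma_2^2 s_2^2 \lambda^2 \big(\frac{\partial^2}{\partial s_1 \partial s_2}\big)^2 + 2\rho\,\sigma_1 \sigma_2 s_1 s_2 \lambda \frac{\partial^2}{\partial s_1 \partial s_2}}{2\big(1 - \lambda \frac{\partial^2}{\partial s_1^2}\big)^2}\frac{\partial^2}{\partial s_1^2}
+ \frac{\rho\,\sigma_1 \sigma_2 s_1 s_2 + \sigma_2^2 s_2^2 \lambda \frac{\partial^2}{\partial s_1 \partial s_2}}{1 - \lambda \frac{\partial^2}{\partial s_1^2}}\frac{\partial^2}{\partial s_1 \partial s_2}
+ \frac{1}{2}\sigma_2^2 s_2^2 \frac{\partial^2}{\partial s_2^2} + r s_1 \frac{\partial}{\partial s_1} + r s_2 \frac{\partial}{\partial s_2} - r,
\]
\[
\hat{J}_2(\theta) = \frac{1}{N_2}\sum^{N_2}_{(s_1,s_2)\sim\hat{\varphi}_2}\Big( (s_1 - s_2 - k)^{+} - f(T, s_1, s_2; \theta) \Big)^2,
\]
\[
\hat{J}_3(\theta) = \frac{1}{N_3}\sum^{N_3}_{(t,s_2)\sim\hat{\varphi}_3}\Big( C - s_2 - k e^{-r(T-t)} - f(t, C, s_2; \theta) \Big)^2
+ \frac{1}{N_3}\sum^{N_3}_{(t,s_1)\sim\hat{\varphi}_3}\Big( s_1 N(d_{+}) - k e^{-r(T-t)} N(d_{-}) - f(t, s_1, 0; \theta) \Big)^2
+ \frac{1}{N_3}\sum^{N_3}_{(t,s_1)\sim\hat{\varphi}_3} f(t, s_1, C; \theta)^2
+ \frac{1}{N_3}\sum^{N_3}_{(t,s_2)\sim\hat{\varphi}_3} f(t, 0, s_2; \theta)^2.
\]

Acknowledgments.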
The authors are grateful to the anonymous referee for a careful checking of the details and for helpful comments that improved this paper.
REFERENCES
[1] A. Y. A. Shidfar, Kh. Paryab and T. A. Pirvu, Numerical analysis for spread option pricing model of markets with finite liquidity: first-order feedback model, International Journal of Computer Mathematics, 91 (2014), pp. 2603–2620, https://doi.org/10.1080/00207160.2014.887274.
[2] A. Al-Aradi, A. Correia, D. Naiff, G. Jardim, and Y. Saporito, Solving nonlinear and high-dimensional partial differential equations via deep learning, 2018, https://arxiv.org/abs/1811.08782.
[3] B. P. Ayati and T. F. Dupont, Convergence of a step-doubling Galerkin method for parabolic problems, 1999.
[4] Y. Bengio, Deep Learning of Representations for Unsupervised and Transfer Learning, vol. 27 of Proceedings of Machine Learning Research, PMLR, 02 Jul 2012, http://proceedings.mlr.press/v27/bengio12a.html.
[5] R. Carmona and V. Durrleman, Pricing and hedging spread options, SIAM Review, 45 (2003), pp. 627–685, https://doi.org/10.1137/S0036144503424798.
[6] J. Chen, R. Du, and K. Wu, A comprehensive study of boundary conditions when solving PDEs by DNNs, 05 2020.
[7] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, Empirical evaluation of gated recurrent neural networks on sequence modeling, 2014, https://arxiv.org/abs/1412.3555.
[8] M. Dempster and S. Hong, Spread option valuation and the fast Fourier transform, 03 2001, https://doi.org/10.1007/978-3-662-12429-1_10.
[9] A. Friedman, Stochastic Differential Equations and Applications, Academic Press, 1st ed., 1975.
[10] P. B. Girma and A. S. Paulson, Seasonality in petroleum futures spreads, Journal of Futures Markets, 18 (1998), pp. 581–598, https://doi.org/10.1002/(SICI)1096-9934(199808)18:5
[11] CME Group, New York Mercantile Exchange Rulebook.
[12] J. K. Hunter, Notes on partial differential equations, 2010.
[13] T. R. Hurd and Z. Zhou, A Fourier transform method for spread option pricing, SIAM Journal on Financial Mathematics, 1 (2010), pp. 142–157, https://doi.org/10.1137/090750421.
[14] R. L. Johnson, C. R. Zulauf, S. H. Irwin, and M. E. Gerlow, The soybean complex spread: An examination of market efficiency from the viewpoint of a production process, Journal of Futures Markets, 11 (1991), pp. 25–37, https://doi.org/10.1002/fut.3990110104.
[15] M. Li, S. Deng, and J. Zhou, Closed-form approximations for spread option prices and Greeks, The Journal of Derivatives, 15 (2008), https://doi.org/10.2139/ssrn.952747.
[16] T. Pirvu and A. Yazdanian, Numerical analysis for spread option pricing model in illiquid underlying asset market: Full feedback model, Applied Mathematics & Information Sciences, 10 (2015), pp. 1271–1281, https://doi.org/10.18576/amis/100406.
[17] P. Ramachandran, B. Zoph, and Q. V. Le, Searching for activation functions, 2017, https://arxiv.org/abs/1710.05941.
[18] J. Sirignano and K. Spiliopoulos, DGM: A deep learning algorithm for solving partial differential equations, Journal of Computational Physics, 375 (2018), pp. 1339–1364, https://doi.org/10.1016/j.jcp.2018.08.029.
[19] K. Weiss, T. Khoshgoftaar, and D. Wang, A survey of transfer learning, Journal of Big Data, 3 (2016), https://doi.org/10.1186/s40537-016-0043-6.
[20] P. Wilmott and P. J. Schönbucher, The feedback effect of hedging in illiquid markets, SIAM Journal on Applied Mathematics, 61 (2000), pp. 232–272, https://doi.org/10.1137/S0036139996308534.
[21] K. S. Zhang and T. A. Pirvu, Numerical simulation of exchange option with finite liquidity: Controlled variate model, 2020, https://arxiv.org/abs/2006.07771.