Sig-SDEs model for quantitative finance
Imanol Perez Arribas ([email protected]), University of Oxford and Alan Turing Institute, United Kingdom
Cristopher Salvi ([email protected]), University of Oxford and Alan Turing Institute, United Kingdom
Lukasz Szpruch ([email protected]), University of Edinburgh and Alan Turing Institute, United Kingdom
ABSTRACT
Mathematical models, calibrated to data, have become ubiquitous in key decision processes in modern quantitative finance. In this work, we propose a novel framework for data-driven model selection by integrating a classical quantitative setup with a generative modelling approach. Leveraging the properties of the signature, a well-known path transform from stochastic analysis that has recently emerged as a leading machine learning technology for learning time-series data, we develop the Sig-SDE model. Sig-SDE provides a new perspective on neural SDEs and can be calibrated to exotic financial products that depend, in a non-linear way, on the whole trajectory of asset prices. Furthermore, our approach enables consistent calibration under both the pricing measure Q and the real-world measure P. Finally, we demonstrate the ability of Sig-SDE to simulate future possible market scenarios needed for computing risk profiles or hedging strategies. Importantly, this new model is underpinned by rigorous mathematical analysis that, under appropriate conditions, provides theoretical guarantees for convergence of the presented algorithms.

CCS CONCEPTS
• Mathematics of computing → Probability and statistics; • Applied computing; • Computing methodologies → Machine learning

KEYWORDS
market simulation, pricing, signatures, rough path theory
ACM Reference Format:
Imanol Perez Arribas, Cristopher Salvi, and Lukasz Szpruch. 2018. Sig-SDEs model for quantitative finance. In ICAIF-2020, October 15–16, 2020, NY. ACM, New York, NY, USA, 8 pages. https://doi.org/10.1145/1122445.1122456
1 INTRODUCTION
The question of finding a parsimonious model that well represents empirical data has been of paramount importance in quantitative finance. The modelling choice is dictated by the desire to fit and explain the available data, but is also subject to computational
ICAIF-2020, October 15–16, 2020, NY. © 2018 Association for Computing Machinery. ACM ISBN 978-1-4503-XXXX-X/18/06...$15.00. https://doi.org/10.1145/1122445.1122456

considerations. Inevitably, all models can only provide an approximation to reality, and the risk of using inadequate ones is hard to detect. A classical approach consists in fixing a class of parametric models with a number of parameters that is significantly smaller than the number of available data points. Next, in the process called calibration, the goal is to solve a data-dependent optimization problem yielding an optimal choice of model parameters. The main challenge, of course, is to decide what class of models one should choose from. The theory of statistical learning [28] tells us that too simple models cannot fit the data, while too complex ones are not expected to generalise to unseen observations. In modern machine learning approaches, one usually starts by defining a highly overparametrised model from some universality class, exhibiting a number of parameters often exceeding the number of data points, and lets (stochastic) gradient algorithms find the best configuration of parameters yielding a calibrated model. In this work, we find a middle ground between the two approaches. We develop a new framework for systematic model selection that exhibits universal approximation properties, and we provide an explicit solution to the optimization problem used in its calibration, which completely removes the need to deploy expensive gradient descent algorithms. Importantly, the class of models that we consider builds upon classical risk models that are well underpinned by research in quantitative finance.

The mathematical object at the core of this work is the expected signature of a path, whose properties are well understood in the field of stochastic analysis. It allows one to identify a linear structure underpinning the high non-linearity of the sequential data we work with.
This linear structure leads to a massive speed-up of calibration, pricing, and generation of future scenarios. Our approach provides a new systematic model selection mechanism that can also be deployed to calibrate classical non-Markovian models in a computationally efficient way. Signatures have been deployed to solve various tasks in mathematical finance, such as option pricing and hedging [22, 23], high-frequency optimal execution [4, 14] and others [12, 24]. They have also been applied in several areas of machine learning [6, 16, 19, 21, 29-34].
Let X : [0, T] → R^d denote the price process of an arbitrary financial asset under the pricing measure Q. To ensure the no-arbitrage assumption is not violated, X is typically given by the solution of the following Stochastic Differential Equation (SDE)

dX_t = Σ_t dW_t,   X_0 = x_0,   (1)

where W is a one-dimensional Brownian motion and Σ_t is an adapted process (the volatility process). Model (1) accommodates many standard risk models in use: the classical Black–Scholes model assumes that volatility is proportional to the spot price, i.e. Σ_t := σ X_t with σ ∈ R constant; the local volatility model assumes that Σ_t := σ(t, X_t) X_t, where σ(·,·) (called the local volatility surface) depends on both time and spot, and hence generalises the Black–Scholes model; various stochastic volatility models assume that Σ_t := σ_t X_t with σ_t following some diffusion process; the SABR model chooses Σ_t := σ_t X_t^β, with β ∈ [0, 1] and where σ_t follows a diffusion process.

A natural question is whether one can find a model for the volatility process Σ_t that is large enough to include all the classical models mentioned above and that allows for a systematic, data-driven model selection. We require such a model to satisfy the following:

(1) Universality. The model should be able to approximate arbitrarily well the dynamics of classical models.
(2)
Efficient calibration. Given market prices for a family of options, it should be possible to efficiently calibrate the model so that it correctly prices the family of options.
(3)
Fast pricing. Ideally, it should be possible to quickly price (potentially exotic) options under the model without using Monte Carlo techniques.
(4)
Efficient simulation. Sampling trajectories from the model should be computationally cheap and efficient.

An example of a model that satisfies point (1) above is a neural network model, where the volatility process Σ_t is approximated by a neural network NN_θ(t, (W_s)_{s∈[0,t]}) with parameters θ. Such a model would be able to approximate a rich class of classical models. However, the calibration and pricing of such models would involve performing multiple Monte Carlo simulations in each epoch, which might be expensive if done naively; see however [7, 10].

The aim of this paper is to propose a model for asset price dynamics that, we believe, satisfies all four points above. Our technique models the volatility process Σ_t as

Σ_t = ⟨ℓ^N, Ŵ_{0,t}⟩,   (2)

where ℓ^N is the model parameter and Ŵ_{0,t} is the signature (cf. Definition 2.6) of the stochastic process Ŵ_t := (t, W_t). The motivation for choosing the signature as the main building block of this paper is anchored in a very powerful universal approximation result for functions of paths, based on the celebrated Stone-Weierstrass theorem, that we present next in an informal manner (for more technical details see [8, Proposition 3]).

Theorem 1.1.
Consider a compact set K of continuous R^d-valued paths. Denote by S the function that maps a path X from K to its signature X. Let f : K → R be any continuous function. Then, for any ε > 0 there exists a linear functional ℓ acting on the signature such that

sup_{X ∈ K} | f(X) − ⟨ℓ, X⟩ | < ε.   (3)

In other words, any continuous function on a compact set of paths can be uniformly well approximated by a linear combination of terms of the signature. This universal approximation property is similar to the one provided by neural networks (NN). However, as we will discuss below, NN models depend on a very large collection of parameters that need to be optimized via expensive backpropagation-based techniques, whilst the optimization needed in our Sig-SDE model consists of a simple linear regression on the terms of the signature. In this way, the signature can be thought of as a feature map for paths that provides a linear basis for the space of continuous functions on paths. In the setting of SDEs, sample paths are Brownian and solutions are images of these sample trajectories by continuous functions that one wishes to approximate from a set of observations. Our Sig-SDE model relies on the universality of the signature to approximate such functions acting on Brownian trajectories. Importantly, the signature of a realisation of a semimartingale provides a unique representation of the sample trajectory [2, 13]. Similarly, the expected signature (i.e. the collection of the expectations of the iterated integrals) provides a unique representation of the law of the semimartingale [5].

Note that model calibration is an example of generative modelling [11, 18]. Indeed, recall that if one knew the prices of traded liquid derivatives, then one could approximate the pricing measure from market data [3, 22].
We denote this measure by Q_real.

We know that when equation (1) admits a strong solution, there exists a measurable map G : R × C([0, T]) → C([0, T]) such that

X = G(x_0, (W_s)_{s∈[0,T]}),   (4)

as shown in [15, Corollary 3.23]. If G_t denotes the projection of G given by X_t := G_t(x_0, (W_s)_{s∈[0,t]}), then one can view (1) as a generative model that maps a measure µ supported on R^d into (G_t)_# µ = Q_t^θ. Note that by construction G is a causal transport map, i.e. a transport map that is adapted to the filtration F_t [1]. In practice, one is interested in finding such a transport map from a family of parametrised functions G^θ. One then looks for a θ such that G^θ_# µ is a good approximation of Q_real with respect to a metric specified by the user. In this paper the family of transport maps G^θ is given by linear functions on signatures (linear functionals, defined below).

2 PRELIMINARIES
We begin by introducing some notation and preliminary results that are used in this paper.
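Before the formal definitions, a minimal numerical sketch of the linear-regression viewpoint behind Theorem 1.1 (not the paper's implementation; all parameters below are illustrative). Depth-2 signature features of piecewise-linear paths are computed exactly, and the path functional f(X) = (W_T − W_0)² is recovered by ordinary least squares, since it equals twice the (2,2) signature term of the time-augmented path:

```python
import numpy as np

def sig_level12(path):
    """Truncated (depth-2) signature of a piecewise-linear path.

    path: array of shape (n+1, d). Returns (level1, level2) where
    level1[i]   = total increment of coordinate i,
    level2[i,j] = iterated integral over dX^i dX^j on the whole path,
    computed exactly for piecewise-linear paths via Chen's identity.
    """
    d = path.shape[1]
    s1, s2 = np.zeros(d), np.zeros((d, d))
    for dx in np.diff(path, axis=0):
        # Chen's identity: concatenate the running signature with the
        # signature of a linear segment.
        s2 += np.outer(s1, dx) + 0.5 * np.outer(dx, dx)
        s1 += dx
    return s1, s2

def sig_features(path):
    """Flatten (1, level1, level2) into a single feature vector."""
    s1, s2 = sig_level12(path)
    return np.concatenate(([1.0], s1, s2.ravel()))

# Learn f(X) = (W_T - W_0)**2 by linear regression on signature features
# of the time-augmented path (t, W_t - W_0).
rng = np.random.default_rng(0)
paths = [np.cumsum(rng.normal(0.0, 0.1, 50)) for _ in range(200)]
X = np.stack([sig_features(np.column_stack((np.linspace(0.0, 1.0, 50), w - w[0])))
              for w in paths])
y = np.array([(w[-1] - w[0]) ** 2 for w in paths])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
pred = X @ coef  # exact up to floating point: f is linear in the signature
```

The fit is exact (not just approximate) because this particular functional lies in the depth-2 linear span; generic continuous functionals are only approximated, with accuracy improving in the truncation depth.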
Definition 2.1.
Let d ∈ N. For any n ≥ 0, we call a d-multi-index of length n any n-tuple of integers of the form K = (k_1, ..., k_n) such that k_i ∈ {1, ..., d} for all i ∈ {1, ..., n}. We denote its length by |K| = n. The empty multi-index is denoted by ø. We denote by I_d the set of all d-multi-indices, and by I_d^n ⊂ I_d the set of all d-multi-indices of length at most n ∈ N.

Definition 2.2 (Concatenation of multi-indices).
Let I = (i_1, ..., i_p) and J = (j_1, ..., j_q) be any two multi-indices in I_d. Their concatenation product ⊗ is the multi-index I ⊗ J = (i_1, ..., i_p, j_1, ..., j_q) ∈ I_d.

Example 2.3.
(1) (i_1, i_2) ⊗ (j_1, j_2) = (i_1, i_2, j_1, j_2).
(2) (i_1, i_2, i_3) ⊗ (j_1) = (i_1, i_2, i_3, j_1).
(3) (i_1, i_2) ⊗ ø = (i_1, i_2).

Definition 2.4 (Linear functional).
For a given d ≥ 1, a linear functional is a (possibly infinite) sequence of real numbers indexed by multi-indices in I_d, of the following form:

F = {F(K) ∈ R : K ∈ I_d}.   (5)

We note that a multi-index K ∈ I_d can itself be seen as a linear functional (the indicator of K). Both the concatenation product ⊗ and the shuffle product ⧢ (Definition A.1) can be extended by linearity to operations on linear functionals. We will now define two basic operations on linear functionals that will be used throughout the paper.

Definition 2.5.
For any two linear functionals F, G and any real numbers α, β ∈ R, define

αF + βG = {αF(K) + βG(K) ∈ R : K ∈ I_d}   (6)

and

⟨F, G⟩ = Σ_{K ∈ I_d} F(K) G(K) ∈ R.   (7)

Rough path theory can be briefly described as a non-linear extension of the classical theory of controlled differential equations which is robust enough to allow a deterministic treatment of stochastic differential equations controlled by much rougher signals than semimartingales [26].
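These operations are concrete enough to sketch directly (a minimal illustration, not the paper's code): multi-indices become Python tuples, with ø the empty tuple, and finitely supported linear functionals become dicts from tuples to coefficients.

```python
# Multi-indices as tuples (ø is the empty tuple ()); finitely supported
# linear functionals as dicts mapping multi-indices to real coefficients.

def concat(I, J):
    """Concatenation product I ⊗ J of two multi-indices."""
    return tuple(I) + tuple(J)

def lin_comb(alpha, F, beta, G):
    """The linear combination alpha*F + beta*G of (6)."""
    return {K: alpha * F.get(K, 0.0) + beta * G.get(K, 0.0)
            for K in set(F) | set(G)}

def pairing(F, G):
    """The pairing <F, G> of (7); the sum is finite for dict-supported F, G."""
    return sum(c * G.get(K, 0.0) for K, c in F.items())

F = {(): 1.0, (1,): 2.0, (1, 2): -1.0}
G = {(1,): 3.0, (2,): 5.0}
print(concat((1, 2), (2,)))   # (1, 2, 2)
print(pairing(F, G))          # only the (1,)-terms overlap: 2.0 * 3.0 = 6.0
```

Representing functionals as sparse dicts keeps the pairing (7) a finite sum, which is all that is ever needed for truncated signatures.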
Definition 2.6 (Signature).
Let X : [0, T] → R^d be a continuous semimartingale. The signature of X over a time interval [s, t] ⊂ [0, T] is the linear functional X_{s,t} := {X(K)_{s,t} ∈ R : K ∈ I_d} such that X(ø)_{s,t} = 1 and, for any n ≥ 1 and K = K̂ ⊗ (a) ∈ I_d^n with a ∈ {1, ..., d} and K̂ ∈ I_d^{n−1},

X(K)_{s,t} = ∫_s^t X(K̂)_{s,u} ∘ dX^(a)_u,   (8)

where the integral is interpreted in the Stratonovich sense.

Example 2.7.
Let X : [0, T] → R be a one-dimensional semimartingale.
(1) X^(1)_{s,t} = X_t − X_s.
(2) X^(1,1)_{s,t} = ∫_s^t X^(1)_{s,u} ∘ dX_u.
(3) X^(1,1)_{s,t} = (X_t − X_s)² / 2.
A more detailed overview of signatures is included in Appendix A.

3 THE SIGNATURE MODEL
In this section we define the
Signature Model for asset price dynamics that we propose in this paper. The goal is to approximate the volatility process Σ_t (which is a continuous function of the driving Brownian path) by a linear functional on the signature of the Brownian path.

Definition 3.1 (Signature Model).
Let W be a one-dimensional Brownian motion and let N ∈ N be the order of the Signature Model. The Signature Model with parameter ℓ = {ℓ(K) : K ∈ I^N} is given by Σ_t := ⟨ℓ, Ŵ_{0,t}⟩, where Ŵ denotes the signature of W with added time, Ŵ_t := (t, W_t). In other words, the asset price dynamics are given by

dX_t = ⟨ℓ, Ŵ_{0,t}⟩ dW_t,   X_0 = x_0 ∈ R.   (9)

We note that the Signature Model has two components: the hyperparameter N ∈ N and the model parameter ℓ. Intuitively, the hyperparameter N plays a similar role to the width of a layer in a neural network: the larger this value is, the richer the range of market dynamics the Signature Model can generate. Once the value of N is fixed, the challenge is to find a suitable model parameter ℓ. Again, in analogy with neural networks, ℓ plays the role of the weights of the network.

The Signature Model possesses the universality property, in the sense that given a classical model, there exists a Signature Model that can approximate its dynamics to a given accuracy [20]. We show in the upcoming Sections 5-7 that (a) the Signature Model is efficient to simulate, (b) it is efficient to calibrate, and (c) exotic options can be priced fast under the Signature Model.

Remark 1.
The Signature Model introduced in Definition 3.1 assumes that the source of noise (i.e. the Brownian motion W) is one-dimensional. This was done for simplicity, but the authors would like to emphasise that the model generalises in a straightforward way to multi-dimensional Brownian motion.

4 NUMERICAL EXPERIMENTS
We now demonstrate the feasibility of our methodology as outlined in Sections 5-7. Throughout this section, we work with the Signature Model dX_t = ⟨ℓ, Ŵ_{0,t}⟩ dW_t, X_0 = x_0, with ℓ = {ℓ(K) : K ∈ I^N}. We fix N =
4. Therefore, the model has 1 + 2 + 4 + 8 + 16 = 31 parameters that need to be calibrated. We also fix the terminal maturity T = 1.

We assume that the family of options available on the market is a mixture of vanilla and exotic options, given as follows:
• Vanilla call options over a grid of strikes K and maturities t, with payoff Φ := max(X_t − K, 0).
• Variance options over a grid of strikes K and maturities t, with payoff Φ := max(⟨X⟩_t − K, 0), where ⟨X⟩ is the quadratic variation of X.
• Down-and-out barrier call options with maturity 1, over a grid of strikes K and barrier levels L, with payoff Φ := max(X_t − K, 0) if min_{s∈[0,t]} X_s > L, and 0 otherwise.

The option prices are generated from a Black-Scholes model with volatility σ, i.e. dX_t = σ X_t dW_t. The optimisation (14) was then solved to calibrate the model parameters ℓ = {ℓ(K) : K ∈ I^N}.

Figure 1 shows the absolute error between the real option prices and the option prices of the calibrated model, for the different option types.
Figure 1: Error analysis between the option prices of the real model and the calibrated Signature Model.
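The Black-Scholes vanilla targets used in this calibration experiment are available in closed form. A minimal sketch of that data-generation step (the exact volatility used in the paper is not legible in this copy, so σ = 0.2 and the strike grid below are assumed values):

```python
from math import erf, log, sqrt

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(spot, strike, sigma, maturity):
    """Black-Scholes call price for dX_t = sigma * X_t dW_t (zero rates)."""
    vol = sigma * sqrt(maturity)
    d1 = (log(spot / strike) + 0.5 * vol ** 2) / vol
    return spot * norm_cdf(d1) - strike * norm_cdf(d1 - vol)

# Illustrative target prices on an assumed strike grid; sigma = 0.2 assumed.
targets = [bs_call(1.0, k, 0.2, 1.0) for k in (0.9, 1.0, 1.1)]
```

Variance and barrier targets have no such simple closed form in general and would be generated by Monte Carlo under the same Black-Scholes dynamics.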
Once the Signature Model has been calibrated to the available option prices, we can use Algorithm 1 to simulate realisations of the calibrated Signature Model. Figure 2 shows 1,000 realisations of the Signature Model.
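Algorithm 1 (Section 5) is the efficient sampler. Purely as an illustrative baseline (not the paper's algorithm), dynamics (9) can also be simulated by a naive Euler scheme that re-evaluates Σ_t = ⟨ℓ, Ŵ_{0,t}⟩ from a depth-2 truncated signature at every step; the parameter ℓ below is an arbitrary illustrative choice, not a calibrated one:

```python
import numpy as np

def sig_depth2(path):
    """Depth-2 signature of a piecewise-linear path, keyed by multi-index.
    Coordinate index 1 is time, index 2 is the Brownian component."""
    d = path.shape[1]
    s1, s2 = np.zeros(d), np.zeros((d, d))
    for dx in np.diff(path, axis=0):
        s2 += np.outer(s1, dx) + 0.5 * np.outer(dx, dx)
        s1 += dx
    sig = {(): 1.0}
    sig.update({(i + 1,): s1[i] for i in range(d)})
    sig.update({(i + 1, j + 1): s2[i, j] for i in range(d) for j in range(d)})
    return sig

def simulate_sig_sde(ell, n_steps=100, T=1.0, x0=1.0, seed=0):
    """Naive Euler scheme for dX_t = <ell, sig of (s, W_s)_{s<=t}> dW_t."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    dw = rng.normal(0.0, np.sqrt(dt), n_steps)
    t = np.linspace(0.0, T, n_steps + 1)
    w = np.concatenate(([0.0], np.cumsum(dw)))
    path = np.column_stack((t, w))          # the time-augmented path (t, W_t)
    x = [x0]
    for k in range(n_steps):
        sig = sig_depth2(path[: k + 1])     # signature over [0, t_k]
        sigma = sum(c * sig.get(K, 0.0) for K, c in ell.items())
        x.append(x[-1] + sigma * dw[k])
    return np.array(x)

# ell supported on the empty index only gives constant volatility (Bachelier).
xs = simulate_sig_sde({(): 0.3})
```

Recomputing the running signature from scratch at each step is quadratic in the number of steps; Algorithm 1 avoids exactly this cost by updating the signature incrementally via Chen's identity.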
We will now use the calibrated Signature Model to price a new set of options that was not used in the calibration step. This set of options consists of down-and-in barrier put options over a grid of barrier levels L and strikes K, with payoff Φ := max(K − X_t, 0) if min_{s∈[0,t]} X_s < L, and 0 otherwise.

Figure 2: 1,000 realisations of the calibrated Signature Model.
Figure 3: Error analysis between the option prices of the real model and the calibrated Signature Model.
Figure 3 shows the absolute error of the prices under the Signature Model, compared to the real prices. As we see, the calibrated model is able to generate accurate prices for these new exotic options. The error is highest when the barrier is close to the strike price, as expected.
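For orientation, the out-of-sample barrier payoffs above are straightforward to evaluate by Monte Carlo once paths can be sampled. A minimal sketch on illustrative Black-Scholes paths (σ = 0.2 is an assumption; the calibrated Signature Model would supply the paths in the actual experiment):

```python
import numpy as np

def down_and_in_put(paths, strike, barrier):
    """Monte Carlo price of a down-and-in barrier put from simulated paths.

    paths: array (n_paths, n_steps+1). Pays max(strike - X_T, 0) only if
    the path drops below the barrier at some monitoring time.
    """
    knocked_in = paths.min(axis=1) < barrier
    payoff = np.maximum(strike - paths[:, -1], 0.0) * knocked_in
    return payoff.mean()

def down_and_out_put(paths, strike, barrier):
    """Complementary knock-out contract: pays only if the barrier is never hit."""
    stayed_above = paths.min(axis=1) >= barrier
    payoff = np.maximum(strike - paths[:, -1], 0.0) * stayed_above
    return payoff.mean()

# Illustrative Black-Scholes paths (assumed sigma = 0.2, zero rates).
rng = np.random.default_rng(0)
n_paths, n_steps, dt, sigma = 5000, 100, 0.01, 0.2
z = rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps))
logx = np.cumsum(sigma * z - 0.5 * sigma ** 2 * dt, axis=1)
paths = np.hstack((np.ones((n_paths, 1)), np.exp(logx)))
price = down_and_in_put(paths, strike=1.0, barrier=0.9)
```

A useful sanity check is the pathwise identity "down-and-in + down-and-out = vanilla put", which holds exactly on any set of simulated paths.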
5 EFFICIENT SIMULATION
This section will address the question of simulation efficiency of Signature Models. We begin by stating the following two results. The first result rewrites the differential equation (9) solely in terms of the lead-lag signature of the Brownian motion, Ŵ^LL_{0,t}. Here Ŵ^LL denotes the lead-lag transformation of Ŵ, see Appendix B. We use the lead-lag transformation because it allows us to rewrite Itô integrals as certain Stratonovich integrals, which in turn can be written as linear functions on signatures. The second result guarantees that the computational cost of computing Ŵ^LL_{0,t} is the same as the cost of computing {Ŵ^LL_{0,s} : 0 ≤ s ≤ t}. These two results lead to Algorithm 1, which provides an efficient way to sample from a Signature Model.

Algorithm 1:
Sampling from a Signature Model.
Parameters: D = {t_i}_{i=0}^n with 0 = t_0 < t_1 < ... < t_{n−1} < t_n = T: sampling times; ℓ = {ℓ(K) : K ∈ I^N}: Signature Model parameter; x_0 ∈ R: initial spot price.
Output: A sample path {X_{t_k}}_{k=0}^n from the Signature Model.
1. Simulate a one-dimensional Brownian motion at the sampling times, {W_{t_i}}_{i=0}^n.
2. Apply the lead-lag transformation (25) to Ŵ to obtain Ŵ^LL.
3. Ŵ^LL_{0,0} ← {F(K) : K ∈ I^{N+1}} with F(ø) = 1 and F(K) = 0 for K ≠ ø.
4. X_{t_0} ← x_0.
5. for k = 1, ..., n do
6.   Compute the signature Ŵ^LL_{t_{k−1},t_k} = {Ŵ^LL(K)_{t_{k−1},t_k} : K ∈ I^{N+1}}.
7.   Use Chen's identity (Theorem 5.2) to compute the signature Ŵ^LL_{0,t_k} ← {Ŵ^LL(K)_{0,t_k} : K ∈ I^{N+1}}.
8.   Use Proposition 5.1 to get X_{t_k} ← ⟨x_0 ø + ℓ ⊗ (2), Ŵ^LL_{0,t_k}⟩.
9. end
10. return {X_{t_k}}_{k=0}^n.

Proposition 5.1 ([22, Lemma 3.11]).
Let X follow a Signature Model with parameter ℓ = {ℓ(K) : K ∈ I^N}. Then X is given by

X_t = ⟨x_0 ø + ℓ ⊗ (2), Ŵ^LL_{0,t}⟩,   (10)

where ℓ ⊗ (2) = {K ⊗ (2) : K ∈ ℓ}, with (2) denoting the index of the lead Brownian component, x_0 = X_0 ∈ R, and Ŵ^LL denotes the lead-lag transformation, introduced in Definition B.1, of the 2-dimensional process Ŵ_t = (t, W_t).

Theorem 5.2 (Chen's identity, [25, Theorem 2.12]).
Let 0 ≤ s ≤ t. Then, for each multi-index K ∈ I_d we have

Ŵ^LL(K)_{0,t} = Σ_{I,J ∈ I_d : I⊗J = K} Ŵ^LL(I)_{0,s} · Ŵ^LL(J)_{s,t},   (11)

where for any multi-index K ∈ I_d we use the notation Ŵ^LL(K)_{s,t} = ⟨K, Ŵ^LL_{s,t}⟩.

These two results lead to Algorithm 1. We note that there are a number of publicly available software packages to compute signatures, such as esig (https://pypi.org/project/esig/), iisignature (https://github.com/bottler/iisignature, [27]) and signatory (https://github.com/patrick-kidger/signatory, [17]).

6 FAST PRICING
This section will show that exotic options can be priced fast under a Signature Model. This will be done via a two-step procedure. First, it was shown in [22, 23] that prices of exotic options can be approximated with arbitrary precision by a special class of payoffs called signature payoffs, defined below. Hence, we will assume that the exotic option to be priced is a signature payoff, defined as follows.

Definition 6.1 (Signature payoffs).
A signature payoff of maturity T > 0 with parameter f = {f(K) : K ∈ I^N} is a payoff that pays at time T an amount given by ⟨f, X̂_{0,T}⟩.

Second, the price of a signature payoff is ⟨f, E[X̂_{0,T}]⟩. To price a signature payoff, all we need is E[X̂_{0,T}], which does not depend on the signature payoff itself. In particular, it may be reused to price other signature payoffs. We now explicitly derive the expected signature E[X̂_{0,T}] in terms of the model parameters and the expected signature of the lead-lag Brownian motion E[Ŵ^LL_{0,T}].

Proposition 6.2. Let X be a Signature Model of order N ∈ N with parameter ℓ = {ℓ(K) : K ∈ I^N}. Consider the linear functionals P_1 = (1) and P_2 = ℓ ⊗ (2). Consider any multi-index I = (i_1, ..., i_n) with n ≤ N. Then

X̂(I)_{s,t} = ⟨C_I(ℓ), Ŵ^LL_{s,t}⟩,   (12)

where C_I(ℓ) is given explicitly in closed form by

C_I(ℓ) = (((P_{i_1} ≻ P_{i_2}) ≻ P_{i_3}) ≻ ... ≻ P_{i_n}).   (13)

Proof. By Proposition 5.1 we know that if X follows a Signature Model with parameter ℓ = {ℓ(K) : K ∈ I^N} then X_t = ⟨x_0 ø + ℓ ⊗ (2), Ŵ^LL_{0,t}⟩. Let I = (i_1, ..., i_n) be any multi-index with n ≤ N. If n = 1 then I = (i_1), and necessarily one of the following two options must hold:
• If i_1 = 1, then X̂(i_1)_{s,t} = t − s = Ŵ^LL((1))_{s,t} = ⟨P_1, Ŵ^LL_{s,t}⟩.
• If i_1 = 2, then X̂(i_1)_{s,t} = X_t − X_s = ⟨ℓ ⊗ (2), Ŵ^LL_{s,t}⟩ = ⟨P_2, Ŵ^LL_{s,t}⟩.
Hence the statement holds for n =
1. Assume by induction that the statement holds for all multi-indices of length strictly less than n ≤ N. We write I = J ⊗ (i_n) with i_n ∈ {1, 2} and J = (i_1, ..., i_{n−1}). Clearly |(i_n)|, |J| < n, therefore by the induction hypothesis

X̂((i_n))_{s,t} = ⟨C_{(i_n)}(ℓ), Ŵ^LL_{s,t}⟩ = ⟨P_{i_n}, Ŵ^LL_{s,t}⟩

and

X̂(J)_{s,t} = ⟨C_J(ℓ), Ŵ^LL_{s,t}⟩ = ⟨(...(P_{i_1} ≻ P_{i_2}) ≻ ... ≻ P_{i_{n−1}}), Ŵ^LL_{s,t}⟩.

By definition of the signature (Definition 2.6) we know that

X̂(I)_{s,t} = ∫_s^t X̂(J)_{s,u} ∘ dX̂((i_n))_u = ∫_s^t ⟨C_J(ℓ), Ŵ^LL_{s,u}⟩ ∘ d⟨C_{(i_n)}(ℓ), Ŵ^LL_{s,u}⟩ = ⟨C_J(ℓ) ≻ C_{(i_n)}(ℓ), Ŵ^LL_{s,t}⟩ = ⟨(...(P_{i_1} ≻ P_{i_2}) ≻ ... ≻ P_{i_{n−1}}) ≻ P_{i_n}, Ŵ^LL_{s,t}⟩,

which concludes the induction. □
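Proposition 6.2 gives the closed-form route via the lead-lag expected signature. Purely as a sanity-check sketch (depth-2 truncation, plain Monte Carlo on the time-augmented Brownian path rather than the lead-lag closed form; all parameters illustrative), the expected signature can be estimated once and then any signature payoff is priced by a single dot product:

```python
import numpy as np

def sig_depth2_batch(paths):
    """Depth-2 signatures for a batch of piecewise-linear paths.

    paths: (n_paths, n_steps+1, d). Returns an (n_paths, 1 + d + d*d) array
    with feature layout [ø, level-1 terms, level-2 terms (row-major)].
    """
    n, _, d = paths.shape
    s1 = np.zeros((n, d))
    s2 = np.zeros((n, d, d))
    for k in range(paths.shape[1] - 1):
        dx = paths[:, k + 1] - paths[:, k]
        # Chen's identity applied segment by segment, vectorised over paths.
        s2 += s1[:, :, None] * dx[:, None, :] + 0.5 * dx[:, :, None] * dx[:, None, :]
        s1 += dx
    return np.concatenate((np.ones((n, 1)), s1, s2.reshape(n, -1)), axis=1)

# Estimate E[sig] of the time-augmented Brownian path (t, W_t) on [0, T] once.
rng = np.random.default_rng(0)
T, n_steps, n_paths = 1.0, 100, 20000
t = np.linspace(0.0, T, n_steps + 1)
w = np.concatenate((np.zeros((n_paths, 1)),
                    np.cumsum(rng.normal(0.0, np.sqrt(T / n_steps),
                                         (n_paths, n_steps)), axis=1)), axis=1)
paths = np.stack((np.broadcast_to(t, (n_paths, n_steps + 1)), w), axis=2)
esig = sig_depth2_batch(paths).mean(axis=0)

# Reuse E[sig] to price a signature payoff by one dot product. Feature index 6
# is the (2,2) term, which equals W_T**2 / 2 pathwise, so f below pays W_T**2.
f = np.zeros(7)
f[6] = 2.0
price = f @ esig   # should be close to E[W_T**2] = T
```

The point of the two-step structure is visible here: `esig` is computed once, after which pricing any further signature payoff costs only a dot product.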
7 CALIBRATION
We will now address the task of calibrating a Signature Model. We assume that the market has a family of options {Φ_i}_{i=1}^n whose market prices {p_i}_{i=1}^n are observable. Typically {Φ_i}_{i=1}^n will contain vanilla options, together with some exotic options such as various variance or barrier products. Fix N ∈ N, the order of the Signature Model. The challenge here is to find the model parameter ℓ = {ℓ(K) : K ∈ I^N} that best fits the data, in the sense that the prices of Φ_i under the Signature Model with parameter ℓ are approximately given by the observed market prices p_i.

Following Section 6, we assume that the options Φ_i are given by signature payoffs. Therefore, we can write Φ_i = ⟨φ_i, X̂_{0,T}⟩ with φ_i = {φ_i(K) : K ∈ I^N}. The minimisation problem we aim to solve is the following:

min_{ℓ = {ℓ(K) : K ∈ I^N}} Σ_{i=1}^n ( ⟨φ_i, E[X̂_{0,T}]⟩ − p_i )²,   (14)

where E[X̂_{0,T}] is the expected signature of the Signature Model with parameter ℓ = {ℓ(K) : K ∈ I^N}. By Proposition 6.2, the price of Φ_i, which is given by ⟨φ_i, E[X̂_{0,T}]⟩, can be written as a polynomial in the ℓ(K). Hence, the optimisation (14) can be rewritten as the minimisation of a polynomial in the variables ℓ(K), for K ∈ I^N.

If the number of parameters ℓ(K) is large compared to the number of available option prices, the optimisation problem might be overparametrised and there will be multiple solutions to (14). In this case, we are in the robust finance setting where there are multiple equivalent martingale measures that fit the data. If the number of parameters ℓ(K) is small, however, we are in the setting of classical mathematical finance modelling and there will in general be a unique solution to (14).

8 CONCLUSION
In this paper we have proposed a new model for asset price dynamics called the Signature Model.
This model was developed with the objective of satisfying the following properties:
(1) Universality.
(2) Efficient calibration to vanilla and exotic options.
(3) Fast pricing of vanilla and exotic options.
(4) Efficient simulation.
Due to the rich properties of signatures, the Signature Model satisfies all four properties and is therefore capable of generating realistic paths without sacrificing the computational feasibility of calibration, pricing and simulation.

Although this paper has focused on the risk-neutral measure Q, the model can also be used to learn the real-world measure P: one would first calibrate to the risk-neutral measure Q and then learn the drift.

ACKNOWLEDGMENTS
This work was supported by The Alan Turing Institute under theEPSRC grant EP/N510129/1.
REFERENCES
[1] Beatrice Acciaio, Julio Backhoff-Veraguas, and Anastasiia Zalashko. 2019. Causal optimal transport and its links to enlargement of filtrations and continuous-time stochastic optimization. Stochastic Processes and their Applications (2019).
[2] Horatio Boedihardjo, Xi Geng, Terry Lyons, and Danyu Yang. 2016. The signature of a rough path: uniqueness. Advances in Mathematics 293 (2016), 720–737.
[3] Douglas T Breeden and Robert H Litzenberger. 1978. Prices of state-contingent claims implicit in option prices. Journal of Business (1978), 621–651.
[4] Álvaro Cartea, Imanol Perez Arribas, and Leandro Sánchez-Betancourt. 2020. Optimal Execution of Foreign Securities: A Double-Execution Problem with Signatures and Machine Learning. Available at SSRN (2020).
[5] Ilya Chevyrev, Terry Lyons, et al. 2016. Characteristic functions of measures on geometric rough paths. The Annals of Probability 44, 6 (2016), 4049–4082.
[6] Ilya Chevyrev and Harald Oberhauser. 2018. Signature moments to characterize laws of stochastic processes. arXiv preprint arXiv:1810.10971 (2018).
[7] Christa Cuchiero, Wahid Khosrawi, and Josef Teichmann. 2020. A generative adversarial network approach to calibration of local stochastic volatility models. arXiv preprint arXiv:2005.02505 (2020).
[8] Adeline Fermanian. 2019. Embedding and learning with signatures. arXiv preprint arXiv:1911.13211 (2019).
[9] Guy Flint, Ben Hambly, and Terry Lyons. 2016. Discretely sampled signals and the rough Hoff process. Stochastic Processes and their Applications, to appear (2020).
[11] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In Advances in Neural Information Processing Systems. 2672–2680.
[12] Lajos Gergely Gyurkó, Terry Lyons, Mark Kontkowski, and Jonathan Field. 2013. Extracting information from the signature of a financial data stream. arXiv preprint arXiv:1307.7244 (2013).
[13] Ben Hambly and Terry Lyons. 2010. Uniqueness for the signature of a path of bounded variation and the reduced path group. Annals of Mathematics (2010), 109–167.
[14] Jasdeep Kalsi, Terry Lyons, and Imanol Perez Arribas. 2020. Optimal execution with rough path signatures. SIAM Journal on Financial Mathematics 11, 2 (2020), 470–493.
[15] Ioannis Karatzas and Steven Shreve. 2012. Brownian Motion and Stochastic Calculus. Vol. 113. Springer Science & Business Media.
[16] Patrick Kidger, Patric Bonnier, Imanol Perez Arribas, Cristopher Salvi, and Terry Lyons. 2019. Deep Signature Transforms. In Advances in Neural Information Processing Systems. 3099–3109.
[17] Patrick Kidger and Terry Lyons. 2020. Signatory: differentiable computations of the signature and logsignature transforms, on both CPU and GPU. arXiv preprint arXiv:2001.00706 (2020).
[18] Diederik P Kingma and Max Welling. 2013. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114 (2013).
[19] Franz J Király and Harald Oberhauser. 2016. Kernels for sequentially ordered data. arXiv preprint arXiv:1601.08169 (2016).
[20] Daniel Levin, Terry Lyons, and Hao Ni. 2013. Learning from the past, predicting the statistics for the future, learning an evolving system. arXiv preprint arXiv:1309.0260 (2013).
[21] Chenyang Li, Xin Zhang, and Lianwen Jin. 2017. LPSNet: A novel log path signature feature based hand gesture recognition framework. In Proceedings of the IEEE International Conference on Computer Vision Workshops. 631–639.
[22] Terry Lyons, Sina Nejad, and Imanol Perez Arribas. 2019. Nonparametric pricing and hedging of exotic derivatives. arXiv preprint arXiv:1905.00711 (2019).
[23] Terry Lyons, Sina Nejad, and Imanol Perez Arribas. 2020. Numerical Method for Model-free Pricing of Exotic Derivatives in Discrete Time Using Rough Path Signatures. Applied Mathematical Finance (2020), 1–15.
[24] Terry Lyons, Hao Ni, and Harald Oberhauser. 2014. A feature set for streams and an application to high-frequency financial tick data. In Proceedings of the 2014 International Conference on Big Data Science and Computing. 1–8.
[25] Terry J Lyons. 1998. Differential equations driven by rough signals. Revista Matemática Iberoamericana 14, 2 (1998), 215–310.
[26] Terry J Lyons, Michael Caruana, and Thierry Lévy. 2007. Differential Equations Driven by Rough Paths. Springer.
[27] Jeremy F Reizenstein and Benjamin Graham. 2020. Algorithm 1004: The iisignature library: Efficient calculation of iterated-integral signatures and log signatures. ACM Transactions on Mathematical Software (TOMS) 46, 1 (2020), 1–21.
[28] Vladimir Vapnik. 2013. The Nature of Statistical Learning Theory. Springer Science & Business Media.
[29] Zecheng Xie, Zenghui Sun, Lianwen Jin, Hao Ni, and Terry Lyons. 2017. Learning spatial-semantic context with fully convolutional recurrent network for online handwritten Chinese text recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 40, 8 (2017), 1903–1917.
[30] Weixin Yang, Lianwen Jin, and Manfei Liu. 2015. Chinese character-level writer identification using path signature feature, DropStroke and deep CNN. IEEE, 546–550.
[31] Weixin Yang, Lianwen Jin, and Manfei Liu. 2016. DeepWriterID: An end-to-end online text-independent writer identification system. IEEE Intelligent Systems. IEEE, 4083–4088.
[33] Weixin Yang, Lianwen Jin, Dacheng Tao, Zecheng Xie, and Ziyong Feng. 2016. DropSample: A new training method to enhance deep convolutional neural networks for large-scale unconstrained handwritten Chinese character recognition. Pattern Recognition 58 (2016), 190–203.
[34] Weixin Yang, Terry Lyons, Hao Ni, Cordelia Schmid, Lianwen Jin, and Jiawei Chang. 2017. Leveraging the path signature for skeleton-based human action recognition. arXiv preprint arXiv:1707.03993 (2017).
A OVERVIEW OF SIGNATURES
In this section we state some of the main properties of signatures that are used in this paper.
Definition A.1 (Shuffle of multi-indices).
For any two multi-indices I, J ∈ I_d and 1-dimensional multi-indices a, b ∈ I_d^1 = {1, . . . , d} we define the shuffle product ⧢ recursively as follows:

∅ ⧢ I = I ⧢ ∅ = I (15)

and

(I ⊗ a) ⧢ (J ⊗ b) = ((I ⊗ a) ⧢ J) ⊗ b + (I ⧢ (J ⊗ b)) ⊗ a (16)

Example A.2. We have the following examples for I_d:
(1) (1, 2) ⧢ (3) = (1, 2, 3) + (1, 3, 2) + (3, 1, 2).
(2) (1, 2) ⧢ (2, 3) = 2 · (1, 2, 2, 3) + (1, 2, 3, 2) + (2, 1, 2, 3) + (2, 1, 3, 2) + (2, 3, 1, 2).
(3) (1, 2) ⧢ ∅ = (1, 2).
(4) ∅ ⧢ (1, 2) = (1, 2).

Proposition A.3 (Shuffle identity). Let X : [0, T] → R^d be a continuous semimartingale. For any two multi-indices I, J ∈ I_d the following identity on the signature of X holds:

⟨I ⧢ J, X_{s,t}⟩ = ⟨I, X_{s,t}⟩ · ⟨J, X_{s,t}⟩ =: X^{(I)}_{s,t} · X^{(J)}_{s,t} (17)

Proof. Theorem 2.15 in [26]. □

Proposition A.4 (Uniqueness of the Signature).
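The shuffle recursion (15)–(16) translates directly into code. The sketch below (plain Python; the function names are our own) represents multi-indices as tuples, computes shuffle products as word-to-coefficient maps, and checks the shuffle identity (17) on a single linear segment, for which each signature term has the closed form (product of increments)/n!:

```python
from math import factorial

def shuffle(I, J):
    """Shuffle product of two multi-indices (tuples), returned as a
    dict mapping each resulting word to its integer coefficient."""
    if not I:
        return {J: 1}
    if not J:
        return {I: 1}
    out = {}
    # (I ⊗ a) ⧢ (J ⊗ b) = ((I ⊗ a) ⧢ J) ⊗ b + (I ⧢ (J ⊗ b)) ⊗ a
    for w, c in shuffle(I[:-1], J).items():
        out[w + (I[-1],)] = out.get(w + (I[-1],), 0) + c
    for w, c in shuffle(I, J[:-1]).items():
        out[w + (J[-1],)] = out.get(w + (J[-1],), 0) + c
    return out

def sig_term(K, dx):
    """Signature term of a single linear segment with increment dx:
    (product of the selected increments) / n!."""
    term = 1.0
    for k in K:
        term *= dx[k - 1]
    return term / factorial(len(K))

# Shuffle identity on a linear segment: <I ⧢ J, X> = <I, X> * <J, X>.
dx = (0.3, -1.7)
I, J = (1,), (2,)
lhs = sum(c * sig_term(w, dx) for w, c in shuffle(I, J).items())
assert abs(lhs - sig_term(I, dx) * sig_term(J, dx)) < 1e-12
```

For instance, `shuffle((1, 2), (3,))` returns the three words (1, 2, 3), (1, 3, 2) and (3, 1, 2), each with coefficient 1.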
Let X : [0, T] → R^d and Y : [0, T] → R^d be two continuous semimartingales. Then

∀t ∈ [0, T], X_t = Y_t ⇐⇒ ∀K ∈ I_d, X^{(K)}_{s,t} = Y^{(K)}_{s,t} (18)

Proof. See main result in [13]. □

Proposition A.5 (Factorial decay).
Given a semimartingale X : [0, T] → R^d, for any time interval [s, t] ⊂ [0, T] and any multi-index K ∈ I_d such that |K| = n,

|X^{(K)}_{s,t}| = O(1/n!) (19)

Proof. Proposition 2.2 in [26]. □

Definition A.6.
For a given time interval [0, T] we call a continuous, surjective, increasing function ψ : [0, T] → [0, T] a time-reparametrization.

Proposition A.7 (Invariance to time-reparametrizations). Let X : [0, T] → R^d be a semimartingale and ψ : [0, T] → [0, T] be a time-reparametrization. Then the signature of the reparametrized path X ∘ ψ has the following invariance property:

(X ∘ ψ)_{s,t} = X_{ψ(s),ψ(t)} for all s, t ∈ [0, T] such that s < t. (20)

Definition A.8 (Half-Shuffle).
Let F and G be any two linear functionals. We define their half-shuffle product ≻ on X_{s,t} as the following (Stratonovich) iterated integral on the real line:

⟨F ≻ G, X_{s,t}⟩ = ∫_s^t ⟨F, X_{s,u}⟩ ◦ d⟨G, X_{s,u}⟩ (21)

Let B be a 2-dimensional Brownian motion, defined for example on the interval [0, 1]. Consider two linear functionals F = {F^{(K)} : K ∈ I_d} and G = {G^{(K)} : K ∈ I_d} defined as

F^{(K)} = 1 if K = (1), and 0 otherwise, (22)
G^{(K)} = 1 if K = (2), and 0 otherwise. (23)

Then

A_{s,t} = ⟨F ≻ G − G ≻ F, B_{s,t}⟩ (24)

is the Lévy area of the Brownian motion B on [s, t] ⊂ [0, 1].

A.0.1 Expected signature.
We will now define the expected signature of a semimartingale.
Definition A.9 (Expected signature).
Let X : [0, T] → R^d be a continuous semimartingale, and let X_{s,t} = {X^{(K)}_{s,t} ∈ R : K ∈ I_d} be its signature. The expected signature of X is defined by E[X_{s,t}] := {E[X^{(K)}_{s,t}] ∈ R : K ∈ I_d}.

The expected signature, i.e. the expectation of the iterated integrals (8), behaves analogously to the moments of a random variable, in the sense that under certain assumptions it characterises the law of the stochastic process:

Theorem A.10 ([5]).
Let X : [0, T] → R^d be a semimartingale. Then, under certain assumptions (see [5]) the expected signature E[X_{0,T}] characterises the law of X.

B TIME AND LEAD-LAG TRANSFORMATION
The invariance of the signature of a semimartingale to time-reparametrizations allows one to handle irregularly sampled sample paths (prices, etc.) by completely eliminating the need to retain information about the original time-parametrization. Nonetheless, for the pricing of many options, especially ones whose payoffs are computed pathwise (such as integrals for American options), time carries important information that we are required to retain. To do so it suffices to augment the state space of the input semimartingale X by adding time t as an extra dimension, to get X^add-time_t = (t, X_t).

We report another basic transformation that can be applied to semimartingales and that will be useful in the sequel of the paper: the lead-lag transformation. This transformation allows us to write Itô integrals as linear functions on the signature of the lead-lag-transformed path.
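The effect of the add-time augmentation can be seen numerically. In the sketch below (our own toy example), the signed area, a level-2 signature feature, is blind to how a scalar path is parametrized in time, but distinguishes two parametrizations once time is added as an extra coordinate:

```python
import numpy as np

def signed_area(path):
    """0.5 * integral of (x dy - y dx) along a piecewise-linear path,
    given as an (N, 2) array of vertices; a level-2 signature feature."""
    x, y = path[:, 0], path[:, 1]
    return 0.5 * np.sum(x[:-1] * np.diff(y) - y[:-1] * np.diff(x))

x = np.array([0.0, 1.0, 1.5, 2.0])   # values of a scalar path
t_even = np.linspace(0.0, 1.0, 4)    # evenly spaced sampling times
t_warp = t_even ** 2                 # a time-reparametrization

# The values alone are identical, so any signature feature of the
# scalar path coincides for the two samplings. Adding time as a
# coordinate makes the two parametrizations distinguishable:
a_even = signed_area(np.column_stack([t_even, x]))
a_warp = signed_area(np.column_stack([t_warp, x]))
print(a_even, a_warp)   # two different signed areas
```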
Figure 4: Lead-lag transformation of a Brownian motion.
Definition B.1 (Lead-lag transformation).
Let Z : [0, T] → R^d be a semimartingale. For each partition D = {t_k}_{k=0}^n ⊂ [0, T] of mesh size |D|, define the piecewise linear path Z^D : [0, T] → R^{2d} given by

Z^D_{kT/n} := (Z_{t_k}, Z_{t_k}), (25)
Z^D_{(k+1/2)T/n} := (Z_{t_k}, Z_{t_{k+1}}), (26)

and linear interpolation in between. Figure 4 shows the lead-lag transformation of a Brownian motion. As we see, the lead component leads the lag component, hence the name. The lead component can be seen as the future of the path, and the lag component as the past.

Denote by Z^D the signature of Z^D. Then we define the lead-lag transformation of Z, denoted by Z^{LL}, as the limit of signatures of Z^D: Z^{LL} := lim_{|D|→0} Z^D. The work in [9] showed the convergence of this limit and studied some of its properties.
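A short numerical check of the property that motivates the lead-lag transformation: integrating the lag component against the lead component along the piecewise-linear lead-lag path reproduces the left-point (Itô) sum of the original samples. The helper below is our own sketch of the construction in (25)–(26):

```python
import numpy as np

def lead_lag(z):
    """Vertices (lag, lead) of the lead-lag path built from a sampled
    scalar path z, interleaving the two families of vertices."""
    verts = []
    for k in range(len(z) - 1):
        verts.append((z[k], z[k]))       # both components at Z_{t_k}
        verts.append((z[k], z[k + 1]))   # lead moves ahead to Z_{t_{k+1}}
    verts.append((z[-1], z[-1]))
    return np.array(verts)

rng = np.random.default_rng(0)
z = np.cumsum(rng.standard_normal(500)) * 0.05   # sampled scalar path

ll = lead_lag(z)
lag, lead = ll[:, 0], ll[:, 1]

# Riemann-Stieltjes integral of lag d(lead); the trapezoid rule is
# exact on each linear segment of the lead-lag path.
stieltjes = np.sum(0.5 * (lag[:-1] + lag[1:]) * np.diff(lead))

ito_sum = np.sum(z[:-1] * np.diff(z))    # left-point (Ito) sum
print(abs(stieltjes - ito_sum))          # agrees up to rounding
```

On each "diagonal" segment the lag component is constant, so the segment contributes exactly Z_{t_k}(Z_{t_{k+1}} − Z_{t_k}); the "horizontal" segments, along which the lead component is constant, contribute nothing.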
B.1 Expected signature of the lead-lag Brownian motion
Definition B.2.
Let I = (i_1, . . . , i_n) ∈ I^n be a multi-index. We denote by P(I) the set of all possible tuples of non-empty multi-indices whose concatenation is equal to I and whose individual lengths don't exceed 2, i.e.

P(I) = {(I_1, . . . , I_k) : I_1 ⊗ · · · ⊗ I_k = I and |I_j| ∈ {1, 2}}.

Example B.3.
(1) P((1, 2, 3)) = {(1, 2, 3), (1, (2, 3)), ((1, 2), 3)}.
(2) P((1, 2, 3, 4)) = {(1, 2, 3, 4), (1, (2, 3), 4), (1, 2, (3, 4)), ((1, 2), 3, 4), ((1, 2), (3, 4))}.
(3) P((1, 2)) = {(1, 2), ((1, 2))}.

Definition B.4 (Exponential of a linear functional).