Detecting and repairing arbitrage in traded option prices

Abstract

Option price data are used as inputs for model calibration, risk-neutral density estimation and many other financial applications. The presence of arbitrage in option price data can lead to poor performance or even failure of these tasks, making pre-processing of the data to eliminate arbitrage necessary. Most attention in the relevant literature has been devoted to arbitrage-free smoothing and filtering (i.e. removing) of data. In contrast to smoothing, which typically changes nearly all data, or filtering, which truncates data, we propose to repair data by only necessary and minimal changes. We formulate the data repair as a linear programming (LP) problem, where the no-arbitrage relations are constraints, and the objective is to minimise prices' changes within their bid and ask price bounds. Through empirical studies, we show that the proposed arbitrage repair method gives sparse perturbations on data, and is fast when applied to real world large-scale problems due to the LP formulation. In addition, we show that removing arbitrage from prices data by our repair method can improve model calibration with enhanced robustness and reduced calibration error.



Samuel N. Cohen, Christoph Reisinger, Sheng Wang
Mathematical Institute, University of Oxford
{samuel.cohen, christoph.reisinger, sheng.wang}@maths.ox.ac.uk
August 24, 2020

arXiv [q-fin.PR]

Introduction

Price data of vanilla options are widely used in various financial applications such as calibrating models for pricing and hedging, and computing risk-neutral densities (RND) of the underlying. The presence of arbitrage in option price data can lead to poor or even failed model calibration, as well as erroneous RND estimation. Derivative pricing models in nearly all applications are constructed to be arbitrage-free, as it is economically meaningless to have a model that has the potential to make risk-free profits. Exact calibration is impossible for any arbitrage-free model or RND function with any size of input data if the price data contain arbitrage. For example, calibration of the local volatility model of Dupire [15] and Derman and Kani [12] would fail given arbitrageable data. It is also not possible to have any arbitrage-free interpolation such as Kahale [26] and Wang et al. [36], because the data to be interpolated are not arbitrage-free from the beginning. Though inexact calibration methods are available for many models, it seems natural to expect that removing arbitrage from input data can improve calibration of arbitrage-free models, such as enhancing robustness or reducing calibration error.

Therefore, it is important to remove arbitrage (if present) from option price data. Most attention in the relevant literature has been devoted to the smoothing and filtering of data. Notable works on smoothing include Aït-Sahalia and Duarte [1], Fengler et al. [16] [17] [18], Gatheral and Jacquier [19], and Lim [29]. In fact, the calibration of many pricing models, such as stochastic volatility models, is essentially arbitrage-free smoothing. Arbitrage-free data is only a byproduct of smoothing, since the main goal of smoothing is to produce a $C^{1,2}$ call price function $(T, K) \mapsto C(T, K)$ (or equivalently an implied volatility function $(T, K) \mapsto \sigma_{\mathrm{imp}}(T, K)$). For smoothing, usually an $\ell^2$-norm optimisation is used when searching over polynomial, spline or kernel parameters that produce values as close to the given price data as possible. This method leads to changes for nearly all data. Though liquidity considerations can be included by adding weights in the optimisation, it remains unclear what should be an effective way to set weight values for different options. The filtering of data refers to simply removing suspiciously low-quality price data according to criteria in terms of moneyness, expiry, trading volume, intra-day activity, etc. A good survey of popular empirical filtering criteria can be found in Ivanovas [25] and Meier [31]. Filtering can be quite subjective, can cause information loss, and might not even be feasible as many criteria are based on order-book level data, which are not always available (for example, in OTC markets).

In contrast to smoothing, which typically changes nearly all data, or filtering, which truncates data, we propose to repair data in the sense that only necessary and minimal changes are made to the given data in the presence of arbitrage. If arbitrages in data are mainly consequences of infrequent price updates of illiquid options rather than of a noncompetitive market, it is better to perturb as few data points as possible. In addition, when making changes, we use bid and ask prices as soft bounds such that liquidity profiles of different options are considered in an objective way. Bid-ask spread is a measure of liquidity, i.e. the narrower the spread, the more easily a market order can be matched and executed. Since the "fair" price could lie anywhere within the bid-ask price bounds, the width of the bid-ask spread represents the degree of certainty in the market prices. Empirically, deep out-of-the-money (OTM) and in-the-money (ITM) options are thinly traded with wide bid-ask spreads, leading to less trustworthy price data compared with more liquid options.
We therefore formulate the data repair as a constrained optimisation problem, where the no-arbitrage relations are written as constraints, and the objective is to minimise price changes within soft bounds. By carefully choosing the objective function, we can rewrite the formulation as a linear programming (LP) problem, so that we can take advantage of efficient solution techniques and software for large-scale LPs.

Our method is to repair single-price data. At any moment during the trading day, each tradable asset has multiple prices, i.e. bid price and ask price. However, most applications require single-price inputs. There is a need to construct some "fair" reference price from the market-quoted multiple prices. Examples of a reference price are the mid-price, the quantity-weighted price, the last trade price or the micro-price by Stoikov [35]. In this article, we do not discuss the construction of the reference price, and use the mid-price by default; however, other reference prices could easily be considered.

We envisage further applications of our methodology in repairing data generated by models which do not themselves rule out arbitrage. Included in this class are prices predicted by deep learning methods, which have gained substantial popularity recently, as documented by the survey paper by Ruf and Wang [33]. Typically, there is no guarantee for arbitrage-free predicted option prices even if the training set is arbitrage free; see also a more detailed discussion of this point in the introduction of Dixon, Crépey and Chataigner [13], which goes on to use the local volatility code book for arbitrage free vanilla prices as a means of guaranteeing arbitrage-free interpolation of prices. The arbitrage repair method from our paper can provide a simple post-processing step of potentially arbitrageable learned prices.
By repairing a discrete set of input prices directly without extra assumptions, using linear constraints only, the method distinguishes itself by versatility, transparency, and speed, making it particularly well-suited to online computations.

The rest of the paper is structured as follows. We derive a set of empirically verifiable, model-independent, static arbitrage constraints in Section 2. Our derivation is mainly based on Carr, Géman, Madan and Yor [6] [7], Davis and Hobson [10], and Cousot [8] [9]. In Section 3, we formulate data repair as a constrained LP problem, and the design of the objective function is carefully discussed. Finally, in Section 4, we apply our arbitrage repair method to FX option data to justify why arbitrage repair is needed for real data, and demonstrate how our method performs empirically on various metrics, especially on the improvement of model calibration. We also show an example of how we can use our repair method for identifying the formation and disappearance of arbitrage in intra-day price data.

(Footnote: Our implementation of this algorithm in Python is available in the repository https://github.com/vicaws/arbitragerepair .)

We consider a finite collection of traded European call options written on the same asset. These options can have arbitrary expiry and strike parameters rather than a rectangular grid of parameters, a restrictive prerequisite for many arbitrage detection [7] and spline-type smoothing methods [16] [26] to work. In practice, it is uncommon to have price data on a rectangular grid, see, e.g. Figure 1.

Consider $N$ European call options that have expiries $0 < T_1 < T_2 < \cdots$

Figure 1: Distributions of $(K, T)$ for traded EURUSD call options in the OTC market (Bloomberg data) and at the CME market, observed as of 31st May, 2018.

The arbitrage constraints that price data should satisfy are derived under a frictionless market assumption. As a consequence, when the price data break these constraints, it may not be possible in practice to exploit the apparent arbitrage, given practical market barriers and transaction costs. However, the assumption that prices should be arbitrage free is justified by the fact that the single-price data are not executable prices in the market, but are designed to be reference or benchmark prices for tradable assets, which are useful inputs to [...]

(Footnote: We focus on European style vanilla options in this study. Specifically, we only consider call options, since the static arbitrage constraint between call and put options is the put-call parity, which can be easily incorporated in our approach. The framework of our arbitrage repair method is applicable to a mixture of a wider range of options, as long as their arbitrage constraints can be defined by feasible linear inequalities of prices.)

At present time 0, we use $D(T)$ to denote the market discount factor for time $T$, and $\Gamma(T)$ to denote the number of shares which will be owned by time $T$ if dividend income is invested in shares. Then there is a model-independent, arbitrage-free forward price, $F(T) = S_0 / (\Gamma(T) D(T))$, for delivery of the asset at $T$.

We assume that zero-coupon bonds and forward contracts on the risky asset, with the same expiries as the options, are traded in the market. In addition, they are sufficiently liquid that we can neglect their bid-ask spreads (e.g. usually one or two ticks). Therefore, we observe market discount factors $D_i := D(T_i)$ and forward prices $F_i := F(T_i)$ for $1 \le i \le m$. However, when the underlying (spot or forward) trades at a sufficiently large bid-ask spread, then any arbitrage strategy can become impossible (see the discussion by Gerhold and Gülüm [20]).

Arbitrage refers to a costless trading strategy that has a positive probability of earning risk-free profit.
A static arbitrage is an arbitrage exploitable by fixed positions in options and the underlying stock at initial time, while the position of the underlying stock can be modified at a finite number of trading times in the future. Any other arbitrage is called dynamic arbitrage. As an example of a static arbitrage constraint, the condition $C^i_{j_1} \ge C^i_{j_2}$ must hold for $K^i_{j_1} < K^i_{j_2}$; otherwise, by going long one $(T_i, K^i_{j_1})$ option and short one $(T_i, K^i_{j_2})$ option, we make an immediate profit of $C^i_{j_2} - C^i_{j_1}$ with non-negative terminal payoff. An example of dynamic arbitrage is a continuously delta-hedged short position on an over-priced option in the perfect Black–Scholes world.

Dynamic arbitrage relies on dynamics and path properties of the tradable assets. From the data repair perspective, we should minimise model dependence, because the repaired data are to be used in more generic applications. Hence, data should only be adjusted by model-independent constraints, so we restrict ourselves to static arbitrage in which no dynamics need to be modelled. Static arbitrage constraints establish the prerequisites that the price data have to satisfy at time zero for admitting a dynamically arbitrage-free model.

A model $\mathcal{M}$ is a filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \in \mathcal{T}}, \mathbb{P})$ that carries an adapted price process $\{(S_t, C_t)\}_{t \in \mathcal{T}}$, where $C_t$ gives the prices of the $N$ options at time $t$, and we observe $C_0$. Here $\mathcal{T}$ denotes the set of times at which the asset can be traded so that $0 \in \mathcal{T}$ and $\mathcal{T}^e \subset \mathcal{T}$, and $\mathcal{F}_0 = \{\Omega, \emptyset\}$ augmented with all null sets of $\mathcal{F}_{T_m}$.

The First Fundamental Theorem of Asset Pricing (FFTAP) establishes an equivalence relation between no-arbitrage (static and dynamic) and the existence of an equivalent martingale measure (EMM). After the landmark work of Harrison and Kreps [22], there are various versions of the FFTAP and extensions of the no-arbitrage concept (e.g. no free lunch by Kreps [28], no free [...]

(Footnote: When applying our method to other asset classes, dividends of stock shares are comparable to foreign currency interest rates for FX rates, or convenience yields for commodities.)

[...] $\mathcal{M}$, there is no arbitrage if and only if $\exists\, \mathbb{Q} \sim \mathbb{P}$, such that $\forall (T, K) \in \mathcal{P}_{T,K} \cup (\mathcal{T}^e \times \{0\})$,

$$D(t)\, C_t(T, K) = D(s)\, \mathbb{E}^{\mathbb{Q}}\left[ C_s(T, K) \mid \mathcal{F}_t \right] \tag{1}$$

for all $t < s \le T$ where $t, s \in \mathcal{T}$. No static arbitrage corresponds to a much smaller set of conditions, since the path dynamics governed by $\mathbb{Q}$ no longer matter. As discussed by Carr, Géman, Madan and Yor [6], [7] and Davis [10], static arbitrage is present if no $\mathbb{Q}$ exists such that $C_0(T, K) = D(T)\, \mathbb{E}^{\mathbb{Q}}[C_T(T, K) \mid \mathcal{F}_0]$. Therefore, static arbitrage constraints are consequences of relations between terminal payoffs, projected to the present time.

Let us define $M_{T_i} = S_{T_i} / F_i$, $k^i_j = K^i_j / F_i$, $c^i_j = C^i_j / (D_i F_i)$, for all $i, j$. To have no static arbitrage, there must exist $\mathbb{Q}$ such that

$$c^i_j = \mathbb{E}^{\mathbb{Q}}\left[ \left( \frac{S_{T_i}}{F_i} - \frac{K^i_j}{F_i} \right)^{+} \,\middle|\, \mathcal{F}_0 \right] = \mathbb{E}^{\mathbb{Q}}\left[ \left( M_{T_i} - k^i_j \right)^{+} \,\middle|\, \mathcal{F}_0 \right], \quad \forall i, j.$$

We will work on "normalised" quantities $M$, $k$, $c$ in the rest of this section. We define the normalised call function $c(T, k)$ as

$$c(T, k) := \mathbb{E}^{\mathbb{Q}}\left[ (M_T - k)^{+} \,\middle|\, \mathcal{F}_0 \right], \quad \text{where } T \in \mathbb{R}_{>0},\ k \in \mathbb{R}_{\ge 0}. \tag{2}$$

Given the specific structure in (2), a probability measure $\mathbb{Q}$ exists only when the call function satisfies some shape constraints. For arbitrary but fixed $T$, using Breeden and Litzenberger's analysis [4], the marginal measure $\mathbb{Q}_T := \mathbb{Q}(\cdot \mid \mathcal{F}_T)$ exists if $\forall k_3 > k_2 > k_1 \ge 0$,

$$-1 \le \frac{c(T, k_2) - c(T, k_1)}{k_2 - k_1} \le \frac{c(T, k_3) - c(T, k_2)}{k_3 - k_2} \le 0.$$
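The normalisation and the single-expiry shape condition above can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function names `normalise` and `shape_violations` and the tolerance `tol` are hypothetical, strikes are assumed sorted and strictly increasing, and only consecutive strikes are compared (which is enough for a discrete, sorted strike set because convexity is a local property).

```python
from typing import List, Sequence, Tuple


def normalise(strikes: Sequence[float], prices: Sequence[float],
              forward: float, discount: float) -> Tuple[List[float], List[float]]:
    """Return (k, c) with k = K / F and c = C / (D * F) for one expiry."""
    k = [K / forward for K in strikes]
    c = [C / (discount * forward) for C in prices]
    return k, c


def shape_violations(k: Sequence[float], c: Sequence[float],
                     tol: float = 1e-12) -> List[Tuple[str, int]]:
    """Check -1 <= slope_j <= slope_{j+1} <= 0 on strictly increasing strikes k.

    slope_j is the slope of the chord between points j and j + 1.
    Returns (description, index) pairs; an empty list means no violation.
    """
    slopes = [(c[j + 1] - c[j]) / (k[j + 1] - k[j]) for j in range(len(k) - 1)]
    found = []
    for j, s in enumerate(slopes):
        if s < -1.0 - tol:
            found.append(("slope below -1", j))
        if s > tol:
            found.append(("slope above 0", j))
    # Convexity: consecutive chord slopes must be non-decreasing.
    for j in range(len(slopes) - 1):
        if slopes[j] > slopes[j + 1] + tol:
            found.append(("not convex", j + 1))
    return found
```

For instance, normalised prices `[0.12, 0.10, 0.03]` at normalised strikes `[0.9, 1.0, 1.1]` have chord slopes of roughly -0.2 and -0.7, so the middle point breaks convexity and the sketch reports it.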
If a family of marginal measures $\{\mathbb{Q}_T\}_{T \in \mathcal{T}^e}$ on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ exists with time-independent mean, and $\mathbb{Q}_{T_2} \ge_{\mathrm{cx}} \mathbb{Q}_{T_1}$ whenever $T_2 > T_1$, then, by Kellerer's theorem [27], there exists a Markov martingale measure with these marginals. Here we write $\mathbb{Q}_{T_2} \ge_{\mathrm{cx}} \mathbb{Q}_{T_1}$ if $\int_{\mathbb{R}} \phi \,\mathrm{d}\mathbb{Q}_{T_2} \ge \int_{\mathbb{R}} \phi \,\mathrm{d}\mathbb{Q}_{T_1}$ for each convex function $\phi : \mathbb{R} \to \mathbb{R}$, and we say $\{\mathbb{Q}_T\}_{T \in \mathcal{T}^e}$ is Non-Decreasing in Convex Order (NDCO). The convex order can be equivalently characterised in terms of the call function [34]:

$$\mathbb{Q}_{T_2} \ge_{\mathrm{cx}} \mathbb{Q}_{T_1} \iff \begin{cases} \mathbb{Q}_{T_1} \text{ and } \mathbb{Q}_{T_2} \text{ have equal means;} \\ \int_{\mathbb{R}} (x - k)^{+} \,\mathrm{d}\mathbb{Q}_{T_2} \ge \int_{\mathbb{R}} (x - k)^{+} \,\mathrm{d}\mathbb{Q}_{T_1} \quad \forall k \in \mathbb{R}. \end{cases}$$

Given that $\mathbb{E}^{\mathbb{Q}_T}[M_U] = \mathbb{E}^{\mathbb{Q}_T}[S_U / F_T(U)] = 1$ is time-independent for any $T < U$ where $T, U \in \mathcal{T}^e$, it is then sufficient to conclude that $\{\mathbb{Q}_T\}_{T \in \mathcal{T}^e}$ is NDCO if $c(T, \cdot) \le c(U, \cdot)$. Also note that $\lim_{k \downarrow 0} c(T, k) = \mathbb{E}^{\mathbb{Q}}[M_T \mid \mathcal{F}_0] = M_0 = 1$ for any $T$, and by monotonicity we have $0 \le c(T, k) \le 1$. Hence, if we define a set of functions $s(x, y) : X \times Y \to \mathbb{R}$, where $X, Y \subseteq \mathbb{R}_{\ge 0}$ are compact sets, by

$$\mathcal{S}(X \times Y) = \left\{ (x, y) \mapsto s(x, y) : \forall x_1 < x_2 \in X,\ y_1 < y_2 < y_3 \in Y,\ 0 \le s \le 1,\ s(x_1, \cdot) \le s(x_2, \cdot),\ -1 \le \frac{s(\cdot, y_2) - s(\cdot, y_1)}{y_2 - y_1} \le \frac{s(\cdot, y_3) - s(\cdot, y_2)}{y_3 - y_2} \le 0 \right\}, \tag{3}$$

then no arbitrage can be constructed on the static surface $(T, k) \mapsto c(T, k)$ if $c \in \mathcal{S}(\mathbb{R}_{>0} \times \mathbb{R}_{\ge 0})$. Consequently, no static arbitrage can be constructed from the finite collection of prices if

$$\exists\, c \in \mathcal{S}\left( \mathcal{T}^e \times \left[ 0, \max_{i,j} k^i_j \right] \right), \ \text{s.t.}\ \forall (T_i, k^i_j) \in \mathcal{P}_{T,k},\ c(T_i, k^i_j) = c^i_j, \tag{4}$$

where $\mathcal{P}_{T,k} = \{(T_i, k^i_j)\}_{1 \le j \le n_i,\, 1 \le i \le m}$.

Condition (4) can be characterised by practically verifiable constraints on the prices $c$. We slightly revise Cousot's construction (Definitions 2.1 to 2.3 in [9]). We augment the given price data with the price that corresponds to a call struck at 0 for each expiry. This means $\forall i \in \{1, \cdots, m\}$ we add $K^i_0 = 0$ and $C^i_0 = D_i F_i$, or equivalently $k^i_0 = 0$ and $c^i_0 = 1$. This augmentation is necessary to check arbitrage relationships between call options and forwards. Define, for any $k^{i_1}_{j_1} > k^{i_2}_{j_2}$, where $1 \le i_1, i_2 \le m$, $0 \le j_1 \le n_{i_1}$, and $0 \le j_2 \le n_{i_2}$,

$$\beta(i_1, j_1; i_2, j_2) := \frac{c^{i_1}_{j_1} - c^{i_2}_{j_2}}{k^{i_1}_{j_1} - k^{i_2}_{j_2}}, \tag{5}$$

which can be viewed as the slope of the straight line passing through the two points $(k^{i_1}_{j_1}, c^{i_1}_{j_1})$ and $(k^{i_2}_{j_2}, c^{i_2}_{j_2})$, if we plot all prices on the $(k, c)$ plane. We will employ $\beta(\cdot)$ to define the price of some test strategies.

Definition 1. A test spread strategy is defined $\forall\, 1 \le i_1 \le i_2 \le m$, and $\forall\, 0 \le j_1 \le n_{i_1}$, $0 \le j_2 \le n_{i_2}$ such that $k^{i_1}_{j_1} \ge k^{i_2}_{j_2}$, by

$$S^{i_1, i_2}_{j_1, j_2} = \begin{cases} -\beta(i_1, j_1; i_2, j_2) & \text{if } k^{i_1}_{j_1} > k^{i_2}_{j_2}, \\ c^{i_2}_{j_2} - c^{i_1}_{j_1} & \text{if } k^{i_1}_{j_1} = k^{i_2}_{j_2}. \end{cases}$$

In particular, there are three types of test spread strategies:
(1) Vertical spread: $\mathrm{VS}^{i}_{j_1, j_2} = S^{i, i}_{j_1, j_2}$ with $k^{i}_{j_1} > k^{i}_{j_2}$.
(2) Calendar spread: $\mathrm{CS}^{i_1, i_2}_{j_1, j_2} = S^{i_1, i_2}_{j_1, j_2}$ with $k^{i_1}_{j_1} = k^{i_2}_{j_2}$ and $i_1 < i_2$.
(3) Calendar vertical spread: $\mathrm{CVS}^{i_1, i_2}_{j_1, j_2} = S^{i_1, i_2}_{j_1, j_2}$ with $k^{i_1}_{j_1} > k^{i_2}_{j_2}$ and $i_1 < i_2$.

Definition 2. A test butterfly strategy is defined $\forall\, i, i_1, i_2 \in [1, m]$ s.t. $i \le i_1$ and $i \le i_2$, $\forall\, j \in [0, n_i]$, $j_1 \in [0, n_{i_1}]$, $j_2 \in [0, n_{i_2}]$ such that $k^{i_1}_{j_1} < k^{i}_{j} < k^{i_2}_{j_2}$, by

$$B^{i, i_1, i_2}_{j, j_1, j_2} = -\beta(i, j; i_1, j_1) + \beta(i_2, j_2; i, j).$$

In particular, there are two types of test butterfly strategies:
(1) Vertical butterfly: $\mathrm{VB}^{i}_{j, j_1, j_2} = B^{i, i, i}_{j, j_1, j_2}$.
(2) Calendar butterfly: $\mathrm{CB}^{i, i_1, i_2}_{j, j_1, j_2} = B^{i, i_1, i_2}_{j, j_1, j_2}$ where $i$, $i_1$ and $i_2$ are not all equal.

Based on these definitions of test strategies, we restate Cousot's constraints for no-arbitrage in the following proposition.

Proposition 1 (Cousot [8], [9]). All test strategies are non-negative, and all test vertical spreads are not greater than 1, if and only if there exist $m$ risk-neutral measures $\{\mathbb{Q}_{T_i}\}_{1 \le i \le m}$ corresponding to all option expiries, that are NDCO. In addition, all their means are equal to $M_0 = 1$.

Together with Kellerer's theorem [27], Proposition 1 gives sufficient conditions for the existence of a $\mathbb{Q}$-martingale (thus no static arbitrage), in terms of constraints on prices of the test strategies. Those constraints are also necessary for no static arbitrage if semi-static strategies are allowed to exploit arbitrage opportunities, as proved by Cousot in Appendix A of [8].

Cousot's constraints contain redundancies. For instance, if two vertical spreads $\mathrm{VS}^{i}_{j_2, j_1}$ and $\mathrm{VS}^{i}_{j_3, j_2}$ (where $k^{i}_{j_1} < k^{i}_{j_2} < k^{i}_{j_3}$) are non-negative, then $\mathrm{VS}^{i}_{j_3, j_1} \ge 0$ [...] to $O(m^2 N)$ by localisation on the surface. Localisation can successfully reduce the number of constraints because the shape constraints specified in (3) include only boundedness, positivity, monotonicity and convexity, which are all local properties. The reduced set of constraints is listed in Table 1, where the (order of the) number of constraints in each category is also indicated.

We give details of the localisation method in Appendix A. We claim that the reduced set of constraints listed in Table 1 is sufficient to imply Cousot's constraints, and is thus sufficient and necessary to guarantee no-arbitrage, as stated in Proposition 2.

Proposition 2. If the constraints C1 to C6 are satisfied, then all test strategies are non-negative, and all test vertical spreads are not greater than 1.

Proof. See Appendix B.
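To make the single-expiry part of the reduced constraints concrete, the following Python sketch assembles the outright, vertical spread and vertical butterfly constraints (C1 to C3 of Table 1) for one expiry as a linear system of the form "matrix times price vector is at least a bound vector". It is a minimal illustration under stated assumptions rather than the authors' code: the names `single_expiry_constraints` and `violated_rows` are hypothetical, NumPy is assumed to be available, strikes are normalised, positive and strictly increasing, and the augmented point $(k_0, c_0) = (0, 1)$ is treated as a constant that is moved to the right-hand side. The calendar constraints C4 to C6 are not covered.

```python
import numpy as np


def single_expiry_constraints(k):
    """Return (A, b) such that A @ c >= b encodes C1-C3 for one expiry.

    k: strictly increasing, positive normalised strikes k_1 < ... < k_n.
    c: the matching normalised call prices c_1, ..., c_n (not needed here).
    """
    k_aug = np.concatenate(([0.0], np.asarray(k, dtype=float)))  # add k_0 = 0
    n = len(k_aug) - 1
    rows, rhs = [], []

    def add(coeffs, bound):
        row = np.zeros(n + 1)  # coefficients on (c_0, c_1, ..., c_n)
        for idx, val in coeffs:
            row[idx] += val
        rows.append(row)
        rhs.append(bound)

    add([(n, 1.0)], 0.0)                      # C1: c_n >= 0
    for j in range(1, n + 1):                 # C2: VS_{j,j-1} >= 0
        w = 1.0 / (k_aug[j] - k_aug[j - 1])
        add([(j - 1, w), (j, -w)], 0.0)
    w = 1.0 / k_aug[1]
    add([(0, -w), (1, w)], -1.0)              # C2: VS_{1,0} <= 1
    for j in range(1, n):                     # C3: VB_{j,j-1,j+1} >= 0
        wl = 1.0 / (k_aug[j] - k_aug[j - 1])
        wr = 1.0 / (k_aug[j + 1] - k_aug[j])
        add([(j - 1, wl), (j, -wl - wr), (j + 1, wr)], 0.0)

    full, bound = np.array(rows), np.array(rhs)
    A = full[:, 1:]                 # coefficients on the observed prices
    b = bound - full[:, 0] * 1.0    # c_0 = 1 is known, so move it to the bound
    return A, b


def violated_rows(A, b, c, tol=1e-12):
    """Indices of constraints with A @ c < b (beyond a small tolerance)."""
    return np.flatnonzero(A @ np.asarray(c, dtype=float) < b - tol)
```

With $n$ strikes the sketch produces $1 + (n + 1) + (n - 1) = 2n + 1$ rows, which agrees with the counts $m$, $N + m$ and $N - m$ in Table 1 for $m = 1$ and $N = n$.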

The static arbitrage constraints in Table 1 are linear inequalities of at most three call prices. Therefore, we can write these constraints in the form $A\mathbf{c} \ge \mathbf{b}$, where $\mathbf{c} = [c^1_1 \cdots c^1_{n_1} \cdots c^m_{n_m}]^\top \in \mathbb{R}^N$, and $A = (a_{ij}) \in \mathbb{R}^{R \times N}$ and $\mathbf{b} = (b_j) \in \mathbb{R}^R$ are a constant matrix and a vector corresponding to coefficients and bounds of the inequalities, respectively, that are completely determined by the expiries and strikes of observed options.

Table 1: The reduced set of static arbitrage constraints

C1 Outright (number: $m$)
    $\forall i \in [1, m]$: $c^i_{n_i} \ge 0$.
C2 Vertical spread (number: $N + m$)
    $\forall i \in [1, m]$, $j \in [1, n_i]$: $\mathrm{VS}^i_{j, j-1} \ge 0$, and $\mathrm{VS}^i_{1, 0} \le 1$.
C3 Vertical butterfly (number: $N - m$)
    $\forall i \in [1, m]$, $j \in [1, n_i - 1]$: $\mathrm{VB}^i_{j, j-1, j+1} \ge 0$.
C4 Calendar spread (number: $O(mN)$)
    $\forall\, 1 \le i_1 < i_2 \le m$, $j_1 \in [0, n_{i_1}]$, $j_2 \in [0, n_{i_2}]$: $\mathrm{CS}^{i_1, i_2}_{j_1, j_2} \ge 0$.
C5 Calendar vertical spread (number: $O(mN)$)
    $\forall i^* \in [1, m]$, $j^* \in [1, n_{i^*}]$, define $\mathcal{I} := \{i, j : T_i > T_{i^*},\ k^{i^*}_{j^*-1} < k^i_j < k^{i^*}_{j^*}\}$; then $\forall i, j \in \mathcal{I}$: $\mathrm{CVS}^{i^*, i}_{j^*, j} \ge 0$.
C6 Calendar butterfly (number: $O(m^2 N)$)
    $\forall i^* \in [1, m]$, $j^* \in [1, n_{i^*} - 1]$, define $\mathcal{I} := \{i, j : T_i > T_{i^*},\ k^{i^*}_{j^*-1} < k^i_j < k^{i^*}_{j^*}\}$; then $\forall i, j \in \mathcal{I}$: $\mathrm{CB}^{i^*, i, i^*}_{j^*, j, j^*+1} \ge 0$.
    $\forall i^* \in [1, m]$, $j^* \in [2, n_{i^*}]$, define $\mathcal{I} := \{i, j : T_i > T_{i^*},\ k^{i^*}_{j^*-1} < k^i_j < k^{i^*}_{j^*}\}$; then $\forall i, j \in \mathcal{I}$: $\mathrm{CB}^{i^*, i^*, i}_{j^*-1, j^*-2, j} \ge 0$.
    $\forall i^* \in [1, m]$, define $\mathcal{I} := \{i, j : T_i > T_{i^*},\ k^i_j > k^{i^*}_{n_{i^*}}\}$; then $\forall i, j \in \mathcal{I}$: $\mathrm{CB}^{i^*, i^*, i}_{n_{i^*}, n_{i^*}-1, j} \ge 0$.
    $\forall i^* \in [1, m]$, $j^* \in [1, n_{i^*} - 1]$, define $\mathcal{I}_1 := \{i, j : T_i > T_{i^*},\ k^{i^*}_{j^*-1} < k^i_j < k^{i^*}_{j^*}\}$ and $\mathcal{I}_2 := \{i, j : T_i > T_{i^*},\ k^{i^*}_{j^*} < k^i_j < k^{i^*}_{j^*+1}\}$; then $\forall i_1, j_1 \in \mathcal{I}_1$, $\forall i_2, j_2 \in \mathcal{I}_2$: $\mathrm{CB}^{i^*, i_1, i_2}_{j^*, j_1, j_2} \ge 0$.
    $\forall i^* \in [1, m]$, define $\mathcal{I}_1 := \{i, j : T_i > T_{i^*},\ k^{i^*}_{n_{i^*}-1} < k^i_j < k^{i^*}_{n_{i^*}}\}$ and $\mathcal{I}_2 := \{i, j : T_i > T_{i^*},\ k^i_j > k^{i^*}_{n_{i^*}}\}$; then $\forall i_1, j_1 \in \mathcal{I}_1$, $\forall i_2, j_2 \in \mathcal{I}_2$: $\mathrm{CB}^{i^*, i_1, i_2}_{n_{i^*}, j_1, j_2} \ge 0$.

Here, $R$ is the number of no-arbitrage constraints, where $R \sim O(m^2 N)$. These constraints are feasible by construction, i.e. $\{\mathbf{x} \in \mathbb{R}^N : A\mathbf{x} \ge \mathbf{b}\} \ne \emptyset$, because $\mathcal{S}(\mathcal{T}^e \times [0, \max_{i,j} k^i_j]) \ne \emptyset$ (for example, the prices under a Black–Scholes model satisfy the requirements).

When some row of the system of inequalities $A\mathbf{c} \ge \mathbf{b}$ is not satisfied, there is arbitrage. We define $\boldsymbol{\varepsilon}$ to be the vector of perturbations added to the vector of call prices $\mathbf{c}$ such that the perturbed prices are arbitrage-free, i.e. $A(\mathbf{c} + \boldsymbol{\varepsilon}) \ge \mathbf{b}$. Hence, to remove arbitrage from the call price data, we seek the "minimal" repair subject to no-arbitrage constraints:

$$\min_{\boldsymbol{\varepsilon} \in \mathbb{R}^N} f(\boldsymbol{\varepsilon}), \quad \text{subject to } A\boldsymbol{\varepsilon} \ge \mathbf{b} - A\mathbf{c}, \tag{6}$$

where the objective $f : \mathbb{R}^N \to \mathbb{R}$ measures how much the perturbation deviates from zero. The formulation (6) is feasible because its constraints are feasible.

We start from the simple case where there is no liquidity difference among options. It seems natural to use the $\ell^2$-norm for measuring the size of perturbations due to its convexity and computational efficiency when optimising by gradient-based methods. The $\ell^2$-norm has been widely used in data smoothing algorithms, such as [1], [16], [17] and [18].

However, the $\ell^2$-norm usually leads to small perturbations for all prices, while in our application sparse perturbation is desirable. An alternative is the $\ell^0$-norm, which is a natural way of comparing difference, and produces sparse solutions. Nevertheless, the $\ell^0$-norm is nonconvex and in general leads to an NP-hard [32] optimisation problem. Hence, it is natural to consider the $\ell^1$-norm, which is well known as a convex relaxation of the $\ell^0$-norm. In fact, optimal solutions of the $\ell^0$ and $\ell^1$ norm objectives are equivalent under certain conditions, see [5], [21] and [14].

(Footnote: Note that the $\ell^0$-norm is not actually a "norm" as it violates the homogeneity property that a vector norm must satisfy.)

Choosing the $\ell^1$-norm has other benefits. When minimising a convex continuous objective function like the $\ell^1$-norm, every local minimum is a global minimum, see Chapter 4 of [3]. In addition, our repair problem is a Linear Programming (LP) problem with the $\ell^1$-norm objective, which can be solved fairly quickly even for large-scale problems. Finally, compared with the $\ell^2$-norm, the $\ell^1$-norm is more robust to outliers because the $\ell^2$-norm squares values, which increases the cost of outliers quadratically, see Huber [24].

Consequently, the $\ell^1$-norm is a natural candidate for the objective function. Blacque-Florentin and Missaoui [2] also choose the $\ell^1$-norm as objective when fitting tensor polynomials to sparse data, as inspired by the compressed sensing framework. The differences between our work and theirs are that they are concerned with smoothing data rather than repairing data, and assume a rectangular grid of strikes and expiries.

The $\ell^1$-norm optimisation with linear constraints can be expressed as an LP problem. We write the objective function as

$$f(\boldsymbol{\varepsilon}) := \|\boldsymbol{\varepsilon}\|_{\ell^1} = \sum_{j=1}^{N} |\varepsilon_j| = \sum_{j=1}^{N} \left( \varepsilon^+_j + \varepsilon^-_j \right),$$

where $\varepsilon^+_j = \max(\varepsilon_j, 0)$, $\varepsilon^-_j = -\min(\varepsilon_j, 0)$ for each $j$. We denote $\boldsymbol{\varepsilon}^+ = [\varepsilon^+_1 \cdots \varepsilon^+_N]$ and $\boldsymbol{\varepsilon}^- = [\varepsilon^-_1 \cdots \varepsilon^-_N]$ so that $\boldsymbol{\varepsilon} = \boldsymbol{\varepsilon}^+ - \boldsymbol{\varepsilon}^-$. We define $B = [-A \ \ A]$ and $\boldsymbol{\theta} = [\boldsymbol{\varepsilon}^+ \ \boldsymbol{\varepsilon}^-]^\top$. Hence, the repair problem with the $\ell^1$-norm minimisation is equivalent to the following LP in canonical form:

$$\min_{\boldsymbol{\theta}} \mathbf{1}^\top \boldsymbol{\theta}, \quad \text{subject to } B\boldsymbol{\theta} \le A\mathbf{c} - \mathbf{b}, \ \boldsymbol{\theta} \ge \mathbf{0}. \tag{7}$$

Given an optimal solution $\hat{\boldsymbol{\theta}} = [\hat{\boldsymbol{\varepsilon}}^+ \ \hat{\boldsymbol{\varepsilon}}^-]^\top$ of (7), the optimal perturbation vector is recovered by $\hat{\boldsymbol{\varepsilon}} = (\hat{\boldsymbol{\varepsilon}}^+ - \hat{\boldsymbol{\varepsilon}}^-)^\top$.

The reference prices will typically lie within their corresponding bid-ask price bounds. In the presence of arbitrage, we not only want minimal repair, but also wish to have as many perturbed prices falling within the bid-ask price bounds as possible. Specifically, a reference price with wider bid-ask spread shall be given more freedom to be perturbed. The sparsity of the solution of the $\ell^1$-norm optimisation is less desirable if perturbing a larger number of prices can keep more perturbed prices within the bid-ask price bounds.

Design of the objective with bid and ask prices
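As a baseline for the objective designed in this part, the plain $\ell^1$ repair in canonical form (7) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes NumPy and SciPy are available and uses `scipy.optimize.linprog` with the HiGHS backend; the function name `repair_l1` and the toy constraint system in the usage note are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def repair_l1(A, b, c):
    """Solve min ||eps||_1 subject to A (c + eps) >= b via the LP (7).

    The unknown is theta = [eps_plus, eps_minus] >= 0 with eps = eps_plus - eps_minus.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    c = np.asarray(c, dtype=float)
    n = A.shape[1]
    B = np.hstack([-A, A])                      # B = [-A  A]
    res = linprog(
        c=np.ones(2 * n),                       # objective 1^T theta
        A_ub=B,
        b_ub=A @ c - b,                         # B theta <= A c - b
        bounds=[(0, None)] * (2 * n),           # theta >= 0
        method="highs",
    )
    if not res.success:
        raise RuntimeError(res.message)
    theta = res.x
    return theta[:n] - theta[n:]                # eps = eps_plus - eps_minus
```

As a usage example, take two normalised prices at one expiry with the toy constraints $c_1 - c_2 \ge 0$ and $c_2 \ge 0$, i.e. `A = [[1, -1], [0, 1]]` and `b = [0, 0]`, and observed prices `c = [0.05, 0.08]`. The monotonicity constraint is violated by 0.03, so any optimal repair has total absolute perturbation 0.03; how that amount is split between the two prices is not unique in this toy case.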

We consider using the best bid/ask prices for data repair. To incorporate bid-ask price constraints into the repair problem, we revise the objective function $f$ rather than adding extra constraints. In other words, we treat bid-ask price bounds as soft constraints rather than hard constraints like the arbitrage constraints. There may not be arbitrage-free prices within the bid-ask price bounds, and adding bid-ask price bounds as hard constraints may cause the repair problem to be infeasible.

We choose an objective function of the form $f(\boldsymbol{\varepsilon}) = \sum_{j=1}^{N} f_j(\varepsilon_j)$ with $f_j(x) \ge 0$ for $x \in \mathbb{R}$. Then $f_j(x)$ can be naturally interpreted as the cost of perturbing the $j$-th option price, and $\mathrm{d}f_j(x)/\mathrm{d}|x| > 0$ as its marginal cost. The $\ell^1$-norm objective sets $f_j(x) = |x| = \max(-x, x)$ and any perturbation $x$, where $|x| > 0$, has marginal cost 1 for all $j$. Let $\delta^a_j, \delta^b_j > 0$ be the distances from the reference price to the ask and bid prices for the $j$-th price, respectively. To incorporate these spreads into the objective, we require that $f_j(x)$ should have the following properties, for all $j \in [1, N]$:
(1) $f_j(0) = \inf_x f_j(x) = 0$. The minimum is attained when there is no perturbation, which is costless to the objective;
(2) $f_j(x)$ is monotonically increasing (decreasing) for $x > 0$ ($x < 0$);
(3) $f_j(-\delta^b_j) = f_j(\delta^a_j) = \delta_0$, where $\delta_0 \ge 0$ does not depend on $j$;
(4) $\mathrm{d}f_j(x)/\mathrm{d}|x| = 1$ for $x \in (-\infty, -\delta^b_j) \cup (\delta^a_j, +\infty)$. The marginal cost of perturbing a price out of the bid-ask price bounds is the same for all options.

We therefore propose the following objective that meets all the properties, and, with particular merit, retains the ability to be expressed as an LP:

$$f_j(x) = \max\left( -x - \delta^b_j + \delta_0,\ -\frac{\delta_0}{\delta^b_j} x,\ \frac{\delta_0}{\delta^a_j} x,\ x - \delta^a_j + \delta_0 \right),$$

where $\delta_0 \le \min(\delta^a_j, \delta^b_j)$ for all $j \in [1, N]$, so that the marginal cost of perturbing a price within the bid-ask price band is not greater than the marginal cost of perturbing mid prices outside the bid-ask price bounds.

Figure 2: Plot of the objective function components (a) $f^{\ell^1}_j(x)$ and (b) $f_j(x)$.

Denote $f^{\ell^1}_j$ as the $j$-th component of the $\ell^1$-norm objective. We visualise the difference between $f^{\ell^1}_j$ and $f_j$ in Figure 2. Note that $f^{\ell^1}_j$ is a special case of $f_j$ when $\delta^a_j = \delta^b_j = \delta_0 > 0$ for all $j$. Choosing smaller $\delta_0$ makes it relatively more costly to move prices outside of their bid-ask price bounds. Nevertheless, letting $\delta_0 = 0$ causes the optimisation problem to be ill-posed as it admits infinitely many solutions. For example, if $\varepsilon^*_j = 0$ is optimal, then so is $\varepsilon^*_j = \omega \min(\delta^a_j, \delta^b_j)$ for all $\omega \in [0, 1]$. We choose

$$\delta_0 = \frac{1}{N} \wedge \min_{j = 1, \ldots, N} \left( \delta^a_j \wedge \delta^b_j \right). \tag{8}$$

This means we prefer to move all options (by $\boldsymbol{\varepsilon}$) within the bid-ask, rather than moving one option outside its bid-ask bounds.

Hence, the objective function taking into account bid-ask spread is

$$f(\boldsymbol{\varepsilon}) = \sum_{j=1}^{N} \max\left( -\mathbf{e}_j^\top \boldsymbol{\varepsilon} - \delta^b_j + \delta_0,\ -\frac{\delta_0}{\delta^b_j} \mathbf{e}_j^\top \boldsymbol{\varepsilon},\ \frac{\delta_0}{\delta^a_j} \mathbf{e}_j^\top \boldsymbol{\varepsilon},\ \mathbf{e}_j^\top \boldsymbol{\varepsilon} - \delta^a_j + \delta_0 \right), \tag{9}$$

where $\mathbf{e}_j$ is the standard basis vector for $\mathbb{R}^N$ with its $j$-th element being 1 and others being 0. With objective (9), we can rewrite the repair problem (6) as the following LP by introducing auxiliary variables $\mathbf{t} = [t_1 \cdots t_N]^\top$:

$$\begin{aligned} \underset{\boldsymbol{\varepsilon},\, \mathbf{t}}{\text{minimise}} \quad & \sum_{j=1}^{N} t_j \\ \text{subject to} \quad & -\varepsilon_j - \delta^b_j + \delta_0 \le t_j, \quad \varepsilon_j - \delta^a_j + \delta_0 \le t_j, \quad \forall j \in [1, N], \\ & -\frac{\delta_0}{\delta^b_j} \varepsilon_j \le t_j, \quad \frac{\delta_0}{\delta^a_j} \varepsilon_j \le t_j, \quad \forall j \in [1, N], \\ & -A\boldsymbol{\varepsilon} \le -\mathbf{b} + A\mathbf{c}. \end{aligned} \tag{10}$$

After solving for the optimal perturbation vector $\hat{\boldsymbol{\varepsilon}}$, we get the arbitrage-free normalised call price $\hat{\mathbf{c}} = \mathbf{c} + \hat{\boldsymbol{\varepsilon}}$. For each $i, j$, the arbitrage-free call price is $\widehat{C}^i_j = \hat{c}^i_j D_i F_i$.

Executable arbitrage opportunities
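The repair LP (10) with the bid-ask aware objective, whose perturbations are interpreted in this part, can be sketched in Python as below. This is a hedged illustration, not the code in the authors' repository: it assumes NumPy and SciPy are available, uses `scipy.optimize.linprog` with the HiGHS backend, and the function name `repair_l1ba` and the toy inputs in the usage note are hypothetical. The decision vector stacks the perturbations and the auxiliary variables as `[eps, t]`.

```python
import numpy as np
from scipy.optimize import linprog


def repair_l1ba(A, b, c, delta_a, delta_b):
    """Solve LP (10): minimise sum(t) subject to the four epigraph cuts of f_j
    and the no-arbitrage constraints A (c + eps) >= b.

    delta_a, delta_b: positive distances from each reference price to its ask and bid.
    Returns (eps, delta0), with delta0 chosen as in (8).
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    c = np.asarray(c, dtype=float)
    da = np.asarray(delta_a, dtype=float)
    db = np.asarray(delta_b, dtype=float)
    n = c.size
    delta0 = min(1.0 / n, float(np.min(np.minimum(da, db))))   # equation (8)

    I = np.eye(n)
    A_ub = np.vstack([
        np.hstack([-I, -I]),                          # -eps - t <= delta_b - delta0
        np.hstack([I, -I]),                           #  eps - t <= delta_a - delta0
        np.hstack([-np.diag(delta0 / db), -I]),       # -(delta0/delta_b) eps - t <= 0
        np.hstack([np.diag(delta0 / da), -I]),        #  (delta0/delta_a) eps - t <= 0
        np.hstack([-A, np.zeros((A.shape[0], n))]),   # -A eps <= A c - b
    ])
    b_ub = np.concatenate([db - delta0, da - delta0,
                           np.zeros(n), np.zeros(n), A @ c - b])
    cost = np.concatenate([np.zeros(n), np.ones(n)])  # minimise sum of t_j
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * (2 * n), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n], delta0
```

As a usage example, consider the toy constraints $c_1 - c_2 \ge 0$ and $c_2 \ge 0$ with reference prices `c = [0.05, 0.08]`, a tight spread `0.001` on each side of the first price and a wide spread `0.04` on each side of the second. Moving the second price within its band has marginal cost $\delta_0 / 0.04 = 0.025$, far below the marginal cost 1 of moving the first price, so the repair lowers the second price by 0.03 and leaves the first untouched; no price leaves its bid-ask bounds.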

We refer to the objective function taking into account bid-ask spread with $\delta_0$ as in (8) as the $\ell^1$-BA objective. We define the effectively perturbed prices as those that are perturbed outside of the bid-ask price bounds. We denote the number of perturbed (resp. effectively perturbed) prices by $N_\varepsilon$ (resp. $N_{\varepsilon,\delta}$), thus

$$N_\varepsilon = \sum_{j=1}^{N} \mathbb{1}_{\{|\varepsilon_j| > 0\}}, \quad N_{\varepsilon,\delta} = \sum_{j=1}^{N} \mathbb{1}_{\{\varepsilon_j > \delta^a_j\} \cup \{\varepsilon_j < -\delta^b_j\}}. \tag{11}$$

We say an arbitrage is executable if we can realise it by buying and selling its components at their ask and bid quotes, respectively. The arbitrage detected in options' reference prices is not necessarily executable. However, if the $\ell^1$-BA repair results in effective perturbations, i.e. $N_{\varepsilon,\delta} > 0$, then there must exist executable arbitrages. To see this, let $E_j = [c_j - \delta^b_j, c_j + \delta^a_j]$, and we can characterise $N_{\varepsilon,\delta} > 0$ as:

$$\text{if } \forall i \in [1, R],\ \sum_{j=1}^{N} a_{ij} \hat{c}_j \ge b_i, \ \text{then } \exists j \in [1, N] \text{ s.t. } \hat{c}_j \notin E_j.$$

Equivalently, its contrapositive statement is

$$\text{if } \forall j \in [1, N],\ \hat{c}_j \in E_j, \ \text{then } \exists i^* \in [1, R] \text{ s.t. } \sum_{j=1}^{N} a_{i^* j} \hat{c}_j < b_{i^*}.$$

Therefore, it holds that

$$\sum_{j=1}^{N} a_{i^* j} \left[ (c_j + \delta^a_j) \mathbb{1}_{\{a_{i^* j} \ge 0\}} + (c_j - \delta^b_j) \mathbb{1}_{\{a_{i^* j} < 0\}} \right] < b_{i^*}.$$

By going long on the left-hand side and going short on the right-hand side of the inequality, we construct a portfolio that makes immediate positive profit, while the portfolio has non-negative future payoffs. The left-hand side of the inequality consists of positions in options, for which we buy at ask price $(c_j + \delta^a_j)$ and sell at bid price $(c_j - \delta^b_j)$.

We carry out a series of empirical studies. We show that arbitrage is frequently present in historical price data, so repairing data is important. We also demonstrate empirical performances of the repair method in terms of sparsity, speed and improvement to model calibration. Last, we use the $\ell^1$-BA repair for identifying the formation and disappearance of arbitrage in the intra-day S&P 500 options market on a day when the market underwent a regime switch.

We collect daily close (bid, ask and mid) prices from 1st November, 2007 to 31st May, 2018 for OTC FX options from Bloomberg. Bloomberg provides prices quoted as implied volatilities given in terms of delta. We choose 13 benchmark tenors (expiries) from overnight (one-day) to two-year. For each tenor, a list of standard instruments is available: at-the-money (ATM), risk-reversal (RR) and butterfly (BF). We choose the liquid 10-delta, 15-delta, 25-delta and 35-delta instruments, and construct a vanilla volatility smile of 9 moneynesses for each tenor.
Following the OTC FX market conventions [37], we compute strike and time-to-expiry for each IV mid quote, and generate vanilla IV spreads from the bid/ask quotes for the instruments. Thereafter, we calculate mid call prices and vanilla call price spreads using the mid vanilla IVs and generated vanilla IV spreads, together with Bloomberg FX mid forward curves. There are 117 = 13 (tenors) × 9 (moneynesses) call prices per day.

[Figure 3 panels: Call price surface $C(T, K)$; Implied vol surface $\sigma(T, K)$. Axes: expiry $T$ and strike $K$.]

Figure 3: An example of observed OTC-traded call option prices. These are end-of-day prices settled by Bloomberg for EURUSD European call options as of 31st May, 2018.

We count violations of arbitrage constraints in raw daily close mid-prices over time for some major currencies and emerging market (EM) currencies. In Figure 4, we see that there are more arbitrages in the EM currency markets. We also see persistent clustering of (mild) arbitrages from early 2007 to mid 2012 in major currency markets. Further investigation suggests that these are caused by over-priced 1-day options, which result in calendar arbitrages with longer-dated options. We conjecture that the systematic appearance of the same type of arbitrage is due to Bloomberg's legacy data cleansing method.

[Footnote] Given the instrument bid-ask spreads for ATM, RR and BF, one cannot uniquely determine the corresponding vanilla spreads without specifying some rule. For example, in practice, trading desks may estimate vanilla spreads only using ATM spreads, which makes the spread of each option at the same expiry equal, see Section 4.2.1 of [37]. Since vanilla IVs are linear transformations of instrument IVs, we conservatively assume that vanilla spreads are weighted sums of instrument spreads. This does not take into account that delta-symmetric vanilla spreads are dependent on each other, and generates the widest possible bid-ask spreads for vanilla IVs.

[Figure 4 panels: Major currencies (AUDUSD, EURUSD, USDCHF, USDJPY); Major currencies (GBPUSD, USDCAD); Cross currencies (EURGBP); Emerging market currencies (USDBRL, USDKRW, USDMXN). x-axis: Time, 2008 to 2018.]

Figure 4: Time series of number of daily violated arbitrage constraints in OTC FX option market, during the period from 1st November, 2007 to 31st May, 2018.

Calendar arbitrage (especially CVS C5 and CBS C6) is more difficult and costly to exploit than non-calendar arbitrages, as it requires rebalancing the hedging portfolio over time. Most arbitrage-free smoothing algorithms in the literature only remove calendar arbitrage of C4 type, because they assume a rectangular grid of expiries and strikes. However, calendar arbitrage can be a major source of arbitrage. In Figure 5, we consider what fraction of the arbitrages are of calendar type for different currency pairs. Comparing medians (and overall distributions), as shown in the plot, the proportion of calendar arbitrages for major currencies (AUD, EUR, GBP, CAD, CHF, and JPY) is larger than that for EM currencies (BRL, KRW, and MXN), though the cross pair EURGBP is an exception. In fact, the medians are very close to 100% for almost all major currencies except sterling. In other words, nearly all arbitrages in major currencies' option markets are calendar ones.

[Figure 5: x-axis: currency pairs AUDUSD, EURUSD, GBPUSD, USDCAD, USDCHF, USDJPY, EURGBP, USDBRL, USDKRW, USDMXN; y-axis: Percent of all arbitrages (0 to 100).]

Figure 5: Fraction of calendar arbitrages on a given day, for different currency pairs during the period from 1st November, 2007 to 31st May, 2018. The light blue shadow is a violin plot which indicates the kernel density of the percentages, and the red notched box is a box plot. The horizontal short bar shows the median of each sample.

We examine the day when the EURUSD option price data have the most occurrences of calendar arbitrages over our observation period, and plot the call price curves for the first three expiries in Figure 6. There is no non-calendar arbitrage on that day since each curve is non-increasing and convex. After the repair, the $T$-curve is pushed downwards until it does not lie beyond the other two curves, which ensures NDCO marginal risk-neutral measures.

[Figure 6 panels: Raw; Repaired; Perturbations. x-axis: Strike $k$; y-axes: Call price $c(T, k)$ and Perturbation.]

Figure 6:

An example of arbitrage repair for EURUSD call options on 2nd April, 2015.

Left – raw call price curves for the ﬁrst three expiries.

Middle – repaired arbitrage-free call price curves.

Right – perturbations added to each data point.
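The statement that a curve carries no non-calendar arbitrage when it is non-increasing and convex can be checked numerically. The following is a minimal sketch rather than the authors' implementation: the function name `is_non_increasing_and_convex` and the sample strikes and prices are hypothetical, and the check covers only the monotonicity and convexity of a single-expiry curve, not the full constraint set.

```python
def is_non_increasing_and_convex(strikes, prices, tol=1e-12):
    """Check one expiry's call price curve: non-increasing and convex in strike.

    `strikes` must be strictly increasing; `prices` are call prices at those strikes.
    """
    # Slopes of the segments joining consecutive (strike, price) points.
    slopes = [
        (prices[j + 1] - prices[j]) / (strikes[j + 1] - strikes[j])
        for j in range(len(strikes) - 1)
    ]
    # Non-increasing curve: every segment slope is at most zero.
    non_increasing = all(s <= tol for s in slopes)
    # Convexity on a non-uniform strike grid: slopes must not decrease.
    convex = all(slopes[j + 1] >= slopes[j] - tol for j in range(len(slopes) - 1))
    return non_increasing and convex


# Hypothetical data for illustration only.
strikes = [0.96, 0.98, 1.00, 1.02]
good_curve = [0.050, 0.034, 0.021, 0.012]
bad_curve = [0.050, 0.034, 0.030, 0.012]  # the point at strike 1.00 breaks convexity

print(is_non_increasing_and_convex(strikes, good_curve))
print(is_non_increasing_and_convex(strikes, bad_curve))
```

Working with segment slopes rather than second differences keeps the check valid when strikes are unevenly spaced.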

However, when calendar and non-calendar arbitrages are mixed, the perturbations added to ensure no arbitrage tend to be more varied in signs. For instance, in Figure 7 we plot the call price curves for the first four expiries on the day when USDBRL options had the most occurrences of calendar arbitrage; on that day there are also many non-calendar arbitrages. Unlike the above EURUSD example, the repair does not simply translate any curve. Therefore, the perturbations are not systematically negative.

[Figure 7 panels: Raw; Repaired; Perturbations. x-axis: Strike $k$; y-axes: Call price $c(T, k)$ and Perturbation.]

Figure 7:

An example of arbitrage repair for USDBRL call options on 28th October, 2008.

Sparse solution of the $\ell^1$-norm objective

The $\ell^1$-norm objective leads to sparse perturbations. We show the fraction of perturbed prices in Figure 8. The medians are very close to zero for all currency pairs, indicating that very few data points need to be perturbed on average to remove arbitrage. This is especially true for major currencies, as their distributions collapse almost entirely to zero.

[Figure 8: x-axis: currency pairs AUDUSD, EURUSD, GBPUSD, USDCAD, USDCHF, USDJPY, EURGBP, USDBRL, USDKRW, USDMXN; y-axis: Percent of all quotes (0 to 60).]

Figure 8: Number of perturbed prices as a percentage of all prices, for different currency pairs during the period from 1st November, 2007 to 31st May, 2018.
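The perturbation counts $N_\varepsilon$ and $N_{\varepsilon,\delta}$ of (11), and the percentage reported in Figure 8, are simple to compute once a perturbation vector is available. The sketch below is illustrative only: `count_perturbations` is a hypothetical helper, the numbers are made up, and the optional tolerance `tol` is an assumption added to absorb solver round-off.

```python
def count_perturbations(eps, delta_ask, delta_bid, tol=0.0):
    """Return (N_eps, N_eps_delta) as defined in (11).

    eps       : perturbations added to the reference prices
    delta_ask : distances from reference price up to the ask
    delta_bid : distances from reference price down to the bid
    """
    # Perturbed prices: any non-zero perturbation (beyond the tolerance).
    n_eps = sum(1 for e in eps if abs(e) > tol)
    # Effectively perturbed prices: pushed outside [c - delta_bid, c + delta_ask].
    n_eff = sum(
        1 for e, da, db in zip(eps, delta_ask, delta_bid) if e > da or e < -db
    )
    return n_eps, n_eff


# Hypothetical perturbations and half-spreads for five quotes.
eps = [0.0, 0.02, -0.05, 0.0, 0.01]
delta_ask = [0.03] * 5
delta_bid = [0.03] * 5

n_eps, n_eff = count_perturbations(eps, delta_ask, delta_bid)
percent_perturbed = 100.0 * n_eps / len(eps)
print(n_eps, n_eff, percent_perturbed)
```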

Computational time

Our data repair method is designed to be fast due to the LP formulation. In addition, the reduction of arbitrage constraints shrinks the scale of the LP and speeds up the repair. We investigate the computational time of our repair method when applied to a few practical cases. All of the following studies were carried out on a quad-core Intel Core i7-8650U CPU with 32GB RAM. All LPs are solved using the GLPK (GNU Linear Programming Kit) solver wrapped by the CVXOPT [30] Python package.

In Figure 9, we plot histograms of (1) the number of constraints $R \sim \mathcal{O}(mN)$, (2) the fraction of violated constraints, and (3) the elapsed times for constructing the constraints (Table 1) and solving the LP (7). We take EURUSD and USDBRL as representatives of major currencies and EM currencies, respectively. The number of constraints rarely exceeds 4000, while it takes less than 0.4 seconds to transform them to the matrix form in most cases. Solving the LP with $N = 117$ variables and $R <$ [...]

[...] traded EURUSD call options listed by CME from 1st January, 2013 to 31st December, 2018. The number of traded options varies from one day to another, see Figure 10a. We show the distribution of traded expiry and strikes on a typical day in Figure 10b. In Figure 10c, we plot similar repair statistics to those for the OTC data. On average, there are 500 call prices per day, which result in on average 25000 arbitrage constraints to verify. Though the number of constraints is observed as high as 90000, it takes less than 1 second to construct them in the matrix form. Solving the LP now can take up to 6 seconds, but on average it only takes 1.44 seconds.
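To make the LP whose solve times are reported above concrete, here is a minimal sketch of an $\ell^1$-norm repair for constraints of the form $\sum_j a_{ij} \hat{c}_j \ge b_i$. It is not the authors' code: it uses SciPy's `linprog` with the HiGHS backend instead of the GLPK/CVXOPT stack named in the text, the function name `l1_repair` is hypothetical, and the two-option example with its constraint matrix is invented for illustration.

```python
import numpy as np
from scipy.optimize import linprog


def l1_repair(c, A, b):
    """Minimise ||eps||_1 subject to A @ (c + eps) >= b.

    The perturbation is split as eps = eps_plus - eps_minus with both parts
    non-negative, which makes the l1 objective linear.
    """
    c = np.asarray(c, dtype=float)
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n = c.size
    cost = np.ones(2 * n)  # sum(eps_plus) + sum(eps_minus)
    # A @ (c + eps_plus - eps_minus) >= b  <=>  [-A, A] @ x <= A @ c - b
    A_ub = np.hstack([-A, A])
    b_ub = A @ c - b
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n] - res.x[n:]


# Hypothetical example: two calls with the same expiry and strikes k1 < k2.
# Constraints: c1 - c2 >= 0 (vertical spread) and c2 >= 0 (outright).
c = np.array([1.0, 1.2])          # the higher strike is priced above the lower one
A = np.array([[1.0, -1.0], [0.0, 1.0]])
b = np.array([0.0, 0.0])

eps = l1_repair(c, A, b)
print(c + eps, np.abs(eps).sum())
```

The minimal total adjustment in this toy case is 0.2, but how it is split between the two prices is not unique, so only the $\ell^1$ norm of the result should be relied upon.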

[Figure 9 panels, for (a) EURUSD options and (b) USDBRL options: Number of constraints; Percentage of violated constraints; Elapsed time in seconds (Build Constraints, Solve Optimisation).]

Figure 9: Histograms of various statistics for repairing data of EURUSD options and USDBRL options, during the period from 1st November, 2007 to 31st May, 2018.

[Figure 10 panels: (a) Histogram of the number of traded call options per day. (b) The distribution of $(T, K)$ for all traded EURUSD call options on 31st May, 2018; axes: Strike $K$ and $T$ (year fraction). (c) Histograms of various statistics: Number of constraints; Percentage of violated constraints; Elapsed time in seconds (Build Constraints, Solve Optimisation).]

Figure 10: Statistics for repairing data of CME-listed EURUSD options during the period from 1st January, 2013 to 31st December, 2018.

Stress testing the $\ell^1$-norm objective repair

We test how our repair method works in hypothetical extreme scenarios when there is massive arbitrage. First, we collect arbitrage-free call prices for a day, denote these data by $c \in \mathbb{R}^N$, and let $\mathcal{I} = \{1, \dots, N\}$ be the set of its indices. Next, we simulate noises and add them to a portion $\lambda \in (0, 1]$ of the price data, where we denote $\mathcal{I}_\xi \subset \mathcal{I}$ as the set of indices of those polluted prices. Here, $\mathcal{I}_\xi$ is randomly sampled without replacement such that $|\mathcal{I}_\xi| = \lceil \lambda |\mathcal{I}| \rceil$. Constructing the noise $\xi = (\xi_j) \in \mathbb{R}^N$ by taking $\xi_j = \zeta_j \mathbb{1}_{\{j \in \mathcal{I}_\xi\}}$ where $\zeta = (\zeta_j)$ are i.i.d., we then define the noisy price $\tilde{c} \in \mathbb{R}^N$ by

$$\tilde{c}_j = c_j e^{\xi_j}, \quad \forall\, 1 \le j \le N.$$

Repairing $\tilde{c}$ yields a perturbation vector $\varepsilon \in \mathbb{R}^N$. The perturbed arbitrage-free price vector is $\hat{c} = \tilde{c} + \varepsilon$. An example of $c$, $\tilde{c}$, and $\hat{c}$ is given in Figure 11.

[Figure 11 panels: Arbitrage-free surface $c$; Surface with noises $\tilde{c}$; Repaired surface $\hat{c}$.]

Figure 11:

An example used for stress testing the repair method. There are $N = 591$ prices in total. The data used are CME-traded EURUSD options' prices on 31st May, 2018.

We assess how well $c$ is recovered by examining the quantities $\ln(\hat{c}_j / c_j)$ and $\hat{\lambda} := \frac{1}{N} \sum_{j=1}^{N} \mathbb{1}_{\{\hat{c}_j \ne c_j\}}$. Note that $\ln(\hat{c}_j / c_j) \approx (\hat{c}_j - c_j) / c_j$ (when $\hat{c}_j \approx c_j$). In addition, $\hat{\lambda}$ counts the portion of different prices per call surface. It is unrealistic to expect any repair method to fully recover a price as it is unlikely to know the exact marginal that generates the price. However, given the ground truth that a portion $\lambda$ of the surface prices has been polluted by noise, a desirable data repair method should leave as many unpolluted prices unchanged as possible, i.e. $\hat{\lambda} - \lambda$ should be small.

Assuming Gaussian noises $\zeta \sim N(0, \sigma_\xi^2 I)$, we simulate noises $M$ times, and compute the average value of $\hat{\lambda}$ and plot the histograms of $\ln(\hat{c}_j / c_j)$, conditional on non-zero values, as shown in Figure 12. For a fixed noise magnitude $\sigma_\xi$, the gap between $\hat{\lambda}$ and $\lambda$ widens as $\lambda$ increases, i.e. the repair method adjusts a larger number of prices to remove arbitrage if there are more noisy prices. The same observation holds for different values of $\sigma_\xi$, though larger noise magnitude $\sigma_\xi$ results in more arbitrages. Note that taking $\sigma_\xi = 1$ and $\lambda = 25\%$ already results in, on average, $\hat{\lambda} = 30.80\%$ of the price data being perturbed, an extremely large fraction that has rarely been seen in our data, see Figure 8. Hence, in practice our repair method seems to only perturb a few additional (i.e. $\hat{\lambda} - \lambda \approx 5\%$ for $\lambda = 25\%$) prices to ensure no arbitrage.

[Figure 12 panels: histograms of $\ln(\hat{c}/c)$ for $\lambda = 10\%$, $25\%$ and $50\%$, with (a) $\sigma_\xi = 1$ and (b) $\sigma_\xi = 2$; each panel is annotated with the average $\hat{\lambda}$.]

Figure 12: Histograms of $\ln(\hat{c}_j / c_j) \approx (\hat{c}_j - c_j) / c_j$, conditional on non-zero values, computed under differently valued noise simulation parameters $(\lambda, \sigma_\xi)$. We simulate $M = 100$ times.

Comparing the objectives: $\ell^1$-norm and $\ell^1$-BA

The $\ell^1$-BA repair is designed to perturb more prices (larger $N_\varepsilon$) than the $\ell^1$-norm repair does, but fewer of them are effective (smaller $N_{\varepsilon,\delta}$), if possible. To verify this, we apply the $\ell^1$-BA repair method to the same OTC FX option price data. In Figure 13, we show the histograms of the difference in these two statistics $N_\varepsilon$ and $N_{\varepsilon,\delta}$ that are produced by the two repair methods.

[Figure 13 panels: Number of price perturbations, $N_\varepsilon(\ell^1) - N_\varepsilon(\ell^1\text{-BA})$; Number of effective price perturbations, $N_{\varepsilon,\delta}(\ell^1) - N_{\varepsilon,\delta}(\ell^1\text{-BA})$.]

Figure 13: Histograms of the difference in $N_\varepsilon$ and $N_{\varepsilon,\delta}$ that are produced by the $\ell^1$-norm repair method and the $\ell^1$-BA repair method. These two methods are separately applied to the same set of OTC FX data as in Figure 4, and the histograms are plotted by stacking data of all ten currency pairs and historical dates.

A detailed example showing how the two repair methods work in reality is given in Figure 14. From left to right, the displayed data are ordered by increasing strikes, grouped by expiry. The light blue areas are confined by bid-ask spread as a percentage of option prices (green lines). We see that ITM and OTM options have wider bid-ask spreads than ATM options do. The $\ell^1$-BA repair method results in fewer effective perturbations. First, there is one fewer effective perturbation of 1M option prices, at the cost of perturbing a few 2W, 3W and 1M option prices to their bid or ask prices. Second, all four effective perturbations of 4M option prices by the $\ell^1$-norm repair are replaced by six ineffective perturbations of 6M option prices by the $\ell^1$-BA repair.

[Figure 14 panels: $\ell^1$-norm repair; $\ell^1$-BA repair. x-axis: expiries 1D, 1W, 2W, 3W, 1M, 2M, 3M, 4M, 6M, 9M, 1Y, 18M, 2Y; legend: Perturbation, Effective perturbation, Bid-ask spread range.]

Figure 14: Perturbations (as percentages of the raw price data) resulting from the $\ell^1$-norm and the $\ell^1$-BA objectives. Data used are bid, ask and mid prices for OTC-traded USDBRL options on 18th September, 2008.

For a given set of prices, if none of the perturbations is effective, then the bid and ask quotes given by the market admit some arbitrage-free prices that fall within the bid-ask price bounds. In contrast, effective perturbations imply the existence of executable arbitrages that are exploitable through matching existing bid or ask orders in the market, see Section 3.2. In Table 2, we count the number of days when there is arbitrage in mid-prices ($N_\varepsilon > 0$) and the number of days when there is executable arbitrage ($N_{\varepsilon,\delta} > 0$) in historical data for the four currency pairs that have been seen to have the most occurrences of arbitrages.

Currency pair                    EURGBP   USDBRL   USDKRW   USDMXN
$N_\varepsilon > 0$              [...]    [...]    [...]    [...]
$N_{\varepsilon,\delta} > 0$     [...]    [...]    [...]    [...]

Table 2: Number of days when there is arbitrage in mid-prices ($N_\varepsilon > 0$) and when there is executable arbitrage ($N_{\varepsilon,\delta} > 0$).

We verify that our repair method improves model calibration with more robust parameter estimates and smaller calibration error.

Test framework

Let $\Theta$ be model parameters. We specify a ground-truth value of $\Theta$ and generate model prices $c$ for call options on a set of expiries and strikes. Then we carry out the following steps $M$ times. For the $m$-th time:

(1) Simulate noises to create synthetic arbitrageable price data $\tilde{c}^{(m)}$, following the method in Section 4.2. Recall that a portion $\lambda \in (0, 1]$ of prices is polluted by Gaussian noises of variance $\sigma_\xi^2$.
(2) Repair arbitrage in $\tilde{c}^{(m)}$ to get arbitrage-free data $\hat{c}^{(m)}$.
(3) Calibrate model parameters $\Theta$ to $\tilde{c}^{(m)}$ and $\hat{c}^{(m)}$ separately, and get calibrated parameters $\tilde{\Theta}^{(m)}$ and $\hat{\Theta}^{(m)}$, respectively. Defining the calibration objective as $G(\Theta; c) = \sum_{j=1}^{N} (c^{\Theta}_j - c_j)^2$ where $c^{\Theta}_j$ is the model price for the $j$-th option, we have

$$\tilde{\Theta}^{(m)} = \arg\min_{\Theta} G(\Theta; \tilde{c}^{(m)}), \qquad \hat{\Theta}^{(m)} = \arg\min_{\Theta} G(\Theta; \hat{c}^{(m)}).$$

We measure model calibration performance by two metrics, which are (a) the robustness defined by variations in the parameter estimates, and (b) the calibration error defined as the square root of the minimal objective value. Since we have parameter estimates $\{\tilde{\Theta}^{(m)}\}_{1 \le m \le M}$ and $\{\hat{\Theta}^{(m)}\}_{1 \le m \le M}$, we can compare the variations in them for assessing robustness. For each $m$, we define the (relative) reduction of calibration error as

$$\Delta G^{(m)} = 1 - \sqrt{\frac{G(\hat{\Theta}^{(m)}; \hat{c}^{(m)})}{G(\tilde{\Theta}^{(m)}; \tilde{c}^{(m)})}}.$$

Heston model calibration

We carry out a test on calibration of the Heston model [23]. Recall that the Heston model is described by the SDEs with model parameters $\Theta = (\nu_0, \theta, k, \sigma, \rho)$:

$$dS_t = r_t S_t \, dt + \sqrt{\nu_t} S_t \, dW^S_t, \qquad d\nu_t = k(\theta - \nu_t) \, dt + \sigma \sqrt{\nu_t} \, dW^\nu_t, \qquad d\langle W^S_t, W^\nu_t \rangle = \rho \, dt,$$

where the Feller condition $2k\theta > \sigma^2$ is sufficient to ensure strict positivity of the instantaneous variance process $\nu_t$.

We specify a typical set of expiries and strikes that is observed on a day in the OTC market, such as the one shown in Figure 3. Other simulation parameters and ground truth model parameters are listed in Table 3.

             Heston model                               Simulation
Parameter    $\nu_0$   $\theta$   $k$    $\sigma$   $\rho$   $N$   $M$   $\lambda$   $\sigma_\xi$
Value        0.003     0.008      2.32   0.38       0.36     117   500   0.25        0.1 or 1

Table 3: Parameter values
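Step (1) of the test framework, with the simulation parameters of Table 3, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: `add_noise` is a hypothetical name, $\sigma_\xi$ is treated as the standard deviation of the Gaussian noise, and the input surface is a placeholder array rather than Heston model prices.

```python
import math

import numpy as np


def add_noise(c, lam, sigma_xi, rng):
    """Pollute a portion `lam` of the prices with multiplicative log-normal noise.

    Returns the noisy prices c * exp(xi) and the indices of the polluted prices.
    """
    c = np.asarray(c, dtype=float)
    n = c.size
    n_polluted = math.ceil(lam * n)
    # Indices of polluted prices, sampled without replacement.
    polluted = rng.choice(n, size=n_polluted, replace=False)
    # xi_j = zeta_j on polluted indices and 0 elsewhere.
    xi = np.zeros(n)
    xi[polluted] = rng.normal(0.0, sigma_xi, size=n_polluted)
    return c * np.exp(xi), polluted


rng = np.random.default_rng(0)
c = np.linspace(0.10, 0.01, 117)  # placeholder surface with N = 117 prices
c_noisy, polluted = add_noise(c, lam=0.25, sigma_xi=0.1, rng=rng)
print(len(polluted))
```

Because the noise enters through an exponential, the noisy prices stay strictly positive whatever the noise magnitude.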

Next, we follow the test framework and evaluate $\tilde{\Theta}^{(m)}$, $\hat{\Theta}^{(m)}$, $\Delta G^{(m)}$ and $\Delta t^{(m)}$ for $m = 1, \dots, M$. In Figure 15, we plot and compare the normed histograms of calibrated Heston parameters $\tilde{\Theta}$ (using noisy data) and $\hat{\Theta}$ (using repaired data) given different choices of $\sigma_\xi$. The ground truth parameter values are also indicated by vertical dotted lines. Repairing data does make the model calibration more robust, as supported by two types of evidence. First, there are apparently more variations in $\tilde{\Theta}$ than in $\hat{\Theta}$. Second, $\tilde{\Theta}$ tends to hit the bounds set in the numerical optimisation procedure (e.g. 0 for $k$, 0.5 for $\sigma$, 1 for $\rho$) much more often than $\hat{\Theta}$ does. Moreover, when the price data are more noisy (larger $\sigma_\xi$) so that more prices with arbitrage are present, the robustness improvement of model calibration by the repair method becomes more significant.

[Figure 15 panels: histograms of $\nu_0$, $\theta$, $k$, $\sigma$ and $\rho$ for (a) $\sigma_\xi = 0.1$ and (b) $\sigma_\xi = 1$; legend: Ground truth, $\tilde{\Theta}$, $\hat{\Theta}$.]

Figure 15: Sample (normed) histograms of $\tilde{\Theta}$ and $\hat{\Theta}$, where $\Theta = (\nu_0, \theta, k, \sigma, \rho)$.

In Figure 16, we plot the histograms of $\Delta G$ and indicate their means by vertical dotted lines. Repairing arbitrage in data reduces the calibration errors in all $M$ simulations with no exception. Moreover, the noisier the raw data are, the larger the relative reduction in calibration error achieved by the repair. On average, repairing data can reduce the calibration error by more than 70% for $\sigma_\xi = 0.1$, and more than 95% for $\sigma_\xi = 1$.

Hence, for the model calibration task, there is more benefit of repairing data by removing arbitrage when the data contain larger noise.

[Footnote] Note that we must apply exactly the same numerical procedure for these two separate calibrations, i.e. the same optimisation algorithm, termination criteria, lower and upper bounds, and initial values.

[Footnote] Heston model parameters are chosen as those that reproduce a typical call price surface for USDBRL options. Noise simulation parameters $\lambda$ and $\sigma_\xi$ are chosen to mimic severe but not extreme arbitrage scenarios (measured by the fraction of perturbed prices by the repair method) observed in real world data.

[Figure 16 panels: Reduction of the objective value $\Delta G$ for $\sigma_\xi = 0.1$; Reduction of the objective value $\Delta G$ for $\sigma_\xi = 1$. Each panel marks the average reduction.]

Figure 16:

Sample histograms of (relative) reductions in calibration error.
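The quantity plotted in Figure 16 follows directly from the calibration objective $G$ and the definition of $\Delta G^{(m)}$ in the test framework. The sketch below is a hedged illustration: the function names are hypothetical and the two objective values are invented, chosen only to show the arithmetic.

```python
import math


def calibration_objective(model_prices, target_prices):
    """G(Theta; c): sum of squared differences between model and target prices."""
    return sum((m - t) ** 2 for m, t in zip(model_prices, target_prices))


def error_reduction(g_repaired, g_noisy):
    """Relative reduction of calibration error, 1 - sqrt(G_repaired / G_noisy)."""
    return 1.0 - math.sqrt(g_repaired / g_noisy)


# Hypothetical minimal objective values after calibrating to noisy and repaired data.
g_noisy = 4e-4
g_repaired = 1e-6
delta_g = error_reduction(g_repaired, g_noisy)
print(delta_g)
```

Because the calibration error is the square root of the minimal objective, a 400-fold drop in $G$ corresponds to a 95% reduction in error.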

We can use the $\ell^1$-BA repair method on order book data for identifying executable arbitrage. An example is given in Figure 17. We collect the order book data for all E-mini S&P 500 monthly European call options from 12:00 ET to 16:10 ET on 12th June, 2020.

[Figure 17 panels: top, $N_{\varepsilon,\delta}$ and ATM IV; bottom, futures prices and 15-minute returns (positive and negative). x-axis: Time (Eastern Time), 12:00 to 16:00.]

Figure 17:

Top – The formation and disappearance of intra-day executable arbitrageopportunities in the E-mini S&P 500 monthly European call option market on 12th June,2020.

Bottom – front-month futures’ prices and 15-minute return.
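The executable-arbitrage condition of Section 3.2, on which this order book study relies, can be tested directly on best bid and ask quotes for a single constraint row. The following is a minimal sketch under stated assumptions: `is_executable` is a hypothetical helper, the constraint is taken in the form $\sum_j a_j c_j \ge b$, and the quotes are invented.

```python
def is_executable(a_row, b_i, bid, ask):
    """Check whether a violated constraint sum_j a_j * c_j >= b_i is executable.

    Positive weights are bought at the ask and negative weights are sold at the
    bid; the arbitrage is executable if the resulting cost is still below b_i.
    """
    cost = sum(
        w * (p_ask if w >= 0 else p_bid)
        for w, p_bid, p_ask in zip(a_row, bid, ask)
    )
    return cost < b_i


# Hypothetical vertical spread constraint c1 - c2 >= 0 for strikes k1 < k2.
a_row = [1.0, -1.0]

# Crossed quotes: the higher-strike call can be sold above the lower-strike ask.
crossed = is_executable(a_row, 0.0, bid=[0.98, 1.10], ask=[1.00, 1.12])
# Ordinary quotes: buying the spread costs a positive amount.
ordinary = is_executable(a_row, 0.0, bid=[1.18, 1.10], ask=[1.20, 1.12])
print(crossed, ordinary)
```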

We extract the active best ask and best bid prices for all quoted call options from the order book at the end of every minute. Then we compute mid prices, apply the $\ell^1$-BA repair method to the mid prices, and count the number of effective perturbations $N_{\varepsilon,\delta}$. Recall from Section 3.2 that, given $\delta$ small, there exists executable arbitrage if the $\ell^1$-BA repair method results in effective perturbations. In the top plot of Figure 17, the black line gives $N_{\varepsilon,\delta}$ over time, while we also indicate the ATM implied volatility of the front-month option (which would expire on 19th June) by the grey line. The bottom plot gives the prices and the 15-minute returns of the front-month futures contract. The downward trend of the futures market was inverted around 14:00 ET, after which the implied volatility also fell gradually from its peak.

There is a large spike of $N_{\varepsilon,\delta}$ at around 15:52 ET, a few minutes before the close of the S&P 500 index market at 16:00 ET. This spike coincided with rallies in the futures market, while the IV maintained its relatively low level. There are some clusters of smaller spikes of $N_{\varepsilon,\delta}$ outside of the US trading hours. Apart from this arbitrage outbreak preceding the close of the underlying market, which lasted for around 15 minutes, there seems to be only trivial executable arbitrage during the rest of the afternoon trading hours, even when the market underwent a regime switch (from downward trend to upward trend) at around 14:00 ET.

A Localisation of static arbitrage constraints

To localise calendar butterfly constraints, we use a sequential build-up of local constraints from the shortest expiry to the longest expiry. Define $D_i := \{(k^i_j, c^i_j) : 1 \le j \le n_i\}$ as price data for options of expiry $T_i \in \mathcal{T}^e$. Given arbitrage-free $D_{i^*}$, we construct constraints such that adding price data of any longer-expiry option should not introduce arbitrage. This is done locally in two steps, where we scan a neighbourhood of each $k^{i^*}_j$. The first step, which we call "absolute location convexity" C6.1, finds constraints ensuring that adding any single data point $(k^i_j, c^i_j)$ where $i > i^*$ will not introduce arbitrage. In Figure 18 we indicate the regions where adding a single data point will not introduce arbitrage for four types of strike neighbourhood. In the second step, "relative location convexity" C6.2, we find constraints making sure that adding all data points $(k^i_j, c^i_j)$ where $i > i^*$ will not introduce arbitrage for two types of strike neighbourhood. As shown in Figure 19, if we draw line segments by linking each added point and the reference point $o = (k^{i^*}_{j^*}, c^{i^*}_{j^*})$, we require the slope of any line on the left $\{l_i\}$ to be not greater than the slope of any line on the right $\{r_j\}$.

[Figure 18 panels, each plotting $c$ against $k$: (a) $j^* = 1$; (b) $j^* \in [2, n_{i^*} - 1]$; (c) $j^* = n_{i^*}$; (d) $j^* = n_{i^*}$, $k > k^{i^*}_{j^*}$.]

Figure 18:

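Absolute location convexity constraint, discussed in four cases. Points falling in the green region satisfy the absolute location convexity constraint.

[Figure 19 panels, each plotting $c$ against $k$ with left segments $l$ and right segments $r$ through the reference point $o$ at strike $k^{i^*}_{j^*}$: (a) $j^* \in [1, n_{i^*} - 1]$; (b) $j^* = n_{i^*}$.]

Figure 19: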

Relative location convexity constraint, discussed in two cases. Points within thegreen region satisfy the absolute location convexity constraint.
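The slope comparison behind the relative location convexity constraint C6.2 can be sketched in a few lines. This is an illustration of the stated rule only, not the authors' constraint construction: the function names are hypothetical, points are (strike, price) pairs, and the sample data are invented.

```python
def slope(p, q):
    """Slope of the line segment from point p to point q, each a (strike, price) pair."""
    return (q[1] - p[1]) / (q[0] - p[0])


def satisfies_relative_location_convexity(ref, left_points, right_points, tol=1e-12):
    """Require every left-segment slope to be at most every right-segment slope.

    `ref` is the reference point o; `left_points` have smaller strikes than
    `ref` and `right_points` have larger strikes.
    """
    left = [slope(p, ref) for p in left_points]
    right = [slope(ref, p) for p in right_points]
    if not left or not right:
        return True  # nothing to compare on one side
    return max(left) <= min(right) + tol


# Hypothetical reference point and added longer-expiry points.
ref = (1.00, 0.030)
left_points = [(0.95, 0.070), (0.98, 0.042)]
ok_right = [(1.03, 0.018), (1.05, 0.012)]
bad_right = [(1.03, 0.005)]  # too cheap: right slope is steeper than a left slope

print(satisfies_relative_location_convexity(ref, left_points, ok_right))
print(satisfies_relative_location_convexity(ref, left_points, bad_right))
```

Comparing the largest left slope with the smallest right slope is equivalent to checking all pairs, which keeps the check linear in the number of added points.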

B Proof of Proposition 2

We prove Proposition 2 by establishing Lemma 1, 2 and 3.

Lemma 1.

If C1, C2 and C3 are satisfied, then all outrights, vertical spreads and vertical butterflies are non-negative. In addition, all test vertical spreads are not greater than 1.

Proof.

We consider the prices of call options with the same expiry T i where i ∈ [1 , m ].First, we prove that any vertical spread is non-negative, i.e. ∀ ≤ j < j ≤ n i , c ij ≥ c ij . This is true by the vertical spread constraint C2, as c ij ≥ c ij +1 ≥· · · ≥ c ij .Second, we show that all outrights are non-negative, i.e. ∀ j ∈ [0 , n i ], c ij ≥ c ij ≥ c in i ≥ ∀ ≤ j < j

0. To do that, we claim β ( i, j + 1; i, j ) ≤ β ( i, j ; i, j + 1) , if j < j − , (12a) β ( i, j ; i, j − ≥ β ( i, j − i, j ) , if j > j + 1 . (12b)These two claims can be proved by induction. Here we only show the proof for(12a). It is true that β ( i, j − i, j − ≤ β ( i, j ; i, j −

1) (the j = j − = l < j −

2, i.e. c ij ≥ c il +1 + ( k ij − k il +1 ) c il +1 − c il k il +1 − k il = c il +1 + (cid:2) ( k ij − k il ) − ( k il +1 − k il ) (cid:3) c il +1 − c il k il +1 − k il = c il + ( k ij − k il ) β ( i, l + 1; i, l ) . This leads to β ( i, j ; i, l ) ≥ β ( i, l + 1; i, l ). Again by C3 we have β ( i, l + 1; i, l ) ≥ β ( i, l ; i, l − β ( i, j ; i, l ) ≥ β ( i, l ; i, l − j = l −

1. Therefore, (12a) holds by induction in reverse order from j = j − j = 0. Thereafter, (12a) implies − c ij ≤ − c ij +1 + ( k ij +1 − k ij ) c ij − c ij +1 k ij − k ij +1 = − c ij +1 + (cid:2) ( k ij +1 − k ij ) − ( k ij − k ij ) (cid:3) c ij − c ij +1 k ij − k ij +1 = − c ij + ( k ij +1 − k ij ) β ( i, j ; i, j + 1) , which leads to β ( i, j ; i, j ) ≤ β ( i, j ; i, j + 1). Similarly we have β ( i, j ; i, j ) ≤ β ( i, j ; i, j + 1) ≤ · · · ≤ β ( i, j ; i, j − j = j + 2 to j = n i , and deduce β ( i, j ; i, j ) ≥ β ( i, j − i, j ) ≥ · · · ≥ β ( i, j +1; i, j ). Therefore, with C3, we can conclude β ( i, j ; i, j ) ≥ β ( i, j ; i, j ).Finally, we show that any vertical spread is bounded by 1, i.e. 0 ≤ j < j ≤ n i , − β ( i, j ; i, j ) ≤

1. For the case j >

1, given that any butterﬂy spread isnon-negative, we have − β ( i, j ; i, j ) ≤ − β ( i, j ; i, ≤ − β ( i, i, ≤

1, wherethe last inequality holds due to the vertical spread constraint C2. If j = 1,then − β ( i, j ; i, j = 1) ≤ − β ( i, i, ≤

1. Otherwise j = 0, applying (12a)by assigning j = 0, j = j yields − β ( i, j ; i, j = 0) ≤ − β ( i, i, ≤ Lemma 2.

If C2, C4 and C5 are satisﬁed, then any calendar spread or calendarvertical spread is non-negative.Proof.

We would like to prove that ∀ ≤ i < i ≤ m and ∀ j ∈ [0 , n i ] , j ∈ [0 , n i ] where k i j ≥ k i j , we have c i j ≤ c i j .First consider the calendar spread case when k i j = k i j . The calendar spreadconstraint C4 immediately leads to c i j ≤ c i j .Otherwise k i j > k i j , which implies that j must be greater than 0. Given i ∈ [1 , m ], j ∈ [1 , n i ], there must be k i j ∈ [ k i j − p − , k i j − p ) for some p ∈ [0 , j − c i j ≥ c i j − p .In addition, c i j − p ≥ c i j due to the vertical spread constraint C2. Hence, c i j ≥ c i j . Lemma 3.

If C3 and C6 are satisﬁed, then any calendar butterﬂy is non-negative. roof. We would like to prove that ∀ i, i , i ∈ [1 , m ] where i ≤ i , i ≤ i and ∀ j ∈ [1 , n i ] , j ∈ [0 , n i ] , j ∈ [0 , n i ] where k i j < k ij < k i j , we have β ( i, j ; i , j ) ≤ β ( i , j ; i, j ).Given i ∈ [1 , m ], j ∈ [1 , n i ], it must be that k i j ∈ [ k ij − p − , k ij − p ] for some p ∈ [0 , j −

1] and either k i j ∈ [ k ij + q , k ij + q +1 ] for some q ∈ [0 , n i − j −

1] or k i j ∈ ( k in i , ∞ ). See Figure 20. kc ... ( k ij , c ij )( k i j , c i j ) k ij − p k ij − p +1 (a) k i j ∈ [ k ij − p − , k ij − p ] for some p ∈ [0 , j − kc ( k ij , c ij ) ... ( k i j , c i j ) k ij + q − k ij + q (b) k i j ∈ [ k ij + q , k ij + q +1 ] for some q ∈ [0 , n i − j − Figure 20:

Locations of k i j and k i j relative to k ij . Let us consider the case when k i j ≤ k in i (which implies that j < n i ). If p = q = 0, then by the calendar butterﬂy relative location constraints C6.2 weconclude β ( i, j ; i , j ) ≤ β ( i , j ; i, j ). Otherwise, we claim that if p > β ( i, j ; i , j ) ≤ β ( i, j ; i, j − p ) , (13a) β ( i, j ; i, j − p ) ≤ β ( i, j ; i, j − p + 1) ≤ · · · ≤ β ( i, j ; i, j − q > β ( i , j ; i, j ) ≥ β ( i, j + q ; i, j ) , (14a) β ( i, j + q ; i, j ) ≥ β ( i, j + q − i, j ) ≥ · · · ≥ β ( i, j + 1; i, j ) . (14b)We will show the proof for the four claims later. If p > q >

0, thefour claims and the vertical butterﬂy constraint C3 lead to the stated result. If p > q = 0, then (13) and the calendar butterﬂy absolute location convexityconstraint C6.1 lead to the stated result. If p = 0 but q >

0, then (14) and C6.1lead to the stated result. 28ext we would like to prove the claims (13) and (14). First of all, (13b)and (14b) hold because of the convexity of the set of points { ( k il , c il ) } l ∈ [ j − p,j + q ] resulted from the vertical butterﬂy constraint C3. The calendar butterﬂy abso-lute location convexity constraint C6.1 results in β ( i, j − p ; i , j ) ≤ β ( i, j − p + 1; i, j − p ). In addition, the vertical butterﬂy constraint C3 results in β ( i, j ; i, j − p + 1) ≥ β ( i, j − p + 1; i, j − p ), then c ij ≥ c ij − p +1 + ( k ij − k ij − p +1 ) c ij − p +1 − c ij − p k ij − p +1 − k ij − p = c ij − p +1 + (cid:2) ( k ij − k j − p ) − ( k ij − p +1 − k j − p ) (cid:3) c ij − p +1 − c ij − p k ij − p +1 − k ij − p = c ij − p + ( k ij − k j − p ) β ( i, j − p + 1; i, j − p ) . Hence β ( i, j, i, j − p ) ≥ β ( i, j − p + 1 , i, j − p ) ≥ β ( i, j − p, i , j ). Then c i j ≤ c ij − p + ( k i j − k ij − p ) c ij − c ij − p k ij − k ij − p = c ij − p + (cid:2) ( k i j − k ij ) − ( k ij − p − k ij ) (cid:3) c ij − c ij − p k ij − k ij − p = c ij + ( k i j − k ij ) β ( i, j ; i, j − p ) , which indicates that β ( i, j ; i , j ) ≤ β ( i, j ; i, j − p ), i.e. (13a).The calendar butterﬂy spread absolute location convexity constraint C6.1results in β ( i , j ; i, j + q ) ≥ β ( i, j + q ; i, j + q − β ( i, j + q − i, j ) ≤ β ( i, j + q ; i, j + q − − c ij ≤ − c ij + q − + ( k ij + q − − k ij ) c ij + q − c ij + q − k ij + q − k ij + q − = − c ij + q − + (cid:2) ( k ij + q − − k ij + q ) − ( k ij − k ij + q ) (cid:3) c ij + q − c ij + q − k ij + q − k ij + q − = − c ij + q + ( k ij + q − k ij ) β ( i, j + q − i, j + q ) . Hence β ( i, j + q ; i, j ) ≤ β ( i, j + q ; i, j + q − ≤ β ( i , j ; i, j + q ). 
Then c i j ≥ c ij + q + ( k i j − k ij + q ) c ij − c ij + q k ij − k ij + q = c ij + q + (cid:2) ( k i j − k ij ) − ( k ij + q − k ij ) (cid:3) c ij − c ij + q k ij − k ij + q = c ij + ( k i j − k ij ) β ( i, j ; i, j + q ) , which indicates that β ( i , j ; i, j ) ≥ β ( i, j + q ; i, j ), i.e. (14a).Now we consider the case when k i j > k in i . The same proof as above appliesif we let q = n i − j and allow j to take the value n i . Acknowledgements

This publication is based on work supported by the EPSRC Centre for Doctoral Training in Industrially Focused Mathematical Modelling (EP/L015803/1) in collaboration with CME Group. We thank Florian Huchede, Director of Quantitative Risk Management, and other colleagues at CME Group for providing valuable data access, suggestions from the business perspective, and continued support.

Samuel Cohen and Christoph Reisinger acknowledge the support of the Oxford-Man Institute for Quantitative Finance, and Samuel Cohen also acknowledges the support of the Alan Turing Institute under the Engineering and Physical Sciences Research Council grant EP/N510129/1.

References

[1] Y. Ait-Sahalia and J. Duarte. Nonparametric option pricing under shape restrictions. Journal of Econometrics, 116(1-2):9–47, 2003.
[2] P. M. Blacque-Florentin and B. Missaoui. Nonparametric and arbitrage-free construction of call surfaces using $\ell^1$-recovery, 2015.
[3] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, New York, NY, USA, 2004.
[4] D. T. Breeden and R. H. Litzenberger. Prices of state-contingent claims implicit in option prices. The Journal of Business, 51(4):621–51, 1978.
[5] E. Candes, M. Rudelson, T. Tao, and R. Vershynin. Error correction via linear programming. In , pages 668–681, 2005.
[6] P. Carr, H. Geman, D. Madan, and M. Yor. Stochastic volatility for Lévy processes. Mathematical Finance, 13(3):345–382, 2003.
[7] P. Carr and D. B. Madan. A note on sufficient conditions for no arbitrage. Finance Research Letters, 2(3):125–130, 2005.
[8] L. Cousot. Conditions on option prices for absence of arbitrage and exact calibration. Journal of Banking & Finance, 31(11):3377–3397, 2007.
[9] L. Cousot and M. Street. Necessary and sufficient conditions for no static arbitrage among European calls. 2004.
[10] M. H. A. Davis and D. G. Hobson. The range of traded option prices. Mathematical Finance, 17(1):1–14, 2007.
[11] F. Delbaen and W. Schachermayer. A general version of the fundamental theorem of asset pricing. Mathematische Annalen, 300(1):463–520, 1994.
[12] E. Derman and I. Kani. Riding on a smile. Risk, 7, 1994.
[13] M. Dixon, S. Crépey, and M. Chataigner. Deep local volatility. arXiv preprint arXiv:2007.10462, 2020.
[14] D. L. Donoho and M. Elad. Optimally sparse representation in general (nonorthogonal) dictionaries via l1 minimization. Proceedings of the National Academy of Sciences, 100(5):2197–2202, 2003.
[15] B. Dupire. Pricing with a smile. Risk Magazine, 7:18–20, 1994.
[16] M. Fengler. Arbitrage-free smoothing of the implied volatility surface. Quantitative Finance, 9:417–428, 2009.
[17] M. Fengler. Option data and modeling BSM implied volatility. Handbook of Computational Finance, pages 117–142, 2012.
[18] M. R. Fengler and L.-Y. Hin. Semi-nonparametric estimation of the call-option price surface under strike and time-to-expiry no-arbitrage constraints. Journal of Econometrics, 184(2):242–261, 2015.
[19] J. Gatheral and A. Jacquier. Arbitrage-free SVI volatility surfaces. Quantitative Finance, 14(1):59–71, 2014.
[20] S. Gerhold and I. C. Gülüm. Consistency of option prices under bid-ask spreads. Mathematical Finance, 30(2):377–402, 2020.
[21] R. Gribonval and M. Nielsen. Sparse representations in unions of bases. IEEE Transactions on Information Theory, 49(12):3320–3325, 2003.
[22] J. M. Harrison and D. Kreps. Martingales and arbitrage in multiperiod securities markets.

Journal of Economic Theory , 20(3):381–408, 1979.[23] S. L. Heston. A closed-form solution for options with stochastic volatil-ity with applications to bond and currency options.

Review of FinancialStudies , 6:327–343, 1993.[24] P. Huber, J. Wiley, and W. InterScience.

Robust statistics . Wiley NewYork, 1981.[25] A. Ivanovas.

Option data, missing tails, and the intraday variation of im-plied moments . PhD thesis, University of St. Gallen, 2015.[26] N. Kahale. An arbitrage-free interpolation of volatilities.

Risk Magazine ,17:102–106, 2004.[27] H. G. Kellerer. Markov-Komposition und eine Anwendung auf Martingale.

Mathematische Annalen , 198:99–122, 1972.[28] D. Kreps. Arbitrage and equilibrium in economies with inﬁnitely manycommodities.

Journal of Mathematical Economics , 8(1):15–35, 1981.[29] H. Lim. Improved methods for implied volatility surface and implied dis-tributions.

SSRN preprint 3561100 , 2020.[30] L. V. Martin S. Andersen, J Dahl. Cvxopt: A python package for convexoptimization. [Version 1.2.5; available at cvxopt.org].3131] P. Meier.

Essays on pricing kernel estimation, option data ﬁltering andrisk-neutral density tail estimation . PhD thesis, University of St. Gallen,2015.[32] B. K. Natarajan. Sparse approximate solutions to linear systems.

SIAMJournal on Computing , 24(2):227–234, 1995.[33] J. Ruf and W. Wang. Neural networks for option pricing and hedging: aliterature review.

SSRN preprint 3486363 , 2019.[34] M. Shaked and J. Shanthikumar.

Stochastic Orders . Springer Series inStatistics. Springer New York, 2007.[35] S. Stoikov. The micro-price: a high-frequency estimator of future prices.

Quantitative Finance , 18:1–8, 2018.[36] Y. Wang, H. Yin, and L. Qi. No-arbitrage interpolation of the optionprice function and its reformulation.

Journal of Optimization Theory andApplications , 120(3):627–649, 2004.[37] U. Wystup.