SMC-ABC methods for the estimation of stochastic simulation models of the limit order book
Gareth W. Peters, UCL, Department of Statistical Science, WC1E 6BT, London, [email protected]
Panayi, UCL, Department of Computer Science, WC1E 6BT, London, [email protected]
Septier, Institute Mines-Telecom Lille, CRIStAL UMR CNRS 9189, France

September 12, 2018
In this chapter we consider classes of models that have been recently developed for quantitative finance that involve modelling a highly complex multivariate, multi-attribute stochastic process known as the Limit Order Book (LOB). The LOB is the primary data structure recorded intra-daily for all assets on every electronic exchange in the world in which trading takes place. As such, it represents one of the most important fundamental structures to study from a stochastic process perspective if one wishes to characterize features of stochastic dynamics for price, volume, liquidity and other important attributes of a traded asset. In this paper we adopt the model structure recently proposed by Panayi and Peters [2015], which develops a stochastic model framework for the LOB of a given asset, and we explain how to calibrate this stochastic model to real observed LOB data for a range of different assets. One can consider this class of problems as truly a setting in which the likelihood is intractable to evaluate pointwise but trivial to simulate from, and in which, in addition, the amount of data is massive. It is a true example of a big-data application: for each day and for each asset one can have anywhere between 100,000 and 500,000 data vectors for the calibration of the models.

The class of calibration techniques we consider here involves a Bayesian ABC reformulation of the indirect inference framework developed under the multi-objective optimization formulation proposed recently by Panayi and Peters [2015]. To facilitate an equivalent comparison between the two frameworks, we also adopt a reformulation of the class of genetic stochastic search algorithms utilised by Panayi and Peters [2015], known as NSGA-II [Deb et al., 2002].
We adapt this widely utilised stochastic genetic search algorithm from the multi-objective optimization literature so that it can serve as a mutation kernel in a class of Sequential Monte Carlo Sampler (SMC Sampler) algorithms in the ABC context. We begin with the problem and model formulation, then we discuss the estimation frameworks, and finish with some real-data simulation results for equity data from a highly utilised pan-European secondary exchange formerly known as Chi-X, before it was recently acquired by BATS.

The structure of a financial market dictates the form of interaction between buyers and sellers. Markets for financial securities generally operate either as quote-driven markets, in which specialists (dealers) provide two-way prices, or order-driven markets, in which participants can trade directly with each other by expressing their trading interest through a central matching mechanism. The LOB is an example of the latter, and indicatively, the Helsinki, Hong Kong, Shenzhen, Swiss, Tokyo, Toronto, and Vancouver Stock Exchanges, together with Euronext and the Australian Securities Exchange, operate as pure LOBs, while the New York Stock Exchange, NASDAQ, and the London Stock Exchange also operate a hybrid LOB system [Gould et al., 2013], with specialists operating for the less liquid securities.

We will consider trading activity in the context of the LOB in this chapter. Market participants are typically allowed to place two types of orders on the venue: limit orders, where they specify a price over which they are unwilling to buy (or a price under which they are unwilling to sell), and market orders, which are executed at the best available price. Market orders are executed immediately, provided there are orders of the same size on the opposite side of the book. Limit orders to buy (sell) are only executed if there is opposite trading interest in the order book at, or below (above), the specified limit price.
If there is no such interest, the order is entered into the limit order book, where orders are displayed by price, then time priority.

Figure 1: An example of the state of the Chi-X order book. The left hand side of the book is the buying interest, while the right hand side is the selling interest. The highest bid (order to buy) is for 100 shares at 2700 cents, while there are two lowest offers (orders to sell) for 70 and 100 shares at 2702. Orders are prioritised by price, then time.

Figure 1 shows an example snapshot of the order book for a particular stock, as traded on the Chi-X exchange, at a particular instant in time. A market order to buy 200 shares would result in 3 trades: 70 shares at 2702, another 100 shares at 2702 and the remaining 30 at 2704. A limit order to sell 300 shares at 2705, on the other hand, would not be executed immediately, as the highest order to buy is only at 2700 cents. It would instead enter the limit order book on the right hand side, second in priority at 2705 cents after the order for 120 shares which is already in the book.

LOB simulation models aim to generate the trading interactions observed in such a LOB, and allow for a realistic description of the intra-day trading process. In particular, the models simulate the behaviour of individual market participants, usually based on the behaviour of various classes of real traders. The price of the traded financial asset is then determined from the limit and market orders submitted by these traders. Depending on the model, the instantaneous price is either considered to be the mid-point between the highest bid price and lowest ask price, or the last traded price.

Because of the practical interest of modelling the intra-day dynamics of both stock prices and available volumes in the LOB, as well as the difficulty that traditional economic models based on rationality have in reproducing these dynamics, there has been a multitude of research approaches attempting to address this gap.
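The price-time priority matching in the worked example above can be sketched as follows. This is an illustrative sketch only: the function name is made up, and the 150-share size at 2704 is an assumed value consistent with the "remaining 30 at 2704" in the example.

```python
# Hypothetical sketch of price-time priority matching for a buy market order,
# using the ask-side state of Figure 1: 70 and 100 shares at 2702, and an
# assumed 150 shares at 2704.

def match_market_buy(ask_book, qty):
    """Walk the ask side in (price, arrival) priority and return the trades."""
    trades = []
    remaining = qty
    for price, size in ask_book:  # ask_book is sorted by price, then time
        if remaining == 0:
            break
        fill = min(size, remaining)
        trades.append((price, fill))
        remaining -= fill
    return trades

asks = [(2702, 70), (2702, 100), (2704, 150)]
trades = match_market_buy(asks, 200)
print(trades)  # -> [(2702, 70), (2702, 100), (2704, 30)]
```

A limit order that does not cross the opposite side would instead be appended to the queue at its price level, behind any earlier orders at that price.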
On one hand, there have been the zero-intelligence approaches (e.g. Maslov [2000]), which generally consist of a single, unsophisticated type of trading agent who submits orders randomly, possibly subject to (very few) constraints, like budget considerations. The minimum amount of discipline imposed by their actions is often sufficient to reproduce some commonly observed features of financial markets, such as fat tails in the distribution of returns [Maslov, 2000]. Later models [LiCalzi and Pellizzari, 2003, Fricke and Lux, 2013] also considered more realistic trading behaviour, where agents act based on their perceived value of the financial asset. However, LiCalzi and Pellizzari [2003] noted that these additional considerations regarding agent behaviour did not necessarily lead to a more realistic stock price output, and that it was likely the imposed market structure that had the largest effect on reproducing the price features observed.

Another prominent strand of research pertains to the modelling of the LOB as a set of queues at each price, on both the buy (bid) and sell (ask) side. In the models of Cont et al. [2010] and Cont and De Larrard [2013], a power law characterising the intensities of the limit order processes is found to be a good empirical fit. However, their assumption of independence between the activity at each level and for each event type is unlikely to hold in modern LOBs, due to the presence of algorithmic trading strategies, which induce different types of dependence structures.
It is clear from observed empirical LOB data that non-trivial dependence structures are present, and as such, ignoring these features in the model formulation will result in inadequate representations of the stochastic LOB process being modelled.

The LOB simulation model introduced in Panayi and Peters [2015] attempts both to provide a richer description of the LOB market structure and its constituent agents, and to capture the dependence between the trading activity at different levels. The main components of the proposed model are detailed in the following sections, before it is extended into an ABC posterior formulation for estimation and inference purposes.
In this section we develop a new class of Bayesian models that can be utilised to study the dynamics of Limit Order Books (LOB) intra-daily. We start by presenting the stochastic multivariate order flow model, which we develop as a "stochastic representative agent" model framework. This is then reformulated as an intractable-likelihood stochastic model and developed into an Approximate Bayesian Computation (ABC) posterior model.
We consider the intra-day LOB activity in fixed intervals of time $\ldots, [t-1, t), [t, t+1), \ldots$. For every interval $[t, t+1)$, we allow the total number of levels on the bid or ask sides of the LOB to be dynamically adjusted as the simulation evolves. These LOB levels are defined with respect to two reference prices, equal to $p_{b,t-1}$ and $p_{a,t-1}$, i.e. the price of the highest bid and lowest ask price at the start of the interval. We consider these reference prices to be constant throughout the interval $[t-1, t)$ and thus, the levels on the bid side of the book are defined at an integer number of ticks away from $p_{a,t-1}$, while the levels on the ask side of the book are defined at an integer number of ticks away from $p_{b,t-1}$.

This does not mean that we expect the best bid and ask prices to remain constant, just that we model the activity (i.e. limit order arrivals, cancellations and executions) according to the distance in ticks from these reference prices during this period. We note that it is of course possible that the volume at the best bid price is consumed during the interval, and that limit orders to sell are posted at this price, which would be considered at 0 ticks away from the reference price. To allow for this possibility, we actively model the activity at $-l_d+1, \ldots, 0, \ldots, l_p$ ticks away from each reference price. Here, the $p$ subscript refers to passive orders, i.e. orders which would not lead to immediate execution if the reference prices remained constant, and $d$ refers to direct, or aggressive, orders, where it is again understood that they are aggressive with respect to the reference prices at the start of the period. Therefore, we actively model the activity at a total of $l_t = l_p + l_d$ levels on each of the bid and ask.

We assume that activity that occurs further away is uncorrelated with the activity close to the top of the book, and therefore unlikely to have much of an impact on price evolution and the properties of the volume process.
Therefore, the volume resting outside the actively modelled LOB levels $(-l_d+1, \ldots, 0, \ldots, l_p)$ on the bid and ask is assumed to remain unchanged until the agent interactions bring those levels inside the band of actively modelled levels, at which time they will again dynamically evolve. Such a set of assumptions is consistent with observed stylized features of the LOB on all modern electronic exchanges.

To present the details of the simulation framework, including the stochastic model components for each agent, i.e. liquidity providers and liquidity demanders, we first define the following notation:

1. $V^a_t = (V^{a,-l_d+1}_t, \ldots, V^{a,l_p}_t)$ - the random vector for the number of orders resting at each of the actively modelled levels on the ask side of the LOB at time $t$;
2. $N^{LO,a}_t = (N^{LO,a,-l_d+1}_t, \ldots, N^{LO,a,l_p}_t)$ - the random vector for the number of limit orders entering the limit order book on the ask side at each level in the interval $[t-1, t)$;
3. $N^{C,a}_t = (N^{C,a,1}_t, \ldots, N^{C,a,l_t}_t)$ - the random vector for the number of limit orders cancelled on the ask side in the interval $[t-1, t)$;
4. $N^{MO,a}_t$ - the random variable for the number of market orders submitted by liquidity demanders in the interval $[t-1, t)$.

We consider the processes for limit orders and market orders, as well as cancellations, to be linked to the behaviour of real market participants in the LOB. In the following, we model the aggregation of the activity of 2 classes of liquidity motivated agents, namely liquidity providers and liquidity demanders. As we model LOB activity in discrete time intervals, we process the aggregate activity at the end of each time interval in the following order:

1. Limit order arrivals - passive - by the liquidity provider agent;
2. Limit order arrivals - aggressive or direct - by the liquidity provider agent;
3. Cancellations by the liquidity provider agent;
4. Market orders by the liquidity demander agent.

The rationale for this ordering is that the vast majority of limit order submissions and cancellations are typically accounted for by the activity of high-frequency traders, and many resting orders are cancelled before slower traders can execute against them. In addition, such an ordering allows us to condition on the state of the LOB, so that we do not have more cancellations at a particular level than the orders resting at that level. This is generally appropriate, as the time interval we consider can be made as small as desired for a given simulation.

We assume liquidity providers are responsible for all market-making behaviour (i.e. limit order submissions and cancellations on both the bid and ask side of the LOB). After liquidity is posted to the LOB, liquidity seeking market participants, such as mutual funds using some execution algorithm, can take advantage of the resting volume with market orders. For market makers, achieving a balance between volume executed on the bid and the ask side can be profitable. However, there is also the risk of adverse selection, i.e. trading against a trader with superior information. This may lead to losses if, e.g., a trader posts multiple market orders that consume the volume on several levels of the LOB. The risk of adverse selection as a result of asymmetric information is one of the basic tenets of market microstructure theory [O'Hara, 1995]. To reduce this risk, market makers cancel and resubmit orders at different prices and/or different sizes.
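The per-interval processing order above can be sketched as follows. This is a minimal illustrative sketch, not the chapter's implementation: the dict-based book keyed by tick level and the event grouping are assumptions made for the example.

```python
# Minimal sketch of the per-interval event ordering: passive limit orders,
# aggressive limit orders, cancellations, then market orders. Removals are
# capped by the volume resting at each level, reflecting the conditioning
# on the LOB state described in the text.

def step_interval(book, events):
    """Apply one interval's aggregate activity in the stated order."""
    order = ["passive_lo", "aggressive_lo", "cancellations", "market_orders"]
    applied = []
    for kind in order:
        for level, n in events.get(kind, []):
            if kind in ("passive_lo", "aggressive_lo"):
                book[level] = book.get(level, 0) + n
            else:
                # cancellations / market orders cannot remove more than rests
                book[level] = max(book.get(level, 0) - n, 0)
            applied.append(kind)
    return book, applied

# 5 resting orders at level 1; 3 arrive, then 10 cancellations are requested
book, applied = step_interval({1: 5}, {"passive_lo": [(1, 3)], "cancellations": [(1, 10)]})
print(book, applied)
```

Processing submissions before cancellations is what makes the cap well defined: the cancellation bound is taken against the volume resting after the interval's arrivals.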
Definition 1 (Limit order submission process for the liquidity provider agent)
Consider the limit order submission process of the liquidity provider agent to include both passive and aggressive limit orders on the bid and ask sides of the book, which are assumed to have the following stochastic model structure:

1. Let the multivariate path-space random matrix $N^{LO,k}_T \in \mathbb{N}_+^{l_t \times T}$ be constructed from the random vectors for the numbers of limit order placements, $N^{LO,k}_T = \left(N^{LO,k}_1, N^{LO,k}_2, \ldots, N^{LO,k}_T\right)$. Furthermore, assume these random vectors for the number of orders at each level at time $t$ are each conditionally dependent on a latent stochastic process for the intensity at which the limit orders arrive, given by the random matrix $\Lambda^{LO,k}_T \in \mathbb{R}_+^{l_t \times T}$ and on the path-space by $\Lambda^{LO,k}_T = \left(\Lambda^{LO,k}_1, \Lambda^{LO,k}_2, \ldots, \Lambda^{LO,k}_T\right)$. In the following, $k \in \{a, b\}$ indicates the respective process on the ask and bid side.

2. Assume the conditional independence property for the random vectors given by
$$\left[N^{LO,k}_s \mid \Lambda^{LO,k}_s\right] \perp\!\!\!\perp \left[N^{LO,k}_t \mid \Lambda^{LO,k}_t\right], \quad \forall s \neq t, \; s, t \in \{1, 2, \ldots, T\}. \qquad (1)$$

3. For each time interval $[t-1, t)$ from the start of trading on the day, let the random vector for the number of new limit orders placed in each actively modelled level of the limit order book, i.e. the price points corresponding to ticks $(-l_d+1, \ldots, 0, \ldots, l_p)$, be denoted by $N^{LO,k}_t = \left(N^{LO,k,-l_d+1}_t, \ldots, N^{LO,k,l_p}_t\right)$, and assume that these random vectors satisfy the conditional independence property
$$\left[N^{LO,k,s}_t \mid \Lambda^{LO,k,s}_t\right] \perp\!\!\!\perp \left[N^{LO,k,q}_t \mid \Lambda^{LO,k,q}_t\right], \quad \forall s \neq q, \; s, q \in \{-l_d+1, \ldots, 0, \ldots, l_p\}. \qquad (2)$$
4. Assume the random vector $N^{LO,k}_t \in \mathbb{N}_+^{l_t}$ is distributed according to a multivariate generalized Cox process with conditional distribution $N^{LO,k}_t \sim \mathrm{GCP}\left(\lambda^{LO,k}_t\right)$ given by
$$\Pr\left(N^{LO,k,-l_d+1}_t = n_1, \ldots, N^{LO,k,l_p}_t = n_{l_t} \,\middle|\, \Lambda^{LO,k}_t = \lambda^{LO,k}_t\right) = \prod_{s=-l_d+1}^{l_p} \frac{\left(\lambda^{LO,k,s}_t\right)^{n_s}}{n_s!} \exp\left[-\lambda^{LO,k,s}_t\right]. \qquad (3)$$
5. Assume the independence property for the random vectors of latent intensities unconditionally, according to
$$\Lambda^{LO,k}_s \perp\!\!\!\perp \Lambda^{LO,k}_t, \quad \forall s \neq t, \; s, t \in \{1, 2, \ldots, T\}. \qquad (4)$$
6. Assume that the intensity random vector $\Lambda^{LO,k}_t \in \mathbb{R}_+^{l_t}$ is obtained through an element-wise transformation of the random vector $\Gamma^{LO,k}_t \in \mathbb{R}^{l_t}$, where for each element we have the mapping
$$\Lambda^{LO,k,s}_t = \mu^{LO,k,s} F\left(\Gamma^{LO,k,s}_t\right), \qquad (5)$$
where $s \in \{-l_d+1, \ldots, l_p\}$, the baseline intensity parameters satisfy $\mu^{LO,k,s} \in \mathbb{R}_+$, and $F : \mathbb{R} \mapsto [0, 1]$ is a strictly monotonic mapping.

7. Assume the random vector $\Gamma^{LO,k}_t \in \mathbb{R}^{l_t}$ is distributed according to a multivariate skew-t distribution $\Gamma^{LO,k}_t \sim \mathcal{MS}t(m^k, \beta^k, \nu^k, \Sigma^k)$ with location parameter vector $m^k \in \mathbb{R}^{l_t}$, skewness parameter vector $\beta^k \in \mathbb{R}^{l_t}$, degrees of freedom parameter $\nu^k \in \mathbb{N}_+$ and $l_t \times l_t$ covariance matrix $\Sigma^k$. Hence, $\Gamma^{LO,k}_t$ has density function
$$f_{\Gamma^{LO,k}_t}\left(\gamma_t; m^k, \beta^k, \nu^k, \Sigma^k\right) = \frac{c\, K_{\frac{\nu^k + l_t}{2}}\!\left(\sqrt{\left(\nu^k + Q(\gamma_t, m^k)\right) [\beta^k]^T [\Sigma^k]^{-1} \beta^k}\right) \exp\left((\gamma_t - m^k)^T [\Sigma^k]^{-1} \beta^k\right)}{\left(\sqrt{\left(\nu^k + Q(\gamma_t, m^k)\right) [\beta^k]^T [\Sigma^k]^{-1} \beta^k}\right)^{-\frac{\nu^k + l_t}{2}} \left(1 + \frac{Q(\gamma_t, m^k)}{\nu^k}\right)^{\frac{\nu^k + l_t}{2}}}, \qquad (6)$$
where $K_v(z)$ is a modified Bessel function of the second kind, given by
$$K_v(z) = \frac{1}{2} \int_0^\infty y^{v-1} e^{-\frac{z}{2}\left(y + y^{-1}\right)} \, dy, \qquad (7)$$
and $c$ is a normalisation constant. We also define the function $Q(\cdot, \cdot)$ as follows:
$$Q(\gamma_t, m^k) = (\gamma_t - m^k)^T \left[\Sigma^k\right]^{-1} (\gamma_t - m^k). \qquad (8)$$
This model admits skew-t marginals and a skew-t copula; see Smith et al. [2012] for details. Importantly, this stochastic model admits the following scale mixture representation,
$$\Gamma^{LO,k}_t \stackrel{d}{=} m^k + \beta^k W + \sqrt{W} Z, \qquad (9)$$
with Inverse-Gamma random variable $W \sim \mathcal{IG}a\left(\frac{\nu^k}{2}, \frac{\nu^k}{2}\right)$ and independent Gaussian random vector $Z \sim \mathcal{N}\left(0, \Sigma^k\right)$.

8. Assume that for every element $N^{LO,k,s}_t$ of order counts from the random vector $N^{LO,k}_t$, there is a corresponding random vector $O^{LO,k,s}_t \in \mathbb{N}_+^{N^{LO,k,s}_t}$ of order sizes. We assume that each element $O^{LO,k,s}_{i,t}$, $i \in \{1, \ldots, N^{LO,k,s}_t\}$, is distributed as $O^{LO,k,s}_{i,t} \sim H(\cdot)$. Furthermore, we assume that order sizes are unconditionally independent, $O^{LO,k,s}_{i,t} \perp\!\!\!\perp O^{LO,k,s'}_{i',t'}$ for $i \neq i'$, $s \neq s'$ and $t \neq t'$.

Remark 2.1
Under this proposed model for market maker liquidity activity, the number of limit orders placed by the liquidity providers in the market has an appropriate dynamic intensity structure that can evolve intra-daily to reflect the changing nature of liquidity provided by market makers throughout the trading day. In addition, the numbers of limit orders placed at each level of the bid and ask allow the model to capture the dependence structures in order placements across levels of the bid and ask that are regularly seen in empirical analysis of high-frequency LOB data. The dependence structure utilised is based on a skew-t copula, which allows non-exchangeability of the stochastic intensity on the bid and ask at each level of the book, as well as asymmetry in the tail dependence features. This means that when market makers make large movements to replenish liquidity after an intra-daily liquidity drought, such as after a large market order execution, the model can produce such replenishment on just the bid or just the ask depending on the situation. It does not automatically replenish both sides of the book with equal probability, as would occur under a standard t-copula structure, as opposed to the skew-t copula structure used in this model.
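One full generative draw from the submission model above can be sketched as follows, under stated assumptions: the intensities are drawn via the scale mixture representation (9), mapped through the element-wise transformation (5) with $F$ taken here as the logistic cdf (the model only requires $F$ strictly monotonic onto $[0,1]$, so this choice is an assumption), and the counts are then conditionally independent Poissons as in (3). All parameter values are illustrative.

```python
# Illustrative generative draw for the limit order submission process:
# Gamma_t via the normal mean-variance mixture (9), intensities via (5),
# then level counts via the conditional Poisson structure (3).
import math
import random

def sample_skew_t(m, beta, nu, chol, rng):
    """Gamma_t =d m + beta*W + sqrt(W)*Z, W ~ IGa(nu/2, nu/2), Z ~ N(0, Sigma)."""
    # Inverse-Gamma(nu/2, nu/2) = reciprocal of Gamma(shape nu/2, scale 2/nu)
    w = 1.0 / rng.gammavariate(nu / 2.0, 2.0 / nu)
    z = [rng.gauss(0.0, 1.0) for _ in m]
    # correlate z via the lower-triangular Cholesky factor of Sigma
    zc = [sum(chol[i][j] * z[j] for j in range(i + 1)) for i in range(len(m))]
    return [m[i] + beta[i] * w + math.sqrt(w) * zc[i] for i in range(len(m))]

def sample_poisson(lam, rng):
    """Knuth's multiplicative Poisson sampler (adequate for small lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def sample_limit_order_counts(m, beta, nu, chol, mu, rng):
    """Draw Lambda_t^{LO,k}, then N_t^{LO,k} | Lambda_t^{LO,k}, level by level."""
    gamma = sample_skew_t(m, beta, nu, chol, rng)
    lam = [mu_s / (1.0 + math.exp(-g)) for mu_s, g in zip(mu, gamma)]  # logistic F
    return [sample_poisson(l, rng) for l in lam]

rng = random.Random(7)
chol = [[1.0, 0.0], [0.5, 0.8]]  # Cholesky factor of an assumed 2x2 Sigma^k
counts = sample_limit_order_counts([0.0, 0.0], [0.3, -0.3], 6.0, chol, [10.0, 10.0], rng)
print(counts)  # one non-negative count per actively modelled level
```

The skewness vector `beta` is what breaks the bid/ask symmetry discussed in the remark: a common mixing variable $W$ shifts all levels through $\beta^k W$, so large intensity bursts need not be equally likely in both directions.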
We now define the second component of the liquidity provider agent, namely the cancellation process. The cancellation process has the same stochastic process model specification as the limit order submission process above, including a skew-t dependence structure between the stochastic intensities at each LOB level on the bid and ask. To avoid repetition, in the specification below we therefore only state the features of the cancellation process that differ from the order placement model definition.
Definition 2 (Limit order cancellation process for liquidity provider agent)
Consider the limit order cancellation process of the liquidity provider agent to have an identically specified stochastic model structure as the limit order submissions. The exception to this pertains to the assumption that the number of cancelled orders in each interval at each level is right-truncated at the total number of orders at that level.

1. As for submissions, we assume for cancellations a multivariate path-space random matrix $N^{C,k}_T \in \mathbb{N}_+^{l_t \times T}$ constructed from the random vectors for the number of cancelled orders, given by $N^{C,k}_T = \left(N^{C,k}_1, N^{C,k}_2, \ldots, N^{C,k}_T\right)$. Furthermore, assume for these random vectors for the number of cancelled orders at each of the $l_t$ levels, the latent stochastic process for the intensity is given by the random matrix $\Lambda^{C,k}_T \in \mathbb{R}_+^{l_t \times T}$ and given on the path-space by $\Lambda^{C,k}_T = \left(\Lambda^{C,k}_1, \Lambda^{C,k}_2, \ldots, \Lambda^{C,k}_T\right)$.

2. Assume that for the random vector $\tilde{V}^k_t$ of the volume resting in the LOB after the placement of limit orders we have $\tilde{V}^k_t = V^k_{t-1} + N^{LO,k}_t$, and that the random vector $N^{C,k}_t \in \mathbb{N}_+^{l_t}$ is distributed according to a truncated multivariate generalized Cox process with conditional distribution $N^{C,k}_t \mid \tilde{V}^k_t = v \sim \mathrm{GCP}\left(\lambda^{C,k}_t\right) \mathbb{I}\left(N^{C,k}_t < v\right)$ (with $v = (v_{-l_d+1}, \ldots, v_{l_p})$) given by
$$\Pr\left(N^{C,k,-l_d+1}_t = n_{-l_d+1}, \ldots, N^{C,k,l_p}_t = n_{l_p} \,\middle|\, \Lambda^{C,k}_t = \lambda^{C,k}_t, \tilde{V}^k_t = v\right) = \prod_{s=-l_d+1}^{l_p} \frac{\left(\lambda^{C,k,s}_t\right)^{n_s} / n_s!}{\sum_{j=0}^{v_s} \left(\lambda^{C,k,s}_t\right)^{j} / j!}. \qquad (10)$$
3. Assume that for the cancellation count $N^{C,k,s}_t$, the orders with highest priority at level $s$ (which are also the oldest orders in their respective queue) are cancelled. Assume also that cancellations always remove an order in full, i.e. there are no partial cancellations.

Remark 2.2
Cancellations are a critical part of a market maker's ability to modulate and adjust their liquidity activity to avoid large losses in trades that would otherwise be executed under an adverse selection setting. Under this proposed model for market maker liquidity removal activity (cancellations), the number of limit orders cancelled by the liquidity providers in the market has an appropriate dynamic intensity structure that can evolve intra-daily to reflect the changing nature of liquidity demand throughout the trading day. In addition, the numbers of limit orders cancelled at each level of the bid and ask allow the model to capture the observed dependence structures in order cancellations across levels. The dependence structure utilised is based on a skew-t copula, which allows non-exchangeability of the stochastic intensity on the bid and ask at each level of the book, as well as asymmetry in the tail dependence features. This matters because when large price movements occur in the LOB, market makers need to adjust their LOB volume profile by cancelling existing resting orders and creating new orders; this typically occurs many times throughout the trading day, and the ability to do so with an appropriate dependence structure is critical. In addition, the number of cancelled orders needs to preserve the principle of volume preservation, that is, the upper bound on the total number of limit orders that may be cancelled at any given time is the instantaneous resting volume in the book at that time instant.
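Because the truncated conditional distribution (10) factorises over levels with finite support $\{0, \ldots, v_s\}$, it can be sampled exactly level-by-level by inverse-cdf over the unnormalised Poisson weights. The sketch below does this; the intensities and resting volumes are made-up illustrative values.

```python
# Sketch of sampling the right-truncated conditional distribution (10):
# at each level s, a Poisson(lambda) count restricted to {0, ..., v_s},
# drawn by inverse-cdf over the finite support (the e^{-lambda} factor
# cancels in the normalisation, so unnormalised weights suffice).
import random

def sample_truncated_poisson(lam, v, rng):
    """Draw from Poisson(lam) truncated to {0, ..., v}."""
    weights, w = [], 1.0  # w tracks lam^j / j! recursively
    for j in range(v + 1):
        weights.append(w)
        w *= lam / (j + 1)
    u = rng.random() * sum(weights)
    acc = 0.0
    for j, wj in enumerate(weights):
        acc += wj
        if u <= acc:
            return j
    return v

rng = random.Random(1)
# cancellation intensities and resting volumes at three levels, v = (2, 5, 0)
n_c = [sample_truncated_poisson(lam, v, rng) for lam, v in [(1.5, 2), (4.0, 5), (2.0, 0)]]
print(n_c)  # each count is at most the corresponding resting volume
```

Note the hard guarantee this gives: a level with no resting volume ($v_s = 0$) can never produce a cancellation, which is exactly the volume-preservation principle in the remark above.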
We complete the specification of the representative agents by considering the specification of the liquidity demander agent. In addition to market makers, who are incentivized by the exchanges in which they operate to place orders intra-daily in the limit order book, there are also other market participants who trade for other reasons. These other market participants include hedge funds, pension funds and other types of large investors; we typically refer to such groups of traders as liquidity demanders. They absorb liquidity throughout the day by executing against resting orders in the limit order book. These executions are most often made through market orders or aggressive limit orders. In this chapter we assume that all such activities can be adequately modelled by a stochastic liquidity demander agent making dynamically evolving decisions to place market orders, as specified below.

Definition 3 (Market order submission process for liquidity demander agent)
Consider a representative agent for the liquidity demanders to be composed of a market order component, which has the following stochastic structure:

1. Assume a path-space random vector $N^{MO,k}_T \in \mathbb{N}_+^{1 \times T}$ for the number of market orders, constructed from the random variables for the number of market orders in each interval, $N^{MO,k}_T = \left(N^{MO,k}_1, N^{MO,k}_2, \ldots, N^{MO,k}_T\right)$. Furthermore, assume that for these random variables the latent stochastic process for the intensity is given by the random vector $\Lambda^{MO,k}_T \in \mathbb{R}_+^{1 \times T}$, and given on the path-space by $\Lambda^{MO,k}_T = \left(\Lambda^{MO,k}_1, \Lambda^{MO,k}_2, \ldots, \Lambda^{MO,k}_T\right)$.

2. Assume the conditional independence property for the random variables
$$\left[N^{MO,k}_s \mid \Lambda^{MO,k}_s\right] \perp\!\!\!\perp \left[N^{MO,k}_t \mid \Lambda^{MO,k}_t\right], \quad \forall s \neq t, \; s, t \in \{1, 2, \ldots, T\}. \qquad (11)$$
3. Assume that for the random variable $\tilde{R}^k_t$, the volume resting on the opposite side of the LOB after the placement of limit orders and cancellations, we have $\tilde{R}^k_t = \sum_{s=1}^{l_p} \left[\tilde{V}^{k',s}_t - N^{C,k',s}_t\right]$, where $k' = a$ if $k = b$, and vice-versa, and that the random variable $N^{MO,k}_t \in \mathbb{N}_+$ is distributed according to a truncated generalized Cox process with conditional distribution $N^{MO,k}_t \mid \tilde{R}^k_t = r \sim \mathrm{GCP}\left(\lambda^{MO,k}_t\right) \mathbb{I}\left(N^{MO,k}_t < r\right)$ given by
$$\Pr\left(N^{MO,k}_t = n \,\middle|\, \Lambda^{MO,k}_t = \lambda^{MO,k}_t, \tilde{R}^k_t = r\right) = \frac{\left(\lambda^{MO,k}_t\right)^{n} / n!}{\sum_{j=0}^{r} \left(\lambda^{MO,k}_t\right)^{j} / j!}. \qquad (12)$$
4. Assume the independence property for the latent intensities unconditionally, according to
$$\Lambda^{MO,k}_s \perp\!\!\!\perp \Lambda^{MO,k}_t, \quad \forall s \neq t, \; s, t \in \{1, 2, \ldots, T\}. \qquad (13)$$
5. Assume that for each intensity random variable $\Lambda^{MO,k}_t \in \mathbb{R}_+$ there is a corresponding transformed intensity variable $\Gamma^{MO,k}_t \in \mathbb{R}$, and the relationship for each element is given by the mapping
$$\Lambda^{MO,k}_t = \mu^{MO,k} F\left(\Gamma^{MO,k}_t\right), \qquad (14)$$
for some baseline intensity parameter $\mu^{MO,k} \in \mathbb{R}_+$ and strictly monotonic mapping $F : \mathbb{R} \mapsto [0, 1]$.

6. Assume that the random variables $\Gamma^{MO,k}_t \in \mathbb{R}$, characterizing the intensity before transformation of the generalized Cox process, are distributed in interval $[t-1, t)$ according to a univariate skew-t distribution $\Gamma^{MO,k}_t \sim \mathcal{S}t\left(m^{MO,k}_t, \beta^{MO,k}, \nu^{MO,k}, \sigma^{MO,k}\right)$.

7. Assume that for every element $N^{MO,k}_t$ of market order counts, there is a corresponding random vector $O^{MO,k}_t \in \mathbb{N}_+^{N^{MO,k}_t}$ of order sizes. We assume that each element $O^{MO,k}_{i,t}$, $i \in \{1, \ldots, N^{MO,k}_t\}$, is distributed according to $O^{MO,k}_{i,t} \sim H(\cdot)$. Assume also that market order sizes are unconditionally independent, $O^{MO,k}_{i,t} \perp\!\!\!\perp O^{MO,k}_{i',t'}$ for $i \neq i'$ or $t \neq t'$.
We denote the LOB state for the real dataset at time $t$ on a given day by the random vector $L_t$, which corresponds to the prices and volumes at each level of the bid and ask. Utilising the stochastic agent-based model specification described above, and given a parameter vector $\theta$, which will generically represent all parameters of the liquidity providing and liquidity demanding agent types, one can then also generate simulations of intra-day LOB activity and arrive at the synthetic state $L^*_t(\theta)$. The state of the simulated LOB at time $t$ is obtained from the state at time $t-1$ and the innovations $X_t$, which are obtained from a single stochastic realisation of the following components of the agent-based models:

• Limit order submission intensities $\Lambda^{LO,b}_t, \Lambda^{LO,a}_t$, order numbers $N^{LO,b}_t, N^{LO,a}_t$, and order sizes $O^{LO,a,s}_{i,t}, O^{LO,b,s}_{j,t}$, where $s = -l_d+1, \ldots, l_p$, $i = 1, \ldots, N^{LO,a,s}_t$, $j = 1, \ldots, N^{LO,b,s}_t$;
• Limit order cancellation intensities $\Lambda^{C,b}_t, \Lambda^{C,a}_t$ and numbers of cancellations $N^{C,b}_t, N^{C,a}_t$;
• Market order intensities $\Lambda^{MO,b}_t, \Lambda^{MO,a}_t$, numbers of market orders $N^{MO,b}_t, N^{MO,a}_t$, market order volumes $V^{MO,b}_t, V^{MO,a}_t$, and market order sizes $O^{MO,a}_{i,t}, O^{MO,b}_{j,t}$, $i = 1, \ldots, N^{MO,a}_t$, $j = 1, \ldots, N^{MO,b}_t$.
These stochastic features are combined with the previous state of the LOB, $L^*_{t-1}(\theta)$, to produce the new state $L^*_t(\theta)$ for a given set of parameters $\theta$, given by
$$L^*_t(\theta) = G\left(L^*_{t-1}(\theta), X_t\right), \qquad (15)$$
where $G(\cdot)$ is a transformation that maps the previous state of the LOB and the activity generated in the current step into a new state, in much the same way as the matching engine updates the LOB after every event. As we model the activity in discrete intervals, however, the LOB is only updated at the end of every interval, and the incoming events (limit orders, market orders and cancellations) are processed in the order specified in Section 2.1. Conditional then on a realization of these parameters $\theta$, the trading activity in the LOB can be simulated for a single trading day, and the complete procedure is described in the algorithm set out in Panayi and Peters [2015].

In this section we consider the class of LOB stochastic models developed in the previous section and we detail a Bayesian model formulation under an ABC framework. Methods for Bayesian modelling in the presence of computationally intractable likelihood functions are of growing interest. These methods may be needed either because the likelihood is truly intractable to evaluate point-wise, or, as in our case, because the likelihood is so complex in terms of model specification, and so costly to evaluate point-wise, that one has to resort to alternative methods to perform estimation and inference. Termed likelihood-free samplers or Approximate Bayesian Computation (ABC) methods, simulation algorithms such as Sequential Monte Carlo Samplers have been adapted for this setting; see for instance Peters et al. [2012].

We start by recalling a few basics. Typically, Bayesian inference proceeds via the posterior distribution, generically denoted by $\pi(\theta \mid y) \propto f(y \mid \theta) \pi(\theta)$: the updating of prior information $\pi(\theta)$ for a parameter $\theta \in E$ through the likelihood function $f(y \mid \theta)$ after observing data $y \in \mathcal{Y}$.
Numerical algorithms, such as importance sampling, Markov chain Monte Carlo (MCMC) and sequential Monte Carlo (SMC), are commonly employed to draw samples from the posterior $\pi(\theta \mid y)$.

Remark 2.3 (Note on data vector) In the context of this chapter the data $y$ is the entire order book structure for a given asset over a day, as summarized by:

• Limit order submission order numbers $N^{LO,b}_t, N^{LO,a}_t$, and order sizes $O^{LO,a,s}_{i,t}, O^{LO,b,s}_{j,t}$, where $s = -l_d+1, \ldots, l_p$, $i = 1, \ldots, N^{LO,a,s}_t$, $j = 1, \ldots, N^{LO,b,s}_t$;
• Limit order numbers of cancellations $N^{C,b}_t, N^{C,a}_t$;
• Numbers of market orders $N^{MO,b}_t, N^{MO,a}_t$, market order volumes $V^{MO,b}_t, V^{MO,a}_t$, and market order sizes $O^{MO,a}_{i,t}, O^{MO,b}_{j,t}$, $i = 1, \ldots, N^{MO,a}_t$, $j = 1, \ldots, N^{MO,b}_t$.

The resulting observation vector $y_t$, at time $t$, is then the concatenation of all these variables. These stochastic features are obtained at sampling rate $t$ within the market hours of the trading day, typically every 5-30 seconds for the 8.5 hour trading day, producing a total of between 1000 and 6000 vector-valued observations per day.

Clearly, evaluating a likelihood on this many records, even if it could be written down (which for many LOB models built on queues, like the one in this chapter, will not be the case), would still be a challenging task. Generically we will denote below by $y$ the collection of all observations of the LOB for an asset on a given day, and by $\theta$ the set of all parameters that are utilised to parameterize the LOB stochastic model.

There is growing interest in posterior simulation in situations where the likelihood function is computationally intractable, i.e. $f(y \mid \theta)$ may not be numerically evaluated pointwise. As a result, sampling algorithms based on repeated likelihood evaluations require modification for this task. Collectively known as likelihood-free samplers (and also as approximate Bayesian computation), these methods have been developed across multiple disciplines. They employ the generation of auxiliary datasets under the model as a means to circumvent (intractable) likelihood evaluation.

In essence, likelihood-free methods first reduce the observed data, $y$, to a low-dimensional vector of summary statistics $t_y = T(y) \in \mathcal{T}$, where $\dim(\theta) \leq \dim(t_y) \ll \dim(y)$. Accordingly, the true posterior $\pi(\theta \mid y)$ is replaced with a new posterior $\pi(\theta \mid t_y)$. These are equivalent if $t_y$ is sufficient for $\theta$, and $\pi(\theta \mid t_y) \approx \pi(\theta \mid y)$ is an approximation if there is some loss of information through $t_y$. The new target posterior, $\pi(\theta \mid t_y)$, still assumed to be computationally intractable, is then embedded within an augmented model from which sampling is viable.
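The reduction $t_y = T(y)$ can be sketched as follows. The particular statistics chosen here (means and a tail quantile of per-interval event counts) are purely illustrative; the chapter does not prescribe this set.

```python
# Minimal sketch of the summary reduction t_y = T(y): a day's high-dimensional
# per-interval LOB observations compressed to a few scalars.
def summarise(order_counts, trade_counts):
    """Map raw per-interval observations to a low-dimensional t_y."""
    def mean(xs):
        return sum(xs) / len(xs)
    def quantile(xs, q):
        ys = sorted(xs)
        return ys[min(int(q * len(ys)), len(ys) - 1)]
    return (mean(order_counts), mean(trade_counts), quantile(order_counts, 0.95))

# eight illustrative intervals of order and trade counts
t_y = summarise([12, 7, 30, 9, 14, 11, 8, 25], [1, 0, 4, 0, 2, 1, 0, 3])
print(t_y)  # (mean order count, mean trade count, upper-tail order count)
```

The same function is applied unchanged to each simulated day, so observed and auxiliary datasets are always compared in the same low-dimensional space $\mathcal{T}$.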
Specifically, the joint posterior of the model parameters $\theta$ and auxiliary data $t \in \mathcal{T}$, given observed data $t_y$, is
$$\pi(\theta, t \mid t_y) \propto K_h(t_y - t)\, f(t \mid \theta)\, \pi(\theta), \qquad (16)$$
where $t \sim f(t \mid \theta)$ may be interpreted as the vector of summary statistics $t = T(x)$ computed from a dataset simulated according to the model, $x \sim f(x \mid \theta)$. Assuming such simulation is possible, data-generation under the model, $t \sim f(t \mid \theta)$, forms the basis of computation in the likelihood-free setting. The target marginal posterior $\pi_M(\theta \mid t_y)$ for the parameters $\theta$ is then obtained as
$$\pi_M(\theta \mid t_y) = c_M \int_{\mathcal{T}} K_h(t_y - t)\, f(t \mid \theta)\, \pi(\theta)\, dt, \qquad (17)$$
where $(c_M)^{-1} = \int_E \int_{\mathcal{T}} K_h(t_y - t)\, f(t \mid \theta)\, \pi(\theta)\, dt\, d\theta$ normalises (17) such that it is a density in $\theta$ (e.g. Reeves and Pettitt [2005]; Wilkinson [2013]; Blum [2010]; Sisson et al. [2007]; Fearnhead and Prangle [2012]). The function $K_h(t_y - t)$ is a standard kernel function, with scale parameter $h \ge 0$, which weights the intractable posterior with high density in regions $t \approx t_y$ where auxiliary and observed datasets are similar. As such, $\pi_M(\theta \mid t_y) \approx \pi(\theta \mid t_y)$ forms an approximation to the intractable posterior via (17), through standard smoothing arguments (e.g. Blum [2010]). In the case as $h \to 0$, so that $K_h(t_y - t)$ becomes a point mass at the origin (i.e. $t_y = t$) and is zero elsewhere, if $t_y$ is sufficient for $\theta$ then the intractable posterior marginal $\pi_M(\theta \mid t_y) = \pi(\theta \mid t_y) = \pi(\theta \mid y)$ is recovered exactly (although small $h$ is usually impractical). Various choices of smoothing kernel $K$ have been examined in the literature (e.g. Marjoram et al. [2003]; Beaumont et al. [2002]; Peters et al. [2012]).

For our discussion of likelihood-free or ABC samplers, it is convenient to consider a generalisation of the joint distribution (16) incorporating $S \ge 1$ auxiliary datasets:
$$\pi_J(\theta, t^{1:S} \mid t_y) \propto \tilde{K}_h(t_y, t^{1:S})\, f(t^{1:S} \mid \theta)\, \pi(\theta),$$
where $t^{1:S} = (t^1, \ldots, t^S)$ and $t^1, \ldots, t^S \sim f(t \mid \theta)$ are $S$ independent datasets generated from the (intractable) model. As the auxiliary datasets are, by construction, conditionally independent given $\theta$, we have $f(t^{1:S} \mid \theta) = \prod_{s=1}^S f(t^s \mid \theta)$. We follow Del Moral et al. [2012] and specify the kernel $\tilde{K}$ as $\tilde{K}_h(t_y, t^{1:S}) = S^{-1} \sum_{s=1}^S K_h(t_y - t^s)$, which produces the joint posterior
$$\pi_J(\theta, t^{1:S} \mid t_y) = c_J \left[ \frac{1}{S} \sum_{s=1}^S K_h(t_y - t^s) \right] \left[ \prod_{s=1}^S f(t^s \mid \theta) \right] \pi(\theta), \qquad (18)$$
with $c_J > 0$ a normalising constant; this extends the uniform-kernel case $K(t_y - t^s)$ considered by Del Moral et al. [2012] to the general case. It is easy to see that, by construction, $\int_{\mathcal{T}^S} \pi_J(\theta, t^{1:S} \mid t_y)\, dt^{1:S} = \pi_M(\theta \mid t_y)$, so that (18) admits the distribution (17) as a marginal distribution (c.f. Del Moral et al. [2012]). The case $S = 1$, with $\pi_J(\theta, t^{1:S} \mid t_y) = \pi(\theta, t \mid t_y)$, corresponds to the more usual joint posterior (16) in the likelihood-free setting.

There are two obvious approaches to posterior simulation from $\pi_M(\theta \mid t_y) \approx \pi(\theta \mid t_y)$ as an approximation to $\pi(\theta \mid y)$. The first approach proceeds by sampling directly on the augmented model $\pi_J(\theta, t^{1:S} \mid t_y)$, realising joint samples $(\theta, t^{1:S}) \in E \times \mathcal{T}^S$ before a posteriori marginalisation over $t^{1:S}$ (i.e. by discarding the $t^s$ realisations from the sampler output).
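The Monte Carlo role of the $S$ replicate datasets can be illustrated directly: for a given $\theta$, one simulates $t^1, \ldots, t^S$ and averages the kernel weights $K_h(t_y - t^s)$ to obtain an unnormalised pointwise estimate of the ABC marginal posterior. The simulator and summaries in this Python sketch are toy stand-ins (a Gaussian location model, flat prior), assumed for illustration, not the LOB model:

```python
import numpy as np

rng = np.random.default_rng(1)

def K_h(u, h):
    # Gaussian smoothing kernel K_h; the uniform kernel of Del Moral et al.
    # [2012] would serve equally well here.
    return np.exp(-0.5 * np.sum((u / h) ** 2))

def abc_marginal_estimate(theta, t_y, S=50, h=0.1):
    # Unnormalised estimate of pi_M(theta | t_y): a flat prior times the
    # average kernel weight over S replicate simulated summary datasets.
    weights = []
    for _ in range(S):
        x = rng.normal(loc=theta, scale=1.0, size=200)  # x ~ f(x | theta)
        t = np.array([x.mean(), x.std()])               # t = T(x)
        weights.append(K_h(t_y - t, h))
    return float(np.mean(weights))

t_y = np.array([1.5, 1.0])
near = abc_marginal_estimate(1.5, t_y)  # theta close to the "truth"
far = abc_marginal_estimate(4.0, t_y)   # theta far from the "truth"
```

As expected, the estimate is large near parameter values whose simulated summaries resemble $t_y$ and collapses towards zero far from them; increasing $S$ reduces the variance of each pointwise evaluation.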
In this approach, the summary quantities $t^{1:S}$ are treated as parameters in the augmented model.

The second approach is to sample from $\pi_M(\theta \mid t_y)$ directly, on a lower-dimensional space, by approximating the integral (17) via Monte Carlo integration in lieu of each posterior evaluation of $\pi_M(\theta \mid t_y)$. In this case
$$\pi_M(\theta \mid t_y) \propto \pi(\theta) \int_{\mathcal{T}} K_h(t_y - t)\, f(t \mid \theta)\, dt \approx \frac{\pi(\theta)}{S} \sum_{s=1}^S K_h(t_y - t^s) := \hat{\pi}_M(\theta \mid t_y), \qquad (19)$$
where $t^1, \ldots, t^S \sim f(t \mid \theta)$. This expression, examined by various authors (e.g. Marjoram et al. [2003]; Reeves and Pettitt [2005]; Ratmann et al. [2009]; Toni et al. [2009]; Peters et al. [2012]), requires multiple generated datasets $t^1, \ldots, t^S$ for each evaluation of the marginal posterior distribution $\pi_M(\theta \mid t_y)$. As with standard Monte Carlo approximations, $\mathrm{Var}[\hat{\pi}_M(\theta \mid t_y)]$ reduces as $S$ increases, with $\lim_{S \to \infty} \mathrm{Var}[\hat{\pi}_M(\theta \mid t_y)] = 0$. For the marginal posterior distribution, the quantities $t^{1:S}$ serve only as a means to estimate $\pi_M(\theta \mid t_y)$, and do not otherwise enter the model explicitly. The number of samples $S$ directly impacts the variance of the estimation.

3 Estimation of Bayesian LOB stochastic agent model via Population-based samplers
Population-based likelihood-free samplers were introduced to circumvent poor mixing in MCMC samplers (Sisson et al. [2007]; Toni et al. [2009]; Beaumont et al. [2009]; Peters et al. [2012]; Del Moral et al. [2012]). These samplers propagate a population of particles, $\theta^{(1)}, \ldots, \theta^{(N)}$, with associated importance weights $W(\theta^{(i)})$, through a sequence of related densities $\phi_1(\theta_1), \ldots, \phi_T(\theta_T)$, which defines a smooth transition from the distribution $\phi_1$, from which direct sampling is available, to $\phi_T$, the target distribution.

For likelihood-free or ABC samplers, $\phi_n$ is defined by allowing $K_{h_n}(t_y - t)$ to place greater density on regions for which $t_y \approx t$ as $n$ increases (that is, the bandwidth $h_n$ decreases with $n$). Hence, we denote $\pi_{J,n}(\theta, t^{1:S} \mid t_y) \propto \tilde{K}_{h_n}(t_y, t^{1:S})\, f(t^{1:S} \mid \theta)\, \pi(\theta)$ and $\pi_{M,n}(\theta \mid t_y) \propto \pi(\theta) \int_{\mathcal{T}^S} \tilde{K}_{h_n}(t_y, t^{1:S})\, f(t^{1:S} \mid \theta)\, dt^{1:S}$, for $n = 1, \ldots, T$, under the joint and marginal posterior models respectively.

SMC methods have emerged from the fields of engineering, probability and statistics in recent years. Variants of the methods sometimes appear under the names of particle filtering or interacting particle systems (e.g., Ristic et al. [2004], Doucet et al. [2001], Del Moral [2004]), and their theoretical properties have been extensively studied by Crisan and Doucet [2002], Del Moral [2004], Chopin [2004], and Künsch [2005].

The standard SMC algorithm involves finding a numerical solution to a set of filtering recursions, such as the filtering problems arising from nonlinear/non-Gaussian state space models. Under this framework, the SMC algorithm samples from an (often naturally occurring) sequence of distributions $\pi_n$, indexed by $n = 1, \ldots, T$. Each distribution is defined on the support $E^n = E \times E \times \cdots \times E$, for some generic space denoted $E$.
Recently, this class of algorithms was adapted to tackle the same class of problems typically addressed by MCMC methods, where one has instead a sequence of distributions $\{\pi_n\}_{n \ge 1}$, each defined on a fixed support $E$ (note: this is a fixed space $E$, not a product space). Del Moral et al. [2006], Peters [2005], Peters et al. [2012] and Targino et al. [2015] generalize the SMC algorithm to the case where the target distributions $\pi_n$ are all defined on the same support $E$. This generalization, termed the SMC sampler, adapts the SMC algorithm to the more popular setting in which the state space $E$ remains static, that is, the setting we discussed earlier with regard to the MCMC algorithms.

In short, the SMC sampler generates weighted samples (termed particles) from a sequence of distributions $\pi_n$, for $n = 1, \ldots, T$, where $\pi_T$ may be of particular interest; we refer to $\pi_T$ as the target distribution, such as a posterior distribution for model parameters. Procedurally, particles obtained from an arbitrary initial distribution $\pi_1$, with a set of corresponding initial weights, are sequentially propagated through each distribution $\pi_n$ in the sequence via three processes, involving mutation (or move), correction (or importance weighting), and selection (or resampling). The final weighted particles at distribution $\pi_T$ are considered weighted samples from the target distribution $\pi_T$. Hence, given a sequence of distributions $\{\pi_n(d\theta)\}_{n=1}^T$, the aim is to develop a large collection of $N$ weighted random samples at each time $n$, denoted by $\{W^{(i)}_n, \Theta^{(i)}_n\}_{i=1}^N$, such that $W^{(i)}_n > 0$ and $\sum_{i=1}^N W^{(i)}_n = 1$. These importance weights and samples are known as particles (hence the name often given to such algorithms: particle filters or interacting particle systems).
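The weighted-particle representation just described can be sketched in a few lines: given particles and normalised weights, any posterior expectation is estimated by the weighted sum below. The example uses plain self-normalised importance sampling with a toy Gaussian target, an assumed illustration rather than a full SMC run:

```python
import numpy as np

rng = np.random.default_rng(2)

# N weighted particles approximating a standard normal target pi, obtained by
# importance-sampling from a wider normal proposal (toy illustration only).
N = 20000
theta = rng.normal(0.0, 3.0, size=N)                # particles from the proposal
log_w = -0.5 * theta**2 + 0.5 * (theta / 3.0) ** 2  # log target - log proposal (unnormalised)
W = np.exp(log_w - log_w.max())
W /= W.sum()                                        # normalised weights: W_i > 0, sum to 1

# Weighted-particle estimate of E_pi[phi(Theta)] for phi(theta) = theta^2;
# the exact value for a standard normal target is 1.
estimate = float(np.sum(W * theta**2))
```

Note that only the unnormalised log target is needed, since the weight normalisation absorbs the unknown constant; this is the same property exploited by the SMC Sampler weight updates below.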
For such approaches to be sensible, we require that the empirical distributions constructed from these samples converge asymptotically ($N \to \infty$) to the target distribution $\pi_n$ for each time $n$. This means that for any $\pi_n$-integrable function, denoted for example by $\varphi(\theta): E \to \mathbb{R}$, one has the following convergence:
$$\sum_{i=1}^N W^{(i)}_n\, \varphi\big(\theta^{(i)}_n\big) \xrightarrow{a.s.} \mathbb{E}_{\pi_n}\big[\varphi(\Theta)\big]. \qquad (20)$$

In the SMC Sampler algorithm, a particular variant of SMC algorithms, a modification of the SMC algorithm is developed. Consider a generic sequence of distributions $\pi_n(\theta)$, $n = 1, \ldots, T$, with $\theta \in E$, where the final distribution $\pi_T$ is the distribution of interest. By introducing a sequence of backward kernels $L_k$, a new distribution
$$\tilde{\pi}_n(\theta_1, \ldots, \theta_n) = \pi_n(\theta_n) \prod_{k=1}^{n-1} L_k(\theta_{k+1}, \theta_k) \qquad (21)$$
may be defined for the path of a particle $(\theta_1, \ldots, \theta_n) \in E^n$ through the sequence $\pi_1, \ldots, \pi_n$. The only restriction on the backward kernels is that the correct marginal distributions $\int \tilde{\pi}_n(\theta_1, \ldots, \theta_n)\, d\theta_1 \cdots d\theta_{n-1} = \pi_n(\theta_n)$ are available. Within this framework, one may then work with the constructed sequence of distributions, $\tilde{\pi}_n$, under the standard SMC algorithm.

In summary, the SMC Sampler algorithm involves three stages:
1. Mutation, whereby the particles are moved from $\theta_{n-1}$ to $\theta_n$ via a mutation kernel $M_n(\theta_{n-1}, \theta_n)$;
2. Correction, where the particles are reweighted with respect to $\pi_n$ via the incremental importance weight (Eq. 22);
3. Selection, where, according to some measure of particle diversity, commonly the effective sample size, the weighted particles may be resampled in order to reduce the variability of the importance weights.

In more detail, suppose that at time $n-1$ the distribution $\tilde{\pi}_{n-1}$ can be approximated empirically by $\tilde{\pi}^N_{n-1}$ using $N$ weighted particles. These particles are first propagated to the next distribution $\tilde{\pi}_n$ using a mutation kernel $M_n(\theta_{n-1}, \theta_n)$, and then assigned new weights $W_n = W_{n-1}\, w_n(\theta_1, \ldots, \theta_n)$, where $W_{n-1}$ is the weight of a particle at time $n-1$ and $w_n$ is the incremental importance weight given by
$$w_n(\theta_1, \ldots, \theta_n) = \frac{\tilde{\pi}_n(\theta_1, \ldots, \theta_n)}{\tilde{\pi}_{n-1}(\theta_1, \ldots, \theta_{n-1})\, M_n(\theta_{n-1}, \theta_n)} = \frac{\pi_n(\theta_n)\, L_{n-1}(\theta_n, \theta_{n-1})}{\pi_{n-1}(\theta_{n-1})\, M_n(\theta_{n-1}, \theta_n)}. \qquad (22)$$

The resulting particles are now weighted samples from $\tilde{\pi}_n$. Consequently, from Eq. (22), under the SMC Sampler framework one may work directly with the marginal distributions $\pi_n(\theta_n)$, such that $w_n(\theta_1, \ldots, \theta_n) = w_n(\theta_{n-1}, \theta_n)$. While the choice of the backward kernels $L_{n-1}$ is essentially arbitrary, their specification can strongly affect the performance of the algorithm, as will be discussed in the following subsections. The basic version of the SMC Sampler algorithm therefore proceeds explicitly as given in Algorithm 3.1.

Remark 3.1 In all cases in which we utilize the incremental importance sampling weight correction, the arguments in the expressions only need to be known up to normalization. That is, it is perfectly acceptable to be able to evaluate the sequence of target distributions $\{\pi_n\}$ only up to a normalization constant. This holds as long as the same normalization constant is present for all particles, since the renormalization step corrects for this lack of knowledge in the importance weighting. In practice, this is critical to the application of such methods.

Algorithm 3.1 Sequential Monte Carlo Samplers
1. Initialize the particle system:
(a) Set $n = 1$;
(b) For $i = 1, \ldots, N$, draw initial particles $\Theta^{(i)}_1 \sim p(\theta)$;
(c) Evaluate the incremental importance weights $\{w_1(\Theta^{(i)}_1)\}$ using Equation (22) and normalize the weights to obtain $\{W^{(i)}_1\}$.
Iterate the following steps through each distribution in the sequence $\{\pi_n\}_{n=2}^T$.
2. Resampling:
(a) If the effective sample size, $ESS = \big[\sum_{i=1}^N (W^{(i)}_n)^2\big]^{-1}$, is less than a threshold $N_{\mathrm{eff}}$, then resample the particles via the empirical distribution of the weighted sample, either by multinomial or stratified methods; see the discussions of unbiased resampling schemes by Künsch [2005] and Del Moral [2004].
3. Mutation and correction:
(a) Set $n = n + 1$; if $n = T + 1$, then stop;
(b) For $i = 1, \ldots, N$, draw samples from the mutation kernel, $\Theta^{(i)}_n \sim M_n(\Theta^{(i)}_{n-1}, \cdot)$;
(c) Evaluate the incremental importance weights $\{w_n(\Theta^{(i)}_n)\}$ using Equation (22) and normalize the weights to obtain $\{W^{(i)}_n\}$ via
$$W^{(i)}_n = \frac{W^{(i)}_{n-1}\, w^{(i)}_n(\Theta_{n-1}, \Theta_n)}{\sum_{j=1}^N W^{(j)}_{n-1}\, w^{(j)}_n(\Theta_{n-1}, \Theta_n)}. \qquad (23)$$

At this stage, for practitioners wishing to utilise such SMC methods, it is informative to better understand what is known about this class of estimation methods in terms of the accuracy of such numerical approximations. We briefly mention some known results based on recent examples of concentration inequalities for particle methods that are finite-sample results (see the discussion and references by Del Moral et al. [2013]).

The exponential concentration inequalities presented here are satisfied under some regularity conditions on the particle weights and the mutation kernel $M_n$ when defined on some general state space $E^n$; see the specific probabilistic details of these conditions in Del Moral [2004].

Using the concentration analysis of mean field particle models, the following exponential estimate can be obtained (see the discussion in Del Moral [2004] and references therein). Note that in the following, when the $N$-particle approximation to a distribution or density such as $\pi$ is used, we denote it by $\pi^N$.

Theorem 3.1 (Finite Sample Exponential Concentration Inequality)
For any $x \ge 0$, $n \ge 1$, and any population size $N \ge 1$, the probability of the following event holds:
$$\Pr\left( \big| \pi^N_n(\varphi) - \pi_n(\varphi) \big| \le \frac{c_1}{N}\big(x + \sqrt{x}\big) + \frac{c_2}{\sqrt{N}} \sqrt{x} \right) \ge 1 - e^{-x}, \qquad (24)$$
where one defines the $N$-particle sample estimator as follows:
$$\pi^N_n(\varphi) = \sum_{i=1}^N W^{(i)}_n\, \varphi\big(\theta^{(i)}_n\big) \quad \text{and} \quad \pi_n(\varphi) = \int \varphi(\theta_n)\, \pi_n(\theta_n)\, d\theta_n. \qquad (25)$$

In the case of a stable SMC algorithm, that is, one that is insensitive to initial conditions, such as those we discussed earlier, the constants $c$ and $(c_1, c_2)$ do not depend on the time parameter. One can also bound the difference between the particle estimate of the target distribution and the true distribution as follows. Consider that for any $\theta = (\theta^i)_{1 \le i \le d}$ and any cells $(-\infty, x] = \prod_{i=1}^d (-\infty, \theta^i]$ in $E_n = \mathbb{R}^d$, we let
$$F_n(x) = \pi_n\big( \mathbf{1}_{(-\infty, x]} \big) \quad \text{and} \quad F^N_n(x) = \pi^N_n\big( \mathbf{1}_{(-\infty, x]} \big).$$

Using these definitions of the empirical particle-constructed distribution function and the target distribution function at index $n$ in the sequence of distributions $\{\pi_1, \pi_2, \ldots, \pi_T\}$, we can state the following corollary for the distribution functions of the sequence of densities $\pi_n$ given previously.

Corollary 3.1
For any $y \ge 0$, $n \ge 1$, and any population size $N \ge 1$, the probability of the event
$$\sqrt{N}\, \big\| F^N_n - F_n \big\| \le c \sqrt{d\,(y + 1)}$$
is greater than $1 - e^{-y}$. This concentration inequality ensures that the particle repartition function $F^N_n$ converges to $F_n$ almost surely in the uniform norm.

Formalizing this in the context of SMC Samplers for ABC posterior settings, the particle population $\theta_{n-1}$, drawn from the distribution $\phi_{n-1}(\theta_{n-1})$ at time $n-1$, is mutated towards $\phi_n(\theta_n)$ by the kernel $M_n(\theta_{n-1}, \theta_n)$. The weights for the mutated particles $\theta_n$ may be obtained as $W_n(\theta_n) = W_{n-1}(\theta_{n-1})\, w_n(\theta_{n-1}, \theta_n)$ where, for the marginal model sequence $\pi_{M,n}(\theta_n \mid t_y)$, the incremental weight is
$$w_n(\theta_{n-1}, \theta_n) = \frac{\pi_{M,n}(\theta_n \mid t_y)\, L_{n-1}(\theta_n, \theta_{n-1})}{\pi_{M,n-1}(\theta_{n-1} \mid t_y)\, M_n(\theta_{n-1}, \theta_n)} \approx \frac{\hat{\pi}_{M,n}(\theta_n \mid t_y)\, L_{n-1}(\theta_n, \theta_{n-1})}{\hat{\pi}_{M,n-1}(\theta_{n-1} \mid t_y)\, M_n(\theta_{n-1}, \theta_n)}, \qquad (26)$$
where, following (19), and setting the kernel bandwidth to an ABC tolerance level $\epsilon_n$, we obtain
$$\hat{\pi}_{M,n}(\theta_n \mid t_y) := \frac{\pi(\theta_n)}{S} \sum_{s=1}^S K_{\epsilon_n}(t_y - t^s),$$
which is proportional to an (unbiased) estimate of $\pi_{M,n}(\theta_n \mid t_y)$ based on $S$ Monte Carlo draws $t^1, \ldots, t^S \sim f(t \mid \theta_n)$. Here $L_{n-1}(\theta_n, \theta_{n-1})$ is a reverse-time kernel describing the mutation of particles from $\phi_n(\theta_n)$ at time $n$ to $\phi_{n-1}(\theta_{n-1})$ at time $n-1$. As with the ABC-MCMC algorithm, the incremental weight (26) consists of the "biased" ratio $\hat{\pi}_{M,n}(\theta_n \mid t_y)/\hat{\pi}_{M,n-1}(\theta_{n-1} \mid t_y)$ for finite $S \ge 1$. However, for the joint model $\pi_{J,n}(\theta, t^{1:S} \mid t_y)$, with the natural mutation kernel factorisation
$$M_n\big[(\theta_{n-1}, t^{1:S}_{n-1}), (\theta_n, t^{1:S}_n)\big] = M_n(\theta_{n-1}, \theta_n) \prod_{s=1}^S f(t^s_n \mid \theta_n)$$
(and similarly for $L_{n-1}$), following the form of (26), the incremental weight is exactly
$$w_n\big[(\theta_{n-1}, t^{1:S}_{n-1}), (\theta_n, t^{1:S}_n)\big] = \frac{\frac{1}{S}\sum_s K_{\epsilon_n}(t_y - t^s_n)\, \pi(\theta_n)\, L_{n-1}(\theta_n, \theta_{n-1})}{\frac{1}{S}\sum_s K_{\epsilon_{n-1}}(t_y - t^s_{n-1})\, \pi(\theta_{n-1})\, M_n(\theta_{n-1}, \theta_n)}. \qquad (27)$$

Hence, as the incremental weights (26, 27) are equivalent, they induce identical SMC algorithms for both the marginal and joint models $\pi_M(\theta \mid t_y)$ and $\pi_J(\theta, t^{1:S} \mid t_y)$. As a result, while applications of the marginal sampler targeting $\pi_M(\theta \mid t_y)$ are theoretically biased for finite $S \ge 1$, as before they are in practice unbiased through association with the equivalent sampler on the joint space targeting $\pi_J(\theta, t^{1:S} \mid t_y)$.

We note that a theoretically unbiased sampler targeting $\pi_M(\theta \mid t_y)$, for all $S \ge 1$, can be obtained by careful choice of the kernel $L_{n-1}(\theta_n, \theta_{n-1})$. For example, Peters [2005], Peters et al. [2012] and Targino et al. [2015] all use the suboptimal approximation to the optimal kernel, given by
$$L_{n-1}(\theta_n, \theta_{n-1}) = \frac{\pi_{M,n-1}(\theta_{n-1} \mid t_y)\, M_n(\theta_{n-1}, \theta_n)}{\int \pi_{M,n-1}(\theta_{n-1} \mid t_y)\, M_n(\theta_{n-1}, \theta_n)\, d\theta_{n-1}}, \qquad (28)$$
from which the incremental weight (26) is approximated by
$$w_n(\theta_{n-1}, \theta_n) = \pi_{M,n}(\theta_n \mid t_y) \Big/ \int \pi_{M,n-1}(\theta_{n-1} \mid t_y)\, M_n(\theta_{n-1}, \theta_n)\, d\theta_{n-1} \approx \hat{\pi}_{M,n}(\theta_n \mid t_y) \Big/ \sum_{i=1}^N W_{n-1}\big(\theta^{(i)}_{n-1}\big)\, M_n\big(\theta^{(i)}_{n-1}, \theta_n\big). \qquad (29)$$

Under this choice of backward kernel, the weight calculation is now unbiased for all $S \ge 1$, since the approximation $\hat{\pi}_{M,n-1}(\theta \mid t_y)$ in the denominator of (26) is no longer needed.

3.2.1 Adaptive Schedules: Choice of the sequence of ABC Bayesian distributions via an annealed tolerance schedule

In this section we consider how to take the ABC posterior distribution and construct the sequence of distributions that are required for the SMC Sampler. That is, we address the question of how to develop an ABC-specific sequence of target distributions. We have chosen to design this sequence by following what we call ABC reverse annealing. In particular, we construct a sequence of target posterior distributions $\{\phi_n\}_{n \ge 1}$ based on strictly decreasing tolerance values, generically denoted by the sequence $\{\epsilon_n\}_{n \ge 1}$, such that $\epsilon_1 > \epsilon_2 > \ldots > \epsilon_n > \ldots > \epsilon_T$. We obtain this sequence of ABC posterior distributions by considering $\phi_n$, which was defined with respect to the ABC likelihood involving a kernel. If we consider the kernel to have a decreasing bandwidth, $K_{\epsilon_n}(t_y - t)$, then we progressively place greater emphasis, i.e. density, on regions for which $t_y \approx t$ as $n$ increases (that is, the bandwidth $\epsilon_n$ decreases with $n$). Hence, we denote the two possible ABC constructions one may consider under the joint and marginal posterior models, respectively, in the SMC Samplers procedure as follows:
$$\pi_{J,n}(\theta, t^{1:S} \mid t_y) \propto \tilde{K}_{\epsilon_n}(t_y, t^{1:S})\, f(t^{1:S} \mid \theta)\, \pi(\theta), \qquad \pi_{M,n}(\theta \mid t_y) \propto \pi(\theta) \int_{\mathcal{T}^S} \tilde{K}_{\epsilon_n}(t_y, t^{1:S})\, f(t^{1:S} \mid \theta)\, dt^{1:S}. \qquad (30)$$

The aspect of this procedure that makes it adaptive is the selection of the size of the discrepancy between $\pi_n$ and $\pi_{n+1}$, for each $n \in \{1, \ldots, T\}$, as well as the final stopping point. In this chapter we propose to perform adaption of the ABC target distribution sequence at every step.
The aim is to progressively select a sequence of distributions online, such that the discrepancy between the next distribution and the previous one, as controlled by the tolerance sequence $\epsilon_n$, is governed by the "fitness" or efficiency of the particle approximation of the previous target distribution in the sequence. A good approximation indicates that one may take a larger step, whilst a poorer approximation indicates that smaller steps should be taken.

Formally, we perform the adaption such that a new tolerance $\epsilon_n$, at iteration $n$, is generated through a particle-system-based quantile matching procedure. The procedure adopted considers the new tolerance to be obtained as the solution for $\epsilon$ in the following equation:
$$\hat{q}_{n-1} = \epsilon \sqrt{2}\, \mathrm{erf}^{-1}(2q - 1), \qquad (31)$$
where $q$ is a user-specified quantile level, the right-hand side is the $q$-quantile $F^{-1}(q)$ of $F$, the CDF of a normal distribution with mean 0 and standard deviation $\epsilon$, and $\hat{q}_{n-1}$ is the particle population quantile estimate obtained from the ABC posterior approximation after the correction stage in the SMC Sampler ABC algorithm. In this way the tolerance schedule continually adapts to the local particle approximation performance. In practice, it is computationally efficient to employ the following adaptive schedule for the tolerance in the ABC posterior, where we ensure that the sequence of distributions is designed such that the new tolerance follows a strictly decreasing schedule given by
$$\epsilon_n = \min\big( (1 - \alpha)\, \epsilon_{n-1},\, \epsilon^* \big), \qquad (32)$$
where $\alpha \in (0, 1)$ and $\epsilon^*$ denotes the candidate tolerance obtained from (31).

3.2.2 Choice of Mutation Kernel

There are many choices of mutation kernel that could be considered when designing an SMC Sampler algorithm. The choice of kernel is often critical for the algorithm to perform well.
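Before turning to the mutation kernel, the overall SMC-ABC procedure of Algorithm 3.1, combined with the annealed tolerance schedule above, can be sketched compactly. The toy one-parameter Gaussian model and all tuning values below are illustrative assumptions, and the weight update follows the simpler population Monte Carlo style of Beaumont et al. [2009] rather than the exact incremental weight (22):

```python
import numpy as np

rng = np.random.default_rng(3)

def norm_pdf(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def simulate_summary(theta):
    # Toy stand-in for the LOB simulator plus summary map t = T(x):
    # the sample mean of 100 Gaussian observations.
    return rng.normal(theta, 1.0, size=100).mean()

def abc_smc(t_y, N=300, T=8, alpha=0.3, eps0=2.0, rw_sd=0.5,
            prior_lo=-5.0, prior_hi=5.0):
    # Initialisation: rejection sampling from the prior at tolerance eps0.
    theta = []
    while len(theta) < N:
        cand = rng.uniform(prior_lo, prior_hi)
        if abs(simulate_summary(cand) - t_y) < eps0:
            theta.append(cand)
    theta = np.array(theta)
    W = np.full(N, 1.0 / N)
    eps = eps0
    for n in range(1, T):
        eps = (1.0 - alpha) * eps  # strictly decreasing schedule, geometric part of (32)
        new_theta = np.empty(N)
        new_W = np.empty(N)
        for i in range(N):
            while True:
                parent = rng.choice(theta, p=W)       # selection
                cand = parent + rw_sd * rng.normal()  # Gaussian random-walk mutation
                if prior_lo <= cand <= prior_hi and \
                        abs(simulate_summary(cand) - t_y) < eps:
                    break
            new_theta[i] = cand
            # PMC-style correction: (flat) prior over the mixture of moves.
            new_W[i] = 1.0 / np.sum(W * norm_pdf(cand, theta, rw_sd))
        theta, W = new_theta, new_W / new_W.sum()
    return theta, W, eps

particles, weights, eps_final = abc_smc(t_y=1.0)
```

Each pass tightens the tolerance and the population concentrates on parameter values whose simulated summaries match $t_y$; the final weighted particles approximate the ABC posterior at the terminal tolerance.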
In this section, we first survey a few possible choices, and then present a specialized choice of mutation kernel adopted from the genetic search literature [Li and Zhang, 2009], which involves a combination of mutation and cross-over operators for the particle mutation in the SMC Sampler. In order to utilise this class of mutation operator in SMC Sampler settings, we had to write down formally not just the mutation and cross-over operators in structural form, as typically specified in the NSGA-II class of genetic optimization algorithms, but also their distributional form. We provide these at the end of the section.

Some examples of possible choices of the mutation kernel are given as follows:

1.
Independent kernels. In this setting, one selects a mutation kernel given for all $n \in \{1, 2, \ldots, T\}$ by $M_n(\theta_{n-1}, \theta_n) = M_n(\theta_n)$;

2. Local Random Walks. In this setting, the kernel is selected for all $n \in \{1, 2, \ldots, T\}$ to be of the form $M_n(\theta_{n-1}, \theta_n)$, where the mutation from $\theta_{n-1}$ to $\theta_n$ follows a local random walk based around, say, a Gaussian smoothing kernel as given by Givens and Raftery [1996];

3. Markov chain Monte Carlo kernels. In this setting, the kernel is selected for all $n \in \{1, 2, \ldots, T\}$ to be an MCMC kernel with invariant distribution $\pi_n$. As noted by Del Moral et al. [2006] and Peters [2005], this option is suitable if the Markov chain kernel mixes rapidly, or if the sequence of distributions is such that $\pi_{n-1}$ is close to $\pi_n$, which is often the case by design. The use of an MCMC kernel then amounts to running, at each stage, $N$ inhomogeneous Markov chains initialised from $\hat{\pi}^N_{n-1} = \sum_{i=1}^N W^{(i)}_{n-1}\, \delta_{\theta^{(i)}_{n-1}}(\theta)$, with $L$ iterations of the Markov chain for each particle. Each of the $N$ chains then targets $\sum_{i=1}^N W^{(i)}_{n-1} \prod_{l=1}^L M_l(\theta^{(i)}_{l-1}, \theta_l)$, which is not in general $\pi_n$; one must therefore correct for the fact that these chains do not target the correct distribution, which is achieved using an importance sampling (IS) correction. With such a correction, this approach is accurate and unbiased (i.e., it targets the distribution of interest at time $n$, given by $\pi_n$);

4. Gibbs Sampler kernels. If the sequence of target distributions $\{\pi_n\}_{n \ge 1}$ has multivariate support, then it may also be possible to sample from the full conditional distributions of the distributions in the sequence. This approach allows one to undertake a Gibbs step, which involves, for the update of the $k$-th element, a kernel of the form
$$M_n(\theta_{n-1}, d\theta_n) = \delta_{\theta_{n-1,-k}}(d\theta_{n,-k})\, \pi_n(\theta_{n,k} \mid \theta_{n,-k}), \qquad (33)$$
with $\theta_{n,-k} = (\theta_{n,1}, \theta_{n,2}, \ldots, \theta_{n,k-1}, \theta_{n,k+1}, \ldots, \theta_{n,J})$, where there are $J$ parameters in the LOB model target posterior. If the full conditionals are not available, one could approximate them accurately at each stage and then correct for the approximation error through IS;

5. Mixture kernels. It is always possible to consider a mixture kernel choice given by
$$M_n(\theta_{n-1}, \theta_n) = \sum_{m=1}^M \alpha_{n,m}(\theta_{n-1})\, M_{n,m}(\theta_{n-1}, \theta_n), \qquad (34)$$
with $\alpha_{n,m}(\theta_{n-1}) > 0$ and $\sum_{m=1}^M \alpha_{n,m}(\theta_{n-1}) = 1$. One special case of this type of kernel is an independent kernel constructed from a kernel density estimate, with $M_{n,m}(\theta_{n-1}, \theta_n) = M_n(\theta_{n-1}, \theta_n)$ for all $m$ and $\alpha_{n,m}(\theta_{n-1}) = W^{(i)}_{n-1}$ with $M = N$;

6. Partial Rejection Control (PRC) kernels. In this case, one aims to construct a mutation kernel in the SMC Sampler that guarantees all sampled particles have importance weights with a "fitness" exceeding a user-specified threshold at each time $n$, denoted by $c_n$, such that $w^{(i)}_n \ge c_n$ for all $i \in \{1, 2, \ldots, N\}$. To achieve this, one modifies any of the earlier mutation kernels to take the form
$$M^*_n\big(\theta^{(i)}_{n-1}, \theta_n\big) = r\big(c_n, \theta^{(i)}_{n-1}\big)^{-1} \min\left\{ 1,\ \frac{W^{(i)}_{n-1}\, w_n\big(\theta^{(i)}_{n-1}, \theta_n\big)}{c_n} \right\} M_n\big(\theta^{(i)}_{n-1}, \theta_n\big). \qquad (35)$$
The quantity $r(c_n, \theta^{(i)}_{n-1})$ denotes the normalizing constant for particle $\theta^{(i)}_{n-1}$, given by
$$r\big(c_n, \theta^{(i)}_{n-1}\big) = \int \min\left\{ 1,\ \frac{W^{(i)}_{n-1}\, w_n\big(\theta^{(i)}_{n-1}, \theta_n\big)}{c_n} \right\} M_n\big(\theta^{(i)}_{n-1}, \theta_n\big)\, d\theta_n. \qquad (36)$$
Note that $0 < r(c_n, \theta_{n-1}) \le 1$, so that $M^*_n$ is normalized, $\int M^*_n(\theta_{n-1}, \theta_n)\, d\theta_n = 1$, provided the PRC threshold $0 \le c_n < \infty$ is finite. The sequence of PRC thresholds is then user-specified to ensure a certain particle "fitness" at each stage of the SMC Sampler. We detail this example more explicitly in a later section.

7. Genetic Mutation and Cross-Over operators. In this class of mutation kernel for the SMC Sampler, we consider the class of genetic algorithm type mutations.
In particular, we describe the class of MOEA mutation and crossover operators widely used in stochastic search algorithms, introduced in the method of Deb et al. [2002]. This class of mutation kernel is the most widely used operator in multi-objective optimisation, and we demonstrate its adaption to the SMC Sampler framework.

A disadvantage of the NSGA-II operators is that they are only able to mutate binary, integer, or real encodings of the output parameter vectors, whereas the stochastic process for the limit order submission activity by liquidity providers requires the specification of a positive definite and symmetric covariance matrix for the generation of intensities from a multivariate skew-t distribution. The positive definiteness and symmetry constraints of the covariance matrix will not be preserved if one simply employs the evolutionary operators above to produce new sets of covariance matrix candidate solutions. For this reason, Panayi and Peters [2015] introduced a new covariance mutation operator which generates candidate covariance matrices that remain in the manifold of positive definite matrices.
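Before detailing the operators formally, the three moves can be sketched in Python. The bounds and parameter values are illustrative assumptions; the SBX and polynomial-mutation forms below are the standard unbounded NSGA-II variants (without the boundary-distance corrections used in the text), and the covariance move uses a Wishart-style proposal as a simpler stand-in for the Inverse-Wishart mixture of Panayi and Peters [2015]:

```python
import numpy as np

rng = np.random.default_rng(4)

def sbx_crossover(x, y, eta_c=5.0, lo=-10.0, hi=10.0):
    # Simulated binary crossover for a single real-valued element.  This is
    # the standard unbounded NSGA-II spread distribution; the bounded
    # variant of the text additionally rescales via alpha as in (38).
    u = rng.uniform()
    if u <= 0.5:
        beta_bar = (2.0 * u) ** (1.0 / (eta_c + 1.0))
    else:
        beta_bar = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta_c + 1.0))
    child = 0.5 * ((1.0 - beta_bar) * x + (1.0 + beta_bar) * y)
    return float(np.clip(child, lo, hi))

def polynomial_mutation(x, eta_m=10.0, lo=-10.0, hi=10.0):
    # Polynomial mutation, simple variant without the boundary-distance
    # correction delta used in the text.
    gamma = rng.uniform()
    if gamma < 0.5:
        delta_bar = (2.0 * gamma) ** (1.0 / (eta_m + 1.0)) - 1.0
    else:
        delta_bar = 1.0 - (2.0 * (1.0 - gamma)) ** (1.0 / (eta_m + 1.0))
    return float(np.clip(x + delta_bar * (hi - lo), lo, hi))

def covariance_mutation(Sigma, nu=50):
    # Positive-definiteness-preserving covariance move: a Wishart-style
    # proposal whose mean is Sigma (a simpler stand-in for the
    # Inverse-Wishart mixture operator of Panayi and Peters [2015]).
    d = Sigma.shape[0]
    L = np.linalg.cholesky(Sigma)
    X = L @ rng.normal(size=(d, nu))
    return (X @ X.T) / nu  # symmetric, positive definite a.s. for nu >= d
```

The crossover and mutation moves act elementwise on real-encoded particles, while the covariance move proposes a whole matrix at once, so that symmetry and positive definiteness hold by construction.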
Simulated Binary Crossover (SBX):
From two previous particles $\theta^{(i)}_{n-1}$, $\theta^{(j)}_{n-1}$, a new solution $\theta^{(i)}_n$ is formed, where the $k$-th element is crossed as follows:
$$\theta^{(i,k)}_n = \frac{1}{2}\left[ (1 - \bar{\beta})\, \theta^{(i,k)}_{n-1} + (1 + \bar{\beta})\, \theta^{(j,k)}_{n-1} \right]. \qquad (37)$$
Here, $\bar{\beta}$ is a random sample from a distribution with density implied by
$$\bar{\beta} = \begin{cases} (\alpha u)^{\frac{1}{\eta_c + 1}}, & \text{if } u \le \frac{1}{\alpha}, \\[4pt] \left( \dfrac{1}{2 - \alpha u} \right)^{\frac{1}{\eta_c + 1}}, & \text{otherwise}, \end{cases}$$
where $u \sim U(0, 1)$ and $\alpha = 2 - \beta^{-(\eta_c + 1)}$, with
$$\beta = 1 + \frac{2}{\theta^{(j,k)}_{n-1} - \theta^{(i,k)}_{n-1}} \min\left[ \big(\theta^{(i,k)}_{n-1} - \theta^k_L\big),\ \big(\theta^k_U - \theta^{(j,k)}_{n-1}\big) \right]. \qquad (38)$$
This produces a mutation kernel for this type of move at SMC Sampler iteration $n$, in which the $k$-th element of the $i$-th particle vector is updated according to the density
$$M_n\big(\theta^{(i,k)}_n \mid \theta^{(i,k)}_{n-1}, \theta^{(j,k)}_{n-1}\big) = \frac{2}{\theta^{(j,k)}_{n-1} - \theta^{(i,k)}_{n-1}} \left[ \mathbf{1}\!\left\{\theta^{(i,k)}_n \in \left[ \theta^{(i,k)}_{n-1},\ \tfrac{\theta^{(i,k)}_{n-1} + \theta^{(j,k)}_{n-1}}{2} \right]\right\} \frac{\eta_c + 1}{\alpha} \left( \frac{\theta^{(i,k)}_{n-1} + \theta^{(j,k)}_{n-1} - 2\theta^{(i,k)}_n}{\theta^{(j,k)}_{n-1} - \theta^{(i,k)}_{n-1}} \right)^{\eta_c} + \mathbf{1}\!\left\{\theta^{(i,k)}_n \in \left[ \tfrac{\theta^{(i,k)}_{n-1} + \theta^{(j,k)}_{n-1}}{2} (1 - \alpha)^{\frac{1}{\eta_c + 1}},\ \theta^{(i,k)}_{n-1} \right]\right\} \frac{\eta_c + 1}{\alpha} \left( \frac{\theta^{(i,k)}_{n-1} + \theta^{(j,k)}_{n-1} - 2\theta^{(i,k)}_n}{\theta^{(j,k)}_{n-1} - \theta^{(i,k)}_{n-1}} \right)^{-(\eta_c + 2)} \right].$$
We use the crossover operator with probability $p_c$, with distribution index $\eta_c = 5$. Every element $k$ of the $i$-th particle vector is crossed with probability 0.5.

Polynomial mutation:
The mutation operator perturbs elements of the solution according to the distance from the boundaries:
$$\theta^{(i,k)}_n = \theta^{(i,k)}_{n-1} + \bar{\delta}\, \big(\theta^k_U - \theta^k_L\big),$$
where for $\bar{\delta}$ we have
$$\bar{\delta} = \begin{cases} \left[ 2\gamma + (1 - 2\gamma)(1 - \delta)^{\eta_m + 1} \right]^{\frac{1}{\eta_m + 1}} - 1, & \text{if } \gamma < 0.5, \\[4pt] 1 - \left[ 2(1 - \gamma) + 2(\gamma - 0.5)(1 - \delta)^{\eta_m + 1} \right]^{\frac{1}{\eta_m + 1}}, & \text{if } \gamma \ge 0.5, \end{cases}$$
with
$$\delta = \frac{\min\left[ \big(\theta^{(i,k)}_n - \theta^k_L\big),\ \big(\theta^k_U - \theta^{(i,k)}_n\big) \right]}{\theta^k_U - \theta^k_L},$$
where $\gamma \sim U(0, 1)$. This produces a mutation kernel at SMC Sampler iteration $n$ in which the $k$-th element of the $i$-th particle vector is updated according to the density
$$M_n\big(\theta^{(i,k)}_n \mid \theta^{(i,k)}_{n-1}\big) = \frac{1}{\theta^k_U - \theta^k_L} \left[ \mathbf{1}\!\left\{\theta^{(i,k)}_n \le \theta^{(i,k)}_{n-1}\right\} \frac{(\eta_m + 1)(\bar{\delta} + 1)^{\eta_m}}{2 - (1 - \delta)^{\eta_m + 1}} + \mathbf{1}\!\left\{\theta^{(i,k)}_n > \theta^{(i,k)}_{n-1}\right\} \frac{(\eta_m + 1)(1 - \bar{\delta})^{\eta_m}}{2 - (1 - \delta)^{\eta_m + 1}} \right],$$
with $\eta_m = 10$. The polynomial mutation operator is used with probability $p_m$.

Covariance mutation operator:
In the $n$-th generation of the MOEA, we generate $\{\Sigma^{(i)}_n\}$, $i = 1, \ldots, N$, from a mixture distribution $M_n(\Sigma^{(i)}_n)$ defined as follows:
$$M_n\big(\Sigma^{(i)}_n\big) = (1 - w)\, \mathcal{IW}(\Psi_n, p_1) + w\, \mathcal{IW}(\Psi_0, p_2),$$
where $\mathcal{IW}$ denotes the Inverse Wishart distribution, $p_1, p_2$ are degrees-of-freedom parameters with $p_1 < p_2$, and $w$ is small, so that sampling from the second component happens infrequently. Here $\Psi_0$ denotes an uninformative positive definite matrix, with the effect that sampling from the second component leads to moves away from the local region being explored. $\Psi_n$ is also a positive definite matrix, fitted by moment matching to the sample mean of the successfully proposed candidate solutions in the previous stages of the multi-objective optimisation, as follows:
$$\Psi_n = \frac{1}{\sum_{s=1}^n w_s} \sum_{s=1}^n w_s\, \frac{1}{\sum_{i=1}^N 1/r^{(i)}_s} \sum_{i=1}^N \frac{1}{r^{(i)}_s}\, \tilde{\Sigma}^{(i)}_s,$$
where $r^{(i)}_s$ is the non-domination rank of the $i$-th solution in the $s$-th generation, and the $w_s$ are increasing generation weights.

The data employed in this study constitute the intra-day trading activity on the European multilateral trading facility (MTF) Chi-X Europe between January and April 2012. Chi-X Europe operated as an individual entity from 2007, before being purchased by BATS Europe at the end of the trading period under consideration. We note that Chi-X Europe is a secondary exchange, i.e. the securities that are traded on the exchange are listed and primarily traded on national/supranational exchanges, including the London Stock Exchange, Euronext, Deutsche Boerse, and the SIX Swiss Exchange, amongst others. However, it maintains a significant proportion of the daily trading activity in each of these markets, between 20% and 35% in most cases.

The complete dataset covers over 1300 assets, primarily stocks, but also including exchange-traded funds (ETFs) and American depositary receipts (ADRs).
For the purposes of this study, we select one of the most commonly traded stocks in the French CAC 40 Index, namely BNP Paribas SA. Figure 2 shows the evolution of the LOB on a typical day for this asset, based on real market observation data from the LOB. We also present a heatmap of the inside spread $S_t = P_t^{a} - P_t^{b}$ over the two-month period February to March 2012. The inside spread is the most common measure of 'liquidity', i.e. the relative ease with which one can buy or sell a financial asset.

Chi-X Europe operates both a visible and a hidden order book, and traders have the option to route orders to the hidden book if they meet certain conditions relating to order type and size. The dataset consists only of data in the visible book, after it has been processed by the exchange's matching engine. That is, while the exchange allows for a range of order types with time-in-force modifiers, the processed data consists of the timestamps and order sizes of limit order submissions,
Figure 2: (Left) A representation of real market data intra-day LOB states obtained from the trading activity for the asset BNP Paribas SA on the 5th of March 2012. The shading of each box indicates the volume available at that price: volume available to buy for blue-bordered boxes and volume available to sell for red-bordered boxes. (Right) A heatmap of the intra-day spread for the period February to March 2012 for the asset BNP Paribas SA.

executions and cancellations. However, this data is sufficient to construct a much more detailed picture of the state of the LOB than is typically available in previous studies (which consider only aggregate volume in either the first level or the first 5 levels), as we can disaggregate the volumes per level up to any depth in the LOB. The raw, unevenly spaced data is thus used to construct the state of the LOB at each event timestamp (these are accurate up to millisecond precision). Because of our interest in fitting the auxiliary models describing price and volume dynamics (these are outlined in Section 5), however, we subsample the process at regular 10 second intervals, in order to then extract the price and volume variables of interest. Thus, from an irregularly spaced process typically containing between 50,000 and 500,000 events every day, we extract a regular time series of the auxiliary model variables for the purposes of our estimation.
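The 10-second subsampling step can be sketched as follows, under the assumption that the prevailing LOB state is carried forward to each grid point (last observation carried forward); the helper name and argument layout are ours, not from the paper.

```python
from bisect import bisect_right

def subsample_states(event_times, states, start, end, step=10.0):
    """Sample an irregularly spaced event stream on a regular time grid.

    event_times : sorted event timestamps (in seconds),
    states      : LOB state recorded at each event,
    start, end  : window of interest, step: grid spacing in seconds.

    At each grid point the most recent state is carried forward; grid
    points before the first event are skipped.
    """
    grid, sampled = [], []
    t = start
    while t <= end:
        i = bisect_right(event_times, t)   # number of events with timestamp <= t
        if i > 0:
            grid.append(t)
            sampled.append(states[i - 1])  # last observation carried forward
        t += step
    return grid, sampled
```

From the resulting regular series one would then compute the price and volume variables (e.g. mid-prices and aggregate level 1-5 volumes) used by the auxiliary models.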
The results presented in this section may be compared to those obtained from the indirect inference procedures reported in Panayi and Peters [2015]. To enable an equivalent comparison between the two frameworks, we also provide results for what they termed the benchmark 'reference' model, which makes a series of assumptions in order to simplify estimation and the model structure. This basic reference model has the parameter vector $\{\mu_{LO,p}, \mu_{LO,d}, \mu_{MO}, \gamma, \nu, \sigma_{MO}\}$, as well as the covariance matrix $\Sigma$ to be estimated; see Panayi and Peters [2015] for details. The results will be presented in terms of the ABC marginal posterior distributions of the individual parameters of the LOB simulation, and in terms of the ability of the resulting stochastic agent-based LOB model to reproduce realistic features of the simulated LOB intra-daily.

In our introduction to likelihood-free methods in Section 2.2.1, we discussed the reduction of the observed data $y$ to a low-dimensional vector of summary statistics $t_y$. We are interested specifically in two of the most commonly studied LOB characteristics, which correspond to the volatility of the log returns obtained from the price process dynamic (as obtained from half the inside spread), and the evolution of the volume resting on the LOB (as measured by the instantaneous aggregate total volume on the bid and ask at levels 1 to 5). The summaries we adopt at this stage are less standard in ABC applications, since they employ a functional (i.e. regression-model-based) summary of features of the observable LOB process. In this case the summary information becomes the model characterisation (a dimension reduction) captured by the estimated model parameters fit to the real LOB data and the simulated LOB data for the price or volume dynamic. Specifically we have:

Auxiliary Model 1 - Price features:
If we denote the mid-price by $p_t^{mid} = \frac{1}{2}\big(p_t^{a} + p_t^{b}\big)$, then the log return is defined as
\[
r_t = \ln \frac{p_t^{mid}}{p_{t-\Delta t}^{mid}},
\]
where $\Delta t$ is a suitable interval, in our case 1 minute. We fit a GARCH(1,1) model to this aspect of the data, parameterised by $\widehat{\beta}_1$.

Auxiliary Model 2 - Volume features:
We fit an MA(1) model to the detrended total volume (i.e. an ARIMA(0,1,1) model) in the first 5 levels on both the bid and ask sides, parameterised by $\widehat{\beta}_2$, in order to capture the time-series structure of the LOB volumes.

The auxiliary models are fit to both the real and simulated data, and for the distances we estimate the Euclidean distances between the auxiliary parameter vectors,
\[
D_1 = D\big(\widehat{\beta}_1(y), \widehat{\beta}_1(y^*(\theta))\big),
\qquad
D_2 = D\big(\widehat{\beta}_2(y), \widehat{\beta}_2(y^*(\theta))\big).
\]
To perform the estimation there are also a number of inputs to the SMC Sampler ABC algorithm that we must specify, including the number of particles, the forced decrement amount of the tolerance schedule, and the total number of iterations over which to run the estimation. Specifically, for our estimation procedure:
• the estimation procedure was run for 20 iterations;
• the tolerance schedule employed was the forced decrement schedule specified in Section 3.2.1, with a decrement parameter $\alpha$;
• we obtained results using 50, 100 and 200 particles per iteration;
• we also tested the quality of the results for a series of quantile levels for the tolerance.

Carrying out the estimation procedure for each configuration above indicated that the best results (in terms of the lowest values of $D_1$, $D_2$) were obtained for one particular quantile level for the tolerance, and 200 particles. We repeated the estimation procedure 20 times with this configuration, and Figure 3 shows the evolution of the tolerance in the ABC posterior in the case of the forced tolerance schedule, when the estimation is run for $T = 20$ iterations.

We note that the mutation operator for the covariance matrix, specified in Section 3.2.2, which was composed of both an exploration and a mutation component, could lead to particle degeneracy in higher dimensions.
Consequently, in practice it can be computationally more efficient to simplify the mutation kernel for the covariance matrix to a static mutation kernel, which eliminates the prior weighting in the numerator and denominator of each incremental particle weight. When this was done, it produced particle systems with fewer degeneracy issues in higher dimensions. Secondly, due to the nature of the crossover operator, it is possible for a particle to cross with an identical particle, for example if the two particles were produced in the resampling step of the previous iteration. In our estimation we explicitly exclude this possibility: where a particle is chosen to cross with an identical particle, it is instead mutated using the operator specified in Section 3.2.2.

Figure 3: The adaptively estimated tolerance schedule, specified in Section 3.2.1, obtained from multiple SMC Sampler-ABC runs on real data for BNP Paribas on 05/03/2012.
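As an illustration, a forced-decrement tolerance update of the kind referred to above can combine a quantile of the current particle distances with a forced geometric decrement. The exact rule is the one given in Section 3.2.1; the quantile-plus-decrement combination sketched here is an assumption on our part.

```python
def next_tolerance(distances, eps_prev, q=0.5, alpha=0.9):
    """One assumed form of a forced-decrement ABC tolerance update.

    Take the q-quantile of the current particle distances, but force the
    tolerance to shrink by at least a factor alpha per iteration, so the
    schedule cannot stall when the particle cloud stops improving.
    """
    ds = sorted(distances)
    # empirical q-quantile by nearest rank
    k = min(len(ds) - 1, max(0, int(q * len(ds))))
    quantile = ds[k]
    return min(quantile, alpha * eps_prev)
```

The `min` ensures the schedule is monotonically decreasing at rate at least `alpha`, matching the qualitative behaviour seen in Figure 3.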
Having run the SMC Sampler-ABC algorithm on the BNP Paribas LOB data for 05/03/2012, we obtained estimates of the posterior for the agent-based LOB simulation model. The first set of results shows the accuracy with which the LOB model replicates features of the real LOB stochastic process relating to price and volume dynamics. This is clearly illustrated in Figure 4 in terms of the values of the objective functions $D_1$, $D_2$ for each of the particles at the final iteration stage of the SMC Sampler-ABC algorithm. This is the standard way in which results for optimisation

Figure 4: The realized objective function values (distance metrics $D_1$ and $D_2$) from each particle in the SMC Sampler-ABC algorithm at the final iteration, for independent trials on the real data for BNP Paribas on 05/03/2012. The x-axis shows the GARCH(1,1) model parameter distance discrepancies for the intra-day volatility dynamic of the price process.
The y-axis shows the ARIMA(0,1,1) model parameter distance discrepancies for the intra-day volume process dynamics.

using multi-objective evolutionary algorithms (MOEAs) are presented (see the discussion in Panayi and Peters [2015]), in order to show the Pareto optimal front that is obtained in that setting; see also the discussion in Section 5.3.

We also present realisations of the intra-day LOB evolution for both the particle with the highest weight and the weighted mean of the particles in Figure 5. We note that there are differences in the intra-day dynamics of the simulated financial market resulting from different repetitions of the estimation procedure. However, we note that for a subset of particles we can recover price and volume dynamics that are similar to those observed in the real market (an example of which was seen in Figure 2). To complete the analysis, we also illustrate the median of the resulting marginal posterior distributions for the model parameters, obtained from 20 independent runs of the SMC Sampler-ABC algorithm for the BNP Paribas data on 05/03/2012. These results are presented in Figure 8.

The method introduced in Panayi and Peters [2015] is a combination of simulation-based indirect inference (II) and multi-objective optimisation, denoted the Multi-objective-II estimation framework. In common with ABC, II is used when one cannot write down the likelihood of the data generating model in closed form, but realisations are easily obtained via simulation given model parameters $\theta$. II introduces a new, 'auxiliary' model (with parameter vector $\beta$), which is fit to a transform of both the real and simulated data ($y$ and $y^*(\theta)$, respectively), and the objective is to find the model parameter vector $\hat{\theta}$ which minimises some distance metric $D(\beta(y), \beta(y^*(\theta)))$.

The multi-objective extension to the standard II procedure pertains to the objective function $D(\beta(y), \beta(y^*(\theta)))$.
Where standard II procedures consider a scalar output of the objective function, the Multi-objective-II method considers a vector-valued output, where each element of the vector
Figure 5: Representations of simulated intra-day LOB states obtained using (Top) the MAP particle from a single estimation procedure and (Bottom) the MMSE particle estimates.

pertains to a different feature of the LOB stochastic process. In this framework, the search is then for non-dominated parameter vectors, i.e. such that there is no parameter vector in the search space that can unilaterally improve a single criterion (an element of the objective function vector) without worsening another criterion. The procedure uses the same mutation and crossover kernels outlined in Section 3.2.2 and outputs a set of
Pareto optimal solutions; see Panayi and Peters [2015] for details.

Where the SMC-ABC method returns a family of particles and associated weights, the MOEA-II procedure returns a family of particles and their non-domination ranks. Our comparison is then between the highest-weighted particles returned by the former procedure and the non-dominated particles returned by the latter. The results have been found to be comparable between the two methods, both in terms of achieving similar objective function values and in terms of producing simulations which resemble real financial markets. That is, while not all highest-weight/non-dominated particles will give rise to realistic financial market simulations, there is a subset that
Figure 6: Representations of simulated intra-day LOB states obtained using (Top) the MAP particle from a single estimation procedure and (Bottom) the MMSE particle estimates.

do. While we have tried to provide a fair comparison between the two methods by utilising the same mutation and crossover operators, we should highlight some differences between the MOEA-II procedure and the SMC-ABC procedure presented in this paper. Firstly, the MOEA-II procedure did not suffer from particle degeneracy when using the adaptive mutation kernel for the covariance mutation operator, and thus the operator in Section 3.2.2 was utilised as described. Secondly, whereas the probability of crossover between particles in the MOEA-II procedure was set at the default value of 0.7 in every iteration, this was found to cause additional particle degeneracy issues in the SMC-ABC setting, and thus the probability was reduced to 0.05 (with the additional exclusion of crossing with identical particles described earlier in this section).
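The notion of non-domination used throughout this comparison can be filtered with a few lines of code. The sketch below treats each particle's objective vector (here, the distances $D_1, D_2$) as quantities to be minimised and extracts the rank-1 front only; the full NSGA-II procedure additionally assigns ranks beyond the first front. Function names are ours.

```python
def dominates(u, v):
    """u dominates v if u is no worse in every objective and strictly
    better in at least one (objectives are distances, to be minimised)."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

def non_dominated(points):
    """Return indices of the non-dominated (rank-1 / Pareto front) points."""
    return [i for i, p in enumerate(points)
            if not any(dominates(q, p) for j, q in enumerate(points) if j != i)]
```

This quadratic-cost filter is adequate for particle populations of the size used here; NSGA-II uses a faster sorting scheme for large populations.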
Figure 7: Heatmaps of the intra-day value of the spread for (Left) the MAP particle from the estimation procedure and (Right) the MMSE particle estimates.
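The two particle summaries used in the figures above, the highest-weight ('MAP') particle and the weighted mean ('MMSE') particle, can be computed from a weighted particle system as follows (a minimal sketch; the helper name is ours):

```python
def particle_summaries(particles, weights):
    """Summaries of a weighted particle system: the highest-weight
    ('MAP') particle and the weighted mean ('MMSE') particle."""
    total = sum(weights)
    w = [wi / total for wi in weights]          # normalise the weights
    i_map = max(range(len(w)), key=lambda i: w[i])
    dim = len(particles[0])
    # weighted mean, computed component-wise over the parameter vector
    mmse = [sum(w[i] * particles[i][k] for i in range(len(w)))
            for k in range(dim)]
    return particles[i_map], mmse
```

The MAP summary picks a single plausible parameter vector, while the MMSE summary averages over the whole posterior approximation; as the figures show, the two can produce visibly different simulated LOB dynamics.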
This chapter has proposed a stochastic agent-based, liquidity supply-and-demand simulation model to characterise the LOB for an asset traded on an electronic exchange. The calibration of this model to real market LOB data has been performed via a posterior inference procedure that adopted an ABC structure, due to the complexity of writing down the resulting likelihood for the LOB agent simulation model. It was then shown how the estimation of the posterior distribution can be performed via an adaptive SMC Sampler-ABC algorithm. The results were tested on real data and compared to an indirect inference procedure with multi-objective optimization features.

References
Mark A Beaumont, Wenyang Zhang, and David J Balding. Approximate Bayesian computation in population genetics. Genetics, 162(4):2025–2035, 2002.

Mark A Beaumont, Jean-Marie Cornuet, Jean-Michel Marin, and Christian P Robert. Adaptive approximate Bayesian computation. Biometrika, 2009.

Michael GB Blum. Approximate Bayesian computation: a nonparametric perspective. Journal of the American Statistical Association, 105(491), 2010.

Nicolas Chopin. Central limit theorem for sequential Monte Carlo methods and its application to Bayesian inference. Annals of Statistics, 32(6):2385–2411, 2004.

Rama Cont and Adrien De Larrard. Price dynamics in a Markovian limit order market. SIAM Journal on Financial Mathematics, 4(1):1–25, 2013.

Rama Cont, Sasha Stoikov, and Rishi Talreja. A stochastic model for order book dynamics. Operations Research, 58(3):549–563, 2010.

Dan Crisan and Arnaud Doucet. A survey of convergence results on particle filtering methods for practitioners. IEEE Transactions on Signal Processing, 50(3):736–746, 2002.

Kalyanmoy Deb, Amrit Pratap, Sameer Agarwal, and TAMT Meyarivan. A fast and elitist multi-objective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2):182–197, 2002.

Pierre Del Moral. Feynman-Kac Formulae. Springer, 2004.

Pierre Del Moral, Arnaud Doucet, and Ajay Jasra. Sequential Monte Carlo samplers. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(3):411–436, 2006.

Pierre Del Moral, Arnaud Doucet, and Ajay Jasra. An adaptive sequential Monte Carlo method for approximate Bayesian computation. Statistics and Computing, 22(5):1009–1020, 2012.

Pierre Del Moral, Gareth W Peters, and Christelle Vergé. An introduction to stochastic particle integration methods: with applications to risk and insurance. In Monte Carlo and Quasi-Monte Carlo Methods 2012, pages 39–81. Springer, 2013.

Arnaud Doucet, Nando De Freitas, and Neil Gordon. An Introduction to Sequential Monte Carlo Methods. Springer, 2001.

Paul Fearnhead and Dennis Prangle. Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 74(3):419–474, 2012.

Daniel Fricke and Thomas Lux. The effects of a financial transaction tax in an artificial financial market. Journal of Economic Interaction and Coordination, pages 1–32, 2013.

Geof H Givens and Adrian E Raftery. Local adaptive importance sampling for multivariate densities with strong nonlinear relationships. Journal of the American Statistical Association, 91(433):132–141, 1996.

Martin D Gould, Mason A Porter, Stacy Williams, Mark McDonald, Daniel J Fenn, and Sam D Howison. Limit order books. Quantitative Finance, 13(11):1709–1742, 2013.

Hans R Künsch. Recursive Monte Carlo filters: algorithms and theoretical analysis. Annals of Statistics, 33(5):1983–2021, 2005.

Hui Li and Qingfu Zhang. Multiobjective optimization problems with complicated Pareto sets, MOEA/D and NSGA-II. IEEE Transactions on Evolutionary Computation, 13(2):284–302, 2009.

Marco LiCalzi and Paolo Pellizzari. Fundamentalists clashing over the book: a study of order-driven stock markets. Quantitative Finance, 3(6):470–480, 2003.

Paul Marjoram, John Molitor, Vincent Plagnol, and Simon Tavaré. Markov chain Monte Carlo without likelihoods. Proceedings of the National Academy of Sciences, 100(26):15324–15328, 2003.

Sergei Maslov. Simple model of a limit order-driven market. Physica A: Statistical Mechanics and its Applications, 278(3):571–578, 2000.

Maureen O'Hara. Market Microstructure Theory, volume 108. Blackwell, Cambridge, MA, 1995.

Efstathios Panayi and Gareth William Peters. Stochastic simulation framework for the Limit Order Book using liquidity motivated agents. Available at SSRN 2551410, 2015.

Gareth W Peters. Topics in sequential Monte Carlo samplers. MSc thesis, University of Cambridge, Department of Engineering, 2005.

Gareth W Peters, Y Fan, and Scott A Sisson. On sequential Monte Carlo, partial rejection control and approximate Bayesian computation. Statistics and Computing, 22(6):1209–1222, 2012.

Oliver Ratmann, Christophe Andrieu, Carsten Wiuf, and Sylvia Richardson. Model criticism based on likelihood-free inference, with an application to protein network evolution. Proceedings of the National Academy of Sciences, 106(26):10576–10581, 2009.

RW Reeves and AN Pettitt. A theoretical framework for approximate Bayesian computation. Proc. 20th Int. Workshop on Statistical Modelling, Sydney, pages 393–396, 2005.

Branko Ristic, Sanjeev Arulampalam, and Neil Gordon. Beyond the Kalman Filter: Particle Filters for Tracking Applications, volume 685. Artech House, Boston, 2004.

Scott A Sisson, Yanan Fan, and Mark M Tanaka. Sequential Monte Carlo without likelihoods. Proceedings of the National Academy of Sciences, 104(6):1760–1765, 2007.

Michael S Smith, Quan Gan, and Robert J Kohn. Modelling dependence using skew t copulas: Bayesian inference and applications. Journal of Applied Econometrics, 27(3):500–522, 2012.

Rodrigo S Targino, Gareth W Peters, and Pavel V Shevchenko. Sequential Monte Carlo samplers for capital allocation under copula-dependent risk models. Insurance: Mathematics and Economics, 61(1):206–226, 2015.

Tina Toni, David Welch, Natalja Strelkowa, Andreas Ipsen, and Michael PH Stumpf. Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems. Journal of the Royal Society Interface, 6(31):187–202, 2009.

Richard David Wilkinson. Approximate Bayesian computation (ABC) gives exact results under the assumption of model error. Statistical Applications in Genetics and Molecular Biology, 12(2):129–141, 2013.

Figure 8: Median of the CDFs for every iteration of the estimation procedure for the parameters of the model. In the figures, parameters 1 to 6 correspond to $\{\mu_{LO,p}, \mu_{LO,d}, \mu_{MO}, \gamma, \nu, \sigma_{MO}\}$.