A level-1 Limit Order book with time dependent arrival rates
Jonathan A. Chávez-Casillas, Robert J. Elliott, Bruno Rémillard, Anatoliy V. Swishchuk
Abstract.
We propose a simple stochastic model for the dynamics of a limit order book, extending the recent work of Cont and de Larrard (2013), where the price dynamics are endogenous, resulting from market transactions. We also show that the conditional diffusion limit of the price process is the so-called Brownian meander.

1. Introduction
In the now classical approach of financial engineering, one assumes a given model for the price of assets, e.g., geometric Brownian motion, and then uses the model to evaluate options or optimize portfolios. In this approach, the notion of bid/ask spread is generally not considered and the value of a portfolio is a linear function of the “price” of the assets. However, in practice, the value of a portfolio is not a linear function of the prices. In addition, also in contrast to the classical approach, the selling value of a portfolio is smaller than the buying value of the same positions. These values are really determined by the so-called limit order book, giving the list of possible bid/ask prices together with the size (number of shares available) at each price.

This limit order book changes rapidly over time, many orders possibly arriving within a millisecond. Either for testing high frequency trading strategies or deciding on an optimal way to buy or sell a large number of shares, it is important to try to model the behavior of limit order books. Several authors suggested interesting models for limit order books. For example, in Smith et al. (2003), the authors assumed that the tick size $\delta$ (least difference between two bid or ask prices) is constant; this implies that prices are multiples of the tick size. They also assumed that the market orders (bid/ask) arrive independently at rate $\mu$ in chunks of $m$ shares; since these orders reduce the number of shares at the best bid or best ask price, they are usually combined with order cancellations. In their model, the limit orders (bid/ask) also arrive independently at rate $\lambda$ in chunks of $m$ shares; the associated price is said to be selected “uniformly” amongst the possible bid prices or ask prices, whatever that means. Basically, they examined some properties of the resulting limit order book, trying to use techniques from physics to characterize some macro quantities of their model.
Date: January 4, 2017. This research is supported by the Montreal Institute of Structured Finance and Derivatives, the Natural Sciences and Engineering Research Council of Canada, the Social Sciences and Humanities Research Council of Canada, and the Australian Research Council.
More recently, Cont and de Larrard (2013) proposed a similar model and found the asymptotic behavior of the price. In fact, the behaviour of the asset price is a consequence of their model for order arrivals. Contrary to Smith et al. (2003), they only consider the level-1 order book, meaning that only the best bid and best ask prices are taken into account. In order to do so, they assumed that the bid/ask spread $\delta$ is constant. As before, market orders for the best bid/ask prices arrive independently at rate $\mu$, in chunks of $m$ shares, and limit orders for the best bid/ask prices arrive independently at rate $\lambda$, also in chunks of $m$ shares. When the size (number of shares) of the best bid price attains 0, the bid price decreases by $\delta$ and so does the ask price; the sizes of the best bid/ask prices are then chosen at random from a distribution $\tilde f$. When the size of the best ask price attains 0, the ask price increases by $\delta$ and so does the bid price; the sizes of the best bid/ask prices are then chosen at random from a distribution $f$. With this simple but tractable model, they were able to determine the asymptotic behavior of the price process, instead of assuming it.

According to some participants in the high frequency trading world, the hypothesis of constant arrival rates of orders is not justified. Therefore, one should assume that the arrival rates are time-dependent. This is the model proposed here. We extend the Cont and de Larrard (2013) setting by assuming that the rates for market orders and limit orders depend on time and that they are also different if they are bid or ask orders. As in Cont and de Larrard (2013), under some simple assumptions, we are also able to find the limiting behavior of the price process, and we show how to estimate the main parameters of the model.
The main ingredients are the random times at which the price changes, the associated counting process, and the distribution of the price changes.

More precisely, in Section 2, we present the construction of the model we consider. Under some simplifying assumptions, we derive in Section 3 the distribution of the random times at which the price changes. The asymptotic distribution of the price process is examined in Section 4, while the estimation of the parameters is discussed in Section 5, together with an example of implementation. The proofs of the main results are given in Appendix B.

2. Description of the model
We discuss a level-1 Limit Order Book model using as a framework the model proposed in Cont and de Larrard (2013). However, the point processes describing the arrivals of Limit orders have time-dependent periodic rates proportional to the rate describing the arrival of Market orders plus Cancellations.

Recalling the Cont-de Larrard model, we define the level-1 Limit Order Book model as follows:
• There is just one level on each side of the order book, i.e., one knows only the best bid and the best ask prices, together with their sizes (number of available shares at these prices).
• The spread is constant and always equals the tick size $\delta$.
• Order volume is assumed to be constant (set as one unit).
• Limit Orders at the bid and ask sides of the book arrive independently according to inhomogeneous Poisson processes $L^b_t$ and $L^a_t$, with intensities $\lambda^b_t$ and $\lambda^a_t$ respectively.
• Market Orders plus Cancellations at the bid and ask sides of the book arrive independently according to inhomogeneous Poisson processes $M^b_t$ and $M^a_t$, with intensities $\mu^b_t$ and $\mu^a_t$ respectively.
• The processes $L^a_t$, $L^b_t$, $M^a_t$ and $M^b_t$ are all independent.
• Every time there is a depletion at the ask side of the book, both the bid and the ask prices increase by one tick, and the sizes of both queues get redrawn from some distribution $f$ on $\mathbb{N}^2$.
• Every time there is a depletion at the bid side of the book, both the bid and the ask prices decrease by one tick, and the sizes of both queues get redrawn from some distribution $\tilde f$ on $\mathbb{N}^2$.

2.1. Construction of the processes.
First, consider the following infinitesimal generators of birth and death processes:

$$(\mathcal{L}^a_t)_{ij} = \begin{cases} 0, & i = 0,\ j \ge 0, \\ \mu^a_t, & 1 \le i,\ j = i-1, \\ \lambda^a_t, & 1 \le i,\ j = i+1, \\ -(\mu^a_t + \lambda^a_t), & 1 \le i,\ j = i, \\ 0, & \text{otherwise}, \end{cases} \tag{1}$$

$$(\mathcal{L}^b_t)_{ij} = \begin{cases} 0, & i = 0,\ j \ge 0, \\ \mu^b_t, & 1 \le i,\ j = i-1, \\ \lambda^b_t, & 1 \le i,\ j = i+1, \\ -(\mu^b_t + \lambda^b_t), & 1 \le i,\ j = i, \\ 0, & \text{otherwise}. \end{cases} \tag{2}$$

Note that 0 is an absorbing state for any Markov chain with generator $\mathcal{L}^a$ or $\mathcal{L}^b$. When a chain reaches the absorbing point 0, one calls it extinction.

To describe precisely the behavior of the price process $S_t$ and the queue sizes process $q_t = (q^b_t, q^a_t)$, one needs to define the following sequence of random times. Let $\sigma^{(b,1)}_x$ and $\sigma^{(a,1)}_y$ be the extinction times of independent Markov chains $X^{(b,1)}$ and $X^{(a,1)}$ with generators $\mathcal{L}^{(b,1)}$ and $\mathcal{L}^{(a,1)}$, starting from $x$ and $y$ respectively, where $\mathcal{L}^{(a,1)}_t = \mathcal{L}^a_t$ and $\mathcal{L}^{(b,1)}_t = \mathcal{L}^b_t$. Further set $\tau_0 = 0$ and $\tau_1 = \min\left(\sigma^{(b,1)}_x, \sigma^{(a,1)}_y\right)$.

Having defined $\tau_1, \ldots, \tau_{n-1}$, set $V_{n-1} = \sum_{k=0}^{n-1} \tau_k$, and let $\sigma^{(b,n)}_{x_{n-1}}$ and $\sigma^{(a,n)}_{y_{n-1}}$ be the extinction times of independent Markov chains $X^{(b,n)}$ and $X^{(a,n)}$ with generators $\mathcal{L}^{(b,n)}$ and $\mathcal{L}^{(a,n)}$, starting respectively from $x_{n-1}$ and $y_{n-1}$, where $\mathcal{L}^{(a,n)}_t = \mathcal{L}^a_{V_{n-1}+t}$ and $\mathcal{L}^{(b,n)}_t = \mathcal{L}^b_{V_{n-1}+t}$, $t \ge 0$; then set $\tau_n = \min\left(\sigma^{(b,n)}_{x_{n-1}}, \sigma^{(a,n)}_{y_{n-1}}\right)$. Here the random variables $(x_k, y_k)$ are $\mathcal{F}_{\tau_k}$-measurable, for any $k \ge 0$. In fact, $(x_0, y_0)$ is chosen at random from distribution $f_0$, while $(x_n, y_n)$ is chosen at random from distribution $f_n$ if $\sigma^{(a,n)}_{y_{n-1}} < \sigma^{(b,n)}_{x_{n-1}}$ and chosen at random from distribution $\tilde f_n$ if $\sigma^{(a,n)}_{y_{n-1}} > \sigma^{(b,n)}_{x_{n-1}}$.

Now for $t \in [V_{n-1}, V_n)$, $q^b_t = X^{(b,n)}_{t - V_{n-1}}$ and $q^a_t = X^{(a,n)}_{t - V_{n-1}}$, starting respectively from $x_{n-1}$ and $y_{n-1}$ at time $V_{n-1}$. Finally, the price process $S$, representing either the price or the log-price, is defined the following way: for $t \in [V_{n-1}, V_n)$, $S_t = S_{V_{n-1}}$, and $S_{V_n} = S_{V_{n-1}} + \delta$ if $\sigma^{(a,n)}_{y_{n-1}} < \sigma^{(b,n)}_{x_{n-1}}$, while $S_{V_n} = S_{V_{n-1}} - \delta$ if $\sigma^{(b,n)}_{x_{n-1}} < \sigma^{(a,n)}_{y_{n-1}}$.

In Cont and de Larrard (2013), the authors assumed that the arrivals were time homogeneous, meaning that $\mathcal{L}^a_t \equiv Q^a$ and $\mathcal{L}^b_t \equiv Q^b$. In fact, most of their results were stated for the case $Q^a = Q^b = Q$, where

$$Q^a_{ij} = \begin{cases} 0 & \text{if } i = 0,\ j \ge 0, \\ \mu_a & \text{if } 1 \le i,\ j = i-1, \\ \lambda_a & \text{if } 1 \le i,\ j = i+1, \\ -(\lambda_a + \mu_a) & \text{if } 1 \le i,\ j = i, \\ 0 & \text{if } |i-j| > 1, \end{cases} \tag{3}$$

$$Q^b_{ij} = \begin{cases} 0 & \text{if } i = 0,\ j \ge 0, \\ \mu_b & \text{if } 1 \le i,\ j = i-1, \\ \lambda_b & \text{if } 1 \le i,\ j = i+1, \\ -(\lambda_b + \mu_b) & \text{if } 1 \le i,\ j = i, \\ 0 & \text{if } |i-j| > 1, \end{cases} \tag{4}$$

and

$$Q_{ij} = \begin{cases} 0 & \text{if } i = 0,\ j \ge 0, \\ \mu & \text{if } 1 \le i,\ j = i-1, \\ \lambda & \text{if } 1 \le i,\ j = i+1, \\ -(\lambda + \mu) & \text{if } 1 \le i,\ j = i, \\ 0 & \text{if } |i-j| > 1. \end{cases} \tag{5}$$

3. Distributional properties
Because of the independence between the ask and the bid side of the book before the first price change, to analyze the distribution of $\tau_1$, it is enough to study one side of the order book, say the ask. In this case, an explicit formula for $P[\sigma^{(a,1)} > t]$ is given in the next section.

3.1. Distribution of the inter-arrival time between price changes.
Let $\mathcal{L}_t$ be the infinitesimal generator of a non-homogeneous birth and death process $X$ given by

$$(\mathcal{L}_t)_{ij} = \begin{cases} 0 & \text{if } i = 0,\ j \ge 0, \\ \mu_t & \text{if } 1 \le i,\ j = i-1, \\ \lambda_t & \text{if } 1 \le i,\ j = i+1, \\ -(\lambda_t + \mu_t) & \text{if } 1 \le i,\ j = i, \\ 0 & \text{if } |i-j| > 1. \end{cases} \tag{6}$$

Notice that 0 is an absorbing state. Also, let $\sigma_X$ be the first hitting time of 0 for this process, i.e.,

$$\sigma_X := \inf\{t > 0 \mid X_t = 0\}. \tag{7}$$

Then, since 0 is an absorbing state, one has $P_x[\sigma_X \le t] = P_x[X_t = 0]$.

It is hopeless to expect to solve the problem for general generators, so as a first approach, some assumptions on $\mathcal{L}^a$ and $\mathcal{L}^b$ will be made.

Assumption 1.
There exists a measurable function $\alpha : \mathbb{R}_+ \to \mathbb{R}_+$ such that $A_t = \int_0^t \alpha_s\,ds < \infty$ for any $t \ge 0$, with $\mathcal{L}^a_t = \alpha_t Q^a$ and $\mathcal{L}^b_t = \alpha_t Q^b$.

Remark 3.1. Under the assumption that $\mathcal{L}_t = \alpha_t Q$, a process $X$ with infinitesimal generator $\mathcal{L}_t$ can be seen as a time change of a process $Y$ with infinitesimal generator $Q$, viz. $X_t = Y_{A_t}$. In particular, if $\sigma_X$ and $\sigma_Y$ are respectively the first hitting times of 0 for $X$ and $Y$, then for any $t \ge 0$,

$$F_{\mathcal{L}}(t; x) := P[\sigma_X \le t \mid X_0 = x] = P[\sigma_Y \le A_t \mid Y_0 = x] =: F_Q(A_t; x). \tag{8}$$
This result is essential in what follows since it implies that the distribution of the time between price changes in the present model is comparable to the distribution of the inter-arrival time between price changes for the model considered by Cont and de Larrard (2013).

The following lemma gives the distribution of the extinction time $\sigma_Y$ of a birth and death process $Y$ with generator $Q$.

Lemma 3.2.
Let $Y$ be a birth and death process with generator $Q$ given by (5). If $\lambda \le \mu$, then $1 - F_Q(t; x) = P_x[\sigma_Y > t] = u_{\lambda,\mu}(t, x)$, where

$$u_{\lambda,\mu}(t, x) = x \left(\frac{\mu}{\lambda}\right)^{x/2} \int_t^\infty \frac{1}{s}\, I_x\!\left(2s\sqrt{\lambda\mu}\right) e^{-s(\lambda+\mu)}\,ds, \tag{9}$$

and where $I_\nu(\cdot)$ is the modified Bessel function of the first kind. If $\lambda > \mu$, then

$$u_{\lambda,\mu}(t, x) = 1 - \left(\frac{\mu}{\lambda}\right)^{x} + x \left(\frac{\mu}{\lambda}\right)^{x/2} \int_t^\infty \frac{1}{s}\, I_x\!\left(2s\sqrt{\lambda\mu}\right) e^{-s(\lambda+\mu)}\,ds. \tag{10}$$

In particular, $P_x[\sigma_Y = +\infty] = 1 - \left(\frac{\mu}{\lambda}\right)^x > 0$.

Remark 3.3. The case $\lambda \le \mu$ is proven in Cont and de Larrard (2013). For the case $\lambda > \mu$, note that

$$E_x\left[e^{-s\sigma_Y}\right] = \left(\frac{\lambda + \mu + s - \sqrt{(\lambda + \mu + s)^2 - 4\lambda\mu}}{2\lambda}\right)^{x},$$

so letting $s \downarrow 0$ gives $P_x(\sigma_Y < \infty) = \left(\frac{\mu}{\lambda}\right)^x$. It then follows that $P_x[\sigma_Y > t \mid \sigma_Y < \infty] = u_{\mu,\lambda}(t, x)$. Then $P_x[\sigma_Y > t] = 1 - \left(\frac{\mu}{\lambda}\right)^x + \left(\frac{\mu}{\lambda}\right)^x u_{\mu,\lambda}(t, x)$. Hence the result.

It is important to analyze the tail behavior of the survival distribution for $\sigma_Y$. The following lemma, whose proof is deferred to Appendix B, establishes such behavior. Recall that $\Gamma(s, x) = \int_x^\infty u^{s-1} e^{-u}\,du$ is the incomplete gamma function.

Lemma 3.4.
Let $Y$ be a birth and death process with generator $Q$ given by (5), and assume that $\lambda \le \mu$. Set $C = (\sqrt{\mu} - \sqrt{\lambda})^2$. Then, for a sufficiently large $T$,

$$P[\sigma_Y > T \mid Y_0 = x] \sim \begin{cases} \left(\dfrac{\mu}{\lambda}\right)^{x/2} \dfrac{x}{\sqrt{\pi\sqrt{\lambda\mu}}} \left[\dfrac{e^{-TC}}{\sqrt{T}} - \sqrt{C}\,\Gamma\!\left(\tfrac{1}{2}, TC\right)\right] & \text{if } \lambda < \mu, \\[2ex] \dfrac{x}{\sqrt{\pi\lambda T}} & \text{if } \lambda = \mu. \end{cases}$$

Consequently, as expected, if $\lambda = \mu$, $E_x[\sigma_Y] = \infty$, whereas if $\lambda < \mu$, $E_x\left[e^{\theta\sigma_Y}\right] < \infty$ for $\theta < C$. In particular, $E\left[\sigma_Y^k\right] < \infty$ for every $k \in \mathbb{N}$.

Remark 3.5. Note that if $\lambda = \mu$, the results in Lemma 3.4 agree with the results obtained in Eq. (6) in Cont and de Larrard (2013). However, if $\lambda < \mu$, Eq. (5) in Cont and de Larrard (2013) says that $P[\sigma_Y > T \mid Y_0 = x] \sim \frac{x(\lambda + \mu)}{2\lambda(\mu - \lambda)} \frac{1}{T}$, which is incorrect, since for a birth and death process with death rate larger than its birth rate, the extinction time $\sigma_Y$ has moments of all orders. An easy way to see this is to use the moment generating function (mgf) computed in Proposition 1 of Cont and de Larrard (2013) and observe that if $\lambda < \mu$, then the mgf is defined on an open interval around 0; see, e.g., (Billingsley, 1995, Section 21).

Lemma 3.2 allows a closed formula to be obtained for the distribution of $\sigma_X$ when the rates are proportional to each other, as in Assumption 1. Such a formula is described in the following proposition, whose proof is deferred to Appendix B.
Proposition 3.6.
Let $X$ be a birth and death process with generator $\mathcal{L}$ satisfying $\mathcal{L}_t = \alpha_t Q$. If $\lambda \le \mu$, then the distribution of $\sigma_X$ is given by

$$P_x[\sigma_X > T] = P_x[\sigma_Y > A_T] = x \left(\frac{\mu}{\lambda}\right)^{x/2} \int_{A_T}^\infty \frac{1}{s}\, I_x\!\left(2s\sqrt{\lambda\mu}\right) e^{-s(\lambda+\mu)}\,ds.$$

Corollary 3.7.
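Under Assumption 1, for $A_t = \int_0^t \alpha_s\,ds$, the distribution of $\tau_1$ is given by

$$\begin{aligned} P_{\mathcal{L}}[\tau_1 > T \mid q_0 = (x, y)] &= P_{\mathcal{L}^b}\left[\sigma^{(b,1)}_x > T\right] P_{\mathcal{L}^a}\left[\sigma^{(a,1)}_y > T\right] \\ &= P_{Q^b}\left[\sigma^{(b,1)}_x > A_T\right] P_{Q^a}\left[\sigma^{(a,1)}_y > A_T\right] \\ &= P_Q[\tau_1 > A_T \mid q_0 = (x, y)]. \end{aligned}$$

Proof.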
The result follows from the fact that $\tau_1 = \sigma^{(a,1)}_y \wedge \sigma^{(b,1)}_x$, Proposition 3.6, and the independence between $\sigma^{(a,1)}_y$ and $\sigma^{(b,1)}_x$. □

Now, we present the asymptotic behavior of the survival distribution function of $\tau_1$ under $\mathcal{L}$. It follows directly from Lemma 3.4 and Corollary 3.7.

Lemma 3.8.
Let $C_a = (\sqrt{\mu_a} - \sqrt{\lambda_a})^2$, $C_b = (\sqrt{\mu_b} - \sqrt{\lambda_b})^2$, and set

$$F_{\mathcal{L}}(t : x, y) = P_{\mathcal{L}}\left[\tau_1 \le t \,\middle|\, q^b_0 = x,\ q^a_0 = y\right], \quad t \ge 0.$$

Assume that $\lambda_a \le \mu_a$ and $\lambda_b \le \mu_b$. Then, as $T \to \infty$, $1 - F_{\mathcal{L}}(T : x, y)$ is asymptotic to

$$\left(\frac{\mu_b}{\lambda_b}\right)^{x/2} \left(\frac{\mu_a}{\lambda_a}\right)^{y/2} \frac{xy}{\pi (\lambda_a \lambda_b \mu_a \mu_b)^{1/4}} \left[\frac{\exp(-A_T C_a)}{\sqrt{A_T}} - \sqrt{C_a}\,\Gamma\!\left(\tfrac{1}{2}, A_T C_a\right)\right] \times \left[\frac{\exp(-A_T C_b)}{\sqrt{A_T}} - \sqrt{C_b}\,\Gamma\!\left(\tfrac{1}{2}, A_T C_b\right)\right].$$

In particular, if $\lambda_a = \mu_a$ and $\lambda_b = \mu_b$, then

$$A_T\, P_{\mathcal{L}}[\tau_1 > T \mid q_0 = (x, y)] \xrightarrow{T \to \infty} \frac{xy}{\pi\sqrt{\lambda_a \lambda_b}}.$$

Remark 3.9. It might happen that either $\lambda_a > \mu_a$ or $\lambda_b > \mu_b$. If both these conditions hold, there is a positive probability that the queues will never deplete, so this case must be excluded. There are basically two cases left. The following result follows directly from the proof of Lemma 3.8; the constant $\sqrt{\pi}$ below is the one given by Lemma 3.4 for a single queue.

(C1) Suppose that $\lambda_b > \mu_b$ and $\lambda_a \le \mu_a$. Then, as $T \to \infty$, $1 - F_{\mathcal{L}}(T : x, y)$ is asymptotic to

$$\left[1 - \left(\frac{\mu_b}{\lambda_b}\right)^{x}\right] \left(\frac{\mu_a}{\lambda_a}\right)^{y/2} \frac{y}{\sqrt{\pi} (\lambda_a \mu_a)^{1/4}} \left[\frac{\exp(-A_T C_a)}{\sqrt{A_T}} - \sqrt{C_a}\,\Gamma\!\left(\tfrac{1}{2}, A_T C_a\right)\right].$$

(C2) Suppose that $\lambda_a > \mu_a$ and $\lambda_b \le \mu_b$. Then, as $T \to \infty$, $1 - F_{\mathcal{L}}(T : x, y)$ is asymptotic to

$$\left[1 - \left(\frac{\mu_a}{\lambda_a}\right)^{y}\right] \left(\frac{\mu_b}{\lambda_b}\right)^{x/2} \frac{x}{\sqrt{\pi} (\lambda_b \mu_b)^{1/4}} \left[\frac{\exp(-A_T C_b)}{\sqrt{A_T}} - \sqrt{C_b}\,\Gamma\!\left(\tfrac{1}{2}, A_T C_b\right)\right].$$

In particular, if $\lambda_a > \mu_a$ and $\lambda_b = \mu_b$, then

$$\sqrt{A_T}\, P_{\mathcal{L}}[\tau_1 > T \mid q_0 = (x, y)] \xrightarrow{T \to \infty} \frac{x}{\sqrt{\pi\lambda_b}} \left[1 - \left(\frac{\mu_a}{\lambda_a}\right)^{y}\right].$$
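The survival probabilities of Lemma 3.2, Proposition 3.6 and Corollary 3.7 can also be checked numerically. The following is a minimal sketch, not the authors' method: instead of evaluating the Bessel integral (9), it approximates $P_x[\sigma_Y > t]$ by uniformization of the birth and death chain on a truncated state space, and then applies the time change $t \mapsto A_t$ of Assumption 1. The function names, the truncation level `n_max`, the tolerance, and the example rates are all assumptions made for illustration.

```python
import math

def survival_q(x, lam, mu, t, n_max=200, tol=1e-12):
    """Approximate P_x[sigma_Y > t] for the generator Q in (5) by uniformization.

    The state space is truncated to {0, ..., n_max} (so x <= n_max is assumed).
    Intended for moderate values of (lam + mu) * t, for which
    exp(-(lam + mu) * t) does not underflow.
    """
    if x <= 0:
        return 0.0
    if t <= 0.0:
        return 1.0
    rate = lam + mu                      # common jump rate of the uniformized chain
    probs = [0.0] * (n_max + 1)          # law of the embedded chain after k jumps
    probs[x] = 1.0
    weight = math.exp(-rate * t)         # Poisson(rate * t) probability of k = 0 jumps
    total = weight                       # Poisson mass accounted for so far
    absorbed = 0.0                       # running approximation of P_x[Y_t = 0]
    k = 0
    while 1.0 - total > tol and k < 100_000:
        nxt = [0.0] * (n_max + 1)
        nxt[0] = probs[0]                # 0 is absorbing
        for i in range(1, n_max + 1):
            if probs[i] == 0.0:
                continue
            nxt[i - 1] += probs[i] * mu / rate
            up = i + 1 if i < n_max else i   # hold the chain at the truncation level
            nxt[up] += probs[i] * lam / rate
        probs = nxt
        k += 1
        weight *= rate * t / k
        total += weight
        absorbed += weight * probs[0]
    return 1.0 - absorbed

def survival_tau(x, y, lam_b, mu_b, lam_a, mu_a, a_t):
    """P[tau_1 > T | q_0 = (x, y)] under Assumption 1, where a_t stands for A_T."""
    return survival_q(x, lam_b, mu_b, a_t) * survival_q(y, lam_a, mu_a, a_t)

# Hypothetical example: alpha_s = 2 for all s, so that A_T = 2 T.
T = 0.5
print(survival_tau(2, 3, 1.0, 2.0, 1.5, 2.5, 2.0 * T))
```

Passing $A_T$ rather than $T$ is the whole content of the time change: any intensity profile $\alpha$ enters only through its integral.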
3.2. Probability of a price increase.
In Cont and de Larrard (2013, Proposition 3), the authors considered an asymmetric order flow, as given here by the processes $Y^a$ and $Y^b$, for computing the probability of a price increase. This was not used elsewhere in their paper. They obtained the following result, which we cite with few changes; however, some typos are corrected here. The proof of the result is given in Van Leeuwaarden et al. (2013).

Proposition 3.10.
Suppose that λ a ≤ µ a and λ b ≤ µ b . Given ( q b , q a ) = ( x, y ) ,the probability p up ( x, y ) that the next price change is an increase is p up ( x, y ) = 1 − π (cid:18) µ a λ a (cid:19) y (cid:18) √ λ a µ a µ a + λ a (cid:19) (cid:90) π H xt sin( yt ) sin( t ) × (cid:40) λ b H t − G t √ λ a µ a µ a + λ a cos( t ) − (cid:41) (cid:40) (cid:112) G t − λ b µ b (cid:41) dt, where Σ = µ a + µ b + λ a + λ b , G t = Σ − √ λ a µ a cos( t ) , and H t = G t − √ G t − λ b µ b λ b . Under Assumption 1, the same result applies for our model since X at = Y aA t and X bt = Y bA t . Remark . One can also use Lemma 3.2 and Proposition 3.6 to obtain theprevious result by integration.4.
Long-run dynamics of the price process
Let $V_n$ be the time of the $n$-th jump in the price, as defined in Section 2.1. We are interested in analyzing the asymptotic behavior of the number of price changes up to time $t$, that is, in describing the counting process

$$N_t := \max\{n \ge 0 \mid V_n \le t\}, \quad t \ge 0. \tag{11}$$

4.1. Asymptotic behavior of the counting process N.

The next proposition, which depends on a new assumption and whose proof is deferred to Appendix B, provides an expression which relates the distribution of the partial sums of the waiting times between price changes for the models with generators $\mathcal{L}$ and $Q$.

Assumption 2.

$$\sum_{(x,y) \in \mathbb{N}^2} \tilde f(x, y)\, P_Q[\tau_1 \le t \mid q^b_0 = x,\ q^a_0 = y] = \sum_{(x,y) \in \mathbb{N}^2} f(x, y)\, P_Q[\tau_1 \le t \mid q^b_0 = x,\ q^a_0 = y] = F_{1,Q}(t).$$

This is true, for example, when (i) $\tilde f(x, y) = f(y, x)$ and $Q^a = Q^b$, or (ii) $\tilde f = f$. Properties (i) and (ii) are used, for example, in Cont and de Larrard (2013).

Proposition 4.1.
Recall that $A_t = \int_0^t \alpha_s\,ds$. Then, under Assumptions 1–2,

$$P_{\mathcal{L}}[V_n \le t \mid q^b_0 = x,\ q^a_0 = y] = P_Q[V_n \le A_t \mid q^b_0 = x,\ q^a_0 = y].$$

Remark 4.2. Under generator $Q$, $\tau_1, \tau_2, \ldots, \tau_n$ are independent and $\tau_2, \ldots, \tau_n$ are i.i.d. The starting point $(x, y)$ must be random with the correct distribution in order that $\tau_1$ has the same law as $\tau_2$.

In order to deal with the counting process $N$, we need another assumption.

Assumption 3.
There exists a positive constant $\upsilon$ such that $A_t / t \to \upsilon$ as $t \to \infty$.
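To illustrate Assumption 3, the sketch below takes a hypothetical periodic intensity profile, computes $A_t$ by a midpoint rule, and shows that $A_t / t$ settles near the average of $\alpha$ over one period. The profile `alpha`, its period of 23400 seconds, and the number of integration steps are assumptions chosen only for the example; they are not estimated from any data.

```python
import math

PERIOD = 23400.0  # hypothetical period, in seconds

def alpha(s):
    """Hypothetical periodic intensity profile with mean 1 over one period."""
    return 1.0 + 0.5 * math.cos(2.0 * math.pi * s / PERIOD)

def cumulative_alpha(t, n_steps=100_000):
    """Midpoint-rule approximation of A_t, the integral of alpha over [0, t]."""
    h = t / n_steps
    return h * sum(alpha((k + 0.5) * h) for k in range(n_steps))

def average_intensity(t):
    """The ratio A_t / t appearing in Assumption 3."""
    return cumulative_alpha(t) / t

# The ratio approaches the mean of alpha over a period (here 1) as t grows.
for n_periods in (1.25, 10.25, 100.25):
    print(n_periods, average_intensity(n_periods * PERIOD))
```

For this particular profile, $A_t = t + \frac{0.5 \cdot \text{PERIOD}}{2\pi} \sin(2\pi t / \text{PERIOD})$, so the deviation of $A_t / t$ from 1 decays like $1/t$.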
Remark . Assumption 3 is true for example if α is periodic. Such an assumptionmakes sense. One can easily imagine that α repeats itself everyday. Of course, itmust be validated empirically. One can also suppose that α is random but inde-pendent of the other processes. In this case, α would act as a random environmentand if we assume that α is stationary and ergodic, then Assumption 3 holds almostsurely. However, in this case, all computations are conditional on the environment.In order to obtain the asymptotic behavior of the prices, there are two cases tobe taken into account: C a + C b > C a + C b = 0.4.1.1. Case C a + C b > . First, assume that(12) γ = (cid:88) ( x,y ) ∈ N xy (cid:18) µ b λ b (cid:19) x/ (cid:18) µ a λ a (cid:19) y/ f ( x, y ) < ∞ . Now, from Abramowitz and Stegun (1972, p. 376), I n ( z ) = π (cid:82) π e z cos θ cos( nθ ) dθ ,so for any x ∈ N , I n ( z ) ≤ e z . In this case, it follows from Lemma 3.2 and Lemma3.4 that E Q ( τ ) = (cid:88) ( x,y ) ∈ N xy (cid:18) µ b λ b (cid:19) x/ (cid:18) µ a λ a (cid:19) y/ f ( x, y ) (cid:90) ∞ (cid:90) ∞ t ∧ sg x,b ( t ) g y,a ( s ) dtds ≤ γ max( C a , C b ) < ∞ , where g y,a ( s ) = s I y (cid:0) s √ λ a µ a (cid:1) e − s ( λ a + µ a ) and g x,b ( s ) = s I x (cid:16) s (cid:112) λ b µ b (cid:17) e − s ( λ b + µ b ) .Then, under Assumptions 1–2 and under model Q , V n /n → E Q ( τ ) < ∞ a.s. Us-ing Assumption 3 and Lemma 3.8, one then finds that under model L , V n /n con-verges in probability to c = E Q ( τ ) /υ . Finally, using Propositions A.1–A.2, onefinds that under L , N t /t converges in probability to c = υ/ E Q ( τ ). In addition, N (cid:98) nt (cid:99) − nt/c √ n (cid:32) c / W ( t ), where W is a Brownian motion. This follows from theconvergence of V n , under Q , to a Brownian motian. It also holds under L , usingAssumption 3.4.1.2. Case C a + C b = 0 . Assume that(13) γ = (cid:88) ( x,y ) ∈ N xyf ( x, y ) < ∞ . 
Then it follows from Lemma 3.8 and Proposition A.4 that T P L [ τ > T ] T →∞ → c = γ υπ √ λ a λ b . As a result, using Propositions A.1–A.2 with f ( n ) = n log n , one finds that under L , N t / ( t/ log t ) converges in probability to c = υπ √ λ a λ b γ . In particular, if a n = n log n ,then N a n t /n converges in probability to tc . Also, V (cid:98) nt (cid:99) n − c t log n (cid:32) υ V t , where V is a stable process of index 1. It then follows that N (cid:98) n log nt (cid:99) − nt/c n/ log n (cid:32) − c υ V t . Notethat V is the weak limit of V n n − c υ log n under Q , and V = ˜ V + d , where d isthe limit of nb n − c υ log n , where b n = E Q { sin( τ /n ) } . Next, it follows from Feller(1971) that the characteristic function of ˜ V is e ψ ( ζ ) , where ψ ( ζ ) = −| ζ | c υ (cid:110) π i sgn( ζ ) log | ζ | (cid:111) . LEVEL-1 LIMIT ORDER BOOK WITH TIME DEPENDENT ARRIVAL RATES 9
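The law of large numbers for the counting process $N$ in the case $C_a + C_b > 0$ can be checked by simulation. The sketch below is only an illustration under simplifying assumptions that are not part of the model: $\alpha \equiv 1$ (so $\upsilon = 1$ and the generators are time homogeneous), $Q^a = Q^b$, and both queue sizes are reset to the same fixed values after every price change, i.e., $f = \tilde f$ is a point mass. All names and numerical values are hypothetical.

```python
import random

def extinction_time(start, lam, mu, rng):
    """Simulate the hitting time of 0 for the birth and death chain with generator Q."""
    t, state = 0.0, start
    while state > 0:
        t += rng.expovariate(lam + mu)   # holding time in any state >= 1
        # move up with probability lam / (lam + mu), down otherwise
        state += 1 if rng.random() * (lam + mu) < lam else -1
    return t

def sample_tau(x, y, lam, mu, rng):
    """Time to the next price change: first depletion of the bid or the ask queue."""
    return min(extinction_time(x, lam, mu, rng), extinction_time(y, lam, mu, rng))

def count_price_changes(horizon, x, y, lam, mu, rng):
    """N_t at t = horizon when both queues restart at (x, y) after each price change."""
    clock, n = 0.0, 0
    while True:
        clock += sample_tau(x, y, lam, mu, rng)
        if clock > horizon:
            return n
        n += 1

rng = random.Random(12345)
lam, mu, x, y = 1.0, 2.0, 2, 2           # hypothetical rates (lam < mu) and queue sizes
n_samples = 20_000
mean_tau = sum(sample_tau(x, y, lam, mu, rng) for _ in range(n_samples)) / n_samples
horizon = 20_000.0
rate = count_price_changes(horizon, x, y, lam, mu, rng) / horizon
print("N_t / t:", rate, " 1 / mean(tau):", 1.0 / mean_tau)
```

With $\lambda < \mu$ each simulated queue depletes almost surely, so the loops terminate; for $\lambda \ge \mu$ this naive simulation may run for a very long time. For a constant $\alpha \equiv \upsilon$, the count under $\mathcal{L}$ at time $t$ is the count under $Q$ at time $\upsilon t$, so the rate is simply multiplied by $\upsilon$.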
4.2. Asymptotic behavior of the price process.
Under no other additionalhypothesis on f and ˜ f than Assumption 2, the sequence ( ξ i ) of price changes is anergodic Markov chain with transition matrix Π; the sequence is also independentfrom N t . Note that P ( ξ = δ | ξ = δ ) = (cid:80) ( i,j ) ∈ N f ( i, j ) P up ( i, j ) and P ( ξ = δ | ξ = − δ ) = (cid:80) i,j ˜ f ( i, j ) P up ( i, j ), so the associated transition matrix Π is given byΠ = (cid:20) P ( ξ = − δ | ξ = − δ ) P ( ξ = δ | ξ = − δ ) P ( ξ = − δ | ξ = δ ) P ( ξ = δ | ξ = δ ) (cid:21) , with stationary distribution ( ν, − ν ) satisfying ν = P ( ξ = − δ ) = P ( ξ = − δ | ξ = δ ) P ( ξ = − δ | ξ = δ ) + P ( ξ = δ | ξ = − δ ) . If (cid:98) c (cid:99) stands for the largest integer smaller of equal to c , then the sequence W n ( t ) = √ n (cid:80) (cid:98) nt (cid:99) i =1 { ξ i − E ( ξ i ) } converges in law to σ W ( t ), where W is a Brownian motion,and the variance σ is given by(14) σ = 4 δ (cid:34) ν (1 − ν ) + ν ∞ (cid:88) k =1 (cid:8) (Π k ) − ν (cid:9) − (1 − ν ) ∞ (cid:88) k =1 (cid:8) (Π k ) − ν (cid:9)(cid:35) , with (Π k ) ij being the element ( i, j ) of Π k . Remark . If ˜ f = f , then the variables ξ j , j ≥
1, are i.i.d. In fact, P ( ξ = δ | ξ = δ ) = (cid:88) ( i,j ) ∈ N f ( i, j ) P up ( i, j )and P ( ξ = δ | ξ = − δ ) = (cid:88) i,j ˜ f ( i, j ) P up ( i, j ) = (cid:88) i,j f ( i, j ) P up ( i, j )= P ( ξ = δ | ξ = δ ) . Note also that the variables ξ j , j ≥
1, are independent from τ , . . . , τ n . However,unless Q a = Q b and f is symmetric, one cannot conclude that P ( ξ i = δ ) = 1 / S can be expressed as S t = S + N t (cid:88) i =1 ξ i , t ≥ . To state the final results, set a n = n log n or n , according as C a + C b = 0 or not.Then, using the results of Section 4.1, N a n t /n converges in probability to t/c , where c = c or c = c according as C a + C b = 0 or not. It is then easy to show that n − / (cid:80) N ant i =1 { ξ i − E ( ξ ) } (cid:32) σ √ c ˜ W , where ˜ W is a Brownian motion. In fact, for any t ≥
0, ˜ W t = √ c W t/c . Next,(15) S a n t − nt/c E ( ξ ) = N ant (cid:88) i =1 { ξ i − E ( ξ ) } + E ( ξ )( N a n t − nt/c ) . This expression shows that there are really two sources of randomness involved inthe asymptotic behavior of S a n t − nt/c E ( ξ ). As before, one must consider the cases C a + C b > C a + C b = 0. C a + C b > . In this case, setting W n ( t ) = { S nt − nt/c E ( ξ ) } / √ n , then W n (cid:32) ˜ σW , where W is a Brownian motion and(16) ˜ σ = (cid:20) σ c + { E ( ξ ) } c (cid:21) / . In fact, ˜ σW t = σ √ c ˜ W t + E ( ξ ) c / W t , where ˜ W and W are the two independent Brow-nian motions appearing respectively in the asymptotic behaviour of the Markovchain and the counting process. Note that the volatility ˜ σ could be estimated bytaking the standard deviation of the price increments every 10 minutes, as proposedin Cont and de Larrard (2013); see also Swishchuk et al. (2016). More generally,if ∆ is the time in seconds between successive prices and s ∆ is the correspondingstandard deviation of the price increments over interval of size ∆, then ˆ˜ σ = s ∆ / √ ∆.4.2.2. C a + C b = 0 . In this case, if E ( ξ ) = 0, then using (15), one obtains that S n log nt / √ n (cid:32) √ c W t , where W is the Brownian motion resulting from the con-vergence of the Markov chain.However, if E ( ξ ) (cid:54) = 0, then ( S nt − nt/c E ( ξ )) / ( n/ log n ) (cid:32) − E ( ξ ) c υ V t , where V is the stable process defined in Section 4.1.2. Remark . Note that in Cont and de Larrard (2013), E ( ξ ) = 0, so the limitingprocess is a Brownian motion whether C a + C b = 0 or C a + C b > Conditioned limit of the price process.
If one thinks about it, what one wants to achieve in rescaling the price process $S$ is to replace a discontinuous process by a more amenable process if possible, over a given time interval. However, on this time interval, the price is known to be positive, so the limiting distribution should also be positive.

If the unconditioned limit is a Brownian motion, then the conditioned limit, i.e., conditioning on the fact that the Brownian motion is positive, is called a Brownian meander (Durrett et al., 1977, Revuz and Yor, 1999). If the unconditioned limit is a stable process, then the conditioned limit could be called a stable meander. See, e.g., Caravenna and Chaumont (2008) for more details. Note that according to Durrett et al. (1977), a Brownian meander $W^+_t$ over $(0, 1)$ has conditional density

$$P(W^+_t \in dy \mid W^+_s = x) = \{\phi_{t-s}(y - x) - \phi_{t-s}(y + x)\} \left\{\frac{\Phi_{1-t}(y) - 1/2}{\Phi_{1-s}(x) - 1/2}\right\} dy, \quad 0 < s < t < 1,\ x, y > 0,$$

where $\Phi_t$ is the distribution function of a centered Gaussian variable with variance $t$ and associated density $\phi_t$. It then follows that the infinitesimal generator $H_t$ of $W^+_t$ is given by

$$H_t f(x) = f'(x) \left\{\frac{\phi_{1-t}(x)}{\Phi_{1-t}(x) - 1/2}\right\} + \frac{f''(x)}{2}, \quad x > 0.$$

5. Estimation of parameters
In order to have identifiable parameters, one has to answer the following question about $\alpha$: what happens if $\alpha$ is multiplied by a positive factor $h$? Then, the value $v$ in Assumption 3 is multiplied by $h$. Thus the parameters $\lambda_a$, $\lambda_b$, $\mu_a$, and $\mu_b$ are all divided by $h$, since, for example, $\lambda^a_t = \lambda_a \alpha_t$. As a result, $E_Q(\tau)$ is then multiplied by $h$ and so is $\gamma$. It then follows that $c_1$ and $c_2$ are invariant under any scaling. So, one could normalize $\alpha$ so that $v = 1$. This is what we will assume from now on. The estimation of the parameters will then be easier.

Next, one of the assumptions of the model is that the size of the orders is constant, which is not the case in practice. So in view of applications, and depending on the statistics of sizes for level-1 orders, if the chosen size is 100, say, then an order of size 324 would count for 3.24 orders.

Assume that data are collected over a period of $n$ days. Recall that time 0 corresponds to the opening of the market at 9:30:00 ET. Let $\Lambda^{b,i}_t$ and $\Lambda^{a,i}_t$ be the number of limit orders for bid and ask respectively up to time $t$ (measured in seconds) for day $i$. Further let $t_d$ be the number of seconds considered in a day. Typically, $t_d = 23400$. Finally, let $M^{b,i}_t$ and $M^{a,i}_t$ be the number of market orders and cancellations for bid and ask respectively up to time $t$ (measured in seconds) for day $i$. For any $i \ge 1$, set $v_i = \{A_{i t_d} - A_{(i-1) t_d}\} / t_d$, and set $\hat v = \bar v = \frac{1}{n} \sum_{i=1}^n v_i$. Then for any $i \ge 1$, one should have approximately

$$\hat\mu_a v_i = M^{a,i}_{t_d} / t_d, \quad \hat\mu_b v_i = M^{b,i}_{t_d} / t_d, \quad \hat\lambda_a v_i = \Lambda^{a,i}_{t_d} / t_d, \quad \hat\lambda_b v_i = \Lambda^{b,i}_{t_d} / t_d.$$

Having assumed that $v = 1$, one can set

$$\hat\mu_a = \frac{1}{n t_d} \sum_{i=1}^n M^{a,i}_{t_d}, \quad \hat\mu_b = \frac{1}{n t_d} \sum_{i=1}^n M^{b,i}_{t_d}, \quad \hat\lambda_a = \frac{1}{n t_d} \sum_{i=1}^n \Lambda^{a,i}_{t_d}, \quad \hat\lambda_b = \frac{1}{n t_d} \sum_{i=1}^n \Lambda^{b,i}_{t_d}.$$

Finally, note that the transition matrix $\Pi$ can be estimated directly from the data, as is $1/c$ from $N_t / t$.

5.1. Example of implementation.
For this example, we use the Facebook data provided in Cartea et al. (2015), from November 3rd, 2014 to November 7th, 2014. First, the results for the spread are given in Table 1, from which we can see that most of the time, the spread $\delta$ is one cent.

Table 1.
Spread distribution in cents for Facebook, from November 3rd, 2014 to November 7th, 2014.

Spread   Day 1   Day 2   Day 3   Day 4   Day 5   Ave.
1        91.6%   91.8%   89.7%   88.4%   93.6%   91.0%
2         7.6%    8.0%   10.1%   11.1%    5.9%    8.5%
> 2      ...     ...     ...     ...     ...     ...

Here $\hat\lambda_a < \hat\mu_a$ and $\hat\lambda_b < \hat\mu_b$, so with these data, we are in the case where $C_a + C_b > 0$, meaning that the unconditioned limiting price process is a Brownian motion with volatility satisfying (16).
Remark 5.1. According to Figure 5, on November 3rd, the ratio $\Lambda^{a,1}_{t_d} / M^{a,1}_{t_d}$ is bigger than one, while the ratio $\Lambda^{b,1}_{t_d} / M^{b,1}_{t_d}$ is smaller than one, meaning that most of the time, the bid queue will be depleted before the ask queue, so the price has a negative trend throughout that day. This is well illustrated in Figure 7, where it is seen that the price indeed goes down on that day.

Table 2.
Values of $M^{b,i}_{t_d} / t_d$, $M^{a,i}_{t_d} / t_d$, $\Lambda^{b,i}_{t_d} / t_d$, and $\Lambda^{a,i}_{t_d} / t_d$.
[Columns: Day; $\Lambda^{b,i}_{t_d} / t_d$; $\Lambda^{a,i}_{t_d} / t_d$; $M^{b,i}_{t_d} / t_d$; $M^{a,i}_{t_d} / t_d$. Final row: $\hat\lambda_b$, $\hat\lambda_a$, $\hat\mu_b$, $\hat\mu_a$.]

There are basically two ways of estimating $\tilde\sigma$. One can use the standard deviation of high-frequency data, as exemplified in Table 3, or one could use the analytic expression, as proposed in Swishchuk and Vadori (2017), Swishchuk et al. (2016). To estimate $\tilde\sigma$ analytically, one needs the estimation of the transition matrix $\Pi$.

Table 3.
Estimation of $\tilde{\sigma} = s_\Delta/\sqrt{\Delta}$ using high-frequency standard deviations.

                    Δ
Day       10-minute   5-minute   1-minute
1         0.0040      0.0052     0.0057
2         0.0079      0.0073     0.0075
3         0.0069      0.0070     0.0082
4         0.0071      0.0062     0.0059
5         0.0038      0.0040     0.0051
pooled    0.0062      0.0060     0.0066

With the data set, we get $\hat{\Pi}$ = [ . . . . ]. It then follows that $\hat{\nu}$ = 0. E(ξ) = 0. σ = 0. /$\hat{c}$ = 0. σ = 0.

Figure 1. Graphs of $M^{a,i}_t/t$ and $M^{b,i}_t/t$ for each of the five days (horizontal axis: milliseconds; curves: Day 1 to Day 5 and their average).

Figure 2. Graphs of $M^{a}_t/t$ and $M^{b}_t/t$ for five days (horizontal axis: seconds for 5 days).

Figure 3. Graphs of $\Lambda^{a,i}_t/t$ and $\Lambda^{b,i}_t/t$ for each of the five days (horizontal axis: milliseconds; curves: Day 1 to Day 5 and their average).

Figure 4. Graphs of $\Lambda^{a}_t/t$ and $\Lambda^{b}_t/t$ for five days (horizontal axis: seconds for 5 days).

Figure 5. Graphs of $\Lambda^{a,i}_t/M^{a,i}_t$ and $\Lambda^{b,i}_t/M^{b,i}_t$ for each of the five days (horizontal axis: milliseconds; curves: Day 1 to Day 5 and their average).

Figure 6. Graphs of $\Lambda^{a}_t/M^{a}_t$ and $\Lambda^{b}_t/M^{b}_t$ for five days (horizontal axis: seconds for 5 days).

Figure 7. Graph of the midprice for November 3rd, 2014 (horizontal axis: seconds; vertical axis: price in dollars).
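The first of the two approaches to estimating $\tilde{\sigma}$ described above, $\tilde{\sigma} = s_\Delta/\sqrt{\Delta}$, can be sketched in a few lines. This is only an illustration under stated assumptions: the midprice is assumed to be observed on an equally spaced grid, $\Delta$ is assumed to be measured in seconds, and the function names `estimate_sigma_tilde` and `simulate_midprice` are hypothetical helpers, not part of the paper. The simulated random walk merely stands in for real data.

```python
import math
import random
import statistics


def estimate_sigma_tilde(prices, obs_step, delta):
    """Return s_Delta / sqrt(Delta) for equally spaced price observations.

    prices   : midprice observations, one every `obs_step` seconds (assumption)
    obs_step : spacing of the observations, in seconds
    delta    : sampling interval Delta, in seconds; a multiple of obs_step
    """
    k = round(delta / obs_step)
    if k < 1 or abs(k * obs_step - delta) > 1e-9 * delta:
        raise ValueError("delta must be a positive multiple of obs_step")
    sampled = prices[::k]                      # keep one price every Delta
    increments = [b - a for a, b in zip(sampled, sampled[1:])]
    if len(increments) < 2:
        raise ValueError("need at least two increments")
    s_delta = statistics.stdev(increments)     # sample std of the Delta-increments
    return s_delta / math.sqrt(delta)


def simulate_midprice(n, sigma, obs_step, rng, start=100.0):
    """Hypothetical test data: a driftless Gaussian random walk with n steps."""
    path = [start]
    scale = sigma * math.sqrt(obs_step)
    for _ in range(n):
        path.append(path[-1] + scale * rng.gauss(0.0, 1.0))
    return path


if __name__ == "__main__":
    rng = random.Random(1)
    path = simulate_midprice(23_400, 0.006, 1.0, rng)
    for delta in (600, 300, 60):               # 10-, 5- and 1-minute sampling
        print(delta, estimate_sigma_tilde(path, 1.0, delta))
```

For a pure random walk the three sampling intervals should give similar values; systematic differences across $\Delta$ in real data, as in Table 3, would reflect microstructure effects that this toy input does not contain.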
References
Abramowitz, M. and Stegun, I. A. (1972). Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, volume 55 of Applied Mathematics Series. National Bureau of Standards, tenth edition.

Billingsley, P. (1995). Probability and Measure. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons Inc., New York, third edition. A Wiley-Interscience Publication.

Caravenna, F. and Chaumont, L. (2008). Invariance principles for random walks conditioned to stay positive. Ann. Inst. Henri Poincaré Probab. Stat., 44(1):170–190.

Cartea, Á., Jaimungal, S., and Penalva, J. (2015). Algorithmic and High-Frequency Trading. Cambridge University Press.

Cont, R. and de Larrard, A. (2013). Price dynamics in a Markovian limit order market. SIAM J. Financial Math., 4(1):1–25.

Durrett, R. (1996). Probability: Theory and Examples. Duxbury Press, Belmont, CA, second edition.

Durrett, R. T., Iglehart, D. L., and Miller, D. R. (1977). Weak convergence to Brownian meander and Brownian excursion. The Annals of Probability, 5(1):117–129.

Feller, W. (1971). An Introduction to Probability Theory and its Applications, volume II of Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons, second edition.

Olver, F. W., Lozier, D. W., Boisvert, R. F., and Clark, C. W. (2010). NIST Handbook of Mathematical Functions. Cambridge University Press, New York, NY.

Revuz, D. and Yor, M. (1999). Continuous Martingales and Brownian Motion, volume 293 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, third edition.

Smith, E., Farmer, J. D., Gillemot, L., and Krishnamurthy, S. (2003). Statistical theory of the continuous double auction. Quantitative Finance, 3(6):481–514.

Swishchuk, A., Cera, K., Schmidt, J., and Hofmeister, T. (2016). General semi-Markov model for limit order books: theory, implementation and numerics. arXiv preprint arXiv:1608.05060.

Swishchuk, A. V. and Vadori, N. (2017). A semi-Markovian modeling of limit order markets. SIAM Journal on Financial Mathematics. (in press).

Van Leeuwaarden, J. S., Raschel, K., et al. (2013). Random walks reaching against all odds the other side of the quarter plane. Journal of Applied Probability, 50(1):85–102.
Appendix A. Auxiliary results
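As a numerical companion to the results of this appendix, the normalization $V_n/(n\log n)$ appearing in Proposition A.1 can be explored by simulation. The sketch below is an assumption-laden illustration, not part of the proofs: it uses the specific law $X = c/U$ with $U$ uniform on $(0,1]$, for which $xP(X > x) = c$ exactly when $x \ge c$, and the helper names are hypothetical. Because the limit is reached only at a logarithmic rate and the summands are heavy-tailed, the printed ratios for moderate $n$ should be expected to sit visibly above $c$ and to fluctuate from run to run.

```python
import math
import random


def sample_heavy_tail(c, rng):
    """Draw X = c / U with U uniform on (0, 1], so P(X > x) = c / x for x >= c."""
    u = 1.0 - rng.random()          # rng.random() lies in [0, 1), so u lies in (0, 1]
    return c / u


def normalized_sum(n, c, rng):
    """Return V_n / (n log n), where V_n = X_1 + ... + X_n."""
    v_n = sum(sample_heavy_tail(c, rng) for _ in range(n))
    return v_n / (n * math.log(n))


def empirical_tail(c, x, n, rng):
    """Estimate x * P(X > x) from n draws; close to c whenever x >= c."""
    hits = sum(1 for _ in range(n) if sample_heavy_tail(c, rng) > x)
    return x * hits / n


if __name__ == "__main__":
    rng = random.Random(0)
    c = 2.0
    print("x P(X > x) at x = 10c:", empirical_tail(c, 10 * c, 100_000, rng))
    for n in (10**3, 10**4, 10**5):
        print(n, normalized_sum(n, c, rng))
```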
Proposition A.1. Suppose that $V_n = X_1 + \cdots + X_n$, where the variables $X_i$ are i.i.d. with $xP(X_i > x) \to c \in (0,\infty)$ as $x \to \infty$. Then $\dfrac{V_n}{n\log n} \xrightarrow{Pr} c$, as $n \to \infty$.

Proof. First, for any $s > 0$ and $T > 0$,
\[
s\int_T^\infty \frac{e^{-sx}}{x}\,dx = s\int_{sT}^\infty \frac{e^{-y}}{y}\,dy = -s\log(Ts)\,e^{-Ts} + s\int_{Ts}^\infty \log(y)\,e^{-y}\,dy,
\]
so as $s \to 0$, $s\int_T^\infty \frac{e^{-sx}}{x}\,dx \sim -s\log s$. Next, for any non-negative random variable $X$ and any $s \ge 0$,
\[
E\left[e^{-sX}\right] = 1 - s\int_0^\infty P(X > x)\,e^{-sx}\,dx.
\]
As a result, if $P(X > x) \sim c/x$ as $x \to \infty$, then, as $s \to 0$,
\[
E\left[e^{-sX}\right] = 1 + cs\log s + o(s\log s).
\]
Therefore, setting $a_n = n\log n$, one obtains, for a fixed $s > 0$,
\[
E\left[e^{-sV_n/a_n}\right] = \left[E\left[e^{-sX_1/a_n}\right]\right]^n = \left\{1 - \frac{sc}{a_n}\log\left(\frac{a_n}{s}\right) + o\left(\frac{\log(a_n)}{a_n}\right)\right\}^n \xrightarrow{n\to\infty} e^{-cs},
\]
since $\dfrac{ns}{a_n}\log\left(\dfrac{a_n}{s}\right) \to s$ as $n \to \infty$. Hence, $V_n/a_n \xrightarrow{Pr} c$, as $n \to \infty$. □

Proposition A.2.
Suppose that $V_n/f(n) \xrightarrow{Pr} c$, as $n \to \infty$, where $f(n) \to \infty$ is regularly varying of order $\alpha$. Define $N_t = \max\{n \ge 0 : V_n \le t\}$ and suppose that for some function $g$ on $(0,\infty)$, $f \circ g(t) \sim g \circ f(t) \sim t$, as $t \to \infty$. Then $N_t/g(t) \xrightarrow{Pr} c^{-1/\alpha}$.

Proof. The proof is similar to the proof of the renewal theorem in Durrett (1996, Theorem 7.3). By definition, $V_{N_t} \le t < V_{N_t+1}$. As a result,
\[
\frac{V_{N_t}}{f(N_t)} \le \frac{t}{f(N_t)} < \frac{V_{N_t+1}}{f(N_t+1)}\,\frac{f(N_t+1)}{f(N_t)}.
\]
By hypothesis, $V_n/f(n)$ converges in probability to $c \in (0,\infty)$, as $n \to \infty$. Also, since $V_n$ is finite for any $n \in \mathbb{N}$, it follows that $N_t$ converges in probability to $+\infty$ as $t \to \infty$. Next, since $f(n+1)/f(n) \to 1$ as $n \to \infty$, it follows that as $t \to \infty$, $t/f(N_t)$ converges in probability to $c$, that is, $f(N_t) \sim t/c$. Also, $g$ is regularly varying of order $1/\alpha$ and $g \circ f(n) \sim n$, so $N_t \sim g(f(N_t)) \sim g(t/c) \sim c^{-1/\alpha} g(t)$, and one may conclude that $N_t/g(t) \xrightarrow{Pr} c^{-1/\alpha}$. □

Remark A.3. If $f(t) = t\log t$, then $\alpha = 1$ and one can take $g(t) = t/\log t$.

Proposition A.4.
Set $\psi_\lambda(t,x) = \int_t^\infty \frac{1}{u}\, I_x(2u\lambda)\, e^{-2u\lambda}\,du$, for any $t, x, \lambda > 0$. Then there exists a constant $C$ so that for any $x, \lambda > 0$, and any $t \ge \frac{1}{2\lambda}$, $\psi_\lambda(t,x) \le \dfrac{C}{\sqrt{\lambda t}}$.

Proof. First, note that $\psi_\lambda(t,x) = \psi_{1/2}(2\lambda t, x)$. It is well-known that
\[
I_x(z) = \frac{1}{\pi}\int_0^\pi e^{z\cos\theta}\cos(x\theta)\,d\theta \le \frac{1}{\pi}\int_0^\pi e^{z\cos\theta}\,d\theta \le \frac{1}{2} + \frac{1}{\pi}\int_0^1 \frac{e^{zs}}{\sqrt{1-s^2}}\,ds.
\]
Next, set $E_1(u) := \int_u^\infty \frac{e^{-w}}{w}\,dw$, $u > 0$. Then
\[
\begin{aligned}
\psi_{1/2}(t,x) &\le \int_t^\infty \frac{e^{-u}}{u}\left\{\frac{1}{2} + \frac{1}{\pi}\int_0^1 \frac{e^{us}}{\sqrt{1-s^2}}\,ds\right\}du \\
&= \frac{1}{2}E_1(t) + \frac{1}{\pi}\int_t^\infty\!\!\int_0^1 \frac{e^{-su}}{u\sqrt{s(2-s)}}\,ds\,du \\
&= \frac{1}{2}E_1(t) + \frac{1}{\pi}\int_0^1 \frac{E_1(st)}{\sqrt{s(2-s)}}\,ds \\
&= \frac{1}{2}E_1(t) + \frac{1}{\pi}\int_0^t \frac{E_1(s)}{\sqrt{s(2t-s)}}\,ds.
\end{aligned}
\]
According to Olver et al. (2010, Section 6.8.1), $E_1(u) \le e^{-u}\ln(1 + 1/u)$ for any u >
0. Furthermore, ln(1 + x ) ≤ x and ln(1 + x ) ≤ x / for any x ≥
0. As a result, ψ / ( t, x ) ≤ e − t t + t − / π (cid:90) t s − / e − s ds ≤ e − t t + Γ( ) πt / ≤ Ct − / for any t ≥
1, where C = e − + Γ( ) π . □

Appendix B. Proofs
Proof of Lemma 3.4.
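From Olver et al. (2010, Formula 10.30.4), for fixed $\nu$, $I_\nu(z) \sim \dfrac{e^z}{\sqrt{2\pi z}}$ as $z \to \infty$. Also, from Abramowitz and Stegun (1972, p. 376), $I_n(z) = \frac{1}{\pi}\int_0^\pi e^{z\cos\theta}\cos(n\theta)\,d\theta$, so for any $x \in \mathbb{N}$, $I_x(z) \le e^z$. Thus, as $T \to \infty$,
\[
\begin{aligned}
P_x[\sigma_Y > T] &= \left(\frac{\mu}{\lambda}\right)^{x/2}\int_T^\infty \frac{x}{s}\, I_x\!\left(2s\sqrt{\lambda\mu}\right) e^{-s(\lambda+\mu)}\,ds \\
&\sim \left(\frac{\mu}{\lambda}\right)^{x/2}\int_T^\infty \frac{x}{s}\,\frac{e^{2s\sqrt{\lambda\mu}}}{\sqrt{4s\pi\sqrt{\lambda\mu}}}\, e^{-s(\lambda+\mu)}\,ds \\
&\sim \left(\frac{\mu}{\lambda}\right)^{x/2}\frac{x}{2\sqrt{\pi\sqrt{\lambda\mu}}}\int_T^\infty s^{-3/2} e^{-s\mathcal{C}}\,ds,
\end{aligned}
\]
where $\mathcal{C} = \lambda + \mu - 2\sqrt{\lambda\mu} = (\sqrt{\mu} - \sqrt{\lambda})^2$. Also, for any $x \in \mathbb{N}$,
\[
P_x[\sigma_Y > T] \le x\left(\frac{\mu}{\lambda}\right)^{x/2}\int_T^\infty s^{-1} e^{-s\mathcal{C}}\,ds. \tag{17}
\]
Consequently, if $\lambda = \mu$, then $\mathcal{C} = 0$ and
\[
P_x[\sigma_Y > T] \sim \frac{x}{2\sqrt{\lambda\pi}}\int_T^\infty s^{-3/2}\,ds \sim \frac{x}{2\sqrt{\lambda\pi}}\,\frac{2}{\sqrt{T}} \sim \frac{x}{\sqrt{\lambda\pi T}}.
\]
This agrees with the result proved in Cont and de Larrard (2013). However, if $\lambda < \mu$, using the change of variable $u = s\mathcal{C}$, one gets
\[
P_x[\sigma_Y > T] \sim \mathcal{C}^{1/2}\left(\frac{\mu}{\lambda}\right)^{x/2}\frac{x}{2\sqrt{\pi\sqrt{\lambda\mu}}}\int_{T\mathcal{C}}^\infty u^{-3/2} e^{-u}\,du \sim \left(\frac{\mu}{\lambda}\right)^{x/2}\frac{x}{\sqrt{\pi\sqrt{\lambda\mu}}}\left[\frac{e^{-T\mathcal{C}}}{\sqrt{T}} - \sqrt{\mathcal{C}}\,\Gamma\!\left(\frac{1}{2}, T\mathcal{C}\right)\right].
\]
To compute the expectation in the case where $\lambda = \mu$, note that for large enough $T$,
\[
E_x[\sigma_Y] = \int_0^\infty P_x[\sigma_Y > t]\,dt \ge \frac{x}{2\sqrt{\lambda\pi}}\int_T^\infty \frac{1}{\sqrt{t}}\,dt = \infty,
\]
whereas if $\lambda < \mu$, for a sufficiently large $T$, there are finite constants $C_1$ and $C_2$ such that for any $0 \le \theta < \mathcal{C}$,
\[
E_x\left[e^{\theta\sigma_Y}\right] = 1 + \theta\int_0^\infty e^{\theta t} P_x[\sigma_Y > t]\,dt \le C_1 + \theta C_2\int_T^\infty e^{-t(\mathcal{C}-\theta)}\,dt = C_1 + \theta C_2\,\frac{e^{-T(\mathcal{C}-\theta)}}{\mathcal{C}-\theta} < \infty. \qquad □
\]

Proof of Proposition 4.1.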
Let $F_{n,Q}(t;x,y)$ and $F_{n,L}(t;x,y)$ denote the cdf of $S_n^{Q}$ and $S_n^{L}$, respectively, starting from $z = (x,y)$, with densities $f_{n,Q}(t;z)$ and $f_{n,L}(t;z)$, where $F_{n,Q}(\cdot;z)$ is the convolution of $F_{1,Q}$ $(n-1)$ times with $F_{1,Q}(\cdot;z)$. The case $n = 1$ is given in Corollary 3.7. Assume the result is true for any $m \le n \in \mathbb{N}$. Then by Corollary 3.7 and the induction hypothesis,
\[
F_{1,L}(t;x,y) = F_{1,Q}(A_t;x,y) \quad\text{and}\quad f_{n,L}(t;x,y) = f_{n,Q}(A_t;x,y)\,\alpha_t. \tag{18}
\]
Also, by the definition of $\tau_n$ and $V_n$, under Assumption 2, if $z = (x,y)$, then
\[
\begin{aligned}
F_{n+1,L}(t;z) &= P_L[V_{n+1} \le t \mid q_0 = z] = P_L[V_n \le t,\ \tau_{n+1} \le t - V_n \mid q_0 = z] \\
&= \sum_{z'} f(z')\int_0^t P_L[\tau_{n+1} \le t-u \mid q_u = z']\, f_{n,L}(u;z)\,du \\
&= \sum_{z'} f(z')\int_0^t P_Q\!\left[\tau_{n+1} \le A^{(n+1)}_{t-u} \,\middle|\, q_u = z'\right] f_{n,Q}(A_u;z)\,\alpha_u\,du \\
&= \int_0^t F_{1,Q}(A_t - A_u)\, f_{n,Q}(A_u;z)\,\alpha_u\,du = \int_0^{A_t} F_{1,Q}(A_t - u)\, f_{n,Q}(u;z)\,du \\
&= \int_0^{A_t} F_{1,Q}(A_t - u)\,dF_{n,Q}(u;z) = \int_0^{A_t} F_{n,Q}(A_t - u)\,dF_{1,Q}(u;z) \\
&= P_Q[V_{n+1} \le A_t \mid q_0 = z],
\end{aligned}
\]
where we used the fact that for any $s \ge 0$, $\alpha^{(n+1)}(s) = \alpha(s+u)$ given $V_n = u$, so $A^{(n+1)}(t) = \int_0^t \alpha(s+u)\,ds = A_{t+u} - A_u$. Furthermore, in the last equality we used the fact that for $X$ and $Y$ non-negative independent random variables,
\[
F_{X+Y}(t) = P[X + Y \le t] = F_X * F_Y(t) = \int_0^t F_X(t-x)\,dF_Y(x),
\]
with $F_X$ and $F_Y$ denoting the cdfs of $X$ and $Y$. Finally, starting $q_0$ from distribution $f$, one obtains that $P_L[V_n \le t] = P_Q[V_n \le A_t]$. □

Department of Mathematics and Statistics, University of Calgary, Canada
E-mail address: [email protected]

Haskayne School of Business, University of Calgary, Canada, and Centre for Applied Financial Studies, University of South Australia, Adelaide, Australia
E-mail address: [email protected]

GERAD, CRM, and Department of Decision Sciences, HEC Montréal, Canada
E-mail address, Corresponding author: [email protected]

Department of Mathematics and Statistics, University of Calgary, Canada
E-mail address: