An Empirical Study on Arrival Rates of Limit Orders and Order Cancellation Rates in Borsa Istanbul

Abstract

Order book dynamics play an important role in both execution time and price formation of orders in an exchange market. In this study, we aim to model the limit order arrival rates in the vicinity of the best bid and the best ask price levels. We use limit order book data for Garanti Bank, which is one of the most traded stocks in Borsa Istanbul. In order to model the daily, weekly, and monthly arrival of limit order quantities, three different discrete probability distributions are tested: Geometric, Beta-Binomial and Discrete Weibull. Additionally, two theoretical models, namely, Exponential and Power law are also tested. We aim to model the arrival rates in the first fifteen bid and ask price levels. We use L1 norms in order to calculate the goodness-of-fit statistics. Furthermore, we examine the structure of weekly and monthly mean cancellation rates in the first ten bid and ask price levels.


Can Yilmaz Altinigne (a), Harun Ozkan (b), Veli Can Kupeli (b), Zehra Cataltepe (c)

(a) School of Computer and Communication Sciences, EPFL, Switzerland
(b) Matriks Bilgi Dağıtım Hizmetleri, 34396, Istanbul, Turkey
(c) Department of Computer Engineering, Faculty of Computer and Informatics Engineering, Istanbul Technical University, Istanbul, Turkey

Keywords:

Order arrival processes, Probability distribution fitting, Limit order book, Queueing systems

JEL:

C46, C51

1. Introduction

One of the main research areas on high-frequency financial data is the investigation of the microstructural properties of stock markets. Generally, research in this area involves modeling the main characteristics of the limit order book (Cont et al. (2010); Bouchaud et al. (2002); Zovko et al. (2002)) and the behavior of traders around specific events (Mu et al. (2010)).

Modeling market elements such as the duration between two orders, order volumes and order arrivals using parametric statistical distributions can help to understand the structure of market dynamics. Exponential family distributions are widely used in modeling these types of exchange market elements (Cont et al. (2010); Jiang et al. (2008)). Alternatively, arrival rates of limit orders can be modeled using a power law (Bouchaud et al. (2002); Zovko et al. (2002)). By modeling the order dynamics in the market, we can gain basic insight into the market microstructure. For this purpose, we aim to model the arrival rates of limit orders in the exchange market. We also intend to observe the statistical features of the cancellation rates (ratios of cancel orders to outstanding orders).

We use three well-known discrete statistical distributions for modeling the arrivals of orders: Discrete Weibull, Geometric and Beta-Binomial. In addition, we fit the Exponential distribution and the Power law to the same variables. We use L1 norms between the probability mass functions of the discrete distributions and the true arrival rates. For continuous distributions, we discretize the fit results by calculating the area below the probability density function. We compare the performance scores of different fits using Welch's t-test.

After completing the discrete and continuous distribution fits on the arrival rates of limit orders, we compare the best-fitting discrete model, which is the Discrete Weibull model, with the theoretical models. We show that the Discrete Weibull model performs two to three times better than the Exponential model in terms of L1 norms, and that it is very competitive against the Power law model suggested by Bouchaud et al. (2002). In this part of our research, we show that the arrival rates of limit orders in the vicinity of the best prices (15 ticks or less) can also be represented by discrete models.

In addition to the analysis of limit orders, we study cancel orders. We analyze the cancellation rates on a weekly and monthly basis. We consider the first 10 bid and ask price levels. We investigate whether the behavior of order cancellation rates changes across different bid and ask price levels. We observe that the hypothesis that weekly and monthly mean order cancellation rates are consistent with a Uniform distribution cannot be statistically rejected.

Preprint submitted to Borsa Istanbul Review, September 19, 2019.

2. Background and Literature Review

In exchange markets, limit orders, market orders and cancel orders constitute the current market dynamics. Arriving limit orders create a limit order book. Buy and sell orders are placed in the limit order book according to their price and quantity. A limit order stays in the order book until a market order is executed against it or a cancel order removes it (Cont (2011)). Cancel orders delete limit orders from the book. Market orders execute against limit orders and carry out the buying and selling operation in the market. The highest price on the buy side is the bid price, and the lowest price on the sell side is the ask price. The prices on the buy (sell) side of the limit order book are arranged in descending (ascending) order. The mean of the bid and the ask price is the mid-price. The ask price is guaranteed to be higher than the bid price only during the continuous auction, which is the phase in which continuous trading occurs in the market. The difference between them is called the bid/ask spread (Cont et al. (2010)).
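The definitions above can be illustrated with a minimal sketch. The snapshot below is hypothetical: the quantities are invented, and prices are stored as integers in hundredths of a currency unit (an assumption made here to avoid floating-point rounding), so it is not taken from the Garanti Bank data.

```python
# Hypothetical order book snapshot: price (in hundredths) -> resting quantity.
bids = {1214: 500, 1213: 300, 1212: 800}   # buy side
asks = {1217: 400, 1218: 250, 1219: 600}   # sell side


def book_summary(bids, asks):
    """Return (bid price, ask price, mid-price, bid/ask spread)."""
    best_bid = max(bids)                    # highest price on the buy side
    best_ask = min(asks)                    # lowest price on the sell side
    mid_price = (best_bid + best_ask) / 2   # mean of bid and ask
    spread = best_ask - best_bid
    return best_bid, best_ask, mid_price, spread


def sorted_levels(bids, asks):
    """Buy side in descending price order, sell side in ascending order."""
    buy_side = sorted(bids.items(), reverse=True)
    sell_side = sorted(asks.items())
    return buy_side, sell_side


print(book_summary(bids, asks))
```

With these illustrative numbers the bid is 1214, the ask is 1217, and the spread is 3 hundredths.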

Figure 1: Example of a limit order book

An example of a limit order book is shown in Figure 1. In this limit order book, the bid price is 12.14 and the ask price is 12.17. Limit buy (sell) orders represent traders who would like to buy (sell) a quantity of a stock at the indicated price. If someone submits a limit buy (sell) order with a price equal to or higher (lower) than the ask (bid) price, the order is executed immediately after submission (Cont (2011)).

The main part of our research is the modeling of arrival rates of limit orders. For limit buy (sell) orders, we consider the quantities on the left (right) side of the limit order book. The price levels in the vicinity of the best prices are called ticks. When we consider limit buy (sell) orders, we examine the order quantities in the ticks with respect to their distance to the best ask (bid) price (Cont et al. (2010)). For example, the price of 12.14 (12.17) is the first tick, and the price of 12.13 (12.18) is the second tick for limit buy (sell) orders. As shown in this example, the tick values are assigned to limit buy (sell) orders with respect to the distance of the order price to the best ask (bid) price. Since a limit order book has a dynamic structure, the prices change continually during the day as a result of incoming limit orders and the execution of market and cancel orders. In our research, we model the order quantities in the first 15 ticks for both limit buy and limit sell orders.

The arrival of limit orders near the best prices is dense. The price of an order is a crucial criterion for order execution, since the distance to the current bid and ask price is correlated with the arrival rate of limit orders (Cont et al. (2010)). There are different views on how to describe arrival rates of limit orders mathematically. In previous research on order placement strategies, Bouchaud et al. (2002) and Zovko et al. (2002) suggested that the arrival rates of limit orders $\Lambda(i)$ can be modeled with a power law, shown below.

$$\Lambda(i) = \frac{k}{i^{\alpha}} \tag{1}$$

In this equation, $i$ denotes the tick value. The value of $\Lambda(i)$ can be between 0 and $\infty$. The parameter $k$ is a positive real number, and $k$ and $\alpha$ can be estimated using a least squares fit as shown in Equation (2) (Cont et al. (2010)). The value of 15 is chosen as the upper boundary in that equation, since we perform modeling in the first 15 ticks. The real arrival rates are denoted by $\hat{\Lambda}(i)$.

$$\min_{k,\alpha} \sum_{i=1}^{15} \left(\hat{\Lambda}(i) - \frac{k}{i^{\alpha}}\right)^{2} \tag{2}$$

In other research, a stochastic model consisting of independent Poisson processes is suggested by Cont et al. (2010). In this model, the arrival rates of limit orders are distributed exponentially.

Weibull and $q$-exponential distributions are used in an early work on modeling the duration between two successive transactions (Jiang et al. (2008)). In another work, the arrival of orders is represented as a renewal process in which the waiting times between two successive orders follow a Weibull distribution (Cincotti et al. (2006)). We can infer that the duration between two consecutive orders would also differ from tick to tick, since the arrival rates of limit orders differ with respect to the distance to the best prices. Therefore, because of its flexibility in modeling durations, we test the Weibull distribution in modeling arrival rates of limit orders. However, we use the discrete variant of the Weibull distribution proposed by Nakagawa & Osaki (1975), since we perform an analysis on discrete distributions. The Discrete Weibull distribution has two real parameters, $0 < q < 1$ and $\beta > 0$, with integer support on $[0, \infty)$. The probability mass function of Discrete Weibull is given in Equation (3) (Nakagawa & Osaki (1975)). We use the probability mass function shifted so that it starts from the lowest tick value of 1.

$$P(X = x; q, \beta) = q^{(x-1)^{\beta}} - q^{x^{\beta}}, \quad x = 1, 2, 3, \ldots \tag{3}$$

Beta family distributions are frequently used in finance. They are often used in modeling recovery rates from debts and credit risk (Chen & Wang (2013)). It is well known that high skewness is frequently observed in credit risk data (Schroeck (2002)). Similarly, arrival rates of limit orders have positive skewness, since the rates are much higher in the ticks that are close to the best prices. Because of its good performance on highly skewed data, we also test the Beta-Binomial distribution, which is a discrete member of the Beta family, in modeling the arrival rates. The Beta-Binomial distribution is defined by two real parameters, $\alpha > 0$ and $\beta > 0$, and has finite integer support on $[0, n]$. The probability mass function for $n$ trials is given in Equation (4), where $B$ is the Beta function defined in Equation (5).

$$P(X = x; \alpha, \beta) = \binom{n}{x} \frac{B(x + \alpha, n - x + \beta)}{B(\alpha, \beta)} \tag{4}$$

$$\forall u, \forall v > 0, \quad B(u, v) = \int_{0}^{1} t^{u-1} (1 - t)^{v-1} \, dt \tag{5}$$

As indicated before, the Exponential distribution is also used for modeling arrival rates of limit orders. In an early work on modeling market dynamics, Cont et al. (2010) assumed in their stochastic model that limit orders arrive at tick $i$ from the best price with an exponential rate $\lambda(i)$. As a result, the Exponential distribution and the Geometric distribution, which is the discrete variant of the Exponential, are also used in our research. The Exponential distribution has one real parameter $\lambda > 0$. Its probability density function is given below.

$$P(X = x; \lambda) = \lambda e^{-\lambda x}, \quad x \geq 0 \tag{6}$$

The Geometric distribution has one real parameter $0 < p < 1$. Its probability mass function for $x$ trials is given below. We use the probability mass function of the Geometric distribution starting from the lowest tick value of 1.

$$P(X = x; p) = (1 - p)^{x-1} p, \quad x = 1, 2, 3, \ldots \tag{7}$$

Since most of the continuous and discrete distributions have support on the set $\{x : x > 0\}$, we adjusted our first tick rate to be the zeroth tick and, therefore, shifted the fit results one tick to the right.

In an early work on cancel orders, Blanchet & Chen (2013) assumed that the cancellation rates are relatively higher in the ticks close to the best bid and ask prices than in distant ticks. Cont et al. (2010) assumed that the cancellation rates follow an Exponential distribution with respect to the distance to the best bid and ask prices, and that these rates are proportional to the limit orders at that level. Bouchaud et al. (2018) also suggested that the cancellation rates are proportional to the arrival rates of limit orders, under the assumption that activity is much higher in areas with high arrival rates of limit orders.
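As a rough illustration of the three discrete models discussed in this section, the following sketch evaluates their probability mass functions with the Python standard library only. It assumes the shifted forms starting at tick 1 for Discrete Weibull and Geometric, and the standard Beta-Binomial form on 0, ..., n. The function names and parameter values are hypothetical and are not the fitted values of the study.

```python
import math


def discrete_weibull_pmf(x, q, beta):
    """Shifted Discrete Weibull PMF on x = 1, 2, ...: q^((x-1)^beta) - q^(x^beta)."""
    return q ** ((x - 1) ** beta) - q ** (x ** beta)


def geometric_pmf(x, p):
    """Geometric PMF on x = 1, 2, ...: (1 - p)^(x - 1) * p."""
    return (1 - p) ** (x - 1) * p


def log_beta(u, v):
    """Logarithm of the Beta function, computed via log-gamma for stability."""
    return math.lgamma(u) + math.lgamma(v) - math.lgamma(u + v)


def beta_binomial_pmf(x, n, alpha, beta):
    """Beta-Binomial PMF on x = 0, 1, ..., n."""
    log_choose = (math.lgamma(n + 1) - math.lgamma(x + 1)
                  - math.lgamma(n - x + 1))
    return math.exp(log_choose
                    + log_beta(x + alpha, n - x + beta)
                    - log_beta(alpha, beta))


# Illustrative parameters only (not estimates from the paper).
ticks = range(1, 16)
dw = [discrete_weibull_pmf(i, q=0.5, beta=0.8) for i in ticks]
geo = [geometric_pmf(i, p=0.4) for i in ticks]
# Assumption: tick i is mapped to Beta-Binomial outcome i - 1 with n = 14.
bb = [beta_binomial_pmf(i - 1, n=14, alpha=0.7, beta=3.0) for i in ticks]
```

With beta = 1, the Discrete Weibull mass function reduces to the Geometric one with p = 1 - q, which is a convenient sanity check for an implementation.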

3. Materials and method

The raw market data consist of network-captured MoldUDP packets containing ITCH® messages. ITCH® is a NASDAQ protocol for market data that includes all orders at nanosecond resolution (NASDAQ-OMX-Group (2015)). We use Garanti Bank stock data from Borsa Istanbul. The data span 40 trading days from August 1, 2017 to September 29, 2017, and we sample 228 instances containing the rates of hourly, daily, weekly and monthly arrived limit orders for analysis.

We extract the limit order quantities that arrived at the first 15 ticks from the best prices. Some quantities arrived beyond the first 15 ticks, but they were small compared with the quantities in the first 15 ticks, so we omitted them. As a result, we perform discrete and continuous fits on the arrival rates $\lambda_t(i)$, obtained by normalizing the limit order quantity $Q_t(i)$ that arrived at the $i$th tick in time instance $t$.

$$\lambda_t(i) = \frac{Q_t(i)}{\sum_{j=1}^{15} Q_t(j)}, \quad i = 1, \ldots, 15 \tag{8}$$

We split the data into four different time groups. These timesteps are the daily average of limit buy/sell quantities (40 days), the weekly average of limit buy/sell quantities (9 weeks), the monthly average of limit buy/sell quantities, and the hourly average of limit buy/sell quantities in 9 weeks. We consider the market working hours from 10 am to 1 pm and from 2 pm to 6 pm, which gives 7 different hourly timesteps in a day.

Consequently, we obtain 40 instances for daily data, 9 instances for weekly data, 2 instances for monthly data and 63 instances for hourly-weekly data. Since we consider both limit buy and limit sell orders, we have 228 different instances on which to perform discrete and continuous fits. Using three discrete distributions, Discrete Weibull, Beta-Binomial and Geometric, we perform fits on the limit buy/sell order quantities that arrived at the first 15 ticks from the best prices. We used the Exponential distribution as a continuous model (Cont et al. (2010)). We also compared the performance of the best discrete fit with Exponential fits and with Power law fits, which are proposed by Bouchaud et al. (2002) and Zovko et al. (2002), in order to examine whether a discrete approach can compete with the approaches suggested in previous works.

Maximum likelihood estimation finds the parameters that maximize the joint probability density function of the data (the likelihood). Since maximizing a product directly is cumbersome, the logarithm of the likelihood function is generally considered (Myung (2003)). The approach of maximum likelihood estimation is shown in Equations (9) and (10), where $\theta$ is the parameter vector of the model and $x^n$ is the data.

$$\text{Likelihood}(\theta) = p(x^n \mid \theta) = \prod_{i=1}^{n} P(x_i \mid \theta), \quad x^n = \{x_1, x_2, \ldots, x_n\} \tag{9}$$

$$\hat{\theta} = \arg\max_{\theta} \, p(x^n \mid \theta) \tag{10}$$

We did not use any R functions to estimate the parameters of the Exponential and Geometric distributions. Taking the derivative of the logarithm of the likelihood function and setting it to zero yields the maximum likelihood estimate of the parameter $\lambda$ of the Exponential distribution and of the parameter $p$ of the Geometric distribution.

$$P(X = x; \lambda) = \lambda e^{-\lambda x}, \quad x \geq 0 \tag{11}$$

$$\hat{\lambda}, \hat{p} = \frac{n}{\sum_{i=1}^{n} x_i}, \quad x^n = \{x_1, x_2, \ldots, x_n\} \tag{12}$$

The estimated parameter of the Geometric distribution is found using the same equation. Because the Geometric distribution is the discrete variant of the Exponential distribution, the only difference is that the Geometric distribution has integer $x^n$ values. The estimation of the parameters of both distributions is given in Equation (12).

In order to compare the performance of the models, we consider the sum of L1 norms between the real values and the fit results. The sum of the absolute differences between the observed densities and the fit results is taken as the error term. The error term at timestep $t$ is shown below.

$$\text{Error} = \sum_{i=1}^{15} \left| \lambda_t(i) - \hat{\lambda}_t(i) \right| \tag{13}$$

We analyze the number of cancel orders that arrive around the best price and the ratio of cancel orders in the vicinity of the best prices. The ratios and numbers of cancel orders are considered as weekly and monthly averages. For each cancel order, we take the ratio of its quantity to the total quantity in that tick just before the cancel order arrives. We then sum these ratios on a weekly and monthly basis and divide by the number of cancel orders that arrive in the particular tick over the same period. We denote the cancel order ratio in tick $i$ with $k$ arrived cancel orders in timestep $t$ by $C_t(i)$.

$$C_t(i) = \frac{\sum_{n=1}^{k} \dfrac{\text{Canceled quantity in tick } i \text{ with order } p_n}{\text{Total quantity in tick } i \text{ before order } p_n}}{\text{Number of cancel orders arrived in tick } i \text{ in timestep } t} \tag{14}$$

We compare our results on the number of cancel orders that arrive in the vicinity of the best bid and ask prices, and on the behavior of the ratios of cancel orders with respect to the distance to the best bid and ask prices, with previous works on cancel orders (Cont et al. (2010); Blanchet & Chen (2013); Bouchaud et al. (2018)).
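The normalization in Equation (8), the closed-form estimate in Equation (12) and the L1 error in Equation (13) can be sketched as follows. This is a minimal illustration under one assumption that the text does not spell out: the quantity at tick i is treated as the number of observations with value i, so that n is the total quantity and the sum of the x_i is the quantity-weighted sum of tick values. The quantities are hypothetical.

```python
def arrival_rates(quantities):
    """Normalize per-tick quantities so that they sum to 1 (Equation (8))."""
    total = sum(quantities)
    return [q / total for q in quantities]


def geometric_mle_from_counts(quantities):
    """Closed-form estimate p = n / sum(x_i) (Equation (12)).

    Assumption: quantities[i - 1] counts observations with tick value i.
    """
    n = sum(quantities)
    weighted_sum = sum(i * q for i, q in enumerate(quantities, start=1))
    return n / weighted_sum


def geometric_fit(p, n_ticks):
    """Geometric probabilities on ticks 1..n_ticks, renormalized to sum to 1."""
    masses = [(1 - p) ** (i - 1) * p for i in range(1, n_ticks + 1)]
    total = sum(masses)
    return [m / total for m in masses]


def l1_error(observed, fitted):
    """Sum of absolute differences between observed and fitted rates (Equation (13))."""
    return sum(abs(o - f) for o, f in zip(observed, fitted))


# Hypothetical quantities for four ticks (the study uses 15 ticks).
quantities = [4, 2, 1, 1]
rates = arrival_rates(quantities)
p_hat = geometric_mle_from_counts(quantities)
error = l1_error(rates, geometric_fit(p_hat, len(quantities)))
```

For these numbers the estimate is 8/15, since the total quantity is 8 and the weighted sum of tick values is 15.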

4. Results

To measure the performance of the discrete fits, we give each distribution fit a performance score. The performance score is the ratio of the error of a distribution fit to the minimum fit error on that instance. As a result, this ratio is greater than or equal to 1. If a distribution has the best fit, its score is 1. We judge performance by how close the performance scores are to 1. The normalized performance score of a distribution $d$ for instance $i$ is given below.

$$NPS_d(i) = \frac{\text{Error of distribution } d \text{ on instance } i}{\text{Minimum error among 3 distributions on instance } i} \tag{15}$$

We calculate the mean and standard deviation of the performance scores of the three distributions in hourly, daily, weekly and monthly fits, and decide which distribution has the best fit in a particular timestep. The limit buy and limit sell fits of the Geometric, Beta-Binomial and Discrete Weibull distributions for the last 12 days (from Day 29 to Day 40) can be seen in Figure 2. There are fits for all 40 days, but to save space we only show the fits of the last 12 days.

Day 29 and Day 30 are Thursday, September 14th and Friday, September 15th respectively. Day 31 to Day 35 is the week starting on Monday, September 18th. Day 36 to Day 40 is the week starting on Monday, September 25th. Since some of the continuous and discrete probability distributions that we use have support from 0 to infinity, we divide the probability mass values by the sum of all probabilities from the first tick to the 15th tick for normalization. It is striking that the Discrete Weibull distribution has the best fit for 75 of the 80 daily instances. Beta-Binomial outperforms Discrete Weibull in only 5 instances, and the fits of the Geometric distribution are 3 to 4 times worse than those of both the Discrete Weibull and Beta-Binomial distributions.

Normalized performance scores of each model in different timesteps can be observed in the charts in the Appendix. The performance of a model increases as the cell color gets lighter and the score gets closer to 1.

The limit buy and limit sell fits of the Geometric, Beta-Binomial and Discrete Weibull distributions from Week 1 to Week 9 can be seen in Figure 3 and Figure 4 below.

Figure 2: The upper figure shows Daily Limit Buy orders and the lower figure shows Daily Limit Sell orders

Figure 3: Discrete Fits on Weekly Limit Buy Arrival Rates

We observe that the Discrete Weibull distribution has the best fit for all of the weekly instances. Beta-Binomial has a fit performance close to that of Discrete Weibull. The Geometric distribution is 3 to 4 times worse than both the Discrete Weibull and Beta-Binomial distributions on average.

Geometric, Beta-Binomial and Discrete Weibull fits on the monthly arrival rates of limit orders can be observed in Figure 5. We observe that the Discrete Weibull distribution has the best fit for all of the monthly instances. The Geometric distribution is 3 to 4 times worse than the others on average.

Geometric, Beta-Binomial and Discrete Weibull fits on the arrival rates of limit orders in hourly timesteps in different weeks can be observed in Figure 6 and Figure 7. Since there are 63 instances for this timestep, we only show the fits on the arrival rates in the first 3 weeks and the first 3 hours.

Figure 4: Discrete Fits on Weekly Limit Sell Arrival Rates

Figure 5: Discrete Fits on Monthly Limit Buy (above) and Sell (below) Arrival Rates
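The normalized performance score can be sketched in a few lines. The error values below are hypothetical and serve only to show the arithmetic; the function name is an assumption of this sketch.

```python
from statistics import mean, stdev


def normalized_performance_scores(errors):
    """Divide each model's error by the smallest error on the same instance.

    errors: dict mapping a model name to its L1 error on one instance.
    The best model gets a score of exactly 1; all others score above 1.
    Note: an instance whose minimum error is 0 would need special handling.
    """
    best = min(errors.values())
    return {name: err / best for name, err in errors.items()}


# Hypothetical L1 errors of the three discrete fits on two instances.
instances = [
    {"Geometric": 0.75, "Discrete Weibull": 0.25, "Beta-Binomial": 0.3125},
    {"Geometric": 0.50, "Discrete Weibull": 0.25, "Beta-Binomial": 0.25},
]
scores = [normalized_performance_scores(e) for e in instances]

# Mean and standard deviation of the scores per model across instances.
summary = {
    name: (mean(s[name] for s in scores), stdev(s[name] for s in scores))
    for name in instances[0]
}
```

On the first hypothetical instance the scores are 3.0, 1.0 and 1.25, so Discrete Weibull is the best fit and Geometric is three times worse.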

In order to compare the performance of the discrete fits, we use the means of the performance scores of the three distributions in different timesteps. Welch's t-test is used to decide which discrete distribution has the best fit on the arrival rates of limit orders, by checking whether there is a significant difference between the mean performance scores.

Figure 6: Discrete Fits on Hourly Limit Buy Arrival Rates

Figure 7: Discrete Fits on Hourly Limit Sell Arrival Rates

| Timestep | Geometric | Discrete Weibull | Beta-Binomial |
|---|---|---|---|
| Daily Limit Buy | 4.042 ± 2.115 | 1.000 ± 0.000 | 1.262 ± 0.144 |
| Daily Limit Sell | 3.144 ± 1.320 | 1.033 ± 0.126 | 1.214 ± 0.162 |
| Weekly Limit Buy | 4.285 ± 0.779 | 1.000 ± 0.000 | 1.363 ± 0.106 |
| Weekly Limit Sell | 3.183 ± 0.692 | 1.000 ± 0.000 | 1.237 ± 0.070 |
| Monthly Limit | 3.789 ± 0.748 | 1.000 ± 0.000 | 1.312 ± 0.113 |
| Hourly Limit Buy | 3.643 ± 1.760 | 1.012 ± 0.057 | 1.243 ± 0.160 |
| Hourly Limit Sell | 2.895 ± 1.125 | 1.026 ± 0.083 | 1.153 ± 0.103 |

Table 1: Mean NPS of Discrete Fits on Different Timesteps
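For readers unfamiliar with Welch's t-test, the sketch below computes the test statistic and the Welch-Satterthwaite degrees of freedom from two samples of performance scores using only the standard library. The sample values are hypothetical. Turning the statistic into a p-value requires the Student t distribution, for which a statistics package such as SciPy (`scipy.stats.ttest_ind` with `equal_var=False`) could be used instead.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and approximate degrees of freedom.

    Unlike Student's t-test, the two samples are not assumed to share a
    common variance. At least one sample must have non-zero variance.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical performance scores of two models on five instances.
scores_model_a = [1.00, 1.00, 1.10, 1.00, 1.05]
scores_model_b = [1.20, 1.30, 1.25, 1.15, 1.35]
t_stat, dof = welch_t(scores_model_a, scores_model_b)
```

A negative statistic here means the first model has the smaller mean score, that is, the better average fit.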

It can be observed in Table 1 that the performance scores of the Geometric fits are 3 to 4 times higher than those of the Discrete Weibull and Beta-Binomial fits. As a result, Geometric fits have the worst performance among the three distributions. In order to find the best fits, we perform Welch's t-test between the Discrete Weibull and Beta-Binomial fits. The results of the Welch's t-tests are given as p-values in Table 2.

| Timestep | p-value |
|---|---|
| Daily Limit Buy | 0.011 |
| Daily Limit Sell | 0.033 |
| Weekly Limit Buy | 0.003 |
| Weekly Limit Sell | 0.020 |
| Monthly Limit | 0.061 |
| Hourly Limit Buy | 0.092 |
| Hourly Limit Sell | 0.039 |

Table 2: Welch's t-test results between Discrete Weibull and Beta-Binomial on Different Timesteps

We choose a 95% confidence level for the t-tests. For 5 of the 7 timesteps the p-value is below 0.05, so we can reject the null hypothesis that the means of the Discrete Weibull and Beta-Binomial performance scores are not significantly different. As a result, Discrete Weibull has significantly better fits than Beta-Binomial for Daily Limit Buy, Daily Limit Sell, Weekly Limit Buy, Weekly Limit Sell and Hourly Limit Sell orders, since it has smaller means. For Monthly Limit and Hourly Limit Buy orders, there is no significant difference between the Discrete Weibull fits and the Beta-Binomial fits.

We use a least squares approach for power law parameter estimation, since it is suggested by Bouchaud et al. (2002). The error measure, the timesteps and the comparison of performance scores are the same as for the discrete fits. We discretize the Exponential fits by using the area under the probability density function. We divide the x-axis into 15 equal parts and find the densities for the 15 ticks by calculating the areas under the probability density function. The limit buy and limit sell fits of the Exponential, Power law and Discrete Weibull models from Day 29 to Day 40 can be seen in Figure 8 and Figure 9.
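A least squares power law fit can be sketched without external libraries. The sketch assumes the power law has the form k / i^alpha over ticks 1 to 15. For a fixed alpha the optimal k has a closed form, so only alpha needs to be searched; the grid search below is a simplification chosen for this illustration and is not necessarily the optimizer used in the study.

```python
def best_k(rates, alpha):
    """Least-squares k for a fixed alpha in the model rate(i) = k / i**alpha."""
    num = sum(r * i ** -alpha for i, r in enumerate(rates, start=1))
    den = sum(i ** (-2 * alpha) for i in range(1, len(rates) + 1))
    return num / den


def sum_squared_error(rates, k, alpha):
    """Sum of squared differences between observed rates and k / i**alpha."""
    return sum((r - k * i ** -alpha) ** 2
               for i, r in enumerate(rates, start=1))


def fit_power_law(rates, alpha_grid=None):
    """Return (k, alpha, error) minimizing the squared error over a grid of alpha."""
    if alpha_grid is None:
        # Hypothetical search range: alpha from 0.00 to 5.00 in steps of 0.01.
        alpha_grid = [a / 100 for a in range(0, 501)]
    best = None
    for alpha in alpha_grid:
        k = best_k(rates, alpha)
        err = sum_squared_error(rates, k, alpha)
        if best is None or err < best[2]:
            best = (k, alpha, err)
    return best


# Synthetic rates generated from a known power law, used as a self-check.
synthetic = [0.5 / i ** 1.5 for i in range(1, 16)]
k_hat, alpha_hat, err = fit_power_law(synthetic)
```

On the synthetic input the search recovers k close to 0.5 and alpha equal to 1.5, which confirms that the closed-form step and the grid search agree.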

Figure 8: Exponential, Power law and Discrete Weibull fits on Daily Limit Buy Arrival Rates from Day 29 to Day 40
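The discretization of an Exponential fit by areas under its density can be sketched as below. The sketch assumes unit-width bins [j, j + 1) for j = 0, ..., 14, with bin j standing for tick j + 1, and renormalizes the 15 areas so that they sum to 1. The bin width and the rate value are assumptions made for illustration.

```python
import math


def exponential_bin_masses(lam, n_bins=15, width=1.0):
    """Areas under lam * exp(-lam * x) over n_bins consecutive bins.

    The area over [a, b) equals exp(-lam * a) - exp(-lam * b), which is the
    difference of the Exponential CDF at the two bin edges. The areas are
    renormalized because the tail beyond the last bin is discarded.
    """
    edges = [j * width for j in range(n_bins + 1)]
    masses = [math.exp(-lam * a) - math.exp(-lam * b)
              for a, b in zip(edges, edges[1:])]
    total = sum(masses)
    return [m / total for m in masses]


# Hypothetical rate parameter, not an estimate from the paper.
masses = exponential_bin_masses(lam=0.6)
```

Because every bin has the same width, consecutive masses differ by the constant factor exp(-lam), so the discretized Exponential behaves like a truncated Geometric distribution.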

We observe that the Discrete Weibull distribution and the Power law have the best fits for most of the instances. The performance of Discrete Weibull and that of the Power law are very close to each other. On the other hand, the Exponential distribution is 2 to 3 times worse than both the Discrete Weibull distribution and the Power law. The limit buy and limit sell fits of the Exponential, Power law and Discrete Weibull models from Week 1 to Week 9 can be seen in Figure 10 and Figure 11 below.

We observe that the Discrete Weibull distribution has the best fit for most of the weekly instances. The Power law has a fit performance close to that of the Discrete Weibull distribution. Not surprisingly, the Exponential distribution, being more parsimonious in the number of parameters, performed much worse than both Discrete Weibull and the Power law on average.

Figure 9: Exponential, Power law and Discrete Weibull fits on Daily Limit Sell Arrival Rates from Day 29 to Day 40

Figure 10: Exponential, Power law and Discrete Weibull fits on Weekly Limit Buy Arrival Rates

Exponential, Power law and Discrete Weibull fits on the monthly arrival rates of limit orders can be observed in Figure 12.

We observe that the Discrete Weibull distribution has the best fit for 3 out of 4 monthly instances. The Exponential distribution is about 2 times worse than both Discrete Weibull and the Power law on average.

Figure 11: Exponential, Power law and Discrete Weibull fits on Weekly Limit Sell Arrival Rates

(a) Limit Buy Orders (b) Limit Sell Orders

Figure 12: Exponential, Power law and Discrete Weibull fits on Monthly Limit Orders Arrival Rates

In order to compare the performance of Exponential, Power law and Discrete Weibull, we again use the means of the performance scores of the three models in different timesteps. Welch's t-test is used to decide which model has the best fit on the arrival rates of limit orders. Average performance scores of the Discrete Weibull, Exponential and Power law fits on different timesteps are given in Table 3.

| Timestep | Exponential | Discrete Weibull | Power law |
|---|---|---|---|
| Daily Limit Buy | 2.674 ± 1.346 | 1.098 ± 0.198 | 1.154 ± 0.222 |
| Daily Limit Sell | 2.018 ± 1.071 | 1.059 ± 0.104 | 1.128 ± 0.167 |
| Weekly Limit Buy | 2.515 ± 0.651 | 1.165 ± 0.191 | 1.051 ± 0.066 |
| Weekly Limit Sell | 1.771 ± 0.471 | 1.004 ± 0.014 | 1.076 ± 0.060 |
| Monthly Limit | 2.038 ± 0.394 | 1.063 ± 0.127 | 1.048 ± 0.062 |
| Hourly Limit Buy | 2.624 ± 1.279 | 1.094 ± 0.126 | 1.163 ± 0.268 |
| Hourly Limit Sell | 1.918 ± 0.753 | 1.056 ± 0.074 | 1.128 ± 0.168 |

Table 3: Mean NPS of Proposed Distributions on Different Timesteps

Exponential fits evidently have the worst performance among the three models. On the other hand, as the Power law and Discrete Weibull have very close mean performance scores, we perform a t-test to check whether there is a significant difference between those values.

| Timestep | p-value |
|---|---|
| Daily Limit Buy | 0.363 |
| Daily Limit Sell | 0.326 |
| Weekly Limit Buy | 0.343 |
| Weekly Limit Sell | 0.464 |
| Monthly Limit | 0.973 |
| Hourly Limit Buy | 0.743 |
| Hourly Limit Sell | 0.242 |

Table 4: Welch's t-test results between Discrete Weibull and Power law

For all of the timesteps, the p-values are higher than 0.05, so we cannot reject the null hypothesis that the mean performance scores of Discrete Weibull and the Power law are not significantly different. We can therefore say that the performance of the Power law and Discrete Weibull fits is not significantly different in any timestep. Consequently, Discrete Weibull models can compete with the Power law models proposed by Bouchaud et al. (2002) and Zovko et al. (2002). We can use the Discrete Weibull distribution to model the arrival rates of limit orders with respect to the distance to the best prices accurately.

We analyze the number of cancel orders and the ratio of canceled orders in the vicinity of the best prices. We consider both cancel buy orders and cancel sell orders on a weekly and monthly basis. In previous works, Blanchet & Chen (2013) noted that cancellation activity is much higher in the regions close to the best bid and ask prices. The numbers of cancel orders that arrived at the first 10 ticks on a weekly basis in our experiments are shown in Figure 13 and Figure 14.

When we measure cancel activity as the number of cancel orders arrived, the results are consistent with the previous works. The number of cancel orders that arrive in the regions close to the best prices is higher. We also consider the average ratio of canceled order quantity in the ticks. We use the metric defined in Section 3.3 to compute the ratios. The ratios of canceled order quantities in the first 10 ticks on a weekly basis in our experiments are shown in Figure 15 and Figure 16. The red and blue lines in the figures indicate the expected values.
Figure 13: Number of Cancel Buy orders arrived in the vicinity of the best ask price on weekly basis

Figure 14: Number of Cancel Sell orders arrived in the vicinity of the best bid price on weekly basis

Figure 15: Ratios of canceled buy order quantities in the vicinity of the best ask price on weekly basis. Red line is the average ratio

Figure 16: Ratios of canceled sell order quantities in the vicinity of the best bid price on weekly basis. Blue line is the average ratio

Figure 17: Ratios of canceled buy order quantities in the vicinity of the best ask price on monthly basis. Red line is the average ratio

Figure 18: Ratios of canceled sell order quantities in the vicinity of the best bid price on monthly basis. Blue line is the average ratio
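The cancellation ratio per tick plotted in these figures can be sketched as follows, assuming each cancel order is available as a record of its tick, its canceled quantity and the total quantity resting in that tick just before the cancel arrived. The record layout and the numbers are hypothetical.

```python
from collections import defaultdict


def mean_cancel_ratios(cancels):
    """Average ratio of canceled quantity to resting quantity, per tick.

    cancels: iterable of (tick, canceled_quantity, total_quantity_before)
    tuples, all belonging to the same timestep (one week or one month).
    """
    ratio_sums = defaultdict(float)
    counts = defaultdict(int)
    for tick, canceled, total_before in cancels:
        ratio_sums[tick] += canceled / total_before
        counts[tick] += 1
    # Divide the summed ratios by the number of cancel orders in each tick.
    return {tick: ratio_sums[tick] / counts[tick] for tick in ratio_sums}


# Hypothetical cancel orders within one timestep.
cancels = [
    (1, 50, 200),    # tick 1: 50 canceled out of 200 resting
    (1, 100, 400),   # tick 1: 100 canceled out of 400 resting
    (2, 10, 100),    # tick 2: 10 canceled out of 100 resting
]
ratios = mean_cancel_ratios(cancels)
```

In this example tick 1 has an average ratio of 0.25 from two cancel orders, and tick 2 has a ratio of 0.1 from a single order.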

The ratios were close to each other most of the time. Thus, we perform uniformity tests on the ratios. We use the Chi-Square test to check whether the ratios are distributed uniformly. The Chi-Square test statistic is given in Equation (16), where $c$ denotes the degrees of freedom. In our experiment, we have 10 different categories (one per tick). Since the degrees of freedom equal the number of categories minus 1, the degrees of freedom value is 9 for our experiments.

$$\chi^{2}_{c} = \sum_{i=1}^{10} \frac{(\text{Observed}_i - \text{Expected}_i)^{2}}{\text{Expected}_i} \tag{16}$$

Chi-Square tests use count data as observed data. Therefore, we convert the cancellation ratios to integer values by multiplying them by 100. Then, we test the null hypothesis that the ratios of canceled orders are consistent with a Uniform distribution. Chi-Square test statistics and corresponding p-values are given in Table 5 and Table 6.

Timestep Cancel Buy Cancel Sell

Table 5: Comparison between the ratios of canceled orders and the uniform distribution: Chi-Square Test Statistics on Different Timesteps
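The uniformity test described above can be sketched with the standard library. The sketch assumes that the expected count under the Uniform hypothesis is the mean of the scaled observed values, which the text does not state explicitly, and it compares the statistic with the tabulated 95% critical value for 9 degrees of freedom (16.919) instead of computing a p-value. The ratios are hypothetical.

```python
CRITICAL_95_DF9 = 16.919  # chi-square critical value at the 5% level, 9 d.o.f.


def chi_square_uniform(ratios, scale=100):
    """Chi-square statistic of scaled ratios against a uniform expectation.

    ratios: one cancellation ratio per tick.
    scale: factor used to turn ratios into integer counts (100 in the text).
    Returns (statistic, degrees_of_freedom).
    """
    observed = [round(r * scale) for r in ratios]
    # Assumption: under uniformity every tick has the same expected count.
    expected = sum(observed) / len(observed)
    stat = sum((o - expected) ** 2 / expected for o in observed)
    return stat, len(observed) - 1


# Hypothetical ratios for 10 ticks: nearly flat, so uniformity is plausible.
flat_ratios = [0.21, 0.19, 0.20, 0.22, 0.18, 0.20, 0.21, 0.19, 0.20, 0.20]
stat, dof = chi_square_uniform(flat_ratios)
reject_uniform = stat > CRITICAL_95_DF9
```

For the nearly flat input the statistic is far below the critical value, so the Uniform hypothesis is not rejected in this illustration.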

Timestep Cancel Buy Cancel Sell

Table 6: Comparison between the ratios of canceled orders and the uniform distribution:Corresponding p -values on Diﬀerent Timesteps We use 95% conﬁdence interval for the tests. Chi-Square tests for uniformitypresent p -values higher than 0.05 in all of the instances except for the ratiosof cancel sell orders in the 5th week. Consequently, we can not reject thenull hypothesis which indicates that the cancellation rates are consistent withUniform distribution for most of the instances.21 . Conclusion and Future Work In this research, we used diﬀerent statistical distributions to ﬁt the limitorder quantities arrived in the vicinity of the best bid and ask prices. The ﬁtsare made on Garanti Bank stock data for the period from August - September2017. We analyzed the daily, weekly and monthly mean limit order quantitiesarrived at the ﬁrst 15 levels from the best prices. Also, we considered the weeklymean quantities of limit orders arrived in 7 diﬀerent time intervals in a day. Weused total sum of L norms between empirical density and ﬁt results in the ﬁrst15 price levels of the bid and ask prices to evaluate the goodness of ﬁts.We observed that Discrete Weibull and Beta Binomial distributions are al-most 4 times better at ﬁtting the order quantity data than Geometric distri-bution. We had 228 instances to ﬁt and Discrete Weibull has the lowest L norm for 210 of them. Beta Binomial ﬁts the data with the lowest L norm for17 of the instances and Geometric distribution has the best ﬁt for only one ofthe instances. Additionally, we used Exponential distribution to ﬁt the same228 instances. We found the probability mass values by calculating the areas of15 bins under the Exponential probability density function using discretization.Then we obtained the sum of L norms between empirical density and 15 prob-ability mass values, and compared the goodness of Exponential distribution ﬁtswith Discrete Weibull distribution ﬁts. 
We observed that Discrete Weibull fits the daily, weekly and monthly mean quantities two times better than the Exponential distribution. Also, Discrete Weibull fits can compete with the Power law fits proposed in earlier works.

We analyzed the weekly and monthly mean ratios of cancel orders in the first 10 price levels. We conducted Chi-Square tests of uniformity and observed that we cannot reject the hypothesis that the cancellation rates are consistent with the Uniform distribution. As a result, we found that the assumption made by Cont et al. (2010), namely that cancellation rates are distributed exponentially, cannot be adapted to Turkish markets.

Our dataset was quite small, since it contains only 2 months of data. Also, we considered only Garanti Bank stock data. In future work, the same experiments can be conducted on larger datasets covering, for example, 6 months or 1 year. Moreover, stock data from other companies can be considered, and the relation between different stocks would be another interesting extension. Additionally, the relation between stock prices and arrival rates can be examined and predicted.

References

Blanchet, J., & Chen, X. (2013). Continuous-time modeling of bid-ask spread and price dynamics in limit order books. arXiv preprint arXiv:1310.1103.

Bouchaud, J.-P., Bonart, J., Donier, J., & Gould, M. (2018). Trades, quotes and prices: financial markets under the microscope. Cambridge University Press.

Bouchaud, J.-P., Mézard, M., Potters, M. et al. (2002). Statistical properties of stock order books: empirical results and models. Quantitative finance, 251–256.

Chen, R., & Wang, Z. (2013). Curve fitting of the corporate recovery rates: The comparison of beta distribution estimation and kernel density estimation. PloS one, e68238.

Cincotti, S., Focardi, S. M., Ponta, L., Raberto, M., & Scalas, E. (2006). The waiting-time distribution of trading activity in a double auction artificial financial market. In The Complex Networks of Economic Interactions (pp. 239–247). Springer.

Cont, R. (2011). Statistical modeling of high-frequency financial data. IEEE Signal Processing Magazine, 16–25.

Cont, R., Stoikov, S., & Talreja, R. (2010). A stochastic model for order book dynamics. Operations research, 549–563.

Jiang, Z.-Q., Chen, W., & Zhou, W.-X. (2008). Scaling in the distribution of intertrade durations of chinese stocks. Physica A: Statistical Mechanics and its Applications, 5818–5825.

Mu, G.-H., Zhou, W.-X., Chen, W., & Kertész, J. (2010). Order flow dynamics around extreme price changes on an emerging stock market. New Journal of Physics, 075037.

Myung, I. J. (2003). Tutorial on maximum likelihood estimation. Journal of Mathematical Psychology, 90–100.

Nakagawa, T., & Osaki, S. (1975). The discrete weibull distribution. IEEE Transactions on Reliability, 300–301.

NASDAQ-OMX-Group (2015). NASDAQ TotalView-ITCH 5.0 Interface Specification.

Schroeck, G. (2002). Risk management and value creation in financial institutions volume 155. John Wiley & Sons.

Zovko, I., Farmer, J. D. et al. (2002). The power of patience: a behavioural regularity in limit-order placement. Quantitative finance, 387–392.

Appendix

Figure A.1: NPS_d(Daily) of three discrete distributions on Arrival Rates of Daily Limit Buy/Sell Orders

Figure A.2: NPS_d(Weekly) of three discrete distributions on Arrival Rates of Weekly Limit Buy/Sell Orders

Figure A.3: NPS_d(Hourly) of three discrete distributions on Arrival Rates of Hourly Limit Buy/Sell Orders

Figure A.4: NPS_d(Daily) of Discrete Weibull and theoretical distributions on Arrival Rates of Daily Limit Buy/Sell Orders

Figure A.5: NPS_d(Weekly) of Discrete Weibull and theoretical distributions on Arrival Rates of Weekly Limit Buy/Sell Orders

Figure A.6: NPS_d(Hourly) of Discrete Weibull and theoretical distributions on Arrival Rates of Hourly Limit Buy/Sell Orders