An Empirical Study on Arrival Rates of Limit Orders and Order Cancellation Rates in Borsa Istanbul
Can Yilmaz Altinigne, Harun Ozkan, Veli Can Kupeli, Zehra Cataltepe
Can Yilmaz Altinigne (a), Harun Ozkan (b), Veli Can Kupeli (b), Zehra Cataltepe (c)
(a) School of Computer and Communication Sciences, EPFL, Switzerland
(b) Matriks Bilgi Dağıtım Hizmetleri, 34396, Istanbul, Turkey
(c) Department of Computer Engineering, Faculty of Computer and Informatics Engineering, Istanbul Technical University, Istanbul, Turkey
Abstract
Order book dynamics play an important role in both the execution time and the price formation of orders in an exchange market. In this study, we aim to model the limit order arrival rates in the vicinity of the best bid and the best ask price levels. We use limit order book data for Garanti Bank, which is one of the most traded stocks in Borsa Istanbul. In order to model the daily, weekly, and monthly arrival of limit order quantities, three different discrete probability distributions are tested: Geometric, Beta-Binomial and Discrete Weibull. Additionally, two theoretical models, namely Exponential and Power law, are also tested. We aim to model the arrival rates in the first fifteen bid and ask price levels. We use L1 norms in order to calculate the goodness-of-fit statistics. Furthermore, we examine the structure of weekly and monthly mean cancellation rates in the first ten bid and ask price levels.

Keywords: Order arrival processes, Probability distribution fitting, Limit order book, Queueing systems

JEL: C46, C51
1. Introduction
One of the main research areas on high-frequency financial data is the investigation of the microstructural properties of stock markets. Generally, research in this area involves modeling the main characteristics of the limit order book (Cont et al. (2010); Bouchaud et al. (2002); Zovko et al. (2002)) and the behavior of stock traders around specific events (Mu et al. (2010)).

Modeling market elements such as the duration between two orders, order volumes and order arrivals using parametric statistical distributions can help to understand the structure of market dynamics. Exponential family distributions are widely used in modeling these types of exchange market elements (Cont et al. (2010); Jiang et al. (2008)). Alternatively, arrival rates of limit orders can be modeled using a power law (Bouchaud et al. (2002); Zovko et al. (2002)). By modeling the order dynamics in the market, we can gain some basic insight into the market microstructure. For this purpose, we aim to model the arrival rates of limit orders in the exchange market. Also, we intend to observe the statistical features of the cancellation rates (ratios of cancel orders to outstanding orders).

We use three well-known discrete statistical distributions for modeling the arrivals of orders: Discrete Weibull, Geometric and Beta-Binomial. In addition, we fit the Exponential distribution and the Power law on the same variables. We use L1 norms between the probability mass functions of the discrete distributions and the true arrival rates. For continuous distributions, we discretize the fit results by calculating the area below the probability density function. We compare the performance scores of different fits using Welch's t-test.

After completing the discrete and continuous distribution fits on the arrival rates of limit orders, we compare the best fitting discrete model, which is the Discrete Weibull model, with the theoretical models. We show that the performance of the Discrete Weibull model is three times better than that of the Exponential model in terms of L1 norms, and that it is very competitive against the Power law model suggested by Bouchaud et al. (2002). In this part of our research, we show that the arrival rates of limit orders in the vicinity of the best prices (15 ticks or less) can also be represented by discrete models.

In addition to the analysis of limit orders, we conduct research on cancel orders. We analyze the cancellation rates on a weekly and monthly basis. We consider the first 10 bid and ask price levels. We investigate whether the behavior of order cancellation rates changes with respect to different bid and ask price levels. We observe that the hypothesis that weekly and monthly mean order cancellation rates are consistent with a Uniform distribution cannot be statistically rejected.

Preprint submitted to Borsa Istanbul Review, September 19, 2019.
2. Background and Literature Review
In exchange markets, limit orders, market orders and cancel orders constitute the current market dynamics. Arriving limit orders create the limit order book. Buy and sell orders are placed in the limit order book according to their price and quantity. A limit order stays in the order book until a market order is executed against it or a cancel order removes it (Cont (2011)). Cancel orders delete limit orders from the book. Market orders execute against limit orders and carry out the buying and selling operations in the market. The highest price on the buy side is the bid price, and the lowest price on the sell side is the ask price. The prices on the buy (sell) side of the limit order book are arranged in descending (ascending) order. The mean of the bid and the ask price is the mid-price. During the continuous auction, which is the phase in which continuous trading occurs in the market, the ask price is always higher than the bid price. The difference between them is called the bid/ask spread (Cont et al. (2010)).
Figure 1: Example of a limit order book
An example of a limit order book is shown in Figure 1. In this limit order book, the bid price is 12.14 and the ask price is 12.17. Limit buy (sell) orders represent traders who would like to buy (sell) a quantity of a stock at the indicated price. If someone would like to buy (sell) a stock at a price equal to or higher (lower) than the ask (bid) price, the order is executed immediately after its submission (Cont (2011)).

The main part of our research concerns the modeling of arrival rates of limit orders. For limit buy (sell) orders, we consider the quantities on the left (right) side of the limit order book. The price levels in the vicinity of the best prices are called ticks. When we consider limit buy (sell) orders, we examine the order quantities in the ticks with respect to their distance to the best ask (bid) price (Cont et al. (2010)). For example, the price of 12.14 (12.17) is the first tick, and the price of 12.13 (12.18) is the second tick for the limit buy (sell) orders. As shown in this example, the tick values are assigned to limit buy (sell) orders with respect to the distance of the order price to the best ask (bid) price. Since a limit order book has a dynamic structure, the prices change continually during the day as a result of incoming limit orders and the execution of market and cancel orders. In our research, we model the order quantities in the first 15 ticks for both limit buy and limit sell orders.

The arrival of limit orders near the best prices is dense. The price of an order is a crucial criterion for order execution, since the distance to the current bid and ask price is correlated with the arrival rate of limit orders (Cont et al. (2010)). There are different approaches to describing the arrival rates of limit orders mathematically. In previous research on order placement strategies, Bouchaud et al. (2002) and Zovko et al. (2002) suggested that the arrival rates of limit orders $\Lambda(i)$ can be modeled with a power law:

$$\Lambda(i) = \frac{k}{i^{\alpha}} \tag{1}$$

In this equation, $i$ denotes the tick value. The value of $\Lambda(i)$ can be between 0 and $\infty$. The parameter $k$ is a positive real number, and $k$ and $\alpha$ can be estimated using a least squares fit as shown in Equation (2) (Cont et al. (2010)). The upper bound of the sum is 15, since we perform the modeling in the first 15 ticks. The observed arrival rates are denoted by $\hat{\Lambda}(i)$.

$$\min_{k,\alpha} \sum_{i=1}^{15} \left( \hat{\Lambda}(i) - \frac{k}{i^{\alpha}} \right)^{2} \tag{2}$$

In another study, a stochastic model consisting of independent Poisson processes is suggested by Cont et al. (2010). In this model, the arrival rates of limit orders are modeled in such a way that they are distributed exponentially.

Weibull and q-exponential distributions are used in an early work on modeling the duration between two successive transactions (Jiang et al. (2008)). In another work, the arrival of orders is represented as a renewal process where the waiting times between two successive orders are distributed according to a Weibull distribution (Cincotti et al. (2006)). We can infer that the duration between two consecutive orders would also differ from tick to tick, since the arrival rates of limit orders differ with respect to the distance to the best prices. As a result, because of its flexibility in modeling durations, we test the Weibull distribution in modeling the arrival rates of limit orders. However, we use the discrete variant of the Weibull distribution proposed by Nakagawa & Osaki (1975), since we perform an analysis on discrete distributions. The Discrete Weibull distribution has two real-valued parameters, $0 < q < 1$ and $\beta > 0$, with integer support on $[0, \infty)$. Its probability mass function is given in Equation (3) (Nakagawa & Osaki (1975)). We use the probability mass function shifted so that it starts from the lowest tick value of 1.

$$P(X = x; q, \beta) = q^{(x-1)^{\beta}} - q^{x^{\beta}}, \quad x = 1, 2, 3, \ldots \tag{3}$$

Beta family distributions are frequently used in finance. They are often used in modeling recovery rates from debts and credit risk (Chen & Wang (2013)). It is well known that high skewness is frequently observed in credit risk data (Schroeck (2002)). Similarly, arrival rates of limit orders have positive skewness, since the rates are much higher in the ticks that are close to the best prices. Because of its good performance on highly skewed data, we also test the Beta Binomial distribution, which is a discrete member of the Beta family, in modeling the arrival rates. The Beta Binomial distribution is defined by two real-valued parameters, $\alpha > 0$ and $\beta > 0$, and has finite integer support $\{0, 1, \ldots, n\}$. Its probability mass function for $n$ trials is given in Equation (4), where $B$ is the Beta function defined in Equation (5).

$$P(X = x; \alpha, \beta) = \binom{n}{x} \frac{B(x + \alpha,\; n - x + \beta)}{B(\alpha, \beta)} \tag{4}$$

$$B(u, v) = \int_{0}^{1} t^{u-1} (1 - t)^{v-1} \, dt, \quad \forall u, v > 0 \tag{5}$$

As indicated before, the Exponential distribution is also used for modeling the arrival rates of limit orders. In an early work on modeling the market dynamics, Cont et al. (2010) assumed in their stochastic model that limit orders arrive at tick $i$ from the best price with an exponential rate $\lambda(i)$. As a result, the Exponential distribution and the Geometric distribution, which is the discrete variant of the Exponential, are also used in our research. The Exponential distribution has one real-valued parameter $\lambda > 0$. Its probability density function is given below.

$$P(X = x; \lambda) = \lambda e^{-\lambda x}, \quad x \geq 0 \tag{6}$$

The Geometric distribution has one real-valued parameter $0 < p < 1$. Its probability mass function for $x$ trials is given below. We use the probability mass function of the Geometric distribution starting from the lowest tick value of 1.

$$P(X = x; p) = (1 - p)^{x-1} p, \quad x = 1, 2, 3, \ldots \tag{7}$$

Since most of the continuous and discrete distributions have support on the set $\{x : x \geq 0\}$, we treated our first tick rate as the zeroth tick and, therefore, shifted the fit results one tick to the right.

In an early work on cancel orders, Blanchet & Chen (2013) assumed that the cancellation rates are relatively higher in the ticks that are close to the best bid and ask prices than in distant ticks. Cont et al. (2010) assumed that the cancellation rates follow an Exponential distribution with respect to the distance to the best bid and ask prices, and that these rates are proportional to the limit orders at that level. Bouchaud et al. (2018) also suggested that the cancellation rates are proportional to the arrival rates of limit orders, with the assumption that the activity is much higher in the area that has high arrival rates of limit orders.
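The least squares problem in Equation (2) can be sketched in a few lines. This is only an illustration, not the authors' code: it assumes the decreasing form $\Lambda(i) = k / i^{\alpha}$ used by Cont et al. (2010), and the function name `fit_power_law` and its search bounds are hypothetical. For a fixed $\alpha$ the optimal $k$ has a closed form, so only $\alpha$ needs a one-dimensional search.

```python
def fit_power_law(rates, alpha_lo=0.0, alpha_hi=5.0, iters=100):
    """Least-squares fit of Lambda(i) = k / i**alpha to rates at ticks 1..n.

    Assumption: the squared error, minimised over k, has a single
    minimum in alpha inside [alpha_lo, alpha_hi], so a ternary search
    is sufficient. Real data may need a more careful optimiser.
    """
    ticks = range(1, len(rates) + 1)

    def best_k(alpha):
        # Closed-form least-squares solution for k when alpha is fixed.
        basis = [i ** -alpha for i in ticks]
        return sum(r * b for r, b in zip(rates, basis)) / sum(b * b for b in basis)

    def sse(alpha):
        # Sum of squared residuals of Equation (2) at the best k.
        k = best_k(alpha)
        return sum((r - k * i ** -alpha) ** 2 for r, i in zip(rates, ticks))

    lo, hi = alpha_lo, alpha_hi
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if sse(m1) < sse(m2):
            hi = m2
        else:
            lo = m1
    alpha = (lo + hi) / 2
    return best_k(alpha), alpha, sse(alpha)
```

Calling `fit_power_law(observed_rates)` on a list of 15 observed rates returns the estimated `k`, `alpha` and the residual sum of squares.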
3. Materials and method
The raw market data consist of network-captured MoldUDP packets containing ITCH® messages. ITCH® is a NASDAQ market data protocol that includes all orders at nanosecond resolution (NASDAQ-OMX-Group (2015)). We use Garanti Bank stock data from Borsa Istanbul. The data span 40 trading days from August 1, 2017 to September 29, 2017, and we sample 228 instances which contain the rates of daily, weekly and monthly arrived limit orders to analyze.
We extract the limit order quantities that arrived within the first 15 ticks of the best prices. Some quantities arrived beyond the first 15 ticks, but they were small compared with the quantities in the first 15 ticks, so we omitted them. As a result, we perform discrete and continuous fits on the arrival rates $\lambda_t(i)$, which are computed from the limit order quantities $Q_t(i)$ arrived at the $i$-th tick in time instance $t$.

$$\lambda_t(i) = \frac{Q_t(i)}{\sum_{j=1}^{15} Q_t(j)}, \quad i = 1, \ldots, 15 \tag{8}$$

We split the data into four different time groups. These timesteps are the daily average of limit buy/sell quantities (40 days), the weekly average of limit buy/sell quantities (9 weeks), the monthly average of limit buy/sell quantities (2 months) and the hourly average of limit buy/sell quantities in 9 weeks. We consider the market working hours from 10 am to 1 pm and from 2 pm to 6 pm, which gives 7 different hourly timesteps in a day.

Consequently, we obtain 40 instances for daily data, 9 instances for weekly data, 2 instances for monthly data and 63 instances for hourly-weekly data. Since we consider both limit buy and limit sell orders, we have 228 different instances on which to perform discrete and continuous fits. Using three discrete distributions, Discrete Weibull, Beta-Binomial and Geometric, we perform fits on the limit buy/sell order quantities that arrived within the first 15 ticks of the best prices. We use the Exponential distribution as a continuous model (Cont et al. (2010)). We also compare the performance of the best discrete fit with the Exponential fits and with the Power law fits proposed by Bouchaud et al. (2002) and Zovko et al. (2002), in order to examine whether a discrete approach can compete with the approaches suggested in previous works.

Maximum likelihood estimation finds the parameters that maximize the joint probability density function of the data (the likelihood). Since maximizing a product is cumbersome, the logarithm of the likelihood function is generally considered instead (Myung (2003)).
The approach of maximum likelihood estimation is shown in Equations (9) and (10). In these equations, $\theta$ is the parameter vector of the model, and $x^n$ is the data.

$$\text{Likelihood}(\theta) = p(x^n \mid \theta) = \prod_{i=1}^{n} P(x_i \mid \theta), \quad x^n = \{x_1, x_2, \ldots, x_n\} \tag{9}$$

$$\hat{\theta} = \arg\max_{\theta} \; p(x^n \mid \theta) \tag{10}$$

We did not use any R functions to estimate the parameters of the Exponential and Geometric distributions. Taking the derivative of the logarithm of the likelihood function and setting it to zero gives the maximum likelihood estimate of the parameter $\lambda$ of the Exponential distribution and of the parameter $p$ of the Geometric distribution. For the Exponential distribution, for instance, the log-likelihood is $n \log \lambda - \lambda \sum_{i=1}^{n} x_i$, whose derivative $n/\lambda - \sum_{i=1}^{n} x_i$ vanishes at the estimate in Equation (12).

$$P(X = x; \lambda) = \lambda e^{-\lambda x}, \quad x \geq 0 \tag{11}$$

$$\hat{\lambda}, \hat{p} = \frac{n}{\sum_{i=1}^{n} x_i}, \quad x^n = \{x_1, x_2, \ldots, x_n\} \tag{12}$$

The estimated parameter of the Geometric distribution is found using the same equation. Because the Geometric distribution is the discrete variant of the Exponential distribution, the only difference is that the Geometric distribution has integer $x^n$ values. The estimation of the parameters of both distributions is given in Equation (12).

In order to compare the performance of the models, we consider the sum of L1 norms between the real values and the fit results. The sum of the absolute differences between the observed densities and the fit results is taken as the error term. The error term at timestep $t$ is shown below.

$$\text{Error} = \sum_{i=1}^{15} \left| \lambda_t(i) - \hat{\lambda}_t(i) \right| \tag{13}$$

We analyze the number of arrived cancel orders around the best price and the ratio of cancel orders in the vicinity of the best prices. The ratios and numbers of cancel orders are considered on an average weekly and monthly basis. For each cancel order, we take the ratio of its quantity to the total quantity in that tick just before the cancel order arrives. Then we sum these ratios on the monthly and weekly basis and divide by the number of cancel orders that arrive in the particular tick in that period.
We denote the cancel order ratio in tick $i$ with $k$ arrived cancel orders in timestep $t$ by $C_t(i)$.

$$C_t(i) = \frac{\displaystyle\sum_{n=1}^{k} \frac{\text{Canceled quantity in tick } i \text{ with order } p_n}{\text{Total quantity in tick } i \text{ before order } p_n}}{\text{Number of cancel orders arrived in tick } i \text{ in timestep } t} \tag{14}$$

We compare our experiments on the number of cancel orders arrived in the vicinity of the best bid and ask prices, and on the behavior of the ratios of cancel orders with respect to the distance to the best bid and ask prices, with previous works on cancel orders (Cont et al. (2010); Blanchet & Chen (2013); Bouchaud et al. (2018)).
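The cancellation ratio $C_t(i)$ of Equation (14) can be illustrated with a short sketch. The input layout (one tuple per cancel order) and the function name `mean_cancel_ratios` are assumptions made for this example and are not taken from the ITCH data format; skipping records with no outstanding quantity is likewise an assumption.

```python
from collections import defaultdict


def mean_cancel_ratios(cancels):
    """Mean canceled fraction per tick within one timestep t.

    cancels: iterable of (tick, canceled_qty, total_qty_before) tuples,
    one per cancel order (hypothetical layout).
    Returns {tick: C_t(tick)} as in Equation (14).
    """
    ratio_sum = defaultdict(float)
    count = defaultdict(int)
    for tick, canceled_qty, total_qty_before in cancels:
        if total_qty_before <= 0:
            continue  # assumption: ignore records with no outstanding quantity
        ratio_sum[tick] += canceled_qty / total_qty_before
        count[tick] += 1
    # Divide the summed ratios by the number of cancel orders in each tick.
    return {tick: ratio_sum[tick] / count[tick] for tick in count}
```

With two cancels of 50 out of 200 and 100 out of 400 in tick 1, the ratio for tick 1 is the mean of 0.25 and 0.25.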
4. Results
In order to measure the performance of the discrete fits, we give each distribution fit a performance score. The performance score is the ratio of the error of a distribution fit to the minimum fit error on that instance. As a result, this ratio is greater than or equal to 1. If a distribution has the best fit, its score is 1, and the closer a score is to 1, the better the fit. The normalized performance score of a distribution $d$ for instance $i$ is given below.

$$NPS_d(i) = \frac{\text{Error of distribution } d \text{ on instance } i}{\text{Minimum error among the 3 distributions on instance } i} \tag{15}$$

We calculate the mean and standard deviation of the performance scores of the three distributions in hourly, daily, weekly and monthly fits, and decide which distribution has the best fit in a particular timestep. The limit buy and limit sell fits of the Geometric, Beta-Binomial and Discrete Weibull distributions for the last 12 days (from Day 29 to Day 40) can be seen in Figure 2. There are fits for 40 days, but to save space we only show the fits of the last 12 days.

Day 29 and Day 30 are Thursday, September 14th and Friday, September 15th respectively. Day 31 to Day 35 is the week starting on Monday, September 18th. Day 36 to Day 40 is the week starting on Monday, September 25th. Since some of the continuous and discrete probability distributions that we use have a support from 0 to infinity, we divide the probability mass values by the sum of all probabilities from the first tick to the 15th tick for normalization. It is striking to observe that the Discrete Weibull distribution has the best fits for 75 instances out of 80. Beta Binomial outperforms Discrete Weibull for only 5 instances, and the fits of the Geometric distribution are 3 to 4 times worse than those of both the Discrete Weibull and Beta-Binomial distributions.

Normalized performance scores of each model in different timesteps can be observed in the charts in the Appendix. The performance of the model increases as the cell color gets lighter and the score gets closer to 1.

The limit buy and limit sell fits of the Geometric, Beta-Binomial and Discrete Weibull distributions from Week 1 to Week 9 can be seen in Figure 3 and Figure 4 below.

Figure 2: The upper figure shows Daily Limit Buy orders and the lower figure shows Daily Limit Sell orders

Figure 3: Discrete Fits on Weekly Limit Buy Arrival Rates

We observe that the Discrete Weibull distribution has the best fits for all of the weekly instances. Beta Binomial has a fit performance close to that of Discrete Weibull. The Geometric distribution is 3 to 4 times worse than both the Discrete Weibull and Beta-Binomial distributions on average.

Geometric, Beta-Binomial and Discrete Weibull fits on the monthly arrival rates of limit orders can be observed in Figure 5. We observe that the Discrete Weibull distribution has the best fits for all of the monthly instances. The Geometric distribution is 3 to 4 times worse than the others on average.

Geometric, Beta-Binomial and Discrete Weibull fits on the arrival rates of limit orders in hourly timesteps on different weeks can be observed in Figure 6 and Figure 7. Since there are 63 instances for this timestep, we only show the fits on arrival rates in the first 3 weeks and the first 3 hours.

Figure 4: Discrete Fits on Weekly Limit Sell Arrival Rates

Figure 5: Discrete Fits on Monthly Limit Buy (above) and Sell (below) Arrival Rates
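The scoring pipeline described above (arrival rates as in Equation (8), renormalization over the first ticks, the L1 error of Equation (13) and the normalized performance score of Equation (15)) can be sketched as follows. The function names and all parameter values are hypothetical and chosen only for illustration; no parameter fitting is performed here.

```python
def arrival_rates(quantities):
    """Equation (8): share of the total quantity arriving at each tick."""
    total = sum(quantities)
    return [q / total for q in quantities]


def discrete_weibull_pmf(x, q, beta):
    """Equation (3), shifted so that the support starts at tick 1."""
    return q ** ((x - 1) ** beta) - q ** (x ** beta)


def geometric_pmf(x, p):
    """Equation (7)."""
    return (1 - p) ** (x - 1) * p


def truncated(pmf, n_ticks=15, **params):
    """Renormalise a pmf over ticks 1..n_ticks so the masses sum to 1."""
    masses = [pmf(x, **params) for x in range(1, n_ticks + 1)]
    total = sum(masses)
    return [m / total for m in masses]


def l1_error(observed, fitted):
    """Equation (13): sum of absolute differences."""
    return sum(abs(o - f) for o, f in zip(observed, fitted))


def normalized_scores(errors):
    """Equation (15). errors maps a distribution name to its L1 error.

    Assumes the smallest error is strictly positive.
    """
    best = min(errors.values())
    return {name: err / best for name, err in errors.items()}
```

A useful sanity check is that the shifted Discrete Weibull with $\beta = 1$ reduces to the Geometric distribution with $p = 1 - q$, so the two pmf functions should agree in that case.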
In order to compare the performance of the discrete fits, we use the means of the performance scores of the three distributions in different timesteps. Welch's t-test is used to decide which discrete distribution has the best fits on the arrival rates of limit orders, that is, to check whether there is a significant difference between the mean performance scores.

Figure 6: Discrete Fits on Hourly Limit Buy Arrival Rates

Figure 7: Discrete Fits on Hourly Limit Sell Arrival Rates

Timestep | Geometric | Discrete Weibull | Beta-Binomial
Daily Limit Buy | 4.042 ± 2.115 | 1.000 ± 0.000 | 1.262 ± 0.144
Daily Limit Sell | 3.144 ± 1.320 | 1.033 ± 0.126 | 1.214 ± 0.162
Weekly Limit Buy | 4.285 ± 0.779 | 1.000 ± 0.000 | 1.363 ± 0.106
Weekly Limit Sell | 3.183 ± 0.692 | 1.000 ± 0.000 | 1.237 ± 0.070
Monthly Limit | 3.789 ± 0.748 | 1.000 ± 0.000 | 1.312 ± 0.113
Hourly Limit Buy | 3.643 ± 1.760 | 1.012 ± 0.057 | 1.243 ± 0.160
Hourly Limit Sell | 2.895 ± 1.125 | 1.026 ± 0.083 | 1.153 ± 0.103

Table 1: Mean NPS of Discrete Fits on Different Timesteps

It can be observed in Table 1 that the performance scores of the Geometric fits are 3 to 4 times higher than those of the Discrete Weibull and Beta Binomial fits. As a result, we can say that the Geometric fits have the worst performance among the three distributions. In order to find the best fits, we perform Welch's t-test between the Discrete Weibull and Beta Binomial fits. The resulting p-values are given in Table 2.

Timestep | p-value
Daily Limit Buy | 0.011
Daily Limit Sell | 0.033
Weekly Limit Buy | 0.003
Weekly Limit Sell | 0.020
Monthly Limit | 0.061
Hourly Limit Buy | 0.092
Hourly Limit Sell | 0.039

Table 2: Welch's t-test results between Discrete Weibull and Beta-Binomial on Different Timesteps

We choose a 95% confidence level for the t-tests. For 5 of the 7 timesteps the p-value is below 0.05, so we can reject the null hypothesis that the mean performance scores of Discrete Weibull and Beta Binomial are equal. As a result, Discrete Weibull has significantly better fits than Beta Binomial in Daily Limit Buy, Daily Limit Sell, Weekly Limit Buy, Weekly Limit Sell and Hourly Limit Sell orders, since it has smaller means. For Monthly Limit and Hourly Limit Buy orders, there is no significant difference between the Discrete Weibull fits and the Beta Binomial fits.

We use the least squares approach for power law parameter estimation, since it is suggested by Bouchaud et al. (2002). The error measure, the timesteps and the comparison of performance scores are the same as in the discrete fits. We discretize the Exponential fits by using the area under the probability density function. We divide the x-axis into 15 equal parts and obtain the densities for the 15 ticks by calculating the areas under the probability density function. The limit buy and limit sell fits of the Exponential, Power law and Discrete Weibull distributions from Day 29 to Day 40 can be seen in Figure 8 and Figure 9.
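The discretization of the Exponential fit can be illustrated with a small sketch. The text does not state the bin edges, so the sketch assumes unit-width bins with tick $i$ covering $[i-1, i)$ and renormalizes the 15 masses to sum to 1; the function name `discretized_exponential` is hypothetical.

```python
import math


def discretized_exponential(lam, n_ticks=15):
    """Mass of an Exponential(lam) density in n_ticks unit-width bins.

    Assumption: bin i covers [i - 1, i), so tick 1 maps to [0, 1).
    The masses are renormalised to sum to 1 over the n_ticks bins.
    """
    def cdf(x):
        return 1.0 - math.exp(-lam * x)

    # Area under the density in each bin is a difference of CDF values.
    masses = [cdf(i) - cdf(i - 1) for i in range(1, n_ticks + 1)]
    total = sum(masses)
    return [m / total for m in masses]
```

Under this particular binning assumption, consecutive masses decay by the constant factor $e^{-\lambda}$, so a different choice of bin edges would give different discretized values.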
Figure 8: Exponential, Power law and Discrete Weibull fits on Daily Limit Buy Arrival Rates from Day 29 to Day 40
We observe that the Discrete Weibull distribution and the Power law have the best fits for most of the instances, and that their performance is very close. On the other hand, the Exponential distribution is 2 to 3 times worse than both the Discrete Weibull distribution and the Power law. The limit buy and limit sell fits of the Exponential, Power law and Discrete Weibull distributions from Week 1 to Week 9 can be seen in Figure 10 and Figure 11 below.

We observe that the Discrete Weibull distribution has the best fits for most of the weekly instances. The Power law has a fit performance close to that of the Discrete Weibull distribution. Not surprisingly, the Exponential distribution, which is more parsimonious in its number of parameters, performed much worse than both Discrete Weibull and Power law on average.

Figure 9: Exponential, Power law and Discrete Weibull fits on Daily Limit Sell Arrival Rates from Day 29 to Day 40

Figure 10: Exponential, Power law and Discrete Weibull fits on Weekly Limit Buy Arrival Rates
Exponential, Power law and Discrete Weibull fits on the monthly arrival rates of limit orders can be observed in Figure 12.

We observe that the Discrete Weibull distribution has the best fits for 3 out of 4 monthly instances. The Exponential distribution is 2 times worse than both Discrete Weibull and Power law on average.

Figure 11: Exponential, Power law and Discrete Weibull fits on Weekly Limit Sell Arrival Rates

Figure 12: Exponential, Power law and Discrete Weibull fits on Monthly Limit Orders Arrival Rates. (a) Limit Buy Orders (b) Limit Sell Orders
In order to compare the performance of Exponential, Power law and Discrete Weibull, we again use the means of the performance scores of the three distributions in different timesteps. Welch's t-test is used to decide which distribution has the best fits on the arrival rates of limit orders. Average performance scores of the Discrete Weibull, Exponential and Power law fits on different timesteps are given in Table 3.

Timestep | Exponential | Discrete Weibull | Power law
Daily Limit Buy | 2.674 ± 1.346 | 1.098 ± 0.198 | 1.154 ± 0.222
Daily Limit Sell | 2.018 ± 1.071 | 1.059 ± 0.104 | 1.128 ± 0.167
Weekly Limit Buy | 2.515 ± 0.651 | 1.165 ± 0.191 | 1.051 ± 0.066
Weekly Limit Sell | 1.771 ± 0.471 | 1.004 ± 0.014 | 1.076 ± 0.060
Monthly Limit | 2.038 ± 0.394 | 1.063 ± 0.127 | 1.048 ± 0.062
Hourly Limit Buy | 2.624 ± 1.279 | 1.094 ± 0.126 | 1.163 ± 0.268
Hourly Limit Sell | 1.918 ± 0.753 | 1.056 ± 0.074 | 1.128 ± 0.168

Table 3: Mean NPS of Proposed Distributions on Different Timesteps

The Exponential fits evidently have the worst performance among the three distributions. On the other hand, as Power law and Discrete Weibull have very close mean performance scores, we perform a t-test to check whether there is a significant difference between those values. The resulting p-values are given in Table 4.

Timestep | p-value
Daily Limit Buy | 0.363
Daily Limit Sell | 0.326
Weekly Limit Buy | 0.343
Weekly Limit Sell | 0.464
Monthly Limit | 0.973
Hourly Limit Buy | 0.743
Hourly Limit Sell | 0.242

Table 4: Welch's t-test results between Discrete Weibull and Power law

For all of the timesteps, the p-values are higher than 0.05, so we cannot reject the null hypothesis that the mean performance scores of Discrete Weibull and Power law are equal. We can therefore say that the performance of the Power law and Discrete Weibull fits is not significantly different in any timestep. Consequently, Discrete Weibull models can compete with the Power law models proposed by Bouchaud et al. (2002) and Zovko et al. (2002). We can use the Discrete Weibull distribution to model the arrival rates of limit orders accurately with respect to the distance to the best prices.

We analyze the number of cancel orders and the ratio of canceled orders in the vicinity of the best prices. We consider both cancel buy orders and cancel sell orders on a weekly and monthly basis. In previous works, Blanchet & Chen (2013) noted that the cancellation activity is much higher in the regions close to the best bid and ask prices. The numbers of cancel orders arrived at the first 10 ticks on a weekly basis in our experiments are shown in Figure 13 and Figure 14.

When we measure the cancel activity as the number of cancel orders arrived, the results are consistent with the previous works. It can be observed that the number of cancel orders arrived in the regions close to the best prices is higher. We also consider the average ratio of canceled order quantity in the ticks. We use the metric defined in Section 3.3 to compute the ratios. The ratios of canceled order quantities in the first 10 ticks on a weekly basis in our experiments are shown in Figure 15 and Figure 16. The red and blue lines in the figures indicate the expected values.
Figure 13: Number of Cancel Buy orders arrived in the vicinity of the best ask price on weekly basis

Figure 14: Number of Cancel Sell orders arrived in the vicinity of the best bid price on weekly basis

Figure 15: Ratios of canceled buy order quantities in the vicinity of the best ask price on weekly basis, Red line is the average ratio

Figure 16: Ratios of canceled sell order quantities in the vicinity of the best bid price on weekly basis, Blue line is the average ratio

Figure 17: Ratios of canceled buy order quantities in the vicinity of the best ask price on monthly basis, Red line is the average ratio

Figure 18: Ratios of canceled sell order quantities in the vicinity of the best bid price on monthly basis, Blue line is the average ratio
The ratios were close to each other most of the time. Thus we perform uniformity tests on the ratios. We use the Chi-Square test to check whether the ratios are distributed uniformly. The Chi-Square test statistic is given in Equation (16), where $c$ denotes the degrees of freedom. In our experiment, we have 10 different categories (because of the 10 ticks). Since the degrees of freedom equal the number of categories minus 1, the degrees of freedom value is 9 for our experiments.

$$\chi^2_c = \sum_{i=1}^{10} \frac{(\text{Observed}_i - \text{Expected}_i)^2}{\text{Expected}_i} \tag{16}$$

Chi-Square tests use count data as observed data. Therefore we convert the cancellation ratios to integer values by multiplying them by 100. Then, we test the null hypothesis that the ratios of canceled orders are consistent with a Uniform distribution. Chi-Square test statistics and the corresponding p-values are given in Table 5 and Table 6.
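The uniformity check can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the scaled ratios are rounded to the nearest integer, the expected count is taken as the mean of the observed counts, and instead of a p-value the statistic is compared with 16.919, the upper 5% critical value of the Chi-Square distribution with 9 degrees of freedom. The function name and the sample ratios are hypothetical.

```python
CRITICAL_95_DF9 = 16.919  # upper 5% point of chi-square with 9 degrees of freedom


def chi_square_uniform(ratios, scale=100):
    """Chi-square statistic of scaled ratios against a uniform expectation.

    Assumptions: ratios are converted to counts by rounding ratio * scale,
    and every category shares the same expected count (the mean count).
    """
    observed = [round(r * scale) for r in ratios]
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


# Hypothetical cancellation ratios for ticks 1..10 (not taken from the data).
example_ratios = [0.30, 0.28, 0.33, 0.31, 0.29, 0.30, 0.32, 0.27, 0.30, 0.30]
statistic = chi_square_uniform(example_ratios)
reject_uniformity = statistic > CRITICAL_95_DF9
```

For these made-up ratios the statistic is far below the critical value, so uniformity would not be rejected at the 5% level.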
Table 5: Comparison between the ratios of canceled orders and the uniform distribution: Chi-Square Test Statistics on Different Timesteps (columns: Timestep, Cancel Buy, Cancel Sell)

Table 6: Comparison between the ratios of canceled orders and the uniform distribution: Corresponding p-values on Different Timesteps (columns: Timestep, Cancel Buy, Cancel Sell)

We use a 95% confidence level for the tests. The Chi-Square tests for uniformity yield p-values higher than 0.05 in all of the instances except for the ratios of cancel sell orders in the 5th week. Consequently, we cannot reject the null hypothesis that the cancellation rates are consistent with a Uniform distribution for most of the instances.

5. Conclusion and Future Work

In this research, we used different statistical distributions to fit the limit order quantities arrived in the vicinity of the best bid and ask prices. The fits are made on Garanti Bank stock data for the period from August to September 2017. We analyzed the daily, weekly and monthly mean limit order quantities arrived at the first 15 levels from the best prices. Also, we considered the weekly mean quantities of limit orders arrived in 7 different time intervals in a day. We used the total sum of L1 norms between the empirical density and the fit results in the first 15 price levels of the bid and ask prices to evaluate the goodness of fit.

We observed that the Discrete Weibull and Beta Binomial distributions are almost 4 times better at fitting the order quantity data than the Geometric distribution. We had 228 instances to fit, and Discrete Weibull has the lowest L1 norm for 210 of them. Beta Binomial fits the data with the lowest L1 norm for 17 of the instances, and the Geometric distribution has the best fit for only one of the instances. Additionally, we used the Exponential distribution to fit the same 228 instances. We found the probability mass values by calculating the areas of 15 bins under the Exponential probability density function. Then we obtained the sum of L1 norms between the empirical density and the 15 probability mass values, and compared the goodness of the Exponential distribution fits with that of the Discrete Weibull distribution fits.
We observed that the Discrete Weibull distribution fits the daily, weekly and monthly mean quantities two times better than the Exponential distribution. Also, Discrete Weibull fits are competitive with the Power law fits proposed in earlier works.

We analyzed the weekly and monthly mean ratios of canceled orders in the first 10 price levels and conducted Chi-Square tests for uniformity. We could not reject the hypothesis that the cancellation rates are consistent with the Uniform distribution. As a result, we found that the assumption made by Cont et al. (2010) that cancellation rates are distributed exponentially cannot be adopted for Turkish markets.

Our dataset is quite small, since it contains only 2 months of data for a single stock, Garanti Bank. In future work, the same experiments can be conducted on larger datasets covering, for example, 6 months or 1 year. Moreover, stock data from other companies can be considered, and the relation between different stocks would be another interesting extension. Additionally, the relation between stock prices and arrival rates can be examined and predicted.

References
Blanchet, J., & Chen, X. (2013). Continuous-time modeling of bid-ask spread and price dynamics in limit order books. arXiv preprint arXiv:1310.1103.
Bouchaud, J.-P., Bonart, J., Donier, J., & Gould, M. (2018). Trades, quotes and prices: financial markets under the microscope. Cambridge University Press.
Bouchaud, J.-P., Mézard, M., Potters, M. et al. (2002). Statistical properties of stock order books: empirical results and models. Quantitative finance, 251–256.
Chen, R., & Wang, Z. (2013). Curve fitting of the corporate recovery rates: The comparison of beta distribution estimation and kernel density estimation. PloS one, e68238.
Cincotti, S., Focardi, S. M., Ponta, L., Raberto, M., & Scalas, E. (2006). The waiting-time distribution of trading activity in a double auction artificial financial market. In The Complex Networks of Economic Interactions (pp. 239–247). Springer.
Cont, R. (2011). Statistical modeling of high-frequency financial data. IEEE Signal Processing Magazine, 16–25.
Cont, R., Stoikov, S., & Talreja, R. (2010). A stochastic model for order book dynamics. Operations research, 549–563.
Jiang, Z.-Q., Chen, W., & Zhou, W.-X. (2008). Scaling in the distribution of intertrade durations of Chinese stocks. Physica A: Statistical Mechanics and its Applications, 5818–5825.
Mu, G.-H., Zhou, W.-X., Chen, W., & Kertész, J. (2010). Order flow dynamics around extreme price changes on an emerging stock market. New Journal of Physics, 075037.
Myung, I. J. (2003). Tutorial on maximum likelihood estimation. Journal of Mathematical Psychology, 90–100.
Nakagawa, T., & Osaki, S. (1975). The discrete Weibull distribution. IEEE Transactions on Reliability, 300–301.
NASDAQ-OMX-Group (2015). NASDAQ TotalView-ITCH 5.0 Interface Specification.
Schroeck, G. (2002). Risk management and value creation in financial institutions, volume 155. John Wiley & Sons.
Zovko, I., Farmer, J. D. et al. (2002). The power of patience: a behavioural regularity in limit-order placement. Quantitative finance, 387–392.

Appendix

Figure A.1: NPS_d(Daily) of three discrete distributions on Arrival Rates of Daily Limit Buy/Sell Orders
Figure A.2: NPS_d(Weekly) of three discrete distributions on Arrival Rates of Weekly Limit Buy/Sell Orders
Figure A.3: NPS_d(Hourly) of three discrete distributions on Arrival Rates of Hourly Limit Buy/Sell Orders
Figure A.4: NPS_d(Daily) of Discrete Weibull and theoretical distributions on Arrival Rates of Daily Limit Buy/Sell Orders
Figure A.5: NPS_d(Weekly) of Discrete Weibull and theoretical distributions on Arrival Rates of Weekly Limit Buy/Sell Orders
Figure A.6: NPS_d(Hourly) of Discrete Weibull and theoretical distributions on Arrival Rates of Hourly Limit Buy/Sell Orders