Matching in size: How market impact depends on the concentration of trading

Abstract

We show that filling an order with a large number of distinct counterparts incurs additional market impact, as opposed to filling the order with a small number of counterparts. For best execution, therefore, it may be beneficial to opportunistically fill orders with as few counterparts as possible in Large-in-scale (LIS) venues. This article introduces the concept of concentrated trading, a situation that occurs when a large fraction of buying or selling in a given time period is done by one or a few traders, for example when executing a large order. Using London Stock Exchange data, we show that concentrated trading suffers price impact in addition to impact caused by (smart) order routing. However, when matched with similarly concentrated counterparts on the other side of the market, the impact is greatly reduced. This suggests that exposing an order on LIS venues is expected to result in execution performance improvement.



Ilija I. Zovko

Aspect Capital, London

In this paper we provide empirical evidence that large orders suffer greater market impact when matched with multiple trade counterparts than when matched with one or a small number of counterparts. This provides some support for using large in scale (LIS) venues to improve execution performance. Filling an order on a venue in which a trader is able to minimise the number of counterparts to match with, all other things being equal, is expected to suffer less price impact than splitting the order and matching with many counterparts on anonymous venues.

To reach this conclusion we introduce the concept of concentration of trading and study its effect on market impact. Trading is concentrated if a disproportionate volume share of buying or selling in a period of time is done by a few trading firms. It is dilute (i.e. not concentrated) if trading is spread across many firms. Either buyers or sellers can be concentrated, typically as a consequence of a small number of firms executing large orders.

Using a unique (but unfortunately somewhat dated) dataset from the London Stock Exchange (LSE) we are able to reconstruct the degree of concentration in buying and selling, and investigate its influence on resulting price moves. This is made possible by the record of trade matching details identifying LSE member firms participating in each trade.

Price impact, of course, is to a large degree influenced by the details of order routing, as implemented by, say, a Smart Order Router (SOR). The SOR decisions themselves can be influenced by the size of the order or the trading participation levels. Therefore, in order to isolate the effect of concentration, in our analysis we control for orderflow effects. We take into account the most common orderflow characteristics, such as the number of buyer or seller initiated trades and volumes (i.e., passive and aggressive orders), and the total number of firms buying or selling.

After adjusting for the effects of orderflow on price impact, we still find that concentrated trading results in larger impact than dilute trading.

As a practical and direct test of our findings, we show that large concentrated orders suffer noticeably less market impact in circumstances when they are matched with similarly concentrated trade counterparts on the other side of the market. This suggests that opportunistically trading large orders in size may improve price slippage.

The results presented in this paper imply that while the order routing decisions remain a key determinant of execution outcomes, the number of counterparts one trades with similarly influences performance. The fewer the number of counterparts, all other things being equal, the better the performance.

Algo execution | Market impact | LIS

1. Introduction

With the introduction of MiFID II regulation, there has been an emergence of trading venues dedicated to matching in size, i.e., venues with mechanisms that facilitate finding counterparts to fill an order in as few clips as possible, or ideally in its entirety.

While intuitively sound, the performance benefits of matching in size, in comparison to splitting an order into clips and working it on anonymous venues, are less obvious. We address this question by introducing the concept of concentration of trading and investigating its influence on price returns, and consequently execution performance. We start with the trivial observation that the total number of shares bought must be equal to the total number of shares sold. It is how those shares are partitioned between the individual traders that determines trading concentration.

Concentrated buying describes a situation where a large proportion of shares are bought by a single or a few exchange member firms. On the other hand, dilute buying (a term we use to describe a lack of concentration) describes a situation where the shares bought are spread over a larger group of firms, all contributing comparable amounts to the total. Concentrated and dilute selling are defined analogously for sellers in the market. It is straightforward to see that, say, a large buy order will contribute to the concentration of the buying side of the market.

In the paper we present evidence that concentrated trading produces larger price impact than dilute trading. This means that a large order, when being matched with multiple small orders, adversely moves the market and suffers price slippage. Correspondingly, we find that when a large order is matched with a similarly large order, price impact is significantly reduced.∗

∗ In the paper, we use the term 'order' to denote the aggregate buying or selling by a single exchange member firm. We do not have data granular enough to recover orders originating from different trading desks, or strategies. However, one can argue that for the purposes of execution, it is the aggregate orderflow that is important anyway.

As mentioned, concentration is closely related to order size: large orders tend to result in concentrated trading. Order size, on the other hand, tends to influence order routing and orderflow. It is well known that aspects of orderflow strongly affect price impact, with aggressive and passive order placement being the most prominent example (1–12). Therefore, in order to investigate the effect of concentration on price impact, we must take into account the known price effects


of order routing and orderflow characteristics:
• the number and GBP volume of aggressive and passive trades, and
• the number of firms buying vs. firms selling.

Concentration is not an instantaneous market property. Like volatility, it is an aggregate property of trading in a time window, for example a day or an hour. We have chosen to perform the analysis on daily intervals. However, the choice of aggregation remains an open question. Instead of calendar time, one can use an aggregation scheme resulting from a fixed number of trades (so-called trade time), or from a fixed number of shares traded (volume time). In addition to the daily aggregation, we have also run the analysis on hourly intervals and on intervals resulting from a fixed number of trades. The main conclusions from the daily analysis are roughly borne out on the other tested aggregation schemes. While interesting in its own right, a complete study of the differences between timescales is out of scope of our current work.

The analysis is based on a somewhat dated trades and quotes dataset from the on-book trading of the London Stock Exchange (LSE) between 2000 and 2002. What makes this dataset unique is that each trade record contains codes identifying the stock exchange member firms that participated in the trade.† While a member firm can make a trade either in a principal (for their own account) or agent capacity, the data shows that in more than 95% of trades the member firms act as principal to the transaction.

We base the analysis on 32 months (674 full trading days) of data on 46 stocks. Excluding from the dataset days in which a particular share traded fewer than 500 times yields a sample of roughly 20 million trades. For this set we have all the trade and quote details, including trade matching details identifying the participants in each trade. The data period is from the beginning of May 2000 until the end of December 2002. During this period, there were on average between 33 and 72 member firms trading on the on-book market of the LSE each day. The average number of firms trading increases during the period and in general there are more firms trading the more active stocks. The typical number of trades is several thousand per day for active stocks. The LSE is open for trading from 8:00 to 16:30 but we have discarded data from the first and last half hour of trading to avoid possible opening or closing anomalies.

2. Concentration of trading

Concentrated trading is a situation where a single or a few firms contribute a disproportionately large share of the buying or selling in a time period. For example, a firm executing an order with a high percentage of volume (POV) algorithm will contribute to trading concentration. Alternatively, concentration can also be a consequence of a disproportionately large order being worked in the market.

Figure 1 illustrates the principal question of the paper: do the three idealised examples have different impact on price movement, once we take into account the different orderflow characteristics expected from trading large orders?

† Member firm codes do not identify the firms by name and are scrambled monthly and across stocks for anonymity. This makes it impossible to track firms in time or investigate the properties of their orders with good statistics.

We quantify concentration using metrics of inequality between the volume fractions by which each member firm contributes to the total trade volume. Denote by $w^b_i$ the fraction of shares bought by firm $i$ in a trading session, i.e., the sum of GBP trade sizes in which firm $i$ participated as a

[Figure 1 panels: concentrated buying matched with concentrated selling; concentrated buying matched with dilute selling; dilute buying matched with dilute selling. Labels: buy orders; sell orders; a large order; sell orders from traders.]

Fig. 1.

Three stylised examples of trading where different levels of concentration and dilution may result in different price impact of large orders. Each small square represents a share or a "lot", which are then grouped by the firm that bought or sold them. In the top example, buying was concentrated as a consequence of a large order on the market. Selling, on the other hand, was dilute, with a large number of orders contributing to the aggregate selling. In the middle example, both the buying and selling are concentrated. The bottom example shows trading between dilute buyers and sellers.

In the paper we show that large orders causing concentration suffer adverse price impact, even after taking into account the impact of the orderflow generated by such orders. This suggests there are performance advantages to matching a large order with similarly large orders, as opposed to splitting it up and filling with multiple counterparties.

buyer, divided by the total traded volume. Analogously, $w^s_i$ is the fraction of shares sold by firm $i$. A member firm $i$ which both bought and sold in a given session will have both $w^s_i$ and $w^b_i$ non-zero.

Volume fractions $w_i$ are positive and sum up to 1, defining a probability distribution.‡ Hence, we can quantify concentration using any statistical measure for the inequality of a probability distribution. In order to make our results robust to the choice of the measure, we initially use two measures of inequality, the Gini index $G$ and the entropy $E$.

The sample Gini index can be computed as half of the relative mean absolute difference between all $N$ samples $x_i$ (13, 14)

$$G \equiv \frac{\sum_{i,j} |x_i - x_j| / N^2}{2 \sum_i x_i / N} \tag{1}$$

which for volume fractions $w_i$ can be simplified to

$$G \equiv \frac{1}{2N} \sum_{i,j} |w_i - w_j|. \tag{2}$$

The Gini index $G_s$ for the concentration of selling is computed from the sell volume fractions, while $G_b$ for the concentration of buying is computed from the buy fractions.

‡ We omit the superscript denoting the market side when the expression applies to both sides.
For fully concentrated trading, in which all buys or sells are done by a single firm (having a 100% participation ratio), the Gini index has a maximum value equal to 1. For fully dilute trading, in which all firms contribute an equal amount to the total trade volume, it takes its minimum value of 0.

The entropy based measure for concentration is defined as

$$E \equiv -\frac{\sum_i w_i \cdot \log(1/w_i)}{\log N}, \tag{3}$$

where we have changed the sign and introduced the normalisation factor of $\log N$ to the usual entropy expression. With these changes, we get the intuitive interpretation that large concentration is associated with large values of the metric, consistent with the Gini metric: for fully concentrated trading the entropy measure is equal to 1, while for fully dilute it is 0.

Finally, we define the imbalance in trading concentration between the buying and selling as

$$\delta G \equiv G_b - G_s \quad \text{and} \quad \delta E \equiv E_b - E_s. \tag{4}$$

If the buying is more concentrated than selling, the imbalance will be positive.

One might immediately wonder how often we see trading situations in which either buying or selling is significantly concentrated. Looking at all stocks in our LSE dataset, on almost half of the days there was at least one firm that accounted for more than a quarter of all the buying or selling on the day. This is a fairly frequent occurrence.
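The concentration metrics of Eqs. 2 to 4 can be sketched in a few lines of Python. This is an illustrative sketch, not the author's code: the function names and the example volumes are hypothetical, and firms are assumed to be given as a plain list of traded volumes per side. Two properties of the formulas as printed are worth keeping in mind. With a finite number of firms $N$, Eq. 2 gives $(N-1)/N$ for a single dominant firm, which approaches 1 as $N$ grows. Eq. 3 evaluated literally ranges from $-1$ (equal shares) to 0 (single firm); adding 1 maps it onto the 0 to 1 range described in the text, and any such constant shift cancels in the imbalance of Eq. 4.

```python
import math

def volume_fractions(volumes):
    """Turn per-firm traded volumes (e.g. GBP bought) into fractions w_i that sum to 1."""
    total = sum(volumes)
    return [v / total for v in volumes]

def gini(w):
    """Gini index of Eq. 2: sum over all ordered pairs of |w_i - w_j|, divided by 2N."""
    n = len(w)
    return sum(abs(a - b) for a in w for b in w) / (2 * n)

def entropy_measure(w):
    """Entropy measure of Eq. 3 as printed: -sum_i w_i log(1/w_i) / log N (needs N >= 2)."""
    n = len(w)
    h = sum(x * math.log(1.0 / x) for x in w if x > 0)  # terms with w_i = 0 contribute 0
    return -h / math.log(n)

def concentration_imbalance(metric, buy_volumes, sell_volumes):
    """Eq. 4: concentration of buying minus concentration of selling."""
    return metric(volume_fractions(buy_volumes)) - metric(volume_fractions(sell_volumes))

# Hypothetical session: one firm does 80% of the buying, four firms share the selling equally.
buys = [800.0, 100.0, 50.0, 50.0]
sells = [250.0, 250.0, 250.0, 250.0]
dG = concentration_imbalance(gini, buys, sells)
dE = concentration_imbalance(entropy_measure, buys, sells)
print(f"dG = {dG:.3f}, dE = {dE:.3f}")  # both positive: buying is the more concentrated side
```

Both imbalances come out positive for this example, in line with the sign convention that concentrated buying gives a positive imbalance.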

3. Concentration and price returns

We have decided to estimate the effects of concentration and orderflow on price returns using a linear model. After trying out a few nonlinear forms, we did not find clear evidence of nonlinearities. The extremes of the dataset do deviate from the linear assumption, but the noise levels are large and do not warrant additional complexity. We have also found that the Entropy and Gini metrics lead to the same conclusions, so we will restrict the discussion to the Entropy metric.

We will therefore be explaining the daily price returns $\delta P_t$ by expressions of the form

$$\delta P_t = (\text{concentration imbalance})_t + (\text{order routing imbalances})_t + (\text{error term})_t. \tag{5}$$

Price returns are calculated as the percent difference of the volume weighted average price (VWAP) of the last 10% and first 10% of trades on the day.§ We normalised the returns by the overall market return, subtracting from each daily stock return the FTSE100 return for the day. To make sure that overall stock moves do not impact conclusions, the mean return per stock over the entire period is subtracted from the daily stock return. The variance of price returns across stocks is left unchanged.

The order routing effects are captured by the imbalances between buying and selling in
• numbers of aggressive orders $M_t$,
• GBP volume of aggressive orders $V_t$, and
• numbers of firms $N_t$.

§ Recall that we discarded 30 minutes of data after the opening auction, and 30 minutes before the close.
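The return construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the function names are hypothetical, the percent difference is assumed to be taken relative to the opening VWAP, and the per-stock mean is assumed to be taken over the market-adjusted returns.

```python
def vwap(prices, sizes):
    """Volume weighted average price of a set of trades."""
    return sum(p * s for p, s in zip(prices, sizes)) / sum(sizes)

def open_to_close_return_pct(prices, sizes, tail_fraction=0.10):
    """Percent difference between the VWAP of the last and the first 10% of trades.

    Trades are assumed to be in time order, with the first and last half hour
    of the session already discarded.
    """
    k = max(1, int(len(prices) * tail_fraction))
    first = vwap(prices[:k], sizes[:k])
    last = vwap(prices[-k:], sizes[-k:])
    return 100.0 * (last - first) / first   # assumption: relative to the opening VWAP

def adjusted_returns(stock_returns, index_returns):
    """Subtract the index return per day, then the stock's own mean over the period."""
    excess = [r - m for r, m in zip(stock_returns, index_returns)]
    mean = sum(excess) / len(excess)
    return [e - mean for e in excess]

# Hypothetical day with 100 unit-size trades drifting from 100 to 102.
prices = [100.0] * 10 + [101.0] * 80 + [102.0] * 10
sizes = [1.0] * 100
day_return = open_to_close_return_pct(prices, sizes)
print(day_return)
print(adjusted_returns([1.0, 3.0], [0.5, 0.5]))
```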

[Figure 2: matrix of pairwise scatterplots, histograms and correlation coefficients for five variables: price returns (%), imbalance in trading concentration, imbalance in aggressive trade notional, imbalance in aggressive number of trades, and imbalance in number of firms. Axis ticks run from −4 to 4. Correlation coefficients printed in the lower triangle: 0.23, 0.45, 0.25, −0.068, 0.68, −0.41, −0.30, −0.15.]

Fig. 2.

Scatterplots, histograms and correlations showing the dependency between price returns and imbalances in concentration and orderflow. The matrix diagonal shows the histograms, roughly normally distributed with mean 0 and standard deviation 1. The exception is the distribution of price returns, which are not normalised: price returns are left as percentages. The triangle above the diagonal shows simple scatterplots of daily values (in red), together with a simple binned conditional mean estimate. Values in the matrix below the diagonal are the correlation coefficients between the corresponding variables, with the font size proportional to the absolute value of the coefficient.

The charts show a fairly correlated set of variables, which are all in one way or another related to trade volume. It is for this reason that we will have to take special care to properly account for the effects of orderflow in revealing the role of concentration in price returns.

The imbalances are computed using the usual convention that buying gets a positive sign, and using the common normalising form

$$\delta x = \frac{x_b - x_s}{x_b + x_s} \tag{6}$$

where $b$ and $s$ designate buying and selling. The full orderbook data we use allows us to calculate these order routing variables precisely.

In summary, for each stock and trading day, we compute the market normalised price return $\delta P_t$ and the concentration imbalance between buyers and sellers $\delta E_t$ using Eq. 4. Using Eq. 6, we compute the imbalance between the number and GBP notional of aggressive buy orders and aggressive sell orders, $\delta M_t$ and $\delta V_t$ respectively, and similarly the imbalance in the number of buyers and sellers $\delta N_t$. To combine data from different stocks, each imbalance variable is further normalised by the standard deviation for the corresponding stock.

Figure 2 shows a matrix of dependencies between the price returns and the imbalance metrics. The upper matrix triangle shows scatter plots (in red) and a simple conditional mean estimate as the solid gray line. The level of correlation (estimated coefficients shown in the lower matrix triangle) is high since none of the orderflow variables can vary fully independently of the others. If a new buyer starts trading, changing the imbalance in the number of firms, this may in turn change the ratio of aggressive sell and buy orders in the market. Depending on the order size, it may also change the concentration of buying.

Trading concentration, on the other hand, is fairly independent and only mildly anti-correlated with the imbalance in the number of firms, e.g., the smaller the number of sellers, the more concentrated selling is. The fact that concentration and orderflow are rather independent is indicative of fairly sophisticated SORs at work. Even when there is a large order being worked (resulting in concentrated trading), the order routing does not create biases in orderflow.

Table 1.

Model fit showing the magnitude and significance of concentration and order routing imbalances on price returns (in basis points). The impact of concentration on the price is adverse, in that positive imbalances corresponding to concentrated buying are associated with positive price returns. The impact of order routing is as expected: an imbalance towards aggressive orders adversely affects the price, and the fewer firms trading on one side of the market, the more adverse price impact they can expect, reflecting their order size. Only the imbalance in the number of aggressive trades is not significant, with a similar effect better captured by aggressive notional imbalance.

The effect of concentration is comparable in magnitude (Coef ≈ 25 bps) to the impact of routing (Coefs ≈ 80 and −60 bps), but the explanatory power of concentration imbalance is less than that of order routing ($R_P \approx$ […]). $R_P$ is the partial $R^2$ of the selected variable. It expresses how much the variable explains price returns in addition to the other three variables. $R_S$ is the value of $R^2$ in a regression where only the selected variable is present. It expresses how much the variable on its own explains price returns.

Ultimately the model to estimate is

$$\delta P_t = \alpha \cdot \delta E_t + \beta \cdot \delta M_t + \gamma \cdot \delta V_t + \tau \cdot \delta N_t + \epsilon_t, \tag{7}$$

where $\epsilon_t$ are $N(\mu, \sigma)$ Normal residuals. We estimate the parameters via a standard OLS on a merged dataset for all stocks, containing 15491 samples corresponding to day sessions. In doing this, we have limited ourselves to sessions containing more than 500 trades and have removed sessions in which the stock price changed by more than 5 percent (to limit the effect of exogenous or news events). The number of samples we removed is around 5% and in reality does not materially change the conclusions. In terms of other robustness checks, we performed the analysis in distinct yearly periods as well as separately for each stock, broadly corroborating the conclusions.

Model estimates and corresponding errors are collected in table 1. The model explains about 30% of overall price return variation (model $R^2 = 0.$[…], […] $R^2 = 0.$[…]). […] $R_S$ and $R_P$. These goodness-of-fit measures describe the contribution of each of the imbalances to the overall model fit. $R_S$ is equal to the r-squared of a model with only the selected variable, and no others, included.¶
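The pipeline described so far (the normalised imbalance of Eq. 6, scaling by the per-stock standard deviation, the pooled OLS fit of Eq. 7 and the single-variable $R_S$) can be sketched with NumPy as below. This is a sketch on synthetic data, not the LSE sample: the helper names, the stock labels and the noise level are hypothetical, the generating coefficients simply borrow the approximate magnitudes quoted in the text, and an intercept column is assumed in order to absorb the residual mean $\mu$.

```python
import numpy as np

def normalised_imbalance(x_buy, x_sell):
    """Eq. 6: (x_b - x_s) / (x_b + x_s); positive when buying dominates."""
    x_buy = np.asarray(x_buy, dtype=float)
    x_sell = np.asarray(x_sell, dtype=float)
    return (x_buy - x_sell) / (x_buy + x_sell)

def scale_by_stock(values, stock_ids):
    """Divide each observation by the standard deviation of its own stock."""
    values = np.asarray(values, dtype=float)
    stock_ids = np.asarray(stock_ids)
    scaled = np.empty_like(values)
    for stock in np.unique(stock_ids):
        mask = stock_ids == stock
        scaled[mask] = values[mask] / values[mask].std(ddof=1)
    return scaled

def ols_with_intercept(y, X):
    """Least-squares fit of y on [1, X]; returns (coefficients, R^2)."""
    A = np.column_stack([np.ones(len(y)), X])
    coefs, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coefs
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coefs, r2

# Synthetic stand-in for the merged stock-day sample (NOT the LSE data).
rng = np.random.default_rng(0)
n = 5000
stock = rng.integers(0, 5, size=n)            # hypothetical stock labels
firms_buy = rng.integers(20, 60, size=n)      # hypothetical daily counts of buying firms
firms_sell = rng.integers(20, 60, size=n)
dN = scale_by_stock(normalised_imbalance(firms_buy, firms_sell), stock)
dE, dM, dV = rng.standard_normal((3, n))      # imbalances assumed already normalised
# Returns in bps, generated with the approximate magnitudes quoted in the text.
dP = 25 * dE + 80 * dV - 60 * dN + 150 * rng.standard_normal(n)

coefs, r2 = ols_with_intercept(dP, np.column_stack([dE, dM, dV, dN]))
_, r_s = ols_with_intercept(dP, dE)           # R_S: concentration imbalance on its own
print("intercept, alpha, beta, gamma, tau:", np.round(coefs, 1))
print("model R^2:", round(r2, 3), " R_S(concentration):", round(r_s, 3))
```

Because each imbalance is divided by its per-stock standard deviation, the fitted coefficients read as basis points per one typical (one standard deviation) imbalance.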
$R_P$ is the partial r-squared defined as $R_P = (R^2 - \hat{R}^2)/(1 - \hat{R}^2)$, where $R^2$ is the R-squared of the full model with all variables included, and $\hat{R}^2$ is the R-squared of a restricted model with all except the selected variable. $R_P$ is the partial contribution to the full r-squared due to the selected variable.

¶ As such, $R_S$ is equal to the square of the correlation between the variable and price returns, shown in figure 2.

Not surprisingly, order routing explains a large proportion of variation in price returns (large $R_P$). However, that does not mean that order routing is the single "important" contributor to market impact. Loosely speaking, the "importance" of a factor is a combination of how large an effect the factor predicts, and how reliable a prediction it produces. I.e., an unreliable prediction of a large magnitude may be more "important" than a reliable prediction of a small inconsequential effect. $R^2$ describes how reliable a factor is in predicting the price impact, while the estimated coefficient determines the magnitude of the effect.

In the estimated model, price returns were expressed in basis points. The concentration and orderflow imbalances are without units; however, they are normalised by the stock standard deviation. Hence the units of regression coefficients can be read as basis points per unit of typical imbalance. The imbalances in order routing and the number of firms trading contribute respectively around $R_P$(market order volume) = 0.12 and $R_P$(number of firms) = 0.16 to the overall $R^2$. The magnitudes of their effects for one standard deviation are estimated to be 80 bps and −60 bps respectively. The contribution of concentration imbalance is more noisy, with $R_P$(concentration) = 0.02, but has a comparable 25 bps effect on the magnitude of the price impact.

As an illustration we can take a sample from the dataset, for example LLOY on $t$ = 2000-05-09. The market adjusted return on that day was $\delta P_t = -$[…]. […] $\delta E_t = -$[…], $\delta V_t = -$[…], $\delta N_t = 1.$[…]. We omit aggressive trade count because its contribution is not significant and likely captured by the aggressive notional. On the day, the concentration of sellers was about 3 times as large as is typical, and there was a slightly higher number of buyers than sellers. The number of aggressive sell orders (market orders) was not notably larger than the number of aggressive buy orders. Upon multiplying these values by the coefficients and adding up, the predicted return for the day is

$$\delta P_t = 25 \cdot \delta E_t + 80 \cdot \delta V_t - [\ldots] \cdot \delta N_t = [\ldots]. \tag{8}$$

We see that the contribution of concentration to the predicted return is of comparable magnitude to that of orderflow, albeit with lower reliability. Across the sample of all stocks, the contribution of concentration is between 20% and 30% of the total return.

Checking the statistical specification of the model, the residuals are i.i.d. and very close to normal; all the explanatory variables are exogenous to the model. Therefore, the model coefficients are estimated without bias and are distributed according to a normal distribution. The reported regression coefficient errors and the p-values are calculated in the standard way assuming normality. However, we have also confirmed the estimation errors using a bootstrap test.‖ Recursive and sliding window estimates are all in line with the overall model fit. In addition, we split the model by year and by stock name. The estimated values are rather constant over the years and do not greatly vary between stocks. We can note, however, that higher activity stocks tend to result in better model fits.

‖ By shuffling the price returns and keeping all other variables intact, we obtain a realisation of the null hypothesis where all the explanatory variables are correlated with themselves but are uncorrelated with the returns. Repeating the shuffling 1000 times and estimating the model on the bootstrapped data, we get a distribution of the coefficients under the null. The standard deviation of the estimates and the p-values obtained in this way coincide with the theoretical values shown in the table.

While the model and the estimates are reliable, a complication in the interpretation of results is that all of the explanatory variables are, in one way or another, driven and influenced by traded volume. To determine that trading concentration influences price moves in its own right, it is important to determine that the effect of concentration is independent of, or orthogonal to, order routing.

One way of establishing that this is the case is to split the single model Eq. 7 into so-called partial regressions, and estimate them via a two stage process. In the first stage we remove the linear effects of order routing from both the price returns and trade concentration by fitting models

$$\begin{aligned}
\delta P_t &= \alpha_1 \cdot \delta V_t + \beta_1 \cdot \delta N_t + \gamma_1 \cdot \delta M_t + \epsilon_{1,t}, \\
\delta E_t &= \alpha_2 \cdot \delta V_t + \beta_2 \cdot \delta N_t + \gamma_2 \cdot \delta M_t + \epsilon_{2,t}.
\end{aligned} \tag{9}$$

The residuals of these fits, $\widetilde{\delta P}_t \equiv \hat{\epsilon}_{1,t}$ and $\widetilde{\delta E}_t \equiv \hat{\epsilon}_{2,t}$, we can term routing corrected price returns and routing corrected concentration. The second step entails regressing the "corrected" variables on each other

$$\widetilde{\delta P}_t = \eta \cdot \widetilde{\delta E}_t + \epsilon_{3,t}. \tag{10}$$

This procedure removes linear effects of order routing from price returns and trading concentration imbalance. Whatever significance remains in the second step is the effect of trading concentration on price returns, orthogonal to order routing. A fit of Eq. 10 yields $\hat{\eta} = 0.[\ldots] \pm [\ldots].01$ with p-value zero and $R^2 = 0.02$, very much in line with the results in table 1. Fig. 3 shows the fit graphically, together with simple conditional averages and the corresponding standard normal errors of the residuals.

What this shows is that the influence of trading concentration on market returns is indeed orthogonal to order routing, and typically about one quarter in magnitude ($\hat{\eta} = 0.$[…]) […]
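The two stage procedure of Eqs. 9 and 10 and the partial r-squared $R_P$ can be checked numerically with a short NumPy sketch. The data below are synthetic and the helper name `fit` is hypothetical; only the mechanics are illustrated. By the Frisch-Waugh-Lovell theorem, the second-stage slope $\eta$ coincides with the concentration coefficient $\alpha$ of the full model in Eq. 7, and the second-stage $R^2$ coincides with $R_P$ for concentration.

```python
import numpy as np

def fit(y, X):
    """OLS of y on [1, X]; returns (coefficients with intercept first, residuals, R^2)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, resid, r2

# Synthetic, mutually correlated imbalances (NOT the LSE data; coefficients are illustrative).
rng = np.random.default_rng(1)
n = 4000
dN = rng.standard_normal(n)
dV = rng.standard_normal(n)
dM = 0.7 * dV + 0.7 * rng.standard_normal(n)      # trade count tracks notional
dE = -0.3 * dN + rng.standard_normal(n)           # mild anti-correlation with firm count
dP = 25 * dE + 80 * dV - 60 * dN + 150 * rng.standard_normal(n)

routing = np.column_stack([dV, dN, dM])

# Full model (Eq. 7) and restricted model without concentration.
beta_full, _, r2_full = fit(dP, np.column_stack([dE, routing]))
_, _, r2_restricted = fit(dP, routing)
partial_r2 = (r2_full - r2_restricted) / (1.0 - r2_restricted)

# Stage 1 (Eq. 9): strip the linear routing effects from returns and concentration.
_, dP_corrected, _ = fit(dP, routing)
_, dE_corrected, _ = fit(dE, routing)
# Stage 2 (Eq. 10): regress corrected returns on corrected concentration.
beta_stage2, _, r2_stage2 = fit(dP_corrected, dE_corrected)

print("alpha (full model):", round(beta_full[1], 3), " eta (two stage):", round(beta_stage2[1], 3))
print("R_P:", round(partial_r2, 4), " stage-2 R^2:", round(r2_stage2, 4))
```

The agreement between the two routes is exact up to floating point error, which is why the second-stage estimate can be expected to be in line with the full-model fit.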

4. Executing large in scale

The preceding section showed that trading concentration on average adversely impacts price moves. Practically speaking, however, if one were to trade a large order, what might be the ways of mitigating the additional impact due to concentration? LIS venues aim to reduce impact by matching offsetting interest only when it can be matched in size between a few counterparts. In other words, matching only when both the buy side and the sell side are concentrated. By appropriately partitioning the LSE data we can test this approach and determine if matching concentration with concentration reduces impact.

To do this, we partition the samples into categories, each indexing days from one of the regimes corresponding to illustrations in Fig. 1:
• Concentrated buying matched with concentrated selling;
• Concentrated buying matched with dilute selling;
• Dilute buying matched with concentrated selling;
• Dilute buying matched with dilute selling.

Once we remove the effects of order routing, we are left with the average price move conditional on the four concentration regimes. A way to compute these conditional price moves is to create four so-called "dummy" variables indexing the four market regimes. The model is very much like Eq. 7, with the difference that the continuous concentration imbalance $\delta E_t$ is replaced with the four categorical variables. In defining the categories, we consider a market side to be concentrated if the concentration measure is larger than the 70th percentile of the metric, and dilute if it is smaller than the 30th percentile.∗∗

[Figure 3: order routing corrected price returns (y-axis, −1.5 to 1.5) against order routing corrected concentration imbalance (x-axis, −3 to 3), showing conditional averages and the regression fit.]

Fig. 3.

Figure showing the significant relation of concentration and price impact, after taking order routing effects into account. Days where concentration was high on one side of the market tend to have larger adverse moves in stock price. On the x-axis we plot the residual information in the concentration imbalance, once orderflow effects have been regressed out. This is computed as the residuals of the concentration regression in the first of the two stage partial regression process described in the text ($\epsilon_{2,t}$ in Eq. 9). Likewise, the y-axis shows the residual price return $\epsilon_{1,t}$, after removing the effects of orderflow imbalances. Points in the chart are binned variable averages; the line is the regression line from Eq. 10.

Table 2 displays the model fit containing the four dummy variables. The effect of order routing remains significant and very similar to that previously observed. What is interesting here is that of the four conditional means only the ones indexing concentrated with dilute trading are significantly different from zero. Additional price impact due to concentration vanishes in situations where concentrated trading is matched with concentrated counterparts.

Given the above, it seems beneficial (if constraints of the execution allow it) to fill large orders with similarly large opposite interest, minimising the number of trade counterparts. In an anonymous, centrally cleared market it may not be possible to know directly the number of distinct counterparts an order is filled with. However, while far from straightforward, there may be ways to infer the information indirectly from fill periodicity, size or other patterns. Algos with such logic could, for example, speed up the execution to profit from times when trading with few concentrated counterparties, and slow down when trading with many counterparts.

The other possibility is to advertise interest in large-in-scale (LIS) venues. Arguably, the upstairs voice market of the LSE is a venue designed with such a purpose. The brokers would dial their contacts in search of a few counterparts to fill the order in size. However, trading in the upstairs market brings with it information leakage complications. LIS venues may achieve a similar purpose, but with information leakage tightly controlled and measurable.

∗∗ To estimate the quantiles, we merge the concentration metrics for the sell and buy sides.
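The regime partition described in this section can be sketched as below. This is an illustrative sketch, not the author's code: the function names and the example values are hypothetical, the thresholds are taken from the pooled buy and sell metrics as in the footnote, and days on which either side falls between the two thresholds are assumed to belong to none of the four regimes (the text does not say how such days are treated).

```python
import numpy as np

def label_sides(conc_buy, conc_sell, lo_q=0.30, hi_q=0.70):
    """Label each side of each day as 'conc', 'dilute' or 'mid' using pooled quantiles."""
    conc_buy = np.asarray(conc_buy, dtype=float)
    conc_sell = np.asarray(conc_sell, dtype=float)
    lo, hi = np.quantile(np.concatenate([conc_buy, conc_sell]), [lo_q, hi_q])

    def label(c):
        return np.where(c > hi, "conc", np.where(c < lo, "dilute", "mid"))

    return label(conc_buy), label(conc_sell)

def regime_dummies(buy_label, sell_label):
    """Four 0/1 indicator columns, one per concentration regime."""
    regimes = {
        "conc_buy_conc_sell": ("conc", "conc"),
        "conc_buy_dilute_sell": ("conc", "dilute"),
        "dilute_buy_conc_sell": ("dilute", "conc"),
        "dilute_buy_dilute_sell": ("dilute", "dilute"),
    }
    return {name: ((buy_label == b) & (sell_label == s)).astype(float)
            for name, (b, s) in regimes.items()}

# Hypothetical concentration metrics for seven days.
conc_buy = [0.95, 0.90, 0.05, 0.10, 0.50, 0.50, 0.50]
conc_sell = [0.92, 0.15, 0.88, 0.20, 0.45, 0.55, 0.50]
buy_label, sell_label = label_sides(conc_buy, conc_sell)
dummies = regime_dummies(buy_label, sell_label)

# These columns would take the place of the concentration imbalance in the design matrix of Eq. 7.
design = np.column_stack([dummies[name] for name in dummies])
print(design)
```

In this toy example the first four days fall one into each regime, while the last three days have both sides in the middle range and so get zeros in all four columns.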

                                 Coef.    Error    p-val
Signed volume, δV                […]      […]      […]
δN                               -65.4    […]      […]
δM                               -4.4     1.8      0.01
Concentrated sell, dilute buy    -44.0    […]      […]
[…]
R² = 0.[…]

Table 2.

Regression results showing the different effects of concentration on the resulting price move, split by different levels of concentration on the two market sides. As before, the concentrated side of the market suffers adverse price moves, but only when matched with a dilute opposite side. When a concentrated order is matched with a concentrated counterpart, the market impact caused by concentration vanishes.

The orderflow effects (δV, δN, δM) and coefficients are largely unchanged from before. Likewise, concentrated selling when trading with dilute buying results on average in a price drop (coef = −[…] ± […] bp); concentrated buying when matched with dilute selling results in price appreciation (coef = 33 ± […] bp). However, when concentrated trading is matched with similarly concentrated counterparts, the impact of concentration vanishes. This observation leads us to expect performance improvements for algos utilising logic to restrict the number of counterparts to trade with.

5. Signalling and time persistence of concentration

This analysis is based on trading in an anonymous and centrally cleared market in which information leakage is minimised. In spite of this, we speculate that information leakage is likely responsible for the observed impact of concentrated trading.

Market impact is known to be a concave function of order size (6, 15, 16), a fact we were also able to confirm on the LSE dataset we use. The implication is that impact per share (or per unit of notional) is larger for small orders than for large ones. From a purely mechanical view of impact, it therefore follows that when a large order is matched with many small orders, the sum of the impacts of the small orders should be larger than the impact of the large order. In the language of concentration, the many small orders on the dilute side of the market should, in aggregate, cause more price impact than the large order on the concentrated side. From the earlier results, we know the contrary to be the case. Clearly, a simple mechanical explanation does not capture the full story.

We speculate that signalling and information leakage by the large order allow the “dilute side” firms to profit by adjusting prices as the intent and size of the large order are revealed to the market. For them to be able to do so, a certain level of persistence in concentration is required. Figure 4 shows that there is indeed significant autocorrelation in the levels of concentration from one day to the next. Correlations decay roughly as a power function, are strongly significant up to a week, and weakly significant for multiple weeks. Even the ACF of the concentration imbalance is significant up to a few days.

Correlations in trading concentration can be generated in two ways: (i) one or a few firms trading large orders across days, or (ii) different firms entering the market with large orders sequentially.

The former, “same-firm” correlations, are typically a consequence of splitting large orders and trading them sequentially (17, 18).
The latter, “cross-firm” correlations, can be caused by news events and their uneven propagation among firms, or by the so-called “herding hypothesis” (19), whereby traders, upon observing large orders, respond by placing large orders of their own. A “hot-potato” is another variant of the herding hypothesis, in which traders sequentially trade a large position between themselves (20).

[Figure 4. Upper panel: concentration ACF against lag (trading day), with lag markers in weeks. Lower panel: ACF contributions against lag (trading day). Legend: same-firm contribution to ACF; cross-firm contribution to ACF (sign flipped); ACF = difference between the two.]

Fig. 4.

Autocorrelation function (ACF) of trading concentration, showing strong persistence over days (upper chart). We speculate it is this persistence in concentration that allows the dilute side of the market to adjust its behaviour and profit from the knowledge of a large order contributing to concentrated trading. The decay roughly follows a power function (note the log-log scale), with strong persistence up to a week. The result is obtained by averaging across stocks and both market sides.

The lower chart shows that the persistence in concentration is caused by the same firms executing large orders across multiple days. We break the aggregate ACF into same-firm and cross-firm correlations. The effects of the two are opposite, with the same-firm component contributing positively to correlations. (We flipped the sign of the cross-firm correlations so that they chart better.) The difference between the two curves is the resulting aggregate ACF.

By splitting the empirical ACF into two components, one contributed by same-firm correlations and another by cross-firm correlations, we can determine which of the two mechanisms contributes to the persistence in concentration. The autocorrelation function $\gamma(\tau)$ is defined as

$$\gamma(\tau) = \frac{\langle E(t)\, E(t+\tau) \rangle}{\langle E(t)\, E(t) \rangle} \tag{11}$$

where the bracket notation $\langle \cdot \rangle$ denotes averaging over days.†† We separate the contributions to the ACF by introducing summands $z_i(t)$, so that the concentration on a given day can be written as

$$E(t) = \sum_{i \in \xi_t} \frac{-w_i(t) \log w_i(t)}{\log N(t)} \equiv \sum_{i \in \xi_t} z_i(t). \tag{12}$$

Substituting the definition of concentration into the expression for the ACF, we obtain

$$\gamma(\tau) = \frac{\left\langle \sum_i z_i(t) \cdot \sum_j z_j(t+\tau) \right\rangle}{\left\langle \sum_i z_i(t) \cdot \sum_j z_j(t) \right\rangle} \tag{13}$$

which can be rearranged as

$$\gamma(\tau) = \frac{\left\langle \sum_{i = j} z_i(t)\, z_j(t+\tau) \right\rangle + \left\langle \sum_{i \neq j} z_i(t)\, z_j(t+\tau) \right\rangle}{\left\langle \sum_{i,j} z_i(t)\, z_j(t) \right\rangle}. \tag{14}$$

The first sum takes into account the same-firm contributions to the ACF, while the second sum takes into account the cross-firm contributions. It turns out that the second sum is negative, so for convenience we change its sign and write the ACF as

$$\gamma(\tau) = \gamma_{\mathrm{same}}(\tau) - \gamma_{\mathrm{cross}}(\tau). \tag{15}$$

Prior to decomposing the ACF, however, we need to ensure that $E(t)$ has zero mean, which we do by subtracting the mean:

$$\tilde{E}(t) \equiv E(t) - \bar{E} = E(t) - \frac{1}{T} \sum_d E(d). \tag{16}$$

As before, denoting by $\xi_d$ the set of firms or orders present in the market on day $d$, we can write the above expression as

$$\tilde{E}(t) = \sum_{i \in \xi_t} z_i(t) - \frac{1}{T} \sum_d \sum_{j \in \xi_d} z_j(d) = \sum_{i \in \xi_t} \left( z_i(t) - \frac{1}{N_t} \bar{E} \right) \equiv \sum_{i \in \xi_t} \tilde{z}_i(t) \tag{17}$$

where $N_t$ again stands for the cardinality of the set $\xi_t$. We can now compute the components of the ACF using $\tilde{z}_i(t)$.

The analysis reveals that same-firm and cross-firm correlations are of different sign and work in opposite directions (lower panel of Fig. 4). Same-firm correlations increase the overall ACF and are very closely matched in magnitude by the reduction due to the anti-correlation across firms. The small difference in magnitude between the opposing forces results in the overall concentration ACF.

While this analysis does not rule out that the ACF decomposition is purely a mechanical effect of how we compute entropy, it does show that the persistence in concentration is due to large order splitting. As such, it fits well with the common narrative that large orders leak information during execution.

†† This expression holds for zero-mean variables. Average entropy is not zero, hence we need to remove the mean prior to computing the decomposition.
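The decomposition in Eqs. 11 to 17 can be sketched numerically as below. This is an illustrative sketch under stated assumptions rather than the computation behind Fig. 4: the function name and the input layout (a days-by-firms array `volumes` of traded volume on one market side, with a zero meaning the firm was absent that day) are hypothetical, and the day averages in the numerator simply run over the days available at each lag.

```python
import numpy as np


def acf_decomposition(volumes, max_lag):
    """Split the concentration ACF into same-firm and cross-firm parts.

    volumes: hypothetical (T, F) array; column i is the same firm on every
    day. Returns (gamma_same, gamma_cross), each of length max_lag + 1,
    with the sign of the cross-firm part flipped as in Eq. 15, so that
    gamma = gamma_same - gamma_cross.
    """
    volumes = np.asarray(volumes, dtype=float)
    T = volumes.shape[0]
    present = volumes > 0                      # membership of firm i in xi_t
    n_firms = present.sum(axis=1)              # N(t)
    if np.any(n_firms < 2):
        raise ValueError("each day needs at least two active firms")

    # z_i(t) = -w_i(t) log w_i(t) / log N(t), taking 0 * log 0 as 0 (Eq. 12)
    w = volumes / volumes.sum(axis=1, keepdims=True)
    safe_w = np.where(present, w, 1.0)
    z = -w * np.log(safe_w) / np.log(n_firms)[:, None]

    # Remove the mean per active firm so that E~(t) has zero mean (Eq. 17)
    e_bar = z.sum(axis=1).mean()
    z_tilde = np.where(present, z - e_bar / n_firms[:, None], 0.0)

    e_tilde = z_tilde.sum(axis=1)
    denom = np.mean(e_tilde * e_tilde)

    gamma_same = np.empty(max_lag + 1)
    gamma_cross = np.empty(max_lag + 1)
    for tau in range(max_lag + 1):
        a, b = z_tilde[: T - tau], z_tilde[tau:]
        same = np.mean(np.sum(a * b, axis=1))              # i = j terms
        total = np.mean(a.sum(axis=1) * b.sum(axis=1))     # all i, j terms
        gamma_same[tau] = same / denom
        gamma_cross[tau] = -(total - same) / denom         # i != j, sign flipped
    return gamma_same, gamma_cross
```

By construction `gamma_same - gamma_cross` reproduces the ordinary ACF of the demeaned concentration series, which offers a quick sanity check on any implementation of the split.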

6. Conclusions

This paper introduces the concept of trading concentration and investigates its effect on price impact. When a firm executes a large order, it contributes to the concentration of that side of the market. In such a situation, in addition to the price impact due to the way the order is traded (SOR order routing), concentration has an adverse effect on price, roughly 30% as large as the effects of orderflow. A way to reduce this additional impact is to selectively fill the order with comparably large opposing interest.

This can be facilitated either by expressing interest in LIS venues, or by inferring the presence of large opposite interest from patterns of fills when trading in an anonymous market. In such situations, it is advantageous to speed up trading to minimise the number of trade counterparts.

We speculate that large orders, spread over hours or days, leak their intent over time even when trading in anonymous markets. This allows smaller traders to adjust prices and profit from the large order fills.

This paper does not unequivocally find support for trading in large-in-scale venues. That would be impossible with data from anonymous trading alone. What we do show is that even in a standard continuous double auction market, after taking orderflow effects into account, the price impact of an order is increased when the order is filled with a large number of counterparts.

1. Plerou V, Gopikrishnan P, Gabaix X, Stanley HE (2002) Quantifying stock price response to demand fluctuations. Physical Review E.

International Finance Discussion Papers.

Journal of International Money and Finance.

4. Evans M, Lyons R (2002) Order flow and exchange rate dynamics. Journal of Political Economy.

Journal of International Economics.

Proceedings of the National Academy of Sciences of the United States of America.

Journal of Finance.

Physical Review E.

Physica A-Statistical Mechanics and Its Applications.

Physica A-Statistical Mechanics and Its Applications (299):234–246.

12. Solomon S, Richmond P (2001) Power laws of wealth, market order volumes and market returns. Physica A (299):188–197.

13. Dixon PM, Weiner J, Mitchell-Olds T, Woodley R (1987) Bootstrapping the Gini coefficient of inequality. Ecology.

On Economic Inequality. (Oxford Clarendon Press).

15. Potters M, Bouchaud JP (2003) More statistical properties of order books and price impact. Physica A.

Nature.

Studies in Nonlinear Dynamics and Econometrics.

Quantitative Finance.

Journal of Financial Economics.

Journal of International Economics.

ACKNOWLEDGMENTS.

The author thanks J. Doyne Farmer for providing the LSE data used in the analysis and for early discussions. Thanks to Roel Oomen for reading a draft and providing useful comments.
