Are trading invariants really invariant? Trading costs matter

Abstract

We revisit the trading invariance hypothesis recently proposed by Kyle and Obizhaeva by empirically investigating a large dataset of bets, or metaorders, provided by ANcerno. The hypothesis predicts that the quantity $I := \mathcal{R}/N^{3/2}$, where $\mathcal{R}$ is the exchanged risk (volatility × volume × price) and $N$ is the number of bets, is invariant. We find that the 3/2 scaling between $\mathcal{R}$ and $N$ works well and is robust against changes of year, market capitalisation and economic sector. However, our analysis clearly shows that $I$ is not invariant. We find a very high correlation ($R^2 > 0.8$) between $I$ and the total trading cost (spread and market impact) of the bet. We propose new invariants defined as a ratio of $I$ and costs and find a large decrease in variance. We show that the small dispersion of the new invariants is mainly driven by (i) the scaling of the spread with the volatility per transaction, (ii) the near invariance of the distribution of metaorder size and of the volume and number fractions of bets across stocks.



Frédéric Bucci ∗, Fabrizio Lillo, Jean-Philippe Bouchaud, and Michael Benzaquen

Scuola Normale Superiore di Pisa, Piazza dei Cavalieri 7, 56126 Pisa, Italy
Department of Mathematics, Università di Bologna, Piazza di Porta San Donato 5, 40126 Bologna, Italy
CADS, Human Technopole, Milan, Italy
Capital Fund Management, 23 rue de l'Université, 75007 Paris, France
CFM-Imperial Institute of Quantitative Finance, Department of Mathematics, Imperial College, 180 Queen's Gate, London SW7 2RH
Ladhyx UMR CNRS 7646 & Department of Economics, Ecole polytechnique, 91128 Palaiseau Cedex, France
Chair of Econophysics and Complex Systems, Ecole polytechnique, 91128 Palaiseau Cedex, France

February 12, 2019

∗ Corresponding author: [email protected]

1 Introduction

Finding universal scaling laws between trading variables is highly valuable to make progress in our understanding of financial markets and market microstructure. In the wake of these discoveries, Kyle and Obizhaeva posit a trading invariance principle that must be valid for a bet, theoretically defined as a sequence of orders with a fixed direction (buy or sell) belonging to a single trading idea [1, 2]. This principle supports the existence of a universal invariant quantity $I$, expressed in dollars, independent of the asset and constant over time, which represents the average cost of a single bet. In particular, taking the share price $P$ (in dollars per share), the square daily volatility $\sigma_d^2$ (in % per day), the total daily amount traded with bets $V$ (shares per day) and the average volume of an individual bet $Q$ (in shares) as relevant variables, dimensional analysis suggests a relation of the form:

$$ \frac{PQ}{I} = f\left(\frac{\sigma_d^2 Q}{V}\right), \qquad (1) $$

where $f$ is a dimensionless function. Invoking the Modigliani-Miller capital structure irrelevance principle yields $f(x) \sim x^{-1/2}$, which implies up to a numerical factor that:

$$ I = \frac{\sigma_d P Q^{3/2}}{V^{1/2}} := \frac{\mathcal{R}}{N^{3/2}}, \qquad (2) $$

where $\mathcal{R} := \sigma_d P V$ measures the total dollar amount of risk traded per day (also referred to as total exchanged risk or trading activity) while $N := V/Q$ represents the number of daily bets for a given contract. Notwithstanding, the 3/2-law can be interpreted with different degrees of universality as discussed in [3]: no universality (the 3/2-law holds for some contracts only), weak universality (the 3/2-law holds but with a non-universal value of $I$) and strong universality (the 3/2-law holds and $I$ is constant across assets and time).

Let us stress that identifying an elementary bet in the market is not a straightforward task. Theoretically, a bet is defined as a trading idea typically executed in the market as many trades over several days. As suggested by Kyle and Obizhaeva in their original work [1], metaorders, i.e. bundles of orders corresponding to a single trading decision and typically traded incrementally through a sequence of child orders, can be considered a proxy for these bets; in this work we make use of such an approximation and use the words 'bet' and 'metaorder' interchangeably. Beyond the subtleties in the bet's definition, there has been in the past few years empirical evidence that the scaling law discussed above matches patterns in financial data, at least approximately. The 3/2-law was empirically confirmed by Kyle and Obizhaeva using portfolio transition data related to rebalancing decisions made by institutional investors and executed by brokers [1]. Andersen et al. [4] suitably reformulated the trading invariance hypothesis at the single-trade level and showed that the equivalent version of Eq. (2) in such a setting holds remarkably well using public trade-by-trade data relative to the E-mini S&P 500 futures contracts. Benzaquen et al. [3] substantially extended these empirical results, showing that the 3/2-law holds […]. Pohl et al. [5] provided additional empirical evidence that the intriguing 3/2-law holds […]. However, the quantity $I$ is actually quite far from invariant, as it varies from one asset to the other and across time, thus in favour of the weak universality degree. Note that this is consistent with the idea that a universal invariant with dollar units would be quite strange, given that the value of the dollar is itself time-dependent. Benzaquen et al. [3] showed that a more suitable candidate for an invariant was actually the dimensionless $\mathcal{I} := I/\mathcal{C}$, where $\mathcal{C}$ denotes the spread trading costs.

Yet, single transactions are typically not the same as single bets. Large and medium sized orders are typically split into multiple transactions and traded incrementally over long periods of time. Empirically, market data do not allow one to infer the trading decision or to link different transactions to a single execution. In order to test the trading invariance hypothesis at the bet level and its relation with trading costs, it is necessary to have a dataset of market-wide (i.e. not from a single institution) metaorders.

This is precisely the aim of the present paper, which leverages a heterogeneous dataset of metaorders extracted from the ANcerno database. To our knowledge such a thorough analysis at the bet level for a wide range of assets is still lacking.

Our main finding is that, while the 3/2 scaling between $\mathcal{R}$ and $N$ works well, the quantity $I$ is not invariant, as pointed out in [3]. We show that this quantity is strongly correlated with transaction costs, including spread and impact. Therefore we introduce new invariants, obtained by dividing $I$ by the cost, and we show that these quantities fluctuate very little across stocks and time periods. (Note that here we only explore the daily level; time does not mean the same thing as in [3], where we varied the time intervals over which the variables were computed.) Finally we show that the observed small dispersion of the new invariants can be connected with three microstructural properties: (i) the linear relation between spread and volatility per transaction; (ii) the near invariance of the metaorder size distribution, and (iii) of the total volume and number fractions of the bets across different stocks.

The paper is organized as follows. In section 2 we describe the dataset collecting trading decisions of institutional investors operating in the US equity market. In section 3 we show that the 3/2-law holds at the bet level […] $I$, and we argue in favour of weak universality. We propose a more natural definition for a trading invariant that accounts both for the spread and the market impact costs, and we exhibit the microstructural origin of its small dispersion. Some conclusions and open questions are presented in section 5.
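As a quick numerical illustration of Eq. (2), the sketch below evaluates both forms of the invariant and checks that they agree. The function name `ko_invariant` and all input values are hypothetical, chosen only for illustration; they are not taken from the ANcerno data.

```python
import math

def ko_invariant(sigma_d, price, volume, bet_size):
    """Return (I, R, N) as defined in Eq. (2).

    sigma_d  : daily volatility (as a fraction)
    price    : share price P, in dollars per share
    volume   : daily volume V traded with bets, in shares
    bet_size : average bet size Q, in shares
    """
    risk = sigma_d * price * volume        # R = sigma_d * P * V
    n_bets = volume / bet_size             # N = V / Q
    invariant = risk / n_bets ** 1.5       # I = R / N^(3/2)
    return invariant, risk, n_bets

# Hypothetical inputs, for illustration only.
sigma_d, price, volume, bet_size = 0.02, 50.0, 1.0e6, 1.0e4
I, R, N = ko_invariant(sigma_d, price, volume, bet_size)

# First form of Eq. (2): sigma_d * P * Q^(3/2) / V^(1/2)
I_alt = sigma_d * price * bet_size ** 1.5 / math.sqrt(volume)
print(I, I_alt)
```

With these inputs $N = 100$ and both expressions give $I$ close to 1000 dollars, up to floating-point rounding.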
Our analysis relies on a database made available by ANcerno, a leading transaction-cost analysis provider. Our dataset counts heterogeneous institutional investors placing large buy or sell orders executed by a broker as a succession of smaller orders belonging to the same trading decision of a single investor. Our sample covers the period January 2007 – June 2010, for a total of 880 trading days. Only metaorders completed within a single trading day are retained. Further, we select stocks belonging to the Russell 3000 index, thereby retaining ∼ […] ∼ 5% of the total reported market volume, regardless of market capitalisation (large, mid and small) and economic sector (basic materials, communications, consumer cyclical and non-cyclical, energy, financial, industrial, technology and utilities). More details and statistics on the investigated sample are presented in Appendix A.

Here we investigate the trading invariance hypothesis at the daily level. The daily timescale choice avoids an elaborate analysis of when precisely each metaorder starts and ends, thereby averaging out all the non-trivial problems related to the daily simultaneous metaorders executed on the same asset [6].

From the metaorders executed on the same stock during the same day we compute the total exchanged volume in dollars, $\sum_{i=1}^{N} p_i v_i$, where $N$ is the number of daily metaorders per asset in the ANcerno database, and $v_i$ and $p_i$ are respectively the number of shares and the volume weighted average price (vwap) of the $i$-th available metaorder. We then define the total daily exchanged ANcerno risk per asset as:

$$ \mathcal{R} := \sum_{i=1}^{N} \mathcal{R}_i, \quad \text{with} \quad \mathcal{R}_i = \sigma_d v_i p_i, \qquad (3) $$

where $\sigma_d$ denotes the daily volatility per asset, computed as $\sigma_d = (p_{\rm high} - p_{\rm low})/p_{\rm open}$ from the high, low, and open daily prices only. The statistical properties of the bets, in terms of their associated risk $\mathcal{R}_i$ and of their total daily number $N$ per asset, are discussed in Appendix A. The variability of the observables over several orders of magnitude should allow one to test the 3/2-law quite convincingly.

Footnote: In fact, for example, Kyle and Obizhaeva tackled this problem investigating a proprietary dataset of portfolio transitions.

Footnote: ANcerno Ltd. (formerly the Abel Noser Corporation) is a widely recognised consulting firm that works with institutional investors to monitor their equity trading costs. Its clients include many pension funds and asset managers. In [1] the authors claim that the ANcerno database includes more orders than the data set of portfolio transitions they used in their work. From a preliminary research Albert S. Kyle and Kingsley Fong found that proxies for bets in ANcerno data have size patterns consistent with the proposed invariance hypothesis discussed in [1, 2].

Footnote: We checked that the results discussed in the present work are still valid using other definitions of the daily volatility and of the price, in analogy to what was done for example in [1]. Specifically, the results are still valid when computing $\sigma_d$ with the Rogers-Satchell volatility estimator [3, 7] or as the monthly averaged daily volatility $\bar{\sigma}_d$ (the average of the daily values $\sigma_{d,m}$ over a month), and/or defining the price $p_i$ as the closing price of the day before the metaorder's execution.

Figure 1: Plots of the mean daily exchanged risk $\langle \mathcal{R} \rangle_N$ as function of the daily number $N$ of metaorders per asset, conditional on the market capitalisation (top left panel), the economic sector (top right panel), and the time period (bottom left panel). The insets show the slopes obtained from linear regression of the data, first averaged with respect to $N$ and then log-transformed. The bottom right panel shows a plot of $\langle \mathcal{R} \rangle_N$ as function of $N$ for a subset of 200 stocks chosen randomly from the pool of around three thousand US stocks: the two insets represent respectively the distribution of the slopes and of the $y$-intercept, i.e. $\langle I \rangle := \langle \mathcal{R} \rangle_N / N^{3/2}$, obtained from linear regression of the data, first averaged with respect to $N$ and then log-transformed, considering each stock separately.

3.2 Empirical evidence

We introduce the mean daily exchanged risk $\langle \mathcal{R} \rangle_N$, where in general $\langle \bullet \rangle_N := E[\bullet | N]$ denotes in a compact way the average over various days and stocks with a fixed daily number of metaorders $N$. As shown in the first three panels of Fig. 1, the scaling $\langle \mathcal{R} \rangle_N \sim N^{3/2}$ holds well independently of the conditioning on market capitalisation, economic sector, and time period. Slight deviations may have different origins but can mostly be attributed to the heterogeneous sample's composition in terms of stocks for each bucket in $N$. The 3/2-law […]. The $y$-intercepts of the fitted lines for individual stocks vary substantially (see the bottom right inset in the bottom right panel of Fig. 1), which indicates that $I$ is not constant across different stocks. More empirical insights on the origin of the 3/2-law are presented in Appendix B.

The conjecture that the quantity $\langle I \rangle := \langle \mathcal{R} \rangle_N / N^{3/2}$ is invariant across different contracts is clearly rejected by the empirical analysis performed in the previous section. Indeed, the quantity $\langle I \rangle$ varies by at least one order of magnitude across different stocks. This result goes against the strong universality version of the trading invariance hypothesis, which states that both the average value $\langle I \rangle$ and the full probability distribution of $I = \mathcal{R}/N^{3/2}$ should be invariant across products. Dimensionally $I$ is a cost (i.e. it is measured in dollars) and indeed the trading invariance hypothesis posits that the cost of a bet is invariant. Using the identification of metaorders and bets we can use the ANcerno dataset to estimate the trading cost, including a spread and a market impact component. We will show that $I$ and trading cost are strongly correlated, and therefore propose new invariants based on their ratio.

Trading costs are typically divided into fees/commissions, spread, and market impact. For large orders, like those investigated here, fees/commissions typically account for a very small fraction and therefore we will neglect them. We shall however take into consideration both the spread cost (as was done at the single-trade level in [3]) and the market impact cost computed from the square root law (see e.g. [6, 8, 9, 10, 11, 12, 13, 14]). We thus define the average daily bet's trading costs:

$$ \mathcal{C} = \mathcal{C}_{\rm spd} + \mathcal{C}_{\rm imp} = Y_{\rm spd} \times \frac{1}{N} \sum_{i=1}^{N} S v_i + Y_{\rm imp} \times \frac{1}{N} \sum_{i=1}^{N} \sigma_d v_i p_i \sqrt{\frac{v_i}{V_d}} := Y_{\rm spd} \times C_{\rm spd} + Y_{\rm imp} \times C_{\rm imp}, \qquad (4) $$

with $S$ the average daily spread, $V_d$ the total daily market volume, and $Y_{\rm spd}$ and $Y_{\rm imp}$ two constants to be determined. The factor $Y_{\rm spd}$ depends, among other things, on the fraction of trades of the metaorder executed with market orders, whereas $Y_{\rm imp}$ only weakly depends on the execution algorithm and is typically estimated to be very close to unity [6, 8, 15]. The empirical properties of $\mathcal{C}_{\rm spd}$ and $\mathcal{C}_{\rm imp}$ and the relative importance of the two terms as a function of the metaorder size are presented in Appendix C. To determine $Y_{\rm spd}$ and $Y_{\rm imp}$ we perform an ordinary least squares regression of the KO invariant $I$ with respect to the daily average cost $\mathcal{C}$ defined for each asset by Eq. (4). We obtain $Y_{\rm spd} \simeq 3.$[…] and $Y_{\rm imp} \simeq 1.5$ with $R^2 \simeq 0.8$. These results show that the original KO invariant is indeed strongly correlated with the trading cost. Since these costs have no a priori reason to be universal, this explains why $I$ is not invariant.

Figure 2: (Left) Empirical distributions of the KO invariant $I = \mathcal{R} N^{-3/2}$, of the daily average bet's total trading cost $\mathcal{C}$ (using $Y_{\rm spd} = 3.$[…] and $Y_{\rm imp} = 1.5$) and of the ratio $\mathcal{I} := I/\mathcal{C}$. (Right) Empirical distributions in log-log scale of the KO invariant $I$ rescaled respectively by the total daily average cost $\mathcal{C}$, by the spread cost $\mathcal{C}_{\rm spd}$ and by the market impact cost $\mathcal{C}_{\rm imp}$.

Guided by such results and by the fact that a market microstructure invariant, if any, should be dimensionless, we define new invariants by dividing the original KO invariant $I$ by the cost of trading. We consider three different specifications, namely:

$$ \mathcal{I} = \frac{I}{\mathcal{C}}, \quad \mathcal{I}_{\rm spd} = \frac{I}{\mathcal{C}_{\rm spd}}, \quad \mathcal{I}_{\rm imp} = \frac{I}{\mathcal{C}_{\rm imp}}. \qquad (5) $$

The left panel of Figure 2 shows the distribution of the original KO invariant $I$ together with that of $\mathcal{I}$ and of the cost $\mathcal{C}$. It is visually quite clear that rescaling by the cost dramatically reduces the dispersion, and that the distribution of $I$ is very similar to that of $\mathcal{C}$, with some deviation for small values. The right panel compares the distribution of $\mathcal{I}$ with that of the other two new invariants. A quantitative comparison is provided in Table 1, which reports the mean, the standard deviation, and the coefficient of variation (CV) of $I$ and of the three new invariants. It is clear that, due to the correlation between $I$ and $\mathcal{C}$, the new invariants $\mathcal{I}_\circ$ (with $\circ$ = spd, imp) have a much smaller CV than $I$. Since the distributions have clear fat tails, we also implemented a robust version of CV obtained by replacing the standard deviation with the mean absolute deviation (MAD), here denoted CV$_{\rm MAD}$. The table indicates that also in this case the new invariants are much more peaked than $I$. Notice also that with CV$_{\rm MAD}$ the three new invariants become similar, while with CV the invariant $\mathcal{I}_{\rm imp}$ is more dispersed.

Table 1: Statistics of the different invariants, namely the original KO invariant $I$ (left), and the three new ones rescaled by cost (right). MAD is the mean absolute deviation and CV stands for coefficient of variation.

             $I$ · […] ($)   $\mathcal{I}$   $\mathcal{I}_{\rm spd}$   $\mathcal{I}_{\rm imp}$
mean         6.33            2.20            4.70                      7.8
st. dev.     11              1.84            3.11                      12.2
MAD          6.9             1.25            2.21                      7.56
CV           1.74            0.84            0.66                      1.56
CV_MAD       […]             […]             […]                       […]
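The two dispersion measures used in Table 1 can be sketched in a few lines. The example below is only an illustration on synthetic data: the lognormal cost, the noise level and the proportionality factor are assumptions, not estimates from the ANcerno sample. It shows why dividing a quantity by a cost it is strongly correlated with reduces both CV and CV$_{\rm MAD}$.

```python
import numpy as np

def cv(x):
    """Coefficient of variation: standard deviation over mean."""
    x = np.asarray(x, dtype=float)
    return x.std() / x.mean()

def cv_mad(x):
    """Robust variant: mean absolute deviation (MAD) over mean."""
    x = np.asarray(x, dtype=float)
    return np.mean(np.abs(x - x.mean())) / x.mean()

# Synthetic data (assumption): a fat-tailed cost and a quantity that is
# proportional to it up to moderate multiplicative noise.
rng = np.random.default_rng(0)
cost = rng.lognormal(mean=0.0, sigma=1.0, size=20_000)
ko = 2.0 * cost * rng.lognormal(mean=0.0, sigma=0.3, size=20_000)
ratio = ko / cost                     # analogue of the rescaled invariant

print("CV     :", cv(ko), cv(ratio))
print("CV_MAD :", cv_mad(ko), cv_mad(ratio))

# Consistency check on the first column of Table 1: CV = st. dev. / mean.
table_cv = 11 / 6.33
```

The last line recomputes the CV of $I$ from the mean and standard deviation reported in Table 1, which agrees with the tabulated 1.74 to within rounding.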

Footnote: The daily spread is recovered from a dataset provided by CFM since it is not available in the ANcerno dataset.

Footnote: The coefficient of variation is the ratio of standard deviation and mean, an indicator of distribution 'peakedness'.

Here we investigate the origin of the small dispersion of the new invariants. Let us first consider the impact cost normalisation only and rewrite $\mathcal{I}_{\rm imp}$ as:

$$ \mathcal{I}_{\rm imp} = \frac{N \sum_{i=1}^{N} \sigma_d p_i v_i}{Y_{\rm imp} N^{3/2} \left( \sigma_d \sum_{i=1}^{N} p_i v_i \sqrt{v_i/V_d} \right)}. \qquad (6) $$

Using $p_i \simeq p$ for all the metaorders executed in a day on a stock, the above expression simplifies to:

$$ \mathcal{I}_{\rm imp} = \frac{1}{Y_{\rm imp} \sqrt{\eta}} \frac{[v]^{3/2}}{[v^{3/2}]} = \frac{1}{Y_{\rm imp}\, m \sqrt{\eta}}, \qquad (7) $$

where $\eta := V/V_d$ with $V := \sum_{i=1}^{N} v_i$ the total ANcerno bet volume, $[\bullet]$ is a daily average operation per stock, and $m := [v^{3/2}]/[v]^{3/2}$ depends on the shape of the distribution of metaorder size (by Jensen's inequality, $m \geq 1$). We have checked that $m$ as well as $\eta$ are, to a first approximation, independent of the stock (see left and central panels in Fig. 3), indicating that the distribution of metaorder size is, to a large degree, universal and that the ANcerno database is representative of the trading across all stocks. These observations explain why $\mathcal{I}_{\rm imp}$ is also, to a large degree, stock independent.

Figure 3: Empirical distributions of the ratio $m = [v^{3/2}]/[v]^{3/2}$ (left panel), $\eta = V/V_d$ (central panel) and $\xi = N/N_d$ (right panel), all three computed at the daily level for each asset: we randomly group the stocks in equally sized samples (Sample 1 to Sample 15) and for each of them we compute the empirical distribution respectively of $m$, $\eta$ and $\xi$, finding that they are, to a first approximation, stock independent.

For the total cost normalisation, our understanding of the invariance property relies on the following empirical fact.
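Before turning to the total cost, the step from Eq. (6) to Eq. (7) can be checked numerically. The sketch below is a minimal illustration under the constant-price assumption $p_i = p$; the function names, the lognormal bet sizes and the parameter values are hypothetical.

```python
import numpy as np

def i_imp_direct(v, p, sigma_d, V_d, Y_imp):
    """Eq. (6): I / C_imp for one stock-day, with a common price p."""
    v = np.asarray(v, dtype=float)
    n = v.size
    ko = sigma_d * np.sum(p * v) / n ** 1.5          # I = R / N^(3/2)
    c_imp = Y_imp * np.mean(sigma_d * v * p * np.sqrt(v / V_d))
    return ko / c_imp

def i_imp_closed_form(v, V_d, Y_imp):
    """Eq. (7): 1 / (Y_imp * m * sqrt(eta))."""
    v = np.asarray(v, dtype=float)
    m = np.mean(v ** 1.5) / np.mean(v) ** 1.5        # shape factor of bet sizes
    eta = v.sum() / V_d                              # volume fraction of the bets
    return 1.0 / (Y_imp * m * np.sqrt(eta))

# Hypothetical stock-day: 12 bets, bets make up 2% of the market volume.
rng = np.random.default_rng(1)
v = rng.lognormal(mean=8.0, sigma=1.0, size=12)
V_d = 50.0 * v.sum()

lhs = i_imp_direct(v, p=30.0, sigma_d=0.02, V_d=V_d, Y_imp=1.5)
rhs = i_imp_closed_form(v, V_d=V_d, Y_imp=1.5)
print(lhs, rhs)
```

The price and the volatility cancel out of the ratio, so the result depends only on $Y_{\rm imp}$, on the shape factor $m$ and on the volume fraction $\eta$, as stated in Eq. (7).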
The average spread is proportional to the volatility per trade, that is $S = c\, p\, \sigma_d / \sqrt{N_d}$, where $N_d$ is the total daily number of trades and $c$ is a stock independent numerical constant, see [15, 16]. Indeed, the above arguments taken together show that the dimensionless quantity $\mathcal{I}$ can be written as:

$$ \mathcal{I} = \frac{1}{Y_{\rm spd}\, c \sqrt{\xi} + Y_{\rm imp}\, m \sqrt{\eta}}, \qquad (8) $$

where $\xi := N/N_d$ is found to be stock independent (see right panel in Fig. 3). Therefore $\mathcal{I}$ is also stock independent. However, the fact that the CV of $\mathcal{I}$ is less than both those of $\mathcal{I}_{\rm spd}$ and $\mathcal{I}_{\rm imp}$ suggests that the Kyle-Obizhaeva "invariant" reflects the fact that metaorders are commensurate to the total cost of trading, including both the spread cost and the impact cost.

In this work we empirically investigated the market microstructure invariance hypothesis recently proposed by Kyle and Obizhaeva [1, 2]. Their conjecture is that the expected dollar cost of executing a bet is constant across assets and time. The ANcerno dataset provides a unique laboratory to test this intriguing hypothesis through its available metaorders, which can be treated as a proxy for bets, i.e. a decision to buy or sell a quantity of institutional size generated by a specific trading idea. Let us summarise what we have achieved in this paper:

• Using bets issued for around three thousand stocks, we showed that, at the daily timescale, the $N^{3/2}$ scaling law between exchanged risk $\mathcal{R}$ and number of bets is observed independently of the year, the economic sector and the market capitalisation.

• The trading invariant $I := \mathcal{R}/N^{3/2}$ proposed by Kyle and Obizhaeva is non-universal: both its average value $\langle I \rangle$ and its whole distribution clearly depend on the considered stocks, in favour of a weak universality interpretation. Furthermore, this quantity has dollar units, which makes its hypothesised invariance rather implausible.

• On the basis of dimensional and empirical arguments, we propose a dimensionless invariant defined as a ratio of $I$ and of the bet's total cost, which includes both spread and market impact costs. We find a variance reduction of more than 50%, qualitatively traceable to the proportionality between spread and volatility per trade, and the near invariance of the distributions of bet size, of the volume fraction and number fraction of bets across stocks.

Our empirical analysis has allowed us to show that the trading invariance hypothesis holds at the bet level in a strong sense provided one considers the exchanged risk and the total trading cost of the bets. This is in the spirit of Kyle and Obizhaeva's arguments, but takes into account the fact that transaction costs are both asset and epoch dependent. As anticipated in [3], our results strongly suggest that trading "invariance" is a consequence of the endogenisation of costs in the trading decision of market participants, and has little to do with the Modigliani-Miller theorem. It would actually be quite interesting to investigate other markets such as bond markets, currency markets or futures markets, for which the Modigliani-Miller theorem is totally irrelevant, while trading invariance still holds, at least at the level of single trades [3, 4]. Finally, differences in market structure across countries, such as execution mechanisms, fees and regulations, could also challenge the validity of the results presented here.

Acknowledgments

We thank Alexios Beveratos, Laurent Erreca, Antoine Fosset, Charles-Albert Lehalle and Amine Raboun for fruitful discussions. This research was conducted within the Econophysics & Complex Systems Research Chair, under the aegis of the Fondation du Risque, the Fondation de l'Ecole polytechnique, the Ecole polytechnique and Capital Fund Management.

Data availability statement

The data were purchased from the company ANcerno Ltd (formerly the Abel Noser Corporation), which is a widely recognised consulting firm that works with institutional investors to monitor their equity trading costs. Its clients include many pension funds and asset managers. The authors do not have permission to redistribute them, even in aggregate form. Requests for this commercial dataset can be addressed directly to the data vendor.

References

[1] Albert S. Kyle and Anna A. Obizhaeva. Market microstructure invariance: Empirical hypotheses. Econometrica, 84(4):1345–1404, 2016.
[2] Albert S. Kyle and Anna A. Obizhaeva. Dimensional analysis, leverage neutrality, and market microstructure invariance. 2017.
[3] Michael Benzaquen, Jonathan Donier, and Jean-Philippe Bouchaud. Unravelling the trading invariance hypothesis. Market Microstructure and Liquidity, 2(03n04):1650009, 2016.
[4] Torben G. Andersen, Oleg Bondarenko, Albert S. Kyle, and Anna A. Obizhaeva. Intraday trading invariance in the E-mini S&P 500 futures market. 2016.
[5] Mathias Pohl, Alexander Ristig, Walter Schachermayer, and Ludovic Tangpi. Theoretical and empirical analysis of trading activity. arXiv preprint arXiv:1803.04892, 2018.
[6] Elia Zarinelli, Michele Treccani, J. Doyne Farmer, and Fabrizio Lillo. Beyond the square root: Evidence for logarithmic dependence of market impact on size and participation rate. Market Microstructure and Liquidity, 1(02):1550004, 2015.
[7] L. Christopher G. Rogers and Stephen E. Satchell. Estimating variance from high, low and closing prices. The Annals of Applied Probability, pages 504–512, 1991.
[8] Bence Tóth, Yves Lemperiere, Cyril Deremble, Joachim De Lataillade, Julien Kockelkoren, and Jean-Philippe Bouchaud. Anomalous price impact and the critical nature of liquidity in financial markets. Physical Review X, 1(2):021006, 2011.
[9] Nicolo G. Torre and Mark J. Ferrari. The market impact model. Horizons, The Barra Newsletter, 165, 1998.
[10] Robert Almgren, Chee Thum, Emmanuel Hauptmann, and Hong Li. Direct estimation of equity market impact. Risk, 18(7):58–62, 2005.
[11] Robert Engle, Robert Ferstenberg, and Jeffrey Russell. Measuring and modeling execution cost and risk. Chicago GSB Research Paper, no. 08-09, 2006.
[12] Xavier Brokmann, Emmanuel Serie, Julien Kockelkoren, and Jean-Philippe Bouchaud. Slow decay of impact in equity markets. Market Microstructure and Liquidity, 1(02):1550007, 2015.
[13] Frédéric Bucci, Iacopo Mastromatteo, Zoltán Eisler, Fabrizio Lillo, Jean-Philippe Bouchaud, and Charles-Albert Lehalle. Co-impact: Crowding effects in institutional trading activity. arXiv preprint arXiv:1804.09565, 2018.
[14] Frédéric Bucci, Michael Benzaquen, Fabrizio Lillo, and Jean-Philippe Bouchaud. Crossover from linear to square-root market impact. arXiv preprint arXiv:1811.05230, 2018.
[15] Jean-Philippe Bouchaud, Julius Bonart, Jonathan Donier, and Martin Gould. Trades, quotes and prices: financial markets under the microscope. Cambridge University Press, 2018.
[16] Matthieu Wyart, Jean-Philippe Bouchaud, Julien Kockelkoren, Marc Potters, and Michele Vettorazzo. Relation between bid–ask spread, impact and volatility in order-driven markets. Quantitative Finance, 8(1):41–57, 2008.
[17] Charles M. Jones, Gautam Kaul, and Marc L. Lipson. Transactions, volume, and volatility. The Review of Financial Studies, 7(4):631–651, 1994.

A Statistics of metaorder sample

Here we describe some statistics of the metaorders executed by the main investment funds and brokerage firms gathered by ANcerno. The empirical probability distributions of the number of metaorders $N$ per asset, of the risk $\mathcal{R}_i$ exchanged by a metaorder and of the total daily traded risk $\mathcal{R}$ per asset are illustrated in Figure 4. It emerges that both the number of daily metaorders $N$ and the risk measures typically vary over several orders of magnitude. In particular, as evident from the left panel in Fig. 4, there is a significant number of metaorders active every day, since on average $\sim 5$ […]. The risk $\mathcal{R}_i$ exchanged by a metaorder and the total daily exchanged risk $\mathcal{R}$ vary over almost eight decades. Note that these statistical properties are approximately independent of the time period and of the economic sector of the asset exchanged through metaorders.

Figure 4: (Left panel) Empirical probability distribution of the daily number $N$ of metaorders per asset: $N$ is broadly distributed over two decades with an average close to 5. (Right panel) Empirical probability distributions of the exchanged risk per metaorder, i.e. $\mathcal{R}_i := \sigma_d v_i p_i$, and of the total daily risk per day/asset, i.e. $\mathcal{R} := \sum_{i=1}^{N} \mathcal{R}_i$.

B The 3/2-law under the microscope

One may rightfully wonder whether it is possible to understand the 3/2-law from the properties of the distribution of the metaorder risk $\mathcal{R}_i$ as a function of $N$. We find that when rescaling the metaorder's risk $\mathcal{R}_i$ by the square root of the number $N$ of daily metaorders per asset one obtains a conditional cumulative distribution $P(\mathcal{R}_i/\sqrt{N} \,|\, N)$ that depends on $N$ but has a mean $E[\mathcal{R}_i/\sqrt{N} \,|\, N]$ independent of $N$ (see Fig. 5). It emerges then that the conditional average metaorder risk can be predicted from the number $N$ of daily metaorders per asset, since $E[\mathcal{R}_i | N]$ scales as $N^{\gamma}$ with $\gamma \simeq 0.5$, that is $E[\mathcal{R}_i | N] \sim \sqrt{N}$. Combining this empirical result with the linearity of the mean, one immediately recovers the 3/2-law, $E[\mathcal{R} | N] \sim N^{3/2}$, since:

$$ E[\mathcal{R} | N] = E\left[ \sum_{i=1}^{N} \mathcal{R}_i \,\Big|\, N \right] = \sum_{i=1}^{N} E[\mathcal{R}_i | N] = N\, E[\mathcal{R}_i | N] \sim N \sqrt{N} = N^{3/2}. \qquad (9) $$

Footnote: In analogy, the variance $V[\mathcal{R}_i | N]$ scales linearly with $N$, i.e. $V[\mathcal{R}_i | N] \approx E[\mathcal{R}_i | N]^2$.

Figure 5: Empirical cumulative distribution of the traded metaorder's risk $\mathcal{R}_i = \sigma_d v_i p_i$ without (left panel) and with (right panel) rescaling by the square root of the daily number $N$ of metaorders per asset, for $N = 1, 5, 10, \ldots, 50$. The colored vertical lines represent the location of the average for each sample conditional on $N$. Note that even if the empirical distribution is not an invariant function of $N$, we observe that $E[\mathcal{R}_i/\sqrt{N} \,|\, N] \simeq$ const., as evident from the vertical lines in the right panel, which is at the origin of the measured 3/2-law.

To explain the scaling $E[\mathcal{R}_i | N] \sim \sqrt{N}$ through the product $E[\sigma_d | N] \times E[v_i p_i | N]$ we need to check for the correlation between the daily volatility $\sigma_d$ and the volume in dollars $v_i p_i$ of a metaorder, which is found to be $\langle C(\sigma_d, v_i p_i) \rangle \approx$ […] $\times 10^{-[…]}$. For each stock we regress $\mathcal{R}_i \sim N^{\gamma}$, $\sigma_d \sim N^{\nu}$, $v_i p_i \sim N^{\delta}$, and we obtain from the empirical distributions of the exponents in Fig. 6 that their average values read $\langle \gamma \rangle = 0.$[…], $\langle \nu \rangle = 0.25$ and $\langle \delta \rangle = 0.20$, thus $\langle \gamma \rangle \neq \langle \nu \rangle + \langle \delta \rangle$. However, by looking at the scatter plot of the estimated exponent $\gamma$ as function of the sum $\nu + \delta$ computed separately for each stock (see bottom right panel in Fig. 6) one observes a clear linear relation. A possible and intuitive explanation of the non-null measured correlation between $\sigma_d$ and $v_i p_i$ is that metaorders add up to volume, generate market impact and thus increase price volatility. In this way trading volume increases due to both an increase in the number of bets and in their sizes, and so does volatility from the increased market impact, as discussed for example in [17]. Note that this reasoning is valid even if the metaorders only account for a certain percentage of the total daily market volume, $V = \sum_{i=1}^{N} v_i = \eta V_d$, with $\eta$ adjusting for the partial view of the ANcerno sample in terms of volume, and for the non-bet volume traded by intermediaries: from our dataset we measure on average $\langle \eta \rangle \approx$ […] $\times 10^{-[…]}$.

C Statistics of trading costs

As expected, we find that, for a single bet with unsigned volume $v$, the spread cost $c_{\rm spd} = S \times v$ is dominant for small volumes, while the market impact cost $c_{\rm imp} = \sigma_d \times v p \times \sqrt{v/V_d}$ takes over for large volumes (see left panel of Fig. 7). Furthermore, as shown in the right panel of Fig. 7, the average daily market impact cost $\mathcal{C}_{\rm imp}$ accounts on average for about half of the total daily cost $\mathcal{C} = \mathcal{C}_{\rm spd} + \mathcal{C}_{\rm imp}$, computed using $Y_{\rm spd} = 3.$[…] and $Y_{\rm imp} = 1.5$.

Figure 6: (Top left panel) Empirical distribution of the scaling exponent $\nu$ computed for each stock regressing $\sigma_d \sim N^{\nu}$: on average $\langle \nu \rangle = 0.25$, as shown by the dashed black line. (Top right panel) Empirical distribution of the scaling exponent $\delta$ computed for each stock regressing $v_i p_i \sim N^{\delta}$: on average $\langle \delta \rangle = 0.20$, as shown by the dashed black line. (Bottom left panel) Empirical distribution of the scaling exponent $\gamma$ computed for each stock regressing $\mathcal{R}_i \sim N^{\gamma}$: on average $\langle \gamma \rangle = 0.$[…]. (Bottom right panel) Scatter plot of $\nu + \delta$ and $\gamma$, respectively estimated conditioning on each stock, together with the line $\gamma = \nu + \delta$.

Figure 7: (Left panel) Averaged spread and market impact cost ratios given respectively by $c_{\rm spd}/c$ and $c_{\rm imp}/c$, with $c_{\rm spd} = S \times v$ (spread cost), $c_{\rm imp} = \sigma_d \times v p \times \sqrt{v/V_d}$ (market impact cost) and $c = c_{\rm spd} + c_{\rm imp}$ (total cost per bet), as function of the metaorder's order size $v/V_d$: note that for a bet with small (large) order size the spread (market impact) cost is dominant. (Right panel) Empirical distributions of the $\mathcal{C}_{\rm spd}/\mathcal{C}$ and $\mathcal{C}_{\rm imp}/\mathcal{C}$ ratios, which give an idea of the order of magnitude of the different contributions to the total daily average cost per bet $\mathcal{C} = \mathcal{C}_{\rm spd} + \mathcal{C}_{\rm imp}$ (computed from Eq. (4) fixing $Y_{\rm spd} = 3.$[…] and $Y_{\rm imp} = 1.5$): the dashed vertical lines represent the location of the mean values, equal respectively to $\langle \mathcal{C}_{\rm spd}/\mathcal{C} \rangle = 0.49$ and $\langle \mathcal{C}_{\rm imp}/\mathcal{C} \rangle = 0.51$.
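The crossover between spread-dominated and impact-dominated bets described in the left panel of Fig. 7 follows directly from the two per-bet cost definitions. The sketch below is a minimal illustration; the function name and the parameter values (spread, volatility, price) are hypothetical and are not estimates from the data.

```python
import numpy as np

def cost_shares(phi, spread, sigma_d, price):
    """Spread and impact shares of the per-bet cost c = c_spd + c_imp.

    phi is the order size relative to the daily market volume, v / V_d.
    Both costs are proportional to v, so v itself cancels in the shares.
    """
    phi = np.asarray(phi, dtype=float)
    spd_per_share = spread                            # c_spd / v = S
    imp_per_share = sigma_d * price * np.sqrt(phi)    # c_imp / v
    total = spd_per_share + imp_per_share
    return spd_per_share / total, imp_per_share / total

# Hypothetical stock: 2-cent spread, 2% daily volatility, 40-dollar price.
S, sigma_d, p = 0.02, 0.02, 40.0

# Relative size at which the two costs are equal: S = sigma_d * p * sqrt(phi).
phi_star = (S / (sigma_d * p)) ** 2

phi = np.array([1e-5, phi_star, 1e-1])
spd, imp = cost_shares(phi, S, sigma_d, p)
print(phi_star, spd, imp)
```

Below `phi_star` the spread share exceeds one half, above it the impact share does, which is the qualitative behaviour reported in the left panel of Fig. 7.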