How to build a cross-impact model from first principles: Theoretical requirements and empirical results
Mehdi Tomas, Iacopo Mastromatteo, and Michael Benzaquen
LadHyX UMR CNRS 7646, Ecole Polytechnique, 91128 Palaiseau Cedex, France
CMAP UMR CNRS 7641, Ecole Polytechnique, 91128 Palaiseau Cedex, France
Chair of Econophysics & Complex Systems, Ecole Polytechnique, 91128 Palaiseau Cedex, France
Capital Fund Management, 23-25, Rue de l’Université 75007 Paris, France
April 6, 2020
Abstract
Cross-impact, namely the fact that on average buy (sell) trades on a financial instrument induce positive (negative) price changes in other correlated assets, can be measured from abundant, although noisy, market data. In this paper we propose a principled approach that allows one to perform model selection for cross-impact models, showing that symmetries and consistency requirements are particularly effective in reducing the universe of possible models to a much smaller set of viable candidates, thus mitigating the effect of noise on the properties of the inferred model. We review the empirical performance of a large number of cross-impact models, comparing their strengths and weaknesses on a number of asset classes (futures, stocks, calendar spreads). Besides showing which models perform better, we argue that in the presence of comparable statistical performance, which is often the case in a noisy world, it is relevant to favor models that provide ex-ante theoretical guarantees on their behavior in limit cases. From this perspective, we advocate that the empirical validation of universal properties (symmetries, invariances) should be regarded as holding a much deeper epistemological value than any measure of statistical performance on specific model instances.
Contents
[...] transformation
3.6 Relations between axioms and models
4.5 Robustness
4.5.1 Overfitting and non stationarity in predictive power
4.5.2 Influence of the number of instruments and of the timescale of study
A.1 Relation between fragmentation and liquidity axioms
A.2 Proof of Proposition 3
A.3 Proof of Proposition 4
A.4 Proof of some properties of the kyle model
B Proofs of illustrative examples
B.1 Proof of Example 1
B.2 Proof of Example 2
The fact that the prices of financial instruments can be moved by trading pressure in the order flow is now a well-established phenomenon, known as Market Impact (see e.g. [2, 4, 19]). In fact, the shape of market impact has been measured in a large number of studies, and displays a structure that is surprisingly robust across assets, time periods and markets. Cross-impact is a multivariate analogue of this market impact, denoting the impact of trading flows from one asset on the price of other, potentially correlated assets. Despite the evidence that cross-impact allows the transmission of information across markets, amplifying shocks in the presence of crashes and aggregating liquidity across market venues, the number of empirical results concerning this phenomenon is much more limited (see e.g. [11, 16]). As this is a relatively new area of research, empirical studies of cross-impact can be divided into two main groups. On the one hand, one finds non-parametric studies focusing on a small number of instruments, and on the other, parametric studies where heuristically-derived models are calibrated on data. Both parametric and non-parametric studies have provided strong evidence for cross-impact. Aside from empirical studies, a number of papers have touched on the subject of cross-impact through other topics, for example via optimal control problems.

We first focus on non-parametric studies. Wang et al. [21, 22] analyzed stocks and used empirical cross-responses between trade signs and prices to build propagator models for cross-impact, using only the direction of the trade (and therefore neglecting the influence of volumes). In [20], the authors define an entropy for impact between stocks: when this coefficient is high, there should be no cross-impact. To study the structure of cross-impact, they build a network which connects assets when the entropy coefficient is low. Thus, a stock in this network with a high ingoing connectivity is affected by trading of many other stocks.
On the other hand, trading an asset with a high outgoing connectivity impacts the prices of many other assets. This analysis reveals that the least liquid stocks have high ingoing connectivity and low outgoing connectivity while very liquid stocks have low ingoing connectivity and high outgoing connectivity. In short, the prices of illiquid stocks are driven by the prices of liquid stocks. The paper [17], by contrast, considered bond markets which, given their higher pairwise correlation, one may expect to show significant cross-impact. Using a similar approach, they fit a propagator model and find strong cross-impact as well as small, but statistically significant, asymmetries.

In [3], the authors fit a different propagator model on stocks. In particular, they find that parametric, factorised models display better performance and are less prone to overfitting than non-parametric alternatives, which gives a strong practical argument for the use of such models. They also found that the first eigenvectors of the stocks return covariance and order flow covariance matrices are roughly aligned: thus the approximation that they commute is partly justified. Naturally this is not always true and it is interesting to know how these models perform when this assumption is violated.

Furthermore, while previous empirical studies yield insight into the shape of one cross-impact model on a given asset class, to our knowledge there is no study that compares different cross-impact models on a variety of markets. Therefore, we are unable to say whether there exists a universal, statistically accurate cross-impact model, or even less ambitiously which properties a cross-impact model should respect in order not to be empirically rejected.

On the other hand, several theoretical studies have attempted to reduce the universe of possible cross-impact models by constraining their acceptable outcomes.
Multivariate propagator-like models were studied in [1] and only a specific class of kernels was shown to satisfy no-arbitrage principles. In particular, the authors showed that a necessary but not sufficient condition for no-arbitrage to hold is that the propagator kernels are positive definite, which implies that non-parametric estimates with asymmetric kernels can be arbitraged (in the absence of linear costs). However, this condition is not sufficiently restrictive to prescribe specific cross-impact models. Interestingly, the non-parametric study [17] calibrates a propagator model on bonds and finds statistically significant asymmetries at different estimation scales (weekly, bi-weekly and monthly) and conservative confidence levels.

If the no-arbitrage property alone is insufficient to determine the structure of cross-impact, one may look at how cross-impact arises in other areas. Two of them are particularly interesting: optimal execution and market making.

In the mean-field framework for optimal execution of [13] (an extension of [6]), a continuum of risk-averse investors seek to acquire a portfolio under a common time horizon. The authors assume that the price of asset $i$ changes linearly with the signed traded volume on asset $i$, which is the sum of the signed traded volume of all investors. While cross-impact is not directly modeled, it appears through the following mechanism. Even though investors are wary of pushing asset prices because it increases their costs, their risk-aversion incentivises them to hedge their target position portfolio by temporarily trading other correlated assets.
For example, if an investor wants to acquire 1000 units of asset $i$ which is negatively correlated to asset $j$, his optimal execution schedule will involve, in turn:
(i) buying most of his position on asset $i$ while selling asset $j$;
(ii) slowing down his trading speed of asset $i$ and buying back asset $j$ to end the trading session with an inventory of 1000 units of asset $i$ and 0 units of asset $j$.
This point of view interprets cross-impact as the trading pressure generated by opportunistic investors with target positions in correlated financial assets. While this approach provides one explanation of the many possible phenomena underlying cross-impact, it does not provide a recipe one may use on empirical data to estimate the impact induced on asset $i$ by trading asset $j$.

We now turn to the study of cross-impact in the optimal market making literature. In the setting of [10], given an asset dynamics a market maker places quotes away from the mid price, seeking to maximise its expected utility at a terminal date. The probability of the limit orders being executed against an incoming market order depends on the distance to the mid-price. Incoming orders are then modeled as independent Poisson processes. One key assumption is thus that the signed traded order flow of two assets is uncorrelated, contrary to intuition and empirical evidence (see for example [3, 17, 20, 21, 22]). Nevertheless, studying a simplified version of the partial differential equation ruling the behaviour of the utility function of the market maker (the link between the solution of the original PDE and simplified PDE is not clear, see [10]), the authors of [8] find that the corrective term to the mark-to-market value of the assets in the utility of the market maker when he holds an inventory $\mathbf{q}$ is of the form $-\mathbf{q}^\top \Lambda \mathbf{q}$, where:
$$\Lambda = D^{-1/2} \sqrt{D^{1/2}\, \Sigma\, D^{1/2}}\; D^{-1/2},$$
with $\Sigma$ the instantaneous return covariance matrix and $D$ playing the role of the typical change of order flow fluctuations around the market-maker’s posts.
Thus, the risk-averse market maker effectively adjusts the mark-to-market price of the assets to include the estimated liquidation costs of his portfolio. Since we can estimate $D$ and $\Sigma$ on empirical data, we can test this cross-impact model in practice.

While previous works on cross-impact introduced a given model that satisfied some convenient properties (such as [3]), used a specific theoretical framework to derive a model (for example [7]), or focused purely on empirical data (see for example [17, 22]), we choose a different approach. In this paper, a cross-impact model is instead simply a function of empirical observables. In place of looking for a single model, we look for reasonable constraints, or axioms, which we would like models to satisfy. Data being scarce, the question that we ask ourselves is: what should a cross-impact model look like in order to be reasonable?

In particular, we introduce symmetry axioms which we expect any model to satisfy (when abstracting away microstructural effects), fragmentation axioms (generalized from a notion first discussed in [3]), which guarantee suitable behaviour of models when some instruments (or linear combinations of instruments) have very small fluctuations, and finally stability axioms to control the impact of trading a basket containing both liquid and illiquid instruments. We further establish links between fragmentation and stability axioms and show that, if a model satisfies all symmetry axioms, then it does not display arbitrages in the sense of [1]. These axioms enable us to classify models previously introduced in the literature and give perspective on which may work best in a given scenario.

To test these prescriptions, we apply a variety of cross-impact models to different markets and confirm which axioms are critical to explain empirical observations.
We also find that performance of cross-impact models decreases rapidly with the timescale of aggregation of the order flow, increases with the size of the universe of tradeable instruments and generally with the pairwise correlation between instruments.

The paper is organized as follows. Section 2 introduces some notation in force throughout the paper, Section 3 lays down axioms and models, highlighting which axiom each model satisfies. Section 4 presents the calibration results of our zoology of models on different markets. We conclude on the contributions of the paper, remaining open questions and directions for future work in Section 5.

Throughout this paper, we write scalars in roman lower cases, vectors in bold lower cases and matrices in roman upper cases. The set of $n$ by $n$ real-valued square matrices is denoted by $\mathcal{M}_n$, the set of orthogonal matrices by $\mathcal{O}_n$, the set of real symmetric positive semi-definite matrices by $\mathcal{S}^+_n$ and the set of real symmetric positive definite matrices by $\mathcal{S}^{++}_n$. Further, given a matrix $A$ in $\mathcal{M}_n$, $A^\top$ denotes its transpose. Given $A$ in $\mathcal{S}^+_n$, we write $A^{1/2}$ for a matrix such that $A^{1/2} (A^{1/2})^\top = A$ and $\sqrt{A}$ for the matrix square root, the unique positive semi-definite symmetric matrix such that $(\sqrt{A})^2 = A$. We write $\ker(M)$ for the null space of a matrix $M \in \mathcal{M}_n$, $\Pi_V$ for the projector on a linear subspace $V \subseteq \mathbb{R}^n$ and $\bar{\Pi}_V = \mathbb{I} - \Pi_V$ for the complementary orthogonal projector. Finally, given a vector $\mathbf{v} \in \mathbb{R}^n$, we write $\mathbf{v} = (v_1, \ldots, v_n)$ and $\mathrm{diag}(\mathbf{v})$ for the diagonal matrix whose diagonal components are the components of $\mathbf{v}$.

We are interested in constructing a theory that is able to associate a vector of predicted price changes $\Delta \mathbf{p}_t = \mathbf{p}_{t+\Delta t} - \mathbf{p}_t = (\Delta p_{1,t}, \ldots, \Delta p_{n,t})$ to a signed order flow imbalance $\mathbf{q}_t = (q_{1,t}, \ldots, q_{n,t})$ measured in a time interval $\Delta t$ on a universe of $n$ financial instruments.
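As a minimal sketch of how the two observables above could be assembled from raw data, the snippet below bins trades into intervals of length $\Delta t$. The function name `bin_returns_and_flows`, the input layout (prices sampled exactly at the bin edges, trades given as time, asset index and signed volume) and the sign convention (buys positive, sells negative) are assumptions made for illustration only; they are not prescribed by the text.

```python
import numpy as np

def bin_returns_and_flows(prices, trade_times, trade_assets, trade_volumes, dt):
    """Build price changes and signed order flow imbalances per time bin.

    prices        : array (T + 1, n), prices sampled at times 0, dt, 2*dt, ..., T*dt
    trade_times   : trade timestamps, assumed to lie in [0, T*dt)
    trade_assets  : integer asset index of each trade
    trade_volumes : signed volume of each trade (assumed: buy > 0, sell < 0)
    """
    prices = np.asarray(prices, dtype=float)
    n_bins, n_assets = prices.shape[0] - 1, prices.shape[1]

    # Delta p_t = p_{t + dt} - p_t, one row per bin.
    dp = np.diff(prices, axis=0)

    # q_t: sum of the signed volumes traded on each asset within each bin.
    bins = np.floor(np.asarray(trade_times, dtype=float) / dt).astype(int)
    q = np.zeros((n_bins, n_assets))
    np.add.at(q, (bins, np.asarray(trade_assets)),
              np.asarray(trade_volumes, dtype=float))
    return dp, q

# Toy data: two assets, two bins of length dt = 1.
prices = np.array([[100.0, 50.0], [101.0, 50.0], [100.5, 49.0]])
dp, q = bin_returns_and_flows(
    prices,
    trade_times=[0.2, 0.7, 1.5],
    trade_assets=[0, 1, 0],
    trade_volumes=[10.0, -5.0, -3.0],
    dt=1.0,
)
```

`np.add.at` is used rather than fancy-index assignment so that several trades falling in the same bin on the same asset are accumulated instead of overwriting each other.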
In order to ensure mathematical tractability of our construction, we focus on the simplest possible scenario, by assuming that (i) the relation between the order flow imbalance $\mathbf{q}_t$ and the price change $\Delta \mathbf{p}_t$ is linear, and (ii) the dependence upon past imbalances $\mathbf{q}_{t-1}, \mathbf{q}_{t-2}, \ldots$ is disregarded. This choice is motivated mainly by the desire to focus on the most prominent cross-sectional features of these impact models, without overemphasizing the rich structure of the dependence of $\Delta \mathbf{p}_t$ on the magnitude of the components of $\mathbf{q}_t$, nor its temporal dynamics. Hence, we will drop the time subscript from both price changes and imbalances from now on. The interested reader is referred to [3, 17, 20] for more general approaches to this problem. In our stylized setting, the relation between prices and order flows can thus be written as:
$$\Delta \mathbf{p} = \Lambda \mathbf{q} + \boldsymbol{\eta}, \tag{1}$$
where $\Lambda$ is the cross-impact matrix, and $\boldsymbol{\eta} = (\eta_1, \ldots, \eta_n)$ is a vector of zero-mean random variables independent of $\mathbf{q}$. One of the main reasons why practitioners are interested in models of this form is that Eq. (1) allows one to build a predictive theory of impact costs resulting from the execution of a series of trades of size $\mathbf{q}$. In particular, if one assumes that the difference between the arrival price and the execution price is given by $\Delta \mathbf{p}$, the cost incurred after the execution of a vector of trades $\mathbf{q}$ can be written as:
$$C(\mathbf{q}) = \mathbf{q}^\top \Delta \mathbf{p} = \mathbf{q}^\top \Lambda \mathbf{q} + \mathbf{q}^\top \boldsymbol{\eta}. \tag{2}$$
Hence, our linear price impact model Eq. (1) induces quadratic impact costs, the average impact cost being given by $\mathbb{E}[C(\mathbf{q})] = \mathbf{q}^\top \Lambda \mathbf{q}$, so that $\Lambda$ quantifies how expensive the trading is on average due to the reaction of the market to the traded flow $\mathbf{q}$.

One of the main purposes of this paper is to present a series of prescriptions that can be used to choose the most suitable estimator for the impact matrix $\Lambda$ given a set of empirical observations of market data.
The main problem lies in the large number of possible combinations of variables that can be used in order to build $\Lambda$; one would like to have a principled manner to perform model selection.

2.2 Covariances and responses

If one were to consider the traded flows $\mathbf{q}$ and the price changes $\Delta \mathbf{p}$ as zero-mean Gaussian variables, then it would be very natural to assume that covariances of such observables are sufficient statistics, meaning that no matter what observable one builds, it can always be written as a function of the covariance of such quantities. This leads to the following definition.

Definition 1 (Price and order flow covariances). We define respectively as return covariance, order flow covariance and response the quantities:
$$\Sigma := \mathbb{E}[\Delta \mathbf{p}\, \Delta \mathbf{p}^\top], \qquad \Omega := \mathbb{E}[\mathbf{q}\, \mathbf{q}^\top], \qquad R := \mathbb{E}[\Delta \mathbf{p}\, \mathbf{q}^\top]. \tag{3}$$

These quantities appear very naturally in the context of market microstructure, as they capture simple features of the coupled dynamics of prices and order flows. While $\Sigma$ quantifies the co-variation of prices, $\Omega$ captures co-trading of different assets, and $R$ reflects the average change of asset prices with traded order flow.

Though we will not assume price variations nor order flows to be Gaussian random variables, we work under the assumption that the cross-impact matrix $\Lambda$ can be expressed solely as a function of $\Sigma$, $\Omega$ and $R$. Whereas this is not restrictive in a Gaussian, zero-mean world, in a non-Gaussian context it can be justified as a modeling assumption that captures a large number of distinctive features of the price impact phenomenology while retaining a large degree of simplicity. For convenience, we denote the price volatility by $\boldsymbol{\sigma} := (\sqrt{\Sigma_{ii}})_{1 \le i \le n}$, the signed order flow volatility by $\boldsymbol{\omega} := (\sqrt{\Omega_{ii}})_{1 \le i \le n}$, and the price and flow correlations by $\rho := \mathrm{diag}(\boldsymbol{\sigma})^{-1}\, \Sigma\, \mathrm{diag}(\boldsymbol{\sigma})^{-1}$ and $\rho_\Omega := \mathrm{diag}(\boldsymbol{\omega})^{-1}\, \Omega\, \mathrm{diag}(\boldsymbol{\omega})^{-1}$.
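The quantities of Definition 1 can be estimated from samples with a few lines of NumPy. The sketch below is illustrative only: the helper names, the synthetic Gaussian data, and the use of plain sample averages under a zero-mean assumption are choices made here, not taken from the text. The last line relies on the observation that, under Eq. (1) with $\boldsymbol{\eta}$ independent of $\mathbf{q}$, one has $R = \mathbb{E}[(\Lambda \mathbf{q} + \boldsymbol{\eta})\mathbf{q}^\top] = \Lambda \Omega$, so that $R\,\Omega^{-1}$ recovers the matrix used to simulate the data; it is shown as a sanity check and not as a recommended estimator.

```python
import numpy as np

def second_order_stats(dp, q):
    """Sample versions of Sigma, Omega and R in Eq. (3).

    dp, q : arrays of shape (T, n), assumed zero-mean (no centering is applied).
    """
    T = dp.shape[0]
    Sigma = dp.T @ dp / T   # E[dp dp^T], return covariance
    Omega = q.T @ q / T     # E[q q^T], order flow covariance
    R = dp.T @ q / T        # E[dp q^T], response
    return Sigma, Omega, R

def vol_and_corr(C):
    """Volatilities sqrt(C_ii) and the associated correlation matrix."""
    s = np.sqrt(np.diag(C))
    return s, C / np.outer(s, s)

# Synthetic data following Eq. (1); every number below is a hypothetical choice.
rng = np.random.default_rng(0)
T, n = 200_000, 3
Omega_true = np.array([[1.0, 0.3, 0.0],
                       [0.3, 2.0, 0.5],
                       [0.0, 0.5, 1.5]])
Lambda_true = np.array([[0.50, 0.10, 0.00],
                        [0.10, 0.40, 0.05],
                        [0.00, 0.05, 0.30]])
q = rng.multivariate_normal(np.zeros(n), Omega_true, size=T)
eta = rng.normal(scale=0.2, size=(T, n))   # noise independent of q
dp = q @ Lambda_true.T + eta               # row t equals (Lambda q_t + eta_t)^T

Sigma, Omega, R = second_order_stats(dp, q)
sigma, rho = vol_and_corr(Sigma)           # price volatility and correlation
omega, rho_Omega = vol_and_corr(Omega)     # flow volatility and correlation

# Under Eq. (1), R = Lambda Omega, hence R Omega^{-1} should be close to Lambda_true.
Lambda_hat = R @ np.linalg.inv(Omega)
```

With real data the zero-mean assumption would need to be checked, and the inversion of $\Omega$ requires it to be well conditioned, which is in line with the requirement $\Omega \in \mathcal{S}^{++}_n$.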
Though the price volatility $\boldsymbol{\sigma}$ is a familiar quantity, it is worth commenting on the meaning of $\boldsymbol{\omega}$. As the average of the signed order flow is $\mathbb{E}[\mathbf{q}] = 0$, $\boldsymbol{\omega}$ quantifies the fluctuations of the traded order flow and will thus be used (and often referred to) as a proxy for liquidity. For example, if asset $i$ is not traded, then $\omega_i = 0$. These considerations motivate the following definition.
Definition 2 (Cross-impact model). A linear, single period cross-impact model is a function $\Lambda$ of the form
$$\Lambda : \mathcal{S}^+_n \times \mathcal{S}^{++}_n \times \mathcal{M}_n \to \mathcal{M}_n, \qquad (\Sigma, \Omega, R) \mapsto \Lambda(\Sigma, \Omega, R),$$
where we recall that $\mathcal{S}^+_n$ is the space of real, positive semi-definite $n \times n$ matrices, $\mathcal{S}^{++}_n$ is the space of real, strictly positive definite matrices and $\mathcal{M}_n$ the space of $n \times n$ real matrices.

Even though writing down the impact matrix $\Lambda$ as a function $\Lambda(\Sigma, \Omega, R)$ restricts the possible choices that can be made for modeling it, one still has a large number of degrees of freedom to choose from. This is why we propose an axiomatic approach to the calibration of cross-impact models: instead of comparing models only on the basis of statistical performance, we would like to control ex ante which properties they satisfy. There are three reasons to do this. First, for practical applications it is often preferable to establish theoretical guarantees about the properties satisfied by a cross-impact model. Second, the risk of overfitting the data is considerably reduced when the space of possible models is restricted. And third, validating or falsifying a generic property of a class of models (symmetry, invariance) typically has larger scientific value than validating or rejecting a specific instance of a model, for which implementation details might hinder universality.

The first properties we review involve the dimensional consistency of the models. First, the ordering used to compute the covariance matrices should be immaterial.
Axiom 1 (Permutational invariance). A cross-impact model $\Lambda$ is permutational-invariant if, for any permutation matrix $P$ and $(\Sigma, \Omega, R) \in (\mathcal{S}^+_n \times \mathcal{S}^{++}_n \times \mathcal{M}_n)$,
$$\Lambda(P \Sigma P^\top, P \Omega P^\top, P R P^\top) = P\, \Lambda(\Sigma, \Omega, R)\, P^\top.$$

When the second order statistics are all diagonal, we expect price changes and order flows between distinct assets to be independent in the Gaussian case. Thus, the cross-impact model should respect the independence between assets, which motivates the following axiom.

Axiom 2 (Direct invariance). A cross-impact model $\Lambda$ is direct-invariant if, for any $\boldsymbol{\sigma}, \boldsymbol{\omega} \in \mathbb{R}^n_+$ and $\mathbf{r} \in \mathbb{R}^n$,
$$\Lambda(\mathrm{diag}(\boldsymbol{\sigma}), \mathrm{diag}(\boldsymbol{\omega}), \mathrm{diag}(\mathbf{r})) = \sum_{i=1}^{n} \Lambda(\sigma_i \mathbf{e}_i \mathbf{e}_i^\top, \omega_i \mathbf{e}_i \mathbf{e}_i^\top, r_i \mathbf{e}_i \mathbf{e}_i^\top),$$
where $\mathbf{e}_i$ is the $i$-th element of the canonical basis.

Furthermore, given that predicted costs are expressed in units of currency, one would expect the currency unit used to express the costs to be immaterial. The next axiom translates this property.

Axiom 3 (Cash-invariance). A cross-impact model $\Lambda$ is cash-invariant if, for any $\alpha > 0$ and $(\Sigma, \Omega, R) \in (\mathcal{S}^+_n \times \mathcal{S}^{++}_n \times \mathcal{M}_n)$,
$$\Lambda(\alpha^2 \Sigma, \Omega, \alpha R) = \alpha\, \Lambda(\Sigma, \Omega, R).$$

Similarly, cross-impact models should account for changes in volume units. In equities, one might have stock splits: a company can double the number of outstanding shares and halve their values, though one does not expect the long-term behavior of the system to be affected by this change. This leads to the following axiom.

Axiom 4 (Split invariance).
A cross-impact model $\Lambda$ is split-invariant if, for any diagonal matrix of positive elements $D \in \mathcal{M}_n$ and $(\Sigma, \Omega, R) \in (\mathcal{S}^+_n \times \mathcal{S}^{++}_n \times \mathcal{M}_n)$,
$$\Lambda(D^{-1} \Sigma D^{-1}, D \Omega D, D^{-1} R D) = D^{-1}\, \Lambda(\Sigma, \Omega, R)\, D^{-1}.$$

Split invariance guarantees the natural property that impact should adapt to any stock splits, setting aside microstructural effects (tick size effects, lot rounding, etc.).

Another reasonable characteristic is the invariance of the impact model under orthogonal transformations: given that the profit and loss of traders is invariant under this type of transformation (as is the case in Eq. (2)), it is natural to inquire whether this property is also shared by the corresponding impact model. Again, this abstracts away microstructural effects such as trading fees, bid-ask spread, etc. The following axiom introduces this property.

Axiom 5 (Rotational invariance). A cross-impact model $\Lambda$ is rotation invariant if, for any real orthogonal matrix $O \in \mathcal{O}_n$ and $(\Sigma, \Omega, R) \in (\mathcal{S}^+_n \times \mathcal{S}^{++}_n \times \mathcal{M}_n)$,
$$\Lambda(O \Sigma O^\top, O \Omega O^\top, O R O^\top) = O\, \Lambda(\Sigma, \Omega, R)\, O^\top.$$

We say of a model which does not satisfy Axiom 5 that it has a privileged basis. Note that any cross-impact model which satisfies Axioms 4 and 5 is invariant under the action of any non-singular matrix $M$.

The next family of axioms clarifies what properties a cross-impact model should satisfy for costs to be positive on average, or equivalently not to admit any manipulation strategy, in the sense of [9]. The first axiom involves the static arbitrages that it would be possible to exploit in our single-period model if the cost of trading a portfolio of $\mathbf{q}$ units, $C(\mathbf{q}) = \mathbf{q}^\top \Lambda \mathbf{q}$, was negative along some direction.

Axiom 6 (Positive semi-definiteness). The cross-impact model $\Lambda$ takes values in the space of positive semi-definite matrices.

The second axiom that we consider involves dynamic arbitrages in the spirit of [1, 9].
Even though these arbitrages cannot be exploited in our single-period setup, they would emerge by generalizing our setup to the multi-time step setting such as in [17], which is why we choose to also consider this class of arbitrages.
Axiom 7 (Symmetry). The cross-impact model $\Lambda$ takes values in the space of symmetric matrices.

Axioms 6 and 7 together are sufficient to guarantee absence of arbitrages for both linear single-period models ($C(\mathbf{q}) = \mathbf{q}^\top \Lambda \mathbf{q}$) and linear multi-period models with factorized kernel ($C(\mathbf{q}) = \sum_{t, t'} \phi(t - t')\, \mathbf{q}_t^\top \Lambda\, \mathbf{q}_{t'}$): in these cases trading any portfolio will induce a positive cost on average.

3.3.3 Fragmentation

While the previous axioms focused on ruling out strategies with average negative costs, another related issue is the impact of trading assets which have constant prices. Indeed, we expect that if an instrument is a linear combination of other instruments, the cross-impact induced by trading the spread of each instrument should be zero. For example, take a stock traded on multiple markets (say, Apple traded on the Nasdaq and on the Bats venues): for a reasonably large interval of time $\Delta t$, we expect the price on different venues to be the same, or equivalently we expect that it should be impossible to move the combination $p_{\text{Nasdaq}} - p_{\text{Bats}}$. Thus, buying a volume $q = q_{\text{Nasdaq}} + q_{\text{Bats}}$ of Apple stock should yield the same cost no matter how one fragments the $q_{\text{Nasdaq}}$ units bought on Nasdaq and the $q_{\text{Bats}}$ units bought on Bats. For this reason, this axiom is dubbed fragmentation invariance. We distinguish between three different forms of fragmentation invariance. The first, weak fragmentation invariance, concerns the price changes given by a cross-impact model and is detailed in the next Axiom.
Axiom 8 (Weak fragmentation invariance). The cross-impact model $\Lambda$ is weakly fragmentation invariant if, for any $(\Sigma, \Omega, R) \in (\mathcal{S}^+_n \times \mathcal{S}^{++}_n \times \mathcal{M}_n)$ and $\emptyset \subset V \subseteq \ker \Sigma$,
$$\Pi_V\, \Lambda(\Sigma, \Omega, R) = 0,$$
where we recall that $\Pi_V$ denotes the projector on the linear subspace $V$.

In practice, if a linear combination of prices is assumed not to fluctuate, weak fragmentation invariance guarantees that impact does not produce any volatility in that direction.
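Axiom 8 can be checked numerically on the two-venue example. The sketch below is a hedged illustration: the two candidate models (`regression_model`, which returns $R\,\Omega^{-1}$, and `diagonal_model`, which keeps only $R_{ii}/\Omega_{ii}$) are hypothetical choices introduced here for the purpose of the check, and the numerical values are arbitrary. The response $R$ is built so that both venues share a single price, which makes the spread direction $(1, -1)$ span $\ker \Sigma$.

```python
import numpy as np

def kernel_projector(Sigma, tol=1e-10):
    """Orthogonal projector onto ker(Sigma) for a symmetric PSD matrix."""
    w, U = np.linalg.eigh(Sigma)
    K = U[:, w < tol * max(w.max(), 1.0)]   # eigenvectors with ~zero eigenvalue
    return K @ K.T

def regression_model(Sigma, Omega, R):
    """Hypothetical candidate: Lambda = R Omega^{-1}."""
    return R @ np.linalg.inv(Omega)

def diagonal_model(Sigma, Omega, R):
    """Hypothetical candidate without cross terms: Lambda_ii = R_ii / Omega_ii."""
    return np.diag(np.diag(R) / np.diag(Omega))

def violates_weak_fragmentation(model, Sigma, Omega, R, atol=1e-8):
    """True if Pi_V Lambda(Sigma, Omega, R) != 0 for V = ker(Sigma)."""
    P = kernel_projector(Sigma)
    return not np.allclose(P @ model(Sigma, Omega, R), 0.0, atol=atol)

# Same stock on two venues: identical price changes, distinct order flows.
ones = np.ones(2)
Sigma = 0.04 * np.outer(ones, ones)          # rank one, kernel spanned by (1, -1)
Omega = np.array([[2.0, 0.5],
                  [0.5, 1.0]])               # strictly positive definite
lam = np.array([0.1, 0.1])                   # common price reacts to both flows
R = np.outer(ones, lam @ Omega)              # both rows equal: shared price

regression_violates = violates_weak_fragmentation(regression_model, Sigma, Omega, R)
diagonal_violates = violates_weak_fragmentation(diagonal_model, Sigma, Omega, R)
```

In this toy setting the regression-type candidate predicts identical price moves on both venues, so its projection on the spread direction vanishes, whereas the diagonal candidate predicts venue-specific moves and therefore generates volatility in a direction that does not fluctuate.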
Remark 1.
From now on, we will implicitly assume that $\ker(\Sigma) \subseteq \ker(R^\top)$, which is consistent with the interpretation of $\Sigma$ and $R$ as covariations in the sense of Eq. (3). This implies that from the point of view of the fragmentation-related axioms, any condition involving the kernel of $\Sigma$ will be naturally related to the kernel of $R^\top$ as well.

A stronger condition is obtained if one thinks that the volume $\mathbf{q}$ traded in directions that do not fluctuate does not have influence on the measured impact. This leads to the following Axiom.

Axiom 9 (Semi-strong fragmentation invariance). A cross-impact model satisfies semi-strong fragmentation invariance if, besides satisfying the weak fragmentation invariance Axiom 8, for any $(\Sigma, \Omega, R) \in (\mathcal{S}^+_n \times \mathcal{S}^{++}_n \times \mathcal{M}_n)$ and $\emptyset \subset V \subseteq \ker \Sigma$ one has:
$$\Lambda(\Sigma, \Omega, R)\, \Pi_V = 0.$$

[...] other market members. This is strong fragmentation invariance, the subject of the next Axiom.

Axiom 10 (Strong fragmentation invariance). An impact model $\Lambda$ is strongly fragmentation invariant if, besides satisfying semi-strong fragmentation invariance (Axiom 9), for any $(\Sigma, \Omega, R) \in (\mathcal{S}^+_n \times \mathcal{S}^{++}_n \times \mathcal{M}_n)$ and $\emptyset \subset V \subseteq \ker(\Sigma)$, one additionally has:
$$\Lambda(\Sigma, \Omega, R) = \Lambda(\Sigma, \bar{\Pi}_V \Omega \bar{\Pi}_V, R\, \bar{\Pi}_V).$$

Each of the fragmentation Axioms imposes constraints on the elements of the cross-impact matrix. To summarise, Figure 1 indicates schematically the interplay between all forms of fragmentation invariance.

Figure 1: Summary diagram of fragmentation axioms.
We sketched block-wise the cross-impact matrix $\Lambda(\Sigma, \Omega, R)$, separating assets with zero volatility (top left corner) and other assets (bottom right corner). Weak fragmentation invariance (Axiom 8, in red) stipulates that, along the directions of zero price fluctuations and regardless of the traded volumes, the cross-impact model predicts zero price change.
Semi-strong fragmentation invariance (Axiom 9, in blue) further requires that there is no impact at all from volume traded along directions of zero price fluctuations. Finally, strong fragmentation invariance (Axiom 10, in green) further stipulates that cross-impact in the directions of non-zero price fluctuations is not a function of the volume traded in directions of zero price fluctuations.

3.3.4 Liquidity

A cross-impact model should also have controlled behavior when a set of instruments is considerably less liquid than the rest of the tradable universe. Intuitively, cross-impact models should be such that trading illiquid products does not strongly impact the price of liquid products. We model this by defining a set $V$ of illiquid instruments and by considering the projector $\Pi_V$ on the space of such products. Then we can consider the matrix $\bar{\Pi}_V + \epsilon\, \Pi_V$, which rescales the components in $V$ by a factor $\epsilon \ll 1$, and consider the modified observables:
$$\Sigma' = \Sigma, \qquad \Omega' = (\bar{\Pi}_V + \epsilon\, \Pi_V)\, \Omega\, (\bar{\Pi}_V + \epsilon\, \Pi_V), \qquad R' = R\, (\bar{\Pi}_V + \epsilon\, \Pi_V),$$
that correspond to the covariances that one would have measured if the liquidities of instruments belonging to $V$ were to be multiplied by $\epsilon$. We are now ready to formulate axioms relating to how one expects the system to behave when the illiquid instruments are traded.

Axiom 11 (Weak Cross-Stability). We say that a cross-impact model $\Lambda$ is weakly cross-stable if, for any $(\Sigma, \Omega, R) \in (\mathcal{S}^+_n \times \mathcal{S}^{++}_n \times \mathcal{M}_n)$, subspace $V$ and using the above notations, we have:
$$\bar{\Pi}_V\, \Lambda(\Sigma', \Omega', R')\, \Pi_V \underset{\epsilon \to 0}{=} O(1). \tag{4}$$

This axiom formulates the intuition that the same notional amount traded on illiquid and liquid products should not move the price of the liquid product by a disproportionate amount. A stronger cross-stability property can also be formulated.

Axiom 12 (Strong Cross-Stability).
A cross-impact model $\Lambda$ is strongly cross-stable if, in addition to satisfying weak cross-stability (Axiom 11), for any $(\Sigma, \Omega, R) \in (\mathcal{S}^+_n \times \mathcal{S}^{++}_n \times \mathcal{M}_n)$, subspace $V$ and using the above notations, one has that:
$$\bar{\Pi}_V\, \Lambda(\Sigma', \Omega', R')\, \bar{\Pi}_V \underset{\epsilon \to 0}{\longrightarrow} \bar{\Pi}_V\, \Lambda(\bar{\Pi}_V \Sigma \bar{\Pi}_V, \bar{\Pi}_V \Omega \bar{\Pi}_V, \bar{\Pi}_V R \bar{\Pi}_V)\, \bar{\Pi}_V.$$

This axiom formalizes the intuition that the price moves on a liquid basket of products induced by trading that very same basket should be independent of the behavior of the illiquid products that have not been traded. Finally, an unresolved question is the effect of trading illiquid instruments on illiquid products. The following axiom deals with this issue.
Axiom 13 (Self-stability). A cross-impact model is self-stable if, using the above notations, we have:
$$\Pi_V\, \Lambda(\Sigma', \Omega', R')\, \Pi_V \underset{\epsilon \to 0}{=} O(1). \tag{5}$$

Intuitively this property is less desirable than the previous one: it indicates that, even though a product is illiquid ($q \propto \epsilon$, so that one would expect a diverging impact) the predicted cost of trading such product can be finite, typically because that product is correlated to other more liquid instruments. Figure 2 summarises the differences between weak cross-, strong cross- and self-stability.

Figure 2: Summary diagram of cross stability properties.
We sketched block-wise the cross-impact matrix $\Lambda(\Sigma, \Omega, R)$, separating illiquid assets (top left corner) and other assets (bottom right corner). The weak cross-stability axiom (Axiom 11, in blue) imposes that impact due to trading of illiquid assets is finite. On the other hand, similarly to the strong fragmentation invariance axiom, the strong cross-stability axiom (Axiom 12, in green) imposes that cross-impact in the direction of liquid assets does not depend at all on assets with zero liquidity. Finally self-stability (Axiom 13, in red) constrains impact on illiquid assets due to trading of illiquid assets to go to zero. This is a dangerous property as one would fear trading very illiquid instruments would push their price significantly.

3.3.5 Predicted covariance

Finally, it can be interesting to consider whether a cross-impact model predicts a contribution to the return covariance that is proportional to $\Sigma$ or not.

Axiom 14 (Return covariance consistency). A cross-impact model $\Lambda$ is return covariance consistent if, for any $(\Sigma, \Omega, R) \in (\mathcal{S}^+_n \times \mathcal{S}^{++}_n \times \mathcal{M}_n)$, it satisfies (up to a multiplicative constant):
$$\Sigma = \Lambda(\Sigma, \Omega, R)\, \Omega\, \Lambda(\Sigma, \Omega, R)^\top.$$

This axiom is motivated by the fact that under the model in Eq. (1), we expect return covariances to be given by
$$\Sigma = \mathbb{E}[\Delta \mathbf{p}\, \Delta \mathbf{p}^\top] = \Lambda \Omega \Lambda^\top + \mathbb{E}[\boldsymbol{\eta} \boldsymbol{\eta}^\top],$$
so if one assumes that $\mathbb{E}[\boldsymbol{\eta} \boldsymbol{\eta}^\top] \propto \Sigma$ (fundamental return covariance is proportional to the observed one), one would recover return covariance consistency.

We do not attribute the same level of plausibility to the different axioms listed above.

Concerning the invariance-related axioms, we believe the permutational, direct and cash invariance (i.e. Axioms 1 to 3) to be the most plausible ones, as we do not expect to measure a privileged ordering or price scale in empirical data (tick size effects should be negligible at low enough frequency).
Similarly, split invariance (Axiom 4) holds a large degree of plausibility: this symmetry is only expected to be broken by microstructural effects (e.g., lot size). Abstracting away these details, one expects volumes to appear only upon multiplication by prices. Moreover, given that split invariance plays a big role in the deep link between liquidity-related properties and fragmentation-related properties (see Section 3.6 below), we have a strong a priori in favor of this property. Concerning rotational invariance (Axiom 5), it would not be surprising to find it violated in real markets, given that the physical basis of products is expected to play a privileged role. However, all invariance-related axioms (Axioms 1 to 5) should be particularly relevant in very liquid markets, in which microstructural effects play a limited role and can be abstracted away. Surprisingly, these axioms considerably restrict the set of linear cross-impact models.

Weak fragmentation invariance (Axiom 8) is a critical property of a consistent impact model. By construction, one does not want to predict price changes along directions that do not fluctuate. For analogous reasons, we believe that semi-strong and strong fragmentation invariance (Axioms 9 and 10) should also be of crucial importance in order to construct a consistent cross-impact model.

The arbitrage-related axioms (Axioms 6 and 7) are of great importance in applications, in which one might want to exclude the presence of arbitrages by construction. Still, it is an extremely interesting empirical question to assess whether real markets admit some kind of arbitrage à la [9]. Indeed, even if these axioms were not satisfied in real markets, other mechanisms could prevent statistical arbitrage. For example, transaction costs such as fees or spreads could prevent profitable trading of these arbitrages. Furthermore, it would be surprising to measure a perfectly symmetrical market impact.
Instead, one would expect a small asymmetrical component: as long as it is below a certain threshold related to the scale of the symmetrical component, no statistical arbitrages should be possible.

Liquidity-related axioms control the behaviour of cross-impact models when they are used to evaluate trading costs. We believe weak cross-stability (Axiom 11) to be a fundamental condition: it should be impossible to move the price of liquid assets disproportionately by trading a moderate amount of illiquid products. The stronger version of this axiom (Axiom 12) is also expected to be fulfilled in data: one expects that liquid products should be insensitive to the behavior of illiquid ones, even in the critical case where they are not traded. On the other hand, self-stability (Axiom 13) can be an undesirable property for applications, because it does not penalize the trading of extremely illiquid products.

Return covariance consistency requires that the innovation covariance matrix $\mathbb{E}[\eta \eta^\top]$ is proportional to the return covariance matrix, meaning that the part of the volatility that one can explain with the order flow has the same direction as the total one. There is no ex-ante reason for this to be true, though it is implicitly assumed in the setting of the multivariate Kyle model [7], see below. Actually, this constraint, combined with the no-arbitrage axioms (Axioms 6 and 7), restricts the set of all linear cross-impact models to a single model.

Models

Now that we have characterized the properties that we can expect from a linear cross-impact model, we provide a set of candidate impact models, whose properties and empirical performance we would like to characterize. We divide these models into two classes: the ones that are based on the return covariance $\Sigma$ and the ones based on the response $R$. Let us start with the simplest possible linear impact model: one without cross-impact.
Definition 3 (direct model). The direct model is defined, for any $(\Sigma, \Omega, R) \in (S_n^+ \times S_n^{++} \times M_n)$, as:

$$\Lambda_{\mathrm{direct}}(\Sigma, \Omega, R) := \mathrm{diag}(\sigma) \, \mathrm{diag}(\omega)^{-1}. \tag{6}$$

To generalize this model to the multivariate setting, while respecting cash invariance, weak fragmentation invariance and consistency with correlations, a first idea is to use the matrices $\Sigma^{1/2}$ and $\Omega^{-1/2}$. Since $q \mapsto \Omega^{-1/2} q$ is a whitening transformation, this model is referred to as the whitening model.

Definition 4 (whitening model). Recall that, given $M \in S_n^+$, $M^{1/2}$ indicates a symmetric matrix factorization (i.e., $M^{1/2} (M^{1/2})^\top = M$). The whitening model is defined, for any $(\Sigma, \Omega, R) \in (S_n^+ \times S_n^{++} \times M_n)$, as:

$$\Lambda_{\mathrm{whitening}}(\Sigma, \Omega, R) := \Sigma^{1/2} \, \Omega^{-1/2}. \tag{7}$$

Unfortunately, this estimator does not respect symmetry or positive-definiteness (Axioms 6 and 7), strong fragmentation invariance (Axiom 10) or cross-stability (Axiom 11). To impose symmetry and strong fragmentation invariance, the el model proposed in [14] defines an impact model directly in the basis of the return covariance matrix, and fixes its eigenvalues in order to be consistent with dimensional analysis. This model then assumes by construction $[\Lambda, \Sigma] = 0$, which implies that impact is along the directions of the return covariance matrix.
Definition 5 (el model). The eigenliquidity (el) model is defined, for any $(\Sigma, \Omega, R) \in (S_n^+ \times S_n^{++} \times M_n)$, as

$$\Lambda_{\mathrm{el}}(\Sigma, \Omega, R) = \sum_{a=1}^{n} s_a \sqrt{\frac{\lambda_a}{s_a^\top \Omega \, s_a}} \, s_a^\top, \tag{8}$$

where we have introduced the eigenvalue decomposition $\Sigma := \sum_{a=1}^{n} s_a \lambda_a s_a^\top$.

The el model is cross-stable and self-stable (Axioms 11 to 13) but is not return covariance consistent (Axiom 14). As mentioned above, there is in fact only one model which satisfies all the axioms that we have provided: the so-called multivariate kyle model, see [7].

Definition 6 (kyle model). The kyle model is defined, for any $(\Sigma, \Omega, R) \in (S_n^+ \times S_n^{++} \times M_n)$, as:

$$\Lambda_{\mathrm{kyle}}(\Sigma, \Omega, R) := (\Omega^{-1/2})^\top \sqrt{(\Omega^{1/2})^\top \Sigma \, \Omega^{1/2}} \; \Omega^{-1/2}. \tag{9}$$

The kyle model is extremely similar to the cross-impact mark-to-market adjustment found in [8, 10].

All the models presented above relate the effect of the order flow imbalance on prices to the total return covariance, implicitly assuming that there is no distinction between the directions along which prices respond to liquidity and the ones along which prices fluctuate independently of the order flow. One could therefore expect the response $R = \mathbb{E}[\Delta p \, q^\top]$ to be more informative in selecting the effect of liquidity shocks because, instead of capturing the totality of the price variations, it is only determined by the ones that are aligned with the order flow $q$. This leads to the following class of models.

First, we can define a response-based direct impact model similar to Eq. (6).

Definition 7 (r-direct model). The response direct (r-direct) model is defined, for any $(\Sigma, \Omega, R) \in (S_n^+ \times S_n^{++} \times M_n)$, as:

$$\Lambda_{\text{r-direct}}(\Sigma, \Omega, R) := \mathrm{diag}\big((R_{ii})_{i=1}^{n}\big) \, \mathrm{diag}(\omega)^{-2}.$$

Footnote: The whitening model is not independent of the symmetric factorization chosen for $\Sigma$ and $\Omega$. As a convention, we take the square root obtained from an orthogonal decomposition of each matrix and the square root of its eigenvalues.

Footnote: The model proposed in [14] is actually the response-based one, referred to later as the r-el⋆ model.

[…] $\Lambda$ under the constraint $\Lambda_{ij} = 0$ for $i \neq j$. Hence, it is quite natural that, by trying to replicate the same construction as for the whitening model in the response-based context in order to get a truly multivariate estimator, one obtains the Maximum Likelihood estimator for the matrix $\Lambda$.

Definition 8 (ml model). The maximum likelihood (ml) model is defined, for any $(\Sigma, \Omega, R) \in (S_n^+ \times S_n^{++} \times M_n)$, as:

$$\Lambda_{\mathrm{ml}}(\Sigma, \Omega, R) := R \, \Omega^{-1}.$$

The ml model does not satisfy desirable arbitrage or liquidity axioms. Thus, for reasons similar to those for which the el model was introduced, we introduce an r-el model, so as to have a response-based model satisfying more axioms while coinciding with the ml model when $R$ and $\Omega$ commute.

Definition 9 (r-el model). The response-based eigenliquidity (r-el) model is defined, for any $(\Sigma, \Omega, R) \in (S_n^+ \times S_n^{++} \times M_n)$, as:

$$\Lambda_{\text{r-el}}(\Sigma, \Omega, R) := \sum_{a} s_a \, \frac{s_a^\top R \, s_a}{s_a^\top \Omega \, s_a} \, s_a^\top, \tag{10}$$

where $s_a$ are the eigenvectors of $\Sigma$.

Finally, we can replicate the construction of the kyle estimator in a response-based context, so as to obtain the following model.
Definition 10 (r-kyle model). The response-based Kyle (r-kyle) model is defined, for any $(\Sigma, \Omega, R) \in (S_n^+ \times S_n^{++} \times M_n)$, as:

$$\Lambda_{\text{r-kyle}}(\Sigma, \Omega, R) := (\Omega^{-1/2})^\top \sqrt{(\Omega^{1/2})^\top R \, \Omega^{-1} R^\top \Omega^{1/2}} \; \Omega^{-1/2}. \tag{11}$$

Note that one might intuitively rationalize this model by generalizing the approach of [7] to the case in which a market maker faces insiders whose price information is misaligned with respect to the total price variations, as if not all information about price variation was revealed through trading.

The ⋆ transformation

Some of the models defined in the previous section (whitening, el, r-el) violate split invariance, due to the fact that the eigenvectors $\{s_a\}_{a=1}^{n}$ of the return covariance $\Sigma$ do not transform like prices under the effect of volume changes, even though they are well-behaved under rotation. In summary, these models violate Axiom 4 but respect Axiom 5. If one is willing to trade one axiom for the other, it is possible to cure the lack of split invariance by introducing a privileged basis, thus sacrificing Axiom 5. One might in fact argue that split invariance should be more fundamental than rotational invariance, given that the basis of physical products should be allowed to play a special role. Trading Axiom 5 for Axiom 4 can be achieved through the following transformation.

Definition 11 (The ⋆ transformation). Given a cross-impact model $\Lambda$, the starred version of $\Lambda$, written $\Lambda^\star$, is a cross-impact model defined, for any $(\Sigma, \Omega, R) \in (S_n^+ \times S_n^{++} \times M_n)$, as:

$$\Lambda^\star(\Sigma, \Omega, R) := \mathrm{diag}(\sigma) \, \Lambda(\rho, \Omega^\star, R^\star) \, \mathrm{diag}(\sigma),$$

where we have defined $\Omega^\star = \mathrm{diag}(\sigma) \, \Omega \, \mathrm{diag}(\sigma)$ and $R^\star = \mathrm{diag}(\sigma)^{-1} R \, \mathrm{diag}(\sigma)$, and where $\rho := \mathrm{diag}(\sigma)^{-1} \Sigma \, \mathrm{diag}(\sigma)^{-1}$ is the return correlation matrix.
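As an illustration of Definition 11, the following is a minimal NumPy sketch, not the authors' implementation. It wraps the el model of Definition 5 into its starred version and checks numerically that the starred model transforms like $D \Lambda D$ when prices are multiplied by a diagonal matrix $D$ and volumes divided by it. This transformation law is our assumed reading of split invariance (Axiom 4), and the helper `rescale` and the function names are hypothetical.

```python
import numpy as np

def el_model(Sigma, Omega, R=None):
    """Eigenliquidity model (Definition 5); the response R is not used."""
    lam, S = np.linalg.eigh(Sigma)                 # columns of S are the eigenvectors s_a
    lam = np.clip(lam, 0.0, None)                  # guard against round-off negatives
    liq = np.einsum("ia,ij,ja->a", S, Omega, S)    # s_a^T Omega s_a for each mode a
    return (S * np.sqrt(lam / liq)) @ S.T          # sum_a s_a sqrt(lam_a / liq_a) s_a^T

def star(model):
    """Wrap `model` into its starred version (Definition 11)."""
    def starred(Sigma, Omega, R):
        sigma = np.sqrt(np.diag(Sigma))            # volatilities
        D, D_inv = np.diag(sigma), np.diag(1.0 / sigma)
        rho = D_inv @ Sigma @ D_inv                # return correlation matrix
        Omega_star = D @ Omega @ D                 # order-flow covariance in units of risk
        R_star = D_inv @ R @ D                     # response in units of risk
        return D @ model(rho, Omega_star, R_star) @ D
    return starred

def rescale(Sigma, Omega, R, d):
    """Hypothetical helper: observables after prices are multiplied by d
    and volumes divided by d (assumed form of a split)."""
    d = np.asarray(d, dtype=float)
    D, D_inv = np.diag(d), np.diag(1.0 / d)
    return D @ Sigma @ D, D_inv @ Omega @ D_inv, D @ R @ D_inv
```

With these definitions, `star(el_model)` evaluated on rescaled inputs equals `D @ star(el_model)(Sigma, Omega, R) @ D`, whereas the plain `el_model` generally does not satisfy this relation.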
In practice, the starred version of a cross-impact model applies the original cross-impact model after rescaling all the observables in units of risk via a multiplication by the volatility $\sigma$. Of course, this transformation has no effect on models that satisfy split invariance.

Table 1 summarises the axioms satisfied by each model.

Column groups: Symmetries (PI, DI, CI, SI, RI); Arbitrage (SA, DA); Fragmentation (WFI, SSFI, SFI); Liquidity (WCS, SCS, SS); Covariances (PCC).

| Model | PI | DI | CI | SI | RI | SA | DA | WFI | SSFI | SFI | WCS | SCS | SS | PCC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| direct | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ |
| whitening | ✓ | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| whitening⋆ | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| el | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| el⋆ | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| kyle | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ |
| r-direct | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ |
| ml | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| r-el | ✓ | ✓ | ✓ | ✗ | ✓ | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| r-el⋆ | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| r-kyle | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ |

Table 1: Summary of axioms satisfied by different cross-impact models. We use the symbol ✓ for axioms that are satisfied and ✗ for axioms that are violated. We use the color green to label a desirable property of the model and red for an undesirable property of the model. Yellow is used for properties/models whose violation might not be particularly relevant in order to explain empirical data, although they are interesting to consider. Axioms are grouped by category and listed in the order in which they were presented in the text.
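Some entries of Table 1 can be probed numerically. The sketch below is an illustration under our own assumptions and not part of the paper's methodology: it implements the whitening and kyle models with the symmetric square-root convention, then checks return covariance consistency (Axiom 14) and the symmetry and positive semi-definiteness referred to in the text as Axioms 6 and 7. The function names are hypothetical, and a numerical check on one pair $(\Sigma, \Omega)$ is of course not a proof.

```python
import numpy as np

def sym_sqrt(M):
    """Symmetric square root of a positive semi-definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def whitening_model(Sigma, Omega):
    """Definition 4: Sigma^{1/2} Omega^{-1/2}."""
    return sym_sqrt(Sigma) @ np.linalg.inv(sym_sqrt(Omega))

def kyle_model(Sigma, Omega):
    """Definition 6, using L = Omega^{1/2} (symmetric convention)."""
    L = sym_sqrt(Omega)
    L_inv = np.linalg.inv(L)
    return L_inv.T @ sym_sqrt(L.T @ Sigma @ L) @ L_inv

def is_covariance_consistent(Lam, Sigma, Omega):
    """Checks Sigma = Lam Omega Lam^T (Axiom 14, with constant equal to one)."""
    return bool(np.allclose(Lam @ Omega @ Lam.T, Sigma))

def is_symmetric_psd(Lam, tol=1e-10):
    """Checks that Lam is symmetric with non-negative eigenvalues."""
    if not np.allclose(Lam, Lam.T):
        return False
    return bool(np.linalg.eigvalsh((Lam + Lam.T) / 2).min() > -tol)
```

On a generic pair of non-commuting matrices, both models pass the covariance consistency check, but only the kyle model is symmetric and positive semi-definite, in line with the corresponding rows of Table 1.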
The properties listed above are not independent, and one can easily derive several relations that provide further intuition on the axioms above and additionally relate them to some of the models. It is particularly instructive to relate the fragmentation-related axioms to the liquidity-related ones. Proofs of the results presented here are given in Appendix A.
Proposition 1.
Let $\Lambda$ be a cross-impact model which satisfies split symmetry (Axiom 4) and semi-strong fragmentation invariance (Axiom 9). Then:

(i) $\Lambda$ is weakly cross-stable (Axiom 11) if, for a generic linear subspace $V$,
$$\bar{\Pi}_V \, \Lambda(\Sigma'', \Omega, R'') \, \Pi_V \underset{\varepsilon \to 0}{=} O(\varepsilon).$$

(ii) If, additionally, $\Lambda$ is continuous in the first and third argument and strongly fragmentation invariant, then $\Lambda$ is strongly cross-stable (Axiom 12).

(iii) $\Lambda$ is self-stable (Axiom 13) if, for a generic linear subspace $V$,
$$\Pi_V \, \Lambda(\Sigma'', \Omega, R'') \, \Pi_V \underset{\varepsilon \to 0}{=} O(\varepsilon^2),$$

where

$$\Sigma'' = (\bar{\Pi}_V + \varepsilon \Pi_V) \, \Sigma \, (\bar{\Pi}_V + \varepsilon \Pi_V), \qquad R'' = (\bar{\Pi}_V + \varepsilon \Pi_V) \, R.$$

Remark 2.
Note that the above proposition introduces some additional technical hypotheses that restrict the speed of convergence of the model around the point at which one asks for fragmentation invariance. In order to satisfy items (i) and (iii) above, when fluctuations are of order $\varepsilon$, impact should go to zero at speeds of respectively $\varepsilon$ and $\varepsilon^2$. This is obviously a stronger requirement than a simple fragmentation invariance condition, which is a point-wise property. A simpler sufficient condition guaranteeing the same property is differentiability; however, several models do verify a non-pointwise fragmentation property but lack differentiability (e.g., kyle).

Interestingly, the converse property does not hold, thus indicating that the fragmentation invariance properties play a more fundamental role than the liquidity-related axioms. For example, an interesting sufficient condition for semi-strong fragmentation invariance is given below.
Proposition 2.
Let $\Lambda$ be a split-invariant, weakly cross-stable and self-stable cross-impact model (Axioms 4, 11 and 13). Then, if $\ker \Sigma$ can be generated by elements of the canonical basis, $\Lambda$ is semi-strongly fragmentation invariant (Axiom 9).

[…] el, r-el, kyle and r-kyle ones. However, the kyle model also satisfies arbitrage and correlation-consistency (proved in [7] and [5]). In fact, it is the only model which satisfies these axioms, as discussed in the following proposition, the proof of which is given in Appendix A.2 and is heavily inspired by [5, 7].

Proposition 3.
Let $\Lambda$ be a symmetric, positive-semidefinite and price-covariance consistent cross-impact model (Axioms 6, 7 and 14). Then $\Lambda = \Lambda_{\mathrm{kyle}}$ up to a multiplicative constant.

Hence, there is a single symmetric, positive-semidefinite, correlation-consistent, strongly fragmentation invariant model. Given that the fragmentation-related axioms seem so fundamental, one might wonder how many models one can build that enjoy that family of properties. Indeed, as a trivial corollary of Proposition 1, all models satisfying the above condition are also cross-stable. Surprisingly, the class of models enjoying both split invariance and rotational invariance is quite small: in Appendix A.3 we prove the following proposition.
Proposition 4.
A return covariance based cross-impact model $\Lambda$ that is both split invariant and rotational invariant (Axioms 4 and 5) can always be written in the form

$$\Lambda(\Sigma, \Omega) = L^{-\top} U \, F(\mu) \, U^\top L^{-1},$$

where $\Omega = L L^\top$; $\hat{\Sigma} := L^\top \Sigma L$; $U^\top \hat{\Sigma} U := \mathrm{diag}(\mu)$; $F(\mu) := \Lambda(\mathrm{diag}(\mu), \mathbb{I})$. Furthermore, if $\Lambda$ is cash-invariant and direct-invariant (Axioms 2 and 3), then $F(\mu) \propto \mathrm{diag}(\mu)^{1/2}$ and $\Lambda = \Lambda_{\mathrm{kyle}}$ up to a multiplicative constant.

The above shows that the only return-based cross-impact model which satisfies all symmetry axioms (Axioms 1 to 5) is the kyle model. This indicates that data is bound to play a major role in selecting which cross-impact model is deemed most suitable to describe market microstructure.
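The representation of Proposition 4 translates directly into a short numerical recipe. The sketch below is an illustration only: it builds the kyle model as $L^{-\top} U \, \mathrm{diag}(\mu)^{1/2} U^\top L^{-1}$, taking $L$ to be the Cholesky factor of $\Omega$ (one admissible choice of factorization among others) and fixing the multiplicative constant to one. The function name is hypothetical.

```python
import numpy as np

def kyle_via_proposition4(Sigma, Omega):
    """Kyle model written as L^{-T} U diag(mu)^{1/2} U^T L^{-1}."""
    L = np.linalg.cholesky(Omega)            # lower triangular, Omega = L L^T
    Sigma_hat = L.T @ Sigma @ L              # return covariance in whitened volume units
    mu, U = np.linalg.eigh(Sigma_hat)        # U^T Sigma_hat U = diag(mu)
    F = np.diag(np.sqrt(np.clip(mu, 0.0, None)))   # F(mu) = diag(mu)^{1/2}
    L_inv = np.linalg.inv(L)
    return L_inv.T @ U @ F @ U.T @ L_inv
```

The result is symmetric and satisfies $\Lambda \Omega \Lambda^\top = \Sigma$, and it coincides with the expression of Definition 6 computed with symmetric square roots, since the kyle model does not depend on the factorization chosen for $\Omega$.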
The focus of the present section is to study a simple set of estimation scenarios for cross-impact models regressed from empirical data. Our goal is to stress test the cross-impact models presented in Section 3. We shall not restrict ourselves to ruling out models that appear to be inconsistent with data, but shall also establish which of the axioms presented in Section 3 it is necessary to adopt in order to provide an accurate description of the impact of order flows on prices. To this purpose, we consider three examples constructed with instruments belonging to different asset classes.

(i) First, we consider a set of liquid NYMEX (New York Mercantile Exchange) crude oil futures contracts together with their corresponding calendar-spread contract, in order to illustrate how liquidity aggregates in the presence of very strong correlations. The calendar spread contract is included in the estimation in order to highlight the effect of the presence of a zero-risk mode in the correlation matrix of returns.

(ii) Second, we consider a universe of four contracts: two maturities of a liquid stock index futures contract and two maturities of a liquid bond contract. We use this example to show how different models react to anti-correlation.

(iii) Finally, we estimate our cross-impact models on a universe of 393 stocks, in order to see how our calibration procedure reacts in the presence of a large number of instruments, so as to address the issues of overfitting and robustness to noise.

Our goal is to explain the 1-minute price changes of a basket of instruments using the traded volume during that time span. Price changes are assumed to be independent and identically distributed.

In order to compare the different impact models, we construct three different indicators of performance which emphasize different aspects of prediction errors. All three indicators are parametrized by a symmetric, positive semi-definite matrix $M \in S_n^+$, $M \neq 0$, that is used to construct a generalized $R^2$ error for the predicted price changes $\widehat{\Delta p}_t$ versus the realized ones $\Delta p_t$. In particular, given a realization of the price process $\{\Delta p_t\}_{t=1}^{T}$ of length $T$ and a corresponding series of predictions $\{\widehat{\Delta p}_t\}_{t=1}^{T}$, the generalized $R^2$ is defined as:

$$R^2(M) := 1 - \frac{\sum_{1 \le t \le T} (\Delta p_t - \widehat{\Delta p}_t)^\top M \, (\Delta p_t - \widehat{\Delta p}_t)}{\sum_{1 \le t \le T} \Delta p_t^\top M \, \Delta p_t}.$$

Whereas $R^2(I)$ is the ordinary $R^2$ error, other choices of $M$ can be used in order to highlight different sources of error. In particular, we consider:

(i) $M = I_\sigma := \mathrm{diag}(\sigma)^{-2}$, to account for errors relative to the typical deviation of the asset considered. This type of error is relevant for strategies predicting idiosyncratic moves of the constituents of the basket, rather than strategies betting on correlated market moves.

(ii) $M = J_\sigma := \big(\Sigma_{ii}^{-1/2} \Sigma_{jj}^{-1/2}\big)_{1 \le i, j \le n}$, to check if the model successfully forecasts the overall direction of all assets, which is obviously relevant for strategies that try to forecast the overall move of the constituents of the basket.

(iii) $M = \Sigma^{-1}$, to consider how well the model predicts the individual modes of the covariance matrix. This would be the relevant error measure for strategies that place a constant amount of risk on the modes of the correlation matrix, leveraging up combinations of products with low volatility and scaling down market directions that exhibit large fluctuations.

Note that the last measure strongly penalizes models violating fragmentation invariance: errors along modes of zero risk should a priori be enhanced by an infinite amount. In this study we have decided to clip the eigenvalues of $\Sigma$ to a small, non-zero amount equal to $10^{-\dots}$.

Given $M \in S_n^+$, $M \neq 0$, we compute scores on empirical data in the following manner. First, we divide data into two subsets of roughly equal length: data from 2016 on the one hand and from 2017 on the other hand. Given data from year $X$ and year $Y$, we calibrate estimators and cross-impact models on year $X$ and use the models to predict price changes in year $Y$, writing $R^2_{X \to Y}(M)$ for the average score. In-sample scores are defined as $R^2_{\mathrm{in}}(M) := \tfrac{1}{2}\big(R^2_{2016 \to 2016}(M) + R^2_{2017 \to 2017}(M)\big)$, while out-of-sample scores are defined as $R^2_{\mathrm{out}}(M) := \tfrac{1}{2}\big(R^2_{2016 \to 2017}(M) + R^2_{2017 \to 2016}(M)\big)$.

Before presenting empirical results, we give a synthetic example to illustrate the behaviour of different cross-impact models. To simplify computations and the interpretation of the results, we focus on diagonal volume covariance matrices and general return covariance matrices. Thus, this example deals with correlated assets traded in separate markets. The details of the computations are given in Appendix B.
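The generalized $R^2$ score introduced earlier in this section can be sketched in a few lines of NumPy. This is a minimal illustration under our own assumptions rather than the authors' code: the score is taken to be one minus the $M$-weighted error ratio, $I_\sigma$ and $J_\sigma$ are built from the volatilities $\sigma_i = \sqrt{\Sigma_{ii}}$, and the eigenvalue floor `eig_floor` used before inverting $\Sigma$ is a hypothetical parameter, since the value used in the study is not reproduced here.

```python
import numpy as np

def generalized_r2(dp, dp_hat, M):
    """Generalized R^2 for realized (dp) and predicted (dp_hat) price changes.

    dp, dp_hat: arrays of shape (T, n); M: symmetric PSD matrix of shape (n, n).
    """
    err = dp - dp_hat
    num = np.einsum("ti,ij,tj->", err, M, err)   # sum_t err_t^T M err_t
    den = np.einsum("ti,ij,tj->", dp, M, dp)     # sum_t dp_t^T M dp_t
    return 1.0 - num / den

def score_matrices(Sigma, eig_floor):
    """Return the three weighting matrices (I_sigma, J_sigma, Sigma^{-1})."""
    sigma = np.sqrt(np.diag(Sigma))
    I_sigma = np.diag(1.0 / sigma**2)             # idiosyncratic errors
    J_sigma = np.outer(1.0 / sigma, 1.0 / sigma)  # overall market direction
    w, V = np.linalg.eigh(Sigma)
    w = np.clip(w, eig_floor, None)               # clip small eigenvalues before inverting
    Sigma_inv = (V / w) @ V.T                     # risk-weighted errors
    return I_sigma, J_sigma, Sigma_inv
```

A perfect prediction gives a score of one and a prediction identically equal to zero gives a score of zero, for any admissible $M$.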
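Example 1 (Two correlated assets on a fragmented market). Let

$$\Sigma := \begin{pmatrix} \sigma_1^2 & \sigma_1 \sigma_2 \rho \\ \sigma_1 \sigma_2 \rho & \sigma_2^2 \end{pmatrix} \quad \text{and} \quad \Omega := \begin{pmatrix} \omega_1^2 & 0 \\ 0 & \omega_2^2 \end{pmatrix},$$

where $\sigma_1, \sigma_2 > 0$, $\omega_1, \omega_2 \ge 0$ and $-1 \le \rho \le 1$. Then,

$$\Lambda_{\mathrm{kyle}}(\Sigma, \Omega) = \frac{\sigma_1 \sigma_2}{\sqrt{\sigma_1^2 \omega_1^2 + 2 \omega_1 \omega_2 \sigma_1 \sigma_2 \sqrt{1 - \rho^2} + \sigma_2^2 \omega_2^2}} \begin{pmatrix} \dfrac{\sigma_1}{\sigma_2} + \dfrac{\omega_2}{\omega_1} \sqrt{1 - \rho^2} & \rho \\ \rho & \dfrac{\sigma_2}{\sigma_1} + \dfrac{\omega_1}{\omega_2} \sqrt{1 - \rho^2} \end{pmatrix}.$$

Furthermore, if $\sigma_1 = \sigma_2 = \sigma$, then:

$$\Lambda_{\mathrm{kyle}}(\Sigma, \Omega) = \frac{\sigma}{\sqrt{\omega_1^2 + 2 \omega_1 \omega_2 \sqrt{1 - \rho^2} + \omega_2^2}} \begin{pmatrix} 1 + \dfrac{\omega_2}{\omega_1} \sqrt{1 - \rho^2} & \rho \\ \rho & 1 + \dfrac{\omega_1}{\omega_2} \sqrt{1 - \rho^2} \end{pmatrix}$$

and

$$\Lambda_{\mathrm{el}}(\Sigma, \Omega) = \frac{\sigma}{\sqrt{2 (\omega_1^2 + \omega_2^2)}} \begin{pmatrix} \sqrt{1 + \rho} + \sqrt{1 - \rho} & \sqrt{1 + \rho} - \sqrt{1 - \rho} \\ \sqrt{1 + \rho} - \sqrt{1 - \rho} & \sqrt{1 + \rho} + \sqrt{1 - \rho} \end{pmatrix}.$$

Remark 3.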
When $\sigma_1 = \sigma_2$ in Example 1, we recover an example treated in [7]. Furthermore, as one might expect, if $\sigma_1 = 0$ then trading Asset 1 has no impact on Asset 1's price or on Asset 2's price: this is naturally related to fragmentation invariance (see Axioms 8 to 10). Similarly, when $\sigma_1 = \sigma_2$ and either $\rho = 1$ or $\rho = -1$, the return covariance matrix has only one non-null eigenvalue. As each model satisfies fragmentation invariance, and both coincide in dimension one, they yield the same result.

Remark 4.
Example 1 only deals with diagonal volume covariance matrices but, invoking basis invariance (Axioms 4 and 5), one can diagonalize $\Omega$ and reduce computations to this example.

Description of the dataset
We illustrate the performance of the different models by considering a universe of three instruments: two liquid NYMEX Crude Oil futures contracts and the corresponding Calendar Spread contract. The first two contracts (respectively, CRUDE0 and CRUDE1) entail an agreement to buy or sell 1000 barrels of oil either at the next month or at the subsequent month, whereas the Calendar Spread CRUDE1_0 allows one to exchange the closer-to-expiry futures contract (front month) with a contract settling on the following month. We collected trades and quotes data from January 2016 to December 2017, between 9:30 AM and 7:30 PM UTC, where most of the trading takes place in our dataset, removing 30 minutes around the opening of trading hours to mitigate intraday seasonality. After filtering and processing, we have a total of 430 days in our sample (237 in 2016 and 193 in 2017). We highlight below two important features of our pre-processing for the estimation of $\Sigma$, $\Omega$ and $R$.

[Figure 3: time series from January to December 2016 of the relative number of traded contracts for CRUDE0, CRUDE1 and CRUDE1_0.]
Figure 3: Number of traded NYMEX Crude oil futures and Calendar Spread contracts (in thousands) relative to daily number of traded contracts.
The number of contracts sold relative to the daily average is shown for the front month contract (in blue), the subsequent month (in orange) and the Calendar Spread (in green). The average number $\bar{V}$ of traded NYMEX Crude oil futures and Calendar Spread contracts over 2016 is shown in the upper right corner. Vertical dashed lines show specific dates. An example of first notice date for the front month contract is shown in bold black. After the first notice date, holders of the futures contract may ask for physical delivery of the underlying. We also show two dates away from a first notice date: the 6th and 15th of June 2016. Colored triangles show the relative number of contracts exchanged on these dates. Note that the number of contracts is represented in thousands and was not adjusted by the basis point, so that the underlying of each contract is 1000 barrels of oil.

Pre-processing: accounting for non-stationarity
Overall, the front month contract CRUDE0 is by far the most liquid, followed by the subsequent month contract CRUDE1 and the calendar spread CRUDE1_0. However, there are strong seasonal dependencies, which are shown in Figure 3. For example, the subsequent month contract becomes more liquid as one approaches the maturity of the front month contract. Global estimators of $\Sigma$, $\Omega$ and $R$ would thus be biased by this varying liquidity $\omega$ ($\sigma$ also appears to follow a non-stationary pattern, but is not shown here). Thus, we used local (daily) estimators of price volatility $\sigma_t$ and liquidity $\omega_t$, and built local covariance estimators $\Sigma_t$ and $\Omega_t$ by assuming stationarity of the correlations $\varrho = \mathrm{diag}(\sigma_t)^{-1} \Sigma_t \, \mathrm{diag}(\sigma_t)^{-1}$ and $\varrho_\Omega = \mathrm{diag}(\omega_t)^{-1} \Omega_t \, \mathrm{diag}(\omega_t)^{-1}$.

Pre-processing: cleaning estimators
As illustrated in Figure 4, where the structure of $\Sigma$, $\Omega$ and $R$ is shown for a typical day, the correlation between the two futures contracts CRUDE0 and CRUDE1 is close to one, whereas the correlation with the Calendar Spread contract is very small, due to the small volatility of the fluctuations along the relative mode. Because of these effects, the sign of the Calendar Spread correlations with CRUDE0 and CRUDE1 is non-trivial to estimate: due to microstructural effects, the measured correlation is dominated by tick-size related effects. In fact, empirical price changes of the Calendar Spread are not given by the difference of the price changes of the legs. To solve this issue, we impose the price changes of the Calendar Spread according to the price changes of the futures contracts.

Footnote: To test this hypothesis, we estimated the empirical smallest eigenvalue of the covariance matrix for multiple futures contracts as a function of relative tick size (not shown). If price changes of the Calendar Spread were given by the legs of the contract, this eigenvalue should be equal to zero. However, we found that as the tick size increases, the smallest eigenvalue moves away from zero. This validates our hypothesis and justifies the need for additional processing of futures data.

Structure of $\rho$, $\Omega$ and $R$
The estimators of the $\varrho$, $\Omega$ and $R$ matrices for the 6th of June 2016 are shown in Figure 4. We chose this date as it represents the typical behaviour of these contracts far away from the first notice date, before rolling effects become prevalent. As previously mentioned, the two futures contracts are heavily correlated, which implies that $\Sigma$ will have one direction of zero fluctuations (due to the Calendar Spread) and another of small, though non-zero, fluctuations. Because of these modes of small fluctuations, we expect models which satisfy fragmentation invariance (Axioms 8 to 10) to be preferable. On the other hand, $\Omega$ shows an L-pattern that reflects heterogeneity in liquidity, so that the response shows vertical stripes. This was already noted in [22] in the case of stocks. One should therefore be cautious of models which do not satisfy stability axioms (Axioms 11 to 13), as they will not penalize trading directions of small liquidity.

[Figure 4: three matrices with rows and columns ordered CRUDE0, CRUDE1, CRUDE1_0.
Left, $\varrho$: (1.00, 0.99, 0.06; 0.99, 1.00, -0.06; 0.06, -0.06, 1.00).
Center, $\Omega$: (28.62, 2.41, 0.02; 2.41, 1.03, 0.00; 0.02, 0.00, 0.00).
Right, $R$: (1.89, 0.22, 0.00; 1.86, 0.22, 0.00; 1.77, 0.08, 0.09).]
Figure 4: Estimates of $\varrho$, $\Omega$ and $R$ for Crude contracts (in MUSD).
The return correlation matrix $\rho$ (left), order flow covariance matrix $\Omega$ (center) and response matrix $R$ (right) were estimated on 2016 data using the procedure described in Section 4.2 and computed on the 6th of June 2016. To highlight the amount of notional traded, order flow is reported in millions of exchanged dollars according to the average value of each contract on the 6th of June 2016. Though non-zero, the order flow covariance of the Calendar Spread thus appears small because the traded notional is much smaller than on each leg of the futures contract.

The empirical structure of $\varrho$, $\Omega$ and $R$ motivates the following example, where we compare the predictions of models which are both fragmentation invariant but differ in their stability properties.

Example 2 (Calendar Spread and futures). Let $\sigma > 0$, $\omega_1, \omega_2, \omega_3 \ge 0$, $1 \ge \rho \ge -1$ and

$$\Sigma := \sigma^2 \begin{pmatrix} 1 & \rho & 1 - \rho \\ \rho & 1 & -(1 - \rho) \\ 1 - \rho & -(1 - \rho) & 2 (1 - \rho) \end{pmatrix}, \qquad \Omega := \begin{pmatrix} \omega_1^2 & 0 & 0 \\ 0 & \omega_2^2 & 0 \\ 0 & 0 & \omega_3^2 \end{pmatrix}.$$

Then, one has:

$$\Lambda_{\mathrm{el}}(\Sigma, \Omega) = \frac{\sqrt{1 + \rho} \; \sigma}{\sqrt{2} \, \bar{\omega}_1} \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} + \frac{\sqrt{1 - \rho} \; \sigma}{\sqrt{2} \, \bar{\omega}_2} \begin{pmatrix} 1 & -1 & 2 \\ -1 & 1 & -2 \\ 2 & -2 & 4 \end{pmatrix},$$

where $\rho_\omega := \dfrac{\omega_1^2 - \omega_2^2}{\sqrt{(\omega_1^2 + \omega_2^2)(\omega_1^2 + \omega_2^2 + 4 \omega_3^2)}}$ and $\bar{\omega}_1^2 := \omega_1^2 + \omega_2^2$, $\bar{\omega}_2^2 := \omega_1^2 + \omega_2^2 + 4 \omega_3^2$, $\lambda_1 = \sqrt{1 + \rho}$, $\lambda_2 = \sqrt{1 - \rho}$. The lengthy expression of $\Lambda_{\mathrm{kyle}}$ is given in Appendix B.2.

Since it is difficult to analyse the kyle model in the full generality of Example 2, we make remarks in special cases of interest.
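Although the closed form of the kyle model is lengthy in this setting, it is straightforward to evaluate numerically. The following sketch, which is an illustration under our own assumptions and not the authors' code, builds the return covariance of Example 2 (with the spread taken as the first leg minus the second) and evaluates Definition 6 using symmetric square roots. The function names are hypothetical.

```python
import numpy as np

def sym_sqrt(M):
    """Symmetric square root of a positive semi-definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def kyle_model(Sigma, Omega):
    """Definition 6, using L = Omega^{1/2} (symmetric convention)."""
    L = sym_sqrt(Omega)
    L_inv = np.linalg.inv(L)
    return L_inv.T @ sym_sqrt(L.T @ Sigma @ L) @ L_inv

def calendar_spread_sigma(sigma, rho):
    """Return covariance of two futures legs and their calendar spread."""
    return sigma**2 * np.array([
        [1.0,        rho,          1.0 - rho],
        [rho,        1.0,        -(1.0 - rho)],
        [1.0 - rho, -(1.0 - rho),  2.0 * (1.0 - rho)],
    ])
```

For any admissible $\rho$, the resulting matrix annihilates the zero-risk portfolio $(1, -1, -1)$, and in the limits $\rho = 1$ and $\rho = -1$ it reduces to a rank-one matrix proportional to the outer product of $(1, 1, 0)$ and of $(1, -1, 2)$ with itself, respectively.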
Remark 5.
Once again, when $\rho = 1$ there is only one direction of the return covariance matrix with a non-zero eigenvalue (it thus reduces to the rank-one case treated in [7]) and the kyle and el models should coincide (as in the previous example). Indeed, it is easy to check that:

$$\Lambda_{\mathrm{kyle}}(\Sigma, \Omega) \underset{\rho \to 1}{=} \sigma \sqrt{\frac{1}{\omega_1^2 + \omega_2^2}} \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

This example highlights that $\bar{\omega}_1$ plays the role of the effective liquidity of the portfolio $(1, 1, 0)$. Conversely, when $\rho = -1$, we find:

$$\Lambda_{\mathrm{kyle}}(\Sigma, \Omega) \underset{\rho \to -1}{=} \sigma \sqrt{\frac{1}{\omega_1^2 + \omega_2^2 + 4 \omega_3^2}} \begin{pmatrix} 1 & -1 & 2 \\ -1 & 1 & -2 \\ 2 & -2 & 4 \end{pmatrix}.$$

In the $\rho = -1$ regime, buying the calendar spread is equivalent to buying two units of the long leg (or shorting two units of the short leg), which is consistent with the prediction of the kyle model. As before, we find that $\bar{\omega}_2$ plays the role of the effective liquidity of the portfolio $(1, -1, 2)$.

[Figure 5: three matrices with rows and columns ordered CRUDE0, CRUDE1, CRUDE1_0.
Left, ml: (55, 82, 1205; 55, 82, 382; 21, -126, 85318).
Center, r-el⋆: (59, 56, 251; 56, 58, -248; 251, -248, 52159).
Right, kyle: (58, 56, 85; 56, 68, -1253; 85, -1253, 140463).]
Figure 5: Values of different cross-impact models for Crude contracts.
We report the values of the ml (left), r-el⋆ (center) and kyle (right) cross-impact models for the covariances of the 6th of June 2016 obtained using the procedure described in Section 4.2. Units are chosen to represent the relative price change in basis points ($10^{-4}$ of the asset price) per hundred million USD worth of contracts traded.

Cross impact models for Crude oil contracts
For illustrative purposes, we highlight results for a handful of models in Figure 5, selected because of their performance and the different sets of axioms which they satisfy: the ml, r-el⋆ and kyle models. Recall from Table 1 that each of these models satisfies weak fragmentation invariance (Axiom 8), so that one cannot impact directions of zero fluctuations. Therefore, our impact models prevent arbitrage which would trade the physical Calendar Spread contract against the synthetic Calendar Spread (made up of CRUDE0 and CRUDE1). However, the models differ in the stability axioms they satisfy. The ml and kyle models are not self-stable (Axiom 13) while the r-el⋆ model is. This explains why impact from trading the illiquid Calendar Spread is much larger in the ml and kyle models than in the r-el⋆ model. Overall, by construction, the r-el⋆ model sees a unique basket of liquidity: there is only one direction with a large eigenvalue, the market mode, the relative mode having a much smaller eigenvalue.

Empirical comparison of models
Table 2 shows the scores of cross-impact models on the Crude dataset. First, note that models which do not satisfy weak fragmentation invariance (Axiom 8) poorly explain idiosyncratic price changes because of the small volatility of the Calendar Spread. Furthermore, since Σ has one eigenvalue equal to zero, models which do not satisfy weak fragmentation invariance cannot explain risk-weighted price changes. It is therefore more appropriate to compare all models on the basis of R(J_σ). Variants of direct models account for 33% and 40% of the variance of market-wide moves. Cross-impact models slightly improve on direct models (scoring around 46%). This is somewhat surprising: despite the concentration of liquidity in the front month contract and the large correlation between the front and subsequent month contracts, accounting for the off-diagonal elements of Σ and Ω matters. Finally, among cross-impact models, there is little difference between the performance of the ml, r-el, r-el⋆ and kyle models because the relative mode fluctuates very little: this three-dimensional system roughly behaves like a one-dimensional system. Overall, this example emphasizes the importance of fragmentation invariance (Axioms 8 to 10) but does not suggest which stability axioms (Axioms 11 to 13) are most relevant. Symmetry axioms may play an important constraining role for return-based covariance models, since the kyle model is the only covariance-based model which yields similar results to the best response-based models. On the other hand, response models which violate some symmetry axioms (such as r-el) have good performance.

To distinguish between split invariance (Axiom 4) and rotation invariance (Axiom 5), we can compare the scores of models to their starred counterpart. Here, the performance of starred models is close to that of non-starred models. Indeed, taking the example of r-el and r-el⋆, since the liquidity of the calendar spread is thin, the projected liquidity along the three products is the same as the projected liquidity along the two legs of the contract. The same can be said about the projected response. Since the volatilities of the two legs are about the same, the first two components of the eigenvectors of ρ and Σ should be about the same. We have checked and confirmed this empirically (not shown).

For example, though the R(I_σ) score of the r-direct model is small, it explains about 35% of the variance of price changes of CRUDE0 and roughly 20% for CRUDE1, but predicts incorrect price changes for the Calendar Spread CRUDE1_0.

[Table 2 layout: one row per model (direct, whitening, whitening⋆, el, el⋆, kyle, r-direct, ml, r-el, r-el⋆, r-kyle); columns R(I_σ), R(J_σ) and R(Σ⁻¹), each reported in-sample and out-of-sample with error bars. The R(Σ⁻¹) scores of the direct and r-direct models are reported as −∞.]
Table 2: In-sample and out-sample scores for Crude contracts.
We reported as ∞ the scores of models which are numerically infinite, but due to clipping appear finite.

The previous dataset on Crude futures and Calendar Spreads showed the importance of satisfying fragmentation invariance (Axiom 8). However, this market corresponds to a pathological case where Σ has an eigenvalue equal to zero. Furthermore, we were unable to study the varying liquidity of each contract since the high correlation between calendar spreads left Σ with only one large non-zero eigenvalue, so that cross-impact models gave similar results. We now focus on studying the heterogeneity in liquidity in a basket of instruments, using bonds and indices data.

Description of the dataset
We look at 10-year US Treasury note futures and E-MINI futures. We collect data from the Chicago Mercantile Exchange and use the first two upcoming maturities of both contracts (respectively called SPMINI and SPMINI3 for E-MINI contracts and 10USNOTE and 10USNOTE3 for 10-year US Treasury notes). E-MINI futures are quarterly, financially settled contracts with maturities in March, June, September and December. At expiry, the final settlement price of E-MINI futures is a proxy for the S&P500 index using the opening prices of the underlying stocks belonging to the index. Similarly, the 10-year Treasury note futures are quarterly, financially settled contracts with maturities in March, June, September and December. At expiry, the final settlement price is the volume-weighted average price of past trades on the underlying Treasury note. We collected trades and quotes data from January 2016 to December 2017, between 9AM and 7PM UTC, where most of the trading takes place in our dataset. After filtering days for which data for one product was missing, we keep a total of 160 days (75 in 2016 and 85 in 2017). We highlight below one important pre-processing step for the estimation of Σ, Ω and R.

This is a simplification of the settlement rules to emphasize the expected value of the final settlement price. Further details about the final settlement price of E-MINI futures and 10-year US Treasury Note futures can be found in the CME Rulebook.

Figure 6: Estimates of ϱ, Ω and R for bonds and indices (in MUSD).
The return correlation matrix ρ (left), orderflow covariance matrix Ω (center) and response matrix R (right) were estimated on 2016 data using the procedure described in Section 4.3 and computed on the 17th of August 2016. To highlight the amount of notional traded, orderflow is reported in millions of exchanged dollars according to the average value of each contract on the 17th of August 2016. Basis points were accounted for, so that one traded unit of the futures contracts entitles the owner to one unit of the underlying.

[Figure 7 panels: ml (left), r-el (center), kyle (right).]
Figure 7: Values of different cross-impact models for bonds and indices.
We report the values of the ml (left), r-el⋆ (center) and kyle (right) cross-impact models for the covariances of the 17th of August 2016, obtained using the procedure described in Section 4.3. Units are chosen to represent the relative price change in basis points ($10^{-4}$ of the asset price) per hundred million USD worth of contracts traded.

Pre-processing: accounting for non-stationarity
The same non-stationary behaviour observed for Crude Oil futures contracts is observed here. Thus we adopt the same estimation procedure for the local covariance estimators $\Sigma_t$ and $\Omega_t$ by assuming stationarity of the correlations $\varrho = \mathrm{diag}(\sigma_t)^{-1}\,\Sigma_t\,\mathrm{diag}(\sigma_t)^{-1}$ and $\varrho_\Omega = \mathrm{diag}(\omega_t)^{-1}\,\Omega_t\,\mathrm{diag}(\omega_t)^{-1}$.

Structure of ρ, Ω and R
The estimators of the ϱ, Ω and R matrices for the 17th of August 2016 are shown in Figure 6. As expected, contracts with similar underlying are strongly correlated, thus ρ shows 2 by 2 blocks of strongly correlated contracts and an anti-correlation between bonds and indices. The correlation between two contracts is smaller than in the previous example of Crude contracts. This may be because the maturities for Crude contracts were one month apart while the maturities for the bonds or indices futures studied here are 3 months apart. Similarly to the previous example, liquidity is heterogeneous. Non-front month contracts have small liquidity. Interestingly, the structure of the volume covariance matrix also shows anti-correlation between bonds and indices. In this configuration, the discriminating factor between models should be stability axioms (Axioms 11 to 13) rather than fragmentation axioms (Axioms 8 to 10).

Cross impact models for bonds and indices
For illustrative purposes, we highlight results for a handful of models in Figure 7, selected because of their performance and the different sets of axioms which they satisfy: the ml, r-el⋆ and kyle models. Recall that the r-el and kyle models are weakly cross-stable (Axiom 11) while the ml model is not. Thus the ml model assigns large impact to less liquid contracts, 10USNOTE3 and SPMINI3. Similarly, the self-stability (Axiom 13) of r-el explains the small impact predicted if one trades illiquid contracts. Reassuringly, all models correctly capture the negative index-bonds correlation.

Empirical comparison of models
Table 3 shows the scores of cross-impact models on the bonds and indices dataset. The notable difference in the structure of Σ between the previous basket of instruments and this set of instruments is the importance of both the market (indices minus bonds) and relative (indices plus bonds) modes. In the case of Crude […] R(I_σ) scores). On the other hand, the notable difference in the structure of Ω is the presence of two instruments with very small liquidity. However, there is no clear difference in scores of models according to which stability axioms (Axioms 11 to 13) they satisfy. A second order characteristic of Σ is the presence of directions of small fluctuations: the price difference between 10USNOTE and 10USNOTE3 or between SPMINI and SPMINI3. As before, these directions of small fluctuations penalise models which do not respect fragmentation invariance, as evidenced by R(Σ⁻¹) scores. To distinguish between split invariance (Axiom 4) and rotation invariance (Axiom 5), we can compare the scores of models to their starred counterpart. Contrary to the previous example, we find here that starred models perform worse than their non-starred counterparts. This suggests, surprisingly, that Axiom 5 may be more relevant than Axiom 4 to explain price changes.

[Table 3 layout: one row per model (direct, whitening, whitening⋆, el, el⋆, kyle, r-direct, ml, r-el, r-el⋆, r-kyle); columns R(I_σ), R(J_σ) and R(Σ⁻¹), each reported in-sample and out-of-sample with error bars.]
Table 3: In-sample and out-sample scores for bonds and indices.

The previous datasets emphasized again the importance of satisfying fragmentation invariance (Axioms 8 to 10) but did not yield a clear conclusion on the role of stability axioms (Axioms 11 to 13). However, in both examples the return covariance matrix had directions of small fluctuations because of high pairwise correlation among some instruments. To circumvent this issue, we study the behaviour of cross-impact models in the low-correlation, many-assets regime, using stocks data.
Description of the dataset
We selected 393 different stocks among the constituents of the S&P500 and collected trades and quotes data from January 2016 to December 2017, between 2PM and 9:30PM UTC, removing the beginning and end of the trading period to focus on the intraday behaviour of liquidity and volatility and circumvent intraday non-stationarity issues. After filtering days for which data for one product was missing, we keep a total of 302 days (154 in 2016 and 148 in 2017). Some summary characteristics of our sample are presented in Table 4.

                                            Quantile
                                            10%      50%      90%
Relative tick size (in %)                   1.6      2.5      4.6
Number of trades per day (in thousands)     5.9      12.6     29.4
Daily turnover (in MUSD)                    28.5     56.1     116.2
Table 4: Summary statistics for our sample of stocks.
[Figure 8 panels: two heat maps with stocks grouped by sector (Basic Materials; Communications; Consumer, Cyclical; Consumer, Non-Cyclical; Energy; Financial; Industrial; Technology; Utilities), colour scale from −0.08 to 0.08.]
Figure 8: Estimated price and orderflow correlation matrices ϱ, ϱ_Ω for stocks.
We represent the return correlation matrix ρ (left) and the orderflow correlation matrix ϱ_Ω (right) estimated on 2016 data. To highlight the amount of notional traded, orderflow is reported in millions of exchanged dollars according to the average value of each contract on the 17th of August 2016. Correlation matrices were represented instead of covariance matrices due to the large volume heterogeneities between stocks. Stocks were grouped by sectors to highlight the blockwise structure of these matrices.

[Figure 9 panels: ml (left), r-el (center), kyle (right), with stocks grouped by the same sectors.]
Figure 9: Values of different cross-impact models for stocks.
We report the values of the ml (left), r-el⋆ (center) and kyle (right) cross-impact models for covariances obtained using the procedure described in Section 4.4. Units are chosen to represent the relative price change in basis points ($10^{-4}$ of the asset price) per hundred million USD worth of instruments traded.

Structure of ϱ, ϱ_Ω and R
Estimators of ϱ and ϱ_Ω are shown in Figure 8. To highlight the blockwise structure of these matrices, we show correlations instead of covariances. For the same reasons, R is not shown but presents the bandwise structure one expects from heterogeneities in liquidity. As previously mentioned, pairwise price and orderflow correlations between assets are small, so that the improvement of cross-impact models over direct models should be lower than in previous applications. For more details about the structure of the price and volume covariance matrices, see [3].

Cross impact models for stocks
For illustrative purposes, we highlight results for a handful of models in Figure 9, selected because of their performance and the different sets of axioms which they satisfy: the ml, r-el⋆ and kyle models. At first glance, each model appears to present a blockwise structure similar to that of ϱ and ϱ_Ω. However, the ml model does not satisfy weak cross-stability (Axiom 11) and thus predicts large impact on liquid stocks if one trades illiquid stocks. By construction, the r-el⋆ model weighs most impact on the market mode. Finally, the kyle model looks like a symmetrized version of the r-el model.

Empirical comparison of models
Table 5 shows the scores of cross-impact models on the stocks dataset. Contrary to the two previous examples where only a small number of directions had notable impact contributions, the low pairwise correlation of stocks (aside from market mode contributions) suggests that there are many directions which contribute to the overall impact. Precisely because of the importance of the market mode, the scores reported in Table 5 strikingly show that cross-impact models can explain market-wide moves up to twice as well as direct models, as shown by the difference in R(J_σ) scores. Naturally, explaining idiosyncratic or eigenportfolio price changes is a more challenging task, but cross-impact models (r-el, r-el⋆, kyle, ml) improve r-direct scores by 20% to 30%. To distinguish between split invariance (Axiom 4) and rotation invariance (Axiom 5), we can compare the scores of models to their starred counterpart. We find, in continuity with our results on the bonds and indices dataset, that starred models perform better than their non-starred counterparts.

[Table 5 layout: one row per model (direct, whitening, whitening⋆, el, el⋆, kyle, r-direct, ml, r-el, r-el⋆, r-kyle); columns R(I_σ), R(J_σ) and R(Σ⁻¹), each reported in-sample and out-of-sample with error bars.]
Table 5: In-sample and out-sample scores for stocks.
Influence of liquidity on scores
An interesting feature of our stocks dataset is the heterogeneity in liquidity among different stocks. This allows us to explore the influence of the liquidity of a given stock on the performance of different models. We show the results of this analysis in Figure 10. Consistent with the scores reported in Table 5, we find that overall, in score terms, ml > kyle > r-direct > r-el. As one may expect, the r-direct model fares better for very liquid stocks, as a larger fraction of variance can be explained by idiosyncratic trades. Surprisingly, the same holds for the ml and kyle models. The r-el model stands as an exception: it better explains price moves for stocks which are within the band of typical liquidity, between $\omega_{10\%}$ and $\omega_{90\%}$, than for very liquid stocks. However, this is natural: the r-el model assumes an aggregated pool of liquidity (along the modes of the covariance matrix). Thus, though this assumption is roughly justified for stocks close to the average liquidity, it is violated outside of this zone. On the other hand, the ml and kyle models make no such assumption and can thus better deal with very liquid or illiquid stocks. To further reinforce this point, for stocks of liquidity close to the average in our pool of stocks, the difference between the scores of the el and kyle models reaches a minimum. This is consistent with the fact that in the approximation $\Omega \approx \omega\,\mathbb{I}$, the two models coincide. Thus, violating self-stability (Axiom 13) may be relevant to explain price changes for all ranges of liquidity within our basket of instruments.
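The liquidity profile discussed above amounts to sorting stocks by liquidity, grouping them into bins and averaging the scores within each bin. The following is a minimal sketch of that step only, not the authors' procedure: the equal-count binning, the function name `binned_scores` and the toy numbers are assumptions made for illustration.

```python
from statistics import mean


def binned_scores(liquidity, score, n_bins=5):
    """Average liquidity and score per equal-count liquidity bin.

    Equal-count bins are an assumption; the source only says that data
    is binned by liquidity to smooth out noise.
    """
    pairs = sorted(zip(liquidity, score))      # sort stocks by liquidity
    size = len(pairs) / n_bins                 # (possibly fractional) bin size
    profile = []
    for b in range(n_bins):
        chunk = pairs[round(b * size):round((b + 1) * size)]
        if chunk:                              # skip empty bins
            profile.append((mean(w for w, _ in chunk),
                            mean(s for _, s in chunk)))
    return profile


# Toy inputs (made up): six stocks with increasing liquidity and score.
toy_liquidity = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_score = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
toy_profile = binned_scores(toy_liquidity, toy_score, n_bins=3)
```

Each element of `toy_profile` is a pair (mean liquidity of the bin, mean score of the bin), which is the kind of curve one would plot per model.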
[Figure 10 axes: stock liquidity ω_i (horizontal, with the 10% and 90% quantiles marked) against the stock-specific score R(Π_i) (vertical); curves for ml, kyle, r-direct and r-el.]
Figure 10: Idiosyncratic scores as a function of liquidity.
For each stock in our dataset, we compute the in-sample stock-specific scores R(Π_i) on 2016 data. We then represent the average in-sample stock-specific score as a function of the liquidity ω_i, binning data by ω_i to smooth out noise. Results for the ml (in pink), kyle (in green), r-direct (in blue) and r-el (in orange) models are shown. We have further indicated the 10% and 90% quantiles of liquidity, $\omega_{10\%}$ and $\omega_{90\%}$.

The previous section compared the descriptive power of different cross-impact models. However, axioms also constrain cross-impact models and thus change the degrees of freedom of models. We are thus interested in investigating the robustness of cross-impact models on our datasets.
Recall that we write $R_{X \to Y}(M)$ for the average score for an error matrix $M$. In-sample scores are defined as $R_{\mathrm{in}}(M) := \frac{1}{2}\big(R_{2016 \to 2016}(M) + R_{2017 \to 2017}(M)\big)$ while out-of-sample scores are defined as $R_{\mathrm{out}}(M) := \frac{1}{2}\big(R_{2016 \to 2017}(M) + R_{2017 \to 2016}(M)\big)$. To study the non-stationarity in predictive power, we compute
\[ \frac{|R_{2016 \to 2016}(M) - R_{2017 \to 2017}(M)|}{R_{2016 \to 2016}(M) + R_{2017 \to 2017}(M)}, \]
which we refer to as the difference in predictability. When this quantity is equal to zero, it signifies that data from 2016 is as easily predictable as data from 2017. A difference indicates that the predictive power of the model has changed from one year to the other. In Figure 11 we compare overfitting to predictability difference. Overall, overfitting is smaller than predictability difference, which shows that models are indeed robust across time and their performance is not dictated by changes in predictability of the underlying price changes.

[Figure 11 panels: M = I_σ (left) and M = Σ⁻¹ (right); bars for r-direct, ml, r-el and kyle; vertical axis from 0% to 16%; legend: overfitting, difference in predictability.]
Figure 11: Overfitting and difference in predictability scores for different models.
We show side by side the overfitting coefficient $1 - R_{\mathrm{out}}(M)/R_{\mathrm{in}}(M)$ (with no shading) and the non-stationarity in predictive power $|R_{2016 \to 2016}(M) - R_{2017 \to 2017}(M)| \,/\, \big(R_{2016 \to 2016}(M) + R_{2017 \to 2017}(M)\big)$ (with shading) for the r-direct (in blue), ml (in pink), r-el (in orange) and kyle (in green) models, for idiosyncratic $M = I_\sigma$ (left) and risk-weighted $M = \Sigma^{-1}$ (right) scores.

Across our asset classes, we have focused on the 1 minute timescale. Now, we examine which models better explain price changes on longer timescales. In Figure 12, we show the out-of-sample score and overfitting coefficient for idiosyncratic price changes for our set of 393 stocks, as a function of the bin timescale and number of instruments.

[Figure 12 panels: out-of-sample score R_out(I_σ) (top) and ratio R_out(I_σ)/R_in(I_σ) (bottom), as a function of the number of assets (left column) and of the bin scale in minutes (right column); curves for ml, kyle, r-direct and r-el; markers for crude contracts, bonds and indices, and all stocks.]
Figure 12: Idiosyncratic score and overfitting as a function of the number of assets and bin timescale.
Left column: average out-of-sample idiosyncratic score $R_{\mathrm{out}}(I_\sigma)$ (top left) and overfitting coefficient $R_{\mathrm{out}}(I_\sigma)/R_{\mathrm{in}}(I_\sigma)$ (bottom left) computed using stocks data. Out-of-sample and in-sample scores were computed by randomly selecting a subset of stocks and computing scores on the given subset, repeating the procedure more often when few stocks are selected than when a large proportion of stocks from our sample is considered. The average score for each model across all samples is then shown. Scores are shown for the ml (in pink), kyle (in green), r-direct (in blue) and r-el (in orange) models. Stars show results for crude contracts, crosses for bonds and indices and triangles for all 393 stocks of our sample. Right column: idiosyncratic scores (top right) and overfitting coefficient (bottom right) as a function of the bin timescale. Scores were computed using the same procedure described in Section 4.4, varying the bin parameter from 10 seconds up to around an hour.
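The robustness diagnostics used in this section reduce to simple arithmetic on four scores. The sketch below assumes that R_{X→Y}(M) denotes a score obtained by calibrating on year X and evaluating on year Y (this reading, the function names and the numerical values are assumptions for illustration, not results from the paper).

```python
# Scores indexed by (calibration year, evaluation year); values are made up.
scores = {
    (2016, 2016): 0.40,
    (2017, 2017): 0.36,
    (2016, 2017): 0.34,
    (2017, 2016): 0.36,
}


def in_sample(s):
    """Average of the scores calibrated and evaluated on the same year."""
    return 0.5 * (s[(2016, 2016)] + s[(2017, 2017)])


def out_of_sample(s):
    """Average of the scores calibrated on one year and evaluated on the other."""
    return 0.5 * (s[(2016, 2017)] + s[(2017, 2016)])


def overfitting(s):
    """Overfitting coefficient 1 - R_out / R_in."""
    return 1.0 - out_of_sample(s) / in_sample(s)


def predictability_difference(s):
    """Relative gap between the two same-year scores."""
    a, b = s[(2016, 2016)], s[(2017, 2017)]
    return abs(a - b) / (a + b)


r_in = in_sample(scores)
r_out = out_of_sample(scores)
```

Comparing `overfitting(scores)` with `predictability_difference(scores)` for each model is the comparison summarised in Figure 11.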
As expected, the number of degrees of freedom controls the overfitting of different models. This explains why, in terms of overfitting with respect to the number of instruments at the minute timescale, r-direct < kyle < ml ≈ r-el. In contrast, models overfit less on futures, which suggests that overfitting decreases as the pairwise correlation between instruments increases. Furthermore, out-of-sample idiosyncratic scores for the ml and kyle models increase with the number of assets. A somewhat surprising result, despite the small pairwise correlation of instruments in our stock dataset and the large number of stocks considered in this study, is that idiosyncratic scores appear to keep increasing for more than 400 assets. Nevertheless, cross-impact models score much better for futures than for stocks, suggesting that scores also increase as the pairwise correlation between instruments increases.

Focusing on the influence of the bin timescale, there is little overfitting at the minute timescale but it increases with the bin timescale. In particular, the good fit of the ml model at small timescales quickly breaks down for larger timescales. On the other hand, both the r-el and kyle models are quite robust up until the 10 minute timescale. This highlights the importance of enforcing consistency requirements to reduce overfitting.

Let us summarise what we have achieved. Our main objective was to build a series of theoretical prescriptions to choose the most suitable cross-impact model Λ given a set of empirical observations of market data. To do so, we generically defined a cross-impact model as a function of second-order statistics, constrained through a certain number of axioms. From a theoretical standpoint, no matter the specific framework for building cross-impact models, one can easily understand that proceeding in this fashion is very useful as it significantly reduces the universe of possible models.
Empirical evidence confirms the importance of fragmentation invariance axioms (Axioms 8 to 10) for cross-impact models applied to markets where some instruments (or linear combinations of instruments) display very small fluctuations. Stability axioms enable models to better explain price moves of instruments in extreme liquidity/illiquidity conditions (see Figure 10). Additionally, the reduced number of parameters of models constrained by axioms (such as the kyle model) reduces overfitting, both with the number of instruments in our tradeable universe and with the timescale of study (see Figure 12). In all markets studied, our analysis confirms that cross-impact models are well suited to predict execution costs and evaluate liquidity risk, showing significant improvement compared to impact models which ignore cross-sectional effects (see Tables 2, 3 and 5).

Though our framework focused on the linear static scenario, the prescriptions and models introduced in this paper could be generalised to deal with more general cases. Furthermore, the framework can be adapted to deal with derivatives, which we leave as the topic of future work. Another topic is the generalisation of this framework to account for the auto-correlation of the order flow. In [18], the authors began to study this question, paving the way to a cross-impact generalisation of [12]. Finally, many questions subsist about the underlying dynamics of supply and demand which account for cross-impact: from the microstructural perspective, what coupling of order-book dynamics could account for such aggregate price dynamics? This question requires further attention and we hope to examine it in detail from the perspective of zero-intelligence limit order-book models (see e.g. [15]).

We warmly thank J.-P. Bouchaud, Z. Eisler, B. Toth, M. Rosenbaum and A. Fosset for fruitful discussions. This research was conducted within the Econophysics & Complex Systems Research Chair under the aegis of the Fondation du Risque, a joint initiative by the Fondation de l’École polytechnique, l’École polytechnique and Capital Fund Management.

References
[1] Aurélien Alfonsi, Florian Klöck, and Alexander Schied. Multivariate transient price impact and matrix-valued positive definite functions. Mathematics of Operations Research, 41(3):914–934, 2016.
[2] Robert Almgren, Chee Thum, Emmanuel Hauptmann, and Hong Li. Direct estimation of equity market impact. Risk, 18(7):58–62, 2005.
[3] Michael Benzaquen, Iacopo Mastromatteo, Zoltan Eisler, and Jean-Philippe Bouchaud. Dissecting cross-impact on stock markets: An empirical analysis. Journal of Statistical Mechanics: Theory and Experiment, 2017(2):23406, 2017.
[4] Jean-Philippe Bouchaud, Julius Bonart, Jonathan Donier, and Martin Gould. Trades, Quotes and Prices. Cambridge University Press, 2018.
[5] Jordi Caballe and Murugappa Krishnan. Imperfect competition in a multi-security market with risk neutrality. Econometrica (1986-1998), 62(3):695, 1994.
[6] Pierre Cardaliaguet and Charles-Albert Lehalle. Mean field game of controls and an application to trade crowding. Mathematics and Financial Economics, 12(3):335–363, 2018.
[7] Luis Carlos del Molino, Iacopo Mastromatteo, Michael Benzaquen, and Jean-Philippe Bouchaud. The Multivariate Kyle model: More is different. 2018.
[8] David Evangelista and Douglas Vieira. New closed-form approximations in multi-asset market making. arXiv preprint arXiv:1810.04383, 2018.
[9] Jim Gatheral. No-Dynamic-Arbitrage and Market Impact. Technical report, 2009.
[10] Olivier Guéant. Optimal market making. Applied Mathematical Finance, 24(2):112–154, 2017.
[11] Joel Hasbrouck and Duane J Seppi. Common factors in prices, order flows, and liquidity. Journal of Financial Economics, 59(3):383–411, 2001.
[12] Thibault Jaisson. Market impact as anticipation of the order flow imbalance. Quantitative Finance, 15(7):1123–1135, 2015.
[13] Charles-Albert Lehalle and Charafeddine Mouzouni. A Mean Field Game of Portfolio Trading and Its Consequences On Perceived Correlations. arXiv preprint arXiv:1902.09606, 2019.
[14] Iacopo Mastromatteo, Michael Benzaquen, Zoltan Eisler, and Jean-Philippe Bouchaud. Trading lightly: Cross-impact and optimal portfolio execution. 2017.
[15] Iacopo Mastromatteo, Bence Toth, and Jean-Philippe Bouchaud. Agent-based models for latent liquidity and concave price impact. Physical Review E, 89(4):042805, 2014.
[16] Paolo Pasquariello and Clara Vega. Strategic cross-trading in the US stock market. Review of Finance, 19(1):229–282, 2015.
[17] Michael Schneider, Fabrizio Lillo, Michael Benzaquen, Jean-Philippe Bouchaud, Katia Colaneri, Thomas Guhr, Florian Klöck, Enrico Melchioni, Loriana Pelizzon, and Damian Taranto. Cross-impact and no-dynamic-arbitrage. Technical report, 2017.
[18] Mehdi Tomas and Mathieu Rosenbaum. From microscopic price dynamics to multidimensional rough volatility models. arXiv preprint arXiv:1910.13338, 2019.
[19] Nicolo Torre. BARRA market Impact model handbook. BARRA Inc., Berkeley, 1997.
[20] Shanshan Wang, Sebastian Neusüß, and Thomas Guhr. Grasping asymmetric information in market impacts. arXiv preprint arXiv:1710.07959, 2017.
[21] Shanshan Wang, Rudi Schäfer, and Thomas Guhr. Price response in correlated financial markets: empirical results. arXiv preprint arXiv:1510.03205, 2015.
[22] Shanshan Wang, Rudi Schäfer, and Thomas Guhr. Cross-response in correlated financial markets: individual stocks. The European Physical Journal B, 89(4):105, 2016.
Models and Axioms
This section proves some of the results stated in Table 1 and implications between the different axioms.
A.1 Relation between fragmentation and liquidity axioms
Let us start with a powerful lemma that allows us to translate liquidity properties into fragmentation properties through split invariance.
Lemma 1.
Let us consider a cross-impact model $\Lambda$ that is split-invariant and a subspace $V$ such that $\emptyset \subset V \subseteq \mathbb{R}^n$. Define the regularized orderflow-covariance and response matrices:
\[ \Omega' = (\bar\Pi_V + \varepsilon \Pi_V)\,\Omega\,(\bar\Pi_V + \varepsilon \Pi_V), \qquad R' = R\,(\bar\Pi_V + \varepsilon \Pi_V). \]
Then we have:
\[ \Lambda(\Sigma, \Omega', R') = \bar\Pi_V \Lambda(\Sigma'', \Omega, R'') \bar\Pi_V + \varepsilon^{-1}\, \bar\Pi_V \Lambda(\Sigma'', \Omega, R'') \Pi_V + \varepsilon^{-1}\, \Pi_V \Lambda(\Sigma'', \Omega, R'') \bar\Pi_V + \varepsilon^{-2}\, \Pi_V \Lambda(\Sigma'', \Omega, R'') \Pi_V, \]
where
\[ \Sigma'' = (\bar\Pi_V + \varepsilon \Pi_V)\,\Sigma\,(\bar\Pi_V + \varepsilon \Pi_V), \qquad R'' = (\bar\Pi_V + \varepsilon \Pi_V)\,R. \]

Proof. By split invariance, we can write:
\[ \Lambda(\Sigma, \Omega', R') = D\,\Lambda(D^{-1} \Sigma D^{-1},\, D \Omega' D,\, D^{-1} R' D)\,D, \]
where we choose $D = \bar\Pi_V + \varepsilon^{-1} \Pi_V$. We can always do this with $D$ diagonal as long as $V$ is generated by the canonical basis. We then obtain:
\[ \Lambda(\Sigma, \Omega', R') = (\bar\Pi_V + \varepsilon^{-1} \Pi_V)\,\Lambda(D^{-1} \Sigma D^{-1},\, \Omega,\, D^{-1} R)\,(\bar\Pi_V + \varepsilon^{-1} \Pi_V), \]
which, upon substitution of $D^{-1} = \bar\Pi_V + \varepsilon \Pi_V$, yields the result.

This implies that, if one assumes fragmentation invariance and controls the speed of divergence of the remaining terms, one can show cross-stability results.

Proposition 5.
Let a cross-impact model $\Lambda$ satisfy split invariance (Axiom 4) and semi-strong fragmentation invariance (Axiom 9). Then:
(i) $\Lambda$ is weakly cross-stable (Axiom 11) if $\varepsilon^{-1}\, \bar\Pi_V \Lambda(\Sigma'', \Omega, R'') \Pi_V = O(1)$ as $\varepsilon \to 0$.
(ii) If, additionally, $\Lambda$ is continuous in the first and third argument and strongly fragmentation invariant (Axiom 10), then $\Lambda$ is strongly cross-stable (Axiom 12).
(iii) $\Lambda$ is self-stable (Axiom 13) if $\varepsilon^{-2}\, \Pi_V \Lambda(\Sigma'', \Omega, R'') \Pi_V = O(1)$ as $\varepsilon \to 0$.

Proof. Using the results of Lemma 1, if we assume that $\varepsilon^{-1}\, \bar\Pi_V \Lambda(\Sigma'', \Omega, R'') \Pi_V = O(1)$ as $\varepsilon \to 0$, then $\Lambda$ is weakly cross-stable (Axiom 11). Further assuming continuity at $\varepsilon = 0$,
\[ \bar\Pi_V \Lambda(\Sigma, \Omega', R') \bar\Pi_V = \bar\Pi_V \Lambda(\Sigma'', \Omega, R'') \bar\Pi_V \underset{\varepsilon \to 0}{=} \bar\Pi_V \Lambda(\Sigma'', \Omega, R'')\big|_{\varepsilon = 0}\, \bar\Pi_V + o(1). \]
Strong fragmentation invariance (Axiom 10) thus implies strong cross-stability (Axiom 12). Finally, if $\varepsilon^{-2}\, \Pi_V \Lambda(\Sigma'', \Omega, R'') \Pi_V = O(1)$ as $\varepsilon \to 0$, then Lemma 1 implies that $\Lambda$ is self-stable.

One can also exploit the same lemma and try to establish a link with the fragmentation invariance properties. In a similar fashion as Lemma 1, one can prove the following Lemma.

Lemma 2.
Let us consider a cross-impact model $\Lambda$ that is split-invariant and a subspace $V$ such that $\emptyset \subset V \subseteq \mathbb{R}^n$. Define the regularized price-covariance and response matrices:
\[ \Sigma' := (\bar\Pi_V + \varepsilon \Pi_V)\,\Sigma\,(\bar\Pi_V + \varepsilon \Pi_V), \qquad R' := (\bar\Pi_V + \varepsilon \Pi_V)\,R. \]
Then we have:
\[ \Lambda(\Sigma', \Omega, R') = \bar\Pi_V \Lambda(\Sigma, \Omega'', R'') \bar\Pi_V + \varepsilon^{-1}\, \bar\Pi_V \Lambda(\Sigma, \Omega'', R'') \Pi_V + \varepsilon^{-1}\, \Pi_V \Lambda(\Sigma, \Omega'', R'') \bar\Pi_V + \varepsilon^{-2}\, \Pi_V \Lambda(\Sigma, \Omega'', R'') \Pi_V, \]
where:
\[ \Omega'' := (\bar\Pi_V + \varepsilon \Pi_V)\,\Omega\,(\bar\Pi_V + \varepsilon \Pi_V), \qquad R'' := R\,(\bar\Pi_V + \varepsilon \Pi_V). \]

Proposition 6.
Let $\Lambda$ be a split-invariant, weakly cross-stable and self-stable cross-impact model (Axioms 4, 11 and 13). Then, if $\ker \Sigma$ can be generated by the canonical basis, $\Lambda$ is semi-strongly fragmentation invariant (Axiom 9).

Proof. Let us assume that it is possible to generate $\ker(\Sigma)$ with the physical (canonical) basis. In that case, one can choose $V = \ker(\Sigma)$ and observe that, using the notations introduced in Lemma 2,
$$\Sigma' = \bar{\Pi}_V \Sigma\, \bar{\Pi}_V, \qquad R' = \bar{\Pi}_V R.$$
Weak cross-stability (Axiom 11) implies:
$$\bar{\Pi}_V \Lambda(\Sigma', \Omega', R')\,\Pi_V \underset{\varepsilon \to 0}{=} O(1), \qquad \Pi_V \Lambda(\Sigma', \Omega', R')\,\bar{\Pi}_V \underset{\varepsilon \to 0}{=} O(1).$$
Therefore, applying Lemma 2 yields
$$\bar{\Pi}_V \Lambda(\Sigma', \Omega, R')\,\Pi_V = \varepsilon^{-1}\,\bar{\Pi}_V \Lambda(\Sigma, \Omega'', R'')\,\Pi_V \underset{\varepsilon \to 0}{=} O(1),$$
and we obtain
$$\bar{\Pi}_V \Lambda(\Sigma, \Omega'', R'')\,\Pi_V = \bar{\Pi}_V \Lambda(\Sigma, \Omega, R)\,\Pi_V \underset{\varepsilon \to 0}{\longrightarrow} 0$$
(similarly, $\Pi_V \Lambda(\Sigma, \Omega, R)\,\bar{\Pi}_V \to 0$ and, using self-stability, $\Pi_V \Lambda(\Sigma, \Omega, R)\,\Pi_V \to 0$ as $\varepsilon \to 0$). Combined, we thus have proved Axioms 8 and 9.
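The expansions in inverse powers of $\varepsilon$ used in Lemmas 1 and 2 rest on a purely algebraic fact: since $\Pi_V$ and $\bar{\Pi}_V$ are complementary orthogonal projectors, $(\bar{\Pi}_V + \varepsilon \Pi_V)^{-1} = \bar{\Pi}_V + \varepsilon^{-1}\Pi_V$, so conjugating any matrix by this inverse splits it into four blocks weighted by $1$, $\varepsilon^{-1}$, $\varepsilon^{-1}$ and $\varepsilon^{-2}$. The following NumPy sketch checks this identity numerically. The function name `split_blocks`, the random matrix `M` (standing in for a cross-impact matrix) and the choice of $V$ as the span of two canonical directions are illustrative assumptions, not part of the original text.

```python
import numpy as np

def split_blocks(M, idx_V, eps):
    """Compare D^{-1} M D^{-1} with its four-block expansion, D = Pbar + eps * P."""
    n = M.shape[0]
    P = np.zeros((n, n))
    P[idx_V, idx_V] = 1.0            # projector on V (canonical directions, assumed)
    Pbar = np.eye(n) - P             # projector on the orthogonal complement of V
    D = Pbar + eps * P
    D_inv = Pbar + P / eps           # closed-form inverse of D
    lhs = D_inv @ M @ D_inv
    rhs = (Pbar @ M @ Pbar
           + (Pbar @ M @ P + P @ M @ Pbar) / eps
           + (P @ M @ P) / eps**2)
    return D, D_inv, lhs, rhs

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))          # arbitrary matrix, hypothetical stand-in for Lambda
D, D_inv, lhs, rhs = split_blocks(M, idx_V=[2, 3], eps=1e-2)
print(np.allclose(D @ D_inv, np.eye(4)), np.allclose(lhs, rhs))
```

The stability axioms then amount to statements on how fast each of the three blocks involving $\Pi_V$ may grow as $\varepsilon \to 0$.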
A.2 Proof of Proposition 3
Proposition 7.
Let $\Lambda$ be a symmetric, positive-definite and price-covariance consistent cross-impact model (Axioms 6, 7 and 14). Then $\Lambda = \Lambda_{\mathrm{kyle}}$ up to a multiplicative constant.

Proof. Let $\Lambda$ be a cross-impact model which satisfies Axioms 6 and 14 and $(\Sigma, \Omega, R) \in (\mathcal{S}_n^{+} \times \mathcal{S}_n^{++} \times \mathcal{M}_n)$. Then we have, writing $\Lambda$ for $\Lambda(\Sigma, \Omega, R)$, and $L$ for a matrix such that $\Omega = L L^\top$,
$$\Sigma = \Lambda \Omega \Lambda^\top = \Lambda L L^\top \Lambda^\top = (\Lambda L)(\Lambda L)^\top.$$
Thus, by uniqueness up to a rotation of the square root decomposition, writing $G$ for a matrix such that $\Sigma = G G^\top$, there exists an orthogonal matrix $O$ such that $\Lambda = G O L^{-1}$. Furthermore, since $\Lambda$ is symmetric,
$$G O L^{-1} = (G O L^{-1})^\top.$$
Rewriting, we find
$$O^\top = L^\top G\, O\, L^{-1} G^{-\top},$$
so that
$$\mathbb{I} = O\, L^\top G\, O\, L^{-1} G^{-\top} = O\,(L^\top G)\, O\,(L^\top G)^{-\top} = O\,(L^\top G)\, O\,(L^\top G)(L^\top G)^{-1}(L^\top G)^{-\top}.$$
Finally, we have:
$$(O L^\top G)^2 = (L^\top G)^\top (L^\top G).$$
Since $(L^\top G)^\top (L^\top G)$ is symmetric positive semi-definite, the square root is unique and
$$O L^\top G = \sqrt{(L^\top G)^\top (L^\top G)},$$
which concludes the proof.

A.3 Proof of Proposition 4
Let us start with a lemma that will be used throughout this subsection.
Lemma 3.
Let $\Lambda$ be a cross-impact model which satisfies Axioms 4 and 5. Then, for all $(\Sigma, \Omega, R) \in (\mathcal{S}_n^{+} \times \mathcal{S}_n^{++} \times \mathcal{M}_n)$, it can be written as
$$\Lambda(\Sigma, \Omega, R) = L^{-\top} U\, \Lambda(U^\top \hat{\Sigma} U, \mathbb{I}, U^\top \hat{R} U)\, U^\top L^{-1},$$
where
$$\Omega = L L^\top, \qquad \hat{\Sigma} = L^\top \Sigma L, \qquad \hat{R} = L^\top R L^{-\top},$$
and $U$ is an orthogonal matrix (i.e., $U U^\top = \mathbb{I}$).

Proof. The lemma is obtained by applying sequentially rotational invariance, split invariance and again rotational invariance. The first two transformations can be used in order to remove the dependency on $\Omega$ as the second argument of the $\Lambda(\Sigma, \Omega, R)$ function.

Proposition 8.
A return covariance based cross-impact model $\Lambda$ that is both split invariant and rotational invariant (Axioms 4 and 5) can always be written in the form
$$\Lambda(\Sigma, \Omega) = L^{-\top} U F(\mu)\, U^\top L^{-1},$$
where
$$\Omega = L L^\top; \qquad \hat{\Sigma} := L^\top \Sigma L; \qquad U^\top \hat{\Sigma} U := \mathrm{diag}(\mu); \qquad F(\mu) := \Lambda(\mathrm{diag}(\mu), \mathbb{I}).$$
Furthermore, if $\Lambda$ is cash-invariant and direct-invariant (Axioms 2 and 3), then $F(\mu) \propto \mathrm{diag}(\mu)^{1/2}$ and $\Lambda = \Lambda_{\mathrm{kyle}}$ up to a multiplicative constant.

Proof. For a return covariance based model, we can simply choose from Eq. (3) to fix $U$ as the rotation that diagonalizes the symmetric matrix $\hat{\Sigma}$, obtaining:
$$U^\top \hat{\Sigma} U = \mathrm{diag}(\mu).$$
This choice implies:
$$\Lambda(\Sigma, \Omega) = L^{-\top} U\, \Lambda(\mathrm{diag}(\mu), \mathbb{I})\, U^\top L^{-1},$$
which yields the result of the first part of the proposition. Furthermore, if we assume $\Lambda$ is cash-invariant and direct-invariant (Axioms 2 and 3),
$$\Lambda(\mathrm{diag}(\mu), \mathbb{I}) = \sum_{i=1}^{d} \sqrt{\mu_i}\; \Lambda(e_i e_i^\top, e_i e_i^\top),$$
which yields the kyle model up to a constant.

A.4 Proof of some properties of the kyle model

Lemma 4.
The kyle model is strongly cross-stable in the sense of Axioms 11 and 12 and is not self-stable in the sense of Axiom 13.

Proof. Let $V$ be a linear subspace of $\mathbb{R}^n$ and $\varepsilon > 0$. Note that, writing $G$ for a matrix such that $G G^\top = \Sigma$, for any matrix $L_\varepsilon$ such that $L_\varepsilon L_\varepsilon^\top = \Omega_\varepsilon$, there exists a rotation matrix $O_\varepsilon$ such that we have
$$\Lambda_{\mathrm{kyle}} = G O_\varepsilon L_\varepsilon^{-1}.$$
However,
$$\Omega_\varepsilon = (\bar{\Pi}_V + \varepsilon \Pi_V)\,\Omega\,(\bar{\Pi}_V + \varepsilon \Pi_V) = (\bar{\Pi}_V + \varepsilon \Pi_V)\,L L^\top (\bar{\Pi}_V + \varepsilon \Pi_V) = [(\bar{\Pi}_V + \varepsilon \Pi_V) L]\,[(\bar{\Pi}_V + \varepsilon \Pi_V) L]^\top.$$
Thus,
$$\Lambda_{\mathrm{kyle}} = G O_\varepsilon [(\bar{\Pi}_V + \varepsilon \Pi_V) L]^{-1} = G O_\varepsilon L^{-1} (\bar{\Pi}_V + \varepsilon^{-1} \Pi_V) = G O_\varepsilon L^{-1} \bar{\Pi}_V + \varepsilon^{-1} G O_\varepsilon L^{-1} \Pi_V.$$
Using the symmetry of the kyle model, the above yields:
$$\Lambda_{\mathrm{kyle}} = \bar{\Pi}_V L^{-\top} O_\varepsilon^\top G^\top + \varepsilon^{-1}\, \Pi_V L^{-\top} O_\varepsilon^\top G^\top.$$
Thus, we have:
$$\bar{\Pi}_V \Lambda_{\mathrm{kyle}} \Pi_V = \bar{\Pi}_V L^{-\top} O_\varepsilon^\top G^\top \Pi_V,$$
and, as $O_\varepsilon^\top$ is an orthogonal matrix (hence bounded uniformly in $\varepsilon$):
$$\varepsilon^{\gamma}\, \bar{\Pi}_V \Lambda_{\mathrm{kyle}} \Pi_V \underset{\varepsilon \to 0}{=} O(1),$$
which proves weak cross-stability (Axiom 11). Furthermore,
$$\Pi_V \Lambda_{\mathrm{kyle}} \Pi_V = \varepsilon^{-1}\, \Pi_V L^{-\top} O_\varepsilon^\top G^\top \Pi_V,$$
so that unless $\Pi_V L^{-\top} O_\varepsilon^\top G^\top \Pi_V = 0$, we have:
$$\| \Pi_V \Lambda_{\mathrm{kyle}} \Pi_V \| = \varepsilon^{-1}\, \| \Pi_V L^{-\top} O_\varepsilon^\top G^\top \Pi_V \| \underset{\varepsilon \to 0}{\longrightarrow} \infty.$$
Choosing diagonal $\Sigma$ and $\Omega$ such that $\Pi_V L \neq 0$ and $G \Pi_V \neq 0$, we see that $\Pi_V L^{-\top} O_\varepsilon^\top G^\top \Pi_V \neq 0$ for such $\Sigma$, $\Omega$. This shows that kyle does not satisfy Axiom 13. Finally, notice that by using Lemma 3 one can make $\Omega$ appear only in the combination $L^\top \Sigma L$, which is insensitive to the components of $\Omega$ belonging to the kernel of $\Sigma$, which proves strong cross-stability.

B Proofs of illustrative examples
This section is devoted to proving the results given in Section 4, for completeness.
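The closed forms proved in this section can be checked against a direct numerical evaluation of the Kyle model, i.e. the symmetric positive semi-definite solution of $\Lambda \Omega \Lambda = \Sigma$, written as $\Lambda_{\mathrm{kyle}} = L^{-\top} \sqrt{L^\top \Sigma L}\, L^{-1}$ with $\Omega = L L^\top$. The sketch below does so for the two-asset case with $\Sigma = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}$ and $\Omega = \mathrm{diag}(\omega_1^2, \omega_2^2)$. The helper names (`sqrtm_psd`, `kyle`, `kyle_2d_closed_form`) and the numerical values are illustrative assumptions.

```python
import numpy as np

def sqrtm_psd(M):
    """Symmetric PSD square root, computed from the eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def kyle(Sigma, Omega):
    """Symmetric PSD solution of Lambda @ Omega @ Lambda = Sigma."""
    L = np.linalg.cholesky(Omega)            # Omega = L L^T
    L_inv = np.linalg.inv(L)
    return L_inv.T @ sqrtm_psd(L.T @ Sigma @ L) @ L_inv

def kyle_2d_closed_form(s1, s2, rho, w1, w2):
    """Two-asset closed form for diagonal Omega = diag(w1^2, w2^2)."""
    q = np.sqrt(1.0 - rho**2)
    norm = np.sqrt(s1**2 * w1**2 + 2 * w1 * w2 * s1 * s2 * q + s2**2 * w2**2)
    return np.array([
        [s1**2 + (w2 / w1) * s1 * s2 * q, s1 * s2 * rho],
        [s1 * s2 * rho, s2**2 + (w1 / w2) * s1 * s2 * q],
    ]) / norm

# Hypothetical parameter values, chosen only for the check.
s1, s2, rho, w1, w2 = 0.8, 1.5, 0.4, 2.0, 0.7
Sigma = np.array([[s1**2, rho * s1 * s2], [rho * s1 * s2, s2**2]])
Omega = np.diag([w1**2, w2**2])

numeric = kyle(Sigma, Omega)
closed = kyle_2d_closed_form(s1, s2, rho, w1, w2)
print(np.max(np.abs(numeric - closed)))      # discrepancy between the two evaluations
```

Setting `s1 = 1` recovers the normalization $\Sigma_{11} = 1$ used in the proof of Example 1.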
B.1 Proof of Example 1
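We have:
$$\Sigma = \begin{pmatrix} 1 & \rho\sigma \\ \rho\sigma & \sigma^2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ \sigma\rho & \sigma\sqrt{1-\rho^2} \end{pmatrix} \begin{pmatrix} 1 & \sigma\rho \\ 0 & \sigma\sqrt{1-\rho^2} \end{pmatrix} = G G^\top.$$
Thus, as we can express $O$ in the orthogonal group solely as a function of an angle $\theta$ as $O = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$, we can express the Kyle estimator as:
$$\Lambda_{\mathrm{kyle}} = G O L^{-1} = \begin{pmatrix} 1 & 0 \\ \sigma\rho & \sigma\sqrt{1-\rho^2} \end{pmatrix} \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} 1/\omega_1 & 0 \\ 0 & 1/\omega_2 \end{pmatrix} = \begin{pmatrix} \cos\theta/\omega_1 & -\sin\theta/\omega_2 \\ \frac{\sigma}{\omega_1}\left(\rho\cos\theta + \sqrt{1-\rho^2}\,\sin\theta\right) & \frac{\sigma}{\omega_2}\left(-\rho\sin\theta + \sqrt{1-\rho^2}\,\cos\theta\right) \end{pmatrix}$$
$$= \frac{\cos\theta}{\omega_1} \begin{pmatrix} 1 & -\frac{\omega_1}{\omega_2}\tan\theta \\ \sigma\left(\rho + \sqrt{1-\rho^2}\,\tan\theta\right) & \sigma\frac{\omega_1}{\omega_2}\left(-\rho\tan\theta + \sqrt{1-\rho^2}\right) \end{pmatrix}.$$
The symmetry of $\Lambda_{\mathrm{kyle}}$ then fixes the angle:
$$\theta = -\arctan\left( \frac{\sigma\rho}{\frac{\omega_1}{\omega_2} + \sigma\sqrt{1-\rho^2}} \right).$$
Injecting in the above we have:
$$\Lambda_{\mathrm{kyle}} = \frac{\cos\theta}{\omega_1} \begin{pmatrix} 1 & \frac{\sigma\rho}{1 + \frac{\omega_2}{\omega_1}\sigma\sqrt{1-\rho^2}} \\ \frac{\sigma\rho}{1 + \frac{\omega_2}{\omega_1}\sigma\sqrt{1-\rho^2}} & \sigma\frac{\omega_1}{\omega_2}\sqrt{1-\rho^2} - \sigma\frac{\omega_1}{\omega_2}\rho\tan\theta \end{pmatrix}$$
$$= \frac{1}{\omega_2 \sqrt{\frac{\omega_1^2}{\omega_2^2} + 2\frac{\omega_1}{\omega_2}\sigma\sqrt{1-\rho^2} + \sigma^2}} \begin{pmatrix} 1 + \frac{\omega_2}{\omega_1}\sigma\sqrt{1-\rho^2} & \sigma\rho \\ \sigma\rho & \left(1 + \frac{\omega_2}{\omega_1}\sigma\sqrt{1-\rho^2}\right)\left(\sigma\frac{\omega_1}{\omega_2}\sqrt{1-\rho^2} - \sigma\frac{\omega_1}{\omega_2}\rho\tan\theta\right) \end{pmatrix}$$
$$= \frac{1}{\sqrt{\omega_1^2 + 2\omega_1\omega_2\sigma\sqrt{1-\rho^2} + \omega_2^2\sigma^2}} \begin{pmatrix} 1 + \frac{\omega_2}{\omega_1}\sigma\sqrt{1-\rho^2} & \sigma\rho \\ \sigma\rho & \left(1 + \frac{\omega_2}{\omega_1}\sigma\sqrt{1-\rho^2}\right)\left(\sigma\frac{\omega_1}{\omega_2}\sqrt{1-\rho^2} - \sigma\frac{\omega_1}{\omega_2}\rho\tan\theta\right) \end{pmatrix}.$$
Finally,
$$\Lambda_{\mathrm{kyle}} = \frac{1}{\sqrt{\omega_1^2 + 2\omega_1\omega_2\sigma\sqrt{1-\rho^2} + \omega_2^2\sigma^2}} \begin{pmatrix} 1 + \frac{\omega_2}{\omega_1}\sigma\sqrt{1-\rho^2} & \sigma\rho \\ \sigma\rho & \sigma^2 + \sigma\frac{\omega_1}{\omega_2}\sqrt{1-\rho^2} \end{pmatrix}.$$
The previous computations allow us to generalize to any return covariance matrix written $\Sigma = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}$, as then we have:
$$\Lambda_{\mathrm{kyle}} = \frac{1}{\sqrt{\sigma_1^2\omega_1^2 + 2\omega_1\omega_2\sigma_1\sigma_2\sqrt{1-\rho^2} + \sigma_2^2\omega_2^2}} \begin{pmatrix} \sigma_1^2 + \frac{\omega_2}{\omega_1}\sigma_1\sigma_2\sqrt{1-\rho^2} & \sigma_1\sigma_2\rho \\ \sigma_1\sigma_2\rho & \sigma_2^2 + \sigma_1\sigma_2\frac{\omega_1}{\omega_2}\sqrt{1-\rho^2} \end{pmatrix},$$
which concludes the proof.

B.2 Proof of Example 2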
Example 3 (Calendar Spread). Let
$$\Sigma := \sigma^2 \begin{pmatrix} 1 & \rho & 1-\rho \\ \rho & 1 & -(1-\rho) \\ 1-\rho & -(1-\rho) & 2(1-\rho) \end{pmatrix} \quad \text{and} \quad \Omega := \begin{pmatrix} \omega_1^2 & 0 & 0 \\ 0 & \omega_2^2 & 0 \\ 0 & 0 & \omega_3^2 \end{pmatrix}.$$
Then, one has:
$$\Lambda_{\mathrm{kyle}}(\Sigma, \Omega) = \frac{\sigma\lambda_1\lambda_2}{\sqrt{2}\,\sqrt{\bar{\omega}_1^2\lambda_1^2 + \bar{\omega}_2^2\lambda_2^2 + 2\bar{\omega}_1\bar{\omega}_2\lambda_1\lambda_2\sqrt{1-\rho_\omega^2}}} \begin{pmatrix} a + b - 2c & a - b & 2b - 2c \\ a - b & a + b + 2c & -2b - 2c \\ 2b - 2c & -2b - 2c & 4b \end{pmatrix},$$
with the shorthands
$$a := \frac{\lambda_1}{\lambda_2} + \frac{\bar{\omega}_2}{\bar{\omega}_1\sqrt{1-\rho_\omega^2}}, \qquad b := \frac{\lambda_2}{\lambda_1} + \frac{\bar{\omega}_1}{\bar{\omega}_2\sqrt{1-\rho_\omega^2}}, \qquad c := \frac{\rho_\omega}{\sqrt{1-\rho_\omega^2}},$$
where
$$\rho_\omega := \frac{\omega_1^2 - \omega_2^2}{\sqrt{(\omega_1^2 + \omega_2^2)(\omega_1^2 + \omega_2^2 + 4\omega_3^2)}}, \qquad \bar{\omega}_1^2 := \omega_1^2 + \omega_2^2, \qquad \bar{\omega}_2^2 := \omega_1^2 + \omega_2^2 + 4\omega_3^2, \qquad \lambda_1 = \sqrt{1+\rho}, \qquad \lambda_2 = \sqrt{1-\rho}.$$
Furthermore,
$$\Lambda_{\mathrm{el}}(\Sigma, \Omega) = \frac{\sqrt{1+\rho}\,\sigma}{\sqrt{2}\,\bar{\omega}_1} \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} + \frac{\sqrt{1-\rho}\,\sigma}{\sqrt{2}\,\bar{\omega}_2} \begin{pmatrix} 1 & -1 & 2 \\ -1 & 1 & -2 \\ 2 & -2 & 4 \end{pmatrix}.$$

By construction, the two legs and the calendar spread are linearly related; therefore, with a proper change of basis (thanks to the invariance of basis, see Axioms 4 and 5), one can reduce this three-dimensional case to Example 1. The return covariance matrix can be diagonalized as
$$\Sigma = \sigma^2 \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & -\frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{3}} \\ 0 & \frac{2}{\sqrt{6}} & \frac{1}{\sqrt{3}} \end{pmatrix} \begin{pmatrix} 1+\rho & 0 & 0 \\ 0 & 3(1-\rho) & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & -\frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{3}} \\ 0 & \frac{2}{\sqrt{6}} & \frac{1}{\sqrt{3}} \end{pmatrix}^{-1} =: \sigma^2 P \Sigma_D P^{-1}.$$
By the invariance of basis, we have:
$$\Lambda_{\mathrm{kyle}}(\Sigma, \Omega, R) = P\, \Lambda_{\mathrm{kyle}}(\sigma^2\Sigma_D, P^{-1}\Omega P, P^{-1} R P)\, P^{-1}.$$
Furthermore, using fragmentation invariance (see Axioms 8 and 10), the impact matrix is left unchanged if the volume covariance's projection along zero-risk modes is set to zero. Thus, denoting $\Pi$ the projection orthogonal to zero-risk modes, we obtain:
$$\Lambda_{\mathrm{kyle}}(\Sigma, \Omega, R) = P\, \Lambda_{\mathrm{kyle}}(\sigma^2\Sigma_D, \Pi P^{-1}\Omega P\, \Pi, P^{-1} R P)\, P^{-1}.$$
Thus, our computations reduce to computing the Kyle model on a 2-dimensional system of covariances with diagonal $\Sigma$ and non-diagonal $\Omega$. To do so, note first that $\Lambda_{\mathrm{kyle}}$ is symmetric and:
$$\Lambda_{\mathrm{kyle}}(\Sigma, \Omega, R) = G O L^{-1},$$
with $O$ the orthogonal matrix such that $O L^\top G = \sqrt{(L^\top G)^\top (L^\top G)}$. Thus, we can use symmetry and, for diagonal $\Sigma$ (so that $G$ can be chosen symmetric):
$$\Lambda_{\mathrm{kyle}}(\Sigma, \Omega, R) = G O L^{-1} = L^{-\top} O^\top G^\top = L^{-\top} O^\top G = \Lambda_{\mathrm{kyle}}(\Omega^{-1}, \Sigma^{-1}, R).$$
Here, restricting to the two modes with non-zero risk,
$$\Pi P^{-1} \Omega P\, \Pi = \begin{pmatrix} \frac{\omega_1^2 + \omega_2^2}{2} & \frac{\omega_1^2 - \omega_2^2}{2\sqrt{3}} \\ \frac{\omega_1^2 - \omega_2^2}{2\sqrt{3}} & \frac{\omega_1^2 + \omega_2^2 + 4\omega_3^2}{6} \end{pmatrix},$$
so that, writing $\rho_\omega$, $\bar{\omega}_1$, $\bar{\omega}_2$, $\lambda_1$, $\lambda_2$ and $a$, $b$, $c$ as in the statement, the result of Example 1 gives in this basis
$$\Lambda_{\mathrm{kyle}} = \frac{\sigma\lambda_1\lambda_2}{\sqrt{2}\,\sqrt{\bar{\omega}_1^2\lambda_1^2 + \bar{\omega}_2^2\lambda_2^2 + 2\bar{\omega}_1\bar{\omega}_2\lambda_1\lambda_2\sqrt{1-\rho_\omega^2}}} \begin{pmatrix} 2a & -2\sqrt{3}\,c \\ -2\sqrt{3}\,c & 6b \end{pmatrix}.$$
So that finally, moving back to the original basis:
$$\Lambda_{\mathrm{kyle}} = \frac{\sigma\lambda_1\lambda_2}{\sqrt{2}\,\sqrt{\bar{\omega}_1^2\lambda_1^2 + \bar{\omega}_2^2\lambda_2^2 + 2\bar{\omega}_1\bar{\omega}_2\lambda_1\lambda_2\sqrt{1-\rho_\omega^2}}} \begin{pmatrix} a + b - 2c & a - b & 2b - 2c \\ a - b & a + b + 2c & -2b - 2c \\ 2b - 2c & -2b - 2c & 4b \end{pmatrix}.$$
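As a sanity check of the calendar-spread computation, the sketch below compares a direct numerical evaluation of the Kyle model on the singular three-dimensional covariance with a closed form built on the two non-zero-risk directions $u_1 = (1, 1, 0)$ and $u_2 = (1, -1, 2)$. The normalizations $\bar{\omega}_1^2 = \omega_1^2 + \omega_2^2$, $\bar{\omega}_2^2 = \omega_1^2 + \omega_2^2 + 4\omega_3^2$, $\lambda_1 = \sqrt{1+\rho}$, $\lambda_2 = \sqrt{1-\rho}$ are assumptions made here so that the formula is self-consistent; the function names and parameter values are illustrative as well.

```python
import numpy as np

def sqrtm_psd(M):
    """Symmetric PSD square root, computed from the eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def kyle(Sigma, Omega):
    """Symmetric PSD solution of Lambda @ Omega @ Lambda = Sigma."""
    L = np.linalg.cholesky(Omega)            # Omega = L L^T
    L_inv = np.linalg.inv(L)
    return L_inv.T @ sqrtm_psd(L.T @ Sigma @ L) @ L_inv

def calendar_spread_closed_form(sigma, rho, w1, w2, w3):
    """Closed form under the normalization assumed in the lead-in."""
    lam1, lam2 = np.sqrt(1 + rho), np.sqrt(1 - rho)
    wb1 = np.sqrt(w1**2 + w2**2)
    wb2 = np.sqrt(w1**2 + w2**2 + 4 * w3**2)
    rho_w = (w1**2 - w2**2) / (wb1 * wb2)
    s = np.sqrt(1 - rho_w**2)
    D = wb1**2 * lam1**2 + wb2**2 * lam2**2 + 2 * wb1 * wb2 * lam1 * lam2 * s
    a = lam1 / lam2 + wb2 / (wb1 * s)
    b = lam2 / lam1 + wb1 / (wb2 * s)
    c = rho_w / s
    u1 = np.array([1.0, 1.0, 0.0])           # both legs move together
    u2 = np.array([1.0, -1.0, 2.0])          # legs move apart, spread widens
    core = (a * np.outer(u1, u1) + b * np.outer(u2, u2)
            - c * (np.outer(u1, u2) + np.outer(u2, u1)))
    return sigma * lam1 * lam2 / np.sqrt(2 * D) * core

# Hypothetical parameter values, chosen only for the check.
sigma, rho, w1, w2, w3 = 1.3, 0.6, 2.0, 1.0, 0.5
Sigma = sigma**2 * np.array([
    [1.0, rho, 1 - rho],
    [rho, 1.0, -(1 - rho)],
    [1 - rho, -(1 - rho), 2 * (1 - rho)],
])
Omega = np.diag([w1**2, w2**2, w3**2])

numeric = kyle(Sigma, Omega)
closed = calendar_spread_closed_form(sigma, rho, w1, w2, w3)
zero_mode = np.array([-1.0, 1.0, 1.0])       # direction carrying no risk
print(np.max(np.abs(numeric - closed)), np.max(np.abs(closed @ zero_mode)))
```

Because the covariance is singular, the numerical square root carries a small floating-point residual along the zero-risk direction, so the comparison should be read with a tolerance of order $10^{-6}$ rather than machine precision.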