Transaction Costs in Execution Trading
UNIVERSITY OF OXFORD
David Marcos
A thesis submitted in partial fulfillment for the degree of
MSc Mathematical Finance
Mathematical Institute
St. Anne’s College
December 2019

“We shall not cease from exploration, and the end of all our exploring will be to arrive where we started and know the place for the first time.”
T. S. Eliot
“This too to remember. If a man writes clearly enough any one can see if he fakes. If he mystifies to avoid a straight statement, which is very different from breaking so-called rules of syntax or grammar to make an effect which can be obtained in no other way, the writer takes a longer time to be known as a fake and other writers who are afflicted by the same necessity will praise him in their own defense. True mysticism should not be confused with incompetence in writing which seeks to mystify where there is no mystery but is really only the necessity to fake to cover lack of knowledge or the inability to state clearly. Mysticism implies a mystery and there are many mysteries; but incompetence is not one of them; nor is overwritten journalism made literature by the injection of a false epic quality. Remember this too: all bad writers are in love with the epic.”
Ernest Hemingway

Acknowledgements

“For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled.”
Richard P. Feynman

Writing acknowledgements under the regulatory constraint of not making it possible to identify the writer is certainly a challenging task. However, I realise that the people on this list are extremely unique. So I hope this makes my vagueness not challenging for them when trying to identify themselves.

First and foremost, I would like to thank my Thesis advisor. Not only for providing me guidance, but also for allowing me the intellectual freedom that was necessary to complete this work. Finding someone that combines a considerable degree of both wisdom and modesty is, in my opinion, a lucky rare event. My advisor is one of them.

The peculiar members of the trading team deserve a special mention, starting with P. Hegetschweiler, M. Both, A. Papale, and C. Laske; and followed by L. Challenger and M. David, among others, who have massively contributed to my knowledge of how trading is done in the real world (and about recommended wine, restaurants, and other amenities).

I am still indebted to my advisor in Physics, Ramón A. –from whom I learned one of the best abilities that a man can have: critical thinking; and as a corollary the science of academic writing. To this day, his human and professional abilities remain an inspiration for me. At Oxford I left many unforgettable memories and people. Thanks to the long list of classmates with whom I shared fun and desperation. Thanks to the Maths Department, where both Professors and Secretaries have made the Programme extremely well organised and easy to follow (despite the sometimes long pricing formulas). Thanks to that Professor that helped me land my first job in Finance. Thanks to St. Anne’s College, for being both liberal in spirit and deep in nature. Thanks to my friends of adventure at Oxford, mainly to Mati Kaim and Jasper Beerepoot, who both showed me that crisis and opportunity can be synonymous words.

This Thesis would not exist without my closest friends and family.
These are the people that have been next to me day after day, on sunny or rainy days. Thank you, Alejandro G. Alfonso, for being as close as a brother to me. Sara S. Bailador, our childhood really was the beginning of a beautiful friendship. Walter Boyajian, for all we keep learning and living together. Thanks to Nenad and Boro Skalonja, Faustino López, and so many others. Thank you Eva M. –despite living 10000 km away from each other, you are extremely close to me. Thank you Magda, for showing me the real meaning of the word happiness. Thank you to my parents, Valentín and Araceli, for your emotions; for a path of unconditional support. Thank you to my grandmothers, Julia and Inmaculada. I do not underestimate the work you have done to provide the next generations with a better life. I cannot do less now than dedicating this Thesis to you.

Abstract

“Joy lies in the fight, in the attempt, in the suffering involved, not in the victory itself.”
M. Gandhi

In the present work we develop a formalism to tackle the problem of optimal execution when trading market securities. More precisely, we introduce a utility function that balances market impact and timing risk, with the latter being modelled as the very negative transaction costs incurred by our order execution. The framework is built upon existing theory on optimal trading strategies, but incorporates characteristics that enable distinctive execution strategies. The formalism is complemented by an analysis of various impact models and different distributional properties of market returns.

Contents
Acknowledgements
Abstract
Contents
Abbreviations
List of Tables
List of Figures
1 Introduction
2 Trading Strategies and Transaction Costs

Abbreviations
TCA  Transaction Cost Analysis
TC   Transaction Costs
PnL  Profit and Loss
AC   Almgren-Chriss
OTC  Over The Counter
RFQ  Request For Quote
OG   Order Generation
PM   Portfolio Manager
IT   Information Technology
IS   Implementation Shortfall
IM   Impact Model
MI   Market Impact
PMI  Permanent Market Impact
TMI  Temporary Market Impact
VaR  Value at Risk
UF   Utility Function

To my grandmothers, Julia and Inmaculada, who set out on a long road so that this thesis could be possible

Chapter 1
Introduction

“A person’s work may not be finished in his lifetime, but let us begin.”
J. F. Kennedy

A few years ago, at a scientific conference held at the heart of Europe, I was sitting at a dinner table where an interesting discussion arose. Someone asked the question “what –in our opinion– had been the most important discovery or development in human history”. To this day, I remember with clarity some of the answers (and the names of the Professors and students who formulated them –although I will omit these here). “The theory of General Relativity” –said the Cosmology Professor. “Black-Scholes’ theory” –said the Mathematical Finance Professor. “The theory of Evolution” –said a Biologist. “The fact that matter is made of atoms” –said a Condensed Matter Physicist. Fortunately, some students added candidates such as “the development of mechanical tools” or “the emergence of agriculture”. My answer at the time was “the development of the scientific method” –the process through which our understanding of nature is hypothesised and tested using logic and reason, and scrutinised by comparison to empirical evidence. Although I am profoundly aware –as the previous examples suggest– of synthetic happiness and of the power to convince ourselves that what we do is certainly the most relevant and noble task that could possibly be done, perhaps with the aid of this ‘self-convincing effect’, I have, over recent years, come to the belief that “the existence of markets” stands up there, as one of the greatest developments in the history of mankind. The organisation of societies, not through authoritarian leadership, but through an economic framework which allows individuals to attain unanimity without consent, respecting the fundamental value of individual freedom, and thus reaching a true representative democracy, is in my mind one of the most profound developments of our civilisation.
“In everyday, normal experience, there is something of a balance between the amounts of goods and services that some individuals want to supply and the amounts that other, different individuals want to sell. Would-be buyers ordinarily count correctly on being able to carry out their intentions, and would-be sellers do not ordinarily find themselves producing great amounts of goods that they cannot sell. This experience of balance is indeed so widespread that it raises no intellectual disquiet among laymen; they take it so much for granted that they are not disposed to understand the mechanism by which it occurs.”
It is thus important to realise that, although old in nature, markets and trade have not always been around. Furthermore, the realisation that the market system –as opposed to a centralised government– can enable the organisation of societies [Smi76] is a formidable discovery that deserves special attention. What are the mechanisms that guide this ‘invisible hand’? How do they operate? Are they –in some sense– efficient? Does a central agent need to supervise its activity? Is this mechanism preferred to other forms of societal organisation? Remarkably, the answer to these questions is still under intense debate. In the words of F. H. Hahn [Hah70]: “The most intellectually exciting question in our subject remains: is it true that the pursuit of private interest produces not chaos but coherence, and if so how is it done?”. In order to shed some light on the operation of markets, mathematical models have been developed, aiming to address fundamental questions such as how prices are formed in a market economy. From a general perspective, this is the central question around which this Thesis is articulated. From a more specific point of view, the principal question we address is:
‘Given how (our own) supply and demand affect prices, what is the “optimal” way to sell or buy a particular asset?’. Particular attention should be paid to the keyword “optimal”. What does it mean to sell or buy optimally? What are the ways to address this mathematically? As we will see, how (our own) supply/demand affects prices will be described by market impact models, while how to “optimise” our trading will be addressed by the minimisation of a utility function capturing the essential degrees of freedom that contribute to variations of prices or –in our context– costs.

Our theory finds its fundamentals in the Walrasian General Equilibrium formulation [Wal74], in the sense that through the minimisation of a utility function –for a fixed number of shares to buy or sell– we determine a unique, stable, optimal execution strategy for a trading agent in the market economy. Under classical microeconomics theory, the equilibrium price of a financial asset is determined by the point at which supply equals demand. At a fundamental level, the process of price formation therefore depends on the dynamics of an order book, capturing the evolution of all buy and sell quotes, or in other words, on the market microstructure, i.e. how the mechanics of trading governs the shape of the supply and
[...] emergence of prices in financial markets and the will to deepen my understanding of market microstructure cemented the starting point of this Thesis. At the same time, the use of market prices and traded volumes as signals to economic outputs [Hay45, Key36] indicated the importance of this research within an even bigger picture. In this work we focus in particular on transaction costs, i.e. the difference between the price at which we buy or sell an asset, and a specific reference price of that asset (which we will call benchmark price). For simplicity, our discussion will be centered around the Equity market, and so we will talk about buying or selling shares of a publicly-owned company. The study can nonetheless be straightforwardly extended to other asset classes, such as Foreign Exchange, Fixed Income, or Commodities. Just as prices depend on supply and demand, transaction costs will depend on the number of shares sold and bought (trading volumes) over a specific time period. Our goal is to understand how our own supply/demand affects the share price and, given this relationship (impact model), to address the question of what is the best way to buy or sell the stock over time in order to minimise transaction costs.

Over the last two decades, the study and characterisation of transaction costs, generically referred to as
Transaction Cost Analysis (TCA), has become an essential component in the investment process. The importance of this field becomes self-evident when observing the difference between the “on-paper” Profit & Loss (PnL) and the realised PnL of a portfolio: a theoretically profitable trading strategy can become highly unprofitable when taking into account price moves due to our own supply and demand of the constituent assets. As mentioned before, transaction costs (TC) are defined as follows:
Given a trading order, defined by side (buy/sell), size (number of shares), and trading horizon (time span over which the order must be completely executed), TC are the difference between the average execution price per share and a benchmark price per share,
$$ c = \xi \left( p^{\mathrm{exe}} - p^{\mathrm{bmk}} \right), \tag{1.1} $$
with $\xi = -1$ for a buy order and $\xi = 1$ for a sell order. Here $c$ indicates “cost”, while $p$ refers to price per share. The specification “average” (when referring to execution price) comes [...]
i) Commissions, fees and taxes: These correspond to extra payments to external parties (intermediaries in the financial transaction), associated to e.g. brokers/dealers performing the transaction, or to the government allowing it.
ii) Spreads: These correspond to the difference between the ask (offer) and bid price of an asset. This premium is associated to market makers bearing the risk of holding the asset available for both sides of the transaction. Therefore we could refer to the spread as the ‘price of liquidity’.
iii) Price moves in the market: These correspond to price shifts coming from our own action or inaction in the process of executing an order. While our supply/demand signal causes a change of the asset price (‘market impact’), delaying the execution of an order –with respect to the time it is conceived– causes the asset price to drift away due to market volatility generated by other market participants’ supply and demand (‘timing risk’).

Although, as we mention below, the first two categories can be easily incorporated into our framework, we will focus here on TC associated to our own action (or inaction) with respect to trading an order. In this context, the fundamental problem of TCA can be formulated in terms of the trader’s dilemma:

If an order is executed quickly, this may move the security’s price against the trader (‘market impact’); if, however, the order is executed slowly, market volatility might lead to a less favourable price than when the order was received by the trader (‘timing risk’).
This concept is best captured in the words of A. F. Perold [Per98]: “Reality involves the cost of trading and the cost of not trading”, and conveys the idea that there must be an optimal rate of execution, determined by optimising transaction costs. Here the terms “quickly”, “slowly”, and “optimal rate of execution” are precisely defined by the balance between market impact and timing risk.
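As a small illustration of the definition in equation (1.1), the sketch below computes the transaction cost of an order from a list of fills, taking $\xi = -1$ for a buy and $\xi = 1$ for a sell. The function name `transaction_cost`, the representation of fills as (shares, price) pairs, and the example numbers are assumptions made for illustration only; the average execution price is assumed here to be the share-weighted mean of the fill prices.

```python
from typing import Sequence, Tuple

BUY, SELL = -1, 1  # side xi, following the sign convention of equation (1.1)


def transaction_cost(
    side: int,
    fills: Sequence[Tuple[float, float]],
    p_bmk: float,
) -> float:
    """Return c = xi * (p_exe - p_bmk), in price units per share.

    `fills` is a sequence of (shares, price) pairs (hypothetical layout);
    p_exe is taken as the share-weighted average fill price.
    """
    if side not in (BUY, SELL):
        raise ValueError("side must be -1 (buy) or +1 (sell)")
    total_shares = sum(n for n, _ in fills)
    if total_shares <= 0:
        raise ValueError("fills must add up to a positive number of shares")
    p_exe = sum(n * p for n, p in fills) / total_shares
    return side * (p_exe - p_bmk)


# Illustrative numbers: 400 shares filled in two pieces, benchmark price 10.00.
fills = [(100, 10.00), (300, 10.20)]
c_sell = transaction_cost(SELL, fills, p_bmk=10.00)  # sold above the benchmark
c_buy = transaction_cost(BUY, fills, p_bmk=10.00)    # bought above the benchmark
```

With this sign convention the same fills give a positive value for the sell order and a negative value for the buy order, since selling above (or buying below) the benchmark is the favourable outcome.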
[...] alternative formulation of timing risk. Here we introduce a new utility function, characterising timing risk beyond “variance of transaction costs”. Specifically, we consider timing risk as the (negative) tail of the TC distribution, and in this way go beyond a “mean-variance” approach to TCA. As we will see, solving for the optimal trading strategy under this new approach allows us to obtain strategies with more shares traded in later trading intervals than in earlier intervals (a characteristic not possible under the AC framework with a linear impact model). Furthermore, we study the behaviour of the optimal trading strategy when, instead of normally-distributed returns (as AC assume), one considers market returns following a Student distribution. As will be argued, this is a more accurate characterisation of market returns due to the fat-tail nature of their distribution. We also investigate different impact models, such as a sub-linear dependence of the stock price on the number of executed shares, and an exponential or power-law time dependence of market impact in a propagator model [BBDG18]. By doing all this we determine some of the characteristics and limitations of the AC formalism, as well as of our own. In particular, we find that as we consider timing risk encapsulating more extreme market events, the optimal trading strategy “becomes more impatient” (in the sense that most of the shares ought to be executed close to the start of the order). Importantly, we also find that if we minimise a utility function containing only our timing risk term, the optimal trading path is such that all shares ought to be executed at the order start. Finally we discuss the extension of our formalism to include other degrees of freedom, such as the treatment of the trading horizon as a variable –rather than a parameter– of the problem, and offer other future prospects of our work.
Chapter 2
Trading Strategies and Transaction Costs

“It is not from the benevolence of the butcher, the brewer, or the baker, that we expect our dinner, but from their regard to their own interest.”
Adam Smith
As described in the introduction, in this Thesis we tackle the problem of minimising transaction costs, defined by equation (1.1), and we focus on the aspect of transaction costs arising from price moves in the market. In order to solve this optimisation problem, it is useful to break down the execution of an order as it occurs over time. An order to trade a particular security is an instruction specified by the following parameters:
• Side $\xi$: buy or sell.
• Size $N$: total number of shares to trade.
• Time horizon or trading horizon $T$: time given to complete the order.
As we will see, there is a large variety of trading algorithms, which depend on market characteristics such as traded volumes, volatilities and price levels.
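The three parameters above map naturally onto a small record type. The following sketch is illustrative only: the class name `Order`, its field names, the validation rules, and the example values are assumptions, and the encoding of the side as $\xi = -1$ (buy) or $\xi = 1$ (sell) follows the convention of equation (1.1).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """Hypothetical container for the parameters that specify an order."""

    side: int       # xi: -1 for a buy order, +1 for a sell order
    size: int       # N: total number of shares to trade
    horizon: float  # T: time given to complete the order (units chosen by the user)

    def __post_init__(self) -> None:
        # Reject parameter values that cannot describe a valid order.
        if self.side not in (-1, 1):
            raise ValueError("side must be -1 (buy) or +1 (sell)")
        if self.size <= 0:
            raise ValueError("size must be a positive number of shares")
        if self.horizon <= 0:
            raise ValueError("horizon must be positive")


# Example: sell 10,000 shares over a horizon of 6.5 (e.g. hours; an assumption).
order = Order(side=1, size=10_000, horizon=6.5)
```

Making the record immutable is a design choice of this sketch: the order parameters are treated as fixed inputs, while the execution path is what gets optimised.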
1) Market order: This is an order to be executed during the time horizon. As such, its execution is typically benchmarked with respect to “expected” prices given by specific trading algorithms.
2) Market-on-Close order: This is an order to be executed at the close of the Exchange. As such, its execution is typically benchmarked with respect to the price at the time of the close.
3) Market-Open order: This is an order to be executed at the open of the Exchange. As such, its execution is typically benchmarked with respect to the price at the time of the open.
4) Limit order: This is an order whose execution should only be carried out during times at which the share price lies within a band of threshold prices. As such, its execution should typically be benchmarked with respect to prices at which it is possible (as allowed by this band) to execute the order.

In these definitions, we have assumed that orders are executed through Exchanges (where a limit order book (LOB) exists). Although we will focus on this scenario here, it is important to keep in mind that this is not always the case. Often, trading is done “over-the-counter” (OTC), where trading is carried out through a specific dealer. This distinction is relevant because for OTC trading we lack “tick data”, i.e. a quasi-continuum of prices through the trading horizon, implying that the execution algorithms discussed here do not generally apply in this case. Furthermore, OTC execution is typically benchmarked by comparing various quotes from different dealers. In reality,
Request-For-Quote (RFQ)-based trading is more popular in Fixed Income markets, while LOB-based trading is more standard in the Equity market.

In figure 2.1 we schematically show the different stages of an order in a typical trading floor. An order is first decided and then sent to an Order Generation (OG) team (or system) by the Portfolio Manager (PM). The OG team collects orders from multiple PMs and sends them to the Trading team. The trader can then evaluate, given the trading horizon, how to partition and execute the order. Typically, larger order volumes will be executed in time periods with larger market liquidity. The trader could at this stage put in place his strategy by executing through an Exchange. In practice, however, the order is sent (or ‘placed’) to a Broker with a series of instructions (urgency level, etc.), who will in turn implement the execution strategy by trading through an Exchange. Finally, relevant execution values (fill times, fill volumes, and fill prices) are retrieved from the Broker. To each of these stages we can associate a transaction cost: $c_{\mathrm{PM}}$, $c_{\mathrm{OG}}$, $c_{\mathrm{Trader}}$, $c_{\mathrm{Broker}}$, which allows us to better identify where improvements should be made in order to minimise TC.

[Figure 2.1 shows a time axis marked with the events “PM decides”, “PM sends”, “OG receives”, “OG sends”, “Trader receives”, “Trader sends”, “Broker receives”, and “Broker sends”, with the intervals between them labelled by the costs $c_{\mathrm{PM}}$, $c_{\mathrm{OG}}$, $c_{\mathrm{IT}}$ (three times), $c_{\mathrm{Trader}}$, and $c_{\mathrm{Broker}}$.]

Figure 2.1: Timeline of the investment process. The order, decided originally by the Portfolio Manager (PM), is sent to Order Generation (OG) and subsequently to Trader and Broker. Each of the intervals between the time at which the order is received and the time at which it is sent has an associated cost. Furthermore, each of the time intervals between when the order is sent and when it is received has an extra cost, which we label as ‘Information Technology’ (IT) cost.
Given a trading order, an execution algorithm is defined as the path $\{n(t)\}$, $0 \le t \le T$, describing the number of shares being executed at time $t$. Here $T$ is the order time horizon, and $t = 0$ is the ‘order start’ time. Depending on when we consider that the order becomes “active”, the latter can correspond to the time at which the Trader receives the order (typical scenario), the time at which the Broker or OG receive the order, or to the time at which the PM decides that the order should be traded. If $N$ is the total number of shares to buy or sell, $n(t)$ must satisfy the constraint $\int_0^T n(t)\,dt = N$. In reality, however, execution is a discrete process: it occurs as a series of $F$ events (fills, placements, ...) at a set of times $\{t_{j=1,\dots,F}\}$, and thus $n(t) = \sum_{j=1}^{F} n_j\,\delta(t - t_j)$, with $\delta(t)$ a Dirac delta function. Similarly, execution can be thought of as taking place over $K$ time sub-intervals of length $\Delta t$ into which $[0, T]$ is subdivided, with $n^{\mathrm{exe}}_k$ shares being executed at price $p^{\mathrm{exe}}_k$ in each of these sub-intervals, where $k = 1, \dots, K$. This leads to the constraint
$$ \sum_{k=1}^{K} n^{\mathrm{exe}}_k = N, \tag{2.1} $$
with $K = T/\Delta t$. We will then refer to the path $\{n^{\mathrm{exe}}_k\}$, with $k = 1, \dots, K$, as the trading strategy, execution strategy, trading algorithm, or execution algorithm (we will write $n^{\mathrm{exe}}_k = n_k$ and $p^{\mathrm{exe}}_k = p_k$ interchangeably throughout the text). It is important to notice that a trading strategy $\{n^{\mathrm{exe}}_k\}$ can itself be employed as a benchmark (in which case it will be denoted as $\{n^{\mathrm{bmk}}_k\}$). This can be done from two different perspectives:
i) Pre-trade: $\{n^{\mathrm{bmk}}_k\}$ and $\{p^{\mathrm{bmk}}_k\}$ depend on values/prices prior to trade execution.
ii) Post-trade: $\{n^{\mathrm{bmk}}_k\}$ and $\{p^{\mathrm{bmk}}_k\}$ depend on values/prices subsequent to trade execution.

With the discretisation of the time horizon into $K$ subintervals, we can write
$$ p^{\mathrm{exe}} \equiv \frac{1}{N} \sum_{k=1}^{K} n^{\mathrm{exe}}_k\, p^{\mathrm{exe}}_k, \qquad p^{\mathrm{bmk}} \equiv \frac{1}{N} \sum_{k=1}^{K} n^{\mathrm{bmk}}_k\, p^{\mathrm{bmk}}_k. \tag{2.2} $$
Here
$N$ ≡ number of shares to buy/sell within time $T$.
$n^{\mathrm{exe}}_k$ ≡ number of executed shares in time bin $k$.
$p^{\mathrm{exe}}_k$ ≡ average execution price in time bin $k$.
$n^{\mathrm{bmk}}_k$ ≡ number of shares in time bin $k$ executed by the algorithm/benchmark.
$p^{\mathrm{bmk}}_k$ ≡ average algorithm/benchmark price in time bin $k$.

Combining equations (1.1) and (2.2), we can write transaction costs as
$$ c \equiv \frac{\xi}{N} \sum_{k=1}^{K} \left( n^{\mathrm{exe}}_k\, p^{\mathrm{exe}}_k - n^{\mathrm{bmk}}_k\, p^{\mathrm{bmk}}_k \right), \tag{2.4} $$
where $\xi = -1$ for a buy order and $\xi = 1$ for a sell order. Transaction costs are therefore characterised by $\{n^{\mathrm{bmk}}_k\}$ and $\{p^{\mathrm{bmk}}_k\}$. In Appendix A we specify the functional form of these variables for the most standard benchmarks and trading algorithms. In this Thesis we focus in particular on the implementation shortfall (IS) definition of transaction costs, i.e. henceforth we will consider
$$ c \equiv \xi \left[ \frac{1}{N} \sum_{k=1}^{K} n^{\mathrm{exe}}_k\, p^{\mathrm{exe}}_k - p_0 \right], \tag{2.5} $$
with $p_0$ the stock price at the order start.

Footnote: It is worth noticing that, in reality, different PMs might send orders referring to the same stock to OG. In this case, these orders are considered as ‘child orders’, and blocked (typically by the Trader or by OG) into a single ‘parent order’. TCA for Traders and Brokers then ought to be done with respect to the parent order (since after blocking, the information about the individual child orders is lost). It is also important to notice that after the order execution information is retrieved from the Broker, an ‘Order Allocation’ team (or Traders themselves) allocate the position and PnL to the different portfolio accounts involved in the trade.

Footnote: Note that if we wish to express TC in bips, this definition should be written as $c^{(\mathrm{bps})} = 10^4\, \frac{p^{\mathrm{exe}} - p^{\mathrm{bmk}}}{p^{\mathrm{bmk}}}$.

Footnote: It is worth noticing that this definition can be embedded within a more generic one. In the continuum time limit, we can write transaction costs per share as
$$ c \equiv \int_0^T \Lambda(t)\,dt, \tag{2.3} $$
where $\Lambda(t) \equiv \sum_{k=1}^{K} \kappa(t)\,\delta(t - k\Delta t)$ is a cost function whose functional form depends on our precise definition of transaction costs; in our case $\kappa(t) \equiv \frac{\xi}{N} \left( n^{\mathrm{exe}}(t)\, p^{\mathrm{exe}}(t) - n^{\mathrm{bmk}}(t)\, p^{\mathrm{bmk}}(t) \right)$ (linear function of the number of shares and price).

Footnote: If within a trading bin $k$ we have $F$ fills/placements, the average execution and benchmark prices in that bin are calculated, respectively, as $p^{\mathrm{exe}}_k = \frac{1}{F} \sum_{j=1}^{F} n^{\mathrm{exe}}_j\, p^{\mathrm{exe}}_j$, $p^{\mathrm{bmk}}_k = \frac{1}{F} \sum_{j=1}^{F} n^{\mathrm{bmk}}_j\, p^{\mathrm{bmk}}_j$.

Figure 2.2: Pictorial representation of market impact. The black line represents the stock price in the presence of our trading (“Actual”), while the grey dotted line represents the stock price in the absence of our trading (“Paper”). Since either we do execute the order or we do not, we cannot observe both components, and thus market impact is not a directly observable quantity (its magnitude can only be inferred via a calibrated impact model). $p_0$ denotes the price at the order start (in this case when the market receives information of our interest in the security). The remaining labelled prices mark the market prices at the fill times and the market price at the time from which the security’s price in the absence and in the presence of our trade differ by a fixed amount (‘permanent market impact’). Before this time, these two prices differ by an amount that may vary with time (‘temporary market impact’). Figure taken from [Joh10].

We next define some of the main terms used in TCA. To this end, it is useful to introduce a simple price model. Consider that when trading $n_\tau$ shares in the time interval $[0, \tau]$ the stock price moves as
$$ \frac{p_\tau - p_0}{p_0} = I(n_\tau) + \zeta_\tau. \tag{2.6} $$
Here $I(n_\tau)$ is a (deterministic) function of the number of executed shares, while $\zeta_\tau$ is a stochastic component with zero mean, $\mathbb{E}[\zeta_\tau] = 0$, capturing the market volatility from $t = 0$ to $t = \tau$. It is customary to use a model in which $\zeta_\tau = \sigma \sqrt{\tau}\, \chi_\tau$, with $\chi_\tau$ having zero mean and standard deviation 1, but it is not necessary to specify the form of $\zeta_\tau$ at this stage. Taking the expectation at $t = 0$ of equation (2.6) we obtain
$$ \frac{\mathbb{E}_0[p_\tau] - p_0}{p_0} = I(n_\tau). \tag{2.7} $$

Footnote: A small variation of this simple model is to consider $p_\tau - p_0 = I(n_\tau) + \zeta_\tau$, i.e. a price model for arithmetic returns.

Footnote: Notice that there is no assumption of preferred market directionality, i.e.
no drift or momentum other than the one induced by our own trading.

The function $I(n_\tau)$ is referred to as ‘market impact’ or ‘(market) impact function’. Here we have specified the average at $t = 0$, the time at which we predict a particular market impact based on the size of our order and the functional form of $I(n_\tau)$. It is important to realise that market impact can be estimated (what we believe it was) or predicted (what we believe it will be), but market impact can never be measured. As illustrated in figure 2.2, market impact is the difference between the stock price in the presence (“Actual”) and in the absence (“Paper”) of our execution. Since in a particular market realisation we either execute or do not execute, we can only observe one, not both, of these components. This is why impact models (explained in detail in section 2.3) are one of the most, if not the most, relevant elements of TCA: they allow us to estimate (or predict) the difference between the “Actual” and “Paper” stock price.

Given the simple model and the equations above, we can now give definitions of some of the most recurring concepts in TCA:
• Transaction cost: $c$. This is the (signed) difference between average execution price and benchmark price (cf. equations (1.1), (2.4)). The sign included in this definition implies that we take positive costs as out-performance with respect to the benchmark, and negative costs as under-performance with respect to the benchmark. It is important to notice that, while in post-trade analysis transaction cost is a deterministic variable, from a pre-trade perspective it is a stochastic variable. This is why we explicitly define ‘expected transaction cost’ next.
• Expected transaction cost: $\mathbb{E}[c]$. This is the expectation of transaction cost. From a post-trade perspective, both expected transaction cost and transaction cost are interchangeable concepts.
From a pre-trade perspective, however, transaction cost is a stochastic variable, and expected transaction cost refers to its expectation value.

Footnote: In the literature, and in the industry, different terms are used to describe the same (or similar) quantities. For example, ‘execution cost’ or ‘trading cost’ are synonyms of ‘transaction cost’. ‘Expected impact cost’ or ‘impact cost’ describe ‘expected transaction cost’, while ‘timing risk’ is also referred to as ‘timing cost’, ‘shortfall risk’, ‘volatility risk’, or ‘opportunity cost’. Furthermore, ‘implementation shortfall’ and ‘slippage’ are mentioned as similar concepts to ‘market impact’ (which some authors define, not as an expectation value, but as a stochastic variable, including a noise term). We should notice that some of these definitions differ from those given by Perold [Per98], and given in this text. One important example is that of ‘opportunity cost’, defined as the appreciation (in currency value) of the shares specified by the order but not executed by the end of the trading horizon. If $\tilde{N}$ is the number of shares specified by the order and $N$ is the number of shares actually executed by the end of the time horizon, the opportunity cost is then given by
$$ \xi\, \frac{\tilde{N} - N}{N}\, \Delta p_T, \tag{2.8} $$
where $\Delta p_T \equiv p_T - p_0$ is the security’s market price difference over the time interval $[0, T]$ (which needs to be estimated/predicted via an impact model), and $\xi = -1$ for a buy order and $\xi = 1$ for a sell order. Notice that in the text we take $N = \tilde{N}$, i.e. we do not discuss the case in which the order is only partly executed by the end of the trading horizon, and thus –given this definition– opportunity cost is zero. The case in which $N < \tilde{N}$ (non-zero opportunity cost) is left as a future prospect of this work.

• Market impact: $\frac{\mathbb{E}[p_t] - p_0}{p_0}$. Calculated at a specific time $t$, this is the expected transaction cost when the benchmark is taken to be the security’s price at the order-start time, $p_0$.
The expectation value is also taken at the order start time (i.e. $\mathbb{E}[c] \equiv \mathbb{E}_0[c]$). Notice that market impact can be defined in bips (as done here) or, equivalently, in currency value, in which case market impact $\equiv \mathbb{E}[p_t] - p_0$.
• Timing risk: $\frac{\sigma[c]}{p_0}$. This corresponds to the standard deviation of transaction cost, i.e. the square root of the variance $\mathbb{V}[c]$. Notice that timing risk can be defined in bips (as done here) or, equivalently, in currency value, in which case timing risk $\equiv \sigma[c]$.

An impact model (IM) postulates a certain relationship between the number of shares executed over a time interval and the change in the security’s price. In terms of equation (2.6), the impact model corresponds to the functional form assumed for $I(n_\tau)$. In the following, we first explain the IM proposed by Almgren and Chriss [AC01], then introduce our own impact model, and finally comment briefly on other models in the literature.

Almgren and Chriss split market impact (MI) into the sum of two components:
i) Permanent market impact (PMI):
This is the difference between the stock price in thepresence and in the absence of execution when this value becomes stationary, i.e. whenit does not change with time. In terms of the model, PMI has a functional form non-dependent on time, and whose effect on the stock price is delayed by one time sub-interval.ii)
Temporary market impact (TMI):
This is the difference between the stock price in the presence and in the absence of execution before this value becomes stationary, i.e. while it still changes with time. In terms of the model, TMI has a functional form dependent on time, and its effect on the stock price is not delayed in time.

More specifically, if the trading horizon $[0, T]$ is subdivided into $K$ sub-intervals of length $\Delta t = T/K$ each, AC assume that, when we execute $n_k$ shares in the sub-interval $k$, the stock price changes as

$$p_k = p_0 + \sum_{j=1}^{k-1} \Big[ \sigma \sqrt{\Delta t}\, \chi_j - \underbrace{\Delta t \, g\Big(\frac{n_j}{\Delta t}\Big)}_{\text{PMI}} \Big] - \underbrace{h\Big(\frac{n_k}{\Delta t}\Big)}_{\text{TMI}} \tag{2.9}$$

where $\chi_j \sim N(0, 1)$ (normally-distributed variable with mean 0 and standard deviation 1), $p_0$ is the stock price at order start, and $g(x)$, $h(x)$ are the permanent and temporary impact functions, respectively, which are assumed to be of the form

$$g(x) = \xi \gamma x \tag{2.10}$$

$$h(x) = \xi (\varepsilon + \eta x) \tag{2.11}$$

with $\gamma$, $\varepsilon$ and $\eta$ model parameters to be calibrated on historical data. As we will see, this linearity of both PMI and TMI in the number of executed shares is an essential assumption that defines the characteristics of the optimal execution given by AC. Actually, the linear relationship between the number of executed shares and the stock price in the PMI component is rather generic: Huberman and Stanzl proved that this is the only functional form for PMI compatible with non-arbitrage conditions [HS04]. In these equations, $\varepsilon$ can be interpreted as a bid-ask spread cost. Assuming that we execute at the price (2.9), and inserting this expression into equation (2.5), we obtain

$$
\begin{aligned}
c &= \xi \Big[ \frac{1}{N} \sum_{k=1}^{K} n_k \Big\{ p_0 + \sum_{j=1}^{k-1} \Big[ \sigma \sqrt{\Delta t}\, \chi_j - \Delta t \, g\Big(\frac{n_j}{\Delta t}\Big) \Big] \Big\} - \frac{1}{N} \sum_{k=1}^{K} n_k \, h\Big(\frac{n_k}{\Delta t}\Big) - p_0 \Big] \\
&= \frac{\xi}{N} \Big[ \sum_{k=1}^{K} \sum_{j=0}^{k-1} n_k \Big[ \sigma \sqrt{\Delta t}\, \chi_j - \Delta t \, g\Big(\frac{n_j}{\Delta t}\Big) \Big] - \sum_{k=1}^{K} n_k \, h\Big(\frac{n_k}{\Delta t}\Big) \Big] \\
&= \frac{\xi}{N} \Big[ \sum_{j=0}^{K-1} \sum_{k=j+1}^{K} n_k \Big[ \sigma \sqrt{\Delta t}\, \chi_j - \Delta t \, g\Big(\frac{n_j}{\Delta t}\Big) \Big] - \sum_{k=1}^{K} n_k \, h\Big(\frac{n_k}{\Delta t}\Big) \Big],
\end{aligned} \tag{2.12}
$$

where we have used $\sum_{k=1}^{K} n_k = N$, and $n_0 = 0$, $\chi_0 \equiv 0$. Defining

$$x_j \equiv \sum_{k=j+1}^{K} n_k = N - \sum_{k=1}^{j} n_k , \tag{2.13}$$

the number of shares that remain to be executed after sub-interval $j$, and taking into account that $x_K = 0$, we finally get

$$c_{AC} = \frac{\xi}{N} \Big[ \sum_{k=1}^{K} \Big\{ \sigma \sqrt{\Delta t}\, \chi_k - \Delta t \, g\Big(\frac{n_k}{\Delta t}\Big) \Big\} x_k - \sum_{k=1}^{K} n_k \, h\Big(\frac{n_k}{\Delta t}\Big) \Big] \tag{2.14}$$

This is the expression for TC that will be used when we calculate the optimal trading strategy given by Almgren-Chriss.

Footnote: While the original work of AC assumes a linear dependence of the stock price on the rate of execution, $n_\tau/\tau$, in a subsequent work [Alm03] Almgren extends this theory to sub-linear models of TMI. This model, and similar ones, are currently widely used in the industry [ATHL05].

Footnote: For a comprehensive discussion of the functional forms of market impact (under a propagator framework) compatible with a non-arbitrage assumption, see [Gat09].

We now consider an alternative impact model. Extending the price dynamics given in equation (2.6) over multiple time intervals, we have the “law of motion” for the price:

$$p_k = p_{k-1} \big( 1 + I(n_k) + \zeta_k \big), \tag{2.15}$$

where $I(n_k)$ is an impact function and $\zeta_k$ a noise term. Assuming that we execute at this price, equation (2.5) gives in this case

$$c = \xi \, p_0 \Big[ \frac{1}{N} \sum_{k=1}^{K} \Big\{ n^{\mathrm{exe}}_k \prod_{j=1}^{k} \big( 1 + I(n_j) + \zeta_j \big) \Big\} - 1 \Big] \tag{2.16}$$

In order to provide a comparison with the model developed by Almgren and Chriss, we will use this expression of TC, assuming various functional forms for $I(x)$, in Chapter 3.

Other distinct models exist in the vast body of literature on market impact. To cite a few, the pioneering work of Kyle [Kyl85] postulated a linear function for MI, while later works have shown that the “experimental” shape of MI is actually concave. For example, J.-P. Bouchaud et al. postulate a square-root behaviour [TLD+…] […] for small executed volumes $n^{\mathrm{exe}}$, the stock price is linear with $n^{\mathrm{exe}}$, while, for large executed volumes, the stock price is proportional to $\sqrt{n^{\mathrm{exe}}}$ [BBLB19]. Other recent studies on MI point out that the relevant variable for market impact is the ‘execution horizon’, i.e. the time that we take to fully execute the order, rather than the executed volume or the rate of execution [CC19]. As a future prospect of our work, it will be interesting to apply these models and ideas to the utility function proposed in Chapter 3.

Footnote: It is instructive to remark that, had we considered a MI “local in time”, i.e. a price model $p_k = p_0 + \sum_{j=1}^{k} \big[ \sigma \sqrt{\Delta t}\, \chi_j - \Delta t \, g\big(\frac{n_j}{\Delta t}\big) \big] - h\big(\frac{n_k}{\Delta t}\big)$, we would have obtained the IS transaction cost $c = \frac{\xi}{N} \big[ \sum_{k=1}^{K} \big\{ \sigma \sqrt{\Delta t}\, \chi_k - \Delta t \, g\big(\frac{n_k}{\Delta t}\big) \big\} x_{k-1} - \sum_{k=1}^{K} n_k \, h\big(\frac{n_k}{\Delta t}\big) \big]$, which fundamentally affects some of the conclusions derived from the model. The same would occur if we considered a sub-linear TMI model, e.g. a relationship of the type $h(x) = \xi (\varepsilon + \eta x^{\alpha})$, with $\alpha < 1$.

As explained above, our goal is to determine the execution strategy $\{n^{\mathrm{exe}}_k\}$ that minimises transaction costs, caused by price moves in the market, and determined by equation (2.5). We should notice, however, that from a pre-trade perspective the price is a stochastic variable, so in this case $c$ is a stochastic variable as well. Then, when we talk about “minimising transaction costs”, what do we mean? One option would be to minimise the expectation value of (2.5), subject to the constraint (2.1). Bertsimas and Lo [BL98] take this approach from a dynamic optimisation perspective, i.e. they solve a Bellman equation recursively, ensuring that the utility function $U[c] = -E[c]$ is minimised at the beginning of every time interval into which the trading horizon is subdivided. With an IM similar to (2.15) (but with arithmetic, not geometric, returns) and with an impact function of the form $I(n_k) = \gamma n_k$, they obtain that the optimal trading path corresponding to this utility function is a TWAP strategy (cf. Appendix A). Almgren and Chriss [AC01] take, however, a different approach. They realise that by introducing a variance term in the utility function, one may capture the interplay between market impact, “the cost of trading immediately” (represented by $E[c]$), and timing risk, “the cost of not trading immediately” (represented by $V[c]$).
The AC utility function is

$$U_{AC}[c] = -(1 - \lambda) E[c] + \lambda V[c] , \tag{2.17}$$

which they minimise with $c$ given by (2.14), subject to the constraint (2.1), and with the IM defined by equations (2.9), (2.10), (2.11). In contrast to Bertsimas and Lo, AC solve a static optimisation problem, minimising $U_{AC}[c]$ at $t = 0$ (order start). Therefore, the expectation and variance in equation (2.17) are to be understood at $t = 0$. In this equation, $\lambda$ is the ‘risk-aversion’ parameter, whose magnitude is determined by the “aggressiveness” of the order ($\lambda \to 0 \Rightarrow$ non-urgent order, $\lambda \to 1 \Rightarrow$ very-urgent order).

Footnote: Including commission costs can be done by adding a (calibrated/measured) constant to the TC discussed here. Spread costs can be included in the model in various ways. The simplest method is to consider them as a calibrated constant (i.e. as commissions) in the market impact function. More elaborate models of spread costs consider a relationship between spread and number of executed shares, with the estimated/predicted spread being added to the estimated/predicted execution price.

Footnote: Using more complex impact models, Bertsimas and Lo obtain trading strategies with a richer structure.

Footnote: AC write $U_{AC}[c] = -E[c] + \tilde{\lambda} V[c]$, which is equivalent to (2.17) under the definition $\tilde{\lambda} \equiv \lambda / (1 - \lambda)$.

The utility function (2.17) can therefore be interpreted as follows: if we trade aggressively, our sudden demand/supply may move the security's price substantially, thus creating a market impact, which is associated with the term $E[c]$. In contrast, if we trade passively, price volatility may cause an increase in $V[c]$, or timing risk. In this way, AC capture mathematically the “trader's dilemma” described in Chapter 1. As a further remark, one may wonder what $V[c]$ has to do with the cost associated with “not trading immediately” (or timing risk). Taking the standard deviation of the AC transaction cost up to time $t$, we get a timing risk proportional to $\sqrt{t}$, which means that when penalising the variance term in the utility function, we are essentially penalising the lapse of time without completing the order. From this point of view, there is an optimal rate of execution that achieves a compromise between market impact and timing risk. This gives us the ‘optimal trading strategy’, which, as we vary the risk-aversion parameter $\lambda$, defines an ‘efficient frontier’ (set of optimal solutions) in the two-dimensional market-impact/timing-risk space. We can then summarise the AC problem of TCA as follows:

Given a trading order, defined by side, size, and time horizon, and given an impact model of the form described by equations (2.9), (2.10), (2.11), find a trading strategy $\{n^{\mathrm{exe}}_k\}$, fulfilling the constraint (2.1), that minimises the utility function (2.17), for a given value of the risk-aversion parameter $\lambda$.

This model, and variations of it, have become paramount in the financial industry. The optimal strategy can be used, from a post-trade perspective, as a benchmark for Best Execution reporting and for Trading Performance analysis. From a pre-trade perspective, the optimal strategy is widely used as a trading algorithm. The model by AC has been refined by e.g. considering sub-linear impact models. For example, Almgren et al. [ATHL05] construct a model with both PMI and TMI given by a power law, with respective calibrated exponents $\alpha \approx 0.9$ and $\beta \approx 0.6$, something that is in approximate agreement with the value $\alpha = 1$ derived from the non-arbitrage arguments of Huberman and Stanzl [HS04], and with the value $\beta = 0.5$ […].

Chapter 3
Execution Costs Beyond Mean-Variance

“All models are wrong, but some are useful.”
G. E. P. Box
Given an impact model and a law of motion for the execution price, transaction costs, defined by equation (2.5), follow a specific probability distribution. The optimal execution strategy derived from the model of Almgren and Chriss balances the mean and the variance of this distribution. However, a natural question arises: how would the optimal execution strategy change if we took into account further moments of the distribution, or rather, the full probability distribution? This question is particularly important given that in most financial markets returns are fundamentally non-Gaussian, or more specifically, they feature fat-tail characteristics [BP09]. In particular, as we know from portfolio theory, we can think about risk “beyond the variance of the distribution”, for example by considering the ‘Value-at-Risk’ (VaR) associated with the probability distribution [Jor06, Ale09]. Although there is controversy and criticism on whether VaR is an appropriate measure of overall risk (and whether it can be appropriately estimated/predicted) [Tal09], it captures features that the variance of the distribution does not capture (and vice versa!), being particularly suited to characterise extreme events. It thus becomes natural to develop a framework in which the optimal trading strategy is based on minimising both market impact and (extreme) losses. In this work we will develop such a formalism.
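The remark that the variance does not characterise extreme events can be illustrated with a small numerical sketch. The Python snippet below is an illustration only, not one of the simulations of this work: it draws samples from a normal distribution and from a t-Student distribution with 5 degrees of freedom rescaled to unit variance, and compares their empirical 1% quantiles, a simple VaR-like tail measure. The function names, the sample size, the seed and the 1% level are assumptions chosen for the example.

```python
import math
import random


def student_t_unit_variance(rng, dof=5):
    """Draw one t-Student variate (dof > 2) rescaled to have variance 1."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(dof / 2.0, 2.0)  # chi-square with `dof` degrees of freedom
    t = z / math.sqrt(chi2 / dof)
    return t * math.sqrt((dof - 2.0) / dof)  # Var[t] = dof / (dof - 2)


def empirical_quantile(samples, level):
    """Value below which (approximately) a fraction `level` of the samples lies."""
    ordered = sorted(samples)
    index = max(0, int(level * len(ordered)) - 1)
    return ordered[index]


def sample_variance(samples):
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)


rng = random.Random(2019)  # fixed seed, for reproducibility only
n_samples = 100_000
normal = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
fat_tailed = [student_t_unit_variance(rng) for _ in range(n_samples)]

var_normal, var_t = sample_variance(normal), sample_variance(fat_tailed)
q_normal = empirical_quantile(normal, 0.01)
q_t = empirical_quantile(fat_tailed, 0.01)
print(f"variance:    normal = {var_normal:.2f}, t-Student = {var_t:.2f}")
print(f"1% quantile: normal = {q_normal:.2f}, t-Student = {q_t:.2f}")
```

Both samples have (nearly) the same variance, yet the 1% quantile of the fat-tailed sample lies further out in the loss tail, which is the kind of information a variance-based risk term cannot distinguish.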
Consider the following four scenarios:

1) Price dynamics given by (2.9), with impact model determined by (2.10)-(2.11) and $\chi \sim N(0, 1)$ (variable following a normal distribution with mean 0 and standard deviation 1).

2) Price dynamics given by (2.9), with impact model determined by (2.10)-(2.11) and $\chi \sim t(0, 1)$ (variable following a t-Student distribution with 5 degrees of freedom, mean 0 and standard deviation $\sqrt{5/3}$).

3) Price dynamics given by (2.15), with impact model $I(n_k) = -\xi \gamma n_k e^{-\rho k \Delta t}$, and $\zeta_k = \sigma \sqrt{\Delta t}\, \chi_k$, with $\chi \sim N(0, 1)$ (variable following a normal distribution with mean 0 and standard deviation 1).

4) Price dynamics given by (2.15), with impact model $I(n_k) = -\xi \gamma n_k e^{-\rho k \Delta t}$, and $\zeta_k = \sigma \sqrt{\Delta t}\, \chi_k$, with $\chi \sim t(0, 1)$ (variable following a t-Student distribution with 5 degrees of freedom, mean 0 and standard deviation $\sqrt{5/3}$).

Here $\xi = -1$ for a buy order and $\xi = 1$ for a sell order, $\gamma$ and $\rho$ are constant parameters, and $k$, $\Delta t$, and $\sigma$ are as defined in the previous chapter (cf. the caption of figure 3.1 for the parameter values used here). In all these cases, we perform a hypothesis test with the null hypothesis that $c$ follows a normal distribution when $\chi \sim N(0, 1)$ and a t-Student distribution when $\chi \sim t(0, 1)$.

Footnote: The t-Student distribution, as we define it, has the following probability density function

$$f(x) = \frac{\Gamma\left(\frac{\nu + 1}{2}\right)}{\sigma \sqrt{\nu \pi}\, \Gamma\left(\frac{\nu}{2}\right)} \left[ \frac{\nu + \left(\frac{x - \mu}{\sigma}\right)^2}{\nu} \right]^{-\frac{\nu + 1}{2}} \tag{3.1}$$

where $\mu \equiv$ mean, $\sigma \sqrt{\nu / (\nu - 2)} \equiv$ standard deviation, $\nu \equiv$ number of degrees of freedom, and $\Gamma(x) \equiv$ Gamma function of $x$.

Footnote: Here we have used a Kolmogorov-Smirnov test. For the normality tests we have verified the hypothesis with Anderson-Darling, Jarque-Bera, and Lilliefors hypothesis tests.

[…] $\sigma$, the mean and variance of TC in the AC framework are
$$E[c_{AC}] = -\frac{\xi}{N} \sum_{k=1}^{K} \Big\{ x_k \, \Delta t \, g\Big(\frac{n_k}{\Delta t}\Big) + n_k \, h\Big(\frac{n_k}{\Delta t}\Big) \Big\} \tag{3.2}$$

$$V[c_{AC}] = \frac{\sigma^2 \Delta t}{N^2} \sum_{k=1}^{K} x_k^2 \tag{3.3}$$

In contrast, when considering the price evolution given by (2.15), TC follow a fundamentally different distribution to financial returns. These conclusions can also be derived through direct observation of equations (2.14) and (2.16): the first involves the sum of a set of variables $\chi$, while the second involves the sum of products of these variables. Especially interesting is the fact that the price dynamics (2.15), together with the impact model $I(n_k) = -\xi \gamma n_k e^{-\rho k \Delta t}$, seem to introduce a (positive) skewness in the distribution of TC, i.e. this model is able to describe asymmetric TC distributions, or in other words, it is flexible enough to reproduce the non-zero skewness regularly observed in practice.

Figure 3.1: Probability density functions of transaction costs, for the four scenarios described in the main text. Each histogram is generated from 10000 price paths. (a) Histogram and Gaussian fit for scenario 1). Here the TC distribution is well described by a normal distribution (sample mean = −0.71; fitted mean and standard deviation determined to ±0.01). (b) Histogram and t-Student fit for scenario 2). Here the TC distribution is well described by a t-Student distribution (sample mean = −0.71, standard deviation = 0.57; fitted mean and standard deviation determined to ±0.01). (c) Histogram and Gaussian fit for scenario 3). Here the TC distribution is not well described by a normal distribution (sample mean = −0.40, standard deviation = 1.1, skewness = 0.88, kurtosis = 8.2; fit parameters determined to ±0.02). (d) Histogram and t-Student fit for scenario 4). Here the TC distribution is not well described by a t-Student distribution (sample mean = −0.39, kurtosis = 22; fitted mean and standard deviation determined to ±0.02). Errors correspond to the 95% confidence intervals. Parameter values are $\gamma = 1$, $\eta = 1$, $\varepsilon = 1$, $\sigma = 1$, $\Delta t = 1$, $p_0 = 1$, $\xi = 1$, $\rho = 1/2$. The strategy used is the optimal trading strategy derived from the AC framework for $K = 13$, $\lambda = 0.…$

As we have seen, the TC distribution depends strongly on the characteristics of market returns and market impact. Different markets have distinct properties: some feature distributions of returns with very fat tails, and in some the distributions are very asymmetric. Moreover, how our trading moves the security's price may be very different, depending on factors such as market liquidity. As such, it is important not only to examine various impact models, but also to investigate different utility functions. In particular, the optimal strategy obtained through the AC utility function (UF) depends only on the mean and the variance of the TC distribution.
As a result, markets with similar TC mean and variance, but with large asymmetries, or with frequent rare events, might give rise to very similar optimal execution strategies under the AC framework, although in reality it might be advantageous (from the point of view of PnL) to execute differently in these markets. Consequently, it becomes natural to think about “transaction costs beyond mean-variance”, i.e. to develop a formalism that accounts for properties of the TC distribution beyond its mean and its variance. One option would be to include subsequent moments (or rather cumulants) of the TC distribution in the UF. However, such a “cumulant expansion” is only an approximation (albeit a better one than AC) to the full distribution, characterised by the probability density function of TC, $f[c]$. In principle, we could consider this object, and obtain the execution strategy that “optimises its shape”. Following this rationale, we introduce the following utility function:

$$U[c] = -(1 - \lambda) E[c] + \lambda \int_{-\infty}^{\tilde{c}} f[c] \, dc \tag{3.4}$$

Here we have considered the integral of the TC distribution from $-\infty$ to $\tilde{c}$, a threshold below which we want to minimise TC. Optimising a UF that includes this integral will, depending on the value of $\tilde{c}$, penalise more or less extreme negative costs. This should be contrasted with the effect of the variance term in the AC utility function (2.17), which penalises the width of the distribution, i.e. strategies in which TC (positive or negative) are very dissimilar.

Footnote: Notice that for $\tilde{c} >$ […]

The first term (expectation of TC) in the UF (3.4) is still very relevant: it penalises the speed of execution. If we were to minimise the utility function $U[c] = \int_{-\infty}^{\tilde{c}} f[c] \, dc$ (equivalent to $\lambda = 1$ in (3.4)), we would obtain the optimal solution $n_1 = N$, $n_k = 0 \;\forall k \geq 2$. In (3.4), the market impact term $\sim E[c]$ “competes” with the integral term $\sim \int_{-\infty}^{\tilde{c}} f[c] \, dc$, which captures the contribution of all moments (or cumulants) on the negative side of the TC distribution, and can thus be seen as an alternative timing risk measure. As we will see below, working with the UF (3.4) enables optimal strategies that execute the largest proportion of the shares in intervals later in time, a feature not possible under the AC formalism. It is also important to consider the effect of the additional parameter $\tilde{c}$: depending on its value (the magnitude of the tail of the TC distribution considered in the UF) we will see a crossover from optimal strategies trading more shares in earlier time intervals to optimal strategies trading more shares later in time. Finally, it is noteworthy that (3.4) is just an example of a wider class of utility functions that can be considered to take into account the effect of the higher moments (or the full TC distribution) on the optimal strategy. More concretely, two cases of interest are

$$U[c] = -(1 - \lambda) E[c] + \lambda \int_{-\infty}^{-|\tilde{c}|} f[c] \, dc + \lambda \int_{|\tilde{c}|}^{\infty} f[c] \, dc \tag{3.5}$$

$$U[c] = -(1 - \lambda) E[c] + \lambda \int_{-|\tilde{c}|}^{|\tilde{c}|} f[c] \, dc \tag{3.6}$$

In the first one, both positive and negative tails of the TC distribution are considered, while in the second, only the body of the TC distribution is taken into account. However, an in-depth investigation of these instances is beyond the scope of this work, and it is left for future research.
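To make the ingredients of (3.4) concrete, the following Python sketch estimates the utility function from simulated transaction costs under the AC dynamics: costs are drawn from (2.14), their sample mean gives the market impact term, and the fraction of samples below $\tilde{c}$ estimates the integral term. For comparison, it also evaluates the same utility in closed form, using (3.2)-(3.3) and the fact that, for Gaussian $\chi$, the cost (2.14) is itself Gaussian. The function names, the seed, the example strategy and the threshold $\tilde{c} = -2$ are illustrative assumptions, not the implementation used in this work.

```python
import math
import random

# Illustrative parameter values (assumptions made for this sketch).
GAMMA, ETA, EPS, SIGMA, DT, XI = 1.0, 1.0, 1.0, 1.0, 1.0, 1.0


def g(x):
    """Permanent impact function, eq. (2.10)."""
    return XI * GAMMA * x


def h(x):
    """Temporary impact function, eq. (2.11)."""
    return XI * (EPS + ETA * x)


def remaining_shares(n):
    """x_k = N - sum_{j<=k} n_j for k = 1..K, eq. (2.13)."""
    total, executed, x = sum(n), 0.0, []
    for n_k in n:
        executed += n_k
        x.append(total - executed)
    return x


def ac_mean_variance(n):
    """Closed-form mean and variance of the AC cost, eqs. (3.2)-(3.3)."""
    N, x = sum(n), remaining_shares(n)
    mean = -XI / N * sum(x_k * DT * g(n_k / DT) + n_k * h(n_k / DT)
                         for n_k, x_k in zip(n, x))
    var = SIGMA ** 2 * DT / N ** 2 * sum(x_k ** 2 for x_k in x)
    return mean, var


def simulate_ac_costs(n, n_paths, rng):
    """Draw transaction costs from eq. (2.14) with Gaussian chi_k."""
    N, x = sum(n), remaining_shares(n)
    temporary = sum(n_k * h(n_k / DT) for n_k in n)
    costs = []
    for _ in range(n_paths):
        drift = sum((SIGMA * math.sqrt(DT) * rng.gauss(0.0, 1.0)
                     - DT * g(n_k / DT)) * x_k for n_k, x_k in zip(n, x))
        costs.append(XI / N * (drift - temporary))
    return costs


def utility_tail(costs, lam, c_tilde):
    """Sample estimate of eq. (3.4): -(1 - lam) E[c] + lam P(c < c_tilde)."""
    mean = sum(costs) / len(costs)
    tail = sum(1 for c in costs if c < c_tilde) / len(costs)
    return -(1.0 - lam) * mean + lam * tail


def utility_tail_gaussian(n, lam, c_tilde):
    """Eq. (3.4) in closed form for a Gaussian cost (requires variance > 0)."""
    mean, var = ac_mean_variance(n)
    tail = 0.5 * (1.0 + math.erf((c_tilde - mean) / math.sqrt(2.0 * var)))
    return -(1.0 - lam) * mean + lam * tail


rng = random.Random(0)  # fixed seed, for reproducibility only
strategy = [0.5, 0.3, 0.2]  # n_k as fractions of the order size (N = 1)
costs = simulate_ac_costs(strategy, 10_000, rng)
u_mc = utility_tail(costs, lam=0.3, c_tilde=-2.0)
u_exact = utility_tail_gaussian(strategy, lam=0.3, c_tilde=-2.0)
print(f"simulated U = {u_mc:.3f}, closed-form U = {u_exact:.3f}")
```

The closed-form branch is only available because the AC cost is Gaussian; for the price dynamics (2.15) only the simulation-based estimate applies, which is why the sketch keeps the two estimators separate.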
Given the utility function (3.4), we can now proceed to numerically obtain the optimal trading strategy that is derived from it. To start with, we consider the price evolution and impact model considered by Almgren and Chriss (equations (2.9) and (2.10)-(2.11), with $\chi$ a Gaussian variable). In the probability density function $f[c]$ we have, in this case, the (stochastic) TC given by (2.14). In order to minimise the UF, we consider 10000 realisations of the vector of random variables $(\chi_1, \ldots, \chi_K)$, which, for a given trading strategy $\{n_k\}$ (with $k = 1, \ldots, K$), give rise to corresponding price vectors $(p_1, \ldots, p_K)$, and finally to 10000 TC, $c_{AC}$, distributed as shown in figure 3.1(a). Provided the random samples of $c_{AC}$ have sufficiently converged to the underlying probability distribution, as we consider different realisations of $\{n_k\}$, the UF (3.4) will reach a minimum for one of them, which will be the optimal trading strategy (or optimal execution strategy), $\{n^{\mathrm{opt}}_k\}$.

Footnote: Unless stated differently, this will be the case for all simulations presented throughout the text.

Footnote: This is an important point that depends on the sample size and on the form of the underlying distribution, and which we will discuss further below.

In order to minimise the utility function, one may be inclined to use an algorithm of the “gradient descent” type [Cau47]. These, however, depend strongly on the initial point from which we start the search for the global minimum, and might yield solutions which correspond to local minima. In particular, due to the stochastic nature of our UF, and depending on the level of convergence of the considered sample of TC to the underlying probability distribution, the (numerically-estimated) utility function is “noisy”, i.e. it might have a multitude of local minima in the multi-dimensional space spanned by the variables $\{n_1, \ldots, n_K\}$. As a matter of fact, in the results below we show how we have encountered this issue. In order to circumvent this difficulty, we can use a “Monte Carlo optimisation” method: a sample of quasi-random numbers $\{n_1, \ldots, n_K\}$, fulfilling the constraint $\sum_{k=1}^{K} n_k = N$, is generated inside the hypercube with vertices $\{v_1, \ldots, v_K\}$, with $v_j \in \{0, N\}$ ($j = 1, \ldots, K$). We compute the value of the UF at each of these points, and select among them the vector $\vec{n}$ giving rise to the lowest utility. Although, in our case, this procedure proves to be more effective than gradient-descent methods in finding the global minimum, due to the noisy nature of the UF, the precision to which we determine the optimal solution may be inadequate. This issue is illustrated in figure 3.2. However, we postulate that the UF should be a smooth (not noisy) function of the variables $\{n_1, \ldots, n_K\}$, and because our Monte Carlo simulations seem to indicate that the UF corresponding to the underlying TC distribution has only one local minimum, we fit the numerically-obtained values of the UF in the subspace spanned by the variables $\{n_1, \ldots, n_{K-1}\}$ to a $(K-1)$-dimensional quadratic polynomial. In figures 3.2 and 3.3 we see the fitted curve for the cases in which the trading horizon is subdivided into two and three time intervals, respectively.

Footnote: Quasi-random numbers present a higher degree of correlation than a standard random sample, i.e. they fill the corresponding hypercube more uniformly.

Once we have this curve,
we can use an optimisation method to find the global minimum: yet again a quasi-random minimisation or a gradient-descent procedure, both of which work well for this smooth utility function. As we show in tables 3.1 and 3.2, the best accuracy is obtained with the latter method, and we will take this solution as the optimal execution strategy.

Figure 3.2: Utility function in the case in which the trading horizon is subdivided into 2 time intervals ($K = 2$). (a) Case $\tilde{c} = -1$. Utility function (2.17) derived from the Almgren-Chriss formalism ($U_{AC}$). Here we have plotted both the “analytical solution” (obtained via equations (3.2)-(3.3)) and the “numerical solution” (obtained via numerical simulation of the TC distribution), labelled $U_{AC}$. Both curves almost overlap. The UF derived from equation (3.4), labelled $U_{DM}$, is also shown. In this scenario, both curves show a minimum (optimal trading strategy) with $n_1 > 0.5$. (b) Case $\tilde{c} = -0.5$. In this scenario, while $U_{AC}$ shows a minimum with $n_1 > 0.5$, this is not the case for $U_{DM}$, which shows a minimum with $n_1 < 0.5$. The “analytical” and “numerical” solutions to the AC utility function almost overlap. In both plots, the dots correspond to the values of the utility function computed over a quasi-random sample. The solid lines correspond to a fit of these points to a quadratic polynomial. The Monte Carlo sample contains 100 quasi-random strategies in each case, and 10000 price paths have been used to estimate the UF at every point. The other parameter values are $\gamma = 1$, $\eta = 1$, $\varepsilon = 1$, $\sigma = 1$, $\Delta t = 1$, $\xi = 1$, $\lambda = 0.3$. The number of shares $n_k$ executed in time interval $k$ is expressed as a ratio to the order size, i.e. we take $N = 1$.

Figure 3.2 presents various important insights. First, the UF (3.4) does indeed present a local minimum. This means that our modelling of timing risk is meaningful in the context of the formulated optimisation problem, in the sense that it competes with market impact, and it is in the balance between the two that we find an optimal execution strategy. Second, by comparing figures 3.2(a) and 3.2(b), we see that, while the AC utility function always gives rise to an optimal trading strategy with $n_1 > 0.5$, the UF (3.4) gives rise to optimal strategies with either $n_1 > 0.5$ or $n_1 < 0.5$, depending on the value of $\tilde{c}$. Indeed, the parameter $\tilde{c}$ plays an essential role: as we increase the value of $\tilde{c}$, we “redefine” timing risk to include not only the negative tail of the TC distribution (rare losses) but also the body of the distribution. This redefinition of timing risk from “extreme loss” to “loss” effectively penalises patience in completing the order less, and importantly it admits solutions in which we trade a larger proportion of the order in later time intervals. Consequently, we find a crossover from $n_1 > 0.5$ to $n_1 < 0.5$ as $\tilde{c}$ is increased from $-1$ to $-0.5$.

          AC analytic            AC numeric             DM (c̃ = −1)            DM (c̃ = −0.5)
          n_opt         U_opt    n_opt         U_opt    n_opt         U_opt    n_opt         U_opt
GD        (0.…, 0.35)   0.58     (0.…, 0.44)   0.58     (0.…, 0.44)   0.60     (0.…, 0.29)   0.…
…         (0.…, 0.35)   0.58     (0.…, 0.39)   0.57     (0.…, 0.44)   0.61     (0.…, 0.59)   0.…
…         (0.…, 0.35)   0.58     (0.…, 0.35)   0.58     (0.…, 0.43)   0.61     (0.…, 0.56)   0.…
…         (0.…, 0.35)   0.58     (0.…, 0.35)   0.58     (0.…, 0.43)   0.61     (0.…, 0.56)   0.…

Table 3.1: Solutions, expressed in the form $(n_1, n_2)$, and value of the UF, for the optimal execution strategy obtained by various optimisation methods, in the case in which the trading horizon is subdivided into 2 time intervals ($K = 2$). GD refers to ‘Gradient-descent’ (the solution obtained by applying standard optimisation methods [Cau47] to the UF), MC refers to ‘Monte Carlo’ (the solution obtained by evaluating the UF at a quasi-random sample of strategies), and Q2 Fit refers to ‘Quadratic Fit’ (the solution obtained from the methods MC and GD applied to the UF approximated by a quadratic-polynomial fit to the quasi-random sample). AC denotes the solution obtained by using the AC utility function (2.17), which can be expressed analytically, or solved by numerical simulation of the TC distribution. DM denotes the solution obtained by using the UF (3.4). The Monte Carlo sample contains 100 quasi-random strategies in each case. We have investigated the cases corresponding to $\tilde{c} = -1$ and $\tilde{c} = -0.5$. The other parameter values are $\gamma = 1$, $\eta = 1$, $\varepsilon = 1$, $\sigma = 1$, $\Delta t = 1$, $\xi = 1$, $\lambda = 0.3$. The number of shares $n_k$ executed in time interval $k$ is expressed as a ratio to the order size, i.e. we take $N = 1$.

Finally, it is important to remark that the stochastic nature of the problem leads to a noisy simulated utility function (cf. dots in figure 3.2). In this respect, in order to get an accurate solution to the problem, it is important to simulate enough price paths so that the simulated TC distribution has sufficiently converged to the underlying probability distribution. In the case simulated here, judging from the values of the moments of the distribution as compared to the theoretical values, one can observe a large degree of convergence: the theoretical values of the first four moments of the distribution are, in this case, mean = 0.71, standard deviation = 0.44, skewness = 0, kurtosis = 3, which are in close agreement with the values obtained from our simulation (see the caption of figure 3.1). Although this convergence is certainly important, the introduced procedure of determining the optimal trading strategy via a curve fitted to the simulated UF makes the obtained solution robust against statistical fluctuations, i.e. a second-order polynomial fit to even noisier utility functions proves to deliver similar optimal execution strategies. This will allow us to tackle more complicated scenarios in which the mentioned convergence is not as accurate, such as the cases (b), (c) and (d) shown in figure 3.1. Furthermore, obtaining the optimal trading strategy from the fitted UF also allows us to reduce the density of simulated points: while obtaining the minimum from the individual evaluation of the utility function at each of the simulated points requires a high density of candidate strategies to accurately determine the local minimum, rather similar polynomial fits are obtained when the density of points is decreased, and so the obtained minimum is relatively robust in this sense as well.
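The fitting procedure described above can be sketched for $K = 2$, where the strategy is fully determined by $n_1$ (with $n_2 = N - n_1$). The Python snippet below evaluates a noisy, simulation-based estimate of the AC utility function (2.17) on a set of candidate strategies, fits a quadratic polynomial by least squares, and takes the vertex of the fitted parabola as the optimal strategy. It is a minimal sketch under stated assumptions: an evenly spaced grid stands in for the quasi-random sample, the vertex formula replaces the final gradient-descent step, and the function names, the seed and the reduced number of price paths are chosen for illustration. For the parameter values used ($\gamma = \eta = \varepsilon = \sigma = \Delta t = \xi = N = 1$, $\lambda = 0.3$), equations (3.2)-(3.3) give $U_{AC} = n_1^2 - 1.3\, n_1 + 1.7$, whose minimum lies at $n_1 = 0.65$, in line with the value $n_2 = 0.35$ of the AC analytic solution in table 3.1, so the fitted vertex can be checked against it.

```python
import math
import random

# Illustrative parameter values (xi = 1 and N = 1 are built into the formulas below).
GAMMA, ETA, EPS, SIGMA, DT, LAM = 1.0, 1.0, 1.0, 1.0, 1.0, 0.3


def simulated_ac_utility(n1, n_paths, rng):
    """Noisy estimate of U_AC, eq. (2.17), for K = 2 from simulated costs (2.14)."""
    n2 = 1.0 - n1
    x1 = n2  # shares remaining after the first interval; x2 = 0
    temporary = n1 * (EPS + ETA * n1 / DT) + n2 * (EPS + ETA * n2 / DT)
    costs = [(SIGMA * math.sqrt(DT) * rng.gauss(0.0, 1.0) - GAMMA * n1) * x1 - temporary
             for _ in range(n_paths)]
    mean = sum(costs) / n_paths
    var = sum((c - mean) ** 2 for c in costs) / (n_paths - 1)
    return -(1.0 - LAM) * mean + LAM * var


def fit_quadratic(xs, ys):
    """Least-squares fit y ~ a x^2 + b x + c via the 3x3 normal equations."""
    s = [sum(x ** p for x in xs) for p in range(5)]  # power sums of x
    t = [sum(y * x ** p for x, y in zip(xs, ys)) for p in range(3)]
    # Augmented matrix for the unknowns (c, b, a).
    m = [[s[0], s[1], s[2], t[0]],
         [s[1], s[2], s[3], t[1]],
         [s[2], s[3], s[4], t[2]]]
    for i in range(3):  # Gauss-Jordan elimination
        pivot = m[i][i]
        m[i] = [v / pivot for v in m[i]]
        for r in range(3):
            if r != i:
                factor = m[r][i]
                m[r] = [vr - factor * vi for vr, vi in zip(m[r], m[i])]
    c, b, a = m[0][3], m[1][3], m[2][3]
    return a, b, c


rng = random.Random(1)  # fixed seed, for reproducibility only
candidates = [i / 49 for i in range(50)]  # candidate values of n1 in [0, 1]
utilities = [simulated_ac_utility(n1, 2_000, rng) for n1 in candidates]

a, b, c = fit_quadratic(candidates, utilities)
n1_opt = -b / (2.0 * a)          # vertex of the fitted parabola
u_opt = c - b * b / (4.0 * a)    # fitted utility at the vertex
print(f"fitted optimum: n1 = {n1_opt:.3f}, n2 = {1.0 - n1_opt:.3f}, U = {u_opt:.3f}")
```

Because the fit pools information from all candidate strategies, the location of the vertex is much less sensitive to the noise of individual evaluations than the lowest simulated point would be, which is the robustness argument made in the text.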
Execution Costs Beyond Mean-Variance (a) (b) Figure 3.3:
Utility function in the case in which the trading horizon is subdivided into of3 time intervals ( K = 3), and with ˜ c = − .
5. (a) Utility function (2.17) derived from theAlmgren-Chriss formalism, and obtained via numerical simulation of the TC distribution.The result obtained via the analytical solution is visually almost indistinguishable. (b)Utility function derived from equation (3.4). In both plots, the dots correspond to thevalues of the UF computed over a quasi-random sample. The surfaces correspond to a fitof these points to a quadratic polynomial. Notice that, although shown here, the fittingcurve fulfilling n + n > N is actually not defined in the context of our problem: due tothe constraint (cid:80) Kk =1 n k = N , any pair of two variables must fulfil n + n ≤ N . The MonteCarlo sample contains ∼
200 quasi-random strategies in each case, and 10000 price pathshave been used to estimate the UF at every point. The other parameter values are γ = 1, η = 1, ε = 1, σ = 1, ∆ t = 1, ξ = 1, λ = 0 .
3. The number of shares n k , executed in timeinterval k is expressed as a ratio to the order size, i.e. we take N = 1. In figure 3.3 we show the dependency of the UF on the number of shares executed inthe first and second time interval, in the case in which the trading horizon is subdividedinto three time intervals ( K = 3). Similarly to the case K = 2 discussed above, both utilityfunctions, (2.17) and (3.4), have a local minimum. In figure 3.3(a) we show the simulation of200 strategies (red dots) following the AC price dynamics, and the corresponding quadraticfitting curve. In figure 3.2(b) we show the same for the UF (3.4). Visually, both fittingapproximations are rather accurate, and thus we yet again use them to determine the localminimum. Doing so we obtain the results presented in table 3.1. The best approximationto the optimal trading strategy is provided by the optimisation Q2 Fit + GD (quadratic fitfollowed by gradient-descent). As in the case for two trading intervals, we find a “reversal”in the optimal execution as we vary the parameter ˜ c , from trading the largest proportion ofthe shares in the earliest time intervals, to trading the largest proportion of the shares in thelatest time intervals. Finally, it is important to remark that when generating quasi-randompoints corresponding to different strategies, we must do so by respecting the constraint (cid:80) Kk =1 n k = N . Consequently, we have written n K = N − (cid:80) K − k =1 n k , and taken a sample ofhapter 3. Execution Costs Beyond Mean-Variance Opt. method (cid:126)n opt
AC analytic (cid:126)n opt
AC numeric (cid:126)n opt
DM (˜ c = − (cid:126)n opt DM (˜ c = − . GD (0 . , . , .
14) (0 . , . , .
04) (0 . , . , .
33) (0 . , . , . . , . , .
14) (0 . , . , .
14) (0 . , . , .
33) (0 . , . , . . , . , .
14) (0 . , . , .
14) (0 . , . , .
25) (0 . , . , . . , . , .
14) (0 . , . , .
14) (0 . , . , .
28) (0 . , . , . Table 3.2:
Solutions, expressed in the form $(n_1, n_2, n_3)$, for the optimal execution strategy obtained by various optimisation methods, in the case in which the trading horizon is subdivided into 3 time intervals (K = 3). GD refers to 'Gradient-descent' (the solution obtained by applying standard optimisation methods [Cau47] to the UF), MC refers to 'Monte Carlo' (the solution obtained by evaluating the UF at a quasi-random sample of strategies), and Q2 Fit refers to 'Quadratic Fit' (the solution obtained from the methods MC and GD applied to the UF approximated by a quadratic-polynomial fit to the quasi-random sample). AC denotes the solution obtained by using the AC utility function (2.17), which can be expressed analytically or solved by numerical simulation of the TC distribution. DM denotes the solution obtained by using the UF (3.4). The Monte Carlo sample contains ∼200 quasi-random strategies in each case. We have investigated the cases corresponding to $\tilde{c} = -1$ and $\tilde{c} = -0.5$. The other parameter values are γ = 1, η = 1, ε = 1, σ = 1, Δt = 1, ξ = 1, λ = 0.3. The number of shares $n_k$ executed in time interval k is expressed as a ratio to the order size, i.e. we take N = 1.

vectors $(n_1, \ldots, n_{K-1})$ such that

$$\sum_{k=1}^{K-1} n_k \leq N \qquad (3.7)$$

This is why in figure 3.3 we observe that all the randomly generated points lie within the triangular region corresponding to $n_1 + n_2 \leq N$. Importantly, this constraint has significant implications for the scaling of the problem towards higher dimensionality (the number of time intervals into which the trading horizon is subdivided). In order to generate q quasi-random points in the region fulfilling (3.7), we must first generate Q > q quasi-random points in the hypercube with vertices $\{v_1, \ldots, v_{K-1}\}$, with $v_j \in \{0, N\}$ $(j = 1, \ldots, K-1)$, and then post-select for the fitting the subsample that satisfies (3.7). This implies that, if we want to preserve an approximately constant number of points used for the fitting as the dimensionality of the problem is increased, the number Q of pre-generated random strategies needs to increase exponentially with K. We numerically observe that for K ≥ Q needs to be larger than 10 . On top of this, we expect that, as K increases, the goodness-of-fit of the polynomial will worsen if we keep q constant. Therefore, q should also scale with the dimensionality K. (How q needs to be increased with the dimensionality, if we would like the fit to remain accurate, can be investigated e.g. by monitoring the p-value of a goodness-of-fit test as the dimensionality of the problem is increased.) This leads to a limitation of our approach for increased granularity of the trading intervals, in practice when K ≳ 5. How to improve the scaling of this problem is left as a topic for future research.

We next examine the solution for the optimal trading strategy as we increase the number of trading intervals and vary the parameters of the utility function. As we have mentioned above, our method should be used with care in high dimensions (a large number of time intervals subdividing the trading horizon), verifying:

i) That the distribution of generated transaction costs has sufficiently converged to the underlying probability distribution, to avoid too much noise in the utility function.
ii) That the sample of generated strategies is sufficiently dense in the (K − 1)-dimensional space of strategies.

For K = 2 and K = 3, following the discussion above and the illustration of figures 3.2 and 3.3, it is easy to see that these conditions are fulfilled and, due to the high precision of the results in these cases, to presume that we will have a sufficient degree of accuracy in slightly higher dimensions. However, in full rigour the conditions above should be mathematically verified, in particular when K ≳ 5. Furthermore, it is important to notice that how well these conditions are fulfilled will depend not only on the utility function, but also on the distribution of financial returns, as well as on the impact model assumed in the price evolution. In particular, in figure 3.1 we see that considering t-Student returns affects only slightly the convergence of the TC sample to the underlying probability distribution: the theoretical values of the first four moments of the distribution associated to panel b) are mean = 0.71, standard deviation = 0.57, skewness = 0, kurtosis = 6 (c.f. the caption of figure 3.1 for the values obtained through the sample). Considering a different impact model, however, has a stronger effect on the convergence of the TC sample to the underlying probability distribution: generating various samples of the same size, we see that the obtained mean and standard deviation associated to panels c) and d) are somewhat robust, whereas the skewness and kurtosis of the sample are less so. This suggests that we should consider taking a larger sample size in the simulations of TC associated to the impact model used to derive these results. Having made this remark, one should notice that the degree of convergence of the TC sample to the underlying probability distribution will be reflected in the noisy behaviour of the generated utility function but, to our advantage, the fitting function is somewhat robust with respect to the noise and density of generated trading strategies.
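The post-selection of strategies under constraint (3.7), and its cost as K grows, can be illustrated with a short sketch. It is only an illustration under stated assumptions: it uses uniform pseudo-random points from the Python standard library rather than the quasi-random sequences used in the text, and the function names `sample_strategies` and `expected_acceptance` are hypothetical. For uniform points, the accepted fraction equals the volume ratio between the simplex and the hypercube, 1/(K − 1)!, so the number Q of pre-generated points needed for a fixed q grows very quickly with K.

```python
import math
import random


def sample_strategies(K, Q, N=1.0, seed=0):
    """Draw Q uniform points in the hypercube [0, N]^(K-1) and keep those
    satisfying sum_{k=1}^{K-1} n_k <= N, i.e. constraint (3.7).

    Each accepted point is completed with n_K = N - sum(n_1..n_{K-1}),
    so every returned strategy sums to N.
    NOTE: pseudo-random sampling is an assumption of this sketch; the text
    uses quasi-random points.
    """
    rng = random.Random(seed)
    kept = []
    for _ in range(Q):
        head = [rng.uniform(0.0, N) for _ in range(K - 1)]
        total = sum(head)
        if total <= N:
            kept.append(head + [N - total])
    return kept


def expected_acceptance(K):
    """Volume of the simplex relative to the hypercube: 1 / (K-1)!."""
    return 1.0 / math.factorial(K - 1)


if __name__ == "__main__":
    Q = 20000
    for K in (2, 3, 4, 5, 6):
        accepted = len(sample_strategies(K, Q))
        print(f"K={K}: accepted fraction {accepted / Q:.4f}, "
              f"expected {expected_acceptance(K):.4f}")
```

Running the loop shows the accepted fraction dropping roughly as 1, 1/2, 1/6, 1/24, 1/120 for K = 2, ..., 6, which is one way to see why the approach becomes expensive for K ≳ 5.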
[Plot, panels (a) and (b); legend: DM, AC]
Figure 3.4:
Number of shares $n_k$ sold in interval k, as a proportion of the total number of order shares N. Here 5 time intervals have been considered. (a) Optimal strategy derived via the UF (3.4) (DM) and via the UF (2.17) (AC), in the case λ = 0.3, $\tilde{c} = -1$. While the AC solution decays exponentially, the DM solution seems to decay linearly. (b) Optimal strategy derived via the UF (3.4) (DM) and via the UF (2.17) (AC), in the case λ = 0.3, $\tilde{c} = -0.5$. In this case, the strategy obtained via the UF (3.4) is almost constant, but the proportion of executed shares increases slightly with the trading interval. The analytical AC solution (obtained using equations (3.2)-(3.3)) coincides with the numerical AC solution shown here (obtained via Monte Carlo simulation) to a very high, but finite, precision (due to statistical fluctuations). The other parameter values are γ = 1, η = 1, ε = 1, σ = 1, Δt = 1, ξ = 1.

Following the logic explained above, we here present results for the optimal trading strategy obtained with our utility function (3.4), using the AC price-impact model. In figure 3.4 we show the comparison between the optimal trading strategies obtained via equations (3.4) and (2.17). For the given values of $\tilde{c}$, the strategy derived by AC decays more rapidly with the trading interval than that obtained with the UF (3.4), i.e. under the AC model the trader's patience "decays exponentially". However, in a framework in which timing risk is modelled through the negative tail of the TC distribution, the patience of the trader seems to decay more slowly (it appears to be linear in the time interval). More interestingly, at some level of $\tilde{c}$, as we increase its value, the number of traded shares no longer decreases with the trading interval. We conjecture that this is because, at some level of $\tilde{c}$, the integral in the timing-risk term of equation (3.4) plays a different role (as the integral of the probability density function between −∞ and $\tilde{c}$ captures, in that case, almost the whole distribution). When the tail includes a larger proportion of the distribution, trading itself is penalised (as it incurs a negative transaction cost), which leads to a slight preference for leaving a larger proportion of the shares to be executed in later intervals (in which case market impact contributes over fewer intervals, due to the decaying nature of the temporary market impact).

           $\tilde{c} = -1$   $\tilde{c} = -0.5$   $\tilde{c} = 0$   $\tilde{c} = 0.5$   $\tilde{c} = 1$
λ = 1      (1.00,0.00)   (0.21,0.78)   (0.00,1.00)   (0.00,1.00)   (0.00,1.00)
λ = 0.7    (0.84,0.16)   (0.33,0.67)   (0.21,0.79)   (0.38,0.62)   (0.47,0.53)
λ = 0 .    (0.66,0.34)   (0.38,0.62)   (0.38,0.62)   (0.46,0.54)   (0.49,0.51)
λ = 0 .    (0.57,0.43)   (0.44,0.56)   (0.45,0.55)   (0.48,0.52)   (0.49,0.51)
λ = 0      (0.50,0.50)   (0.50,0.50)   (0.50,0.50)   (0.50,0.50)   (0.50,0.50)

Table 3.3:
Solutions, expressed in the form $(n_1, n_2)$, for the optimal execution strategy obtained by minimising the UF (3.4). As with the AC utility function, (for a given value of $\tilde{c}$) decreasing λ gives a more "homogeneous" solution ($n_1 \approx n_2$). Actually, solutions for different values of λ corresponding to ˜ c ≈ − λ , increasing $\tilde{c}$ changes the notion of the timing-risk term in the utility function. From penalising "patience" at $\tilde{c} = -1$, it goes to penalising "impatience" for $\tilde{c} \gtrsim -0.5$. We thus find an "inversion effect", from $n_1 > n_2$ to $n_2 > n_1$, when increasing $\tilde{c}$. The other parameter values are γ = 1, η = 1, ε = 1, σ = 1, Δt = 1, ξ = 1.

In order to explore this behaviour further, we have calculated the optimal trading strategy across different values of λ and $\tilde{c}$, computed with the utility function (3.4), here in the case of two trading intervals. The results are shown in table 3.3. Starting with a fixed value of $\tilde{c} = -1$, we see that, as we increase the value of λ, we go from a TWAP strategy, which distributes the number of executed shares equally among all trading intervals, to a strategy where we trade most of the order shares in the first time interval. This outcome is similar to what is obtained with the UF (2.17), and intuitive: λ → 0 implies that the market-impact term ∼ E[c] dominates the UF, and this will be minimised in a situation where we provide the minimal "shock" to the market per trading interval, namely when we trade at a constant rate through the time horizon. λ → 1, in contrast, implies that the timing-risk term $\sim \int_{-\infty}^{\tilde{c}} f[c]\,dc$ dominates the UF. In this case, we aim to minimise the area under the tail of the TC distribution (extreme negative costs), which is accomplished by executing the order "rapidly", i.e. completing it within the first time interval(s). More striking is the behaviour when $\tilde{c}$ is varied. As we increase $\tilde{c}$, for example going from $\tilde{c} = -1$ to $\tilde{c} = -0.5$, we again find a homogeneous distribution of executed shares among trading intervals for λ → 0. However, for larger λ we observe an "inversion effect": now most of the shares are executed in the latest time interval(s), i.e. the UF (3.4) incorporates an extra degree of "patience" with respect to the AC utility function. This phenomenon is mostly relevant in the context of describing markets where most of the trading is completed towards the end of the day (due to increased liquidity at the close). In future research, it will be interesting to investigate the necessary degrees of freedom to be included in the model in order to recover the common "U shape" (meaning that most of the trading is done at the beginning and at the end of the day, with lower trading volumes in intermediate time intervals). (A potential option to accomplish this might be adding a variance term ∼ V[c] to the UF (3.4), but exploring this avenue is left as a future prospect.)

[Plot, panels (a) and (b); legend of (a): numerical solution, analytical solution; legend of (b): simulated, linear fit; points labelled by λ]
Figure 3.5:
Efficient trading frontier computed with 5 trading intervals. (a) Result of Almgren and Chriss, where the relationship between market impact and timing risk is nonlinear. In particular, for small values of TC volatility, taking twice as much timing risk might reduce market impact by significantly more than a factor of two (when both are rescaled to the corresponding units). Here 'Market-impact term' = −E[c], while 'Timing-risk term' = V[c]. (b) Result obtained with the UF (3.4) in the case $\tilde{c} = -1$. Within this range of values of λ, and for sufficiently small $\tilde{c}$, the relationship between market impact and timing risk is approximately linear. Here 'Market-impact term' = −E[c], while 'Timing-risk term' = $\int_{-\infty}^{\tilde{c}} f[c]\,dc$. Notice that the units of the Timing-risk term differ between panels (a) and (b), and that these units are themselves different from those of the Market-impact term. The other parameter values are γ = 1, η = 1, ε = 1, σ = 1, Δt = 1, ξ = 1.

From the results above, we see that, at least for small $\tilde{c}$, the timing-risk term $\sim \int_{-\infty}^{\tilde{c}} f[c]\,dc$ can be interpreted as a "standard" TC risk (similar to TC volatility, or rather to TC value-at-risk). As such, we may wonder whether, in a market-impact vs. timing-risk picture, we can define an 'efficient trading frontier', similarly to how this is done within the AC formalism [AC01], i.e. as the set of optimal trading strategies for different values of λ in a market-impact vs. timing-risk representation. In figure 3.5(a) we display the result obtained for such an efficient frontier when we use the AC utility function (2.17). Here we have computed the optimal execution strategy across different values of λ ∈ [0, 1] and plotted both the solutions obtained via a numerical simulation of transaction costs and those obtained using the analytical formulas (3.2)-(3.3) for their expectation and variance. In figure 3.5(b) we show the equivalent efficient frontier if we use the utility function (3.4). We do so for ˜ c = − λ , when the integral term $\int_{-\infty}^{\tilde{c}} f[c]\,dc$ can be clearly interpreted as a "standard" TC risk. Interestingly, in this case the efficient frontier displays a linear relationship between market impact and timing risk. In other words, if we are in a regime in which TC risk corresponds to extreme negative events, and if we scale MI and timing risk to the appropriate units, then in order to reduce market impact by a factor of two we need to take twice as much timing risk. Conversely, if we want to reduce timing risk by half, our execution will need to incur twice as much market impact. This is perhaps the most distinctive characteristic of timing risk modelled as in the utility function (3.4), compared to a mean-variance UF of the type (2.17).

One important assumption in the model of Almgren and Chriss is that returns are independent, normally distributed variables. This makes it possible, specifically, to obtain analytical formulas for TC expectation and variance, thus avoiding the need to numerically simulate the price dynamics (as done above via Monte Carlo). However, real markets are not that gentle, and it is common for the probability distribution of financial returns to be leptokurtic (i.e. with fat tails) [BP09]. In particular, we have analysed over 200 markets, with the purpose of establishing a certain universality for the distribution of financial returns. Using daily prices over a period of around 30 years, one observes that, generically, (both arithmetic and geometric) returns fail tests of normality at the most common significance levels. More specifically, less liquid markets (e.g. emerging markets) tend to show more rare events or a certain autocorrelation, giving rise to distributions with heavier tails than those of the normal distribution. The difficulty of establishing a "universal law" that describes market returns lies in the fact that, apart from providing a good fit, the conjectured distribution should: i) have the minimum number of fitting parameters (to avoid overfitting) and ii) apply over the broad class of markets that it aims to describe with the same parameter values (within a margin of error). Following our analysis, the distributions that, to our knowledge, accomplish this best are t-Student distributions, piece-wise distributions (e.g. Gaussian body and Pareto or exponential tails), and the class of 'alpha-stable' distributions, such as the Levy distribution, or the truncated Levy distribution [BP09]. We will therefore, to illustrate our method under more realistic market conditions, consider the case of market returns distributed according to a t-Student distribution.

As we have seen, the rationale for introducing the utility function (3.4), as compared to the UF (2.17), is precisely to capture TC timing risk as originated by extreme negative events that may occur during the period of "not executing" the order. It is therefore intuitive that, in markets characterised by fat-tailed distributions, our UF plays a relevant role in determining how to optimally execute a market order. In particular, as we will show, this is perhaps the most relevant application of our theory: given two orders following different TC distributions, one Gaussian and one fat-tailed (e.g. t-Student), with the same mean and variance but different higher moments (as happens with heavy-tailed distributions), should these two orders be executed equally over the trading horizon? It seems intuitive that, if the "tail risk" (probability of negative rare events) is different, the execution strategy for both orders should be, in some sense, different. 
The UF provided by Almgren and Chriss will, however, provide the same solution for the optimal execution strategy (as is apparent from the fact that only TC mean and variance enter this optimisation problem). In contrast,

Footnotes: More precisely, AC consider that arithmetic returns follow a Gaussian distribution. In the sense that it estimates or predicts.

[Plot: utility functions $U_{DM}$ and $U_{AC}$; legend: simulated, fit]
Figure 3.6:
Utility function in the case in which the trading horizon is subdivided into 2time intervals ( K = 2) and when the returns follow a t-Student distribution with µ = 0, σ = (cid:112) ( ν − /ν , and ν = 5. Here we show the utility function (2.17) numerically estimatedusing t-Student returns (labeled U AC ), and compared to the equivalent analytical result(which corresponds to Gaussian returns with σ = 1). Both curves overlap almost exactly.The utility function (3.4) estimated with t-Student returns ( µ = 0, σ = (cid:112) ( ν − /ν , and ν = 5) is also shown (labeled U DM ). The solutions for the optimal execution strategies(minima of the fitted UF), expressed in the form ( n , n ) are (0 . , . . , . . , . γ = 1, η = 1, ε = 1, σ = 1, ∆ t = 1, ξ = 1, λ = 0 .
7, ˜ c = −
1. The number of shares n k , executed in time interval k is expressedas a ratio to the order size, i.e. we take N = 1. we show next how the UF (3.4) distinguishes between these two situations, and suggests thatthe second order should be traded more or less aggressively than the first, depending on thevalue of ˜ c , which determines the characterisation of timing risk as more or less “rare events”.In figure 3.6 we show the result of the simulation of 100 strategies, for each of whichthe utility function has been evaluated using 10000 price paths, with price dynamics givenby (2.9), (2.10), (2.11), being now χ a t-Student-distributed variable with mean 0, standarddeviation (cid:112) ( ν − /ν , with ν = 5 degrees of freedom. We have taken this specific value of the σ in order to make the standard deviation of the distribution (given by σ (cid:112) ν/ ( ν − K = 2). The result for the optimal trading strategy derived through equation(2.17), U AC , coincides the one obtained via the same equation (at this value of λ = 0 . n opt1 , n opt2 ) = (0 . , . Execution Costs Beyond Mean-Variance n opt1 , n opt2 ) = (0 . , . n opt1 , n opt2 ) = (0 . , .
16) (c.f. table 3.3) that is obtained using normally distributed returns. The utility function (3.4) therefore "recognises" the nature of TC, and it is in this sense that it is said to be a "beyond mean-variance" approach to TCA.

It is instructive to analyse the optimal trading strategies obtained for Gaussian and non-Gaussian returns. While the result derived through the UF (2.17) is the same in both cases, equation (3.4) gives (for λ = 0.7, $\tilde{c} = -1$) the result $\vec{n}^{\mathrm{opt}} = (0.84, 0.16)$ for Gaussian (arithmetic) returns, and $\vec{n}^{\mathrm{opt}} = (0.81, 0.19)$ for t-Student (arithmetic) returns. Therefore in the second case (which, as we have said, is a better representation of reality) we ought to execute more patiently. (Notice that the difference between both approaches corresponds, in this case, to an imbalance of around 3% of the shares to be traded more/less in each of the two time intervals.) Is this intuitive? After all, we may think that in a case in which we have more rare events (heavier tails), we should have a larger timing risk (under the picture of the UF (3.4)) and thus, to minimise it, execute more aggressively than in the Gaussian case, not more patiently. This is indeed the situation when $\tilde{c}$ is sufficiently small, namely smaller than the threshold value ∼ mean[c] − . × standard deviation[c], which is the smallest point at which the Gaussian and t-Student cumulative distribution functions intersect. As $\tilde{c}$ is increased above this value, the value of the integral $\int_{-\infty}^{\tilde{c}} f[c]\,dc$ is actually greater for Gaussian TC than for t-Student TC, which is why in this case we ought to execute more patiently. It is important to remember that $\tilde{c}$ allows us to modify the meaning of timing risk, from extreme negative costs to less extremely negative (or even positive) TC. This conceptual variation, provided by the additional degree of freedom that $\tilde{c}$ introduces in the UF (3.4), is what gave rise to the "inversion effect" mentioned above (from executing more shares in earlier trading intervals than in later ones to the opposite). In the future it would be interesting to investigate the role that autocorrelated (i.e. non-independently distributed) returns play in the optimal trading strategy. We speculate that, for positive autocorrelation, market impact gets amplified, suggesting that in this case the order should be executed more patiently than for independent returns (and conversely for negative autocorrelation); but this is a topic for future work.
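The argument above relies on the Gaussian and t-Student cumulative distribution functions crossing at some point in the negative tail. A minimal sketch of how such a crossing can be located is given below. It is an illustration only, under assumptions that are not taken from the text: it compares a standard normal variable with a t-Student variable with ν = 5 rescaled to unit variance (not the simulated TC distributions of this chapter), integrates the t density with Simpson's rule, and uses hypothetical function names.

```python
import math

NU = 5                                # degrees of freedom, as in the text
SCALE = math.sqrt((NU - 2) / NU)      # rescaling that gives the t variable unit std


def normal_cdf(x):
    """CDF of a standard normal variable."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def t_pdf(x, nu=NU):
    """Density of a standard (unscaled) t-Student variable."""
    coef = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return coef * (1.0 + x * x / nu) ** (-(nu + 1) / 2)


def t_cdf(x, nu=NU, steps=2000):
    """CDF of a standard t variable: Simpson's rule between 0 and x,
    combined with the symmetry F(0) = 1/2. `steps` must be even."""
    if x == 0.0:
        return 0.5
    a, b = (x, 0.0) if x < 0 else (0.0, x)
    h = (b - a) / steps
    total = t_pdf(a, nu) + t_pdf(b, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(a + i * h, nu)
    area = total * h / 3.0
    return 0.5 - area if x < 0 else 0.5 + area


def tail_gap(c):
    """P(X <= c) for the Gaussian minus P(X <= c) for the unit-variance t."""
    return normal_cdf(c) - t_cdf(c / SCALE)


def smallest_crossing(lo=-3.0, hi=-1.0, tol=1e-8):
    """Bisection for the point in [lo, hi] where the two CDFs intersect."""
    if not (tail_gap(lo) < 0.0 < tail_gap(hi)):
        raise ValueError("bracket does not contain a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tail_gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    c_star = smallest_crossing()
    print("crossing (in units of the standard deviation):", c_star)
    for c in (-3.0, -1.0):
        print(c, "Gaussian:", normal_cdf(c), "t-Student:", t_cdf(c / SCALE))
```

Below the crossing the t-Student tail probability is the larger of the two, and above it (up to the mean) the Gaussian one is, which is the mechanism invoked in the text. The location of the crossing for actual TC distributions will in general differ from the one found for these standardised variables.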
We end this chapter by considering another area where the utility function (3.4) shows distinctive features. In section 3.1 we have shown how not only the nature of financial returns, but also the conjectured impact model, fundamentally affects the shape of the TC distribution. However, whereas we can estimate the distributional properties of returns for a particular market realisation, market impact, as we have previously mentioned, cannot be measured: for a given security and point in time (unique market conditions), either we execute the order (in which case our own supply/demand affects the security's price, and this is what we observe), or we do not execute the order (in which case the security's price is not affected by our supply/demand, only by that of other market participants). (More precisely, not only execution, but also sending any signal to the market of our willingness/desire to trade, will affect the security's price as the market absorbs this information.) Since we cannot observe both instances of reality at the same time (either we execute the order or we do not), the difference between the security's price in the presence and in the absence of our execution cannot be measured. The best we can do in order to optimise TC is to estimate/predict MI based on a model. As such, having a realistic impact model is of extreme importance since, as we will see, the optimal trading strategy depends strongly on the underlying assumptions of how execution affects a security's market price.

In this section we consider the price dynamics given by (2.15), as compared to the price evolution (2.9) considered so far. For the impact function $I(n_k)$, we will contemplate three different possibilities:

LINEXP: $I(n_k) = -\xi \gamma\, n_k\, e^{-\rho k \Delta t}$ (3.8)
LINPOW: $I(n_k) = -\xi \gamma\, n_k\, (k \Delta t)^{-\rho}$ (3.9)
SQRT: $I(n_k) = -\xi \gamma \sqrt{n_k}$ (3.10)

where 'LINEXP', 'LINPOW', and 'SQRT' stand for "linear-exponential" (linear in the number of executed shares and exponential time decay), "linear-power-law" (linear in the number of executed shares and power-law time decay), and "square-root" (square root in the number of executed shares), respectively. These are some of the most popular functional forms of MI that appear in the literature within the propagator approach [Gat09, BBDG18]. One important difference between the price evolution (2.15) and that of (2.9) is that, in the case that we now consider, both the impact law and the stochastic component are linked to geometric returns (as compared to arithmetic returns in the dynamics previously considered). This is relevant because: i) Geometric returns are generally better characterised by universal distributional properties (i.e. when we say that the returns follow a Gaussian or t-Student distribution). ii) Arithmetic returns are only a good approximation to geometric returns for
Execution Costs Beyond Mean-Variance Gaussian Rets. t-Student Rets.Impact model (cid:126)n opt
AC numeric (cid:126)n opt DM (cid:126)n opt AC numeric (cid:126)n opt DM LINEXP (0 . , .
44) (0 . , .
71) (0 . , .
44) (0 . , . . , .
57) (0 . , .
67) (0 . , .
58) (0 . , . . , .
57) (0 . , .
59) (0 . , .
57) (0 . , . Table 3.4:
Solutions, expressed in the form $(n_1, n_2)$, for the optimal execution strategy obtained with various impact models, in the case in which the trading horizon is subdivided into 2 time intervals (K = 2). All results have been obtained by using the price evolution given in (2.15), with noise term $\zeta = \sigma \sqrt{\Delta t}\, \chi$, where $\chi \sim N(0, 1)$ is a normally distributed variable with mean 0 and standard deviation 1 ('Gaussian Rets.'), or $\chi \sim t_\nu\big(0, \sqrt{(\nu-2)/\nu}\big)$ is a t-Student-distributed variable with mean 0, standard deviation 1, and ν = 5 degrees of freedom ('t-Student Rets.'). LINEXP refers to 'linear-exponential' (the solution obtained by using the impact model (3.8) in the price evolution). LINPOW refers to 'linear-power-law' (the solution obtained by using the impact model (3.9) in the price evolution). SQRT refers to 'square-root' (the solution obtained by using the impact model (3.10) in the price evolution). 'AC numeric' denotes the solution (numerically) obtained by using the AC utility function (2.17). DM denotes the solution obtained by using the UF (3.4). The Monte Carlo sample contains 100 quasi-random strategies in each case. Parameter values are γ = 1, η = 1, ε = 1, σ = 1, Δt = 1, ξ = 1, p = 1, λ = 0.3, $\tilde{c} = -1$, ρ = 1/2. The number of shares $n_k$ executed in time interval k is expressed as a ratio to the order size, i.e. we take N = 1. In these simulations, we have used 15000 price paths to calculate each of 100 trading strategies, which we fit to a cubic polynomial in order to obtain the optimal execution strategy.

small time increments. iii) With geometric returns, TC are typically non-Gaussian, regardless of whether returns are normally distributed or not (since a product of stochastic variables, as involved in equation (2.16), Gaussian or otherwise, is generally non-Gaussian). Our utility function (3.4) therefore becomes particularly relevant in this case: as we have seen, timing risk modelled as $\int_{-\infty}^{\tilde{c}} f[c]\,dc$ captures non-Gaussian features such as heavy tails, and in this case the optimal execution strategy depends on the nature of TC beyond mean-variance. This will be illustrated further now, by showing the difference between the result obtained via the AC utility function (2.17) (evaluated through numerical simulations only, as in this case no analytical expression for E[c] and V[c] is available) and via the UF (3.4).

In table 3.4 we show the optimal trading strategies obtained with equations (2.17) and (3.4), in the case of 2 trading intervals (K = 2), for each of the impact functions (3.8), (3.9), (3.10), and for both Gaussian and t-Student returns. The new price evolution seems to have a significant impact on the optimal trading strategies. Most scenarios show an "inversion effect" with respect to the solutions derived through the time evolution (2.9). In the case of UF (3.4), TC are such that here the value of $\tilde{c} = -1$

[Plot, panels (a)-(f): (a) LINEXP, (b) LINPOW, (c) SQRT for Gaussian returns; (d) LINEXP, (e) LINPOW, (f) SQRT for t-Student returns; curves $U_{AC}$ and $U_{DM}$; legend: simulated, fit]
Figure 3.7:
Utility function in the case in which the trading horizon is subdivided into 2 time intervals (K = 2), comparing different impact models, and Gaussian vs. t-Student returns. Here we show the utility functions (2.17), $U_{AC}$, and (3.4), $U_{DM}$, evaluated through the impact models LINEXP, LINPOW, SQRT (c.f. equations (3.8), (3.9), (3.10)) and the price evolution (2.15). The minima of the UF (optimal trading strategies) correspond to the values shown in table 3.4. Both utility functions have been numerically evaluated, using a Monte Carlo sample with 15000 price paths to estimate each of the 100 quasi-random strategies represented by the dots. The solid lines are a fit of a third-order polynomial to these points. Panels (a), (b) and (c) correspond to Gaussian (geometric) returns, while panels (d), (e) and (f) correspond to t-Student (geometric) returns. Parameter values are γ = 1, η = 1, ε = 1, σ = 1, Δt = 1, ξ = 1, p = 1, λ = 0.3, $\tilde{c} = -1$, ρ = 1/2. The number of shares $n_k$ executed in time interval k is expressed as a ratio to the order size, i.e. we take N = 1.

n > n and n > n , respectively. Again, the UF (3.4) tends to provide a solution in which more shares are executed in the latest trading intervals, a situation that is associated with the timing-risk term $\sim \int_{-\infty}^{\tilde{c}} f[c]\,dc$ incorporating both extreme and non-extreme TC. Most interestingly, 'AC numeric' provides the same optimal trading strategy for LINPOW and SQRT in this case, and 'DM' executes most aggressively for SQRT. We have also evaluated the case in which, for the price dynamics given by (2.15) and impact models (3.8), (3.9), (3.10), (geometric) returns follow a t-Student distribution. The optimal trading strategies, in this situation, are almost identical to the case of Gaussian returns. The impact models and price evolution considered here are therefore robust against certain distributional properties of financial returns (such as heavy tails). This characteristic holds at the level of the full utility function (not only for the optimal trading strategy), as shown in figure 3.7. For all impact models, the utility functions corresponding to Gaussian and t-Student returns are here almost identical. Furthermore, we see that now the quadratic approximation, previously used for the fitted UF, is no longer appropriate. Instead, we have now used a cubic polynomial, which, as figure 3.7 shows, provides a good fit to the simulated strategies in this case. Although here the appropriate degree of the fitting polynomial has been assessed visually, and by comparison of the solutions obtained via Monte Carlo and via a quadratic fit together with a gradient-descent method, determining the pertinent functional form of the utility function should be done, in general, more rigorously. 
This may enter the territory of model selection and the well-known problem of finding the optimal bias-variance tradeoff [HTF01] (that is, we need to find a good balance between goodness-of-fit and the number of model parameters), a methodology that is beyond the scope of this text, but that might be worth investigating in the presence of more complex impact models or for a higher number of trading intervals.
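As a compact recap of the three impact laws (3.8)-(3.10) compared in this section, the sketch below evaluates them for a given traded fraction $n_k$ and interval index k. It is an illustration only: the function names are hypothetical, the default parameter values (ξ = 1, γ = 1, ρ = 1/2, Δt = 1) are those quoted in the captions of table 3.4 and figure 3.7, and the way the impact enters the price evolution (2.15) is not reproduced here.

```python
import math


def impact_linexp(n_k, k, xi=1.0, gamma=1.0, rho=0.5, dt=1.0):
    """LINEXP, eq. (3.8): linear in n_k with exponential time decay."""
    return -xi * gamma * n_k * math.exp(-rho * k * dt)


def impact_linpow(n_k, k, xi=1.0, gamma=1.0, rho=0.5, dt=1.0):
    """LINPOW, eq. (3.9): linear in n_k with power-law time decay (k >= 1)."""
    return -xi * gamma * n_k * (k * dt) ** (-rho)


def impact_sqrt(n_k, xi=1.0, gamma=1.0):
    """SQRT, eq. (3.10): square root in the number of executed shares."""
    return -xi * gamma * math.sqrt(n_k)


if __name__ == "__main__":
    n_k = 0.5  # half of the order, with N = 1
    for k in (1, 2, 3):
        print(k,
              impact_linexp(n_k, k),
              impact_linpow(n_k, k),
              impact_sqrt(n_k))
```

With these defaults the exponential kernel decays faster in k than the power-law kernel, while the square-root law has no explicit time dependence and is concave in $n_k$, so splitting an order into smaller pieces is penalised differently under each model.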
Execution Costs Beyond Mean-Variance hapter 4 Conclusions “You must unlearn what you have learned... Do. Or do not. There is no try.”
Yoda, Jedi Master

In this Thesis we have investigated transaction costs in securities trading. Specifically, we have developed a framework that enables us to characterise optimal execution of market orders in terms of two essential degrees of freedom: market impact and timing risk. Here, in contrast to previously existing theories that mathematically model these variables, we have introduced an alternative representation of timing risk: it has been modelled as the TC incurred by our execution below a particular cost threshold. This characterisation of timing risk is embedded into a novel utility function, whose minimisation provides the optimal execution strategy. While the most common approaches to optimal execution rely only on the mean and variance of TC, our formalism captures features of the TC distribution beyond these first moments. This allows us to devise distinct optimal execution strategies for market situations whose associated TC are only differentiated by higher-order moments (such as the asymmetry of the distribution or the thickness of its tails). More precisely, using our UF, encapsulated in equation (3.4), we have shown that it is possible to have optimal execution strategies that trade most of the order volume in the latest time intervals of the trading horizon.

Along the way towards establishing a formalism for optimal execution, we have encountered various difficulties. In particular, resolving the optimal trading strategy as the minimum of a utility function relies, in general, on a stochastic simulation of market prices. This inherently leads to non-deterministic, noisy utility functions, whose global minima are non-trivial to determine. We have, nonetheless, developed a methodology that is sufficiently robust against statistical fluctuations, and by which a stable solution for the optimal execution strategy is reached at attainable levels of convergence of the TC sample to the underlying probability distribution.
This framework has been used to resolve the optimal trading path when the [...] linear relationship; something that is in stark contrast with the relationship that market impact and timing risk follow when the latter is modelled as TC variance.

Our inquiry on execution trading has been complemented with the development and analysis of a price-impact model, which led to equation (2.16). As we have shown, this model has distinctive characteristics. Among them is the fact that execution impatience is penalised, i.e. it allows solutions to optimal execution strategies that trade the majority of the order shares in the latest time intervals. Furthermore, it is robust with respect to different TC distributions: the optimal trading strategies and utility functions obtained for Gaussian and t-Student returns are, in this case, almost indistinguishable from each other. We have analysed in detail three different functional forms of the impact function, showing in particular how a sub-linear dependence on the executed volume tends to give “more impatient” optimal trading strategies. Finally, we have examined important applications of our theory, such as the case when market returns include features beyond mean-variance. In particular, by analysing the case in which returns follow a t-Student distribution, we have identified two regimes (controlled by the value of the cost threshold) in which heavy tails contribute to more or less aggressive execution strategies.

Our work opens new questions and future prospects. Among them we can cite:
i) The scaling of the formalism to higher dimensionality (i.e. larger number of trading intervals).
ii) The treatment of the trading horizon as an optimisation variable, and its relationship to the risk-aversion parameter.
iii) Extending the framework to include opportunity cost, i.e. the case in which the order might be incompletely executed by the end of the trading horizon.
iv) The application of our utility function to other impact models.
v) Generalising the formalism to treat market-on-close orders and limit orders.

Appendix A
Fundamental benchmarks and trading algorithms
TWAP (Time Weighted Average Price)
This strategy trades ‘uniformly’ over the time horizon $T$, i.e.

$$ n_k^{\mathrm{TWAP}} = \frac{N}{K} \tag{A.1} $$

trading at times $t_k = \frac{T}{K} k$, where $k = 1, \ldots, K$. Here $N$ is the total number of shares to be traded and $K$ the number of time sub-intervals over the time $T$. The benchmark price is given by

$$ p^{\mathrm{TWAP}} = \frac{1}{K} \sum_{k=1}^{K} p_k \tag{A.2} $$

where $p_k$ is the average transaction price in time bin $k$.

VWAP (Volume Weighted Average Price)
This strategy trades proportionally to the market volume over the time horizon $T$. If $\nu_k$ is the traded market volume over time bin $k$, we have

$$ n_k^{\mathrm{VWAP}} = \frac{\nu_k}{V} N \tag{A.3} $$

Here $V$ is the market volume traded over the time period $T$ and $N$ the total number of shares to be traded. (Notice that for VWAP as an algorithm, $\nu_k$ will be a predicted volume from historical observations, while for VWAP as a benchmark, $\nu_k$ corresponds to the realised market volume.) The benchmark price is given by

$$ p^{\mathrm{VWAP}} = \frac{1}{V} \sum_{k=1}^{K} \nu_k p_k \tag{A.4} $$

where $p_k$ is the average transaction price in time bin $k$.

POV (Percentage of Volume) & PWP (Participation Weighted Price)
This strategy trades according to a percentage of the traded market volume in each time bin. A POV algorithm is accompanied by a ‘participation rate’ parameter $\eta$ (with $0 \le \eta \le 1$), so that the number of shares traded in time bin $k$ is given by

$$ n_k^{\mathrm{POV}} = \eta \nu_k, \tag{A.5} $$

where $\nu_k$ is the traded market volume over time bin $k$. The associated benchmark is called PWP, whose price is given by

$$ p^{\mathrm{PWP}} = \eta \sum_{k=1}^{K} \nu_k p_k \tag{A.6} $$

MO (Market Open)
MO only acts as a benchmark (so $n_k^{\mathrm{bmk}} = n_k^{\mathrm{exe}}$), given a trading strategy $\{ n_k^{\mathrm{exe}} \}$. The benchmark price is simply the price of the security at the market opening time

$$ p^{\mathrm{MO}} \equiv p_O \tag{A.7} $$

(For discrete times this is taken to be the price of the security at the beginning of the corresponding time bin.)

MC (Market Close)

MC only acts as a benchmark (so $n_k^{\mathrm{bmk}} = n_k^{\mathrm{exe}}$), given a trading strategy $\{ n_k^{\mathrm{exe}} \}$. The benchmark price is simply the price of the security at the market closing time

$$ p^{\mathrm{MC}} \equiv p_C \tag{A.8} $$

(For discrete times this is taken to be the price of the security at the end of the corresponding time bin.)

IS (Implementation Shortfall)
Defined as a benchmark, IS is simply the price of the security at the order start time

$$ p^{\mathrm{IS}} \equiv p_0 \tag{A.9} $$

(For discrete times this is taken to be the price of the security at the start or end, as a matter of convention, of the corresponding time bin.)

So, taking into account that $\sum_{k=1}^{K} n_k^{\mathrm{bmk}} = N$, the transaction cost associated with this benchmark can be calculated as

$$ c^{\mathrm{IS}} \equiv \xi \left( \frac{1}{N} \sum_{k=1}^{K} n_k^{\mathrm{exe}} p_k^{\mathrm{exe}} - p_0 \right), \tag{A.10} $$

where $\xi = -1$ for a buy order and $\xi = 1$ for a sell order. Defined as an algorithm, IS tries to minimise a utility function of transaction costs, as described in the main text.

It is worth noting that an alternative conception of IS as a benchmark is as follows: compute the predicted cost, given equation (A.10), predicted prices (determined by historical observations), and the corresponding optimal trading strategy. This value, which we will call estimated-IS, can be compared to the observed IS calculated from equation (A.10), given the executed strategy and the corresponding execution prices.
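The schedules and benchmark prices defined in this Appendix can be sketched numerically as follows. This is only an illustration under stated assumptions: the function names and the toy volumes and prices are hypothetical, and the IS cost is computed as $\xi$ times the difference between the share-weighted average execution price and the order start price, which is one reading of equation (A.10) for a fully executed order, with $\xi = -1$ assumed for a buy order.

```python
import numpy as np

def twap_schedule(N, K):
    """Equation (A.1): the same number of shares, N / K, in each of the K bins."""
    return np.full(K, N / K)

def vwap_schedule(N, volumes):
    """Equation (A.3): shares proportional to the market volume of each bin."""
    volumes = np.asarray(volumes, dtype=float)
    return N * volumes / volumes.sum()

def pov_schedule(eta, volumes):
    """Equation (A.5): a fixed fraction eta of the market volume of each bin."""
    if not 0.0 <= eta <= 1.0:
        raise ValueError("participation rate eta must lie in [0, 1]")
    return eta * np.asarray(volumes, dtype=float)

def twap_price(prices):
    """Equation (A.2): plain average of the per-bin transaction prices."""
    return float(np.mean(prices))

def vwap_price(prices, volumes):
    """Equation (A.4): market-volume-weighted average of the per-bin prices."""
    return float(np.average(prices, weights=volumes))

def is_cost(n_exe, p_exe, p_start, xi):
    """IS cost per share: xi * (average execution price - start price).

    Assumed sign convention: xi = -1 for a buy order, xi = +1 for a sell order.
    The average is taken over the shares actually executed.
    """
    n_exe = np.asarray(n_exe, dtype=float)
    p_exe = np.asarray(p_exe, dtype=float)
    avg_exe_price = np.dot(n_exe, p_exe) / n_exe.sum()
    return float(xi * (avg_exe_price - p_start))

# Toy data (hypothetical): four time bins, an order of N = 100 shares.
volumes = [100.0, 300.0, 200.0, 400.0]   # market volume per bin
prices = [10.0, 10.2, 10.1, 10.4]        # average transaction price per bin
N = 100.0

n_twap = twap_schedule(N, K=4)
n_vwap = vwap_schedule(N, volumes)
n_pov = pov_schedule(0.1, volumes)       # eta = N / V reproduces the VWAP schedule
cost_twap = is_cost(n_twap, prices, p_start=10.0, xi=-1)
cost_vwap = is_cost(n_vwap, prices, p_start=10.0, xi=-1)
```

With these toy numbers the price drifts upwards during the order, so a buy order executed with either schedule has a negative IS cost under the assumed sign convention, and the VWAP schedule, which trades more in the later and more expensive bins, fares worse than TWAP.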
Bibliography

[AC01] R. Almgren and N. Chriss. Optimal execution of portfolio transactions. Journal of Risk, 3:5–39, 2001. (Cited on pages 5, 13, 16, and 32).
[Ale09] C. Alexander. Value-at-Risk Models. John Wiley and Sons, Inc., 2009. (Cited on page 19).
[Alm03] R. F. Almgren. Optimal execution with nonlinear impact functions and trading-enhanced risk. Applied Mathematical Finance, 10:1–18, 2003. (Cited on page 14).
[Arr74] K. J. Arrow. General economic equilibrium: Purpose, analytic techniques, collective choice. The American Economic Review, 64(3):253–272, 1974. (Cited on page 2).
[ATHL05] R. Almgren, C. Thum, E. Hauptmann, and H. Li. Direct estimation of equity market impact. Risk, 18:58–62, 2005. (Cited on pages 14 and 17).
[BBDG18] J.-P. Bouchaud, J. Bonart, J. Donier, and M. Gould. Trades, Quotes and Prices. Cambridge University Press, 2018. (Cited on pages 3, 5, and 36).
[BBLB19] F. Bucci, M. Benzaquen, F. Lillo, and J.-P. Bouchaud. Crossover from linear to square-root market impact. Phys. Rev. Lett., 122:108302, 2019. (Cited on page 15).
[BL98] D. Bertsimas and A. W. Lo. Optimal control of execution costs. Journal of Financial Markets, 1:1–50, 1998. (Cited on page 16).
[BP09] J.-P. Bouchaud and M. Potters. Theory of Financial Risk and Derivative Pricing: From Statistical Physics to Risk Management. Cambridge University Press, 2009. (Cited on pages 19 and 33).
[Cau47] M. A. Cauchy. Méthode générale pour la résolution des systèmes d’équations simultanées. Comptes Rendus Hebd. Séances Acad. Sci., 25:536–538, 1847. (Cited on pages 24, 26, and 28).
[...] Available at SSRN: https://ssrn.com/abstract=3351736 or http://dx.doi.org/10.2139/ssrn.3351736, 2019. (Cited on page 15).
[CJP15] Á. Cartea, S. Jaimungal, and J. Penalva. Algorithmic and High-Frequency Trading. Cambridge University Press, Cambridge, 2015. (Cited on page 3).
[Fol94] D. K. Foley. A statistical equilibrium theory of markets. Journal of Economic Theory, 62:321–345, 1994. (Cited on page 3).
[Fri62] M. Friedman. Price Theory: A Provisional Text. Aldine, Chicago, 1962. (Cited on page 3).
[Gat09] J. Gatheral. No-dynamic-arbitrage and market impact. Quantitative Finance, 10:749–759, 2009. (Cited on pages 14 and 36).
[Hah70] F. H. Hahn. Some adjustment problems. Econometrica, 38:1–17, 1970. (Cited on page 2).
[Har02] L. Harris. Trading and Exchanges: Market Microstructure for Practitioners. Oxford University Press, 2002. (Cited on page 3).
[Has07] J. Hasbrouck. Empirical Market Microstructure. Oxford Press, Oxford, 2007. (Cited on page 3).
[Hay45] F. A. Hayek. The use of knowledge in society. The American Economic Review, 35:519–530, 1945. (Cited on page 3).
[HS04] G. Huberman and W. Stanzl. Price manipulation and quasi-arbitrage. Econometrica, 72:1247–1275, 2004. (Cited on pages 14 and 17).
[HTF01] T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2001. (Cited on page 39).
[Joh10] B. Johnson. Algorithmic Trading & DMA. 4Myeloma Press, London, 2010. (Cited on pages 3 and 11).
[Jor06] P. Jorion. Value at Risk: The New Benchmark for Managing Financial Risk. McGraw-Hill, 2006. (Cited on page 19).
[Key36] J. M. Keynes. The General Theory of Employment, Interest and Money. Palgrave Macmillan, 1936. (Cited on page 3).
[Kis14] R. Kissell. The Science of Algorithmic Trading and Portfolio Management. Academic Press - Elsevier, 2014. (Cited on page 3).
[...] Econometrica, 53:1315–1335, 1985. (Cited on page 15).
[O’H95] M. O’Hara. Market Microstructure Theory. Blackwell, Oxford, 1995. (Cited on page 3).
[Per98] A. F. Perold. The implementation shortfall. The Journal of Portfolio Management, 14:4–9, 1998. (Cited on pages 4 and 12).
[Smi76] A. Smith. An Inquiry Into the Nature and Causes of the Wealth of Nations. W. Strahan and T. Cadell, London, 1776. (Cited on page 2).
[Tal09] N. N. Taleb. Report on the risks of financial modeling, VaR and the economic breakdown. U.S. House of Representatives, 2009. (Cited on page 19).
[TLD+11] B. Tóth, Y. Lempérière, C. Deremble, J. de Lataillade, J. Kockelkoren, and J.-P. Bouchaud. Anomalous price impact and the critical nature of liquidity in financial markets. Phys. Rev. X, 1:021006, 2011. (Cited on page 15).
[Tor97] N. G. Torre. Market Impact Model Handbook. BARRA Inc., Berkeley, 1997. (Cited on page 17).
[Wal74] L. Walras. Éléments d’économie politique pure, ou théorie de la richesse sociale. L. Corbaz & Co., Lausanne, 1874. (Cited on page 2).
[ZTFL15] E. Zarinelli, M. Treccani, J. D. Farmer, and F. Lillo. Beyond the square root: Evidence for logarithmic dependence of market impact on size and participation rate.