SHIFT: A Highly Realistic Financial Market Simulation Platform

Abstract

This paper presents a new financial market simulator that may be used as a tool in both industry and academia for research in market microstructure. It allows multiple automated traders and/or researchers to simultaneously connect to an exchange-like environment, where they are able to asynchronously trade several financial assets at the same time. In its current iteration, this order-driven market implements the basic rules of U.S. equity markets, supporting both market and limit orders, and executing them in a first-in-first-out fashion. We give an overview of the system architecture and present possible use cases. We demonstrate how a set of automated agents is capable of producing a price process with statistical characteristics similar to those of real prices from financial markets. Finally, we detail a market stress scenario and draw what we believe to be interesting conclusions about crash events.


A Preprint

Thiago W. Alves ∗
School of Business, Stevens Institute of Technology, Hoboken, NJ 07310

Ionuț Florescu
School of Business, Stevens Institute of Technology, Hoboken, NJ 07310

George Calhoun
School of Business, Stevens Institute of Technology, Hoboken, NJ 07310

Dragoș Bozdog
School of Business, Stevens Institute of Technology, Hoboken, NJ 07310

August 31, 2020

Funding Information

CME Group Foundation Research Grants for the SHIFT Project (2016, 2017)
∗ CNPq Science Without Borders Grant/Award Number: 200989/2014-6

arXiv [q-fin.TR]

Keywords: financial engineering · high frequency trading · market microstructure · real time simulation · trading strategies

A recent Congressional Research Service report on high frequency trading (Miller and Shorter, 2016) estimates that it accounts for [...] of the U.S. equity market and [...] of European equity markets. Many studies have examined the advantages and disadvantages that this group of traders poses to the health of financial markets. Some are discussed in Ye and Florescu (2019). High frequency trading, however, is just a subset of the larger field of algorithmic trading.

There are many possible interpretations of what algorithmic trading actually means. In general, it refers to advanced mathematical models used in either automatic trading strategies or optimal order execution algorithms, with little to no human interaction. Kissell (2013) estimates that algorithmic trading as a whole accounted for [...] of market volume in 2012. A report from Research and Markets estimates that the global algorithmic trading market will grow from [...] billion U.S. dollars in 2019 to [...] billion U.S. dollars by 2024. In fact, according to JPMorgan analysts, only [...] of 2017’s stock market trading volume was performed by fundamental value traders. Among other possible effects, this high level of algorithm-controlled trading may cause sell-off episodes when machines act immediately after data releases, without the analysis a human would perform.

In order to mitigate the effects of ill-constructed algorithms, regulations such as Regulation Automated Trading (“Reg AT”) by the Commodity Futures Trading Commission (“CFTC”) (CFTC, 2015) of the United States (one of the main regulators for derivatives markets in the U.S.) have been proposed. In general terms, Reg AT recognizes the urgent need to adapt financial market regulations to the current business models under which exchanges and most traders are operating today. Specifically, today’s trades are based on high-speed automated market processes for all segments of typical financial transactions, from order placement and cancellation, to the operation of matching engines for connecting and clearing bids and offers, to post-trade processing and data reporting. The CFTC points out in its extensive and ambitious rule-making effort (CFTC, 2015) that most of its supervisory policies assume a world in which trades are executed “by hand” (with extensive human intermediation, and at “human speed”), whereas the technology in the market today operates at “machine speed”, with latency as low as a few hundred microseconds.
The broad charter of Reg AT calls for a rethinking and revamping of market regulation to bring the framework up to date with modern technology.

Reg AT specifically calls for the development of a capability to test all forms of automated trading or trade-support systems that interface directly with financial markets before they are introduced into a real exchange environment. This capability should operate in a controlled, off-line test-bed environment, but one that is realistic enough to allow reasonable assessment of the likely impact and risk of operating those systems in a “live” market.

We believe such a test-bed would allow exchange operators to explore the consequences of possible rule changes, new order type offerings, or anti-spoofing measures. The system would potentially have value for private sector participants who wish to test the effectiveness of algorithmic trading systems. Moreover, researchers proposing specific changes to the way markets operate, e.g. Budish et al. (2015), could benefit from such a platform. Most of the current research involving policy changes is either based on a theoretical framework, with no empirical evidence that the rules would properly work in practice, or studies the consequences of a rule change implemented on a particular exchange, months after the fact (Jørgensen et al., 2018).

This technical capability does not yet exist, and Reg AT is vague as to the requirements such a system would have to meet. The models developed thus far are primarily based on agent-based simulations. These existing models are generally limited: they are often based on [...] agents trading a single instrument, with simulated low-frequency data and highly artificial trading rules.

The simulator described in this paper was constructed with the goal of creating a test environment as close to reality as possible. We replicated all the basic characteristics of a financial market exchange and tried to expand on what is presented in the agent-based modeling literature.
The final result is a system that is very versatile and can be applied to different scenarios in education and research.

The Santa Fe Artificial Stock Market (Palmer et al., 1999) is one of the most cited agent-based systems applied to finance. Developed in the 1990s, the system models a market with one risk-free bond and a single stock traded by agents, which follow a set of pre-defined basic rules. The system is now viewed as groundbreaking since it is the first one that models traders and measures the result of their interaction, i.e., the equity behavior.

Another, more developed market simulator is presented in Jacobs et al. (2004). The authors point out that although asynchronous-time, discrete-event simulations are commonly used to model complex systems, they are rarely used to model financial markets. The system is a multi-asset trading environment with asynchronous events. The authors call these events asynchronous because the agent states are updated at different times (not all agents are updated at every turn). Outstanding buy and sell orders remain in a book, and simulation sessions last several (virtual) days, with trading events happening throughout the day. The agents are mean-variance portfolio holders that trade at most once a day.

The next references are important as they detail a system used to study multiple aspects of financial markets. The Genoa Artificial Stock Market (GASM) (Raberto et al., 2001) is a market simulator that has served as the basis for multiple research papers published since 2000. The authors set out to build a simple market structure that would be able to reproduce some stylized facts, such as volatility clustering and heavy tails, observable in the distribution of real data returns. There is only one risky asset in the market and agents send random limit orders at every simulation step, based on a finite amount of cash and the current realized volatility. Price is formed by the intersection of the demand and supply curves (since the system does not implement a limit order book). Raberto et al. (2003) extends the simulation with different agent strategies, and compares their performance by looking at their wealth evolution. Cincotti et al. (2003) adds multi-asset support to the simulator. Each agent now holds a portfolio, with no short positions allowed. Most of the agents act in a completely random fashion, but the paper explores the application of three different trading strategies (mean-variance, mean-reversion and relative chartist strategy) acting in the resulting market. Both Ponta et al. (2011) and Ponta and Cincotti (2018) explore information exchange networks among traders in variations of this multi-asset market. In Raberto and Cincotti (2005) and Ponta et al. (2012), the single-asset model from Raberto et al. (2001) is extended to use a limit order book as its pricing mechanism. To accommodate the simulation to this new pricing mechanism (with price-time priority), one single agent is chosen at random at every time step to perform an action.

Footnote: The Research and Markets report is titled “Algorithmic Trading Market by Trading Type (FOREX, Stock Markets, ETF, Bonds, Cryptocurrencies), Component (Solutions and Services), Deployment Mode (Cloud and On-premises), Enterprise Size, and Region - Global Forecast to 2024”.

Jacob Leal et al. (2016) proposes a model designed to study the interaction between low frequency traders and high frequency traders (HFT) on a single-asset market. Slow (low frequency) traders submit orders every $\theta$ turns, with each agent having a different $\theta$, based on either a fundamentalist or a chartist strategy. The orders sent by low frequency traders are sent ahead of other agents at every simulation step. High speed traders act every time they see a profit opportunity, employing directional strategies.
They submit their orders after the submissions from low frequency traders are completed, the idea being that they are fast enough to exploit the information generated by slow traders. The authors conclude that their approach is able to reproduce the main stylized facts of current financial markets. We review this paper as one of the few examples of agent-based models that attempt to model the low frequency/high frequency interaction. Note that the framework is a turn-based simulator, similar to the traditional ones described above.

A thorough review of such agent-based simulation studies is presented in Alsulaiman and Khashanah (2015). In general, authors set out to solve a research problem and adapt the most suitable agent-based simulator to answer it. They generally focus on a few sets of features that affect the research problem. Once the problem has been answered and a new problem appears, the old simulator likely needs to be redone. The GASM model mentioned above is symptomatic in this respect: every paper added a new layer of complexity in order to answer a new problem, in effect evolving the system toward a more realistic one.

In 2014, when we started the development of the system described in this paper, we wanted an “as close to reality as possible” replica of a real market exchange. To this end we created and replicated a real market. This task was extremely complicated, and in fact we rebuilt the system from scratch four times to arrive at the completely expandable system we have today. We believe the resulting SHIFT system described in Section 2 behaves as close to a real market as possible in a research environment. In fact, we can trade any real standardized asset in the SHIFT system. We shall discuss this in Sections 3 and 4.

When comparing SHIFT with the existing agent-based models, we found three features that are all present in our system and which we think are crucial to replicate how markets operate today.
The artificial markets in the existing literature may contain one or at most two of these features. These features, in order of importance, are: real pricing mechanism, distributed asynchronous, and multi-asset.

Real pricing mechanism.

Most exchanges today are order driven, while the rest are quote driven. Both types of exchanges, as well as exchange participants, need to keep track of supply and demand, as these are the main drivers of market microstructure. Alsulaiman and Khashanah (2015) cites only four studies that use the limit order book as the pricing mechanism, and none that model quote driven markets.

Distributed asynchronous market.

The majority of financial market simulators in the literature employ some type of “turn-based system”. Even if the agents do not “play” at every turn, e.g. they perform an action every $\Delta t$, most of the time there is a notion of action taken at step $t = 1, \ldots, T$ and a central unit dictating the order of agent turns. Despite some serious attempts to introduce a real batch auction exchange, operating in discrete time (Budish et al., 2015), all exchanges today operate in real time. Therefore, we believe that a distributed asynchronous system, in which clients may be all over the world dealing with real latency and the market exchange operator processes orders in the order that they arrive, is crucial to be able to simulate a high frequency trading environment. Jacobs et al. (2004) goes in this direction with its implementation of asynchronous events, but even though events are rendered randomly or are caused by other events, the central unit controlling the simulation knows what the next event will be. In our system, agents perform actions whenever they want to, and the central unit is constantly listening for incoming messages, with no control over when they are sent and by whom. In a turn-based system, high frequency traders are commonly simulated using a smaller $\Delta t$, and thus the orders from the low frequency traders never arrive before their orders. However, in a real system low frequency orders operating on outdated information may arrive earlier at the exchange and by chance predate the HFT orders.

Footnote: SHIFT = “Stevens High Frequency Trading Market Simulation System”.
Footnote: Some studies randomize the order of action of the agents at every turn (Fricke and Lux, 2015).

Multi-asset market.

Most of the academic literature employing agent-based simulators uses a single risky asset and a risk-free asset. This one-traded-asset model is certainly the basis of any simulation, and many interesting conclusions may be derived from it. However, allowing agents to trade multiple assets can potentially recreate the highly correlated markets we are experiencing today. We note Cincotti et al. (2003) as early work using a portfolio of traded assets. Furthermore, the ability to trade an ETF (i.e., a basket of stocks) as well as the ETF’s components allows us to study complex events such as the May 6, 2010 Flash Crash (Kirilenko et al., 2017; Paddrik et al., 2012).

We would like to make a special mention of the Penn-Lehman Automated Trading Project (Kearns and Ortiz, 2003), developed by the Computer Science Department at the University of Pennsylvania in partnership with Lehman Brothers. The concept was similar to our system, but it was limited to single-asset trading. Further, it required a constant feed of real market data (either historical or live every [...] seconds) to operate (probably to provide liquidity to its users). Although the system was used to organize algorithmic trading competitions, we were not able to find any evidence of agents trading against each other or being able to move the market through their trades.

This paper is focused on defining a realistic test bed capability, extending the agent-based approach to encompass a much higher degree of realism in a rich market environment capable of dealing with:

• Large numbers of agents trading large numbers of assets.
• Realistic and robust trading strategies.
• Real-time, high-frequency market pricing and limit order book data.
• The ability to observe interactions between multiple agents (traders), employing potentially overlapping and competing strategies, to enable the study of realistic market events such as crowded trades and liquidity crises.
• The ability to test under realistic conditions the effects of regulatory measures, either imposed by a central regulator (e.g., the CFTC or the U.S. Securities and Exchange Commission, “SEC”) or introduced by the exchange operators or researchers.
• The possible application of such a capability to perform “stress testing” of financial market systems (similar in spirit to the Comprehensive Capital Analysis and Review, “CCAR”, program introduced for major banks under the auspices of Dodd-Frank (2010)).

Our research builds upon an extensive modeling effort conducted over the past six years at the Hanlon Financial Systems Center of the Stevens Institute of Technology (Hoboken, NJ - USA), known as the Stevens High Frequency Trading Market Simulation System (SHIFT) project. We aim to demonstrate that this tool is extremely versatile and provides a financial laboratory environment akin to laboratory environments from other research areas, where experiments can be run in isolation, but in realistic conditions. To accomplish this, the rest of this paper is organized as follows. Section 2 provides a description of the system, presenting its modules and some of the design decisions behind them. Section 3 discusses market event replay capabilities, along with their applications in research and teaching. Section 4 presents the use of the platform when creating a completely artificial market through the use of autonomous agents. The agents can reproduce actual market stylized facts, and we study the effects caused by changing their parameters. Section 5 concludes our paper and presents possible future directions for our work.

SHIFT is a complete and standalone system designed to emulate the essential parts of an exchange: a distributed, real-time, and order-driven market. Its initial development focus has been on equity markets; however, the platform can be extended to commodity, future, and option markets, and potentially to any other asset class. The system may be thought of as a replica of a real-time market exchange rather than a simulation environment.

The platform operates in two different ways. In one mode, SHIFT works with live, real-time, order-level market data sent by market participants, which influence everything in the market. This is typically the format used in research studies. This implementation provides researchers the ability to assess unexpected interactions between different strategies. In a second mode, the system replays recorded datasets of quote data. This implementation is typical for commercial market simulators (e.g., paper trading accounts from Interactive Brokers, Quantopian, etc.) and we normally use this type of implementation for trading competitions and teaching. In either mode, SHIFT is capable of generating trade and quote records that may be used to evaluate the effectiveness of complex trading strategies under conditions similar to a real high frequency market.

Footnote (Penn-Lehman Automated Trading Project): According to [...], the project is not active anymore.

A realistic platform such as SHIFT needs to process a massive amount of real-time data, while interacting with an undefined number of clients. Thus one of its major challenges is performance. In a high frequency market in particular, speed is a critical factor. To address this, apart from developing in a high performance programming language (C++), we separate the server side of the system into different modules, each with a specialized task. This allows us to avoid overloading any of the modules, as well as to divide the work of each layer of the system among multiple copies of the same module, if necessary.

A simplified schematic of the primary modules in our system is shown in Figure 1. The arrows in Figure 1 point from the server to the client; however, information flows both ways in all connected levels of the platform. All communication is done using the Financial Information eXchange (“FIX”) protocol, the industry standard.

Figure 1: Modules of the SHIFT platform.

In the following subsections, we offer a brief description of the system’s architecture, with details on both the server side, called “Exchange”, and the client side. A note on the scalability of the platform is also given.

Our financial market exchange simulator contains three distinct modules:

Datafeed Engine; Matching Engine; and Brokerage Center.

Datafeed Engine.

This module works as a streamer of data to the Matching Engine when the system is running in replay mode. It requests and stores historical quoting and trading data from a market data provider, by implementing the necessary API (application programming interface). Replay mode may be used to test single-user trading strategies with historical data, or to provide liquidity in a multi-user environment (e.g., artificial agents or students in a classroom).
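As a rough illustration of what streaming recorded data in replay mode involves, the sketch below emits timestamped records in order while preserving their original spacing. It is not the SHIFT Datafeed Engine (which is written in C++ and talks to a market data provider); the function name `replay`, the `speed` factor, and the record layout are hypothetical.

```python
import time


def replay(records, emit, speed=1.0, sleep=time.sleep):
    """Emit (timestamp, payload) records in order, pausing to keep their original spacing.

    records: iterable of (timestamp_in_seconds, payload), sorted by timestamp.
    emit:    callback receiving (timestamp, payload), e.g. a feed into a matching engine.
    speed:   values above 1.0 replay faster than real time (hypothetical knob).
    sleep:   injectable so the pacing can be inspected without actually waiting.
    """
    previous = None
    for timestamp, payload in records:
        if previous is not None:
            sleep(max(0.0, (timestamp - previous) / speed))
        emit(timestamp, payload)
        previous = timestamp


if __name__ == "__main__":
    # Hypothetical best bid/offer quotes for a made-up ticker.
    quotes = [(0.000, ("CS1", 99.98, 100.02)), (0.004, ("CS1", 99.99, 100.02))]
    replay(quotes, emit=lambda ts, quote: print(ts, quote))
```

Injecting `sleep` is only a convenience for experimentation: passing a list's `append` method records the pauses that would have been taken.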

Matching Engine.

As is the case for all its real market counterparts, this module is the brain of our exchange. It is responsible for managing the limit order book (LOB) of all of the platform’s traded assets, implementing the dynamics of an order driven market. For each ticker, the Matching Engine manages a local LOB, containing orders from the clients that are connected to the platform. It also maintains a global LOB, which functions as the National Best Bid and Offer (NBBO) system specific to U.S. markets. The Matching Engine automatically routes orders from the local LOB to the global LOB whenever a better price may be obtained in an outside exchange.

Footnote: We use an open source implementation of FIX 5.0 SP2, QuickFIX. It is available at: [...].
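To make the order-driven dynamics concrete, the following minimal sketch shows a single-ticker limit order book with price-time (first-in-first-out) priority. It is an illustration only, not the SHIFT Matching Engine: the class name `LimitOrderBook`, its `submit` method, and the choice to trade at the resting order's price are assumptions, and routing to a global LOB is not modeled.

```python
import heapq
from itertools import count


class LimitOrderBook:
    """Toy single-ticker book with price-time (FIFO) priority."""

    def __init__(self):
        self._seq = count()  # arrival counter: the "time" in price-time priority
        self._bids = []      # heap of (-price, seq, [size]): highest bid on top
        self._asks = []      # heap of (price, seq, [size]): lowest ask on top

    def submit(self, side, price, size):
        """Match a limit order against the opposite side and rest any remainder.

        Returns the list of (trade_price, trade_size) fills, best price first and,
        within a price level, earliest resting order first.
        """
        fills = []
        if side == "buy":
            own, opposite = self._bids, self._asks
        else:
            own, opposite = self._asks, self._bids
        while size > 0 and opposite:
            key, _, resting = opposite[0]
            best = abs(key)  # undo the sign flip used for bids
            crosses = best <= price if side == "buy" else best >= price
            if not crosses:
                break
            traded = min(size, resting[0])
            fills.append((best, traded))  # assumption: trade at the resting price
            size -= traded
            resting[0] -= traded
            if resting[0] == 0:
                heapq.heappop(opposite)
        if size > 0:
            key = -price if side == "buy" else price
            heapq.heappush(own, (key, next(self._seq), [size]))
        return fills


if __name__ == "__main__":
    book = LimitOrderBook()
    book.submit("sell", 100.0, 50)  # first in at 100.0
    book.submit("sell", 100.0, 30)  # second in at 100.0
    book.submit("sell", 99.5, 10)   # better price, jumps the queue
    print(book.submit("buy", 100.0, 70))
```

In this sketch a market order can be approximated by a limit order with an extreme price (for example `float("inf")` for a buy), which is a simplification of how real exchanges treat market orders.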

Brokerage Center.

This is the hub that centralizes all communication between clients and the Matching Engine. It was initially conceived as a way to remove unnecessary load from the Matching Engine in functions such as providing all current limit order book data to newly connected clients, as well as broadcasting changes in the limit order books to all clients. However, in its current implementation, its use has been expanded. It charges transaction fees (for both long and short sales), and it keeps portfolio information for all connected clients, along with their current buying power. This information is used for account persistence and portfolio valuation, as well as for assessing trading limits and margin calls. The Brokerage Center also stores permanent records of all trading data generated by the users of the exchange.

We have developed two main ways for users to access our platform: a web interface and APIs in C++ and Python. The web interface was developed with students in mind, so that they could use it in market microstructure classes to learn the rules of operating a trading account in a real market. A sample of the interface is presented in Figure 2. In addition to the overview page (Figure 2a) and the limit order book page (Figure 2b), users can also see their portfolio information.

(a) Overview page, with last and best prices data for each of the trading symbols. The green and red coloring indicate up and down movements, respectively, since the last update.
(b) Each trading symbol has its own LOB page, containing a candlestick data plot of the simulated price, as well as the global and local LOBs, as explained in Section 2.1.

Figure 2: SHIFT web interface.

Footnote: Buying power is the amount of money a user of the system has available to spend.

For more advanced uses, we have created APIs in both C++ and Python. These can be used to create complete algorithmic trading strategies, and we use them in teaching and in research. For research purposes, each client can be viewed as an agent in an agent-based simulation. Since agents are actual trading accounts operated by individual pieces of software or real people, SHIFT provides a more complex and close-to-reality simulation than the existing literature. Because of its server-client architecture, multiple simultaneous agent connections are naturally asynchronous, and even the effects of network latency can be explored.

Examples of use of the platform as an agent-based simulation tool are presented in Section 4. Basic examples of use of our Python API can be found in Appendix A.
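The asynchronous, arrival-order behavior described above can be mimicked in a few lines. The sketch below does not use the SHIFT API or the FIX protocol; it is a self-contained toy in which two hypothetical clients with different simulated latencies push messages into a queue, and a listener processes them strictly in the order they arrive. All names (`agent`, `exchange`, `run`) and all timing values are assumptions chosen for illustration.

```python
import asyncio
import random


async def agent(name, latency, n_orders, inbox, rng):
    """Hypothetical client: decides on its own when to act, then pays a network delay."""
    for k in range(n_orders):
        await asyncio.sleep(rng.expovariate(200.0))  # the agent picks its own timing
        await asyncio.sleep(latency)                 # simulated network latency
        await inbox.put((name, k))


async def exchange(inbox, expected):
    """Central unit: it only listens, and handles messages in arrival order."""
    processed = []
    while len(processed) < expected:
        processed.append(await inbox.get())
    return processed


async def run(seed=0):
    rng = random.Random(seed)
    inbox = asyncio.Queue()
    listener = asyncio.create_task(exchange(inbox, expected=10))
    await asyncio.gather(
        agent("fast", 0.001, 5, inbox, rng),
        agent("slow", 0.010, 5, inbox, rng),
    )
    return await listener


if __name__ == "__main__":
    for sender, k in asyncio.run(run()):
        print(sender, k)
```

Unlike a turn-based loop, nothing here fixes the interleaving of the two clients in advance: the listener has no control over who sends next, and a message from the slower client can still reach the queue ahead of one from the faster client.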

The platform was developed so that it may be scaled to any number and types of assets as well as any number of clients. The modular architecture allows us to add more instances of each module as needed. For example, a common issue in high frequency studies is when a large number of simultaneous client connections causes the system to slow down due to increased network traffic. A solution is presented in Figure 3a, where we add more instances of the Brokerage Center. In the case when the Matching Engine starts receiving more orders than it can process in real time, or if we simply want to add different financial assets, we may add more Matching Engine modules (Figure 3b).

(a) Schematics of the platform with more instances of the Brokerage Center, as a measure to avoid hitting network performance bottlenecks.
(b) Schematics of the platform with more instances of both the Brokerage Center and the Matching Engine.

Figure 3: Scalability of the SHIFT platform.

Outstanding orders of users connected to the platform are placed in what we call the local limit order book. These orders follow the usual rules of order-driven markets, with price-time priority of orders. When in replay mode, the system also makes use of market data obtained from a particular provider. We currently collect microsecond last and best prices data from different exchanges, along with their volume, and we use this information to create what we call the global limit order book of each asset, representing the National Best Bid and Offer (NBBO) system.

The Datafeed Engine streams data to the Matching Engine, which keeps track of the best prices as they were at a given moment in time. These global quotes, together with the orders coming from the users in the system, create market liquidity. Liquidity therefore is not infinite in the system. There are two major consequences of this design. First, users are in fact competing for liquidity, so two equal orders submitted at exactly the same time may have completely different outcomes, depending on which order arrives first at the Matching Engine. Second, even though users cannot cause long-term impact on market prices in replay mode, traded prices may deviate from the real prices for a little while.

In general, researchers and students who backtest trading strategies use downloaded historical price and quotes data. They therefore rely on unrealistic assumptions such as infinite liquidity and minimal reaction time. In our system we can account for order timing, bid-ask spread, and available volume, thus creating much more realistic results. Moreover, the capability of replaying any given day, or of creating completely artificial market scenarios (see Section 4), allows researchers to better design take profit and stop loss rules, as well as stress test their trading strategies.
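The difference between an infinite-liquidity backtest and a liquidity-aware fill can be seen in a small worked example. The sketch below is illustrative only: the function `fill_market_buy` and the quote snapshot are hypothetical, and the fill logic simply walks the displayed offers rather than reproducing SHIFT's matching rules.

```python
def fill_market_buy(asks, quantity):
    """Walk the displayed asks (best first) and fill as much as liquidity allows.

    asks: list of (price, available_size), sorted by ascending price.
    Returns (filled_quantity, average_price), with average_price None if nothing filled.
    """
    filled = 0
    cost = 0.0
    for price, available in asks:
        if filled == quantity:
            break
        take = min(available, quantity - filled)
        filled += take
        cost += take * price
    average = cost / filled if filled else None
    return filled, average


if __name__ == "__main__":
    # Hypothetical snapshot of displayed offers: (price, size).
    asks = [(100.00, 200), (100.05, 300), (100.20, 500)]

    naive_price = asks[0][0]  # infinite-liquidity assumption: all 800 shares at the best offer
    filled, average = fill_market_buy(asks, 800)
    print(f"naive: {naive_price:.4f}  liquidity-aware: {average:.4f} for {filled} shares")
```

With this snapshot, an order for 800 shares exhausts the first two price levels and part of the third, so its average price is above the best offer that a naive backtest would assume. An order larger than the displayed 1,000 shares would only be partially filled.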

In an effort to engage students with hands-on experience with modeling and algorithmic trading, we introduced SHIFT into lectures at Stevens Institute of Technology. From computing basic statistics from a live stream of limit order book and last price data to implementing and verifying their own trading strategies in market microstructure and algorithmic trading classes, the feedback from students so far has been very positive. We should mention that SHIFT is invaluable in demonstrating one specific point to the students: every strategy we implemented that is profitable when using daily data ends up losing money in a realistic system using intraday, high frequency data.

A pilot algorithmic trading competition ran during the Spring semester of [...], and others are planned for the future. In this first edition, there were [...] participating students, divided in teams of [...] to [...] students each, trading any of the Dow Jones Industrial Average stocks. Each week, teams were given access to their own instance of the simulator for [...] days of training, which culminated in all algorithmic trading strategies running against each other and competing for the best opportunities in day [...]. Every competition day had a different theme, from low volatility days to flash crash days; no outside (human) intervention was allowed; and portfolios would reset, giving a fair chance for teams to recover from a bad week. In the end, the team with the highest total profit after weeks of competition won.

The competition was beneficial for us since it allowed us to discover and fix many issues as well as improve the system usability. It was also beneficial for the students, who learned about trading and the difficulties of applying class concepts to the real world. Figure 4 shows the daily profit of the top teams along with their average (red dotted line) during the competition’s weeks.
The lines generally display a positive trend, showing that students were learning from their mistakes and enhancing their algorithms.

Figure 4: Algorithmic trading competition teams’ daily profit evolution.

When not replaying market events, the global limit order book functionality is turned off, and all market formation happens in the local limit order book, with orders coming from the users of the system. These can be researchers, students, market practitioners, or completely artificial agents.

As an initial proof of concept, we set out to create the simplest possible market, where zero intelligence agents with no notion of profit or loss trade a single asset. We describe these agents in the next subsection, followed by the results of experiments we did with such agents in Sections 4.2 and 4.3.

The trading strategy we chose for our zero intelligence agents is inspired by previous work done in the Genoa Artificial Stock Market, described in Raberto et al. (2001) and Ponta et al. (2012). Modifications were necessary due to the real-time nature of our simulation.

Footnote (Dow Jones Industrial Average): https://us.spindices.com/indices/equity/dow-jones-industrial-average

During the trading session (i.e. a simulation execution), each agent trades according to a Poisson process with fixed rate $\lambda$ ($\lambda$ is the same for every trader). One can find details of generating a Poisson process in Florescu (2014, Chapter 10). Each of the $N$ traders trades $\Phi_i$ times at times $\tau_{i,j}$, $j = 1, \ldots, \Phi_i$.

At time $\tau_{i,j}$, the $i$-th trader will execute two simple actions:

1. If the trader has an outstanding order, i.e. if their last limit order (or a portion of it) is still in the limit order book, send a corresponding cancel for the remaining (buy/sell) order.
2. Decide whether their next limit order is going to be a buy or a sell (probability [...]):
   • If the order is a limit buy, the limit price of the order will be $P^b_{\tau_{i,j}} \sim N(\mu^b_{\tau_{i,j}}, \sigma)$, where $\mu^b_{\tau_{i,j}}$ is the smaller value between the current best bid and the last available price. This simulates the fact that buyers want to pay the lowest possible value to acquire assets.
   • If the order is a limit sell, the limit price of the order will be $P^a_{\tau_{i,j}} \sim N(\mu^a_{\tau_{i,j}}, \sigma)$, where $\mu^a_{\tau_{i,j}}$ is the larger value between the current best offer and the last available price. Sellers want to receive the highest possible value for their assets.

An initial price value $P$ is given as a parameter to our autonomous agents, representing the close price of the previous day. This value is used as the initial value of $\mu^b$ and $\mu^a$ if no other information is available at the moment, i.e. if no other agent has submitted limit orders yet. Furthermore, the volume of each submitted limit order is determined as a proportion $r_{\tau_{i,j}}$, which we will call the current confidence level, of the buying power (for limit bids) or number of shares (for limit offers) the $i$-th trader has available at the moment of order submission.

The GASM papers that inspired our agents’ implementation use an equal distribution of buying power and amount of shares among their autonomous agents.
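The agent routine described above (Poisson-timed actions, then a buy or sell limit order with a normally distributed price) can be sketched as follows. This is a hedged illustration, not the authors' agent code: the function names are hypothetical, the buy probability `p_buy=0.5` is an assumption, `sigma` is treated as a standard deviation, and converting a proportion of buying power into a share count by dividing by the limit price is also an assumption. Step 1 (cancelling an outstanding order) needs a live order book and is only marked by a comment.

```python
import random


def poisson_times(rate, horizon, rng):
    """Event times of a Poisson process on (0, horizon], via exponential inter-arrival gaps."""
    times = []
    t = rng.expovariate(rate)
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def next_limit_order(best_bid, best_ask, last_price, initial_price,
                     buying_power, shares, sigma, confidence, rng, p_buy=0.5):
    """One buy/sell decision of a zero intelligence agent (step 2 of the routine).

    Missing market data (None) falls back to initial_price.
    Returns (side, limit_price, size_in_shares).
    """
    last = initial_price if last_price is None else last_price
    if rng.random() < p_buy:
        bid = last if best_bid is None else best_bid
        mu = min(bid, last)  # buyers aim for the lowest reference price
        price = round(rng.gauss(mu, sigma), 2)
        size = int(confidence * buying_power / price) if price > 0 else 0
        return "buy", price, size
    ask = last if best_ask is None else best_ask
    mu = max(ask, last)      # sellers aim for the highest reference price
    price = round(rng.gauss(mu, sigma), 2)
    return "sell", price, int(confidence * shares)


if __name__ == "__main__":
    rng = random.Random(42)
    for t in poisson_times(rate=0.5, horizon=10.0, rng=rng):
        # Step 1 would go here: cancel whatever remains of the previous order.
        side, price, size = next_limit_order(None, None, None, 100.0,
                                             10_000.0, 100, 0.5, 0.1, rng)
        print(f"t={t:6.2f}s  {side:4s} {size:4d} @ {price:.2f}")
```

In a full simulation each agent would run this loop independently against a shared order book, which is what makes the resulting order flow asynchronous.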
We discovered that such a homogeneous distribution contributes significantly to the resulting price formation process, as will be shown in Section 4.2.3. Therefore, we have opted for a randomized wealth distribution in our experiments.

The initial division of shares $S = (S_1, \ldots, S_N)$, with $N$ the total number of traders in the simulation, follows a Dirichlet distribution. The probability density function of a Dirichlet distribution has the following form:

$$f(x_1, \ldots, x_N; \alpha_1, \ldots, \alpha_N) = \frac{\Gamma\left(\sum_{i=1}^{N} \alpha_i\right)}{\prod_{i=1}^{N} \Gamma(\alpha_i)} \prod_{i=1}^{N} x_i^{\alpha_i - 1},$$

where $x_i \in (0, 1)$, $\sum_{i=1}^{N} x_i = 1$, and $\alpha_1, \ldots, \alpha_N$ are the concentration parameters.

A symmetric Dirichlet distribution is a particular case of a Dirichlet distribution with $\alpha_1 = \ldots = \alpha_N = \alpha$. The probability density function is then simplified to:

$$f(x_1, \ldots, x_N; \alpha) = \frac{\Gamma(\alpha N)}{\Gamma(\alpha)^N} \prod_{i=1}^{N} x_i^{\alpha - 1},$$

with α the concentration parameter. The higher the value of α, the more homogeneous the distribution of wealth is among traders.

When a trader $i$ is assigned $S_i$ initial shares, we also give them an initial buying power (cash) equal to their selling power (shares). That is, $BP_i = P S_i$, where $P$ is the initial share value.

In this section we will introduce parameters and agents that create a well-functioning exchange. Our goal is to demonstrate that the resulting price process has characteristics similar to those of a real price process during a normal trading day devoid of any financial events. We also aim to study how the agent parameters affect the behavior of the formed price process.

There are , , available shares of CS1, a fake stock ticker, with an initial price P = $100 . . The initial market capitalization of CS1 is thus $200 million. Since traders receive a sum of cash equal to the value of their endowed shares, the total initial value of assets in the market (sum of all buying and selling powers) is $400 , , .
The rest of the parameters in this experiment are as follows:

• N = 200 traders.
• Simulation length M = 23400 seconds (6.5 hours).
• Agents attempt to trade an average of λ = 390 times during the trading session, i.e., on average, they submit one limit order every minute.
• Standard deviation from the best prices σ = $0 . .
• Confidence level $r_{\tau_{i,j}}$ ∼ U(0. , . ), i.e., limit order size is uniformly distributed between and of the total buying or selling power of the agents, depending on whether it is a limit buy or a limit sell order, respectively.
• Wealth concentration parameter α = N = 200. With this α value, traders on average have , initial shares each, with a standard deviation of about initial shares among all traders.

Example simulated price paths are shown in Figure 5, with the respective return plots in Figure 6. Visually, these plots resemble real stock price/return behavior during a given day. Even though there is no flux of outside information into the system, prices display characteristics of real price series, which we discuss in the next sections.

Figure 5: -minute price paths for two baseline experiments. (a) Baseline Experiment 1. (b) Baseline Experiment 2.

Figure 6: -minute returns for two baseline experiments. (a) Baseline Experiment 1. (b) Baseline Experiment 2.

Figures 7 and 8 present statistics for the returns displayed in Figure 6. Although here we only discuss the results of two simulated series, the statistics of all simulated experiments are very much in line with the known stylized facts of return time series (Cont, 2001). We present two results to display the consistency of the resulting statistics.
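The agent rules and the endowment scheme described above can be illustrated with a minimal Python sketch. It is not the SHIFT implementation: the function names are hypothetical, and the defaults for `sigma`, `p_buy`, `r_low`, and `r_high` are placeholders chosen only for illustration, not the values used in the experiments. Converting a buy order's share of buying power into a number of shares by dividing by the limit price is also an assumption.

```python
import random


def poisson_arrival_times(expected_trades, session_seconds, rng):
    """Event times of a homogeneous Poisson process on [0, session_seconds)."""
    intensity = expected_trades / session_seconds  # events per second
    times = []
    t = rng.expovariate(intensity)
    while t < session_seconds:
        times.append(t)
        t += rng.expovariate(intensity)  # exponential inter-arrival gaps
    return times


def symmetric_dirichlet(n_traders, alpha, rng):
    """One draw of wealth fractions from a symmetric Dirichlet distribution,
    obtained by normalizing independent Gamma(alpha, 1) variates."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(n_traders)]
    total = sum(draws)
    return [d / total for d in draws]


def next_limit_order(best_bid, best_ask, last_price, cash, shares, rng,
                     sigma=0.05, p_buy=0.5, r_low=0.25, r_high=0.50):
    """Return (side, limit_price, size) for one trading decision.

    sigma, p_buy, r_low and r_high are illustrative placeholders.
    """
    r = rng.uniform(r_low, r_high)  # confidence level for this order
    if rng.random() < p_buy:
        mu = min(best_bid, last_price)  # buyers aim low
        price = round(rng.gauss(mu, sigma), 2)
        size = int(r * cash / price) if price > 0 else 0  # assumed conversion
        return "buy", price, size
    mu = max(best_ask, last_price)  # sellers aim high
    price = round(rng.gauss(mu, sigma), 2)
    return "sell", price, int(r * shares)
```

In a full simulation each agent would first cancel any outstanding order, then call something like `next_limit_order` at each of its arrival times, with `cash` and `shares` initialized from the Dirichlet draw.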

Figure 7: -second returns statistics for a baseline experiment. (a) Returns Autocorrelation. (b) Normal Q-Q Plot. (c) Squared Returns Autocorrelation.

Figure 8: -second returns statistics for a second baseline experiment. (a) Returns Autocorrelation. (b) Normal Q-Q Plot. (c) Squared Returns Autocorrelation.

Negative autocorrelation. Because of the “bounce effect” caused by the bid-ask spread, where market orders may match against either side of the book, returns are expected to exhibit a negative autocorrelation when sampled at small time scales, as shown in Figures 7a and 8a.

Leptokurtic behavior. The distribution of returns has “heavy tails” (Bouchaud and Potters, 2003; Voit, 2005). That is, return values far from the average occur more frequently than they would if they followed a Gaussian distribution. This is evidenced in the Q-Q plots (Figures 7b and 8b), where the excess kurtosis is also reported. Here we use data sampled every second, and the average excess kurtosis of all experiments we did was around .

Volatility clustering. When looking at the realized volatility, more precisely at the autocorrelation function of squared returns, it is possible to see in Figures 7c and 8c that periods of high volatility will lead to other periods of high volatility. This phenomenon, known as volatility clustering, is another known feature of financial market data (Cont, 2007). In fact, the slow decay found in these autocorrelation plots, showing signs of long memory in the volatility, is also documented in the literature (Lobato and Velasco, 2000).

Furthermore, the resulting time series data exhibits heteroskedastic effects. When performing the autoregressive conditional heteroscedasticity (ARCH) test (Engle, 1982), we obtain extremely low p-values and we reject the null hypothesis of no ARCH effects for all experiments. If we try to fit an actual ARCH model, we need around lags to best fit returns sampled every second.

We note that, unlike the GASM model, our zero intelligence agents do not look at the realized volatility and adapt their strategy depending on its current value. Indeed, the volatility parameter they use when choosing the price of submitted limit orders remains constant during the whole simulation. Our system does not need the agents to adapt in order to create a price process with all the characteristics mentioned above. We make this observation since it is argued in the literature (see, e.g., Lux and Marchesi (1998)) that the arrival of news and the reaction of agents to the news and the market play a big role in creating such characteristics. In our system we see that, even though there is no external news and the agents are very basic, we still observe these market characteristics. We thus argue that implementing and respecting the actual trading rules of current financial markets is instrumental in creating a proper market simulation.
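The three diagnostics discussed above (return autocorrelation, excess kurtosis, and autocorrelation of squared returns) can be computed with a few lines of standard-library Python. The sketch below is only illustrative and is not taken from the SHIFT code base; all function names are hypothetical. The toy `bounce_prices` generator is an assumption added here to show the bid-ask bounce in isolation: with a constant mid price and trades hitting either side at random, the theoretical lag-1 autocorrelation of returns is -0.5.

```python
import math
import random


def log_returns(prices):
    """Log returns between consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def autocorrelation(x, lag):
    """Sample autocorrelation of the series x at the given positive lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var


def excess_kurtosis(x):
    """Sample excess kurtosis (zero for a Gaussian distribution)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 ** 2) - 3.0


def bounce_prices(n, mid=100.0, half_spread=0.01, seed=0):
    """Toy trade prices: constant mid, each trade at the bid or ask at random."""
    rng = random.Random(seed)
    return [mid + rng.choice((-half_spread, half_spread)) for _ in range(n)]
```

Volatility clustering can be inspected with the same helper by passing squared returns, for example `autocorrelation([r * r for r in returns], lag)` over a range of lags.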


The limit order book data gathered from our simulations displays characteristics found in real market data. Figure 9a shows the average shape of the limit order book, i.e., the average volume at each tick (in our case, $0 . ) distance from the mid price. Here, we present both bids and offers together, since their average volume behavior is the same. In real data this shape is characterized (Bouchaud et al., 2002) by a peak a few ticks away from the mid price, since volumes closer to the mid price tend to be executed more frequently, followed by a power law decay of the average volume of more “patient” traders. We observe similar characteristics in Figure 9a.

Figure 9: Limit order book average shape and volume imbalance for a baseline experiment. (a) Average Shape of the Limit Order Book. (b) Bid-Ask Volume Imbalance.

Figure 10: Spread characteristics for a baseline experiment. (a) Spread Time Series. (b) Spread Autocorrelation. (c) Spread Sample Distribution.

In Figure 9b, we plot the dynamic volume imbalance in the limit order book. Specifically, we plot the difference between the volume on each side of the limit order book during the trading day. There are imbalance peaks on both sides of the spectrum throughout the trading day, when there is more pressure from one of the market sides. However, as expected in near-equilibrium, the general trend is mean reverting.

We then turn to spread characteristics in Figure 10. Spread is the difference between the current best bid and offer prices in the limit order book, and it represents the cost someone incurs when executing a market order. Spread is one of the best proxies for liquidity in high frequency trading (Salighehdar et al., 2017; Mago et al., 2017). As previously described in the literature (Plerou et al., 2005), the time series of spread values should be characterized by persistence (Figure 10b). Furthermore, the asymptotic shape of the spread distribution should be described best by a power law (Figure 10c).
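As a small illustration of the two order book quantities used above, the following Python sketch computes the spread and the bid-ask volume imbalance from a single book snapshot. The data layout is an assumption made for this example (each side is a dictionary mapping an integer price in ticks to the resting volume), and the function names are hypothetical rather than part of any SHIFT API.

```python
def best_bid_ask(bids, asks):
    """Best (highest) bid and best (lowest) ask, in integer ticks."""
    return max(bids), min(asks)


def spread(bids, asks):
    """Difference between the best offer and the best bid, in ticks."""
    best_bid, best_ask = best_bid_ask(bids, asks)
    return best_ask - best_bid


def volume_imbalance(bids, asks):
    """Total resting bid volume minus total resting ask volume."""
    return sum(bids.values()) - sum(asks.values())


# Hypothetical snapshot: prices in ticks, volumes in shares.
example_bids = {9998: 300, 9999: 200}
example_asks = {10001: 100, 10003: 500}
```

Evaluating these functions on every snapshot of a simulated day yields the spread and imbalance time series, whose autocorrelation and distribution can then be examined.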

Generally, agent-based model papers provide a set of parameters that is tested to create market behavior similar to real markets. Here, since the system is so close to reality, we can study the impact of the parameters on the price that is formed. Intuitively, in a homogeneous environment we have an entropy/central limit principle that tells us the resulting quantity (temperature, pressure, price) has a Gaussian behavior. The more non-homogeneous the environment, the greater the departure from Gaussianity. Thus, we wanted to create parameters characterizing the agents that allow us to go from homogeneous agents to non-homogeneous ones.


Impact of order size. In our base experiment scenario, traders submit buy or sell orders with sizes ranging from to of their current cash or shares value, respectively. This proportion $r_{\tau_{i,j}}$ is random for every trade and represents the trader confidence level at the moment when the order is sent. This confidence level turns out to be very important for the distribution of the resulting returns.

Figure 11: Normal Q-Q plots for simulated returns with different trader confidence levels. (a) $r_{\tau_{i,j}}$ ∼ U(0. , . ). (b) $r_{\tau_{i,j}}$ ∼ U(0. , . ). (c) $r_{\tau_{i,j}}$ ∼ U(0. , . ).

We ran three different experiments. We vary the agent confidence levels from conservative ( to ), through baseline ( to ), to risky ( to ), and we show the Q-Q plots of the returns in Figure 11. We can see from the plots that the larger the orders the traders can execute, the more leptokurtic the resulting return distribution will be. When running several repetitions of the same experiment, the average excess kurtosis increases from . for the conservative case, to . for the baseline case, to . for the risky case, all of these values being statistically different.

Impact of wealth distribution. Recall that in typical agent-based simulations all agents have the same initial wealth. This is a typical homogeneous environment. In SHIFT we wanted to have a random initial endowment, which is why we use the Dirichlet distribution. We experimented with modifying the wealth concentration parameter α from N to 1. When α = N, traders have on average , initial shares each, with a standard deviation of about shares among all traders. When α = 1, the agents are much more heterogeneous from the perspective of initial wealth. Their expected value is still , , but the standard deviation is now about , shares. In our experiments, the trader with the largest amount had , shares at the beginning of the trading day, while the poorest trader had only shares.

Figure 12: Normal Q-Q plots for simulated returns with different trader confidence levels, when the wealth distribution is non-homogeneous. (a) $r_{\tau_{i,j}}$ ∼ U(0. , . ). (b) $r_{\tau_{i,j}}$ ∼ U(0. , . ). (c) $r_{\tau_{i,j}}$ ∼ U(0. , . ).

The homogeneous wealth results (α = N) have been presented in Figure 11. We contrast those results with the results in Figure 12. This non-homogeneous distribution of wealth is likely to be closer to reality. Exchanges today have a small number of large institutional traders that dominate through their volume of trades.

The resulting excess kurtosis is always larger than in the previous (more homogeneous) experiments. We note that, although for this particular run Figure 12b shows an excess kurtosis below the value in Figure 12a, on average the relationship between different confidence level ranges stays the same. The average values are . , . , and . , from conservative to risky cases.

The relationship between “heavy tails” and the impact of orders coming from large market participants has been previously studied (Gabaix et al., 2003). It is nonetheless interesting that we can easily reproduce it with simple parameter value changes in our simulation.
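The effect of α on the spread of initial endowments follows directly from the moments of the symmetric Dirichlet distribution: each trader's fraction has mean $1/N$ and variance $(N-1)/(N^2(\alpha N + 1))$. The Python sketch below evaluates these formulas. The function name is hypothetical, and the total of 2,000,000 shares is an assumption inferred from the stated $200 million capitalization at an initial price of $100, not a figure quoted from the text.

```python
import math


def endowment_mean_sd(total_shares, n_traders, alpha):
    """Mean and standard deviation of one trader's initial shares when the
    fractions follow a symmetric Dirichlet(alpha) distribution."""
    mean_fraction = 1.0 / n_traders
    var_fraction = (n_traders - 1) / (n_traders ** 2 * (alpha * n_traders + 1))
    return total_shares * mean_fraction, total_shares * math.sqrt(var_fraction)


TOTAL_SHARES = 2_000_000  # assumption: $200 million / $100 per share
N_TRADERS = 200

# Homogeneous case (alpha = N) versus heterogeneous case (alpha = 1).
mean_h, sd_h = endowment_mean_sd(TOTAL_SHARES, N_TRADERS, alpha=N_TRADERS)
mean_nh, sd_nh = endowment_mean_sd(TOTAL_SHARES, N_TRADERS, alpha=1)
```

Under that assumption the mean endowment is the same in both cases, while the standard deviation grows by more than an order of magnitude when α drops from N to 1 (roughly 705 versus 9,950 shares).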


We think this section contains one of the most interesting observations we made simply by running the system and varying parameters. It is well known that using different sampling frequencies produces different parameter values. For example, in one of the most cited papers in the mathematical finance literature (Zhang et al., 2005), the authors observe that realized variance has different values depending on the sampling frequency of the price data used. They attribute this discrepancy to noise in the market and propose a new estimator that is used extensively today (multi-grid realized volatility). However, in our simulations the exact same run produces completely different distribution shapes depending on the sampling frequency used.

Figure 13: Normal Q-Q plots for simulated returns with different sampling frequencies. (a) . -Second Returns. (b) -Second Returns. (c) -Second Returns.

Figure 13 exemplifies such behavior. The returns presented in Figures 13a, 13b, and 13c are all from the same price series (Figure 5a), but the smaller the time scale, the “heavier” the tails of the distribution of returns. This effect is actually present when sampling real financial data as well (Aldrich et al., 2014). In fact, this particular effect is called aggregational Gaussianity in Cont (2001). As the sampling time scale is increased, the returns distribution gets closer to a Normal distribution.

Figure 14: Normal Q-Q plots for simulated returns. Top row presents results from an experiment where agents submit orders on average every minute (λ = 390). Bottom row presents results from an experiment where traders act on average every half a minute (λ = 780). (a) -Second Returns when λ = 390. (b) -Second Returns when λ = 390. (c) . -Second Returns when λ = 780. (d) -Second Returns when λ = 780.

Figure 15: Normal Q-Q plots for simulated returns. Top row presents results from an experiment where agents submit orders on average every minute (λ = 390). Bottom row presents results from an experiment where traders act on average every half a minute (λ = 780). (a) -Second Returns when λ = 390. (b) -Second Returns when λ = 390. (c) -Second Returns when λ = 780. (d) -Second Returns when λ = 780.

Moreover, we found that this aggregational Gaussianity is not only related to the sampling time interval, but also to the total trading activity in the market. In the base experiment scenario, with agents submitting orders on average every minute (λ = 390), we averaged , , shares traded during a simulation day. If we increase their action frequency to once every half a minute (λ = 780), we average , , shares traded during a simulation day. This increase corresponds to a more active equity.

Figures 14 and 15 exemplify the relation between sampling frequency, trading activity, and the Gaussian distribution. First, we note the aggregational Gaussianity: specifically, we see the excess kurtosis dropping as the sampling interval increases. Second, an even more interesting phenomenon is observed by looking at the “two” equities: the baseline and the more active equity. As we double both the trading activity and the sampling frequency, we obtain kurtosis values similar to those of the baseline case. Specifically, compare Figure 14a with Figure 14c and Figure 14b with Figure 14d. We observe a similar phenomenon in Figure 15, where we decrease the sampling frequency further. In these results the excess kurtosis values are negligible, but the visual resemblance of the Q-Q plots is present. This is interesting because it suggests that time series of different financial assets should be studied and compared differently depending on their characteristics, such as trading volume.

Following our findings on replicating stylized facts in the context of SHIFT, we demonstrate the system's capability to study market stress conditions.
Specifically, we study the relationship between market factors and crash characteristics.

To set up the experiments we use N = 200 traders, and the simulation length is set to M = 3600 seconds (1 hour). Around minutes into the simulation, we create a crash by having new trader(s) forcefully place a large order on the market. We study the differences in the way we create the crash and the interaction with the market conditions.

Market factors:

• Trading frequency: Market traders attempt to trade on average every minute ( ) or every half a minute ( ).
• Homogeneity: The market can be homogeneous (H), with an even distribution of wealth (α = N) and traders' confidence level $r_{\tau_{i,j}}$ ∼ U(0. , . ), or non-homogeneous (NH), with an uneven distribution of wealth (α = 1) and traders' confidence level $r_{\tau_{i,j}}$ ∼ U(0. , . ). That is, we choose the extreme cases described in Section 4.2.3 to represent the homogeneous and the heterogeneous market conditions.


Crash condition factors:

• Stress size: Crash traders own (level one of the factor) or (level two of the factor) of the total amount of shares available in the market.
• Stress traders: This factor has three levels. The first level is a single crash trader placing a large order around min into the simulation (labeled in the output with ). For the second level we consider 20 crash traders collectively owning the same quantity as the one trader and all placing their orders at about the same time (20 S, simultaneously). For the third level, we consider 20 crash traders placing the same total quantity with a -second interval between their actions (20 NS, non-simultaneously).

We ran experiments for each possible combination of factors, for a total of experiments. We were initially planning more experiments for each factor combination, but the results were very stable. Sections 4.3.1 and 4.3.2 discuss the results obtained.

In the vast majority of our stress event experiments, the price of the CS1 stock falls after the sell-off event. In some cases, the price drop was considerably large, as exemplified in Figure 16a. In other cases, the price decrease was not large, and the market would either continue to drop slowly after the stress event (Figure 16b) or completely recover (Figure 16c).

Figure 16: -second returns during stress events. (a) Large Market Impact (≈ ). (b) Small Market Impact (≈ . ). (c) No Market Impact.

Based on the results obtained, we analyze which of the factors listed in Section 4.3 are influential for a stress event. One of the main issues is constructing a variable that measures a crash. We could use the drawdown or the time to drawdown to measure the impact of the crash. Here we chose to use the slope of the market drawdown, since we know the exact time when the stress event starts. The slope combines both the drawdown size and duration. We illustrate the drawdown slope in blue in Figure 16.

Since we use categorical variables as inputs and a quantitative variable (drawdown slope) as output, an analysis of variance (ANOVA) is the most appropriate statistical analysis. In Table 1, we display the ANOVA table of the final model, from which all non-significant interaction terms were eliminated. These results indicate a strong influence of each of the four factors individually, and that some factor interactions are significant.

Table 1: Analysis of variance on the slope of the market drawdown in our simulated stress events.

Factor  DF  Sum Sq.  Mean Sq.  F-Value  P-Value
Trading Freq. (TF) . − . − . < −
Homogeneity (H) . − 11 7 . − 11 5 . 315 0 .
Stress Size (SS) . − . − . < −
Stress Traders (ST) . − 10 8 . − 11 6 . 211 0 .
TF : SS . − 10 2 . − 10 15 . 02 0 .
H : SS . − 11 7 . − 11 5 . 357 0 .
H : ST . − 11 4 . − 11 3 . 617 0 .
Residuals 230 3 . − . −

As expected, the market conditions do not interact with each other; however, the crash agents' characteristics interact with the market conditions. Since the three-way and higher-order interactions are not significant, we next investigated how the combination of factors affects the drawdown slope. We apply a multiple pairwise comparison procedure (Tukey's honestly significant difference, HSD, test) to the resulting significant factors, and we summarize the results in Table 2.

Table 2: Tukey's HSD test applied to the resulting significant factors of the analysis of variance in Table 1.

Factor 1  Factor 2  Average 1  Factor 1  Factor 2  Average 2  Diff.  P-Value

Trading Freq. Trading Freq.
. min − . − min − . −

Homogeneity Homogeneity
NH − . − H − . − .

Stress Size Stress Size
− . − − . −

Stress Traders Stress Traders
− . − NS − . − .
− . − S − . − .
NS − . − S − . − .

Trading Freq. Stress Size Trading Freq. Stress Size
. min − . − min − . − . min − . − . min − . − . min − . − min − . − min − . − . min − . − . min − . − min − . − . min − . − min − . − .

Homogeneity Stress Size Homogeneity Stress Size
H − . − NH − . − .
H − . − NH − . −
H − . − H − . −
NH − . − NH − . −
NH − . − H − . −
NH − . − H − . − .

Homogeneity Traders Homogeneity Traders
NH − . − NH NS − . − .
NH − . − H − . − .
NH − . − H NS − . − .
NH − . − NH S − . − .
NH − . − H S − . − .
NH NS − . − H − . − .
NH NS − . − H NS − . − .
NH NS − . − NH S − . − .
NH NS − . − H S − . − .
H − . − H NS − . − .
H − . − NH S − . − .
H − . − H S − . − .
H NS − . − NH S − . − .
H NS − . − H S − . − .
NH S − . − H S − . −

Looking at individual factor effects, we see results that we more or less expected. A more active ( ) market exacerbates the drawdown. A non-homogeneous (NH) market creates steeper market drawdown movements. Similarly, when traders liquidate a larger market share ( ), this produces a larger slope.

Looking at the stress traders' characteristics produces interesting conclusions. Markets in which the stress event is caused by a single trader have a stronger tendency toward steeper market drawdown movements than markets in which the stress event is caused by 20 traders. In fact, there is no statistical difference when comparing simultaneous traders (20 S) liquidating their shares and non-simultaneous traders (20 NS) liquidating the same amount. It is easy to understand that a difference may exist when comparing a single trader with 20 traders liquidating the same order but over a longer period. It is the difference between absorbing a sudden shock all at once or in smaller doses. The reason why the behavior of the simultaneous traders is closer to that of the non-simultaneous traders than to that of the single trader is not as obvious. We believe what we are seeing is related to the price-time order priority of order-driven markets, and the fact that there are other traders in our simulation competing for this priority. That is, the market order of the single stress trader will be executed in its entirety, all at once. The orders from the simultaneous traders are programmed to be submitted all at the same time. However, random orders from the other traders may arrive between these orders, thus sometimes smoothing the stress event effect. This in turn produces statistically different results. We highlight this finding since such an impact may be difficult to observe without an order-driven and distributed asynchronous market implementation.

Studying interaction terms, trading frequency and stress size show a clear multiplicative behavior. Less active markets ( ) with stress events of smaller magnitude ( ) show smoother drawdown compared with highly active markets ( and stress size). When looking at the interaction between homogeneity and stress size, we see a different picture. Liquidating a order impacts the market much more strongly than liquidating , regardless of market conditions (H versus NH). Most interestingly, when studying the interaction between market conditions (homogeneity) and stress traders' characteristics ( versus ), it is clear that stress events caused by a single trader in non-homogeneous market conditions produce a steeper drawdown movement.

Drawdown is a classical measure which may be calculated in our experiments since we know the exact start time of the crash.
However, we also want to study the immediate effect of the orders' liquidation on the exchange price. Visually, we can see that some of our experiments show signs of an immediate impact on the stock price while others do not (Figure 17 versus Figure 16). The price drop may have different magnitudes, as presented in Figures 17a and 17b. The drop may in fact happen a few minutes after the sell-off event, as is the case in Figure 17c.

Figure 17: -second returns during stress events with immediate impact. (a) Large Immediate Impact (≈ ). (b) Small Immediate Impact (≈ . ). (c) Delayed Immediate Impact (≈ ).

In order to distinguish between the situations depicted in the figures, we have to devise a distinguishing criterion. To do this, we use the return statistics from the identical experiments in Section 4.2, which lacked the crash traders. Specifically, we use the levels of the two market factors to identify the corresponding non-stress experiment and use its statistics.

The procedure looks at the one-second returns of the asset during a window of time starting a few moments before and ending minutes after the stress event. The largest negative one-second return during this period must be more than standard deviations away from the mean return of the corresponding non-stress simulation day. If there is no such return, the market did not experience an immediate impact.

If such a large return exists, we use it as the starting point for further investigation. We denote the time of the largest return by τ. We next look for returns at least standard deviations away from the mean return of a calm day, k seconds prior to and k seconds past τ. If they exist, these returns may be positive or negative, since at this point we are interested in any market disturbance that might be part of the immediate impact. We continue to expand our immediate impact window in both directions, k seconds at a time, until no such returns are found anymore. Finally, the resulting total return (sum of all returns for the period) must be at least standard deviations away from the mean return of a non-stress simulation day. If everything passes the check, the immediate impact period is returned. In practice we use k = 15 seconds and standard deviations, as these parameter values maximized the recognition of events with immediate impact visible on the plots, while also minimizing the number of false positives.
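The detection procedure just described can be sketched in Python as follows. This is one possible reading of the steps, not the authors' code: the function name is hypothetical, each side of the window is expanded independently, and the thresholds `trigger_sds`, `expand_sds`, `total_sds`, as well as the `lead` and `horizon` window lengths, are illustrative placeholders. Only k = 15 seconds is taken from the text.

```python
def immediate_impact_window(returns, event_idx, calm_mean, calm_sd,
                            k=15, trigger_sds=4.0, expand_sds=2.0,
                            total_sds=4.0, lead=5, horizon=300):
    """Return the half-open index window (start, end) of the immediate
    impact in a series of one-second returns, or None if there is none.

    calm_mean and calm_sd are the mean and standard deviation of returns
    from the matching non-stress simulation. All thresholds except k are
    placeholder assumptions.
    """
    n = len(returns)
    lo = max(0, event_idx - lead)
    hi = min(n, event_idx + horizon)
    if lo >= hi:
        return None

    # Step 1: largest negative return around the stress event.
    tau = min(range(lo, hi), key=lambda t: returns[t])
    if returns[tau] > calm_mean - trigger_sds * calm_sd:
        return None

    def disturbed(a, b):
        """True if any return in [a, b) is far from the calm-day mean."""
        return any(abs(returns[t] - calm_mean) >= expand_sds * calm_sd
                   for t in range(max(0, a), min(n, b)))

    # Step 2: grow the window k seconds at a time while disturbances persist.
    start, end = tau, tau + 1
    grew = True
    while grew:
        grew = False
        if start > 0 and disturbed(start - k, start):
            start = max(0, start - k)
            grew = True
        if end < n and disturbed(end, end + k):
            end = min(n, end + k)
            grew = True

    # Step 3: the cumulative return over the window must also be extreme.
    total = sum(returns[start:end])
    if abs(total - calm_mean) < total_sds * calm_sd:
        return None
    return start, end
```

A returned window gives both the duration of the immediate impact (`end - start` seconds) and, by summing the returns inside it, its magnitude.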

This simple technique allows us to identify experiments which had immediate impacts. Next, we analyze which factors had the most influence on the probability of having an immediate impact. The most significant factors and their interactions are presented in Table 3.

Table 3: Factors pairwise t-tests for determining which values increase the probability of immediate impact events.

Factor 1  Factor 2  Average 1  Factor 1  Factor 2  Average 2  Diff.  P-Value

Homogeneity Homogeneity
NH . H . 8% 0 .

Stress Size Stress Size
10% 55 . 0% 5% 25 . 8% 2 . −

Stress Traders Stress Traders
. 5% 20 NS . 8% 0 .
. 5% 20 S . 0% 0 .
NS . 8% 20 S . 0% 0 .

Homogeneity Stress Size Homogeneity Stress Size
H 10% 68 . NH 10% 41 . 7% 0 .
H 10% 68 . NH 5% 28 . 3% 1 . −
H 10% 68 . H 5% 23 . 3% 1 . −
NH 10% 41 . NH 5% 28 . 3% 0 .
NH 10% 41 . H 5% 23 . 3% 0 .
NH 5% 28 . H 5% 23 . 3% 0 .

Both homogeneity and stress size seem to play a large role in the probability of immediate impacts. In fact, about of the simulations in which the market was homogeneous and the stress size was present immediate impact events. The contribution of the stress size to such probability is expected, but the homogeneity behavior complements the findings in Table 2. In the previous results the non-homogeneous markets created a larger drawdown slope. However, when coupled with the results from Table 3, we see that even though heterogeneous markets may suffer more drastically overall from a stress event, the impact is not as sudden and there is a smaller likelihood of an observed crash.

The conclusions related to the stress traders are the same, i.e., the single stress trader creates immediate impact more often than the 20 traders, and there is no significant difference between the 20-trader cases. Looking at the actual magnitude of the immediate impact, we see that larger stress events ( ) produce an average return drop of − . , as opposed to − . when the stress event size is . The immediate impact is longer if the market is less active, with an average of s against s on more active markets.

The stress traders' characteristics affect the immediate impact event duration. When a single trader causes the stress event, the average duration is s, while simultaneous traders produce an average duration of s. These two quantities are not significantly different. However, when non-simultaneous traders cause the stress event, the average duration of s is significantly different from the other two cases. This is as expected, as the sell-off in this scenario is executed in waves instead of all at once.

In 2015, market participants reviewed the Regulation AT (CFTC, 2015) proposal. The proposed regulation required that any algorithm be tested in “laboratory conditions” before being put into practice.
The tool is never explicitly mentioned, and the absence of such a tool meant that traders would test algorithms in a replica of a real exchange without any market impact, in effect backtesting paper trades. Further, implementing the algorithms in a system accessible to regulators means that proprietary algorithms would potentially be analyzed by regulators.

In fact, the CFTC Chairman J. Christopher Giancarlo made the following remarks at FIA Expo in Chicago, Illinois, on October 17, 2018:

As you know, Regulation AT was an initiative of my predecessor, Chairman Massad. My position was and continues to be that, while there were some good things in the proposal, there were other things that were unacceptable and perhaps unconstitutional, including that proprietary source code used in trading algorithms be accessible without a subpoena at any time to the CFTC and the Justice Department. At heart, Reg AT is a registration scheme that would put hundreds if not thousands of automated traders under CFTC oversight, a role for which our agency has inadequate resources and capabilities. While I share genuine concerns about the inevitability of some future market disruption exacerbated by automated trading algorithms, there is nothing in Reg AT’s proposed imposition of burdensome fees and registration requirements that will prevent such an event. The blunt act of registering automated traders does not begin to address the complex public policy considerations that arise from the digital revolution in modern markets. Worse is that it would give a false sense of security that the CFTC had regulatorily foreclosed such market disruption, which is impossible. That is why I voted against Reg AT. I do not intend to advance it in its current iteration. Giancarlo (2018)

This paper details SHIFT, a financial market replica with applications to learning and research. Our goal is to replicate real market conditions rather than create software specialized in agent-based modeling. SHIFT offers a unique environment combining a real pricing mechanism, a distributed asynchronous market, and multi-asset support. We believe that SHIFT may create an environment where algorithms can be tested and stressed in laboratory conditions. The environment may be set up so that proprietary source code may be tested adequately in the absence of, and without the participation of, other market participants.

This paper describes the system architecture and discusses several use cases. We show how a simple setup may reproduce known stylized facts of the financial markets, such as leptokurtic return distributions and volatility clustering. We investigate the resulting order book dynamics, and show that the system reproduces the known average shape of the order book and statistics of the spread.
We hope to have convinced the reader that the resulting price process has characteristics very similar to real price behavior.

We think one of the most important contributions of the paper is the study of how price behavior is affected by the characteristics of the trading agents. We also analyzed a stress experiment in a statistical manner and drew some interesting conclusions. We found that a single trader with a large order is more likely to produce a market crash than multiple traders liquidating the same order. However, the impact of the multiple traders on the market lasts longer and has a larger effect on the price in the long term. A crash event in a non-homogeneous market (a market in turmoil) has a larger long-term impact, but it is less likely to produce an immediate impact on the price than a crash in homogeneous market conditions.

Finally, SHIFT has been successfully used in market microstructure classes at Stevens Institute of Technology for over a year, and plans for future editions of the algorithmic trading competition are under way. Students have the opportunity to test what they learn in class by using either the web interface or one of our APIs.

We envision a multitude of experiments that take advantage of our financial laboratory environment. Future work includes analyzing the evolution of market participants' wealth and many potential expansions.

Acknowledgements

This project would not be possible without the contribution of several Stevens Institute of Technology students and researchers who participated in different stages of its development: Chen Liu, Gaojie Li, Han Zheng, Hanrun Li, Isaac Cohen, Jian Zhao, Jiaxu Duan, Jinyu Zeng, Lalita Gajbe, Meng Zhi, Runxi Ding, Shaoyong Tang, Shiwei Zeng, Shuoyu Mao, Waris Bantherngpaesach, Weipu Xu, Xiaojian Zhu, Xiaoshuai Luo, Xuan Luo, Xuming Bing, Yang Liu, Yongxin Feng, Yuan Tian, Yuewei Mao, Zhanyu Tan, Zhenjiu Dai, Ziwen Ye.

References

Aldrich, E. M., I. Heckenbach, and G. Laughlin (2014, August). The Random Walk of High Frequency Trading. arXiv:1408.3650.

Alsulaiman, T. and K. Khashanah (2015, July). Bounded Rational Heterogeneous Agents In Artificial Stock Markets: Literature Review And Research Direction. International Journal of Social, Behavioral, Educational, Economic and Management Engineering 9(6), 2038–2057.

Bouchaud, J.-P., M. Mézard, and M. Potters (2002, August). Statistical properties of stock order books: empirical results and models. Quantitative Finance 2(4), 251–256.

Bouchaud, J.-P. and M. Potters (2003). Theory of financial risk and derivative pricing: from statistical physics to risk management. Cambridge University Press.

Budish, E., P. Cramton, and J. Shim (2015, November). The High-Frequency Trading Arms Race: Frequent Batch Auctions as a Market Design Response. The Quarterly Journal of Economics 130(4), 1547–1621.

CFTC (2015). Regulation Automated Trading, 17 C.F.R. Parts 1, 38, 40, and 170, RIN 3038-AD52.

Cincotti, S., S. M. Focardi, M. Marchesi, and M. Raberto (2003, June). Who wins? Study of long-run trader survival in an artificial stock market. Physica A: Statistical Mechanics and its Applications 324(1-2), 227–233.

Cont, R. (2001, February). Empirical properties of asset returns: stylized facts and statistical issues. Quantitative Finance 1(2), 223–236.

Cont, R. (2007). Volatility Clustering in Financial Markets: Empirical Facts and Agent-Based Models. In G. Teyssière and A. P. Kirman (Eds.), Long Memory in Economics, pp. 289–309. Springer Berlin Heidelberg.

Dodd-Frank (2010). Dodd-Frank Wall Street Reform and Consumer Protection Act, Pub.L. 111–203, 124 Stat. 1376-2223.

Engle, R. F. (1982, July). Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation. Econometrica 50(4), 987.

Florescu, I. (2014). Probability and Stochastic Processes. John Wiley & Sons.

Fricke, D. and T. Lux (2015, April). The effects of a financial transaction tax in an artificial financial market. Journal of Economic Interaction and Coordination 10(1), 119–150.

Gabaix, X., P. Gopikrishnan, V. Plerou, and H. E. Stanley (2003, May). A theory of power-law distributions in financial market fluctuations. Nature 423(6937), 267–270.

Giancarlo, J. C. (2018, October). A Week in the Life of the CFTC.

Jacob Leal, S., M. Napoletano, A. Roventini, and G. Fagiolo (2016, March). Rock around the clock: An agent-based model of low- and high-frequency trading. Journal of Evolutionary Economics 26(1), 49–76.

Jacobs, B. I., K. N. Levy, and H. M. Markowitz (2004, January). Financial Market Simulation. The Journal of Portfolio Management 30(5), 142–152.

Jørgensen, K., J. Skjeltorp, and B. A. Ødegaard (2018, January). Throttling hyperactive robots – Order-to-trade ratios at the Oslo Stock Exchange. Journal of Financial Markets 37, 1–16.

Kearns, M. and L. Ortiz (2003, November). The Penn-Lehman automated trading project. IEEE Intelligent Systems 18(6), 22–31.

Kirilenko, A., A. S. Kyle, M. Samadi, and T. Tuzun (2017, June). The Flash Crash: High-Frequency Trading in an Electronic Market. The Journal of Finance 72(3), 967–998.

Kissell, R. L. (2013). The science of algorithmic trading and portfolio management. Academic Press.

Lobato, I. N. and C. Velasco (2000, October). Long Memory in Stock-Market Trading Volume. Journal of Business & Economic Statistics 18(4), 410–427.

Lux, T. and M. Marchesi (1998, June). Volatility Clustering in Financial Markets: A Micro-Simulation of Interacting Agents. IFAC Proceedings Volumes 31(16), 7–10.

Mago, D., A. Salighehdar, M. Parekh, D. Bozdog, and I. Florescu (2017). Liquidity risk and asset movement evidence from Brexit. In , pp. 1–8. IEEE.

Miller, R. S. and G. Shorter (2016). High frequency trading: Overview of recent developments, Volume 4. Congressional Research Service Report, Washington, DC.

Paddrik, M., R. Hayes, A. Todd, S. Yang, P. Beling, and W. Scherer (2012, March). An agent based model of the E-Mini S&P 500 applied to flash crash analysis. In , pp. 1–8. IEEE.

Palmer, R. G., W. B. Arthur, J. H. Holland, and B. LeBaron (1999, March). An artificial stock market. Artificial Life and Robotics 3(1), 27–31.

Plerou, V., P. Gopikrishnan, and H. E. Stanley (2005, April). Quantifying fluctuations in market liquidity: Analysis of the bid-ask spread. Physical Review E 71(4), 046131.

Ponta, L. and S. Cincotti (2018). Traders’ Networks of Interactions and Structural Properties of Financial Markets: An Agent-Based Approach. Complexity 2018, 1–9.

Ponta, L., S. Pastore, and S. Cincotti (2011, April). Information-based multi-assets artificial stock market with heterogeneous agents. Nonlinear Analysis: Real World Applications 12(2), 1235–1242.

Ponta, L., E. Scalas, M. Raberto, and S. Cincotti (2012, August). Statistical Analysis and Agent-Based Microstructure Modeling of High-Frequency Financial Trading. IEEE Journal of Selected Topics in Signal Processing 6(4), 381–387.

Raberto, M. and S. Cincotti (2005, September). Modeling and simulation of a double auction artificial financial market. Physica A: Statistical Mechanics and its Applications 355(1), 34–45.

Raberto, M., S. Cincotti, S. M. Focardi, and M. Marchesi (2001, October). Agent-based simulation of a financial market. Physica A: Statistical Mechanics and its Applications 299(1-2), 319–327.

Raberto, M., S. Cincotti, S. M. Focardi, and M. Marchesi (2003). Traders’ Long-Run Wealth in an Artificial Financial Market. Computational Economics 22(2/3), 255–272.

Salighehdar, A., Y. Liu, D. Bozdog, and I. Florescu (2017, June). Cluster Analysis Of Liquidity Measures In A Stock Market Using High Frequency Data. Journal of Management Science and Business Intelligence 2(2), 1–8.

Voit, J. (2005). The statistical mechanics of financial markets. Springer Science & Business Media.

Ye, Z. and I. Florescu (2019, February). Extracting information from the limit order book: New measures to evaluate equity data flow. High Frequency 2(1), 37–47.

Zhang, L., P. A. Mykland, and Y. Aït-Sahalia (2005, December). A Tale of Two Time Scales: Determining Integrated Volatility With Noisy High-Frequency Data. Journal of the American Statistical Association 100(472), 1394–1411.

A Python API Examples

Here, we present some source code listings to showcase how easy it is to create a simple trading strategy using our Python API. Listing 1 exemplifies how to import our library (API), create a trader object, and connect to and disconnect from the system using the API functions. All the user code is written between the connect() and disconnect() function calls. We show an example of a limit order creation and submission. Orders are created by giving them a type (buy/sell), indicating the ticker traded, the order size, and the price.

    import shift

    trader = shift.Trader("myusername")
    trader.connect("connection.cfg", "mypassword")
    ...
    limit_buy = shift.Order(shift.Order.LIMIT_BUY, "CS1", 1, 100.00)
    trader.submit_order(limit_buy)
    ...
    trader.disconnect()

Listing 1: Python API: Connection Example

Listing 2 contains a very simplistic trading strategy. The strategy buys when the price is under $95.00 and sells when the price is over $105.00. The get_portfolio_item() function is used to get the user’s current position in the given instrument. The get_last_price() function is used to get the last traded price of the provided symbol. Apart from the price limits, the strategy maintains only one open position at a time, and it must have a unit available before it sells. Furthermore, the code will only work for a limited number of swings between $95.00 and $105.00, since the loop in Listing 2 stops once num_trades reaches 10. Obviously, other more complex strategies may be developed, but we chose to present a simple logic to clearly illustrate the use of our API. The Trader class has a multitude of methods developed in the API to interface with the exchange.

In this particular strategy, the price is sampled only periodically, with the algorithm sleeping for a fixed number of seconds between samples. While the algorithm sleeps, the price may move beyond the trading limits, so a trading opportunity may be missed.
The API also provides the user with the ability to observe market changes by setting event handlers. An example is on_last_price_updated() in Listing 3. In this code, a user-defined function is called every time a price update is received from the exchange. The user can determine which financial instrument had the price update and act accordingly. The API provides other such event handlers. Full documentation of our API is available at https://github.com/hanlonlab/shift-python/wiki.
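To make the contrast between periodic sampling and event handlers concrete, the following is a minimal, self-contained sketch that does not use the SHIFT library. StubTrader is a hypothetical stand-in written only for this illustration: it borrows the names get_last_price(), submit_order(), and on_last_price_updated() from the text, but its signatures, the get_position() and push_price() helpers, and the assumption that market orders fill immediately at the last price are ours, not the actual API. The thresholds of 95.00 and 105.00 and the ticker "CS1" come from the listings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Fill = Tuple[str, str, int, float]  # (side, symbol, size, price)


@dataclass
class StubTrader:
    """Hypothetical stand-in for a trader connection (NOT the SHIFT API)."""

    last_prices: Dict[str, float] = field(default_factory=dict)
    positions: Dict[str, int] = field(default_factory=dict)
    fills: List[Fill] = field(default_factory=list)
    handlers: List[Callable[["StubTrader", str], None]] = field(default_factory=list)

    def get_last_price(self, symbol: str) -> float:
        return self.last_prices[symbol]

    def get_position(self, symbol: str) -> int:
        # Assumed helper; the paper reads positions via get_portfolio_item().
        return self.positions.get(symbol, 0)

    def submit_order(self, side: str, symbol: str, size: int) -> None:
        # Assumption: a market order fills at once at the last traded price.
        signed_size = size if side == "BUY" else -size
        self.positions[symbol] = self.get_position(symbol) + signed_size
        self.fills.append((side, symbol, size, self.last_prices[symbol]))

    def on_last_price_updated(self, handler: Callable[["StubTrader", str], None]) -> None:
        self.handlers.append(handler)

    def push_price(self, symbol: str, price: float) -> None:
        # Simulates the exchange publishing a new last price to all handlers.
        self.last_prices[symbol] = price
        for handler in self.handlers:
            handler(self, symbol)


def band_strategy(trader: StubTrader, symbol: str) -> None:
    """Buy one unit below 95.00 and sell it above 105.00."""
    if symbol != "CS1":
        return
    position = trader.get_position(symbol)
    last_price = trader.get_last_price(symbol)
    if position == 0 and last_price < 95.00:
        trader.submit_order("BUY", symbol, 1)
    elif position > 0 and last_price > 105.00:
        trader.submit_order("SELL", symbol, 1)


def run_event_driven(prices: List[float]) -> List[Fill]:
    """The strategy runs on every price update."""
    trader = StubTrader()
    trader.on_last_price_updated(band_strategy)
    for price in prices:
        trader.push_price("CS1", price)
    return trader.fills


def run_polling(prices: List[float], every: int) -> List[Fill]:
    """The strategy only looks at every `every`-th price, as if sleeping in between."""
    trader = StubTrader()
    for step, price in enumerate(prices):
        trader.last_prices["CS1"] = price  # the price moves while the strategy sleeps
        if step % every == 0:
            band_strategy(trader, "CS1")
    return trader.fills


if __name__ == "__main__":
    path = [100.0, 94.5, 93.0, 101.0, 106.0, 107.0, 94.0]
    print("event-driven:", run_event_driven(path))
    print("polling:     ", run_polling(path, every=3))
```

On this made-up price path, the event-driven run buys at 94.5, sells at 106.0, and buys again at 94.0, whereas the polling run that wakes up on every third price only sees the final dip and makes a single purchase. The price path and the polling interval are arbitrary choices for the illustration.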

    while num_trades < 10:
        ...
        if current_position == 0 and last_price < 95.00:
            market_buy = shift.Order(shift.Order.MARKET_BUY, "CS1", 1)
            trader.submit_order(market_buy)
        elif current_position > 0 and last_price > 105.00:
            market_sell = shift.Order(shift.Order.MARKET_SELL, "CS1", 1)
            trader.submit_order(market_sell)
            num_trades = num_trades + 1

Listing 2: Python API: Simple Strategy

    def last_price_updated_event(trader, symbol):
        if symbol == "CS1":
            ...
            if current_position == 0 and last_price < 95.00:
                market_buy = shift.Order(shift.Order.MARKET_BUY, "CS1", 1)
                trader.submit_order(market_buy)
            elif current_position > 0 and