Market Making with Stochastic Liquidity Demand: Simultaneous Order Arrival and Price Change Forecasts

Abstract

We provide an explicit characterization of the optimal market making strategy in a discrete-time Limit Order Book (LOB). In our model, the number of filled orders during each period depends linearly on the distance between the fundamental price and the market maker's limit order quotes, with random slope and intercept coefficients. The high-frequency market maker (HFM) incurs an end-of-the-day liquidation cost resulting from linear price impact. The optimal placement strategy incorporates, in a novel and parsimonious way, forecasts about future changes in the asset's fundamental price. We show that the randomness in the demand slope reduces the inventory management motive, and that a positive correlation between demand slope and investors' reservation prices leads to wider spreads. Our analysis reveals that the simultaneous arrival of buy and sell market orders (i) reduces the shadow cost of inventory, (ii) leads the HFM to reduce price pressures to execute larger flows, and (iii) introduces patterns of nonlinearity in the intraday dynamics of bid and ask spreads. Our empirical study shows that the market making strategy outperforms strategies that ignore randomness in demand, simultaneous arrival of buy and sell market orders, and local drift in the fundamental price.



Agostino Capponi ∗, José E. Figueroa-López †, and Chuyi Yu ‡

January 11, 2021

∗ Department of Industrial Engineering and Operations Research, Columbia University, NY 10027, USA.

† Department of Mathematics and Statistics, Washington University in St. Louis, St. Louis, MO 63130, USA. Research supported in part by the NSF Grants DMS-2015323 and DMS-1613016.

‡ Department of Mathematics and Statistics, Washington University in St. Louis, St. Louis, MO 63130, USA.

arXiv preprint, q-fin.TR.

Introduction

Over the last decade, most security trading activities have migrated to electronic markets and, as a result, high frequency trading has become one of the most significant market developments. Estimates of high frequency volumes in the treasury, foreign exchange, equity and index futures markets are typically several deciles of the total traded volume (Joint Staff Report (2015); Markets Committee et al. (2011); Securities and Exchange Commission (2010); Kirilenko et al. (2017)). As observed by Menkveld (2013), market making activities are being predominantly carried out by high-frequency traders. A traditional market maker provides liquidity to the exchange by continuously placing bid and ask orders, and hence earns profit from the bid-ask spread of her quotes. Like traditional market makers, high frequency market makers (HFMs) profit from roundtrip transactions but, unlike traditional market makers, they also submit numerous passive orders that are canceled shortly after submission at extraordinarily high speed (Securities and Exchange Commission (2010)). More importantly, compared with traditional market makers, HFMs typically work at privately held firms and, thus, inventory control becomes necessary for them to limit the amount of capital tied up in margin accounts (Menkveld (2016)). The practice of ending the day close to a flat position is also driven by risk management motives, as it allows the market maker to reduce uncertainty coming from fluctuations in security prices at the beginning of the next trading day.

Existing literature has analyzed market making control problems with inventory risk. The study of Ho & Stoll (1981) considers a single period mean-variance utility for a market maker wishing to optimize expected profit from bid-ask spreads, and to find offsetting transactions to minimize inventory risk. Huang et al. (2012) consider risk-averse market makers with single period mean-variance or exponential utility who use a threshold inventory control policy to reduce the risk from price uncertainty. Due to the static nature of their setup, these studies do not analyze the intraday effects of order arrivals and inventory management on prices and liquidity.

Other studies have considered dynamic models of market making. Some of them aim to penalize intraday inventory holdings (see e.g. Cartea et al. (2014), Cartea & Jaimungal (2015) for contributions in this space). Other studies impose explicit constraints on terminal inventory. Examples of those works include the early contribution of Bradfield (1979), who analyzes the increasing price variability induced by strategies that target a flat end-of-day inventory level; O'Hara & Oldfield (1986), who consider a repeated optimal market making problem, in which each day consists of several trading periods, and the market maker maximizes utility over an infinite number of trading days while facing end-of-day inventory costs; Guéant et al. (2013), who set the penalty for end-of-day inventory to be proportional to the absolute value of the terminal inventory; and the most recent study of Adrian et al. (2020), who study intraday patterns of prices and volatility induced by inventory management motives with a focus on the Treasury market.

We solve a discrete-time optimal control problem to maximize the expected cumulative profits of the market maker, while incorporating an end-of-the-day inventory liquidation cost. Such a cost is driven by the assumption of a linear instantaneous price impact: the average price per share in liquidating an inventory of size $I_T$ at time $T$ is $S_T - \lambda I_T$, where $S_T$ is the fundamental price at time $T$ (typically, the asset's midprice) and $\lambda$ is a constant penalty.
In our framework, the HFM places limit orders (LOs) on the ask and bid sides simultaneously, and cancels the remaining unexecuted quotes shortly before submitting new quotes in the book. Her wealth and inventory trajectories are hence determined by the prices of her quotes and the number of shares that are filled or lifted from her orders at these prices.

Modeling the number of lifted shares between consecutive actions is a key component of our framework. In continuous-time control problems, a common approach is to model the probability with which an incoming market order (MO) can lift one share of the HFM's LO in the book (known as the 'lifting probability'). For instance, Cartea et al. (2014) and Cartea & Jaimungal (2015) assume that MOs arrive according to a Poisson process and model the lifting probability as the exponential of the negative distance of the HFM's quote from the midprice times a constant; Cvitanic & Kirilenko (2010) instead model the lifting probability with a linear function, assuming [...] et al. (2010). Unlike ours, their work does not consider inventory costs.

Footnote: Amihud & Mendelson (1980) assume a dynamic model in which dealers are risk neutral and buy and sell orders arrive according to a Poisson process with price-dependent rates. They consider an infinite time horizon and restrict inventory levels to be inside a prespecified interval throughout the entire trading day.

Footnote: Few empirical studies have analyzed the relationship between trades, prices and bid-ask spreads using transaction data. Glosten & Harris (1988) and Hasbrouck (1988) decompose bid-ask spreads into two components, reflecting compensation for inventory costs and adverse selection costs, which arise from the presence of informed traders. They find that, in contrast to the transitory spread component explained by inventory considerations, the permanent component explained by information asymmetries is significant for large trades but not for small ones.
The exponential lifting probabilities used in continuous-time control problems can be related to the linear demand function used in discrete problems. Specifically, if $\rho$ is the arrival intensity of MOs and the lifting probability is set to be $\exp(-\kappa d)$, where $d$ is the distance between the LO price and the midprice, then, during a time span of $\Delta$, we expect a MO to lift a LO placed at distance $d$ about $\Delta\rho\exp(-\kappa d)$ times. Since it is typically assumed that only one share of the order is lifted at a time, when $d$ is small (as is commonly the case), the expected number of shares filled during that time span is approximately equal to $\Delta\rho - \Delta\rho\kappa d$, which is precisely linear in $d$. Note, however, that the previous argument assumes that a LO with volume $Q \in \mathbb{N}$ is treated as $Q$ independent LOs of volume one with the same lifting probability, which is not the case in practice.

Our work extends existing models of high-frequency market making in several ways. We assume the demand to be linear when modeling the number of filled shares from the HFM's limit orders. However, unlike Adrian et al. (2020) and Hendershott & Menkveld (2014), the demand is not deterministic but random. This means that the actual number of shares bought or sold varies over time, even if the distances of quotes from the fundamental price stay the same. The proposed randomization not only allows for greater flexibility and a better fit to empirically observed order flows, but also uncovers novel properties of the resulting optimal placement strategies. For instance, it is known from Adrian et al. (2020) that, under a constant demand slope, the inventory adjustment in the optimal placement at any given time decreases with the size of the slope. We show that the variance of the slope further reduces this adjustment. This implies that assets with highly volatile demand profiles require less strict inventory adjustments.
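The first-order link between exponential lifting probabilities and linear demand described above can be checked numerically. The following is a minimal sketch, not part of the paper: the function names and the parameter values for $\rho$, $\kappa$ and $\Delta$ are illustrative assumptions, and the error bound used is the standard Taylor remainder $|e^{-x} - (1 - x)| \le x^2/2$ for $x \ge 0$.

```python
import math


def expected_fills_exponential(rho, kappa, d, dt):
    """Expected number of lifts over a span dt with lifting probability exp(-kappa*d)."""
    return dt * rho * math.exp(-kappa * d)


def expected_fills_linear(rho, kappa, d, dt):
    """First-order (linear in d) approximation: dt*rho - dt*rho*kappa*d."""
    return dt * rho - dt * rho * kappa * d


def linearization_error_bound(rho, kappa, d, dt):
    """Taylor remainder bound dt*rho*(kappa*d)^2/2, valid for kappa*d >= 0."""
    return dt * rho * (kappa * d) ** 2 / 2.0


if __name__ == "__main__":
    rho, kappa, dt = 2.0, 0.5, 1.0  # hypothetical values, for illustration only
    for d in (0.01, 0.05, 0.2):
        exact = expected_fills_exponential(rho, kappa, d, dt)
        approx = expected_fills_linear(rho, kappa, d, dt)
        print(f"d={d}: exponential={exact:.6f}, linear={approx:.6f}")
```

The gap between the two quantities shrinks quadratically in $\kappa d$, which is why the linear demand is a reasonable proxy only for quotes placed close to the midprice.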
We also find that the optimal placement [...] et al. (2020) and Cartea & Jaimungal (2015)), which are obtained from the continuous-time versions via Euler approximations, does not allow for this feature. We first obtain a closed form expression for the optimal strategies that explicitly account for the probability $\pi(1,1)$ of simultaneous arrivals of buy and sell MOs during each small time period. We then perform a comparative statics analysis, and discover novel patterns of optimal placement strategies.

Footnote: A separate stream of literature has analyzed liquidity in a limit order book with endogenous equilibrium dynamics. See, for instance, Gayduk & Nadtochiy (2018).

First, at any given time we find that the optimal bid-ask spread declines if $\pi(1,1)$ increases. The intuition is that a higher likelihood of simultaneous buy and sell order arrivals provides the HFM with more opportunities to manage inventory: the positive net position resulting from the execution of sell MOs offsets the negative net position corresponding to the execution of buy MOs.

Second, we show that bid-ask spreads are less sensitive to the passage of time as $\pi(1,1)$ increases. Interestingly, if $\pi(1,1)$ is sufficiently large, the bid-ask spread no longer rises towards the end of the day. This intraday pattern stands in contrast with that identified by Adrian et al. (2020), and can be understood as follows. While the need to reach a zero inventory target becomes stronger with the passage of time, a larger arrival rate of offsetting buy and sell MOs reduces the shadow cost of end-of-the-day inventory and incentivizes the HFM to reduce price pressures to attract larger flows.

Third, we show that the presence of simultaneous arrivals introduces a nonlinearity in the intraday dynamics of the bid spread and the ask spread. In the absence of simultaneous arrivals, the ask spread and bid spread (i.e., the distances of the HFM's optimal ask and bid quotes from the fundamental price) are decreasing functions of time. However, if $\pi(1,1) \neq 0$, this monotonicity is broken as we get closer to the end of the day.

Last, but not least, we observe a novel threshold phenomenon in the HFM's inventory management: if inventory holdings are below a certain threshold, the HFM widens her bid and ask spreads to dampen trading activity on both sides of the market and preserve the current inventory position, instead of aggressively placing LOs close to the security price to lower her net position.

Another distinguishing and novel feature of our study, relative to the rest of the literature, is that we allow the market maker to incorporate forecasts about the fundamental price of the asset, rather than assuming martingale dynamics. We obtain a parsimonious formula which describes how the investor should adjust her limit order placements based on her asset price forecasts. Intuitively, if the HFM expects future price changes to be negative, she would reduce the bid and ask spread proportionally to the expected price change. The proportionality constant depends on the model parameters in a non-trivial way.
This feature also allows the investor to incorporate sophisticated time series- or machine learning-based forecast procedures of asset prices into the intraday market-making process (see Section 3 for the technical details).

To the best of our knowledge, our work is unique in that it assesses the performance of the proposed market making strategy against LOB data. Specifically, we first use a rolling window approach to estimate the model parameters. We then test the calibrated model against actual LOB dynamics, allowing for adjustments of LO placements every second and determining the cash flows and inventory changes generated during the day. At day's close, the HFM submits a MO to liquidate her final inventory, and we determine the actual cost taking into account the state of the LOB. We find that the optimal placement yields, on average, larger revenue compared to the situation where $\pi(1,1) = 0$ (such as in Adrian et al. (2020)), even if $\pi(1,1)$ is estimated to be small and about 0.05. Our empirical analysis lends strong support to demand stochasticity: the slope coefficient has a standard deviation which is about 200% larger than the average demand level, and a correlation of about 20% with the investors' reservation price. Moreover, using real LOB data, we find that the optimal placement strategy based on a simple one-step ahead forecast outperforms the one that presumes a martingale price evolution.

The solution of the optimal control problem presents nontrivial mathematical challenges. While the first-order optimality conditions involve solving a quadratic equation, establishing the second-order conditions needed for the verification theorem is intricate. It requires several careful estimates, in which we leverage direct inequalities implied by the primitives of our model.

The rest of the paper is organized as follows. In Section 2, we present the model setup together with our assumptions. Section 3 solves the Bellman equation for the control problem, and proves a verification theorem. In Section 4, we analyze in detail the main economic forces behind the optimal placement strategies. In Section 5, we measure the performance of our market making strategy against real LOB data. We relegate technical proofs to two appendices.

Model Setup

In this section we introduce our Limit Order Book (LOB) model and specify the type of strategies considered. We assume the market making strategy runs from time 0 to a fixed time $T > 0$, over a time grid $t_0 < t_1 < \cdots < t_N < T$. Throughout, we set $t_{N+1} = T$ and $\mathcal{T} = \{t_0, t_1, \dots, t_{N+1}\}$. All variables introduced below are assumed to be defined on a probability space $(\Omega, \mathbb{P}, \mathcal{F})$ equipped with a filtration $\{\mathcal{F}_t\}_{t\in\mathcal{T}}$, which represents the arrival of the market maker's available information through time.

Arrivals of buy and sell market orders (MOs) are modeled by two Bernoulli processes. Specifically, let $\mathbb{1}^+_{t_{k+1}}$ ($\mathbb{1}^-_{t_{k+1}}$) be a Bernoulli random variable indicating whether there is at least one buy (sell) market order arriving during the time period $[t_k, t_{k+1})$:

$$\mathbb{1}^+_{t_{k+1}} = \mathbb{1}_{\{\text{at least one buy MO arrives during } [t_k, t_{k+1})\}}, \qquad \mathbb{1}^-_{t_{k+1}} = \mathbb{1}_{\{\text{at least one sell MO arrives during } [t_k, t_{k+1})\}}. \tag{1}$$

We assume that $\mathbb{1}^+_{t_{k+1}}, \mathbb{1}^-_{t_{k+1}} \in \mathcal{F}_{t_{k+1}}$ and

$$\mathbb{P}\big(\mathbb{1}^+_{t_{k+1}} = j^+,\ \mathbb{1}^-_{t_{k+1}} = j^- \,\big|\, \mathcal{F}_{t_k}\big) = \pi_{t_{k+1}}(j^+, j^-), \tag{2}$$

for $j^\pm \in \{0, 1\}$, where $\pi_{t_{k+1}} : \{0,1\} \times \{0,1\} \to [0,1]$ is a deterministic probability distribution. The marginal conditional probabilities are denoted as

$$\pi^\pm_{t_{k+1}} := \mathbb{P}\big(\mathbb{1}^\pm_{t_{k+1}} = 1 \,\big|\, \mathcal{F}_{t_k}\big), \tag{3}$$

and throughout we assume that $\pi^+_{t_{k+1}} > 0$ and $\pi^-_{t_{k+1}} > 0$, for all $k = 0, \dots, N$. Concretely, between two consecutive time steps $t_k$ and $t_{k+1}$, the arrival probability of at least one buy (sell) market order is $\pi^+_{t_{k+1}}$ ($\pi^-_{t_{k+1}}$).

Remark 1

By definition of marginal probabilities, we have that $\pi^+_{t_{k+1}} = \pi_{t_{k+1}}(1,1) + \pi_{t_{k+1}}(1,0)$ and $\pi^-_{t_{k+1}} = \pi_{t_{k+1}}(1,1) + \pi_{t_{k+1}}(0,1)$. Then the following relation between $\pi_{t_{k+1}}(1,1)$ and $\pi^\pm_{t_{k+1}}$ must hold for each $t_{k+1}$:

$$\big(\pi^+_{t_{k+1}} + \pi^-_{t_{k+1}} - 1\big) \vee 0 \;\le\; \pi_{t_{k+1}}(1,1) \;\le\; \pi^+_{t_{k+1}} \wedge \pi^-_{t_{k+1}}. \tag{4}$$

The ask (bid) LO is placed at time $t_k$, $k = 0, \dots, N$, at the price level $a_{t_k} \in \mathcal{F}_{t_k}$ ($b_{t_k} \in \mathcal{F}_{t_k}$). We parameterize $a_{t_k}$ and $b_{t_k}$ as

$$a_{t_k} = S_{t_k} + L^+_{t_k}, \qquad b_{t_k} = S_{t_k} - L^-_{t_k}, \tag{5}$$

where $L^\pm_{t_k} \in \mathcal{F}_{t_k}$ are the market maker's spreads and $S_{t_k} \in \mathcal{F}_{t_k}$ is the fundamental price of the asset at time $t_k$. The assumptions on the fundamental price process $\{S_{t_k}\}_{k=0,\dots,N+1}$ are further specified in Section 3.

The limit orders placed at time $t_k$ may be fully or partially executed during the time interval $[t_k, t_{k+1})$, but only if there is at least one arrival of a market order during that period. We assume that the number of filled shares on the bid side during the interval $[t_k, t_{k+1})$ is given by

$$Q^-_{t_{k+1}} := \mathbb{1}^-_{t_{k+1}}\, c^-_{t_{k+1}} \big[\, b_{t_k} - (S_{t_k} - p^-_{t_{k+1}}) \,\big] = \mathbb{1}^-_{t_{k+1}}\, c^-_{t_{k+1}} \big( p^-_{t_{k+1}} - L^-_{t_k} \big), \tag{6}$$

where $c^-_{t_{k+1}}, p^-_{t_{k+1}} \in \mathcal{F}_{t_{k+1}}$ are positive random variables whose distribution is specified below in Assumption 1. When no sell market order arrives during the interval $[t_k, t_{k+1})$, $\mathbb{1}^-_{t_{k+1}} = 0$ and the number of executions on the bid side is 0. Here, $p^-_{t_{k+1}}$ is defined such that $S_{t_k} - p^-_{t_{k+1}}$ is the lowest price that all sell market orders arriving during $[t_k, t_{k+1})$ can attain. In other words, bid limit orders placed by the HFM will not be executed during the interval $[t_k, t_{k+1})$ if the price is smaller than $S_{t_k} - p^-_{t_{k+1}}$. We refer to $p^-_{t_{k+1}}$ as the reservation price for sellers. The demand slope $c^-_{t_{k+1}}$ measures the rate of increase in the number of filled shares of the bid order, as the order's bid price $b_{t_k}$ gets closer to the fundamental price $S_{t_k}$.
Symmetrically, the number of shares filled by the HFM's ask limit order during $[t_k, t_{k+1})$ is given by

$$Q^+_{t_{k+1}} := \mathbb{1}^+_{t_{k+1}}\, c^+_{t_{k+1}} \big[\, (S_{t_k} + p^+_{t_{k+1}}) - a_{t_k} \,\big] = \mathbb{1}^+_{t_{k+1}}\, c^+_{t_{k+1}} \big( p^+_{t_{k+1}} - L^+_{t_k} \big). \tag{7}$$

Figure 1 (axes: price vs. number of filled shares): $S_{t_k} - p^-_{t_{k+1}}$ is the lowest price that a sell market order can attain, and $S_{t_k} + p^+_{t_{k+1}}$ is the highest price that a buy market order can attain during the time interval $[t_k, t_{k+1})$. The number of filled shares increases as the market maker places limit orders closer to the fundamental price $S_{t_k}$.

Figure 2: Prototypical Plot of Actual Demand vs. Estimated Linear Demand over Time Interval $[t_k, t_{k+1})$.

The number of filled shares $Q^\pm_{t_{k+1}}$ is illustrated in Figure 1. The quantity $Q^\pm_{t_{k+1}}$ may be viewed as the "best" linear fit for the actual demand as shown in Figure 2. More details on the estimation of $Q^\pm_{t_{k+1}}$ are covered in the empirical analysis conducted in Section 5.

Next, we state our main assumptions on $c^\pm_\cdot$ and $p^\pm_\cdot$.

Assumption 1 (General Properties of $(c^\pm_\cdot, p^\pm_\cdot)$) For $k = 0, \dots, N$, we have:

1. $(c^\pm_{t_{k+1}}, p^\pm_{t_{k+1}})$ are $\mathcal{F}_{t_{k+1}}$-measurable,
2. the conditional distribution of $(c^+_{t_{k+1}}, p^+_{t_{k+1}}, c^-_{t_{k+1}}, p^-_{t_{k+1}})$ given $(\mathcal{F}_{t_k}, \mathbb{1}^+_{t_{k+1}}, \mathbb{1}^-_{t_{k+1}})$ does not depend on $k$ and is nonrandom,
3. $(c^+_{t_{k+1}}, p^+_{t_{k+1}})$ and $(c^-_{t_{k+1}}, p^-_{t_{k+1}})$ are independent given $(\mathcal{F}_{t_k}, \mathbb{1}^+_{t_{k+1}}, \mathbb{1}^-_{t_{k+1}})$.

We use the following notation for the conditional moments of $(c^\pm_\cdot, p^\pm_\cdot)$:

$$\begin{aligned}
\mu^\pm_{c} &:= \mathbb{E}\big(c^\pm_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k}, \mathbb{1}^\pm_{t_{k+1}} = 1\big), \\
\mu^\pm_{c^2} &:= \mathbb{E}\big((c^\pm_{t_{k+1}})^2 \,\big|\, \mathcal{F}_{t_k}, \mathbb{1}^\pm_{t_{k+1}} = 1\big), \\
\mu^\pm_{cp} &:= \mathbb{E}\big(c^\pm_{t_{k+1}} p^\pm_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k}, \mathbb{1}^\pm_{t_{k+1}} = 1\big), \\
\mu^\pm_{c^2p} &:= \mathbb{E}\big((c^\pm_{t_{k+1}})^2 p^\pm_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k}, \mathbb{1}^\pm_{t_{k+1}} = 1\big), \\
\mu^\pm_{c^2p^2} &:= \mathbb{E}\big((c^\pm_{t_{k+1}} p^\pm_{t_{k+1}})^2 \,\big|\, \mathcal{F}_{t_k}, \mathbb{1}^\pm_{t_{k+1}} = 1\big).
\end{aligned} \tag{8}$$

We consider the following maximization problem for the HFM:

$$\max_{(L^+_\cdot, L^-_\cdot) \in \mathcal{A}} \mathbb{E}\big[ W_T + S_T I_T - \lambda I_T^2 \big], \tag{9}$$

where $\mathcal{A}$ is the collection of all $\mathcal{F}$-adapted processes, and $W_T$ and $I_T$ stand for the market maker's cash holdings and inventory at the end of the period $[0, T]$, respectively.
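The arrival and fill model above can be made concrete with a short simulation sketch. It is not taken from the paper: the function names are hypothetical, the joint arrival law is sampled by a simple inverse-CDF split of a uniform draw, and the moment estimator is a plain sample average over periods in which an arrival occurred. The feasibility check follows the bounds in Eq. (4), and the fill formula follows Eqs. (6)-(7).

```python
import random


def frechet_bounds(pi_plus, pi_minus):
    """Feasible range for pi(1,1) given the marginals, as in Eq. (4)."""
    return max(pi_plus + pi_minus - 1.0, 0.0), min(pi_plus, pi_minus)


def sample_arrivals(pi_plus, pi_minus, pi_11, rng):
    """Draw (buy indicator, sell indicator) with the given joint law."""
    lo, hi = frechet_bounds(pi_plus, pi_minus)
    if not lo <= pi_11 <= hi:
        raise ValueError("pi(1,1) violates the bounds of Eq. (4)")
    u = rng.random()
    if u < pi_11:                          # both sides arrive
        return 1, 1
    if u < pi_plus:                        # buy MO only, prob pi_plus - pi_11
        return 1, 0
    if u < pi_plus + pi_minus - pi_11:     # sell MO only, prob pi_minus - pi_11
        return 0, 1
    return 0, 0


def filled_shares(arrival, c, p, spread):
    """Q = 1 * c * (p - L), the linear demand of Eqs. (6)-(7)."""
    return arrival * c * (p - spread)


def conditional_moments(samples):
    """Sample versions of the moments in Eq. (8) from (c, p) pairs
    observed in periods with at least one arrival on that side."""
    n = len(samples)
    return {
        "mu_c": sum(c for c, _ in samples) / n,
        "mu_c2": sum(c * c for c, _ in samples) / n,
        "mu_cp": sum(c * p for c, p in samples) / n,
        "mu_c2p": sum(c * c * p for c, p in samples) / n,
        "mu_c2p2": sum((c * p) ** 2 for c, p in samples) / n,
    }
```

For example, `sample_arrivals(0.3, 0.3, 0.05, random.Random(0))` returns one pair of indicators, and `filled_shares` turns an indicator, a slope, a reservation price and a spread into a fill count for one side of the book.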
The cash holding and inventory processes, $\{W_{t_k}\}$ and $\{I_{t_k}\}$, respectively, satisfy the following equations:

$$W_{t_{k+1}} = W_{t_k} + a_{t_k} Q^+_{t_{k+1}} - b_{t_k} Q^-_{t_{k+1}} = W_{t_k} + (S_{t_k} + L^+_{t_k})\, \mathbb{1}^+_{t_{k+1}} c^+_{t_{k+1}} (p^+_{t_{k+1}} - L^+_{t_k}) - (S_{t_k} - L^-_{t_k})\, \mathbb{1}^-_{t_{k+1}} c^-_{t_{k+1}} (p^-_{t_{k+1}} - L^-_{t_k}) \tag{10}$$

and

$$I_{t_{k+1}} = I_{t_k} - Q^+_{t_{k+1}} + Q^-_{t_{k+1}} = I_{t_k} - \mathbb{1}^+_{t_{k+1}} c^+_{t_{k+1}} (p^+_{t_{k+1}} - L^+_{t_k}) + \mathbb{1}^-_{t_{k+1}} c^-_{t_{k+1}} (p^-_{t_{k+1}} - L^-_{t_k}), \tag{11}$$

where $W_{t_0} = 0$ and $I_{t_0} = 0$.

The term $\lambda I_T^2 \ge 0$ penalizes the inventory held at the terminal time $T$. This formulation captures, in reduced form, the fact that HFMs tend to have de minimis balance sheets, thus making any overnight inventory costly to carry. The penalty term for holding end-of-day inventory can also be interpreted as follows. We can rewrite the last two terms $S_T I_T - \lambda I_T^2$ in the above expectation as $(S_T - \lambda I_T) I_T$. Then, $S_T - \lambda I_T$ is the average price per share that the HFM will get when liquidating her inventory $I_T$ via a MO, under the assumption of a linear instantaneous price impact. For instance, if $I_T > 0$, the HFM sells at an average price below $S_T$, while if $I_T < 0$, she buys at an average price above $S_T$.

Footnote: Similar objective criteria have been proposed in earlier studies, such as Cartea & Jaimungal (2015) and Adrian et al. (2020).

At time $t_k$, the value function of the control problem described above is given by

$$V_{t_k} = \sup_{(L^+_\cdot, L^-_\cdot) \in \mathcal{A}} \mathbb{E}\big[ W_T + S_T I_T - \lambda I_T^2 \,\big|\, \mathcal{F}_{t_k} \big]. \tag{12}$$

By the dynamic programming principle, it satisfies the following equation:

$$V_{t_k} = \sup_{(L^+_{t_k}, L^-_{t_k}) \in \mathcal{A}} \mathbb{E}\big[ V_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k} \big]. \tag{13}$$

We now proceed to find the optimal placement strategy for the market maker. Our objective is to derive it for a general adapted stochastic process of the fundamental price. To illustrate the procedure behind the construction, we first analyze the setting where the fundamental price process is a martingale, and obtain tractable formulas for the optimal bid and ask prices and the value function (see Subsection 3.1). In Subsection 3.2, we relax the martingale assumption and provide a formula for general adapted stochastic dynamics of the fundamental price.

In this subsection, we assume that the fundamental price $S_{t_k} \in \mathcal{F}_{t_k}$ of the asset is a martingale:

$$\mathbb{E}\big( S_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k} \big) = S_{t_k}, \qquad k = 0, \dots, N. \tag{14}$$

Furthermore, we assume that $S_{t_{k+1}} - S_{t_k}$ and $(\mathbb{1}^+_{t_{k+1}}, \mathbb{1}^-_{t_{k+1}}, c^+_{t_{k+1}}, p^+_{t_{k+1}}, c^-_{t_{k+1}}, p^-_{t_{k+1}})$ are conditionally independent given $\mathcal{F}_{t_k}$. We start by making the following ansatz for the value function $V_{t_k}$:

$$V_{t_k} = v(t_k, S_{t_k}, W_{t_k}, I_{t_k}) := W_{t_k} + \alpha_{t_k} I_{t_k}^2 + S_{t_k} I_{t_k} + h_{t_k} I_{t_k} + g_{t_k}, \tag{15}$$

where $\alpha : \mathcal{T} \to \mathbb{R}$, $h : \mathcal{T} \to \mathbb{R}$, and $g : \mathcal{T} \to \mathbb{R}$ are deterministic functions defined on $\mathcal{T} = \{t_0, t_1, \dots, t_{N+1}\}$ (recall that we set $t_{N+1} = T$). Since $V_T = W_T + S_T I_T - \lambda I_T^2$, we obtain the terminal conditions $\alpha_T = -\lambda$, $g_T = 0$, and $h_T = 0$.

We can determine the functions $\alpha_\cdot$, $h_\cdot$, and $g_\cdot$ by plugging the ansatz (15) back into Eq. (13), and then using Eqs. (10)-(11).
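This yields the following iterative representation for the value function:

$$\begin{aligned}
v(t_k, S_{t_k}, W_{t_k}, I_{t_k}) &= \sup_{(L^+_{t_k}, L^-_{t_k}) \in \mathcal{A}} \mathbb{E}\big[ v(t_{k+1}, S_{t_{k+1}}, W_{t_{k+1}}, I_{t_{k+1}}) \,\big|\, \mathcal{F}_{t_k} \big] \\
&= \sup_{(L^+_{t_k}, L^-_{t_k}) \in \mathcal{A}} \mathbb{E}\big[ v\big(t_{k+1}, S_{t_{k+1}}, W_{t_k} + a_{t_k} Q^+_{t_{k+1}} - b_{t_k} Q^-_{t_{k+1}},\ I_{t_k} - Q^+_{t_{k+1}} + Q^-_{t_{k+1}}\big) \,\big|\, \mathcal{F}_{t_k} \big].
\end{aligned} \tag{16}$$

From the construction of $a_{t_k}$, $b_{t_k}$, and $Q^\pm_{t_{k+1}}$ (see Eqs. (5)-(7)), we know that $a_{t_k}$ and $Q^+_{t_{k+1}}$ are linear in $L^+_{t_k}$, while $b_{t_k}$ and $Q^-_{t_{k+1}}$ are linear in $L^-_{t_k}$. Also, by our ansatz, $v(t_k, S_{t_k}, W_{t_k}, I_{t_k})$ is linear in $W_{t_k}$ and quadratic in $I_{t_k}$. Denoting the expectation on the right-hand side of Eq. (16) as $f(L^+_{t_k}, L^-_{t_k})$, we can then conclude that $f(L^+_{t_k}, L^-_{t_k})$ is quadratic in $L^+_{t_k}$ and $L^-_{t_k}$. Therefore, we can use the first-order conditions to find the candidates $L^{\pm,*}_{t_k}$. We can then evaluate the second-order partial derivatives, and establish that the critical point $(L^{+,*}_{t_k}, L^{-,*}_{t_k})$ is indeed a maximum point. We state this fact in the following proposition, whose proof is given in Appendix A.1.

Proposition 1 (Optimal Controls)

The optimal controls that solve the optimization problem (16) using the ansatz (15) and state dynamics (10)-(11) are given, for $k = 0, \dots, N$, by

$$\begin{aligned}
L^{+,*}_{t_k} &= {}^{(1)}\!A^+_{t_k} I_{t_k} + {}^{(2)}\!A^+_{t_k} + {}^{(3)}\!A^+_{t_k}, \\
L^{-,*}_{t_k} &= -{}^{(1)}\!A^-_{t_k} I_{t_k} - {}^{(2)}\!A^-_{t_k} + {}^{(3)}\!A^-_{t_k},
\end{aligned} \tag{17}$$

where the coefficients above are specified as

$${}^{(1)}\!A^\pm_{t_k} = \frac{\beta^\pm_{t_k}\, \alpha_{t_{k+1}}}{\gamma_{t_k}}, \qquad {}^{(2)}\!A^\pm_{t_k} = \frac{\beta^\pm_{t_k}\, h_{t_{k+1}}}{2\gamma_{t_k}}, \tag{18}$$

$$\begin{aligned}
{}^{(3)}\!A^\pm_{t_k} ={}& \frac{\pi^\mp_{t_{k+1}}}{2\gamma_{t_k}} \big(\alpha_{t_{k+1}} \mu^\mp_{c^2} - \mu^\mp_c\big) \Big[ \pi^\pm_{t_{k+1}} \big(\mu^\pm_{cp} - 2\alpha_{t_{k+1}} \mu^\pm_{c^2p}\big) + 2\alpha_{t_{k+1}} \pi_{t_{k+1}}(1,1)\, \mu^\pm_c \mu^\mp_{cp} \Big] \\
&+ \frac{\pi_{t_{k+1}}(1,1)\, \alpha_{t_{k+1}}}{2\gamma_{t_k}}\, \mu^+_c \mu^-_c \Big[ \pi^\mp_{t_{k+1}} \big(\mu^\mp_{cp} - 2\alpha_{t_{k+1}} \mu^\mp_{c^2p}\big) + 2\alpha_{t_{k+1}} \pi_{t_{k+1}}(1,1)\, \mu^\mp_c \mu^\pm_{cp} \Big],
\end{aligned}$$

and

$$\begin{aligned}
\gamma_{t_k} &:= \big[ \pi_{t_{k+1}}(1,1)\, \alpha_{t_{k+1}} \mu^+_c \mu^-_c \big]^2 - \pi^+_{t_{k+1}} \pi^-_{t_{k+1}} \big(\alpha_{t_{k+1}} \mu^+_{c^2} - \mu^+_c\big)\big(\alpha_{t_{k+1}} \mu^-_{c^2} - \mu^-_c\big), \\
\beta^\pm_{t_k} &:= \pi^+_{t_{k+1}} \pi^-_{t_{k+1}}\, \mu^\pm_c \big(\alpha_{t_{k+1}} \mu^\mp_{c^2} - \mu^\mp_c\big) - \pi^\mp_{t_{k+1}} \pi_{t_{k+1}}(1,1)\, \alpha_{t_{k+1}}\, \mu^\pm_c (\mu^\mp_c)^2.
\end{aligned}$$

In the expressions above, $\alpha : \mathcal{T} \to \mathbb{R}$ and $h : \mathcal{T} \to \mathbb{R}$ are specified using the following backward equations: $\alpha_T = -\lambda$, $h_T = 0$ at $T = t_{N+1}$ and, for $k = 0, \dots, N$:

$$\alpha_{t_k} = \alpha_{t_{k+1}} + \sum_{\delta = \pm} \pi^\delta_{t_{k+1}} \Big[ \big(\alpha_{t_{k+1}} \mu^\delta_{c^2} - \mu^\delta_c\big) \big({}^{(1)}\!A^\delta_{t_k}\big)^2 + 2\alpha_{t_{k+1}} \mu^\delta_c \big({}^{(1)}\!A^\delta_{t_k}\big) \Big] + 2\alpha_{t_{k+1}} \pi_{t_{k+1}}(1,1)\, \mu^+_c \mu^-_c \big({}^{(1)}\!A^+_{t_k}\, {}^{(1)}\!A^-_{t_k}\big), \tag{19}$$

and

$$\begin{aligned}
h_{t_k} ={}& h_{t_{k+1}} + \sum_{\delta = \pm} \pi^\delta_{t_{k+1}} \Big\{ 2\big(\alpha_{t_{k+1}} \mu^\delta_{c^2} - \mu^\delta_c\big) \Big[ {}^{(1)}\!A^\delta_{t_k} \big(\delta\, {}^{(3)}\!A^\delta_{t_k} + {}^{(2)}\!A^\delta_{t_k}\big) \Big] + 2\alpha_{t_{k+1}} \mu^\delta_c \big(\delta\, {}^{(3)}\!A^\delta_{t_k} + {}^{(2)}\!A^\delta_{t_k}\big) \\
&\qquad - 2\alpha_{t_{k+1}} \big(\delta \mu^\delta_{cp}\big) + \big(\delta\, {}^{(1)}\!A^\delta_{t_k}\big) \big(\mu^\delta_{cp} + \delta h_{t_{k+1}} \mu^\delta_c - 2\alpha_{t_{k+1}} \mu^\delta_{c^2p}\big) \Big\} \\
&- 2\alpha_{t_{k+1}} \pi_{t_{k+1}}(1,1)\, \mu^+_c \mu^-_c \Big[ {}^{(1)}\!A^+_{t_k} \big({}^{(3)}\!A^-_{t_k} - {}^{(2)}\!A^-_{t_k}\big) - {}^{(1)}\!A^-_{t_k} \big({}^{(2)}\!A^+_{t_k} + {}^{(3)}\!A^+_{t_k}\big) + \frac{\mu^+_{cp}}{\mu^+_c} \big({}^{(1)}\!A^-_{t_k}\big) - \frac{\mu^-_{cp}}{\mu^-_c} \big({}^{(1)}\!A^+_{t_k}\big) \Big].
\end{aligned} \tag{20}$$

In the sums over $\delta = \pm$, a superscript $\delta$ selects the ask ($+$) or bid ($-$) side, while $\delta$ used as a factor stands for $\pm 1$.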
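The backward recursion of Proposition 1 lends itself to an offline computation. The sketch below is an illustration only and covers just the inventory coefficients ${}^{(1)}\!A^\pm$ and $\alpha$. It assumes a reading of the definitions of $\gamma$, $\beta^\pm$, ${}^{(1)}\!A^\pm$ and Eq. (19) in which $\mu_{c^2}$ is the conditional second moment of the slope, and it further assumes arrival probabilities that are constant across periods. The names `SideMoments`, `inventory_slopes` and `alpha_backward` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SideMoments:
    mu_c: float    # E[c | arrival] for one side of the book
    mu_c2: float   # E[c^2 | arrival] for the same side


def inventory_slopes(alpha_next, pi_p, pi_m, pi_11, ask, bid):
    """Return ((1)A+, (1)A-, a+, a-) where a = alpha_next*mu_c2 - mu_c."""
    a_p = alpha_next * ask.mu_c2 - ask.mu_c
    a_m = alpha_next * bid.mu_c2 - bid.mu_c
    cross = pi_11 * alpha_next * ask.mu_c * bid.mu_c
    gamma = cross ** 2 - pi_p * pi_m * a_p * a_m
    beta_p = (pi_p * pi_m * ask.mu_c * a_m
              - pi_m * pi_11 * alpha_next * ask.mu_c * bid.mu_c ** 2)
    beta_m = (pi_p * pi_m * bid.mu_c * a_p
              - pi_p * pi_11 * alpha_next * bid.mu_c * ask.mu_c ** 2)
    return beta_p * alpha_next / gamma, beta_m * alpha_next / gamma, a_p, a_m


def alpha_backward(lam, n_steps, pi_p, pi_m, pi_11, ask, bid):
    """Iterate Eq. (19) backward from alpha_T = -lam.

    Returns (alphas, slopes): alphas[k] is alpha at t_k for k = 0..n_steps,
    slopes[k] is ((1)A+, (1)A-) at t_k for k = 0..n_steps-1.
    """
    alphas = [-lam]
    slopes = []
    for _ in range(n_steps):
        a_next = alphas[-1]
        A1p, A1m, a_p, a_m = inventory_slopes(a_next, pi_p, pi_m, pi_11, ask, bid)
        alpha = (a_next
                 + pi_p * (a_p * A1p ** 2 + 2 * a_next * ask.mu_c * A1p)
                 + pi_m * (a_m * A1m ** 2 + 2 * a_next * bid.mu_c * A1m)
                 + 2 * a_next * pi_11 * ask.mu_c * bid.mu_c * A1p * A1m)
        alphas.append(alpha)
        slopes.append((A1p, A1m))
    alphas.reverse()
    slopes.reverse()
    return alphas, slopes


if __name__ == "__main__":
    side = SideMoments(mu_c=1.0, mu_c2=2.0)  # hypothetical symmetric market
    alphas, slopes = alpha_backward(1.0, 5, 0.3, 0.3, 0.05, side, side)
    for k, a in enumerate(alphas):
        print(k, round(a, 6))
```

In this symmetric example the computed $\alpha_{t_k}$ stay negative and move toward $-\lambda$ as $t_k$ approaches $T$, and the slopes ${}^{(1)}\!A^\pm$ are negative, so a long inventory pulls the ask quote toward the fundamental price.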
The following key lemma will be needed to show that the critical point $(L^{+,*}_{t_k}, L^{-,*}_{t_k})$ of Proposition 1 is indeed a maximum point, and also in the analysis of the optimal placement's properties in Section 4. Its proof is intricate and deferred to Appendix A.2.

Lemma 1

The quantity $\alpha_{t_k}$ defined in Eq. (19) is strictly decreasing with $t_k$ and negative for every $t_k$.

Recall that $(L^{+,*}_{t_k}, L^{-,*}_{t_k})$ are admissible if $(L^{+,*}_{t_k}, L^{-,*}_{t_k}) \in \mathcal{F}_{t_k}$. It is easy to check that $(L^{+,*}_{t_k}, L^{-,*}_{t_k}) \in \mathcal{F}_{t_k}$ since $I_{t_k} \in \mathcal{F}_{t_k}$ and ${}^{(1)}\!A^\pm_{t_k}$, ${}^{(2)}\!A^\pm_{t_k}$, ${}^{(3)}\!A^\pm_{t_k}$ are deterministic functions. From the previous result and the expression for bid and ask prices in Eq. (5), we deduce that the optimal placements at time $t_k$ are

$$a^*_{t_k} = S_{t_k} + {}^{(1)}\!A^+_{t_k} I_{t_k} + {}^{(2)}\!A^+_{t_k} + {}^{(3)}\!A^+_{t_k}, \tag{21}$$

$$b^*_{t_k} = S_{t_k} + {}^{(1)}\!A^-_{t_k} I_{t_k} + {}^{(2)}\!A^-_{t_k} - {}^{(3)}\!A^-_{t_k}, \tag{22}$$

where $a^*_{t_k}$ is the price for the ask limit order and $b^*_{t_k}$ is the price for the bid limit order.

Recall that

$$V_{t_k} = \sup_{(L^+_\cdot, L^-_\cdot) \in \mathcal{A}} \mathbb{E}\big[ W_T + S_T I_T - \lambda I_T^2 \,\big|\, \mathcal{F}_{t_k} \big]. \tag{23}$$

We next prove a verification theorem for the optimal placements given in Eq. (17). Its proof is given in Appendix A.3.

Theorem 1 (Verification Theorem)

The optimal value function $V_{t_k}$ of the control problem (23) is given by $V_{t_k} = v(t_k, S_{t_k}, W_{t_k}, I_{t_k})$, where, for $t_k \in \mathcal{T}$,

$$v(t_k, s, w, i) = w + \alpha_{t_k} i^2 + s i + h_{t_k} i + g_{t_k},$$

with $\alpha_{t_k}$ and $h_{t_k}$ given in Proposition 1, and $g_{t_k}$ defined as $g_T = 0$ and, for $k = 0, \dots, N$,

$$\begin{aligned}
g_{t_k} ={}& g_{t_{k+1}} + \sum_{\delta = \pm} \pi^\delta_{t_{k+1}} \Big[ \big(\alpha_{t_{k+1}} \mu^\delta_{c^2} - \mu^\delta_c\big) \big({}^{(3)}\!A^\delta_{t_k} + \delta\, {}^{(2)}\!A^\delta_{t_k}\big)^2 + \alpha_{t_{k+1}} \mu^\delta_{c^2p^2} - (\delta h_{t_{k+1}}) \mu^\delta_{cp} \\
&\qquad + \big(\mu^\delta_{cp} + (\delta h_{t_{k+1}}) \mu^\delta_c - 2\alpha_{t_{k+1}} \mu^\delta_{c^2p}\big) \big({}^{(3)}\!A^\delta_{t_k} + \delta\, {}^{(2)}\!A^\delta_{t_k}\big) \Big] \\
&- 2\alpha_{t_{k+1}} \pi_{t_{k+1}}(1,1)\, \mu^+_c \mu^-_c \Big[ \big({}^{(2)}\!A^+_{t_k} + {}^{(3)}\!A^+_{t_k}\big)\big({}^{(3)}\!A^-_{t_k} - {}^{(2)}\!A^-_{t_k}\big) - \frac{\mu^+_{cp}}{\mu^+_c} \big({}^{(3)}\!A^-_{t_k} - {}^{(2)}\!A^-_{t_k}\big) \\
&\qquad - \frac{\mu^-_{cp}}{\mu^-_c} \big({}^{(2)}\!A^+_{t_k} + {}^{(3)}\!A^+_{t_k}\big) + \frac{\mu^+_{cp} \mu^-_{cp}}{\mu^-_c \mu^+_c} \Big].
\end{aligned}$$

Furthermore, the optimal controls are given by $L^{\pm,*}_\cdot$ as defined in (17).

In this subsection, we relax the martingale assumption on the fundamental price process $\{S_{t_k}\}_{t_k \in \mathcal{T}}$ made in the previous subsection, and consider a general adapted process. Furthermore, we assume that, conditionally on $\mathcal{F}_{t_k}$, $\{S_{t_{j+1}} - S_{t_j}\}_{j \ge k}$ and $(\mathbb{1}^+_{t_{k+1}}, \mathbb{1}^-_{t_{k+1}}, c^+_{t_{k+1}}, p^+_{t_{k+1}}, c^-_{t_{k+1}}, p^-_{t_{k+1}})$ are independent. Let us introduce the notation:

$$\Delta_{t_k} := \mathbb{E}\big( S_{t_{k+1}} - S_{t_k} \,\big|\, \mathcal{F}_{t_k} \big).$$

The quantity $\Delta_{t_k}$ reflects the HFM's forecast about the asset price's change in the interval $[t_k, t_{k+1})$ based on her information available at $t_k$. Including this term makes our model more flexible and, as we shall see in Section 5, the resulting optimal placement strategies achieve better empirical performance. We leave the rest of the model setup as in Section 2.

We define the price change forecasts

$$\Delta^{t_k}_{t_j} := \mathbb{E}\big( \Delta_{t_j} \,\big|\, \mathcal{F}_{t_k} \big) = \mathbb{E}\big( S_{t_{j+1}} - S_{t_j} \,\big|\, \mathcal{F}_{t_k} \big), \qquad j \ge k, \tag{24}$$

and recall the standard convention $\prod_{\ell = k}^{k-1} = 1$.
The following result gives the optimal placement spreads for an arbitrary adapted price process $\{S_{t_k}\}_{t_k \in \mathcal{T}}$ in terms of the forecasts (24) and the optimal placement strategy $L^{\pm,*}_{t_k}$ of Proposition 1. The proof is provided in Appendix A.1.

Theorem 2 (Optimal Controls with a General Adapted Fundamental Price Process)

The optimal controls which solve the dynamic optimization problem (16) are given, for $k = 0, \dots, N$, by

$$\begin{aligned}
\widetilde{L}^{+,*}_{t_k} &= L^{+,*}_{t_k} + \frac{\beta^+_{t_k}}{2\gamma_{t_k}} \Delta_{t_k} + \Big( \frac{\beta^+_{t_k}}{2\gamma_{t_k}} \Big) \sum_{j = k+1}^{N} \prod_{\ell = k+1}^{j} \xi_\ell\, \Delta^{t_k}_{t_j}, \\
\widetilde{L}^{-,*}_{t_k} &= L^{-,*}_{t_k} - \frac{\beta^-_{t_k}}{2\gamma_{t_k}} \Delta_{t_k} - \Big( \frac{\beta^-_{t_k}}{2\gamma_{t_k}} \Big) \sum_{j = k+1}^{N} \prod_{\ell = k+1}^{j} \xi_\ell\, \Delta^{t_k}_{t_j},
\end{aligned} \tag{25}$$

where $\beta^\pm_{t_k}$ and $\gamma_{t_k}$ are the deterministic sequences introduced in Proposition 1, and $L^{\pm,*}_{t_k}$ are the optimal spreads defined therein. The quantity $\xi_k$ is defined as:

$$\xi_k = 1 + \frac{\alpha_{t_{k+1}}}{\gamma_{t_k}} \sum_{\delta = \pm} \pi^\delta_{t_{k+1}} \beta^\delta_{t_k} \Big\{ \frac{\beta^\delta_{t_k}}{\gamma_{t_k}} \big(\alpha_{t_{k+1}} \mu^\delta_{c^2} - \mu^\delta_c\big) + 2\mu^\delta_c \Big\} + 2 \Big( \frac{\alpha_{t_{k+1}}}{\gamma_{t_k}} \Big)^2 \pi_{t_{k+1}}(1,1)\, \mu^+_c \mu^-_c\, \beta^+_{t_k} \beta^-_{t_k}. \tag{26}$$

The optimal placement strategies at time $t_k$ with non-martingale dynamics for the fundamental price process can then be written as:

$$\widetilde{a}^*_{t_k} = S_{t_k} + \widetilde{L}^{+,*}_{t_k}, \qquad \widetilde{b}^*_{t_k} = S_{t_k} - \widetilde{L}^{-,*}_{t_k}, \tag{27}$$

where $\widetilde{a}^*_{t_k}$ is the price for the ask limit order and $\widetilde{b}^*_{t_k}$ is the price for the bid limit order. Eq. (25) highlights that we can split the problem of finding the optimal trading strategy into two subproblems. First, we compute the recursive expressions (18)-(20). This is done "offline" at the beginning of each trading day. That is to say, all parameters needed to compute $L^{\pm,*}_{t_k}$ (i.e., the optimal controls with a martingale price process) are predetermined at the beginning of the day. Second, we solve the forecasting problem of determining $\{\Delta^{t_k}_{t_j}\}_{j = k, \dots, N}$, and compute $\widetilde{L}^{\pm,*}_{t_k}$ from $L^{\pm,*}_{t_k}$ as in Eq. (25). This task is done "online" at each time $t_k$. Thus, under a general adapted fundamental price process, the optimal strategy $\widetilde{L}^{\pm,*}_{t_k}$ incorporates the views of the HFM about changes in the fundamental price based on her information available at time $t_k$.

Remark 2

As shown in Appendix A.1, for the case of a general adapted price process $\{S_{t_k}\}_{t_k\in\mathcal{T}}$, the ansatz for the value function $V_{t_k}$ takes the form:

$$
V_{t_k} = v(t_k, S_{t_k}, W_{t_k}, I_{t_k}) := W_{t_k} + \alpha_{t_k} I^{2}_{t_k} + S_{t_k} I_{t_k} + \widetilde{h}_{t_k} I_{t_k} + \widetilde{g}_{t_k},
\tag{28}
$$

where, as in Subsection 3.1, $\alpha : \mathcal{T} \to \mathbb{R}$ is a deterministic function, but $\{\widetilde{h}_{t_k}\}_{t_k\in\mathcal{T}}$ and $\{\widetilde{g}_{t_k}\}_{t_k\in\mathcal{T}}$ are now processes adapted to the filtration $\{\mathcal{F}_t\}_{t\in\mathcal{T}}$. The precise expressions for $\widetilde{h}$ and $\widetilde{g}$ are given in Eqs. (A-15)-(A-16) using notation (A-2). The proof of the corresponding verification theorem proceeds along similar lines as the proof of Theorem 1.

In the next proposition, we provide conditions under which the bid-ask spread is guaranteed to be positive (i.e., $a_{t_k} > b_{t_k}$). We defer the proof to Appendix A.4.

Proposition 2 (Conditions for an Optimal Positive Spread)

Under both martingale and non-martingale price processes, the optimal placement strategy yields positive spreads at all times (i.e., $a_{t_k} > b_{t_k}$ for all $k = 0, \dots, N$), provided that the following three conditions hold:

(1) The first and second conditional moments of $c^{\pm}$ defined in Eq. (8) satisfy
$$
\mu_{c} := \mu^{+}_{c} = \mu^{-}_{c}, \qquad \mu_{c^2} := \mu^{+}_{c^2} = \mu^{-}_{c^2}.
\tag{29}
$$

(2) Buy and sell market orders arrive with the same probability:
$$
\pi^{+}_{t_{k+1}} = \pi^{-}_{t_{k+1}}.
\tag{30}
$$

(3) The conditional expectations of $(cp)^{\pm}$ and $(c^2p)^{\pm}$ defined in Eq. (8) satisfy
$$
\mu^{\pm}_{cp} = \mu^{\pm}_{c}\,\mu^{\pm}_{p}, \qquad \mu^{\pm}_{c^2p} = \mu^{\pm}_{c^2}\,\mu^{\pm}_{p},
\tag{31}
$$
where $\mu^{\pm}_{p} := E\big(p^{\pm}_{t_{k+1}} \mid \mathcal{F}_{t_k}, \mathbb{1}^{\pm}_{t_{k+1}} = 1\big)$.

Conditions (29) and (30) imply a symmetric market. Under Condition (29), both the mean and the variance of the bid demand slope $c^{-}_{t_{k+1}}$ are the same as those on the ask side. Condition (30) postulates that buy and sell MOs arrive with the same probability within each time interval $[t_k, t_{k+1})$. Condition (31) postulates that the demand slope $c^{\pm}_{t_{k+1}}$ and the reservation price $p^{\pm}_{t_{k+1}}$ are uncorrelated. These assumptions are empirically supported by the analysis of Section 5.

In this section, we discuss the behavior of the optimal placement strategies and their sensitivities to model parameters, such as the arrival rate $\pi_{t_k}(1,1)$, the inventory level $I$, and the penalty $\lambda$ on the terminal inventory.

$\pi_{t_k}(1,1) \equiv 0$.

We first consider the situation where only one type of MO (buy or sell) can arrive between two consecutive action times. Recall $P\big(\mathbb{1}^{+}_{t_{k+1}} = j^{+}, \mathbb{1}^{-}_{t_{k+1}} = j^{-} \mid \mathcal{F}_{t_k}\big) = \pi_{t_{k+1}}(j^{+}, j^{-})$, $j^{\pm} \in \{0, 1\}$, where $\mathbb{1}^{+}_{t_{k+1}}$ ($\mathbb{1}^{-}_{t_{k+1}}$) indicates whether there are arrivals of buy (sell) MOs during $[t_k, t_{k+1})$. If $\pi_{t_{k+1}}(1,1) = 0$, it follows from Eq. (27) that the best placement strategies take the following form:

$$
\widetilde{a}^{*,0}_{t_k} = S_{t_k} + \overbrace{\frac{\alpha_{t_{k+1}}\mu^{+}_{c}}{\mu^{+}_{c} - \alpha_{t_{k+1}}\mu^{+}_{c^2}}\, I_{t_k} + \frac{\mu^{+}_{cp} - 2\alpha_{t_{k+1}}\mu^{+}_{c^2p}}{2\big[\mu^{+}_{c} - \alpha_{t_{k+1}}\mu^{+}_{c^2}\big]} + \frac{(\Delta_{t_k} + \widetilde{h}_{t_k})\,\mu^{+}_{c}}{2\big[\mu^{+}_{c} - \alpha_{t_{k+1}}\mu^{+}_{c^2}\big]}}^{\widetilde{L}^{+,*,0}_{t_k}}
\tag{32}
$$

$$
\widetilde{b}^{*,0}_{t_k} = S_{t_k} + \overbrace{\frac{\alpha_{t_{k+1}}\mu^{-}_{c}}{\mu^{-}_{c} - \alpha_{t_{k+1}}\mu^{-}_{c^2}}\, I_{t_k} - \frac{\mu^{-}_{cp} - 2\alpha_{t_{k+1}}\mu^{-}_{c^2p}}{2\big[\mu^{-}_{c} - \alpha_{t_{k+1}}\mu^{-}_{c^2}\big]} + \frac{(\Delta_{t_k} + \widetilde{h}_{t_k})\,\mu^{-}_{c}}{2\big[\mu^{-}_{c} - \alpha_{t_{k+1}}\mu^{-}_{c^2}\big]}}^{-\widetilde{L}^{-,*,0}_{t_k}},
\tag{33}
$$

where

$$
\alpha_{t_k} = \alpha_{t_{k+1}} + \sum_{\delta=\pm} \pi^{\delta}_{t_{k+1}}\,\frac{\big(\alpha_{t_{k+1}}\mu^{\delta}_{c}\big)^{2}}{\mu^{\delta}_{c} - \alpha_{t_{k+1}}\mu^{\delta}_{c^2}}, \qquad \widetilde{h}_{t_k} = \sum_{j=k+1}^{N}\Big(\prod_{\ell=k+1}^{j}\xi_{\ell}\Big)\Delta^{t_k}_{t_j},
\tag{34}
$$

and $\xi_k$ is given by Eq. (26), setting $\pi_{t_{k+1}}(1,1) = 0$ therein.
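The recursion (34) and the closed forms (32)-(33) can be traced numerically. The following is a minimal sketch, not the authors' implementation: the names `Side`, `alpha_path`, and `quotes` are hypothetical, the terminal value $\alpha = -\lambda$ is an assumption suggested by the terminal penalty, and all parameter values are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Side:
    """Conditional moments for one side of the book (hypothetical container)."""
    pi: float      # arrival probability of a market order on this side
    mu_c: float    # E[c]
    mu_c2: float   # E[c^2]
    mu_cp: float   # E[c p]
    mu_c2p: float  # E[c^2 p]


def alpha_path(ask, bid, lam, n_steps):
    """Backward recursion (34) for alpha; assumes alpha = -lam at the terminal time."""
    alpha = [0.0] * (n_steps + 1)
    alpha[n_steps] = -lam
    for k in range(n_steps - 1, -1, -1):
        a_next = alpha[k + 1]
        alpha[k] = a_next + sum(
            s.pi * (a_next * s.mu_c) ** 2 / (s.mu_c - a_next * s.mu_c2)
            for s in (ask, bid)
        )
    return alpha


def quotes(s_mid, inventory, a_next, ask, bid, drift=0.0, h_tilde=0.0):
    """Ask and bid prices from (32)-(33), valid when pi(1,1) = 0."""
    def terms(s):
        denom = s.mu_c - a_next * s.mu_c2
        inv = a_next * s.mu_c / denom * inventory          # inventory adjustment
        base = (s.mu_cp - 2.0 * a_next * s.mu_c2p) / (2.0 * denom)
        fwd = (drift + h_tilde) * s.mu_c / (2.0 * denom)   # drift and forecast term
        return inv, base, fwd

    inv_a, base_a, fwd_a = terms(ask)
    inv_b, base_b, fwd_b = terms(bid)
    return s_mid + inv_a + base_a + fwd_a, s_mid + inv_b - base_b + fwd_b


# Illustrative (hypothetical) symmetric parameters and penalty.
sym = Side(pi=0.2, mu_c=100.0, mu_c2=1.0e4, mu_cp=500.0, mu_c2p=5.0e4)
alpha = alpha_path(sym, sym, lam=0.001, n_steps=19800)
ask_px, bid_px = quotes(100.0, 200.0, alpha[1], sym, sym)
```

Computing `alpha` once per day mirrors the "offline" step described for Eq. (25), while `quotes` is the part that would be re-evaluated at every action time with the current inventory and drift estimate.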

Remark 3 (Weaker Conditions For a Positive Spread ($\widetilde{a}^{*,0}_{t_k} > \widetilde{b}^{*,0}_{t_k}$)) Under the symmetry condition (29), it follows from (32)-(33) that

$$
\widetilde{a}^{*,0}_{t_k} - \widetilde{b}^{*,0}_{t_k} = \widetilde{L}^{+,*,0}_{t_k} + \widetilde{L}^{-,*,0}_{t_k} = \frac{\mu^{+}_{cp} - 2\alpha_{t_{k+1}}\mu^{+}_{c^2p}}{2\big(\mu^{+}_{c} - \alpha_{t_{k+1}}\mu^{+}_{c^2}\big)} + \frac{\mu^{-}_{cp} - 2\alpha_{t_{k+1}}\mu^{-}_{c^2p}}{2\big(\mu^{-}_{c} - \alpha_{t_{k+1}}\mu^{-}_{c^2}\big)},
\tag{35}
$$

which is positive, because $\alpha_{t_k} \le 0$ as shown in Lemma 1.

The second term in (32)-(33) is the adjustment for inventory holdings, whose coefficient is negative because $\alpha_{t_{k+1}} < 0$. Here $\alpha_{t_k}$ is close to 0 for most of the day and decreases rapidly to $-\lambda$ near the end. It is also interesting to note that the variance of $c$, which is the random slope in the linear demand function, reduces the shadow cost of inventory because the coefficient of $I_{t_k}$ can be written as:

$$
\frac{\alpha_{t_{k+1}}\mu^{\pm}_{c}}{\mu^{\pm}_{c} - \alpha_{t_{k+1}}\mu^{\pm}_{c^2}} = \frac{\alpha_{t_{k+1}}}{1 - \alpha_{t_{k+1}}\mu^{\pm}_{c} - \alpha_{t_{k+1}}\mathrm{Var}\big(c^{\pm}_{t_{k+1}} \mid \mathcal{F}_{t_k}\big)/\mu^{\pm}_{c}}.
$$

As $\mathrm{Var}(c^{\pm}_{t_{k+1}} \mid \mathcal{F}_{t_k})$ becomes larger, the HFM tends to act less 'aggressively' in order to zero out her inventory; i.e., when the inventory is positive (negative), the second terms of $\widetilde{a}^{*,0}_{t_k}$ and $\widetilde{b}^{*,0}_{t_k}$ in Eqs. (32)-(33) become larger (smaller), thus the ask (bid) placement is not that close to $S_{t_k}$ and the bid (ask) placement is less deep into the book. We can explain this phenomenon as follows. Consider two LOB dynamics with the same value of $\mu^{\pm}_{c}$, but one of those having larger $\mathrm{Var}(c^{\pm}_{t_{k+1}} \mid \mathcal{F}_{t_k})$. Because $c^{\pm} \ge 0$, the book with larger variance will generally have larger demand […] $c^{\pm}_{t_k}$. As a result, more shares of the HFM's LOs will be executed (see Fig. 1) and, hence, the HFM can act less 'aggressively' when attempting to zero out her inventory. The feature just described cannot be captured by linear demand functions with deterministic slope as in Hendershott & Menkveld (2014) and Adrian et al. (2020). We provide further analysis on the sensitivity of the optimal strategy to the inventory cost in Subsection 4.2.

Figure 3: Paths of $\alpha_{t_k}$ and $\widetilde{h}_{t_k}$ in a Symmetric Market. (a) $\alpha_{t_k}$. (b) $\widetilde{h}_{t_k}$. The action times are chosen to be each second from time 0 to 19800 seconds (5.5 hours). $\lambda$ = 0.[…] […] $\mu^{+}_{cp} = \mu^{-}_{cp}$ hold. Specifically, we set $\mu^{\pm}_{c} = 100$, $\mu^{\pm}_{c^2}$ = 1 × […], $\mu^{\pm}_{cp} = 500$, $\mu^{\pm}_{c^2p}$ = 5 × […], $\mu^{\pm}_{c^2p^2}$ = 1 × […], $\pi^{+}_{t_k} = \pi^{-}_{t_k} \equiv$ […], $\pi_{t_k}(1,1) \equiv$ […].

If $t_k$ is far from the terminal time $T$ and the market is reasonably "symmetric", the sequences $\alpha_{t_k}$ and $\widetilde{h}_{t_k}$ defined in Eq. (34) are close to zero most of the time (see Fig. 3). (Here "symmetric" means that the fundamental price process is a martingale and Conditions (29)-(30) of Proposition 2 are satisfied, as well as $\mu^{+}_{cp} = \mu^{-}_{cp}$. Under these conditions, $\widetilde{h}_{t_k} \equiv 0$.) The optimal strategy is then mainly dependent on the second term of $\widetilde{L}^{\pm,*,0}_{t_k}$ and the drift $\Delta_{t_k}$ in the price dynamics of $S_{t_k}$. It follows from (32)-(33) that

$$
\widetilde{a}^{*,0}_{t_k} \approx S_{t_k} + \frac{\mu^{+}_{cp}}{2\mu^{+}_{c}} + \frac{1}{2}\Delta_{t_k} = S_{t_k} + \frac{\mu^{+}_{p}}{2} + \frac{\mathrm{Cov}\big(c^{+}_{t_{k+1}}, p^{+}_{t_{k+1}} \mid \mathcal{F}_{t_k}\big)}{2\mu^{+}_{c}} + \frac{1}{2}\Delta_{t_k},
\tag{36}
$$

$$
\widetilde{b}^{*,0}_{t_k} \approx S_{t_k} - \frac{\mu^{-}_{cp}}{2\mu^{-}_{c}} + \frac{1}{2}\Delta_{t_k} = S_{t_k} - \frac{\mu^{-}_{p}}{2} - \frac{\mathrm{Cov}\big(c^{-}_{t_{k+1}}, p^{-}_{t_{k+1}} \mid \mathcal{F}_{t_k}\big)}{2\mu^{-}_{c}} + \frac{1}{2}\Delta_{t_k},
\tag{37}
$$

where recall that $\mu^{\pm}_{p} = E\big(p^{\pm}_{t_{k+1}} \mid \mathcal{F}_{t_k}, \mathbb{1}^{\pm}_{t_{k+1}} = 1\big)$. The correlation between $c$ and $p$ now plays a key role in the optimal placements. Under the martingale Condition (14) and under Condition (31), the last two terms in the optimal placements become zero. The optimal placements are then near the midpoint between $S_{t_k}$ and $S_{t_k} \pm \mu^{\pm}_{p}$ for most of the time. However, when the correlation between $c$ and $p$ is positive, instead of placing LOs around $S_{t_k} \pm \mu^{\pm}_{p}/2$, the HFM will tend to go deeper into the book. Roughly, a larger realization of $c$ also implies a large value of $p$, resulting in a larger demand function and, hence, greater opportunity for the HFM to obtain better prices for her filled LOs. Another way to understand (36)-(37) is to recall that $c^{\pm}p^{\pm}$ is the y-intercept of the demand functions (see Fig. 1) and, thus, the larger $\mu^{\pm}_{cp}$, the larger the demand function and the deeper the HFM could place her LOs. The discussion above holds for most of the day. However, when $t_k$ gets closer to $T$, the second term of (32)-(33) will play a more important role in the best strategy because $\alpha_{t_{k+1}}$ is no longer close to zero by the end of the day. Hence, the optimal strategy is mostly influenced by the inventory level towards the day's end.

$\pi_{t_k}(1,1) \not\equiv 0$.

The probability $\pi_{t_k}(1,1)$ of simultaneous arrivals of buy and sell MOs during a time step is typically small in high-frequency trading (say, with time steps of 1 second or less). For the empirical analysis conducted in Section 5, we find that $\pi_{t_k}(1,1) \approx 0.05$ for a trading period of 1 second. However, this is no longer the case if the trading frequency is lower (say, with time steps of 5 seconds or more). In that case, it is important to account for the event of joint arrivals. The following corollary sheds light on the optimal placement's behavior under conditions (29)-(31) plus additional conditions (which are reasonably met by our data in Section 5).

Corollary 1 Under Assumptions (29)-(30), the optimal spreads are
• invariant to the local drifts $\{\Delta_{t_k}\}_{k=0,\dots,N}$;
• independent of the inventory level.
Suppose that, in addition to (29)-(30), the condition (31) as well as the following conditions hold:

$$
\mu_{c^2} = \mu_{c}^{2}, \qquad \pi^{\pm}_{t_k} \equiv \pi^{\pm}, \qquad \pi_{t_k}(1,1) \equiv \pi(1,1),
\tag{38}
$$

for some constants $\pi^{\pm} \in (0,1)$ and $\pi(1,1) \in [(\pi^{+} + \pi^{-} - 1) \vee 0,\ \pi^{+} \wedge \pi^{-}]$. Then, the optimal spreads are
• non-decreasing with time and, if $\pi(1,0) = \pi(0,1) = 0$, they are flat throughout the trading horizon;
• decreasing with $\pi(1,1)$ at any given time point.

We prove Corollary 1 in Appendix B.1. While the drift in the mid-price process and the HFM's net inventory position can affect the optimal bid and ask prices at any given time, the optimal spread is invariant to the specific value of the drift and inventory position. It can be seen from Fig. 4 that, as $t_k$ approaches the terminal time $T$, the optimal spread widens due to the penalty placed on the terminal inventory. By widening the optimal spread, the HFM attempts to trade predominantly on one side of the book (say, the sell side if inventory is positive), so as to control the inventory level. As $\pi(1,1)$ increases, the probability of simultaneous arrivals of buy and sell MOs increases, hence providing more opportunities for the HFM to manage her inventory. This is because the positive net position resulting from the execution of sell MOs and the negative net position corresponding to the execution of buy MOs are more likely to cancel each other out when $\pi(1,1)$ is positive. Thus, the HFM tends to narrow the spread to get more LOs filled on both sides of the book and gain larger profit.

Figure 4: Optimal Bid-Ask Spread within a Trading Horizon.

The action times are chosen to be each second from time 0 to 19800 seconds (5.5 hours). $\lambda$ = 0.[…], $\mu^{\pm}_{c} = 100$, $\mu^{\pm}_{p} = 5$, $\mu^{\pm}_{c^2p^2}$ = 1 × […], $\pi^{+} = \pi^{-}$ = 0.[…], and $\pi(1,1)$ ranges from 0 to $\pi^{\pm}$. These parameter values are consistent with the empirical estimates obtained from our data and given in Section 5.

We now generalize the sensitivity analysis of optimal placements on inventory levels developed earlier for the case $\pi(1,1) = 0$. We account for the nonzero probability of joint arrivals, i.e., for $\pi(1,1) > 0$.

Corollary 2

The optimal ask price $\widetilde{a}^{*}_{t_k}$ and bid price $\widetilde{b}^{*}_{t_k}$, as defined in Eq. (27), are strictly decreasing with inventory $I_{t_k}$.

The proof is given in Appendix B.2. Corollary 2 reflects the HFM's ability to control inventory through the optimal placement strategies under a general adapted fundamental price process. When the HFM has a large net long inventory position, she puts ask and bid quotes at lower prices to accelerate selling and dampen buying activities. If instead the inventory position becomes large but net short, she will raise the bid and ask prices to accelerate buying and dampen selling.

Fig. 5 plots the distance of the HFM's optimal ask and bid quotes from the fundamental price $S_{t_k}$ within the last 500 seconds before the end of trading, for different inventory levels. As we mentioned in Subsection 4.1.1, if the market is reasonably "symmetric" and under the assumption that $c$ and $p$ are independent, the agent's optimal ask spread, $\widetilde{a}^{*}_{t_k} - S_{t_k}$, and bid spread, $S_{t_k} - \widetilde{b}^{*}_{t_k}$, are close to $\mu^{+}_{p}/2$ and $\mu^{-}_{p}/2$, respectively, for most of the time, regardless of the inventory level. We remark that $\mu^{\pm}_{p}/2$ […].
• If her inventory level is low (e.g., between −250 and 250 shares), the HFM will widen both bid and ask spreads to decrease both buying and selling as $t_k$ approaches $T$ and, hence, keep the inventory low until the end;
• When her inventory level is high (e.g., outside −250 to 250 shares), the HFM will narrow her ask (bid) spread to facilitate selling (buying) of shares, while widening the bid (ask) spread to dampen buying (selling) with a large positive (negative) net position.
Thus, under the parameter specification used to produce Fig. 5, the inventory levels ±250 shares are boundaries for the different end-of-horizon behaviors as described above.
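The ±250 boundary can be checked against the closed form (32). The sketch below is illustrative rather than the authors' code: it assumes $\pi(1,1) = 0$, zero drift, the independence condition (31), and a second moment $\mu_{c^2} = 10^4$ (an assumed value chosen to be compatible with $\mu_c = 100$); the function names are hypothetical. Under these assumptions the ask spread minus $\mu_p/2$ is proportional to $\alpha\,(2\mu_c I - \mu_{c^2}\mu_p)$, so it changes sign at $I = \mu_{c^2}\mu_p/(2\mu_c)$ for any $\alpha < 0$.

```python
def ask_spread(alpha_next, inventory, mu_c, mu_c2, mu_p):
    """Ask spread from (32) with zero drift, assuming mu_cp = mu_c*mu_p and mu_c2p = mu_c2*mu_p."""
    denom = mu_c - alpha_next * mu_c2
    inv_term = alpha_next * mu_c * inventory / denom
    base_term = (mu_c * mu_p - 2.0 * alpha_next * mu_c2 * mu_p) / (2.0 * denom)
    return inv_term + base_term


def inventory_threshold(mu_c, mu_c2, mu_p):
    """Inventory level at which the ask spread equals mu_p / 2 for every alpha_next < 0."""
    return mu_c2 * mu_p / (2.0 * mu_c)


# Illustrative values: mu_c and mu_p as in the Fig. 5 caption, mu_c2 assumed.
MU_C, MU_C2, MU_P = 100.0, 1.0e4, 5.0
threshold = inventory_threshold(MU_C, MU_C2, MU_P)
for inv in (0.0, threshold, 2.0 * threshold):
    # alpha_next = -0.005 is a hypothetical near-terminal value of alpha.
    print(inv, ask_spread(-0.005, inv, MU_C, MU_C2, MU_P) - MU_P / 2.0)
```

Below the threshold the ask spread exceeds $\mu_p/2$ (the ask quote sits deeper in the book), and above it the spread falls short of $\mu_p/2$, which matches the two end-of-horizon regimes listed above.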

We first consider the case of $\pi_{t_{k+1}}(1,1) = 0$ under a symmetric market and martingale dynamics for the fundamental price process. The following result then characterizes the optimal placements relative to the baseline price level $S_{t_k} \pm \mu_{p}/2$. The proof is given in Appendix B.3.

Figure 5: Optimal Bid and Ask Spreads in the Last 500 Seconds for Various Inventory Levels. The action times are chosen to be each second. We choose the parameters so as to satisfy Conditions (29)-(31) and (38). Specifically, we set $\mu^{\pm}_{c} = 100$, $\mu^{\pm}_{p} = 5$, $\mu^{\pm}_{c^2p^2}$ = 1 × […], $\pi^{+} = \pi^{-}$ = 0.[…], $\pi(1,1) = 0$, and $\lambda$ = 0.[…].

Corollary 3

Assume the market is symmetric (i.e., Conditions (29)-(31) of Proposition 2 hold and $\mu_{p} := \mu^{+}_{p} = \mu^{-}_{p}$), and $\pi_{t_{k+1}}(1,1) = 0$ (i.e., only one type of MO can arrive during each subinterval). Then, under a martingale fundamental price process, there exists a threshold for the inventory level, $I^{\pm} = \pm\frac{\mu_{c^2}\,\mu_{p}}{2\mu_{c}}$, such that the following statements hold for every penalty term $\lambda > 0$:
• When the inventory level $I_{t_k} \in (I^{-}, I^{+})$, the optimal strategy is to place the ask and bid quotes deeper in the LOB relative to the levels $S_{t_k} + \mu_{p}/2$ and $S_{t_k} - \mu_{p}/2$, respectively;
• When the inventory level $I_{t_k} > I^{+}$ ($I_{t_k} < I^{-}$), the optimal strategy is to place the ask (bid) quote closer to $S_{t_k}$ than $S_{t_k} + \mu_{p}/2$ ($S_{t_k} - \mu_{p}/2$), and the bid (ask) quote farther from $S_{t_k}$ than $S_{t_k} - \mu_{p}/2$ ($S_{t_k} + \mu_{p}/2$) into the LOB.

It can be seen from Fig. 6 that if there is no inventory holding cost, the optimal strategy is to keep the ask and bid prices constant throughout the day, no matter how much inventory the HFM holds. With a larger penalty, an HFM with a positive net position will place her bid LOs deeper into the book near the end of the trading horizon to avoid more purchases of stock. For the ask side, she will pick one of three different strategies: (a) place the ask LO further from $S_{t_k}$, (b) place the ask LO closer to $S_{t_k}$, or (c) keep the ask LO at the same price level as earlier in the trading day. According to Corollary 3, the selection between (a), (b), or (c) depends on whether $I_{t_k} < I^{+}$, $I_{t_k} > I^{+}$, or $I_{t_k} = I^{+}$, respectively. Under the parameter specification used to produce Fig. 6, we find that the threshold inventory is $I^{\pm}$ = ±[…]. At the inventory level $I^{\pm}$, the strategy of the HFM is insensitive to the inventory penalty, no matter how large it is. However, when the inventory level reaches 500, the HFM needs to lower the optimal ask price in order to get more ask LOs executed and, hence, lower the inventory level. The higher the inventory, the closer she puts her ask quote to the mid-price.
A similar discussion applies to the case of negative inventory positions, as shown in the right panel of Fig. 6.

However, if we allow for joint arrivals, i.e., $\pi_{t_k}(1,1) > 0$, we can observe significant differences in the optimal strategies. Fig. 7 illustrates that, if $\pi_{t_k}(1,1) > 0$, there does not exist an inventory level such that the optimal prices are flat throughout the trading horizon. Furthermore, for some inventory levels, the optimal strategies are no longer monotonic in time. We observe a valley (peak) pattern in the optimal ask (bid) placement for some positive (negative) intermediate inventory level. Our intuition for this trading pattern is as follows. Consider, for example, the left panel in Fig. 7, where $I_{t_k} = 250$. If there is still enough trading time left, it is of higher priority for the HFM to sell more and lower the inventory level because there will be opportunities to buy later and profit from the roundtrip transaction. However, as the time gets closer to the terminal time $T$, it becomes more important to profit directly from fewer, but wider, roundtrip transactions because there is not enough time left for the market maker to conclude the roundtrip transaction. If buy and sell MOs can arrive simultaneously, roundtrip transactions are more likely to happen within two consecutive actions. Notice that as the penalty $\lambda$ gets larger, the optimal strategy becomes more 'aggressive' because of the stronger incentives to make higher profits and compensate for the cost of holding terminal inventory.

Figure 6: Optimal Strategies in the Last 100 Seconds, for Various Inventory and Penalty Levels. The action times are chosen to be each second. We let $\lambda$ range from 0 to 0.01. We set $\mu^{\pm}_{c} = 100$, $\mu^{\pm}_{p} = 5$, $\mu^{\pm}_{c^2p^2}$ = 1 × […], $\pi^{+} = \pi^{-}$ = 0.[…], $\pi(1,1) = 0$. These parameters satisfy Conditions (29)-(31) and (38). The parameter values are consistent with the estimates obtained from real data in Section 5.

Figure 7: Optimal Strategies in the Last 100 Seconds with Various Inventory and Penalty Levels. The action times are chosen to be every 1 second. $\lambda$ ranges from 0 to 0.[…]. $\mu^{\pm}_{c} = 100$, $\mu^{\pm}_{p} = 5$, $\mu^{\pm}_{c^2p^2}$ = 1 × […], $\pi^{+} = \pi^{-}$ = 0.[…], $\pi(1,1) = 0.05$, and the parameters satisfy Conditions (29)-(31) and (38). The values of the parameters are consistent with the estimates from real data given in Section 5.

This section studies the performance of the optimal strategies derived in Section 3 using real LOB data. We first describe the data set and the parameter estimation procedure. We then present the performance analysis and compare the optimal strategy against "benchmark" strategies that place limit orders at fixed levels in the limit order book.

Data. We use LOB data of the MSFT stock during the year 2019 (252 days in total). Our data set is obtained from Nasdaq TotalView-ITCH 5.0, which is a direct data feed product offered by The Nasdaq Stock Market, LLC. TotalView-ITCH uses a series of event messages to track any change to the state of the LOB. For each message, we observe the timestamp, type, direction, volume, and price. We reconstruct the dynamics of the top 20 levels of the LOB directly from the event message data. We treat each day as an independent sample.

Actions. We assume no latency in the HFM's actions and that the HFM's order is always ahead of the queue of the LOs with the same price in the LOB. We fix the action times for the HFM to be every second of a trading period running from 10:00 a.m. to 3:30 p.m. Thus, the HFM acts 19800 times in a regular trading day. At the beginning of each 1-second subinterval, the HFM places an ask and a bid LO, each of a fixed volume. The volume is set to be 500 shares, roughly matching the average volume of MOs arriving within 1-second intervals. The tick size of MSFT stock is one cent.

Historical Window Size for Parameter Estimation. The parameters plugged into the optimal strategy for the current day are estimated via historical averages over the prior 20 trading days. Recall that those parameters are the arrival probabilities $\pi^{\pm}_{t_{k+1}}$, $\pi_{t_{k+1}}(1,1)$ defined in Eqs. (2)-(3), and the conditional expectations related to $(c^{\pm}_{\cdot}, p^{\pm}_{\cdot})$, defined in Eq. (8). Because there is a total of 252 trading days in year 2019, we compute terminal revenues for 232 days, i.e., starting from the 21st trading day.

Frequencies of MOs. During a typical trading day, sell and buy MOs usually arrive more frequently near the opening or closing of the stock market. To capture this 'U'-shaped intraday pattern, we model the parameters $\pi^{\pm}_{t_{k+1}}$ and $\pi_{t_{k+1}}(1,1)$ defined in Eqs. (2)-(3) as quadratic deterministic functions of the time $t_{k+1}$. More specifically, for the $i$-th trading day, we first compute

$$
\bar{\pi}^{\pm,i}_{t_{k+1}} = \frac{1}{20}\sum_{j=1}^{20} \mathbb{1}^{\pm,i-j}_{t_{k+1}},
\tag{39}
$$

$$
\bar{\pi}^{i}_{t_{k+1}}(1,1) = \frac{1}{20}\sum_{j=1}^{20} \big(\mathbb{1}^{+,i-j}_{t_{k+1}} \cdot \mathbb{1}^{-,i-j}_{t_{k+1}}\big),
\tag{40}
$$

where $\mathbb{1}^{\pm,i}_{t_{k+1}}$ are the MO indicators defined in Eq. (1) for trading day $i$. By conducting a least-squares quadratic fit to the time series of arrival probabilities $\bar{\pi}^{\pm,i}_{t_{k+1}}$ and $\bar{\pi}^{i}_{t_{k+1}}(1,1)$, we obtain estimates of $\pi^{\pm}_{t_{k+1}}$ and $\pi_{t_{k+1}}(1,1)$ for the $i$-th trading day. We denote these estimates by $\hat{\pi}^{\pm,i}_{t_{k+1}}$ and $\hat{\pi}^{i}_{t_{k+1}}(1,1)$. […] $\hat{\pi}^{\pm}_{t_{k+1}}$ and $\hat{\pi}_{t_{k+1}}(1,1)$ […].

Demand Function.

For each 1-second subinterval within a day used to estimate the parameters of the model, we first compute the actual demand at each price level. Suppose the HFM places ask LOs at price level $P_l$ at time $t_k$. At time $t_i \in [t_k, t_{k+1})$, we observe a buy $\mathrm{MO}_i$ with volume $V^{MO}_i$ submitted to the market, and the volume of existing ask LOs in the book with prices lower than $P_l$ at this moment is $V^{LO}_i$. Then the number of shares of the HFM's placement filled by this buy MO equals $(V^{MO}_i - V^{LO}_i) \vee 0$. We compute this quantity for all buy MOs arriving during the interval $[t_k, t_{k+1})$, and use $\sum_i \big((V^{MO}_i - V^{LO}_i) \vee 0\big)$ to quantify the actual demand at price level $P_l$ during $[t_k, t_{k+1})$. The computation on the bid side is symmetric (see the piecewise constant graph in Fig. 9 for an example of the actual demand during a 1-second subinterval). Then we conduct a weighted linear regression on each side of the book, with the actual demand being the response variable and the price level (specifically, its distance to $S_{t_k}$) being the predictor, to estimate $Q^{\pm}_{t_{k+1}}$. We place higher weights on price levels closer to $S_{t_k}$ and smaller weights on price levels deep in the book. Fig. 9 shows the prototypical linear fit to the actual demand function in one subinterval. Fig. 10 plots the estimated time series $(c^{\pm}_{t_k}, p^{\pm}_{t_k})$ throughout a trading day. By virtue of the augmented Dickey-Fuller (ADF) test, all $(c^{\pm}_{\cdot}, p^{\pm}_{\cdot})$-related time series defined in Eq. (8) are reasonably stationary.

We then proceed to estimate the $i$-th day conditional expectations $\mu^{\pm,i}_{\{c,p\}}$ defined in Eq. (8) by averaging the corresponding regression parameters over all subintervals within that day. We denote these estimates by $\hat{\mu}^{\pm,i}_{\{c,p\}}$. Table 1 shows the average of $\hat{\mu}^{\pm,i}_{\{c,p\}}$ over all 252 trading days in 2019. These results suggest that the symmetry assumption imposed on the demand of buy and sell orders (Eq. (29)) and the assumption of independence between $c^{\pm}$ and $p^{\pm}$ (Eq. (31)) are largely satisfied. In our implementation, the estimates of $\mu^{\pm}_{\{c,p\}}$ used to compute the optimal strategies in the $i$-th trading day are obtained by averaging $\hat{\mu}^{\pm,i-j}_{\{c,p\}}$ ($j = 1, \dots, 20$) over the previous 20 training days.

Figure 8: Prototypical Trajectories of MO Frequencies within a Trading Day. During a typical trading day, sell and buy MOs arrive more frequently when the time is closer to the opening or closing of the market. We model $\pi^{\pm}_{t_{k+1}}$ and $\pi_{t_{k+1}}(1,1)$ as quadratic deterministic functions of time.

Figure 9: Prototypical Plot of Actual Demand vs. Estimated Linear Demand over a 1-Second Trading Interval
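The demand-estimation step described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' code: the function names, the sample volumes, and the weighting scheme are hypothetical, and the linear demand is assumed to be parameterized as $c\,(p - L)$, with $L$ the distance of the price level from $S_{t_k}$, so that $c\,p$ is the y-intercept mentioned in the text.

```python
def actual_demand(mo_volumes, lo_volumes_ahead):
    """Sum of (V_MO - V_LO) v 0 over the market orders arriving in one subinterval."""
    return sum(max(v_mo - v_lo, 0) for v_mo, v_lo in zip(mo_volumes, lo_volumes_ahead))


def fit_linear_demand(distances, demands, weights):
    """Weighted least squares for demand = c * (p - distance); returns (c, p).

    Assumes the fitted slope is nonzero; a flat fit would leave p undefined.
    """
    w_sum = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, distances)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, demands)) / w_sum
    s_xy = sum(w * (x - x_bar) * (y - y_bar)
               for w, x, y in zip(weights, distances, demands))
    s_xx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, distances))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    c = -slope          # demand decreases with distance from the fundamental price
    p = intercept / c   # reservation price: distance at which demand hits zero
    return c, p


# Hypothetical subinterval: three buy MOs and the ask volume queued ahead of the HFM.
filled = actual_demand([300, 100, 50], [120, 150, 0])

# Hypothetical demand curve over four price levels, weighted towards the top of the book.
levels = [0.01, 0.02, 0.03, 0.04]
observed = [480.0, 470.0, 430.0, 300.0]
top_heavy = [1.0 / (rank + 1) for rank in range(len(levels))]
c_hat, p_hat = fit_linear_demand(levels, observed, top_heavy)
```

Averaging `c_hat`, `p_hat`, and their products over all subintervals of a day would then give day-level estimates of the moments in Eq. (8), in the spirit of the procedure described in the text.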

Figure 10: Estimated Values of $(c^{\pm}_{t_k}, p^{\pm}_{t_k})$ throughout a Prototypical Trading Day. For each coefficient, the average value is shown as a gray dashed line.

Drift of the Midprice Process. Following standard conventions in the literature, we set the fundamental price $S_{t_k}$ to be the midprice, i.e., the average of the best bid and best ask prices (see also Hendershott & Menkveld (2014)). Recall, from Section 3.2, the definition $\Delta_{t_k} = E(S_{t_{k+1}} - S_{t_k} \mid \mathcal{F}_{t_k})$, where $S_{t_k}$ is the midprice at time $t_k$. Since the optimal strategies are computed using a backward induction algorithm, we need to estimate $\Delta_{t_k}$, and additionally make predictions on future price changes conditioned on the present information (see Eq. (25)). For computational efficiency (see also Remark 4 below for further discussion), we hereafter assume that

$$
\Delta^{t_k}_{t_j} = E(S_{t_{j+1}} - S_{t_j} \mid \mathcal{F}_{t_k}) = 0, \qquad j \ge k+1.
\tag{41}
$$

Under this assumption, Eq. (25) simplifies to

$$
\widetilde{L}^{\pm,*}_{t_k} = L^{\pm,*}_{t_k} \pm \Big(\frac{\pi^{+}_{t_{k+1}}\pi^{-}_{t_{k+1}}}{2\gamma_{t_k}}\big(\alpha_{t_{k+1}}\mu^{\mp}_{c^2} - \mu^{\mp}_{c}\big)\mu^{\pm}_{c} \mp \pi_{t_{k+1}}(1,1)\,\frac{\alpha_{t_{k+1}}}{2\gamma_{t_k}}\,\mu^{+}_{c}\mu^{-}_{c}\,\pi^{\mp}_{t_{k+1}}\mu^{\mp}_{c}\Big)\Delta_{t_k},
$$

where $L^{\pm,*}_{t_k}$ are the optimal spreads defined in Proposition 1. The above expression indicates that we only need to predict the immediate midprice change to compute the optimal strategy. In our implementation with real data, we estimate $\Delta_{t_k}$ by taking the average over the last 5 increments in the midprice:

$$
\hat{\Delta}_{t_k} = \frac{1}{5}\sum_{i=1}^{5}\big(S_{t_{k-i+1}} - S_{t_{k-i}}\big) = \frac{S_{t_k} - S_{t_{k-5}}}{5}.
\tag{42}
$$

In this way, the optimal strategy with $\Delta_{t_k}$ is able to respond quicker to local midprice trends.

Table 1: $\bar{\mu}^{\pm}_{\{c,p\}}$: Average Values of $\hat{\mu}^{\pm,i}_{\{c,p\}}$ over 252 Trading Days in 2019.

| Quantity | Ask side (+) | Bid side (−) |
| $\bar{\mu}_{c}$ | 94.86 | 98.[…] |
| $\bar{\mu}_{p}$ | 3.977 | 3.[…] |
| $\bar{\mu}_{cp}$ | 432.06 | 451.[…] |
| $\bar{\mu}_{c^2}$ | 6.[…] × […] | 3.[…] × […] |
| $\bar{\mu}_{p^2}$ | 17.25 | 17.[…] |
| $\bar{\mu}_{c^2p}$ | 3.[…] × […] | 1.[…] × […] |
| $\bar{\mu}_{c^2p^2}$ | 1.[…] × […] | 1.[…] × […] |

Remark 4

In practice, one could expect $\Delta^{t_k}_{t_j} = E(S_{t_{j+1}} - S_{t_j} \mid \mathcal{F}_{t_k})$ to quickly decrease to 0 as $j$ moves farther away from $k$; otherwise, statistical arbitrage opportunities would appear. Furthermore, the estimation error of the forecasts $\Delta^{t_k}_{t_j}$ increases quickly as $t_j$ moves farther away from $t_k$. Hence, the reduction in the misspecification error (the error in assuming that $\Delta^{t_k}_{t_j} = 0$ when they are not) will be offset by the estimation error of the forecasts $\Delta^{t_k}_{t_j}$. Therefore, in practice, it is better to consider very few steps-ahead forecasts in formula (25). The assumption (41) appears to be a good compromise between accuracy and computational efficiency.

5.2 Results

This section shows the performance of optimal strategies on the MSFT stock during the year 2019. We compute the terminal cash flow $W_T$ and inventory $I_T$ for each trading day by executing the optimal strategy over a time period against the observed market data. Within each subinterval $[t_k, t_{k+1})$, the change in inventory is given by $I_{t_{k+1}} - I_{t_k} = -\widetilde{Q}^{+}_{t_{k+1}} + \widetilde{Q}^{-}_{t_{k+1}}$, where $\widetilde{Q}^{+}_{t_{k+1}}$ and $\widetilde{Q}^{-}_{t_{k+1}}$ are the actual numbers of filled shares in the HFM's placement on the ask and bid side, respectively, computed from transaction data (in the same way as we compute the actual demand described in Section 5.1). The change in cash flows is given by $W_{t_{k+1}} - W_{t_k} = a_{t_k}\widetilde{Q}^{+}_{t_{k+1}} - b_{t_k}\widetilde{Q}^{-}_{t_{k+1}}$, where $a_{t_k}$, $b_{t_k}$ are, respectively, the ask and bid prices implied by the strategy. As a comparison benchmark, we also consider fixed-level strategies, which always quote at some fixed level in the LOB (e.g., always quote at level I, level II, etc.).

Control on Terminal Inventory.

Fig. 11 shows the intraday price and inventory paths of the optimal strategy compared with the 'Level 1'-'Level 6' strategies for a prototypical trading day. As we can see from Fig. 11a, the optimal prices typically swing between levels 2 and 3 in the LOB at the beginning of the trading period. During the last portion of the trading horizon, the optimal ask prices go down from level 3 to level 2, and the optimal bid prices go down from level 3 to level 6. This is the case because, as the HFM gains a positive net position during the trading process (see Fig. 11b), she gradually lowers both her ask and bid prices to buy less and sell more and, hence, to revert the net position towards zero. As shown in Fig. 11b, from 10:00 am to 12:30 pm, the net position becomes positive under each strategy, likely because of the decreasing midprice trend early in the day. However, if the HFM executes according to the optimal strategy, the penalty on the terminal inventory prevents the inventory from exploding and pulls it back close to zero by the end. This shows the effectiveness of the liquidation penalty $-\lambda I_T^2$ in controlling inventory and avoiding large end-of-day costs.

Probability Distribution of Terminal Value.

Table 2 reports the means and standard deviations of the terminal objective values $W_T + S_T I_T - \lambda I_T^2$ under different strategies. 'Level 1'-'Level 6' represent the benchmark strategies that place LOs at a fixed level (i.e., level 1-level 6, respectively) in the LOB. For comparison, Table 3 presents the means and standard deviations of the terminal values $W_T + \bar{S}_T I_T$, computed using the actual average price $\bar{S}_T$ per share that the HFM will get when liquidating her inventory $I_T$ with MOs based on the state of the book at time $T$. We refer to $\bar{S}_T I_T$ as the liquidation proceeds. We do not observe significant differences with the results presented in Table 2. This suggests that the penalty parameter $\lambda$, fixed to be 0.[…] […] $(S_T - \lambda I_T) I_T$ in the objective function matches well the realized average proceeds of liquidating all net positions using MOs at the end of the trading horizon.

Figure 11: The Intraday Price and Inventory Paths of the Optimal Strategy Compared with the Benchmark Strategies for a Prototypical Trading Day. (a) The Intraday Price Paths. (b) The Intraday Inventory Paths. 'Optimal Strategy' corresponds to the optimal strategy under the non-martingale price assumption. 'Level 1'-'Level 6' represent the benchmark strategies that place LOs at a fixed level (i.e., level 1-level 6, respectively) in the LOB. (a) Upper row shows prices on the ask side and lower row shows prices on the bid side. Three columns from left to right represent three 1-minute time windows, which are at the beginning of the trading horizon 10:00 […].

Table 2: Mean and Std. of the Terminal Objective Values $W_T + S_T I_T - \lambda I_T^2$ over 232 Days. We fix $\lambda$ = 0.[…].

| Optimal Strategy | Non-Martingale Fundamental Price and $\pi_{t_k}(1,1) \ge 0$ | […] $\pi_{t_k}(1,1) \ge 0$ | […] $\pi_{t_k}(1,1) \equiv 0$ |
| Mean | […] × […] | […] × […] | […] × […] |
| Std. | 1.[…] × […] | […] × […] | […] × […] |

| Benchmark | Level 1 | Level 2 | Level 3 | Level 4 | Level 5 | Level 6 |
| Mean | −[…] × […] | −[…] × […] | −[…] × […] | −[…] × […] | −[…] × […] | −[…] × […] |
| Std. | 1.[…] × […] | […] × […] | […] × […] | […] × […] | […] × […] | […] × […] |

Table 3: Mean and Std. of the Terminal Values $W_T + \bar{S}_T I_T$ (Terminal Cash Holdings plus Liquidation Proceeds) over 232 Days using Different Strategies.

| Optimal Strategy | Non-Martingale Price and $\pi_{t_k}(1,1) \ge 0$ | […] $\pi_{t_k}(1,1) \ge 0$ | […] $\pi_{t_k}(1,1) \equiv 0$ |
| Mean | […] × […] | […] × […] | […] × […] |
| Std. | 1.[…] × […] | […] × […] | […] × […] |

Based on the results of Table 2 and Table 3, we can conclude that the optimal strategies outperform the fixed level 1-level 6 strategies. With the incorporation of the drift term $\Delta_{t_k}$ in the midprice process, we achieve a higher average and a lower standard deviation of the terminal values.
We observe that allowing for simultaneous arrivals of buy and sell MOs also leads to a higher average of the terminal values. Hereafter we focus on the optimal strategy computed using non-martingale fundamental price dynamics and assuming $\pi_{t_k}(1,1) \ge 0$. The optimal strategy yields a positive average terminal value. However, as shown in Fig. 12, the distributions of terminal values appear to exhibit heavy tails on both sides with kurtosis larger than 14. Such a large kurtosis results in high standard deviation estimates. We use the subsample bootstrap method proposed by Hall & LePage (1996) to construct a confidence interval for the mean. From Fig. 12a, we can see that the 95%-confidence interval for the mean of the terminal objective $W_T + S_T I_T - \lambda I_T^2$ is [−[…] × […], […] × […]]. Fig. 12b shows that the 95%-confidence interval for the mean of the terminal values $W_T + \bar{S}_T I_T$ is [−[…] × […], […] × […]]. We remark, however, that bootstrap-based CIs tend to be highly conservative for heavy-tailed distributions, as shown in the simulations of Peng (2004). In Section 5.3, we will show that these extreme negative revenues are due to large structural breaks over time, and discuss how to identify and potentially exclude atypical days from the analysis.

On some days, the market experiences 'atypical' demand and supply due to various factors (e.g., non-scheduled news arrival, entry of new market participants, etc.), which are not predictable from recent market data. These 'atypical' patterns can result in structural parameter breaks, and constitute the main reason for the extreme negative revenues observed in Section 5.2. In our case, this means that the parameter values estimated based on the last 20 days can differ to a large extent from the actual parameter values of the current trading day when the strategy is implemented.

One key parameter that significantly affects the performance of the optimal strategies is $\mu^{\pm}_{cp}$, defined in Eq. (8). Recall that the value of $c^{\pm}p^{\pm}$ is the y-intercept of the demand functions (see Fig. 1), and a biased estimate of $\mu^{\pm}_{cp}$ can lead to misleading predictions of filled shares near the midprice, which are the most critical ticks.
For each trading day $i$, we therefore compute the difference between the average of the historical estimates of $\mu^{\pm}_{cp}$ based on the 20 past days and the estimate from the current trading day $i$:

$$\mathrm{err}^{\pm,i}_{cp} := \frac{1}{20}\sum_{j=1}^{20} \hat{\mu}^{\pm,i-j}_{cp} - \hat{\mu}^{\pm,i}_{cp},$$

where $\hat{\mu}^{\pm,i}_{cp}$ is defined in Section 5.1. The empirical distributions of $\mathrm{err}^{\pm,i}_{cp}$ are heavy right-tailed, which means that $\mu^{\pm}_{cp}$ are substantially overestimated for some trading days. We therefore identify days when either $\mu^{+}_{cp}$ or $\mu^{-}_{cp}$ is overestimated, and mark days with error larger than the 0.95 quantile of the empirical distributions of $\mathrm{err}^{\pm,i}_{cp}$ as days with a large structural parameter break.

(a) Terminal Objective Values: $W_T + S_T I_T - \lambda I_T^2$ ($\lambda = 0.\ldots$)
(b) Terminal Values: $W_T + \bar{S}_T I_T$ (Cash Holdings + Liquidation Proceeds).

Figure 12:

Histogram of the Terminal Values Obtained From the Optimal Strategy in Year 2019 (232 Trading Days Included).

We compute the terminal values achieved by the optimal strategy for each trading day of the year, starting from the 21st trading day. In each day, we use the prior 20 days to estimate parameters. This results in a total of 232 trading days used to estimate the probability distribution.

| | With all Days | Excluding 'Atypical' Days |
|---|---|---|
| Mean (Std.) | 6 . × (1 . × ) | 1 . × (1 . × ) |
| 95% Confidence Interval of Mean, Normal Approximation | [ − . × , . × ] | [1 . × , . × ] |
| 95% Confidence Interval of Mean, Subsample Bootstrap | [ − . × , . × ] | [2 . × , . × ] |

Table 4: Terminal objective value $W_T + S_T I_T - \lambda I_T^2$. We consider both the inclusion and exclusion of the 22 'Atypical' Days. We set $\lambda = 0.\ldots$

Another key parameter is $\pi_{t_k}(1,1)$. For each trading day $i$, we first compute the historical estimate of $\pi_{t_k}(1,1)$ for the day $i$ as:

$$\bar{\hat{\pi}}^{i}(1,1) = \frac{1}{N}\sum_{k=1}^{N} \hat{\pi}^{i}_{t_{k+1}}(1,1),$$

where $\{\hat{\pi}^{i}_{t_{k+1}}(1,1)\}_{k=0,\ldots,N}$ are the least-squares estimates of $\bar{\pi}^{i}_{t_{k+1}}(1,1)$ defined in (40). We then compute the difference between the historical estimate $\bar{\hat{\pi}}^{i}(1,1)$ and the estimated probability

$$\tilde{\pi}^{i}(1,1) := \sum_{k=0}^{N-1} \big(\mathbf{1}^{+,i}_{t_{k+1}} \cdot \mathbf{1}^{-,i}_{t_{k+1}}\big)/N$$

for day $i$, and set $\mathrm{err}^{i}_{\pi(1,1)} := \bar{\hat{\pi}}^{i}(1,1) - \tilde{\pi}^{i}(1,1)$. We mark days on which $\mathrm{err}^{i}_{\pi(1,1)}$ is greater than the 0.95 quantile of its empirical distribution as days with large structural parameter breaks.

Results after Excluding Days with Large Structural Parameter Break.

Using the criteria described above, we identify days with a large structural break in the estimate of either $\mu^{\pm}_{cp}$ or $\pi_{t_k}(1,1)$. The confidence intervals for the mean of these terminal values only consist of positive values once we exclude 'atypical' trading days. The average of the terminal values is also significantly increased, and their standard deviation significantly reduced. As shown in the histograms of Fig. 13, the selection criterion discussed above effectively excludes days where revenues are extreme and negative.

The confidence intervals are constructed using the standard normal approximation method and the subsample bootstrap method proposed by Hall & LePage (1996), as mentioned in Section 5.2. Peng (2004) shows that the subsample bootstrap method provides a more conservative estimate for the confidence interval of the mean if the distribution is heavy-tailed.

| | With all Days | Excluding 'Atypical' Days |
|---|---|---|
| Mean (Std.) | 6 . × (1 . × ) | 1 . × (1 . × ) |
| 95% Confidence Interval of Mean, Normal Approximation | [ − . × , . × ] | [1 . × , . × ] |
| 95% Confidence Interval of Mean, Subsample Bootstrap | [ − . × , . × ] | [1 . × , . × ] |

Table 5: Terminal value $W_T + \bar{S}_T I_T$ (terminal cash holdings plus liquidation proceeds). We consider both the inclusion and exclusion of the 22 'Atypical' Days.

(a) Terminal Objective Values: $W_T + S_T I_T - \lambda I_T^2$ ($\lambda = 0.\ldots$)
(b) Terminal Values: $W_T + \bar{S}_T I_T$ (Cash Holdings + Liquidation Proceeds)

Figure 13: Histogram of Terminal Values in Year 2019 Before (Left Panels) and After Excluding 'Atypical' Days (Right Panels). We compute the terminal values achieved by the optimal strategy for each trading day of the year, starting from the 21st trading day. For each day, we use the prior 20 days to estimate parameters. This results in a total of 232 trading days, which are used to estimate the probability distribution shown in the left panel. There are 22 out of 232 trading days identified as 'atypical' days. The right panel shows the distribution of terminal values after excluding those 22 'atypical' days.
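The day-selection rule of this section (a rolling 20-day average of past estimates, compared with the current-day estimate and thresholded at the 0.95 empirical quantile) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name `flag_atypical_days`, its arguments, and the use of NumPy's default (linearly interpolated) quantile are our own choices and are not taken from the paper; the input is assumed to be a one-dimensional array of daily estimates such as $\hat{\mu}^{+,i}_{cp}$.

```python
import numpy as np


def flag_atypical_days(daily_estimates, window=20, q=0.95):
    """Flag days whose estimation error exceeds the q-quantile of all errors.

    daily_estimates: 1-D sequence of daily parameter estimates (hypothetical
    input, e.g. one estimate of mu_cp per trading day).
    Returns (err, flags), both aligned with days window, window+1, ...
    """
    x = np.asarray(daily_estimates, dtype=float)
    # err_i = (average of the previous `window` daily estimates) - (estimate of day i)
    err = np.array([x[i - window:i].mean() - x[i] for i in range(window, len(x))])
    # Days with error above the empirical q-quantile are marked as atypical.
    threshold = np.quantile(err, q)
    return err, err > threshold
```

To mirror the rule "either $\mu^{+}_{cp}$ or $\mu^{-}_{cp}$ is overestimated", one would call the function once per series and combine the two flag arrays with a logical OR; the same function could be applied to the errors built from $\pi_{t_k}(1,1)$.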

A Proofs of Section 3

A.1 Proofs of Proposition 1 and Theorem 2.

We prove Proposition 1 and Theorem 2 through four steps:

Step 1.

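We start by proposing the following ansatz for the value function $V_{t_k}$:

$$V_{t_k} = \tilde{v}(t_k, S_{t_k}, W_{t_k}, I_{t_k}) := W_{t_k} + \alpha_{t_k} I_{t_k}^2 + S_{t_k} I_{t_k} + \tilde{h}_{t_k} I_{t_k} + \tilde{g}_{t_k}, \tag{A-1}$$

where $\alpha : \mathcal{T} \to \mathbb{R}$ is a deterministic function defined on $\mathcal{T} = \{t_0, t_1, \ldots, t_{N+1}\}$ (recall that we set $t_{N+1} = T$) and $\{\tilde{h}_t\}_{t \in \mathcal{T}}$, $\{\tilde{g}_t\}_{t \in \mathcal{T}}$ are some processes adapted to the filtration $\{\mathcal{F}_t\}_{t \in \mathcal{T}}$. Since $V_T = W_T + S_T I_T - \lambda I_T^2$, we have the terminal conditions $\alpha_T = -\lambda$, $\tilde{g}_T = 0$, and $\tilde{h}_T = 0$. In what follows, we will use the following notation:

$$\tilde{h}^{t_k}_{t_{k+1}} := \mathbb{E}[\tilde{h}_{t_{k+1}} \mid \mathcal{F}_{t_k}], \qquad \tilde{g}^{t_k}_{t_{k+1}} := \mathbb{E}[\tilde{g}_{t_{k+1}} \mid \mathcal{F}_{t_k}]. \tag{A-2}$$

By plugging Eqs. (10), (11), and (A-1) into the right-hand side of the Bellman equation (13), we get

$$V_{t_k} = W_{t_k} + \sup_{L^{\pm}_{t_k}} \mathbb{E}\Big\{ \sum_{\delta=\pm} (S_{t_k} + \delta L^{\delta}_{t_k})\, \delta\, \mathbf{1}^{\delta}_{t_{k+1}} c^{\delta}_{t_{k+1}} (p^{\delta}_{t_{k+1}} - L^{\delta}_{t_k}) + \alpha_{t_{k+1}} \Big[ I_{t_k} - \sum_{\delta=\pm} \delta\, \mathbf{1}^{\delta}_{t_{k+1}} c^{\delta}_{t_{k+1}} (p^{\delta}_{t_{k+1}} - L^{\delta}_{t_k}) \Big]^2 + S_{t_{k+1}} \Big[ I_{t_k} - \sum_{\delta=\pm} \delta\, \mathbf{1}^{\delta}_{t_{k+1}} c^{\delta}_{t_{k+1}} (p^{\delta}_{t_{k+1}} - L^{\delta}_{t_k}) \Big] + \tilde{h}_{t_{k+1}} \Big[ I_{t_k} - \sum_{\delta=\pm} \delta\, \mathbf{1}^{\delta}_{t_{k+1}} c^{\delta}_{t_{k+1}} (p^{\delta}_{t_{k+1}} - L^{\delta}_{t_k}) \Big] + \tilde{g}_{t_{k+1}} \,\Big|\, \mathcal{F}_{t_k} \Big\}. \tag{A-3}$$

We expand the squares inside the expectation above and arrange the terms as follows:

$$\sum_{\delta=\pm} \mathbf{1}^{\delta}_{t_{k+1}} \Big[ -c^{\delta}_{t_{k+1}} (L^{\delta}_{t_k})^2 + \big(c^{\delta}_{t_{k+1}} p^{\delta}_{t_{k+1}} - \delta c^{\delta}_{t_{k+1}} S_{t_k}\big) L^{\delta}_{t_k} + \delta c^{\delta}_{t_{k+1}} p^{\delta}_{t_{k+1}} S_{t_k} \Big] \tag{A-4}$$

$$+\ \alpha_{t_{k+1}} \Big\{ I_{t_k}^2 + \sum_{\delta=\pm} \mathbf{1}^{\delta}_{t_{k+1}} \Big\{ (c^{\delta}_{t_{k+1}})^2 (L^{\delta}_{t_k})^2 + \big[ 2\delta I_{t_k} c^{\delta}_{t_{k+1}} - 2 (c^{\delta}_{t_{k+1}})^2 p^{\delta}_{t_{k+1}} \big] L^{\delta}_{t_k} + (c^{\delta}_{t_{k+1}} p^{\delta}_{t_{k+1}})^2 - 2\delta I_{t_k} c^{\delta}_{t_{k+1}} p^{\delta}_{t_{k+1}} \Big\} + 2\, \mathbf{1}^{+}_{t_{k+1}} \mathbf{1}^{-}_{t_{k+1}} c^{+}_{t_{k+1}} c^{-}_{t_{k+1}} \big( -L^{+}_{t_k} L^{-}_{t_k} + p^{+}_{t_{k+1}} L^{-}_{t_k} + p^{-}_{t_{k+1}} L^{+}_{t_k} - p^{+}_{t_{k+1}} p^{-}_{t_{k+1}} \big) \Big\} \tag{A-5}$$

$$+\ S_{t_{k+1}} \Big[ I_{t_k} + \sum_{\delta=\pm} \mathbf{1}^{\delta}_{t_{k+1}} \big( -\delta c^{\delta}_{t_{k+1}} p^{\delta}_{t_{k+1}} + \delta c^{\delta}_{t_{k+1}} L^{\delta}_{t_k} \big) \Big] \tag{A-6}$$

$$+\ \tilde{h}_{t_{k+1}} I_{t_k} + \sum_{\delta=\pm} \mathbf{1}^{\delta}_{t_{k+1}} \big( -\delta \tilde{h}_{t_{k+1}} c^{\delta}_{t_{k+1}} p^{\delta}_{t_{k+1}} + \delta \tilde{h}_{t_{k+1}} c^{\delta}_{t_{k+1}} L^{\delta}_{t_k} \big) + \tilde{g}_{t_{k+1}}. \tag{A-7}$$

The conditional expectations of most terms above are easy to compute from the conditions in Assumption 1 and the adaptedness of the processes $\{L^{\pm}_{t_k}\}$, $\{S_{t_k}\}$, and $\{I_{t_k}\}$. For instance, we can easily see that

$$\mathbb{E}\big[ \mathbf{1}^{+}_{t_{k+1}} \mathbf{1}^{-}_{t_{k+1}} c^{+}_{t_{k+1}} c^{-}_{t_{k+1}} p^{+}_{t_{k+1}} p^{-}_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k} \big] = \mathbb{E}\Big[ \mathbf{1}^{+}_{t_{k+1}} \mathbf{1}^{-}_{t_{k+1}}\, \mathbb{E}\big[ c^{+}_{t_{k+1}} p^{+}_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k}, \mathbf{1}^{+}_{t_{k+1}} \mathbf{1}^{-}_{t_{k+1}} \big]\, \mathbb{E}\big[ c^{-}_{t_{k+1}} p^{-}_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k}, \mathbf{1}^{+}_{t_{k+1}} \mathbf{1}^{-}_{t_{k+1}} \big] \,\Big|\, \mathcal{F}_{t_k} \Big] = \mu^{+}_{cp} \mu^{-}_{cp}\, \mathbb{E}\big[ \mathbf{1}^{+}_{t_{k+1}} \mathbf{1}^{-}_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k} \big] = \mu^{+}_{cp} \mu^{-}_{cp}\, \pi_{t_{k+1}}(1,1).$$

For the terms in (A-6), using the conditional independence of $(\mathbf{1}^{\pm}_{t_{k+1}}, c^{\pm}_{t_{k+1}}, p^{\pm}_{t_{k+1}})$ and $S_{t_{k+1}} - S_{t_k}$ given $\mathcal{F}_{t_k}$ stated after (14), we have:

$$\mathbb{E}\big[ (S_{t_{k+1}} - S_{t_k})\, \mathbf{1}^{\delta}_{t_{k+1}} c^{\delta}_{t_{k+1}} p^{\delta}_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k} \big] = \mathbb{E}\big[ S_{t_{k+1}} - S_{t_k} \,\big|\, \mathcal{F}_{t_k} \big]\, \mathbb{E}\big[ \mathbf{1}^{\delta}_{t_{k+1}} c^{\delta}_{t_{k+1}} p^{\delta}_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k} \big] = \Delta_{t_k} \pi^{\delta}_{t_{k+1}} \mu^{\delta}_{cp},$$

and, thus, $\mathbb{E}\big[ S_{t_{k+1}} \mathbf{1}^{\delta}_{t_{k+1}} c^{\delta}_{t_{k+1}} p^{\delta}_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k} \big] = (S_{t_k} + \Delta_{t_k})\, \pi^{\delta}_{t_{k+1}} \mu^{\delta}_{cp}$. Similarly, we can show that $\mathbb{E}\big[ S_{t_{k+1}} \mathbf{1}^{\delta}_{t_{k+1}} c^{\delta}_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k} \big] = (S_{t_k} + \Delta_{t_k})\, \pi^{\delta}_{t_{k+1}} \mu^{\delta}_{c}$. For the terms in (A-7), let us assume for now that:

$$\mathbb{E}\big[ \tilde{h}_{t_{k+1}} \mathbf{1}^{\delta}_{t_{k+1}} c^{\delta}_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k} \big] = \tilde{h}^{t_k}_{t_{k+1}}\, \mathbb{E}\big[ \mathbf{1}^{\delta}_{t_{k+1}} c^{\delta}_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k} \big], \tag{A-8}$$

$$\mathbb{E}\big[ \tilde{h}_{t_{k+1}} \mathbf{1}^{\delta}_{t_{k+1}} c^{\delta}_{t_{k+1}} p^{\delta}_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k} \big] = \tilde{h}^{t_k}_{t_{k+1}}\, \mathbb{E}\big[ \mathbf{1}^{\delta}_{t_{k+1}} c^{\delta}_{t_{k+1}} p^{\delta}_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k} \big]. \tag{A-9}$$

The above identities will be proved below in Step 4.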
Using the previous arguments, we can compute the conditional expectation $\mathbb{E}[\,\cdot \mid \mathcal{F}_{t_k}]$ of the terms in Eqs. (A-4)-(A-7), and plug them into the right-hand side of Eq. (A-3) to get:

$$\alpha_{t_k} I_{t_k}^2 + S_{t_k} I_{t_k} + \tilde{h}_{t_k} I_{t_k} + \tilde{g}_{t_k} = \sup_{L^{\pm}_{t_k}} \sum_{\delta=\pm} \pi^{\delta}_{t_{k+1}} \Big\{ (\alpha_{t_{k+1}} \mu^{\delta}_{c^2} - \mu^{\delta}_{c}) (L^{\delta}_{t_k})^2 + \big[ \mu^{\delta}_{cp} + \delta \tilde{h}^{t_k}_{t_{k+1}} \mu^{\delta}_{c} + \alpha_{t_{k+1}} (2\delta \mu^{\delta}_{c} I_{t_k} - 2\mu^{\delta}_{c^2p}) + \delta \mu^{\delta}_{c} \Delta_{t_k} \big] L^{\delta}_{t_k} + \alpha_{t_{k+1}} (\mu^{\delta}_{c^2p^2} - 2\delta \mu^{\delta}_{cp} I_{t_k}) - \delta \tilde{h}^{t_k}_{t_{k+1}} \mu^{\delta}_{cp} - \delta \mu^{\delta}_{cp} \Delta_{t_k} \Big\} + \alpha_{t_{k+1}} I_{t_k}^2 + 2 \alpha_{t_{k+1}} \pi_{t_{k+1}}(1,1) \big( -\mu^{+}_{c} \mu^{-}_{c} L^{+}_{t_k} L^{-}_{t_k} + \mu^{+}_{cp} \mu^{-}_{c} L^{-}_{t_k} + \mu^{+}_{c} \mu^{-}_{cp} L^{+}_{t_k} - \mu^{+}_{cp} \mu^{-}_{cp} \big) + I_{t_k} (S_{t_k} + \Delta_{t_k}) + \tilde{h}^{t_k}_{t_{k+1}} I_{t_k} + \tilde{g}^{t_k}_{t_{k+1}}. \tag{A-10}$$

Denote the right-hand side of the above equation as $\sup_{L^{\pm}_{t_k}} \tilde{f}(L^{+}_{t_k}, L^{-}_{t_k})$. As we can see, $\tilde{f}(L^{+}_{t_k}, L^{-}_{t_k})$ is a quadratic function of $L^{+}_{t_k}$ and $L^{-}_{t_k}$. Setting the partial derivatives with respect to $L^{+}_{t_k}$ and $L^{-}_{t_k}$, respectively, equal to 0, we have

$$\partial_{L^{+}_{t_k}} \tilde{f} = 2 \pi^{+}_{t_{k+1}} (\alpha_{t_{k+1}} \mu^{+}_{c^2} - \mu^{+}_{c}) L^{+}_{t_k} + \pi^{+}_{t_{k+1}} \big[ \mu^{+}_{cp} + \tilde{h}^{t_k}_{t_{k+1}} \mu^{+}_{c} + \alpha_{t_{k+1}} (2 \mu^{+}_{c} I_{t_k} - 2 \mu^{+}_{c^2p}) + \mu^{+}_{c} \Delta_{t_k} \big] - 2 \alpha_{t_{k+1}} \pi_{t_{k+1}}(1,1) \mu^{+}_{c} \mu^{-}_{c} L^{-}_{t_k} + 2 \alpha_{t_{k+1}} \pi_{t_{k+1}}(1,1) \mu^{+}_{c} \mu^{-}_{cp} = 0,$$

$$\partial_{L^{-}_{t_k}} \tilde{f} = 2 \pi^{-}_{t_{k+1}} (\alpha_{t_{k+1}} \mu^{-}_{c^2} - \mu^{-}_{c}) L^{-}_{t_k} + \pi^{-}_{t_{k+1}} \big[ \mu^{-}_{cp} - \tilde{h}^{t_k}_{t_{k+1}} \mu^{-}_{c} + \alpha_{t_{k+1}} (-2 \mu^{-}_{c} I_{t_k} - 2 \mu^{-}_{c^2p}) - \mu^{-}_{c} \Delta_{t_k} \big] - 2 \alpha_{t_{k+1}} \pi_{t_{k+1}}(1,1) \mu^{+}_{c} \mu^{-}_{c} L^{+}_{t_k} + 2 \alpha_{t_{k+1}} \pi_{t_{k+1}}(1,1) \mu^{-}_{c} \mu^{+}_{cp} = 0.$$
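Solving for $L^{+}_{t_k}$ and $L^{-}_{t_k}$, we get the expressions

$$\tilde{L}^{+,*}_{t_k} = {}^{(1)}\!A^{+}_{t_k} I_{t_k} + {}^{(2)}\!\tilde{A}^{+}_{t_k} + {}^{(3)}\!\tilde{A}^{+}_{t_k}, \qquad \tilde{L}^{-,*}_{t_k} = -{}^{(1)}\!A^{-}_{t_k} I_{t_k} - {}^{(2)}\!\tilde{A}^{-}_{t_k} + {}^{(3)}\!\tilde{A}^{-}_{t_k}, \tag{A-11}$$

where ${}^{(1)}\!A^{\pm}_{t_k}$, ${}^{(2)}\!\tilde{A}^{\pm}_{t_k}$, ${}^{(3)}\!\tilde{A}^{\pm}_{t_k}$ are given as

$${}^{(1)}\!A^{\pm}_{t_k} = \frac{\beta^{\pm}_{t_k} \alpha_{t_{k+1}}}{\gamma_{t_k}}, \qquad {}^{(2)}\!\tilde{A}^{\pm}_{t_k} = \frac{\beta^{\pm}_{t_k} \tilde{h}^{t_k}_{t_{k+1}}}{2 \gamma_{t_k}}, \tag{A-12}$$

$${}^{(3)}\!\tilde{A}^{\pm}_{t_k} = \frac{\pi^{\mp}_{t_{k+1}}}{2 \gamma_{t_k}} (\alpha_{t_{k+1}} \mu^{\mp}_{c^2} - \mu^{\mp}_{c}) \big[ \pi^{\pm}_{t_{k+1}} (\mu^{\pm}_{cp} - 2 \alpha_{t_{k+1}} \mu^{\pm}_{c^2p}) + 2 \pi_{t_{k+1}}(1,1) \alpha_{t_{k+1}} \mu^{\pm}_{c} \mu^{\mp}_{cp} \pm \pi^{\pm}_{t_{k+1}} \Delta_{t_k} \mu^{\pm}_{c} \big] + \frac{\pi_{t_{k+1}}(1,1) \alpha_{t_{k+1}}}{2 \gamma_{t_k}} \mu^{+}_{c} \mu^{-}_{c} \big[ \pi^{\mp}_{t_{k+1}} (\mu^{\mp}_{cp} - 2 \alpha_{t_{k+1}} \mu^{\mp}_{c^2p}) + 2 \alpha_{t_{k+1}} \pi_{t_{k+1}}(1,1) \mu^{\pm}_{cp} \mu^{\mp}_{c} \mp \pi^{\mp}_{t_{k+1}} \Delta_{t_k} \mu^{\mp}_{c} \big]. \tag{A-13}$$

By plugging $\tilde{L}^{\pm,*}_{t_k}$ back into Eq. (A-10) and matching terms with respect to $I_{t_k}$, we obtain the following recursive expressions for $\alpha_{t_k}$, $\tilde{h}_{t_k}$, and $\tilde{g}_{t_k}$:

$$\alpha_{t_k} = \alpha_{t_{k+1}} + \sum_{\delta=\pm} \pi^{\delta}_{t_{k+1}} \big[ (\alpha_{t_{k+1}} \mu^{\delta}_{c^2} - \mu^{\delta}_{c}) ({}^{(1)}\!A^{\delta}_{t_k})^2 + 2 \alpha_{t_{k+1}} \mu^{\delta}_{c}\, ({}^{(1)}\!A^{\delta}_{t_k}) \big] + 2 \alpha_{t_{k+1}} \pi_{t_{k+1}}(1,1) \mu^{+}_{c} \mu^{-}_{c} \big( {}^{(1)}\!A^{+}_{t_k}\, {}^{(1)}\!A^{-}_{t_k} \big), \tag{A-14}$$

$$\tilde{h}_{t_k} = \tilde{h}^{t_k}_{t_{k+1}} + \sum_{\delta=\pm} \pi^{\delta}_{t_{k+1}} \Big\{ 2 (\alpha_{t_{k+1}} \mu^{\delta}_{c^2} - \mu^{\delta}_{c}) \big[ {}^{(1)}\!A^{\delta}_{t_k} \big( (\delta\, {}^{(3)}\!\tilde{A}^{\delta}_{t_k}) + {}^{(2)}\!\tilde{A}^{\delta}_{t_k} \big) \big] + 2 \alpha_{t_{k+1}} \mu^{\delta}_{c} \big( (\delta\, {}^{(3)}\!\tilde{A}^{\delta}_{t_k}) + {}^{(2)}\!\tilde{A}^{\delta}_{t_k} \big) - 2 \alpha_{t_{k+1}} (\delta \mu^{\delta}_{cp}) + (\delta\, {}^{(1)}\!A^{\delta}_{t_k}) \big( \mu^{\delta}_{cp} + (\delta \tilde{h}^{t_k}_{t_{k+1}}) \mu^{\delta}_{c} - 2 \alpha_{t_{k+1}} \mu^{\delta}_{c^2p} \big) \Big\} - 2 \alpha_{t_{k+1}} \pi_{t_{k+1}}(1,1) \mu^{+}_{c} \mu^{-}_{c} \Big[ {}^{(1)}\!A^{+}_{t_k} \big( {}^{(3)}\!\tilde{A}^{-}_{t_k} - {}^{(2)}\!\tilde{A}^{-}_{t_k} \big) - {}^{(1)}\!A^{-}_{t_k} \big( {}^{(2)}\!\tilde{A}^{+}_{t_k} + {}^{(3)}\!\tilde{A}^{+}_{t_k} \big) + \frac{\mu^{+}_{cp}}{\mu^{+}_{c}} ({}^{(1)}\!A^{-}_{t_k}) - \frac{\mu^{-}_{cp}}{\mu^{-}_{c}} ({}^{(1)}\!A^{+}_{t_k}) \Big] + \Delta_{t_k} \big[ {}^{(1)}\!A^{+}_{t_k} \pi^{+}_{t_{k+1}} \mu^{+}_{c} + {}^{(1)}\!A^{-}_{t_k} \pi^{-}_{t_{k+1}} \mu^{-}_{c} + 1 \big], \tag{A-15}$$

$$\tilde{g}_{t_k} = \tilde{g}^{t_k}_{t_{k+1}} + \sum_{\delta=\pm} \pi^{\delta}_{t_{k+1}} \Big[ (\alpha_{t_{k+1}} \mu^{\delta}_{c^2} - \mu^{\delta}_{c}) \big( {}^{(3)}\!\tilde{A}^{\delta}_{t_k} + (\delta\, {}^{(2)}\!\tilde{A}^{\delta}_{t_k}) \big)^2 + \alpha_{t_{k+1}} \mu^{\delta}_{c^2p^2} - (\delta \tilde{h}^{t_k}_{t_{k+1}}) \mu^{\delta}_{cp} + \big( \mu^{\delta}_{cp} + (\delta \tilde{h}^{t_k}_{t_{k+1}}) \mu^{\delta}_{c} - 2 \alpha_{t_{k+1}} \mu^{\delta}_{c^2p} \big) \big( {}^{(3)}\!\tilde{A}^{\delta}_{t_k} + (\delta\, {}^{(2)}\!\tilde{A}^{\delta}_{t_k}) \big) \Big] - 2 \alpha_{t_{k+1}} \pi_{t_{k+1}}(1,1) \mu^{+}_{c} \mu^{-}_{c} \Big[ \big( {}^{(2)}\!\tilde{A}^{+}_{t_k} + {}^{(3)}\!\tilde{A}^{+}_{t_k} \big) \big( {}^{(3)}\!\tilde{A}^{-}_{t_k} - {}^{(2)}\!\tilde{A}^{-}_{t_k} \big) - \frac{\mu^{+}_{cp}}{\mu^{+}_{c}} \big( {}^{(3)}\!\tilde{A}^{-}_{t_k} - {}^{(2)}\!\tilde{A}^{-}_{t_k} \big) - \frac{\mu^{-}_{cp}}{\mu^{-}_{c}} \big( {}^{(2)}\!\tilde{A}^{+}_{t_k} + {}^{(3)}\!\tilde{A}^{+}_{t_k} \big) + \frac{\mu^{+}_{cp} \mu^{-}_{cp}}{\mu^{-}_{c} \mu^{+}_{c}} \Big] + \Delta_{t_k} \Big[ \big( {}^{(3)}\!\tilde{A}^{+}_{t_k} + {}^{(2)}\!\tilde{A}^{+}_{t_k} \big) \pi^{+}_{t_{k+1}} \mu^{+}_{c} - \big( {}^{(3)}\!\tilde{A}^{-}_{t_k} - {}^{(2)}\!\tilde{A}^{-}_{t_k} \big) \pi^{-}_{t_{k+1}} \mu^{-}_{c} - \pi^{+}_{t_{k+1}} \mu^{+}_{cp} + \pi^{-}_{t_{k+1}} \mu^{-}_{cp} \Big]. \tag{A-16}$$

Step 2.

We next prove that $\tilde{L}^{\pm,*}_{t_k}$ is indeed the maximum point of the function $\tilde{f}(L^{+}_{t_k}, L^{-}_{t_k})$. To this end, we will use Lemma 1, which states that $\alpha_{t_k} < 0$. Indeed, for every $t_k$, we have:

$$D = \big( \partial^2_{L^{+}_{t_k}} \tilde{f} \big) \big( \partial^2_{L^{-}_{t_k}} \tilde{f} \big) - \big( \partial^2_{L^{+}_{t_k} L^{-}_{t_k}} \tilde{f} \big)^2 = 4 \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} (\alpha_{t_{k+1}} \mu^{+}_{c^2} - \mu^{+}_{c}) (\alpha_{t_{k+1}} \mu^{-}_{c^2} - \mu^{-}_{c}) - \big[ 2 \pi_{t_{k+1}}(1,1) \alpha_{t_{k+1}} \mu^{+}_{c} \mu^{-}_{c} \big]^2 > 0,$$

$$\partial^2_{L^{+}_{t_k}} \tilde{f} = 2 \pi^{+}_{t_{k+1}} (\alpha_{t_{k+1}} \mu^{+}_{c^2} - \mu^{+}_{c}) < 0.$$

By the second derivative test, $\tilde{f}(L^{+}_{t_k}, L^{-}_{t_k})$ takes its maximum value at $\tilde{L}^{\pm,*}_{t_k}$.

Step 3.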

We now show that (25) holds. Note that, by plugging the expressions of ${}^{(2)}\!\tilde{A}^{\pm}_{t_k}$ and ${}^{(3)}\!\tilde{A}^{\pm}_{t_k}$ given in (A-12)-(A-13) into (A-15), $\tilde{h}_{t_k}$ can be written as

$$\tilde{h}_{t_k} = d_k + \xi_k \big( \tilde{h}^{t_k}_{t_{k+1}} + \Delta_{t_k} \big), \tag{A-17}$$

for some deterministic constant $d_k$ and

$$\xi_k = 1 + \frac{\alpha_{t_{k+1}}}{\gamma_{t_k}} \sum_{\delta=\pm} \pi^{\delta}_{t_{k+1}} \beta^{\delta}_{t_k} \Big\{ \frac{\beta^{\delta}_{t_k}}{\gamma_{t_k}} (\alpha_{t_{k+1}} \mu^{\delta}_{c^2} - \mu^{\delta}_{c}) + 2 \mu^{\delta}_{c} \Big\} + \frac{2 \alpha_{t_{k+1}}^2}{\gamma_{t_k}^2} \pi_{t_{k+1}}(1,1) \mu^{+}_{c} \mu^{-}_{c} \beta^{+}_{t_k} \beta^{-}_{t_k}.$$

Note also that $h_{t_k}$ defined in Eq. (20) can also be written as

$$h_{t_k} = d_k + \xi_k h_{t_{k+1}}, \tag{A-18}$$

where $d_k$, $\xi_k$ are the same as those in (A-17). Since $h_{t_{N+1}} = 0$ and $\tilde{h}_{t_{N+1}} = 0$, for the time point $t_N$, we have that $\tilde{h}_{t_N} = d_N + \xi_N \Delta_{t_N}$ and $h_{t_N} = d_N$. By induction, we get

$$h_{t_k} = \sum_{j=k}^{N} \prod_{\ell=k}^{j-1} \xi_{\ell}\, d_j,$$

where $\prod_{\ell=k}^{k-1} \xi_{\ell} := 1$, and

$$\tilde{h}_{t_k} = \sum_{j=k}^{N} \prod_{\ell=k}^{j-1} \xi_{\ell} \big( d_j + \xi_j \Delta^{t_k}_{t_j} \big) = h_{t_k} + \sum_{j=k}^{N} \prod_{\ell=k}^{j} \xi_{\ell}\, \Delta^{t_k}_{t_j}. \tag{A-19}$$

In particular, we have

$$\tilde{h}^{t_k}_{t_{k+1}} = \mathbb{E}\Big[ h_{t_{k+1}} + \sum_{j=k+1}^{N} \prod_{\ell=k+1}^{j} \xi_{\ell}\, \Delta^{t_{k+1}}_{t_j} \,\Big|\, \mathcal{F}_{t_k} \Big] = h_{t_{k+1}} + \sum_{j=k+1}^{N} \prod_{\ell=k+1}^{j} \xi_{\ell}\, \Delta^{t_k}_{t_j}.$$

Plugging the above expression into ${}^{(2)}\!\tilde{A}^{\pm}_{t_k}$ defined in (A-12) and, then, plugging ${}^{(1)}\!A^{\pm}_{t_k}$, ${}^{(2)}\!\tilde{A}^{\pm}_{t_k}$, and ${}^{(3)}\!\tilde{A}^{\pm}_{t_k}$ into (A-11), we deduce that

$$\tilde{L}^{+,*}_{t_k} = L^{+,*}_{t_k} + \frac{\beta^{+}_{t_k}}{2 \gamma_{t_k}} \Delta_{t_k} + \Big( \frac{\beta^{+}_{t_k}}{2 \gamma_{t_k}} \Big) \sum_{j=k+1}^{N} \prod_{\ell=k+1}^{j} \xi_{\ell}\, \Delta^{t_k}_{t_j}, \qquad \tilde{L}^{-,*}_{t_k} = L^{-,*}_{t_k} - \frac{\beta^{-}_{t_k}}{2 \gamma_{t_k}} \Delta_{t_k} - \Big( \frac{\beta^{-}_{t_k}}{2 \gamma_{t_k}} \Big) \sum_{j=k+1}^{N} \prod_{\ell=k+1}^{j} \xi_{\ell}\, \Delta^{t_k}_{t_j}. \tag{A-20}$$

This proves Proposition 1 and Theorem 2 at once.

Step 4.

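It remains to show the validity of the identities (A-8)-(A-9). First note that the formula (A-19) can be derived directly from Eqs. (A-12)-(A-15), regardless of whether (A-11) holds true or not. Using (A-19), we then have

$$\mathbb{E}\big[ \tilde{h}_{t_{k+1}} \mathbf{1}^{\delta}_{t_{k+1}} c^{\delta}_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k} \big] = \mathbb{E}\Big[ \Big( h_{t_{k+1}} + \sum_{j=k+1}^{N} \prod_{\ell=k+1}^{j} \xi_{\ell}\, \Delta^{t_{k+1}}_{t_j} \Big) \mathbf{1}^{\delta}_{t_{k+1}} c^{\delta}_{t_{k+1}} \,\Big|\, \mathcal{F}_{t_k} \Big] = h_{t_{k+1}} \pi^{\delta}_{t_{k+1}} \mu^{\delta}_{c} + \sum_{j=k+1}^{N} \prod_{\ell=k+1}^{j} \xi_{\ell}\, \mathbb{E}\big[ \Delta^{t_{k+1}}_{t_j} \mathbf{1}^{\delta}_{t_{k+1}} c^{\delta}_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k} \big].$$

Next, using the conditional independence of $(\mathbf{1}^{\pm}_{t_{k+1}}, c^{\pm}_{t_{k+1}}, p^{\pm}_{t_{k+1}})$ and $\{S_{t_{j+1}} - S_{t_j}\}_{j \geq k}$ given $\mathcal{F}_{t_k}$, for $j \geq k$,

$$\mathbb{E}\big[ \Delta^{t_{k+1}}_{t_j} \mathbf{1}^{\delta}_{t_{k+1}} c^{\delta}_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k} \big] = \mathbb{E}\big[ (S_{t_{j+1}} - S_{t_j})\, \mathbf{1}^{\delta}_{t_{k+1}} c^{\delta}_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k} \big] = \mathbb{E}\big[ S_{t_{j+1}} - S_{t_j} \,\big|\, \mathcal{F}_{t_k} \big]\, \mathbb{E}\big[ \mathbf{1}^{\delta}_{t_{k+1}} c^{\delta}_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k} \big] = \Delta^{t_k}_{t_j} \pi^{\delta}_{t_{k+1}} \mu^{\delta}_{c}.$$

We then deduce that $\mathbb{E}\big[ \tilde{h}_{t_{k+1}} \mathbf{1}^{\delta}_{t_{k+1}} c^{\delta}_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k} \big] = \tilde{h}^{t_k}_{t_{k+1}}\, \mathbb{E}\big[ \mathbf{1}^{\delta}_{t_{k+1}} c^{\delta}_{t_{k+1}} \,\big|\, \mathcal{F}_{t_k} \big]$. The proof of (A-9) is the same.

A.2 Proof of Lemma 1

From the terminal condition we have $\alpha_T = -\lambda < 0$. So, we only need to prove that $0 < \alpha_{t_k} / \alpha_{t_{k+1}} < 1$ whenever $\alpha_{t_{k+1}} < 0$. By plugging ${}^{(1)}\!A^{\pm}_{t_k}$ defined in Eq. (18) into Eq. (19), we can write $\alpha_{t_k} / \alpha_{t_{k+1}} = 1 + N_k / D_k$, where

$$N_k = \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \alpha_{t_{k+1}} \big[ (\mu^{+}_{c})^2 \pi^{+}_{t_{k+1}} (\alpha_{t_{k+1}} \mu^{-}_{c^2} - \mu^{-}_{c}) + (\mu^{-}_{c})^2 \pi^{-}_{t_{k+1}} (\alpha_{t_{k+1}} \mu^{+}_{c^2} - \mu^{+}_{c}) \big] - 2 \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \pi_{t_{k+1}}(1,1) \alpha_{t_{k+1}}^2 (\mu^{+}_{c} \mu^{-}_{c})^2,$$

$$D_k = \big[ \pi_{t_{k+1}}(1,1) \alpha_{t_{k+1}} \mu^{+}_{c} \mu^{-}_{c} \big]^2 - \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} (\alpha_{t_{k+1}} \mu^{+}_{c^2} - \mu^{+}_{c}) (\alpha_{t_{k+1}} \mu^{-}_{c^2} - \mu^{-}_{c}).$$

Therefore, it suffices to show that $N_k / D_k \in (-1, 0)$ whenever $\alpha_{t_{k+1}} < 0$. First, we prove that $D_k < 0$ and $N_k > 0$. Indeed, the first term in $D_k$ satisfies:

$$\big[ \pi_{t_{k+1}}(1,1) \alpha_{t_{k+1}} \mu^{+}_{c} \mu^{-}_{c} \big]^2 \leq \alpha_{t_{k+1}}^2 \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} (\mu^{+}_{c} \mu^{-}_{c})^2 \leq \alpha_{t_{k+1}}^2 \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \mu^{+}_{c^2} \mu^{-}_{c^2},$$

by using the facts that $\pi_{t_{k+1}}(1,1) \leq \pi^{+}_{t_{k+1}} \wedge \pi^{-}_{t_{k+1}}$ and $\mu^{\pm}_{c^2} \geq (\mu^{\pm}_{c})^2$. Combined with the second term in $D_k$, we have

$$D_k \leq \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \big[ \alpha_{t_{k+1}} (\mu^{+}_{c^2} \mu^{-}_{c} + \mu^{+}_{c} \mu^{-}_{c^2}) - \mu^{+}_{c} \mu^{-}_{c} \big] < 0, \tag{A-21}$$

since $\alpha_{t_{k+1}} < 0$ and $\mu^{\pm}_{\cdot} \geq 0$. To prove that $N_k > 0$, note that, since $\alpha_{t_{k+1}} < 0$ and $\pi_{t_{k+1}}(1,1) \leq \pi^{+}_{t_{k+1}} \wedge \pi^{-}_{t_{k+1}}$, the first term in $N_k$ satisfies

$$\pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \alpha_{t_{k+1}} \big[ (\mu^{+}_{c})^2 \pi^{+}_{t_{k+1}} (\alpha_{t_{k+1}} \mu^{-}_{c^2} - \mu^{-}_{c}) + (\mu^{-}_{c})^2 \pi^{-}_{t_{k+1}} (\alpha_{t_{k+1}} \mu^{+}_{c^2} - \mu^{+}_{c}) \big] \geq \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \pi_{t_{k+1}}(1,1) \alpha_{t_{k+1}} \big[ (\mu^{+}_{c})^2 (\alpha_{t_{k+1}} \mu^{-}_{c^2} - \mu^{-}_{c}) + (\mu^{-}_{c})^2 (\alpha_{t_{k+1}} \mu^{+}_{c^2} - \mu^{+}_{c}) \big].$$

Combining the formula above with the second term in $N_k$, we have

$$N_k \geq \alpha_{t_{k+1}} \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \pi_{t_{k+1}}(1,1) \big\{ \alpha_{t_{k+1}} (\mu^{+}_{c})^2 [\mu^{-}_{c^2} - (\mu^{-}_{c})^2] + \alpha_{t_{k+1}} (\mu^{-}_{c})^2 [\mu^{+}_{c^2} - (\mu^{+}_{c})^2] - \mu^{+}_{c} \mu^{-}_{c} (\mu^{+}_{c} + \mu^{-}_{c}) \big\} \geq -\alpha_{t_{k+1}} \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \pi_{t_{k+1}}(1,1) \mu^{+}_{c} \mu^{-}_{c} (\mu^{+}_{c} + \mu^{-}_{c}) > 0.$$

The last inequality follows from $\mu^{\pm}_{c^2} \geq (\mu^{\pm}_{c})^2$ and $\alpha_{t_{k+1}} < 0$. Thus $N_k / D_k < 0$, which implies that $\alpha_{t_k}$ is always larger than $\alpha_{t_{k+1}}$ whenever $\alpha_{t_{k+1}} < 0$.

Next we prove that, whenever $\alpha_{t_{k+1}} < 0$, $N_k / D_k > -1$ or, equivalently, $D_k + N_k < 0$. Note that

$$D_k + N_k = \pi_{t_{k+1}}(1,1) (\alpha_{t_{k+1}} \mu^{+}_{c} \mu^{-}_{c})^2 \big( \pi_{t_{k+1}}(1,1) - 2 \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \big) + \alpha_{t_{k+1}} (\pi^{+}_{t_{k+1}})^2 \pi^{-}_{t_{k+1}} (\mu^{+}_{c})^2 (\alpha_{t_{k+1}} \mu^{-}_{c^2} - \mu^{-}_{c}) + \alpha_{t_{k+1}} \pi^{+}_{t_{k+1}} (\pi^{-}_{t_{k+1}})^2 (\mu^{-}_{c})^2 (\alpha_{t_{k+1}} \mu^{+}_{c^2} - \mu^{+}_{c}) - \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} (\alpha_{t_{k+1}} \mu^{+}_{c^2} - \mu^{+}_{c}) (\alpha_{t_{k+1}} \mu^{-}_{c^2} - \mu^{-}_{c}). \tag{A-22}$$

Let us first see $N_k + D_k$ as a linear function of $\mu^{+}_{c^2}$ and note that

$$\partial_{\mu^{+}_{c^2}} (N_k + D_k) = \pi^{+}_{t_{k+1}} (\pi^{-}_{t_{k+1}})^2 \alpha_{t_{k+1}}^2 (\mu^{-}_{c})^2 - \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \alpha_{t_{k+1}}^2 \mu^{-}_{c^2} + \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \alpha_{t_{k+1}} \mu^{-}_{c}$$

$$\leq \pi^{+}_{t_{k+1}} (\pi^{-}_{t_{k+1}})^2 \alpha_{t_{k+1}}^2 \mu^{-}_{c^2} - \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \alpha_{t_{k+1}}^2 \mu^{-}_{c^2} + \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \alpha_{t_{k+1}} \mu^{-}_{c} \tag{A-23}$$

$$= \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \alpha_{t_{k+1}}^2 \mu^{-}_{c^2} (\pi^{-}_{t_{k+1}} - 1) + \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \alpha_{t_{k+1}} \mu^{-}_{c} < 0, \tag{A-24}$$

where (A-23) holds from $\mu^{-}_{c^2} \geq (\mu^{-}_{c})^2$ while (A-24) holds since $\pi^{-}_{t_{k+1}} \leq 1$ and $\alpha_{t_{k+1}} < 0$. Thus $N_k + D_k$ decreases with $\mu^{+}_{c^2}$. Since $\mu^{+}_{c^2} \geq (\mu^{+}_{c})^2$, substituting $\mu^{+}_{c^2}$ with $(\mu^{+}_{c})^2$, we have

$$D_k + N_k \leq \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \alpha_{t_{k+1}} (\mu^{+}_{c})^2 \big[ \pi^{+}_{t_{k+1}} (\alpha_{t_{k+1}} \mu^{-}_{c^2} - \mu^{-}_{c}) - \pi_{t_{k+1}}(1,1) \alpha_{t_{k+1}} (\mu^{-}_{c})^2 \big] + \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \alpha_{t_{k+1}} (\mu^{-}_{c})^2 \big\{ \pi^{-}_{t_{k+1}} [\alpha_{t_{k+1}} (\mu^{+}_{c})^2 - \mu^{+}_{c}] - \pi_{t_{k+1}}(1,1) \alpha_{t_{k+1}} (\mu^{+}_{c})^2 \big\} + \big[ \pi_{t_{k+1}}(1,1) \alpha_{t_{k+1}} \mu^{+}_{c} \mu^{-}_{c} \big]^2 - \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} [\alpha_{t_{k+1}} (\mu^{+}_{c})^2 - \mu^{+}_{c}] (\alpha_{t_{k+1}} \mu^{-}_{c^2} - \mu^{-}_{c}). \tag{A-25}$$

Similarly, the RHS of (A-25) can be seen as a linear decreasing function of $\mu^{-}_{c^2}$, since the coefficient of $\mu^{-}_{c^2}$ is $\pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \alpha_{t_{k+1}}^2 (\mu^{+}_{c})^2 (\pi^{+}_{t_{k+1}} - 1) + \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \alpha_{t_{k+1}} \mu^{+}_{c} < 0$. With the fact that $\mu^{-}_{c^2} \geq (\mu^{-}_{c})^2$, we substitute $\mu^{-}_{c^2}$ with $(\mu^{-}_{c})^2$ in the RHS of (A-25) and get

$$D_k + N_k \leq \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \alpha_{t_{k+1}} (\mu^{+}_{c})^2 \big\{ \pi^{+}_{t_{k+1}} [\alpha_{t_{k+1}} (\mu^{-}_{c})^2 - \mu^{-}_{c}] - \pi_{t_{k+1}}(1,1) \alpha_{t_{k+1}} (\mu^{-}_{c})^2 \big\} + \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \alpha_{t_{k+1}} (\mu^{-}_{c})^2 \big\{ \pi^{-}_{t_{k+1}} [\alpha_{t_{k+1}} (\mu^{+}_{c})^2 - \mu^{+}_{c}] - \pi_{t_{k+1}}(1,1) \alpha_{t_{k+1}} (\mu^{+}_{c})^2 \big\} + \big[ \pi_{t_{k+1}}(1,1) \alpha_{t_{k+1}} \mu^{+}_{c} \mu^{-}_{c} \big]^2 - \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} [\alpha_{t_{k+1}} (\mu^{+}_{c})^2 - \mu^{+}_{c}] [\alpha_{t_{k+1}} (\mu^{-}_{c})^2 - \mu^{-}_{c}]$$

$$= \mu^{+}_{c} \mu^{-}_{c} \big[ (\pi^{+}_{t_{k+1}})^2 \pi^{-}_{t_{k+1}} \alpha_{t_{k+1}} \mu^{+}_{c} (\alpha_{t_{k+1}} \mu^{-}_{c} - 1) - 2 \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \pi_{t_{k+1}}(1,1) \alpha_{t_{k+1}}^2 \mu^{+}_{c} \mu^{-}_{c} + \pi^{+}_{t_{k+1}} (\pi^{-}_{t_{k+1}})^2 \alpha_{t_{k+1}} \mu^{-}_{c} (\alpha_{t_{k+1}} \mu^{+}_{c} - 1) + \pi_{t_{k+1}}(1,1)^2 \alpha_{t_{k+1}}^2 \mu^{+}_{c} \mu^{-}_{c} - \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} (\alpha_{t_{k+1}} \mu^{+}_{c} - 1) (\alpha_{t_{k+1}} \mu^{-}_{c} - 1) \big] =: \mu^{+}_{c} \mu^{-}_{c}\, \ell(\mu^{+}_{c}, \mu^{-}_{c}). \tag{A-26}$$

To prove $D_k + N_k < 0$, we only need to show that $\ell(\mu^{+}_{c}, \mu^{-}_{c}) < 0$. Note that $\ell(\mu^{+}_{c}, \mu^{-}_{c})$ is a linear function in $\mu^{+}_{c}$. The coefficient of $\mu^{+}_{c}$ is

$$\partial_{\mu^{+}_{c}} \ell(\mu^{+}_{c}, \mu^{-}_{c}) = m\big( \pi_{t_{k+1}}(1,1) \big) \times \alpha_{t_{k+1}}^2 \mu^{-}_{c} + \alpha_{t_{k+1}} \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} (1 - \pi^{+}_{t_{k+1}}),$$

where

$$m\big( \pi_{t_{k+1}}(1,1) \big) = (\pi^{+}_{t_{k+1}})^2 \pi^{-}_{t_{k+1}} - 2 \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \pi_{t_{k+1}}(1,1) + \pi^{+}_{t_{k+1}} (\pi^{-}_{t_{k+1}})^2 + \pi_{t_{k+1}}(1,1)^2 - \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}}.$$

For now we assume that $m\big( \pi_{t_{k+1}}(1,1) \big) \leq 0$ for $\pi_{t_{k+1}}(1,1)$ in the range given in Eq. (4), and we will give the proof later. Since $\mu^{-}_{c} \geq 0$, we plug 0 into $\mu^{-}_{c}$ and get $\partial_{\mu^{+}_{c}} \ell(\mu^{+}_{c}, \mu^{-}_{c}) \leq \alpha_{t_{k+1}} \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} (1 - \pi^{+}_{t_{k+1}}) \leq 0$. Thus $\ell(\mu^{+}_{c}, \mu^{-}_{c})$ is non-increasing in $\mu^{+}_{c}$. Since $\mu^{+}_{c} \geq 0$, we have that

$$\ell(\mu^{+}_{c}, \mu^{-}_{c}) \leq \ell(0, \mu^{-}_{c}) = -\pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} (1 - \alpha_{t_{k+1}} \mu^{-}_{c}) - \pi^{+}_{t_{k+1}} (\pi^{-}_{t_{k+1}})^2 \alpha_{t_{k+1}} \mu^{-}_{c} = -\pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} + \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \alpha_{t_{k+1}} \mu^{-}_{c} (1 - \pi^{-}_{t_{k+1}}) < 0.$$

Thus $\ell(\mu^{+}_{c}, \mu^{-}_{c}) < 0$ for $\mu^{\pm}_{c} \geq 0$. Immediately from Eq. (A-26) we have $D_k + N_k < 0$, which implies that $N_k / D_k > -1$.

It remains to prove that $m\big( \pi_{t_{k+1}}(1,1) \big) \leq 0$ for $\pi_{t_{k+1}}(1,1)$ in the range given in Eq. (4). From Eq. (4), we know that $(\pi^{+}_{t_{k+1}} + \pi^{-}_{t_{k+1}} - 1) \vee 0 \leq \pi_{t_{k+1}}(1,1) \leq \pi^{+}_{t_{k+1}} \wedge \pi^{-}_{t_{k+1}}$. Since $m\big( \pi_{t_{k+1}}(1,1) \big)$ is a quadratic function of $\pi_{t_{k+1}}(1,1)$ opening upwards, we only need to check that the values of $m\big( \pi_{t_{k+1}}(1,1) \big)$ at the two end points $(\pi^{+}_{t_{k+1}} + \pi^{-}_{t_{k+1}} - 1) \vee 0$ and $\pi^{+}_{t_{k+1}} \wedge \pi^{-}_{t_{k+1}}$ are non-positive. Without loss of generality, we assume $\pi^{-}_{t_{k+1}} \leq \pi^{+}_{t_{k+1}}$. First we check that $m(\pi^{-}_{t_{k+1}}) \leq 0$:

$$m(\pi^{-}_{t_{k+1}}) = (\pi^{-}_{t_{k+1}})^2 - 2 \pi^{+}_{t_{k+1}} (\pi^{-}_{t_{k+1}})^2 + \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} (\pi^{+}_{t_{k+1}} + \pi^{-}_{t_{k+1}} - 1) = (\pi^{+}_{t_{k+1}} - \pi^{-}_{t_{k+1}})\, \pi^{-}_{t_{k+1}} (\pi^{+}_{t_{k+1}} - 1) \leq 0.$$

Next we check that $m\big( (\pi^{+}_{t_{k+1}} + \pi^{-}_{t_{k+1}} - 1) \vee 0 \big) \leq 0$. If $\pi^{+}_{t_{k+1}} + \pi^{-}_{t_{k+1}} - 1 < 0$, we immediately have $m(0) = \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} (\pi^{+}_{t_{k+1}} + \pi^{-}_{t_{k+1}} - 1) \leq 0$. Otherwise, if $\pi^{+}_{t_{k+1}} + \pi^{-}_{t_{k+1}} - 1 \geq 0$,

$$m(\pi^{+}_{t_{k+1}} + \pi^{-}_{t_{k+1}} - 1) = (\pi^{+}_{t_{k+1}} + \pi^{-}_{t_{k+1}} - 1)^2 - \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} (\pi^{+}_{t_{k+1}} + \pi^{-}_{t_{k+1}} - 1) = (1 - \pi^{-}_{t_{k+1}}) (\pi^{+}_{t_{k+1}})^2 + (1 - \pi^{-}_{t_{k+1}}) (\pi^{-}_{t_{k+1}} - 2)\, \pi^{+}_{t_{k+1}} + (\pi^{-}_{t_{k+1}} - 1)^2 =: n(\pi^{+}_{t_{k+1}}).$$

We can see that $n(\pi^{+}_{t_{k+1}})$ is a quadratic function of $\pi^{+}_{t_{k+1}}$ opening upwards. By the assumptions $\pi^{-}_{t_{k+1}} \leq \pi^{+}_{t_{k+1}} \leq 1$ and $\pi^{+}_{t_{k+1}} + \pi^{-}_{t_{k+1}} - 1 \geq 0$, the range of $\pi^{+}_{t_{k+1}}$ is

$$\begin{cases} 1 - \pi^{-}_{t_{k+1}} \leq \pi^{+}_{t_{k+1}} \leq 1, & 0 \leq \pi^{-}_{t_{k+1}} \leq 0.5, \\ \pi^{-}_{t_{k+1}} \leq \pi^{+}_{t_{k+1}} \leq 1, & 0.5 \leq \pi^{-}_{t_{k+1}} \leq 1. \end{cases}$$

We only need to check that $n(\pi^{+}_{t_{k+1}})$ is non-positive at the boundary:

$$n(1) = (1 - \pi^{-}_{t_{k+1}}) + (1 - \pi^{-}_{t_{k+1}}) (\pi^{-}_{t_{k+1}} - 2) + (\pi^{-}_{t_{k+1}} - 1)^2 = 0.$$

When $0 \leq \pi^{-}_{t_{k+1}} \leq 0.5$,

$$n(1 - \pi^{-}_{t_{k+1}}) = (1 - \pi^{-}_{t_{k+1}})^3 + (1 - \pi^{-}_{t_{k+1}})^2 (\pi^{-}_{t_{k+1}} - 2) + (\pi^{-}_{t_{k+1}} - 1)^2 = 0.$$

When $0.5 \leq \pi^{-}_{t_{k+1}} \leq 1$,

$$n(\pi^{-}_{t_{k+1}}) = (1 - \pi^{-}_{t_{k+1}}) (\pi^{-}_{t_{k+1}})^2 + (1 - \pi^{-}_{t_{k+1}}) (\pi^{-}_{t_{k+1}} - 2)\, \pi^{-}_{t_{k+1}} + (\pi^{-}_{t_{k+1}} - 1)^2 = -(1 - \pi^{-}_{t_{k+1}})^2 (2 \pi^{-}_{t_{k+1}} - 1) \leq 0.$$

Therefore, $m(\pi^{+}_{t_{k+1}} + \pi^{-}_{t_{k+1}} - 1) \leq 0$ when $\pi^{+}_{t_{k+1}} + \pi^{-}_{t_{k+1}} - 1 \geq 0$. This completes the proof of the claim $m\big( \pi_{t_{k+1}}(1,1) \big) \leq 0$ for $\pi_{t_{k+1}}(1,1)$ in the range given in Eq. (4).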
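As a numerical sanity check of Lemma 1, the ratio $\alpha_{t_k} / \alpha_{t_{k+1}} = 1 + N_k / D_k$ can be evaluated directly from the expressions for $N_k$ and $D_k$ above. The following Python sketch is purely illustrative: the function names, argument names, and the simplifying assumption that the arrival probabilities and moments are constant over time are ours, not the paper's.

```python
def alpha_ratio(alpha_next, pi_plus, pi_minus, pi11,
                muc_plus, muc_minus, muc2_plus, muc2_minus):
    """Return alpha_{t_k} / alpha_{t_{k+1}} = 1 + N_k / D_k.

    muc_* stands for mu_c^{+/-} and muc2_* for mu_{c^2}^{+/-}; pi11 is pi(1,1).
    """
    a_plus = alpha_next * muc2_plus - muc_plus      # alpha * mu_{c^2}^+ - mu_c^+
    a_minus = alpha_next * muc2_minus - muc_minus   # alpha * mu_{c^2}^- - mu_c^-
    n_k = (pi_plus * pi_minus * alpha_next
           * (muc_plus ** 2 * pi_plus * a_minus + muc_minus ** 2 * pi_minus * a_plus)
           - 2 * pi_plus * pi_minus * pi11 * alpha_next ** 2 * (muc_plus * muc_minus) ** 2)
    d_k = (pi11 * alpha_next * muc_plus * muc_minus) ** 2 - pi_plus * pi_minus * a_plus * a_minus
    return 1 + n_k / d_k


def alpha_path(lam, n_steps, **params):
    """Backward recursion from alpha_T = -lam; returns [alpha_{t_0}, ..., alpha_T]."""
    alphas = [-lam]
    for _ in range(n_steps):
        alphas.append(alphas[-1] * alpha_ratio(alphas[-1], **params))
    return alphas[::-1]
```

For parameter values satisfying the constraints of the lemma (for example $\pi^{\pm} = 0.3$, $\pi(1,1) = 0.1$, $\mu^{\pm}_{c} = 1$, $\mu^{\pm}_{c^2} = 1.5$), every ratio returned by `alpha_ratio` should lie in $(0, 1)$, so the path produced by `alpha_path` stays negative and decreases toward $-\lambda$ as time approaches $T$.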

A.3 Proof of Theorem 1 (Verification Theorem)

Throughout, $W_{t_i}$, $I_{t_i}$, for $i = k, \ldots, N+1$, are the cash holding and inventory processes resulting from adopting an admissible placement strategy $L^{\pm}_{t_i}$, $i = k, \ldots, N$. In contrast, for $i = k+1, \ldots, N+1$, $W^{*}_{t_i}$, $I^{*}_{t_i}$ and $\widehat{W}_{t_i}$, $\widehat{I}_{t_i}$ are respectively the resulting cash holding and inventory processes starting from initial states $W_{t_k}$, $I_{t_k}$, when setting $L^{\pm}_{t_i} = L^{\pm,*}_{t_i}$ and $L^{\pm}_{t_i} = \widehat{L}^{\pm}_{t_i}$, for some arbitrary admissible placement strategy $\widehat{L}^{\pm}_{t_i}$. First note that, for an arbitrary admissible placement strategy $L^{\pm}_{t_i}$, $\{v(t_i, S_{t_i}, W_{t_i}, I_{t_i})\}_{i=k,\ldots,N+1}$ is a supermartingale since

$$\mathbb{E}\big[ v(t_{i+1}, S_{t_{i+1}}, W_{t_{i+1}}, I_{t_{i+1}}) \mid \mathcal{F}_{t_i} \big] \leq \sup_{\widehat{L}^{\pm}_{t_i}} \mathbb{E}\big[ v(t_{i+1}, S_{t_{i+1}}, \widehat{W}_{t_{i+1}}, \widehat{I}_{t_{i+1}}) \mid \mathcal{F}_{t_i} \big] = v(t_i, S_{t_i}, W_{t_i}, I_{t_i}). \tag{A-27}$$

The last equality follows from (16) and Proposition 1. That is, $\alpha_{t_k}$, $h_{t_k}$, $g_{t_k}$ in $v(t_k, s, w, i) = w + \alpha_{t_k} i^2 + s i + h_{t_k} i + g_{t_k}$ are picked in order for (16) to hold true. We then have that

$$v(t_k, S_{t_k}, W_{t_k}, I_{t_k}) \geq \sup_{(L^{\pm}_{t_i})_{k \leq i \leq N}} \mathbb{E}\big[ v(T, S_T, W_T, I_T) \mid \mathcal{F}_{t_k} \big] = \sup_{(L^{\pm}_{t_i})_{k \leq i \leq N}} \mathbb{E}\big[ W_T + S_T I_T - \lambda I_T^2 \mid \mathcal{F}_{t_k} \big] = V_{t_k}. \tag{A-28}$$

The first equality in Eq. (A-28) holds because $v(T, S_T, W_T, I_T) = W_T + S_T I_T - \lambda I_T^2$ by the terminal conditions $\alpha_T = -\lambda$, $g_T = 0$, $h_T = 0$.

Next we prove that $v(t_k, S_{t_k}, W_{t_k}, I_{t_k}) \leq V_{t_k}$. To this end, recall from Proposition 1 that we pick $\alpha_{t_k}$, $h_{t_k}$, and $g_{t_k}$ so that

$$v(t_i, S_{t_i}, W^{*}_{t_i}, I^{*}_{t_i}) = \mathbb{E}\big[ v(t_{i+1}, S_{t_{i+1}}, W^{*}_{t_{i+1}}, I^{*}_{t_{i+1}}) \mid \mathcal{F}_{t_i} \big], \qquad i = k, \ldots, N.$$

Hence, by induction,

$$v(t_k, S_{t_k}, W_{t_k}, I_{t_k}) = v(t_k, S_{t_k}, W^{*}_{t_k}, I^{*}_{t_k}) = \mathbb{E}\big[ v(t_{N+1}, S_{t_{N+1}}, W^{*}_{t_{N+1}}, I^{*}_{t_{N+1}}) \mid \mathcal{F}_{t_k} \big] = \mathbb{E}\big[ W^{*}_T + S_T I^{*}_T - \lambda (I^{*}_T)^2 \mid \mathcal{F}_{t_k} \big].$$

It also trivially follows that

$$\mathbb{E}\big[ W^{*}_T + S_T I^{*}_T - \lambda (I^{*}_T)^2 \mid \mathcal{F}_{t_k} \big] \leq \sup_{(L^{\pm}_{t_i})_{k \leq i \leq N}} \mathbb{E}\big[ W_T + S_T I_T - \lambda I_T^2 \mid \mathcal{F}_{t_k} \big] = V_{t_k}.$$

We then conclude that $v(t_k, S_{t_k}, W_{t_k}, I_{t_k}) \leq V_{t_k}$, which combined with (A-28) implies that $v(t_k, S_{t_k}, W_{t_k}, I_{t_k}) = V_{t_k}$.

A.4 Proof of Proposition 2 (Conditions for Positive Spread)

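We first prove the result under the martingale condition (14). By Eq. (17), we need to prove that

$$L^{+,*}_{t_k} + L^{-,*}_{t_k} = \big( {}^{(1)}\!A^{+}_{t_k} - {}^{(1)}\!A^{-}_{t_k} \big) I_{t_k} + \big( {}^{(2)}\!A^{+}_{t_k} - {}^{(2)}\!A^{-}_{t_k} \big) + \big( {}^{(3)}\!A^{+}_{t_k} + {}^{(3)}\!A^{-}_{t_k} \big) > 0.$$

Under the Conditions (29)-(30) in Proposition 2, it is easy to see that

$$\beta^{+}_{t_k} - \beta^{-}_{t_k} = \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \alpha_{t_{k+1}} (\mu^{+}_{c} \mu^{-}_{c^2} - \mu^{-}_{c} \mu^{+}_{c^2}) - \pi_{t_{k+1}}(1,1) \alpha_{t_{k+1}} \mu^{-}_{c} \mu^{+}_{c} (\pi^{-}_{t_{k+1}} \mu^{-}_{c} - \pi^{+}_{t_{k+1}} \mu^{+}_{c}) = 0.$$

This directly implies that ${}^{(1)}\!A^{+}_{t_k} - {}^{(1)}\!A^{-}_{t_k} = 0$ and ${}^{(2)}\!A^{+}_{t_k} - {}^{(2)}\!A^{-}_{t_k} = 0$. We now proceed to show that ${}^{(3)}\!A^{+}_{t_k} + {}^{(3)}\!A^{-}_{t_k} > 0$. To this end, first note that, as shown in Eq. (A-21) ($D_k = \gamma_{t_k}$), the denominator $\gamma_{t_k}$ of ${}^{(3)}\!A^{+}_{t_k} + {}^{(3)}\!A^{-}_{t_k}$ is negative. So, it remains to show that the numerator of ${}^{(3)}\!A^{+}_{t_k} + {}^{(3)}\!A^{-}_{t_k}$ is also negative. By Condition (31) in Proposition 2 (i.e., $\mu^{\pm}_{cp} = \mu^{\pm}_{c} \mu^{\pm}_{p}$ and $\mu^{\pm}_{c^2p} = \mu^{\pm}_{c^2} \mu^{\pm}_{p}$), the numerator of ${}^{(3)}\!A^{+}_{t_k} + {}^{(3)}\!A^{-}_{t_k}$ can be written as

$$\mathcal{N}\big( {}^{(3)}\!A^{+}_{t_k} + {}^{(3)}\!A^{-}_{t_k} \big) = \Big\{ \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} (\alpha_{t_{k+1}} \mu^{-}_{c^2} - \mu^{-}_{c}) (\mu^{+}_{c} - 2 \alpha_{t_{k+1}} \mu^{+}_{c^2}) + 2 \big[ \alpha_{t_{k+1}} \pi_{t_{k+1}}(1,1) \mu^{+}_{c} \mu^{-}_{c} \big]^2 - \pi^{+}_{t_{k+1}} \pi_{t_{k+1}}(1,1) \alpha_{t_{k+1}} (\mu^{+}_{c})^2 \mu^{-}_{c} \Big\} \mu^{+}_{p} + \Big\{ \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} (\alpha_{t_{k+1}} \mu^{+}_{c^2} - \mu^{+}_{c}) (\mu^{-}_{c} - 2 \alpha_{t_{k+1}} \mu^{-}_{c^2}) + 2 \big[ \alpha_{t_{k+1}} \pi_{t_{k+1}}(1,1) \mu^{+}_{c} \mu^{-}_{c} \big]^2 - \pi^{-}_{t_{k+1}} \pi_{t_{k+1}}(1,1) \alpha_{t_{k+1}} (\mu^{-}_{c})^2 \mu^{+}_{c} \Big\} \mu^{-}_{p}.$$

We can then show that the coefficient of $\mu^{+}_{p}$ is negative. To wit, denote the coefficient of $\mu^{+}_{p}$ as $r(\mu^{+}_{c^2}, \mu^{-}_{c^2})$, which is a linear function of $\mu^{-}_{c^2}$ with coefficient $\pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \alpha_{t_{k+1}} (\mu^{+}_{c} - 2 \alpha_{t_{k+1}} \mu^{+}_{c^2}) < 0$. Since $\mu^{-}_{c^2} \geq (\mu^{-}_{c})^2$, we have that

$$r(\mu^{+}_{c^2}, \mu^{-}_{c^2}) \leq r(\mu^{+}_{c^2}, (\mu^{-}_{c})^2) = \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} [\alpha_{t_{k+1}} (\mu^{-}_{c})^2 - \mu^{-}_{c}] (\mu^{+}_{c} - 2 \alpha_{t_{k+1}} \mu^{+}_{c^2}) + 2 \big[ \alpha_{t_{k+1}} \pi_{t_{k+1}}(1,1) \mu^{+}_{c} \mu^{-}_{c} \big]^2 - \pi^{+}_{t_{k+1}} \pi_{t_{k+1}}(1,1) \alpha_{t_{k+1}} (\mu^{+}_{c})^2 \mu^{-}_{c}.$$

Similarly, $r(\mu^{+}_{c^2}, (\mu^{-}_{c})^2)$ is linear in $\mu^{+}_{c^2}$ with coefficient $-2 \alpha_{t_{k+1}} \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} [\alpha_{t_{k+1}} (\mu^{-}_{c})^2 - \mu^{-}_{c}] < 0$. Since $\mu^{+}_{c^2} \geq (\mu^{+}_{c})^2$, we have that

$$r(\mu^{+}_{c^2}, (\mu^{-}_{c})^2) \leq r((\mu^{+}_{c})^2, (\mu^{-}_{c})^2) = \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} [\alpha_{t_{k+1}} (\mu^{-}_{c})^2 - \mu^{-}_{c}] [\mu^{+}_{c} - 2 \alpha_{t_{k+1}} (\mu^{+}_{c})^2] + 2 \big[ \alpha_{t_{k+1}} \pi_{t_{k+1}}(1,1) \mu^{+}_{c} \mu^{-}_{c} \big]^2 - \pi^{+}_{t_{k+1}} \pi_{t_{k+1}}(1,1) \alpha_{t_{k+1}} (\mu^{+}_{c})^2 \mu^{-}_{c}$$

$$= \mu^{+}_{c} \mu^{-}_{c} \big\{ [2 \alpha_{t_{k+1}}^2 \pi_{t_{k+1}}(1,1)^2 - 2 \alpha_{t_{k+1}}^2 \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}}] \mu^{+}_{c} \mu^{-}_{c} + 2 \alpha_{t_{k+1}} \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \mu^{+}_{c} + \pi^{+}_{t_{k+1}} \alpha_{t_{k+1}} [\pi^{-}_{t_{k+1}} \mu^{-}_{c} - \pi_{t_{k+1}}(1,1) \mu^{+}_{c}] - \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \big\}.$$

By Eq. (4) we have $\pi_{t_{k+1}}(1,1)^2 \leq \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}}$ and by Lemma 1 we have $\alpha_{t_{k+1}} < 0$; thus the sum of the first two terms in the brackets above (i.e., $[2 \alpha_{t_{k+1}}^2 \pi_{t_{k+1}}(1,1)^2 - 2 \alpha_{t_{k+1}}^2 \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}}] \mu^{+}_{c} \mu^{-}_{c} + 2 \alpha_{t_{k+1}} \pi^{+}_{t_{k+1}} \pi^{-}_{t_{k+1}} \mu^{+}_{c}$) is negative. Under the Condition (29), the third term in the brackets (i.e., $\pi^{+}_{t_{k+1}} \alpha_{t_{k+1}} [\pi^{-}_{t_{k+1}} \mu^{-}_{c} - \pi_{t_{k+1}}(1,1) \mu^{+}_{c}]$) can be written as $\pi^{+}_{t_{k+1}} \alpha_{t_{k+1}} (\pi^{-}_{t_{k+1}} - \pi_{t_{k+1}}(1,1)) \mu_{c} \leq 0$. Thus the coefficient of $\mu^{+}_{p}$ in $\mathcal{N}\big( {}^{(3)}\!A^{+}_{t_k} + {}^{(3)}\!A^{-}_{t_k} \big)$ (i.e., $r(\mu^{+}_{c^2}, \mu^{-}_{c^2})$) is negative. Similarly, the coefficient of $\mu^{-}_{p}$ is also negative. Therefore $\mathcal{N}\big( {}^{(3)}\!A^{+}_{t_k} + {}^{(3)}\!A^{-}_{t_k} \big) < 0$ and ${}^{(3)}\!A^{+}_{t_k} + {}^{(3)}\!A^{-}_{t_k} > 0$.

In the non-martingale case, we again have $\beta^{+}_{t_k} - \beta^{-}_{t_k} = 0$ and, by (25), we have: $\tilde{L}^{+,*}_{t_k} + \tilde{L}^{-,*}_{t_k} = L^{+,*}_{t_k} + L^{-,*}_{t_k}$.

B Proofs of Section 4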

B.1 Proof of Corollary 1

Under the Conditions (29)-(30), it is easy to check that $\beta^{+}_{t_k} = \beta^{-}_{t_k}$. From (25), we can then easily see that the optimal spreads, denoted hereafter $\mathrm{Sprd}_{t_k}$, are the same under the martingale and non-martingale midprice cases. Furthermore,

$$\mathrm{Sprd}_{t_k} = L^{+,*}_{t_k} + L^{-,*}_{t_k} = \tilde{L}^{+,*}_{t_k} + \tilde{L}^{-,*}_{t_k} = {}^{(3)}\!A^{+}_{t_k} + {}^{(3)}\!A^{-}_{t_k}, \tag{B-1}$$

which proves that the optimal spreads are independent of the inventory level and the local drifts $\{\Delta_{t_k}\}_{k=0,\ldots,N}$. If we further assume Condition (31) and Condition (38), the optimal spread can be written as

$$\mathrm{Sprd}_{t_k} = \frac{\big[ \pi (\mu_{c} - 2 \alpha_{t_{k+1}} \mu_{c^2}) + 2 \alpha_{t_{k+1}} \pi(1,1) \mu_{c}^2 \big] (\mu^{+}_{p} + \mu^{-}_{p})}{2 \big[ \pi(1,1) \alpha_{t_{k+1}} \mu_{c}^2 - \pi (\alpha_{t_{k+1}} \mu_{c^2} - \mu_{c}) \big]}, \tag{B-2}$$

where $\pi = \pi^{\pm}$. We show that $\mathrm{Sprd}_{t_k}$ is non-decreasing with time by checking that the difference between $\mathrm{Sprd}_{t_k}$ and $\mathrm{Sprd}_{t_{k-1}}$ is non-negative:

$$\mathrm{Sprd}_{t_k} - \mathrm{Sprd}_{t_{k-1}} = \frac{\mu^{+}_{p} + \mu^{-}_{p}}{2} \cdot \frac{(\alpha_{t_{k+1}} - \alpha_{t_k})\, \pi \mu_{c} \big( \pi(1,1) \mu_{c}^2 - \pi \mu_{c^2} \big)}{\prod_{\ell=k,k+1} \big[ \pi(1,1) \alpha_{t_{\ell}} \mu_{c}^2 - \pi (\alpha_{t_{\ell}} \mu_{c^2} - \mu_{c}) \big]}. \tag{B-3}$$

First, we show that the denominator is positive. Since $\alpha_{t_k}$ is negative and, by definition, $0 \leq \pi(1,1) \leq \pi$ and $0 < \mu_{c}^2 \leq \mu_{c^2}$, we have $\pi(1,1) \alpha_{t_{\ell}} \mu_{c}^2 - \pi (\alpha_{t_{\ell}} \mu_{c^2} - \mu_{c}) \geq \pi \alpha_{t_{\ell}} \mu_{c^2} - \pi (\alpha_{t_{\ell}} \mu_{c^2} - \mu_{c}) = \pi \mu_{c} > 0$. The numerator is non-negative since $\alpha_{t_k}$ is decreasing with time and $\pi(1,1) \mu_{c}^2 \leq \pi \mu_{c^2}$. Thus, $\mathrm{Sprd}_{t_k} - \mathrm{Sprd}_{t_{k-1}} \geq 0$. In particular, if $\pi(1,1) \mu_{c}^2 = \pi \mu_{c^2}$, then $\mathrm{Sprd}_{t_k} - \mathrm{Sprd}_{t_{k-1}} = 0$.

To show that, at a fixed time point, the optimal spreads decrease with $\pi(1,1)$, we compute

$$\partial_{\pi(1,1)} \mathrm{Sprd}_{t_k} = \frac{\alpha_{t_{k+1}} \mu_{c}^2 (\mu^{+}_{p} + \mu^{-}_{p}) \big[ \pi(1,1) \alpha_{t_{k+1}} \mu_{c}^2 - \pi (\alpha_{t_{k+1}} \mu_{c^2} - \mu_{c}) \big]}{\big[ \pi(1,1) \alpha_{t_{k+1}} \mu_{c}^2 - \pi (\alpha_{t_{k+1}} \mu_{c^2} - \mu_{c}) \big]^2} - \frac{\alpha_{t_{k+1}} \mu_{c}^2 \big[ \pi (\mu_{c} - 2 \alpha_{t_{k+1}} \mu_{c^2}) + 2 \alpha_{t_{k+1}} \pi(1,1) \mu_{c}^2 \big] (\mu^{+}_{p} + \mu^{-}_{p})}{2 \big[ \pi(1,1) \alpha_{t_{k+1}} \mu_{c}^2 - \pi (\alpha_{t_{k+1}} \mu_{c^2} - \mu_{c}) \big]^2} = \frac{\alpha_{t_{k+1}} \pi \mu_{c}^3 (\mu^{+}_{p} + \mu^{-}_{p})}{2 \big[ \pi(1,1) \alpha_{t_{k+1}} \mu_{c}^2 - \pi (\alpha_{t_{k+1}} \mu_{c^2} - \mu_{c}) \big]^2} < 0.$$

This completes the proof of Corollary 1.
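The two monotonicity statements of Corollary 1 can be checked numerically from the closed form (B-2). The sketch below is only an illustration under the symmetric assumptions of the corollary; the function name `optimal_spread` and its argument names are our own, and `mu_p_sum` stands for $\mu^{+}_{p} + \mu^{-}_{p}$.

```python
def optimal_spread(alpha_next, pi, pi11, mu_c, mu_c2, mu_p_sum):
    """Evaluate the spread formula (B-2) for given alpha_{t_{k+1}} < 0.

    pi is the common arrival probability pi^+ = pi^-, pi11 is pi(1,1),
    mu_c and mu_c2 are the first and second moments of the demand slope.
    """
    numerator = (pi * (mu_c - 2 * alpha_next * mu_c2)
                 + 2 * alpha_next * pi11 * mu_c ** 2) * mu_p_sum
    denominator = 2 * (pi11 * alpha_next * mu_c ** 2
                       - pi * (alpha_next * mu_c2 - mu_c))
    return numerator / denominator
```

Holding the other inputs fixed, increasing `pi11` should lower the returned spread, and a more negative `alpha_next` (that is, a later time point, since $\alpha_{t_k}$ decreases with time) should not lower it, in line with the two claims proved above.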

B.2 Proof of Corollary 2

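Recall

$$\widetilde{a}^{*}_{t_k} = S_{t_k} + {}^{(1)}A^{+}_{t_k} I_{t_k} + {}^{(2)}\widetilde{A}^{+}_{t_k} + {}^{(3)}\widetilde{A}^{+}_{t_k}, \qquad \widetilde{b}^{*}_{t_k} = S_{t_k} + {}^{(1)}A^{-}_{t_k} I_{t_k} + {}^{(2)}\widetilde{A}^{-}_{t_k} - {}^{(3)}\widetilde{A}^{-}_{t_k}.$$

To show that $\widetilde{a}^{*}_{t_k}$ and $\widetilde{b}^{*}_{t_k}$ are strictly decreasing with $I_{t_k}$, we only need to show that ${}^{(1)}A^{\pm}_{t_k} < 0$. By Proposition 1, we have ${}^{(1)}A^{\pm}_{t_k} = \beta^{\pm}_{t_k}\alpha_{t_{k+1}} / \gamma_{t_k}$, where

$$\gamma_{t_k} := \left[\pi_{t_{k+1}}(1,1)\,\alpha_{t_{k+1}}\mu_c^{+}\mu_c^{-}\right]^2 - \pi^{+}_{t_{k+1}}\pi^{-}_{t_{k+1}}\left(\alpha_{t_{k+1}}\mu_{c^2}^{+} - \mu_c^{+}\right)\left(\alpha_{t_{k+1}}\mu_{c^2}^{-} - \mu_c^{-}\right),$$

$$\beta^{\pm}_{t_k} := \pi^{+}_{t_{k+1}}\pi^{-}_{t_{k+1}}\mu_c^{\pm}\left(\alpha_{t_{k+1}}\mu_{c^2}^{\mp} - \mu_c^{\mp}\right) - \pi^{\mp}_{t_{k+1}}\pi_{t_{k+1}}(1,1)\,\alpha_{t_{k+1}}\mu_c^{\pm}(\mu_c^{\mp})^2.$$

Since $\pi_{t_{k+1}}(1,1) \le \pi^{+}_{t_{k+1}} \wedge \pi^{-}_{t_{k+1}}$ and $(\mu_c^{\pm})^2 \le \mu_{c^2}^{\pm}$, we have

$$\begin{aligned}
\gamma_{t_k} &\le \pi^{+}_{t_{k+1}}\pi^{-}_{t_{k+1}}\left\{\alpha_{t_{k+1}}^2\left[(\mu_c^{+})^2(\mu_c^{-})^2 - \mu_{c^2}^{+}\mu_{c^2}^{-}\right] + \alpha_{t_{k+1}}\left(\mu_{c^2}^{+}\mu_c^{-} + \mu_c^{+}\mu_{c^2}^{-}\right) - \mu_c^{+}\mu_c^{-}\right\} \\
&\le \pi^{+}_{t_{k+1}}\pi^{-}_{t_{k+1}}\left[\alpha_{t_{k+1}}\left(\mu_{c^2}^{+}\mu_c^{-} + \mu_c^{+}\mu_{c^2}^{-}\right) - \mu_c^{+}\mu_c^{-}\right] < 0,
\end{aligned}$$

$$\begin{aligned}
\beta^{\pm}_{t_k} &\le \pi^{+}_{t_{k+1}}\pi^{-}_{t_{k+1}}\left[\alpha_{t_{k+1}}\mu_{c^2}^{\mp}\mu_c^{\pm} - \mu_c^{\pm}\mu_c^{\mp} - \alpha_{t_{k+1}}\mu_c^{\pm}(\mu_c^{\mp})^2\right] \\
&= \pi^{+}_{t_{k+1}}\pi^{-}_{t_{k+1}}\left\{\alpha_{t_{k+1}}\mu_c^{\pm}\left[\mu_{c^2}^{\mp} - (\mu_c^{\mp})^2\right] - \mu_c^{\pm}\mu_c^{\mp}\right\} < 0,
\end{aligned}$$

since $\alpha_{t_{k+1}}$ is negative. Thus, ${}^{(1)}A^{\pm}_{t_k} < 0$ for all $t_k$.

B.3 Proof of Corollary 3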

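Under the assumptions in Corollary 3, the optimal strategies can be written as

$$a^{*}_{t_k} = S_{t_k} + \overbrace{\frac{\alpha_{t_{k+1}}\mu_c}{\mu_c - \alpha_{t_{k+1}}\mu_{c^2}} I_{t_k} + \frac{\mu_c - 2\alpha_{t_{k+1}}\mu_{c^2}}{2\left[\mu_c - \alpha_{t_{k+1}}\mu_{c^2}\right]}\mu_p + \frac{h_{t_{k+1}}\mu_c}{2\left[\mu_c - \alpha_{t_{k+1}}\mu_{c^2}\right]}}^{L^{+,*}_{t_k}}, \tag{B-4}$$

$$b^{*}_{t_k} = S_{t_k} + \overbrace{\frac{\alpha_{t_{k+1}}\mu_c}{\mu_c - \alpha_{t_{k+1}}\mu_{c^2}} I_{t_k} - \frac{\mu_c - 2\alpha_{t_{k+1}}\mu_{c^2}}{2\left[\mu_c - \alpha_{t_{k+1}}\mu_{c^2}\right]}\mu_p + \frac{h_{t_{k+1}}\mu_c}{2\left[\mu_c - \alpha_{t_{k+1}}\mu_{c^2}\right]}}^{-L^{-,*}_{t_k}}, \tag{B-5}$$

where

$$\alpha_{t_k} = \alpha_{t_{k+1}} + \frac{2\pi_{t_{k+1}}\left(\alpha_{t_{k+1}}\mu_c\right)^2}{\mu_c - \alpha_{t_{k+1}}\mu_{c^2}}, \qquad h_{t_k} = h_{t_{k+1}} + \frac{2\pi_{t_{k+1}}\alpha_{t_{k+1}}(\mu_c)^2 h_{t_{k+1}}}{\mu_c - \alpha_{t_{k+1}}\mu_{c^2}},$$

and $\pi_{t_{k+1}} := \pi^{\pm}_{t_{k+1}}$. Since $h_T = 0$, we have $h_{t_k} \equiv 0$. It is easy to check that, for any time $t_k$ and any penalty levels (which lead to different values of $\alpha_{t_k}$), the optimal ask price is $a^{*}_{t_k} = S_{t_k} + \frac{\mu_p}{2}$ when $I_{t_k} = I^{+} := \frac{\mu_{c^2}\mu_p}{2\mu_c}$, and the optimal bid price is $b^{*}_{t_k} = S_{t_k} - \frac{\mu_p}{2}$ when $I_{t_k} = I^{-} := -\frac{\mu_{c^2}\mu_p}{2\mu_c}$.

We consider the scenario where the inventory level is non-negative; the case of non-positive inventory follows from the same argument with the roles of (B-4) and (B-5) interchanged. When $I_{t_k} = 0$, we can see from Eq. (B-5) that the optimal bid price equals

$$S_{t_k} - \frac{\mu_c - 2\alpha_{t_{k+1}}\mu_{c^2}}{2\left[\mu_c - \alpha_{t_{k+1}}\mu_{c^2}\right]}\mu_p = S_{t_k} - \frac{\mu_p}{2} + \frac{\alpha_{t_{k+1}}\mu_{c^2}}{2\left[\mu_c - \alpha_{t_{k+1}}\mu_{c^2}\right]}\mu_p < S_{t_k} - \frac{\mu_p}{2},$$

since $\alpha_{t_{k+1}} < 0$. As stated in Corollary 2, the optimal bid is strictly decreasing with the inventory level. Thus, when $I_{t_k} \ge 0$, the optimal bid price is always smaller than $S_{t_k} - \frac{\mu_p}{2}$. For the ask, recall that $a^{*}_{t_k} = S_{t_k} + \frac{\mu_p}{2}$ when $I_{t_k} = I^{+} = \frac{\mu_{c^2}\mu_p}{2\mu_c}$. Since the optimal ask is also strictly decreasing with the inventory level, for $I_{t_k} \in [0, I^{+})$ the optimal ask price is always larger than $S_{t_k} + \frac{\mu_p}{2}$, and for $I_{t_k} > I^{+}$ the optimal ask price is always smaller than $S_{t_k} + \frac{\mu_p}{2}$.

References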

Adrian, T., Capponi, A., Fleming, M., Vogt, E. & Zhang, H. (2020). Intraday market making with overnight inventory costs. Journal of Financial Markets, 100564.

Amihud, Y. & Mendelson, H. (1980). Dealership market: Market-making with inventory. Journal of Financial Economics (1), 31–53.

Bradfield, J. (1979). A formal dynamic model of market making. Journal of Financial and Quantitative Analysis, 275–291.

Cartea, Á. & Jaimungal, S. (2015). Risk metrics and fine tuning of high-frequency trading strategies. Mathematical Finance (3), 576–611.

Cartea, Á., Jaimungal, S. & Ricci, J. (2014). Buy low, sell high: A high frequency trading perspective. SIAM Journal on Financial Mathematics (1), 415–444.

Cont, R., Stoikov, S. & Talreja, R. (2010). A stochastic model for order book dynamics. Operations Research (3), 549–563.

Cvitanic, J. & Kirilenko, A. A. (2010). High frequency traders and asset prices. Working paper. Preprint available at SSRN 1569067.

Gayduk, R. & Nadtochiy, S. (2018). Liquidity effects of trading frequency. Mathematical Finance (3), 839–876.

Glosten, L. R. & Harris, L. E. (1988). Estimating the components of the bid/ask spread. Journal of Financial Economics (1), 123–142.

Guéant, O., Lehalle, C.-A. & Fernandez-Tapia, J. (2013). Dealing with the inventory risk: a solution to the market making problem. Mathematics and Financial Economics (4), 477–507.

Hall, P. & LePage, R. (1996). On bootstrap estimation of the distribution of the studentized mean. Annals of the Institute of Statistical Mathematics (3), 403–421.

Hasbrouck, J. (1988). Trades, quotes, inventories, and information. Journal of Financial Economics, 229–252.

Hendershott, T. & Menkveld, A. J. (2014). Price pressures. Journal of Financial Economics (3), 405–423.

Ho, T. & Stoll, H. R. (1981). Optimal dealer pricing under transactions and return uncertainty. Journal of Financial Economics (1), 47–73.

Huang, K., Simchi-Levi, D. & Song, M. (2012). Optimal market-making with risk aversion. Operations Research (3), 541–565.

Joint Staff Report (2015). The US treasury market on October 15, 2014. Tech. rep., Joint Staff Report, July.

Kirilenko, A., Kyle, A. S., Samadi, M. & Tuzun, T. (2017). The flash crash: High-frequency trading in an electronic market. The Journal of Finance (3), 967–998.

Markets Committee et al. (2011). High-frequency trading in the foreign exchange market. Bank for International Settlements.

Menkveld, A. (2013). High frequency trading and the new-market makers. Journal of Financial Markets, 712–740.

Menkveld, A. J. (2016). The economics of high-frequency trading: Taking stock. Annual Review of Financial Economics, 1–24.

O'Hara, M. & Oldfield, G. (1986). The microeconomics of market making. Journal of Financial and Quantitative Analysis (4), 2603–2619.

Peng, L. (2004). Empirical-likelihood-based confidence interval for the mean with a heavy-tailed distribution.