Optimal Trading with Signals and Stochastic Price Impact

Abstract

Trading frictions are stochastic. They are, moreover, in many instances fast-mean reverting. Here, we study how to optimally trade in a market with stochastic price impact and study approximations to the resulting optimal control problem using singular perturbation methods. We prove, by constructing sub- and super-solutions, that the approximations are accurate to the specified order. Finally, we perform some numerical experiments to illustrate the effect that stochastic trading frictions have on optimal trading.


Optimal Trading with Signals and Stochastic Price Impact∗
Jean-Pierre Fouque†, Sebastian Jaimungal‡, Yuri F. Saporito§
January 26, 2021


∗ J-PF was supported by NSF grant DMS-1814091. SJ would like to acknowledge the support of the Natural Sciences and Engineering Research Council of Canada, funding reference numbers RGPIN-2018-05705 and RGPAS-2018-522715.
† Department of Statistics and Applied Probability, University of California, Santa Barbara, United States (http://fouque.faculty.pstat.ucsb.edu/).
‡ Department of Statistical Sciences, University of Toronto, Canada (http://sebastian.statistics.utoronto.ca/).
§ School of Applied Mathematics (EMAp), Getulio Vargas Foundation (FGV), Brazil (http://yurisaporito.com/).

Trading frictions exist across all electronic markets and stem from a multitude of factors. In quote-driven markets, liquidity providers place limit orders to buy and sell assets, while liquidity takers use market orders to extract this liquidity, sometimes by walking the limit order book (LOB), but often by picking off available volume at the touch (the volume at the best bid/ask prices). In either case, trading quickly (or with large volume) induces a trading cost, while trading slowly introduces price risk. The seminal work of [2] studies how to optimally trade in such environments. [1] extends the analysis to the case of stochastic liquidity and volatility and, from numerical methods applied to the resulting Hamilton-Jacobi-Bellman (HJB) equation, shows that stochastic liquidity is indeed an important component in optimal execution.

Figure 1: Sample path (May 28, 2014), two fits for $\kappa(t)$, and quantiles for $\kappa_t$ for MSFT using 2014 data.

While the cost of trading fluctuates throughout the day, markets react quickly, sometimes on the timescale of milliseconds or even microseconds. The fast mean-reverting aspect of the temporary price impact is documented in [5]. For example, Figure 1 shows (i) a sample path of trading impact (measured as the effective cost of walking the limit order book per unit of volume) denoted $\kappa_t$, (ii) a deterministic fit (denoted by $\kappa(t)$, see Section 4) to the sample path, and (iii) the 20%, 50%, and 80% quantiles through the day (across all of 2014 data) for Microsoft (MSFT).

Based on these observations, here we study how stochastic price impact driven by a fast mean-reverting process affects the optimal execution of trades. Fast mean-reverting (FMR) models have appeared widely in stochastic volatility models, as first proposed in [11] and [10], [12], and extended and applied in numerous areas including stochastic interest rate models [8], commodity models [19, 14], credit risk [25], and portfolio optimisation problems [15]. As far as the authors are aware, however, this is the first paper to suggest using FMR models in the context of algorithmic trading, and in particular to model stochastic impact.

[17] studies stochastic impact models that generalise those in [1] and develops a novel approach for looking at a general class of optimal control problems with singular terminal state constraints. [23] analyzes the effect of jumps in prices to model the uncertain price impact from other traders. [7] examines a discrete-time optimal execution problem with stochastic volatility and liquidity and (for a linear criterion) characterises the optimal strategy in terms of a certain forward-backward stochastic differential equation (FBSDE). [3], which is perhaps the closest work to ours, analyses stochastic impact by making use of a coefficient expansion of the resulting HJB equation and provides a formal expansion of the approximate strategy (but provides no proof of accuracy and does not utilise the FMR property). A distinct line of work looks at modeling impact through the dynamics of the limit order book (LOB), where orders walk the LOB and after some time it recovers; this is called the resilience of the LOB. [24] is the earliest work along these lines, and many others generalized the approach in various directions. For example, [16] studies a stochastic "illiquidity" process that induces costs when the trader's position jumps, and [27] looks at regime-switching-driven market resilience. For an overview of optimal execution, and more generally algorithmic trading and market microstructure, see, e.g., [6], [18], and [21].

We distinguish our work from the extant literature in several ways. First, to the authors' knowledge, this is the first example of an approximate closed-form solution for stochastic impact that reflects the FMR behaviour observed in the market. Second, we include not only stochastic impact, but also stochastic trading signals (which may be seen as stemming from order flow or other factors). Third, we provide proof that the approximate optimal strategies are indeed approximately optimal to the order specified.

The remainder of this paper is organized as follows: Section 2 introduces our framework, performs a dimensional reduction, and introduces the approximate optimal strategy whose accuracy is addressed later. Section 3 provides an explicit trading signal model where the approximate optimal strategy may be computed exactly. Section 4 shows how to estimate parameters from data and applies the approximate optimal strategy in a simulation environment. Section 5 provides a proof of the accuracy of the approximation.

In this paper, we investigate optimal trading in environments where price impact is stochastic and driven by a fast mean-reverting factor and the trader incorporates price signals. This generalizes the results in [5] for optimal execution with linear temporary and permanent price impacts as well as those studied in [22], which look at particular cases of [5]. Specifically, let $(\Omega, \mathbb{P}, \mathbb{F} = (\mathcal{F}_t)_{t \ge 0})$ denote a complete filtered probability space. What generates the filtration is specified shortly.
On this probability space, we also introduce a number of stochastic processes: the asset price process $S^\nu = (S^\nu_t)_{t \ge 0}$, the trader's cumulative cash process $X^\nu = (X^\nu_t)_{t \ge 0}$, the trader's cumulative inventory $Q^\nu = (Q^\nu_t)_{t \ge 0}$, the trading signals $\boldsymbol{\mu} = (\mu^1_t, \dots, \mu^d_t)_{t \ge 0}$, and the driver of stochastic impact $Y^\varepsilon = (Y^\varepsilon_t)_{t \ge 0}$. The process $\nu = (\nu_t)_{t \ge 0}$ denotes the trader's rate of trading, and this is what the trader controls, hence the superscript on the various processes that are directly affected by the trader's actions. The various stochastic processes are assumed to satisfy the system of SDEs:

$$
\begin{cases}
dS^\nu_t = (\boldsymbol{\gamma} \cdot \boldsymbol{\mu}_t + b\,\nu_t)\,dt + \sigma\,dW_t, \\
dX^\nu_t = -\left(S^\nu_t + k(t, Y^\varepsilon_t)\,\nu_t\right)\nu_t\,dt, \\
dQ^\nu_t = \nu_t\,dt, \\
d\boldsymbol{\mu}_t = c(\boldsymbol{\mu}_t)\,dt + g(\boldsymbol{\mu}_t)\,d\boldsymbol{W}'_t, \quad \text{and} \\
dY^\varepsilon_t = \dfrac{1}{\varepsilon}\,\alpha(Y^\varepsilon_t)\,dt + \dfrac{1}{\sqrt{\varepsilon}}\,\beta(Y^\varepsilon_t)\,dW^*_t.
\end{cases}
\tag{2.1}
$$

Here, $b$ is the permanent price impact and $k(t, y)$ is the temporary price impact, which is driven by the fast mean-reverting process $Y^\varepsilon$. The processes $W$, $\boldsymbol{W}'$ and $W^*$ are correlated Brownian motions, where both $W$ and $W^*$ are one-dimensional and $\boldsymbol{W}'$ is $d$-dimensional. Moreover, we assume $d[W'^{\,i}, W^*]_t = \rho_i\,dt$, $i \in \{1, \dots, d\}$, where $\boldsymbol{\rho} := (\rho_1, \dots, \rho_d) \in (-1, 1)^d$ and the joint correlation structure of the three Brownian motions is positive definite. The filtration $\mathbb{F} = (\mathcal{F}_t)_{t \ge 0}$ is taken to be the natural filtration induced by these Brownian motions.

The dynamics in (2.1) may be interpreted as follows. In the absence of the trader's actions, the asset price process has a predictable drift $\boldsymbol{\gamma} \cdot \boldsymbol{\mu}$ that stems from the (vector-valued) trading signal $\boldsymbol{\mu}$. The trader's action, however, pushes prices in the direction of their trades. In the current setup, we view $\nu_t > 0$ as buying and $\nu_t < 0$ as selling; the remaining price fluctuations are driven by the Brownian motion $W$. The trader's cumulative cash process increases when they sell, but they do not receive the midprice $S^\nu_t$; rather, they incur additional liquidity costs. Those costs are approximated as being linear in the trading speed $\nu$, and the coefficient of the impact is given by $k(t, Y^\varepsilon_t)$, which incorporates the fast mean-reverting nature of the trading frictions, as well as its diurnal pattern as observed in, e.g., Figure 1. To model this pattern, we assume that

$$
k(t, y) = \kappa(t)\,(1 + \eta(y))^{-1}, \tag{2.2}
$$

with $\kappa(t) > 0$ for $t \in [0, T]$, and that $\eta > -1$. We denote the average with respect to the stationary distribution of $Y^\varepsilon$ by $\langle \cdot \rangle$ and, without loss of generality, we impose the constraint that $\langle \eta \rangle = 0$.

The investor faces the following optimal control problem:

\begin{align}
H^\varepsilon(t, x, S, \boldsymbol{\mu}, q, y) &= \sup_{\nu \in \mathbb{A}} H^\nu(t, x, S, \boldsymbol{\mu}, q, y), \quad \text{where} \tag{2.3} \\
H^\nu(t, x, S, \boldsymbol{\mu}, q, y) &= \mathbb{E}_{t,S,q,x,\boldsymbol{\mu},y}\left[ X^\nu_T + Q^\nu_T\,(S^\nu_T - A\,Q^\nu_T) - \phi \int_t^T (Q^\nu_u)^2\,du \right]. \tag{2.4}
\end{align}

In (2.4), $\phi$ is an inventory penalty parameter (and may be interpreted as stemming from a penalty on quadratic variation of the book-value of the trader's position, or, more interestingly, from model uncertainty [4]); $A$ is a liquidation penalty parameter which, if large, forces the trader to end with little or no inventory (see [17] for how to enforce exact liquidation with a singular terminal condition); $\mathbb{E}_{t,S,q,x,\boldsymbol{\mu},y}[\,\cdot\,]$ denotes conditional expectation given $S^\nu_t = S$, $Q^\nu_t = q$, $X^\nu_t = x$, $\boldsymbol{\mu}_t = \boldsymbol{\mu}$ and $Y^\varepsilon_t = y$; and $\mathbb{A}$ is the set of admissible controls, which consists of $\mathbb{F}$-predictable processes in $L^2([0, T] \times \Omega)$, i.e. $\mathbb{E}\left[\int_0^T |\nu_t|^2\,dt\right] < +\infty$, taking values in $\mathcal{A} := \mathbb{R}$.

Note that the investor's wealth process and the asset price can be removed from the optimization problem. Firstly, notice that, by the product rule (and because $Q^\nu$ is absolutely continuous), we have

\begin{align}
X^\nu_T + Q^\nu_T S^\nu_T
&= x - \int_t^T \left(S^\nu_u + k(u, Y^\varepsilon_u)\,\nu_u\right)\nu_u\,du + qS + \int_t^T S^\nu_u\,dQ^\nu_u + \int_t^T Q^\nu_u\,dS^\nu_u \notag \\
&= x + qS - \int_t^T S^\nu_u\,\nu_u\,du - \int_t^T k(u, Y^\varepsilon_u)\,\nu_u^2\,du + \int_t^T S^\nu_u\,\nu_u\,du + \int_t^T Q^\nu_u\,dS^\nu_u \tag{2.5} \\
&= x + qS - \int_t^T k(u, Y^\varepsilon_u)\,\nu_u^2\,du + \int_t^T Q^\nu_u\,dS^\nu_u. \tag{2.6}
\end{align}

Hence, we may write

$$
H^\nu(t, x, S, \boldsymbol{\mu}, q, y) = x + qS + \mathbb{E}_{t,S,q,x,\boldsymbol{\mu},y}\left[ -A\,(Q^\nu_T)^2 - \int_t^T k(u, Y^\varepsilon_u)\,\nu_u^2\,du + \int_t^T Q^\nu_u\,(\boldsymbol{\gamma} \cdot \boldsymbol{\mu}_u + b\,\nu_u)\,du - \phi \int_t^T (Q^\nu_u)^2\,du \right]. \tag{2.7}
$$

Next, as $(\boldsymbol{\mu}_t, Q^\nu_t, Y^\varepsilon_t)$ is Markovian, we may define

\begin{align}
Z(t, \boldsymbol{\mu}, q, y, v) &= -k(t, y)\,v^2 + q\,(\boldsymbol{\gamma} \cdot \boldsymbol{\mu} + b\,v) - \phi\,q^2, \tag{2.8} \\
Z^\nu(t, \boldsymbol{\mu}, q, y) &= \mathbb{E}_{t,q,\boldsymbol{\mu},y}\left[ -A\,(Q^\nu_T)^2 + \int_t^T Z(u, \boldsymbol{\mu}_u, Q^\nu_u, Y^\varepsilon_u, \nu_u)\,du \right], \tag{2.9}
\end{align}

and

$$
h^\varepsilon(t, \boldsymbol{\mu}, q, y) = \sup_{\nu \in \mathbb{A}} Z^\nu(t, \boldsymbol{\mu}, q, y), \tag{2.10}
$$

where $\mathbb{E}_{t,q,\boldsymbol{\mu},y}[\,\cdot\,]$ here means the conditional expectation given $Q^\nu_t = q$, $\boldsymbol{\mu}_t = \boldsymbol{\mu}$, and $Y^\varepsilon_t = y$. Hence, comparing (2.7) with (2.10), we may write

$$
H^\varepsilon(t, x, S, \boldsymbol{\mu}, q, y) = x + qS + h^\varepsilon(t, \boldsymbol{\mu}, q, y). \tag{2.11}
$$

From the dynamic programming principle, we expect that $h^\varepsilon$ satisfies the associated Hamilton-Jacobi-Bellman (HJB) equation

$$
\partial_t h^\varepsilon + \frac{1}{\varepsilon}\,\mathcal{L}_0 h^\varepsilon + \mathcal{L}_{\boldsymbol{\mu}} h^\varepsilon - \phi\,q^2 + \frac{1}{\sqrt{\varepsilon}}\,\beta(y) \sum_{i=1}^{k} (g_i(\boldsymbol{\mu}) \cdot \boldsymbol{\rho})\,\partial_{\mu_i y} h^\varepsilon + (\boldsymbol{\gamma} \cdot \boldsymbol{\mu})\,q + \sup_{v \in \mathcal{A}} \left\{ (b\,q + \partial_q h^\varepsilon)\,v - k(t, y)\,v^2 \right\} = 0, \tag{2.12}
$$

with terminal condition $h^\varepsilon(T, \boldsymbol{\mu}, q, y) = -A\,q^2$, where $g_i(\boldsymbol{\mu})$ is the $i$-th row of $g(\boldsymbol{\mu})$. Here, $\mathcal{L}_0$ and $\mathcal{L}_{\boldsymbol{\mu}}$ are the infinitesimal generators

\begin{align}
\mathcal{L}_0 &= \alpha(y)\,\partial_y + \tfrac{1}{2}\,\beta^2(y)\,\partial_{yy}, \quad \text{and} \tag{2.13} \\
\mathcal{L}_{\boldsymbol{\mu}} &= c(\boldsymbol{\mu}) \cdot \partial_{\boldsymbol{\mu}} + \tfrac{1}{2}\,\mathrm{Tr}\left( g(\boldsymbol{\mu})\,g(\boldsymbol{\mu})^{\mathsf{T}}\,\partial_{\boldsymbol{\mu}\boldsymbol{\mu}} \right), \tag{2.14}
\end{align}

respectively.

This HJB equation is not solvable in closed form for a multitude of reasons. Rather, in the spirit of [20] for portfolio optimization problems and elaborated in [13], we aim to obtain an approximate solution that holds when the stochastic price impact driving factor $Y^\varepsilon$ is fast mean-reverting.
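The dimensional reduction above rests on the identity (2.6), which can be checked numerically. The following is a minimal sketch, not the authors' code: it applies an Euler scheme to (2.1) with a constant trading rate, no trading signal ($\boldsymbol{\gamma} \cdot \boldsymbol{\mu} \equiv 0$), independent Brownian motions, an OU driver $Y^\varepsilon$, and the hypothetical choice $\eta(y) = 0.5\tanh(y)$ (which satisfies $\eta > -1$ and, by symmetry, $\langle \eta \rangle = 0$). The function name and all parameter values are illustrative assumptions.

```python
import math
import random


def simulate_identity(nu=-1.0, x0=0.0, s0=100.0, q0=1.0, T=1.0, n=1000,
                      b=0.01, sigma=0.5, kappa=0.02, eps=0.01, beta=0.3, seed=7):
    """Euler scheme for (2.1) with constant rate nu and no signal.

    Returns both sides of identity (2.6):
      lhs = X_T + Q_T * S_T
      rhs = x + q*S - int k nu^2 du + int Q dS
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    dt = T / n
    x, s, q, y = x0, s0, q0, 0.0
    impact_cost = 0.0  # running value of int k(u, Y_u) nu_u^2 du
    q_ds = 0.0         # running value of int Q_u dS_u (left-point rule)
    for _ in range(n):
        # temporary impact k = kappa / (1 + eta(y)), with eta(y) = 0.5 * tanh(y) > -1
        k = kappa / (1.0 + 0.5 * math.tanh(y))
        ds = b * nu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        impact_cost += k * nu * nu * dt
        q_ds += q * ds
        x += -(s + k * nu) * nu * dt   # cash dynamics
        s += ds                        # price dynamics
        q += nu * dt                   # inventory dynamics
        # fast mean-reverting OU driver with stationary variance beta^2
        y += -(y / eps) * dt + math.sqrt(2.0 / eps) * beta * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    lhs = x + q * s
    rhs = x0 + q0 * s0 - impact_cost + q_ds
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = simulate_identity()
    print("X_T + Q_T S_T        :", lhs)
    print("right-hand side (2.6):", rhs)
```

In discrete time the two sides differ only by the cross term $\sum \Delta Q\,\Delta S = \nu\,\Delta t\,(S_T - S_t)$, which vanishes as the time step shrinks.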
We also prove that the approximation we propose is correct to the appropriate order, and we do so using the following broad steps:

(i) determine the feedback form of the optimal control and derive the non-linear PDE that $h^\varepsilon$ satisfies,
(ii) notice that $h^\varepsilon$ may be represented as a quadratic form in $q$, but non-linear in all other state variables,
(iii) perform a formal expansion of each coefficient of $h^\varepsilon$ in powers of $\varepsilon$,
(iv) notice that constant (in $q$) and linear (in $q$) functions satisfy linear PDEs and apply, now classical, methods from [13] to prove accuracy, and finally
(v) the quadratic (in $q$) function satisfies a non-linear PDE, and for this we develop a super- and sub-solution approach to prove that the approximation error is controlled.

Proceeding along the above lines, we first apply the first-order condition (FOC) to obtain the optimal feedback control as

$$
v^{\varepsilon*}(t, \boldsymbol{\mu}, q, y) = \frac{b\,q + \partial_q h^\varepsilon(t, \boldsymbol{\mu}, q, y)}{2\,k(t, y)}, \tag{2.15}
$$

and, upon substitution of the FOC, the HJB equation (2.12) reduces to

$$
(\partial_t + \mathcal{L}_{\boldsymbol{\mu}})\,h^\varepsilon - \phi\,q^2 + (\boldsymbol{\gamma} \cdot \boldsymbol{\mu})\,q + \frac{(b\,q + \partial_q h^\varepsilon)^2}{4\,k(t, y)} + \frac{1}{\sqrt{\varepsilon}}\,\beta(y) \sum_{i=1}^{k} (g_i(\boldsymbol{\mu}) \cdot \boldsymbol{\rho})\,\partial_{\mu_i y} h^\varepsilon + \frac{1}{\varepsilon}\,\mathcal{L}_0 h^\varepsilon = 0, \tag{2.16}
$$

where, for readability, we suppress the dependence of $h^\varepsilon$ on its arguments.

We define a non-linear differential operator $\mathcal{L}(k)$ that corresponds to the temporary price impact $k$, and it acts on functions $f$ as follows:

$$
\mathcal{L}(k)[f] := (\partial_t + \mathcal{L}_{\boldsymbol{\mu}})\,f - \phi\,q^2 + (\boldsymbol{\gamma} \cdot \boldsymbol{\mu})\,q + \frac{(b\,q + \partial_q f)^2}{4\,k}. \tag{2.17}
$$

This operator is, in general, non-linear due to the last term. With this notation, we may write the PDE (2.16) as

$$
\mathcal{L}(k(t, y))[h^\varepsilon] + \left( \frac{1}{\sqrt{\varepsilon}}\,\mathcal{L}_1 + \frac{1}{\varepsilon}\,\mathcal{L}_0 \right) h^\varepsilon = 0, \tag{2.18}
$$

where

$$
\mathcal{L}_1 = \beta(y) \sum_{i=1}^{k} (g_i(\boldsymbol{\mu}) \cdot \boldsymbol{\rho})\,\partial_{\mu_i y}. \tag{2.19}
$$

As in the case with constant price impact studied in [5], but including trading signals, we may write $h^\varepsilon$ as a quadratic polynomial in $q$ without any loss of generality. This is due to the form of the terminal conditions and the coefficients of the PDE (2.18).
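To this end, we write

$$
h^\varepsilon(t, \boldsymbol{\mu}, q, y) = h^{(0),\varepsilon}(t, \boldsymbol{\mu}, y) + h^{(1),\varepsilon}(t, \boldsymbol{\mu}, y)\,q + h^{(2),\varepsilon}(t, \boldsymbol{\mu}, y)\,q^2 \tag{2.20}
$$

and aim to determine approximations for $(h^{(i),\varepsilon})_{i=0,1,2}$. Inserting (2.20) into (2.18), collecting powers of $q$, and setting the coefficient of each power to zero, we find the following coupled system of PDEs for $(h^{(i),\varepsilon})_{i=0,1,2}$:

$$
\begin{cases}
(\partial_t + \mathcal{L}_{\boldsymbol{\mu}})\,h^{(2),\varepsilon} - \phi + \dfrac{(b + 2h^{(2),\varepsilon})^2}{4\,k(t, y)} + \dfrac{1}{\sqrt{\varepsilon}}\,\mathcal{L}_1 h^{(2),\varepsilon} + \dfrac{1}{\varepsilon}\,\mathcal{L}_0 h^{(2),\varepsilon} = 0, \\
h^{(2),\varepsilon}(T, \boldsymbol{\mu}, y) = -A,
\end{cases}
\tag{2.21}
$$

$$
\begin{cases}
(\partial_t + \mathcal{L}_{\boldsymbol{\mu}})\,h^{(1),\varepsilon} + \boldsymbol{\gamma} \cdot \boldsymbol{\mu} + \dfrac{(b + 2h^{(2),\varepsilon})\,h^{(1),\varepsilon}}{2\,k(t, y)} + \dfrac{1}{\sqrt{\varepsilon}}\,\mathcal{L}_1 h^{(1),\varepsilon} + \dfrac{1}{\varepsilon}\,\mathcal{L}_0 h^{(1),\varepsilon} = 0, \\
h^{(1),\varepsilon}(T, \boldsymbol{\mu}, y) = 0,
\end{cases}
\tag{2.22}
$$

$$
\begin{cases}
(\partial_t + \mathcal{L}_{\boldsymbol{\mu}})\,h^{(0),\varepsilon} + \dfrac{(h^{(1),\varepsilon})^2}{4\,k(t, y)} + \dfrac{1}{\sqrt{\varepsilon}}\,\mathcal{L}_1 h^{(0),\varepsilon} + \dfrac{1}{\varepsilon}\,\mathcal{L}_0 h^{(0),\varepsilon} = 0, \\
h^{(0),\varepsilon}(T, \boldsymbol{\mu}, y) = 0.
\end{cases}
\tag{2.23}
$$

The PDEs (2.23) and (2.22) for $h^{(0),\varepsilon}$ and $h^{(1),\varepsilon}$ are linear. They do, however, have non-linear potential terms. These potentials are "known" when solving the problem sequentially in the order $h^{(2),\varepsilon}$, then $h^{(1),\varepsilon}$, and finally $h^{(0),\varepsilon}$. In contrast, the PDE (2.21) for $h^{(2),\varepsilon}$ is non-linear. Similarly to the constant price impact case studied in [5], (2.21) contains no coefficients or terminal conditions that depend on $\boldsymbol{\mu}$, hence the solution $h^{(2),\varepsilon}$ is not a function of $\boldsymbol{\mu}$, and we write $h^{(2),\varepsilon}(t, y)$. Performing a constant shift

$$
\mathcal{X}^\varepsilon(t, y) := h^{(2),\varepsilon}(t, y) + \frac{b}{2}, \tag{2.24}
$$

the PDEs (2.21) and (2.22) become

$$
\begin{cases}
\partial_t \mathcal{X}^\varepsilon - \phi + \dfrac{(\mathcal{X}^\varepsilon)^2}{k(t, y)} + \dfrac{1}{\varepsilon}\,\mathcal{L}_0 \mathcal{X}^\varepsilon = 0, \\
\mathcal{X}^\varepsilon(T, y) = -A + b/2,
\end{cases}
\tag{2.25}
$$

$$
\begin{cases}
(\partial_t + \mathcal{L}_{\boldsymbol{\mu}})\,h^{(1),\varepsilon} + \boldsymbol{\gamma} \cdot \boldsymbol{\mu} + \dfrac{\mathcal{X}^\varepsilon\,h^{(1),\varepsilon}}{k(t, y)} + \dfrac{1}{\sqrt{\varepsilon}}\,\mathcal{L}_1 h^{(1),\varepsilon} + \dfrac{1}{\varepsilon}\,\mathcal{L}_0 h^{(1),\varepsilon} = 0, \\
h^{(1),\varepsilon}(T, \boldsymbol{\mu}, y) = 0.
\end{cases}
\tag{2.26}
$$

2.2 The case of deterministic temporary price impact

The case of constant $k$ is fully characterized in [5], and it is straightforward to generalize that result when $k$ is a deterministic function of time: $k(t, \cdot) \equiv k(t)$. Indeed, consider the PDE

$$
\begin{cases}
(\partial_t + \mathcal{L}_{\boldsymbol{\mu}})\,h - \phi\,q^2 + (\boldsymbol{\gamma} \cdot \boldsymbol{\mu})\,q + \dfrac{(b\,q + \partial_q h)^2}{4\,k(t)} = 0, \\
h(T, \boldsymbol{\mu}, q) = -A\,q^2.
\end{cases}
\tag{2.27}
$$

Next, $h$ may be written as

$$
h(t, \boldsymbol{\mu}, q) = h^{(0)}(t, \boldsymbol{\mu}) + h^{(1)}(t, \boldsymbol{\mu})\,q + h^{(2)}(t)\,q^2, \tag{2.28}
$$

where $\mathcal{X}(t) = h^{(2)}(t) + b/2$ solves the Riccati ODE

$$
\begin{cases}
\mathcal{X}'(t) - \phi + \dfrac{1}{k(t)}\,\mathcal{X}^2(t) = 0, \\
\mathcal{X}(T) = -A + b/2,
\end{cases}
\tag{2.29}
$$

and $h^{(1)}$, $h^{(0)}$ are given by

\begin{align}
h^{(1)}(t, \boldsymbol{\mu}) &= \int_t^T e^{\int_t^s \frac{\mathcal{X}(u)}{k(u)}\,du}\;\mathbb{E}\left[\boldsymbol{\gamma} \cdot \boldsymbol{\mu}_s \mid \boldsymbol{\mu}_t = \boldsymbol{\mu}\right] ds, \quad \text{and} \tag{2.30} \\
h^{(0)}(t, \boldsymbol{\mu}) &= \int_t^T \frac{1}{4\,k(s)}\;\mathbb{E}\left[ (h^{(1)})^2(s, \boldsymbol{\mu}_s) \mid \boldsymbol{\mu}_t = \boldsymbol{\mu} \right] ds. \tag{2.31}
\end{align}

See Appendix A for the derivation of these formulas. However, no closed-form solution for $\mathcal{X}$ is available in this case.

Motivated by the perturbation framework of [13], we first carry out formal expansions and provide the rigorous proof of accuracy of the approximation in Section 5.

To proceed with the approximations, consider the formal expansion of $\mathcal{X}^\varepsilon$ in powers of $\varepsilon$,

$$
\mathcal{X}^\varepsilon = \mathcal{X}_0 + \varepsilon\,\mathcal{X}_1 + \varepsilon^2\,\mathcal{X}_2 + \cdots, \tag{2.32}
$$

and of $h^{(i),\varepsilon}$, $i = 0, 1$, in powers of $\sqrt{\varepsilon}$:

$$
h^{(i),\varepsilon} = h^{(i)}_0 + \sqrt{\varepsilon}\,h^{(i)}_1 + \varepsilon\,h^{(i)}_2 + \cdots. \tag{2.33}
$$

Inserting these formal expansions into the PDEs (2.25), (2.26) and (2.23), collecting terms of like powers of $\varepsilon$, and setting each to zero separately, we find, up to order $\sqrt{\varepsilon}$, the following system of PDEs:

\begin{align}
\mathcal{L}_0 \mathcal{X}_0 &= 0, \tag{2.34a} \\
\mathcal{L}_0 \mathcal{X}_1 + \partial_t \mathcal{X}_0 - \phi + \frac{\mathcal{X}_0^2}{k(t, y)} &= 0, \tag{2.34b} \\
\mathcal{L}_0 h^{(i)}_0 &= 0, \tag{2.34c} \\
\mathcal{L}_1 h^{(i)}_0 + \mathcal{L}_0 h^{(i)}_1 &= 0, \tag{2.34d} \\
(\partial_t + \mathcal{L}_{\boldsymbol{\mu}})\,h^{(1)}_0 + \boldsymbol{\gamma} \cdot \boldsymbol{\mu} + \frac{\mathcal{X}_0\,h^{(1)}_0}{k(t, y)} + \mathcal{L}_1 h^{(1)}_1 + \mathcal{L}_0 h^{(1)}_2 &= 0, \tag{2.34e} \\
(\partial_t + \mathcal{L}_{\boldsymbol{\mu}})\,h^{(0)}_0 + \frac{(h^{(1)}_0)^2}{4\,k(t, y)} + \mathcal{L}_1 h^{(0)}_1 + \mathcal{L}_0 h^{(0)}_2 &= 0, \tag{2.34f} \\
(\partial_t + \mathcal{L}_{\boldsymbol{\mu}})\,h^{(1)}_1 + \frac{\mathcal{X}_0\,h^{(1)}_1}{k(t, y)} + \mathcal{L}_1 h^{(1)}_2 + \mathcal{L}_0 h^{(1)}_3 &= 0, \tag{2.34g} \\
(\partial_t + \mathcal{L}_{\boldsymbol{\mu}})\,h^{(0)}_1 + \frac{h^{(1)}_0\,h^{(1)}_1}{2\,k(t, y)} + \mathcal{L}_1 h^{(0)}_2 + \mathcal{L}_0 h^{(0)}_3 &= 0, \tag{2.34h}
\end{align}

with terminal conditions $\mathcal{X}_0(T, y) = -A + b/2$, $h^{(i)}_0(T, \boldsymbol{\mu}, y) = 0$ and $h^{(i)}_1(T, \boldsymbol{\mu}, y) = 0$, $i = 0, 1$.

Proposition 1 (Zero-order Terms). A solution for the zero-order approximation terms $\mathcal{X}_0$, $h^{(0)}_0$ and $h^{(1)}_0$ that solves PDEs (2.34a), (2.34c) and (2.34d) and centers the Poisson equations (2.34b), (2.34e) and (2.34f) is given by Equations (2.29), (2.31) and (2.30), respectively, with deterministic temporary price impact $k(t) = \kappa(t)$.

Proof. From the first equation (2.34a), we choose $\mathcal{X}_0 = \mathcal{X}_0(t)$ independent of $y$. The second equation (2.34b) is a Poisson equation for $\mathcal{X}_1$ and its centering condition is

$$
\left\langle \partial_t \mathcal{X}_0 - \phi + \frac{\mathcal{X}_0^2}{k(t, \cdot)} \right\rangle = 0
\quad \Longleftrightarrow \quad
\partial_t \mathcal{X}_0 - \phi + \frac{\mathcal{X}_0^2}{\kappa(t)} = 0,
$$

because

$$
\left\langle \frac{1}{k(t, \cdot)} \right\rangle = \left\langle \frac{1}{\kappa(t)}\,(1 + \eta(\cdot)) \right\rangle = \frac{1}{\kappa(t)}\,(1 + \langle \eta \rangle) = \frac{1}{\kappa(t)}, \tag{2.35}
$$

where the last equality follows from the standing assumption $\langle \eta \rangle = 0$. This is the same Riccati ODE as (2.29) with $k = \kappa$, and therefore we may write $\mathcal{X}_0(t) = \mathcal{X}(t) = h^{(2)}(t) + b/2$.

Turning to $h^{(i),\varepsilon}$: from (2.34c), we choose $h^{(i)}_0 = h^{(i)}_0(t, \boldsymbol{\mu})$ independent of $y$. Therefore, the PDE (2.34d) reduces to $\mathcal{L}_0 h^{(i)}_1 = 0$. Consequently, for $i = 0, 1$, we choose $h^{(i)}_1 = h^{(i)}_1(t, \boldsymbol{\mu})$ also independent of $y$, thus $\mathcal{L}_1 h^{(i)}_1 = 0$. Equations (2.34e) and (2.34f) are Poisson equations for $h^{(1)}_2$ and $h^{(0)}_2$, respectively, and their centering conditions are

\begin{align*}
\left\langle (\partial_t + \mathcal{L}_{\boldsymbol{\mu}})\,h^{(1)}_0 + \boldsymbol{\gamma} \cdot \boldsymbol{\mu} + \frac{\mathcal{X}_0(t)\,h^{(1)}_0}{k(t, \cdot)} \right\rangle = 0
\quad &\Longleftrightarrow \quad
(\partial_t + \mathcal{L}_{\boldsymbol{\mu}})\,h^{(1)}_0 + \boldsymbol{\gamma} \cdot \boldsymbol{\mu} + \frac{\mathcal{X}_0(t)}{\kappa(t)}\,h^{(1)}_0 = 0, \\
\left\langle (\partial_t + \mathcal{L}_{\boldsymbol{\mu}})\,h^{(0)}_0 + \frac{(h^{(1)}_0)^2}{4\,k(t, \cdot)} \right\rangle = 0
\quad &\Longleftrightarrow \quad
(\partial_t + \mathcal{L}_{\boldsymbol{\mu}})\,h^{(0)}_0 + \frac{1}{4\,\kappa(t)}\,(h^{(1)}_0)^2 = 0.
\end{align*}

These PDEs for $h^{(0)}_0$ and $h^{(1)}_0$ result from the analysis in Section 2.2 with $k = \kappa$ (specifically corresponding to (A.3) and (A.2)), and, hence, $h^{(0)}_0 = h^{(0)}$ and $h^{(1)}_0 = h^{(1)}$ as in equations (2.31) and (2.30), respectively.

Proposition 2 (First-order Terms). The Poisson equations (2.34g) and (2.34h) are centered when $h^{(1)}_1$ and $h^{(0)}_1$ solve the PDEs

\begin{align}
(\partial_t + \mathcal{L}_{\boldsymbol{\mu}})\,h^{(1)}_1 + \frac{\mathcal{X}_0(t)}{\kappa(t)}\,h^{(1)}_1 + \boldsymbol{V} \cdot F^{(1)}(t, \boldsymbol{\mu}) &= 0, \tag{2.36} \\
(\partial_t + \mathcal{L}_{\boldsymbol{\mu}})\,h^{(0)}_1 + \frac{h^{(1)}_0}{2\,\kappa(t)} \left( h^{(1)}_1 + \boldsymbol{V} \cdot F^{(0)}(t, \boldsymbol{\mu}) \right) &= 0, \tag{2.37}
\end{align}

where $\boldsymbol{V} = -\boldsymbol{\rho}\,\langle \beta\,\psi' \rangle$,

\begin{align}
F^{(1)}(t, \boldsymbol{\mu}) &= \frac{\mathcal{X}_0(t)}{\kappa(t)} \sum_{i=1}^{k} g_i(\boldsymbol{\mu})\,\partial_{\mu_i} h^{(1)}_0(t, \boldsymbol{\mu}), \quad \text{and} \tag{2.38} \\
F^{(0)}(t, \boldsymbol{\mu}) &= \sum_{i=1}^{k} g_i(\boldsymbol{\mu})\,\partial_{\mu_i} h^{(1)}_0(t, \boldsymbol{\mu}), \tag{2.39}
\end{align}

and $\psi$ is the solution of the centered Poisson equation

$$
\mathcal{L}_0 \psi(y) = \eta(y). \tag{2.40}
$$

Proof. From the Poisson equations (2.34e) and (2.34f), and accounting for their centering constants, which may depend on time and trading signal, we find

\begin{align}
h^{(1)}_2(t, \boldsymbol{\mu}, y) &= -\left(\mathcal{L}_0^{-1} \eta(y)\right) \left( \frac{\mathcal{X}_0(t)}{\kappa(t)}\,h^{(1)}_0(t, \boldsymbol{\mu}) \right) + c^{(1)}(t, \boldsymbol{\mu}), \tag{2.41} \\
h^{(0)}_2(t, \boldsymbol{\mu}, y) &= -\left(\mathcal{L}_0^{-1} \eta(y)\right) \left( \frac{1}{4\,\kappa(t)}\,(h^{(1)}_0(t, \boldsymbol{\mu}))^2 \right) + c^{(0)}(t, \boldsymbol{\mu}), \tag{2.42}
\end{align}

for some functions $c^{(1)}$ and $c^{(0)}$ independent of $y$. Due to our standing assumption $\langle \eta \rangle = 0$, the Poisson equation (2.40) is centered and we can use $\psi$ to simplify (2.41) and (2.42).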
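Hence, we may write

\begin{align}
h^{(1)}_2(t, \boldsymbol{\mu}, y) &= -\psi(y)\,\frac{\mathcal{X}_0(t)}{\kappa(t)}\,h^{(1)}_0(t, \boldsymbol{\mu}) + c^{(1)}(t, \boldsymbol{\mu}), \tag{2.43} \\
h^{(0)}_2(t, \boldsymbol{\mu}, y) &= -\psi(y)\,\frac{1}{4\,\kappa(t)}\,(h^{(1)}_0(t, \boldsymbol{\mu}))^2 + c^{(0)}(t, \boldsymbol{\mu}). \tag{2.44}
\end{align}

Therefore,

\begin{align}
\mathcal{L}_1 h^{(1)}_2(t, \boldsymbol{\mu}, y) &= -\psi'(y)\,\beta(y)\,\frac{\mathcal{X}_0(t)}{\kappa(t)} \sum_{i=1}^{k} (g_i(\boldsymbol{\mu}) \cdot \boldsymbol{\rho})\,\partial_{\mu_i} h^{(1)}_0(t, \boldsymbol{\mu}), \tag{2.45} \\
\mathcal{L}_1 h^{(0)}_2(t, \boldsymbol{\mu}, y) &= -\psi'(y)\,\beta(y)\,\frac{h^{(1)}_0(t, \boldsymbol{\mu})}{2\,\kappa(t)} \sum_{i=1}^{k} (g_i(\boldsymbol{\mu}) \cdot \boldsymbol{\rho})\,\partial_{\mu_i} h^{(1)}_0(t, \boldsymbol{\mu}), \tag{2.46}
\end{align}

and the proposition follows after averaging (2.34g) and (2.34h) with respect to the stationary distribution of $Y^\varepsilon$.

The solutions to the PDEs (2.36) and (2.37) are provided in the following proposition. Its proof is a straightforward application of the Feynman-Kac representation.

Proposition 3. The unique solutions to (2.36) and (2.37) are

\begin{align}
h^{(1)}_1(t, \boldsymbol{\mu}) &:= \boldsymbol{V} \cdot \phi(t, \boldsymbol{\mu}), \quad \text{and} \tag{2.47} \\
h^{(0)}_1(t, \boldsymbol{\mu}) &:= \boldsymbol{V} \cdot \int_t^T \mathbb{E}\left[ \frac{h^{(1)}_0(s, \boldsymbol{\mu}_s)}{2\,\kappa(s)} \left( \phi(s, \boldsymbol{\mu}_s) + F^{(0)}(s, \boldsymbol{\mu}_s) \right) \,\Big|\, \boldsymbol{\mu}_t = \boldsymbol{\mu} \right] ds, \tag{2.48}
\end{align}

where

$$
\phi(t, \boldsymbol{\mu}) = \int_t^T e^{\int_t^s \frac{\mathcal{X}_0(u)}{\kappa(u)}\,du}\;\mathbb{E}\left[ F^{(1)}(s, \boldsymbol{\mu}_s) \mid \boldsymbol{\mu}_t = \boldsymbol{\mu} \right] ds. \tag{2.49}
$$

First-order approximation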

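Putting all these results together, the first-order approximation for $H^\varepsilon$ is given by

$$
H^\varepsilon(t, x, S, \boldsymbol{\mu}, q, y) \approx x + q\,S + h_0(t, \boldsymbol{\mu}, q) + \sqrt{\varepsilon}\,h_1(t, \boldsymbol{\mu}, q), \tag{2.50}
$$

where $h_1(t, \boldsymbol{\mu}, q) = h^{(1)}_1(t, \boldsymbol{\mu})\,q + h^{(0)}_1(t, \boldsymbol{\mu})$ and $h_0(t, \boldsymbol{\mu}, q)$ is given by Equation (2.28) with $k = \kappa$. By Proposition 3, $h_1$ is linear in $\boldsymbol{V}$, so the correction depends on $\varepsilon$ and $\boldsymbol{V}$ only through $\boldsymbol{V}^\varepsilon = \sqrt{\varepsilon}\,\boldsymbol{V}$. Recall that the optimal feedback control is

$$
v^{\varepsilon*}(t, \boldsymbol{\mu}, q, y) = \frac{1}{2\,k(t, y)}\,(b\,q + \partial_q h^\varepsilon) = \frac{1 + \eta(y)}{2\,\kappa(t)}\,(b\,q + \partial_q h^\varepsilon). \tag{2.51}
$$

Using the expansion derived for $h^\varepsilon$, we find, formally,

\begin{align}
v^{\varepsilon*}(t, \boldsymbol{\mu}, q, y)
&= \frac{1}{2\,k(t, y)} \left( b\,q + h^{(1),\varepsilon} + 2\,q\,h^{(2),\varepsilon} \right) \tag{2.52a} \\
&= \frac{1}{2\,k(t, y)} \left( 2\,\mathcal{X}^\varepsilon q + h^{(1),\varepsilon} \right) = \frac{\mathcal{X}^\varepsilon}{k(t, y)}\,q + \frac{1}{2\,k(t, y)}\,h^{(1),\varepsilon} \tag{2.52b} \\
&= \frac{\mathcal{X}_0(t)}{k(t, y)}\,q + \frac{1}{2\,k(t, y)} \left( h^{(1)}_0(t, \boldsymbol{\mu}) + \sqrt{\varepsilon}\,h^{(1)}_1(t, \boldsymbol{\mu}) \right) + \cdots \tag{2.52c} \\
&= \frac{1}{k(t, y)} \left( \mathcal{X}_0(t)\,q + \frac{1}{2}\,h^{(1)}_0(t, \boldsymbol{\mu}) \right) + \frac{1}{2\,k(t, y)}\,\sqrt{\varepsilon}\,h^{(1)}_1(t, \boldsymbol{\mu}) + \cdots. \tag{2.52d}
\end{align}

For later simplification, we define

$$
v_0(t, \boldsymbol{\mu}, q) = \frac{1}{\kappa(t)} \left( \mathcal{X}_0(t)\,q + \frac{1}{2}\,h^{(1)}_0(t, \boldsymbol{\mu}) \right), \tag{2.53}
$$

which represents the optimal control of model (2.1) assuming $k(t, \cdot) \equiv \kappa(t)$. Since $1/k(t, y) = (1 + \eta(y))/\kappa(t)$, the first-order approximation of the optimal control is then given by the expression

$$
(1 + \eta(y))\,v^\varepsilon(t, \boldsymbol{\mu}, q), \tag{2.54}
$$

where

\begin{align}
v^\varepsilon(t, \boldsymbol{\mu}, q) &= v_0(t, \boldsymbol{\mu}, q) + \frac{\sqrt{\varepsilon}}{2\,\kappa(t)}\,h^{(1)}_1(t, \boldsymbol{\mu}) \tag{2.55a} \\
&= v_0(t, \boldsymbol{\mu}, q) + \boldsymbol{V}^\varepsilon \cdot C(t, \boldsymbol{\mu}), \tag{2.55b}
\end{align}

and

$$
C(t, \boldsymbol{\mu}) = \frac{1}{2\,\kappa(t)}\,\phi(t, \boldsymbol{\mu}), \tag{2.56}
$$

with $\phi(t, \boldsymbol{\mu})$ given in (2.49).

In this section, we use a specific model to show how the results of the previous section may be applied in practice. To completely specify the model, we must determine the dynamics of $\boldsymbol{\mu}$ and $Y^\varepsilon$. For the latter, we choose $\alpha(y) = \theta - y$ and $\beta(y) := \sqrt{2}\,\beta$ (for a constant $\beta$). Thus, the stationary distribution of $Y^\varepsilon$ is Gaussian with mean $\theta$ and variance $\beta^2$. Setting $\eta(y) = y$, we must have $\theta = 0$ to guarantee $\langle \eta \rangle = 0$. In this case, we find $\psi(y) = -y$ (which solves (2.40)), and hence $\boldsymbol{V}^\varepsilon = \sqrt{2\,\varepsilon}\,\beta\,\boldsymbol{\rho}$.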
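Every quantity in the approximate strategy (2.53)-(2.56) is built on $\mathcal{X}_0$, the solution of the Riccati ODE (2.29) with $k = \kappa$, which has no closed form for a general $\kappa(t)$. The following is a minimal sketch of one way to compute it, assuming a classical fourth-order Runge-Kutta scheme integrated backward from the terminal condition; the function names, the example $\kappa(t)$ and all parameter values are hypothetical and are not taken from the paper.

```python
import math


def solve_riccati(kappa, phi, A, b, T, n_steps=2000):
    """Integrate X'(t) = phi - X(t)^2 / kappa(t) backward from X(T) = -A + b/2.

    kappa is a callable t -> kappa(t) > 0. Uses classical RK4 on a uniform grid.
    Returns (times, values), each of length n_steps + 1.
    """
    def rhs(t, x):
        return phi - x * x / kappa(t)

    dt = T / n_steps
    times = [i * dt for i in range(n_steps + 1)]
    values = [0.0] * (n_steps + 1)
    values[n_steps] = -A + 0.5 * b
    h = -dt  # negative step: we march from T down to 0
    for i in range(n_steps, 0, -1):
        t, x = times[i], values[i]
        k1 = rhs(t, x)
        k2 = rhs(t + 0.5 * h, x + 0.5 * h * k1)
        k3 = rhs(t + 0.5 * h, x + 0.5 * h * k2)
        k4 = rhs(t + h, x + h * k3)
        values[i - 1] = x + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return times, values


def zero_order_speed_no_signal(x0_t, kappa_t, q):
    """Zero-order speed (2.53) when the signal term h_0^(1) vanishes."""
    return x0_t * q / kappa_t


if __name__ == "__main__":
    # Hypothetical diurnal impact curve and parameters, for illustration only.
    kappa = lambda t: 1.0 + 0.5 * math.cos(2.0 * math.pi * t)
    times, x0 = solve_riccati(kappa, phi=1.0, A=10.0, b=0.1, T=1.0)
    print("X_0(0) =", x0[0])
    print("v_0(0, q=1) =", zero_order_speed_no_signal(x0[0], kappa(0.0), 1.0))
```

A convenient sanity check is the case $\phi = 0$ with constant $\kappa$, where the ODE integrates exactly to $1/\mathcal{X}(t) = 1/\mathcal{X}(T) - (T - t)/\kappa$.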
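We model the trading signal $\boldsymbol{\mu}$ as a multidimensional Ornstein-Uhlenbeck (OU) process

$$
d\boldsymbol{\mu}_t = (A\,\boldsymbol{\mu}_t + \bar{\boldsymbol{\mu}})\,dt + B\,d\boldsymbol{W}'_t,
$$

where $A, B \in \mathbb{R}^{k \times k}$ and $\bar{\boldsymbol{\mu}} \in \mathbb{R}^k$. Straightforward computations show that, for $s \ge t$,

$$
\boldsymbol{\mu}_s = e^{A(s - t)}\,\boldsymbol{\mu}_t + \int_t^s e^{A(s - u)}\,du\;\bar{\boldsymbol{\mu}} + \int_t^s e^{A(s - u)}\,B\,d\boldsymbol{W}'_u.
$$

Then

$$
\mathbb{E}[\boldsymbol{\mu}_s \mid \boldsymbol{\mu}_t = \boldsymbol{\mu}] = e^{A(s - t)}\,\boldsymbol{\mu} + \int_t^s e^{A(s - u)}\,du\;\bar{\boldsymbol{\mu}},
$$

and from (2.30), we may write (for the case of deterministic impact)

\begin{align}
h^{(1)}(t, \boldsymbol{\mu}) &= \boldsymbol{\gamma} \cdot \left( \int_t^T e^{\int_t^s \frac{\mathcal{X}(u)}{k(u)}\,du} \left( e^{A(s - t)}\,\boldsymbol{\mu} + \int_t^s e^{A(s - u)}\,du\;\bar{\boldsymbol{\mu}} \right) ds \right) \tag{3.1} \\
&= \boldsymbol{\gamma} \cdot \left( \Phi_0(t)\,\boldsymbol{\mu} + \Phi_1(t)\,\bar{\boldsymbol{\mu}} \right), \tag{3.2}
\end{align}

where

$$
\Phi_0(t) = \int_t^T e^{\int_t^s \frac{\mathcal{X}(u)}{k(u)}\,du}\,e^{A(s - t)}\,ds
\quad \text{and} \quad
\Phi_1(t) = \int_t^T e^{\int_t^s \frac{\mathcal{X}(u)}{k(u)}\,du} \int_t^s e^{A(s - u)}\,du\,ds. \tag{3.3}
$$

If $A$ is invertible, then we may write $\Phi_1$ in terms of $\Phi_0$ using

$$
\int_t^s e^{A(s - u)}\,du = \left( e^{A(s - t)} - I \right) A^{-1},
$$

which implies

$$
\Phi_1(t) = \int_t^T e^{\int_t^s \frac{\mathcal{X}(u)}{k(u)}\,du} \left( e^{A(s - t)} - I \right) A^{-1}\,ds = \left( \Phi_0(t) - \int_t^T e^{\int_t^s \frac{\mathcal{X}(u)}{k(u)}\,du}\,ds\;I \right) A^{-1}.
$$

Using the representation in (3.1) with $k = \kappa$, the explicit formula for the zeroth-order optimal trading speed (2.53) is

$$
v_0(t, \boldsymbol{\mu}, q) = \frac{\mathcal{X}_0(t)}{\kappa(t)}\,q + \frac{1}{2\,\kappa(t)}\,\boldsymbol{\gamma} \cdot \left( \Phi_0(t)\,\boldsymbol{\mu} + \Phi_1(t)\,\bar{\boldsymbol{\mu}} \right). \tag{3.4}
$$

Next, we analyze $\phi(t, \boldsymbol{\mu})$ (see (2.49)) to obtain the first-order correction. First note that

$$
F^{(1)}(t, \boldsymbol{\mu}) = \frac{\mathcal{X}_0(t)}{\kappa(t)} \sum_{i=1}^{k} b_i\,\partial_{\mu_i} h^{(1)}(t, \boldsymbol{\mu}),
$$

where $b_i$ is the $i$-th row of $B$. Moreover,

$$
\partial_{\mu_i} h^{(1)}(t, \boldsymbol{\mu}) = \sum_{j=1}^{k} \gamma_j\,\Phi_{0,ji}(t) = \boldsymbol{\gamma} \cdot \Phi_{0,i}(t),
$$

where $\Phi_{0,i}(t)$ is the $i$-th column of $\Phi_0(t)$. Thus

$$
F^{(1)}(t, \boldsymbol{\mu}) = \frac{\mathcal{X}_0(t)}{\kappa(t)} \sum_{i=1}^{k} \left( \boldsymbol{\gamma} \cdot \Phi_{0,i}(t) \right) b_i,
$$

which is independent of $\boldsymbol{\mu}$.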
Therefore, the function $\phi(t, \boldsymbol{\mu})$ in (2.49) is independent of $\boldsymbol{\mu}$ and

$$
\phi(t) = \int_t^T e^{\int_t^s \frac{\mathcal{X}_0(u)}{\kappa(u)}\,du}\,F^{(1)}(s)\,ds
= \int_t^T e^{\int_t^s \frac{\mathcal{X}_0(u)}{\kappa(u)}\,du}\,\frac{\mathcal{X}_0(s)}{\kappa(s)} \sum_{i=1}^{k} \left( \boldsymbol{\gamma} \cdot \Phi_{0,i}(s) \right) b_i\,ds
= \sum_{i=1}^{k} \left( \boldsymbol{\gamma} \cdot \Phi_{2,i}(t) \right) b_i,
$$

where

$$
\Phi_{2,ji}(t) = \int_t^T e^{\int_t^s \frac{\mathcal{X}_0(u)}{\kappa(u)}\,du}\,\frac{\mathcal{X}_0(s)}{\kappa(s)}\,\Phi_{0,ji}(s)\,ds.
$$

Hence, $C$ is independent of $\boldsymbol{\mu}$ and

$$
C(t) = \frac{1}{2\,\kappa(t)}\,\phi(t) = \frac{1}{2\,\kappa(t)} \sum_{i=1}^{k} b_i \left( \boldsymbol{\gamma} \cdot \Phi_{2,i}(t) \right).
$$

Inserting these results into (2.55) and (3.4) provides the first-order approximation for the optimal strategy as

$$
v^\varepsilon(t, \boldsymbol{\mu}, q) = \frac{\mathcal{X}_0(t)}{\kappa(t)}\,q + \frac{1}{2\,\kappa(t)}\,\boldsymbol{\gamma} \cdot \left( \Phi_0(t)\,\boldsymbol{\mu} + \Phi_1(t)\,\bar{\boldsymbol{\mu}} \right) + \frac{1}{2\,\kappa(t)} \sum_{i=1}^{k} \left( \boldsymbol{V}^\varepsilon \cdot b_i \right) \left( \boldsymbol{\gamma} \cdot \Phi_{2,i}(t) \right).
$$

Figure 2: Estimation of the process $Y^\varepsilon$ and of the stationary distribution.

In this section, we present an application of our model with real data from the Microsoft stock in 2014. The impact parameter is estimated using the cost of walking the limit order book by fictitious orders of various volumes, and regressing this cost/volume curve to obtain a linear approximation. The slope is our estimate of $\kappa_t = k(t, Y^\varepsilon_t)$, and we obtain this estimate every second of the day, resulting in a sample path as shown in Figure 1. To obtain estimates for $\eta_t = \eta(Y^\varepsilon_t)$, we perform the following steps:

1. Project the sample path of $\kappa_t$ onto a polynomial basis of order $J = 8$, i.e. find
$$
\alpha^\star := \arg\min_{\alpha} \sum_{i=1}^{N} \left( \kappa_{t_i} - \sum_{j=1}^{J} \alpha_j\,t_i^{\,j-1} \right)^2,
$$
where $N = 23400$ (number of seconds in a trading day) and $t_i = i/N$.

2. Adjust the coefficients to ensure that $\langle \eta_t \rangle = 0$. We do this by making the empirical mean of the implied $\eta_t$ process over the data as close to zero as possible:
$$
\alpha^* := \arg\min_{\alpha} \left( \frac{1}{N} \sum_{i=1}^{N} \left( \frac{\sum_{j=1}^{J} \alpha_j\,t_i^{\,j-1}}{\kappa_{t_i}} - 1 \right) \right)^2,
$$
where the initial starting point for the minimizer is $\alpha^\star$ from step 1.

3. Next, we assume $\eta(y) = y$ and set
$$
Y^\varepsilon_t = \eta_t = \frac{\sum_{j=1}^{J} \alpha^*_j\,t^{\,j-1}}{\kappa_t} - 1,
$$
and assume $Y^\varepsilon_t$ is an OU process and hence satisfies $dY^\varepsilon_t = -\frac{1}{\varepsilon}\,Y^\varepsilon_t\,dt + \frac{1}{\sqrt{\varepsilon}}\,\sqrt{2}\,\beta\,dW^*_t$.

4. Finally, we estimate the parameters $\varepsilon$ and $\beta$ by regressing $(\eta_{t_{i+1}} - \eta_{t_i})$ onto $\eta_{t_i}$.

The results of the estimation are provided in Table 1, and the $\eta_t$ sample path, together with a scatter plot of $\eta_t$ and $\eta_{t-1}$ and a histogram of residuals, is shown in Figure 2.

Table 1: Parameters: estimated on the left table and exogenous on the other two. [Only two entries survive in this copy: $\rho = -0.5$ and $V^\varepsilon = -0.008$.]

Figure 3 illustrates the key functions ($\mathcal{X}_0(t)$, which solves the ODE (2.29), $\Phi_0$ and $\Phi_1$ given in (3.3), and $C$ defined in (2.56)) that feed into the zero-order and first-order control strategies, $v_0$ and $(1 + \eta(y))\,v^\varepsilon$, using the estimated parameters. The figure shows the results for three different running penalty parameters $\phi = b$, $5b$, and $10b$. As $\phi$ increases, we see that $\mathcal{X}_0(t)$ generally increases in magnitude (more negative), $\Phi_0$ and $\Phi_1$ decrease, while $C$ behaves non-monotonically.

Figure 3: Plot of $\mathcal{X}_0/\kappa$, $\Phi_0$, $\Phi_1$ and $C$ that define the control strategies $v_0$ and $v^\varepsilon$ for various levels of $\phi$.

Figure 4 shows a sample path of the zero-order and first-order optimal controls for various running penalties. The figure shows that the starting level of trading speed increases with $\phi$, as the trader is more urgent to rid themselves of shares, but larger $\phi$ levels induce the trader to slow down sooner in the day and to end trading more slowly. As well, and as expected, the figure shows that responding to the stochastic impact on average traces out the zero-order path.

Figure 4: A sample path of the zero-order and first-order optimal controls, with $\phi = b, 5b, 10b$.

Next, we simulate 10,000 sample paths of the zero-order and first-order optimal controls, and the corresponding inventory paths. Figure 5 shows how the first-order optimal control's inventory paths differ from (left) the Almgren-Chriss (AC) paths, defined as the paths with trading speed $v^{AC}(t, q) = \mathcal{X}_0(t)\,q/\kappa(t)$, i.e. (2.53) without the signal term, and (right) the zero-order paths.
The figure shows a single sample path (generated with the same set of random numbers, although they do produce different impacts due to differing trading speeds), together with the 10%, 50%, and 90% quantiles across all 10,000 sample paths. As the figure shows, for the median path, there is a slight slowdown relative to the AC path (since the median has an upward bump). As urgency ($\phi$) increases, the deviations develop a hump just after the start of trading, and towards the end of trading the deviations are pinned more closely to the AC solution. The deviations from the zero-order strategy have a median path of essentially zero and, as expected, the variation around the zero-order path is smaller than around the AC path. There is, however, still a significant amount of deviation due to the first-order strategy correctly adjusting to the stochastic impact.

Figure 5: A sample path and the 10%, 50%, and 90% quantiles of (left) the deviation of the inventory following the first-order optimal and the Almgren-Chriss controls, and (right) the deviation of the inventory following the first-order optimal and the zero-order optimal controls. Panels: $\phi = b$, $\phi = 5b$, $\phi = 10b$.

As a final numerical comparison, we investigate the cost savings that the first-order strategy provides over the AC and the zero-order strategy. For this purpose we compute the cost $C^\nu_T = X^\nu_T + Q^\nu_T\,S^\nu_T$ associated with a strategy $\nu$ and compute the savings in basis points relative to a benchmark $\bar{\nu}$ (with $\bar{\nu}$ either given by $\nu^{AC}$ or $\nu^{(0)}$) as

$$
\text{basis points savings} = \frac{C^\nu_T - C^{\bar{\nu}}_T}{C^{\bar{\nu}}_T} \times 10^4. \tag{4.1}
$$

The results for various levels of urgency are shown in Figure 6. The results show that, as we increase the urgency, we on average perform better than both the AC and zero-order strategies.

Figure 6: Histogram of the savings in basis points for various $\phi$ between the first-order strategy, zero-order strategy, and AC strategy. Dashed lines show the location of the corresponding median.
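The regression in step 4 of the estimation procedure can be illustrated on synthetic data. The sketch below is an assumption-laden illustration rather than the authors' procedure: it simulates an OU path with the exact one-step transition, regresses increments on levels without an intercept, and maps the slope to $\varepsilon$ through the exact relation $1 + \text{slope} = e^{-\Delta/\varepsilon}$ (the paper does not state which mapping it uses). The function names and parameter values are hypothetical.

```python
import math
import random


def simulate_ou(eps, beta, dt, n, seed=0):
    """Exact simulation of dY = -(1/eps) Y dt + sqrt(2/eps) * beta dW.

    The stationary distribution is N(0, beta^2). Returns a list of n + 1 values.
    """
    rng = random.Random(seed)
    a = math.exp(-dt / eps)               # one-step autoregression coefficient
    sd = beta * math.sqrt(1.0 - a * a)    # one-step innovation standard deviation
    y = [0.0] * (n + 1)
    for i in range(n):
        y[i + 1] = a * y[i] + sd * rng.gauss(0.0, 1.0)
    return y


def estimate_ou(eta, dt):
    """Regress (eta[i+1] - eta[i]) on eta[i] (no intercept); return (eps_hat, beta_hat)."""
    levels, nexts = eta[:-1], eta[1:]
    sxx = sum(v * v for v in levels)
    sxy = sum(v * (w - v) for v, w in zip(levels, nexts))
    slope = sxy / sxx
    a_hat = 1.0 + slope                    # estimate of exp(-dt / eps)
    eps_hat = -dt / math.log(a_hat)
    resid_var = sum((w - a_hat * v) ** 2 for v, w in zip(levels, nexts)) / len(levels)
    beta_hat = math.sqrt(resid_var / (1.0 - a_hat * a_hat))
    return eps_hat, beta_hat


if __name__ == "__main__":
    path = simulate_ou(eps=0.05, beta=0.2, dt=0.001, n=200_000, seed=1)
    eps_hat, beta_hat = estimate_ou(path, dt=0.001)
    print("estimated eps :", eps_hat)
    print("estimated beta:", beta_hat)
```

With real data, `eta` would be the implied $\eta_{t_i}$ series from step 3 and `dt` the sampling interval in the chosen time unit.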
In this section we prove the accuracy of the approximation of X ε by X . As noted earlier, once the accuracyfor h (2) ,ε is proved, the accuracy for the approximations for h (1) ,ε and h (0) ,ε , which are the solutions ofthe linear PDEs (2.22) and (2.23), follow from the usual arguments from [13]. The proof of accuracy fornon-linear PDEs was ﬁrst developed in [9] under the diﬀerent context of portfolio optimization.Recall that X ε satisﬁes the non-linear PDE (2.25): (cid:40) ∂ t X ε − φ + ( X ε ) k ( t,y ) + ε L X ε = 0 , X ε ( T, y ) = − A + b/ . (5.1)The zero-order approximation X is the solution of Ricatti ODE  X (cid:48) ( t ) − φ + 1 κ ( t ) X ( t ) = 0 , X ( T ) = − A + b/ . (5.2)To provide precise bounds, we make the following standing assumption for the remainder of this section. Assumptions 1.

We assume that
1. $\kappa$ is bounded and bounded away from zero with bounded derivative; and
2. $\eta$ is bounded, implying that the solutions to all Poisson equations considered below and their derivatives are bounded.

Theorem 1. With Assumptions 1 enforced, there exists $C > 0$ such that, for any $\varepsilon < 1$,

$$|\mathcal{X}^\varepsilon(t,y) - \mathcal{X}_0(t)| \le C\,\varepsilon,$$

for any $t \in [0,T]$ and $y \in \mathbb{R}$.

Proof. The proof involves three main steps: (i) recast the PDE (2.25) for $\mathcal{X}^\varepsilon$ in terms of an auxiliary control problem, (ii) construct a sub-solution and a super-solution of (2.25), and (iii) use the sub/super-solutions to bound the error.

First, the PDE for $\mathcal{X}^\varepsilon$ may be recast as the following HJB equation

$$\partial_t \mathcal{X}^\varepsilon - \phi + \sup_{\xi \in \mathcal{B}} \Big\{ -\mathcal{X}^\varepsilon \xi - \tfrac{1}{4}\,k(t,y)\,\xi^2 \Big\} + \frac{1}{\varepsilon}\,\mathcal{L}_0 \mathcal{X}^\varepsilon = 0, \qquad \mathcal{X}^\varepsilon(T,y) = -A + b/2, \tag{5.3}$$

where $\mathcal{B} = \mathbb{R}$. Therefore, non-linear Feynman-Kac (see [26]) implies that $\mathcal{X}^\varepsilon$ admits the representation in terms of the auxiliary control problem

$$\mathcal{X}^\varepsilon(t,y) = \sup_{\xi \in \mathfrak{B}} Z(t,y,\xi), \tag{5.4}$$

$$Z(t,y,\xi) = \mathbb{E}_{t,y}\left[ (-A + b/2)\, e^{-\int_t^T \xi_u\,du} + \int_t^T e^{-\int_t^s \xi_u\,du} \Big( -\phi - \tfrac{1}{4}\,k(s,Y_s^\varepsilon)\,\xi_s^2 \Big)\,ds \right], \tag{5.5}$$

where $\mathfrak{B}$ is a set of admissible controls taking values in $\mathcal{B}$ that consists of $\mathcal{F}$-predictable processes in $L^2([0,T],\Omega)$. We next proceed along the ideas developed in [9] for the proof of accuracy of non-linear PDEs stemming from portfolio optimisation problems.

To this end, define $\mathcal{X}^\pm$ as follows

$$\mathcal{X}^\pm(t,y) = \mathcal{X}_0(t) + \varepsilon\,\tilde{\mathcal{X}}_1(t,y) \pm \varepsilon\,(2T - t)\,C\,N(t) \pm \varepsilon^2 M(t,y), \tag{5.6}$$

for a constant $C > 0$ and functions $\tilde{\mathcal{X}}_1$, $N$ and $M$ to be defined shortly.

By the Poisson equation (2.34b), we find

$$\mathcal{X}_1(t,y) = -\psi(y)\,\frac{\mathcal{X}_0^2(t)}{\kappa(t)} + c(t), \tag{5.7}$$

for some function $c$ independent of $y$, where $\psi$ is the solution of the Poisson equation $\mathcal{L}_0 \psi = \eta$, which is defined only up to a deterministic function of time (constant in $y$). By choosing this function, we may therefore assume $\langle \psi \rangle = 0$ without loss of generality. We define the $y$-dependent part of $\mathcal{X}_1$ as

$$\tilde{\mathcal{X}}_1(t,y) := -\psi(y)\,\frac{\mathcal{X}_0^2(t)}{\kappa(t)}, \tag{5.8}$$

which sets one of the three free functions in the definition of $\mathcal{X}^\pm$.

Next, define the differential operator

$$\mathcal{R}^{\xi}[\mathcal{X}] = \partial_t \mathcal{X} - \phi - \xi\,\mathcal{X} - \tfrac{1}{4}\,k(t,y)\,\xi^2 + \frac{1}{\varepsilon}\,\mathcal{L}_0 \mathcal{X}, \tag{5.9}$$

for any $\xi \in \mathcal{B}$. The HJB equation (5.3) may be written as $\sup_{\xi \in \mathcal{B}} \mathcal{R}^{\xi}[\mathcal{X}^\varepsilon] = 0$.
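The equivalence between the HJB form (5.3) and the non-linear PDE (5.1) rests on a pointwise maximisation of a concave quadratic in $\xi$. The worked step below is a sketch under one stated assumption: the quadratic penalty carries the factor $1/4$, which is the normalisation that reproduces the term $(\mathcal{X}^\varepsilon)^2/k(t,y)$ of (5.1). The label $\xi^*$ is introduced here for illustration only.

```latex
% Completing the square in \xi for fixed (t, y), with k = k(t,y) > 0:
\begin{aligned}
-\mathcal{X}\,\xi - \tfrac{1}{4}\,k\,\xi^{2}
  &= \frac{\mathcal{X}^{2}}{k}
     - \frac{k}{4}\Big(\xi + \frac{2\,\mathcal{X}}{k}\Big)^{2}, \\[4pt]
\sup_{\xi \in \mathbb{R}} \Big\{ -\mathcal{X}\,\xi - \tfrac{1}{4}\,k\,\xi^{2} \Big\}
  &= \frac{\mathcal{X}^{2}}{k},
  \qquad \text{attained at } \xi^{*} = -\frac{2\,\mathcal{X}}{k}.
\end{aligned}
```

Substituting the supremum into the HJB equation recovers the quadratic non-linearity of the original PDE, and evaluating the maximiser at the zero-order solution gives a natural candidate for the auxiliary control used in the sub-solution argument.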
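For any control $\xi \in \mathfrak{B}$, Itô's lemma implies

$$\mathcal{X}^\pm(T,Y_T^\varepsilon)\, e^{-\int_t^T \xi_u\,du} - \int_t^T e^{-\int_t^s \xi_u\,du} \Big( \phi + \tfrac{1}{4}\,k(s,Y_s^\varepsilon)\,\xi_s^2 \Big)\,ds \tag{5.10}$$

$$= \mathcal{X}^\pm(t,y) + \int_t^T e^{-\int_t^s \xi_u\,du}\, \mathcal{R}^{\xi_s}[\mathcal{X}^\pm](s,Y_s^\varepsilon)\,ds \tag{5.11}$$

$$\quad + \frac{1}{\sqrt{\varepsilon}} \int_t^T e^{-\int_t^s \xi_u\,du}\, \beta(Y_s^\varepsilon)\, \partial_y \mathcal{X}^\pm(s,Y_s^\varepsilon)\,dW_s^*. \tag{5.12}$$

Taking expectation, since the Itô integral above is a true martingale by Assumptions 1, we find

$$Z(t,y,\xi) + \varepsilon\,H^\pm(t,y,\xi) = \mathcal{X}^\pm(t,y) + \int_t^T \mathbb{E}_{t,y}\Big[ e^{-\int_t^s \xi_u\,du}\, \mathcal{R}^{\xi_s}[\mathcal{X}^\pm](s,Y_s^\varepsilon) \Big]\,ds, \tag{5.13}$$

$$H^\pm(t,y,\xi) := \mathbb{E}_{t,y}\Big[ e^{-\int_t^T \xi_u\,du} \Big( \tilde{\mathcal{X}}_1(T,Y_T^\varepsilon) \pm T\,C\,N(T) \pm \varepsilon\,M(T,Y_T^\varepsilon) \Big) \Big]. \tag{5.14}$$

Next, we construct particular choices for the two remaining free functions of $\mathcal{X}^\pm$, defined in (5.6), that provide sub- and super-solutions.

Sub-solution

Let us first analyze $\mathcal{X}^-$. Acting on it with the $\mathcal{R}^{\xi}$ operator, and using that $\mathcal{L}_0 \mathcal{X}_0 = 0$ because $\mathcal{X}_0$ does not depend on $y$, we have

$$\mathcal{R}^{\xi}[\mathcal{X}^-] = \mathcal{X}_0' - \phi - \xi\,\mathcal{X}_0 - \tfrac{1}{4}\,k(t,y)\,\xi^2 + \varepsilon\,\partial_t \tilde{\mathcal{X}}_1 - \varepsilon\,\xi\,\tilde{\mathcal{X}}_1 + \mathcal{L}_0 \tilde{\mathcal{X}}_1 + \varepsilon\,C\,N(t) - \varepsilon\,C\,(2T - t)\,(\partial_t N - \xi N) - \varepsilon^2 (\partial_t M - \xi M) - \varepsilon\,\mathcal{L}_0 M. \tag{5.15}$$

Define the particular auxiliary control

$$\xi(t,y) := -\frac{2\,\mathcal{X}_0(t)}{k(t,y)} = -2\,(1 + \eta(y))\,\frac{\mathcal{X}_0(t)}{\kappa(t)}. \tag{5.16}$$

With this control, we have that

$$\mathcal{X}_0' - \phi - \xi\,\mathcal{X}_0 - \tfrac{1}{4}\,k(t,y)\,\xi^2 = \mathcal{X}_0' - \phi + \frac{\mathcal{X}_0^2}{k(t,y)}, \tag{5.17}$$

hence, by (2.34b),

$$\mathcal{X}_0' - \phi + \frac{\mathcal{X}_0^2}{k(t,y)} + \mathcal{L}_0 \tilde{\mathcal{X}}_1 = 0. \tag{5.18}$$

Therefore,

$$\mathcal{R}^{\xi}[\mathcal{X}^-] = \varepsilon \Big\{ \partial_t \tilde{\mathcal{X}}_1 - \xi\,\tilde{\mathcal{X}}_1 + C\,N(t) - C\,(2T - t)\,(N'(t) - \xi N) - \mathcal{L}_0 M \Big\} - \varepsilon^2 \{ \partial_t M - \xi M \} \tag{5.19}$$

$$= \varepsilon \Big\{ -\psi(y) \Big( \partial_t \big(\mathcal{X}_0^2/\kappa\big) + 2\,\mathcal{X}_0^3/\kappa^2 \Big) - 2\,\eta(y)\,\psi(y)\,\mathcal{X}_0^3/\kappa^2 + C\,N(t) - C\,(2T - t) \big( N'(t) + 2\,\mathcal{X}_0 N/\kappa \big) - 2\,C\,(2T - t)\,\eta(y)\,\mathcal{X}_0 N/\kappa - \mathcal{L}_0 M \Big\} - \varepsilon^2 \{ \partial_t M - \xi M \}. \tag{5.20}$$

We next choose $M$ such that the $\varepsilon$ term is centered with respect to the invariant measure, that is

$$M(t,y) = -\Psi_1(y) \Big( \partial_t \big(\mathcal{X}_0^2/\kappa\big) + 2\,\mathcal{X}_0^3/\kappa^2 \Big) - 2\,\Psi_2(y)\,\mathcal{X}_0^3/\kappa^2 - 2\,\psi(y)\,C\,(2T - t)\,\mathcal{X}_0 N/\kappa, \tag{5.21}$$

where we use that $\langle \eta \rangle = \langle \psi \rangle = 0$, and where

$$\mathcal{L}_0 \Psi_1 = \psi \quad \text{and} \quad \mathcal{L}_0 \Psi_2 = \eta\psi - \langle \eta\psi \rangle. \tag{5.22}$$

Therefore, we find

$$\mathcal{R}^{\xi}[\mathcal{X}^-] = \varepsilon \left\{ -2\,\langle \psi\eta \rangle\,\frac{\mathcal{X}_0^3}{\kappa^2} + C\,N(t) - C\,(2T - t) \left( N'(t) + \frac{2\,\mathcal{X}_0 N}{\kappa} \right) \right\} - \varepsilon^2 \{ \partial_t M - \xi M \}. \tag{5.23}$$

Next, we choose $N$ such that the term proportional to $(2T - t)$ vanishes, which is given by

$$N(t) = \exp\left\{ -2 \int_0^t \frac{\mathcal{X}_0(s)}{\kappa(s)}\,ds \right\}. \tag{5.24}$$

For the above choices of $N$ and $M$, we therefore obtain

$$\mathcal{R}^{\xi}[\mathcal{X}^-] = \varepsilon \big( -2\,\langle \psi\eta \rangle\,\mathcal{X}_0^3/\kappa^2 + C\,N(t) \big) - \varepsilon^2 ( \partial_t M - \xi M ). \tag{5.25}$$

By Assumptions 1, the first and last terms of the expression above are bounded. Hence, there exists $C$ large enough such that

$$\int_t^T \mathbb{E}_{t,y}\Big[ e^{-\int_t^s \xi(u,Y_u^\varepsilon)\,du}\, \mathcal{R}^{\xi}[\mathcal{X}^-](s,Y_s^\varepsilon) \Big]\,ds \ge 0, \tag{5.26}$$

for any $t$ and $y$. Moreover, notice that

$$H^-(t,y,\xi) = \mathbb{E}_{t,y}\Big[ e^{-\int_t^T \xi(u,Y_u^\varepsilon)\,du} \Big( \tilde{\mathcal{X}}_1(T,Y_T^\varepsilon) - T\,C\,N(T) - \varepsilon\,M(T,Y_T^\varepsilon) \Big) \Big].$$

Again by Assumptions 1, if necessary, we increase $C$ such that the term inside the expectation that defines $H^-(t,y,\xi)$ is negative everywhere. Hence, by the definition of $\mathcal{X}^\varepsilon$ we find that, for any $(t,y) \in [0,T] \times \mathbb{R}$,

$$\mathcal{X}^\varepsilon(t,y) \ge Z(t,y,\xi) \tag{5.27}$$

$$= \mathcal{X}^-(t,y) - \varepsilon\,H^-(t,y,\xi) + \int_t^T \mathbb{E}_{t,y}\Big[ e^{-\int_t^s \xi(u,Y_u^\varepsilon)\,du}\, \mathcal{R}^{\xi}[\mathcal{X}^-](s,Y_s^\varepsilon) \Big]\,ds \tag{5.28}$$

$$\ge \mathcal{X}^-(t,y). \tag{5.29}$$

Super-solution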

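Let us now analyze $\mathcal{X}^+$. We start by computing

$$\sup_{\xi \in \mathcal{B}} \mathcal{R}^{\xi}[\mathcal{X}^+] = \partial_t \mathcal{X}^+ - \phi + \frac{(\mathcal{X}^+)^2}{k(t,y)} + \frac{1}{\varepsilon}\,\mathcal{L}_0 \mathcal{X}^+ \tag{5.30}$$

$$= \varepsilon \Big\{ \partial_t \tilde{\mathcal{X}}_1 + \frac{2}{k(t,y)}\,\mathcal{X}_0 \big( \tilde{\mathcal{X}}_1 + (2T - t)\,C\,N + \varepsilon M \big) - C\,N + (2T - t)\,C\,\partial_t N + \mathcal{L}_0 M \Big\} + \varepsilon^2 \left\{ \frac{1}{k(t,y)} \big( \tilde{\mathcal{X}}_1 + (2T - t)\,C\,N + \varepsilon M \big)^2 + \partial_t M \right\}. \tag{5.31}$$

The term of order $\varepsilon$ is then

$$\partial_t \tilde{\mathcal{X}}_1 + \frac{2}{k(t,y)}\,\mathcal{X}_0 \big( \tilde{\mathcal{X}}_1 + (2T - t)\,C\,N \big) - C\,N + (2T - t)\,C\,\partial_t N + \mathcal{L}_0 M. \tag{5.32}$$

Similarly to what was done before, we choose $M$ to center (5.32) with respect to the invariant distribution. Since $\langle \psi \rangle = \langle \eta \rangle = 0$, the term of order $\varepsilon$ becomes

$$-2\,\langle \eta\psi \rangle\,\mathcal{X}_0^3/\kappa^2 - C\,N + (2T - t)\,C \big( \partial_t N + 2\,\mathcal{X}_0 N/\kappa \big). \tag{5.33}$$

Choosing the same function $N$ as for the sub-solution, the term of order $\varepsilon$ is given by

$$-2\,\langle \eta\psi \rangle\,\mathcal{X}_0^3/\kappa^2 - C\,N. \tag{5.34}$$

By Assumptions 1, the term of order $\varepsilon^2$ is bounded. We can then choose $C$ large enough so that $\sup_{\xi \in \mathcal{B}} \mathcal{R}^{\xi}[\mathcal{X}^+] \le 0$. Furthermore,

$$\mathcal{X}^+(T,y) = (-A + b/2) + \varepsilon\,\tilde{\mathcal{X}}_1(T,y) + \varepsilon\,T\,C\,N(T) + \varepsilon^2 M(T,y). \tag{5.35}$$

Possibly increasing $C$ even more, we conclude that, say for $\varepsilon \le 1$, $\mathcal{X}^+(T,y) \ge -A + b/2$ for all $y$. Then, by Itô's formula (5.10),

$$Z(t,y,\xi) \le \mathbb{E}_{t,y}\left[ \mathcal{X}^+(T,Y_T^\varepsilon)\, e^{-\int_t^T \xi_u\,du} + \int_t^T e^{-\int_t^s \xi_u\,du} \Big( -\phi - \tfrac{1}{4}\,k(s,Y_s^\varepsilon)\,\xi_s^2 \Big)\,ds \right] \tag{5.36}$$

$$= \mathcal{X}^+(t,y) + \int_t^T \mathbb{E}_{t,y}\Big[ e^{-\int_t^s \xi_u\,du}\, \mathcal{R}^{\xi_s}[\mathcal{X}^+](s,Y_s^\varepsilon) \Big]\,ds \tag{5.37}$$

$$\le \mathcal{X}^+(t,y) + \int_t^T \mathbb{E}_{t,y}\Big[ e^{-\int_t^s \xi_u\,du} \sup_{\xi \in \mathcal{B}} \mathcal{R}^{\xi}[\mathcal{X}^+](s,Y_s^\varepsilon) \Big]\,ds \le \mathcal{X}^+(t,y). \tag{5.38}$$

Therefore,

$$\mathcal{X}^\varepsilon(t,y) \le \mathcal{X}^+(t,y). \tag{5.39}$$

We have finally concluded that there exists a constant $C > 0$ such that

$$|\mathcal{X}^\varepsilon(t,y) - \mathcal{X}_0(t)| \le C\,\varepsilon. \tag{5.40}$$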
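The order-$\varepsilon$ bound can be illustrated numerically in a simplified setting. The sketch below is not the diffusion model of the paper: it swaps the fast factor for a two-state continuous-time Markov chain with generator $Q/\varepsilon$, so that the analogue of (5.1) becomes a pair of coupled ODEs that can be integrated backward in time alongside the Riccati ODE (5.2). All parameter values, the constant $\kappa$, the relation $1/k = (1+\eta)/\kappa$ used in the code, and the function names are assumptions made for illustration.

```python
import numpy as np

# Illustrative parameters (assumptions, not a calibration).
PHI, KAPPA, T = 1.0, 1.0, 1.0              # urgency, constant impact level, horizon
A, B_PERM = 2.1, 0.2                       # terminal penalty A and permanent impact b
X_T = -A + B_PERM / 2.0                    # terminal condition -A + b/2
ETA = np.array([0.5, -0.5])                # eta in each chain state (mean zero)
Q = np.array([[-1.0, 1.0], [1.0, -1.0]])   # chain generator, invariant law (1/2, 1/2)
N_STEPS = 4000

def rk4_backward(rhs, x_terminal, n_steps=N_STEPS):
    """Integrate x'(t) = rhs(t, x) from t = T down to t = 0 with classical RK4."""
    h = T / n_steps
    t, x = T, np.array(x_terminal, dtype=float)
    path = [x.copy()]
    for _ in range(n_steps):
        k1 = rhs(t, x)
        k2 = rhs(t - 0.5 * h, x - 0.5 * h * k1)
        k3 = rhs(t - 0.5 * h, x - 0.5 * h * k2)
        k4 = rhs(t - h, x - h * k3)
        x = x - (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        t -= h
        path.append(x.copy())
    # Reverse so that row 0 corresponds to t = 0 and the last row to t = T.
    return np.linspace(0.0, T, n_steps + 1), np.array(path[::-1])

def rhs_zero_order(t, x):
    # Riccati ODE: X0' = phi - X0^2 / kappa
    return PHI - x**2 / KAPPA

def make_rhs_eps(eps):
    def rhs(t, x):
        # Chain analogue of the non-linear PDE: X' = phi - (1 + eta) X^2 / kappa - (1/eps) Q X
        return PHI - (1.0 + ETA) * x**2 / KAPPA - (Q @ x) / eps
    return rhs

def riccati_closed_form(t):
    """Closed-form solution of the Riccati ODE for constant kappa (sanity check)."""
    c, omega = np.sqrt(PHI * KAPPA), np.sqrt(PHI / KAPPA)
    zeta = (X_T - c) / (X_T + c)
    u = zeta * np.exp(2.0 * omega * (T - t))
    return c * (1.0 + u) / (1.0 - u)

times, x0 = rk4_backward(rhs_zero_order, [X_T])
errs = {}
for eps in (0.04, 0.02, 0.01):
    _, x_eps = rk4_backward(make_rhs_eps(eps), [X_T, X_T])
    # Sup-norm error over the time grid and both chain states.
    errs[eps] = float(np.max(np.abs(x_eps - x0)))
    print(f"eps={eps:.2f}  max error={errs[eps]:.5f}  error/eps={errs[eps] / eps:.3f}")
```

If the ratio `error/eps` stays roughly constant as `eps` shrinks, the experiment is consistent with an error of order $\varepsilon$. This is only a numerical illustration in a simplified model and does not replace the sub- and super-solution argument.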

Auxiliary Results for Deterministic Impact

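We present here the proof of the formulas presented in Section 2.2. We follow the steps of [5]. Write

$$h(t,\boldsymbol{\mu},q) = h^{(0)}(t,\boldsymbol{\mu}) + h^{(1)}(t,\boldsymbol{\mu})\,q + h^{(2)}(t,\boldsymbol{\mu})\,q^2,$$

with $h^{(0)}(T,\boldsymbol{\mu}) = h^{(1)}(T,\boldsymbol{\mu}) = 0$ and $h^{(2)}(T,\boldsymbol{\mu}) = -A$. Therefore, we find the following PDEs for $h^{(i)}$:

$$(\partial_t + \mathcal{L}^{\boldsymbol{\mu}})\,h^{(2)}(t,\boldsymbol{\mu}) - \phi + \frac{1}{4k(t)} \big( b + 2h^{(2)}(t,\boldsymbol{\mu}) \big)^2 = 0, \tag{A.1}$$

$$(\partial_t + \mathcal{L}^{\boldsymbol{\mu}})\,h^{(1)}(t,\boldsymbol{\mu}) + \boldsymbol{\gamma} \cdot \boldsymbol{\mu} + \frac{1}{2k(t)} \big( b + 2h^{(2)}(t,\boldsymbol{\mu}) \big)\,h^{(1)}(t,\boldsymbol{\mu}) = 0, \tag{A.2}$$

$$(\partial_t + \mathcal{L}^{\boldsymbol{\mu}})\,h^{(0)}(t,\boldsymbol{\mu}) + \frac{1}{4k(t)} \big( h^{(1)} \big)^2(t,\boldsymbol{\mu}) = 0. \tag{A.3}$$

To solve the first equation above, notice that we may assume $h^{(2)} = h^{(2)}(t)$ independent of $\boldsymbol{\mu}$. Define

$$\mathcal{X}(t) = \frac{b}{2} + h^{(2)}(t), \tag{A.4}$$

and notice that

$$\mathcal{X}'(t) - \phi + \frac{1}{k(t)}\,\mathcal{X}^2(t) = 0.$$

Now, $h^{(1)}$ satisfies

$$(\partial_t + \mathcal{L}^{\boldsymbol{\mu}})\,h^{(1)}(t,\boldsymbol{\mu}) + \boldsymbol{\gamma} \cdot \boldsymbol{\mu} + \frac{\mathcal{X}(t)}{k(t)}\,h^{(1)}(t,\boldsymbol{\mu}) = 0. \tag{A.5}$$

By Feynman-Kac's representation,

$$h^{(1)}(t,\boldsymbol{\mu}) = \mathbb{E}\left[ \int_t^T e^{\int_t^s \frac{\mathcal{X}(u)}{k(u)}\,du}\,(\boldsymbol{\gamma} \cdot \boldsymbol{\mu}_s)\,ds \,\Big|\, \boldsymbol{\mu}_t = \boldsymbol{\mu} \right] \tag{A.6}$$

$$= \int_t^T e^{\int_t^s \frac{\mathcal{X}(u)}{k(u)}\,du}\, \mathbb{E}\big[ \boldsymbol{\gamma} \cdot \boldsymbol{\mu}_s \,\big|\, \boldsymbol{\mu}_t = \boldsymbol{\mu} \big]\,ds, \tag{A.7}$$

$$h^{(0)}(t,\boldsymbol{\mu}) = \mathbb{E}\left[ \int_t^T \frac{1}{4k(s)} \big( h^{(1)} \big)^2(s,\boldsymbol{\mu}_s)\,ds \,\Big|\, \boldsymbol{\mu}_t = \boldsymbol{\mu} \right] \tag{A.8}$$

$$= \int_t^T \frac{1}{4k(s)}\, \mathbb{E}\Big[ \big( h^{(1)} \big)^2(s,\boldsymbol{\mu}_s) \,\Big|\, \boldsymbol{\mu}_t = \boldsymbol{\mu} \Big]\,ds. \tag{A.9}$$

References

[1] R. Almgren. Optimal trading with stochastic liquidity and volatility.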

SIAM Journal on Financial Mathematics, 3(1):163–181, 2012.
[2] R. Almgren and N. Chriss. Optimal execution of portfolio transactions. Journal of Risk, 3:5–40, 2001.
[3] W. Barger and M. Lorig. Optimal liquidation under stochastic price impact. International Journal of Theoretical and Applied Finance, 22(02):1850059, 2019.
[4] Á. Cartea, R. Donnelly, and S. Jaimungal. Algorithmic trading with model uncertainty. SIAM Journal on Financial Mathematics, 8(1):635–671, 2017.
[5] Á. Cartea and S. Jaimungal. Incorporating order-flow into optimal execution. Mathematics and Financial Economics, 10(3):339–364, 2016.
[6] Á. Cartea, S. Jaimungal, and J. Penalva. Algorithmic and high-frequency trading. Cambridge University Press, 2015.
[7] P. Cheridito and T. Sepin. Optimal trade execution under stochastic volatility and liquidity. Applied Mathematical Finance, 21(4):342–362, 2014.
[8] P. Cotton, J.-P. Fouque, G. Papanicolaou, and R. Sircar. Stochastic volatility corrections for interest rate derivatives. Mathematical Finance: An International Journal of Mathematics, Statistics and Financial Economics, 14(2):173–200, 2004.
[9] J.-P. Fouque, R. Hu, and R. Sircar. Accuracy of approximation for portfolio optimization under multiscale stochastic environment. In preparation, 2020.
[10] J.-P. Fouque, G. Papanicolaou, and K. R. Sircar. Derivatives in financial markets with stochastic volatility. Cambridge University Press, 2000.
[11] J.-P. Fouque, G. Papanicolaou, and K. R. Sircar. Mean-reverting stochastic volatility. International Journal of Theoretical and Applied Finance, 3(01):101–142, 2000.
[12] J.-P. Fouque, G. Papanicolaou, R. Sircar, and K. Sølna. Multiscale stochastic volatility asymptotics. Multiscale Modeling & Simulation, 2(1):22–42, 2003.
[13] J.-P. Fouque, G. Papanicolaou, R. Sircar, and K. Sølna. Multiscale Stochastic Volatility for Equity, Interest Rate, and Credit Derivatives. Cambridge University Press, 2011.
[14] J.-P. Fouque, Y. F. Saporito, and J. P. Zubelli. Multiscale stochastic volatility model for derivatives on futures. International Journal of Theoretical and Applied Finance, 17(07):1450043, 2014.
[15] J.-P. Fouque, R. Sircar, and T. Zariphopoulou. Portfolio optimization and stochastic volatility asymptotics. Mathematical Finance, 27(3):704–745, 2017.
[16] A. Fruth, T. Schöneborn, and M. Urusov. Optimal trade execution in order books with stochastic liquidity. Mathematical Finance, 29(2):507–541, 2019.
[17] P. Graewe, U. Horst, and E. Séré. Smooth solutions to portfolio liquidation problems under price-sensitive market impact. Stochastic Processes and their Applications, 128(3):979–1006, 2018.
[18] O. Guéant. The Financial Mathematics of Market Liquidity: From optimal execution to market making, volume 33. CRC Press, 2016.
[19] S. Hikspoors and S. Jaimungal. Asymptotic pricing of commodity derivatives using stochastic volatility spot models. Applied Mathematical Finance, 15(5-6):449–477, 2008.
[20] R. S. J.-P. Fouque and T. Zariphopoulou. Portfolio Optimization and Stochastic Volatility Asymptotics. Math. Fin., 27(3):704–745, 2017.
[21] S. Laruelle and C.-A. Lehalle. Market microstructure in practice. World Scientific, 2018.
[22] C.-A. Lehalle and E. Neuman. Incorporating signals into optimal trading. Finance and Stochastics, 23(2):275–311, 2019.
[23] S. Moazeni, T. F. Coleman, and Y. Li. Optimal execution under jump models for uncertain price impact. Journal of Computational Finance, 16(4):1–44, 2013.
[24] A. A. Obizhaeva and J. Wang. Optimal trading strategy and supply/demand dynamics. Journal of Financial Markets, 16(1):1–32, 2013.
[25] E. Papageorgiou and R. Sircar. Multiscale intensity models for single name credit derivatives. Applied Mathematical Finance, 15(1):73–105, 2008.
[26] H. Pham. Continuous-time Stochastic Control and Optimization with Financial Applications. Springer-Verlag Berlin Heidelberg, 2009.
[27] C. C. Siu, I. Guo, S.-P. Zhu, and R. J. Elliott. Optimal execution with regime-switching market resilience.