Optimal posting price of limit orders: learning by trading
Sophie Laruelle∗   Charles-Albert Lehalle†   Gilles Pagès‡

Abstract
Considering that a trader or a trading algorithm interacting with markets during continuous auctions can be modeled by an iterating procedure adjusting the price at which he posts orders at a given rhythm, this paper proposes a procedure minimizing his costs. We prove the a.s. convergence of the algorithm under assumptions on the cost function and give some practical criteria on model parameters to ensure that the conditions to use the algorithm are fulfilled (using notably the co-monotony principle). We illustrate our results with numerical experiments on both simulated data and using a financial market dataset.
Keywords
Stochastic approximation, order book, limit order, market impact, statistical learning, high-frequency optimal liquidation, compound Poisson process, co-monotony principle.

1 Introduction

In recent years, with the growth of electronic trading, most of the transactions in the markets occur in
Limit Order Books. During the matching of electronic orders, traders send orders of two kinds to the market: passive orders (i.e. limit or patient orders), which will not give birth to a trade but will stay in the order book (sell orders at a higher price than the best bid price, or buy orders at a lower price than the best ask price, are passive orders), and aggressive orders (i.e. market or impatient orders), which will generate a trade (sell orders at a lower price than the best passive buy price, or buy orders at a higher price than the lowest passive sell price). When a trader has to buy or sell a large number of shares, he cannot just send his large order at once (because it would consume all of the available liquidity in the order book, impacting the price at his disadvantage); he first has to schedule his trading rate to strike a balance between the market risk and the market impact cost of being too aggressive (too many orders exhaust the order book and make the price move). Several theoretical frameworks have been proposed for the optimal scheduling of large orders (see [3], [7], [24], [2]). Once this optimal trading rate is known, the trader has to send smaller orders in the (electronic) limit order book by alternating limit (i.e. patient or passive) orders and market (i.e. urgent or aggressive) orders. The optimal mix of limit and market orders for a trader has not been investigated in the quantitative literature, even if it has been studied from a global economic efficiency viewpoint (see for instance [9]).

∗ Laboratoire de Probabilités et Modèles aléatoires, UMR 7599, UPMC, 4, pl. Jussieu, F-75252 Paris Cedex 5, France. E-mail: [email protected]
† Head of Quantitative Research, Crédit Agricole Cheuvreux, CALYON group; 9 quai Paul Doumer, 92920 Paris La Défense. E-mail: [email protected]
‡ Laboratoire de Probabilités et Modèles aléatoires, UMR 7599, UPMC, case 188, 4, pl. Jussieu, F-75252 Paris Cedex 5, France. E-mail: [email protected]

It has not either been addressed quantitatively (see e.g.
[1] or [11] for such models of limit order books).

Optimal submission strategies have been studied in the microstructure literature using a utility framework and optimal control (see [4], [10], [5], [12] and [13]). The authors of such papers consider an agent who plays the role of a market maker, i.e. he provides liquidity on the exchange by quoting bid and ask prices at which he is willing to buy and sell a specific quantity of assets. Strategies for bid and ask orders are derived by maximizing his utility function.

Our approach is different: we consider an agent who wants to buy (or sell) a quantity $Q_T$ of a traded asset during a short period $[0,T]$, and we look for the optimal distance at which he has to post his order to minimize the execution cost. We are typically at a smaller time scale than in usual optimal liquidation frameworks. In fact, order posting strategies derived from the viewpoint presented below can be "plugged" into any larger scale strategy. We model the market impact of an aggressive order using a penalization function $\kappa\cdot\Phi(Q)$, where $Q$ is the size of the market order.

While a stochastic algorithm approach has already been proposed in [20] by the authors for the optimal spatial split of orders across different Dark Pools, here the purpose is not to control fractions of the size of orders, but to adjust successive posting prices to converge to an optimal price. Qualitatively, this framework can be used as soon as a trader wants to trade a given quantity $Q_T$ over a given time interval $[0,T]$ with no firm constraint on his trading rate between $0$ and $T$. It is typically the case for small Implementation Shortfall benchmarked orders. The trader can post his order very close to the "fair price" $(S_t)_{t\in[0,T]}$ (which can be seen as the fundamental price, the mid price of the available trading venues or any other reference price). In this case he will be exposed to the risk of trading too fast at a "bad price" and being adversely selected.
Conversely, he can post it far away from the fair price; in that case he will be exposed to never obtaining a transaction for the whole quantity $Q_T$, but only for a part of it (say the positive part of $Q_T - N_T$, where $N_T$ is the quantity that the trading flow allowed him to trade). He will then have to consume liquidity aggressively with the remaining quantity, disturbing the market and paying not only the current market price $S_T$, but also a market impact (say $S_T\,\Phi(Q_T - N_T)$, where $\Phi$ is a market impact penalization function).

The approach presented here follows the mechanism of a "learning trader". He will try to guess the optimal posting distance to the fair price, achieving the balance between being too demanding in price and too impatient, by successive trials, errors and corrections. The optimal recursive procedure derived from our framework gives the best price adjustment to apply to an order at given stopping times (reassessment dates), given the observed past of the market. We provide proofs of the convergence of the procedure and of its optimality.

To this end, we model the execution process of orders by a Poisson process $(N^{(\delta)}_t)_{0\le t\le T}$ whose intensity $\Lambda_T(\delta,S)$ depends on the fair price $(S_t)_{t\ge 0}$ and on the distance of order submission $\delta$. The execution cost results from the sum of the price of the executed quantity and a penalization function depending on the remaining quantity to be executed at the end of the period $[0,T]$. This penalty $\kappa\cdot\Phi(Q)$ models the over-cost induced by crossing the spread and the resulting market impact of this execution. The aim is to find the optimal distance $\delta^*\in[0,\delta_{\max}]$, where $\delta_{\max}$ is the depth of the limit order book, which minimizes the execution cost. In practice, the prices are constrained to lie on a "tick size grid" (see [25]), instead of being on the real line.
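The execution mechanism just described can be simulated by time-changing a unit-rate Poisson process with the cumulated intensity, a standard construction made explicit in the next section. Everything in the following sketch, the exponential intensity shape $\lambda(x)=A e^{-kx}$, the Brownian fair price and all numerical values, is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

T, m = 10.0, 1000          # horizon (seconds) and number of time steps, illustrative
dt = T / m
S0 = 100.0

# one "fair price" path, here a Brownian motion (illustrative modeling choice)
S = S0 + np.cumsum(0.01 * np.sqrt(dt) * rng.standard_normal(m))

def cum_intensity(delta, A=2.0, k=20.0):
    """Lambda_T(delta, S) for lambda(x) = A exp(-k x), evaluated on the path S."""
    return float(np.sum(A * np.exp(-k * (S - (S0 - delta)))) * dt)

# jump times of one unit-rate Poisson process, shared across all values of delta
jump_times = np.cumsum(rng.exponential(1.0, size=5000))

def executed_quantity(delta):
    """N_T^(delta): number of unit-Poisson jumps occurring before Lambda_T(delta, S)."""
    return int(np.searchsorted(jump_times, cum_intensity(delta)))

# posting further away from the fair price can only reduce the executed quantity
for d1, d2 in [(0.0, 0.05), (0.05, 0.10), (0.10, 0.20)]:
    assert executed_quantity(d2) <= executed_quantity(d1)
```

Sharing one unit-rate Poisson process across all distances reproduces, pathwise, the fact that the execution flow obtained by the trader is non-increasing in the posting distance $\delta$.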
We will follow the approach of papers on market making [14, 4, 10], assuming that the tick size is small enough not to change the dynamics of the price and of the bid-ask spread [22]. Nevertheless, a "rounding effect" would not change the nature of our results, since any projection (e.g. to the nearest neighbor) is a non-decreasing transform and the co-monotony principle still holds (by slightly adapting the proofs). This leads to an optimization problem under constraints, which we solve by using a recursive stochastic procedure with projection (this particular class of algorithms is studied in [18] and [19]). We prove the a.s. convergence of the constrained algorithm under additional assumptions on the execution cost function. From a practical point of view, it is not easy to check the conditions on the cost function, so we give criteria on the model parameters which ensure the viability of the algorithm; these rely on the co-monotony principle assumed to be satisfied by the "fair price" process $(S_t)_{t\in[0,T]}$. This principle is detailed in Appendix Section B. We conclude this paper with some numerical experiments on simulated and real data: we consider the Poisson intensity presented in [4], use a Brownian motion to model the fair price dynamics, plot the cost function and its derivative, and show the convergence of the algorithm to its target $\delta^*$.

The paper is organized as follows: in Section 2, we first propose a model for the execution process of posted orders, then we define a penalized execution cost function (including the market impact at the terminal execution date). We then devise the stochastic recursive procedure under constraint to solve the resulting optimization problem in terms of the optimal posting distance on the limit order book. We state the main convergence result and provide operating criteria that ensure this convergence, based on a co-monotony principle for one-dimensional diffusions.
Section 3 establishes the representations as expectations of the cost function and its derivatives, which allow us to define the mean function of the algorithm. Section 4 presents the convergence criteria (which ensure that the optimization is well-posed) derived from the principle of co-monotony established in Section B of the appendix. Finally, Section 5 illustrates with numerical experiments the convergence of the recursive procedure towards its target.

Notations.
• $(x)_+ = \max\{x, 0\}$ denotes the positive part of $x$, $\lfloor x\rfloor = \max\{k\in\mathbb{N} : k\le x\}$, $[\![0,x]\!] := \{y\in\mathbb{R}_+^d : 0\le y\le x\} = \prod_{i=1}^d [0,x_i]$, $\overline{\mathbb{N}} = \mathbb{N}\cup\{\infty\}$.
• $\langle\,\cdot\,|\,\cdot\,\rangle$ denotes the canonical inner product on $\mathbb{R}^d$.
• $\overset{(\mathbb{R}^d)}{\Longrightarrow}$ denotes weak convergence on $\mathbb{R}^d$ and $\overset{\mathcal{L}}{\longrightarrow}$ denotes convergence in distribution.
• $\mathcal{C}([0,T],A) := \{f : [0,T]\to A \text{ continuous}\}$ (equipped with the sup-norm topology) and $\mathbb{D}([0,T],A) := \{f : [0,T]\to A \text{ càdlàg}\}$ (equipped with the Skorokhod topology when necessary, see [15]), where càdlàg means right continuous with left limits and $A = \mathbb{R}^q$, $\mathbb{R}^q_+$, etc. They are equipped with the standard Borel $\sigma$-field $\sigma(\alpha\mapsto\alpha(t),\ t\in[0,T])$.
• $\mathbb{P}\text{-esssup}\, f = \inf\{a\in\mathbb{R} : \mathbb{P}(\{x : f(x) > a\}) = 0\}$, $\|\alpha\|_\infty = \sup_{t\in[0,T]}|\alpha(t)|$ for $\alpha\in\mathbb{D}([0,T],\mathbb{R})$, and $f'_\ell$ denotes the left derivative of $f$.

We focus our work on the problem of optimal trading with limit orders on one security, without needing to model the limit order book dynamics. We only model the execution flow which reaches the price where the limit order is posted, with a general price dynamics $(S_t)_{t\in[0,T]}$, since we intend to use real data.
However, there will be two frameworks for the price dynamics: either $(S_t)_{t\in[0,T]}$ is a process bounded by a constant $L$ (which is obviously an unusual assumption, but not unrealistic on a short time scale, see Section 2.2.2), or $(S_t)_{t\in[0,T]}$ is ruled by a Brownian diffusion model (see Section 2.2.1).

We consider, on a short period $T$, say a dozen seconds, a Poisson process $\big(N^{(\delta)}_t\big)_{0\le t\le T}$ modeling the execution of posted passive buy orders on the market, with intensity
$$\Lambda_T(\delta, S) := \int_0^T \lambda\big(S_t - (S_0 - \delta)\big)\, dt, \qquad (2.1)$$
where $0\le\delta\le\delta_{\max}$, with $\delta_{\max}$ the depth of the order book, $\delta_{\max}\in(0, S_0)$, and $(S_t)_{t\ge 0}$ is a stochastic process modeling the dynamics of the "fair price" of a security stock (from an economic point of view). In practice one may consider that $S_t$ represents the best opposite price at time $t$.

One way to build $N^{(\delta)}$ is to set
$$N^{(\delta)}_t = \widetilde{N}_{\int_0^t \lambda(S_u - (S_0 - \delta))\, du},$$
where $\widetilde{N}$ is a Poisson process with intensity 1, independent of the price $(S_t)_{t\in[0,T]}$. This representation underlines the fact that, for one given trajectory of the price $S$, the intensity of the point process $N^{(\delta)}$ is decreasing with $\delta$: in fact the above representation for $N^{(\delta)}$ is even pathwise consistent, in the sense that if $0\le\delta<\delta'$, then
$$\mathbb{P}\text{-a.s.}\qquad \forall\, t\in[0,T],\quad N^{(\delta')}_t \le N^{(\delta)}_t.$$
The natural question of how to account for the very real possibility of simultaneously placing limit orders at different prices may be much harder to handle and is not studied in this paper. In particular, due to interacting impact features, it would need a more sophisticated approach than simply considering $(N^{\delta(k)})_{1\le k\le K}$, processes as above with $\delta(1) < \delta(2) < \dots$
$< \delta(K)$ (with the same $\widetilde N$).

We assume that the function $\lambda$ is defined on $[-S_0, +\infty)$ as a finite, non-increasing, convex function. Its specification will rely on parametric or non-parametric statistical estimation based on formerly obtained transactions (see Figure 1 below and Section 5). At time $t = 0$, buy orders are posted in the limit order book at price $S_0 - \delta$. Between $t$ and $t + \Delta t$, the probability for such an order to be executed is $\lambda(S_t - (S_0 - \delta))\,\Delta t$, where $S_t - (S_0 - \delta)$ is the distance of our posted order to the current fair price at time $t$. The further away the order is at time $t$, the lower the probability for it to be executed, since $\lambda$ is non-increasing on $[-S_0, +\infty)$. Empirical tests strongly confirm this kind of relationship with a convex function $\lambda$ (even close to an exponential shape, see Figure 1).

Figure 1: Empirical probabilities of execution (blue stars) and their fit with an exponential law (red dotted line) with respect to the distance to the "fair price".

Over the period $[0,T]$, we aim at executing a portfolio of size $Q_T\in\mathbb{N}$ invested in the asset $S$. The execution cost for a distance $\delta$ is $\mathbb{E}\big[(S_0-\delta)\big(Q_T\wedge N^{(\delta)}_T\big)\big]$. We add to this execution cost a penalization depending on the remaining quantity to be executed: at the end of the period $T$, we want to have $Q_T$ assets in the portfolio, so we buy the remaining quantity $\big(Q_T - N^{(\delta)}_T\big)_+$ at price $S_T$.

At this stage, we introduce a market impact penalization function $\Phi : \mathbb{R}\to\mathbb{R}_+$, non-decreasing and convex, with $\Phi(0) = 0$, to model the additional cost of the execution of the remaining quantity (including the market impact). Then the resulting cost of execution on a period $[0,T]$ reads
$$C(\delta) := \mathbb{E}\Big[(S_0-\delta)\big(Q_T\wedge N^{(\delta)}_T\big) + \kappa\, S_T\, \Phi\Big(\big(Q_T - N^{(\delta)}_T\big)_+\Big)\Big], \qquad (2.2)$$
where $\kappa > 0$. When $\kappa = 1$ and $\Phi = \mathrm{id}$, we just consider that we buy the remaining quantity at the end price $S_T$.
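The cost (2.2) can be estimated by brute-force Monte Carlo, which is a useful sanity check for the recursive procedure developed below. The Brownian fair price, the exponential intensity and every numerical value are illustrative assumptions, and $\Phi$ is taken as the identity:

```python
import numpy as np

def cost_mc(delta, n_paths=2000, T=10.0, m=100, S0=100.0, sigma=0.01,
            A=2.0, k=20.0, QT=5, kappa=1.0, phi=lambda x: x, seed=1):
    """Monte Carlo estimate of
    C(delta) = E[(S0 - delta)(QT ^ N_T) + kappa * S_T * phi((QT - N_T)_+)],
    with lambda(x) = A exp(-k x) and a Brownian "fair price" (both assumed)."""
    rng = np.random.default_rng(seed)
    dt = T / m
    total = 0.0
    for _ in range(n_paths):
        # one fair-price path and its cumulated execution intensity Lambda_T
        S = S0 + np.cumsum(sigma * np.sqrt(dt) * rng.standard_normal(m))
        lam = np.sum(A * np.exp(-k * (S - (S0 - delta)))) * dt
        NT = rng.poisson(lam)          # conditionally on S, N_T is Poisson(Lambda_T)
        executed = (S0 - delta) * min(QT, NT)
        penalty = kappa * S[-1] * phi(max(QT - NT, 0))
        total += executed + penalty
    return total / n_paths
```

Scanning `cost_mc` over a grid of $\delta$ values gives a rough picture of the cost profile whose minimizer $\delta^*$ the stochastic algorithm of the next subsections tracks without ever computing $C$ itself.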
Introducing a market impact penalization function $\Phi(x) = (1+\eta(x))\,x$, where $\eta\ge 0$, models the market impact induced by the execution of $\big(Q_T - N^{(\delta)}_T\big)_+$ at time $T$, whereas we neglect the market impact of the execution process via limit orders over $[0,T)$. Our aim is then to minimize this cost by choosing the distance at which to post, namely to solve the following optimization problem:
$$\min_{0\le\delta\le\delta_{\max}} C(\delta). \qquad (2.3)$$
Our strategy to solve (2.3) numerically, using a large enough dataset, is to take advantage of the representation of $C$ and its first two derivatives as expectations to devise a recursive stochastic algorithm, namely a stochastic gradient procedure, to find the minimum of the (penalized) cost function (see below). Furthermore, we will show that, under natural assumptions on the quantity $Q_T$ to be executed and on the parameter $\kappa$, the function $C$ is twice differentiable and strictly convex on $[0,\delta_{\max}]$, with $C'(0) <$
$0$. Consequently,
$$\operatorname{argmin}_{\delta\in[0,\delta_{\max}]} C(\delta) = \{\delta^*\},\qquad \delta^*\in(0,\delta_{\max}],$$
and $\delta^* = \delta_{\max}$ iff $C$ is non-increasing on $[0,\delta_{\max}]$. Criteria involving $\kappa$, based on both the risky asset $S$ and the trading process, especially the execution intensity $\lambda$, are established further on in Proposition 4.1 and Proposition 4.2. We specify representations as expectations of the function $C$ and its derivatives $C'$ and $C''$. In particular, we will exhibit a Borel functional
$$H : [0,\delta_{\max}]\times\mathbb{D}([0,T],\mathbb{R}) \longrightarrow \mathbb{R}$$
such that
$$\forall\,\delta\in[0,\delta_{\max}],\qquad C'(\delta) = \mathbb{E}\big[H\big(\delta, (S_t)_{t\in[0,T]}\big)\big].$$
The functional $H$ has an explicit form, given in Proposition 3.2, Equations (3.14) or (3.16), involving integrals over $[0,T]$ of the intensity $\lambda(S_t - S_0 + \delta)$ of the Poisson process $(N^{(\delta)}_t)_{t\in[0,T]}$. In particular, any quantity $H\big(\delta, (S_t)_{t\in[0,T]}\big)$ can be simulated, up to a natural time discretization, either from a true dataset (of past executed orders) or from the stepwise constant discretization scheme of a formerly calibrated diffusion process modeling $(S_t)_{t\in[0,T]}$ (see below). For practical implementations, this will lead us to replace the continuous time process $(S_t)_{t\in[0,T]}$ over $[0,T]$ either by a discrete time sample, i.e. a finite dimensional $\mathbb{R}^{m+1}$-valued random vector $(S_{t_i})_{0\le i\le m}$ (where $t_0 = 0$ and $t_m = T$), or by a time discretization scheme with step $\frac{T}{m}$ (typically the Euler scheme when $(S_t)_{t\in[0,T]}$ is a diffusion).

A theoretical stochastic learning procedure:
Based on the representation (3.14) of $C'$, we can formally devise a recursive stochastic gradient descent a.s. converging toward $\delta^*$. However, to make it consistent, we need to introduce a constraint so that it lives in $[0,\delta_{\max}]$. In the classical literature on Stochastic Approximation Theory (see [18] and [19]), this amounts to considering a variant with projection on the "order book depth interval" $[0,\delta_{\max}]$, namely
$$\delta_{n+1} = \operatorname{Proj}_{[0,\delta_{\max}]}\Big(\delta_n - \gamma_{n+1}\, H\Big(\delta_n, \big(S^{(n+1)}_t\big)_{t\in[0,T]}\Big)\Big),\quad n\ge 0,\ \delta_0\in(0,\delta_{\max}), \qquad (2.4)$$
where
• $\operatorname{Proj}_{[0,\delta_{\max}]}$ denotes the projection on the (nonempty closed convex) interval $[0,\delta_{\max}]$;
• $(\gamma_n)_{n\ge 1}$ is a positive step sequence satisfying (at least) the minimal decreasing step assumption $\sum_{n\ge 1}\gamma_n = +\infty$ and $\gamma_n\to 0$;
• the sequence $\big\{(S^{(n)}_t)_{t\in[0,T]},\ n\ge 1\big\}$ is the "innovation" sequence of the procedure: ideally, it is either a sequence of simulable independent copies of $(S_t)_{t\in[0,T]}$ or a sequence sharing some ergodic (or averaging) properties with respect to the distribution of $(S_t)_{t\in[0,T]}$.
The case of independent copies can be understood as a framework where the dynamics of $S$ is typically a Brownian diffusion, solution to a stochastic differential equation, which has been calibrated beforehand on a dataset in order to be simulated on a computer. The case of ergodic copies corresponds to a dataset which is directly plugged into the procedure, i.e. $S^{(n)}_t = S_{t - n\Delta t}$, $t\in[0,T]$, $n\ge$
$0$, where $\Delta t > 0$ and $S$ (starting in the past) is stationary and shares e.g. mixing properties.

The resulting implementable procedure:
In practice, the above procedure cannot be implemented since neither the full path $(S_t(\omega))_{t\in[0,T]}$ of a continuous process can be simulated, nor can a functional $H(\delta, (S_t(\omega))_{t\in[0,T]})$ of such a path be computed. So in practice we are led to replace the "copies" $S^{(n)}$ by copies of a time discretization with step, say, $\Delta t = \frac{T}{m}$ ($m\in\mathbb{N}^*$). The time discretizations are formally defined in continuous time as follows:
$$\bar S_t = \bar S_{t_i},\quad t\in[t_i, t_{i+1}),\ i = 0,\dots,m-1,\qquad t_i = \frac{iT}{m},\ i = 0,\dots,m,$$
where $(\bar S_{t_i})_{0\le i\le m} = (S_{t_i})_{0\le i\le m}$ if $(S_{t_i})_{0\le i\le m}$ can be simulated (see e.g. [6] for 1D Brownian diffusion processes). The sequence $(\bar S_{t_i})_{0\le i\le m}$ can also be a time discretization scheme (at times $t_i$) of $(S_t)_{t\in[0,T]}$, typically an Euler scheme with step $\frac{T}{m}$.

Then, with an obvious abuse of notation for the function $H$, we can write the implementable procedure as follows:
$$\delta_{n+1} = \operatorname{Proj}_{[0,\delta_{\max}]}\Big(\delta_n - \gamma_{n+1}\, H\Big(\delta_n, \big(\bar S^{(n+1)}_{t_i}\big)_{0\le i\le m}\Big)\Big),\quad n\ge 0,\ \delta_0\in[0,\delta_{\max}], \qquad (2.5)$$
where $\big(\bar S^{(n)}_{t_i}\big)_{0\le i\le m}$ are copies of $\big(\bar S_{t_i}\big)_{0\le i\le m}$, either independent or sharing "ergodic" properties, namely some averaging properties in the sense of [21]. In the first case, one should think of simulated data after a calibration process, and in the second case of a direct implementation of a historical high-frequency database of best opposite prices of the asset $S$ (with e.g. $\bar S^{(n)}_{t_i} = S_{t_i - n\frac{T}{m}}$).

The following theorems give a.s. convergence results for the stochastic procedure (2.4): the first one for i.i.d. sequences and the second one for "averaging" sequences (see [21]).

2.2.1 I.i.d. simulated data from a formerly calibrated model
In this section, we consider that the innovation process $\big\{\big(\bar S^{(n)}_{t_i}\big)_{0\le i\le m},\ n\ge 1\big\}$ comes from a diffusion model beforehand calibrated on real data, which can be simulated at times $t_i$, $0\le i\le m$, either exactly or via a stepwise constant time discretization scheme.

Theorem 2.1. (a) Theoretical procedure.
Assume that $C$ is strictly convex on $[0,\delta_{\max}]$ with $C'(0) < 0$. Let $\big(S^{(n)}_t\big)_{t\in[0,T]}$, $n\ge 1$, be a sequence of i.i.d. copies of $(S_t)_{t\in[0,T]}$. Furthermore, assume that the step sequence satisfies the standard "decreasing step assumption"
$$\sum_{n\ge 1}\gamma_n = +\infty \quad\text{and}\quad \sum_{n\ge 1}\gamma_n^2 < +\infty. \qquad (2.6)$$
Then the recursive procedure defined by (2.4) converges a.s. towards its target $\delta^* = \operatorname{argmin}_{\delta\in[0,\delta_{\max}]} C(\delta)$:
$$\delta_n \xrightarrow{\ \text{a.s.}\ } \delta^*.$$
(b) Implementable procedure.
Assume the cost function $\bar C$ related to the discretization scheme $(\bar S_t)_{t\in[0,T]}$ is strictly convex on $[0,\delta_{\max}]$ with $\bar C'(0) < 0$, and that the step sequence satisfies the "decreasing step" assumption (2.6). Let $\big(\bar S^{(n)}_{t_i}\big)_{0\le i\le m}$, $n\ge 1$, be a sequence of i.i.d. copies of $\big(\bar S_{t_i}\big)_{0\le i\le m}$. Then the recursive procedure defined by (2.5) converges a.s. towards its target $\bar\delta^* = \operatorname{argmin}_{\delta\in[0,\delta_{\max}]}\bar C(\delta)$.

This theorem is a straightforward application of the classical a.s. convergence theorem for constrained stochastic algorithms (see Appendix A). In particular, the fact that in the original theorem the innovation process takes values in a finite dimensional space $\mathbb{R}^q$ plays no role in the proof.

In this framework we will focus on the time discretized procedure, i.e. on $(\bar S_t)_{t\in[0,T]}$ rather than on $(S_t)_{t\in[0,T]}$ itself. Keep in mind that, when directly implementing a high-frequency dataset,
$$\bar S_t = S_{t_i},\quad t\in[t_i,t_{i+1}),\ i = 0,\dots,m-1,\qquad \bar S_T = S_T,$$
and that the sequence $(\bar S^{(n)}_{t_i})_{0\le i\le m}$, $n\ge$
$1$, is usually obtained by shifting the data as follows: if $t_i - t_{i-1} = \Delta t = \frac{T}{m}$, we set
$$\forall\, 0\le i\le m,\qquad \bar S^{(n)}_{t_i} = \bar S_{t_i - n\Delta t} = \bar S_{t_{i-n}}.$$
We will assume that the sequence $(\bar S^{(n)}_{t_i})_{0\le i\le m}$ shares an averaging property with respect to a distribution $\nu$, as developed in [21]. The definition is recalled below.

Definition 2.1.
Let $m\in\mathbb{N}$ and let $\nu$ be a probability measure on $([0,L]^{m+1}, \mathcal{B}or([0,L]^{m+1}))$. A $[0,L]^{m+1}$-valued sequence $(\xi_n)_{n\ge 1}$ is $\nu$-averaging if
$$\frac{1}{n}\sum_{k=1}^n \delta_{\xi_k} \overset{(\mathbb{R}^{m+1})}{\Longrightarrow} \nu \quad\text{as } n\to\infty.$$
Then $(\xi_n)_{n\ge 1}$ satisfies
$$D_n^*(\xi) := \sup_{x\in[0,L]^{m+1}} \Big|\frac{1}{n}\sum_{k=1}^n \mathbf{1}_{[\![0,x]\!]}(\xi_k) - \nu([\![0,x]\!])\Big| \underset{n\to\infty}{\longrightarrow} 0,$$
where $D_n^*(\xi)$ is called the discrepancy at the origin or star discrepancy.

The resulting execution cost function $\bar C$ is defined by (2.2), where $S$ is replaced by $(\bar S_t)_{t\in[0,T]}$, whose distribution is entirely characterized by the distribution $\nu$. In some sense, this function $\bar C$ is the best possible approximation of the true execution cost function $C$ that we can get from the high-frequency database.

In this setting, we apply the previous results to the price sequence $\big\{\big(\bar S^{(n)}_{t_i}\big)_{0\le i\le m},\ n\ge 1\big\}$, i.e. we set for every $n\ge 1$, $\xi_n = \big(\bar S^{(n)}_{t_i}\big)_{0\le i\le m}$. In particular, we will make the assumption that the dataset is bounded by a real number $L\in(0,+\infty)$, so that $\xi_n\in[0,L]^{m+1}$ for every $n\ge 1$. We will also need $H$ to satisfy a pathwise Lyapunov assumption, which means in this one-dimensional setting that $H(\cdot, (\bar s_{t_i})_{0\le i\le m})$ is non-decreasing for every $(\bar s_{t_i})_{0\le i\le m}\in\mathbb{R}_+^{m+1}$.

Theorem 2.2.
Implementable procedure. Let $\lambda(x) = A e^{-kx}$, $A > 0$, $k > 0$. Assume that $\big(\bar S^{(n)}\big)_{n\ge 1}$ is a $[0,L]^{m+1}$-valued $\nu$-averaging sequence, where $\nu$ is a probability measure on $(\mathbb{R}^{m+1}, \mathcal{B}or(\mathbb{R}^{m+1}))$. Assume that the execution cost function $\bar C$ is strictly convex over $[0,\delta_{\max}]$ with $\bar C'(0) < 0$ and $\bar C'(\delta_{\max}) > 0$. Finally, assume that the step sequence $(\gamma_n)_{n\ge 1}$ is a positive non-increasing sequence satisfying
$$\sum_{n\ge 1}\gamma_n = +\infty,\qquad n D_n^*(\bar S)\,\gamma_n \underset{n\to\infty}{\longrightarrow} 0,\qquad \sum_{n\ge 1} n D_n^*(\bar S)\max\big(\gamma_n^2, |\Delta\gamma_{n+1}|\big) < +\infty. \qquad (2.7)$$
Furthermore (having in mind that $\bar S_0 = S_0$), assume that
$$Q_T \ge T\lambda(-\bar S_0) \quad\text{and}\quad \kappa \le \frac{k(\bar S_0 - \delta_{\max}) + 1}{k\,\|\bar S\|_\infty\,\big(\Phi(Q_T) - \Phi(Q_T - 1)\big)}. \qquad (2.8)$$
Then the recursive procedure defined by (2.5) converges a.s. towards its target $\bar\delta^* = \operatorname{argmin}_{\delta\in[0,\delta_{\max}]}\bar C(\delta)$:
$$\delta_n \xrightarrow{\ \text{a.s.}\ } \bar\delta^*.$$
Proof.
We apply Theorem 2.1, Section 2 of [21], in a QMC framework similar to Appendix A. First, we check that $H(\bar\delta^*,\cdot)$ is integrable. Note that $\bar\delta^*\in(0,\delta_{\max})$ since $\bar C'(0) < 0$ and $\bar C'(\delta_{\max}) > 0$, $\bar C$ being extendable as a convex function on the whole real line. Moreover, by using the proof of Proposition 4.1(b), we show that if $Q_T\ge T\lambda(-\bar S_0)$ and (2.8) is satisfied, then $H$ is non-decreasing in $\delta$, so that $H$ satisfies the strict pathwise Lyapunov assumption with $L(\delta) = \big|\delta - \bar\delta^*\big|$, namely
$$\forall\,\delta\in\mathbb{R}\setminus\{\bar\delta^*\},\ \forall\, y\in\mathbb{R}^{m+1},\qquad \big\langle H(\delta, y) - H(\bar\delta^*, y)\ \big|\ \delta - \bar\delta^*\big\rangle > 0.$$
It remains to check the averaging rate assumption for $H(\bar\delta^*,\cdot)$: as $(s_i)_{0\le i\le m}\mapsto H(\bar\delta^*, (s_i)_{0\le i\le m})$ is a non-decreasing function (for $(s_i)_{0\le i\le m}\le (s'_i)_{0\le i\le m}$, i.e. $\forall\, 0\le i\le m$, $s_i\le s'_i$, i.e. $[\![0,s]\!]\subset[\![0,s']\!]$), $H(\bar\delta^*,\cdot)$ has finite variation, and by using the Koksma-Hlawka inequality we get
$$\Big|\frac{1}{n}\sum_{k=1}^n H(\bar\delta^*, \bar S^{(k)}) - \int_{[0,L]^{m+1}} H(\bar\delta^*, s)\,\nu(ds)\Big| \le \big(H(\bar\delta^*, L\mathbf{1}) - H(\bar\delta^*, 0)\big)\, D_n^*(\bar S),$$
so that $H(\bar\delta^*,\cdot)$ is $\nu$-averaging at rate $\varepsilon_n = D_n^*(\bar S)$. Finally, Theorem 2.1 of Section 2 from [21] yields $\delta_n \xrightarrow{\ \text{a.s.}\ } \bar\delta^*$. $\square$

Practical comments on the needed bounds.
• The constraint $Q_T\ge T\lambda(-\bar S_0)$ is structural: it only involves parameters of the model and the asked quantity $Q_T$. It means that $Q_T$ does have some chance not to be fully executed before the end of a slice of duration $T$ (i.e. the intensity of trades obtained very far away from the current price is smaller than $Q_T/T$).
• The criterion involving the free parameter $\kappa$ is two-fold, depending on the modeling of the "market impact".
– The market impact does not depend on the remaining quantity to be traded (i.e. when $\Phi = \mathrm{id}$ or $\eta\equiv 0$, so that $\Phi(Q_T) - \Phi(Q_T - $
$1) = 1$). This setting is appropriate for executing very small quantities or trading very liquid assets (like equity futures). Then the criterion on the free parameter $\kappa$ reads
$$\kappa \le \frac{\bar S_0 - \delta_{\max}}{\|\bar S\|_\infty} + \frac{1}{k\,\|\bar S\|_\infty}.$$
It states that, in this case, the constant premium to pay for the remaining quantity (i.e. $\kappa$, in basis points) has to be lower than the price range inside which we wish to trade (i.e. $(\bar S_0 - \delta_{\max})/\|\bar S\|_\infty$) plus a margin, namely $1/(k\|\bar S\|_\infty)$. It can be seen as a symmetry argument: in a model where one cannot imagine buying at a lower price than a given threshold, one cannot either accept to pay (on the other side) more market impact than this very threshold.
– The market impact of the remaining quantity is a function of the quantity. The interpretation is very similar to the previous one. In this case, a quantity homogeneous to the market impact (i.e. $\kappa\cdot(\Phi(Q_T) - \Phi(Q_T - 1))$) has to be lower than $(\bar S_0 - \delta_{\max})/\|\bar S\|_\infty + 1/(k\|\bar S\|_\infty)$. Here again it is a symmetry argument: the trader cannot consider paying more market impact for almost one share (i.e. $\Phi(Q_T) - \Phi(Q_T - $
$1$) plays the role of $\Phi(1)$ but "taken around $Q_T$") than his reasonable trading range ($(\bar S_0 - \delta_{\max})/\|\bar S\|_\infty$, in basis points again), plus a margin. Looking more carefully at this margin, $1/(k\|\bar S\|_\infty)$ (the same as in the constant market impact case): it is in basis points, and it means that if one considers large intensities of fill rates (i.e. $k$ is small), then the trader can be lazy on the market impact constraint (because his margin, proportional to $1/k$, is large in such a case), mainly because he will not have to pay market impact that often. If, on the contrary, $k$ is large (i.e. he will have remaining quantities), then he really needs to fulfill his constraint on market impact.

The needed bounds to obtain the convergence of $\delta_n$ toward $\delta^*$ are thus not only fulfilled naturally; they also emphasize the consistency of the model and the mechanisms of the proofs.

In this section, we look for simple criteria, involving the parameter $\kappa$, that imply the requested assumptions on the execution cost function $C$ (or $\bar C$). One important feature of this section is that we will never need to really specify the process $S$. For notational convenience we will drop the $\bar C$ notation. Checking that the assumptions on the function $C$ (i.e. $C$ convex with $C'(0) < $
$0$) in Theorem 2.1 are satisfied on $[0,\delta_{\max}]$ is a nontrivial task: in fact, as emphasized further on in Figures 2 and 7 in Section 5, the function $C$ in (2.2) is never convex on the whole non-negative real line, so we need reasonably simple criteria, involving the market impact function $\Phi$, the quantity $Q_T$, the parameter $\kappa$ and other quantities related to the asset dynamics, which ensure that the required conditions are fulfilled by the function $C$. These criteria should take the form of upper bounds on the free parameter $\kappa$. Their original form, typically those derived by simply writing $C'(0) < 0$ and $C''(0)\ge$
$0$, are not really operating, since they involve ratios of expectations of functionals combining both the dynamics of the asset $S$ and the execution parameters in a highly nonlinear way. A large part of this paper is devoted to establishing simpler criteria (although slightly more conservative) when $(S_t)_{t\in[0,T]}$ is a continuous process satisfying a functional co-monotony principle.

We still need an additional assumption, this time on the function $\lambda$. Roughly speaking, we need that the functional $\Lambda$ depends on the distance parameter $\delta$ essentially exponentially, in the following sense (which depends on $S$):
$$0 < \underline{k} := \mathbb{P}\text{-}\operatorname{essinf}_{\delta\in[0,\delta_{\max}]}\Bigg(-\frac{\frac{\partial}{\partial\delta}\Lambda_T(\delta,S)}{\Lambda_T(\delta,S)}\Bigg) \le \overline{k} := \mathbb{P}\text{-}\operatorname{esssup}_{\delta\in[0,\delta_{\max}]}\Bigg(-\frac{\frac{\partial}{\partial\delta}\Lambda_T(\delta,S)}{\Lambda_T(\delta,S)}\Bigg) < +\infty, \qquad (2.9)$$
$$0 < \underline{k}' := \mathbb{P}\text{-}\operatorname{essinf}_{\delta\in[0,\delta_{\max}]}\Bigg(-\frac{\frac{\partial^2}{\partial\delta^2}\Lambda_T(\delta,S)}{\frac{\partial}{\partial\delta}\Lambda_T(\delta,S)}\Bigg) \le \overline{k}' := \mathbb{P}\text{-}\operatorname{esssup}_{\delta\in[0,\delta_{\max}]}\Bigg(-\frac{\frac{\partial^2}{\partial\delta^2}\Lambda_T(\delta,S)}{\frac{\partial}{\partial\delta}\Lambda_T(\delta,S)}\Bigg) < +\infty. \qquad (2.10)$$
Note that the above assumption implies
$$\overline{k}_0 := \mathbb{P}\text{-esssup}\Bigg(-\frac{\frac{\partial}{\partial\delta}\Lambda_T(0,S)}{\Lambda_T(0,S)}\Bigg) \ge \underline{k}_0 := \mathbb{P}\text{-essinf}\Bigg(-\frac{\frac{\partial}{\partial\delta}\Lambda_T(0,S)}{\Lambda_T(0,S)}\Bigg) \ge \underline{k} > 0. \qquad (2.11)$$
Although this assumption is stated on the functional $\Lambda$ (and subsequently depends on $S$), it is mainly an assumption on the intensity function $\lambda$. In particular, the above assumptions are satisfied by intensity functions $\lambda$ of the form
$$\lambda_k(x) = e^{-kx},\qquad x\in\mathbb{R},\ k\in(0,+\infty).$$
For these functions $\lambda_k$, one checks that $\underline{k} = \overline{k} = \underline{k}' = \overline{k}' = \underline{k}_0 = \overline{k}_0 = k$.

The key to establishing the criteria is the functional co-monotony principle that we establish in Appendix B for a wide class of diffusions and their associated time discretization schemes. A Borel functional $F : \mathbb{D}([0,T],\mathbb{R})\to\mathbb{R}$ is non-decreasing if
$$\forall\,\alpha,\beta\in\mathbb{D}([0,T],\mathbb{R}),\qquad \big(\forall\, t\in[0,T],\ \alpha(t)\le\beta(t)\big) \Longrightarrow F(\alpha)\le F(\beta).$$
It is monotonic if $F$ or $-F$ is non-decreasing.
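For the exponential intensities $\lambda_k$ above, the claim that all six constants coincide with $k$ can be checked in one line; the following is a sketch of that computation, using only the model's own definitions:

```latex
\Lambda_T(\delta,S)=\int_0^T e^{-k\,(S_t-S_0+\delta)}\,dt
  = e^{-k\delta}\int_0^T e^{-k\,(S_t-S_0)}\,dt,
\qquad\text{hence}\qquad
\frac{\partial}{\partial\delta}\Lambda_T(\delta,S)=-k\,\Lambda_T(\delta,S),
\qquad
\frac{\partial^2}{\partial\delta^2}\Lambda_T(\delta,S)=k^2\,\Lambda_T(\delta,S).
```

So $-\partial_\delta\Lambda_T/\Lambda_T = k$ and $-\partial^2_\delta\Lambda_T/\partial_\delta\Lambda_T = k$ pathwise, for every $\delta$, and all the essential infima and suprema in (2.9)-(2.11) equal $k$.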
Two functionals $F$ and $G$ on $\mathbb{D}([0,T],\mathbb{R})$ are co-monotonic if they are monotonic with the same monotony. A functional $F$ has polynomial growth if
$$\exists\, r > 0,\ \exists\, K > 0,\quad \forall\,\alpha\in\mathbb{D}([0,T],\mathbb{R}),\qquad |F(\alpha)| \le K\big(1 + \|\alpha\|_\infty^r\big).$$
Definition 2.2.
A stepwise constant càdlàg (resp. continuous) process $(S_t)_{t\in[0,T]}$ satisfies a functional co-monotony principle if, for every pair $F, G : \mathbb{D}([0,T],\mathbb{R})\to\mathbb{R}$ of co-monotonic Borel functionals (resp. continuous at every $\alpha\in\mathcal{C}([0,T],\mathbb{R})$ for the sup-norm) with polynomial growth, such that $F(S)$, $G(S)$ and $F(S)G(S)\in L^1$, one has
$$\mathbb{E}\big[F\big((S_t)_{t\in[0,T]}\big)\, G\big((S_t)_{t\in[0,T]}\big)\big] \ge \mathbb{E}\big[F\big((S_t)_{t\in[0,T]}\big)\big]\; \mathbb{E}\big[G\big((S_t)_{t\in[0,T]}\big)\big]. \qquad (2.12)$$
In the proofs below, we will use this principle for the price process $(S_t)_{t\in[0,T]}$ with monotonic functionals of opposite monotony; the inequality in (2.12) is then reversed (all we have to do is to replace $F$ by $-F$). This co-monotony principle is established and proved in Appendix B for a wide class of Brownian diffusions, their discrete time samples and their Euler schemes. The main dynamics in which we are interested are the following.

1. An "admissible" Brownian diffusion in the sense of Definition B.2 satisfies a functional co-monotony principle. So does its Euler scheme with step $\frac{T}{m}$, at least for $m$ large enough, say $m\ge m_{b,\sigma}$, where $m_{b,\sigma}$ only depends on (the Lipschitz coefficients of) the drift $b$ and the diffusion coefficient $\sigma$ of the SDE of which $(S_t)_{t\in[0,T]}$ is a solution.

2. Any discrete time sample $(S_{t_i})_{0\le i\le m}$ of a continuous process satisfying a functional co-monotony principle also satisfies a co-monotony principle for continuous functions $f, g : \mathbb{R}^{m+1}\to\mathbb{R}$, monotonic in each of their variables with the same monotony, such that $f((S_{t_i})_{0\le i\le m})$, $g((S_{t_i})_{0\le i\le m})$, $fg((S_{t_i})_{0\le i\le m})\in L^1$.

3. If $(S_{t_i})_{0\le i\le m}$ is a Markov chain whose transitions $P_i g(x) = \mathbb{E}\big[g(S_{t_{i+1}})\,\big|\, S_{t_i} = x\big]$, $i = 0,\dots$
, m, preserve monotony ( g non-decreasing implies P i g is non-decreasing), then ( S t i ) ≤ i ≤ m also satisfiesa co-monotony principle in the same sense as for discrete time sample.The proof of 1. is postponed in Appendix B (Theorem B.2). Admissible diffusions include standardBrownian motion, geometrical Brownian motion and most models of dynamics traded assets (seeSection B.2.3).Claim 2. follows by associating to a function f the functional F ( α ) = f (( α ( t i )) ≤ i ≤ m ).For claim 3., we refer to [23] Proposition 2.1. Theorem 2.3.
Assume that (S_t)_{t∈[0,T]} satisfies a functional co-monotony principle and that the function λ is essentially exponential in the sense of (2.9) and (2.10) above, with constants k₁, k₂ as in (2.11). Then the following monotony and convexity criteria hold true.

(a) Monotony at the origin: the derivative C′(0) < 0 as soon as

Q_T ≥ 2Tλ(−S₀) and κ ≤ k₁S₀ / (k₂ E[S_T](Φ(Q_T) − Φ(Q_T−1))).

(b) Convexity. Let ρ_Q ∈ (0, 1 − P(N_μ = Q_T−1)/P(N_μ ≤ Q_T−1)|_{μ=2Tλ(−S₀)}). If Φ ≠ id, assume that Φ satisfies

∀x ∈ [1, Q_T−1], Φ(x) − Φ(x−1) ≤ ρ_Q (Φ(x+1) − Φ(x)).

If Q_T ≥ 2(1 ∨ (k₂²/k₁²)) Tλ(−S₀) and κ ≤ 2k₁ / (k₂² E[S_T] Φ′_ℓ(Q_T)), then C″(δ) ≥ 0, δ ∈ [0, δ_max], so that C is convex on [0, δ_max].

(c) The same inequalities hold for the Euler scheme of (S_t)_{t∈[0,T]} with step T/m, for m ≥ m_{b,σ}, or for any ℝ^{m+1}-valued time discretization sequence (S_{t_i})_{0≤i≤m} which satisfies a co-monotony principle.

Remark.
These conditions on the model parameters are conservative. Indeed, "sharper" criteria can be given, whose bounds involve ratios of expectations which can only be evaluated by Monte Carlo simulations:

C′(0) < 0 ⟺ 0 < κ < b₀, where b₀ = E[−Q_T P^(0)(N_μ > Q_T) + (S₀ ∂/∂δ Λ_T(0,S) − Λ_T(0,S)) P^(0)(N_μ ≤ Q_T−1)] / E[S_T ∂/∂δ Λ_T(0,S) φ^(0)(μ)],

and

C is convex on [0, δ_max] ⟺ 0 < κ < min_{δ∈D₊} A(δ)/B(δ),

where

A(δ) = E[((S₀−δ) ∂²/∂δ² Λ_T(δ,S) − 2 ∂/∂δ Λ_T(δ,S)) P^(δ)(N_μ ≤ Q_T−1) − (S₀−δ)(∂/∂δ Λ_T(δ,S))² P^(δ)(N_μ = Q_T−1)],

B(δ) = E[S_T(∂²/∂δ² Λ_T(δ,S) φ^(δ)(μ) − (∂/∂δ Λ_T(δ,S))² ψ^(δ)(μ))]

and D₊ = {δ ∈ [0, δ_max] | B(δ) > 0}.

The cost function C and its derivatives

First we briefly recall for convenience a few basic facts on Poisson distributed random variables that will be needed to compute the cost function C and its derivatives C′ and C″ (proofs are left to the reader).

Proposition 3.1 (Classical formulas).
Let (N_μ)_{μ>0} be a family of Poisson distributed random variables with parameter μ > 0.

(i) For every function f : ℕ → ℝ₊ such that log f(n) = O(n),

d/dμ E[f(N_μ)] = E[f(N_μ+1) − f(N_μ)] = E[f(N_μ)(N_μ/μ − 1)].

In particular, for any k ∈ ℕ, d/dμ P(N_μ ≤ k) = −P(N_μ = k). For any k ∈ ℕ*,

(ii) E[k ∧ N_μ] = k P(N_μ > k) + μ P(N_μ ≤ k−1) and d/dμ E[k ∧ N_μ] = P(N_μ ≤ k−1),

(iii) E[(k − N_μ)₊] = k P(N_μ ≤ k) − μ P(N_μ ≤ k−1),

(iv) k P(N_μ = k) = μ P(N_μ = k−1).

To compute the cost function (or its gradient), it is convenient to pre-condition with respect to F_T^S := σ(S_t, 0 ≤ t ≤ T): this boils down to computing the above quantities when N^(δ) is replaced by a standard Poisson random variable N_μ with parameter μ. Therefore

C(δ) = E[(S₀−δ)(Q_T ∧ N_T^(δ)) + κS_T Φ((Q_T − N_T^(δ))₊)]
 = E[(S₀−δ) E[Q_T ∧ N_T^(δ) | F_T^S] + κS_T E[Φ((Q_T − N_T^(δ))₊) | F_T^S]]
 = E[(S₀−δ) E[Q_T ∧ N_μ]|_{μ=Λ_T(δ,S)} + κS_T E[Φ((Q_T − N_μ)₊)]|_{μ=Λ_T(δ,S)}]
 = E[C̃(δ, Λ_T(δ,S), (S_t)_{0≤t≤T})],   (3.13)

where for every x ∈ C([0,T], ℝ₊) and every μ ∈ ℝ₊,

C̃(δ, μ, x) = (x₀−δ)(Q_T P(N_μ > Q_T) + μ P(N_μ ≤ Q_T−1)) + κx_T E[Φ((Q_T − N_μ)₊)].

For reading convenience, we set P^(δ)(N_μ > Q_T) = P(N_μ > Q_T)|_{μ=Λ_T(δ,S)} and E^(δ)[f(μ)] = E[f(μ)]|_{μ=Λ_T(δ,S)}. Now we are in position to compute the first and second derivatives of the cost function C.

Proposition 3.2.
(a) If Φ ≠ id, then C′(δ) = E[H(δ,S)] with

H(δ,S) = −Q_T P^(δ)(N_μ > Q_T) + (∂/∂δ Λ_T(δ,S)(S₀−δ) − Λ_T(δ,S)) P^(δ)(N_μ ≤ Q_T−1) − κS_T ∂/∂δ Λ_T(δ,S) φ^(δ)(μ),   (3.14)

where φ^(δ)(μ) = E^(δ)[(Φ(Q_T−N_μ) − Φ(Q_T−N_μ−1)) 1_{N_μ ≤ Q_T−1}], and

C″(δ) = E[((S₀−δ) ∂²/∂δ² Λ_T(δ,S) − 2 ∂/∂δ Λ_T(δ,S)) P^(δ)(N_μ ≤ Q_T−1) − κS_T ∂²/∂δ² Λ_T(δ,S) φ^(δ)(μ) − (S₀−δ)(∂/∂δ Λ_T(δ,S))² P^(δ)(N_μ = Q_T−1) + κS_T (∂/∂δ Λ_T(δ,S))² ψ^(δ)(μ)],   (3.15)

where ψ^(δ)(μ) = E^(δ)[Φ((Q_T−N_μ−2)₊) − 2Φ((Q_T−N_μ−1)₊) + Φ((Q_T−N_μ)₊)].

(b) If Φ = id, then C′(δ) = E[H(δ,S)] with

H(δ,S) = −Q_T P^(δ)(N_μ > Q_T) + ((S₀−δ−κS_T) ∂/∂δ Λ_T(δ,S) − Λ_T(δ,S)) P^(δ)(N_μ ≤ Q_T−1)   (3.16)

and

C″(δ) = E[((S₀−δ−κS_T) ∂²/∂δ² Λ_T(δ,S) − 2 ∂/∂δ Λ_T(δ,S)) P^(δ)(N_μ ≤ Q_T−1) − (S₀−δ−κS_T)(∂/∂δ Λ_T(δ,S))² P^(δ)(N_μ = Q_T−1)].   (3.17)

Proof.
Interchanging differentiation and expectation in the representation (3.13) implies

C′(δ) = E[∂C̃/∂δ(δ, Λ_T(δ,S), (S_t)_{0≤t≤T}) + ∂C̃/∂μ(δ, Λ_T(δ,S), (S_t)_{0≤t≤T}) ∂/∂δ Λ_T(δ,S)].

(a) We come down to computing the partial derivatives of C̃(δ,μ,x). First,

∂C̃/∂δ(δ,μ,x) = −E[Q_T ∧ N_μ] = −Q_T P(N_μ > Q_T) − μ P(N_μ ≤ Q_T−1).

By (ii) in Proposition 3.1, ∂/∂μ E[Q_T ∧ N_μ] = P(N_μ ≤ Q_T−1), and by (i),

∂/∂μ E[Φ((Q_T−N_μ)₊)] = E[Φ((Q_T−N_μ−1)₊) − Φ((Q_T−N_μ)₊)] = −E[(Φ(Q_T−N_μ) − Φ(Q_T−N_μ−1)) 1_{N_μ ≤ Q_T−1}] =: −φ(μ),

so that ∂C̃/∂μ(δ,μ,x) = (x₀−δ) P(N_μ ≤ Q_T−1) − κx_T φ(μ). Therefore

C′(δ) = E[−Q_T P^(δ)(N_μ > Q_T) + (∂/∂δ Λ_T(δ,S)(S₀−δ) − Λ_T(δ,S)) P^(δ)(N_μ ≤ Q_T−1) − κS_T ∂/∂δ Λ_T(δ,S) φ^(δ)(μ)]
 = E[Ĉ(δ, Λ_T(δ,S), ∂/∂δ Λ_T(δ,S), (S_t)_{0≤t≤T})],   (3.18)

where φ^(δ)(μ) := φ(μ)|_{μ=Λ_T(δ,S)} and, for every x ∈ C([0,T], ℝ₊), μ ∈ ℝ₊ and ν ∈ ℝ,

Ĉ(δ, μ, ν, x) = −Q_T P(N_μ > Q_T) + (ν(x₀−δ) − μ) P(N_μ ≤ Q_T−1) − κx_T ν φ(μ).

Interchanging differentiation and expectation in the representation (3.18) implies

C″(δ) = E[∂Ĉ/∂δ(…) + ∂Ĉ/∂μ(…) ∂/∂δ Λ_T(δ,S) + ∂Ĉ/∂ν(…) ∂²/∂δ² Λ_T(δ,S)],

where the arguments of Ĉ and its partial derivatives are (δ, Λ_T(δ,S), ∂/∂δ Λ_T(δ,S), (S_t)_{0≤t≤T}). We now deal with the partial derivatives of Ĉ(δ,μ,ν,x):

∂Ĉ/∂δ(δ,μ,ν,x) = −ν P(N_μ ≤ Q_T−1),
∂Ĉ/∂μ(δ,μ,ν,x) = −P(N_μ ≤ Q_T−1) − ν(x₀−δ) P(N_μ = Q_T−1) + κx_T ν ψ(μ),
∂Ĉ/∂ν(δ,μ,ν,x) = (x₀−δ) P(N_μ ≤ Q_T−1) − κx_T φ(μ),

where ψ(μ) := −φ′(μ) = E[Φ((Q_T−N_μ−2)₊) − 2Φ((Q_T−N_μ−1)₊) + Φ((Q_T−N_μ)₊)]; in the computation of ∂Ĉ/∂μ, the terms Q_T P(N_μ = Q_T) and μ P(N_μ = Q_T−1) produced by differentiating −Q_T P(N_μ > Q_T) and −μ P(N_μ ≤ Q_T−1) cancel each other owing to (iv) in Proposition 3.1. Consequently

C″(δ) = E[((S₀−δ) ∂²/∂δ² Λ_T(δ,S) − 2 ∂/∂δ Λ_T(δ,S)) P^(δ)(N_μ ≤ Q_T−1) − κS_T ∂²/∂δ² Λ_T(δ,S) φ^(δ)(μ) − (S₀−δ)(∂/∂δ Λ_T(δ,S))² P^(δ)(N_μ = Q_T−1) + κS_T (∂/∂δ Λ_T(δ,S))² ψ^(δ)(μ)].

(b) If Φ = id, then ∂/∂μ E[Φ((Q_T−N_μ)₊)] = −P(N_μ ≤ Q_T−1), so that

∂C̃/∂δ(δ,μ,x) = −Q_T P(N_μ > Q_T) − μ P(N_μ ≤ Q_T−1) and ∂C̃/∂μ(δ,μ,x) = (x₀−δ−κx_T) P(N_μ ≤ Q_T−1).

Consequently

C′(δ) = E[−Q_T P^(δ)(N_μ > Q_T) + ((S₀−δ−κS_T) ∂/∂δ Λ_T(δ,S) − Λ_T(δ,S)) P^(δ)(N_μ ≤ Q_T−1)]
 = E[Ĉ(δ, Λ_T(δ,S), ∂/∂δ Λ_T(δ,S), (S_t)_{0≤t≤T})],   (3.19)

where now, for every x ∈ C([0,T], ℝ₊), μ ∈ ℝ₊ and ν ∈ ℝ,

Ĉ(δ, μ, ν, x) = −Q_T P(N_μ > Q_T) + ((x₀−δ−κx_T)ν − μ) P(N_μ ≤ Q_T−1).

Proceeding as in (a), the partial derivatives read

∂Ĉ/∂δ(δ,μ,ν,x) = −ν P(N_μ ≤ Q_T−1), ∂Ĉ/∂ν(δ,μ,ν,x) = (x₀−δ−κx_T) P(N_μ ≤ Q_T−1),
∂Ĉ/∂μ(δ,μ,ν,x) = −P(N_μ ≤ Q_T−1) − (x₀−δ−κx_T)ν P(N_μ = Q_T−1),

and consequently

C″(δ) = E[((S₀−δ−κS_T) ∂²/∂δ² Λ_T(δ,S) − 2 ∂/∂δ Λ_T(δ,S)) P^(δ)(N_μ ≤ Q_T−1) − (S₀−δ−κS_T)(∂/∂δ Λ_T(δ,S))² P^(δ)(N_μ = Q_T−1)]. □

To ensure that the optimization problem is well-posed, namely that the cost function C has a minimum on [0, δ_max], we need some additional assumptions: the cost function C must satisfy C′(0) < 0 and be convex on [0, δ_max]. This leads to bounds for the parameter κ, and this section is devoted to giving sufficient conditions on κ ensuring that these two properties are satisfied. The computations of the bounds given below rely on the co-monotony principle introduced in Appendix B. The proposition below gives bounds on the parameter κ which ensure that the execution cost function C has a minimum; the aim is to obtain sufficient bounds, easy to compute, namely depending only on the model parameters.

Proposition 4.1.
Assume that (S_t)_{t∈[0,T]} satisfies a co-monotony principle.

(a) Monotony at the origin. C′(0) < 0 as soon as

Q_T ≥ 2Tλ(−S₀), k₁ = essinf(−(∂/∂δ Λ_T(0,S))/Λ_T(0,S)) > 0 and κ ≤ k₁S₀ / (k₂ E[S_T](Φ(Q_T) − Φ(Q_T−1))).

In particular, when Φ ≡ id, the condition reduces to κ ≤ k₁S₀ / (k₂ E[S_T]).

(b) Global monotony (exponential intensity).
Assume that s* := ‖sup_{t∈[0,T]} S_t‖_{L∞} < +∞. If λ(x) = Ae^{−kx}, A > 0, k > 0, Q_T ≥ 2Tλ(−S₀) and

κ ≤ (1 + k(S₀−δ_max)) / (k s*) if Φ = id,   κ ≤ (1 + k(S₀−δ_max)) / (k s* (Φ(Q_T) − Φ(Q_T−1))) if Φ ≠ id,   (4.20)

then H(·, (y_i)_{0≤i≤m}) is non-decreasing on [0, δ_max] for every (y_i)_{0≤i≤m} ∈ [0, s*]^{m+1}.

(c) The same inequality holds for the Euler scheme of (S_t)_{t∈[0,T]} with step T/m, for m ≥ m_{b,σ}, or for any ℝ^{m+1}-valued time discretization sequence (S_{t_i})_{0≤i≤m} which satisfies a co-monotony principle (see Theorem 2.3-(c) and Corollary B.1).

To prove this result, we need to establish the monotony of several functions of μ which appear in the expression of C′.

Lemma 4.1. (i) The function μ ↦ μ P(N_μ ≤ Q) is non-decreasing on [0, ⌊(Q+1)/2⌋].

(ii) The function μ ↦ Θ(Q,μ) := E[Φ(Q−N_μ) − Φ(Q−N_μ−1) | N_μ ≤ Q−1] satisfies Θ(Q_T, μ) ≤ Θ(Q_T, 0) = Φ(Q_T) − Φ(Q_T−1) for all μ ≥ 0.

Proof of Lemma 4.1. (i) We have d/dμ (μ P(N_μ ≤ Q)) = P(N_μ ≤ Q) − μ P(N_μ = Q), so that d/dμ (μ P(N_μ ≤ Q)) ≥ 0 if and only if Σ_{k=0}^Q μ^k/k! ≥ μ · μ^Q/Q!. But k ↦ μ^k/k! is non-decreasing on {0, 1, …, ⌊μ⌋} and non-increasing on {⌊μ⌋, …}. Hence

Σ_{k=0}^Q μ^k/k! ≥ Σ_{k=⌊μ⌋}^Q μ^k/k! ≥ (Q − ⌊μ⌋ + 1) μ^Q/Q!,

so that Σ_{k=0}^Q μ^k/k! ≥ μ · μ^Q/Q! as soon as Q − ⌊μ⌋ + 1 ≥ μ, which holds for every μ ≤ ⌊(Q+1)/2⌋.

(ii) The function Φ is non-decreasing, non-negative and convex with Φ(0) = 0. Using the representation of μ ↦ N_μ given by

N_μ(ω) = max{n ∈ ℕ | Π_{i=1}^n U_i(ω) > e^{−μ}},

where the U_i are i.i.d. random variables uniformly distributed on the probability space (Ω, A, P), μ ↦ N_μ is clearly non-decreasing, so μ ↦ Q − N_μ is non-increasing and so is μ ↦ Φ(Q−N_μ) − Φ(Q−N_μ−1) (because of the convexity of Φ); hence Θ(Q, ·) is non-increasing and Θ(Q_T, μ) ≤ Θ(Q_T, 0) = Φ(Q_T) − Φ(Q_T−1). □
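The coupling used in the proof of (ii) is easy to simulate: with a single shared stream of uniforms (one ω), N_μ is non-decreasing in μ path by path (our own sketch; names and values are illustrative):

```python
import math
import random

def poisson_from_uniforms(mu, uniforms):
    # Representation from the proof: N_mu = max{ n : U_1 * ... * U_n > e^{-mu} },
    # evaluated on one shared stream of uniforms (the same omega for every mu)
    threshold = math.exp(-mu)
    prod, n = 1.0, 0
    for u in uniforms:
        prod *= u
        if prod <= threshold:
            break
        n += 1
    return n

rng = random.Random(0)
uniforms = [rng.random() for _ in range(10_000)]
mus = [0.5, 1.0, 2.0, 4.0, 8.0]
samples = [poisson_from_uniforms(mu, uniforms) for mu in mus]
# pathwise monotone coupling: for the same omega, mu -> N_mu is non-decreasing
```

Marginally each `samples[i]` is Poisson with the corresponding parameter; jointly they are ordered, which is exactly what makes μ ↦ Φ(Q−N_μ) − Φ(Q−N_μ−1) non-increasing pathwise.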
Remark. For Q ≥ 1, Lemma 4.1(i) shows in particular that μ ↦ μ P(N_μ ≤ Q) is non-decreasing on [0,1]; for Q = 0, μ ↦ μ P(N_μ = 0) = μe^{−μ} is clearly not non-decreasing on the whole of ℝ₊, but only on [0,1].

Proof of Proposition 4.1. (a) In our problem the intensity parameter μ = ∫₀^T λ(S_t − S₀ + δ) dt is continuous, non-increasing to zero as δ → +∞ and bounded by assumption (λ(−S₀) < +∞). Hence μ ∈ [0, λ(−S₀)T].

From (3.18), we have for δ = 0,

Ĉ(0, Λ_T(0,S), ∂/∂δ Λ_T(0,S), (S_t)_{0≤t≤T}) ≤ (S₀ ∂/∂δ Λ_T(0,S) − Λ_T(0,S)) P^(0)(N_μ ≤ Q_T−1) − κS_T ∂/∂δ Λ_T(0,S) φ^(0)(μ)
 = (∂/∂δ Λ_T(0,S)(S₀ − κS_T Θ(Q_T, Λ_T(0,S))) − Λ_T(0,S)) P^(0)(N_μ ≤ Q_T−1),

because −Q_T P(N_μ > Q_T)|_{μ=Λ_T(0,S)} ≤ 0. Let k₁ and k₂ be defined by (2.11). We have k₁ > 0 and k₂ > 0 by assumption, i.e.

∂/∂δ Λ_T(0,S) ≤ −k₁ Λ_T(0,S) a.s. and ∂/∂δ Λ_T(0,S) ≥ −k₂ Λ_T(0,S) a.s.

Then

Ĉ(0, Λ_T(0,S), ∂/∂δ Λ_T(0,S), (S_t)_{0≤t≤T}) ≤ −(k₁S₀ − κS_T k₂ Θ(Q_T, Λ_T(0,S))) (μ P(N_μ ≤ Q_T−1))|_{μ=Λ_T(0,S)}.

Now, by Lemma 4.1(ii), Θ(Q_T, μ) ≤ Θ(Q_T, 0) = Φ(Q_T) − Φ(Q_T−1). Therefore

Ĉ(0, Λ_T(0,S), ∂/∂δ Λ_T(0,S), (S_t)_{0≤t≤T}) ≤ −(k₁S₀ − κS_T k₂(Φ(Q_T) − Φ(Q_T−1))) (μ P(N_μ ≤ Q_T−1))|_{μ=Λ_T(0,S)}.

By Lemma 4.1(i), if Q_T ≥ 2Tλ(−S₀), then μ ↦ μ P(N_μ ≤ Q_T−1) is non-decreasing on the range of Λ_T(0,S). Moreover, the functional α ↦ Λ_T(0,α) = ∫₀^T λ(α(t) − S₀) dt is non-increasing, hence so is α ↦ (μ P(N_μ ≤ Q_T−1))|_{μ=Λ_T(0,α)}, while α ↦ −(k₁S₀ − κα(T)k₂(Φ(Q_T) − Φ(Q_T−1))) is non-decreasing. Both are continuous (with respect to the sup-norm) at any α ∈ C([0,T], ℝ). Consequently, the functional co-monotony principle (applied with opposite monotonies, so that the inequality is reversed) yields

E[−(k₁S₀ − κS_T k₂(Φ(Q_T) − Φ(Q_T−1))) (μ P(N_μ ≤ Q_T−1))|_{μ=Λ_T(0,S)}] ≤ E[−(k₁S₀ − κS_T k₂(Φ(Q_T) − Φ(Q_T−1)))] E[(μ P(N_μ ≤ Q_T−1))|_{μ=Λ_T(0,S)}].

In turn, this implies

C′(0) ≤ E[−(k₁S₀ − κS_T k₂(Φ(Q_T) − Φ(Q_T−1)))] E[(μ P(N_μ ≤ Q_T−1))|_{μ=Λ_T(0,S)}].

As E[(μ P(N_μ ≤ Q_T−1))|_{μ=Λ_T(0,S)}] ≥ 0, C′(0) < 0 as soon as E[−(k₁S₀ − κS_T k₂(Φ(Q_T) − Φ(Q_T−1)))] ≤ 0, i.e.

κ ≤ k₁S₀ / (k₂ E[S_T](Φ(Q_T) − Φ(Q_T−1))).

(b) From (3.18), the form of λ and (a), we get for every δ ∈ [0, δ_max] and S = (S_i)_{0≤i≤m} ∈ ℝ^{m+1},

H(δ,S) = −Q_T P^(δ)(N_μ > Q_T) + f(δ,S) (μ P(N_μ ≤ Q_T−1))|_{μ=Λ_T(δ,S)},

where f(δ,S) = −1 − k(S₀ − δ − κS_T Θ(Q_T, Λ_T(δ,S))) if Φ ≠ id and f(δ,S) = −1 − k(S₀ − δ − κS_T) if Φ = id. Since δ ↦ −Q_T P^(δ)(N_μ > Q_T) is non-decreasing, δ ↦ (μ P(N_μ ≤ Q_T−1))|_{μ=Λ_T(δ,S)} is non-increasing and non-negative owing to Lemma 4.1(i), and δ ↦ f(δ,S) is non-decreasing owing to Lemma 4.1(ii), δ ↦ H(δ,S) is non-decreasing as soon as f(δ,S) ≤ 0 for every δ ∈ [0, δ_max] and S ∈ [0, s*]^{m+1}, which leads to (4.20). □

We give below a sufficient condition ensuring the convexity of the execution cost function C, namely a conservative upper bound for the free parameter κ.

Proposition 4.2.
Assume that (S_t)_{t∈[0,T]} satisfies a co-monotony principle.

(i) If Φ ≠ id, assume that there exists ρ_Q ∈ (0, 1 − P(N_μ = Q_T−1)/P(N_μ ≤ Q_T−1)|_{μ=2Tλ(−S₀)}) such that

∀x ∈ [1, Q_T−1], Φ(x) − Φ(x−1) ≤ ρ_Q (Φ(x+1) − Φ(x)).

If Q_T ≥ 2(1 ∨ (k₂²/k₁²)) Tλ(−S₀) and κ ≤ 2k₁ / (k₂² E[S_T] Φ′_ℓ(Q_T)), where Φ′_ℓ is the left derivative of the convex function Φ, then C″(δ) ≥ 0, δ ∈ [0, δ_max], so that C is convex on [0, δ_max].

(ii) When Φ = id, the bound on κ reads κ ≤ 2k₁ / (k₂² E[S_T]).

Remark.
Note that if we replace the diffusion process (S_t)_{t∈[0,T]} by its stepwise constant Euler scheme or by a discrete-time sample (S_{t_i})_{0≤i≤m}, the same criteria hold owing to Corollary B.1 (see also Theorem 2.3-(c)).

To prove the above proposition, we need the following results.

Lemma 4.2. If μ ≤ Q−1, then μ ↦ P(N_μ = Q−1)/P(N_μ ≤ Q−1) is non-decreasing.

Proof of Lemma 4.2. Using d/dμ P(N_μ = k) = P(N_μ = k−1) − P(N_μ = k), d/dμ P(N_μ ≤ Q−1) = −P(N_μ = Q−1) and (iv) of Proposition 3.1, we have

d/dμ [P(N_μ = Q−1)/P(N_μ ≤ Q−1)] = [(P(N_μ = Q−2) − P(N_μ = Q−1)) P(N_μ ≤ Q−1) + P(N_μ = Q−1)²] / P(N_μ ≤ Q−1)²
 = P(N_μ = Q−1) [P(N_μ ≤ Q−1)((Q−1)/μ − 1) + P(N_μ = Q−1)] / P(N_μ ≤ Q−1)² ≥ 0 if μ ≤ Q−1. □

Lemma 4.3.
Assume that Φ ≠ id. If there exists ρ_Q ∈ (0, 1 − P(N_μ = Q_T−1)/P(N_μ ≤ Q_T−1)|_{μ=2Tλ(−S₀)}) such that

∀x ∈ [1, Q_T−1], Φ(x) − Φ(x−1) ≤ ρ_Q (Φ(x+1) − Φ(x)),

then μ ↦ φ(μ)/P(N_μ ≤ Q_T−1) is non-increasing, where φ(μ) = E[(Φ(Q_T−N_μ) − Φ(Q_T−N_μ−1)) 1_{N_μ ≤ Q_T−1}].

Remark. If Φ = id, then φ(μ)/P(N_μ ≤ Q_T−1) ≡ 1, so the previous lemmas are not needed in that case.
Proof of Lemma 4.3.
We have

d/dμ [φ(μ)/P(N_μ ≤ Q_T−1)] = [P(N_μ = Q_T−1) φ(μ) − ψ(μ) P(N_μ ≤ Q_T−1)] / P(N_μ ≤ Q_T−1)²,

where ψ(μ) = −φ′(μ) = E[Φ((Q_T−N_μ−2)₊) − 2Φ((Q_T−N_μ−1)₊) + Φ((Q_T−N_μ)₊)]. Hence the ratio is non-increasing iff P(N_μ = Q_T−1) φ(μ) ≤ ψ(μ) P(N_μ ≤ Q_T−1), i.e. iff

P(N_μ ≤ Q_T−1) E[Φ((Q_T−N_μ−1)₊) − Φ((Q_T−N_μ−2)₊)] ≤ (P(N_μ ≤ Q_T−1) − P(N_μ = Q_T−1)) φ(μ).

But

Φ((Q_T−N_μ−1)₊) − Φ((Q_T−N_μ−2)₊) ≤ ρ_Q (Φ((Q_T−N_μ)₊) − Φ((Q_T−N_μ−1)₊)) 1_{N_μ ≤ Q_T−2}
 = ρ_Q (Φ((Q_T−N_μ)₊) − Φ((Q_T−N_μ−1)₊) − (Φ(1) − Φ(0)) 1_{N_μ = Q_T−1})
 ≤ ρ_Q (Φ((Q_T−N_μ)₊) − Φ((Q_T−N_μ−1)₊))

since (Φ(1) − Φ(0)) 1_{N_μ = Q_T−1} ≥ 0 a.s. Consequently

P(N_μ ≤ Q_T−1) E[Φ((Q_T−N_μ−1)₊) − Φ((Q_T−N_μ−2)₊)] − (P(N_μ ≤ Q_T−1) − P(N_μ = Q_T−1)) φ(μ)
 ≤ (ρ_Q P(N_μ ≤ Q_T−1) − P(N_μ ≤ Q_T−1) + P(N_μ = Q_T−1)) φ(μ)
 = (ρ_Q − (1 − P(N_μ = Q_T−1)/P(N_μ ≤ Q_T−1))) P(N_μ ≤ Q_T−1) φ(μ) ≤ 0

as soon as ρ_Q ≤ 1 − P(N_μ = Q_T−1)/P(N_μ ≤ Q_T−1), which holds for every μ ∈ [0, 2Tλ(−S₀)] owing to Lemma 4.2 and the definition of ρ_Q. □

Proof of Proposition 4.2.
By using the notation (2.9)–(2.10), we obtain the following lower bound for the second derivative of the cost function:

C″(δ) ≥ E[2k₁ Λ_T(δ,S) P^(δ)(N_μ ≤ Q_T−1) + (S₀−δ) k₁² Λ_T(δ,S)(P^(δ)(N_μ ≤ Q_T−1) − (k₂²/k₁²) Λ_T(δ,S) P^(δ)(N_μ = Q_T−1)) − κS_T Λ_T(δ,S)(k₂² φ^(δ)(μ) − k₁² Λ_T(δ,S) ψ^(δ)(μ))].

By adapting the result of Lemma 4.1(i), we obtain, if Q_T ≥ 2(k₂²/k₁²) Tλ(−S₀), that

E[(P(N_μ ≤ Q_T−1) − (k₂²/k₁²) μ P(N_μ = Q_T−1))|_{μ=Λ_T(δ,S)}] ≥ 0.

Since moreover ψ^(δ)(μ) ≥ 0 a.s., C″(δ) ≥ 0 holds as soon as

κ ≤ 2k₁ E[(μ P(N_μ ≤ Q_T−1))|_{μ=Λ_T(δ,S)}] / (k₂² E[S_T Λ_T(δ,S) φ^(δ)(μ)]).   (4.21)

By Lemma 4.3, μ ↦ φ(μ)/P(N_μ ≤ Q_T−1) is non-increasing and, by Lemma 4.1(i), μ ↦ μ P(N_μ ≤ Q_T−1) is non-decreasing on [0, ⌊(Q_T+1)/2⌋]. Furthermore α ↦ Λ_T(δ,α) is non-increasing. Writing

S_T Λ_T(δ,S) φ^(δ)(μ) = (μ P(N_μ ≤ Q_T−1))|_{μ=Λ_T(δ,S)} × S_T (φ(μ)/P(N_μ ≤ Q_T−1))|_{μ=Λ_T(δ,S)},

the first factor is a non-increasing functional of S and the second a non-decreasing one, so the functional co-monotony principle yields, for Q_T ≥ 2Tλ(−S₀),

E[S_T Λ_T(δ,S) φ^(δ)(μ)] ≤ E[(μ P(N_μ ≤ Q_T−1))|_{μ=Λ_T(δ,S)}] E[S_T (φ(μ)/P(N_μ ≤ Q_T−1))|_{μ=Λ_T(δ,S)}].

Therefore (4.21) is satisfied as soon as

κ ≤ 2k₁ / (k₂² E[S_T (φ(μ)/P(N_μ ≤ Q_T−1))|_{μ=Λ_T(δ,S)}]).

Since by Lemma 4.3 μ ↦ φ(μ)/P(N_μ ≤ Q_T−1) is non-increasing, we get

(φ(μ)/P(N_μ ≤ Q_T−1))|_{μ=Λ_T(δ,S)} ≤ (φ(μ)/P(N_μ ≤ Q_T−1))|_{μ=0} = Φ(Q_T) − Φ(Q_T−1) ≤ Φ′_ℓ(Q_T),

which finally yields the announced (more stringent) criterion κ ≤ 2k₁ / (k₂² E[S_T] Φ′_ℓ(Q_T)). □

Remark. As δ ∈ [0, δ_max], (S₀−δ) ∈ [S₀−δ_max, S₀], so

(S₀−δ) k₁² Λ_T(δ,S)(P^(δ)(N_μ ≤ Q_T−1) − (k₂²/k₁²) Λ_T(δ,S) P^(δ)(N_μ = Q_T−1)) ≥ (S₀−δ_max) k₁² [μ(P(N_μ ≤ Q_T−1) − (k₂²/k₁²) μ P(N_μ = Q_T−1))]|_{μ=Λ_T(δ,S)}.

Unfortunately we cannot use the functional co-monotony principle to improve the bound because, for Q_T ≥ 2(1 ∨ (k₂²/k₁²)) Tλ(−S₀), the function μ ↦ μ P(N_μ ≤ Q_T−1) is non-decreasing while μ ↦ 1 − (k₂²/k₁²) μ P(N_μ = Q_T−1)/P(N_μ ≤ Q_T−1) is non-increasing, and we would need a lower bound for the product whereas co-monotony naturally yields an upper bound.

In this section, we present numerical results on both simulated and real data. We first present the chosen model for the price dynamics and the penalization function. Within the numerical examples, we model the optimal behavior of a "learning trader" reassessing the price of his passive order every 5 units of time (seconds or minutes) in the order book, so as to adapt to the characteristics of the market (fair price moves S_t and order flow dynamics N_t). During each period of 5 seconds, the trader posts her order of size Q in the book at a distance δ from the best opposite price (δ lower than the best ask for a buy order) and waits 5 seconds. If the order is not completely filled after these 5 seconds (say at time T), the trader cancels the remaining quantity (Q − N_T)₊ and buys it with a market order at price S_T plus a market impact: she buys at κS_T(1 + η((Q − N_T)₊)). Then she can reproduce the experiment, choosing another value for the distance δ to the best opposite. The reassessment procedure used here is the one of formula (2.4), using the expectation representation of C′ given by Proposition 3.2 to provide the proper form for the function H. Then we plot the cost function and its derivative, both for a trivial penalization function Φ = Id (η ≡ 0) and for a non-trivial one.

We assume that dS_t = σ dW_t, S₀ = s₀, and Λ_T(δ,S) = A ∫₀^T e^{−k(S_t − S₀ + δ)} dt, where (W_t)_{t≥0} is a standard Brownian motion and σ, A, k > 0 (i.e. λ(x) = Ae^{−kx}). We denote by (S̄_t)_{t≥0} the Euler scheme with step T/m of (S_t)_{t∈[0,T]}, defined by

S̄_{k+1} := S̄_k + σ √(T/m) Z_{k+1}, S̄₀ = s₀, Z_{k+1} i.i.d. N(0,1), k ≥ 0,

and we approximate Λ_T(δ,S) by Λ̄_T(δ,S) = A (T/m) Σ_{k=0}^m e^{−k(S̄_k − S₀ + δ)}. The market impact penalization function is Φ(x) = (1 + η(x)) x with η(x) = A′e^{k′x}.
Now we present the cost function and its derivative for the following parameters:
• parameters of the asset dynamics: s₀ = 100 and σ = 0.;
• parameters of the intensity of the execution process: A = 5 and k = 1;
• parameters of the execution: T = 5 and Q = 10;
• parameters of the penalization function: κ = 1, A′ = 0. and k′ = 0.;
• m = 20 for the Euler scheme and M = 10000 Monte Carlo simulations.

Setting 1 (η ≠ 0): cost function and derivative.
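A Monte Carlo evaluation of the cost along these lines can be sketched as follows (our own illustrative code, not the paper's; we take Φ = id, i.e. η ≡ 0, and σ = 0.01 as a stand-in value, since the exact figure is truncated in the list above):

```python
import math
import random

def euler_path(s0, sigma, T, m, rng):
    # Euler scheme of dS_t = sigma dW_t with step T/m (exact here, S being a scaled BM)
    dt = T / m
    path = [s0]
    for _ in range(m):
        path.append(path[-1] + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0))
    return path

def intensity_bar(delta, path, A, k, T):
    # bar Lambda_T(delta, S) = A (T/m) sum_j exp(-k (S_j - S_0 + delta))
    m = len(path) - 1
    s0 = path[0]
    return A * (T / m) * sum(math.exp(-k * (s - s0 + delta)) for s in path)

def sample_poisson(mu, rng):
    # product-of-uniforms Poisson sampler
    thresh, prod, n = math.exp(-mu), 1.0, 0
    while True:
        prod *= rng.random()
        if prod <= thresh:
            return n
        n += 1

def cost_mc(delta, Q=10, kappa=1.0, s0=100.0, sigma=0.01, T=5.0,
            A=5.0, k=1.0, m=20, n_mc=2000, seed=1):
    # Monte Carlo estimate of C(delta) for Phi = id (eta ≡ 0):
    # C(delta) = E[(s0 - delta)(Q ∧ N) + kappa S_T (Q - N)_+]
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_mc):
        path = euler_path(s0, sigma, T, m, rng)
        n = sample_poisson(intensity_bar(delta, path, A, k, T), rng)
        total += (s0 - delta) * min(Q, n) + kappa * path[-1] * max(Q - n, 0)
    return total / n_mc

c0 = cost_mc(0.0)  # at delta = 0 the fill intensity is ~ A*T = 25, so the order is almost surely filled
```

With these stand-in parameters, the same routine evaluated on a grid of δ reproduces the U-shape of the plotted cost curves.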
Figure 2: η ≠ 0, T = 5, A = 5, k = 1, s₀ = 100, σ = 0., Q = 10, κ = 6, A′ = 1, k′ = 0., m = 20 and M = 10000.

Setting 2 (η ≡ 0): cost function and derivative.
Figure 3: η ≡ 0, T = 5, A = 5, k = 1, s₀ = 100, σ = 0., Q = 10, κ = 12, m = 20 and M = 10000.
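Before turning to the results, the reassessment rule itself, i.e. the projected update of formula (2.4) with H taken from (3.16) (case Φ = id), can be sketched as follows (our own minimal implementation; parameter values are illustrative and σ is a stand-in):

```python
import math
import random

def pois_cdf(mu, k):
    # P(N_mu <= k), exact recursive evaluation of the Poisson cdf
    term = math.exp(-mu)
    total = term
    for j in range(1, k + 1):
        term *= mu / j
        total += term
    return total

def H_phi_id(delta, path, Q, kappa, A, k, T):
    # H(delta, S) of (3.16), Phi = id, lambda(x) = A e^{-kx} (so dLambda/ddelta = -k Lambda)
    m = len(path) - 1
    s0, sT = path[0], path[-1]
    lam = A * (T / m) * sum(math.exp(-k * (s - s0 + delta)) for s in path)
    return (-Q * (1.0 - pois_cdf(lam, Q))
            + ((s0 - delta - kappa * sT) * (-k * lam) - lam) * pois_cdf(lam, Q - 1))

def learn_delta(n_iter=100, delta0=0.5, delta_max=2.0, Q=10, kappa=1.0,
                s0=100.0, sigma=0.01, T=5.0, A=5.0, k=1.0, m=20, seed=11):
    # delta_{n+1} = Proj_[0, delta_max]( delta_n - gamma_{n+1} H(delta_n, S^{(n+1)}) ),
    # one fresh Euler path of the fair price per posting cycle, gamma_n = 1/(100 n)
    rng = random.Random(seed)
    dt, delta = T / m, delta0
    for n in range(1, n_iter + 1):
        path = [s0]
        for _ in range(m):
            path.append(path[-1] + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0))
        delta -= (1.0 / (100.0 * n)) * H_phi_id(delta, path, Q, kappa, A, k, T)
        delta = min(delta_max, max(0.0, delta))
    return delta

delta_opt = learn_delta()
```

Each iteration consumes one posting cycle of market feedback, which is exactly the "learning by trading" loop described in the text.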
When one looks at the cost functions in Figures 2 and 3 (left), one may think it would be simpler to compute the cost function directly and derive its minimum and the associated optimal distance. But the computing time of the cost function is about 100 seconds, which is too large compared with the length of the posting period T = 5 seconds, whereas one run of the stochastic recursive procedure takes about 1 second. Now we present the results of the stochastic recursive procedure for the two settings, with n = 100 and γ_n = 1/(100n).

Setting 1 (η ≠ 0): stochastic approximation.
Figure 4: fair price and posting price over time; η ≠ 0, T = 5, A = 5, k = 1, s₀ = 100, σ = 0., Q = 10, κ = 6, A′ = 1, k′ = 0., m = 20 and n = 100.

Setting 2 (η ≡ 0): stochastic approximation.
Figure 5: fair price and posting price over time; η ≡ 0, T = 5, A = 5, k = 1, s₀ = 100, σ = 0., Q = 10, κ = 12, m = 20 and n = 100.

The self-adaptive nature of this recursive procedure allows one to implement it on real data, even if they do not exactly fulfill the model assumptions. In the numerical example of this section, the trader reassesses his order using the previously exposed recursive procedure on real data, on which the parameters of the model (k, A, κ, k′, A′) have been fitted beforehand. As market data, we use the bid prices of Accor SA (ACCP.PA) on 11/11/2010 for the fair price process (S_t)_{t∈[0,T]}. We divide the day into periods of 15 trades, which constitute the steps of the stochastic procedure. Let N_cycles be the number of these periods. For every n ≤ N_cycles, we have a sequence of bid prices (S^(n)_{t_i})_{1≤i≤15} and we approximate the jump intensity of the Poisson process Λ_{T_n}(δ,S), where T_n = Σ_{i=1}^{15} t_i, by

∀n ∈ {1, …, N_cycles}, Λ_{T_n}(δ,S) = A Σ_{i=2}^{15} e^{−k(S^(n)_{t_i} − S^(n)_{t_1} + δ)} (t_i − t_{i−1}).

The empirical mean of the intensity function

Λ̄(δ,S) = (1/N_cycles) Σ_{n=1}^{N_cycles} Λ_{T_n}(δ,S)

is plotted in Figure 6.
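The empirical intensity estimator above is straightforward to compute from cycle data; a sketch on synthetic bid quotes (hypothetical values, not the Accor sample):

```python
import math

def cycle_intensity(delta, times, bids, A, k):
    # Lambda_{T_n}(delta, S) = A * sum_{i>=2} exp(-k (S_{t_i} - S_{t_1} + delta)) * (t_i - t_{i-1})
    s_ref = bids[0]
    return A * sum(math.exp(-k * (bids[i] - s_ref + delta)) * (times[i] - times[i - 1])
                   for i in range(1, len(bids)))

def mean_intensity(delta, cycles, A=0.02, k=50.0):
    # empirical mean of the intensity over the N_cycles posting periods
    return sum(cycle_intensity(delta, t, b, A, k) for t, b in cycles) / len(cycles)

# two hypothetical 15-trade cycles, each a (times, bids) pair
cycles = [
    ([i * 1.0 for i in range(15)], [100.00 + 0.01 * (i % 3) for i in range(15)]),
    ([i * 1.5 for i in range(15)], [100.02 - 0.01 * (i % 2) for i in range(15)]),
]
lam_near = mean_intensity(0.0, cycles)
lam_deep = mean_intensity(0.02, cycles)  # posting deeper in the book lowers the fill intensity
```

The A and k values here are placeholders; in the experiment they are fitted to the data, as shown in Figure 6.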
Figure 6: fit of the exponential model on real data (Accor SA (ACCP.PA), 11/11/2010): A = 1/, k = 50 and N_cycles = 220.

The penalization function has the following form: Φ(x) = (1 + η(x)) x with η(x) = A′e^{k′x}. Now we present the cost function and its derivative for the following parameter values: A = 1/, k = 50, Q = 100, A′ = 0.001 and k′ = 0..

Setting 1 (η ≠ 0): cost function and derivative.
Figure 7: η ≠ 0, A = 1/, k = 50, Q = 100, κ = 1, A′ = 0., k′ = 0. and N_cycles = 220.

Setting 2 (η ≡ 0): cost function and derivative.
Figure 8: η ≡ 0, A = 1/, k = 50, Q = 100, κ = 1.001 and N_cycles = 220.

Now we present the results of the stochastic recursive procedure in two cases. To smooth the behavior of the stochastic algorithm, we use Ruppert and Poliak's averaging principle (see [8]). In short, this principle is two-fold:
– Phase 1: implement the original zero-search procedure with step γ_n = γ₁/n^ρ, 1/2 < ρ < 1, γ₁ > 0;
– Phase 2: compute at each step n the arithmetic mean of all the past values of the procedure, namely

δ̄_n = (1/(n+1)) Σ_{k=0}^n δ_k, n ≥ 0.
It has been shown by several authors that, under appropriate assumptions, this averaged procedure is ruled by a CLT with minimal asymptotic variance (among recursive procedures).
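On a toy zero-search problem the two phases read as follows (a minimal sketch of ours; the mean function h(θ) = θ − θ* and all numerical values are hypothetical):

```python
import random

def rm_with_averaging(theta_star=0.5, n_steps=20_000, gamma1=1.0, rho=0.75, seed=7):
    # Phase 1: Robbins-Monro iterates theta_{n+1} = theta_n - gamma_{n+1} H(theta_n, Y_{n+1})
    # with H(theta, y) = theta - theta_star + y, y a centered innovation.
    # Phase 2: running arithmetic mean of all past iterates (Ruppert-Poliak average).
    rng = random.Random(seed)
    theta = 0.0
    avg = theta
    for n in range(1, n_steps + 1):
        gamma = gamma1 / n ** rho          # slowly decreasing step, 1/2 < rho < 1
        noise = rng.uniform(-1.0, 1.0)     # centered noise
        theta -= gamma * (theta - theta_star + noise)
        avg += (theta - avg) / (n + 1)     # incremental mean of theta_0, ..., theta_n
    return theta, avg

theta_last, theta_avg = rm_with_averaging()
```

The raw iterate `theta_last` still oscillates with the slowly decreasing step, while `theta_avg` is much tighter around the target, which is the whole point of the averaging.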
Setting 1 (η ≠ 0): stochastic approximation, fair and posting prices.
Figure 9: η ≠ 0, A = 1/, k = 50, Q = 100, κ = 1, A′ = 0., k′ = 0. and N_cycles = 220. Crude algorithm with γ_n = 1/n.
Figure 10: η ≠ 0, A = 1/, k = 50, Q = 100, κ = 1, A′ = 0., k′ = 0. and N_cycles = 220. Ruppert and Poliak's averaging algorithm with γ_n = 1/n^ρ.

Setting 2 (η ≡ 0): stochastic approximation, fair and posting prices.
Figure 11: η ≡ 0, A = 1/, k = 50, Q = 100, κ = 1.001, γ_n = 1/n and N_cycles = 220.
Figure 12: η ≡ 0, A = 1/, k = 50, Q = 100, κ = 1.001 and N_cycles = 220. Ruppert and Poliak's averaging algorithm with γ_n = 1/n^ρ.

We see on Figures 10 (for η ≠ 0) and 12 (for η ≡ 0) that the recursive procedures converge toward their respective targets, namely the minima of the execution cost functions presented in Figure 7, left (for η ≠ 0) and Figure 8, left (for η ≡ 0).
Practically, the stochastic algorithm defined by the dynamics of equation (2.4) has to be read as a reassessment rule applied to the distance to the reference price (which can be taken as the best opposite) at a given frequency (expressed in calendar time or in business time, i.e. number of trades or traded quantities): at each cycle (here of 5 seconds), the trading algorithm updates δ_n and modifies the price of its limit order to be the best opposite price minus δ_n (for a buy order). The change from δ_{n−1} to δ_n is computed using expression (3.14) for H, and a typical choice for the step is γ_n = γ₁/n^ρ (0 < ρ ≤ 1). In our experiments, the procedure needs a few minutes to converge on real data, vs less than one minute on simulated ones. The second element is that once the algorithm has succeeded in getting close to the optimal value δ*, it oscillates around it without any need to wait again from one to ten minutes to converge. It means that using a heuristic rule to choose δ₀, rather than a random value, can practically avoid "paying" this first step. Another important element to highlight is that the posting price dynamics on simulated data are really different from the ones on real data. On simulated data, the price evolution diagrams are unrealistic for any trader; this mainly comes from the fact that the price does not behave like a Brownian motion at very short time scales.
On real data, the posting price adapts to the market and is far more realistic, following the last prices more closely than in a simulated environment. This shows the adaptiveness property of a stochastic algorithm, which may need more time to find a reasonable value for δ_n, but then stays very close to it, despite price moves that are less smooth than a classical diffusion. Going back to the expression of H, it must be qualitatively said that:
• the first term, −Q_T P^(δ)(N_μ > Q_T), pushes the price away when the order has been completely filled before the end of the reload period T;
• the second term, (∂/∂δ Λ_T(δ,S)(S₀−δ) − Λ_T(δ,S)) P^(δ)(N_μ ≤ Q_T−1), attracts the price to the best opposite as long as the price move between S_t and S_{t+T} does not change the intensity Λ(δ_n, S_{t+T}) of the "order flow" it will be exposed to;
• the last component, −κS_T ∂/∂δ Λ_T(δ,S) φ^(δ)(μ), attracts the price to the best opposite when the "price impact cost" to pay when the expected fill rate has not been obtained is too high (it enters the equation via κφ^(δ), which can be seen as the first derivative of the market impact κΦ).
This qualitative interpretation of the update rule, obtained by rigorous derivation of criterion (2.2), shows how to estimate and optimally mix these three natural effects that a trader would like to see in any price reassessment policy. It shows the value a stochastic algorithm approach can bring to optimal trading at every time scale: (1) it clearly links an objective criterion to the reassessment policy, and (2) it exposes the assumptions needed to guarantee convergence. This is precisely the risk-control role that one can expect from a formalized approach to trading.
This paper presents a rigorous proof of convergence of a reassessment scheme for a trading tactic aiming at capturing liquidity "around the book". It implements a learning-by-trading approach, validated inside our class of models (which is quite general):
• the distance to a reference price (in practice the best opposite, the mid point, or any efficient price estimate) is fixed at δ_n during a few seconds or market trades;
• the tactic observes the market feedback resulting from the combination of the natural diffusion of the reference price and a point process filling the order with an intensity depending on the instantaneous distance to the reference price (which varies);
• our formal results give the optimal way to adjust the distance δ_{n+1} to the reference price, given the marginal variations of the different market components that can be anticipated from an increase or decrease of the posting distance.

The robustness of the approach is not only guaranteed theoretically (provided that the market impact is in a realistic range, as commented in Section 2.2.1), but also confirmed and emphasized by tests carried out on real data (partially reproduced in Section 5.2). This benchmark shows that even if the real data behave differently from Monte Carlo generated scenarios (see Section 5.1), convergence still occurs. This paper strongly supports such iterative trading procedures, very often used by practitioners because they can be efficiently fitted online to real-time data while providing optimal reassessment rules.
This has to be compared to a stochastic control approach, which needs to be calibrated on data over longer time frames, inducing an "averaging" effect on the instantaneous liquidity effects that can occur in real markets.

With the recent modifications of market microstructure following the fragmentation of markets and the emergence of high-frequency trading, it is clear that algorithmic traders will need to devise more reactive and short-term tactics. Covering this aspect of trading, this paper opens the door to further research (such as multi-trading-pool and multi-asset reassessment of limit prices) and to applications for practitioners. Traders can use such a scheme as a sub-tactic of a brokerage algorithm, of a high-frequency market making mechanism, or of any intraday arbitrage automated process.

Appendix

A Convergence theorem for constrained algorithms

The aim is to determine an element of the set {θ ∈ Θ : h(θ) = E[H(θ, Y)] = 0} (zeros of h in Θ), where Θ ⊂ R^d is a closed convex set, h : R^d → R^d and H : R^d × R^q → R^d. For θ_0 ∈ Θ, we consider the R^d-valued sequence (θ_n)_{n≥0} defined by

θ_{n+1} = Proj_Θ(θ_n − γ_{n+1} H(θ_n, Y_{n+1})),   (A.22)

where (Y_n)_{n≥1} is an i.i.d. sequence with the same law as Y, (γ_n)_{n≥1} is a positive sequence of real numbers and Proj_Θ denotes the Euclidean projection on Θ. The recursive procedure (A.22) can be rewritten as follows:

θ_{n+1} = θ_n − γ_{n+1} h(θ_n) − γ_{n+1} ΔM_{n+1} + γ_{n+1} p_{n+1},   (A.23)

where ΔM_{n+1} = H(θ_n, Y_{n+1}) − h(θ_n) is a martingale increment and

p_{n+1} = (Proj_Θ(θ_n − γ_{n+1} H(θ_n, Y_{n+1})) − θ_n)/γ_{n+1} + H(θ_n, Y_{n+1}).

Theorem A.1. (see [18] and [19]) Let (θ_n)_{n≥0} be the sequence defined by (A.23). Assume that there exists a unique θ* ∈ Θ such that h(θ*) = 0 and that the mean function satisfies on Θ the following mean-reverting property, namely

∀ θ ∈ Θ, θ ≠ θ*, ⟨h(θ) | θ − θ*⟩ > 0.   (A.24)

Assume that the gain parameter sequence (γ_n)_{n≥1} satisfies

Σ_{n≥1} γ_n = +∞ and Σ_{n≥1} γ_n² < +∞.   (A.25)

If the function H satisfies

∃ K > 0 such that ∀ θ ∈ Θ, E[|H(θ, Y)|²] ≤ K(1 + |θ|²),   (A.26)

then θ_n → θ* a.s. as n → +∞.

Remark.
If Θ is bounded, (A.26) reads sup_{θ∈Θ} E[|H(θ, Y)|²] < +∞, which is always satisfied if Θ is compact and θ ↦ E[|H(θ, Y)|²] is continuous.

B Functional co-monotony principle for a class of one-dimensional diffusions
In this section, we present the principle of co-monotony, first for random vectors taking values in a nonempty interval I, then for one-dimensional diffusions living in I.

B.1 Case of random variables and random vectors
First we recall a classical result for random variables.
Proposition B.1.
Let f, g : I ⊂ R → R be two monotonic functions with the same monotony. Let X : (Ω, A, P) → I be a real-valued random variable such that f(X), g(X) ∈ L²(P). Then

Cov(f(X), g(X)) ≥ 0.

Proof.
Let X, Y be two independent random variables defined on the same probability space with the same distribution P_X. Since f and g have the same monotony, (f(X) − f(Y))(g(X) − g(Y)) ≥ 0, so that, expanding,

E[f(X)g(X)] − E[f(X)g(Y)] − E[f(Y)g(X)] + E[f(Y)g(Y)] ≥ 0.

Using that Y =(d) X and that X and Y are independent yields

2 E[f(X)g(X)] ≥ E[f(X)] E[g(Y)] + E[f(Y)] E[g(X)] = 2 E[f(X)] E[g(X)],

that is, Cov(f(X), g(X)) ≥ 0. □

Proposition B.2.
Let F, G : R^d → R be two monotonic functions with the same monotony in each of their variables, i.e. for every i ∈ {1, ..., d}, x_i ↦ F(x_1, ..., x_i, ..., x_d) and x_i ↦ G(x_1, ..., x_i, ..., x_d) are monotonic with the same monotony, which may depend on i (but does not depend on (x_1, ..., x_{i−1}, x_{i+1}, ..., x_d) ∈ R^{d−1}). Let X_1, ..., X_d be independent real-valued random variables defined on a probability space (Ω, A, P) such that F(X_1, ..., X_d), G(X_1, ..., X_d) ∈ L²(P). Then

Cov(F(X_1, ..., X_d), G(X_1, ..., X_d)) ≥ 0.

Proof.
The proof of the above proposition proceeds by induction on d. The case d = 1 is given by Proposition B.1. We give here the proof for d = 2 for notational convenience, but the general case of dimension d follows likewise. By the monotony assumption on F and G, we have for every x ∈ R, if X'_1 =(d) X_1 with X'_1, X_1 independent, that

(F(X_1, x) − F(X'_1, x))(G(X_1, x) − G(X'_1, x)) ≥ 0.

This implies (see Proposition B.1) that

Cov(F(X_1, x), G(X_1, x)) ≥ 0.

If X_1 and X_2 are independent, using Fubini's Theorem and what precedes, we have

E[F(X_1, X_2) G(X_1, X_2)] = ∫_R P_{X_2}(dx) E[F(X_1, x) G(X_1, x)] ≥ ∫_R P_{X_2}(dx) E[F(X_1, x)] E[G(X_1, x)].

By setting φ(x) = E[F(X_1, x)] and ψ(x) = E[G(X_1, x)] and using the monotony assumptions on F and G, we have that φ and ψ are monotonic with the same monotony, so that

∫_R P_{X_2}(dx) E[F(X_1, x)] E[G(X_1, x)] = E[φ(X_2) ψ(X_2)] ≥ E[φ(X_2)] E[ψ(X_2)].

Combining the above two inequalities finally yields Cov(F(X_1, X_2), G(X_1, X_2)) ≥ 0. □

B.2 Case of (one-dimensional) diffusions
This framework corresponds to the infinite-dimensional case and we cannot apply straightforwardly the result of Proposition B.1: indeed, if we define the following natural order relation on D([0,T], R),

∀ α_1, α_2 ∈ D([0,T], R), α_1 ≤ α_2 ⟺ (∀ t ∈ [0,T], α_1(t) ≤ α_2(t)),

this order is partial, which makes the formal proof of Proposition B.1 collapse. To establish a co-monotony principle for diffusions, we proceed in two steps: first, we use the Lamperti transform to "force" the diffusion coefficient to be equal to 1 and we establish the co-monotony principle for this class of diffusions. Then, by the inverse Lamperti transform, we go back to the original process.

In this section, we first present our framework in more detail. Then we recall some weak convergence results for diffusions with diffusion coefficient equal to 1. Afterwards we present the Lamperti transform and we conclude with the general co-monotony principle.

Let I be a nonempty open interval of R. One considers a real-valued Brownian diffusion process

dX_t = b(t, X_t) dt + σ(t, X_t) dW_t, X_0 = x_0 ∈ I, t ∈ [0,T],   (B.27)

where b, σ : [0,T] × I → R are Borel functions with at most linear growth such that Equation (B.27) admits at least one (weak) solution over [0,T], and W is a Brownian motion defined on a probability space (Ω, A, P). We assume that the diffusion X a.s. does not explode and lives in the interval I. This implies assumptions on the functions b and σ, especially in the neighborhood (in I) of the endpoints of I, that we will not detail here. At a finite endpoint of I, these assumptions are strongly connected with the Feller classification, for which we refer to [17] (with σ(t, ·) > 0, t ∈ [0,T]). We will simply make the classical linear growth assumption on b and σ (which prevents explosion in finite time) that will be used for different purposes in what follows.

To "remove" the diffusion coefficient of the diffusion X, we will introduce the so-called Lamperti transform, which requires additional assumptions on the drift b and the diffusion coefficient σ, namely

(A_{b,σ}) ≡ (i) σ ∈ C¹([0,T] × I, R),
(ii) ∀ (t,x) ∈ [0,T] × I, |b(t,x)| ≤ C(1 + |x|) and 0 < σ(t,x) ≤ C(1 + |x|),
(iii) ∀ t ∈ [0,T], ∀ x ∈ I, ∫_{(−∞,x]∩I} dξ/σ(t,ξ) = ∫_{[x,+∞)∩I} dξ/σ(t,ξ) = +∞.   (B.28)

Remark.
Condition (iii) clearly does not depend on x ∈ I. Furthermore, if I = R, (iii) follows from (ii) since 1/σ(t,ξ) ≥ 1/(C(1 + |ξ|)).

Before passing to a short background on the Lamperti transform, which will lead to the new diffusion deduced from (B.27) whose diffusion coefficient is equal to 1, we need to recall (and adapt) some background on solutions and discretization of such SDEs.

B.2.1 Background on diffusions with σ ≡ 1 (weak solution, discretization)

The following proposition gives a condition on the drift for the existence and uniqueness of a weak solution of an SDE when σ ≡ 1.

Proposition B.3.
Consider the stochastic differential equation

dY_t = β(t, Y_t) dt + dW_t, t ∈ [0,T],   (B.29)

where T is a fixed positive number, W is a one-dimensional Brownian motion and β : [0,T] × R → R is a Borel-measurable function satisfying

|β(t,y)| ≤ K(1 + |y|), t ∈ [0,T], y ∈ R, K > 0.

For any probability measure ν on (R, B(R)), Equation (B.29) has a weak solution with initial distribution ν. If, furthermore, the drift term β satisfies one of the following conditions:

(i) β is bounded on [0,T] × R,
(ii) β is continuous and locally Lipschitz in y ∈ R uniformly in t ∈ [0,T],

then this weak solution is unique (in fact (ii) is a strong uniqueness assumption).

Now we introduce the stepwise constant (Brownian) Euler scheme Ȳ^m = (Ȳ_{kT/m})_{0≤k≤m} with step T/m of the process Y = (Y_t)_{t∈[0,T]} defined by (B.29). It is defined by

Ȳ_{t^m_{k+1}} = Ȳ_{t^m_k} + β(t^m_k, Ȳ_{t^m_k}) T/m + √(T/m) U_{k+1}, Ȳ_0 = Y_0 = y_0, k = 0, ..., m−1,   (B.30)

where t^m_k = kT/m, k = 0, ..., m, and (U_k)_{1≤k≤m} denotes a sequence of i.i.d. N(0,1) random variables given by

U_k = √(m/T) (W_{t^m_k} − W_{t^m_{k−1}}), k = 1, ..., m.

The following theorem gives a weak convergence result for the stepwise constant Euler scheme (B.30). Its proof is a straightforward consequence of the functional limit theorems for semimartingales (to be precise, Theorem 3.39, Chap. IX, p. 551 in [15]).
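As an illustration, scheme (B.30) is straightforward to simulate. The sketch below (our own illustration, not part of the paper) generates one path (Ȳ_{kT/m})_{0≤k≤m}; the drift `beta` chosen in the example, β(t,y) = −y, is Lipschitz with linear growth, as required above.

```python
import numpy as np

def euler_scheme(beta, y0, T, m, rng=None):
    """Stepwise constant Euler scheme (B.30) for dY_t = beta(t, Y_t) dt + dW_t.

    Returns the m+1 values (Y_bar_{kT/m})_{0 <= k <= m}.
    """
    rng = np.random.default_rng() if rng is None else rng
    h = T / m                                  # step T/m
    y = np.empty(m + 1)
    y[0] = y0
    for k in range(m):
        u = rng.standard_normal()              # U_{k+1} i.i.d. N(0, 1)
        y[k + 1] = y[k] + beta(k * h, y[k]) * h + np.sqrt(h) * u
    return y

# Illustrative drift (our choice): beta(t, y) = -y
path = euler_scheme(lambda t, y: -y, y0=1.0, T=1.0, m=100)
```

Note that the scheme only uses the Brownian increments over each step, which is what makes the transfer argument of Lemma B.1 and Theorem B.2 below possible.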
Theorem B.1.
Let β : [0,T] × R → R be a continuous function satisfying

∃ K > 0, |β(t,y)| ≤ K(1 + |y|), t ∈ [0,T], y ∈ R.

Assume that the weak solution of Equation (B.29) is unique. Then the stepwise constant Euler scheme of (B.29) with step T/m satisfies

Ȳ^m → Y in law for the Skorokhod topology as m → ∞.

In particular, for every functional F : D([0,T], R) → R, P_Y-a.s. continuous at every α ∈ C([0,T], R) and with polynomial growth, we have

E F(Ȳ^m) → E F(Y) as m → ∞

(by uniform integrability, since sup_{t∈[0,T]} |Ȳ^m_t| ∈ ∩_{p>0} L^p).

B.2.2 Background on the Lamperti transform

We will introduce a new diffusion Y_t := L(t, X_t) which will satisfy a new SDE whose diffusion coefficient is constant equal to 1. This function L defined on [0,T] × I is known in the literature as the Lamperti transform. It is defined for every (t,x) ∈ [0,T] × I by

L(t,x) := ∫_{x_0}^{x} dξ/σ(t,ξ),   (B.31)

where x_0 is an arbitrary fixed value lying in I. The Lamperti transform clearly depends on the choice of x_0 in I, but not its properties of interest. First, under (A_{b,σ})-(i)-(ii), L ∈ C^{1,2}([0,T] × I) with

∂L/∂t (t,x) = −∫_{x_0}^{x} (1/σ²(t,ξ)) ∂σ/∂t (t,ξ) dξ, ∂L/∂x (t,x) = 1/σ(t,x) > 0, ∂²L/∂x² (t,x) = −(1/σ²(t,x)) ∂σ/∂x (t,x).

For every t ∈ [0,T], L(t,·) is an increasing C¹-diffeomorphism from I onto R = L(t,I) (the last claim follows from (A_{b,σ})-(iii)). Its inverse will be denoted L^{-1}(t,·). Notice that (t,y) ↦ L^{-1}(t,y) is continuous on [0,T] × R since, for every c ∈ I, both sets

{(t,y) ∈ [0,T] × R : L^{-1}(t,y) ≤ c} = {(t,y) ∈ [0,T] × R : y ≤ L(t,c)} and {(t,y) ∈ [0,T] × R : L^{-1}(t,y) ≥ c} = {(t,y) ∈ [0,T] × R : y ≥ L(t,c)}

are closed.
Therefore, if (A_{b,σ}) holds, the function β : [0,T] × R → R defined by

β(t,y) := (b/σ − ∫_{x_0}^{·} (1/σ²(t,ξ)) ∂σ/∂t (t,ξ) dξ − (1/2) ∂σ/∂x)(t, L^{-1}(t,y))   (B.32)

is a Borel function, continuous as soon as b is. Now, we set

∀ t ∈ [0,T], Y_t := L(t, X_t).

Itô's formula straightforwardly yields

dY_t = β(t, Y_t) dt + dW_t, Y_0 = L(0, x_0) =: y_0 ∈ R.   (B.33)

Remarks. • In the homogeneous case, which is the most important one for our applications,

dX_t = b(X_t) dt + σ(X_t) dW_t, X_0 = x_0 ∈ R, t ∈ [0,T],   (B.34)

we have L(t,x) = L(x) := ∫_{x_0}^{x} dξ/σ(ξ). Then by setting Y_t := L(X_t), we obtain

dY_t = β(Y_t) dt + dW_t, Y_0 = L(x_0) =: y_0, with β := (b/σ − σ'/2) ∘ L^{-1}.

Note that β is bounded as soon as b/σ − σ'/2 is.

• If the partial derivative b'_x exists on [0,T] × I, one easily checks, using (L^{-1})'_y(t,y) = σ(t, L^{-1}(t,y)), that for every (t,y) ∈ [0,T] × R,

β'_y(t,y) = (b'_x − b σ'_x/σ + σ'_t/σ − σ σ''_x/2)(t, L^{-1}(t,y)).   (B.35)

As a consequence, as soon as the function b'_x − b σ'_x/σ + σ'_t/σ − σ σ''_x/2 is bounded on [0,T] × I, then β satisfies the linear growth/Lipschitz assumption; if moreover

b'_x − b σ'_x/σ + σ'_t/σ − σ σ''_x/2 ≥ 0,   (B.36)

then β is non-decreasing.

Definition B.1.
The functional Lamperti transform, denoted Λ, is the functional from C([0,T], I) to C([0,T], R) defined by

∀ α ∈ C([0,T], I), Λ(α) = L(·, α(·)).

Proposition B.4.
If the diffusion coefficient σ satisfies (A_{b,σ}), the functional Lamperti transform is a homeomorphism from C([0,T], I) onto C([0,T], R).

Proof.
Let α ∈ C([0,T], I). Since σ is bounded away from 0 on the compact set [0,T] × α([0,T]), standard arguments based on the Lebesgue dominated convergence theorem imply that Λ(α) ∈ C([0,T], R). Conversely, as L(t,·) : I → R is a homeomorphism for every t ∈ [0,T], Λ admits an inverse defined by

∀ ξ ∈ C([0,T], R), Λ^{-1}(ξ) := (t ↦ L^{-1}(t, ξ(t))) ∈ C([0,T], I).

Let U_K denote the topology of convergence on compact sets of I on C([0,T], I).

⊲ U_K-continuity of Λ: if α_n → α_∞ for U_K, the set K = [0,T] × ∪_{n∈N∪{∞}} α_n([0,T]) is a compact set of [0,T] × I. Hence σ is bounded away from 0 on K so that, K_σ denoting a bound for 1/σ on K,

∀ t ∈ [0,T], |L(t, α_n(t)) − L(t, α_∞(t))| ≤ K_σ |α_n(t) − α_∞(t)|, i.e. ‖Λ(α_n) − Λ(α_∞)‖_∞ ≤ K_σ ‖α_n − α_∞‖_∞.

⊲ U_K-continuity of Λ^{-1}: by (A_{b,σ})-(ii), we have, for fixed t ∈ [0,T],

∀ x, x' ∈ I, |L(t,x) − L(t,x')| ≥ (1/C) ∫_{x∧x'}^{x∨x'} dξ/(1+|ξ|) = (1/C) |Φ(x) − Φ(x')|,

where Φ(z) = sign(z) log(1+|z|). Thus,

∀ y, y' ∈ R, |Φ(L^{-1}(t,y)) − Φ(L^{-1}(t,y'))| ≤ C |y − y'|.

Let (ξ_n)_{n≥1} be a sequence of functions of D([0,T], R) such that ξ_n → ξ_∞ ∈ C([0,T], R) uniformly. Then, for every t ∈ [0,T] and n ≥ 1,

|Φ(L^{-1}(t, ξ_n(t))) − Φ(L^{-1}(t, 0))| ≤ C |ξ_n(t)| ≤ C(‖ξ_n − ξ_∞‖_∞ + ‖ξ_∞‖_∞) + |Φ(x_0)| ≤ C',

since L^{-1}(t, 0) = x_0. Consequently, for every t ∈ [0,T] and every n ≥ 1, L^{-1}(t, ξ_n(t)) ∈ K' := Φ^{-1}([−C', C']). The set K' is compact (because the function Φ is continuous and proper (lim_{|z|→∞} |Φ(z)| = +∞)). As inf_{K'} Φ' > 0, we deduce that there exists η > 0 such that

∀ x, y ∈ K', |Φ(x) − Φ(y)| ≥ η |x − y|, i.e. ∀ t ∈ [0,T], ∀ u, v with L^{-1}(t,u), L^{-1}(t,v) ∈ K', |L^{-1}(t,u) − L^{-1}(t,v)| ≤ C'' |u − v|, C'' > 0.

Hence, one concludes that ‖Λ^{-1}(ξ_n) − Λ^{-1}(ξ_∞)‖_∞ ≤ C'' ‖ξ_n − ξ_∞‖_∞. □

B.2.3 Functional co-monotony principle for diffusions

Definition B.2. The diffusion process (B.27) is admissible if (A_{b,σ}) holds and

(i) for every starting value x_0 ∈ I, (B.27) has a unique weak solution which lives in I up to t = +∞ (see Proposition B.3 for a criterion),

(ii) the function β defined by

β(t,y) := (b/σ − ∫_{x_0}^{·} (1/σ²(t,ξ)) ∂σ/∂t (t,ξ) dξ − (1/2) ∂σ/∂x)(t, L^{-1}(t,y))

is continuous on [0,T] × R, non-decreasing in y for every t ∈ [0,T] or Lipschitz in y uniformly in t ∈ [0,T], and satisfies

∃ K > 0 such that |β(t,y)| ≤ K(1 + |y|), t ∈ [0,T], y ∈ R.

Definition B.3.
Let F : D([0,T], R) → R be a functional.

(i) The functional F is non-decreasing (resp. non-increasing) on D([0,T], R) if

∀ α_1, α_2 ∈ D([0,T], R), (∀ t ∈ [0,T], α_1(t) ≤ α_2(t)) ⇒ F(α_1) ≤ F(α_2) (resp. F(α_1) ≥ F(α_2)).

(ii) The functional F is continuous at α ∈ C([0,T], R) if

∀ α_m ∈ D([0,T], R), α_m → α ∈ C([0,T], R) for U ⇒ F(α_m) → F(α),

where U denotes the uniform convergence of functions on [0,T]. The functional F is C-continuous if it is continuous at every α ∈ C([0,T], R).

(iii) The functional F has polynomial growth if there exists a positive real number r > 0 such that

∀ α ∈ D([0,T], R), |F(α)| ≤ K(1 + ‖α‖^r_∞).   (B.37)

Remark.
Any C-continuous functional in the above sense is in particular P_Z-a.s. continuous for every process Z with continuous paths.

Definition B.4.
A process (X_t)_{t∈[0,T]} with continuous (resp. càdlàg stepwise constant) paths defined on (Ω, A, P) satisfies a functional co-monotony principle if, for every pair of C-continuous functionals (resp. measurable functionals on D([0,T], R)) F, G, monotonic with the same monotony, satisfying (B.37) and such that F(X), G(X) and F(X)G(X) ∈ L¹, we have

Cov(F((X_t)_{t∈[0,T]}), G((X_t)_{t∈[0,T]})) ≥ 0.

The main result of this section is the following:
Theorem B.2.
Assume that the real-valued diffusion process (B.27) is admissible (see Definition B.2). Then it satisfies a co-monotony principle.
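Theorem B.2 can be checked empirically. The following sketch (our own illustration, not part of the paper) estimates the covariance for the simplest admissible diffusion, the Bachelier case X_t = W_t, with two non-decreasing, C-continuous functionals of polynomial growth, F(α) = sup_t α(t) and G(α) = α(T); up to Monte Carlo and discretization error the estimate should be nonnegative (for Brownian motion on [0,T], this covariance is in fact T/2).

```python
import numpy as np

# Simulate n_paths discretized Brownian paths on [0, T] (Bachelier case X = W).
rng = np.random.default_rng(42)
n_paths, m, T = 20_000, 200, 1.0
increments = np.sqrt(T / m) * rng.standard_normal((n_paths, m))
W = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(increments, axis=1)], axis=1)

# Two non-decreasing, C-continuous functionals with polynomial growth:
F = W.max(axis=1)   # F(alpha) = sup_t alpha(t): running maximum of each path
G = W[:, -1]        # G(alpha) = alpha(T): terminal value of each path

cov_FG = np.cov(F, G)[0, 1]
print(cov_FG >= 0.0)   # expected: True (up to Monte Carlo error)
```

Replacing `F` or `G` by an anti-monotone functional (e.g. G(α) = −α(T)) flips the sign of the estimated covariance, in line with the anti-monotone version of the principle mentioned at the end of this appendix.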
Corollary B.1.
Assume that the real-valued diffusion process (B.27) is admissible (see Definition B.2).

(a) Let (X̄_{t^m_k})_{0≤k≤m} be its stepwise constant Euler scheme with step T/m (t^m_k = kT/m, 0 ≤ k ≤ m). Then (X̄_{t^m_k})_{0≤k≤m} satisfies a co-monotony principle.

(b) Let (X̃_{t_k})_{0≤k≤m} be a sample of discrete time observations of (X_t)_{t∈[0,T]} for a subdivision (t_k)_{0≤k≤m} of [0,T] (t_0 < ··· < t_m = T). Then (X̃_{t_k})_{0≤k≤m} satisfies a co-monotony principle.

Remark. The proof of Corollary B.1 is contained in the proof of Theorem B.2. The only difference is that we do not need to transfer the co-monotony principle from the Euler scheme to the diffusion process.

Before passing to the proof of Theorem B.2, we need two lemmas: one is a key step to transfer co-monotony from the Euler scheme to the diffusion process, the other aims at transferring the uniqueness property for weak solutions.
Lemma B.1.
For every α ∈ D([0,T], R), set

α^{(m)} = Σ_{k=0}^{m−1} α(t^m_k) 1_{[t^m_k, t^m_{k+1})} + α(T) 1_{{T}}, m ≥ 1,   (B.38)

with t^m_k := kT/m, k = 0, ..., m. Then α^{(m)} → α uniformly as m → ∞.

If F : D([0,T], R) → R is C-continuous and non-decreasing (resp. non-increasing), then the unique function F_m : R^{m+1} → R satisfying F(α^{(m)}) = F_m(α(t^m_k), k = 0, ..., m) is continuous and non-decreasing (resp. non-increasing) in each of its variables. Furthermore, if F satisfies a polynomial growth assumption of the form

∀ α ∈ D([0,T], R), |F(α)| ≤ C(1 + ‖α‖^r_∞),

then, for every m ≥ 1,

|F_m(x_0, ..., x_m)| ≤ C(1 + max_{0≤k≤m} |x_k|^r),

with the same real constant C > 0.

Lemma B.2.
Let (S, d), (T, δ) be two Polish spaces and let Φ : S → T be a continuous injective function. Let µ and µ' be two probability measures on (S, Bor(S)). If µ ∘ Φ^{-1} = µ' ∘ Φ^{-1}, then µ = µ'.

Proof of Lemma B.2.
Since S is Polish, µ is inner regular: for every Borel set A of S, µ(A) = sup{µ(K), K ⊂ A, K compact}, and likewise for µ'. Now let K be a compact subset of S. The set Φ(K) is a compact subset of T because Φ is continuous, so Φ^{-1}(Φ(K)) is a Borel set of S which contains K; as Φ is injective, Φ^{-1}(Φ(K)) = K. Therefore

µ(K) = µ ∘ Φ^{-1}(Φ(K)) = µ' ∘ Φ^{-1}(Φ(K)) = µ'(K).

By inner regularity, we deduce that µ = µ'. □

Proof of Theorem B.2.
First we consider the Lamperti transform (Y_t)_{t∈[0,T]} (see (B.31)) of the diffusion X solution to (B.27) with X_0 = x_0 ∈ I. Using the homeomorphism property of Λ and calling upon the above Lemma B.2 with Λ^{-1} and Λ, we see that existence and uniqueness assumptions on Equation (B.27) can be transferred to (B.33), since Λ is a one-to-one mapping between the solutions of these two SDEs.

To fulfill condition (ii) in Definition B.2, we need to introduce the smallest integer, denoted m_{b,σ}, such that y ↦ y + (T/m_{b,σ}) β(t,y) is non-decreasing in y for every t ∈ [0,T]. Its existence follows from (A_{b,σ})-(ii). Note that if β is non-decreasing in y for every t ∈ [0,T], then m_{b,σ} = 1. Then we introduce the stepwise constant (Brownian) Euler scheme Ȳ^m = (Ȳ_{kT/m})_{0≤k≤m} with step T/m (defined by (B.30)) of Y = (Y_t)_{t∈[0,T]}, with m ≥ m_{b,σ}. It is clear by induction on k that there exists for every k ∈ {0, ..., m} a function Θ_k : R^{k+1} → R such that

Ȳ_{t^m_k} = Θ_k(y_0, ΔW_{t^m_1}, ..., ΔW_{t^m_k}),

where, for (y_0, z_1, ..., z_k) ∈ R^{k+1},

Θ_k(y_0, z_1, ..., z_k) = Θ_{k−1}(y_0, z_1, ..., z_{k−1}) + β(t^m_{k−1}, Θ_{k−1}(y_0, z_1, ..., z_{k−1})) T/m + z_k = (id + β(t^m_{k−1}, ·) T/m) ∘ Θ_{k−1}(y_0, z_1, ..., z_{k−1}) + z_k.

Thus, for every i ∈ {1, ..., k}, z_i ↦ Θ_k(y_0, z_1, ..., z_i, ..., z_k) is non-decreasing because y ↦ y + β(t^m_{k−1}, y) T/m is non-decreasing for m large enough, say m ≥ m_{b,σ}. We deduce that if F_m : R^{m+1} → R is non-decreasing in each of its variables, then, for every i ∈ {1, ..., k},

z_i ↦ F_m(y_0, Θ_1(y_0, z_1), ..., Θ_m(y_0, z_1, ..., z_m)) is non-decreasing.

By the same reasoning, we deduce that for G_m : R^{m+1} → R, non-increasing in each of its variables, we have, for every i ∈ {1, ..., k},

z_i ↦ G_m(y_0, Θ_1(y_0, z_1), ..., Θ_m(y_0, z_1, ..., z_m)) is non-increasing.
Let F_m and G_m be the functions defined on R^{m+1} associated to F and G respectively by Lemma B.1. As β has linear growth, Y and its Euler scheme have polynomial moments at any order p > 0. Then we can apply Proposition B.2 to deduce that

E[FG(Ȳ^m)] = E[F_m((Ȳ_{kT/m})_{0≤k≤m}) G_m((Ȳ_{kT/m})_{0≤k≤m})] ≥ E[F_m((Ȳ_{kT/m})_{0≤k≤m})] E[G_m((Ȳ_{kT/m})_{0≤k≤m})] = E[F(Ȳ^m)] E[G(Ȳ^m)].

Note that if F and G are C-continuous with polynomial growth, so is FG. We derive from Theorem B.1 that

E[FG(Ȳ^m)] → E[FG(Y)], E[F(Ȳ^m)] → E[F(Y)], E[G(Ȳ^m)] → E[G(Y)] as m → ∞,

therefore Cov(F(Y), G(Y)) ≥ 0.

To conclude the proof, we need to go back to the process X by using the inverse Lamperti transform. Indeed, for every t ∈ [0,T], X_t = L^{-1}(t, Y_t), where Y satisfies (B.33). Let F : D([0,T], R) → R be C-continuous. Set

∀ α ∈ C([0,T], R), F̃(α) := F((L^{-1}(t, α_t))_{t∈[0,T]}),

and define G̃ likewise. Assume first that F and G are bounded. The functional F̃ is C-continuous owing to Proposition B.4, non-decreasing (resp. non-increasing) since L^{-1}(t,·) is for every t ∈ [0,T], and is bounded. Consequently,

Cov(F(X), G(X)) = Cov(F̃(Y), G̃(Y)) ≥ 0.

To conclude, we approximate F and G in a robust way with respect to the "constraints" by a canonical truncation procedure, say

F_M := max(−M, min(F, M)), M ∈ N.

If F and G have polynomial growth, it is clear that Cov(F_M(X), G_M(X)) → Cov(F(X), G(X)) as M → ∞. □

Examples of admissible diffusions.

• The Bachelier model: this simply means that X_t = µt + σW_t, σ > 0, which clearly fulfills the assumptions of Theorem B.2.

• The Black-Scholes model: the diffusion process X is a geometric Brownian motion, solution to the SDE

dX_t = rX_t dt + ϑX_t dW_t, X_0 = x_0 > 0,

where r ∈ R and ϑ > 0. Here I = (0, +∞) and β(y) = r/ϑ − ϑ/2 is constant. One checks that L(x) = (1/ϑ) log(x/x_0), where x_0 ∈ (0, +∞) is fixed.

• The Hull-White model: it is an elementary improvement of the Black-Scholes model where ϑ : [0,T] → (0, +∞) is a deterministic positive function, i.e. the diffusion process X is a geometric Brownian motion solution to the SDE

dX_t = rX_t dt + ϑ(t)X_t dW_t, X_0 = x_0 > 0.

Then, elementary stochastic calculus shows that

X_t = x_0 e^{rt − (1/2)∫_0^t ϑ²(s) ds + ∫_0^t ϑ(s) dW_s} = x_0 e^{rt − (1/2)∫_0^t ϑ²(s) ds + B_{∫_0^t ϑ²(s) ds}},

where (B_u)_{u≥0} is a standard Brownian motion (the second equality follows from the Dambis-Dubins-Schwarz theorem). Consequently X_t = φ(t, B_{∫_0^t ϑ²(s) ds}), where the functional ξ ↦ (t ↦ φ(t, ξ(∫_0^t ϑ²(s) ds))), defined on D([0, T_ϑ], R) with T_ϑ = ∫_0^T ϑ²(t) dt, is C-continuous. Hence, for any C-continuous functional F on D([0,T], R), the functional F̃ defined by

F̃(ξ) = F((t ↦ φ(t, ξ(∫_0^t ϑ²(s) ds))))

is C-continuous on D([0, T_ϑ], R). Then one can transfer the co-monotony property from B to X.

• Local volatility model (elliptic case): more generally, the approach applies, still with I = (0, +∞), to some usual extensions like models with local volatility

dX_t = rX_t dt + ϑ(X_t)X_t dW_t, X_0 = x_0 > 0,

where ϑ : R → (ϑ_0, +∞), ϑ_0 > 0, is a bounded, twice differentiable function satisfying |ϑ'(x)| ≤ C/|x| and |ϑ''(x)| ≤ C/|x|², x ∈ (0, +∞). In this case I = (0, +∞) and, x_0 ∈ I being fixed, one has for every x ∈ I,

L(x) = ∫_{x_0}^{x} dξ/(ξϑ(ξ)),

which clearly defines an increasing homeomorphism from I onto R since ϑ is bounded. Furthermore, one easily derives from the explicit form (B.35) and the condition (B.36) that β is Lipschitz as soon as the function

x ↦ r x ϑ'(x)/ϑ(x) + x²ϑ(x)ϑ''(x)/2 + xϑ(x)ϑ'(x)

is bounded on (0, +∞), which easily follows from the assumptions made on ϑ.

Extension to other classes of diffusions and models. This general approach does not embody all situations: thus the true CEV model does not fulfill the above assumptions. The
CEV model is a diffusion process X following the SDE

dX_t = rX_t dt + ϑX_t^α dW_t, X_0 = x_0,

where ϑ > 0 and 0 < α < 1. The CEV model, for which I = (0, +∞), does not fulfill (A_{b,σ})-(iii). As a consequence, L(t, I) ≠ R is an open interval (depending on the choice of x_0). To be precise, if x_0 ∈ (0, +∞) is fixed,

L(x) = (1/(ϑ(1−α))) (x^{1−α} − x_0^{1−α}), x ∈ (0, +∞),

so that, if we set

J_{x_0} := L(I) = (−x_0^{1−α}/(ϑ(1−α)), +∞),

L defines a homeomorphism from I = (0, +∞) onto J_{x_0}. Finally, the function β defined by

β(y) = (r/ϑ)(ϑ(1−α)y + x_0^{1−α}) − (αϑ/2)/(ϑ(1−α)y + x_0^{1−α}), y ∈ J_{x_0},

is non-decreasing with linear growth at +∞. Now, tracing the lines of the above proof, in particular establishing weak existence and uniqueness of the solution of the SDE (B.29) in that setting, leads to the same positive conclusion concerning the covariance inequalities for co-monotonic or anti-monotonic functionals.

References

[1] F. Abergel and A. Jedidi. A mathematical approach to order book modelling. In
Proceedings of the 5th Kolkata Econophysics Conference, Springer (in press).
[2] A. Alfonsi, A. Fruth, and A. Schied. Optimal execution strategies in limit order books with general shape functions. Quantitative Finance, 10(2):143–157, 2010.
[3] R. F. Almgren and N. Chriss. Optimal execution of portfolio transactions. Journal of Risk, 3(2):5–39, 2000.
[4] M. Avellaneda and S. Stoikov. High-frequency trading in a limit order book. Quantitative Finance, 8(3):217–224, 2008.
[5] E. Bayraktar and M. Ludkovski. Liquidation in limit order books with controlled intensity. CoRR, 2011.
[6] A. Beskos and G. O. Roberts. Exact simulation of diffusions. Ann. Appl. Prob., 15(4):2422–2444, 2005.
[7] B. Bouchard, N.-M. Dang, and C.-A. Lehalle. Optimal control of trading algorithms: a general impulse control approach. SIAM J. Financial Math., 2:404–438, 2011.
[8] M. Duflo. Algorithmes stochastiques, volume 23 of Mathématiques & Applications (Berlin). Springer-Verlag, Berlin, 1996.
[9] T. Foucault, O. Kadan, and E. Kandel. Limit order book as a market for liquidity. Discussion Paper Series dp321, Center for Rationality and Interactive Decision Theory, Hebrew University, Jerusalem, January 2003.
[10] O. Guéant, J. Fernandez-Tapia, and C.-A. Lehalle. Dealing with the inventory risk. Technical report, 2011.
[11] O. Guéant, C.-A. Lehalle, and J. Razafinimanana. High frequency simulations of an order book: a two-scales approach. In F. Abergel, B. K. Chakrabarti, A. Chakraborti, and M. Mitra, editors, Econophysics of Order-Driven Markets, New Economic Windows. Springer, 2010.
[12] F. Guilbaud, M. Mnif, and H. Pham. Numerical methods for an optimal order execution problem. To appear in Journal of Computational Finance, 2010.
[13] F. Guilbaud and H. Pham. Optimal high-frequency trading with limit and market orders. 2012.
[14] T. Ho and H. R. Stoll. Optimal dealer pricing under transactions and return uncertainty. Journal of Financial Economics, 9(1):47–73, 1981.
[15] J. Jacod and A. N. Shiryaev. Limit Theorems for Stochastic Processes, volume 288 of Grundlehren der Mathematischen Wissenschaften. Springer-Verlag, Berlin, second edition, 2003.
[16] I. Karatzas and S. E. Shreve. Brownian Motion and Stochastic Calculus, volume 113 of Graduate Texts in Mathematics. Springer-Verlag, New York, second edition, 1991.
[17] S. Karlin and H. M. Taylor. A Second Course in Stochastic Processes. Academic Press, New York, 1981.
[18] H. J. Kushner and D. S. Clark. Stochastic Approximation Methods for Constrained and Unconstrained Systems, volume 26 of Applied Mathematical Sciences. Springer-Verlag, New York, 1978.
[19] H. J. Kushner and G. G. Yin. Stochastic Approximation and Recursive Algorithms and Applications, volume 35 of Applications of Mathematics (New York). Springer-Verlag, New York, second edition, 2003. Stochastic Modelling and Applied Probability.
[20] S. Laruelle, C.-A. Lehalle, and G. Pagès. Optimal split of orders across liquidity pools: a stochastic algorithm approach. SIAM J. Financial Math., 2(1):1042–1076, 2011.
[21] S. Laruelle and G. Pagès. Stochastic approximation with averaging innovation applied to finance. Monte Carlo Methods Appl., 18(1):1–51, 2012.
[22] J. McCulloch. A model of true spreads on limit order markets. 2011. Available at SSRN: http://ssrn.com/abstract=1815782.
[23] G. Pagès. A functional co-monotony principle with an application to peacocks. 2012. To appear in Séminaire de Probabilités.
[24] S. Predoiu, G. Shaikhet, and S. Shreve. Optimal execution of a general one-sided limit-order book. Technical report, Carnegie Mellon University, September 2010.
[25] C. Y. Robert and M. Rosenbaum. A new approach for the dynamics of ultra high frequency data: the model with uncertainty zones. Journal of Financial Econometrics, 9(2):344–366, 2011.