[PDF] A convex duality method for optimal liquidation with participation constraints

Abstract

In spite of the growing consideration for optimal execution in the financial mathematics literature, numerical approximations of optimal trading curves are almost never discussed. In this article, we present a numerical method to approximate the optimal strategy of a trader willing to unwind a large portfolio. The method we propose is very general as it can be applied to multi-asset portfolios with any form of execution costs, including a bid-ask spread component, even when participation constraints are imposed. Our method, based on convex duality, only requires Hamiltonian functions to have C 1,1 regularity while classical methods require additional regularity and cannot be applied to all cases found in practice.

Full PDF

aa r X i v : . [ q -f i n . T R ] D ec A convex duality method for optimal liquidation withparticipation constraints ∗ Olivier Guéant † , Jean-Michel Lasry ‡ , Jiang Pu § Abstract

In spite of the growing consideration for optimal execution in the ﬁnancial mathematicsliterature, numerical approximations of optimal trading curves are almost never discussed.In this article, we present a numerical method to approximate the optimal strategy of atrader willing to unwind a large portfolio. The method we propose is very general as it canbe applied to multi-asset portfolios with any form of execution costs, including a bid-askspread component, even when participation constraints are imposed. Our method, based onconvex duality, only requires Hamiltonian functions to have C , regularity while classicalmethods require additional regularity and cannot be applied to all cases found in practice. When he is willing to unwind a large portfolio, a trader faces a trade-oﬀ between market riskon the one hand and market impact and execution costs on the other hand. Selling rapidlya large quantity of shares is indeed costly as it requires to take liquidity from limit orderbooks. Selling slowly however exposes to price risk because of other market participants’actions. If a trader makes the decision to unwind his portfolio at a given time, his decisionis certainly based on market prices at that time. He needs therefore to sell fast enough sothat prices do not move too much over the course of the execution process.Optimal execution has been an important topic in the academic literature for around 15years. The above trade-oﬀ between execution costs and market impact on the one handand price risk on the other hand has indeed been modeled in the seminal papers [2, 3]of Almgren and Chriss published in 1999 and 2001. Almgren and Chriss proposed a sim-ple framework to solve the problem of optimally scheduling the execution process and this ∗ This research has been conducted with the support of the Research Initiative “Modélisation des marchésﬁnanciers à haute fréquence” (formerly “Exécution optimale et statistiques de la liquidité haute fréquence”)under the aegis of the Europlace Institute of Finance. This paper has been improved following the reportsof two anonymous referees who really need to be thanked. † Université Paris-Diderot, UFR de Mathématiques, Laboratoire Jacques-Louis Lions. Avenue de France,75013 Paris. ‡ Université Paris-Dauphine, Ceremade. Place du Maréchal de Lattre de Tassigny, 75116 Paris. § Institut Europlace de Finance. 28 Place de la Bourse, 75002 Paris. à la

Almgren-Chriss are sim-pler and permit to sum up the inﬂuence of the diﬀerent venues (associated with diﬀerentmicrostructures) into one or two functions.In most of the papers in the literature, the optimal strategy does not depend on the evo-lution of prices (see in particular [30]). The outcome of models à la

Almgren-Chriss isindeed an optimal trading curve, stating, before it starts, the optimal time scheduling ofthe execution process. This trading curve constitutes the ﬁrst (strategic) layer of mostexecution algorithms. An execution algorithm is indeed usually made of two layers: astrategic layer, which controls the risk with respect to a benchmark and mitigates executioncosts; and a tactical layer, which seeks liquidity inside order books, through all types oforders, and across all other liquidity pools (lit or dark). The second layer has been studiedin the literature more recently through new models involving limit orders (see for instance[7, 13, 14, 21]) or through the study of liquidation with dark pools (see [18, 19] and [22]).In this article, we focus on the ﬁrst layer of execution algorithms for a multi-asset portfo-lio. We consider a Von Neumann-Morgenstern expected utility framework in the case of See [24] for a general description of execution algorithms. See also [8] for another viewpoint. The case of a multi-asset portfolio is also interesting for one-asset portfolio liquidation as it enables toconsider a round trip on an additional asset for hedging.

2n investor with constant absolute risk aversion. We consider a general form of executioncosts and the optimal strategy is characterized by a Hamiltonian system as in [16] and[30]. Numerical approximations of the optimal strategy are brieﬂy discussed in [16] but themethod presented in [16] is limited to a small class of Hamiltonian functions (strictly convexfunctions with C regularity). The method we present in this paper is more general as itonly requires Hamiltonian functions to have C , regularity. It enables to solve numericallythe problem of the optimal strategy to unwind a multi-asset portfolio when bid-ask spreadsare taken into account and when one adds participation rate constraints (an upper boundto the volume that can be traded, relative to market volume).Numerical methods are very rarely discussed in the literature (one important exception is[20]). In fact, the problem is rather simple in the one-asset case as the bid-ask spread com-ponent plays no role and as most papers do not consider participation constraints. In theone-asset case, a shooting method can indeed be used to obtain a precise numerical approxi-mation of the optimal trading curve. In the multi-asset case, however, shooting methods arenot relevant anymore. Newton’s methods to ﬁnd a solution to the Hamiltonian equations orto the Euler-Lagrange equations associated to Almgren-Chriss-like models are not possibleeither, except in the case of smooth functions... too smooth to embed a bid-ask spreadcomponent or participation constraints. Considering the Bellman equations associated tothe optimal control is always possible, at least theoretically, but it has serious drawbacksin terms of computation time and numerical precision. The method we present is basedon convex duality and allows to consider all conceivable practical cases as it only requiresHamiltonian functions to be C , .We present the general framework in continuous time in Section 2 and we state the classicalexistence and uniqueness results for the optimal liquidation strategy. We also provide theHamiltonian equations that characterize this optimal liquidation strategy. These equations,and their discrete counterparts, play a major role in our paper. In Section 3, we presentour numerical method (based on convex duality) to approximate the optimal liquidationstrategy, and we prove a convergence result. In Section 4, we present numerical examples.Proofs are presented in Appendix A. Appendix B is dedicated to the discrete counterpartof the continuous-time model presented in Section 2. We consider a portfolio of d diﬀerent stocks with initial quantities q = ( q , . . . , q d ) . Weconsider the problem of unwinding this portfolio over the time window [0 , T ] (the timehorizon being usually a few minutes to a few hours). We consider here a variant of the classical models in continuous time used in [16] or [30]. The discretecounterpart of our model is presented in Appendix B.

3e consider a probability space (Ω , P ) equipped with a ﬁltration ( F t ) t ∈ [0 ,T ] satisfying theusual conditions. We assume that all stochastic processes are deﬁned on (Ω , ( F t ) t ∈ [0 ,T ] , P ) .We also introduce the set P (0 , T ) of progressively measurable processes deﬁned on [0 , T ] .Market volume processes are denoted by ( V t , . . . , V dt ) t ∈ [0 ,T ] . They are assumed to be deter-ministic ( F − measurable), positive and bounded. For each stock, we consider a maximumparticipation rate and we denote them by ρ m , . . . , ρ dm ∈ R ∗ + . This allows to deﬁne the setof admissible liquidation strategies: A = ( ( v t , . . . , v dt ) t ∈ [0 ,T ] ∈ P (0 , T ) , ∀ i, | v it | ≤ ρ im V it a.e. on [0 , T ] × Ω , ∀ i, Z T v it dt = q i a.s. ) . For a liquidation strategy ( v t , . . . , v dt ) t ∈ [0 ,T ] ∈ A , representing the velocity at which thetrader sells his shares, we denote by ( q t , . . . , q dt ) t ∈ [0 ,T ] the process giving the state of theportfolio. It veriﬁes: ∀ i, q it = q i − Z t v is ds. Remark 1.

In other words, our hypotheses on the strategies ( v , . . . , v d ) are simply thatone cannot trade too quickly, relatively to market volume, and that we indeed liquidate theportfolio by time T . Remark 2.

We always assume that liquidation is feasible in the sense that ∀ i, | q i | ≤ ρ im Z T V it dt. For each stock, we consider Brownian dynamics for the price: ∀ i, dS it = σ i dW it σ i > , and we assume that the d -dimensional Brownian motion ( σ W , . . . , σ d W d ) has a covari-ance matrix Σ that is not singular. Remark 3.

Considering Brownian dynamics for prices instead of Black-Scholes dynamicshas almost no inﬂuence in terms of modelling, as we consider time horizons of at most afew hours. From a mathematical point of view however, the former is more practical thanthe latter.

Remark 4.

The case of non-constant volatility processes can easily be considered in ourmodel, as soon as these processes remain deterministic and positive. For instance one mayrelate the volatility processes to the market volume processes. We do not consider permanent market impact as it plays no role in liquidation strategies – see [11] and[15]. t for his trades on stock i is not S it because of exe-cution costs. To model these execution costs, we introduce d functions L , . . . , L d verifyingthe following hypotheses: • ∀ i, L i (0) = 0 , • ∀ i, L i is an even function, • ∀ i, L i is increasing on R + , • ∀ i, L i is strictly convex.Now, for ( v , . . . , v d ) ∈ A , we deﬁne the cash process ( X t ) t ∈ [0 ,T ] by: X t = Z t d X i =1 (cid:18) v is S is − V is L i (cid:18) v is V is (cid:19)(cid:19) ds. Remark 5.

In practice, we need to cover the cases L i ( ρ ) = η i | ρ | φ i + ψ i | ρ | for η i > , ψ i ≥ and φ i ∈ (0 , . The proportional part of the function corresponds to bid-askspread, stamp duty and/or ﬁnancial transaction tax – we generally call it the bid-ask spreadcomponent. The superlinear part is the classical execution cost component of all models àla Almgren-Chriss.

Our objective function for ( v , . . . , v d ) ∈ A is J ( v , . . . , v d ) = E [ − exp( − γX T )] , where γ > is the absolute risk aversion parameter of the trader.In other words, our problem is to ﬁnd: ( v ∗ , . . . , v d ∗ ) ∈ argmax ( v ,...,v d ) ∈A J ( v , . . . , v d ) . Adapting the results obtained in [16] and [30], we can show that there exists such an optimalliquidation strategy. We can also show that stochastic strategies cannot do better thandeterministic ones. These results, along with a characterization of the unique deterministicoptimal strategy, are exhibited in the next paragraphs.

The results obtained in [16] and [30] can easily be adapted to the case considered here.First of all, one can restrict the set of liquidation strategies to deterministic processes in A . We denote by A det this set, which consists of the liquidation strategies in A that are F -measurable.Our ﬁrst Proposition states that, in the case of deterministic strategies, the objective func-tion simpliﬁes, since ﬁnal wealth X T is normally distributed. This assumption can easily be relaxed, to include for instance an asymmetrical stamp duty. roposition 2.1. If ( v , . . . , v d ) ∈ A det then X T is normally distributed, and J ( v , . . . , v d ) = − exp − γ q ′ S − d X i =1 Z T V is L i (cid:18) v is V is (cid:19) ds − γ Z T q ′ s Σ q s ds !! . A consequence of this Proposition is that the problem boils down to solving the followingvariational problem: ( P ) inf ( q ,...,q d ) ∈C I ( q , . . . , q d ) where I ( q , . . . , q d ) = d X i =1 Z T V is L i (cid:18) ˙ q i ( s ) V is (cid:19) ds + 12 γ Z T q ( s ) ′ Σ q ( s ) ds, and where C = n q ∈ W , ((0 , T ) , R d ) , q (0) = q , q ( T ) = 0 , ∀ i, | ˙ q i ( t ) | ≤ ρ im V it , a.e. in (0 , T ) o . Existence and uniqueness of minimizers for I are obtained using classical techniques ofvariational calculus and convex optimization as in [16]. Moreover, the optimal liquidationstrategy is characterized by a Hamiltonian system. Theorem 2.1 (Existence, uniqueness and characterization of an optimal strategy) . Thereexists a unique function q ∗ ∈ C that minimizes the function I deﬁned in problem ( P ) .There exists a W , function p such that ( q ∗ , p ) is a solution of the Hamiltonian system ( S H ) : ( ˙ p ( t ) = γ Σ q ( t )˙ q i ( t ) = V it H i ′ ( p i ( t )) , ∀ i ∈ { , . . . , d } with boundary conditions q (0) = q , q ( T ) = 0 , where: H i ( p ) = sup | ρ |≤ ρ im pρ − L i ( ρ ) , Moreover, if a pair ( q, p ) of W , functions is solution of ( S H ) , then q = q ∗ . Remark 6.

Although there is uniqueness of the optimal trajectory q ∗ , there may not beuniqueness of p . This turns out to be important when it comes to numerics. Remark 7.

If market volume processes are continuous, it is clear that for any solution ( p, q ) of the system ( S H ) , q has C regularity and p has C regularity. We shall thereforeapproximate p instead of q . The Hamiltonian characterization ( S H ) is more suited to numerics than the Euler-Lagrangeequation used in most papers since the execution cost functions L i s are not of class C assoon as there is a bid-ask spread component. Finding an approximation to a solution of theHamiltonian system ( S H ) (or in fact to the counterpart of this system in a discrete-timemodel) is the goal of this paper and we present our method in the next section. But beforeturning to our numerical method, let us conclude this section by stating that stochasticstrategies cannot do better than the best deterministic strategy.6 heorem 2.2 (Optimality of deterministic strategies) . sup ( v ,...,v d ) ∈A E [ − exp ( − γX T )] = sup ( v ,...,v d ) ∈A det E [ − exp ( − γX T )] . The model we have presented in the above section is a model in continuous time. Continuous-time models are useful to beneﬁt from diﬀerential calculus and to have intuition on numericalmethods. However, as explained in the introduction, the problem we tackle consists of theﬁrst layer of execution algorithms and a decision has to be taken every 5 or 10 minutes.Hence, the problem faced by practitioners is naturally in discrete time rather than in contin-uous time. The goal of this section is to present a new numerical method for approximatinga solution ( p, q ) of the Hamiltonian system ( S H ) , or more exactly of a discrete version of thissystem. This discrete version ( ˜ S H ) corresponds to the optimality conditions of the discretecounterpart of the optimization problem of Section 2 (see Appendix B for the presentationof the discrete-time model). Considering a time grid , ∆ t, . . . , N ∆ t = T , we are lookingfor (( q in ) ≤ i ≤ d, ≤ n ≤ N , ( p in ) ≤ i ≤ d, ≤ n ≤ N − ) satisfying: ( ˜ S H ) : ( p n +1 = p n + ∆ tγ Σ q n +1 , ≤ n < N − q in +1 = q in + ∆ tV in +1 H i ′ ( p in ) , ≤ n < N, ∀ i ∈ { , . . . , d } q = q , q N = 0 . The ﬁrst part of the system is made of linear equations. The diﬃculty then relates to thenonlinearity introduced by the Hamiltonian functions H i s. In particular, it is noteworthythat in the initial Almgren-Chriss case [2, 3] with a quadratic execution cost function, thesystem boiled down to a linear one.In practice, a classical form for the execution cost functions L i s is L ( ρ ) = η | ρ | φ + ψ | ρ | , η > , ψ ≥ , φ ∈ (0 , . The superlinear component has been considered since the initial papers by Almgren andChriss. In [2, 3], the authors considered the quadratic case φ = 1 to obtain closed formsolutions. However, realistic values for φ are usually estimated to be between 0.4 and 0.8.For instance, Almgren and coauthors from Citigroup estimated φ ≃ . using a large datasetof meta-orders (see [5]). The proportional component models the inﬂuence of the bid-askspread, of a stamp duty, or of a ﬁnancial transaction tax.The Hamiltonian function H associated to L ( ρ ) = η | ρ | φ + ψ | ρ | with participation con-straint ρ m is: 7 ( p ) = sup | ρ |≤ ρ m pρ − η | ρ | φ − ψ | ρ | =  if | p | ≤ ψφη (cid:16) | p |− ψη (1+ φ ) (cid:17) φ if ψ < | p | ≤ ψ + η (1 + φ ) ρ φm ( | p | − ψ ) ρ m − ηρ φm if ψ + η (1 + φ ) ρ φm < | p | As L is strictly convex, H is C with: H ′ ( p ) = sign ( p ) min ρ m , (cid:18) max( | p | − ψ, η (1 + φ ) (cid:19) φ ! This function H ′ is not diﬀerentiable (see Figure 1). This is an issue to build a generalnumerical method since the systems ( S H ) and ( ˜ S H ) involve derivatives of Hamiltonianfunctions of this form. −0.06 −0.04 −0.02 0.00 0.02 0.04 0.06 − . − . . . . p H ’ Figure 1: Shape of H ′ for η = 0 . , φ = 0 . , ψ = 0 . and ρ m = 20% .In dimension d = 1 , solutions of the system ( ˜ S H ) are easy to approximate numerically usingsimple shooting methods. One can indeed replace the terminal condition q N = 0 by aninitial condition on p of the form p = λ . Then, the system can easily be solved recursively(forward in time). It remains to notice that q N is a monotone function of λ to end up withan eﬃcient method that ﬁnds the appropriate value of λ such that q N = 0 . However, assoon as we consider several assets, the issue of numerical methods arises. The shootingmethod described above is not relevant anymore: the monotonicity property is lost andusual methods to ﬁnd the appropriate value of p (such as gradient descents) generally fail.Using a Newton’s method on the whole system, as discussed in [16], is only possible when thefunctions H i s are twice diﬀerentiable. Therefore, it cannot be used to ﬁnd an approximatesolution of the system ( ˜ S H ) when there is a bid-ask spread component or when participationconstraints are considered. 8nother method would be to consider the Bellman equations associated to the problemand to use classical techniques to approximate the solutions of these equations. This isalways possible, independently of the regularity of the functions L i s and H i s. However,this method has several serious drawbacks. First, Bellman equations need to be solved ona d -dimensional grid because dynamic programming requires to be able to solve executionproblems for (almost) all values of the state variable q . In terms of computation time, thisis not an eﬃcient method, especially as d becomes larger than or . Moreover, the do-main for the state variable q need to be considered large, as optimal strategies may involveround trips for hedging purposes. Even in dimension or , solving the Bellman equationsassociated to ( ˜ S H ) is not recommended as optimal strategies must either be on the grid, orapproximated using interpolation methods ( e.g. splines)... and the step of the grid in q hasno reason to be the same over the course of the execution process if one accounts for thevariability of the market volume process.The method we propose is based on a dual formulation of problem ( P ) . It only requires thederivative of the functions H i s to be Lipschitz (and this is the case for the form consideredabove). In addition, our method is iterative and only requires computations on a vector ofsize N × d . The problem ( P ) we considered in Section 2 is: inf ( q ,...,q d ) ∈C d X i =1 Z T V is L i (cid:18) ˙ q i ( s ) V is (cid:19) ds + 12 γ Z T q ( s ) ′ Σ q ( s ) ds. The dual problem ( D ) associated to this Bolza problem is: inf ( p ,...,p d ) ∈ W , ((0 ,T ) , R d ) J ( p , . . . , p d ) , where J ( p , . . . , p d ) = d X i =1 Z T V is H i (cid:0) p i ( s ) (cid:1) ds + 12 γ Z T ˙ p ( s ) ′ Σ − ˙ p ( s ) ds + p (0) · q . Remark 8.

By construction of the dual problem, the Hamiltonian equations that character-ize the minimizers of J are exactly those of the system ( S H ) . In other words, any solutionof problem ( D ) gives a (the) solution of problem ( P ) , through q = γ Σ − ˙ p . If we consider the discrete counterpart of our model (see Appendix B), then the discretecounterpart ( ˜ P ) of ( P ) is: inf ( q in ) ≤ n ≤ N, ≤ i ≤ d ∈ ˜ C ˜ I (( q in ) ≤ n ≤ N, ≤ i ≤ d ) , See [27] for more details on Bolza problems. ˜ I (( q in ) ≤ n ≤ N, ≤ i ≤ d ) = d X i =1 N − X n =0 L i (cid:18) q in − q in +1 V in +1 ∆ t (cid:19) V in +1 ∆ t + γ N − X n =0 q ′ n +1 Σ q n +1 ∆ t, and where ˜ C = n ( q , . . . , q d ) ′ ∈ (cid:0) R N +1 (cid:1) d , ∀ i, q i = q i , q iN = 0 , | q in − q in +1 | ≤ ρ im V in +1 ∆ t, ≤ n ≤ N − o . The dual problem ( ˜ D ) associated to ( ˜ P ) is: inf ( p in ) ≤ n

As above, by construction of the dual problem, any solution of problem ( ˜ D ) gives a solution of ( ˜ S H ) and therefore the solution of problem ( ˜ P ) – see Appendix B. Thissolution can be written using the relations between the primal and the dual variables: q n +1 = 1 γ ∆ t Σ − ( p n +1 − p n ) , ≤ n < N − . This dual problem is at the core of our numerical approximation method since our methodis based on a numerical approximation of a minimizer of ˜ J .The initial idea to ﬁnd such a minimizer is in fact to use a gradient descent algorithmon ˜ J . However, a simple gradient descent would be equivalent to an explicit scheme toapproximate a solution of the PDE ∂ θ p − γ Σ − ∂ tt p +  V H ′ (cid:0) p (cid:1) ... V d H d ′ (cid:0) p d (cid:1)  = 0 with Neumann boundary conditions ˙ p ( θ,

0) = γ Σ q and ˙ p ( θ, T ) = 0 , and with an initialcondition at θ = 0 .Therefore, the naïve gradient descent would require a very small step for θ in order toconverge to a minimum. The methodology we propose consists instead of considering asemi-implicit gradient descent. We have indeed ∇ J ( p ) = − γ Σ − ¨ p +  V H ′ (cid:0) p (cid:1) ... V d H d ′ (cid:0) p d (cid:1)  , where the gradient is taken in L . .3 Semi-implicit gradient descent The method we propose is inspired from a gradient descent. However, we consider an im-plicit scheme for what would correspond to the heat operator in continuous time.The idea is in fact to decompose ˜ J as ˜ J + ˜ J where ˜ J (( p in ) ≤ n is to be chosen to guarantee convergence of the sequence ( p k ) k .In other words, we propose the following semi-implicit gradient descent: • Start with an initial guess ( p i, n ) ≤ n , deﬁne recursively ( p i,k +1 n ) ≤ n

Lemma 3.1.

For p = (cid:0) p , . . . , p d , . . . , p N − , . . . , p dN − (cid:1) ′ , ∇ ˜ J ( p ) = 1 γ ∆ t M ⊗ Σ − p where the N × N matrix M is deﬁned by: p · ,kn denotes the R d column vector ( p ,kn , . . . , p d,kn ) ′ . =  − − − . . . . . . . . . − − −  . Then, the next Proposition states that our method is indeed well deﬁned.

Proposition 3.1.

The sequence ( p k ) k is well deﬁned. Now, our goal is to prove that for suﬃciently small values of ∆ θ , the sequence ( p k ) k convergestoward a minimizer of ˜ J . Then, the optimal trajectory q ∗ will also be obtained as a limitusing the relation between the dual variable p and the primal variable q . For that purpose,we start with a straightforward Lemma. Lemma 3.2.

Assume that the Hamiltonian functions H i s are C , , then ∇ ˜ J is a Lipschitzfunction with k∇ ˜ J k Lip ≤ ∆ t sup i,n V in +1 k H i ′ k Lip . We can now state our convergence result.

Theorem 3.1 (Convergence of the semi-implicit gradient descent) . Assume that the Hamil-tonian functions H i s are C , , and let us consider K such that k∇ ˜ J k Lip ≤ K ∆ t .Then, if ∆ θ < K , ( p k ) k converges towards a minimum p ∗ of ˜ J .Therefore, if we deﬁne ∀ n ∈ { , . . . , N − } , q ∗ n = (cid:16) q ∗ n , . . . , q d ∗ n (cid:17) ′ = lim k → + ∞ γ ∆ t I N ⊗ Σ − (cid:16) p · ,kn − p · ,kn − (cid:17) , then ( q ∗ , p ∗ ) is a solution of ( ˜ S H ) . This theorem proves that our method works better than a simple gradient descent, as ∆ θ can be chosen independently of ∆ t . In addition to that, it does not require to considersecond derivatives of Hamiltonian functions contrary to a Newton’s scheme. Before turningto the practical use of our method, let us notice that the bound K can be made explicit inthe case of execution costs of the form L ( ρ ) = η | ρ | φ + ψ | ρ | : Remark 10. If ∀ i, L i ( ρ ) = η i | ρ | φ i + ψ i | ρ | , then we can consider K = sup i,n V in +1 η i φ i (1 + φ i ) ρ im − φ i In practice, convergence may occur for values of ∆ θ above K . Remark 11.

If we deﬁne q kn = γ ∆ t I N ⊗ Σ − (cid:16) p · ,kn − p · ,kn − (cid:17) , then q k = q and q kN = 0 forall k , thanks to the deﬁnition of p k − and p kN . This is important in practice as we shallapproximate q ∗ by q k for a large k . Practical examples

In practice, it may be convenient to diagonalize Σ to simplify computations in the semi-implicit gradient descent. If we write indeed Σ =

QDQ − the spectral decomposition of Σ ,the variable y k deﬁned by y · ,kn = Q − p · ,kn veriﬁes: y · ,k +1 n − y · ,kn ∆ θ − γ D − y · ,k +1 n +1 − y · ,k +1 n + y · ,k +1 n − ∆ t + Q −  V H ′ (cid:16) ( Qy · ,kn ) (cid:17) ... V d H d ′ (cid:16) ( Qy · ,kn ) d (cid:17)  = 0 , ≤ n < N, where, by convention, we deﬁne: y · ,k − = y · ,k − ∆ tγDQ − q , y · ,kN = y · ,kN − . This formulation enables indeed to consider the discrete heat operator on each stock inde-pendently.We now turn to the practical use of our method with speciﬁc examples. We shall not proceedto comparative statics as the role of the parameters have been described in many papers(see for instance [6, 16]). Instead we focus on speciﬁc cases where the constraint on theparticipation rate is binding, or where bid-ask spread plays a role.

To examplify the use of our method, we consider ﬁrst the liquidation of a portfolio with asingle stock. The parameters of the stock are the following: • Stock price S = 75 , • Volatility 20%, corresponding to σ = 0 . , • Market volume is assumed to be constant over the day with V t = 2000000 , • The execution cost function is L ( ρ ) = η | ρ | φ + ψ | ρ | , with η = 0 . , φ = 0 . and ψ = 0 . .We consider the liquidation of q = 300000 shares (that is of market daily volume) overone day ( T = 1) , with three diﬀerent ﬁgures for the maximum participation rate. In the ﬁrstcase, we set ρ m = 60% and since the constraint is never binding, this corresponds to settingno constraint at all on participation rate. The two other cases correspond respectively to ρ m = 40% and ρ m = 20% . The results for the optimal liquidation strategy are shown onFigure 2. We see that, as expected, the more we constrain the participation rate, the slower The ﬁgures are inspired from the French stock Sanoﬁ. ρ m = 40% and ρ m = 20% , as the liquidation process starts in straight line (the slope of theline corresponding to the maximum participation rate). Time P o s i t i on No constraintMaximum Participation Rate = 40%Maximum Participation Rate = 20%

Figure 2: Liquidation with diﬀerent values of the maximum participation rate ρ m . Riskaversion: γ = 4 . − . The above examples were in the case of a one-asset portfolio. We now turn to three diﬀerentcases involving several assets. In the ﬁrst case, we liquidate a long portfolio of 2 positivelycorrelated assets. In the second case, we liquidate a portfolio with a long position in the ﬁrstasset and a short position in the second asset, the two assets being still positively correlated.The third case corresponds to the liquidation of a one-asset portfolio when a round trip onanother (correlated) asset is allowed, in order to hedge risk.We consider that the ﬁrst asset is as above: • Stock price S = 75 , • Volatility 20%, corresponding to σ = 0 . , • Market volume is assumed to be constant over the day with V t = 2000000 , • The execution cost function is L ( ρ ) = η | ρ | φ + ψ | ρ | , with η = 0 . , φ = 0 . and ψ = 0 . . 14or the ﬁrst two cases we illustrate, the second asset we consider has the following charac-teristics: • Stock price S = 50 , • Volatility 17%, corresponding to σ = 0 . , • Market volume is assumed to be constant over the day with V t = 4500000 , • The execution cost function is L ( ρ ) = η | ρ | φ + ψ | ρ | , with η = 0 . , φ = 0 . and ψ = 0 . .Correlation between the two assets is assumed to be . .For the ﬁrst example, we assume that q = 300000 and q = 675000 , that is ofthe daily market volume of each asset. Maximum participation rates are assumed to be ρ m = ρ m = 40% . Time P o s i t i on Asset 1Asset 2Asset 1 when liquidated without Asset 2

Figure 3: Liquidation of a 2-asset long portfolio. Risk aversion: γ = 4 . − . The results are shown on Figure 3. As a benchmark, we plotted the optimal strategy forthe liquidation of the portfolio with Asset 1 only. We see that the presence of the two assetsin the portfolio accelerates the liquidation of Asset 1. Since the two assets are positivelycorrelated, price risk is increased by the presence of Asset 2. Therefore, it is natural thatthe liquidation of Asset 1 occurs faster in the presence of Asset 2. It is also interesting tonotice that the velocity at which liquidation occurs when the full portfolio is considered The ﬁgures are inspired from the French stock Total.

Time P o s i t i on − − − Asset 1Asset 2Asset 1 when liquidated without Asset 2

Figure 4: Liquidation of a 2-asset long/short portfolio. Risk aversion: γ = 4 . − . The leftaxis corresponds to Asset 1, the right axis corresponds to Asset 2.To illustrate the former, we consider the same assets as above but the portfolio is q = 300000 and q = − . Maximum participation rates are assumed to be ρ m = ρ m = 30% .The results are shown on Figure 4. As expected, because price risk is reduced by the pres-ence of the second asset, the liquidation of Asset 1 is slower in the two-asset case than inthe one-asset case. It is however noteworthy that the constraint on Asset 1 is binding atthe very begining in both cases.The third case we consider is a pure hedging case, the trader does not have an initial positionon the Asset 2, but this second asset will be used for hedging purpose. We consider the casewhere Asset 2 is much more liquid than Asset 1: • Stock price S = 50 , • Volatility 17%, corresponding to σ = 0 . ,16 Market volume is assumed to be constant over the day with V t = 10000000 , • The execution cost function is L ( ρ ) = η | ρ | φ + ψ | ρ | , with η = 0 . , φ = 0 . and ψ = 0 . . Time P o s i t i on − − − Asset 1Asset 2Asset 1 when liquidated without Asset 2Theoretical Hedging

Figure 5: Liquidation of a one-asset portfolio with and without hedge. Risk aversion: γ = 1 . − . The left axis corresponds to Asset 1, the right axis corresponds to Asset 2.Correlation between the two asset prices is assumed to be . . As far as maximum partici-pation rates are concerned, we took ρ m = 50% and ρ m large enough for the constraint notto be binding.The results are shown on Figure 5. As a benchmark, we also plotted in dotted line thetheoretical hedging curve q = − ρ σ S σ S q , had there be no execution costs. In fact, in orderto avoid paying too much for the round trip, the trader restricts its round trip by stayingon a plateau (this is link to the proportional term). We also see, as expected, that theexecution process is slowed down thanks to the hedging instrument. Appendix A: Proofs

Proof of Proposition 2.1 :After integrating by parts in the deﬁnition of X T , we obtain: X T = q ′ S − Z T V is L i (cid:18) v is V is (cid:19) ds + Z T σ i q is dW is . ( v , . . . , v d ) ∈ A det , ( q , . . . , q d ) is deterministic and X T is Gaussian. Therefore J ( v , . . . , v d ) = E [ − exp( − γX T )] can be computed in closed form as we know the Laplacetransform of a Gaussian variable: J ( v , . . . , v d ) = − exp − γ q ′ S − d X i =1 Z T V is L i (cid:18) v is V is (cid:19) ds − γ Z T q ′ s Σ q s ds !! . Proof of Theorem 2.1 :Existence can be obtained using the same method as in Theorem 2.1 of [16]. Uniqueness comes straightforwardly from the fact that the quadratic form x x ′ Σ x is astrictly convex function.Now, coming to the Hamiltonian characterization, we consider the generalized functions: L i ( ρ ) = ( L i ( ρ ) if | ρ | ≤ ρ im + ∞ if | ρ | > ρ im , ∀ i Then, the problem ( P ) is equivalent to inf ( q ,...,q d ) ∈ b C d X i =1 Z T V is L i (cid:18) ˙ q i ( s ) V is (cid:19) ds + 12 γ Z T q ( s ) ′ Σ q ( s ) ds where b C = n q ∈ W , ((0 , T ) , R d ) , q (0) = q , q ( T ) = 0 o . Applying Theorem 6 of [27] and its corollary to this problem, we obtain the Hamiltoniancharacterization stated in Theorem 2.1.

Proof of Theorem 2.2 :The proof is exactly the same as in [16] and [30]. It is a simple consequence of GirsanovTheorem.

Proof of Proposition 3.1 :Using Lemma 3.1, we have: p k +1 − p k = − ∆ θγ ∆ t M ⊗ Σ − p k +1 − ∆ θ ∆ t ∇ ˜ J ( p k ) . The proof is even simpliﬁed by the presence of participation constraints that enables to avoid the useof Dunford-Pettis Theorem. p k +1 is uniquely deﬁned from p k , we need to prove that the matrix I Nd + ∆ θγ ∆ t M ⊗ Σ − is invertible. For that purpose, let us write Σ =

QDQ − the spectral decomposition of Σ , D being diagonal with positive coeﬃcients. Notice then that I Nd + ∆ θγ ∆ t M ⊗ D − is a strictly diagonally dominant matrix and hence it is invertible. Therefore, I Nd + ∆ θγ ∆ t M ⊗ Σ − = ( I N ⊗ Q ) (cid:18) I Nd + ∆ θγ ∆ t M ⊗ D − (cid:19) ( I N ⊗ Q − ) is invertible. Proof of Theorem 3.1 :To improve readability, let us introduce A = 1 γ M ⊗ Σ − and B = I Nd + ∆ θ ∆ t A. Then, using Lemma 3.1, straightforward computations give: p k +1 − p k = − ∆ θ ∆ t B − ∇ ˜ J ( p k ) . Now, we decompose ˜ J ( p k +1 ) − ˜ J ( p k ) as ˜ J ( p k +1 ) − ˜ J ( p k ) + ˜ J ( p k +1 ) − ˜ J ( p k ) . • For the ﬁrst part, we have ˜ J ( p k +1 ) − ˜ J ( p k ) = ∇ ˜ J ( p k ) ′ ( p k +1 − p k ) + 12 ( p k +1 − p k ) ′ A ∆ t ( p k +1 − p k )= − ∆ θ ∆ t ∇ ˜ J ( p k ) ′ B − ∇ ˜ J ( p k ) + 12 (cid:18) ∆ θ ∆ t (cid:19) ∇ ˜ J ( p k ) ′ B − A ∆ t B − ∇ ˜ J ( p k ) . • For the second part, we have ˜ J ( p k +1 ) − ˜ J ( p k ) ≤ ∇ ˜ J ( p k ) ′ ( p k +1 − p k ) + 12 K ∆ t k p k +1 − p k k ≤ − ∆ θ ∆ t ∇ ˜ J ( p k ) ′ B − ∇ ˜ J ( p k ) + 12 K ∆ t (cid:18) ∆ θ ∆ t (cid:19) ∇ ˜ J ( p k ) ′ B − ∇ ˜ J ( p k ) . Summing, we get: ˜ J ( p k +1 ) − ˜ J ( p k ) ≤ − ∆ θ ∆ t ∇ ˜ J ( p k ) ′ B − ∇ ˜ J ( p k )+ 12 (cid:18) ∆ θ ∆ t (cid:19) ∇ ˜ J ( p k ) ′ B − (cid:18) K ∆ tI Nd + A ∆ t (cid:19) B − ∇ ˜ J ( p k ) − ∆ θ ∆ t ∇ ˜ J ( p k ) ′ B − (cid:18) B −

12 ∆ θ ∆ t (cid:18) K ∆ tI Nd + A ∆ t (cid:19)(cid:19)| {z } = R B − ∇ ˜ J ( p k ) . Hence, writing ∆ θ ∆ t ∇ ˜ J ( p k ) ′ B − RB − ∇ ˜ J ( p k ) ≤ ˜ J ( p k ) − ˜ J ( p k +1 ) and summing over k , we obtain for κ ∈ N : ∆ θ ∆ t κ X k =0 ∇ ˜ J ( p k ) ′ B − RB − ∇ ˜ J ( p k ) ≤ ˜ J ( p ) − inf ˜ J .

Now, R = B −

12 ∆ θ ∆ t (cid:18) K ∆ tI Nd + A ∆ t (cid:19) = (cid:18) − K ∆ θ (cid:19) I Nd + 12 ∆ θ ∆ t A is a positive-deﬁnite matrix for ∆ θ < K .Therefore, the series of positive terms P k ∇ ˜ J ( p k ) ′ B − RB − ∇ ˜ J ( p k ) is convergent.As R and B are positive-deﬁnite matrices, we can conclude that the series P k k∇ ˜ J ( p k ) k is also convergent.Now, since p k +1 − p k = − ∆ θ ∆ t B − ∇ ˜ J ( p k ) , the series P k k p k +1 − p k k is convergent and wecan conclude that the sequence ( p k ) k converges toward a vector p ∗ such that ∇ ˜ J ( p ∗ ) = 0 –this corresponds to a minimizer of ˜ J .Now, if we deﬁne q ∗ = q , q ∗ N = 0 , and ∀ n ∈ { , . . . , N − } , q ∗ n = (cid:16) q ∗ n , . . . , q d ∗ n (cid:17) ′ = 1 γ ∆ t I N ⊗ Σ − (cid:0) p · , ∗ n − p · , ∗ n − (cid:1) , it is straightforward to verify that ( p ∗ , q ∗ ) is a solution of ( ˜ S H ) . Appendix B: Discrete model

This appendix is dedicated to the discrete counterpart of our model. We consider for thatpurpose that time is divided into slices of length ∆ t and we denote by t = 0 ≤ . . . ≤ t n = n ∆ t ≤ . . . ≤ t N = T the relevant sequence of times for our discrete model.For i ∈ { , . . . , d } , we denote by v in +1 ∆ t the number of shares of stock i sold by the traderbetween t n and t n +1 . As a consequence, the state of the portfolio q = ( q , . . . , q d ) is givenby ∀ i, q in +1 = q in − v in +1 ∆ t, ≤ n < N. The fact that deterministic strategies are optimal can be shown in the discrete framework using similartechniques as in the continuous-time model. S in +1 = S in + σ i √ ∆ tǫ in +1 , where ( σ ǫ n , . . . , σ d ǫ dn ) n are i.i.d. N (0 , Σ) random variables.The amount of cash obtained by the trader for the v in +1 ∆ t shares of stock i he sold over ( t n , t n +1 ] depends on v in +1 ∆ t itself and on the market volume for stock i over ( t n , t n +1 ] ,assumed to be V in +1 ∆ t .The resulting dynamics for the cash account is: X n +1 = X n + d X i =1 (cid:18) v in +1 S in − L i (cid:18) v in +1 V in +1 (cid:19) V in +1 ∆ t (cid:19) , X = 0 . The maximization criterion we consider is: E [ − exp( − γX N )] . As in the continuous-time model, the ﬁnal wealth can be computed: X N = d X i =1 q i S i + σ i √ ∆ t N − X n =0 q in +1 ǫ in +1 − N − X n =0 L i (cid:18) v in +1 V in +1 (cid:19) V in +1 ∆ t ! . Hence, X N is a Gaussian variable with mean d X i =1 q i S i − N − X n =0 L i (cid:18) v in +1 V in +1 (cid:19) V in +1 ∆ t ! and variance ∆ t N − X n =0 q ′ n +1 Σ q n +1 . Therefore, E [ − exp( − γX N )]= − exp − γ d X i =1 q i S i − N − X n =0 L i (cid:18) v in +1 V in +1 (cid:19) V in +1 ∆ t ! − γ t N − X n =0 q ′ n +1 Σ q n +1 !! , and the problem boils down to minimizing d X i =1 N − X n =0 (cid:18) L i (cid:18) q in − q in +1 V in +1 ∆ t (cid:19) V in +1 ∆ t (cid:19) + γ N − X n =0 q ′ n +1 Σ q n +1 ∆ t, over ˜ C = n ( q , . . . , q d ) ′ ∈ (cid:0) R N +1 (cid:1) d , ∀ i, q i = q i , q iN = 0 , | q in − q in +1 | ≤ ρ im V in +1 ∆ t, ≤ n ≤ N − o . This problem can be written as a discrete-time Bolza problem:21 nf ( q in ) ≤ n ≤ N, ≤ i ≤ d ,q = q ,q N =0 I (( q in ) ≤ n ≤ N, ≤ i ≤ d ) , where I (( q in ) ≤ n ≤ N, ≤ i ≤ d ) = N − X n =0 (cid:18) L i (cid:18) q in − q in +1 V in +1 ∆ t (cid:19) V in +1 ∆ t (cid:19) + γ N − X n =0 q ′ n +1 Σ q n +1 ∆ t, with L i ( ρ ) = ( L i ( ρ ) if | ρ | ≤ ρ im + ∞ if | ρ | > ρ im . Then, using the same techniques as those developed in [23], but in discrete time, one canshow that the dual formulation of this problem is problem ( ˜ D ) : inf ( p in ) ≤ n

SIAM Journalon Financial Mathematics, 3(1), 163-181 , 2012.[2] R. Almgren and N. Chriss. Value under liquidation.

Risk , 12(12):61–63, 1999.[3] R. Almgren and N. Chriss. Optimal execution of portfolio transactions.

Journal ofRisk , 3:5–40, 2001.[4] R. Almgren and J. Lorenz. Adaptive arrival price.

Journal of Trading , (1):59–66, 2007. As the function H i s are of class C , the derivation of this system is straightforward for the dual problem.It is also straightforward for the primal problem when the functions L i s are of class C and when there isno participation constraint. However, in the general case, one needs to rely on classical techniques of convexoptimization – such as the very general ones presented in [23] – and mimic the (rather long but classical)proofs in the continuous-time case to obtain the equivalence between the primal problem, the dual problemand the Hamiltonian equations, in the discrete-time case. Risk, 2005, 18.[6] R. Almgren. Optimal execution with nonlinear impact functions and trading-enhancedrisk.

Applied Mathematical Finance , 10(1):1–18, 2003.[7] E. Bayraktar and M. Ludkovski. Liquidation in limit order books with controlled inten-sity.

Mathematical Finance , 2012.[8] B. Bouchard, N.M. Dang, and C.-A. Lehalle. Optimal control of trading algorithms:a general impulse control approach.

SIAM Journal on Financial Mathematics, 2(1),404-438 , 2011.[9] P. Cannarsa and C. Sinestrari.

Semiconcave functions, Hamilton-Jacobi equations,and optimal control . Birkhäuser Boston, 2004.[10] P. A. Forsyth, J. S. Kennedy, S. T. Tse, and H. Windcliﬀ. Optimal trade execution:a mean quadratic variation approach.

Quantitative Finance , 2009.[11] J. Gatheral. No-dynamic-arbitrage and market impact.

Quantitative Finance ,10(7):749–759, 2010.[12] J. Gatheral and A. Schied. Optimal trade execution under geometric brownian motionin the almgren and chriss framework.

International Journal of Theoretical and AppliedFinance , 14(03):353–368, 2011.[13] O. Guéant and C.-A. Lehalle. General intensity shapes in optimal liquidation. to appearin

Mathematical Finance , 2014.[14] O. Guéant, C.-A. Lehalle, and J. Fernandez Tapia. Optimal portfolio liquidation withlimit orders.

SIAM Journal on Financial Mathematics, 3(1), 740-764 , 2012.[15] O. Guéant. Permanent market impact can be nonlinear, 2013 preprint.[16] O. Guéant. Optimal execution and block trade pricing: a general framework, to appearin

Applied Mathematical Finance , 2014.[17] S. Jaimungal, and D. Kinzebulatov. Optimal execution with a price limiter. Preprint,2013.[18] P. Kratz and T. Schöneborn. Optimal liquidation in dark pools.

In EFA 2009 BergenMeetings Paper , 2012.[19] P. Kratz and T. Schöneborn. Portfolio liquidation in dark pools in continuous time.

Mathematical Finance , 2013.[20] M. Labadie and C.-A. Lehalle. Optimal starting times, stopping times and risk measuresfor algorithmic trading, to appear in

The Journal of Investment Strategies , 2013.

21] S. Laruelle, C.-A. Lehalle, Optimal posting price of limit orders: learning by trading.preprint, 2011[22] S. Laruelle, C.-A. Lehalle, G. Pagès. Optimal split of orders across liquidity pools: astochastic algorithm approach. SIAM Journal on Financial Mathematics, 2(1), 1042-1076, 2011[23] J.-M. Lasry, New Variational Techniques in Mathematical Physics, C.I.M.E. SummerSchools Volume 63, 149-170, 2011.[24] C.-A. Lehalle, S. Laruelle, Market Microstructure in Practice, World Scientiﬁc, 2014[25] J. Lorenz and R. Almgren. Mean-Variance Optimal Adaptive Execution.

AppliedMathematical Finance, 18(5), 395-422 , 2011.[26] A. Obizhaeva and J. Wang. Optimal trading strategy and supply/demand dynamics.Journal of Financial Markets, 16(1), 1-32, 2013.[27] R.T. Rockafellar, Conjugate convex functions in optimal control and the calculus ofvariations, J. Math. Analysis Appl. 32, 174-222, 1970.[28] R.T. Rockafellar.

Convex analysis , volume 28. Princeton university press, 1996.[29] A. Schied and T. Schöneborn. Risk aversion and the dynamics of optimal liquidationstrategies in illiquid markets.

Finance and Stochastics , 13(2):181–204, 2009.[30] A. Schied, T. Schöneborn, and M. Tehranchi. Optimal basket liquidation for carainvestors is deterministic.

Applied Mathematical Finance , 17(6):471–489, 2010.[31] S. T. Tse, P. A. Forsyth, J. S. Kennedy, and H. Windcliﬀ. Comparison between themean variance optimal and the mean quadratic variation optimal trading strategies.

Applied Mathematical Finance, 20(5), 415-449 , 2013., 2013.