
A two-player price impact game∗

Moritz Voß†

November 14, 2019

Abstract

We study the competition of two strategic agents for liquidity in the benchmark portfolio tracking setup of Bank et al. [5], both facing common aggregated temporary and permanent price impact à la Almgren and Chriss [2]. The resulting stochastic linear quadratic differential game with terminal state constraints allows for an explicitly available open-loop Nash equilibrium in feedback form. Our results reveal how the equilibrium strategies of the two players take into account the other agent's trading targets: either in an exploitative intent or by providing liquidity to the competitor, depending on the ratio between temporary and permanent price impact. As a consequence, different behavioral patterns can emerge as optimal in equilibrium. These insights complement existing studies in the literature on predatory trading models examined in the context of optimal portfolio liquidation problems.

Mathematical Subject Classification (2010):

JEL Classification: C61, C73, G11

Keywords: stochastic differential game, Nash equilibrium, illiquid markets, portfolio tracking, predatory trading

∗ I would like to thank Jean-Pierre Fouque for encouraging and illuminating discussions.

† University of California Santa Barbara, Department of Statistics & Applied Probability, Santa Barbara, CA 93106-3110, USA, email [email protected].

arXiv [q-fin.MF]

The confrontation of financial agents concurrently trading in a single risky asset and adversely affecting its execution price through common aggregated temporary and permanent price impact is a well-known problem in optimal portfolio liquidation. Indeed, when a possibly financially distressed institutional investor aims to liquidate a large portfolio position in a risky asset, another strategic trader's awareness of the investor's intentions might tempt her to also trade in the risky asset in order to benefit from the incurred price impact. Typically, one might expect predatory trading activity: like prey, the distressed investor is raced to the market by a predator who initially trades in parallel in the same direction, solely to unwind her accrued position eventually at more favorable prices due to the induced price impact, and at the expense of the liquidating investor. But is it also conceivable to observe the converse behaviour, where an agent actually cooperates and provides liquidity by buying some of the seller's shares, or even engages in both occasional predation and cooperation? Among the first game-theoretic approaches carried out to formulate and investigate possible phenomena in this context of portfolio liquidation are, e.g., Brunnermeier and Pedersen [6], Attari et al. [4], Carlin et al. [8], Schöneborn [22], Schöneborn and Schied [23], Carmona and Yang [9], and Schied and Zhang [19].

Our goal in this paper is to expand on these works and to address the above question by studying the competition of two strategic agents for liquidity when both agents trade simultaneously in an illiquid risky asset affected by price impact, because each agent seeks to track her own exogenously given stochastic target strategy such as, e.g., a frictionless delta hedge used to dynamically hedge the fluctuations of a random endowment. Specifically, we extend the single-player cost optimal benchmark portfolio tracking problem studied in Bank et al. [5] in the presence of temporary and permanent price impact as proposed by Almgren and Chriss [2] to a two-player stochastic differential game. Both strategic agents are fully aware of the opponent's individual tracking objectives and they compete for available liquidity as the jointly caused price impact on the execution price directly feeds into their trading performances. We also allow for individual stochastic terminal state constraints on each agent's final portfolio position. Our aim is to shed light on the strategic interplay between the agents and to make transparent how each agent takes into account the other agent's trading targets in an optimal cost-minimizing manner by solving for a Nash equilibrium in this two-player price impact game.

The paper most closely related to ours is Schied and Zhang [19]. Therein the authors determine a unique open-loop Nash equilibrium within the class of deterministic strategies of agents aiming to liquidate a given asset position by maximizing a mean-variance criterion in an Almgren and Chriss [2] framework. Their study is an extension of the corresponding deterministic differential game solved in Carlin et al. [8] of liquidating risk-neutral agents who maximize expected revenues.
Other extensions of the latter game include, e.g., Schöneborn and Schied [23] and Carmona and Yang [9]: both consider a two-stage version of the model in Carlin et al. [8] by allowing for different time horizons among the agents. While Schöneborn and Schied [23] investigate the qualitative behaviour of the resulting deterministic open-loop Nash equilibrium, Carmona and Yang [9] numerically analyze closed-loop Nash equilibrium strategies; see also Moallemi et al. [17] for an extension of the differential game in [8] with asymmetric information, and Chu et al. [12] for the case of multiple assets. In contrast to these papers, we additionally allow the agents to track their own general predictable target strategies as in the single-player case investigated in [5]. Facing the same time horizon, the players' terminal portfolio positions are also restricted to some exogenously predetermined stochastic levels which are revealed gradually over time. As a consequence, both agents choose their dynamic trading strategies from a suitable set of adapted stochastic processes rather than opting for static strategies from a set of deterministic functions as in the papers cited above (except for the numerical study in Carmona and Yang [9]). Other recent work on price impact games with Almgren-Chriss type price impact includes, e.g., Huang et al. [16] and Casgrain and Jaimungal [10, 11], where agents pursue optimal liquidation or trading and interact through common aggregated permanent price impact; Ekren and Nadtochiy [14] describe an equilibrium between competing market makers with constant absolute risk aversion who are intertwined through jointly incurred temporary price impact while hedging fractions of the same European option in a Markovian framework. Price impact games of liquidating agents in a market model with transient price impact are analyzed, e.g., in Schied and Zhang [20], Schied et al. [21], and Strehle [24].

Our main result is an explicit description of a unique open-loop Nash equilibrium in feedback form within the class of progressively measurable strategies to our two-player stochastic differential game where both agents track their own target strategies as in [5] and interact through temporary and permanent price impact as in [19] and [8]. Mathematically, we solve a symmetric linear quadratic stochastic differential game with random state constraints. Inspired by the analysis in [5], we follow a probabilistic and convex-analytic approach in the style of Pontryagin's stochastic maximum principle. This also allows us to consider general predictable strategies as the agents' tracking targets and not necessarily Markovian or continuous diffusion-type processes. We derive a characterization of the Nash equilibrium which takes the form of a coupled system of linear forward-backward stochastic differential equations (FBSDEs). Solving this system provides us with the agents' optimal trading strategies in equilibrium in closed form and unveils a rich phenomenology for their optimal behaviour.

It turns out that in equilibrium, similar to the single-player solution presented in [5], both agents anticipate their individual running target portfolio by gradually trading in the direction of a weighted average of expected future target positions of the target strategy. However, being aware of the competitor's tracking goals, each agent also assesses a weighted average of the expected future positions of the opponent's target strategy and chooses to trade accordingly. Interestingly, each agent's trading direction with respect to the adversary's target strategy is not fixed but depends on the relation between temporary and permanent price impact.
Indeed, as becomes apparent from our explicit solution, both predation by simultaneously trading in the same direction as the opponent and cooperation by trading in the opposite direction can occur, even in a coexisting manner. As a consequence, different behavioral paradigms can emerge as optimal in our Nash equilibrium.

Conceptually, our explicit results elaborate on the analysis carried out by Schöneborn and Schied [23]. Therein the authors identify two distinct types of illiquid markets: a plastic market where the price impact is predominantly permanent, and an elastic market where the major part of incurred price impact is temporary. Their model predicts that a competitor who is conscious of the other agent's liquidation intention engages in predatory trading in a plastic market, while she tends to cooperate and provides liquidity in an elastic market. Our closed-form Nash equilibrium solution of our more general price impact game corroborates this. The novelty of our contribution comes from the fact that in our game setup, even though both agents share the same time horizon, the various behavioral patterns of coexisting liquidity provision and preying in equilibrium arise due to the agents' tracking objectives, which incorporate a risk aversion in the form of an inventory risk. In fact, this is also already suggested by the deterministic Nash equilibrium for mean-variance optimal liquidation in Schied and Zhang [19]. In contrast, the agents in the differential game in Schöneborn and Schied [23] are risk-neutral and cooperation among them is facilitated by introducing a two-stage setup where the agents have different time horizons. Indeed, in the corresponding one-stage game from Carlin et al. [8] with the same time constraint for all agents only predation in equilibrium is observed and cooperation can only be enforced by repeating the game.

The remainder of the paper is organized as follows.
In Section 2 we introduce our two-player stochastic differential price impact game by extending the framework of Carlin et al. [8] and Schied and Zhang [19] to a stochastic tracking problem of general predictable target strategies. Our main result, an explicit description of a unique open-loop Nash equilibrium of the game, is presented in Section 3. Section 4 contains some illustrations and discusses the qualitative behaviour of the two players' optimal strategies in equilibrium. The technical proofs are deferred to Section 5.

Notation:

Throughout this manuscript we use superscripts for enumerating purposes as, e.g., in $X^1$, $X^2$, $\alpha^1$, $\alpha^2$, or other quantities like $\xi^1$, $\xi^2$ etc. to mark all objects which are associated with player 1 and player 2, respectively; or to itemize objects as $w^1$, $w^2$, $w^3$ etc. In particular, $X^2$, $\alpha^2$, $\xi^2$ are not to be confused with quadratic powers, which will be explicitly denoted with brackets like $(\alpha)^2$, or, if necessary, as $(\alpha^2)^2$.

Let $T > 0$ be a finite deterministic time horizon and fix a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{0 \le t \le T}, \mathbb{P})$ satisfying the usual conditions of right continuity and completeness. We consider two agents who are trading in a financial market consisting of one risky asset, e.g., a stock. The numbers of shares agent 1 and agent 2 are holding at time $t \in [0, T]$ are defined, respectively, as

$$X^1_t \triangleq x^1 + \int_0^t \alpha^1_s \, ds \quad \text{and} \quad X^2_t \triangleq x^2 + \int_0^t \alpha^2_s \, ds \tag{1}$$

with initial positions $x^1, x^2 \in \mathbb{R}$. The real-valued stochastic processes $(\alpha^1_t)_{0 \le t \le T}$ and $(\alpha^2_t)_{0 \le t \le T}$ represent the turnover rates at which each agent trades in the risky asset and belong to the admissibility set

$$\mathcal{A} \triangleq \left\{ \alpha : \alpha \text{ progressively measurable s.t. } \mathbb{E}\left[ \int_0^T (\alpha_t)^2 \, dt \right] < \infty \right\}. \tag{2}$$

We adopt the framework from Carlin et al. [8] and Schied and Zhang [19] and suppose that the agents' trading incurs linear temporary and permanent price impact à la Almgren and Chriss [2] in the sense that trades in the risky asset are executed at prices

$$S_t \triangleq P_t + \lambda (\alpha^1_t + \alpha^2_t) + \gamma \big( (X^1_t - x^1) + (X^2_t - x^2) \big) \quad (0 \le t \le T) \tag{3}$$

with some unaffected price process $P_\cdot = P_0 + \sqrt{\sigma}\, W_\cdot$ following a Brownian motion $(W_t)_{0 \le t \le T}$ with variance $\sigma > 0$. The trading of both agents in the risky asset consumes available liquidity and instantaneously affects the execution price in (3) in an adverse manner through temporary price impact $\lambda > 0$. In addition, the agents' total accumulated trading activity also leaves a trace in the execution price which is captured by the permanent price impact parameter $\gamma > 0$.

Each agent seeks to track her own exogenously given target strategy $(\xi^1_t)_{0 \le t \le T}$ and $(\xi^2_t)_{0 \le t \le T}$, respectively. Both processes $\xi^1$ and $\xi^2$ are supposed to be real-valued predictable processes in $L^2(\mathbb{P} \otimes dt)$ and can be thought of, for instance, as hedging strategies adopted from a frictionless market. Moreover, the agents are also required to reach predetermined terminal portfolio target positions $\Xi^1_T$ and $\Xi^2_T$ in $L^2(\mathbb{P}, \mathcal{F}_T)$ at time $T$. Mathematically, we can formalize their objectives as follows: for a given strategy $(\alpha^2_t)_{0 \le t \le T}$ of her competitor agent 2, agent 1 aims to choose her trading rate $(\alpha^1_t)_{0 \le t \le T}$ in order to minimize the cost functional

$$J^1(\alpha^1, \alpha^2) \triangleq \mathbb{E}\left[ \frac{1}{2} \int_0^T (X^1_t - \xi^1_t)^2 \sigma \, dt + \frac{1}{2} \lambda \int_0^T \alpha^1_t \big( \alpha^1_t + \alpha^2_t \big) \, dt + \gamma \int_0^T \alpha^1_t \big( X^2_t - x^2 \big) \, dt \right], \tag{4}$$

whereas agent 2 wishes to minimize

$$J^2(\alpha^1, \alpha^2) \triangleq \mathbb{E}\left[ \frac{1}{2} \int_0^T (X^2_t - \xi^2_t)^2 \sigma \, dt + \frac{1}{2} \lambda \int_0^T \alpha^2_t \big( \alpha^1_t + \alpha^2_t \big) \, dt + \gamma \int_0^T \alpha^2_t \big( X^1_t - x^1 \big) \, dt \right] \tag{5}$$

via her trading rate $(\alpha^2_t)_{0 \le t \le T}$ in response to a given strategy $(\alpha^1_t)_{0 \le t \le T}$ of her opponent agent 1. As in the single-agent problem in Bank et al. [5], the first term in (4) and (5) reflects the agents' running after their individual target strategies $\xi^1$ and $\xi^2$, respectively, through minimizing the squared deviation of their respective portfolio positions $X^1$ and $X^2$ from these targets. The common weight parameter $\sigma$ measures price fluctuations of the underlying unaffected price process. The second and third terms in (4) and (5) take into account the additional linear quadratic illiquidity costs which are induced by temporary and permanent price impact while both agents are trading in the risky asset as stipulated in (3) (see also Carlin et al. [8] and Schied and Zhang [19]).
Note, however, that due to each agent's individual terminal state constraint $X^i_T = \Xi^i_T$ $\mathbb{P}$-a.s. (for $i = 1, 2$) only the competitor's accrued permanent price impact feeds into their respective cost functional. Indeed, integration by parts yields that the $i$-th agent's permanent impact from her own trading always creates the same costs $\frac{\gamma}{2} (X^i_T - x^i)^2 = \frac{\gamma}{2} (\Xi^i_T - x^i)^2$ independent of her chosen trading rate and can therefore be neglected in her own objective functional. We obtain the following individual optimal stochastic control problems for agent 1 and agent 2, namely,

$$J^1(\alpha^1, \alpha^2) \to \min_{\alpha^1 \in \mathcal{A}^1} \tag{6}$$

for any fixed strategy $\alpha^2 \in \mathcal{A}^2$, and

$$J^2(\alpha^1, \alpha^2) \to \min_{\alpha^2 \in \mathcal{A}^2}, \tag{7}$$

for any fixed strategy $\alpha^1 \in \mathcal{A}^1$, where $\mathcal{A}^i$, $i = 1, 2$, is the set of constrained policies defined as

$$\mathcal{A}^i \triangleq \left\{ \alpha^i : \alpha^i \in \mathcal{A} \text{ satisfying } X^i_T = x^i + \int_0^T \alpha^i_t \, dt = \Xi^i_T \ \mathbb{P}\text{-a.s.} \right\}. \tag{8}$$

As in Bank et al. [5] we further assume that the target positions $\Xi^1_T, \Xi^2_T \in L^2(\mathbb{P}, \mathcal{F}_T)$ satisfy

$$\mathbb{E}\left[ \int_0^T \frac{1}{T - s} \, d\langle M^+ \rangle_s \right] < \infty \quad \text{and} \quad \mathbb{E}\left[ \int_0^T \frac{1}{T - s} \, d\langle M^- \rangle_s \right] < \infty, \tag{9}$$

where $M^+_t \triangleq \mathbb{E}[\Xi^1_T + \Xi^2_T \mid \mathcal{F}_t]$ and $M^-_t \triangleq \mathbb{E}[\Xi^1_T - \Xi^2_T \mid \mathcal{F}_t]$ for $0 \le t \le T$.

Remark 2.1.

1. Similar to Carlin et al. [8] and Schied and Zhang [19], the agents' individual optimization problems in (6) and (7) are intertwined through common aggregated temporary and permanent price impact affecting their performance functionals $J^1$ and $J^2$ in (4) and (5) (in contrast to, e.g., Huang et al. [16], Casgrain and Jaimungal [10, 11] or Ekren and Nadtochiy [14], where agents only interact through permanent or temporary price impact, respectively). One can think of both players as strategic agents who compete for liquidity while concurrently trading in a single illiquid risky asset to meet their tracking objectives for the purpose of, e.g., hedging fluctuations of random endowments. Note that both agents are fully aware of the opponent's trading targets $\xi^i$ and $\Xi^i_T$ ($i = 1, 2$).

2. The condition in (9) ensures that the sets of constrained policies in (8) are nonempty, i.e., $\mathcal{A}^i \neq \emptyset$ for $i = 1, 2$. Indeed, for a target position $\Xi^i_T \in L^2(\mathbb{P}, \mathcal{F}_T)$ only known at time $T$, the terminal state constraint $X^i_T = \Xi^i_T$ $\mathbb{P}$-a.s. ($i = 1, 2$) is quite demanding. Thus, loosely speaking, the condition in (9) requires that the speed at which information on the random ultimate target positions $\Xi^1_T, \Xi^2_T$ is revealed as $t \uparrow T$ is sufficiently fast.

Our goal is to compute a Nash equilibrium in which both agents solve their minimization problems in (6) and (7) simultaneously, each given the strategy of her competitor, in the following sense:

Definition 2.2.

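A pair of admissible strategies $(\hat\alpha^1, \hat\alpha^2) \in \mathcal{A}^1 \times \mathcal{A}^2$ is called an open-loop Nash equilibrium if for all admissible strategies $\alpha^1 \in \mathcal{A}^1$ and $\alpha^2 \in \mathcal{A}^2$ it holds that

$$J^1(\hat\alpha^1, \hat\alpha^2) \le J^1(\alpha^1, \hat\alpha^2) \quad \text{and} \quad J^2(\hat\alpha^1, \hat\alpha^2) \le J^2(\hat\alpha^1, \alpha^2).$$

In other words, in a Nash equilibrium neither player has an incentive to deviate from the chosen strategy.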
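As a purely illustrative aid, the cost functional in (4) can be approximated on a time grid for deterministic strategies. The sketch below assumes piecewise-constant trading rates, positions evaluated at cell midpoints, and a factor 1/2 in front of both the tracking term and the temporary-impact term; the function names `positions`, `midpoints`, `cost_J1` and `own_permanent_cost` are hypothetical. It also checks numerically the integration-by-parts observation that an agent's own permanent impact cost depends only on her terminal position.

```python
def positions(x0, alpha, dt):
    """Grid positions X_{t_k} = x0 + sum_{j<k} alpha_j * dt, cf. (1)."""
    xs = [x0]
    for a in alpha:
        xs.append(xs[-1] + a * dt)
    return xs


def midpoints(xs):
    """Average position on each grid cell (piecewise-constant rates assumed)."""
    return [0.5 * (a + b) for a, b in zip(xs[:-1], xs[1:])]


def cost_J1(alpha1, alpha2, x1, x2, xi1, lam, gam, sig, dt):
    """Riemann-sum approximation of J^1 in (4) for deterministic inputs.

    alpha1, alpha2: trading rates per grid cell; xi1: agent 1's target per cell.
    """
    X1 = midpoints(positions(x1, alpha1, dt))
    X2 = midpoints(positions(x2, alpha2, dt))
    tracking = sum(0.5 * sig * (X - xi) ** 2 * dt for X, xi in zip(X1, xi1))
    temporary = sum(0.5 * lam * a * (a + b) * dt for a, b in zip(alpha1, alpha2))
    permanent = sum(gam * a * (X - x2) * dt for a, X in zip(alpha1, X2))
    return tracking + temporary + permanent


def own_permanent_cost(alpha, x0, gam, dt):
    """Cost gam * int alpha_t (X_t - x0) dt caused by an agent's own permanent impact."""
    X = midpoints(positions(x0, alpha, dt))
    return sum(gam * a * (Xm - x0) * dt for a, Xm in zip(alpha, X))


if __name__ == "__main__":
    n, T = 1000, 1.0
    dt = T / n
    # Two hypothetical ways of selling one share by time T.
    linear = [-1.0] * n
    front_loaded = [-2.0 * (1 - (k + 0.5) * dt) for k in range(n)]
    for alpha in (linear, front_loaded):
        # Both values agree with gam/2 * (X_T - x0)^2 up to rounding.
        print(own_permanent_cost(alpha, 1.0, gam=2.0, dt=dt))
    print(cost_J1(linear, [0.0] * n, 1.0, 0.0, [0.0] * n, 1.0, 2.0, 1.0, dt))
```

With midpoint positions the discrete sum telescopes, so the two liquidation schedules produce the same own-impact cost, which is why this term can be dropped from each agent's objective.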

Remark 2.3.

1. In the special case of optimally liquidating the agents' initial risky asset holdings $x^1, x^2 \in \mathbb{R}$ without tracking exogenously given target strategies, i.e., $\xi^1 \equiv \xi^2 \equiv 0$, and with non-random terminal target positions $\Xi^1_T = \Xi^2_T = 0$ $\mathbb{P}$-almost surely, the above formulated two-player (deterministic) differential game is solved in Carlin et al. [8] setting $\sigma = 0$ in the performance functionals in (4) and (5), and in Schied and Zhang [19] allowing for $\sigma > 0$.

2. Extensions of the liquidation game with $\sigma = 0$ in (4) and (5) are studied, e.g., in Schöneborn and Schied [23], and Carmona and Yang [9] by allowing for different time horizons between both agents. The former determines a unique deterministic open-loop Nash equilibrium whereas the latter numerically analyzes a closed-loop Nash equilibrium.

Our main result is an explicit description of a unique open-loop Nash equilibrium in the sense of Definition 2.2 of the two-player stochastic differential game formulated in Section 2. To state our result it is convenient to introduce the following nonnegative constants

$$\delta^+ \triangleq \gamma^2 + 6 \lambda \sigma, \qquad \delta^- \triangleq \gamma^2 + 2 \lambda \sigma, \tag{10}$$

the nonnegative functions

$$c^+_t \triangleq \frac{1}{3} \sqrt{\delta^+} \coth\!\big( \sqrt{\delta^+} (T - t) / (3 \lambda) \big) + \frac{1}{3} \gamma, \qquad c^-_t \triangleq \sqrt{\delta^-} \coth\!\big( \sqrt{\delta^-} (T - t) / \lambda \big) - \gamma \quad (0 \le t \le T) \tag{11}$$

such that $\lim_{t \uparrow T} c^\pm_t = +\infty$, as well as the weight functions

$$\begin{aligned}
w^1_t &\triangleq \frac{\sqrt{\delta^+} \, e^{\frac{\gamma}{3 \lambda} (T - t)}}{3 (c^+_t + c^-_t) \sinh\!\big( \sqrt{\delta^+} (T - t) / (3 \lambda) \big)}, \qquad
w^2_t \triangleq \frac{\sqrt{\delta^-} \, e^{-\frac{\gamma}{\lambda} (T - t)}}{(c^+_t + c^-_t) \sinh\!\big( \sqrt{\delta^-} (T - t) / \lambda \big)}, \\
w^3_t &\triangleq \frac{c^+_t}{c^+_t + c^-_t} - w^1_t, \qquad
w^4_t \triangleq \frac{c^-_t}{c^+_t + c^-_t} - w^2_t, \qquad
w^5_t \triangleq \frac{c^+_t - c^-_t}{c^+_t + c^-_t}
\end{aligned} \tag{12}$$

for all $t \in [0, T]$. We are now ready to state our main theorem:

Theorem 3.1.

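There exists a unique open-loop Nash equilibrium $(\hat\alpha^1, \hat\alpha^2)$ in $\mathcal{A}^1 \times \mathcal{A}^2$ in the sense of Definition 2.2. The corresponding equilibrium stock holdings $\hat X^1_\cdot = x^1 + \int_0^\cdot \hat\alpha^1_t \, dt$ of agent 1 and $\hat X^2_\cdot = x^2 + \int_0^\cdot \hat\alpha^2_t \, dt$ of agent 2 satisfy the random linear coupled ODE

$$\begin{aligned}
\hat X^1_0 &= x^1, \quad d\hat X^1_t = \frac{c^+_t + c^-_t}{2 \lambda} \Big( \hat\xi^1_t - w^5_t \hat X^2_t - \hat X^1_t \Big) dt, \\
\hat X^2_0 &= x^2, \quad d\hat X^2_t = \frac{c^+_t + c^-_t}{2 \lambda} \Big( \hat\xi^2_t - w^5_t \hat X^1_t - \hat X^2_t \Big) dt \quad (0 \le t < T),
\end{aligned} \tag{13}$$

where, for $0 \le t \le T$, we let

$$\begin{aligned}
\hat\xi^1_t \triangleq{} & w^1_t \cdot \mathbb{E}[\Xi^1_T + \Xi^2_T \mid \mathcal{F}_t] + w^2_t \cdot \mathbb{E}[\Xi^1_T - \Xi^2_T \mid \mathcal{F}_t] \\
& + w^3_t \cdot \mathbb{E}\left[ \int_t^T (\xi^1_u + \xi^2_u) \cdot K^1(t, u) \, du \,\Big|\, \mathcal{F}_t \right] + w^4_t \cdot \mathbb{E}\left[ \int_t^T (\xi^1_u - \xi^2_u) \cdot K^2(t, u) \, du \,\Big|\, \mathcal{F}_t \right]
\end{aligned} \tag{14}$$

and

$$\begin{aligned}
\hat\xi^2_t \triangleq{} & w^1_t \cdot \mathbb{E}[\Xi^2_T + \Xi^1_T \mid \mathcal{F}_t] + w^2_t \cdot \mathbb{E}[\Xi^2_T - \Xi^1_T \mid \mathcal{F}_t] \\
& + w^3_t \cdot \mathbb{E}\left[ \int_t^T (\xi^2_u + \xi^1_u) \cdot K^1(t, u) \, du \,\Big|\, \mathcal{F}_t \right] + w^4_t \cdot \mathbb{E}\left[ \int_t^T (\xi^2_u - \xi^1_u) \cdot K^2(t, u) \, du \,\Big|\, \mathcal{F}_t \right]
\end{aligned} \tag{15}$$

with nonnegative kernels

$$\begin{aligned}
K^1(t, u) &\triangleq \frac{w^1_t}{w^3_t} \, \frac{2 \sigma e^{-\frac{\gamma}{3 \lambda} (T - u)} \sinh\!\big( \sqrt{\delta^+} (T - u) / (3 \lambda) \big)}{\sqrt{\delta^+}}, \\
K^2(t, u) &\triangleq \frac{w^2_t}{w^4_t} \, \frac{2 \sigma e^{\frac{\gamma}{\lambda} (T - u)} \sinh\!\big( \sqrt{\delta^-} (T - u) / \lambda \big)}{\sqrt{\delta^-}} \quad (0 \le t \le u < T)
\end{aligned} \tag{16}$$

which, for each $t \in [0, T)$, integrate to one over $[t, T]$. The solution $(\hat X^1, \hat X^2)$ of (13) satisfies the terminal state constraints in the sense that

$$\lim_{t \uparrow T} \hat X^1_t = \Xi^1_T \quad \text{and} \quad \lim_{t \uparrow T} \hat X^2_t = \Xi^2_T \quad \mathbb{P}\text{-a.s.} \tag{17}$$

The proof of Theorem 3.1 is deferred to Section 5. The equilibrium stock holdings prescribed by the linear coupled ODE in (13) can be easily computed explicitly.

Corollary 3.2.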

The solution $(\hat X^1, \hat X^2)$ to the linear ODE in (13) is given by

$$\begin{aligned}
\hat X^1_t ={} & \frac{1}{2} (x^1 + x^2) e^{-\int_0^t \frac{c^+_s}{\lambda} ds} + \frac{1}{4 \lambda} \int_0^t (c^+_s + c^-_s)(\hat\xi^1_s + \hat\xi^2_s) e^{-\int_s^t \frac{c^+_u}{\lambda} du} \, ds \\
& + \frac{1}{2} (x^1 - x^2) e^{-\int_0^t \frac{c^-_s}{\lambda} ds} + \frac{1}{4 \lambda} \int_0^t (c^+_s + c^-_s)(\hat\xi^1_s - \hat\xi^2_s) e^{-\int_s^t \frac{c^-_u}{\lambda} du} \, ds
\end{aligned} \tag{18}$$

and, similarly, by

$$\begin{aligned}
\hat X^2_t ={} & \frac{1}{2} (x^2 + x^1) e^{-\int_0^t \frac{c^+_s}{\lambda} ds} + \frac{1}{4 \lambda} \int_0^t (c^+_s + c^-_s)(\hat\xi^2_s + \hat\xi^1_s) e^{-\int_s^t \frac{c^+_u}{\lambda} du} \, ds \\
& + \frac{1}{2} (x^2 - x^1) e^{-\int_0^t \frac{c^-_s}{\lambda} ds} + \frac{1}{4 \lambda} \int_0^t (c^+_s + c^-_s)(\hat\xi^2_s - \hat\xi^1_s) e^{-\int_s^t \frac{c^-_u}{\lambda} du} \, ds
\end{aligned} \tag{19}$$

for all $t \in [0, T]$.

Remark 3.3. Following up on Remark 2.3 1.), setting $\xi^1 \equiv \xi^2 \equiv 0$ and $\Xi^1_T = \Xi^2_T = 0$ $\mathbb{P}$-almost surely, our Theorem 3.1 together with Corollary 3.2 retrieves the two-player results from Carlin et al. [8, Result 1] for the case $\sigma = 0$ and from Schied and Zhang [19, Corollary 2.6] for the case $\sigma > 0$. Note that this configuration yields $\hat\xi^1 \equiv \hat\xi^2 \equiv 0$.

Lemma 3.4.

The weight functions $w^1, w^2, w^3, w^4, w^5$ defined in (12) satisfy

1. $w^5_\cdot \in (-1, 1)$, $w^1_\cdot, w^2_\cdot, w^3_\cdot, w^4_\cdot > 0$ on $[0, T)$ and $w^1_\cdot + w^2_\cdot + w^3_\cdot + w^4_\cdot = 1$ on $[0, T]$,

2. $\lim_{t \uparrow T} w^1_t = \lim_{t \uparrow T} w^2_t = 1/2$ and $\lim_{t \uparrow T} w^3_t = \lim_{t \uparrow T} w^4_t = \lim_{t \uparrow T} w^5_t = 0$.

Let us now discuss qualitatively the two-player Nash equilibrium obtained in Theorem 3.1. First of all, we observe that the Nash equilibrium strategies $\hat X^1$ and $\hat X^2$ of both players presented therein and in Corollary 3.2 are symmetric in the agents' trading targets $\xi^1, \Xi^1_T$ and $\xi^2, \Xi^2_T$. Moreover, very similar to the single-player solution obtained in Bank et al. [5], it turns out that the trading rates $\hat\alpha^1$ and $\hat\alpha^2$ in (13) prescribe, respectively, to gradually trade in the direction of an optimal signal process $\hat\xi^1_t$ and $\hat\xi^2_t$ (rather than toward the actual target position $\xi^1_t$, $\xi^2_t$) which is further adjusted by a fraction $w^5_t \in (-1, 1)$ of the opponent's respective current portfolio position $\hat X^2_t$ and $\hat X^1_t$. The optimal signal processes $\hat\xi^1$ in (14) and $\hat\xi^2$ in (15) are convex combinations of weighted averages of expected future target positions of the processes $\xi^1$, $\xi^2$ and the expected terminal positions $\Xi^1_T$, $\Xi^2_T$, where the weights $w^1_t, w^2_t, w^3_t, w^4_t$ systematically shift toward the desired individual terminal state as $t \uparrow T$ (Lemma 3.4 2.) implies that $\lim_{t \uparrow T} \hat\xi^i_t = \Xi^i_T$ $\mathbb{P}$-a.s. for both players $i = 1, 2$). The fact that the rate $(c^+_t + c^-_t) / (2 \lambda) \uparrow \infty$ for $t \uparrow T$, together with $\lim_{t \uparrow T} w^5_t = 0$, then forces both strategies in (13) to wind up in the predetermined terminal portfolio position at maturity $T$ (see also the proof of Theorem 3.1 in Section 5 below).

Interestingly, we note that the first agent's optimal signal process $\hat\xi^1$ not only seeks to anticipate and average out the future evolution of her own target strategy $\xi^1$ but, conscious of her competitor's trading goals, does so also for the opponent's target strategy $\xi^2$. In other words, besides following her own objectives, she also takes into account the other agent's known trading intentions. Moreover, the weights $w^3_t$ and $w^4_t$ dictate the actual trading direction with respect to the other agent's tracking target. Indeed, observe that if $w^3_t$ predominates over $w^4_t$ in (14), the first player's optimal signal $\hat\xi^1$ directs her to also trade in parallel in the same direction as the second player, that is, in the direction of the expected future average positions of $\xi^2$. In contrast, if $w^4_t$ outweighs $w^3_t$, then the optimal signal directs her to trade in the opposite direction of the second player's target strategy, i.e., toward the expected weighted averages of $-\xi^2$.
Loosely speaking, the former case can be viewed as a predatory trading action of the first agent against the second agent: by trading in the same direction she is racing her adversary to the market in order to benefit from more favorable prices (due to the induced price impact) when unwinding at a later point in time the corresponding accrued position in accordance with the competitor's known trading needs. The latter case, on the other hand, can be regarded as cooperative behaviour: by trading simultaneously in the opposite direction, she provides liquidity to her opponent. The same applies to the optimal signal $\hat\xi^2$ of the second player in (15) due to symmetry.

In our illustrations in Section 4 below it becomes apparent that both these cases depend on the relationship between the permanent and temporary price impact parameters $\gamma$ and $\lambda$. This is in line with the study of an optimal liquidation problem in Schöneborn and Schied [23], where the authors identify two distinct types of illiquid markets: a plastic market where $\gamma \gg \lambda$ and an elastic market with $\lambda \gg \gamma$. They predict that competitors who are aware of other market participants' trading intentions engage in predatory trading in plastic markets, while they tend to provide liquidity in elastic markets; both in the above sense of trading in the same or opposite direction. Our case studies in Section 4 illustrate how these two market types, plastic and elastic, correlate accordingly with the sizes of the weights $w^3$ and $w^4$; we refer to the graphical illustration in Figure 6 below (note, however, that the dependence of the functions $w^3$, $w^4$ on $\gamma$ and $\lambda$, as well as their interrelations with $\sigma$ and $T$, are nontrivial). In this regard, depending on the illiquidity parameters the optimal signal processes $\hat\xi^1$ in (14) and $\hat\xi^2$ in (15) account for different types of regimes.
It turns out that this leads to qualitatively different behavioral patterns in the Nash equilibrium of Theorem 3.1 where both predation and cooperation between the agents can occur in a coexisting manner (see Section 4 below). This differs remarkably from the dichotomy in the existing literature where either predation or cooperation is observed in equilibrium, with the latter essentially being enforced through repeating the game or by allowing for different trading horizons among the agents as, e.g., in Carlin et al. [8] and Schöneborn and Schied [23]. This is due to the fact that the agents in the above cited work are risk-neutral and maximize expected revenues. In contrast, in our stochastic differential price impact game formulation in Section 2 agents are sensitive about their inventories. This induces a form of risk aversion which leads to a richer variety of different behavior in equilibrium for different model parameters; see also the study of liquidating agents with mean-variance preferences by Schied and Zhang [19], which is a special case of our setup.

Finally, observe that the optimal signal processes in (14) and (15) also forecast both players' terminal target positions $\Xi^1_T$ and $\Xi^2_T$. However, one can argue that $w^1 > w^2$ on $[0, T)$ independent of $\gamma$ and $\lambda$ (we refer to the graphical illustration in Figure 6 below for simplicity). That is, regarding the opponent's terminal portfolio position each agent only pursues predatory trading.

Remark 3.5. Note that the open-loop Nash equilibrium strategies in Theorem 3.1 are given in feedback form of the state variables $\hat X^1$ and $\hat X^2$. In principle, this allows each player to react dynamically to the other player's state variable and to adapt if the opponent decides to deviate from the equilibrium strategy. However, it does not necessarily imply that the adapting player still reacts optimally in response to the other player's deviation in this case.
Recall that in a closed-loop stochastic differential game each player's optimal strategy is sought over a set of feedback policies, with the other player's strategy likewise fixed as a deterministic feedback function of the random state variables, in contrast to the general stochastic processes considered in Definition 2.2. Our solution in (13) suggests searching for a closed-loop Nash equilibrium for our stochastic differential game where the strategies are fixed as feedback controls of the state variables $(X^1_t, X^2_t)$ augmented by the (uncontrolled) target positions $(\xi^1_t, \xi^2_t)$ (under the additional assumption that $(\mathcal{F}_t)_{0 \le t \le T}$ is their natural filtration). Nonetheless, we expect that a closed-loop equilibrium does not exhibit any new qualitative features; see also the heuristic analysis in Carlin et al. [8, Appendix B] and the discussion in Schöneborn and Schied [23].

In this section we present some case studies to illustrate the qualitative behaviour of the two-player Nash equilibrium presented in Theorem 3.1.
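Before turning to the case studies, the following minimal sketch evaluates the constants in (10), the functions in (11), the weights in (12) and the kernels in (16) numerically. The formulas are transcribed from those displays, with constants fixed so that the kernels integrate to one as stated in Theorem 3.1. All function names are hypothetical, the value $\gamma = 0.1$ for an elastic market is an assumption made for illustration only, and the closed-form exponentials in `liquidation_positions` come from integrating $c^\pm$ in Corollary 3.2 with all targets set to zero, a step not spelled out in the text.

```python
import math


def deltas(lam, gam, sig):
    """Square roots of delta^+ and delta^- from (10)."""
    return math.sqrt(gam**2 + 6 * lam * sig), math.sqrt(gam**2 + 2 * lam * sig)


def c_pm(t, T, lam, gam, sig):
    """The functions c^+_t and c^-_t from (11), for t < T."""
    dp, dm = deltas(lam, gam, sig)
    tau = T - t
    cp = dp / (3 * math.tanh(dp * tau / (3 * lam))) + gam / 3
    cm = dm / math.tanh(dm * tau / lam) - gam
    return cp, cm


def weights(t, T, lam, gam, sig):
    """The weights (w^1, ..., w^5) from (12), for t < T."""
    dp, dm = deltas(lam, gam, sig)
    cp, cm = c_pm(t, T, lam, gam, sig)
    tau = T - t
    w1 = dp * math.exp(gam * tau / (3 * lam)) / (3 * (cp + cm) * math.sinh(dp * tau / (3 * lam)))
    w2 = dm * math.exp(-gam * tau / lam) / ((cp + cm) * math.sinh(dm * tau / lam))
    w3 = cp / (cp + cm) - w1
    w4 = cm / (cp + cm) - w2
    w5 = (cp - cm) / (cp + cm)
    return w1, w2, w3, w4, w5


def kernel_masses(t, T, lam, gam, sig, n=4000):
    """Midpoint-rule approximations of the integrals of K^1(t, .) and K^2(t, .) over [t, T]."""
    dp, dm = deltas(lam, gam, sig)
    w1, w2, w3, w4, _ = weights(t, T, lam, gam, sig)
    h = (T - t) / n
    m1 = m2 = 0.0
    for k in range(n):
        s = T - (t + (k + 0.5) * h)  # remaining time T - u at the cell midpoint
        m1 += 2 * sig * math.exp(-gam * s / (3 * lam)) * math.sinh(dp * s / (3 * lam)) / dp * h
        m2 += 2 * sig * math.exp(gam * s / lam) * math.sinh(dm * s / lam) / dm * h
    return w1 / w3 * m1, w2 / w4 * m2


def liquidation_positions(t, T, lam, gam, sig, x1=1.0, x2=0.0):
    """Equilibrium holdings (X^1_t, X^2_t) from Corollary 3.2 when all targets are zero."""
    dp, dm = deltas(lam, gam, sig)
    kp, km = dp / (3 * lam), dm / lam
    # Closed forms of exp(-int_0^t c^+/lam) and exp(-int_0^t c^-/lam).
    s = (x1 + x2) * math.exp(-gam * t / (3 * lam)) * math.sinh(kp * (T - t)) / math.sinh(kp * T)
    d = (x1 - x2) * math.exp(gam * t / lam) * math.sinh(km * (T - t)) / math.sinh(km * T)
    return 0.5 * (s + d), 0.5 * (s - d)


if __name__ == "__main__":
    T, lam, sig = 2.0, 1.0, 1.0
    for gam in (2.0, 0.1):  # 0.1 is an assumed "elastic" value, not taken from the text
        w = weights(0.5, T, lam, gam, sig)
        print(gam, sum(w[:4]), w[4], kernel_masses(0.5, T, lam, gam, sig))
        print(liquidation_positions(T / 2, T, lam, gam, sig))
```

Under these assumptions the first four weights sum to one, the sign of $w^5$ differs between the two parameter choices, and the second agent's mid-horizon position is short for $\gamma = 2$ and long for $\gamma = 0.1$, in line with the predation and cooperation regimes discussed above.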

We start with revisiting the differential game of optimal portfolio liquidation studied in Schied and Zhang [19]. Specifically, the first agent seeks to liquidate her initial portfolio position of $x^1 = 1$ shares in the risky asset by time $T = 2$ and hence requires her terminal position to satisfy $\Xi^1_T = 0$ $\mathbb{P}$-a.s. at final time. Vigilant about her stock holdings and in line with her selling intention she also wants her inventory to be close to 0 throughout by tracking $\xi^1 \equiv 0$ on $[0, T]$. The second agent, on the contrary, does not pursue any predetermined buying or selling objectives but solely chooses to trade in the risky asset because she knows about the intentions of the first liquidating agent. That is, possessing no shares at time 0 ($x^2 = 0$) she gives herself the constraints $\xi^2_t = \Xi^2_T = 0$ $\mathbb{P}$-a.s. for all $t \in [0, T]$. In this case, following Theorem 3.1, we have $\hat\xi^1 \equiv \hat\xi^2 \equiv 0$ $\mathbb{P}$-a.s. on $[0, T]$ in (14) and (15), and the deterministic equilibrium trading rates of both players in (13) reduce to

$$\hat\alpha^1_t = \frac{c^+_t + c^-_t}{2 \lambda} \Big( -w^5_t \hat X^2_t - \hat X^1_t \Big) \quad \text{and} \quad \hat\alpha^2_t = \frac{c^+_t + c^-_t}{2 \lambda} \Big( -w^5_t \hat X^1_t - \hat X^2_t \Big) \tag{20}$$

on $[0, T)$; cf. also the result in Schied and Zhang [19, Corollary 2.6] with a slightly different representation. We observe in (20) that the first agent's portfolio position $\hat X^1_t$ is not simply reverting gradually towards 0 but takes the effect of the second agent's actions into account via the correction term $-w^5_t \hat X^2_t$. Similarly, concerning the second agent, it is optimal for her to systematically trade in the direction of the liquidating agent's current portfolio position $\hat X^1_t$ weighted with $w^5_t \in (-1, 1)$. The resulting strategies are shown in Figure 1. The second agent preys on the first agent in a plastic market with, e.g., $\gamma = 2 > \lambda$. Indeed, during the first half of the trading period she short-sells the risky asset in parallel to the selling of the first agent and then steadily unwinds her accrued short position by buying back shares to become "hands-clean" by final time $T$. In contrast, in an elastic market with, e.g., $\gamma = 0.\ldots < \lambda$, the Nash equilibrium strategy dictates the second agent to cooperate with the seller and to moderately buy almost up to one-tenth of the shares by time $T/2$ before unwinding this position by time $T$. Note that the weight function $w^5_\cdot$ in (20) flips sign depending on the market's illiquidity regime (see also Figure 6 for a graphical illustration of the weight functions). As a consequence, compared to the single-player optimal liquidation strategy $\hat X_t = 1 + \int_0^t \hat\alpha_s \, ds$, $t \in [0, T]$, which satisfies

$$\hat\alpha_t = -\sqrt{\frac{\sigma}{\lambda}} \coth\!\left( \sqrt{\frac{\sigma}{\lambda}} (T - t) \right) \hat X_t \quad (0 \le t < T) \tag{21}$$

(cf., e.g., Almgren [1]), and does not depend on $\gamma$, we observe in Figure 1 that, due to the presence of the second agent's trading activity which directly enters her trading rate $\hat\alpha^1$ via $-w^5 \hat X^2$ in (20), her optimal portfolio liquidation strategy becomes more prudent in a plastic market and slightly more aggressive in an elastic market environment. To sum up, in equilibrium, depending on the illiquid market type, either predation or cooperation between both agents occurs; see also the discussion in Schied and Zhang [19, Section 3].

[Figure 1, two panels titled "Plastic Market" and "Elastic Market"; legend: Player 1, Player 2, Player 1's Single-Player Solution]

Figure 1: The two-player Nash equilibrium strategies $\hat X^1$ for the liquidating agent 1 (green) and $\hat X^2$ for agent 2 (orange) on $[0, T]$, together with the corresponding processes $-w^5 \hat X^i$ ($i = 1, 2$) from the trading rates in (20) (same-color dashed lines). The optimal single-player liquidation strategy from (21) is depicted in black. The parameters are $T = 2$, $\sigma = 1$, $\lambda = 1$, as well as $\gamma = 2$ (upper panel), $\gamma = 0.\ldots$ (lower panel).

The next two case studies are again simple deterministic examples but this time with nonzero optimal signal processes $\hat\xi^1$ and $\hat\xi^2$.

In the first example, as in the optimal liquidation problem above, we suppose that agent 2 only trades in the risky asset because of her awareness of the trading activity of the first agent. That is, with $x^2 = 0$ initial shares her inventory targets are $\xi^2_t = \Xi^2_T = 0$ $\mathbb{P}$-a.s. for all $t \in [0, T]$. Concerning the first agent, starting with no inventory $x^1 = 0$ she wants to follow a stock-buying schedule over a time period of $T = 10$ that prescribes to hold one stock until time $T/2$ and two stocks from then on until time $T$. Her inventory target is thus $\xi^1_t = 1 \cdot \mathbf{1}_{\{0 \le t < 5\}} + 2 \cdot \mathbf{1}_{\{5 \le t \le 10\}}$ on $[0, T]$ with terminal constraint $\Xi^1_T = 2$. Note that in this game setup the optimal signal processes $\hat\xi^1$ and $\hat\xi^2$ of both agents in (14) and (15) in equilibrium are nonzero. In particular, similar to the single-player case in Bank et al. [5], they are anticipating and smoothing out the jump in $\xi^1$ at time $T/2$ via the kernels $K^1$ and $K^2$. The associated Nash equilibrium trading strategies $\hat X^1$ and $\hat X^2$ from Theorem 3.1 are presented in Figure 2. As expected from the liquidation problem above, if the market is plastic ($\gamma > \lambda$) the second agent heavily preys on the first agent by trading halfway through the trading period in the same direction and buying shares. Accordingly, in comparison to the first agent's single-player optimal tracking strategy from Bank et al.
[5] (which does not depend on $\gamma$), her pursuit of the buying schedule $\xi^1$ is affected by the presence of the preying second agent and slightly falls behind in the second half of the trading period (also recall the adjustment $\hat\xi^1 - w^5 \hat X^2$ of the first agent's optimal signal process in her trading rate in (13)). However, if the market is elastic ($\lambda > \gamma$) the second agent's optimal behaviour in equilibrium changes. Interestingly, we observe that her strategy turns out to be a succession of round-trips during which she either provides liquidity to her opponent by short-selling the risky asset like, e.g., during the first quarter of the trading period, or engages in […]

Figure 2: The two-player Nash equilibrium strategies $\hat X^1$ for Player 1 (green) and $\hat X^2$ for Player 2 (orange), together with the processes $\hat\xi^i - w^5 \hat X^j$ ($i \neq j \in \{1,2\}$) from the optimal trading rates in (13) (same-color dashed lines). The first agent's buying program $\xi^1 = 1_{[0,5)} + 2 \cdot 1_{[5,10]}$ is plotted in grey. For comparison, the corresponding single-player optimal tracking strategy with associated optimal signal process from [5] is depicted in black (solid and dashed). The parameters are $T = 10$, $\sigma = 1$, $\lambda = 1$, as well as $\gamma = 2$ (upper panel, plastic market) and $\gamma < \lambda$ (lower panel, elastic market).

In the second example, both agents start with $x^1 = x^2 = 0$ and seek to gradually build up and hold a positive fraction of the risky asset over some time period $[0,T]$ with $T = 10$. Concretely, assume that $\xi^1 \equiv \Xi^1_T = 1$ and $\xi^2 \equiv \Xi^2_T = 0.1$, i.e., agent 1 wants her inventory to be close to 1, ten times larger than the desired inventory level of agent 2, all through the trading period $[0,T]$. The associated Nash equilibrium strategies $\hat X^1$ and $\hat X^2$ from Theorem 3.1 are presented in Figure 3. Again, as expected from the analysis above, in a plastic market it is optimal for agent 2 to excessively prey on the first agent, who aims for a much larger asset position, by buying up to three times more shares than her actual target inventory prescribes. In response, the acquisition of the first agent is slowed down compared to her single-player optimal strategy from [5]. By contrast, in an elastic market environment it turns out to be optimal for the second agent to initially ignore her own tracking target and to trade away from her desired inventory level in order to provide liquidity to the higher-volume seeking first agent by short-selling some shares. Also note how in this case the second agent's single-player optimal tracking strategy from [5] strongly differs from her optimal behaviour in the two-player Nash equilibrium at the beginning of the trading period.

Figure 3: The two-player Nash equilibrium strategies $\hat X^1$ for Player 1 (green) and $\hat X^2$ for Player 2 (orange), together with the processes $\hat\xi^i - w^5 \hat X^j$ ($i \neq j \in \{1,2\}$) from the optimal trading rates in (13) (same-color dashed lines). Both agents' inventory targets are $\xi^1 \equiv 1$ and $\xi^2 \equiv 0.1$; the panels also show the single-player solutions. The parameters are $T = 10$, $\sigma = 1$, $\lambda = 1$, as well as $\gamma = 2$ (upper panel, plastic market) and $\gamma < \lambda$ (lower panel, elastic market).

In the final two examples we want to investigate a situation where the target strategies $\xi^1$ and $\xi^2$ are adapted stochastic processes. Specifically, let us suppose that the first agent wants to hedge an at-the-money call option with maturity $T$ on the underlying unaffected price process $P$ in (3) by tracking the corresponding frictionless (Bachelier-)delta-hedging strategy
\[
\xi^1_t \triangleq \Phi\!\left(\frac{P_t - P_0}{\sqrt{\sigma (T-t)}}\right) \qquad (0 \le t \le T). \tag{22}
\]
Here, $\Phi$ denotes the cumulative distribution function of the standard normal distribution. We further suppose that her initial position in the risky asset is $x^1 = \xi^1_0 = 1/2$ and that $\Xi^1_T = 0$ $\mathbb{P}$-a.s., i.e., she wants to systematically unwind her hedging portfolio when approaching maturity $T$.

Firstly, we assume that the second agent does not pursue any specific predetermined trading objectives, that is, $x^2 = \xi^2 = \Xi^2_T = 0$ $\mathbb{P}$-a.s. Since $\xi^1$ in (22) is a martingale on $[0,T]$, the optimal signal processes $\hat\xi^1$ and $\hat\xi^2$ in (14) and (15) simplify to
\[
\hat\xi^1_t = (w^3_t + w^4_t)\,\xi^1_t \quad\text{and}\quad \hat\xi^2_t = (w^3_t - w^4_t)\,\xi^1_t \qquad (0 \le t \le T). \tag{23}
\]
The Nash equilibrium strategies $\hat X^1$ and $\hat X^2$ from Theorem 3.1 are plotted in Figure 4, together with the corresponding realisation of the delta-hedge $\xi^1$ in the case where the call option expires in the money. Depending on the illiquidity parameters, we observe the same behavioral patterns in equilibrium as in the deterministic cases analyzed above: in a plastic market environment, the second agent engages in predatory trading on the first agent by trading in parallel in the same direction as the delta-hedge. When the market is elastic she turns into a liquidity provider instead and partially takes the opposite side of the hedger's transactions. Also note that the sign of the second agent's optimal signal process in (23) is determined by the relation between the weights $w^3$ and $w^4$, which is in turn affected by the ratio between $\gamma$ and $\lambda$ (cf. also Figure 6).

Figure 4: The two-player Nash equilibrium strategies $\hat X^1$ for Player 1 (green) and $\hat X^2$ for Player 2 (orange), together with the processes $\hat\xi^i - w^5 \hat X^j$ ($i \neq j \in \{1,2\}$) from the optimal trading rates in (13) (same-color dashed lines). The first agent's frictionless delta-hedge $\xi^1$ is plotted in grey. For comparison, her corresponding single-player optimal hedging strategy with associated optimal signal process from [5] is depicted in black (solid and dashed). The parameters are $T = 5$, $\sigma = 1$, $\lambda = 1$, as well as $\gamma = 2$ (upper panel, plastic market) and $\gamma < \lambda$ (lower panel, elastic market).

Secondly, let us now assume that the second agent also hedges a one-tenth fraction of the same call option, i.e., $\xi^2 = \xi^1/10$ (with initial and final portfolio positions $x^2 = 1/20$ and $\Xi^2_T = 0$ $\mathbb{P}$-a.s.). The resulting Nash equilibrium strategies from Theorem 3.1 are presented in Figure 5, where we used the same realisation of the delta-hedge as in Figure 4. In a similar vein as in the deterministic case above, the second agent's optimal behaviour in the two-player Nash equilibrium changes notably compared to her optimal single-player frictional hedging strategy from Bank et al. [5]: she focuses more on preying on the first agent's larger hedging portfolio in a plastic market, or on providing liquidity to the latter in an elastic market.

Figure 5: The two-player Nash equilibrium strategies $\hat X^1$ for Player 1 (green) and $\hat X^2$ for Player 2 (orange), together with the processes $\hat\xi^i - w^5 \hat X^j$ ($i \neq j \in \{1,2\}$) from the optimal trading rates in (13) (same-color dashed lines). Only the second agent's frictionless delta-hedge $\xi^2 = \xi^1/10$ is plotted in grey (the first agent's target strategy $\xi^1$ is the same as in Figure 4 and omitted here). For comparison, the corresponding single-player optimal hedging strategies of the two agents together with their associated optimal signal processes from [5] are depicted in black (solid and dashed). The parameters are $T = 5$, $\sigma = 1$, $\lambda = 1$, as well as $\gamma = 2$ (upper panel, plastic market) and $\gamma < \lambda$ (lower panel, elastic market).

Figure 6: Exemplary illustration of the weight functions $w^1, w^2, w^3, w^4, w^5$ on $[0,T]$ defined in (12). The parameters are $T = 5$, $\sigma = 1$, $\lambda = 1$, as well as $\gamma = 2$ (upper panel, plastic market) and $\gamma < \lambda$ (lower panel, elastic market).

Inspired by Bank et al. [5] we will use tools from convex analysis and simple calculus of variations arguments to prove our main Theorem 3.1. Indeed, note that for any fixed $\alpha^2 \in \mathcal{A}^2$, the map $\alpha^1 \mapsto J^1(\alpha^1, \alpha^2)$ in (4) is strictly convex over the convex set $\mathcal{A}^1$. The same holds true for the map $\alpha^2 \mapsto J^2(\alpha^1, \alpha^2)$ in (5) over the convex set $\mathcal{A}^2$ for any fixed $\alpha^1 \in \mathcal{A}^1$. As a consequence, we immediately obtain the following lemma:

Lemma 5.1.

There exists at most one open-loop Nash equilibrium in the sense of Definition 2.2.
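The strict convexity behind this lemma can be illustrated numerically. The following is only a minimal sketch under stated assumptions: it uses a deterministic, time-discretised stand-in for the first agent's cost functional, whose running cost $\tfrac{\lambda}{2}\alpha^1(\alpha^1+\alpha^2) + \gamma\alpha^1(X^2 - x^2) + \tfrac{\sigma}{2}(X^1-\xi^1)^2$ is inferred from the expansion used in the proof of Lemma 5.2 rather than quoted from (4). The function names, the grid and the random test controls are hypothetical. The sketch checks that a perturbation of size $\varepsilon$ changes the cost by exactly a first-order term plus $\tfrac{1}{2}\varepsilon^2$ times a strictly positive quadratic term.

```python
import numpy as np

def inventory(alpha, x0, dt):
    """Inventory path X_k = x0 + dt * sum_{j<=k} alpha_j (simple Riemann sum)."""
    return x0 + dt * np.cumsum(alpha)

def cost_player1(a1, a2, xi1, x1, x2, lam, gamma, sigma, dt):
    """Discretised, deterministic stand-in for J^1 (assumed form, see lead-in)."""
    X1 = inventory(a1, x1, dt)
    X2 = inventory(a2, x2, dt)
    running = (0.5 * lam * a1 * (a1 + a2)      # temporary price impact
               + gamma * a1 * (X2 - x2)        # permanent impact of the competitor
               + 0.5 * sigma * (X1 - xi1) ** 2)  # tracking penalty
    return dt * np.sum(running)

def expansion_terms(a1, a2, b1, xi1, x1, x2, lam, gamma, sigma, dt):
    """First- and second-order terms of eps -> J^1(a1 + eps * b1, a2)."""
    X1 = inventory(a1, x1, dt)
    X2 = inventory(a2, x2, dt)
    B1 = dt * np.cumsum(b1)                    # running integral of the direction
    first = dt * np.sum(0.5 * lam * b1 * (2 * a1 + a2)
                        + B1 * (X1 - xi1) * sigma
                        + gamma * b1 * (X2 - x2))
    second = dt * np.sum(lam * b1 ** 2 + sigma * B1 ** 2)
    return first, second

# Hypothetical test data: random controls on a grid over [0, T].
rng = np.random.default_rng(0)
n, T = 200, 2.0
dt = T / n
lam, gamma, sigma = 1.0, 2.0, 1.0
x1, x2 = 1.0, 0.0
xi1 = np.zeros(n)
a1, a2, b1 = rng.normal(size=(3, n))
b1 -= b1.mean()                                # direction with zero time integral

eps = 0.1
params = (xi1, x1, x2, lam, gamma, sigma, dt)
lhs = cost_player1(a1 + eps * b1, a2, *params) - cost_player1(a1, a2, *params)
first, second = expansion_terms(a1, a2, b1, *params)
rhs = eps * first + 0.5 * eps ** 2 * second

print("expansion error:", abs(lhs - rhs))
print("second-order term positive:", second > 0)
```

Because the discretised cost is quadratic in the first agent's control, the expansion is exact up to floating-point error, and the positive second-order term is what makes the map strictly convex along any nonzero direction.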

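Given two controls $\tilde\alpha^1 \in \mathcal{A}^1$, $\tilde\alpha^2 \in \mathcal{A}^2$ we can introduce the Gâteaux derivatives of the mappings $\alpha^1 \mapsto J^1(\alpha^1, \tilde\alpha^2)$ at $\alpha^1 \in \mathcal{A}^1$ and $\alpha^2 \mapsto J^2(\tilde\alpha^1, \alpha^2)$ at $\alpha^2 \in \mathcal{A}^2$, respectively, in any directions $\beta^1, \beta^2 \in \mathcal{A}^0 \triangleq \{\beta : \beta \in \mathcal{A} \text{ satisfying } \int_0^T \beta_t\,dt = 0 \ \mathbb{P}\text{-a.s.}\}$, namely,
\[
\langle \nabla J^1(\alpha^1, \tilde\alpha^2), \beta^1 \rangle \triangleq \lim_{\varepsilon \to 0} \frac{J^1(\alpha^1 + \varepsilon\beta^1, \tilde\alpha^2) - J^1(\alpha^1, \tilde\alpha^2)}{\varepsilon},
\]
\[
\langle \nabla J^2(\tilde\alpha^1, \alpha^2), \beta^2 \rangle \triangleq \lim_{\varepsilon \to 0} \frac{J^2(\tilde\alpha^1, \alpha^2 + \varepsilon\beta^2) - J^2(\tilde\alpha^1, \alpha^2)}{\varepsilon},
\]
allowing for the following explicit expressions:

Lemma 5.2.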

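For $\alpha^1 \in \mathcal{A}^1$, $\alpha^2 \in \mathcal{A}^2$ we have
\[
\langle \nabla J^1(\alpha^1, \alpha^2), \beta^1 \rangle = \mathbb{E}\left[\int_0^T \beta^1_s \left(\lambda\alpha^1_s + \frac{\lambda}{2}\alpha^2_s + \gamma (X^2_s - x^2) + \int_s^T (X^1_t - \xi^1_t)\,\sigma\,dt\right) ds\right] \tag{24}
\]
and
\[
\langle \nabla J^2(\alpha^1, \alpha^2), \beta^2 \rangle = \mathbb{E}\left[\int_0^T \beta^2_s \left(\lambda\alpha^2_s + \frac{\lambda}{2}\alpha^1_s + \gamma (X^1_s - x^1) + \int_s^T (X^2_t - \xi^2_t)\,\sigma\,dt\right) ds\right] \tag{25}
\]
for any $\beta^1, \beta^2 \in \mathcal{A}^0$.

Proof. We only compute the Gâteaux derivative in (24). The same computations apply for (25). Let $\varepsilon > 0$ and $\alpha^1 \in \mathcal{A}^1$, $\alpha^2 \in \mathcal{A}^2$, $\beta^1 \in \mathcal{A}^0$. Note that $X^{\alpha^1 + \varepsilon\beta^1}_\cdot = X^{\alpha^1}_\cdot + \varepsilon \int_0^\cdot \beta^1_s\,ds$. Since
\[
J^1(\alpha^1 + \varepsilon\beta^1, \alpha^2) - J^1(\alpha^1, \alpha^2)
= \varepsilon\, \mathbb{E}\left[\int_0^T \left(\frac{\lambda}{2}\beta^1_t (2\alpha^1_t + \alpha^2_t) + \left(\int_0^t \beta^1_s\,ds\right)(X^1_t - \xi^1_t)\,\sigma + \gamma\beta^1_t (X^2_t - x^2)\right) dt\right]
+ \frac{1}{2}\varepsilon^2\, \mathbb{E}\left[\int_0^T \left(\lambda (\beta^1_t)^2 + \left(\int_0^t \beta^1_s\,ds\right)^2 \sigma\right) dt\right],
\]
we obtain the desired result in (24) after dividing by $\varepsilon$, letting $\varepsilon \to 0$ and applying Fubini's theorem.

Having at hand the explicit expressions in (24) and (25), we can derive a sufficient first order condition for a Nash equilibrium.

Lemma 5.3. Suppose that $(\hat X^1, \hat X^2)$ with controls $(\hat\alpha^1, \hat\alpha^2) \in \mathcal{A}^1 \times \mathcal{A}^2$ solves the following coupled forward backward SDE system
\[
\begin{cases}
dX^1_t = \alpha^1_t\,dt, & X^1_0 = x^1,\\
dX^2_t = \alpha^2_t\,dt, & X^2_0 = x^2,\\
d\alpha^1_t = \frac{\sigma}{\lambda}(X^1_t - \xi^1_t)\,dt - \frac{\gamma}{\lambda}\alpha^2_t\,dt - \frac{1}{2}\,d\alpha^2_t + dM^1_t, & X^1_T = \Xi^1_T,\\
d\alpha^2_t = \frac{\sigma}{\lambda}(X^2_t - \xi^2_t)\,dt - \frac{\gamma}{\lambda}\alpha^1_t\,dt - \frac{1}{2}\,d\alpha^1_t + dM^2_t, & X^2_T = \Xi^2_T,
\end{cases} \tag{26}
\]
for two suitable square integrable martingales $(M^1_t)_{0 \le t < T}$ […]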

12 ( ˆ α t − ˆ α ) + M t d P ⊗ dt -a.e. on Ω × [0 , T )for some square integrable martingale ( M t ) ≤ t

In view of Lemma 5.3 we merely have to show that $(\hat X^1, \hat X^2, \hat\alpha^1, \hat\alpha^2)$ with dynamics described in Theorem 3.1, equation (13), is a solution of the FBSDE system in (26) with some suitable square integrable martingales $(M^1_t)_{0 \le t < T}$ […]

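We start with computing the dynamics of the controls $\hat\alpha^1$ and $\hat\alpha^2$ in (13). To this end, it is convenient to rewrite $w^1, w^2$ in (12), as well as $\hat\xi^1$ in (14) and $\hat\xi^2$ in (15), by introducing
\[
\tilde w^1_t \triangleq (c^+_t + c^-_t)\,w^1_t, \qquad \tilde w^2_t \triangleq (c^+_t + c^-_t)\,w^2_t \qquad (0 \le t < T) \tag{29}
\]
and
\[
\tilde\xi^1_t \triangleq (c^+_t + c^-_t)\,\hat\xi^1_t, \qquad \tilde\xi^2_t \triangleq (c^+_t + c^-_t)\,\hat\xi^2_t \qquad (0 \le t < T). \tag{30}
\]
Moreover, setting
\[
Y^+_t \triangleq \int_0^t (\xi^1_s + \xi^2_s)\,\frac{2\sigma}{\sqrt{\delta^+}}\,e^{-\frac{\gamma}{3\lambda}(T-s)} \sinh\!\big(\sqrt{\delta^+}(T-s)/(3\lambda)\big)\,ds, \qquad
M^+_t \triangleq \mathbb{E}\big[\Xi^1_T + \Xi^2_T + Y^+_T \,\big|\, \mathcal{F}_t\big] \tag{31}
\]
and
\[
Y^-_t \triangleq \int_0^t (\xi^1_s - \xi^2_s)\,\frac{2\sigma}{\sqrt{\delta^-}}\,e^{\frac{\gamma}{\lambda}(T-s)} \sinh\!\big(\sqrt{\delta^-}(T-s)/\lambda\big)\,ds, \qquad
M^-_t \triangleq \mathbb{E}\big[\Xi^1_T - \Xi^2_T + Y^-_T \,\big|\, \mathcal{F}_t\big] \tag{32}
\]
for all $0 \le t \le T$, we obtain the representations
\[
\tilde\xi^1_t = \tilde w^1_t (M^+_t - Y^+_t) + \tilde w^2_t (M^-_t - Y^-_t), \qquad
\tilde\xi^2_t = \tilde w^1_t (M^+_t - Y^+_t) - \tilde w^2_t (M^-_t - Y^-_t) \qquad (0 \le t < T). \tag{33}
\]
In particular,
\[
\tilde\xi^1_t + \tilde\xi^2_t = 2\tilde w^1_t (M^+_t - Y^+_t), \qquad \tilde\xi^1_t - \tilde\xi^2_t = 2\tilde w^2_t (M^-_t - Y^-_t) \tag{34}
\]
on $[0,T)$. Note that $\Xi^1_T, \Xi^2_T, Y^+_T, Y^-_T \in L^2(\mathbb{P})$ implies that $(M^+_t)_{0 \le t \le T}$ and $(M^-_t)_{0 \le t \le T}$ are square integrable martingales. Also, observe that the processes $Y^+, M^+, Y^-, M^- \in L^2(\mathbb{P} \otimes dt)$. We can now rewrite (13) as
\[
\hat\alpha^1_t = \frac{1}{2\lambda}\big(\tilde\xi^1_t - c^+_t \hat X^2_t + c^-_t \hat X^2_t - c^+_t \hat X^1_t - c^-_t \hat X^1_t\big), \qquad
\hat\alpha^2_t = \frac{1}{2\lambda}\big(\tilde\xi^2_t - c^+_t \hat X^1_t + c^-_t \hat X^1_t - c^+_t \hat X^2_t - c^-_t \hat X^2_t\big) \qquad (0 \le t < T). \tag{35}
\]
Next, for $\tilde w^1, \tilde w^2$ in (29) one can easily check that
\[
(\tilde w^1_t)' = \tilde w^1_t \left(\frac{1}{\lambda} c^+_t - \frac{2\gamma}{3\lambda}\right), \qquad
(\tilde w^2_t)' = \tilde w^2_t \left(\frac{1}{\lambda} c^-_t + \frac{2\gamma}{\lambda}\right) \qquad (0 \le t < T). \tag{36}
\]
Hence, by applying integration by parts in (33) we obtain the dynamics
\[
d\tilde\xi^1_t = \tilde w^1_t (M^+_t - Y^+_t)\left(\frac{1}{\lambda} c^+_t - \frac{2\gamma}{3\lambda}\right) dt - \frac{2\sigma}{3}(\xi^1_t + \xi^2_t)\,dt
+ \tilde w^2_t (M^-_t - Y^-_t)\left(\frac{1}{\lambda} c^-_t + \frac{2\gamma}{\lambda}\right) dt - 2\sigma(\xi^1_t - \xi^2_t)\,dt
+ \tilde w^1_t\,dM^+_t + \tilde w^2_t\,dM^-_t \qquad (0 \le t < T) \tag{37}
\]
and
\[
d\tilde\xi^2_t = \tilde w^1_t (M^+_t - Y^+_t)\left(\frac{1}{\lambda} c^+_t - \frac{2\gamma}{3\lambda}\right) dt - \frac{2\sigma}{3}(\xi^1_t + \xi^2_t)\,dt
- \tilde w^2_t (M^-_t - Y^-_t)\left(\frac{1}{\lambda} c^-_t + \frac{2\gamma}{\lambda}\right) dt + 2\sigma(\xi^1_t - \xi^2_t)\,dt
+ \tilde w^1_t\,dM^+_t - \tilde w^2_t\,dM^-_t \qquad (0 \le t < T). \tag{38}
\]
Now, having at hand (37) and (38), as well as the fact that the functions $c^+, c^-$ in (11) satisfy the ordinary Riccati differential equations
\[
(c^+_t)' = \frac{(c^+_t)^2}{\lambda} - \frac{2\gamma}{3\lambda} c^+_t - \frac{2}{3}\sigma, \qquad
(c^-_t)' = \frac{(c^-_t)^2}{\lambda} + \frac{2\gamma}{\lambda} c^-_t - 2\sigma \qquad (0 \le t < T), \tag{39}
\]
an elementary but tedious computation reveals that the dynamics of $\hat\alpha^1$ and $\hat\alpha^2$ in (35) on $[0,T)$ are given by
\[
d\hat\alpha^1_t = \hat X^1_t \left(\frac{4\sigma}{3\lambda} + \frac{\gamma}{3\lambda^2} c^+_t - \frac{\gamma}{\lambda^2} c^-_t\right) dt - \frac{4\sigma}{3\lambda}\xi^1_t\,dt + \frac{\gamma}{3\lambda^2}\tilde\xi^1_t\,dt
+ \hat X^2_t \left(-\frac{2\sigma}{3\lambda} + \frac{\gamma}{3\lambda^2} c^+_t + \frac{\gamma}{\lambda^2} c^-_t\right) dt + \frac{2\sigma}{3\lambda}\xi^2_t\,dt - \frac{2\gamma}{3\lambda^2}\tilde\xi^2_t\,dt
+ \frac{\tilde w^1_t}{2\lambda}\,dM^+_t + \frac{\tilde w^2_t}{2\lambda}\,dM^-_t \tag{40}
\]
and, similarly, by
\[
d\hat\alpha^2_t = \hat X^2_t \left(\frac{4\sigma}{3\lambda} + \frac{\gamma}{3\lambda^2} c^+_t - \frac{\gamma}{\lambda^2} c^-_t\right) dt - \frac{4\sigma}{3\lambda}\xi^2_t\,dt + \frac{\gamma}{3\lambda^2}\tilde\xi^2_t\,dt
+ \hat X^1_t \left(-\frac{2\sigma}{3\lambda} + \frac{\gamma}{3\lambda^2} c^+_t + \frac{\gamma}{\lambda^2} c^-_t\right) dt + \frac{2\sigma}{3\lambda}\xi^1_t\,dt - \frac{2\gamma}{3\lambda^2}\tilde\xi^1_t\,dt
+ \frac{\tilde w^1_t}{2\lambda}\,dM^+_t - \frac{\tilde w^2_t}{2\lambda}\,dM^-_t, \tag{41}
\]
where we also employed the identities in (34). As a consequence, using the representations in (35) we obtain
\[
d\hat\alpha^1_t + \frac{1}{2}\,d\hat\alpha^2_t
= \frac{\sigma}{\lambda}(\hat X^1_t - \xi^1_t)\,dt - \frac{\gamma}{2\lambda^2}\big(\tilde\xi^2_t - c^+_t \hat X^1_t + c^-_t \hat X^1_t - c^+_t \hat X^2_t - c^-_t \hat X^2_t\big)\,dt + \frac{3}{4\lambda}\tilde w^1_t\,dM^+_t + \frac{1}{4\lambda}\tilde w^2_t\,dM^-_t
= \frac{\sigma}{\lambda}(\hat X^1_t - \xi^1_t)\,dt - \frac{\gamma}{\lambda}\hat\alpha^2_t\,dt + \frac{3}{4\lambda}\tilde w^1_t\,dM^+_t + \frac{1}{4\lambda}\tilde w^2_t\,dM^-_t \qquad (0 \le t < T)
\]
and
\[
d\hat\alpha^2_t + \frac{1}{2}\,d\hat\alpha^1_t
= \frac{\sigma}{\lambda}(\hat X^2_t - \xi^2_t)\,dt - \frac{\gamma}{2\lambda^2}\big(\tilde\xi^1_t - c^+_t \hat X^2_t + c^-_t \hat X^2_t - c^+_t \hat X^1_t - c^-_t \hat X^1_t\big)\,dt + \frac{3}{4\lambda}\tilde w^1_t\,dM^+_t - \frac{1}{4\lambda}\tilde w^2_t\,dM^-_t
= \frac{\sigma}{\lambda}(\hat X^2_t - \xi^2_t)\,dt - \frac{\gamma}{\lambda}\hat\alpha^1_t\,dt + \frac{3}{4\lambda}\tilde w^1_t\,dM^+_t - \frac{1}{4\lambda}\tilde w^2_t\,dM^-_t \qquad (0 \le t < T).
\]
In other words, the pair $(\hat\alpha^1, \hat\alpha^2)$ described in (13) satisfies the dynamics of the FBSDE system in (26), where $\int_0^\cdot \tilde w^1_t\,dM^+_t$ and $\int_0^\cdot \tilde w^2_t\,dM^-_t$ are square integrable martingales on $[0,T)$ providing the ingredients for $M^1$ and $M^2$.

Step 2: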

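Next, we have to check the terminal conditions of the FBSDE system in (26), that is, that $\lim_{t \uparrow T} \hat X^1_t = \Xi^1_T$ and $\lim_{t \uparrow T} \hat X^2_t = \Xi^2_T$ $\mathbb{P}$-a.s. hold true for the pair of solutions $(\hat X^1, \hat X^2)$ of the coupled ODE in (13). We adapt the argument from Bank et al. [5], which employs a simple comparison principle for ordinary differential equations, to our current setting. Specifically, note that it suffices to show that
\[
\lim_{t \uparrow T} (\hat X^1_t + \hat X^2_t) = \Xi^1_T + \Xi^2_T \quad \mathbb{P}\text{-a.s.} \tag{42}
\]
and
\[
\lim_{t \uparrow T} (\hat X^1_t - \hat X^2_t) = \Xi^1_T - \Xi^2_T \quad \mathbb{P}\text{-a.s.}, \tag{43}
\]
where, using the dynamics in (13) and the definition of $w^5$ in (12), the processes $\hat X^1 + \hat X^2$ and $\hat X^1 - \hat X^2$ satisfy, respectively, the ODEs
\[
d(\hat X^1_t + \hat X^2_t) = \frac{c^+_t + c^-_t}{2\lambda}\left(\hat\xi^1_t + \hat\xi^2_t - w^5_t \hat X^2_t - w^5_t \hat X^1_t - \hat X^1_t - \hat X^2_t\right) dt
= \frac{c^+_t}{\lambda}\left(\frac{\hat\xi^1_t + \hat\xi^2_t}{1 + w^5_t} - (\hat X^1_t + \hat X^2_t)\right) dt \qquad (0 \le t < T) \tag{44}
\]
and
\[
d(\hat X^1_t - \hat X^2_t) = \frac{c^+_t + c^-_t}{2\lambda}\left(\hat\xi^1_t - \hat\xi^2_t + w^5_t \hat X^1_t - w^5_t \hat X^2_t - \hat X^1_t + \hat X^2_t\right) dt
= \frac{c^-_t}{\lambda}\left(\frac{\hat\xi^1_t - \hat\xi^2_t}{1 - w^5_t} - (\hat X^1_t - \hat X^2_t)\right) dt \qquad (0 \le t < T). \tag{45}
\]
Note that $w^5_t \in (-1, 1)$ for all $t \in [0,T]$ by virtue of Lemma 3.4, 1.). First, analogously to (33), let us rewrite $\hat\xi^1$ and $\hat\xi^2$ in (14) and (15) as
\[
\hat\xi^1_t = w^1_t (M^+_t - Y^+_t) + w^2_t (M^-_t - Y^-_t), \qquad
\hat\xi^2_t = w^1_t (M^+_t - Y^+_t) - w^2_t (M^-_t - Y^-_t) \qquad (0 \le t \le T) \tag{46}
\]
with $Y^+, M^+, Y^-, M^-$ as defined in (31) and (32). Hence, we can consider a càdlàg version of the processes $(\hat\xi^1_t)_{0 \le t \le T}$ and $(\hat\xi^2_t)_{0 \le t \le T}$ and obtain, together with Lemma 3.4, 2.), the $\mathbb{P}$-a.s. limits
\[
\lim_{t \uparrow T} \hat\xi^1_t = \frac{1}{2}\mathbb{E}[\Xi^1_T + \Xi^2_T \mid \mathcal{F}_{T-}] + \frac{1}{2}\mathbb{E}[\Xi^1_T - \Xi^2_T \mid \mathcal{F}_{T-}] = \Xi^1_T
\]
and
\[
\lim_{t \uparrow T} \hat\xi^2_t = \frac{1}{2}\mathbb{E}[\Xi^1_T + \Xi^2_T \mid \mathcal{F}_{T-}] - \frac{1}{2}\mathbb{E}[\Xi^1_T - \Xi^2_T \mid \mathcal{F}_{T-}] = \Xi^2_T
\]
due to the $\mathcal{F}_{T-}$-measurability of $\Xi^1_T$ and $\Xi^2_T$ by virtue of our assumption in (9). In particular, since $\lim_{t \uparrow T} w^5_t = 0$ because of Lemma 3.4, 2.), it also holds that
\[
\lim_{t \uparrow T} \frac{\hat\xi^1_t + \hat\xi^2_t}{1 + w^5_t} = \Xi^1_T + \Xi^2_T \quad\text{and}\quad \lim_{t \uparrow T} \frac{\hat\xi^1_t - \hat\xi^2_t}{1 - w^5_t} = \Xi^1_T - \Xi^2_T \quad \mathbb{P}\text{-a.s.} \tag{47}
\]
Let us now start with proving the limit in (42). As a consequence of (47), for every $\varepsilon > 0$ there exists a random time $\tau_\varepsilon \in [0,T)$ such that $\mathbb{P}$-a.s.
\[
\Xi^1_T + \Xi^2_T - \varepsilon \le \frac{\hat\xi^1_t + \hat\xi^2_t}{1 + w^5_t} \le \Xi^1_T + \Xi^2_T + \varepsilon \quad \text{for all } t \in [\tau_\varepsilon, T). \tag{48}
\]
Next, define $Y^{+,\varepsilon}_t \triangleq \Xi^1_T + \Xi^2_T + \varepsilon - (\hat X^1_t + \hat X^2_t)$ for all $t \in [0,T)$ so that
\[
Y^{+,\varepsilon}_t \ge \frac{\hat\xi^1_t + \hat\xi^2_t}{1 + w^5_t} - (\hat X^1_t + \hat X^2_t) \quad \text{for all } t \in [\tau_\varepsilon, T). \tag{49}
\]
Together with the dynamics of $\hat X^1 + \hat X^2$ in (44) this yields
\[
dY^{+,\varepsilon}_t = -d(\hat X^1_t + \hat X^2_t) = -\frac{c^+_t}{\lambda}\left(\frac{\hat\xi^1_t + \hat\xi^2_t}{1 + w^5_t} - (\hat X^1_t + \hat X^2_t)\right) dt \ge -\frac{c^+_t}{\lambda} Y^{+,\varepsilon}_t\,dt \quad \text{on } [\tau_\varepsilon, T). \tag{50}
\]
Moreover, for all $\omega \in \Omega$ the linear ODE on $[\tau_\varepsilon(\omega), T)$ given by
\[
Z^{+,\varepsilon}_{\tau_\varepsilon(\omega)} = Y^{+,\varepsilon}_{\tau_\varepsilon(\omega)}(\omega), \qquad dZ^{+,\varepsilon}_t = -\frac{c^+_t}{\lambda} Z^{+,\varepsilon}_t\,dt
\]
admits the solution
\[
Z^{+,\varepsilon}_t = Y^{+,\varepsilon}_{\tau_\varepsilon(\omega)}(\omega) \cdot e^{-\int_{\tau_\varepsilon}^t \frac{c^+_s}{\lambda}\,ds}
= Y^{+,\varepsilon}_{\tau_\varepsilon(\omega)}(\omega) \cdot e^{-\frac{\gamma}{3\lambda}(t - \tau_\varepsilon)} \cdot \frac{\sinh\!\big(\sqrt{\delta^+}(T-t)/(3\lambda)\big)}{\sinh\!\big(\sqrt{\delta^+}(T-\tau_\varepsilon)/(3\lambda)\big)} \qquad (\tau_\varepsilon \le t < T)
\]
with $\lim_{t \uparrow T} Z^{+,\varepsilon}_t = 0$. The comparison principle for ODEs applied to (50) therefore implies that $Y^{+,\varepsilon}_t \ge Z^{+,\varepsilon}_t$ for all $t \in [\tau_\varepsilon, T)$ and thus
\[
\liminf_{t \uparrow T} Y^{+,\varepsilon}_t \ge \lim_{t \uparrow T} Z^{+,\varepsilon}_t = 0 \quad \mathbb{P}\text{-a.s.},
\]
or, equivalently,
\[
\limsup_{t \uparrow T} (\hat X^1_t + \hat X^2_t) \le \Xi^1_T + \Xi^2_T + \varepsilon \quad \mathbb{P}\text{-a.s.} \tag{51}
\]
Next, in a similar way, set $\tilde Y^{+,\varepsilon}_t \triangleq \Xi^1_T + \Xi^2_T - \varepsilon - (\hat X^1_t + \hat X^2_t)$ for all $t \in [0,T)$ and observe as above from (48) that $\mathbb{P}$-a.s. on $[\tau_\varepsilon, T)$ it holds that $d\tilde Y^{+,\varepsilon}_t \le -\frac{c^+_t}{\lambda}\tilde Y^{+,\varepsilon}_t\,dt$ and hence
\[
\limsup_{t \uparrow T} \tilde Y^{+,\varepsilon}_t \le \lim_{t \uparrow T} Z^{+,\varepsilon}_t = 0 \quad \mathbb{P}\text{-a.s.}
\]
by the comparison principle, where $Z^{+,\varepsilon}$ now denotes the solution of the same linear ODE started from $\tilde Y^{+,\varepsilon}_{\tau_\varepsilon}$. That is,
\[
\liminf_{t \uparrow T} (\hat X^1_t + \hat X^2_t) \ge \Xi^1_T + \Xi^2_T - \varepsilon \quad \mathbb{P}\text{-a.s.},
\]
which, together with (51) and since $\varepsilon > 0$ was arbitrary, yields the limit in (42).

The limit in (43) follows along the same lines. Indeed, simply note that (47) implies, similar to (48), that $\mathbb{P}$-a.s. for every $\varepsilon > 0$ there exists $\tau'_\varepsilon \in [0,T)$ such that
\[
\Xi^1_T - \Xi^2_T - \varepsilon \le \frac{\hat\xi^1_t - \hat\xi^2_t}{1 - w^5_t} \le \Xi^1_T - \Xi^2_T + \varepsilon \quad \text{for all } t \in [\tau'_\varepsilon, T).
\]
Define $Y^{-,\varepsilon}_t \triangleq \Xi^1_T - \Xi^2_T + \varepsilon - (\hat X^1_t - \hat X^2_t)$ and $\tilde Y^{-,\varepsilon}_t \triangleq \Xi^1_T - \Xi^2_T - \varepsilon - (\hat X^1_t - \hat X^2_t)$ for all $t \in [0,T)$. By using the dynamics of $\hat X^1 - \hat X^2$ in (45) we can once more apply the comparison principle on the interval $[\tau'_\varepsilon, T)$ for the ODEs of $Y^{-,\varepsilon}$ and $\tilde Y^{-,\varepsilon}$ together with the linear ODE
\[
Z^{-,\varepsilon}_{\tau'_\varepsilon} = z \in \mathbb{R}, \qquad dZ^{-,\varepsilon}_t = -\frac{c^-_t}{\lambda} Z^{-,\varepsilon}_t\,dt,
\]
which admits the solution
\[
Z^{-,\varepsilon}_t = z\,e^{-\int_{\tau'_\varepsilon}^t \frac{c^-_s}{\lambda}\,ds} = z \cdot e^{\frac{\gamma}{\lambda}(t - \tau'_\varepsilon)}\,\frac{\sinh\!\big(\sqrt{\delta^-}(T-t)/\lambda\big)}{\sinh\!\big(\sqrt{\delta^-}(T-\tau'_\varepsilon)/\lambda\big)} \qquad (\tau'_\varepsilon \le t < T)
\]
such that $\lim_{t \uparrow T} Z^{-,\varepsilon}_t = 0$, to finally conclude that
\[
\Xi^1_T - \Xi^2_T - \varepsilon \le \liminf_{t \uparrow T} (\hat X^1_t - \hat X^2_t) \le \limsup_{t \uparrow T} (\hat X^1_t - \hat X^2_t) \le \Xi^1_T - \Xi^2_T + \varepsilon
\]
as desired.

Step 3: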

It is left to argue that the controls $\hat\alpha^1, \hat\alpha^2$ described in (13) belong to the set $\mathcal{A}$ in (2), i.e., $\hat\alpha^1, \hat\alpha^2 \in L^2(\mathbb{P} \otimes dt)$. To achieve this we will follow a similar strategy as in Bank et al. [5]. For simplicity, we will assume without loss of generality that $x^1 = x^2 = 0$. Because of the coupling of $\hat\alpha^1, \hat\alpha^2$ in (13) it is more convenient to prove that $\hat\alpha^+ \triangleq \hat\alpha^1 + \hat\alpha^2 \in L^2(\mathbb{P} \otimes dt)$ and $\hat\alpha^- \triangleq \hat\alpha^1 - \hat\alpha^2 \in L^2(\mathbb{P} \otimes dt)$, where we set $\hat X^+_\cdot \triangleq \int_0^\cdot \hat\alpha^+_s\,ds$ and $\hat X^-_\cdot \triangleq \int_0^\cdot \hat\alpha^-_s\,ds$. Recall from (44) and (45) above that we then have
\[
\hat\alpha^+_t = \frac{c^+_t}{\lambda}\left(\frac{\hat\xi^1_t + \hat\xi^2_t}{1 + w^5_t} - \hat X^+_t\right), \qquad
\hat\alpha^-_t = \frac{c^-_t}{\lambda}\left(\frac{\hat\xi^1_t - \hat\xi^2_t}{1 - w^5_t} - \hat X^-_t\right) \tag{52}
\]
on $[0,T)$.

We start with showing that $\hat\alpha^+ \in L^2(\mathbb{P} \otimes dt)$. For this purpose, observe that it suffices to examine the following two cases separately.

Case 1.1: $\xi^1 \equiv \xi^2 \equiv 0$. In this case $Y^+ \equiv 0$, so that (46) gives $\hat\xi^1_t + \hat\xi^2_t = 2 w^1_t M^+_t$. Moreover, the explicit solutions in (18) and (19) yield
\[
\hat X^+_t = e^{-\int_0^t \frac{c^+_u}{\lambda}\,du} \int_0^t \frac{c^+_s + c^-_s}{\lambda}\,w^1_s M^+_s\,e^{\int_0^s \frac{c^+_u}{\lambda}\,du}\,ds
= e^{\frac{\gamma}{3\lambda}(T-t)} \sinh\!\big(\sqrt{\delta^+}(T-t)/(3\lambda)\big) \int_0^t \frac{M^+_s \sqrt{\delta^+}}{3\lambda \sinh^2\!\big(\sqrt{\delta^+}(T-s)/(3\lambda)\big)}\,ds \qquad (0 \le t < T). \tag{53}
\]
Introducing the deterministic and differentiable function $f^+_s \triangleq 1/\sinh\!\big(\sqrt{\delta^+}(T-s)/(3\lambda)\big)$ on $[0,T)$ allows us to rewrite the integral in (53) by applying integration by parts as
\[
\int_0^t \frac{M^+_s \sqrt{\delta^+}}{3\lambda \sinh^2\!\big(\sqrt{\delta^+}(T-s)/(3\lambda)\big)}\,ds = \int_0^t \tilde M^+_s\,df^+_s = \tilde M^+_t f^+_t - \tilde M^+_0 f^+_0 - \int_0^t f^+_s\,d\tilde M^+_s \qquad (0 \le t < T), \tag{54}
\]
where $\tilde M^+_t \triangleq M^+_t / \cosh\!\big(\sqrt{\delta^+}(T-t)/(3\lambda)\big)$ for all $t \in [0,T)$. Moreover, we have that
\[
\frac{\hat\xi^1_t + \hat\xi^2_t}{1 + w^5_t} = \frac{\sqrt{\delta^+}\,e^{\frac{\gamma}{3\lambda}(T-t)}}{3 c^+_t \sinh\!\big(\sqrt{\delta^+}(T-t)/(3\lambda)\big)}\,M^+_t \qquad (0 \le t < T). \tag{55}
\]
Now, plugging (55) and (53) together with (54) into $\hat\alpha^+$ in (52) yields, after some elementary computations,
\[
\hat\alpha^+_t = -\frac{\gamma}{3\lambda}\,e^{\frac{\gamma}{3\lambda}(T-t)}\,\tilde M^+_t
+ \frac{c^+_t}{\lambda}\,e^{\frac{\gamma}{3\lambda}(T-t)} \sinh\!\big(\sqrt{\delta^+}(T-t)/(3\lambda)\big)\,\tilde M^+_0 f^+_0
+ \frac{c^+_t}{\lambda}\,e^{\frac{\gamma}{3\lambda}(T-t)} \sinh\!\big(\sqrt{\delta^+}(T-t)/(3\lambda)\big) \int_0^t f^+_s\,d\tilde M^+_s \qquad (0 \le t < T). \tag{56}
\]
Since $c^+_t \sinh\!\big(\sqrt{\delta^+}(T-t)/(3\lambda)\big)$ is bounded on $[0,T]$ and $\tilde M^+ \in L^2(\mathbb{P} \otimes dt)$ (recall that $M^+$ in (31) belongs to $L^2(\mathbb{P} \otimes dt)$), the first two terms in (56) are in $L^2(\mathbb{P} \otimes dt)$. For the stochastic integral, we obtain
\[
\int_0^t f^+_s\,d\tilde M^+_s = \int_0^t \frac{\sqrt{\delta^+}\,\tilde M^+_s}{3\lambda \cosh\!\big(\sqrt{\delta^+}(T-s)/(3\lambda)\big)}\,ds + \int_0^t \frac{f^+_s}{\cosh\!\big(\sqrt{\delta^+}(T-s)/(3\lambda)\big)}\,dM^+_s,
\]
where the first integral on the right is again an element of $L^2(\mathbb{P} \otimes dt)$. The second integral satisfies
\[
\mathbb{E}\left[\int_0^T \left(\int_0^t \frac{f^+_s}{\cosh\!\big(\sqrt{\delta^+}(T-s)/(3\lambda)\big)}\,dM^+_s\right)^2 dt\right]
= \mathbb{E}\left[\int_0^T \int_0^t \left(\frac{f^+_s}{\cosh\!\big(\sqrt{\delta^+}(T-s)/(3\lambda)\big)}\right)^2 d\langle M^+\rangle_s\,dt\right]
= \mathbb{E}\left[\int_0^T \frac{(T-s)\,(f^+_s)^2}{\cosh^2\!\big(\sqrt{\delta^+}(T-s)/(3\lambda)\big)}\,d\langle M^+\rangle_s\right]
\le \frac{9\lambda^2}{\delta^+}\,\mathbb{E}\left[\int_0^T \frac{1}{T-s}\,d\langle M^+\rangle_s\right] < \infty \tag{57}
\]
by our assumption in (9), where we also used Fubini's theorem twice and the fact that $\sinh(\tau) \ge \tau$ and $\cosh(\tau) \ge 1$ for all $\tau \ge 0$. That is, we obtain that $\hat\alpha^+ \in L^2(\mathbb{P} \otimes dt)$ in this case.

Case 1.2: $\Xi^1_T = \Xi^2_T = 0$. In this case, we obtain from the expressions in (14) and (15) that
\[
\hat\xi^1_t + \hat\xi^2_t = 2 w^3_t\,\mathbb{E}\left[\int_t^T (\xi^1_u + \xi^2_u) K^1(t,u)\,du \,\Big|\, \mathcal{F}_t\right] \qquad (0 \le t \le T)
\]
and thus, using again the explicit representation for $\hat X^+ = \hat X^1 + \hat X^2$ from (18) and (19), $\hat\alpha^+$ in (52) becomes
\[
\hat\alpha^+_t = \frac{c^+_t}{\lambda}\left(\frac{\hat\xi^1_t + \hat\xi^2_t}{1 + w^5_t} - \hat X^+_t\right)
= \frac{2 c^+_t w^3_t}{\lambda (1 + w^5_t)}\,\mathbb{E}\left[\int_t^T (\xi^1_u + \xi^2_u) K^1(t,u)\,du \,\Big|\, \mathcal{F}_t\right]
- \frac{c^+_t}{\lambda}\,e^{-\int_0^t \frac{c^+_u}{\lambda}\,du} \int_0^t \frac{(c^+_s + c^-_s)\,w^3_s}{\lambda}\,e^{\int_0^s \frac{c^+_u}{\lambda}\,du}\,\mathbb{E}\left[\int_s^T (\xi^1_u + \xi^2_u) K^1(s,u)\,du \,\Big|\, \mathcal{F}_s\right] ds. \tag{58}
\]
All the ratios in (58) involving $c^+, c^-$ are bounded on $[0,T]$. Moreover, by Lemma 5.4 we have
\[
\mathbb{E}\left[\int_t^T (\xi^1_u + \xi^2_u) K^1(t,u)\,du \,\Big|\, \mathcal{F}_t\right] \in L^2(\mathbb{P} \otimes dt),
\]
as well as
\[
\mathbb{E}\left[\int_0^T \left(\int_0^t \mathbb{E}\left[\int_s^T (\xi^1_u + \xi^2_u) K^1(s,u)\,du \,\Big|\, \mathcal{F}_s\right] ds\right)^2 dt\right]
\le T^2\,\mathbb{E}\left[\int_0^T \left(\mathbb{E}\left[\int_s^T (\xi^1_u + \xi^2_u) K^1(s,u)\,du \,\Big|\, \mathcal{F}_s\right]\right)^2 ds\right] < \infty
\]
by using Jensen's inequality. As a consequence, we can also conclude in this case that $\hat\alpha^+$ belongs to $L^2(\mathbb{P} \otimes dt)$.

Let us now argue that $\hat\alpha^-$ in (52) also belongs to $L^2(\mathbb{P} \otimes dt)$. The argument is very similar to the one presented above, so we only sketch the main steps. Again, it is enough to investigate the following two separate cases.

Case 2.1: $\xi^1 \equiv \xi^2 \equiv 0$. With $\hat\xi^1_t - \hat\xi^2_t = 2 w^2_t M^-_t$ from (46) we obtain via (18) and (19) the representation
\[
\hat X^-_t = e^{-\int_0^t \frac{c^-_u}{\lambda}\,du} \int_0^t \frac{c^+_s + c^-_s}{\lambda}\,w^2_s M^-_s\,e^{\int_0^s \frac{c^-_u}{\lambda}\,du}\,ds
= e^{-\frac{\gamma}{\lambda}(T-t)} \sinh\!\big(\sqrt{\delta^-}(T-t)/\lambda\big) \int_0^t \frac{M^-_s \sqrt{\delta^-}}{\lambda \sinh^2\!\big(\sqrt{\delta^-}(T-s)/\lambda\big)}\,ds \qquad (0 \le t < T). \tag{59}
\]
Setting $f^-_s \triangleq 1/\sinh\!\big(\sqrt{\delta^-}(T-s)/\lambda\big)$ on $[0,T)$ we can rewrite the integral in (59) as
\[
\int_0^t \tilde M^-_s\,df^-_s = \tilde M^-_t f^-_t - \tilde M^-_0 f^-_0 - \int_0^t f^-_s\,d\tilde M^-_s \qquad (0 \le t < T) \tag{60}
\]
with $\tilde M^-_t \triangleq M^-_t / \cosh\!\big(\sqrt{\delta^-}(T-t)/\lambda\big)$ for all $t \in [0,T)$. In addition,
\[
\frac{\hat\xi^1_t - \hat\xi^2_t}{1 - w^5_t} = \frac{\sqrt{\delta^-}\,e^{-\frac{\gamma}{\lambda}(T-t)}}{c^-_t \sinh\!\big(\sqrt{\delta^-}(T-t)/\lambda\big)}\,M^-_t \qquad (0 \le t < T). \tag{61}
\]
Inserting (61) and (59) together with (60) into $\hat\alpha^-$ in (52) then yields
\[
\hat\alpha^-_t = \frac{\gamma}{\lambda}\,e^{-\frac{\gamma}{\lambda}(T-t)}\,\tilde M^-_t
+ \frac{c^-_t}{\lambda}\,e^{-\frac{\gamma}{\lambda}(T-t)} \sinh\!\big(\sqrt{\delta^-}(T-t)/\lambda\big)\,\tilde M^-_0 f^-_0
+ \frac{c^-_t}{\lambda}\,e^{-\frac{\gamma}{\lambda}(T-t)} \sinh\!\big(\sqrt{\delta^-}(T-t)/\lambda\big) \int_0^t f^-_s\,d\tilde M^-_s \qquad (0 \le t < T), \tag{62}
\]
where
\[
\int_0^t f^-_s\,d\tilde M^-_s = \int_0^t \frac{\sqrt{\delta^-}\,\tilde M^-_s}{\lambda \cosh\!\big(\sqrt{\delta^-}(T-s)/\lambda\big)}\,ds + \int_0^t \frac{f^-_s}{\cosh\!\big(\sqrt{\delta^-}(T-s)/\lambda\big)}\,dM^-_s. \tag{63}
\]
Observe as in (56) above that $c^-_t \sinh\!\big(\sqrt{\delta^-}(T-t)/\lambda\big)$ is bounded on $[0,T]$ and that $\tilde M^- \in L^2(\mathbb{P} \otimes dt)$. Therefore, we only need to justify that the stochastic integral in (63) belongs to $L^2(\mathbb{P} \otimes dt)$. Indeed, by the same computations as in (57), we obtain via our assumption in (9) that
\[
\mathbb{E}\left[\int_0^T \left(\int_0^t \frac{f^-_s}{\cosh\!\big(\sqrt{\delta^-}(T-s)/\lambda\big)}\,dM^-_s\right)^2 dt\right] \le \frac{\lambda^2}{\delta^-}\,\mathbb{E}\left[\int_0^T \frac{1}{T-s}\,d\langle M^-\rangle_s\right] < \infty. \tag{64}
\]
Hence, we can conclude that $\hat\alpha^- \in L^2(\mathbb{P} \otimes dt)$ in this case.

Case 2.2: $\Xi^1_T = \Xi^2_T = 0$. Here, similar to (58) above, (14) and (15) imply that
\[
\hat\xi^1_t - \hat\xi^2_t = 2 w^4_t\,\mathbb{E}\left[\int_t^T (\xi^1_u - \xi^2_u) K^2(t,u)\,du \,\Big|\, \mathcal{F}_t\right] \qquad (0 \le t \le T)
\]
and hence, together with $\hat X^- = \hat X^1 - \hat X^2$ from (18) and (19), $\hat\alpha^-$ in (52) can be written as
\[
\hat\alpha^-_t = \frac{c^-_t}{\lambda}\left(\frac{\hat\xi^1_t - \hat\xi^2_t}{1 - w^5_t} - \hat X^-_t\right)
= \frac{2 c^-_t w^4_t}{\lambda (1 - w^5_t)}\,\mathbb{E}\left[\int_t^T (\xi^1_u - \xi^2_u) K^2(t,u)\,du \,\Big|\, \mathcal{F}_t\right]
- \frac{c^-_t}{\lambda}\,e^{-\int_0^t \frac{c^-_u}{\lambda}\,du} \int_0^t \frac{(c^+_s + c^-_s)\,w^4_s}{\lambda}\,e^{\int_0^s \frac{c^-_u}{\lambda}\,du}\,\mathbb{E}\left[\int_s^T (\xi^1_u - \xi^2_u) K^2(s,u)\,du \,\Big|\, \mathcal{F}_s\right] ds. \tag{65}
\]
As in (58), all the ratios in (65) involving the functions $c^+, c^-$ are bounded on $[0,T]$, and we can conclude along the same lines as in Case 1.2 by virtue of Lemma 5.4 that $\hat\alpha^- \in L^2(\mathbb{P} \otimes dt)$ in this case as well.

Step 4:

Finally, observe that $c^+_t > 0$ and $c^-_t > 0$ for all $t \in [0,T]$, which implies that $w^1_\cdot, w^2_\cdot > 0$ on $[0,T)$. Moreover, a direct computation yields that for all $t \in [0,T)$ we have
\[
0 < \int_t^T \frac{2\sigma}{\sqrt{\delta^+}}\,e^{-\frac{\gamma}{3\lambda}(T-u)} \sinh\!\big(\sqrt{\delta^+}(T-u)/(3\lambda)\big)\,du = \frac{w^3_t}{w^1_t}, \qquad
0 < \int_t^T \frac{2\sigma}{\sqrt{\delta^-}}\,e^{\frac{\gamma}{\lambda}(T-u)} \sinh\!\big(\sqrt{\delta^-}(T-u)/\lambda\big)\,du = \frac{w^4_t}{w^2_t}. \tag{66}
\]
Thus, we also obtain that $w^3_\cdot, w^4_\cdot > 0$ on $[0,T)$. But this implies for the functions defined in (16) that $K^1(t,u) > 0$ and $K^2(t,u) > 0$ for all $0 \le t \le u < T$, as well as that $\int_t^T K^1(t,u)\,du = \int_t^T K^2(t,u)\,du = 1$ for all $t \in [0,T)$.

Proof of Corollary 3.2.

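Recall that from the dynamics of $\hat X^1$ and $\hat X^2$ in (13) we obtain that the processes $\hat X^1 + \hat X^2$ and $\hat X^1 - \hat X^2$ satisfy, respectively, the linear ODEs in (44) and (45) with initial values $x^1 + x^2$ and $x^1 - x^2$. Applying the variation of constants formula then yields
\[
\hat X^1_t \pm \hat X^2_t = (x^1 \pm x^2)\,e^{-\int_0^t \frac{c^\pm_s}{\lambda}\,ds} + \int_0^t \frac{c^+_s + c^-_s}{2\lambda}\,(\hat\xi^1_s \pm \hat\xi^2_s)\,e^{-\int_s^t \frac{c^\pm_u}{\lambda}\,du}\,ds
\]
and hence the assertion in (18) and (19) via the obvious relation
\[
\hat X^{1,2}_t = \frac{1}{2}(\hat X^1_t + \hat X^2_t) \pm \frac{1}{2}(\hat X^1_t - \hat X^2_t).
\]

Proof of Lemma 3.4. First, recall from the proof of Theorem 3.1, Step 4, above that $w^1_\cdot, w^2_\cdot, w^3_\cdot, w^4_\cdot > 0$ on $[0,T)$. Moreover, from the definition in (12) we immediately obtain that $w^1_t + w^2_t + w^3_t + w^4_t = 1$ for all $t \in [0,T]$. Together with the fact that $c^+_\cdot > 0$ and $c^-_\cdot > 0$ on $[0,T]$, we also observe that $w^5_t \in (-1, 1)$ for all $t \in [0,T]$.

Concerning the limiting behaviour of the weight functions, it suffices to note that
\[
\lim_{t \uparrow T} \frac{\sinh\!\big(\sqrt{\delta^+}(T-t)/(3\lambda)\big)}{\sinh\!\big(\sqrt{\delta^-}(T-t)/\lambda\big)} = \frac{\sqrt{\delta^+}}{3\sqrt{\delta^-}}.
\]
Indeed, rewriting $w^1, w^2$ in (12) by plugging in $c^+, c^-$ from (11) to obtain the representations
\[
w^1_t = \frac{\sqrt{\delta^+}\,e^{\frac{\gamma}{3\lambda}(T-t)}}{d^1_t}, \qquad w^2_t = \frac{3\sqrt{\delta^-}\,e^{-\frac{\gamma}{\lambda}(T-t)}}{d^2_t}
\]
with
\[
d^1_t \triangleq \sqrt{\delta^+}\cosh\!\big(\sqrt{\delta^+}(T-t)/(3\lambda)\big) - 2\gamma \sinh\!\big(\sqrt{\delta^+}(T-t)/(3\lambda)\big) + 3\sqrt{\delta^-}\sinh\!\big(\sqrt{\delta^+}(T-t)/(3\lambda)\big)\coth\!\big(\sqrt{\delta^-}(T-t)/\lambda\big),
\]
\[
d^2_t \triangleq 3\sqrt{\delta^-}\cosh\!\big(\sqrt{\delta^-}(T-t)/\lambda\big) - 2\gamma \sinh\!\big(\sqrt{\delta^-}(T-t)/\lambda\big) + \sqrt{\delta^+}\sinh\!\big(\sqrt{\delta^-}(T-t)/\lambda\big)\coth\!\big(\sqrt{\delta^+}(T-t)/(3\lambda)\big)
\]
yields
\[
\lim_{t \uparrow T} w^1_t = \frac{\sqrt{\delta^+}}{\sqrt{\delta^+} + \sqrt{\delta^+}} = \frac{1}{2}, \qquad \lim_{t \uparrow T} w^2_t = \frac{\sqrt{\delta^-}}{\sqrt{\delta^-} + \sqrt{\delta^-}} = \frac{1}{2}.
\]
Similarly, with
\[
\frac{c^+_t}{c^+_t + c^-_t} = \frac{\sqrt{\delta^+}\coth\!\big(\sqrt{\delta^+}(T-t)/(3\lambda)\big) + \gamma}{\sqrt{\delta^+}\coth\!\big(\sqrt{\delta^+}(T-t)/(3\lambda)\big) + 3\sqrt{\delta^-}\coth\!\big(\sqrt{\delta^-}(T-t)/\lambda\big) - 2\gamma},
\]
\[
\frac{c^-_t}{c^+_t + c^-_t} = \frac{3\sqrt{\delta^-}\coth\!\big(\sqrt{\delta^-}(T-t)/\lambda\big) - 3\gamma}{\sqrt{\delta^+}\coth\!\big(\sqrt{\delta^+}(T-t)/(3\lambda)\big) + 3\sqrt{\delta^-}\coth\!\big(\sqrt{\delta^-}(T-t)/\lambda\big) - 2\gamma}
\]
we also have
\[
\lim_{t \uparrow T} \frac{c^+_t}{c^+_t + c^-_t} = \frac{\sqrt{\delta^+}}{\sqrt{\delta^+} + \sqrt{\delta^+}} = \frac{1}{2}, \qquad
\lim_{t \uparrow T} \frac{c^-_t}{c^+_t + c^-_t} = \frac{\sqrt{\delta^-}}{\sqrt{\delta^-} + \sqrt{\delta^-}} = \frac{1}{2}
\]
and hence
\[
\lim_{t \uparrow T} w^3_t = \lim_{t \uparrow T} w^4_t = \lim_{t \uparrow T} w^5_t = 0
\]
as desired.

The final lemma provides estimates with respect to the $L^2(\mathbb{P} \otimes dt)$-norm which are used in the proof of Theorem 3.1 above.

Lemma 5.4.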

Let $(\zeta_t)_{0 \le t \le T} \in L^2(\mathbb{P} \otimes dt)$ be progressively measurable. Moreover, let $K^1(t,u)$, $K^2(t,u)$, $0 \le t \le u < T$, denote the kernels from Theorem 3.1.

a) For $\zeta^{K^1}_t \triangleq \mathbb{E}\big[\int_t^T \zeta_u K^1(t,u)\,du \,\big|\, \mathcal{F}_t\big]$, $0 \le t < T$, it holds that $\|\zeta^{K^1}\|_{L^2(\mathbb{P} \otimes dt)} \le c\,\|\zeta\|_{L^2(\mathbb{P} \otimes dt)}$ for some constant $c > 0$.

b) For $\zeta^{K^2}_t \triangleq \mathbb{E}\big[\int_t^T \zeta_u K^2(t,u)\,du \,\big|\, \mathcal{F}_t\big]$, $0 \le t < T$, it holds that $\|\zeta^{K^2}\|_{L^2(\mathbb{P} \otimes dt)} \le c\,\|\zeta\|_{L^2(\mathbb{P} \otimes dt)}$ for some constant $c > 0$.

Proof. Both upper bounds can be verified in a similar fashion as in the proof of Lemma 5.5 in Bank et al. [5]. We thus omit the details here.
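The closed-form objects used in the proofs above lend themselves to a quick numerical sanity check. The sketch below is illustrative only and rests on assumptions that are not restated in this section: it takes $c^+_t = \tfrac{\gamma}{3} + \tfrac{\sqrt{\delta^+}}{3}\coth(\sqrt{\delta^+}(T-t)/(3\lambda))$ and $c^-_t = -\gamma + \sqrt{\delta^-}\coth(\sqrt{\delta^-}(T-t)/\lambda)$ with $\delta^+ = \gamma^2 + 6\lambda\sigma$ and $\delta^- = \gamma^2 + 2\lambda\sigma$, chosen so that the Riccati equations (39) hold in the form $(c^+)' = (c^+)^2/\lambda - \tfrac{2\gamma}{3\lambda}c^+ - \tfrac{2}{3}\sigma$ and $(c^-)' = (c^-)^2/\lambda + \tfrac{2\gamma}{\lambda}c^- - 2\sigma$, and it takes $w^5 = (c^+ - c^-)/(c^+ + c^-)$ as implied by (44) and (45). The function names are hypothetical, and $\gamma = 0.1$ is merely an illustrative value for the elastic regime $\gamma < \lambda$. The liquidation paths are the solutions of (44) and (45) with $\hat\xi^1 \equiv \hat\xi^2 \equiv 0$, $x^1 = 1$, $x^2 = 0$.

```python
import numpy as np

def c_plus(t, T, lam, gamma, sigma):
    """Assumed closed form of c^+ (blows up at t = T)."""
    root = np.sqrt(gamma**2 + 6.0 * lam * sigma)   # assumed sqrt(delta^+)
    return gamma / 3.0 + root / 3.0 / np.tanh(root * (T - t) / (3.0 * lam))

def c_minus(t, T, lam, gamma, sigma):
    """Assumed closed form of c^- (blows up at t = T)."""
    root = np.sqrt(gamma**2 + 2.0 * lam * sigma)   # assumed sqrt(delta^-)
    return -gamma + root / np.tanh(root * (T - t) / lam)

def w5(t, T, lam, gamma, sigma):
    """Weight w^5 = (c^+ - c^-) / (c^+ + c^-), as implied by (44) and (45)."""
    cp = c_plus(t, T, lam, gamma, sigma)
    cm = c_minus(t, T, lam, gamma, sigma)
    return (cp - cm) / (cp + cm)

def riccati_residuals(t, T, lam, gamma, sigma, h=1e-5):
    """Central-difference residuals of the two Riccati equations."""
    args = (T, lam, gamma, sigma)
    cp, cm = c_plus(t, *args), c_minus(t, *args)
    dcp = (c_plus(t + h, *args) - c_plus(t - h, *args)) / (2.0 * h)
    dcm = (c_minus(t + h, *args) - c_minus(t - h, *args)) / (2.0 * h)
    res_p = dcp - (cp**2 / lam - 2.0 * gamma / (3.0 * lam) * cp - 2.0 * sigma / 3.0)
    res_m = dcm - (cm**2 / lam + 2.0 * gamma / lam * cm - 2.0 * sigma)
    return res_p, res_m

def liquidation_paths(t, T, lam, gamma, sigma, x1=1.0, x2=0.0):
    """Equilibrium inventories when both signal processes vanish."""
    rp = np.sqrt(gamma**2 + 6.0 * lam * sigma)
    rm = np.sqrt(gamma**2 + 2.0 * lam * sigma)
    x_sum = ((x1 + x2) * np.exp(-gamma * t / (3.0 * lam))
             * np.sinh(rp * (T - t) / (3.0 * lam)) / np.sinh(rp * T / (3.0 * lam)))
    x_diff = ((x1 - x2) * np.exp(gamma * t / lam)
              * np.sinh(rm * (T - t) / lam) / np.sinh(rm * T / lam))
    return 0.5 * (x_sum + x_diff), 0.5 * (x_sum - x_diff)

T, lam, sigma = 2.0, 1.0, 1.0
grid = np.linspace(0.1, 1.5, 15)
for gamma in (2.0, 0.1):  # plastic example, illustrative elastic example
    res_p, res_m = riccati_residuals(grid, T, lam, gamma, sigma)
    _, X2_mid = liquidation_paths(T / 2.0, T, lam, gamma, sigma)
    print(f"gamma={gamma}: max Riccati residual "
          f"{max(np.max(np.abs(res_p)), np.max(np.abs(res_m))):.2e}, "
          f"w5(0)={w5(0.0, T, lam, gamma, sigma):+.3f}, X2(T/2)={X2_mid:+.3f}")
```

Under these assumptions the second agent's inventory at $T/2$ is negative for $\gamma = 2$ and positive for $\gamma = 0.1$, in line with the predation and liquidity-provision patterns described for the liquidation example, and $w^5$ stays inside $(-1,1)$ and vanishes as $t \uparrow T$, as stated in Lemma 3.4.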

References

[1] Robert Almgren. Optimal trading with stochastic liquidity and volatility. SIAM Journal on Financial Mathematics, 3(1):163–181, 2012. doi: 10.1137/090763470. URL https://doi.org/10.1137/090763470.

[2] Robert Almgren and Neil Chriss. Optimal execution of portfolio transactions. J. Risk, 3:5–39, 2001.

[3] Robert Almgren and Tianhui Michael Li. Option hedging with smooth market impact. Market Microstructure and Liquidity, 02(01):1650002, 2016. doi: 10.1142/S2382626616500027. URL https://doi.org/10.1142/S2382626616500027.

[4] Mukarram Attari, Antonio S. Mello, and Martin E. Ruckes. Arbitraging arbitrageurs. The Journal of Finance, 60(5):2471–2511, 2005. ISSN 00221082, 15406261.

[5] Peter Bank, H. Mete Soner, and Moritz Voß. Hedging with temporary price impact. Mathematics and Financial Economics, 11(2):215–239, 2017. ISSN 1862-9660. doi: 10.1007/s11579-016-0178-4. URL http://dx.doi.org/10.1007/s11579-016-0178-4.

[6] Markus K. Brunnermeier and Lasse Heje Pedersen. Predatory trading. The Journal of Finance, 60(4):1825–1863, 2005. doi: 10.1111/j.1540-6261.2005.00781.x. URL https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1540-6261.2005.00781.x.

[7] Jiatu Cai, Mathieu Rosenbaum, and Peter Tankov. Asymptotic lower bounds for optimal tracking: A linear programming approach. Ann. Appl. Probab., 27(4):2455–2514, 2017. doi: 10.1214/16-AAP1264. URL https://doi.org/10.1214/16-AAP1264.

[8] Bruce Ian Carlin, Miguel Sousa Lobo, and S. Viswanathan. Episodic liquidity crises: Cooperative and predatory trading. The Journal of Finance, 62(5):2235–2274, 2007. doi: 10.1111/j.1540-6261.2007.01274.x. URL https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1540-6261.2007.01274.x.

[9] René Carmona and Joseph Yang. Predatory Trading: a Game on Volatility and Liquidity. Quantitative Finance, under revision, 2011.

[10] Philippe Casgrain and Sebastian Jaimungal. Mean-field games with differing beliefs for algorithmic trading. Preprint, October 2018. URL https://arxiv.org/abs/1810.06101.

[11] Philippe Casgrain and Sebastian Jaimungal. Mean field games with partial information for algorithmic trading. Preprint, March 2019. URL https://arxiv.org/abs/1803.04094.

[12] Chenghuan Sean Chu, Andreas Lehnert, and Wayne Passmore. Strategic trading in multiple assets and the effects on market volatility.

Interna-tional Journal of Central Banking , 5(4):143–172, 2009.[13] I. Ekeland and R. T´emam.

Convex Analysis and Variational Problems .Society for Industrial and Applied Mathematics, 1999. doi: 10.1137/1.9781611971088. URL http://epubs.siam.org/doi/abs/10.1137/1.9781611971088 .[14] Ibrahim Ekren and Sergey Nadtochiy. Utility-based pricing and hedg-ing of contingent claims in Almgren-Chriss model with temporary priceimpact. Preprint, October 2019. URL https://arxiv.org/abs/1910.01778 .[15] Martin Herdegen, Johannes Muhle-Karbe, and Dylan Possama¨ı. Equi-librium Asset Pricing with Transaction Costs. Preprint, June 2019.4116] Xuancheng Huang, Sebastian Jaimungal, and Mojtaba Nourian. Mean-ﬁeld game strategies for optimal execution.

Applied Mathemati-cal Finance , 26(2):153–185, 2019. URL https://doi.org/10.1080/1350486X.2019.1603183 .[17] Ciamac C. Moallemi, Beomsoo Park, and Benjamin Van Roy. Strate-gic execution in the presence of an uninformed arbitrageur.

Jour-nal of Financial Markets , 15(4):361 – 391, 2012. ISSN 1386-4181.doi: https://doi.org/10.1016/j.ﬁnmar.2011.11.002. URL .[18] L. C. G. Rogers and Surbjeet Singh. The cost of illiquidity and its eﬀectson hedging.

Mathematical Finance , 20:597–615, 2010.[19] Alexander Schied and Tao Zhang. A state-constrained diﬀeren-tial game arising in optimal portfolio liquidation.

Mathematical Fi-nance , 27(3):779–802, 2017. doi: 10.1111/maﬁ.12108. URL https://onlinelibrary.wiley.com/doi/abs/10.1111/mafi.12108 .[20] Alexander Schied and Tao Zhang. A market impact game under tran-sient price impact.

Mathematics of Operations Research , 44(1):102–121,2019. URL https://doi.org/10.1287/moor.2017.0916 .[21] Alexander Schied, Elias Strehle, and Tao Zhang. High-frequency limitof nash equilibria in a market impact game with transient price impact.

SIAM Journal on Financial Mathematics , 8(1):589–634, 2017. URL https://doi.org/10.1137/16M107030X .[22] Torsten Sch¨oneborn.

Trade execution in illiquid markets: Optimalstochastic control and multi-agent equilibria . PhD thesis, TechnischeUniversit¨at Berlin, 2008.[23] Torsten Sch¨oneborn and Alexander Schied. Liquidation in the Face ofAdversity: Stealth vs. Sunshine Trading. SSRN Preprint 1007014, April2009.[24] Elias Strehle. Optimal execution in a multiplayer model of transientprice impact.

Market Microstructure and Liquidity , 3(4):1850007, 2017.URL https://doi.org/10.1142/S2382626618500077https://doi.org/10.1142/S2382626618500077