A General Solution Method for Insider Problems

Abstract

We develop a flexible approach to solve a continuous-time, multi-asset/multi-option Kyle-Back model of informed trading under very general assumptions, including on the distribution of the belief about the fundamental, and the noise process. The main insight is to postulate the pricing rule of the market maker at maturity as an optimal transport map. The optimal control of the informed trader reduces to the computation of a conjugate convex function, explicit in some cases, and otherwise easily obtainable using fast numerical algorithms. To illustrate the power of our method, we apply it to a long-standing problem: how do informed investors split trades between a spot asset and its options? Our method allows us to i) prove the existence of an equilibrium and characterize the informed trader's trading strategy in the spot and the option markets, even for non-Gaussian price priors (e.g., lognormal); ii) show there can be cross-market price impact between the spot market and multiple options even when their noise trading is independent; and iii) compare our pricing results to a simple Black-Scholes model and quantify the price distortion of the option due to strategic trading. In particular, we show that a Black-Scholes implied volatility (IV) smile/smirk can emerge because of the market maker's adaptation to asymmetric information.


A GENERAL SOLUTION METHOD FOR INSIDER PROBLEMS ∗

FRANÇOIS COCQUEMAS †, IBRAHIM EKREN ‡, ABRAHAM LIOUI §

This version: June 16, 2020



JEL classification: G12, G13, G14

Keywords: Kyle's model, options, informed trading, optimal transport, asymmetric information, price impact

∗ This project was supported in part by NSF Grant DMS 2007826 (Ekren).
† Florida State University, College of Business. [email protected].
‡ Florida State University, Department of Mathematics. [email protected].
§ EDHEC Business School. [email protected].

1. Introduction

In the vast literature stemming from the seminal contributions by Kyle (1985) and Back (1992), a typical insider trading game involves three players: an informed trader (or insider), a market maker, and uninformed (or noise) traders. The informed trader possesses long-lived information about the true fundamental value of an asset. She attempts to maximize profits from trading on this information before it is revealed at some known future date. The market maker has a prior belief about the probability distribution of the asset value. He must attempt to form expectations about the fundamental value of the asset from the total order flow comprising uninformed and informed traders. This simple game theoretical setup has spawned many extensions and empirical applications over the years. However, they have run into severe limitations, in large part because of the difficult filtering problem faced by the market maker.

This paper presents an approach to the continuous-time Kyle-Back problem which dramatically expands the universe of models where an equilibrium can be found. Based on the theory we develop, a straightforward methodology can be implemented to solve classes of problems that were previously too complex to study. In particular, we are able to tackle multi-dimensional insider problems which can include (simultaneously) multiple assets, multiple options at different strikes, arbitrary non-Gaussian price priors, and arbitrary but deterministic covariances across noise trading.

The key ingredient of our method is a long-standing, but recently flourishing, mathematical theory known as optimal transport. We show that the pricing rule of the market maker at maturity can be viewed as an optimal transport map. It connects the distribution of noise trading to that of the market maker's prior belief for the fundamental value of the assets. The optimal control of the informed trader then reduces to the computation of a conjugate convex function, explicit in some cases, and otherwise easily obtained numerically, using efficient algorithms such as the Sinkhorn algorithm (Cuturi, 2013). The prices of the assets as a function of the order flows become a simple convolution of the transport map (the solution of a partial differential equation using the Feynman-Kac formula), trivial to compute. In essence, optimal transport makes the market maker's filtering problem feasible in a wide range of previously intractable cases.

To illustrate our methodology, we focus on the problem of informed trading between a spot market and an options market. The model of Back (1993) remains the state-of-the-art for this problem. Many important insights have emerged from this contribution, in particular

Footnote: The optimal transport theory used in our paper is the classical optimal transport theory (see in particular Brenier, 1991; McCann, 1995; Villani, 2009). As opposed to martingale optimal transport theory studied in mathematical finance (Beiglböck, Nutz, and Touzi, 2017; Dolinsky and Soner, 2014), we do not require the martingality of transport maps and we do not put additional constraints on them (Ekren and Soner, 2018).

2. Model

An informed trader has perfect information on the fundamental value of $n$ traded assets at a given horizon $T$. She interacts with noise traders and with the market maker, and her objective is to make the most profit from the information she holds. We first describe the investment universe, then the order flow of noise traders, then the rational choice of the informed trader, and finally the pricing strategy of the market maker. We end this section with the definition of an equilibrium. The model's setup is in line with continuous-time Kyle-Back models of informed trading.

2.1. Traded Assets

In the economy, $n$ assets are available for trade for $n \geq 1$ fixed. The $n$ assets can be spot assets (e.g., stocks) or derivatives with no early exercise feature (e.g., forwards, or European options, including options that are not at-the-money). Full information on the fundamental value of these assets, denoted $v \in \mathbb{R}^n$, will be revealed at future time $T$. While the informed trader knows this value already at time $t = 0$, the other players do not. The market maker will have to filter this value from observing aggregate volumes. He views the true values of the assets as random variables $\tilde v \in \mathbb{R}^n$ and has a prior belief for the distribution of $\tilde v$, denoted $\nu$. Our general findings are derived under minimal assumptions on $\nu$. Importantly, $\nu$ does not have to be Gaussian, or absolutely continuous. This is a distinction from existing multi-dimensional Kyle (1985) models, which are solved only for a relatively small number of special distributional assumptions on $\nu$. Relaxing the normality assumption is desirable for stocks, given their limited liability feature. To guarantee that an equilibrium is reachable, we make the following standing assumption:

Assumption 1. $\nu$ satisfies the moment condition
$$\int_{\mathbb{R}^n} |x|^p \, \nu(dx) < \infty \tag{2.1}$$
for some $p > [\ldots]$. We also assume that $\nu$ is not a point mass.

2.2. Order Flow of Noise Traders

We assume that noise traders provide liquidity in the $n$ assets according to $dZ_t = \sigma_t \, dW_t$, where $\sigma_t$ is a deterministic but possibly time-varying square covariance matrix and $W$ is a standard $n$-dimensional Wiener process defined on a probability space $(\Omega, \mathbb{P})$. Following the literature, we assume that $W$ and $\tilde v$ are independent to guarantee that noise trading is uninformed. We denote by $\mu$ the distribution in $\mathbb{R}^n$ of $\int_0^T \sigma_s \, dW_s$, which is Gaussian. In other words, $\mu$ is the distribution of total noise trading from time 0 to the terminal date $T$. Since $\sigma_t$ is deterministic, the variance of noise trading up to the terminal date $T$ is given by the symmetric positive matrix of size $n$: $\Sigma := \int_0^T \sigma_s \sigma_s^\top \, ds$. We make the additional standing assumption:

Assumption 2. $\sigma$ and $\sigma^{-1}$ are continuous and bounded on $[0, T]$.

Note that, since $W$ is a Brownian motion, we can write the variance of remaining noise trading at $t$ as $\Sigma_t := \int_t^T \sigma_s \sigma_s^\top \, ds$.

Footnote (on Assumption 1): The moment condition is chosen for ease of presentation and is most likely not sharp.

2.3. The Rational Choice of the Informed Trader

We start by specifying the information set of the informed trader, which is different from that of the noise traders and the market maker. We assume that besides $W$ and $\tilde v$, the probability space $(\Omega, \mathbb{P})$ contains a random variable $U \in \mathbb{R}^n$ that is independent of $\tilde v$ and $W$ and distributed according to $\mu$. We denote by $\mathbb{F} = \{\mathcal{F}_t\}_{t \in [0,T]}$ the augmented filtration of the Markov process $(\mathbf{1}_{\{t<0\}} U + \mathbf{1}_{\{t=0\}} \tilde v + \mathbf{1}_{\{t \in (0,T]\}} W_t)_{t \in [-1, \infty)}$. With this definition, $\mathcal{F}_0$ is non-trivial since the informed trader knows $v$, the value at terminal date $T$ of $\tilde v$, and also $U$, and in equilibrium $\mathbb{F}$ is strictly larger than the filtration of the market maker. The informed trader is a risk-neutral agent who submits the order flow $dX_t \in \mathbb{R}^n$ at each time $t \in [0, T)$. At the last trading date $T$, the true value $v$ is revealed to the market. Her cumulative trade up to time $t$, $X_t$, is known only to her, i.e. the market maker cannot infer the informed trader's trade from observing total order flow.

The informed trader is not a price taker, i.e. her order will have a price impact. She trades in an attempt to extract the most benefit from her private information, while still allowing the market maker to quote a price. In addition to the $n$ assets, the informed trader can trade a locally risk-free asset in zero net supply whose return is assumed to be 0. The initial wealth of the informed trader is set at 0 and any net long position at the outset of the trading is financed by borrowing. Short selling is permitted. The only constraint on the informed trader's strategy aims at avoiding doubling strategies. This admissibility condition is formally specified further down.

In the classical Kyle's model, the informed trader observes only the quoted prices, and it is assumed that in equilibrium the quoted prices are strictly increasing in the total order flow. Thus, the informed agent can obtain $Z_t$ from the prices and her filtration $\mathbb{F}$. In our case, since there might be multiple options on a stock, there might be a redundant asset, and it might not be possible to obtain $Z$ from the observation of the price process. Therefore, we assume that the informed trader observes directly $Y$, and from this information she computes $Z$ (as in Back, 1993). Thus, we assume that the information of the informed agent at time $t$ is $\mathcal{F}_t$.

Denoting $P_t \in \mathbb{R}^n$ the prices quoted by the market maker, the objective of the informed trader is to maximise her total gains from trading, which are $\int_0^T X_t^\top dP_t + (v - P_T)^\top X_T$. Applying Ito's lemma to $(v - P_t)^\top X_t$, the informed trader's objective function can be written as follows:
$$\sup_X \mathbb{E}\left[\int_0^T (v - P_t)^\top dX_t - \sum_{i=1}^n \langle X^i, P^i \rangle_T\right] \tag{2.2}$$
where $\langle X^i, P^i \rangle_T$ is the integrated quadratic variation between $X^i$ and $P^i$ up to time $T$ and $X$ is a strategy in the sense of Definition 2 below.

Footnote (on $U$): The random variable $U$ is only needed for randomization purposes when the distribution of $\tilde v$ is not absolutely continuous. When the distribution of $\tilde v$ is absolutely continuous our construction of the equilibrium does not need this random variable. $U$ could have any absolutely continuous distribution on $\mathbb{R}^n$.

Footnote (on $dX_t$): For ease of presentation, we assume that $X$ is a continuous semimartingale. This assumption can be relaxed as in Back (1992).

Footnote (on redundant assets): In the sense of static, not dynamic replication.

2.4. The Market Maker's Problem

The economy has a risk-neutral continuum of market makers ("the market maker") competing for order flow and quoting at each time $t$ a price $P_t$. At the final time $T$, the private information is revealed, and the price will reach $P_T = v$, possibly with a jump if all information has not been incorporated. The market maker observes the total order flow $Y$, from which he is unable to disentangle the order flows of informed and noise traders. Denoting the total order flow from time 0 to time $t$ by $Y_t$, its dynamics are:
$$dY_t = dX_t + \sigma_t \, dW_t \tag{2.3}$$
The market maker quotes prices while facing two sources of uncertainty: the terminal value of the traded assets and the order flow. Since the market maker is risk neutral, rational pricing commands the following pricing rule:
$$P_t = \mathbb{E}[\tilde v \mid \mathcal{F}^Y_t] \tag{2.4}$$
where $\mathcal{F}^Y_t$ is the filtration associated with $Y$, assuming $\mathcal{F}^Y_0$ is trivial. Given that the informed trader does know the terminal value of the traded assets, the market maker's filtration satisfies $\mathbb{F}^Y \subset \mathbb{F}$.

Footnote (on the jump at $T$): In equilibrium there will be no jump, but the insider optimizes amongst strategies allowing for jumps, although we show that no such strategy is optimal.

2.5. Equilibrium

The goal is to determine an equilibrium in the game between the informed trader and the market maker, and to simultaneously price the assets. The only source of information for the market maker is the order flow. As such, quoted prices will be adapted to $\mathbb{F}^Y$ and, given $Y_t$, the market maker quotes prices such that:
$$P_t = H(t, Y_t) \tag{2.5}$$
where $H$ is a suitably chosen functional of $t$ and $Y$. We require that this pricing rule $H$ satisfies the following properties:

Definition 1. A pricing rule is a measurable map $H : \{(0,0)\} \cup (0,T] \times \mathbb{R}^n \mapsto \mathbb{R}^n$ which
• is continuously differentiable in $t$ and twice continuously differentiable in $y$ on $(0,T) \times \mathbb{R}^n$,
• satisfies the integrability assumption
$$\mathbb{E}[|H(T, Z_T)|] + \int_0^T \mathbb{E}[|H(t, Z_t)|] \, dt < \infty. \tag{2.6}$$

Condition (2.6) ensures that the local martingales we manipulate are martingales, and that we have explicit formulas for expected gains. Since we only require $H$ to be defined on a strict subset of $[0,T] \times \mathbb{R}^n$, our class of pricing rules is larger than its counterpart in Back (1992). With the generality we are targeting for $\nu$, we are not able to prove that the equilibrium pricing rule we construct can be extended to $[0,T] \times \mathbb{R}^n$. Although we expect that such an extension is possible for particular examples, this point is in fact not needed to establish an equilibrium.

We now define the admissible strategies of the informed trader:

Definition 2. A trading strategy for the informed trader is a continuous square integrable semimartingale $X$ adapted to $\mathbb{F}$ satisfying
$$\int_0^T \mathbb{E}[|H(t, X_t + Z_t)|] \, dt < \infty \tag{2.7}$$
for all pricing rules $H$.

Similarly to the boundedness condition based on the market maker's information flow, we also guarantee that the informed trader's activity will stay reasonable in that it will not generate erratic prices.

We are now well equipped to set up the definition of the equilibrium:

Definition 3. We say that a pricing rule $H^*$ and a trading strategy $X^*$ for the informed trader form an equilibrium if
• $H^*(t, Y_t) = \mathbb{E}[\tilde v \mid \mathcal{F}^Y_t]$ whenever $Y_t = X^*_t + Z_t$,
• $X^*$ is a maximizer of
$$\sup_X \mathbb{E}\left[\int_0^T (v - H^*(t, X_t + Z_t))^\top dX_t - \sum_{i=1}^n \langle X^i_\cdot, H^{*,i}(\cdot, X_\cdot + Z_\cdot) \rangle_T\right] \tag{2.8}$$
among all trading strategies $X$.

We are now set up to solve for the equilibrium. In what follows, we omit the superscript in $X^*$, $H^*$ when there is no confusion in the notation.

3. A General Solution Method

In order to construct our candidate equilibrium strategy, we will use results from optimal transport theory. We start by elaborating on the relationship between the informed trading problem and optimal transport, before recalling the main theorems and then their application in our setting.

3.1. Intuition

The distribution of noise trading is common knowledge in the economy. Interaction of the market maker with only noise traders will never guarantee that, at the final date $T$, the pricing by the market maker will be such that $H(T, Z_T) = P_T = \tilde v$. Additionally, in equilibrium, the market maker can neither observe nor predict the future order flow of the informed trader and sees the distribution of the total order flow process $Y_T = X_T + Z_T$ as the distribution of $Z_T$. In equilibrium, the informed trader's strategy is such that the order flow she submits to the market maker guarantees that $H(T, X_T + Z_T) = P_T = \tilde v$. Hence, the informed trader's strategy should be such that the market maker, while knowing that the distribution of noise trading $Z_T$ over the period is $\mu$, sets up a pricing rule $H(t, Y_t)$ such that the distribution of $H(T, Y_T)$ is exactly the distribution $\nu$ of the fundamental value. Therefore, we will construct a candidate equilibrium where we postulate the pricing rule of the market maker at final time, $x \mapsto H(T, x)$, to be the unique optimal transport map (to be defined precisely below) from $\mu$ to $\nu$. The candidate equilibrium strategy of the informed trader will be to map back the distribution $\nu$ of $\tilde v$ onto the distribution $\mu$ of $Z_T$, and force the total order flow to match this random variable at final time.

After providing some preliminaries on optimal transport theory, we fully define our candidate strategies.

3.2. Main Theorems from Transport Theory

For two probability measures $\alpha$ and $\beta$ on $\mathbb{R}^n$ with $\alpha$ absolutely continuous with respect to the Lebesgue measure and a Borel measurable map $M : \mathbb{R}^n \mapsto \mathbb{R}^n$, we denote by $M_\# \alpha$ the push-forward measure of the measure $\alpha$ by the mapping $M$, which is defined as $M_\# \alpha(A) = \alpha(M^{-1}(A))$ for all Borel measurable sets $A \subset \mathbb{R}^n$. We say that $M$ pushes $\alpha$ forward to $\beta$ if the equality of measures $M_\# \alpha = \beta$ holds.

Footnote (on mapping $\nu$ back onto $\mu$): In the case $\nu$ is not absolutely continuous, this is essentially only possible thanks to an additional randomization via $U$.

We recall in Appendix A some concepts related to optimal transport theory and state Brenier's Theorem. Using these results, we prove the following Corollary that summarizes the optimal transport results needed to construct the candidate equilibrium:

Corollary 1 (A Corollary to Brenier's theorem). There exists a unique convex function $\Gamma : \mathbb{R}^n \to \mathbb{R}$ such that $\nabla\Gamma(Z_T)$ is distributed according to $\nu$ and $\mathbb{E}[\Gamma(Z_T)] = 0$. Additionally, in the probability space $(\Omega, \mathbb{P})$ there exists a random variable $\zeta$ satisfying
$$\nabla\Gamma(\zeta) = \tilde v, \text{ and } \zeta \text{ has distribution } \mu. \tag{3.1}$$
If $\nu$ is absolutely continuous, then $\nabla\Gamma$ is invertible (on the support of $\nu$) and one can take
$$\zeta = (\nabla\Gamma)^{-1}(\tilde v). \tag{3.2}$$

Proof. The proof is provided in Appendix A.1.

The function $\Gamma$ is called the Brenier's potential and the function $\nabla\Gamma$ is called the Brenier's map. In optimal transport theory, Brenier's potential is always defined up to an additive constant. Thus, up to integrability of $\Gamma(Z_T)$ that we prove in this corollary, we can choose the additive constant to require $\mathbb{E}[\Gamma(Z_T)] = 0$. This choice of the additive constant is made to simplify the expression for the expected gain of the informed trader in equilibrium that we provide below.

We are fully equipped to derive the equilibrium in our economy with asymmetry of information.

3.3. Solving the Informed Trader Problem

Due to the definition of equilibrium, $H(t, Y_t)$ must be a martingale. A convenient way of defining our equilibrium pricing rule $H$ is via the stochastic representation
$$H(t, y) := \mathbb{E}\left[\nabla\Gamma\left(y + \int_t^T \sigma_s \, dW_s\right)\right] \text{ for } (t, y) \in \{(0,0)\} \cup (0,T] \times \mathbb{R}^n \tag{3.3}$$
where $\nabla\Gamma(\cdot)$ is the optimal transport map constructed in Corollary 1 which pushes forward $\mu$ to $\nu$.

With a slight abuse of notation, we also define the function $\Gamma : \{(0,0)\} \cup (0,T] \times \mathbb{R}^n \mapsto \mathbb{R}$ by
$$\Gamma(t, y) := \mathbb{E}\left[\Gamma\left(y + \int_t^T \sigma_s \, dW_s\right)\right]. \tag{3.4}$$
We now provide properties of (3.3) and (3.4).

Footnote: As mentioned, we only use $U$ to construct $\zeta$ when $\nu$ is not absolutely continuous.

Lemma 1. The function $H$ defined by (3.3) is a pricing rule and, for all $i = 1, \ldots, n$, $H^i$ is the solution of the PDE
$$\partial_t H^i + \frac{1}{2} \mathrm{Tr}\left(\sigma_t \sigma_t^\top \, \partial_{yy} H^i\right) = 0, \text{ for all } (t, y) \in (0,T) \times \mathbb{R}^n \tag{3.5}$$
with final condition $H^i(T, y) = \partial_{x_i} \Gamma(y)$ for $y \in \mathbb{R}^n$. Additionally, $\Gamma(t, y)$ defined via (3.4) satisfies:
• $\Gamma(t, y)$ is a continuous function on $\{(0,0)\} \cup (0,T] \times \mathbb{R}^n$, continuously differentiable in $t$ and twice continuously differentiable in $y$ on $(0,T) \times \mathbb{R}^n$,
• for all $(t, y) \in (0,T] \times \mathbb{R}^n$, we have
$$\nabla\Gamma(t, y) = H(t, y). \tag{3.6}$$

By definition, $\nabla\Gamma\left(\int_0^T \sigma_s \, dW_s\right)$ is distributed as $\nu$ and has good integrability properties. However, it is not clear if this is also the case for $\nabla\Gamma\left(y + \int_0^T \sigma_s \, dW_s\right)$ for some $y \neq 0$. To have such a result, one needs to obtain bounds on the growth of $\nabla\Gamma$ at infinity. Such bounds in fact exist for a fairly large class of $\nu$, as shown by Caffarelli (1990, 1991). However, for the cases of interest, such as stocks and options on the stock, $\nu$ is a singular measure and we are not able to use the results available in the literature. Therefore, we have chosen to only define pricing rules on $\{(0,0)\} \cup (0,T] \times \mathbb{R}^n$, where the moments of $\nu$ allow the definition of $H$. Whether $H$ can be extended to $[0,T] \times \mathbb{R}^n$ is out of the scope of this paper.

Given this property of the pricing function $H$, we can state the following:

Lemma 2. The Jacobian matrix $\nabla_y H(t, y) = \{\partial_{y_j} H^i(t, y)\}_{i,j=1,\ldots,n} = \{\partial_{y_i y_j} \Gamma(t, y)\}_{i,j=1,\ldots,n}$ is a symmetric positive semidefinite matrix for $t \in (0,T)$, and for any trading strategy $X$, we have
$$\mathbb{E}\left[\int_0^T (v - H(t, Y_t))^\top dX_t - \sum_{i=1}^n \langle X^i, H^i(\cdot, Y_\cdot) \rangle_T\right] = \mathbb{E}\left[v^\top Y_T - \Gamma(Y_T) - \frac{1}{2} \sum_{i,j=1}^n \int_0^T \partial_{y_i y_j} \Gamma(t, Y_t) \, d\langle X^i, X^j \rangle_t\right]. \tag{3.7}$$

Proof. See Appendix B.1.

The representation (3.7) directly links the informed trader's objective function to the optimal transport map. Now define the convex conjugate $\Gamma^*$ of $\Gamma$ as follows:
$$\Gamma^*(v) = \sup_{y \in \mathbb{R}^n} \{v^\top y - \Gamma(y)\}. \tag{3.8}$$
A complete characterization of the informed trader's optimal strategy is given in the following proposition.

Proposition 1.

If the market maker uses the pricing rule (3.6) then the criterion of the informed trader (2.8) is a concave problem. For all realizations $v$ of $\tilde v$, the wealth of the informed trader at the optimum is:
$$\max_X \mathbb{E}\left[\int_0^T (v - H(t, Y_t))^\top dX_t - \sum_{i=1}^n \langle X^i, H^i(\cdot, Y_\cdot) \rangle_T\right] = \Gamma^*(v). \tag{3.9}$$
Any absolutely continuous trading strategy $X$ of the informed trader ensuring $P_T = \tilde v$ is optimal. In particular,
$$dX_t = \sigma_t \sigma_t^\top (\Sigma_t)^{-1} (\zeta - Y_t) \, dt \tag{3.10}$$
is optimal, where $\zeta \in \mathbb{R}^n$ is the total volume target, constructed as in (3.1).

Proof. See Appendix B.1.

The informed trader's optimal strategy $X_t$ is absolutely continuous, which generalizes many previous findings in the literature that have to assume a particular distribution for the common belief about the fundamental values. We can then write the dynamics of the total order flow to the market maker as:
$$dY_t = \sigma_t \sigma_t^\top (\Sigma_t)^{-1} (\zeta - Y_t) \, dt + \sigma_t \, dW_t \tag{3.11}$$
which satisfies $Y_T = \zeta$ and therefore $H(T, Y_T) = \nabla\Gamma(Y_T) = \nabla\Gamma(\zeta) = \tilde v$.

The construction of the total target volume $\zeta$ only requires the use of $U$ if $\nu$ is not absolutely continuous. If $\nu$ is absolutely continuous, then $\zeta$ has the more natural representation (3.2), which can also be written as $\zeta = (\nabla\Gamma)^{-1}(\tilde v) = \nabla\Gamma^*(\tilde v)$.

Due to the use of $U$ for its construction, our equilibrium is superficially different from the one in Back (1993). However, they actually coincide (at least distributionally) under the more restrictive Back (1993) assumptions. Indeed, both our Proposition 1 and Back (1993, Lemma 1) state that any (absolutely continuous) strategy allowing $P_T = \tilde v$ is optimal for the informed agent. In Back (1993, Lemma 2), such a strategy is explicitly constructed by controlling $Y_t$ via a drift term $(1 - t)^{-1} \mathbb{E}[Z_T \mid Z_t = y, P_T = \tilde v]$. This strategy ensures that, conditional on the filtration of the market maker, the distribution of $Y_T$ is the same as the distribution of $Z_T$. This property is what is needed to show that we have an equilibrium strategy. In our case, because the set of $y$ for which we have $\nabla\Gamma(y) = \tilde v$ is no longer always a straight line, there are no results in the literature to help conjecture a strategy for the informed agent. Therefore, we introduce the additional random variable $U$ in the case where $\nu$ is not absolutely continuous. It serves to construct a target volume $\zeta$ as in (3.1). Then, the informed agent trades to control $Y$ to ensure $Y_T = \zeta$, so that $Y_T$ has distribution $\mu$, similar to Back (1993, Lemma 2). Finally, although in terms of realizations of the random variables our equilibrium might be different from the one in Back (1993, Lemma 2), both equilibria in fact share the same distributional properties, and our construction in (3.10) is arguably more intuitive.

One of the main reasons why optimal transport theory fundamentally simplifies the understanding of the classical Kyle-Back models is the equality (3.9). Indeed, this identity identifies the expected wealth of the informed trader with the convex conjugate of the Brenier's potential.

The second important link between these concepts relies on the so-called dual formulation of the optimal transport problem. Indeed, we observe that, for all $y \in \mathbb{R}^n$ and $v$ in the support of $\nu$, the functions $\Gamma$ and $\Gamma^*$ satisfy
$$\Gamma(y) + \Gamma^*(v) \geq y^\top v \tag{3.12}$$
and (at least when $\Gamma^*$ is differentiable)
$$\Gamma(\nabla\Gamma^*(v)) + \Gamma^*(v) = (\nabla\Gamma^*(v))^\top v. \tag{3.13}$$
Thus, if we define on $\{(0,0)\} \times \mathbb{R}^n \cup (0,T] \times \mathbb{R}^n \times \mathbb{R}^n$ the function $J$ by
$$J(t, y, v) = \Gamma^*(v) + \Gamma(t, y) - v^\top y,$$
we obtain
$$J(T, y, v) = \Gamma^*(v) + \Gamma(y) - v^\top y \geq 0 = \Gamma(\nabla\Gamma^*(v)) + \Gamma^*(v) - (\nabla\Gamma^*(v))^\top v = J(T, \nabla\Gamma^*(v), v).$$

Additionally, an analysis of our proof shows that the expected welfare of the informed trader from trading on $[t, T]$ is
$$\max_X \mathbb{E}\left[\int_t^T (v - H(s, Y_s))^\top dX_s - \sum_{i=1}^n \left(\langle X^i, H^i(\cdot, Y_\cdot) \rangle_T - \langle X^i, H^i(\cdot, Y_\cdot) \rangle_t\right) \,\Big|\, \mathcal{F}_t\right] = J(t, Y_t, v).$$
One can directly check that the function $J$ solves the Hamilton-Jacobi-Bellman equation
$$\max_{\theta \in \mathbb{R}^n} \left\{\partial_t J + \theta^\top \partial_y J + \frac{1}{2} \mathrm{Tr}\left(\sigma_t \sigma_t^\top \, \partial_{yy} J\right) + \theta^\top (v - H)\right\} = 0. \tag{3.14}$$
Thus, the function $J$ is the value function defined in Back (1992, Theorem 2).

In fact, via the inequality (3.12) and the equality (3.13), the functions $\Gamma$ and $\Gamma^*$ identify the final condition of the Hamilton-Jacobi-Bellman equation (3.14). The main contribution of the present work is to fully identify this final condition via the Brenier's map and its convex conjugate. Indeed, although the equation (3.14) was known in the literature, the statement of the Kyle-Back model does not specify a final condition, and dynamic programming principle type approaches such as (3.14) were only able to handle Kyle-Back models in specific cases.
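The conjugate (3.8) and the duality relations (3.12) and (3.13) can be checked numerically in a case where everything is explicit. The sketch below is an illustration only, not the authors' code. It assumes the one-dimensional lognormal setting with Brenier map $\nabla\Gamma(y) = \exp(m_v + a y)$, $a = \sigma_v / \sqrt{\Sigma}$, so that $\Gamma(y) = \exp(m_v + a y)/a + c$ with $c$ fixed by $\mathbb{E}[\Gamma(Z_T)] = 0$. The closed-form maximizer $y^* = (\log v - m_v)/a$ is derived here from the first-order condition and is an assumption of the sketch; all function names and parameter values are hypothetical.

```python
import math


def brenier_potential(y, m_v, s_v, total_var):
    """Gamma(y) for a 1-D lognormal prior, normalized so that E[Gamma(Z_T)] = 0."""
    a = s_v / math.sqrt(total_var)
    # E[exp(m_v + a * Z_T)] = exp(m_v + s_v**2 / 2) when Z_T ~ N(0, total_var).
    const = -math.exp(m_v + 0.5 * s_v ** 2) / a
    return math.exp(m_v + a * y) / a + const


def conjugate_numeric(v, m_v, s_v, total_var, lo=-60.0, hi=60.0, iters=200):
    """Gamma*(v) = sup_y {v*y - Gamma(y)} by ternary search (the objective is concave)."""
    def objective(y):
        return v * y - brenier_potential(y, m_v, s_v, total_var)

    for _ in range(iters):
        y1 = lo + (hi - lo) / 3.0
        y2 = hi - (hi - lo) / 3.0
        if objective(y1) < objective(y2):
            lo = y1
        else:
            hi = y2
    y_star = 0.5 * (lo + hi)
    return objective(y_star), y_star


def conjugate_closed_form(v, m_v, s_v, total_var):
    """Same quantity from the first-order condition v = exp(m_v + a * y)."""
    a = s_v / math.sqrt(total_var)
    y_star = (math.log(v) - m_v) / a
    return v * y_star - brenier_potential(y_star, m_v, s_v, total_var), y_star
```

For a realized value $v$, `conjugate_numeric` returns both the expected gain $\Gamma^*(v)$ of (3.9) and the maximizer, which plays the role of the volume target $\nabla\Gamma^*(v)$ in this absolutely continuous case.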

3.4. Solving the Market Maker's Problem

Having characterized the optimal strategy of the informed trader, we now turn to the pricing rule used by the market maker at equilibrium, which is given in the following:

Proposition 2. If the informed trader uses the strategy (3.11), then, conditionally on $\mathcal{F}^Y_t$, $\zeta$ is distributed as a Gaussian random variable with mean $Y_t$ and covariance matrix $\Sigma_t$, and $(H(t, Y_t))_{t \in [0,T]}$ is an $\mathbb{F}^Y$ martingale with final value $v$, and therefore $H(t, Y_t) = \mathbb{E}[\tilde v \mid \mathcal{F}^Y_t]$.

Proof. See Appendix B.2.

The following theorem is the main theoretical contribution of this paper.

Theorem 1 (Existence of Equilibrium). The couple of strategies yielding total flow (3.11) and pricing function (3.3) is an equilibrium for the generalized Kyle-Back model.

Proof. The proof is a direct consequence of Propositions 1-2.
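The total order flow (3.11) is a Brownian bridge that ends at the target volume $\zeta$. The following sketch illustrates this with an Euler scheme. It is a simplified illustration, not the authors' implementation: it assumes a single asset ($n = 1$) and a constant noise volatility, so that $\Sigma_t = \sigma^2 (T - t)$ and the drift reduces to $(\zeta - Y_t)/(T - t)$. The function name, the step count, and the seed are hypothetical choices.

```python
import math
import random


def simulate_total_order_flow(zeta, sigma=1.0, horizon=1.0, n_steps=10_000, seed=0):
    """Euler scheme for dY = sigma^2 / Sigma_t * (zeta - Y) dt + sigma dW, with Y_0 = 0.

    Assumes n = 1 and constant sigma, so Sigma_t = sigma^2 * (horizon - t).
    Returns the simulated terminal value Y_T.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    y = 0.0
    for k in range(n_steps):
        remaining_var = sigma ** 2 * (horizon - k * dt)   # Sigma_t at t = k * dt
        drift = sigma ** 2 / remaining_var * (zeta - y)    # informed trader's order rate
        y += drift * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return y
```

On the last step the drift closes the whole remaining gap, so the simulated $Y_T$ differs from $\zeta$ only by one noise increment of standard deviation $\sigma \sqrt{dt}$, which vanishes as the grid is refined.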

Evolution of the market maker's belief. Proposition 2 also provides the evolution of the belief of the market maker. Indeed, in equilibrium, the final price will satisfy $P_T = \nabla\Gamma(Y_T)$ and, conditional on $\mathcal{F}^Y_t$, the information of the market maker at time $t$, $Y_T$ is Gaussian with mean $Y_t$ and covariance $\Sigma_t$. Therefore, conditional on $\mathcal{F}^Y_t$, $\tilde v$ has the same distribution as $\nabla\Gamma(Y_t + Z_T - Z_t)$. Once the transport map $\nabla\Gamma$ is computed, one can easily compute the distribution of $\tilde v$ conditional on $\mathcal{F}^Y_t$.

Price impact and market depth. The price impact matrix is $\{\partial_{y_i y_j} \Gamma(t, y)\}_{i,j=1,\ldots,n}$, which is symmetric positive semidefinite. Additionally, by a differentiation of (3.4), the matrix valued process $\{\partial_{y_i y_j} \Gamma(t, Z_t)\}_{i,j=1,\ldots,n}$ is a martingale. If it is invertible, then since taking the inverse is a convex operation on symmetric positive matrices, the market depth is a submartingale. Overall, we cannot guarantee at this level of generality that the price impact matrix will be invertible, and hence that it is sufficient to have non-singular noise trading processes to make derivatives, for example, non redundant.

3.5. Computing the Transport Map

Given the equilibrium characterized above, the main difficulty is then to find the transport map that links the multivariate distribution of the noise to the multivariate distribution of the payoffs. In some cases, the transport map may be written explicitly. This is the case, for instance, in any one-dimensional setting, and in some specific cases such as the ones considered by Back (1992) and Back (1993). Note that the geometric construction of the solution in Back (1993) has a direct interpretation in terms of a transport map. We explain that connection and generalize it in Subsection 4.4. In general, explicit solutions to the transport problem are currently known for a certain class of multivariate Gaussian and elliptical distributions (Ghaffari and Walker, 2018).

Thankfully, in cases where an explicit transport map is not known, some very efficient algorithms make it possible to compute it numerically. For instance, the Sinkhorn algorithm is a popular choice for calculating a transport map (Cuturi, 2013). This algorithm is fast, parallelizable, and well-suited for GPU computation.

To summarize, when there is no explicit formula for the transport map, the following steps have to be followed to generate prices as quoted by the market maker:

1. Parametrize a prior distribution for the price of the assets, and a distribution for the noise;
2. Decide on a space discretization for the two distributions;
3. Compute a distance matrix between each point of the two space discretizations;
4. Run the Sinkhorn (or an alternative) algorithm to find the optimal transport map between the two distributions;
5. Compute the asset prices for a given level of order flow by numerically integrating the noise distribution against the transport map (analogous to Equation (3.3)).

In the next section, we describe some examples ranging from the simple one-asset case (already well-known) to more complex scenarios, including one asset with multiple options (not previously solved).
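The five numerical steps listed above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: one asset, standard normal total noise, a lognormal prior with hypothetical parameters, a quadratic cost, and an entropic regularization `eps` that blurs the map slightly. The transport map is recovered as the barycentric projection of the Sinkhorn plan; every function name and grid choice here is hypothetical.

```python
import bisect
import math
from statistics import NormalDist


def discretize(points, density):
    """Steps 1-2: normalized weights of a density evaluated on a grid."""
    weights = [density(p) for p in points]
    total = sum(weights)
    return [w / total for w in weights]


def sinkhorn_map(xs, a, ys, b, eps=0.2, n_iter=300):
    """Steps 3-4: Sinkhorn iterations for the cost |x - y|^2 / 2.

    Returns the barycentric projection of the entropic plan (an approximation
    of the transport map on the grid xs) and the plan itself.
    """
    kernel = [[math.exp(-0.5 * (x - y) ** 2 / eps) for y in ys] for x in xs]
    n, m = len(xs), len(ys)
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
    transport = [sum(p * y for p, y in zip(row, ys)) / sum(row) for row in plan]
    return transport, plan


def interp(x, xs, values):
    """Piecewise-linear interpolation, held constant outside the grid."""
    if x <= xs[0]:
        return values[0]
    if x >= xs[-1]:
        return values[-1]
    k = bisect.bisect_right(xs, x) - 1
    w = (x - xs[k]) / (xs[k + 1] - xs[k])
    return (1.0 - w) * values[k] + w * values[k + 1]


def price(y, remaining_var, xs, transport, n_quad=201):
    """Step 5: average the map over the remaining noise N(0, remaining_var), remaining_var > 0."""
    sd = math.sqrt(remaining_var)
    zs = [-4.0 * sd + 8.0 * sd * k / (n_quad - 1) for k in range(n_quad)]
    weights = discretize(zs, NormalDist(0.0, sd).pdf)
    return sum(w * interp(y + z, xs, transport) for w, z in zip(weights, zs))


def example(n=40, m_v=0.0, s_v=0.3):
    """Toy setup: N(0, 1) total noise and a lognormal(m_v, s_v) price prior."""
    xs = [-3.0 + 6.0 * k / (n - 1) for k in range(n)]
    a = discretize(xs, NormalDist().pdf)
    ys = [0.4 + 2.1 * k / (n - 1) for k in range(n)]
    b = discretize(ys, lambda y: NormalDist(m_v, s_v).pdf(math.log(y)) / y)
    return xs, a, ys, b
```

Calling `xs, a, ys, b = example()` followed by `transport, plan = sinkhorn_map(xs, a, ys, b)` yields a nondecreasing map on the noise grid, and `price(y, remaining_var, xs, transport)` then gives a quoted price for cumulative order flow `y`. In higher dimensions one would typically rely on an optimized library rather than these explicit loops.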
We describe some parametrizations for the numerical approach, and the explicit constructions of the transport maps.

4. Applications

In this section, we analyze several applications of the general method outlined above. Besides the one-asset case, which is well studied under normal and lognormal assumptions on the prior belief about the terminal value of the asset, we offer a solution to the multi-asset Gaussian and lognormal prior cases. We then generalize the Back (1993) case of one underlying and one call, and extend it to one underlying, one call and one put. These few applications, by no means exhaustive, are meant to demonstrate the flexibility of our method.

4.1. Single Asset

The one-asset model put forth in the seminal contribution of Kyle (1985) is widespread in the literature on informed trading and equilibrium. Our approach applies straightforwardly in this case. In dimension $n = 1$, there exists a unique increasing function pushing $\mu$ forward to $\nu$. Denoting by $F_\mu$ and $F_\nu$ the cumulative distribution functions of $\mu$ and $\nu$, this function is given by $x \mapsto F_\nu^{-1}(F_\mu(x))$. Therefore, $\Gamma$ is the only antiderivative of this function satisfying the condition $E[\Gamma(Z_T)] = 0$. The construction of the pricing rule in Back (1992, Theorem 1) can in fact be explained via our method, which provides an intuition through optimal transport: the pricing rule at final time is the unique monotone transport map for one-dimensional distributions.
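As a small illustration of the one-dimensional map $x \mapsto F_\nu^{-1}(F_\mu(x))$, the sketch below composes a Gaussian CDF for the noise with a lognormal quantile function for the price. The function names and the parameter names (`noise_std`, `m_v`, `sigma_v`) are assumptions made for this example only; the lognormal quantile is obtained as the exponential of a normal quantile.

```python
import math
from statistics import NormalDist


def monotone_price_map(x, noise_std, m_v, sigma_v):
    """Map terminal order flow x to a price via F_nu^{-1}(F_mu(x))."""
    u = NormalDist(0.0, noise_std).cdf(x)             # F_mu(x), noise ~ N(0, noise_std^2)
    log_price = NormalDist(m_v, sigma_v).inv_cdf(u)   # normal quantile of the log-price
    return math.exp(log_price)                        # lognormal quantile


def exponential_form(x, noise_std, m_v, sigma_v):
    """Closed form of the same map when the noise is Gaussian and the prior lognormal."""
    return math.exp(m_v + sigma_v * x / noise_std)
```

For Gaussian noise and a lognormal prior the two functions coincide up to floating-point error, which is one way to see why the terminal pricing rule is exponential in the order flow in that case.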

4.2. Multidimensional Gaussian Prior

Assume now that the market maker and the noise trader beliefs are such that $\tilde v \sim \nu = N(m_v, \Sigma_v \Sigma_v^\top)$, where $m_v \in \mathbb{R}^n$ and $\Sigma_v$ is an $n \times n$ symmetric positive definite matrix. In this case, the function $\Gamma$ is given by:

$$\Gamma(x) = \frac{1}{2}\, x^\top \Sigma_v (\Sigma_v \Sigma_0 \Sigma_v)^{-1/2} \Sigma_v\, x + m_v^\top x$$

and the optimal transport mapping will be:

$$\nabla\Gamma(x) = \Sigma_v (\Sigma_v \Sigma_0 \Sigma_v)^{-1/2} \Sigma_v\, x + m_v$$

as elicited in Ghaffari and Walker (2018). One can thus deduce the pricing rule, which is:

$$P_t = \Sigma_v (\Sigma_v \Sigma_0 \Sigma_v)^{-1/2} \Sigma_v\, Y_t + m_v.$$

This is a generalization of Pasquariello and Vega (2015) and Garcia del Molino et al. (2020) in a continuous-time setting.

Note that the fact that $\sigma_t$ is time-dependent does not make Kyle's $\lambda$ time-dependent; it is still the constant matrix $\Sigma_v (\Sigma_v \Sigma_0 \Sigma_v)^{-1/2} \Sigma_v$ in our case.

4.3. Multidimensional Lognormal Prior

For $n = 1$, assuming $\log(\tilde v) \sim N(m_v, \sigma_v^2)$ for some $m_v \in \mathbb{R}$ and $\sigma_v > 0$, $\nabla\Gamma$ can be explicitly computed as in Subsection 4.1. The explicit solution is:

$$\nabla\Gamma(x) = \exp\left(m_v + \frac{\sigma_v}{\sqrt{\Sigma_0}}\, x\right). \qquad (4.1)$$

By the Feynman-Kac formula (3.3) we obtain that

$$P_t = \exp\left(m_v + \frac{\sigma_v}{\sqrt{\Sigma_0}}\, Y_t + \frac{\sigma_v^2\, \Sigma_t}{2\, \Sigma_0}\right).$$

Additionally, we can explicitly compute the price impact by noting that:

$$dP_t = \partial_y H(t, Y_t)\, dY_t = \frac{\sigma_v}{\sqrt{\Sigma_0}}\, P_t\, dY_t$$

which shows that Kyle's lambda is proportional to the price. These findings are similar to those already reported by Back (1992, Example 2).

In the multidimensional case $n \ge 2$, one assumes that $\log(\tilde v) \sim N(m_v, \Sigma_v)$ for some $m_v \in \mathbb{R}^n$ and $\Sigma_v$ an $n \times n$ symmetric positive definite matrix. In this case too, Brenier's theorem shows that $\nabla\Gamma$ exists, and thanks to our main Theorem 1, we are guaranteed the existence of an equilibrium. Closed-form expressions for $\nabla\Gamma$ are not available in the literature for all distributions. However, the function $\nabla\Gamma$ can be computed numerically via the methods mentioned in Subsection 3.5. This case is easy to simulate, but for concision we reserve the simulations for the next cases, which are of more interest.

An interesting open question is whether, in the multidimensional case, the mapping $\nabla\Gamma$ admits some exponential factorisation such as (4.1) in the one-dimensional case. This would allow us to explicitly obtain the price impact as a function of the price process. We leave this question for future research.
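The one-dimensional lognormal formulas of this subsection can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: `var_total` stands for the variance of the terminal noise, `var_left` for the variance of the noise still to arrive after time $t$, and the price is taken to be the expectation of the terminal exponential map applied to the current order flow plus the remaining noise. All function and parameter names are hypothetical.

```python
import math


def price_closed_form(y, m_v, sigma_v, var_total, var_left):
    """exp(m_v + a*y + a^2 * var_left / 2) with a = sigma_v / sqrt(var_total)."""
    a = sigma_v / math.sqrt(var_total)
    return math.exp(m_v + a * y + 0.5 * a * a * var_left)


def price_by_quadrature(y, m_v, sigma_v, var_total, var_left, n=4001, width=10.0):
    """E[exp(m_v + a*(y + sqrt(var_left)*N))], N ~ N(0, 1), by the trapezoid rule."""
    a = sigma_v / math.sqrt(var_total)
    step = 2.0 * width / (n - 1)
    total = 0.0
    for k in range(n):
        z = -width + k * step
        weight = 0.5 if k in (0, n - 1) else 1.0
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += weight * math.exp(m_v + a * (y + math.sqrt(var_left) * z)) * density
    return total * step


def price_impact(y, m_v, sigma_v, var_total, var_left, bump=1e-4):
    """Central finite difference of the closed-form price in the order flow y."""
    up = price_closed_form(y + bump, m_v, sigma_v, var_total, var_left)
    down = price_closed_form(y - bump, m_v, sigma_v, var_total, var_left)
    return (up - down) / (2.0 * bump)
```

Under these assumptions the quadrature and the closed form agree, and the finite-difference price impact equals `sigma_v / sqrt(var_total)` times the price, which is the proportionality between Kyle's lambda and the price noted above.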

4.4. Case of a Stock and a Call Option

In this subsection, we show how our methodology allows us to solve a general version of the problem studied by Back (1993). We assume that there are two assets in the market: a stock and a European call option on the stock with strike $K$ maturing at time $T$, which is the instant the fundamental value of the stock will be revealed. We denote by $\nu_S$ the prior belief at time 0 for the stock value at $T$, and by $F_{\nu_S}$ the cumulative distribution function (CDF) of this distribution. Given that the second asset is a call option on the stock, the joint distribution $\nu$ of the terminal values of the stock and the call option at time $T$, denoted $(\tilde v_S, \tilde v_C)$, is a singular distribution on $\mathbb{R}^2$ supported on the graph of the payoff function $x \mapsto (x - K)^+$, i.e.

$$\nu(dv_S, dv_C) = \delta_{(v_S - K)^+}(dv_C)\, \nu_S(dv_S)$$

where $\delta$ is the Dirac mass. Additionally, $\mu$, the distribution of the noise $(Z^S_T, Z^C_T)$, is a Gaussian distribution whose probability density function (PDF) for $x \in \mathbb{R}^2$ is

$$p(x) = \frac{1}{2\pi \sqrt{\det \Sigma_0}}\, e^{-\frac{1}{2} x^\top (\Sigma_0)^{-1} x}. \qquad (4.2)$$

We allow, in particular, the two noise trading volumes to be arbitrarily correlated. Our main Theorem 1 applies in this framework and the optimal transport map $\nabla\Gamma : \mathbb{R}^2 \to \mathbb{R}^2$ from $\mu$ to $\nu$ provides an equilibrium pricing rule.

The computation of the map can be done via the methods mentioned in Subsection 3.5. Figures 1 and 2 show two sample paths, one finishing in the money, the other out of the money. We make the following assumptions for our simulations. The belief about the terminal value of the underlying is lognormal with mean 100 and return volatility 20%. The noise trading process follows a multivariate Gaussian distribution with mean 0 and variance 4 for each asset (i.e., the underlying and the call option). The covariance between the volumes is negative and set to match the restriction from Back (1993) that it be a fixed negative multiple of the noise in the call option. Of course, our method does not require this assumption; our purpose is to deviate from the Back (1993) case only by changing the prior from normal to lognormal, for which that paper was not able to obtain a solution. The strike price is $K = 100$. Comparing the option price coming out of the model with the Black-Scholes price, we see that the two prices track each other quite closely but converge more rapidly when the price ends up in the money.

[Figure 1 about here.]

[Figure 2 about here.]

[Figure 3 about here.]

Besides these computational methods, in this particular case it is in fact possible to describe the transport map based on an ordinary differential equation, generalizing the case of Back (1993). For this purpose, we conjecture, then prove, that there exists a function from $\mathbb{R}$ to $\mathbb{R}$ with a derivative less than $-1$, so that under and above its graph $\nabla\Gamma$ is a "simple" one-dimensional projection, described below.

Define the functions $p_1$, $p_2$ and $p_3$ by

$$p_1 : (y_S, y_C) \in \mathbb{R}^2 \mapsto \int_{-\infty}^{y_C} p(y_S, y)\, dy \qquad (4.3)$$

$$p_2 : (y_S, y_C) \in \mathbb{R}^2 \mapsto \int_{y_S}^{\infty} p(y, y_S + y_C - y)\, dy \qquad (4.4)$$

$$p_3 : (y_S, y_C) \in \mathbb{R}^2 \mapsto \int_{y_S}^{\infty} p(y, y_C - y)\, dy. \qquad (4.5)$$

We also define the ODE for $x \in \mathbb{R}$

$$\begin{cases} A'(x) = p_1(x, B(x)) \\[4pt] B'(x) = -\dfrac{F^{-1}_{\nu_S}(A(x)) - K}{F^{-1}_{\nu_S}\big(p_3(x, x + B(x)) + A(x)\big) - K} \end{cases} \qquad (4.6)$$

where the unknown is the couple $(A, B)$. This ODE is in fact ill-posed since the denominator might become small. The constants of integration of the ODE are determined by the condition

$$\lim_{x \to -\infty} A(x) = 0, \quad \text{and} \quad \lim_{x \to +\infty} A(x) = P\big(Z^C_T \le B(Z^S_T)\big) = q_K \qquad (4.7)$$

where $q_K = F_{\nu_S}(K)$ is the probability that the option will be out of the money at maturity and $F^{-1}_{\nu_S}$ is the quantile function of $\nu_S$. We provide below assumptions on $\nu_S$ to find solutions to the ODE satisfying this condition. Assuming this existence, we now provide an explicit construction of the transport map $\nabla\Gamma$.
For this purpose, denote

$$p_l(c) = P\big(Z^C_T \le B(Z^S_T) \text{ and } Z^S_T \le c\big) \qquad (4.8)$$

$$p_r(c) = 1 - P\big(Z^C_T \ge B(Z^S_T) \text{ and } Z^S_T + Z^C_T \ge c\big) \qquad (4.9)$$

$$f_l(x) = F^{-1}_{\nu_S}(p_l(x)) \quad \text{and} \quad f_r(x) = F^{-1}_{\nu_S}(p_r(x)). \qquad (4.10)$$

By direct computation, we have

$$p_r(x + B(x)) = p_3(x, x + B(x)) + A(x) \quad \text{for all } x \in \mathbb{R} \qquad (4.11)$$

and $B$ solves

$$B'(x) = -\frac{f_l(x) - K}{f_r(x + B(x)) - K} \quad \text{for all } x \in \mathbb{R} \qquad (4.12)$$

and satisfies $B' < -1$. Both functions $f_l$ and $f_r$ are increasing, and $f_l(x) \uparrow K$ as $x \uparrow \infty$ whereas $f_r(x) \downarrow K$ as $x \downarrow -\infty$. The following proposition provides the construction (up to computation of $B$ above) of the pricing rule:

Proposition 3. The function $\Gamma : \mathbb{R}^2 \to \mathbb{R}$ defined by

$$\Gamma(y_S, y_C) = \begin{cases} \displaystyle\int_0^{y_S} f_l(y)\, dy & \text{if } y_C \le B(y_S) \\[8pt] \displaystyle\int_{B(0)}^{y_S + y_C} f_r(y)\, dy - K\,(y_C - B(0)) & \text{if } y_C > B(y_S) \end{cases} \qquad (4.13)$$

is convex on $\mathbb{R}^2$ and $\nabla\Gamma(Z^S_T, Z^C_T)$ is distributed as $\nu$. Therefore, up to an additive constant, $\Gamma$ is the map in Theorem 2 and the pricing rule at final time $\nabla\Gamma$ has the expression

$$\nabla\Gamma(y_S, y_C) = \begin{cases} \begin{pmatrix} f_l(y_S) \\ 0 \end{pmatrix} & \text{if } y_C \le B(y_S) \\[10pt] \begin{pmatrix} f_r(y_S + y_C) \\ f_r(y_S + y_C) - K \end{pmatrix} & \text{if } y_C > B(y_S). \end{cases} \qquad (4.14)$$

Proof.

See Appendix B.3.

Proposition 3 allows us to fully compute an equilibrium and exhibits an important partition of the space into two regions, $\{y_C \le B(y_S)\}$, called "out of the volume" (OTV), and $\{y_C > B(y_S)\}$, called "in the volume" (ITV). We denote these regions as follows: $\mathrm{OTV} = \{y_C \le B(y_S)\}$ and $\mathrm{ITV} = \{y_C > B(y_S)\}$.

Thanks to our main theorem, Theorem 1, we have the following representation of the pricing rule for $t \in [0, T]$:

$$H(t, y) = E\left[\nabla\Gamma\left(y + \int_t^T \sigma_s\, dW_s\right)\right] = \begin{pmatrix} E\big[\mathbf{1}_{\mathrm{OTV}}\, f_l(y_S + Z^S_T - Z^S_t) + \mathbf{1}_{\mathrm{ITV}}\, f_r(y_S + Z^S_T - Z^S_t + y_C + Z^C_T - Z^C_t)\big] \\[6pt] E\big[\mathbf{1}_{\mathrm{ITV}}\, \big(f_r(y_S + Z^S_T - Z^S_t + y_C + Z^C_T - Z^C_t) - K\big)\big] \end{pmatrix}$$

and the price impact matrix $\lambda(t, y)$ is given by $\big[\partial_{y_S} H(t, y),\ \partial_{y_C} H(t, y)\big]$, where the components are the vectors

$$\partial_{y_S} H(t, y) = \begin{pmatrix} E\big[\mathbf{1}_{\mathrm{OTV}}\, f_l'(y_S + Z^S_T - Z^S_t) + \mathbf{1}_{\mathrm{ITV}}\, f_r'(y_S + Z^S_T - Z^S_t + y_C + Z^C_T - Z^C_t)\big] \\[6pt] E\big[\mathbf{1}_{\mathrm{ITV}}\, f_r'(y_S + Z^S_T - Z^S_t + y_C + Z^C_T - Z^C_t)\big] \end{pmatrix} + \begin{pmatrix} \displaystyle\int_{-\infty}^{\infty} B'(y_S + z)\, p(t, z, B(y_S + z) - y_C)\, \big(f_l(y_S + z) - f_r(y_S + z + B(y_S + z))\big)\, dz \\[8pt] \displaystyle\int_{-\infty}^{\infty} B'(y_S + z)\, p(t, z, B(y_S + z) - y_C)\, \big(K - f_r(y_S + z + B(y_S + z))\big)\, dz \end{pmatrix}$$

and

$$\partial_{y_C} H(t, y) = E\big[\mathbf{1}_{\mathrm{ITV}}\, f_r'(y_S + Z^S_T - Z^S_t + y_C + Z^C_T - Z^C_t)\big] \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \begin{pmatrix} \displaystyle\int_{-\infty}^{\infty} p(t, z, B(y_S + z) - y_C)\, \big(f_r(y_S + z + B(y_S + z)) - f_l(y_S + z)\big)\, dz \\[8pt] \displaystyle\int_{-\infty}^{\infty} p(t, z, B(y_S + z) - y_C)\, \big(f_r(y_S + z + B(y_S + z)) - K\big)\, dz \end{pmatrix}$$

where $p(t, y_S, y_C)$ is the probability density function of $\int_t^T \sigma_s\, dW_s$.

We now provide a lemma that yields existence of solutions to (4.6).

Lemma 3. Assume that there exists $\varepsilon > 0$ so that $\nu_S$ only charges points on $[-1/\varepsilon, K - \varepsilon] \cup [K + \varepsilon, 1/\varepsilon]$ and $F^{-1}_{\nu_S}$ is sufficiently smooth, with bounded derivatives, on $[0, q_K) \cup (q_K, 1]$. Then, there exists a solution $(A, B)$ to (4.6) satisfying (4.7).

Proof. See Appendix B.1.

The lemma mainly means that $\nu_S$ does not charge any mass near the strike of the option, which allows us to obtain existence of solutions to (4.6).

Note that a natural way of obtaining solutions for a general $\nu_S$ would be to approximate its quantile function $F^{-1}_{\nu_S}$ by quantile functions satisfying the assumptions of Lemma 3, then to show that the solutions of the equation with approximated quantile function converge. However, proving such a convergence seems to be challenging and is left for future research.

An example of a solution to (4.6) is provided in Back (1993). Assume that the noise covariance matrix $\Sigma_0$ satisfies the restriction of Back (1993) and that $F_{\nu_S}$ is symmetric around $K$; then a computation shows that $B(x) = -x$ solves the ODE (4.6). Indeed, we can directly compute $p_1(x, -x)$ and $p_2(x, -x) + \int_{-\infty}^{x} p_1(s, -s)\, ds$ and show that

$$p_2(x, -x) + \int_{-\infty}^{x} p_1(s, -s)\, ds - \frac{1}{2} = \frac{1}{2} - \int_{-\infty}^{x} p_1(s, -s)\, ds.$$

Additionally, the symmetry of $F_{\nu_S}$ around $K$ implies that $q_K = \frac{1}{2}$ and

$$\frac{F^{-1}_{\nu_S}\left(\int_{-\infty}^{x} p_1(s, -s)\, ds\right) - K}{F^{-1}_{\nu_S}\left(p_2(x, -x) + \int_{-\infty}^{x} p_1(s, -s)\, ds\right) - K} = -1.$$

Thus, $B(x) = -x$ solves (4.6) and one can compute $f_r$, $f_l$ to obtain an expression for $\nabla\Gamma$. Note that we are in fact generalizing the results of Back (1993) since we only require the symmetry of $F_{\nu_S}$ around $K$ and not its normality. However, we emphasize that our optimal transport based approach does not need the solvability of this ODE, which only indicates an additional property of the transport map. The optimal transport map, and therefore the equilibrium, exist by Brenier's theorem.

4.5. Case of a Stock, a Call and a Put

In this subsection, we assume that there are three traded assets: a stock, as well as European call and put options on the stock with the same maturity $T$ and strike $K$. (Our approach does not require that the call and put share the same strike; we impose this in order to discuss put-call parity implications.)

Note that at the final time the prices of the call option, the put option and the stock (in this order) take values in the set $U_{\mathrm{price}} \cup L_{\mathrm{price}}$, where $U_{\mathrm{price}} := \{(s - K, 0, s) : s \ge K\}$ and $L_{\mathrm{price}} := \{(0, K - s, s) : s \le K\}$. We are given the distribution $\nu_S$ of the stock at maturity, and denote by $F_{\nu_S}$ its CDF.

We conjecture that the transport map can actually be derived explicitly. However, we once again show some numerical computation results first, for a lognormal prior on the spot price. Figure 4 shows a sample path based on the parametrization used in the previous subsection, but this time with both a call and a put on top of the spot asset. The prices of the options quoted by the market maker (in blue) track the Black-Scholes prices (in green) with a time-varying spread which is sometimes negative and sometimes positive, likely related to the relative intensity of the noise levels.

[Figure 4 about here.]

Parity implications. Since $\nabla\Gamma$ takes values in $U_{\mathrm{price}} \cup L_{\mathrm{price}}$, the derivatives of $\Gamma$ satisfy the equality

$$\partial_{y_S}\Gamma(y_C, y_P, y_S) + \partial_{y_P}\Gamma(y_C, y_P, y_S) = K + \partial_{y_C}\Gamma(y_C, y_P, y_S).$$

Thus, taking the conditional expectation, for all $(t, y_C, y_P, y_S) \in \{(0, 0, 0, 0)\} \cup (0, T) \times \mathbb{R}^3$ we have

$$H_S(t, y_C, y_P, y_S) + H_P(t, y_C, y_P, y_S) = K + H_C(t, y_C, y_P, y_S) \qquad (4.15)$$

which is the classic put-call parity. We can now differentiate (4.15) to obtain identities between various entries of the price impact matrix that hold at any time and order flow.
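The parity argument rests on the fact that every terminal price vector (call, put, stock) satisfies stock + put = strike + call, and that this linear identity survives averaging. The sketch below illustrates this on simulated terminal stock values; the function names, the sample size, the seed, and the lognormal parameters are assumptions for illustration and are not taken from the simulations reported here.

```python
import math
import random


def terminal_prices(stock, strike):
    """Terminal (call, put, stock) prices, a point of U_price or L_price."""
    return max(stock - strike, 0.0), max(strike - stock, 0.0), stock


def simulate_terminal_points(n, strike, seed=0):
    """Draw n lognormal terminal stock values and the matching option payoffs."""
    rng = random.Random(seed)
    log_mean, log_std = math.log(100.0) - 0.5 * 0.2 ** 2, 0.2
    return [terminal_prices(math.exp(rng.gauss(log_mean, log_std)), strike)
            for _ in range(n)]


def parity_gap(points, strike):
    """Average of stock + put - strike - call over the sampled points."""
    return sum(s + p - strike - c for c, p, s in points) / len(points)
```

Because the gap is zero at every sampled point (up to rounding), it is zero for any average of such points, which is the step taken in the text when passing from the terminal identity to (4.15).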

4.6. Black-Scholes Implied Volatility Smile with Three Options

In this subsection, we consider the case where there are one asset and three European options of the same type (e.g., puts) trading at different strikes. This allows us to take a closer look at the option pricing implications of the model. In particular, we can compute the Black-Scholes implied volatility (IV) for the three strikes. It has been clear since Back (1993) that Black-Scholes dynamics do not apply when adding an option to the single-asset model of Back (1992): even if the prior about the fundamental value is lognormally distributed, the stock's price will not be lognormally distributed. However, IV is often used as an alternative measure of option price. Further, the empirical fact of the IV "smile" or "smirk" (created from different IVs at different strikes) has been one motivation behind the development of alternatives to Black-Scholes. To our knowledge, however, there has been no theory of how asymmetric information may lead to an IV smile. We show here that variation in the relative order flow for different options and the stock can create that link.

[Figure 5 about here.]

Figure 5 shows the implied volatility curvature based on three simulations of 1,000 Brownian trajectories for order flows. They include a spot asset and three put options with strikes 70, 100 and 130. The market maker's prior for the final asset price is assumed to be lognormal with mean 100 and volatility 20%. At each step of the trajectory, we compute the price of the spot asset and the three options, then we obtain the Black-Scholes IV for each option given the spot price. The curvature is then computed as $(IV_{70} + IV_{130} - 2 \times IV_{100}) / (130 - 70)$. A positive curvature indicates a smile, while a negative one indicates a frown.

First, it is noticeable that different paths can result both in smiles and frowns. The standard deviation of the curvature increases as time passes (as shown by the outer bounds of the crossbar). Because order flows are drawn from a multivariate Gaussian with mean zero, on average cumulative volumes will stay at zero. However, as time passes, it becomes more likely that some trajectories will go further away from zero. All things being equal, if one asset's order flow becomes larger relative to others, its price will be pushed up, and so will its implied volatility. This, in turn, controls the shape of the IV curvature. While the average curvature hovers around zero, it appears to increase at the very end, when it becomes clear which options will end up in the money. Individual paths for the curvature, however, can vary, and even revert.

Second, comparing Scheme 1 in Panel A, where the noise covariance is the identity, and Scheme 2 in Panel B, where it is the identity times four, the patterns are almost exactly the same. Scaling the variance of the order flow does not appear to affect the IV curvature distribution. Scheme 3, in Panel C, has the variance of the spot asset volume at four times that of the options. While the overall patterns are similar, the curvatures appear to be slightly more concentrated around their means throughout, although less so towards the end.

[Figure 6 about here.]

Figure 6 presents a similar setup but keeps the noise covariance as an identity matrix. Instead, each panel presents a different combination of strikes. Panel A is the same as in Figure 5, with puts at 70, 100 and 130. Panel B shows strikes 95, 100 and 105. Panel C shows strikes 70, 100 and 105.

Compared to Panel A, the average curvature in Panel B, with more concentrated strikes, appears to diminish into negative territory after time 0.5, before coming back up in the last few steps. With more concentrated strikes, it remains likely for a longer period of time that some out-of-the-money option ends up in the money, or vice versa. Panel C, with its asymmetric strikes, appears to have a similar pattern to Panel A for the average, but the dispersion is higher. The standard deviations for all panels, however, are very large, so that there are limits to how we can visually interpret the average curvature pattern.

[Table I about here.]

To better understand the drivers of the curvature, we therefore turn to some panel regressions, based on the same simulations as before. Table I uses the IVs and curvatures from Panel A in Figures 5 and 6. We look at how the flow on each of the spot and the three put options impacts each of the options' IVs, and the IV curvature (here multiplied by a scaling factor for legibility).

All IVs have an intercept which is relatively close to 20%, which is the volatility of the prior. Looking at the IV for the put at 70, it is driven positively by its own volumes, as well as volumes on the spot and the 100 put. Volumes on the 130 put, however, have a negative effect on the 70 put IV. This is only partially mirrored for the IV of the 130 put: activity on the 70 put does not significantly affect it, at the 5% threshold. The 100 put IV is positively affected by flows on the 70 put and negatively by flows on the 130. Again, this is not fully mirrored in the 130 put, where the volumes on the 100 put actually increase the 130 put IV.

Turning to the curvature, it increases with activity on the spot and the 70 and 130 puts, but decreases with volumes on the 100 put. Adding a time control does not affect the flow coefficients, but we can see that time passing does increase each of the IVs significantly, but not the curvature.

All in all, it appears the volumes on the different available options will determine the shape of the IV smile. This establishes a clear theoretical foundation to link asymmetric information and differences in IV. This also presents an opportunity to calibrate the model on observed IV smiles/smirks.
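The curvature measure used in this subsection can be sketched as follows. This is an illustrative implementation only: it assumes a zero interest rate and no dividends in the Black-Scholes put formula, inverts the formula by bisection, and uses hypothetical function names. The put prices fed to it in practice would be the equilibrium prices quoted by the market maker, which the sketch does not compute.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def bs_put(spot, strike, vol, maturity):
    """Black-Scholes European put price (assumed zero rate, no dividends)."""
    sd = vol * math.sqrt(maturity)
    d1 = (math.log(spot / strike) + 0.5 * sd * sd) / sd
    d2 = d1 - sd
    return strike * _STD_NORMAL.cdf(-d2) - spot * _STD_NORMAL.cdf(-d1)


def implied_vol(price, spot, strike, maturity, lo=1e-4, hi=5.0):
    """Invert bs_put in vol by bisection (the put price is increasing in vol)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_put(spot, strike, mid, maturity) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def iv_curvature(ivs, strikes):
    """(IV_low + IV_high - 2 * IV_mid) / (K_high - K_low), as defined in the text."""
    (iv_low, iv_mid, iv_high), (k_low, _, k_high) = ivs, strikes
    return (iv_low + iv_high - 2.0 * iv_mid) / (k_high - k_low)
```

As a consistency check, put prices generated by Black-Scholes itself with a flat 20% volatility give back a curvature of zero, while higher IVs on the outside strikes than on the middle strike give a positive curvature (a smile).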

5. Conclusion

This paper develops a methodology to prove the existence of, and characterize, the equilibrium in a very flexible multi-asset continuous-time Kyle-Back model, using the tools of optimal transport. It relaxes many of the limitations in the existing literature. There is no theoretical limitation on the number of assets and derivatives in the model. Options do not have to be at-the-money. The prior of the price distribution does not have to be Gaussian. There are few restrictions on the variance-covariance matrix of the uninformed traders, other than (quite natural) positive-definiteness. In fact, the current formulation of the model accommodates deterministically time-varying noise. We demonstrate how numerical methods efficiently help apply the model in cases where no easy conjecture is available on the shape of the transport map. This leads us to simulation results showing how the IV smile can be created by relative order flows under asymmetric information.

Our approach holds a lot of promise given the number of issues it can address. Besides efficient pricing, the paradigm could also be used to assess the desirable and unintended consequences of informed trader and transparency regulation for market equilibrium. The flexibility of our method should be well-suited to applications on intraday trades and quotes data. In particular, a subject of prime importance is whether one can fully calibrate our model on financial data. The calibration of $\sigma$ can be directly performed by observing the order flow $Y$. Then, the fundamental question is whether, by observing real market prices $(H^{\mathrm{market}}(t_i, y_{t_i}))_i$ and trading volumes $(y_{t_i})_i$, one can find the distribution $\nu$. We conjecture that such a calibration can be done by a stochastic gradient descent method.

A natural extension of our setting will be to allow for stochastic flows from noise traders, along the lines of Collin-Dufresne and Fos (2016).
Indeed, our methodology simply transforms the non-Gaussian price $\tilde v$ into a Gaussian $\xi$ so that the filtering problem is only carried out in a Gaussian framework. We conjecture that it would be possible to combine our approach with Collin-Dufresne and Fos (2016) to establish an equilibrium.

Additional extensions to the case where the market maker or the informed trader is risk averse are possible. Other, more difficult extensions are worth investigating. For example, even though noise traders are not sophisticated, it is not unimaginable that total flow has a feedback effect on their trading pattern. As such, allowing the volatility of noise trading to depend upon the volume is a promising avenue for future research.

Bibliography

Back, Kerry, 1992, Insider Trading in Continuous Time, The Review of Financial Studies 5, 387–409.

Back, Kerry, 1993, Asymmetric Information and Options, The Review of Financial Studies 6, 435–472.

Beiglböck, Mathias, Marcel Nutz, and Nizar Touzi, 2017, Complete Duality for Martingale Optimal Transport on the Line, The Annals of Probability 45, 3038–3074.

Biais, Bruno, and Pierre Hillion, 1994, Insider and Liquidity Trading in Stock and Options Markets, The Review of Financial Studies 7, 743–780.

Brenier, Yann, 1991, Polar Factorization and Monotone Rearrangement of Vector-Valued Functions, Communications on Pure and Applied Mathematics 44, 375–417.

Caffarelli, L. A., 1990, A Localization Property of Viscosity Solutions to the Monge-Ampere Equation and their Strict Convexity, Annals of Mathematics

Communications on Pure and Applied Mathematics 44, 965–969.

Collin-Dufresne, Pierre, and Vyacheslav Fos, 2016, Insider Trading, Stochastic Liquidity and Equilibrium Prices, Econometrica 84, 1441–1475.

Collin-Dufresne, Pierre, Vyacheslav Fos, and Dmitriy Muravyev, 2019, Informed Trading in the Stock Market and Option Price Discovery, SSRN Scholarly Paper ID 2675866, Social Science Research Network, Rochester, NY.

Cuturi, Marco, 2013, Sinkhorn Distances: Lightspeed Computation of Optimal Transport, in Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2, NIPS'13, 2292–2300 (Curran Associates Inc., Lake Tahoe, Nevada).

Dolinsky, Yan, and H. Mete Soner, 2014, Martingale Optimal Transport and Robust Hedging in Continuous Time, Probability Theory and Related Fields

The Journal of Finance 53, 431–465.

Ekren, Ibrahim, and H. Mete Soner, 2018, Constrained Optimal Transport, Archive for Rational Mechanics and Analysis

SIAM Journal on Financial Mathematics

arXiv:1801.03516 [math].

Kacperczyk, Marcin T., and Emiliano Pagnotta, 2019, Becker Meets Kyle: Inside Insider Trading, SSRN Scholarly Paper ID 3142006, Social Science Research Network, Rochester, NY.

Kramkov, Dmitry, and Yan Xu, 2019, An Optimal Transport Problem with Backward Martingale Constraints Motivated by Insider Trading, arXiv:1906.03309 [math, q-fin].

Kyle, Albert S., 1985, Continuous Auctions and Insider Trading, Econometrica 53, 1315–1335.

Liptser, Robert S., and Albert N. Shiryaev, 2001, Statistics of Random Processes II: Applications, Stochastic Modelling and Applied Probability, second edition (Springer-Verlag, Berlin Heidelberg).

McCann, Robert J., 1995, Existence and Uniqueness of Monotone Measure-Preserving Maps, Duke Mathematical Journal 80, 309–323.

Pasquariello, Paolo, and Clara Vega, 2015, Strategic Cross-Trading in the U.S. Stock Market, Review of Finance 19, 229–282.

Rochet, Jean-Charles, and Jean-Luc Vila, 1994, Insider Trading without Normality, The Review of Economic Studies 61, 131–152.

Rockafellar, R. Tyrrell, 1970, Convex Analysis (Princeton University Press).

Villani, Cédric, 2009, Optimal Transport: Old and New, Grundlehren Der Mathematischen Wissenschaften (Springer-Verlag, Berlin Heidelberg).

Figure 1: Sample Path for the Spot and Call Case (Lognormal Prior, In-The-Money at Expiry)

Noise covariances are negative, set to match the Back (1993) restriction. The price prior is lognormal with mean 100 and volatility 20%. Space is discretized over a bounded range for each noise component, and over four standard deviations around the mean for the price. We use 101 steps for each of the three dimensions. The strike price is $K = 100$. The first and second panels show the (anti-correlated) Brownian noise process for the spot and the call, respectively. The third panel shows the equilibrium spot price quoted by the market maker as order flows arrive. The fourth panel shows the call price quoted by the market maker in blue, and the Black-Scholes price based on the spot price from panel 3 and a 20% implied volatility, which is that of the market maker's prior, in green.

Figure 2: Sample Path for the Spot and Call Case (Lognormal Prior, Out-of-The-Money at Expiry)

This figure shows a random sample path for a particular parametrization of the model. The model includes a spot asset and a European call with strike price $K = 100$. Noise is multivariate Gaussian with mean 0 and variance 4 for each asset; covariances are negative, set to match the Back (1993) restriction. The price prior is lognormal with mean 100 and volatility 20%. Space is discretized over a bounded range for each noise component, and over four standard deviations around the mean for the price. We use 101 steps for each of the three dimensions. The first and second panels show the (anti-correlated) Brownian noise process for the spot and the call, respectively. The third panel shows the equilibrium spot price quoted by the market maker as order flows arrive.
The fourth panel shows the call price quoted by the market maker in blue, and the Black-Scholes price based on the spot price from panel 3 and a 20% implied volatility.

Figure 3: Price Map of Spot and Call Based on Noise Volumes (Lognormal Prior)

This figure shows the price map based on spot and option volumes for a particular parametrization of the model. The model includes a spot asset and a European call with strike price $K = 100$. Noise is multivariate Gaussian with mean 0 and variance 4 for each asset; covariances are negative, set to match the Back (1993) restriction. The price prior is lognormal with mean 100 and volatility 20%. Space is discretized over a bounded range for each noise component, and over four standard deviations around the mean for the price. We use 101 steps for each of the three dimensions. We consider five different times from 0 to 1.

Figure 4: Sample Path for the Spot, Call, and Put Case (Lognormal Prior, Call In-The-Money at Expiry)

Noise is multivariate Gaussian with mean 0 and variance 4 for each asset; covariances with the spot are negative, set to match the Back (1993) restriction. The noise covariance between the call and the put is set separately. The price prior is lognormal with mean 100 and volatility 20%. Space is discretized over a bounded range for each noise component, and over four standard deviations around the mean for the price. We use 101 steps for each of the three dimensions. The strike price is $K = 100$. The first three panels show the (anti-correlated) Brownian noise processes for the spot, the call, and the put, respectively. The fourth panel shows the equilibrium spot price quoted by the market maker as order flows arrive. The fifth and sixth panels show the call and put prices quoted by the market maker in blue, and the Black-Scholes price based on the spot price from the fourth panel and a 20% implied volatility.

Figure 5: Black-Scholes Implied Volatility Curvature for Different Noise Schemes

This figure summarizes the curvature of the Black-Scholes implied volatility over 1,000 Brownian paths drawn for one spot asset and three put options with strikes 70, 100 and 130. The curvature is computed as the sum of the IVs of the "outside" strikes minus twice the IV of the middle strike, divided by the distance between the outside strikes. The market maker starts with a lognormal prior on the spot price with mean 100 and volatility 20%. The volumes of the noise are drawn with mean zero and a different variance-covariance matrix according to three schemes in Panels A through C: Scheme 1 uses the identity matrix, Scheme 2 uses four times the identity matrix, and Scheme 3 sets the variance of the spot volume at four times that of the options. Each path is discretized into 20 time steps. The noise is discretized into a 30-step grid on each of the four dimensions, and the final price into a 500-step grid. Implied volatilities are standard Black-Scholes IVs given the spot and option prices. The crossbars show the mean curvature and one standard deviation on each side. The faint blue lines show the 1,000 IV trajectories (with y axis truncated).

Figure 6: Black-Scholes Implied Volatility Curvature for Different Strikes

This figure summarizes the curvature of the Black-Scholes implied volatility over 1,000 Brownian paths drawn for one spot asset and three put options with different strikes. The curvature is computed as the sum of the IVs of the "outside" strikes minus twice the IV of the middle strike, divided by the distance between the outside strikes. Panel A shows strikes 70, 100 and 130. Panel B shows strikes 95, 100 and 105. Panel C shows strikes 70, 100 and 105. The market maker starts with a lognormal prior on the spot price with mean 100 and volatility 20%. The volumes of the noise are drawn with mean zero and an identity matrix as variance-covariance matrix. Each path is discretized into 20 time steps. The noise is discretized into a 30-step grid on each of the four dimensions, and the final price into a 500-step grid. Implied volatilities are standard Black-Scholes IVs given the spot and option prices. The crossbars show the mean curvature and one standard deviation on each side. The faint blue lines show the 1,000 IV trajectories (with y axis truncated).

Table I: Panel Regression for Black-Scholes Implied Volatility and Order Flows with Three Puts (Noise Scheme 1)

This table compiles panel regressions of the Black-Scholes implied volatility and its curvature over 1,000 Brownian paths drawn for one spot asset and three put options with strikes 70, 100 and 130. The curvature is computed as the sum of the IVs of the "outside" strikes minus twice the IV of the middle strike, divided by the distance between the outside strikes. The market maker starts with a lognormal prior on the spot price with mean 100 and volatility 20%. The volumes of the noise are drawn with mean zero and an identity matrix as variance-covariance matrix. Each path is discretized into 20 time steps. The noise is discretized into a 30-step grid on each of the four dimensions, and the final price into a 500-step grid. Implied volatilities are standard Black-Scholes IVs given the spot and option prices. The dependent variable of the regression is the IV of the put at 70 in columns (1) and (2), of the put at 100 in columns (3) and (4), of the put at 130 in columns (5) and (6), and the rescaled curvature in columns (7) and (8). Independent variables are the order flows for the spot and each of the puts, as well as time (from 0 to 0.95) in columns (2), (4), (6), and (8). Standard errors are clustered at the path and time levels.
IV (P ) IV (P ) IV (P ) IV Curvature
(1) (2) (3) (4) (5) (6) (7) (8)
Intercept 0.213 ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ − ∗∗
(0.003) (0.003) (0.003) (0.002) (0.005) (0.004) (3.486) (5.177)
Spot Volume 0.017 ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗
(0.002) (0.002) (0.002) (0.002) (0.003) (0.003) (2.700) (2.671)
P Volume 0.045 ∗∗∗ ∗∗∗ ∗∗ ∗∗ ∗ ∗∗∗ ∗∗∗
(0.002) (0.001) (0.001) (0.001) (0.001) (0.001) (3.767) (3.748)
P Volume 0.020 ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ − ∗∗∗ − ∗∗∗
(0.002) (0.002) (0.004) (0.004) (0.002) (0.002) (12.167) (12.127)
P Volume − ∗∗∗ − ∗∗∗ − ∗∗∗ − ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗
(0.002) (0.002) (0.002) (0.002) (0.004) (0.004) (6.711) (6.705)
Time 0.044 ∗∗∗ ∗∗∗ ∗∗∗
∗ p < ∗∗ p < ∗∗∗ p <

Appendices

A. Optimal Transport Preliminaries

This appendix summarizes some well-known results in the optimal transport literature. We denote by $\Pi(\alpha, \beta)$ the set of probability distributions $Q$ on $\mathbb{R}^n \times \mathbb{R}^n$ such that

$$\int_{y \in \mathbb{R}^n} Q(dx_1, \ldots, dx_n, dy_1, \ldots, dy_n) = \alpha(dx) \quad \text{and} \quad \int_{x \in \mathbb{R}^n} Q(dx_1, \ldots, dx_n, dy_1, \ldots, dy_n) = \beta(dy),$$

where the first integral is over $dy$ and the second one over $dx$. By definition, under $Q \in \Pi(\alpha, \beta)$ the first $n$ coordinates $S_1 \in \mathbb{R}^n$ and the last $n$ coordinates $S_2 \in \mathbb{R}^n$ are, respectively, of law $\alpha$ and $\beta$. Note that for a mapping $M$ so that $M$ pushes $\alpha$ forward to $\beta$, the measure defined by $Q(dx, dy) = \delta_{M(x)}(dy)\,\alpha(dx)$ is in $\Pi(\alpha, \beta)$.

The fundamental problem in optimal transport theory is the study of the Monge problem

$$\inf_{M :\, M_\sharp \alpha = \beta} \int_{\mathbb{R}^n} |x - M(x)|^2 \, \alpha(dx) \tag{A.1}$$

and its relaxation by Monge-Kantorovich as follows:

$$\inf_{Q \in \Pi(\alpha, \beta)} \int_{\mathbb{R}^n \times \mathbb{R}^n} |x - y|^2 \, Q(dx, dy) = \inf_{Q \in \Pi(\alpha, \beta)} \mathbb{E}^Q\left[ |S_1 - S_2|^2 \right]. \tag{A.2}$$

The following theorem is one of the fundamental results of optimal transport theory. Its proof can be found in Brenier (1991, Theorem 3.1) or McCann (1995, Main Theorem).

Theorem 2 (Brenier's Theorem). Let $\alpha, \beta$ be two measures on $\mathbb{R}^n$ such that $\alpha$ is absolutely continuous with respect to the Lebesgue measure. Then there exists a unique (on the support of $\alpha$ and up to an additive constant) convex function $\Gamma$ so that $\nabla\Gamma$ pushes $\alpha$ forward to $\beta$. $\nabla\Gamma$ also provides an optimizer in the Monge optimal transport problem (A.1), and the measure $Q^* \in \Pi(\alpha, \beta)$ defined by

$$Q^*(dx, dy) = \delta_{\{\nabla\Gamma(x)\}}(dy)\,\alpha(dx)$$

is an optimizer for the Monge-Kantorovich problem (A.2).

A.1. Proof of Corollary 1

The existence of $\Gamma$ is a consequence of Theorem 2 and its properties can be found in (Brenier, 1991, Proposition 3.1). Note that $\Gamma$ is defined up to an additive constant. Independently of Corollary 1, in Lemma 1, we show that for any choice of $\Gamma$, $\Gamma(Z_T)$ is integrable.
Therefore, we can choose the additive constant to ensure $\mathbb{E}[\Gamma(Z_T)] = 0$.

We now construct $\zeta$. Recall the measure $Q^* \in \Pi(\mu, \nu)$ constructed in Theorem 2. There exists a disintegration of the measure $Q^*$ on $\nu$, meaning there exists a mapping that we also denote $Q^*$ so that
• $Q^*(dx, dy) = Q^*(dx, y)\,\nu(dy)$,
• $y \in \mathbb{R}^n \mapsto Q^*(A, y)$ is measurable for any Borel measurable subset $A$ of $\mathbb{R}^n$,
• for $\nu$-almost every $y \in \mathbb{R}^n$, the measure $Q^*(dx, y)$ is supported in $(\nabla\Gamma)^{-1}(y) := \{ z \in \mathbb{R}^n : \nabla\Gamma(z) = y \}$.

If $\nu$ is absolutely continuous, it is well known (see Brenier, 1991, Proposition 3.1) that $\nabla\Gamma$ is bijective and its inverse, denoted $(\nabla\Gamma)^{-1}$, is the optimal transport map of $\nu$ onto $\mu$. Therefore $Q^*(dx, y) = \delta_{\{(\nabla\Gamma)^{-1}(y)\}}(dx)$ and the random variable $(\nabla\Gamma)^{-1}(\tilde{v})$ has distribution $\mu$. Note that this construction does not use $U$.

If $\nu$ is not absolutely continuous, $\nabla\Gamma$ might fail to be injective. For all $v \in \mathbb{R}^n$, given the probability measure $Q^*(dx, v)$, we use Brenier's theorem to obtain a function $f_v : \mathbb{R}^n \to \mathbb{R}^n$ so that $f_v(U)$ has distribution $Q^*(dx, v)$. Since $f_v(U)$ is supported in $(\nabla\Gamma)^{-1}(v)$, we have that $\nabla\Gamma(f_v(U)) = v$. It is now also clear that, since $\tilde{v}$ has distribution $\nu$, from the perspective of the market maker the distribution of $f_{\tilde{v}}(U)$ is $\mu$, and we can take $\zeta = f_{\tilde{v}}(U)$.

B. Proofs
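To make the transport map of Appendix A concrete, the following is a minimal Python sketch, assuming a one-dimensional setting in which the source measure is a centered Gaussian and the target is lognormal. In one dimension the monotone map pushing one law onto the other is the target quantile function composed with the source CDF, and it plays the role of $\nabla\Gamma$ in Theorem 2. The function name `brenier_map_1d` and all parameter values are illustrative assumptions, not taken from the paper.

```python
import math
from statistics import NormalDist


def brenier_map_1d(x, noise_std, log_mean, log_std):
    """Monotone map pushing N(0, noise_std^2) onto a lognormal law.

    Assumption: one dimension only, where the gradient of the convex
    potential is (target quantile function) o (source CDF).
    """
    u = NormalDist(0.0, noise_std).cdf(x)      # source CDF at x
    z = NormalDist().inv_cdf(u)                # standard normal quantile of u
    return math.exp(log_mean + log_std * z)    # lognormal quantile of u


# Illustrative parameters (hypothetical): median 100, log-volatility 20%.
noise_std, log_mean, log_std = 1.5, math.log(100.0), 0.2
grid = [-3.0 + 0.5 * k for k in range(13)]
values = [brenier_map_1d(x, noise_std, log_mean, log_std) for x in grid]

for x, v in zip(grid, values):
    print(f"x = {x:5.2f}  ->  map(x) = {v:8.3f}")
```

In this Gaussian-to-lognormal case the map has the closed form $x \mapsto \exp(\text{log\_mean} + \text{log\_std} \cdot x / \text{noise\_std})$, which is increasing and hence the derivative of a convex function, as Brenier's theorem requires.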

B.1. Informed Trader’s Problem

Proof of Lemma 1.

We ﬁrst prove the integrability of (3.3) and (3.4). By deﬁntion of Γ , ∇ Γ (cid:16)(cid:82) T σ s dW s (cid:17) has distribution ν and therefore, thanks to the Holder inequality, and theexponential moments of the Gaussian distribution, there exists (cid:15) > so that E (cid:34) exp (cid:18) (cid:15) (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) T σ s dW s (cid:12)(cid:12)(cid:12)(cid:12)(cid:19) (cid:12)(cid:12)(cid:12)(cid:12) ∇ Γ (cid:18)(cid:90) T σ s dW s (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) p/ (cid:35) < ∞ .

35y conditioning and Fubini’s theorem, for ( t, y ) = (0 , or for all t ∈ (0 , T ] and Lebesguealmost every y ∈ R n we have that E (cid:34) exp (cid:18) (cid:15) (cid:12)(cid:12)(cid:12)(cid:12) y + (cid:90) Tt σ s dW s (cid:12)(cid:12)(cid:12)(cid:12)(cid:19) (cid:12)(cid:12)(cid:12)(cid:12) ∇ Γ (cid:18) y + (cid:90) Tt σ s dW s (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) p/ (cid:35) < ∞ (B.1)which shows that (3.3) is well deﬁned for such ( t, y ) .The convexity of Γ implies that for all y ∈ R n and s ∈ (0 , we have ( ∇ Γ( y ) − ∇ Γ( sy )) (cid:62) ((1 − s ) y ) ≥ and ( ∇ Γ( sy ) − ∇ Γ(0)) (cid:62) ( sy ) ≥ . By direct estimates, there exists a constant

C > such that C (cid:16) | y | /p + |∇ Γ( y ) | p p (cid:17) ≥ y (cid:62) ∇ Γ( y ) ≥ (cid:90) y (cid:62) ∇ Γ( sy ) ds ≥ Γ( y ) − Γ(0) ≥ y (cid:62) ∇ Γ(0) ≥ − C (cid:16) | y | /p + |∇ Γ(0) | p p (cid:17) . and therefore up to taking C > larger we have C (cid:0) | y | (2+8 /p )(2+ p/ + |∇ Γ( y ) | p/ + 1 (cid:1) ≥ | Γ( y ) | p/ which implies that E (cid:34) exp (cid:18) (cid:15) (cid:12)(cid:12)(cid:12)(cid:12)(cid:90) T σ s dW s (cid:12)(cid:12)(cid:12)(cid:12)(cid:19) (cid:12)(cid:12)(cid:12)(cid:12) Γ (cid:18)(cid:90) T σ s dW s (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) p/ (cid:35) < ∞ for eventually smaller (cid:15) > . Similarly as above, for ( t, y ) = (0 , or for all t ∈ (0 , T ] andLebesgue almost every y , we can show by conditioning and Fubini’s theorem that (3.4) iswell deﬁned.We now extend this integrability of (3.3) to all y . The proof of the extension for (3.4)can be done similarly. We ﬁx t ∈ (0 , T ) , y ∈ R n in the set of full measure where (B.1) holdsand y (cid:48) ∈ R n satisfying | y − y (cid:48) | × | (Σ t ) − | ≤ ε . For all x ∈ R n , the Gaussian kernel satisﬁes | e − ( x − y ) (cid:62) (Σ t ) − ( x − y ) − e − ( x − y (cid:48) ) (cid:62) (Σ t ) − ( x − y (cid:48) ) | = e − ( x − y ) (cid:62) (Σ t ) − ( x − y ) | − e ( y (cid:48) − y ) (cid:62) (Σ t ) − ( x − y + y (cid:48) ) |≤ | y − y (cid:48) | × | (Σ t ) − | e ε (cid:12)(cid:12)(cid:12) y + y (cid:48) (cid:12)(cid:12)(cid:12) e − ( x − y ) (cid:62) (Σ t ) − ( x − y ) e ε | x | . 
(B.2)We now have that, for all | y − y (cid:48) | × | (Σ t ) − | ≤ ε , the integrability of E (cid:34) exp (cid:18) ε (cid:12)(cid:12)(cid:12)(cid:12) y + (cid:90) Tt σ s dW s (cid:12)(cid:12)(cid:12)(cid:12)(cid:19) (cid:12)(cid:12)(cid:12)(cid:12) ∇ Γ (cid:18) y + (cid:90) Tt σ s dW s (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) p/ (cid:35) E (cid:34) exp (cid:18) ε (cid:12)(cid:12)(cid:12)(cid:12) y (cid:48) + (cid:90) Tt σ s dW s (cid:12)(cid:12)(cid:12)(cid:12)(cid:19) (cid:12)(cid:12)(cid:12)(cid:12) ∇ Γ (cid:18) y (cid:48) + (cid:90) Tt σ s dW s (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) p/ (cid:35) . Therefore, H ( t, y (cid:48) ) is well-deﬁned and in fact satisﬁes | H ( t, y ) − H ( t, y (cid:48) ) |≤ | y − y (cid:48) || (Σ t ) − | e ε (cid:12)(cid:12)(cid:12) y + y (cid:48) (cid:12)(cid:12)(cid:12) π (cid:112) det(Σ t ) (cid:90) R n |∇ Γ( x ) | e − ( x − y ) (cid:62) (Σ t ) − ( x − y ) e ε | x | dx. Additionally, for all t ∈ (0 , T ) , y (cid:55)→ H ( t, y ) is locally Lipschitz continuous. Being locallyLipschitz, H i ( t, y ) = 12 π (cid:112) det(Σ t ) (cid:90) R n ∂ x i Γ( x ) e − ( x − y ) (cid:62) (Σ t ) − ( x − y ) dx (B.3)is almost everywhere diﬀerentiable for i = 1 , . . . , n . We can now repeat the same argu-ments above with the function ∂ x i Γ( x )(Σ t ) − ( x − y ) instead of ∇ Γ( x ) and show that theformal derivative of (B.3) ∇ H i ( t, y ) = 12 π (cid:112) det(Σ t ) (cid:90) R n ∂ x i Γ( x )(Σ t ) − ( x − y ) e − ( x − y ) (cid:62) (Σ t ) − ( x − y ) dx. is in fact the derivative of H i . One can now use the upper bound (B.2) and repeat thearguments above to show that H i and ( t, y ) (cid:55)→ Γ( t, y ) are smooth solutions of (3.5) on (0 , T ) × R n and both functions are continuous at (0 , . Similarly (B.2) and a DominatedConvergence Theorem allows us to obtain (3.6).Finally, for all t ∈ [0 , T ] , H ( t, Z t ) = E [ ∇ Γ( Z T ) | σ ( W s , s ≤ t )] . 
Therefore, by the Jensen inequality, we have $\mathbb{E}[|H(t, Z_t)|] \leq \mathbb{E}[|\nabla\Gamma(Z_T)|] < \infty$ and we obtain (2.6).

Given the smoothness of $\Gamma$ and $H$, we can now easily establish the dynamic programming principle

$$H_i(t, y) = \mathbb{E}\left[ H_i\left( \tau, y + \int_t^\tau \sigma_s \, dW_s \right) \right]$$

for all stopping times $\tau$. This easily yields (3.5).

Proof of Lemma 2.

Note that $\Gamma$ satisfies

$$\partial_t \Gamma(t, y) + \frac{1}{2} \mathrm{Tr}\left( \sigma_t^2 \, \partial_{yy} \Gamma(t, y) \right) = 0 \quad \text{for } (t, y) \in [0, T) \times \mathbb{R}^n, \tag{B.4}$$

with final condition

$$\Gamma(T, y) = \Gamma(y) \quad \text{for all } y \in \mathbb{R}^n, \tag{B.5}$$

and, by assumption on $\Gamma(\cdot)$, $\Gamma(0, 0) = \mathbb{E}[\Gamma(Z_T)] = 0$. Due to the stochastic representation (3.4), it is clear that for all $t \in (0, T]$, $y \mapsto \Gamma(t, y)$ is convex. The smoothness of $\Gamma$ proven in Lemma 1 then easily implies that $\{ \partial_{y_i y_j} \Gamma(t, y) \}_{i,j = 1, \ldots, n}$ is a symmetric positive semi-definite matrix. Without loss of generality, we fix a trading strategy $X$ of the informed trader, so that $dX_t = dA^X_t + \sigma^X_t \, dW_t$, where $(A^X, \sigma^X)$ are the semi-martingale characteristics of $X$. By Ito's formula and the condition $\mathbb{E}[\Gamma(Z_T)] = 0$, the dynamics of $Y$, $dY_t = dA^X_t + (\sigma^X_t + \sigma_t) \, dW_t$, yield for $t \in (0, T)$

$$\begin{aligned} d\Gamma(t, Y_t) &= \frac{1}{2} \mathrm{Tr}\left( \left( (\sigma^X_t + \sigma_t)^2 - \sigma_t^2 \right) \partial_{yy} \Gamma(t, Y_t) \right) dt + H^\top(t, Y_t) \, dX_t + H^\top(t, Y_t) \, dZ_t \\ &= \mathrm{Tr}\left( \left( \sigma^X_t \sigma_t + \frac{1}{2} (\sigma^X_t)^2 \right) \partial_{yy} \Gamma(t, Y_t) \right) dt + H^\top(t, Y_t) \, dX_t + H^\top(t, Y_t) \, dZ_t, \end{aligned}$$

where we have used (B.4) to simplify $\partial_t \Gamma$. Note also that by a direct computation

$$d \sum_{i=1}^n \langle X^i, H_i(\cdot, Y_\cdot) \rangle_t = \mathrm{Tr}\left( \sigma^X_t (\sigma_t + \sigma^X_t) \, \partial_{yy} \Gamma(t, Y_t) \right) dt.$$

Thus, the dynamics of $\Gamma(t, Y_t)$ and the inequality (2.7) yield for all $\varepsilon \in (0, T/2)$

$$\begin{aligned} &\mathbb{E}\left[ \int_\varepsilon^{T - \varepsilon} (v - H(t, Y_t))^\top dX_t - \sum_{i=1}^n \left( \langle X^i, H_i(\cdot, Y_\cdot) \rangle_{T - \varepsilon} - \langle X^i, H_i(\cdot, Y_\cdot) \rangle_\varepsilon \right) \right] \\ &\quad = \mathbb{E}\left[ v^\top (Y_{T - \varepsilon} - Y_\varepsilon) - \Gamma(T - \varepsilon, Y_{T - \varepsilon}) + \Gamma(\varepsilon, Y_\varepsilon) - \int_\varepsilon^{T - \varepsilon} \mathrm{Tr}\left( \frac{1}{2} (\sigma^X_t)^2 \, \partial_{yy} \Gamma(t, Y_t) \right) dt \right]. \end{aligned}$$

Note that $H(\cdot, Y_\cdot)$ might have jumps at $0$ and $T$. However, $X$ is assumed to be continuous. Therefore, $t \in [0, T] \mapsto \langle X^i, H_i(\cdot, Y_\cdot) \rangle_t$ is continuous and square integrable. Additionally, by the convexity of $\Gamma$,

$$\mathrm{Tr}\left( \frac{1}{2} (\sigma^X_t)^2 \, \partial_{yy} \Gamma(t, Y_t) \right) \geq 0.$$

Therefore we can send $\varepsilon$ to $0$ to obtain

$$\begin{aligned} &\mathbb{E}\left[ \int_0^T (v - H(t, Y_t))^\top dX_t - \sum_{i=1}^n \langle X^i, H_i(\cdot, Y_\cdot) \rangle_T \right] \\ &\quad = \mathbb{E}\left[ v^\top Y_T - \Gamma(Y_T) - \int_0^T \mathrm{Tr}\left( \frac{1}{2} (\sigma^X_t)^2 \, \partial_{yy} \Gamma(t, Y_t) \right) dt \right] \\ &\quad = \mathbb{E}\left[ v^\top Y_T - \Gamma(Y_T) - \frac{1}{2} \sum_{i,j=1}^n \int_0^T \partial_{y_i y_j} \Gamma(t, Y_t) \, d\langle X^i, X^j \rangle_t \right]. \end{aligned} \tag{B.6}$$

Proof of Proposition 1.

Due to the convexity of $\Gamma(y)$, the function $\Gamma(t, y)$ is convex and therefore

$$\frac{1}{2} \sum_{i,j=1}^n \int_0^T \partial_{y_i y_j} \Gamma(t, Y_t) \, d\langle X^i, X^j \rangle_t \geq 0$$

for any trading strategy $X$, and the right-hand side of (B.6) is bounded from above by $\Gamma^*(v)$. Note that this inequality also shows that any martingale part of the trading strategy will be costly to the informed trader.

We now show that for our candidate strategy (3.11) this upper bound $\Gamma^*(v)$ is achieved. It is well known (see Rockafellar (1970)) that the supremum in (3.8) is achieved at any $y \in \mathbb{R}^n$ satisfying $\nabla\Gamma(y) = v$.

By our construction, $\nabla\Gamma(\zeta) = v$ almost surely. Additionally, for the strategy defined at (3.11), we have that $\sigma^X = 0$. Thus, in order to obtain the optimality of the candidate strategy (3.11), it is sufficient to show that the solution of (3.11) satisfies $Y_T = \zeta$. Note that by a direct computation the solution of (3.11) is

$$Y_t = \left( I_n - \Sigma_t (\Sigma_0)^{-1} \right) \zeta + \Sigma_t \int_0^t (\Sigma_s)^{-1} \sigma_s \, dW_s, \tag{B.7}$$

where $I_n$ is the identity matrix of dimension $n$.

As symmetric matrices, we have $\overline{\sigma} I_n \geq \sigma_t \geq \underline{\sigma} I_n$ for some positive constants $\overline{\sigma} > \underline{\sigma} > 0$, and by integration $(T - s)\, \overline{\sigma}^2 I_n \geq \Sigma_s \geq (T - s)\, \underline{\sigma}^2 I_n$. These inequalities and the Ito isometry easily imply that $\frac{\Sigma_t}{\sqrt{T - t}} \int_0^t (\Sigma_s)^{-1} \sigma_s \, dW_s$ is bounded in $L^2$. Therefore $\Sigma_t \int_0^t (\Sigma_s)^{-1} \sigma_s \, dW_s$ converges to $0$ in $L^2$ and almost surely as $t \to T$, which concludes the proof of optimality of (3.11).

B.2. Market Maker's Problem
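As a numerical check of the bridge formula (B.7) from the proof of Proposition 1, here is a sketch in one dimension. It assumes a constant volatility $\sigma$ and $\Sigma_t = \sigma^2 (T - t)$, both chosen for illustration only; under these assumptions (B.7) reduces to $Y_t = (t/T)\,\zeta + \sigma (T - t) \int_0^t dW_s / (T - s)$. The function name `simulate_bridge` and the parameter values are hypothetical.

```python
import math
import random


def simulate_bridge(zeta, sigma, T, n_steps, rng):
    """Discretize Y_t = (t/T) zeta + sigma (T - t) int_0^t dW_s / (T - s).

    Assumption: one dimension, constant sigma, Sigma_t = sigma^2 (T - t).
    The stochastic integral is approximated by a left-point (Ito) sum.
    """
    dt = T / n_steps
    stochastic_integral = 0.0
    path = [0.0]  # Y_0 = 0
    for k in range(n_steps):
        t_left = k * dt
        dW = rng.gauss(0.0, math.sqrt(dt))
        stochastic_integral += dW / (T - t_left)
        t = (k + 1) * dt
        path.append((t / T) * zeta + sigma * (T - t) * stochastic_integral)
    return path


rng = random.Random(0)  # fixed seed so the run is reproducible
path = simulate_bridge(zeta=1.3, sigma=0.4, T=1.0, n_steps=200, rng=rng)
print("Y_0 =", path[0], " Y_T =", path[-1])
```

Whatever the simulated noise, the factor $(T - t)$ vanishes at maturity, so the total order flow ends at $\zeta$, which is the property $Y_T = \zeta$ used in the proof.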

Proof of Proposition 2.

At time $t = 0$ the market maker knows the a priori distribution of $\zeta$, which is Gaussian by construction. Thanks to Liptser and Shiryaev (2001, Theorem 12.7), conditionally on $\mathcal{F}^Y_t$, $\zeta$ is normally distributed with mean denoted $m_t$ and variance denoted $V_t$ that are the unique continuous solutions to

$$dm_t = V_t (\Sigma_t)^{-1} \sigma_t^2 (\sigma_t^2)^{-1} \left( dY_t - \sigma_t^2 (\Sigma_t)^{-1} (m_t - Y_t) \, dt \right), \tag{B.8}$$

$$dV_t = - V_t (\Sigma_t)^{-1} \sigma_t^2 (\sigma_t^2)^{-1} \sigma_t^2 (\Sigma_t)^{-1} V_t^\top \, dt, \tag{B.9}$$

with initial conditions $V_0 = \Sigma_0$ and $m_0 = Y_0 = 0$. Note that $V_t = \Sigma_t$ solves (B.9) and thus, by uniqueness, $V_t = \Sigma_t$ for all $t \in [0, T]$. We inject this equality into (B.8) to obtain

$$dm_t = dY_t - \sigma_t^2 (\Sigma_t)^{-1} (m_t - Y_t) \, dt.$$

Since $m_0 = Y_0$, the difference $m - Y$ solves a linear equation started at zero, so $Y = m$, and this proves that conditionally on $\mathcal{F}^Y_t$, $\zeta$ is $N(Y_t, \Sigma_t)$. Note that regardless of the absolute continuity of $\nu$, by construction, we have $\tilde{v} = \nabla\Gamma(\zeta)$.

Thus, we have the following equalities that complete the proof of the proposition:

$$\mathbb{E}[\tilde{v} \mid \mathcal{F}^Y_t] = \mathbb{E}[\nabla\Gamma(\zeta) \mid \mathcal{F}^Y_t] = \mathbb{E}\left[ \nabla\Gamma\left( y + \int_t^T \sigma_s \, dW_s \right) \right]_{y = Y_t} = H(t, Y_t).$$

B.3. Case of a Stock with a Call Option
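The pricing identity $\mathbb{E}[\tilde{v} \mid \mathcal{F}^Y_t] = H(t, Y_t)$ from the proof of Proposition 2 can be illustrated with a small numerical sketch. It assumes one dimension, a constant $\sigma$, and a lognormal prior, so that the transport map is $\nabla\Gamma(z) = \exp(\text{log\_mean} + \text{log\_std} \cdot z / (\sigma \sqrt{T}))$. The expectation is approximated with equal-probability Gaussian nodes, which is a simple quadrature chosen here for self-containment and not a method taken from the paper. All names and parameter values are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def grad_gamma(z, sigma, T, log_mean, log_std):
    """Transport map sending N(0, sigma^2 T) onto a lognormal law (1-D)."""
    return math.exp(log_mean + log_std * z / (sigma * math.sqrt(T)))


def pricing_rule(t, y, sigma, T, log_mean, log_std, n_nodes=2000):
    """Approximate H(t, y) = E[grad_gamma(y + sigma * (W_T - W_t))]."""
    scale = sigma * math.sqrt(T - t)  # std of the remaining noise
    total = 0.0
    for k in range(n_nodes):
        xi = STD_NORMAL.inv_cdf((k + 0.5) / n_nodes)  # equal-probability node
        total += grad_gamma(y + scale * xi, sigma, T, log_mean, log_std)
    return total / n_nodes


sigma, T = 0.5, 1.0
log_std = 0.2
log_mean = math.log(100.0) - 0.5 * log_std ** 2  # so the prior mean is 100

price_at_start = pricing_rule(0.0, 0.0, sigma, T, log_mean, log_std)
price_at_end = pricing_rule(T, 0.3, sigma, T, log_mean, log_std)
print("H(0, 0)   =", round(price_at_start, 3))
print("H(T, 0.3) =", round(price_at_end, 3))
```

Under these assumptions the initial quote $H(0, 0)$ should be close to the prior mean, and at maturity the quote coincides with the transport map evaluated at the terminal order flow.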

Proof of Proposition 3.

Given the continuity of p l , p r , F − ν S and B , to obtain the continuityof Γ , it is suﬃcient to prove this continuity on the graph of B . This continuity holds if (cid:90) x f l ( z ) dz = (cid:90) x + B ( x ) B (0) f r ( z ) dz − K ( B ( x ) − B (0)) for all x ∈ R . (B.10)Note that (B.10) holds for x = 0 . Thus, it is suﬃcient to show the equality of the derivativesof both sides of (B.10) in x , which is f l ( x ) − f r ( x + B ( x )) = B (cid:48) ( x )( f r ( x + B ( x )) − K ) for all x ∈ R . This is equivalent to (4.12) and we conclude the proof of continuity of Γ . We now show the convexity of Γ by showing that for all ( x i , y i ) ∈ R , i = 1 , we havethat ( ∇ Γ( x , y ) − ∇ Γ( x , y )) (cid:62) (( x , y ) − ( x , y )) ≥ . (B.11)Given the symmetry of the statement in ( x , y ) , ( x , y ) and the fact that Γ is convex onboth sets { y ≤ B ( x ) } and { y > B ( x ) } , without loss of generality we assume that B ( x ) < y and B ( x ) ≥ y (B.12)and expand (B.11) to ( f r ( x + y ) − K )( x + y − ( x + y )) + ( K − f l ( x ))( x − x ) . (B.13)Note that the derivative of (B.13) in y is K − f r ( x + y ) which is non positive due to themonotonicity of f r and its limit at + ∞ . Thus, by the assumption y ≤ B ( x ) , the expression(B.13) is larger or equal than ( f r ( x + y ) − K )( x + y − ( x + B ( x ))) + ( K − f l ( x ))( x − x ) . (B.14)40e now diﬀerentiate this expression in y to obtain f (cid:48) r ( x + y )( x + y − ( x + B ( x ))) + f r ( x + y ) − K. (B.15)We now show that (B.14) is non negative if x ≤ x . We treat the other case below. Dueto the monotonicity of x + B ( x ) and the assumption (B.12), x + y > x + B ( x ) ≥ x + B ( x ) . Thus, (B.15) yields that if x ≤ x (B.14) is increasing in y and as a consequence of (B.12)it is bounded from below by ( f r ( x + B ( x )) − K )( x + B ( x ) − ( x + B ( x ))) + ( K − f l ( x ))( x − x ) . 
(B.16)Let x ∈ [ x , x ] , then the fact that f l and f r are increasing and x (cid:55)→ x + B ( x ) is decreasingimplies that K − f l ( x ) K − f l ( x ) ≥ ≥ f r ( x + B ( x )) − Kf r ( x + B ( x )) − K (B.17)which yields thanks to (4.12) to x + B ( x ) − ( x + B ( x )) = (cid:90) x x K − f l ( x ) f r ( x + B ( x )) − K dx ≥ ( x − x ) K − f l ( x ) f r ( x + B ( x )) − K .

Thanks to the inequality f r ≥ K , we rearrange these terms to obtain that (B.16) (andtherefore (B.14)) is non negative in the case x ≤ x .In order to complete the proof of convexity, we now assume that x > x . Note that forany y satisfying y ≥ x + B ( x ) − x , (B.15) is non negative. Thus, the minimum of (B.14)in y is achieved at a value satisfying y ≤ x + B ( x ) − x . Given also the assumption (B.12),in order to show that (B.14) is non negative we can without loss of generality assume that x + B ( x ) < x + y ≤ x + B ( x ) .Let x be deﬁned by x + B ( x ) = x + y . By the strict monotonicity of x + B ( x ) thisimplies that x > x ≥ x and we can write (B.14) as ( f r ( x + B ( x )) − K )( x + B ( x ) − ( x + B ( x ))) + ( K − f l ( x ))( x − x )+ ( K − f l ( x ))( x − x ) . (B.18)The second line being non negative, in order to ﬁnish the proof, it is suﬃcient to prove that ( f r ( x + B ( x )) − K )( x + B ( x ) − ( x + B ( x ))) + ( K − f l ( x ))( x − x ) ≥ (B.19)for x > x . Similarly to (B.17), for all x ∈ ( x , x ) , we have that K − f l ( x ) K − f l ( x ) ≥ ≥ f r ( x + B ( x )) − Kf r ( x + B ( x )) − K . K − f l ( x ) f r ( x + B ( x )) − K ≥ x − x (cid:90) x x K − f l ( x ) f r ( x + B ( x )) − K dx ≥ x − x ( x + B ( x ) − ( x + B ( x ))) which implies (B.19) and concludes the proof of the convexity of Γ . Proof of Lemma 3.

We ﬁx

R > . Under the assumptions of the Lemma for all u < q K < v ,we have F − ν S ( u ) − K ≤ − ε ≤ ε ≤ F − ν S ( v ) − K. (B.20)One can ﬁnd smooth increasing maps φ ,R , φ ,R : [0 , (cid:55)→ R so that for all u, v ∈ [0 , , ≤ φ ,R ( u ) ≤ q K − R − , q K + R − ≤ φ ,R ( v ) ≤ and for all ≤ u < q K < v ≤ φ ,R ( u ) ↑ u and φ ,R ( v ) ↓ v as R ↑ ∞ . For i = 1 , , denote F i,R ( u ) = F − ν S ( φ i,R ( u )) and ( A b,R , B b,R ) the solution of the ODE A b,R ( x ) = (cid:90) x − R p ( s, B b,R ( s )) dsB b,R ( x ) = b − ( x + R ) + (cid:90) x − R F ,R ( A b,R ( s )) − KF ,R ( p ( s, s + B b,R ( s )) + A b,R ( s )) − K ds. (B.21)Note that due to the choice of φ i,R , the data of this ODE is Lipschitz in ( A b,R , B b,R ) and thesolutions exist and depend continuously on b. We can also diﬀerentiate the solution of theODE in b to ﬁnd that ∂ b A b,R ( x ) = (cid:90) x − R p ( s, B b,R ( s )) ∂ b B b,R ( s ) ds∂ b B b,R ( x ) = 1 + (cid:90) x − R ( F ,R ) (cid:48) (cid:0) A b,R ( s ) (cid:1) ( F ,R (cid:0) p ( s, s + B b,R ( s )) + A b,R ( s ) (cid:1) − K )( F ,R (cid:0) p ( s, s + B b,R ( s )) + A b,R ( s ) (cid:1) − K ) ∂ b A b,R ( s ) ds − (cid:90) x − R ( F ,R (cid:0) A b,R ( s ) (cid:1) − K )( F ,R ) (cid:48) (cid:0) p ( s, s + B b,R ( s )) + A b,R ( s ) (cid:1) ( F ,R (cid:0) p ( s, s + B b,R ( s )) + A b,R ( s ) (cid:1) − K ) ∂ b A b,R ( s ) ds − (cid:90) x − R ( F ,R (cid:0) A b,R ( s ) (cid:1) − K )( F ,R ) (cid:48) (cid:0) p ( s, s + B b,R ( s )) + A b,R ( s ) (cid:1) p ( s, B b,R ( s ))( F ,R (cid:0) p ( s, s + B b,R ( s )) + A b,R ( s ) (cid:1) − K ) ∂ b B b,R ( s ) ds. The signs of F ,R ( p ( s, s + B b,R ( s )) + A b,R ( s )) − K , F ,R ( A b,R ( s )) − K and the monotonicityof F − ν S shows that sign ( ∂ b A b,R ( x )) = sign ( x + R ) , sign ( ∂ b B b,R ( x )) = 1 . b (cid:55)→ (cid:90) x − R p ( s, B b,R ( s )) ds is increasing and continuous and therefore there exists b R so that it is .Note that the set { A b R ,R (0) : R > } is bounded. 
Thus, we can take a subsequencethat converges. Additionally, { A b R ,R : R > } is equicontinuous and bounded on boundedsets. Therefore, thanks to Arzela Ascoli theorem, we can take a further subsequence so that A b R ,R → A ∞ uniformly on compact sets that is increasing and satisﬁes lim x →−∞ A ∞ ( x ) = 0 and lim x →∞ A ∞ ( x ) = q K and hence satisfy < A ∞ ( x ) < q K . Note that due to (B.20) B (cid:48) b R ,R ∈ ( − ε − , − . Thus, if { B b R ,R (0) : R > } admits asubsequence diverging to ±∞ , B b R ,R ( x ) also diverges to the same limit for all x . Thus, A b R ,R ( x ) either converges to or to which is in contradiction with lim x →∞ A ∞ ( x ) = q K . Therefore, we conclude that { B b R ,R (0) : R > } is also bounded. Similarly to A , wecan take a subsequence converging to B ∞ uniformly on compact sets. One can show that p ( s, s + B b,R ( s )) + A b,R ( s ) is decreasing and satisﬁes lim x →−∞ p ( s, s + B b,R ( s )) + A b,R ( s ) = 1 and lim x →∞ p ( s, s + B b,R ( s )) + A b,R ( s ) = q K . It is now easy to see that ( A ∞ , B ∞ ))