Optimal Trading with Differing Trade Signals
Ryan Donnelly
Matthew Lorig
This version: June 25, 2020
Abstract
We consider the problem of maximizing portfolio value when an agent has a subjective view on asset value which differs from the traded market price. The agent’s trades have a price impact which affects the price at which the asset is traded. In addition to the agent’s trades affecting the market price, the agent may change his view on the asset’s value if its difference from the market price persists. We also consider a situation in which several agents interact and trade simultaneously, each with a subjective view on the asset value. Two cases of the subjective views of agents are considered: one in which they all share the same information, and one in which each agent has an individual signal correlated with price innovations. To study the large agent problem we take a mean-field game approach which remains tractable. After classifying the mean-field equilibrium we compute the cross-sectional distribution of agents’ inventories and the dependence of the price distribution on the amount of shared information among the agents.
Author affiliations: Ryan Donnelly, Department of Mathematics, King’s College London. Matthew Lorig, Department of Applied Mathematics, University of Washington.

A significant proportion of trading performed in modern markets is done by computer algorithms, with some reports giving figures of up to 80% of trades in some markets (see Kaya et al. (2016) and Bigiotti and Navarra (2018)). Many of these algorithms are used to execute strategies that manage inventory, for example to rebalance a portfolio or achieve a desired hedge ratio. Others may also be speculative, executing trades based on predictions of market behaviour. When a trading strategy is designed based on speculation, trade executions are typically based on a trade signal, which indicates that the value of the asset at a future time will be predictably different from its present value. Exploiting this predicted difference offers the possibility of attaining a profit.

The trade executions which are conducted in a market will also have an impact on the dynamics of the market itself, and when several market participants implement trading strategies simultaneously they will inevitably influence the behaviour of each other. Thus, in order for a strategy to be designed to perform executions in an optimal manner, the trade signals of other market participants should also be taken into account. The analysis of such a system quickly leads to a high dimensional problem, but framing the system in terms of a mean-field game allows for further tractability.

In this work, we consider how an agent will base his trades through time on an observed trade signal. In general, a trade signal can be an abstract quantity which dictates tendencies of market dynamics as in Donnelly and Gan (2018), or it can be treated as a direct valuation adjustment of the asset compared to the prevailing midprice as in Lehalle and Mounjid (2017). We take the latter approach in that the agent’s trade signal is directly transformed into a monetary quantity to be added to the asset price to give a subjective valuation. The agent also controls his trading to manage the risk of his position at the end of the trading horizon as an acknowledgement that his assessment of value may not be accurate. This single agent framework is structured similarly to Almgren and Chriss (2001) with the addition of the observed trade signal.

Beyond the single agent problem we investigate a market in which several agents are trading, each of whom is observing a trade signal that dictates their subjective valuation of the asset. In this setting, in order to fully optimize the profits they seek to extract from their trade signal, the agents must take into account the aggregate behaviour of other market participants. In order to maintain tractability of the model when there are a large number of agents, we use a mean-field game approach. Similar approaches with respect to optimal execution and algorithmic trading are conducted in Huang et al.
(2015) and Casgrain and Jaimungal (2018b).

Additionally, Casgrain and Jaimungal (2018a) also consider a mean-field approach in which subsets of agents have different views of the asset price. While in that work the differing beliefs are modeled as agents behaving according to different probability measures, we work with a fixed probability measure under which each agent observes a different trade signal process which forms their subjective valuation. We consider two specifications of how the trade signals of different agents relate to each other. First, we suppose that they all share the same trade signal which is correlated with the asset’s midprice. Then any differences in their trading behaviour will come from different initial inventories. Second, we suppose that they all have different trade signals, each correlated with the asset’s midprice with a structure that also dictates the nature of the correlation between any pair of signals. In the second case, if all correlations are equal to 1 and the initial signal states are identical across all agents, then the model reverts to the first case of the shared signal.

In order for each agent to optimize their trades they must take into account the order flow submitted by other agents. This is similar to models proposed in Cartea and Jaimungal (2016a) and Cartea and Jaimungal (2016b) in which net order flow is given by an exogenous process. In this work, agents make an assumption about the net order flow process before conducting their individual optimization. A mean-field equilibrium is reached by finding a fixed point of the net order flow process. If all agents trade according to the mean-field equilibrium, then we can quantify the relationship between the correlation of their trade signals and the cross-sectional distribution of their inventories through time.
This also allows us to study how the overall price impact on the asset depends on how much information is shared between agents, as dictated by the correlation between their signals.

In Section 2 we propose our model of a single agent optimizing trades with the observation of a trade signal and analyze the agent’s optimal trading strategy. In Section 3.1 we propose our model when there are several agents trading simultaneously. The section is broken down into subsections depending on whether the agents share the same signal (Section 3.2) or have separate but correlated signals (Section 3.3). In Section 4 we compute the cross-sectional joint distribution of the inventory and trade signal across all agents and show how this depends on the correlation of the trade signals. We also compute how this correlation affects the variance of the asset price. We conclude in Section 5.

In this section, we consider a single agent that wishes to maximize the value of a portfolio at a future time $T < \infty$ through trading a single risky asset with temporary and permanent price impact. The agent has his own subjective valuation of the asset which may be different from the market price. At each point in time the agent chooses a rate at which he buys shares of the asset via a process $\nu = (\nu_t)_{0 \le t \le T}$. Thus, denoting the agent’s inventory holdings by $Q^\nu = (Q^\nu_t)_{0 \le t \le T}$, it changes according to
$$dQ^\nu_t = \nu_t\,dt, \qquad Q^\nu_0 = Q_0.$$
The market view of asset value is denoted $S^\nu = (S^\nu_t)_{0 \le t \le T}$, which will be subject to a permanent price impact due to the agent’s trades. We model permanent price impact through a linear relation and thus the market view of the asset satisfies
$$dS^\nu_t = (\mu + b\,\nu_t)\,dt + \sigma\,dW_t, \qquad S^\nu_0 = S_0,$$
where $W = (W_t)_{0 \le t \le T}$ is a Brownian motion. Temporary price impact is also accounted for by modeling the price of trades as being dependent on the speed of trading.
Given that the speed of trading at time $t$ is $\nu_t$, the price at which the transaction occurs is
$$\hat S^\nu_t = S^\nu_t + k\,\nu_t.$$
Thus, the agent’s cash process, denoted $X^\nu = (X^\nu_t)_{0 \le t \le T}$, satisfies
$$dX^\nu_t = -\hat S^\nu_t\,\nu_t\,dt, \qquad X^\nu_0 = X_0.$$
We suppose that the agent observes a trade signal which means his own subjective view of the asset’s value differs from the traded market price $S^\nu$. We will denote the difference between the subjective value and the traded market price by $V^\nu = (V^\nu_t)_{0 \le t \le T}$, which satisfies
$$dV^\nu_t = -(\beta V^\nu_t + \gamma\,\nu_t)\,dt + \eta\,dZ_t, \qquad V^\nu_0 = V_0,$$
where $Z = (Z_t)_{0 \le t \le T}$ is a Brownian motion correlated with $W$ with constant correlation parameter $\rho$. At time $t$ when the market price of the asset is $S^\nu_t$, the agent’s subjective valuation of the asset due to the trade signal is $S^\nu_t + V^\nu_t$. Even though $V^\nu_t$ represents a valuation adjustment due to the trade signal, we will refer to the process $V^\nu$ as the trade signal. The dynamics of $V^\nu$ imply that the trade signal is influenced by the trading of the agent, this coming from the term $-\gamma\,\nu_t$. This is to capture the effect of diminishing the trade signal’s strength when the agent acts upon the information that it provides.

2.2 Agent’s Objective Functional and HJB Equation

Throughout the remainder of Section 2 we work with a complete and filtered probability space $(\Omega, (\mathcal{F}_t)_{0 \le t \le T}, \mathbb{P})$ where $(\mathcal{F}_t)_{0 \le t \le T}$ is the standard augmentation of the natural filtration generated by $(W_t, Z_t)_{0 \le t \le T}$ and $(S_0, Q_0, X_0, V_0)$. We suppose the agent wishes to maximize the following functional of $\nu$:
$$J(\nu) := \mathbb{E}\Big[ X^\nu_T + Q^\nu_T (S^\nu_T + V^\nu_T) - \alpha (Q^\nu_T)^2 \Big],$$
where the control process $\nu$ must be taken from the admissible set $\mathcal{N}$ which consists of $\mathcal{F}$-predictable processes such that $\mathbb{E}[\int_0^T \nu_t^2\,dt] < \infty$. The first term in the expectation, $X^\nu_T$, is the amount of cash on hand at time $T$. The second term in the expectation, $Q^\nu_T (S^\nu_T + V^\nu_T)$, is the agent’s assessment of the value of his inventory holdings at time $T$.
The third term, $-\alpha (Q^\nu_T)^2$, behaves as a risk control term and is present because the agent acknowledges that his valuation due to the trade signal may not be completely accurate. This term helps to ensure that he does not acquire very large inventory positions due to the risk of being incorrect. Let us define the agent’s value function $H$ as follows:
$$H(t, x, q, S, V) := \sup_{\nu \in \mathcal{N}} \mathbb{E}_{t,x,q,S,V}\Big[ X^\nu_T + Q^\nu_T (S^\nu_T + V^\nu_T) - \alpha (Q^\nu_T)^2 \Big].$$
The Hamilton-Jacobi-Bellman (HJB) partial differential equation (PDE) associated with $H$ is
$$\partial_t H + \sup_{\nu \in \mathbb{R}} (\mathcal{A}^\nu H) = 0, \qquad H(T, x, q, S, V) = x + q(S + V) - \alpha q^2, \tag{2.1}$$
where the operator $\mathcal{A}^\nu$ is given by
$$\mathcal{A}^\nu = -(S + k\nu)\,\nu\,\partial_x + \nu\,\partial_q + (\mu + b\nu)\,\partial_S - (\beta V + \gamma\nu)\,\partial_V + \tfrac{1}{2}\sigma^2 \partial_{SS} + \tfrac{1}{2}\eta^2 \partial_{VV} + \rho\sigma\eta\,\partial_{SV}.$$

In this section, we express the solution to the HJB PDE (2.1) in terms of a system of coupled ODEs.
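As a worked step, the supremum in (2.1) can be evaluated pointwise. The sketch below assumes a value function of the form $H = x + qS + h(t, q, V)$, which is the form stated in the proposition that follows, and uses only the operator $\mathcal{A}^\nu$ defined above; the concavity in $\nu$ relies on $k > 0$.

```latex
\begin{align*}
\mathcal{A}^{\nu} H
  &= -k\nu^{2} + \nu\,\bigl(\partial_q h + b\,q - \gamma\,\partial_V h\bigr)
     + \mu q - \beta V\,\partial_V h + \tfrac12 \eta^{2}\,\partial_{VV} h, \\
\nu^{*}
  &= \frac{\partial_q h + b\,q - \gamma\,\partial_V h}{2k}, \\
0 &= \partial_t h + \mu q - \beta V\,\partial_V h + \tfrac12 \eta^{2}\,\partial_{VV} h
     + \frac{\bigl(\partial_q h + b\,q - \gamma\,\partial_V h\bigr)^{2}}{4k},
  \qquad h(T, q, V) = qV - \alpha q^{2}.
\end{align*}
```

The terms $-S\nu$ and $+S\nu$ cancel, which is why $S$ does not appear in the reduced equation for $h$. A quadratic $h$ in $(q, V)$ then turns the last line into ordinary differential equations for its coefficients.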
Proposition 2.1.
Suppose $c_0, \dots, c_5 : [0, T] \to \mathbb{R}$ satisfy the following system of ODEs with terminal conditions:
$$\begin{aligned}
c_0' + \eta^2 c_4 + \frac{(c_1 - \gamma c_2)^2}{4k} &= 0, & c_0(T) &= 0, \\
c_1' + \mu + \frac{(c_1 - \gamma c_2)(b + 2c_3 - \gamma c_5)}{2k} &= 0, & c_1(T) &= 0, && \text{(2.2)} \\
c_2' - \beta c_2 + \frac{(c_1 - \gamma c_2)(c_5 - 2\gamma c_4)}{2k} &= 0, & c_2(T) &= 0, && \text{(2.3)} \\
c_3' + \frac{(b + 2c_3 - \gamma c_5)^2}{4k} &= 0, & c_3(T) &= -\alpha, \\
c_4' - 2\beta c_4 + \frac{(c_5 - 2\gamma c_4)^2}{4k} &= 0, & c_4(T) &= 0, \\
c_5' - \beta c_5 + \frac{(c_5 - 2\gamma c_4)(b + 2c_3 - \gamma c_5)}{2k} &= 0, & c_5(T) &= 1.
\end{aligned}$$
Then the value function $H$ is given by
$$H(t, x, q, S, V) = x + qS + h(t, q, V),$$
$$h(t, q, V) = c_0(t) + c_1(t)\,q + c_2(t)\,V + c_3(t)\,q^2 + c_4(t)\,V^2 + c_5(t)\,qV,$$
and the optimal trading strategy in feedback form is
$$\nu^*(t, q, V) = \frac{c_1(t) - \gamma c_2(t) + (b + 2c_3(t) - \gamma c_5(t))\,q + (c_5(t) - 2\gamma c_4(t))\,V}{2k}.$$
Proof.
This is shown by direct substitution into (2.1).
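The system in Proposition 2.1 has no closed form in general, but it can be integrated backward from the terminal conditions. The following is a minimal sketch, not the authors' implementation: it assumes a classical fourth-order Runge-Kutta scheme, and the function names `rhs`, `shift`, `solve_backward`, and `loadings` are hypothetical. The default parameter values are those quoted in the caption of Figure 1.

```python
def rhs(c, mu, b, k, beta, gamma, eta):
    """Time derivatives (c0', ..., c5') implied by the ODEs of Proposition 2.1."""
    c0, c1, c2, c3, c4, c5 = c
    a0 = c1 - gamma * c2              # constant part of 2k * nu*
    aq = b + 2.0 * c3 - gamma * c5    # coefficient of q in 2k * nu*
    av = c5 - 2.0 * gamma * c4        # coefficient of V in 2k * nu*
    return (
        -eta**2 * c4 - a0**2 / (4.0 * k),
        -mu - a0 * aq / (2.0 * k),
        beta * c2 - a0 * av / (2.0 * k),
        -aq**2 / (4.0 * k),
        2.0 * beta * c4 - av**2 / (4.0 * k),
        beta * c5 - av * aq / (2.0 * k),
    )


def shift(c, slope, step):
    """Return c + step * slope, componentwise."""
    return tuple(ci + step * si for ci, si in zip(c, slope))


def solve_backward(T=1.0, steps=2000, mu=0.0, b=1e-2, k=5e-3,
                   beta=1.0, gamma=0.1, eta=0.5, alpha=0.1):
    """Integrate from t = T down to t = 0; returns (times, coeffs) in increasing time."""
    h = T / steps
    p = (mu, b, k, beta, gamma, eta)
    c = (0.0, 0.0, 0.0, -alpha, 0.0, 1.0)   # terminal conditions at t = T
    times, coeffs = [T], [c]
    for i in range(steps):
        # One RK4 step with negative step size -h (moving backward in time).
        k1 = rhs(c, *p)
        k2 = rhs(shift(c, k1, -h / 2.0), *p)
        k3 = rhs(shift(c, k2, -h / 2.0), *p)
        k4 = rhs(shift(c, k3, -h), *p)
        c = tuple(ci - h / 6.0 * (s1 + 2.0 * s2 + 2.0 * s3 + s4)
                  for ci, s1, s2, s3, s4 in zip(c, k1, k2, k3, k4))
        times.append(T - (i + 1) * h)
        coeffs.append(c)
    times.reverse()
    coeffs.reverse()
    return times, coeffs


def loadings(c, b, k, gamma):
    """Coefficients of q and V in the feedback strategy nu*(t, q, V)."""
    _, _, _, c3, c4, c5 = c
    return (b + 2.0 * c3 - gamma * c5) / (2.0 * k), (c5 - 2.0 * gamma * c4) / (2.0 * k)


times, coeffs = solve_backward()
nu_q_T, nu_V_T = loadings(coeffs[-1], b=1e-2, k=5e-3, gamma=0.1)
```

With $\mu = 0$ the computed $c_1$ and $c_2$ stay at zero along the whole grid, and setting $b = \gamma = 0$ decouples $c_3$, whose exact solution $c_3(t) = -\alpha k / (k + \alpha (T - t))$ gives a convenient sanity check on the step size.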
Proposition 2.2. If $\mu = 0$ then $c_1 = c_2 \equiv 0$.
Proof. This is immediate from equations (2.2) and (2.3).

Proposition 2.2 is a result of symmetry in the model when the unaffected market value of the asset is a martingale. In this circumstance, the agent’s value function remains unchanged if the underlying state variables are transformed according to $(q, S, V) \mapsto (-q, -S, -V)$. In addition, we also see $\nu^*(t, -q, -V) = -\nu^*(t, q, V)$. This is an expected result because if the dynamics possess enough symmetry, then the agent should place equal value on a long position in the asset as he would on a short position of equal magnitude, as long as his future projection of the total value of his current holdings is the same.

Of particular interest is how the trading strategy depends on the values of $Q_t$ and $V_t$. The effect of these underlying processes on the trading strategy can be directly quantified by the corresponding loadings. To this end, observe that $\nu^*$ can be written as
$$\nu^*(t, q, V) = \frac{c_1(t) - \gamma c_2(t)}{2k} + \nu^*_q(t)\,q + \nu^*_V(t)\,V,$$
where $\nu^*_q$ and $\nu^*_V$ are defined by
$$\nu^*_q(t) = \frac{b + 2c_3(t) - \gamma c_5(t)}{2k}, \qquad \nu^*_V(t) = \frac{c_5(t) - 2\gamma c_4(t)}{2k}.$$
In Figure 1 we plot the loadings on $Q_t$ and $V_t$ for the optimal trading strategy $\nu^*$. The behaviour we see in the left panel is typical of this type of model for optimal execution. In particular, the loading on $Q_t$ is seen to grow to a large negative value as $t \to T$. This is because of the agent’s terminal risk control represented by $\alpha Q_T^2$, and the relevance of this term becomes stronger as time approaches the horizon of the trading period.
Although the loading shown above is typical, we will see below that the contribution of inventory towards trading speed, and indeed the total trading speed, exhibits behaviour which is not typically seen in this style of optimal execution.

There is also an intuitive explanation for why the loading on $V_t$ is most significant close to time $T$: $V_t$ is more likely to experience a change in sign if there is a longer time to the horizon of the trading period. If the agent is overeager in his attempts to extract profits early in the trading period by exploiting the trade signal $V_t$, then he risks this quantity changing sign, in which case his prior trades are in fact working against his goals. If this occurs then the agent would wish to reverse his trades, but the round trip involved in this task accumulates needless costs due to temporary price impact. By waiting until a time closer to $T$, the probability and magnitude of this type of sign change is significantly lowered, thus the agent prefers to wait before extracting profits.

Figure 1: Optimal loadings on $Q_t$ and $V_t$. Parameters used are $\mu = 0$, $\sigma = 1$, $\eta = 0.5$, $\beta = 1$, $\gamma = 0.1$, $\rho = 0.3$, $b = 10^{-2}$, $k = 5 \cdot 10^{-3}$, $\alpha = 0.1$, and $T = 1$.

In Figure 2 we show simulated paths of the optimal trading strategy broken down into the two main contributing components, as well as the total trading speed. The left panel shows the graph of $\nu^*_q(t)\,Q^{\nu^*}_t$, the center panel shows $\nu^*_V(t)\,V^{\nu^*}_t$, and the last shows $\nu^*(t, Q^{\nu^*}_t, V^{\nu^*}_t)$. Here we see what may be considered atypical behaviour in an optimal execution program, namely that all of the individual contributions to the trading speed, as well as the total trading speed, are concentrated towards the end of the trading period. In other optimal execution models, for example Almgren and Chriss (2001) and Cartea et al.
(2018), the loading on $Q_t$ becomes large as $t$ approaches $T$, but the magnitude of the contribution to the trading speed from inventory does not change significantly over the course of the trading period. The main contributing factor to this difference is related to the discussion of the previous paragraph. The agent’s main source of profits is due to the value of the trade signal $V_t$, but this is only taken advantage of through trades that are submitted when the sign of $V_t$ is the same as the sign of $V_T$. This gives incentive for the agent to delay trades and perform most of his action at times close to $T$.

Figure 2: Contribution to trading speed of $Q_t$ and $V_t$, and total optimal trading speed $\nu_t$. Parameters used are $\mu = 0$, $\sigma = 1$, $\eta = 0.5$, $\beta = 1$, $\gamma = 0.1$, $\rho = 0.3$, $b = 10^{-2}$, $k = 5 \cdot 10^{-3}$, $\alpha = 0.1$, $T = 1$, $S_0 = 100$, $V_0 = 0$, and $Q_0 = 0$.

Here we consider a model in which multiple agents trade with interaction. The interaction stems from the fact that price impact will account for the trading of all agents, not just an individual. Agents are indexed by $n \in \{1, \dots, N\}$ and each agent has his own control process $\nu^n = (\nu^n_t)_{0 \le t \le T}$. As before the control process represents the rate of trading for agent $n$, thus the inventory holdings of agent $n$, denoted by $Q^{n,\nu^n} = (Q^{n,\nu^n}_t)_{0 \le t \le T}$, change according to
$$dQ^{n,\nu^n}_t = \nu^n_t\,dt, \qquad Q^{n,\nu^n}_0 = Q^n_0,$$
where we assume each $Q^n_0$ is independent from all other variables with finite expectation and variance.
The market view of the asset value is denoted $S^{\bar\nu} = (S^{\bar\nu}_t)_{0 \le t \le T}$ and changes according to
$$dS^{\bar\nu}_t = (\mu + b\,\bar\nu_t)\,dt + \sigma\,dW_t, \qquad S^{\bar\nu}_0 = S_0, \tag{3.1}$$
where $\bar\nu_t$ is the average trading rate of all agents at time $t$:
$$\bar\nu_t := \frac{1}{N} \sum_{n=1}^N \nu^n_t.$$
Temporary price impact incurred by agent $n$ depends only on his own rate of trading, such that the transaction price for agent $n$ is
$$\hat S^{\nu^n,\bar\nu}_t = S^{\bar\nu}_t + k\,\nu^n_t.$$
Thus, the cash process of agent $n$, denoted $X^{n,\nu^n,\bar\nu} = (X^{n,\nu^n,\bar\nu}_t)_{0 \le t \le T}$, changes according to
$$dX^{n,\nu^n,\bar\nu}_t = -\hat S^{\nu^n,\bar\nu}_t\,\nu^n_t\,dt, \qquad X^{n,\nu^n,\bar\nu}_0 = X^n_0.$$
Lastly, it will be useful to define the average inventory holdings of all agents $\bar Q^{\bar\nu} = (\bar Q^{\bar\nu}_t)_{0 \le t \le T}$, which is given by
$$\bar Q^{\bar\nu}_t := \frac{1}{N} \sum_{n=1}^N Q^{n,\nu^n}_t. \tag{3.2}$$
It appears from the definition that $\bar Q^{\bar\nu}$ depends on each individual $\nu^n$, but given $\bar Q_0$ we have
$$d\bar Q^{\bar\nu}_t = \frac{1}{N} \sum_{n=1}^N dQ^{n,\nu^n}_t = \frac{1}{N} \sum_{n=1}^N \nu^n_t\,dt = \bar\nu_t\,dt,$$
which justifies the dependence on only $\bar\nu$. We remark here that based on the definition of $\bar Q^{\bar\nu}_t$ in (3.2), if we assume in addition that all $Q^n_0$ are independent and identically distributed, then when we directly handle the limiting case $N \to \infty$ we have $\bar Q^{\bar\nu}_0 = \mathbb{E}[Q^{n,\nu^n}_0]$. From now on we will make this assumption on the collection $(Q^n_0)_{n \in \mathbb{N}}$.

First we consider a model in which each agent receives the same trade signal. The common trade signal is denoted by $\bar V^{\bar\nu} = (\bar V^{\bar\nu}_t)_{0 \le t \le T}$ and changes according to
$$d\bar V^{\bar\nu}_t = -(\beta \bar V^{\bar\nu}_t + \bar\gamma\,\bar\nu_t)\,dt + \eta\,dZ_t, \qquad \bar V^{\bar\nu}_0 = \bar V_0,$$
where $Z = (Z_t)_{0 \le t \le T}$ is a Brownian motion correlated with $W$ with constant correlation parameter $\rho$. The $N$ agents under consideration do not represent all participants in the market, only the ones which are acting based on the trade signal $\bar V^{\bar\nu}$. This is why there is a prevailing market price $S^{\bar\nu}$ which is different from the valuation due to the trade signal of $S^{\bar\nu} + \bar V^{\bar\nu}$.
Each agent attempts to maximize his own expected future wealth given that the trading strategies of all other agents are fixed. We let $\nu^{-n}$ denote the collection of trading strategies for all agents except agent $n$. Then for a fixed $\nu^{-n}$, agent $n$ wishes to maximize the functional
$$J(\nu^n; \nu^{-n}) = \mathbb{E}\Big[ X^{n,\nu^n,\bar\nu}_T + Q^{n,\nu^n}_T (S^{\bar\nu}_T + \bar V^{\bar\nu}_T) - \alpha (Q^{n,\nu^n}_T)^2 \Big].$$
Throughout Section 3.2 we work with a complete and filtered probability space $(\Omega, (\mathcal{F}_t)_{0 \le t \le T}, \mathbb{P})$ where $(\mathcal{F}_t)_{0 \le t \le T}$ is the standard augmentation of the natural filtration generated by $(W_t, Z_t)_{0 \le t \le T}$ and the initial state $(S_0, (Q^n_0)_{n \in \mathbb{N}}, (X^n_0)_{n \in \mathbb{N}}, \bar V_0)$.

We do not attempt to solve the finite player game; rather, we consider the limiting case $N \to \infty$ directly. Under this condition the average trading speed $\bar\nu$ is not affected by any one individual control $\nu^n$. Thus, fixing $\nu^{-n}$ is equivalent to fixing $\bar\nu$. In addition, we assume $\bar\nu_t = \bar\nu(t, \bar Q^{\bar\nu}_t, \bar V^{\bar\nu}_t)$ so that we remain within a Markovian framework. With a fixed function $\bar\nu$ we may define the value function for agent $n$ as
$$H^n(t, x, q, \bar q, S, \bar V; \bar\nu) := \sup_{\nu^n \in \mathcal{N}} \mathbb{E}_{t,x,q,\bar q,S,\bar V}\Big[ X^{n,\nu^n,\bar\nu}_T + Q^{n,\nu^n}_T (S^{\bar\nu}_T + \bar V^{\bar\nu}_T) - \alpha (Q^{n,\nu^n}_T)^2 \Big], \tag{3.3}$$
where the collection of admissible strategies $\mathcal{N}$ consists of $\mathcal{F}$-predictable processes such that $\mathbb{E}[\int_0^T (\nu^n_t)^2\,dt] < \infty$. The value function in (3.3) has an associated HJB equation of the form
$$\partial_t H^n + \sup_{\nu^n \in \mathbb{R}} (\mathcal{A}^{\nu^n,\bar\nu} H^n) = 0, \qquad H^n(T, x, q, \bar q, S, \bar V; \bar\nu) = x + q(S + \bar V) - \alpha q^2, \tag{3.4}$$
where the operator $\mathcal{A}^{\nu^n,\bar\nu}$ is given by
$$\mathcal{A}^{\nu^n,\bar\nu} = -(S + k\nu^n)\,\nu^n\,\partial_x + \nu^n\,\partial_q + \bar\nu\,\partial_{\bar q} + (\mu + b\bar\nu)\,\partial_S - (\beta \bar V + \bar\gamma\,\bar\nu)\,\partial_{\bar V} + \tfrac{1}{2}\sigma^2 \partial_{SS} + \tfrac{1}{2}\eta^2 \partial_{\bar V \bar V} + \rho\sigma\eta\,\partial_{S \bar V}.$$
Based on the form of the feedback control in the previous section, we make the further assumption
$$\bar\nu(t, \bar q, \bar V) = f_0(t) + f_1(t)\,\bar q + f_2(t)\,\bar V. \tag{3.5}$$
With this assumption the solution to the HJB equation (3.4), along with the optimal control in feedback form, can be characterized by a solution to a system of ODEs.
Proposition 3.1.
Given $\bar\nu$ in (3.5), suppose $c_0, \dots, c_9 : [0, T] \to \mathbb{R}$ satisfy the following system of ODEs with terminal conditions:
$$\begin{aligned}
c_0' + f_0 (c_2 - \bar\gamma c_3) + \eta^2 c_6 + \frac{c_1^2}{4k} &= 0, & c_0(T) &= 0, && \text{(3.6)} \\
c_1' + \mu + f_0 (b + c_7 - \bar\gamma c_8) + \frac{c_1 c_4}{k} &= 0, & c_1(T) &= 0, \\
c_2' + f_0 (2c_5 - \bar\gamma c_9) + f_1 (c_2 - \bar\gamma c_3) + \frac{c_1 c_7}{2k} &= 0, & c_2(T) &= 0, \\
c_3' + f_0 (c_9 - 2\bar\gamma c_6) + f_2 (c_2 - \bar\gamma c_3) - \beta c_3 + \frac{c_1 c_8}{2k} &= 0, & c_3(T) &= 0, \\
c_4' + \frac{c_4^2}{k} &= 0, & c_4(T) &= -\alpha, \\
c_5' + f_1 (2c_5 - \bar\gamma c_9) + \frac{c_7^2}{4k} &= 0, & c_5(T) &= 0, \\
c_6' + f_2 (c_9 - 2\bar\gamma c_6) - 2\beta c_6 + \frac{c_8^2}{4k} &= 0, & c_6(T) &= 0, \\
c_7' + f_1 (b + c_7 - \bar\gamma c_8) + \frac{c_4 c_7}{k} &= 0, & c_7(T) &= 0, \\
c_8' + f_2 (b + c_7 - \bar\gamma c_8) - \beta c_8 + \frac{c_4 c_8}{k} &= 0, & c_8(T) &= 1, \\
c_9' + f_1 (c_9 - 2\bar\gamma c_6) + f_2 (2c_5 - \bar\gamma c_9) - \beta c_9 + \frac{c_7 c_8}{2k} &= 0, & c_9(T) &= 0. && \text{(3.7)}
\end{aligned}$$
Then the value function $H^n$ is given by
$$H^n(t, x, q, \bar q, S, \bar V) = x + qS + h^n(t, q, \bar q, \bar V),$$
$$h^n(t, q, \bar q, \bar V) = c_0(t) + c_1(t)\,q + c_2(t)\,\bar q + c_3(t)\,\bar V + c_4(t)\,q^2 + c_5(t)\,\bar q^2 + c_6(t)\,\bar V^2 + c_7(t)\,q\bar q + c_8(t)\,q\bar V + c_9(t)\,\bar q \bar V,$$
and the optimal trading strategy in feedback form is
$$\nu^{n*}(t, q, \bar q, \bar V) = \frac{c_1(t)}{2k} + \frac{c_4(t)}{k}\,q + \frac{c_7(t)}{2k}\,\bar q + \frac{c_8(t)}{2k}\,\bar V. \tag{3.8}$$
Proof.
This is shown by direct substitution into (3.4).

In order for the trading strategy in (3.8) to yield a mean-field Nash equilibrium it is necessary that a consistency condition is satisfied. Because (3.8) is based on the assumption that the average trading speed is given by (3.5), we must impose that when each agent uses the strategy (3.8) the resulting average trading speed is (3.5). Thus, we require
$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^N \nu^{n*}(t, q^n, \bar q, \bar V) = \bar\nu(t, \bar q, \bar V).$$
Substituting (3.5) and (3.8) into this equation yields
$$f_0 = \frac{c_1}{2k}, \qquad f_1 = \frac{2c_4 + c_7}{2k}, \qquad f_2 = \frac{c_8}{2k}. \tag{3.9}$$
When solving equations (3.6) to (3.7) we shall always substitute (3.9) first to guarantee that the optimal strategy in (3.8) represents an equilibrium. In a mean-field Nash equilibrium, the optimal strategy of agent $n$ can be written in a particular form demonstrated in the next proposition.

Proposition 3.2.
In equilibrium, the trading strategy of agent $n$ and the average trading rate of all agents are related by
$$\nu^{n*}(t, q^n, \bar q, \bar V) = \frac{c_4(t)}{k}\,(q^n - \bar q) + \bar\nu(t, \bar q, \bar V).$$
Proof.
This is a consequence of combining equations (3.5), (3.8), and (3.9).
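The consistency condition and Proposition 3.2 are linear identities, so they can be checked numerically at a single point in time. The snippet below is an illustrative sketch only: the coefficient values `c1`, `c4`, `c7`, `c8` are arbitrary placeholders rather than solutions of (3.6) to (3.7), and the helper names `individual_rate` and `mean_field_rate` are hypothetical.

```python
import random


def individual_rate(q, q_bar, v_bar, c1, c4, c7, c8, k):
    """Feedback strategy (3.8) of one agent holding inventory q."""
    return c1 / (2 * k) + (c4 / k) * q + c7 / (2 * k) * q_bar + c8 / (2 * k) * v_bar


def mean_field_rate(q_bar, v_bar, c1, c4, c7, c8, k):
    """Average rate (3.5) with f0, f1, f2 taken from the consistency condition (3.9)."""
    f0 = c1 / (2 * k)
    f1 = (2 * c4 + c7) / (2 * k)
    f2 = c8 / (2 * k)
    return f0 + f1 * q_bar + f2 * v_bar


rng = random.Random(0)
k = 5e-3
c1, c4, c7, c8 = 0.02, -0.08, 0.01, 0.9    # placeholder values at one fixed time t
inventories = [rng.gauss(0.0, 0.5) for _ in range(50)]
q_bar = sum(inventories) / len(inventories)
v_bar = 0.3

rates = [individual_rate(q, q_bar, v_bar, c1, c4, c7, c8, k) for q in inventories]
avg_rate = sum(rates) / len(rates)
nu_bar = mean_field_rate(q_bar, v_bar, c1, c4, c7, c8, k)

# Proposition 3.2: each rate equals (c4 / k) * (q - q_bar) + nu_bar.
max_gap = max(abs(r - ((c4 / k) * (q - q_bar) + nu_bar))
              for r, q in zip(rates, inventories))
```

Because the strategy is affine in $q$, the average of the individual rates matches $\bar\nu$ exactly for any finite sample when $\bar q$ is the sample mean, so `avg_rate - nu_bar` and `max_gap` differ from zero only by floating-point rounding.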
Proposition 3.3. If $\alpha = 0$ then $c_2 = c_4 = c_5 = c_7 = c_9 \equiv 0$ in equilibrium. If $\mu = 0$ then $c_1 = c_2 = c_3 \equiv 0$ in equilibrium. If $\alpha = \mu = 0$ then the non-zero $c_i$ in equilibrium are given by
$$c_0(t) = \eta^2 \int_t^T \frac{c_8(s)^2}{4b} \big(1 - e^{-\frac{b}{k}(T - s)}\big)\,ds, \qquad c_6(t) = \frac{c_8(t)^2}{4b} \big(1 - e^{-\frac{b}{k}(T - t)}\big),$$
$$c_8(t) = \frac{2k\beta - b}{(\bar\gamma + 2k\beta - b)\,e^{\frac{2k\beta - b}{2k}(T - t)} - \bar\gamma},$$
and the optimal strategy in feedback form is
$$\nu^{n*}(t, q^n, \bar q, \bar V) = \frac{c_8(t)}{2k}\,\bar V = \frac{1}{2k} \left( \frac{2k\beta - b}{(\bar\gamma + 2k\beta - b)\,e^{\frac{2k\beta - b}{2k}(T - t)} - \bar\gamma} \right) \bar V.$$
Proof.
In equations (3.6) to (3.7) we substitute (3.9). The result can then be seen by direct substitution.
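The closed forms for $c_8$ and $c_6$ can be verified numerically against the equilibrium ODEs. The sketch below is an assumption-laden check rather than part of the proof: it sets $\alpha = \mu = 0$ (so that $c_4 = c_7 = c_9 \equiv 0$ and $f_2 = c_8 / 2k$), uses central finite differences, and picks $\beta = 2$ because the closed form for $c_8$ is a $0/0$ expression when $2k\beta = b$, which happens for $\beta = 1$, $k = 5 \cdot 10^{-3}$, $b = 10^{-2}$. The function names are hypothetical.

```python
import math

T, b, k, beta, gamma_bar = 1.0, 1e-2, 5e-3, 2.0, 0.1   # beta chosen so 2*k*beta != b


def c8_closed(t):
    """Closed form for c8 when alpha = mu = 0."""
    a = 2.0 * k * beta - b
    return a / ((gamma_bar + a) * math.exp(a * (T - t) / (2.0 * k)) - gamma_bar)


def c6_closed(t):
    """Closed form for c6 when alpha = mu = 0."""
    return c8_closed(t) ** 2 * (1.0 - math.exp(-b * (T - t) / k)) / (4.0 * b)


def residuals(t, h=1e-5):
    """Left-hand sides of the c8 and c6 equations, using central differences."""
    c8, c6 = c8_closed(t), c6_closed(t)
    dc8 = (c8_closed(t + h) - c8_closed(t - h)) / (2.0 * h)
    dc6 = (c6_closed(t + h) - c6_closed(t - h)) / (2.0 * h)
    f2 = c8 / (2.0 * k)                                   # consistency condition (3.9)
    r8 = dc8 + f2 * (b - gamma_bar * c8) - beta * c8
    r6 = dc6 - 2.0 * gamma_bar * f2 * c6 - 2.0 * beta * c6 + c8 ** 2 / (4.0 * k)
    return r8, r6


checks = [residuals(t) for t in (0.25, 0.5, 0.75)]
```

Both residuals should be close to zero at every interior time, up to the finite-difference error, and the terminal values $c_8(T) = 1$ and $c_6(T) = 0$ follow directly from the formulas.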
In a similar fashion to Section 2.4 we are interested in the loadings of the optimal strategy on the underlying processes and the resulting pathwise behaviour. By substituting equations (3.9) into equations (3.6) to (3.7) we arrive at a system of ODEs which define a mean-field Nash equilibrium, and this system easily lends itself to numerical methods. From (3.8) we have
$$\nu^{n*}(t, q, \bar q, \bar V) = \frac{c_1(t)}{2k} + \nu^*_q(t)\,q + \nu^*_{\bar q}(t)\,\bar q + \nu^*_{\bar V}(t)\,\bar V,$$
where we have defined
$$\nu^*_q(t) = \frac{c_4(t)}{k}, \qquad \nu^*_{\bar q}(t) = \frac{c_7(t)}{2k}, \qquad \nu^*_{\bar V}(t) = \frac{c_8(t)}{2k}.$$
We plot the functions $\nu^*_q$, $\nu^*_{\bar q}$, and $\nu^*_{\bar V}$ in Figure 3. We see that $\nu^*_q$ and $\nu^*_{\bar V}$ have qualitatively similar behaviour in the multi-agent setting as when there is an individual agent, and the reasoning is the same.

Figure 3: Optimal loadings on $Q_t$, $\bar Q_t$, and $\bar V_t$. Parameters used are $\mu = 0$, $\sigma = 1$, $\eta = 0.5$, $\beta = 1$, $\bar\gamma = 0.1$, $\rho = 0.3$, $b = 10^{-2}$, $k = 5 \cdot 10^{-3}$, $\alpha = 0.1$, and $T = 1$.

The additional loading $\nu^*_{\bar q}$ is seen to change sign about halfway through the trading period, and shortly after reaching its maximum positive value it quickly drops to zero. The behaviour of this loading can be explained by considering the actions of the entire population of agents combined with the resulting dynamics of the midprice and trade signal, and the results are more easily understood by seeing the effect that the price impact and trade signal impact parameters, $b$ and $\bar\gamma$, have on this loading.

In Figure 4 we show the loading $\nu^*_{\bar q}$ as a function of time for several values of the parameters $b$ and $\bar\gamma$. In the left panel, the parameter $\bar\gamma$ is fixed and each curve represents a different value of $b$. As $b$ is increased, the loading decreases, and the sensitivity is greatest at earlier times. The direction and magnitude of these changes, as well as the sign of the loading, are understood by considering what the population will do on average based on the average inventory of the population and on the time remaining in the trading period. If the average inventory is high, then the agent expects the order flow to be negative. This will have two effects on quantities relevant to the agent. First, it will decrease the price of the asset in the future, and second, it will increase the value of the trade signal to the agent. When the remaining time is short, this increase in the value of the trade signal gives incentive to buy shares of the asset to benefit from this perceived increase in subjective value. However, the impact on the trade signal is short lived due to the mean-reverting dynamics, so when the remaining time is long this incentive to buy shares does not arise because it will have disappeared before the advantage can be gained. For longer remaining times the effect of permanent price impact dominates, and the price decrease caused by negative order flow incentivizes the agent to sell shares. A larger value of price impact $b$ makes the decision to trade based on price impact dominate the decision to trade on a change in the trade signal, hence the loading $\nu^*_{\bar q}(t)$ decreases with $b$. The fact that the sensitivity to $b$ is greatest at time $t = 0$ is also explained by the permanent nature of the effect of order flow on price along with the transient nature of the effect on the trade signal.

In the right panel, the parameter $b$ is fixed and each curve represents a different value of $\bar\gamma$. As $\bar\gamma$ is increased, the loading increases, and this effect is most pronounced close to the end of the trading period. The reasoning for the general shape of each curve is the same as in the discussion of the left panel in the previous paragraph. Also based on the same discussion is the reason that the sensitivity towards $\bar\gamma$ is greatest close to the end of the trading period.
This comes from the transient nature of the impact effects on the trade signal, so when the remaining time is large the agent knows these effects will gradually disappear before they can offer their advantage.

Figure 4: Optimal loading on $\bar Q_t$. Left panel: $\bar\gamma = 0.1$ and $b$ ranging from $0$ (blue curve) to $5 \cdot 10^{-2}$ (red curve). Right panel: $b = 10^{-2}$ and $\bar\gamma$ ranging from $0$ (blue curve) to $0.5$ (red curve). Other parameters are $\mu = 0$, $\sigma = 1$, $\eta = 0.5$, $\beta = 1$, $\rho = 0.3$, $k = 5 \cdot 10^{-3}$, $\alpha = 0.1$, and $T = 1$.

In Figure 5 we show the result of a simulation when each agent acts according to the mean-field optimal strategy described by the loadings plotted in Figure 3. The most striking feature of this simulation is that all agents appear to approach very similar terminal inventory holdings even though the initial positions are widely spread. This stems from the fact that they each share the same view of the asset’s value. At time $T$, there will be a tradeoff to holding non-zero inventory between the terminal liquidation penalty and the additional value imparted by the signal $\bar V_T$. Since each agent assigns the same value to this signal, they are willing to accept the same magnitude of terminal penalty in order to benefit from the signal.

Indeed, if we consider the optimal trading strategy for an arbitrary agent, from (3.8) it satisfies
$$\nu^{n*}_t = \frac{c_1(t) + 2c_4(t)\,Q^{n,\nu^{n*}}_t + c_7(t)\,\bar Q^{\bar\nu}_t + c_8(t)\,\bar V^{\bar\nu}_t}{2k}.$$
This can be rewritten, and recalling that this represents the optimal rate of inventory change we write
$$dQ^{n,\nu^{n*}}_t = \frac{c_1(t)}{2k}\,dt + \frac{c_7(t)\,\bar Q^{\bar\nu}_t}{2k}\,dt - \frac{c_4(t)}{k} \left( \frac{c_8(t)}{-2c_4(t)}\,\bar V^{\bar\nu}_t - Q^{n,\nu^{n*}}_t \right) dt.$$
As $t \to T$, the first two terms above approach zero. We also have $c_4(T) = -\alpha$ and $c_8(T) = 1$. Thus, towards the end of the trading interval we expect $Q^{n,\nu^{n*}}$ to drift towards $\bar V^{\bar\nu}_t / (2\alpha)$.
For the simulated path corresponding to Figure 5, this value is equal to $-0.58$ and the average of the terminal inventories is $-0.53$.

Figure 5: The top row shows the contributions to trading speed from $Q_t$, $\bar Q_t$, and $\bar V_t$. The left panel of the second row shows each agent’s inventory path $Q^{n,\nu^n}_t$ (blue curves) as well as the average inventory of all agents $\bar Q^{\bar\nu}_t$ (red dotted curve). The right panel of the second row shows the optimal trading speed $\nu^n_t$ (blue curves) and the average trading speed $\bar\nu_t$ (red dotted curve). Parameters used are $\mu = 0$, $\sigma = 1$, $\eta = 0.5$, $\beta = 1$, $\bar\gamma = 0.1$, $\rho = 0.3$, $b = 10^{-2}$, $k = 5 \cdot 10^{-3}$, $\alpha = 0.1$, $T = 1$, $S_0 = 100$, $\bar V_0 = 0$, $Q^n_0 \sim \mathcal{N}(0, 0.5^2)$, and $N = 50$.

3.3 Separate Subjective View of Asset Value

Here we consider a model in which each agent has his own individual trading signal, each of which changes according to a different stochastic process. For agent $n$ we denote his trading signal by $V^{n,\nu^n,\bar\nu} = (V^{n,\nu^n,\bar\nu}_t)_{0 \le t \le T}$, which changes according to
$$dV^{n,\nu^n,\bar\nu}_t = -(\beta V^{n,\nu^n,\bar\nu}_t + \gamma\,\nu^n_t + \bar\gamma\,\bar\nu_t)\,dt + \eta\,dZ^n_t, \qquad V^{n,\nu^n,\bar\nu}_0 = V^n_0,$$
$$Z^n_t = \rho\,W_t + \sqrt{1 - \rho^2}\;W^{n,\perp}_t,$$
where each $W^{n,\perp} = (W^{n,\perp}_t)_{0 \le t \le T}$ is a Brownian motion, independent of one another for different $n$, and independent of $W$. In addition we assume that all $V^n_0$ are i.i.d. with finite expectation and variance, and independent from all other variables.
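The construction $Z^n = \rho W + \sqrt{1 - \rho^2}\,W^{n,\perp}$ fixes the correlation between each signal's noise and the price noise at $\rho$, and it also implies a correlation of $\rho^2$ between the noises of any two distinct agents. The following is a small Monte Carlo sketch of that fact, assuming Gaussian increments over a step `dt`; the function names, seed, and sample size are illustrative choices and not taken from the paper.

```python
import math
import random


def correlated_increments(rho, dt, n_agents, rng):
    """One time step: the common increment dW and the agents' increments dZ^n."""
    scale = math.sqrt(dt)
    dw = rng.gauss(0.0, scale)
    dz = [rho * dw + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, scale)
          for _ in range(n_agents)]
    return dw, dz


def sample_correlation(xs, ys):
    """Plain sample correlation coefficient of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(1)
rho, dt = 0.3, 1e-3
draws = [correlated_increments(rho, dt, 2, rng) for _ in range(40000)]
dw = [d[0] for d in draws]
dz1 = [d[1][0] for d in draws]
dz2 = [d[1][1] for d in draws]

corr_signal_price = sample_correlation(dz1, dw)    # estimate of rho
corr_two_signals = sample_correlation(dz1, dz2)    # estimate of rho ** 2
var_ratio = sum(z * z for z in dz1) / (len(dz1) * dt)   # estimate of Var(dZ) / dt
```

The estimates fluctuate with the seed, but with this many draws they should sit within a few hundredths of $\rho$, $\rho^2$, and $1$ respectively; the value $\rho^2$ is the same factor that multiplies the $\partial_{V\bar V}$ term when the noise of an individual signal is paired with the common component.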
Inventory and price dynamics are equivalent to those of Section 3.1. In this section it will be useful to consider the average value of the trading signals over all agents, which will be denoted $\bar V^{\bar\nu} = (\bar V^{\bar\nu}_t)_{0 \le t \le T}$ and is defined by
$$\bar V^{\bar\nu}_t := \frac{1}{N} \sum_{n=1}^N V^{n,\nu^n,\bar\nu}_t.$$
Based on this definition we may also compute the dynamics to be
$$\begin{aligned}
d\bar V^{\bar\nu}_t &= \frac{1}{N} \sum_{n=1}^N dV^{n,\nu^n,\bar\nu}_t = -(\beta \bar V^{\bar\nu}_t + (\gamma + \bar\gamma)\,\bar\nu_t)\,dt + \frac{\eta}{N} \sum_{n=1}^N dZ^n_t \\
&= -(\beta \bar V^{\bar\nu}_t + (\gamma + \bar\gamma)\,\bar\nu_t)\,dt + \eta\rho\,dW_t + \frac{\eta \sqrt{1 - \rho^2}}{N} \sum_{n=1}^N dW^{n,\perp}_t.
\end{aligned}$$
Due to the independence of each $W^{n,\perp}$, when we consider the limit $N \to \infty$ the last term above becomes zero due to the law of large numbers. It is worth making the brief remark that this model of separate trade signals can be reduced to the shared signal of Section 3.2 by choosing some parameter values in a particular way. Specifically, if each $V^n_0$ in Section 3.3 and $\bar V_0$ of Section 3.2 are equal to a constant (the same constant for each $n$), and if $\gamma = 0$ and $\rho = \pm 1$, then $V^{n,\nu^n,\bar\nu}_t = \bar V^{\bar\nu}_t$ and every agent observes the same trade signal, which is the setting considered in Section 3.2.

Each agent attempts to maximize his own expected future wealth given that the trading strategies of all other agents are fixed. That is, if $\nu^{-n}$ is fixed, agent $n$ wishes to maximize the functional
$$J(\nu^n; \nu^{-n}) = \mathbb{E}\Big[ X^{n,\nu^n,\bar\nu}_T + Q^{n,\nu^n}_T (S^{\bar\nu}_T + V^{n,\nu^n,\bar\nu}_T) - \alpha (Q^{n,\nu^n}_T)^2 \Big].$$
For the remainder of Section 3.3 we work with a complete and filtered probability space $(\Omega, (\mathcal{F}_t)_{0 \le t \le T}, \mathbb{P})$ where $(\mathcal{F}_t)_{0 \le t \le T}$ is the standard augmentation of the natural filtration generated by $(W_t, Z^n_t)_{0 \le t \le T,\, n \in \mathbb{N}}$ and the initial state $(S_0, (Q^n_0)_{n \in \mathbb{N}}, (X^n_0)_{n \in \mathbb{N}}, (V^n_0)_{n \in \mathbb{N}})$.

3.3.1 HJB Equation and Consistency Condition with Separate Subjective Views

With a similar approach to Section 3.2.1 we consider a solution in the limiting case $N \to \infty$.
We assume that the average trading speed is of the form $\bar\nu_t = \bar\nu(t, \bar Q^{\bar\nu}_t, \bar V^{\bar\nu}_t)$ to remain within a Markovian framework. This is similar to the assumption that was made in Section 3.2.1, except that we must use $\bar V^{\bar\nu}_t$ because all agents have a different subjective view. With this function fixed we define the value function for agent $n$ as
\[ H^n(t,x,q,\bar q,S,V,\bar V;\bar\nu) := \sup_{\nu^n\in\mathcal{N}} \mathbb{E}_{t,x,q,\bar q,S,V,\bar V}\Big[ X^{n,\nu^n,\bar\nu}_T + Q^{n,\nu^n}_T\big(S^{\bar\nu}_T + V^{n,\nu^n,\bar\nu}_T\big) - \alpha\big(Q^{n,\nu^n}_T\big)^2 \Big], \tag{3.10} \]
where the set of admissible strategies $\mathcal{N}$ consists of all $\mathcal{F}$-predictable processes such that $\mathbb{E}\big[\int_0^T (\nu^n_t)^2\,dt\big] < \infty$. The value function in (3.10) has an associated HJB equation of the form
\[ \partial_t H^n + \sup_{\nu^n\in\mathbb{R}} \big(\mathcal{A}^{\nu^n,\bar\nu} H^n\big) = 0, \qquad H^n(T,x,q,\bar q,S,V,\bar V;\bar\nu) = x + q(S+V) - \alpha q^2, \tag{3.11} \]
where the operator $\mathcal{A}^{\nu^n,\bar\nu}$ is given by
\begin{align*}
\mathcal{A}^{\nu^n,\bar\nu} ={}& -(S + k\nu^n)\nu^n\,\partial_x + \nu^n\,\partial_q + \bar\nu\,\partial_{\bar q} + (\mu + b\bar\nu)\,\partial_S - (\beta V + \gamma\nu^n + \bar\gamma\bar\nu)\,\partial_V - \big(\beta\bar V + (\gamma+\bar\gamma)\bar\nu\big)\,\partial_{\bar V} \\
& + \tfrac12\sigma^2\,\partial_{SS} + \tfrac12\eta^2\,\partial_{VV} + \tfrac12\rho^2\eta^2\,\partial_{\bar V\bar V} + \rho\sigma\eta\,\partial_{SV} + \rho\sigma\eta\,\partial_{S\bar V} + \rho^2\eta^2\,\partial_{V\bar V}.
\end{align*}
Based on the form of the feedback control in the previous sections, we make the further assumption
\[ \bar\nu(t,\bar q,\bar V) = f_0(t) + f_1(t)\,\bar q + f_2(t)\,\bar V. \tag{3.12} \]
With this assumption the solution to the HJB equation (3.11), along with the optimal control in feedback form, can be characterized by a solution to a system of ODEs.
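Proposition 3.4. Given $\bar\nu$ in (3.12), suppose $c_0,\dots,c_{14} : [0,T]\to\mathbb{R}$ satisfy the following system of ODEs with terminal conditions:
\begin{align*}
& c_0' + f_0\big(c_2 - \bar\gamma c_3 - (\gamma+\bar\gamma)c_4\big) + \eta^2 c_7 + \rho^2\eta^2(c_8 + c_{14}) + \frac{(c_1-\gamma c_3)^2}{4k} = 0, && c_0(T) = 0, \tag{3.13}\\
& c_1' + \mu + f_0\big(b + c_9 - \bar\gamma c_{10} - (\gamma+\bar\gamma)c_{11}\big) + \frac{(2c_5-\gamma c_{10})(c_1-\gamma c_3)}{2k} = 0, && c_1(T) = 0,\\
& c_2' + f_0\big(2c_6 - \bar\gamma c_{12} - (\gamma+\bar\gamma)c_{13}\big) + f_1\big(c_2 - \bar\gamma c_3 - (\gamma+\bar\gamma)c_4\big) + \frac{(c_1-\gamma c_3)(c_9-\gamma c_{12})}{2k} = 0, && c_2(T) = 0,\\
& c_3' - \beta c_3 + f_0\big(c_{12} - 2\bar\gamma c_7 - (\gamma+\bar\gamma)c_{14}\big) + \frac{(c_1-\gamma c_3)(c_{10}-2\gamma c_7)}{2k} = 0, && c_3(T) = 0,\\
& c_4' - \beta c_4 + f_0\big(c_{13} - \bar\gamma c_{14} - 2(\gamma+\bar\gamma)c_8\big) + f_2\big(c_2 - \bar\gamma c_3 - (\gamma+\bar\gamma)c_4\big) + \frac{(c_1-\gamma c_3)(c_{11}-\gamma c_{14})}{2k} = 0, && c_4(T) = 0,\\
& c_5' + \frac{(2c_5-\gamma c_{10})^2}{4k} = 0, && c_5(T) = -\alpha,\\
& c_6' + f_1\big(2c_6 - \bar\gamma c_{12} - (\gamma+\bar\gamma)c_{13}\big) + \frac{(c_9-\gamma c_{12})^2}{4k} = 0, && c_6(T) = 0,\\
& c_7' - 2\beta c_7 + \frac{(c_{10}-2\gamma c_7)^2}{4k} = 0, && c_7(T) = 0,\\
& c_8' - 2\beta c_8 + f_2\big(c_{13} - \bar\gamma c_{14} - 2(\gamma+\bar\gamma)c_8\big) + \frac{(c_{11}-\gamma c_{14})^2}{4k} = 0, && c_8(T) = 0,\\
& c_9' + f_1\big(b + c_9 - \bar\gamma c_{10} - (\gamma+\bar\gamma)c_{11}\big) + \frac{(2c_5-\gamma c_{10})(c_9-\gamma c_{12})}{2k} = 0, && c_9(T) = 0,\\
& c_{10}' - \beta c_{10} + \frac{(2c_5-\gamma c_{10})(c_{10}-2\gamma c_7)}{2k} = 0, && c_{10}(T) = 1,\\
& c_{11}' - \beta c_{11} + f_2\big(b + c_9 - \bar\gamma c_{10} - (\gamma+\bar\gamma)c_{11}\big) + \frac{(2c_5-\gamma c_{10})(c_{11}-\gamma c_{14})}{2k} = 0, && c_{11}(T) = 0,\\
& c_{12}' - \beta c_{12} + f_1\big(c_{12} - 2\bar\gamma c_7 - (\gamma+\bar\gamma)c_{14}\big) + \frac{(c_{10}-2\gamma c_7)(c_9-\gamma c_{12})}{2k} = 0, && c_{12}(T) = 0,\\
& c_{13}' - \beta c_{13} + f_1\big(c_{13} - \bar\gamma c_{14} - 2(\gamma+\bar\gamma)c_8\big) + f_2\big(2c_6 - \bar\gamma c_{12} - (\gamma+\bar\gamma)c_{13}\big) + \frac{(c_9-\gamma c_{12})(c_{11}-\gamma c_{14})}{2k} = 0, && c_{13}(T) = 0,\\
& c_{14}' - 2\beta c_{14} + f_2\big(c_{12} - 2\bar\gamma c_7 - (\gamma+\bar\gamma)c_{14}\big) + \frac{(c_{10}-2\gamma c_7)(c_{11}-\gamma c_{14})}{2k} = 0, && c_{14}(T) = 0. \tag{3.14}
\end{align*}
Then the value function $H^n$ is given by
\begin{align*}
H^n(t,x,q,\bar q,S,V,\bar V) ={}& x + qS + h^n(t,q,\bar q,V,\bar V),\\
h^n(t,q,\bar q,V,\bar V) ={}& c_0(t) + c_1(t)\,q + c_2(t)\,\bar q + c_3(t)\,V + c_4(t)\,\bar V + c_5(t)\,q^2 + c_6(t)\,\bar q^2 + c_7(t)\,V^2 + c_8(t)\,\bar V^2 \\
& + c_9(t)\,q\bar q + c_{10}(t)\,qV + c_{11}(t)\,q\bar V + c_{12}(t)\,\bar q V + c_{13}(t)\,\bar q\bar V + c_{14}(t)\,V\bar V,
\end{align*}
and the optimal trading strategy in feedback form is
\[ \nu^{n*}(t,q,\bar q,V,\bar V) = \frac{c_1(t)-\gamma c_3(t)}{2k} + \frac{2c_5(t)-\gamma c_{10}(t)}{2k}\,q + \frac{c_9(t)-\gamma c_{12}(t)}{2k}\,\bar q + \frac{c_{10}(t)-2\gamma c_7(t)}{2k}\,V + \frac{c_{11}(t)-\gamma c_{14}(t)}{2k}\,\bar V. \tag{3.15} \]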
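A sketch of how the feedback form (3.15) arises may help. It assumes the ansatz $H^n = x + qS + h^n$ from the proposition, with the coefficients $c_i$ indexed in the order in which they multiply the monomials of $h^n$; the subscripts $h^n_q$ and $h^n_V$ for partial derivatives are notation introduced only for this sketch.

```latex
% Terms of A^{\nu^n,\bar\nu} H^n that involve the control \nu^n,
% using \partial_x H^n = 1, \partial_q H^n = S + h^n_q, \partial_V H^n = h^n_V:
\begin{align*}
-(S + k\nu^n)\nu^n\,\partial_x H^n + \nu^n\,\partial_q H^n - \gamma\nu^n\,\partial_V H^n
  &= -(S + k\nu^n)\nu^n + \nu^n\,(S + h^n_q) - \gamma\nu^n h^n_V \\
  &= -k(\nu^n)^2 + \nu^n\,\big(h^n_q - \gamma h^n_V\big).
\end{align*}
% For k > 0 this is a concave quadratic in \nu^n, so the supremum is attained at
\[
\nu^{n*} = \frac{h^n_q - \gamma h^n_V}{2k},
\qquad
\sup_{\nu^n}\Big\{-k(\nu^n)^2 + \nu^n\big(h^n_q - \gamma h^n_V\big)\Big\} = \frac{\big(h^n_q - \gamma h^n_V\big)^2}{4k}.
\]
% With the quadratic form of h^n,
\begin{align*}
h^n_q &= c_1 + 2c_5\,q + c_9\,\bar q + c_{10}\,V + c_{11}\,\bar V, \\
h^n_V &= c_3 + c_{10}\,q + c_{12}\,\bar q + 2c_7\,V + c_{14}\,\bar V,
\end{align*}
% so that
\[
h^n_q - \gamma h^n_V = (c_1 - \gamma c_3) + (2c_5 - \gamma c_{10})\,q + (c_9 - \gamma c_{12})\,\bar q
 + (c_{10} - 2\gamma c_7)\,V + (c_{11} - \gamma c_{14})\,\bar V .
\]
```

Dividing the last expression by $2k$ gives the five terms of (3.15). Matching the coefficients of each monomial after substituting the squared term into (3.11) produces the quadratic terms in the ODE system.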
Proof.
This is shown by direct substitution into equation (3.11).

In a similar fashion to the previous section, we require a consistency condition to be satisfied in order for the trading strategy in (3.15) to yield a mean-field Nash equilibrium. The strategy (3.15) is based on the assumption that the average trading speed is given by (3.12), therefore we must impose that when each agent uses the strategy (3.15) the resulting average trading speed is (3.12). Thus, we require
\[ \lim_{N\to\infty} \frac{1}{N}\sum_{n=1}^{N} \nu^{n*}(t, q^n, \bar q, V^n, \bar V) = \bar\nu(t,\bar q,\bar V). \]
Substituting (3.15) and (3.12) into this equation yields
\[ f_0 = \frac{c_1 - \gamma c_3}{2k}, \qquad f_1 = \frac{2c_5 + c_9 - \gamma(c_{10} + c_{12})}{2k}, \qquad f_2 = \frac{c_{10} + c_{11} - \gamma(2c_7 + c_{14})}{2k}. \tag{3.16} \]
As we did in the previous section, we will only consider solutions of (3.13) to (3.14) in which (3.16) has been enforced. This means we only consider optimal trading strategies that result in equilibrium. Also as in the previous section, the trading strategies in a mean-field Nash equilibrium can be written in a particular form.

Proposition 3.5.
In equilibrium, the trading strategy of agent $n$ and the average trading rate of all agents are related by
\[ \nu^{n*}(t, q^n, \bar q, V^n, \bar V) = \frac{2c_5(t) - \gamma c_{10}(t)}{2k}\,(q^n - \bar q) + \frac{c_{10}(t) - 2\gamma c_7(t)}{2k}\,(V^n - \bar V) + \bar\nu(t,\bar q,\bar V). \]

Proposition 3.6. If $\alpha = \gamma = 0$ then $c_2 = c_5 = c_6 = c_9 = c_{12} = c_{13} \equiv 0$ in equilibrium. If $\mu = 0$ then $c_1 = c_2 = c_3 = c_4 \equiv 0$ in equilibrium. If $\alpha = \gamma = \mu = 0$ then the non-zero $c_i$ in equilibrium are given by
\begin{align*}
c_0(t) &= \int_t^T \Big(\eta^2 c_7(s) + \rho^2\eta^2\big(c_8(s) + c_{14}(s)\big)\Big)\,ds, \\
c_7(t) &= \frac{T-t}{4k}\,e^{-2\beta(T-t)}, \\
c_8(t) &= -\frac{4z^2 e^{-\frac{b}{k}(T-t)}}{\big(1 - (1+2z)e^{\omega(T-t)}\big)^2}\,D(t), \\
c_{10}(t) &= e^{-\beta(T-t)}, \\
c_{11}(t) &= \frac{-2z}{1 - (1+2z)e^{\omega(T-t)}} - e^{-\beta(T-t)}, \\
c_{14}(t) &= -\frac{T-t}{2k}\,e^{-2\beta(T-t)} - \bigg(\frac{2z}{1 - (1+2z)e^{\omega(T-t)}}\bigg)\bigg(\frac{1 - e^{-\frac{b}{2k}(T-t)}}{b}\bigg)e^{-\beta(T-t)},
\end{align*}
where
\begin{align*}
D(t) ={}& \frac{1}{16kz^2}\bigg(\frac{1 - e^{-2\omega\tau}}{2\omega} - \tau e^{-2\omega\tau}\bigg) - \frac{1+2z}{8kz^2}\bigg(\frac{1 - e^{-\omega\tau}}{\omega} - \tau e^{-\omega\tau}\bigg) + \frac{2z\bar\gamma - b}{4bz(k\beta - b)}\Big(1 - e^{(\frac{b}{k}-\beta)\tau}\Big) \\
& + \frac{1}{4b}\Big(1 - e^{\frac{b}{k}\tau}\Big) - \frac{1+2z}{2bz}\Big(1 - e^{\frac{b}{2k}\tau}\Big) - \frac{1}{32kz^2\omega}\Big(1 - e^{-2\omega\tau}\Big) + \frac{(1+2z)b - 4z^2\bar\gamma}{8bkz^2\omega}\Big(1 - e^{-\omega\tau}\Big) - \frac{(1+2z)^2}{16kz^2}\,\tau,
\end{align*}
\[ z = \frac{2k\beta - b}{2\bar\gamma}, \qquad \omega = \frac{2k\beta - b}{2k}, \qquad \tau = T - t. \]

Proof.
In equations (3.13) to (3.14) we substitute (3.16). The first two conclusions can be seen by inspection. The expressions for the non-zero $c_i$ come from a tedious computation, but can be checked by direct substitution.

We consider again the loadings of the optimal strategy on the underlying processes. Note that from (3.15) we have
\[ \nu^{n*}(t,q,\bar q,V,\bar V) = \frac{c_1(t) - \gamma c_3(t)}{2k} + \nu^*_q(t)\,q + \nu^*_{\bar q}(t)\,\bar q + \nu^*_V(t)\,V + \nu^*_{\bar V}(t)\,\bar V, \]
where the loadings on $q$, $\bar q$, $V$ and $\bar V$ are given by
\[ \nu^*_q(t) = \frac{2c_5(t) - \gamma c_{10}(t)}{2k}, \quad \nu^*_{\bar q}(t) = \frac{c_9(t) - \gamma c_{12}(t)}{2k}, \quad \nu^*_V(t) = \frac{c_{10}(t) - 2\gamma c_7(t)}{2k}, \quad \nu^*_{\bar V}(t) = \frac{c_{11}(t) - \gamma c_{14}(t)}{2k}. \]
We plot the above four loadings in Figure 6. The first three loadings are qualitatively similar to the situation where each agent's subjective valuation is governed by the same process, but the intuition behind these loadings is shown more effectively by varying some of the relevant parameters.

In Figure 7 we show the loadings on $\bar Q_t$ and $\bar V_t$ as the three impact parameters $b$, $\gamma$, and $\bar\gamma$ are varied. Many of the features shown in this figure can be explained with similar reasoning to the discussion around Figure 4. New features which deserve discussion are the qualitative shape of the loading $\nu^*_{\bar V}$ and the ordering of the curves based on the changing parameter.

Typically the loading $\nu^*_{\bar V}$ has a minimum value, usually negative, shortly before the end of the trading period. If the average signal viewed by agents is positive shortly before time $T$, then this will tend to increase the average order flow and the agent can expect their own trade signal to decrease, thus giving them reason to sell the asset. However, there is a counteracting effect, which is the impact that the average order flow has on the asset price. When the average order flow is positive the asset price will tend to increase, giving incentive for the agent to buy shares shortly before time $T$.
This explains why larger values of permanent price impact, $b$, result in higher loading $\nu^*_{\bar V}$ (bottom left panel) and why larger values of market impact on trade signal, $\bar\gamma$, result in lower loading $\nu^*_{\bar V}$.

The permanence of price impact and the transience of trade signal impact also explain the sharp humps seen in this figure. Since any impact on the trade signal will decay over time due to mean reversion, the considerations of market wide order flow on trade signals become more significant shortly before $T$. The effects of market wide order flow on the price are long lasting, so the agent takes into account this effect over the entire trading period.

Figure 6: Optimal loadings on $Q_t$, $\bar Q_t$, $V_t$, and $\bar V_t$. Parameters used are $\mu = 0$, $\sigma = 1$, $\eta = 0.5$, $\beta = 1$, $\gamma = 0.05$, $\bar\gamma = 0.1$, $\rho = 0.3$, $b = 10^{-2}$, $k = 5\cdot 10^{-3}$, $\alpha = 0.1$, and $T = 1$.

In Figure 8 we show a simulation of relevant processes when each agent adopts the mean-field optimal strategy depicted in Figure 6. The main qualitative difference between this simulation and that shown in Figure 5 is that the distribution of terminal inventories $(Q^n_T)_{n\le N}$ does not become concentrated around a particular value based on the average trade signal. In fact, in this particular simulation the terminal inventories have sample variance 1.32, which is significantly greater than the initial sample variance of 0.24 (the initial inventories are drawn from a distribution with variance $0.5^2 = 0.25$).

In this section we compute the joint distribution of the agents' inventories and signals when all agents use the mean-field equilibrium strategy given by (3.15) (with (3.16) enforced). We do not directly consider the case when all agents observe the same trade signal because those corresponding results can be obtained from those of the separate signal by setting $\rho = 1$, $\gamma = 0$, and each $V^n_0$ the same constant.
In addition, as we are assuming all agents are using the mean-field equilibrium strategies, we drop the notational dependencies on $\nu^n$ and $\bar\nu$.

We begin by defining the processes $Y^n = (Y^n_t)_{0\le t\le T}$ and $\bar Y = (\bar Y_t)_{0\le t\le T}$ by
\[ Y^n_t = \begin{bmatrix} Q^n_t \\ V^n_t \end{bmatrix}, \qquad \bar Y_t = \begin{bmatrix} \bar Q_t \\ \bar V_t \end{bmatrix}. \]

Figure 7: Top row shows optimal loading on $\bar Q_t$ for various parameters, bottom row shows optimal loading on $\bar V_t$. Each figure considers a change in only one parameter, indicated in the legend, from a minimum value (blue curve) to a maximum value (red curve). Otherwise the fixed parameters are $\mu = 0$, $\sigma = 1$, $\eta = 0.5$, $\beta = 1$, $\gamma = 0.05$, $\bar\gamma = 0.1$, $\rho = 0.3$, $b = 10^{-2}$, $k = 5\cdot 10^{-3}$, $\alpha = 0.1$, and $T = 1$.

Figure 8: The top row shows the contributions to trading speed from $Q_t$, $\bar Q_t$, and $V_t$. The left panel of the second row shows the contributions from $\bar V_t$. The middle panel of the second row shows each agent's inventory path $Q^{n,\nu^n}_t$ (blue curves) as well as the average inventory of all agents $\bar Q^{\bar\nu}_t$ (red dotted curve). The right panel of the second row shows the optimal trading speed $\nu^n_t$ (blue curves) and the average trading speed $\bar\nu_t$ (red dotted curve). Parameters used are $\mu = 0$, $\sigma = 1$, $\eta = 0.5$, $\beta = 1$, $\gamma = 0.05$, $\bar\gamma = 0.1$, $\rho = 0.3$, $b = 10^{-2}$, $k = 5\cdot 10^{-3}$, $\alpha = 0.1$, $T = 1$, $S_0 = 100$, $V^n_0 \sim N(0, 0.02^2)$, $Q^n_0 \sim N(0, 0.5^2)$, and $N = 50$.

We also introduce random measure processes on $\mathbb{R}^2$, denoted $m^N = (m^N_t)_{0\le t\le T}$ and $m = (m_t)_{0\le t\le T}$, which are given by
\[ m^N_t = \frac{1}{N}\sum_{n=1}^{N}\delta_{Y^n_t}, \qquad m_t = \lim_{N\to\infty} m^N_t. \]
In the next proposition we provide expressions for the mean vector $\bar Y_t$ and covariance matrix $\bar\Sigma_t$ of the distribution induced by $m_t$.

Proposition 4.1.
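Let $a_t$, $B_t$, and $C_t$ be given by
\begin{align*}
a_t &= \frac{c_1(t) - \gamma c_3(t)}{2k}\begin{bmatrix} 1 \\ -(\gamma+\bar\gamma) \end{bmatrix}, \\
B_t &= \begin{bmatrix} \nu^*_q(t) & \nu^*_V(t) \\ -\gamma\nu^*_q(t) & -\big(\beta + \gamma\nu^*_V(t)\big) \end{bmatrix}, \\
C_t &= \begin{bmatrix} \nu^*_q(t) + \nu^*_{\bar q}(t) & \nu^*_V(t) + \nu^*_{\bar V}(t) \\ -(\gamma+\bar\gamma)\big(\nu^*_q(t) + \nu^*_{\bar q}(t)\big) & -\Big(\beta + (\gamma+\bar\gamma)\big(\nu^*_V(t) + \nu^*_{\bar V}(t)\big)\Big) \end{bmatrix},
\end{align*}
and let $\Phi_t$ and $\Psi_t$ be the solutions to the matrix differential equations
\begin{align*}
\frac{d}{dt}\Phi_t &= C_t\Phi_t, & \Phi_0 &= I_{2\times 2}, \tag{4.1}\\
\frac{d}{dt}\Psi_t &= B_t\Psi_t, & \Psi_0 &= I_{2\times 2}. \tag{4.2}
\end{align*}
The mean vector and covariance matrix induced by $m_t$ are given by
\begin{align*}
\bar Y_t &= \Phi_t\bigg(\bar Y_0 + \int_0^t \Phi_u^{-1}a_u\,du + \rho\int_0^t \Phi_u^{-1}\Theta\,dW_u\bigg), \tag{4.3}\\
\bar\Sigma_t &= \Psi_t\bar\Sigma_0\Psi_t^\top + (1-\rho^2)\,\Psi_t\int_0^t \Psi_u^{-1}\Theta\Theta^\top(\Psi_u^{-1})^\top\,du\;\Psi_t^\top, \tag{4.4}
\end{align*}
where
\[ \Theta = \begin{bmatrix} 0 \\ \eta \end{bmatrix}. \]
If the distribution induced by $m_0$ is Gaussian, then $m_t$ induces a Gaussian distribution for all $t\in[0,T]$. If $\mu = \alpha = \gamma = 0$ then the covariance matrix in (4.4) has individual elements
\begin{align*}
\bar\Sigma^{Q}_t &= \bar\Sigma^{Q}_0 + \frac{e^{-\beta T}}{k}\,t\,\bar\Sigma^{QV}_0 + \frac{e^{-2\beta T}}{4k^2}\,t^2\,\bar\Sigma^{V}_0 + (1-\rho^2)\,\frac{\eta^2 e^{-2\beta T}}{16\beta^3k^2}\big(e^{2\beta t} - 1 - 2\beta t - 2\beta^2t^2\big), \tag{4.5}\\
\bar\Sigma^{V}_t &= e^{-2\beta t}\,\bar\Sigma^{V}_0 + (1-\rho^2)\,\frac{\eta^2}{2\beta}\big(1 - e^{-2\beta t}\big), \tag{4.6}\\
\bar\Sigma^{QV}_t &= e^{-\beta t}\,\bar\Sigma^{QV}_0 + \frac{e^{-\beta T}}{2k}\,t\,e^{-\beta t}\,\bar\Sigma^{V}_0 + (1-\rho^2)\,\frac{\eta^2 e^{-\beta T}}{4\beta^2k}\big(\sinh(\beta t) - \beta t\,e^{-\beta t}\big). \tag{4.7}
\end{align*}

Proof.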
The dynamics of $Y^n$ and $\bar Y$ are given by
\begin{align*}
dY^n_t &= \big(a_t + B_tY^n_t + (C_t - B_t)\bar Y_t\big)\,dt + \Theta\,dZ^n_t, \tag{4.8}\\
d\bar Y_t &= \big(a_t + C_t\bar Y_t\big)\,dt + \rho\,\Theta\,dW_t. \tag{4.9}
\end{align*}
The solution to (4.9) is given by (4.3) (see Section 5.6 of Karatzas and Shreve (2012)). By substituting this solution for $\bar Y_t$ into (4.8) and performing some tedious computations we arrive at
\[ Y^n_t = \Psi_t(Y^n_0 - \bar Y_0) + \Phi_t\bar Y_0 + \Phi_t\int_0^t \Phi_u^{-1}a_u\,du + \rho\,\Phi_t\int_0^t \Phi_u^{-1}\Theta\,dW_u + \sqrt{1-\rho^2}\,\Psi_t\int_0^t \Psi_u^{-1}\Theta\,dW^{n,\perp}_u. \]
Subtracting $\bar Y_t$ from this expression yields
\[ Y^n_t - \bar Y_t = \Psi_t(Y^n_0 - \bar Y_0) + \sqrt{1-\rho^2}\,\Psi_t\int_0^t \Psi_u^{-1}\Theta\,dW^{n,\perp}_u, \]
from which we also compute
\begin{align*}
(Y^n_t - \bar Y_t)(Y^n_t - \bar Y_t)^\top ={}& \Psi_t(Y^n_0 - \bar Y_0)(Y^n_0 - \bar Y_0)^\top\Psi_t^\top \\
& + \sqrt{1-\rho^2}\,\Psi_t\bigg[(Y^n_0 - \bar Y_0)\bigg(\int_0^t \Psi_u^{-1}\Theta\,dW^{n,\perp}_u\bigg)^{\!\top} + \bigg(\int_0^t \Psi_u^{-1}\Theta\,dW^{n,\perp}_u\bigg)(Y^n_0 - \bar Y_0)^\top\bigg]\Psi_t^\top \\
& + (1-\rho^2)\,\Psi_t\bigg(\int_0^t \Psi_u^{-1}\Theta\,dW^{n,\perp}_u\bigg)\bigg(\int_0^t \Psi_u^{-1}\Theta\,dW^{n,\perp}_u\bigg)^{\!\top}\Psi_t^\top.
\end{align*}
We sum both sides over $1\le n\le N$ and divide by $N$. As $N\to\infty$ the left hand side converges to $\bar\Sigma_t$. The second term on the right converges to zero due to independence of $Y^n_0$ and $W^{n,\perp}$. Applying the law of large numbers and Itô's isometry to the third term yields (4.4). If the initial distribution $m_0$ is Gaussian, then independence of $Y^n_0$ and $W^{n,\perp}$ and the fact that the stochastic integrand is deterministic result in $m_t$ being Gaussian.

To obtain the expressions in (4.5), (4.6), and (4.7) we first use Proposition 3.6 to write the matrix $B_t$ in closed form. Then (4.2) can be solved in closed form, which yields
\[ \Psi_t = \begin{bmatrix} 1 & \frac{e^{-\beta T}}{2k}\,t \\ 0 & e^{-\beta t} \end{bmatrix}, \qquad \Psi_t^{-1} = \begin{bmatrix} 1 & -\frac{e^{-\beta(T-t)}}{2k}\,t \\ 0 & e^{\beta t} \end{bmatrix}. \]
Substituting these expressions into (4.4) and computing the integral gives the result.

The covariance matrix in (4.4) confirms an observation made in comparing the simulations of Figure 5 and Figure 8: the sample variance of the terminal inventory of all agents is greater when they have separate signals compared to when they share the same signal. This is because of the lower correlation between signals implied by the separate signals and the term $1-\rho^2$ in (4.4).
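The last step of the proof, substituting the closed-form $\Psi_t$ into (4.4) and computing the integral, can be checked numerically. The following is a minimal sketch under the assumption $\mu = \alpha = \gamma = 0$; the function names, the grid size, and the initial covariance matrix are illustrative choices, not taken from the paper. It evaluates the integral in (4.4) by the trapezoidal rule and compares the result with the closed forms (4.5) to (4.7).

```python
import numpy as np

def psi(t, T, beta, k):
    """Closed-form solution of (4.2) when mu = alpha = gamma = 0."""
    return np.array([[1.0, np.exp(-beta * T) * t / (2 * k)],
                     [0.0, np.exp(-beta * t)]])

def cov_closed_form(t, T, beta, k, eta, rho, sigma0):
    """Cross-sectional covariance from (4.5)-(4.7)."""
    sQ0, sQV0, sV0 = sigma0[0, 0], sigma0[0, 1], sigma0[1, 1]
    p = np.exp(-beta * T) * t / (2 * k)   # upper-right entry of Psi_t
    c = (1 - rho**2) * eta**2             # weight of the idiosyncratic noise
    sQ = (sQ0 + 2 * p * sQV0 + p**2 * sV0
          + c * np.exp(-2 * beta * T) / (16 * beta**3 * k**2)
          * (np.exp(2 * beta * t) - 1 - 2 * beta * t - 2 * beta**2 * t**2))
    sV = np.exp(-2 * beta * t) * sV0 + c / (2 * beta) * (1 - np.exp(-2 * beta * t))
    sQV = (np.exp(-beta * t) * (sQV0 + p * sV0)
           + c * np.exp(-beta * T) / (4 * beta**2 * k)
           * (np.sinh(beta * t) - beta * t * np.exp(-beta * t)))
    return np.array([[sQ, sQV], [sQV, sV]])

def cov_quadrature(t, T, beta, k, eta, rho, sigma0, n=4001):
    """Covariance from (4.4), with the time integral done by the trapezoidal rule."""
    theta = np.array([0.0, eta])
    us = np.linspace(0.0, t, n)
    psi_t = psi(t, T, beta, k)
    # v(u) = Psi_t Psi_u^{-1} Theta, so the integrand of (4.4) is v(u) v(u)^T.
    v = np.array([psi_t @ np.linalg.solve(psi(u, T, beta, k), theta) for u in us])
    outer = v[:, :, None] * v[:, None, :]
    du = us[1] - us[0]
    integral = du * (outer[0] / 2 + outer[1:-1].sum(axis=0) + outer[-1] / 2)
    return psi_t @ sigma0 @ psi_t.T + (1 - rho**2) * integral

# Illustrative parameters (an assumption of this sketch).
params = dict(T=1.0, beta=1.0, k=5e-3, eta=0.5)
sigma0 = np.array([[0.25, 0.005], [0.005, 0.0004]])
closed = cov_closed_form(0.8, rho=0.3, sigma0=sigma0, **params)
quad = cov_quadrature(0.8, rho=0.3, sigma0=sigma0, **params)
# Inventory variance for a larger correlation, for comparison.
closed_high_rho = cov_closed_form(0.8, rho=0.9, sigma0=sigma0, **params)
```

In this sketch the two covariance matrices should agree up to quadrature error, and the inventory variance `closed_high_rho[0, 0]` is smaller than `closed[0, 0]` because the factor $1-\rho^2$ shrinks as $|\rho|$ grows.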
In fact the variance of inventory will be minimized when the correlation is $\rho = \pm 1$. The intuition is clear: if the agents have very similar signals then they will trade in a similar fashion, and any variance in their terminal inventory will be the result of variance of their initial inventory and the limited speed of trading due to market frictions such as temporary price impact.

In Figure 9 we show the variances and correlation across agents of inventories and signals in the mean-field limit. This gives a visual demonstration that the variance of inventories is lowest when $\rho$ is largest. In addition we also see that in the early parts of the trading period the variance does not depend much on the magnitude of shared information, which is measured by $\rho$. This is due to the fact that for much of the trading interval the agents are controlling the size of their inventory by trading towards a common target of zero. When the end of the trading period is closer they begin to take advantage of the information in the trade signal, and their trading targets due to the trade signal may be different, causing their inventories to diverge.

Figure 9: Cross sectional variances and correlation of $Q^n_t$ and $V^n_t$ in mean-field limit. Parameters used are $\mu = 0$, $\sigma = 1$, $\eta = 1$, $\beta = 1$, $\gamma = 0.05$, $\bar\gamma = 0.1$, $b = 5\cdot 10^{-2}$, $k = 5\cdot 10^{-3}$, $\alpha = 0.1$, $T = 1$. The initial variances of inventories and signals are $0.5^2$ and $0.02^2$ respectively with an initial correlation of 0.

The behaviour of the trade signal variance is more expected. Since the initial distribution is relatively concentrated with a variance of $0.02^2$, the variance quickly increases, but at different rates depending on the magnitude of shared information measured by $\rho$. If $\rho$ is large then the agents share much of the same information, and so it is expected that the cross sectional variance of their trade signals is lower.

With the expression given in (4.3) for the cross sectional mean of inventory and signal, we are able to demonstrate the effect of a shared trade signal on the variance of the asset price.
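This is done in the following proposition.

Proposition 4.2. In mean-field equilibrium, the variance of the asset price is
\[ \mathbb{E}\big[(S_t - \mathbb{E}[S_t])^2\big] = \int_0^t \bigg(\rho\,\eta\,b\,\begin{bmatrix} 1 & 0 \end{bmatrix}\Phi_t\Phi_u^{-1}\begin{bmatrix} 0 \\ 1 \end{bmatrix} + \sigma\bigg)^{2}\,du, \tag{4.10} \]
where $\Phi_t$ is as in Proposition 4.1. If $\mu = \alpha = \gamma = 0$ then this reduces to
\[ \mathbb{E}\big[(S_t - \mathbb{E}[S_t])^2\big] = \int_0^t \bigg(\rho\,\eta\,\frac{2z\big(e^{-\frac{b}{2k}(t-u)} - 1\big)}{1 - (1+2z)e^{\omega(T-u)}} + \sigma\bigg)^{2}\,du, \tag{4.11} \]
where
\[ \omega = \frac{2k\beta - b}{2k}, \qquad z = \frac{2k\beta - b}{2\bar\gamma}. \]

Proof.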
With $\bar\nu_t$ in (3.1) being set equal to the average trading speed in equilibrium we may write
\[ dS_t = \bigg(\mu + b\,\frac{c_1(t) - \gamma c_3(t)}{2k} + N_t\bar Y_t\bigg)\,dt + \sigma\,dW_t, \]
where
\[ N_t = b\,\begin{bmatrix} \nu^*_q(t) + \nu^*_{\bar q}(t), & \nu^*_V(t) + \nu^*_{\bar V}(t) \end{bmatrix}. \]
With the expression for $\bar Y_t$ given in (4.3) the solution to the SDE can be written as
\[ S_t = S_0 + \int_0^t \bigg(\mu + b\,\frac{c_1(u) - \gamma c_3(u)}{2k} + N_u\Phi_u\bar Y_0 + N_u\Phi_u\int_0^u \Phi_s^{-1}a_s\,ds\bigg)\,du + \int_0^t \bigg(\rho\,\eta\,b\,\begin{bmatrix} 1 & 0 \end{bmatrix}\Phi_t\Phi_u^{-1}\begin{bmatrix} 0 \\ 1 \end{bmatrix} + \sigma\bigg)\,dW_u, \]
where $a_s$ is defined as in Proposition 4.1. This allows us to write
\[ S_t - \mathbb{E}[S_t] = \int_0^t \bigg(\rho\,\eta\,b\,\begin{bmatrix} 1 & 0 \end{bmatrix}\Phi_t\Phi_u^{-1}\begin{bmatrix} 0 \\ 1 \end{bmatrix} + \sigma\bigg)\,dW_u, \]
and the result in (4.10) follows from Itô's isometry. The expression in (4.11) arises again from using Proposition 3.6 to solve (4.1), which yields
\[ \Phi_t = \begin{bmatrix} 1 & \dfrac{2z\big(e^{-\frac{b}{2k}t} - 1\big)}{b\big(1 - (1+2z)e^{\omega T}\big)} \\[2ex] 0 & \dfrac{1 - (1+2z)e^{\omega(T-t)}}{1 - (1+2z)e^{\omega T}}\,e^{-\frac{b}{2k}t} \end{bmatrix}, \qquad \Phi_t^{-1} = \begin{bmatrix} 1 & \dfrac{-2z\big(e^{-\frac{b}{2k}t} - 1\big)e^{\frac{b}{2k}t}}{b\big(1 - (1+2z)e^{\omega(T-t)}\big)} \\[2ex] 0 & \dfrac{1 - (1+2z)e^{\omega T}}{1 - (1+2z)e^{\omega(T-t)}}\,e^{\frac{b}{2k}t} \end{bmatrix}. \]
Substituting these expressions into (4.10) yields (4.11).

In Figure 10 we plot the variance of $S_t$ through time when agents trade according to the mean-field equilibrium strategy. If there were no price impact then this variance would come purely from the accumulated volatility over time. With price impact, the drift of the midprice has an element of randomness caused by the common noise component of the agents' trade signal. Here we see that the effect on price variance depends not only on the amount of information shared by agents, as measured by the magnitude of $\rho$, but also on the sign of $\rho$. When $\rho$ is large in magnitude, agents share a lot of information and trade in a similar fashion. When this happens with positive $\rho$, their order flow is concentrated and tends to occur in the same direction as midprice changes, effectively increasing the size of midprice changes and therefore variance.
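The dependence of the price variance on the sign of $\rho$ can be illustrated by evaluating (4.11) numerically. This is a minimal sketch, assuming $\mu = \alpha = \gamma = 0$ so that the closed form applies; the function name `price_variance`, the grid size, and the specific values of $\rho$ are illustrative assumptions. The remaining parameters follow those listed for Figure 10.

```python
import numpy as np

def price_variance(t, T, rho, sigma, eta, beta, b, k, gamma_bar, n=4001):
    """Evaluate (4.11) with the trapezoidal rule (requires 2*k*beta != b)."""
    omega = (2 * k * beta - b) / (2 * k)
    z = (2 * k * beta - b) / (2 * gamma_bar)
    u = np.linspace(0.0, t, n)
    # Deterministic weight multiplying rho * eta inside the integrand of (4.11).
    impact = (2 * z * (np.exp(-b / (2 * k) * (t - u)) - 1)
              / (1 - (1 + 2 * z) * np.exp(omega * (T - u))))
    f = (rho * eta * impact + sigma) ** 2
    du = u[1] - u[0]
    return du * (f[0] / 2 + f[1:-1].sum() + f[-1] / 2)

# Parameters as in Figure 10, except that alpha = gamma = 0 is assumed here.
common = dict(t=1.0, T=1.0, sigma=1.0, eta=1.0, beta=1.0,
              b=5e-2, k=5e-3, gamma_bar=0.1)
var_zero = price_variance(rho=0.0, **common)   # accumulated volatility sigma^2 * t
var_pos = price_variance(rho=0.5, **common)    # common signal amplifies price moves
var_neg = price_variance(rho=-0.5, **common)   # common signal dampens price moves
```

With $\rho = 0$ the integrand is constant and the variance equals $\sigma^2 t$. For these parameter values the weight multiplying $\rho\eta$ is non-negative, so a positive $\rho$ raises the variance above $\sigma^2 t$ and a negative $\rho$ lowers it.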
When $\rho$ is negative, their order flow is concentrated but tends to occur in the opposite direction of midprice changes, lowering the variance. When $\rho$ is close to zero, they share little information and net order flow tends to be close to zero, which adds no additional variance to the midprice.

Figure 10: Variances of midprice $S_t$ through time in mean-field limit for various $\rho$. Parameters used are $\mu = 0$, $\sigma = 1$, $\eta = 1$, $\beta = 1$, $\gamma = 0.05$, $\bar\gamma = 0.1$, $b = 5\cdot 10^{-2}$, $k = 5\cdot 10^{-3}$, $\alpha = 0.1$, $T = 1$.

In this paper we have presented a model for price dynamics and trading in which an agent attempts to extract profits from his own subjective valuation of an asset. When his subjective view of asset value is significantly different from the traded market price he wants to accumulate a large position, but friction effects and risk aversion prevent him from trading too quickly. Instead he manages a trade-off between the potential profits and costs. We continue our analysis when multiple agents are undertaking this task, either with a common trade signal shared between them or with individual signals correlated to each other. A mean-field game approach is taken to represent a setting with a large number of agents, which keeps the problem tractable. This also allows us to study the cross sectional distribution of inventory as it depends on the correlation structure of the collection of signals. When correlation between signals is large, the inventory across all agents will have a tighter distribution because they are essentially trading off of the same information and therefore have similar behaviour. The correlation between signal and price innovations also modifies the asset price variance, as the random order flow will cause it to deviate from its accumulated volatility over time. Positive correlation between each signal and price innovations will increase the variance of the asset price at any fixed point in time.

References
Almgren, R. and N. Chriss (2001). Optimal execution of portfolio transactions. Journal of Risk 3, 5–40.

Bigiotti, A. and A. Navarra (2018). Optimizing automated trading systems. In The 2018 International Conference on Digital Science, pp. 254–261. Springer.

Cartea, Á., R. Donnelly, and S. Jaimungal (2018). Hedging Non-Tradable Risks with Transaction Costs and Price Impact. SSRN eLibrary, http://ssrn.com/abstract=3158727.

Cartea, Á. and S. Jaimungal (2016a). A closed-form execution strategy to target volume weighted average price. SIAM Journal on Financial Mathematics 7(1), 760–785.

Cartea, Á. and S. Jaimungal (2016b). Incorporating order-flow into optimal execution. Mathematics and Financial Economics 10(3), 339–364.

Casgrain, P. and S. Jaimungal (2018a). Mean-field games with differing beliefs for algorithmic trading. Mathematical Finance.

Casgrain, P. and S. Jaimungal (2018b). Mean field games with partial information for algorithmic trading. arXiv preprint arXiv:1803.04094.

Donnelly, R. and L. Gan (2018). Optimal decisions in a time priority queue. Applied Mathematical Finance 25(2), 107–147.

Huang, X., S. Jaimungal, and M. Nourian (2015). Mean-field game strategies for optimal execution. Applied Mathematical Finance.

Karatzas, I. and S. Shreve (2012). Brownian motion and stochastic calculus, Volume 113. Springer Science & Business Media.

Kaya, O., J. Schildbach, and D. B. Ag (2016). High-frequency trading. Reaching the limits, Automated trader magazine 41, 23–27.

Lehalle, C.-A. and O. Mounjid (2017). Limit order strategic placement with adverse selection risk and the role of latency.