Trading multiple mean reversion
E. Boguslavskaya (Brunel University London, UK), M. Boguslavsky (TradeTeq, UK), and D. Muravey (Lomonosov State University, Moscow, Russia)

September 22, 2020
Abstract
How should one construct a portfolio from multiple mean-reverting assets? Should one add an asset to a portfolio even if the asset has zero mean reversion? We consider a position management problem for an agent trading multiple mean-reverting assets. We solve an optimal control problem for an agent with power utility, and present a semi-explicit solution. The nearly explicit nature of the solution allows us to study the effects of parameter mis-specification, and derive a number of properties of the optimal solution.
arXiv preprint, subject class q-fin.MF.

Contents

The model
A Reduction of the HJB equation to the linear PDE
A.1 Distortion transformation
B Wealth SDE solution
C Proof of Theorem 6.1
C.1 Proof of formulas (28) and (29)
C.2 Proof of formula (30)
C.3 Proof of formulas (31)
D Auxiliary facts about the structure of the matrix F in the zero correlation case
E Auxiliary facts about correlation matrices
E.1 Proof of formula (1)
E.2 Proof of formula (2)
E.3 Proof of formula (3)
E.4 Proof of formula (4)
1 Introduction

One of the basic patterns of statistical arbitrage is mean reversion trading. Typically, one constructs a synthetic asset from one or several traded assets in such a way that its price dynamics are mean reverting. For example, for a pair of cointegrated assets there exists a mean-reverting linear combination of these assets. We will call this mean-reverting synthetic asset the spread. Generally, trading a mean-reverting asset consists of buying the spread when it is below its mean level and selling it when it is above. The main question is how the position should be optimally managed given movements of the spread, the trader's risk aversion, and the time horizon. When there are several mean-reverting assets available, the trader should additionally solve a dynamic portfolio optimization problem in order to decide the best way to combine positions in these assets.

A number of papers have addressed this problem by specifying a stochastic differential equation (SDE) for the spread dynamics and finding the strategy that maximizes the expected utility of terminal wealth. The simplest example of mean-reverting dynamics in continuous time is the Ornstein–Uhlenbeck process, the continuous version of the AR(1) discrete process. For the optimal trading strategy for a single spread see [4]. For more complicated mean-reverting dynamics we refer to paper [2], where the spread is modelled by a Markov modulated Ornstein–Uhlenbeck process, and to papers [9] and [10], where the authors consider fractional stochastic processes. Models with uncertainty in the mean reversion level were discussed in [14]. Other models for the spread have also been considered in the literature: for models based on the Brownian bridge see [16], and for models based on CER/CIR processes see [19]. A comprehensive review of mean reversion trading can be found in [13]. For the methodology of statistical arbitrage we refer to [3]. In [15] the authors assume different mean-reversion dynamics for multiple spread processes. They solve a portfolio optimization problem for several geometric Brownian motions with multiple co-integration terms in the drifts.

Usually a portfolio allocator has access to multiple investment opportunities. Optimal sizing and timing of positions in each of these opportunities may be affected by positions in other assets and by the performance of those assets. To develop intuition about the optimal dynamic allocation strategy, we generalise [4] to the case of multiple correlated Ornstein–Uhlenbeck and Brownian motion processes. We solve the problem of maximizing a power utility of terminal wealth for a finite horizon agent. Power utilities are a sufficiently broad family of utility functions, containing log-utility as a special case and linear utility as a limit case.

For the general problem, the optimal strategy is found in quasi-analytical form as a solution to a matrix Riccati ordinary differential equation. For several important special cases it is possible to solve this equation explicitly. We also propose an efficient approach to analysing the effects of parameter mis-specification. Although the proposed model is very simple, one can observe non-trivial qualitative properties of the optimal strategy. The availability of a quasi-analytical solution allows us to study how the trading strategy is affected by correlation between spreads, and to demonstrate the tradeoffs between "harvesting" each spread separately and hedging positions in correlated spreads.

The rest of this paper is organized as follows: in Section 2 we give a brief overview of the properties of the optimal strategy. In Section 3 we specify our formal asset and trading model and formulate a stochastic optimal control problem. Section 4 contains explicit formulas for the optimal control and the value function. Section 5 recalls the main insights for the one-dimensional case. The analysis of the optimal solution is presented in Section 6. In Section 7, we present an ODE based framework to analyse the effect of parameter mis-specification and to calculate the moments of the terminal wealth distribution. We then apply this framework to analyse the sensitivity of the strategy and of the value function to misspecification of the reversion rates.

Implementation source code in Python and numerical implementation hints are available at [1].
The optimal solution has a number of interesting qualitative properties.

• Trade-off between hedging and spread extraction. In the case of a single asset, the position is managed to extract value from this asset's movements. With several correlated mean-reverting assets, the optimal strategy also uses positions in assets with slower mean reversion to hedge positions in faster mean-reverting assets.

• Impact of correlations. With all other parameters fixed, higher absolute values of correlations between the processes driving the assets are preferable to lower absolute values, as long as they stay below 1. See Section 6.5 for more details.

• Impact of different reversion rates. With all other parameters fixed, higher reversion speeds are not always preferable for the trader. An asset with a lower reversion rate and a non-zero correlation with higher reversion rate assets may be used primarily as a hedge for positions in these assets. Hedge efficiency may decline as the lower reversion rate increases. See Section 6.4 for more details.

• Cost of parameter misspecification. The optimal strategy depends strongly on the assumed reversion rates. It is safer to underestimate reversion rates than to overestimate them. The value function is more sensitive to errors in the ratios of reversion rates between assets than to joint correlated errors in the rate estimates. See Section 7.
3 The model

Assume the canonical multivariate filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ with filtration $(\mathcal{F}_t)_{t \ge 0}$ satisfying the usual conditions, see e.g. [11]. On this space let $X_t = [X^1_t, X^2_t, \dots, X^n_t]^\top$ be a collection of tradeable assets following a multidimensional Ornstein–Uhlenbeck process

$$dX_t = -\boldsymbol{\kappa} X_t\,dt + \sigma\,dB_t. \tag{1}$$

Here $B_t = [B^1_t, B^2_t, \dots, B^n_t]^\top$ is an $n$-dimensional Wiener process with correlation matrix $\Theta \in \mathbb{R}^{n \times n}$ (i.e. $dB_t\,dB_t^\top = \Theta\,dt$), and $\boldsymbol{\kappa} \in \mathbb{R}^{n \times n}_+$ and $\sigma \in \mathbb{R}^{n \times n}_+$ are diagonal matrices of reversion rates and volatilities respectively:

$$\boldsymbol{\kappa} = \mathrm{diag}(\kappa_1, \kappa_2, \dots, \kappa_n), \qquad \sigma = \mathrm{diag}(\sigma_1, \sigma_2, \dots, \sigma_n), \qquad \Theta = \begin{bmatrix} 1 & \rho_{12} & \dots & \rho_{1n} \\ \rho_{21} & 1 & \dots & \rho_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{n1} & \rho_{n2} & \dots & 1 \end{bmatrix}.$$

The diagonality of the matrices $\boldsymbol{\kappa}$ and $\sigma$ means that all dependency between assets comes from the correlations between the driving Brownian motions. We also consider models with some assets exhibiting zero mean reversion (i.e. with some zero elements of $\boldsymbol{\kappa}$). These assets simply follow correlated Brownian motions. However, we assume that the elements of $\boldsymbol{\kappa}$ are not all zero to avoid a trivial problem. The correlation matrix $\Theta$ should be symmetric and positive semi-definite with unit diagonal elements, $\rho_{ii} = 1$, $\rho_{ij} = \rho_{ji}$. We will assume that $\Theta$ has full rank to avoid obvious arbitrages.

Without loss of generality, we can also assume that the long-term means of each process are equal to zero. The general case can be reduced to equation (1) by the substitution $[X_t - \theta] \to X_t$, where $\theta$ is a vector of long-term means. Equation (1) can be solved explicitly in terms of an Itô integral:

$$X_t = e^{-\boldsymbol{\kappa} t} X_0 + \int_0^t e^{-\boldsymbol{\kappa}(t-s)} \sigma\,dB_s.$$

Here $e^{A}$ is the matrix exponential:

$$e^{A} = \sum_{k=0}^{\infty} \frac{1}{k!} A^k, \qquad A^0 = I.$$

The problem can be treated in the general Merton portfolio optimisation framework, see [17]. Let the vector

$$\alpha_t = \left[\alpha^1_t, \alpha^2_t, \dots, \alpha^n_t\right]^\top$$

be a trader's position at time $t$, i.e. the number of units of each asset held. This is the control in our optimization problem. Assuming zero interest rates and no transaction costs, for a given control process $\alpha_t$, the wealth process $W^\alpha_t$ is given by

$$dW^\alpha_t = \alpha_t^\top dX_t = \sum_{i=1}^n \alpha^i_t\,dX^i_t,$$

or in integral form

$$W^\alpha_T = W^\alpha_t + \int_t^T \alpha_u^\top dX_u = W^\alpha_t + \sum_{i=1}^n \int_t^T \alpha^i_u\,dX^i_u.$$

Without loss of generality, we assume unit noise magnitudes, i.e. $\sigma = I$. For the general case, the following parametrisation should be used:

$$X_t \to \sigma^{-1} X_t, \qquad \alpha_t \to \sigma \alpha_t.$$

The value function $J(W^\alpha_t, X_t, t) : \mathbb{R}_+ \times \mathbb{R}^n \times [0, T] \to \mathbb{R}$ is the supremum over all admissible controls of the expectation of the terminal utility conditional on the information available at time $t$:

$$J(w, x, t) = \sup_{\alpha \in \mathcal{A}} \mathbb{E}\left[U(W^\alpha_T) \mid W^\alpha_t = w,\ X_t = x\right],$$

where the set of admissible controls $\mathcal{A}$ is defined as

$$\mathcal{A} = \left\{ \alpha : [0, T] \times \Omega \to \mathbb{R}^n \ \middle|\ \alpha_t \in \mathcal{F}_t,\ \int_0^T (W^\alpha_t)^2 \sum_{i=1}^n \left(\alpha^i_t X^i_t\right)^2 dt < \infty \ \text{a.s.} \right\}. \tag{2}$$

We consider a power utility function with the parameter $\gamma < 1$:

$$U = U(W^\alpha_T) = \frac{1}{\gamma} (W^\alpha_T)^\gamma.$$

The relative risk aversion is measured by $1 - \gamma$. It is convenient to use another measure $\delta$, which is also known as the distortion rate (see [18]):

$$\delta = \frac{1}{1 - \gamma}, \qquad 0 < \delta < \infty.$$
So the smaller $\delta$ is, the more risk averse the agent. The case $\gamma = 0$ corresponds to the logarithmic utility function, and the limit $\gamma \to 1$ corresponds to the linear (risk-neutral) utility.

Our aim is to find the optimal control $\alpha^*(W^\alpha_t, X_t, t)$ and the value function $J(W^\alpha_t, X_t, t)$ as functions of wealth $W^\alpha_t$, prices $X_t$ and time $t$. The Hamilton–Jacobi–Bellman equation is

$$\sup_{\alpha} \left( \left( \partial/\partial t + \mathcal{L} \right) J \right) = 0. \tag{3}$$

Here $\mathcal{L}$ is the infinitesimal generator of the wealth process $W^\alpha_t$:

$$\mathcal{L} = \frac{1}{2} \alpha^\top \Theta \alpha \frac{\partial^2}{\partial w^2} + \alpha^\top \Theta \nabla \frac{\partial}{\partial w} + \frac{1}{2} \nabla^\top \Theta \nabla - \alpha^\top \boldsymbol{\kappa} x \frac{\partial}{\partial w} - x^\top \boldsymbol{\kappa} \nabla,$$

and the first order optimality condition on the control $\alpha^*$ is

$$\alpha^*(w, x, t) = \frac{J_w}{J_{ww}} \Theta^{-1} \boldsymbol{\kappa} x - \frac{\nabla J_w}{J_{ww}}. \tag{4}$$

The operator $\nabla$ denotes the vector differential operator

$$\nabla = \left[ \frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \dots, \frac{\partial}{\partial x_n} \right]^\top,$$

for which we define the following operations for any vectors $a \in \mathbb{R}^n$ and matrices $A \in \mathbb{R}^{n \times n}$:

$$a^\top \nabla = \sum_{i=1}^n a_i \frac{\partial}{\partial x_i}, \qquad \nabla^\top A \nabla = \sum_{i=1}^n \sum_{j=1}^n A_{ij} \frac{\partial^2}{\partial x_i \partial x_j}.$$

Note that the first summand on the right-hand side of (4) is the myopic demand term corresponding to a static optimization problem, while the second term hedges against changes in the investment opportunity set. For a log utility investor ($\gamma = 0$ or, equivalently, $\delta = 1$) the second term vanishes (see [17]).

Substituting this condition into equation (3) for the value function, we obtain a non-linear PDE which can be linearised by the distortion transformation (see [18]):

$$J(w, x, t) = \frac{w^\gamma}{\gamma} f^{1/\delta}(x, t).$$

Here the function $f(x, t)$ is a solution to the Cauchy problem for the parabolic PDE

$$\frac{1}{2} \nabla^\top \Theta \nabla f - \frac{\delta + 1}{2} x^\top \boldsymbol{\kappa} \nabla f - \frac{\delta - 1}{2} \nabla^\top f\, \boldsymbol{\kappa} x + \frac{\delta(\delta - 1)}{2} x^\top \boldsymbol{\kappa} \Theta^{-1} \boldsymbol{\kappa} x\, f + \frac{\partial f}{\partial t} = 0, \qquad f(x, T) = 1.$$

This PDE can be reduced to a matrix Riccati ODE. The value function $J$ and the optimal control $\alpha^*$ have quasi-analytic representations via solutions to this ODE.
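For numerical experiments with this model, the price process (1) can be sampled exactly on a time grid, because the conditional law of $X_{t+h}$ given $X_t$ is Gaussian with mean $e^{-\boldsymbol{\kappa} h} X_t$ and covariance entries $\Theta_{ij}\,(1 - e^{-(\kappa_i + \kappa_j)h})/(\kappa_i + \kappa_j)$ (with the limit $\Theta_{ij} h$ when $\kappa_i + \kappa_j = 0$). The sketch below assumes $\sigma = I$ and zero long-term means, as in the text; the function names `ou_transition` and `simulate_ou` are illustrative and are not taken from the repository [1].

```python
import numpy as np

def ou_transition(kappa, theta_corr, h):
    """Decay factors and noise covariance of X_{t+h} given X_t (sigma = I)."""
    kappa = np.asarray(kappa, dtype=float)
    theta_corr = np.asarray(theta_corr, dtype=float)
    decay = np.exp(-kappa * h)
    ksum = kappa[:, None] + kappa[None, :]
    # (1 - exp(-(k_i + k_j) h)) / (k_i + k_j), replaced by its limit h
    # when k_i + k_j = 0 (two assets that are pure Brownian motions).
    safe = np.where(ksum > 0, ksum, 1.0)
    factor = np.where(ksum > 0, -np.expm1(-safe * h) / safe, h)
    return decay, theta_corr * factor

def simulate_ou(x0, kappa, theta_corr, horizon, n_steps, rng):
    """Exact sampling of the correlated OU process on a uniform grid."""
    h = horizon / n_steps
    decay, cov = ou_transition(kappa, theta_corr, h)
    chol = np.linalg.cholesky(cov)  # requires a full-rank correlation matrix
    path = np.empty((n_steps + 1, len(x0)))
    path[0] = x0
    for k in range(n_steps):
        noise = chol @ rng.standard_normal(len(x0))
        path[k + 1] = decay * path[k] + noise
    return path

if __name__ == "__main__":
    theta = np.array([[1.0, 0.5], [0.5, 1.0]])
    # One mean-reverting asset and one correlated Brownian motion.
    path = simulate_ou(np.zeros(2), [1.0, 0.0], theta, 3.0, 300,
                       np.random.default_rng(0))
    print(path.shape)
```

Exact sampling avoids the discretisation bias of an Euler scheme, which matters when the reversion rates of the assets differ by orders of magnitude.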
Using an ansatz similar to [5] and [15], we prove that the value function $J$ is given by

$$J(w, x, t) = \frac{w^\gamma}{\gamma} \cdot \exp\left\{ \int_0^{T-t} \frac{\mathrm{Tr}(A(u) \Theta)}{\delta}\,du \right\} \cdot \exp\left\{ \frac{x^\top A(T - t)\, x}{\delta} \right\},$$

where $\mathrm{Tr}$ denotes the trace operator and $A : \mathbb{R}_+ \to \mathbb{R}^{n \times n}$ is a matrix function of the inverse time $\tau = T - t$,

$$A(\tau) = \begin{bmatrix} A_{11}(\tau) & A_{12}(\tau) & \dots & A_{1n}(\tau) \\ A_{21}(\tau) & A_{22}(\tau) & \dots & A_{2n}(\tau) \\ \vdots & \vdots & \ddots & \vdots \\ A_{n1}(\tau) & A_{n2}(\tau) & \dots & A_{nn}(\tau) \end{bmatrix},$$

which is defined as a solution to the following matrix Riccati equation:

$$A'(\tau) = \mathcal{R}_{\Theta, \boldsymbol{\kappa}, \delta} A, \qquad A(0) = 0, \tag{5}$$

with $\mathcal{R}_{\Theta, \boldsymbol{\kappa}, \delta}$ denoting the nonlinear operator

$$\mathcal{R}_{\Theta, \boldsymbol{\kappa}, \delta} A = \frac{1}{2} \left( A^\top + A \right) \Theta \left( A^\top + A \right) - \frac{\delta + 1}{2} \boldsymbol{\kappa} \left( A^\top + A \right) - \frac{\delta - 1}{2} \left( A^\top + A \right) \boldsymbol{\kappa} + \frac{\delta(\delta - 1)}{2} \boldsymbol{\kappa} \Theta^{-1} \boldsymbol{\kappa}. \tag{6}$$

The optimal control $\alpha^*$ has the following representation:

$$\alpha^*(w, x, t) = w \left[ -\delta \Theta^{-1} \boldsymbol{\kappa} + A + A^\top \right] x. \tag{7}$$

Introducing a new matrix $D$ as

$$D(\tau) = \delta \Theta^{-1} \boldsymbol{\kappa} - \left( A(\tau) + A^\top(\tau) \right),$$

we get the following formula for the optimal strategy $\alpha^*$:

$$\alpha^*(w, x, t) = -w\, D(\tau)\, x. \tag{8}$$

The matrix $D$ can be found directly from another Riccati ODE:

$$D'(\tau) = -D^\top \Theta D + \delta \boldsymbol{\kappa} \Theta^{-1} \boldsymbol{\kappa}, \qquad D(0) = \delta \Theta^{-1} \boldsymbol{\kappa}. \tag{9}$$

If one only needs the optimal control, it is sufficient to solve the simpler equation (9). To find the value function, one needs to solve the more complex system (5).

Optimality of the candidate control $\alpha^*$ can be verified using the same arguments as in [15] (see also [6] and [7]).

Before we analyse the multidimensional case, let us present a short review of the one-dimensional case; for more details see [4]. It is obtained from our problem by setting $n = 1$ in all formulas from Section 3.1.
To be more precise, we consider a mean-reverting asset $X_t$ which follows an Ornstein–Uhlenbeck process with zero mean and unit volatility,

$$dX_t = -\kappa X_t\,dt + dB_t,$$

and the wealth process $W^\alpha_t$ generated by the trading strategy $\alpha$:

$$dW^\alpha_t = \alpha_t\,dX_t.$$

We are looking for the maximizer $\alpha^*$ of the expected utility of the terminal wealth $W^\alpha_T$:

$$\alpha^* = \operatorname*{argmax}_{\alpha} \left[ \mathbb{E}_t \left[ U(W^\alpha_T) \right] \right].$$

5.2 The structure of the optimal strategy

The optimal control $\alpha^*$ can be expressed as

$$\alpha^*(w, x, t) = -w\, D_\kappa(T - t)\, x,$$

where the function $D_\kappa(\tau)$ is a solution to the following Riccati equation:

$$D_\kappa' = -D_\kappa^2 + \delta \kappa^2, \qquad D_\kappa(0) = \delta \kappa. \tag{10}$$

This one-dimensional problem (10) can be solved explicitly (for example, by passing to the inverse function $\tau(D_\kappa) = D_\kappa^{-1}$). The function $D_\kappa(\tau)$ is a shifted and scaled sigmoid function of the inverse time $\tau = T - t$:

$$D_\kappa(\tau) = \kappa \sqrt{\delta}\, \frac{\sqrt{\delta} \cosh \kappa \sqrt{\delta} \tau + \sinh \kappa \sqrt{\delta} \tau}{\sqrt{\delta} \sinh \kappa \sqrt{\delta} \tau + \cosh \kappa \sqrt{\delta} \tau}.$$

It is worth mentioning that for $\gamma < 0$ the function $D_\kappa$ can be represented as

$$D_\kappa(\tau) = \kappa \sqrt{\delta} \tanh\left( \kappa \sqrt{\delta} \tau + \varphi \right), \qquad \tanh \varphi = \sqrt{\delta}.$$

The behavior of the function $D_\kappa(T - t)$ depends on the value of the risk aversion parameter $\gamma$: an agent with negative $\gamma$ (more risk averse than a log-utility investor) becomes less aggressive as time approaches the terminal time, while traders with positive $\gamma$ become more aggressive (see Figure 1). For the log-utility agent ($\gamma = 0$, red line in Figure 1) the optimal strategy is static, i.e. $D_\kappa(\tau) \equiv \mathrm{const}$.

The value function $J(w, x, t)$ can be split into three terms:

$$J(w, x, t) = \underbrace{\frac{w^\gamma}{\gamma}}_{a} \cdot \underbrace{\exp\left\{ -\int_0^{T-t} \frac{D(u) - \delta \kappa}{2 \delta}\,du \right\}}_{b} \cdot \underbrace{\exp\left\{ -\frac{x^2 \left( D(T - t) - \delta \kappa \right)}{2 \delta} \right\}}_{c},$$

which can be interpreted as follows:

• a: present wealth utility,
• b: time value (utility of future expected opportunities),
• c: intrinsic value (utility of the immediate investment opportunity set).
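The closed-form multiplier above can be checked against a direct numerical integration of the Riccati equation (10). The following is a minimal sketch using only the standard library; the helper names `d_kappa` and `d_kappa_rk4` are illustrative, and a hand-written fourth-order Runge–Kutta scheme stands in for whichever ODE solver one prefers.

```python
import math

def d_kappa(tau, kappa, gamma):
    """Closed-form D_kappa(tau) with delta = 1 / (1 - gamma)."""
    delta = 1.0 / (1.0 - gamma)
    s = math.sqrt(delta)
    a = kappa * s * tau
    num = s * math.cosh(a) + math.sinh(a)
    den = s * math.sinh(a) + math.cosh(a)
    return kappa * s * num / den

def d_kappa_rk4(tau, kappa, gamma, steps=2000):
    """RK4 integration of D' = -D^2 + delta * kappa^2, D(0) = delta * kappa."""
    delta = 1.0 / (1.0 - gamma)

    def rhs(d):
        return -d * d + delta * kappa ** 2

    d = delta * kappa
    h = tau / steps
    for _ in range(steps):
        k1 = rhs(d)
        k2 = rhs(d + 0.5 * h * k1)
        k3 = rhs(d + 0.5 * h * k2)
        k4 = rhs(d + h * k3)
        d += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return d

if __name__ == "__main__":
    for gamma in (-2.0, 0.0, 0.5):
        closed = d_kappa(3.0, 1.0, gamma)
        numeric = d_kappa_rk4(3.0, 1.0, gamma)
        print(gamma, closed, numeric)
```

For $\gamma < 0$ the multiplier grows with the remaining time $\tau$, so positions shrink towards the horizon; for $0 < \gamma < 1$ the opposite holds, and for $\gamma = 0$ both functions return the constant $\kappa$.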
[Figure 1: Position size multiplier $D(T - t)$ for different values of risk aversion ($\kappa = 1$, $T = 5$; curves for $\gamma = 0.7, 0.3, 0.1, 0.0, -0.1, -2.0, -16.0$).]

The stochastic process $W^\alpha_t$ generated by the optimal strategy $\alpha^*$ can be represented as (for more details see Appendix B)

$$\log\left( \frac{W^\alpha_t}{W^\alpha_s} \right) = \underbrace{\int_s^t \frac{D_\kappa(T - u) - \delta \kappa^2 X_u^2}{2}\,du}_{a} + \underbrace{\frac{X_s^2 D_\kappa(T - s) - X_t^2 D_\kappa(T - t)}{2}}_{b}.$$

So the log return of wealth between times $s$ and $t$ is the sum of

• a: profit/loss from dynamic trading in the time period $[s, t]$,
• b: profit/loss on the position open at time $s$.

A higher mean reversion speed $\kappa$ makes the trader more aggressive. The authors of [4] also make the following observations based on Monte Carlo simulations:

• The influence of misspecification of the mean reversion coefficient is asymmetric.
Trading with a conservatively estimated $\kappa$ greatly reduces the utility uncertainty. Overestimation of $\kappa$ leads to excessively aggressive positions. It is much safer to underestimate $\kappa$ than to overestimate it.

The main difference between the multidimensional and the one-dimensional case is that changes in some spreads may affect positions in other spreads via changes in risk exposures. Generally, one might expect two possible motivations to take a position in each of the assets: to extract value from its reversion or to hedge positions in other assets.

In the multidimensional case, the time decay function $D$ is a matrix. The main difficulty is that there are no known techniques to explicitly solve generic matrix Riccati equations. However, there are several important special cases in which explicit solutions can be obtained. We start our analysis with these cases; based on these formulas we can demonstrate the main principles of interaction between asset prices and optimal positions.

For the rest of the paper, we will analyse only the case $X_0 \equiv \theta$, i.e. the long-term investment behavior of the value function $J(w, 0, t)$.

Assume that the asset processes are driven by non-correlated Wiener processes, $\Theta = I$. We can expect that the optimal strategy is simply a vector of one-dimensional optimal strategies for each asset. That is, a candidate optimal control is

$$\alpha^* = -w\, D(\tau)\, x, \qquad D(\tau) = \mathrm{diag}\left( D_{\kappa_1}(\tau), D_{\kappa_2}(\tau), \dots, D_{\kappa_n}(\tau) \right), \qquad \tau = T - t.$$

For the definition of $D_\kappa$ see Section 5. One can directly confirm that this control is indeed optimal by checking that it solves the system (9).

In this case, there are no interactions between the assets. The position in the $i$-th asset depends only on time $t$, current wealth and the parameters of the $i$-th asset.

6.1.2 Common reversion rate

Another case that allows an explicit solution is when the correlations are non-trivial but the reversion rate $\kappa$ is the same for all assets, $\boldsymbol{\kappa} = \kappa I$.
Recall the SDE for the price process:

$$dX_t = -\kappa X_t\,dt + dB_t, \qquad dB_t\,dB_t^\top = \Theta\,dt.$$

We show that for this case an explicit solution can also be constructed.

Indeed, with a single common reversion rate, any non-zero linear combination $Y_t = L^{-1} X_t$ of Ornstein–Uhlenbeck processes is also an Ornstein–Uhlenbeck process:

$$dY_t = -\kappa Y_t\,dt + d\tilde{B}_t, \qquad d\tilde{B}_t\,d\tilde{B}_t^\top = L^{-1} \Theta (L^{-1})^\top dt.$$

Here $\tilde{B}_t$ is an $n$-dimensional Wiener process with correlation matrix $L^{-1} \Theta (L^{-1})^\top$. Assuming invertibility of $L$, one can find an optimal control $\alpha_Y$ for this new process $Y_t$ and then transform it to an optimal control for $X_t$. The transformation is based on the following equality:

$$dW^\alpha_t = \alpha_Y^\top dY_t = \alpha_X^\top dX_t, \qquad \alpha_X(W^\alpha_t, X_t, t) = (L^{-1})^\top \alpha_Y(W^\alpha_t, L^{-1} X_t, t).$$

The transformation matrix $L$ is constructed as a Cholesky decomposition of the correlation matrix $\Theta$:

$$L L^\top = \Theta, \qquad (L^{-1})^\top L^{-1} = \Theta^{-1},$$

so that the components of $Y_t$ are driven by uncorrelated Wiener processes. Applying this transformation, we obtain the following equation for the optimal control:

$$\alpha^* = -w\, D_\kappa(T - t)\, \Theta^{-1} x.$$

Thus, the optimal trading rule can be interpreted as the construction of linearly independent factor portfolios and then trading them in the manner of the previous case. This is similar to the portfolio signal construction approach of [12].

In this case, there are also no interactions between the assets. The value function $J(w, 0, t)$ does not depend on asset correlations:

$$J(w, 0, t) = \frac{w^\gamma}{\gamma} \exp\left\{ n \int_0^{T-t} \frac{\delta \kappa - D_\kappa(u)}{2 \delta}\,du \right\}.$$

6.1.3 Hedging a mean reverting asset via correlated Brownian Motions

Let us consider a case when the tradeable asset set consists of a single mean-reverting asset and one or several correlated Brownian motions.
We can also consider this case as the limiting case for tradeable asset sets where one asset's mean reversion rate $\kappa$ is very large relative to all other assets' reversion rates. Consider the following matrix of reversion rates:

$$\boldsymbol{\kappa} = \mathrm{diag}(\kappa, 0, 0, \dots, 0).$$

One can check by a direct calculation that the solution to the Riccati equation (9) has the following form:

$$D(\tau) = \begin{bmatrix} D_{11} & 0 & \dots & 0 \\ D_{21} & 0 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ D_{n1} & 0 & \dots & 0 \end{bmatrix}, \qquad D_{j1} = \delta \kappa \left( \Theta^{-1} \right)_{j1}, \quad j \ge 2.$$

The term $D_{11}(\tau)$ can be derived from the following Riccati ODE:

$$D_{11}'(\tau) = -D_{11}^2 + 2 \delta (\zeta - 1) \kappa D_{11} + \kappa^2 \delta \zeta \left( \delta (1 - \zeta) + 1 \right), \qquad D_{11}(0) = \delta \zeta \kappa.$$

This ODE can be solved explicitly to yield the following formula for $D_{11}$:

$$D_{11}(\tau) = \begin{cases} \kappa \lambda\, \dfrac{\delta \cosh \lambda \kappa \tau + \lambda \sinh \lambda \kappa \tau}{\delta \sinh \lambda \kappa \tau + \lambda \cosh \lambda \kappa \tau} + \delta \kappa (\zeta - 1), & \gamma < 1/\zeta, \\[2ex] \dfrac{\kappa \delta}{1 + \delta \kappa \tau} + \delta \kappa (\zeta - 1), & \gamma = 1/\zeta, \\[2ex] \kappa \lambda\, \dfrac{\delta \cos \lambda \kappa \tau - \lambda \sin \lambda \kappa \tau}{\delta \sin \lambda \kappa \tau + \lambda \cos \lambda \kappa \tau} + \delta \kappa (\zeta - 1), & 1/\zeta < \gamma < 1. \end{cases}$$

Here

$$\zeta = \left( \Theta^{-1} \right)_{11}, \qquad \lambda = \sqrt{\left| \delta (\delta - 1) \zeta - \delta^2 \right|}.$$

Thus, in this case we trade the mean-reverting asset and hedge it via correlated Brownian motions. Both the position in the mean-reverting asset and the hedging positions are larger for larger correlations. Availability of correlated hedging assets allows us to take larger positions for given risk aversion and wealth.

6.2 The structure of the optimal strategy
To illustrate the structure of the optimal strategy, we expand the product $D(\tau) x$ in formula (8) for the optimal control $\alpha^*$:

$$\begin{bmatrix} \alpha^*_1 \\ \alpha^*_2 \\ \vdots \\ \alpha^*_n \end{bmatrix} = -w \begin{bmatrix} D_{11}(\tau) x_1 + D_{12}(\tau) x_2 + \dots + D_{1n}(\tau) x_n \\ D_{21}(\tau) x_1 + D_{22}(\tau) x_2 + \dots + D_{2n}(\tau) x_n \\ \vdots \\ D_{n1}(\tau) x_1 + D_{n2}(\tau) x_2 + \dots + D_{nn}(\tau) x_n \end{bmatrix}.$$

The summand $D_{ii} x_i$ is a position size multiplier for mean reversion trading of the $i$-th asset, while $D_{ij} x_j$ is the quantity of the $i$-th asset required to hedge the position in the $j$-th asset. In the case of non-correlated assets, $D_{ij} = 0$ for $i \ne j$. Since $D - \delta \Theta^{-1} \boldsymbol{\kappa} = -(A + A^\top)$ is symmetric, the quantities $D_{ij}$ and $D_{ji}$ satisfy the following relation:

$$D_{ij} - \delta\, \Theta^{-1}_{ij} \kappa_j = D_{ji} - \delta\, \Theta^{-1}_{ij} \kappa_i.$$

Note that the difference between $D_{ij}$ and $D_{ji}$ does not depend on time $t$.

Similarly to the one-dimensional case, the wealth process $W^\alpha_t$ can be expressed as

$$\log\left( \frac{W^\alpha_t}{W^\alpha_s} \right) = \underbrace{\int_s^t \frac{\mathrm{Tr}\, \Theta D(T - u) - \delta X_u^\top \boldsymbol{\kappa} \Theta^{-1} \boldsymbol{\kappa} X_u}{2}\,du}_{a} + \underbrace{\frac{X_s^\top D(T - s) X_s - X_t^\top D(T - t) X_t}{2}}_{b} + \underbrace{\frac{1}{2} \int_s^t X_u^\top \left[ D - D^\top \right] dX_u}_{c}. \tag{11}$$

The one term of equation (11) that is missing in the one-dimensional case is $c$. This summand corresponds to hedging efficiency. It is easy to see that for the cases $\Theta = I$ or $\boldsymbol{\kappa} = \kappa I$ this term vanishes. As we mentioned before, the case $\boldsymbol{\kappa} = \kappa I$ can be reduced to the case $\Theta = I$.

To illustrate interactions between reversion speed and correlation, let us consider a two-dimensional example in more detail.
We will use the following parameters for this illustration: number of assets $n = 2$, noise magnitude $\sigma = I$, long-term mean and initial point $\theta = X_0 = 0$, risk aversion $\gamma = -4$ and time horizon $T = 3$. We consider an optimal strategy for a portfolio of two correlated Ornstein–Uhlenbeck processes with $\kappa_1 = 1$ and different values of $\kappa_2$ and correlation $\rho$:

$$n = 2, \quad \gamma = -4, \quad \sigma = I, \quad \boldsymbol{\kappa} = \mathrm{diag}(1, \kappa_2), \quad \theta = X_0 = 0, \quad \Theta = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}.$$

Figure 2 shows the value function $J$ as a function of $\log(\kappa_2/\kappa_1)$ ($\kappa_1 = 1$) for several different values of $\rho$. We are varying here the lower of the two asset mean-reversion rates. It turns out that for sufficiently high correlation $\rho$, the value function has a proper minimum as a function of $\kappa_2$, and it becomes decreasing in $\kappa_2$ as the correlation gets closer to 1. This means that in these cases, one would prefer a lower value of the second asset's mean-reversion rate to a slightly higher value (but not to a much higher value $\kappa_2 \gg \kappa_1$). Therefore, with more than one asset, a higher reversion rate is not always good for extracting value from trading, quite unlike the one-dimensional case.

We have seen in the previous section that the value function can be non-monotonic in the mean-reversion rates. Let us show that it is always increasing with the correlation, all other parameters being equal.

Suppose now that we start our trading process with no immediate trading opportunities (i.e. $x = 0$). We consider $J(w, 0, t)$ as a function of the correlation coefficients $\rho_{mn}$. In the standard Markowitz portfolio optimization problem, one can construct more profitable portfolios when correlations are lower. In our setting, we can prove that the value function has a local minimum at zero correlations, $\Theta = I$. Correlations between driving processes enable cross-hedging between positions in different assets, and this increases the value function.
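The dependence of the value function on $\kappa_2$ and $\rho$ discussed above can be explored numerically by integrating the Riccati equation (9) and using $\mathrm{Tr}(A\Theta) = \tfrac{1}{2}\left(\delta\,\mathrm{Tr}\,\boldsymbol{\kappa} - \mathrm{Tr}(\Theta D)\right)$, which follows from the definition of $D$. The sketch below is one possible implementation under the assumptions $\sigma = I$, $x = 0$ and $\gamma \ne 0$; the function names are illustrative, and the hand-written RK4 scheme with a trapezoidal rule for the exponent is a simple stand-in rather than the solver used in [1].

```python
import numpy as np

def solve_position_matrix(kappa, theta_corr, gamma, tau_max, steps=3000):
    """RK4 for D' = -D^T Theta D + delta K Theta^{-1} K, D(0) = delta Theta^{-1} K.

    Returns D(tau_max) and the exponent of J(w, 0, t) for T - t = tau_max.
    """
    K = np.diag(np.asarray(kappa, dtype=float))
    Th = np.asarray(theta_corr, dtype=float)
    delta = 1.0 / (1.0 - gamma)
    Th_inv = np.linalg.inv(Th)
    source = delta * (K @ Th_inv @ K)

    def rhs(D):
        return -D.T @ Th @ D + source

    def integrand(D):
        # Tr(A Theta) / delta expressed through D
        return (delta * np.trace(K) - np.trace(Th @ D)) / (2.0 * delta)

    D = delta * (Th_inv @ K)
    h = tau_max / steps
    exponent = 0.0
    for _ in range(steps):
        k1 = rhs(D)
        k2 = rhs(D + 0.5 * h * k1)
        k3 = rhs(D + 0.5 * h * k2)
        k4 = rhs(D + h * k3)
        D_next = D + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        exponent += 0.5 * h * (integrand(D) + integrand(D_next))
        D = D_next
    return D, exponent

def value_at_zero(w, kappa, theta_corr, gamma, tau_max):
    """J(w, 0, t) for gamma != 0 and remaining time tau_max = T - t."""
    _, exponent = solve_position_matrix(kappa, theta_corr, gamma, tau_max)
    return w ** gamma / gamma * np.exp(exponent)

def optimal_position(w, x, D):
    """Optimal control alpha* = -w D x from formula (8)."""
    return -w * D @ np.asarray(x, dtype=float)

if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        theta = np.array([[1.0, rho], [rho, 1.0]])
        print(rho, value_at_zero(1.0, [1.0, 0.5], theta, -4.0, 3.0))
```

Looping over a grid of $\kappa_2$ and $\rho$ values with `value_at_zero` gives curves of the kind shown in Figure 2; the special cases $\Theta = I$ and $\boldsymbol{\kappa} = \kappa I$ are convenient sanity checks, since there $D$ is known in closed form.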
We have already seen a similar beneficial effect of higher correlations in Section 6.1.3 for the special case of a single mean-reverting asset hedged with Brownian motions, and the following theorem demonstrates that this effect holds in the general case as well.

[Figure 2: 2D example. Value function $J$ against $\log(\kappa_2/\kappa_1)$ for a range of values of $\kappa_2$ and correlation $\rho = 0, 0.3, 0.5, 0.7, 0.8, 0.9, 0.95, 0.99$ ($\kappa_1 = 1$, $\sigma_1 = 1$, $\sigma_2 = 1$, $\gamma = -4$, $T = 3$, $x = 0$, $\theta = 0$).]

Theorem 6.1.
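In the absence of immediate trading opportunities ($x = 0$), the value function $J(w, 0, t)$ as a function of the pairwise correlation coefficients $\rho_{mn}$ has a local minimum at $\Theta = I$.

Proof. Recall the representation of the value function:

$$J(w, 0, t) = \frac{w^\gamma}{\gamma} \exp\left\{ \frac{1}{\delta} \int_0^{T-t} \mathrm{Tr}(F(u))\,du \right\},$$

where the matrix $F$ is equal to

$$F = \frac{1}{2} (A + A^\top) \Theta. \tag{12}$$

Define a new matrix $\Gamma$:

$$\Gamma = \Theta^{-1} \boldsymbol{\kappa} \Theta. \tag{13}$$

Note that $\Gamma$ is the result of a similarity transformation of the matrix $\boldsymbol{\kappa}$ and $\lim_{\Theta \to I} \Gamma = \boldsymbol{\kappa}$. For the matrix $F$ we have the following ODE:

$$F' = 2 F^2 - \delta (\boldsymbol{\kappa} F + F \Gamma) + \frac{\delta(\delta - 1)}{2} \boldsymbol{\kappa} \Gamma, \qquad F(0) = 0. \tag{14}$$

Let $\rho_{mn}$ be an arbitrary correlation coefficient at the position $mn$ (i.e. $mn = (ij)$, $\Theta_{ij} = \Theta_{ji} = \rho_{mn}$) and let us consider the following partial derivatives:

$$\frac{\partial J(w, 0, t)}{\partial \rho_{mn}} = \frac{J(w, 0, t)}{\delta} \int_0^{T-t} \mathrm{Tr}\left( \frac{\partial F(u)}{\partial \rho_{mn}} \right) du,$$

$$\frac{\partial^2 J(w, 0, t)}{\partial \rho_{mn} \partial \rho_{pq}} = \frac{J(w, 0, t)}{\delta} \int_0^{T-t} \mathrm{Tr}\left( \frac{\partial^2 F(u)}{\partial \rho_{mn} \partial \rho_{pq}} \right) du,$$

$$\frac{\partial^2 J(w, 0, t)}{\partial \rho_{mn}^2} = \frac{J(w, 0, t)}{\delta} \int_0^{T-t} \mathrm{Tr}\left( \frac{\partial^2 F(u)}{\partial \rho_{mn}^2} \right) du.$$

The second-derivative expressions omit the product of first derivatives, which vanishes in the limit $\Theta \to I$ by (15) below. We will prove the following properties for any $mn$ and $pq$:

$$\lim_{\Theta \to I} \frac{\partial J(w, 0, t)}{\partial \rho_{mn}} = 0, \tag{15}$$

$$\lim_{\Theta \to I} \frac{\partial^2 J(w, 0, t)}{\partial \rho_{mn} \partial \rho_{pq}} = 0 \quad (mn \ne pq), \tag{16}$$

$$\mathrm{sign} \lim_{\Theta \to I} \frac{\partial^2 J(w, 0, t)}{\partial \rho_{mn}^2} = \mathrm{sign}\, \gamma \quad (\kappa_i \ne \kappa_j), \tag{17}$$

$$\lim_{\Theta \to I} \frac{\partial^2 J(w, 0, t)}{\partial \rho_{mn}^2} = 0 \quad (\kappa_i = \kappa_j). \tag{18}$$

From equation (15), the point $\Theta = I$ is a stationary point. Equation (16) implies that the Hessian matrix at $\Theta = I$ is a diagonal matrix. Using Sylvester's criterion we prove that the Hessian matrix is positive definite at the point $\Theta = I$; for more details see Appendix C.

In practice, one does not know the true values of the model parameters, so it is important to understand the sensitivity of the value function to errors in parameter estimation. In this section, we present an ODE based framework for the analysis of sensitivity to parameter mis-specification.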
We provide semi-explicit formulas for the value function corresponding to misspecified parameters.

Let $\hat{\boldsymbol{\kappa}}$, $\hat{\sigma}$, $\hat{\Theta}$ be estimates of the reversion rates, volatilities and correlations. We consider the control $\hat{\alpha}$ as a function of these estimates:

$$\hat{\alpha} = w\, \hat{\sigma}^{-1} \left[ -\delta \hat{\Theta}^{-1} \hat{\boldsymbol{\kappa}} + \left( \hat{A}^\top + \hat{A} \right) \right] \hat{\sigma}^{-1} x.$$

Here the matrix $\hat{A}$ is a solution to the following ODE:

$$\hat{A}'(\tau) = \mathcal{R}_{\hat{\Theta}, \hat{\boldsymbol{\kappa}}, \delta} \hat{A}, \qquad \hat{A}(0) = 0, \tag{19}$$

where the differential operator $\mathcal{R}$ is defined in (6). The wealth process $\hat{W}_t$ generated by the strategy $\hat{\alpha}$ is a solution to the following SDE:

$$d\hat{W}_t = \hat{\alpha}_t^\top dX_t. \tag{20}$$

Theorem 7.1.
Let $P_\epsilon(w, x, t)$ be the following expectation of a function of the terminal wealth $\hat{W}_T$ defined by (20):

$$P_\epsilon(w, x, t) = \mathbb{E}\left[ \frac{\hat{W}_T^\epsilon}{\epsilon} \,\middle|\, \hat{W}_t = w,\ X_t = x \right].$$

The expectation $P_\epsilon(w, x, t)$ can be found explicitly in the following form:

$$P_\epsilon(w, x, t) = \frac{w^\epsilon}{\epsilon} \cdot \exp\left\{ \int_0^{T-t} \mathrm{Tr}(\Theta Q(u))\,du \right\} \cdot \exp\left\{ x^\top \sigma^{-1} Q(T - t) \sigma^{-1} x \right\}, \tag{21}$$

where the matrix $Q$ is a solution to the Riccati equation

$$Q' = \mathcal{B} Q, \qquad Q(0) = 0. \tag{22}$$

The nonlinear operator $\mathcal{B}$ is given by

$$\mathcal{B} Q = \frac{1}{2} \left( Q + Q^\top \right) \Theta \left( Q + Q^\top \right) + \left( \epsilon \beta^\top \Theta - \boldsymbol{\kappa} \right) \left( Q + Q^\top \right) + \frac{\epsilon(\epsilon - 1)}{2} \beta^\top \Theta \beta - \epsilon \beta^\top \boldsymbol{\kappa},$$

and the matrix $\beta$ is defined as

$$\beta = \sigma \hat{\sigma}^{-1} \left[ -\delta \hat{\Theta}^{-1} \hat{\boldsymbol{\kappa}} + \left( \hat{A} + \hat{A}^\top \right) \right] \hat{\sigma}^{-1} \sigma,$$

where the matrix $\hat{A}$ is a solution to equation (19).
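In the one-dimensional case with $\sigma = \hat{\sigma} = 1$ and $\Theta = 1$, the objects in Theorem 7.1 become scalars: $\beta(\tau) = -D_{\hat{\kappa}}(\tau)$ and $Q' = 2Q^2 + 2(\epsilon\beta - \kappa)Q + \tfrac{\epsilon(\epsilon - 1)}{2}\beta^2 - \epsilon\beta\kappa$. The following sketch integrates this scalar equation with a hand-written RK4 scheme and evaluates $P_\epsilon(w, 0, t)$; it is an illustration under these simplifying assumptions, with hypothetical function names, and not the multi-asset implementation from [1].

```python
import math

def d_closed(tau, kappa, delta):
    """Closed-form one-dimensional multiplier D_kappa(tau) from Section 5."""
    s = math.sqrt(delta)
    a = kappa * s * tau
    num = s * math.cosh(a) + math.sinh(a)
    den = s * math.sinh(a) + math.cosh(a)
    return kappa * s * num / den

def misspecified_exponent(kappa, kappa_hat, gamma, eps, tau_max, steps=4000):
    """RK4 for the scalar version of Q' = B Q, Q(0) = 0.

    Returns (Q(tau_max), integral of Q over [0, tau_max]).
    """
    delta = 1.0 / (1.0 - gamma)

    def rhs(tau, q):
        beta = -d_closed(tau, kappa_hat, delta)  # strategy built from kappa_hat
        return (2.0 * q * q + 2.0 * (eps * beta - kappa) * q
                + 0.5 * eps * (eps - 1.0) * beta * beta - eps * beta * kappa)

    h = tau_max / steps
    q, integral = 0.0, 0.0
    for i in range(steps):
        tau = i * h
        k1 = rhs(tau, q)
        k2 = rhs(tau + 0.5 * h, q + 0.5 * h * k1)
        k3 = rhs(tau + 0.5 * h, q + 0.5 * h * k2)
        k4 = rhs(tau + h, q + h * k3)
        q_next = q + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        integral += 0.5 * h * (q + q_next)  # trapezoidal rule for the exponent
        q = q_next
    return q, integral

def expected_power(w, kappa, kappa_hat, gamma, eps, tau_max):
    """P_eps(w, 0, t) = w^eps / eps * exp(int_0^{T-t} Q(u) du) at x = 0."""
    _, integral = misspecified_exponent(kappa, kappa_hat, gamma, eps, tau_max)
    return w ** eps / eps * math.exp(integral)

if __name__ == "__main__":
    for kappa_hat in (0.5, 1.0, 2.0):
        # expected utility (eps = gamma) when the true rate is kappa = 1
        print(kappa_hat, expected_power(1.0, 1.0, kappa_hat, -2.0, -2.0, 1.0))
```

When $\hat{\kappa} = \kappa$ and $\epsilon = \gamma$, the solution reduces to $Q = (\delta\kappa - D_\kappa)/(2\delta)$, so $P_\gamma$ coincides with the value function of Section 5; this gives a convenient consistency check for the integrator.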
Setting $\epsilon = \gamma$, we obtain the expected utility corresponding to the misspecified parameters. The values $\epsilon = 1$ and $\epsilon = 2$ correspond to the first two moments of $\hat{W}_T$ (note that $\mathbb{E}[\hat{W}_T] = P_1$ and $\mathbb{E}[\hat{W}_T^2] = 2 P_2$), so we can calculate the Sharpe ratio:

$$\mathrm{Sh}[\hat{\alpha}] = \frac{P_1(w, x, t)}{\sqrt{2 P_2(w, x, t) - P_1^2(w, x, t)}}.$$

It is worth mentioning that the effects of a misspecified long-term mean level $\theta$ can also be analysed in the same way. For this case, we have to add an extra term $\exp\left\{ x^\top V \right\}$ to equation (21). Here $V$ is an $n \times 1$ vector depending on $T - t$.

As an alternative, one can analyse the effect of parameter misspecification by using Monte Carlo methods. However, from our point of view, the proposed ODE approach is computationally much more efficient than Monte Carlo simulations.

We illustrate the method presented above on the analysis of misspecified reversion rates $\boldsymbol{\kappa}$. For simplicity, we consider portfolios with only two assets. The results are presented in Figure 3. We measure the effect of misspecification by the difference between the value functions corresponding to the true and the mis-specified parameters (color and value of the z-axis respectively).

Similarly to the one-dimensional case, the influence of misspecification of the mean reversion coefficients is asymmetric. Depending on the value of the correlation, correct estimation of the ratio between reversion rates is more important than estimation of the exact value of each mean-reversion rate. This follows from the nature of the optimal strategy: the faster mean-reverting asset is hedged with the slower one, and the hedging accuracy depends on the ratio between reversion speeds.

Acknowledgments
Dmitry Muravey acknowledges support by the Russian Science Foundation under the Grant number 20-68-47030.

[Figure 3: Misspecified reversion rates. Heatmap plot and 3D plot of $|J^* - J|$ against $\log(\kappa^*_1/\kappa_1)$ and $\log(\kappa^*_2/\kappa_2)$; panels for $\gamma = -2.0$ and $\gamma = 0.5$, each with $\rho = 0.0$, $0.5$, $0.9$.]

References

[1] https://github.com/DmitryMuravey/TradingMultipleMeanReversion
[2] Altay S., Colaneri K., Eksi Z. (2018). Pairs trading under drift uncertainty and risk penalization. International Journal of Theoretical and Applied Finance, Vol. 21, No. 07, 1850046.
[3] Avellaneda M., Lee J.-H. (2010). Statistical Arbitrage in the U.S. Equities Market. Quantitative Finance, Vol. 10, No. 7.
[4] Boguslavskaya E., Boguslavsky M. (2004). Arbitrage under power. RISK magazine, June, pp. 69–73.
[5] Brendle S. (2006). Portfolio selection under incomplete information. Stochastic Processes and their Applications, 116, 701–723.
[6] Davis M.A., Lleo S. (2008). Risk sensitive benchmarked asset management. Quantitative Finance [...]
[7] [...] Advanced Studies on Statistical Science and Applied Probability, vol. 19, World Scientific Publishing.
[8] Fleming W., Soner M. (2006). Controlled Markov processes and viscosity solutions. Stochastic Modelling and Applied Probability. Springer-Verlag, 2nd edition.
[9] Fouque J.-P., Hu R. (2019). Optimal Portfolio under Fractional Stochastic Environment. Mathematical Finance, Volume 29, Issue 3, July, pp. 697–734, https://doi.org/10.1111/mafi.12195
[10] Fouque J.-P., Hu R. (2019). Portfolio Optimization under Fast Mean-reverting and Rough Fractional Stochastic Environment. Applied Mathematical Finance [...]
[...]
[14] [...] International Journal of Theoretical and Applied Finance, Vol. 19, No. 08, 1650054.
[15] Li T.N., Papanicolaou A. (2019). Dynamic Optimal Portfolios for Multiple Co-Integrated Assets. Preprint.
[16] Liu J., Longstaff F. (2001). Losing money on arbitrages: Optimal dynamic portfolio choice in markets with arbitrage opportunities.
[17] Merton R.C. (1990). Continuous-Time Finance. Blackwell Publishers.
[18] Zariphopoulou T. (2001). A solution approach to valuation with unhedgeable risks. Finance and Stochastics, 5, 61–82.
[19] Zervos M., Johnson T., Alazemi F. (2013). Buy-low and sell-high investment strategies. Mathematical Finance, 23(3), 560–578.
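A Reduction of the HJB equation to the linear PDE

A.1 Distortion transformation

The first-order optimality condition on the control $\alpha^{*}$ yields the following linear system for $\alpha^{*}$:

\[
J_{ww}\, \Theta\, \alpha^{*} = \kappa \mathbf{x}\, J_{w} - \Theta \nabla J_{w}. \tag{23}
\]

The solution of this system reads

\[
\alpha^{*} = \frac{1}{J_{ww}} \left( \Theta^{-1} \kappa \mathbf{x} - \nabla \right) J_{w}. \tag{24}
\]

Using the first-order optimality condition again, we get

\[
(\alpha^{*})^{\top} \kappa \mathbf{x}\, J_{w} - (\alpha^{*})^{\top} \Theta \nabla J_{w} = (\alpha^{*})^{\top} \Theta \alpha^{*}\, J_{ww},
\]

and the HJB equation takes the form

\[
J_{t} - \frac{1}{2} (\alpha^{*})^{\top} \Theta \alpha^{*}\, J_{ww} - \mathbf{x}^{\top} \kappa \nabla J + \frac{1}{2} \nabla^{\top} \Theta \nabla J = 0, \qquad J(w, \mathbf{x}, T) = \frac{w^{\gamma}}{\gamma}. \tag{25}
\]

Plugging in the exact value of the optimal control $\alpha^{*}$ yields the non-linear PDE

\[
J_{t} - \frac{1}{2} \frac{J_{w}^{2}}{J_{ww}} (\kappa \mathbf{x})^{\top} \Theta^{-1} (\kappa \mathbf{x}) + \frac{1}{2} \frac{J_{w}}{J_{ww}} \left[ (\kappa \mathbf{x})^{\top} \nabla J_{w} + \nabla^{\top} J_{w} (\kappa \mathbf{x}) \right] - \frac{1}{2} \frac{1}{J_{ww}} \nabla^{\top} J_{w}\, \Theta\, \nabla J_{w} - \mathbf{x}^{\top} \kappa \nabla J + \frac{1}{2} \nabla^{\top} \Theta \nabla J = 0.
\]

We proceed with an application of the so-called distortion transformation:

\[
J = \frac{w^{\gamma}}{\gamma} f^{1/\delta}(\mathbf{x}, t), \qquad \delta = \frac{1}{1 - \gamma}. \tag{26}
\]

The partial derivatives of the value function $J$ read

\[
J_{t} = \frac{1}{\delta} \frac{J}{f} \frac{\partial f}{\partial t}, \qquad J_{w} = \frac{\gamma}{w} J, \qquad J_{ww} = \frac{\gamma(\gamma - 1)}{w^{2}} J,
\]

\[
\nabla J = \frac{1}{\delta} \frac{J}{f} \nabla f, \qquad \nabla J_{w} = \frac{\gamma}{w \delta} \frac{J}{f} \nabla f,
\]

\[
\frac{1}{2} \nabla^{\top} \Theta \nabla J = \frac{1}{2} \frac{1}{\delta} \frac{J}{f} \nabla^{\top} \Theta \nabla f + \frac{1}{2} \frac{1}{\delta} \left( \frac{1}{\delta} - 1 \right) \frac{J}{f^{2}} \nabla^{\top} f\, \Theta\, \nabla f.
\]

The remaining terms of the non-linear PDE transform as follows:

\[
- \frac{1}{2} \frac{1}{J_{ww}} \nabla^{\top} J_{w}\, \Theta\, \nabla J_{w} = - \frac{1}{2} \frac{\gamma^{2}}{w^{2} \delta^{2}} \frac{J^{2}}{f^{2}} \frac{w^{2}}{\gamma(\gamma - 1) J} \nabla^{\top} f\, \Theta\, \nabla f = \frac{1}{2} \frac{\gamma}{\delta} \frac{J}{f^{2}} \nabla^{\top} f\, \Theta\, \nabla f = - \frac{1}{2} \frac{1}{\delta} \left( \frac{1}{\delta} - 1 \right) \frac{J}{f^{2}} \nabla^{\top} f\, \Theta\, \nabla f,
\]

\[
- \frac{1}{2} \frac{J_{w}^{2}}{J_{ww}} = - \frac{1}{2} \frac{\gamma^{2} J^{2}}{w^{2}} \frac{w^{2}}{\gamma(\gamma - 1) J} = \frac{1}{2} \frac{\gamma}{1 - \gamma} J = \frac{1}{2} \frac{1}{\delta}\, \delta(\delta - 1) J,
\]

\[
\frac{1}{2} \frac{J_{w}}{J_{ww}} \left[ (\kappa \mathbf{x})^{\top} \nabla J_{w} + \nabla^{\top} J_{w} (\kappa \mathbf{x}) \right] = \frac{1}{2} \frac{\gamma J}{w} \frac{w^{2}}{\gamma(\gamma - 1) J} \left[ (\kappa \mathbf{x})^{\top} \left( \frac{\gamma}{w \delta} \frac{J}{f} \nabla f \right) + \left( \frac{\gamma}{w \delta} \frac{J}{f} \nabla f \right)^{\top} (\kappa \mathbf{x}) \right]
\]

\[
= \frac{1}{2} \frac{1}{\delta} \frac{\gamma}{\gamma - 1} \frac{J}{f} \left[ \mathbf{x}^{\top} \kappa \nabla f + \nabla^{\top} f\, \kappa \mathbf{x} \right] = \frac{1 - \delta}{2} \frac{1}{\delta} \frac{J}{f} \left[ \mathbf{x}^{\top} \kappa \nabla f + \nabla^{\top} f\, \kappa \mathbf{x} \right] = - \frac{\delta - 1}{2} \frac{1}{\delta} \frac{J}{f} \left[ \mathbf{x}^{\top} \kappa \nabla f + \nabla^{\top} f\, \kappa \mathbf{x} \right].
\]

This yields the following linear equation for the function $f$:

\[
\frac{1}{2} \nabla^{\top} \Theta \nabla f - \frac{\delta + 1}{2} \mathbf{x}^{\top} \kappa \nabla f - \frac{\delta - 1}{2} \nabla^{\top} f (\kappa \mathbf{x}) + \frac{1}{2} \delta(\delta - 1) (\kappa \mathbf{x})^{\top} \Theta^{-1} (\kappa \mathbf{x}) f + \frac{\partial f}{\partial t} = 0,
\]

or

\[
\frac{1}{2} \nabla^{\top} \Theta \nabla f - \frac{\delta + 1}{2} \mathbf{x}^{\top} \kappa \nabla f - \frac{\delta - 1}{2} \nabla^{\top} f\, \kappa \mathbf{x} + \frac{\delta(\delta - 1)}{2} \mathbf{x}^{\top} \kappa \Theta^{-1} \kappa \mathbf{x}\, f + \frac{\partial f}{\partial t} = 0.
\]

The optimal control $\alpha^{*}$ reads:

\[
\alpha^{*}(w, \mathbf{x}, t) = w \left[ - \delta (\sigma \Theta \sigma)^{-1} \kappa \mathbf{x} + \frac{\nabla f}{f} \right]. \tag{27}
\]

B Wealth SDE solution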
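The wealth process corresponding to the optimal control takes the following form:

\[
dW_{t} = - W_{t}\, \mathbf{X}_{t}^{\top} D^{\top} d\mathbf{X}_{t}.
\]

We represent the process $W_{t}$ in the stochastic exponent form

\[
W_{t} = W_{0}\, e^{\lambda^{\top} \mathbf{Y}_{t}}, \qquad d\mathbf{Y}_{t} = u\, dt + \eta\, d\mathbf{X}_{t},
\]

and apply Itô's lemma:

\[
dW_{t} = W_{t} \left[ \lambda^{\top} d\mathbf{Y}_{t} + \frac{1}{2} \lambda^{\top} d\mathbf{Y}_{t}\, d\mathbf{Y}_{t}^{\top} \lambda \right].
\]

Let us note that

\[
\lambda^{\top} u = - \frac{1}{2} \lambda^{\top} \eta\, \Theta\, \eta^{\top} \lambda, \qquad \lambda^{\top} \eta = - \mathbf{X}_{t}^{\top} D^{\top}, \qquad \eta^{\top} \lambda = - D \mathbf{X}_{t},
\]

\[
\lambda^{\top} u = - \frac{1}{2} \mathbf{X}_{t}^{\top} D^{\top} \Theta D \mathbf{X}_{t},
\]

\[
\lambda^{\top} d\mathbf{Y}_{t} = \lambda^{\top} u\, dt + \lambda^{\top} \eta\, d\mathbf{X}_{t} = - \frac{1}{2} \mathbf{X}_{t}^{\top} D^{\top} \Theta D \mathbf{X}_{t}\, dt - \mathbf{X}_{t}^{\top} D^{\top} d\mathbf{X}_{t}.
\]

Therefore

\[
\int_{0}^{t} \lambda^{\top} d\mathbf{Y}_{s} = - \frac{1}{2} \int_{0}^{t} \mathbf{X}_{s}^{\top} D(T - s)^{\top} \Theta D(T - s) \mathbf{X}_{s}\, ds - \int_{0}^{t} \mathbf{X}_{s}^{\top} D(T - s)^{\top} d\mathbf{X}_{s}.
\]

Using that the matrix $D$ solves the Riccati ODE

\[
- \frac{dD}{dt} = D^{\top} \Theta D - \delta \kappa \Theta^{-1} \kappa,
\]

we get

\[
W_{t} = W_{0} \exp \left\{ - \frac{\delta}{2} \int_{0}^{t} \mathbf{X}_{s}^{\top} \kappa \Theta^{-1} \kappa \mathbf{X}_{s}\, ds - \frac{1}{2} \left[ \mathbf{X}_{t}^{\top} D(T - t) \mathbf{X}_{t} - \mathbf{X}_{0}^{\top} D(T) \mathbf{X}_{0} \right] \right\} \cdot \exp \left\{ \frac{1}{2} \int_{0}^{t} \mathrm{Tr}\, \Theta D(T - s)\, ds + \frac{1}{2} \int_{0}^{t} \mathbf{X}_{s}^{\top} \left[ D - D^{\top} \right] d\mathbf{X}_{s} \right\}.
\]

C Proof of Theorem 6.1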
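The proof of Theorem 6.1 is equivalent to the proof of the following four facts about the matrix $F$:

\[
\lim_{\Theta \to I} \left( \frac{\partial F}{\partial \rho_{mn}} \right)_{ij} = 0, \qquad (ij) \notin mn, \tag{28}
\]

\[
\lim_{\Theta \to I} \mathrm{Tr}\, \frac{\partial F}{\partial \rho_{mn}} = 0, \tag{29}
\]

\[
\lim_{\Theta \to I} \mathrm{Tr}\, \frac{\partial^{2} F}{\partial \rho_{mn} \partial \rho_{pq}} \equiv 0, \tag{30}
\]

\[
\lim_{\Theta \to I} \mathrm{Tr}\, \frac{\partial^{2} F}{\partial \rho_{mn}^{2}} > 0, \quad \gamma > 0,\ \kappa_{i} \neq \kappa_{j}; \qquad \lim_{\Theta \to I} \mathrm{Tr}\, \frac{\partial^{2} F}{\partial \rho_{mn}^{2}} < 0, \quad \gamma < 0,\ \kappa_{i} \neq \kappa_{j}; \qquad \lim_{\Theta \to I} \mathrm{Tr}\, \frac{\partial^{2} F}{\partial \rho_{mn}^{2}} \equiv 0, \quad \gamma = 0 \text{ or } \kappa_{i} = \kappa_{j}. \tag{31}
\]

C.1 Proof of formulas (28) and (29)

Consider the partial derivative of $F$ with respect to any correlation $\rho_{mn}$:

\[
\left( \frac{\partial F}{\partial \rho_{mn}} \right)' = \frac{\partial}{\partial \rho_{mn}} \left( 2 F F - \delta (\kappa F + F \Gamma) + \frac{\delta(\delta - 1)}{2} \kappa \Gamma \right)
\]

\[
= 2 \left( \frac{\partial F}{\partial \rho_{mn}} F + F \frac{\partial F}{\partial \rho_{mn}} \right) - \delta \left( \kappa \frac{\partial F}{\partial \rho_{mn}} + \frac{\partial F}{\partial \rho_{mn}} \Gamma + F \frac{\partial \Gamma}{\partial \rho_{mn}} \right) + \frac{\delta(\delta - 1)}{2} \kappa \frac{\partial \Gamma}{\partial \rho_{mn}}.
\]

Taking the limit $\Theta \to I$, we get

\[
\lambda' = 2 (\lambda \Psi + \Psi \lambda) - \delta \Big( \kappa \lambda + \lambda \kappa + \Psi \underbrace{\left[ \kappa I^{mn} - I^{mn} \kappa \right]}_{\text{Lemma E.1, item 2}} \Big) + \frac{\delta(\delta - 1)}{2} \kappa \underbrace{\left[ \kappa I^{mn} - I^{mn} \kappa \right]}_{\text{Lemma E.1, item 2}},
\]

or, element-wise,

\[
\lambda'_{ij} = 2 \lambda_{ij} \left( \Psi_{ii} + \Psi_{jj} - \frac{\delta}{2} [\kappa_{i} + \kappa_{j}] \right) - \delta \sum_{s=1}^{n} \sum_{k=1}^{n} \left( \Psi_{is} \kappa_{sk} I^{mn}_{kj} - \Psi_{is} I^{mn}_{sk} \kappa_{kj} \right) + \frac{\delta(\delta - 1)}{2} \sum_{s=1}^{n} \sum_{k=1}^{n} \left[ \kappa_{is} \kappa_{sk} I^{mn}_{kj} - \kappa_{is} I^{mn}_{sk} \kappa_{kj} \right],
\]

\[
\lambda'_{ij} = \lambda_{ij} \left( 2 \Psi_{ii} + 2 \Psi_{jj} - \delta [\kappa_{i} + \kappa_{j}] \right) - \delta \Psi_{ii} I^{mn}_{ij} [\kappa_{i} - \kappa_{j}] + \frac{\delta(\delta - 1)}{2} \kappa_{i} I^{mn}_{ij} [\kappa_{i} - \kappa_{j}],
\]

\[
\lambda'_{ij} = \lambda_{ij} \left( 2 \Psi_{ii} + 2 \Psi_{jj} - \delta [\kappa_{i} + \kappa_{j}] \right) - \delta I^{mn}_{ij} [\kappa_{i} - \kappa_{j}] \left[ \Psi_{ii} + \frac{1 - \delta}{2} \kappa_{i} \right], \qquad \lambda_{ij}(0) = 0.
\]

Since $I^{mn}_{ij} = 0$ for $(ij) \notin mn$, we have $\lambda_{ij} \equiv 0$ for such indices. Moreover, the diagonal elements satisfy $(ii) \notin mn$ for all $i = 1..n$, therefore $\mathrm{Tr}\, \lambda \equiv 0$.

C.2 Proof of formula (30)

\[
\left( \frac{\partial^{2} F}{\partial \rho_{mn} \partial \rho_{pq}} \right)' = \frac{\partial^{2}}{\partial \rho_{mn} \partial \rho_{pq}} \left( 2 F F - \delta (\kappa F + F \Gamma) + \frac{\delta(\delta - 1)}{2} \kappa \Gamma \right) \tag{32}
\]

\[
= 2 \left( \frac{\partial^{2} F}{\partial \rho_{mn} \partial \rho_{pq}} F + \frac{\partial F}{\partial \rho_{mn}} \frac{\partial F}{\partial \rho_{pq}} + \frac{\partial F}{\partial \rho_{pq}} \frac{\partial F}{\partial \rho_{mn}} + F \frac{\partial^{2} F}{\partial \rho_{mn} \partial \rho_{pq}} \right)
\]

\[
- \delta \left( \kappa \frac{\partial^{2} F}{\partial \rho_{mn} \partial \rho_{pq}} + \frac{\partial^{2} F}{\partial \rho_{mn} \partial \rho_{pq}} \Gamma + \frac{\partial F}{\partial \rho_{mn}} \frac{\partial \Gamma}{\partial \rho_{pq}} + \frac{\partial F}{\partial \rho_{pq}} \frac{\partial \Gamma}{\partial \rho_{mn}} + F \frac{\partial^{2} \Gamma}{\partial \rho_{mn} \partial \rho_{pq}} \right) + \frac{\delta(\delta - 1)}{2} \kappa \frac{\partial^{2} \Gamma}{\partial \rho_{mn} \partial \rho_{pq}}.
\]

Let us define

\[
\eta = \lim_{\Theta \to I} \frac{\partial^{2} F}{\partial \rho_{mn} \partial \rho_{pq}}, \qquad \tilde{\lambda} = \lim_{\Theta \to I} \frac{\partial F}{\partial \rho_{pq}}.
\]

Then

\[
\eta' = 2 [\eta \Psi + \Psi \eta] - \delta \left[ \kappa \eta + \eta \kappa + \lambda (\kappa I^{pq} - I^{pq} \kappa) + \tilde{\lambda} (\kappa I^{mn} - I^{mn} \kappa) + \Psi Q \right] + \frac{\delta(\delta - 1)}{2} \kappa Q,
\]

\[
\eta'_{ii} = 4 \eta_{ii} \Psi_{ii} - 2 \delta \kappa_{i} \eta_{ii} - \delta \Psi_{ii} Q_{ii} - \delta \sum_{s=1}^{n} \sum_{k=1}^{n} \left[ \lambda_{is} \kappa_{sk} I^{pq}_{ki} - \lambda_{is} I^{pq}_{sk} \kappa_{ki} + \tilde{\lambda}_{is} \kappa_{sk} I^{mn}_{ki} - \tilde{\lambda}_{is} I^{mn}_{sk} \kappa_{ki} \right] + \frac{\delta(\delta - 1)}{2} \kappa_{i} Q_{ii},
\]

\[
\eta'_{ii} = 2 \eta_{ii} [2 \Psi_{ii} - \delta \kappa_{ii}] - \delta \sum_{s=1}^{n} \left[ \lambda_{is} \kappa_{s} I^{pq}_{si} - \lambda_{is} I^{pq}_{si} \kappa_{i} + \tilde{\lambda}_{is} \kappa_{s} I^{mn}_{si} - \tilde{\lambda}_{is} I^{mn}_{si} \kappa_{i} \right],
\]

\[
\eta'_{ii} = 2 \eta_{ii} [2 \Psi_{ii} - \delta \kappa_{ii}], \qquad \eta_{ii}(0) = 0.
\]

Hence $\eta_{ii} \equiv 0$ and $\mathrm{Tr}\, \eta \equiv 0$.

C.3 Proof of formulas (31)

According to the definition of $\varphi$ we obtain the following ODE:

\[
\varphi' = 2 [\varphi \Psi + \Psi \varphi] - \delta \left[ \kappa \varphi + \varphi \kappa + 2 \lambda (\kappa I^{mn} - I^{mn} \kappa) + \Psi P \right] + \frac{\delta(\delta - 1)}{2} \kappa P, \qquad \varphi(0) = 0,
\]

or, in element-wise notation,

\[
\varphi'_{ii} = 2 \varphi_{ii} [2 \Psi_{ii} - \delta \kappa_{ii}] - 2 \delta \lambda_{ij} I^{mn}_{ij} (\kappa_{j} - \kappa_{i}) - \delta \Psi_{ii} P_{ii} + \frac{\delta(\delta - 1)}{2} \kappa_{i} P_{ii},
\]

\[
\varphi'_{ii} = \varphi_{ii} [4 \Psi_{ii} - 2 \delta \kappa_{ii}] + 2 \delta \lambda_{ij} I^{mn}_{ij} (\kappa_{i} - \kappa_{j}) - \delta P_{ii} \left( \Psi_{ii} + \frac{1 - \delta}{2} \kappa_{i} \right),
\]

\[
\varphi'_{ii} = \varphi_{ii} [4 \Psi_{ii} - 2 \delta \kappa_{ii}] + 2 \delta I^{mn}_{ij} (\kappa_{i} - \kappa_{j}) \left[ \lambda_{ij} - \frac{\kappa_{i} (1 - \sqrt{\delta})}{2} \frac{e^{2 \kappa_{i} \sqrt{\delta} \tau} + 1}{e^{2 \kappa_{i} \sqrt{\delta} \tau} + \omega} \right], \qquad \varphi(0) = 0.
\]

It is easy to check that under the condition $\kappa_{i} = \kappa_{j}$:

\[
\varphi_{ii} = \varphi_{jj} = 0. \tag{33}
\]

Formula (33) also holds for the special case $\gamma = 0$ ($\delta = 1$). Indeed, in this case $\lambda_{ij} = \lambda_{ji} = 0$. It follows that the right-hand side of the last equation for $\varphi_{ii}$ is equal to zero, therefore $\varphi_{ii} = \varphi_{jj} = 0$.

We proceed with the case $i \notin mn$. Each element $P_{ii}$ equals 0, i.e. $\varphi_{ii}(\tau) \equiv 0$. Therefore, the trace of the matrix $\varphi$ contains only two non-zero terms, those with indices in the multi-index $mn$. For simplicity of notation, we denote them by $i$ and $j$, i.e. $mn = (ij)$. The summands $\varphi_{ii}$ and $\varphi_{jj}$ can be found via the following ODEs:

\[
\varphi'_{ii} - \varphi_{ii} [4 \Psi_{ii} - 2 \delta \kappa_{i}] = 2 \delta (\kappa_{i} - \kappa_{j}) \left[ \lambda_{ij} - \frac{\kappa_{i} (1 - \sqrt{\delta})}{2} \frac{e^{2 \kappa_{i} \sqrt{\delta} \tau} + 1}{e^{2 \kappa_{i} \sqrt{\delta} \tau} + \omega} \right],
\]

\[
\varphi'_{jj} - \varphi_{jj} [4 \Psi_{jj} - 2 \delta \kappa_{j}] = 2 \delta (\kappa_{j} - \kappa_{i}) \left[ \lambda_{ji} - \frac{\kappa_{j} (1 - \sqrt{\delta})}{2} \frac{e^{2 \kappa_{j} \sqrt{\delta} \tau} + 1}{e^{2 \kappa_{j} \sqrt{\delta} \tau} + \omega} \right],
\]

\[
\varphi_{ii}(0) = \varphi_{jj}(0) = 0.
\]

Using Lemma D.3 we finish the proof.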
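The element-wise ODEs of Sections C.1 and C.3 can also be explored numerically. The following is a minimal sketch, not the authors' code: it integrates the scalar Riccati equation (35) together with the ODEs for $\lambda_{ij}$, $\lambda_{ji}$, $\varphi_{ii}$ and $\varphi_{jj}$ as stated above, using a hand-written fourth-order Runge-Kutta scheme. The function names (`rhs`, `rk4_step`, `trace_phi`), the step count and the parameter values are illustrative assumptions. By (38), the bracket in the $\varphi$ equation is written as $\lambda_{ij} - (\Psi_{ii} + \frac{1-\delta}{2}\kappa_i)$.

```python
def rhs(state, ki, kj, delta):
    """Right-hand sides for (Psi_i, Psi_j, lam_ij, lam_ji, phi_ii, phi_jj, integral)."""
    psi_i, psi_j, lam_ij, lam_ji, phi_ii, phi_jj, _ = state
    c = 0.5 * delta * (delta - 1.0)
    g_i = psi_i + 0.5 * (1.0 - delta) * ki  # Psi_ii + (1 - delta)/2 * kappa_i
    g_j = psi_j + 0.5 * (1.0 - delta) * kj
    lin = 2.0 * psi_i + 2.0 * psi_j - delta * (ki + kj)
    return (
        2.0 * psi_i ** 2 - 2.0 * delta * ki * psi_i + c * ki ** 2,  # equation (35)
        2.0 * psi_j ** 2 - 2.0 * delta * kj * psi_j + c * kj ** 2,
        lam_ij * lin - delta * (ki - kj) * g_i,                     # Section C.1
        lam_ji * lin - delta * (kj - ki) * g_j,
        phi_ii * (4.0 * psi_i - 2.0 * delta * ki)
        + 2.0 * delta * (ki - kj) * (lam_ij - g_i),                 # Section C.3
        phi_jj * (4.0 * psi_j - 2.0 * delta * kj)
        + 2.0 * delta * (kj - ki) * (lam_ji - g_j),
        phi_ii + phi_jj,  # running integral of phi_ii + phi_jj
    )


def rk4_step(state, h, ki, kj, delta):
    """One classical Runge-Kutta step of size h."""
    k1 = rhs(state, ki, kj, delta)
    k2 = rhs(tuple(y + 0.5 * h * d for y, d in zip(state, k1)), ki, kj, delta)
    k3 = rhs(tuple(y + 0.5 * h * d for y, d in zip(state, k2)), ki, kj, delta)
    k4 = rhs(tuple(y + h * d for y, d in zip(state, k3)), ki, kj, delta)
    return tuple(y + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for y, a, b, c, d in zip(state, k1, k2, k3, k4))


def trace_phi(ki, kj, delta, horizon, steps=1000):
    """Return (phi_ii + phi_jj at tau = horizon, its integral over [0, horizon])."""
    state = (0.0,) * 7
    h = horizon / steps
    for _ in range(steps):
        state = rk4_step(state, h, ki, kj, delta)
    return state[4] + state[5], state[6]


if __name__ == "__main__":
    for delta in (2.0, 1.0, 0.5):  # gamma = 0.5, 0, -1 via delta = 1 / (1 - gamma)
        print(delta, trace_phi(1.0, 0.5, delta, 0.05))
```

For a short horizon the sign of the output follows the sign of $\delta - 1$, and the result vanishes when $\delta = 1$ or $\kappa_i = \kappa_j$, in line with (31) and (33). The sketch does not establish the claim for arbitrary horizons.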
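D Auxiliary facts about the structure of the matrix F in the zero correlation case

Here we present some facts about the structure of $F$ for the zero correlation case. We consider the matrices $\Psi$, $\lambda$ and $\varphi$ defined as follows:

\[
\Psi = \lim_{\Theta \to I} F, \qquad \lambda = \lim_{\Theta \to I} \frac{\partial F}{\partial \rho_{mn}}, \qquad \varphi = \lim_{\Theta \to I} \frac{\partial^{2} F}{\partial \rho_{mn}^{2}}. \tag{34}
\]

Lemma D.1. The matrix $\Psi$ is a diagonal matrix with the following entries:

\[
\Psi = \mathrm{diag} \left( \Psi(\kappa_{1}, \tau), \Psi(\kappa_{2}, \tau), \ldots, \Psi(\kappa_{n}, \tau) \right).
\]

Here the function $\Psi(\kappa, \tau)$ is defined as the solution to the following one-dimensional Riccati equation:

\[
\frac{d\Psi}{d\tau} = 2 \Psi^{2} - 2 \delta \kappa \Psi + \frac{\delta(\delta - 1) \kappa^{2}}{2}, \qquad \Psi(0) = 0, \tag{35}
\]

which can be solved explicitly:

\[
\Psi(\kappa, \tau) = \frac{\kappa \sqrt{\delta} (\sqrt{\delta} - 1)}{2} \frac{e^{2 \kappa \sqrt{\delta} \tau} - 1}{e^{2 \kappa \sqrt{\delta} \tau} + \omega}, \qquad \omega = \frac{1 - \sqrt{\delta}}{1 + \sqrt{\delta}}. \tag{36}
\]

Moreover, the function $\Psi$ has the following properties:

\[
2 \int \Psi(\kappa, \tau)\, d\tau = (\delta + \sqrt{\delta}) \kappa \tau - \ln \left( e^{2 \kappa \sqrt{\delta} \tau} + \omega \right) + C, \tag{37}
\]

\[
\Psi(\kappa, \tau) + \frac{1 - \delta}{2} \kappa = \frac{\kappa (1 - \sqrt{\delta})}{2} \frac{e^{2 \kappa \sqrt{\delta} \tau} + 1}{e^{2 \kappa \sqrt{\delta} \tau} + \omega}. \tag{38}
\]

Proof. Separating variables in (35), we get

\[
d\tau = \frac{d\Psi}{2 \Psi^{2} - 2 \delta \kappa \Psi + \delta(\delta - 1) \kappa^{2} / 2}, \qquad \int d\tau = \int \frac{d\Psi}{2 \Psi^{2} - 2 \delta \kappa \Psi + \delta(\delta - 1) \kappa^{2} / 2},
\]

\[
\tau + c = \frac{1}{2 \sqrt{\delta} \kappa} \ln \left( \frac{\delta \kappa - 2 \Psi + \sqrt{\delta} \kappa}{- \delta \kappa + 2 \Psi + \sqrt{\delta} \kappa} \right).
\]

The initial condition $\Psi(0) = 0$ fixes the constant $c$:

\[
2 \sqrt{\delta} \kappa \tau + \ln \left( \frac{\delta \kappa + \sqrt{\delta} \kappa}{- \delta \kappa + \sqrt{\delta} \kappa} \right) = \ln \left( \frac{\delta \kappa - 2 \Psi + \sqrt{\delta} \kappa}{- \delta \kappa + 2 \Psi + \sqrt{\delta} \kappa} \right),
\]

\[
2 \sqrt{\delta} \kappa \tau = \ln \left( \frac{\delta \kappa - 2 \Psi + \sqrt{\delta} \kappa}{- \delta \kappa + 2 \Psi + \sqrt{\delta} \kappa} \right) - \ln \left( \frac{1 + \sqrt{\delta}}{1 - \sqrt{\delta}} \right),
\]

\[
e^{2 \sqrt{\delta} \kappa \tau} = \frac{\left( \delta \kappa - 2 \Psi + \sqrt{\delta} \kappa \right) (1 - \sqrt{\delta})}{\left( - \delta \kappa + 2 \Psi + \sqrt{\delta} \kappa \right) (1 + \sqrt{\delta})} = \frac{2 \Psi (\sqrt{\delta} - 1) + \sqrt{\delta} \kappa (1 - \delta)}{2 \Psi (\sqrt{\delta} + 1) + \sqrt{\delta} \kappa (1 - \delta)}.
\]

Hence $\Psi$ equals

\[
\Psi = \frac{1}{2} \frac{\sqrt{\delta} \kappa (1 - \delta) \left( 1 - e^{2 \sqrt{\delta} \kappa \tau} \right)}{e^{2 \sqrt{\delta} \kappa \tau} \left( 1 + \sqrt{\delta} \right) + 1 - \sqrt{\delta}} = - \frac{\sqrt{\delta} \kappa (1 - \sqrt{\delta})}{2} \frac{1 - e^{-2 \sqrt{\delta} \kappa \tau}}{1 + \frac{1 - \sqrt{\delta}}{1 + \sqrt{\delta}} e^{-2 \sqrt{\delta} \kappa \tau}},
\]

\[
\Psi = \frac{\kappa \sqrt{\delta} (\sqrt{\delta} - 1)}{2} \frac{e^{2 \kappa \sqrt{\delta} \tau} - 1}{e^{2 \kappa \sqrt{\delta} \tau} + \omega}, \qquad \omega = \frac{1 - \sqrt{\delta}}{1 + \sqrt{\delta}}.
\]

Lemma D.2. Each element $\lambda_{ij}$ of the matrix $\lambda$ with $(ij) \in mn$ is the following function:

\[
\lambda_{ij} = \frac{\kappa_{i} \sqrt{\delta} (1 - \sqrt{\delta})}{2 \left( e^{2 \kappa_{i} \sqrt{\delta} \tau} + \omega \right) \left( e^{2 \kappa_{j} \sqrt{\delta} \tau} + \omega \right)} \times \left[ \frac{\kappa_{j} - \kappa_{i}}{\kappa_{j} + \kappa_{i}} \left( e^{(\kappa_{j} + \kappa_{i}) \sqrt{\delta} \tau} - 1 \right) \left( e^{(\kappa_{j} + \kappa_{i}) \sqrt{\delta} \tau} + \omega \right) + e^{2 \kappa_{i} \sqrt{\delta} \tau} \left( e^{(\kappa_{j} - \kappa_{i}) \sqrt{\delta} \tau} - 1 \right) \left( e^{(\kappa_{j} - \kappa_{i}) \sqrt{\delta} \tau} + \omega \right) \right]. \tag{39}
\]

Proof. Differentiating the matrix equation (12) with respect to $\rho_{mn}$ and taking the limit $\Theta \to I$, we get the following element-wise ODEs for $\lambda_{ij}$:

\[
\lambda'_{ij} = \lambda_{ij} \left( 2 \Psi_{ii} + 2 \Psi_{jj} - \delta [\kappa_{i} + \kappa_{j}] \right) - \delta [\kappa_{i} - \kappa_{j}] \left[ \Psi_{ii} + \frac{1 - \delta}{2} \kappa_{i} \right], \qquad \lambda_{ij}(0) = 0.
\]

By (37), the corresponding homogeneous ODE can be solved explicitly; its solution is proportional to

\[
\frac{e^{\kappa_{i} \sqrt{\delta} \tau + \kappa_{j} \sqrt{\delta} \tau}}{\left( e^{2 \kappa_{i} \sqrt{\delta} \tau} + \omega \right) \left( e^{2 \kappa_{j} \sqrt{\delta} \tau} + \omega \right)}.
\]

Variation of constants together with (38) then gives

\[
\lambda_{ij} = - \delta [\kappa_{i} - \kappa_{j}] \frac{\kappa_{i} (1 - \sqrt{\delta})}{2} \frac{e^{\kappa_{i} \sqrt{\delta} \tau + \kappa_{j} \sqrt{\delta} \tau}}{\left( e^{2 \kappa_{i} \sqrt{\delta} \tau} + \omega \right) \left( e^{2 \kappa_{j} \sqrt{\delta} \tau} + \omega \right)} \times \int_{0}^{\tau} \frac{\left( e^{2 \kappa_{i} \sqrt{\delta} \zeta} + 1 \right) \left( e^{2 \kappa_{j} \sqrt{\delta} \zeta} + \omega \right)}{e^{\kappa_{i} \sqrt{\delta} \zeta + \kappa_{j} \sqrt{\delta} \zeta}}\, d\zeta
\]

\[
= - \delta [\kappa_{i} - \kappa_{j}] \frac{\kappa_{i} (1 - \sqrt{\delta})}{2} \frac{e^{\kappa_{i} \sqrt{\delta} \tau + \kappa_{j} \sqrt{\delta} \tau}}{\left( e^{2 \kappa_{i} \sqrt{\delta} \tau} + \omega \right) \left( e^{2 \kappa_{j} \sqrt{\delta} \tau} + \omega \right)} \times \int_{0}^{\tau} \left[ e^{(\kappa_{i} + \kappa_{j}) \sqrt{\delta} \zeta} + \omega e^{(\kappa_{i} - \kappa_{j}) \sqrt{\delta} \zeta} + e^{(\kappa_{j} - \kappa_{i}) \sqrt{\delta} \zeta} + \omega e^{-(\kappa_{i} + \kappa_{j}) \sqrt{\delta} \zeta} \right] d\zeta
\]

\[
= \delta [\kappa_{j} - \kappa_{i}] \frac{\kappa_{i} (1 - \sqrt{\delta})}{2} \frac{e^{(\kappa_{i} + \kappa_{j}) \sqrt{\delta} \tau}}{\left( e^{2 \kappa_{i} \sqrt{\delta} \tau} + \omega \right) \left( e^{2 \kappa_{j} \sqrt{\delta} \tau} + \omega \right)} \times \left[ \frac{e^{(\kappa_{i} + \kappa_{j}) \sqrt{\delta} \tau} - \omega e^{-(\kappa_{i} + \kappa_{j}) \sqrt{\delta} \tau} + \omega - 1}{(\kappa_{i} + \kappa_{j}) \sqrt{\delta}} + \frac{e^{(\kappa_{j} - \kappa_{i}) \sqrt{\delta} \tau} - \omega e^{-(\kappa_{j} - \kappa_{i}) \sqrt{\delta} \tau} + \omega - 1}{(\kappa_{j} - \kappa_{i}) \sqrt{\delta}} \right]
\]

\[
= \frac{\kappa_{i} \sqrt{\delta} (1 - \sqrt{\delta})}{2 \left( e^{2 \kappa_{i} \sqrt{\delta} \tau} + \omega \right) \left( e^{2 \kappa_{j} \sqrt{\delta} \tau} + \omega \right)} \times \left[ \frac{\kappa_{j} - \kappa_{i}}{\kappa_{j} + \kappa_{i}} \left( e^{(\kappa_{j} + \kappa_{i}) \sqrt{\delta} \tau} - 1 \right) \left( e^{(\kappa_{j} + \kappa_{i}) \sqrt{\delta} \tau} + \omega \right) + e^{2 \kappa_{i} \sqrt{\delta} \tau} \left( e^{(\kappa_{j} - \kappa_{i}) \sqrt{\delta} \tau} - 1 \right) \left( e^{(\kappa_{j} - \kappa_{i}) \sqrt{\delta} \tau} + \omega \right) \right].
\]

Lemma D.3. Any diagonal element $\varphi_{ii}$ of the matrix $\varphi$ with $i \in mn$ can be defined as the solution to the following ODE:

\[
\varphi'_{ii} = - 2 \kappa_{i} \sqrt{\delta}\, \frac{e^{2 \kappa_{i} \sqrt{\delta} \tau} - \omega}{e^{2 \kappa_{i} \sqrt{\delta} \tau} + \omega}\, \varphi_{ii} + \delta (1 - \sqrt{\delta}) \kappa_{i} (\kappa_{i} - \kappa_{j}) \times \tag{40}
\]

\[
\times \left[ - \frac{e^{2 \kappa_{i} \sqrt{\delta} \tau} + 1}{e^{2 \kappa_{i} \sqrt{\delta} \tau} + \omega} + \sqrt{\delta}\, \frac{\kappa_{j} - \kappa_{i}}{\kappa_{j} + \kappa_{i}}\, \frac{e^{(\kappa_{j} + \kappa_{i}) \sqrt{\delta} \tau} - 1}{e^{2 \kappa_{i} \sqrt{\delta} \tau} + \omega}\, \frac{e^{(\kappa_{j} + \kappa_{i}) \sqrt{\delta} \tau} + \omega}{e^{2 \kappa_{j} \sqrt{\delta} \tau} + \omega} + \sqrt{\delta}\, e^{2 \kappa_{i} \sqrt{\delta} \tau}\, \frac{e^{(\kappa_{j} - \kappa_{i}) \sqrt{\delta} \tau} - 1}{e^{2 \kappa_{i} \sqrt{\delta} \tau} + \omega}\, \frac{e^{(\kappa_{j} - \kappa_{i}) \sqrt{\delta} \tau} + \omega}{e^{2 \kappa_{j} \sqrt{\delta} \tau} + \omega} \right],
\]

\[
\varphi_{ii}(0) = 0.
\]

Moreover, the following relations hold for any $\kappa_{i} > 0$, $\kappa_{j} > 0$, $T > 0$ and $\delta > 0$:

\[
\int_{0}^{T} \left[ \varphi_{ii}(u) + \varphi_{jj}(u) \right] du > 0, \qquad \delta > 1,\ \kappa_{i} \neq \kappa_{j},
\]

\[
\int_{0}^{T} \left[ \varphi_{ii}(u) + \varphi_{jj}(u) \right] du \equiv 0, \qquad \delta = 1 \text{ or } \kappa_{i} = \kappa_{j},
\]

\[
\int_{0}^{T} \left[ \varphi_{ii}(u) + \varphi_{jj}(u) \right] du < 0, \qquad 0 < \delta < 1,\ \kappa_{i} \neq \kappa_{j}.
\]

Proof.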
This can be checked by direct calculation.
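The closed-form expressions (36) and (39) can be sanity-checked numerically against the ODEs they are meant to solve. The sketch below is an illustration under stated assumptions rather than part of the original derivation: it integrates the Riccati equation (35) and the ODE for $\lambda_{ij}$ from the proof of Lemma D.2 with a fourth-order Runge-Kutta scheme and reports the largest deviation from the closed forms. All function names, the step count and the sample parameters are hypothetical choices.

```python
import math


def omega(delta):
    """omega = (1 - sqrt(delta)) / (1 + sqrt(delta)), as in (36)."""
    s = math.sqrt(delta)
    return (1.0 - s) / (1.0 + s)


def psi_closed(kappa, delta, tau):
    """Closed form (36) for the solution of the Riccati equation (35)."""
    s = math.sqrt(delta)
    e = math.exp(2.0 * kappa * s * tau)
    return 0.5 * kappa * s * (s - 1.0) * (e - 1.0) / (e + omega(delta))


def lam_closed(ki, kj, delta, tau):
    """Closed form (39) for lambda_ij."""
    s, w = math.sqrt(delta), omega(delta)
    ei = math.exp(2.0 * ki * s * tau)
    ej = math.exp(2.0 * kj * s * tau)
    es = math.exp((kj + ki) * s * tau)
    ed = math.exp((kj - ki) * s * tau)
    pref = ki * s * (1.0 - s) / (2.0 * (ei + w) * (ej + w))
    return pref * ((kj - ki) / (kj + ki) * (es - 1.0) * (es + w)
                   + ei * (ed - 1.0) * (ed + w))


def rhs(state, ki, kj, delta):
    """ODE right-hand sides for (Psi_i, Psi_j, lambda_ij)."""
    psi_i, psi_j, lam = state
    c = 0.5 * delta * (delta - 1.0)
    return (
        2.0 * psi_i ** 2 - 2.0 * delta * ki * psi_i + c * ki ** 2,
        2.0 * psi_j ** 2 - 2.0 * delta * kj * psi_j + c * kj ** 2,
        lam * (2.0 * psi_i + 2.0 * psi_j - delta * (ki + kj))
        - delta * (ki - kj) * (psi_i + 0.5 * (1.0 - delta) * ki),
    )


def integrate(ki, kj, delta, tau, steps=2000):
    """Classical RK4 integration from 0 to tau with zero initial data."""
    y, h = (0.0, 0.0, 0.0), tau / steps
    for _ in range(steps):
        k1 = rhs(y, ki, kj, delta)
        k2 = rhs(tuple(a + 0.5 * h * b for a, b in zip(y, k1)), ki, kj, delta)
        k3 = rhs(tuple(a + 0.5 * h * b for a, b in zip(y, k2)), ki, kj, delta)
        k4 = rhs(tuple(a + h * b for a, b in zip(y, k3)), ki, kj, delta)
        y = tuple(a + h / 6.0 * (p + 2.0 * q + 2.0 * r + t)
                  for a, p, q, r, t in zip(y, k1, k2, k3, k4))
    return y


def max_error(ki, kj, delta, tau):
    """Largest absolute gap between the RK4 solution and the closed forms."""
    psi_i, psi_j, lam = integrate(ki, kj, delta, tau)
    return max(abs(psi_i - psi_closed(ki, delta, tau)),
               abs(psi_j - psi_closed(kj, delta, tau)),
               abs(lam - lam_closed(ki, kj, delta, tau)))


if __name__ == "__main__":
    for delta in (2.0, 0.5):
        print(delta, max_error(1.0, 0.5, delta, 1.0))
```

The reported errors should be at the level of the integrator's accuracy, which gives an independent check on the signs and exponents in (36) and (39).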
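E Auxiliary facts about correlation matrices

In this section we use two special types of square symmetric matrices, $I^{mn}$ and $I^{uu}$. These objects are defined as follows. The matrix $I^{mn}$ has zero entries, except for the elements with multi-index $(mn)$, which are equal to 1:

\[
I^{mn}_{ij} = 0, \quad \forall (ij) \neq (mn), \qquad I^{mn}_{ij} = 1, \quad (ij) = (mn) \text{ or } (ji) = (mn). \tag{41}
\]

The matrix $I^{mn}$ is traceless, $\mathrm{Tr}\, I^{mn} = 0$. The matrix $I^{uu}$ also has zero entries, except for the single element at position $(u, u)$, which is equal to 1.

We prove some useful facts about the correlation matrix $\Theta$ and the similarity transform $\Gamma = \Theta^{-1} \kappa \Theta$ of the matrix $\kappa$.

Lemma E.1. The correlation matrix $\Theta$ and its similarity transform $\Gamma$ have the following properties:

1. \[ \frac{\partial \Theta^{-1}}{\partial \rho_{mn}} = - \Theta^{-1} \frac{\partial \Theta}{\partial \rho_{mn}} \Theta^{-1}. \tag{42} \]

2. \[ \lim_{\Theta \to I} \frac{\partial \Gamma}{\partial \rho_{mn}} = \kappa I^{mn} - I^{mn} \kappa. \tag{43} \]

3. \[ \lim_{\Theta \to I} \frac{\partial^{2} \Gamma}{\partial \rho_{mn} \partial \rho_{pq}} = Q, \qquad Q_{ii} = 0, \quad \forall i = 1..n. \tag{44} \]

4. \[ \lim_{\Theta \to I} \frac{\partial^{2} \Gamma}{\partial \rho_{mn}^{2}} = P, \qquad P_{ii} = 2\, \mathbb{I}(ij \in mn) [\kappa_{i} - \kappa_{j}]. \tag{45} \]

E.1 Proof of formula (1)

\[
\Theta \Theta^{-1} = I, \qquad \frac{\partial}{\partial \rho_{mn}} \left( \Theta \Theta^{-1} \right) = \frac{\partial I}{\partial \rho_{mn}} = 0,
\]

\[
\Theta \frac{\partial \Theta^{-1}}{\partial \rho_{mn}} = - \frac{\partial \Theta}{\partial \rho_{mn}} \Theta^{-1}, \qquad \frac{\partial \Theta^{-1}}{\partial \rho_{mn}} = - \Theta^{-1} \frac{\partial \Theta}{\partial \rho_{mn}} \Theta^{-1}.
\]

E.2 Proof of formula (2)

\[
\lim_{\Theta \to I} \frac{\partial \Gamma}{\partial \rho_{mn}} = \lim_{\Theta \to I} \frac{\partial \left( \Theta^{-1} \kappa \Theta \right)}{\partial \rho_{mn}} = \lim_{\Theta \to I} \frac{\partial \Theta^{-1}}{\partial \rho_{mn}} \kappa \Theta + \lim_{\Theta \to I} \Theta^{-1} \kappa \frac{\partial \Theta}{\partial \rho_{mn}} = \lim_{\Theta \to I} \frac{\partial \Theta^{-1}}{\partial \rho_{mn}} \kappa I + I \kappa \lim_{\Theta \to I} \frac{\partial \Theta}{\partial \rho_{mn}}
\]

\[
= - \lim_{\Theta \to I} \frac{\partial \Theta}{\partial \rho_{mn}} \kappa + \kappa \lim_{\Theta \to I} \frac{\partial \Theta}{\partial \rho_{mn}} = - I^{mn} \kappa + \kappa I^{mn} = \kappa I^{mn} - I^{mn} \kappa.
\]

E.3 Proof of formula (3)

\[
\frac{\partial^{2} \Gamma}{\partial \rho_{mn} \partial \rho_{pq}} = \frac{\partial^{2}}{\partial \rho_{mn} \partial \rho_{pq}} \Theta^{-1} \kappa \Theta \tag{46}
\]

\[
= \frac{\partial^{2} \Theta^{-1}}{\partial \rho_{mn} \partial \rho_{pq}} \kappa \Theta + \frac{\partial \Theta^{-1}}{\partial \rho_{mn}} \kappa \frac{\partial \Theta}{\partial \rho_{pq}} + \frac{\partial \Theta^{-1}}{\partial \rho_{pq}} \kappa \frac{\partial \Theta}{\partial \rho_{mn}} + \Theta^{-1} \kappa \frac{\partial^{2} \Theta}{\partial \rho_{mn} \partial \rho_{pq}}
\]

\[
= - \frac{\partial}{\partial \rho_{pq}} \left[ \Theta^{-1} \frac{\partial \Theta}{\partial \rho_{mn}} \Theta^{-1} \right] \kappa \Theta - \Theta^{-1} \frac{\partial \Theta}{\partial \rho_{mn}} \Theta^{-1} \kappa \frac{\partial \Theta}{\partial \rho_{pq}} - \Theta^{-1} \frac{\partial \Theta}{\partial \rho_{pq}} \Theta^{-1} \kappa \frac{\partial \Theta}{\partial \rho_{mn}}
\]

\[
= - \frac{\partial \Theta^{-1}}{\partial \rho_{pq}} \frac{\partial \Theta}{\partial \rho_{mn}} \Theta^{-1} \kappa \Theta - \Theta^{-1} \frac{\partial \Theta}{\partial \rho_{mn}} \frac{\partial \Theta^{-1}}{\partial \rho_{pq}} \kappa \Theta - \Theta^{-1} \frac{\partial \Theta}{\partial \rho_{mn}} \Theta^{-1} \kappa \frac{\partial \Theta}{\partial \rho_{pq}} - \Theta^{-1} \frac{\partial \Theta}{\partial \rho_{pq}} \Theta^{-1} \kappa \frac{\partial \Theta}{\partial \rho_{mn}}
\]

\[
= \Theta^{-1} \frac{\partial \Theta}{\partial \rho_{pq}} \Theta^{-1} \frac{\partial \Theta}{\partial \rho_{mn}} \Theta^{-1} \kappa \Theta + \Theta^{-1} \frac{\partial \Theta}{\partial \rho_{mn}} \Theta^{-1} \frac{\partial \Theta}{\partial \rho_{pq}} \Theta^{-1} \kappa \Theta - \Theta^{-1} \frac{\partial \Theta}{\partial \rho_{mn}} \Theta^{-1} \kappa \frac{\partial \Theta}{\partial \rho_{pq}} - \Theta^{-1} \frac{\partial \Theta}{\partial \rho_{pq}} \Theta^{-1} \kappa \frac{\partial \Theta}{\partial \rho_{mn}}.
\]

Here the term with $\partial^{2} \Theta / \partial \rho_{mn} \partial \rho_{pq}$ vanishes because $\Theta$ is linear in each correlation. Taking the limit $\Theta \to I$, we get

\[
Q = I^{pq} I^{mn} \kappa + I^{mn} I^{pq} \kappa - I^{mn} \kappa I^{pq} - I^{pq} \kappa I^{mn},
\]

\[
Q_{ii} = \sum_{s=1}^{n} \sum_{k=1}^{n} \left[ I^{pq}_{is} I^{mn}_{sk} \kappa_{ki} + I^{mn}_{is} I^{pq}_{sk} \kappa_{ki} - I^{mn}_{is} \kappa_{sk} I^{pq}_{ki} - I^{pq}_{is} \kappa_{sk} I^{mn}_{ki} \right],
\]

\[
Q_{ii} = \sum_{s=1}^{n} \left[ I^{pq}_{is} I^{mn}_{si} \kappa_{ii} + I^{mn}_{is} I^{pq}_{si} \kappa_{ii} - I^{mn}_{is} \kappa_{ss} I^{pq}_{si} - I^{pq}_{is} \kappa_{ss} I^{mn}_{si} \right],
\]

\[
Q_{ii} = 0,
\]

since $I^{mn}_{is} = 0$ if $I^{pq}_{si} = 1$ for each $s = 1..n$, and vice versa.

E.4 Proof of formula (4)

\[
P_{ii} = 2 \sum_{s=1}^{n} \left[ I^{mn}_{is} I^{mn}_{si} \kappa_{ii} - I^{mn}_{is} \kappa_{ss} I^{mn}_{si} \right], \tag{47}
\]

\[
P_{ii} = 2\, \mathbb{I}(ij \in mn) [\kappa_{i} - \kappa_{j}].
\]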
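Items 2 and 4 of Lemma E.1 can be illustrated in the simplest case of two assets, where $\Theta = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$ and $\kappa = \mathrm{diag}(\kappa_1, \kappa_2)$. The following minimal sketch, with hypothetical function names and step sizes, builds $\Gamma = \Theta^{-1} \kappa \Theta$ explicitly and approximates its first and second derivatives in $\rho$ at $\rho = 0$ by central finite differences. It is an illustration of the statement, not a proof.

```python
def gamma_matrix(rho, k1, k2):
    """Gamma = Theta^{-1} kappa Theta for a 2x2 correlation matrix Theta."""
    det = 1.0 - rho * rho
    theta_inv = [[1.0 / det, -rho / det], [-rho / det, 1.0 / det]]
    kappa_theta = [[k1, k1 * rho], [k2 * rho, k2]]  # kappa times Theta
    return [[sum(theta_inv[i][s] * kappa_theta[s][j] for s in range(2))
             for j in range(2)] for i in range(2)]


def first_derivative_at_zero(k1, k2, h=1e-4):
    """Central difference for d Gamma / d rho at rho = 0."""
    gp, gm = gamma_matrix(h, k1, k2), gamma_matrix(-h, k1, k2)
    return [[(gp[i][j] - gm[i][j]) / (2.0 * h) for j in range(2)]
            for i in range(2)]


def second_derivative_diag_at_zero(k1, k2, h=1e-3):
    """Central difference for the diagonal of d^2 Gamma / d rho^2 at rho = 0."""
    gp = gamma_matrix(h, k1, k2)
    g0 = gamma_matrix(0.0, k1, k2)
    gm = gamma_matrix(-h, k1, k2)
    return [(gp[i][i] - 2.0 * g0[i][i] + gm[i][i]) / (h * h) for i in range(2)]


if __name__ == "__main__":
    k1, k2 = 1.5, 0.5
    # Formula (43) predicts [[0, k1 - k2], [k2 - k1, 0]].
    print(first_derivative_at_zero(k1, k2))
    # Formula (45) predicts the diagonal [2 (k1 - k2), 2 (k2 - k1)].
    print(second_derivative_diag_at_zero(k1, k2))
```

In this two-asset case the only correlation is $\rho_{12}$, so $I^{mn}$ is the off-diagonal matrix of ones and the predictions of (43) and (45) reduce to the values noted in the comments.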