Mean-variance portfolio selection under Volterra Heston model

Abstract

Motivated by empirical evidence for rough volatility models, this paper investigates continuous-time mean-variance (MV) portfolio selection under the Volterra Heston model. Due to the non-Markovian and non-semimartingale nature of the model, classic stochastic optimal control frameworks are not directly applicable to the associated optimization problem. By constructing an auxiliary stochastic process, we obtain the optimal investment strategy, which depends on the solution to a Riccati-Volterra equation. The MV efficient frontier is shown to maintain a quadratic curve. Numerical studies show that both roughness and volatility of volatility materially affect the optimal strategy.



Bingyan Han ∗, Hoi Ying Wong †
January 24, 2020


Keywords: Mean-variance portfolio, Volterra Heston model, Riccati-Volterra equations, rough volatility.

Mathematics Subject Classiﬁcation:

∗ Department of Statistics, The Chinese University of Hong Kong, Hong Kong, [email protected]
† Department of Statistics, The Chinese University of Hong Kong, Hong Kong, [email protected]
Preprint: arXiv [q-fin.PM].

1 Introduction

There has been a growing interest in studying rough volatility models [15, 11, 20]. Rough volatility models are stochastic volatility models whose trajectories are rougher than the paths of a standard Brownian motion in terms of the Hölder regularity. Specifically, when the Hölder regularity is less than 1/2, the stochastic path is regarded as rough. The roughness is closely related to the Hurst parameter $H$. This paper focuses on the Volterra Heston model, whose probabilistic characterization does not involve the rough paths theory [11].

Rough volatility models are attractive because they capture the dynamics of historical and implied volatilities remarkably well with only a few additional parameters. Investigations of the time series of the realized volatility from high frequency data (see, for example, Oxford-Man Institute's realized library at https://realized.oxford-man.ox.ac.uk/data) estimate the Hurst parameter $H$ to be near 0.1, which is much smaller than the 0.5 of the standard Brownian motion. The smaller the $H$, the rougher the time series model. Therefore, the empirical finding suggests a rougher realized path of volatility than the standard Brownian motion. Although previous studies have found a long memory property within realized volatility series, it is shown in [15] that rough volatility models can generate the illusion of a long memory. Moreover, the simulated paths with a small Hurst parameter resemble the realized ones.

Rough volatility models also better capture the term structure of an implied volatility surface, especially the explosion of the at-the-money (ATM) skew when maturity goes to zero. More precisely, let $\sigma_{BS}(k, \tau)$ be the implied volatility of an option, where $k$ is the log-moneyness and $\tau$ is the time to expiration. The ATM skew at maturity $\tau$ is defined by

$$\phi(\tau) \triangleq \left| \frac{\partial \sigma_{BS}(k, \tau)}{\partial k} \right|_{k=0}. \tag{1.1}$$

Empirical evidence shows that the ATM skew explodes when $\tau \downarrow 0$. However, conventional volatility models such as the Heston model [21] generate a constant ATM skew for a small $\tau$. If the volatility is modeled by a fractional Brownian motion, then the ATM skew has an asymptotic property [14],

$$\phi(\tau) \approx \tau^{H - 1/2}, \quad \text{when } \tau \downarrow 0, \tag{1.2}$$

where $H$ is the Hurst parameter. Rough volatility models can fit the explosion remarkably well by simply adjusting $H$.

Recent advances offer elegant theoretical foundations for rough volatility models. We note the martingale expansion formula for implied volatility [14], asymptotic analysis of fBM [14, Section 3.3], the microstructural foundation of rough Heston models as the scaling limit of proper Hawkes processes [9], the closed-form characteristic function of rough Heston models up to the solution of a fractional Riccati equation [11], and the hedging strategy for options under rough Heston models [10]. In this paper, we are particularly interested in the affine Volterra processes [3] because these models embrace the rough Heston model [11] as a special case. The characteristic function in [11] is extended to the exponential-affine transform formula in terms of Riccati-Volterra equations [3]. Affine Volterra processes are applied to finance problems in [23]. In addition, an alternative rough version of the Heston model is introduced in [20], where some asymptotic results are derived.

While the rough volatility literature focuses on option pricing, only a few works contribute to portfolio optimization, such as [12, 13, 4]. All of them consider utility maximization. To the best of our knowledge, this is the first paper to consider mean-variance (MV) portfolio selection under a rough stochastic environment. The MV criterion in portfolio selection, pioneered by Markowitz's seminal work, is the cornerstone of modern portfolio theory. We cannot give a full list of research outputs related to this Nobel Prize winning work, but mention contributions in continuous-time settings [36, 27, 26, 6, 22, 31] as important references.
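To make the power law (1.2) concrete, the following minimal sketch evaluates $\tau^{H-1/2}$ for a rough Hurst parameter and for the Brownian value $H = 0.5$. The function name `atm_skew_power_law`, the proportionality constant `scale`, and the chosen maturities are illustrative assumptions, not quantities from the paper.

```python
def atm_skew_power_law(tau, hurst, scale=1.0):
    """Leading-order ATM skew from (1.2): scale * tau^(H - 1/2).

    `scale` is a hypothetical proportionality constant; (1.2) only fixes the exponent.
    """
    return scale * tau ** (hurst - 0.5)


# Compare a rough regime (H = 0.1) with the Brownian regime (H = 0.5)
for tau in (1.0, 0.1, 0.01):
    rough = atm_skew_power_law(tau, hurst=0.1)
    brownian = atm_skew_power_law(tau, hurst=0.5)
    print(f"tau={tau:5.2f}  H=0.1: {rough:8.4f}  H=0.5: {brownian:8.4f}")
```

With $H = 0.5$ the exponent vanishes and the skew stays flat as $\tau \downarrow 0$, whereas any $H < 0.5$ gives a negative exponent and hence the explosion described above.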
We formulate the MV portfolio selection under the Volterra Heston model in a reasonably rigorous manner. As pointed out by [3, 23], the Volterra Heston model (2.6)-(2.7) has a unique in law weak solution, but its pathwise uniqueness is still an open question in general. This forces us to consider the MV problem under a general filtration $\mathbb{F}$ that satisfies the usual conditions but may not be the augmented filtration generated by the Brownian motion. A similar general setting also appears in [22]. We emphasize that the probability basis and Brownian motions are always fixed for the problem in Section 3. Therefore, our formulation is still considered to be a strong formulation, because the filtered probability space and Brownian motions are not parts of the control.

Under such a problem formulation, we construct in Section 4 an auxiliary stochastic process $M_t$ to solve the MV portfolio selection by completion of squares. Several properties of $M_t$ are derived in Theorem 4.1, which is a main result of this paper. Like [11, 10, 3], we encounter difficulties due to the non-Markovian and non-semimartingale structure of the Volterra Heston model (2.6)-(2.7). Inspired by the exponential-affine formulas in [3, 11], the process $M_t$ is constructed upon the forward variance under a proper alternative measure. The explicit solution for the optimal investment strategy is obtained in Theorem 4.3.

Under the rough Heston model, we investigate the impact of roughness on the optimal investment strategy $u^*$. Recently, a trading strategy has been proposed to leverage the information of roughness [18]. The strategy longs the roughest stocks and shorts the smoothest stocks. Excess returns from this strategy are not fully explained by standard factor models like the CAPM and the Fama-French model. We examine this trading signal under the MV setting. Our theory predicts that the effect of roughness on the investment strategy reverses direction depending on the level of the volatility of volatility (vol-of-vol).
We also discuss the roughness effect on the efficient frontier.

The rest of the paper is organized as follows. Section 2 presents the Volterra Heston model and some useful properties. We discuss a related Riccati-Volterra equation. We then formulate the MV portfolio selection problem in Section 3 and solve it explicitly in Section 4. Numerical illustrations are given in Section 5. Section 6 concludes the paper. The existence and uniqueness of the solution to Riccati-Volterra equations are summarized in Appendix A. An auxiliary result used in Theorem 4.1 is proved in Appendix B.

2 The Volterra Heston model

Our problem is defined under a given complete probability space $(\Omega, \mathcal{F}, \mathbb{P})$, with a filtration $\mathbb{F} = \{\mathcal{F}_t\}_{0 \le t \le T}$ satisfying the usual conditions, supporting a two-dimensional Brownian motion $W = (W_1, W_2)$. The filtration $\mathbb{F}$ is not necessarily the augmented filtration generated by $W$; thus, it can be a strictly larger filtration. This consideration is different from some previous studies like [27, 26, 31] but is consistent with [22] for the MV hedging problem under a general filtration. This consideration is important because the stochastic Volterra equation (2.6)-(2.7) only has a unique in law weak solution, and its strong uniqueness is still an open question in general. Recall that for stochastic differential equations, $X$ is referred to as a strong solution if it is adapted to the augmented filtration generated by $W$, and a weak solution otherwise. For a weak solution, the driving Brownian motion $W$ is also a part of the solution [30, Chapter IX]. Therefore, $\mathbb{F}$ cannot be simply chosen as the augmented filtration generated by $W$, as extra information may be needed to construct a solution to (2.6)-(2.7).

To proceed, we introduce a kernel $K(\cdot) \in L^2_{\mathrm{loc}}(\mathbb{R}_+, \mathbb{R})$, where $\mathbb{R}_+ = \{x \in \mathbb{R} \mid x \ge 0\}$, and make the following standing assumption throughout the paper, in line with [3, 23]. A function $f$ is called completely monotone on $(0, \infty)$ if it is infinitely differentiable on $(0, \infty)$ and $(-1)^k f^{(k)}(t) \ge 0$ for all $t > 0$ and $k = 0, 1, \ldots$.

Assumption 2.1. $K$ is strictly positive and completely monotone on $(0, \infty)$. There is $\gamma \in (0, 2]$ such that $\int_0^h K(t)^2\,dt = O(h^\gamma)$ and $\int_0^T (K(t+h) - K(t))^2\,dt = O(h^\gamma)$ for every $T < \infty$.

The convolutions $K * L$ and $L * K$ for a measurable kernel $K$ on $\mathbb{R}_+$ and a measure $L$ on $\mathbb{R}_+$ of locally bounded variation are defined by

$$(K * L)(t) = \int_{[0,t]} K(t-s)\,L(ds) \quad \text{and} \quad (L * K)(t) = \int_{[0,t]} L(ds)\,K(t-s) \tag{2.1}$$

for $t > 0$, and are extended to $t = 0$ by right-continuity if possible. If $F$ is a function on $\mathbb{R}_+$, let

$$(K * F)(t) = \int_0^t K(t-s) F(s)\,ds. \tag{2.2}$$

Let $W$ be a 1-dimensional continuous local martingale. The convolution between $K$ and $W$ is defined as

$$(K * dW)_t = \int_0^t K(t-s)\,dW_s. \tag{2.3}$$

A measure $L$ on $\mathbb{R}_+$ is called resolvent of the first kind to $K$ if

$$K * L = L * K \equiv \mathrm{id}. \tag{2.4}$$

The existence of a resolvent of the first kind is shown in [19, Theorem 5.5.4] under the complete monotonicity assumption, imposed in Assumption 2.1. Alternative conditions for the existence are given in [19, Theorem 5.5.5].

A kernel $R$ is called the resolvent, or resolvent of the second kind, to $K$ if

$$K * R = R * K = K - R. \tag{2.5}$$

The resolvent always exists and is unique by [19, Theorem 2.3.1].

Further properties of these definitions can be found in [19, 3]. Although the same notion can be defined for higher dimensions and in matrix form, it suffices for us to consider the scalar case. Commonly used kernels [3] summarized in Table 1 satisfy Assumption 2.1 once $c > 0$, $\alpha \in (1/2, 1]$, and $\beta \ge 0$.

| Kernel | $K(t)$ | $R(t)$ | $L(dt)$ |
|---|---|---|---|
| Constant | $c$ | $c e^{-ct}$ | $c^{-1}\delta_0(dt)$ |
| Fractional (power-law) | $c\,\frac{t^{\alpha-1}}{\Gamma(\alpha)}$ | $c\,t^{\alpha-1} E_{\alpha,\alpha}(-c t^{\alpha})$ | $c^{-1}\frac{t^{-\alpha}}{\Gamma(1-\alpha)}\,dt$ |
| Exponential | $c e^{-\beta t}$ | $c e^{-\beta t} e^{-ct}$ | $c^{-1}(\delta_0(dt) + \beta\,dt)$ |

Table 1: Examples of kernels $K$ and their resolvents $R$ and $L$ of the second and first kind. $E_{\alpha,\beta}(z) = \sum_{n=0}^{\infty} \frac{z^n}{\Gamma(\alpha n + \beta)}$ is the Mittag-Leffler function.
See [11, Appendix A1] for its properties. The constant $c \neq 0$.

The variance process within the Volterra Heston model is defined as

$$V_t = V_0 + \kappa \int_0^t K(t-s)(\phi - V_s)\,ds + \int_0^t K(t-s)\,\sigma\sqrt{V_s}\,dB_s, \tag{2.6}$$

where $dB_s = \rho\,dW_{1s} + \sqrt{1-\rho^2}\,dW_{2s}$ and $V_0$, $\kappa$, $\phi$, and $\sigma$ are positive constants. The correlation $\rho$ between stock price and variance is also constant. As documented in [15], the general overall shape of the implied volatility surface does not change significantly, indicating that it is still acceptable to consider a variance process whose parameters are independent of stock price and time. The rough Heston model in [11, 10] becomes a special case of (2.6) once $K(t) = \frac{t^{\alpha-1}}{\Gamma(\alpha)}$. Another rough version of the Heston model studied in [20] is adopted to investigate the power utility maximization [4].

Following [3] and [24, 6, 35, 32], the risky asset (stock) price $S_t$ is assumed to follow

$$dS_t = S_t(r_t + \theta V_t)\,dt + S_t\sqrt{V_t}\,dW_{1t}, \quad S_0 > 0, \tag{2.7}$$

with a deterministic bounded risk-free rate $r_t > 0$ and constant $\theta \neq 0$. The market price of risk, or risk premium, is then given by $\theta\sqrt{V_t}$.

Theorem 2.2. ([3, Theorem 7.1]) Under Assumption 2.1, the stochastic Volterra equation (2.6)-(2.7) has a unique in law $\mathbb{R}_+ \times \mathbb{R}_+$-valued continuous weak solution for any initial condition $(S_0, V_0) \in \mathbb{R}_+ \times \mathbb{R}_+$.

Remark 2.3.

Our model (2.6)-(2.7) is defined under the physical measure, whereas the option pricing model of [3, Equations (7.1)-(7.2)] is under a risk-neutral measure with a zero risk-free rate. However, the proofs are almost identical because the affine structure is maintained and $S$ is determined by $V$.

Remark 2.4.

For strong uniqueness, we mention [2, Proposition B.3] as a related result with kernel $K \in C^1([0, T], \mathbb{R})$ and [29, Proposition 8.1] for certain Volterra integral equations with smooth kernels. However, the strong uniqueness of (2.6)-(2.7) is left open for singular kernels. For weak solutions, one is free to construct the Brownian motion as needed. However, the MV objective only depends on the mathematical expectation for the distribution of the processes. In the sequel, we will only work with a version of the solution to (2.6)-(2.7) and fix the solution $(S, V, W_1, W_2)$, as other solutions have the same law.

The following condition enables us to verify the admissibility of the optimal strategy. To be more precise about the constant $a$, (4.23) gives an explicit sufficiently large value.

Assumption 2.5. $\mathbb{E}\big[\exp\big(a \int_0^T V_s\,ds\big)\big] < \infty$ for a large enough constant $a > 0$.

To verify that Assumption 2.5 holds under reasonable conditions, we consider the Riccati-Volterra equation (2.8) for $g(a, t)$ as follows:

$$g(a, t) = \int_0^t K(t-s)\Big[a - \kappa g(a, s) + \frac{\sigma^2}{2} g^2(a, s)\Big]\,ds. \tag{2.8}$$

The existence and uniqueness of the solution to (2.8) are given in Lemmas A.2 and A.3.

Theorem 2.6.

Suppose Assumption 2.1 holds and the Riccati-Volterra equation (2.8) has a unique continuous solution on $[0, T]$. Then

$$\mathbb{E}\Big[\exp\Big(a \int_0^T V_s\,ds\Big)\Big] = \exp\Big[\kappa\phi \int_0^T g(a, s)\,ds + V_0 \int_0^T \Big[a - \kappa g(a, s) + \frac{\sigma^2}{2} g^2(a, s)\Big]\,ds\Big] < \infty. \tag{2.9}$$

Moreover, denote $L$ as the resolvent of the first kind to $K$. Then

$$\mathbb{E}\Big[\exp\Big(a \int_0^T V_s\,ds\Big)\Big] = \exp\Big[\kappa\phi \int_0^T g(a, s)\,ds + V_0 \int_0^T g(a, T-s)\,L(ds)\Big]. \tag{2.10}$$

Proof.

Note that $g(a, t)$ in (2.8) corresponds to [3, Equation (4.3)] with $u = 0$ and $f = a$. [3, Theorem 4.3] shows the equivalence between [3, Equation (4.4)] and [3, Equation (4.6)]. For $t = T$, the expressions in [3, Equation (4.4)-(4.6)] indicate that

$$a \int_0^T V_s\,ds = Y_0 - \frac{\sigma^2}{2} \int_0^T g^2(a, T-s) V_s\,ds + \sigma \int_0^T g(a, T-s)\sqrt{V_s}\,dB_s, \tag{2.11}$$

with

$$Y_0 = \kappa\phi \int_0^T g(a, s)\,ds + V_0 \int_0^T \Big[a - \kappa g(a, s) + \frac{\sigma^2}{2} g^2(a, s)\Big]\,ds. \tag{2.12}$$

As $g(a, \cdot)$ is continuous on $[0, T]$ and therefore bounded, $\exp\big(-\frac{\sigma^2}{2}\int_0^t g^2(a, T-s) V_s\,ds + \sigma \int_0^t g(a, T-s)\sqrt{V_s}\,dB_s\big)$ is a martingale by [3, Lemma 7.3]. Therefore,

$$\mathbb{E}\Big[\exp\Big(a \int_0^T V_s\,ds\Big)\Big] = \exp(Y_0) = \exp\Big[\kappa\phi \int_0^T g(a, s)\,ds + V_0 \int_0^T \Big[a - \kappa g(a, s) + \frac{\sigma^2}{2} g^2(a, s)\Big]\,ds\Big]. \tag{2.13}$$

Note that $K * L = \mathrm{id}$ implies

$$\int_0^T \Big[a - \kappa g(a, s) + \frac{\sigma^2}{2} g^2(a, s)\Big]\,ds = \int_0^T g(a, T-s)\,L(ds). \tag{2.14}$$

The result follows.

Theorem 2.6 recovers the same expression for $\mathbb{E}\big[\exp\big(a \int_0^T V_s\,ds\big)\big]$ in [10, Theorem 3.2]. We stress that the proof circumvents the use of the Hawkes processes. In addition, we mention [17], which examines the moment explosions in the rough Heston model, as a related reference.

3 Mean-variance portfolio selection

Let $u_t \triangleq \sqrt{V_t}\,\pi_t$ be the investment strategy, where $\pi_t$ is the amount of wealth invested in the stock. Then the wealth process $X_t$ satisfies

$$dX_t = \big(r_t X_t + \theta\sqrt{V_t}\,u_t\big)\,dt + u_t\,dW_{1t}, \quad X_0 = x_0 > 0. \tag{3.1}$$

Definition 3.1.

An investment strategy $u(\cdot)$ is said to be admissible if
(1). $u(\cdot)$ is $\mathbb{F}$-adapted;
(2). $\mathbb{E}\big[\big(\int_0^T |\sqrt{V_t}\,u_t|\,dt\big)^2\big] < \infty$ and $\mathbb{E}\big[\int_0^T |u_t|^2\,dt\big] < \infty$; and
(3). the wealth process (3.1) has a unique solution in the sense of [34, Chapter 1, Definition 6.15], with $\mathbb{P}$-a.s. continuous paths.
The set of all admissible investment strategies is denoted as $\mathcal{U}$.

Remark 3.2.

In Condition (1), $\mathbb{F}$ is possibly strictly larger than the Brownian filtration of $W = (W_1, W_2)$, which means that extra information in addition to $W$ can be used to construct an admissible strategy. In general, $u$ can rely on a local $\mathbb{P}$-martingale that is strongly $\mathbb{P}$-orthogonal to $W$. See hedging strategy (3.6) in [22, Theorem 3.1] for such examples. However, our optimal strategy $u^*$ turns out to only depend on the variance $V$ and Brownian motion $W$, as shown in Theorem 4.3.

Remark 3.3.

We emphasize once again that the underlying probability space and Brownian motions are not parts of our control. Therefore, our formulation should still be referred to as a strong formulation. Readers may refer to [34, Chapter 2, Section 4] for discussions of the difference between strong and weak formulations of stochastic control problems.
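To make the dynamics (2.6) and (3.1) concrete, the following sketch simulates one variance path and the corresponding wealth path for a constant dollar position $\pi$ in the stock, so that $u_t = \sqrt{V_t}\,\pi$. It assumes the fractional kernel $K(t) = t^{\alpha-1}/\Gamma(\alpha)$, a constant risk-free rate, a left-point Euler discretization with truncation of the variance at zero, and illustrative parameter values. None of these choices come from the paper, and the function names are hypothetical.

```python
import math
import numpy as np


def fractional_kernel(t, alpha):
    """K(t) = t^(alpha-1) / Gamma(alpha); alpha = 1 gives the classical kernel K = 1."""
    return t ** (alpha - 1.0) / math.gamma(alpha)


def simulate_wealth(pi, x0=1.0, v0=0.04, kappa=0.3, phi=0.04, sigma=0.1,
                    rho=-0.7, theta=1.0, r=0.02, alpha=0.6, T=1.0,
                    n_steps=200, seed=0):
    """Left-point Euler sketch of (2.6) and (3.1) for a constant dollar position pi.

    Returns the time grid, the variance path and the wealth path.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    t = np.linspace(0.0, T, n_steps + 1)
    dW1 = rng.normal(0.0, math.sqrt(dt), n_steps)
    dW2 = rng.normal(0.0, math.sqrt(dt), n_steps)
    dB = rho * dW1 + math.sqrt(1.0 - rho ** 2) * dW2   # driver of the variance

    V = np.empty(n_steps + 1)
    X = np.empty(n_steps + 1)
    V[0], X[0] = v0, x0
    # increments[j] stores kappa*(phi - V_j)*dt + sigma*sqrt(V_j)*dB_j
    increments = np.zeros(n_steps)
    for i in range(n_steps):
        v_pos = max(V[i], 0.0)             # truncation keeps the square root defined
        root_v = math.sqrt(v_pos)
        increments[i] = kappa * (phi - v_pos) * dt + sigma * root_v * dB[i]
        # wealth step (3.1) with u_t = sqrt(V_t) * pi
        u = root_v * pi
        X[i + 1] = X[i] + (r * X[i] + theta * root_v * u) * dt + u * dW1[i]
        # Volterra step: V_{t_{i+1}} = V_0 + sum_j K(t_{i+1} - t_j) * increments[j]
        lags = t[i + 1] - t[: i + 1]       # every lag is at least dt, so K is finite
        V[i + 1] = v0 + np.dot(fractional_kernel(lags, alpha), increments[: i + 1])
    return t, V, X


t, V, X = simulate_wealth(pi=0.5)
print(f"terminal variance {V[-1]:.4f}, terminal wealth {X[-1]:.4f}")
```

Because the kernel weights every past increment, each step revisits the whole history, which is the computational footprint of the non-Markovian structure. Setting `pi=0.0` leaves only the risk-free growth of wealth, which is a convenient sanity check.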

The MV portfolio selection in continuous-time is the following problem (there are several equivalent formulations).

$$\begin{cases} \min_{u(\cdot) \in \mathcal{U}} J(x_0; u(\cdot)) = \mathbb{E}\big[(X_T - c)^2\big], \\ \text{subject to } \mathbb{E}[X_T] = c, \quad (X(\cdot), u(\cdot)) \text{ satisfy (3.1)}. \end{cases} \tag{3.2}$$

The constant $c$ is the target wealth level at the terminal time $T$. We assume $c \ge x_0 e^{\int_0^T r_s\,ds}$ following [27, 26, 31]. Otherwise, a trivial strategy that puts all of the wealth into the risk-free asset can dominate any other admissible strategy. The MV problem is said to be feasible for $c \ge x_0 e^{\int_0^T r_s\,ds}$ if there exists a $u(\cdot) \in \mathcal{U}$ that satisfies $\mathbb{E}[X_T] = c$. Note that $r_t > 0$ and $\mathbb{E}\big[\int_0^T V_t\,dt\big] > 0$. It is then clear that the feasibility of our problem is guaranteed for any $c \ge x_0 e^{\int_0^T r_s\,ds}$ by a slight modification to the proof in [26, Proposition 6.1].

As Problem (3.2) has a constraint, it is equivalent to the following max-min problem [28].

$$\begin{cases} \max_{\eta \in \mathbb{R}} \min_{u(\cdot) \in \mathcal{U}} J(x_0; u(\cdot)) = \mathbb{E}\big[(X_T - (c - \eta))^2\big] - \eta^2, \\ (X(\cdot), u(\cdot)) \text{ satisfy (3.1)}. \end{cases} \tag{3.3}$$

Let $\zeta = c - \eta$ and consider the inner Problem (3.4) of (3.3) first.

$$\begin{cases} \min_{u(\cdot) \in \mathcal{U}} J(x_0; u(\cdot)) = \mathbb{E}\big[(X_T - \zeta)^2\big] - \eta^2, \\ (X(\cdot), u(\cdot)) \text{ satisfy (3.1)}. \end{cases} \tag{3.4}$$

4 Optimal investment strategy

To solve Problem (3.4), we introduce a new probability measure $\tilde{\mathbb{P}}$ by

$$\frac{d\tilde{\mathbb{P}}}{d\mathbb{P}}\bigg|_{\mathcal{F}_t} = \exp\Big(-2\theta^2 \int_0^t V_s\,ds - 2\theta \int_0^t \sqrt{V_s}\,dW_{1s}\Big), \tag{4.1}$$

where the stochastic exponential is a true martingale [3, Lemma 7.3]. Then $\tilde{W}_{1t} \triangleq W_{1t} + 2\theta \int_0^t \sqrt{V_s}\,ds$ is a new Brownian motion under $\tilde{\mathbb{P}}$. Hence,

$$V_t = V_0 + \int_0^t K(t-s)(\kappa\phi - \lambda V_s)\,ds + \int_0^t K(t-s)\,\sigma\sqrt{V_s}\,d\tilde{B}_s, \tag{4.2}$$

where $\lambda = \kappa + 2\theta\rho\sigma$ and $d\tilde{B}_s = \rho\,d\tilde{W}_{1s} + \sqrt{1-\rho^2}\,dW_{2s}$.

Denote $\tilde{\mathbb{E}}[\cdot]$ and $\tilde{\mathbb{E}}[\cdot \mid \mathcal{F}_t]$ as the $\tilde{\mathbb{P}}$-expectation and conditional $\tilde{\mathbb{P}}$-expectation, respectively. The forward variance under $\tilde{\mathbb{P}}$ is the conditional $\tilde{\mathbb{P}}$-expected variance: $\tilde{\mathbb{E}}[V_s \mid \mathcal{F}_t] \triangleq \xi_t(s)$. The following identity is proven in [23, Proposition 3.2] by an application of [3, Lemma 4.2].

$$\xi_t(s) = \tilde{\mathbb{E}}[V_s \mid \mathcal{F}_t] = \xi_0(s) + \int_0^t \frac{1}{\lambda} R_\lambda(s-u)\,\sigma\sqrt{V_u}\,d\tilde{B}_u, \tag{4.3}$$

where

$$\xi_0(s) = \Big(1 - \int_0^s R_\lambda(u)\,du\Big) V_0 + \frac{\kappa\phi}{\lambda} \int_0^s R_\lambda(u)\,du, \tag{4.4}$$

and $R_\lambda$ is the resolvent of $\lambda K$ such that

$$\lambda K * R_\lambda = R_\lambda * (\lambda K) = \lambda K - R_\lambda. \tag{4.5}$$

If $\lambda = 0$, interpret $R_\lambda / \lambda = K$ and $R_\lambda = 0$.

Consider the stochastic process

$$M_t = 2\exp\Big[\int_t^T \Big(2 r_s - \theta^2 \xi_t(s) + \frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2(T-s)\,\xi_t(s)\Big)\,ds\Big], \tag{4.6}$$

where

$$\psi(t) = \int_0^t K(t-s)\Big[\frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2(s) - \lambda\psi(s) - \theta^2\Big]\,ds. \tag{4.7}$$

The existence and uniqueness of the solution to (4.7) are established in Lemma A.4.

The process $M$ is the key to applying the completion of squares technique in Theorem 4.3, inspired by [27, 26, 31]. Heuristically speaking, the non-Markovian and non-semimartingale characteristics of the Volterra Heston model are overcome by considering $M$. The construction of $M$ is based on the following observations.
To make a completion of squares, we need an auxiliary process $M$ as an additional stochastic factor in a place consistent with previous studies of MV portfolios under semimartingales. The completion of squares procedure for proving Theorem 4.3 indicates that $M$ should satisfy (4.8). We then link $M$ with the conditional expectation in (4.13) via a proper transformation. The exponential-affine transform formula in [3, Equation (4.7)] is applied to obtain (4.6).

Theorem 4.1.

Assume Assumption 2.1 holds and (4.7) has a unique continuous solution on $[0, T]$. Then $M$ satisfies the following properties.

(1). $M_t$ is essentially bounded and $0 < M_t < 2 e^{2\int_t^T r_s\,ds}$, $\mathbb{P}$-a.s., $\forall\, t \in [0, T)$. $M_T = 2$.

(2). Apply Itô's lemma to $M$ on $t$, then

$$dM_t = \big[-2 r_t + \theta^2 V_t\big] M_t\,dt + \Big[2\theta\sqrt{V_t}\,U_{1t} + \frac{U_{1t}^2}{M_t}\Big]\,dt + U_{1t}\,dW_{1t} + U_{2t}\,dW_{2t}, \tag{4.8}$$

where

$$U_{1t} = \rho\sigma M_t \sqrt{V_t}\,\psi(T-t), \tag{4.9}$$

$$U_{2t} = \sqrt{1-\rho^2}\,\sigma M_t \sqrt{V_t}\,\psi(T-t). \tag{4.10}$$

(3).

$$M_0 = 2\exp\Big[2\int_0^T r_s\,ds + \kappa\phi \int_0^T \psi(s)\,ds + V_0 \int_0^T \Big[\frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2(s) - \lambda\psi(s) - \theta^2\Big]\,ds\Big]. \tag{4.11}$$

Furthermore, for the fractional kernel $K(t) = \frac{t^{\alpha-1}}{\Gamma(\alpha)}$, denote the fractional integral as $I^\alpha \psi(t) = K * \psi(t)$. Then

$$M_0 = 2\exp\Big[2\int_0^T r_s\,ds + \kappa\phi\,I^1\psi(T) + V_0\,I^{1-\alpha}\psi(T)\Big]. \tag{4.12}$$

(4). $\mathbb{E}\big[\big(\int_0^T U_{it}^2\,dt\big)^{p/2}\big] < \infty$ for $p \ge 1$, $i = 1, 2$.

Proof. Property (1).

It is straightforward to see that $M_t > 0$. If $1 - 2\rho^2 = 0$, note that $\int_t^T \xi_t(s)\,ds > 0$, $\mathbb{P}$-a.s., by Lemma B.1; then $M_t < 2 e^{2\int_t^T r_s\,ds}$, $\mathbb{P}$-a.s. If $1 - 2\rho^2 \neq 0$, we claim

$$M_t^{1-2\rho^2} = 2^{1-2\rho^2} \exp\Big[2(1-2\rho^2)\int_t^T r_s\,ds\Big]\,\tilde{\mathbb{E}}\Big[\exp\Big(-\theta^2(1-2\rho^2)\int_t^T V_s\,ds\Big)\,\Big|\,\mathcal{F}_t\Big]. \tag{4.13}$$

It is equivalent to show that

$$\tilde{\mathbb{E}}\Big[\exp\Big(-\theta^2(1-2\rho^2)\int_t^T V_s\,ds\Big)\,\Big|\,\mathcal{F}_t\Big] = \exp\Big[\int_t^T \Big(-(1-2\rho^2)\theta^2 \xi_t(s) + \frac{(1-2\rho^2)^2\sigma^2}{2}\,\psi^2(T-s)\,\xi_t(s)\Big)\,ds\Big]. \tag{4.14}$$

Denote $\tilde{\psi} = (1-2\rho^2)\psi$. Then $\tilde{\psi}$ satisfies

$$\tilde{\psi} = K * \Big(\frac{\sigma^2}{2}\tilde{\psi}^2 - \lambda\tilde{\psi} - (1-2\rho^2)\theta^2\Big). \tag{4.15}$$

Therefore, (4.14) holds for all $t \in [0, T]$ by [3, Theorem 4.3] applied to $\tilde{\psi}$. The martingale assumption in [3, Theorem 4.3] is verified by [3, Lemma 7.3].

If $1 - 2\rho^2 > 0$, then $\tilde{\mathbb{E}}\big[\exp\big(-\theta^2(1-2\rho^2)\int_t^T V_s\,ds\big) \mid \mathcal{F}_t\big] < 1$, $\mathbb{P}$-a.s., which implies $M_t < 2 e^{2\int_t^T r_s\,ds}$, $\mathbb{P}$-a.s. The case $1 - 2\rho^2 < 0$ is analogous: the conditional expectation in (4.13) then exceeds 1, and taking the negative power $1/(1-2\rho^2)$ reverses the inequality, which again gives $M_t < 2 e^{2\int_t^T r_s\,ds}$, $\mathbb{P}$-a.s.

Property (2).

Denote $M_t = 2 e^{Z_t}$ in (4.6) with proper $Z_t$. We first derive the equation for $dZ_t$. From (4.3), apply Itô's lemma to $\xi_t(s)$ on time $t$ and get

$$d\xi_t(s) = \frac{1}{\lambda} R_\lambda(s-t)\,\sigma\sqrt{V_t}\,d\tilde{B}_t. \tag{4.16}$$

Then

$$\begin{aligned} dZ_t ={}& \Big[-2 r_t + \theta^2 V_t - \frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2(T-t) V_t\Big]\,dt - \theta^2 \int_t^T \frac{1}{\lambda} R_\lambda(s-t)\,\sigma\sqrt{V_t}\,d\tilde{B}_t\,ds \\ & + \frac{(1-2\rho^2)\sigma^2}{2} \int_t^T \psi^2(T-s)\,\frac{1}{\lambda} R_\lambda(s-t)\,\sigma\sqrt{V_t}\,d\tilde{B}_t\,ds \\ ={}& \Big[-2 r_t + \theta^2 V_t - \frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2(T-t) V_t\Big]\,dt - \theta^2 \int_t^T \frac{\sigma}{\lambda} R_\lambda(s-t)\,ds\,\sqrt{V_t}\,d\tilde{B}_t \\ & + \frac{(1-2\rho^2)\sigma^2}{2} \int_t^T \sigma\psi^2(T-s)\,\frac{1}{\lambda} R_\lambda(s-t)\,ds\,\sqrt{V_t}\,d\tilde{B}_t \\ ={}& \Big[-2 r_t + \theta^2 V_t - \frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2(T-t) V_t\Big]\,dt \\ & + d\tilde{B}_t \cdot \sigma\sqrt{V_t} \int_t^T \Big[\frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2(T-s) - \theta^2\Big] \frac{1}{\lambda} R_\lambda(s-t)\,ds. \end{aligned}$$

The second equality is guaranteed by the stochastic Fubini theorem [33].

We claim the following representation for (4.9)-(4.10).

$$U_{1t} = \sigma\rho M_t \sqrt{V_t} \int_t^T \Big[\frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2(T-s) - \theta^2\Big] \frac{1}{\lambda} R_\lambda(s-t)\,ds, \tag{4.17}$$

$$U_{2t} = \sigma\sqrt{1-\rho^2}\,M_t \sqrt{V_t} \int_t^T \Big[\frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2(T-s) - \theta^2\Big] \frac{1}{\lambda} R_\lambda(s-t)\,ds. \tag{4.18}$$

Indeed, we only have to show

$$\int_t^T \Big[\frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2(T-s) - \theta^2\Big] \frac{1}{\lambda} R_\lambda(s-t)\,ds = \psi(T-t). \tag{4.19}$$

Although one can verify (4.19) in the same fashion as [3, Lemma 4.4], we still detail the derivation here to keep the paper self-contained.
As

$$\int_t^T \Big[\frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2(T-s) - \theta^2\Big] \frac{1}{\lambda} R_\lambda(s-t)\,ds = \int_0^{T-t} \Big[\frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2(T-t-s) - \theta^2\Big] \frac{1}{\lambda} R_\lambda(s)\,ds = \Big[\frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2 - \theta^2\Big] * \frac{1}{\lambda} R_\lambda\,(T-t),$$

we have

$$\begin{aligned} & \int_t^T \Big[\frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2(T-s) - \theta^2\Big] \frac{1}{\lambda} R_\lambda(s-t)\,ds - \psi(T-t) \\ &= \Big[\frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2 - \theta^2\Big] * \frac{1}{\lambda} R_\lambda\,(T-t) - K * \Big[\frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2 - \lambda\psi - \theta^2\Big](T-t) \\ &= \Big[\frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2 - \theta^2\Big] * \Big[\frac{1}{\lambda} R_\lambda - K\Big](T-t) + \lambda K * \psi\,(T-t) \\ &= -R_\lambda * K * \Big[\frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2 - \theta^2\Big](T-t) + \lambda K * \psi\,(T-t). \end{aligned}$$

The application of (4.7) leads to

$$R_\lambda * \psi = R_\lambda * K * \Big[\frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2 - \lambda\psi - \theta^2\Big]. \tag{4.20}$$

Consequently,

$$-R_\lambda * K * \Big[\frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2 - \theta^2\Big](T-t) + \lambda K * \psi\,(T-t) = \big[\lambda K - R_\lambda - \lambda K * R_\lambda\big] * \psi\,(T-t) = 0.$$

This shows that

$$dZ_t = \Big[-2 r_t + \theta^2 V_t - \frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2(T-t) V_t\Big]\,dt + \frac{U_{1t}}{M_t}\,d\tilde{W}_{1t} + \frac{U_{2t}}{M_t}\,dW_{2t}. \tag{4.21}$$

Applying Itô's lemma to $M_t = 2 e^{Z_t}$ with function $f(z) = 2 e^z$ yields

$$\begin{aligned} dM_t &= M_t\,dZ_t + \frac{1}{2} M_t\,dZ_t\,dZ_t \\ &= M_t \Big[-2 r_t + \theta^2 V_t - \frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2(T-t) V_t\Big]\,dt + \frac{U_{1t}^2 + U_{2t}^2}{2 M_t}\,dt + U_{1t}\,d\tilde{W}_{1t} + U_{2t}\,dW_{2t} \\ &= \big[-2 r_t + \theta^2 V_t\big] M_t\,dt + \Big[2\theta\sqrt{V_t}\,U_{1t} + \frac{U_{1t}^2}{M_t}\Big]\,dt + U_{1t}\,dW_{1t} + U_{2t}\,dW_{2t}. \end{aligned}$$

Property (3).

The proof for the property of $Y_t$ in [3, Theorem 4.3] indicates

$$\int_0^T \Big[-\theta^2 \xi_0(s) + \frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2(T-s)\,\xi_0(s)\Big]\,ds = \int_0^T \Big[-\theta^2 V_0 + (\kappa\phi - \lambda V_0)\psi(s) + \frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2(s) V_0\Big]\,ds.$$

Under the fractional kernel, we show by integration by parts that

$$\int_0^T \Big[-\theta^2 - \lambda\psi(s) + \frac{(1-2\rho^2)\sigma^2}{2}\,\psi^2(s)\Big]\,ds = I^{1-\alpha}\psi(T). \tag{4.22}$$

This gives the desired result.

Property (4).

It is sufficient to consider the case with $p > 2$. As $\psi(t)$ is continuous on $[0, T]$ and $M_t$ is essentially bounded,

$$\mathbb{E}\Big[\Big(\int_0^T U_{it}^2\,dt\Big)^{p/2}\Big] \le C\,\mathbb{E}\Big[\Big(\int_0^T V_t\,dt\Big)^{p/2}\Big] \le C \int_0^T \mathbb{E}\big[V_t^{p/2}\big]\,dt \le C \sup_{t \in [0,T]} \mathbb{E}\big[V_t^{p/2}\big] < \infty.$$

The last term is finite by [3, Lemma 3.1].

We first propose a candidate optimal control $u^*$. In the following theorem, we prove the admissibility of $u^*$ and the integrability of the corresponding $X^*$. Theorem 4.2 is in the spirit of [27, 26, 31]. Finally, we prove the optimality of $u^*$ in (4.24) by Theorem 4.3.

Theorem 4.2.

Assume Assumption 2.1 holds and (4.7) has a unique continuous solution on $[0, T]$. Denote $A_t \triangleq \theta + \rho\sigma\psi(T-t)$. Suppose Assumption 2.5 holds with the constant $a$ given by the following:

$$a = \max\Big\{2p|\theta| \sup_{t \in [0,T]} |A_t|,\; (8p^2 - 2p) \sup_{t \in [0,T]} A_t^2\Big\}, \quad \text{for certain } p > 2. \tag{4.23}$$

Consider

$$u^*(t) = (\theta + \rho\sigma\psi(T-t))\sqrt{V_t}\,\big(\zeta^* e^{-\int_t^T r_s\,ds} - X_t^*\big), \tag{4.24}$$

where $X_t^*$ is the wealth process under $u^*$ and $\zeta^* = c - \eta^*$ with

$$\eta^* = \frac{e^{-\int_0^T r_s\,ds} M_0 x_0 - e^{-2\int_0^T r_s\,ds} M_0 c}{2 - e^{-2\int_0^T r_s\,ds} M_0}. \tag{4.25}$$

Then $u^*(\cdot)$ in (4.24) is admissible and $X^*$ under $u^*(\cdot)$ satisfies

$$\mathbb{E}\Big[\sup_{t \in [0,T]} |X_t^*|^p\Big] < \infty, \tag{4.26}$$

for $p \ge 1$. Moreover,

$$\zeta^* e^{-\int_t^T r_s\,ds} - X_t^* \ge 0, \quad \mathbb{P}\text{-a.s.}, \ \forall\, t \in [0, T]. \tag{4.27}$$

Proof.

The wealth process under $u^*$ is given by

$$\begin{cases} dX_t^* = \big[r_t X_t^* + \theta A_t V_t \big(\zeta^* e^{-\int_t^T r_s\,ds} - X_t^*\big)\big]\,dt + A_t \sqrt{V_t}\,\big(\zeta^* e^{-\int_t^T r_s\,ds} - X_t^*\big)\,dW_{1t}, \\ X_0^* = x_0. \end{cases} \tag{4.28}$$

To find a solution to $X^*$, define $Y_t$ satisfying

$$\begin{cases} dY_t = -r_t Y_t\,dt - \theta\sqrt{V_t}\,Y_t\,dW_{1t} + Y_t \sqrt{1-\rho^2}\,\sigma\psi(T-t)\sqrt{V_t}\,dW_{2t}, \\ Y_0 = M_0 \big(\zeta^* e^{-\int_0^T r_s\,ds} - x_0\big). \end{cases} \tag{4.29}$$

The unique solution of $Y_t$ is given by

$$Y_t = Y_0 \exp\Big[-\int_0^t \Big(r_s + \frac{\theta^2}{2} V_s + \frac{(1-\rho^2)\sigma^2}{2}\,\psi^2(T-s) V_s\Big)\,ds - \int_0^t \theta\sqrt{V_s}\,dW_{1s} + \int_0^t \sqrt{1-\rho^2}\,\sigma\psi(T-s)\sqrt{V_s}\,dW_{2s}\Big].$$

Itô's lemma yields

$$X_t^* = \zeta^* e^{-\int_t^T r_s\,ds} - \frac{Y_t}{M_t} \tag{4.30}$$

as the unique solution of the wealth process. Indeed,

$$d\,\frac{Y_t}{M_t} = \Big[r_t \frac{Y_t}{M_t} - \theta A_t V_t \frac{Y_t}{M_t}\Big]\,dt - A_t \sqrt{V_t}\,\frac{Y_t}{M_t}\,dW_{1t}. \tag{4.31}$$

The existence of $u^*$ is also guaranteed by the existence of the solution $X^*$. Furthermore, $\frac{Y_t}{M_t} = \frac{Y_0}{M_0}\Phi(t)$, where

$$\Phi(t) \triangleq \exp\Big[\int_0^t \Big[r_s - \Big(\theta A_s + \frac{A_s^2}{2}\Big) V_s\Big]\,ds - \int_0^t A_s \sqrt{V_s}\,dW_{1s}\Big].$$

As $Y_t / M_t \ge 0$, (4.27) follows from (4.30).

For (4.26), note that by Doob's maximal inequality and [3, Lemma 7.3],

$$\begin{aligned} \mathbb{E}\Big[\sup_{t \in [0,T]} |\Phi(t)|^p\Big] &\le C\,\mathbb{E}\Big[\sup_{t \in [0,T]} \Big|e^{-\int_0^t \theta A_s V_s\,ds}\Big|^{2p}\Big] + C\,\mathbb{E}\Big[\sup_{t \in [0,T]} \Big|\exp\Big(-\frac{1}{2}\int_0^t A_s^2 V_s\,ds - \int_0^t A_s \sqrt{V_s}\,dW_{1s}\Big)\Big|^{2p}\Big] \\ &\le C\,\mathbb{E}\Big[e^{2p \int_0^T |\theta A_s| V_s\,ds}\Big] + C\,\mathbb{E}\Big[\exp\Big(-\int_0^T p A_s^2 V_s\,ds - \int_0^T 2p A_s \sqrt{V_s}\,dW_{1s}\Big)\Big]. \end{aligned}$$

The first term is finite by Assumption 2.5 with a constant $a = 2p|\theta| \sup_{t \in [0,T]} |A_t|$. The second term is also finite. In fact, by Hölder's inequality and Assumption 2.5 with a constant $a = (8p^2 - 2p) \sup_{t \in [0,T]} A_t^2$,

$$\begin{aligned} & \mathbb{E}\Big[\exp\Big(-\int_0^T p A_s^2 V_s\,ds - \int_0^T 2p A_s \sqrt{V_s}\,dW_{1s}\Big)\Big] \\ &\le \Big\{\mathbb{E}\Big[e^{(8p^2 - 2p)\int_0^T A_s^2 V_s\,ds}\Big]\Big\}^{1/2} \Big\{\mathbb{E}\Big[\exp\Big(-8p^2 \int_0^T A_s^2 V_s\,ds - 4p \int_0^T A_s \sqrt{V_s}\,dW_{1s}\Big)\Big]\Big\}^{1/2} < \infty. \end{aligned}$$

Hence $\mathbb{E}\big[\sup_{t \in [0,T]} |X_t^*|^p\big] < \infty$ is proved. As for the admissibility of $u^*$, first note that $u^*$ is $\mathbb{F}$-adapted. For integrability, let $1/\hat{p} + 1/\hat{q} = 1$ with $\hat{p}, \hat{q} > 1$. We have

$$\begin{aligned} \mathbb{E}\Big[\Big(\int_0^T |\sqrt{V_t}\,u_t^*|\,dt\Big)^2\Big] &\le C\,\mathbb{E}\Big[\Big(\int_0^T |A_t V_t \Phi(t)|\,dt\Big)^2\Big] \le C\,\mathbb{E}\Big[\sup_{t \in [0,T]} \Phi^2(t) \Big(\int_0^T V_t\,dt\Big)^2\Big] \\ &\le C \Big\{\mathbb{E}\Big[\sup_{t \in [0,T]} \Phi^{2\hat{p}}(t)\Big]\Big\}^{1/\hat{p}} \Big\{\mathbb{E}\Big[\Big(\int_0^T V_t\,dt\Big)^{2\hat{q}}\Big]\Big\}^{1/\hat{q}} \\ &\le C \Big\{\mathbb{E}\Big[\sup_{t \in [0,T]} \Phi^{2\hat{p}}(t)\Big]\Big\}^{1/\hat{p}} \Big(\sup_{t \in [0,T]} \mathbb{E}\big[V_t^{2\hat{q}}\big]\Big)^{1/\hat{q}} < \infty \end{aligned}$$

and

$$\begin{aligned} \mathbb{E}\Big[\int_0^T |u_t^*|^2\,dt\Big] &\le C\,\mathbb{E}\Big[\int_0^T A_t^2 V_t \Phi^2(t)\,dt\Big] \le C\,\mathbb{E}\Big[\sup_{t \in [0,T]} \Phi^2(t) \int_0^T V_t\,dt\Big] \\ &\le C \Big\{\mathbb{E}\Big[\sup_{t \in [0,T]} \Phi^{2\hat{p}}(t)\Big]\Big\}^{1/\hat{p}} \Big\{\mathbb{E}\Big[\Big(\int_0^T V_t\,dt\Big)^{\hat{q}}\Big]\Big\}^{1/\hat{q}} \\ &\le C \Big\{\mathbb{E}\Big[\sup_{t \in [0,T]} \Phi^{2\hat{p}}(t)\Big]\Big\}^{1/\hat{p}} \Big(\sup_{t \in [0,T]} \mathbb{E}\big[V_t^{\hat{q}}\big]\Big)^{1/\hat{q}} < \infty. \end{aligned}$$

The last terms in the two inequalities above are finite by [3, Lemma 3.1], taking $p = 2\hat{p}$.

We are now ready to prove that $u^*$ in (4.24) is optimal and to derive the efficient frontier.

Theorem 4.3.

Suppose the assumptions in Theorem 4.2 hold. Then the optimal investment strategy for Problem (3.2) is given by (4.24). Moreover, (4.24) is unique under a given solution $(S, V, W_1, W_2)$ to (2.6)-(2.7). The variance of $X_T^*$ is
$$
\mathrm{Var}[X_T^*] = \frac{M_0}{2 - e^{-2\int_0^T r_s\,ds}M_0}\Big(c\,e^{-\int_0^T r_s\,ds} - x_0\Big)^2. \tag{4.32}
$$

Proof.

First, we consider the inner Problem (3.4) with an arbitrary $\zeta \in \mathbb{R}$. Denote $h_t = \zeta e^{-\int_t^T r_s\,ds}$. By Itô's lemma with the property of $M$ and completing the square, for any admissible strategy $u$,
$$
\begin{aligned}
d\,\tfrac12 M_t(X_t-h_t)^2
={}& \tfrac12\Big[(X_t-h_t)^2M_t\theta^2V_t + 2(X_t-h_t)^2\theta\sqrt{V_t}\,U_{1t} + (X_t-h_t)^2\frac{U_{1t}^2}{M_t} \\
&\quad + 2M_t(X_t-h_t)\theta\sqrt{V_t}\,u_t + 2(X_t-h_t)u_tU_{1t} + M_tu_t^2\Big]dt \\
&+ \tfrac12\Big[(X_t-h_t)^2U_{1t} + 2M_t(X_t-h_t)u_t\Big]dW_{1t} + \tfrac12(X_t-h_t)^2U_{2t}\,dW_{2t} \\
={}& \tfrac12 M_t\Big[u_t + \Big(\theta\sqrt{V_t} + \frac{U_{1t}}{M_t}\Big)(X_t-h_t)\Big]^2dt \\
&+ \tfrac12\Big[(X_t-h_t)^2U_{1t} + 2M_t(X_t-h_t)u_t\Big]dW_{1t} + \tfrac12(X_t-h_t)^2U_{2t}\,dW_{2t}.
\end{aligned}
$$
As $M_t$ and $h_t$ are bounded, $E\big[\int_0^T U_{it}^2\,dt\big] < \infty$ for $i = 1, 2$, $u_t$ is admissible, and $X_t$ has $P$-a.s. continuous paths, the stochastic integrals
$$
\int_0^t\Big[(X_s-h_s)^2U_{1s} + 2M_s(X_s-h_s)u_s\Big]dW_{1s}
\quad\text{and}\quad
\int_0^t (X_s-h_s)^2U_{2s}\,dW_{2s}
$$
are $(\mathbb{F}, P)$-local martingales. There is an increasing localizing sequence of stopping times $\{\tau_k\}_{k=1,2,\dots}$ such that $\tau_k \uparrow T$ when $k \to \infty$. The local martingales stopped by $\{\tau_k\}_{k=1,2,\dots}$ are true martingales. Consequently,
$$
\tfrac12 E\big[M_{\tau_k}(X_{\tau_k}-h_{\tau_k})^2\big]
= \tfrac12 M_0(x_0-h_0)^2 + \tfrac12 E\Big[\int_0^{\tau_k} M_t\Big(u_t + \Big(\theta\sqrt{V_t} + \frac{U_{1t}}{M_t}\Big)(X_t-h_t)\Big)^2dt\Big]. \tag{4.33}
$$
From (3.1), by Doob's maximal inequality and the admissibility of $u(\cdot)$,
$$
E\big[X_{\tau_k}^2\big] \le C\Big[x_0^2 + E\Big[\Big(\int_0^T |u_t\sqrt{V_t}|\,dt\Big)^2\Big] + E\Big[\int_0^T u_t^2\,dt\Big]\Big] < \infty. \tag{4.34}
$$
Then $M_{\tau_k}(X_{\tau_k}-h_{\tau_k})^2$ is dominated by a non-negative integrable random variable for all $k$. Sending $k$ to infinity, by the dominated convergence theorem and the monotone convergence theorem, we derive
$$
E\big[(X_T-\zeta)^2\big]
= \tfrac12 M_0(x_0-h_0)^2 + \tfrac12 E\Big[\int_0^T M_t\Big(u_t + \Big(\theta\sqrt{V_t} + \frac{U_{1t}}{M_t}\Big)(X_t-h_t)\Big)^2dt\Big]. \tag{4.35}
$$
Therefore, the cost functional $E[(X_T-\zeta)^2]$ is minimized when
$$
u_t = -\Big(\theta\sqrt{V_t} + \frac{U_{1t}}{M_t}\Big)(X_t-h_t). \tag{4.36}
$$
Then $E[(X_T-\zeta)^2] = \tfrac12 M_0(x_0-h_0)^2$. The uniqueness of $u^*$ follows directly from (4.35) and $M_t > 0$, $P$-a.s., $\forall t \in [0,T]$.

To solve the outer maximization problem in (3.3), consider
$$
J(x_0; u(\cdot)) = \tfrac12 M_0\Big[x_0 - (c-\eta)e^{-\int_0^T r_s\,ds}\Big]^2 - \eta^2. \tag{4.37}
$$
The first and second order derivatives are
$$
\frac{\partial J}{\partial\eta} = M_0\Big[x_0 - (c-\eta)e^{-\int_0^T r_s\,ds}\Big]e^{-\int_0^T r_s\,ds} - 2\eta,
\qquad
\frac{\partial^2 J}{\partial\eta^2} = M_0e^{-2\int_0^T r_s\,ds} - 2 < 0,
$$
where we have used the strict inequality $M_0 < 2e^{2\int_0^T r_s\,ds}$, by Theorem 4.1. Then the optimal value for $\eta$ is given by (4.25), solved from $\frac{\partial J}{\partial\eta} = 0$. $\mathrm{Var}[X_T^*]$ is obtained by direct simplification of $J(x_0; u(\cdot))$ with $\eta^*$.

Although the Volterra Heston model is non-Markovian and non-semimartingale in nature, the optimal control $u^*$ in (4.24) does not rely on the whole volatility path. Moreover, the optimal amount of wealth in the stock, $\pi_t^*$, does not depend on the volatility value directly, but rather on the roughness and dynamics of volatility through parameters and the Riccati-Volterra equation (4.7). If we let kernel $K = \mathrm{id}$, it is then clear that the Volterra Heston model (2.6) reduces to the classic Heston model [21]. Our results in Theorem 4.2 and Theorem 4.3 indicate that the $u^*$ in (4.24) is optimal even under a general filtration $\mathbb{F}$. This extends the corresponding result in [6, 32], where the filtration is chosen as the Brownian filtration.
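The "direct simplification" in the last step of the proof can be spelled out as follows. This is only a sketch: the shorthand $D = e^{-\int_0^T r_s\,ds}$ and $d = cD - x_0$ is introduced here for brevity and is not notation from the paper. With it, $x_0 - (c-\eta)D = \eta D - d$, and

$$
\begin{aligned}
J &= \tfrac12 M_0(\eta D - d)^2 - \eta^2, \\
\frac{\partial J}{\partial\eta} = 0
 &\iff M_0D(\eta D - d) = 2\eta
 \iff \eta^* = -\frac{M_0Dd}{2 - D^2M_0}, \\
\eta^*D - d &= -\frac{2d}{2 - D^2M_0}, \\
J\big|_{\eta=\eta^*} &= \frac{2M_0d^2}{(2 - D^2M_0)^2} - \frac{M_0^2D^2d^2}{(2 - D^2M_0)^2}
 = \frac{M_0\,d^2}{2 - D^2M_0}.
\end{aligned}
$$

The last expression is the right-hand side of (4.32). The same $\eta^*$ gives $\zeta^* = c - \eta^* = \dfrac{2c - DM_0x_0}{2 - D^2M_0}$, which is well defined because $M_0 < 2e^{2\int_0^T r_s\,ds}$ keeps the denominator positive.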
As a sanity check, the following corollary verifies that our solution reduces to the one under the Heston model.

Corollary 4.4. Consider the Heston model, that is, the kernel $K = \mathrm{id}$. Suppose other assumptions in Theorem 4.2 hold, then the optimal strategy (4.24) is the same as the one in [6].

Proof. Without loss of generality, suppose $r_t = 0$, as in [6]. We first match $M_t/2$ with $L_t$ in [6, Equation (3.2)].

Note the resolvent in (4.5) reduces to $R_\lambda(t) = \lambda e^{-\lambda t}$ and the forward variance in (4.3) is
$$
\xi_t(s) = e^{-\lambda(s-t)}V_t + \frac{\kappa\phi}{\lambda}\Big(1 - e^{-\lambda(s-t)}\Big). \tag{4.38}
$$
Therefore,
$$
\int_t^T \xi_t(s)\,ds = \frac{1 - e^{-\lambda(T-t)}}{\lambda}V_t + \frac{\kappa\phi}{\lambda}\Big(T - t - \frac{1 - e^{-\lambda(T-t)}}{\lambda}\Big)
$$
and
$$
\int_t^T \psi^2(T-s)\xi_t(s)\,ds = V_t\int_t^T \psi^2(T-s)e^{-\lambda(s-t)}\,ds + \frac{\kappa\phi}{\lambda}\int_t^T\Big[1 - e^{-\lambda(s-t)}\Big]\psi^2(T-s)\,ds.
$$
Then
$$
\int_t^T\Big[-\theta^2\xi_t(s) + \frac{(1-2\rho^2)\sigma^2}{2}\psi^2(T-s)\xi_t(s)\Big]ds = w(T-t)V_t + y(T-t), \tag{4.39}
$$
where
$$
\begin{aligned}
w(T-t) &\triangleq \frac{(1-2\rho^2)\sigma^2}{2}\int_t^T \psi^2(T-s)e^{-\lambda(s-t)}\,ds - \theta^2\,\frac{1 - e^{-\lambda(T-t)}}{\lambda}, \\
y(T-t) &\triangleq \frac{(1-2\rho^2)\sigma^2}{2}\frac{\kappa\phi}{\lambda}\int_t^T\Big[1 - e^{-\lambda(s-t)}\Big]\psi^2(T-s)\,ds - \theta^2\frac{\kappa\phi}{\lambda}\Big(T - t - \frac{1 - e^{-\lambda(T-t)}}{\lambda}\Big).
\end{aligned}
$$
Replacing $t$ with $T-t$ and taking the derivative in $t$ give
$$
\begin{aligned}
\dot w(t) &= \frac{(1-2\rho^2)\sigma^2}{2}\psi^2(t) - \lambda\frac{(1-2\rho^2)\sigma^2}{2}\int_{T-t}^T \psi^2(T-s)e^{-\lambda(s-T+t)}\,ds - \theta^2e^{-\lambda t} \\
&= \frac{(1-2\rho^2)\sigma^2}{2}\psi^2(t) - \lambda w(t) - \theta^2.
\end{aligned}
$$
Comparing with (4.7), we find $w(t) = \psi(t)$. Moreover,
$$
\dot y(t) = \frac{(1-2\rho^2)\sigma^2}{2}\frac{\kappa\phi}{\lambda}\int_{T-t}^T \lambda e^{-\lambda(s-T+t)}\psi^2(T-s)\,ds - \theta^2\frac{\kappa\phi}{\lambda}\Big(1 - e^{-\lambda t}\Big) = \kappa\phi\,w(t).
$$
$y(t)$ and $w(t)$ satisfy the same ODEs as in [6, Equations (A.1)-(A.4)], with our notations. Therefore, $M_t/2$ matches $L_t$ in [6, Equation (3.2)].

Consider the inner Problem (3.4). With a constant $H = \zeta$, terms in the optimal hedge $\varphi(x, H)$ [6, p.476] are reduced to
$$
\xi = 0, \quad a = \big(\theta + \psi(T-t)\rho\sigma\big)/S_t, \quad V = \zeta, \quad\text{and}\quad x + \varphi(x, H)\cdot S = X_t^*. \tag{4.40}
$$
Then it is clear that the optimal strategies are the same.

5 Numerical studies

In this section, we restrict ourselves to the case with $K(t) = \frac{t^{\alpha-1}}{\Gamma(\alpha)}$, $\alpha \in (1/2, 1)$; $\alpha = 1$ recovers the classic Heston model. We examine the effect of $\alpha$ on the optimal investment strategy and efficient frontier.

The first step is to solve the Riccati-Volterra equation (4.7) numerically. Following [11], we use the fractional Adams method in [7, 8]. The convergence of this numerical method is given in [25]. Readers may refer to [11, Section 5.1] for more details about the procedure.

In Figure (1a), $\psi$ decreases when $\alpha$ becomes smaller under certain specific parameters, close to the calibration result in [11] with one extra risk premium parameter $\theta$. However, one cannot expect $\psi$ to be monotone in $\alpha$ in general (see Figure (1b)). Figures (1a)-(1b) also confirm the claim that $\psi \le 0$ when $1 - 2\rho^2 > 0$.

Figure 1: Plot of $\psi$ under different $\alpha$ ($\alpha$ = 0.6, 0.7, 0.8, 0.9, 1.0). Panel (a): $\psi$ under parameters in [11]. Panel (b): $\psi$ under another setting. Other parameters are as follows. In Figure (1a), vol-of-vol $\sigma = 0.03$, mean-reversion speed $\kappa = 0.1$, risk premium parameter $\theta = 5$, correlation $\rho = -0.$…, and $T = 1$. In Figure (1b), $\sigma = 0.$…, $\kappa = 2.$…, $\theta = 0.$…, $\rho = -0.56$, and $T = 1.35$.

The relationship between $u^*$ and $\alpha$ is not straightforward and may change with different combinations of parameters. We emphasize that the following analysis is based on the parameter setting detailed in the descriptions of the figures. Consider the setting in Figure (1a) first. Interestingly, the effect of $\alpha$ on $u^*$ is significantly influenced by $\sigma$. This can be explained using (4.24). If the correlation $\rho$ between stock and volatility is negative due to the leverage effect in the equity market, $\theta + \rho\sigma\psi(T-t)$ will increase as $\alpha$ decreases, as shown in Figure (1a). In contrast, $\zeta^*e^{-\int_t^T r_s\,ds} - X_t^* \ge 0$, where
$$
\zeta^* = c - \eta^* = \frac{2c - e^{-\int_0^T r_s\,ds}M_0x_0}{2 - e^{-2\int_0^T r_s\,ds}M_0}. \tag{5.1}
$$
The $M_0$ in (4.12) is an increasing function of $\alpha$ because $\psi$ is negative. Then $\zeta^*$ will be smaller if $\alpha$ is smaller, under certain parameters. Therefore, $\zeta^*e^{-\int_t^T r_s\,ds} - X_t^*$ and $\theta + \rho\sigma\psi(T-t)$ move in different directions when $\alpha$ is decreasing. If $\sigma$ is small, $\zeta^*e^{-\int_t^T r_s\,ds} - X_t^*$ will dominate $\theta + \rho\sigma\psi(T-t)$. Then $u^*$ will decrease as $\alpha$ becomes smaller. If $\sigma$ is relatively large, $\theta + \rho\sigma\psi(T-t)$ will dominate $\zeta^*e^{-\int_t^T r_s\,ds} - X_t^*$. Then $u^*$ increases when $\alpha$ becomes smaller. The above effect of vol-of-vol $\sigma$ also appears under the parameter setting in Figure (1b), where $\psi$ is not monotone in $\alpha$. Figures (2a)-(2b) display the optimal investment strategy $u^*$. We make use of the open-source Python package differint (available at https://github.com/differint/differint) to calculate the fractional integrals $I^{1-\alpha}$ and $I^{1}$ in (4.12). Assumption 2.5 is validated under the setting in Figures (2a)-(2b).

Figure 2: Optimal strategy $u^*$ against time (in years) with $\alpha$ = 0.6, 0.7, 0.8, 0.9, and 1.0. Panel (a): $u^*$ under $\sigma = 0.$…. Panel (b): $u^*$ under $\sigma = 3$. In both subplots, we set initial wealth $x_0 = 1$, risk-free rate $r = 0.01$, initial variance $V_0 = 0.5$, long-term mean level $\phi = 0.$…, and $c = x_0e^{(r+0.…)T}$. For simplicity, we set $V_t = 0.$… and $X_t^* = 1$ for all time $t \in [0, T]$. The other parameters are the same as in Figure (1b), namely, $\kappa = 2.$…, $\theta = 0.$…, $\rho = -0.56$, and $T = 1.35$. Figures (2a)-(2b) only differ in the vol-of-vol $\sigma$.

Figures (2a)-(2b) are a sensitivity analysis: we keep most of the parameters unchanged and vary a few of them. Specifically, the use of constant $V_t$ and $X_t^*$ in Figures (2a)-(2b) has the following interpretation. We are interested in the sensitivity of the optimal control to the Hurst parameter through $\alpha$. With the other parameters fixed, if we observe $V_t = 0.$… and $X_t^* = 1$ at $t \in [0, T]$, Figures (2a)-(2b) illustrate the marginal effect of the Hurst parameter on the investment strategy. The constant values of $V_t$ and $X_t^*$ are not from a realized path.

Figures (2a)-(2b) only provide a marginal effect of $\alpha$; thus, we conduct a further numerical analysis under the settings in [1]. Consider a realistic situation in which the investor calibrates two sets of parameters for the Heston model and rough Heston model for a given implied volatility surface. We contrast the two strategies induced from the calibrated parameters. Figure (3c) exhibits the optimal amount of wealth $\pi^*$ with one simulation path of $V_t$ in Figure (3b) by the lifted Heston approach [1]. Assumption 2.5 holds true under the setting in Figure 3. Figure (3a) plots $A_t = \theta + \rho\sigma\psi(T-t)$. Furthermore, $\zeta^* = 30.$… $\zeta^* = 21.$… $\zeta^*$ reported. A rough Heston investor has a larger $A_t\zeta^*$ but a smaller $A_t$. Moreover, Figure (4c) illustrates that the rough Heston strategy has an average terminal wealth closer to the target $c = 1.$… $\alpha$.

Figures (2a)-(2b) indicate that $\alpha$ is not the only factor determining the investment in a stock. The trading idea in [18] agrees with Figure (2b), because the optimal investment position $u^*$ is larger for a smaller $\alpha$. However, an inconsistency occurs in Figure (2a). Indeed, if we use the VVIX index as a proxy for the vol-of-vol, then the vol-of-vol seems larger in 2007, 2008, 2010, and 2015. The buy-rough-sell-smooth strategy [18] performs better in 2005, 2007, 2008, 2010, and 2014 than in other years, as shown in [18, Figure 3].
This consistency suggests that vol-of-vol may also be important when …

Figure 3: Investment strategies under the Heston and rough Heston models, plotted against time (in days). Panel (a): $A_t$. Panel (b): volatility. Panel (c): $\pi^*$. The variance process is simulated with the lifted Heston model in [1]. The parameters for simulation are specified in [1, Equations (23) and (26)] with $\alpha = 0.6$. The path is rougher than that of the classic Heston model. Moreover, we implement the Euler scheme for the stock process. The simulation is run with 250 time steps for one year, corresponding to the 250 trading days in a year. The investor under the Heston model uses the calibrated parameters in [1, Table 6] to implement the optimal strategy with $\alpha = 0.$… $x_0 = 1$, $r = 0.$…, $\theta = 0.$…, $T = 1$, and $c = x_0e^{(r+0.…)T} = 1.$…

Figure 4: Statistics for strategies and wealth. Panel (a): $u^*$ under the rough Heston model. Panel (b): $u^*$ under the Heston model. Panel (c): Wealth. Based on 3000 simulated paths, the solid line plots the mean and the shadow area is the 95% confidence interval estimated by bootstrapping. The rough Heston model suggests investing more and the terminal wealth is closer to the expected value $c = 1.$…

… $\alpha$ and expected wealth level $c$. Their relationship is clear, and the variance of the optimal wealth is reduced if $\alpha$ decreases, as $M_0$ decreases when $\alpha$ decreases and $\mathrm{Var}[X_T^*]$ in (4.32) is an increasing function of $M_0$. We have also verified Assumption 2.5 under the setting in Figures (5a)-(5b).

Figure 5: Plots of the efficient frontier and variance. Panel (a): Efficient frontier ($E[X_T^*]$ against $\mathrm{Var}[X_T^*]$). Panel (b): $\mathrm{Var}[X_T^*]$ under different expected value $c$ ($c$ = 1.2, 1.3, 1.4, 1.5). Roughness parameter $\alpha \in [0.…, …]$, $r = 0.$…, $V_0 = 0.$…, $x_0 = 1$, $\phi = 0.$…, $\sigma = 0.$…, $\kappa = 0.$…, $\theta = 0.$…, $\rho = -0.$…, $T = 1$, and $c \in [x_0e^{(r+0.…)T}, x_0e^{(r+0.…)T}]$.

To the best of our knowledge, this is the first study of the continuous-time Markowitz's mean-variance portfolio selection problem under a rough stochastic environment. We specifically focus on the Volterra Heston model. By deriving the optimal strategy and efficient frontier, we obtain further insights into the effect of roughness on them.

There are many possible future research directions. Natural considerations are the utility maximization and time-inconsistency of the MV criterion. In addition, we have already included model ambiguity with rough volatility in our research agenda.
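The numerical studies solve the Riccati-Volterra equation (4.7) with the fractional Adams method of [7, 8]. The following Python sketch shows one way such a predictor-corrector solver might look. It assumes that (4.7) reads $\psi = K * \big(\tfrac12(1-2\rho^2)\sigma^2\psi^2 - \lambda\psi - \theta^2\big)$ with $K(t) = t^{\alpha-1}/\Gamma(\alpha)$, which is the form used in Lemma A.4. The function names, the grid size, and the parameter values in the usage lines are illustrative choices and are not the authors' code or settings.

```python
import math
import numpy as np


def riccati_rhs(psi, rho, sigma, lam, theta):
    """Quadratic right-hand side assumed for (4.7)."""
    return 0.5 * (1.0 - 2.0 * rho**2) * sigma**2 * psi**2 - lam * psi - theta**2


def solve_psi_adams(alpha, T, n, rho, sigma, lam, theta):
    """Fractional Adams predictor-corrector scheme on a uniform grid of n steps.

    Returns the grid t_0..t_n and the approximations psi(t_0)..psi(t_n).
    """
    h = T / n
    psi = np.zeros(n + 1)            # psi(0) = 0
    F = np.zeros(n + 1)              # F[j] = riccati_rhs(psi[j])
    F[0] = riccati_rhs(0.0, rho, sigma, lam, theta)
    cb = h**alpha / math.gamma(alpha + 1.0)   # scale of predictor weights
    ca = h**alpha / math.gamma(alpha + 2.0)   # scale of corrector weights
    for k in range(n):
        j = np.arange(k + 1)
        # Predictor: product rectangle rule over the whole history.
        b = cb * ((k + 1 - j) ** alpha - (k - j) ** alpha)
        pred = b @ F[: k + 1]
        # Corrector: product trapezoidal rule, using the predicted value
        # only for the newest node.
        a = np.empty(k + 1)
        a[0] = ca * (k ** (alpha + 1.0) - (k - alpha) * (k + 1) ** alpha)
        i = j[1:]
        a[1:] = ca * ((k - i + 2) ** (alpha + 1.0) + (k - i) ** (alpha + 1.0)
                      - 2.0 * (k - i + 1) ** (alpha + 1.0))
        psi[k + 1] = a @ F[: k + 1] + ca * riccati_rhs(pred, rho, sigma, lam, theta)
        F[k + 1] = riccati_rhs(psi[k + 1], rho, sigma, lam, theta)
    return np.linspace(0.0, T, n + 1), psi


# Illustrative usage with made-up parameters (not those of the figures).
t_grid, psi_rough = solve_psi_adams(alpha=0.6, T=1.0, n=200,
                                    rho=-0.56, sigma=0.3, lam=1.8, theta=0.5)
```

The cost is quadratic in the number of steps because every step revisits the full history, which reflects the non-Markovian structure of the equation. For $\alpha = 1$ the weights collapse to the ordinary trapezoidal rule, so the output can be checked against the closed-form solution of the classic Riccati ODE.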
Acknowledgements

The authors would like to thank two anonymous referees and the Editor for their careful reading and valuable comments, which have greatly improved the manuscript.

A Solutions of Riccati-Volterra equations

To demonstrate the existence and uniqueness of the solution to a Riccati-Volterra equation, we first rephrase the following result from a recent monograph [5] with more general assumptions. The underlying idea of the proof is the Picard iteration.
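As a purely numerical illustration of the Picard idea (it is not the proof in [5]), the sketch below iterates the map $f \mapsto K * (c_0 + c_1 f + c_2 f^2)$ on a grid for equations of the quadratic Volterra type treated in this appendix. The trapezoidal discretization, the restriction to a bounded kernel, the function name, and the test case $K \equiv 1$, $c_0 = c_2 = 1$, $c_1 = 0$ (whose exact solution is $\tan t$) are all assumptions made for this example.

```python
import numpy as np


def picard_volterra(kernel, c0, c1, c2, delta, n=200, iters=30):
    """Picard iteration for f = K * (c0 + c1 f + c2 f^2) on [0, delta].

    `kernel` must be a bounded, vectorized function of the lag t - s.
    The convolution is discretized with the trapezoidal rule.
    """
    t = np.linspace(0.0, delta, n + 1)
    h = delta / n
    lag = np.clip(t[:, None] - t[None, :], 0.0, None)   # t_i - t_j for j <= i
    # Trapezoidal weights on [0, t_i]: h/2 at both end points, h inside.
    W = np.tril(np.full((n + 1, n + 1), h))
    W[:, 0] *= 0.5
    W[np.arange(n + 1), np.arange(n + 1)] *= 0.5
    W[0, :] = 0.0                                        # integral over [0, 0]
    A = W * kernel(lag)
    f = np.zeros(n + 1)                                  # initial guess f_0 = 0
    for _ in range(iters):
        f = A @ (c0 + c1 * f + c2 * f**2)                # one Picard step
    return t, f


# Example: K = 1 gives f' = 1 + f^2 with f(0) = 0, i.e. f(t) = tan(t).
t, f = picard_volterra(lambda u: np.ones_like(u), 1.0, 0.0, 1.0, delta=0.5)
```

The iteration is only run on a short interval, in line with the local nature of the existence result. The fractional kernel is singular at zero and would need a product-integration rule instead of the plain trapezoidal weights used here.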

Theorem A.1.

Suppose kernel $K(\cdot)$ is bounded or is the fractional kernel with $\alpha \in (0, 1)$. Let $c_0, c_1, c_2$ be constants. Then there exists $\delta > 0$ such that
$$
f(t) = \int_0^t K(t-s)\big[c_0 + c_1f(s) + c_2f^2(s)\big]\,ds \tag{A.1}
$$
has a unique continuous solution $f$ on $[0, \delta]$.

Proof. Note that a quadratic function is locally Lipschitz; then according to Theorem 3.1.2 and Theorem 3.1.4 in [5], the claim holds.

However, $\delta$ in Theorem A.1 is not explicit. Tighter results exist if more assumptions are imposed.

We investigate $g(a, t)$ in (2.8) first. Based on [16, Theorem A.5], we have

Lemma A.2.

Suppose Assumption 2.1 holds and κ − aσ > . Then (2.8) has a uniqueglobal continuous solution. Moreover, < g ( a, t ) ≤ r ( t ) < w ∗ , ∀ t > , (A.2) where w ∗ (cid:44) κ −√ κ − aσ σ and r ( t ) (cid:44) Q − (cid:0) (cid:82) t K ( s ) ds (cid:1) ; that is, the inverse function of Q , givenby Q ( w ) = (cid:90) w dua − κu + σ u . (A.3)18 roof. To apply the result in [16, Theorem A.5], we deﬁne H ( w ) = a − κw + σ w . Then H ( w ) satisﬁes Assumption A.1 in [16] with w max (cid:44) κσ and w ∗ deﬁned above. The claimfollows from [16, Theorem A.5 (c)] with a ( t ) ≡ K ( t ) = t α − Γ( α ) , [10, Theorem 3.2] obtains the following tighterresults and the proof is based on the scaling limits of the Hawkes processes. Lemma A.3. If K ( t ) = t α − Γ( α ) , α ∈ (1 / , , then g ( a, t ) in (2.8) satisﬁes g ( a, t ) ≤ cσ (cid:104) κ + t − α Γ(1 − α ) + σ (cid:112) a ( t ) − a (cid:105) , (A.4) with a ( t ) = σ (cid:2) κ + t − α Γ(1 − α ) (cid:3) and a constant c > . In other words, if a < a ( T ) , thenAssumption 2.5 is satisﬁed. Next, we study ψ ( · ) in (4.7). (4.7) has a unique continuous solution on some interval [0 , δ ] ifthe conditions in Theorem A.1 are satisﬁed. Without Theorem A.1, we also have the followingresult. Lemma A.4.

Suppose Assumption 2.1 holds.

(1). If $1 - 2\rho^2 > 0$, then (4.7) has a unique global continuous solution $\psi \in L^2_{loc}(\mathbb{R}_+, \mathbb{R})$ and $\psi < 0$ for $t > 0$.

(2). If $1 - 2\rho^2 = 0$, then (4.7) is linear and has a unique continuous solution on $[0, T]$.

(3). If $1 - 2\rho^2 < 0$, further assume $\lambda > 0$ and $\lambda^2 + 2(1-2\rho^2)\theta^2\sigma^2 > 0$. Then (4.7) has a unique global continuous solution. Moreover,
$$
\frac{\bar w_*}{1-2\rho^2} < \frac{\bar r(t)}{1-2\rho^2} \le \psi(t) < 0, \quad \forall\, t > 0, \tag{A.5}
$$
with $\bar w_* = \frac{\lambda - \sqrt{\lambda^2 + 2(1-2\rho^2)\theta^2\sigma^2}}{\sigma^2}$ and $\bar r(t) \triangleq \bar Q^{-1}\big(\int_0^t K(s)\,ds\big)$, where
$$
\bar Q(w) = \int_0^w \frac{du}{\frac12\sigma^2u^2 - \lambda u - (1-2\rho^2)\theta^2}. \tag{A.6}
$$

Proof.

The claim in (1) follows from [3, Theorem 7.1]. The continuity follows from the uniqueness of the global solution and [19, Theorem 12.1.1]. The claim in (2) is classic and can be found in [5, Theorem 1.2.3]. For (3), we consider $\tilde\psi = (1-2\rho^2)\psi$. Then $\tilde\psi$ satisfies
$$
\tilde\psi = K * \Big(\frac{\sigma^2}{2}\tilde\psi^2 - \lambda\tilde\psi - (1-2\rho^2)\theta^2\Big). \tag{A.7}
$$
Define
$$
H(w) = \frac{\sigma^2}{2}w^2 - \lambda w - (1-2\rho^2)\theta^2. \tag{A.8}
$$
Then $\bar w_*$ is the unique root of $H(w) = 0$ on $(-\infty, \bar w_{\max}]$ with $\bar w_{\max} = \frac{\lambda}{\sigma^2}$. $H(w)$ satisfies Assumption A.1 in [16]. Therefore, [16, Theorem A.5 (c)] with $a(t) \equiv 0$ implies
$$
0 < \tilde\psi(t) \le \bar r(t) < \bar w_*, \quad \forall\, t > 0. \tag{A.9}
$$
Note $\tilde\psi = (1-2\rho^2)\psi$ and $1 - 2\rho^2 < 0$, so dividing (A.9) by $1 - 2\rho^2$ reverses the inequalities. This gives the desired result.

B Positivity of integrals with forward variance

Lemma B.1.

Suppose Assumption 2.1 holds. The forward variance $\xi_t(s)$ in (4.3) satisfies $\int_t^T \xi_t(s)\,ds > 0$, $P$-a.s., for every $t \in [0, T)$.

Proof. As $\int_t^T \xi_t(s)\,ds = \tilde E\big[\int_t^T V_s\,ds \,\big|\, \mathcal{F}_t\big]$ and $V_s$ is non-negative by Theorem 2.2, it is sufficient to show that $\int_t^T V_s\,ds > 0$, $P$-a.s.

Given $t \in [0, T)$, for $\omega \in \Omega$ such that $V_s(\omega)$ is continuous in $s$, we suppose $\int_t^T V_s(\omega)\,ds = 0$. By the continuity of $V_s(\omega)$, $V_s(\omega) = 0$ for $s \in [t, T]$. Using the argument given in [3, Theorem 3.5, Equation (3.8)], for $0 < h < T - t$, we have
$$
\begin{aligned}
V_{t+h}(\omega) ={}& V_0 + \int_0^t K(t+h-s)\big(\kappa\phi - \lambda V_s(\omega)\big)\,ds + \int_0^t K(t+h-s)\sigma\sqrt{V_s(\omega)}\,d\tilde B_s(\omega) \\
&+ \int_t^{t+h} K(t+h-s)\big(\kappa\phi - \lambda V_s(\omega)\big)\,ds + \int_t^{t+h} K(t+h-s)\sigma\sqrt{V_s(\omega)}\,d\tilde B_s(\omega) \\
\ge{}& \int_t^{t+h} K(t+h-s)\big(\kappa\phi - \lambda V_s(\omega)\big)\,ds + \int_t^{t+h} K(t+h-s)\sigma\sqrt{V_s(\omega)}\,d\tilde B_s(\omega).
\end{aligned} \tag{B.1}
$$
As $V_s(\omega) = 0$, $s \in [t, t+h]$, then
$$
V_{t+h}(\omega) \ge \kappa\phi\int_t^{t+h} K(t+h-s)\,ds > 0. \tag{B.2}
$$
This contradiction implies that $\int_t^T V_s\,ds > 0$, $P$-a.s., and the claim follows.

References

[1] Abi Jaber, E.: Lifting the Heston model. Quant. Finance, 1-19 (2019)
[2] Abi Jaber, E., El Euch, O.: Multifactor approximation of rough volatility models. SIAM J. Financial Math., 309-349 (2019)
[3] Abi Jaber, E., Larsson, M., Pulido, S.: Affine Volterra processes. Ann. Appl. Probab., 3155-3200 (2019)
[4] Bäuerle, N., Desmettre, S.: Portfolio optimization in fractional and rough Heston models. arXiv preprint arXiv:1809.10716 (2018)
[5] Brunner, H.: Volterra integral equations: An introduction to theory and applications (Vol. 30). Cambridge University Press (2017)
[6] Černý, A., Kallsen, J.: Mean-variance hedging and optimal investment in Heston's model with correlation. Math. Finance, 473-492 (2008)
[7] Diethelm, K., Ford, N. J., Freed, A. D.: A predictor-corrector approach for the numerical solution of fractional differential equations. Nonlinear Dynam., 3-22 (2002)
[8] Diethelm, K., Ford, N. J., Freed, A. D.: Detailed error analysis for a fractional Adams method. Numer. Algorithms, 31-52 (2004)
[9] El Euch, O., Fukasawa, M., Rosenbaum, M.: The microstructural foundations of leverage effect and rough volatility. Finance Stoch., 241-280 (2018)
[10] El Euch, O., Rosenbaum, M.: Perfect hedging in rough Heston models. Ann. Appl. Probab., 3813-3856 (2018)
[11] El Euch, O., Rosenbaum, M.: The characteristic function of rough Heston models. Math. Finance, 3-38 (2019)
[12] Fouque, J. P., Hu, R.: Optimal portfolio under fast mean-reverting fractional stochastic environment. SIAM J. Financial Math., 564-601 (2018)
[13] Fouque, J. P., Hu, R.: Optimal portfolio under fractional stochastic environment. Math. Finance, 1-38 (2018)
[14] Fukasawa, M.: Asymptotic analysis for stochastic volatility: martingale expansion. Finance Stoch., 635-654 (2011)
[15] Gatheral, J., Jaisson, T., Rosenbaum, M.: Volatility is rough. Quant. Finance, 933-949 (2018)
[16] Gatheral, J., Keller-Ressel, M.: Affine forward variance models. Finance Stoch., 501-533 (2019)
[17] Gerhold, S., Gerstenecker, C., Pinter, A.: Moment explosions in the rough Heston model. Decis. Econ. Finance, 575-608 (2019)
[18] Glasserman, P., He, P.: Buy rough, sell smooth. Quant. Finance, 1-16 (2019)
[19] Gripenberg, G., Londen, S. O., Staffans, O.: Volterra integral and functional equations (Vol. 34). Cambridge University Press (1990)
[20] Guennoun, H., Jacquier, A., Roome, P., Shi, F.: Asymptotic behavior of the fractional Heston model. SIAM J. Financial Math., 1017-1045 (2018)
[21] Heston, S. L.: A closed-form solution for options with stochastic volatility with applications to bond and currency options. Rev. Financial Stud., 327-343 (1993)
[22] Jeanblanc, M., Mania, M., Santacroce, M., Schweizer, M.: Mean-variance hedging via stochastic control and BSDEs for general semimartingales. Ann. Appl. Probab., 2388-2428 (2012)
[23] Keller-Ressel, M., Larsson, M., Pulido, S.: Affine rough models. arXiv preprint arXiv:1812.08486 (2018)
[24] Kraft, H.: Optimal portfolios and Heston's stochastic volatility model: An explicit solution for power utility. Quant. Finance, 303-313 (2005)
[25] Li, C., Tao, C.: On the fractional Adams method. Comput. Math. Appl., 1573-1588 (2009)
[26] Lim, A. E.: Quadratic hedging and mean-variance portfolio selection with random parameters in an incomplete market. Math. Oper. Res., 132-161 (2004)
[27] Lim, A. E., Zhou, X. Y.: Mean-variance portfolio selection with random parameters in a complete market. Math. Oper. Res., 101-120 (2002)
[28] Luenberger, D. G.: Optimization by vector space methods. John Wiley and Sons (1968)
[29] Mytnik, L., Salisbury, T. S.: Uniqueness for Volterra-type stochastic integral equations. arXiv preprint arXiv:1502.05513 (2015)
[30] Revuz, D., Yor, M.: Continuous martingales and Brownian motion (Vol. 293). Springer Science and Business Media (1999)
[31] Shen, Y.: Mean-variance portfolio selection in a complete market with unbounded random coefficients. Automatica, 165-175 (2015)
[32] Shen, Y., Zeng, Y.: Optimal investment-reinsurance strategy for mean-variance insurers with square-root factor process. Insurance Math. Econom., 118-137 (2015)
[33] Veraar, M.: The stochastic Fubini theorem revisited. Stochastics, 543-551 (2012)
[34] Yong, J., Zhou, X. Y.: Stochastic controls: Hamiltonian systems and HJB equations (Vol. 43). Springer Science and Business Media (1999)
[35] Zeng, X., Taksar, M.: A stochastic volatility model and optimal portfolio selection. Quant. Finance, 1547-1558 (2013)
[36] Zhou, X. Y., Li, D.: Continuous-time mean-variance portfolio selection: a stochastic LQ framework. Appl. Math. Optim., 42