Continuous time mean-variance-utility portfolio problem and its equilibrium strategy
arXiv [q-fin.MF], May
Ben-Zhang Yang (a), Xin-Jiang He (b), and Song-Ping Zhu (b, ∗)
a. Department of Mathematics, Sichuan University, Chengdu, Sichuan 610064, P.R. China
b. School of Mathematics and Applied Statistics, University of Wollongong, NSW 2522, Australia

Abstract.
In this paper, we propose a new class of optimization problems, which maximize the terminal wealth and accumulated consumption utility subject to a mean-variance criterion controlling the final risk of the portfolio. The multiple-objective optimization problem is first transformed into a single-objective one by introducing the concept of overall “happiness” of an investor, defined as the aggregation of the terminal wealth under the mean-variance criterion and the expected accumulated utility, and then solved under a game theoretic framework. We have managed to maintain analytical tractability; the closed-form solutions found for a set of special utility functions enable us to discuss some interesting optimal investment strategies that have not been revealed before in the literature.
Keywords:
Merton’s problem; Mean-variance portfolio problem; Equilibrium; Time-inconsistency control; Utility
Ever since the development of the Black-Scholes derivative pricing model and the Merton portfolio selection model, a large amount of research interest has been drawn to this area, and various analytical as well as numerical approaches, together with their applications in financial practice, have been discussed [16, 23, 24, 27, 28, 29, 30, 31, 32, 34, 35, 36]. In fact, the optimal portfolio selection problem is essentially one of achieving a balance between uncertain returns and risks, for which the mean-variance methodology has become one of the most important tools ever since Markowitz’s pioneering work on a static investment model [25]. This approach conveys a simple and elegant idea: quantify the return and risk of an investment by its mean and variance. Under this framework, a rational investor generally either maximizes the expected return at a given level of risk or minimizes the risk at a given level of expected return. Although Markowitz’s work is theoretically very appealing, it only provides results based on a single-period (static) model, in which the investors can only make a decision at the very beginning, while they [...]

∗ Corresponding author. E-mail address: [email protected]

[...] Duffie and Richardson [15] through setting a mean-variance objective at the initial date. They obtained a pre-commitment solution, which also solves the optimal problem with a quadratic objective for some specific parameters. A similar approach developed for continuous time complete-market settings has also been widely discussed [20, 21, 22, 35].
In addition, Brandt [8] discussed the portfolio selection problem under a mean-variance criterion based on the assumption that the investor chooses portfolio weights for several periods in advance, implicitly assuming pre-commitment.

However, Basak and Chabakauri [2] challenged the pre-commitment assumption [35], and assumed that investors are sophisticated in the sense that they will maximize their mean-variance objective over time considering all future updates, instead of finding an optimal solution at a fixed given time moment. Following this, Kryger and Steffensen [18] worked under the Black-Scholes framework without the pre-commitment assumption, and showed that the optimal strategy derived for a mean-standard deviation investor is to take no risk at all. The latest contribution to the relevant literature was made in [5], where mean-variance optimization problems were considered under a game theoretic framework, and the optimal strategies were derived in the context of sub-game perfect Nash equilibrium.

Although all the work mentioned above is very appealing, consumption was usually not considered in mean-variance optimization problems. This is not consistent with what happens in practice, as the decisions made for consumption would naturally have an impact on the optimality of the investment strategy [9, 10]. Thus, researchers started to incorporate consumption choices into the mean-variance problem, investigating the optimal investment-consumption problem together with the mean-variance criterion. For example, Kronborg and Steffensen [17] directly added the accumulated consumption to the terminal wealth to formulate an “adjusted” terminal wealth, and tried to maximize the adjusted terminal wealth over time under the mean-variance framework. Christiansen and Steffensen [11] went even further to consider the same optimization problem with deterministic consumption and investment to avoid a series of difficulties.
Unfortunately, the optimal consumption strategy derived under this particular model assumption and formulation of the problem has led to a rather absurd conclusion: an investor could suddenly be required to switch his/her consumption strategy from consuming as much as possible to as little as possible, in order to achieve the “optimal” objective set at the beginning of the investment period. It is this fundamental flaw in the state-of-the-art model frame of the portfolio selection problem that has prompted us to propose a new model frame that would not only eliminate this rather strange behavior but also maintain mathematical tractability, so that a more rational investment behavior can be discussed economically for some simple utility functions through analytical closed-form solutions.

A new class of continuous-time portfolio selection problems is proposed in this paper, which combines maximizing the terminal wealth under the mean-variance criterion with maximizing accumulated consumption utility. Instead of adding the consumption back to the terminal wealth in the objective value function of the optimization problem, as previously presented in the literature [11, 17], we introduce the concept of overall “happiness” of an investor, which is measured by the aggregation of the terminal wealth under the mean-variance criterion and the expected accumulated utility, using a consumption preference parameter. Remarkably, the newly formulated optimization problem preserves analytical tractability under a continuous-time game theoretic framework, and the analytical optimal continuous investment and consumption strategies derived in the sense of equilibrium [4, 5] admit an intuitive economic explanation.

The rest of this paper is organized as follows. Section 2 reviews the classical mean-variance problem and proposes the new portfolio selection problem. In Section 3, we analytically derive the optimal strategies based on the definition of the equilibrium strategy.
Explicit solutions to the optimal strategies are then presented in Section 4. Numerical examples and discussions are provided in Section 5, followed by some concluding remarks given in the last section.

We now assume that we work under the standard Black-Scholes market, where an investor has access to a risk-free bank account and a stock whose dynamics can be specified as

$$
\begin{aligned}
dM(t) &= rM(t)\,dt, & M(0) &= 1,\\
dS(t) &= \mu S(t)\,dt + \sigma S(t)\,dB(t), & S(0) &= s_0 > 0.
\end{aligned}
\tag{1}
$$

Here, $r > 0$, $\mu$ and $\sigma$ are constants, and it is assumed that $\mu > r$. The process $B(t)$ is a standard Brownian motion on the probability space $(\Omega, \mathcal{F}, P)$ with the filtration $\mathcal{F}_t = \sigma\{B(s);\, 0 \le s \le t\}$, $\forall t \in [0, T]$.

We also assume that the investor in this market needs to make investment decisions on a finite time horizon $[0, T]$, and he/she allocates a proportion $\pi(t)$ and $1 - \pi(t)$ of his/her wealth into the stock and bank account, respectively, at time $t$. Let $X^{\pi}(t)$ be the wealth of the investor at time $t$ following the investment strategy $\pi(\cdot)$ with an initial wealth of $x_0$ at time 0. In this case, the dynamic of the investor’s wealth follows

$$
dX^{\pi}(t) = \big[(r + \pi(t)(\mu - r))X^{\pi}(t)\big]\,dt + \pi(t)\sigma X^{\pi}(t)\,dB(t), \quad t \in [0, T), \qquad X(0) = x_0 > 0.
\tag{2}
$$

If $L^2_{\mathcal{F}}(0, T; \mathbb{R})$ denotes the set of all $\mathbb{R}$-valued, measurable stochastic processes $f(t)$ adapted to $\{\mathcal{F}_t\}_{t \ge 0}$ such that $E\left[\int_0^T |f(t)|^2\,dt\right] < \infty$, the classical continuous time mean-variance portfolio optimization problem is stated below.

Definition 2.1. ([35]) A portfolio strategy $\pi(\cdot)$ is admissible if $\pi(\cdot) \in L^2_{\mathcal{F}}(0, T; \mathbb{R})$.

Definition 2.2. ([35]) The continuous time mean-variance portfolio optimization problem is a multi-objective optimization problem, which is defined as

$$
\begin{aligned}
&\min_{\pi(\cdot)}\ \big(V_1(\pi(\cdot)),\, V_2(\pi(\cdot))\big) \equiv \big(-E(X(T)),\ \mathrm{Var}(X(T))\big),\\
&\text{s.t. } \pi(\cdot) \in L^2_{\mathcal{F}}(0, T; \mathbb{R}),\quad (X(\cdot), \pi(\cdot)) \text{ satisfy Equation (2)}.
\end{aligned}
\tag{3}
$$

The optimization problem (3) can be transformed into a single-objective optimization problem by introducing a weight parameter $\gamma$ such that the new objective becomes the weighted average of the two original objectives, using mild convexity conditions [33]:

$$
\begin{aligned}
&\min_{\pi(\cdot)}\ V_1(\pi(\cdot)) + \frac{\gamma}{2} V_2(\pi(\cdot)) \equiv -E(X(T)) + \frac{\gamma}{2}\mathrm{Var}(X(T)),\\
&\text{s.t. } \pi(\cdot) \in L^2_{\mathcal{F}}(0, T; \mathbb{R}),\quad (X(\cdot), \pi(\cdot)) \text{ satisfy Equation (2)},
\end{aligned}
\tag{4}
$$

where the weight parameter satisfies $\gamma > 0$. In fact, this particular optimization problem (4) has been extensively studied in the past 20 years, with various theoretical results, numerical algorithms, and applications being available in the literature. Interested readers are referred to [2, 34, 35] and the references therein for more details.
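As a small illustration of the single-objective problem (4), the sketch below evaluates the objective in closed form for the special case of a constant proportion $\pi$, for which the wealth in (2) is a geometric Brownian motion. The restriction to constant $\pi$, the function names and the parameter values are assumptions made for illustration only; the variance weight is taken as $\gamma/2$, consistent with the objective form (49). Minimizing over this restricted family is not the equilibrium strategy studied later in the paper.

```python
import math


def terminal_moments(x0, pi, r, mu, sigma, T):
    """Mean and variance of X(T) under (2) when pi is held constant.

    With constant pi, X is a geometric Brownian motion with drift
    r + pi*(mu - r) and volatility pi*sigma (illustrative special case).
    """
    growth = (r + pi * (mu - r)) * T
    mean = x0 * math.exp(growth)
    # Var = x0^2 * exp(2*growth) * (exp(pi^2 sigma^2 T) - 1)
    var = x0 ** 2 * math.exp(2.0 * growth) * math.expm1(pi ** 2 * sigma ** 2 * T)
    return mean, var


def mean_variance_objective(x0, pi, r, mu, sigma, T, gamma):
    """Objective of (4): -E[X(T)] + (gamma/2) * Var[X(T)]."""
    mean, var = terminal_moments(x0, pi, r, mu, sigma, T)
    return -mean + 0.5 * gamma * var


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    params = dict(x0=1.0, r=0.04, mu=0.10, sigma=0.20, T=1.0, gamma=2.0)
    grid = [k / 100.0 for k in range(0, 201)]
    best = min(grid, key=lambda p: mean_variance_objective(pi=p, **params))
    print("best constant proportion on the grid:", best)
```

The grid search in the usage part only compares constant proportions as seen from time 0; it is meant to make the trade-off between the two terms of (4) tangible, not to reproduce any result of the paper.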
It should be pointed out that the classical mean-variance optimization problem (3), or the transformed optimization problem (4), has a fundamental flaw: the wealth of the investor does not take into account his/her income and consumption. The main possible reason is that incorporating consumption choices into the classical mean-variance problem could destroy the tractability of the original problem. However, this is not appropriate as it is not consistent with real situations, and thus we assume that the investor possesses a continuous deterministic income rate $l(t)$, and chooses a non-negative consumption rate $c(t)$. Under these assumptions, the dynamic of the investor’s wealth can be derived as

$$
dX^{c,\pi}(t) = \big[(r + \pi(t)(\mu - r))X^{c,\pi}(t) + l(t) - c(t)\big]\,dt + \pi(t)\sigma X^{c,\pi}(t)\,dB(t), \quad t \in [0, T), \qquad X(0) = x_0 > 0.
\tag{5}
$$

Obviously, after incorporating consumption into the mean-variance problem, the investor is also seeking his/her maximum utility through the consumption choices, apart from the mean-variance type objective specified in the optimization problem (4). In other words, the investor again faces a dual-objective optimization problem; he/she wants to achieve the maximum accumulated utility over a choice of consumption, while at the same time minimizing investment risk by considering a mean-variance objective over the terminal wealth $X(T)$. In this case, we need a measurement for the utility obtained through consumption. With $\rho$ representing a constant discounting rate and $U(\cdot)$ denoting a utility function, the accumulated utility of the investor through his/her continuous consumption on the time interval $[t, T]$ can be defined as

$$
V_3^{c,\pi}(t, x) = E\left[\int_t^T e^{-\rho(s - t)} U(c(s))\,ds\right],
\tag{6}
$$

where $E(\cdot)$ denotes taking the expectation. It is this new optimal portfolio selection problem, named the “mean-variance-utility consumption and investment optimization problem”, that is presented below.
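The wealth dynamics (5) and the discounted utility integral in (6) can be explored numerically. The following is a minimal sketch under stated assumptions: the controls `pi`, `c` and the income `l` are taken to be deterministic functions of time (in general the controls are adapted processes), time starts at $t = 0$, an Euler-Maruyama step discretizes (5), and a left Riemann sum approximates the integral in (6) along one path. All function and argument names are hypothetical.

```python
import math
import random


def simulate_wealth_and_utility(x0, pi, c, l, utility, r, mu, sigma, rho, T,
                                n_steps, rng):
    """Simulate one path of (5) and the discounted utility integral of (6).

    pi, c, l are callables of time s (an illustrative simplification);
    utility is the function U; rng is a random.Random instance.
    Returns (X(T), integral of exp(-rho*s) * U(c(s)) over [0, T]).
    """
    dt = T / n_steps
    x = x0
    acc_utility = 0.0
    for k in range(n_steps):
        s = k * dt
        dB = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over [s, s+dt]
        acc_utility += math.exp(-rho * s) * utility(c(s)) * dt
        drift = (r + pi(s) * (mu - r)) * x + l(s) - c(s)
        x += drift * dt + pi(s) * sigma * x * dB
    return x, acc_utility


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical inputs: half of wealth in the stock, constant income and
    # consumption rates, logarithmic utility.
    x_T, u = simulate_wealth_and_utility(
        x0=1.0, pi=lambda s: 0.5, c=lambda s: 0.2, l=lambda s: 0.3,
        utility=math.log, r=0.04, mu=0.10, sigma=0.20, rho=0.05, T=1.0,
        n_steps=1000, rng=rng)
    print("terminal wealth:", x_T, "discounted utility:", u)
```

Averaging the second return value over many simulated paths would give a Monte Carlo estimate of (6) at $t = 0$; with deterministic consumption, as assumed here, the integral is the same on every path.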
Definition 2.3.
A new mean-variance-utility consumption and investment optimization problem can be formulated as

$$
\begin{aligned}
&\max\ \big[V_1^{c,\pi}(t, x),\ V_2^{c,\pi}(t, x),\ V_3^{c,\pi}(t, x)\big] \equiv \left[E(X(T)),\ -\mathrm{Var}(X(T)),\ E\left(\int_t^T e^{-\rho(s - t)} U(c(s))\,ds\right)\right]\\
&\text{s.t. } c(\cdot), \pi(\cdot) \in L^2_{\mathcal{F}}(0, T; \mathbb{R}),\quad (X(\cdot), c(\cdot), \pi(\cdot)) \text{ satisfy Equation (5)}.
\end{aligned}
\tag{7}
$$

Similarly to what has been presented in the previous subsection, the optimization problem (7) can also be converted into a single-objective optimization problem

$$
\begin{aligned}
&\max_{c(\cdot), \pi(\cdot)}\ E(X(T)) - \frac{\gamma}{2}\mathrm{Var}(X(T)) + \beta E\left(\int_t^T e^{-\rho(s - t)} U(c(s))\,ds\right)\\
&\text{s.t. } c(\cdot), \pi(\cdot) \in L^2_{\mathcal{F}}(0, T; \mathbb{R}),\quad (X(\cdot), c(\cdot), \pi(\cdot)) \text{ satisfy Equation (5)},
\end{aligned}
\tag{8}
$$

where $\beta$ is a positive constant. It should be noted that the problem (8) degenerates to the classical mean-variance portfolio selection model (4) when $\beta$ approaches zero.

Clearly, the parameter $\beta$ can be treated as a trade-off between acquiring more terminal wealth in the mean-variance sense and achieving more accumulated utility through consumption; the larger the value of $\beta$ is, the more inclined the investor is to consume to maximize his/her accumulated utility. In this sense, $\beta$ is actually a consumption preference parameter. It should also be noted that it is not usual to add the terminal wealth and the expected utility together, as doing so does not ordinarily make economic sense. However, this is possible in our case, since the introduced parameter $\beta$ can be regarded as a conversion operator that converts utility units to wealth units.

Although the two wealth dynamics (2) and (5) are different, the optimal solution to the mean-variance problem (4) with (2) and that to the mean-variance problem (4) with (5) are the same. In particular, by setting ρ =
It should be particularly emphasized that although Christiansen and Steffensen [11] and Kronborg and Steffensen [17] have already tried to incorporate consumption into the mean-variance framework, the problem they discussed is essentially different from our new problem (8). This is because they directly added the accumulated consumption to the terminal wealth to formulate an “adjusted” terminal wealth and considered the mean-variance optimization of the adjusted wealth, while we have instead introduced the parameter $\beta$ so that the mean-variance of the terminal wealth and the accumulated consumption utility can be added together to form a new objective. In this way, the new objective can be economically interpreted as the overall “happiness” of an investor towards his/her investment return as well as the undertaken risk level during the time period $[0, T]$.

The investor aims at achieving the maximum overall happiness through the combination of maximizing the terminal wealth with the mean-variance criterion and the consumption utility, leading to a new class of mean-variance-utility optimization problems. At first glance, this appears to suggest that one can consume more even if one does not have much wealth. This is, however, not a correct interpretation of our new model. What our new model suggests is that an investor should not consume proportionally, as suggested by Merton’s classic framework, and that she/he should also consider the balance of the total wealth management under Markowitz’s mean-variance criterion. The fundamental reason is that he/she still wants to minimize his/her total investment risk at the end of the investment period, while maximizing his/her expected return and accumulated consumption utility. Moreover, the financing of optimal consumption seems to be independent of the mean-variance trade-off.
Therefore, we also need to explore whether the appearance of consumption utility affects the previous structure of the investment strategy reported in [2, 5, 17]. The new challenge is to solve the new optimal portfolio selection problem (8), which will be discussed in the next section.

Having successfully established a new optimal portfolio selection problem in (8), a natural question is whether or not there exists a solution to the optimal portfolio selection strategy, and how it can be derived if it does exist. This is a challenging problem because we are not able to find time-consistent solutions, in the sense that the condition for the Bellman Optimality Principle no longer holds, given that the law of iterated expectations does not apply for a given strategy. As a result, we have to seek an optimal solution to problem (8) in the sense of time inconsistency. A natural way for an investor to deal with any time-inconsistent problem is to solve the problem by setting $t = 0$, and to follow the resulting optimal strategy during the finite time horizon. This is the so-called pre-commitment control, i.e., the investor pre-commits at a fixed time moment. However, one of the main drawbacks of the optimal solution with pre-commitment is that it will not be optimal for the control problem at any later time $t > 0$ if the objective is re-evaluated at time $t$ during the considered time horizon instead of at time 0. Therefore, instead of seeking a pre-commitment solution, our problem is considered under a game theoretic framework without pre-commitment, which was introduced in [4, 5] and developed by Kronborg and Steffensen [17] as well as Kryger et al. [19].

To solve the optimization problem (8), we consider a more general optimization problem with a discount factor as follows:

$$
\begin{aligned}
&\max_{c(\cdot), \pi(\cdot)}\ E\big(e^{-\delta(T - t)}X(T)\big) - \frac{\gamma}{2}\mathrm{Var}\big(e^{-\delta(T - t)}X(T)\big) + \beta E\left(\int_t^T e^{-\rho(s - t)} U(c(s))\,ds\right)\\
&\text{s.t. } c(\cdot), \pi(\cdot) \in L^2_{\mathcal{F}}(0, T; \mathbb{R}),\quad (X(\cdot), c(\cdot), \pi(\cdot)) \text{ satisfy Equation (5)},
\end{aligned}
\tag{9}
$$

where $\delta$ is a discount rate. Obviously, the optimization problem (9) degenerates to the original one (8) when the investor requires a discount rate of 0.

The equilibrium strategy under the continuous-time game theoretic equilibrium for the problem (9) can be defined as below.

Definition 3.1.
Consider a strategy $(c^*, \pi^*)$ and a fixed point $(c, \pi)$. For a fixed number $h > 0$ and an initial point $(t, x)$, we define the strategy $(\tilde{c}_h, \tilde{\pi}_h)$ as

$$
(\tilde{c}_h(s), \tilde{\pi}_h(s)) =
\begin{cases}
(c, \pi), & \text{for } t \le s < t + h,\\
(c^*(s), \pi^*(s)), & \text{for } t + h \le s < T.
\end{cases}
\tag{10}
$$

If

$$
\liminf_{h \to 0} \frac{1}{h}\Big(f^{c^*,\pi^*}\big(t, x, y^{c^*,\pi^*}, z^{c^*,\pi^*}, w^{c^*,\pi^*}\big) - f^{\tilde{c}_h,\tilde{\pi}_h}\big(t, x, y^{\tilde{c}_h,\tilde{\pi}_h}, z^{\tilde{c}_h,\tilde{\pi}_h}, w^{\tilde{c}_h,\tilde{\pi}_h}\big)\Big) \ge 0
\tag{11}
$$

for all $(c, \pi) \in \mathbb{R}_+ \times \mathbb{R}$, where $f$ is an optimal value function and

$$
\begin{aligned}
y^{c,\pi} &:= y^{c,\pi}(t, x) = E\left[e^{-\delta(T - t)}X^{c,\pi}(T)\,\middle|\,X(t) = x\right],\\
z^{c,\pi} &:= z^{c,\pi}(t, x) = E\left[\big(e^{-\delta(T - t)}X^{c,\pi}(T)\big)^2\,\middle|\,X(t) = x\right],\\
w^{c,\pi} &:= w^{c,\pi}(t, x) = E\left[\int_t^T e^{-\rho(T - t)} U(c(s))\,ds\,\middle|\,X(t) = x\right],
\end{aligned}
\tag{12}
$$

then $(c^*, \pi^*)$ is an equilibrium strategy.

The equilibrium strategy defined in Definition 3.1 is a time-inconsistent solution to the control problem (8), which is essentially different from the time-consistent solutions discussed in the context of optimization [2]. If we denote by $(c^*, \pi^*)$ the equilibrium strategy satisfying Definition 3.1, and let $V$ be the corresponding value function with the equilibrium strategy, we can obtain

$$
V(t, x) = f^{c^*,\pi^*}\big(t, x, y^{c^*,\pi^*}, z^{c^*,\pi^*}, w^{c^*,\pi^*}\big).
\tag{13}
$$

Clearly, our problem is to search for the corresponding optimal strategies and the optimal value function $f : [0, T] \times \mathbb{R}^4 \to \mathbb{R}$ as a $C^{1,2,2,2,2}$ function of the form

$$
f^{c,\pi}\big(t, x, y^{c,\pi}, z^{c,\pi}, w^{c,\pi}\big) = y^{c,\pi} - \frac{\gamma}{2}\big(z^{c,\pi} - (y^{c,\pi})^2\big) + \beta w^{c,\pi}, \qquad (c, \pi) \in \mathcal{A},
\tag{14}
$$

where $\mathcal{A}$ is the class of admissible strategies to be defined below.
As pointed out in [19], the investor continuously deviates from this strategy and thus does not actually achieve any of the determined supremums. Instead, the investor concentrates on determining the equilibrium control law, as introduced in [4] and [18]. The desired investment strategy is determined so that it maximizes the present objective at any time moment $t$, under the restriction that the future strategy is assumed to be given. In other words, the strategy is determined through backward recursion, and thus this recursively optimal solution under the equilibrium control law is also regarded as the optimal control (see [18, 19]).

Before we are able to present the optimal solution, some preliminaries need to be outlined. In particular, we establish an extension of the HJB equation for the characterization of the optimal value function and the corresponding optimal strategy, so that the stochastic problem can be transformed into a system of deterministic differential equations and a deterministic point-wise minimization problem.

Let $\mathcal{A}$ be the set of admissible strategies that contains all strategies $(c, \pi)$ satisfying the following two conditions: i) there exist solutions to the partial differential equations (15)-(17); ii) the stochastic integrals in (21), (26), (29) and (41) are martingales. Then, we can prove the following two lemmas.

Lemma 3.1.
Suppose there exist three functions $Y = Y(t, x)$, $Z = Z(t, x)$ and $W = W(t, x)$ such that

$$
Y_t(t, x) = -\big[(r + \pi(\mu - r))x + l - c\big]Y_x(t, x) - \frac{1}{2}\pi^2\sigma^2x^2\,Y_{xx}(t, x) + \delta Y(t, x), \qquad Y(T, x) = x,
\tag{15}
$$

$$
Z_t(t, x) = -\big[(r + \pi(\mu - r))x + l - c\big]Z_x(t, x) - \frac{1}{2}\pi^2\sigma^2x^2\,Z_{xx}(t, x) + 2\delta Z(t, x), \qquad Z(T, x) = x^2,
\tag{16}
$$

and

$$
W_t(t, x) = -\big[(r + \pi(\mu - r))x + l - c\big]W_x(t, x) - \frac{1}{2}\pi^2\sigma^2x^2\,W_{xx}(t, x) - e^{-\rho t}U(c), \qquad W(T, x) = 0,
\tag{17}
$$

where $(c, \pi)$ is an arbitrary admissible strategy. Then,

$$
Y(t, x) = y^{c,\pi}(t, x), \qquad Z(t, x) = z^{c,\pi}(t, x), \qquad W(t, x) = w^{c,\pi}(t, x),
\tag{18}
$$

where $y^{c,\pi}$, $z^{c,\pi}$ and $w^{c,\pi}$ are given by (12).

Proof. Define

$$
\tilde{Y}(t, x) = e^{-\delta t}Y(t, x).
\tag{19}
$$

Substituting (19) into (15) yields

$$
\tilde{Y}_t(t, x) = -\big[(r + \pi(\mu - r))x + l - c\big]\tilde{Y}_x(t, x) - \frac{1}{2}\pi^2\sigma^2x^2\,\tilde{Y}_{xx}(t, x), \qquad \tilde{Y}(T, x) = e^{-\delta T}x.
\tag{20}
$$

Applying Itô’s lemma further yields

$$
\begin{aligned}
\tilde{Y}(t, X^{c,\pi}(t)) &= -\int_t^T d\tilde{Y}(s, X^{c,\pi}(s)) + \tilde{Y}(T, X^{c,\pi}(T))\\
&= -\int_t^T \Big(\tilde{Y}_s(s, X^{c,\pi}(s)) + \big[(r + \pi(s)(\mu - r))X^{c,\pi}(s) + l(s) - c(s)\big]\tilde{Y}_x(s, X^{c,\pi}(s))\\
&\qquad\qquad + \frac{1}{2}\pi^2(s)\sigma^2(X^{c,\pi}(s))^2\,\tilde{Y}_{xx}(s, X^{c,\pi}(s))\Big)\,ds\\
&\quad - \int_t^T \pi(s)\sigma X^{c,\pi}(s)\,\tilde{Y}_x(s, X^{c,\pi}(s))\,dB(s) + \tilde{Y}(T, X^{c,\pi}(T))\\
&= e^{-\delta T}X^{c,\pi}(T) - \int_t^T \pi(s)\sigma X^{c,\pi}(s)\,\tilde{Y}_x(s, X^{c,\pi}(s))\,dB(s).
\end{aligned}
\tag{21}
$$

Since $(c, \pi)$ is an admissible strategy, taking the expectation on both sides of (21) conditional upon $X(t) = x$ results in

$$
\tilde{Y}(t, x) = E\left[e^{-\delta T}X^{c,\pi}(T)\,\middle|\,X(t) = x\right],
\tag{22}
$$

from which one can obtain

$$
Y(t, x) = e^{\delta t}\tilde{Y}(t, x) = y^{c,\pi}(t, x).
\tag{23}
$$

Similarly, if we denote

$$
\tilde{Z}(t, x) = e^{-2\delta t}Z(t, x),
\tag{24}
$$

we can obtain

$$
\tilde{Z}_t(t, x) = -\big[(r + \pi(\mu - r))x + l - c\big]\tilde{Z}_x(t, x) - \frac{1}{2}\pi^2\sigma^2x^2\,\tilde{Z}_{xx}(t, x), \qquad \tilde{Z}(T, x) = e^{-2\delta T}x^2.
\tag{25}
$$

Again, applying Itô’s lemma leads to

$$
\begin{aligned}
\tilde{Z}(t, X^{c,\pi}(t)) &= -\int_t^T d\tilde{Z}(s, X^{c,\pi}(s)) + \tilde{Z}(T, X^{c,\pi}(T))\\
&= -\int_t^T \Big(\tilde{Z}_s(s, X^{c,\pi}(s)) + \big[(r + \pi(s)(\mu - r))X^{c,\pi}(s) + l(s) - c(s)\big]\tilde{Z}_x(s, X^{c,\pi}(s))\\
&\qquad\qquad + \frac{1}{2}\pi^2(s)\sigma^2(X^{c,\pi}(s))^2\,\tilde{Z}_{xx}(s, X^{c,\pi}(s))\Big)\,ds\\
&\quad - \int_t^T \pi(s)\sigma X^{c,\pi}(s)\,\tilde{Z}_x(s, X^{c,\pi}(s))\,dB(s) + \tilde{Z}(T, X^{c,\pi}(T))\\
&= \big(e^{-\delta T}X^{c,\pi}(T)\big)^2 - \int_t^T \pi(s)\sigma X^{c,\pi}(s)\,\tilde{Z}_x(s, X^{c,\pi}(s))\,dB(s).
\end{aligned}
\tag{26}
$$

Taking the expectation on both sides of (26) conditional upon $X(t) = x$, it is straightforward that

$$
\tilde{Z}(t, x) = E\left[\big(e^{-\delta T}X^{c,\pi}(T)\big)^2\,\middle|\,X(t) = x\right],
\tag{27}
$$

and thus

$$
Z(t, x) = e^{2\delta t}\tilde{Z}(t, x) = z^{c,\pi}(t, x).
\tag{28}
$$

Finally, in a similar fashion, $W(t, X^{c,\pi}(t))$ satisfying (17) can be expressed as

$$
\begin{aligned}
W(t, X^{c,\pi}(t)) &= -\int_t^T dW(s, X^{c,\pi}(s)) + W(T, X^{c,\pi}(T))\\
&= -\int_t^T \Big(W_s(s, X^{c,\pi}(s)) + \big[(r + \pi(s)(\mu - r))X^{c,\pi}(s) + l(s) - c(s)\big]W_x(s, X^{c,\pi}(s))\\
&\qquad\qquad + \frac{1}{2}\pi^2(s)\sigma^2(X^{c,\pi}(s))^2\,W_{xx}(s, X^{c,\pi}(s))\Big)\,ds\\
&\quad - \int_t^T \pi(s)\sigma X^{c,\pi}(s)\,W_x(s, X^{c,\pi}(s))\,dB(s) + W(T, X^{c,\pi}(T))\\
&= \int_t^T e^{-\rho s}U(c(s))\,ds - \int_t^T \pi(s)\sigma X^{c,\pi}(s)\,W_x(s, X^{c,\pi}(s))\,dB(s).
\end{aligned}
\tag{29}
$$

Taking the conditional expectation on (29) yields the desired result. This completes the proof. □

Lemma 3.2.
If there exists a function $F = F(t, x)$ such that

$$
F_t = \inf_{(c,\pi) \in \mathcal{A}} \Big\{-\big[(r + \pi(\mu - r))x + l - c\big](F_x - Q) - \frac{1}{2}\pi^2\sigma^2x^2(F_{xx} - K) + J\Big\}, \qquad F(T, x) = f^{c,\pi}(T, x, x, x^2, 0),
\tag{30}
$$

where

$$
\begin{aligned}
Q &= f_x^{c^*,\pi^*},\\
K &= f_{xx}^{c^*,\pi^*} + f_{yy}^{c^*,\pi^*}\big(F^{(1)}_x\big)^2 + f_{zz}^{c^*,\pi^*}\big(F^{(2)}_x\big)^2 + f_{ww}^{c^*,\pi^*}\big(F^{(3)}_x\big)^2 + 2f_{xy}^{c^*,\pi^*}F^{(1)}_x + 2f_{xz}^{c^*,\pi^*}F^{(2)}_x + 2f_{xw}^{c^*,\pi^*}F^{(3)}_x\\
&\quad + 2f_{yz}^{c^*,\pi^*}F^{(1)}_xF^{(2)}_x + 2f_{yw}^{c^*,\pi^*}F^{(1)}_xF^{(3)}_x + 2f_{zw}^{c^*,\pi^*}F^{(2)}_xF^{(3)}_x
\end{aligned}
\tag{31}
$$

and

$$
J = f_t^{c^*,\pi^*} + f_y^{c^*,\pi^*}\delta F^{(1)} + 2f_z^{c^*,\pi^*}\delta F^{(2)} - f_w^{c^*,\pi^*}e^{-\rho t}U(c(t)),
\tag{32}
$$

with

$$
F^{(1)} = y^{c^*,\pi^*}(t, x), \qquad F^{(2)} = z^{c^*,\pi^*}(t, x), \qquad F^{(3)} = w^{c^*,\pi^*}(t, x),
$$

then $F(t, x) = V(t, x)$, where $V$ is the optimal value function defined by (13).

Proof. The proof is divided into three steps. The first step is to derive an expression for

$$
f^{c,\pi}\big(t, X^{c,\pi}(t), y^{c,\pi}(t, X^{c,\pi}(t)), z^{c,\pi}(t, X^{c,\pi}(t)), w^{c,\pi}(t, X^{c,\pi}(t))\big).
\tag{33}
$$

Using Itô’s lemma, we have

$$
\begin{aligned}
&f^{c,\pi}\big(t, X^{c,\pi}(t), y^{c,\pi}(t, X^{c,\pi}(t)), z^{c,\pi}(t, X^{c,\pi}(t)), w^{c,\pi}(t, X^{c,\pi}(t))\big)\\
&= -\int_t^T d f^{c,\pi}\big(s, X^{c,\pi}(s), y^{c,\pi}(s, X^{c,\pi}(s)), z^{c,\pi}(s, X^{c,\pi}(s)), w^{c,\pi}(s, X^{c,\pi}(s))\big)\\
&\quad + f^{c,\pi}\big(T, X^{c,\pi}(T), y^{c,\pi}(T, X^{c,\pi}(T)), z^{c,\pi}(T, X^{c,\pi}(T)), w^{c,\pi}(T, X^{c,\pi}(T))\big)\\
&= -\int_t^T \Big\{\big(f_s^{c,\pi} + f_y^{c,\pi}Y_s + f_z^{c,\pi}Z_s + f_w^{c,\pi}W_s\big)\,ds + \big(f_x^{c,\pi} + f_y^{c,\pi}Y_x + f_z^{c,\pi}Z_x + f_w^{c,\pi}W_x\big)\,dX^{c,\pi}(s)\\
&\qquad + \frac{1}{2}\pi^2(s)\sigma^2(X^{c,\pi}(s))^2\Big[f_y^{c,\pi}Y_{xx} + f_z^{c,\pi}Z_{xx} + f_w^{c,\pi}W_{xx} + f_{xx}^{c,\pi} + f_{yy}^{c,\pi}(Y_x)^2 + f_{zz}^{c,\pi}(Z_x)^2 + f_{ww}^{c,\pi}(W_x)^2\\
&\qquad\quad + 2f_{xy}^{c,\pi}Y_x + 2f_{xz}^{c,\pi}Z_x + 2f_{xw}^{c,\pi}W_x + 2f_{yz}^{c,\pi}Y_xZ_x + 2f_{yw}^{c,\pi}Y_xW_x + 2f_{zw}^{c,\pi}Z_xW_x\Big]\,ds\Big\}\\
&\quad + f^{c,\pi}\big(T, X^{c,\pi}(T), Y(T, X^{c,\pi}(T)), Z(T, X^{c,\pi}(T)), W(T, X^{c,\pi}(T))\big).
\end{aligned}
\tag{34}
$$

Using (15), (16) and (17), we further have

$$
\begin{aligned}
&f^{c,\pi}\big(t, X^{c,\pi}(t), y^{c,\pi}(t, X^{c,\pi}(t)), z^{c,\pi}(t, X^{c,\pi}(t)), w^{c,\pi}(t, X^{c,\pi}(t))\big)\\
&= -\int_t^T \Big\{f_s^{c,\pi}\,ds + f_y^{c,\pi}\Big(-\big[(r + \pi(\mu - r))X^{c,\pi}(s) + l - c\big]Y_x - \frac{1}{2}\pi^2\sigma^2(X^{c,\pi}(s))^2Y_{xx} + \delta Y\Big)\,ds\\
&\qquad + f_z^{c,\pi}\Big(-\big[(r + \pi(\mu - r))X^{c,\pi}(s) + l - c\big]Z_x - \frac{1}{2}\pi^2\sigma^2(X^{c,\pi}(s))^2Z_{xx} + 2\delta Z\Big)\,ds\\
&\qquad + f_w^{c,\pi}\Big(-\big[(r + \pi(\mu - r))X^{c,\pi}(s) + l - c\big]W_x - \frac{1}{2}\pi^2\sigma^2(X^{c,\pi}(s))^2W_{xx} - e^{-\rho s}U(c(s))\Big)\,ds\\
&\qquad + \big(f_x^{c,\pi} + f_y^{c,\pi}Y_x + f_z^{c,\pi}Z_x + f_w^{c,\pi}W_x\big)\Big(\big[(r + \pi(s)(\mu - r))X^{c,\pi}(s) + l(s) - c(s)\big]\,ds + \pi(s)\sigma X^{c,\pi}(s)\,dB(s)\Big)\\
&\qquad + \frac{1}{2}\pi^2(s)\sigma^2(X^{c,\pi}(s))^2\Big[f_y^{c,\pi}Y_{xx} + f_z^{c,\pi}Z_{xx} + f_w^{c,\pi}W_{xx} + f_{xx}^{c,\pi} + f_{yy}^{c,\pi}(Y_x)^2 + f_{zz}^{c,\pi}(Z_x)^2 + f_{ww}^{c,\pi}(W_x)^2\\
&\qquad\quad + 2f_{xy}^{c,\pi}Y_x + 2f_{xz}^{c,\pi}Z_x + 2f_{xw}^{c,\pi}W_x + 2f_{yz}^{c,\pi}Y_xZ_x + 2f_{yw}^{c,\pi}Y_xW_x + 2f_{zw}^{c,\pi}Z_xW_x\Big]\,ds\Big\}\\
&\quad + f^{c,\pi}\big(T, X^{c,\pi}(T), Y(T, X^{c,\pi}(T)), Z(T, X^{c,\pi}(T)), W(T, X^{c,\pi}(T))\big).
\end{aligned}
\tag{35}
$$

Therefore,

$$
\begin{aligned}
&f^{c,\pi}\big(t, X^{c,\pi}(t), y^{c,\pi}(t, X^{c,\pi}(t)), z^{c,\pi}(t, X^{c,\pi}(t)), w^{c,\pi}(t, X^{c,\pi}(t))\big)\\
&= -\int_t^T \Big\{\big(f_s^{c,\pi} + f_y^{c,\pi}\delta Y + 2f_z^{c,\pi}\delta Z - f_w^{c,\pi}e^{-\rho s}U(c(s))\big)\,ds + f_x^{c,\pi}\big[(r + \pi(s)(\mu - r))X^{c,\pi}(s) + l(s) - c(s)\big]\,ds\\
&\qquad + \pi(s)\sigma X^{c,\pi}(s)\big(f_x^{c,\pi} + f_y^{c,\pi}Y_x + f_z^{c,\pi}Z_x + f_w^{c,\pi}W_x\big)\,dB(s)\\
&\qquad + \frac{1}{2}\pi^2(s)\sigma^2(X^{c,\pi}(s))^2\Big[f_{xx}^{c,\pi} + f_{yy}^{c,\pi}(Y_x)^2 + f_{zz}^{c,\pi}(Z_x)^2 + f_{ww}^{c,\pi}(W_x)^2 + 2f_{xy}^{c,\pi}Y_x + 2f_{xz}^{c,\pi}Z_x + 2f_{xw}^{c,\pi}W_x\\
&\qquad\quad + 2f_{yz}^{c,\pi}Y_xZ_x + 2f_{yw}^{c,\pi}Y_xW_x + 2f_{zw}^{c,\pi}Z_xW_x\Big]\,ds\Big\}\\
&\quad + f^{c,\pi}\big(T, X^{c,\pi}(T), Y(T, X^{c,\pi}(T)), Z(T, X^{c,\pi}(T)), W(T, X^{c,\pi}(T))\big).
\end{aligned}
\tag{36}
$$

For an arbitrary admissible strategy $(c, \pi)$, we furthermore define

$$
\begin{aligned}
\tilde{K} &= f_{xx}^{c,\pi} + f_{yy}^{c,\pi}(Y_x)^2 + f_{zz}^{c,\pi}(Z_x)^2 + f_{ww}^{c,\pi}(W_x)^2 + 2f_{xy}^{c,\pi}Y_x + 2f_{xz}^{c,\pi}Z_x + 2f_{xw}^{c,\pi}W_x\\
&\quad + 2f_{yz}^{c,\pi}Y_xZ_x + 2f_{yw}^{c,\pi}Y_xW_x + 2f_{zw}^{c,\pi}Z_xW_x
\end{aligned}
\tag{37}
$$

and

$$
\tilde{J} = f_t^{c,\pi} + f_y^{c,\pi}\delta Y + 2f_z^{c,\pi}\delta Z - f_w^{c,\pi}e^{-\rho t}U(c(t)).
\tag{38}
$$

This leads to

$$
\begin{aligned}
&f^{c,\pi}\big(t, X^{c,\pi}(t), y^{c,\pi}(t, X^{c,\pi}(t)), z^{c,\pi}(t, X^{c,\pi}(t)), w^{c,\pi}(t, X^{c,\pi}(t))\big)\\
&= -\int_t^T \Big\{\Big(\tilde{J}(s) + f_x^{c,\pi}\big[(r + \pi(s)(\mu - r))X^{c,\pi}(s) + l(s) - c(s)\big]\Big)\,ds\\
&\qquad + \pi(s)\sigma X^{c,\pi}(s)\big(f_x^{c,\pi} + f_y^{c,\pi}Y_x + f_z^{c,\pi}Z_x + f_w^{c,\pi}W_x\big)\,dB(s) + \frac{1}{2}\pi^2(s)\sigma^2(X^{c,\pi}(s))^2\tilde{K}(s)\,ds\Big\}\\
&\quad + f^{c,\pi}\big(T, X^{c,\pi}(T), Y(T, X^{c,\pi}(T)), Z(T, X^{c,\pi}(T)), W(T, X^{c,\pi}(T))\big).
\end{aligned}
\tag{39}
$$

With the utilization of Itô’s lemma, one can easily derive

$$
\begin{aligned}
F(t, X^{c,\pi}(t)) &= -\int_t^T dF(s, X^{c,\pi}(s)) + F(T, X^{c,\pi}(T))\\
&= -\int_t^T \Big(F_s\,ds + F_x\,dX^{c,\pi}(s) + \frac{1}{2}F_{xx}\,\pi^2(s)\sigma^2(X^{c,\pi}(s))^2\,ds\Big) + F(T, X^{c,\pi}(T)).
\end{aligned}
\tag{40}
$$

Since $F$ solves the pseudo HJB equation (30), for any arbitrary strategy $(c, \pi)$ we have

$$
F_t \le -\big[(r + \pi(\mu - r))x + l - c\big](F_x - Q) - \frac{1}{2}\pi^2\sigma^2x^2(F_{xx} - K) + J.
$$
Setting $x = X^{c,\pi}(s)$ in (5) and using the terminal conditions (15), (16) and (17) directly lead to

$$
\begin{aligned}
F(t, X^{c,\pi}(t)) &\ge -\int_t^T \Big\{\Big(-\big[(r + \pi(s)(\mu - r))X^{c,\pi}(s) + l(s) - c(s)\big](F_x - Q) - \frac{1}{2}\pi^2(s)\sigma^2(X^{c,\pi}(s))^2\big(F_{xx} - K(s)\big) + J(s)\Big)\,ds\\
&\qquad + F_x\Big(\big[(r + \pi(s)(\mu - r))X^{c,\pi}(s) + l(s) - c(s)\big]\,ds + \pi(s)\sigma X^{c,\pi}(s)\,dB(s)\Big) + \frac{1}{2}F_{xx}\,\pi^2(s)\sigma^2(X^{c,\pi}(s))^2\,ds\Big\}\\
&\quad + f^{c,\pi}\big(T, X^{c,\pi}(T), Y(T, X^{c,\pi}(T)), Z(T, X^{c,\pi}(T)), W(T, X^{c,\pi}(T))\big)\\
&= -\int_t^T \Big\{\Big(\big[(r + \pi(s)(\mu - r))X^{c,\pi}(s) + l(s) - c(s)\big]f_x^{c^*,\pi^*}(s) + \frac{1}{2}\pi^2(s)\sigma^2(X^{c,\pi}(s))^2K(s) + J(s)\Big)\,ds\\
&\qquad + F_x\,\pi(s)\sigma X^{c,\pi}(s)\,dB(s)\Big\}\\
&\quad + f^{c,\pi}\big(T, X^{c,\pi}(T), Y(T, X^{c,\pi}(T)), Z(T, X^{c,\pi}(T)), W(T, X^{c,\pi}(T))\big)\\
&= -\int_t^T \Big(\big[(r + \pi(s)(\mu - r))X^{c,\pi}(s) + l(s) - c(s)\big]\big(f_x^{c^*,\pi^*}(s) - f_x^{c,\pi}(s)\big)\\
&\qquad + \frac{1}{2}\pi^2(s)\sigma^2(X^{c,\pi}(s))^2\big(K(s) - \tilde{K}(s)\big) + J(s) - \tilde{J}(s)\Big)\,ds\\
&\quad + \int_t^T \big(f_x^{c,\pi} + f_y^{c,\pi}Y_x + f_z^{c,\pi}Z_x + f_w^{c,\pi}W_x - F_x\big)\pi(s)\sigma X^{c,\pi}(s)\,dB(s)\\
&\quad + f^{c,\pi}\big(t, X^{c,\pi}(t), y^{c,\pi}(t, X^{c,\pi}(t)), z^{c,\pi}(t, X^{c,\pi}(t)), w^{c,\pi}(t, X^{c,\pi}(t))\big),
\end{aligned}
\tag{41}
$$

where the last equality follows from (39). Taking the expectation conditional upon $X(t) = x$, we thus arrive at

$$
\begin{aligned}
f^{c,\pi}\big(t, x, y^{c,\pi}(t, x), z^{c,\pi}(t, x), w^{c,\pi}(t, x)\big) &\le F(t, x) + \int_t^T \Big\{\big[(r + \pi(s)(\mu - r))X^{c,\pi}(s) + l(s) - c(s)\big]\big(f_x^{c^*,\pi^*}(s) - f_x^{c,\pi}(s)\big)\\
&\qquad + J(s) - \tilde{J}(s) + \frac{1}{2}\pi^2(s)\sigma^2(X^{c,\pi}(s))^2\big(K(s) - \tilde{K}(s)\big)\Big\}\,ds.
\end{aligned}
\tag{42}
$$

The last step is to check whether the Nash equilibrium criteria specified in Definition 3.1 are satisfied. If we assume that the strategy $(c^*, \pi^*)$ attains the infimum in (30), it follows from (18) that

$$
F^{(1)}(t, x) = y^{c^*,\pi^*}(t, x), \qquad F^{(2)}(t, x) = z^{c^*,\pi^*}(t, x), \qquad F^{(3)}(t, x) = w^{c^*,\pi^*}(t, x).
\tag{43}
$$

As (41) holds for any admissible strategy $(c, \pi)$, it also applies for the specific strategy $(c^*, \pi^*)$, i.e.,

$$
F_t = -\big[(r + \pi^*(\mu - r))x + l - c^*\big](F_x - Q) - \frac{1}{2}(\pi^*)^2\sigma^2x^2(F_{xx} - K) + J,
$$

leading to

$$
\begin{aligned}
F(t, X^{c^*,\pi^*}(t)) &= \int_t^T \big(f_x^{c^*,\pi^*} + f_y^{c^*,\pi^*}Y_x + f_z^{c^*,\pi^*}Z_x + f_w^{c^*,\pi^*}W_x - F_x\big)\pi^*(s)\sigma X^{c^*,\pi^*}(s)\,dB(s)\\
&\quad + f^{c^*,\pi^*}\big(t, X^{c^*,\pi^*}(t), y^{c^*,\pi^*}(t, X^{c^*,\pi^*}(t)), z^{c^*,\pi^*}(t, X^{c^*,\pi^*}(t)), w^{c^*,\pi^*}(t, X^{c^*,\pi^*}(t))\big).
\end{aligned}
\tag{44}
$$

Taking the expectation on both sides of the above equality conditional upon $X(t) = x$ yields

$$
F(t, x) = f^{c^*,\pi^*}\big(t, x, y^{c^*,\pi^*}(t, x), z^{c^*,\pi^*}(t, x), w^{c^*,\pi^*}(t, x)\big).
\tag{45}
$$

If we consider the strategy $(\tilde{c}_h, \tilde{\pi}_h)$ defined in (10), Equations (42) and (44) yield

$$
\begin{aligned}
&\liminf_{h \to 0} \frac{f^{c^*,\pi^*}\big(t, x, y^{c^*,\pi^*}(t, x), z^{c^*,\pi^*}(t, x), w^{c^*,\pi^*}(t, x)\big) - f^{\tilde{c}_h,\tilde{\pi}_h}\big(t, x, y^{\tilde{c}_h,\tilde{\pi}_h}(t, x), z^{\tilde{c}_h,\tilde{\pi}_h}(t, x), w^{\tilde{c}_h,\tilde{\pi}_h}(t, x)\big)}{h}\\
&\ge \liminf_{h \to 0} \frac{1}{h}\Big\{\int_t^T \big[(r + \tilde{\pi}_h(s)(\mu - r))X^{\tilde{c}_h,\tilde{\pi}_h}(s) + l(s) - \tilde{c}_h(s)\big]\big(f_x^{\tilde{c}_h,\tilde{\pi}_h}(s) - f_x^{c^*,\pi^*}(s)\big)\,ds\\
&\qquad + \int_t^T \Big(\tilde{J}_h(s) - J(s) + \frac{1}{2}\sigma^2(\tilde{\pi}_h(s))^2(X^{\tilde{c}_h,\tilde{\pi}_h}(s))^2\big(\tilde{K}_h(s) - K(s)\big)\Big)\,ds\Big\}\\
&= \liminf_{h \to 0} \frac{1}{h}\Big\{\int_t^{t+h} \big[(r + \pi(s)(\mu - r))X^{c,\pi}(s) + l(s) - c(s)\big]\big(f_x^{\tilde{c}_h,\tilde{\pi}_h}(s) - f_x^{c^*,\pi^*}(s)\big)\,ds\\
&\qquad + \int_t^{t+h} \Big(\tilde{J}_h(s) - J(s) + \frac{1}{2}\sigma^2(\pi(s))^2(X^{c,\pi}(s))^2\big(\tilde{K}_h(s) - K(s)\big)\Big)\,ds\Big\}\\
&= \big[(r + \pi(t)(\mu - r))X^{c,\pi}(t) + l(t) - c(t)\big]\big(f_x^{\tilde{c},\tilde{\pi}}(t) - f_x^{c^*,\pi^*}(t)\big) + \tilde{J}(t) - J(t) + \frac{1}{2}\sigma^2(\pi(t))^2(X^{c,\pi}(t))^2\big(\tilde{K}(t) - K(t)\big)\\
&= 0,
\end{aligned}
\tag{46}
$$

which implies that $F(t, x) = V(t, x)$ and $(c^*, \pi^*)$ is the desired optimal strategy. □

Remark 3.1. The representation corresponding to the pseudo-Bellman equation (30), originally presented in [18] and applied in Theorem 2.1 of [17], calls for an optimization across strategies, whereas the whole point of dynamic programming is to appeal only to optimization across vectors. Therefore, the optimal solution solved by this approach belongs to the subspace of $\mathcal{A}$. Similar to Björk and Murgoci [4], the optimality of our obtained strategy can also be confirmed in the sense of equilibrium.

In this section, we present the optimal solutions to the optimal portfolio selection problem (8) based on the results derived in the previous section, and some detailed discussions are provided to illustrate the behaviour of the optimal strategies.

A candidate strategy for the optimal value function (30) can be derived by simply differentiating (30) with respect to $c$ and $\pi$, respectively. This leads to

$$
\frac{\partial}{\partial c}\Big(c(F_x - Q) - f_w e^{-\rho t}U(c)\Big) = 0, \qquad
\frac{\partial}{\partial \pi}\Big(-\pi(\mu - r)x(F_x - Q) - \frac{1}{2}\pi^2\sigma^2x^2(F_{xx} - K)\Big) = 0.
\tag{47}
$$

A further simplification then yields

$$
c^* = [U']^{-1}\left(\frac{F_x - Q}{f_w e^{-\rho t}}\right), \qquad
\pi^* = -\frac{\mu - r}{x\sigma^2}\,\frac{F_x - Q}{F_{xx} - K},
\tag{48}
$$

where $[f]^{-1}(\cdot)$ is the inverse function of $f$, and stars denote that they are the optimal strategies.

Substituting the corresponding objective form

$$
f(t, x, y, z, w) = y - \frac{\gamma}{2}(z - y^2) + \beta w
\tag{49}
$$

into (31) and (32) gives

$$
Q = 0, \qquad K = \gamma\big(F^{(1)}_x\big)^2, \qquad J = \delta F^{(1)} - \gamma\delta\Big(F^{(2)} - \big(F^{(1)}\big)^2\Big) - \beta e^{-\rho t}U(c).
$$
(50)To obtain an explicit solution for this optimal portfolio selection problem, we assume that F , F (1) and F (3) can be written in the following form: F ( t , x ) = A ( t ) x + B ( t ) , F (1) ( t , x ) = a ( t ) x + b ( t ) , F (3) ( t , x ) = p ( t ) x + q ( t ) , (51)which then naturally leads to F being written in the form F (2) ( t , x ) = γ (cid:2) a ( t ) x + b ( t ) + β (cid:2) p ( t ) x + q ( t ) (cid:3) − [ A ( t ) x + B ( t )] (cid:3) + [ a ( t ) x + b ( t )] . (52)Now, the substitution of (51) and (52) into (50) results in Q = , K = γ ( a ( t )) , (53)14nd J = δ [ a ( t ) x + b ( t )] − δ (cid:2) a ( t ) x + b ( t ) + β (cid:2) p ( t ) x + q ( t ) (cid:3) − [ A ( t ) x + B ( t )] (cid:3) − β e − ρ t U ( c ( t )) , (54)with which the optimal strategy (48) becomes c ∗ = [ U ′ ] − (cid:16) β − e − ρ t A ( t ) (cid:17) ,π ∗ = γ µ − rx σ A ( t ) a ( t ) . (55)If we further substitute (53), (54) and (55) into (15) - (17) and (30) with the corresponding terminalconditions, it is straightforward to show that A t x + B t = − rxA − γ ( µ − r ) A σ a − lA + c ∗ A − β e − ρ t U ( c ∗ ) + δ ( ax + b ) − δ ( ax + b − Ax − B + β ( px + q )) , a t x + b t = − rxa − γ ( µ − r ) A σ a − la + c ∗ a + δ ( ax + b ) , p t x + q t = − rxp − γ ( µ − r ) Ap σ a − lp + c ∗ p − e − ρ t U ( c ∗ ( t )) , (56)with terminal conditions A ( T ) = a ( T ) = B ( T ) = b ( T ) = p ( T ) = q ( T ) = A ( t ) = a ( t ) = e ( r − δ )( T − t ) , p ( t ) = q ( t ) = R Tt e − ρ s U ( c ∗ ( s )) ds ,and b ( t ) = e δ t Z Tt " γ ( µ − r ) σ + l ( s ) e ( r − δ )( T − s ) − c ∗ ( s ) e ( r − δ )( T − s ) e − δ s ds , B ( t ) = e δ t Z Tt " γ ( µ − r ) σ + l ( s ) e ( r − δ )( T − s ) − c ∗ ( s ) e ( r − δ )( T − s ) + β e − ρ s U ( c ∗ ( s )) + δ b ( s ) + δβ q ( s ) e − δ s ds . (57)Therefore, the solution to the mean-variance-utility problem can be presented in the following proposi-tion. Proposition 4.1.
The optimal consumption and investment strategies for problem (9) are respectively given by
\[
c^* = [U']^{-1}\big( \beta^{-1}e^{r(T-t)-\delta T} \big) \tag{58}
\]
and
\[
\pi^* = \frac{1}{\gamma}\,\frac{\mu-r}{x\sigma^2}\,e^{-(r-\delta)(T-t)}. \tag{59}
\]
In particular, when the discount rate $\delta$ is 0, we obtain
\[
c^* = [U']^{-1}\big( \beta^{-1}e^{r(T-t)} \big) \tag{60}
\]
and
\[
\pi^* = \frac{1}{\gamma}\,\frac{\mu-r}{x\sigma^2}\,e^{-r(T-t)}, \tag{61}
\]
which are exactly the desired optimal strategies of the mean-variance-utility problem (8). This shows that, under the mean-variance-utility criterion, the optimal consumption rate of the investor is independent of the current wealth, while the optimal investment proportion is inversely proportional to the current wealth.
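As a minimal numerical sketch of Proposition 4.1, the snippet below evaluates (58) and (59) for a user-supplied inverse marginal utility. The function names, the argument `inv_marginal_utility`, and the values chosen for $\sigma$, $\gamma$, $\beta$, $T$ and $t$ are illustrative assumptions, not values taken from the paper; only $\mu = 0.05$ and $r = 0.01$ follow the numerical section.

```python
import math


def optimal_consumption(inv_marginal_utility, beta, r, delta, T, t):
    """Eq. (58): c* = [U']^{-1}(beta^{-1} * exp(r (T - t) - delta T))."""
    return inv_marginal_utility(math.exp(r * (T - t) - delta * T) / beta)


def optimal_fraction(gamma, mu, r, sigma, delta, T, t, x):
    """Eq. (59): proportion of the wealth x held in the risky asset."""
    return (mu - r) / (gamma * x * sigma**2) * math.exp(-(r - delta) * (T - t))


# Hypothetical parameter values (sigma, gamma, beta, T, t are assumptions).
mu, r, sigma, gamma, beta, delta, T, t = 0.05, 0.01, 0.2, 2.0, 1.0, 0.0, 10.0, 4.0

# Logarithmic utility: U'(c) = 1/c, hence [U']^{-1}(y) = 1/y.
c_star = optimal_consumption(lambda y: 1.0 / y, beta, r, delta, T, t)

# Dollar amount pi* x invested in the risky asset for several wealth levels.
amounts = [
    x * optimal_fraction(gamma, mu, r, sigma, delta, T, t, x)
    for x in (1.0, 50.0, 1000.0)
]
print(c_star, amounts)
```

Under these assumed values the three entries of `amounts` agree up to rounding, which illustrates that the proportion $\pi^*$ scales like $1/x$ while the dollar amount $\pi^* x$ does not depend on wealth.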
It should be pointed out that the specific objective of the mean-variance-utility problem (8) considered in this paper actually belongs to the category of time-inconsistent control problems, which means that it can also be solved with the general framework proposed by Björk et al. [7]. For completeness and ease of reference, the alternative derivation is included in the Appendix. In addition to the mathematical theory used to solve this specific problem, we also try to draw some useful conclusions from an economic point of view. Our main emphasis is to formulate an interesting economic problem and to provide clear strategies for a new kind of investment problem, which has not been studied before.
Remark 4.1.
The new optimal portfolio selection problem, subject to a minimized risk at the end of an investment period, has at least two very interesting features that clearly distinguish it from Merton's classic framework [26]:

i) The optimal consumption strategy derived under the current mean-variance-utility framework is independent of the wealth, as suggested by Eq. (60). In Merton's classic framework, the optimal consumption depends on one's current total accumulated wealth. This of course makes sense economically, as one would probably feel that he/she can afford to consume more when his/her total wealth is larger. However, in our new problem, the newly introduced risk control at the end of the investment period has balanced out such a dependence; our solution Eq. (60) suggests that one's optimal consumption should be independent of the current wealth and be an increasing function of $t$. This means that an investor still increases his/her consumption when his/her total wealth increases towards the end of an investment period. However, his/her consumption is no longer directly proportional to the total wealth, as there is not much time left at the end of an investment period to control the total investment risk while optimizing his/her return. When there is no need to worry about the investment risk at all, his/her investment behavior would naturally be different, as suggested by Merton's original framework. It should also be noted that the final wealth under our framework can be negative. This is because investors under our framework try to strike a balance between achieving more terminal wealth in the mean-variance sense and obtaining more accumulated utility through consumption; when the consumption preference parameter $\beta$ is large enough, the investor tends to consume as much as possible without caring about the final wealth.

ii) On the other hand, the optimal investment strategy in our problem, (61), shows that the optimal investment proportion $\pi^*$ obtained in this paper depends on the current wealth, which is consistent with reality. The optimal investment proportion is in fact inversely related to the current wealth value, since investors have to manage the risk of the current wealth under the mean-variance criterion, in which case the investment proportion is reduced when the wealth value increases. If we further rewrite (61) as
\[
\pi^* x = \frac{1}{\gamma}\,\frac{\mu-r}{\sigma^2}\,e^{-r(T-t)}, \tag{62}
\]
it is not difficult to see that the dollar amount invested in the risky asset at time $t$ is independent of the current wealth $x$, which agrees well with previous relevant works in the literature [2, 5, 17]. The most striking part, however, is that the optimal investment always takes the same form, as long as a mean-variance term is built into the model. Specifically, the optimal investment found by Basak and Chabakauri [2], who solve the dynamic mean-variance portfolio problem and derive its time-consistent solution using dynamic programming, by Björk et al. [5], who place the mean-variance problem within a game theoretic framework and obtain the subgame perfect Nash equilibrium strategies with time inconsistency, and even by Kronborg and Steffensen [17], who take the consumption term into account in the mean-variance framework and consider the optimal investment strategy, all share a common formula, Eq. (61), no matter where the mean-variance term is placed. This suggests that a mean-variance term added to minimize the investment risk only alters the consumption behavior. This makes economic sense, as the risks associated with investing in a risky asset are of a different type from those associated with consumption, which has a direct impact on the total wealth available to be invested at each point of the investment horizon.

It should be noted that, as the consumption is deterministic, the investor can actually just adjust the initial capital by the present value of future (deterministic) income and future (deterministic) consumption, and then invest the remaining capital according to Basak and Chabakauri [2]. Since the Basak and Chabakauri strategy (amount invested) is independent of wealth, the consumption/investment combination actually has a simple and reasonable interpretation, which is stated below.

Remark 4.2.
The optimal investment strategy obtained under our mean-variance-utility framework is exactly the same as that derived in [11, 17], with the optimal amount of money being independent of wealth. (Footnote: As pointed out in [17], this seems to be economically unreasonable for a multi-period model, and a possible way to resolve this issue is to make the risk aversion time- and wealth-dependent. We refer interested readers to [17] for a more detailed discussion.) A possible explanation is that intermediary consumption is independent of wealth and involves no risk, which indicates that the structure of the solution to the remaining mean-variance problem remains the same, independent of this consumption term. In particular, one first has to finance the deterministic optimal consumption, and the rest of the capital is invested according to a mean-variance problem without consumption. As the capital is invested independently of the size of the wealth, financing optimal consumption does not play a role there. Therefore, under the mean-variance balance, the presence of consumption utility does not fundamentally change the optimal investment strategy, which remains the same as that under the mean-variance criterion. We also note that the optimal consumption strategies under the two frameworks are completely different, as the one in [11, 17] is discrete, taking either the maximal or the minimal allowed value, while ours is continuous. Moreover, from an economic point of view, their results are not reasonable, as it is usually not possible for a normal investor to make sudden changes in his/her consumption strategy from consuming the maximal to the minimal allowed value.

Having derived the optimal consumption and investment strategy, it is not difficult to formulate the optimal value function of the problem (9) as
\[
V(t,x) = e^{(r-\delta)(T-t)}x + B(t), \tag{63}
\]
where $B(t)$ is specified in (57).
Obviously, the optimal value function is constructed from the accumulated amount of the wealth $x$ at time $t$ and the additional amount resulting from the continuous consumption and investment strategy. With
\[
V_x(t,x) = e^{(r-\delta)(T-t)} > 0, \tag{64}
\]
the optimal value increases with wealth, which matches one's financial intuition. The sensitivity of the optimal value function with respect to time is affected by two aspects, i.e., the wealth, and the consumption and investment strategy.

The optimal strategy derived in (58) and (59) can also give rise to the conditional expected value and conditional second moment of the discounted optimal terminal wealth, yielding
\[
E\Big[ e^{-\delta(T-t)}X^{c^*,\pi^*}(T) \,\Big|\, X(t)=x \Big] = e^{(r-\delta)(T-t)}x + b(t), \tag{65}
\]
and
\[
E\Big[ \big( e^{-\delta(T-t)}X^{c^*,\pi^*}(T) \big)^2 \,\Big|\, X(t)=x \Big] = \frac{2}{\gamma}\Big\{ b(t) - B(t) + \beta\big[p(t)x+q(t)\big] \Big\}, \tag{66}
\]
respectively. One can also similarly compute the conditional expectation of the discounted accumulated utility of consumption as
\[
E\Big[ \int_t^T e^{-\rho(T-t)}U(c^*(s))\,ds \,\Big|\, X(t)=x \Big] = \int_t^T e^{-\rho s}U(c^*(s))\,ds. \tag{67}
\]
To further investigate the properties of the optimal consumption strategy as well as the corresponding optimal value function, we now provide three examples with specific utility functions.

Proposition 4.2.
With some particular choices of utility functions for problem (7), the corresponding consumption strategies can be specified according to Proposition 4.1.

(i) With a logarithmic utility function $U(c) = \log(c)$, the optimal consumption strategy (58) can be simplified to take the form
\[
c^* = \beta e^{-r(T-t)+\delta T}. \tag{68}
\]
Then,
\[
c^* = \beta e^{-r(T-t)} \tag{69}
\]
is the optimal one for the mean-variance-utility problem (8).

(ii) With a power utility function $U(c) = c^{\theta}/\theta$, where $\theta < 1$ and $\theta \neq 0$, the optimal consumption strategy (58) becomes
\[
c^* = \big( \beta^{-1}e^{r(T-t)-\delta T} \big)^{\frac{1}{\theta-1}}. \tag{70}
\]
Then,
\[
c^* = \big( \beta^{-1}e^{r(T-t)} \big)^{\frac{1}{\theta-1}} \tag{71}
\]
is the optimal one for the mean-variance-utility problem (8).

(iii) With an exponential utility function $U(c) = -e^{-\eta c}/\eta$ with $\eta > 0$, the optimal consumption strategy (58) can be explicitly obtained as
\[
c^* = \frac{1}{\eta}\big[ \ln\beta - r(T-t) + \delta T \big]. \tag{72}
\]
Then,
\[
c^* = \frac{1}{\eta}\big[ \ln\beta - r(T-t) \big] \tag{73}
\]
is the optimal one for the mean-variance-utility problem (8).

With different optimal consumption strategies being derived for different utility functions, it is of interest to investigate the effect of the newly introduced parameter, $\beta$, on the optimal objective value function $V(t,x)$. Let us adopt the logarithmic utility function in the mean-variance-utility problem (8) (setting $\delta = 0$), for which
\[
\frac{dV}{d\beta} = M_1(t) + M_2(t)\log\beta \tag{74}
\]
and
\[
\frac{d^2V}{d\beta^2} = \frac{1}{\beta}M_2(t), \tag{75}
\]
where
\[
M_2(t) = \int_t^T e^{-\rho s}\big[ -r(T-s) + \rho T \big]\,ds.
\]
From the expression of the first-order derivative, one can easily observe that the changes of the value function with respect to $\beta$ depend on both the parameter values and some time-dependent functions, which implies that the sensitivity of the value function to $\beta$ is adjusted over time.
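The closed forms (69), (71) and (73) of Proposition 4.2 can be checked against the defining relation $U'(c^*) = \beta^{-1}e^{r(T-t)}$ from (60). The sketch below does this for one set of parameters; the function names and the values of $T$, $t$, $\beta$, $\theta$ and $\eta$ are assumptions chosen for illustration (with $\delta = 0$).

```python
import math


def c_log(beta, r, T, t):
    """Eq. (69): U(c) = log(c)."""
    return beta * math.exp(-r * (T - t))


def c_power(beta, r, T, t, theta):
    """Eq. (71): U(c) = c**theta / theta with theta < 1 and theta != 0."""
    return (math.exp(r * (T - t)) / beta) ** (1.0 / (theta - 1.0))


def c_exp(beta, r, T, t, eta):
    """Eq. (73): U(c) = -exp(-eta * c) / eta with eta > 0."""
    return (math.log(beta) - r * (T - t)) / eta


# Assumed illustrative parameters.
beta, r, T, t = 10.0, 0.01, 10.0, 4.0
theta, eta = 0.5, 3.0

# Right-hand side of the first-order condition U'(c*) = beta^{-1} e^{r (T - t)}.
target = math.exp(r * (T - t)) / beta

# Each residual is U'(c*) - target for the corresponding utility function.
residuals = [
    1.0 / c_log(beta, r, T, t) - target,                      # U'(c) = 1/c
    c_power(beta, r, T, t, theta) ** (theta - 1.0) - target,  # U'(c) = c**(theta - 1)
    math.exp(-eta * c_exp(beta, r, T, t, eta)) - target,      # U'(c) = exp(-eta * c)
]
print(residuals)
```

All three residuals should vanish up to floating-point rounding, and each consumption rule increases with $t$ and with $\beta$, in line with the discussion of the figures.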
In addition, it is not difficult to find that $V_{\beta\beta} > 0$ when $r < \rho$, which suggests that the sensitivity of the value function towards $\beta$ is a monotonically increasing function of $\beta$ when the expected return on the saved money is less than the expected return on consumption utility. This is also reasonable, as $\beta$ denotes the consumption preference, and when $\beta$ is large, any tiny change in $\beta$ results in a large change in the consumption strategy, leading to a significant impact on the value function.

Apart from the optimal value function, one may also be interested to see how the optimal strategies behave with respect to different parameter values, the details of which are provided in the next section. Before we finish here, it should be remarked that we may not always be able to derive the specific form of the optimal consumption strategy, since it depends on whether we can find the inverse of the derivative of the chosen utility function. However, even when the analytical inversion is not available, e.g., when some mixed utility functions are adopted, it is still straightforward to evaluate (58) numerically in software such as Matlab.

One may wonder what happens if $\beta$ goes to infinity. Mathematically, this limit leads to an ill-posed problem, as the optimization involves no constraint on $\beta$. In fact, an infinite $\beta$ value causes abnormal (infinite) consumption, which can only be allowed if the income of the investor is also abnormal (infinite). If we assume that investors only have limited initial wealth and normal income, then in order to maintain the balance between income and expenditure, $\beta$ should be constrained to a reasonable, finite range. Financially, such a limit frees the investor from the “hassle” of trying to optimize his/her portfolio, in the sense that he/she could consume without restraint, which is not reasonable as the investor would normally keep a balanced budget. If $\beta$ does go to infinity, the optimal consumption will also approach infinity. This is because $\beta$ going to infinity means that the investor does not care about final wealth and the mean-variance concern for terminal wealth becomes redundant, in which case the investor will consume as much as possible.

In this section, the properties of the optimal consumption strategy under the three common utility functions discussed in Proposition 4.2 are investigated, by setting $\mu$ and $r$ as 0.05 and 0.01, respectively. The optimal strategy in this paper is applicable to a general utility function, as long as the basic definition of a utility function is satisfied: the more a person consumes, the more satisfied he/she is, that is, the first derivative of the utility function is greater than zero; and as consumption increases, satisfaction increases at a decreasing speed, that is, the second derivative of the utility function is less than zero. Once a specific utility function is selected, the optimal investment strategy will be determined. We now choose some simple and representative utility functions in economics for illustration.

First of all, depicted in Figure 1 is the optimal consumption strategy with different $\beta$ values when the investor chooses a logarithmic utility function. One can easily observe that the investor tends to consume more when the $\beta$ value is higher. This is indeed reasonable, as an increase in $\beta$ places a higher weight on the accumulated utility when calculating the value function, and this corresponds to the case where the investor prefers increasing consumption to achieve a higher utility over managing wealth under the mean-variance framework. It is also interesting to find that the investor would like to raise the level of consumption as the end of the pre-determined investment period is approached under our mean-variance-utility framework. This may appear strange at first glance, but it can also be understood from an economic point of view. At the early stage, a rational investor tends to be conservative in terms of consumption, given that there may be plenty of uncertainty and that maximizing the terminal wealth is part of his/her long-term goal of achieving maximum “happiness”; managing his/her terminal wealth through investment thus has higher priority than consumption. However, as time passes and the investor has accumulated a certain amount of wealth, he/she gains more confidence in consuming more to achieve more “happiness”.
This is indeed consistent with our theoretical findings for the optimal value function.

Figure 1: Optimal consumption under a logarithmic utility function (optimal consumption strategy against time for β = 0.1, 1 and 10).
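As remarked above, (58) can still be evaluated numerically when $[U']^{-1}$ has no closed form. The following sketch, with $\delta = 0$, inverts the marginal utility by bisection. The helper `invert_marginal_utility`, the mixed utility $U(c) = \log c - e^{-\eta c}/\eta$ and all parameter values are hypothetical choices made for illustration; the only property used is that $U'$ is strictly decreasing.

```python
import math


def invert_marginal_utility(marginal_utility, y, lo=1e-12, hi=1.0, tol=1e-12):
    """Solve marginal_utility(c) = y by bisection.

    Assumes marginal_utility is strictly decreasing and that
    marginal_utility(lo) > y, so the root lies above lo.
    """
    # Enlarge the bracket until marginal_utility(hi) <= y.
    while marginal_utility(hi) > y:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if marginal_utility(mid) > y:
            lo = mid  # root is to the right of mid
        else:
            hi = mid  # root is at or to the left of mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def mixed_marginal_utility(c, eta=3.0):
    """U'(c) for the assumed mixed utility U(c) = log(c) - exp(-eta c) / eta."""
    return 1.0 / c + math.exp(-eta * c)


def consumption_path(beta, r, T, times):
    """Eq. (60) evaluated numerically: c*(t) = [U']^{-1}(exp(r (T - t)) / beta)."""
    return [
        invert_marginal_utility(mixed_marginal_utility, math.exp(r * (T - t)) / beta)
        for t in times
    ]


path = consumption_path(beta=1.0, r=0.01, T=10.0, times=[0.0, 5.0, 10.0])
print(path)
```

Bisection is used here only because it needs nothing beyond monotonicity of $U'$; any standard root finder would serve equally well.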
Figure 2: Optimal consumption under the power utility function (optimal consumption strategy against time; left panel: β = 1, 2 and 3; right panel: θ = 0.1, 0.3 and 0.5). θ used in the left subfigure and β used in the right subfigure are set to be 0.1 and 1, respectively.

Figure 3: Optimal consumption under the exponential utility function (optimal consumption strategy against time; left panel: β = 5, 10 and 20; right panel: η = 1, 3 and 5). η used in the left subfigure and β used in the right subfigure are set to be 3 and 10, respectively.

Figures 2 and 3 display how the optimal consumption strategy varies when the utility function takes the form of a power and an exponential function, respectively. What can be observed first from both figures is a pattern similar to that in Figure 1: the optimal consumption strategy is still a monotonically increasing function of $\beta$, as a higher value of $\beta$ still implies that more consumption is preferred. Another phenomenon that should be noted is that the investor is willing to consume more when $\theta$ ($\eta$) takes smaller values. The main explanation for this is that $\theta$ ($\eta$) indicates the degree of risk aversion, and a larger $\theta$ ($\eta$) value implies avoiding excessive consumption.

It is also interesting to show the difference between the optimal consumption strategy derived under our framework and that obtained in [11], as two different approaches are used to incorporate consumption into the mean-variance problem. In particular, the problem proposed in [11] does not admit an explicit analytical solution and has to be solved numerically with a fixed-point method, while our optimal consumption strategy is a completely closed-form solution, which facilitates its practical application.
Moreover, as displayed in Figure 4, the optimal consumption strategy derived in [11] is not continuous, and there exists a sudden drop from the maximal to the minimal allowed consumption rate for any investor using their framework. This is by no means reasonable, since an investor would not consider making such substantial changes in his/her consumption in normal situations. Our optimal consumption strategy, on the other hand, turns out to be continuous and monotonically increasing in time, which is more reasonable for the same reason.

Figure 4: Comparisons of the optimal consumption strategies (optimal consumption strategy against time; our continuous solution versus the solution from Christiansen & Steffensen (2013)).
In this paper, we introduce the concept of overall “happiness” of an investor, with which the terminal wealth under the mean-variance criterion and the accumulated consumption utility can be directly added together using a consumption preference parameter, to formulate a new class of investment-consumption optimization problems. The optimal consumption strategy is continuous and increases over time, which is consistent with the financial intuition that a normal investor would prefer investment over consumption when the end of the period is far away, in order to achieve more happiness, while he/she would gradually increase the level of consumption when the wealth starts to accumulate.

Appendix
The mean-variance-utility problem (8) can also be solved under the time-inconsistent control framework introduced by Björk et al. [7]. Let
\[
J(t,x) = f(t,x) + G(g(t,x)) + k(t,x),
\]
where $f(t,x) = E_{t,x}[F(X(T))]$, $g(t,x) = E_{t,x}[X(T)]$, and $k(t,x) = E_{t,x}\big[\int_t^T \beta e^{-\rho(s-t)}U(c(s))\,ds\big]$. By taking $F(x) = x - \frac{\gamma}{2}x^2$ and $G(x) = \frac{\gamma}{2}x^2$, we obtain the following system of equations
\[
\begin{aligned}
0 &= \sup_{c,\theta}\Big\{ \mathcal{A}^uV - \mathcal{A}^u(G\circ g) + G'(g(t,x))\cdot(\mathcal{A}^ug) + \beta U(c) - \rho k \Big\}, \\
0 &= \mathcal{A}^ug, \\
0 &= \mathcal{A}^uk + \beta U(c) - \rho k,
\end{aligned}
\tag{76}
\]
with boundary conditions
\[
V(T,x) = F(x) + G(x), \qquad g(T,x) = x, \qquad k(T,x) = 0.
\]
Here, $\theta$ is the amount of money invested in risky assets (following the notation of Basak and Chabakauri [2]), so that we have $\theta = \pi x$ in terms of the proportion $\pi$. The symbol $\mathcal{A}^u$ denotes the partial differential operator that is defined, for any sufficiently differentiable function $\phi = \phi(t,x)$, by
\[
(\mathcal{A}^u\phi)(t,x) = \phi_t(t,x) + \mu\phi_x(t,x) + \tfrac{1}{2}\sigma^2\phi_{xx}(t,x).
\]
By direct calculation, one finds the following equation for the mean-variance-utility case studied in the paper:
\[
0 = \sup_{c,\theta}\Big\{ V_t + \big(rx + (\mu-r)\theta + l - c\big)V_x - \frac{\gamma}{2}\sigma^2\theta^2g_x^2 + \beta U(c) - \rho k \Big\},
\]
with boundary condition $V(T,x) = x$. This equation is to be taken together with the equations for $g$ and $k$ stated earlier. It is seen from the equation above that the optimization problems for $c$ and for $\theta$ can be solved separately. Using the trial solution $V(t,x) = A(t)x + B(t)$, $g(t,x) = a(t)x + b(t)$ and (with a slight abuse of notation) $k(t,x) = k(t)$, one easily finds the following expressions for the amount of money in risky assets and for consumption:
\[
\theta^*(t) = \frac{1}{\gamma}\,\frac{\mu-r}{\sigma^2}\,e^{-r(T-t)}, \qquad c^*(t) = [U']^{-1}\big( \beta^{-1}e^{r(T-t)} \big),
\]
which coincide with the ones derived in Proposition 4.1.

References

[1] Bajeux-Besnainou, I. and R.
Portait, Dynamic Asset Allocation in a Mean-Variance Framework, Management Science, 44 (1998), 79-95.
[2] Basak, S. and Chabakauri, G., Dynamic mean-variance asset allocation, Review of Financial Studies, 23 (2010), 2970-3016.
[3] Bielecki, T., H. Jin, S. R. Pliska, and X. Y. Zhou, Continuous-Time Mean-Variance Portfolio Selection with Bankruptcy Prohibition. Mathematical Finance, 15(2) (2005), 13-44.
[4] Björk, T. and Murgoci, A., A general theory of Markovian time inconsistent stochastic control problems. Working paper, Stockholm School of Economics, 2009.
[5] Björk, T., Murgoci, A., and Zhou, X.Y., Mean-variance portfolio optimization with state-dependent risk aversion. Mathematical Finance, 24(1) (2012), 1-24.
[6] Björk, T., Khapko, M., and Murgoci, A., Time inconsistent stochastic control in continuous time: Theory and examples. Working paper, arxiv.org/abs/…
[7] … Finance and Stochastics, 21 (2017), 331-360.
[8] Brandt, M. W., Portfolio Choice Problems. In Y. Ait-Sahalia and L. P. Hansen (eds.), Handbook of Financial Econometrics. Amsterdam: North-Holland, 2009.
[9] Cairns, A., Some Notes on the Dynamics and Optimal Control of Stochastic Pension Fund Models in Continuous Time. ASTIN Bulletin: The Journal of the IAA, …
[10] … Stochastics and Dynamics, 11(02n03) (2011), 283-299.
[11] Christiansen, M. and Steffensen, M., Deterministic mean-variance-optimal consumption and investment, Stochastics, 85(4) (2013), 620-636.
[12] Cochrane, J. H., A Mean Variance Benchmark for Intertemporal Portfolio Theory. Working Paper, University of Chicago, 2008.
[13] Cvitanic, J., A. Lazrak, and T. Wang, Implications of the Sharpe Ratio as a Performance Measure in Multi-Period Settings. Journal of Economic Dynamics and Control, …
[14] …
[15] Duffie, D., and H. Richardson, Mean-Variance Hedging in Continuous Time. Annals of Applied Probability, 1 (1991), 1-15.
[16] He, X.J. and Zhu, S.P., How should a local regime-switching model be calibrated? Journal of Economic Dynamics and Control, 78 (2017), 149-163.
[17] Kronborg, M.T. and Steffensen, M., Inconsistent investment and consumption problems. Applied Mathematics & Optimization, 71(3) (2015), 473-515.
[18] Kryger, E.M. and Steffensen, M., Some solvable portfolio problems with quadratic and collective objectives, (2010), Available at SSRN 1577265.
[19] Kryger, E.M., Nordfang, M.B., and M. Steffensen, Optimal control of an objective functional with non-linearity between the conditional expectations: solutions to a class of time-inconsistent portfolio problems. Mathematical Methods of Operations Research, (2019), DOI: 10.1007/s00186-019-00687-5.
[20] Leippold, M., F. Trojani, and P. Vanini, Geometric Approach to Multiperiod Mean-Variance Optimization of Assets and Liabilities. Journal of Economic Dynamics and Control, 28(10) (2004), 79-113.
[21] Li, D., and W. L. Ng, Optimal Dynamic Portfolio Selection: Multiperiod Mean-Variance Formulation, Mathematical Finance, 10 (2000), 387-406.
[22] Lim, A. E. B., and X. Y. Zhou, Mean-Variance Portfolio Selection with Random Parameters in a Complete Market, Mathematics of Operations Research, 27(1) (2002), 101-120.
[23] Ma, G., and Zhu, S.P., Optimal investment and consumption under a continuous-time cointegration model with exponential utility. Quantitative Finance, 19(7) (2019), 1135-1149, DOI: 10.1080/…
[24] … European Journal of Operational Research, 278(3) (2019), 976-988.
[25] Markowitz, H.M., Portfolio selection. Journal of Finance, 7(1) (1952), 77-91.
[26] Merton, R.C., Optimum consumption and portfolio rules in a continuous-time model. Stochastic Optimization Models in Finance, 3(4) (1975), 621-661.
[27] Pun, C.S., Robust time-inconsistent stochastic control problems. Automatica, 94 (2018), 249-257.
[28] Shen, Y., Mean-variance portfolio selection in a complete market with unbounded random coefficients. Automatica, 55 (2015), 165-175.
[29] Yan, T., and Wong, H. Y., Open-loop equilibrium strategy for mean-variance portfolio problem under stochastic volatility. Automatica, 107 (2019), 211-223.
[30] Yang, B.Z., Yue, J., and Huang, N.J., Equilibrium price of variance swaps under stochastic volatility with Lévy jumps and stochastic interest rate. International Journal of Theoretical and Applied Finance (IJTAF), 22(04) (2019), 1-33.
[31] Yang, B.Z., He, X.J., and Huang, N.J., Equilibrium price and optimal insider trading strategy under stochastic liquidity with long memory. Applied Mathematics & Optimization, (2020), DOI: 10.1007/s00245-020-09675-2.
[32] Yang, B.Z., Lu, X., Ma, G., and Zhu, S.P., Robust portfolio optimization with multi-factor stochastic volatility, Journal of Optimization Theory and Applications, (2020), DOI: 10.1007/s10957-020-01687-w.
[33] Zeleny, M., Multiple Criteria Decision Making, McGraw-Hill, New York, 1981.
[34] Zhao, Y., and W. T. Ziemba, Mean-Variance versus Expected Utility in Dynamic Investment Analysis. Working Paper, University of British Columbia, 2002.
[35] Zhou, X.Y. and Li, D., Continuous-time mean-variance portfolio selection: a stochastic LQ framework. Applied Mathematics & Optimization, 42 (2000), 19-53.
[36] Zhu, S. P., and Lian, G. H., A closed-form exact solution for pricing variance swaps with stochastic volatility.