Optimal Investment and Consumption under a Habit-Formation Constraint
Bahman Angoshtari ∗ Erhan Bayraktar † Virginia R. Young ‡ This version: February 9, 2021
Abstract
We extend the result of our earlier study [Angoshtari, Bayraktar, and Young; "Optimal consumption under a habit-formation constraint," available at: arXiv:2012.02277, (2020)] to a market setup that includes a risky asset whose price process is a geometric Brownian motion. We formulate an infinite-horizon optimal investment and consumption problem, in which an individual forms a habit based on the exponentially weighted average of her past consumption rate, and in which she invests in a Black-Scholes market. The novelty of our model is in specifying habit formation through a constraint rather than the common approach via the objective function. Specifically, the individual is constrained to consume at a rate higher than a certain proportion $\alpha$ of her consumption habit. Our habit-formation model allows for both addictive ($\alpha = 1$) and nonaddictive ($0 < \alpha < 1$) habits. The optimal investment and consumption policies are derived explicitly in terms of the solution of a system of differential equations with free boundaries, which is analyzed in detail. If the wealth-to-habit ratio is below (resp. above) a critical level $x^*$, the individual consumes at (resp. above) the minimum rate and invests more (resp. less) aggressively in the risky asset. Numerical results show that the addictive habit formation requires significantly more wealth to support the same consumption rate compared to a moderately nonaddictive habit. Furthermore, an individual with a more addictive habit invests less in the risky asset compared to an individual with a less addictive habit but with the same wealth-to-habit ratio and risk aversion, which provides an explanation for the equity-premium puzzle.

Keywords: Optimal investment and consumption, habit formation, habit persistence, average past consumption, stochastic control, free-boundary problem.
∗ Department of Mathematics, University of Miami.
† Department of Mathematics, University of Michigan. E. Bayraktar is supported in part by the National Science Foundation under grant DMS-1613170 and by the Susan M. Smith Professorship.
‡ Department of Mathematics, University of Michigan. V. R. Young is supported in part by the Cecil J. and Ethel M. Nesbitt Professorship.

The study of consumption habit formation is a classical topic in financial economics, and the literature goes back to the late 1960s. See, for instance, Pollak (1970), Ryder and Heal (1973), Sundaresan (1989), Constantinides (1990), Detemple and Zapatero (1991), Detemple and Zapatero (1992) for early works, and Detemple and Karatzas (2003), Munk (2008), Englezos and Karatzas (2009), Muraviev (2011), and Yu (2015) for more recent studies. In this literature, habit formation is modeled through the so-called habit-formation preference $\mathbb{E}\big[\int_0^T U(t, C_t - Z_t)\,dt\big]$, in which $U : [0, T] \times \mathbb{R} \to \mathbb{R}$ is a given utility function and $Z_t$ is the agent's habit (or standard of living), defined as the exponentially weighted running average of past consumption rates $C_s$, $0 \le s < t$. If the consumption rate is allowed to fall below the habit, the habit-formation model is called nonaddictive. Otherwise, a model with a constraint $C_t \ge Z_t$ is called addictive. Habit-formation models are notoriously more difficult to solve than their non-habit-formation counterparts.
Indeed, explicit forms for optimal policies are rare and, in most cases, the optimal policy is specified in terms of a solution of a PDE or an unknown process characterized via the martingale representation theorem.

A related literature on consumption ratcheting and drawdown is devoted to models of optimal consumption under a more severe form of habit formation in which the reference point for forming habit is the running maximum of past consumption rates (instead of their running average). Dybvig (1995) found the optimal investment and consumption policies for an investor in a Black-Scholes financial market who seeks to maximize discounted utility of consumption, while imposing a ratcheting constraint on the rate of consumption (that is, the consumption rate has to be a non-decreasing process). Arun (2012) extended Dybvig (1995) by allowing the rate of consumption to decrease, but not below a fraction of its maximum rate (that is, a so-called drawdown constraint on the consumption rate). See, also, Jeon et al. (2018) and Roche (2019) for similar models. Angoshtari et al. (2019) solved a problem setting similar to that of Arun (2012) that also allowed for the agent's bankruptcy (in the context of an optimal dividend problem), which occurred with positive probability. In a related yet different setting, Albrecher et al. (2020b) and Albrecher et al. (2020a) considered an optimal dividend problem in a Brownian risk model while imposing a ratcheting constraint on the dividend rates. In the studies above, habit formation is modeled by imposing a constraint on admissible consumption policies, rather than through the objective function, which is the approach taken for classical habit-formation models. Recently, Deng et al. (2020) provided a direct link to the classical literature of habit formation by solving an optimal investment and consumption model with a habit-formation preference (that is, they modeled habit formation through the objective function rather than through the admissibility set), in which the habit is represented by the running maximum of consumption.

Habit-formation models based on the running maximum have been more tractable and produced more explicit policies than those with the running average as the reference point. The former class of models, however, represents a more extreme form of habit formation in the sense that the effect of past consumption does not "fade away" with time, as one expects. Indeed, under a drawdown constraint, our future habits will change forever if we decide to increase consumption beyond its historical maximum. In reality, recent levels of consumption have more effect on our current consumption habit than how we consumed a long time ago, and the effect of past consumption fades away with time. These observations motivated us to consider a habit-formation model in which the reference point of habit is the running average of consumption (as in the habit-formation literature), and the habit-formation mechanism operates through a constraint on admissible consumption policies (as in the consumption ratcheting and drawdown literature). In a sense, we also provide a connection between these two bodies of work, however in the opposite direction of Deng et al. (2020).

In Angoshtari et al. (2020), we provided the first step by solving a deterministic optimal consumption problem with the objective of maximizing the functional
$$\int_0^{+\infty} e^{-\delta t}\, \frac{\big(C(t)/Z(t)\big)^{1-\gamma}}{1-\gamma}\, dt$$
while imposing the habit-formation constraint $C(t) \ge \alpha Z(t)$ for all $t \ge 0$. Here, $C(t)$, $t \ge 0$, is the deterministic consumption rate and
$$Z(t) = e^{-\rho t}\left(z + \int_0^t \rho\, e^{\rho u}\, C(u)\, du\right); \quad t \ge 0,$$
is the exponentially weighted average of the consumption rate up to time $t$. In particular, we assumed that the individual funds her consumption solely through a riskless asset offering an interest rate $r > 0$; thus, wealth and consumption processes were deterministic. To avoid bankruptcy, we showed that the wealth-to-habit ratio must always be above a certain level $\underline{x}$ given by (2.5) below. We showed that there exists a threshold $x^*$ such that if the ratio of wealth-to-habit is above (resp. below) $x^*$, it is optimal to consume at a rate greater than (resp. equal to) the minimum acceptable rate imposed by the habit-formation constraint. We also found a significant difference between impatient individuals (those with $\delta \ge \rho(1 - \alpha) + r$) and patient individuals (those with $0 < \delta < \rho(1 - \alpha) + r$). Impatient individuals always consume above the minimum rate (that is, $x^* = \underline{x}$) and, thereby, eventually attain the minimum wealth-to-habit ratio $\underline{x}$, while patient individuals might consume at the minimum rate (that is, $x^* > \underline{x}$) and, thereby, attain a wealth-to-habit ratio greater than the minimum acceptable level. We obtained explicit results in terms of the solution of a nonlinear free-boundary problem.

In this paper, we extend the model in Angoshtari et al. (2020) by assuming that the agent invests in a Black-Scholes financial market. We formulate and solve a stochastic control problem to obtain the optimal investment and consumption policies. We find that the optimal consumption policy has a similar general structure as what we found in the riskless case. That is, there exists a critical level $x^*$ of wealth-to-habit ratio such that the agent consumes above the minimum rate if her wealth-to-habit ratio is above $x^*$ and consumes at the minimum rate otherwise. The value of $x^*$ and the optimal consumption function are, however, different from their counterparts in the riskless case.
In particular, we don't see the structural difference between the consumption functions of patient and impatient individuals in that $x^* > \underline{x}$ for all values of $\delta$. As for the investment policy, we found that the agent optimally invests "more aggressively" in the stock when her wealth-to-habit ratio is below $x^*$ compared to when it is above $x^*$. By more aggressive investment, we mean that an (infinitesimal) increase in wealth-to-habit results in a larger increase in stock holdings. Finally, numerical analysis shows that increasing $\alpha$ (while keeping wealth-to-habit ratio and risk aversion constant) decreases the optimal investment in the risky asset. In other words, individuals with more addictive habit formation (that is, larger $\alpha$) optimally invest less in the risky asset. Thus, the market has to provide a higher premium to attract such an individual, which indicates that our model provides an explanation for the equity premium puzzle of Mehra and Prescott (1985).

On the mathematical side, the results presented here rely on analyzing a coupled system of first-order ODEs with a free boundary, as opposed to a single ODE in Angoshtari et al. (2020). The analysis of such a system is more delicate (see Proposition 3.1) and provides the main technical backbone of the paper. A second technical point of the paper is the verification theorem (Theorem 3.1), which did not pose many difficulties in Angoshtari et al. (2020) when there is no stochasticity involved. Besides, the fact that the drift coefficient of the optimal wealth SDE has more than linear growth and the coefficients are only semi-explicit makes certain parts of the verification argument somewhat non-standard.

The paper is organized as follows. In Section 2, we introduce the consumption habit process and its basic properties, formulate a stochastic control problem for finding the optimal investment and consumption policy, and prove a verification lemma for the stochastic control problem.
In Section 3, we formulate the Hamilton-Jacobi-Bellman (HJB) free-boundary problem and solve it semi-explicitly by applying the Legendre transform. This section also includes the main result of the paper, namely, Theorem 3.1, in which we verify that the solution of the HJB free-boundary problem yields the value function and the optimal investment and consumption policies. In Section 4, we include a series of numerical examples that highlight certain properties of the optimal policy. Proofs of auxiliary results are included in Appendices A and B.

We consider an individual who invests in a market consisting of a riskless and a risky asset in order to maximize her utility of lifetime consumption. We assume that the riskless asset pays interest at a fixed rate $r > 0$ and that the price of the risky asset $(S_t)_{t \ge 0}$ follows a geometric Brownian motion
$$\frac{dS_t}{S_t} = \mu\, dt + \sigma\, dB_t; \quad t \ge 0,$$
in which $\mu > r$ and $\sigma > 0$ are constants, and $(B_t)_{t \ge 0}$ is a standard Brownian motion in a filtered probability space $\big(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F} = (\mathcal{F}_t)_{t \ge 0}\big)$, in which the filtration $\mathbb{F}$ is generated by the Brownian motion and satisfies the usual conditions.

Let $\pi_t$ denote the amount invested in the risky asset, and let $C_t$ denote the individual's consumption rate at time $t$, so that $\int_0^t C_u\, du$ is the total consumption over the time interval $[0, t]$. Then, her wealth process $(W_t)_{t \ge 0}$ follows the dynamics
$$dW_t = \big(r W_t + (\mu - r)\pi_t - C_t\big)\, dt + \sigma \pi_t\, dB_t, \tag{2.1}$$
for $t \ge 0$, with $W_0 = w > 0$.

For a given consumption process $(C_t)_{t \ge 0}$, we define the individual's habit process (that is, consumption habit) as the process $(Z_t)_{t \ge 0}$ given by
$$Z_t = e^{-\rho t}\left(z + \int_0^t \rho\, e^{\rho u}\, C_u\, du\right); \quad t \ge 0, \tag{2.2}$$
which has the following equivalent differential form:
$$dZ_t = -\rho\,(Z_t - C_t)\, dt; \quad t \ge 0, \qquad Z_0 = z. \tag{2.3}$$
Here, $\rho > 0$ is a constant, and $z > 0$ represents the initial consumption habit of the individual. The parameter $\rho$ determines how much current habit is influenced by the recent rate of consumption relative to the consumption rate farther in the past. As $\rho$ increases, more weight is given to recent consumption. In the limiting cases, $\rho = 0$ implies $Z_t = z$, and $\rho = +\infty$ implies $Z_t = C_t$.

For $t > 0$, the consumption habit $Z_t$ given by (2.2) is the exponentially weighted moving average of past consumption $(C_s)_{s < t}$. To see this, let us assume that the individual lived (and consumed) over the time period $(-\infty, t)$. Let $z$ be the exponentially weighted average of her consumption rate before time zero, that is, $z = \int_{-\infty}^0 \rho\, e^{\rho u}\, C_u\, du$. (Note that $\int_{-\infty}^0 \rho\, e^{\rho u}\, du = 1$.) By substituting for $z$ in (2.2), we obtain
$$Z_t = \int_{-\infty}^0 \rho\, e^{-\rho(t-u)}\, C_u\, du + \int_0^t \rho\, e^{-\rho(t-u)}\, C_u\, du = \int_{-\infty}^t \rho\, e^{-\rho(t-u)}\, C_u\, du,$$
with $\int_{-\infty}^t \rho\, e^{-\rho(t-u)}\, du = 1$. Thus, $Z_t$ is the exponentially weighted moving average of $(C_s)_{s < t}$, as claimed.

We consider a consumption habit formation for the individual by assuming that, at any time $t \ge 0$, she is unwilling to consume at a rate that is below a certain proportion of her habit $Z_t$. In particular, we impose the following constraint on the individual's consumption process:
$$C_t \ge \alpha Z_t; \quad \mathbb{P}\text{-a.s.}, \ t \ge 0, \tag{2.4}$$
in which $0 < \alpha \le 1$ is a constant. The larger the value of $\alpha$, the less tolerant the individual is in allowing her current consumption to fall below her habit. Note that the consumption habit process $(Z_t)_{t \ge 0}$ depends on $z$ and on the consumption process $(C_t)_{t \ge 0}$. To ease the notational burden, however, we write $Z_t$ instead of the more accurate $Z_t^{z, (C_s)_{0 \le s \le t}}$.

The following lemma establishes a lower bound for the consumption habit process, and we use it in later arguments. We omit the proof of this lemma because it closely follows the proof of Lemma 2.1 in Angoshtari et al. (2020).

Lemma 2.1.
Let $(C_t)_{t \ge 0}$ be a consumption process satisfying (2.4), in which $(Z_t)_{t \ge 0}$ is given by (2.2). We, then, have $Z_t \ge Z_s\, e^{-\rho(1-\alpha)(t-s)}$, $\mathbb{P}$-a.s., for all $0 \le s \le t$. In particular, $Z_t \ge z\, e^{-\rho(1-\alpha)t}$, $\mathbb{P}$-a.s., for all $t \ge 0$.

We assume that the individual avoids bankruptcy with probability one. The following lemma provides the corresponding necessary and sufficient condition. In it, we use the notation
$$\underline{x} = \underline{x}(\alpha) := \frac{\alpha}{r + \rho(1 - \alpha)}, \tag{2.5}$$
for $\alpha \in [0, 1]$. Note that $\underline{x}$ is strictly increasing in $\alpha$, $\underline{x}(0) = 0$, and $\underline{x}(1) = 1/r$. Again, we omit the proof of this lemma because it closely follows the proof of Lemma 2.2 in Angoshtari et al. (2020).

Lemma 2.2.
Let the $\mathbb{F}$-adapted process $(\pi_t, C_t)_{t \ge 0}$ satisfy condition (2.9) below, and let $(W_t)_{t \ge 0}$ and $(Z_t)_{t \ge 0}$ be given by (2.1) and (2.2), respectively. Then, $W_t > 0$ for all $t \ge 0$, $\mathbb{P}$-a.s., if and only if
$$\frac{W_t}{Z_t} \ge \underline{x}, \tag{2.6}$$
$\mathbb{P}$-a.s., for all $t \ge 0$.

Note that, as $\alpha \to 0^+$, the requirement for consumption (2.4) becomes $C_t \ge 0$, and inequality (2.6) becomes moot, which we expect because this limiting case is the market model considered by Merton (1969). Also, note that, in the special case of $\alpha = 1$, the requirement for consumption (2.4) becomes $C_t \ge Z_t$, and inequality (2.6) becomes $r W_t \ge Z_t$, which is consistent with the feasibility condition adopted by Dybvig (1995), namely, that $r W_t \ge C_{t-}$. Note, however, that our preference specification in (2.10) differs from Merton's and Dybvig's for the case $\alpha \to 0^+$. Therefore, our optimal policies do not converge to theirs as $\alpha \to 0^+$ or for $\alpha = 1$.

In the following, we define the set of admissible investment and consumption policies as those that avoid bankruptcy while satisfying the individual's consumption habit-formation constraint.

Definition 2.1.
Let $\widetilde{\mathcal{A}}(\alpha)$ be the set of all processes $(\pi_t, C_t)_{t \ge 0}$ such that $(\pi_t)_{t \ge 0}$ is $\mathbb{F}$-adapted, $(C_t)_{t \ge 0}$ is non-negative and $\mathbb{F}$-progressively measurable,
$$\int_0^t \big(\pi_u^2 + C_u\big)\, du < +\infty; \quad t \ge 0, \ \mathbb{P}\text{-a.s.},$$
and conditions (2.4) and (2.6) hold, namely, $C_t \ge \alpha Z_t$ and $W_t \ge \underline{x}\, Z_t$, $\mathbb{P}$-a.s., for all $t \ge 0$, in which $(W_t)_{t \ge 0}$ and $(Z_t)_{t \ge 0}$ are given by (2.1) and (2.2), respectively.

Next, we formulate the individual's lifetime consumption and investment problem as a stochastic control problem. For any admissible investment and consumption policy $(\pi_t, C_t)_{t \ge 0}$, let us introduce the wealth-to-habit process
$$X_t := \frac{W_t}{Z_t}; \quad t \ge 0, \tag{2.7}$$
and note that, by (2.1) and (2.3),
$$dX_t = \Big((\rho + r) X_t + (\mu - r)\theta_t - (1 + \rho X_t)\, c_t\Big)\, dt + \sigma \theta_t\, dB_t; \quad t \ge 0, \qquad X_0 = x := \frac{w}{z} \ge \underline{x}, \tag{2.8}$$
in which we have defined the investment-to-habit process $(\theta_t)_{t \ge 0}$ and the consumption-to-habit process $(c_t)_{t \ge 0}$ by
$$\theta_t := \frac{\pi_t}{Z_t} \quad \text{and} \quad c_t := \frac{C_t}{Z_t},$$
respectively.

We define the set of admissible investment-to-habit and consumption-to-habit policies as follows.

Definition 2.2.
Let $\mathcal{A} = \mathcal{A}(\alpha)$ be the set of all processes $(\theta_t, c_t)_{t \ge 0}$ such that $(\theta_t)_{t \ge 0}$ is $\mathbb{F}$-adapted, $(c_t)_{t \ge 0}$ is $\mathbb{F}$-progressively measurable,
$$\int_0^t \big(\theta_u^2 + c_u\big)\, du < +\infty; \quad \mathbb{P}\text{-a.s.}, \ t \ge 0, \tag{2.9}$$
and $c_t \ge \alpha$ and $X_t \ge \underline{x}$, $\mathbb{P}$-a.s., for all $t \ge 0$, in which $(X_t)_{t \ge 0}$ is given by (2.7).

As the following proposition states, our two definitions of admissible policies are equivalent in the sense that any admissible investment and consumption policy corresponds to an admissible relative investment and consumption policy and vice versa. Its proof is an application of Itô's lemma and, thus, omitted.

Proposition 2.1.
Assume that $(\pi_t, C_t)_{t \ge 0} \in \widetilde{\mathcal{A}}(\alpha)$ and let $(Z_t)_{t \ge 0}$ be given by (2.2). Then, we have $(\pi_t / Z_t,\, C_t / Z_t)_{t \ge 0} \in \mathcal{A}(\alpha)$. Conversely, assume that $(\theta_t, c_t)_{t \ge 0} \in \mathcal{A}(\alpha)$, and let $(W_t)_{t \ge 0}$ be the solution of
$$\frac{dW_t}{W_t} = \left(r + (\mu - r)\frac{\theta_t}{X_t} - \frac{c_t}{X_t}\right) dt + \sigma \frac{\theta_t}{X_t}\, dB_t; \quad t \ge 0, \qquad W_0 = w,$$
in which $(X_t)_{t \ge 0}$ is given by (2.8). We, then, have $(\pi_t := \theta_t W_t / X_t,\ C_t := c_t W_t / X_t) \in \widetilde{\mathcal{A}}(\alpha)$.

We assume that the individual values her consumption relative to her habit. In particular, for a given consumption process $(C_t)_{t \ge 0}$, the expected utility of her lifetime consumption is given by
$$\mathbb{E}\left(\int_0^{\tau_d} \frac{1}{1 - \gamma}\left(\frac{C_t}{Z_t}\right)^{1-\gamma} e^{-\tilde{\delta} t}\, dt\right) = \mathbb{E}\left(\int_0^{+\infty} \frac{1}{1 - \gamma}\left(\frac{C_t}{Z_t}\right)^{1-\gamma} e^{-(\tilde{\lambda} + \tilde{\delta}) t}\, dt\right), \tag{2.10}$$
in which $\tilde{\delta} > 0$ is the individual's subjective time preference, $\gamma > 0$ (with $\gamma \ne 1$) is her (constant) relative risk aversion, and $\tau_d$ is the random time of her death, which we assume is exponentially distributed with mean $1/\tilde{\lambda} > 0$, and $\tau_d$ is independent of the Brownian motion.

In light of Proposition 2.1, the individual's optimal investment-consumption problem is, thus, formulated by the following stochastic control problem:
$$V(x) = V(x; \alpha) := \sup_{(\theta_t, c_t) \in \mathcal{A}(\alpha)} \mathbb{E}_x\left(\int_0^{+\infty} \frac{c_t^{1-\gamma}}{1 - \gamma}\, e^{-\delta t}\, dt\right); \quad x \ge \underline{x}, \tag{2.11}$$
in which $\delta = \tilde{\delta} + \tilde{\lambda}$, and $\mathbb{E}_x$ denotes conditional expectation given $X_0 = x$.

We end this section by proving a verification theorem for the stochastic control problem (2.11). For its statement, we define the operator $\mathcal{L}^{\theta, c}$ on twice-differentiable functions by
$$\mathcal{L}^{\theta, c} v(x) = -\delta v(x) + \Big((\rho + r)x + (\mu - r)\theta\Big) v'(x) + \frac{1}{2}\sigma^2 \theta^2 v''(x) + \frac{c^{1-\gamma}}{1 - \gamma} - c\,(1 + \rho x)\, v'(x).$$

Theorem 2.1.
Suppose $v \in C^2\big([\underline{x}, +\infty)\big)$ satisfies the following properties: for any $x \ge \underline{x}$,
(i) $\mathcal{L}^{\theta, c} v(x) \le 0$ for all $\theta \in \mathbb{R}$ and $c \ge \alpha$.
(ii) $v'(x) > 0$, $v(\underline{x}) = \dfrac{\alpha^{1-\gamma}}{\delta(1 - \gamma)}$, and $\lim_{x \to \underline{x}^+} v'(x) = +\infty$.
(iii) $\lim_{T \to +\infty} \mathbb{E}_x\big(e^{-\delta T} v(X_T)\big) = 0$ for any wealth-to-habit process $(X_t)_{t \ge 0}$ arising from an admissible policy $(\theta_t, c_t)_{t \ge 0} \in \mathcal{A}(\alpha)$.
(iv) $\mathcal{L}^{\theta^*(x),\, c^*(x)} v(x) = 0$ for some functions $\theta^*(x)$ and $c^*(x) \ge \alpha$.
(v) For $\theta^*$ and $c^*$ in condition (iv), the following stochastic differential equation has a unique strong solution:
$$dX_t^* = \Big((\rho + r) X_t^* + (\mu - r)\theta^*(X_t^*) - (1 + \rho X_t^*)\, c^*(X_t^*)\Big)\, dt + \sigma \theta^*(X_t^*)\, dB_t; \quad t \ge 0, \qquad X_0^* = x,$$
and $\big(\theta^*(X_t^*),\, c^*(X_t^*)\big)_{t \ge 0} \in \mathcal{A}$.
Then, $v = V$ on $[\underline{x}, +\infty)$, and $\big(\theta^*(X_t^*),\, c^*(X_t^*)\big)_{t \ge 0}$ is an optimal policy.

Proof. See Appendix A.
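As a quick numerical illustration of the objects introduced in this section, the following Python sketch evaluates the bankruptcy threshold in (2.5), the drift coefficient of the wealth-to-habit dynamics (2.8), and an explicit Euler discretization of the habit dynamics (2.3) under the feedback rule "consumption equals a fixed multiple of habit". The function names, the Euler scheme, and all parameter values are illustrative assumptions made here; they do not come from the paper.

```python
import math


def x_lower(alpha, r, rho):
    """Minimum wealth-to-habit ratio, formula (2.5)."""
    return alpha / (r + rho * (1.0 - alpha))


def drift(x, theta, c, r, rho, mu):
    """Drift coefficient of the wealth-to-habit SDE (2.8)."""
    return (rho + r) * x + (mu - r) * theta - (1.0 + rho * x) * c


def habit_path(z0, c_ratio, rho, horizon, n_steps):
    """Explicit Euler scheme for (2.3) when consumption is c_ratio times habit."""
    dt = horizon / n_steps
    z = z0
    for _ in range(n_steps):
        consumption = c_ratio * z
        z += -rho * (z - consumption) * dt
    return z


# Hypothetical parameter values, chosen only for illustration.
r, rho, mu, alpha = 0.02, 1.0, 0.08, 0.7
horizon = 5.0

# Consuming at the minimum rate makes the bound of Lemma 2.1 an equality.
z_T = habit_path(z0=1.0, c_ratio=alpha, rho=rho, horizon=horizon, n_steps=100_000)
lemma_bound = math.exp(-rho * (1.0 - alpha) * horizon)
print(z_T, lemma_bound)

# With no risky investment and minimum consumption, the threshold is a rest point.
print(drift(x_lower(alpha, r, rho), 0.0, alpha, r, rho, mu))
```

Under these assumptions, the simulated habit agrees with the lower bound of Lemma 2.1 up to discretization error, and the drift in (2.8) vanishes at the threshold when the risky position is zero and consumption is at its minimum, which is one way to see why (2.6) can be maintained there only by holding the riskless asset.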
In this section, we consider the stochastic control problem (2.11) when (2.4) is a habit-formation constraint, that is, when $0 < \alpha \le 1$. In other words, we exclude the case $\alpha = 0$.

Theorem 2.1 implies that the value function $V(\cdot\,; \alpha)$ is a solution of the following differential equation:
$$-\delta v(x) + (\rho + r)\, x v' + \sup_{\theta}\left[(\mu - r)\theta v' + \frac{1}{2}\sigma^2 \theta^2 v''\right] + \sup_{c \ge \alpha}\left[\frac{c^{1-\gamma}}{1 - \gamma} - (1 + \rho x)\, c\, v'\right] = 0; \quad x \ge \underline{x}. \tag{3.1}$$
For the rest of this section, we construct a classical solution of (3.1), and then use Theorem 2.1 to show that the solution equals the value function $V(\cdot\,; \alpha)$ in (2.11).

To construct a candidate solution, we hypothesize that the optimal investment and consumption policy has the following form. There exists a critical value of wealth-to-habit ratio $x^* \ge \underline{x}$, such that,
(a) If $\underline{x} \le X_t \le x^*$, it is optimal to consume at the minimum rate, that is, $c_t^* = \alpha$. Also, if $X_t = \underline{x}$, it is optimal to invest fully in the riskless asset, that is, $\theta_t^* = 0$.
(b) If $X_t > x^*$, it is optimal to consume more than the minimum rate.

The optimal expressions for $c$ and $\theta$ in (3.1) are given by
$$c^*(x) := \begin{cases} \alpha; & (1 + \rho x)\, v'(x) \ge \alpha^{-\gamma}, \\[4pt] \big((1 + \rho x)\, v'(x)\big)^{-1/\gamma}; & 0 < (1 + \rho x)\, v'(x) < \alpha^{-\gamma}, \end{cases} \tag{3.2}$$
and
$$\theta^*(x) := -\frac{\mu - r}{\sigma^2}\, \frac{v'(x)}{v''(x)}, \tag{3.3}$$
respectively. To obtain these equations, we assume that $v' > 0$ and $v'' < 0$, which we show in Proposition 3.2 below. Thus, for (a) and (b) in our hypothesis to be true, we must have
$$\begin{cases} \lim_{x \to \underline{x}^+} \dfrac{v'(x)}{v''(x)} = 0, \\[6pt] (1 + \rho x)\, v'(x) \ge \alpha^{-\gamma}; & \underline{x} \le x \le x^*, \\[4pt] 0 < (1 + \rho x)\, v'(x) < \alpha^{-\gamma}; & x > x^*. \end{cases} \tag{3.4}$$
Under these additional conditions, (3.1) becomes the following free-boundary problem (FBP):
$$\begin{cases} \kappa\, \dfrac{v'(x)^2}{v''(x)} - \alpha\left(\dfrac{x}{\underline{x}} - 1\right) v'(x) + \delta v(x) = \dfrac{\alpha^{1-\gamma}}{1 - \gamma}; & \underline{x} \le x \le x^*, \\[8pt] \kappa\, \dfrac{v'(x)^2}{v''(x)} - (r + \rho)\, x v'(x) + \delta v(x) = \dfrac{\gamma}{1 - \gamma}\big((1 + \rho x)\, v'(x)\big)^{-\frac{1-\gamma}{\gamma}}; & x > x^*, \\[8pt] \lim_{x \to \underline{x}^+} \dfrac{v'(x)}{v''(x)} = 0, \\[6pt] (1 + \rho x^*)\, v'(x^*) = \alpha^{-\gamma}, \end{cases} \tag{3.5}$$
in which $x^* \ge \underline{x}$ is unknown, and in which $\kappa$ is defined by
$$\kappa = \frac{(\mu - r)^2}{2\sigma^2}.$$

In anticipation that $v$ is increasing and concave, we apply the Legendre transform to $v$ to define its convex dual $u$ by
$$u(y) := \sup_{x \ge \underline{x}} \big\{v(x) - xy\big\}; \quad y > 0. \tag{3.6}$$
Here, we assume $\lim_{x \to \underline{x}^+} v'(x) = +\infty$, which we show in Proposition 3.2 below. By using the relationships
$$v\big(I(y)\big) = u(y) - y u'(y), \qquad I(y) = -u'(y), \qquad \text{and} \qquad v''\big(I(y)\big) = -\frac{1}{u''(y)}, \tag{3.7}$$
in which $I(\cdot)$ is the inverse of $v'(\cdot)$ (that is, $v'\big(I(y)\big) = y$, for $y > 0$), FBP (3.5) transforms into the following FBP:
$$-\kappa y^2 u''(y) + \big(r + \rho(1 - \alpha) - \delta\big)\, y u'(y) + \delta u(y) = \frac{\alpha^{1-\gamma}}{1 - \gamma} - \alpha y; \quad y \ge y^*, \tag{3.8}$$
$$-\kappa y^2 u''(y) + (r + \rho - \delta)\, y u'(y) + \delta u(y) = \frac{\gamma}{1 - \gamma}\big(y - \rho y u'(y)\big)^{-\frac{1-\gamma}{\gamma}}; \quad 0 < y < y^*, \tag{3.9}$$
$$\lim_{y \to +\infty} u'(y) = -\underline{x}, \tag{3.10}$$
$$\lim_{y \to +\infty} y u''(y) = 0, \tag{3.11}$$
and
$$y^* - \rho y^* u'(y^*) = \alpha^{-\gamma}, \tag{3.12}$$
in which $y^* = v'(x^*)$ is unknown.

It is easier to analyze $u$'s second-order differential equation in (3.9) on $(0, y^*)$ by transforming it into a system of first-order ODEs.
Specifically, by formally defining $\varphi$ and $H$ by
$$\varphi(y) = y - \rho y u'(y),$$
and
$$H(y) = \frac{1}{\varphi(y)}\left[\delta u(y) - \frac{\gamma}{1 - \gamma}\, \varphi(y)^{-\frac{1-\gamma}{\gamma}} - \frac{r + \rho - \delta}{\rho}\big(\varphi(y) - y\big)\right],$$
respectively, and by manipulating these expressions via the differential equation (3.9) and the free-boundary condition in (3.12), we obtain the system in part (i) of the following proposition. (As an aside, we find the value of $H$ at the free boundary $y^*$ by first solving for $u$ on $[y^*, +\infty)$ and by using continuity of $u$ to obtain $u(y^*)$ and, then, $H(y^*)$.) Proposition 3.1 provides the complete solution of FBP (3.8)-(3.12).

Proposition 3.1.
Define the constant $\lambda \in (-\delta/\kappa, 0)$ by
$$\lambda = \frac{1}{2\kappa}\left(\big(\kappa + r + \rho(1 - \alpha) - \delta\big) - \sqrt{\big(\kappa + r + \rho(1 - \alpha) - \delta\big)^2 + 4\delta\kappa}\right), \tag{3.13}$$
and define constants $\eta_1 < \eta_2$ by
$$\eta_1 := \frac{\lambda\, \alpha^{-\gamma}}{(\lambda - 1)(1 + \rho \underline{x})}, \qquad \text{and} \qquad \eta_2 := \frac{\alpha^{-\gamma}}{1 + \rho \underline{x}}. \tag{3.14}$$
Then, we have:
(i) There exists a constant $y^* \in (\eta_1, \eta_2)$, an increasing function $\varphi : (0, y^*] \to (0, \alpha^{-\gamma}]$, and a function $H : (0, y^*] \to (0, \kappa/\rho]$ satisfying the system:
$$\begin{cases} \varphi'(y) = \dfrac{\rho}{\kappa y}\left(\dfrac{\kappa}{\rho} - H(y)\right)\varphi(y), \\[8pt] H'(y) = \dfrac{\rho}{\kappa y}\left(\dfrac{\kappa}{\rho} - H(y)\right)\left(\varphi(y)^{-1/\gamma} - H(y) - \dfrac{r + \rho - \delta}{\rho}\right) + \dfrac{r + \rho}{\rho\, \varphi(y)} - \dfrac{\delta}{\rho y}, \\[8pt] \varphi(y^*) = \alpha^{-\gamma}, \\[4pt] H(y^*) = \dfrac{\kappa}{\rho}\left[1 + \lambda\left(\dfrac{y^*}{\eta_1} - 1\right)\right], \end{cases} \tag{3.15}$$
for $0 < y \le y^*$. Furthermore, $\lim_{y \to 0^+} H(y) = \kappa/\rho$.
(ii) A solution of FBP (3.8)-(3.12) is given by $y^*$ as in (i) and by $u : \mathbb{R}_+ \to \mathbb{R}$ given by
$$u(y) = \begin{cases} \dfrac{y^*(1 + \rho \underline{x}) - \alpha^{-\gamma}}{\rho\lambda}\left(\dfrac{y}{y^*}\right)^{\lambda} - \underline{x}\, y + \dfrac{\alpha^{1-\gamma}}{\delta(1 - \gamma)}; & y \ge y^*, \\[10pt] \dfrac{1}{\delta}\left[\varphi(y) H(y) + \dfrac{\gamma}{1 - \gamma}\, \varphi(y)^{-\frac{1-\gamma}{\gamma}} + \dfrac{r + \rho - \delta}{\rho}\big(\varphi(y) - y\big)\right]; & 0 < y < y^*, \end{cases} \tag{3.16}$$
in which $\varphi$ and $H$ are as in (i). Furthermore, $u \in C^2(\mathbb{R}_+)$ is strictly decreasing and convex, and $\lim_{y \to 0^+} u'(y) = -\infty$.

Proof. In various parts of this proof, we will use the fact that $\lambda$ in (3.13) solves the quadratic equation
$$f(\lambda) := -\kappa\lambda^2 + \big(\kappa + r + \rho(1 - \alpha) - \delta\big)\lambda + \delta = 0. \tag{3.17}$$
Let $\lambda'$ denote the other zero of the quadratic function $f$, which is given by
$$\lambda' = \frac{1}{2\kappa}\left(\big(\kappa + r + \rho(1 - \alpha) - \delta\big) + \sqrt{\big(\kappa + r + \rho(1 - \alpha) - \delta\big)^2 + 4\delta\kappa}\right) > 1. \tag{3.18}$$
That $\lambda \in (-\delta/\kappa, 0)$ follows from $f(-\delta/\kappa) = -\big(r + \rho(1 - \alpha)\big)\delta/\kappa < 0$ and $f(0) = \delta > 0$. That $\lambda' > 1$ follows from $f(1) = r + \rho(1 - \alpha) > 0$ and $\lim_{\xi \to +\infty} f(\xi) = -\infty$. Below, we prove (i) and then (ii).

Proof of (i): When reading this part of the proof, it is helpful to refer to Figure 1 in Section 4 for visual reference.
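The root bounds just established are easy to check numerically. The sketch below, under hypothetical parameter values, computes both zeros of the quadratic in (3.17) and the two constants in (3.14), written here as `eta_lower` and `eta_upper`. It assumes that the larger constant equals $\alpha^{-\gamma}/(1 + \rho \underline{x})$ and that the smaller one is $\lambda/(\lambda - 1)$ times the larger; the helper names are introduced only for this illustration.

```python
import math


def quadratic_f(xi, kappa, r, rho, alpha, delta):
    """The quadratic in (3.17)."""
    return -kappa * xi * xi + (kappa + r + rho * (1.0 - alpha) - delta) * xi + delta


def quadratic_roots(kappa, r, rho, alpha, delta):
    """Return the negative and positive zeros of the quadratic, as in (3.13) and (3.18)."""
    b = kappa + r + rho * (1.0 - alpha) - delta
    root = math.sqrt(b * b + 4.0 * delta * kappa)
    return (b - root) / (2.0 * kappa), (b + root) / (2.0 * kappa)


def eta_constants(lam, alpha, gamma, r, rho):
    """Return the smaller and larger constants of (3.14) under the stated assumption."""
    x_low = alpha / (r + rho * (1.0 - alpha))          # threshold (2.5)
    eta_upper = alpha ** (-gamma) / (1.0 + rho * x_low)
    eta_lower = lam / (lam - 1.0) * eta_upper
    return eta_lower, eta_upper


# Hypothetical parameter values, chosen only for illustration.
r, mu, sigma, rho, alpha, delta, gamma = 0.02, 0.08, 0.2, 1.0, 0.7, 0.05, 2.0
kappa = (mu - r) ** 2 / (2.0 * sigma ** 2)

lam, lam_prime = quadratic_roots(kappa, r, rho, alpha, delta)
eta_lower, eta_upper = eta_constants(lam, alpha, gamma, r, rho)
print(lam, lam_prime)
print(eta_lower, eta_upper)
```

Because the negative root is negative, the factor $\lambda/(\lambda - 1)$ lies strictly between 0 and 1, which is why the two constants are ordered as the proposition requires.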
Define the set
$$D := \left\{(y, \varphi, H) : y, \varphi > 0,\ 0 < H < \frac{\kappa}{\rho}\right\}, \tag{3.19}$$
and functions
$$g_1(y, \varphi, H) := \frac{\rho}{\kappa y}\left(\frac{\kappa}{\rho} - H\right)\varphi, \tag{3.20}$$
and
$$g_2(y, \varphi, H) := \frac{\rho}{\kappa y}\left(\frac{\kappa}{\rho} - H\right)\left(\varphi^{-1/\gamma} - H - \frac{r + \rho - \delta}{\rho}\right) + \frac{r + \rho}{\rho\, \varphi} - \frac{\delta}{\rho y}, \tag{3.21}$$
for $(y, \varphi, H) \in D$. For a constant $\eta \in (\eta_1, \eta_2)$, consider the boundary-value problem
$$\begin{cases} \varphi'(y) = g_1\big(y, \varphi(y), H(y)\big), \\[4pt] H'(y) = g_2\big(y, \varphi(y), H(y)\big), \\[4pt] \varphi(\eta) = \alpha^{-\gamma}, \\[4pt] H(\eta) = \dfrac{\kappa}{\rho}\left[1 + \lambda\left(\dfrac{\eta}{\eta_1} - 1\right)\right], \end{cases} \tag{3.22}$$
for $\big(y, \varphi(y), H(y)\big) \in D$. Because $\alpha > 0$ and $\eta > \eta_1$, the boundary conditions in (3.22) are inside $D$. Furthermore, $g_1$ and $g_2$ are locally Lipschitz continuous with respect to $\varphi$ and $H$ in $D$, since they are only unbounded (or have unbounded partial derivatives) if $y = 0$ or $\varphi = 0$. It, then, follows that (3.22) has a unique solution that extends to the boundary of $D$. Denote this solution by $\big(\varphi_\eta(\cdot), H_\eta(\cdot)\big) : (\varepsilon(\eta), \eta] \to \mathbb{R}^2$ for some constant $\varepsilon(\eta) \in [0, \eta)$ such that $(\varepsilon(\eta), \eta]$ is the maximal domain over which the solution exists (within $D$). We prove additional properties of $\big(\varphi_\eta(\cdot), H_\eta(\cdot)\big)$ for $\eta \in (\eta_1, \eta_2)$ in Lemma B.1 in Appendix B and use those properties in the rest of this proof.

Note that, because the solution $\big(\varphi_\eta(\cdot), H_\eta(\cdot)\big)$ continuously depends on $\eta$ because of the aforementioned local Lipschitz property of $g_1$ and $g_2$, the mapping $\eta \mapsto \varepsilon(\eta)$ is continuous for $\eta \in (\eta_1, \eta_2)$.
Our goal is to show that there exists a constant $y^* \in (\eta_1, \eta_2)$ such that $\varepsilon(y^*) = 0$; that is, the solution $\big(\varphi_{y^*}(\cdot), H_{y^*}(\cdot)\big)$ is defined over the interval $(0, y^*]$.

To show the existence of such $y^*$, we first show that for every $y' \in (0, \eta_1)$, there exists a constant $\eta \in (\eta_1, \eta_2)$ such that the solution $\big(\varphi_\eta(\cdot), H_\eta(\cdot)\big)$ exits through a point $(y', \varphi', \kappa/\rho) \in D_1$, in which we have defined
$$D_1 := \Big\{(y, \varphi, \kappa/\rho) : y \in (0, \eta_1),\ \varphi \in (0, \alpha^{-\gamma})\Big\}. \tag{3.23}$$
To prove this statement, let $B$ be the set of all $y' \in (0, \eta_1)$ such that there exists a solution $\big(\varphi_\eta(\cdot), H_\eta(\cdot)\big)$, $\eta \in (\eta_1, \eta_2)$, which exits through a point $(y', \varphi', \kappa/\rho) \in D_1$. We want to show that $B = (0, \eta_1)$. By Lemma B.1.(ii), $B$ is nonempty. From Lemma B.1.(iv) and the continuity of $\big(\varphi_\eta(\cdot), H_\eta(\cdot)\big)$ with respect to $\eta$, it follows that if $y_0 \in B$, then $y \in B$ for all $y \in (y_0, \eta_1)$. Therefore, we have one of the following scenarios: (a) $B = (\tilde{y}, \eta_1)$ for some $\tilde{y} \in (0, \eta_1)$, (b) $B = [\tilde{y}, \eta_1)$ for some $\tilde{y} \in (0, \eta_1)$, or (c) $B = (0, \eta_1)$.

In scenario (a), there exists a monotone increasing sequence $\{\eta_n\}_{n=1}^{\infty}$ in $(\eta_1, \eta_2)$, such that the solutions $\big(\varphi_{\eta_n}(\cdot), H_{\eta_n}(\cdot)\big)$ all exit from $D_1$ and $\lim_{n \to +\infty} \varepsilon(\eta_n) = \tilde{y}$. By Lemma B.1.(iii)-(iv), we must have $\eta_n < \eta_2 - \epsilon$ for some $\epsilon > 0$ and for all $n$. Thus, $\lim_{n \to +\infty} \eta_n = \eta_\infty$ for some constant $\eta_\infty \in (\eta_1, \eta_2 - \epsilon]$. Furthermore, by continuity of $\big(\varphi_\eta(\cdot), H_\eta(\cdot)\big)$ with respect to $\eta$, we must have that $\big(\varphi_{\eta_\infty}(\cdot), H_{\eta_\infty}(\cdot)\big)$ exits $D$ through some point $(\tilde{y}, \tilde{\varphi}, \kappa/\rho) \in D_1$. This implies $\tilde{y} \in B$, which contradicts the assumption that $\tilde{y} \notin B$.
Thus, scenario (a) is impossible.

In scenario (b), because $\tilde{y} \in B$, it follows from Lemma B.1.(iii) that there exists a constant $\tilde{\eta} \in (\eta_1, \eta_2)$ such that $\big(\varphi_{\tilde{\eta}}(\cdot), H_{\tilde{\eta}}(\cdot)\big)$ exits $D$ through a point $(\tilde{y}, \tilde{\varphi}, \kappa/\rho) \in D_1$. Since $\tilde{\eta} < \eta_2$ and $\tilde{y} > 0$, from continuity of $\big(\varphi_\eta(\cdot), H_\eta(\cdot)\big)$ with respect to $\eta$, it follows that for some $y' \in (0, \tilde{y})$, there exists an $\eta' \in (\tilde{\eta}, \eta_2)$ such that $\big(\varphi_{\eta'}(\cdot), H_{\eta'}(\cdot)\big)$ exits $D$ through a point $(y', \varphi', \kappa/\rho) \in D_1$. In other words, $y' \in B$, which contradicts $\tilde{y} = \min B$. Thus, scenario (b) is also impossible. We conclude that the only possible scenario is (c), in other words, $B = (0, \eta_1)$.

Finally, define $y^* = \inf\big\{\eta \in (\eta_1, \eta_2) : \varepsilon(\eta) \in B\big\}$. From Lemma B.1.(ii)-(iii), we must have $y^* \in (\eta_1, \eta_2)$. From continuity of $\big(\varphi_\eta(\cdot), H_\eta(\cdot)\big)$ with respect to $\eta$, we deduce $\varepsilon(y^*) = 0$ and
$$\lim_{y \to 0^+} H_{y^*}(y) = \frac{\kappa}{\rho}.$$
Thus, the solution $\big(\varphi_{y^*}(\cdot), H_{y^*}(\cdot)\big)$ satisfies (3.15) for $y \in (0, y^*)$. Finally, that $\varphi_{y^*}(\cdot)$ is increasing follows from
$$\varphi'(y) = \frac{\rho}{\kappa y}\left(\frac{\kappa}{\rho} - H(y)\right)\varphi(y) > 0,$$
for all $y \in (0, y^*)$, since $\big(y, \varphi_{y^*}(y), H_{y^*}(y)\big) \in D$.

Proof of (ii): The solution of the Euler equation (3.8) is
$$u(y) = C y^{\lambda} + C' y^{\lambda'} - \underline{x}\, y + \frac{\alpha^{1-\gamma}}{\delta(1 - \gamma)}; \quad y \ge y^*, \tag{3.24}$$
in which $C$ and $C'$ are constants to be determined, and $\lambda \in (-\delta/\kappa, 0)$ and $\lambda' \in (1, +\infty)$ are given by (3.13) and (3.18), respectively. By (3.24), conditions (3.10) and (3.11) become
$$\lim_{y \to +\infty}\Big(\lambda C y^{\lambda - 1} + \lambda' C' y^{\lambda' - 1}\Big) = 0, \qquad \lim_{y \to +\infty}\Big(\lambda(1 - \lambda) C y^{\lambda - 1} + \lambda'(1 - \lambda') C' y^{\lambda' - 1}\Big) = 0.$$
Since $\lambda < 0$ and $\lambda' > 1$, the system above can only hold if $C' = 0$.
So, we must have
$$u(y) = C y^{\lambda} - \underline{x}\, y + \frac{\alpha^{1-\gamma}}{\delta(1-\gamma)}; \quad y \ge y^*.$$
From (3.12), we obtain $C = \dfrac{y^*(1+\rho\underline{x}) - \alpha^{-\gamma}}{\rho\lambda\,(y^*)^{\lambda}}$, which yields
$$u(y) = \frac{y^*(1+\rho\underline{x}) - \alpha^{-\gamma}}{\rho\lambda} \left( \frac{y}{y^*} \right)^{\lambda} - \underline{x}\, y + \frac{\alpha^{1-\gamma}}{\delta(1-\gamma)}; \quad y \ge y^*. \tag{3.25}$$
Thus, FBP (3.8)–(3.12) reduces to the following FBP:
$$\begin{cases} -\kappa y^2 u''(y) + (r+\rho-\delta)\, y u'(y) + \delta u(y) = \dfrac{\gamma}{1-\gamma} \big( y - \rho y u'(y) \big)^{1-\frac{1}{\gamma}}; & 0 < y < y^*, \\[1ex] y^* - \rho y^* u'(y^*) = \alpha^{-\gamma}, \\[1ex] u(y^*) = \left( \dfrac{1+\rho\underline{x}}{\rho\lambda} - \underline{x} \right) y^* + \left( \dfrac{\alpha}{\delta(1-\gamma)} - \dfrac{1}{\rho\lambda} \right) \alpha^{-\gamma}. \end{cases} \tag{3.26}$$
Now, let $\phi$, $H$, and $y^*$ be as determined in part (i) of this proposition; then, we claim that $u$ defined by
$$u(y) = \frac{1}{\delta} \left[ \phi(y) H(y) + \frac{\gamma}{1-\gamma}\, \phi(y)^{1-\frac{1}{\gamma}} + \frac{r+\rho-\delta}{\rho} \big( \phi(y) - y \big) \right]; \quad y \in (0, y^*], \tag{3.27}$$
satisfies FBP (3.26). Indeed, because $\phi(y^*) = \alpha^{-\gamma}$ and $H(y^*) = \frac{\kappa}{\rho} \left[ 1 - \lambda \left( 1 - \frac{y^*}{\eta_1} \right) \right]$, one can show that $u$ in (3.27) satisfies the second free-boundary condition in (3.26). Next, if we differentiate $u$ twice and substitute for $\phi'$ and $H'$ from (3.15) each time, then we obtain
$$u'(y) = \frac{1}{\rho} - \frac{\phi(y)}{\rho y}, \tag{3.28}$$
and
$$u''(y) = \frac{\phi(y) H(y)}{\kappa y^2}, \tag{3.29}$$
for $0 < y < y^*$. Note that (3.28) and $\phi(y^*) = \alpha^{-\gamma}$ give us the first free-boundary condition in (3.26). If we substitute for $u'$ and $u''$ from (3.28) and (3.29), respectively, in the non-linear differential equation in (3.26), then we obtain
$$-\kappa y^2 u''(y) + (r+\rho-\delta)\, y u'(y) + \delta u(y) - \frac{\gamma}{1-\gamma} \big( y - \rho y u'(y) \big)^{1-\frac{1}{\gamma}} = -\phi(y) H(y) + \frac{r+\rho-\delta}{\rho} \big( y - \phi(y) \big) + \delta u(y) - \frac{\gamma}{1-\gamma}\, \phi(y)^{1-\frac{1}{\gamma}} = 0,$$
in which the last equality follows from the definition of $u$ in (3.27).
We have, thereby, shown that $y^*$ from part (i) and $u$ given by (3.16) solve FBP (3.8)–(3.12).

Next, we show that $u$ given by (3.16) is decreasing and convex; note that $u \in C^2(\mathbb{R}_+)$, that is, $u$ is twice continuously differentiable by construction. For $y \ge y^*$, these properties of $u$ follow directly by differentiating (3.25):
$$u'(y) = \frac{y^*(1+\rho\underline{x}) - \alpha^{-\gamma}}{\rho\,(y^*)^{\lambda}}\, y^{\lambda-1} - \underline{x} < 0, \tag{3.30}$$
and
$$u''(y) = \frac{(\lambda-1) \big( y^*(1+\rho\underline{x}) - \alpha^{-\gamma} \big)}{\rho\,(y^*)^{\lambda}}\, y^{\lambda-2} > 0, \tag{3.31}$$
for $y \ge y^*$, in which, to get the inequalities, we used $\lambda < 0$ and $y^* < \eta_2 = \frac{\alpha^{-\gamma}}{1+\rho\underline{x}}$, which we proved earlier. That $u$ is convex on $(0, y^*)$ follows from (3.29), $\phi > 0$, and $H > 0$; we proved the latter two inequalities in part (i). Also, (3.30) implies $u'(y^*) < 0$, and $u$ convex on $(0, y^*)$ implies $u'(y) < 0$ for all $y \in (0, y^*)$.

It only remains to show that $\lim_{y \to 0^+} u'(y) = -\infty$. Suppose, on the contrary, that $\lim_{y \to 0^+} u'(y) \ne -\infty$. Because $u'$ is increasing and $\lim_{y \to +\infty} u'(y) = -\underline{x}$, we must have $\lim_{y \to 0^+} u'(y) = M$ for some constant $M < -\underline{x} < 0$. From (3.28), we have $\phi(y)/y = 1 - \rho u'(y)$ for $0 < y < y^*$. Therefore,
$$\lim_{y \to 0^+} \frac{\phi(y)}{y} = 1 - \rho M > 1. \tag{3.32}$$
The above limit implies $\lim_{y \to 0^+} \phi(y) = 0$. By L'Hôpital's rule, (3.15), and $\lim_{y \to 0^+} H(y) = \kappa/\rho$ from part (i), we obtain
$$\lim_{y \to 0^+} \frac{\phi(y)}{y} = \lim_{y \to 0^+} \phi'(y) = \lim_{y \to 0^+} \frac{\rho\,\phi(y)}{\kappa y} \left( \frac{\kappa}{\rho} - H(y) \right) = 0,$$
which contradicts (3.32). Thus, we must have $\lim_{y \to 0^+} u'(y) = -\infty$.

Remark 3.1.
It is possible to find a differential equation for $\phi$ of Proposition 3.1.(i) that does not involve $H$. Indeed, substituting $u' = \frac{1}{\rho} - \frac{\phi}{\rho y}$ and $u'' = \frac{\phi}{\rho y^2} - \frac{\phi'}{\rho y}$ into (3.9) yields
$$\delta u(y) = -\frac{\kappa y}{\rho}\, \phi'(y) + \frac{\kappa + r + \rho - \delta}{\rho}\, \phi(y) + \frac{\gamma}{1-\gamma}\, \phi(y)^{1-\frac{1}{\gamma}} - \frac{r+\rho-\delta}{\rho}\, y; \quad 0 < y \le y^*.$$
By differentiating this equation and substituting $u' = \frac{1}{\rho} - \frac{\phi}{\rho y}$, we obtain the following second-order differential equation for $\phi$:
$$\frac{\kappa}{\rho}\, y^2 \phi''(y) + \left( \phi(y)^{-\frac{1}{\gamma}} - \frac{r+\rho-\delta}{\rho} \right) y\, \phi'(y) - \frac{\delta}{\rho}\, \phi(y) + \frac{r+\rho}{\rho}\, y = 0; \quad 0 < y \le y^*.$$
The above equation provides a link between Proposition 3.1 and Propositions 3.1 and 3.2 in Angoshtari et al. (2020). Indeed, by setting $\kappa = 0$, the above equation reduces to the differential equation in (3.17) and (3.29) of Angoshtari et al. (2020). Thus, as $\kappa \to 0^+$, $\phi$ of Proposition 3.1 becomes $\psi$ of Propositions 3.1 and 3.2 in Angoshtari et al. (2020). This relationship is expected because, by letting $\kappa \to 0^+$, the risky asset becomes redundant and the optimal policy only invests in the riskless asset, which is the scenario analyzed in Angoshtari et al. (2020).

Proposition 3.1 provides a strictly decreasing and convex function $u$ and corresponding free boundary $y^*$ that solve (3.8)–(3.12). By reversing the Legendre transform (3.6), we obtain an increasing and concave solution of FBP (3.5). We prove this result in the following proposition.

Proposition 3.2.
Let $\lambda$, $y^*$, $\phi(y)$, $H(y)$, and $u(y)$ be as in Proposition 3.1, and let $J(\xi) : (-\infty, -\underline{x}) \to (0, +\infty)$ be the inverse of $u'(y)$, that is, $u'\big(J(\xi)\big) = \xi$ for $\xi < -\underline{x}$. Define
$$x^* := -u'(y^*) = \frac{\alpha^{-\gamma}}{\rho y^*} - \frac{1}{\rho}, \tag{3.33}$$
$$v(x) := u\big(J(-x)\big) + x J(-x); \quad x > \underline{x},$$
$$c^*(x) := \begin{cases} \alpha; & \underline{x} \le x \le x^*, \\ \Big( \phi\big(J(-x)\big) \Big)^{-\frac{1}{\gamma}}; & x > x^*, \end{cases} \tag{3.34}$$
and
$$\theta^*(x) := \begin{cases} \dfrac{(\mu-r)(1-\lambda)}{\sigma^2}\, (x - \underline{x}); & \underline{x} \le x \le x^*, \\[1.5ex] \dfrac{\mu-r}{\kappa\sigma^2}\, H\big(J(-x)\big) (1+\rho x); & x > x^*. \end{cases} \tag{3.35}$$
Then, $x^*$, $v(x)$, $\theta^*(x)$, and $c^*(x)$ satisfy (3.2), (3.3), (3.4), and (3.5). Furthermore, $v \in C\big([\underline{x}, +\infty)\big)$ is strictly increasing and concave, $x^* > \underline{x}$, and we can write $v$ as follows:
$$v(x) = \begin{cases} \dfrac{\lambda-1}{\lambda} \left( \dfrac{\rho\,(y^*)^{\lambda}}{\alpha^{-\gamma} - y^*(1+\rho\underline{x})} \right)^{\frac{1}{\lambda-1}} (x - \underline{x})^{\frac{\lambda}{\lambda-1}} + \dfrac{\alpha^{1-\gamma}}{\delta(1-\gamma)}; & \underline{x} \le x \le x^*, \\[2ex] \dfrac{1}{\delta} \left[ \phi\big(J(-x)\big) H\big(J(-x)\big) + \dfrac{\gamma}{1-\gamma}\, \phi\big(J(-x)\big)^{1-\frac{1}{\gamma}} + \dfrac{r+\rho}{\rho} \Big( \phi\big(J(-x)\big) - J(-x) \Big) \right]; & x > x^*. \end{cases} \tag{3.36}$$
In particular, the expression for $v$ in (3.36) implies that $\lim_{x \to \underline{x}^+} v'(x) = +\infty$.

Proof. By Proposition 3.1, $u' : (0, +\infty) \to (-\infty, -\underline{x})$ is an increasing function such that $\lim_{y \to 0^+} u'(y) = -\infty$ and $\lim_{y \to +\infty} u'(y) = -\underline{x}$. Therefore, its inverse $J : (-\infty, -\underline{x}) \to (0, +\infty)$ is an increasing function such that $\lim_{\xi \to -\underline{x}^-} J(\xi) = +\infty$ and $\lim_{\xi \to -\infty} J(\xi) = 0$.

The expression for $x^*$ follows from (3.28), the expression for $v$ follows from (3.7), and the expression for $c^*$ follows from (3.2), (3.7), and (3.28). To obtain (3.35), use (3.3) and (3.7) to obtain
$$\theta^*(x) := -\frac{\mu-r}{\sigma^2}\, \frac{v'(x)}{v''(x)} = \frac{\mu-r}{\sigma^2}\, J(-x)\, u''\big(J(-x)\big); \quad x > \underline{x}. \tag{3.37}$$
We consider two cases: $x \in [\underline{x}, x^*]$ and $x > x^*$. For the former case, we argue as follows.
By (3.30), $u'(y) \in (-x^*, -\underline{x})$ for $y > y^*$ and, therefore, $J(\xi) > y^*$ for $\xi \in (-x^*, -\underline{x})$. It then follows from (3.30) that
$$\xi = u'\big(J(\xi)\big) = \frac{y^*(1+\rho\underline{x}) - \alpha^{-\gamma}}{\rho\,(y^*)^{\lambda}}\, J(\xi)^{\lambda-1} - \underline{x} \quad \Longrightarrow \quad J(\xi)^{\lambda-1} = \frac{\rho\,(y^*)^{\lambda} (\underline{x} + \xi)}{y^*(1+\rho\underline{x}) - \alpha^{-\gamma}},$$
for $\xi \in (-x^*, -\underline{x})$. By using (3.31) and (3.35), we obtain
$$\theta^*(x) = \frac{\mu-r}{\sigma^2}\, J(-x)\, u''\big(J(-x)\big) = \frac{\mu-r}{\sigma^2}\, \frac{(\lambda-1) \big( y^*(1+\rho\underline{x}) - \alpha^{-\gamma} \big)}{\rho\,(y^*)^{\lambda}}\, J(-x)^{\lambda-1} = \frac{(\mu-r)(1-\lambda)}{\sigma^2}\, (x - \underline{x}),$$
for $x \in [\underline{x}, x^*]$. To obtain (3.35) for $x > x^*$, note that by the definition of $J$ and (3.28), we have
$$-x = u'\big(J(-x)\big) = \frac{1}{\rho} \left( 1 - \frac{\phi\big(J(-x)\big)}{J(-x)} \right) \quad \Longrightarrow \quad \frac{\phi\big(J(-x)\big)}{J(-x)} = 1 + \rho x,$$
for $x > x^*$. From (3.29), it follows that
$$J(-x)\, u''\big(J(-x)\big) = \frac{1}{\kappa}\, H\big(J(-x)\big) (1+\rho x),$$
for $x > x^*$. By substituting for $J(-x)\, u''\big(J(-x)\big)$ in (3.37), we obtain (3.35) for $x > x^*$. We can double check that $\theta^*(x)$ is continuous at $x = x^*$ as follows:
$$\begin{aligned} \frac{1}{\kappa}\, H\big(J(-x^*)\big) (1+\rho x^*) &= \frac{1}{\kappa}\, H(y^*) (1+\rho x^*) = \frac{1}{\rho} \left( 1 - \lambda + \lambda \frac{y^*}{\eta_1} \right) (1+\rho x^*) \\ &= \frac{1}{\rho} \left( 1 - \lambda + (\lambda-1) \frac{1+\rho\underline{x}}{1+\rho x^*} \right) (1+\rho x^*) = \frac{1}{\rho} (1-\lambda) \left( 1 - \frac{1+\rho\underline{x}}{1+\rho x^*} \right) (1+\rho x^*) = (1-\lambda)(x^* - \underline{x}), \end{aligned}$$
in which we used $u'(y^*) = -x^*$ to get the first equality and the second terminal condition in (3.15) for the second equality.
To get the third equality, we used the boundary condition $(1+\rho x^*)\, v'(x^*) = \alpha^{-\gamma}$ in (3.5) and the definition of $\eta_1$ in (3.14) to obtain
$$(1+\rho x^*)\, y^* = \alpha^{-\gamma} = \frac{\lambda-1}{\lambda} (1+\rho\underline{x})\, \eta_1 \quad \Longrightarrow \quad \lambda \frac{y^*}{\eta_1} = (\lambda-1) \frac{1+\rho\underline{x}}{1+\rho x^*}.$$
It is, then, straightforward to show that $x^*$, $v(\cdot)$, $\theta^*(\cdot)$, and $c^*(\cdot)$ satisfy (3.2), (3.3), (3.4), and (3.5) by reversing the transformation (3.6) and by using the fact that $y^*$ and $u(\cdot)$ solve FBP (3.8)–(3.12). That $v(\cdot)$ is increasing and strictly concave follows from (3.7) since $u(\cdot)$ is decreasing and strictly convex, as established by Proposition 3.1.(ii). Furthermore,
$$x^* = \frac{\alpha^{-\gamma}}{\rho y^*} - \frac{1}{\rho} > \frac{\alpha^{-\gamma}}{\rho\eta_2} - \frac{1}{\rho} = \underline{x},$$
because $0 < y^* < \eta_2 = \frac{\alpha^{-\gamma}}{1+\rho\underline{x}}$ by Proposition 3.1.(i). Finally, the expression for $v$ in (3.36) follows from $v(x) = u\big(J(-x)\big) + x J(-x)$ and the expressions in Proposition 3.1.

We end this section with the main result of the paper, that is, the solution of the stochastic control problem (2.11).

Theorem 3.1.
Let $x^*$, $v(x)$, $\theta^*(x)$, and $c^*(x)$ be as in Proposition 3.2; then, $V(x, \alpha) = v(x)$ for all $x \ge \underline{x}$. Furthermore, the optimal investment-to-habit and consumption-to-habit processes are given by $\theta^*_t := \theta^*(X^*_t)$ and $c^*_t := c^*(X^*_t)$, respectively, for all $t \ge 0$, in which $(X^*_t)_{t \ge 0}$ solves the stochastic differential equation
$$dX^*_t = \Big( (\rho + r) X^*_t + (\mu - r)\, \theta^*(X^*_t) - (1 + \rho X^*_t)\, c^*(X^*_t) \Big) dt + \sigma\, \theta^*(X^*_t)\, dB_t; \quad t \ge 0, \quad X^*_0 = x. \tag{3.38}$$

Proof.
It suffices to show that $v$, $\theta^*$, and $c^*$ satisfy conditions (i)–(v) of Theorem 2.1. Conditions (i), (ii), and (iv) directly follow from Proposition 3.2. Below, we prove conditions (iii) and (v) of that theorem.

Condition (iii): Let $(X_t)_{t \ge 0}$ be an admissible wealth-to-habit process corresponding to a relative investment and consumption policy $(\theta_t, c_t)_{t \ge 0} \in \mathcal{A}(\alpha)$. By Proposition 3.2, $v$ is increasing and $v(\underline{x}) = \frac{\alpha^{1-\gamma}}{\delta(1-\gamma)}$; therefore,
$$e^{-\delta T}\, \frac{\alpha^{1-\gamma}}{\delta(1-\gamma)} \le E_x \big( e^{-\delta T} v(X_T) \big), \tag{3.39}$$
for all $T \ge 0$. Define the non-negative process $(Y_t)_{t \ge 0}$ by
$$dY_t = -\big( r + \rho(1 - c_t) \big) Y_t\, dt - \frac{\mu-r}{\sigma}\, Y_t\, dB_t; \quad t \ge 0, \quad Y_0 = 1. \tag{3.40}$$
From (2.8), it follows that
$$X_t Y_t + \int_0^t c_s Y_s\, ds = x + \int_0^t \Big( \sigma\theta_s - \frac{\mu-r}{\sigma}\, X_s \Big) Y_s\, dB_s,$$
for any $t > 0$. In particular, $\big( X_t Y_t + \int_0^t c_s Y_s\, ds \big)_{t \ge 0}$ is a non-negative local martingale and, hence, a supermartingale. Therefore, $E_x \big( X_t Y_t + \int_0^t c_s Y_s\, ds \big) \le x$ which, in turn, yields
$$0 < E_x(X_t Y_t) \le x; \quad t \ge 0, \tag{3.41}$$
because $X_t, Y_t, c_t > 0$, $\mathbb{P}$-a.s., for all $t \ge 0$. Let $u$ be as in Proposition 3.1.(ii). From (3.6), we obtain
$$E_x \big( v(X_T) \big) = E_x \big( v(X_T) - X_T Y_T + X_T Y_T \big) \le E_x \big( u(Y_T) + X_T Y_T \big) \le E \big( u(Y_T) \big) + x, \tag{3.42}$$
for all $T > 0$, in which we used (3.41) to get the last inequality.

For $0 < y < y^*$, (3.16) yields
$$u(y) = \frac{1}{\delta} \left[ \phi(y) H(y) + \frac{\gamma}{1-\gamma}\, \phi(y)^{1-\frac{1}{\gamma}} + \frac{r+\rho-\delta}{\rho} \big( \phi(y) - y \big) \right] \le \frac{\gamma}{\delta(1-\gamma)}\, \phi(y)^{1-\frac{1}{\gamma}} + \frac{(\kappa + r + \rho - \delta)\, \alpha^{-\gamma}}{\delta\rho},$$
because $\phi(y) \in (0, \alpha^{-\gamma})$ and $H(y) \in (0, \frac{\kappa}{\rho})$ by Proposition 3.1.(i). For $\gamma > 1$, because $u$ is decreasing by Proposition 3.1.(ii), we have $u(y) \le \frac{(\kappa + r + \rho - \delta)\, \alpha^{-\gamma}}{\delta\rho}$ for all $y > 0$. Inequalities (3.39) and (3.42), then, yield
$$e^{-\delta T}\, \frac{\alpha^{1-\gamma}}{\delta(1-\gamma)} \le E_x \big( e^{-\delta T} v(X_T) \big) \le e^{-\delta T} \Big( x + E \big( u(Y_T) \big) \Big) \le e^{-\delta T} \left( x + \frac{(\kappa + r + \rho - \delta)\, \alpha^{-\gamma}}{\delta\rho} \right).$$
Condition (iii) of Theorem 2.1 follows by taking the limit as $T \to +\infty$.

It only remains to prove that Condition (iii) of Theorem 2.1 holds for $0 < \gamma < 1$. By (3.40), because $c_t \ge \alpha$, we have
$$Y_t = \exp \left( -(\kappa + r)\, t - \rho \int_0^t (1 - c_s)\, ds - \frac{\mu-r}{\sigma}\, B_t \right) \ge \tilde{Y}_t,$$
for $t \ge 0$, in which $(\tilde{Y}_t)_{t \ge 0}$ is a geometric Brownian motion given by $\tilde{Y}_t := e^{-at - bB_t}$; $t \ge 0$, with $a := \kappa + r + \rho(1-\alpha)$ and $b := \frac{\mu-r}{\sigma}$. Define $z^* := -\frac{1}{b\sqrt{T}} (aT + \ln y^*)$, and note that $z \ge z^* \Leftrightarrow e^{-aT - b\sqrt{T} z} \le y^*$. Because $u$ is decreasing, we have
$$\begin{aligned} E \big( u(Y_T) \big) \le E \big( u(\tilde{Y}_T) \big) &= \int_{-\infty}^{+\infty} u \Big( e^{-aT - b\sqrt{T} z} \Big)\, d\Phi(z) \\ &= \int_{-\infty}^{z^*} u \Big( e^{-aT - b\sqrt{T} z} \Big)\, d\Phi(z) + \int_{z^*}^{+\infty} u \Big( e^{-aT - b\sqrt{T} z} \Big)\, d\Phi(z) \\ &\le u(y^*)\, \Phi(z^*) + \int_{z^*}^{+\infty} \left[ \frac{\gamma}{\delta(1-\gamma)}\, \phi \Big( e^{-aT - b\sqrt{T} z} \Big)^{1-\frac{1}{\gamma}} + \frac{(\kappa + r + \rho - \delta)\, \alpha^{-\gamma}}{\delta\rho} \right] d\Phi(z) \\ &\le u(y^*) + \frac{(\kappa + r + \rho - \delta)\, \alpha^{-\gamma}}{\delta\rho} + \frac{\gamma}{\delta(1-\gamma)} \int_{z^*}^{+\infty} \phi \Big( e^{-aT - b\sqrt{T} z} \Big)^{1-\frac{1}{\gamma}}\, d\Phi(z), \end{aligned} \tag{3.43}$$
in which $\Phi$ is the standard normal distribution function.

To find the asymptotic behavior of $E \big( u(Y_T) \big)$ as $T \to +\infty$, we, thus, must study the integral in (3.43). By Lemma B.3, we have $\lim_{y \to 0^+} \phi(y)\, y^{-\beta} = +\infty$ for any $\beta > 0$. Next, choose $\beta > 0$ so that
$$\beta < \frac{\gamma}{(1-\gamma)\, b^2} \Big( \sqrt{a^2 + 2\delta b^2} - a \Big); \tag{3.44}$$
therefore, there exists an $\epsilon > 0$ such that
$$\phi(y) > y^{\beta}; \quad y \in (0, \epsilon]. \tag{3.45}$$
Define
$$\tilde{z} = \max \left\{ -\frac{1}{b\sqrt{T}} (aT + \ln \epsilon),\ z^* \right\} = -\frac{1}{b\sqrt{T}} \Big[ aT + \ln \big( \min\{\epsilon, y^*\} \big) \Big],$$
and note that, by (3.45), because we are considering the case $0 < \gamma < 1$, we have
$$\phi \Big( e^{-aT - b\sqrt{T} z} \Big)^{1-\frac{1}{\gamma}} < \Big( e^{-aT - b\sqrt{T} z} \Big)^{-\frac{\beta(1-\gamma)}{\gamma}} = e^{\frac{\beta(1-\gamma)}{\gamma} (aT + b\sqrt{T} z)}; \quad z \ge \tilde{z}.$$
From the above inequality and from the fact that $\phi$ is increasing, we compute
$$\begin{aligned} \int_{z^*}^{+\infty} \phi \Big( e^{-aT - b\sqrt{T} z} \Big)^{1-\frac{1}{\gamma}} d\Phi(z) &= \int_{z^*}^{\tilde{z}} \phi \Big( e^{-aT - b\sqrt{T} z} \Big)^{1-\frac{1}{\gamma}} d\Phi(z) + \int_{\tilde{z}}^{+\infty} \phi \Big( e^{-aT - b\sqrt{T} z} \Big)^{1-\frac{1}{\gamma}} d\Phi(z) \\ &\le \phi \Big( e^{-aT - b\sqrt{T} \tilde{z}} \Big)^{1-\frac{1}{\gamma}} \big( \Phi(\tilde{z}) - \Phi(z^*) \big) + \int_{\tilde{z}}^{+\infty} e^{\frac{\beta(1-\gamma)}{\gamma} (aT + b\sqrt{T} z)}\, d\Phi(z) \\ &\le e^{\frac{\beta(1-\gamma)}{\gamma} (aT + b\sqrt{T} \tilde{z})} + e^{\frac{\beta(1-\gamma)}{\gamma} aT} \int_{-\infty}^{+\infty} e^{\frac{\beta(1-\gamma)}{\gamma} b\sqrt{T} z}\, d\Phi(z) \\ &= \exp \left\{ -\frac{\beta(1-\gamma)}{\gamma} \ln \big( \min\{\epsilon, y^*\} \big) \right\} + \exp \left\{ \frac{1}{2} \left( \frac{\beta(1-\gamma)}{\gamma} \right)^2 b^2 T + \frac{\beta(1-\gamma)}{\gamma}\, aT \right\}. \end{aligned}$$
From (3.39), (3.42), and (3.43), we obtain
$$\begin{aligned} e^{-\delta T}\, \frac{\alpha^{1-\gamma}}{\delta(1-\gamma)} \le E_x \big( e^{-\delta T} v(X_T) \big) &\le x\, e^{-\delta T} + e^{-\delta T} E \big( u(Y_T) \big) \\ &\le e^{-\delta T} \left( x + u(y^*) + \frac{(\kappa + r + \rho - \delta)\, \alpha^{-\gamma}}{\delta\rho} + \frac{\gamma}{\delta(1-\gamma)}\, e^{-\frac{\beta(1-\gamma)}{\gamma} \ln ( \min\{\epsilon, y^*\} )} \right) \\ &\quad + \frac{\gamma}{\delta(1-\gamma)} \exp \left\{ \frac{1}{2} \left( \frac{\beta(1-\gamma)}{\gamma} \right)^2 b^2 T + \frac{\beta(1-\gamma)}{\gamma}\, aT - \delta T \right\}. \end{aligned}$$
Note that inequality (3.44) implies the final exponent is strictly negative for all $T > 0$. Finally, we obtain Condition (iii) of Theorem 2.1 by letting $T \to +\infty$.

Condition (v): It suffices to show that (3.38) has a unique strong solution $(X^*_t)_{t \ge 0}$ taking values in the open interval $I := (\underline{x}, +\infty)$. For $x \in I$, let
$$b(x) := (r + \rho)\, x + (\mu - r)\, \theta^*(x) - (1 + \rho x)\, c^*(x), \quad \text{and} \quad a(x) := \sigma\, \theta^*(x), \tag{3.46}$$
be the drift and diffusion terms of (3.38), respectively.
Note that the drift function b ( x ) in (3.46) is notglobally Lipschitz because of the term x c ∗ ( x ). Therefore, standard existence results, such as Theorem 5.2.9on page 289 of Karatzas and Shreve (1991), are not directly applicable here.Since b ( x ) and a ( x ) are locally Lipschitz for x ∈ I, a standard localization argument yields that (3.38)has a unique strong solution up to an explosion time. In the remaining part of the proof, we show that (3.38)does not have an exploding solution (that is, a solution that exits I in finite time). For x ∈ I, define ψ ( x ) = (cid:90) xx ∗ (cid:90) yx ∗ a ( z ) exp (cid:18) –2 (cid:90) yz b ( η ) a ( η ) d η (cid:19) d z d y ,By Feller’s test for explosions (see, for example, Theorem 5.5.29 on page 348 of Karatzas and Shreve (1991)),(3.38) does not have an exploding solution if lim x → + ∞ ψ ( x ) = lim x → x + ψ ( x ) = + ∞ , which we show next.For x ∈ ( x , x ∗ ), (3.34) and (3.35) yield that b ( x ) = ( x – x ) b and a ( x ) = ( x – x ) a , in which b := r + ρ (1 – α ) + (cid:0) µ – r σ (cid:1) (1 – λ ) > 0 and a := ( µ – r )(1 – λ )/ σ > 0. It then follows that, ψ ( x ) = 2 a + b a a + b (cid:18) x – xx ∗ – x (cid:19) b a – 1 + ln (cid:18) x ∗ – xx – x (cid:19) ,for x ∈ ( x , x ∗ ), which yields that lim x → x + ψ ( x ) = + ∞ .It only remains to show that lim x → + ∞ ψ ( x ) = + ∞ . By (3.34) and (3.35), we have c ∗ ( x ) = (cid:0) ϕ (cid:0) J(– x ) (cid:1)(cid:1) – γ and θ ∗ ( x ) = µ – r κσ H (cid:0) J(– x ) (cid:1) (1 + ρ x ), for x > x ∗ . Furthermore, by the proof of Proposition 3.1, there exists aconstant H such that 0 < H ≤ H (cid:0) J(– x ) (cid:1) ≤ κρ , (3.47)for x > x ∗ . 
For $y > z > x^*$, we, then, have
$$\begin{aligned} \int_z^y \frac{b(\eta)}{a^2(\eta)}\, d\eta &= \int_z^y \left( \frac{(r+\rho)\, \eta}{\sigma^2 \theta^*(\eta)^2} + \frac{\mu-r}{\sigma^2 \theta^*(\eta)} - \frac{(1+\rho\eta)\, c^*(\eta)}{\sigma^2 \theta^*(\eta)^2} \right) d\eta \\ &\le \int_z^y \left( \frac{\kappa (r+\rho)\, \eta}{2 H\big(J(-\eta)\big)^2 (1+\rho\eta)^2} + \frac{\kappa}{H\big(J(-\eta)\big) (1+\rho\eta)} \right) d\eta \\ &\le \frac{\kappa}{H_0} \int_z^y \left( \frac{a_2\, \eta}{(1+\rho\eta)^2} + \frac{1}{1+\rho\eta} \right) d\eta \\ &= \frac{\kappa}{\rho^2 H_0} \left[ \frac{a_2}{1+\rho y} - \frac{a_2}{1+\rho z} + (a_2 + \rho) \ln \left( \frac{1+\rho y}{1+\rho z} \right) \right] \le \frac{\kappa (a_2 + \rho)}{\rho^2 H_0} \ln \left( \frac{1+\rho y}{1+\rho z} \right), \end{aligned} \tag{3.48}$$
in which $H_0$ denotes the positive lower bound in (3.47) and $a_2 := \frac{r+\rho}{2 H_0}$. Let $b_2 := \frac{2\kappa (a_2 + \rho)}{\rho^2 H_0}$, and note that, because $0 < H_0 < \kappa/\rho$, we have
$$b_2 = \frac{2\kappa}{\rho^2 H_0} \left( \frac{r+\rho}{2 H_0} + \rho \right) \ge \frac{2\kappa}{\rho^2\, \frac{\kappa}{\rho}} \left( \frac{r+\rho}{2\, \frac{\kappa}{\rho}} + \rho \right) = \frac{r+\rho}{\kappa} + 2 > 2.$$
For $x > x^*$, (3.47) and (3.48) yield
$$\begin{aligned} \psi(x) &= \int_{x^*}^{x} \int_{x^*}^{y} \frac{2}{a^2(z)} \exp \left( -2 \int_z^y \frac{b(\eta)}{a^2(\eta)}\, d\eta \right) dz\, dy \\ &\ge \int_{x^*}^{x} \int_{x^*}^{y} \frac{\rho^2}{\kappa (1+\rho z)^2} \exp \left( -\frac{2\kappa (a_2 + \rho)}{\rho^2 H_0} \ln \left( \frac{1+\rho y}{1+\rho z} \right) \right) dz\, dy \\ &= \int_{x^*}^{x} \int_{x^*}^{y} \frac{\rho^2}{\kappa} (1+\rho z)^{-2} \left( \frac{1+\rho z}{1+\rho y} \right)^{b_2} dz\, dy \\ &= \frac{\rho}{\kappa (b_2 - 1)} \int_{x^*}^{x} \left( \frac{1}{1+\rho y} - \frac{(1+\rho x^*)^{b_2 - 1}}{(1+\rho y)^{b_2}} \right) dy \\ &= \frac{\rho}{\kappa (b_2 - 1)} \left[ \frac{1}{\rho} \ln \left( \frac{1+\rho x}{1+\rho x^*} \right) + \frac{(1+\rho x^*)^{b_2 - 1}}{\rho (b_2 - 1)} \Big( (1+\rho x)^{1-b_2} - (1+\rho x^*)^{1-b_2} \Big) \right]. \end{aligned}$$
Finally, by letting $x \to +\infty$, it follows that $\lim_{x \to +\infty} \psi(x) = +\infty$.

We end the paper by providing a series of numerical examples to highlight certain properties of the optimal investment and consumption policy. Throughout the section, we choose the following values for the model parameters: $r = 0.02$, $\mu = 0.12$, $\sigma = 0.2$, $\rho = 1$, $\alpha = 0.5$, $\delta = 0.3$, and $\gamma = 2$. On occasion, however, we will change the value of one parameter (while keeping the other parameters fixed) to show the sensitivity of the solution with respect to that parameter.

To obtain the solution, we first numerically solve FBP (3.15) as follows. For a given value of $y^*$, (3.15) can be solved using an ODE solver (we used "RK45" through Python's scipy.integrate.solve_ivp() function). By using a simple bisection search, we then find the smallest value of $y^* \in (\eta_1, \eta_2)$ for which $H$ exits from the top boundary $H = \kappa/\rho$. The algorithm is illustrated by Figure 1. With $y^*$, $H$, and $\phi$ at hand, we can use (3.16) to find $u(y)$ and its first two derivatives for all $y > 0$, as shown in Figure 2.

Proposition 3.2 then yields $x^*$, $v$, $c^*$, and $\theta^*$. The left plot of Figure 3 shows the optimal investment function $\theta^*(x)$. As indicated by (3.35), for $x \in [\underline{x}, x^*]$, $\theta^*(x)$ is linear with slope $\frac{\mu-r}{\sigma^2}(1-\lambda)$. For $x > x^*$, $\theta^*(x)$ is asymptotically linear with slope $\frac{\mu-r}{\sigma^2}$ since $\lim_{x \to +\infty} H\big(J(-x)\big) = \kappa/\rho$. Indeed, as Figure 3 shows, this asymptotic linearity can occur for small values of $x$. Since $\lambda < 0$, the slope of $\theta^*(x)$ is greater in the range $x \in [\underline{x}, x^*]$ than in the range $x > x^*$. In other words, the individual invests extra wealth more aggressively when her wealth-to-habit ratio is below the critical level $x^*$ compared to when her relative wealth is above $x^*$.
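The two slopes of the optimal investment function discussed above can be evaluated directly for the baseline parameters. The following is a minimal sketch, not the authors' code. It rests on two assumptions that are not stated in this section: that $\kappa = (\mu-r)^2/(2\sigma^2)$, as suggested by the explicit solution of (3.40), and that $\lambda$ in (3.13) is the negative root of $\kappa\lambda^2 - (\kappa + r + \rho(1-\alpha) - \delta)\lambda - \delta = 0$, which is what one obtains by inserting $y^\lambda$ into the homogeneous part of the Euler equation with consumption fixed at $\alpha$. The function names are hypothetical.

```python
import math


def negative_root_lambda(r, mu, sigma, rho, alpha, delta):
    """Return (lambda, kappa).

    Assumption: lambda is the negative root of
        kappa*l**2 - (kappa + r + rho*(1 - alpha) - delta)*l - delta = 0,
    with kappa = (mu - r)**2 / (2*sigma**2). This is a derived form,
    not a quotation of equation (3.13).
    """
    kappa = (mu - r) ** 2 / (2.0 * sigma ** 2)
    m = kappa + r + rho * (1.0 - alpha) - delta
    disc = m * m + 4.0 * kappa * delta  # always positive for delta > 0
    lam = (m - math.sqrt(disc)) / (2.0 * kappa)
    return lam, kappa


def investment_slopes(r, mu, sigma, rho, alpha, delta):
    """Slopes of theta*(x): below x* (constrained) and as x -> infinity."""
    lam, kappa = negative_root_lambda(r, mu, sigma, rho, alpha, delta)
    merton = (mu - r) / sigma ** 2
    return {
        "lambda": lam,
        "kappa": kappa,
        # lower bound of the wealth-to-habit ratio, alpha / (r + rho*(1 - alpha))
        "x_lower": alpha / (r + rho * (1.0 - alpha)),
        "slope_constrained": merton * (1.0 - lam),
        "slope_asymptotic": merton,
    }


if __name__ == "__main__":
    # Baseline parameters of the numerical section.
    out = investment_slopes(r=0.02, mu=0.12, sigma=0.2, rho=1.0,
                            alpha=0.5, delta=0.3)
    for name, value in out.items():
        print(f"{name}: {value:.6f}")
```

Because $\lambda < 0$, the constrained slope always exceeds the asymptotic one, which is the ordering visible in the left plot of Figure 3.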
The right plot in Figure 3 shows the optimal consumption function $c^*(x)$ by the solid black curve. As indicated by (3.34), the optimal policy is to consume at the lowest consumption-to-habit ratio of $\alpha$ while the wealth-to-habit ratio is below $x^*$ and to increase relative consumption once the relative wealth becomes larger than $x^*$. In the same plot, the dashed curve represents the certainty equivalent (CE) function, which we define as follows. Assume that the individual maintains a constant consumption-to-habit ratio of $\tilde{c}$. Then, her utility of this consumption stream is
$$\int_0^{+\infty} e^{-\delta t}\, \frac{\tilde{c}^{\,1-\gamma}}{1-\gamma}\, dt = \frac{\tilde{c}^{\,1-\gamma}}{\delta(1-\gamma)}.$$
We define $\mathrm{CE}(x)$ as the value of the constant consumption-to-habit process that yields the same utility as $V(x)$ of (2.11). In other words, the individual is indifferent between receiving a constant consumption-to-habit ratio of $\mathrm{CE}(x)$ versus consuming according to Theorem 3.1. It follows that we must have
$$\frac{\mathrm{CE}(x)^{1-\gamma}}{\delta(1-\gamma)} = V(x) \quad \Longrightarrow \quad \mathrm{CE}(x) = \big( \delta(1-\gamma) V(x) \big)^{\frac{1}{1-\gamma}}.$$

Footnote: Recall that $\eta_i$ are the constants in (3.14).

Figure 1 (panels: $\phi_\eta(y)$ for various $\eta$; $H_\eta(y)$ for various $\eta$): The solid black curves represent the (approximate) solution of the free-boundary problem (3.15). The dashed red and blue curves are the upper and lower solutions that satisfy the boundary-value problem (3.22) within the set $D$ given by (3.19). $y^*$ is the value of $\eta$ such that the solution exists for all $y \in (0, \eta)$.

Figure 2 (panels: $u(y)$, $u'(y)$, $u''(y)$): The solution $(y^*, u(\cdot))$ of FBP (3.8)–(3.12) and its first two derivatives.

Figure 3 (panels: $c^*(x)$ and $\mathrm{CE}(x)$; $\theta^*(x)$): The optimal investment function $\theta^*(x)$, the optimal consumption function $c^*(x)$, and the certainty equivalent function $\mathrm{CE}(x) = \big( \delta(1-\gamma) V(x) \big)^{\frac{1}{1-\gamma}}$.

Figure 4 (panels: small values of $\delta$; large values of $\delta$): Sensitivity of the critical threshold $x^*$ with respect to $\delta$. Because of the difference in scale of $x^*$ values, we have separated the plot for small (on left) and large (on right) values of $\delta$. Note that the lowest range of the vertical axes is $\underline{x}$ and not zero.

From the plot, we observe that the optimal consumption and CE functions meet at a point $(\hat{x}, \hat{c}) \approx (2.8, 0.8)$ such that $c^*(x) < \mathrm{CE}(x)$ (resp. $c^*(x) > \mathrm{CE}(x)$) for $x \in (\underline{x}, \hat{x})$ (resp. $x > \hat{x}$). Thus, by following the optimal consumption policy, the individual consumes less than (resp. more than) her "overall" consumption rate if her wealth-to-habit ratio is below (resp. above) the relative wealth $\hat{x}$. This observation indicates that the individual has a preference for specific levels of consumption-to-habit and wealth-to-habit ratios.
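As a small illustration of the certainty-equivalent definition, the sketch below inverts the constant-stream utility $\tilde{c}^{\,1-\gamma}/(\delta(1-\gamma))$. It is only an illustrative helper under the formula stated above; the function names are hypothetical, and the value function $V(x)$ itself must come from the numerical solution of the free-boundary problem, which is not reproduced here.

```python
def constant_stream_utility(c, delta, gamma):
    """Discounted utility of a constant consumption-to-habit ratio c,
    i.e. the integral of exp(-delta*t) * c**(1-gamma) / (1-gamma) over t >= 0."""
    return c ** (1.0 - gamma) / (delta * (1.0 - gamma))


def certainty_equivalent(value, delta, gamma):
    """Invert the constant-stream utility: CE = (delta*(1-gamma)*V)**(1/(1-gamma)).

    `value` plays the role of V(x); it must have the same sign as (1 - gamma),
    otherwise no constant stream attains it.
    """
    base = delta * (1.0 - gamma) * value
    if base <= 0.0:
        raise ValueError("value is not attainable by a constant consumption stream")
    return base ** (1.0 / (1.0 - gamma))


if __name__ == "__main__":
    delta, gamma = 0.3, 2.0  # baseline parameters of the numerical section
    v = constant_stream_utility(0.8, delta, gamma)
    # Round trip: the CE of the utility of a constant stream is that stream.
    print(certainty_equivalent(v, delta, gamma))
```

The round trip in the usage example simply confirms that the two functions are inverses of each other for a fixed pair $(\delta, \gamma)$.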
In Angoshtari et al. (2020), for the case when risky investment is not allowed, we showed a strong form of this property and explicitly identified the corresponding relative wealth and consumption levels $(\hat{x}, \hat{c})$.

In Figure 4, we investigate the dependence of the critical wealth-to-habit ratio $x^*$ on the subjective discount rate $\delta$ in (2.11). We find $x^*$ to be decreasing in $\delta$, which indicates that impatient individuals (that is, with higher $\delta$) are more eager to consume at a rate higher than $\alpha$ than patient individuals (that is, with lower $\delta$). We also saw this relationship in Angoshtari et al. (2020) for the case of riskless investment. In Angoshtari et al. (2020), we also found that $x^* = \underline{x}$ for $\delta \ge r + \rho(1-\alpha)$. In contrast, Figure 4 highlights that $x^* > \underline{x}$ for all values of $\delta > 0$, which we proved in Section 3. Indeed, Proposition 3.1.(i) implies that $y^* < \eta_2$, from which it follows that $x^* > \underline{x}$ by (3.33).

Figure 5 shows the dependence of the optimal policy on the parameter $\alpha$ in (2.4). Note that, by (2.5), $\underline{x}$ is increasing in $\alpha$. Thus, the domains of $c^*$ and $\theta^*$ in Figure 5 shift to the right as $\alpha$ increases. The left plot indicates that increasing $\alpha$ decreases the optimal investment-to-habit ratio $\theta^*(x)$, as long as the current level of the wealth-to-habit ratio stays admissible (that is, $x \ge \underline{x}$). The right plot shows that an increase in $\alpha$ increases (resp. decreases) $c^*(x)$ if $x \in (\underline{x}, x^*)$ (resp. $x > x^*$). In other words, an individual who is more amenable to addiction (that is, higher $\alpha$) optimally invests less in the risky asset than an individual with a less addictive personality and the same wealth-to-habit ratio. Furthermore, the individual with the more addictive personality optimally consumes less than the individual with the less addictive personality, unless the former individual's consumption is driven by the habit-formation constraint (that is, $x \in (\underline{x}, x^*)$ such that $c^*(x) = \alpha$ for the individual with higher $\alpha$).

Figure 5 (panels: $c^*(x)$ for various values of $\alpha$; $\theta^*(x)$ for various values of $\alpha$): Sensitivity of the optimal investment function $\theta^*(x)$ and the optimal consumption function $c^*(x)$ with respect to $\alpha$.

Figure 6 (panels: $c^*(x)$ for $\alpha = 1$ and $\alpha = 0.75$; $\theta^*(x)$ for $\alpha = 0.75$ and $\alpha = 1$): The optimal investment function $\theta^*(x)$ and the optimal consumption function $c^*(x)$ for the addictive habits, that is, $\alpha = 1$. For reference, the optimal investment and consumption functions for the nonaddictive habits $\alpha = 0.75$ are also shown by the dashed curves. Note that the left limit on the horizontal axis is $x = 35$.

Figure 6 shows the optimal policies for the case $\alpha = 1$, which was included in the analysis of Section 3. For this case, the individual's consumption rate is forced to be at least as large as her habit by (2.4), that is, $C_t \ge Z_t$. This scenario is usually referred to as addictive habit formation, while the case in which $C_t < Z_t$ is allowed is called nonaddictive habit formation (see, for instance, Detemple and Karatzas (2003)). Therefore, in our setting, $\alpha = 1$ (resp. $\alpha < 1$) represents addictive (resp. nonaddictive) habit formation. As Figure 6 shows, the optimal policies of the addictive and nonaddictive cases have a similar structure. Their main difference is that the amount of wealth needed to support a certain level of consumption is significantly higher for addictive habits. For instance, for our chosen [...] $x \approx 47$ in the right plot of Figure 6), and a wealth of about 50 times her habit to consume above the minimum rate. On the other hand, Figure 5 shows that for a nonaddictive habit formation with $\alpha = 0.75$, the individual needs a wealth-to-habit ratio of around 3 to optimally consume above her minimum rate.

Finally, Figure 6 shows that, for the same values of risk aversion and wealth-to-habit ratio, addictive habits (that is, $\alpha = 1$) correspond to significantly lower levels of optimal consumption and optimal investment in the risky asset than nonaddictive habits (with $\alpha = 0.75$). In other words, individuals with more addictive habits (optimally) invest less in the risky asset. To attract such individuals, the market premium needs to be higher than it would be for individuals with less addictive habits. This observation provides an explanation for the equity premium puzzle of Mehra and Prescott (1985), which states that the historical risk premium offered by stock markets has been significantly higher than the level that could be explained by investors' risk aversion alone. See Constantinides (1990) for further discussion on the puzzle and how it can be explained by habit-formation models.

References
Albrecher, H., P. Azcue, and N. Muler (2020a). Optimal ratcheting of dividends in a Brownian risk model. Preprint, available at arXiv:2012.10632.

Albrecher, H., P. Azcue, and N. Muler (2020b). Optimal ratcheting of dividends in insurance. SIAM Journal on Control and Optimization 58(4), 1822–1845.

Angoshtari, B., E. Bayraktar, and V. R. Young (2019). Optimal dividend distribution under drawdown and ratcheting constraints on dividend rates. SIAM Journal on Financial Mathematics 10(2), 547–577.

Angoshtari, B., E. Bayraktar, and V. R. Young (2020). Optimal consumption under a habit-formation constraint. Preprint, available at arXiv:2012.02277.

Arun, T. (2012). The Merton problem with a drawdown constraint on consumption. Preprint, available at arXiv:1210.5205.

Constantinides, G. M. (1990). Habit formation: A resolution of the equity premium puzzle. Journal of Political Economy 98(3), 519–543.

Deng, S., X. Li, H. Pham, and X. Yu (2020). Optimal consumption with reference to past spending maximum. Preprint, available at SSRN 3656811.

Detemple, J. B. and I. Karatzas (2003). Non-addictive habits: optimal consumption-portfolio policies. Journal of Economic Theory 113(2), 265–285.

Detemple, J. B. and F. Zapatero (1991). Asset prices in an exchange economy with habit formation. Econometrica: Journal of the Econometric Society 59(6), 1633–1657.

Detemple, J. B. and F. Zapatero (1992). Optimal consumption-portfolio policies with habit formation. Mathematical Finance 2(4), 251–274.

Dybvig, P. H. (1995). Dusenberry's ratcheting of consumption: Optimal dynamic consumption and investment given intolerance for any decline in standard of living. Review of Economic Studies 62(2), 287–313.

Englezos, N. and I. Karatzas (2009). Utility maximization with habit formation: Dynamic programming and stochastic PDEs. SIAM Journal on Control and Optimization 48(2), 481–520.

Jeon, J., H. K. Koo, and Y. H. Shin (2018). Portfolio selection with consumption ratcheting. Journal of Economic Dynamics and Control 92, 153–182.

Karatzas, I. and S. E. Shreve (1991). Brownian motion and stochastic calculus (Second ed.), Volume 113 of Graduate Texts in Mathematics. Springer-Verlag, New York.

Mehra, R. and E. C. Prescott (1985). The equity premium: A puzzle. Journal of Monetary Economics 15(2), 145–161.

Merton, R. C. (1969). Lifetime portfolio selection under uncertainty: The continuous-time case. Review of Economics and Statistics 51(3), 247–257.

Munk, C. (2008). Portfolio and consumption choice with stochastic investment opportunities and habit formation in preferences. Journal of Economic Dynamics and Control 32(11), 3560–3589.

Muraviev, R. (2011). Additive habit formation: consumption in incomplete markets with random endowments. Mathematics and Financial Economics 5(2), 67.

Pollak, R. A. (1970). Habit formation and dynamic demand functions. Journal of Political Economy 78(4, Part 1), 745–763.

Roche, H. (2019). Asset management with endogenous withdrawals under a drawdown constraint. Quantitative Finance 19(2), 289–312.

Ryder, H. E. and G. M. Heal (1973). Optimal growth with intertemporally dependent preferences. The Review of Economic Studies 40(1), 1–31.

Sundaresan, S. M. (1989). Intertemporally dependent preferences and the volatility of consumption and wealth. Review of Financial Studies 2(1), 73–89.

Walter, W. (1998). Ordinary differential equations, Volume 182 of Graduate Texts in Mathematics. Springer-Verlag, New York. Translated from the sixth German (1996) edition by Russell Thompson, Readings in Mathematics.

Yu, X. (2015, 06). Utility maximization with addictive consumption habit formation in incomplete semimartingale markets. Ann. Appl. Probab. 25(3), 1383–1419.
A Proof of Theorem 2.1
We complete the proof in two steps by showing (1) v ≥ V and (2) v ≤ V. Step 1:
Let $(\theta_t, c_t)_{t \ge 0} \in \mathcal{A}(\alpha)$, and let $\{X_t\}_{t \ge 0}$ be the corresponding wealth-to-habit process given by (2.8). Define the non-decreasing sequence of stopping times $\{\tau_n\}_{n=1}^{\infty}$ by
\[
\tau_n := \inf\left\{ t \ge 0 : \int_0^t e^{-\delta s} \big(\theta_s v'(X_s)\big)^2 \, ds \ge n \right\},
\]
for $n \ge 1$. For all $T \ge 0$, applying Itô's lemma to $e^{-\delta t} v(X_t)$, $t \in [0, T \wedge \tau_n]$, yields
\[
e^{-\delta (T \wedge \tau_n)} v\big(X_{T \wedge \tau_n}\big) + \int_0^{T \wedge \tau_n} \frac{c_t^{1-\gamma}}{1-\gamma} e^{-\delta t} \, dt
= v(x) + \int_0^{T \wedge \tau_n} e^{-\delta t} \mathcal{L}^{\theta_t, c_t} v(X_t) \, dt + \int_0^{T \wedge \tau_n} \sigma \theta_t e^{-\delta t} v'(X_t) \, dB_t.
\]
Condition (i) implies that the first integral on the right is non-positive; thus, we have
\[
\frac{\alpha^{1-\gamma}}{\delta(1-\gamma)} e^{-\delta (T \wedge \tau_n)}
\le e^{-\delta (T \wedge \tau_n)} v\big(X_{T \wedge \tau_n}\big) + \int_0^{T \wedge \tau_n} \frac{c_t^{1-\gamma}}{1-\gamma} e^{-\delta t} \, dt
\le v(x) + \int_0^{T \wedge \tau_n} \sigma \theta_t e^{-\delta t} v'(X_t) \, dB_t,
\]
in which we used condition (ii) to get the first inequality. The definition of $\tau_n$ implies that the expectation of the remaining integral on the right is zero, which implies
\[
\frac{\alpha^{1-\gamma}}{\delta(1-\gamma)} \mathbb{E}^x\Big( e^{-\delta (T \wedge \tau_n)} \Big)
\le \mathbb{E}^x\left( e^{-\delta (T \wedge \tau_n)} v\big(X_{T \wedge \tau_n}\big) + \int_0^{T \wedge \tau_n} \frac{c_t^{1-\gamma}}{1-\gamma} e^{-\delta t} \, dt \right)
\le v(x). \tag{A.1}
\]
Define $\tau_\infty := \operatorname{ess\,sup}\{\tau_n : n \ge 1\}$ […] $\mathbb{P}(\tau_\infty = +\infty) > 0$. From the dominated convergence theorem, because $\{\tau_n\}$ is non-decreasing, we deduce
\[
\lim_{n \to +\infty} \mathbb{E}^x\Big( e^{-\delta (T \wedge \tau_n)} \Big) = \mathbb{E}^x\Big( e^{-\delta (T \wedge \tau_\infty)} \Big) \in [0, 1).
\]
Because $v'(\underline{x}+) = +\infty$ by condition (ii), we have $\tau_\infty < +\infty$ only if $X_{\tau_\infty} = \underline{x}$, which, in turn, is equivalent to $X_t = \underline{x}$ and $c_t = \alpha$ for all $t \ge \tau_\infty$ by the proof of Lemma 2.2 in Angoshtari et al. (2020).
By letting $n \to \infty$ in (A.1) and by using the dominated convergence theorem to exchange expectation and limit, we obtain
\[
\frac{\alpha^{1-\gamma}}{\delta(1-\gamma)} \mathbb{E}^x\Big( e^{-\delta (T \wedge \tau_\infty)} \Big)
\le \lim_{n \to +\infty} \mathbb{E}^x\left( e^{-\delta (T \wedge \tau_n)} v\big(X_{T \wedge \tau_n}\big) + \int_0^{T \wedge \tau_n} \frac{c_t^{1-\gamma}}{1-\gamma} e^{-\delta t} \, dt \right)
= \mathbb{E}^x\Big[ \mathbf{1}_{\{\tau_\infty \,\cdots\}} \cdots \Big]
\]
[…]

Step 2: For this step, consider the admissible policy $\big(\theta^*(X^*_t), c^*(X^*_t)\big)_{t \ge 0}$, and define the stopping time $\widehat{\tau}_n$ by
\[
\widehat{\tau}_n := \inf\left\{ t \ge 0 : \int_0^t e^{-\delta s} \big(\theta^*_s v'(X^*_s)\big)^2 \, ds \ge n \right\}.
\]
Then, by repeating the argument in Step 1 and by using condition (iv), we obtain
\[
v(x) = \mathbb{E}^x\left( e^{-\delta (T \wedge \widehat{\tau}_n)} v\big(X^*_{T \wedge \widehat{\tau}_n}\big) + \int_0^{T \wedge \widehat{\tau}_n} \frac{\big(c^*(X^*_t)\big)^{1-\gamma}}{1-\gamma} e^{-\delta t} \, dt \right)
\ge \frac{\alpha^{1-\gamma}}{\delta(1-\gamma)} \mathbb{E}^x\Big( e^{-\delta (T \wedge \widehat{\tau}_n)} \Big).
\]
By arguing as in Step 1, and by taking the limit as $n \to +\infty$ and, then, as $T \to +\infty$, we have
\[
v(x) = \mathbb{E}^x\left( \int_0^{+\infty} \frac{\big(c^*(X^*_t)\big)^{1-\gamma}}{1-\gamma} e^{-\delta t} \, dt \right).
\]
Thus, because $v$ is the value function corresponding to an admissible policy, we deduce $v \le V$ on $[\underline{x}, +\infty)$.

B Auxiliary lemmas for Section 3

The following lemma is used in the proof of Proposition 3.1.

Lemma B.1. For $\eta \in (\eta_1, \eta_2)$, let $\big(\phi_\eta(\cdot), H_\eta(\cdot)\big)$ be the solution of the boundary-value problem (3.22) such that $(\varepsilon(\eta), \eta]$ is the maximal domain over which the solution exists within $D$ given by (3.19). We, then, have:
(i) If $\varepsilon(\eta) > 0$, then $\big(\phi_\eta(\cdot), H_\eta(\cdot)\big)$ exits $D$ either through the boundary $D_1$ given by (3.23) or through the boundary $D_2$ given by
\[
D_2 := \big\{ (y, \phi, 0) : y \in (0, \eta_2),\ \phi \in (0, \alpha^{-\gamma}) \big\}.
\]
(ii) For values of $\eta \in (\eta_1, \eta_2)$ that are sufficiently close to $\eta_1$, the solution $\big(\phi_\eta(\cdot), H_\eta(\cdot)\big)$ exits $D$ through $D_1$.
(iii) For values of $\eta \in (\eta_1, \eta_2)$ that are sufficiently close to $\eta_2$, the solution $\big(\phi_\eta(\cdot), H_\eta(\cdot)\big)$ exits $D$ through $D_2$.
(iv) Assume that $\eta, \eta' \in (\eta_1, \eta_2)$ are such that $\eta < \eta'$ and the solutions $\big(\phi_\eta(\cdot), H_\eta(\cdot)\big)$ and $\big(\phi_{\eta'}(\cdot), H_{\eta'}(\cdot)\big)$ do not have disjoint domains, that is, $\max\{\varepsilon(\eta'), \varepsilon(\eta)\} < \eta$. Then, $\phi_\eta(y) > \phi_{\eta'}(y)$ and $H_\eta(y) > H_{\eta'}(y)$ for all $y \in \big(\max\{\varepsilon(\eta'), \varepsilon(\eta)\}, \eta\big]$.

Proof. Proof of (i): From the differential equation for $\phi$ in (3.22), we deduce that $\phi_\eta'(y) > 0$ for $y \in (\varepsilon(\eta), \eta)$, since $H_\eta(y) < \kappa/\rho$. So, $\big(\phi_\eta(\cdot), H_\eta(\cdot)\big)$ can only exit $D$ through the boundary
\[
D_3 := \big\{ (y, 0, H) : y \in (0, \eta_2),\ H \in (0, \kappa/\rho) \big\},
\]
the boundary
\[
D_1' := \big\{ (y, \phi, \kappa/\rho) : y \in (0, \eta_2),\ \phi \in (0, \alpha^{-\gamma}) \big\},
\]
or the boundary $D_2$. We can eliminate the possibility of exiting through the boundary $D_3$ by the following argument. On the contrary, suppose $\big(\phi_\eta(\cdot), H_\eta(\cdot)\big)$ exits $D$ through $D_3$, that is, $0 < H_\eta(y) < \kappa/\rho$ for $\varepsilon(\eta) < y \le \eta$ and $\lim_{y \to \varepsilon(\eta)^+} \phi_\eta(y) = 0$. For $y > 0$, define $u_1(y) = (1 + \rho \underline{x}) y$ and $u_2(y) = 0$. Note that $u_1\big(\varepsilon(\eta)\big) > 0 = \lim_{y \to \varepsilon(\eta)^+} \phi_\eta(y)$ and $u_2\big(\varepsilon(\eta)\big) = 0 < \lim_{y \to \varepsilon(\eta)^+} H_\eta(y)$. Furthermore, for $\varepsilon(\eta) < y \le \eta$, we have
\[
u_1'(y) - g_1\big(y, u_1(y), u_2(y)\big) = 0 = \phi_\eta'(y) - g_1\big(y, \phi_\eta(y), H_\eta(y)\big),
\]
and
\[
u_2'(y) - g_2\big(y, u_1(y), u_2(y)\big) = 0 - \frac{1}{y}\Big( (1 + \rho \underline{x})^{-1/\gamma} y^{-1/\gamma} - \alpha \Big) < 0 = H_\eta'(y) - g_2\big(y, \phi_\eta(y), H_\eta(y)\big), \tag{B.1}
\]
in which $g_1$ and $g_2$ are given by (3.20) and (3.21), respectively. To get the first equality in (B.1), we used $\frac{(r + \rho)\underline{x}}{1 + \rho \underline{x}} = \alpha$, which follows from (2.5).
To get the inequality in (B.1), we used $0 < y \le \eta < \eta_2 = \frac{\alpha^{-\gamma}}{1 + \rho \underline{x}}$. Because $g_1(y, \phi, H)$ is decreasing in $H$ and $g_2(y, \phi, H)$ is decreasing in $\phi$, we can apply Lemma B.2.(i) below to conclude that $\phi_\eta(\eta) \le (1 + \rho \underline{x}) \eta$. The last statement, however, contradicts the boundary condition in (3.22), namely, $\phi_\eta(\eta) = \alpha^{-\gamma}$ and $\eta < \eta_2 \Rightarrow (1 + \rho \underline{x}) \eta < \alpha^{-\gamma}$. Thus, $\big(\phi_\eta(\cdot), H_\eta(\cdot)\big)$ can only exit $D$ through either $D_1'$ or $D_2$.

To finish proving (i), it remains to show that $\big(\phi_\eta(\cdot), H_\eta(\cdot)\big)$ cannot exit through the boundary
\[
D_1' \setminus D_1 = \big\{ (y, \phi, \kappa/\rho) : y \in [\eta_1, \eta_2),\ \phi \in (0, \alpha^{-\gamma}) \big\}.
\]
To show this statement, it suffices to show
\[
H_\eta(y) \le w_2(y); \qquad \max\{\varepsilon(\eta), \eta_1\} < y \le \eta, \tag{B.2}
\]
in which $w_2$ is defined by
\[
w_2(y) = \frac{\kappa}{\rho}\left[ 1 - \lambda\left( 1 - \frac{y}{\eta_1} \right) \right]; \qquad y \in (0, \eta_2).
\]
Recall that $\lambda < 0$, and note that $\frac{\kappa}{\rho} = w_2(\eta_1) > w_2(y) > w_2(\eta_2) = 0$ for $y \in (\eta_1, \eta_2)$. To show inequality (B.2), let $w_1(y) = \alpha^{-\gamma}$ for $y \in (0, \eta_2)$. From (3.22), we have $\phi_\eta(\eta) = w_1(\eta)$ and $H_\eta(\eta) = w_2(\eta)$.
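Furthermore, for $y \in \big(\max\{\varepsilon(\eta), \eta_1\}, \eta\big]$, we have
\[
w_1'(y) - g_1\big(y, w_1(y), w_2(y)\big) = 0 - \frac{\rho}{\kappa y}\left( \frac{\kappa}{\rho} - w_2(y) \right) \alpha^{-\gamma} < 0 = \phi_\eta'(y) - g_1\big(y, \phi_\eta(y), H_\eta(y)\big),
\]
and
\begin{align*}
w_2'(y) - g_2\big(y, w_1(y), w_2(y)\big)
&= \frac{\kappa\lambda}{\rho\eta_1} - \frac{\rho}{\kappa y}\left( \frac{\kappa}{\rho} - w_2(y) \right)\left( \frac{\delta - r - \rho(1-\alpha)}{\rho} - w_2(y) \right) - \frac{r + \rho}{\rho\alpha^{-\gamma}} + \frac{\delta}{\rho y} \\
&= \frac{\kappa\lambda}{\rho\eta_1} - \frac{r + \rho}{\rho\alpha^{-\gamma}} + \frac{\delta}{\rho y} + \frac{1}{y}\left( 1 - \frac{y}{\eta_1} \right)\left( -\frac{\kappa\lambda^2}{\rho} + \frac{\big(\kappa + r + \rho(1-\alpha) - \delta\big)\lambda}{\rho} + \frac{\kappa\lambda^2 y}{\rho\eta_1} \right) \\
&= \frac{\kappa\lambda}{\rho\eta_1} - \frac{r + \rho}{\rho\alpha^{-\gamma}} + \frac{\delta}{\rho y} + \frac{1}{y}\left( 1 - \frac{y}{\eta_1} \right)\left( -\frac{\delta}{\rho} + \frac{\kappa\lambda^2 y}{\rho\eta_1} \right) \\
&= \frac{\kappa\lambda}{\rho\eta_1} - \frac{r + \rho}{\rho\alpha^{-\gamma}} + \frac{\delta}{\rho\eta_1} + \frac{\kappa\lambda^2}{\rho\eta_1}\left( 1 - \frac{y}{\eta_1} \right) \\
&= \frac{1}{\rho\eta_1}\left[ \kappa\lambda - \frac{\lambda\big(r + \rho(1-\alpha)\big)}{\lambda - 1} + \delta + \kappa\lambda^2\left( 1 - \frac{y}{\eta_1} \right) \right] \\
&= \frac{\kappa\lambda^2}{\rho\eta_1}\left( 1 - \frac{y}{\eta_1} \right) < 0 = H_\eta'(y) - g_2\big(y, \phi_\eta(y), H_\eta(y)\big).
\end{align*}
In two steps of the calculation for $w_2' - g_2$, we used the fact that $\lambda$ satisfies (3.17), and we used the definition of $\eta_1$ in (3.14). To get the last inequality, we used $y > \eta_1$. Finally, inequality (B.2) follows from Lemma B.2.(ii) below.

Proofs of (ii) and (iii): As $\eta \to \eta_1^+$, the boundary condition in (3.22) approaches the point $(y, \phi, H) = (\eta_1, \alpha^{-\gamma}, \kappa/\rho)$, which lies on the boundary of $D_1$. Furthermore,
\[
g_2\big(\eta_1, \alpha^{-\gamma}, \kappa/\rho\big) = \frac{r + \rho}{\rho\alpha^{-\gamma}} - \frac{\delta}{\rho\eta_1}
= \frac{r + \rho}{\rho\alpha^{-\gamma}}\left( 1 + \frac{\delta(1 - \lambda)}{\lambda\big(r + \rho(1-\alpha)\big)} \right)
= \frac{\kappa (r + \rho)(\lambda - 1)}{\rho\alpha^{-\gamma}\big(r + \rho(1-\alpha)\big)} < 0,
\]
in which we used (2.5) and (3.14) to get the second equality, (3.17) to get the third equality, and $\lambda < 0$ to get the inequality. From continuous dependence of the solution $\big(\phi_\eta(\cdot), H_\eta(\cdot)\big)$ on $\eta$, it follows that $\big(\phi_\eta(\cdot), H_\eta(\cdot)\big)$ exits $D$ through $D_1$ for values of $\eta$ in a right neighborhood $(\eta_1, \eta_1 + \epsilon)$ of $\eta_1$.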
With a similar argument, we conclude that $\big(\phi_\eta(\cdot), H_\eta(\cdot)\big)$ exits $D$ through $D_2$ for values of $\eta$ in a left neighborhood $(\eta_2 - \epsilon', \eta_2)$ of $\eta_2$.

Proof of (iv): The statement follows directly from Lemma B.2.(ii) below by taking into account that $\big(\phi_\eta(\cdot), H_\eta(\cdot)\big)$ and $\big(\phi_{\eta'}(\cdot), H_{\eta'}(\cdot)\big)$ are unique solutions of (3.22).

We refer to the following lemma in the proof of Lemma B.1.

Lemma B.2. For an open set $D \subseteq \mathbb{R}^2$ and an interval $J = (a, b)$, assume that the vector-valued function $(f_1, f_2) = f(x, y) : J \times D \to \mathbb{R}^2$ is locally Lipschitz continuous with respect to $y$, that $f_1(x, y_1, y_2)$ is decreasing in $y_2$, and that $f_2(x, y_1, y_2)$ is decreasing in $y_1$. Let $u = (u_1, u_2) : J \to D$ and $w = (w_1, w_2) : J \to D$ be differentiable functions. Then:
(i) If $u_1(a+) \ge w_1(a+)$, $u_2(a+) \le w_2(a+)$, $u_1'(x) - f_1\big(x, u(x)\big) \ge w_1'(x) - f_1\big(x, w(x)\big)$, and $u_2'(x) - f_2\big(x, u(x)\big) \le w_2'(x) - f_2\big(x, w(x)\big)$ for $x \in J$, then $u_1(x) \ge w_1(x)$ and $u_2(x) \le w_2(x)$ for $x \in J$.
(ii) If $u_i(b-) \le w_i(b-)$ and $u_i'(x) - f_i\big(x, u(x)\big) \ge w_i'(x) - f_i\big(x, w(x)\big)$ for $x \in J$ and $i \in \{1, 2\}$, then $u_i(x) \le w_i(x)$ for $x \in J$ and $i \in \{1, 2\}$.

Proof. See, for instance, the comparison theorem on page 112 of Walter (1998). Note, however, that $f$ is quasimonotone decreasing and that we have stated the lemma for a right-boundary-value problem in (ii).

We use the following lemma in the proof of Theorem 3.1.

Lemma B.3. Let $\phi$ be as in Proposition 3.1.(i). For any $\beta > 0$, $\lim_{y \to 0^+} \frac{\phi(y)}{y^\beta} = +\infty$.

Proof. The statement is trivial if $\lim_{y \to 0^+} \phi(y) > 0$; therefore, suppose $\lim_{y \to 0^+} \phi(y) = 0$, and define
\[
F(y) := \frac{\phi(y)}{y^\beta},
\]
for $0 < y < y^*$. Our goal is to show that $\lim_{y \to 0^+} F(y) = +\infty$. Assume, on the contrary, $\lim_{y \to 0^+} F(y) \ne +\infty$.
We compute
\[
F'(y) = \frac{\phi(y)}{y^{\beta+1}}\left( \frac{\rho}{\kappa}\left( \frac{\kappa}{\rho} - H \right) - \beta \right),
\]
for $0 < y < y^*$. Because $\lim_{y \to 0^+} H(y) = \kappa/\rho$ by Proposition 3.1.(i), there exists an $\epsilon > 0$ such that $F(y)$ is decreasing for $y \in (0, \epsilon)$. Because $F$ is decreasing and positive on $(0, \epsilon)$, and because we assume $\lim_{y \to 0^+} F(y) \ne +\infty$, we must have $\lim_{y \to 0^+} F(y) = M$ for some constant $M > 0$. From L'Hôpital's rule, (3.15), and $\lim_{y \to 0^+} H(y) = \kappa/\rho$, we deduce
\[
M = \lim_{y \to 0^+} F(y) = \lim_{y \to 0^+} \frac{\phi'(y)}{\beta y^{\beta-1}} = \lim_{y \to 0^+} \frac{\rho}{\beta\kappa} F(y)\left( \frac{\kappa}{\rho} - H(y) \right) = 0,
\]
which contradicts $M > 0$. Thus, we must have $\lim_{y \to 0^+} F(y) = +\infty$.
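The algebra behind inequality (B.2) in the proof of Lemma B.1.(i) can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the paper. It assumes that (3.17) is the quadratic $\kappa\lambda^2 - (\kappa + r + \rho(1-\alpha) - \delta)\lambda - \delta = 0$ with $\lambda$ its negative root, that (2.5) reads $\underline{x} = \alpha/(r + \rho(1-\alpha))$, that $\eta_2 = \alpha^{-\gamma}/(1 + \rho\underline{x})$, and that (3.14) reads $\eta_1 = \frac{\lambda}{\lambda-1}\eta_2$. All parameter values are hypothetical. The code compares the first and last lines of the computation of $w_2' - g_2$ on a grid in $(\eta_1, \eta_2)$.

```python
import math

# Hypothetical parameter values, chosen only for illustration (not from the paper).
r, rho, alpha, delta, kappa, gamma = 0.02, 1.0, 0.7, 0.05, 0.1, 0.5

# Assumed form of (3.17): lam is the negative root of
#   kappa*l^2 - (kappa + r + rho*(1 - alpha) - delta)*l - delta = 0.
A = kappa + r + rho * (1 - alpha) - delta
lam = (A - math.sqrt(A * A + 4 * kappa * delta)) / (2 * kappa)

# Assumed forms of (2.5) and (3.14).
x_low = alpha / (r + rho * (1 - alpha))        # minimum wealth-to-habit ratio
eta2 = alpha ** (-gamma) / (1 + rho * x_low)
eta1 = lam / (lam - 1) * eta2


def w2(y):
    """Linear comparison function w_2 used in (B.2)."""
    return kappa / rho * (1 - lam * (1 - y / eta1))


def lhs(y):
    """w_2'(y) - g_2(y, w_1(y), w_2(y)), first line of the computation."""
    w2_prime = kappa * lam / (rho * eta1)
    product = (rho / (kappa * y) * (kappa / rho - w2(y))
               * ((delta - r - rho * (1 - alpha)) / rho - w2(y)))
    return (w2_prime - product
            - (r + rho) / (rho * alpha ** (-gamma))
            + delta / (rho * y))


def rhs(y):
    """Closed form on the last line of the computation."""
    return kappa * lam ** 2 / (rho * eta1) * (1 - y / eta1)


# Interior grid points of (eta1, eta2).
ys = [eta1 + k * (eta2 - eta1) / 10 for k in range(1, 10)]
max_gap = max(abs(lhs(y) - rhs(y)) for y in ys)
print("largest |lhs - rhs| on the grid:", max_gap)
```

With these assumed forms, the two expressions should agree up to floating-point error, and the closed form should be negative for $y > \eta_1$, which is the sign needed to apply Lemma B.2.(ii). Other admissible parameter values can be substituted to repeat the check.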