A Certainty Equivalent Merton Problem
Nicholas Moehle and Stephen Boyd

January 27, 2021
Abstract
The Merton problem is the well-known stochastic control problem of choosing consumption over time, as well as an investment mix, to maximize expected constant relative risk aversion (CRRA) utility of consumption. Merton formulated the problem and provided an analytical solution in 1970; since then a number of extensions of the original formulation have been solved. In this note we identify a certainty equivalent problem, i.e., a deterministic optimal control problem with the same optimal value function and optimal policy, for the base Merton problem, as well as a number of extensions. When time is discretized, the certainty equivalent problem becomes a second-order cone program (SOCP), readily formulated and solved using domain specific languages for convex optimization. This makes it a good starting point for model predictive control, a policy that can handle extensions that are either too cumbersome or impossible to handle exactly using standard dynamic programming methods.
We revisit Merton's seminal 1970 formulation (and solution) of the consumption and investment decisions of an individual investor. We present a formulation of Merton's problem as a deterministic convex optimal control problem, and in particular, a second-order cone program (SOCP) when time is discretized. Even though the Merton problem was first solved more than 50 years ago, its reformulation as a deterministic convex optimization problem provides fresh insight into the solution of the stochastic problem that may be useful for formulating other multiperiod investment problems as convex optimization problems.

We also see two practical advantages to the certainty equivalent formulation. First, for extensions of the Merton problem for which a solution is known, working out the optimal policy can be complex and error prone. To handle these extensions with the certainty equivalent form, we simply add the appropriate terms to the objective or constraints, to obtain the optimal policy. The problem specification is straightforward and transparent, especially when expressed in a domain specific language (DSL) for convex optimization, such as cvxpy [DB16].

The second and perhaps more significant advantage is that the certainty equivalent problem can be used as a starting point for further extensions of the Merton problem, for which no closed-form solutions are known. In this case, the certainty equivalence property is lost, and solving the deterministic problem no longer solves the corresponding stochastic problem exactly. We can, however, still use model predictive control (MPC), a method that involves online convex optimization, to develop a policy that handles the extension. MPC policies are simple, easy to implement, fully interpretable, and have excellent (if not always optimal) practical performance.
Merton’s problem.
Merton's consumption–investment problem dates back to his original 1970 paper [Mer70]. Many extensions to the basic Merton problem exist, some of which were covered in Merton's original paper. (These include deterministic income and general HARA utility.) Most proposed extensions do not have a closed-form solution, but some that do include uncertain mortality, life insurance, and annuities, first addressed by [Ric75]. Some extensions for the specific case of quadratic utility are handled in [BC10]. We note that many of these extensions individually lead to complicated solutions, and deriving the optimal policy when several extensions are combined may be very inconvenient for a practical implementation.
Certainty equivalence.
In rare cases, stochastic control problems have a certainty equivalent formulation, i.e., a deterministic optimal control problem with the same optimal policy. The most famous example is the linear quadratic regulator (LQR) problem, in which the dynamics are affine, driven by additive noise, and the stage costs are convex quadratic [KS72; BB18; Ber17].

Model predictive control.
In model predictive control, unknown values of future parameters are replaced with estimates or forecasts over a planning horizon extending from the current time to some time in the future, resulting in a deterministic optimal control problem. This problem is solved, with the result interpretable as a plan of action over the planning horizon. The MPC policy simply uses the current or first value in the plan of action. This planning is repeated when updated forecasts are available, using the updated forecasts and current state. When applied in the context of stochastic control, MPC policies are not optimal in general, but often exhibit excellent practical performance, and are widely used in several application areas. MPC is discussed in detail in [BBM17; KH06]. In [Boy+14], the authors use a computational bound to show that MPC is nearly optimal for some stochastic control problems in finance.

As discussed above, ignoring uncertainty is in fact optimal for linear quadratic control, and MPC leads to an optimal policy when applied to LQR. In this sense, MPC can be interpreted as applying certainty equivalence beyond where it is theoretically justified in order to obtain a good heuristic control policy [Ber17]. Code generation tools such as cvxgen [MB12] can be used to generate low-level code that solves the problem specified, which is suitable for use in high speed embedded applications [WB09]. In the context of the present paper, this means that the MPC policy we propose in §6 can be evaluated very quickly.

Multi-period portfolio optimization.
It is instructive to compare our certainty equivalent problem to popular formulations of multi-period portfolio allocation (see [Boy+17] and references therein). There are two features present in our certainty equivalent problem that we do not see in practical multiperiod portfolio construction problems in the literature:

1. The risk term (which is quadratic in the dollar-valued asset allocation vector x_t) is normalized by the total wealth w_t, which is also a decision variable. This risk term is jointly convex in x_t and w_t (and is in fact SOCP representable). With this normalization, risk preferences are consistent even as the wealth w_t changes over the investment horizon.

2. The risk term is included as a penalty in the dynamics, i.e., by taking more risk now, one should expect to have lower wealth in the future. This contrasts with the tradition of penalizing risk in the objective function.

We believe these to be valuable improvements to standard multi-period portfolio construction formulations, especially in cases when the control or optimization is over a very long time period.

Outline. In §2, we give the base Merton problem and review its solution, for future reference. In §3, we give a certainty equivalent problem and prove equivalence. In §4, we discuss several extensions to the Merton problem, and show how each one changes the certainty equivalent formulation. In §6, we discuss how to use the certainty equivalent problem for model predictive control.
In this section we discuss the Merton problem and its solution. To keep the proofs concise, we consider the most basic form of this problem; extensions are considered in §4. Our formulation is in continuous time and relies on stochastic calculus. However, to maintain both brevity and accessibility, we are cavalier about the technical details, with the assumption that a sophisticated reader can fill in the gaps, or consult other references.
Dynamics.
An investor must choose how to invest and consume over a lifetime of T years. The investor has wealth w_t > 0, and consumes wealth at rate c_t > 0, for t ∈ [0, T], with the remaining wealth invested in a portfolio with mean rate of return μ_t and volatility σ_t. The wealth dynamics are a geometric random walk,

dw_t = (μ_t w_t − c_t) dt + σ_t w_t dz_t,

where z_t is a Brownian motion. The initial condition is w_0 = w_init > 0.

Investment portfolio.
The portfolio consists of n assets, with an investment mix given by the fractional allocation θ_t, with 1^T θ_t = 1 (where 1 is the vector with all entries one). Thus we invest (w_t θ_t)_i dollars in asset i, with a negative value denoting a short position. The portfolio return rate and volatility are given by

μ_t = μ^T θ_t,   σ_t = (θ_t^T Σ θ_t)^{1/2},

where μ ∈ R^n is the mean of the return process, and Σ is the symmetric positive definite covariance. (Note that we use the time-varying scalar μ_t to denote the portfolio return as a function of time, and the vector μ to denote the constant expected return rates of the n assets.)

The investment allocation decision θ_t satisfies 1^T θ_t = 1, as well as other investment constraints, which we summarize as θ_t ∈ Θ, where Θ is a convex set. These could include risk limits, sector exposure limits, or concentration limits (see [Boy+17]). We assume every θ_t ∈ Θ satisfies 1^T θ_t = 1. With the portfolio return and volatility we obtain the wealth dynamics

dw_t = (μ^T θ_t w_t − c_t) dt + (θ_t^T Σ θ_t)^{1/2} w_t dz_t.   (1)

Utility. The investor has lifetime consumption utility ∫_0^T c_t^γ/γ dt and bequest utility w_T^γ/γ. The risk aversion parameter γ satisfies γ < 1 and γ ≠ 0. The investor's total expected utility is

U = E( (β/γ) w_T^γ + ∫_0^T (1/γ) c_t^γ dt ).   (2)

The parameter β > 0 sets the relative weight of the bequest utility.

Stochastic control problem.
At each time t, the investor chooses the consumption c_t and the investment allocation θ_t. A policy maps the time t and the current wealth w_t to the consumption c_t and the allocation θ_t, which we write as

(c_t, θ_t) = π_t(w_t),   (3)

where for each t ∈ [0, T], π_t : R_{++} → R_{++} × Θ. (Here R_{++} denotes the set of positive real numbers.) The Merton problem is to choose a policy π_t, t ∈ [0, T], to maximize U.

We review here the solution of the Merton problem via dynamic programming, for completeness and also for future reference.
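The wealth dynamics (1) are straightforward to simulate, which is useful for evaluating any candidate policy by Monte Carlo. The following is a minimal sketch, assuming a constant investment mix (so the portfolio return and volatility are fixed scalars mu_p and sigma_p) and consumption proportional to wealth; the parameter values are illustrative, not from the paper.

```python
import math
import random

def simulate_wealth(w0, mu_p, sigma_p, c_frac, T=10.0, h=0.01, seed=0):
    """Euler-Maruyama simulation of dw = (mu_p*w - c) dt + sigma_p*w dz,
    with consumption proportional to wealth, c_t = c_frac * w_t."""
    rng = random.Random(seed)
    w = w0
    for _ in range(int(T / h)):
        dz = rng.gauss(0.0, math.sqrt(h))  # Brownian increment over [t, t+h]
        w += (mu_p * w - c_frac * w) * h + sigma_p * w * dz
        w = max(w, 1e-12)  # wealth stays positive in the continuous model
    return w

# one sample path of terminal wealth
wT = simulate_wealth(w0=1.0, mu_p=0.06, sigma_p=0.15, c_frac=0.04)
```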
Value function.
The value function V_t : R_{++} → R, for t ∈ [0, T], is defined as

V_t(w) = E( (β/γ) w_T^γ + ∫_t^T (1/γ) c_τ^γ dτ ),

with c_τ and θ_τ following an optimal policy for τ ∈ [t, T], and initial condition w_t = w. We define V_T(w) = (β/γ) w^γ for w > 0. The value function satisfies the Hamilton–Jacobi–Bellman (HJB) equation

−V̇_t(w) = sup_{c, θ∈Θ} ( (1/γ) c^γ + V'_t(w)(μ^T θ w − c) + (1/2) V''_t(w)(θ^T Σ θ) w^2 )   (4)

for w > 0. Conversely, any function satisfying (4) and the terminal condition V_T(w) = (β/γ) w^γ is the value function. Here V̇_t denotes the partial derivative of V with respect to time, and V'_t and V''_t denote the first and second partial derivatives with respect to the wealth.

It is well known that the value function for the Merton problem is

V_t(w) = a_t w^γ / γ,   (5)

where a_t is a function of time. To obtain a_t, we first solve a Markowitz portfolio allocation problem,

maximize   μ^T θ + ((γ−1)/2) θ^T Σ θ
subject to θ ∈ Θ,   (6)

with variable θ. (Since γ − 1 < 0, the second term is a concave risk adjustment.) We let r^ce denote the optimal value, and we denote the solution as θ^ce. We then have, for t ∈ [0, T],

a_t = ( ((1−γ)/(γ r^ce)) ( C exp( (γ r^ce/(1−γ))(T − t) ) − 1 ) )^{1−γ},   (7)

where C = 1 + γ r^ce β^{1/(1−γ)}/(1−γ).

Optimal policy.
The optimal policy can be expressed in terms of the value function as

π*_t(w) = (c_t, θ_t) = argmax_{c, θ∈Θ} ( (1/γ) c^γ + V'_t(w)(μ^T θ w − c) + (1/2) V''_t(w)(θ^T Σ θ) w^2 ).

With the value function (5), we obtain the following optimal policy. The consumption has the simple form

c_t = a_t^{1/(γ−1)} w_t,

and the optimal investment mix is constant over time, θ_t = θ^ce. (In extensions of the Merton problem, described below, the optimal investment mix is not constant over time.)

Proof of optimality.
Here we show that the function (5) satisfies the Hamilton–Jacobi–Bellman PDE. To do this, first we substitute V̇_t, V'_t, and V''_t into (4), using V'_t(w) = a_t w^{γ−1} and V''_t(w) = a_t (γ−1) w^{γ−2}, to obtain

−ȧ_t w^γ/γ = sup_{c, θ∈Θ} ( (1/γ) c^γ + a_t w^{γ−1} (μ^T θ w − c) + (1/2) a_t (γ−1) w^{γ−2} (θ^T Σ θ) w^2 ).

By pulling out w^{γ−1} from the last two terms and simplifying, we obtain

−ȧ_t w^γ/γ = sup_{c, θ∈Θ} ( (1/γ) c^γ + a_t w^{γ−1} ( (μ^T θ + ((γ−1)/2) θ^T Σ θ) w − c ) ).   (8)

The maximizing θ is the solution θ^ce to problem (6). The quantity in the inner parentheses of (8) is the optimal value r^ce of this problem, which can be interpreted as the certainty equivalent return. We now have

−ȧ_t w^γ/γ = sup_c ( (1/γ) c^γ + a_t w^{γ−1} (r^ce w − c) ).

The supremum over c is obtained for c = a_t^{1/(γ−1)} w. Substituting in this value and simplifying, we obtain

−ȧ_t = (1−γ) a_t^{γ/(γ−1)} + γ a_t r^ce.

It can be verified that the definition of a_t in (7) is indeed a solution to this differential equation with terminal condition a_T = β.

In this section we present a deterministic convex optimal control problem that is equivalent to the Merton problem in the sense that it has the same value function and same optimal policy. This certainty equivalent problem is

maximize   (β/γ) w_T^γ + ∫_0^T (1/γ) c_t^γ dt
subject to ẇ_t ≤ μ^T x_t − c_t + ((γ−1)/2) x_t^T Σ x_t / w_t,  t ∈ [0, T]
           x_t/w_t ∈ Θ,  t ∈ [0, T]
           w_0 = w_init.   (9)

The variables are the consumption c_t : [0, T] → R_{++}, wealth w_t : [0, T] → R_{++}, and x_t : [0, T] → R^n, which is the dollar-valued allocation of wealth to each asset. (In the notation of §2, we have x_t = w_t θ_t, and θ_t = x_t/w_t.) Note that the constraint x_t/w_t ∈ Θ implies 1^T x_t = w_t, i.e., the total wealth is the sum of the dollar-valued asset allocations.

The objective is the lifetime utility, but without expectation since this problem is deterministic. The first constraint resembles the dynamics of the stochastic process (1), and we call this the dynamics constraint. We will see that for any solution to (9), this inequality constraint holds with equality, in which case the dynamics constraint becomes a (deterministic) ODE.

Interpretation.
The problem can be interpreted in the following way. We plan for a single outcome of the stochastic process (1). In particular, the dynamics constraint restricts the growth rate of the wealth to be no greater than μ^T x_t − c_t (the mean growth rate in the stochastic process (1)), but reduced by the additional term ((γ−1)/2) x_t^T Σ x_t / w_t. Because γ < 1, this term is negative. With the change of variables θ_t = x_t/w_t, we have

x_t^T Σ x_t / w_t = w_t θ_t^T Σ θ_t,

i.e., this adjustment term is proportional to the variance of the portfolio growth rate with investment allocation θ_t = x_t/w_t. In other words, we are pessimistically planning for bad investment returns, with the degree of pessimism depending on the risk aversion parameter γ and the risk of our portfolio.

In fact, in problem (9), we plan for the returns

r_t = μ + ((γ−1)/2)(1/w_t) Σ x_t = μ + ((γ−1)/2) Σ θ_t.

The coefficients in front of Σ x_t and Σ θ_t are negative, and the entries of Σ x_t and Σ θ_t are typically positive. The vector Σ θ_t can be interpreted as the risk allocation to the individual assets in the portfolio, since

θ_t^T Σ θ_t = Σ_{i=1}^n (θ_t)_i (Σ θ_t)_i.

In other words, the planned asset returns are the mean returns, reduced in proportion to the marginal contribution of each asset to the portfolio variance. This is related to the concept of risk parity [BST16].
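The identity behind this risk-allocation interpretation is easy to check numerically. The sketch below verifies that the per-asset risk contributions (θ_t)_i (Σθ_t)_i sum to the portfolio variance, and forms the pessimistic planned returns; the portfolio data are made up for illustration.

```python
# Check the risk-allocation identity theta^T Sigma theta = sum_i theta_i (Sigma theta)_i,
# and form the planned returns r = mu + ((gamma - 1)/2) Sigma theta.
# The portfolio data below are illustrative, not from the paper.
gamma = -2.0                      # CRRA risk aversion, gamma < 1
mu = [0.08, 0.05, 0.02]           # mean asset returns
Sigma = [[0.04, 0.01, 0.0],       # symmetric positive definite covariance
         [0.01, 0.02, 0.0],
         [0.0, 0.0, 0.0001]]
theta = [0.5, 0.3, 0.2]           # fractional allocation, sums to one

n = len(theta)
Sigma_theta = [sum(Sigma[i][j] * theta[j] for j in range(n)) for i in range(n)]
risk_contrib = [theta[i] * Sigma_theta[i] for i in range(n)]
portfolio_var = sum(Sigma[i][j] * theta[i] * theta[j]
                    for i in range(n) for j in range(n))
assert abs(sum(risk_contrib) - portfolio_var) < 1e-12

# planned returns: mean returns shrunk by each asset's marginal risk contribution
r_plan = [mu[i] + (gamma - 1) / 2 * Sigma_theta[i] for i in range(n)]
```

Since γ − 1 < 0 and the entries of Σθ are positive here, each planned return sits below the corresponding mean return, as the text describes.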
Convexity.
Convexity of (9) follows from the fact that the risk penalty term x_t^T Σ x_t / w_t is a quadratic-over-linear function, which is jointly convex in x_t and w_t [BV04]. The set {(x_t, w_t) ∈ R^n × R_{++} | x_t/w_t ∈ Θ} is the perspective of Θ, which is convex when Θ is [BV04].

Equivalence to Merton problem.
The Merton problem and problem (9) are equivalent in the sense that they have the same value function and optimal policy. To see this, we first consider a modified version of (9) in which we convert the dynamics to an equality constraint using a slack variable u_t ≥ 0:

ẇ_t = μ^T x_t − c_t + ((γ−1)/2) x_t^T Σ x_t / w_t − u_t.

Here u_t can be interpreted as the rate at which we discard wealth. (We will see that at optimality u_t = 0.) For this modified problem, the Hamilton–Jacobi–Bellman equation is

−V̇_t(w) = sup_{c, x ∈ wΘ, u ≥ 0} ( (1/γ) c^γ + V'_t(w)( μ^T x + ((γ−1)/2) x^T Σ x / w − c − u ) ).

First note that with our value function candidate (5), we have V'_t(w) > 0, and therefore u = 0, as expected. Now, by using the change of variables x = θw and plugging in our value function candidate, this equation becomes (8). From this point on, the proof that this candidate value function satisfies the Hamilton–Jacobi–Bellman equation proceeds exactly as for the (stochastic) Merton problem.

Here we consider several extensions to the Merton problem, all of which are known in the literature and have closed-form solutions. For each one, we describe how to modify problem (9) to maintain the certainty-equivalence property.
Time-varying parameters.
The Merton problem can be solved when μ, Σ, and Θ change over time. To handle this in the certainty equivalent problem, we simply replace these parameters by μ_t, Σ_t, and Θ_t. (Here μ_t denotes the time-varying vector of asset expected returns, a notation clash with our previous use of μ_t as the scalar portfolio expected return.) Similarly, if we discount the consumption utility of the Merton problem,

U = E( (β/γ) w_T^γ + ∫_0^T α_t (1/γ) c_t^γ dt ),

where α_t > 0 for all t, then the objective of the certainty equivalent problem will match U (but without the expectation).

Uncertain mortality and bequest.
Here the terminal time t_f ∈ [0, T] is random with probability density p_t and survival function

s_t = Prob(t_f > t) = ∫_t^T p_τ dτ.

In this case, the investor's utility is

U = E( (β/γ) w_{t_f}^γ + ∫_0^{t_f} (1/γ) c_t^γ dt ).

Here the expectation is taken over t_f as well as the paths of the stochastic process (1). With this modification, the objective of the certainty equivalent problem changes to

∫_0^T ( p_t (β/γ) w_t^γ + s_t (1/γ) c_t^γ ) dt.

We weight the consumption utility by the probability the investor is still alive, i.e., we treat the survival function as a discount factor. We also get utility for the bequest continuously over the interval [0, T], weighted by the density function p_t.

Annuities and life insurance.
This extension is due to [Ric75]. Continuing with the previous extension, we allow the investor to purchase life insurance. The premium is l_t, which the investor can choose, and the payout of the plan is λ_t l_t, where λ_t ≥ 0 for all t. When l_t < 0, we interpret this as an annuity. In particular, at time t, the investor has −l_t in the annuity account, which is lost on death, in return for an additional return of −λ_t l_t. The actuarially fair value of λ_t is p_t/s_t, which is called the force of mortality. (If λ_t > p_t/s_t, then life insurance is favorable and annuities are unfavorable; if λ_t < p_t/s_t, the reverse is true.)

With this modification, the objective of the certainty equivalent problem changes to

U = ∫_0^T ( p_t (β/γ) (w_t + λ_t l_t)^γ + s_t (1/γ) c_t^γ ) dt,

i.e., we add the insurance payout to the wealth in the bequest utility. The dynamics change to

ẇ_t ≤ μ^T x_t − c_t − l_t + ((γ−1)/2) x_t^T Σ x_t / w_t.

Here we subtract the insurance premium from the growth rate of the wealth.
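In a discretized implementation, the density p_t, the survival function s_t, and the fair price p_t/s_t can be tabulated directly. The sketch below uses a hypothetical exponential mortality density, normalized so that t_f lands in [0, T] with probability one as in the text; all numbers are illustrative.

```python
import math

# Tabulate a toy mortality density p_t on [0, T], its survival function s_t,
# and the actuarially fair insurance price lambda_t = p_t / s_t (the force of
# mortality). The exponential density with hazard `rate` is hypothetical and
# truncated/normalized to [0, T]; values are for illustration only.
T, h, rate = 40.0, 0.5, 0.03
K = int(T / h)
Z = 1.0 - math.exp(-rate * T)                    # normalizing constant
p = [rate * math.exp(-rate * k * h) / Z for k in range(K)]
s = [sum(p[j] for j in range(k, K)) * h for k in range(K)]  # s_t = int_t^T p
lam_fair = [p[k] / s[k] for k in range(K)]       # fair price of insurance
```

As expected, s_0 is close to one, s_t decreases over time, and the fair price grows with age as the remaining survival probability shrinks.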
Income.
We can add a deterministic income stream, with income rate y_t at time t. The stochastic dynamics are modified by the addition of y_t to the drift term of the wealth process, i.e., the drift becomes μ^T θ_t w_t + y_t − c_t. In this case, we also assume one of the assets is risk free with return μ^rf and volatility 0, and that

Θ = { θ | 1^T θ = 1 }.   (10)

These assumptions allow the investor to counteract the income stream by shorting the risk-free asset and investing the proceeds in a preferred portfolio of other assets. The fair value of the income stream is its net present value over [t, T] at the risk-free rate:

v_t = ∫_t^T e^{−μ^rf (τ−t)} y_τ dτ,

which can be interpreted as the remaining human capital of the investor.

For this extension, the dynamics in (9) are replaced by

ẇ_t ≤ μ^T x_t + y_t − c_t + ((γ−1)/2) x_t^T Σ x_t / (w_t + v_t).

Note the addition of the income term y_t and the normalization of risk by the total wealth plus the remaining human capital. In this case, the wealth w_t need not be positive but instead satisfies w_t + v_t > 0. Because of this, we also replace the constraint x_t/w_t ∈ Θ (which is not defined for w_t = 0) with 1^T x_t = w_t.

Epstein–Zin preferences.
One interesting feature of the certainty equivalent problem (9) is that the risk aversion parameter γ appears separately in the objective and dynamics constraint. It is reasonable to ask whether, by modifying the consumption utility to be

(β/ρ) w_T^ρ + ∫_0^T (1/ρ) c_t^ρ dt

for some ρ ≠ γ with ρ < 1 and ρ ≠ 0, but keeping γ in the dynamics constraint, problem (9) is equivalent to some variant of the Merton problem. This is indeed the case, but with the expected utility U replaced by Epstein–Zin preferences, where 1/ρ is the elasticity of intertemporal substitution and γ is the risk aversion. For details, see [DE92].

Here we discuss several extensions of problem (9) that (to our knowledge) do not exactly solve a version of the Merton problem. Some of these build on the exact extensions of §4.

Modified utility.
We can change the objective of (9) to use any increasing, concave utility function for either consumption or bequest. These utility functions need not be additive over time: for example, we can maximize the minimum consumption over the interval [0, T]. As a special case, we can add a minimum consumption constraint

c_t ≥ c_t^min,

where c_t^min is the minimum allowable consumption amount as a function of age. Similarly, we can enforce a minimum bequest over some time window (say, to care for underage dependents until they come of age).

Spending limit. We can limit consumption as a fraction of income with the constraint

c_t ≤ η y_t

for some parameter η > 0. For example, when η = 0.7, this constraint means that we can't consume more than 70% of our income, i.e., we must have a savings rate of 30%. This constraint can be adjusted to account for investment income. To see this, take d ∈ R^n to be the vector of dividend yields for each asset, which is constant and known in advance. The modified constraint becomes

c_t ≤ η y_t + d^T x_t.

When this constraint is tight, i.e., when we desire to consume more than η times our income, there is added incentive to invest in assets with high dividend yield.

Minimum cash balance.
We can include a constraint that the amount invested in cash be above a certain level, i.e.,

(x_t)_i ≥ (x_t^min)_i,

where i is the index of the cash asset. This is similar to an emergency fund constraint that we must keep six months' worth of consumption in cash, which is expressed as

(x_t)_i ≥ 0.5 c_t.

Model predictive control is a technique for stochastic control problems that leverages a deterministic approximation of the stochastic problem. To evaluate an MPC policy, we first solve this deterministic problem to obtain a planned trajectory for the state and control input over the planning horizon. We then implement only the first control input in this plan, and the rest of the planned trajectory is discarded. To obtain future control inputs, the policy is evaluated again, which requires solving a new deterministic problem.

In the context of the Merton problem, the certainty equivalent problem is used as a basis for a simple model predictive control policy, which we denote π_t^mpc. We first define this policy when t = 0, with initial wealth w_0. We start by solving the deterministic control problem (9) to obtain the optimal trajectories c_t and θ_t. The MPC policy then takes π_0^mpc(w_0) = (c_0, θ_0). To define the MPC policy for t ∈ (0, T), we first form a new instance of problem (9), which is defined over the interval [t, T] and has initial wealth w_t. Once again we solve the deterministic optimal control problem (9), to obtain optimal c_τ and θ_τ over the interval τ ∈ [t, T]. We then take π_t^mpc(w_t) = (c_t, θ_t). Evaluating the MPC policy therefore always requires solving a deterministic optimal control problem of the form (9).

MPC is a convenient way to implement the optimal policy for the basic problem or any of the extensions of §4. In those cases, the MPC policy is optimal. When MPC is applied with constraints and an objective that do not correspond to any version of the Merton problem, the MPC policy is a sophisticated heuristic, and very useful in practice. To use MPC in practice requires discretizing problem (9), which we discuss in the next section.
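The receding-horizon structure described above can be sketched as a simple loop. Here `plan` is a placeholder for solving the certainty equivalent problem over the remaining horizon [t, T] (in practice, a convex solver call such as the cvxpy model in the next section); it returns the first consumption value of a hypothetical constant-fraction plan so that the loop is runnable, and is not the optimal policy.

```python
import math
import random

def plan(w, periods_left):
    # Placeholder for solving problem (9)/(11) from the current state over
    # the remaining horizon; a real implementation calls a convex solver and
    # returns the first consumption and allocation of the plan.
    return 0.05 * w

def run_mpc(w0, K=20, h=1.0, mu_p=0.05, sigma_p=0.12, seed=0):
    """Receding-horizon loop: re-plan at each step, apply only the first input."""
    rng = random.Random(seed)
    w, path = w0, [w0]
    for k in range(K):
        c = plan(w, K - k)                  # re-plan from the current wealth...
        dz = rng.gauss(0.0, math.sqrt(h))   # ...then apply only the first input
        w += (mu_p * w - c) * h + sigma_p * w * dz
        w = max(w, 1e-12)                   # keep wealth positive, as in the model
        path.append(w)
    return path

path = run_mpc(1.0)
```

Swapping the placeholder `plan` for a solver of the discretized problem (11), with updated forecasts of μ and Σ at each step, gives the MPC policy described in the text.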
Here we show how to discretize problem (9). We do this for the basic problem only, but note that the extensions can be handled similarly. We let x_k denote the value of x_t in (9) at time t = hk, k = 0, ..., K, where h = T/K is the discretization interval. (We use the same notation, but index x with the subscript k to denote the discretized variable, and index with t to denote the continuous variable.) We similarly define the discretized variables c_k and w_k. Replacing the time derivative ẇ_t with the forward Euler approximation (w_{k+1} − w_k)/h, and replacing the integral in the objective with a Riemann sum approximation, we obtain the discretized problem

maximize   (β/γ) w_K^γ + Σ_{k=0}^{K−1} (h/γ) c_k^γ
subject to (w_{k+1} − w_k)/h ≤ μ^T x_k − c_k + ((γ−1)/2) x_k^T Σ x_k / w_k,  k = 0, ..., K−1
           x_k/w_k ∈ Θ,  k = 0, ..., K
           w_0 = w_init.   (11)

The variables are x_k ∈ R^n and w_k ∈ R_{++} for k = 0, ..., K, and c_k ∈ R_{++} for k = 0, ..., K−1. All of the extensions (exact and inexact) discussed above can be discretized as well, but we do not give the details here.

The discretized certainty equivalent problem (11) is a (finite-dimensional) convex optimization problem, and can therefore be easily expressed in a domain-specific language for convex optimization, such as cvxpy. As an example, we give a cvxpy implementation of (11) in listing 1 when Θ is given by (10). For most practical portfolio construction problems, Θ is SOCP representable, which means that problem (11) is an SOCP [Lob+98]. To see this, note that the power utility c_k^γ and the quadratic-over-linear functions are SOCP representable; see [AG03].

import numpy
from cvxpy import Variable, Problem, Maximize, power, quad_over_lin, sum

# problem data n, K, h, gamma, beta, mu, Sigma, w_init are assumed defined
w = Variable(K + 1)
x = Variable((n, K + 1))
c = Variable(K)
Sigma_half = numpy.linalg.cholesky(Sigma).T  # Sigma_half.T @ Sigma_half == Sigma
U = beta / gamma * power(w[K], gamma) + h / gamma * sum(power(c, gamma))
constr = [w == sum(x, axis=0), w[0] == w_init]
for k in range(K):
    constr += [(w[k + 1] - w[k]) / h <= mu @ x[:, k] - c[k]
               + (gamma - 1) / 2 * quad_over_lin(Sigma_half @ x[:, k], w[k])]
problem = Problem(Maximize(U), constr)
problem.solve()

Listing 1: An implementation of the discretized certainty equivalent problem (11) using cvxpy.

To give some idea of the speed at which current solvers can solve the discretized problem (11) (and its extensions), consider a problem with n = 500 assets, K = 50 periods, and covariance matrix Σ given as a typical factor model, with 25 factors. This problem has more than 100000 optimization variables. With just a small modification of the code given in listing 1 to exploit the low rank plus diagonal structure of the covariance matrix, the open-source solver ECOS [DCB13] solves the problem in around two seconds, on a single thread.

References

[AG03] F. Alizadeh and D. Goldfarb. "Second-order cone programming". Mathematical Programming, 95(1):3–51, 2003.
[BBM17] F. Borrelli, A. Bemporad, and M. Morari. Predictive Control for Linear and Hybrid Systems. Cambridge University Press, 2017.
[BC10] S. Basak and G. Chabakauri. "Dynamic mean-variance asset allocation". The Review of Financial Studies, 23(8):2970–3016, 2010.
[Ber17] D. P. Bertsekas. Dynamic Programming and Optimal Control. 4th ed. Athena Scientific, 2017.
[Boy+14] S. Boyd, M. Mueller, B. O'Donoghue, and Y. Wang. "Performance bounds and suboptimal policies for multi-period investment". Foundations and Trends in Optimization, 1(1):1–72, 2014.
[Boy+17] S. Boyd, E. Busseti, S. Diamond, R. Kahn, K. Koh, P. Nystrup, and J. Speth. "Multi-period trading via convex optimization". Foundations and Trends in Optimization, 3(1):1–76, 2017.
[BST16] X. Bai, K. Scheinberg, and R. Tütüncü. "Least-squares approach to risk parity in portfolio selection". Quantitative Finance, 2016.
[BV04] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[DB16] S. Diamond and S. Boyd. "CVXPY: A Python-embedded modeling language for convex optimization". Journal of Machine Learning Research, 17(83):1–5, 2016.
[DCB13] A. Domahidi, E. Chu, and S. Boyd. "ECOS: An SOCP solver for embedded systems". In: European Control Conference. 2013, pp. 3071–3076.
[DE92] D. Duffie and L. G. Epstein. "Stochastic differential utility". Econometrica: Journal of the Econometric Society (1992), pp. 353–394.
[KH06] W. H. Kwon and S. H. Han. Receding Horizon Control: Model Predictive Control for State Models. Springer, 2006.
[KS72] H. Kwakernaak and R. Sivan. Linear Optimal Control Systems. Vol. 1. John Wiley & Sons, 1972.
[Lob+98] M. S. Lobo, L. Vandenberghe, S. Boyd, and H. Lebret. "Applications of second-order cone programming". Linear Algebra and its Applications, 284:193–228, 1998.
[MB12] J. Mattingley and S. Boyd. "CVXGEN: A code generator for embedded convex optimization". Optimization and Engineering, 13(1):1–27, 2012.
[Mer70] R. C. Merton. "Optimum consumption and portfolio rules in a continuous-time model". In: Stochastic Optimization Models in Finance. Elsevier, 1970, pp. 621–661.
[Ric75] S. F. Richard. "Optimal consumption, portfolio and life insurance rules for an uncertain lived individual in a continuous time model". Journal of Financial Economics, 2(2):187–203, 1975.
[WB09] Y. Wang and S. Boyd. "Fast model predictive control using online optimization". IEEE Transactions on Control Systems Technology, 18(2):267–278, 2010.