Behavioral Portfolio Selection in Continuous Time ∗

Hanqing Jin † and Xun Yu Zhou ‡

February 11, 2013

arXiv [q-fin.PM]

Abstract

This paper formulates and studies a general continuous-time behavioral portfolio selection model under Kahneman and Tversky's (cumulative) prospect theory, featuring S-shaped utility (value) functions and probability distortions. Unlike the conventional expected utility maximization model, such a behavioral model could be easily mis-formulated (a.k.a. ill-posed) if its different components do not coordinate well with each other. Certain classes of an ill-posed model are identified. A systematic approach, which is fundamentally different from the ones employed for the utility model, is developed to solve a well-posed model, assuming a complete market and general Itô processes for asset prices. The optimal terminal wealth positions, derived in fairly explicit forms, possess surprisingly simple structure reminiscent of a gambling policy betting on a good state of the world while accepting a fixed, known loss in case of a bad one. An example with a two-piece CRRA utility is presented to illustrate the general results obtained, and is solved completely for all admissible parameters. The effect of the behavioral criterion on the risky allocations is finally discussed.

Key words: Portfolio selection, continuous time, cumulative prospect theory, behavioral criterion, ill-posedness, S-shaped function, probability distortion, Choquet integral

∗ This paper has benefited from comments of participants at the Quantitative Methods in Finance 2005 Conference in Sydney, the 2005 International Workshop on Financial Engineering and Risk Management in Beijing, the 2006 International Symposium on Stochastic Processes and Applications to Mathematical Finance in Ritsumeikan University, and the 2006 Workshop on Mathematical Finance and Insurance in Lijiang; and from the comments of Knut Aase, Andrew Cairns, Mark Davis, Peter Imkeller, Jacek Krawczyk, Terry Lyons, James Mirrlees, Eckhard Platen, Sheldon Ross, Larry Samuelson, Martin Schweizer, John van der Hoek, and David Yao. The authors thank especially Jia-an Yan for remarks and discussions over the years on the general topic considered in this paper. Two anonymous referees have given constructive comments leading to a much improved version. All errors are the responsibility of the authors. Zhou gratefully acknowledges financial support from the RGC Earmarked Grants CUHK4175/03E and CUHK418605, and the Croucher Senior Research Fellowship.

† Department of Mathematics, National University of Singapore, Singapore. Email: <[email protected]>.

‡ Mathematical Institute, University of Oxford, 24-29 St Giles', Oxford, OX1 3LB, UK, and Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, Hong Kong. Email: <[email protected]>.

Introduction

Mean–variance and expected utility maximization are by far the two predominant investment decision rules in financial portfolio selection. Portfolio theory in the dynamic setting (both discrete time and continuous time) has been established in the past twenty years, again centering around these two frameworks while employing heavily, among others, martingale theory, convex duality and stochastic control; see Duffie (1996), Karatzas and Shreve (1998), and Föllmer and Schied (2002) for systematic accounts on dynamic utility maximization, and Li and Ng (2000), Zhou and Li (2000), and Jin, Yan and Zhou (2004) for recent studies on the mean–variance (including extensions to mean–risk) counterpart.

Expected utility theory (EUT), developed by von Neumann and Morgenstern (1944) based on an axiomatic system, has an underlying assumption that decision makers are rational and risk averse when facing uncertainties. In the context of asset allocations, its basic tenets are: investors evaluate wealth according to final asset positions; they are uniformly risk averse; and they are able to objectively evaluate probabilities. These, however, have long been criticized as inconsistent with the way people make decisions in the real world. Substantial experimental evidence has suggested a systematic violation of the EUT principles.
Specifically, the following anomalies (as opposed to the assumed rationality in EUT) in human behavior are evident from daily life:

• People evaluate assets on gains and losses (which are defined with respect to a reference point), not on final wealth positions;
• People are not uniformly risk averse: they are risk-averse on gains and risk-taking on losses, and significantly more sensitive to losses than to gains;
• People overweight small probabilities and underweight large probabilities.

In addition, there are widely known paradoxes and puzzles that EUT fails to explain, including the Allais paradox [Allais (1953)], Ellsberg paradox [Ellsberg (1961)], Friedman and Savage puzzle [Friedman and Savage (1948)], and the equity premium puzzle [Mehra and Prescott (1985)].

Considerable attempts and efforts have been made to address the drawback of EUT, among them notably the so-called non-additive utility theory [see, for example, Fishburn (1998)]. Unfortunately, most of these theories are far too complicated to be analyzable and applicable, and some of them even lead to new paradoxes. In the 1970s, Kahneman and Tversky (1979) proposed the prospect theory (PT) for decision making under uncertainty, incorporating human emotions and psychology into their theory. Later, Tversky and Kahneman (1992) fine-tuned the PT to the cumulative prospect theory (CPT) in order to be consistent with the first-order stochastic dominance.
Among many other ingredients, the key elements of Kahneman and Tversky's Nobel-prize-winning theory are

• A reference point (or neutral outcome/benchmark/breakeven point/status quo) in wealth that defines gains and losses;
• A value function (which replaces the notion of utility function), concave for gains and convex for losses, and steeper for losses than for gains (a behavior called loss aversion);
• A probability distortion that is a nonlinear transformation of the probability scale, which enlarges a small probability and diminishes a large probability.

There have been burgeoning research interests in incorporating the PT into portfolio choice; nonetheless they have been hitherto overwhelmingly limited to the single-period setting; see for example Benartzi and Thaler (1995), Shefrin and Statman (2000), Levy and Levy (2004), Bassett et al. (2004), Gomes (2005), and De Giorgi and Post (2005), with emphases on qualitative properties and empirical experiments. Analytical research on dynamic, especially continuous-time, asset allocation featuring behavioral criteria is literally nil according to our best knowledge. [In this connection the only paper we know of that has some bearing on the PT for the continuous time setting is Berkelaar, Kouwenberg and Post (2004) where a very specific two-piece power utility function is considered; however, the probability distortion, which is one of the major ingredients of the PT and which causes the main difficulty, is absent in that paper.] Such a lack of study on continuous-time behavioral portfolio selection is certainly not because the problem is uninteresting or unimportant; rather it is because, we believe, the problem is massively difficult as compared with the conventional expected utility maximization model.
Many conventional and convenient approaches, such as convex optimization, dynamic programming, and stochastic control, fall completely apart in handling such a behavioral model: First, the utility function (or value function as called in the PT) is partly concave and partly convex (also referred to as an S-shaped function), whereas the global convexity/concavity is a necessity in traditional optimization. Second, the nonlinear distortion in probabilities abolishes virtually all the nice properties associated with the normal additive probability and linear expectation. In particular, the dynamic consistency of the conditional expectation with respect to a filtration, which is the foundation of the dynamic programming principle, is absent due to the distorted probability. Worse still, the coupling of these two ill-behaved features greatly amplifies the difficulty of the problem. Even the well-posedness of the problem is no longer something that can be taken for granted.

Footnote (on the difficulty of the problem): In Berkelaar, Kouwenberg and Post (2004) an essentially convexification technique is employed to deal with the non-convexity of the problem. However, it no longer works in the presence of a distorted probability.

Footnote (on well-posedness): A maximization problem is called well-posed if its supremum is finite; otherwise it is ill-posed. An ill-posed problem is a mis-formulated one: the trade-off is not set right so that one can always push the objective value to be arbitrarily high.

This paper first establishes a general continuous-time portfolio selection model under the CPT, involving behavioral criteria defined on possibly continuous random variables. The probability distortions lead to the involvement of the Choquet integrals [Choquet (1953/54)], instead of the conventional expectation. We then carry out, analytically, extensive investigations on the model while developing new approaches in deriving the optimal solutions. First of all, by assuming that the market is complete, the asset prices follow general Itô processes, and the individual behavior of the investor in question will not affect the market, we need only to consider an optimization problem in terms of the terminal wealth. This is the usual [...]
non-concave maximization problem due to the probability distortion; yet by changing the decision variable and taking a series of transformations, we turn it into a concave maximization problem where the Lagrange method is applicable. The loss part problem, nevertheless, is more subtle because it is to minimize a concave functional even after the similar transformations. We are able to characterize explicitly its solutions to be certain "corner points" via delicate analysis. There is yet one more twist in deriving the optimal solution to the original model given the solutions to the above two problems: one needs to find the "best" parameters – the initial budget and the event of a terminal gain – by solving another constrained optimization problem.

As mathematically complicated and sophisticated as the solution procedure turns out to be, the final solutions are surprisingly and beautifully simple: the optimal terminal wealth resembles the payoff of a portfolio of two binary (or digital) options written on a mutual fund (induced by the state pricing density), characterized by a single number. This number, in turn, can be identified by solving a very simple two-dimensional mathematical programming problem. The optimal strategy is therefore a gambling policy, betting on good states of the market, by buying a contingent claim and selling another.
We present an example with the same value function taken by Tversky and Kahneman (1992), and demonstrate that our general results lead to a complete solution of the model for all admissible parameters involved. Furthermore, for the case when the market parameters are constants, we are able to derive the optimal portfolio in closed form, thereby understanding how the behavioral criteria may change the risky asset allocations.

To summarize, the main contributions of this paper are: 1) we establish, for the first time, a bona fide continuous-time behavioral portfolio selection model à la cumulative prospect theory, featuring very general S-shaped utility functions and probability distortions; 2) we demonstrate that the well-posedness becomes an eminent issue for the behavioral model, and identify several ill-posed problems; 3) we develop an approach, fundamentally different from the existing ones for the expected utility model, to overcome the immense difficulties arising from the analytically ill-behaved utility functions and probability distortions. Some of the sub-problems solvable by this approach, such as constrained maximization and minimization of Choquet integrals, are interesting, in both theory and applications, in their own right; and 4) we obtain fairly explicit solutions to a general model, and closed-form solutions for an important special case, based on which we are able to examine how the allocations to equity are influenced by behavioral criteria.

The rest of the paper is organized as follows. In Section 2 the behavioral model is formulated, and its possible ill-posedness is addressed in Section 3. The main results of the paper are stated in Section 4. The procedure of analytically solving the general model is developed in Sections 5 – 7, leading to a proof of the main results in Section 8. A special case with a two-piece CRRA utility function is presented in Section 9 to demonstrate the general results obtained.
Section 10 addresses the issue of how the behavioral criterion would affect the risky allocations. Some concluding remarks are given in Section 11. Finally, technical preliminaries are relegated to an appendix.

In this paper $T$ is a fixed terminal time and $(\Omega, \mathcal{F}, P, \{\mathcal{F}_t\}_{t \ge 0})$ is a fixed filtered complete probability space on which is defined a standard $\mathcal{F}_t$-adapted $m$-dimensional Brownian motion $W(t) \equiv (W^1(t), \cdots, W^m(t))'$ with $W(0) = 0$. It is assumed that $\mathcal{F}_t = \sigma\{W(s) : 0 \le s \le t\}$, augmented by all the null sets. Here and throughout the paper $A'$ denotes the transpose of a matrix $A$.

We define a continuous-time financial market following Karatzas and Shreve (1998). In the market there are $m + 1$ assets being traded continuously. One of the assets is a bank account whose price process $S_0(t)$ is subject to the following equation:

$$dS_0(t) = r(t) S_0(t)\,dt, \quad t \in [0, T]; \quad S_0(0) = s_0 > 0, \tag{1}$$

where the interest rate $r(\cdot)$ is an $\mathcal{F}_t$-progressively measurable, scalar-valued stochastic process with $\int_0^T |r(s)|\,ds < +\infty$, a.s.. The other $m$ assets are stocks whose price processes $S_i(t)$, $i = 1, \cdots, m$, satisfy the following stochastic differential equation (SDE):

$$dS_i(t) = S_i(t)\Big[b_i(t)\,dt + \sum_{j=1}^m \sigma_{ij}(t)\,dW^j(t)\Big], \quad t \in [0, T]; \quad S_i(0) = s_i > 0, \tag{2}$$

where $b_i(\cdot)$ and $\sigma_{ij}(\cdot)$, the appreciation and dispersion (or volatility) rates, respectively, are scalar-valued, $\mathcal{F}_t$-progressively measurable stochastic processes with $\int_0^T \big[\sum_{i=1}^m |b_i(t)| + \sum_{i,j=1}^m |\sigma_{ij}(t)|^2\big]\,dt < +\infty$, a.s..

Set the excess rate of return vector process $B(t) := (b_1(t) - r(t), \cdots, b_m(t) - r(t))'$, and define the volatility matrix process $\sigma(t) := (\sigma_{ij}(t))_{m \times m}$. Basic assumptions imposed on the market parameters throughout this paper are summarized as follows:

Assumption 2.1

(i) There exists $c \in \mathbb{R}$ such that $\int_0^T r(s)\,ds \ge c$, a.s..

(ii) $\mathrm{Rank}(\sigma(t)) = m$, a.e. $t \in [0, T]$, a.s..
(iii) There exists an $\mathbb{R}^m$-valued, uniformly bounded, $\mathcal{F}_t$-progressively measurable process $\theta(\cdot)$ such that $\sigma(t)\theta(t) = B(t)$, a.e. $t \in [0, T]$, a.s..

It is well known that under these assumptions there exists a unique risk-neutral (martingale) probability measure $Q$ defined by $\frac{dQ}{dP}\big|_{\mathcal{F}_t} = \rho(t)$, where

$$\rho(t) := \exp\Big\{-\int_0^t \Big[r(s) + \frac{1}{2}|\theta(s)|^2\Big]\,ds - \int_0^t \theta(s)'\,dW(s)\Big\} \tag{3}$$

is the pricing kernel or state density price. Denote $\rho := \rho(T)$. It is clear that $0 < \rho < +\infty$ a.s., and $0 < E\rho < +\infty$.

A random variable $\xi$ is said to have no atom if $P\{\xi = a\} = 0$ $\forall a \in \mathbb{R}$. The following assumption is in force throughout this paper.

Assumption 2.2 $\rho$ admits no atom.

The preceding assumption is not essential, and is imposed to avoid undue technicality. In particular, it is satisfied when $r(\cdot)$ and $\theta(\cdot)$ are deterministic with $\int_0^T |\theta(t)|^2\,dt \ne 0$ (in which case $\rho$ is a nondegenerate lognormal random variable). We are also going to use the following notation:

$$\bar\rho \equiv \operatorname{esssup} \rho := \sup\{a \in \mathbb{R} : P\{\rho > a\} > 0\}, \qquad \underline\rho \equiv \operatorname{essinf} \rho := \inf\{a \in \mathbb{R} : P\{\rho < a\} > 0\}. \tag{4}$$

Consider an agent, with an initial endowment $x_0 \in \mathbb{R}$ (fixed throughout this paper), whose total wealth at time $t \ge 0$ is $x(t)$. Assume that the trading of shares takes place continuously in a self-financing fashion (i.e., there is no consumption or income) and there are no transaction costs. Then $x(\cdot)$ satisfies [see, e.g., Karatzas and Shreve (1998)]

$$dx(t) = [r(t)x(t) + B(t)'\pi(t)]\,dt + \pi(t)'\sigma(t)\,dW(t), \quad t \in [0, T]; \quad x(0) = x_0, \tag{5}$$

where $\pi(\cdot) \equiv (\pi_1(\cdot), \cdots, \pi_m(\cdot))'$ is the portfolio of the agent with $\pi_i(t)$, $i = 1, \cdots, m$, denoting the total market value of the agent's wealth in the $i$-th asset at time $t$. A portfolio $\pi(\cdot)$ is said to be admissible if it is an $\mathbb{R}^m$-valued, $\mathcal{F}_t$-progressively measurable process with

$$\int_0^T |\sigma(t)'\pi(t)|^2\,dt < +\infty \quad \text{and} \quad \int_0^T |B(t)'\pi(t)|\,dt < +\infty, \quad \text{a.s.}$$

An admissible portfolio $\pi(\cdot)$ is said to be tame if the corresponding discounted wealth process, $S_0(t)^{-1}x(t)$, is almost surely bounded from below (the bound may depend on $\pi(\cdot)$).

The following result follows from Karatzas and Shreve (1998, p. 24, Theorem 6.6) noting Karatzas and Shreve (1998, p. 21, Definition 6.1) or Cox and Huang (1989).

Proposition 2.1

For any $\mathcal{F}_T$-measurable random variable $\xi$ such that $\xi$ is almost surely bounded from below and $E[\rho\xi] = x_0$, there exists a tame admissible portfolio $\pi(\cdot)$ such that the corresponding wealth process $x(\cdot)$ satisfies $x(T) = \xi$.

Footnote (on the initial endowment $x_0$): Precisely speaking, $x_0$ should be the difference between the agent's initial wealth and a (discounted) reference wealth; for details see Remarks 2.1 and 2.2 below.
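As a rough numerical illustration of the pricing kernel (3) and of the budget constraint $E[\rho\xi] = x_0$ in Proposition 2.1, the sketch below simulates $\rho = \rho(T)$ in the special case of one Brownian motion with constant $r$ and $\theta$, where $\rho$ is lognormal and $E\rho = e^{-rT}$. The parameter values, function names and sample size are illustrative assumptions, not taken from the paper.

```python
import math
import random

def sample_pricing_kernel(r, theta, T, rng):
    """Draw rho(T) from (3) for constant r and theta (single Brownian motion)."""
    w_T = rng.gauss(0.0, math.sqrt(T))  # W(T) ~ N(0, T)
    return math.exp(-(r + 0.5 * theta ** 2) * T - theta * w_T)

def monte_carlo_mean(r, theta, T, n_paths, seed=0):
    """Monte Carlo estimate of E[rho(T)]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        total += sample_pricing_kernel(r, theta, T, rng)
    return total / n_paths

# Hypothetical market parameters, chosen only for illustration.
r, theta, T = 0.03, 0.4, 1.0
estimate = monte_carlo_mean(r, theta, T, n_paths=200_000)
exact = math.exp(-r * T)  # price today of one unit of wealth paid at T
print(f"Monte Carlo E[rho] = {estimate:.4f}, exp(-rT) = {exact:.4f}")
```

Under these assumptions a constant terminal payoff $\xi \equiv k$ costs $E[\rho k] = k e^{-rT}$ today, which is the budget constraint for a riskless position.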

In the conventional portfolio theory, an investor's preference is modelled by the expected utility of the terminal wealth. In this paper, we study a portfolio model featuring human behaviors by working within the CPT framework of Tversky and Kahneman (1992). First of all, in CPT there is a natural outcome or benchmark, assumed to be 0 (evaluated at the terminal time, $T$) in this paper without loss of generality (see Remark 2.1 below for elaborations on this point), which serves as a base point to distinguish gains from losses. Next, we are given two utility functions $u_+(\cdot)$ and $u_-(\cdot)$, both mapping from $\mathbb{R}_+$ to $\mathbb{R}_+$, that measure the gains and losses respectively. There are two additional functions $T_+(\cdot)$ and $T_-(\cdot)$ from $[0, 1]$ to $[0, 1]$, which represent the probability distortions for gains and losses respectively.

Assumption 2.3 $u_+(\cdot)$ and $u_-(\cdot)$: $\mathbb{R}_+ \to \mathbb{R}_+$, are strictly increasing, concave, with $u_+(0) = u_-(0) = 0$. Moreover, $u_+(\cdot)$ is strictly concave and twice differentiable, with the Inada conditions $u_+'(0+) = +\infty$ and $u_+'(+\infty) = 0$.

Assumption 2.4 $T_+(\cdot)$ and $T_-(\cdot)$: $[0, 1] \to [0, 1]$, [...] $T_+(0) = T_-(0) = 0$ and $T_+(1) = T_-(1) = 1$.

Now, given a contingent claim (a random variable) $X$, we assign it a value $V(X)$ by

$$V(X) = V_+(X^+) - V_-(X^-)$$

where

$$V_+(Y) := \int_0^{+\infty} T_+(P\{u_+(Y) > y\})\,dy, \qquad V_-(Y) := \int_0^{+\infty} T_-(P\{u_-(Y) > y\})\,dy$$

for any random variable $Y \ge 0$, a.s.. (Throughout this paper $a^+$ and $a^-$ denote respectively the positive and negative parts of a real number $a$.) It is evident that both $V_+$ and $V_-$ are non-decreasing in the sense that $V_\pm(X) \ge V_\pm(Y)$ for any random variables $X$ and $Y$ with $X \ge Y$ a.s.. Moreover, $V_+(x) = u_+(x)$ and $V_-(x) = u_-(x)$ $\forall x \in \mathbb{R}_+$. Finally, $V$ is also non-decreasing.

If $T_+(x) = x$ (there is no distortion) then $V_+(Y) = E[u_+(Y)]$ (likewise with $V_-$); hence $V_+$ is a generalization of the expected utility. Yet this generalization poses a fundamentally different (and difficult) feature, namely, the set function $T_+ \circ P$ is a capacity [Choquet (1953/54)] which is a non-additive measure as opposed to the standard notion of probability. So the definition of $V_+$ involves the so-called Choquet integral [see Denneberg (1994) for a comprehensive account on Choquet integrals]. Notice that with the Choquet integral the dynamic consistency of conditional expectation, which is the base for the dynamic programming principle, is lost.

Footnote (on dynamic consistency): The dynamic consistency refers to the following equality: $E(E(X|\mathcal{F}_t)|\mathcal{F}_s) = E(X|\mathcal{F}_s)$ if $\mathcal{F}_s \subseteq \mathcal{F}_t$. The problem of generalizing conditional expectation to Choquet integral remains largely open, not to mention the validity of the corresponding dynamic consistency in any sense; see Denneberg (1994, Chapter 12).

In CPT the utility (or value) function $u(\cdot)$ is given on the whole real line, which is convex on $\mathbb{R}_-$ and concave on $\mathbb{R}_+$ (corresponding to the observation that people tend to be risk-averse on gains and risk-seeking on losses). Such a function is said to be S-shaped. In our model, we separate the utility on gains and losses by letting $u_+(x) := u(x)$ and $u_-(x) := -u(-x)$ whenever $x \ge 0$. Thus our model is equivalent to the one with an overall S-shaped utility function.

In our model, the value $V(X)$ is defined on a general random variable $X$, possibly a continuous one, which is necessary for the continuous-time portfolio selection model, as opposed to Tversky and Kahneman (1992) where only discrete random variables are treated. Moreover, our definition of $V$ agrees with that in Tversky and Kahneman (1992) if $X$ is discrete [see Tversky and Kahneman (1992, pp. 300-301)].

Under this CPT framework, our portfolio selection problem is to find the most preferable portfolios, in terms of maximizing the value $V(x(T))$, by continuously managing the portfolio. The mathematical formulation is as follows:

$$\begin{array}{ll} \text{Maximize} & V(x(T)) \\ \text{subject to} & (x(\cdot), \pi(\cdot)) \text{ satisfies (5)},\ \pi(\cdot) \text{ is admissible and tame.} \end{array} \tag{6}$$

In view of Proposition 2.1, in order to solve (6) one needs only first to solve the following optimization problem in the terminal wealth, $X$:

$$\begin{array}{ll} \text{Maximize} & V(X) \\ \text{subject to} & E[\rho X] = x_0,\ X \text{ is an a.s. lower bounded, } \mathcal{F}_T\text{-measurable random variable.} \end{array} \tag{7}$$

Once (7) is solved with a solution $X^*$, the optimal portfolio is then the one replicating $X^*$ (as determined by Proposition 2.1). Therefore, in the rest of the paper we will focus on Problem (7). Recall that a maximization problem is called well-posed if the supremum of its objective is finite; otherwise it is called ill-posed. One tries to find an optimal solution only if the problem is known a priori to be well-posed.

Remark 2.1

If the reference point at $T$ is a general $\mathcal{F}_T$-measurable random variable $\xi$ (instead of 0), then, since the market is complete, we can replicate $\xi$ by a replicating portfolio $\bar\pi(\cdot)$ with the corresponding wealth process $\bar x(\cdot)$. [Incidentally, one can also take this case as one where there is a dynamically and stochastically changing reference trajectory $\bar x(\cdot)$.] In this case, by considering $x(t) - \bar x(t)$ as the state variable the problem (6) is reduced to one with the reference point being 0. [In view of this, the process $x(\cdot)$ determined by (5) actually represents the magnitude of the change in wealth from the price process of the terminal reference point. In particular, this is also why the given initial state $x_0$ in (5) can be any real number.]

Remark 2.2

Following the discussion of Remark 2.1, we see that our model covers the situation where the investor is concerned with a reference wealth only at the end of the planning horizon (or, equivalently, an exogenously given dynamic reference trajectory). Examples of such a situation are when a person is to make a down payment on a house in three months (in which case the reference point is a deterministic constant), or when an investor is to cover the short position in a stock in one month (where the reference point is a random variable). It is certainly plausible that an investor will update his reference point dynamically. If the updating rule is known a priori, such as in Berkelaar et al. (2004), then it is possible to turn the problem into one covered by (6) by appropriately modifying some parameters. If, however, updating the reference point is part of the overall decision, then it would lead to a completely different and interesting model, which is open for further study.
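To make the value functional $V(X) = V_+(X^+) - V_-(X^-)$ of this section concrete, the following sketch evaluates the two Choquet integrals for a discrete random variable, where the integrand $T_\pm(P\{u_\pm(Y) > y\})$ is piecewise constant in $y$. The power utilities and the inverse-S weighting function with parameters 0.88, 2.25, 0.61 and 0.69 are the parametric forms commonly attributed to Tversky and Kahneman (1992); they, the function names and the sample lottery are assumptions made for illustration only.

```python
def choquet_value(outcomes, probs, utility, distortion):
    """Integral over y >= 0 of T(P{u(Y) > y}) for a discrete Y >= 0."""
    levels = sorted(zip((utility(y) for y in outcomes), probs))
    value, previous = 0.0, 0.0
    for i, (level, _) in enumerate(levels):
        # P{u(Y) >= level}; on (previous, level] this equals P{u(Y) > y}.
        tail = min(1.0, sum(p for _, p in levels[i:]))
        value += (level - previous) * distortion(tail)
        previous = level
    return value

def cpt_value(outcomes, probs, u_plus, u_minus, t_plus, t_minus):
    """V(X) = V_+(X^+) - V_-(X^-) with reference point 0."""
    gains = [max(x, 0.0) for x in outcomes]
    losses = [max(-x, 0.0) for x in outcomes]
    return (choquet_value(gains, probs, u_plus, t_plus)
            - choquet_value(losses, probs, u_minus, t_minus))

def tk_weight(gamma):
    """Assumed inverse-S distortion p^g / (p^g + (1-p)^g)^(1/g)."""
    return lambda p: p ** gamma / (p ** gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)

identity = lambda p: p                 # no distortion
u_plus = lambda x: x ** 0.88           # assumed utility on gains
u_minus = lambda x: 2.25 * x ** 0.88   # assumed utility on losses

outcomes = [-50.0, 0.0, 100.0]
probs = [0.5, 0.4, 0.1]

undistorted = cpt_value(outcomes, probs, u_plus, u_minus, identity, identity)
expected = 0.1 * u_plus(100.0) - 0.5 * u_minus(50.0)
distorted = cpt_value(outcomes, probs, u_plus, u_minus, tk_weight(0.61), tk_weight(0.69))
print(undistorted, expected, distorted)
```

With the identity distortion the result coincides with the expected utility of gains minus that of losses, matching the observation that $V_+(Y) = E[u_+(Y)]$ when $T_+(x) = x$.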

Remark 2.3

We implicitly assume in our model that the agent is a "small investor"; so his behavior only affects his utility function – and hence his asset allocation – but not the overall market. This is why the budget constraint in (7), $E[\rho X] = x_0$, is still evaluated in the conventional sense (no probability distortion). In other words, $E[\rho X] = x_0$ is the pricing rule of the market, which is (assumed to be) not influenced by the small investor under consideration.

Before we conclude this section, we recall the following definition. For any non-decreasing function $f : \mathbb{R}_+ \to \mathbb{R}_+$, we define its inverse function

$$f^{-1}(x) := \inf\{y \in \mathbb{R}_+ : f(y) \ge x\}, \quad x \in \mathbb{R}_+. \tag{8}$$

It is immediate that $f^{-1}$ is non-decreasing and continuous on the left, and it holds always that $f^{-1}(f(y)) \le y$.

In general ill-posedness of an optimization problem signifies that the trade-off therein is not set right, leading to a wrong model. Well-posedness is an important issue from the modeling point of view. In classical portfolio selection literature [see, e.g., Karatzas and Shreve (1998)] the utility function is typically assumed to be globally concave along with other nice properties; thus the problem is guaranteed to be well-posed in most cases. We now demonstrate that for the behavioral model (6) or (7) the well-posedness becomes a more significant issue, and that probability distortions in gains and losses play prominent, yet somewhat opposite, roles.

Theorem 3.1

Problem (7) is ill-posed if there exists a nonnegative $\mathcal{F}_T$-measurable random variable $X$ such that $E[\rho X] < +\infty$ and $V_+(X) = +\infty$.

Proof:

Define $Y := X - c$ with $c := (E[\rho X] - x_0)/E\rho$. Then $Y$ is feasible for Problem (7). If $c \le 0$, then obviously $V(Y) = V_+(Y) \ge V_+(X) = +\infty$. If $c > 0$, then

$$\begin{aligned}
V(Y) &= V_+(Y^+) - V_-(Y^-) \ge V_+(Y^+) - V_-(c) \\
&= \int_0^{+\infty} T_+\big(P\{u_+((X - c)^+) > y\}\big)\,dy - u_-(c) \\
&= \int_0^{+\infty} T_+\big(P\{u_+(X - c) > y,\ X \ge c\}\big)\,dy - u_-(c) \\
&\ge \int_0^{+\infty} T_+\big(P\{u_+(X) > y + u_+(c)\}\big)\,dy - u_-(c) \\
&\ge \int_{u_+(c)}^{+\infty} T_+\big(P\{u_+(X) > y\}\big)\,dy - u_-(c) \\
&= +\infty,
\end{aligned}$$

where we have used the fact that $u_+(x + y) \le u_+(x) + u_+(y)$ $\forall x, y \in \mathbb{R}_+$ due to the concavity of $u_+(\cdot)$ along with $u_+(0) = 0$. The proof is complete. Q.E.D.

Footnote (on well-posedness of the classical utility model): Even with a global concave utility function the underlying problem could still be ill-posed; see counter-examples and discussions in Korn and Kraft (2004) and Jin, Xu and Zhou (2007).
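The subadditivity invoked at the end of the proof of Theorem 3.1 is stated without derivation. A short sketch of the standard argument, using only the concavity of $u_+$, $u_+(0) = 0$ and the monotonicity of $u_+$ from Assumption 2.3, is given below together with the set inclusion it yields for the fourth line of the chain.

```latex
% For x, y > 0, write x and y as convex combinations of 0 and x + y.
\begin{align*}
u_+(x) &= u_+\Big(\tfrac{x}{x+y}\,(x+y) + \tfrac{y}{x+y}\cdot 0\Big)
        \ge \tfrac{x}{x+y}\,u_+(x+y) + \tfrac{y}{x+y}\,u_+(0)
        = \tfrac{x}{x+y}\,u_+(x+y), \\
u_+(y) &\ge \tfrac{y}{x+y}\,u_+(x+y).
\end{align*}
% Adding the two inequalities gives subadditivity:
\[
u_+(x) + u_+(y) \ge u_+(x+y).
\]
% Applied with x replaced by X - c and y by c on the event {X >= c}:
\[
u_+(X - c) \ge u_+(X) - u_+(c) \quad \text{on } \{X \ge c\},
\]
% and since u_+ is strictly increasing, u_+(X) > y + u_+(c) forces X > c, so
\[
\{u_+(X) > y + u_+(c)\} \subseteq \{u_+(X - c) > y,\ X \ge c\}, \qquad y \ge 0.
\]
```

Because $T_+$ is non-decreasing, this inclusion gives the inequality between the integrands in the third and fourth lines of the chain.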

This theorem says that the model is ill-posed if one can find a nonnegative claim having a finite price yet an infinite prospective value. In this case the agent can purchase such a claim initially (by taking out a loan if necessary) and reach the infinite value at the end. The following example shows that such an almost "unbelievable" claim could indeed exist even with very "nice" parameters involved, so long as the probability on gains is distorted.

Example 3.1

Let $\rho$ be such that its (probability) distribution function, $F(\cdot)$, is continuous and strictly increasing, with $E\rho^3 < +\infty$ (e.g., when $\rho$ is lognormal). Take $T_+(t) := t^{1/4}$ on $[0, 1/2]$ and $u_+(x) := x^{1/2}$. Set $Z := F(\rho)$. Then it is known that $Z \sim U(0, 1)$. Define $X := Z^{-1/2} - 1$. Then $X \ge 0$, $P(X > x) = (1 + x)^{-2}$ for $x \ge 0$, and, by Hölder's inequality,

$$E[\rho X] = E[\rho Z^{-1/2}] - E\rho \le (EZ^{-3/4})^{2/3}(E\rho^3)^{1/3} - E\rho = 4^{2/3}(E\rho^3)^{1/3} - E\rho < +\infty.$$

However,

$$V_+(X) \ge \int_1^{+\infty} T_+(P\{X^{1/2} > y\})\,dy = \int_1^{+\infty} T_+((1 + y^2)^{-2})\,dy = \int_1^{+\infty} (1 + y^2)^{-1/2}\,dy > \int_1^{+\infty} (2y^2)^{-1/2}\,dy = +\infty.$$

In this example $u_+(x) = x^{1/2}$ is a perfectly "nice" utility function satisfying every condition required for well-posedness (as well as solvability) of the classical utility model; yet the distortion $T_+(\cdot)$ ruins everything and turns the problem into an ill-posed one.

To exclude the ill-posed case identified by Theorem 3.1, we need the following assumption throughout this paper:

Assumption 3.1 $V_+(X) < +\infty$ for any nonnegative, $\mathcal{F}_T$-measurable random variable $X$ satisfying $E[\rho X] < +\infty$.

Assumption 3.1 is not sufficient to completely rule out the ill-posedness. The following theorem specifies another class of ill-posed problems.

Theorem 3.2 If $u_+(+\infty) = +\infty$, $\bar\rho = +\infty$, and $T_-(x) = x$, then Problem (7) is ill-posed.

Proof:

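Fix any $a > \underline\rho$ and define $X := c\mathbf{1}_{\rho < a}$ with any given $c > 0$. Then for any $n > 0$,

$$V_+(nX) = \int_0^{+\infty} T_+(P\{u_+(nX) > y\})\,dy = \int_0^{u_+(nc)} T_+(P\{u_+(nc\mathbf{1}_{\rho < a}) > y\})\,dy = u_+(nc)\,T_+(P\{\rho < a\}) \to +\infty \text{ as } n \to +\infty. \tag{9}$$

Next, for any $n > 1$, define $X_n := c_n\mathbf{1}_{\rho > n}$, where $c_n := \frac{nE[\rho X] - x_0}{E[\rho\mathbf{1}_{\rho > n}]}$. (Here $E[\rho\mathbf{1}_{\rho > n}] > 0$ since $\bar\rho = +\infty$.) Obviously,

$$c_n^+ P\{\rho > n\} = \frac{(nE[\rho X] - x_0)^+}{E[\rho \mid \rho > n]} \le \frac{|nE[\rho X] - x_0|}{n} \to E[\rho X] \text{ as } n \to +\infty.$$

Hence

$$V_-(c_n^+\mathbf{1}_{\rho > n}) = u_-(c_n^+)\,P\{\rho > n\} \le u_-(c_n^+ P\{\rho > n\}) \le u_-\Big(\frac{|nE[\rho X] - x_0|}{n}\Big) \to u_-(E[\rho X]) \text{ as } n \to +\infty, \tag{10}$$

where the first inequality is due to the facts that $u_-(\cdot)$ is concave and $u_-(0) = 0$, and the second to the monotonicity of $u_-(\cdot)$. In particular the left-hand side of (10) stays bounded in $n$.

Now, define $\bar X_n := nX - X_n$. Then $E[\rho\bar X_n] = nE[X\rho] - c_n E[\rho\mathbf{1}_{\rho > n}] = x_0$. Moreover, since $\bar X_n^+ \ge nX$ and $\bar X_n^- \le c_n^+\mathbf{1}_{\rho > n}$, it follows from (9) and (10) that $V(\bar X_n) \ge V_+(nX) - V_-(c_n^+\mathbf{1}_{\rho > n}) \to +\infty$ as $n \to +\infty$. Q.E.D.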
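The construction in the proof of Theorem 3.2 can be checked numerically. The sketch below assumes a lognormal $\rho$ (corresponding to constant $r = 0.03$, $\theta = 0.4$ and $T = 1$), power utilities and an inverse-S distortion on gains with illustrative parameters, and no distortion on losses; none of these choices come from the paper. It evaluates the lower bound $u_+(nc)\,T_+(P\{\rho < a\}) - u_-(c_n^+)\,P\{\rho > n\}$ obtained from (9) and (10), using closed-form lognormal tail formulas.

```python
import math

def norm_cdf(z):
    """Standard normal distribution function, via erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

# Assumed law of the pricing kernel: ln(rho) ~ N(MU, SIGMA^2).
MU, SIGMA = -0.11, 0.4
E_RHO = math.exp(MU + 0.5 * SIGMA ** 2)

def prob_rho_below(a):
    return norm_cdf((math.log(a) - MU) / SIGMA)

def prob_rho_above(n):
    return norm_cdf(-(math.log(n) - MU) / SIGMA)

def mean_rho_below(a):
    """E[rho 1{rho < a}] for lognormal rho."""
    return E_RHO * norm_cdf((math.log(a) - MU) / SIGMA - SIGMA)

def mean_rho_above(n):
    """E[rho 1{rho > n}] for lognormal rho."""
    return E_RHO * norm_cdf(SIGMA - (math.log(n) - MU) / SIGMA)

def value_lower_bound(n, a=1.0, c=1.0, x0=1.0, alpha=0.88, lam=2.25, gamma=0.61):
    """Lower bound on V(n X - X_n) from (9) and (10); requires n > a."""
    u_plus = lambda x: x ** alpha
    u_minus = lambda x: lam * x ** alpha
    t_plus = lambda p: p ** gamma / (p ** gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)
    price_x = c * mean_rho_below(a)                     # E[rho X], X = c 1{rho < a}
    c_n = (n * price_x - x0) / mean_rho_above(n)        # keeps E[rho (nX - X_n)] = x0
    gain = u_plus(n * c) * t_plus(prob_rho_below(a))    # V_+(nX), as in (9)
    loss = u_minus(max(c_n, 0.0)) * prob_rho_above(n)   # V_-(c_n^+ 1{rho > n}), T_- = identity
    return gain - loss

values = [value_lower_bound(n) for n in (5, 50, 500)]
print(values)
```

Under these assumptions the bound grows with $n$ while the loss term remains small, in line with the statement that the objective can be pushed arbitrarily high when losses are not distorted.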

Remark 3.1

Quite intriguingly, Theorem 3.2 shows that a probability distortion on losses is necessary for the well-posedness if the utility on gains can go arbitrarily large (the latter being the case for most commonly used utility functions). The intuition behind this result and its proof can be explained as follows: one borrows an enormous amount of money to purchase a claim with a huge payoff ($nX$ in the proof), and then bets on the market being "good", leading to the realization of that payoff. If, for lack of luck, the market turns out to be "bad", then the agent ends up with a loss ($X_n$); however due to the non-distortion on the loss side its damage on value is bounded [in fact equation (10) shows that the damage can be controlled to be arbitrarily small]. Notice that the above argument is no longer valid if the wealth is constrained to be bounded from below.

Now we set out to identify and solve well-posed problems.

The original problem (7) is solved in two steps involving three sub-problems, which are described in what follows.

Step 1. In this step we consider two problems respectively:

• Positive Part Problem: A problem with parameters $(A, x_+)$:

$$\begin{array}{ll} \text{Maximize} & V_+(X) = \int_0^{+\infty} T_+(P\{u_+(X) > y\})\,dy \\ \text{subject to} & E[\rho X] = x_+,\ X \ge 0,\ X = 0 \text{ a.s. on } A^C, \end{array} \tag{11}$$

where $x_+ \ge x_0^+$ $(\ge 0)$ and $A \in \mathcal{F}_T$ are given. Thanks to Assumption 3.1, $V_+(X)$ is a finite number for any feasible $X$. We define the optimal value of Problem (11), denoted $v_+(A, x_+)$, in the following way. If $P(A) > 0$, in which case the feasible region of (11) is non-empty [$X = (x_+\mathbf{1}_A)/(\rho P(A))$ is a feasible solution], then $v_+(A, x_+)$ is defined to be the supremum of (11). If $P(A) = 0$ and $x_+ = 0$, then (11) has only one feasible solution $X = 0$ a.s. and $v_+(A, x_+) := 0$. If $P(A) = 0$ and $x_+ > 0$, then (11) has no feasible solution, where we define $v_+(A, x_+) := -\infty$.

• Negative Part Problem: A problem with parameters $(A, x_+)$:

$$\begin{array}{ll} \text{Minimize} & V_-(X) = \int_0^{+\infty} T_-(P\{u_-(X) > y\})\,dy \\ \text{subject to} & \begin{cases} E[\rho X] = x_+ - x_0,\ X \ge 0,\ X = 0 \text{ a.s. on } A, \\ X \text{ is upper bounded a.s.}, \end{cases} \end{array} \tag{12}$$

where $x_+ \ge x_0^+$ and $A \in \mathcal{F}_T$ are given. Similarly to the positive part problem we define the optimal value $v_-(A, x_+)$ of Problem (12) as follows. When $P(A) < 1$, $v_-(A, x_+)$ is the infimum of (12). If $P(A) = 1$ and $x_+ = x_0$ where the only feasible solution is $X = 0$ a.s., then $v_-(A, x_+) := 0$. If $P(A) = 1$ and $x_+ \ne x_0$, then there is no feasible solution, in which case we define $v_-(A, x_+) := +\infty$.

Footnote (to Remark 3.1, on wealth bounded from below): This is why in Berkelaar et al. (2004) the model is well-posed even though no probability distortion is considered, as the wealth process there is constrained to be non-negative.

Step 2. In this step we solve

$$\begin{array}{ll} \text{Maximize} & v_+(A, x_+) - v_-(A, x_+) \\ \text{subject to} & \begin{cases} A \in \mathcal{F}_T,\ x_+ \ge x_0^+, \\ x_+ = 0 \text{ when } P(A) = 0,\ x_+ = x_0 \text{ when } P(A) = 1. \end{cases} \end{array} \tag{13}$$

Let $F(\cdot)$ be the distribution function of $\rho$. Our main results are stated in terms of the following mathematical program, which is intimately related to (but not the same as) Problem (13):

$$\begin{array}{ll} \text{Maximize} & v_+(c, x_+) - u_-\Big(\frac{x_+ - x_0}{E[\rho\mathbf{1}_{\rho > c}]}\Big)\,T_-(1 - F(c)) \\ \text{subject to} & \begin{cases} \underline\rho \le c \le \bar\rho,\ x_+ \ge x_0^+, \\ x_+ = 0 \text{ when } c = \underline\rho,\ x_+ = x_0 \text{ when } c = \bar\rho, \end{cases} \end{array} \tag{14}$$

where $v_+(c, x_+) := v_+(\{\omega : \rho \le c\}, x_+)$ and we use the following convention:

$$u_-\Big(\frac{x_+ - x_0}{E[\rho\mathbf{1}_{\rho > c}]}\Big)\,T_-(1 - F(c)) := 0 \text{ when } c = \bar\rho \text{ and } x_+ = x_0. \tag{15}$$

Here are the main results of this paper.

Theorem 4.1

Assume that $u_-(\cdot)$ is strictly concave at $0$. We have the following conclusions:

(i) If $X^*$ is optimal for Problem (7), then $c^* := F^{-1}(P\{X^* \ge 0\})$ and $x_+^* := E[\rho (X^*)^+]$, where $F$ is the distribution function of $\rho$, are optimal for Problem (14). Moreover, $\{\omega : X^* \ge 0\}$ and $\{\omega : \rho \le c^*\}$ are identical up to a zero probability set, and
$$(X^*)^- = \frac{x_+^* - x_0}{E[\rho \mathbf{1}_{\rho > c^*}]}\,\mathbf{1}_{\rho > c^*} \quad \text{a.s.}$$

(ii) If $(c^*, x_+^*)$ is optimal for Problem (14) and $X_+^*$ is optimal for Problem (11) with parameters $(\{\rho \le c^*\}, x_+^*)$, then
$$X^* := X_+^* \mathbf{1}_{\rho \le c^*} - \frac{x_+^* - x_0}{E[\rho \mathbf{1}_{\rho > c^*}]}\,\mathbf{1}_{\rho > c^*}$$
is optimal for Problem (7).

In the light of Theorem 4.1, we have the following algorithm to solve Problem (7).

Step 1. Solve Problem (11) with $(\{\omega : \rho \le c\}, x_+)$, where $\underline{\rho} \le c \le \bar{\rho}$ and $x_+ \ge x_0^+$ are given, to obtain $v_+(c, x_+)$ and the optimal solution $X_+^*(c, x_+)$.

Step 2. Solve Problem (14) to get $(c^*, x_+^*)$.

Step 3. (i) If $(c^*, x_+^*) = (\bar{\rho}, x_0)$, then $X_+^*(\bar{\rho}, x_0)$ solves Problem (7).
(ii) Else $X_+^*(c^*, x_+^*) \mathbf{1}_{\rho \le c^*} - \frac{x_+^* - x_0}{E[\rho \mathbf{1}_{\rho > c^*}]} \mathbf{1}_{\rho > c^*}$ solves Problem (7).

We now impose the following assumption:

Assumption 4.1 $F^{-1}(z)/T_+'(z)$ is non-decreasing in $z \in (0, 1]$, $\liminf_{x \to +\infty} \left( \frac{-x u_+''(x)}{u_+'(x)} \right) > 0$, and $E\left[ u_+\!\left( (u_+')^{-1}\!\left( \frac{\rho}{T_+'(F(\rho))} \right) \right) T_+'(F(\rho)) \right] < +\infty$.

Then $v_+(c, x_+)$ and the corresponding optimal solution $X_+^*$ to (11) can be expressed more explicitly:

$$v_+(c, x_+) = E\left[ u_+\!\left( (u_+')^{-1}\!\left( \frac{\lambda(c, x_+)\rho}{T_+'(F(\rho))} \right) \right) T_+'(F(\rho))\,\mathbf{1}_{\rho \le c} \right], \qquad X_+^* = (u_+')^{-1}\!\left( \frac{\lambda(c, x_+)\rho}{T_+'(F(\rho))} \right) \mathbf{1}_{\rho \le c},$$

where $\lambda(c, x_+)$ satisfies $E\left[ (u_+')^{-1}\!\left( \frac{\lambda(c, x_+)\rho}{T_+'(F(\rho))} \right) \rho\,\mathbf{1}_{\rho \le c} \right] = x_+$. In this case Theorem 4.1 can be re-stated with the preceding explicit expressions properly substituted.

Under Assumption 4.1, the optimal terminal wealth for our behavioral model (6) is given explicitly by

$$X^* = (u_+')^{-1}\!\left( \frac{\lambda(c^*, x_+^*)\rho}{T_+'(F(\rho))} \right) \mathbf{1}_{\rho \le c^*} - \frac{x_+^* - x_0}{E[\rho \mathbf{1}_{\rho > c^*}]}\,\mathbf{1}_{\rho > c^*}. \qquad (16)$$

This solution possesses some appealing features. On one hand, whether the terminal wealth is a gain or a loss is completely determined by whether the terminal state density price is lower or higher than a single threshold, $c^*$, which in turn can be obtained by solving (14). On the other hand, (16) is the payoff of a combination of two binary options, which can be easily priced; see Appendix E.

The remainder of this paper is devoted to proving all the above claims. But before that, let us discuss the economic interpretation of the optimal wealth profile (16). Indeed, (16) suggests that an optimal strategy should deliver a wealth in excess of the reference wealth in good states of the world ($\rho \le c^*$), and a shortfall in bad states ($\rho > c^*$). To realize this goal, the agent should initially buy a contingent claim with the payoff $(u_+')^{-1}\!\left( \frac{\lambda(c^*, x_+^*)\rho}{T_+'(F(\rho))} \right) \mathbf{1}_{\rho \le c^*}$ at cost $x_+^*$. Since $x_+^* \ge x_0$, he needs to issue (i.e., sell) a claim with a payoff $\frac{x_+^* - x_0}{E[\rho \mathbf{1}_{\rho > c^*}]} \mathbf{1}_{\rho > c^*}$ to finance the shortfall, $x_+^* - x_0$. In other words, the agent will not only invest in stocks, but will also generally take on leverage to do so. He then gambles on a good state of the market turning up at the terminal time while accepting a fixed loss in case of a bad state.

[Footnote: It can be easily shown, in the case of a one-stock market, that $\rho \le c^*$ is equivalent to the stock price exceeding a certain level.]

[Footnote: Such a gambling policy was derived in Berkelaar et al. (2004), Proposition 3, for a special model where the value function is a two-piece power function and there is no probability distortion. Here, we show that even for the most general model an optimal behavioral policy still possesses such an elegantly simple structure.]

5 Splitting

The key idea developed in this paper, i.e., splitting (7) into three sub-problems and then appropriately merging them, is based on the following observation: if $X$ is a feasible solution of (7), then one can split it into $X^+$ and $X^-$. The former naturally defines an event $A := \{X \ge 0\}$ and an initial price $x_+ := E[\rho X^+]$, and the latter corresponds to $A^C$ and $x_+ - x_0$, where $A^C$ denotes the complement of the set $A$. An optimal solution to (7) should, therefore, induce the "best" such $A$ and $x_+$ in a certain sense. We now prove that this idea indeed works in the sense that (7) is equivalent to the three auxiliary problems combined.

We start with the well-posedness.

Proposition 5.1

Problem (7) is ill-posed if and only if Problem (13) is ill-posed.

Proof:

We first show the "if" part. Suppose (13) is ill-posed. If $v_+(\Omega, x_0) = +\infty$, then Problem (7) is obviously ill-posed. If $v_+(\Omega, x_0) < +\infty$, then for any $M > v_+(\Omega, x_0)$, there exists a feasible pair $(A, x_+)$ for (13) such that $v_+(A, x_+) - v_-(A, x_+) \ge M$. Clearly $0 < P(A) < 1$. (If $P(A) = 0$, then $v_+(A, x_+) - v_-(A, x_+) \le 0 < M$. If $P(A) = 1$, then $v_+(A, x_+) - v_-(A, x_+) \le v_+(\Omega, x_0) < M$.) Consequently, both (11) and (12) with parameters $(A, x_+)$, $x_+ \ge x_0^+$, have non-empty feasible regions. So there exist $X_1$ and $X_2$ feasible for (11) and (12) respectively such that $V_+(X_1) \ge v_+(A, x_+) - 1$ and $V_-(X_2) \le v_-(A, x_+) + 1$. Define $X = X_1 - X_2$. Then $X$ is feasible for (7), and $V(X) \ge v_+(A, x_+) - v_-(A, x_+) - 2 \ge M - 2$, which shows that (7) is ill-posed.

For the "only if" part, suppose (7) is ill-posed. Then for any $M > 0$, there exists a feasible solution $X$ for (7) such that $V(X) \ge M$. Define $A := \{\omega : X \ge 0\}$, $x_+ := E[\rho X^+]$. Then $(A, x_+)$ is feasible for Problem (13), and $v_+(A, x_+) - v_-(A, x_+) \ge V_+(X^+) - V_-(X^-) = V(X) \ge M$, which shows that Problem (13) is ill-posed. Q.E.D.

Proposition 5.2

Given $X^*$, define $A^* := \{\omega : X^* \ge 0\}$ and $x_+^* := E[\rho (X^*)^+]$. Then $X^*$ is optimal for Problem (7) if and only if $(A^*, x_+^*)$ is optimal for Problem (13) and $(X^*)^+$ and $(X^*)^-$ are respectively optimal for Problems (11) and (12) with parameters $(A^*, x_+^*)$.

Proof:

For the "if" part, we first have $V(X^*) = v_+(A^*, x_+^*) - v_-(A^*, x_+^*)$. For any feasible solution $X$ of (7), define $A := \{\omega : X \ge 0\}$ and $x_+ := E[\rho X^+]$. Then we have $V_+(X^+) \le v_+(A, x_+)$ and $V_-(X^-) \ge v_-(A, x_+)$. Therefore
$$V(X) = V_+(X^+) - V_-(X^-) \le v_+(A, x_+) - v_-(A, x_+) \le v_+(A^*, x_+^*) - v_-(A^*, x_+^*) = V(X^*),$$
which means $X^*$ is optimal for (7).

For the "only if" part, let $X^*$ be optimal for (7). Obviously, $V_+((X^*)^+) \le v_+(A^*, x_+^*)$ and $V_-((X^*)^-) \ge v_-(A^*, x_+^*)$. If the former holds strictly, then there exists $X$ feasible for (11) with parameters $(A^*, x_+^*)$ such that $V_+(X) > V_+((X^*)^+)$. As a result $\bar{X} := X \mathbf{1}_{A^*} + X^* \mathbf{1}_{(A^*)^C}$ is feasible for (7) and $V(\bar{X}) > V(X^*)$, which contradicts the optimality of $X^*$. So $(X^*)^+$ is optimal for (11). Similarly we can prove that $(X^*)^-$ is optimal for (12). Thus $v_+(A^*, x_+^*) = V_+((X^*)^+)$ and $v_-(A^*, x_+^*) = V_-((X^*)^-)$.

Next we show that $v_+(A, x_+) - v_-(A, x_+) \le v_+(A^*, x_+^*) - v_-(A^*, x_+^*) \equiv V(X^*)$ for any feasible pair $(A, x_+)$ of Problem (13). This can be proved in three cases:

(i) If $P(A) = 0$ (hence $x_+ = 0$ and $x_0 \le 0$), then
$$v_+(A, x_+) - v_-(A, x_+) = -v_-(A, 0) = -v_-(A, x_0^+) = \sup_{E[\rho X] = x_0^-,\ X \ge 0,\ X \text{ upper bounded}} [-V_-(X)] = \sup_{E[\rho X] = -x_0^-,\ X \le 0,\ X \text{ lower bounded}} V(X) \le \sup_{E[\rho X] = -x_0^-,\ X \text{ lower bounded}} V(X) = V(X^*),$$
where the last equality is owing to the fact that $-x_0^- = x_0$.

(ii) If $P(A) = 1$ (hence $x_+ = x_0$), then we need only check $v_+(A, x_0) \le V(X^*)$, which is easy since $v_+(A, x_0) = \sup_{E[\rho X] = x_0,\ X \ge 0} V(X)$.

(iii) If $0 < P(A) < 1$, then for any $x_+ \ge x_0^+$, both (11) and (12) with parameters $(A, x_+)$ have non-empty feasible regions. Hence for any $\epsilon > 0$, there exist $X_1$ and $X_2$, feasible for (11) and (12) respectively, such that $V_+(X_1) > v_+(A, x_+) - \epsilon$ and $V_-(X_2) < v_-(A, x_+) + \epsilon$. Letting $X := X_1 - X_2$, which is feasible for (7), we have
$$v_+(A, x_+) - v_-(A, x_+) < V_+(X_1) - V_-(X_2) + 2\epsilon = V(X) + 2\epsilon \le V(X^*) + 2\epsilon.$$
Since $\epsilon > 0$ is arbitrary, the desired inequality follows. Q.E.D.

The essential message of Propositions 5.1 and 5.2 is that our problem (7) is completely equivalent to the set of problems (11)-(13) and, moreover, the solution to the former can be obtained via those to the latter.

Problem (13) is an optimization problem whose decision variables are a real number, $x_+$, and a random event, $A$, the latter being very hard to handle. We now show that one needs only to consider $A = \{\rho \le c\}$, where $c$ is a real number in a certain range, when optimizing (13).

Recall that two random variables $\xi$ and $\eta$ are called comonotonic (anti-comonotonic, respectively) if $[\xi(\omega) - \xi(\omega')][\eta(\omega) - \eta(\omega')] \ge 0$ ($\le 0$, respectively).

Theorem 5.1

For any feasible pair $(A, x_+)$ of Problem (13), there exists $c \in [\underline{\rho}, \bar{\rho}]$ such that $\bar{A} := \{\omega : \rho \le c\}$ satisfies
$$v_+(\bar{A}, x_+) - v_-(\bar{A}, x_+) \ge v_+(A, x_+) - v_-(A, x_+). \qquad (17)$$
Moreover, if Problem (11) admits an optimal solution with parameters $(A, x_+)$, then the inequality in (17) is strict unless $P(A \cap \bar{A}^C) + P(A^C \cap \bar{A}) = 0$.

Proof:

The case when $x_+ = x_0^+$ is trivial. In fact, if $x_0 \le 0$, then $x_+ = 0$ and $v_+(A, x_+) = 0\ \forall A$; hence (17) holds with $c = \underline{\rho}$ or $\bar{A} = \emptyset$. If $x_0 > 0$, then obviously $v_-(A, x_+) = 0\ \forall A$; hence (17) holds with $\bar{A} = \Omega$ or $c = \bar{\rho}$. On the other hand, the case when $P(A) = 0$ or $P(A) = 1$ is also trivial, where $c := \underline{\rho}$ or $c := \bar{\rho}$ trivially meets (17).

So we assume now that $x_+ > x_0^+$ and $0 < P(A) < 1$. Denote $\alpha := P(A)$, $B := A^C$. Let $\bar{A} = \{\omega : \rho \le c\}$, where $c \in [\underline{\rho}, \bar{\rho})$ satisfies $P\{\rho \le c\} = \alpha$. Further, set
$$A_1 = A \cap \{\omega : \rho \le c\},\quad A_2 = A \cap \{\omega : \rho > c\},\quad B_1 = B \cap \{\omega : \rho \le c\},\quad B_2 = B \cap \{\omega : \rho > c\}.$$
Since $P(A_1 \cup B_1) = P(A_1 \cup A_2) \equiv \alpha$, we conclude $P(A_2) = P(B_1)$.

If $P(A_2) = P(B_1) = 0$, then trivially $v_+(\bar{A}, x_+) - v_-(\bar{A}, x_+) = v_+(A, x_+) - v_-(A, x_+)$. So we suppose $P(A_2) = P(B_1) > 0$. For any feasible solutions $X_1$ and $X_2$ for (11) and (12), respectively, with parameters $(A, x_+)$, we are to prove that
$$V_+(X_1) - V_-(X_2) \le v_+(\bar{A}, x_+) - v_-(\bar{A}, x_+). \qquad (18)$$

To this end, define $f_1(t) := P\{X_1 \le t \mid A_2\}$, $g_1(t) := P\{\rho \le t \mid B_1\}$, $t \ge 0$, $Z_1 := g_1(\rho)$ and $Y_1 := f_1^{-1}(Z_1)$. Because $\rho$ admits no atom with respect to $P$, it admits no atom with respect to $P(\cdot \mid B_1)$. Hence the distribution of $Z_1$ conditional on $B_1$ is $U(0, 1)$, which implies $P\{Y_1 \le t \mid B_1\} = P\{Z_1 \le f_1(t) \mid B_1\} = f_1(t)$. Consequently,
$$E[\rho X_1 \mathbf{1}_{A_2}] \ge c E[X_1 \mathbf{1}_{A_2}] = c P(A_2) E[X_1 \mid A_2] = c P(B_1) \int_0^{+\infty} [1 - f_1(t)]\,dt = c P(B_1) \int_0^{+\infty} P\{Y_1 > t \mid B_1\}\,dt = c E[Y_1 \mathbf{1}_{B_1}] \ge E[\rho Y_1 \mathbf{1}_{B_1}],$$
and the inequality is strict if and only if $P\{X_1 > 0 \mid A_2\} > 0$, that is, $f_1(t) \not\equiv 1$. Define
$$k := \begin{cases} 1, & \text{if } Y_1 = 0 \text{ a.s. on } B_1, \\ \dfrac{E[\rho X_1 \mathbf{1}_{A_2}]}{E[\rho Y_1 \mathbf{1}_{B_1}]}, & \text{otherwise.} \end{cases}$$
Then $k \ge 1$, and $k > 1$ if and only if $f_1(t) \not\equiv 1$. Set $\bar{X}_1 := X_1 \mathbf{1}_{A_1} + k Y_1 \mathbf{1}_{B_1}$. Then
$$E[\rho X_1] = E[\rho X_1 \mathbf{1}_{A_1}] + E[\rho X_1 \mathbf{1}_{A_2}] = E[\rho X_1 \mathbf{1}_{A_1}] + E[k \rho Y_1 \mathbf{1}_{B_1}] = E[\rho \bar{X}_1],$$
which means that $\bar{X}_1$ is feasible for (11) with parameters $(\bar{A}, x_+)$ (recall that by definition $\bar{X}_1 = 0$ on $\bar{A}^C$).

On the other hand, for any $t > 0$,
$$P\{\bar{X}_1 > t\} = P\{\bar{X}_1 > t \mid A_1\} P(A_1) + P\{\bar{X}_1 > t \mid B_1\} P(B_1) = P\{X_1 > t \mid A_1\} P(A_1) + P\{k Y_1 > t \mid B_1\} P(B_1) \ge P\{X_1 > t \mid A_1\} P(A_1) + P\{Y_1 > t \mid B_1\} P(B_1) = P\{X_1 > t \mid A_1\} P(A_1) + P\{X_1 > t \mid A_2\} P(A_2) = P\{X_1 > t\},$$
with the inequality being strict for some $t \ge 0$ when $f_1(t) \not\equiv 1$. It follows from the definition of $V_+(\cdot)$ that
$$V_+(\bar{X}_1) \ge V_+(X_1), \qquad (19)$$
with the inequality being strict when $f_1(\cdot) \not\equiv 1$. Similarly, we can construct $\bar{X}_2$ feasible for (12) with parameters $(\bar{A}, x_+)$ satisfying
$$V_-(\bar{X}_2) \le V_-(X_2). \qquad (20)$$
Combining (19) and (20) we get (18).

Now, if $X_1$ is an optimal solution of (11) with parameters $(A, x_+)$, then $P(X_1 = 0 \mid A_2) < 1$. Indeed, if $P(X_1 = 0 \mid A_2) = 1$, then by its optimality $X_1$ is anti-comonotonic with $\rho$ on $A$ (see Proposition C.1), which implies $P(X_1 = 0 \mid A_1) = 1$. Therefore $P(X_1 = 0 \mid A) = 1$, and $x_+ = E[\rho X_1 \mathbf{1}_A] = 0$, contradicting the fact that $x_+ > x_0^+ \ge 0$. Hence $f_1(\cdot) \not\equiv 1$. As proved earlier, (19), and hence (18), holds strictly.

Q.E.D.
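The construction $Y_1 := f_1^{-1}(g_1(\rho))$ used in the proof above is a quantile transform: it builds, on the event $B_1$, a random variable with the same conditional distribution that $X_1$ has on $A_2$. The following is a minimal numerical sketch of that idea using empirical distributions. The function name `quantile_transfer` and the sample values are hypothetical and serve only as an illustration; the sketch assumes equally many samples on both events and no ties among the $\rho$ samples (mirroring the no-atom assumption on $\rho$).

```python
def quantile_transfer(x_samples, rho_samples):
    """Empirical version of Y = f^{-1}(g(rho)).

    x_samples:   draws of X_1 on the event A_2 (defines the empirical f).
    rho_samples: draws of rho on the event B_1 (defines the empirical g).
    Returns y with y[i] equal to the empirical quantile of x at the rank of rho[i].
    """
    if len(x_samples) != len(rho_samples):
        raise ValueError("this sketch assumes equally many samples on both events")
    xs = sorted(x_samples)
    # Indices of rho_samples ordered by increasing rho: the rank plays the role of g(rho).
    order = sorted(range(len(rho_samples)), key=lambda i: rho_samples[i])
    y = [0.0] * len(rho_samples)
    for rank, i in enumerate(order):
        y[i] = xs[rank]  # f^{-1} evaluated at the empirical level of rho[i]
    return y


# Illustrative (made-up) samples.
x_on_A2 = [3.0, 0.0, 1.5, 0.2]
rho_on_B1 = [0.9, 2.1, 0.4, 1.3]
y_on_B1 = quantile_transfer(x_on_A2, rho_on_B1)
```

By construction the values in `y_on_B1` are a rearrangement of those in `x_on_A2`, so the two empirical distributions coincide, which is the property $P\{Y_1 \le t \mid B_1\} = f_1(t)$ exploited in the proof.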

To simplify the notation, we now use $v_+(c, x_+)$ and $v_-(c, x_+)$ to denote $v_+(\{\omega : \rho \le c\}, x_+)$ and $v_-(\{\omega : \rho \le c\}, x_+)$ respectively.

In view of Theorem 5.1, one may replace Problem (13) by the following problem:

$$\text{Maximize } v_+(c, x_+) - v_-(c, x_+) \quad \text{subject to } \underline{\rho} \le c \le \bar{\rho},\ x_+ \ge x_0^+,\ x_+ = 0 \text{ when } c = \underline{\rho},\ x_+ = x_0 \text{ when } c = \bar{\rho}. \qquad (21)$$

This is clearly a much simpler problem, being a constrained optimization problem (a mathematical programming problem) in $\mathbb{R}^2$.

Theorem 5.1 is one of the most important results in this paper. It discloses the form of a general solution to the behavioral model: the optimal wealth is the payoff of a combination of two binary options characterized by a single number $c^*$, as stipulated in the next theorem.

Theorem 5.2

Given $X^*$, define $c^* := F^{-1}(P\{X^* \ge 0\})$ and $x_+^* := E[\rho (X^*)^+]$, where $F(\cdot)$ is the distribution function of $\rho$. Then $X^*$ is optimal for Problem (7) if and only if $(c^*, x_+^*)$ is optimal for Problem (21) and $(X^*)^+ \mathbf{1}_{\rho \le c^*}$ and $(X^*)^- \mathbf{1}_{\rho > c^*}$ are respectively optimal for Problems (11) and (12) with parameters $(\{\omega : \rho \le c^*\}, x_+^*)$. Moreover, in this case $\{\omega : X^* \ge 0\}$ and $\{\omega : \rho \le c^*\}$ are identical up to a zero probability set.

Proof:

Straightforward from Proposition 5.2 and Theorem 5.1.

Q.E.D.

In the following two sections, we will solve the positive and negative part problems respectively to obtain $v_+(c, x_+)$ and $v_-(c, x_+)$. It turns out that the two problems require very different techniques to tackle.

6 Positive Part Problem

In this section we solve the positive part problem (11), including finding its optimal solution and the expression of $v_+(c, x_+)$, for any $A = \{\omega : \rho \le c\}$, $\underline{\rho} \le c \le \bar{\rho}$, and $x_+ \ge x_0^+$. In fact, it is a special case of a more general Choquet maximization problem, which is of independent interest and is solved in Appendix C.

We apply the general results obtained in Appendix C to Problem (11) with $A = \{\omega : \rho \le c\}$ and $x_+ \ge x_0^+\ (\ge 0)$. Recall that $F(\cdot)$ is the distribution function of $\rho$.

Let $A = \{\omega : \rho \le c\}$ be given. Problem (11) is trivial when $P(A) = 0$; hence we assume $P(A) > 0$, or $c > \underline{\rho}$. Define
$$T_A(x) := T_+(x P(A))/T_+(P(A)), \quad x \in [0, 1],$$
which is a strictly increasing, differentiable function from $[0, 1]$ to $[0, 1]$, with $T_A(0) = 0$, $T_A(1) = 1$. For any feasible solution $X$ of (11) and any $y \ge 0$,
$$T_+(P\{u_+(X) > y\}) = T_+\big(P\{u_+(X) > y \mid A\} P(A)\big) = T_+(P(A))\, T_A\big(P\{u_+(X) > y \mid A\}\big).$$
Now considering Problem (11) in the conditional probability space $(\Omega \cap A, \mathcal{F} \cap A, P_A := P(\cdot \mid A))$, we can rewrite it as

$$\text{Maximize } V_+(Y) = T_+(P(A)) \int_0^{+\infty} T_A\big(P_A\{u_+(Y) > y\}\big)\,dy \quad \text{subject to } E_A[\rho Y] = x_+/P(A),\ Y \ge 0. \qquad (22)$$

This specializes the general Choquet maximization problem (46) solved in Appendix C. It is evident that $Y^*$ is optimal for (22) if and only if $X^* = Y^* \mathbf{1}_A$ is optimal for (11).

To solve Problem (11) for all $A = \{\omega : \rho \le c\}$, we need Assumption 4.1.

Theorem 6.1

Let Assumption 4.1 hold. Given $A := \{\omega : \rho \le c\}$ with $\underline{\rho} \le c \le \bar{\rho}$, and $x_+ \ge x_0^+$:

(i) If $x_+ = 0$, then the optimal solution of (11) is $X^* = 0$ and $v_+(c, x_+) = 0$.

(ii) If $x_+ > 0$ and $c = \underline{\rho}$, then there is no feasible solution to (11) and $v_+(c, x_+) = -\infty$.

(iii) If $x_+ > 0$ and $\underline{\rho} < c \le \bar{\rho}$, then the optimal solution to (11) is
$$X^*(\lambda) = (u_+')^{-1}\!\left( \frac{\lambda \rho}{T_+'(F(\rho))} \right) \mathbf{1}_{\rho \le c}$$
with the optimal value
$$v_+(c, x_+) = E\left[ u_+\!\left( (u_+')^{-1}\!\left( \frac{\lambda \rho}{T_+'(F(\rho))} \right) \right) T_+'(F(\rho))\,\mathbf{1}_{\rho \le c} \right],$$
where $\lambda > 0$ is the unique real number satisfying $E[\rho X^*(\lambda)] = x_+$.

Proof:

Cases (i) and (ii) are trivial. We prove (iii). Assume $\underline{\rho} < c \le \bar{\rho}$ with $P(A) \equiv P\{\rho \le c\} > 0$. Define
$$F_A(x) := P_A\{\rho \le x\} = \frac{P\{\rho \le x \wedge c\}}{P\{\rho \le c\}} = \frac{F(x \wedge c)}{P(A)}, \quad x \ge 0.$$
Then $F_A^{-1}(x) = F^{-1}(x P(A))$. Noting $T_A'(x) = \frac{P(A)}{T_+(P(A))} T_+'(x P(A))$, we have
$$\frac{F_A^{-1}(z)}{T_A'(z)} = \frac{F^{-1}(z P(A))}{T_+'(z P(A))} \cdot \frac{T_+(P(A))}{P(A)},$$
which is non-decreasing in $z$ under Assumption 4.1. Noting that $\rho \le c$ on $A$, we have
$$\frac{\rho}{T_A'(F_A(\rho))} = \frac{\rho}{T_+'(F(\rho))} \cdot \frac{T_+(P(A))}{P(A)}.$$
Hence, in view of Assumption 4.1 and Proposition C.2, we can apply Theorem C.1 to conclude that the optimal solution for (22) is $Y^* = (u_+')^{-1}\!\left( \frac{\bar{\lambda} \rho}{T_A'(F_A(\rho))} \right)$ for some $\bar{\lambda} > 0$. Denoting $\lambda := \frac{T_+(P(A))}{P(A)} \bar{\lambda} \ge 0$, we obtain the optimality of $X^* := Y^* \mathbf{1}_{\rho \le c}$ in view of the relation between Problems (22) and (11).

Finally, the optimal value of Problem (11) can be calculated as follows:
$$v_+(c, x_+) = T_+(P(A))\, E_A[u_+(Y^*) T_A'(F_A(\rho))] = P(A)\, E_A[u_+(Y^*) T_+'(F(\rho))] = E[u_+(Y^*) T_+'(F(\rho)) \mathbf{1}_{\rho \le c}].$$
The proof is complete.

Q.E.D.
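In case (iii) of Theorem 6.1 the multiplier $\lambda$ is pinned down only implicitly by the budget equation $E[\rho X^*(\lambda)] = x_+$. The sketch below shows one way this equation could be solved numerically. It is an illustration under simplifying assumptions that are not part of the theorem: $\rho$ is lognormal, there is no distortion on gains ($T_+'(z) \equiv 1$), the expectation is approximated by the trapezoid rule on a truncated range, and the function names `budget` and `solve_lambda` as well as all parameter values are hypothetical.

```python
import math


def budget(lam, inv_marginal, c, mu, sigma, n=2000):
    """Approximate E[rho * inv_marginal(lam * rho) * 1{rho <= c}] by the trapezoid rule.

    Assumes ln(rho) ~ Normal(mu, sigma**2) and T_+'(z) = 1 (no distortion on gains).
    inv_marginal plays the role of (u_+')^{-1}; c must be positive (math.inf is allowed).
    """
    z_lo = -8.0
    z_hi = min((math.log(c) - mu) / sigma, 8.0)  # truncate the standard normal range
    if z_hi <= z_lo:
        return 0.0
    h = (z_hi - z_lo) / n
    total = 0.0
    for i in range(n + 1):
        z = z_lo + i * h
        rho = math.exp(mu + sigma * z)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * rho * inv_marginal(lam * rho) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)


def solve_lambda(x_plus, inv_marginal, c, mu, sigma, lo=1e-8, hi=1e8, iters=100):
    """Bisection (on a log scale) for the lambda with budget(lambda) = x_plus.

    Relies on the budget being decreasing in lambda, which holds because
    (u_+')^{-1} is decreasing for a strictly concave u_+.
    """
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if budget(mid, inv_marginal, c, mu, sigma) > x_plus:
            lo = mid  # claim too expensive: raise lambda
        else:
            hi = mid
    return math.sqrt(lo * hi)


# Illustrative parameters (assumptions): power utility u_+(x) = x**alpha.
alpha = 0.5
inv_marginal = lambda y: (y / alpha) ** (1.0 / (alpha - 1.0))
lam = solve_lambda(1.0, inv_marginal, c=1.2, mu=-0.05, sigma=0.3)
```

For a power utility the root is also available in closed form, which gives a convenient consistency check on the numerical routine; for other utilities satisfying the assumptions only the bisection step would be needed.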

Theorem 6.1 remains true when the condition $\liminf_{x \to +\infty} \left( \frac{-x u_+''(x)}{u_+'(x)} \right) > 0$ is replaced by the weaker one
$$\limsup_{x \to +\infty} \frac{u_+'(kx)}{u_+'(x)} < 1 \quad \text{for some } k > 1,$$
which, in particular, does not require the twice differentiability of $u_+(\cdot)$; see Jin, Xu and Zhou (2007, Lemma 3 and Proposition 2). We choose to use the current condition due to its clear economic meaning related to the relative risk aversion index.

Before we end this subsection, we state the following result, which is useful in the sequel.

Proposition 6.1 If $x_+ > 0$, then Problem (11) admits an optimal solution with parameters $(\{\rho \le c\}, x_+)$ only if $v_+(\bar{c}, x_+) > v_+(c, x_+)$ for any $\bar{c} > c$ satisfying $P\{c < \rho \le \bar{c}\} > 0$.

Proof:

Assume $c > \underline{\rho}$, the case $c = \underline{\rho}$ being trivial. Let $X$ be optimal for (11) with $(A(c), x_+)$, where $A(c) := \{\omega : \rho \le c\}$. Then $Y_c := X|_{A(c)}$, where $X|_{A(c)}$ is $X$ restricted to $A(c)$, is optimal for (22) with $A = A(c)$.

For any $\bar{c} > c$, obviously $v_+(\bar{c}, x_+) \ge v_+(c, x_+)$. If $v_+(\bar{c}, x_+) = v_+(c, x_+)$, then, with $A(\bar{c}) := \{\omega : \rho \le \bar{c}\}$, the random variable
$$\bar{Y}(\omega) := \begin{cases} Y_c(\omega), & \text{if } \omega \in A(c), \\ 0, & \text{if } \omega \in A(\bar{c}) \setminus A(c) \end{cases}$$
is feasible for (22) with $A = A(\bar{c})$ and, since its objective value is $v_+(c, x_+) = v_+(\bar{c}, x_+)$, is optimal. By Theorem C.2, $P\{\bar{Y} = 0 \mid A(\bar{c})\} = 0$. However, the definition of $\bar{Y}$ shows that $P\{\bar{Y} = 0 \mid A(\bar{c})\} > 0$ whenever $P\{c < \rho \le \bar{c}\} > 0$. This contradiction leads to $v_+(\bar{c}, x_+) > v_+(c, x_+)$. Q.E.D.

In other words, $v_+$ is strictly increasing in $c$.

6.2 Discussion on the Monotonicity of $F^{-1}(z)/T_+'(z)$

It is seen from the previous subsections that in order to solve the positive part problem explicitly, a key assumption is the monotonicity of $F^{-1}(z)/T_+'(z)$. What is the economic interpretation of this property? Does it contradict the other assumptions usually imposed on $F(\cdot)$ and $T_+(\cdot)$? More importantly, is the set of the problem parameters satisfying this assumption null in the first place? In this subsection we depart from our optimization problems for a while to address these questions. [Footnote: The reader may skip this subsection without interrupting the flow of reading.]

Throughout this subsection, we assume that $F(\cdot)$ (the distribution function of $\rho$) is twice differentiable and $F'(x) > 0\ \forall x > 0$ (as is the case, for example, when $\rho$ is a non-degenerate lognormal random variable). Furthermore, suppose that $T_+(\cdot)$ is twice differentiable on $(0, 1)$.

Denote $x = F^{-1}(z)$ or $z = F(x)$. Then the monotonicity (being non-decreasing) of $F^{-1}(z)/T_+'(z)$ is equivalent to $T_+'(F(x))/x$ being non-increasing in $x > 0$. Set $H(x) := T_+(F(x))$, $h(x) := H'(x)$, and $I(x) := T_+'(F(x))/x \equiv h(x)/(x F'(x))$, $x > 0$. Then $I(x)$ is non-increasing in $x > 0$ if and only if
$$I'(x) = \frac{x H''(x) F'(x) - x H'(x) F''(x) - H'(x) F'(x)}{x^2 (F'(x))^2} \le 0 \quad \forall x > 0,$$
which is further equivalent to
$$\frac{x H''(x)}{H'(x)} - \frac{x F''(x)}{F'(x)} \le 1 \quad \forall x > 0, \qquad (23)$$
or
$$(\ln H'(x))' \le (\ln(x F'(x)))' \quad \forall x > 0. \qquad (24)$$

Note that $\frac{x u''(x)}{u'(x)}$ can be regarded as the relative risk seeking index of a given function $u(\cdot)$. On the other hand, recall that by definition $H(\cdot)$ is the distorted distribution function of $\rho$. Hence the condition (23) can be economically interpreted as saying that the distortion $T_+$ should not be "too large", in the sense that it should not increase the relative risk seeking function of the distribution by more than 1.

Next we explore more properties of the function $j(\cdot)$ defined by
$$j(x) := \frac{x H''(x)}{H'(x)} - \frac{x F''(x)}{F'(x)}, \quad x > 0.$$
To this end, let $G(z) := F^{-1}(z)$. Then $G'(z) = \frac{1}{F'(G(z))}\ \forall z \in (0, 1)$. Since $T_+(z) = H(G(z))$, we have $T_+'(z) = h(G(z))/F'(G(z))$; hence
$$T_+''(z) = \frac{h'(G(z)) F'(G(z)) - h(G(z)) F''(G(z))}{F'(G(z))^3}.$$
This leads to
$$T_+''(F(x)) = \frac{h'(x) F'(x) - h(x) F''(x)}{F'(x)^3} = \frac{h(x)}{x F'(x)^2}\, j(x), \quad x > 0. \qquad (25)$$

As proposed by Tversky and Kahneman (1992), the probability distortion $T_+(\cdot)$ is usually of reversed S-shape. Specifically, $T_+(x)$ changes from being concave to being convex when $x$ goes from 0 to 1, or $T_+''(x)$ changes from negative to positive. It follows then from (25) that $j(\cdot)$ changes from negative to positive as $x$ increases over $(0, +\infty)$, while as shown earlier (23) requires that $j(\cdot)$ be bounded above by 1.

To summarize, a reversed S-shaped distortion $T_+(\cdot)$ satisfies the monotonicity condition in Assumption 4.1 if there exists $c_0 > 0$ such that
$$j(x) \le 0\ \ \forall x \in (0, c_0], \quad \text{and} \quad 0 \le j(x) \le 1\ \ \forall x \in (c_0, +\infty). \qquad (26)$$

The following is an example of a distortion whose corresponding $j(\cdot)$ does satisfy (26).

Example 6.1

Let $\rho$ be a non-degenerate lognormal random variable; i.e., $F(x) = N\!\left( \frac{\ln x - \mu}{\sigma} \right)$ for some $\mu \in \mathbb{R}$ and $\sigma > 0$, where $N(\cdot)$ is the distribution function of a standard normal random variable. Take $j(x) = a \mathbf{1}_{0 < x \le c_0} + b \mathbf{1}_{x > c_0}$, with $c_0 > 0$ and $a < 0 < b < 1$. We now find the distortion $T_+(\cdot)$ that produces the function $j(\cdot)$.

When $0 < x \le c_0$, $j(x) \equiv x[(\ln H'(x))' - (\ln F'(x))'] = a$. Hence $\ln H'(x) - \ln F'(x) = \bar{k} + a \ln x$ for some constant $\bar{k}$, or
$$H'(x) = k F'(x) x^a = \frac{k}{\sqrt{2\pi}\sigma}\, x^{a-1} e^{-(\ln x - \mu)^2/(2\sigma^2)}, \quad 0 < x \le c_0, \qquad (27)$$
for some constant $k$. Thus,
$$H(x) = \frac{k}{\sqrt{2\pi}\sigma} \int_0^x t^{a-1} e^{-(\ln t - \mu)^2/(2\sigma^2)}\,dt = \frac{k}{\sqrt{2\pi}\sigma} \int_{-\infty}^{\ln x} e^{as} e^{-(s - \mu)^2/(2\sigma^2)}\,ds = \frac{k}{\sqrt{2\pi}\sigma}\, e^{a\mu + a^2\sigma^2/2} \int_{-\infty}^{\ln x} e^{-(s - (\mu + a\sigma^2))^2/(2\sigma^2)}\,ds = k e^{a\mu + a^2\sigma^2/2}\, N\!\left( \frac{\ln x - (\mu + a\sigma^2)}{\sigma} \right), \quad 0 < x \le c_0. \qquad (28)$$
Consequently,
$$T_+(z) \equiv H(F^{-1}(z)) = k e^{a\mu + a^2\sigma^2/2}\, N\big( N^{-1}(z) - a\sigma \big), \quad 0 < z \le F(c_0) := z_0.$$

When $x > c_0$, similar to (27) we have
$$H'(x) = \frac{\tilde{k}}{\sqrt{2\pi}\sigma}\, x^{b-1} e^{-(\ln x - \mu)^2/(2\sigma^2)}, \quad x > c_0, \qquad (29)$$
with $\tilde{k} = c_0^{a-b} k$ (to render $H'(x)$ continuous at $x = c_0$). Therefore,
$$H(x) = H(c_0) + \frac{\tilde{k}}{\sqrt{2\pi}\sigma} \int_{c_0}^x t^{b-1} e^{-(\ln t - \mu)^2/(2\sigma^2)}\,dt = H(c_0) + \frac{\tilde{k}}{\sqrt{2\pi}\sigma} \int_{\ln c_0}^{\ln x} e^{bs} e^{-(s - \mu)^2/(2\sigma^2)}\,ds = H(c_0) + \tilde{k} e^{b\mu + b^2\sigma^2/2} \left[ N\!\left( \frac{\ln x - (\mu + b\sigma^2)}{\sigma} \right) - N\!\left( \frac{\ln c_0 - (\mu + b\sigma^2)}{\sigma} \right) \right], \quad x > c_0. \qquad (30)$$
Hence
$$T_+(z) = H(F^{-1}(z)) = k e^{a\mu + a^2\sigma^2/2}\, N\big( N^{-1}(z_0) - a\sigma \big) + \tilde{k} e^{b\mu + b^2\sigma^2/2} \left[ N\big( N^{-1}(z) - b\sigma \big) - N\big( N^{-1}(z_0) - b\sigma \big) \right], \quad z_0 < z \le 1.$$
In particular,
$$T_+(1) = k e^{a\mu + a^2\sigma^2/2}\, N\big( N^{-1}(z_0) - a\sigma \big) + \tilde{k} e^{b\mu + b^2\sigma^2/2} \left[ 1 - N\big( N^{-1}(z_0) - b\sigma \big) \right] = k e^{a\mu + a^2\sigma^2/2}\, N\!\left( \frac{\ln c_0 - \mu - a\sigma^2}{\sigma} \right) + k c_0^{a-b} e^{b\mu + b^2\sigma^2/2} \left[ 1 - N\!\left( \frac{\ln c_0 - \mu - b\sigma^2}{\sigma} \right) \right].$$
This, in turn, determines uniquely the value of $k$ since $T_+(1) = 1$.

So, in this example we have constructed a class of distortions $T_+$ parameterized by $z_0 = F(c_0) \in (0, 1)$, $a < 0$ and $b \in (0, 1)$. The expressions of $H(\cdot)$ given in (28) and (30) show that the distortion $T_+(\cdot)$ in effect distorts the distribution of $\rho$, a lognormal random variable, into one having lognormal components, albeit with enlarged means and rescaled values. On the other hand, as stipulated in Tversky and Kahneman (1992), a probability distortion on gains usually satisfies $T_+'(0) = T_+'(1) = +\infty$, reflecting the observation that the most significant distortions occur at very small and very large probabilities. It turns out that the distortion functions constructed in the preceding example do indeed satisfy $T_+'(0) = T_+'(1) = +\infty$. To see this, notice that $T_+'(z) = T_+'(F(x)) = H'(x)/F'(x) = k x^{j(x)}$ or $\tilde{k} x^{j(x)}$. Hence, when $z \to 0$, $x \to 0$, and $T_+'(z) \to +\infty$. On the other hand, when $z \to 1$, $x \to +\infty$, and $T_+'(z) \to +\infty$.

Now we turn to the negative part problem (12), which is a Choquet minimization problem. Such a problem in a more general setting is solved thoroughly in Appendix D; so we need only apply the results there to (12). Notice, though, that (12) has a constraint that a feasible solution must be almost surely bounded from above. The reason we do not include this constraint explicitly in the general problem (51) is that, under a mild condition, any optimal solution to (51) is automatically almost surely bounded from above; see Proposition D.2 and the comments right after it.

As with the positive part problem, for a given $A = \{\omega : \rho \le c\}$ with $\underline{\rho} \le c < \bar{\rho}$ (the case when $c = \bar{\rho}$ is trivial), we define $T_{A^C}(x) := \frac{T_-(x P(A^C))}{T_-(P(A^C))}$. Then $T_{A^C}(\cdot)$ is a strictly increasing, differentiable function from $[0, 1]$ to $[0, 1]$ with $T_{A^C}(0) = 0$, $T_{A^C}(1) = 1$. Moreover, for any feasible solution $X$ of (12) and any $y \ge 0$,
$$T_-(P\{u_-(X) > y\}) = T_-\big( P\{u_-(X) > y \mid A^C\} P(A^C) \big) = T_-(P(A^C))\, T_{A^C}\big( P\{u_-(X) > y \mid A^C\} \big).$$
Denote $P_{A^C}(\cdot) = P(\cdot \mid A^C)$. Then Problem (12), taken in the probability space $(\Omega \cap A^C, \mathcal{F} \cap A^C, P_{A^C})$, is equivalent to

$$\text{Minimize } V_-(Y) = T_-(P(A^C)) \int_0^{+\infty} T_{A^C}\big( P_{A^C}\{u_-(Y) > y\} \big)\,dy \quad \text{subject to } E_{A^C}[\rho Y] = (x_+ - x_0)/P(A^C),\ Y \ge 0,\ Y \text{ is bounded a.s.} \qquad (31)$$

This is a special case of (51) in Appendix D.

Theorem 7.1

Assume that $u_-(\cdot)$ is strictly concave at $0$. Given $A := \{\omega : \rho \le c\}$ with $\underline{\rho} \le c \le \bar{\rho}$, and $x_+ \ge x_0^+$:

(i) If $c = \bar{\rho}$ and $x_+ = x_0$, then the optimal solution of (12) is $X^* = 0$ and $v_-(c, x_+) = 0$.

(ii) If $c = \bar{\rho}$ and $x_+ \ne x_0$, then there is no feasible solution to (12) and $v_-(c, x_+) = +\infty$.

(iii) If $\underline{\rho} \le c < \bar{\rho}$, then $v_-(c, x_+) = \inf_{\bar{c} \in [c, \bar{\rho})} u_-\!\left( \frac{x_+ - x_0}{E[\rho \mathbf{1}_{\rho > \bar{c}}]} \right) T_-(1 - F(\bar{c}))$. Moreover, Problem (12) with parameters $(A, x_+)$ admits an optimal solution $X^*$ if and only if the following minimization problem
$$\min_{\bar{c} \in [c, \bar{\rho})} u_-\!\left( \frac{x_+ - x_0}{E[\rho \mathbf{1}_{\rho > \bar{c}}]} \right) T_-(1 - F(\bar{c})) \qquad (32)$$
admits an optimal solution $\bar{c}^*$, in which case $X^* = \frac{x_+ - x_0}{E[\rho \mathbf{1}_{\rho > \bar{c}^*}]} \mathbf{1}_{\rho > \bar{c}^*}$, a.s.

Proof: Cases (i) and (ii) are trivial. On the other hand, given Theorem D.1, and noticing that $X^* = \frac{x_+ - x_0}{E[\rho \mathbf{1}_{\rho > \bar{c}^*}]} \mathbf{1}_{\rho > \bar{c}^*}$ is automatically bounded from above, we obtain (iii). Q.E.D.
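Part (iii) of Theorem 7.1 reduces the negative part problem to a one-dimensional minimization over the threshold $\bar{c}$. The sketch below evaluates the objective of (32) on a grid. Everything specific in it is an assumption made for illustration and is not prescribed by the theorem: $\rho$ is taken to be lognormal, $u_-(x) = k_- x^\alpha$, the distortion $T_-$ is taken to be the Tversky and Kahneman (1992) weighting function with parameter `delta`, the search range is truncated at `z_max` standard deviations, and the names `tk_weight`, `loss_value` and `v_minus_grid` are hypothetical.

```python
import math
from statistics import NormalDist

N = NormalDist()  # standard normal distribution


def tk_weight(p, delta=0.69):
    """Tversky-Kahneman weighting function, used here as an assumed T_-."""
    return p ** delta / (p ** delta + (1.0 - p) ** delta) ** (1.0 / delta)


def loss_value(c_bar, x_plus, x0, mu, sigma, k_minus=2.25, alpha=0.88):
    """Objective of (32): u_-((x_+ - x_0) / E[rho 1{rho > c_bar}]) * T_-(1 - F(c_bar)).

    Assumes ln(rho) ~ Normal(mu, sigma**2) and u_-(x) = k_minus * x**alpha.
    """
    z = (math.log(c_bar) - mu) / sigma
    # Lognormal partial expectation E[rho 1{rho > c_bar}].
    tail_price = math.exp(mu + 0.5 * sigma ** 2) * N.cdf(sigma - z)
    fixed_loss = (x_plus - x0) / tail_price  # size of the constant loss on {rho > c_bar}
    return k_minus * fixed_loss ** alpha * tk_weight(N.cdf(-z))


def v_minus_grid(c, x_plus, x0, mu, sigma, z_max=6.0, n=2000):
    """Grid approximation of the infimum in (32) over c_bar in [c, exp(mu + sigma*z_max)]."""
    z_c = (math.log(c) - mu) / sigma
    best_value, best_c = math.inf, None
    for i in range(n + 1):
        z = z_c + (z_max - z_c) * i / n
        c_bar = math.exp(mu + sigma * z)
        value = loss_value(c_bar, x_plus, x0, mu, sigma)
        if value < best_value:
            best_value, best_c = value, c_bar
    return best_value, best_c


# Illustrative parameters (assumptions).
best_value, best_c = v_minus_grid(c=1.0, x_plus=1.5, x0=1.0, mu=-0.05, sigma=0.3)
```

A grid search only approximates the infimum and cannot by itself decide whether the minimum in (32) is attained; it is meant as a quick way to inspect how the objective trades off a larger fixed loss against a smaller distorted loss probability as $\bar{c}$ increases.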

Now that we have solved the problems of the positive and negative parts in Step 1, we are ready to solve our ultimate Problem (7) via the optimization Problem (13) or, equivalently, Problem (21), in Step 2, and hence prove the main results contained in Theorem 4.1.

Recall Problem (14) formulated earlier. The following lemma is straightforward by Theorem 7.1 and the convention (15).

Lemma 8.1

For any feasible pair $(c, x_+)$ for Problem (21), $u_-\!\left( \frac{x_+ - x_0}{E[\rho \mathbf{1}_{\rho > c}]} \right) T_-(1 - F(c)) \ge v_-(c, x_+)$.

Proposition 8.1

Problems (21) and (14) have the same supremum values.

Proof:

Denote by $\alpha$ and $\beta$ the supremum values of (21) and (14) respectively. By Lemma 8.1, $\alpha \ge \beta$. Conversely, we prove $\alpha \le \beta$. First we assume that $\alpha < +\infty$. For any $\epsilon > 0$, there exists $(c, x_+)$ feasible for (21) such that $v_+(c, x_+) - v_-(c, x_+) \ge \alpha - \epsilon$, and there exists $\bar{c} \in [c, \bar{\rho}]$ such that $u_-\!\left( \frac{x_+ - x_0}{E[\rho \mathbf{1}_{\rho > \bar{c}}]} \right) T_-(1 - F(\bar{c})) \le v_-(c, x_+) + \epsilon$. Therefore
$$v_+(\bar{c}, x_+) - u_-\!\left( \frac{x_+ - x_0}{E[\rho \mathbf{1}_{\rho > \bar{c}}]} \right) T_-(1 - F(\bar{c})) \ge v_+(\bar{c}, x_+) - v_-(c, x_+) - \epsilon \ge v_+(c, x_+) - v_-(c, x_+) - \epsilon \ge \alpha - 2\epsilon.$$
Letting $\epsilon \to 0$, we conclude $\alpha \le \beta$.

Next, if $\alpha = +\infty$, then for any $M \in \mathbb{R}$, there exists a feasible pair $(c, x_+)$ such that $v_+(c, x_+) - v_-(c, x_+) \ge M$, and there is $\bar{c} \ge c$ with $u_-\!\left( \frac{x_+ - x_0}{E[\rho \mathbf{1}_{\rho > \bar{c}}]} \right) T_-(1 - F(\bar{c})) \le v_-(c, x_+) + M/2$. Then
$$v_+(\bar{c}, x_+) - u_-\!\left( \frac{x_+ - x_0}{E[\rho \mathbf{1}_{\rho > \bar{c}}]} \right) T_-(1 - F(\bar{c})) \ge v_+(c, x_+) - v_-(c, x_+) - M/2 \ge M/2,$$
which implies that $\beta = +\infty$. Q.E.D.

Proof of Theorem 4.1: (i) If $X^*$ is optimal for (7), then by Theorem 5.2 $(c^*, x_+^*)$ is optimal for (21) and $(X^*)^+ \mathbf{1}_{\rho \le c^*}$ and $(X^*)^- \mathbf{1}_{\rho > c^*}$ are respectively optimal for Problems (11) and (12) with parameters $(\{\omega : \rho \le c^*\}, x_+^*)$. We now show that with $(c, x_+) = (c^*, x_+^*)$ the minimum in (32) is achieved at $\bar{c} = c^*$, namely,
$$v_-(c^*, x_+^*) = u_-\!\left( \frac{x_+^* - x_0}{E[\rho \mathbf{1}_{\rho > c^*}]} \right) T_-(1 - F(c^*)). \qquad (33)$$
To this end, we first assume that $x_+^* = 0$ (hence $X^* = 0$ a.s. and $c^* = \bar{\rho}$). Then $x_0 \le x_+^* = 0$. If $x_0 = 0$, then (33) is trivial. If $x_0 < 0$, Theorem 7.1 yields that $(X^*)^- \mathbf{1}_{\rho > c^*}$ has the following representation:
$$(X^*)^- \mathbf{1}_{\rho > c^*} = \frac{x_+^* - x_0}{E[\rho \mathbf{1}_{\rho > \bar{c}^*}]}\,\mathbf{1}_{\rho > \bar{c}^*}, \quad \text{a.s.} \qquad (34)$$
Recall that $X^* < 0$ on $\rho > c^*$, and $\frac{x_+^* - x_0}{E[\rho \mathbf{1}_{\rho > \bar{c}^*}]} > 0$; so (34) implies $c^* = \bar{c}^*$, and hence (33), in view of Theorem 7.1.

Next, if $x_+^* > 0$, then by Proposition 6.1, we have $v_+(\bar{c}, x_+^*) > v_+(c^*, x_+^*)$ for any $\bar{c} > c^*$ with $P\{c^* < \rho \le \bar{c}\} > 0$. If (33) is not true, then it follows from Theorem 7.1 that there exists $\bar{c} > c^*$ with $P\{c^* < \rho \le \bar{c}\} > 0$ such that $v_-(c^*, x_+^*) = u_-\!\left( \frac{x_+^* - x_0}{E[\rho \mathbf{1}_{\rho > \bar{c}}]} \right) T_-(1 - F(\bar{c}))$. Consequently,
$$v_+(\bar{c}, x_+^*) - v_-(\bar{c}, x_+^*) \ge v_+(\bar{c}, x_+^*) - u_-\!\left( \frac{x_+^* - x_0}{E[\rho \mathbf{1}_{\rho > \bar{c}}]} \right) T_-(1 - F(\bar{c})) > v_+(c^*, x_+^*) - v_-(c^*, x_+^*),$$
violating the conclusion that $(c^*, x_+^*)$ is optimal for (21).

Now, for any $(c, x_+)$ feasible for (14),
$$v_+(c, x_+) - u_-\!\left( \frac{x_+ - x_0}{E[\rho \mathbf{1}_{\rho > c}]} \right) T_-(1 - F(c)) \le v_+(c, x_+) - v_-(c, x_+) \le v_+(c^*, x_+^*) - v_-(c^*, x_+^*) = v_+(c^*, x_+^*) - u_-\!\left( \frac{x_+^* - x_0}{E[\rho \mathbf{1}_{\rho > c^*}]} \right) T_-(1 - F(c^*)),$$
implying that $(c^*, x_+^*)$ is optimal for (14). The other conclusions are straightforward.

(ii) Since $(c^*, x_+^*)$ is optimal for (14), we have
$$v_+(c^*, x_+^*) - v_-(c^*, x_+^*) \ge v_+(c^*, x_+^*) - u_-\!\left( \frac{x_+^* - x_0}{E[\rho \mathbf{1}_{\rho > c^*}]} \right) T_-(1 - F(c^*)) = \sup \left[ v_+(c, x_+) - u_-\!\left( \frac{x_+ - x_0}{E[\rho \mathbf{1}_{\rho > c}]} \right) T_-(1 - F(c)) \right] = \sup\, [v_+(c, x_+) - v_-(c, x_+)],$$
where the supremum is over the feasible region of (14). This implies that $(c^*, x_+^*)$ is optimal for (21) and the inequality above is in fact an equality, resulting in
$$v_-(c^*, x_+^*) = u_-\!\left( \frac{x_+^* - x_0}{E[\rho \mathbf{1}_{\rho > c^*}]} \right) T_-(1 - F(c^*)).$$
The above in turn indicates, thanks to Theorem 7.1, that $X_-^* := \frac{x_+^* - x_0}{E[\rho \mathbf{1}_{\rho > c^*}]} \mathbf{1}_{\rho > c^*}$ is optimal for (12) with parameters $(\{\omega : \rho \le c^*\}, x_+^*)$. The desired result then follows from Theorem 5.2. Q.E.D.

Other claims in Section 4 on more explicit conclusions under Assumption 4.1 are straightforward by virtue of Theorem 6.1.

In this section we solve a concrete (and very involved) example to demonstrate the general results obtained in the previous section as well as the algorithm presented. The example showcases all the possibilities associated with our behavioral portfolio selection model (6): the model could be ill-posed, or well-posed yet with the optimal solution not attainable, or well-posed with the optimal solution obtainable. When optimal solutions do exist, we are able to derive explicit terminal payoffs for most of the cases.

In the example, we let $\rho$ follow the lognormal distribution, i.e., $\ln\rho\sim N(\mu,\sigma^2)$ with $\sigma>0$, and the utility functions be CRRA (constant relative risk aversion), i.e., $u_+(x)=x^\alpha$, $u_-(x)=k_-x^\alpha$, $x\ge 0$, with $k_->0$ and $0<\alpha<1$. (Recall that the overall utility function, or value function in the terminology of Tversky and Kahneman, is an S-shaped function.) These functions, also taken in Tversky and Kahneman (1992) with $\alpha=0.88$ and $k_-=2.25$, clearly satisfy Assumption 2.3. We do not spell out (neither do we need) the explicit forms of the distortions $T_+(\cdot)$ and $T_-(\cdot)$ so long as they satisfy Assumption 2.4. In addition, we assume that Assumption 4.1 holds, which is imposed on $T_+(\cdot)$. An example of such $T_+(\cdot)$ was presented in Example 6.1.

Clearly, $u_+'(x)=\alpha x^{\alpha-1}$, $(u_+')^{-1}(y)=(y/\alpha)^{1/(\alpha-1)}$, $u_+((u_+')^{-1}(y))=(y/\alpha)^{\alpha/(\alpha-1)}$, $\underline\rho=0$, $\bar\rho=+\infty$, and $F(x)=N((\ln x-\mu)/\sigma)$.

Under this setting, we first want to solve the positive part problem (11) with given $(c,x_+)$, where $0\le c\le+\infty$ and $x_+\ge x_0^+$. The case $c=0$ is trivial, where necessarily $x_+=0$ in order to have a feasible problem, and $v_+(c,x_+)=0$. So let $c\in(0,+\infty]$. The optimal solution to (11) in this case is

$$X_+^*(c,x_+) = (u_+')^{-1}\!\left(\frac{\lambda(c,x_+)\rho}{T_+'(F(\rho))}\right)\mathbf{1}_{\rho\le c} = \left(\frac{\lambda(c,x_+)\rho}{\alpha T_+'(F(\rho))}\right)^{1/(\alpha-1)}\mathbf{1}_{\rho\le c}.$$

To determine $\lambda(c,x_+)$, denote

$$\varphi(c) := E\left[\left(\frac{T_+'(F(\rho))}{\rho}\right)^{1/(1-\alpha)}\rho\,\mathbf{1}_{\rho\le c}\right] > 0, \quad 0<c\le+\infty.$$

Then the constraint $x_+ = E[\rho X_+^*(c,x_+)] = \varphi(c)\left(\frac{\lambda(c,x_+)}{\alpha}\right)^{1/(\alpha-1)}$ gives

$$\lambda(c,x_+) = \alpha\left(\frac{x_+}{\varphi(c)}\right)^{\alpha-1}, \quad 0<c\le+\infty,\ x_+\ge x_0^+.$$

This in turn determines

$$X_+^*(c,x_+) = \frac{x_+}{\varphi(c)}\left(\frac{T_+'(F(\rho))}{\rho}\right)^{1/(1-\alpha)}\mathbf{1}_{\rho\le c}, \quad 0<c\le+\infty,\ x_+\ge x_0^+, \tag{35}$$

and

$$v_+(c,x_+) = \left(\frac{x_+}{\varphi(c)}\right)^{\alpha}E\left[\left(\frac{\rho}{T_+'(F(\rho))}\right)^{\alpha/(\alpha-1)}T_+'(F(\rho))\,\mathbf{1}_{\rho\le c}\right] = \left(\frac{x_+}{\varphi(c)}\right)^{\alpha}\varphi(c) = \varphi(c)^{1-\alpha}x_+^\alpha, \quad 0<c\le+\infty,\ x_+\ge x_0^+. \tag{36}$$

Set $\tilde\varphi(c) := \varphi(c)$ if $0<c\le+\infty$ and $\tilde\varphi(c) := 0$ if $c=0$, which is a non-decreasing function right continuous at 0. Then Problem (14) specializes to

$$\begin{array}{ll}\text{Maximize} & v(c,x_+) = \tilde\varphi(c)^{1-\alpha}x_+^\alpha - \dfrac{k_-T_-(1-F(c))}{(E[\rho\mathbf{1}_{\rho>c}])^{\alpha}}(x_+-x_0)^\alpha,\\[2mm] \text{subject to} & 0\le c\le+\infty,\ x_+\ge x_0^+,\\ & x_+=0 \text{ when } c=0,\ x_+=x_0 \text{ when } c=+\infty.\end{array} \tag{37}$$

When $c>0$ (excluding $+\infty$) and $x_+\ge x_0^+$, write

$$v(c,x_+) = \varphi(c)^{1-\alpha}\left[x_+^\alpha - k(c)(x_+-x_0)^\alpha\right], \quad\text{where}\quad k(c) := \frac{k_-T_-(1-F(c))}{\varphi(c)^{1-\alpha}(E[\rho\mathbf{1}_{\rho>c}])^{\alpha}} > 0, \quad c>0.$$

Theorem 9.1 Assume that $x_0\ge 0$ and Assumption 4.1 holds. (i) If $\inf_{c>0}k(c)\ge 1$, then the optimal portfolio for Problem (6) is the replicating portfolio for the contingent claim

$$X^* = \frac{x_0}{\varphi(+\infty)}\left(\frac{T_+'(F(\rho))}{\rho}\right)^{1/(1-\alpha)}.$$

(ii) If $\inf_{c>0}k(c)<1$, then Problem (6) is ill-posed.

Proof:

Consider the problem $\max_{x\ge x_0}f(x)$ where $f(x) = x^\alpha-k(x-x_0)^\alpha$ and $k\ge 0$ is fixed. Since $f'(x)=\alpha[x^{\alpha-1}-k(x-x_0)^{\alpha-1}]$, we conclude that 1) if $k\ge 1$, then $f'(x)\le 0$ $\forall x\ge x_0$; therefore $x^*=x_0$ is optimal with the optimal value $x_0^\alpha$; and 2) if $k<1$, then $f(x)=x^\alpha[1-k(1-x_0/x)^\alpha]\to+\infty$ as $x\to+\infty$, implying that $\sup_{x\ge x_0}f(x)=+\infty$.

(i) If $\inf_{c>0}k(c)\ge 1$, then

$$\sup_{c>0,\,x_+\ge x_0^+}v(c,x_+) \equiv \sup_{c>0}\Big[\varphi(c)^{1-\alpha}\sup_{x_+\ge x_0}\big(x_+^\alpha-k(c)(x_+-x_0)^\alpha\big)\Big] = \sup_{c>0}\big[\varphi(c)^{1-\alpha}x_0^\alpha\big] = \varphi(+\infty)^{1-\alpha}x_0^\alpha \equiv v_+(+\infty,x_0) \ge 0.$$

However, when $c=0$ (and hence $x_+=0$) we have $v(c,x_+)=0$. As a result $(c^*,x_+^*)=(+\infty,x_0)$ is optimal for (37). Theorem 4.1 then applies to conclude that $X^*\equiv X_+^*(+\infty,x_0) = \frac{x_0}{\varphi(+\infty)}\left(\frac{T_+'(F(\rho))}{\rho}\right)^{1/(1-\alpha)}$ solves (7). Hence the optimal portfolio for (6) is the one that replicates $X^*$.

(ii) If $\inf_{c>0}k(c)<1$, then there is $c_0>0$ such that $k(c_0)<1$. In this case,

$$\sup_{c>0,\,x_+\ge x_0^+}v(c,x_+) \ge \varphi(c_0)^{1-\alpha}\sup_{x_+\ge x_0}\big[x_+^\alpha-k(c_0)(x_+-x_0)^\alpha\big] = +\infty.$$

The conclusion thus follows from Propositions 8.1 and 5.1.

Q.E.D.

Theorem 9.2 Assume that $x_0<0$ and Assumption 4.1 holds.

(i) If $\inf_{c>0}k(c)>1$, then Problem (6) is well-posed. Moreover, (6) admits an optimal portfolio if and only if

$$\operatorname*{argmin}_{c\ge 0}\left[\left(\frac{k_-T_-(1-F(c))}{(E[\rho\mathbf{1}_{\rho>c}])^{\alpha}}\right)^{1/(1-\alpha)}-\tilde\varphi(c)\right] \ne \emptyset. \tag{38}$$

Furthermore, if $c^*>0$ is one of the minimizers in (38), then the optimal portfolio is the one to replicate

$$X^* = \frac{x_+^*}{\varphi(c^*)}\left(\frac{T_+'(F(\rho))}{\rho}\right)^{1/(1-\alpha)}\mathbf{1}_{\rho\le c^*} - \frac{x_+^*-x_0}{E[\rho\mathbf{1}_{\rho>c^*}]}\mathbf{1}_{\rho>c^*}, \tag{39}$$

where $x_+^* := \frac{-x_0}{k(c^*)^{1/(1-\alpha)}-1}$; and if $c^*=0$ is the unique minimizer in (38), then the unique optimal portfolio is the one to replicate $X^* = \frac{x_0}{E\rho}$.

(ii) If $\inf_{c>0}k(c)=1$, then the supremum value of Problem (6) is 0, which is however not achieved by any admissible portfolio.

(iii) If $\inf_{c>0}k(c)<1$, then Problem (6) is ill-posed.

Proof:

We first consider a general optimization problem $\max_{x\ge 0}f(x)$ where $f(x) := x^\alpha-k(x-x_0)^\alpha$, $x_0<0$, and $k\ge 0$ is fixed, so that $f'(x)=\alpha[x^{\alpha-1}-k(x-x_0)^{\alpha-1}]$.

1) If $k>1$, then $f'(x)=0$ has the only solution $x^* = \frac{-x_0}{k^{1/(1-\alpha)}-1}>0$. Since $f'(x)>0$ for $0<x<x^*$ and $f'(x)<0$ for $x>x^*$, $x^*$ is the (only) maximum point with the maximum value $f(x^*) = (x^*)^\alpha[1-k(1-x_0/x^*)^\alpha] = -(-x_0)^\alpha[k^{1/(1-\alpha)}-1]^{1-\alpha}$.

2) If $k=1$, then $f'(x)>0$ $\forall x>0$. This means that the supremum of $f(x)$ on $x\ge 0$ is $\lim_{x\to+\infty}f(x)=0$; yet this value is not achieved by any $x\ge 0$.

3) If $k<1$, then $f(x)=x^\alpha[1-k(1-x_0/x)^\alpha]\to+\infty$ as $x\to+\infty$, implying that $\sup_{x\ge 0}f(x)=+\infty$.

We need to solve (37) to obtain $(c^*,x_+^*)$. Since $x_0<0$, $c=+\infty$ is infeasible; so we restrict $c\in[0,+\infty)$. Care must be taken to deal with the special solution $(c,x_+)=(0,0)$, for which $v(0,0) = -\frac{k_-}{(E[\rho])^{\alpha}}(-x_0)^\alpha$.

(i) If $\inf_{c>0}k(c)>1$, then

$$\sup_{c>0,\,x_+\ge x_0^+}v(c,x_+) \equiv \sup_{c>0}\Big[\varphi(c)^{1-\alpha}\sup_{x_+\ge 0}\big(x_+^\alpha-k(c)(x_+-x_0)^\alpha\big)\Big] = \sup_{c>0}\Big[-(-x_0)^\alpha\varphi(c)^{1-\alpha}\big(k(c)^{1/(1-\alpha)}-1\big)^{1-\alpha}\Big] = -(-x_0)^\alpha\left\{\inf_{c>0}\left[\left(\frac{k_-T_-(1-F(c))}{(E[\rho\mathbf{1}_{\rho>c}])^{\alpha}}\right)^{1/(1-\alpha)}-\varphi(c)\right]\right\}^{1-\alpha} < +\infty. \tag{40}$$

This yields that (6) is well-posed. Now, if $c^*>0$ achieves the infimum of $\left[\left(\frac{k_-T_-(1-F(c))}{(E[\rho\mathbf{1}_{\rho>c}])^{\alpha}}\right)^{1/(1-\alpha)}-\tilde\varphi(c)\right]$ over $c\ge 0$, then we have $\sup_{c>0,\,x_+\ge x_0^+}v(c,x_+)\ge -(-x_0)^\alpha\frac{k_-}{(E\rho)^{\alpha}} = v(0,0)$, which means that $c^*>0$ and $x_+^* = \frac{-x_0}{k(c^*)^{1/(1-\alpha)}-1}$ are optimal for (37). Theorem 4.1 then yields that the optimal portfolio is the one that replicates $X^*$ given by (39).

If $c^*=0$ is the unique point achieving the infimum of $\left[\left(\frac{k_-T_-(1-F(c))}{(E[\rho\mathbf{1}_{\rho>c}])^{\alpha}}\right)^{1/(1-\alpha)}-\tilde\varphi(c)\right]$ over $c\ge 0$, then

$$\sup_{c>0,\,x_+\ge x_0^+}v(c,x_+) = -(-x_0)^\alpha\left\{\inf_{c>0}\left[\left(\frac{k_-T_-(1-F(c))}{(E[\rho\mathbf{1}_{\rho>c}])^{\alpha}}\right)^{1/(1-\alpha)}-\varphi(c)\right]\right\}^{1-\alpha} < -(-x_0)^\alpha\frac{k_-}{(E\rho)^{\alpha}} \equiv v(0,0).$$

This implies that $(c^*,x_+^*)=(0,0)$ is uniquely optimal for (37), and the unique optimal solution for (7) is $X_+^*(c^*,x_+^*)\mathbf{1}_{\rho\le c^*}-\frac{x_+^*-x_0}{E[\rho\mathbf{1}_{\rho>c^*}]}\mathbf{1}_{\rho>c^*}\equiv\frac{x_0}{E\rho}$, for which the corresponding replicating portfolio is the risk-free one.

If the infimum $\inf_{c\ge 0}\left[\left(\frac{k_-T_-(1-F(c))}{(E[\rho\mathbf{1}_{\rho>c}])^{\alpha}}\right)^{1/(1-\alpha)}-\tilde\varphi(c)\right]$ is not attainable, then $\sup_{c>0,\,x_+\ge x_0^+}v(c,x_+) > -\frac{k_-}{(E[\rho])^{\alpha}}(-x_0)^\alpha = v(0,0)$, which means that $(0,0)$ is not optimal for (37). On the other hand, the optimum of (37) is not achieved at any $c>0$ and $x_+\ge 0$.

(ii) Suppose $\inf_{c>0}k(c)=1$. If $k(c)>1$ for any $c>0$, then

$$\sup_{c>0,\,x_+\ge x_0^+}v(c,x_+) = -(-x_0)^\alpha\Big\{\varphi(c)^{1-\alpha}\big[(\inf_{c>0}k(c))^{1/(1-\alpha)}-1\big]^{1-\alpha}\Big\} = 0.$$

Yet, for any $c>0$, $x_+\ge 0$, $v(c,x_+)\le\max_{x_+\ge 0}v(c,x_+) = -(-x_0)^\alpha\big[\varphi(c)^{1-\alpha}(k(c)^{1/(1-\alpha)}-1)^{1-\alpha}\big]<0$. Also, $v(0,0)<0$. Therefore the optimal value is not attainable.

On the other hand, if there exists $c^*>0$ such that $k(c^*)=1$, then $\sup_{c\ge 0,\,x_+\ge x_0^+}v(c,x_+)\ge\sup_{x_+\ge 0}v(c^*,x_+)=0$. However, $v(c,x_+)=\varphi(c)^{1-\alpha}[x_+^\alpha-k(c)(x_+-x_0)^\alpha]\le\varphi(c)^{1-\alpha}[x_+^\alpha-(x_+-x_0)^\alpha]<0$ $\forall c>0$, $x_+\ge 0$. Together with the fact that $v(0,0)<0$, we conclude that $\sup_{c\ge 0,\,x_+\ge x_0^+}v(c,x_+)=0$, which is however not achieved.

(iii) If $\inf_{c\ge 0}k(c)<1$, then there exists $c_0$ such that $k(c_0)<1$. As a result, $v(c_0,x_+)=\varphi(c_0)^{1-\alpha}[x_+^\alpha-k(c_0)(x_+-x_0)^\alpha]\to+\infty$ as $x_+\to+\infty$. Q.E.D.
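The three cases of the scalar problem used in the proof above can be checked numerically. The following is a minimal sketch, not part of the original argument: the numbers are illustrative assumptions (with `k` standing in for the value $k(c)$ at some fixed $c$), and the function names are hypothetical.

```python
def f(x, k, x0, alpha):
    """Objective f(x) = x**alpha - k*(x - x0)**alpha for x >= max(x0, 0)."""
    return x**alpha - k * (x - x0)**alpha


def argmax_closed_form(k, x0, alpha):
    """Stationary point -x0 / (k**(1/(1-alpha)) - 1); valid for x0 < 0, k > 1."""
    return -x0 / (k**(1.0 / (1.0 - alpha)) - 1.0)


def max_closed_form(k, x0, alpha):
    """Maximum value -(-x0)**alpha * (k**(1/(1-alpha)) - 1)**(1-alpha)."""
    return -((-x0)**alpha) * (k**(1.0 / (1.0 - alpha)) - 1.0)**(1.0 - alpha)


# Illustrative (assumed) numbers only.
alpha, k, x0 = 0.88, 2.25, -1.0
x_star = argmax_closed_form(k, x0, alpha)

# Brute-force check on a fine grid over [0, 0.01], which contains x_star.
grid = [i * 1e-6 for i in range(10001)]
x_grid = max(grid, key=lambda x: f(x, k, x0, alpha))

# With k < 1 the objective grows without bound (the ill-posed case).
growth = [f(10.0**p, 0.5, x0, alpha) for p in (2, 4, 6)]
```

For these inputs the grid maximizer agrees with the closed-form stationary point up to the grid spacing, while the values in `growth` keep increasing, in line with cases 1) and 3).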

We see that the key features of the underlying behavioral portfolio selection problem critically depend on the value $\inf_{c>0}k(c)$. Recall that $k(c)$, by its definition, reflects in a precise way the coordination among the utility functions, the probability distortions, and the market (represented by $\rho$). Let us elaborate on one particular point. In Tversky and Kahneman (1992), the parameters are taken, based on extensive experiments, to be $\alpha=0.88$ and $k_-=2.25>1$, the latter reflecting the fact that losses loom larger than gains: the pain associated with a loss is typically larger than the pleasure associated with an equivalent gain ($k_-$ is the so-called loss aversion coefficient). Now, by the definition of $k(c)$ we see that the larger the loss aversion, the more likely the underlying model is well-posed and solvable. The economic intuition behind this is that with a larger loss aversion coefficient it is not optimal to allocate all the fund to stocks (because stocks are risky and prone to losses), and hence one needs to carefully balance the investment between risky and risk-free assets, leading to a meaningful model.

Another interesting observation is that the optimal portfolios behave fundamentally differently depending on whether $x_0>0$ or $x_0<0$. If $x_0>0$, then the optimal strategy is simply to spend $x_0$ buying a contingent claim that delivers a payoff in excess of the reference point, reminiscent of a classical utility maximizing agent (although the allocation to stocks is "distorted" due to the probability distortion). If $x_0<0$, then the investor starts off in a loss situation and needs to get "out of the hole" as soon as possible. As a result, the optimal strategy is a gambling policy which involves raising additional capital to purchase a claim that delivers a higher payoff in the case of a good state of the market and incurs a fixed loss in the case of a bad one. Finally, if $x_0=0$, then the optimal portfolio is not to invest in risky assets at all. Notice that $x_0=0$ corresponds to a natural psychological reference point, the risk-free return, for many people. This, nonetheless, does explain why most households do not invest in equities at all. (A similar result is derived in Gomes (2005) for his portfolio selection model with loss averse investors, albeit in the single-period setting without probability distortions.)

Along the line of the discussions at the end of the last section, we would like to investigate more closely how exactly the behavioral criterion would affect the wealth allocation to risky assets. This is best explained through a very concrete example, where an optimal portfolio (not just an optimal terminal payoff) is explicitly available. We consider a model with the power utility $u_+(x)=x^\alpha$, $u_-(x)=k_-x^\alpha$, and all the market parameters (investment opportunity set) time-invariant: $r(\cdot)\equiv r$, $B(\cdot)=B$, $\sigma(t)=\sigma$, $\theta(\cdot)=\theta$. In this case $\rho(t,T) := \rho(T)/\rho(t)$, given $\mathcal{F}_t$, follows a lognormal distribution with parameter $(\mu_t,\sigma_t^2)$, where

$$\mu_t := -\left(r+|\theta|^2/2\right)(T-t), \quad \sigma_t^2 := |\theta|^2(T-t). \tag{41}$$

Furthermore, we set the distortion $T_+$ to be the one in Example 6.1 with $j(x) = a\mathbf{1}_{x\le c_0}+b\mathbf{1}_{x>c_0}$, where $c_0>0$ and $a<0<b<1$. We consider the case $x_0\ge 0$ and $\inf_{c>0}k(c)\ge 1$. (Other cases can also be done, which are left to interested readers.)

Theorem 10.1 Under the assumption of Theorem 9.1-(i), the optimal wealth-portfolio pair $(x^*(\cdot),\pi^*(\cdot))$ for Problem (6) is

$$x^*(t) = \frac{x_0}{\gamma}\left[x_1(t)+c_0^{(a-b)/(1-\alpha)}x_2(t)\right],$$

$$\pi^*(t) = \frac{x_0}{\gamma}\left[\frac{(1-a)x_1(t)+c_0^{(a-b)/(1-\alpha)}(1-b)x_2(t)}{1-\alpha}\right](\sigma\sigma')^{-1}B,$$

where $\psi(y) := (2\pi)^{-1/2}e^{-y^2/2}$ is the density function of a standard normal distribution, and

$$x_1(t) := \frac{\rho(t)^{(a-1)/(1-\alpha)}}{\sigma_t}\int_0^{c_0/\rho(t)}y^{(a-1)/(1-\alpha)}\psi\!\left(\frac{\ln y-\mu_t}{\sigma_t}\right)dy \equiv \frac{1}{\sigma_t\rho(t)}\int_0^{c_0}y^{(a-1)/(1-\alpha)}\psi\!\left(\frac{\ln y-\mu_t-\ln\rho(t)}{\sigma_t}\right)dy,$$

$$x_2(t) := \frac{\rho(t)^{(b-1)/(1-\alpha)}}{\sigma_t}\int_{c_0/\rho(t)}^{+\infty}y^{(b-1)/(1-\alpha)}\psi\!\left(\frac{\ln y-\mu_t}{\sigma_t}\right)dy \equiv \frac{1}{\sigma_t\rho(t)}\int_{c_0}^{+\infty}y^{(b-1)/(1-\alpha)}\psi\!\left(\frac{\ln y-\mu_t-\ln\rho(t)}{\sigma_t}\right)dy,$$

$$\gamma := E\left[\rho^{(a-\alpha)/(1-\alpha)}\mathbf{1}_{\rho\le c_0}+c_0^{(a-b)/(1-\alpha)}\rho^{(b-\alpha)/(1-\alpha)}\mathbf{1}_{\rho>c_0}\right].$$

Proof: It follows from (27) and (29) that

$$\frac{T_+'(F(\rho))}{\rho} \equiv \frac{H'(\rho)}{\rho F'(\rho)} = k\rho^{a-1}\mathbf{1}_{\rho\le c_0}+kc_0^{a-b}\rho^{b-1}\mathbf{1}_{\rho>c_0},$$

where $k^{-1} = e^{a\mu+a^2\sigma^2/2}N\!\left(\frac{\ln c_0-\mu-a\sigma^2}{\sigma}\right)+c_0^{a-b}e^{b\mu+b^2\sigma^2/2}\left[1-N\!\left(\frac{\ln c_0-\mu-b\sigma^2}{\sigma}\right)\right]$. Hence

$$\varphi(+\infty) := E\left[\left(\frac{T_+'(F(\rho))}{\rho}\right)^{1/(1-\alpha)}\rho\right] = k^{1/(1-\alpha)}E\left[\rho^{\frac{a-\alpha}{1-\alpha}}\mathbf{1}_{\rho\le c_0}+c_0^{\frac{a-b}{1-\alpha}}\rho^{\frac{b-\alpha}{1-\alpha}}\mathbf{1}_{\rho>c_0}\right] = k^{1/(1-\alpha)}\gamma.$$

Appealing to Theorem 9.1-(i), the optimal portfolio is the replicating portfolio for the claim

$$X^* = \frac{x_0}{\gamma}\left[\rho^{(a-1)/(1-\alpha)}\mathbf{1}_{\rho\le c_0}+c_0^{(a-b)/(1-\alpha)}\rho^{(b-1)/(1-\alpha)}\mathbf{1}_{\rho>c_0}\right].$$

Let $(x_1(\cdot),\pi_1(\cdot))$ replicate $\rho^{(a-1)/(1-\alpha)}\mathbf{1}_{\rho\le c_0}$ and $(x_2(\cdot),\pi_2(\cdot))$ replicate $\rho^{(b-1)/(1-\alpha)}\mathbf{1}_{\rho>c_0}$. Then the results in Appendix B yield

$$x_1(t) = \frac{\rho(t)^{(a-1)/(1-\alpha)}}{\sigma_t}\int_0^{c_0/\rho(t)}y^{(a-1)/(1-\alpha)}\psi\!\left(\frac{\ln y-\mu_t}{\sigma_t}\right)dy,$$

$$\pi_1(t) = -\left[\frac{a-1}{1-\alpha}x_1(t)-\frac{1}{\sigma_t\rho(t)}c_0^{(a-\alpha)/(1-\alpha)}\psi\!\left(\frac{\ln c_0-\mu_t-\ln\rho(t)}{\sigma_t}\right)\right](\sigma\sigma')^{-1}B,$$

$$x_2(t) = \frac{\rho(t)^{(b-1)/(1-\alpha)}}{\sigma_t}\int_{c_0/\rho(t)}^{+\infty}y^{(b-1)/(1-\alpha)}\psi\!\left(\frac{\ln y-\mu_t}{\sigma_t}\right)dy,$$

$$\pi_2(t) = -\left[\frac{b-1}{1-\alpha}x_2(t)+\frac{1}{\sigma_t\rho(t)}c_0^{(b-\alpha)/(1-\alpha)}\psi\!\left(\frac{\ln c_0-\mu_t-\ln\rho(t)}{\sigma_t}\right)\right](\sigma\sigma')^{-1}B.$$

Combining these two portfolios linearly (the two boundary terms in $\psi$ cancel, since $c_0^{(a-b)/(1-\alpha)}c_0^{(b-\alpha)/(1-\alpha)} = c_0^{(a-\alpha)/(1-\alpha)}$), we obtain the desired result.

Q.E.D.

Now consider the case when $c_0=1$ for ease of exposition. In this case the optimal portfolio can be simplified to

$$\pi^*(t) = \frac{x_0}{\gamma}\left[\frac{(1-a)x_1(t)+(1-b)x_2(t)}{1-\alpha}\right](\sigma\sigma')^{-1}B = \frac{(1-a)x_1(t)+(1-b)x_2(t)}{x_1(t)+x_2(t)}\cdot\frac{x^*(t)}{1-\alpha}(\sigma\sigma')^{-1}B = \left(1-\frac{ax_1(t)+bx_2(t)}{x_1(t)+x_2(t)}\right)\frac{x^*(t)}{1-\alpha}(\sigma\sigma')^{-1}B,$$

or the optimal ratio in risky assets is

$$\frac{\pi^*(t)}{x^*(t)} = (1-\alpha)^{-1}\left(1-\frac{ax_1(t)+bx_2(t)}{x_1(t)+x_2(t)}\right)(\sigma\sigma')^{-1}B. \tag{42}$$

Recall that in the conventional expected utility model with the utility function $u(x)=x^\alpha$ and without distortion, the optimal ratio in risky assets is

$$\frac{\hat\pi(t)}{\hat x(t)} = (1-\alpha)^{-1}(\sigma\sigma')^{-1}B. \tag{43}$$

So when

$$\frac{b}{-a} > \frac{x_1(t)}{x_2(t)} = \frac{\int_0^1 y^{(a-1)/(1-\alpha)}\psi\!\left(\frac{\ln y-\mu_t-\ln\rho(t)}{\sigma_t}\right)dy}{\int_1^{+\infty}y^{(b-1)/(1-\alpha)}\psi\!\left(\frac{\ln y-\mu_t-\ln\rho(t)}{\sigma_t}\right)dy},$$

the investor underweights the risky assets in her portfolio compared with the one dictated by the conventional utility model, and vice versa.
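The comparison between (42) and (43) can be evaluated numerically. The sketch below is an illustration under stated assumptions rather than the authors' computation: it uses the closed forms of $x_1(t)$ and $x_2(t)$ obtained by the substitution $u=\ln y$ (a lognormal partial-moment calculation), takes $c_0=1$, and all parameter values and function names are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def claim_prices(rho_t, mu_t, sigma_t, a, b, alpha, c0=1.0):
    """Closed forms of x_1(t) and x_2(t) after substituting u = ln y."""
    m = mu_t + math.log(rho_t)  # mean of ln rho(T) given rho(t)
    qa = (a - alpha) / (1.0 - alpha)
    qb = (b - alpha) / (1.0 - alpha)

    def moment(q, upper_tail):
        z = (math.log(c0) - m - q * sigma_t**2) / sigma_t
        mass = norm_cdf(-z) if upper_tail else norm_cdf(z)
        return math.exp(q * m + 0.5 * q**2 * sigma_t**2) * mass / rho_t

    return moment(qa, False), moment(qb, True)


def risky_ratio_multiplier(x1, x2, a, b):
    """Factor by which (42) scales the undistorted ratio (43) when c0 = 1."""
    return 1.0 - (a * x1 + b * x2) / (x1 + x2)


def x1_by_quadrature(rho_t, mu_t, sigma_t, a, alpha, c0=1.0, n=20000):
    """Trapezoidal check of x_1(t), integrating in the variable u = ln y."""
    m = mu_t + math.log(rho_t)
    q = (a - alpha) / (1.0 - alpha)
    lo, hi = m - 12.0 * sigma_t, math.log(c0)
    h = (hi - lo) / n

    def g(u):
        z = (u - m) / sigma_t
        return math.exp(q * u - 0.5 * z * z) / (sigma_t * math.sqrt(2.0 * math.pi))

    total = 0.5 * (g(lo) + g(hi)) + sum(g(lo + i * h) for i in range(1, n))
    return total * h / rho_t


# Hypothetical market and preference parameters, chosen only for illustration.
r, theta, horizon = 0.03, 0.3, 1.0
mu_t = -(r + 0.5 * theta**2) * horizon
sigma_t = theta * math.sqrt(horizon)
a, b, alpha = -0.2, 0.3, 0.5
x1, x2 = claim_prices(1.0, mu_t, sigma_t, a, b, alpha)
multiplier = risky_ratio_multiplier(x1, x2, a, b)
```

A multiplier below 1 corresponds to underweighting risky assets relative to (43), which for $a<0$ happens exactly when $b/(-a) > x_1(t)/x_2(t)$.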

11 Concluding Remarks

In this paper we introduce, to the best of our knowledge for the first time in the literature, a general continuous-time portfolio selection model within the framework of the cumulative prospect theory, so as to account for human psychology and emotions in investment activities. The model features inherent difficulties, including non-convex/concave and non-smooth (overall) utility functions and probability distortions. Even the well-posedness of such a model becomes more an exception than a rule: we demonstrate that a well-posed model calls for a careful coordination among the underlying market, the utility function, and the probability distortions. We then develop an approach to solving the model thoroughly. The approach is largely different from the existing ones employed in the conventional dynamic asset allocation models. Notwithstanding the complexity of the approach, the final solution turns out to be simply structured: the optimal terminal payoff is related to certain binary options characterized by a single number, and the optimal strategy is an aggressive gambling policy betting on good states of the market. Finally, we apply the general results to a specific case with a two-piece CRRA utility function, and show how the behavioral criterion will change the risky allocation.

The equity premium puzzle [Mehra and Prescott (1985)] refers to the phenomenon that observed average annual returns on stocks over the past century are higher by a large margin (approximately 6 percentage points) than returns on government bonds, whereas standard asset allocation theories (such as that based on the utility model) predict that the difference in returns between these two investments should be much smaller. Benartzi and Thaler (1995) proposed an explanation for the puzzle using prospect theory (in single period and without probability distortion). In Section 10 we demonstrate that the investor would indeed underweight stocks in her portfolio under certain conditions. We are not claiming that we have provided a satisfactory explanation of the equity premium puzzle in the continuous-time setting; but we do hope that research along this line will shed light on eventually solving the puzzle.

It should be emphasized again that the agent under study in this paper is a "small investor" in that his behavior will not affect the market. Hence we can still comfortably assume some market properties, such as the absence of arbitrage and the market completeness, as usually imposed for the conventional utility model. (It remains an interesting problem to study a behavioral model in an incomplete market.) It is certainly a fascinating and challenging problem to study how the overall market might be changed by the joint behaviors of investors; e.g., a "behavioral" capital asset pricing model.

Let us also mention an ongoing work [He and Zhou (2007)] on behavioral portfolio choice in single period, featuring both S-shaped utilities and probability distortions. Perversely, the single-period model is equally difficult, and calls for a technique quite different from its continuous-time counterpart to tackle. Only some special cases have been solved, which are used to study the equity premium puzzle more closely.

To conclude, this work is meant to be initiating and inspiring, rather than exhaustive and conclusive, for the research on intertemporal behavioral portfolio allocation.

Appendix

A An Inequality

Lemma A.1 Let $f:\mathbb{R}_+\to\mathbb{R}_+$ be a non-decreasing function with $f(0)=0$. Then

$$xy \le \int_0^x f^{-1}(t)\,dt+\int_0^y f(t)\,dt \quad \forall x\ge 0,\ y\ge 0,$$

and the equality holds if and only if $f(y-)\le x\le f(y+)$.

Proof: By interpreting the integrations involved as the appropriate areas, we have

$$\int_0^y f(t)\,dt = yf(y)-\int_0^{f(y)}f^{-1}(t)\,dt.$$

Define $g(x,y) := \int_0^x f^{-1}(t)\,dt+\int_0^y f(t)\,dt-xy$. Then

$$g(x,y) = y(f(y)-x)+\int_{f(y)}^x f^{-1}(t)\,dt = \int_{f(y)}^x (f^{-1}(t)-y)\,dt.$$

We now consider all the possible cases. First, if $x<f(y-)$, then $f^{-1}(t)\le y$ $\forall t<f(y)$. Therefore $g(x,y) = \int_x^{f(y)}(y-f^{-1}(t))\,dt\ge 0$. Moreover, in this case there exists $z>x$ such that $z<f(y-)$, which implies $y>f^{-1}(z)$ (otherwise $f(y-\epsilon)<z$ $\forall\epsilon>0$, leading to $z\ge f(y-)$). The monotonicity of $f^{-1}$ yields $y>f^{-1}(t)$ for any $t\le z$. Hence $g(x,y) = \int_x^{f(y)}(y-f^{-1}(t))\,dt\ge\int_x^z(y-f^{-1}(t))\,dt>0$.

Next, let $x\in[f(y-),f(y)]$. Since $f^{-1}(t)\le y$ $\forall t<f(y)$, and $f^{-1}(t)\ge y$ $\forall t>x\ge f(y-)$, we have $g(x,y) = \int_x^{f(y)}(y-f^{-1}(t))\,dt=0$.

Symmetrically, we can prove that $g(x,y)=0$ when $x\in[f(y),f(y+)]$, and $g(x,y)>0$ when $x>f(y+)$. The proof is complete. Q.E.D.
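Lemma A.1 can be sanity-checked on a simple instance. The sketch below assumes the particular choice $f(t)=t^2$ (so $f^{-1}(t)=\sqrt t$), for which both integrals have closed forms; the function name is hypothetical. Because this $f$ is continuous, the equality condition reduces to $x=f(y)$.

```python
def young_gap(x, y):
    """Gap g(x, y) of Lemma A.1 for f(t) = t**2, whose inverse is sqrt(t)."""
    integral_of_inverse = (2.0 / 3.0) * x**1.5   # int_0^x sqrt(t) dt
    integral_of_f = y**3 / 3.0                   # int_0^y t**2 dt
    return integral_of_inverse + integral_of_f - x * y


points = [0.25 * i for i in range(21)]           # grid on [0, 5]

# The gap should be non-negative everywhere ...
smallest_gap = min(young_gap(x, y) for x in points for y in points)

# ... and vanish on the curve x = f(y) = y**2.
gap_on_curve = max(abs(young_gap(y * y, y)) for y in points)
```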

B Two Auxiliary Optimization Problems

In this subsection we solve two auxiliary optimization problems, which play a key role in simplifying the behavioral portfolio selection model.

Let $Y$ be a given strictly positive random variable on $(\Omega,\mathcal{F},P)$ with the probability distribution function $F(\cdot)$. Let $G(\cdot)$ be another given distribution function with $G(0)=0$. Consider the following two optimization problems:

$$\begin{array}{ll}\text{Maximize} & E[XY]\\ \text{subject to} & P(X\le x)=G(x)\ \ \forall x\in\mathbb{R},\end{array} \tag{44}$$

and

$$\begin{array}{ll}\text{Minimize} & E[XY]\\ \text{subject to} & P(X\le x)=G(x)\ \ \forall x\in\mathbb{R}.\end{array} \tag{45}$$

These are two highly non-convex optimization problems.

Lemma B.1 (i) Let $h(\cdot)$ be a non-decreasing function. If $X$ and $h(Y)$ share the same distribution, then $E[XY]\le E[h(Y)Y]$, while the equality holds if and only if $X\in[h(Y-),h(Y+)]$ a.s.

(ii) Let $h(\cdot)$ be a non-increasing function. If $X$ and $h(Y)$ share the same distribution, then $E[XY]\ge E[h(Y)Y]$, while the equality holds if and only if $X\in[h(Y-),h(Y+)]$ a.s.

Proof: (i) First assume $h(0)=0$. Employing Lemma A.1, together with the assumption that $X$ and $h(Y)$ have the same distribution, we have

$$E[XY] \le E\left[\int_0^X h^{-1}(u)\,du\right]+E\left[\int_0^Y h(u)\,du\right] = E\left[\int_0^{h(Y)}h^{-1}(u)\,du\right]+E\left[\int_0^Y h(u)\,du\right] = E[h(Y)Y],$$

and the equality holds if and only if $X\in[h(Y-),h(Y+)]$ a.s.

For the general case when $h(0)\ne 0$, define $\bar h(x) := h(x)-h(0)$. Then

$$E[XY] = E[(X-h(0))Y]+h(0)EY \le E[\bar h(Y)Y]+h(0)EY = E[h(Y)Y].$$

(ii) It is straightforward by applying the result in (i) to $-X$ and $-h(Y)$. Q.E.D.

Theorem B.1 Assume that $Y$ admits no atom.

(i) Define $X^* := G^{-1}(F(Y))$. Then $E[X^*Y]\ge E[XY]$ for any feasible solution $X$ of Problem (44). If in addition $E[X^*Y]<+\infty$, then $X^*$ is the unique (in the sense of almost surely) optimal solution for (44).

(ii) Define $X^* := G^{-1}(1-F(Y))$. Then $E[X^*Y]\le E[XY]$ for any feasible solution $X$ of Problem (45). If in addition $E[X^*Y]<+\infty$, then $X^*$ is the unique optimal solution for (45).

Proof: First of all note that $Z := F(Y)$ follows the uniform distribution on the (open or closed) unit interval.

(i) Define $h(x) := G^{-1}(F(x))$. Then $P\{h(Y)\le x\} = P\{Z\le G(x)\} = G(x)$, and $h(\cdot)$ is non-decreasing. By Lemma B.1, $E[X^*Y]\ge E[XY]$ for any feasible solution $X$ of Problem (44), where $X^* := h(Y)$. Furthermore, if $E[X^*Y]<+\infty$, and there is $X$ which is optimal for (44), then $E[XY]=E[X^*Y]$. By Lemma B.1, $X\in[h(Y-),h(Y+)]$ a.s. Since $h(\cdot)$ is non-decreasing, its set of discontinuous points is at most countable. However, $Y$ admits no atom; hence $h(Y-)=h(Y+)=h(Y)$, a.s., which implies that $X=h(Y)=X^*$, a.s. Therefore we have proved that $X^*$ is the unique optimal solution for (44).

(ii) Define $h(x) := G^{-1}(1-F(x))$. It is immediate that $P\{h(Y)\le x\}=G(x)$, and $h(\cdot)$ is non-increasing. Applying Lemma B.1 and a similar argument as in (i) we obtain the desired result. Q.E.D.
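A finite-sample analogue of Theorem B.1 may help intuition: with the empirical distribution of $X$ held fixed, pairing values by rank with $Y$ maximizes the average of $XY$, and pairing by reversed rank minimizes it. The sketch below is only an illustration of this discrete rearrangement idea; the sample laws, sample size and helper names are assumptions made here.

```python
import random


def mean_product(xs, ys):
    """Sample analogue of E[XY]."""
    return sum(x * y for x, y in zip(xs, ys)) / len(ys)


def rearrange(x_values, ys, comonotone=True):
    """Reassign x_values to the positions of ys by rank (or reversed rank)."""
    ranked_x = sorted(x_values, reverse=not comonotone)
    order = sorted(range(len(ys)), key=lambda i: ys[i])
    xs = [0.0] * len(ys)
    for rank, i in enumerate(order):
        xs[i] = ranked_x[rank]
    return xs


rng = random.Random(0)
ys = [rng.lognormvariate(0.0, 0.5) for _ in range(200)]   # strictly positive Y
x_values = [rng.expovariate(1.0) for _ in range(200)]     # fixed empirical law of X

upper = mean_product(rearrange(x_values, ys, True), ys)    # comonotone pairing
lower = mean_product(rearrange(x_values, ys, False), ys)   # anti-comonotone pairing

# Random pairings with the same marginal law fall between the two extremes.
trials = []
for _ in range(500):
    shuffled = x_values[:]
    rng.shuffle(shuffled)
    trials.append(mean_product(shuffled, ys))
```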

The preceding theorem shows that the optimal solution to (44) is comonotonic with $Y$, and that to (45) is anti-comonotonic with $Y$.

C A Choquet Maximization Problem

Consider a general utility maximization problem involving the Choquet integral:

$$\begin{array}{ll}\text{Maximize} & V(X) = \int_0^{+\infty}T(P\{u(X)>y\})\,dy\\ \text{subject to} & E[\xi X]=a,\ X\ge 0,\end{array} \tag{46}$$

where $\xi$ is a given strictly positive random variable, with no atom and whose distribution function is $F_\xi(\cdot)$, $a\ge 0$, $T:[0,1]\to[0,1]$ is a strictly increasing, differentiable function with $T(0)=0$, $T(1)=1$, and $u(\cdot)$ is a strictly concave, strictly increasing, twice differentiable function with $u(0)=0$, $u'(0)=+\infty$, $u'(+\infty)=0$.

The case $a=0$ is trivial, where $X^*=0$ is the only feasible, and hence optimal, solution. So we assume $a>0$ in what follows. Problem (46) is a non-convex optimization problem with a constraint; thus the normal technique like Lagrange multiplier does not apply directly. The approach we develop here is to change the decision variable and turn the problem into a convex problem through a series of transformations. To start with, we have the following lemma.

Lemma C.1 If Problem (46) admits an optimal solution $X^*$ whose distribution function is $G(\cdot)$, then $X^* = G^{-1}(1-F_\xi(\xi))$, a.s.

Proof: Since $a>0$, one has $G(t)\not\equiv 1$. Denote $\bar X := G^{-1}(1-F_\xi(\xi))$. Notice that $1-F_\xi(\xi)\sim U(0,1)$; thus $\bar X$ has the same distribution as $X^*$ and $E[\xi\bar X]>0$. If $X^*=\bar X$ a.s. is not true, then it follows from the uniqueness result in Theorem B.1 that $E[\xi\bar X]<E[\xi X^*]=a$. Define $X_1 := k\bar X$, where $k := a/E[\xi\bar X]>1$. Then $X_1$ is feasible for (46), and $V(X_1)>V(\bar X)=V(X^*)$, which contradicts the optimality of $X^*$. Q.E.D.

Lemma C.1 implies that an optimal solution to (46), if it exists, must be anti-comonotonic with $\xi$. Denote $Z := 1-F_\xi(\xi)$. Then $Z$ follows $U(0,1)$, and $\xi = F_\xi^{-1}(1-Z)$, a.s., thanks to $\xi$ being atomless. Lemma C.1 suggests that in order to solve (46) one needs only to seek among random variables in the form $G^{-1}(Z)$, where $G$ is the distribution function of a nonnegative random variable [i.e., $G$ is non-decreasing, càdlàg, with $G(0-)=0$, $G(+\infty)=1$]. Motivated by this observation, we introduce the following problem:

$$\begin{array}{ll}\text{Maximize} & v(G) := \int_0^{+\infty}T(P\{u(G^{-1}(Z))>t\})\,dt\\ \text{subject to} & E[G^{-1}(Z)F_\xi^{-1}(1-Z)]=a,\\ & G \text{ is the distribution function of a nonnegative random variable.}\end{array} \tag{47}$$

The following result, which is straightforward in view of Lemma C.1, stipulates that Problem (47) is equivalent to Problem (46).

Proposition C.1 If $G^*$ is optimal for (47), then $X^* := (G^*)^{-1}(Z)$ is optimal for (46). Conversely, if $X^*$ is optimal for (46), then its distribution function $G^*$ is optimal for (47) and $X^* = (G^*)^{-1}(Z)$, a.s.

Now we turn to Problem (47). Denoting $\bar T(x) := T(1-x)$, $x\in[0,1]$, and $\bar u := \sup_{x\in\mathbb{R}_+}u(x)$, we have

$$v(G) = \int_0^{\bar u}\bar T\big(P\{u(G^{-1}(Z))\le y\}\big)\,dy = \int_0^{\bar u}\bar T\big(P\{Z\le G(u^{-1}(y))\}\big)\,dy = \int_0^{\bar u}\bar T(G(u^{-1}(y)))\,dy$$

$$= \int_0^1 u(G^{-1}(\bar T^{-1}(t)))\,dt = -\int_0^1 u(G^{-1}(s))\bar T'(s)\,ds = \int_0^1 u(G^{-1}(s))T'(1-s)\,ds = E\big[u(G^{-1}(Z))T'(1-Z)\big].$$

Denoting $\Gamma := \{g:[0,1)\to\mathbb{R}_+ \text{ is non-decreasing, left continuous, with } g(0)=0\}$, and considering $g=G^{-1}$, we can rewrite Problem (47) into

$$\begin{array}{ll}\text{Maximize} & \bar v(g) := E[u(g(Z))T'(1-Z)]\\ \text{subject to} & E[g(Z)F_\xi^{-1}(1-Z)]=a,\ g\in\Gamma.\end{array} \tag{48}$$

Some remarks on the set $\Gamma$ are in order. Since any given $g\in\Gamma$ is left continuous, we can always extend it to a map from $[0,1]$ to $\mathbb{R}_+\cup\{+\infty\}$ by setting $g(1) := g(1-)$. It is easy to see that $g(1)<+\infty$ if and only if the corresponding random variable $\eta$ (i.e., $\eta$ is such a random variable whose distribution function has an inverse identical to $g$) is almost surely bounded from above.

Since $T'(\cdot)>0$ and $u(\cdot)$ is concave, the objective functional of (48) is now concave in $g$. On the other hand, the constraint functional $E[g(Z)F_\xi^{-1}(1-Z)]$ is linear in $g$. Hence we can use the Lagrange method to remove this linear constraint as follows. For a given $\lambda\in\mathbb{R}$,

$$\begin{array}{ll}\text{Maximize} & \tilde v_\lambda(g) := E\big[u(g(Z))T'(1-Z)-\lambda g(Z)F_\xi^{-1}(1-Z)\big]\\ \text{subject to} & g\in\Gamma,\end{array} \tag{49}$$

and then determine $\lambda$ via the original linear constraint.

Although Problem (49) is a convex optimization problem in $g$, it has an implicit constraint that $g$ be non-decreasing; hence it is very complex. Let us ignore this constraint for the moment. For each fixed $z\in(0,1)$ we maximize $u(g(z))T'(1-z)-\lambda g(z)F_\xi^{-1}(1-z)$ over $g(z)\in\mathbb{R}_+$. The zero-derivative condition gives $g(z) = (u')^{-1}(\lambda F_\xi^{-1}(1-z)/T'(1-z))$. Now, if $F_\xi^{-1}(z)/T'(z)$ happens to be non-decreasing in $z\in(0,1]$, then $g(z)$ is non-decreasing in $z\in[0,1)$ and, hence, it solves (49). On the other hand, if $F_\xi^{-1}(z)/T'(z)$ is not non-decreasing, then it remains an open problem to express explicitly the optimal solution to (49).

Denote $R_u(x) := -\frac{xu''(x)}{u'(x)}$, $x>0$, which is the Arrow–Pratt index of relative risk aversion of the utility function $u(\cdot)$.

Proposition C.2 Assume that $F_\xi^{-1}(z)/T'(z)$ is non-decreasing in $z\in(0,1]$ and $\liminf_{x\to+\infty}R_u(x)>0$. Then the following claims are equivalent:

(i) Problem (48) is well-posed for any $a>0$.

(ii) Problem (48) admits a unique optimal solution for any $a>0$.

(iii) $E\left[u\!\left((u')^{-1}\!\left(\frac{\xi}{T'(F_\xi(\xi))}\right)\right)T'(F_\xi(\xi))\right]<+\infty$.

(iv) $E\left[u\!\left((u')^{-1}\!\left(\frac{\lambda\xi}{T'(F_\xi(\xi))}\right)\right)T'(F_\xi(\xi))\right]<+\infty$ $\forall\lambda>0$.

Furthermore, when one of the above (i)–(iv) holds, the optimal solution to (48) is

$$g^*(x) \equiv (G^*)^{-1}(x) = (u')^{-1}\!\left(\frac{\lambda F_\xi^{-1}(1-x)}{T'(1-x)}\right), \quad x\in[0,1),$$

where $\lambda>0$ is the one satisfying $E[(G^*)^{-1}(1-F_\xi(\xi))\xi]=a$.

Proof: Since $T'(1-Z)>0$ and $E[T'(1-Z)] = \int_0^1 T'(x)\,dx = T(1)-T(0) = 1$, we can define a new probability measure $\tilde P$ whose expectation is $\tilde E(X) := E[T'(1-Z)X]$. Denote $\zeta := \frac{F_\xi^{-1}(1-Z)}{T'(1-Z)}\equiv\frac{\xi}{T'(F_\xi(\xi))}$. Then $\zeta>0$ a.s., and Problem (48) can be rewritten in terms of the probability measure $\tilde P$ as follows:

$$\begin{array}{ll}\text{Maximize} & \bar v(g) := \tilde E[u(g(Z))]\\ \text{subject to} & \tilde E[\zeta g(Z)]=a,\ g\in\Gamma.\end{array} \tag{50}$$

By Jin, Xu and Zhou (2007, Theorem 6) and the fact that $g^*(x) = (u')^{-1}\!\left(\frac{\lambda F_\xi^{-1}(1-x)}{T'(1-x)}\right)$ is automatically non-decreasing in $x$, we get the desired result. Q.E.D.
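The identity $v(G) = E[u(G^{-1}(Z))T'(1-Z)]$ derived above for the Choquet integral can be verified numerically on one concrete instance. The sketch below is an illustration only: it assumes $X=G^{-1}(Z)$ is standard exponential, $u(x)=\sqrt x$ and $T(p)=p^2$, choices made here because both sides then equal $\sqrt{\pi/8}$ in closed form; the function names are hypothetical.

```python
import math


def distortion(p):
    """Assumed distortion T(p) = p**2, so that T'(p) = 2p."""
    return p * p


def choquet_side(n=100000, y_max=8.0):
    """Midpoint rule for int_0^inf T(P{u(X) > y}) dy with P{sqrt(X) > y} = exp(-y**2)."""
    h = y_max / n
    return h * sum(distortion(math.exp(-((i + 0.5) * h) ** 2)) for i in range(n))


def quantile_side(n=100000):
    """Midpoint rule for int_0^1 u(G^{-1}(s)) T'(1 - s) ds."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        quantile = -math.log(1.0 - s)                      # G^{-1}(s) for Exp(1)
        total += math.sqrt(quantile) * 2.0 * (1.0 - s)     # u(G^{-1}(s)) * T'(1 - s)
    return h * total


lhs, rhs = choquet_side(), quantile_side()
exact = math.sqrt(math.pi / 8.0)   # common closed-form value of both sides
```

Both quadratures should agree with `exact` up to discretization error, which is the content of the change of variables used to pass from (47) to (48).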

We now summarize all the results above in the following theorem.

Theorem C.1 Assume that $F_\xi^{-1}(z)/T'(z)$ is non-decreasing in $z\in(0,1]$ and $\liminf_{x\to+\infty}R_u(x)>0$. Define $X(\lambda) := (u')^{-1}\!\left(\frac{\lambda\xi}{T'(F_\xi(\xi))}\right)$ for $\lambda>0$. If $V(X(1))<+\infty$, then $X(\lambda)$ is an optimal solution for Problem (46), where $\lambda$ is the one satisfying $E[\xi X(\lambda)]=a$. If $V(X(1))=+\infty$, then Problem (46) is ill-posed.

To conclude this subsection, we state a necessary condition of optimality for Problem (46), which is useful in solving Problem (13) in Step 2.

Lemma C.2 If $g$ is optimal for (49), then either $g\equiv 0$ or $g(x)>0$ $\forall x>0$.

Proof: Suppose $g\not\equiv 0$. We now show that $g(x)>0$ $\forall x>0$. If not, define $\delta := \inf\{x>0: g(x)>0\}$. Then $0<\delta<1$, and $g(x)=0$ $\forall x\in[0,\delta]$.

For any $y>0$, let $\epsilon(y) := \inf\{x>0: g(\delta+x)>y\}$ and

$$g_y(x) := \begin{cases}0, & x\in[0,\delta/2],\\ y, & x\in(\delta/2,\delta+\epsilon(y)],\\ g(x), & x\in(\delta+\epsilon(y),1).\end{cases}$$

Then

$$E[u(g_y(Z))T'(1-Z)-\lambda g_y(Z)F_\xi^{-1}(1-Z)]-E[u(g(Z))T'(1-Z)-\lambda g(Z)F_\xi^{-1}(1-Z)]$$

$$= \int_{\delta/2}^{\delta+\epsilon(y)}[u(y)T'(1-x)-\lambda yF_\xi^{-1}(1-x)]\,dx-\int_{\delta}^{\delta+\epsilon(y)}[u(g(x))T'(1-x)-\lambda g(x)F_\xi^{-1}(1-x)]\,dx$$

$$\ge \int_{\delta/2}^{\delta+\epsilon(y)}[u(y)T'(1-x)-\lambda yF_\xi^{-1}(1-x)]\,dx-\int_{\delta}^{\delta+\epsilon(y)}u(g(x))T'(1-x)\,dx$$

$$\ge u(y)\int_{\delta/2}^{\delta}T'(1-x)\,dx-\lambda y\int_{\delta/2}^{\delta+\epsilon(y)}F_\xi^{-1}(1-x)\,dx$$

$$= y\left[\frac{u(y)}{y}\big(T(1-\delta/2)-T(1-\delta)\big)-\lambda\int_{\delta/2}^{\delta+\epsilon(y)}F_\xi^{-1}(1-x)\,dx\right].$$

Since $\frac{u(y)}{y}\to+\infty$ as $y\to 0$ and $T(1-\delta/2)-T(1-\delta)>0$, we have

$$\frac{u(y)}{y}\big(T(1-\delta/2)-T(1-\delta)\big)\to+\infty \text{ as } y\to 0.$$

On the other hand, $\epsilon(y)\to 0$ as $y\to 0$; hence

$$\int_{\delta/2}^{\delta+\epsilon(y)}F_\xi^{-1}(1-x)\,dx \le (\epsilon(y)+\delta/2)F_\xi^{-1}(1-\delta/2)\to(\delta/2)F_\xi^{-1}(1-\delta/2) \text{ as } y\to 0.$$

Consequently,

$$\frac{u(y)}{y}\big(T(1-\delta/2)-T(1-\delta)\big)-\lambda\int_{\delta/2}^{\delta+\epsilon(y)}F_\xi^{-1}(1-x)\,dx\to+\infty \text{ as } y\to 0.$$

Fix $y>0$ small enough that

$$E[u(g_y(Z))T'(1-Z)-\lambda g_y(Z)F_\xi^{-1}(1-Z)]-E[u(g(Z))T'(1-Z)-\lambda g(Z)F_\xi^{-1}(1-Z)]\ge y>0,$$

which implies that $g_y$ is strictly better than $g$ for (49). Q.E.D.

Theorem C.2 If $X^*$ is an optimal solution for (46) with some $a>0$, then $P(X^*=0)=0$.

Proof: Proposition C.1 implies that the distribution function $G^*$ of $X^*$ is optimal for (47). By the Lagrange method there exists $\lambda\ge 0$ such that $(G^*)^{-1}$ is optimal for (49). Since $a>0$, $(G^*)^{-1}\not\equiv 0$. It follows from Lemma C.2 that $(G^*)^{-1}(x)>0$ $\forall x>0$, or $G^*(0)=0$. Q.E.D.

So an optimal solution to (46) with a positive initial budget is positive almost surely.

D A Choquet Minimization Problem

Consider a general utility minimization problem involving the Choquet integral:

$$\text{Minimize } V(X) := \int_0^{+\infty} T\big(P\{u(X) > y\}\big)\,dy \quad \text{subject to } E[\xi X] = a,\ X \ge 0, \tag{51}$$

where $\xi$, $a$, $T(\cdot)$ satisfy the same assumptions as those with Problem (46), and $u(\cdot)$ is strictly increasing, concave with $u(0) = 0$.

It is easy to see that (51) always admits feasible solutions (e.g., $X = x\mathbf{1}_{\xi \le \xi_0}$ is feasible with appropriate $x \in \mathbb{R}$, $\xi_0 \in \mathbb{R}$); hence the optimal value of (51) is a finite nonnegative number.

In view of Theorem B.1, a similar argument to that in Appendix C reveals that the optimal solution $X^*$ to (51) must be in the form of $G^{-1}(F_\xi(\xi))$ for some distribution function $G(\cdot)$, which can be determined by the following problem

$$\text{Minimize } v(G) := \int_0^{+\infty} T\big(P\{u(G^{-1}(Z)) > y\}\big)\,dy \quad \text{subject to } \begin{cases} E[G^{-1}(Z)F_\xi^{-1}(Z)] = a, \\ G \text{ is the distribution function of a nonnegative random variable}, \end{cases} \tag{52}$$

where $Z := F_\xi(\xi)$.

Proposition D.1 If $G^*$ is optimal for (52), then $X^* := (G^*)^{-1}(Z)$ is optimal for (51). Conversely, if $X^*$ is optimal for (51), then its distribution function $G^*$ is optimal for (52) and $X^* = (G^*)^{-1}(Z)$, a.s.

By the same calculation as in Appendix C, we have $v(G) = E[u(G^{-1}(Z))T'(1-Z)]$. Denoting $g = G^{-1}$, Problem (52) can be rewritten as

$$\text{Minimize } \bar v(g) := E[u(g(Z))T'(1-Z)] \quad \text{subject to } E[g(Z)F_\xi^{-1}(Z)] = a,\ g \in \Gamma. \tag{53}$$

Since the objective of the above problem is to minimize a concave functional, its solution must have a very different structure compared with Problem (46), which in turn requires a completely different technique to obtain. Specifically, the solution should be a "corner point solution" (in the terminology of linear programming). The question is how to characterize such a corner point solution in the present setting.

Proposition D.2 Assume that $u(\cdot)$ is strictly concave at $0$. Then the optimal solution for Problem (53), if it exists, must be in the form $g(t) = q(b)\mathbf{1}_{(b,1]}(t)$, $t \in [0,1)$, with some $b \in [0,1)$ and

$$q(b) := \frac{a}{E[F_\xi^{-1}(Z)\mathbf{1}_{(b,1]}(Z)]}.$$

Proof: Denote $f(\cdot) := F_\xi^{-1}(\cdot)$ for notational convenience. We assume $a > 0$. If $g$ is an optimal solution to (53), then $g \not\equiv 0$. Fix $t_1 \in (0,1)$ such that $g(t_1) > 0$. Define

$$k := \frac{\int_0^1 g(t)f(t)\,dt}{\int_0^{t_1} g(t)f(t)\,dt + g(t_1)\int_{t_1}^1 f(t)\,dt}, \qquad \bar g(t) := \begin{cases} k\,g(t), & \text{if } t \in [0,t_1], \\ k\,g(t_1), & \text{if } t \in (t_1,1). \end{cases}$$

Then $\bar g(\cdot) \in \Gamma$, and

$$\int_0^1 \bar g(t)f(t)\,dt = k\int_0^{t_1} g(t)f(t)\,dt + k\,g(t_1)\int_{t_1}^1 f(t)\,dt = \int_0^1 g(t)f(t)\,dt,$$

implying that $\bar g(\cdot)$ is feasible for (53). We now claim that $g(t) = g(t_1)$, a.e. $t \in (t_1,1)$. Indeed, if this is not the case, then $k > 1$. Define $\lambda := 1 - 1/k \in (0,1)$ and

$$\tilde g(t) := \frac{g(t) - g(t_1)}{\lambda}\mathbf{1}_{t > t_1}, \quad t \in [0,1).$$

Then

$$(1-\lambda)\bar g(t) + \lambda\tilde g(t) = g(t) \quad \forall t \in [0,1). \tag{54}$$

It follows from the concavity of $u(\cdot)$ that $\bar v(g) \ge (1-\lambda)\bar v(\bar g) + \lambda\bar v(\tilde g)$, and the equality holds only if

$$u(g(t)) = (1-\lambda)u(\bar g(t)) + \lambda u(\tilde g(t)), \quad \text{a.e. } t \in (0,1).$$

Owing to the optimality of $g$, the above equality does hold. However, the equality when $t \le t_1$ implies that $u(\cdot)$ is not strictly concave at $0$, which is a contradiction.

Denote $b := \inf\{t \ge 0 : g(t) > 0\}$. The preceding analysis shows that $g(t) = k\mathbf{1}_{t > b}$ for some $k \in \mathbb{R}_+$. The feasibility of $g(\cdot)$ determines

$$k \equiv q(b) = \frac{a}{E[F_\xi^{-1}(Z)\mathbf{1}_{(b,1]}(Z)]}.$$

Q.E.D.
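As a rough numerical illustration of the single-step form in Proposition D.2, the sketch below evaluates $q(b)$ and checks the budget constraint $E[g(Z)F_\xi^{-1}(Z)] = a$ by quadrature. It assumes, purely for illustration, that $\xi$ is lognormal with $\ln\xi \sim N(M, S^2)$; the parameter values and the helper names `quantile`, `partial_mean`, `q` and `budget` are hypothetical and not taken from the paper.

```python
from math import exp, log
from statistics import NormalDist

STD = NormalDist()          # standard normal, used for cdf and quantiles
M, S = 0.0, 0.5             # assumed parameters: ln(xi) ~ N(M, S**2)

def quantile(z):
    """F_xi^{-1}(z) for the assumed lognormal xi, 0 < z < 1."""
    return exp(M + S * STD.inv_cdf(z))

def partial_mean(b):
    """E[F_xi^{-1}(Z) 1_{(b,1]}(Z)] = E[xi 1_{xi > c}] with c = F_xi^{-1}(b)."""
    mean = exp(M + 0.5 * S * S)
    if b <= 0.0:
        return mean
    c = quantile(b)
    # Standard lognormal partial expectation.
    return mean * STD.cdf((M + S * S - log(c)) / S)

def q(b, a):
    """Height of the single step g = q(b) 1_{(b,1]} that meets the budget a."""
    return a / partial_mean(b)

def budget(b, a, n=20000):
    """Midpoint-rule value of int_b^1 q(b) F_xi^{-1}(t) dt (should be close to a)."""
    h = (1.0 - b) / n
    total = sum(quantile(b + (i + 0.5) * h) for i in range(n))
    return q(b, a) * total * h
```

Calling `budget(0.3, 1.0)` should return a number close to `1.0`, and `q(b, a)` grows with `b`: a later jump point leaves fewer states in which to spend the budget, so the step must be higher.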

By left-continuity one can extend the optimal $g$ described in Proposition D.2 to $[0,1]$ by defining $g(1) := q(b)$. Moreover, since $g(t)$ is uniformly bounded in $t \in [0,1]$, any optimal solution $X^*$ to (51) can be represented as $X^* = g(Z)$, hence must be uniformly bounded from above.

Proposition D.2 suggests that we only need to find an optimal number $b \in [0,1)$ so as to solve Problem (53), which motivates the introduction of the following problem

$$\text{Minimize } \tilde v(b) := E[u(g(Z))T'(1-Z)] \quad \text{subject to } g(\cdot) = \frac{a}{E[F_\xi^{-1}(Z)\mathbf{1}_{(b,1]}(Z)]}\mathbf{1}_{(b,1]}(\cdot),\ 0 \le b < 1. \tag{55}$$

Proposition D.3 Problems (53) and (55) have the same infimum values.

Proof: Denote by $\alpha$ and $\beta$ the infimum values of Problems (53) and (55) respectively. Clearly $\alpha \le \beta$. If the opposite inequality is false, then there is a feasible solution $g$ for (53) such that $\bar v(g) < \beta$.

For any $s \in [0,1)$, define

$$k(s) = \frac{\int_0^1 g(t)f(t)\,dt}{\int_0^s g(t)f(t)\,dt + g(s)\int_s^1 f(t)\,dt} \ge 1,$$

where $f(\cdot) := F_\xi^{-1}(\cdot)$. Then $\lim_{s \to 1} k(s) = 1$. Define

$$H_s(t) := \begin{cases} k(s)g(t), & t \in [0,s], \\ k(s)g(s), & t \in (s,1). \end{cases}$$

As shown in the proof of Proposition D.2, $H_s(\cdot)$ is feasible for (53), and

$$\bar v(H_s) \le \int_0^1 u(k(s)g(t))T'(1-t)\,dt \le \int_0^1 k(s)u(g(t))T'(1-t)\,dt \to \bar v(g), \quad \text{as } s \to 1.$$

Therefore there exists $s \in [0,1)$ such that $\bar v(H_s) < \beta$. For any nonnegative integer $n$, define

$$a(n,k) := \frac{\int_{(k-1)/2^n}^{k/2^n} H_s(t)f(t)\,dt}{\int_{(k-1)/2^n}^{k/2^n} f(t)\,dt}, \quad \text{for any } k = 1, \cdots, 2^n.$$

It is clear that $H_s((k-1)/2^n) \le a(n,k) \le H_s(k/2^n)$. Define

$$g_n(t) := \sum_{k=1}^{2^n} a(n,k)\mathbf{1}_{((k-1)/2^n,\,k/2^n]}(t), \quad t \in [0,1).$$

Clearly $g_n \in \Gamma$, and $\int_0^1 g_n(t)f(t)\,dt = \int_0^1 H_s(t)f(t)\,dt = a$, implying that $g_n$ is feasible for (53) for each $n$. Furthermore, $g_n(t) \to H_s(t)$ $\forall t$ and $0 \le g_n(t) \le k(s)g(s)$ $\forall t$, which leads to $\bar v(g_n) \to \bar v(H_s)$. So there exists $n$ such that $\bar v(g_n) < \beta$.

Because $g_n(\cdot)$ is a left continuous and non-decreasing step function, we can rewrite it as

$$g_n(t) = \sum_{k=1}^m a_{k-1}\mathbf{1}_{(t_{k-1},t_k]}(t)$$

with $0 = t_0 < t_1 < \cdots < t_m = 1$, $0 = a_0 < a_1 < a_2 < \cdots < a_m < +\infty$. Denote $\lambda_k := \frac{a_k - a_{k-1}}{q(t_k)}$, $k = 1, 2, \cdots, m$, where $q(\cdot)$ is defined in Proposition D.2. Then for any $t \in (0,1)$,

$$g_n(t) = \sum_{k=1}^m a_{k-1}\mathbf{1}_{(t_{k-1},t_k]}(t) = \sum_{k=1}^m (a_k - a_{k-1})\mathbf{1}_{(t_k,1]}(t) = \sum_{k=1}^m \lambda_k J_{t_k}(t),$$

where $J_{t_k}(t) := q(t_k)\mathbf{1}_{(t_k,1]}(t)$. Since

$$a \equiv \int_0^1 g_n(t)f(t)\,dt = \sum_{k=1}^m \lambda_k \int_0^1 J_{t_k}(t)f(t)\,dt = \sum_{k=1}^m \lambda_k a,$$

we conclude that $\sum_{k=1}^m \lambda_k = 1$, which means that $g_n$ is a convex combination of the $J_{t_k}$. It follows from the concavity of $u(\cdot)$ that there exists $k$ such that $\bar v(J_{t_k}) \le \bar v(g_n)$, which contradicts the conclusion that $\bar v(g_n) < \beta \le \bar v(J_{t_k})$. Q.E.D.
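The last step of this proof, writing a feasible step function as a convex combination of single-step functions $J_{t_k}$, can be checked numerically. The sketch below is only an illustration under assumed ingredients that are not from the paper: $f(t) = 1 + t$ (the quantile function of a uniform variable on $(1,2)$), $u(x) = \sqrt{x}$, and $T(p) = p^2$ so that $T'(p) = 2p$. The knots, jump sizes and helper names are hypothetical.

```python
def f(t):
    return 1.0 + t            # assumed quantile function F_xi^{-1}

def u(x):
    return x ** 0.5           # assumed concave utility with u(0) = 0

def T_prime(p):
    return 2.0 * p            # derivative of the assumed distortion T(p) = p**2

N = 1000                      # midpoint grid; the knots below fall on cell edges

def integrate(h):
    """Midpoint rule on [0, 1]."""
    return sum(h((i + 0.5) / N) for i in range(N)) / N

def step(t, knots, jumps):
    """Left-continuous step function: sum of the jumps at knots strictly below t."""
    return sum(d for tk, d in zip(knots, jumps) if t > tk)

def q(b, a):
    """q(b) = a / int_b^1 f(t) dt, as in Proposition D.2."""
    return a / integrate(lambda t: f(t) if t > b else 0.0)

def v_bar(g):
    """Objective of (53): int_0^1 u(g(t)) T'(1 - t) dt."""
    return integrate(lambda t: u(g(t)) * T_prime(1.0 - t))

a = 1.0
knots = [0.2, 0.5, 0.8]
raw_jumps = [1.0, 2.0, 3.0]
# Rescale the jumps so that the budget constraint int g f dt = a holds.
scale = a / integrate(lambda t: step(t, knots, raw_jumps) * f(t))
jumps = [scale * d for d in raw_jumps]

heights = [q(tk, a) for tk in knots]                 # q(t_k)
lams = [d / qk for d, qk in zip(jumps, heights)]     # weights lambda_k
v_g = v_bar(lambda t: step(t, knots, jumps))
v_J = [v_bar(lambda t, tk=tk, qk=qk: qk if t > tk else 0.0)
       for tk, qk in zip(knots, heights)]
```

Under these assumptions `sum(lams)` comes out equal to one up to rounding, and `min(v_J)` does not exceed `v_g`, in line with the argument that some single-step function is at least as good as the step function it helps to build.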

Summarizing, we have the following result.

Theorem D.1 Problems (51) and (55) have the same infimum values. If, in addition, $u(\cdot)$ is strictly concave at $0$, then (51) admits an optimal solution if and only if the following problem

$$\min_{0 \le c < \operatorname{esssup}\xi} u\left(\frac{a}{E[\xi\mathbf{1}_{\xi > c}]}\right) T\big(P(\xi > c)\big)$$

admits an optimal solution $c^*$, in which case the optimal solution to (51) is

$$X^* = \frac{a}{E[\xi\mathbf{1}_{\xi > c^*}]}\mathbf{1}_{\xi > c^*}.$$

Proof: The first conclusion follows from Proposition D.3. For the second conclusion, we rewrite the objective functional of (55) as

$$\tilde v(b) = E\big[u\big(q(b)\mathbf{1}_{(b,1]}(Z)\big)T'(1-Z)\big] = \int_b^1 u(q(b))T'(1-t)\,dt = u(q(b))T(1-b),$$

where $b \in [0,1)$. Let $c := F_\xi^{-1}(b) \in [0, \operatorname{esssup}\xi)$. Then

$$\tilde v(b) = u(q(b))T(1-b) = u\left(\frac{a}{E[\xi\mathbf{1}_{\xi > c}]}\right) T\big(P(\xi > c)\big),$$

and the desired results are straightforward in view of Proposition D.2. Q.E.D.
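The one-dimensional problem in Theorem D.1 lends itself to a simple grid search. The following is a minimal sketch under assumptions that are not part of the paper: $\xi$ is lognormal with $\ln\xi \sim N(M, S^2)$, $u(x) = x^{0.88}$, and $T$ is the Tversky and Kahneman (1992) distortion with parameter $0.69$. All names and parameter values are hypothetical. A grid search only approximates a minimizer, and whether a minimizer exists at all depends on how $u$ and $T$ interact near $P(\xi > c) = 0$.

```python
from math import exp, log
from statistics import NormalDist

STD = NormalDist()
M, S = 0.0, 0.5               # assumed lognormal parameters: ln(xi) ~ N(M, S**2)
ALPHA, GAMMA = 0.88, 0.69     # assumed utility power and distortion parameter

def u(x):
    return x ** ALPHA          # assumed concave utility with u(0) = 0

def T(p):
    """Assumed Tversky-Kahneman distortion, with T(0) = 0 and T(1) = 1."""
    p = min(max(p, 0.0), 1.0)
    return p ** GAMMA / (p ** GAMMA + (1.0 - p) ** GAMMA) ** (1.0 / GAMMA)

def partial_mean(c):
    """E[xi 1_{xi > c}] for the assumed lognormal xi."""
    mean = exp(M + 0.5 * S * S)
    if c <= 0.0:
        return mean
    return mean * STD.cdf((M + S * S - log(c)) / S)

def tail_prob(c):
    """P(xi > c)."""
    if c <= 0.0:
        return 1.0
    return 1.0 - STD.cdf((log(c) - M) / S)

def objective(c, a):
    """u(a / E[xi 1_{xi > c}]) * T(P(xi > c)), the function minimized in Theorem D.1."""
    return u(a / partial_mean(c)) * T(tail_prob(c))

def solve(a, n=2000):
    """Grid search over b = i/n in [0, 1) with c = F_xi^{-1}(b); returns (value, c)."""
    best = (objective(0.0, a), 0.0)
    for i in range(1, n):
        c = exp(M + S * STD.inv_cdf(i / n))
        best = min(best, (objective(c, a), c))
    return best
```

Given the approximate minimizer `c_star` from `solve(a)`, the candidate terminal position of Theorem D.1 pays `a / partial_mean(c_star)` on the event `xi > c_star` and zero otherwise, so it meets the budget constraint by construction.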

E Replicating a Binary Option

In this subsection, we want to find a portfolio replicating the contingent claim $\rho^\alpha\mathbf{1}_{\rho \in (c_1,c_2)}$, where $0 \le c_1 < c_2 \le +\infty$, $\alpha \in \mathbb{R}$, and $\rho = \rho(T)$ with

$$\rho(t) := \exp\left\{-\left(r + \tfrac{1}{2}|\theta|^2\right)t - \theta' W(t)\right\}, \quad 0 \le t \le T.$$

The claim resembles the payoff of a binary (or digital) option, except that $\rho$ does not correspond to any underlying stock [although it is indeed the terminal wealth of a mutual fund; see Bielecki et al. (2005, Remark 7.3), for details].

Let $\psi(\cdot)$ and $N(\cdot)$ be the density function and distribution function of the standard normal distribution respectively. Recall that $\rho(t,T) := \rho(T)/\rho(t)$ conditional on $\mathcal{F}_t$ follows a lognormal distribution with parameters $(\mu_t, \sigma_t)$ given by (41).

Theorem E.1 If $c_2 < +\infty$, then the wealth-portfolio pair replicating $\rho^\alpha\mathbf{1}_{\rho \in (c_1,c_2)}$ is

$$x(t) = \frac{\rho(t)^\alpha}{\sigma_t}\int_{c_1/\rho(t)}^{c_2/\rho(t)} y^\alpha\,\psi\left(\frac{\ln y - \mu_t}{\sigma_t}\right)dy,$$

$$\pi(t) = -\left[\alpha x(t) - \frac{1}{\sigma_t\rho(t)}\left(c_2^{\alpha+1}\psi\left(\frac{\ln c_2 - \mu_t - \ln\rho(t)}{\sigma_t}\right) - c_1^{\alpha+1}\psi\left(\frac{\ln c_1 - \mu_t - \ln\rho(t)}{\sigma_t}\right)\right)\right](\sigma\sigma')^{-1}B.$$

If $c_2 = +\infty$, then the corresponding replicating pair is

$$x(t) = \frac{\rho(t)^\alpha}{\sigma_t}\int_{c_1/\rho(t)}^{+\infty} y^\alpha\,\psi\left(\frac{\ln y - \mu_t}{\sigma_t}\right)dy,$$

$$\pi(t) = -\left[\alpha x(t) + \frac{1}{\sigma_t\rho(t)}\,c_1^{\alpha+1}\psi\left(\frac{\ln c_1 - \mu_t - \ln\rho(t)}{\sigma_t}\right)\right](\sigma\sigma')^{-1}B.$$

Proof: When $c_2 < +\infty$, the replicating wealth process is

$$\begin{aligned} x(t) &= E\big[\rho(t,T)\rho(T)^\alpha\mathbf{1}_{\rho(T) \in (c_1,c_2)} \,\big|\, \mathcal{F}_t\big] \\ &= \rho(t)^\alpha E\big[\rho(t,T)^{\alpha+1}\mathbf{1}_{\rho(t,T) \in (c_1/\rho(t),\,c_2/\rho(t))} \,\big|\, \mathcal{F}_t\big] \\ &= \rho(t)^\alpha\int_{c_1/\rho(t)}^{c_2/\rho(t)} y^{\alpha+1}\,dN\left(\frac{\ln y - \mu_t}{\sigma_t}\right) \\ &= \frac{\rho(t)^\alpha}{\sigma_t}\int_{c_1/\rho(t)}^{c_2/\rho(t)} y^\alpha\,\psi\left(\frac{\ln y - \mu_t}{\sigma_t}\right)dy = f(t,\rho(t)), \end{aligned}$$

where

$$f(t,\rho) := \frac{\rho^\alpha}{\sigma_t}\int_{c_1/\rho}^{c_2/\rho} y^\alpha\,\psi\left(\frac{\ln y - \mu_t}{\sigma_t}\right)dy.$$

It is well known that the replicating portfolio is

$$\pi(t) = -(\sigma\sigma')^{-1}B\,\frac{\partial f(t,\rho(t))}{\partial\rho}\,\rho(t); \tag{56}$$

see, e.g., Bielecki et al. (2005, Eq. (7.6)).

Now we calculate

$$\begin{aligned} \frac{\partial f(t,\rho)}{\partial\rho} &= \frac{\alpha\rho^{\alpha-1}}{\sigma_t}\int_{c_1/\rho}^{c_2/\rho} y^\alpha\,\psi\left(\frac{\ln y - \mu_t}{\sigma_t}\right)dy \\ &\quad + \frac{\rho^\alpha}{\sigma_t}\left[\left(\frac{c_2}{\rho}\right)^\alpha\psi\left(\frac{\ln c_2 - \mu_t - \ln\rho}{\sigma_t}\right)\frac{-c_2}{\rho^2} - \left(\frac{c_1}{\rho}\right)^\alpha\psi\left(\frac{\ln c_1 - \mu_t - \ln\rho}{\sigma_t}\right)\frac{-c_1}{\rho^2}\right] \\ &= \frac{\alpha x(t)}{\rho} - \frac{1}{\sigma_t\rho^2}\left[c_2^{\alpha+1}\psi\left(\frac{\ln c_2 - \mu_t - \ln\rho}{\sigma_t}\right) - c_1^{\alpha+1}\psi\left(\frac{\ln c_1 - \mu_t - \ln\rho}{\sigma_t}\right)\right]. \end{aligned}$$

Plugging this into (56) we get the desired result.

The case with $c_2 = +\infty$ can be dealt with similarly (in fact more easily). Q.E.D.
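The formulas of Theorem E.1 can be sanity-checked numerically. The sketch below treats $\mu_t$ and $\sigma_t$ as given inputs (their expressions in (41) are not reproduced here) and compares the wealth integral with the standard lognormal closed form, and the bracketed term of $\pi(t)$ with a finite-difference derivative of $f(t,\rho)$. The function names and the sample parameter values are hypothetical.

```python
from math import exp, log, pi, sqrt
from statistics import NormalDist

STD = NormalDist()

def psi(z):
    """Standard normal density."""
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)

def wealth(rho, alpha, c1, c2, mu_t, sigma_t):
    """x(t) = f(t, rho) via the lognormal partial-moment closed form (0 < c1 < c2 < inf)."""
    k = alpha + 1.0
    shift = mu_t + k * sigma_t ** 2
    hi = STD.cdf((log(c2 / rho) - shift) / sigma_t)
    lo = STD.cdf((log(c1 / rho) - shift) / sigma_t)
    return rho ** alpha * exp(k * mu_t + 0.5 * (k * sigma_t) ** 2) * (hi - lo)

def wealth_quadrature(rho, alpha, c1, c2, mu_t, sigma_t, n=20000):
    """x(t) from the integral in Theorem E.1, evaluated with the midpoint rule."""
    lo, hi = c1 / rho, c2 / rho
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        y = lo + (i + 0.5) * h
        total += y ** alpha * psi((log(y) - mu_t) / sigma_t)
    return rho ** alpha / sigma_t * total * h

def bracket(rho, alpha, c1, c2, mu_t, sigma_t):
    """Bracketed term of pi(t), i.e. rho * df/drho; pi(t) = -bracket * (sigma sigma')^{-1} B."""
    x = wealth(rho, alpha, c1, c2, mu_t, sigma_t)
    z2 = (log(c2) - mu_t - log(rho)) / sigma_t
    z1 = (log(c1) - mu_t - log(rho)) / sigma_t
    boundary = c2 ** (alpha + 1.0) * psi(z2) - c1 ** (alpha + 1.0) * psi(z1)
    return alpha * x - boundary / (sigma_t * rho)

def bracket_fd(rho, alpha, c1, c2, mu_t, sigma_t, eps=1e-5):
    """Central finite-difference approximation of rho * df/drho."""
    up = wealth(rho + eps, alpha, c1, c2, mu_t, sigma_t)
    down = wealth(rho - eps, alpha, c1, c2, mu_t, sigma_t)
    return rho * (up - down) / (2.0 * eps)

# Hypothetical inputs: rho(t), alpha, c1, c2, mu_t, sigma_t.
args = (0.9, 0.5, 0.5, 1.5, -0.07, 0.3)
```

With these sample inputs, `wealth(*args)` and `wealth_quadrature(*args)` should agree closely, as should `bracket(*args)` and `bracket_fd(*args)`. Multiplying the negative of the bracket by $(\sigma\sigma')^{-1}B$ then gives the portfolio of Theorem E.1.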

References

M. Allais (1953), Le comportement de l'homme rationnel devant le risque, critique des postulats et axiomes de l'ecole americaine, Econometrica, 21, pp. 503-546.

G.W. Bassett, Jr., R. Koenker and G. Kordas (2004), Pessimistic portfolio allocation and Choquet expected utility, J. Financial Econometrics, 2, pp. 477-492.

S. Benartzi and R. Thaler (1995), Myopic loss aversion and the equity premium puzzle, Quart. J. Econ., 110, pp. 73-92.

A.B. Berkelaar, R. Kouwenberg and T. Post (2004), Optimal portfolio choice under loss aversion, Rev. Econ. Stat., 86, pp. 973-987.

T.R. Bielecki, H. Jin, S.R. Pliska and X.Y. Zhou (2005), Continuous-time mean–variance portfolio selection with bankruptcy prohibition, Math. Finance, 15, pp. 213-244.

G. Choquet (1953/54), Theory of capacities, Ann. de l'Inst. Fourier, 5, pp. 131-295.

J.C. Cox and C.-F. Huang (1989), Optimal consumption and portfolio policies when asset prices follow a diffusion process, J. Econom. Theory, 49, pp. 33-83.

E. De Giorgi and T. Post (2005), Second order stochastic dominance, reward-risk portfolio selection and CAPM, J. Financial Quant. Anal., to appear.

D. Denneberg (1994), Non-Additive Measure and Integral, Kluwer, Dordrecht.

D. Duffie (1996), Dynamic Asset Pricing Theory, 2nd Edition, Princeton University Press, Princeton.

D. Ellsberg (1961), Risk, ambiguity and the Savage axioms, Quart. J. Econom., 75, pp. 643-669.

P.C. Fishburn (1998), Nonlinear Preference and Utility Theory, The Johns Hopkins University Press, Baltimore.

H. Föllmer and A. Schied (2002), Stochastic Finance: An Introduction in Discrete Time, Walter de Gruyter, Berlin.

M. Friedman and L.J. Savage (1948), The utility analysis of choices involving risk, J. Political Economy, 56, pp. 279-304.

F.J. Gomes (2005), Portfolio choice and trading volume with loss-averse investors, J. Business, 78, pp. 675-706.

X. He and X.Y. Zhou (2007), Behavioral portfolio choice: Model, theory, and equity premium puzzle, working paper, The Chinese University of Hong Kong.

H. Jin, J.A. Yan and X.Y. Zhou (2005), Continuous-time mean–risk portfolio selection, Ann. de l'Institut Henri Poincaré: Probab. & Stat., 41, pp. 559-580.

H. Jin, Z. Xu and X.Y. Zhou (2007), A convex stochastic optimization problem arising from portfolio selection, Math. Finance, to appear.

D. Kahneman and A. Tversky (1979), Prospect theory: An analysis of decision under risk, Econometrica, 47, pp. 263-291.

I. Karatzas and S.E. Shreve (1998), Methods of Mathematical Finance, Springer-Verlag, New York.

R. Korn and H. Kraft (2004), On the stability of continuous-time portfolio problems with stochastic opportunity set, Math. Finance, 14, pp. 403-414.

H. Levy and M. Levy (2004), Prospect theory and mean–variance analysis, Rev. Financial Studies, 17, pp. 1015-1041.

D. Li and W.L. Ng (2000), Optimal dynamic portfolio selection: Multiperiod mean–variance formulation, Math. Finance, 10, pp. 387-406.

R. Mehra and E.C. Prescott (1985), The equity premium: A puzzle, J. Monet. Econ., 15, pp. 145-161.

J. von Neumann and O. Morgenstern (1944), Theory of Games and Economic Behavior, Princeton University Press, Princeton.

H. Shefrin and M. Statman (2000), Behavioral portfolio theory, J. Finan. Quant. Anal., 35, pp. 127-151.

A. Tversky and D. Kahneman (1992), Advances in prospect theory: Cumulative representation of uncertainty, J. Risk & Uncertainty, 5, pp. 297-323.

X.Y. Zhou and D. Li (2000), Continuous time mean–variance portfolio selection: A stochastic LQ framework, Appl. Math. & Optim., 42, pp. 19-33.