Aggregating time preferences with decreasing impatience

Abstract

It is well-known that for a group of time-consistent decision makers their collective time preferences may become time-inconsistent. Jackson and Yariv (2014) demonstrated that the result of aggregation of exponential discount functions always exhibits present bias. We show that when preferences satisfy the axioms of Fishburn and Rubinstein (1982), present bias is equivalent to decreasing impatience (DI). Applying the notion of comparative DI introduced by Prelec (2004), we generalize the result of Jackson and Yariv (2014). We prove that the aggregation of distinct discount functions from comparable DI classes results in the collective discount function which is strictly more DI than the least DI of the functions being aggregated. We also prove an analogue of Weitzman's (1998) result, for hyperbolic rather than exponential discount functions. We show that if a decision maker is uncertain about her hyperbolic discount rate, then long-term costs and benefits will be discounted at a rate which is the probability-weighted harmonic mean of the possible hyperbolic discount rates.


Nina Anchugina, Matthew Ryan, and Arkadii Slinko

Department of Mathematics, University of Auckland; School of Economics, Auckland University of Technology

April 2016


Keywords:

Discounting, hyperbolic discounting, decreasing impatience, aggregation.

JEL Classification:

D71, D90.

Acknowledgements. We thank Matthew Jackson, Simon Grant and several seminar audiences for comments and suggestions. Arkadii Slinko was supported by the Marsden Fund grant UOA 1420, and Nina Anchugina gratefully acknowledges financial support from the University of Auckland.

arXiv preprint [q-fin.EC].

1 Introduction

Sometimes decisions about timed outcomes have to be made by a group of individuals, such as boards, committees or households. It is natural to think that individuals may differ in the discounting procedure that they use. If the decision is to be made by a group of individuals, it is desirable to have an aggregating procedure that suitably reflects the time preferences of all members. The natural option is to average the discount functions across individuals, which is equivalent to averaging the discounted utilities when all agents have identical utility functions. This approach has been widely used in the existing literature on time preferences. It is known that such collective discount functions need not share properties that are common to the individual discount functions being aggregated. As Jackson and Yariv demonstrate [11], if individuals discount the future exponentially and there is heterogeneity in discount factors, then their aggregate discount function exhibits present bias, which means that delaying two different dated outcomes by the same amount of time can reverse the ranking of these outcomes. Moreover, when the number of individuals grows, in the limit the group discount function becomes hyperbolic [11].

Jackson and Yariv [11] give the following example of present-biased group preferences for a household with two time-consistent individuals, Constantine and Patience. Both have identical instantaneous utility functions and discount the future exponentially, but Constantine has a discount factor of 0.5, whereas Patience has a discount factor of 0.8. Suppose that they need to choose between 10 utiles for each today or 15 utiles for each tomorrow. They calculate the aggregate discounted utility for each option: $10 + 10 = 20$ and $15(0.5 + 0.8) = 19.5$. Therefore, 10 utiles today is chosen. Now suppose that they must choose between 10 utiles at time $t \ge 1$ and 15 utiles at time $t + 1$. The aggregate discounted utilities in this case are $10(0.5^{t} + 0.8^{t})$ and $15(0.5^{t+1} + 0.8^{t+1})$, respectively. For any $t \ge 1$ the 15 utiles at $t + 1$ is preferable to the 10 utiles at $t$, which reverses the initial preference for 10 utiles at $t = 0$ over 15 utiles at $t = 1$. The behaviour of the household is present-biased.

Another scenario in which the aggregation of time preferences may be required is when a single decision maker is uncertain about the appropriate discount function to apply. For example, discounting may be affected by a survival function with a constant but uncertain hazard rate. Such scenarios are considered by Weitzman [27] and Sozou [24]. If the decision-maker maximizes expected discounted utility, then she maximizes discounted utility for a certainty equivalent discount function, calculated as the probability-weighted average of the different possible discount functions that may apply. Weitzman [27] shows that if each of the possible rates of time preference converges to some non-negative value (as time goes to infinity), then the certainty equivalent time preference function converges to the lowest of these limits. Similarly, Sozou [24] considers a decision maker whose discounting reflects a survival function with a constant, but uncertain, hazard rate. If this hazard rate is exponentially distributed, Sozou shows that the decision-maker's expected discount function is hyperbolic.

Of course, present bias is not limited to aggregate or expected discount functions. It is often observed in experiments that individual decision makers become decreasingly impatient (increasingly patient) as rewards are shifted further into the future. If a decision-maker is indifferent between an early outcome and a larger, later outcome, then delaying both outcomes by the same amount of time will often result in the larger, later outcome being preferred.
Such subjects exhibit present bias, or strictly decreasing impatience (DI). Exponential discounting implies constant impatience, so it cannot explain strictly decreasing impatience, either globally or locally. The necessity of accommodating DI in individual time preference has made hyperbolic discounting a significant tool in behavioural economics. Several types of hyperbolic discount functions have been introduced, including quasi-hyperbolic discounting [17, 14], discounting for delay [1], proportional hyperbolic discounting [10, 16], and generalized hyperbolic discounting [15, 2]. Given the widespread use of hyperbolic discount functions to describe individual time preferences, it is important to understand the behaviour of aggregated, or averaged, hyperbolic functions.

The goal of this paper is twofold. Firstly, we seek to extend Jackson and Yariv's result on the aggregation of exponential discount functions. Two individuals may differ in the rate at which their impatience decreases, but their respective levels of DI may be comparable: the preferences of the one may exhibit unambiguously more DI than the preferences of the other. As Prelec [19] proved, one individual exhibits more DI than another if the logarithm of the discount function of the former is more convex than that of the latter. Can we say anything about the level of DI of the weighted average of individual discount functions that can be (weakly) ordered by DI? Theorem 1 establishes that the weighted average always exhibits strictly more DI than the component with the least DI. This generalizes Jackson and Yariv's result. Proposition 1 in [11] shows that the weighted average of exponential discount functions with different discount factors exhibits present bias. We show that when preferences satisfy the axioms of Fishburn and Rubinstein [7], Jackson and Yariv's definition of present bias is equivalent to strictly decreasing impatience.
Since all exponential discount functions exhibit constant impatience (they all exhibit the same degree of DI), Proposition 1 of Jackson and Yariv is a special case of our Theorem 1.

Our second goal is to prove an analogue of Weitzman's [27] result: one in which discounting is hyperbolic but there is uncertainty about the hyperbolic discount rate. The answer, given in Theorem 3, differs markedly from Weitzman's answer for the case of exponential discounting. We show that the certainty equivalent hyperbolic discount rate converges, not to the lowest individual hyperbolic discount rate, but to the probability-weighted harmonic mean of the individual hyperbolic discount rates.
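The Constantine and Patience example from the introduction can be checked numerically. The following is a minimal sketch, assuming (as in the example) identical linear instantaneous utilities measured in utiles and a group utility equal to the sum of the two discounted utilities; the function names are illustrative and do not come from the paper.

```python
# Discount factors from the example: Constantine (0.5) and Patience (0.8).
DISCOUNT_FACTORS = (0.5, 0.8)


def group_utility(utiles, t, factors=DISCOUNT_FACTORS):
    """Sum over household members of `utiles` discounted exponentially to time t."""
    return sum(utiles * delta ** t for delta in factors)


def prefers_later(t, early=10, late=15):
    """True if the household ranks `late` utiles at t + 1 above `early` utiles at t."""
    return group_utility(late, t + 1) > group_utility(early, t)


if __name__ == "__main__":
    # Print the two aggregate discounted utilities for the first few dates.
    for t in range(6):
        print(t, group_utility(10, t), group_utility(15, t + 1), prefers_later(t))
```

At $t = 0$ the household prefers the earlier reward, while for the integer dates $t \ge 1$ tested here it prefers the later one, which is the preference reversal described in the text.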

In this section we introduce the framework for our investigation and define the two key concepts used in this paper: present bias and strictly decreasing impatience of preferences. We prove that these two concepts coincide when the Fishburn-Rubinstein axioms for a discounted utility representation are satisfied. Following Pratt [18] and Arrow [3], we discuss these concepts in terms of log-convexity of discount functions, and therefore introduce the necessary results and definitions. Most results are known but are included to keep the paper self-contained.

Convexity and log-convexity play an important role in the theory of discounting. Let $I$ be an interval (finite or infinite) of real numbers. A function $f : I \to \mathbb{R}$ is convex if for any two points $x, y \in I$ and any $\lambda \in [0, 1]$ it holds that:

$$f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y).$$

A function $f$ is strictly convex if

$$f(\lambda x + (1 - \lambda) y) < \lambda f(x) + (1 - \lambda) f(y)$$

for any $x, y \in I$ such that $x \ne y$ and any $\lambda \in (0, 1)$. If $f$ is twice differentiable, convexity is equivalent to $f'' \ge 0$, and strict convexity is equivalent to two conditions: the function $f''$ is nonnegative on $I$ and the set $\{x \in I \mid f''(x) = 0\}$ contains no non-trivial interval [25].

The following equivalent definition of a (strictly) convex function is well known. A function $f : I \to \mathbb{R}$ is (strictly) convex if for every $x, y, v, z \in I$ such that $x - y = v - z > 0$ and $z > y$ we have

$$f(x) - f(y) \le [<]\ f(v) - f(z).$$

That is, the increment of $f$ over an interval of fixed length does not decrease (strictly increases) as the interval is shifted to the right; this is the form in which convexity is used in inequalities (6) and (7) below.

Convexity is preserved under composition of functions, as shown in the following lemma, whose straightforward proof is omitted:

Lemma 1.

Let $f_1 : I \to \mathbb{R}$ be a non-decreasing and convex function and $f_2 : I \to \mathbb{R}$ be a convex function, such that the range of $f_2$ is contained in the domain of $f_1$. Then the composition $f = f_1 \circ f_2$ is a convex function. If, in addition, $f_1$ is strictly increasing, and either $f_1$ or $f_2$ is strictly convex, then $f$ is also strictly convex.

A function $f : I \to \mathbb{R}$ is called log-convex if $f(x) > 0$ for all $x \in I$ and $\ln(f)$ is convex. It is called strictly log-convex if $\ln(f)$ is strictly convex. It follows that if $f$ is a (strictly positive) twice differentiable function, then log-convexity of $f$ is equivalent to the condition $f'' f - (f')^2 \ge 0$, while strict log-convexity of $f$ requires, in addition, that the set

$$\{x \in I \mid f''(x) f(x) - [f'(x)]^2 = 0\}$$

contains no non-trivial interval. Log-convexity can also be expressed without using logarithms [5]. A function $f : I \to \mathbb{R}$ is log-convex if and only if $f(x) > 0$ for all $x \in I$ and for all $x, y \in I$ and $\lambda \in [0, 1]$ we have:

$$f(\lambda x + (1 - \lambda) y) \le f(x)^{\lambda} f(y)^{1 - \lambda}. \tag{1}$$

The function $f$ is strictly log-convex if inequality (1) is strict when $x \ne y$ and $\lambda \in (0, 1)$.

Lemma 2. Let $f, g : I \to \mathbb{R}$ be functions with $f$ strictly log-convex and $g$ log-convex. Then the sum $f + g$ is strictly log-convex.

Proof. Since $f(x) > 0$ and $g(x) > 0$ for all $x \in I$, we have $(f + g)(x) > 0$ for all $x \in I$. Let $x, y \in I$ such that $x \ne y$ and let $\lambda \in (0, 1)$. We need to show that

$$f(\lambda x + (1 - \lambda) y) + g(\lambda x + (1 - \lambda) y) < (f(x) + g(x))^{\lambda} (f(y) + g(y))^{1 - \lambda}.$$

Since $f$ is strictly log-convex, we have

$$f(\lambda x + (1 - \lambda) y) < f(x)^{\lambda} f(y)^{1 - \lambda}. \tag{2}$$

Analogously, since $g$ is log-convex:

$$g(\lambda x + (1 - \lambda) y) \le g(x)^{\lambda} g(y)^{1 - \lambda}. \tag{3}$$

Summing (2) and (3) we obtain:

$$f(\lambda x + (1 - \lambda) y) + g(\lambda x + (1 - \lambda) y) < f(x)^{\lambda} f(y)^{1 - \lambda} + g(x)^{\lambda} g(y)^{1 - \lambda}.$$

Denote $a = f(x)$, $b = f(y)$, $c = g(x)$, $d = g(y)$. Note that $a, b, c, d > 0$. To prove the claim of the lemma, it is sufficient to show that:

$$a^{\lambda} b^{1 - \lambda} + c^{\lambda} d^{1 - \lambda} \le (a + c)^{\lambda} (b + d)^{1 - \lambda}. \tag{4}$$

Since $(a + c)^{\lambda} (b + d)^{1 - \lambda} > 0$, inequality (4) is equivalent to

$$\left(\frac{a}{a + c}\right)^{\lambda} \left(\frac{b}{b + d}\right)^{1 - \lambda} + \left(\frac{c}{a + c}\right)^{\lambda} \left(\frac{d}{b + d}\right)^{1 - \lambda} \le 1.$$

By the Weighted AM-GM inequality [6, Theorem 7.6, p. 74]:

$$\left(\frac{a}{a + c}\right)^{\lambda} \left(\frac{b}{b + d}\right)^{1 - \lambda} \le \lambda \frac{a}{a + c} + (1 - \lambda) \frac{b}{b + d}$$

and

$$\left(\frac{c}{a + c}\right)^{\lambda} \left(\frac{d}{b + d}\right)^{1 - \lambda} \le \lambda \frac{c}{a + c} + (1 - \lambda) \frac{d}{b + d}.$$

Hence,

$$\left(\frac{a}{a + c}\right)^{\lambda} \left(\frac{b}{b + d}\right)^{1 - \lambda} + \left(\frac{c}{a + c}\right)^{\lambda} \left(\frac{d}{b + d}\right)^{1 - \lambda} \le \lambda + (1 - \lambda) = 1,$$

which proves the statement in the lemma.

One of the important definitions which will be frequently used throughout the paper is that of a convex transformation. We say that $f_1$ is a (strictly) convex transformation of $f_2$ if there exists a (strictly) convex function $f$ such that $f_1(x) = (f \circ f_2)(x) = f(f_2(x))$.

Lemma 3.

Let $f_1, f_2 : I \to \mathbb{R}$ such that $f_2^{-1}$ exists. Then $f_1$ is a (strictly) convex transformation of $f_2$ if and only if the composition $f_1 \circ f_2^{-1}$ is (strictly) convex.

Proof. See [18].

Recall also that a function $f : I \to \mathbb{R}$ is called concave if and only if $-f$ is convex. Thus a function $f : I \to \mathbb{R}$ is log-concave if and only if $1/f$ is log-convex. Therefore, the definitions and results stated in this section can be easily adapted for (log-)concavity.

2.2 Preferences

Let $X \subset \mathbb{R}_+$ be the set of outcomes. We will assume that $X$ is an interval of non-negative real numbers containing 0. The natural interpretation is that outcomes are monetary (for an infinitely divisible currency) but this is not essential. Let $T = [0, \infty)$ be a set of points in time where 0 corresponds to the present moment. The Cartesian product $X \times T$ will be identified with the set of timed outcomes, i.e., a pair $(x, t) \in X \times T$ is understood as a dated outcome, when a decision-maker receives $x$ at time $t$ and nothing at all other time periods in $T \setminus \{t\}$.

Suppose that a decision-maker has a preference order $\succcurlyeq$ on the set of timed outcomes with $\succ$ expressing strict preference and $\sim$ indifference. We say that a utility function $U : X \times T \to \mathbb{R}$ represents the preference order $\succcurlyeq$ if for all $x, y \in X$ and all $t, s \in T$ we have $(x, t) \succcurlyeq (y, s)$ if and only if $U(x, t) \ge U(y, s)$. This is a discounted utility (DU) representation if

$$U(x, t) = D(t)\, u(x), \tag{5}$$

where $u : X \to \mathbb{R}$ is a continuous and strictly increasing function with $u(0) = 0$, and $D : T \to (0, 1]$ is continuous and strictly decreasing such that $D(0) = 1$ and $\lim_{t \to \infty} D(t) = 0$. The function $u$ is called the instantaneous utility function, and $D$ is called the discount function associated with $\succcurlyeq$. We say that the pair $(u, D)$ provides a discounted utility representation for $\succcurlyeq$. Fishburn and Rubinstein [7] provide an axiomatic foundation for a discounted utility representation. A list of their axioms is given in the Appendix. We assume that $\succcurlyeq$ has a discounted utility representation throughout the paper.

As $D$ is strictly decreasing, our decision maker is always impatient. However, as time goes by, her impatience may increase or decrease.

Definition 1 ([19]). The preference order $\succcurlyeq$ exhibits (strictly) decreasing impatience (DI) if for all $\sigma > 0$, all $0 \le t < s$ and all outcomes $y > x > 0$, the equivalence $(x, t) \sim (y, s)$ implies $(x, t + \sigma) \preccurlyeq [\prec]\ (y, s + \sigma)$.

Increasing impatience (II) can be defined by reversing the final preference ranking in Definition 1. However, we focus on DI preferences in the present paper, since this appears to be the empirically relevant case. As in the previous sentence, we also use the acronym "DI" interchangeably as a noun ("decreasing impatience") and an adjective ("decreasingly impatient"), relying on context to indicate the intended meaning.

In case the preference order $\succcurlyeq$ has a discounted utility representation, the characterization of DI in terms of the discount function is well-known.

Proposition 4 ([10, 19]). Let $\succcurlyeq$ be a preference order having a discounted utility representation with the discount function $D$. The following conditions are equivalent:

• The preference order $\succcurlyeq$ exhibits (strictly) DI;
• $D$ is (strictly) log-convex on $[0, \infty)$.

Footnote: The proof in [10, Theorem 3.3] can be easily adapted to demonstrate an analogous result for increasing impatience: the preference order $\succcurlyeq$ exhibits (strictly) II if and only if $D$ is (strictly) log-concave on $[0, \infty)$.
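The log-convexity condition of Proposition 4 can be explored numerically. The following is a minimal sketch, assuming a midpoint test of $\ln D$ on a finite grid; passing it is only a necessary condition for (strict) log-convexity on $[0, \infty)$, and the function name, grid and tolerance are illustrative choices rather than anything taken from the paper.

```python
import math


def is_log_convex_on_grid(D, ts, strict=False, tol=1e-12):
    """Midpoint test of ln D on every pair of distinct grid points in `ts`.

    For each pair a < b this compares ln D at the midpoint with the average of
    ln D(a) and ln D(b). It is a finite check, not a proof of log-convexity.
    """
    for i, a in enumerate(ts):
        for b in ts[i + 1:]:
            midpoint_value = math.log(D(0.5 * (a + b)))
            chord_value = 0.5 * (math.log(D(a)) + math.log(D(b)))
            gap = chord_value - midpoint_value
            if strict and gap <= tol:
                return False
            if not strict and gap < -tol:
                return False
    return True


def exponential(t):
    """Exponential discounting with discount factor 0.9 (constant impatience)."""
    return 0.9 ** t


def hyperbolic(t):
    """Proportional hyperbolic discounting with h = 1."""
    return 1.0 / (1.0 + t)


def gaussian(t):
    """A log-concave discount function, used here as an II example."""
    return math.exp(-t * t)


if __name__ == "__main__":
    grid = [0.5 * k for k in range(21)]
    for name, D in [("exponential", exponential), ("hyperbolic", hyperbolic), ("gaussian", gaussian)]:
        print(name, is_log_convex_on_grid(D, grid), is_log_convex_on_grid(D, grid, strict=True))
```

On this grid the exponential function passes the weak test but not the strict one, the hyperbolic function passes both, and the log-concave function fails both, in line with the DI and II classification above.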

We say that discount function $D$ is (strictly) DI if the preference order $\succcurlyeq$ exhibits (strictly) DI and has a discounted utility representation with discount function $D$.

We next show that a preference order $\succcurlyeq$ with a discounted utility representation exhibits strictly DI if and only if it exhibits present bias in the sense of Jackson and Yariv [11, p. 4190]. It is important to note, however, that Jackson and Yariv assume a discrete time setting, whereas we allow time to be continuous. The following Definition 2 is, therefore, the continuous-time analogue of their present bias definition.

Definition 2 (Present Bias). The preference order is present-biased if

(i) $(x, t) \preccurlyeq (y, s)$ implies $(x, t + \sigma) \preccurlyeq (y, s + \sigma)$ for every $x, y$, every $\sigma > 0$ and every $s, t \in T$ such that $s > t \ge 0$; and

(ii) for every $s, t \in T$ with $s > t \ge 0$ and every $\sigma > 0$ there exist $x^*$ and $y^*$ such that $(x^*, t + \sigma) \prec (y^*, s + \sigma)$ and $(x^*, t) \succ (y^*, s)$.

Proposition 5 gives conditions which are equivalent to present bias for preferences with a discounted utility representation:

Proposition 5.

Suppose that $\succcurlyeq$ has a discounted utility representation. Then the first condition of Definition 2 is equivalent to convexity of $\ln D(t)$; while the second condition of Definition 2 is equivalent to strict convexity of $\ln D(t)$.

Proof. We start by proving the first equivalence. Since a discounted utility representation exists, the first condition is equivalent to:

$$u(x) D(t) \le u(y) D(s) \quad \text{implies} \quad u(x) D(t + \sigma) \le u(y) D(s + \sigma)$$

for every $x, y$, every $\sigma > 0$ and every $s, t \in T$ with $s > t \ge 0$. This may be rewritten as follows:

$$u(x) \le \frac{D(s)}{D(t)}\, u(y) \quad \text{implies} \quad u(x) \le \frac{D(s + \sigma)}{D(t + \sigma)}\, u(y).$$

Since $(y, s) \succcurlyeq (x, t)$, $s > t$ and $D$ is strictly decreasing, it follows that $u(y) > u(x)$. As $u(0) = 0$ and $u$ is strictly increasing, we deduce that $u(y) > 0$. Since $u$ is continuous, $x$ and $y$ can be chosen so that $u(x)/u(y)$ takes any value in $[0, 1)$. Hence the implication above holds if and only if

$$\frac{D(s)}{D(t)} \le \frac{D(s + \sigma)}{D(t + \sigma)}.$$

Alternatively,

$$\ln D(s) + \ln D(t + \sigma) \le \ln D(s + \sigma) + \ln D(t) \tag{6}$$

for every $\sigma > 0$ and every $s, t \in T$ with $s > t \ge 0$. Inequality (6) is equivalent to convexity of $\ln D(t)$.

Footnote: There is an inconsistency between Jackson and Yariv's Present Bias definition in their 2014 paper (referenced here) and their 2015 paper [12]. We adhere to the former definition.

We now turn to the second equivalence. By the discounted utility representation, the second condition of Definition 2 states that for every $s, t \in T$ with $s > t \ge 0$ and every $\sigma > 0$ there exist $x^*$ and $y^*$ such that:

$$u(x^*) D(t + \sigma) < u(y^*) D(s + \sigma) \quad \text{but} \quad u(x^*) D(t) > u(y^*) D(s).$$

Equivalently,

$$\frac{D(s)}{D(t)}\, u(y^*) < u(x^*) < \frac{D(s + \sigma)}{D(t + \sigma)}\, u(y^*).$$

From the fact that $(y^*, s + \sigma)$ is preferred to $(x^*, t + \sigma)$ with $s > t$ we deduce that $u(y^*) > 0$. Hence such $x^*$ and $y^*$ exist if and only if

$$\frac{D(s)}{D(t)} < \frac{D(s + \sigma)}{D(t + \sigma)}.$$

This inequality is equivalent to:

$$\ln D(s) + \ln D(t + \sigma) < \ln D(s + \sigma) + \ln D(t) \tag{7}$$

for every $s, t \in T$ with $s > t \ge 0$ and every $\sigma > 0$. Inequality (7) holds if and only if $\ln D(t)$ is strictly convex.

Proposition 5 implies that when a discounted utility representation exists the first condition of Definition 2 follows from the second one, since strict convexity of $\ln D(t)$ implies convexity of $\ln D(t)$. An immediate consequence is that present bias is equivalent to strictly DI, as stated below:

Corollary 6.

Suppose the preference order $\succcurlyeq$ admits a discounted utility representation. Then it exhibits present bias if and only if $\succcurlyeq$ exhibits strictly DI.
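The second condition of Definition 2 can be illustrated constructively. The following is a minimal sketch, assuming the utility level $u(y^*)$ is normalised to 1 and that the outcome set is rich enough for $u(x^*)$ to take any value in $[0, 1)$; it picks $u(x^*)$ halfway between $D(s)/D(t)$ and $D(s+\sigma)/D(t+\sigma)$ whenever that interval is non-empty. The function names and the tolerance are illustrative assumptions.

```python
def reversal_ratio(D, t, s, sigma, tol=1e-12):
    """Return a value of u(x*)/u(y*) strictly between D(s)/D(t) and
    D(s + sigma)/D(t + sigma), or None if that interval is numerically empty."""
    lower = D(s) / D(t)
    upper = D(s + sigma) / D(t + sigma)
    if upper - lower <= tol:
        return None
    return 0.5 * (lower + upper)


def shows_reversal(D, t, s, sigma):
    """True if some pair (x*, y*) is ranked early-over-late at (t, s) but
    late-over-early once both dates are delayed by sigma."""
    ratio = reversal_ratio(D, t, s, sigma)
    if ratio is None:
        return False
    u_x, u_y = ratio, 1.0  # normalisation u(y*) = 1 is an assumption of this sketch
    prefers_early_now = u_x * D(t) > u_y * D(s)
    prefers_late_after_delay = u_x * D(t + sigma) < u_y * D(s + sigma)
    return prefers_early_now and prefers_late_after_delay


def hyperbolic(t):
    """Proportional hyperbolic discounting with h = 1 (strictly log-convex)."""
    return 1.0 / (1.0 + t)


def exponential(t):
    """Exponential discounting with discount factor 0.9 (ln D is linear)."""
    return 0.9 ** t


if __name__ == "__main__":
    print(shows_reversal(hyperbolic, 0.0, 1.0, 1.0))
    print(shows_reversal(exponential, 0.0, 1.0, 1.0))
```

For the hyperbolic function a reversing pair exists, whereas for the exponential function the two ratios coincide and no such pair can be found, consistent with Corollary 6.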

Assume now that there are two decision makers and they are both decreasingly impatient. What does it mean to say that one of them is more decreasingly impatient than the other? The answer to this question is in the following definition:

Definition 3 (cf. [19], Definition 2; [4], Definition 1). We say that $\succcurlyeq_1$ exhibits [strictly] more DI than $\succcurlyeq_2$ if for every $\sigma > 0$, every $\rho$, every $s, t \in T$ with $0 \le t < s$ and every $x, x', y, y' \in X$ with $y > x > 0$ and $y' > x' > 0$, the conditions $(x', t) \sim_2 (y', s)$, $(x', t + \sigma) \sim_2 (y', s + \sigma + \rho)$ and $(x, t) \sim_1 (y, s)$ imply $(x, t + \sigma) \preccurlyeq_1 [\prec_1]\ (y, s + \sigma + \rho)$.

Not surprisingly, the (strictly) more DI relation may be expressed in terms of the comparative convexity of the logarithms of the respective discount functions, for cases in which both preference relations have discounted utility representations.

Footnote: Since the sign of $\rho$ is not restricted in Definition 3, it actually applies to preferences that exhibit decreasing or increasing impatience.

Proposition 7 (cf. [19], Proposition 1). Let $\succcurlyeq_1$ and $\succcurlyeq_2$ be two preference orders with discounted utility representations $(u_1, D_1)$ and $(u_2, D_2)$, respectively. The following conditions are equivalent:

(i) The preference order $\succcurlyeq_1$ exhibits (strictly) more DI than $\succcurlyeq_2$;

(ii) $\ln D_1(D_2^{-1}(e^z))$ is (strictly) convex in $z$ on $(-\infty, 0]$.

Proof. See the Appendix. We follow Prelec's argument for his Proposition 1 in [19]. The additional adjustment is the necessity to replace convexity of the log-transformed discount function with strict convexity for the strictly more DI case. The required adjustments are not substantial but we have included a detailed proof as it clarifies some details omitted from Prelec's original version [19].

Note that the form of the utility functions $u_1$ and $u_2$ does not influence the comparative DI properties of preference relations.

Corollary 8.

Let $\succcurlyeq_1$ and $\succcurlyeq_2$ be two preference relations with discounted utility representations $(u_1, D_1)$ and $(u_2, D_2)$, respectively, where $D_2(t) = \delta^t$ and $\delta \in (0, 1)$. The preference order $\succcurlyeq_1$ exhibits (strictly) DI if and only if it exhibits (strictly) more DI than $\succcurlyeq_2$.

Proof. Prelec [19] proves that a preference relation is DI if and only if it is more DI than an exponential discount function. We prove the "strict" part of the claim. Since $D_2(t) = \delta^t$ and $\delta \in (0, 1)$ we have

$$D_2^{-1}(e^z) = \frac{z}{\ln \delta} \ge 0.$$

By Proposition 7, for $\succcurlyeq_1$ to exhibit strictly more DI than $\succcurlyeq_2$ it is necessary and sufficient that $\ln D_1(D_2^{-1}(e^z))$ is strictly convex in $z$ on $(-\infty, 0]$. Note that

$$\ln D_1(D_2^{-1}(e^z)) = \ln D_1\!\left(\frac{z}{\ln \delta}\right) = \ln D_1(t), \quad \text{where } t = \frac{z}{\ln \delta} \in [0, \infty)$$

when $z$ takes arbitrary values in $(-\infty, 0]$. Hence strict convexity of $\ln D_1(D_2^{-1}(e^z))$ in $z$ on $(-\infty, 0]$ is equivalent to strict convexity of $\ln D_1(t)$ in $t$ on $[0, \infty)$. By Proposition 4, strict convexity of $\ln D_1(t)$ in $t$ on $[0, \infty)$ is equivalent to $\succcurlyeq_1$ exhibiting strictly DI.

The following notations will be used below:

• If $D_1$ and $D_2$ represent equally DI preferences, we write $D_1 \sim_{DI} D_2$;
• If $D_1$ represents more DI preferences than $D_2$, we write $D_1 \succcurlyeq_{DI} D_2$;
• If $D_1$ represents strictly more DI preferences than $D_2$, we write $D_1 \succ_{DI} D_2$.

The following corollary, due to Prelec [19], characterizes the relation between any two discount functions from the same DI class.

Corollary 9 ([19]). For any two discount functions $D_1$ and $D_2$, we have $D_1 \sim_{DI} D_2$ if and only if $D_1(t) = D_2(t)^c$, where $c > 0$ is a constant not depending on $t$.

The $\succcurlyeq_{DI}$ relation is a partial order. In fact, the "more DI" and "strictly more DI" relations are both transitive. This is established in the following proposition.

Proposition 10. If $D_1 \succcurlyeq_{DI} D_2$ and $D_2 \succcurlyeq_{DI} D_3$, then $D_1 \succcurlyeq_{DI} D_3$. If at least one of the relations $D_1 \succcurlyeq_{DI} D_2$ or $D_2 \succcurlyeq_{DI} D_3$ is strict, then $D_1 \succ_{DI} D_3$.

Proof. Suppose $D_1 \succcurlyeq_{DI} D_2$ and $D_2 \succcurlyeq_{DI} D_3$. By Proposition 7, we know that both $\ln D_1(D_2^{-1}(e^z))$ and $\ln D_2(D_3^{-1}(e^z))$ are convex in $z$ on $(-\infty, 0]$. Letting $h_i = \ln D_i$ for $i \in \{1, 2, 3\}$, we can equivalently state that $h_1 \circ h_2^{-1}$ and $h_2 \circ h_3^{-1}$ are convex on $(-\infty, 0]$. We need to show that $\ln D_1(D_3^{-1}(e^z))$ is convex in $z$ on $(-\infty, 0]$, that is, $h_1 \circ h_3^{-1}$ is convex on $(-\infty, 0]$. Let $f_1 = h_1 \circ h_2^{-1}$ and $f_2 = h_2 \circ h_3^{-1}$. Then

$$\ln D_1\!\left(D_3^{-1}(e^z)\right) = h_1\!\left(h_3^{-1}(z)\right) = h_1 h_2^{-1}\!\left(h_2 h_3^{-1}(z)\right) = f_1 \circ f_2(z) = f(z).$$

By the assumption, $f_1$ and $f_2$ are convex functions. Note that $f_1$ is increasing, as the composition of two decreasing functions $h_1$ and $h_2^{-1}$. Indeed, $h_1 = \ln D_1$ is a strictly decreasing function as $D_1$ is strictly decreasing, and $h_2^{-1}$ is a decreasing function as the inverse of the decreasing function $h_2$. Lemma 1 then implies that $f(z) = f_1 \circ f_2(z) = \ln D_1\!\left(D_3^{-1}(e^z)\right)$ is convex, and that $f$ is strictly convex if $f_i$ is strictly convex for some $i \in \{1, 2\}$.

In this section we assume that $D$ is twice continuously differentiable. The rate of time preference, $r(t)$, is defined as follows:

$$r(t) = -\frac{D'(t)}{D(t)}.$$

The following lemma relates the DI property to the behaviour of $r(t)$.

Lemma 11.

Let $\succcurlyeq$ be a preference relation with DU representation $(u, D)$ in which $D$ is twice continuously differentiable. Then the following conditions are equivalent:

(i) The preference relation exhibits (strictly) DI;

(ii) The time preference rate $r(t)$ is (strictly) decreasing on $[0, \infty)$.

Footnote: Takeuchi [26] contains a related result. His Corollary 1 says that the hazard function is decreasing (increasing) if and only if preferences exhibit decreasing (increasing) impatience. Takeuchi's hazard function $h(t)$ corresponds to our time preference rate $r(t)$. However, Takeuchi does not analyse the case of strictly decreasing impatience.

Proof. Suppose that $r(t)$ is decreasing on $[0, \infty)$. This is equivalent to

$$r'(t) = -\frac{D''(t) D(t) - (D'(t))^2}{D(t)^2} = \frac{(D'(t))^2 - D''(t) D(t)}{D(t)^2} \le 0.$$

Or, alternatively, $D'' D - (D')^2 \ge 0$. This inequality is equivalent to log-convexity of $D$, which, by Proposition 4, means that the preference order exhibits DI.

To prove the equivalence of strictly DI preferences and a strictly decreasing rate of time preference, recall that a continuously differentiable function $r : \mathbb{R}_+ \to \mathbb{R}$ is strictly decreasing if and only if $r'(t) \le 0$ for all $t$ and the set $\{t \mid r'(t) = 0\}$ contains no non-trivial interval [25, 23]. If a function $v$ is differentiable on an open interval $I \subset \mathbb{R}$, then $v$ is strictly convex on $I$ if and only if $v'$ is strictly increasing on $I$ [23]. Assume that $r(t)$ is strictly decreasing on $[0, \infty)$. Let $M \subseteq \mathbb{R}_+$ be the set of $t$ values such that $r'(t) < 0$. Then $D''(t) D(t) - [D'(t)]^2 > 0$ for all $t \in M$. Since $\mathbb{R}_+ \setminus M$ contains no non-trivial interval, $r(t)$ being strictly decreasing is equivalent to $D$ being strictly log-convex.

One way to measure the level of DI for suitably differentiable discount functions was suggested by Prelec [19]. Since more DI preferences have discount functions which are more log-convex, the natural criterion would be some measure of convexity of the log of the discount function. The Arrow-Pratt coefficient, which is a measure of the concavity of a function, can be adapted to this purpose. Indeed, a non-increasing rate of time preference, $r'(t) \le 0$, is precisely analogous to the notion of decreasing risk aversion in Pratt [18].

Recall that $D$ is a twice continuously differentiable function. The associated rate of impatience, $IR(D)$, is defined as follows:

$$IR(D) = -\frac{D''}{D'}.$$

The index of DI of $D$, denoted $I_{DI}(D)$, is the difference between the rate of impatience and the rate of time preference:

$$I_{DI}(D) = IR(D) - r(D) = \left(-\frac{D''}{D'}\right) - \left(-\frac{D'}{D}\right).$$

Note that

$$I_{DI}(D)(t) = -\frac{r'(t)}{r(t)} = -\frac{d}{dt} \ln [r(t)]. \tag{8}$$

Prelec [19] proved that if $\succcurlyeq_1$ and $\succcurlyeq_2$ both have DU representations with twice continuously differentiable discount functions, $D_1$ and $D_2$ respectively, then $\succcurlyeq_1$ exhibits more DI than $\succcurlyeq_2$ if and only if $I_{DI}(D_1) \ge I_{DI}(D_2)$ on the interval $[0, \infty)$. The following proposition strengthens this result.

Proposition 12. Let $\succcurlyeq_1$ and $\succcurlyeq_2$ have DU representations with discount functions $D_1$ and $D_2$, respectively, where $D_1$ and $D_2$ are twice continuously differentiable. Then the preference order $\succcurlyeq_1$ exhibits strictly more DI than $\succcurlyeq_2$ if and only if $I_{DI}(D_1) \ge I_{DI}(D_2)$ on the interval $[0, \infty)$ and $\{t \mid I_{DI}(D_1)(t) = I_{DI}(D_2)(t)\}$ contains no non-trivial interval.

Proof. Prelec's [19, Proposition 2] proof applies the Arrow-Pratt coefficient [18], which is used to compare the concavity of functions. There is no straightforward adaptation of Prelec's argument to the case of strict concavity. We therefore adapt Pratt's [18] original argument directly.

Recall that $D_1$ is strictly more DI than $D_2$ if and only if $\ln(D_1)$ is strictly more convex than $\ln(D_2)$ on $[0, \infty)$. Let $h_1 = \ln(D_1)$ and $h_2 = \ln(D_2)$, so $h_1$ and $h_2$ are strictly decreasing functions. The function $h_1$ is strictly more convex than $h_2$ if and only if there exists a strictly convex function $f$ on $(-\infty, 0]$ such that $h_1 = f(h_2)$, or, equivalently, $h_1\!\left(h_2^{-1}(z)\right)$ is strictly convex on $(-\infty, 0]$. The first derivative of $h_1\!\left(h_2^{-1}(z)\right)$ is:

$$\frac{d\, h_1\!\left(h_2^{-1}(z)\right)}{dz} = \frac{h_1'\!\left(h_2^{-1}(z)\right)}{h_2'\!\left(h_2^{-1}(z)\right)}. \tag{9}$$

We need to show that expression (9) is strictly increasing. Note that $h_2^{-1}(z)$ is strictly decreasing since $h_2$ is strictly decreasing. Therefore, (9) is strictly increasing if and only if $h_1'(x) / h_2'(x)$ is strictly decreasing. The latter is satisfied if and only if

$$\log \left[\frac{h_1'(x)}{h_2'(x)}\right] \tag{10}$$

strictly decreases (since $\log(x)$ is strictly increasing). The first derivative of (10) is:

$$\frac{h_2'(x)}{h_1'(x)} \cdot \frac{h_1''(x) h_2'(x) - h_1'(x) h_2''(x)}{[h_2'(x)]^2} = \frac{h_1''(x)}{h_1'(x)} - \frac{h_2''(x)}{h_2'(x)}.$$

Therefore (10) is strictly decreasing if and only if

$$\frac{h_1''(x)}{h_1'(x)} - \frac{h_2''(x)}{h_2'(x)} \le 0$$

and the set

$$\left\{ x \,\middle|\, \frac{h_1''(x)}{h_1'(x)} - \frac{h_2''(x)}{h_2'(x)} = 0 \right\}$$

contains no non-trivial interval. Note that:

$$\frac{h_i''}{h_i'} = \frac{D_i''}{D_i'} - \frac{D_i'}{D_i}.$$

Therefore,

$$\frac{h_1''(x)}{h_1'(x)} - \frac{h_2''(x)}{h_2'(x)} \le 0 \quad \text{if and only if} \quad -\frac{D_1''}{D_1'} - \left(-\frac{D_1'}{D_1}\right) \ge -\frac{D_2''}{D_2'} - \left(-\frac{D_2'}{D_2}\right).$$

This means that $D_1 \succ_{DI} D_2$ if and only if $I_{DI}(D_1) \ge I_{DI}(D_2)$ on $[0, \infty)$, and $\{t \mid I_{DI}(D_1)(t) = I_{DI}(D_2)(t)\}$ contains no non-trivial interval.

From Proposition 12, Lemma 11 and (8) it follows that $\succcurlyeq$ is DI if and only if $I_{DI}(D) \ge 0$ on $[0, \infty)$, and $\succcurlyeq$ is strictly DI if and only if $I_{DI}(D) \ge 0$ on $[0, \infty)$ and $\{t \mid I_{DI}(D)(t) = 0\}$ contains no non-trivial interval. Note that the index of DI equals zero for an exponential discount function.

The following example illustrates the index of DI for a generalized hyperbolic discount function. We will make use of this information later.

Example 1.

The function $D(t) = (1 + ht)^{-\alpha/h}$, with $h > 0$ and $\alpha > 0$, is called the generalized hyperbolic discount function. For this function we have:

$$r(t) = \alpha(1 + ht)^{-1} \quad\text{and}\quad IR(D)(t) = (\alpha + h)(1 + ht)^{-1}.$$

Therefore, $I_{DI}(D)(t) = h(1 + ht)^{-1}$. If $D_1(t) = (1 + h_1 t)^{-\alpha/h_1}$ and $D_2(t) = (1 + h_2 t)^{-\alpha/h_2}$ are two generalized hyperbolic discount functions then $D_1 \succcurlyeq_{DI} [\succ_{DI}]\; D_2$ if and only if $h_1 \ge [>]\; h_2$. Thus the parameter $h$ may be used as a measure of the degree of DI of a generalized hyperbolic discount function, while the parameter $\alpha$ has no influence on $I_{DI}(D)$. We call parameter $h$ the hyperbolic discount rate. The special case of a generalized hyperbolic discount function with $\alpha = h > 0$ is called the proportional hyperbolic discount function.

As described in the introduction, there are some situations in which the necessity arises to calculate a convex combination (mixture) of discount functions.
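Before turning to mixtures, the closed form in Example 1 can be checked numerically. The following is a minimal sketch, not part of the original analysis: it estimates $I_{DI}(D)(t) = -D''(t)/D'(t) + D'(t)/D(t)$ by central finite differences and compares it with $h(1 + ht)^{-1}$. The function names, the step size `eps` and the parameter values are illustrative assumptions.

```python
def generalized_hyperbolic(h, alpha):
    """Return D(t) = (1 + h t)^(-alpha/h) as a Python function."""
    return lambda t: (1.0 + h * t) ** (-alpha / h)


def index_of_di(D, t, eps=1e-3):
    """Finite-difference estimate of I_DI(D)(t) = -D''(t)/D'(t) + D'(t)/D(t).

    eps is an assumed step size; t - eps must stay inside the domain of D.
    """
    d0 = D(t)
    d1 = (D(t + eps) - D(t - eps)) / (2.0 * eps)          # central first derivative
    d2 = (D(t + eps) - 2.0 * d0 + D(t - eps)) / eps ** 2  # central second derivative
    return -d2 / d1 + d1 / d0


if __name__ == "__main__":
    # Illustrative parameter pairs (h, alpha); not taken from the paper.
    for h, alpha in [(0.1, 0.05), (0.1, 0.5), (0.4, 0.5)]:
        D = generalized_hyperbolic(h, alpha)
        for t in (1.0, 5.0, 20.0):
            print(h, alpha, t, index_of_di(D, t), h / (1.0 + h * t))
```

In this sketch, changing `alpha` while holding `h` fixed leaves the estimated index unchanged up to numerical error, which is the sense in which $\alpha$ has no influence on $I_{DI}(D)$.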

The first situation where a convex combination of discount functions may be used is when there is a group of decision makers with different discount functions and a social discount function needs to be constructed. The natural option is averaging the discount functions among individuals, which is equivalent to averaging the discounted utilities when all agents have identical utility functions. This approach has been widely used in the existing literature on time preferences ([11]).

Indeed, suppose that we have a set of agents $M = \{1, \ldots, m\}$. Assume that agent $i$ has time preferences with DU representation $(u, D_i)$. Thus, all agents have the same utility function $u$. Define the collective (utilitarian) utility function as $\hat{u}(x) = m\,u(x)$; the collective total utility at time $t$ is then

$$\hat{U}(x, t) = \sum_{i=1}^{m} D_i(t)\,u(x) = \left(\frac{1}{m}\sum_{i=1}^{m} D_i(t)\right)\hat{u}(x).$$

Thus, we obtain the collective discount function $D = \frac{1}{m}\sum_{i=1}^{m} D_i$.

Footnote: Similarly, $\succcurlyeq$ is II if and only if $I_{DI}(D) \le 0$ on $[0, \infty)$, and strictly II if and only if $I_{DI}(D) \le 0$ on $[0, \infty)$ and $\{t \mid I_{DI}(D)(t) = 0\}$ contains no non-trivial interval.

In the second possible scenario, discussed by Sozou [24] and Weitzman [27], there is a single decision maker with some uncertainty about her discount function. For example, there may be some possibility of not surviving to any given period, $t$, described by a survival function with an uncertain (constant) hazard rate [24]. Then the expected discount function of this decision maker can be calculated as a weighted average of the distinct discount functions that may eventuate.

If the discount function $D_i$ eventuates with probability $p_i$, then the expected utility of the decision maker is

$$\hat{U}(x, t) = \sum_{i=1}^{m} p_i D_i(t)\,u(x) = \left(\sum_{i=1}^{m} p_i D_i(t)\right)u(x),$$

and the certainty equivalent discount function will be $D = \sum_{i=1}^{m} p_i D_i$.

The same question arises in both cases: Is it possible to make some conclusion about the behaviour of the convex combination of distinct discount functions in comparison with its components, if all the component discount functions exhibit DI?

Given a set of discount functions $\{D_1, D_2, \ldots, D_n\}$, we define a mixture of them as

$$D = \sum_{i=1}^{n} \lambda_i D_i,$$

where $0 < \lambda_i < 1$ for each $i$ and $\sum_{i=1}^{n} \lambda_i = 1$. Note that we define a mixture such that each $D_i$ has a strictly positive weight.

We first discuss some known results related to the mixture of discount functions. One of the most recent results was obtained by Jackson and Yariv [11], who demonstrated that if all decision makers in a group have exponential discount functions, but they do not all have the same discount factor, then their collective discount function must be present biased.

It has also been noted by several authors (for example, [20] and [22]) that time preferences have strong similarities with risk preferences and that some results from risk theory are relevant in the context of intertemporal choice. Pratt [18] showed that decreasing risk aversion is preserved under linear combinations. As was observed in Section 3.2, decreasing risk aversion is analogous to a non-increasing time preference rate, or DI of the discount function. Therefore, Pratt's result can be translated into our time preference framework as follows:

Proposition 13 (cf. [18], Theorem 5). Let $\succcurlyeq_1, \succcurlyeq_2, \ldots, \succcurlyeq_n$ have DU representations with twice continuously differentiable discount functions $D_1, \ldots, D_n$, respectively. Assume that $\succcurlyeq_1, \succcurlyeq_2, \ldots, \succcurlyeq_n$ all exhibit DI. Let

$$D = \sum_{i=1}^{n} \lambda_i D_i$$

be a mixture of $D_1, \ldots, D_n$. Then $D$ is DI. It is strictly DI if and only if

$$\{t \mid r_1(t) = r_2(t) = \ldots = r_n(t) \text{ and } r_1'(t) = r_2'(t) = \ldots = r_n'(t) = 0\}$$

contains no non-trivial interval.

Proof. From the definition of time preference rate it follows that $D_i' = -r_i D_i$ for all $i = 1, \ldots, n$. The time preference rate for $D$ is:

$$r = -\frac{D'}{D} = -\frac{\sum_{i=1}^{n} \lambda_i D_i'}{D} = \sum_{i=1}^{n} \frac{\lambda_i D_i}{D}\, r_i.$$

By Lemma 11, to prove that $D$ exhibits DI we must show that $r'(t) \le 0$ for all $t$. Differentiating, we obtain:

$$r' = \sum_{j=1}^{n} \frac{\lambda_j D_j' \sum_{i=1}^{n} \lambda_i D_i - \lambda_j D_j \sum_{i=1}^{n} \lambda_i D_i'}{D^2}\, r_j + \sum_{j=1}^{n} \frac{\lambda_j D_j}{D}\, r_j'.$$

Rearranging and substituting $D_i' = -r_i D_i$ we obtain:

$$r' = \sum_{j=1}^{n} \frac{\lambda_j D_j}{D}\, r_j' + \frac{Q}{D^2},$$

where

$$Q = -\sum_{j=1}^{n} \left[\lambda_j r_j D_j \sum_{i=1}^{n} \lambda_i D_i - \lambda_j D_j \sum_{i=1}^{n} \lambda_i r_i D_i\right] r_j.$$

This is a quadratic form in $D_1, D_2, \ldots, D_n$ with the coefficient on $D_i D_j$ ($i \ne j$) being

$$\lambda_i \lambda_j \left(r_i r_j - r_j^2\right) + \lambda_i \lambda_j \left(r_i r_j - r_i^2\right) = -\lambda_i \lambda_j (r_i - r_j)^2.$$

Hence

$$Q = -\sum_{i<j} \lambda_i \lambda_j (r_i - r_j)^2 D_i D_j \le 0.$$

Since $r_i' \le 0$ for all $i = 1, \ldots, n$, we have $r'(t) \le 0$. Therefore, $\succcurlyeq$ is DI. The preference relation $\succcurlyeq$ is strictly DI if and only if $r(t)$ is strictly decreasing. We see that $r(t)$ is strictly decreasing if and only if

$$\{t \mid r_1(t) = r_2(t) = \ldots = r_n(t) \text{ and } r_1'(t) = r_2'(t) = \ldots = r_n'(t) = 0\}$$

contains no non-trivial interval.

The following corollary describes an important special case of Proposition 13:

Corollary 14.

Mixtures of non-identical exponential discount functions are strictly DI.
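Corollary 14 can be illustrated numerically. The sketch below is only an illustration under assumed parameter values (two exponential discount functions with rates 0.02 and 0.08 and equal weights; these numbers are hypothetical). It evaluates the time preference rate $r(t) = -D'(t)/D(t)$ of the mixture in closed form and checks that it falls over time, as strict DI requires.

```python
import math


def mixture_rate(weights, rates, t):
    """Time preference rate r(t) = -D'(t)/D(t) of D(t) = sum_i w_i exp(-r_i t)."""
    num = sum(w * r * math.exp(-r * t) for w, r in zip(weights, rates))
    den = sum(w * math.exp(-r * t) for w, r in zip(weights, rates))
    return num / den


def is_strictly_decreasing(values):
    """True if each entry is strictly smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    weights = [0.5, 0.5]   # assumed mixture weights
    rates = [0.02, 0.08]   # assumed exponential discount rates
    path = [mixture_rate(weights, rates, t) for t in range(0, 101, 5)]
    print(path[0], path[-1], is_strictly_decreasing(path))
```

At $t = 0$ the rate equals the weighted arithmetic mean of the individual rates; it then declines because the weight $\lambda_i D_i / D$ shifts towards the more patient component.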

Corollary 14 is therefore a continuous-time version of Jackson and Yariv (2014, Proposition 1). Prelec [19, Corollary 4] considers the mixture of two discount functions only, but does not require differentiability. He proves that the mixture of two equally DI discount functions is more DI than its components. Prelec [19, Corollary 4] implies the special case of Jackson and Yariv's [11] result when $n = 2$.

Our objective is to establish a result that is more general than both Prelec [19] and Jackson and Yariv [11]. The result we obtain is stated in the following theorem:

Theorem 1. Let $n \ge 2$ and let $D_1, \ldots, D_n$ be distinct discount functions such that $D_1 \succcurlyeq_{DI} D_2 \succcurlyeq_{DI} \ldots \succcurlyeq_{DI} D_n$. If $D$ is a mixture of $D_1, \ldots, D_n$, then $D \succ_{DI} D_n$.

To construct the proof of this theorem, two preliminary results will be useful. The first is a strengthening of a result in Prelec [19].

Proposition 15.

Let $\lambda \in (0, 1)$. If two distinct discount functions $D_1$ and $D_2$ satisfy $D_1 \sim_{DI} D_2$, then their mixture, $D = \lambda D_1 + (1 - \lambda) D_2$, represents strictly more DI preferences than each $D_i$. That is, $D \succ_{DI} D_1$ and $D \succ_{DI} D_2$.

Proof. As $D_1 \sim_{DI} D_2$, then, by Corollary 9, $D_1(t) = D_2(t)^c$, where $c \ne 1$, $c > 0$. By Proposition 7, it is necessary to demonstrate strict convexity of $f(z) = \ln D\left(D_1^{-1}(e^z)\right)$ for $z \le 0$. By Proposition 10 it is also sufficient. We note that:

$$f(z) = \ln D\left(D_1^{-1}(e^z)\right) = \ln\left(\lambda D_1\left(D_1^{-1}(e^z)\right) + (1 - \lambda) D_2\left(D_1^{-1}(e^z)\right)\right) = \ln\left(\lambda e^z + (1 - \lambda) e^{z/c}\right).$$

The first-order derivative of $f(z)$ is:

$$f'(z) = \frac{\lambda e^z + \frac{1}{c}(1 - \lambda) e^{z/c}}{\lambda e^z + (1 - \lambda) e^{z/c}}.$$

The second-order derivative is:

$$f''(z) = \frac{\left(\lambda e^z + \frac{1}{c^2}(1 - \lambda) e^{z/c}\right)\left(\lambda e^z + (1 - \lambda) e^{z/c}\right) - \left(\lambda e^z + \frac{1}{c}(1 - \lambda) e^{z/c}\right)^2}{\left(\lambda e^z + (1 - \lambda) e^{z/c}\right)^2} = \frac{p(z)}{q(z)}.$$

Both the denominator $q(z)$ and the numerator $p(z)$ of this fraction are strictly positive. The former is obvious. To see the latter, note that the numerator $p(z)$ can be simplified as follows:

$$p(z) = \left(\lambda e^z + \frac{1}{c^2}(1 - \lambda) e^{z/c}\right)\left(\lambda e^z + (1 - \lambda) e^{z/c}\right) - \left(\lambda e^z + \frac{1}{c}(1 - \lambda) e^{z/c}\right)^2$$

$$= \lambda(1 - \lambda) e^{\left(1 + \frac{1}{c}\right)z} - \frac{2}{c}\lambda(1 - \lambda) e^{\left(1 + \frac{1}{c}\right)z} + \frac{1}{c^2}\lambda(1 - \lambda) e^{\left(1 + \frac{1}{c}\right)z} = e^{\left(1 + \frac{1}{c}\right)z}\,\lambda(1 - \lambda)\left(1 - \frac{1}{c}\right)^2.$$

Since $c \ne 1$, we have $p(z) > 0$. Therefore, $f$ is a strictly convex function.

Proposition 15 is a stronger version of Corollary 4 in [19], as we show that the mixture of two discount functions that are equally DI represents strictly more DI preferences, rather than just more DI preferences. It is important to note that Proposition 15 cannot be directly generalized to $n$ discount functions by induction. To obtain the path to such a generalization, we will need the following lemma.

Lemma 16.

Let $\lambda \in (0, 1)$. If two distinct discount functions $D_1$ and $D_2$ satisfy $D_1 \succcurlyeq_{DI} D_2$, then their mixture $D = \lambda D_1 + (1 - \lambda) D_2$ represents strictly more DI preferences than $D_2$. That is, $D \succ_{DI} D_2$.

Proof. If $D_1 \sim_{DI} D_2$ then the conclusion follows from Proposition 15. Suppose that $D_1$ represents strictly more DI preferences than $D_2$. Then, by Proposition 7, $\ln D_1\left(D_2^{-1}(e^z)\right)$ is strictly convex on $(-\infty, 0]$. We must show that $\ln D\left(D_2^{-1}(e^z)\right)$ is also strictly convex on $(-\infty, 0]$. We have

$$D\left(D_2^{-1}(e^z)\right) = \lambda D_1 D_2^{-1}(e^z) + (1 - \lambda) D_2 D_2^{-1}(e^z) = \lambda D_1 D_2^{-1}(e^z) + (1 - \lambda) e^z.$$

Denote $\lambda D_1 D_2^{-1}(e^z) = f$ and $(1 - \lambda) e^z = g$. Then we have:

$$D\left(D_2^{-1}(e^z)\right) = f + g,$$

where $f$ is strictly log-convex by assumption and $g = (1 - \lambda) e^z$ is log-convex. By Lemma 2, the sum of a strictly log-convex function and a log-convex function is strictly log-convex, hence $D\left(D_2^{-1}(e^z)\right)$ is strictly log-convex.

We are now ready to provide the proof of Theorem 1, since Lemma 16 can be straightforwardly generalized to the case of $n$ distinct discount functions.

Proof of Theorem 1. We prove this statement by induction on $n$. By Lemma 16 the result holds for $n = 2$. Suppose that the statement of the theorem is true for $n = k$. Let

$$D^{(k+1)} = \eta_1 D_1 + \ldots + \eta_{k+1} D_{k+1},$$

where $\sum_{i=1}^{k+1} \eta_i = 1$ and each $\eta_i \in (0, 1)$. Then

$$D^{(k+1)} = \eta_1 D_1 + \ldots + \eta_{k+1} D_{k+1} = (1 - \eta_{k+1})\left(\frac{\eta_1}{1 - \eta_{k+1}} D_1 + \ldots + \frac{\eta_k}{1 - \eta_{k+1}} D_k\right) + \eta_{k+1} D_{k+1}.$$

Let

$$D^{(k)} = \frac{\eta_1}{1 - \eta_{k+1}} D_1 + \ldots + \frac{\eta_k}{1 - \eta_{k+1}} D_k.$$

By the induction hypothesis, $D^{(k)} \succ_{DI} D_k$. It is also known that $D_k \succcurlyeq_{DI} D_{k+1}$, and hence, by Proposition 10, $D^{(k)} \succ_{DI} D_{k+1}$. However, the mixture of these two functions is exactly

$$D^{(k+1)} = (1 - \eta_{k+1}) D^{(k)} + \eta_{k+1} D_{k+1}.$$

Then, by Lemma 16, $D^{(k+1)} \succ_{DI} D_{k+1}$, which completes the induction step.

4.3 Mixtures of twice continuously differentiable discount functions

Note that when discount functions are suitably differentiable, Theorem 1 and Proposition 12 imply that

$$I_{DI}(D) \ge \min_i \{I_{DI}(D_i)\} \text{ on } [0, \infty) \tag{11}$$

and the set of $t$ values at which equality holds does not include any non-trivial interval. Consider the following example:

Example 2.

Let $D_1(t) = (1 + ht)^{-2}$ be a zero-speed hyperbolic discount function [13] and $D_2(t) = \exp(-\alpha t^{1/2})$ be a slow Weibull discount function [13]. As shown in Example 1, $I_{DI}(D_1)(t) = h(1 + ht)^{-1} > 0$ for all $t$. We also have $I_{DI}(D_2)(t) = (2t)^{-1} > 0$ for all $t$, since $r_2(t) = \frac{\alpha}{2} t^{-1/2}$ and $r_2'(t) = -\frac{\alpha}{4} t^{-3/2}$. Therefore, both $D_1$ and $D_2$ exhibit strict DI. Assume that $h = 0.1$. Then

$$I_{DI}(D_1)(t) - I_{DI}(D_2)(t) = \frac{0.1}{1 + 0.1t} - \frac{1}{2t} = \frac{t - 10}{20t(1 + 0.1t)}.$$

Obviously, $I_{DI}(D_1)(t) \le I_{DI}(D_2)(t)$ if and only if $t \le 10$, and $I_{DI}(D_1)(t) > I_{DI}(D_2)(t)$ if and only if $t > 10$. It follows that $D_1$ and $D_2$ are from incomparable DI classes. Since $D_1$ and $D_2$ both exhibit strict DI, Proposition 13 implies that their mixture $D$ also exhibits strict DI.

[Figure 1: Index of DI for the Mixture of $D_1$ and $D_2$. Curves: index of DI of $D_1$, index of DI of $D_2$, and index of DI of the mixture of $D_1$ and $D_2$, plotted against time.]

By direct calculation we obtain the following expression:

$$I_{DI}(D)(t) = \frac{6\lambda_1 h^2 (1 + ht)^{-4} + \frac{1}{4}\lambda_2 \alpha \exp\left(-\alpha t^{0.5}\right) t^{-1}\left(\alpha + t^{-0.5}\right)}{2\lambda_1 h (1 + ht)^{-3} + \frac{1}{2}\lambda_2 \alpha \exp\left(-\alpha t^{0.5}\right) t^{-0.5}} - \frac{2\lambda_1 h (1 + ht)^{-3} + \frac{1}{2}\lambda_2 \alpha \exp\left(-\alpha t^{0.5}\right) t^{-0.5}}{\lambda_1 (1 + ht)^{-2} + \lambda_2 \exp\left(-\alpha t^{0.5}\right)}.$$

The behaviour of $I_{DI}(D)$ with parameters $\lambda_1 = \lambda_2 = 0.5$, $h = 0.1$ and $\alpha = 0.$… is illustrated in Figure 1. It can be clearly seen from Figure 1 that neither $D \succ_{DI} D_1$ nor $D \succ_{DI} D_2$. However, $I_{DI}(D) \ge \min\{I_{DI}(D_1), I_{DI}(D_2)\}$ on $[0, \infty)$.

Example 2 suggests that the inequality (11) may continue to hold even if the discount functions are not DI-comparable. Theorem 2 verifies this conjecture.
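The inequality suggested by Example 2 can be checked numerically. The sketch below is illustrative only and rests on stated assumptions: it takes $D_1(t) = (1 + ht)^{-2}$ (the exponent consistent with the coefficients 6 and 2 in the displayed expression), $D_2(t) = \exp(-\alpha t^{1/2})$, $h = 0.1$ and equal weights, and it uses $\alpha = 0.3$ as an arbitrary illustrative value rather than the value behind Figure 1. The function names are hypothetical.

```python
import math


def index_di_components(t, h):
    """Indices of DI of D1(t) = (1 + h t)^(-2) and D2(t) = exp(-alpha sqrt(t)).

    Both closed forms come from the text: h/(1 + h t) and 1/(2 t).
    """
    return h / (1.0 + h * t), 1.0 / (2.0 * t)


def index_di_mixture(t, h, alpha, lam1, lam2):
    """I_DI(D)(t) = -D''/D' + D'/D for D = lam1*D1 + lam2*D2 (exact derivatives)."""
    d1 = (1.0 + h * t) ** -2
    d1p = -2.0 * h * (1.0 + h * t) ** -3
    d1pp = 6.0 * h ** 2 * (1.0 + h * t) ** -4
    d2 = math.exp(-alpha * math.sqrt(t))
    d2p = -0.5 * alpha * t ** -0.5 * d2
    d2pp = 0.25 * alpha * (alpha + t ** -0.5) / t * d2
    d = lam1 * d1 + lam2 * d2
    dp = lam1 * d1p + lam2 * d2p
    dpp = lam1 * d1pp + lam2 * d2pp
    return -dpp / dp + dp / d


if __name__ == "__main__":
    h, alpha = 0.1, 0.3  # alpha is an assumed illustrative value
    for t in (1.0, 5.0, 10.0, 20.0, 50.0):
        i1, i2 = index_di_components(t, h)
        print(t, i1, i2, index_di_mixture(t, h, alpha, 0.5, 0.5))
```

On this grid the two component indices cross at $t = 10$, while the mixture's index stays at or above the smaller of the two.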

Theorem 2.

Let $\succcurlyeq_1, \succcurlyeq_2, \ldots, \succcurlyeq_n$ have DU representations with twice continuously differentiable discount functions $D_1, D_2, \ldots, D_n$, respectively. Let $D = \sum_{i=1}^{n} \lambda_i D_i$ be a mixture of $D_1, D_2, \ldots, D_n$. Then

$$I_{DI}(D) \ge \min_i \{I_{DI}(D_i)\} \text{ on } [0, \infty),$$

and $I_{DI}(D)(\hat{t}) > \min_i \{I_{DI}(D_i)(\hat{t})\}$ if $r_j(\hat{t}) \ne r_k(\hat{t})$ for some $j \ne k$.

Proof. Let $I_i = I_{DI}(D_i)$ for all $i \in \{1, \ldots, n\}$ and let $I = I_{DI}(D)$. Recall that $D' = -rD$ and hence $D'' = Dr^2 - Dr' = Dr(r + I)$. Recall also that

$$I = -\frac{D''}{D'} + \frac{D'}{D}.$$

Therefore,

$$I = -\frac{\sum_{i=1}^{n} \lambda_i D_i''}{\sum_{i=1}^{n} \lambda_i D_i'} + \frac{\sum_{i=1}^{n} \lambda_i D_i'}{\sum_{i=1}^{n} \lambda_i D_i} = \frac{\sum_{i=1}^{n} \lambda_i D_i r_i (r_i + I_i)}{\sum_{i=1}^{n} \lambda_i D_i r_i} - \frac{\sum_{i=1}^{n} \lambda_i D_i r_i}{\sum_{i=1}^{n} \lambda_i D_i}.$$

This expression can be rearranged as follows:

$$I = \frac{\sum_{i=1}^{n} \lambda_i D_i r_i I_i}{\sum_{i=1}^{n} \lambda_i D_i r_i} + \frac{\sum_{i=1}^{n} \lambda_i D_i r_i^2}{\sum_{i=1}^{n} \lambda_i D_i r_i} - \frac{\sum_{i=1}^{n} \lambda_i D_i r_i}{\sum_{i=1}^{n} \lambda_i D_i} = \sum_{i=1}^{n} \alpha_i(t) I_i + Q,$$

where

$$Q = \frac{\sum_{i=1}^{n} \lambda_i D_i r_i^2}{\sum_{i=1}^{n} \lambda_i D_i r_i} - \frac{\sum_{i=1}^{n} \lambda_i D_i r_i}{\sum_{i=1}^{n} \lambda_i D_i} \quad\text{and}\quad \alpha_i = \frac{\lambda_i D_i r_i}{\sum_{k=1}^{n} \lambda_k D_k r_k},$$

with $\sum_{i=1}^{n} \alpha_i = 1$ and $\alpha_i \ge 0$. Note that

$$\min_i \{I_i\} \le \sum_{i=1}^{n} \alpha_i I_i \le \max_i \{I_i\} \text{ for all } t \in [0, \infty).$$

$Q$ can be rewritten as:

$$Q = \frac{\left[\sum_{i=1}^{n} \lambda_i D_i r_i^2\right] \cdot \left[\sum_{i=1}^{n} \lambda_i D_i\right] - \left[\sum_{i=1}^{n} \lambda_i D_i r_i\right]^2}{\left[\sum_{i=1}^{n} \lambda_i D_i r_i\right] \cdot \left[\sum_{i=1}^{n} \lambda_i D_i\right]}.$$

The denominator of $Q$ is strictly positive, so the sign of $Q$ depends on the sign of the numerator. Let $N$ be the numerator of $Q$:

$$N = \left[\sum_{i=1}^{n} \lambda_i D_i r_i^2\right] \cdot \left[\sum_{i=1}^{n} \lambda_i D_i\right] - \left[\sum_{i=1}^{n} \lambda_i D_i r_i\right]^2.$$

We can simplify $N$ as follows:

$$N = \sum_{i=1}^{n} \sum_{j=1}^{n} \lambda_i \lambda_j D_i D_j r_i^2 - \sum_{i=1}^{n} \sum_{j=1}^{n} \lambda_i \lambda_j D_i D_j r_i r_j.$$

Therefore, we have:

$$N = \sum_{i=1}^{n} \sum_{j=1}^{n} \theta_{ij}\, r_i (r_i - r_j),$$

where $\theta_{ij} = \lambda_i \lambda_j D_i D_j$. Since $\theta_{ij} = \theta_{ji} > 0$ for all $i$ and $j$ we see that

$$N = \sum_{i<j} \theta_{ij} (r_i - r_j)^2 \ge 0,$$

with strict inequality if $r_j \ne r_k$ for some $j \ne k$. It follows that $Q \ge 0$, and $Q > 0$ if $r_j \ne r_k$ for some $j \ne k$. Therefore, since $I = \sum_{i=1}^{n} \alpha_i I_i + Q$ and

$$\min_i \{I_i\} \le \sum_{i=1}^{n} \alpha_i I_i \le \max_i \{I_i\} \text{ for all } t \in [0, \infty),$$

we have:

$$\min_i \{I_i\} \le \min_i \{I_i\} + Q \le \sum_{i=1}^{n} \alpha_i I_i + Q = I.$$

In other words, $I \ge \min_i \{I_i\}$ on $[0, \infty)$, and $I(\hat{t}) > \min_i \{I_i(\hat{t})\}$ if $r_j(\hat{t}) \ne r_k(\hat{t})$ for some $j \ne k$.

Observe that this result does not require discount functions to exhibit decreasing impatience. Therefore, Theorem 2 makes less restrictive assumptions than Proposition 13: it allows the discount functions to exhibit increasing impatience.

4.4 Mixtures of proportional hyperbolic discount functions

Weitzman [27] shows that if different discount functions may eventuate with certain probabilities, then future costs and benefits must eventually be discounted at the lowest possible limiting time preference rate.
This result is particularly salient when the possible discount functions are all exponential, with constant time preference rates. The purpose of this section is to give an analogous result for proportional hyperbolic discount functions, with constant hyperbolic discount rates (Example 1). The result in this case is very different to Weitzman's. Long-term future benefits and costs are discounted, not at the lowest hyperbolic discount rate, but at the probability-weighted harmonic mean of the individual hyperbolic discount rates.

Suppose that there is some uncertainty about the rate of time preference, and we have a set of possible scenarios $N = \{1, \ldots, n\}$ where each time preference rate $r_i(t)$ may eventuate with probability $p_i \ge 0$, such that $\sum_{i=1}^{n} p_i = 1$. Since for each $i$

$$r_i(t) = -\frac{D_i'(t)}{D_i(t)},$$

the corresponding discount function can be expressed in terms of the rate of time preference as follows:

$$D_i(t) = \exp\left(-\int_0^t r_i(\tau)\, d\tau\right) \text{ for each } i \in N. \tag{12}$$

The certainty equivalent discount function will be:

$$D = \sum_{i=1}^{n} p_i D_i, \text{ where } p_i \ge 0 \text{ and } \sum_{i=1}^{n} p_i = 1.$$

Then the certainty equivalent time preference rate is $r = -D'/D$. Weitzman [27] proved that if each rate of time preference converges to a non-negative value as time goes to infinity, then the certainty equivalent rate of time preference converges to the lowest of these values. In other words, if $\lim_{t \to \infty} r_i(t) = r_i^*$ with $r_i^* \ge 0$ and $r_1^* < r_i^*$, where $i \ne 1$, then $\lim_{t \to \infty} r(t) = r_1^*$.

Example 3.

Note that $r_i(t)$ in (12) is constant if and only if $D_i$ is exponential. In this case we have:

$$D_i(t) = \exp(-r_i t) \text{ for each } i \in N,$$

where $r_i = \text{const}$. Therefore, Weitzman's result implies that $\lim_{t \to \infty} r(t) = \min_i r_i$. Figure 2 illustrates this for the case $n = 3$, $r_1 = 0.$…, $r_2 = 0.$…, $r_3 = 0.$… and $p_1 = p_2 = p_3 = 1/3$. We also observe that the certainty equivalent rate of time preference $r(t)$ decreases monotonically towards the lowest rate. This is a consequence of Corollaries 8 and 14 and the fact that $I_{DI}(D) = -r'/r$.

[Figure 2: Mixture of Exponential Discount Functions. Panels: exponential discount functions $D_1$, $D_2$, $D_3$ and $D(t)$ against time; discount rates $r_1$, $r_2$, $r_3$ and $r(t)$ against time.]

However, Weitzman's result [27] does not provide much insight in the special case when each possible time preference has a DU representation with a proportional hyperbolic discount function. Suppose

$$D_i(t) = \frac{1}{1 + h_i t}$$

for each $i \in N$, where $h_i > 0$ and $h_1 > h_2 > \ldots > h_n$. Suppose that $D_i$ eventuates with probability $p_i$, where $p_i \ge 0$ and $\sum_{i=1}^{n} p_i = 1$. Then the certainty equivalent discount function would be

$$D(t) = \frac{p_1}{1 + h_1 t} + \ldots + \frac{p_n}{1 + h_n t}.$$

The rate of time preference is

$$r_i(t) = \frac{h_i}{1 + h_i t} \text{ for all } i.$$

It is obvious that $r_i^* = r_j^* = 0$ for all $i \ne j$ and $\lim_{t \to \infty} r(t) = 0$, which, indeed, corresponds to Weitzman's result. However, this conclusion does not give much information about the asymptotic behaviour of the certainty equivalent discount function. Given that each possible discount function comes from a different DI class (unlike in the case of heterogeneous exponential discount functions) we would like to know which (if any) most closely characterizes the asymptotic behaviour of the certainty equivalent function.

To answer this question we need to modify the analysis of Weitzman. Note that the certainty equivalent discount function can be written as

$$D(t) = \frac{1}{1 + h(t)\,t},$$

where $h(t)$ is the certainty equivalent hyperbolic discount rate. In particular,

$$h(t) = \left(\frac{1}{D(t)} - 1\right)\frac{1}{t},$$

so $h(t)$ is well-defined for $t \in (0, \infty)$. We ask: How does $h(t)$ behave as $t \to \infty$?

We remind the reader that the weighted harmonic mean of non-negative values $x_1, x_2, \ldots, x_n$ with non-negative weights $a_1, a_2, \ldots, a_n$ satisfying $a_1 + \ldots + a_n = 1$ is

$$H(x_1, a_1; \ldots; x_n, a_n) = \left(\sum_{i=1}^{n} \frac{a_i}{x_i}\right)^{-1}.$$

It is well-known that the weighted harmonic mean does not exceed the corresponding expected value (weighted arithmetic mean).
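The two quantities just defined are easy to evaluate directly. The sketch below is a hedged illustration, not the authors' computation: it assumes three proportional hyperbolic discount functions with hypothetical rates 0.01, 0.02 and 0.03 and equal probabilities, computes the certainty equivalent hyperbolic discount rate $h(t)$ from its definition, and compares it with the weighted harmonic and arithmetic means of the rates. All function names are assumptions.

```python
def certainty_equivalent_discount(probs, rates, t):
    """D(t) = sum_i p_i / (1 + h_i t) for proportional hyperbolic components."""
    return sum(p / (1.0 + h * t) for p, h in zip(probs, rates))


def certainty_equivalent_hyperbolic_rate(probs, rates, t):
    """h(t) = (1 / D(t) - 1) / t, defined for t > 0."""
    return (1.0 / certainty_equivalent_discount(probs, rates, t) - 1.0) / t


def weighted_harmonic_mean(values, weights):
    """H(x_1, a_1; ...; x_n, a_n) = (sum_i a_i / x_i)^(-1)."""
    return 1.0 / sum(a / x for x, a in zip(values, weights))


if __name__ == "__main__":
    probs = [1.0 / 3.0] * 3     # assumed equal probabilities
    rates = [0.01, 0.02, 0.03]  # assumed hyperbolic discount rates
    harmonic = weighted_harmonic_mean(rates, probs)
    arithmetic = sum(p * h for p, h in zip(probs, rates))
    print("harmonic mean:", harmonic, "arithmetic mean:", arithmetic)
    for t in (1.0, 10.0, 100.0, 1000.0, 1e6):
        print(t, certainty_equivalent_hyperbolic_rate(probs, rates, t))
```

For these assumed inputs, $h(t)$ starts near the arithmetic mean for small $t$ and moves down towards the harmonic mean as $t$ grows.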

Theorem 3.

Suppose that each $D_i$ ($i \in N$) is a proportional hyperbolic discount function, with associated hyperbolic discount rate $h_i$. Discount function $D_i$ will eventuate with probability $p_i$. Then the long-term certainty equivalent hyperbolic discount rate is the probability-weighted harmonic mean of the individual hyperbolic discount rates, $H(h_1, p_1; \ldots; h_n, p_n)$.

Proof. We note that

$$\frac{p_i}{1 + h_i t} = \frac{p_i}{h_i t} + \epsilon_i(t),$$

where $\epsilon_i(t)/t^{-1} \to 0$ as $t \to \infty$. Let $\epsilon(t) = \epsilon_1(t) + \ldots + \epsilon_n(t)$. Hence it follows that:

$$\frac{1}{1 + h(t)\,t} = \sum_{i=1}^{n} p_i D_i(t) = \frac{p_1}{1 + h_1 t} + \ldots + \frac{p_n}{1 + h_n t} = \frac{p_1}{h_1 t} + \ldots + \frac{p_n}{h_n t} + \epsilon(t)$$

$$= \left(\frac{p_1}{h_1} + \ldots + \frac{p_n}{h_n}\right)\frac{1}{t} + \epsilon(t) = \frac{1}{H(h_1, p_1; \ldots; h_n, p_n)\,t} + \epsilon(t) = \frac{1}{1 + H(h_1, p_1; \ldots; h_n, p_n)\,t} + \hat{\epsilon}(t),$$

where $\hat{\epsilon}(t)/t^{-1} \to 0$ as $t \to \infty$. This implies that $h(t) \to H(h_1, p_1; \ldots; h_n, p_n)$ as $t \to \infty$.

Figure 3 illustrates Theorem 3 for the case $n = 3$, when hyperbolic rates $h_1 = 0.$…, $h_2 = 0.02$ and $h_3 = 0.03$ eventuate with equal probabilities. Note that $h = 0.$… $h_1$, $h_2$ and $h_3$. Figure 3 displays the convergence of the certainty equivalent hyperbolic discount rate to the weighted harmonic mean $H(h_1, p_1; h_2, p_2; h_3, p_3)$. It also shows the certainty equivalent hyperbolic discount rate decreasing monotonically. The following proposition proves that this is always the case.

Proposition 17.

Suppose that each $D_i$ ($i \in N$) is a proportional hyperbolic discount function, with associated hyperbolic discount rate $h_i$. Discount function $D_i$ will eventuate with probability $p_i$. Then the certainty equivalent hyperbolic discount rate is strictly decreasing on $(0, \infty)$.

[Figure 3: Mixture of Hyperbolic Discount Functions. Panels: hyperbolic discount functions $D_1$, $D_2$, $D_3$ and $D(t)$ against time; hyperbolic discount rates $h_1$, $h_2$, $h_3$, $h(t)$ and $H(h_1, p_1; h_2, p_2; h_3, p_3)$ against time.]

Proof.

We prove this statement by induction on $n$. First we need to prove that the statement holds for $n = 2$. The respective certainty equivalent hyperbolic discount rate is:

$$h(t) = \left[\frac{1}{p_1(1 + h_1 t)^{-1} + p_2(1 + h_2 t)^{-1}} - 1\right]\frac{1}{t}$$

for each $t > 0$. Rearranging:

$$h(t) = \left[\frac{(1 + h_1 t)(1 + h_2 t)}{p_1(1 + h_2 t) + p_2(1 + h_1 t)} - 1\right]\frac{1}{t} = \left[\frac{1 + (h_1 + h_2)t + h_1 h_2 t^2}{p_1 + p_2 + (p_1 h_2 + p_2 h_1)t} - 1\right]\frac{1}{t}.$$

Since $p_1 + p_2 = 1$ we obtain:

$$h(t) = \left[\frac{1 + (h_1 + h_2)t + h_1 h_2 t^2}{1 + (p_1 h_2 + p_2 h_1)t} - 1\right]\frac{1}{t} = \frac{(h_1 + h_2 - p_1 h_2 - p_2 h_1)t + h_1 h_2 t^2}{1 + (p_1 h_2 + p_2 h_1)t} \cdot \frac{1}{t}$$

$$= \frac{h_1 + h_2 - p_1 h_2 - p_2 h_1 + h_1 h_2 t}{1 + (p_1 h_2 + p_2 h_1)t} = \frac{p_1 h_1 + p_2 h_2 + h_1 h_2 t}{1 + (p_1 h_2 + p_2 h_1)t}.$$

By differentiating $h(t)$:

$$h'(t) = \frac{h_1 h_2 \left(1 + (p_1 h_2 + p_2 h_1)t\right) - (p_1 h_1 + p_2 h_2 + h_1 h_2 t)(p_1 h_2 + p_2 h_1)}{\left[1 + (p_1 h_2 + p_2 h_1)t\right]^2}. \tag{13}$$

We need to show that $h'(t) < 0$. Since the denominator of (13) is positive, the sign of $h'(t)$ depends on the sign of the numerator. Therefore, we denote the numerator of (13) by $Q$ and analyse it separately:

$$Q(t) = h_1 h_2 \left[1 + (p_1 h_2 + p_2 h_1)t\right] - (p_1 h_1 + p_2 h_2 + h_1 h_2 t)(p_1 h_2 + p_2 h_1)$$

$$= h_1 h_2 + h_1 h_2 (p_1 h_2 + p_2 h_1)t - (p_1 h_1 + p_2 h_2)(p_1 h_2 + p_2 h_1) - h_1 h_2 (p_1 h_2 + p_2 h_1)t$$

$$= h_1 h_2 - (p_1 h_1 + p_2 h_2)(p_1 h_2 + p_2 h_1).$$

By expanding the brackets and using the fact that $p_1 + p_2 = 1$ implies $1 - p_1^2 - p_2^2 = 2 p_1 p_2$, expression $Q$ can be simplified further:

$$Q(t) = h_1 h_2 - p_1^2 h_1 h_2 - p_1 p_2 h_1^2 - p_1 p_2 h_2^2 - p_2^2 h_1 h_2 = h_1 h_2 (1 - p_1^2 - p_2^2) - p_1 p_2 (h_1^2 + h_2^2)$$

$$= 2 p_1 p_2 h_1 h_2 - p_1 p_2 (h_1^2 + h_2^2) = -p_1 p_2 (h_1 - h_2)^2.$$

Therefore, since $h_1 \ne h_2$ we have $Q < 0$. Hence it follows that $h'(t) < 0$ and $h(t)$ is strictly decreasing.

Suppose that the proposition holds for $n = k$. We need to show that it also holds for $n = k + 1$. When $n = k + 1$ the certainty equivalent hyperbolic discount rate is:

$$h^{(k+1)}(t) = \left[\frac{1}{D^{(k+1)}} - 1\right]\frac{1}{t},$$

where

$$D^{(k+1)} = \sum_{i=1}^{k+1} p_i D_i = (1 - p_{k+1})\left(\sum_{i=1}^{k} \frac{p_i}{1 - p_{k+1}} D_i\right) + p_{k+1} D_{k+1}.$$

Since

$$\sum_{i=1}^{k} \frac{p_i}{1 - p_{k+1}} = 1,$$

we have $D^{(k+1)} = (1 - p_{k+1}) D^{(k)} + p_{k+1} D_{k+1}$, where

$$D^{(k)} = \sum_{i=1}^{k} \frac{p_i}{1 - p_{k+1}} D_i.$$

By the induction hypothesis it follows that

$$D^{(k)} = \frac{1}{1 + h^{(k)}(t)\,t},$$

where $h^{(k)}$ is strictly decreasing. Therefore,

$$h^{(k+1)}(t) = \left[\frac{1}{(1 - p_{k+1}) D^{(k)} + p_{k+1} D_{k+1}} - 1\right]\frac{1}{t} = \left[\frac{1}{(1 - p_{k+1})\left(1 + h^{(k)}(t)\,t\right)^{-1} + p_{k+1}(1 + h_{k+1} t)^{-1}} - 1\right]\frac{1}{t}.$$

Let $\hat{p}_1 = 1 - p_{k+1}$, $\hat{p}_2 = p_{k+1}$, $\hat{h}_1(t) = h^{(k)}(t)$ and $\hat{h}_2 = h_{k+1} = \text{const}$. Then we have

$$h^{(k+1)}(t) = \left[\frac{1}{\hat{p}_1\left(1 + \hat{h}_1(t)\,t\right)^{-1} + \hat{p}_2\left(1 + \hat{h}_2 t\right)^{-1}} - 1\right]\frac{1}{t}.$$

Analogously to the case $n = 2$, this expression can be rearranged to give:

$$h^{(k+1)}(t) = \frac{\hat{p}_1 \hat{h}_1 + \hat{p}_2 \hat{h}_2 + \hat{h}_1 \hat{h}_2 t}{1 + \hat{p}_1 \hat{h}_2 t + \hat{p}_2 \hat{h}_1 t}.$$

However, by contrast to the case $n = 2$, $\hat{h}_1$ is now a function of $t$. The derivative of $h^{(k+1)}$ is:

$$\frac{d h^{(k+1)}(t)}{dt} = \frac{\left(\hat{p}_1 \hat{h}_1' + \hat{h}_1 \hat{h}_2 + \hat{h}_1' \hat{h}_2 t\right)\left(1 + \hat{p}_1 \hat{h}_2 t + \hat{p}_2 \hat{h}_1 t\right) - \left(\hat{p}_1 \hat{h}_1 + \hat{p}_2 \hat{h}_2 + \hat{h}_1 \hat{h}_2 t\right)\left(\hat{p}_1 \hat{h}_2 + \hat{p}_2 \hat{h}_1 + \hat{p}_2 \hat{h}_1' t\right)}{\left[1 + \hat{p}_1 \hat{h}_2 t + \hat{p}_2 \hat{h}_1 t\right]^2}.$$

The denominator of this fraction is strictly positive, so the sign of the derivative depends on the numerator only. Denote the numerator by $N$:

$$N = \left(\hat{p}_1 \hat{h}_1' + \hat{h}_1 \hat{h}_2 + \hat{h}_1' \hat{h}_2 t\right)\left(1 + \hat{p}_1 \hat{h}_2 t + \hat{p}_2 \hat{h}_1 t\right) - \left(\hat{p}_1 \hat{h}_1 + \hat{p}_2 \hat{h}_2 + \hat{h}_1 \hat{h}_2 t\right)\left(\hat{p}_1 \hat{h}_2 + \hat{p}_2 \hat{h}_1 + \hat{p}_2 \hat{h}_1' t\right).$$

Note that

$$N = \hat{Q}(t) + \hat{h}_1' \left[\left(\hat{p}_1 + \hat{h}_2 t\right)\left(1 + \hat{p}_1 \hat{h}_2 t + \hat{p}_2 \hat{h}_1 t\right) - \hat{p}_2 t\left(\hat{p}_1 \hat{h}_1 + \hat{p}_2 \hat{h}_2 + \hat{h}_1 \hat{h}_2 t\right)\right],$$

where $\hat{Q}(t)$ is defined as $Q(t)$ in the case $n = 2$, but with $h_1 = \hat{h}_1(t)$ and $h_2 = \hat{h}_2$. Since the case $n = 2$ establishes that $\hat{Q}(t) \le 0$ (with equality if and only if $\hat{h}_1(t) = \hat{h}_2$) and $\hat{h}_1' < 0$, it suffices to show that

$$\left(\hat{p}_1 + \hat{h}_2 t\right)\left(1 + \hat{p}_1 \hat{h}_2 t + \hat{p}_2 \hat{h}_1 t\right) - \hat{p}_2 t\left(\hat{p}_1 \hat{h}_1 + \hat{p}_2 \hat{h}_2 + \hat{h}_1 \hat{h}_2 t\right) > 0.$$

Expanding and cancelling the terms in $\hat{h}_1$, the left-hand side equals

$$\hat{p}_1\left(1 + \hat{p}_1 \hat{h}_2 t\right) + \hat{h}_2 t\left(1 + \hat{p}_1 \hat{h}_2 t\right) - (\hat{p}_2)^2 \hat{h}_2 t.$$

We now use the fact that $(\hat{p}_2)^2 = (1 - \hat{p}_1)^2 = 1 - 2\hat{p}_1 + (\hat{p}_1)^2$ to get

$$\hat{p}_1\left(1 + \hat{p}_1 \hat{h}_2 t\right) + \hat{h}_2 t\left(1 + \hat{p}_1 \hat{h}_2 t\right) - \left[1 - 2\hat{p}_1 + (\hat{p}_1)^2\right]\hat{h}_2 t = \hat{p}_1 + \left(t \hat{h}_2\right)^2 \hat{p}_1 + 2\hat{p}_1 \hat{h}_2 t,$$

which is strictly positive as required. Therefore, $h^{(k+1)}(t)$ is strictly decreasing.

5 Discussion

We generalized Jackson and Yariv's result [11] by proving that whenever we aggregate different discount functions from comparable DI classes, the weighted average function is always strictly more DI than the least DI of its constituents. This also strengthens the conclusion of the theorem of Prelec [19], who demonstrates that the mixture of two different discount functions from the same DI class represents more DI preferences.

When a decision maker is uncertain about her hyperbolic discount rate, we showed that long-term costs and benefits must be discounted at the probability-weighted harmonic mean of the hyperbolic discount rates that might eventuate. This complements the well-known result of Weitzman [27].

One natural question that arises is whether it is possible to prove a result analogous to Proposition 13 when all preference orders exhibit increasing impatience (II). Will the mixture of II discount functions be (strictly) II? Perhaps surprisingly, the answer to this question is negative in general.

This follows from results in the literature on survival analysis and reliability theory. The similarity between reliability theory and temporal discounting is discussed in [24]. Takeuchi [26] also notes that a discount function is analogous to a survival function, $S(t)$. The failure rate associated with $S(t)$ is

$$g(t) = -\frac{S'(t)}{S(t)},$$

which behaves as a time preference rate. For twice continuously differentiable survival functions, a decreasing failure rate (DFR) corresponds to a decreasing time preference rate, and hence to DI, whereas an increasing failure rate (IFR) corresponds to II. Mixtures of probability distributions are a common topic in survival and reliability analysis. Proschan [21] established that mixtures of distributions with DFR always exhibit DFR. (Footnote: This result is comparable to the "non-strict" part of our Proposition 13.) However, Gurland and Sethuraman [8, 9] provide striking examples of mixtures of very quickly increasing failure rates that are eventually decreasing.

After Fishburn and Rubinstein [7], we assume that:

Axiom 1. (Weak Order) The preference order $\succcurlyeq$ is a weak order, i.e., it is complete and transitive.

Axiom 2. (Monotonicity) For every $x, y \in X$, if $x < y$, then $(x, t) \prec (y, t)$ for every $t \in T$.

Axiom 3. (Continuity) For every $(y, s) \in X \times T$ the sets $\{(x, t) \in X \times T : (x, t) \succcurlyeq (y, s)\}$ and $\{(x, t) \in X \times T : (x, t) \preccurlyeq (y, s)\}$ are closed.

Axiom 4. (Impatience) For all $t, s \in T$ and every $x > 0$, if $t < s$, then $(x, t) \succ (x, s)$. If $t < s$ and $x = 0$, then $(x, t) \sim (x, s)$ for every $t, s \in T$, that is, 0 is a time-neutral outcome.

Axiom 5. (Separability) For every $x, y, z \in X$ and every $r, s, t \in T$, if $(x, t) \sim (y, s)$ and $(y, r) \sim (z, t)$ then $(x, r) \sim (z, s)$.

Fishburn and Rubinstein [7] proved the following result:

Theorem 4 ([7]). The preferences $\succcurlyeq$ on $X \times T$ satisfy Axioms 1-5 if and only if there exists a discounted utility representation for $\succcurlyeq$ on $X \times T$. If $(u, D)$ and $(u_0, D_0)$ both provide discounted utility representations for $\succcurlyeq$ on $X \times T$, then $u = \alpha u_0$ for some $\alpha > 0$, and $D = \beta D_0$ for some $\beta > 0$.

We need to prove the following lemma first:

Lemma 18.

Suppose that h and h are strictly decreasing functions. Then h is a(strictly) convex transformation of h if and only if h ( s ) − h ( t ) = h ( s + σ + ρ ) − h ( t + σ ) implies that h ( s ) − h ( t ) ≤ [ < ] h ( s + σ + ρ ) − h ( t + σ ) for every s . t , σ and ρ satisfying < t < s ≤ t + σ < s + σ + ρ .Proof. We prove necessity ﬁrst. Suppose that h is a (strictly) convex transformation of h ; that is, there exists a (strictly) convex function f such that h = f ( h ). Assume alsothat 0 < t < s ≤ t + σ < s + σ + ρ and h ( s ) − h ( t ) = h ( s + σ + ρ ) − h ( t + σ ) . (15)We need to show that h ( s ) − h ( t ) ≤ [ < ] h ( s + σ + ρ ) − h ( t + σ )whenever 0 < t < s ≤ t + σ < s + σ + ρ . Since h is strictly decreasing, it follows that h ( s + σ + ρ ) < h ( t + σ ) ≤ h ( s ) < h ( t ) . Recall that f is a (strictly) convex function. Therefore, as equality (15) holds, it impliesthat f ( h ( t + σ )) − f ( h ( s + σ + ρ )) ≤ [ < ] f ( h ( t )) − f ( h ( s )) . Since h = f ( h ), this inequality is equivalent to h ( t + σ ) − h ( s + σ + ρ ) ≤ [ < ] h ( t ) − h ( s ) . h ( s ) − h ( t ) ≤ [ < ] h ( s + σ + ρ ) − h ( t + σ ) , (16)whenever 0 < t < s ≤ t + σ < s + σ + ρ .To show the suﬃciency, suppose that (15) implies (16) for every s , t , σ and ρ satisfying0 < t < s ≤ t + σ < s + σ + ρ . Deﬁne f such that f = h ◦ h − . Note that we can do sobecause h − exists (since h is a strictly decreasing function). Then if h ( s + σ + ρ ) < h ( t + σ ) ≤ h ( s ) < h ( t )and equation (15) holds, we have f ( h ( t + σ )) − f ( h ( s + σ + ρ )) ≤ [ < ] f ( h ( t )) − f ( h ( s )) . Therefore, f is a (strictly) convex function, which means that h is a (strictly) convextransformation of h .We can now prove Proposition 7. Proof.

Observe that $D_i : [0,\infty) \to (0,1]$ is one-to-one and onto, so $D_i^{-1} : (0,1] \to [0,\infty)$.

Let us first prove that condition (i) follows from condition (ii). The proof is by contraposition: we show that not (i) implies not (ii). Assume that (i) fails; that is, there exist $s$ and $t$ with $0 < t < s$, $\rho > 0$, $\sigma > 0$ and $x, y, x', y' \in X$ with $0 < x < y$ and $0 < x' < y'$ such that $(x', t) \sim_2 (y', s)$, $(x', t+\sigma) \sim_2 (y', s+\sigma+\rho)$, $(x, t) \sim_1 (y, s)$ and
\[ (x, t+\sigma) \succ_1 [\succcurlyeq_1]\; (y, s+\sigma+\rho). \]
Since $u_1(y) > 0$ and $u_2(y') > 0$, we have
\[ \frac{u_2(x')}{u_2(y')} = \frac{D_2(s)}{D_2(t)} = \frac{D_2(s+\sigma+\rho)}{D_2(t+\sigma)} \]
and
\[ \frac{u_1(x)}{u_1(y)} = \frac{D_1(s)}{D_1(t)} > [\ge]\; \frac{D_1(s+\sigma+\rho)}{D_1(t+\sigma)}. \]
Let $h_1 = \ln D_1$ and $h_2 = \ln D_2$. Note that $h_1$ and $h_2$ are both strictly decreasing functions. Observe also that $h_i : [0,\infty) \to (-\infty, 0]$ is one-to-one and onto. Thus $h_i^{-1} : (-\infty, 0] \to [0,\infty)$, where $h_i^{-1}(z) = D_i^{-1}(e^z)$. Rewriting these expressions we get $D_i(t) = e^{h_i(t)}$ for each $i \in \{1, 2\}$. Thus:
\[ \frac{e^{h_2(s)}}{e^{h_2(t)}} = \frac{e^{h_2(s+\sigma+\rho)}}{e^{h_2(t+\sigma)}} \quad\text{and}\quad \frac{e^{h_1(s)}}{e^{h_1(t)}} > [\ge]\; \frac{e^{h_1(s+\sigma+\rho)}}{e^{h_1(t+\sigma)}}. \]
Equivalently,
\[ h_2(s) - h_2(t) = h_2(s+\rho+\sigma) - h_2(t+\sigma) \tag{17} \]
and
\[ h_1(s) - h_1(t) > [\ge]\; h_1(s+\rho+\sigma) - h_1(t+\sigma). \tag{18} \]
Note that $\ln D_1(D_2^{-1}(e^z))$ being (strictly) convex in $z$ on $(-\infty, 0]$ is equivalent to $h_1 \circ h_2^{-1}$ being (strictly) convex in $z$ on $(-\infty, 0]$; that is, to $h_1$ being a (strictly) convex transformation of $h_2$. By Lemma 18, this conclusion contradicts equation (17) and inequality (18). Therefore, not (i) implies not (ii).

Secondly, we need to demonstrate that (i) implies (ii). Using the previously introduced notation, we show that for every $s$, $t$, $\sigma$ and $\rho$ satisfying $0 < t < s \le t+\sigma < s+\sigma+\rho$ the equation
\[ h_2(s) - h_2(t) = h_2(s+\sigma+\rho) - h_2(t+\sigma) \]
implies
\[ h_1(s) - h_1(t) \le [<]\; h_1(s+\sigma+\rho) - h_1(t+\sigma). \]
As $h_1$ and $h_2$ are decreasing functions, this proves, by Lemma 18, that $h_1$ is a (strictly) convex transformation of $h_2$.

Assume that $0 \le t < s \le t+\sigma < s+\sigma+\rho$ are such that
\[ h_2(s) - h_2(t) = h_2(s+\sigma+\rho) - h_2(t+\sigma). \]
By the definition $h_i = \ln D_i$, this expression is equivalent to
\[ \frac{D_2(s)}{D_2(t)} = \frac{D_2(s+\sigma+\rho)}{D_2(t+\sigma)} \in (0,1). \]
As $u_2$ is continuous, we can choose $0 < x' < y'$ such that:
\[ \frac{D_2(s)}{D_2(t)} = \frac{D_2(s+\sigma+\rho)}{D_2(t+\sigma)} = \frac{u_2(x')}{u_2(y')}. \]
Therefore, $D_2(t)u_2(x') = D_2(s)u_2(y')$ and $D_2(t+\sigma)u_2(x') = D_2(s+\sigma+\rho)u_2(y')$. This means that $(x', t) \sim_2 (y', s)$ and $(x', t+\sigma) \sim_2 (y', s+\sigma+\rho)$.

Analogously, because $u_1$ is continuous, we can choose $x, y$ such that:
\[ \frac{D_1(s)}{D_1(t)} = \frac{u_1(x)}{u_1(y)} \in (0,1). \]
Hence, $(x, t) \sim_1 (y, s)$.

But according to (i), if $(x', t) \sim_2 (y', s)$, $(x', t+\sigma) \sim_2 (y', s+\sigma+\rho)$ and $(x, t) \sim_1 (y, s)$, then $(x, t+\sigma) \preccurlyeq_1 [\prec_1]\; (y, s+\sigma+\rho)$. The latter is equivalent to:
\[ \frac{D_1(s+\sigma+\rho)}{D_1(t+\sigma)} \ge [>]\; \frac{u_1(x)}{u_1(y)}. \]
It follows that
\[ \frac{D_1(s)}{D_1(t)} \le [<]\; \frac{D_1(s+\sigma+\rho)}{D_1(t+\sigma)}, \]
or, taking logarithms,
\[ \ln D_1(s) - \ln D_1(t) \le [<]\; \ln D_1(s+\sigma+\rho) - \ln D_1(t+\sigma), \]
or
\[ h_1(s) - h_1(t) \le [<]\; h_1(s+\sigma+\rho) - h_1(t+\sigma). \]
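As a numerical aside (not part of the original argument), the increment condition of Lemma 18 can be checked on a concrete pair of discount functions. The sketch below assumes the hypothetical choices $D_1(t) = 1/(1+2t)$ and $D_2(t) = 1/(1+t)$, for which $f = h_1 \circ h_2^{-1}$ is $f(z) = -\ln(2e^{-z} - 1)$, a strictly convex function on $(-\infty, 0]$. The function names `h1`, `h2`, `rho_equalising_h2`, `increment_gap`, `f` and `second_difference` are illustrative only and do not come from the paper.

```python
import math


def h1(t: float) -> float:
    """h1 = ln D1 for the assumed D1(t) = 1 / (1 + 2t)."""
    return -math.log(1.0 + 2.0 * t)


def h2(t: float) -> float:
    """h2 = ln D2 for the assumed D2(t) = 1 / (1 + t)."""
    return -math.log(1.0 + t)


def rho_equalising_h2(t: float, s: float, sigma: float) -> float:
    """rho solving h2(s) - h2(t) = h2(s + sigma + rho) - h2(t + sigma).

    For this particular h2 the equation has the closed form
    rho = sigma * (s - t) / (1 + t).
    """
    return sigma * (s - t) / (1.0 + t)


def increment_gap(t: float, s: float, sigma: float) -> float:
    """[h1(s+sigma+rho) - h1(t+sigma)] - [h1(s) - h1(t)].

    A positive value means the strict inequality of Lemma 18 holds
    at the point (t, s, sigma, rho) where the h2 increments are equal.
    """
    rho = rho_equalising_h2(t, s, sigma)
    return (h1(s + sigma + rho) - h1(t + sigma)) - (h1(s) - h1(t))


def f(z: float) -> float:
    """f = h1 o h2^{-1} on (-inf, 0], using h2^{-1}(z) = exp(-z) - 1."""
    return h1(math.exp(-z) - 1.0)


def second_difference(z: float, delta: float) -> float:
    """f(z - delta) + f(z + delta) - 2 f(z); positive for strictly convex f."""
    return f(z - delta) + f(z + delta) - 2.0 * f(z)


if __name__ == "__main__":
    # Grid of points with 0 < t < s <= t + sigma, as required in Lemma 18.
    for t in (0.5, 1.0, 2.0):
        for d in (0.5, 1.0):
            s = t + d
            for sigma in (d, d + 1.0):
                print(t, s, sigma, increment_gap(t, s, sigma))
    # Midpoint convexity check of f at z = -1 with step 0.5.
    print(second_difference(-1.0, 0.5))
```

For this assumed pair the gap is positive at every grid point, which is what Lemma 18 predicts when $f$ is strictly convex; at $t = 1$, $s = 2$, $\sigma = 1$ the equalising value is $\rho = 0.5$.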
Therefore,
\[ h_2(s) - h_2(t) = h_2(s+\sigma+\rho) - h_2(t+\sigma) \]
implies
\[ h_1(s) - h_1(t) \le [<]\; h_1(s+\sigma+\rho) - h_1(t+\sigma) \]
whenever $0 \le t < s \le t+\sigma < s+\sigma+\rho$. Hence, by Lemma 18, $h_1$ is a (strictly) convex transformation of $h_2$.

References

[1] G. Ainslie. Specious reward: a behavioral theory of impulsiveness and impulse control. Psychological Bulletin, 82(4):463–496, 1975.
[2] A. Al-Nowaihi and S. Dhami. A note on the Loewenstein-Prelec theory of intertemporal choice. Mathematical Social Sciences, 52(1):99–108, 2006.
[3] K. J. Arrow. Aspects of the theory of risk-bearing. Yrjö Jahnssonin Säätiö, Helsinki, 1965.
[4] A. E. Attema, H. Bleichrodt, K. I. M. Rohde, and P. P. Wakker. Time-tradeoff sequences for analyzing discounting and time inconsistency. Management Science, 56(11):2015–2030, 2010.
[5] S. Boyd and L. Vandenberghe. Convex optimization. Cambridge University Press, Cambridge, 2004.
[6] Z. Cvetkovski. Inequalities: Theorems, Techniques and Selected Problems. Springer-Verlag, Berlin Heidelberg, 2012.
[7] P. C. Fishburn and A. Rubinstein. Time preference. International Economic Review, 23(3):677–694, 1982.
[8] J. Gurland and J. Sethuraman. Shorter communication: reversal of increasing failure rates when pooling failure data. Technometrics, 36(4):416–418, 1994.
[9] J. Gurland and J. Sethuraman. How pooling failure data may reverse increasing failure rates. Journal of the American Statistical Association, 90(432):1416–1423, 1995.
[10] C. M. Harvey. Proportional discounting of future costs and benefits. Mathematics of Operations Research, 20(2):381–399, 1995.
[11] M. O. Jackson and L. Yariv. Present bias and collective dynamic choice in the lab. The American Economic Review, 104(12):4184–4204, 2014.
[12] M. O. Jackson and L. Yariv. Collective dynamic choice: the necessity of time inconsistency. American Economic Journal: Microeconomics, 7(4):150–178, 2015.
[13] D. T. Jamison and J. Jamison. Characterizing the amount and speed of discounting procedures. Journal of Benefit-Cost Analysis, 2(2):1–56, 2011.
[14] D. Laibson. Golden eggs and hyperbolic discounting. Quarterly Journal of Economics, 112(2):443–478, 1997.
[15] G. Loewenstein and D. Prelec. Anomalies in intertemporal choice: Evidence and an interpretation. The Quarterly Journal of Economics, 107(2):573–597, 1992.
[16] J. E. Mazur. Hyperbolic value addition and general models of animal choice. Psychological Review, 108(1):96–112, 2001.
[17] E. S. Phelps and R. A. Pollak. On second-best national saving and game-equilibrium growth. Review of Economic Studies, 35(2):185–199, 1968.
[18] J. W. Pratt. Risk aversion in the small and in the large. Econometrica, 32(1/2):122–136, 1964.
[19] D. Prelec. Decreasing impatience: A criterion for non-stationary time preference and hyperbolic discounting. Scandinavian Journal of Economics, 106(3):511–532, 2004.
[20] D. Prelec and G. Loewenstein. Decision making over time and under uncertainty: A common approach. Management Science, 37(7):770–786, 1991.
[21] F. Proschan. Theoretical explanation of observed decreasing failure rate. Technometrics, 5(3):373–383, 1963.
[22] J. Quiggin and J. Horowitz. Time and risk. Journal of Risk and Uncertainty, 10(1):37–55, 1995.
[23] R. T. Rockafellar and R. J.-B. Wets. Variational analysis, volume 317 of A Series of Comprehensive Studies in Mathematics. Springer-Verlag, Berlin Heidelberg, 1998.
[24] P. D. Sozou. On hyperbolic discounting and uncertain hazard rates. Proceedings of the Royal Society of London. Series B: Biological Sciences, 265(1409):2015–2020, 1998.
[25] O. Stein. Twice differentiable characterizations of convexity notions for functions on full dimensional convex sets. Schedae Informaticae, 21:55–63, 2012.
[26] K. Takeuchi. Non-parametric test of time consistency: Present bias and future bias. Games and Economic Behavior, 71(2):456–478, 2011.
[27] M. L. Weitzman. Why the far-distant future should be discounted at its lowest possible rate.