A Theory of the Saving Rate of the Rich

Abstract

Empirical evidence suggests that the rich have higher propensity to save than do the poor. While this observation may appear to contradict the homotheticity of preferences, we theoretically show that that is not the case. Specifically, we consider an income fluctuation problem with homothetic preferences and general shocks and prove that consumption functions are asymptotically linear, with an exact analytical characterization of asymptotic marginal propensities to consume (MPC). We provide necessary and sufficient conditions for the asymptotic MPCs to be zero. We calibrate a model with standard constant relative risk aversion utility and show that zero asymptotic MPCs are empirically plausible, implying that our mechanism has the potential to accommodate a large saving rate of the rich and high wealth inequality (small Pareto exponent) as observed in the data.


A Theory of the Saving Rate of the Rich ∗

Qingyin Ma † Alexis Akira Toda ‡

July 15, 2020

Abstract

Empirical evidence suggests that the rich have higher propensity to save than do the poor. While this observation may appear to contradict the homotheticity of preferences, we theoretically show that that is not the case. Specifically, we consider an income fluctuation problem with homothetic preferences and general shocks and prove that consumption functions are asymptotically linear, with an exact analytical characterization of asymptotic marginal propensities to consume (MPC). We provide necessary and sufficient conditions for the asymptotic MPCs to be zero. We solve a calibrated model with standard constant relative risk aversion utility and show that asymptotic MPCs can be zero in empirically plausible settings, implying an increasing and large saving rate of the rich.

Keywords: asymptotic linearity, income fluctuation problem, monotone convex map, saving rate.

JEL codes: C65, D15, D52, E21.

Empirical evidence suggests that the rich have higher propensity to save than do the poor. This fact implies that the rich have lower marginal propensity to consume (MPC), which has important economic consequences. For example, when the rich have lower MPC, the consumption tax, which is a popular tax instrument in many countries, becomes regressive and may not be desirable from equity perspectives. MPC heterogeneity also implies that the wealth distribution matters for determining aggregate demand and hence monetary and fiscal policies (Kaplan, Moll, and Violante, 2018; Mian, Straub, and Sufi, 2020).

∗ We thank Chris Carroll, Émilien Gouin-Bonenfant, Ben Moll, Johannes Wieland, and seminar participants at CRETA Economic Theory Conference for valuable feedback and suggestions. A previous version of this paper was circulated under the title "Asymptotic Marginal Propensity to Consume".

† International School of Economics and Management, Capital University of Economics and Business. Email: [email protected].

‡ Department of Economics, University of California San Diego. Email: [email protected].

arXiv preprint (econ.TH).

Footnote: Quadrini (1999) documents that entrepreneurs (who tend to be rich) have high saving rates. Dynan, Skinner, and Zeldes (2004) document that there is a positive association between saving rates and lifetime income. More recently, using Norwegian administrative data, Fagereng, Holm, Moll, and Natvik (2019) show that among households with positive net worth, saving rates are increasing in wealth.

Why do the rich save so much? Intuition suggests that canonical models of consumption and savings that feature homothetic preferences are unable to explain the high saving rate of the rich: in such models, consumption (hence saving) functions should be asymptotically linear in wealth due to homotheticity, implying an asymptotically constant saving rate. A seemingly obvious explanation for the high saving rate of the rich is that preferences are not homothetic.
However, non-homothetic preferences have some undesirable theoretical properties. First, they are inconsistent with balanced growth (whereas many aggregate economic variables such as real per capita GDP are near unit root processes), at least in basic models in which preference parameters are constant. Second, non-homothetic utility functions have more parameters than homothetic ones, which introduces arbitrariness in model specification and calibration.

In this paper we theoretically show that the intuition of "homotheticity implies (asymptotic) linearity" is only partially correct. We consider a standard income fluctuation problem with (homothetic) constant relative risk aversion (CRRA) preferences but with capital and labor income risk in a general Markovian setting. We prove that the consumption functions are asymptotically linear in wealth, that is, the marginal propensities to consume converge to some constants as wealth grows. While this statement is intuitive, there is one surprise: we obtain an exact analytical characterization of the asymptotic MPCs and prove that they can be zero. The asymptotic MPCs depend only on risk aversion and the stochastic processes for the discount factor and return on wealth, and are independent of the income process. Furthermore, we derive necessary and sufficient conditions for zero asymptotic MPCs. When the asymptotic MPCs are zero, the saving rates of the rich converge to one as agents get wealthier. Thus, we provide a potential explanation for why the rich save so much, and we do so with standard homothetic preferences.

To prove that consumption functions are asymptotically linear with particular slopes, we apply policy function iteration as in Li and Stachurski (2014) and Ma, Stachurski, and Toda (2020). Since agents cannot consume more than their financial wealth in the presence of borrowing constraints, a natural upper bound on consumption is the asset level itself, which is linear with a slope of 1.
Starting from this candidate consumption function, policy function iteration results in increasingly tighter upper bounds. On the other hand, we directly obtain lower bounds by restricting the space of candidate consumption functions such that they have linear lower bounds with specific slopes. We analytically derive these slopes based on the fixed point theory of monotone convex maps developed in Du (1990), which has recently been applied in economics by Toda (2019) and Borovička and Stachurski (2020). Finally, we show that the upper and lower bounds thus obtained have identical slopes, implying the asymptotic linearity of consumption functions with an exact characterization of asymptotic MPCs.

To assess the empirical plausibility of our new mechanism, we numerically solve an income fluctuation problem with CRRA utility and capital income risk.

Footnote: For example, Carroll (2000) considers a 'capitalist spirit' model in which agents directly get utility from holding wealth, where the utility functions for consumption and wealth have different curvatures. De Nardi (2004) considers a model with bequest, which is mathematically similar. Straub (2019) estimates that the elasticity of consumption with respect to permanent income is below 1 (which implies concavity of consumption functions) and uses non-homothetic preferences to explain it. Another possibility is to introduce frictions such as portfolio adjustment costs (Fagereng, Holm, Moll, and Natvik, 2019).

Our paper is related to the theoretical studies of the income fluctuation problem, which is a key building block of heterogeneous-agent models in modern macroeconomics. Chamberlain and Wilson (2000) study the existence of a solution assuming bounded utility and applying the contraction mapping theorem. Li and Stachurski (2014) relax the boundedness assumption and apply policy function iteration. Benhabib, Bisin, and Zhu (2015) consider a special model with CRRA utility, constant discounting, and iid and mutually independent returns and income shocks to study the tail behavior of wealth. Ma, Stachurski, and Toda (2020) allow for stochastic discounting and returns on wealth in a general Markovian setting and discuss the ergodicity, stochastic stability, and tail behavior of wealth. Carroll (2020) examines detailed properties of a special model with CRRA utility, constant discounting and risk-free rate, and iid permanent and transitory income shocks. While the main focus of these papers is the existence, uniqueness, and computation of a solution, we focus on the asymptotic behavior of consumption with general shocks. Carroll and Kimball (1996) show the concavity of consumption functions in a class of income fluctuation problems, which implies asymptotic linearity. However, they do not characterize the asymptotic MPCs as we do.

In this section we introduce a general income fluctuation problem following the setting in Ma, Stachurski, and Toda (2020) and study the asymptotic property of the consumption functions when preferences are homothetic.

Time is discrete and denoted by $t = 0, 1, 2, \dots$. Let $a_t$ be the financial wealth of the agent at the beginning of period $t$. The agent chooses consumption $c_t \ge 0$ and saves the remaining wealth $a_t - c_t$. The period utility function is $u$ and the discount factor, gross return on wealth, and non-financial income in period $t$ are denoted by $\beta_t, R_t, Y_t$, where we normalize $\beta_0 = 1$. Thus the agent solves

$$\text{maximize} \quad \mathrm{E}_0 \sum_{t=0}^\infty \left( \prod_{i=0}^t \beta_i \right) u(c_t)$$

$$\text{subject to} \quad a_{t+1} = R_{t+1}(a_t - c_t) + Y_{t+1}, \tag{2.1a}$$

$$0 \le c_t \le a_t, \tag{2.1b}$$

where the initial wealth $a_0 = a > 0$ is given.

Footnote: See, for example, Cao (2020) and Açıkgöz (2018) for the existence of equilibrium with and without aggregate shocks, where the theoretical properties of the income fluctuation problem play an important role. Lehrer and Light (2018) and Light (2018) prove comparative statics results regarding savings. Light (2020) proves the uniqueness of stationary equilibrium in an Aiyagari model that exhibits a certain gross substitute property.

The stochastic processes $\{\beta_t, R_t, Y_t\}_{t \ge 1}$ obey

$$\beta_t = \beta(Z_t, \varepsilon_t), \quad R_t = R(Z_t, \zeta_t), \quad Y_t = Y(Z_t, \eta_t), \tag{2.2}$$

where $\beta, R, Y$ are nonnegative measurable functions, $\{Z_t\}_{t \ge 0}$ is a time-homogeneous finite state Markov chain taking values in $\mathsf{Z} = \{1, \dots, Z\}$ with a transition probability matrix $P$, and the innovation processes $\{\varepsilon_t\}, \{\zeta_t\}, \{\eta_t\}$ are independent and identically distributed (iid) over time and mutually independent.

We introduce the following notation. For a square matrix $A$, the scalar $r(A)$ denotes its spectral radius (largest absolute value of all eigenvalues), i.e.,

$$r(A) := \max \{ |\alpha| : \alpha \text{ is an eigenvalue of } A \}. \tag{2.3}$$

The spectral radius (2.3) plays an important role in the subsequent discussion. The symbols $\beta, R, Y$ are shorthand for $\beta(Z, \varepsilon)$, $R(Z, \zeta)$, $Y(Z, \eta)$, and $\hat\beta, \hat R, \hat Y$ are shorthand for $\beta(\hat Z, \hat\varepsilon)$, $R(\hat Z, \hat\zeta)$, $Y(\hat Z, \hat\eta)$, where a hat denotes a next-period value. Define the diagonal matrix $D_\beta$ by

$$D_\beta(z, z) = \mathrm{E}_z \beta = \mathrm{E}[\beta(Z, \varepsilon) \mid Z = z] = \mathrm{E}\, \beta(z, \varepsilon).$$

More generally, for any stochastic process $\{X_t\}$ such that the distribution of $X_t$ conditional on all past information and $Z_t = z$ depends only on $z$, let $D_X$ be the diagonal matrix such that $D_X(z, z) = \mathrm{E}_z X = \mathrm{E}[X \mid Z = z]$. Consider the following assumptions.

Assumption 1. The utility function $u \colon [0, \infty) \to \mathbb{R} \cup \{-\infty\}$ is twice continuously differentiable on $(0, \infty)$ and satisfies $u' > 0$, $u'' < 0$, $u'(0) = \infty$, and $u'(\infty) < 1$.

Assumption 1 is essentially the usual Inada condition together with monotonicity and concavity.

Assumption 2. The following conditions hold:

1. $\mathrm{E}_z \beta < \infty$ and $\mathrm{E}_z \beta R < \infty$ for all $z \in \mathsf{Z}$,
2. $r(P D_\beta) < 1$ and $r(P D_{\beta R}) < 1$,
3. $\mathrm{E}_z Y < \infty$ and $\mathrm{E}_z u'(Y) < \infty$ for all $z \in \mathsf{Z}$.

When the discount factor and the return on wealth are constant, the conditions $r(P D_\beta) < 1$ and $r(P D_{\beta R}) < 1$ reduce to $\beta < 1$ and $\beta R < 1$, respectively.

Footnote: The no-borrowing condition $a_t - c_t \ge 0$ …

Theorem 2.1. Suppose Assumptions 1 and 2 hold. Then the income fluctuation problem (2.1) has a unique solution. Furthermore, the consumption function $c(a, z)$ can be computed by policy function iteration.

Proof. See Ma, Stachurski, and Toda (2020, Theorem 2.2).

'Policy function iteration' means the following. When the borrowing constraint $c_t \le a_t$ does not bind, the Euler equation implies

$$u'(c_t) = \mathrm{E}_t \beta_{t+1} R_{t+1} u'(c_{t+1}).$$

If $c_t = a_t$, then clearly $u'(c_t) = u'(a_t)$. Therefore combining these two cases, we can compactly express the Euler equation as

$$u'(c_t) = \max \{ \mathrm{E}_t \beta_{t+1} R_{t+1} u'(c_{t+1}), u'(a_t) \}.$$

Based on this observation, given a candidate consumption function $c(a, z)$, policy function iteration updates the consumption function by the value $\xi = Tc(a, z)$ that solves

$$u'(\xi) = \max \left\{ \mathrm{E}_z \hat\beta \hat R u'(c(\hat R(a - \xi) + \hat Y, \hat Z)), u'(a) \right\}. \tag{2.4}$$

Let $\mathcal{C}$ be the space of candidate consumption functions such that $c \colon (0, \infty) \times \mathsf{Z} \to \mathbb{R}$ is continuous, is increasing in the first argument, $0 < c(a, z) \le a$ for all $a > 0$ and $z \in \mathsf{Z}$, and

$$\sup_{(a, z) \in (0, \infty) \times \mathsf{Z}} |u'(c(a, z)) - u'(a)| < \infty.$$

For $c, d \in \mathcal{C}$, define

$$\rho(c, d) = \sup_{(a, z) \in (0, \infty) \times \mathsf{Z}} |u'(c(a, z)) - u'(d(a, z))|. \tag{2.5}$$

When Assumptions 1 and 2 hold, Theorem 2.2 of Ma, Stachurski, and Toda (2020) shows that $\mathcal{C}$ is a complete metric space with metric $\rho$ and that $T \colon \mathcal{C} \to \mathcal{C}$, defined by letting $Tc(a, z)$ be the $\xi$ that solves (2.4), is a contraction mapping. We call the operator $T$ the time iteration operator. Exploiting policy function iteration, Ma, Stachurski, and Toda (2020) show several properties such as (i) consumption and savings are increasing in wealth and (ii) consumption is increasing in income.
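The update (2.4) can be sketched numerically as follows. This is only an illustration under simplifying assumptions that are not in the paper: $\beta$, $R$, $Y$ are taken to be deterministic functions of the Markov state (no iid innovations), utility is CRRA, the candidate policy is stored on a finite grid with linear interpolation and extrapolation, and the root $\xi$ is found by bisection. The function name `time_iteration_step` and all parameter values are hypothetical.

```python
import numpy as np

def time_iteration_step(c, grid, P, beta, R, Y, gamma, n_bisect=60):
    """Apply the operator T of (2.4) once to a policy c[i, z] = c(grid[i], z)."""
    def mu(x):
        # CRRA marginal utility u'(x) = x^(-gamma)
        return x ** (-gamma)

    def policy(a_next, z_next):
        # Linear interpolation on the grid, linear extrapolation above its top
        cz = c[:, z_next]
        slope = (cz[-1] - cz[-2]) / (grid[-1] - grid[-2])
        above = cz[-1] + slope * (a_next - grid[-1])
        return np.where(a_next > grid[-1], above, np.interp(a_next, grid, cz))

    n_states = len(beta)
    new_c = np.empty_like(c)
    for z in range(n_states):
        # u'(xi) minus the right-hand side of (2.4) is decreasing in xi,
        # so the root can be bracketed in (0, a] and found by bisection.
        lo, hi = np.zeros_like(grid), grid.copy()
        for _ in range(n_bisect):
            xi = 0.5 * (lo + hi)
            expect = sum(
                P[z, zn] * beta[zn] * R[zn]
                * mu(policy(R[zn] * (grid - xi) + Y[zn], zn))
                for zn in range(n_states)
            )
            rhs = np.maximum(expect, mu(grid))
            consume_more = mu(xi) > rhs   # u'(xi) too high, so xi is too small
            lo = np.where(consume_more, xi, lo)
            hi = np.where(consume_more, hi, xi)
        new_c[:, z] = hi
    return new_c

# Hypothetical two-state example (illustrative numbers, not the paper's calibration)
grid = np.linspace(1e-3, 50.0, 200)
P = np.array([[0.9, 0.1], [0.1, 0.9]])
beta = np.array([0.95, 0.95])
R = np.array([1.00, 1.03])
Y = np.array([0.5, 1.5])
gamma = 2.0

c = np.tile(grid[:, None], (1, 2))   # start from the upper bound c(a, z) = a
for _ in range(50):
    c = time_iteration_step(c, grid, P, beta, R, Y, gamma)
```

Starting from $c(a, z) = a$ mirrors the upper-bound argument described in the introduction; in this sketch the iterates stay in $(0, a]$ and the borrowing constraint binds at the lowest grid points.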

Footnote: In addition to Assumptions 1 and 2, Ma, Stachurski, and Toda (2020) assume that the transition probability matrix $P$ is irreducible. However, irreducibility is required only for their ergodicity result, not for existence and uniqueness of a solution.

Footnote: The time iteration operator was introduced by Coleman (1990). Several papers such as Datta, Mirman, and Reffett (2002), Rabault (2002), Morand and Reffett (2003), Kuhn (2013), and Li and Stachurski (2014) use this approach to establish existence of solutions and study theoretical properties.

To study the asymptotic behavior of consumption, we strengthen Assumption 1 as follows.

Assumption 1'. The utility function exhibits constant relative risk aversion $\gamma > 0$: we have

$$u(c) = \begin{cases} \dfrac{c^{1-\gamma}}{1-\gamma}, & (\gamma \neq 1) \\ \log c. & (\gamma = 1) \end{cases} \tag{2.6}$$

Furthermore, $\mathrm{E}_z \beta R^{1-\gamma} < \infty$ for all $z$.

Footnote: We use the convention $\beta R^{1-\gamma} = (\beta R) R^{-\gamma}$ and $0 \cdot \infty = 0$. Then $\mathrm{E}_z \beta R^{1-\gamma} \in [0, \infty]$ is well-defined even if $\gamma > 1$ and $(\beta, R) = (0, 0)$ with positive probability.

Theorem 2.2 below, which is our main theoretical result, shows that when the utility function exhibits constant relative risk aversion, the consumption functions are asymptotically linear and characterizes the asymptotic MPCs. To avoid overwhelming the reader with notation and technicalities, we maintain the additional condition $\mathrm{E}_z \beta R^{1-\gamma} < \infty$ as in Assumption 1'. Furthermore, Theorem 2.2 only provides a necessary and almost sufficient condition for the asymptotic MPCs to be zero. We provide a complete characterization in Theorem 2.5 below.

Theorem 2.2 (Asymptotic linearity). Suppose Assumptions 1' and 2 hold. Let $D = D_{\beta R^{1-\gamma}}$ be the diagonal matrix whose $(z, z)$-th element is $\mathrm{E}_z \beta R^{1-\gamma} < \infty$. Then the following are true:

1. If $r(PD) < 1$, then for all $z \in \mathsf{Z}$ we have

$$\lim_{a \to \infty} \frac{c(a, z)}{a} =: \bar c(z) > 0, \tag{2.7}$$

where $\bar c(z) = x^*(z)^{-1/\gamma}$ and $x^* = (x^*(z))_{z=1}^Z \in \mathbb{R}_+^Z$ is the unique finite solution to the system of equations

$$x(z) = (Fx)(z) := \left( 1 + (PDx)(z)^{1/\gamma} \right)^\gamma, \quad z = 1, \dots, Z. \tag{2.8}$$

2. If $r(PD) \ge 1$ and $PD$ is irreducible, then for all $z \in \mathsf{Z}$ we have

$$\lim_{a \to \infty} \frac{c(a, z)}{a} = 0.$$

The proof of Theorem 2.2 is relegated to Section 4. Here we heuristically discuss the intuition for why we would expect the conclusion of Theorem 2.2 to hold. Suppose the limit (2.7) exists. Assuming that the borrowing constraint does not bind, the Euler equation (2.4) implies

$$u'(\xi) = \mathrm{E}_z \hat\beta \hat R u'(c(\hat R(a - \xi) + \hat Y, \hat Z)),$$

where $\xi = c(a, z)$. Setting $u'(c) = c^{-\gamma}$ as in Assumption 1', setting $c(a, z) = \bar c(z) a$ motivated by (2.7), multiplying both sides by $a^\gamma$, letting $a \to \infty$, and interchanging expectations and limits, it must be

$$\bar c(z)^{-\gamma} = \mathrm{E}_z \hat\beta \hat R^{1-\gamma} \bar c(\hat Z)^{-\gamma} (1 - \bar c(z))^{-\gamma}. \tag{2.9}$$

Dividing both sides of (2.9) by $(1 - \bar c(z))^{-\gamma}$ and setting $x(z) = \bar c(z)^{-\gamma}$, we obtain

$$x(z) = \left( 1 + \left( \mathrm{E}_z \hat\beta \hat R^{1-\gamma} x(\hat Z) \right)^{1/\gamma} \right)^\gamma, \quad z = 1, \dots, Z. \tag{2.10}$$

Since $\hat\beta, \hat R$ depend only on $\hat Z$ and iid innovations, we have

$$\mathrm{E}_z \hat\beta \hat R^{1-\gamma} x(\hat Z) = \sum_{\hat z = 1}^Z P(z, \hat z)\, \mathrm{E}_{\hat z} \hat\beta \hat R^{1-\gamma}\, x(\hat z).$$

Therefore letting $P$ be the transition probability matrix and $D = D_{\beta R^{1-\gamma}}$ be the diagonal matrix whose $(z, z)$-th element is $\mathrm{E}_z \beta R^{1-\gamma} < \infty$, we can rewrite (2.10) as (2.8). This discussion motivates the fixed point equation (2.8).

Next, we discuss the intuition for the spectral condition $r(PD) \gtrless 1$. When the elements of the vector $x \in \mathbb{R}_+^Z$ are large, since $PD$ is a nonnegative matrix, it follows from the definition of $F$ in (2.8) that

$$Fx \approx PDx.$$
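The fixed point equation (2.8) and the spectral condition of Theorem 2.2 can be explored numerically. The sketch below is illustrative only: the function name `asymptotic_mpc`, its arguments, and the stopping rule are assumptions, and it returns zero whenever $r(PD) \ge 1$, which Theorem 2.2 justifies only when $PD$ is irreducible.

```python
import numpy as np

def asymptotic_mpc(P, d, gamma, tol=1e-12, max_iter=100_000):
    """Return (r(PD), asymptotic MPCs) in the setting of Theorem 2.2.

    P     -- transition probability matrix
    d     -- vector with entries E_z[beta * R^(1 - gamma)] (diagonal of D)
    gamma -- relative risk aversion
    """
    P = np.asarray(P, dtype=float)
    d = np.asarray(d, dtype=float)
    K = P @ np.diag(d)
    r = np.max(np.abs(np.linalg.eigvals(K)))      # spectral radius (2.3)
    if r >= 1:
        # Zero asymptotic MPC (Theorem 2.2, part 2, assuming PD irreducible)
        return r, np.zeros(len(d))
    x = np.ones(len(d))
    for _ in range(max_iter):
        x_new = (1.0 + (K @ x) ** (1.0 / gamma)) ** gamma   # x -> F x in (2.8)
        if np.max(np.abs(x_new - x)) < tol * np.max(x_new):
            break
        x = x_new
    return r, x_new ** (-1.0 / gamma)

# Hypothetical inputs: two states with a state-independent b = E_z beta R^(1-gamma)
P = [[0.9, 0.1], [0.2, 0.8]]
r, mpc = asymptotic_mpc(P, [0.9, 0.9], gamma=2.0)
```

When $\mathrm{E}_z \beta R^{1-\gamma} = b$ does not depend on $z$, the output can be compared with the closed form $\bar c(z) = 1 - b^{1/\gamma}$ from Example 2.2.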

Since for large $x$ the function $x \mapsto Fx$ is almost linear, whether iterating $x \mapsto Fx$ converges or not depends on whether the largest eigenvalue of the coefficient matrix $PD$ is less or greater than 1. When $r(PD) < 1$, $F$ in (2.8) behaves like a contraction and we would expect it to have a unique fixed point. When $r(PD) \ge 1$, because $F$ is monotonic, we would expect the iteration of $x \mapsto Fx$ to diverge to infinity, and hence $\bar c(z) = x(z)^{-1/\gamma}$ to converge to 0.

Theorem 2.2 roughly says two things: with homothetic preferences, (i) consumption functions are asymptotically linear, and (ii) the asymptotic MPCs can be zero. The first point is not surprising based on the intuition of scale invariance with homothetic preferences. The second point is nontrivial and surprising, and it depends on whether the condition

$$r(P D_{\beta R^{1-\gamma}}) < 1 \tag{2.11}$$

holds or not. The condition $\mathrm{E}_z \hat\beta \hat R^{1-\gamma} < 1$, which Carroll (2009) calls the "finite value condition" and implies (2.11), is often required for the existence of a solution in dynamic programming problems with homothetic preferences. The following proposition explains why this condition has often been assumed in the literature.

Footnote: See, for example, the discussion on p. 244 of Samuelson (1969), Equation (9) of Krebs (2006), Equation (3) of Carroll (2009), Equation (18) of Toda (2014), or Equation (3) of Toda (2019).

Proposition 2.3. Suppose Assumption 1' holds. Then the optimal consumption-saving problem (2.1) with zero income ($Y \equiv 0$) has a solution (with finite lifetime utility) if and only if the finite value condition (2.11) holds. Under this condition, the optimal consumption function is

$$c(a, z) = x^*(z)^{-1/\gamma} a, \tag{2.12}$$

where $x^* \in \mathbb{R}_+^Z$ is the unique finite solution to (2.8).

Proof. The case $\gamma \neq 1$ follows from Proposition 1 of Toda (2019). The case $\gamma = 1$ follows from Proposition 10 of Online Appendix C of Toda (2019).

Footnote: In Toda (2019), the discount factor $\beta_t$ and the return $R_t$ are deterministic functions of the previous state $Z_{t-1}$. In our setting, they are functions of the current state $Z_t$ as well as iid shocks as in (2.2). This difference in the timing convention explains the difference in the statements, but the proof is essentially identical and therefore omitted. In fact, we can subsume both settings as follows. Instead of (2.2), assume $\beta_t = \beta(Z_{t-1}, Z_t, \varepsilon_t)$ and similarly for $R_t, Y_t$. For a random variable $X_t$, define the matrix $M_X$ by $M_X(z, \hat z) = \mathrm{E}[X_t \mid Z_{t-1} = z, Z_t = \hat z]$. Then using $P \odot M_X$ instead of $P D_X$ for $X = \beta, \beta R, \beta R^{1-\gamma}$ (where $\odot$ denotes the Hadamard (element-wise) product), we can analyze the problem in a unified way.

… $Y$ can be positive. In fact, the Inada condition $u'(0) = \infty$ in Assumption 1 and the condition $\mathrm{E}_z u'(Y) < \infty$ in Assumption 2 imply that $Y > 0$ almost surely. The next proposition shows that $r(PD) \ge 1$, and hence a zero asymptotic MPC, is possible only if $\gamma > 1$.

Proposition 2.4. If Assumption 2 holds and $\gamma \le 1$, then $r(P D_{\beta R^{1-\gamma}}) < 1$.

Example 2.3 below (with iid lognormal returns) shows that zero asymptotic MPCs are possible for any $\gamma > 1$.

Theorem 2.2 assumes that $\mathrm{E}_z \beta R^{1-\gamma}$ is finite and, in its second part, that $PD$ is irreducible. However, $\mathrm{E}_z \beta R^{1-\gamma}$ could be infinite or the matrix $PD$ need not be irreducible in particular applications. We can generalize Theorem 2.2 to cover all possible cases at the cost of making the notation slightly more complicated. To this end, let $K = PD$ be as in Theorem 2.2, where the diagonal element $D(z, z) = \mathrm{E}_z \beta R^{1-\gamma}$ could be infinite. By relabeling the states $z = 1, \dots, Z$ if necessary, without loss of generality we may assume that $K$ is block upper triangular,

$$K = \begin{bmatrix} K_1 & \cdots & * \\ \vdots & \ddots & \vdots \\ 0 & \cdots & K_J \end{bmatrix}, \tag{2.13}$$

where each diagonal block $K_j$ is irreducible. Partition $\mathsf{Z}$ as $\mathsf{Z} = \mathsf{Z}_1 \cup \dots \cup \mathsf{Z}_J$ accordingly. Then we have the following complete characterization.

Footnote: Recall that a square matrix $A$ is reducible if there exists a permutation matrix $P$ such that $P^\top A P$ is block upper triangular with at least two diagonal blocks. Matrices that are not reducible are called irreducible. Hence by induction a decomposition of the form (2.13) is always possible. By definition scalars ($1 \times 1$ matrices) are irreducible, so $K_j$ in (2.13) can be zero if it is $1 \times 1$.

Theorem 2.5 (Complete characterization). Suppose Assumption 2 holds and the utility function exhibits constant relative risk aversion $\gamma > 0$. Express $K = PD$ as in (2.13). Define the sequence $\{x_n\}_{n=0}^\infty \subset [0, \infty]^Z$ by $x_0 = 1$ and $x_n = F x_{n-1}$, where $F$ is as in (2.8) and we apply the convention $0 \cdot \infty = 0$. Then $\{x_n\}$ monotonically converges to $x^* \in [1, \infty]^Z$, and the limit (2.7) holds with $\bar c(z) = x^*(z)^{-1/\gamma} \in [0, 1]$. Furthermore, $\bar c(z) = 0$ if and only if there exist $j$, $\hat z \in \mathsf{Z}_j$, and $m \in \mathbb{N}$ such that $K^m(z, \hat z) > 0$ and $r(K_j) \ge 1$, where $r(K_j) = \infty$ if some element of $K_j$ is infinite.

An interesting implication of Theorems 2.2 and 2.5 is that the asymptotic MPCs $\bar c(z)$ depend only on the matrix $PD$, which in turn depends only on relative risk aversion $\gamma$ as well as "multiplicative shocks" $\beta$ and $R$, and not on "additive shocks" $Y$. The following corollary verifies the intuition in Gouin-Bonenfant and Toda (2018) that only multiplicative shocks matter for characterizing the behavior of wealthy agents.

Corollary 2.6 (Irrelevance of additive shocks). Let everything be as in Theorem 2.5. The asymptotic MPCs $\bar c(z)$ depend only on the relative risk aversion $\gamma$, transition probability matrix $P$, the discount factor $\beta$, and the return on wealth $R$, and not on income $Y$.

The system of fixed point equations (2.8) is in general nonlinear and does not admit a closed-form solution. Below, we discuss several examples with explicit solutions.

Example 2.1. If $\gamma = 1$, then (2.8) becomes

$$x^* = 1 + PDx^* \iff x^* = (I - PD)^{-1} 1,$$

where $D = D_\beta = \operatorname{diag}(\dots, \mathrm{E}_z \beta, \dots)$ and $1$ denotes the vector of ones. A corollary is that with log utility, we always have $\bar c(z) > 0$.

Footnote: Note that since $r(PD) = r(P D_\beta) < 1$, the inverse $(I - PD)^{-1} = \sum_{k=0}^\infty (PD)^k$ exists and is nonnegative.

Example 2.2. If $b = b(z) = \mathrm{E}_z \beta R^{1-\gamma}$ does not depend on $z$, then $D = bI$. If $x = k1$ is a multiple of the vector of ones, then $PDx = bPk1 = bk1$ because $P$ is a transition probability matrix. Thus if $b < 1$, (2.8) reduces to

$$x^*(z) = (1 + (b x^*(z))^{1/\gamma})^\gamma \iff x^*(z) = (1 - b^{1/\gamma})^{-\gamma} \iff \bar c(z) = 1 - b^{1/\gamma}.$$

This example shows that with constant discounting ($\beta(z, \varepsilon) \equiv \beta$) and risk-free saving ($R(z, \zeta) \equiv R$), the asymptotic MPC is constant regardless of the income shocks:

$$\bar c(z) = \begin{cases} 1 - (\beta R^{1-\gamma})^{1/\gamma} & \text{if } \beta R^{1-\gamma} < 1, \\ 0 & \text{otherwise.} \end{cases}$$

Example 2.3. Suppose the return on wealth $R_t = R(Z_t, \zeta_t)$ does not depend on $Z_t$, so $R_t = R(\zeta_t)$. Assume further that $\log R_t$ is normally distributed with standard deviation $\sigma$ and mean $\mu - \sigma^2/2$, so $\mathrm{E} R = e^\mu$. Let the discount factor $\beta = e^{-\delta}$ be constant, where $\delta > 0$. Then

$$1 > \mathrm{E} \beta R = e^{-\delta + \mu} \iff \delta > \mu,$$

$$1 > \mathrm{E} \beta R^{1-\gamma} = e^{-\delta + (1-\gamma)(\mu - \gamma\sigma^2/2)} \iff \delta > (1 - \gamma)\left( \mu - \frac{\gamma\sigma^2}{2} \right).$$

Therefore assuming $\delta > \mu$ for Assumption 2 to hold, it follows from Example 2.2 that

$$\bar c(z) = \begin{cases} 1 - e^{-\psi\delta - (1-\psi)(\mu - \gamma\sigma^2/2)} & \text{if } \delta > (1-\gamma)\left( \mu - \frac{\gamma\sigma^2}{2} \right), \\ 0 & \text{otherwise,} \end{cases}$$

where $\psi = 1/\gamma$ is the elasticity of intertemporal substitution. If $\gamma > 1$, then $(1-\gamma)(\mu - \gamma\sigma^2/2) \to \infty$ as $\gamma, \sigma \to \infty$, so the asymptotic MPC is 0 if risk aversion or volatility is sufficiently high.

Asymptotic MPCs and saving rates

In this section we apply our theory of asymptotic MPCs to shed light on the saving rate of the rich.

We define an agent's saving rate by the change in net worth divided by total income excluding capital loss (to prevent the denominator from becoming negative):

$$s_{t+1} = \frac{a_{t+1} - a_t}{\max\{(R_{t+1} - 1)(a_t - c_t), 0\} + Y_{t+1}}. \tag{3.1}$$

For $x \in \mathbb{R}$, define its positive and negative parts by $x^+ = \max\{x, 0\}$ and $x^- = -\min\{x, 0\}$. Then $x = x^+ - x^-$. Using the budget constraint (2.1a), the saving rate (3.1) can be rewritten as

$$s_{t+1} = \frac{[(R_{t+1} - 1)^+ - (R_{t+1} - 1)^-](a_t - c_t) + Y_{t+1} - c_t}{(R_{t+1} - 1)^+ (a_t - c_t) + Y_{t+1}} = 1 - \frac{(\hat R - 1)^- (1 - c/a) + c/a}{(\hat R - 1)^+ (1 - c/a) + \hat Y/a} \in (-\infty, 1]. \tag{3.2}$$

Letting $a \to \infty$, the saving rate of an infinitely wealthy agent becomes

$$\bar s := 1 - \frac{(\hat R - 1)^- (1 - \bar c) + \bar c}{(\hat R - 1)^+ (1 - \bar c)} \in [-\infty, 1], \tag{3.3}$$

where $\bar c$ is the asymptotic MPC. Under what conditions can the saving rate (3.2) be increasing in wealth, and in particular, can the asymptotic saving rate (3.3) become positive? The following proposition provides a negative answer within a class of models.

Proposition 3.1. Consider a canonical Bewley (1977) model in which agents are infinitely-lived and relative risk aversion $\gamma$, discount factor $\beta$, and return on wealth $R > 1$ are constant. Then in the stationary equilibrium the asymptotic saving rate (3.3) is negative.

Proof. Stachurski and Toda (2019) show that it must be $\beta R < 1$. Since $R > 1$, we have $\beta R^{1-\gamma} = (\beta R) R^{-\gamma} < 1$, so by Example 2.2 the asymptotic MPC is $\bar c = 1 - (\beta R^{1-\gamma})^{1/\gamma} \in (0, 1)$. Therefore

$$\bar s = 1 - \frac{\bar c}{(R - 1)(1 - \bar c)} < 0 \iff (R - 1)(1 - \bar c) < \bar c \iff (R - 1)(\beta R^{1-\gamma})^{1/\gamma} < 1 - (\beta R^{1-\gamma})^{1/\gamma} \iff (\beta R)^{1/\gamma} < 1,$$

which holds because $\beta R < 1$.

Footnote: This result has a similar flavor to Stachurski and Toda (2019), who prove that canonical Bewley models cannot explain the tail behavior of wealth.

Thus, these models are unable to generate a positive asymptotic saving rate. Allowing $\beta$ or $R$ to be stochastic need not solve the problem when $\bar c > 0$.

Proposition 3.2.

Consider a Bewley (1977) model in which agents are infinitely-lived, relative risk aversion $\gamma$ is constant, and $\{\beta_t, R_t\}_{t \ge 1}$ is iid with $\mathrm{E} R > 1$ and $\mathrm{E} \beta R^{1-\gamma} < 1$. If the stationary equilibrium wealth distribution has an unbounded support, then the asymptotic saving rate (3.3) evaluated at $\hat R = \mathrm{E} R$ is nonpositive.

Proof. Since by assumption $\mathrm{E} \beta R^{1-\gamma} < 1$, by Example 2.2 the asymptotic MPC is $\bar c = 1 - (\mathrm{E} \beta R^{1-\gamma})^{1/\gamma} \in (0, 1)$. Since $\mathrm{E} R > 1$, we have

$$\bar s = 1 - \frac{\bar c}{(\mathrm{E} R - 1)(1 - \bar c)} \le 0 \iff (\mathrm{E} R - 1)(1 - \bar c) \le \bar c \iff \mathrm{E} R (1 - \bar c) \le 1.$$

Since $\mathrm{E} R (1 - \bar c)$ is the expected growth rate of wealth for infinitely wealthy agents, if the wealth distribution is unbounded and $\mathrm{E} R (1 - \bar c) > 1$, then wealth will grow at the top, which violates stationarity. Therefore in a stationary equilibrium, it must be $\bar s \le 0$.

In contrast, if $r(P D_{\beta R^{1-\gamma}}) \ge 1$, then by Theorem 2.2 we have $\bar c = 0$ and hence the asymptotic saving rate becomes $\bar s = 1 > 0$.

Footnote: Another possibility is to consider overlapping generations models. Stachurski and Toda (2019, Theorem 9) present a model with random birth/death and show that it is possible to have $\beta R > 1$, in which case $\bar s > 0$.

To show the theoretical possibility of positive and increasing saving rates, we consider a numerical example calibrated from U.S. data. We present a minimal model to illustrate our theory, and a detailed comparison to the data is beyond the scope of the paper.

The agent has constant discount factor $\beta$ and relative risk aversion $\gamma > 0$. We suppose that wealthy agents invest their wealth into stocks, private businesses, and a risk-free asset in constant proportions subject to a capital income tax. Let $R^s_t, R^b_t$ be the gross returns on stock and business between time $t - 1$ and $t$, and let $R^f$ be the gross risk-free rate. The stock return process $\{R^s_t\}$ exhibits constant expected return $\mathrm{E} R^s_t = e^\mu$ with GARCH(1,1) innovations:

$$\log R^s_t = \mu - \frac{\sigma_t^2}{2} + \epsilon_t, \tag{3.4a}$$

$$\epsilon_t = \sigma_t \zeta_t, \quad \zeta_t \sim \text{iid } N(0, 1), \tag{3.4b}$$

$$\sigma_t^2 = \omega + \alpha \epsilon_{t-1}^2 + \rho \sigma_{t-1}^2, \tag{3.4c}$$

where $\sigma_t > 0$, $\epsilon_t$ is a zero mean innovation, and we assume $\omega, \alpha, \rho > 0$ and $\alpha + \rho < 1$. The business return is

$$R^b_t = \begin{cases} \dfrac{1}{1 - p_b} R^s_t & \text{with probability } 1 - p_b, \\ 0 & \text{with probability } p_b, \end{cases}$$

so that private businesses go bankrupt with probability $p_b$ but otherwise business returns are perfectly correlated with the stock return with identical mean. Letting $\tau$ be the capital income tax rate, the after-tax gross portfolio return is

$$R_t(\theta) := 1 + (1 - \tau)(\theta^s R^s_t + \theta^b R^b_t + \theta^f R^f - 1), \tag{3.5}$$

where $\theta = (\theta^s, \theta^b, \theta^f)$ is the portfolio with $\theta^s + \theta^b + \theta^f = 1$.

Footnote: Since our focus is the individual optimization problem (2.1), the distinction between stocks (which are subject to only aggregate shocks) and private businesses (which are subject to both aggregate and idiosyncratic shocks) is unimportant. We include both assets only to reflect the evidence on individual portfolio cited below.

Footnote: Since business returns are modeled as a mean-preserving spread of stock returns, risk averse and unconstrained agents would never hold business assets. Here we suppose that wealthy agents hold business assets for other reasons, for example retaining voting rights in shareholder meetings.

To calibrate the stock return parameters, we use the 1947–2018 monthly data for U.S. stock market returns (volume-weighted index including dividends) and risk-free rates from the updated spreadsheet of Welch and Goyal (2008). Their spreadsheet contains monthly nominal stock and risk-free returns as well as the inflation. From these we construct the real gross stock and risk-free returns $R^s_t, R^f_t$, define the residual $\hat\epsilon_t$ in (3.4a) by demeaning the log excess returns $\log R^s_t - \log R^f_t$, and estimate the GARCH parameters $\omega = 9.\ldots \times 10^{-\ldots}$, $\alpha = 0.\ldots$, and $\rho = 0.\ldots$. We set the log risk-free rate to $\log R^f = \mathrm{E}[\log R^f_t] = 5.\ldots \times 10^{-\ldots}$ (annual rate 0.65%). We estimate the log expected return as $\mu = \log(\mathrm{E} R^s_t) = 6.\ldots \times 10^{-\ldots}$ (annual rate 8.19%). Because our model requires a finite state Markov chain, we discretize the GARCH(1,1) process (3.4) using the Farmer and Toda (2017) method as described in Appendix A with $N_v = 3$ points for the volatility state and $N_\epsilon = 15$ for the return state.

To calibrate the portfolio shares $\theta = (\theta^s, \theta^b, \theta^f)$, we use the 1913–2012 wealth share data of the wealthiest households in the U.S. estimated by Saez and Zucman (2016). Specifically, in Table B5b of their Online Appendix, they report the composition of wealth of the top 0.01% across asset groups (equities, fixed income claims, housing, business assets, and pensions). We classify equities and pension as "stock", business assets as "business", and fixed income claims and housing as "risk-free asset" to compute the portfolio share $\theta$ for all years, take the average across all years, and obtain $(\theta^s, \theta^b, \theta^f) = (0.\ldots, 0.\ldots, 0.\ldots)$.

Footnote: These portfolio shares are relatively stable over time. Although the classification of housing and pension may be ambiguous, because these two categories comprise a small fraction (about 10%) of the portfolio, choosing different classifications gives quantitatively similar results.

We set the discount factor to $\beta = e^{-0.05/12}$ so that the annual discounting is 5%. The bankruptcy probability is $p_b = 1 - e^{-0.025/12}$ so that the annual exit rate is 2.5% as documented in Luttmer (2010) for firms with more than 500 employees. The capital income tax rate is $\tau = 0.25$ based on the estimate in McDaniel (2007) using national account statistics.

To solve the income fluctuation problem (2.1), we need to specify the income process. Because the U.S. economy has been growing, and by Corollary 2.6 the income process does not affect the asymptotic MPCs, we assume that income grows at a constant rate $g$, so $Y_t = e^{gt}$. We calibrate the growth rate $g$ from the U.S. real per capita GDP in 1947–2018 and obtain $g = 1.\ldots \times 10^{-\ldots}$ at the monthly frequency. Although the theory in Ma, Stachurski, and Toda (2020) requires a stationary process for income, it is straightforward to allow for constant growth in income by detrending the model when the utility function is CRRA. After simple algebra, it suffices to use

$$\tilde R_t(\theta) = R_t(\theta) e^{-g}, \quad \tilde\beta = \beta e^{(1-\gamma)g}, \quad \tilde Y_t = Y_t e^{-gt} = 1.$$

In the current setting, Assumption 1' and conditions 1 and 3 of Assumption 2 obviously hold. To apply Theorems 2.1 and 2.2, it remains to verify $r(P D_{\beta R}) < 1$ and to determine whether $r(PD) \gtrless 1$, where $D = D_{\beta R^{1-\gamma}}$. Figure 1 shows the determination of the asymptotic MPC $\bar c(z)$ when we change the relative risk aversion $\gamma$ and the annual discount rate $\delta$. We see that the asymptotic MPCs can be zero if relative risk aversion is moderately high (above 4–5).

[Figure 1: Determination of asymptotic MPCs with GARCH(1,1) returns. Axes: $\gamma$ (risk aversion) and $\delta$ (annual discount rate). Regions: $\bar c(z) > 0$ where $r(PD) < 1$ and $\bar c(z) = 0$ where $r(PD) > 1$; boundary curves $r(P D_{\beta R}) = 1$ and $r(PD) = 1$.]

Is the possibility of zero asymptotic MPC empirically plausible? To address this concern, we do a simple calculation similar to Friend and Blume (1975). Although our paper abstracts from portfolio choice, suppose that wealthy agents choose the portfolio $\theta^s, \theta^f$ (fixing $\theta^b$) by maximizing the certainty equivalent of return $\mathrm{E}[R(\theta)^{1-\gamma}]^{\frac{1}{1-\gamma}}$, where $R(\theta)$ is the gross portfolio return in (3.5) and the expectation is taken over the ergodic distribution of asset returns. The first-order condition of this optimization problem is

$$\mathrm{E}[R(\theta)^{-\gamma}(R^s - R^f)] = 0. \tag{3.6}$$

Using our discretized asset returns and the portfolio share $\theta$, the relative risk aversion that makes (3.6) hold is $\gamma = 6.38$. According to Figure 1, this level of risk aversion makes the asymptotic MPC $\bar c(z)$ equal to zero for any reasonable discount rates.

We next solve the model for $\gamma = 3$ and $\gamma = 5$. Figure 2 shows the optimal consumption rules. When $\gamma = 3$ the consumption functions are approximately linear with positive slopes for high asset level. When $\gamma = 5$, the consumption functions show a more concave pattern.

[Figure 2: Optimal consumption rule. Panels plot consumption against asset for $\gamma = 3$ (left) and $\gamma = 5$ (right), with one curve per volatility state $\sigma_l, \sigma_m, \sigma_h$.]

Note: The top and bottom panels plot the consumption functions in the range $a \in [0, […]]$ and $a \in [0, […]]$, respectively. Here and in other figures, the left (right) panels correspond to $\gamma = 3$ ($\gamma = 5$). For visibility, we plot across asset and the three volatility states $\sigma_l < \sigma_m < \sigma_h$ holding $\epsilon = 0$ constant.

Figure 3 plots the consumption rates (solid lines) in log-log scale. We see that the consumption rates are decreasing in wealth for each realized volatility. For $\gamma = 3$, as the asset level gets large, the consumption rates approach positive constants that coincide with the theoretical values calculated based on Theorem 2.2 (dotted lines), indicating that the consumption functions are asymptotically linear, consistent with the theorem. For $\gamma = 5$, the consumption rates exhibit a clear decreasing trend even when asset is extremely large ($a \approx […]$), which is consistent with the zero asymptotic MPC established in Theorem 2.2.

Figure 3: Consumption rate. (Axes: asset and consumption/asset; the $\gamma = 3$ panel also shows the theoretical levels.)

Finally, Figure 4 shows the saving rates assuming $\sigma_t = \sigma_{t+1} \in \{\sigma_l, \sigma_m, \sigma_h\}$ and $\epsilon = 0$. When wealth is low, the borrowing constraint binds and labor income is the only source of income and net worth accumulation, i.e., $s_{t+1} = (Y_{t+1} - a)/Y_{t+1} = 1 - e^{-g} a$, which is decreasing in asset. A moderately greater wealth implies lower saving rates because capital income is used to finance disproportionately large consumption. The saving rate starts to increase when wealth is relatively high ($\approx […]$). […] $\gamma = 5$ and $\sigma \in \{\sigma_l, \sigma_m\}$, the saving rate is increasing in wealth among agents with large asset and the asymptotic saving rate equals 1, as opposed to the increasing but either negative or small positive saving rate when $\gamma = 3$. This example illustrates that the empirically observed large positive and increasing saving rate could potentially be explained by models with capital income risk, particularly those with zero asymptotic MPCs.

Figure 4: Saving rate.

Footnote: When $\sigma = \sigma_h$, the saving rate becomes negative because $\hat{R} < 1$ when $\epsilon = 0$; see (3.2).

The proof of Theorem 2.2 is technical and consists of the following steps:

1. show that policy function iteration leads to increasingly tighter upper bounds on consumption functions that are asymptotically linear with explicit slopes,
2. show that the slopes of the upper bounds converge using the fixed point theory of monotone convex maps, and
3. show that the consumption functions have linear lower bounds with identical slopes to the limit of upper bounds, implying asymptotic linearity.

Let $\mathcal{C}$ be the space of candidate consumption functions and $T : \mathcal{C} \to \mathcal{C}$ be the time iteration operator as defined in Section 2. The following proposition allows us to asymptotically bound the consumption rate $c(a, z)/a$ from above.

Proposition 4.1.

Let everything be as in Theorem 2.2. If $c \in \mathcal{C}$ and $\limsup_{a \to \infty} c(a, z)/a \le x(z)^{-1/\gamma}$ for some $x(z) \ge 1$ for all $z \in Z$, then

$$\limsup_{a \to \infty} \frac{Tc(a, z)}{a} \le (Fx)(z)^{-1/\gamma}. \tag{4.1}$$

Proof.

Let $\alpha = \limsup_{a \to \infty} Tc(a, z)/a$. By definition, we can take an increasing sequence $\{a_n\}$ such that $\alpha = \lim_{n \to \infty} Tc(a_n, z)/a_n$. Define $\alpha_n = Tc(a_n, z)/a_n \in (0, 1]$ and

$$\lambda_n = \frac{c(\hat{R}(1 - \alpha_n) a_n + \hat{Y}, \hat{Z})}{a_n} > 0. \tag{4.2}$$

Let us show that

$$\limsup_{n \to \infty} \lambda_n \le x(\hat{Z})^{-1/\gamma} \hat{R}(1 - \alpha). \tag{4.3}$$

To see this, if $\alpha < 1$ and $\hat{R} > 0$, then since $\hat{R}(1 - \alpha_n) a_n \to \hat{R}(1 - \alpha) \cdot \infty = \infty$, by assumption we have

$$\limsup_{n \to \infty} \lambda_n = \limsup_{n \to \infty} \frac{c(\hat{R}(1 - \alpha_n) a_n + \hat{Y}, \hat{Z})}{\hat{R}(1 - \alpha_n) a_n + \hat{Y}} \left( \hat{R}(1 - \alpha_n) + \frac{\hat{Y}}{a_n} \right) \le \limsup_{a \to \infty} \frac{c(a, \hat{Z})}{a} \times \hat{R}(1 - \alpha) \le x(\hat{Z})^{-1/\gamma} \hat{R}(1 - \alpha),$$

which is (4.3). If $\alpha = 1$ or $\hat{R} = 0$, then since $c(a, z) \le a$, we have

$$\lambda_n = \frac{c(\hat{R}(1 - \alpha_n) a_n + \hat{Y}, \hat{Z})}{\hat{R}(1 - \alpha_n) a_n + \hat{Y}} \left( \hat{R}(1 - \alpha_n) + \frac{\hat{Y}}{a_n} \right) \le \hat{R}(1 - \alpha_n) + \frac{\hat{Y}}{a_n} \to \hat{R}(1 - \alpha) = 0,$$

so again (4.3) holds.

Since $\xi_n := Tc(a_n, z) = \alpha_n a_n$ solves the Euler equation, using $u'(c) = c^{-\gamma}$ and the definition of $\lambda_n$ in (4.2), we have

$$0 = \frac{u'(\alpha_n a_n)}{u'(a_n)} - \max\left\{ \mathrm{E}_z \hat{\beta} \hat{R} \frac{u'(c(\hat{R}(1 - \alpha_n) a_n + \hat{Y}, \hat{Z}))}{u'(a_n)}, 1 \right\}$$
$$= \alpha_n^{-\gamma} - \max\left\{ \mathrm{E}_z \hat{\beta} \hat{R} \left( c(\hat{R}(1 - \alpha_n) a_n + \hat{Y}, \hat{Z})/a_n \right)^{-\gamma}, 1 \right\} = \alpha_n^{-\gamma} - \max\left\{ \mathrm{E}_z \hat{\beta} \hat{R} \lambda_n^{-\gamma}, 1 \right\}$$
$$\implies \alpha_n^{-\gamma} = \max\left\{ \mathrm{E}_z \hat{\beta} \hat{R} \lambda_n^{-\gamma}, 1 \right\} \ge \mathrm{E}_z \hat{\beta} \hat{R} \lambda_n^{-\gamma}. \tag{4.4}$$

Now letting $n \to \infty$ in (4.4) and applying Fatou's lemma, we obtain

$$\alpha^{-\gamma} = \lim_{n \to \infty} \alpha_n^{-\gamma} \ge \liminf_{n \to \infty} \mathrm{E}_z \hat{\beta} \hat{R} \lambda_n^{-\gamma} \ge \mathrm{E}_z \liminf_{n \to \infty} \hat{\beta} \hat{R} \lambda_n^{-\gamma} = \mathrm{E}_z \hat{\beta} \hat{R} \left[ \limsup_{n \to \infty} \lambda_n \right]^{-\gamma} \ge \mathrm{E}_z \hat{\beta} \hat{R} \left[ x(\hat{Z})^{-1/\gamma} \hat{R}(1 - \alpha) \right]^{-\gamma}$$

by (4.3). Solving the inequality for $\alpha$ and using the convention $\beta R^{1-\gamma} = (\beta R) R^{-\gamma}$ and $0 \cdot \infty = 0$ (see Footnote 7), we obtain

$$\limsup_{a \to \infty} \frac{Tc(a, z)}{a} = \alpha \le \frac{1}{1 + \left( \mathrm{E}_z \hat{\beta} \hat{R}^{1-\gamma} x(\hat{Z}) \right)^{1/\gamma}} = (Fx)(z)^{-1/\gamma}.$$

Starting from the trivial upper bound $c(a, z) \le a$ and applying Proposition 4.1 repeatedly, we obtain increasingly tighter upper bounds of $c(a, z)$. The following proposition characterizes the limits of the slopes of the upper bounds.

Proposition 4.2.

Let everything be as in Theorem 2.2. Then $F$ in (2.8) has a fixed point $x^* \in \mathbb{R}^Z_+$ if and only if $r(PD) < 1$, in which case the fixed point is unique. Take any $x_0 \in \mathbb{R}^Z_+$ and define the sequence $\{x_n\}_{n=1}^\infty \subset \mathbb{R}^Z_+$ by

$$x_n = F x_{n-1} \tag{4.5}$$

for all $n \in \mathbb{N}$. Then the following are true.

1. If $r(PD) < 1$, then $\{x_n\}_{n=1}^\infty$ converges to $x^*$.
2. If $r(PD) \ge 1$ and $PD$ is irreducible, then $x_n(z) \to x^*(z) = \infty$ as $n \to \infty$ for all $z \in Z$.

Proof. Immediate from Lemmas 4.3 and 4.4 below.
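The iteration (4.5) is easy to explore numerically. The following Python sketch is an illustration only, not the authors' code: it assumes $F$ has the form $Fx = \phi(Kx)$ with $\phi(t) = (1 + t^{1/\gamma})^\gamma$ and $K = PD$, as in Lemma 4.4 below, and it reports the asymptotic MPC $x^*(z)^{-1/\gamma}$ when $r(K) < 1$ and zero when $r(K) \ge 1$ (assuming $K$ is irreducible). The function names and the example matrix are hypothetical.

```python
import numpy as np


def phi(t, gamma):
    """phi(t) = (1 + t**(1/gamma))**gamma, applied element-wise."""
    return (1.0 + t ** (1.0 / gamma)) ** gamma


def spectral_radius(K):
    """Largest absolute eigenvalue r(K) of a square matrix K."""
    return float(np.max(np.abs(np.linalg.eigvals(K))))


def iterate_F(K, gamma, tol=1e-12, max_iter=100_000):
    """Iterate x_n = phi(K x_{n-1}) from x_0 = 1.

    Stops when the relative change falls below tol and returns (x, converged).
    Intended for r(K) < 1; otherwise the iterates keep growing.
    """
    x = np.ones(K.shape[0])
    for _ in range(max_iter):
        x_new = phi(K @ x, gamma)
        if np.max(np.abs(x_new - x)) <= tol * np.max(x_new):
            return x_new, True
        x = x_new
    return x, False


def asymptotic_mpc(K, gamma):
    """x*(z)**(-1/gamma) if r(K) < 1; zero if r(K) >= 1 (K assumed irreducible)."""
    if spectral_radius(K) >= 1.0:
        return np.zeros(K.shape[0])
    x_star, converged = iterate_F(K, gamma)
    if not converged:
        raise RuntimeError("fixed-point iteration did not converge")
    return x_star ** (-1.0 / gamma)


if __name__ == "__main__":
    K = np.array([[0.5, 0.2], [0.1, 0.6]])  # hypothetical matrix with r(K) = 0.7
    print(asymptotic_mpc(K, gamma=2.0))        # positive asymptotic MPCs
    print(asymptotic_mpc(1.5 * K, gamma=2.0))  # r(K) = 1.05, so zeros
```

In the one-state case $K = k < 1$, the fixed point solves $x^{1/\gamma} = 1 + (kx)^{1/\gamma}$, so $x^* = (1 - k^{1/\gamma})^{-\gamma}$ and the asymptotic MPC is $1 - k^{1/\gamma}$, which gives a quick check on the iteration.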

Lemma 4.3.

Let $\gamma > 0$ and define $\phi : \mathbb{R}_+ \to \mathbb{R}_+$ by $\phi(t) = (1 + t^{1/\gamma})^\gamma$. Then there exist $a \ge 1$ and $b \ge 0$ such that $\phi(t) \le at + b$. Furthermore, we can take $a \ge 1$ arbitrarily close to 1. (The choice of $b$ may depend on $a$.)

Proof. The proof depends on $\gamma \gtrless 1$.

Case 1: $\gamma \le 1$. Let us show that we can take $a = b = 1$. Let $f(t) = 1 + t - \phi(t)$. Then $f(0) = 0$ and

$$f'(t) = 1 - \phi'(t) = 1 - \gamma (1 + t^{1/\gamma})^{\gamma - 1} \frac{1}{\gamma} t^{1/\gamma - 1} = 1 - (t^{-1/\gamma} + 1)^{\gamma - 1} \ge 0,$$

so $f(t) \ge 0$ for all $t \ge 0$. Therefore $\phi(t) \le 1 + t$.

Case 2: $\gamma > 1$. By simple algebra we obtain

$$\phi''(t) = (\gamma - 1)(t^{-1/\gamma} + 1)^{\gamma - 2} \left( -\frac{1}{\gamma} t^{-1/\gamma - 1} \right) < 0, \tag{4.6}$$

so $\phi$ is increasing and concave. Therefore $\phi(t) \le \phi(u) + \phi'(u)(t - u)$ for all $t, u$. Letting $a = \phi'(u)$ and $b = \max\{0, \phi(u) - \phi'(u) u\}$, we obtain $\phi(t) \le at + b$. Furthermore, since $\phi'(t) = (t^{-1/\gamma} + 1)^{\gamma - 1} \to 1$ as $t \to \infty$, we can take $a = \phi'(u)$ arbitrarily close to 1 by taking $u$ large enough.

Lemma 4.4.

Let $\gamma > 0$ and $K$ be a $Z \times Z$ nonnegative matrix. Define $F : \mathbb{R}^Z_+ \to \mathbb{R}^Z_+$ by $Fx = \phi(Kx)$, where $\phi$ is as in Lemma 4.3 and is applied element-wise. Then $F$ has a fixed point $x^* \in \mathbb{R}^Z_+$ if and only if $r(K) < 1$, in which case $x^*$ is unique. Take any $x_0 \in \mathbb{R}^Z_+$ and define the sequence $\{x_n\}_{n=1}^\infty \subset \mathbb{R}^Z_+$ by $x_n = F x_{n-1}$ for all $n \in \mathbb{N}$. Then the following are true.

1. If $r(K) < 1$, then $\{x_n\}_{n=1}^\infty$ converges to $x^*$.
2. If $r(K) \ge 1$ and $K$ is irreducible, then $x_n(z) \to x^*(z) = \infty$ as $n \to \infty$ for all $z \in Z$.

Proof. We divide the proof into several steps.

Step 1. If $r(K) \ge 1$, then $F$ does not have a fixed point. If in addition $K$ is irreducible, then $x_n(z) \to \infty$ for all $z \in Z$.

We prove the contrapositive. Suppose that $F$ has a fixed point $x^* \in \mathbb{R}^Z_+$. Since $\phi > 0$, we have $x^* \gg 0$. Since clearly $\phi(t) > t$ for all $t \ge 0$, we have $x^* = \phi(Kx^*) \gg Kx^*$. Since $K$ is a nonnegative matrix, by the Perron-Frobenius theorem, we can take a left eigenvector $y > 0$ such that $y'K = r(K) y'$. Since $x^* \gg Kx^*$ and $y > 0$, we obtain

$$0 < y'(x^* - Kx^*) \implies r(K) y' x^* < y' x^*.$$

Dividing both sides by $y' x^* > 0$, we obtain $r(K) < 1$.

Suppose $r(K) \ge 1$ and $K$ is irreducible. Since $K$ is nonnegative and $\phi$ is strictly increasing, $F = \phi \circ K$ is a monotone map. Therefore to show $x_n(z) \to \infty$, it suffices to show this when $x_0 = 0$. Since $x_1 = F x_0 = F0 \ge 0$, applying $F^{n-1}$ we obtain $x_n \ge x_{n-1}$ for all $n$. Since $\{x_n\}_{n=0}^\infty$ is an increasing sequence in $\mathbb{R}^Z_+$, if it is bounded, then it converges to some $x^* \in \mathbb{R}^Z_+$. By continuity, $x^*$ is a fixed point of $F$, which is a contradiction. Therefore $\{x_n\}_{n=0}^\infty$ is unbounded, so $x_n(\hat{z}) \to \infty$ for at least one $\hat{z} \in Z$. Since by assumption $K$ is irreducible, for each $(z, \hat{z}) \in Z^2$, there exists $m \in \mathbb{N}$ such that $K^m(z, \hat{z}) > 0$. Therefore $x_{m+n}(z) \ge K^m(z, \hat{z}) x_n(\hat{z}) \to \infty$ as $n \to \infty$, so $x_n(z) \to \infty$ for all $z \in Z$.

Step 2. If $r(K) < 1$, then $F$ has a unique fixed point $x^*$ in $\mathbb{R}^Z_+$. If we take $a \in [1, 1/r(K))$ and $b > 0$ as in Lemma 4.3, then

$$\mathbf{1} \le x^* \ll (I - aK)^{-1} b\mathbf{1}. \tag{4.7}$$

Take any fixed point $x^* \in \mathbb{R}^Z_+$ of $F$. Since $\phi(t) \ge 1$ for all $t \ge 0$, clearly $x^* \ge \mathbf{1}$. Since $K$ is nonnegative and $a r(K) < 1$, the inverse $(I - aK)^{-1} = \sum_{k=0}^\infty (aK)^k$ exists and is nonnegative. Therefore

$$x^* = F x^* \ll aKx^* + b\mathbf{1} \implies x^* \ll (I - aK)^{-1} b\mathbf{1},$$

which is (4.7).

The proof of existence and uniqueness uses a similar strategy to Borovička and Stachurski (2020). Clearly $F$ is a monotone map. Using (4.6), it follows that $F$ is convex if $\gamma \le 1$ and concave if $\gamma \ge 1$. Define $u_0 = 0$ and $v_0 = (I - aK)^{-1} b\mathbf{1} \gg 0$. Then

$$F u_0 = \mathbf{1} \gg u_0 \quad \text{and} \quad F v_0 = \phi(K v_0) \ll aK v_0 + b\mathbf{1} = v_0.$$

Hence by Theorem 2.1.2 of Zhang (2013), which is based on Theorem 3.1 of Du (1990), $F$ has a unique fixed point in $[u_0, v_0] = [0, v_0]$. Since by (4.7) any fixed point $x^*$ must lie in this interval, it follows that $F$ has a unique fixed point in $\mathbb{R}^Z_+$.

Step 3. If $r(K) < 1$, then $\{x_n\}_{n=1}^\infty$ converges to $x^*$.

Let $a \in [1, 1/r(K))$, $b > 0$, and $v_0 \gg 0$ be as in Step 2. Since $Fx = \phi(Kx)$, we obtain

$$x_n = F x_{n-1} = \phi(K x_{n-1}) \ll aK x_{n-1} + b\mathbf{1}.$$

Iterating, we obtain

$$x_n \ll (aK)^n x_0 + \sum_{k=0}^{n-1} (aK)^k (b\mathbf{1}) = (aK)^n x_0 + \sum_{k=0}^\infty (aK)^k (b\mathbf{1}) - \sum_{k=n}^\infty (aK)^k (b\mathbf{1}) = (aK)^n (x_0 - v_0) + v_0.$$

Since $r(aK) = a r(K) < 1$, we have $(aK)^n (x_0 - v_0) \to 0$ as $n \to \infty$. Therefore $0 = u_0 \ll x_n \ll v_0$ for large enough $n$. Again by Theorem 2.1.2 of Zhang (2013), we have $x_n \to x^*$ as $n \to \infty$.

The following proposition allows us to obtain explicit linear lower bounds on consumption functions.

Proposition 4.5.

Let everything be as in Theorem 2.2. Suppose $r(PD) < 1$ and let $x^* \in \mathbb{R}^Z_{++}$ be the unique fixed point of $F$ in (2.8). Restrict the candidate space to

$$\mathcal{C}_0 = \{ c \in \mathcal{C} \mid c(a, z) \ge \epsilon(z) a \text{ for all } a > 0 \text{ and } z \in Z \}, \tag{4.8}$$

where $\epsilon(z) = x^*(z)^{-1/\gamma} \in (0, 1)$. Then $T\mathcal{C}_0 \subset \mathcal{C}_0$.

Proof. Suppose to the contrary that $T\mathcal{C}_0 \not\subset \mathcal{C}_0$. Then there exists $c \in \mathcal{C}_0$ such that for some $a > 0$ and $z \in Z$, we have $\xi := Tc(a, z) < \epsilon(z) a$. Since $u'$ is strictly decreasing and $\epsilon(z) \in (0, 1)$, we obtain

$$u'(a) \le u'(\epsilon(z) a) < u'(\xi) = \max\left\{ \mathrm{E}_z \hat{\beta} \hat{R} u'(c(\hat{R}(a - \xi) + \hat{Y}, \hat{Z})), u'(a) \right\}.$$

Therefore it must be $u'(a) < \mathrm{E}_z \hat{\beta} \hat{R} u'(c(\hat{R}(a - \xi) + \hat{Y}, \hat{Z}))$. Since $u'$ is strictly decreasing and $c \in \mathcal{C}_0$, we obtain

$$u'(\epsilon(z) a) < u'(\xi) = \mathrm{E}_z \hat{\beta} \hat{R} u'(c(\hat{R}(a - \xi) + \hat{Y}, \hat{Z})) \le \mathrm{E}_z \hat{\beta} \hat{R} u'(\epsilon(\hat{Z})(\hat{R}(a - \xi) + \hat{Y})) \le \mathrm{E}_z \hat{\beta} \hat{R} u'(\epsilon(\hat{Z}) \hat{R} [1 - \epsilon(z)] a).$$

Using $u'(c) = c^{-\gamma}$ and $\epsilon(z) = x^*(z)^{-1/\gamma}$, we obtain

$$x^*(z) < \mathrm{E}_z \hat{\beta} \hat{R}^{1-\gamma} x^*(\hat{Z}) [1 - x^*(z)^{-1/\gamma}]^{-\gamma} \iff x^*(z) < \left( 1 + \left( \mathrm{E}_z \hat{\beta} \hat{R}^{1-\gamma} x^*(\hat{Z}) \right)^{1/\gamma} \right)^\gamma = \left( 1 + (PDx^*)(z)^{1/\gamma} \right)^\gamma,$$

which is a contradiction because $x^*$ is a fixed point of $F$ in (2.8).

With all the above preparations, we can prove Theorem 2.2.

Proof of Theorem 2.2.

Define the sequence $\{c_n\} \subset \mathcal{C}$ by $c_0(a, z) = a$ and $c_n = T c_{n-1}$ for all $n \ge 1$. Since $Tc(a, z) \le a$ for any $c \in \mathcal{C}$, in particular $c_1(a, z) = T c_0(a, z) \le a = c_0(a, z)$. Since $T : \mathcal{C} \to \mathcal{C}$ is order preserving by Lemma B.4 of Ma, Stachurski, and Toda (2020), by induction $0 \le c_n \le c_{n-1}$ for all $n$ and $c(a, z) = \lim_{n \to \infty} c_n(a, z)$ exists. Then by Theorem 2.2 of Ma, Stachurski, and Toda (2020), this $c$ is the unique fixed point of $T$ and also the unique solution to the income fluctuation problem (2.1).

Define the sequence $\{x_n\} \subset \mathbb{R}^Z_{++}$ by $x_0 = \mathbf{1}$ and $x_n = F x_{n-1}$, where $F$ is as in (2.8). By definition, we have $c_0(a, z)/a = 1 = x_0(z)^{-1/\gamma}$, so in particular $\limsup_{a \to \infty} c_0(a, z)/a \le x_0(z)^{-1/\gamma}$ for all $z \in Z$. Since $c_n \downarrow c \ge 0$, applying Proposition 4.1 repeatedly gives

$$0 \le \limsup_{a \to \infty} \frac{c(a, z)}{a} \le \limsup_{a \to \infty} \frac{c_n(a, z)}{a} \le x_n(z)^{-1/\gamma}. \tag{4.9}$$

Case 1: $r(PD) \ge 1$ and $PD$ is irreducible. By Proposition 4.2 we have $x_n(z) \to \infty$ for all $z \in Z$. Letting $n \to \infty$ in (4.9), we obtain

$$\lim_{a \to \infty} \frac{c(a, z)}{a} = 0.$$

Case 2: $r(PD) < 1$. By Proposition 4.2 we have $x_n(z) \to x^*(z)$, where $x^*$ is the unique fixed point of $F$ in (2.8). Letting $n \to \infty$ in (4.9), we obtain

$$\limsup_{a \to \infty} \frac{c(a, z)}{a} \le x^*(z)^{-1/\gamma}. \tag{4.10}$$

On the other hand, a repeated application of Proposition 4.5 implies that $c_n(a, z) \ge x^*(z)^{-1/\gamma} a$ for all $a > 0$ and $z \in Z$. Since $c_n \to c$ point-wise, letting $n \to \infty$, dividing both sides by $a > 0$, and letting $a \to \infty$, we obtain

$$\liminf_{a \to \infty} \frac{c(a, z)}{a} \ge x^*(z)^{-1/\gamma}. \tag{4.11}$$

Combining (4.10) and (4.11), we obtain $\lim_{a \to \infty} c(a, z)/a = \bar{c}(z) = x^*(z)^{-1/\gamma}$.

Proof of Proposition 2.4. If $\gamma = 1$, then $r(PD_{\beta R^{1-\gamma}}) = r(PD_\beta) < 1$. Suppose $\gamma \in (0, 1)$. For a nonnegative matrix $A$ and $\theta > 0$, let $A^{(\theta)} = (A(z, \hat{z})^\theta)$ be the matrix of $\theta$-th power. Also, let $\odot$ denote the Hadamard (element-wise) product. Applying Hölder's inequality, we obtain

$$\mathrm{E}_z \beta R^{1-\gamma} = \mathrm{E}_z \beta^\gamma (\beta R)^{1-\gamma} \le (\mathrm{E}_z \beta)^\gamma (\mathrm{E}_z \beta R)^{1-\gamma}.$$

Constructing diagonal matrices, we obtain $D_{\beta R^{1-\gamma}} \le D_\beta^{(\gamma)} \odot D_{\beta R}^{(1-\gamma)}$. Multiplying $P$ from the left and noting that $D_X$ is diagonal, it follows that

$$P D_{\beta R^{1-\gamma}} \le P \left( D_\beta^{(\gamma)} \odot D_{\beta R}^{(1-\gamma)} \right) = (P D_\beta)^{(\gamma)} \odot (P D_{\beta R})^{(1-\gamma)}.$$

Applying Theorem 1 of Elsner, Johnson, and Dias da Silva (1988), we obtain

$$r(P D_{\beta R^{1-\gamma}}) \le r(P D_\beta)^\gamma \, r(P D_{\beta R})^{1-\gamma} < 1.$$

Proof of Theorem 2.5.

Since $K$ is a nonnegative matrix (with elements that are potentially infinite), the map $F$ in (2.8) is monotone and therefore $\{x_n\}_{n=0}^\infty$ monotonically converges to some $x^* \in [1, \infty]^Z$. To characterize $x^*(z)$ and $\bar{c}(z)$, we consider two cases.

Case 1: There exist $j$, $\hat{z} \in Z_j$, and $m \in \mathbb{N}$ such that $K^m(z, \hat{z}) > 0$ and $r(K_j) \ge 1$. Define the block diagonal matrix $\tilde{K} = \operatorname{diag}(K_1, \dots, K_J)$ and the sequence $\{\tilde{x}_n\}_{n=0}^\infty \subset [0, \infty]^Z$ by $\tilde{x}_0 = \mathbf{1}$ and iterating (2.8), where $K$ is replaced by $\tilde{K}$. Since $K \ge \tilde{K} \ge 0$, clearly $x_n \ge \tilde{x}_n \ge 0$ for all $n$. Since by definition $\tilde{K}$ is block diagonal with each diagonal block irreducible, by Lemma 4.4 we have $\tilde{x}_n(z) \to \infty$ as $n \to \infty$ if and only if there exists $j$ such that $z \in Z_j$ and $r(K_j) \ge 1$. (Although Lemma 4.4 assumes the elements of $K$ are finite, the infinite case is similar.) Replacing the vector $\mathbf{1}$ in (2.8) by 0 and iterating, we obtain $x_{m+n} \ge K^m x_n \ge K^m \tilde{x}_n$. Therefore if there exist $j$, $\hat{z} \in Z_j$ and $m \in \mathbb{N}$ such that $K^m(z, \hat{z}) > 0$ and $r(K_j) \ge 1$, then $x_{m+n}(z) \ge K^m(z, \hat{z}) \tilde{x}_n(\hat{z}) \to \infty$ as $n \to \infty$, so $x^*(z) = \infty$. In this case we obtain $\bar{c}(z) = 0$ by the same argument as in the proof of Proposition 4.1.

Case 2: For all $j$, either $r(K_j) < 1$ or $K^m(z, \hat{z}) = 0$ for all $\hat{z} \in Z_j$ and $m \in \mathbb{N}$. For any $\hat{z}$ such that $K^m(z, \hat{z}) = 0$ for all $m$, by (2.8) the value of $x_n(z)$ is unaffected by all previous $x_k(\hat{z})$ for $k < n$. Therefore for the purpose of computing $x_n(z)$, we may drop all rows and columns of $K$ corresponding to such $\hat{z}$. The resulting matrix has block diagonal elements $K_j$ with $r(K_j) < 1$, so $x_n(z) \to x^*(z) < \infty$ as $n \to \infty$. In this case we obtain $\bar{c}(z) = x^*(z)^{-1/\gamma}$ by the same argument as in the proof of Theorem 2.2.

References

Ömer T. Açıkgöz. On the existence and uniqueness of stationary equilibrium in Bewley economies with production. Journal of Economic Theory, 173:18–55, January 2018. doi:10.1016/j.jet.2017.10.006.
Jess Benhabib, Alberto Bisin, and Shenghao Zhu. The wealth distribution in Bewley economies with capital income risk. Journal of Economic Theory, 159(A):489–515, September 2015. doi:10.1016/j.jet.2015.07.013.
Truman F. Bewley. The permanent income hypothesis: A theoretical formulation. Journal of Economic Theory, 16(2):252–292, December 1977. doi:10.1016/0022-0531(77)90009-6.
Jaroslav Borovička and John Stachurski. Necessary and sufficient conditions for existence and uniqueness of recursive utilities. Journal of Finance, 75(3):1457–1493, June 2020. doi:10.1111/jofi.12877.
Dan Cao. Recursive equilibrium in Krusell and Smith (1998). Journal of Economic Theory, 186:104978, March 2020. doi:10.1016/j.jet.2019.104978.
Christopher D. Carroll. Why do the rich save so much? In Joel B. Slemrod, editor, Does Atlas Shrug? The Economic Consequences of Taxing the Rich, chapter 14, pages 465–484. Harvard University Press, Cambridge, MA, 2000.
Christopher D. Carroll. Precautionary saving and the marginal propensity to consume out of permanent income. Journal of Monetary Economics, 56(6):780–790, September 2009. doi:10.1016/j.jmoneco.2009.06.016.
Christopher D. Carroll. Theoretical foundations of buffer stock saving. Quantitative Economics, 2020. Forthcoming.
Christopher D. Carroll and Miles S. Kimball. On the concavity of the consumption function. Econometrica, 64(4):981–992, July 1996. doi:10.2307/2171853.
Gary Chamberlain and Charles A. Wilson. Optimal intertemporal consumption under uncertainty. Review of Economic Dynamics, 3(3):365–395, July 2000. doi:10.1006/redy.2000.0098.
Wilbur John Coleman, II. Solving the stochastic growth model by policy-function iteration. Journal of Business and Economic Statistics, 8(1):27–29, January 1990. doi:10.1080/07350015.1990.10509769.
Manjira Datta, Leonard J. Mirman, and Kevin L. Reffett. Existence and uniqueness of equilibrium in distorted dynamic economies with capital and labor. Journal of Economic Theory, 103(2):377–410, April 2002. doi:10.1006/jeth.2000.2789.
Mariacristina De Nardi. Wealth inequality and intergenerational links. Review of Economic Studies, 71(3):743–768, July 2004. doi:10.1111/j.1467-937X.2004.00302.x.
Yihong Du. Fixed points of increasing operators in ordered Banach spaces and applications. Applicable Analysis, 38(1-2):1–20, 1990. doi:10.1080/00036819008839957.
Karen E. Dynan, Jonathan Skinner, and Stephen P. Zeldes. Do the rich save more? Journal of Political Economy, 112(2):397–444, April 2004. doi:10.1086/381475.
Ludwig Elsner, Charles R. Johnson, and José António Dias da Silva. The Perron root of a weighted geometric mean of nonnegative matrices. Linear and Multilinear Algebra, 24(1):1–13, November 1988. doi:10.1080/03081088808817892.
Andreas Fagereng, Martin Blomhoff Holm, Benjamin Moll, and Gisle Natvik. Saving behavior across the wealth distribution: The importance of capital gains. 2019.
Leland E. Farmer and Alexis Akira Toda. Discretizing nonlinear, non-Gaussian Markov processes with exact conditional moments. Quantitative Economics, 8(2):651–683, July 2017. doi:10.3982/QE737.
Irwin Friend and Marshall E. Blume. The demand for risky assets. American Economic Review, 65(5):900–922, December 1975.
Émilien Gouin-Bonenfant and Alexis Akira Toda. Pareto extrapolation: An analytical framework for studying tail inequality. 2018. URL https://ssrn.com/abstract=3260899.
Greg Kaplan, Benjamin Moll, and Giovanni L. Violante. Monetary policy according to HANK. American Economic Review, 108(3):697–743, March 2018. doi:10.1257/aer.20160042.
Tom Krebs. Recursive equilibrium in endogenous growth models with incomplete markets. Economic Theory, 29(3):505–523, 2006. doi:10.1016/S0165-1889(03)00062-9.
Moritz Kuhn. Recursive equilibria in an Aiyagari-style economy with permanent income shocks. International Economic Review, 54(3):807–835, August 2013. doi:10.1111/iere.12018.
Ehud Lehrer and Bar Light. The effect of interest rates on consumption in an income fluctuation problem. Journal of Economic Dynamics and Control, 94:63–71, September 2018. doi:10.1016/j.jedc.2018.07.004.
Huiyu Li and John Stachurski. Solving the income fluctuation problem with unbounded rewards. Journal of Economic Dynamics and Control, 45:353–365, August 2014. doi:10.1016/j.jedc.2014.06.003.
Bar Light. Precautionary saving in a Markovian earnings environment. Review of Economic Dynamics, 29:138–147, July 2018. doi:10.1016/j.red.2017.12.004.
Bar Light. Uniqueness of equilibrium in a Bewley-Aiyagari economy. Economic Theory, 69:435–450, 2020. doi:10.1007/s00199-018-1167-z.
Erzo G. J. Luttmer. Models of growth and firm heterogeneity. Annual Review of Economics, 2:547–576, 2010. doi:10.1146/annurev.economics.102308.124410.
Qingyin Ma, John Stachurski, and Alexis Akira Toda. The income fluctuation problem and the evolution of wealth. Journal of Economic Theory, 187:105003, May 2020. doi:10.1016/j.jet.2020.105003.
Cara McDaniel. Average tax rates on consumption, investment, labor and capital in the OECD 1950-2003. 2007.
Atif R. Mian, Ludwig Straub, and Amir Sufi. Indebted demand. NBER Working Paper 26940, 2020.
Olivier F. Morand and Kevin L. Reffett. Existence and uniqueness of equilibrium in nonoptimal unbounded infinite horizon economies. Journal of Monetary Economics, 50(6):1351–1373, September 2003. doi:10.1016/S0304-3932(03)00082-5.
Vincenzo Quadrini. The importance of entrepreneurship for wealth concentration and mobility. Review of Income and Wealth, 45(1):1–19, March 1999. doi:10.1111/j.1475-4991.1999.tb00309.x.
Guillaume Rabault. When do borrowing constraints bind? Some new results on the income fluctuation problem. Journal of Economic Dynamics and Control, 26(2):217–245, February 2002. doi:10.1016/S0165-1889(00)00042-7.
Emmanuel Saez and Gabriel Zucman. Wealth inequality in the United States since 1913: Evidence from capitalized income tax data. Quarterly Journal of Economics, 131(2):519–578, May 2016. doi:10.1093/qje/qjw004.
Paul A. Samuelson. Lifetime portfolio selection by dynamic stochastic programming. Review of Economics and Statistics, 51(3):239–246, August 1969. doi:10.2307/1926559.
John Stachurski and Alexis Akira Toda. An impossibility theorem for wealth in heterogeneous-agent models with limited heterogeneity. Journal of Economic Theory, 182:1–24, July 2019. doi:10.1016/j.jet.2019.04.001.
Ludwig Straub. Consumption, savings, and the distribution of permanent income. 2019.
Ken'ichiro Tanaka and Alexis Akira Toda. Discrete approximations of continuous distributions by maximum entropy. Economics Letters, 118(3):445–450, March 2013. doi:10.1016/j.econlet.2012.12.020.
Ken'ichiro Tanaka and Alexis Akira Toda. Discretizing distributions with exact moments: Error estimate and convergence analysis. SIAM Journal on Numerical Analysis, 53(5):2158–2177, 2015. doi:10.1137/140971269.
Alexis Akira Toda. Incomplete market dynamics and cross-sectional distributions. Journal of Economic Theory, 154:310–348, November 2014. doi:10.1016/j.jet.2014.09.015.
Alexis Akira Toda. Wealth distribution with random discount factors. Journal of Monetary Economics, 104:101–113, June 2019. doi:10.1016/j.jmoneco.2018.09.006.
Ivo Welch and Amit Goyal. A comprehensive look at the empirical performance of equity premium prediction. Review of Financial Studies, 21(4):1455–1508, July 2008. doi:10.1093/rfs/hhm014.
Zhitao Zhang. Variational, Topological, and Partial Order Methods with Their Applications, volume 29 of Developments in Mathematics. Springer, 2013. doi:10.1007/978-3-642-30709-6.

A Discretizing the GARCH(1,1) process

In this appendix we explain how to discretize the GARCH(1,1) process (3.4).

A.1 Constructing the grid

Let $v_t = \sigma_t^2$. Using the properties of the GARCH process, it is known that the expected conditional variance is

$$\mathrm{E}[v_t] = \frac{\omega}{1 - \alpha - \rho}.$$

Therefore it is natural to take an evenly-spaced grid $\{\bar{\epsilon}_n\}_{n=1}^{N_\epsilon}$, where $N_\epsilon$ is an odd number and the largest grid point $\bar{\epsilon} := \bar{\epsilon}_{N_\epsilon}$ is some multiple of $\sqrt{\omega/(1 - \alpha - \rho)}$. Because the conditional variance of the GARCH process can be quite large, it is also natural to choose an exponential grid (as discussed in Appendix A.3) $\{\bar{v}_n\}_{n=1}^{N_v}$ such that the median point of the grid is $\omega/(1 - \alpha - \rho)$.

To determine the end points, let $\underline{v} = \bar{v}_1$ and $\bar{v} = \bar{v}_{N_v}$. In principle $v_t = \sigma_t^2$ can be arbitrarily close to $\omega$, so we set $\underline{v} = \omega$. For $v_t = \sigma_t^2$ to remain in the grid when $\epsilon_{t-1}$ and $\sigma_{t-1}$ are at their maximum value, we need

$$\bar{v} \ge \omega + \alpha \bar{\epsilon}^2 + \rho \bar{v} \iff \bar{v} \ge \frac{\omega + \alpha \bar{\epsilon}^2}{1 - \rho}.$$

Setting $\bar{\epsilon} = k \sqrt{\omega/(1 - \alpha - \rho)}$ for some $k > 0$, we obtain

$$\bar{v} \ge \frac{1 - \rho + (k^2 - 1)\alpha}{1 - \rho} \cdot \frac{\omega}{1 - \alpha - \rho}. \tag{A.1}$$

In order to be able to match up to the second moments of $\epsilon_t$ when $v_t = \bar{v}$, it is necessary and sufficient that

$$\bar{\epsilon} \ge \sqrt{\bar{v}} \iff \bar{v} \le \bar{\epsilon}^2 = k^2 \frac{\omega}{1 - \alpha - \rho}. \tag{A.2}$$

We have the following result.

Proposition A.1.

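Consider the GARCH(1,1) process (3.4) with $\alpha + \rho < 1$ and set $v_t = \sigma_t^2$. Let $N_\epsilon \ge 3$ be an odd number and $N_v \ge […]$. Then there exists a discretization such that

1. $\{\bar{\epsilon}_n\}_{n=1}^{N_\epsilon}$ is evenly spaced and centered around 0,
2. $\{\bar{v}_n\}_{n=1}^{N_v}$ is exponentially spaced with minimum point $\omega$ and median point $\omega/(1 - \alpha - \rho)$, and
3. the conditional mean of $v_t$ and the conditional mean and variance of $\epsilon_t$ are exact.

Proof. Set $\bar{\epsilon} = \bar{\epsilon}_{N_\epsilon} = k \sqrt{\omega/(1 - \alpha - \rho)}$ for some $k > 0$. Combining (A.1) and (A.2), we obtain

$$\frac{1 - \rho + (k^2 - 1)\alpha}{1 - \rho} \cdot \frac{\omega}{1 - \alpha - \rho} \le k^2 \frac{\omega}{1 - \alpha - \rho} \iff 1 - \rho + (k^2 - 1)\alpha \le (1 - \rho) k^2 \iff (k^2 - 1)(\alpha + \rho - 1) \le 0,$$

which always holds if $k \ge 1$ because $\alpha + \rho < 1$. Set

$$(\underline{v}, \bar{v}) = \left( \omega, \; \frac{1 - \rho + (k^2 - 1)\alpha}{1 - \rho} \cdot \frac{\omega}{1 - \alpha - \rho} \right)$$

for some $k \ge 1$ and construct the exponential grid $\{\bar{v}_n\}_{n=1}^{N_v}$ on $[\underline{v}, \bar{v}]$ with median point $\omega/(1 - \alpha - \rho)$ as explained in Appendix A.3. For the exponential grid to be well-defined, we need

$$\frac{\omega}{1 - \alpha - \rho} < \frac{\underline{v} + \bar{v}}{2} = \frac{1}{2} \left( \omega + \frac{1 - \rho + (k^2 - 1)\alpha}{1 - \rho} \cdot \frac{\omega}{1 - \alpha - \rho} \right)$$
$$\iff 2 < 1 - \alpha - \rho + \frac{1 - \rho + (k^2 - 1)\alpha}{1 - \rho} \iff (1 - \rho)(1 + \alpha + \rho) < 1 - \rho + (k^2 - 1)\alpha \iff (1 - \rho)(\alpha + \rho) < (k^2 - 1)\alpha, \tag{A.3}$$

which holds for large enough $k \ge 1$ because $\alpha > 0$. To make (A.3) true, for example we can set

$$k^2 - 1 = N_\epsilon (1 - \rho)(1 + \rho/\alpha) \iff k = \sqrt{1 + N_\epsilon (1 - \rho)(1 + \rho/\alpha)},$$

which satisfies $k \ge 1$. In this case we have

$$\bar{v} = (1 + N_\epsilon(\alpha + \rho)) \frac{\omega}{1 - \alpha - \rho}.$$

Footnote: Setting $k \sim \sqrt{N_\epsilon}$ is advocated in Farmer and Toda (2017) based on the trapezoidal rule for quadrature.

The grids $\{\bar{v}_n\}_{n=1}^{N_v}$ and $\{\bar{\epsilon}_n\}_{n=1}^{N_\epsilon}$ can be constructed as follows.

Constructing grid for GARCH.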

1. Select the number of grid points $N_\epsilon \ge […]$ and $N_v \ge […]$.
2. Set $\bar{\epsilon} = \sqrt{(1 + N_\epsilon(1 - \rho)(1 + \rho/\alpha)) \dfrac{\omega}{1 - \alpha - \rho}}$ and construct the evenly-spaced grid $\{\bar{\epsilon}_n\}_{n=1}^{N_\epsilon}$ on $[-\bar{\epsilon}, \bar{\epsilon}]$.
3. Set $(a, b, c) = \left( \omega, \; (1 + N_\epsilon(\alpha + \rho)) \dfrac{\omega}{1 - \alpha - \rho}, \; \dfrac{\omega}{1 - \alpha - \rho} \right)$ and construct the exponentially-spaced grid $\{\bar{v}_n\}_{n=1}^{N_v}$ on $[a, b]$ with median point $c$ as in Appendix A.3.

A.2 Constructing transition probabilities

Having constructed the grid, it remains to construct transition probabilities. Let $Z = \{1, \dots, N_v\} \times \{1, \dots, N_\epsilon\}$ be the state space. If $z = (m, n) \in Z$, then the current conditional variance and return are $(v, \epsilon) = (\bar{v}_m, \bar{\epsilon}_n)$. The next period's conditional variance is then

$$\hat{v} = \omega + \alpha \bar{\epsilon}_n^2 + \rho \bar{v}_m.$$

This $\hat{v}$ will in general not be a grid point. However, we can approximate the transition to $\hat{v}$ by assigning probabilities $1 - \theta, \theta$ to the two points $\bar{v}_{m'}, \bar{v}_{m'+1}$ such that

$$\hat{v} = (1 - \theta) \bar{v}_{m'} + \theta \bar{v}_{m'+1},$$

where $m'$ is uniquely determined such that $\bar{v}_{m'} < \hat{v} \le \bar{v}_{m'+1}$.

Because the distribution of $\hat{\epsilon}$ is $N(0, \hat{v})$, we can assign probabilities on the grid points $\{\bar{\epsilon}_n\}_{n=1}^{N_\epsilon}$ such that the mean and variance are exact. For this purpose, we can use the maximum entropy method of Tanaka and Toda (2013, 2015) and Farmer and Toda (2017). If $N_\epsilon = 3$, we can avoid optimizing because there is a closed-form solution as follows. Assign probabilities $p, 1 - 2p, p$ to points $-\bar{\epsilon}, 0, \bar{\epsilon}$ so that $\mathrm{E}[\epsilon] = 0$ and $\mathrm{Var}[\epsilon] = \hat{v}$. For this purpose, we can set

$$\hat{v} = 2p \bar{\epsilon}^2 \iff p = \frac{\hat{v}}{2 \bar{\epsilon}^2},$$

which is always in $(0, 1/2)$ because $\hat{v} \le \bar{v} < \bar{\epsilon}^2$.

A.3 Exponential grid

In many models, the state variable may become negative (e.g., asset holdings), which causes a problem for constructing an exponentially-spaced grid because we cannot take the logarithm of a negative number. Suppose we would like to construct an $N$-point exponential grid on a given interval $(a, b)$. A natural idea to deal with such a case is as follows.

Constructing exponential grid.

1. Choose a shift parameter $s > -a$.
2. Construct an $N$-point evenly-spaced grid on $(\log(a + s), \log(b + s))$.
3. Take the exponential.
4. Subtract $s$.

The remaining question is how to choose the shift parameter $s$. Suppose we would like to specify the median grid point as $c \in (a, b)$. Since the median of the evenly-spaced grid on $(\log(a + s), \log(b + s))$ is $\frac{1}{2}(\log(a + s) + \log(b + s))$, we need to take $s > -a$ such that

$$c = \exp\left( \frac{1}{2}(\log(a + s) + \log(b + s)) \right) - s \iff c + s = \sqrt{(a + s)(b + s)} \iff (c + s)^2 = (a + s)(b + s)$$
$$\iff c^2 + 2cs + s^2 = ab + (a + b)s + s^2 \iff s = \frac{c^2 - ab}{a + b - 2c}.$$

Note that in this case

$$s + a = \frac{c^2 - ab}{a + b - 2c} + a = \frac{(c - a)^2}{a + b - 2c},$$

so $s + a$ is positive if and only if $c < \frac{a + b}{2}$. Therefore, for any $c \in \left( a, \frac{a + b}{2} \right)$, it is possible to construct an exponentially-spaced grid with end points $(a, b)$ and median point $c$.
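
The grid constructions in this appendix can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the shift is $s = (c^2 - ab)/(a + b - 2c)$, the GARCH grids use $k^2 = 1 + N_\epsilon(1 - \rho)(1 + \rho/\alpha)$, and the three-point probabilities use $p = \hat{v}/(2\bar{\epsilon}^2)$. The median grid point equals $c$ only when the number of points is odd.

```python
import numpy as np


def exponential_grid(a, b, c, n):
    """n-point exponentially spaced grid on [a, b] with median point c.

    Requires a < c < (a + b) / 2 so that the shift s satisfies s + a > 0.
    """
    if not (a < c < 0.5 * (a + b)):
        raise ValueError("need a < c < (a + b) / 2")
    s = (c**2 - a * b) / (a + b - 2.0 * c)  # shift parameter
    log_grid = np.linspace(np.log(a + s), np.log(b + s), n)  # even in logs
    return np.exp(log_grid) - s  # take the exponential, subtract the shift


def garch_grids(omega, alpha, rho, n_eps, n_v):
    """Return (eps_grid, v_grid) for sigma_t^2 = omega + alpha*eps^2 + rho*sigma^2."""
    if not (alpha > 0 and rho >= 0 and alpha + rho < 1):
        raise ValueError("need alpha > 0, rho >= 0, alpha + rho < 1")
    mean_var = omega / (1.0 - alpha - rho)  # E[v_t]
    k2 = 1.0 + n_eps * (1.0 - rho) * (1.0 + rho / alpha)  # k^2
    eps_bar = np.sqrt(k2 * mean_var)  # largest return grid point
    eps_grid = np.linspace(-eps_bar, eps_bar, n_eps)
    v_max = (1.0 + n_eps * (alpha + rho)) * mean_var  # largest variance grid point
    v_grid = exponential_grid(omega, v_max, mean_var, n_v)
    return eps_grid, v_grid


def three_point_probs(v_hat, eps_bar):
    """Probabilities on (-eps_bar, 0, eps_bar) with mean 0 and variance v_hat."""
    p = v_hat / (2.0 * eps_bar**2)
    return np.array([p, 1.0 - 2.0 * p, p])


if __name__ == "__main__":
    # Hypothetical GARCH parameters, chosen only for illustration.
    eps_grid, v_grid = garch_grids(omega=0.1, alpha=0.1, rho=0.8, n_eps=3, n_v=3)
    print(eps_grid, v_grid)
    print(three_point_probs(v_grid[-1], eps_grid[-1]))
```

For instance, `exponential_grid(0.0, 10.0, 1.0, 5)` uses the shift $s = 1/8$, and its middle point is $\sqrt{0.125 \times 10.125} - 0.125 = 1$, as intended.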