Dynamic Information Design with Diminishing Sensitivity Over News

Abstract

A Bayesian agent experiences gain-loss utility each period over changes in belief about future consumption ("news utility"), with diminishing sensitivity over the magnitude of news. We show the agent's preference between an information structure that delivers news gradually and another that resolves all uncertainty at once depends on his consumption ranking of different states. One-shot resolution is better than gradual bad news, but it is not optimal among all information structures (under common functional forms). In a dynamic cheap-talk framework where a benevolent sender communicates the state over multiple periods, the babbling equilibrium is essentially unique without loss aversion. More loss-averse agents may enjoy higher news utility in equilibrium, contrary to the commitment case. We characterize the family of gradual good news equilibria that exist with high enough loss aversion, and find the sender conveys progressively larger pieces of good news. We discuss applications to media competition and game shows.


Dynamic Information Design with Diminishing Sensitivity Over News ∗

Jetlir Duraj † Kevin He ‡

First version: July 1, 2019
This version: October 12, 2019

Abstract

A benevolent sender communicates non-instrumental information over time to a Bayesian receiver who experiences gain-loss utility over changes in beliefs (“news utility”). We show how to compute the optimal dynamic information structure for arbitrary news-utility functions. With diminishing sensitivity over the magnitude of news, one-shot resolution of uncertainty is strictly suboptimal under commonly used functional forms. Information structures that deliver bad news gradually are never optimal. We identify additional conditions that imply the sender optimally releases good news in small pieces but bad news in one clump. When the sender lacks commitment power, diminishing sensitivity leads to a credibility problem for good-news messages. Without loss aversion, the babbling equilibrium is essentially unique. More loss-averse receivers may enjoy higher equilibrium news-utility, contrary to the commitment case. We discuss applications to media competition and game shows.

∗ We thank Drew Fudenberg, Jerry Green, Jonathan Libgober, Pietro Ortoleva, Matthew Rabin, Collin Raymond, the MIT information design reading group, and our seminar participants for insightful comments. We also benefited from conversations with Krishna Dasaratha, Ben Enke, Simone Galperti, David Hagmann, Marina Halac, Johannes Hörner, David Laibson, Shengwu Li, Elliot Lipnowski, Gautam Rao, and Tomasz Strzalecki at an early stage of the project.
† Harvard University. Email: [email protected]
‡ California Institute of Technology and University of Pennsylvania. Email: [email protected]

arXiv [econ.TH]

Introduction

When people give others news, they are often mindful of the information’s psychological impact. For example, this consideration affects the way CEOs announce earnings forecasts to shareholders and organization leaders update their teams about recent developments. While the instrumental value of information also plays a significant role, we analyze the under-studied problem of how the audience’s psychological reaction to good and bad news shapes the dynamic communication of information. This problem is even more relevant in situations like designing game shows and other entertainment content, where the audience experiences positive and negative reactions over time to news and developments that have no bearing on their personal decision-making.

We consider an informed, benevolent sender communicating non-instrumental information to a receiver who experiences gain-loss utility over changes in beliefs (“news utility”). The state of the world, privately known to the sender, determines the receiver’s consumption at some future date. The sender communicates this state over multiple periods so as to maximize the receiver’s expected welfare, knowing that the receiver derives utility based on the nature and the magnitude of news each period: good news elates and bad news disappoints. The receiver will exogenously learn the true state just before future consumption.

We focus on how the receiver’s diminishing sensitivity over news affects the optimal design of information structures. Kahneman and Tversky (1979)’s original formulation of prospect theory envisioned a gain-loss utility component based on deviations from a reference point, where larger deviations carry smaller marginal effects. This idea of diminishing sensitivity is referenced in virtually all subsequent work on reference-dependent preferences, including Kőszegi and Rabin (2009), who first introduced a model of news utility.
In almost all cases, however, researchers then specialize for simplicity to a two-part linear gain-loss utility function that allows for loss aversion but not diminishing sensitivity. Four decades since Kahneman and Tversky (1979)’s publication, O’Donoghue and Sprenger (2018)’s review of the ensuing literature summarizes the situation as follows:

“Most applications of reference-dependent preferences focus entirely on loss aversion, and ignore the possibility of diminishing sensitivity [...] The literature still needs to develop a better sense of when diminishing sensitivity is important.”

We argue that diminishing sensitivity over the magnitude of news generates novel predictions for information design. As Kőszegi and Rabin (2009) point out, the two-part linear news-utility model makes the stark prediction that people prefer resolving all uncertainty in one period (“one-shot resolution”) over any other dynamic information structure. We show that diminishing sensitivity over news complicates the sender’s problem and leads to a more nuanced optimal information structure. In particular, one-shot resolution is strictly suboptimal for a class of news-utility functions exhibiting diminishing sensitivity. This class includes the commonly used power-function specification. It also includes a tractable quadratic specification, whenever diminishing sensitivity is sufficiently strong relative to the degree of loss aversion. We further identify conditions that imply the optimal information structure treats good news and bad news asymmetrically, disclosing good news gradually but bad news all at once. The direction of this optimal skewness is a central implication of diminishing sensitivity: the “opposite” kind of information structure that divulges all good news at once but doles out bad news in small portions is never optimal.
In fact, this kind of information structure is even worse than one-shot resolution.

In our model, the receiver knows the sender’s strategy and formulates Bayesian beliefs. This framework leads to cross-state constraints on the sender’s problem. In view of diminishing sensitivity, one might conjecture that the sender should concentrate all bad news in period 1 if the state is bad, and deliver equally-sized pieces of good news in periods 1, 2, 3, ... if the state is good. But these belief paths are infeasible, since a Bayesian audience who knows this strategy and does not receive bad news in period 1 will conclusively infer that the state is good. (In the language of information design, these conjectured belief paths violate Bayesian plausibility, as they cannot arise from the Bayesian updating of a given prior.) The receiver should not judge subsequent communication from the sender as further good news or derive positive news utility from it. We show that the sender can nevertheless implement a “gradual good news, one-shot bad news” information structure for a Bayesian receiver by sending a conclusive bad-news signal in a random period when the state is bad. In the optimal information structure, conditional on the good state, the receiver may get different amounts of good news in different periods, even though his news-utility function is time-invariant and the sender knows the state from the start.

Another implication of diminishing sensitivity is that people with opposite consumption rankings over states may exhibit opposite informational preferences. In a world with two possible states, A and B, suppose state A realizes if and only if a series of intermediate events [...]. Agents who prefer the consumption they get in state A will choose to observe the intermediate events resolve in real-time (gradual information), while agents who prefer the consumption they get in state B will choose to only learn the final state (one-shot information). This prediction distinguishes the news-utility model with diminishing sensitivity from other models of non-instrumental information preference. The result also rationalizes a “sudden death” format often found in game shows, where the contestant must overcome every challenge in a sequence to win the grand prize (as opposed to the grand prize being contingent on beating at least one of several challenges).

When the sender lacks commitment power, information structures featuring gradual good news encounter a credibility problem. In the bad state, the sender may strictly prefer to lie and convey a positive message intended for the good state. This temptation exists despite the fact that the sender is far-sighted and maximizes the receiver’s total news utility over time. The intuition is that the receiver will inevitably feel disappointed upon learning the truth in the future, so his marginal utility of (unwarranted) good news today is larger than his marginal disutility of heightened disappointment in the future, thanks to diminishing sensitivity. This perverse incentive to provide false hope in the bad state may preclude all meaningful communication in all states. We show that if the receiver has diminishing sensitivity but not loss aversion (or has low loss aversion), then every equilibrium is payoff-equivalent to the babbling equilibrium. High enough loss aversion, however, can restore the equilibrium credibility of good-news messages by increasing the future disappointment costs associated with inducing false hope today.
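The false-hope incentive can be illustrated with a small numerical sketch. It assumes a square-root gain-loss function (gains valued at the square root of the belief change, losses at minus λ times the square root of its magnitude), a receiver who takes the message at face value, and hypothetical beliefs of 0.5 and 0.75. The function names and numbers are illustrative assumptions, not taken from the paper's equilibrium analysis.

```python
import math

def mu(x, lam):
    """Assumed square-root gain-loss utility with loss-aversion coefficient lam."""
    return math.sqrt(x) if x >= 0 else -lam * math.sqrt(-x)

def bad_state_payoffs(prior, hope, lam):
    """Total news utility in the bad state under two hypothetical sender choices."""
    # Truthful: the belief in the good state falls from `prior` to 0 today.
    truthful = mu(0.0 - prior, lam)
    # False hope: the belief rises to `hope` today, then falls to 0 when the truth arrives.
    false_hope = mu(hope - prior, lam) + mu(0.0 - hope, lam)
    return truthful, false_hope

for lam in (1.0, 4.0):
    truthful, false_hope = bad_state_payoffs(prior=0.5, hope=0.75, lam=lam)
    print(f"lambda={lam}: truthful={truthful:.3f}, false hope={false_hope:.3f}")
```

Under these assumed numbers, the false-hope path yields the higher total when λ = 1, while the truthful path yields the higher total when λ = 4.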
Because high loss aversion can restore credibility, receivers with higher loss aversion may enjoy higher equilibrium payoffs, which would never happen if the sender had commitment power.

Finally, we characterize the entire family of equilibria featuring gradual good news and study how quickly the receiver learns the state. For a class of news-utility functions that includes the square-root and quadratic specifications mentioned before, the sender conveys progressively larger pieces of good news over time, so the receiver’s equilibrium belief grows at an increasing rate in the good state. This puts a uniform bound on the number of periods of informative communication across all time horizons and all equilibria.

The rest of the paper is organized as follows. The remainder of Section 1 reviews related literature. Section 2 defines the sender’s problem under the commitment assumption and introduces our model of news utility. Section 3 studies the optimal information structure and the relationship between consumption preferences and informational preferences. Section 4 focuses on the cheap-talk model when the sender lacks commitment power. Section 5 looks at a variant of the model without a deterministic horizon. Section 6 discusses other models of preference over non-instrumental information. Section 7 concludes.

Since Kőszegi and Rabin (2009), several other authors have analyzed the implications of news utility in such varied settings as asset pricing (Pagel, 2016), life-cycle consumption (Pagel, 2017), portfolio choice (Pagel, 2018), and mechanism design (Duraj, 2018). These papers focus on Bayesian agents with two-part linear gain-loss utilities and do not study the role of diminishing sensitivity to news.

We are not aware of other work that focuses on how diminishing sensitivity matters for information design with news utility. In fact, very few papers deal with diminishing sensitivity in any kind of reference-dependent preference.
One exception is Bowman, Minehart, and Rabin (1999), who study a consumption-based reference-dependent model with diminishing sensitivity. A critical difference is that their reference points are based on past habits, not rational expectations. In their environment, a consumer who knows their future income optimally concentrates all consumption losses in the first period if income will be low, but spreads out consumption gains across multiple periods if income will be high. As discussed before, the analog of this strategy cannot be implemented in our setting since the receiver derives news utility from changes in rational Bayesian beliefs.

Our model of diminishing sensitivity over the magnitude of news shares the same psychological motivation as Kahneman and Tversky (1979), who base their theory of human responses to monetary gains and losses on human responses to changes in physical attributes like temperature or brightness:

“Many sensory and perceptual dimensions share the property that the psychological response is a concave function of the magnitude of physical change. For example, it is easier to discriminate between a change of 3° and a change of 6° in room temperature, than it is to discriminate between a change of 13° and a change of 16°.”

We are not aware of any empirical work designed to measure diminishing sensitivity over news, but will highlight some testable predictions of the model later on.

While some of our results apply to Kőszegi and Rabin (2009)’s model of news utility or to a more general class of such models (e.g., Proposition 1, Proposition 2, Proposition 3, Corollary A.1), we mostly focus on the simplest model of news utility where the agent derives gain-loss utility from changes in expected future consumption utility. This mean-based model lets us concentrate on the implications of diminishing sensitivity, but differs from Kőszegi and Rabin (2009)’s model where agents make a percentile-by-percentile comparison between old and new beliefs.
Fully characterizing the optimal information structure using this percentile-based model is out of reach for us, but our numerical simulations in Appendix B.2 suggest the answers would be very similar.

Parallel to the recent literature on the applications of news utility discussed above, Dillenberger and Raymond (2018) axiomatize a general class of additive belief-based preferences in the domain of two-stage lotteries by suitably weakening the independence axiom of expected utility. In the case of T = 2, our news-utility model belongs to the class they characterize. Under this specialization, our work may be thought of as studying the information design problem, with and without commitment, using some of Dillenberger and Raymond (2018)’s additive belief-based preferences. Dillenberger and Raymond (2018) also provide high-level conditions for additive belief-based preferences to exhibit preference for one-shot resolution. We are able to find more interpretable and easy-to-verify conditions for the sub-optimality of one-shot resolution, working with a specific sub-class of their preferences.

In general, papers on belief-based utility have highlighted two sources of felicity: levels of belief about future consumption utility (“anticipatory utility,” e.g., Kőszegi (2006); Eliaz and Spiegler (2006); Schweizer and Szech (2018)) and changes in belief about future consumption utility (“news utility”). News utility is a function of both the prior belief and the posterior belief, while a given posterior belief brings the same anticipatory utility for all priors (Eliaz and Spiegler, 2006).
As we discuss in Section 6, the rich dynamics of the optimal information structure are a unique feature of the news-utility model (with diminishing sensitivity).

Brunnermeier and Parker (2005) and Macera (2014) study the optimal design of beliefs for agents with belief-based utilities that differ from the news-utility setup we consider. Another important distinction is that we focus on the design of information: changes in the receiver’s belief derive from Bayesian updating of an exogenous prior, using the information conveyed by the sender. Macera (2014) considers a non-Bayesian agent who freely chooses a path of beliefs, while knowing the actual state of the world. Brunnermeier and Parker (2005) study the “opposite” problem to ours, where the agent freely chooses a prior belief (over the sequence of state realizations) at the start of the game, then updates belief about future states through an exogenously given information structure.

Our emphasis on information is shared by Ely, Frankel, and Kamenica (2015), who study dynamic information design with a Bayesian receiver who derives utility from suspense or surprise. In contrast to these authors, who propose and study an original utility function over belief paths where larger belief movements always bring greater felicity, we consider a gain-loss utility function over changes in beliefs. Because our states are associated with different consumption consequences, changes in beliefs may increase or decrease the receiver’s utility depending on whether the news is good or bad. While one-shot resolution is suboptimal in both Ely, Frankel, and Kamenica (2015)’s problem and our problem (under some conditions), the optimal information structure differs. The optimal information structure in our problem is asymmetric, a key implication of diminishing sensitivity.
Another difference is that information structures featuring gradual bad news, one-shot good news are worse than one-shot resolution in our problem, while one-shot resolution is the worst possible information structure in Ely, Frankel, and Kamenica (2015)’s problem.

Also within the dynamic information design literature but without behavioral preferences, Li and Norman (2018) and Wu (2018) consider a group of senders moving sequentially to persuade a single receiver. The receiver takes an action after observing all signals. This action, together with the true state of the world, determines the payoffs of every player. While these authors study a dynamic environment, only the final belief of the receiver at the end of the last period matters for the players’ payoffs. Indeed, every equilibrium in their setting can be converted into a payoff-equivalent “one-step” equilibrium where the first sender sends the joint signal implied by the old equilibrium, while all subsequent senders babble uninformatively. In our setting, the distribution of the receiver’s final belief at the end of the last period is already pinned down by the prior belief at the start of the first period. Yet, different sequences of interim beliefs cause the receiver to experience different amounts of total news utility. The stochastic process of these interim beliefs constitutes the object of design. We provide a general procedure for computing the optimal dynamic information structure in this new setting.

Lipnowski and Mathevet (2018) study a static model of information design with a psychological receiver whose welfare depends directly on posterior belief. They discuss an application to a mean-based news-utility model without diminishing sensitivity in their Appendix A, finding that either one-shot resolution or no information is optimal. We focus on the implications of diminishing sensitivity and derive specific characterizations of the optimal information structure.
Our work also differs in that we study a dynamic problem, examine equilibria without commitment, and discuss how the rate of releasing good news changes over time.

We consider a discrete-time model with periods $0, 1, 2, \ldots, T$, where $T \geq 2$. There are two players, the sender (“she”) and the receiver (“he”). There is a finite state space $\Theta$ with $|\Theta| = K \geq 2$. In state $\theta$, the receiver will consume $c_\theta$ in period $T$, deriving from it consumption utility $v(c_\theta)$ where $v$ is strictly increasing. Assume that $c_{\theta'} \neq c_{\theta''}$ when $\theta' \neq \theta''$. We may normalize without loss $\min_{\theta \in \Theta} [v(c_\theta)] = 0$, $\max_{\theta \in \Theta} [v(c_\theta)] = 1$. There is no consumption in other periods and neither player can affect period $T$’s consumption.

The players share a common prior belief $\pi_0 \in \Delta(\Theta)$ about the state, where $\pi_0(\theta) > 0$ for all $\theta \in \Theta$. In period 0, the sender commits to a finite message space $M$ and a strategy $\sigma = (\sigma_t)_{t=1}^{T-1}$, where $\sigma_t(\cdot \mid h^{t-1}, \theta) \in \Delta(M)$ is a distribution over messages in period $t$ that depends on the public history $h^{t-1} \in H^{t-1} := (M)^{t-1}$ of messages sent so far, as well as the true state $\theta$. The sender can commit to any information structure $(M, \sigma)$, which becomes common knowledge between the players. At the start of period 1, the sender privately observes the state’s realization, then sends a message in each of the periods $1, 2, \ldots, T-1$ according to $\sigma$. (Section 4 studies a cheap talk model where the sender lacks commitment power.) Information about $\theta$ is non-instrumental in that it does not help the receiver make better decisions, but it can change his belief about future welfare.

At the end of period $t$ for $1 \leq t \leq T-1$, the receiver forms the Bayesian posterior belief $\pi_t$ about the state after the on-path history $h^t \in H^t$ of $t$ messages. This belief is rational and calculated with the knowledge of the information structure $(M, \sigma)$. In period $T$, the receiver exogenously and perfectly learns the true state $\theta$, consumes $c_\theta$, and the game ends. (Section 5 considers a random-horizon model where the termination date is random and unknown to both parties.)

Since the receiver is Bayesian, the sender faces cross-state constraints in choosing paths of beliefs. For example, if the sender wishes to use some message $m \in M$ to convey positive but inconclusive news in the first period when the state is good, then the same message must also be sent with positive probability when the state is bad. Otherwise, receiving this information in the first period would amount to conclusive evidence of the good state. As we later show, these cross-state constraints imply distortions from perfect “consumption smoothing” of good news.

When $K = 2$, we label the two states as Good and Bad, $\Theta = \{G, B\}$, so that $v(c_G) = 1$, $v(c_B) = 0$. We also abuse the notation $\pi_t$ to mean $\pi_t(G)$ in the case of binary states.

In this model, the sender has perfect information about the receiver’s future consumption level once she observes the state. Appendix B discusses an extension where the sender’s information is imperfect, so that there is residual uncertainty about the receiver’s consumption even conditional on the state (i.e., conditional on the sender’s private information).

The receiver derives utility based on changes in his belief about the final period’s consumption. Specifically, he has a continuous news-utility function $N : \Delta(\Theta) \times \Delta(\Theta) \to \mathbb{R}$, mapping his pair of new and old beliefs about the state into a real-valued felicity. He receives utility $N(\pi_t \mid \pi_{t-1})$ at the end of period $1 \leq t \leq T$. Utility flow is undiscounted and the receiver has the same $N$ in all periods.
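The cross-state constraint described in the model setup can be checked with Bayes' rule in the binary-state case. The sketch below is illustrative only: the function name `posterior_good` and the message probabilities are assumptions chosen for the example, not objects from the paper.

```python
def posterior_good(prior_good, p_msg_if_good, p_msg_if_bad):
    """Posterior probability of the Good state after observing a message (Bayes' rule)."""
    joint_good = prior_good * p_msg_if_good
    joint_bad = (1.0 - prior_good) * p_msg_if_bad
    total = joint_good + joint_bad
    if total == 0.0:
        raise ValueError("message is never sent, so the posterior is undefined")
    return joint_good / total

prior = 0.5
# Message sent only in the Good state: the receiver learns the state at once.
print(posterior_good(prior, 1.0, 0.0))
# Same message also sent half the time in the Bad state: good but inconclusive news.
print(posterior_good(prior, 1.0, 0.5))
```

In the first call the posterior jumps to 1, so the message is conclusive. In the second it rises only to 2/3, which is possible only because the message is also sent with positive probability in the bad state.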
The sender maximizes the total expected welfare of the receiver, which is the sum of the news utilities in different periods and the final consumption utility, $\sum_{t=1}^{T} N(\pi_t \mid \pi_{t-1}) + v(c)$. We assume for every $\pi \in \Delta(\Theta)$, both $N(\cdot \mid \pi)$ and $N(\pi \mid \cdot)$ are continuously differentiable except possibly at $\pi$.

For many of our results, we study a mean-based news-utility model. Kőszegi and Rabin (2009) mention this model, but mostly consider a decision-maker who makes a percentile-by-percentile comparison between his old and new beliefs. We use the mean-based model to focus on the implications of diminishing sensitivity in the simplest setup. The agent applies a gain-loss utility function, $\mu : [-1, 1] \to \mathbb{R}$, to changes in expected consumption utility for period $T$. (Since different states lead to different levels of consumption, beliefs over states induce beliefs over consumption.) That is,

$$N(\pi_t \mid \pi_{t-1}) = \mu\left( \sum_{\theta \in \Theta} \left( \pi_t(\theta) - \pi_{t-1}(\theta) \right) \cdot v(c_\theta) \right).$$

Throughout we assume $\mu$ is continuous, strictly increasing, twice differentiable except possibly at 0, and $\mu(0) = 0$. We impose further assumptions on $\mu$ to reflect diminishing sensitivity and loss aversion.

Definition 1.

Say $\mu$ satisfies diminishing sensitivity if $\mu''(x) < 0$ and $\mu''(-x) > 0$ for all $x > 0$. Say $\mu$ satisfies (weak) loss aversion if $-\mu(-x) \geq \mu(x)$ for all $x > 0$. There is strict loss aversion if $-\mu(-x) > \mu(x)$ for all $x > 0$.

We now discuss two important functional forms of $\mu$. In Appendix B.2, we compare the optimal information structures for this model and for Kőszegi and Rabin (2009)’s percentile-based model, a class of news-utility functions that do not admit a mean-based representation.

The quadratic news-utility function $\mu : [-1, 1] \to \mathbb{R}$ is given by

$$\mu(x) = \begin{cases} \alpha_p x - \beta_p x^2 & x \geq 0 \\ \alpha_n x + \beta_n x^2 & x < 0 \end{cases}$$

with $\alpha_p, \beta_p, \alpha_n, \beta_n > 0$. So we have

$$\mu'(x) = \begin{cases} \alpha_p - 2\beta_p x & x > 0 \\ \alpha_n + 2\beta_n x & x < 0 \end{cases}, \qquad \mu''(x) = \begin{cases} -2\beta_p & x > 0 \\ 2\beta_n & x < 0 \end{cases}.$$

The parameters $\alpha_p, \alpha_n$ control the extent of loss aversion near 0, while $\beta_p, \beta_n$ determine the amount of curvature, i.e., the second derivative of $\mu$. We only consider quadratic news-utility functions that satisfy the following parametric restrictions.

1. Monotonicity: $\alpha_p \geq 2\beta_p$ and $\alpha_n \geq 2\beta_n$.
2. Loss aversion: $\alpha_n - \alpha_p \geq (\beta_n - \beta_p) z$ for all $z \in [0, 1]$.

The monotonicity condition holds if and only if $\mu'(x) \geq 0$ for all $x \in [-1, 1]$. The loss-aversion restriction is equivalent to weak loss aversion in the sense of Definition 1. For example, given $\alpha \geq 2\beta > 0$ and $\lambda \geq 1$, set $\alpha_p = \alpha$, $\alpha_n = \lambda\alpha$, $\beta_p = \beta$, $\beta_n = \lambda\beta$. Figure 1 plots some of these news-utility functions for different values of $\alpha$, $\beta$, and $\lambda$.

Figure 1: Examples of quadratic news-utility functions in the family $\alpha_p = \alpha$, $\alpha_n = \lambda\alpha$, $\beta_p = \beta$, $\beta_n = \lambda\beta$ (horizontal axis: change in expected consumption utility; vertical axis: news utility). Grey curve: $\alpha = 2$, $\beta = 1$, $\lambda = 1$. Red curve: $\alpha = 2$, $\beta = 1$, $\lambda = 2$. Blue curve: $\alpha = 2$, $\beta = 0.[\ldots]$, $\lambda = 1$.

The power-function news-utility $\mu : [-1, 1] \to \mathbb{R}$ is given by

$$\mu(x) = \begin{cases} x^{\alpha} & x \geq 0 \\ -\lambda |x|^{\beta} & x < 0 \end{cases}$$

with $0 < \alpha, \beta < 1$ and $\lambda \geq 1$. Parameters $\alpha, \beta$ determine the degree of diminishing sensitivity to good news and bad news, while $\lambda$ controls the extent of loss aversion. This class of functions nests the square-root case when $\alpha = \beta = 0.5$.

In this section, we characterize the optimal information structure that solves the sender’s problem. We provide a general inductive procedure to maximize total expected news utility and find an information structure with $K$ messages that achieves this maximum. We show that information structures featuring gradual bad news, one-shot good news are strictly worse than one-shot resolution, then identify sufficient conditions that imply the optimal information structure features gradual good news, one-shot bad news. We illustrate these conditions with the quadratic news-utility specification, finding that the conditions hold whenever diminishing sensitivity is sufficiently strong relative to loss aversion.

We conclude this section by highlighting that agents with opposite consumption preferences over two states of the world can exhibit opposite informational preferences when choosing between one-shot resolution and gradual resolution of uncertainty. This endogenous diversity of information preferences distinguishes news utility with diminishing sensitivity from other models of preference over non-instrumental information in the literature.

For $f : \Delta(\Theta) \to \mathbb{R}$, let $\operatorname{cav} f$ be the concavification of $f$, that is, the smallest concave function that dominates $f$ pointwise. Concavification plays a key role in solving this information design problem, just as in Kamenica and Gentzkow (2011) and Aumann and Maschler (1995).

For $\pi_{T-2}, \pi_{T-1} \in \Delta(\Theta)$ two beliefs about the state, let $U_{T-1}(\pi_{T-1} \mid \pi_{T-2})$ be the sum of the receiver’s expected news utilities in periods $T-1$ and $T$, if he enters period $T-1$ with belief $\pi_{T-2}$ and updates it to $\pi_{T-1}$.
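The concavification operator defined above can be approximated numerically in the binary-state case, where a belief is a single number in [0, 1]. The following is a minimal sketch under stated assumptions: it works on a finite grid, computes the upper concave envelope with a standard upper-hull scan, and the name `concavify` is illustrative rather than the paper's own procedure.

```python
def concavify(xs, ys):
    """Upper concave envelope of the points (xs[i], ys[i]), evaluated at each xs[i].

    Assumes xs is strictly increasing (for example, a grid of beliefs in [0, 1]).
    """
    hull = []  # indices of the grid points that lie on the envelope
    for i in range(len(xs)):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (xs[b] - xs[a]) * (ys[i] - ys[a]) - (ys[b] - ys[a]) * (xs[i] - xs[a])
            if cross >= 0:  # point b lies on or below the chord from a to i
                hull.pop()
            else:
                break
        hull.append(i)
    envelope = list(ys)
    # Interpolate linearly between consecutive hull points.
    for a, b in zip(hull, hull[1:]):
        for i in range(a, b + 1):
            t = (xs[i] - xs[a]) / (xs[b] - xs[a])
            envelope[i] = (1 - t) * ys[a] + t * ys[b]
    return envelope

grid = [i / 4 for i in range(5)]
convex = [x ** 2 for x in grid]      # envelope is the chord from (0, 0) to (1, 1)
concave = [x ** 0.5 for x in grid]   # already concave, so the envelope leaves it unchanged
print(concavify(grid, convex))
print(concavify(grid, concave))
```

A convex function is replaced by the chord between its endpoints, while a concave function is returned unchanged. This matches the idea of the smallest concave function dominating the original pointwise, up to the grid approximation.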
In symbols, the sum of expected news utilities in periods $T-1$ and $T$ is

$$U_{T-1}(\pi_{T-1} \mid \pi_{T-2}) := N(\pi_{T-1} \mid \pi_{T-2}) + \sum_{\theta \in \Theta} \pi_{T-1}(\theta) \cdot N(1_\theta \mid \pi_{T-1}),$$

where $1_\theta$ is the degenerate belief putting probability 1 on the state $\theta$. Note that by the martingale property of beliefs, if the receiver holds belief $\pi_{T-1}$ at the end of period $T-1$, then state $\theta$ must realize in period $T$ with probability $\pi_{T-1}(\theta)$.

Let $U^*_{T-1}(\pi_{T-2}) := (\operatorname{cav} U_{T-1}(\cdot \mid \pi_{T-2}))(\pi_{T-2})$. As we will show in the proof of Proposition 1, $U^*_{T-1}(\pi_{T-2})$ is the value function of the sender when the receiver enters period $T-1$ with belief $\pi_{T-2}$. It is calculated by evaluating the concavified version of $x \mapsto U_{T-1}(x \mid \pi_{T-2})$ at the point $x = \pi_{T-2}$. By Carathéodory’s theorem, there exist weights $w_1, \ldots, w_K \geq 0$ and beliefs $q_1, \ldots, q_K \in \Delta(\Theta)$, with $\sum_{k=1}^{K} w_k = 1$, $\sum_{k=1}^{K} w_k q_k = \pi_{T-2}$, such that $U^*_{T-1}(\pi_{T-2}) = \sum_{k=1}^{K} w_k U_{T-1}(q_k \mid \pi_{T-2})$. When the receiver enters period $T-1$ with belief $\pi_{T-2}$, the sender maximizes his expected payoff using a signaling strategy $\sigma_{T-1}$ that generates a distribution of posteriors supported on $(q_1, \ldots, q_K)$ with probabilities $(w_1, \ldots, w_K)$.

Given the value function $U^*_{t+1}(x)$ for $t \geq 1$, we may define:

$$U_t(\pi_t \mid \pi_{t-1}) := N(\pi_t \mid \pi_{t-1}) + U^*_{t+1}(\pi_t),$$

which leads to the period $t$ value function $U^*_t(x) := (\operatorname{cav} U_t(\cdot \mid x))(x)$. The maximum expected news utility across all information structures is $U^*_1(\pi_0)$.

Proposition 1 formalizes this discussion. It shows there exists an information structure with $K$ messages that achieves optimality, and the said information structure can be constructed using the sequence of concavifications.

Proposition 1.

The maximum expected news utility across all information structures is $U^*_1(\pi_0)$. There is an information structure $(M, \sigma)$ with $|M| = K$ attaining this maximum, with the property that after each on-path public history $h^{t-1}$ associated with belief $\pi_{t-1}$, the sender’s strategy $\sigma_t(\cdot \mid h^{t-1}, \theta)$ induces posterior $q_k$ at the end of period $t$ with probability $w_k$, for some $q_1, \ldots, q_K \in \Delta(\Theta)$, $w_1, \ldots, w_K \geq 0$, satisfying $\sum_{k=1}^{K} w_k = 1$, $\sum_{k=1}^{K} w_k q_k = \pi_{t-1}$, and $U^*_t(\pi_{t-1}) = \sum_{k=1}^{K} w_k U_t(q_k \mid \pi_{t-1})$.

A perhaps surprising implication is that the receiver only needs a binary message space if there are two states of the world, regardless of the shape or curvature of the news-utility function $N$. Figure 2 illustrates the concavification procedure in an environment with two equally likely states, $T = 5$, and the mean-based news-utility function $\mu(x) = \sqrt{x}$ for $x \geq 0$, $\mu(x) = -[\ldots]\sqrt{-x}$ for $x < 0$. The sender optimally discloses a conclusive bad-news signal in a random period when $\theta = B$, so each period of silence amounts to a small piece of good news. (In Appendix B.2, we consider Kőszegi and Rabin (2009)’s percentile-based news utility model in a similar environment with Gaussian distributions of residual consumption uncertainty in the two states. We find a very similar optimal information structure under the same square-root gain-loss function.)

Figure 2: The concavifications giving the optimal information structure with horizon $T = 5$, mean-based news-utility function $\mu(x) = \sqrt{x}$ for $x \geq 0$, $\mu(x) = -[\ldots]\sqrt{-x}$ for $x < 0$, and $\pi_0 = 0.5$. The dashed vertical line in the $t$-th graph marks the receiver’s belief in $\theta = G$ conditional on not having heard any bad news by the start of period $t$. The $y$-axis shows the sum of news utility this period and the value function of entering next period with a certain belief. In the good state of the world, the receiver’s belief in $\theta = G$ grows at increasing rates across the periods, $0.5 \to \ldots \to 1$. In the bad state of the world, the receiver’s belief follows the same path as in the good state up until the random period when conclusive bad news arrives.

The information-design problem imposes additional constraints relative to a habit-formation model. To see this, consider a “relaxed” version of the sender’s problem in the binary-state case where she simply chooses some $x_t \in [0, 1]$ each period for $1 \leq t \leq T-1$, depending on the realization of $\theta$. The receiver gets $\mu(x_t - x_{t-1})$ in period $1 \leq t \leq T$, with the initial condition $x_0 = \pi_0$ and the terminal condition $x_T = 1$ if $\theta = G$, $x_T = 0$ if $\theta = B$. One interpretation of the relaxed problem is that the sender chooses the receiver’s sequence of beliefs only subject to the constraint that the initial belief in period 0 is $\pi_0$ and the final belief in period $T$ puts probability 1 on the true state. The belief paths do not have to be Bayesian. Another interpretation is that $x_t$ is not a belief, but a consumption level for period $t$. The receiver’s welfare in period $t$ only depends on a gain-loss utility based on how the current period’s consumption differs from that of period $t-1$. Provided $\mu$ has diminishing sensitivity and exhibits enough loss aversion, the sender maximizes the receiver’s utility by choosing $x_t = \pi_0 + \frac{t}{T}(1 - \pi_0)$ in period $t$ when $\theta = G$, and by choosing $x_t = 0$ in every period $t \geq 1$ when $\theta = B$. The belief paths in Figure 2 differ from these “relaxed” solutions in two ways. First, the receiver gets different amounts of good news (in terms of $\pi_t - \pi_{t-1}$) in different periods when $\theta = G$. Second, the sender sometimes provides false hope in the bad state. These differences come from the Bayesian constraints on beliefs.

We begin with a sufficient condition on the news-utility function for one-shot resolution to be strictly suboptimal for any $T$ and $\Theta$. Let $\theta_H, \theta_L \in \Theta$ be the states with the highest and lowest consumption utilities. Let $1_H, 1_L \in \Delta(\Theta)$ represent degenerate beliefs in states $\theta_H$ and $\theta_L$ and let $v_0 := \mathbb{E}_{\theta \sim \pi_0}(v(c_\theta))$ be the ex-ante expected future consumption utility. The symbol $\oplus$ denotes the mixture between two beliefs in $\Delta(\Theta)$.

Proposition 2.

For any $T$ and $\Theta$, one-shot resolution is strictly suboptimal if
$$\lim_{\epsilon \to 0^+} \frac{N(1_H \mid (1-\epsilon)1_H \oplus \epsilon 1_L)}{\epsilon} + N(1_H \mid \pi_0) - N(1_L \mid \pi_0) > \lim_{\epsilon \to 0^+} \frac{N(1_H \mid \pi_0) - N((1-\epsilon)1_H \oplus \epsilon 1_L \mid \pi_0)}{\epsilon} - N(1_L \mid 1_H).$$
For the mean-based news-utility model, this condition is equivalent to
$$\mu'(0^+) + \mu(1 - v_0) - \mu(-v_0) > \mu'(1 - v_0) - \mu(-1).$$

In fact, the proof of Proposition 2 shows that whenever its condition is satisfied, some information structure featuring gradual good news and one-shot bad news (to be defined precisely in the next subsection) is strictly better than one-shot resolution.

We can interpret Proposition 2's sufficient condition as "strong enough diminishing sensitivity." Evidently, $\mu(1 - v_0) - \mu(-v_0) > 0$, so the condition is satisfied whenever $\mu'(0^+) \ge \mu'(1 - v_0) - \mu(-1)$. If $\mu'(0^+)$ is sufficiently large, or $\mu''$ is sufficiently negative to the right of 0 and sufficiently positive to the left of 0, then both of the positive RHS terms $\mu'(1 - v_0)$ and $-\mu(-1)$ will be small relative to $\mu'(0^+)$, so the inequality will hold.

The quadratic news utility provides a clear illustration of this interpretation, as the condition of Proposition 2 holds whenever there is enough curvature relative to the extent of loss aversion.

Corollary 1.

If the receiver has quadratic news utility with $\alpha_n - \alpha_p \le \beta_n + \beta_p$, then one-shot resolution is strictly suboptimal for any $T$.

The difference $\alpha_n - \alpha_p \ge 0$ captures the extent of loss aversion through $\mu'(0^-) - \mu'(0^+)$. On the other side, $\beta_p$ and $\beta_n$ control the amounts of curvature in the positive and negative regions, respectively.

The sufficient condition in Proposition 2 is also satisfied by the most commonly used model of diminishing sensitivity, the power function (see, for example, Tversky and Kahneman (1992)). One could think of the power function specification as having "infinite" diminishing sensitivity near 0, as $\mu''(0^+) = -\infty$ and $\mu''(0^-) = \infty$.

Corollary 2.

Suppose $\mu(x) = x^{\alpha}$ if $x \ge 0$ and $\mu(x) = -\lambda \cdot |x|^{\beta}$ if $x < 0$, for some $0 < \alpha, \beta < 1$ and $\lambda \ge 1$. Then one-shot resolution is strictly suboptimal for any $T$.

While Proposition 2 holds generally, we can find sharper results on the sub-optimality of one-shot resolution for specific news-utility models and environments. Kőszegi and Rabin (2009)'s percentile-based news-utility model stipulates
$$N(\pi_t \mid \pi_{t-1}) = \int_0^1 \mu\big(v(F_{\pi_t}(p)) - v(F_{\pi_{t-1}}(p))\big)\, dp,$$
where $F_{\pi_t}(p)$ and $F_{\pi_{t-1}}(p)$ are the $p$-th percentile consumption levels according to beliefs $\pi_t$ and $\pi_{t-1}$, respectively. Whenever $\mu$ exhibits diminishing sensitivity to gains and there are at least 3 states, one-shot resolution is suboptimal. This result does not require any assumption about loss aversion.

Proposition 3.

In Kőszegi and Rabin (2009)'s percentile-based news-utility model, provided the gain-loss utility function satisfies $\mu''(x) < 0$ for all $x > 0$, one-shot resolution is strictly suboptimal for any $T$ and any $K \ge 3$.

[...] $\mu$ is two-part linear with $\mu(x) = bx$ for $x \ge 0$, $\mu(x) = \lambda b x$ for $x < 0$, for some $b > 0$, $\lambda > [\ldots]$ $\mu''(x) = 0$ for $x > 0$ [...] $K \ge [\ldots]$. In a binary-states world, the percentile-based news-utility function $N$ only depends on the value of $\mu$ at two non-zero points. Thus every increasing $\mu$ is behaviorally indistinguishable from a two-part linear one, meaning the percentile-based model cannot capture diminishing sensitivity in a setting with $K = 2$.

As an analog to Corollary 2, we study a setting with percentile-based news utility and residual consumption uncertainty in Appendix B, finding that one-shot resolution is strictly suboptimal with any number of states for a power-function $\mu$ (Corollary A.1).

For the remainder of the paper, we focus on mean-based news-utility functions to study additional implications of diminishing sensitivity. Two classes of information structures will play important roles in the sequel. To define them, we write $v_t := E_{\theta \sim \pi_t}[v(c_\theta)]$ for the expected future consumption utility based on the receiver's (random) belief at the end of period $t$. Partition states into two subsets, $\Theta = \Theta_B \cup \Theta_G$, where $v(c_\theta) < v_0$ for $\theta \in \Theta_B$ and $v(c_\theta) \ge v_0$ for $\theta \in \Theta_G$. Interpret $\Theta_B$ as the "bad" states and $\Theta_G$ as the "good" ones.

Definition 2.

An information structure $(M, \sigma)$ features gradual good news, one-shot bad news if
• $P_{(M,\sigma)}[v_t \ge v_{t-1} \text{ for all } 1 \le t \le T \mid \theta \in \Theta_G] = 1$ and
• $P_{(M,\sigma)}[v_t < v_{t-1} \text{ for no more than one } 1 \le t \le T \mid \theta \in \Theta_B] = 1$.
An information structure $(M, \sigma)$ features gradual bad news, one-shot good news if
• $P_{(M,\sigma)}[v_t \le v_{t-1} \text{ for all } 1 \le t \le T \mid \theta \in \Theta_B] = 1$ and
• $P_{(M,\sigma)}[v_t > v_{t-1} \text{ for no more than one } 1 \le t \le T \mid \theta \in \Theta_G] = 1$.

In the first class of information structures ("gradual good news, one-shot bad news"), the sender relays good news over time and gradually increases the receiver's expectation of future consumption. When the state is bad, the sender concentrates all the bad news in one period. The "one-shot bad news" terminology comes from noting that when $\theta \in \Theta_B$, the single period $t^*$ where $v_{t^*} < v_{t^*-1}$ must satisfy $v_{t^*} = v(c_\theta)$ and $v_t = v_{t^*}$ for all $t > t^*$. The receiver gets negative information about his future consumption level for the first time in period $t^*$, and his expectation stays constant thereafter. On the other hand, we use the phrase "gradual bad news, one-shot good news" to refer to the "opposite" kind of information structure.

One-shot resolution falls into both of these classes. To rule out this triviality, we say that an information structure features strictly gradual good news if
$$P_{(M,\sigma)}[v_t > v_{t-1} \text{ and } v_{t'} > v_{t'-1} \text{ for two distinct } 1 \le t, t' \le T \mid \theta \in \Theta_G] > 0.$$
That is, there is positive probability that the receiver's expectation strictly increases at least twice in periods 1 through $T$. Similarly define strictly gradual bad news.

We now prove that whenever $\mu$ satisfies diminishing sensitivity and (weak) loss aversion, information structures featuring strictly gradual bad news, one-shot good news are strictly worse than one-shot resolution. By contrast, under some additional restrictions, the optimal information structure falls into the strictly gradual good news, one-shot bad news class.

Proposition 4.

Suppose $\mu$ satisfies diminishing sensitivity and loss aversion. Any information structure featuring strictly gradual bad news, one-shot good news is strictly worse than one-shot resolution in expectation, and almost surely weakly worse ex-post.

This result holds for arbitrary state space $\Theta$, horizon $T$, and prior $\pi_0$.

For the rest of the paper, we specialize to the case of $K = 2$. The next result presents a necessary and sufficient condition for inconclusive bad news to be suboptimal when $T = 2$. We then verify the condition for quadratic news utility.
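The ranking in Proposition 4 can be checked numerically on a small example. The sketch below is only an illustration under assumptions that are not taken from the text: a binary state, $T = 2$, prior 0.5, square-root news utility with loss-aversion coefficient `lam`, and a hypothetical information structure in which the sender reveals the good state early with probability `r` and otherwise sends a discouraging message. The function names and the parameter `r` are illustrative.

```python
import math

def mu(x, lam=1.0):
    """Assumed news utility: sqrt(x) for gains, -lam * sqrt(-x) for losses."""
    return math.sqrt(x) if x >= 0 else -lam * math.sqrt(-x)

def one_shot(prior, lam=1.0):
    """Expected news utility when nothing is learned until the state is revealed."""
    return prior * mu(1 - prior, lam) + (1 - prior) * mu(-prior, lam)

def gradual_bad_news(prior, r, lam=1.0):
    """Strictly gradual bad news, one-shot good news with T = 2.

    Period 1: in the good state the sender reveals G with probability r;
    otherwise (and always in the bad state) she sends a discouraging message.
    Period 2: the state is revealed.
    """
    # Bayesian posterior on the good state after the discouraging message.
    q = prior * (1 - r) / (prior * (1 - r) + (1 - prior))
    # Good state: either one jump to 1, or a dip to q followed by a jump to 1.
    good = r * mu(1 - prior, lam) + (1 - r) * (mu(q - prior, lam) + mu(1 - q, lam))
    # Bad state: belief falls to q, then to 0.
    bad = mu(q - prior, lam) + mu(-q, lam)
    return prior * good + (1 - prior) * bad

if __name__ == "__main__":
    for r in (0.25, 0.5, 0.75):
        print(r, round(gradual_bad_news(0.5, r), 4), round(one_shot(0.5), 4))
```

In this example the gradual-bad-news payoff falls below the one-shot payoff for every `r` tried, in line with the proposition; the comparison says nothing about other priors or utility functions beyond what the proposition itself guarantees.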

Proposition 5.

For $T = 2$, information structures with $P_{(M,\sigma)}[\pi_1 < \pi_0 \text{ and } \pi_1 \ne 0] > 0$ are strictly suboptimal if and only if there exists some $q \ge \pi_0$ so that the chord connecting $(0, U(0 \mid \pi_0))$ and $(q, U(q \mid \pi_0))$ lies strictly above $U(p \mid \pi_0)$ for all $p \in (0, \pi_0)$.

Corollary 3.

Quadratic news utility satisﬁes the condition of Proposition 5.
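The chord condition of Proposition 5 can be checked on a grid. The sketch below rests on assumptions: it takes the quadratic news utility to have the form $\mu(x) = \alpha_p x - \beta_p x^2$ for $x \ge 0$ and $\mu(x) = \alpha_n x + \beta_n x^2$ for $x < 0$, uses the parameter values $\alpha_p = 2$, $\alpha_n = 2.1$, $\beta_p = 1$, $\beta_n = 0.2$ from the paper's figures, and writes $U(x \mid \pi_0)$ as first-period news utility plus expected second-period news utility. The helper names `chord_dominates` and `find_q` are hypothetical, and a grid check is a numerical sanity test, not a proof.

```python
def mu(x, ap=2.0, an=2.1, bp=1.0, bn=0.2):
    """Assumed quadratic news utility: concave over gains, convex over losses."""
    return ap * x - bp * x * x if x >= 0 else an * x + bn * x * x

def U(x, prior):
    """News utility from moving the belief prior -> x, then learning the state."""
    return mu(x - prior) + x * mu(1 - x) + (1 - x) * mu(-x)

def chord_dominates(q, prior, n=1000):
    """True if the chord from (0, U(0)) to (q, U(q)) lies strictly above U
    at every grid point of the open interval (0, prior)."""
    u0 = U(0.0, prior)
    slope = (U(q, prior) - u0) / q
    return all(
        u0 + slope * p > U(p, prior)
        for p in (prior * k / n for k in range(1, n))
    )

def find_q(prior, n=200):
    """Grid-search for some q in [prior, 1] whose chord passes the test."""
    for k in range(n + 1):
        q = prior + (1 - prior) * k / n
        if chord_dominates(q, prior):
            return q
    return None

if __name__ == "__main__":
    print(find_q(0.5))
```

With these parameters and prior 0.5, the search already succeeds at the first candidate, because the assumed $U(\cdot \mid \pi_0)$ is convex on $(0, \pi_0)$.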

In particular, combining Corollaries 1 and 3, we infer that any optimal information structure for a receiver with quadratic news utility with $\alpha_n - \alpha_p \le \beta_n + \beta_p$ with $T = 2$ must feature strictly gradual good news, one-shot bad news. Furthermore, since there exists an optimal information structure with binary messages by Proposition 1, in this environment there is an optimal information structure where the sender induces either belief 0 or belief $p_H > \pi_0$ in the only period of communication. The next subsection characterizes $p_H$ as a function of the model parameters.

In summary, we have established a ranking between three kinds of information structures. For any time horizon and any state space, provided the condition in Proposition 2 holds and $\mu$ satisfies diminishing sensitivity and weak loss aversion, some information structure featuring gradual good news, one-shot bad news gives more news utility than one-shot resolution, which in turn gives more news utility than any information structure featuring strictly gradual bad news, one-shot good news. Further, under the additional restrictions in Proposition 5, a gradual good news, one-shot bad news information structure is optimal among all information structures.

We illustrate Proposition 1's concavification procedure by finding in closed form the optimal information structure when the receiver has a quadratic news-utility function.

Suppose the parameters of $\mu$ satisfy $\alpha_n - \alpha_p \le \beta_n + \beta_p$ in a $T = 2$ environment. From the arguments in Section 3.3, the optimal information structure induces either $\pi_1 = 0$ or $\pi_1 = p_H$ for some $p_H > \pi_0$. Proposition 1 implies $(\mathrm{cav}\, U(\cdot \mid \pi_0))(x) > U(x \mid \pi_0)$ for all $x \in (0, p_H)$. The geometry of concavification shows that the derivative of the value function at $p_H$, $\frac{\partial}{\partial x} U(x \mid \pi_0)(p_H)$, equals the slope of the chord from 0 to $p_H$ on the function $U(\cdot \mid \pi_0)$. We use this equality to derive $p_H$ as the solution to a cubic polynomial.

Proposition 6.

For $T = 2$ and quadratic news utility satisfying $\alpha_n - \alpha_p \le \beta_n + \beta_p$, the optimal partial good news $p_H > \pi_0$ satisfies
$$\pi_0(\alpha_n - \alpha_p) - (\beta_p + \beta_n)\pi_0^2 = p_H^2(\alpha_n - \alpha_p + \beta_n + \beta_p) - p_H^3(2\beta_p + 2\beta_n).$$
We have $\frac{dp_H}{d\pi_0} > 0$ for $\pi_0 < \frac{\alpha_n - \alpha_p}{\beta_n + \beta_p}$ and $\frac{dp_H}{d\pi_0} < 0$ for $\pi_0 > \frac{\alpha_n - \alpha_p}{\beta_n + \beta_p}$.

Figure 3: Optimal partial good news with quadratic news utility and $T = 2$, fixing parameters $\alpha_p = 2$, $\alpha_n = 2.1$, $\beta_p = 1$, $\beta_n = 0.2$ (x-axis: prior; y-axis: optimal partial good news). [...] $\pi_0 = \frac{\alpha_n - \alpha_p}{\beta_n + \beta_p} \approx [\ldots]$ $p_H$ increases with prior. But for high prior beliefs, $p_H$ decreases with prior.

Figure 3 illustrates. In the case of $\alpha_n = \alpha_p$, and in particular when $\mu$ is symmetric around 0, $\frac{dp_H}{d\pi_0} > 0$ for any $\pi_0 \in (0, 1)$.

Leaving aside the setting where the sender knows the state upfront and can choose any information structure, consider an environment where a sequence of exogenous signal realizations determines the state. We show that agents with opposite consumption preferences over the two states can exhibit opposite preferences when choosing between observing the signals as they arrive (gradual resolution) or only learning the final state (one-shot resolution).

There are two states of the world, Alternative ($A$) and Baseline ($B$). In each period $t = 1, 2, \ldots, T$, a binary random variable $X_t$ realizes, where $P[X_t = 1] = q_t$ with $0 < q_t < 1$. If $X_t = 1$ for all $t$, then the state is $A$. Else, if $X_t = 0$ for at least one $t$, then the state is $B$. The agent's consumption utility in period $T$ depends on the state, and is normalized without loss to be either 0 or 1.

At time 0, the agent chooses between observing the realizations of the random variables $(X_t)_{t=1}^{T}$ in real time, or only learning the state of the world at the end of period $T$. As an example, imagine a televised debate between two political candidates $A$ and $B$ where $A$ loses as soon as she makes a "gaffe" during the debate. 
If $A$ does not make any gaffes, then $A$ wins. An individual who strongly prefers one of the candidates to win must choose between watching the debate live in the evening or only reading the outcome of the debate the following morning.

The agent forms Bayesian belief $\pi_t \in [0, 1]$ about the probability of state $A$ at the end of each period $t$, starting with the correct Bayesian prior $\pi_0$. For notational convenience, we also write $\rho_t = 1 - \pi_t$ as the belief in state $B$ at the end of $t$, with the prior $\rho_0 = 1 - \pi_0$. If the agent prefers state $A$, he gets news utility $\mu(\pi_t - \pi_{t-1})$ at the end of period $t$. If the agent prefers state $B$, then he gets news utility $\mu(\rho_t - \rho_{t-1})$. The function $\mu$ exhibits diminishing sensitivity, that is, $\mu''(x) < 0$ and $\mu''(-x) > 0$ for all $x > 0$. Also, to quantify the amount of loss aversion, we consider the parametric class of $\lambda$-scaled news-utility functions. We fix some $\tilde{\mu}_{pos} : [0, 1] \to \mathbb{R}_+$, strictly increasing and strictly concave with $\tilde{\mu}_{pos}(0) = 0$, and consider the family of $\mu$'s given by $\mu(x) = \tilde{\mu}_{pos}(x)$, $\mu(-x) = -\lambda \tilde{\mu}_{pos}(x)$ for $x > 0$, with $\lambda \ge 1$.

Under diminishing sensitivity, someone rooting for state $A$ wants to watch the events unfold in real time to celebrate the small victories, while someone hoping for state $B$ prefers to only learn the final state to avoid piecemeal bad news. The next proposition formalizes this intuition.

Proposition 7.

Consider the class of $\lambda$-scaled news-utility functions. For any $\lambda \ge 1$, an agent who prefers state $B$ will choose one-shot resolution of uncertainty over gradual resolution of uncertainty. There exists some $\bar{\lambda} > 1$ so that for any $1 \le \lambda \le \bar{\lambda}$, an agent who prefers state $A$ will choose gradual resolution of uncertainty over one-shot resolution of uncertainty.

This result suggests a possible mechanism for media competition: if the realization of some state $A$ depends on a series of smaller events, then some news sources may cover these small events in detail as they happen, while other sources may choose to only report the final outcome. Viewers sort between these two kinds of news sources based on how they rank $A$ and $B$ in terms of consumption. Opposite consumption preferences induce opposite informational preferences.

By contrast, other behavioral models do not predict a diversity of informational preferences in this environment.

Footnote: Augenblick and Rabin (2018) use a similar example of political gaffes to illustrate Bayesian belief movements.

Proposition 8.

The following models do not predict different informational preferences for agents with the opposite consumption rankings for the two states.
1. Two-part linear news-utility function $\mu$.
2. Anticipatory utility where the agent gets either $u(\pi_t)$ or $u(\rho_t)$ in period $t$ depending on his preference over states $A$ and $B$, with $u$ an increasing, weakly concave function.
3. Ely, Frankel, and Kamenica (2015)'s suspense and surprise utilities.

Another application of Proposition 7 concerns the design of game shows. Consider a game show featuring a single contestant who will win either $100,000 or nothing depending on her performance across five rounds. The audience, empathizing with the contestant, derives news utility $\mu(\pi_t - \pi_{t-1})$ at the end of round $t$, where $\pi_t$ is the contestant's probability of winning the prize based on the first $t$ rounds. One possible format ("sudden death") features five easy rounds each with $w = 0.5^{1/5} \approx 87\%$ winning probability, where the contestant wins $100,000 if she wins all five rounds. Another possible format ("repêchage") involves five hard rounds each with $1 - w$ winning probability, but the contestant wins $100,000 as soon as she wins any round. Both formats lead to the same distribution over final outcomes and generate the same amount of suspense and surprise utilities à la Ely, Frankel, and Kamenica (2015). Proposition 7 shows the first format induces more news utility than one-shot resolution for audience members who are not too loss averse, while the second format is worse than one-shot resolution for all audience members. Consistent with our model, the vast majority of game shows resemble the first format more than the second format.

Footnote: This can be thought of as a stylized payout structure for game shows like American Ninja Warrior and Who Wants to Be a Millionaire.

Section 3 studied the optimal disclosure of news when the sender has commitment power. We provided conditions for the optimal information structure to feature gradual good news, [...] total news utility of a receiver with diminishing sensitivity, if the positive utility from today's good news outweighs the additional future disappointment from higher expectations. In fact, when news utility is symmetric and exhibits diminishing sensitivity, the above credibility problem is so severe that every equilibrium is payoff-equivalent to the babbling equilibrium. The same result also applies to asymmetric news-utility functions with $\mu(-x) = -\lambda\mu(x)$ for all $x > 0$, provided loss aversion $\lambda > 1$ is sufficiently weak relative to $\mu$'s diminishing sensitivity, in a way we formalize.

Sufficiently strong loss aversion can restore the equilibrium credibility of good-news messages. We show that the highest equilibrium payoff when the sender lacks commitment may be non-monotonic in the extent of loss aversion, in contrast to the conclusion that more loss-averse receivers are always strictly worse off when the sender has commitment power. We also completely characterize the class of equilibria that feature (a deterministic sequence of) gradual good news in the good state and study the equilibrium rate of learning. With the quadratic or the square-root news-utility function, equilibria within this class always release progressively larger pieces of good news over time, so the receiver's belief in the good state grows at an increasing rate.

We continue to maintain that the state space $\Theta = \{G, B\}$ is binary. To study the case where the sender lacks commitment, we analyze the perfect-Bayesian equilibria of the cheap talk game between the two parties. Formally, the equilibrium concept is as follows.

Definition 3.

Let a finite set of messages $M$ be fixed. A perfect-Bayesian equilibrium consists of the sender's strategy $\sigma^* = (\sigma^*_t)_{t=1}^{T-1}$ together with the receiver's beliefs $p^* : \cup_{t=0}^{T-1} H^t \to [0, 1]$, such that:
• For every $1 \le t \le T-1$, $h^{t-1} \in H^{t-1}$ and $\theta \in \{G, B\}$, $\sigma^*$ maximizes the receiver's total expected news utility in periods $t, \ldots, T-1, T$ conditional on having reached the public history $h^{t-1}$ in state $\theta$ at the start of period $t$.
• $p^*$ is derived by applying Bayes' rule to $\sigma^*$ whenever possible.
We make two belief-refinement restrictions:
• If $t' \le T-1$, $h^{t'}$ is a continuation history of $h^t$, and $p^*(h^t) \in \{0, 1\}$, then $p^*(h^{t'}) = p^*(h^t)$.
• The receiver's belief in period $T$ when the state is $\theta$ satisfies $\pi_T = 1_\theta$, regardless of the preceding history $h^{T-1} \in H^{T-1}$.

We will abbreviate a perfect-Bayesian equilibrium satisfying our belief refinements as an "equilibrium." Our definition requires that once the receiver updates his belief to 0 or 1, it stays constant through the end of period $T - 1$. In period $T$, the receiver updates his belief to reflect full confidence in the true state of the world, regardless of his (possibly dogmatic) belief at the end of period $T - 1$. The receiver experiences news utility in each period $1 \le t \le T$ based on changes in his belief, as in the model with commitment.

Let $V_{\mu,M,T}(\pi_0) \subseteq \mathbb{R}$ denote the set of equilibrium payoffs with news-utility function $\mu$, message space $M$, time horizon $T$, and prior $\pi_0$. Clearly, $V_{\mu,M,T}(\pi_0)$ is non-empty. There is always the babbling equilibrium, where the sender mixes over all messages uniformly in both states and the receiver's belief never updates from the prior belief until period $T$. Denote the babbling equilibrium payoff by
$$V^{Bab}_{\mu}(\pi_0) := \pi_0\, \mu(1 - \pi_0) + (1 - \pi_0)\, \mu(-\pi_0)$$
and note it is independent of $M$ or $T$. We state two preliminary properties of the equilibrium payoff set $V_{\mu,M,T}(\pi_0)$.

Lemma 1.

We have:
1. For any finite $M$, $V_{\mu,M,T}(\pi_0) \subseteq V_{\mu,\{g,b\},T}(\pi_0)$.
2. If $T \le T'$, then $V_{\mu,M,T}(\pi_0) \subseteq V_{\mu,M,T'}(\pi_0)$.

The first statement says any equilibrium payoff achievable with an arbitrary finite message space is also achievable with a binary message space. The second statement says the set of equilibrium payoffs weakly expands with the time horizon.

4.2 The Credibility Problem and Babbling

To understand the source of the credibility problem, let $N_B(x; \pi) := \mu(x - \pi) + \mu(-x)$ denote the total amount of news utility across two periods when the receiver updates his belief from $\pi$ to $x > \pi$ today and updates it from $x$ to 0 tomorrow. Suppose there exists a period $T-2$ public history $h^{T-2} \in H^{T-2}$ with $p^*(h^{T-2}) = \pi$ and some $x > \pi$ satisfying $N_B(x; \pi) > N_B(0; \pi)$. Then, the sender strictly prefers to induce belief $x$ rather than belief 0 after arriving at the history $h^{T-2}$ in the bad state. A good-news message $m_x$ inducing belief $x$ and a bad-news message $m_0$ inducing belief 0 cannot both be on-path following $h^{T-2}$, else the sender would strictly prefer to send $m_x$ with probability 1 in the bad state.

Yet, the inequality $N_B(0; \pi) < N_B(x; \pi)$ automatically holds for any $x > \pi$, provided $\mu$ is strictly concave in the positive region and symmetric around 0.

Lemma 2. If $\mu$ is symmetric around 0 and $\mu''(x) < 0$ for all $x > 0$, then for any $0 < \pi < x$ [...], $N_B(x; \pi) > N_B(0; \pi)$.

[...] (inducing belief $x > \pi$ instead of 0) provides positive news utility at the cost of greater disappointment in the final period. Diminishing sensitivity limits the incremental cost of this additional disappointment.

The credibility problem implies that the babbling payoff is the unique equilibrium payoff.

Proposition 9.

Suppose $\mu$ is symmetric around 0 and $\mu''(x) < 0$ for all $x > 0$. For any $M$, $T$, $\pi_0$, $V_{\mu,M,T}(\pi_0) = \{V^{Bab}_{\mu}(\pi_0)\}$.

We now explore what happens when $\mu$ is asymmetric around 0 due to loss aversion. Say $\mu$ exhibits greater sensitivity to losses if $\mu'(x) \le \mu'(-x)$ for all $x > 0$. We first establish a robustness check on Proposition 9 within this class of news-utility functions: when loss aversion is sufficiently weak relative to diminishing sensitivity in a $T = 2$ model, the babbling equilibrium remains unique up to payoffs.

Proposition 10.

Suppose $\mu$ exhibits greater sensitivity to losses. If
$$\min_{z \in [0, 1 - \pi_0]} \frac{\mu'(z)}{\mu'(-(\pi_0 + z))} > 1,$$
then $V_{\mu,M,2}(\pi_0) = \{V^{Bab}_{\mu}(\pi_0)\}$ for any $M$.

When $\mu$ is symmetric and does not exhibit strict loss aversion, diminishing sensitivity implies $\mu'(-(\pi_0 + z)) = \mu'(\pi_0 + z) < \mu'(z)$ for every $z \in [0, 1 - \pi_0]$, so the inequality condition in Proposition 10 is always satisfied. This condition continues to hold if $\mu$ is slightly asymmetric due to a "small enough" amount of loss aversion relative to the size of the sensitivity gap $\mu'(z) - \mu'(\pi_0 + z)$. This interpretation is clearest for the $\lambda$-scaled news-utility functions, as formalized in the following corollary.

Corollary 4.

Suppose for some $\tilde{\mu}_{pos} : [0, 1] \to \mathbb{R}_+$ and $\lambda \ge 1$, the news-utility function $\mu$ satisfies $\mu(x) = \tilde{\mu}_{pos}(x)$, $\mu(-x) = -\lambda\tilde{\mu}_{pos}(x)$ for all $x \ge 0$. Provided
$$\lambda < \min_{z \in [0, 1 - \pi_0]} \frac{\tilde{\mu}'_{pos}(z)}{\tilde{\mu}'_{pos}(\pi_0 + z)},$$
$V_{\mu,M,2}(\pi_0) = \{V^{Bab}_{\mu}(\pi_0)\}$ for any $M$.

When $\mu$ is strictly concave in the positive region, Corollary 4 gives a non-degenerate interval of loss-aversion parameters for which the conclusion of Proposition 9 extends in a $T = 2$ setting. If $\tilde{\mu}_{pos}$ contains more curvature, then $\tilde{\mu}'_{pos}(z) / \tilde{\mu}'_{pos}(\pi_0 + z)$ becomes larger and the interval of permissible $\lambda$'s expands.

What happens when loss aversion is high? The next proposition says a new equilibrium that payoff-dominates the babbling one exists for large $\lambda$, provided the marginal utility of an infinitesimally small piece of good news is infinite, as in the power-function specification.

Proposition 11.

Fix $\tilde{\mu}_{pos} : [0, 1] \to \mathbb{R}_+$ strictly increasing and concave, continuously differentiable at $x > 0$, with $\tilde{\mu}_{pos}(0) = 0$ and $\lim_{x \to 0} \tilde{\mu}'_{pos}(x) = \infty$. Consider the family of $\lambda$-indexed news-utility functions $\mu(x) = \tilde{\mu}_{pos}(x)$, $\mu(-x) = -\lambda\tilde{\mu}_{pos}(x)$ for $x \ge 0$. For each $\pi_0 \in (0, 1)$, there exists $\bar{\lambda} \ge 1$ so that whenever $\lambda \ge \bar{\lambda}$ and for any $T \ge 2$, $|M| \ge 2$, there exists $V \in V_{\mu,M,T}(\pi_0)$ with $V > V^{Bab}_{\mu}(\pi_0)$.

To help illustrate these results, suppose $\mu(x) = \sqrt{x}$ for $x \ge 0$, $\mu(x) = -\lambda\sqrt{-x}$ for $x < 0$, $T = 2$, and $\pi_0 = [\ldots]$. Corollary 4 implies whenever $\lambda < \sqrt{[\ldots]}$, the babbling equilibrium is unique up to payoffs. On the other hand, Proposition 11 says when $\lambda$ is sufficiently high, there is another equilibrium with strictly higher payoffs. In fact, a non-babbling equilibrium first appears when $\lambda = 2.[\ldots]$ [...] $\lambda$. Receivers with higher $\lambda$ may enjoy higher equilibrium payoffs. The reason for this non-monotonicity is that for low values of $\lambda$, the babbling equilibrium is unique and increasing $\lambda$ decreases expected news utility linearly. When the new, non-babbling equilibrium emerges for large enough $\lambda$, the sender's behavior in the new equilibrium depends on $\lambda$. Higher loss aversion carries two countervailing effects: first, a non-strategic effect of hurting welfare when $\theta = B$, as the receiver must eventually hear the bad news; second, an equilibrium effect of changing the [...] $\theta = G$. Receivers with an intermediate amount of loss aversion enjoy higher expected news utility than receivers with low loss aversion, as the equilibrium effect leads to better "consumption smoothing" of good news across time. But the non-strategic effect eventually dominates, and receivers with high loss aversion experience worse payoffs than receivers with low loss aversion.

Figure 4: Highest equilibrium payoff with square-root news utility and loss aversion (x-axis: loss aversion; y-axis: equilibrium payoff). The babbling equilibrium is essentially unique for low values of $\lambda$, but there exists an equilibrium with gradual good news for $\lambda \ge 2.[\ldots]$

An equilibrium $(M, \sigma^*, p^*)$ features deterministic gradual good news (GGN equilibrium) if there exists a sequence of constants $p_0 \le p_1 \le \ldots \le p_{T-1} \le p_T$ with $p_0 = \pi_0$, $p_T = 1$, and the receiver always has belief $p_t$ in period $t$ when the state is good. By Bayesian beliefs, in the bad state of any GGN equilibrium the sender must induce a belief of either 0 or $p_t$ in period $t$, as any message not inducing belief $p_t$ is a conclusive signal of the bad state.

The class of GGN equilibria is non-empty, for it contains the babbling equilibrium where $\pi_0 = p_0 = p_1 = \ldots = p_{T-1} < p_T$. The number of intermediate beliefs in a GGN equilibrium is the number of distinct beliefs in the open interval $(\pi_0, 1)$ along the sequence $p_0, p_1, \ldots, p_{T-1}$.

Footnote: This class of equilibria is slightly more restrictive than the gradual good news, one-shot bad news information structures from Definition 2, because the sender may not randomize between several increasing paths of beliefs in the good state.
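The bad-state comparison between false hope and truth-telling, $N_B(x; \pi)$ versus $N_B(0; \pi)$ from Section 4.2, can be explored numerically for the square-root example. The sketch below is an illustration under assumptions: the prior is set to 0.5 as an assumed value, the function names are hypothetical, and the closed form in `indifference_belief` is derived here for the square-root case only (squaring $\sqrt{x - \pi} = \lambda(\sqrt{x} - \sqrt{\pi})$ and dividing by $\sqrt{x} - \sqrt{\pi}$).

```python
import math

def mu(x, lam):
    """Square-root news utility with loss-aversion coefficient lam."""
    return math.sqrt(x) if x >= 0 else -lam * math.sqrt(-x)

def N_B(x, prior, lam):
    """Bad-state news utility along the belief path prior -> x -> 0."""
    return mu(x - prior, lam) + mu(-x, lam)

def indifference_belief(prior, lam):
    """Belief x in (prior, 1) with N_B(x) = N_B(0), or None if none exists.

    For the square-root case the indifference condition reduces to
    sqrt(x) = sqrt(prior) * (lam^2 + 1) / (lam^2 - 1).
    """
    if lam <= 1:
        return None  # with lam <= 1, false hope strictly beats truth-telling
    x = prior * ((lam**2 + 1) / (lam**2 - 1)) ** 2
    return x if prior < x < 1 else None

if __name__ == "__main__":
    prior = 0.5  # assumed prior for illustration
    for lam in (1.0, 2.0, 2.5, 3.0):
        print(lam, indifference_belief(prior, lam))
```

Under these assumptions, with $\lambda = 1$ the sender always prefers false hope, which is the credibility problem; an indifference belief below 1 exists only once $\lambda$ exceeds $1 + \sqrt{2} \approx 2.414$.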

Proposition 12.

Let $P^*(\pi) \subseteq (\pi, 1)$ be those beliefs $x$ satisfying $N_B(x; \pi) = N_B(0; \pi)$. Suppose $\mu$ exhibits diminishing sensitivity and loss aversion. For $1 \le J \le T - 1$, there exists a gradual good news equilibrium with the $J$ intermediate beliefs $q^{(1)} < \ldots < q^{(J)}$ if and only if $q^{(j)} \in P^*(q^{(j-1)})$ for every $j = 1, \ldots, J$, where $q^{(0)} := \pi_0$.

To interpret, $P^*(\pi)$ contains the set of beliefs $x > \pi$ such that the sender is indifferent between inducing the two belief paths $\pi \to x \to 0$ and $\pi \to 0 \to 0$. Recall that when $\mu$ is symmetric, Lemma 2 implies this indifference condition is never satisfied, which is the source of the credibility problem for good-news messages. The same indifference condition pins down the relationship between successive intermediate beliefs in GGN equilibria.

We illustrate this result with the quadratic news utility.

Corollary 5.

1) With quadratic news utility, $P^*(\pi) = \left\{ \pi \cdot \frac{\beta_p + \beta_n}{\beta_p - \beta_n} - \frac{\alpha_n - \alpha_p}{\beta_p - \beta_n} \right\} \cap (\pi, 1)$.
2a) If $\beta_n > \beta_p$, there cannot exist any gradual good news equilibrium with more than one intermediate belief.
2b) If $\beta_n < \beta_p$, there can exist gradual good news equilibria with more than one intermediate belief. For a given set of parameters of the quadratic news-utility function and prior $\pi_0$, there exists a uniform bound on the number of intermediate beliefs that can be sustained in equilibrium across all $T$.
3) In any GGN equilibrium with quadratic news utility, intermediate beliefs in the good state grow at an increasing rate.

Combined with Proposition 12, part 1) of this corollary says that in every GGN equilibrium, the successive intermediate beliefs are related by the linear map $x \mapsto x \cdot \frac{\beta_p + \beta_n}{\beta_p - \beta_n} - \frac{\alpha_n - \alpha_p}{\beta_p - \beta_n}$. When $\beta_n > \beta_p$, this map has a negative slope, so there cannot exist any GGN equilibrium with more than one intermediate belief. When $\beta_p > \beta_n$, this map has a slope strictly larger than 1. As a result, after eliminating periods where no informative signal is released, every GGN equilibrium releases progressively larger pieces of good news in the good state, $q^{(j+1)} - q^{(j)} > q^{(j)} - q^{(j-1)}$. Since equilibrium beliefs in the good state grow at an increasing rate, there exists some uniform bound $\bar{J}$ on the number of intermediate beliefs depending only on the prior belief $\pi_0$ and parameters of the news-utility function.

Figure 5: Beliefs in GGN equilibrium with $\alpha_p = 2$, $\alpha_n = 2.1$, $\beta_p = 1$, $\beta_n = 0.2$ (x-axis: period; y-axis: belief conditional on the good state). The longest possible sequence of GGN intermediate beliefs starting with prior $\pi_0 = [\ldots]$. For quadratic news utility, equilibrium GGN beliefs always increase at an increasing rate in the good state.

As an illustration, consider the quadratic news utility with $\alpha_p = 2$, $\alpha_n = 2.1$, $\beta_p = 1$, and $\beta_n = 0.2$. Starting at the prior belief of $\pi_0 = [\ldots]$, Figure 5 shows the longest possible sequence of intermediate beliefs in any GGN equilibrium for arbitrarily large $T$. Since the $P^*$ sets are either empty sets or singleton sets for the quadratic news utility, Figure 5 also contains all the possible beliefs in any state of any GGN equilibrium with these parameters.

The result that GGN equilibria release increasingly larger pieces of good news generalizes to other news-utility functions with diminishing sensitivity. The basic intuition is that if the sender is indifferent between providing $d$ amount of false hope and truth-telling in the bad state when the receiver has prior belief $\pi_L$ (i.e., $\pi_L + d \in P^*(\pi_L)$), then she strictly prefers providing the same amount of false hope over truth-telling at any higher prior belief $\pi_H > \pi_L$. The false hope generates the same positive news utility in both cases, but an extra $d$ units of disappointment matters less when added to a baseline disappointment level of $\pi_H$ rather than $\pi_L$, thanks to diminishing sensitivity.

The next proposition formalizes this idea. It shows that when diminishing sensitivity is combined with a pair of regularity conditions, intermediate beliefs grow at an increasing rate in any GGN equilibrium.

Proposition 13.

Suppose $\mu$ exhibits diminishing sensitivity, $|P^*(\pi)| \le 1$ and $\frac{\partial}{\partial x} N_B(x; \pi)\big|_{x = \pi} > 0$ for all $\pi \in (0, 1)$. Then, in any GGN equilibrium with intermediate beliefs $q^{(1)} < \ldots < q^{(J)}$, we get $q^{(j)} - q^{(j-1)} < q^{(j+1)} - q^{(j)}$ for all $1 \le j \le J - 1$.

The first regularity condition says the sender is indifferent between the belief paths $\pi \to x \to 0$ and $\pi \to 0 \to 0$ for at most one $x > \pi$. It is a technical assumption that lets us prove our result, but we suspect the conclusion also holds under some relaxed conditions. The second regularity condition says that in the bad state, the total news utility associated with an $\epsilon$ amount of false hope is higher than truth-telling for small $\epsilon$. These conditions are satisfied by the power-function news utility with $\alpha = \beta$, for example.

Corollary 6.

In any GGN equilibrium with power-function news utility with $\alpha = \beta$ and any $\lambda \ge 1$, intermediate beliefs in the good state grow at an increasing rate.

In this section, we study a version of our information design problem without a deterministic horizon. Each period, with probability $1 - \delta \in (0, 1]$, the true state of the world is exogenously revealed to the receiver and the game ends. Until then, the informed sender communicates with the receiver each period as in the model from Section 2. We verify that our results from the finite-horizon setting extend analogously into this random-horizon environment.

Consider an environment where the consumption event takes place far in the future, but the sender is no longer the receiver's only source of information in the interim. Instead, a third party perfectly discloses the state to the receiver with some probability each period. For instance, the sender may be the chair of a central bank who has decided on the bank's monetary policy for next year and wishes to communicate this information over time, while the third party is an employee of the bank who also knows the planned policy. With some probability each period, the employee goes to the press and leaks the future policy decision.

Time is discrete with $t = 0, 1, 2, \ldots$ The sender commits to an information structure $(M, \sigma)$ at time 0. The information structure consists of a finite message space $M$ and a sequence of message strategies $(\sigma_t)_{t=1}^{\infty}$ where each $\sigma_t(\cdot \mid h^{t-1}, \theta) \in \Delta(M)$ specifies how the sender will mix over messages in period $t$ as a function of the public history $h^{t-1}$ so far and the true state $\theta$.

The sender learns the state at the beginning of period 1 and sends a message according to $\sigma_1$. At the start of each period $t = 2, 3, 4, \ldots$, there is probability $(1 - \delta) \in (0, 1]$ that the receiver exogenously and perfectly learns the state $\theta$. If so, the game effectively ends because no further communication from the sender can change the receiver's belief. If not, then the sender sends the next message according to $\sigma_t$. The randomization over exogenous learning is i.i.d. across periods, so the time of state revelation (i.e., the horizon of the game) is a geometric random variable.

Let $V_\delta : [0, 1] \to \mathbb{R}$ be the value function of the problem with continuation probability $\delta$. That is, $V_\delta(p)$ is the highest possible total expected news utility up to the period of state revelation, when the receiver holds belief $p$ in the current period and state revelation does not happen this period. The value function satisfies the recursion $V_\delta(p) = \tilde{V}_\delta(p \mid p)$, where
$$\tilde{V}_\delta(\cdot \mid p) := \mathrm{cav}_q \big[\mu(q - p) + \delta V_\delta(q) + (1 - \delta)\big(q \cdot \mu(1 - q) + (1 - q) \cdot \mu(-q)\big)\big].$$
Ely (2017) studies an infinite-horizon information design problem whose value function also involves concavification. Unlike in Ely (2017), the current belief enters the objective function for our news-utility problem.

Our first result shows this recursion has a unique solution which increases in $\delta$ for any fixed $p \in [0, 1]$.

Proposition 14.

For every $\delta \in [0,1)$, the value function $V_\delta$ exists and is unique. Furthermore, $V_\delta(p)$ is increasing in $\delta$ for every $p \in [0,1]$.

Figure 6 illustrates this result by plotting $V_\delta(p)$ for the quadratic news utility with $\alpha_p = 2$, $\alpha_n = 2.$…, $\beta_p = 1$, and $\beta_n = 0.$…, for three values of $\delta$: 0, 0.8, and 0.95. (In fact, the monotonicity of the value function in $\delta$ also holds when there are more than two states.)

The monotonicity of $V_\delta$ in $\delta$ says that when the sender is benevolent and has commitment power, third-party leaks are harmful for the receiver’s expected welfare. This result can be explained intuitively as follows. Just as with increasing $T$ in the finite-horizon model, increasing $\delta$ expands the set of implementable belief paths. The idea behind implementing a payoff from a shorter horizon / lower $\delta$ is that the sender switches to babbling forever after certain histories. This switching happens at a deterministic calendar time in the finite-horizon setting but at a random time in the random-horizon setup, mimicking the random arrival of the state revelation period.

[Figure 6 (three panels): "Value function with d = 0", "Value function with d = 0.8", "Value function with d = 0.95"; horizontal axis $p$, vertical axis $V(p)$.]
Figure 6: The value function for $\delta = 0, 0.8, 0.95$. Consistent with Proposition 14, the value function is pointwise higher for higher $\delta$.

Now we turn to equilibria of the random-horizon cheap talk game when the sender lacks commitment power. Analogously to the case of finite horizon, a strict gradual good news equilibrium (strict GGN) features a deterministic sequence of increasing posteriors $q^{(0)}$

Let $P^*(\pi) \subseteq (\pi, 1)$ be those beliefs $p$ satisfying $N_B(p;\pi) = N_B(0;\pi)$. Suppose $\mu$ exhibits diminishing sensitivity and loss aversion. There exists a gradual good news equilibrium with a (possibly infinite) sequence of intermediate beliefs $q^{(1)} < q^{(2)} < \dots$ if and only if $q^{(j)} \in P^*(q^{(j-1)})$ for every $j = 1, 2, \dots$, where $q^{(0)} := \pi_0$.

The $P^*$ set is the same in the finite- and random-horizon environments. Corollary 6 then implies that even in the random-horizon environment where the game could continue for arbitrarily many periods, intermediate beliefs grow at an increasing rate in GGN equilibria for quadratic and square-root $\mu$, and there exists a finite bound on the number of periods of informative communication that applies for all $\delta \in [0,1)$.

The literature on reference-dependent preferences and news utility has focused on two-part linear gain-loss utility functions, which violate diminishing sensitivity. If $\mu$ is two-part linear with loss aversion, then it follows from the martingale property of Bayesian beliefs that one-shot resolution is weakly optimal for the sender among all information structures. If there is strict loss aversion, then one-shot resolution does strictly better than any information structure that resolves uncertainty gradually. As our results have shown, more nuanced information structures emerge as optimal when the receiver exhibits diminishing sensitivity.

In our setup, a receiver who experiences anticipatory utility gets $A\left(\sum_{\theta} \pi_t(\theta)\cdot v(c_\theta)\right)$ if she ends period $t$ with posterior belief $\pi_t \in \Delta(\Theta)$, where $A : \mathbb{R} \to \mathbb{R}$ is a strictly increasing anticipatory-utility function. When $A$ is the identity function (as in Kőszegi (2006)), the solution to the sender’s problem would be unchanged if we modified our model and let the receiver experience both anticipatory utility and news utility. This is because by the martingale property, the receiver’s ex-ante expected anticipatory utility in a given period is the same across all information structures.
So, the ranking of information structures entirely depends on the news utility they generate. For a general $A$, if the receiver only experiences anticipatory utility, not news utility, then the sender has an optimal information structure that only releases information in $t = 1$, followed by uninformative babbling in all subsequent periods. For instance, Schweizer and Szech (2018) show that the best non-instrumental medical test for a patient with a concave anticipatory-utility function $A$ is fully uninformative. The above argument establishes that even if the doctor can give the patient a series of tests on different days and even if $A$ is not concave, the optimal test design involves a possibly informative test on the first day, followed by uninformative tests on all subsequent days. The rich dynamics of the optimal information structure in our news-utility model are thus absent in an anticipatory-utility model.

A key distinction of our model from Ely, Frankel, and Kamenica (2015) is that changes in beliefs may bring utility or disutility to the receiver, depending on the nature of the news. By contrast, agents with suspense or surprise utilities always derive greater utility from larger movements in beliefs, regardless of the directions of these movements.

Ely, Frankel, and Kamenica (2015) also discuss state-dependent versions of suspense and surprise utilities, but this extension does not embed our model either. Suppose there are two states, $\Theta = \{G, B\}$, and the agent has the suspense objective $\sum_{t=0}^{T-1} u\left(E_t\left(\sum_{\theta} \alpha_\theta \cdot (\pi_{t+1}(\theta) - \pi_t(\theta))^2\right)\right)$ or the surprise objective $\sum_{t=1}^{T} u\left(\sum_{\theta} \alpha_\theta \cdot (\pi_t(\theta) - \pi_{t-1}(\theta))^2\right)$, where $\alpha_G, \alpha_B > 0$. But we always have $\pi_{t+1}(G) - \pi_t(G) = -(\pi_{t+1}(B) - \pi_t(B))$, so path-wise $(\pi_{t+1}(G) - \pi_t(G))^2 = (\pi_{t+1}(B) - \pi_t(B))^2$.
This shows that the new objectives obtained by applying two possibly different scaling weights $\alpha_G \ne \alpha_B$ to states $G$ and $B$ are identical to the ones that would be obtained by applying the same scaling weight $\alpha = \frac{\alpha_G + \alpha_B}{2}$ to both states. Due to this symmetry in preference, the optimal information structure for entertaining an agent with state-dependent suspense or surprise utility does not treat the two states asymmetrically, in contrast to a central prediction of diminishing sensitivity in our model.

In this work, we have studied how an informed sender optimally communicates with a receiver who derives diminishing gain-loss utility from changes in beliefs. If we think that diminishing sensitivity to the magnitude of news is psychologically realistic in this domain, then the stark predictions of the ubiquitous two-part linear models may be misleading. In the presence of diminishing sensitivity, richer information structures emerge as optimal for the committed sender. For example, the optimal information structure can feature asymmetric treatments of good and bad news. If the sender lacks commitment power, diminishing sensitivity leads to novel credibility problems that inhibit any meaningful communication when the receiver has no loss aversion.

Some of our predictions can empirically distinguish news utility with diminishing sensitivity from other models of belief-based preference over non-instrumental information, including the two-part linear news-utility model. Proposition 7, for example, suggests a laboratory experiment where a sequence of binary events determines whether a baseline state or an alternative state realizes, with the alternative state happening only if all of the binary events are “successful.” Consider two treatments that have the same success probabilities for the binary events, but differ in terms of whether subjects get a higher consumption or a lower consumption in the alternative state compared with the baseline state.
Diminishing sensitivity over news predicts that more subjects should prefer one-shot resolution when consumption is lower in the alternative state than when it is higher in the alternative state, a hypothesis we plan to test in future work.
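
As a rough numerical illustration of the comparison behind this proposed experiment, the following Python sketch computes expected news utility under gradual and one-shot resolution for a sequence of binary events. It is only a sketch under stated assumptions: the function names, the three fair binary events, and the power-function news utility ($x^{a}$ for gains and $-\lambda |x|^{a}$ for losses, with $a = 0.5$ and $\lambda = 1$) are illustrative choices and are not taken from the paper.

```python
import math

def mu(x, a=0.5, lam=1.0):
    """Power-function news utility: x**a for gains, -lam*|x|**a for losses."""
    return x ** a if x >= 0 else -lam * (-x) ** a

def belief_paths(success_probs):
    """Yield (probability, belief path) pairs, where beliefs are the
    probability that the alternative state (all events succeed) realizes."""
    T = len(success_probs)
    # pi[t]: belief after the first t events have all succeeded
    pi = [math.prod(success_probs[t:]) for t in range(T + 1)]
    reach = 1.0  # probability that events 1..k-1 all succeeded
    for k in range(1, T + 1):
        # first failure at event k: belief falls to 0 and stays there
        yield reach * (1 - success_probs[k - 1]), pi[:k] + [0.0]
        reach *= success_probs[k - 1]
    yield reach, pi  # all events succeed, so the belief ends at 1

def expected_news_utility(success_probs, prefers_alternative, gradual, **mu_kw):
    """Expected total news utility under gradual or one-shot resolution."""
    # For a subject who prefers the baseline state, a rise in pi is bad news.
    sign = 1.0 if prefers_alternative else -1.0
    total = 0.0
    for prob, path in belief_paths(success_probs):
        if not gradual:
            path = [path[0], path[-1]]  # one jump from the prior to the truth
        total += prob * sum(mu(sign * (nxt - prev), **mu_kw)
                            for prev, nxt in zip(path, path[1:]))
    return total

events = [0.5, 0.5, 0.5]  # hypothetical success probabilities
for prefers_alt in (True, False):
    g = expected_news_utility(events, prefers_alt, gradual=True)
    o = expected_news_utility(events, prefers_alt, gradual=False)
    print(f"prefers alternative={prefers_alt}: gradual={g:.4f}, one-shot={o:.4f}")
```

With these illustrative inputs, the subject who prefers the alternative state does better under gradual resolution, while the subject who prefers the baseline state does better under one-shot resolution, which is the asymmetry the two treatments are meant to detect.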

References

Augenblick, N. and M. Rabin (2018): “Belief movement, uncertainty reduction, and rational updating,” Working Paper.

Aumann, R. J. and M. B. Maschler (1995): Repeated Games with Incomplete Information, Cambridge, MA: MIT Press.

Bowman, D., D. Minehart, and M. Rabin (1999): “Loss aversion in a consumption–savings model,” Journal of Economic Behavior and Organization, 38, 155–178.

Brunnermeier, M. K. and J. A. Parker (2005): “Optimal expectations,” American Economic Review, 95, 1092–1118.

Dillenberger, D. and C. Raymond (2018): “Additive-Belief-Based Preferences,” Working Paper.

Duraj, J. (2018): “Mechanism Design with News Utility,” Working Paper.

Eliaz, K. and R. Spiegler (2006): “Can anticipatory feelings explain anomalous choices of information sources?” Games and Economic Behavior, 56, 87–104.

Ely, J., A. Frankel, and E. Kamenica (2015): “Suspense and surprise,” Journal of Political Economy, 123, 215–260.

Ely, J. C. (2017): “Beeps,” American Economic Review, 107, 31–53.

Kahneman, D. and A. Tversky (1979): “Prospect Theory: An Analysis of Decision under Risk,” Econometrica, 47, 263–292.

Kamenica, E. and M. Gentzkow (2011): “Bayesian Persuasion,” American Economic Review, 101, 2590–2615.

Kőszegi, B. (2006): “Emotional agency,” Quarterly Journal of Economics, 121, 121–155.

Kőszegi, B. and M. Rabin (2009): “Reference-dependent consumption plans,” American Economic Review, 99, 909–36.

Li, F. and P. Norman (2018): “Sequential persuasion,” Working Paper.

Lipnowski, E. and L. Mathevet (2018): “Disclosure to a psychological audience,” American Economic Journal: Microeconomics, 10, 67–93.

Macera, R. (2014): “Dynamic beliefs,” Games and Economic Behavior, 87, 1–18.

O’Donoghue, T. and C. Sprenger (2018): “Reference-dependent preferences,” Handbook of Behavioral Economics-Foundations and Applications 1, 1.

Pagel, M. (2016): “Expectations-based reference-dependent preferences and asset pricing,” Journal of the European Economic Association, 14, 468–514.

Pagel, M. (2017): “Expectations-based reference-dependent life-cycle consumption,” Review of Economic Studies, 84, 885–934.

Pagel, M. (2018): “A News-Utility Theory for Inattention and Delegation in Portfolio Choice,” Econometrica, 86, 491–522.

Schweizer, N. and N. Szech (2018): “Optimal revelation of life-changing information,” Management Science, 64, 5250–5262.

Tversky, A. and D. Kahneman (1992): “Advances in prospect theory: Cumulative representation of uncertainty,” Journal of Risk and Uncertainty, 5, 297–323.

Wu, W. (2018): “Sequential Bayesian persuasion,” Working Paper.

Appendix

A Proofs

In the proofs, we will often use the following fact about news-utility functions with diminishing sensitivity. We omit its simple proof.

Fact 1. Let $d_1, d_2 > 0$ and suppose $\mu(0) = 0$.
• (sub-additivity in gains) If $\mu''(x) < 0$ for all $x > 0$, then $\mu(d_1 + d_2) < \mu(d_1) + \mu(d_2)$.
• (super-additivity in losses) If $\mu''(x) > 0$ for all $x < 0$, then $\mu(-d_1 - d_2) > \mu(-d_1) + \mu(-d_2)$.

A.1 Proof of Proposition 1

Proof.

We first justify by backwards induction that the value function is indeed given by $U^*_t(x) = (\operatorname{cav} U_t(\cdot \mid x))(x)$ for all $x \in \Delta(\Theta)$ and all $t \le T-1$, and that it is continuous in $x$.

If the receiver enters period $t = T-1$ with belief $x \in \Delta(\Theta)$, the sender faces the following maximization problem:

$$(P_{T-1}) \qquad \max_{\mu \in \Delta(\Delta(\Theta)),\ E[\mu] = x} \int_{\Delta(\Theta)} U_{T-1}(p \mid x)\, d\mu(p).$$

This is because any sender strategy $\sigma_{T-1}$ induces a Bayes plausible distribution of posterior beliefs, $\mu$ with $E[\mu] = x$, and conversely every such distribution can be generated by some sender strategy, as in Kamenica and Gentzkow (2011). It is well-known that the value of problem $P_{T-1}$ is $(\operatorname{cav} U_{T-1}(\cdot \mid x))(x)$, justifying $U^*_{T-1}(x)$ as the value function for any $x \in \Delta(\Theta)$. The objective in $P_{T-1}$ is continuous in $p$ (by assumption on $N$) and hence in $\mu$, and furthermore the constraint set $\{\mu \in \Delta(\Delta(\Theta)) : E[\mu] = x\}$ is continuous in $x$. Therefore, $x \mapsto U^*_{T-1}(x)$ is continuous by Berge’s Maximum Theorem.

Assume that we have shown that the value function is continuous and given by $U^*_t(x)$ for all $t \ge S$. If the receiver enters period $t = S-1$ with belief $x$, then the sender’s value must be

$$(P_t) \qquad \max_{\mu \in \Delta(\Delta(\Theta)),\ E[\mu] = x} \int_{\Delta(\Theta)} N(p \mid x) + U^*_{t+1}(p)\, d\mu(p),$$

using the inductive hypothesis that $U^*_{t+1}(p)$ is the period $t+1$ value function. But $N(p \mid x) + U^*_{t+1}(p) = U_t(p \mid x)$ by definition, and it is continuous by the inductive hypothesis. So by the same arguments as in the base case, $U^*_{S-1}(x)$ is the time-$(S-1)$ value function and it is continuous, completing the inductive step.

In the first period, by Carathéodory’s theorem, there exist weights $w_1, \dots, w_K \ge 0$, beliefs $q_1, \dots, q_K \in \Delta(\Theta)$, with $\sum_{k=1}^{K} w_k = 1$, $\sum_{k=1}^{K} w_k q_k = \pi_0$, such that $U^*_1(\pi_0) = \sum_{k=1}^{K} w_k U_1(q_k \mid \pi_0)$. Having now shown $U^*_2$ is the period-2 value function, there must exist an optimal information structure where $\sigma_1(\cdot \mid \theta)$ induces beliefs $q_k$ with probability $w_k$. This information structure induces one of the beliefs $q_1, \dots, q_K$ in the second period. Repeating the same procedure for subsequent periods establishes the proposition.

A.2 Proof of Proposition 2

Proof.

Suppose $T = 2$. Consider the following family of information structures, indexed by $\epsilon > 0$. Order the states based on $E_{c \sim F_\theta}[v(c)]$ and label them $\theta_L, \theta_2, \dots, \theta_{K-1}, \theta_H$. Let $M = \{m_L, m_2, \dots, m_{K-1}, m_H\}$. Let $\sigma_t(\theta_k)(m_k) = 1$ for $2 \le k \le K-1$, $\sigma_t(\theta_H)(m_H) = 1$, and $\sigma_t(\theta_L)(m_L) = x$, $\sigma_t(\theta_L)(m_H) = 1 - x$ for some $x \in (0,1)$ so that the posterior belief after observing $m_H$ is $(1-\epsilon)1_H \oplus \epsilon 1_L$.

For every $\epsilon > 0$, the information structure just described leads to one-shot resolution of states $\theta \notin \{\theta_L, \theta_H\}$. The difference between its expected news utility and that of one-shot resolution is $W(\epsilon)$, given by

$$\pi_0(\theta_H)\cdot\left[N((1-\epsilon)1_H \oplus \epsilon 1_L \mid \pi_0) + N(1_H \mid (1-\epsilon)1_H \oplus \epsilon 1_L) - N(1_H \mid \pi_0)\right] + \frac{\epsilon}{1-\epsilon}\,\pi_0(\theta_H)\cdot\left[N((1-\epsilon)1_H \oplus \epsilon 1_L \mid \pi_0) + N(1_L \mid (1-\epsilon)1_H \oplus \epsilon 1_L) - N(1_L \mid \pi_0)\right].$$

$W$ is continuously differentiable away from 0 and $W(0) = 0$. To show that $W(\epsilon) > 0$ for some $\epsilon > 0$, it suffices that $\lim_{\epsilon \to 0^+} W'(\epsilon) > 0$. Using the continuous differentiability of $N$, this limit has the same sign as

$$\lim_{\epsilon \to 0^+} \frac{N((1-\epsilon)1_H \oplus \epsilon 1_L \mid \pi_0) - N(1_H \mid \pi_0)}{\epsilon} + \lim_{\epsilon \to 0^+} \frac{N(1_H \mid (1-\epsilon)1_H \oplus \epsilon 1_L)}{\epsilon} + N(1_H \mid \pi_0) + N(1_L \mid 1_H) - N(1_L \mid \pi_0).$$

Simple rearrangement gives the expression from Proposition 2. The expression for the case of mean-based $\mu$ follows by algebra, noting that $N((1-x)1_H \oplus x 1_L \mid \pi_0) = \mu((1-x) - v_0)$ for $x \in [0,1]$.

If $T > 2$, then note the sender’s $T$-period problem starting with prior $\pi_0$ has a value at least as large as the 2-period problem with the same prior. On the other hand, one-shot resolution brings the same total expected news utility regardless of $T$.

A.3 Proof of Corollary 1

Proof.

We verify Proposition 2’s condition $\mu'(0^+) + \mu(1-\pi_0) - \mu(-\pi_0) > -\mu(-1) + \mu'(1-\pi_0)$. We have that

$$\text{LHS} = \alpha_p + \alpha_p(1-\pi_0) - \beta_p(1-\pi_0)^2 - \left[\beta_n\pi_0^2 - \alpha_n\pi_0\right],$$
$$\text{RHS} = \left[-\beta_n + \alpha_n\right] + \left[\alpha_p - 2\beta_p(1-\pi_0)\right].$$

By algebra, $\text{LHS} - \text{RHS} = (1-\pi_0)(\alpha_p - \alpha_n) + (1-\pi_0^2)(\beta_p + \beta_n)$. Given that $(\alpha_n - \alpha_p) \le (\beta_p + \beta_n)$ and $1 - \pi_0^2 > 1 - \pi_0$ for $0 < \pi_0 < 1$,

$$\text{LHS} - \text{RHS} > -(1-\pi_0)(\beta_p + \beta_n) + (1-\pi_0)(\beta_p + \beta_n) = 0.$$

A.4 Proof of Corollary 2

Proof.

This follows from Proposition 2 because $\mu'(0^+) = \infty$ for the power function.

[Figure A.1: beliefs about consumption plotted by percentile against consumption levels $c$, with one curve for the prior belief and one for the new belief.]
Figure A.1: New belief about consumption after the muddled message $m^*$ in an environment with 4 states, compared with the old belief given by the prior $\pi_0$.

A.5 Proof of Proposition 3

Proof.

Suppose $\Theta = \{\theta_1, \dots, \theta_K\}$ and assume without loss the states are associated with consumption levels $c_1 < \dots < c_K$. Let the message space be $M = \{m_1, \dots, m_K, m^*\}$. In the first period,
• $\sigma_1(m_k \mid \theta_k) = 1$ for $1 \le k \le K-2$,
• $\sigma_1(m^* \mid \theta_{K-1}) = 1$,
• $\sigma_1(m^* \mid \theta_K) = \frac{\pi_0(\theta_{K-1})}{1 - \pi_0(\theta_K)}$,
• $\sigma_1(m_K \mid \theta_K) = 1 - \sigma_1(m^* \mid \theta_K)$.

So, message $m_k$ perfectly reveals state $\theta_k$, whereas $m^*$ is a “muddled” message that implies the state is either $\theta_{K-1}$ or $\theta_K$. By simple algebra, the probability that the receiver assigns to state $\theta_K$ after $m^*$ is the same as the prior belief,

$$P[\theta_K \mid m^*] = \frac{\pi_0(\theta_K)\cdot\sigma_1(m^* \mid \theta_K)}{\pi_0(\theta_K)\cdot\sigma_1(m^* \mid \theta_K) + \pi_0(\theta_{K-1})\cdot 1} = \pi_0(\theta_K).$$

In the second period, the information structure perfectly reveals the true state regardless of the last message, $\sigma_2(m_k \mid \theta_k) = 1$ for all $1 \le k \le K$.

To compute the news utility of the muddled message $m^*$, note that at percentiles $p \in [0, \pi_0(\theta_1))$, the change in $p$-percentile consumption utility is $v(c_{K-1}) - v(c_1)$. Similarly, for $2 \le k \le K-2$, the change in consumption utility at percentile $p \in \left[\sum_{j=1}^{k-1}\pi_0(\theta_j),\ \pi_0(\theta_k) + \sum_{j=1}^{k-1}\pi_0(\theta_j)\right)$ is $v(c_{K-1}) - v(c_k)$. There are no changes at percentiles above $\sum_{j=1}^{K-2}\pi_0(\theta_j)$.

If $\theta = \theta_{K-1}$, total news utility from receiving $m^*$ then $m_{K-1}$ is

$$\underbrace{\left[\sum_{k=1}^{K-2}\pi_0(\theta_k)\cdot\mu\big(v(c_{K-1}) - v(c_k)\big)\right]}_{\text{from } m^* \text{ in period 1}} + \underbrace{\pi_0(\theta_K)\cdot\mu\big(v(c_{K-1}) - v(c_K)\big)}_{\text{from } m_{K-1} \text{ in period 2}}.$$

This is identical to the news utility from one-shot resolution in state $\theta_{K-1}$. Similarly, the information structure just constructed gives the same news utility as one-shot resolution when the state is $\theta_k$ for $1 \le k \le K-2$, and when the state is $\theta_K$ and the receiver gets $m_K$ in period 1.

When the receiver sees $m^*$ in period 1 and $m_K$ in period 2 in state $\theta_K$, an event that happens with strictly positive probability since $\pi_0(\theta_{K-1}) < 1 - \pi_0(\theta_K)$ as $K \ge 3$, he gets strictly more news utility than from one-shot resolution. If $\theta = \theta_K$, total news utility from receiving $m^*$ then $m_K$ is

$$\underbrace{\left[\sum_{k=1}^{K-2}\pi_0(\theta_k)\cdot\mu\big(v(c_{K-1}) - v(c_k)\big)\right]}_{\text{from } m^* \text{ in period 1}} + \underbrace{\left[\sum_{k=1}^{K-1}\pi_0(\theta_k)\cdot\mu\big(v(c_K) - v(c_{K-1})\big)\right]}_{\text{from } m_K \text{ in period 2}},$$

while one-shot resolution gives

$$\sum_{k=1}^{K-1}\pi_0(\theta_k)\cdot\mu\big(v(c_K) - v(c_k)\big).$$

For each $1 \le k \le K-2$ (at least one such $k$ exists since $K \ge 3$), $\mu(v(c_K) - v(c_{K-1})) + \mu(v(c_{K-1}) - v(c_k)) > \mu(v(c_K) - v(c_k))$ by sub-additivity in gains. This shows the constructed information structure gives strictly more news utility.

A.6 Proof of Proposition 5

Proof.

Let (!) be the following geometric condition: the concavification of $U(p \mid \pi_0)$ involves a linear segment starting at the pair $p = 0$, $U(0 \mid \pi_0)$ which is strictly above $U(\pi_0 \mid \pi_0)$ when evaluated at $p = \pi_0$. We need to show that (!) holds true if and only if partial bad news is suboptimal. It is clear that whenever the geometric condition (!) is satisfied, partial bad news is suboptimal as the posterior induced in the bad state must be equal to 0 with probability one. On the other hand, knowing that the posterior induced in the bad state is 0 with probability 1 implies two possibilities: (i) either perfect revelation of the state is optimal, or (ii) the optimal information structure involves partial good news and perfect revelation of the bad state. In either case, the only posterior induced in the bad state is 0, i.e. the concavification has to include the point $(0, U(0 \mid \pi_0))$. From the definition of concavification and the fact that it is supported on two points of the graph of $q \mapsto U(q \mid \pi_0)$, it follows that the concavification has to include a linear segment starting at $(0, U(0 \mid \pi_0))$, thus (!) should hold true.

Because of the two-point support feature of the concavification and the fact that the average of the posteriors needs to be equal to the prior $\pi_0 \in (0,1)$, there is some $q > \pi_0$, which is the second point of support for the concavification and the linear segment. In case (i) above, it holds $q = 1$ whereas in case (ii) it holds $q < 1$.

A.7 Proof of Corollary 3

We first prove a sufficient condition for the sub-optimality of information structures with partial bad news with $T = 2$. Consider the chord connecting $(0, U(0 \mid \pi_0))$ and $(\pi_0, U(\pi_0 \mid \pi_0))$ and let $\ell(x)$ be its height at $x \in [0, \pi_0]$. Let $D(x) := \ell(x) - U(x \mid \pi_0)$.

Lemma A.1. For this chord to lie strictly above $U(p \mid \pi_0)$ for all $p \in (0, \pi_0)$, it suffices that $D'(0) > 0$, $D'(\pi_0) < 0$, and $D''(p) = 0$ for at most one $p \in (0, \pi_0)$.

Proof. We need $D > 0$ in the interval $(0, \pi_0)$. We know that $D(0) = D(\pi_0) = 0$. Given the conditions in the statement and the twice-differentiability of $D$ in $(0, \pi_0)$ it follows that $D''$ changes sign at most once. Moreover, it also follows that $D > 0$ in a right-neighborhood of $x = 0$ and a left-neighborhood of $x = \pi_0$. Suppose $D$ has an interior minimum at $x_0 \in (0, \pi_0)$. Then it holds $D''(x_0) \ge 0$.

Suppose first that $D''(x) > 0$ for $x$ near enough to 0. Then it follows $x_0 \le p$, where we set $p = \pi_0$ if $p$ doesn’t exist. Because $D''(x) \ge 0$ for $x \le p$ we have that $D'(x) > 0$ for all $x \le p$. In particular also $D(x_0) > 0$ due to the Fundamental Theorem of Calculus. Thus, the interior minimum is positive and so the claim about $D$ in $(0, \pi_0)$ is proven in this case.

Suppose instead that $D''(x) < 0$ for $x$ near enough to 0. Then it follows that $x_0 \ge p$. In particular, for all $x > p$ we have $D''(x) > 0$. Since the derivative is strictly increasing for all $x \in (x_0, \pi_0)$ and $D'(\pi_0) < 0$, we have $D'(x) < 0$ for all $x \in (x_0, \pi_0)$. In particular, from the Fundamental Theorem of Calculus, $D(\pi_0)$ is strictly below $D(x_0)$. Since $D(\pi_0) = 0$ we have again that $D(x_0) > 0$. From the properties of $D$ and the signs of the derivatives at 0 and $\pi_0$, together with the fact that any interior minimum of $D$ is strictly positive, we have covered all cases and so shown that $D > 0$ in $(0, \pi_0)$.

Now we verify that the condition in Lemma A.1 holds for the quadratic news utility, which in turn verifies the condition of Proposition 5 for $q = \pi_0$ and shows partial bad news information structures to be strictly suboptimal.

Proof. Clearly, $D(p)$ is a third-order polynomial, so $D''(p)$ has at most one root. For $p < \pi_0$, we have the derivative

$$\frac{d}{dp}U(p \mid \pi_0) = 2\beta_n(p - \pi_0) + \alpha_n + \alpha_p(1-p) - \beta_p(1-p)^2 + p\big(-\alpha_p + 2\beta_p(1-p)\big) - (\beta_n p^2 - \alpha_n p) + (1-p)(2\beta_n p - \alpha_n).$$

The slope of the chord between 0 and $\pi_0$ is $\alpha_p - \beta_p + (2\beta_p - \alpha_p + \alpha_n)\pi_0 - (\beta_p + \beta_n)\pi_0^2$. So, after straightforward algebra, $D'(0) = (2(\beta_p + \beta_n) - (\alpha_p - \alpha_n))\pi_0 - (\beta_p + \beta_n)\pi_0^2$. Applying weak loss aversion with $z = 1$, $\alpha_p - \alpha_n \le \beta_p - \beta_n$. This shows

$$D'(0) \ge (2(\beta_p + \beta_n) - (\beta_p - \beta_n))\pi_0 - (\beta_p + \beta_n)\pi_0^2 = (\beta_p + \beta_n)\pi_0(1 - \pi_0) + 2\beta_n\pi_0 > 0$$

for $0 < \pi_0 < 1$.

We also derive $D'(\pi_0) = (\alpha_p - 2\beta_p - 2\beta_n - \alpha_n)\pi_0 + (2\beta_p + 2\beta_n)\pi_0^2$. Note that this is a convex parabola in $\pi_0$, with a root at 0. Also, the parabola evaluated at 1 is equal to $\alpha_p - \alpha_n \le 0$, where the inequality comes from the weak loss aversion with $z = 0$. This implies $D'(\pi_0) < 0$ for $0 < \pi_0 < 1$.

A.8 Proof of Proposition 4

Proof.

We show that one-shot resolution gives weakly higher news utility conditional on each state, and strictly higher news utility conditional on at least one $\theta \in \Theta_B$.

When $\theta \in \Theta_B$, $P_{(M,\sigma)}$-almost surely the expectations in different periods form a decreasing sequence $v_0 \ge v_1 \ge \dots \ge v_T = v(c_\theta)$. By super-additivity in losses, $\sum_{t=1}^{T}\mu(v_t - v_{t-1}) \le \mu(v_T - v_0) = \mu(v(c_\theta) - v_0)$. This shows $P_{(M,\sigma)}$-almost surely the ex-post news utility in state $\theta$ is no larger than $\mu(v(c_\theta) - v_0)$, the news utility from one-shot resolution.

Let $E$ be the event where the receiver’s expectation strictly decreases two or more times. From the definition of strict gradual bad news, there exists some $\theta^* \in \Theta_B$ so that $P_{(M,\sigma)}[E \mid \theta^*] > 0$. On $E \cap \{\theta^*\}$, $\sum_{t=1}^{T}\mu(v_t - v_{t-1}) < \mu(v(c_{\theta^*}) - v_0)$ from super-additivity in losses, which means the expected news utility conditional on $E \cap \{\theta^*\}$ is strictly lower than that of one-shot resolution. Combined with the fact that the ex-post news utility in state $\theta^*$ is always weakly lower than $\mu(v(c_{\theta^*}) - v_0)$, this shows expected news utility in state $\theta^*$ is strictly lower than that of one-shot resolution.

Conditional on any state $\theta \in \Theta_G$, there is some random period $t^* \in \{0, \dots, T-1\}$ so that $v_t$ is weakly decreasing up to $t = t^*$ and $v_t = v(c_\theta)$ for $t > t^*$. If $t^* = 0$, then this belief path yields the same news utility as one-shot resolution. If $t^* \ge 1$, then the total news utility is

$$\sum_{t=1}^{t^*}\mu(v_t - v_{t-1}) + \mu(v(c_\theta) - v_{t^*}).$$

By super-additivity in losses,

$$\sum_{t=1}^{t^*}\mu(v_t - v_{t-1}) \le \mu(v_{t^*} - v_0),$$

and by sub-additivity in gains,

$$\mu(v(c_\theta) - v_{t^*}) \le \mu(v_0 - v_{t^*}) + \mu(v(c_\theta) - v_0),$$

as we must have $v_{t^*} \le v_0$. Total news utility is therefore bounded above by $\mu(v_{t^*} - v_0) + \mu(v_0 - v_{t^*}) + \mu(v(c_\theta) - v_0)$. By weak loss aversion, $\mu(v_{t^*} - v_0) + \mu(v_0 - v_{t^*}) \le 0$, therefore total news utility is no larger than that of one-shot resolution, $\mu(v(c_\theta) - v_0)$.
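
The path-by-path comparisons in this proof can be checked numerically. The sketch below assumes a power-function news utility with illustrative parameters (exponent 0.5 and loss-aversion coefficient 2, neither taken from the paper) and uses hypothetical helper names. It compares the total news utility along a path of expectations with the news utility of reaching the same endpoint in one step, once for a gradually falling path (a bad state) and once for a path that falls before jumping up (a good state).

```python
def mu(x, a=0.5, lam=2.0):
    """Power-function news utility with loss aversion (illustrative values)."""
    return x ** a if x >= 0 else -lam * (-x) ** a

def path_news_utility(path):
    """Total news utility along a path of expectations v_0, ..., v_T."""
    return sum(mu(nxt - prev) for prev, nxt in zip(path, path[1:]))

def one_shot_news_utility(path):
    """News utility when the same endpoint is reached in a single step."""
    return mu(path[-1] - path[0])

# Bad state: expectations fall gradually to the realized consumption utility.
bad_path = [0.6, 0.45, 0.3, 0.0]
# Good state: expectations first fall, then jump to the realized value.
good_path = [0.6, 0.4, 1.0]

for name, path in [("bad state", bad_path), ("good state", good_path)]:
    print(name, path_news_utility(path), one_shot_news_utility(path))
```

In both cases the multi-step path yields no more news utility than one-shot resolution, which matches the two halves of the argument: super-additivity in losses for the bad state, and sub-additivity in gains combined with weak loss aversion for the good state.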
A.9 Proof of Proposition 6

Proof.

We have

$$\frac{d}{dp}U(p \mid \pi_0) = 2\alpha_p - \alpha_n - \beta_p + 2\beta_p\pi_0 + p(-2\alpha_p + 2\beta_p + 2\alpha_n + 2\beta_n) + 3p^2(-\beta_p - \beta_n).$$

Further, $p$ times the slope of the chord is

$$U(p \mid \pi_0) - U(0 \mid \pi_0) = U(p \mid \pi_0) - (\beta_n\pi_0^2 - \alpha_n\pi_0) = \pi_0(-\alpha_p + \alpha_n) + \pi_0^2(-\beta_p - \beta_n) + p(2\alpha_p - \alpha_n - \beta_p) + p^2(-\alpha_p + \beta_p + \alpha_n + \beta_n) + p^3(-\beta_p - \beta_n) + p\pi_0(2\beta_p).$$

Equating $p\cdot\frac{d}{dp}U(p \mid \pi_0) = U(p \mid \pi_0) - U(0 \mid \pi_0)$, we get

$$\pi_0(\alpha_n - \alpha_p) - (\beta_p + \beta_n)\pi_0^2 = p^2(\alpha_n - \alpha_p + \beta_n + \beta_p) - p^3(2\beta_p + 2\beta_n).$$

Define $c = \frac{\alpha_n - \alpha_p}{\beta_n + \beta_p}$. Then, we can write the implicit function as $\pi_0 c - \pi_0^2 = p^2(1+c) - 2p^3$. That this has a solution for every $0 \le c \le 1$ and $\pi_0 \in (0,1)$ follows from the fact that the chord condition is always satisfied for the quadratic specification. We would like to take derivatives of, and possibly solve explicitly for, the function $p(\pi_0, c)$.

To apply the implicit function theorem, define the function $f(\pi_0, p, c) = p^2(1+c) - 2p^3 - \pi_0 c + \pi_0^2$ with domain $(0,1)^3$; then we need $\partial_p f(\pi_0, p, c) \ne 0$. If this is true, then we can solve locally for $p(\pi_0, c)$ and then also calculate locally the comparative statics below. We note that $\partial_p f(\pi_0, p, c) = 2p(1+c) - 6p^2$. Thus, the only constellation where this would be zero is $p = \frac{1+c}{3} =: \hat{p} \in \left[\frac{1}{3}, \frac{2}{3}\right]$, given the sufficient condition $c \in (0,1)$ that we are imposing (see Corollary 1; we leave out the boundary values for the moment). Now, let us see, for a fixed $c$, which $\pi_0$ would give $\hat{p}$ (because we only focus on the region where the implicit function gives a solution). This would mean solving for $\pi_0$ the quadratic equation

$$\pi_0^2 - \pi_0 c + \frac{1}{27}(1+c)^3 = 0. \qquad (1)$$

Let’s calculate the discriminant as a function of $c$. It is given as $D(c) = c^2 - \frac{4}{27}(1+c)^3$. Note that $D'(c) = \frac{2}{9}(2-c)(2c-1)$, so that $D$ is falling from $c = 0$ till $c = \frac{1}{2}$ and increasing from then on till $c = 1$. We note also that $D(0) < 0$, $D(1) < 0$, so that $D(c) < 0$ for all $c \in [0,1]$. Thus, $\partial_p f$ never changes sign in $(0,1)^3 \cap \{(p, \pi_0, c) : \pi_0 c - \pi_0^2 = p^2(1+c) - 2p^3\}$ ($f$ is a smooth function on its domain). Thus, the implicit function theorem is applicable for all $(\pi_0, c) \in (0,1)^2$.

Totally differentiating, we get

$$d\pi_0\cdot(\alpha_n - \alpha_p) - (\beta_p + \beta_n)2\pi_0\cdot d\pi_0 = 2p\cdot dp\cdot(\alpha_n - \alpha_p + \beta_n + \beta_p) - 3p^2\cdot dp\cdot(2\beta_p + 2\beta_n),$$

which can be rearranged to $\frac{dp}{d\pi_0} = \frac{c - 2\pi_0}{2p(1+c) - 6p^2}$. We showed above that the denominator of this expression never changes sign. Given that we know it’s negative at $c = 0$ we conclude that it’s always negative for all $c$ and all $\pi_0 \in (0,1)$. Thus, for $c > 0$, $p(\pi_0)$ is falling till some prior and increasing afterwards. For $c = 0$ it is strictly increasing all the way. Note that an implication of the shape for the case of $c > 0$ is that $p(0, c) = \frac{1+c}{2}$ (because the other root, which is zero, would lead to a contradiction of the shape, given that $p \in [0,1]$).

A.10 Proof of Proposition 7

Proof.

Consider an agent who prefers B over A. In state A, he gets $\mu(-\rho_0)$ with one-shot information, but $\sum_{t=1}^{T}\mu(\rho_t - \rho_{t-1})$ with gradual information. For each $t$, $\rho_t - \rho_{t-1} < 0$, and furthermore $\sum_{t=1}^{T}(\rho_t - \rho_{t-1}) = -\rho_0$ by telescoping and using the fact that $\rho_T = 0$. Due to super-additivity in losses, we get that $\mu(-\rho_0) > \sum_{t=1}^{T}\mu(\rho_t - \rho_{t-1})$. In state B, he gets $\mu(1 - \rho_0)$ with one-shot information. With gradual information, let $\hat{T} \le T$ be the first period where the coin toss comes up tails. His news utility is $\left[\sum_{t=1}^{\hat{T}-1}\mu(\rho_t - \rho_{t-1})\right] + \mu(1 - \rho_{\hat{T}-1})$ where each $\rho_t - \rho_{t-1} < 0$ for $1 \le t \le \hat{T}-1$. Again by super-additivity in losses, $\sum_{t=1}^{\hat{T}-1}\mu(\rho_t - \rho_{t-1}) < \mu(\rho_{\hat{T}-1} - \rho_0)$. By sub-additivity in gains, $\mu(1 - \rho_{\hat{T}-1}) < \mu(\rho_0 - \rho_{\hat{T}-1}) + \mu(1 - \rho_0) \le -\mu(\rho_{\hat{T}-1} - \rho_0) + \mu(1 - \rho_0)$, where the weak inequality follows since $\lambda \ge 1$. Putting these pieces together,

$$\left[\sum_{t=1}^{\hat{T}-1}\mu(\rho_t - \rho_{t-1})\right] + \mu(1 - \rho_{\hat{T}-1}) < \mu(\rho_{\hat{T}-1} - \rho_0) - \mu(\rho_{\hat{T}-1} - \rho_0) + \mu(1 - \rho_0) = \mu(1 - \rho_0)$$

as desired.

Now consider an agent who prefers A over B. We show that when $\lambda = 1$, the agent strictly prefers gradual information to one-shot information. By continuity of news utility in $\lambda$, the same strict preference must also hold for $\lambda$ in an open neighborhood around 1.

In state A, the agent gets $\mu(1 - \pi_0)$ with one-shot information, but $\sum_{t=1}^{T}\mu(\pi_t - \pi_{t-1})$ with gradual information. For each $t$, $\pi_t - \pi_{t-1} > 0$, and furthermore $\sum_{t=1}^{T}(\pi_t - \pi_{t-1}) = 1 - \pi_0$ by telescoping and using the fact that $\pi_T = 1$. Due to sub-additivity in gains, we get that $\sum_{t=1}^{T}\mu(\pi_t - \pi_{t-1}) > \mu(1 - \pi_0)$. In state B, he gets $\mu(-\pi_0)$ with one-shot information. With gradual information, let $\hat{T} \le T$ be the first period where $X_{\hat{T}} = 0$. His news utility is $\left[\sum_{t=1}^{\hat{T}-1}\mu(\pi_t - \pi_{t-1})\right] + \mu(-\pi_{\hat{T}-1})$ where each $\pi_t - \pi_{t-1} > 0$ for $1 \le t \le \hat{T}-1$. Again by sub-additivity in gains, $\sum_{t=1}^{\hat{T}-1}\mu(\pi_t - \pi_{t-1}) > \mu(\pi_{\hat{T}-1} - \pi_0)$. By super-additivity in losses, $\mu(-\pi_{\hat{T}-1}) > \mu(-(\pi_{\hat{T}-1} - \pi_0)) + \mu(-\pi_0) = -\mu(\pi_{\hat{T}-1} - \pi_0) + \mu(-\pi_0)$, where the equality comes from the fact that $\lambda = 1$ so $\mu$ is symmetric about 0. Putting these pieces together,

$$\left[\sum_{t=1}^{\hat{T}-1}\mu(\pi_t - \pi_{t-1})\right] + \mu(-\pi_{\hat{T}-1}) > \mu(\pi_{\hat{T}-1} - \pi_0) - \mu(\pi_{\hat{T}-1} - \pi_0) + \mu(-\pi_0) = \mu(-\pi_0)$$

as desired.

A.11 Proof of Proposition 8

Proof. (1) Suppose $\mu$ is two-part linear with $\mu(x) = kx$ for $x \ge 0$, $\mu(x) = \lambda kx$ for $x < 0$, where $k > 0$, $\lambda \ge 1$. Then, agents preferring either state will strictly prefer one-shot information over gradual information. Indeed, since news utility is proportional to the negative of the expected movement in beliefs, and since $E\left[\sum_{t=1}^{T}|\pi_t - \pi_{t-1}|\right] = E\left[\sum_{t=1}^{T}|(1-\rho_t) - (1-\rho_{t-1})|\right] = E\left[\sum_{t=1}^{T}|\rho_t - \rho_{t-1}|\right]$, agents preferring state A and state B also derive the same amount of news utility from each informational structure and hence have the same intensity of preference for one-shot information. If $\lambda = 1$, agents do not exhibit strict preference for either information structure.

(2) Anticipatory utility. If $u$ is linear, then agents are indifferent between gradual and one-shot information, so (up to tie-breaking) the agents preferring states A and B have the same preference over information structure. If $u$ is strictly concave, then for $1 \le t \le T-1$, $E[u(\pi_t)] < u(\pi_0)$ and $E[u(\rho_t)] < u(\rho_0)$ by combining the martingale property and Jensen’s inequality. So all agents strictly prefer to keep their prior beliefs until the last period and will therefore all choose one-shot information.

(3) Suspense and surprise. Ely, Frankel, and Kamenica (2015) mention a “state-dependent” specification of their surprise and suspense utility functions. With two states, A and B, their specification uses weights $\alpha_A, \alpha_B > 0$. Their re-scaled suspense utility is

$$\sum_{t=0}^{T-1} u\left(E_t\left[\alpha_A\cdot(\pi_{t+1} - \pi_t)^2 + \alpha_B\cdot(\rho_{t+1} - \rho_t)^2\right]\right)$$

and their re-scaled surprise utility is

$$E\left[\sum_{t=1}^{T} u\left(\alpha_A\cdot(\pi_t - \pi_{t-1})^2 + \alpha_B\cdot(\rho_t - \rho_{t-1})^2\right)\right].$$

We may consider agents with opposite preferences over states A and B as agents with different pairs of scaling weights $(\alpha_A, \alpha_B)$. Specifically, say there are $\alpha_{High} > \alpha_{Low} > 0$. For an agent preferring A, $\alpha_A = \alpha_{High}$, $\alpha_B = \alpha_{Low}$. For an agent preferring B, $\alpha_A = \alpha_{Low}$, $\alpha_B = \alpha_{High}$. But note that we always have $\pi_{t+1} - \pi_t = -(\rho_{t+1} - \rho_t)$, so along every realized path of beliefs, $(\pi_{t+1} - \pi_t)^2 = (\rho_{t+1} - \rho_t)^2$. This means these two agents with the opposite scaling weights actually have identical objectives and therefore will have the same preference over gradual or one-shot information.

A.12 Proof of Lemma 1

Proof.

Part 1.

Fix a prior $\pi_0$ and a pair $(\bar{M}, \bar{\sigma})$ which induces an equilibrium as in Definition 3. We focus on the case that $|\bar{M}| > 2$. Let $M = \{g, b\}$; we will inductively define the sender's strategy $\sigma_t$ on $t$ so that $(M, \sigma)$ is another equilibrium which delivers the same expected utility as $(\bar{M}, \bar{\sigma})$. In doing so we will successively define a sequence of subsets of histories, $H^t_{int} \subseteq M^t$ and $\bar{H}^t_{int} \subseteq \bar{M}^t$, which are length-$t$ histories associated with interior equilibrium beliefs about the state in the new and old equilibria, as well as a map $\phi$ that associates new histories to old ones.

Let $H^0_{int} = \bar{H}^0_{int} := \{\emptyset\}$, $\phi(\emptyset) = \emptyset$. Once we have defined $\sigma_{t-1}$, $H^{t-1}_{int}$, $\bar{H}^{t-1}_{int}$ and $\phi : H^{t-1}_{int} \to \bar{H}^{t-1}_{int}$, we then define $\sigma_t$. If $h^{t-1} \notin H^{t-1}_{int}$, then simply let $\sigma_t(h^{t-1}, \theta)(g) = 0.5$ for each $\theta \in \{G, B\}$. For each $h^{t-1} \in H^{t-1}_{int}$, by the definition of $\bar{H}^{t-1}_{int}$, the equilibrium belief $\pi_{t-1}$ associated with $\phi(h^{t-1})$ in the old equilibrium satisfies $0 < \pi_{t-1} < 1$. Let $\Phi_G(h^{t-1})$ and $\Phi_B(h^{t-1})$ represent the sets of posterior beliefs that the sender induces with positive probability in the good and bad states following public history $\phi(h^{t-1}) \in \bar{H}^{t-1}_{int}$ in $(\bar{M}, \bar{\sigma})$.

We must have $\Phi_G(h^{t-1}) \setminus \Phi_B(h^{t-1}) \subseteq \{1\}$ and $\Phi_B(h^{t-1}) \setminus \Phi_G(h^{t-1}) \subseteq \{0\}$, since any message unique to either state is conclusive news of the state. We construct $\sigma_t(h^{t-1}, \theta)$ based on the following four cases.

Case 1: $1 \in \Phi_G(h^{t-1})$ and $0 \in \Phi_B(h^{t-1})$. Let $\sigma_t(h^{t-1}, G)$ assign probability 1 to $g$ and let $\sigma_t(h^{t-1}, B)$ assign probability 1 to $b$.

Case 2: $1 \in \Phi_G(h^{t-1})$ but $0 \notin \Phi_B(h^{t-1})$. By Bayesian plausibility, there exists some smallest $q^* \in (0, \pi_{t-1})$ with $q^* \in \Phi_G(h^{t-1}) \cap \Phi_B(h^{t-1})$, induced by some message $\bar{m}_b \in \bar{M}$ sent with positive probability in both states. Also, some message $\bar{m}_g \in \bar{M}$ sent with positive probability in state $G$ induces belief 1. Let $\sigma_t(h^{t-1}, B)(b) = 1$ and let $\sigma_t(h^{t-1}, G)(b) = x$, where $x \in (0, 1)$ solves $\frac{\pi_{t-1} x}{\pi_{t-1} x + (1 - \pi_{t-1})} = q^*$.

Case 3: $1 \notin \Phi_G(h^{t-1})$ but $0 \in \Phi_B(h^{t-1})$. By Bayesian plausibility, there exists some largest $q^* \in (\pi_{t-1}, 1)$ with $q^* \in \Phi_G(h^{t-1}) \cap \Phi_B(h^{t-1})$. Let $\sigma_t(h^{t-1}, G)(g) = 1$ and let $\sigma_t(h^{t-1}, B)(g) = x$, where $x \in (0, 1)$ solves $\frac{\pi_{t-1}}{\pi_{t-1} + (1 - \pi_{t-1}) x} = q^*$.

Case 4: $1 \notin \Phi_G(h^{t-1})$ and $0 \notin \Phi_B(h^{t-1})$. By Bayesian plausibility, $\Phi_G(h^{t-1}) = \Phi_B(h^{t-1})$, and there exist some largest $q_L \le \pi_{t-1}$ and smallest $q_H \ge \pi_{t-1}$ in this common set of posterior beliefs, and further there exist $x, y \in (0, 1)$ so that $\frac{\pi_{t-1} x}{\pi_{t-1} x + (1 - \pi_{t-1}) y} = q_H$ and $\frac{\pi_{t-1} (1 - x)}{\pi_{t-1} (1 - x) + (1 - \pi_{t-1})(1 - y)} = q_L$. Let $\sigma_t(h^{t-1}, G)(g) = x$ and $\sigma_t(h^{t-1}, B)(g) = y$.

Having constructed $\sigma_t$, let $H^t_{int}$ be those on-path period-$t$ histories with interior equilibrium beliefs, that is, $h^t = (h^{t-1}, m) \in H^t_{int}$ if and only if $h^{t-1} \in H^{t-1}_{int}$ and $\sigma_t(h^{t-1}, \theta)(m) > 0$ for each $\theta \in \{G, B\}$. A property of the construction of $\sigma_t$ is that if $h^{t-1} \in H^{t-1}_{int}$, then both $(h^{t-1}, g)$ and $(h^{t-1}, b)$ are on-path. That is, off-path histories can only be continuations of histories with degenerate beliefs in $\{0, 1\}$.

Let $\bar{H}^t_{int}$ be the on-path period-$t$ histories with interior equilibrium beliefs in $(\bar{M}, \bar{\sigma})$. By the definition of $\sigma_t$, there exists $\bar{m} \in \bar{M}$ so that $h^t$ induces the same equilibrium belief in the new equilibrium as the history $(\phi(h^{t-1}), \bar{m}) \in \bar{H}^t_{int}$ in the old equilibrium, and we define $\phi(h^t) := (\phi(h^{t-1}), \bar{m})$.

The receiver's expected payoffs in both the $B$ and $G$ states are the same as in the old equilibrium. To see this, note that by our construction, the receiver's expected payoff in state $B$ is the same as if we took a deterministic selection of messages $m_1, m_2, \ldots$ in the old equilibrium with the property that $\sigma(\emptyset, B)(m_1) > 0$ and, for $t \ge 2$, $\sigma_t(m_1, \ldots, m_{t-1}, \theta)(m_t) > 0$, and then had the sender play message $m_t$ in period $t$. Since this sequence of messages is played with positive probability in state $B$ of the old equilibrium, it must yield the expected payoff under $B$: if it yielded a higher or lower payoff, then we could construct a deviation that improves the receiver's ex-ante expected payoff in the old equilibrium. A similar argument holds for state $G$.

It remains to check that $(M, \sigma)$ is an equilibrium by ruling out one-shot deviations. We argued before that all off-path histories must follow an on-path history with equilibrium belief in 0 or 1.
There are no profitable deviations at off-path histories or at on-path histories with degenerate beliefs, because the receiver does not update beliefs after such histories regardless of the sender's play.

So consider an on-path history with a non-degenerate belief, i.e. a member $h^t \in H^t_{int}$. A one-shot deviation following $h^t$ corresponds to a deviation following $\phi(h^t)$ in $(\bar{M}, \bar{\sigma})$, and must not be strictly profitable.

Part 2.

We now turn to the second claim. If $T \le T'$, then for any equilibrium with horizon $T$, we may construct an equilibrium of horizon $T'$ which sends messages in the same way in periods $1, \ldots, T-1$, but babbles starting in period $T$. This equilibrium has the same expected payoff as the old one.

Note that the first claim of Lemma 1 also holds for the infinite horizon model of subsection 5.3. Nothing in the argument relies on $T$ being finite. This is because the proof relies on the one-shot deviation property, which holds for equilibria in both finite and infinite horizon models. Thus, in particular, in the proof of Proposition 15 we can also focus on a binary signal space.

A.13 Proof of Lemma 2

Proof.

Due to sub-additivity,

\[
\mu(p) < \mu(p - \pi) + \mu(\pi). \tag{2}
\]

Note that symmetry implies $\mu(-p) = -\mu(p)$ and $\mu(-\pi) = -\mu(\pi)$. Rearranged, (2) is precisely $N(0; \pi) < N(p; \pi)$.

A.14 Proof of Proposition 9

We begin by giving some additional definitions and notation.

For $p, \pi \in [0, 1]$, let $N_G(p; \pi) := \mu(p - \pi) + \mu(1 - p)$.

We state and prove a preliminary lemma about $N_G$ and $N_B$.

Lemma A.2. Suppose $\mu$ exhibits diminishing sensitivity and greater sensitivity to losses. Then, $p \mapsto N_G(p; \pi)$ is strictly increasing on $[0, \pi]$ and symmetric on the interval $[\pi, 1]$. For each $p \in [\pi, 1]$, there exists exactly one point $p' \in [\pi, 1]$ so that $N_G(p; \pi) = N_G(p'; \pi)$. For every $p_L < \pi$ and $p_H \ge \pi$, $N_G(p_L; \pi) < N_G(p_H; \pi)$. Also, $N_B(p; \pi)$ is symmetric on the interval $[0, \pi]$. For each $p \in [0, \pi]$, there exists exactly one point $p' \in [0, \pi]$ so that $N_B(p; \pi) = N_B(p'; \pi)$.

Proof.

We have $\frac{\partial N_G(p; \pi)}{\partial p} = \mu'(p - \pi) - \mu'(1 - p)$. For $0 \le p < \pi$ and under greater sensitivity to losses, $\mu'(p - \pi) \ge \mu'(\pi - p)$. Since $\mu''(x) < 0$ for $x > 0$, $\mu'(\pi - p) > \mu'(1 - p)$. This shows $\frac{\partial N_G(p; \pi)}{\partial p} > 0$ for $p \in [0, \pi)$.

The symmetry results follow from simple algebra and do not require any assumptions.

Note that $\frac{\partial^2 N_G(p; \pi)}{\partial p^2} = \mu''(p - \pi) + \mu''(1 - p) < 0$ for any $p \in [\pi, 1]$, due to diminishing sensitivity. Combined with the required symmetry, this means $\frac{\partial N_G(p; \pi)}{\partial p}$ crosses 0 at most once on $[\pi, 1]$, so for each $p \in [\pi, 1]$, we can find at most one $p'$ so that $N_G(p; \pi) = N_G(p'; \pi)$. In particular, this implies that at every intermediate $p \in (\pi, 1)$, we get $N_G(p; \pi) > N_G(\pi; \pi)$, since we already have $N_G(1; \pi) = N_G(\pi; \pi)$. This shows $N_G(\cdot; \pi)$ is strictly larger on $[\pi, 1]$ than on $[0, \pi)$.

A similar argument, using $\mu''(x) > 0$ for $x < 0$, establishes that for each $p \in [0, \pi]$, we can find at most one $p'$ so that $N_B(p; \pi) = N_B(p'; \pi)$.

Consider any period $T-1$ history $h^{T-1}$ in any equilibrium $(M, \sigma^*, p^*)$ where $p^*(h^{T-1}) = \pi \in (0, 1)$. Let $P_G$ and $P_B$ represent the sets of posterior beliefs induced with positive probability at the end of period $T-1$ in the good and bad states, respectively. The following lemma lists the possible configurations of $P_G, P_B$.

Lemma A.3.

The sets $P_G, P_B$ belong to one of the following cases.

1. $P_G = P_B = \{\pi\}$
2. $P_G = \{1\}$, $P_B = \{0\}$
3. $P_G = \{p'\}$ for some $p' \in (\pi, 1)$ and $P_B = \{0, p'\}$
4. $P_G = \{\pi, 1\}$ and $P_B = \{0, \pi\}$
5. $P_G = \{p', p''\}$ for some $p' \in (\pi, \frac{1+\pi}{2})$, $p'' = 1 - p' + \pi$, $P_B = \{0, p', p''\}$.

Proof.

Suppose $|P_G| = 1$.

If $P_G = \{\pi\}$, then any equilibrium message not inducing $\pi$ must induce 0. By Bayes' rule, the sender cannot induce belief 0 with positive probability in the bad state, so $P_B = \{\pi\}$ as well.

If $P_G = \{1\}$, then any equilibrium message not inducing 1 must induce 0. Furthermore, the sender cannot send equilibrium messages inducing belief 1 with positive probability in the bad state, else the equilibrium belief associated with these messages should be strictly less than 1. Thus $P_B = \{0\}$.

If $P_G = \{p'\}$ for some $0 \le p' < \pi$, then any equilibrium message not inducing $p'$ must induce 0. This is a contradiction since the posterior beliefs do not average out to $\pi$.

This leaves the case of $P_G = \{p'\}$ for some $\pi < p' < 1$. Any equilibrium message not inducing $p'$ must induce 0. Furthermore, the sender must induce the belief $p'$ in the bad state with positive probability, else we would have $p' = 1$. At the same time, the sender must also induce belief 0 with positive probability in the bad state, else we violate Bayes' rule. So $P_B = \{0, p'\}$.

Now suppose $|P_G| = 2$.

In the good state, the sender must be indifferent between two beliefs $p', p''$, both induced with positive probability. By Lemma A.2, $N_G(p; \pi)$ is strictly increasing on $[0, \pi]$ and strictly higher on $[\pi, 1]$ than on $[0, \pi)$, while for each $p \in [\pi, 1]$, there exists exactly one $p' \in [\pi, 1]$ so that $N_G(p; \pi) = N_G(p'; \pi)$. This means we must have $p' \in [\pi, \frac{1+\pi}{2}]$, $p'' = 1 - p' + \pi$.

If $P_G = \{\pi, 1\}$, any equilibrium message not inducing $\pi$ or 1 must induce 0. Also, $1 \notin P_B$, because any message sent with positive probability in the bad state cannot induce belief 1. We cannot have $P_B = \{0\}$, because then the message inducing belief $\pi$ actually induces 1. We cannot have $P_B = \{\pi\}$, for then we violate Bayes' rule. This leaves only $P_B = \{0, \pi\}$.

If $P_G = \{p', p''\}$ for some $p' \in (\pi, \frac{1+\pi}{2})$, then any equilibrium message not inducing $p'$ or $p''$ must induce 0. Also, $p', p'' \in P_B$, else messages inducing these beliefs give conclusive evidence of the good state. By Bayes' rule, we must have $P_B = \{0, p', p''\}$.

It is impossible that $|P_G| \ge 3$, since, by Lemma A.2, $N_G(p; \pi)$ is strictly increasing on $[0, \pi]$ and strictly higher on $[\pi, 1]$ than on $[0, \pi)$, while for each $p \in [\pi, 1]$, there exists exactly one $p' \in [\pi, 1]$ so that $N_G(p; \pi) = N_G(p'; \pi)$. So the sender cannot be indifferent between 3 or more different posterior beliefs of the receiver in the good state.

We now give the proof of Proposition 9.

Proof.

Consider any period T − h T − with p ∗ ( h T − ) ∈ (0 , . By Lemma 2, N B ( p ; p ∗ ( h T − )) > N B (0; p ∗ ( h T − )) for all p ∈ ( p ∗ ( h T − ) , h T − , the receiver will get total news utility of µ (1 − p ∗ ( h T − )) in the good state and µ ( − p ∗ ( h T − )) in the bad state. This conclusion applies to all period T − T − T , and the equilibrium up to period T − T − . By backwards induction, wesee that along the equilibrium path, whenever the receiver’s belief updates, it is updated tothe dogmatic belief in θ . A.15 Proof of Proposition 10

Proof.

The conclusions of Lemmas A.2 and A.3 continue to hold, since these only depend on $\mu$ exhibiting greater sensitivity to losses. As in the proof of Proposition 9, we only need to establish $N_B(p; \pi) > N_B(0; \pi)$ for all $p \in (\pi, 1]$ to rule out cases 3 and 5 from Lemma A.3 and hence establish our result.

For $p = \pi + z$ where $z \in (0, 1 - \pi]$,

\[
N_B(p; \pi) - N_B(0; \pi) = \mu(z) + \mu(-(\pi + z)) - \mu(-\pi).
\]

Write this difference as a function $D(z)$ of $z$. Clearly $D(0) = 0$, and $D'(z) = \mu'(z) - \mu'(-(\pi + z))$. Since $\min_{z \in [0, 1-\pi]} \frac{\mu'(z)}{\mu'(-(\pi + z))} > 1$, we get $D'(z) > 0$ for all $z \in [0, 1 - \pi]$, thus $D(z) > 0$ for all $z \in (0, 1 - \pi]$.

A.16 Proof of Corollary 4

Proof.

First, $\mu$ exhibits greater sensitivity to losses, because $\mu(-x) = -\lambda \mu(x)$ for all $x > 0$ and $\lambda \ge 1$. To apply Proposition 10, we only need to verify that $\min_{z \in [0, 1-\pi]} \frac{\mu'(z)}{\mu'(-(\pi + z))} > 1$. For the $\lambda$-scaled $\mu$,

\[
\min_{z \in [0, 1-\pi]} \frac{\mu'(z)}{\mu'(-(\pi + z))} = \frac{1}{\lambda} \cdot \min_{z \in [0, 1-\pi]} \frac{\tilde{\mu}'_{pos}(z)}{\tilde{\mu}'_{pos}(\pi + z)}.
\]

The assumption that $\min_{z \in [0, 1-\pi]} \frac{\tilde{\mu}'_{pos}(z)}{\tilde{\mu}'_{pos}(\pi + z)} > \lambda$ gives the desired conclusion.

A.17 Proof of Proposition 11

Proof.

By the proof of Proposition 12, which does not depend on this result, there is a GGN equilibrium with one intermediate belief $p \in (\pi, 1)$ whenever $N_B(p; \pi) = N_B(0; \pi)$. In this equilibrium, the sender induces a belief of either $p$ or 0 by the end of period 1, then babbles in all remaining periods of communication. Since the sender is indifferent between inducing belief $p$ or 0 in the bad state, this equilibrium gives the same payoff as the babbling one in the bad state. But, since $\mu(p - \pi) + \mu(1 - p) > \mu(1 - \pi)$ due to strict concavity of $\tilde{\mu}_{pos}$, the receiver gets strictly higher news utility in the good state.

To find a $\bar{\lambda}$ that guarantees the existence of a $p$ solving $N_B(p; \pi) = N_B(0; \pi)$, let $D(p) := N_B(p; \pi) - N_B(0; \pi)$. We have $D(\pi) = 0$ and

\[
\lim_{p \to \pi^+} D'(p) = \lim_{x \to 0^+} \tilde{\mu}'_{pos}(x) - \mu'(-\pi) = \lim_{x \to 0^+} \tilde{\mu}'_{pos}(x) - \lambda \mu'(\pi).
\]

For any finite $\lambda$, this limit is $\infty$, since $\lim_{x \to 0^+} \tilde{\mu}'_{pos}(x) = \infty$. On the other hand,

\[
D(1) = \mu(1 - \pi) + \mu(-1) - \mu(-\pi) = \tilde{\mu}_{pos}(1 - \pi) - \lambda \left( \tilde{\mu}_{pos}(1) - \tilde{\mu}_{pos}(\pi) \right).
\]

Since $\tilde{\mu}_{pos}(1) - \tilde{\mu}_{pos}(\pi) > 0$, we may find a large enough $\bar{\lambda} \ge 1$ so that $\tilde{\mu}_{pos}(1 - \pi) - \bar{\lambda} (\tilde{\mu}_{pos}(1) - \tilde{\mu}_{pos}(\pi)) < 0$. Whenever $\lambda \ge \bar{\lambda}$, we therefore get $D(\pi) = 0$, $\lim_{p \to \pi^+} D'(p) = \infty$, and $D(1) < 0$. By the intermediate value theorem applied to the continuous $D$, there exists some $p \in (\pi, 1)$ so that $D(p) = 0$.

A.18 Proof of Proposition 12

Proof.

Let $J$ intermediate beliefs satisfying the hypotheses be given. We construct a gradual good news equilibrium where $p_t = q^{(t)}$ for $1 \le t \le J$, and $p_t = q^{(J)}$ for $J + 1 \le t \le T - 1$.

Let $M = \{g, b\}$ and consider the following strategy profile. In period $t \le J$ where the public history so far, $h^{t-1}$, does not contain any $b$, let $\sigma(h^{t-1}; G)(g) = 1$ and $\sigma(h^{t-1}; B)(g) = x$, where $x \in (0, 1)$ satisfies $\frac{p_{t-1}}{p_{t-1} + (1 - p_{t-1}) x} = p_t$. But if the public history contains at least one $b$, then $\sigma(h^{t-1}; G)(b) = 1$ and $\sigma(h^{t-1}; B)(b) = 1$. Finally, if the period is $t > J$, then $\sigma(h^{t-1}; G)(b) = 1$ and $\sigma(h^{t-1}; B)(b) = 1$. In terms of beliefs, suppose $h^t$ has $t \le J$ and every message so far has been $g$. Such histories are on-path and get assigned the Bayesian posterior belief. If $h^t$ has $t \le J$ and contains at least one $b$, then it gets assigned belief 0. Finally, if $h^t$ has $t > J$, then $h^t$ gets assigned the same belief as the subhistory constructed from its first $J$ elements. It is easy to verify that these beliefs are derived from Bayes' rule whenever possible.

We verify that the sender has no incentive to deviate. Consider period $t \le J$ with a history $h^{t-1}$ that does not contain any $b$. The receiver's current belief is $p_{t-1}$ by construction.

In state $B$, we first calculate the sender's equilibrium payoff after sending $g$. The receiver will get some $I$ periods of good news before the bad state is revealed, either by the sender or by nature in period $T$. That is, the equilibrium news utility with $I$ periods of good news is given by

\[
\sum_{i=1}^{I} \mu(p_{t-1+i} - p_{t-2+i}) + \mu(-p_{t-1+I}).
\]

Since $p_{t-1+I} \in P^*(p_{t-2+I})$, we have $N_B(p_{t-1+I}; p_{t-2+I}) = N_B(0; p_{t-2+I})$, that is to say $\mu(p_{t-1+I} - p_{t-2+I}) + \mu(-p_{t-1+I}) = \mu(-p_{t-2+I})$. We may therefore rewrite the receiver's total news utility as $\sum_{i=1}^{I-1} \mu(p_{t-1+i} - p_{t-2+i}) + \mu(-p_{t-2+I})$. By repeating this argument, we conclude that the receiver's total news utility is just $\mu(-p_{t-1})$. Since this result holds regardless of $I$'s realization, the sender's expected total utility from sending $g$ today is $\mu(-p_{t-1})$, which is the same as the news utility from sending $b$ today.
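The telescoping argument above can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the power-function gain-loss utility $\mu(x) = x^{\alpha}$ for gains and $-\lambda(-x)^{\alpha}$ for losses with the hypothetical values $\alpha = 0.5$, $\lambda = 3$ and prior 0.3, and all function names are invented for this example. It solves $N_B(p; \pi) = N_B(0; \pi)$ by bisection to build a chain of intermediate beliefs, then compares the bad-state total along the chain with $\mu(-\pi)$.

```python
def mu(x, alpha=0.5, lam=3.0):
    """Power-function gain-loss utility; alpha and lam are illustrative choices."""
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha


def n_bad(p, prior, alpha=0.5, lam=3.0):
    """N_B(p; prior): bad-state news utility from moving prior -> p, then p -> 0."""
    return mu(p - prior, alpha, lam) + mu(-p, alpha, lam)


def indifference_belief(prior, alpha=0.5, lam=3.0, tol=1e-13):
    """Find p in (prior, 1) with N_B(p; prior) = N_B(0; prior), or return None."""
    if lam <= 1.0:
        return None  # no interior indifference point without strict loss aversion
    gap = lambda p: n_bad(p, prior, alpha, lam) - mu(-prior, alpha, lam)
    k = lam ** (1.0 / (1.0 - alpha))
    # For this functional form the gap is positive and peaks at prior*k/(k-1),
    # then decreases, so any root in (prior, 1) lies to the right of the peak.
    lo, hi = prior * k / (k - 1.0), 1.0
    if lo >= hi or gap(hi) >= 0.0:
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ggn_path(prior, alpha=0.5, lam=3.0):
    """Iterate the indifference condition to list successive intermediate beliefs."""
    path, p = [], prior
    while True:
        p = indifference_belief(p, alpha, lam)
        if p is None:
            return path
        path.append(p)


prior = 0.3
path = ggn_path(prior)  # intermediate beliefs q(1) < q(2) < ...
beliefs = [prior] + path
# Bad-state total: each piece of good news along the path, then the drop to 0.
total = sum(mu(b - a) for a, b in zip(beliefs, beliefs[1:])) + mu(-beliefs[-1])
print(path, total, mu(-prior))
```

With these parameter values the chain has two intermediate beliefs, and the two printed totals agree up to the bisection tolerance, which is the indifference between sending $g$ and sending $b$ in the bad state. The second increment is larger than the first, in line with the ordering of increments established in Proposition 13.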
Thus, the sender is indifferent between $g$ and $b$ and has no profitable deviation.

In state $G$, the sender gets at least $\mu(1 - p_{t-1})$ from following the equilibrium strategy. This is because the receiver's total news utility in the good state along the equilibrium path is given by $\sum_{i=1}^{J-(t-1)} \mu(p_{t-1+i} - p_{t-2+i}) + \mu(1 - p_{t-1+I})$ with $I = J - (t-1)$, so that $p_{t-1+I} = p_J$. By sub-additivity in gains, this sum is strictly larger than $\mu(1 - p_{t-1})$. If the sender deviates to sending $b$ today, then the receiver updates belief to 0 today and belief remains there until the exogenous revelation, when belief updates to 1. So this deviation gives the total news utility $\mu(-p_{t-1}) + \mu(1)$. We have

\[
\mu(1) < \mu(1 - p_{t-1}) + \mu(p_{t-1}) \le \mu(1 - p_{t-1}) - \mu(-p_{t-1}),
\]

where the first inequality comes from sub-additivity in gains, and the second from weak loss aversion. This shows $\mu(-p_{t-1}) + \mu(1) < \mu(1 - p_{t-1})$, so the deviation is strictly worse than sending the equilibrium message.

Finally, at a history containing at least one $b$ or a history with length $J$ or longer, the receiver's belief is the same at all continuation histories. So the sender has no deviation incentives since no deviations affect future beliefs.

For the other direction, suppose there exists a gradual good news equilibrium with the $J$ intermediate beliefs $q^{(1)} < \ldots < q^{(J)}$. For a given $1 \le j \le J$, find the smallest $t$ such that $p_t = q^{(j-1)}$ and $p_{t+1} = q^{(j)}$. At every on-path history $h^t \in H^t$ with $p^*(h^t) = p_t$, we must have $\sigma^*(h^t; B)$ inducing both 0 and $q^{(j)}$ with strictly positive probability. Since we are in equilibrium, we must have $\mu(-q^{(j-1)})$ being equal to $\mu(q^{(j)} - q^{(j-1)})$ plus the continuation payoff. If $j = J$, then this continuation payoff is $\mu(-q^{(j)})$, as the only other period of belief movement is period $T$, when the receiver learns the state is bad. If $j < J$, then find the smallest $\bar{t}$ so that $p_{\bar{t}+1} = q^{(j+1)}$. At any on-path $h^{\bar{t}} \in H^{\bar{t}}$ which is a continuation of $h^t$, we have $p^*(h^{\bar{t}}) = q^{(j)}$ and the receiver has not experienced any news utility in periods $t+2, \ldots, \bar{t}$. Also, $\sigma^*(h^{\bar{t}}; B)$ assigns positive probability to inducing posterior belief 0, so the continuation payoff in question must be $\mu(-q^{(j)})$. So we have shown that $\mu(-q^{(j-1)}) = \mu(q^{(j)} - q^{(j-1)}) + \mu(-q^{(j)})$, that is, $N_B(q^{(j)}; q^{(j-1)}) = N_B(0; q^{(j-1)})$.

A.19 Proof of Corollary 5

Proof.

We apply Proposition 12 to the quadratic case. Recall the relevant indifference equation, which holds in the bad state:

\[
\mu(-q_t) = \mu(q_{t+1} - q_t) + \mu(-q_{t+1}). \tag{!}
\]

Plugging in the quadratic specification and applying algebraic transformations leads to

\[
0 = (\alpha_p - \alpha_n)(q_{t+1} - q_t) - \beta_p (q_{t+1} - q_t)^2 + \beta_n (q_{t+1} - q_t)(q_{t+1} + q_t).
\]

Define $r = q_{t+1} - q_t$. Then this relation can be written as

\[
(\beta_p - \beta_n) r^2 + (\alpha_n - \alpha_p - 2\beta_n q_t) r = 0,
\]

i.e. $r$ is a zero of a second-order polynomial. For $P^*$ to be non-empty we need this root $r$ to be in $(0, 1 - q_t)$. In particular, the peak/trough $\bar{r}$ of the parabola defined by the second-order polynomial should satisfy $\bar{r} \in (0, 1 - q_t)$. Given that $\bar{r} = \frac{2\beta_n q_t - (\alpha_n - \alpha_p)}{2(\beta_p - \beta_n)}$ for the case that $\beta_p \ne \beta_n$, we get the equivalent condition on the primitives

\[
0 < \frac{2\beta_n q_t - (\alpha_n - \alpha_p)}{2(\beta_p - \beta_n)} < 1 - q_t.
\]

The root $r$ itself is given by $r = \frac{2\beta_n q_t - (\alpha_n - \alpha_p)}{\beta_p - \beta_n}$, which leads to the recursion

\[
q_{t+1} = q_t \frac{\beta_p + \beta_n}{\beta_p - \beta_n} - \frac{\alpha_n - \alpha_p}{\beta_p - \beta_n}. \tag{R}
\]

This leads to the formula for $P^*(\pi)$ in part 1).

Case 1:

When β p < β n the coeﬃcient in front of q t is negative so that the recursion (R)leads to (!!) q t +1 − q t = q t β n β p − β n − α n − α p β p − β n < . One also sees here that for the case that β p < β n to give a gradual good news equilibrium oftime-length 1, one needs a low enough prior: namely π < α n − α p β n =: q ∗ . For all priors largeror equal than q ∗ , there is no one-shot bad news partial good news equilibrium. Case 2:

When $\beta_p > \beta_n$, the slope in (R) is above 1, so that for all priors $\pi$ large enough we get an increasing sequence $q_t$ which satisfies (!). It is also easy to see from (R) that

\[
(q_{t+2} - q_{t+1}) - (q_{t+1} - q_t) = \left( \frac{\beta_p + \beta_n}{\beta_p - \beta_n} - 1 \right)(q_{t+1} - q_t) > 0,
\]

proving the statement in the text after the corollary.

That an equilibrium can exist where partial good news is released for more than two periods is shown by the example in the main text following the statement of the Corollary (see Figure 5).

A.20 Proof of Proposition 13

Proof.

Since $N_B(p; \pi) - N_B(0; \pi) = 0$ for $p = \pi$ and $\frac{\partial}{\partial p} N_B(p; \pi)|_{p = \pi} > 0$, $N_B(p; \pi) - N_B(0; \pi)$ starts off positive for $p$ slightly above $\pi$. Given that $|P^*(\pi)| \le 1$, if we find some $p' > \pi$ with $N_B(p'; \pi) - N_B(0; \pi) > 0$, then any solution to $N_B(p; \pi) - N_B(0; \pi) = 0$ in $(\pi, 1)$ must lie to the right of $p'$.

If $q^{(j)}, q^{(j+1)}$ are intermediate beliefs in a GGN equilibrium, then by Proposition 12, $q^{(j)} \in P^*(q^{(j-1)})$ and $q^{(j+1)} \in P^*(q^{(j)})$. Let $p' = q^{(j)} + (q^{(j)} - q^{(j-1)})$. Then,

\[
\begin{aligned}
N_B(p'; q^{(j)}) - N_B(0; q^{(j)}) &= \mu(p' - q^{(j)}) + \mu(-p') - \mu(-q^{(j)}) \\
&= \mu(q^{(j)} - q^{(j-1)}) + \mu(-q^{(j)} - (q^{(j)} - q^{(j-1)})) - \mu(-q^{(j)}) \\
&> \mu(q^{(j)} - q^{(j-1)}) + \mu(-q^{(j-1)} - (q^{(j)} - q^{(j-1)})) - \mu(-q^{(j-1)}),
\end{aligned}
\]

where the last inequality comes from diminishing sensitivity. But the final expression is $N_B(q^{(j)}; q^{(j-1)}) - N_B(0; q^{(j-1)})$, which is 0 since $q^{(j)} \in P^*(q^{(j-1)})$. This shows we must have $q^{(j+1)} - q^{(j)} > q^{(j)} - q^{(j-1)}$.

A.21 Proof of Corollary 6

Proof.

We verify the sufficient condition in Proposition 13. We get $\frac{\partial}{\partial p} N_B(p; \pi) = \alpha (p - \pi)^{\alpha - 1} - \lambda \alpha p^{\alpha - 1}$, so $\frac{\partial}{\partial p} N_B(p; \pi)|_{p = \pi} = \infty$.

To show that $|P^*(\pi)| \le 1$, it suffices to show that $\frac{\partial}{\partial p} N_B(p; \pi) = 0$ for at most one $p > \pi$. For the derivative to be zero, we need $\left( \frac{p}{p - \pi} \right)^{1 - \alpha} = \lambda$. As the LHS is decreasing for $p > \pi$, the equation can have at most one solution.

A.22 Proof of Proposition 14

Proof.

Consider the following operator $\phi$ on the space of continuous functions on $[0, 1]$. For $V : [0, 1] \to \mathbb{R}$, define $\phi(V)(p) := \tilde{V}(p \mid p)$, where

\[
\tilde{V}(\cdot \mid p) := \mathrm{cav}_q \left[ \mu(q - p) + \delta V(q) + (1 - \delta)\left( q \cdot \mu(1 - q) + (1 - q) \cdot \mu(-q) \right) \right].
\]

We show that $\phi$ satisfies the Blackwell conditions and so is a contraction mapping.

Suppose that $V_2 \ge V_1$ pointwise. Then for any $p, q \in [0, 1]$,

\[
\mu(q - p) + \delta V_2(q) + (1 - \delta)(q \mu(1 - q) + (1 - q) \mu(-q)) \ge \mu(q - p) + \delta V_1(q) + (1 - \delta)(q \mu(1 - q) + (1 - q) \mu(-q)),
\]

therefore $\tilde{V}_2(\cdot \mid p) \ge \tilde{V}_1(\cdot \mid p)$ pointwise as well. In particular, $\tilde{V}_2(p \mid p) \ge \tilde{V}_1(p \mid p)$, that is, $\phi(V_2)(p) \ge \phi(V_1)(p)$.

Also, let $k > 0$ be given and let $V_2 = V_1 + k$ pointwise. It is easy to see that $\tilde{V}_2(\cdot \mid p) = \tilde{V}_1(\cdot \mid p) + \delta k$ for every $p$, because the argument to the concavification operator will be pointwise higher by $\delta k$. So in particular, $\phi(V_2)(p) = \phi(V_1)(p) + \delta k$. By the Blackwell conditions, the operator $\phi$ is a contraction mapping on the metric space of continuous functions on $[0, 1]$.

To compare the value across different $\delta$, suppose $0 \le \delta < \delta' < 1$. First, $V_\delta(0) = V_\delta(1) = 0$ for any $\delta \in [0, 1)$. Now consider an environment where full revelation happens at the end of each period with probability $1 - \delta$, and fix a prior $p \in (0, 1)$. There exists some binary information structure with message space $M = \{0, 1\}$, public histories $H^t = (M)^t$ for $t = 0, 1, \ldots$, and sender strategies $(\sigma_t)_{t=0}^{\infty}$ with $\sigma_t : H^t \times \Theta \to \Delta(M)$, such that $(M, \sigma)$ induces expected news utility of $V_\delta(p)$ when starting at prior $p$.

We now construct a new information structure $(\bar{M}, \bar{\sigma})$ to achieve expected news utility $V_\delta(p)$ when starting at prior $p$ in an environment where full revelation happens at the end of each period with probability $1 - \delta'$, with $\delta' > \delta$. Let $\bar{M} = \{0, 1, \emptyset\}$. The idea is that when full revelation has not happened, there is a $1 - \frac{\delta}{\delta'}$ probability each period that the sender enters into a babbling regime forever. When the sender enters the babbling regime at the start of period $t + 1$, the receiver's expected utility going forward is the same as if full revelation happened at the start of $t + 1$.

To implement this idea, after any history $h^t \in H^t$ not containing $\emptyset$, let

\[
\bar{\sigma}_{t+1}(h^t; \theta) =
\begin{cases}
\emptyset & \text{w/p } 1 - \frac{\delta}{\delta'} \\
1 & \text{w/p } \frac{\delta}{\delta'} \cdot \sigma_{t+1}(h^t; \theta)(1) \\
0 & \text{w/p } \frac{\delta}{\delta'} \cdot \sigma_{t+1}(h^t; \theta)(0).
\end{cases}
\]

That is, conditional on not entering the babbling regime, $\bar{\sigma}$ behaves in the same way as $\sigma$. But, after any history $h^t \in H^t$ containing at least one $\emptyset$, $\bar{\sigma}_{t+1}(h^t; \theta) = \emptyset$ with probability 1. Once the sender enters the babbling regime, she babbles forever (until full revelation exogenously arrives at some random date). We need to verify that the payoff from this strategy is indeed $V_\delta(p)$. Fix a history $h^t$ not containing $\emptyset$ and a state $\theta$, and suppose $p^*(h^t) = q$. Under $\bar{\sigma}_{t+1}$, with probability $(1 - \delta') + \delta' (1 - \frac{\delta}{\delta'}) = 1 - \delta$ the receiver gets the expected babbling payoff $q \mu(1 - q) + (1 - q) \mu(-q)$ in the period of state revelation. Analogously, under $\sigma_{t+1}$, there is probability $1 - \delta$ that state revelation happens in period $t + 1$ and the receiver gets $q \mu(1 - q) + (1 - q) \mu(-q)$ in expectation. With probability $\delta' \cdot \frac{\delta}{\delta'} = \delta$, the receiver facing $\bar{\sigma}$ gets the payoff induced by $\sigma_{t+1}(h^t; \theta)$ in period $t + 1$ and the same distribution of continuation histories as under $\sigma$. The same argument applies to all these continuation histories, so $\bar{\sigma}$ must induce the same expected payoff as $\sigma$ when starting at $(h^t; \theta)$.

A.23 Proof of Proposition 15

Proof.

We first show sufficiency. Consider the following strategy profile. In period $t$ where the public history so far, $h^{t-1}$, does not contain any $b$, let $\sigma(h^{t-1}; G)(g) = 1$ and $\sigma(h^{t-1}; B)(g) = x$, where $x \in (0, 1)$ satisfies $\frac{p_{t-1}}{p_{t-1} + (1 - p_{t-1}) x} = p_t$. But if the public history contains at least one $b$, then $\sigma(h^{t-1}; G)(b) = 1$ and $\sigma(h^{t-1}; B)(b) = 1$. In terms of beliefs, suppose $h^t$ is such that every message so far has been $g$. Such histories are on-path and get assigned the Bayesian posterior belief. If $h^t$ contains at least one $b$, then the belief is 0. It is easy to verify that these beliefs are derived from Bayes' rule whenever possible.

We verify that the sender has no incentive to deviate. Consider period $t$ with a history $h^{t-1}$ that does not contain any $b$. The receiver's current belief is $p_{t-1}$ by construction.

In state $B$, we first calculate the sender's equilibrium payoff after sending $g$. For any realization of the exogenous revelation date, the receiver's total news utility in the bad state along the equilibrium path is given by $\sum_{j=1}^{J} \mu(p_{t-1+j} - p_{t-2+j}) + \mu(-p_{t-1+J})$ for some integer $J \ge 1$. Since $p_{t-1+J} \in P^*(p_{t-2+J})$, we have $N_B(p_{t-1+J}; p_{t-2+J}) = N_B(0; p_{t-2+J})$, that is to say $\mu(p_{t-1+J} - p_{t-2+J}) + \mu(-p_{t-1+J}) = \mu(-p_{t-2+J})$. We may therefore rewrite the receiver's total news utility as $\sum_{j=1}^{J-1} \mu(p_{t-1+j} - p_{t-2+j}) + \mu(-p_{t-2+J})$. By repeating this argument, we conclude that the receiver's total news utility is just $\mu(-p_{t-1})$. Since this result holds regardless of $J$, the sender's expected total utility from sending $g$ today is $\mu(-p_{t-1})$, which is the same as the news utility from sending $b$ today. Thus, the sender is indifferent between $g$ and $b$ and has no profitable deviation.

In state $G$, the sender gets at least $\mu(1 - p_{t-1})$ from following the equilibrium strategy. This is because for any realization of the exogenous revelation date, the receiver's total news utility in the good state along the equilibrium path is given by $\sum_{j=1}^{J} \mu(p_{t-1+j} - p_{t-2+j}) + \mu(1 - p_{t-1+J})$ for some integer $J \ge 1$. By sub-additivity in gains, this sum is strictly larger than $\mu(1 - p_{t-1})$. If the sender deviates to sending $b$ today, then the receiver updates belief to 0 today and belief remains there until the exogenous revelation, when belief updates to 1. So this deviation has the total news utility $\mu(-p_{t-1}) + \mu(1)$. We have

\[
\mu(1) < \mu(1 - p_{t-1}) + \mu(p_{t-1}) \le \mu(1 - p_{t-1}) - \mu(-p_{t-1}),
\]

where the first inequality comes from sub-additivity in gains, and the second from weak loss aversion. This shows $\mu(-p_{t-1}) + \mu(1) < \mu(1 - p_{t-1})$, so the deviation is strictly worse than sending the equilibrium message.

Finally, at a history containing at least one $b$, the receiver's belief is the same at all continuation histories. So the sender has no deviation incentives since no deviations affect future beliefs.

We now show necessity. Suppose that we have a (possibly infinite) gradual good news equilibrium given by the sequence $p_0 < p_1 < \cdots < p_t < \ldots$. By Bayesian plausibility, and because we are focusing on two-message equilibria, the sender must be sending the messages $\{0, p_t\}$ in period $t$ if the state is bad. The sender must thus be indifferent between these two posteriors in the bad state. Formally, $N_B(0; p_t) = N_B(p_{t+1}; p_t)$ for all $t \ge 0$, as long as there is no babbling. Written equivalently in the language of $P^*$: $p_{t+1} \in P^*(p_t)$ for all $t \ge 0$, as long as there is no babbling, where here $p_0 = \pi_0$.

B Residual Consumption Uncertainty

B.1 A Model of Residual Consumption Uncertainty

In the main text, we studied a model where the sender has perfect information about the receiver's final-period consumption level.

Now suppose the sender's information is imperfect. In state $\theta$, the receiver will consume a random amount $c$ in period $T + 1$, drawn as $c \sim F_\theta$, deriving from it consumption utility $v(c)$. As before, $v$ is a strictly increasing consumption-utility function. We interpret the state $\theta$ as the sender's private information about the receiver's future consumption, while the distribution $F_\theta$ captures the receiver's residual consumption uncertainty conditional on what the sender knows. The case where $F_\theta$ is degenerate for every $\theta \in \Theta$ nests the baseline model.

Assume that $E_{c \sim F_{\theta'}}[v(c)] \ne E_{c \sim F_{\theta''}}[v(c)]$ when $\theta' \ne \theta''$. We may without loss normalize $\min_{\theta \in \Theta} E_{c \sim F_\theta}[v(c)] = 0$, $\max_{\theta \in \Theta} E_{c \sim F_\theta}[v(c)] = 1$.

The mean-based news-utility function $N(\pi_t \mid \pi_{t-1})$ in this environment is the same as in the environment where the receiver always gets consumption utility $E_{c \sim F_\theta}[v(c)]$ in state $\theta$. This is because given a pair of beliefs $F_{old}, F_{new} \in \Delta(\Theta)$ about the state, the receiver derives news utility $N(F_{new} \mid F_{old})$ based on the difference in expected consumption utilities, $\mu(E_{c \sim F_{new}}[v(c)] - E_{c \sim F_{old}}[v(c)])$. So all of the results in the paper concerning mean-based news utility immediately extend. The two results in the paper that are not specific to mean-based news utility, Propositions 1 and 2, apply to any function $N(\pi_t \mid \pi_{t-1})$ satisfying the continuous differentiability condition stated in Section 2, without requiring any relationship between $N$ and consumption in different states.

We now define $N$ using Kőszegi and Rabin (2009)'s percentile-based news-utility model with a power-function gain-loss utility, in an environment with residual consumption uncertainty. We apply Proposition 2 to the resulting $N$ and show that one-shot resolution is strictly sub-optimal. This result applies for any $K \ge 2$.
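To make the contrast between the two notions concrete, the sketch below compares mean-based and percentile-based news utility in a two-state example. It is only an illustration under assumptions that are not taken from this subsection: consumption is normal in each state with means 1 and 0 and a common standard deviation `sigma`, $v(c) = c$, the gain-loss function is a square root with an assumed loss weight `lam = 1.5`, and percentile-based news utility is read as the integral over quantile levels of $\mu$ applied to the quantile-by-quantile change in consumption utility. All function names are hypothetical.

```python
from statistics import NormalDist


def mu(x, lam=1.5):
    """Square-root gain-loss utility; lam is an assumed loss-aversion weight."""
    return x ** 0.5 if x >= 0 else -lam * (-x) ** 0.5


def mixture_quantile(q, belief, sigma=1.0):
    """q-th quantile of consumption under belief*N(1, sigma) + (1-belief)*N(0, sigma)."""
    good, bad = NormalDist(1.0, sigma), NormalDist(0.0, sigma)
    lo, hi = bad.inv_cdf(q), good.inv_cdf(q)  # the mixture quantile lies in between
    for _ in range(60):  # plain bisection on the mixture CDF
        mid = 0.5 * (lo + hi)
        if belief * good.cdf(mid) + (1 - belief) * bad.cdf(mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def percentile_news(new, old, sigma=1.0, lam=1.5, n=200):
    """Average of mu over quantile-by-quantile changes (midpoint rule on [0, 1])."""
    total = 0.0
    for i in range(n):
        q = (i + 0.5) / n
        change = mixture_quantile(q, new, sigma) - mixture_quantile(q, old, sigma)
        total += mu(change, lam)
    return total / n


def mean_news(new, old, lam=1.5):
    """Mean-based news utility: with v(c) = c and state means 1 and 0,
    expected consumption utility equals the belief in the good state."""
    return mu(new - old, lam)


good_news = percentile_news(0.6, 0.5)  # belief in the good state rises
bad_news = percentile_news(0.5, 0.6)   # the same move in reverse
print(good_news, bad_news, mean_news(0.6, 0.5), mean_news(0.5, 0.6))
```

In this example a higher belief shifts every quantile of the consumption distribution upward, so the percentile-based measure is positive for good news and negative for bad news, and the reverse move is scaled by the loss weight in both models. The mean-based measure depends only on the change in belief, not on `sigma`.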
Corollary A.1. Consider the percentile-based model with

$$\mu(x) = \begin{cases} x^{\alpha} & x \geq 0 \\ -\lambda(-x)^{\alpha} & x < 0 \end{cases}$$

for $0 < \alpha < 1$, $\lambda \geq 1$. Suppose there are two states $\theta_G, \theta_B \in \Theta$ with distributions of consumption utilities $v(F_{\theta_B}) = \text{Unif}[0, L]$, $v(F_{\theta_G}) = J + v(F_{\theta_B})$ for some $L, J > 0$. One-shot resolution is strictly suboptimal for any finite $T$.

Proof. We show that

$$\lim_{\epsilon \to 0} \frac{N(1_G \mid (1-\epsilon)1_G \oplus \epsilon 1_B)}{\epsilon} = \infty$$

under this set of conditions. The argument behind Proposition 2 then implies that some information structure involving perfect revelation of states other than $\theta_G, \theta_B$, one-shot bad news, and partial good news for the two states $\theta_G, \theta_B$ is strictly better than one-shot resolution.

For $r \in [0, 1]$, write $F_r$ for the distribution of consumption utilities under the belief $r 1_G \oplus (1-r) 1_B$. Note we must have $\int_0^1 c_{F_1}(q) - c_{F_{1-\epsilon}}(q)\,dq = J\epsilon$, and that $c_{F_1}(q) - c_{F_{1-\epsilon}}(q) \geq 0$ for all $q$. Let $q^* = \min(\epsilon \cdot J/L, \epsilon)$. It is the quantile at which $c_{F_{1-\epsilon}}(q^*) = J$. For all $q \geq q^*$, $c_{F_1}(q) - c_{F_{1-\epsilon}}(q) \leq \epsilon L$.

Case 1: $J \geq L$, so $q^* = \epsilon$.

$$\int_0^{q^*} c_{F_1}(q) - c_{F_{1-\epsilon}}(q)\,dq = \int_0^{\epsilon} J - \frac{q}{\epsilon} \cdot \big((1-\epsilon)L\big)\,dq = J\epsilon - \frac{1}{2}\epsilon(1-\epsilon)L.$$

This implies $\int_{q^*}^1 c_{F_1}(q) - c_{F_{1-\epsilon}}(q)\,dq = \frac{1}{2}\epsilon(1-\epsilon)L$. The worst case is when the difference is $\epsilon L$ on some $q$-interval, and 0 elsewhere. For small $\epsilon > 0$ with $\epsilon L < 1$,

$$\int_{q^*}^1 \big(c_{F_1}(q) - c_{F_{1-\epsilon}}(q)\big)^{\alpha}\,dq \geq (\epsilon L)^{\alpha} \cdot \frac{(1/2) \cdot \epsilon(1-\epsilon)L}{\epsilon L} = \frac{1}{2}(\epsilon L)^{\alpha}(1-\epsilon).$$

Therefore, for small $\epsilon > 0$,

$$\frac{N(1_G \mid (1-\epsilon)1_G \oplus \epsilon 1_B)}{\epsilon} \geq \frac{1}{2} \cdot \frac{1}{\epsilon^{1-\alpha}} L^{\alpha}(1-\epsilon),$$

which diverges to $\infty$ as $\epsilon \to 0$.

Case 2: $J < L$, so $q^* = \epsilon J/L$.

$$\int_0^{\epsilon J/L} c_{F_1}(q) - c_{F_{1-\epsilon}}(q)\,dq = \int_0^{\epsilon J/L} J - \frac{q}{\epsilon J/L}\left(J - \frac{J}{L}\epsilon \cdot L\right)dq = \frac{1}{2}\frac{J^2}{L}\epsilon + \frac{1}{2}\frac{J^2}{L^2}\epsilon^2 L < \frac{1}{2}J\epsilon + \frac{1}{2}L\epsilon^2$$

using $J < L$. This then implies $\int_{q^*}^1 c_{F_1}(q) - c_{F_{1-\epsilon}}(q)\,dq > \frac{1}{2}J\epsilon - \frac{1}{2}L\epsilon^2$. So, again using the worst case of the difference being $\epsilon L$ on some $q$-interval, and 0 elsewhere,

$$\frac{N(1_G \mid (1-\epsilon)1_G \oplus \epsilon 1_B)}{\epsilon} > \frac{1}{\epsilon}(\epsilon L)^{\alpha} \cdot \frac{\frac{1}{2}J\epsilon - \frac{1}{2}L\epsilon^2}{\epsilon L} = \frac{1}{\epsilon^{1-\alpha}} L^{\alpha} \cdot \left(\frac{1}{2}J/L - \frac{1}{2}\epsilon\right).$$

As $\epsilon \to 0$, the RHS converges to $\infty$.

.2 A Calibration Comparing Percentile-Based News Utility and Mean-Based News Utility

Since Proposition 1's procedure for computing the optimal information structure applies to general $N$, including both the percentile-based and the mean-based news-utility functions in an environment with residual consumption uncertainty, we can compare the solutions to the sender's problem for these two models.

Consider two states of the world, $\Theta = \{G, B\}$. For some $\sigma > 0$, suppose consumption is distributed normally conditional on $\theta$ with $F_G = \mathcal{N}(1, \sigma^2)$, $F_B = \mathcal{N}(0, \sigma^2)$, consumption utility is $v(x) = x$, and gain-loss utility (over consumption) is $\mu(x) = \sqrt{x}$ for $x \geq 0$, $\mu(x) = -[\ldots]\sqrt{-x}$ for $x < 0$. We calculated the optimal information structure for the mean-based model in an analogous environment, as reported in Figure 2.

With the percentile-based model, an agent who believes $P[\theta = G] = \pi$ has a belief over final consumption given by a mixture normal distribution, $\pi F_G \oplus (1-\pi) F_B$, illustrated in Figure A.2.

We plot in Figure A.3 the optimal information structures for $T = 5$, $\sigma = 1$. The optimal information structures for $\sigma = 0.[\ldots], 1, 10$ all involve gradual good news, one-shot bad news. Table A.1 lists the optimal disclosure of good news over time. Not only are the shapes of the concavification problems qualitatively similar to those of the mean-based model, but the resulting optimal information structures also bear striking quantitative similarities.

                                t = 0   t = 1   t = 2   t = 3   t = 4   t = 5
percentile-based, σ = 0.[…]     […]
percentile-based, σ = 1         0.50    0.55    0.62    0.71    0.83    1.00
percentile-based, σ = 10        0.50    0.56    0.63    0.72    0.84    1.00
mean-based, any σ               […]

Table A.1: Optimal disclosure of good news over time, with $\lambda = 1.[\ldots]$, $T = 5$, $\sigma = 0.[\ldots], 1, 10$. The table shows belief movements conditional on the good state in different periods.

[Figure A.2 panels: "Densities of consumption utility distributions" (x-axis: consumption utility; y-axis: density) and "CDFs of consumption utility distributions" (x-axis: consumption utility; y-axis: CDF), each with curves for p = 0.1 and p = 0.9.]

Figure A.2: The densities and CDFs of final consumption utility distributions under two beliefs about $P[\theta = G]$, $\pi = 0.1$ and $\pi = 0.9$. The dashed black lines in the CDFs plot show the differences in consumption utilities at the 25th percentile, 50th percentile, and 75th percentile levels between these two beliefs. The news utility associated with updating from one of these beliefs to the other is obtained by applying $\mu$ to all these differences in consumption utilities at various quantiles, then integrating over all quantile levels in $[0, 1]$.

Figure A.3: $T = 5$, gain-loss function $\mu(x) = \sqrt{x}$ for $x \geq 0$ and $\mu(x) = -[\ldots]\sqrt{-x}$ for $x < 0$, $\pi_0 = 0.5$, using Kőszegi and Rabin (2009)'s percentile-based model in a Gaussian environment with $\sigma = 1$. The $y$-axis in each graph shows the sum of news utility this period and the value function of entering next period with a certain belief.

From Table A.1, it appears that percentile-based and mean-based models deliver more similar results for larger $\sigma$. We provide an analytic result consistent with the idea that these two models generate similar amounts of news utility when the state-dependent consumption utility distributions have large variances.

Proposition 16.

Suppose $\Theta = \{B, G\}$ and the distributions of consumption utilities in states $B$ and $G$ are $\text{Unif}[0, L]$ and $\text{Unif}[d, L+d]$ respectively, for $L, d > 0$. Let $N^{perc}(p_2 \mid p_1)$ be the news utility associated with changing belief in $\theta = G$ from $p_1$ to $p_2$ in a percentile-based news-utility model with a continuous gain-loss utility $\mu$. Then,

$$\lim_{L \to \infty} \left( \sup_{0 \leq p_1, p_2 \leq 1} \left| N^{perc}(p_2 \mid p_1) - \mu[(p_2 - p_1)d] \right| \right) = 0.$$

In a uniform environment, if there is enough unresolved consumption risk even conditional on the state $\theta$, then the difference between percentile-based news utility and mean-based news utility goes to zero uniformly across all possible belief changes.

Proof.

Let $F_p(x)$ be the distribution function of the mixed distribution $p \cdot \text{Unif}[d, L+d] \oplus (1-p) \cdot \text{Unif}[0, L]$, and $F_p^{-1}(q)$ its quantile function for $q \in [0, 1]$. By a simple calculation, $F_p^{-1}(d/L) = d + pd$ and $F_p^{-1}(1 - d/L) = L + pd - d$. At the same time, for $d/L \leq q \leq 1 - d/L$ where $q = d/L + y$, we have $F_p^{-1}(q) = d + pd + yL$.

This shows that over the intermediate quantile values between $d/L$ and $1 - d/L$,

$$\int_{d/L}^{1-d/L} \mu\left[F_{p_2}^{-1}(q) - F_{p_1}^{-1}(q)\right] dq = \int_{d/L}^{1-d/L} \mu[(p_2 - p_1)d]\,dq = (1 - 2d/L) \cdot \mu[(p_2 - p_1)d].$$

For the lower part of the quantile integral, $[0, d/L]$, using the fact that $F_p^{-1}(d/L) = d + pd$, we have the uniform bound $0 \leq F_p^{-1}(q) \leq 2d$ for all $p \in [0, 1]$ and $q \leq d/L$. So,

$$\left| \int_0^{d/L} \mu\left[F_{p_2}^{-1}(q) - F_{p_1}^{-1}(q)\right] dq \right| \leq \frac{d}{L} \cdot \max_{x \in [-2d, 2d]} |\mu(x)|.$$

By an analogous argument,

$$\left| \int_{1-d/L}^{1} \mu\left[F_{p_2}^{-1}(q) - F_{p_1}^{-1}(q)\right] dq \right| \leq \frac{d}{L} \cdot \max_{x \in [-2d, 2d]} |\mu(x)|.$$

So for any $0 \leq p_1, p_2 \leq 1$,

$$\left| N^{perc}(p_2 \mid p_1) - \mu[(p_2 - p_1)d] \right| \leq \frac{2d}{L} \max_{x \in [-d, d]} |\mu(x)| + 2\frac{d}{L} \max_{x \in [-2d, 2d]} |\mu(x)|,$$

an expression not depending on $p_1, p_2$. The first term bounds the shortfall $(2d/L) \cdot \mu[(p_2 - p_1)d]$ left by the intermediate quantiles, and the second term bounds the two tail integrals. The max terms are seen to be finite by applying the extreme value theorem to the continuous $\mu$, so the RHS tends to 0 as $L \to \infty$.

Lemma 3 in the Online Appendix of Kőszegi and Rabin (2009) states a similar result, but for a different order of limits.
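The convergence in Proposition 16 can be checked numerically. The sketch below is a rough illustration under stated assumptions rather than part of the paper's analysis: it uses the closed-form quantile function of the uniform mixture (valid for $L > d$), an assumed square-root gain-loss function with loss-aversion coefficient 1.5, $d = 1$, a midpoint rule over quantiles, and a coarse belief grid $\{0, 0.5, 1\}$, so the reported gap is a maximum over that grid only, not the full supremum.

```python
# Sketch only: mu, d, the belief grid, and the quantile grid are assumptions.

def mu(x, lam=1.5):
    """Assumed gain-loss utility: sqrt(x) for gains, -lam * sqrt(-x) for losses."""
    return x ** 0.5 if x >= 0 else -lam * (-x) ** 0.5


def quantile(q, p, d, L):
    """Quantile of p * Unif[d, L+d] + (1-p) * Unif[0, L]; needs L > d, 0 < q < 1."""
    if q < (1 - p) * d / L:            # only state B puts mass below d
        return q * L / (1 - p)
    if q <= 1 - p * d / L:             # both components overlap on [d, L]
        return q * L + p * d
    return d + L * (q - (1 - p)) / p   # only state G puts mass above L


def max_gap(L, d=1.0, n=20000, beliefs=(0.0, 0.5, 1.0)):
    """Largest |N_perc(p2 | p1) - mu((p2 - p1) d)| over the belief grid."""
    grid = [(i + 0.5) / n for i in range(n)]
    table = {p: [quantile(q, p, d, L) for q in grid] for p in beliefs}
    worst = 0.0
    for p1 in beliefs:
        for p2 in beliefs:
            # Midpoint-rule integral of mu over quantile-wise differences.
            n_perc = sum(mu(a - b) for a, b in zip(table[p2], table[p1])) / n
            worst = max(worst, abs(n_perc - mu((p2 - p1) * d)))
    return worst


gaps = {L: max_gap(L) for L in (5.0, 50.0, 500.0)}
for L, gap in gaps.items():
    # Bound from the proof with d = 1: (2/L) max|mu| on [-1, 1] + (2/L) max|mu| on [-2, 2].
    bound = (2 / L) * 1.5 + (2 / L) * 1.5 * 2 ** 0.5
    print(L, gap, bound)
```

With these assumed inputs, the computed gap shrinks as $L$ grows and stays below the bound derived in the proof, up to discretization error.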