Only Time Will Tell: Credible Dynamic Signaling

Abstract

This paper explores a model of dynamic signaling without commitment. It is known that separating equilibria do not exist if the sender cannot commit to future costly actions, since no single action can have enough weight to be an effective signal. This paper, however, shows that informative and payoff-relevant signaling can occur even without commitment and without resorting to unreasonable off-path beliefs. Such signaling can only happen through attrition, when the weakest type mixes between revealing own type and pooling with the stronger types. The possibility of full information revelation in the limit hence depends crucially on the assumptions about the state space. We illustrate the results by exploring a model of dynamic price signaling and show that prices may be informative of product quality even if the seller cannot commit to future prices, with both high and low prices being able to signal high quality.


Egor Starkov †

July 21, 2020


Keywords: dynamic signaling, repeated signaling, reputation, attrition

JEL Codes: C73, D82, D83, L15

In his seminal contribution, Spence [1973] argued that economic agents’ actions can signal their private information, giving an example of schooling as a signal of ability in labor markets. In the years since, researchers have extensively studied signaling models, describing the fundamental forces driving them and identifying signaling patterns in a wide spectrum of applications: from bargaining (Vincent [1990]) and limit pricing (Milgrom and Roberts [1982a,b]) to corporate finance (Leland and Pyle [1977]) and advertising (Milgrom and Roberts [1986]).

While most signaling models explore static interactions, dynamic signaling models may be better suited to explore some applications. For example, the choice of price to signal product quality or the choice of education/effort to signal a worker’s ability are both inherently dynamic problems, since the sender must in both cases repeatedly reaffirm their action choice. However, signaling in dynamic settings has a salient conceptual problem. In the context of Spence’s story of signaling ability through education, this problem was formulated by Admati and Perry [1987]: “Once a high ability worker has gone to school long enough to distinguish himself from a worker of lower ability, the firms [...] before enough time has elapsed to present an effective screen” (p.363).

∗ This paper is based on chapter 3 of the author’s Ph.D. thesis. The author thanks Nemanja Antić, Eddie Dekel, Jeffrey Ely, Yingni Guo, Nicolas Inostroza, Johan Lagerlöf, Alexey Makarin, Wojciech Olszewski, Marco Schwarz, Ludvig Sinander, Peter Norman Sørensen, Bruno Strulovici and seminar participants at Northwestern University and University of Copenhagen for valuable feedback and helpful comments.

† Department of Economics, University of Copenhagen, Øster Farimagsgade 5, bygning 26, 1353 København K, Denmark; e-mail: [email protected].

Footnote: See Riley [2001] for an excellent survey of the early literature on signaling.
In other words, if workers cannot commit to completing their education in the future and firms cannot commit to not hiring undereducated workers, then education cannot serve as an effective signal. To give a particular example: when low-ability workers are not supposed to pursue a college degree, a single day spent in that pursuit would imply that a worker’s ability is not low, and market wages for college dropouts would be correspondingly high – too high to actually deter low-ability workers from enrolling in college only to drop out soon thereafter.

The literature has responded to this conceptual challenge by searching for aspects of such dynamic interactions which would neutralize the argument above, thus enabling variations of the dynamic signaling model in which static separation is possible. Proposed solutions include: altering the payoffs to add intrinsic motivation for signaling (Weiss [1983]), tacit collusion on the receivers’ side to generate instrumental commitment (Nöldeke and van Damme [1990a], Swinkels [1999]), evolving sender’s type to create the need for maintaining reputation as opposed to establishing it once (Roddie [2012a,b]), or receivers observing noisy outcomes instead of the sender’s actions (Dilmé [2017], Heinsalu [2018]). The basic case without any of the above is implicitly perceived as one in which informative signaling is impossible – if an equilibrium even exists, that is. However, the impossibility argument of Admati and Perry [1987] only applies to perfectly separating outcomes – it does not preclude partial separation, meaning the outcome where certain actions act as suggestive rather than conclusive evidence of the sender’s information. The limits of such suggestive signaling in dynamic settings have, to our best knowledge, not been carefully investigated in the literature. We aim to fill this gap.

This paper undertakes the mission to characterize all informative outcomes that can arise in a general model of dynamic (a.k.a. repeated) signaling without commitment. Our main result shows that the scope for signaling, while limited, does in fact exist in dynamic settings, contrary to the intuition of Admati and Perry [1987]. In particular, payoff-relevant signaling is possible via what is effectively a war of attrition, in which all sender types pool on the same action, with the lowest type mixing between pooling with the rest and separating to a myopically optimal action. Beyond such attrition, actions are as informative as cheap talk. The contribution of this paper is both in showing the existence of a wedge between signaling and cheap talk in the setting under consideration – a wedge presumed nonexistent by the existing literature – and in characterizing this wedge explicitly.

The conclusion regarding the uniqueness of attrition as an informative equilibrium outcome leads into another message of the paper, which is methodological. The mechanism of attrition of the lowest type can yield full separation asymptotically if there are only two types in the game but not if there are more (but finitely many). Further, one can show that in an analog of our model with a continuous type space, full asymptotic revelation is possible again. This aims to show that one’s modelling assumptions may crucially affect the result even when they are about an object that is as seemingly abstract and arbitrary as the type space.

While the attrition structure is restrictive, it nonetheless allows for a nontrivial equilibrium multiplicity. In addition to various possible combinations of informative and uninformative periods, [...]

Footnote: One possible explanation for the lack of commitment is the possibility of renegotiation; see Beaudry and Poitevin [1993].

Footnote: See Whitmeyer [2019] for a discussion regarding the sender-optimal amount of noise in signaling.

Footnote: An example of such outcome is presented by Fuchs and Skrzypacz [2010].

[...] a′, the low type mixes between a′ and a′′ with positive probabilities”) is encountered in many applied dynamic models.
However, in spite of the overwhelming presence in the applied theory literature, the issue has never received a rigorous treatment in the signaling literature. Our paper amends that, demonstrating that the uniqueness of attrition as an equilibrium structure arises in general settings well beyond the specific models explored previously.

Our analysis relies on the restriction of off-equilibrium path beliefs to be “reasonable”. In particular, we adopt the assumption of non-increasing belief supports or, as labeled by Bond and Zhong [2016], the NDOC (“Never Dissuaded Once Convinced”) assumption. As the name suggests, it implies that once the receiver has ruled out some type of the sender as impossible, the receiver stands by this belief and never again assigns positive probability to that type, including off the equilibrium path. Kaya [2009] and Roddie [2012a,b] have shown that in the absence of NDOC full instantaneous separation is possible in dynamic settings, since the sender’s behavior can be disciplined by strong reputational threats in case of deviations. While the approach can be justified when the sender’s type may change over time and hence needs constant re-verification, in other settings it is susceptible to a critique of using unreasonable off-path threats to sustain an equilibrium – a practice typically reproved in the literature on equilibrium refinements for static signaling games, as well as equilibrium concepts for dynamic games.

To illustrate, consider again the job market signaling example. In the story of Spence [1973], high-ability students commit (during the college applications period) to obtaining a college degree. This commitment is too costly for low-ability students, who then forego college altogether and accept lower wages. In the absence of commitment – if a student must every day decide whether to stay in college or drop out and pursue a job at a competitive wage – this equilibrium can still be sustained by unreasonable off-path beliefs of employers. Such beliefs would treat any high school graduate as high-ability as long as they are on track for a college degree – but any college dropout is immediately downgraded to low-ability in employers’ eyes. This belief system incentivizes high-ability students to endure all four years of college to obtain higher wages and disincentivizes low-ability students from pursuing higher education.
However, this belief system is internally inconsistent: only high-ability students are believed to go to college in the first place, so dropouts cannot be of low ability! NDOC rules out exactly this kind of inconsistency.

Furthermore, it feels tongue-in-cheek to even call equilibria like the one above “separating”, since the sender of some type can never properly “separate” from other types in that scenario. While the [...]

Footnote: Some examples include Vincent [1990], Deneckere and Liang [2006], Daley and Green [2012], Lee and Liu [2013], Dilmé and Li [2016], Dilmé [2017], Kaya and Kim [2018] in bargaining and bilateral trade; Strebulaev, Zhu, and Zryumov [2016] in corporate finance; Vettas [1997], Aköz, Arbatli, and Çelik [2017], Gryglewicz and Kolb [2019], Smirnov and Starkov [2020] in industrial organization/marketing; Smirnov and Starkov [2019] in cheap talk; De Angelis, Ekström, and Glover [2018] in Dynkin games.

Footnote: Cf. Banks and Sobel [1987] and Cho and Kreps [1987] for signaling and chapter 4 in Myerson [1997] for extensive-form games respectively.

[...] possible, and the sender’s deviation from the equilibrium path would lead the receiver to recognize these types as probable. If any tremble away from the prescribed strategy can ruin all of the sender’s acquired reputation, then what is such reputation worth? In contrast, the NDOC assumption allows us to explore the limits of credible signaling – that which is not reversed by future deviations. NDOC has been widely used in applied models.

On a separate note, NDOC has been criticized as leading to possible equilibrium nonexistence (see Madrigal, Tan, and Werlang [1987] and Nöldeke and van Damme [1990b]). We thus characterize the equilibria conditional on existence, without making any existence claims. Section 5, however, provides some examples of such equilibria.

In order to illustrate our results, we explore a simple model of price signaling.
In this model a firm is privately informed about the quality of its product, and sets the price of the product in every period in an attempt to signal this quality and increase the consumers’ willingness to pay. We construct a family of informative equilibria in this setting and show that both inefficiently low and inefficiently high prices are equally fit to signal high quality in equilibrium, while the literature (some of it mentioned in Section 5) has typically focused on one of the two as a signal. As a consequence, the price path in informative equilibria is highly indeterminate (even beyond the dichotomy outlined above, as we show), which makes it difficult to test empirically whether price signaling is taking place in a given industry.

The remainder of this paper is organized as follows. Section 2 describes the model. We then proceed to analyze two versions of this model. The two-type version in Section 3 can be seen as an illustrative example. The version with finitely many types is then explored in Section 4. Section 5 considers an application to price signaling and illustrates how our results can be used to construct informative equilibria. Section 6 concludes. All proofs and a number of supplementary results are contained in Appendix A. Further applications of the general model are briefly outlined in Appendix B.

We will be looking at a continuous limit of a discrete-time infinite-horizon game. Time is indexed by $t \in \mathcal{T} \equiv \{0, dt, 2dt, \ldots\}$; period length $dt$ is assumed to be arbitrarily small. There is a long-lived agent (sender) who has some persistent type $\theta \in \Theta$, where $\Theta$ is a finite ordered set. Alternatively, $\theta$ can be the state of the world that the agent is privately informed of.

In every period $t$ the agent has to choose an action $a_t \in A$, where $A$ is some compact set. The agent’s action choices affect public outcome $x_t \in X$, which is a random process, and its distribution at time $t$ depends on $\theta$ and $a_t$ (more generally, it can depend on the whole past history). We assume that outcomes never allow one to perfectly identify $\theta$: the support of $x_t$ conditional on $a_t$ does not depend on $\theta$. Let $h_t \equiv \{a_s, x_s\}_{s \in \mathcal{T}, s < t}$ [...]

Footnote: Away from Bond and Zhong [2016], one can also find analogs of NDOC in Grossman and Perry [1986], LeBlanc [1992], Vettas [1997], Kraus, Wilkenfeld, and Zlotkin [1995], Sen [2000], Feinberg and Skrzypacz [2005], Lai [2014], Gryglewicz and Kolb [2019], Smirnov and Starkov [2019, 2020].

Footnote: A similar point was made by Kaya [2013] in relation to signaling product quality via advertising expenditures.

Footnote: One could call $x_t$ a public “signal”; we avoid this phrasing so as to not create confusion with the process of signaling through actions.

Definition 1.

A Perfect Bayesian Equilibrium is given by the agent’s strategy profile $\alpha = \{\alpha_\theta\}_{\theta \in \Theta}$ with $\alpha_\theta : \mathcal{H} \to \Delta(A)$ and the receiver’s belief system $p : \mathcal{H} \to \Delta(\Theta)$ such that:
1. the agent’s strategy profile $\alpha$ is optimal for all types $\theta$;
2. the receiver’s belief $p$ is updated using Bayes’ rule whenever possible.

PBE is a maximally permissive solution concept. Our main results characterize signaling in all PBE that satisfy NDOC (as defined in the following subsection), hence they will also apply if one imposes additional restrictions or equilibrium refinements on top of PBE with NDOC.
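To make the second requirement of Definition 1 concrete, the following is a minimal Python sketch of a single Bayesian update step. It is an illustration under simplifying assumptions and not part of the formal model: outcomes $x_t$ are taken to be uninformative and are ignored, strategies have finite support, and the names `bayes_update`, `pool`, `reveal` and `other` are hypothetical. The function returns `None` at off-path actions, which is exactly where Bayes’ rule is silent and where a restriction such as NDOC has bite.

```python
from typing import Dict, Optional

Belief = Dict[str, float]                # type -> probability
Strategy = Dict[str, Dict[str, float]]   # type -> (action -> probability)


def bayes_update(prior: Belief, strategy: Strategy, action: str) -> Optional[Belief]:
    """Posterior over types after observing `action`.

    Returns None when the action has zero probability under the prior and
    the strategy profile, i.e. when Bayes' rule cannot be applied.
    """
    # Joint probability of each (type, observed action) pair.
    joint = {th: prior[th] * strategy[th].get(action, 0.0) for th in prior}
    total = sum(joint.values())
    if total == 0.0:
        return None  # off-path action: PBE alone does not pin down the belief
    return {th: w / total for th, w in joint.items()}


# Hypothetical two-type illustration: H always pools, L mixes.
prior = {"L": 0.5, "H": 0.5}
strategy = {
    "H": {"pool": 1.0},
    "L": {"pool": 0.6, "reveal": 0.4},
}
after_pool = bayes_update(prior, strategy, "pool")        # belief shifts towards H
after_reveal = bayes_update(prior, strategy, "reveal")    # degenerate belief on L
after_deviation = bayes_update(prior, strategy, "other")  # None: off path
```

In this sketch the action played only by the low type leads to a degenerate belief on L, while the pooling action raises the probability of H without ruling L out.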

The two sections above define the primitives of the model but impose only very minimal restrictions on them. Throughout the paper, we will also impose the following assumptions:

(MON) Flow payoff function $u_\theta(a_t, p_t)$ is weakly increasing in $p_t$ w.r.t. FOSD mass shifts. I.e., for any $p', p'' \in \Delta(\Theta)$ such that $p'(\theta') > p''(\theta')$, $p'(\theta'') < p''(\theta'')$ for some $\theta' > \theta''$, and $p'(\theta) = p''(\theta)$ for all $\theta \in \Theta \setminus \{\theta', \theta''\}$, it should be that $u_\theta(a_t, p') \geq u_\theta(a_t, p'')$. Further, for any $\theta$ and $a_t$, if $p_t >_{FOSD} \delta_\theta$ then $u_\theta(a_t, p_t) > u_\theta(a_t, \delta_\theta)$.

(FIN) Equilibrium strategy $\alpha$ has finite support for all $h_t \in \mathcal{H}$.

(NDOC) Process $p_t$ is progressively absolutely continuous. I.e., for any $h_s \supset h_t$, $p(h_s)$ is absolutely continuous w.r.t. $p(h_t)$.

In the above, as well as everything that follows, $\delta_\theta$ is the Dirac delta: “$p(h_t) = \delta_\theta$ for some $\theta \in \Theta$” is equivalent to saying that $p(\theta | h_t) = 1$ and $p(\theta' | h_t) = 0$ for all $\theta' \neq \theta$.

Below are some conditions equivalent or related to the above, with the relations between the respective pairs of conditions described in the subsequent text and summarized in Lemma 1.

(MON-2) $\Theta = \{H, L\}$ and $u_\theta(a_t, p_t)$ is weakly increasing in $p_t(H)$ and $u_L(a_t, p_t) > u_L(a_t, \delta_L)$ for all $a_t$ and $p_t \neq \delta_L$.

(FIN-M) Action set $A$ is finite.

(NDOC-P) After any action $a$ that is not on path at $h_t \in \mathcal{H}$: $p(h_t \cup (a, x_t)) = \delta_{\min S(h_t)}$ for any $x_t \in X$.

The first assumption, (MON), requires the sender’s flow payoff function to be monotone w.r.t. reputation $p_t$. It is sufficient to have weak monotonicity (w.r.t. FOSD order on beliefs) with the exception that it must always be strictly beneficial to pool with the higher types. This assumption [...]

Footnote: On-pathness is defined in the usual way; see Section 4.2 for a formal definition.

Footnote: The strict part of (MON) simplifies the analysis but rules out some relevant cases. E.g., it disallows the payoff to be a step function, which is the case when the agent only cares about his reputation being above some cutoff.

[...] $|\Theta| = 2$) it will also be the only restriction on payoffs needed for the result.
Further, in case of two types it can be written more simply as (MON-2). In other words, if $|\Theta| = 2$ then (MON) and (MON-2) are equivalent.

The next pair of assumptions is (FIN) and (FIN-M). The former is an equilibrium refinement that demands that the sender’s equilibrium strategy has finite support at any history. The latter is a restriction on the model in that the action set of the sender is finite. The analysis in the remainder of the paper relies on one of these two conditions to hold in order to avoid the problem of Bayesian inference from zero probability events. This problem is illustrated by the following example.

Example 1.

Let $\Theta = \{L, H\}$, $A = [0, 1]$, and suppose that outcomes are uninformative: $x_t \equiv 0$. Fix some equilibrium and history $h_t \in \mathcal{H}$ therein. Suppose that the high type’s strategy $\alpha_H(h_t)$ assigns weight one to action $a = 1$, while the low type’s strategy $\alpha_L(h_t)$ mixes uniformly over all actions $a \in [0, 1]$. By Bayes’ rule, the receiver’s belief at history $h_{t+dt} = h_t \cup (1, 0)$ must assign probability one to type $H$ and probability zero to type $L$. However, it is not immediate whether the receiver must in such cases rule out type $L$ at all histories following $h_{t+dt}$ (which he does if (NDOC) holds).

It is immediate that (FIN-M) is sufficient for (FIN) to hold, therefore in the remainder of this paper we use (FIN). However, (FIN-M) is a useful reminder that in finite games no further equilibrium refinements are required, apart from (NDOC).

Finally, (NDOC) is the assumption on the equilibrium beliefs that drives our analysis. In particular, it says that if $p(\theta | h_t) = 0$ then $p(\theta | h_s) = 0$ for any pair of histories $h_s \supset h_t$ in $\mathcal{H}$. Note that this applies both on and off the equilibrium path. For the discussion of (NDOC), refer to Section 1. To simplify the analysis, we strengthen (NDOC) to (NDOC-P), which requires that off the equilibrium path, the receiver’s belief $p(h_t)$ must be pessimistic – it must put all weight on the lowest type among those not yet ruled out by the receiver. Given (MON), this condition imposes the strongest possible punishment on the sender for any deviation, among those punishments that satisfy (NDOC). Therefore, we argue that for any equilibrium that satisfies (NDOC), there exists an equivalent one that satisfies (NDOC-P), despite the latter being a stronger condition.

The claims made in this section are summarized by the following lemma.

Lemma 1.

Model assumptions are connected through the following relations.
1. If $|\Theta| = 2$ then (MON) and (MON-2) are equivalent.
2. (FIN-M) implies (FIN).
3. If (MON) holds then for any equilibrium that satisfies (FIN) and (NDOC), there exists a payoff-equivalent and on-path strategy-equivalent equilibrium that satisfies (FIN) and (NDOC-P).

This section discusses the assumptions that are implicit in the model set-up so that the reader can get a clearer picture of which aspects of the model are important for the results, and which modelling assumptions were made purely for expositional simplicity.

To start with, the model setup includes a number of assumptions that impede instantaneous separation of types, namely: persistent sender’s type $\theta$, vanishing period length, compact action set and finite action costs. The first is required for (NDOC) to have any bite: if type could change over time then the sender’s reputation would need constant re-verification, meaning that credible signaling is impossible by design. The other three assumptions are meant to remove any implicit commitment power the sender may have (since in discrete time he can effectively commit to not revise his action until the next period) and to remove the potential of any given single action to be informative. All of these assumptions restrict us to the world in which, according to Weiss [1983] and Admati and Perry [1987], perfect separation is impossible, since this is the world we are interested in exploring.

This paper’s message is not about arguing that this is the only plausible set of assumptions in dynamic signaling. Indeed, there are many settings in which a single action has the weight to be informative enough by itself, or the sender has at least some commitment power, or the sender’s type is, in fact, volatile.
This paper argues instead that there exist real-world settings, to which the aforementioned set of assumptions applies, and we as economists care about characterizing them. For example, one such setting is price signaling by the firm – be it signaling of the firm’s product quality to consumers or signaling of its production costs to existing and potential competitors. Prices are typically perfectly observable and can be changed frequently at no cost.

The above raises the question of why we limit ourselves to discrete time, thus giving the sender limited commitment power, rather than exploring a proper continuous time model. The answer is simplicity. While the essence of our results carries over to the continuous time case, their statements become less clear-cut, and the analysis of such a model becomes encumbered by the specifics of the continuous-time analysis. Furthermore, one can argue that the discrete-time model is more general, since continuous time is its limit (special) case.

On the other hand, finiteness of the type space is a crucial assumption. For example, if type $\theta$ is distributed on an interval, then attrition takes a very different form from what is stated in Theorems 1 and 2. Instead of the lowest type separating with positive probability in every period (in which payoff-relevant signaling happens), we could have a positive mass of types at the lower end of the support separating every period. In continuous time, the lower bound of the support of types would increase smoothly over time along the pooling path, shrinking the support; an example of such equilibrium is constructed by Fuchs and Skrzypacz [2010]. Importantly, such attrition could lead to full separation in the limit as $t \to \infty$, unlike in the case with finitely many (but more than two) types.
Furthermore, we can no longer guarantee that with a continuum of types, attrition is the only way in which payoff-relevant signaling can proceed.

Finally, we assume that the receiver is passive, and the sender receives utility from reputation $p$. One may see this as a reduced form of a repeated Stackelberg game in which in every period $t$ the sender first chooses action $a_t$, which together with the respective outcome $x_t$ determines his current reputation $p_t$, and the receiver then responds with some action $b_t$, after which both players $i \in \{S, R\}$ receive utilities $u_{\theta i}(a_t, b_t)$. This is a standard reduction used both in signaling literature (Kaya [2009], Roddie [2012a,b]) and other literatures (e.g., Bayesian Persuasion – see Kamenica and Gentzkow [2011]). Given that this is by now a standard technique, we do not describe the full game in order to economize on notation. However, our results can be easily extended to both repeated Stackelberg games in which the sender and the receiver act in sequence in every period, and (with slightly more effort) to repeated games in which both act simultaneously.

Footnote: The receiver’s utility may depend on true $\theta$ as long as the receiver does not observe his own utility flow.

Two Types

This section explores the version of the model with only two types: $\Theta = \{L, H\}$. Here we show that signaling must take the form of attrition regardless of payoffs, as long as they are monotone in reputation $p_t$. The first part of Theorem 1 states that perfect separation cannot occur at any history in equilibrium: if a given action is on path for $\theta = H$ then it is also on path for $\theta = L$. This statement captures the idea of Admati and Perry [1987] and Nöldeke and van Damme [1990a]. We also observe that there may effectively be only one such pooling action in any period, in the sense that all pooling actions must be payoff-equivalent for all types of the agent. This follows trivially from the fact that both types must be indifferent between playing any such action if there are more than one.

The new insight is that the converse to the first statement is not necessarily true: if $\alpha_L(a | h_t) > 0$ then $\alpha_H(a | h_t)$ may or may not be positive. In other words, there may exist actions which perfectly identify the low type, even if there do not exist any that identify the high type. It is immediate that the low type must be mixing for this to be possible. All this is summarized by the second part of the theorem. The statement does not claim existence of any such separating actions, since they, of course, need not exist in any given case. However, Section 5 presents an example of a setting in which such an informative equilibrium exists.

Theorem 1.

Suppose that $\Theta = \{L, H\}$ and (MON-2) holds. In any equilibrium such that (FIN) and (NDOC) hold, at any $h_t \in \mathcal{H}$ with $S(h_t) = \{L, H\}$, and for any $a \in A$:
1. if $\alpha_H(a | h_t) > 0$ then $\alpha_L(a | h_t) > 0$. Further, all such $a$ are payoff-equivalent in the sense that $U_\theta(h_t \cup a)$ is the same across such $a$ for all $\theta$.
2. if $\alpha_H(a | h_t) = 0$ and $\alpha_L(a | h_t) > 0$ then $a \in B(\delta_L | L)$ and $U_L(h_t \cup a) = U_L(h_t \cup a')$ for any $a'$ such that $\alpha_H(a' | h_t) > 0$.

Note that the attrition structure of signaling imposes strong restrictions on actions that can be played in equilibrium. Firstly, any separating action perfectly identifies the low type, meaning it cannot reflect any signaling motives, and must hence be myopically optimal for the low type. Secondly, if the low type mixes between pooling and separating, then he must be indifferent between the two – meaning that pooling with the high type must yield exactly the same expected payoff for the low type as separation. Gains from pooling in this scenario (higher reputation) are exactly offset by the cost of taking suboptimal actions in current and/or future periods.

It is worth emphasizing that the result holds under very minimal assumptions on payoffs and signals: the only requirements imposed on the model are that the sender’s payoff is increasing in $p$ (which, in fact, is only required for the low type) and that the outcomes $x$ are not perfectly revealing.
In other words, if your game fits the following framework:
• dynamic game with continuous time or short time intervals or patient players,
• binary state of the world known by one player but not other(s),
• the informed player has a preference over other(s)’ beliefs, and the direction of this preference does not depend on the state,
• the informed player chooses an action every period but cannot verifiably reveal the state (action set is independent of type),
then the only informative equilibrium structure that can arise in this game (unless you are willing to allow for NDOC-nonconformant beliefs off the equilibrium path) is attrition. Under attrition, the high type is playing some pooling action, while the low type mixes between that and a separating action.

We now move to exploring the setting with more than two but finitely many types. In this section we show that the insight of Theorem 1 can be extended to this case, although allowing for many types does raise a number of additional issues and calls for extra assumptions.
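As a numerical illustration of the two-type attrition structure described above, the following Python sketch traces the receiver’s belief along the pooling path. It rests on assumptions that are not part of the theorem: the low type separates with a constant per-period probability `sigma` (in equilibrium this probability is pinned down by the low type’s indifference condition, which is not modeled here), outcomes are uninformative, and the function name `attrition_belief_path` is hypothetical.

```python
from typing import List


def attrition_belief_path(p0: float, sigma: float, periods: int) -> List[float]:
    """Belief on the high type after each period in which pooling is observed.

    p0      -- prior probability of the high type
    sigma   -- assumed constant probability with which the low type separates
    periods -- number of consecutive periods with the pooling action
    """
    path = [p0]
    p = p0
    for _ in range(periods):
        # Bayes' rule: H pools with probability 1, L pools with probability 1 - sigma.
        p = p / (p + (1.0 - p) * (1.0 - sigma))
        path.append(p)
    return path


path = attrition_belief_path(p0=0.5, sigma=0.1, periods=50)
```

Under these assumptions the belief on the high type rises in every pooling period and approaches one without reaching it in finite time, which is one way to picture asymptotic separation in the two-type case.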

In order to secure the result in case of many types, we need to impose the following new assumption on payoffs:

(SC) $U_\theta(\mathbf{a} | h_t)$ satisfies single-crossing in $(\theta, \mathbf{a})$ at all $h_t \in \mathcal{H}$. I.e., for any $\mathbf{a}', \mathbf{a}'' \in \cup_{h_t \in \mathcal{H}} \cup_{\theta \in S(h_t)} \arg\max_{\mathbf{a}} U_\theta(\mathbf{a} | h_t)$ and all $h_t \in \mathcal{H}$, function $U(\theta) \equiv U_\theta(\mathbf{a}'' | h_t) - U_\theta(\mathbf{a}' | h_t)$ either crosses zero at most once, or is identically zero.

This assumption belongs to a family of single-crossing conditions widely encountered in the literature on signaling, monotone comparative statics, and mechanism design. The purpose of our condition is standard: to ensure that the agent’s preferences over strategies satisfy a kind of monotonicity w.r.t. his type. Our condition, however, has three distinctive features which differentiate it slightly from other single-crossing conditions in the literature.

Feature 1. This is an assumption about an equilibrium object, since belief system $p(h_t)$ – which enters $U_\theta(\mathbf{a} | h_t)$ – is endogenous to equilibrium. The simplest way to justify the assumption in this respect is strengthening the assumption to all combinations of actions and beliefs $(\mathbf{a}, \mathbf{p})$, i.e., to assume that

$$\mathbb{E}\left[ \sum_{s \in \mathcal{T}, s \geq t} e^{-r(s-t)} u_\theta(\mathbf{a}(h_s), \mathbf{p}(h_s)) \, dt \;\middle|\; \theta, h_t \right]$$

satisfies single-crossing for all pairs $(\mathbf{a}', \mathbf{p}')$, $(\mathbf{a}'', \mathbf{p}'')$.

Feature 2. Unlike most standard single-crossing assumptions, (SC) does not require that $\mathbf{a}'' > \mathbf{a}'$ (and that single-crossing happens from below). In contrast, we also require the statement to hold for any pair of unordered strategies and respective belief profiles. The reason for that is while the grand set of choices $(A \times \Delta(\Theta))^{\mathcal{H}}$ is a lattice w.r.t. the product order for some given order on $A$, the subset of these choices available to the sender at any given history is not necessarily its sublattice. The standard monotone comparative statics results are then of limited use. On the upside, however, we only require that single-crossing is satisfied for those strategies that may be optimal for any type. I.e., if one can rule out some actions as certainly suboptimal at certain histories, these actions may be safely ignored.

Footnote: See Laffont and Martimort [2002] from a contract theory perspective (e.g., Ch. 2.2.3). Classic references on MCS, in turn, include Milgrom and Shannon [1994] and Athey [2002].

Footnote: The order implied here is the product order on the set $(A \times \Delta(\Theta))^{\mathcal{H}}$ of all collections $(\mathbf{a}(h_t), \mathbf{p}(h_t))$, composed of some order on $A$ (although we have not imposed any) and FOSD order on $\Delta(\Theta)$.

Feature 3. (SC) is a condition on the expectation of a discounted sum $\mathbb{E} \sum_t e^{-rt} u_\theta(a_t, p_t)$ rather than on the flow utility $u_\theta(a, p)$. While the latter would be preferable, aggregating single-crossing is not a trivial problem. Quah and Strulovici [2012] discuss this problem and offer possible solutions, but none of them apply to our setting due to feature 2 above.

All of the above means that (SC) is quite a non-trivial condition and may be difficult to verify in many models. If anything, verifying (SC) might as well be the main impediment to exploiting our results in applied models. However, this task is not impossible. Some examples of applied models, in which payoff functions can be easily verified to satisfy (SC), are presented in Section 5 and Appendix B.
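Since verifying (SC) is flagged above as the practical bottleneck, a small numerical check may be useful when exploring candidate payoff specifications. The Python sketch below tests whether a payoff difference, evaluated on a finite ordered type set, changes sign at most once. It reflects one possible reading of “crosses zero at most once” (zeros are ignored and only sign changes among nonzero entries are counted), which is an assumption and may differ from the formal definition used in the proofs; the function name `crosses_zero_at_most_once` is hypothetical.

```python
from typing import Sequence


def crosses_zero_at_most_once(diffs: Sequence[float], tol: float = 1e-12) -> bool:
    """Check a discrete single-crossing property.

    diffs[i] is the payoff difference U(theta_i) between two candidate
    strategies, with types theta_i listed in increasing order. The direction
    of the crossing is not restricted (cf. Feature 2).
    """
    # Reduce each entry to its sign, treating near-zero values as zero.
    signs = [0 if abs(d) <= tol else (1 if d > 0 else -1) for d in diffs]
    nonzero = [s for s in signs if s != 0]
    if not nonzero:
        return True  # identically zero
    # Count sign changes between consecutive nonzero entries.
    changes = sum(1 for a, b in zip(nonzero, nonzero[1:]) if a != b)
    return changes <= 1


# Hypothetical payoff differences over four ordered types.
ok_from_below = crosses_zero_at_most_once([-2.0, -1.0, 0.5, 3.0])
ok_from_above = crosses_zero_at_most_once([3.0, 1.0, -2.0])
ok_zero = crosses_zero_at_most_once([0.0, 0.0, 0.0])
violated = crosses_zero_at_most_once([1.0, -1.0, 1.0])
```

A check of this kind only covers the pairs of strategies and histories that are fed into it, so it can flag violations but cannot by itself establish (SC) for a whole model.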
Theorem 2, which we gradually build up to, is the analog of Theorem 1 for the case when $|\Theta| > 2$, in the sense of characterizing the actions available in equilibrium at any history. We begin, however, by stating a weaker result which provides a clearer characterization of the attrition structure of equilibrium signaling with $|\Theta| > 2$. Proposition 1 below establishes that as long as (SC) and other previously stated assumptions hold, strategies played in an arbitrary equilibrium of the game can be split into two classes. The first class consists of pooling strategies played by all types. While there may be many such strategies nominally, they must all be payoff-equivalent, so this class is, in a sense, degenerate. The second class is that of separating strategies employed by the lowest type – these may differ in which pooling strategies they mimic and for how long. However, any separating strategy is only played by the lowest type.

To state this and the following results we need to introduce some additional notation and definitions. Firstly, denote the two boundaries of the belief support as $\bar{S}(h_t) \equiv \max S(h_t)$ and $\underline{S}(h_t) \equiv \min S(h_t)$ respectively. Furthermore, in a manner similar to the type support $S$, given an equilibrium strategy profile let us define the action support as

$$A_\theta(h_t) \equiv \{ a \in A \mid \alpha_\theta(h_t)(a) > 0 \}, \qquad A(h_t) \equiv \cup_{\theta \in S(h_t)} A_\theta(h_t).$$

We say that a pure strategy a arrives at h t = { a s ( h t ) , x s ( h t ) } s ∈T ,s

Fix an equilibrium and history $h_t \in H$. Any two pure strategies $a', a'' \supset h_t$ are:
• payoff-distinct at $h_t$ if there exists $\theta \in S(h_t)$ such that $U_\theta(a' \mid h_t) \neq U_\theta(a'' \mid h_t)$;
• payoff-equivalent at $h_t$ if they are not payoff-distinct at $h_t$.

[Footnote: We also use the notation $a \supset h_t \cup a$ to state "$a \supset h_t$ and $a_t = a$".]

Note also that while we use the notation for full pure strategies, throughout the whole analysis we actually work with continuation strategies from some history $h_t \in H$. When discussing strategies conditional on some history $h_t$ we ignore all game paths that are ruled out by $h_t$. In particular, two pure strategies $a', a'' \supset h_t$ that prescribe the same actions at all $h_s \supseteq h_t$ but differ at some $h_s \not\supseteq h_t$ are treated as the same strategy for all intents and purposes. We avoid introducing the continuation strategies explicitly in order to economize on notation, which is quite heavy as is.

The result can now be stated as follows.

Proposition 1.

Suppose the payoff function $u_\theta$ satisfies (MON) and (SC). Fix an equilibrium such that (FIN) and (NDOC) hold. Fix some history $h_t \in H$. Then, defining $\underline{\theta} \equiv \underline{S}(h_t)$, the following hold:
1. all pure strategies $a'$ on path at $h_t$ for any $\theta \in S(h_t) \setminus \{\underline{\theta}\}$ are payoff-equivalent and optimal for all $\theta \in S(h_t)$ at $h_t$, and at least one of these strategies is on path for $\underline{\theta}$ at $h_t$;
2. any pure strategy $a''$ that is on path at $h_t$ and payoff-distinct at $h_t$ from any such $a'$ is only on path for $\underline{\theta}$.

The proposition implies, in particular, that any pure strategy $a$ that is on path for some type $\theta \in S(h_t)$ is also on path for the currently-lowest type $\underline{\theta}$. Therefore, no type of the agent can ever conclusively separate from $\underline{\theta}$. At the same time, there may exist strategies that separate $\underline{\theta}$ away from the remaining types. The weight that the receiver's belief assigns to $\underline{\theta}$ may thus decrease over time along the pooling path of play – it may even converge to zero asymptotically as $t \to \infty$ – but it may never become exactly zero.

The proposition above is stated in terms of strategies rather than actions, and so provides only limited insight into how equilibrium actions look in any given period. If $a'$ as defined in the proposition is unique, then it is relatively straightforward that in every period there will be a single pooling action, as prescribed by $a'$, that all types above $\underline{\theta}$ play for sure, while the lowest type mixes between this pooling action and some number (between zero and infinity) of separating actions. The challenge, however, comes from possible non-uniqueness of $a'$. If there are many pooling actions, then they may be informative in that different pooling actions convey different information, even though (or exactly because) all types are indifferent between them. The following sections explore this issue in more detail and work around it to characterize the within-period signaling outcomes.
To talk about signaling in relation to individual actions (rather than whole strategies), we need to define more precisely what "signaling" means in a dynamic context with many types. It is clear that if all types pool on the same action in a given period then no information is revealed, while if every type plays an action different from all others then full separation occurs, which is the most informative signaling outcome. The two grey zones are partitioning, when, for example, some types play action $a'$ and some others play $a''$, and mixing, when one type plays action $a'$ for sure and another type mixes between two actions $a'$ and $a''$. Both of the aforementioned outcomes are usually dubbed "semi-separation" in static settings and considered informative outcomes in signaling models. In dynamics, however, there are further complications.

In a dynamic setting, it matters, for both the sender's payoff and reputation, not only what action the sender plays in a given period, but also his past and future actions. In particular, a single costly action is inconsequential by itself, having only an infinitesimal effect on payoff and, as a result, reputation – unless, that is, it is backed up by costly actions in future periods. Symmetrically, future actions also have the power to negate the payoff consequences of past actions. This issue is illustrated by the following example, and its consequences are discussed further.

Example 2.

Suppose $\Theta = \{1, 2, 3\}$, types are ex ante equiprobable, $A = \mathbb{R}_+$, and $u_\theta(a, p) = \mathbb{E}_p(\theta) - a$. Then the following would be an equilibrium: type $\theta = 2$ plays some $a''$ at $t = 0$ and $a = 0$ at all $t \geq dt$, while types $\theta = 1, 3$ play $a' = a''(1 - e^{-rdt})$ at all $t \geq 0$. This is a PBE of the game as long as $a'' \leq$ [...]. In this PBE some information about type is conveyed in period zero, namely, type $\theta = 2$ separates from $\theta = 1, 3$, but this signaling is not relevant to the sender's payoff.

The same is not necessarily true for the receiver. In particular, we can think of this example as a game between a worker (sender) and a firm (receiver), where $a$ is the worker's effort and $\theta$ is his ability. Suppose the receiver's flow payoff is given by $v(a, p, \theta) = \theta a^2 - \mathbb{E}_p(\theta)$, with the first term being the worker's output, and $\mathbb{E}_p(\theta)$ in both players' payoffs being the worker's wage, dictated by the market. In this case the firm's expected discounted profit from hiring a worker of type $\theta = 2$ at $t = 0$ equals $2(a'')^2 - \frac{2}{1 - e^{-rdt}}$, while that from hiring a worker of type $\theta \in \{1, 3\}$ is

$$\frac{2\big(a''(1 - e^{-rdt})\big)^2}{1 - e^{-rdt}} - \frac{2}{1 - e^{-rdt}} = 2(a'')^2 (1 - e^{-rdt}) - \frac{2}{1 - e^{-rdt}},$$

which is strictly less. This is reversed for hiring decisions made at $t \geq dt$.

The example above illustrates that arbitrary information can in principle be conveyed via payoff-irrelevant signaling, when different types play different actions, but nonetheless all types are indifferent between all actions and respective continuations. This can be seen as a manifestation of "cheap talk": for small $dt$, actions in a single period are effectively costless, and so can be used as a pure communication device with no regard for action costs. Situations in which informative communication arises through cheap talk have been studied extensively; see Sobel [2013] for a recent survey. Further, one can claim that there will often be scope for such cheap (payoff-irrelevant) communication in our model.
This is due to the multi-dimensionality of reputation $p$: with more than two types there are bound to be situations in which an agent is indifferent between two kinds of average reputation, one that says the agent is of average type for sure and one that says the agent is of either high or low type with comparable probabilities, just like in the example above. [Footnote: Battaglini [2002] and Chakraborty and Harbaugh [2007, 2010] explore cheap talk with multidimensional information (which our setting is an instance of in case $|\Theta| > 2$) and show that there generally exist informative equilibria in which the receiver can perfectly learn about $N - 1$ out of $N$ dimensions of uncertainty from a single sender.]

Consequently, in this paper we focus on payoff-relevant signaling: communication that relies on heterogeneity across the agent's types in costs or benefits from different actions, in the spirit of Spence [1973]. The information revealed through payoff-relevant signaling is that which cannot be communicated via plain cheap talk. Our contribution to characterizing the informative outcomes in dynamic signaling games without commitment should then be seen as complementary to that of the cheap talk literature.
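The indifference between two "kinds of average reputation" can be illustrated with a minimal Python sketch. It assumes, purely for illustration, a three-point type space $\Theta = \{1, 2, 3\}$ and a reputational payoff that depends on the belief only through its mean $\mathbb{E}_p(\theta)$; the names `TYPES` and `expected_type` are hypothetical and not part of the paper.

```python
from fractions import Fraction

# Assumed three-point type space (an illustration, not the paper's general setting).
TYPES = (1, 2, 3)


def expected_type(belief):
    """Mean type E_p(theta) under a belief given as probabilities over TYPES."""
    return sum(prob * theta for prob, theta in zip(belief, TYPES))


# Belief 1: the agent is the middle type for sure.
sure_middle = (Fraction(0), Fraction(1), Fraction(0))
# Belief 2: the agent is either the low or the high type, equally likely.
high_or_low = (Fraction(1, 2), Fraction(0), Fraction(1, 2))

# The two beliefs differ as distributions but induce the same mean reputation.
print(expected_type(sure_middle), expected_type(high_or_low))  # prints "2 2"
```

Since the two beliefs carry different information but the same mean, a sender who cares only about $\mathbb{E}_p(\theta)$ is indifferent between inducing either of them, which is what leaves room for payoff-irrelevant communication.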

We now provide the formal definitions of payoff-relevant and payoff-irrelevant signaling in our setting.

Definition 3.

Fix an equilibrium and history $h_t \in H$.
• Payoff-relevant signaling happens at $h_t$ if there exist $a', a'' \in A(h_t)$ and $\theta \in S(h_t)$ such that $U_\theta(h_t \cup a') \neq U_\theta(h_t \cup a'')$.
• Payoff-irrelevant signaling happens at $h_t$ if there exist $a', a'' \in A(h_t)$ such that $p(h_t \cup a') \neq p(h_t \cup a'')$ but $U_\theta(h_t \cup a') = U_\theta(h_t \cup a'')$ for all $\theta \in S(h_t)$.

In other words, payoff-relevant signaling implies that at a given history $h_t$ there are two distinct actions on path, $a'$ and $a''$, and there is some type of the agent for which the choice between these two actions has payoff consequences. Note that since both actions are on path, it cannot be the case that all types prefer one over the other: both $a'$ and $a''$ must be optimal for some types of the agent. Payoff-relevance of this action choice is then defined as some type $\theta \in S(h_t)$ having a strict preference between the two.

We are now ready to state the theorem that characterizes payoff-relevant signaling in terms of actions, making the implications of Proposition 1 more explicit. The result below extends the message of Theorem 1 to the case of finitely many types, albeit at the cost of restricting the model scope to payoff functions that satisfy (SC).

Theorem 2.

Suppose the payoff function $u_\theta$ satisfies (MON) and (SC). Fix an equilibrium such that (FIN) and (NDOC) hold. Fix some history $h_t \in H$. If payoff-relevant signaling happens at $h_t$ then, defining $\underline{\theta} \equiv \underline{S}(h_t)$, the following hold:
1. any on-path action $a \in A(h_t)$ is on path for $\underline{\theta}$ at $h_t$;
2. $A(h_t) \cap B(\delta_{\underline{\theta}} \mid \underline{\theta})$ is nonempty, and any $\underline{a}$ in the intersection is on path only for $\underline{\theta}$ at $h_t$;
3. any action $\bar{a} \in A(h_t) \setminus B(\delta_{\underline{\theta}} \mid \underline{\theta})$ is optimal at $h_t$ for all $\theta \in S(h_t)$.

What the theorem says is that in any equilibrium with payoff-relevant signaling, there are effectively at most two kinds of actions (as opposed to strategies in Proposition 1) on path at any history: pooling actions (typical element $\bar{a}$) and separating actions (typical element $\underline{a}$). The latter are only ever played by the currently-lowest type $\underline{\theta} = \underline{S}(h_t)$ and separate him from the remaining types. As in Theorem 1, any separating action must be myopically optimal for the lowest type given that he is revealed.

Pooling actions, on the other hand, are optimal for all types. Further, if no payoff-irrelevant signaling takes place then any pooling action is, in fact, on path for all $\theta \in S(h_t)$, i.e., all types do actually pool on the pooling action(s). Notably, both payoff-relevant and payoff-irrelevant signaling may occur simultaneously at a given history. In that case there will be more than one pooling action, and while all of them are necessarily on path for $\underline{\theta}$, higher types may select different actions despite all types being indifferent between all of these actions (and the continuations they induce).

The corollary below relates to the situations when payoff-relevant signaling occurs at successive histories. It states that the pooling action in the earlier history must then be such that the low type is indifferent between separating and pooling, meaning that the flow payoffs the low type gets from the separating and pooling actions must be the same. The low type must be indifferent between separating at $t$ and $t + dt$, so one period of pooling must be exactly as attractive as one period of being identified as $\underline{\theta}$. In practice, this means that the pooling action must be costlier for $\underline{\theta}$ than the separating action, since the former yields a higher reputation payoff.

Corollary 1.

Suppose the conditions in Theorem 2 hold. Suppose payoff-relevant signaling occurs also at $h_{t+dt} \equiv h_t \cup (\bar{a}, x)$ for some $\bar{a}$ and all $x$ in the support. Then such $\bar{a}$ must satisfy

$$\mathbb{E}_x \big[ u_{\underline{\theta}}\big(\bar{a}, p(h_t \cup (\bar{a}, x))\big) \mid \underline{\theta} \big] = u_{\underline{\theta}}\big(\underline{a}, \delta_{\underline{\theta}}\big).$$

Finally, the theorem applies to all histories, including those off the equilibrium path. Applying it inductively starting from the root history, we get the following corollary, which states that in the absence of payoff-irrelevant signaling, only the lowest type $\min \Theta$ can ever separate from the rest, while the remaining ones can never separate from one another. It is worth noting that there may be histories $h_t$ at which $p(h_t)$ assigns arbitrarily small weight to the lowest type. What is important is that this type can never be ruled out completely along the pooling path.

Corollary 2.

In any equilibrium in which no payoff-irrelevant signaling happens, for any on-path history $h_t$, one of the following must hold:
1. $S(h_t) = \Theta$;
2. $S(h_t) = \{\min \Theta\}$.

The corollary above together with Theorem 2 effectively provide a cookbook on how to construct an equilibrium with payoff-relevant signaling only. Suppose we want signaling to occur during the time interval $[0, T]$. Then in every period along the pooling path we shall have two actions available to the sender: a separating action $\underline{a} \in B(\delta_{\underline{\theta}} \mid \underline{\theta})$ only taken by the lowest type $L \equiv \min \Theta$, and a pooling action $\bar{a}$ that satisfies the condition in the last part of the theorem. The latter action will be played by $L$ with some probability and by all other types for sure. Note that we have a degree of freedom in this construction: reputation from taking a pooling action depends on the probability with which type $L$ separates in that given period. Hence by changing these probabilities we will be able to sustain different pooling actions $\bar{a}$ in equilibrium. Finally, we need to verify that from time $T$ onwards, the pooling strategy is such that $L$ is exactly indifferent at $T$ (or the last period before $T$) between separating and following this pooling path.

The following section takes this cookbook and uses it to construct an informative equilibrium in a concrete setting.

This section looks at a simple model of price signaling with product reviews. Price signaling is a phenomenon that is widespread in the real world: high-quality products may be priced at a premium to signal quality, or, conversely, they may offer more free trials or giveaways to help consumers learn about the product. Models exist that support both kinds of behavior. For example, Bagwell and Riordan [1991] show that if some consumers are initially informed of product quality while others learn from repeated purchases, then high and declining prices signal product quality. They also refer to empirical cases which support their conclusions.
On the other hand, Vettas [1997] demonstrates that in the presence of social learning, a high-type firm prices low on entry, gradually increasing the price afterwards, which is another pattern commonly observed in reality.

While this apparent contradiction, that both high and low introductory prices can serve to signal quality, has been recognized in the literature (see, e.g., Kirmani and Rao [2000]), we are not aware of a theory that addresses it. The simple dynamic model below amends this and shows that both high and low prices are equally fit to serve as (suggestive yet inconclusive) signals of high quality in an informative equilibrium.

There is a long-lived firm $i$ that faces a continuum of consumers $j \in [0, 1]$ every period $t \in \mathcal{T} \equiv \{0, dt, 2dt, \ldots\}$, where the period length $dt$ is "small". The firm offers for sale a single product of privately known quality $\theta \in \Theta = \{H, L\}$ (in Section 5.4 we discuss the case $|\Theta| > 2$). The marginal costs of production are zero. In every period the firm sets the price $a_t$ of its product and the consumers decide whether to purchase it or not. A consumer's payoff from buying the product is given by $\theta v_j - a_t$, where $v_j \sim$ i.i.d. $U[0, 1]$ is the consumer's value for quality. The payoff from not buying the product is zero. Consumer $j$ then buys the item if and only if $\mathbb{E}[\theta] \geq a_t / v_j$.

The population of consumers is renewed every period. The newly arriving consumers base their belief $p(h_t)$ about the product quality on the prior $p \in \mathrm{int}\, \Delta(\Theta)$, the whole price path $\{a_s\}_{s \leq t}$, and product reviews as described below. With probability $1 - e^{-\phi dt} \approx \phi dt$, for an arrival rate $\phi \geq 0$, the population of consumers in a given period generates an informative review $x_t = \theta$ which perfectly reveals the firm's quality and is observable by all future consumers. With complementary probability $e^{-\phi dt}$ no review is generated: $x_t = \emptyset$. The flow profit that a firm of type $\theta$ receives in period $t$ after setting price $a_t$ is then given by

$$u_\theta(a_t, p_t) \equiv a_t \left( 1 - \frac{a_t}{\mathbb{E}[\theta \mid p_t]} \right)_+ . \qquad (2)$$

According to Theorem 1, (payoff-relevant) signaling must necessarily take the form of attrition. Under such attrition signaling, type $\theta = L$ mixes between some pooling price $a^p_t$ and separating to the bliss price $a^{L|L} = L/2$, whereas type $H$ sets the pooling price. We then show that Theorem 2 applies as well, meaning that the result translates to the case $|\Theta| > 2$.

We will be looking for an equilibrium of the game that satisfies (NDOC-P) and (FIN). In particular, let us construct an equilibrium in which prices are informative at every history where the firm's type is not perfectly known (i.e., $p(h_t) \neq \delta_\theta$).
This would be the most informative equilibrium, since in the remaining histories informative signaling is trivially impossible. If the firm is believed to be bad ($p(h_t) = \delta_L$) then there are no signaling motives: it sets price $a^{\theta|L}_t \equiv L/2$ and earns profit $u_\theta\big(a^{\theta|L}_t, \delta_L\big) = L/4$. Similarly, if $p(h_t) = \delta_H$ then $a^{\theta|H}_t \equiv H/2$ and $u_\theta\big(a^{\theta|H}_t, \delta_H\big) = H/4$.

[Footnote: Here $\mathrm{int}\, \Delta(\Theta)$ denotes the interior of $\Delta(\Theta)$, i.e., we assume that the prior $p$ does not rule out any states in $\Theta$. This assumption is not strictly necessary, but all results are trivial without it.]

[Footnote: The analysis carries over fully to the case when the review arrival rate depends on the number of consumers who purchased the product in a given period.]

[Footnote fragment: [...] $p(h_t) = \delta_H$ must be myopically optimal.]

We begin by deriving the pooling price $a^p_t$ that renders the low type indifferent between separating and pooling. The former yields the continuation value equal to

$$U_L(a^{L|L} \mid h_t) = \frac{dt}{1 - e^{-rdt}} \cdot \frac{L}{4} \approx \frac{L}{4r},$$

since $1 - e^{-rdt} \approx r\,dt$ is a valid approximation when $dt$ is small enough. Pooling, in turn, yields

$$U_L(a^p_t \mid h_t) = a^p_t \left( 1 - \frac{a^p_t}{\mathbb{E}[\theta \mid p_t]} \right) dt + e^{-rdt}\, \mathbb{E}_{x_t} U_L\big(h_t \cup (a^p_t, x_t)\big), \qquad (3)$$

where $p_t \equiv p(h_t \cup (a^p_t, \emptyset))$ and the continuation value can be written as

$$\mathbb{E}_{x_t} U_L\big(h_t \cup (a^p_t, x_t)\big) = (1 - e^{-\phi dt})\, U_L\big(h_t \cup (a^p_t, L)\big) + e^{-\phi dt}\, U_L\big(h_t \cup (a^p_t, \emptyset)\big).$$

After a bad review $x_t = L$ the consumers are sure that the firm is bad, i.e., $p(h_t \cup (a^p_t, L)) = \delta_L$, so we have $U_L(h_t \cup (a^p_t, L)) = \frac{L}{4r}$. After no review, $x_t = \emptyset$, the consumers' belief is inconclusive, hence in our construction signaling should continue. This means $L$ must be indifferent between pooling and separating once again, so $U_L(h_t \cup (a^p_t, \emptyset)) = \frac{L}{4r}$ as well. Finally, $L$'s indifference at $h_t$ yields $U_L(a^p_t \mid h_t) = U_L(a^{L|L} \mid h_t) = \frac{L}{4r}$.
Plugging all of these into (3), we obtain that the flow payoff from pooling must coincide with that from separating:

$$\frac{L}{4} = a^p_t \left( 1 - \frac{a^p_t}{\mathbb{E}[\theta \mid p_t]} \right).$$

The solution to the above is given by

$$a^p_t = \frac{\mathbb{E}[\theta \mid p_t]}{2} \left[ 1 \pm \sqrt{1 - \frac{L}{\mathbb{E}[\theta \mid p_t]}} \right]. \qquad (4)$$

Therefore, for a fixed $\mathbb{E}[\theta \mid p_t]$ we have two equally valid candidates for the pooling price $a^p_t$. The negative root corresponds to signaling by setting a low price, below $L$'s preferred price (for any reputation). Such a pooling price can be seen as low entry pricing à la Vettas [1997]. The positive root, conversely, corresponds to a price well above the myopic optimum and signaling through exclusivity, in the spirit of Bagwell and Riordan [1991].

Further, the pooling price $a^p_t$ (whichever root of (4) we choose) is a function of the seller's reputation

$$p\big(h_t \cup (a^p_t, \emptyset)\big) = \frac{p(h_t)}{1 - \lambda_t (1 - p(h_t))}, \qquad (5)$$

where $\lambda_t$ is the probability with which type $L$ separates at $h_t$. In particular, we can choose these probabilities freely and construct an equilibrium for arbitrary $\lambda_t$. Conversely, if we are restricted in our choice of $a^p_t$ (e.g., if this pooling price must amount to an integer number of dollars), this constrains the set of $p(h_t \cup a^p_t)$ and, consequently, the $\lambda_t$ that we can implement in equilibrium at any given history $h_t$.

Therefore, in general the pooling price path is indeterminate given the market conditions (seller's reputation), so inferring whether price signaling is taking place in a given market by looking at price data is a daunting task. This point was originally raised by Kaya [2013] in relation to advertising expenditures.

To complete the equilibrium description we only need to argue that setting the pooling price $a^p_t$ is optimal for $H$. His continuation value from doing so is

$$U_H(a^p_t \mid h_t) = \frac{L}{4}\, dt + e^{-rdt} \left[ (1 - e^{-\phi dt}) \frac{H}{4r} + e^{-\phi dt}\, U_H\big(h_t \cup (a^p_t, \emptyset)\big) \right].$$
Indeed, his flow payoff is the same as for $L$ (the two types only differ in the reviews they get), and in case a good review is generated at $h_t$, he will be receiving $H/4$ in every future period. The same, however, applies to any history with $p(h_t) \in \mathrm{int}\, \Delta(\Theta)$, hence $U_H(h_t \cup (a^p_t, \emptyset)) = U_H(a^p_t \mid h_t)$, which allows us to conclude that, for small $dt$,

$$U_H(a^p_t \mid h_t) = \frac{1}{4r} \cdot \frac{rL + \phi H}{r + \phi}.$$

Setting any price other than $a^p_t$ results in $p(h_{t+dt}) = \delta_L$ (by (NDOC-P)), and hence yields a value of at most $\frac{L}{4r} < \frac{1}{4r} \cdot \frac{rL + \phi H}{r + \phi}$. Therefore, pooling is indeed optimal for $H$. All of the above proves the following proposition.

Proposition 2.

The following constitutes an equilibrium of the price signaling game for any profile of $\lambda_t$. At any history $h_t \in H$:
1. if $p(h_t) = \delta_\theta$ then all types play $\theta/2$ and $p(h_{t+dt}) = p(h_t)$ for all $h_{t+dt} \supset h_t$;
2. if $p(h_t) \in \mathrm{int}\, \Delta(\Theta)$ then:
(a) type $\theta = H$ plays $a^p_t$ as given by (4) with probability one;
(b) type $\theta = L$ plays $a^p_t$ w.p. $1 - \lambda_t$ and $a^{L|L} = L/2$ w.p. $\lambda_t$;
(c) belief $p(h_t \cup (a^p_t, \emptyset))$ is computed according to (5); $p(h_t \cup (a^p_t, H)) = \delta_H$, and $p(h_{t+dt}) = \delta_L$ for all other $h_{t+dt} \supset h_t$.

[Footnote: An integer constraint on the pooling price will also apply to the bliss action $a^{L|L}$, but this would not affect the overall argument.]

The equilibrium described in Proposition 2 translates immediately to the case with $|\Theta| > 2$. In that case all types $\theta \in \Theta \setminus \{L\}$ would behave as the high type above. Therefore, signaling through attrition is still possible. We now show that Theorem 2 can be applied in this problem as well to verify that payoff-relevant signaling is possible only through attrition, despite the presence of informative reviews.

To do so, we need to verify that (SC) holds for the seller's payoff function. Whenever a review arrives, which happens with probability $1 - e^{-\phi dt}$ in every period, the continuation play is trivial (cf. Lemma 2 in the Appendix). Every type sets a myopically optimal price, thus obtaining payoff $\theta/4$ per period. Therefore, we only need to consider strategies that are non-trivial at histories with non-degenerate beliefs $p(h_t)$. The agent's value from following a given strategy $a$ starting from any such history $h_t$ in the reduced game is given by

$$U_\theta(a \mid h_t) = \sum_{s \in \mathcal{T},\, s \geq t} e^{-(r+\phi)(s-t)} \left[ e^{-\phi dt} \cdot a_t \left( 1 - \frac{a_t}{\mathbb{E}_s[\theta \mid a]} \right)_+ dt + (1 - e^{-\phi dt}) \cdot \frac{\theta}{4r} \right],$$

where $\mathbb{E}_s[\theta \mid a] \equiv \mathbb{E}\big[\theta \mid h_t \cup a_{(t,s]}\big]$.
It is easy to see that this value function satisfies (SC): for any $a', a''$ we have

$$U(\theta) \equiv U_\theta(a'' \mid h) - U_\theta(a' \mid h) = \sum_{s \in \mathcal{T},\, s \geq t} e^{-(r+\phi)(s-t) - \phi dt} \left[ a''_t \left( 1 - \frac{a''_t}{\mathbb{E}_s[\theta \mid a'']} \right)_+ - a'_t \left( 1 - \frac{a'_t}{\mathbb{E}_s[\theta \mid a']} \right)_+ \right] dt,$$

which is independent of $\theta$, because the review-contingent terms $(1 - e^{-\phi dt}) \cdot \frac{\theta}{4r}$ are the same under both strategies and cancel. Therefore, the conclusions of Theorem 2 apply, and payoff-relevant signaling is only possible through attrition.

The price signaling model presented in this section, despite being highly stylized, demonstrates that:
1. informative price signaling is possible without commitment, cost advantages, and with or without consumers learning from experiences;
2. the signaling price may be either low (e.g., in the form of free trials or frequent sales) or inefficiently high (excluding most consumers) as a result of sunspots;
3. multiple informative equilibria exist that differ in the speed of separation;
4. due to the above, the empirical identification of price signaling in a given market is a daunting task.
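The construction of this section can be sketched numerically. The Python sketch below is only an illustration under stated assumptions: it takes the flow profit $a(1 - a/\mathbb{E}[\theta \mid p])_+$ from (2), the two pooling-price roots from (4), the belief update (5), and solves type $H$'s one-period recursion for its fixed point. All parameter values ($L = 1$, $H = 2$, $r = 0.1$, $\phi = 0.5$, $dt = 10^{-4}$, a constant $\lambda_t = 0.05$) and all function names are hypothetical choices made here, not taken from the paper.

```python
import math


def flow_profit(price, expected_quality):
    """Flow profit (2): price times the mass of consumers who buy."""
    return price * max(0.0, 1.0 - price / expected_quality)


def pooling_prices(expected_quality, low_quality):
    """Both roots of L/4 = a * (1 - a / E), i.e. equation (4)."""
    spread = math.sqrt(1.0 - low_quality / expected_quality)
    half = expected_quality / 2.0
    return half * (1.0 - spread), half * (1.0 + spread)


def updated_belief(p_high, separation_prob):
    """Equation (5): belief in H after the pooling price and no review."""
    return p_high / (1.0 - separation_prob * (1.0 - p_high))


def high_type_value(low_quality, high_quality, r, phi, dt):
    """Fixed point U of U = (L/4)dt + e^{-r dt}[(1 - e^{-phi dt}) H/(4r) + e^{-phi dt} U]."""
    discount = math.exp(-r * dt)
    no_review = math.exp(-phi * dt)
    flow = low_quality / 4.0 * dt
    revealed = discount * (1.0 - no_review) * high_quality / (4.0 * r)
    return (flow + revealed) / (1.0 - discount * no_review)


# Hypothetical parameters, chosen only for illustration.
LOW, HIGH = 1.0, 2.0
R, PHI, DT = 0.1, 0.5, 1e-4
LAMBDA = 0.05  # assumed constant per-period separation probability of type L

belief = 0.5  # assumed prior probability of type H
for period in range(3):
    belief = updated_belief(belief, LAMBDA)
    expected = belief * HIGH + (1.0 - belief) * LOW
    low_price, high_price = pooling_prices(expected, LOW)
    print(f"p={belief:.4f}  low root={low_price:.4f}  high root={high_price:.4f}")
```

Both roots give type $L$ the same flow profit $L/4$ that he would earn by separating, which is the indifference the construction requires, and the fixed point of $H$'s recursion should approach $\frac{1}{4r} \cdot \frac{rL + \phi H}{r + \phi}$ as $dt$ shrinks. Varying `LAMBDA` traces out different belief paths and hence different pooling price paths, which illustrates the indeterminacy noted in point 3 above.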

This paper explores a model of dynamic signaling without commitment. In this model a single privately-informed agent takes an action every period, but cannot commit to future actions. The receiver tries to infer the agent's information from his actions, and the receiver's opinion is relevant to the agent's payoff. The existing literature has assumed signaling to be impossible in such settings, unless strong assumptions about off-equilibrium-path beliefs are adopted. This paper overturns this view, demonstrating that signaling is, in fact, possible even under reasonable off-path beliefs.

Contrary to the literature, we allow for suggestive signaling rather than requiring conclusive separation. We show that such signaling must necessarily happen through the attrition of the lowest type of the agent. In this attrition scenario, all types pool on the same action (or split across some payoff-equivalent actions), while the lowest type also plays some separating action with positive intensity.

The paper also contains a methodological contribution. In particular, we demonstrate the importance of seemingly innocuous assumptions regarding the type space for the conclusions one obtains. Our results imply that perfect learning is possible in the limit as $t \to \infty$ in the model with two sender types but not possible with finitely many (more than two) types, while the literature demonstrates that perfect learning is possible with a continuum of sender types.

Finally, we explore an application of our results to a model of dynamic price signaling. We construct an informative equilibrium in which prices set by the firm contain information about the quality of its product. We show that price signaling can happen through both inefficiently low and inefficiently high prices, thus reconciling some of the disparate conclusions in the literature and arguing that empirical identification of price signaling in the data is a complicated venture.

References

A. R. Admati and M. Perry. Strategic delay in bargaining.

Review of Economic Studies , 54(3):345–364, 1987.K. K. Aköz, C. E. Arbatli, and L. Çelik. Manipulation through biased product reviews.

SSRNElectronic Journal , 2017. doi: 10.2139/ssrn.3068345.S. Athey. Monotone comparative statics under uncertainty.

The Quarterly Journal of Economics ,117(1):187–223, 2002.L. M. Ausubel, P. Cramton, and R. J. Deneckere. Bargaining with incomplete information.

Handbookof game theory , 3:1897–1945, 2002.K. Bagwell and M. H. Riordan. High and declining prices signal product quality.

The AmericanEconomic Review , pages 224–239, 1991.J. S. Banks and J. Sobel. Equilibrium selection in signaling games.

Econometrica: Journal of theEconometric Society , 55(3):647–661, 1987.M. Battaglini. Multiple referrals and multidimensional cheap talk.

Econometrica , 70(4):1379–1401,2002.P. Beaudry and M. Poitevin. Signalling and renegotiation in contractual relationships.

Econometrica , 61(4):745–782, 1993.P. Bond and H. Zhong. Buying high and selling low: Stock repurchases and persistent asymmetricinformation.

The Review of Financial Studies , 29(6):1409–1452, June 2016. doi: 10.1093/rfs/hhw005.A. Chakraborty and R. Harbaugh. Comparative cheap talk.

Journal of Economic Theory , 132(1):70–94, 2007.A. Chakraborty and R. Harbaugh. Persuasion by cheap talk.

American Economic Review , 100(5):2361–82, 2010.I.-K. Cho and D. M. Kreps. Signaling games and stable equilibria.

Quarterly Journal of Economics ,102(2):179–221, 1987. doi: 10.2307/1885060.B. Daley and B. Green. Waiting for news in the market for lemons.

Econometrica , 80(4):1433–1504,2012.T. De Angelis, E. Ekström, and K. Glover. Dynkin games with incomplete and asymmetricinformation. arXiv preprint arXiv:1810.07674 , 2018.R. Deneckere and M.-Y. Liang. Bargaining with interdependent values.

Econometrica , 74(5):1309–1364, 2006. doi: 10.1111/j.1468-0262.2006.00706.x.
F. Dilmé. Noisy signaling in discrete time.

Journal of Mathematical Economics , 68, 2017.F. Dilmé and F. Li. Dynamic signaling with dropout risk.

American Economic Journal:Microeconomics , 8(1):57–82, Feb. 2016. doi: 10.1257/mic.20120112.Y. Feinberg and A. Skrzypacz. Uncertainty about uncertainty and delay in bargaining.

Econometrica , 73(1):69–91, 2005.W. Fuchs and A. Skrzypacz. Bargaining with arrival of new traders.

American Economic Review ,100(3):802–36, 2010.S. J. Grossman and M. Perry. Sequential bargaining under asymmetric information.

Journal ofEconomic Theory , 39(1):120–154, 1986.S. Gryglewicz and A. Kolb. Strategic pricing in volatile markets.

SSRN Electronic Journal , 2019.doi: 10.2139/ssrn.3154372.S. Heinsalu. Dynamic noisy signaling.

American Economic Journal: Microeconomics , 10(2):225–249, May 2018. doi: 10.1257/mic.20160336.E. Kamenica and M. Gentzkow. Bayesian persuasion.

American Economic Review , 101(6):2590–2615, 2011.A. Kaya. Repeated signaling games.

Games and Economic Behavior , 66(2):841–854, 2009. doi:10.1016/j.geb.2008.09.030.A. Kaya. Dynamics of price and advertising as quality signals: anything goes.

Economics Bulletin ,2(1):1556–1564, 2013.A. Kaya and K. Kim. Trading dynamics with private buyer signals in the market for lemons.

TheReview of Economic Studies , 85(4):2318–2352, 2018.A. Kirmani and A. R. Rao. No pain, no gain: A critical review of the literature on signalingunobservable product quality.

Journal of marketing , 64(2):66–79, 2000.S. Kraus, J. Wilkenfeld, and G. Zlotkin. Multiagent negotiation under time constraints.

Artiﬁcialintelligence , 75(2):297–345, 1995.J.-J. Laﬀont and D. Martimort.

The theory of incentives: the principal-agent model . Princetonuniversity press, 2002. ISBN 9780691091846.E. K. Lai. Expert advice for amateurs.

Journal of Economic Behavior & Organization , 103:1–16,2014.G. LeBlanc. Signalling strength: limit pricing and predatory pricing.

The RAND Journal ofEconomics , pages 493–506, 1992.J. Lee and Q. Liu. Gambling reputation: Repeated bargaining with outside options.

Econometrica ,81(4):1601–1672, July 2013. doi: 10.3982/ECTA9200.21. E. Leland and D. H. Pyle. Informational asymmetries, ﬁnancial structure, and ﬁnancialintermediation.

The journal of Finance , 32(2):371–387, 1977.V. Madrigal, T. C. C. Tan, and S. R. d. C. Werlang. Support restrictions and sequential equilibria.

Journal of Economic Theory , 43(2):329–334, Dec. 1987. doi: 10.1016/0022-0531(87)90063-9.P. Milgrom and J. Roberts. Limit pricing and entry under incomplete information: An equilibriumanalysis.

Econometrica , pages 443–459, 1982a.P. Milgrom and J. Roberts. Predation, reputation, and entry deterrence.

Journal of economictheory , 27(2):280–312, 1982b.P. Milgrom and J. Roberts. Price and advertising signals of product quality.

Journal of politicaleconomy , 94(4):796–821, 1986.P. Milgrom and C. Shannon. Monotone comparative statics.

Econometrica: Journal of theEconometric Society , pages 157–180, 1994.R. B. Myerson.

Game Theory: Analysis of Conﬂict . Harvard University Press, 1997. ISBN9780674341166.G. Nöldeke and E. van Damme. Signalling in a dynamic labour market.

Review of Economic Studies ,57(1):1–23, 1990a. doi: 10.2307/2297540.G. Nöldeke and E. van Damme. Switching away from probability one beliefs. mimeo, 1990b.J. K.-H. Quah and B. Strulovici. Aggregating the single crossing property.

Econometrica , 80(5):2333–2348, 2012.J. G. Riley. Silver signals: Twenty-ﬁve years of screening and signaling.

Journal of Economicliterature , 39(2):432–478, 2001.C. Roddie. Signaling and reputation in repeated games, I: Finite games.

SSRN Electronic Journal ,2012a. doi: 10.2139/ssrn.1994378.C. Roddie. Signaling and reputation in repeated games, II: Stackelberg limit properties.

SSRNElectronic Journal , 2012b. doi: 10.2139/ssrn.2011835.A. Sen. Multidimensional bargaining under asymmetric information.

International EconomicReview , 41(2):425–450, 2000.A. Smirnov and E. Starkov. Timing of predictions in dynamic cheap talk: experts vs. quacks.

University of Zurich, Department of Economics Working Papers , 334, 2019.A. Smirnov and E. Starkov. Bad news turned good: Reversal under censorship.

American EconomicJournal: Microeconomics , forthcoming, 2020. doi: 10.1257/mic.20190379.J. Sobel. Giving and receiving advice.

Advances in economics and econometrics , 1:305–341, 2013.M. Spence. Job market signaling.

Quarterly Journal of Economics , 87(3):355–374, Aug. 1973. doi:10.2307/1882010. 22. A. Strebulaev, H. Zhu, and P. Zryumov. Optimal issuance under information asymmetry andaccumulation of cash ﬂows.

Rock Center for Corporate Governance at Stanford UniversityWorking Paper , 164, 2016.J. M. Swinkels. Education signalling with preemptive oﬀers.

Review of Economic Studies , 66(4):949–970, 1999.N. Vettas. On the informational role of quantities: Durable goods and consumers’ word-of-mouthcommunication.

International Economic Review , pages 915–944, 1997. doi: 10.2307/2527222.D. R. Vincent. Dynamic auctions.

Review of Economic Studies , 57(1):49–61, 1990. doi:10.2307/2297542.A. Weiss. A sorting-cum-learning model of education.

Journal of Political Economy , 91(3):420–442,1983.M. Whitmeyer. On optimal transparency in signaling. arXiv preprint arXiv:1902.00976 , 2019.

Appendix A. Proofs and Supplementary Results

A.1 Proofs: Preliminaries

Our first observation states that once there is no need for signaling any more – i.e., when the receiver’s belief assigns probability 1 to some type of the agent – there are no reasons for the agent to steer away from the myopically optimal action.

Lemma 2. In any equilibrium that satisfies (NDOC), at any $h_t \in H$, if $|S(h_t)| = 1$ then for all $\theta$ and all $h_s \supseteq h_t$: $\alpha_\theta\left( B(\delta_{S(h_t)} \mid \theta) \mid h_s \right) = 1$.

Proof. By (NDOC), for all $h_s \supseteq h_t$: $p(h_s) = \delta_{S(h_t)}$. In particular, $p(h_s)$ is independent of all actions and outcomes during $[t, s)$. Therefore, the solution to (1) is given by pointwise maximization of the flow utility.

Lemma 2 above is the direct consequence of (NDOC): actions cannot affect a degenerate belief under this assumption, hence the myopic optimum is chosen. This captures the main tension between signaling and sequential rationality: signaling requires sticking to the costly action over an extended period of time, while sequential rationality as captured by Lemma 2 pushes against that when no further signaling concerns are present. The remaining statements formalize this intuition. However, before proceeding any further, we use Lemma 2 to prove Lemma 1 from the text.

Proof of Lemma 1.

Parts 1 and 2 are trivial. For part 3, denote the original equilibrium strategy and belief profile as $\alpha$ and $p$ respectively. Construct the new equilibrium $(\tilde\alpha, \tilde p)$ by setting $\tilde\alpha(h_t) = \alpha(h_t)$ and $\tilde p(h_t) = p(h_t)$ for all on-path histories $h_t$. Then for all off-path histories $h_t$ set $\tilde p(h_t) = \delta_{\underline{S}(h)}$, where $h \subset h_t$ is the last on-path history preceding $h_t$. The strategy $\tilde\alpha(h_t)$ for off-path histories $h_t$ is set in conformance with Lemma 2.

Belief profile $\tilde p$ will then satisfy (NDOC-P) and be consistent with the strategy $\tilde\alpha$. The strategy itself will be optimal at off-path histories by Lemma 2. Optimality of $\tilde\alpha$ at any on-path history $h_t$ can be verified by observing that $U_\theta(h_t \cup a)$ is the same in both equilibria for all $a \in A(h_t)$ and weakly smaller in the newly constructed equilibrium for $a \in A \setminus A(h_t)$. I.e., the choice between any pair of on-path actions is unaffected by the off-path modifications, while deviations to off-path actions are less appealing in the new equilibrium. Therefore, $(\tilde\alpha, \tilde p)$ is an equilibrium.

A.2 Proofs: Two Types

Proof of Theorem 1.

Statement 1.

Suppose first, by way of contradiction, that there exist $h_t \in H$ and $a \in A$ such that $\alpha_H(a \mid h_t) > 0$ but $\alpha_L(a \mid h_t) = 0$. Then $p(h_t \cup (a, x)) = \delta_H$ for any $x \in X$ by (NDOC). By playing $a$ at $h_t$ the low type receives the highest possible continuation utility after $t$ (since by Lemma 2 he can play the myopically optimal action thereafter), while by following the equilibrium path he receives strictly less. The utility is bounded, hence for $dt$ small enough deviating to $a$ at $h_t$ is optimal for $L$ – a contradiction.

Payoff-equivalence is shown as follows: for any two $a, a' \in A$ such that $\alpha_H(a \mid h_t) > 0$ and $\alpha_H(a' \mid h_t) > 0$ it must be that $U_H(h_t \cup a) = U_H(h_t \cup a')$, otherwise the high type would only play one of the actions and not the other. The first part of the argument showed that $\alpha_L(a \mid h_t) > 0$ and $\alpha_L(a' \mid h_t) > 0$, hence $U_L(h_t \cup a) = U_L(h_t \cup a')$ by the same logic.

Statement 2.

Begin with the first part (that $a \in B(\delta_L \mid L)$). For any such $a$ that $\alpha_H(a \mid h_t) = 0$ and $\alpha_L(a \mid h_t) > 0$ and any outcome $x$, we have $p(h_{t+dt}) = \delta_L$, where $h_{t+dt} = h_t \cup (a, x)$. By Lemma 2, $a' \in B(\delta_L \mid L)$ must be played at all histories beginning with $h_{t+dt}$. If $a \notin B(\delta_L \mid L)$ then playing $a'$ at $h_t$ instead – and continuing with $a'$ at all subsequent histories – yields a strictly higher flow payoff at $h_t$ and the same continuation payoff. Hence playing $a$ at $h_t$ was not optimal.

The second part of the second statement follows from the same argument as did payoff equivalence for $L$ in the first statement.

A.3 Proofs: Finite Types

Before stating the proof of Theorem 2, we need some supplementary lemmas. We begin by arguing in Lemma 3 that at no history can actions lead to separation of types into disjoint sets that can be compared by a strong set order – unless one of these sets is a singleton coinciding with the lower bound of the other set. In particular, we show that sets of types in the support of two different actions have to necessarily overlap (not in the sense of having common elements, but in the sense of upper and lower bounds).

Lemma 3.

Suppose (MON) holds and $dt$ is small enough. Fix any equilibrium and any history $h_t \in H$. Then for any $a', a'' \in A(h_t)$ we have $\bar{S}(h_t \cup a') \geq \underline{S}(h_t \cup a'')$, with equality only if $S(h_t \cup a')$ is a singleton.

Proof.

Assume by contradiction that $\bar{S}(h_t \cup a') < \underline{S}(h_t \cup a'')$ for some $a', a'' \in A(h_t)$. Pick any type $\theta \in S(h_t \cup a')$ and any strategy $\mathbf{a}'$ on path for $\theta$ at $h_t$. Then deviating to $a''$ at $h_t$ and following $\mathbf{a}'$ after $t$ is strictly better for $\theta$ than following $\mathbf{a}'$ throughout. To see this, recall that $u_\theta$ is increasing in $p_t$ by (MON) – and reputation $p_s \geq \delta_{\underline{S}(h_t \cup a'')}$ generated by the deviation for all $s > t$ is strictly higher than any reputation on equilibrium path, since $\underline{S}(h_t \cup a'') > \bar{S}(h_t \cup a')$. This contradicts $\mathbf{a}'$ being optimal for $\theta$ as long as $dt$ (period length and, hence, utility weight on the current period) is small enough.

Now suppose $\bar{S}(h_t \cup a') = \underline{S}(h_t \cup a'')$. Suppose by way of contradiction that $|S(h_t \cup a')| > 1$, meaning $\underline{S}(h_t \cup a') < \underline{S}(h_t \cup a'')$. Then among all types in $S(h_t \cup a')$ there exists such $\theta$ that receives reputation $p_s < \delta_{\bar{S}(h_t \cup a')}$ with positive probability for all $s > t$ (this follows from belief consistency). Such $\theta$ would strictly benefit from the deviation described in the first part of this proof, yielding a contradiction.

(Footnote to Lemma 3: This Lemma and the remainder of the Appendix use the notation $S(h_t \cup a) \equiv S(h_t \cup (a, x))$ for all $x \in X$ in the support. This object is well defined in equilibrium for on-path histories and actions because the support of $x$ is type-independent and equilibrium beliefs must be consistent. We are adopting the simplifying assumption that the same holds off the equilibrium path, but this is not necessary for the arguments to go through as long as (NDOC) holds.)

Lemma 4 puts the (SC) property to use, establishing a form of monotonicity of optimal strategies w.r.t. type (“higher types play higher strategies”). The main problem in the dynamic setting is the lack of any nice complete order over strategies $\mathbf{a}$, so given two arbitrary strategies, we generally cannot say which one of them is “higher”. Therefore, we rephrase monotonicity to say instead that if a given strategy (or its equivalent) is optimal for two agent types, then it must also be optimal for all types in between. We cannot say with certainty that the given strategy is chosen on equilibrium path by any of these types in between, but we can claim that any strategy they play must be payoff-equivalent to the one under consideration.

Lemma 4.

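Suppose (SC) holds. Fix any equilibrium and history $h_t \in H$. If there exists a pair of strategies $\underline{\mathbf{a}}, \bar{\mathbf{a}} \supset h_t$ that are payoff-equivalent at $h_t$ and are on path at $h_t$ for some types $\underline{\theta}$ and $\bar{\theta} > \underline{\theta}$ respectively, then any strategy $\hat{\mathbf{a}} \supset h_t$ on path at $h_t$ for any $\hat{\theta} \in (\underline{\theta}, \bar{\theta})$ must be payoff-equivalent at $h_t$ to $\bar{\mathbf{a}}, \underline{\mathbf{a}}$.

Proof. Fix any such $\hat{\mathbf{a}}$. Strategy $\bar{\mathbf{a}}$ has to be optimal for type $\bar{\theta}$. In particular, when evaluated at $h_t$, it has to be better than $\hat{\mathbf{a}}$: $U_{\bar{\theta}}(\bar{\mathbf{a}} \mid h_t) \geq U_{\bar{\theta}}(\hat{\mathbf{a}} \mid h_t)$. The same holds for type $\underline{\theta}$, since $\bar{\mathbf{a}}$ and $\underline{\mathbf{a}}$ are payoff-equivalent: $U_{\underline{\theta}}(\bar{\mathbf{a}} \mid h_t) = U_{\underline{\theta}}(\underline{\mathbf{a}} \mid h_t) \geq U_{\underline{\theta}}(\hat{\mathbf{a}} \mid h_t)$. At the same time, $\hat{\theta}$ at least weakly prefers $\hat{\mathbf{a}}$ to $\bar{\mathbf{a}}$, meaning that the converse holds for $\hat{\theta}$: $U_{\hat{\theta}}(\bar{\mathbf{a}} \mid h_t) \leq U_{\hat{\theta}}(\hat{\mathbf{a}} \mid h_t)$. If this inequality is strict, then this is a direct contradiction with (SC), which requires that $U_\theta(\bar{\mathbf{a}} \mid h_t) - U_\theta(\hat{\mathbf{a}} \mid h_t)$ as a function of $\theta$ either crosses zero at most once, or is exactly zero.

Lemma 5 below is the final step before we can move on to the proofs of main results. It can be seen as a weaker version of Proposition 1, claiming that the highest and lowest types at any history have a strategy in common.

Lemma 5.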

Suppose (MON) and (SC) hold and $dt$ is small enough. Fix any equilibrium and any $h_t \in H$. There exist $h_t$-payoff-equivalent strategies $\bar{\mathbf{a}}, \underline{\mathbf{a}} \supset h_t$, on path at $h_t$ for $\bar{S}(h_t)$ and $\underline{S}(h_t)$ respectively.

Proof. We will proceed by induction on the support size $|S(h_t)|$. The claim of the lemma holds trivially for $|S(h_t)| = 1$, and by Theorem 1 it also holds for $|S(h_t)| = 2$. The remainder of the proof shows that if the claim holds when $|S(h_t)| = k - 1$ then it also holds when $|S(h_t)| = k \geq 3$. Let $\mathbf{x} : H \to X$ denote an outcome profile which prescribes some outcome for every history. Fix some $\mathbf{x}$. Coupled with some pure strategy and the equilibrium belief system $p$, it fully determines the path of play and the agent’s payoffs.

Begin the second layer of induction, iterating forwards on time periods from $t$. At $h_t$ and any subsequent history $h_s \supset h_t$, one of the following must apply:

1. There is an action $a$ on path for both types $\bar{S}(h_t)$ and $\underline{S}(h_t)$ at $h_s$. If this is the case, call $h_s$ a non-splitting history and continue to $h_{s+dt} = h_s \cup (a, \mathbf{x}(h_s))$.

2. There is no action $a$ on path for both $\bar{S}(h_t)$ and $\underline{S}(h_t)$ at $h_s$. If this is the case, call $h_s$ a splitting history.

Proceed along the non-splitting path (according to the chosen $\mathbf{x}$) until the first splitting history $h_s$. Pick arbitrary actions $\bar{a}$ and $\underline{a}$ that are on path for $\bar{S}(h_t)$ and $\underline{S}(h_t)$ at $h_s$ respectively, and consider two continuation histories $\bar{h}_{s+dt} \equiv h_s \cup (\bar{a}, \mathbf{x}(h_s))$ and $\underline{h}_{s+dt} \equiv h_s \cup (\underline{a}, \mathbf{x}(h_s))$. Then we have that $|S(h_{s+dt})| < |S(h_s)| = k$ for both continuation histories, because $S(\bar{h}_{s+dt}) \subseteq S(h_s) \setminus \underline{S}(h_s)$ and $S(\underline{h}_{s+dt}) \subseteq S(h_s) \setminus \bar{S}(h_s)$.

Therefore, by the induction assumption, the statement of the lemma holds at both $\bar{h}_{s+dt}$ and $\underline{h}_{s+dt}$. In particular, the statement of the lemma for $\bar{h}_{s+dt}$ states that there exist two $\bar{h}_{s+dt}$-payoff-equivalent strategies on path at $\bar{h}_{s+dt}$ for $\bar{S}(h_s)$ and $\underline{S}(\bar{h}_{s+dt})$ respectively. Playing $\bar{a}$ at $h_s$ is on path for both of these types, hence there also exists a pair of strategies on path at $h_s$ for the two types respectively, which grant the same payoff conditional on $\mathbf{x}$. However, the argument above applies to any outcome profile $\mathbf{x}$ and, in particular, to any outcome $\mathbf{x}(h_s)$, hence there also exists a pair of strategies $\bar{\mathbf{a}}', \bar{\mathbf{a}}''$ on path at $h_s$ for the two types $\bar{S}(h_s)$ and $\underline{S}(\bar{h}_{s+dt})$ respectively, which are payoff-equivalent at $h_s$ (unconditionally).

By a mirror argument, there also exists a pair of $h_s$-payoff-equivalent strategies $\underline{\mathbf{a}}', \underline{\mathbf{a}}''$ on path at $h_s$ for $\underline{S}(h_s)$ and $\bar{S}(\underline{h}_{s+dt})$ respectively. Note further that by Lemma 3 we have that $\bar{S}(\underline{h}_{s+dt}) > \underline{S}(\bar{h}_{s+dt})$. Lemma 4 hence applies: $\underline{\mathbf{a}}''$ must be payoff-equivalent to $\bar{\mathbf{a}}', \bar{\mathbf{a}}''$, thus so is $\underline{\mathbf{a}}'$. We have shown that the statement of the lemma holds at $h_t$ if $|S(h_t)| = k$ and $h_t$ is a splitting history.

We are left to cover non-splitting histories. Suppose $h_t$ is non-splitting. Fix $\mathbf{x}$. Then we know that the statement of the lemma holds at the first splitting history $h_s$ following $h_t$ along the path of pooling actions and fixed outcomes $\mathbf{x}$. Therefore, there exists a pair of strategies on path at $h_t$ for $\bar{S}(h_t)$ and $\underline{S}(h_t)$, which grant the same payoff at $h_t$ conditional on $\mathbf{x}$. This applies to any outcome profile $\mathbf{x}$, hence there exists a pair of strategies $\bar{\mathbf{a}}, \underline{\mathbf{a}}$ on path at $h_t$ for $\bar{S}(h_t)$ and $\underline{S}(h_t)$, which are payoff-equivalent at $h_t$. This concludes the induction argument and the proof of the lemma.

Proof of Proposition 1.

Let $\bar{\theta} \equiv \bar{S}(h_t)$. Note that the statement of the proposition holds trivially if $|S(h_t)| = 1$, so for the remainder of this proof we assume that this is not the case (i.e., $\bar{\theta} \neq \underline{\theta}$). From Lemma 5 we know there exist $h_t$-payoff-equivalent $\bar{\mathbf{a}}, \underline{\mathbf{a}} \supset h_t$ on path at $h_t$ for $\bar{\theta}$ and $\underline{\theta}$ respectively. Then by Lemma 4, any pure strategy $\mathbf{a} \supset h_t$ on path at $h_t$ for any $\theta \in S(h_t) \setminus \{\bar{\theta}, \underline{\theta}\}$ is payoff-equivalent at $h_t$ to $\bar{\mathbf{a}}, \underline{\mathbf{a}}$.

Suppose now there exists a pure strategy $\bar{\mathbf{a}}' \supset h_t$ on path at $h_t$ for $\bar{\theta}$, which is payoff-distinct at $h_t$ from $\bar{\mathbf{a}}$. By (SC), all types $\theta \in S(h_t) \setminus \bar{\theta}$ must have a strict preference at $h_t$ between $\bar{\mathbf{a}}$ and $\bar{\mathbf{a}}'$. The former is optimal for these types, hence $\bar{\mathbf{a}}'$ is only on path for $\bar{\theta}$. The two strategies cannot prescribe different actions at $h_t$ in equilibrium – $\bar{\mathbf{a}}(h_t) \neq \bar{\mathbf{a}}'(h_t)$ – since this is in violation of Lemma 3. The same, however, applies to any subsequent history, hence $\bar{\mathbf{a}}(h_s) = \bar{\mathbf{a}}'(h_s)$ for all $h_s \supset h_t$. This contradicts $\bar{\mathbf{a}}$ and $\bar{\mathbf{a}}'$ being payoff-distinct, hence such $\bar{\mathbf{a}}'$ does not exist. Therefore, any pure strategy $\mathbf{a}$ on path at $h_t$ that is $h_t$-payoff-distinct from $\bar{\mathbf{a}}$ is only on path for $\underline{\theta}$. This concludes the proof.

Proof of Theorem 2.

From Proposition 1, all actions $a \in A(h_t)$ are on path for $\underline{\theta}$, which proves the first statement of the theorem.

From the fact that payoff-relevant signaling happens at $h_t$ we know that there exist two pure strategies $\underline{\mathbf{a}}, \bar{\mathbf{a}}$ that are payoff-distinct at $h_t$ and prescribe different actions at $h_t$: $\underline{a} \equiv \underline{\mathbf{a}}_t \neq \bar{a} \equiv \bar{\mathbf{a}}_t$. From Proposition 1 we know at least one of these strategies – suppose $\underline{\mathbf{a}}$ – is on path for $\underline{\theta}$ but not for any other $\theta \in S(h_t) \setminus \bar{\theta}$ at $h_t$. Furthermore, it follows from the definition of payoff-relevant signaling that there is no $\bar{\mathbf{a}}' \supset h_t \cup \underline{a}$ that is payoff-equivalent to $\bar{\mathbf{a}}$ at $h_t$. Therefore, $\underline{\mathbf{a}}$ is only on path for $\underline{\theta}$, while $\bar{\mathbf{a}}$ is optimal for all $\theta \in S(h_t)$ at $h_t$.

We now show that $\underline{a} \in B\left(\delta_{\underline{\theta}} \mid \underline{\theta}\right)$. If this is not true then type $\underline{\theta}$ can play some $a \in B\left(\delta_{\underline{\theta}} \mid \underline{\theta}\right)$ at $h_t$ and every history after it. Compared to following $\underline{\mathbf{a}}$, this strategy would yield the same payoff at all times $s > t$ and a strictly higher payoff at $t$ (same as in the proof of Theorem 1), hence $\underline{\mathbf{a}}$ is not optimal for $\underline{\theta}$ at $h_t$ – a contradiction.

To complete the proof of statements 2 and 3 of the theorem, we need to show that $\bar{a} \notin B\left(\delta_{\underline{\theta}} \mid \underline{\theta}\right)$. Assume not. Consider the strategy of playing $\bar{a}$ at $h_t$ and all subsequent histories. Compared to following $\underline{\mathbf{a}}$, this strategy would yield $\underline{\theta}$ a weakly higher payoff at all times $s > t$ and a strictly higher payoff at $t$ (due to $p(h_t \cup \bar{a}) > \delta_{\underline{\theta}}$ and the strict part of (MON)), hence $\underline{\mathbf{a}}$ would not be optimal for $\underline{\theta}$ at $h_t$ – a contradiction. This completes the proof of Theorem 2.

(Footnotes to the preceding proofs: It does not matter for our argument if all types assign probability zero to outcome $\mathbf{x}(h_s)$ conditional on $\bar{a}$. To be slightly more precise, the argument applies to any $h_s$ such that $|S(h_s)| > 1$. Otherwise Lemma 2 kicks in and implies that all pure strategies on path at $h_s$ are $h_s$-payoff-equivalent.)

Proof of Corollary 1.

The statement is proved in the text. The low type must be indifferent between taking a separating action $\underline{a}$ at $h_t$ and pooling on $\bar{a}$ at $h_t$ and separating at $h_{t+dt}$. This indifference dictates that one period of pooling must be exactly as attractive as one period of being revealed as $\underline{\theta}$, i.e., $\mathbb{E}_x\left[ u_{\underline{\theta}}(\bar{a}, p(h_t \cup (\bar{a}, x))) \mid \underline{\theta} \right] = u_{\underline{\theta}}\left(\underline{a}, \delta_{\underline{\theta}}\right)$.

Proof of Corollary 2.

Proposition 1 states that all pure strategies $\mathbf{a}'$ on path at $h_t$ for any $\theta \in S(h_t) \setminus \underline{S}(h_t)$ are payoff-equivalent at $h_t$. Since there is no payoff-irrelevant signaling in equilibrium, the set of such strategies is a singleton: if there is more than one then there exists $h_{t'} \supset h_t$ at which the two prescribe different actions, but that constitutes payoff-irrelevant signaling at $h_{t'}$ (the two strategies coincide on $[t, t')$, hence they are payoff-equivalent at $h_{t'}$).

Therefore, at any $h_t$ there exists some $\bar{a} \in A$ such that $\alpha_\theta(h_t)(\bar{a}) = 1$ for all $\theta \in S(h_t) \setminus \underline{S}(h_t)$. Together with part 1 of Theorem 2, this means that $S(h_t \cup \bar{a}) = S(h_t)$. By part 3 of the theorem, $\bar{a}$ is the unique element of $A(h_t) \setminus B\left(\delta_{\underline{\theta}} \mid \underline{\theta}\right)$. By part 2 of the theorem, for any $a \in A(h_t) \cap B\left(\delta_{\underline{\theta}} \mid \underline{\theta}\right)$ we have $S(h_t \cup a) = \{\underline{S}(h_t)\}$. Since all on-path histories $h_{t+dt}$ can be written as $h_{t+dt} = h_t \cup (a, x)$ for some $a \in A(h_t)$ and $x \in X$, and outcomes $x$ do not change support $S$, we obtain that for any pair of on-path histories $h_t, h_{t+dt}$: $S(h_{t+dt}) \in \{ S(h_t), \{\underline{S}(h_t)\} \}$. Applying this observation iteratively from $h_0$ (for which $S(h_0) = \Theta$) completes the proof.

Proof of Proposition 2.

Contained in the text.
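The indifference condition from the proof of Corollary 1 can be illustrated with a small numerical sketch. The code below is only an illustration under simplifying assumptions that are not part of the model: there are two types, outcomes $x$ are uninformative (so the expectation over $x$ drops out), reputation $p$ is the probability of the high type, and the low type's flow utility takes a hypothetical linear form, $p - c$ when pooling and a constant when revealed. All function names and parameter values are invented for this example.

```python
def indifference_posterior(pooling_cost: float, separating_payoff: float = 0.0) -> float:
    """Posterior p' after pooling that leaves the low type indifferent.

    Hypothetical flow utilities (an assumption of this sketch):
      pooling   -> p' - pooling_cost
      separated -> separating_payoff
    Indifference requires p' - pooling_cost == separating_payoff.
    """
    p_next = separating_payoff + pooling_cost
    if not 0.0 < p_next <= 1.0:
        raise ValueError("no valid indifference posterior for these parameters")
    return p_next


def separation_probability(prior: float, p_next: float) -> float:
    """Probability sigma with which the low type separates in this period.

    The high type is assumed to pool with probability one, so Bayes' rule gives
    p_next = prior / (prior + (1 - prior) * (1 - sigma)); solve for sigma.
    """
    if not (0.0 < prior < 1.0 and prior <= p_next <= 1.0):
        raise ValueError("need 0 < prior < 1 and prior <= p_next <= 1")
    return 1.0 - prior * (1.0 - p_next) / (p_next * (1.0 - prior))


def posterior_after_pooling(prior: float, sigma: float) -> float:
    """Bayes update of the high-type probability after observing the pooling action."""
    return prior / (prior + (1.0 - prior) * (1.0 - sigma))


if __name__ == "__main__":
    prior = 0.3          # current reputation (illustrative value)
    cost = 0.5           # low type's flow cost of pooling (illustrative value)
    p_next = indifference_posterior(cost)
    sigma = separation_probability(prior, p_next)
    print(f"posterior needed for indifference: {p_next:.4f}")
    print(f"low type separates with probability: {sigma:.4f}")
    print(f"check, Bayes posterior after pooling: {posterior_after_pooling(prior, sigma):.4f}")
```

In this sketch the reputation after pooling is pinned down by the low type's indifference alone, and the mixing probability is then whatever makes Bayes' rule deliver that reputation. This mirrors the attrition logic of the corollary, although the actual model allows noisy outcomes and general flow utilities.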

Appendix B. Application Examples

This appendix presents examples of applied models that fit our framework. These are meant to demonstrate some instances of models yielding additively and/or multiplicatively separable payoff functions that allow (SC) to be verified with little effort, complementing Section 5 in this respect.

B.1 Labor Market Signaling

In this section we revisit the classic labor market signaling model (Spence [1973]), which sparked the original discussion around dynamic signaling (Nöldeke and van Damme [1990a], Swinkels [1999]). In the dynamic version of this model, a long-lived candidate of privately known ability $\theta \in \Theta \subseteq \mathbb{R}_+$ acquires costly and, w.l.o.g., unproductive education in an attempt to signal her ability to potential employers. A high-ability worker is more productive on the job and can thus bargain for a higher wage, while also having lower cost of education than a low-ability worker.

In every period $t \in \mathcal{T} \equiv \{0, dt, 2dt, \ldots\}$ (where period length $dt$ is “small”) she chooses whether to acquire education or not, $e \in \{0, 1\}$. Alternatively, in a more general version of the model the candidate could select education intensity $e \in [0, \bar{e}]$. The flow cost of education is given by $c(e \mid \theta) \equiv l(e) \cdot m(\theta)$, where $l(e)$ is increasing in $e$ with $l(0) = 0$, and $m(\theta)$ is strictly decreasing in $\theta$.

There is a population of homogeneous competitive employers, who observe the full history of the candidate’s education choices and grades. In every period they simultaneously offer employment contracts to the candidate. After observing all contracts, the candidate may accept at most one of them. If a contract is accepted, in every future period the candidate receives wage $w \cdot dt$, where $w$ is as specified in the contract. Let $d \in \{0, 1\}$ denote the worker’s acceptance decision – whether she chooses to accept an offer in a given period or not. If the candidate chooses to accept, she would trivially find it optimal to choose the highest-wage contract.

W.l.o.g., let $\theta$ be equal to the candidate’s on-the-job productivity (so her output is $\theta \cdot dt$ per period). This means that competitive firms will at any history $h_t$ all offer the same wage $w(h_t) = \mathbb{E}[\theta \mid h_t, d(h_t) = 1]$. A history here consists of the candidate’s past actions: h t = { d s , e s } s ∈T ,s