arXiv [econ.TH]

A Reputation for Honesty*

Drew Fudenberg† Ying Gao‡ Harry Pei§

November 17, 2020
We analyze situations in which players build reputations for honesty rather than for playing particular actions. A patient player facing a sequence of short-run opponents makes an announcement about their intended action after observing an idiosyncratic shock, and before players act. The patient player is either an honest type whose action coincides with their announcement, or an opportunistic type who can freely choose their actions. We show that the patient player can secure a high payoff by building a reputation for being honest when the short-run players face uncertainty about which of the patient player’s actions are currently feasible, but may receive a low payoff when there is no such uncertainty.

Many economic actors have reputations for keeping or breaking their promises. As a prominent example, Archibald Cox Jr’s default on his promise of paying high bonuses in the early 90s triggered a massive defection of key personnel from First Boston to its archrival Merrill Lynch. Similar logic applies to advertising and marketing, which can set customers’ expectations about the types of interactions they are going to have with the firm. If those expectations are not aligned with the actual customer experience, the firm’s brand and business will suffer.

Footnote: See “Taking the Dare” in The New Yorker, July 26th, 1993. The departed individuals include leaders of First Boston’s prestigious energy group, and more than a dozen managing directors in its fixed-income and mortgage-backed-securities groups, triggered by a lower bonus payment than what they had been promised. Many who departed had been at First Boston their entire careers, including during its difficult times in the 80’s.

Motivated by these observations, we examine the reward for building a reputation for honesty. Compared to reputations for taking specific actions, a reputation for honesty can better adapt an agent’s decisions to the current circumstances, which is valuable when the environment changes over time. Moreover, it is unrealistic to make commitments based on future contingencies that are hard to describe in advance, and the simplicity of a commitment to honesty makes it more plausible.

* We thank Mehmet Ekmekci and Navin Kartik for helpful comments, and National Science Foundation grants SES-1947021 and SES-1951056 for financial support.
† Department of Economics, Massachusetts Institute of Technology.
‡ Department of Economics, Massachusetts Institute of Technology.
§ Department of Economics, Northwestern University.
In our model, a patient player (e.g., a firm) faces a sequence of myopic opponents (e.g., consumers), each of whom plays the game only once. Each period, before players act, the patient player privately observes an idiosyncratic shock, which can affect their payoff (e.g., their production cost) and which of their actions are currently feasible. Then the patient player announces the action they intend to play. The myopic players cannot observe the shocks, but can observe the announcement in the current period as well as whether the patient player has kept their word in the past.

The patient player is either an honest type, who strategically chooses their announcements but always keeps their word, or an opportunistic type, who strategically chooses both the announcements and the actions. Both types have the same payoff function. This contrasts with Kreps and Wilson (1982), Milgrom and Roberts (1982), and Fudenberg and Levine (1989), in which, with positive probability, the patient player is a commitment type who mechanically plays a particular action.

Theorem 1 shows that the patient player receives at least their expected Stackelberg payoff in every equilibrium when the myopic players face a small amount of uncertainty about the actions currently available to the patient player. A complication is that the opportunistic type may announce certain actions with higher probability than the honest type does, so the patient player’s announcement may adversely affect their opponent’s belief about their type. As a result, both types of the patient player may face a tradeoff between announcing actions that lead to higher credibility and announcing actions that lead to higher commitment payoffs (i.e., payoff conditional on being trusted).

To see why the reputation bound nevertheless obtains, suppose the honest type announces their Stackelberg action whenever it is feasible.
When a myopic player does not best reply against the announcement, whether the patient player keeps their word in that period is informative about their type. Because the set of feasible actions is stochastic, the honest type announces each action with strictly positive probability, which implies that observing the current announcement leads to at most a bounded change in the myopic player’s belief. Therefore, when the patient player behaves honestly, there can be at most a bounded number of periods in which the myopic players do not best reply to the announcement. As a result, the patient player receives at least their expected Stackelberg payoff.

By contrast, Theorem 2 shows that when the patient player can choose from any of their possible actions in every period, there are equilibria in which they receive a low payoff, which can be as low as their minmax value in examples such as the product choice game.

Footnote: In Section 4, we show that our reputation result extends when the patient player observes which of their actions are feasible after making their announcement, or when the patient player chooses an action (e.g., their effort), observes their product quality, and makes an announcement about quality before the myopic player chooses their action.
Related Literature:
Our paper contributes to the study of reputation models where no types are committed to specific actions. Schmidt (1993) characterizes the Markov equilibria of finite-horizon repeated bargaining games in which a firm has private information about its production cost. Pei (2020) characterizes an informed player’s highest Nash equilibrium payoff when facing uninformed opponents. Sugaya and Wolitzky (2020) constructs a cooperative equilibrium in a community enforcement model with a type that communicates strategically but is committed to playing always defect. By contrast, we provide a lower bound on the patient player’s payoff for all Nash equilibria.

Our reputation result requires the uninformed players to face uncertainty about the availability of the informed player’s actions, or more generally, to believe that the honest type makes every announcement with positive probability. This is related to Celentani, Fudenberg, Levine, and Pesendorfer (1996) and Atakan and Ekmekci (2015), which show that full support monitoring can help reputation building when the uninformed player is long-lived. Their results, unlike ours, require that the informed player cannot perfectly observe the uninformed player’s actions.

Jullien and Park (2020) studies repeated buyer-seller games in which a seller privately observes their product quality, which is a noisy signal of their effort. It shows that cheap talk communication about quality improves the maximum social welfare if and only if the seller’s cost of effort is intermediate. Our paper examines a complementary question, namely, whether a patient player can guarantee high payoffs in all equilibria by building reputations for honesty. Successful reputation building in our model requires uncertainty about the actions available to the patient player, but does not depend on the players’ payoff functions.
Corollary 2 in Section 4 extends our insights to Jullien and Park (2020)’s setting, which implies that a patient seller receives their optimal commitment payoff in all equilibria when product quality (i.e., the seller’s private signal) is a noisy signal of effort, but receives a payoff lower than that in some equilibria when quality is a perfect signal of effort.

The fact that many people prefer to be honest has been established experimentally by e.g. Gneezy (2005) and Gneezy, Kajackaite, and Sobel (2018). Kartik, Ottaviani, and Squintani (2007) and Kartik (2009) show how costs of lying change the equilibrium outcomes of strategic communication games.

Footnote: Jullien and Park (2014) shows that communication accelerates consumer learning when product quality is determined by the seller’s type, and the high type seller is non-strategic and always tells the truth. Awaya and Krishna (2016) identifies a class of games in which players can achieve perfectly collusive payoffs with communication, but not without it.
Consider a game between a firm (row player) with discount factor $\delta \in (0,1)$ and a sequence of consumers (column player), each of whom plays the game only once. In every period, the firm privately observes its i.i.d. cost of production $\theta_t \in \{\theta_g, \theta_b\}$. Let $p_g \in (0,1)$ be the probability that $\theta_t = \theta_g$. The players’ stage-game payoffs are:

$\theta = \theta_g$ | $T$ | $N$
$H$ | $1$, … | $-$…, …
$L$ | …, $-$… | $0$, …

$\theta = \theta_b$ | $T$ | $N$
$H$ | $-1$, … | $-$…, …
$L$ | …, $-$… | $0$, …

When $\theta = \theta_g$, the best pure-strategy commitment for the firm is action $H$, which yields payoff 1. When $\theta = \theta_b$, the firm’s optimal commitment action is $L$, which yields payoff 0. If the firm obtains its optimal pure-strategy commitment payoff in every state, then its expected payoff is $p_g$.

No Announcement Benchmark:
Suppose the firm cannot make announcements about its intended actions, and that with small but positive probability it is a commitment type that mechanically plays $H$ in every period. Future consumers can observe the firm’s effort in previous periods, but not the past realizations of $\theta_t$. Then, there are equilibria in which the patient firm’s payoff is $\max\{0, 2p_g - 1\}$, which is strictly lower than $p_g$. For example, when $p_g \geq 1/2$, there is an equilibrium where the firm plays $H$ in every period on the equilibrium path, and each consumer chooses $T$ unless they observe $L$ in at least one of the previous periods. Intuitively, when $\theta = \theta_b$, the cost of playing $H$ outweighs the benefit from the consumer’s trust, and the firm faces a tradeoff between sustaining its reputation for playing $H$ and avoiding the excessive cost.

This low-payoff equilibrium motivates our interest in reputations for honesty.

Footnote: A commitment to a mixed strategy is even better in state $\theta_g$. We do not consider reputations for playing mixed actions in this paper.

Footnote: This is a reasonable assumption given that $\theta$ only affects the firm’s cost of supplying high quality.

Reputation for Honesty:
Suppose that the firm can make an announcement $m_t$ about its intended action $a_t$ to the current consumer after observing $\theta_t$, but before players choose actions.

The firm is either honest or opportunistic. In contrast to the commitment types in canonical reputation models, the honest type is strategic when making announcements and does not commit to any particular action. Instead, it commits to play the action it announces in every period. The two types of the firm have the same stage-game payoff function and discount factor, and maximize their respective discounted average payoffs. The consumer in period $t$ observes the firm’s announcement in period $t$, as well as the value of $\mathbf{1}\{a_s = m_s\}$ for $s \in \{0, 1, ..., t-1\}$, i.e., whether the firm’s announcements matched its actions in the previous periods.

As Theorem 2 shows, the firm’s equilibrium payoff can be low when all of its actions are always available. To see how this works in the example, consider the following strategy profile: Both types of the firm announce $L$ and play $L$ at every history, and each consumer plays $N$ regardless of the firm’s announcement. The consumers’ belief about the firm’s type never changes on the equilibrium path. After the firm announces $H$, the current consumer believes that the firm is opportunistic and will play $L$. This strategy profile and assessment constitute a Perfect Bayesian equilibrium, in which the firm’s discounted average payoff is 0 regardless of its type.

This low-payoff equilibrium is driven by the honest-type firm’s strategic concerns when making announcements. The consumers believe that the opportunistic type is more likely to announce $H$, so the honest type faces a trade-off in state $\theta_g$ between announcing an action that leads to higher credibility (i.e., action $L$) and an action that leads to a higher commitment payoff (i.e., action $H$).
This motivates the honest type to announce $L$, making consumers’ beliefs self-fulfilling.

In contrast, Theorem 1 shows that when some of the firm’s actions are unavailable with small probability, the firm can secure a high payoff. Suppose the firm can choose both $H$ and $L$ with probability $1 - 2\varepsilon$, can only choose $H$ with probability $\varepsilon$, and can only choose $L$ with probability $\varepsilon$. Both types of the patient firm receive payoff at least $(1-\varepsilon)p_g - \varepsilon(1-p_g)$ when the feasibility of actions is i.i.d. over time and is independent of $\theta$. This guaranteed payoff converges to $p_g$ as $\varepsilon \to 0$.

Footnote: When future consumers only observe whether $a_t$ coincides with $m_t$, but not the exact realizations of $a_t$ and $m_t$, they do not observe deviations in the announcement stage if the firm kept its word. We show that the firm can also receive a low payoff when future consumers can observe both $a_t$ and $m_t$.

Remarks:
Our assumption that the consumers face uncertainty about which of the firm’s actions are feasible fits situations in which the firm is a single contractor who occasionally may be sick, and so unable to provide high-quality service. It also fits cases where the firm faces occasional regulatory inspections, and producing low-quality products in those periods can lead to fines and the risk of being shut down. In this situation, the firm will always choose to supply high quality, regardless of its discount factor.

Footnote: Formally, if the firm chooses $L$ when it is inspected, it faces a fine $f > 0$ and a risk $q \in (0,1)$ of shutting down. One can show that there exist $\bar{f} > 0$ and $\bar{q} \in (0,1)$ such that when $f > \bar{f}$ and $q > \bar{q}$, it is a dominant strategy for both types of the firm to choose $a_t = H$ regardless of their discount factors and the equilibrium being played.

Our reputation result (Theorem 1) extends to cases where the distribution of the feasible actions varies exogenously over time or is correlated with the current $\theta$. It also extends when $\theta_t$ is drawn from a potentially different set $\Theta_t$ in every period, as long as the patient player’s payoff is uniformly bounded for all $t \in \mathbb{N}$. This captures situations in which the client’s demand varies over time and which of the firm’s actions benefits the client is known only after the client arrives.

In these situations, the advantage of establishing a reputation for honesty is more pronounced. Due to the complicated nature of future payoff environments, it is impractical for the firm to commit to state-contingent action plans. A reputation for honesty allows the firm to communicate its intended actions after observing the payoff environment, which sidesteps these complications.

Time is discrete, indexed by $t = 0, 1, ...$. A long-lived player 1 (e.g., a seller) with discount factor $\delta$ interacts with an infinite sequence of short-lived player 2s (e.g., consumers), with $2_t$ denoting the short-lived player in period $t$. Player 1’s action set is $A$ and player 2’s action set is $B$.

Each period consists of an announcement stage and an action stage. In period $t$, an i.i.d. random variable $(\theta_t, \omega_t) \in \Theta \times \Omega$ is drawn according to $p \in \Delta(\Theta \times \Omega)$, where $\theta_t \in \Theta$ affects player 1’s stage-game payoff (e.g., their cost of supplying high quality) and $\omega_t \subset A$ is the set of feasible actions, with $\Omega \equiv 2^A \setminus \{\emptyset\}$. Player 1 privately observes $(\theta_t, \omega_t)$ and announces to player $2_t$ that they intend to play action $m_t \in A$. Players then simultaneously choose their actions $a_t \in \omega_t$ and $b_t \in B$.

Player 1’s stage-game payoff is $u_1(\theta_t, a_t, b_t)$ and player $2_t$’s is $u_2(a_t, b_t)$. Importantly, player 2’s payoff does not depend on $\theta_t$. Each player 2 who arrives after period $t$ can observe $y_t \in Y$, distributed according to $F(\cdot | a_t, m_t)$. A leading example is when $y_t$ is the indicator function $\mathbf{1}\{a_t = m_t\}$, that is, future short-run players observe whether the patient player has kept their word in the past.

Player 1 has private information about their type $\gamma \in \{\gamma_h, \gamma_o\}$, which is either honest ($\gamma_h$) or opportunistic ($\gamma_o$). Both types share the same stage-game payoff function. The honest type is restricted (i) to announce an action that is currently available, i.e., $m_t \in \omega_t$, and (ii) to take an action that matches their announcement, i.e., $a_t = m_t$. The opportunistic type can announce any action (including ones that are not feasible that period) and can take any action in $\omega_t$ regardless of their announcement. Let $\pi_0 \in (0,1)$ be the prior probability of the honest type according to player 2s’ prior belief.

For every $t \in \mathbb{N}$, player $2_t$’s private history is $h^t \equiv \{y_0, y_1, ..., y_{t-1}, m_t\}$, with $h^t \in \mathcal{H}^t$. Player $2_t$’s strategy is $\sigma_2^t : \mathcal{H}^t \to \Delta(B)$, with $\sigma_2 \equiv (\sigma_2^t)_{t \in \mathbb{N}}$.
Player 1’s private history in the announcement stage of period $t$ is $\hat{h}^t \equiv \{\theta_s, \omega_s, m_s, a_s, b_s, y_s\}_{s=0}^{t-1} \cup \{\gamma, \omega_t, \theta_t\}$, with $\hat{h}^t \in \widehat{\mathcal{H}}^t$ and $\widehat{\mathcal{H}} \equiv \bigcup_{t=0}^{\infty} \widehat{\mathcal{H}}^t$. Player 1’s private history in the action stage of period $t$ is $\tilde{h}^t \equiv \{\theta_s, \omega_s, m_s, a_s, b_s, y_s\}_{s=0}^{t-1} \cup \{\gamma, \omega_t, \theta_t, m_t\}$, with $\tilde{h}^t \in \widetilde{\mathcal{H}}^t$ and $\widetilde{\mathcal{H}} \equiv \bigcup_{t=0}^{\infty} \widetilde{\mathcal{H}}^t$. The opportunistic type’s strategy is $\sigma_o \equiv (\hat{\sigma}_o, \tilde{\sigma}_o)$, with $\hat{\sigma}_o : \widehat{\mathcal{H}} \to \Delta(A)$ their strategy to make announcements and $\tilde{\sigma}_o : \widetilde{\mathcal{H}} \to \Delta(A)$ their strategy to take actions, subject to a feasibility constraint that the support of $\tilde{\sigma}_o(\tilde{h}^t)$ is a subset of $\omega_t$. The honest type’s strategy is $\sigma_h \equiv (\hat{\sigma}_h, \tilde{\sigma}_h)$, with $\hat{\sigma}_h : \widehat{\mathcal{H}} \to \Delta(A)$ their strategy to make announcements and $\tilde{\sigma}_h : \widetilde{\mathcal{H}} \to \Delta(A)$ their strategy to take actions, subject to two constraints: first, the support of $\hat{\sigma}_h(\hat{h}^t)$ is a subset of $\omega_t$, and second, their action matches their announcement, $\tilde{\sigma}_h(\tilde{h}^t) = m_t$.

A Nash equilibrium (NE) consists of $(\sigma_o, \sigma_h, \sigma_2)$, in which $\sigma_2^t$ maximizes player $2_t$’s stage-game payoff, and every type of player 1 chooses a strategy that maximizes their discounted average payoff $\mathbb{E}\big[\sum_{t=0}^{\infty} (1-\delta)\delta^t u_1(\theta_t, a_t, b_t)\big]$. We assume that $\Theta$, $A$, $B$, and $Y$ are finite sets, which together with discounting of per period payoffs implies that a Nash equilibrium exists (Fudenberg and Levine 1983).

We show that when the short-run players face a small amount of uncertainty about the feasibility of the patient player’s actions, and $y_t$ is informative about whether the patient player has kept their word in period $t$, the patient player can secure their expected (pure) Stackelberg payoff in every equilibrium. By contrast, the patient player receives a low payoff in some equilibria when all of their actions are feasible in every period.

Recall that $\omega_t \subset A$ is the set of feasible actions in period $t$. For every $\varepsilon > 0$, we say that player 1’s action choice is $\varepsilon$-flexible if the probability with which $\omega_t = A$ is at least $1 - \varepsilon$.

Assumption 1.
For every $a \in A$, $\omega_t = \{a\}$ with strictly positive probability.

Our next assumption requires $y_t$ to be informative about whether player 1’s action and announcement match. A leading example that satisfies this assumption is $y_t = \mathbf{1}\{a_t = m_t\}$.

Assumption 2.
If $a = m$ and $a' = m'$, then (i) $F(\cdot|a,m) = F(\cdot|a',m')$, and (ii) $F(\cdot|a,m)$ does not belong to the convex hull of $\{F(\cdot|a'',m'')\}_{a'' \neq m''}$.

Let $BR : \Delta(A) \to 2^B \setminus \{\emptyset\}$ be player 2’s best reply correspondence. In state $\theta \in \Theta$, player 1’s Stackelberg payoff is
$$v^*(\theta) \equiv \max_{a \in A} \Big\{ \min_{b \in BR(a)} u_1(\theta, a, b) \Big\},$$
and their expected Stackelberg payoff is $v^* \equiv \sum_{\theta \in \Theta} p(\theta) v^*(\theta)$.

Theorem 1.
Suppose the environment satisfies Assumptions 1 and 2. For every $\eta > 0$, there exist $\underline{\delta} \in (0,1)$ and $\varepsilon > 0$ such that when $\delta > \underline{\delta}$ and player 1’s action choice is $\varepsilon$-flexible, each type of player 1 receives payoff at least $v^* - \eta$ in every Nash equilibrium.

Footnote: In what follows, we will simply say Stackelberg action and Stackelberg payoff, with “pure” left implicit.

Footnote: When $y_t = \mathbf{1}\{a_t = m_t\}$, a patient player 1 can guarantee payoff approximately $v^*$ in every weak rationalizable outcome defined in Watson (1993) in the perturbed game where player 1 is honest with positive probability.

[...] $y_t = \mathbf{1}\{a_t = m_t\}$.

Let $a^* : \Theta \to A$ be any mapping such that $a^*(\theta) \in \arg\max_{a \in A} \big\{ \min_{b \in BR(a)} u_1(\theta, a, b) \big\}$ for every $\theta$. Fix any equilibrium, and consider the honest type’s payoff when they announce $a^*(\theta_t)$ in period $t$ whenever $a^*(\theta_t) \in \omega_t$. The second part of Assumption 2 implies that whether player 1’s action coincides with their announcement is informative about their type in the “bad” periods where player 2 fails to best reply to the announcement. Assumption 1 requires that for each $a \in A$, $\omega_t = \{a\}$ with positive probability, which implies that the honest type makes each announcement with positive probability in every period. The first part of Assumption 2 guarantees that the honest type’s deviation generates the same distribution of player 2’s histories as the honest type’s equilibrium strategy.

If player $2_t$ fails to best respond to the announced action, player 2 must assign a significant probability to the event that player 1 is opportunistic and takes $a_t \neq m_t$. Hence, observing $a_t = m_t$ should increase the posterior probability with which player 1 is honest. Thus, there exists at most a bounded number of periods where player 2 believes that player 1 chooses $a_t \neq m_t$ with significant probability. In other words, player 2 must believe that player 1 is honest or opportunistic but keeps their word most of the time.

Theorem 1 requires that the patient player can choose any action in $A$ with probability close to 1. In fact, as long as the environment satisfies Assumptions 1 and 2, one can establish a reputation result using the same argument as the proof of Theorem 1 after adjusting the definition of expected Stackelberg payoff.
For every $(\theta, \omega) \in \Theta \times \Omega$, let
$$u^*(\theta, \omega) \equiv \max_{a \in \omega} \Big\{ \min_{b \in BR(a)} u_1(\theta, a, b) \Big\}, \tag{3.1}$$
and let $u^* \equiv \sum_{(\theta, \omega) \in \Theta \times \Omega} p(\theta, \omega) u^*(\theta, \omega)$. One can show that when the environment satisfies Assumptions 1 and 2, a patient player can guarantee payoff $u^*$ in all equilibria when $\delta$ is close enough to 1.

Footnote: Our reputation result extends when $\omega_t$ is the set of feasible announcements. When the honest type trembles and makes each announcement with positive probability, our result also extends to settings where player 1 makes their announcement before knowing which of their actions are feasible. See Corollary 3 for details.

Assumption 2 rules out situations in which player 2s can perfectly observe player 1’s past announcements [...] $a_t = m_t$. To study such situations, we extend our result when player 2s can observe informative signals $z_t$ about the past realizations of $a_t$ and $m_t$, as long as each of them can only observe the realizations of $z_t$ in a bounded number of previous periods.

Formally, let $z_t \in Z$, where $z_t$ is distributed according to $G(\cdot | m_t, a_t)$. We make no restrictions on $G$ except that its support $Z$ is a finite set. Suppose for every $t \in \mathbb{N}$, player $2_t$ can observe player 1’s announcement $m_t$, the history $\{y_0, ..., y_{t-1}\}$, as well as a (possibly stochastic) subset of $\{z_0, ..., z_{t-1}\}$ that has at most $K \in \mathbb{N}$ elements. Whether player 1 can observe $y$ and $z$ is irrelevant for our result.

Our assumption on the asymmetry between player 2s’ observations of $y_t$ and $z_t$ is motivated by retail markets in developing economies. Due to the lack of record-keeping institutions, detailed information about sellers’ actions and announcements (e.g., the quality of their services, various attributes of their products, the content of their advertisements, and so on, which correspond to $z_t$) is likely to get lost over time.
By contrast, coarse information about sellers’ records, such as whether they have kept their word (which corresponds to $y_t$), is likely to be more persistent due to social learning and word-of-mouth communication. Corollary 1 extends Theorem 1, with proof in Appendix C.

Corollary 1.
Suppose the environment satisfies Assumptions 1 and 2, and there exists $K \in \mathbb{N}$ such that each player 2 observes the past realizations of $z$ in at most $K$ periods. For every $\eta > 0$, there exist $\underline{\delta} \in (0,1)$ and $\varepsilon > 0$ such that when $\delta > \underline{\delta}$ and player 1’s action choice is $\varepsilon$-flexible, each type of player 1 has payoff at least $v^* - \eta$ in every Nash equilibrium.

Now we show why uncertainty about which of player 1’s actions are feasible is necessary for Theorem 1 to hold in general. We show this for situations in which $\omega_t = A$ with probability 1 and $y_t = \mathbf{1}\{a_t = m_t\}$.

Footnote: We can construct low-payoff equilibria when $\omega_t = A$ with probability 1 and player $2_t$ can perfectly observe $\{a_s, m_s\}_{s=0}^{t-1}$. We do not know how to construct low-payoff equilibria when some actions are not available with positive probability and player $2_t$ can perfectly observe $\{a_s, m_s\}_{s=0}^{t-1}$.

We start by introducing two auxiliary one-shot games that have the same payoff functions as the original stage game. The first auxiliary game does not have a communication stage: Player 1 observes $\theta$, and then players act simultaneously without any communication. Let $v^{\min}$ be player 1’s lowest Nash equilibrium payoff in this game. The second auxiliary game has an action recommendation stage: Player 1 observes $\theta$ and makes a recommendation $\hat{b} \in B$ to player 2 before players take their actions. Let $\hat{v}$ be player 1’s lowest pure-strategy equilibrium payoff in this game. If there is no pure-strategy equilibrium in this game, let $\hat{v} = +\infty$.

Let $\mathcal{B}$ be the set of mappings $\beta : A \to \Delta(B)$ such that $\beta(a)$ is a best reply to $a$ for every $a \in A$. Abusing notation, let $p$ be the distribution of $\theta$. Let
$$v' \equiv \min_{A' \subset A,\ \beta \in \mathcal{B}} \sum_{\theta \in \Theta} p(\theta) \max_{a \in A'} u_1(\theta, a, \beta(a)) \tag{3.2}$$
subject to
$$\sum_{\theta \in \Theta} p(\theta) \max_{a \in A'} u_1(\theta, a, \beta(a)) \geq \min\{v^{\min}, \hat{v}\}. \tag{3.3}$$

Theorem 2 shows that when all of player 1’s actions are feasible in every period, there are equilibria in which both types of player 1 have payoff no more than $v'$.

Theorem 2.
If $\omega_t = A$ with probability 1 and $y_t = \mathbf{1}\{a_t = m_t\}$, then there exists $\underline{\delta} \in (0,1)$ such that for every $\delta > \underline{\delta}$, there exists an equilibrium in which both types of player 1’s payoff is $v'$.

The proof of this result and a subsequent lemma are in Appendix B.

In order to understand the connections between the conclusion of Theorem 2 and that of Theorem 1, we compare $v'$ with player 1’s expected Stackelberg payoff $v^*$ and their minmax payoff. To start with, one can verify that $v^* \geq v'$ when players’ payoffs are generic and the auxiliary game without communication admits a pure-strategy equilibrium. Next, we introduce a class of games under which $v'$ is strictly less than $v^*$, and under an additional supermodularity condition, $v'$ equals player 1’s minmax payoff.

Supermodularity Condition.
There exists a complete order on $A$ such that for every $\theta \in \Theta$, $u_1(\theta, a, b)$ is strictly decreasing in $a$, and there exists $\theta \in \Theta$ such that player 1’s Stackelberg action in state $\theta$ is not the lowest element in $A$.

Lemma 1.
If every $\theta \in \Theta$ occurs with positive probability and the stage-game payoffs satisfy supermodularity, then

1. $v' < v^*$.
2. In addition, if there also exists a complete order on $B$ such that $u_1$ is strictly increasing in $b$ and $u_2$ has strictly increasing differences in $a$ and $b$, then $v'$ is player 1’s minmax payoff.

Footnote: The generic requirement is that player 1 has a strict best reply to every $b \in B$ for every $\theta \in \Theta$, and player 2 has a strict best reply to every $a \in A$. The existence of a pure-strategy equilibrium in the auxiliary game without communication rules out zero-sum games such as matching pennies, where the patient player cannot benefit from committing to pure actions.

The supermodularity condition and the additional assumption in Lemma 1 fit applications such as product choice games, where a firm finds it costly to exert high effort, can strictly benefit from consumers’ trust, and can benefit from committing to high effort in states where its production cost is low enough. The consumers have stronger incentives to trust the firm when the latter exerts higher effort. Our conditions also apply to games of entry deterrence (Kreps and Wilson 1982, Milgrom and Roberts 1982), capital taxation (Phelan 2006), monetary policy (Barro 1986), and trust games more generally (Liu and Skrzypacz 2014).
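The gap between the expected Stackelberg payoff $v^*$ and the minmax payoff in a product choice game can be checked numerically. The sketch below is only an illustration: the payoff numbers in `PAYOFFS`, the state labels, and the value $p_g = 3/5$ are hypothetical choices made to agree with the facts stated in the example (committing to $H$ yields 1 in the good state, committing to $L$ yields 0 in the bad state), not values taken from the paper. The minmax computed here is restricted to pure actions of player 2.

```python
from fractions import Fraction

ACTIONS = ("H", "L")   # player 1's actions
REPLIES = ("T", "N")   # player 2's actions

# Hypothetical stage-game payoffs (player 1, player 2) in each state.
# These numbers are illustrative assumptions, not the paper's table.
PAYOFFS = {
    "good": {("H", "T"): (1, 1), ("H", "N"): (-1, 0),
             ("L", "T"): (2, -1), ("L", "N"): (0, 0)},
    "bad":  {("H", "T"): (-1, 1), ("H", "N"): (-2, 0),
             ("L", "T"): (2, -1), ("L", "N"): (0, 0)},
}


def best_replies(table, action):
    """Player 2's pure best replies to a pure action of player 1."""
    best = max(table[(action, b)][1] for b in REPLIES)
    return [b for b in REPLIES if table[(action, b)][1] == best]


def stackelberg_payoff(table):
    """Pure Stackelberg payoff: max over a of the worst best reply to a."""
    return max(
        min(table[(a, b)][0] for b in best_replies(table, a))
        for a in ACTIONS
    )


def expected_stackelberg(p_good):
    """Expected Stackelberg payoff when the good state has probability p_good."""
    return (p_good * stackelberg_payoff(PAYOFFS["good"])
            + (1 - p_good) * stackelberg_payoff(PAYOFFS["bad"]))


def pure_minmax(p_good):
    """Player 1's minmax payoff against pure actions of player 2."""
    weights = {"good": p_good, "bad": 1 - p_good}
    return min(
        sum(w * max(PAYOFFS[s][(a, b)][0] for a in ACTIONS)
            for s, w in weights.items())
        for b in REPLIES
    )


p_good = Fraction(3, 5)
v_star = expected_stackelberg(p_good)
v_minmax = pure_minmax(p_good)
print(v_star, v_minmax)
```

Under these assumed payoffs the expected Stackelberg payoff equals $p_g$ while the pure-action minmax payoff is 0, which matches the ordering described in Lemma 1 for this class of games.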
We note here that the conclusion of Theorem 1 extends to two alternative scenarios.
Announcing Product Quality:
Suppose that players move sequentially in the stage game. In period $t \in \mathbb{N}$, player 1 (e.g., a firm) chooses their effort $a_t \in A$, privately observes the quality of its product $x_t \in X$, which is distributed according to $g(\cdot | a_t) \in \Delta(X)$, and makes an announcement $m_t \in X$ about quality. Player $2_t$ (e.g., a consumer) observes $m_t$ as well as whether $x_s$ coincides with $m_s$ for all $s \leq t-1$, and then chooses $b_t \in B$. We assume that $A$, $B$, and $X$ are finite sets.

Player 1 is either an honest type who strategically chooses actions $a_t \in A$ but announces $x_t$ truthfully, or an opportunistic type who strategically chooses both the actions and the announcements. Both types have stage-game payoff $u_1(a_t, b_t)$ and discount factor $\delta \in (0,1)$. Player $2_t$’s payoff is $u_2(x_t, b_t)$, i.e., their payoff depends only on product quality and their purchasing decision. This fits the model of Jullien and Park (2020) except that there is a positive probability of the honest type, and the ex-post quality is not directly observed by subsequent consumers.

For every $x \in X$, let $BR(x) \subset B$ be the set of pure best replies against $x$. Player 1’s optimal commitment payoff is
$$v^{**} \equiv \max_{a \in A} \Big\{ \sum_{x \in X} g(x|a) \min_{b \in BR(x)} u_1(a, b) \Big\}. \tag{4.1}$$

Corollary 2. If $g(\cdot|a)$ has full support for every $a \in A$, then for every $\varepsilon > 0$, there exists $\underline{\delta} \in (0,1)$ such that when $\delta > \underline{\delta}$, every type of player 1’s payoff in every Nash equilibrium is at least $v^{**} - \varepsilon$.

The proof is in Appendix D, which uses similar ideas as the proof of Theorem 1.

Next we show that reputations for honesty cannot guarantee player 1 their optimal commitment payoff when product quality is a perfect signal of player 1’s effort. Suppose $X = A$ and $g(a|a) = 1$ for every $a \in A$, and players’ stage-game payoffs are given by the following matrix:

− | $T$ | $N$
$H$ | …, … | $-$…, …
$L$ | …, $-$… | $0$, …

[...] $H$.

We construct a Perfect Bayesian equilibrium in which player 1’s payoff is 0.
On the equilibrium path, both types of player 1 play $L$ and announce $L$ in every period, and player 2s play $N$ at every on-path history. After observing announcement $H$, player 2s believe that player 1 is the opportunistic type and has played $L$ with probability 1, and play $N$. This equilibrium survives both when player 2s can only observe whether $m_t$ matches $x_t$ in all previous periods, and when player 2s can observe the values of $x_t$ and $m_t$ in all previous periods.

Making Announcements Before Knowing the Set of Feasible Actions:
In some applications, the patient player makes announcements before knowing which of their actions are feasible, and an honest individual may break their word when the action they announced turns out to be infeasible. Theorem 1 extends to this setting if (1) the honest type trembles and makes each announcement with positive probability, and (2) the probability with which all actions are feasible is close to 1.

In period $t \in \mathbb{N}$, player 1 observes $\theta_t \in \Theta_t$ and makes an announcement about their intended action $m_t \in A_t$. Player $2_t$ observes $m_t$, player 1 observes the realization of $\omega_t \in \Omega \equiv 2^A \setminus \{\emptyset\}$, and then both players choose $(a_t, b_t) \in \omega_t \times B$ simultaneously. Future player 2s observe $y_t \equiv \mathbf{1}\{a_t = m_t\}$. We assume $\{\omega_t, \theta_t\}_{t \in \mathbb{N}}$ are i.i.d. over time, with $p \in \Delta(\Omega \times \Theta)$ their joint distribution.

Player 1 is either an opportunistic type who can take any action regardless of their announcement, or an honest type who chooses $a_t = m_t$ as long as $m_t \in \omega_t$. Both types of player 1 tremble when making announcements, i.e., there exists $\eta > 0$ such that each announcement is made with probability at least $\eta$ at every information set.

Corollary 3. For every $\varepsilon > 0$, there exist $\underline{\delta} \in (0,1)$, $\bar{\eta} > 0$, and $c \in (0,1)$, such that when $\delta > \underline{\delta}$, $\eta \in (0, \bar{\eta})$, and the probability that $\omega_t = A$ is at least $1 - \eta c$, each type of player 1 receives payoff at least $v^* - \varepsilon$ in every Nash equilibrium.

The proof is in Appendix E. Unlike our baseline model and reputation models with noisy monitoring such as Fudenberg and Levine (1992), when the honest type uses the strategy of announcing their Stackelberg action and keeping their word whenever it is feasible, their reputation may deteriorate in expectation.

Our proof starts by showing that when $\omega_t = A$ with probability close to 1, the probability that the honest type keeps their word in equilibrium is close to 1, and for reputation to deteriorate when the honest type keeps their word, the opportunistic type must also keep their word with probability close to 1.
This implies that in those periods, player 2 has a strict incentive to best reply to player 1’s announcement, and moreover, the amount of reputation deterioration is small. By contrast, in “bad” periods where player 2 has a strict incentive not to best reply against player 1’s announcement, the probability that the opportunistic type breaks their promise is large and keeping one’s word leads to a significant improvement in one’s reputation. Although the number of bad periods can be unbounded, their fraction goes to zero as the probability of $\omega_t = A$ goes to one.

A Proof of Theorem 1
Fix any Nash equilibrium $(\sigma_o, \sigma_h, \sigma_2)$ and consider any history $h^t$ that occurs with strictly positive probability under $(\sigma_o, \sigma_h, \sigma_2)$.

For $i \in \{h, o\}$, let $P^{\sigma_i, \sigma_2}$ be the probability measure over $Y^\infty$ induced by $(\sigma_i, \sigma_2)$. Denote player 2’s belief over player 1’s private history as a function of $h^t$ by $\beta(\hat{h}^t | h^t)$, and let $\sigma_o(h^t)$ be the expected distribution of opportunistic player 1’s joint announcement-action pairs implied by $\beta(\hat{h}^t | h^t)$, with $\hat{\sigma}_o(h^t)$ and $\tilde{\sigma}_o(h^t)$ the marginal distributions of announcements and actions, respectively. Let $\pi_t$ be the probability of the honest type according to player 2’s belief in period $t$ after observing $\{y_0, \dots, y_{t-1}\}$. According to Bayes rule,
\[
\pi_t = \frac{P^{\sigma_h, \sigma_2}(y_0, \dots, y_{t-1})\, \pi_0}{P^{\sigma_h, \sigma_2}(y_0, \dots, y_{t-1})\, \pi_0 + P^{\sigma_o, \sigma_2}(y_0, \dots, y_{t-1})\, (1 - \pi_0)}. \tag{A.1}
\]
Let $\alpha_t(m_t) \equiv \pi_t \hat{\sigma}_h(h^t)(m_t) + (1 - \pi_t) \hat{\sigma}_o(h^t)(m_t)$, which is the probability of announcement $m_t$ conditional on $h^t$. Let $\xi_t(m_t)$ be the probability that $a_t = m_t$ conditional on $m_t$,
\[
\xi_t(m_t) \equiv \frac{\pi_t \tilde{\sigma}_h(h^t)(m_t)}{\pi_t \hat{\sigma}_h(h^t)(m_t) + (1 - \pi_t) \hat{\sigma}_o(h^t)(m_t)} + \frac{(1 - \pi_t) \tilde{\sigma}_o(h^t)(m_t)}{\pi_t \hat{\sigma}_h(h^t)(m_t) + (1 - \pi_t) \hat{\sigma}_o(h^t)(m_t)}.
\]
Let $\xi_t$ be the unconditional probability that player 1’s action matches their announcement:
\[
\xi_t \equiv \sum_{a \in A} \alpha_t(a)\, \xi_t(a). \tag{A.2}
\]
Let $\rho \equiv \min_{a \in A} \Pr(\omega_t = \{a\})$, which by Assumption 1 is strictly positive. This implies that $\alpha_t(m) > \rho$ for every announcement $m \in A$.
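The belief update in (A.1) can be illustrated numerically. The following is a minimal sketch and not part of the proof: the function names are hypothetical, and the per-period signal probabilities under each type are made-up inputs rather than equilibrium objects.

```python
def posterior_honest(pi0, lik_honest, lik_opp):
    """Posterior probability of the honest type, as in (A.1).

    lik_honest and lik_opp stand for the probabilities of the observed
    history (y_0, ..., y_{t-1}) under the honest and opportunistic types.
    """
    num = lik_honest * pi0
    return num / (num + lik_opp * (1.0 - pi0))


def update_path(pi0, per_period):
    """Apply the same Bayes update one period at a time.

    per_period is a list of pairs (q_h, q_o): the ASSUMED probabilities of
    the realized signal y_s under each type, given the public history.
    """
    path = [pi0]
    for q_h, q_o in per_period:
        path.append(posterior_honest(path[-1], q_h, q_o))
    return path


if __name__ == "__main__":
    # Hypothetical inputs: the honest type always matches (q_h = 1), while
    # the opportunistic type matches with probability 0.7 in each period.
    print(update_path(0.05, [(1.0, 0.7)] * 5))
```

Updating period by period gives the same posterior as applying (A.1) once to the likelihood of the whole history, which is why the proof can work with the cumulative probabilities $P^{\sigma_i, \sigma_2}(y_0, \dots, y_{t-1})$.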
Let $\lambda \in (0,1)$ be the smallest real number such that for every $\theta \in \Theta$, player 2 strictly prefers one of the actions in $\mathrm{BR}(a^*(\theta))$ to all actions outside of $\mathrm{BR}(a^*(\theta))$ when they believe that player 1 plays $a^*(\theta)$ with probability strictly more than $\lambda$.

Consider the honest type’s payoff when they use strategy $\sigma_h^* \equiv (\hat{\sigma}_h^*, \tilde{\sigma}_h^*)$, where $\tilde{\sigma}_h^*(m) = m$ for every $m \in A$, and $\hat{\sigma}_h^*(\theta_t, \omega_t) = a^*(\theta_t)$ when $a^*(\theta_t) \in \omega_t$ and is uniform over the actions in $\omega_t$ when $a^*(\theta_t) \notin \omega_t$. For any history $h^t$, if there exists $m_t \in A$ such that $\xi_t(m_t) \le \lambda$, then $\xi_t \le \xi^* \equiv 1 - (1 - \lambda)\rho$. Let $d(\cdot \| \cdot)$ denote the KL-divergence, and let $F^* \equiv F(\cdot | a, a)$. Let
\[
D^* \equiv \min_{a \neq m} d\Big( F^* \,\Big\|\, \xi^* F^* + (1 - \xi^*) F(\cdot | a, m) \Big). \tag{A.3}
\]
Part 2 of Assumption 2 implies that $D^* > 0$.

Part 1 of Assumption 2 implies that $P^{\sigma_h, \sigma_2} = P^{\sigma_h^*, \sigma_2}$. Let $F(y | h^t) = \sum_{(a,m) \in A^2} F(y | a, m)\, \sigma_o(h^t)(a, m)$ so that $F(\cdot | h^t)$ denotes the distribution over $y_t$ induced by $\sigma_o(h^t)$.

Similar to Gossner (2011), the chain rule for relative entropy implies:
\[
\begin{aligned}
-\log \pi_0 &\ge d\Big( P^{\sigma_h, \sigma_2} \,\Big\|\, \pi_0 P^{\sigma_h, \sigma_2} + (1 - \pi_0) P^{\sigma_o, \sigma_2} \Big) \\
&= d\Big( P^{\sigma_h^*, \sigma_2} \,\Big\|\, \pi_0 P^{\sigma_h, \sigma_2} + (1 - \pi_0) P^{\sigma_o, \sigma_2} \Big) \\
&= E^{(\sigma_h, \sigma_2)} \Big[ \sum_{t=0}^{\infty} d\Big( F^* \,\Big\|\, \pi_t F^* + (1 - \pi_t) F(\cdot | h^t) \Big) \Big],
\end{aligned}
\]
where the equalities hold because $(\sigma_h, \sigma_2)$ and $(\sigma_h^*, \sigma_2)$ induce the same distribution over $y$ given the first part of Assumption 2.

Therefore, $d(F^* \| F(\cdot | h^t)) \ge D^*$ if $h^t$ is such that $\xi_t(a_t) \le \lambda$ for some $a_t \in A$, so the expected number of such periods is at most
\[
T(\pi_0) \equiv \Big\lceil \frac{-\log \pi_0}{D^*} \Big\rceil. \tag{A.4}
\]
Hence the honest type’s payoff from $\sigma_h^*$ is at least
\[
\delta^{T(\pi_0)} \Big\{ \Big( 1 - \frac{\varepsilon}{\min_{\theta \in \Theta} p(\theta)} \Big) v^* + \frac{\varepsilon}{\min_{\theta \in \Theta} p(\theta)}\, \underline{v} \Big\} + \big( 1 - \delta^{T(\pi_0)} \big) \underline{v}, \tag{A.5}
\]
in which $\underline{v}$ is player 1’s lowest stage-game payoff.
Expression (A.5) converges to $v^*$ when $\delta \to 1$ and $\varepsilon \to 0$. Since the opportunistic type’s payoff is weakly greater than the honest type’s payoff, their equilibrium payoff is also weakly more than (A.5).
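As a rough numerical illustration of how the bound behaves, the sketch below evaluates (A.3) to (A.5) under an extra assumption that is not part of the proof: the signal is the binary indicator $y_t = \mathbf{1}\{a_t = m_t\}$, so that $F^*$ puts probability 1 on $y = 1$ and $F(\cdot | a, m)$ with $a \neq m$ puts probability 1 on $y = 0$, which gives $D^* = -\log \xi^*$. All function names and parameter values are hypothetical.

```python
import math


def xi_star(lam, rho):
    """xi* = 1 - (1 - lam) * rho, as defined before (A.3)."""
    return 1.0 - (1.0 - lam) * rho


def d_star_binary(lam, rho):
    """D* under the ASSUMED binary signal y = 1{a = m}.

    F* is a point mass on y = 1 and F(.|a, m) with a != m is a point mass
    on y = 0, so the KL divergence in (A.3) reduces to -log(xi*).
    """
    return -math.log(xi_star(lam, rho))


def t_bound(pi0, d_star):
    """T(pi0) = ceil(-log(pi0) / D*), as in (A.4)."""
    return math.ceil(-math.log(pi0) / d_star)


def payoff_bound(delta, pi0, lam, rho, eps, p_min, v_star, v_low):
    """Lower bound (A.5) on the honest type's payoff."""
    T = t_bound(pi0, d_star_binary(lam, rho))
    w = eps / p_min  # weight on the lowest payoff inside the braces
    good = (1.0 - w) * v_star + w * v_low
    return delta ** T * good + (1.0 - delta ** T) * v_low


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    for delta in (0.9, 0.99, 0.999, 0.9999):
        print(delta, payoff_bound(delta, pi0=0.05, lam=0.8, rho=0.1,
                                  eps=0.001, p_min=0.25,
                                  v_star=1.0, v_low=0.0))
```

With these inputs the number of costly periods is bounded by a constant that does not depend on $\delta$, so the bound approaches $(1 - \varepsilon / \min_\theta p(\theta)) v^*$ as $\delta$ approaches 1, in line with the limit statement above.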
B Proofs of Theorem 2 and Lemma 1
Proof of Theorem 2:
Suppose $(A', \beta)$ solves (3.2) subject to (3.3), and consider the following strategy profile: At every on-path history with $\theta_t = \theta$, both types of player 1 announce the same $a \in \arg\max_{a \in A'} u(\theta, a, \beta(a))$ and match their actions with their announcements. Player 2 chooses $\beta(a)$ following announcement $a$, and chooses $\beta(a')$ if player 1’s announcement does not belong to $A'$, where $a'$ is an arbitrary element of $A'$. At every $h^t$ where $y_s = 0$ for some $s < t$, player 2 believes that player 1 is opportunistic. If $v_{\min} \le \hat{v}$, then players coordinate on the worst stage-game Nash equilibrium for player 1. If $v_{\min} > \hat{v}$, then players coordinate on the worst pure-strategy Nash equilibrium in the second auxiliary game. Player 2s’ incentive constraints are trivially satisfied, and player 1’s incentive constraint is implied by (3.3).

Proof of Lemma 1:
Let $a \equiv \min A$ and let $b$ be player 2’s best reply against $a$. According to (3.2), $v' \le \sum_{\theta \in \Theta} p(\theta)\, u(\theta, a, b)$. According to (3.1), $v^* \ge \sum_{\theta \in \Theta} p(\theta)\, u(\theta, a, b)$, and the inequality is strict by the second part of Condition 1 and the fact that each $\theta$ has strictly positive probability.

If $u$ is strictly decreasing in $b$ and $u$ has strictly increasing differences, then $u(\theta, a, b)$ is player 1’s minmax payoff in state $\theta$. Since $v' \le \sum_{\theta \in \Theta} p(\theta)\, u(\theta, a, b)$, $v'$ is player 1’s minmax value.

C Proof of Corollary 1
Recall the definition of $\xi^*$ in the proof of Theorem 1. Let
\[
\hat{\xi} \equiv 1 - \rho^K (1 - \xi^*). \tag{C.1}
\]
Suppose the honest type announces the Stackelberg action whenever it is available. By construction, if player $2_t$ believes that $a_t = m_t$ with probability at least $\hat{\xi}$ after observing $\{y_0, \dots, y_{t-1}\}$, then because $\omega_t = \{a\}$ with probability at least $\rho$, their posterior belief after observing at most $K$ realizations of $\{z_0, \dots, z_{t-1}\}$ attaches probability at least $\xi^*$ to $a_t = m_t$. This implies that player 2 has an incentive to best reply against player 1’s announced actions. Let
\[
\hat{D} \equiv \min_{a \neq m} d\Big( F^* \,\Big\|\, \hat{\xi} F^* + (1 - \hat{\xi}) F(\cdot | a, m) \Big). \tag{C.2}
\]
The same argument as in the proof of Theorem 1 implies that in expectation, there exist at most
\[
\hat{T}(\pi_0) \equiv \Big\lceil \frac{-\log \pi_0}{\hat{D}} \Big\rceil \tag{C.3}
\]
periods in which player 2 believes that $a_t = m_t$ with probability less than $\hat{\xi}$ after observing $\{y_0, \dots, y_{t-1}\}$. As a result, the honest type’s payoff is at least $v^*$ when $\varepsilon \to 0$ and $\delta \to 1$.

D Proof of Corollary 2
Fix any equilibrium $(\sigma_o, \sigma_h, \sigma_2)$ and consider any history $h^t$ that occurs with strictly positive probability under $P^{(\sigma_o, \sigma_h, \sigma_2)}$. Let $\xi_t \in [0,1]$ be the probability player 2 assigns to $\{m_t = x_t\}$ after observing $(y_0, \dots, y_{t-1})$ but before observing $m_t$. Let $\xi_t(m_t) \in [0,1]$ be the probability their belief attaches to $\{m_t = x_t\}$ after observing $(y_0, \dots, y_{t-1})$ and $m_t$. Let $\alpha_t(m_t) \in [0,1]$ be the probability of announcement $m_t$ conditional on $(y_0, \dots, y_{t-1})$. By definition,
\[
\xi_t = \sum_{m \in X} \alpha_t(m)\, \xi_t(m). \tag{D.1}
\]
Let $\underline{g} \equiv \min_{(a,x) \in A \times X} g(x | a)$, which is strictly positive under the full support assumption. Let $\lambda \in (0,1)$ be large enough such that for every $x \in X$, if player 2 believes that $x$ occurs with probability at least $\lambda$, then they strictly prefer one of the actions in $\mathrm{BR}(x)$ to all actions that do not belong to $\mathrm{BR}(x)$. Let $\xi^* \equiv 1 - \underline{g}(1 - \lambda)$. When $\xi_t \ge \xi^*$, they believe that $m_t = x_t$ with probability more than $\lambda$ after observing $m_t$. As in the proof of Theorem 1, one can show that the expected number of periods in which $\xi_t \le \xi^*$ is uniformly bounded from above. Therefore, player 1’s payoff is at least $v^*$ as $\delta \to 1$.

E Proof of Corollary 3
For every $t \in \mathbb{N}$, let $\nu_t \in \{0,1\}$ be a random variable such that

1. $\nu_t = 1$ if for every $a \in A$, when player 1 announces $a$ in period $t$, player 2 strictly prefers one of the actions in $\mathrm{BR}(a)$ to all actions that do not belong to $\mathrm{BR}(a)$,
2. $\nu_t = 0$ otherwise.

Let $\xi \in (0,1)$ be such that for every $a \in A$, all actions outside of $\mathrm{BR}(a)$ are strictly inferior when player 2 believes that player 1 plays $a$ with probability more than $\xi$. Let $\pi_t \in \Delta\{\gamma_h, \gamma_o\}$ be player 2’s belief in period $t$ after observing $\{y_0, \dots, y_{t-1}\}$ but before observing $m_t$, and let $\hat{\pi}_t \in \Delta\{\gamma_h, \gamma_o\}$ be player 2’s belief in period $t$ after observing $\{y_0, \dots, y_{t-1}\}$ and $m_t$. Let $l_t \equiv \log \frac{\pi_t(\gamma_h)}{\pi_t(\gamma_o)}$ and $\hat{l}_t \equiv \log \frac{\hat{\pi}_t(\gamma_h)}{\hat{\pi}_t(\gamma_o)}$. Since player 1 trembles with probability $\eta$, we have $\hat{l}_t - l_t \ge \log \eta$. As a result, there exists $l^* \in \mathbb{R}_+$ such that $l_t \ge l^*$ implies that $\nu_t = 1$.

Let $\Pr(\cdot | \sigma_h)$ be the probability under the honest type’s equilibrium strategy. Let $\Pr(\cdot | \sigma_o)$ be the probability under the opportunistic type’s equilibrium strategy. Let $\Pr(\cdot | \sigma_h^*)$ be the probability under the strategy of announcing the Stackelberg action in each state and keeping one’s word whenever it is feasible. Let $\rho \equiv 1 - \Pr(\omega_t = A)$. We have
\[
\Pr(m_t = a_t | \sigma_h) \in [1 - \rho, 1 - \rho\eta], \quad \Pr(m_t = a_t | \sigma_h^*) \in [1 - \rho, 1 - \rho\eta], \quad \text{and} \quad \Pr(m_t = a_t | \sigma_o) \le 1 - \rho\eta.
\]
In periods where $\nu_t = 0$, Markov’s inequality implies that $\Pr(m_t = a_t | \sigma_o) < 1 - \eta(1 - \xi)$. When player 1 plays according to $\sigma_h^*$, Bayes rule implies that
\[
E[l_{t+1} - l_t \,|\, l_t] = D\big(p_t(\sigma_h^*) \,\|\, p_t(\sigma_o)\big) - D\big(p_t(\sigma_h^*) \,\|\, p_t(\sigma_h)\big), \tag{E.1}
\]
where $D(\cdot \| \cdot)$ denotes the KL-divergence, and $p_t(\sigma)$ is the distribution over $y_t$ under strategy $\sigma$. When $\nu_t = 1$, we have
\[
D\big(p_t(\sigma_h^*) \,\|\, p_t(\sigma_o)\big) - D\big(p_t(\sigma_h^*) \,\|\, p_t(\sigma_h)\big) \ge -D(1 - \rho \,\|\, 1 - \eta\rho) \equiv -\beta. \tag{E.2}
\]
When $\nu_t = 0$, we have
\[
D\big(p_t(\sigma_h^*) \,\|\, p_t(\sigma_o)\big) - D\big(p_t(\sigma_h^*) \,\|\, p_t(\sigma_h)\big) \ge D\big(1 - \rho \,\|\, 1 - \eta(1 - \xi)\big) - D(1 - \rho \,\|\, 1 - \eta\rho) \equiv \alpha, \tag{E.3}
\]
where $D(x \| x')$ denotes the KL-divergence between a distribution that attaches probability $x$ to $y_t = 1$, and one that attaches probability $x'$ to $y_t = 1$. When $\rho$ is small enough relative to $\eta$, the RHS of (E.3) is strictly positive, and moreover, for any fixed $\eta > 0$, $\frac{\alpha}{\beta} \to +\infty$ as $\rho \to 0$.

We next bound $\sum_{t=0}^{\infty} (1 - \delta) \delta^t \nu_t$ from below when $\delta$ is close enough to 1. Recall that $y_t = \mathbf{1}\{a_t = m_t\}$. Let $Z_t$ be a random variable such that $Z_t = \log \frac{\Pr(y_t | \sigma_h)}{\Pr(y_t | \sigma_o)}$ with probability $\Pr(y_t | \sigma_h^*)$ for every $y_t \in Y$. Our analysis above implies that when $\nu_t = 1$, we have $E[Z_t | \sigma_h^*] \ge -\beta$, and when $\nu_t = 0$, we have $E[Z_t | \sigma_h^*] \ge \alpha$.

Claim 1.
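For every $\varepsilon > 0$, there exists $T \in \mathbb{N}$ such that for every $t \ge T$, $E\big[\sum_{s=0}^{t-1} \nu_s \,\big|\, \sigma_h^*\big] \ge t\big(\frac{\alpha}{\alpha + \beta} - \varepsilon\big)$ with probability more than $1 - \varepsilon$.

Our proof uses the Azuma-Hoeffding inequality:

Lemma 2.
Let $\{Z_0, Z_1, \dots\}$ be a martingale such that $|Z_k - Z_{k-1}| \le c_k$. For every $N \in \mathbb{N}$ and $\varepsilon > 0$, we have
\[
\Pr[Z_N - Z_0 \ge \varepsilon] \le \exp\Big( \frac{-\varepsilon^2}{2 \sum_{k=1}^{N} c_k^2} \Big).
\]

Proof of Claim 1: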
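Construct a martingale process $\{\tilde{l}_t\}_{t \in \mathbb{N}}$ recursively. Let $\tilde{l}_0 \equiv l_0$, and for every $t \in \mathbb{N}$, let $\tilde{l}_{t+1} \equiv \tilde{l}_t + Z_t - E[Z_t | \tilde{l}_t]$. Suppose $\frac{1}{t} E\big[\sum_{s=0}^{t-1} \nu_s \,\big|\, \sigma_h^*\big] \le \frac{\alpha}{\alpha + \beta} - \varepsilon$, then
\[
\sum_{s=0}^{t-1} E[Z_s | \sigma_h^*] \ge t \varepsilon (\alpha + \beta). \tag{E.4}
\]
Therefore,
\[
\Pr(l_t \le l^* | \sigma_h^*) = \Pr\Big( \tilde{l}_t - \tilde{l}_0 \le l^* - l_0 - t \varepsilon (\alpha + \beta) \,\Big|\, \sigma_h^* \Big) \le \exp\Big( -\frac{\big(l^* - l_0 - t \varepsilon (\alpha + \beta)\big)^2}{2 t C^2} \Big), \tag{E.5}
\]
where $C$ is the difference between the largest realization of $Z_t$ and the smallest realization of $Z_t$. The RHS of (E.5) vanishes exponentially as $t \to +\infty$. Since $\nu_t = 1$ when $l_t \ge l^*$, we know that for every $\varepsilon_1 > 0$, there exists $T_1 \in \mathbb{N}$, such that for every $t \ge T_1$, $E\big[\sum_{s=0}^{t-1} \nu_s \,\big|\, \sigma_h^*\big] \le t\big(\frac{\alpha}{\alpha + \beta} - \varepsilon_1\big)$ implies that $\nu_t = 1$ with probability at least $1 - \varepsilon_1$, and by setting $\varepsilon_1 < \frac{\beta}{\alpha + \beta}$ we obtain in this case that $E\big[\sum_{s=0}^{t} \nu_s \,\big|\, \sigma_h^*\big] - E\big[\sum_{s=0}^{t-1} \nu_s \,\big|\, \sigma_h^*\big] \ge 1 - \varepsilon_1 > \frac{\alpha}{\alpha + \beta}$. Then for every $t > T_1$, one can show by induction that $E\big[\sum_{s=0}^{t-1} \nu_s \,\big|\, \sigma_h^*\big] \ge (t - T_1 - 1)\big(\frac{\alpha}{\alpha + \beta} - \varepsilon_1\big)$, and the claim follows by choosing any $\varepsilon > \varepsilon_1 > 0$, and $T \ge T_1 + 1 + \frac{\alpha}{(\varepsilon - \varepsilon_1)(\alpha + \beta)}$.

We use the following well-known equation:
\[
E\Big[ \sum_{t=0}^{\infty} (1 - \delta) \delta^t \nu_t \,\Big|\, \sigma_h^* \Big] = (1 - \delta)^2 \sum_{t=0}^{+\infty} \delta^t \underbrace{\sum_{s=0}^{t} E[\nu_s | \sigma_h^*]}_{\ge (\frac{\alpha}{\alpha + \beta} - \varepsilon)(t+1) \text{ with probability close to } 1}. \tag{E.6}
\]
Claim 1 and (E.6) imply that for every $\hat{\varepsilon} > 0$, there exists $\bar{\delta} \in (0,1)$, such that for every $\delta \in (\bar{\delta}, 1)$, we have
\[
E\Big[ \sum_{t=0}^{\infty} (1 - \delta) \delta^t \nu_t \,\Big|\, \sigma_h^* \Big] \ge \frac{\alpha}{\alpha + \beta} - \hat{\varepsilon}. \tag{E.7}
\]
Recall that $v^*$ is player 1’s expected pure Stackelberg payoff. Without loss of generality, we normalize player 1’s worst stage-game payoff to 0. If $\nu_t = 1$, then the honest type’s expected payoff from announcing their Stackelberg action in every state is at least $(1 - \rho - \eta) v^*$. If $\nu_t = 0$, then the honest type’s expected payoff is at least 0. If we pick $\delta \in (0,1)$ such that $E\big[\sum_{t=0}^{\infty} (1 - \delta) \delta^t \nu_t\big] \ge \frac{\alpha}{\alpha + \beta} - \hat{\varepsilon}$, and pick $\rho$ small enough such that $\frac{\alpha}{\alpha + \beta}$ is greater than $1 - \hat{\varepsilon}$, then the honest type’s payoff is at least $(1 - 2\hat{\varepsilon})(1 - \eta - \rho) v^*$. There exists $c \in (0,1)$ such that the above expression is greater than $v^* - \varepsilon$ when $\eta$ and $\hat{\varepsilon}$ are small enough, and $\rho \le c\eta$.

References

Alp Atakan and Mehmet Ekmekci. Reputation in the long-run with imperfect monitoring.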
Journal of Economic Theory, 157:553–605, 2015.
Yu Awaya and Vijay Krishna. On communication and collusion. American Economic Review, 106(2):285–315, 2016.
Robert Barro. Reputation in a model of monetary policy with incomplete information. Journal of Monetary Economics, 17(1):3–20, 1986.
Marco Celentani, Drew Fudenberg, David Levine, and Wolfgang Pesendorfer. Maintaining a reputation against a long-lived opponent. Econometrica, 64(3):691–704, 1996.
Gary Charness and Martin Dufwenberg. Promises and partnership. Econometrica, 74(6):1579–1601, 2006.
Ying Chen. Perturbed communication games with honest senders and naive receivers. Journal of Economic Theory, 146(2):401–424, 2011.
Ying Chen, Navin Kartik, and Joel Sobel. Selecting cheap-talk equilibria. Econometrica, 76(1):117–136, 2008.
Drew Fudenberg and David Levine. Subgame-perfect equilibria of finite and infinite horizon games. Journal of Economic Theory, 31(2):251–268, 1983.
Drew Fudenberg and David Levine. Reputation and equilibrium selection in games with a patient player. Econometrica, 57(4):759–778, 1989.
Drew Fudenberg and David Levine. Maintaining a reputation when strategies are imperfectly observed. Review of Economic Studies, 59(3):561–579, 1992.
Uri Gneezy. Deception: The role of consequences. American Economic Review, 95(1):384–394, 2005.
Uri Gneezy, Agne Kajackaite, and Joel Sobel. Lying aversion and the size of the lie. American Economic Review, 108(2):419–453, 2018.
Olivier Gossner. Simple bounds on the value of a reputation. Econometrica, 79(5):1627–1641, 2011.
Bruno Jullien and In-Uck Park. New, like new, or very good? Reputation and credibility. Review of Economic Studies, 81(4):1543–1574, 2014.
Bruno Jullien and In-Uck Park. Communication, feedbacks and repeated moral hazard with short-lived buyers. Working Paper, 2020.
Navin Kartik. Strategic communication with lying costs. Review of Economic Studies, 76(4):1359–1395, 2009.
Navin Kartik, Marco Ottaviani, and Francesco Squintani. Credulity, lies, and costly talk. Journal of Economic Theory, 134(1):93–116, 2007.
David Kreps and Robert Wilson. Reputation and imperfect information. Journal of Economic Theory, 27(2):253–279, 1982.
Qingmin Liu and Andrzej Skrzypacz. Limited records and reputation bubbles. Journal of Economic Theory, 151:2–29, 2014.
Paul Milgrom and John Roberts. Predation, reputation, and entry deterrence. Journal of Economic Theory, 27(2):280–312, 1982.
Harry Pei. Trust and betrayals: Reputational payoffs and behaviors without commitment. Theoretical Economics, forthcoming, 2020.
Christopher Phelan. Public trust and government betrayal. Journal of Economic Theory, 130(1):27–43, 2006.
Klaus Schmidt. Commitment through incomplete information in a simple repeated bargaining game. Journal of Economic Theory, 60:114–139, 1993.
Joel Sobel. A note on pre-play communication. Games and Economic Behavior, 102:477–486, 2017.
Takuo Sugaya and Alexander Wolitzky. Communication and community enforcement. Working Paper, 2020.
Joel Watson. A “reputation” refinement without equilibrium.