Ambiguous Persuasion: An Ex-ante Perspective

Abstract

In a persuasion environment where both players are ambiguity averse, Beauchêne, Li and Li (2019) show that the sender can make strictly more profit by sending ambiguous signals if the receiver cares only about his interim payoff. In the presence of ambiguity, however, the receiver may not be dynamically consistent. This paper studies ambiguous persuasion when the receiver's goal is to maximize his ex-ante payoff. First, if the receiver is dynamically consistent, I show that the sender cannot make any more profit than under Bayesian persuasion. In other words, ambiguity plays a role in persuasion only through inducing dynamically inconsistent behaviors. On the other hand, suppose the receiver is dynamically inconsistent and is able to adjust the information structure by ignoring undesirable messages. Then two seemingly undesirable features of ambiguous persuasion, synonyms and dilation, are not always undesirable to the receiver. In fact, they are always undesirable if and only if the payoff-relevant states are binary. Nonetheless, I show that the optimal value of ambiguous persuasion in the interim setting (Beauchêne, Li and Li, 2019) cannot be achieved in the current setting.


Xiaoyu Cheng†

October 13, 2020


JEL: C72, D81, D83

Keywords: Bayesian persuasion, ambiguity aversion, dynamic consistency, consistent planning

* I am grateful to Peter Klibanoff and Marciano Siniscalchi for invaluable guidance and discussions throughout the completion of this paper. I thank Modibo Camara, Theo Durandard, Andres Espitia, Henrique Brasiliense de Castro Pires and Udayan Vaidya for comments. All remaining errors are my own.

† Department of Managerial Economics and Decision Sciences, Kellogg School of Management, Northwestern University, Evanston, IL, USA. E-mail: [email protected]

Introduction

Information provision has proved instrumental in influencing behaviors in light of the development of the Bayesian persuasion literature (Kamenica, 2019). More specifically, by controlling the information flow, a sender (she) can induce a receiver (he) to take her preferred actions even when their preferences are not aligned. While Bayesian persuasion restricts the sender to distributional information, in reality, non-distributional or ambiguous information provision is also commonly observed. In an environment without ex-ante ambiguity, where the receiver is averse to ambiguity in the information provided, this paper examines to what extent the sender is able to take advantage of such an aversion.

The problem of ambiguous persuasion, where the sender can provide ambiguous information, has been studied in Beauchêne et al. (2019) (BLL henceforth) in an interim setting. In the presence of ambiguity, the receiver's updating of beliefs to incorporate new information may result in dynamically inconsistent behaviors. In other words, once the sender fixes the information structure, the receiver can form an optimal ex-ante contingent plan of actions given each possible realization of information. However, at the interim stage where the information realizes, the receiver may instead prefer another action over the planned action.

BLL circumvent this inconsistency by looking only at the receiver's interim behaviors. They show that the sender can make strictly more profit than under Bayesian persuasion. Yet from the ex-ante perspective of the receiver, his interim choices are not going to be optimal. Thus in their interim setting, the sender's additional profit from providing ambiguous information leverages not only the receiver's aversion to ambiguity, but also the dynamic inconsistency that follows from it.

This paper first answers the question of how much dynamic inconsistency contributes to the sender's additional profit.
This can be achieved by assuming either that the receiver is able to commit to his ex-ante contingent plan or that he applies the dynamically consistent updating rule characterized by Hanany and Klibanoff (2007, 2009). In the latter treatment, the receiver's updating rule is tailored to make sure that his conditional preferences agree with his ex-ante optimal plan. Under either assumption, Theorem 2.1 asserts that the sender cannot achieve anything more than Bayesian persuasion if she has the same ambiguity attitude as the receiver.

This theorem applies to the sender and receiver in BLL. Therefore, it implies that all of the additional profit characterized in their paper is due to the receiver's dynamic inconsistency under ambiguity.

This no-gain result does depend on the sender's ambiguity attitude: if the sender is ambiguity seeking, then introducing ambiguity alone may already create positive value for her. To elaborate, Section 2.3 provides an example where the sender is always able to extract an "ambiguity premium" from the receiver if she is strictly less ambiguity averse.

In light of Theorem 2.1, assuming dynamically consistent behavior from the receiver renders ambiguity meaningless in information provision. Yet this leaves open the question of why ambiguous information can still be observed in reality. One possible answer is that the receiver may well be dynamically inconsistent, probably because he lacks the ability to commit and chooses not to follow the dynamically consistent updating rule. After all, dynamic inconsis-

For example, Guo and Shmaya (2018) observe that Redfin defines a "hot home" as one that has a 70 percent chance or higher of having an accepted offer within two weeks of its debut. Also, in the well-known Monty Hall problem, the host's additional information is actually ambiguous absent assumptions on his choice of the door without a prize.

Illustrating Example.

Let $\Omega = \{\omega_1, \omega_2\}$ be the possible states of the world. The sender and receiver have a common uniform prior belief over the states. The receiver has two feasible actions $a_1$ and $a_2$, where the sender strictly prefers $a_2$. On the other hand, the receiver's payoff is given by the following utility function:

$$u_R(a_1, \omega_1) = 0; \quad u_R(a_1, \omega_2) = 4$$
$$u_R(a_2, \omega_1) = 1; \quad u_R(a_2, \omega_2) = 1$$

Notice that ex-ante, the receiver strictly prefers $a_1$ to $a_2$: under the uniform prior, $a_1$ yields an expected payoff of 2 and $a_2$ yields 1. According to BLL, the sender can design an ambiguous device including the following two signaling devices. Let $m_1$ and $m_2$ be the messages,

$$\pi_1(m_1 | \omega_1) = 1; \quad \pi_1(m_2 | \omega_2) = 1$$
$$\pi_2(m_2 | \omega_1) = 1; \quad \pi_2(m_1 | \omega_2) = 1$$

Namely, both signaling devices reveal the states but with opposite signals. In the case where $m_1$ realizes, let $q_1(\omega_1 | m_1) = 1$ and $q_2(\omega_1 | m_1) = 0$ be the posteriors given these two devices respectively. If the receiver believes that both posteriors are plausible and applies the maxmin criterion to evaluate his actions, then he will strictly prefer action $a_2$ to $a_1$, since the worst-case payoff of $a_1$ is 0 whereas $a_2$ guarantees 1. The same applies to the other message. As a result, the sender achieves her best possible payoff from using this ambiguous device. Moreover, this payoff cannot be obtained with Bayesian persuasion.

Notice that the receiver strictly prefers $a_1$ to $a_2$ at the ex-ante stage, yet takes the action $a_2$ at both contingencies. Clearly, if the receiver cares about his ex-ante payoff, then there should always be an option for him to ignore the messages and stick with his prior information when deciding. Such an option is even more reasonable in the illustrating example as the messages are neither informative nor accurate. In fact, the messages here showcase two important features of the optimal ambiguous persuasion in BLL.

First, they define synonyms as messages inducing the same action and show that they are necessary for optimal persuasion. The messages here are synonyms as they both lead to the action $a_2$. Second, they use dilation to motivate the use of ambiguity in their motivating example.
Here, dilation can be understood as the receiver's conditional belief containing the prior at every possible message. Since both messages in the illustrating example induce the receiver to form a fully

An ambiguous device is a set of signaling devices such that it is unknown which one of them generates the realized signal.

The first part is true if the receiver applies Relative Maximum Likelihood updating (Cheng, 2019), which includes Full Bayesian and Maximum Likelihood as special cases. The second part assumes that the receiver's preference under ambiguity admits the Maxmin Expected Utility (MEU) representation (Gilboa and Schmeidler, 1989).

garble the information structure by pooling messages together. He can regard a set of messages as a single message, and whenever one of the messages realizes he only learns the event specified by the set. In the illustrating example, the receiver could treat the set $\{m_1, m_2\}$ as a single message, which is equivalent to ignoring the messages as desired.

Proposition 4.2 and Proposition 4.6 together confirm that such an intuition applies if and only if the states are binary. When there are only two states, it is indeed the case that the receiver always prefers to pool the synonyms and dilating messages. However, this may not be true beyond binary states. In Section 4.3, I provide an example where the receiver finds it optimal not to pool messages that are both synonyms and dilating. In other words, the receiver's interim preference might actually be better aligned with his ex-ante preference under the synonyms and dilating messages. An intuition which is absent with binary states is that pooling those messages might exaggerate the importance of some states which are not so important under the ex-ante preference.

The remainder of this paper is organized as follows. Section 2 studies the ambiguous persuasion problem when the receiver is dynamically consistent.
Section 3 studies the ambiguous persuasion problem when the receiver is dynamically inconsistent but can garble the information structure to maximize his ex-ante payoff. Then, Section 4 focuses on ambiguous devices featuring synonyms and dilation. Section 5 discusses the related literature. Finally, Section 6 concludes. All the omitted proofs are collected in the appendix.

Throughout this paper, I consider the standard persuasion environment where a sender commits to an information structure to induce a receiver to take actions.

This section first considers the case where the receiver is dynamically consistent under ambiguity. To simplify exposition, I will proceed in the language of the receiver committing to his favorite course of actions at the ex-ante stage. This is equivalent to assuming that the receiver's updating rule is dynamically consistent in the way characterized in Hanany and Klibanoff (2007, 2009), without such a commitment assumption.

Under ex-ante commitment, the receiver's interim behaviors become irrelevant. Thus, this section will focus solely on the ex-ante stage.

Let $\Omega$ be a finite set of states of the world. For any set $X$, let $\Delta(X)$ be the space of probability distributions over $X$ endowed with the weak topology. The sender and receiver have a common prior $p \in \Delta(\Omega)$ with full support.

The receiver's feasible actions are given by a compact set $A$. The sender's and receiver's payoffs depend on the receiver's action and the state, and are given by functions $u_S$ and $u_R$, respectively. Namely, $u_S(a, \omega)$ and $u_R(a, \omega)$ are the payoffs of the players when the receiver's action is $a \in A$ and the realized state is $\omega \in \Omega$.

Fix a finite message space $M$. The sender can commit to a probabilistic device $\pi$, which is a mapping from $\Omega$ to probability distributions over $M$. Let $\pi(m | \omega)$ denote the probability of sending message $m$ in state $\omega$ under the device $\pi$. Notice that the device $\pi$ gives rise to a joint prior $p_\pi \in \Delta(\Omega \times M)$:

$$p_\pi(\omega \times m) = \pi(m | \omega) p(\omega)$$

On the other hand, let $\tau_\pi(m)$ denote the marginal probability of sending message $m$ under device $\pi$:

$$\tau_\pi(m) \equiv \sum_{\omega} \pi(m | \omega) p(\omega)$$

and let $q^\pi_m$ denote the posterior from Bayesian updating upon observing message $m$:

$$q^\pi_m(\omega) = \frac{\pi(m | \omega) p(\omega)}{\sum_{\omega'} \pi(m | \omega') p(\omega')}$$

Notice that the joint prior can also be written as

$$p_\pi(\omega \times m) = \tau_\pi(m) q^\pi_m(\omega)$$

The receiver's goal is to maximize his ex-ante payoff. An ex-ante contingent plan specifies a probability distribution over actions given each possible message. Formally, let the function $f : M \to \Delta(A)$ denote a generic contingent plan such that $f(m)(a)$ is the probability of action $a \in A$ when message $m \in M$ realizes. Let $\mathcal{F}$ denote the set of all contingent plans. Given a probabilistic device $\pi$, the receiver's ex-ante payoff from a contingent plan $f$ is given by

$$V_R(\pi, f) = \sum_{m, a, \omega} f(m)(a) u_R(a, \omega) p_\pi(\omega \times m)$$

Alternatively, the sender may choose to commit to an ambiguous device. As defined in BLL, an ambiguous device $\Pi$ is a closed and convex set of probabilistic devices with common support. Given such a device, it is unknown to both the sender and receiver which probabilistic device is going to be used to generate the messages. As a result, the ex-ante payoffs of both players after committing to an ambiguous device will be evaluated in the presence of ambiguity.

Common support means that for any $\pi, \pi' \in \Pi$, $\tau_\pi(m) > 0$ if and only if $\tau_{\pi'}(m) > 0$.
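The updating formulas above can be sketched in a few lines of Python. This is only an illustration: the two devices `pi_a` and `pi_b`, the function names, and the encoding `pi[w][m]` for $\pi(m | \omega)$ are assumptions made here, not objects from the paper. The sketch computes $\tau_\pi(m)$ and $q^\pi_m$, and checks the common-support condition for a pair of devices.

```python
from fractions import Fraction as F

def marginal(pi, prior, m):
    """tau_pi(m): probability of message m, summing pi(m|w) p(w) over states."""
    return sum(pi[w][m] * prior[w] for w in range(len(prior)))

def posterior(pi, prior, m):
    """q^pi_m: Bayesian posterior over states after observing message m."""
    tau = marginal(pi, prior, m)
    return [pi[w][m] * prior[w] / tau for w in range(len(prior))]

def same_support(devices, prior):
    """Common support: a message has positive probability under one device
    if and only if it has positive probability under every other device."""
    n_messages = len(devices[0][0])
    return all(
        len({marginal(pi, prior, m) > 0 for pi in devices}) == 1
        for m in range(n_messages)
    )

# Hypothetical numbers: two states, two messages, uniform prior.
prior = [F(1, 2), F(1, 2)]
pi_a = [[F(3, 4), F(1, 4)], [F(1, 4), F(3, 4)]]  # pi_a[w][m] = pi(m | w)
pi_b = [[F(1, 2), F(1, 2)], [F(1, 4), F(3, 4)]]

q = posterior(pi_a, prior, 0)  # posterior after the first message under pi_a
```

Exact fractions are used so that the identity $p_\pi(\omega \times m) = \tau_\pi(m) q^\pi_m(\omega)$ can be checked without rounding error.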

As each probabilistic device induces a joint prior $p_\pi$, an ambiguous device $\Pi$ will therefore induce a set of joint priors denoted by $C_\Pi$. Formally,

$$C_\Pi \equiv \{ p_\pi \in \Delta(\Omega \times M) : \pi \in \Pi \}$$

which is a closed and convex subset of $\Delta(\Omega \times M)$. Throughout this paper, both the sender and receiver are assumed to take the set $C_\Pi$ as their perceived ambiguity for ex-ante preferences.

Moreover, both players are assumed to be ambiguity averse with the same ambiguity attitude, which crucially drives the main result here. For ease of exposition, I will assume that their preferences admit the Maxmin Expected Utility (MEU) representation (Gilboa and Schmeidler, 1989). The result in this section holds as long as the players' preferences are uncertainty averse (Cerreia-Vioglio et al., 2011).

Under MEU, given an ambiguous device $\Pi$, the receiver's ex-ante payoff from choosing the contingent plan $f$ is given by

$$V_R(\Pi, f) = \min_{p_\pi \in C_\Pi} \sum_{m, a, \omega} f(m)(a) u_R(a, \omega) p_\pi(\omega \times m)$$

Similarly, the sender's ex-ante payoff from ambiguous device $\Pi$ when the receiver chooses contingent plan $f$ is given by

$$V_S(\Pi, f) = \min_{p_\pi \in C_\Pi} \sum_{m, a, \omega} f(m)(a) u_S(a, \omega) p_\pi(\omega \times m)$$

Once the sender commits to some ambiguous device $\Pi$, the receiver can commit to the contingent plan which solves the following program:

$$\max_{f \in \mathcal{F}} \min_{p_\pi \in C_\Pi} \sum_{m, a, \omega} f(m)(a) u_R(a, \omega) p_\pi(\omega \times m)$$

Since the objective function is continuous and linear in $f$ and $p_\pi$, and $\mathcal{F}$ and $C_\Pi$ are compact convex subsets of linear topological spaces, one can apply Sion's minimax theorem (Sion, 1958) to get

$$\max_{f \in \mathcal{F}} \min_{p_\pi \in C_\Pi} \sum_{m, a, \omega} f(m)(a) u_R(a, \omega) p_\pi(\omega \times m) = \min_{p_\pi \in C_\Pi} \max_{f \in \mathcal{F}} \sum_{m, a, \omega} f(m)(a) u_R(a, \omega) p_\pi(\omega \times m)$$

It further implies that there exists a saddle point $(f^*, p^*_\pi)$ which solves the program, and $f^*$ is a best response to $p^*_\pi$. (As usual, if there are multiple optimal contingent plans, ties are broken in favor of the sender's payoff.)

Notice that the sender's payoff from using this ambiguous device $\Pi$ is

$$V_S(\Pi, f^*) = \min_{p_\pi \in C_\Pi} \sum_{m, a, \omega} f^*(m)(a) u_S(a, \omega) p_\pi(\omega \times m)$$

which is weakly lower than the payoff using the probabilistic device $\pi^*$ generating $p^*_\pi$, since $p^*_\pi \in C_\Pi$ and it also induces $f^*$. Therefore, the sender's payoff from an ambiguous device is always weakly dominated by that from a probabilistic device. This observation leads to the following theorem:

Theorem 2.1. If the receiver is dynamically consistent, then the value of ambiguous persuasion coincides with the value of Bayesian persuasion.

Remark

With a similar argument, the proof of this theorem in the case where both players are uncertainty averse can be found in Appendix A.

For using ambiguous messages in persuasion, Theorem 2.1 conveys a very strong message: if ambiguity is introduced only to affect the evaluation of actions but not to induce dynamically inconsistent behaviors, then there is no value in introducing it. In particular, any contingent plan induced by an ambiguous device can always be induced by some probabilistic device, as implied by the minimax theorem. Therefore, without dynamic inconsistency, the sender cannot achieve by introducing ambiguity what is impossible with probabilistic devices.

On the other hand, the fact that the sender who introduces ambiguity is also averse to it is undoubtedly a crucial driving force for this no-gain result. Intuitively, if the sender is ambiguity seeking, then introducing ambiguity alone already creates positive value for such a sender. Thus, if the sender has a different attitude towards ambiguity from the receiver, then there might be an "ambiguity premium" the sender is able to extract from the receiver. An example illustrating this idea will be provided in the next subsection. Focusing on the case where both players are ambiguity averse with the same attitude therefore allows one to investigate the value of ambiguity other than such a premium. Evidently, Theorem 2.1 suggests that under dynamic consistency, there is no value other than that premium from introducing ambiguity.

Lastly, the final argument leading to Theorem 2.1 also depends on the fact that the sender has the same perception of ambiguity as the receiver. This should be a reasonable assumption since the sender should not have any control over which probabilistic device is going to be used. Otherwise, the receiver may be able to infer the actual probabilistic device from the sender's preferences. On the other hand, it is not difficult to see that such an assumption can be slightly relaxed to the case where the sender's perceived ambiguity (set of priors) is a superset of the receiver's.

Only in this subsection, the sender and receiver are allowed to have different ambiguity attitudes. More specifically, the receiver's preference still admits the MEU representation, whereas the sender's preference now admits the $\alpha$-MEU representation for some $\alpha \in [0, 1]$.

Formally, the sender's ex-ante payoff from ambiguous device $\Pi$ when the receiver chooses contingent plan $f$ is given by

$$V^\alpha_S(\Pi, f) = \alpha \min_{p_\pi \in C_\Pi} \sum_{m, a, \omega} f(m)(a) u_S(a, \omega) p_\pi(\omega \times m) + (1 - \alpha) \max_{p_\pi \in C_\Pi} \sum_{m, a, \omega} f(m)(a) u_S(a, \omega) p_\pi(\omega \times m)$$

Given that both players have the same perception of ambiguity, the sender is strictly less ambiguity averse than the receiver if and only if $\alpha < 1$ (Ghirardato et al., 2004).

Notice that Theorem 2.1 applies only when $\alpha = 1$. In the following, I present an example where the sender can always make strictly more profit than under Bayesian persuasion if $\alpha < 1$. In other words, if the sender is strictly less ambiguity averse than the receiver, introducing ambiguity can still be profitable under dynamic consistency. Moreover, such profit will be in the form of an ambiguity premium.

Example 1. Let $\Omega = \{\omega_1, \omega_2\}$. Both players have a common uniform prior over the states, denoted by $p = Pr(\omega_2) = 1/2$. The receiver has three feasible actions $\{a_1, a_2, a_3\}$ and the sender's payoff depends only on the action: $u_S(a_1) > u_S(a_2) > u_S(a_3)$.

The receiver's payoff is given by the following utility function:

$$u_R(a_1, \omega_1) = -2; \quad u_R(a_1, \omega_2) = 2$$
$$u_R(a_2, \omega_1) = -\tfrac{1}{2}; \quad u_R(a_2, \omega_2) = \tfrac{3}{2}$$
$$u_R(a_3, \omega_1) = 0; \quad u_R(a_3, \omega_2) = 0$$

Suppose $M = \{m_1, m_2\}$ and let $f(m_1) f(m_2)$ denote the contingent plan $f$.

It is easy to see that the optimal probabilistic device $\pi^*$ generates posteriors $1/4$ and $3/4$ with equal marginal probabilities, and the receiver takes actions $a_2$ and $a_1$ at the two posteriors respectively. Given this device, the sender's ex-ante payoff is

$$V_S(\pi^*, a_2 a_1) = \tfrac{1}{2} u_S(a_2) + \tfrac{1}{2} u_S(a_1)$$

Consider a probabilistic device $\pi_\epsilon$ which generates posteriors $1/4 - \epsilon$ and $3/4$ for some $\epsilon > 0$. Because of Bayes plausibility, the first posterior is generated with the marginal probability $1/(2 + 4\epsilon) < 1/2$. As a result, fixing the receiver's contingent plan, the sender's ex-ante payoff is strictly higher under $\pi_\epsilon$ compared with $\pi^*$:

$$V_S(\pi_\epsilon, a_2 a_1) > V_S(\pi^*, a_2 a_1)$$

Let an ambiguous device $\Pi$ be the closed convex hull of $\{\pi_\epsilon, \pi^*\}$ and notice that

$$a_2 a_1 \in \arg\max_{f \in \mathcal{F}} V_R(\Pi, f)$$

Thus, as the receiver takes the sender-preferred contingent plan, $a_2 a_1$ is also induced under this ambiguous device. Finally, the sender's payoff from this ambiguous device is given by

$$V^\alpha_S(\Pi, a_2 a_1) = \alpha V_S(\pi^*, a_2 a_1) + (1 - \alpha) V_S(\pi_\epsilon, a_2 a_1)$$

which is strictly greater than $V_S(\pi^*, a_2 a_1)$ if and only if $\alpha < 1$.

In Example 1, the sender makes additional profit by making sure the receiver is indifferent between a probabilistic device and an ambiguous device. Then, because she is less ambiguity averse, she finds herself better off under the ambiguous device. This is exactly the ambiguity premium the sender is able to extract from the receiver. Notice that the additional profit does not come from the fact that ambiguity changes the receiver's behavior; it is generated merely from the difference in their ambiguity attitudes. This idea is in the same vein as a player making profit if she is less risk averse or more patient than the other players.

Without the assumption that the receiver takes the sender-preferred contingent plan, the same conclusion can still be achieved by including $\pi_{-\epsilon'}$ in $\Pi$.
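The logic of Example 1 can be checked numerically with a short Python sketch. Several ingredients are assumptions made for illustration only: beliefs are encoded as the probability of $\omega_2$, the receiver's payoff for $a_1$ in $\omega_1$ is taken to be $-2$ (the value that makes him exactly indifferent across the two devices), the sender's payoffs are hypothetical numbers respecting $u_S(a_1) > u_S(a_2) > u_S(a_3)$, and $\epsilon = 1/10$. Because payoffs are linear in the device, the minimum and maximum over the convex hull are evaluated at its two extreme devices.

```python
from fractions import Fraction as F
from itertools import product

# Receiver payoffs as (payoff in w1, payoff in w2); the -2 entry is an assumption.
U_R = {"a1": (F(-2), F(2)), "a2": (F(-1, 2), F(3, 2)), "a3": (F(0), F(0))}
# Hypothetical sender payoffs with u_S(a1) > u_S(a2) > u_S(a3).
U_S = {"a1": F(2), "a2": F(1), "a3": F(0)}
PRIOR = F(1, 2)  # prior probability of w2

def device(q_low, q_high):
    """Device as [(marginal, posterior Pr(w2))], pinned down by Bayes plausibility."""
    t = (q_high - PRIOR) / (q_high - q_low)
    return [(t, q_low), (1 - t, q_high)]

def receiver_value(dev, plan):
    """Receiver's ex-ante payoff from a pure contingent plan under one device."""
    return sum(t * ((1 - q) * U_R[a][0] + q * U_R[a][1])
               for (t, q), a in zip(dev, plan))

def sender_value(dev, plan):
    """Sender's ex-ante payoff from the same plan under one device."""
    return sum(t * U_S[a] for (t, _), a in zip(dev, plan))

eps = F(1, 10)
pi_star = device(F(1, 4), F(3, 4))
pi_eps = device(F(1, 4) - eps, F(3, 4))
plan = ("a2", "a1")

# Best the receiver can do under pi_star alone: an upper bound on his maxmin value.
best_under_star = max(receiver_value(pi_star, f) for f in product(U_R, repeat=2))
# Maxmin value of the plan a2a1 over the two extreme devices of Pi.
plan_maxmin = min(receiver_value(d, plan) for d in (pi_star, pi_eps))

def sender_alpha_meu(alpha):
    """alpha-MEU payoff of the sender from Pi = co{pi_eps, pi_star}."""
    values = [sender_value(d, plan) for d in (pi_star, pi_eps)]
    return alpha * min(values) + (1 - alpha) * max(values)

print(plan_maxmin == best_under_star)                      # plan a2a1 is maxmin optimal
print(sender_alpha_meu(F(1, 2)) > sender_value(pi_star, plan))  # premium when alpha < 1
```

Under these assumed numbers the plan $a_2 a_1$ attains the upper bound, so it is optimal for the receiver under $\Pi$, while the sender's $\alpha$-MEU payoff exceeds her Bayesian payoff only when $\alpha < 1$.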
Confronting Dynamic Inconsistency

In the presence of ambiguity, being dynamically consistent could be a demanding assumption. Not only does dynamic inconsistency arise naturally when one's preference deviates from subjective expected utility, but in examples such as the dynamic Ellsberg urn problem, dynamically inconsistent behaviors also have reasonable justifications. Therefore, knowing that the receiver can be dynamically inconsistent, the sender will be willing to introduce ambiguity to create a conflict between the receiver's ex-ante and interim preferences.

Indeed, in light of Theorem 2.1, BLL characterize exactly the value the sender is able to gain by inducing dynamic inconsistency. As for the receiver, his interim choices are in fact optimal from his interim perspective. Thus, without ex-ante commitment, he cannot force himself to take actions that are not optimal at the interim stage. Nonetheless, the illustrating example indicates that the receiver can still adjust the information structure by ignoring messages to better align his interim preferences with his ex-ante preference. This is exactly the consistent planning approach (Strotz, 1955; Siniscalchi, 2011) to confronting the problem of dynamic inconsistency.

Suppose the sender commits to the ambiguous device $\Pi$ with support $supp(\Pi) \subseteq M$. According to any probabilistic device $\pi \in \Pi$, if message $m \in supp(\Pi)$ realizes, the receiver's belief will be updated to the posterior $q^\pi_m$. The set of posteriors for all the probabilistic devices in $\Pi$ is called a probability-possibility set, denoted by $Q^\Pi_m$. Namely, $Q^\Pi_m$ is the set of probability distributions the receiver finds possible given the ambiguous device $\Pi$ and realized message $m$.

As a benchmark, I will assume the receiver follows Full Bayesian updating characterized by Pires (2002). In this case, the receiver's belief upon observing message $m$ is exactly given by $Q^\Pi_m$. Moreover, the receiver's interim (conditional) preference is again given by MEU. Notice that the receiver's behavior can be dynamically inconsistent with such preferences.

Once the message $m$ realizes, the receiver will be tempted to maximize his interim payoff. Let $\bar{f}(m)$ denote the set of the receiver's interim optimal strategies (probability distributions over actions). Formally,

$$\bar{f}(m) = \arg\max_{f(m) \in \Delta(A)} \min_{q \in Q^\Pi_m} \sum_{a, \omega} f(m)(a) u_R(a, \omega) q(\omega)$$

Conventionally, assume that the receiver always chooses the optimal strategy maximizing the sender's ex-ante payoff, which is unique, and denote such a strategy by $\hat{f}(m)$. As a result, the receiver's ex-ante payoff from those strategies is given by

$$V_R(\Pi, \hat{f}) = \min_{p_\pi \in C_\Pi} \sum_{m, a, \omega} \hat{f}(m)(a) u_R(a, \omega) p_\pi(\omega \times m)$$

Notice that this is the ex-ante payoff of the receiver in BLL's setting. Clearly, this is weakly worse than the receiver's ex-ante payoff with commitment. Furthermore, it may even be worse than his ex-ante payoff without the ambiguous device. In the current setting, as the receiver's goal is to maximize the ex-ante payoff, he may choose to ignore the information instead of updating on every message. In fact, the receiver should be able to freely garble the information structure by pooling messages in order to better align his ex-ante and interim preferences.

For the limits of applying Full Bayesian updating in ambiguous persuasion, see Cheng (2019) for a discussion.

To be more specific, consider a set $K \subseteq supp(\Pi)$ of messages. Suppose the receiver garbles the information structure by pooling the messages in $K$ such that whenever a message $m \in K$ realizes he only learns the event $K$ instead of that specific message. Then he updates his belief only conditioning on $K$, which gives rise to the following probability-possibility set:

$$Q^\Pi_K = \left\{ q^\pi_K \in \Delta(\Omega) : q^\pi_K(\cdot) = \frac{\sum_{m \in K} \pi(m | \cdot) p(\cdot)}{\sum_{\omega' \in \Omega} \sum_{m \in K} \pi(m | \omega') p(\omega')}, \ \pi \in \Pi \right\}$$

As a result, at the interim stage, his interim optimal strategy becomes $\hat{f}(K)$ for all the messages contained in $K$. Such a strategy can be beneficial for the receiver from the ex-ante perspective.

Formally, suppose the receiver is able to commit to a garbling of the given ambiguous device $\Pi$. Let $\mathcal{B}$ denote a partition of the set $supp(\Pi)$ with elements $B_1$, $B_2$, etc., and let $\mathbb{B}(\Pi)$ denote the collection of all such partitions. The partition is a garbling of the device as the receiver will only learn about the events in the partition but not the exact messages. For example, if the receiver commits to the trivial partition, then effectively he ignores all the messages from the ambiguous device. With this additional step, the timing of the persuasion game now becomes:

1. The sender announces and commits to a device $\Pi$.
2. The receiver commits to a garbling of the device given by the partition $\mathcal{B}$.
3. Nature draws the state, probabilistic device and message according to the prior $p$ and device $\Pi$.
4. The receiver observes the pooled message and chooses an interim strategy.
5. Nature draws the action according to the receiver's strategy and payoffs realize to the players.

Knowing that the interim optimal strategy will always be chosen, the receiver's main strategic concern will be to find the optimal garbling to maximize his ex-ante payoff. Let $V_R(\Pi, \hat{f}_{\mathcal{B}})$ denote the receiver's ex-ante payoff under the garbling $\mathcal{B}$. His goal then will be to solve the following program:

$$\max_{\mathcal{B} \in \mathbb{B}(\Pi)} V_R(\Pi, \hat{f}_{\mathcal{B}}) = \max_{\mathcal{B} \in \mathbb{B}(\Pi)} \min_{p_\pi \in C_\Pi} \sum_{B_i, a, \omega} \hat{f}(B_i)(a) u_R(a, \omega) p_\pi(\omega \times B_i)$$

Again, if the receiver is indifferent between multiple ways of garbling, he will choose the one that favors the sender's ex-ante payoff in equilibrium. Let $\hat{\mathcal{B}}$ denote such an optimal garbling.

Clearly, $V_R(\Pi, \hat{f}_{\hat{\mathcal{B}}}) \geq V_R(\Pi, \hat{f})$ and it is also weakly greater than the receiver's ex-ante payoff without the device. Thus, when the receiver is allowed to garble, he will never be worse off from participating in the persuasion game.

Notice that the receiver's garbling transforms the given ambiguous device $\Pi$ into another ambiguous device $\tilde{\Pi}$. For each probabilistic device $\pi \in \Pi$, there exists a probabilistic device $\tilde{\pi} \in \tilde{\Pi}$ such that

$$\tilde{\pi}(B_i | \omega) = \sum_{m \in B_i} \pi(m | \omega) \quad \forall B_i \in \mathcal{B}$$

The receiver's behavior is the same under the devices $\Pi$ and $\tilde{\Pi}$, and therefore it is equivalent for the sender to commit to either device. Write $\Pi \equiv \tilde{\Pi}$ if two devices are equivalent in this sense. Then there exists an equivalence class of devices

$$[\tilde{\Pi}] = \{ \Pi : \Pi \equiv \tilde{\Pi} \}$$

within which both the sender's and receiver's payoffs are the same. For each equivalence class $[\tilde{\Pi}]$, let $\tilde{\Pi}$ denote the device where the receiver does not pool any message. The sender's problem in this persuasion game is thus to find the optimal equivalence class of devices to maximize her ex-ante payoff:

$$\max_{[\tilde{\Pi}]} \min_{p_\pi \in C_{\tilde{\Pi}}} \sum_{m, a, \omega} \hat{f}(m)(a) u_S(a, \omega) p_\pi(\omega \times m)$$

In solving this program, the difficulty comes from identifying the equivalence class of ambiguous devices.
In other words, one needs to characterize the circumstances under which the receiver prefers to garble the devices. Recall that in the illustrating example, the messages are synonyms as well as dilating and the receiver strictly prefers to ignore those messages. In the next section, I am going to examine whether synonyms and dilation are sufficient for the receiver to always prefer to garble.
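The receiver's garbling problem can be sketched for the illustrating example as follows. This is a simplified illustration rather than the paper's procedure: action and device labels are assumed, the receiver is restricted to pure actions, worst cases are evaluated over the two listed devices only, and ties are not broken in the sender's favor (none arise with these numbers). The sketch enumerates all partitions of the message set, applies maxmin with prior-by-prior updating on each block, and compares ex-ante payoffs.

```python
from fractions import Fraction as F

STATES = range(2)
PRIOR = [F(1, 2), F(1, 2)]
# Receiver payoffs from the illustrating example; the labels a1, a2 are assumed.
U_R = {"a1": [F(0), F(4)], "a2": [F(1), F(1)]}
# Two fully revealing devices with opposite labels; DEVICES[i][w][m] = pi_i(m | w).
DEVICES = [
    [[F(1), F(0)], [F(0), F(1)]],
    [[F(0), F(1)], [F(1), F(0)]],
]

def partitions(items):
    """Yield every partition of a list as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part

def block_prob(pi, w, block):
    """Probability that some message in the block is sent in state w."""
    return sum(pi[w][m] for m in block)

def posterior(pi, block):
    """Posterior over states conditional on the pooled event 'block'."""
    joint = [block_prob(pi, w, block) * PRIOR[w] for w in STATES]
    total = sum(joint)
    return [x / total for x in joint]

def interim_choice(block):
    """Pure action maximizing the worst-case expected payoff over posteriors."""
    posteriors = [posterior(pi, block) for pi in DEVICES]
    def worst(a):
        return min(sum(q[w] * U_R[a][w] for w in STATES) for q in posteriors)
    return max(U_R, key=worst)

def ex_ante_value(partition):
    """Worst-case ex-ante payoff when interim choices follow each block."""
    choice = {tuple(b): interim_choice(b) for b in partition}
    return min(
        sum(block_prob(pi, w, b) * PRIOR[w] * U_R[choice[tuple(b)]][w]
            for b in partition for w in STATES)
        for pi in DEVICES
    )

best = max(partitions([0, 1]), key=ex_ante_value)
print(best, ex_ante_value(best))
```

With these numbers, keeping the messages separate leads the receiver to the safe action after either message, whereas pooling both messages restores his prior and the ex-ante preferred action, so the trivial partition is selected.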

Synonyms are messages inducing the receiver to take the same interim strategy. Formally,

Definition 4.1.

Two messages $m_1$ and $m_2$ are synonyms under a device $\Pi$ if

$$\hat{f}(m_1) = \hat{f}(m_2)$$

The use of synonyms is closely tied to the idea of providing ambiguous information as suggested by BLL. For example, when there are multiple terms that can describe the same situation, the sender (who could be a politician, a seller or a news outlet) may purposefully use them on different occasions to create ambiguity in her information. On the theoretical side, BLL further show that synonyms are necessary for optimal ambiguous persuasion in the interim setting. Specifically, by allowing the sender to use synonyms, she can induce the receiver to take actions that are not optimal without the synonyms. In light of the no-gain result (Theorem 2.1), synonyms are used exactly to induce the receiver to make dynamically inconsistent decisions.

The necessity result in BLL holds under additional regularity assumptions on utility functions.

Proposition 4.2.

The receiver always weakly prefers to pool the synonyms if and only if $|\Omega| = 2$. Moreover, when the induced strategies are different before and after pooling, he strictly prefers to pool.

Proposition 4.2 implies that the intuition derived from the illustrating example holds only in the case of binary states. The reason, which is more apparent from the proof, is that when there are more than two states, the receiver may find his interim choices given the synonyms better aligned with his ex-ante preference than his interim choices after pooling the synonyms. In a later subsection, Example 2 illustrates such a scenario with three states.

Nonetheless, one can still say a little about pooling synonyms when there are more than two states. The following corollary is derived from the proof of Proposition 4.2.

Corollary 4.3.

If the receiver's belief becomes a singleton after pooling the synonyms, then he always weakly prefers to do so. Moreover, when the induced strategies are different before and after pooling, he strictly prefers to pool.

This corollary claims that, in general, if by pooling the synonyms the receiver can get back to a singleton belief, then he always prefers to do so. The receiver's interim choice at a singleton belief is aligned with his ex-ante preference, as dynamic consistency is maintained under Bayesian updating. Thus, pooling those synonyms is always the best he can do to align his interim choices with his ex-ante preference.

Notice that BLL's construction of the optimal ambiguous devices relies on this type of synonyms. Although their characterization of the value of persuasion does not further rely on this construction, I can still show that, under the same assumption as their Proposition 2, the optimal value characterized in their paper cannot be achieved if the receiver is able to garble the devices.

Proposition 4.4.

If the optimal ambiguous persuasion value in the interim setting is strictly higher than the value of Bayesian persuasion, then it can never be achieved if the receiver is able to garble the devices.

When the receiver can garble the information structure, the sender's value from ambiguous persuasion clearly lies between the value of Bayesian persuasion and the value of ambiguous persuasion in the interim setting. Proposition 4.4 further asserts that the upper bound is never attained. In BLL, the optimal value is sometimes only approximated by a sequence of ambiguous devices, and thus is also not exactly attained. Proposition 4.4 specifies a different type of unattainability: in particular, the values of the ambiguous devices in that sequence may also fail to be attained if the receiver garbles.

Dilation is a controversial phenomenon that arises from applying Bayes' rule to a set of probability distributions. More importantly, some find it unacceptable that one's belief can become less accurate regardless of the information received. In the case of ambiguous devices, dilation occurs if there exists a set of messages such that the probability-possibility set given the whole set is contained in the probability-possibility set of every message in the set. These messages are called dilating messages. Formally,

Deﬁnition 4.5.

The messages in a set $K$ are dilating messages under a device $\Pi$ if
$$Q^{\Pi}_{K} \subseteq Q^{\Pi}_{m} \quad \forall m \in K,$$
and they are trivial if equality holds for all $m \in K$.

Trivial dilating messages are trivial in the sense that both the sender and the receiver are indifferent as to whether the receiver pools them or not.

Dilation implies not only that beliefs become less accurate after updating, but also that the belief before updating is contained in all the updated beliefs. Hence, it seems that dilating messages do not bring the receiver any information useful for making decisions, and it might be reasonable to expect that the receiver is always willing to pool them. However, again, this intuition applies only when the states are binary.
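As a rough illustration of Definition 4.5, the sketch below checks the inclusion condition numerically for binary states. It is not taken from the paper: the devices `pi_1` and `pi_2`, the message labels, and the function names are hypothetical. It assumes Full Bayesian updating, an ambiguous device given by the convex hull of finitely many probabilistic devices, and messages that have positive probability under every device. Under those assumptions, each probability-possibility set is the interval of posteriors spanned by the extreme devices.

```python
from fractions import Fraction as F

def posterior(prior, device, msgs):
    """P(omega_1 | message in msgs) under one probabilistic device (binary state)."""
    joint = [prior[w] * sum(device[m][w] for m in msgs) for w in (0, 1)]
    return joint[0] / (joint[0] + joint[1])

def belief_interval(prior, devices, msgs):
    """Probability-possibility set, represented as an interval of P(omega_1)."""
    posts = [posterior(prior, d, msgs) for d in devices]
    return min(posts), max(posts)

def classify_dilation(prior, devices, K):
    """Return (is_dilating, is_trivial) for the message set K."""
    lo_K, hi_K = belief_interval(prior, devices, K)
    per_msg = [belief_interval(prior, devices, [m]) for m in K]
    # Dilating: the set given K is contained in the set given each m in K.
    dilating = all(lo <= lo_K and hi_K <= hi for lo, hi in per_msg)
    # Trivial: the inclusion holds with equality for every m in K.
    trivial = dilating and all((lo, hi) == (lo_K, hi_K) for lo, hi in per_msg)
    return dilating, trivial

# Hypothetical devices: device[m] = (P(m | omega_1), P(m | omega_2)).
prior = (F(1, 2), F(1, 2))
pi_1 = {"m1": (F(3, 4), F(1, 4)), "m2": (F(1, 4), F(3, 4))}
pi_2 = {"m1": (F(1, 4), F(3, 4)), "m2": (F(3, 4), F(1, 4))}

print(belief_interval(prior, [pi_1, pi_2], ["m1"]))
print(classify_dilation(prior, [pi_1, pi_2], ["m1", "m2"]))  # (True, False)
```

In this hypothetical device, pooling the two messages returns the receiver to the prior, while each individual message spreads his belief over an interval around it, so the messages are dilating and non-trivial.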

Proposition 4.6.

If and only if $|\Omega| = 2$, the receiver always weakly prefers to pool the dilating messages. Moreover, when the induced strategies are different before and after pooling, he strictly prefers to pool.

The reason why the conclusion of Proposition 4.6 does not extend to more than two states is the same as for Proposition 4.2. Namely, it is not always the case that, by pooling the dilating messages, the receiver can better align his interim choices with his ex-ante preference.

More importantly, this result sheds new light on the phenomenon of dilation in updating sets of priors. Some arguments against dilation rely on the observation from the illustrating example that dilation makes the receiver worse off in terms of his ex-ante payoff. Moreover, the receiver may be willing to pay to avoid free information that leads to dilation. However, in light of Proposition 4.6, dilation does not always imply that the value of information is negative; such a conclusion holds only in the world of binary states. In other words, the fact that a decision maker ignores dilating messages when they are not profitable does not automatically imply that she dislikes dilation of her beliefs.

Shishkin and Ortoleva (2019) conduct an experiment reminiscent of the illustrating example and find that a majority of the subjects do not follow the dilating messages. This observation may be due to the subjects' intrinsic dislike of dilation, or the subjects may simply be trying to maximize their ex-ante payoff. The finding here suggests that one needs to conduct experiments with more than two states to disentangle these two possible causes.

As part of the proof of Propositions 4.2 and 4.6, this section provides an example in which the receiver finds it optimal not to pool the messages, even though they are both synonyms and dilating at the same time.

See Bradley (2019) for a discussion and references therein.

Example 2. Let

Ω = { ω , ω , ω } and the prior p is uniform denoted by p = (1 / , / , / .Let A = { a , a , a , b , b , b , c } be the receiver’s feasible actions. The receiver’s payoff giveneach action is denoted by a vector u R ( a, ω ) = ( u R ( a, ω ) , u R ( a, ω ) , u R ( a, ω )) as follows: • u R ( a , ω ) = (8 , − x, − − x ) , u R ( a , ω ) = (8 , − , − , and u R ( a , ω ) = (7 , − , − . • u R ( b , ω ) = (1 − ǫ, − ǫ, − ǫ ) , u R ( b , ω ) = (0 , − y, y ) , and u R ( b , ω ) = (1 , − z, z ) . • u R ( c , ω ) = (11 / , − / , − / It sufﬁces to consider x, y, z as large enough and ǫ small enough positive real numbers. Thesender’s preference does not depend on the states and she strictly prefers a to b and to all theother actions. First notice that without any information, the receiver strict prefers c and receives a payoff of / . Let M = { m , m , m ′ } and consider an ambiguous device Π given by the closed convex hullof the following probabilistic devices: π ( m | ω ) ω ω ω m / / m /

16 3 / m ′ /

16 9 / π ′ ( m | ω ) ω ω ω m / / m /

16 9 / m ′ /

16 3 / π ( m | ω ) ω ω ω m / / / m / /

32 5 / m ′ / /

32 15 / π ′ ( m | ω ) ω ω ω m / / / m / /

32 15 / m ′ / /

32 5 / As a result, the probability-possibility set given message m is Q Π m = co { (2 / , / , / , (1 / , / , / } At this belief, the receiver’s optimal strategy is any lottery between a and a , given the senderstrictly prefers a , m thus induces the receiver to take the action a .The message m and m ′ induces the same probability-possibility set: Q Π m = Q Π m = co { (0 , / , / , (0 , / , / , (1 / , / , / , (1 / , / , / } As y, z are large enough positive real numbers and ǫ small enough, it can always be the case thatthe receiver’s optimal action given this set is b .As both messages lead to the same induced action, the messages m and m ′ thus are synonyms.If the receiver chooses to pool the two messages together, then the pooled message { m , m ′ } leadsto the following probability-possibility set: Q Π { m ,m ′ } = co { (0 , / , / , (1 / , / , / } Q Π { m ,m ′ } $ Q Π m , the two messages are also dilating. At this belief, the receiver’soptimal action is any lottery between b and b . Assume the sender strict prefers b to b , then thereceiver takes b .Let f ( ω ) f ( ω ) f ( ω ) denote the receiver’s strategy f . To see whether the receiver prefers topool the two messages or not. If the receiver chooses to not pool any message, then his ex-antepayoff from this ambiguous device is given by V R (Π , a b b ) = 1312 − ǫ On the other hand, if he chooses to pool the two messages, then his ex-ante payoff becomes V R (Π , a b b ) = 1 Clearly, for small enough ǫ , the receiver ﬁnds it optimal to not garble the information structure,even though it has synonyms as well as dilation. More speciﬁcally, notice that in both cases, thereceiver’s ex-ante payoff can be evaluated at the posteriors (1 / , / , / and (1 / , / , / .Given these posteriors, b is a better action than b from the receiver’s ex-ante perspective. Hisinterim preferences at m and m ′ agree with this comparison, as there are beliefs where ω realizesmore likely than ω and b is much worse at those beliefs. 
While if he chooses to pool the twomessages, according to the resulting interim preference, he instead prefers b as the possibility that ω is more likely than ω vanishes.In addition, notice that the sender cannot simply generate the posterior (1 / , / , / withprobabilistic device because of the existence of b . In fact, b dominates b at any posterior assign-ing equal probabilities to ω and ω . Thus, to induce the action b , the ambiguous device needsto generate a probability-possibility set including posteriors where the probability of ω is greaterthan ω .It further implies that, synonyms are actually necessary in this example to achieve the sender’soptimal payoff. Without synonyms, since the optimal payoff is given by two actions, any proba-bilistic device in the optimal ambiguous device must generate two posteriors. In order to induceaction b , there must exist a device generating posteriors p and p ′ inducing the action a and b respectively. Moreover, it must be the case that p ( ω ) < p ( ω ) and p ′ ( ω ) > p ′ ( ω ) by Bayes’plausibility. However, if p is a possible posterior for evaluating action a , the action a then be-comes strictly preferred by the receiver. Therefore, the sender can never achieve his optimal payoffwithout using synonyms. At last, synonyms and dilation are not the only tools the sender can utilize with ambiguous de-vices. In other words, the sender can still beneﬁt from using ambiguous devices in the case ofbinary states. When the receiver’s payoff under Bayesian persuasion is strictly higher than his pay-off without any information, the sender can design ambiguous devices to extract such a surplus.Consider the following example.

Example 3.

Let

Ω = { ω , ω } , the prior p is uniform and denoted by p ( ω ) = 1 / . Thereceiver’s feasible action is A = { a , a , a , a } . Suppose the sender’s payoff depends only on he action: u S ( a ) > u S ( a ) > u S ( a ) > u S ( a ) . The receiver’s payoff is given by the followingutility function: u R ( a , ω ) = − u R ( a , ω ) = 6; u R ( a , ω ) = − u R ( a , ω ) = 4; u R ( a , ω ) = − u R ( a , ω ) = 3; u R ( a , ω ) = 0; u R ( a , ω ) = 0 . The receiver’s preference over these actions can be seen in the left ﬁgure of Figure 1. Let f ( ω ) f ( ω ) denote strategy f . p ( ω ) E p ( u R ( a, ω ))1 / / p a a a a p ( ω ) E p ( u R ( a, ω ))1 / / / p a a Figure 1: Graphical Illustration of Exampe 3Without any information, the receiver’s ex-ante optimal action is a and his payoff is . Supposethe optimal Bayesian device induces action a and a with posteriors / and / . This can beguaranteed if the sender is almost indifferent between a and a and gains very low payoff from a . As can be seen in the right ﬁgure in Figure 1 (red dashed line), the receiver’s ex-ante payoffgiven this device is strictly higher than his payoff without the persuasion.One way for the sender to gain more proﬁts is by inducing a more frequently than the optimalBayesian device. Notice that, if she can successfully induce actions a and a at posteriors / and / , then the receiver’s ex-ante payoff becomes the same as without any information.This isdepicted by the blue dashed line in Figure 1. In other words, the receiver will not choose to ignorethe sender’s recommendation of actions. This can actually be done since both a and a can beinduced by intervals with upper bound being / and / respectively. The ambiguous device Π given by a closed convex hull of the following probabilistic devices achieves that goal. π ( m | ω ) ω ω m / m / π ( m | ω ) ω ω m /

12 7 / m /

12 5 / m generates the probability-possibility set Q Π m =[0 , / and m generates the set Q Π m = [11 / , / . At these beliefs, the receiver’s optimalactions are still a and a respectively. Moreover, his ex-ante payoff from this device is given by V R (Π , a a ) = 13 (cid:20)

18 ( −

4) + 78 (4) (cid:21) + 23 (0) = 1
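The arithmetic in the displayed expression can be checked exactly. The sketch below reads its fractions as $\frac{1}{3}$, $\frac{1}{8}$, $\frac{7}{8}$ and $\frac{2}{3}$, which is an interpretation of the typeset formula. The helper name `ex_ante_payoff` and the branch layout are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction as F

def ex_ante_payoff(branches):
    """Sum of weight * expected payoff over message branches.

    Each branch is (message_weight, [(posterior_probability, payoff), ...]).
    """
    return sum(w * sum(q * u for q, u in outcomes) for w, outcomes in branches)

# Weight 1/3 on a message whose posterior puts 1/8 on payoff -4 and 7/8 on
# payoff 4; weight 2/3 on a message that yields payoff 0.
value = ex_ante_payoff([
    (F(1, 3), [(F(1, 8), -4), (F(7, 8), 4)]),
    (F(2, 3), [(1, 0)]),
])
print(value)  # 1
```

Exact rational arithmetic avoids any floating-point rounding when comparing this payoff with the payoff from having no information.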

Namely, the receiver is indifferent between garbling this device or not, and thus follows the sender's recommendation. Notice that the sender's ex-ante payoff is the same under every probabilistic device in $\Pi$; therefore, the sender obtains a strictly higher payoff than under Bayesian persuasion.

Example 3 showcases how ambiguous persuasion can be beneficial without using synonyms or dilation. It relies on two facts: (1) the receiver has a positive surplus under Bayesian persuasion; (2) the desired actions can be induced by suitable intervals. When both conditions hold, the sender can use ambiguous signals to extract the receiver's surplus. More importantly, the ambiguous device is not ambiguous from the sender's point of view. Thus, even though she is ambiguity averse, she does not expose herself to ambiguity by using such an ambiguous device.

Finally, notice that if the receiver is dynamically consistent, then his optimal contingent plan given this ambiguous device is a a , which strictly dominates a a . In other words, the sender extracts the surplus through the receiver's dynamic inconsistency. Since the receiver cannot adjust the information structure to guarantee that he takes the actions a and a at the interim stage, he can only accept his own suboptimal choices.

It is well known that dynamic (in)consistency is a critical and unsettled issue for updating under ambiguity (Siniscalchi, 2009). It is therefore important to consider how different treatments of dynamic (in)consistency lead to different implications in applications. Many papers, including BLL, take an interim approach to circumvent this issue. The present paper instead applies the existing treatments of dynamic (in)consistency in the literature to study ambiguous persuasion from an ex-ante perspective.

The two treatments of dynamic (in)consistency applied here are dynamically consistent updating (commitment) and consistent planning.
In the first treatment, the receiver applies updating rules that are dynamically consistent, as characterized in Hanany and Klibanoff (2007, 2009). This treatment is also used to develop a solution concept for dynamic games under ambiguity in Hanany et al. (2020). The second treatment, consistent planning, is proposed by Strotz (1955) and behaviorally characterized in Siniscalchi (2011).

Apart from the two treatments considered in the present paper, another main treatment in the literature is proposed by Epstein and Schneider (2003). They characterize a rectangularity condition on the set of priors under which the decision maker is dynamically consistent under Full Bayesian updating. Moreover, if the initial set of priors is not rectangular with respect to a given information structure, they propose that the decision maker's ex-ante preference should be represented by the rectangular hull, which is obtained by combining all the possible marginal probabilities with all the possible posteriors. As a result, the rectangular hull is a strict superset of her initial set of priors. For a discussion of why such a set expansion might be reasonable, I refer to a recent paper by Hill (2020).

BLL consider the issue of dynamic (in)consistency through the lens of rectangularity. However, they do not allow for the set expansion but instead restrict the information structure to be rectangular. As a result, any rectangular information structure must be given by a probabilistic device. They then conclude that under dynamic consistency (in fact, rectangularity), the sender cannot gain more than under Bayesian persuasion. However, this result is driven by the fact that rectangularity excludes any ambiguity, which is not in the same spirit as Theorem 2.1 in the present paper.

Allowing for set expansion, Pahlke (2019) also studies ambiguous persuasion under rectangularity.
For any ambiguous device, she finds the set of priors given by the rectangular hull and shows that the receiver is indeed dynamically consistent. With the same approach, she also develops a solution concept for dynamic games under ambiguity in Pahlke (2018).

Under the treatment pursued by Pahlke (2019), the receiver's ex-ante payoff, represented by the rectangular hull, depends on the information structure. In this case, under the consistent planning treatment studied in Section 3, pooling messages has an additional value for the receiver: it decreases the size of the rectangular hull. This additional value gives rise to the following proposition:

Proposition 5.1.

If the receiver's ex-ante payoff is represented by the rectangular hull, then he always weakly prefers to pool the synonyms and dilating messages. Moreover, when the induced strategies are different before and after pooling, he strictly prefers to pool.

In other words, the benefit from shrinking the rectangular hull is always sufficient for the receiver to ignore synonyms and dilating messages.

Apart from rectangularity, BLL also provide some results along the lines of consistent planning in their paper. However, those results require the use of synonyms that dilate the receiver's singleton belief. Given Corollary 4.3, they no longer hold if the receiver can choose to garble the information structure.

In this paper, the consistent planning treatment proceeds under the assumption of Full Bayesian updating. However, this assumption is not pivotal for the main ideas. Synonyms and dilation can be defined accordingly under different updating rules. Furthermore, if the receiver follows the Relative Maximum Likelihood updating characterized in Cheng (2019), all the examples (the illustrating example, Example 2, and Example 3) go through, since the messages are generated with the same likelihood under all the probabilistic devices in those ambiguous devices.

Lastly, also in an interim setting, Tang (2020) considers ambiguous persuasion where the receiver adopts a new updating rule he proposes, Conditional Maximum Likelihood. Although this updating rule renders ambiguous information unambiguous, he shows that the sender can get arbitrarily close to her ideal payoff, which is the payoff she would get if she could choose the action after the state is realized.

This paper adds to the literature on persuasion following the seminal work by Kamenica and Gentzkow (2011). For the development of the persuasion literature in the Bayesian framework, I refer to Kamenica (2019) for a detailed survey.

Among papers taking a non-Bayesian approach, de Clippel and Zhang (2019) consider the case where the sender uses probabilistic devices yet the receiver may not apply Bayesian updating. When the receiver's initial belief is ambiguous, Kosterina (2018) and Hedlund et al. (2020) study the optimal probabilistic device under maxmin and $\alpha$-maxmin expected utility, respectively. Also adopting a maxmin approach, but not in the context of ambiguity, Dworczak and Pavan (2020) characterize optimal robust Bayesian persuasion, where robustness means that the sender's Bayesian belief may be false.

The consistent planning approach taken in the present paper is similar to the idea in Lipnowski et al. (2020) and Wei (2020), as the sender also needs to take into account the receiver's actual perception of a given information structure. In those papers, the receiver incurs an attention cost to process the information and thus may also prefer a garbling of the given information structure. In contrast to the exogenous cost of attention, the present paper studies an endogenous cost that arises from dynamic inconsistency.

One of the main concerns in applications of ambiguity is dealing with the issue of dynamic inconsistency. This paper exploits the existing approaches in the literature to study ambiguous persuasion from an ex-ante perspective. Ironically, ambiguity plays a role in persuasion only through inducing dynamically inconsistent behaviors. On the one hand, if one wishes to keep dynamic consistency intact, then ambiguity does not grant the sender any additional ability to make profits, provided she has the same ambiguity attitude. On the other hand, if one believes ambiguity should play a role in persuasion, then one must accept the receiver's dynamically inconsistent behaviors.
For the latter perspective, this paper explores the use of consistent planning and gives another unexpected observation: synonyms and dilation are in fact sometimes desirable to the receiver. A general characterization of the value of ambiguous persuasion under consistent planning is one of the open questions that might be interesting to explore further.

Appendix A Persuasion under Dynamic Consistency with Uncertainty Averse Players

In this appendix, I generalize the setting in Section 2 and show that the conclusion of Theorem 2.1 still holds.

Assume that both the sender's and the receiver's preferences under ambiguity admit the uncertainty averse representation characterized in Cerreia-Vioglio et al. (2011) (with axioms A.4 - A.7). Then, fixing an ambiguous device $\Pi$, the receiver's ex-ante payoff from a contingent plan $f$ is given by
$$V_R(\Pi, f) = \min_{p \in \Delta(\Omega \times M)} G\left( \sum_{m,a,\omega} f(m)(a)\, u_R(a,\omega)\, p(\omega \times m),\; p \right)$$
for some function $G : T \times \Delta(\Omega \times M) \to (-\infty, +\infty]$ that satisfies:

(i) $G(t, \cdot)$ is lower semi-continuous and quasiconvex.

(ii) $G(\cdot, p) < \infty$ only for $p \in C_\Pi$, and it is increasing and continuous on $C_\Pi$.

Notice that the second property comes not only from the axioms but also from the assumption that the receiver's perceived ambiguity coincides with $C_\Pi$.

As a result, the receiver's optimal contingent plan also solves a maxmin program:
$$\max_{f \in \mathcal{F}} \min_{p \in C_\Pi} G\left( \sum_{m,a,\omega} f(m)(a)\, u_R(a,\omega)\, p(\omega \times m),\; p \right)$$

Again, as the objective function $G(\cdot, \cdot)$ satisfies the conditions of Sion's minimax theorem, one can apply it to conclude that there exists a saddle point $(f^*, p^*)$ that solves the program. In particular, $f^*$ is a best response to $p^*$.

Notice that for any $p$, $G(\cdot, p)$ is an increasing function. Thus the optimal contingent plan $f^*$ maximizing $G(\cdot, p^*)$ is also optimal for $\sum_{m,a,\omega} f(m)(a)\, u_R(a,\omega)\, p(\omega \times m)$ alone. Therefore, the sender is also able to induce $f^*$ using the probabilistic device $\pi^*$ that generates the joint prior $p^*$.

Finally, the sender's payoff under the ambiguous device $\Pi$,
$$V_S(\Pi, f^*) = \min_{p \in C_\Pi} G\left( \sum_{m,a,\omega} f(m)(a)\, u_S(a,\omega)\, p(\omega \times m),\; p \right),$$
is weakly dominated by her payoff from using the probabilistic device $\pi^*$, as $p^* \in C_\Pi$. This leads to the following theorem:

Theorem A.1. (When the sender and receiver are uncertainty averse.)
If the receiver is dynamically consistent, then the value of ambiguous persuasion coincides with the value of Bayesian persuasion.

The uncertainty averse preferences form the most general class of convex preferences under ambiguity. This class includes two important classes of preferences, the variational preferences (Maccheroni et al., 2006) and the smooth ambiguity preferences (Klibanoff et al., 2005). Consequently, the no-gain result also applies to those preferences, under suitable assumptions on the players' perception of ambiguity.

Appendix B Proofs of the Results

B.1 Proof of Proposition 4.2 and Corollary 4.3

The "if" direction is proved in the following; the "only if" direction is proved by Example 2.

Suppose two messages $m_1$ and $m_2$ are synonyms under a device $\Pi$, i.e. $\hat f(m_1) = \hat f(m_2)$. If the receiver chooses not to pool the synonyms, then his ex-ante payoff is given by
$$V_R(\Pi, \hat f) = \min_{p_\pi \in C_\Pi} \sum_{m,a,\omega} \hat f(m)(a)\, u_R(a,\omega)\, p_\pi(\omega \times m)$$

If the receiver chooses to pool the synonyms, then he treats the set $\{m_1, m_2\}$ as a single message. For ease of notation, consider the following construction of an ambiguous device $\tilde\Pi$: for each probabilistic device $\pi \in \Pi$, there exists a probabilistic device $\tilde\pi \in \tilde\Pi$ such that for all $\omega \in \Omega$,
$$\tilde\pi(m|\omega) = \begin{cases} \pi(m|\omega) & \text{if } m \neq m_1, m_2 \\ \dfrac{\tau_\pi(m_1)}{\tau_\pi(m_1) + \tau_\pi(m_2)} \left[ \pi(m_1|\omega) + \pi(m_2|\omega) \right] & \text{if } m = m_1 \\ \dfrac{\tau_\pi(m_2)}{\tau_\pi(m_1) + \tau_\pi(m_2)} \left[ \pi(m_1|\omega) + \pi(m_2|\omega) \right] & \text{if } m = m_2 \end{cases}$$

Notice that $q^{\tilde\pi}_{m_1} = q^{\tilde\pi}_{m_2} = q^{\pi}_{\{m_1,m_2\}}$ and $\tau_{\tilde\pi}(m_1) = \tau_\pi(m_1)$, $\tau_{\tilde\pi}(m_2) = \tau_\pi(m_2)$. Namely, $\tilde\Pi$ is exactly an ambiguous device equivalent to pooling the messages $m_1$ and $m_2$ under $\Pi$, while keeping each individual message in its support. Let $\tilde f$ denote the receiver's optimal strategy given $\tilde\Pi$.

The receiver's payoff under this device is therefore given by
$$V_R(\tilde\Pi, \tilde f) = \min_{p_{\tilde\pi} \in C_{\tilde\Pi}} \sum_{m,a,\omega} \tilde f(m)(a)\, u_R(a,\omega)\, p_{\tilde\pi}(\omega \times m)$$

Let $\tilde\pi$ denote the minimizer of the above expression; by construction there exists a $\pi \in \Pi$ corresponding to it. Then notice that:
$$\begin{aligned} V_R(\tilde\Pi, \tilde f) - V_R(\Pi, \hat f) &\geq V_R(\tilde\pi, \tilde f) - V_R(\pi, \hat f) \\ &= \sum_{m \in \{m_1,m_2\}} \sum_{a,\omega} \tilde f(m)(a)\, u_R(a,\omega)\, p_{\tilde\pi}(\omega \times m) - \sum_{m \in \{m_1,m_2\}} \sum_{a,\omega} \hat f(m)(a)\, u_R(a,\omega)\, p_{\pi}(\omega \times m) \\ &= \left[ \tau_{\tilde\pi}(m_1) + \tau_{\tilde\pi}(m_2) \right] \sum_{a,\omega} \left[ \tilde f(m_1)(a) - \hat f(m_1)(a) \right] u_R(a,\omega)\, q^{\tilde\pi}_{m_1}(\omega) \end{aligned} \tag{B.1}$$
where the last equality follows from the following facts:

1. $\tilde f(m_1) = \tilde f(m_2)$ and $\hat f(m_1) = \hat f(m_2)$.

2. $p_{\tilde\pi}(\omega \times m) = \tau_{\tilde\pi}(m)\, q^{\tilde\pi}_m(\omega)$.

3. $q^{\tilde\pi}_{m_1}(\omega) = q^{\tilde\pi}_{m_2}(\omega)$ and $\tau_\pi(m_1)\, q^{\pi}_{m_1}(\omega) + \tau_\pi(m_2)\, q^{\pi}_{m_2}(\omega) = \left[ \tau_{\tilde\pi}(m_1) + \tau_{\tilde\pi}(m_2) \right] q^{\tilde\pi}_{m_1}(\omega)$.

Notice that, as $\tilde f(m_1)$ is the receiver's optimal strategy at belief $Q^{\tilde\Pi}_{m_1}$, the expression (B.1) is always nonnegative if $Q^{\tilde\Pi}_{m_1} = \{ q^{\tilde\pi}_{m_1} \}$ and is strictly positive if $\tilde f(m_1) \neq \hat f(m_1)$. This observation proves Corollary 4.3.

For the current proposition, since $Q^{\tilde\Pi}_{m_1}$ may not be a singleton, one cannot conclude the same result. To further derive the result along this line, consider the following lemma:

Lemma B.1. If $|\Omega| = 2$ and $\tilde f(m_1) \neq \hat f(m_1)$, there does not exist $p \in Q^{\tilde\Pi}_{m_1}$ such that
$$\sum_{a,\omega} \left[ \tilde f(m_1)(a) - \hat f(m_1)(a) \right] u_R(a,\omega)\, p(\omega) \leq 0$$

Proof of Lemma B.1.

Towards a contradiction, suppose there exists such a $p \in Q^{\tilde\Pi}_{m_1}$. Then by the optimality of $\tilde f(m_1)$ at $Q^{\tilde\Pi}_{m_1}$, there must exist another $p' \in Q^{\tilde\Pi}_{m_1}$ such that
$$\sum_{a,\omega} \hat f(m_1)(a)\, u_R(a,\omega)\, p(\omega) > \sum_{a,\omega} \hat f(m_1)(a)\, u_R(a,\omega)\, p'(\omega)$$
and
$$\sum_{a,\omega} \hat f(m_1)(a)\, u_R(a,\omega)\, p'(\omega) \leq \sum_{a,\omega} \tilde f(m_1)(a)\, u_R(a,\omega)\, p'(\omega) \tag{B.2}$$

Consider the case where one can only find a $p'$ such that inequality (B.2) holds with equality. It then implies that $\hat f(m_1)$ can also be induced at $Q^{\tilde\Pi}_{m_1}$. Since the sender-preferred action is unique, it must imply $\tilde f(m_1) = \hat f(m_1)$.

Thus, as $\tilde f(m_1) \neq \hat f(m_1)$, inequality (B.2) holds strictly:
$$\sum_{a,\omega} \hat f(m_1)(a)\, u_R(a,\omega)\, p'(\omega) < \sum_{a,\omega} \tilde f(m_1)(a)\, u_R(a,\omega)\, p'(\omega)$$

Let $p''$ be the distribution where
$$\sum_{a,\omega} \hat f(m_1)(a)\, u_R(a,\omega)\, p''(\omega) = \sum_{a,\omega} \tilde f(m_1)(a)\, u_R(a,\omega)\, p''(\omega)$$

By the presumption, $p''$ could be the same as $p$; moreover, it belongs to the segment connecting $p$ and $p'$. Without loss of generality, let $p(\omega_1) < p'(\omega_1)$. Then any set of distributions that has a non-empty intersection with the interval $[p''(\omega_1), 1]$ will induce $\tilde f(m_1)$ instead of $\hat f(m_1)$. Therefore, neither $Q^{\Pi}_{m_1}$ nor $Q^{\Pi}_{m_2}$ can intersect this interval. However, that violates Bayes plausibility, as there does not exist any device that generates the posterior $p'(\omega_1) \in [p''(\omega_1), 1]$ given the pooled messages $\{m_1, m_2\}$. Thus, a contradiction.

Lemma B.1 confirms that when $|\Omega| = 2$ and $\tilde f(m_1) \neq \hat f(m_1)$, the expression (B.1) is positive. Thus, the receiver strictly prefers to pool the synonyms. In the case where $\tilde f(m_1) = \hat f(m_1)$, clearly the receiver is indifferent between pooling or not. This proof can be easily generalized to the case with more than two synonyms. $\square$

B.2 Proof of Proposition 4.4

This proof makes use of an equivalent characterization of the optimal ambiguous persuasion value in the interim setting.
While the details are available in the Online Appendix, the conclusion is that the value can be given by a concavification of the following value function:
$$v(p) = \max_{Q : p \in Q} \sum_{a,\omega} \hat f(Q)(a)\, u_R(a,\omega)\, p(\omega)$$
where $\hat f(Q)$ is the sender's most preferred strategy among the receiver's optimal strategies at belief $Q$. In other words, the ambiguous persuasion problem in the interim setting is mathematically a Bayesian persuasion problem:
$$\max_{\lambda \in \Delta(\Delta(\Omega))} E_\lambda[v(p)] \quad \text{subject to} \quad \sum_{p} \lambda(p)\, p = p_0$$

In light of this construction, Assumption 1 of Beauchêne et al. (2019) is redundant if the value of ambiguous persuasion is strictly higher than that of Bayesian persuasion. Furthermore, their Assumption 2 is equivalent to the fact that the concavification hyperplane has at most $|\Omega|$ intersections with the value function $v(p)$. In other words, the concavification value is given by a unique combination of posteriors.

To prove such an equivalence, necessity is immediate by definition. Sufficiency is proved by contrapositive: if the solution is not unique, then it is easy to find the relevant action-posterior pair that becomes irrelevant after perturbation.

Now take as given the assumption that the concavification value is given by a unique combination of posteriors. If the concavification value is strictly higher than Bayesian persuasion, then there must exist a posterior $p \in \Delta(\Omega)$ such that
$$\max_{Q : p \in Q} \sum_{a,\omega} \hat f(Q)(a)\, u_S(a,\omega)\, p(\omega) > \sum_{a,\omega} \hat f(p)(a)\, u_S(a,\omega)\, p(\omega)$$

Then, clearly, $\hat f(Q)$ is not optimal for the receiver at belief $p$. To induce such a strategy, the sender can design synonyms to generate belief $Q$ from the posterior $p$. However, because of Corollary 4.3, the receiver strictly prefers to pool such synonyms in this case.

Beauchêne et al. (2019) show that this value can never be achieved without synonyms.
Thus, there remains a possibility that an ambiguous device can generate a set of beliefs $Q' \subsetneq Q$ without synonyms and then use synonyms to generate $Q$ from $Q'$.

Since $Q'$ is not generated by synonyms, there must exist probabilistic devices $\pi$ and $\pi'$ such that $\pi$ generates the posterior $p$ and $\pi'$ generates a different posterior $p' \in Q'$.

As the sender's payoff is given by MEU, her value is given by the minimum evaluation over the two devices, and the minimizer is $\pi$. Since the optimal solution is unique, it then implies that the sender's ex-ante value is strictly higher under $\pi'$ than under $\pi$. As a result, the sender could achieve a strictly higher payoff by generating $p'$ with $\pi'$ and then inducing the same strategy with synonyms. This contradicts the optimality of the proposed ambiguous device and is thus impossible. $\square$

B.3 Proof of Proposition 4.6

The "if" direction is proved in the following; the "only if" direction is proved by Example 2.

Suppose two messages $m_1$ and $m_2$ are dilating messages under an ambiguous device $\Pi$. Namely, $Q^{\Pi}_{\{m_1,m_2\}} \subseteq Q^{\Pi}_{m_1}$ and $Q^{\Pi}_{\{m_1,m_2\}} \subseteq Q^{\Pi}_{m_2}$. If they are trivial dilating messages, then the receiver is always indifferent between pooling or not. Consider the case where they are not trivial, i.e. one of the inclusions is strict.

Let $\hat f(m_1)$ and $\hat f(m_2)$ denote the receiver's optimal strategies given the dilating messages. If they are the same, then the dilating messages are also synonyms, and Proposition 4.2 implies the result. Therefore, consider the case where they are not the same. Notice that in this case, the receiver's induced strategies are different before and after pooling.

Construct $\tilde\Pi$ the same way as in the proof of Proposition 4.2. Similarly, let $V_R(\Pi, \hat f)$ and $V_R(\tilde\Pi, \tilde f)$ denote the receiver's payoff before and after pooling the dilating messages, respectively. Let $\tilde\pi$ denote the minimizer of $V_R(\tilde\Pi, \tilde f)$, and $\pi \in \Pi$ the corresponding device.
Then one can further derive:
$$\begin{aligned} V_R(\tilde\Pi, \tilde f) - V_R(\Pi, \hat f) &\geq V_R(\tilde\pi, \tilde f) - V_R(\pi, \hat f) \\ &= \sum_{m \in \{m_1,m_2\}} \sum_{a,\omega} \tilde f(m)(a)\, u_R(a,\omega)\, p_{\tilde\pi}(\omega \times m) - \sum_{m \in \{m_1,m_2\}} \sum_{a,\omega} \hat f(m)(a)\, u_R(a,\omega)\, p_{\pi}(\omega \times m) \\ &= \tau_{\tilde\pi}(m_1) \sum_{a,\omega} \left[ \tilde f(m_1)(a) - \hat f(m_1)(a) \right] u_R(a,\omega)\, q^{\tilde\pi}_{m_1}(\omega) \\ &\quad + \tau_{\tilde\pi}(m_2) \sum_{a,\omega} \left[ \tilde f(m_2)(a) - \hat f(m_2)(a) \right] u_R(a,\omega)\, q^{\tilde\pi}_{m_2}(\omega) \end{aligned} \tag{B.3}$$

The receiver may weakly prefer not to pool the dilating messages only if there exists $p \in Q^{\Pi}_{\{m_1,m_2\}}$ such that either
$$\sum_{a,\omega} \left[ \tilde f(m_1)(a) - \hat f(m_1)(a) \right] u_R(a,\omega)\, p(\omega) \leq 0$$
or
$$\sum_{a,\omega} \left[ \tilde f(m_2)(a) - \hat f(m_2)(a) \right] u_R(a,\omega)\, p(\omega) \leq 0$$

Without loss of generality, suppose the first inequality is true. Then by an argument similar to the proof of Lemma B.1, one can show that $Q^{\Pi}_{m_1}$ cannot be a superset of $Q^{\Pi}_{\{m_1,m_2\}}$. Thus, a contradiction, i.e. the receiver always strictly prefers to pool the dilating messages in this case. The proof can be easily generalized to more than two dilating messages. $\square$

B.4 Proof of Proposition 5.1

Fix an ambiguous device $\Pi$ and suppose the receiver does not pool any message. Let $\mathrm{rect}\,C_\Pi$ denote the rectangular hull of the set of joint priors $C_\Pi$. By definition,
$$\mathrm{rect}\,C_\Pi = \left\{ p \in \Delta(\Omega \times M) : p(\omega \times m) = \tau_\pi(m) \cdot q^{\pi'}_m(\omega) \text{ for some } \pi, \pi' \in \Pi \right\}.$$
Namely, every joint prior in the rectangular hull is given by the combination of a marginal probability distribution $\tau_\pi$ and a collection of posteriors $\{q_m\}_m$, where each $q_m$ is taken from the set $Q^{\Pi}_m$. Let $\tau_\Pi$ denote the set of marginal distributions over $M$ induced by $\Pi$.

Let $V^{rect}_R(\Pi, f)$ denote the receiver's ex-ante payoff given the rectangular hull, namely
$$V^{rect}_R(\Pi, f) = \min_{p_\pi \in \mathrm{rect}\,C_\Pi} \sum_{m,a,\omega} f(m)(a)\, u_R(a,\omega)\, p_\pi(\omega \times m)$$

With rectangularity, the interim optimal strategy is also ex-ante optimal. Namely, let $\hat f(m)$ denote the receiver's interim optimal strategy; then it also satisfies
$$V^{rect}_R(\Pi, \hat f) = \max_{f \in \mathcal{F}} \min_{p_\pi \in \mathrm{rect}\,C_\Pi} \sum_{m,a,\omega} f(m)(a)\, u_R(a,\omega)\, p_\pi(\omega, m)$$

Synonyms.

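Let $m_1$ and $m_2$ be synonyms. Let $\tilde\Pi$ be defined the same way as in the proof of Proposition 4.2. Let $\hat f(m)$ and $\tilde f(m)$ denote the receiver’s interim optimal strategies under $\Pi$ and $\tilde\Pi$, respectively. Then one can derive
\[
\begin{aligned}
V^{rect}_R(\Pi, \hat f) &= \min_{p^{\pi} \in \mathrm{rect}\,C_\Pi} \sum_{m,a,\omega} \hat f(m)(a)\, u_R(a,\omega)\, p^{\pi}(\omega \times m) \\
&= \min_{\tau \in \tau_\Pi} \sum_{m \in M} \tau(m) \min_{q \in Q^{\Pi}_m} \sum_{a,\omega} \hat f(m)(a)\, u_R(a,\omega)\, q(\omega) \\
&= \min_{\tau \in \tau_\Pi} \Big[ \sum_{m \ne m_1, m_2} \tau(m) \min_{q \in Q^{\Pi}_m} \sum_{a,\omega} \hat f(m)(a)\, u_R(a,\omega)\, q(\omega) \\
&\qquad + \tau(m_1) \min_{q \in Q^{\Pi}_{m_1}} \sum_{a,\omega} \hat f(m_1)(a)\, u_R(a,\omega)\, q(\omega) + \tau(m_2) \min_{q \in Q^{\Pi}_{m_2}} \sum_{a,\omega} \hat f(m_2)(a)\, u_R(a,\omega)\, q(\omega) \Big] \\
&= \min_{\tau \in \tau_\Pi} \Big[ \sum_{m \ne m_1, m_2} \tau(m) \min_{q \in Q^{\Pi}_m} \sum_{a,\omega} \hat f(m)(a)\, u_R(a,\omega)\, q(\omega) \\
&\qquad + \min_{q_1 \in Q^{\Pi}_{m_1}} \min_{q_2 \in Q^{\Pi}_{m_2}} \sum_{a,\omega} \hat f(m_1)(a)\, u_R(a,\omega) \big( \tau(m_1) q_1(\omega) + \tau(m_2) q_2(\omega) \big) \Big] \\
&\le \min_{\tau \in \tau_\Pi} \Big[ \sum_{m \ne m_1, m_2} \tau(m) \min_{q \in Q^{\Pi}_m} \sum_{a,\omega} \hat f(m)(a)\, u_R(a,\omega)\, q(\omega) \\
&\qquad + \min_{q \in Q^{\Pi}_{\{m_1,m_2\}}} \sum_{a,\omega} \hat f(m_1)(a)\, u_R(a,\omega)\, q(\omega) \big( \tau(m_1) + \tau(m_2) \big) \Big] \\
&= \min_{\tau \in \tau_{\tilde\Pi}} \sum_{m \in M} \tau(m) \min_{q \in Q^{\tilde\Pi}_m} \sum_{a,\omega} \hat f(m)(a)\, u_R(a,\omega)\, q(\omega) \\
&\le \min_{\tau \in \tau_{\tilde\Pi}} \sum_{m \in M} \tau(m) \min_{q \in Q^{\tilde\Pi}_m} \sum_{a,\omega} \tilde f(m)(a)\, u_R(a,\omega)\, q(\omega) = V^{rect}_R(\tilde\Pi, \tilde f),
\end{aligned}
\]
where the second equality follows from the rectangularity of $\mathrm{rect}\,C_\Pi$, the first inequality follows from the fact that for any $\pi$, $\tau(m_1) q^{\pi}_{m_1} + \tau(m_2) q^{\pi}_{m_2}$, normalized by $\tau(m_1) + \tau(m_2)$, belongs to $Q^{\Pi}_{\{m_1,m_2\}}$, and the last inequality follows from $\tilde f$ being interim optimal given $Q^{\tilde\Pi}_m$. Furthermore, notice that if $\hat f$ and $\tilde f$ are not the same at $m_1$ and $m_2$, then the last inequality becomes strict. This argument can be easily generalized to more than two synonyms.

Dilation.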

Let $m_1$ and $m_2$ be two non-trivial dilating messages. Let $\tilde\Pi$ be defined the same way as before. By definition, one has
\[
Q^{\tilde\Pi}_{m_1} \subseteq Q^{\Pi}_{m_1} \quad \text{and} \quad Q^{\tilde\Pi}_{m_2} \subseteq Q^{\Pi}_{m_2}.
\]
Moreover, one of the inclusions is strict. It thus further implies that
\[
\mathrm{rect}\,C_{\tilde\Pi} \subsetneq \mathrm{rect}\,C_\Pi.
\]
As a result,
\[
\max_{f \in \mathcal{F}} \min_{p^{\pi} \in \mathrm{rect}\,C_{\tilde\Pi}} \sum_{m,a,\omega} f(m)(a)\, u_R(a,\omega)\, p^{\pi}(\omega \times m) \ \ge\ \max_{f \in \mathcal{F}} \min_{p^{\pi} \in \mathrm{rect}\,C_\Pi} \sum_{m,a,\omega} f(m)(a)\, u_R(a,\omega)\, p^{\pi}(\omega \times m).
\]
Therefore, the receiver always weakly prefers to garble dilating messages. Moreover, if $\hat f$ and $\tilde f$ are not the same, then the receiver strictly prefers to pool the messages. This argument can be easily generalized to more than two dilating messages. □

References

Beauchêne, D., Li, J., and Li, M. (2019). Ambiguous persuasion.

Journal of Economic Theory, 179:312–365.

Bradley, S. (2019). Imprecise Probabilities. In Zalta, E. N., editor, The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, spring 2019 edition.

Cerreia-Vioglio, S., Maccheroni, F., Marinacci, M., and Montrucchio, L. (2011). Uncertainty averse preferences. Journal of Economic Theory, 146(4):1275–1330.

Cheng, X. (2019). Relative maximum likelihood updating of ambiguous beliefs. arXiv preprint arXiv:1911.02678.

de Clippel, G. and Zhang, X. (2019). Non-Bayesian persuasion. Technical report, Working Paper.

Dominiak, A., Duersch, P., and Lefort, J.-P. (2012). A dynamic Ellsberg urn experiment. Games and Economic Behavior, 75(2):625–638.

Dworczak, P. and Pavan, A. (2020). Preparing for the worst but hoping for the best: Robust (Bayesian) persuasion.

Epstein, L. G. and Schneider, M. (2003). Recursive multiple-priors. Journal of Economic Theory, 113(1):1–31.

Ghirardato, P., Maccheroni, F., Marinacci, M., et al. (2004). Differentiating ambiguity and ambiguity attitude. Journal of Economic Theory, 118(2):133–173.

Gilboa, I. and Schmeidler, D. (1989). Maxmin expected utility with non-unique prior. Journal of Mathematical Economics, 18(2):141–153.

Guo, Y. and Shmaya, E. (2018). Costly miscalibration. Working paper.

Hanany, E. and Klibanoff, P. (2007). Updating preferences with multiple priors. Theoretical Economics, 2(3):261–298.

Hanany, E. and Klibanoff, P. (2009). Updating ambiguity averse preferences. The BE Journal of Theoretical Economics, 9(1).

Hanany, E., Klibanoff, P., and Mukerji, S. (2020). Incomplete information games with ambiguity averse players. American Economic Journal: Microeconomics, 12(2):135–87.

Hedlund, J., Kauffeldt, T. F., and Lammert, M. (2020). Persuasion under ambiguity. Theory and Decision, pages 1–28.

Hill, B. (2020). Dynamic consistency and ambiguity: A reappraisal. Games and Economic Behavior, 120:289–310.

Kamenica, E. (2019). Bayesian persuasion and information design. Annual Review of Economics, 11:249–272.

Kamenica, E. and Gentzkow, M. (2011). Bayesian persuasion. American Economic Review, 101(6):2590–2615.

Klibanoff, P., Marinacci, M., and Mukerji, S. (2005). A smooth model of decision making under ambiguity. Econometrica, 73(6):1849–1892.

Kosterina, S. (2018). Persuasion with unknown beliefs. Work. Pap., Princeton Univ., Princeton, NJ.

Lipnowski, E., Mathevet, L., and Wei, D. (2020). Attention management. American Economic Review: Insights, 2(1):17–32.

Maccheroni, F., Marinacci, M., and Rustichini, A. (2006). Ambiguity aversion, robustness, and the variational representation of preferences. Econometrica, 74(6):1447–1498.

Pahlke, M. (2018). Dynamic consistency in incomplete information games with multiple priors. Technical report, Center for Mathematical Economics Working Papers.

Pahlke, M. (2019). A note on dynamic consistency in ambiguous persuasion. Technical report, Center for Mathematical Economics Working Papers.

Pires, C. P. (2002). A rule for updating ambiguous beliefs. Theory and Decision, 53(2):137–152.

Shishkin, D. and Ortoleva, P. (2019). Ambiguous information and dilation: An experiment. Working paper.

Siniscalchi, M. (2009). Two out of three ain’t bad: A comment on “The ambiguity aversion literature: A critical assessment”. Economics & Philosophy, 25(3):335–356.

Siniscalchi, M. (2011). Dynamic choice under ambiguity. Theoretical Economics, 6(3):379–421.

Sion, M. (1958). On general minimax theorems. Pacific Journal of Mathematics, 8(1):171–176.

Strotz, R. H. (1955). Myopia and inconsistency in dynamic utility maximization. The Review of Economic Studies, 23(3):165–180.

Tang, R. (2020). A theory of updating ambiguous information. Working paper.

Wei, D. (2020). Persuasion under costly learning.

Available at SSRN 3188302.

Online Appendix of “Ambiguous Persuasion: An Ex-ante Perspective”

Xiaoyu Cheng * October 13, 2020

This online appendix provides an equivalent characterization of the value of ambiguous persuasion in an interim setting. The setup and notation remain the same as in the paper “Ambiguous Persuasion: An Ex-ante Perspective”.

Let $\mathcal{P}$ denote the collection of all non-empty closed and convex subsets of $\Delta(\Omega)$. Let $P, Q$ denote generic elements of $\mathcal{P}$. Then a distribution over (closed and convex) sets of posteriors can be denoted by $\mu \in \Delta(\mathcal{P})$. If a distribution $\mu$ is induced by an ambiguous device, then it needs to satisfy the following condition.

Definition 1.1.

A distribution $\mu$ is Verifiably Bayes Plausible if there exists a selection function $\varphi : \mathcal{P} \to \Delta(\Omega)$ with $\varphi(P) \in P$ such that
\[
\sum_{P \in supp(\mu)} \mu(P)\, \varphi(P) = p_0.
\]
Namely, a distribution $\mu$ is verifiably Bayes plausible if Bayes plausibility can be verified by selecting a posterior from each set of posteriors in its support. Such a selection is then called the verifying selection. A verifiably Bayes plausible (VBP, henceforth) distribution $\mu$ may have multiple verifying selections. Let $\Phi_\mu$ denote the set of all verifying selections of $\mu$, i.e.
\[
\Phi_\mu = \Big\{ \varphi(\cdot) : \sum_{P \in supp(\mu)} \mu(P)\, \varphi(P) = p_0 \Big\}.
\]
Fix a VBP $\mu$ and one of its verifying selections $\varphi(\cdot)$; then $\varphi(P)$ is a verifying posterior of $P$. Let $P_\mu$ denote the set of all verifying posteriors of $P$ given distribution $\mu$, i.e.
\[
P_\mu \equiv \{ p \in P : p = \varphi(P) \text{ for some } \varphi \in \Phi_\mu \}.
\]
The first observation is that $P_\mu$ is also a closed and convex subset of $\Delta(\Omega)$, which is an immediate consequence of the sets in the support of $\mu$ being closed and convex.

* Department of Managerial Economics and Decision Sciences, Kellogg School of Management, Northwestern University, Evanston, IL, USA. E-mail: [email protected]

Proposition 1.2. For any verifiably Bayes plausible $\mu$ and $P \in supp(\mu)$, $P_\mu$ is a closed and convex set.

Proof. First show convexity. Given any VBP $\mu$ and $P \in supp(\mu)$, consider the set $P_\mu$. Suppose $p_1, p_2 \in P_\mu$ are such that $p_1 = \varphi_1(P)$ and $p_2 = \varphi_2(P)$. Then
\[
\mu(P)\, p_1 + \sum_{P' \in supp(\mu) \setminus P} \mu(P')\, \varphi_1(P') = p_0,
\qquad
\mu(P)\, p_2 + \sum_{P' \in supp(\mu) \setminus P} \mu(P')\, \varphi_2(P') = p_0.
\]
For any $\lambda \in [0,1]$, let $\varphi_\lambda(\cdot) = \lambda \varphi_1(\cdot) + (1-\lambda) \varphi_2(\cdot)$. Combining the two equations above implies that $\varphi_\lambda$ is also a verifying selection of $\mu$:
\[
\mu(P) \big[ \lambda p_1 + (1-\lambda) p_2 \big] + \sum_{P' \in supp(\mu) \setminus P} \mu(P') \big[ \lambda \varphi_1(P') + (1-\lambda) \varphi_2(P') \big] = p_0.
\]
Therefore, $\varphi_\lambda(P) = \lambda p_1 + (1-\lambda) p_2$ is also contained in $P_\mu$.
Thus $P_\mu$ is convex.

For closedness, similarly consider a sequence of verifying posteriors $\{p_n\}_{n=1,2,\cdots}$ of $P_\mu$ with corresponding verifying selections $\{\varphi_n\}_{n=1,2,\cdots}$. Suppose $\lim_{n\to\infty} p_n = p$; it implies that $\lim_{n\to\infty} \varphi_n(P) = p \equiv \varphi(P)$. The last term is well defined because $P$ is closed. Then it is easy to verify that such $\varphi$ is a verifying selection of $\mu$. Therefore, $p \in P_\mu$. □

Not all ambiguous devices can induce distributions over sets of posteriors. Conversely, I will show that, through the following construction, every VBP distribution can be induced by some ambiguous device. In particular, any such ambiguous device can be constructed in a two-step manner.

For the first step, an ambiguous device $\Pi$ is simple if for all $\pi, \pi' \in \Pi$, $\tau^{\pi}(m) = \tau^{\pi'}(m)$. Namely, a simple ambiguous device consists of probabilistic devices that generate the same message with the same overall probability. As a result, a simple ambiguous device $\Pi$ generates the probability-possibility set $Q^{\Pi}_m$ with probability $\tau^{\pi}(m)$. In other words, it induces a distribution $\mu$ with $\mu(Q_m) = \tau^{\pi}(m)$. Moreover, notice that the induced distribution is always fully-verified:

Definition 2.1.

A verifiably Bayes plausible distribution $\mu$ is fully-verified if for all $P \in supp(\mu)$, $P_\mu = P$.

Lemma 2.2.

A simple ambiguous device induces a fully-verified verifiably Bayes plausible distribution. A fully-verified verifiably Bayes plausible distribution can be induced by a simple ambiguous device.
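The two definitions above can be made concrete in the binary-state case. The following is a minimal sketch, not part of the paper: it assumes $\Omega$ has two states, so a posterior is a single number (the probability of one state) and every closed convex set of posteriors is an interval $[a, b]$. Under that assumption, $\mu$ is VBP exactly when the prior lies between the $\mu$-weighted sums of the lower and upper endpoints, and the set of verifying posteriors $P_\mu$ is again an interval. All function names are hypothetical, and every set in the support is assumed to carry strictly positive weight.

```python
from fractions import Fraction as F

def bounds(mu, sets, skip=None):
    """Smallest and largest mu-weighted sums of selections, optionally skipping one set."""
    lo = sum(w * a for i, (w, (a, b)) in enumerate(zip(mu, sets)) if i != skip)
    hi = sum(w * b for i, (w, (a, b)) in enumerate(zip(mu, sets)) if i != skip)
    return lo, hi

def is_vbp(mu, sets, p0):
    """Binary-state check: some selection averages back to the prior p0."""
    lo, hi = bounds(mu, sets)
    return lo <= p0 <= hi

def verifying_posteriors(mu, sets, p0, k):
    """Interval of verifying posteriors of the k-th set (meaningful only if mu is VBP)."""
    rest_lo, rest_hi = bounds(mu, sets, skip=k)
    a, b = sets[k]
    # p is verifying iff the remaining sets can contribute exactly p0 - mu[k] * p.
    return max(a, (p0 - rest_hi) / mu[k]), min(b, (p0 - rest_lo) / mu[k])

def is_fully_verified(mu, sets, p0):
    """True iff mu is VBP and every set coincides with its verifying posteriors."""
    return is_vbp(mu, sets, p0) and all(
        verifying_posteriors(mu, sets, p0, k) == tuple(sets[k])
        for k in range(len(sets))
    )

half = F(1, 2)
mu = [half, half]
simple = [(F(1, 4), F(1, 2)), (F(1, 2), F(3, 4))]   # every posterior is verifying
dilated = [(F(0), F(1)), (half, half)]               # VBP, but not fully-verified
```

With prior `half`, the distribution over `simple` is fully-verified, while the distribution over `dilated` is VBP but only the posterior `half` in the first set is verifying. This illustrates why simple devices alone cannot induce every VBP distribution.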

The proof of Lemma 2.2 is straightforward and thus skipped here. Clearly, not all VBP distributions are fully-verified. Thus simple ambiguous devices alone cannot induce all VBP distributions. However, it suffices to consider ambiguous devices that dilate a simple ambiguous device.

Fix a simple ambiguous device $\Pi$ with the message space $M$. Consider a dilating message space $\tilde M$ and a function $g$ that maps from $\Omega \times M$ to $\Delta(\tilde M)$. Moreover, assume that the support of $g(\cdot \mid m, \omega)$ is $\{\tilde m_i\}_{i=1,2,\cdots} \subseteq \tilde M$ for all $\omega \in \Omega$. Therefore, $\tilde M$ can be thought of as a refinement of $M$. Let $g \circ \pi$ denote the composition of $g$ and $\pi$, which is a mapping from $\Omega$ to $\Delta(\tilde M)$:
\[
(g \circ \pi)(\tilde m_i \mid \omega) = \sum_{m' \in M} g(\tilde m_i \mid m', \omega)\, \pi(m' \mid \omega) = g(\tilde m_i \mid m, \omega)\, \pi(m \mid \omega),
\]
where the second equality follows from $g(\tilde m_i \mid m', \omega) > 0$ only when $m' = m$. Therefore, $g \circ \pi$ is a probabilistic device with $\tilde M$ being the message space. The posterior corresponding to message $\tilde m_i$ is then given by
\[
q^{g \circ \pi}_{\tilde m_i}(\omega) = \frac{g(\tilde m_i \mid m, \omega)\, \pi(m \mid \omega)\, p_0(\omega)}{\sum_{\omega' \in \Omega} g(\tilde m_i \mid m, \omega')\, \pi(m \mid \omega')\, p_0(\omega')}.
\]
Let $\tau^{g \circ \pi}(\tilde m_i \mid m)$ denote the marginal probability of message $\tilde m_i$ conditional on $m$, i.e.
\[
\tau^{g \circ \pi}(\tilde m_i \mid m) = \frac{\sum_{\omega \in \Omega} g(\tilde m_i \mid m, \omega)\, \pi(m \mid \omega)\, p_0(\omega)}{\sum_{\omega \in \Omega} \pi(m \mid \omega)\, p_0(\omega)}.
\]
Then it is easy to verify that the following conditional Bayes plausibility holds:
\[
\sum_i \tau^{g \circ \pi}(\tilde m_i \mid m) \cdot q^{g \circ \pi}_{\tilde m_i}(\cdot) = q^{\pi}_m(\cdot).
\]
Namely, the posteriors generated from refining the message $m$ to $\{\tilde m_i\}_{i=1,2,\cdots}$ also need to average back to the posterior given $m$. Then, letting $Q^{g \circ \pi}_{\tilde m}$ denote the convex hull of the posteriors $\{q^{g \circ \pi}_{\tilde m_i}(\cdot)\}_{i=1,2,\cdots}$, it must be the case that $q^{\pi}_m \in Q^{g \circ \pi}_{\tilde m}$. On the other hand, it also implies that, given a posterior $q \in \Delta(\Omega)$, any set $P \subseteq \Delta(\Omega)$ containing it can be induced by some function $g$ in this way.

Next, to let every message $\tilde m_i$ be able to induce the whole set $Q^{g \circ \pi}_{\tilde m}$, it suffices to consider permutations of $g$.
For example, let $g'$ permute the labels of $\tilde m_i$ and $\tilde m_j$ such that
\[
g'(\tilde m_i \mid m, \omega) = g(\tilde m_j \mid m, \omega), \qquad g'(\tilde m_j \mid m, \omega) = g(\tilde m_i \mid m, \omega)
\]
for all $\omega \in \Omega$. Then fix $\pi$ and suppose the set $G \equiv co(\{g, g'\})$ is used to generate messages in $\tilde M$ in an ambiguous way. As a result, the probability-possibility sets $Q_{\tilde m_i}$ and $Q_{\tilde m_j}$ will coincide and equal the convex hull of $q^{g \circ \pi}_{\tilde m_i}$ and $q^{g \circ \pi}_{\tilde m_j}$. Analogously, the whole set of posteriors $Q^{g \circ \pi}_{\tilde m}$ can be generated at any message $\tilde m_i$ by constructing a set of all possible permutations of $g$.

Finally, consider a simple ambiguous device $\Pi$ inducing the set $Q^{\Pi}_m$ at message $m$. The above construction can be applied to every $\pi \in \Pi$ such that every construction leads to the same set of posteriors $Q_{\tilde m}$. Notice that $Q^{\Pi}_m \subseteq Q_{\tilde m}$ by construction. Effectively, the messages $\{\tilde m_i\}_{i=1,2,\cdots}$ dilate the set $Q^{\Pi}_m$. For this reason, the construction in this step dilates the simple ambiguous device.

To summarize, any VBP distribution $\mu$ can be induced by an ambiguous device constructed in the following two steps:

(i) Find a fully-verified VBP distribution $\mu'$ such that for each $P \in supp(\mu)$ there exists $P' \in supp(\mu')$ with $P' \subseteq P$ and $\mu'(P') = \mu(P)$. Then construct a simple ambiguous device $\Pi$ that induces $\mu'$.

(ii) For each $\pi \in \Pi$, identify the function $g$ dilating the posteriors induced by $\pi$ to the sets $P$ in the support of $\mu$. Then construct probabilistic devices using $\pi$ and all possible permutations of $g$.

Let $G \circ \Pi$ denote the ambiguous device constructed in this way. Clearly, it contains probabilistic devices of the form $g \circ \pi$. Therefore, the following proposition can be proved directly from this construction.

Proposition 2.3.

Any verifiably Bayes plausible distribution can be induced by an ambiguous device.

Proposition 2.3 suggests that VBP distributions are relevant objects to search over for the optimal ambiguous information structure. However, since the ambiguous devices constructed in this way are special cases of all ambiguous devices, it is not automatically true that optimal ambiguous persuasion will be given in this form.
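The conditional Bayes plausibility property used in the dilation step can be checked numerically. The sketch below is an illustration under stated assumptions rather than the paper's construction: it fixes one message $m$ of a probabilistic device, takes a hypothetical refinement $g$ that splits $m$ into two refined messages, and computes the refined posteriors and their conditional weights by Bayes' rule. The function names and the numbers are made up for illustration, and every message is assumed to have positive probability.

```python
from fractions import Fraction as F

def posterior(prior, lik):
    """Bayes posterior over states and the total probability of the message.

    lik[w] is the probability of the message in state w.
    """
    total = sum(prior[w] * lik[w] for w in prior)
    return {w: prior[w] * lik[w] / total for w in prior}, total

def refine(prior, pi_m, g_m):
    """Refine message m into messages m~_i.

    pi_m[w] = pi(m | w); g_m[i][w] = g(m~_i | m, w), summing to 1 over i for each w.
    Returns q_m and a list of (tau(m~_i | m), posterior after m~_i).
    """
    q_m, tau_m = posterior(prior, pi_m)
    refined = []
    for g_i in g_m:
        lik = {w: g_i[w] * pi_m[w] for w in prior}
        q_i, joint = posterior(prior, lik)
        refined.append((joint / tau_m, q_i))
    return q_m, refined

prior = {0: F(1, 2), 1: F(1, 2)}
pi_m = {0: F(1, 2), 1: F(1, 2)}                 # message m is uninformative here
g_m = [{0: F(3, 4), 1: F(1, 4)}, {0: F(1, 4), 1: F(3, 4)}]

q_m, refined = refine(prior, pi_m, g_m)
# The weighted average of the refined posteriors recovers q_m state by state.
average = {w: sum(t * q[w] for t, q in refined) for w in prior}
```

In this example the refined posteriors put probability 1/4 and 3/4 on state 1, each with conditional weight 1/2, so `average` equals `q_m`. The posterior at $m$ therefore lies in the convex hull of the refined posteriors, which is the sense in which the refinement dilates the original set.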

Let $\hat f_p(P)$ denote the receiver’s optimal strategy at a set of posteriors $P$ that gives the sender the highest expected payoff according to the belief $p \in P$. The explicit dependence on $p$ is unique to the current construction, since a VBP distribution $\mu$ can be induced by multiple ambiguous devices. Thus, fixing a VBP distribution $\mu$, the sender is able to choose her most preferred ambiguous device. Then under this device, the receiver’s strategy will be evaluated at the belief $p$. Formally, for any VBP distribution $\mu$ and a verifying selection $\varphi$, the sender’s ex-ante payoff from the corresponding ambiguous device is given by:
\[
V_S(\mu, \varphi) = \sum_{P \in supp(\mu)} \mu(P) \sum_{a,\omega} \hat f_{\varphi(P)}(P)(a)\, u_S(a, \omega)\, \varphi(P)(\omega).
\]
The following result establishes the equivalence of value between the ambiguous devices and VBP distributions with some verifying selection.

Proposition 3.1.

The following are equivalent:

(i) There exists an ambiguous device with value $v^*$.

(ii) There exists a verifiably Bayes plausible distribution $\mu$ and a verifying selection $\varphi$ such that $V_S(\mu, \varphi) = v^*$.

Proof. The direction (ii) ⇒ (i) is immediate given our two-step construction of ambiguous devices. For the other direction (i) ⇒ (ii), notice that
\[
\begin{aligned}
v^* &= \min_{\pi \in \Pi} \sum_{\omega \in \Omega} p_0(\omega) \sum_{m,a} \hat f_{q^{\pi}_m}(Q^{\Pi}_m)(a)\, \pi(m \mid \omega)\, u_S(a,\omega) \\
&= \min_{\pi \in \Pi} \sum_{m,a} \hat f_{q^{\pi}_m}(Q^{\Pi}_m)(a) \Big[ \sum_{\omega \in \Omega} p_0(\omega)\, \pi(m \mid \omega) \Big] \frac{\sum_{\omega \in \Omega} p_0(\omega)\, \pi(m \mid \omega)\, u_S(a,\omega)}{\sum_{\omega \in \Omega} p_0(\omega)\, \pi(m \mid \omega)} \\
&= \min_{\pi \in \Pi} \sum_{m} \tau^{\pi}(m) \sum_{a,\omega} \hat f_{q^{\pi}_m}(Q^{\Pi}_m)(a)\, u_S(a,\omega)\, q^{\pi}_m(\omega) \\
&= \sum_{m \in M} \tau^{\pi^*}(m) \sum_{a,\omega} \hat f_{q^{\pi^*}_m}(Q^{\Pi}_m)(a)\, u_S(a,\omega)\, q^{\pi^*}_m(\omega),
\end{aligned}
\]
where $\pi^*$ denotes a minimizing device in $\Pi$. Since $q^{\pi^*}_m \in Q^{\Pi}_m$ for all $m$ and $\sum_m \tau^{\pi^*}(m)\, q^{\pi^*}_m = p_0$, the distribution given by $\mu(Q^{\Pi}_m) = \tau^{\pi^*}(m)$ is verifiably Bayes plausible with verifying selection $\varphi(Q^{\Pi}_m) = q^{\pi^*}_m$. Therefore, one has $V_S(\mu, \varphi) = v^*$.

Notice that, fixing any VBP distribution $\mu$, the sender is able to achieve the maximum value over all verifying selections. Let $V_S(\mu)$ denote this value:
\[
V_S(\mu) = \max_{\varphi \in \Phi_\mu} \sum_{P \in supp(\mu)} \mu(P) \sum_{a,\omega} \hat f_{\varphi(P)}(P)(a)\, u_S(a, \omega)\, \varphi(P)(\omega).
\]
The maximum exists and it can be achieved by designing a Bayesian device in the first step that induces the maximizing verifying posteriors.

As a result, the sender’s optimal ambiguous persuasion design can be simplified to the following program.

Corollary 4.1.

The value of optimal ambiguous persuasion is given by the following:
\[
\max_{\mu \in \Delta(\mathcal{P})} V_S(\mu) \quad \text{s.t. } \mu \text{ being verifiably Bayes plausible.}
\]
This program is very similar to the program of Bayesian persuasion. Indeed, I am going to show that the solution of this program can also be characterized by the concavification of some value function.

Let $v : \Delta(\Omega) \to \mathbb{R}$ be a value function defined by the following:
\[
v(p) = \max_{Q : p \in Q} \sum_{a,\omega} \hat f_p(Q)(a)\, u_S(a,\omega)\, p(\omega).
\]
Namely, the value at a posterior $p$ is given by maximizing over all probability-possibility sets containing it. Given that the receiver takes the sender-preferred action evaluated at posterior $p$, $v(p)$ is clearly upper semi-continuous. Let $\hat v(p)$ denote the concave closure of $v(p)$:
\[
\hat v(p) \equiv \inf \{ H(p) \mid H : \Delta(\Omega) \to \mathbb{R},\ H \ge v,\ H \text{ is affine and continuous} \}.
\]
The following theorem shows that the concavification of this value function characterizes the value of optimal ambiguous persuasion.

(Footnote: $P_\mu$ being closed and convex implies that $\Phi_\mu$ is also a closed and convex set of functions.)

Theorem 4.2. The value of optimal ambiguous persuasion at prior $p_0$ equals $\hat v(p_0)$.

Proof. Let $V^*(p_0)$ denote the value of optimal ambiguous persuasion solved from the following program:
\[
V^*(p_0) = \max_{\mu \in \Delta(\mathcal{P})} V_S(\mu) \quad \text{s.t. } \mu \text{ being verifiably Bayes plausible.}
\]
First of all, I show that $\hat v(p_0)$ can be achieved at a VBP distribution $\mu$ with a verifying selection $\varphi$. Let $\tau \in \Delta(\Delta(\Omega))$ be the distribution inducing the value $\hat v(p_0)$; thus it satisfies Bayes plausibility. For each $p \in supp(\tau)$, let $P^*$ denote the probability-possibility set inducing the value $v(p)$. Then the distribution $\mu$ with $\mu(P^*) = \tau(p)$ and verifying selection $\varphi(P^*) = p$ clearly induces the value $\hat v(p_0)$. That is, $V_S(\mu, \varphi) = \hat v(p_0)$. Therefore,
\[
V^*(p_0) = \max_{\mu \in \Delta(\mathcal{P})} V_S(\mu) \ge V_S(\mu, \varphi) = \hat v(p_0).
\]
Next, I am going to show that it cannot be the case that $V^*(p_0) > \hat v(p_0)$.
Towards a contradiction, suppose the ex-ante value from some VBP distribution $\mu$ with verifying selection $\varphi$ is strictly higher than $\hat v(p_0)$. Because
\[
V_S(\mu, \varphi) = \sum_{P \in supp(\mu)} \mu(P) \sum_{a,\omega} \hat f_{\varphi(P)}(P)(a)\, u_S(a, \omega)\, \varphi(P)(\omega)
\]
is in the form of an expectation, there must exist $P \in supp(\mu)$ such that
\[
\sum_{a,\omega} \hat f_{\varphi(P)}(P)(a)\, u_S(a, \omega)\, \varphi(P)(\omega) > \max_{P' : \varphi(P) \in P'} \sum_{a,\omega} \hat f_{\varphi(P)}(P')(a)\, u_S(a, \omega)\, \varphi(P)(\omega),
\]
which is clearly impossible.

Theorem 4.2 implies that the problem of optimal ambiguous persuasion is equivalent to an optimal Bayesian persuasion problem with $v(p)$ as the value function.

Corollary 4.3.

The value of optimal ambiguous persuasion is given by the following:
\[
\max_{\tau \in \Delta(\Delta(\Omega))} E_\tau[v(p)] \quad \text{s.t. } \sum_{p \in supp(\tau)} \tau(p)\, p = p_0.
\]
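The program in Corollary 4.3 has the form of a standard concavification problem, which can be approximated on a grid when the state is binary. The sketch below is only an illustration under assumptions that are not in the paper: posteriors are numbers in $[0,1]$, and `v_example` is a hypothetical step value function (the sender gets 1 when the posterior is at least 1/2), not the $v$ derived from the receiver's problem. The concave closure is computed as the upper hull of the grid points.

```python
from fractions import Fraction as F

def cross(o, a, b):
    """Cross product of vectors o->a and o->b; negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def concave_closure(xs, vs):
    """Vertices of the upper concave envelope of points (xs[i], vs[i]); xs strictly increasing."""
    hull = []
    for pt in zip(xs, vs):
        # Drop the last vertex while it lies on or below the chord to the new point.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], pt) >= 0:
            hull.pop()
        hull.append(pt)
    return hull

def evaluate(hull, p):
    """Value of the envelope at p and the Bayes plausible split {posterior: weight}."""
    for (x0, y0), (x1, y1) in zip(hull, hull[1:]):
        if x0 <= p <= x1:
            w1 = (p - x0) / (x1 - x0)            # weight on the right endpoint
            return y0 + (y1 - y0) * w1, {x0: 1 - w1, x1: w1}
    raise ValueError("p lies outside the grid")

def v_example(p):
    """Hypothetical sender value: 1 if the posterior is at least 1/2, else 0."""
    return F(1) if p >= F(1, 2) else F(0)

grid = [F(k, 20) for k in range(21)]
hull = concave_closure(grid, [v_example(x) for x in grid])
value, split = evaluate(hull, F(3, 10))
```

For the prior 3/10, the envelope is attained by splitting between the posteriors 0 and 1/2 with weights 2/5 and 3/5, giving value 3/5. The split averages back to the prior, which is the Bayes plausibility constraint in the corollary. A finer grid or more states would require a general convex hull routine rather than this one-dimensional scan.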