Persuasion with Coarse Communication

Abstract

Persuasion is an exceedingly difficult task. A leading cause of this difficulty is the misalignment of preferences, which is studied extensively by the literature on persuasion games. However, the difficulty of communication also has a first order effect on the outcomes and welfare of agents. Motivated by this observation, we study a model of Bayesian Persuasion in which the communication between the sender and the receiver is constrained. This is done by allowing the cardinality of the signal space to be less than the cardinality of the action space and the state space, which limits the number of action recommendations that the sender can make. Existence of a maximum to the sender's problem is proven and its properties are characterized. This generalizes the standard Bayesian Persuasion framework, in which existence results rely on the assumption of rich signal spaces. We analyze the sender's willingness to pay for an additional signal as a function of the prior belief, which can be interpreted as the value of precise communication. We provide an upper bound for this value which applies to all finite persuasion games. While increased precision is always better for the sender, we show that the receiver might prefer coarse communication. We show this by analyzing a game of advice seeking, where the receiver has the ability to choose the size of the signal space.


Persuasion with Coarse Communication ∗
Yunus C. Aybas † ⓡ Eray Turkel ‡
July 20, 2020

∗ We are grateful to Avidit Acharya, Steven Callander, Mine Su Erturk, Françoise Forges, Matthew Gentzkow, Edoardo Grillo, Matthew Jackson, Semih Kara, Tarik Kara, Emin Karagozoglu, Cem Tutuncu, Robert Wilson, Kemal Yildiz, and seminar participants at Stanford University and Bilkent University for helpful comments. Author names are randomized via AEA author randomization tool. Randomization can be verified at .
† Stanford University, Department of Economics.
‡ Stanford University, Graduate School of Business.
arXiv [econ.TH]

Introduction

Communication is difficult. This is especially true when the content is very complicated, and the messages are relayed through coarse or imperfect channels. Financial analysts make recommendations to their clients about taking long or short positions on assets. Some firms give simple recommendations such as ‘buy’, ‘sell’ or ‘hold’, while others give more fine-grained advice, including ‘strong buy’ or ‘strong sell’. Credit rating agencies use letter grades with the goal of communicating the riskiness of an investment. What is the effect of this coarseness in communication on agents who are interacting strategically? Under what conditions would it be optimal to send or receive information through coarse channels?

Motivated by these questions, this paper studies an information design problem between two rational agents, a sender and a receiver, in a setting with coarse communication. The difficulty in communication arises because the underlying space of possible states of the world is large relative to the set of signals that can be used to describe it.

The framework of Bayesian Persuasion (Kamenica and Gentzkow 2011), and in general, information design (Bergemann and Morris 2016), analyzes strategic communication between agents who might have misaligned preferences. In the canonical model, the sender designs the informational environment of the receiver through a signaling scheme, creating beliefs that will induce desirable actions. The ability of the sender to commit to a signaling scheme distinguishes this framework from the literature on cheap talk, where similar restrictions on signal spaces have been studied by Jager et al. (2011), among others.

We will start by giving an overview of our contributions to the rapidly expanding field of information design and Bayesian Persuasion, and then provide a review of related work to explain how our results complement the existing literature.

First, we generalize the Bayesian Persuasion framework to settings where the signal space is coarse, i.e., limited in its cardinality. The standard model assumes the existence of a large signal space, which is rich enough to describe the state of the world perfectly, or induce all possible actions, depending on which one of the constraints is binding. This assumption is used to show the existence of a solution using Carathéodory's theorem and other tools from convex analysis. We establish the existence and describe the properties of a sender-optimal information structure in a setting where the signal space is cardinality-constrained, hence the standard tools for proving existence cannot be used.

Second, we analyze the effects of coarse communication for the sender and the receiver. We show that a larger signal space always weakly improves the sender's utility, so a sender would be willing to pay to get access to an additional signal. We call the sender's willingness to pay for an additional signal the price of precision, and provide an upper bound for it which applies to any finite Bayesian Persuasion game. The upper bound result is derived by using a novel insight linking higher and lower dimensional information structures: namely, given a higher dimensional information structure, we can combine some of the induced posteriors while still maintaining Bayes Plausibility, and create lower dimensional information structures. Doing this in a systematic way, we can show an upper bound on the gap in achievable utilities between optimal information structures under different cardinality constraints. We also analyze how the price of precision depends on the location of the prior, and the difficulty (for the sender) of inducing beneficial actions while maintaining Bayes Plausibility. We show that the price of precision can be non-monotonic: e.g., the second signal can be more valuable than the third one, or vice versa.

Next, we show that the effect of additional signals on receiver's utility is ambiguous in general.
We analyze a game of optimal advice seeking, where the receiver has the ability to choose the size of the signal space. Intuitively, this is a setting where the receiver can ask for simple or complicated recommendations from the sender. This framework can capture situations where the receiver has power over the communication procedure. We show through an example that there exist equilibria where the receiver optimally chooses to ask for ‘simple advice’ with fewer action recommendations. Through our example we also show that restricting the cardinality of the signal space might not lead to less informative information structures, in the sense of Blackwell informativeness.

Finally, our analysis of this problem makes a theoretical contribution by developing a novel method for finding sender-optimal information structures in persuasion games. A key insight we develop is using Choquet's Theorem to analyze optimal information structures, represented as probability measures over the extreme points of low-dimensional simplices. This characterization is closely related to the study of generalized barycentric coordinates (Warren 1996, 2003; Warren et al. 2007). This approach allows us to solve information design problems where concavification methods cannot be used directly. In addition to our theoretical results, we provide intuitive geometric tools to analyze persuasion games in settings with coarse signal spaces. The concavification method developed in Kamenica and Gentzkow (2011) analyzes the convex hull of the hypograph of the sender utility, which can be inspected to understand the properties of optimal information structures. We define the related concept of the k-convex hull of a set, which is the set of points that can be represented as the convex combination of at most k points.
We show that the set of achievable utilities for a sender in a game where the signal space has cardinality $|S| = k$ is given by the k-convex hull of the hypograph of sender utility.

Previous work on persuasion games has introduced costs for generating precise information structures, where the costs are usually motivated through information theoretic foundations: an example is assuming that the costs are proportional to the reduction in the entropy of prior beliefs (Gentzkow and Kamenica 2014). This approach still allows the sender to make arbitrarily many action recommendations subject to a cost, and the existence results rely on having a high dimensional signal space. Similarly, limitations to the informativeness of posteriors in a persuasion game can arise endogenously in a setting where the receiver has mental costs associated with processing more informative signals. This phenomenon has been analyzed under various specific preference structures, where the sender chooses to induce less informative posteriors due to increasing costs for paying attention to informative signals on the receiver side (Wei 2018; Bloedel and Segal 2018; Lipnowski and Mathevet 2018).

While we assume exogenous restrictions on the signal space to prove our main results, we provide multiple applications in section 4, where our model can be used to analyze settings in which limitations on the signal space can arise endogenously. Our analysis of advice seeking games, where a receiver determines the cardinality of the signal space, is similar to the setting with binary states and signals analyzed in Ichihashi (2019), in which the receiver limits the Blackwell informativeness of the signals. As we will see in one of our examples, optimal information structures under different cardinality constraints are not always Blackwell comparable, so using Blackwell informativeness constraints and cardinality constraints will lead to different outcomes in general.

In a paper related to ours, Dughmi et al. (2016) examine the properties of a persuasion game with a restricted number of signals, but in the specific context of bilateral trade with assumptions on the underlying preference structure. They also prove the NP-hardness of approximating optimal sender utility in general persuasion games with coarse signals. Our focus is on proving existence, characterizing the properties of the sender-optimal information structure and analyzing the various implications of coarse communication rather than the computational complexity of calculating the equilibrium sender utility.

Two recent papers analyze noisy persuasion games with similar motivating questions. Le Treust and Tomala (2019) study a repeated game of persuasion, where the sender has limited opportunities to intervene and send information through a noisy and cardinality-constrained channel. While they don't prove the existence of a maximum, their main result is an upper bound on achievable utilities by the sender. They also show that this bound is reached in the limit where the number of repetitions of the underlying game approaches infinity. Their result can be modified to apply to our setting with noiseless channels and a coarse signal space, giving an upper bound on the utility of the sender. This asymptotic result is shown by making an elegant connection to Shannon's coding theorem. Similarly, Tsakas and Tsakas (2018) focus on persuasion through noisy communication channels in a single persuasion game. They show that the effect of noise on sender utility is ambiguous in general, and within the class of symmetric noisy communication channels, more noise makes the sender worse off.

Our theoretical results complement the asymptotic framework of Le Treust and Tomala (2019): we focus on a single game and prove the existence of an optimal solution and give a sharp characterization of its properties.
We also provide an upper bound result on achievable utilities with cardinality constraints, which provides a bound on the loss of utility due to coarseness in communication that applies to all finite persuasion games. We also show that coarse communication always makes the sender worse off, as opposed to the case with noise where the effect is ambiguous, as is shown in Tsakas and Tsakas (2018). Our analyses of the value of precision and games of advice seeking also provide substantive applications for constrained persuasion games in various market settings.

While noisy channels make communication between parties more difficult, the restrictions on implementable information structures are different compared to cardinality constraints. Le Treust and Tomala (2019) show that asymptotically, all that matters for sender utility is the channel's capacity, which is affected both by the inherent noise in communication and the cardinality of the signal space. However, noisy and coarse signals have substantively different implications on the optimal information structure and achievable utilities for the sender in finite games. Noise prevents the sender from inducing posteriors where the receiver is certain about the state of the world, and there are no explicit restrictions on the number of inducible actions. Thus, the receiver can never be perfectly informed and there is always residual uncertainty in beliefs. With cardinality constraints, while the sender can induce informative posteriors, it's never possible to perfectly inform the receiver about all states of the world at the same time. Thus, the sender has to prioritize some of the actions that can be induced with its limited capabilities while also maintaining Bayes Plausibility, which leads to different outcomes.

Mathematically, with noisy channels, the sender's choice is restricted to information structures in which posteriors are not too close to the extreme points of the simplex.
With cardinality constraints, there are no restrictions on the locations of the posteriors, but the sender's problem reduces to optimally choosing a lower dimensional object embedded in a higher dimensional probability simplex (e.g., a line segment within the 3-simplex, or a triangle within the 4-simplex). This is also why we cannot use the intuitive concavification approach in our setting. Suppose the signal space is constrained to have cardinality $K$. It will not be possible to achieve all utility levels on the convex hull of the hypograph of sender utility. Specifically, if a utility level can only be achieved as a convex combination of more than $K$ points, it will not be implementable in our setting. This insight will be clarified when we define the concept of a K-convex hull.

The rest of the paper is organized as follows. Section 2 provides a simple example and highlights some of the insights that will be analyzed in later sections. We introduce our model and provide our existence results in section 3. Section 4 provides applications for our model, where we analyze the value of precise communication, optimal advice seeking, and preferences for simple signals in persuasion. We conclude in section 5. All proofs and additional results appear in the appendix.

A simple example: Financial Advice

We analyze a simple example with 3 states and 3 actions. The sender is a financial institution and the receiver is a risk neutral customer, looking for advice on a financial position. The customer can take a long or short position on an asset, or do nothing. There is a fixed amount of the asset that the customer can choose to buy or sell. The value of the asset can increase, in which case the optimal action is to take a long position; it can decrease, in which case the optimal action is a short position; or it can hold steady, in which case the optimal action is doing nothing. Suppose, for simplicity, that the value of the asset can increase or decrease by 1. There is also a risk free asset which the customer can purchase as an outside option, which provides a small return of $r = 0.3$. In addition, the institution can charge commissions to the customer, denoted by $c$, which can be any real number. The payoff of the institution is the commission it can charge. The payoff for the customer is 0 if no action is taken, 1 if the correct position is taken, and $-1$ if the wrong position is taken.

Let $p_+, p_0, p_-$ denote the common beliefs that the asset's value will increase, hold steady, or decrease, respectively, where $p_+ + p_0 + p_- = 1$. The sender (financial institution) and the receiver (customer) share a prior $\mu_0$ which is in the interior of the three dimensional simplex. The sender commits to a signaling mechanism, using signals from a finite set $S$. A signaling mechanism specifies a probability distribution over $S$, one for each realization of the (uncertain) state of the world. The sender commits to this strategy prior to the realization of the state, and cannot change it afterwards. The receiver observes the signal (not the state of the world), and uses Bayesian updating to obtain posterior probabilities of each state conditional on the observed signal. It should be noted that signals $s \in S$ do not have an intrinsic meaning, but obtain their meaning in equilibrium via the announced signaling mechanism. After the signal is realized and the posterior beliefs are formed, the sender decides on the commission that will be charged. Finally, the receiver chooses their action.

Formally, the receiver's expected payoff will be $p_+ - p_- - r - c$ when taking a long position, and $p_- - p_+ - r - c$ when taking a short position. For any given belief, the receiver will choose to take a position over doing nothing if and only if $|p_- - p_+| - r - c \geq 0$. The sender can therefore extract all the surplus by optimally setting $c = |p_- - p_+| - r$. Note that the commissions will be higher if the posterior beliefs approach the extreme points of the simplex. Intuitively, the financial institution can charge higher commissions for inducing more precise posteriors, from customers that will have very optimistic or very pessimistic beliefs about the asset.

With $|S| = 3$, finding the optimal information structure and calculating the maximum sender utility achievable is easily done by inspecting the concavification of sender utility. The optimal information structure will induce beliefs on the extreme points of the simplex: the customer is absolutely sure about what will happen to the asset upon receiving a signal. Letting the prior be $\mu_0 = (p_+, p_0, p_-) = (0.3, 0.4, 0.3)$, the optimal information structure will induce $(1, 0, 0)$ with probability 0.3, $(0, 1, 0)$ with probability 0.4, and $(0, 0, 1)$ with probability 0.3.

With $|S| = 2$, solving for the optimal information structure is not straightforward: we can no longer use concavification. It turns out that the optimal information structure in this case induces the belief $(1, 0, 0)$ with probability 0.3, and induces the belief $(0, 0.57, 0.43)$ with probability 0.7. The customer is still willing to take a short position, but the beliefs are now less extreme compared to the case with three signals. The resulting expected utility for the institution is therefore lower.
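The comparison between the two information structures can be checked numerically. The following is a minimal sketch, assuming a return of r = 0.3, a prior of (0.3, 0.4, 0.3), and a sender payoff equal to the surplus-extracting commission (zero when the customer does nothing); the function and variable names are illustrative and do not come from the paper.

```python
import math

R = 0.3                   # assumed outside-option return r
PRIOR = (0.3, 0.4, 0.3)   # assumed prior (p_plus, p_zero, p_minus)

def sender_utility(belief, r=R):
    """Commission extracted at a posterior: max(|p_minus - p_plus| - r, 0)."""
    p_plus, _, p_minus = belief
    return max(abs(p_minus - p_plus) - r, 0.0)

def expected_utility(structure, r=R):
    """structure is a list of (probability, posterior) pairs."""
    return sum(prob * sender_utility(mu, r) for prob, mu in structure)

def mean_posterior(structure):
    """Probability-weighted average of the posteriors."""
    return tuple(sum(prob * mu[i] for prob, mu in structure) for i in range(3))

three_signals = [(0.3, (1, 0, 0)), (0.4, (0, 1, 0)), (0.3, (0, 0, 1))]
two_signals = [(0.3, (1, 0, 0)), (0.7, (0, 4 / 7, 3 / 7))]

for name, structure in [("three signals", three_signals), ("two signals", two_signals)]:
    # Bayes Plausibility: the posteriors must average back to the prior.
    assert all(math.isclose(m, p) for m, p in zip(mean_posterior(structure), PRIOR))
    print(name, round(expected_utility(structure), 4))
```

Under these assumptions the sender's expected utility is about 0.42 with three signals and about 0.30 with two, which is consistent with the claim that the coarser structure leaves the institution worse off.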

Figure 1: The financial advice example. The first plot shows the sender utility function over the simplex. The second plot shows the concavification of sender utility, where the red dot corresponds to the maximum utility achievable by the sender. The third plot shows the optimal information structure for the sender using two signals. The black dot corresponds to the maximum utility achievable with 2 signals, shown together with the maximum utility achievable with 3 signals.

The example demonstrates some of the key insights that will be generalized in this paper. First, the induced beliefs are located on the boundaries of the regions where the receiver's action is fixed. This is developed further in section 3.2. Second, the search for an optimal information structure is equivalent to searching for the highest value achievable by taking the convex combination of two points from the graph of the sender utility function. We formalize this insight in section 3.3. Third, the utility achievable by the sender is lower with two signals, hence the sender would be willing to pay to get access to additional signals. The loss in sender utility will be defined as the ‘value of precision’ in section 4.1, where we analyze it in more detail and provide upper bounds.
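The second insight, that the two-signal problem is a search over convex combinations of two points on the graph of sender utility, can be illustrated with a brute-force search. This is only a sketch on a coarse grid, not the paper's method: it assumes r = 0.3, a prior of (0.3, 0.4, 0.3), and the commission-based sender utility of the example; all names are hypothetical. Exact rational arithmetic is used so that feasibility checks are not affected by rounding.

```python
from fractions import Fraction as F

def sender_utility(belief, r):
    """Commission extracted at a posterior: max(|p_minus - p_plus| - r, 0)."""
    p_plus, _, p_minus = belief
    return max(abs(p_minus - p_plus) - r, F(0))

def simplex_grid(steps):
    """All beliefs over three states whose entries are multiples of 1/steps."""
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            yield (F(i, steps), F(j, steps), F(steps - i - j, steps))

def best_two_point(prior, r, steps=10):
    """Grid search over (tau, mu1); mu2 is pinned down by Bayes Plausibility."""
    best_value, best_structure = sender_utility(prior, r), None
    for mu1 in simplex_grid(steps):
        for t in range(1, steps):
            tau = F(t, steps)
            # Solve tau * mu1 + (1 - tau) * mu2 = prior for the second posterior.
            mu2 = tuple((p - tau * q) / (1 - tau) for p, q in zip(prior, mu1))
            if min(mu2) < 0:
                continue  # mu2 left the simplex, so this pair is infeasible
            value = tau * sender_utility(mu1, r) + (1 - tau) * sender_utility(mu2, r)
            if value > best_value:
                best_value, best_structure = value, ((tau, mu1), (1 - tau, mu2))
    return best_value, best_structure

prior = (F(3, 10), F(4, 10), F(3, 10))
value, structure = best_two_point(prior, r=F(3, 10))
print(value, structure)
```

Because the assumed prior and payoffs are symmetric in the increase and decrease states, the search may return the mirror image of the structure described in the text; both attain the same value.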

Model and Results

There are two agents, a sender and a receiver, who are communicating about an uncertain state of the world. The state of the world $\omega$ can take values from a finite set $\Omega$, which has cardinality $|\Omega| = n$. There are finitely many actions $a \in A$ that can be taken by the receiver, where $|A| = m$. The two agents have utility functions which depend on the state of the world and the receiver's action, respectively denoted by $u_S, u_R : \Omega \times A \to \mathbb{R}$. The agents share a prior belief about the state of the world, $\mu_0$, which is assumed to be in the interior of $\Delta(\Omega)$, denoted by $\mathrm{int}(\Delta(\Omega))$. It is common knowledge that the agents hold a shared prior. The sender chooses a signaling policy which is a collection of conditional probability mass functions $\{\pi(\cdot \mid \omega)\}_{\omega \in \Omega}$ over the signal space $S$ with cardinality $|S| = k$. Critically, we assume $k < \min\{m, n\}$. With this assumption we focus on coarse communication. Thus, the sender cannot induce all possible actions or describe the state of the world perfectly, and has to decide which actions to induce through coarse communication while also maintaining Bayes Plausibility.

Each signal realization $s \in S$ induces a posterior which is formed through Bayesian updating: the receiver observes a signal realization $s \in S$ and forms a posterior belief $\mu_s \in \Delta(\Omega)$. Hence, we can think of the collection $\{\pi(\cdot \mid \omega)\}_{\omega \in \Omega}$ as inducing a probability measure over posteriors. We denote this probability measure over posteriors by $\tau \in \Delta(\Delta(\Omega))$. $\tau$ is a discrete probability measure with support $\mathrm{supp}(\tau) = \boldsymbol{\mu} = \{\mu_s\}_{s \in S}$. Throughout the paper, $\tau$ will be called an information structure. Naturally, by the restrictions imposed above, we will have $1 < |\mathrm{supp}(\tau)| \leq k$. The vector of probabilities of inducing a posterior belief that is in the support of the information structure $\tau$ will be denoted by $\tau(\boldsymbol{\mu})$.
Formally, for $\mu_s \in \mathrm{supp}(\tau)$, the probability that $\mu_s$ will be induced is given by $\tau(\mu_s) = \sum_{\omega \in \Omega} \pi(s \mid \omega)\, \mu_0(\omega)$.

After forming the posterior $\mu_s$, the receiver chooses an action from the set $\hat{A}(\mu_s) = \arg\max_{a \in A} E_{\omega \sim \mu_s} u_R(a, \omega)$. If the receiver is indifferent between multiple actions, we assume that the indifference is resolved by picking the action that is preferred by the sender. If there are multiple such elements that maximize the sender's utility, we pick an element from $\hat{A}(\mu_s)$ arbitrarily.

Sender's utility when the posterior $\mu_s$ is induced will be $\hat{u}_S(\mu_s) = E_{\omega \sim \mu_s} u_S(\hat{a}(\mu_s), \omega)$. Similarly, receiver's utility will be $\hat{u}_R(\mu_s) = E_{\omega \sim \mu_s} u_R(\hat{a}(\mu_s), \omega)$. The expected utility of the sender under $\tau$ is denoted by $E_{\mu_s \sim \tau}\, \hat{u}_S(\mu_s) : \Delta(\Delta(\Omega)) \to \mathbb{R}$. We similarly define the expected receiver utility under $\tau$ by $E_{\mu_s \sim \tau}\, \hat{u}_R(\mu_s)$.

On notation: $\Delta(\Omega)$ denotes the simplex over $\Omega = \{\omega_1, \omega_2, \ldots, \omega_n\}$. $\mu_s$ denotes the posterior induced by $s$, which is a generic element of $S$, and $\mu_i$ denotes the $i$-th entry of $\boldsymbol{\mu} = \mathrm{supp}(\tau)$. So we use $\mu_i$ to refer to a specific entry of $\boldsymbol{\mu}$ and $\mu_s$ to refer to the generic posterior the receiver forms upon observing a generic signal $s \in S$. The notation $E_{\omega \sim \mu_s}$ is used to denote the expectation over the random variable $\omega$ taken with respect to the measure $\mu_s$. When the random variable is clear, we will just use the measure that gives the probability distribution on the subscript.

For a distribution of posteriors to be feasibly induced in the persuasion game with shared priors, we need the expected value of the posterior beliefs to be equal to the prior belief. This is the only restriction imposed by Bayesian Rationality (Kamenica and Gentzkow 2011), which we can state formally by $E_{\mu_s \sim \tau}\, \mu_s = \sum_{\mu_s \in \mathrm{supp}(\tau)} \mu_s\, \tau(\mu_s) = \mu_0$.
The sender's goal is therefore finding the optimal $\tau$, which is described by the problem:

$$\max_{\tau \in \Delta(\Delta(\Omega))} E_{\mu_s \sim \tau}\, \hat{u}_S(\mu_s) \quad \text{subject to} \quad |\mathrm{supp}(\tau)| \leq k \ \text{ and } \ E_\tau(\mu_s) = \mu_0$$

Formulating the sender's problem as a search for an optimal information structure $\tau$ rather than a search for signal functions $\{\pi(\cdot \mid \omega)\}_{\omega \in \Omega}$ makes the problem more tractable. Given a feasible information structure $\tau$ and corresponding probabilities for each posterior belief $\{\tau(\mu_s)\}_{s \in S}$ in our model, we can always find the related signal functions by writing $\pi(s \mid \omega) = \mu_s(\omega)\, \tau(\mu_s) / \mu_0(\omega)$. The more interesting problem of finding the probability measure $\{\tau(\mu_s)\}_{s \in S}$ that makes $\tau$ Bayes Plausible given only the posterior beliefs $\boldsymbol{\mu} = \{\mu_1, \ldots, \mu_k\}$ will also be discussed when we present our existence result. We will show that these probabilities are uniquely defined if $\boldsymbol{\mu}$ consists of affinely independent posteriors.

The constraint on the signal space makes solving the sender's problem considerably more difficult compared to the standard Bayesian Persuasion framework. Note that it is no longer possible to use Carathéodory's theorem to show the existence of an optimal signal. The achievable set of utilities can shrink considerably for the sender, compared to the baseline model with unrestricted communication.

One simple implication of limiting the dimensionality of the signal space is that it is not possible to induce posterior distributions supported exclusively on the extreme points of the $n$-dimensional simplex $\Delta(\Omega)$. This is because the convex hull of $k < n$ extreme points of the simplex cannot include $\mu_0$, which is assumed to be in $\mathrm{int}(\Delta(\Omega))$. The limitation also constrains the sender's ability to make action recommendations: the sender can no longer create information structures that will induce all possible actions. Thus the sender must decide which actions are worth inducing with the limited signals they can use.
We will show how we can find the optimal information structure through a constructive proof, and present its properties. Existence can also be shown by using the upper semi-continuity of sender utility over the search space and showing the compactness of the set of Bayes Plausible information structures inducible with coarse signals. Our approach provides additional insights about the properties of lower dimensional optimal information structures and gives us an explicit method for finding them.

We use the underlying preference structure for the sender and the receiver to simplify the search for an optimal information structure $\tau$. Formally, we can define subsets of $\Delta(\Omega)$ where the receiver's action is constant, and use the fact that sender utility is convex within these subsets. The properties presented in lemmas 1 and 2 have been applied in the context of persuasion games where the receiver has psychological preferences over different posterior beliefs (see Lipnowski and Mathevet (2017, 2018); Volund (2018)).
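The passage from an information structure back to signal functions, and the Bayes Plausibility constraint it relies on, can be sketched in a few lines. This is an illustrative sketch only: the prior and the two-posterior structure below are assumed values chosen for the demonstration, and the function names are hypothetical.

```python
def signal_policy(prior, structure):
    """Recover pi(s | w) = mu_s(w) * tau(mu_s) / prior(w).

    structure is a list of (tau_s, mu_s) pairs; the result is indexed pi[s][w].
    """
    return [[tau * mu[w] / prior[w] for w in range(len(prior))]
            for tau, mu in structure]

def posterior(prior, pi, s):
    """Bayesian update of the prior after observing signal s."""
    joint = [pi[s][w] * prior[w] for w in range(len(prior))]
    total = sum(joint)  # this is tau(mu_s), the probability of signal s
    return [x / total for x in joint]

# Assumed illustrative values: a three-state prior and a two-posterior structure.
PRIOR = (0.3, 0.4, 0.3)
STRUCTURE = [(0.3, (1.0, 0.0, 0.0)), (0.7, (0.0, 4 / 7, 3 / 7))]

POLICY = signal_policy(PRIOR, STRUCTURE)

# For each state, the signal probabilities sum to one exactly when the
# structure is Bayes Plausible (posteriors average back to the prior).
column_sums = [sum(POLICY[s][w] for s in range(len(STRUCTURE)))
               for w in range(len(PRIOR))]
print(column_sums)
print(posterior(PRIOR, POLICY, 1))
```

Updating the prior with the recovered policy returns the posteriors that were fed in, which is the sense in which the two formulations of the sender's problem are interchangeable.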

Definition 1. The set $R_a \subseteq \Delta(\Omega)$ is the set of beliefs where the action $a$ is receiver-optimal: $R_a = \{\mu_i \in \Delta(\Omega) : a \in \hat{A}(\mu_i)\}$. $\mathcal{R} = \{R_a\}_{a \in A}$ is the collection consisting of these sets for every action $a \in A$.

Lemma 1. For every action $a \in A$, the set $R_a$ is closed and convex.

Lemma 2. The sender's utility $\hat{u}_S$ is convex when restricted to each set $R_a$.

Lemma 1 follows from the fact that each $R_a$ can be written as the intersection of finitely many closed half spaces. The proof of lemma 2 uses the definition of $\hat{u}_S$, which is a function of sender-optimal actions at every belief. For any two points $\mu', \mu''$ in a given $R_a$, let the sender-optimal action at their convex combination $\mu^\lambda$ be $\hat{a}(\mu^\lambda)$. This action must be among the set of receiver-optimal actions for the two original points. Since the action $\hat{a}(\mu)$ is defined as the action that maximizes sender utility among the set of receiver-optimal actions $\hat{A}(\mu)$, and we have $\hat{a}(\mu^\lambda) \in \hat{A}(\mu')$ and $\hat{a}(\mu^\lambda) \in \hat{A}(\mu'')$, convexity of $\hat{u}_S$ follows.

Let us define beneficial information structures as $\tau$ with $E_\tau(\hat{u}_S) \geq \hat{u}_S(\mu_0)$. These are information structures that give the sender weakly higher utility compared to the default action, which can be achieved by sending no information. Throughout the paper, we will maintain the assumption that beneficial information structures exist: the other case is trivial and the sender always prefers sending no information.

The first two lemmas show us that in the subspace where the receiver's action is fixed, the sender prefers inducing mean-preserving spreads in beliefs. In the model with unrestricted communication, these properties reduce the search for an optimal information structure to a more tractable optimization problem, since the optimal information structure must be supported by the outer points of the sets $\mathcal{R} = \{R_a\}_{a \in A}$ as described in Lipnowski and Mathevet (2017). With coarse communication, we can prove a similar result. The next lemma formally states that an information structure can always be weakly improved by changing it in a way that maintains Bayes Plausibility and moving all posteriors to the boundaries of an action region.
In other words, the sender can restrict their search to posteriors that make the receiver indifferent between multiple actions (and posteriors located on the boundaries of the n-simplex $\Delta(\Omega)$), with no loss in utility. This result reduces the size of our search space considerably, and provides tractability in higher dimensional problems.

In the appendix, we also establish that sender utility is a continuous and piecewise linear function in the interior of these sets. $\mathcal{R}$ is a finite cover of $\Delta(\Omega)$.

Lemma 3. Let $\tau$ be a feasible distribution of posteriors satisfying Bayes Plausibility, that is also beneficial for the sender. Suppose that $\exists\, \mu_a \in \mathrm{supp}(\tau)$ such that $\mu_a \in \mathrm{int}(R_a)$ for some $R_a \in \mathcal{R}$. Then, there exists a $\mu_k \in \mathrm{Bd}(R_a)$ and a Bayes Plausible $\tau' \neq \tau$ where $\mathrm{supp}(\tau') = (\mathrm{supp}(\tau) \setminus \{\mu_a\}) \cup \{\mu_k\}$ such that $E_{\tau'} \hat{u}_S \geq E_\tau \hat{u}_S$.

An immediate corollary of lemma 3 is the following result.

Corollary 1. The sender's search for an optimal information structure can be restricted to information structures $\tau$ with the following property: $\forall \mu_s \in \mathrm{supp}(\tau)$, $\nexists\, R_a \in \mathcal{R}$ such that $\mu_s \in \mathrm{int}(R_a)$.

The proof explicitly constructs the information structure $\tau'$, and uses the convexity of $\hat{u}_S$ within each $R_a$. The outline of the argument is the following. Let $\tau$ be our original Bayes Plausible information structure, with the corresponding probabilities $\{\tau(\mu_i)\}_{i \leq k}$ where $\sum_{i \leq k} \tau(\mu_i)\, \mu_i = \mu_0$. Let $\mu_a \in \mathrm{supp}(\tau)$ be in the interior of some $R_a$ and define the ray originating from $\mu_0$ and passing through $\mu_a$. This ray will intersect the boundary of $R_a$ at two points $\mu'$ and $\mu''$, since $R_a$ is compact and convex. By convexity of $\hat{u}_S$ within $R_a$, sender utility must be weakly higher at one of those two points. First, we show that we can replace $\mu_a$ with one of these two points and still maintain Bayes Plausibility. Since we're changing $\mu_a$ along the ray defined above, we can change $\{\tau(\mu_i)\}_{i \leq k}$ in a way that maintains Bayes Plausibility. Note that greedily replacing $\mu_a$ with the point that provides higher utility within $R_a$ might not always improve the expected sender utility $E_\tau(\hat{u}_S)$, and the overall effect of this change depends on the relative positions of $\mu_0$, $\mu_a$, $\mu'$ and $\mu''$, and the change in the probabilities $\{\tau(\mu_i)\}_{i \leq k}$ that will maintain Bayes Plausibility. A carefully constructed argument relying on convexity shows that replacing $\mu_a$ with either $\mu'$ or $\mu''$ will always yield weakly higher expected utility, where the decision on which point to choose depends on the changes in $\{\tau(\mu_i)\}_{i \leq k}$.

While this simplifies the search, to solve and characterize the sender maximization problem in a tractable way, we still need to understand how the probabilities $\{\tau(\mu_i)\}_{i \leq k}$ change as we make changes to the beliefs in $\mathrm{supp}(\tau)$ under the restriction that $E_\tau(\mu_s) = \sum_{i \leq k} \tau(\mu_i)\, \mu_i = \mu_0$.
Each Bayes Plausible information structure $\tau$ defines a lower dimensional compact convex polytope embedded in the space $\Delta(\Omega) \subset \mathbb{R}^n$, with the extreme points $\mathrm{supp}(\tau)$. Bayes Plausibility implies $\mu_0$ must be in the convex hull of the information structure, $\mu_0 \in \mathrm{co}(\mathrm{supp}(\tau))$, with the representation $\sum_{i \le k} \tau(\mu_i)\mu_i = \mu_0$, where co denotes the convex hull operator. This representation can be thought of as a discrete probability measure over the convex polytope $\mathrm{co}(\mu_1, \ldots, \mu_k)$, with positive values only on the extreme points. The probability measure $\{\tau(\mu_i)\}_{i \le k}$ may not be unique in general. However, if $\mathrm{supp}(\tau)$ consists of affinely independent beliefs, we can show that the representation is, indeed, unique using Choquet's Theorem.

We proceed by showing that we can restrict our search of optimal information structures to the set of affinely independent information structures without any loss. The next lemma shows that any affinely dependent information structure can be modified by dropping some beliefs to reach affine independence, increasing sender utility and maintaining Bayes Plausibility at every step. The proof is independent of lemma 3 and holds for a general case of information design problems with or without constrained signal spaces.

Lemma 4. Let $\tau$ be a feasible distribution of posteriors satisfying Bayes Plausibility. Suppose that $\mathrm{supp}(\tau)$ is not affinely independent. Then, there must exist a Bayes Plausible $\tau' \neq \tau$ such that $\mathrm{supp}(\tau')$ is affinely independent and $E_{\tau'} \hat{u}_S \ge E_{\tau} \hat{u}_S$.

Intuitively, for the sender, inducing affinely dependent beliefs is not a good use of signals because some beliefs are redundant. The proof details how we can always find a belief that is optimal to drop from the information structure. Because the beliefs can be written as an affine combination of each other, we can always choose a belief to drop such that the change in $\{\tau(\mu_i)\}_{i \le k}$ guarantees weakly higher sender utility. We use the relationship between the convex weights characterizing $\mu_0$, which are $\{\tau(\mu_i)\}_{i \le k}$, and the set of affine weights that allows us to characterize beliefs in terms of each other.

Lemma 4 states that we can restrict our search to affinely independent information structures, or in other words, lower dimensional simplices contained in the n-simplex $\Delta(\Omega)$. This gives us the uniqueness of the probability measure $\{\tau(\mu_i)\}_{i \le k}$ representing $\mu_0$ through Choquet's theorem. The statement of this well known result (e.g., see Alfsen (1965)) is as follows.

Theorem (Choquet Theorem). Suppose that $P$ is a metrizable compact convex subset of a locally convex Hausdorff topological vector space, and that $\mu$ is an element of $P$. Then there is a probability measure $\tau$ on $P$ which represents $\mu$, i.e. $\sum_{p \in P} \tau(p)\, p = \mu$, s.t. $\mathrm{supp}(\tau) = \mathrm{Ext}(P)$, where $\mathrm{Ext}(P)$ denotes the extreme points of $P$. Furthermore, if $\mathrm{Ext}(P)$ is affinely independent, this probability measure $\tau$ is unique.

We turn to the question of how this probability measure changes as we change beliefs in $\mathrm{supp}(\tau)$. If we perturb the set of beliefs induced (while maintaining Bayes Plausibility), we would like to be able to analyze how the corresponding probability of inducing each belief changes.
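The uniqueness of the weights on an affinely independent support can be checked numerically. The sketch below, under stated assumptions, tests affine independence with a rank condition and then recovers the weights by solving the linear system $\sum_i \tau(\mu_i)\mu_i = \mu_0$, $\sum_i \tau(\mu_i) = 1$. The function names and the example beliefs are hypothetical, and the linear solve stands in for the Choquet-based argument of the text rather than reproducing it.

```python
import numpy as np


def affinely_independent(points):
    """True if the points are affinely independent (rank test on differences)."""
    pts = np.asarray(points, dtype=float)
    if len(pts) == 1:
        return True
    return np.linalg.matrix_rank(pts[1:] - pts[0]) == len(pts) - 1


def bayes_plausible_weights(prior, posteriors, tol=1e-9):
    """Weights tau with sum_i tau_i * mu_i = prior and sum_i tau_i = 1.

    Returns None when the prior is not a convex combination of the posteriors.
    For an affinely independent support the returned weights are unique.
    """
    pts = np.asarray(posteriors, dtype=float)
    # Stack the belief coordinates with a row of ones for the sum-to-one constraint.
    system = np.vstack([pts.T, np.ones(len(pts))])
    target = np.append(np.asarray(prior, dtype=float), 1.0)
    weights, *_ = np.linalg.lstsq(system, target, rcond=None)
    if not np.allclose(system @ weights, target, atol=tol) or np.any(weights < -tol):
        return None
    return weights


# Hypothetical two-signal support around the uniform prior over three states.
prior = [1 / 3, 1 / 3, 1 / 3]
support = [[0.8, 0.1, 0.1], [0.0, 0.5, 0.5]]
print(affinely_independent(support))            # the two beliefs are distinct
print(bayes_plausible_weights(prior, support))  # the unique weights on them
```

With a support that is not affinely independent, for example three collinear beliefs, the rank test fails and the linear system generally admits more than one solution, which is the redundancy that lemma 4 removes.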
We can analyze this change by using the fact that the convex hull of the posterior beliefs induced is a compact and convex polytope.

Lemma 5. Let $\mu_0 \in \mathrm{int}(\Delta(\Omega))$, and define $\zeta \subset \mathbb{R}^{k \times N}$ as the set of affinely independent sets of posteriors with cardinality $k$ that are Bayes Plausible. $\forall \mu = (\mu_1, \ldots, \mu_k) \in \zeta$, there exists a unique probability distribution $\tau \in \Delta(\Delta(\Omega))$ with support $\mu$ and $\sum_{i \le k} \tau(\mu_i)\mu_i = \mu_0$. Moreover, $\tau : \zeta \to \mathbb{R}^k$ is uniformly continuous.

(Existence, smoothness and uniqueness of this probability measure can be analyzed through barycentric coordinates by making use of existing work on generalized barycentric coordinates on convex sets (Warren 1996, 2003; Warren et al. 2007), but we use Choquet Theory as a more convenient tool. A similar result is theorem 19.3 in Rockafellar (1970), which shows that the orthogonal projection of a polyhedral convex set $P \subset \mathbb{R}^N$ on a subspace $L$ is another polyhedral convex set, and that linear maps map polyhedral convex sets to polyhedral convex sets in finite dimensional vector spaces.)

[...] $\mu_0$ and characterizes the change in the weights $\tau(\mu)$ as we change $\mu$ to $\mu'$ as a matrix operation. With this result, we can formulate the sender's problem as a search over Bayes Plausible information structures that are affinely independent, with the added constraint that $\forall \mu_i \in \mathrm{supp}(\tau)$, $\mu_i$ is in the boundary of some $R_a$. The boundary of a given $R_a$ consists of facets of a polytope. Each set $R_a$ will have at most $m$ facets, which can be seen from their definition.

Definition 2. Choose at most $k$ facets from any collection of polytopes from $\mathcal{R} = \{R_a\}_{a \in A}$, and denote them by $F = \{F_i : \exists a \in A,\ F_i \text{ is a facet of } R_a\}$. For a given collection $F$, denote the restriction of the sender utility function $\hat{u}_S$ to a facet $F_i \in F$ by $\hat{\hat{u}}_{S_i}$. Define the set of affinely independent Bayes Plausible information structures that are supported on $F$ by:

$$\zeta_F = \zeta \cap \Big( \times_{i=1}^{k} F_i \Big),$$

where $\times_{i=1}^{k} F_i$ denotes the Cartesian product of the sets $F_i$. Since each $F_i$ is a subset of $\mathbb{R}^n$ and $\zeta$ is a subset of $\mathbb{R}^{k \times N}$, $\zeta_F$ is also a subset of $\mathbb{R}^{k \times N}$.

The above definition allows us to characterize the sender's maximization as a search for the facet collection on which the optimal information structure is supported.

Definition 3. Given a collection $F = \{F_i\}_{i \le k}$, the sender's problem subject to the constraint that the information structure must be supported on the facets $F$ is given by the following:

$$\max_{\tau} \sum_{i \le k} \tau(\mu_i)\, \hat{\hat{u}}_{S_i}(\mu_i) \quad \text{subject to: } \{\mu_1, \ldots, \mu_k\} \in \zeta_F.$$

Denote the maximized value of this problem (if a maximum exists) by $V(F, \mu_0, k)$, with the added convention that $V(F, \mu_0, k) = -\infty$ if the feasible set $\zeta_F$ is empty. This convention is required because it might be impossible to represent $\mu_0$ for some collection of facets.

Definition 4. Let $\mathcal{F}$ denote the finite set of all possible collections $F$. We can characterize the sender's maximization problem as follows:

$$\max_{F \in \mathcal{F}} V(F, \mu_0, k).$$

With this definition, we can show that the sender's maximization problem is well defined and an optimal information structure will always exist.
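The outer maximization over facet collections is a finite search, which can be sketched as a plain enumeration. The snippet below is only a structural illustration: `solve_on_facets` is a hypothetical placeholder for the inner problem $V(F, \mu_0, k)$ and is expected to return `-math.inf` when the prior cannot be represented on the chosen facets, mirroring the convention above. Facets are drawn with replacement because the same facet may appear more than once in a collection. The facet labels and toy values are made up.

```python
import math
from itertools import combinations_with_replacement


def best_over_facet_collections(facets, k, solve_on_facets):
    """Enumerate collections of at most k facets and keep the best value.

    solve_on_facets(collection) is a user-supplied (hypothetical) solver that
    returns the inner maximum, or -math.inf if the feasible set is empty.
    """
    best_value, best_collection = -math.inf, None
    for size in range(1, k + 1):
        for collection in combinations_with_replacement(facets, size):
            value = solve_on_facets(collection)
            if value > best_value:
                best_value, best_collection = value, collection
    return best_value, best_collection


# Toy stand-in for the inner problem: only two collections are feasible.
toy_values = {("f1", "f2"): 0.4, ("f2", "f2"): 0.25}
toy_solver = lambda collection: toy_values.get(collection, -math.inf)

value, collection = best_over_facet_collections(["f1", "f2"], k=2, solve_on_facets=toy_solver)
print(value, collection)
```

Infeasible collections never win the comparison because their value is `-math.inf`, so the search returns a feasible collection whenever one exists.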

Theorem 1. An optimal information structure $\tau \in \Delta(\Delta(\Omega)) \subset \mathbb{R}^{k \times N}$ maximizing the sender objective function exists.

(Note that the definition of the collection $F$ allows the same set $R_a$ to be used arbitrarily many times.)

[...] $F_i$. Moreover, the set of feasible points $\zeta_F$ might not be a compact set. Hence, the proof relies on a non-trivial two-step continuous extension argument. We first define the continuous extension of the sender utility over the closures of the relevant sets, for which a maximum must always exist by an application of a simple topological extreme value theorem. We then show that the original problem must attain the same maximum as the modified problem for the continuous extension, through an application of Theorem 4.

Our analysis of cardinality-constrained signal spaces is also useful for Bayesian Persuasion games with rich signal spaces where agents have preferences for simplicity. In a standard Bayesian Persuasion game where $k \ge \min\{|A|, |\Omega|\}$, suppose the sender cares about the simplicity of the induced information structures, in addition to the utility received. In appendix B.4, we analyze this setting by defining an intuitive preference structure for the sender, and show that affinely independent information structures will be chosen at an equilibrium.

We proceed by showing that the solution to the maximization problem in definition 4 is equivalent to a geometric characterization of the optimum. We call this characterization the k-concavification of sender utility. This will connect our solution technique to the concavification approach widely used in the Bayesian Persuasion literature.

Let $CH(\hat{u}_S)$ denote the convex hull of the hypograph of $\hat{u}_S$, in the space $\mathbb{R}^n$. With unrestricted communication, the point $(\mu_0, z) \in CH(\hat{u}_S) \subset \mathbb{R}^n$ represents a sender payoff $z$ which can be achieved by an information structure when the prior is $\mu_0$.
This is the foundation of the concavification technique, first used in repeated games and then applied to Bayesian Persuasion and information design (Aumann and Maschler 1995; Kamenica and Gentzkow 2011). In canonical persuasion games, the existence of an optimal signal is usually proven by referencing extremal representation theorems from convex analysis. For any $(\mu, z) \in CH(\hat{u}_S)$, Caratheodory's theorem assures the existence of a $\tau$ such that $\mu \in \mathrm{co}(\mathrm{supp}(\tau))$ and $|\mathrm{supp}(\tau)| \le n + 1$, where co denotes the convex hull operator. Note that the last condition prevents us from using this theorem in our setting, because the support it delivers may contain more than $k$ beliefs.

With restricted communication, the point $(\mu, z) \in CH(\hat{u}_S)$ might not be feasible if representing $(\mu, z)$ requires a convex combination of more than $k$ points from the hypograph of $\hat{u}_S$. A prior belief-utility pair $(\mu, z)$ will only be feasible if it can be contained in the convex hull of $k$ or fewer points from the hypograph of $\hat{u}_S$. To represent achievable utilities, therefore, we need the following definition.

(An alternative existence proof is to show that the sender utility can be extended to an upper semi-continuous function defined over a compact set, but the proof we provide is more constructive in nature and yields an algorithm for finding the equilibrium. Regarding dimensions: since $\hat{u}_S : \Delta(\Omega) \to \mathbb{R}$, we can represent any belief $\mu$ with $|\Omega| - 1 = n - 1$ coordinates and $\hat{u}_S(\mu)$ with a real number, so $(\mu, z) \in \mathbb{R}^n$.)

Definition 5. Given a set $A \subseteq \mathbb{R}^n$ and a positive integer $k \le n$, define the set of points that can be represented as the convex combination of at most $k$ points in $A$ as the k-Convex Hull of $A$, denoted $\mathrm{co}_k(A)$. Formally, $a \in \mathrm{co}_k(A)$ if and only if there exists a set of at most $k$ points $\{a_1, \ldots, a_k\} \subseteq A$ and a set of weights $\{\gamma_1, \ldots, \gamma_k\}$ which satisfy $\sum_{i \le k} \gamma_i = 1$ and $\forall i,\ 1 \ge \gamma_i \ge 0$ such that $a = \sum_{i \le k} \gamma_i a_i$. Therefore, we can write:

$$\mathrm{co}_k(A) = \Big\{ a \in \mathbb{R}^n : \exists \{a_1, \ldots, a_k\} \subseteq A,\ \exists \{\gamma_1, \ldots, \gamma_k\} \text{ with } \gamma_i \in \mathbb{R} \text{ s.t. } \sum_{i \le k} \gamma_i = 1 \text{ and } 1 \ge \gamma_i \ge 0,\ a = \sum_{i \le k} \gamma_i a_i \Big\}.$$

Let $CH_k(\hat{u}_S)$ denote the k-convex hull of the hypograph of $\hat{u}_S$, in the space $\mathbb{R}^n$. Note that if $(\mu_0, z) \in CH_k(\hat{u}_S)$, there exists an information structure $\tau$ with $|\mathrm{supp}(\tau)| \le k$ and $E_{\tau}(\hat{u}_S) = z$. Defining $V(\mu_0) = \sup\{z \mid (\mu_0, z) \in CH_k(\hat{u}_S)\}$, we get the largest payoff the sender can achieve when the prior is $\mu_0$. If $V(\mu_0) = z$, then we have $k$ beliefs such that $\sum_{i \le k} \tau(\mu_i)\mu_i = \mu_0$ for some set of weights $\{\tau(\mu_1), \ldots, \tau(\mu_k)\}$ and $\sum_{i \le k} \tau(\mu_i)\hat{u}_S(\mu_i) = z$. This gives us the following equivalence between k-concavification and our previous result.

Theorem 2. Let $\tau$ be the optimal information structure that solves the sender's maximization problem given in definition 4. Then $\sup\{z \mid (\mu_0, z) \in CH_k(\hat{u}_S)\} = E_{\tau} \hat{u}_S$.

Going back to the financial advice example, we can see in figure 2 that the optimal payoff for the sender given $\mu_0$ can be observed by inspecting the 2-convex hull of the sender utility. The comparison with the regular convex hull (3-convex hull) reveals that the achieved utility must be lower. The optimal information structure can thus be determined by inspecting $CH_k(\hat{u}_S) \subset \mathbb{R}^n$.

Figure 2: The supremum of the 3-convex hull and the 2-convex hull of sender utility from the financial advice example. The left figure shows the maximum achievable utility with 3 signals, and the right figure shows the maximum achievable utility with 2 signals as a function of the prior beliefs. The dots correspond to the prior belief given in the example.

We can further analyze the implications of restricting the signal space on the sender's utility. Let $V^*(k, \mu_0)$ be the value the sender objective function attains at $\mu_0$ when the signal space is restricted to have $k$ elements. Then $V^*(k+1, \mu_0) - V^*(k, \mu_0)$ is what the sender would be willing to pay to increase the dimensionality of the signal space by one, given the fixed prior $\mu_0$. This can be intuitively interpreted as the value of precision for the sender. Note that when $k \ge \min\{|\Omega|, |A|\}$, the value of precision will be equal to zero by the results in Kamenica and Gentzkow (2011). Therefore we focus exclusively on the coarse communication setting in which $k < \min\{|\Omega|, |A|\}$.

The value of precision depends on the structure of the sender and receiver utility functions, and the location of the prior belief $\mu_0$. It critically depends on what actions the sender can induce while still maintaining Bayes Plausibility.
If maintaining Bayes Plausibility withlower dimensional signals requires inducing actions with lower payoﬀs, or inducing a pos-terior located in a low-payoﬀ yielding portion of an action region, then the sender will bewilling to pay more for more precise communication.We establish an upper bound on the value of precision, or equivalently, a lower bound onthe utility achievable with k ´ V ˚ p k, µ q and V ˚ p k ´ , µ q , the loss in utility cannot be too high. Theorem 3.

Suppose | S | “ k ě , and the sender utility function u S is positive everywhere.Then, the following upper bound must hold for the value of precision at k ´ signals: V ˚ p k, µ q ´ V ˚ p k ´ , µ q ď k V ˚ p k, µ q Thus, we show that the utility attainable with k ´ k ´ k V ˚ p k q and V ˚ p k q . This provides a lower bound on the utility loss from using smaller signal spaces,as a function of utility achievable with unrestricted communication. Let τ ˚ k and τ ˚ k ´ bethe optimal information structures using k and k ´ τ ˚ k can be ‘collapsed’ to get an information structure with k ´ τ ˚ k ´ . We can construct k diﬀerent k ´ τ ˚ k pairwise and leaving the rest of the posteriors the same as τ ˚ k . Theutilities provided by these new information structures are related to V ˚ p k, µ q , because theycontain k ´ τ ˚ k . The resulting inequalities yieldthe lower bound in theorem 3. We will show that the value of precision can be non-monotone in general. We analyze anexample with 3 states of the world to demonstrate how the behavior of the value of precisioncan depend on the location of the prior. In our example we will see that V ˚ p , µ q ´ V ˚ p , µ q can be greater or less than V ˚ p , µ q ´ V ˚ p , µ q . We will also demonstrate how this diﬀer-ence depends on the diﬃculty of inducing beneﬁcial actions for the sender.Let Ω “ t ω , ω , ω u . There are four actions available to the receiver A “ t a , a , a , a u .We consider a Bayesian Persuasion game where the sender has an optimal action for eachstate and a default safe action. This can be represented with receiver preferences of the form: u R p a, ω i q “ $’&’% a “ a ´ ¯ π ¯ π if a “ a i @ i P t , , u´ a ‰ a i @ i P t , , u These preferences can be used to model situations in which for each state ω i action a i isoptimal, and mismatching the state i.e. taking action a j j ‰ j ‰ i is costly, with costnormalized to unity. Finally, a is the safe action. 
Such receiver preferences lead to actionthresholds over the simplex of posterior beliefs.Let us denote µ s p ω i q by µ is , where µ is is the i th coordinate of a given posterior belief µ s .One can think of µ s p ω q as the probability distribution over Ω induced by µ s .For each state, there is a corresponding preferred action a i which is taken by the receiverif and only if the receiver believes the state of the world is ω i with at least probability¯ π . Speciﬁcally, the receiver prefers action a i P t a , a , a u if and only if the posterior belief µ s P ∆ p Ω q such that µ is ě ¯ π , and prefers a otherwise. Hence, we can say that for i P t , , u , j P t , , , u and j ‰ i we have that E µ s r u R p a i , ω qs ě E µ s r u R p a j , ω qs if and only if µ is ą ¯ π .The action zones for these receiver preferences can be represented as: R i “ t µ s P ∆ p ω q| µ is ě ¯ π u @ ω P Ω, u s p a , ω q “ u s p a i , ω q “

1. Thus, thesender only cares about actions and not the states, and aims to induce the non-default ac-tions. The parameter ¯ π can be interpreted as the diﬃculty of inducing the beneﬁcial actionsfor the sender.Given this structure, it should be obvious that sender can attain a payoﬀ of 1 by using3-signal information structures. This follows from the fact that for every prior µ P ∆ p Ω q with µ “ p µ , µ , µ q the sender can use the information structure p , , q with probability µ , p , , q with probability µ and p , , q with probability µ . This information structurecorresponds to τ p µ s q P ∆ p ∆ p Ω qq with τ pp , , qq “ µ , τ pp , , qq “ µ , τ pp , , qq “ µ .We have that E τ u s p a p ω q , ω q “

1. Every point inside simplex can be represented as the con-vex combination of the extreme points of the simplex, hence achieving the maximal utilitywith 3 signals is possible for every interior prior.With 1-signal information structures (i.e. no information transmission at all), we havethat the payoﬀ sender can achieve is E µ u s p a p µ q , ω q “ µ P R i @ i P t , , u µ thatare in R , as for priors in R i for i P t , , u the maximal payoﬀ can be obtained withno information transmission at all. We deﬁne ∆ c as the set where two-signal informationstructures attain lower payoﬀ than three-signal information structures. The following lemmastates the values of ¯ π such that this set is non-empty. Lemma 6. ∆ c ‰ H if and only if ¯ π ě . For thresholds ¯ π ď , two-dimensional information structures suﬃce for achieving maxi-mal utility. We restrict attention to cases where ¯ π ą . In this regime, we can state that forany prior in ∆ c , the utility attained by two-signal information structures is bounded withintwo values. Lemma 7. If ¯ π ą we have that V p , µ q ă V p , µ q “ for every µ P ∆ c and V p , µ q “ V p , µ q “ for every µ R ∆ c . Moreover, @ µ P ∆ c , V p , µ q ą V p , µ q ą V p , µ q , where V p , µ q “ π ´ π and V p , µ q “ π . In ﬁgure 4, we plot V p , µ q and V p , µ q as a function of the action threshold ¯ π . Thefollowing is an immediate implication of lemma 7. Fixing the preferences of the sender andthe receiver, for some prior beliefs, the value of an additional signal is an increasing function,and for others, it is decreasing. 18 igure 3: On the left, we have the action threshold ¯ π “ so it is possible to maintain Bayes Plausibilitywhen inducing non-default actions for every prior. On the right, ¯ π ą , so for some beliefs, we have tomix the default action and the non-default action when constrained to 2 signals. The dark red, blue andgreen regions are the beneﬁcial action regions. The yellow middle region is the default action region. 
Orangeregion in the right ﬁgure corresponds to ∆ c . Figure 4:

Achievable utilities with two signals for µ P ∆ c , as a function of the action thresholds ¯ π . Blueline depicts the minimum of the equilibrium sender utility among all µ P ∆ c , and yellow line denotes themaximum value. Corollary 2.

Depending on the location of the prior inside ∆ c the value of precision canbe increasing or decreasing with respect to additional signals. That is V p , µ q ą and V p , µ q ă . The priors for which the value of precision is increasing are the ones that are the furthestaway from the beneﬁcial action regions. For the sender who only has access to two signals,the only way to induce favorable actions with these priors is by also inducing the defaultaction with high probability, getting an expected utility below 0 .

5. Therefore, the value ofthe second signal is also below 0 .

5. Getting access to the third signal allows the sender tomaintain Bayes Plausibility by not inducing the default action, guaranteeing a payoﬀ of 1.Hence, the value of the third signal is higher than 0 . .

5. The value of the second signal is then higher than thevalue of the third signal.Note that additional signals always weakly increase the sender utility, because the feasibleset in the optimization problem is expanding. This is not necessarily the case for the receiver,as we will see in our next application.
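The gap between two and three signals in this example can be illustrated with a small brute-force computation. The sketch below makes several assumptions that are not taken from the text: the threshold is set to $\bar{\pi} = 4/5$, the prior is uniform, the sender is assumed to earn 1 from any non-default action and 0 from the default action, and the search is limited to posteriors on a grid with step 1/10. It enumerates pairs of grid posteriors that keep the prior on the segment between them and reports the best two-signal payoff, using exact fractions to avoid rounding issues.

```python
from fractions import Fraction as Fr
from itertools import product


def simplex_grid(steps):
    """Beliefs over three states whose coordinates are multiples of 1/steps."""
    return [(Fr(i, steps), Fr(j, steps), Fr(steps - i - j, steps))
            for i in range(steps + 1) for j in range(steps + 1 - i)]


def sender_utility(belief, threshold):
    """Assumed payoff: 1 if some state has probability >= threshold, else 0."""
    return 1 if max(belief) >= threshold else 0


def weight_on_first(p, q, prior):
    """Return t in [0, 1] with t*p + (1-t)*q == prior, or None if none exists."""
    if p == q:
        return Fr(1) if p == prior else None
    c = next(i for i in range(3) if p[i] != q[i])
    t = (prior[c] - q[c]) / (p[c] - q[c])
    if not 0 <= t <= 1:
        return None
    if all(t * p[i] + (1 - t) * q[i] == prior[i] for i in range(3)):
        return t
    return None


def best_two_signal_value(prior, threshold, steps=10):
    """Best sender payoff over Bayes Plausible pairs of grid posteriors."""
    best = Fr(sender_utility(prior, threshold))  # no information transmitted
    for p, q in product(simplex_grid(steps), repeat=2):
        t = weight_on_first(p, q, prior)
        if t is not None:
            value = t * sender_utility(p, threshold) + (1 - t) * sender_utility(q, threshold)
            best = max(best, value)
    return best


prior = (Fr(1, 3), Fr(1, 3), Fr(1, 3))
threshold = Fr(4, 5)
vertices = [(Fr(1), Fr(0), Fr(0)), (Fr(0), Fr(1), Fr(0)), (Fr(0), Fr(0), Fr(1))]

value_two = best_two_signal_value(prior, threshold)
value_three = sum(prior[i] * sender_utility(vertices[i], threshold) for i in range(3))
print(value_two, value_three, value_three - value_two)
```

Under these assumptions the two-signal payoff stays below one half while full revelation with three signals yields a payoff of one, so the third signal is worth more than the second at this prior, in line with the discussion of priors far from the beneficial action regions.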

Our model also can be used to analyze the optimal advice seeking behavior of a receiver.Suppose, before the game described in section 3.1 takes place, the receiver can choose thecardinality of the signal space | S | “ k .Letting the receiver decide the cardinality of the signal space allows them to change theoutcome in their favor. The receiver can choose to ask for “simple advice” consisting offewer action recommendations rather than a more complicated one. We will show throughan example that the receiver will not always prefer using rich signal spaces.First, observe that if there is perfect alignment between the receiver and the sender’sutilities, so that ˆ u R “ ˆ u S , the receiver will always pick the maximum number of signalspossible. This is because the sender’s utility (and therefore the receiver’s utility) is weaklyincreasing in the number of signals available.Let us now turn to the more interesting case of misalignment. Receiver’s preferencesover the number of signals will depend on the location of the prior and the degree of themisalignment between the sender and the receiver. We will make this more clear with thefollowing example. Suppose there are three states t ω , ω , ω u “ Ω and ﬁve actions A “ t a , a , a , a , a u where a denotes the default action taken at the prior belief µ “ p { , { , { q . For simplicity,suppose the sender’s utility depends only on the actions taken, and the default action is theworst outcome. The sender prefers inducing the actions a , a and a over a , and a and a are preferred over a and a .Receiver preferences are such that the optimal actions are a , a , a whenever the beliefsare certain enough, meaning that upon observing signal s P S it is the case that µ s p ω i q ą ¯ π for i P t , , u . Moreover, whenever µ s p ω q ă ¯ π and µ s p ω q ă ¯ π but µ s p ω q ` µ s p ω q ą ¯ π the20eceiver takes action a . 
This means that there are two diﬀerent actions that the receiveroptimally takes when the beliefs are uncertain, which are a (uncertain but leaning ω ) and a (uncertain but leaning ω or ω ). Figure 5 plots these preferences along with the optimal2 and 3-signal information structures. The full utility function for the receiver is given inthe appendix B.3.We consider the following game: the receiver will pick the cardinality of the signal space k “ | S | ﬁrst, sender observes this choice and picks the optimal Bayes plausible informationstructure with k signals. By sequential rationality and our previous calculations we can char-acterize the sender’s behavior using our results. Namely, the sender will pick the optimalinformation structure for the k -constrained Bayesian Persuasion game, given the choice of k by the receiver. Hence, the receiver will pick k “ | Ω | such that expected receiver utility ismaximized.It is easy to verify that for every equilibrium (PBE) of this game the receiver will pick k “

2, as plotted in ﬁgure 5. For the receiver’s choice of k “

2, the sender will pick theinformation structure described by the red line in the lower left box in ﬁgure 5, inducing a and a . Oﬀ path, for the choice of k “ k “

3, the sender will pick the information structureshown with the blue triangle in the upper right corner in ﬁgure 5, inducing a , a , a . Hence,receivers picks k to be equal to 2. The lower right plot in ﬁgure 5 shows how the two-signalinformation structure, three-signal information structure and the single-signal informationstructure compare in terms of expected utility for the receiver.We see that there is a misalignment between the receiver and the sender preferences. Thereceiver prefers outcomes that are more certain about ω and ω whereas the sender onlywants to induce actions a and a and does not care about certainty in beliefs. The senderideally wants to induce uncertain posteriors leading to action a and a with high probability.Limiting the sender to two signals, the receiver can force the sender to induce morecertain beliefs about ω . This is because under the Bayes Plausibility constraint, imposing k “ ω . With k “

3, the senderoptimally induces vague posteriors about ω and ω .This example shows that the receiver might prefer to limit senders ability to communicateand opt for simple advice. More elaborately, in the game considered above, we see that thereceiver prefers the expected outcome with two signals ( k “

2) over three signals ( k “ k “

3) over no communication at all ( k “ See appendix B.3 for utility functions. igure 5: Partial misalignment and optimal advice seeking. In all ﬁgures, the black perpendicular line at(1/3,1/3,1/3) represents the location of the prior. The top left ﬁgure depicts the sender’s utility over thesimplex, which depends only on the actions taken by the receiver. The top right ﬁgure shows the optimal3-signal information structure, and the bottom left ﬁgure shows the optimal 2-signal information structure.The bottom right ﬁgure depicts the receiver’s utility, with the optimal 2-signal (red line) and 3-signal (bluesurface) information structures. For the receiver, utility with 2 signals (red point) is higher than the utilitywith 3 signals (blue point). that no useful information can be transmitted at all.The example also demonstrates an interesting property of cardinality constrained per-suasion games. We see that the optimal information structures chosen with k “ k “ This presents an interesting avenue for studying coarse communication. InformationDesign and Bayesian Persuasion literature generally focuses on a variety of examples -e.g.Judge-Prosecutor communication (Kamenica and Gentzkow 2011)- where the receiver hassome power on the communication procedure. One way to reﬂect this power is letting thereceiver pick the cardinality of the sender’s signal space. The example above shows thatthe receiver may prefer to pick k to be less than the cardinality of the action and statespace. Our framework can be used to analyze these interactions in detail by characterizingthe solutions to Bayesian Persuasion problems with coarse signal spaces. By Corollary 4 in appendix B.2. Conclusion

We set out to analyze the effect of coarseness in strategic communication. The value of precise communication in a game where a sender is trying to persuade a receiver is characterized, and an upper bound for this value which applies to all finite persuasion games is presented. This is done by proving the existence and characterizing the properties of an optimal information structure in a game of persuasion with constrained signal spaces, which was left unexplored by previous literature. In doing so, we develop a novel way of solving finite Bayesian Persuasion problems and finding optimal information structures using Choquet's Theorem. Our work complements the asymptotic upper bound results of Le Treust and Tomala (2019) on infinitely repeated persuasion games with noisy and coarse channels, and the results in Tsakas and Tsakas (2018) on finite persuasion games over noisy channels. We show that constrained signal spaces create non-trivial difficulties for the sender in a persuasion game and demonstrate how we can analyze the outcomes using k-convex hulls. In settings where a receiver is asking for advice from a sender with misaligned preferences, we show that it might be optimal to ask for simple recommendations. This gives us a better understanding of settings in which the communication between parties can be limited by the receiver (or a regulator).

With this general model, we can apply our framework to various settings where we would like to learn about the willingness to pay of a sender for more precise messages. Some of the most important questions studied using persuasion games can now be analyzed from this new viewpoint. How much would a politician be willing to spend to design a more detailed policy experiment to convince voters? How much would a lobbyist be willing to pay to send a more precise action recommendation to the politician that they are trying to persuade? How much would a firm trying to send product information to a potential customer be willing to pay for a longer, more detailed ad?

Our model can also be used to study competition between senders who have access to signal spaces with different degrees of complexity, or the problem of a sender trying to persuade a heterogeneous set of agents using public or private signals with different dimensionalities. We leave these questions for future work.

Appendices

A Proofs

Proof of Lemma 1

Given $a \in A$, $R_a$ is the intersection of $\Delta(\Omega)$, which is closed and convex, and finitely many closed half spaces defined by $\{\mu \in \mathbb{R}^{|\Omega|} : \sum_{\omega \in \Omega} \mu(\omega)(u(a, \omega) - u(a', \omega)) \ge 0\}_{a' \in A}$. It is therefore closed and convex.

Proof of Lemma 2

Follows directly from Volund (2018), Theorem 1 or Lipnowski and Mathevet (2017), Theorem 1.

Proof of Lemma 3

We prove this claim by explicitly constructing τ . Using the convexity of ˆ u S within R a ,we can ﬁnd two alternative beliefs µ , µ in Bd p R a q , such that replacing µ with one of thesetwo beliefs maintains Bayes Plausibility and weakly increases E τ ˆ u S .Let supp p τ q “ t µ , µ , . . . , µ k u . Since τ satisﬁes Bayes Plausibility, we have µ “ ř ki “ τ p µ i q µ i for some τ p µ q , . . . , τ p µ k q , which satisfy ř i τ p µ i q “

1, and @ i P t , . . . k u ą τ p µ i q ą

0. Wewant to show that we can construct τ which satisﬁes Bayes Plausibility and E τ ˆ u S ě E τ ˆ u S .Without loss of generality, let µ k P supp p τ q be the belief p µ k ‰ µ q such that for some R k P R , µ k P int p R k q . Consider the ray from µ passing through µ k , parameterized as t µ ` s p µ k ´ µ q , s P R ` u .First, assume µ R R k . Since µ k P int p R k q , and R k is closed, bounded, and convex, theline segment passing through the interior point µ k intersects Bd p R k q at two points (Yaglomand Boltyansky 1961). Let these two points be denoted as µ k and µ k . Since these twopoints also lie on the ray passing through µ k originating from µ , they can be written in aparametric form. Therefore, for some δ ą ą γ ą µ k “ µ ` p ` δ qp µ k ´ µ q “ µ k ` δ p µ k ´ µ q ,µ k “ µ ` p ´ γ qp µ k ´ µ q “ µ k ´ γ p µ k ´ µ q . Moreover, we can write our original point µ k as a convex combination of these two pointsas γγ ` δ µ k ` δγ ` δ µ k “ µ k . Note that by convexity of ˆ u S within R k , we get: γγ ` δ ˆ u S p µ k q ` δγ ` δ ˆ u S p µ k q ě ˆ u S p µ k q . (1)24ow, let us deﬁne two new information structures, τ and τ , by replacing µ k with µ k and µ k , respectively. We will now show that we maintain Bayes Plausibility with these newinformation structures. Lemma 8.

The new information structures τ and τ , constructed as described above, areBayes Plausible. Proof.

Start with comparing $\tau'$ with $\tau$. We have $\mathrm{supp}(\tau) = \{\mu_1, \dots, \mu_k\}$ and $\mathrm{supp}(\tau') = \{\mu_1, \dots, \mu_k'\}$. We know that $\tau$ is Bayes Plausible, so we have $\mu_0 = \sum_{i=1}^{k} \tau(\mu_i)\mu_i$ for some $\tau(\mu_1), \dots, \tau(\mu_k)$, which satisfy $\sum_i \tau(\mu_i) = 1$, and $\forall i,\ 1 > \tau(\mu_i) > 0$. We also know that $\mu_k' = \mu_0 + (1+\delta)(\mu_k - \mu_0) = \mu_k + \delta(\mu_k - \mu_0)$. Let us define a new probability distribution $\tau' \in \Delta(\Omega)$ representing $\mu_0$, i.e. $\mu_0 = \sum_{i<k} \tau'(\mu_i)\mu_i + \tau'(\mu_k')\mu_k'$. Simple algebra reveals that this equality will hold for $\tau'$:

$$\tau'(\mu_i) = \frac{\tau(\mu_i)(1+\delta)}{1+\delta-\tau(\mu_k)\delta} \ \text{ for } i < k, \quad \text{and} \quad \tau'(\mu_k') = \frac{\tau(\mu_k)}{1+\delta-\tau(\mu_k)\delta}.$$

Note that $1+\delta-\tau(\mu_k)\delta > 0$, and $\tau(\mu_k) < 1+\delta-\tau(\mu_k)\delta$, so $1 > \tau'(\mu_k') > 0$. Also note that $\tau(\mu_i)(1+\delta) < 1+\delta-\tau(\mu_k)\delta$ since $\frac{\tau(\mu_i)-1}{1-\tau(\mu_k)-\tau(\mu_i)} < 0 < \delta$. Therefore $\forall i,\ 1 > \tau'(\mu_i) > 0$. Finally:

$$\sum_{i \le k} \tau'(\mu_i) = \frac{(1+\delta)\sum_{i<k}\tau(\mu_i)}{1+\delta-\tau(\mu_k)\delta} + \frac{\tau(\mu_k)}{1+\delta-\tau(\mu_k)\delta} = \frac{1+\delta-\tau(\mu_k)\delta}{1+\delta-\tau(\mu_k)\delta} = 1.$$

Similarly, take $\tau''$. We have $\mathrm{supp}(\tau) = \{\mu_1, \dots, \mu_k\}$ and $\mathrm{supp}(\tau'') = \{\mu_1, \dots, \mu_k''\}$. We know that $\mu_k'' = \mu_0 + (1-\gamma)(\mu_k - \mu_0) = \mu_k - \gamma(\mu_k - \mu_0)$. Let us define a new probability distribution $\tau'' \in \Delta(\Omega)$ representing $\mu_0$, i.e. $\mu_0 = \sum_{i<k} \tau''(\mu_i)\mu_i + \tau''(\mu_k'')\mu_k''$. Simple algebra reveals that this equality will hold for $\tau''$:

$$\tau''(\mu_i) = \frac{\tau(\mu_i)(1-\gamma)}{1-\gamma+\tau(\mu_k)\gamma} \ \text{ for } i < k, \quad \text{and} \quad \tau''(\mu_k'') = \frac{\tau(\mu_k)}{1-\gamma+\tau(\mu_k)\gamma}.$$

Note that $1-\gamma+\tau(\mu_k)\gamma > 0$ since $\gamma < 1$. Also, $\tau(\mu_i)(1-\gamma) < 1-\gamma+\tau(\mu_k)\gamma$ since $\forall i,\ \tau(\mu_i) < 1$. Therefore, $\forall i,\ 1 > \tau''(\mu_i) > 0$. Finally:

$$\sum_{i \le k} \tau''(\mu_i) = \frac{(1-\gamma)\sum_{i<k}\tau(\mu_i)}{1-\gamma+\tau(\mu_k)\gamma} + \frac{\tau(\mu_k)}{1-\gamma+\tau(\mu_k)\gamma} = \frac{1-\gamma+\tau(\mu_k)\gamma}{1-\gamma+\tau(\mu_k)\gamma} = 1. \qquad \blacksquare$$

We are now ready to prove the main theorem. Let $E_{\tau'} \hat{u}^S$ and $E_{\tau''} \hat{u}^S$ be the sender's utility under the new information structures $\tau'$ and $\tau''$. Using the definitions of $\tau'$ and $\tau''$, we can calculate the difference between these new values and the sender's utility under $\tau$, which is simply $E_{\tau} \hat{u}^S = \sum_{i \le k} \tau(\mu_i)\hat{u}^S(\mu_i)$. Simple algebra shows the following, where the leading factors are strictly positive and therefore do not affect the sign of either difference:

$$E_{\tau'} \hat{u}^S - E_{\tau} \hat{u}^S = \frac{\tau(\mu_k)}{1+\delta-\tau(\mu_k)\delta}\left[\hat{u}^S(\mu_k') - \hat{u}^S(\mu_k) + \delta\left(\Big(\sum_{i \le k} \tau(\mu_i)\hat{u}^S(\mu_i)\Big) - \hat{u}^S(\mu_k)\right)\right],$$

$$E_{\tau''} \hat{u}^S - E_{\tau} \hat{u}^S = \frac{\tau(\mu_k)}{1-\gamma+\tau(\mu_k)\gamma}\left[\hat{u}^S(\mu_k'') - \hat{u}^S(\mu_k) - \gamma\left(\Big(\sum_{i \le k} \tau(\mu_i)\hat{u}^S(\mu_i)\Big) - \hat{u}^S(\mu_k)\right)\right].$$

For a contradiction, suppose that both $E_{\tau'} \hat{u}^S - E_{\tau} \hat{u}^S < 0$ and $E_{\tau''} \hat{u}^S - E_{\tau} \hat{u}^S < 0$. Rearranging the bracketed terms and multiplying with $\gamma$ and $\delta$ respectively, we get:

$$\gamma\,\hat{u}^S(\mu_k') + \gamma\delta\left(\Big(\sum_{i \le k} \tau(\mu_i)\hat{u}^S(\mu_i)\Big) - \hat{u}^S(\mu_k)\right) < \gamma\,\hat{u}^S(\mu_k),$$

and

$$\delta\,\hat{u}^S(\mu_k'') - \delta\gamma\left(\Big(\sum_{i \le k} \tau(\mu_i)\hat{u}^S(\mu_i)\Big) - \hat{u}^S(\mu_k)\right) < \delta\,\hat{u}^S(\mu_k).$$

Adding the two inequalities implies:

$$\delta\,\hat{u}^S(\mu_k'') + \gamma\,\hat{u}^S(\mu_k') < (\delta+\gamma)\,\hat{u}^S(\mu_k).$$

However, by inequality (1) implied by convexity, we have:

$$\frac{\gamma}{\gamma+\delta}\,\hat{u}^S(\mu_k') + \frac{\delta}{\gamma+\delta}\,\hat{u}^S(\mu_k'') \ge \hat{u}^S(\mu_k) \iff \delta\,\hat{u}^S(\mu_k'') + \gamma\,\hat{u}^S(\mu_k') \ge (\delta+\gamma)\,\hat{u}^S(\mu_k).$$

Therefore, we get a contradiction, so $E_{\tau'} \hat{u}^S - E_{\tau} \hat{u}^S < 0$ and $E_{\tau''} \hat{u}^S - E_{\tau} \hat{u}^S < 0$ cannot both hold. Hence $E_{\tau'} \hat{u}^S - E_{\tau} \hat{u}^S \ge 0$ or $E_{\tau''} \hat{u}^S - E_{\tau} \hat{u}^S \ge 0$.

For the second case, assume $\mu_0, \mu_k \in R_k$. Since $\mu_k \in \mathrm{int}(R_k)$, and $R_k$ is closed, bounded, and convex, the ray originating from $\mu_0$ passing through $\mu_k$ intersects $\mathrm{Bd}(R_k)$ at a single point, which we will denote by $\mu_k'$. Since $\mu_k'$ lies on this line, for some $\delta > 0$, we will have:

$$\mu_k' = \mu_0 + (1+\delta)(\mu_k - \mu_0) = \mu_k + \delta(\mu_k - \mu_0).$$

Moreover, we can write $\mu_k$ as a convex combination of $\mu_k'$ and $\mu_0$, where $\frac{\mu_k'}{1+\delta} + \frac{\delta\mu_0}{1+\delta} = \mu_k$.

Consider a new information structure $\tau'$, where we replace $\mu_k$ with $\mu_k'$ in $\tau$, implying $\mathrm{supp}(\tau') = \{\mu_1, \dots, \mu_k'\}$. Similar to the first part of the proof, we construct a probability distribution $\tau' \in \Delta(\Omega)$ that represents $\mu_0$, i.e. we need $\{\tau'(\mu_i)\}_{i \le k}$ to satisfy $\mu_0 = \sum_{i<k} \tau'(\mu_i)\mu_i + \tau'(\mu_k')\mu_k'$. Simple algebra reveals that this equality will hold for $\tau'$:

$$\tau'(\mu_i) = \frac{\tau(\mu_i)(1+\delta)}{1+\delta-\tau(\mu_k)\delta} \ \text{ for } i < k, \quad \text{and} \quad \tau'(\mu_k') = \frac{\tau(\mu_k)}{1+\delta-\tau(\mu_k)\delta}.$$

Since the original information structure is assumed to be beneficial, we know that the payoff is better than the payoff under the receiver's default action. This implies:

$$\hat{u}^S(\mu_0) \le \sum_{i \le k} \tau(\mu_i)\hat{u}^S(\mu_i).$$

Also by the convexity of $\hat{u}^S$ within $R_k$, we know that:

$$\frac{\hat{u}^S(\mu_k')}{1+\delta} + \frac{\delta\,\hat{u}^S(\mu_0)}{1+\delta} \ge \hat{u}^S(\mu_k).$$

Now, let us calculate the difference in expected sender payoff between $\tau'$ and $\tau$. We find that:

$$E_{\tau'} \hat{u}^S - E_{\tau} \hat{u}^S = \frac{\tau(\mu_k)}{1+\delta-\tau(\mu_k)\delta}\left[\hat{u}^S(\mu_k') - \hat{u}^S(\mu_k) + \delta\left(\Big(\sum_{i \le k} \tau(\mu_i)\hat{u}^S(\mu_i)\Big) - \hat{u}^S(\mu_k)\right)\right].$$

For a contradiction, suppose that $E_{\tau'} \hat{u}^S - E_{\tau} \hat{u}^S < 0$. This implies the following:

$$\hat{u}^S(\mu_k') - \hat{u}^S(\mu_k) < \delta\left(\hat{u}^S(\mu_k) - \Big(\sum_{i \le k} \tau(\mu_i)\hat{u}^S(\mu_i)\Big)\right) \le \delta\left(\hat{u}^S(\mu_k) - \hat{u}^S(\mu_0)\right)$$
$$\iff \frac{\hat{u}^S(\mu_k')}{1+\delta} + \frac{\delta\,\hat{u}^S(\mu_0)}{1+\delta} < \hat{u}^S(\mu_k).$$

This contradicts $\frac{\hat{u}^S(\mu_k')}{1+\delta} + \frac{\delta\,\hat{u}^S(\mu_0)}{1+\delta} \ge \hat{u}^S(\mu_k)$, which we know to be true from the convexity of $\hat{u}^S$ within $R_k$. Therefore $E_{\tau'} \hat{u}^S - E_{\tau} \hat{u}^S \ge 0$.

Replacing each $\mu_k \in \mathrm{supp}(\tau)$ that is not on the boundary of a region $R_a$ through the steps described above, we can reach a $\tau'$ that yields weakly higher utility for the sender. This completes the proof.

Proof of lemma 4

Let $\mathrm{supp}(\tau) = \{\mu_1, \dots, \mu_k\}$ be affinely dependent. Then, there must exist $\{\lambda_1, \dots, \lambda_k\}$, not all zero, such that $\sum_{i \le k} \lambda_i = 0$ and $\sum_{i \le k} \lambda_i \mu_i = 0$. Since $\tau$ is Bayes Plausible, we have $\mu_0 = \sum_{i=1}^{k} \tau(\mu_i)\mu_i$ for some $\tau(\mu_1), \dots, \tau(\mu_k)$, which satisfy $\sum_i \tau(\mu_i) = 1$, and $\forall i,\ 1 > \tau(\mu_i) > 0$.

Among $\{\lambda_1, \dots, \lambda_k\}$, some elements must be positive and some negative, since they sum to zero and are not all zero. Among the subset with negative weights, pick $j^*$ such that $\frac{\tau(\mu_j)}{\lambda_j}$ is maximized. Among the subset with positive weights, pick $p^*$ such that $\frac{\tau(\mu_p)}{\lambda_p}$ is minimized. Now, we can write

$$\mu_{j^*} = \sum_{i \ne j^*} -\frac{\lambda_i}{\lambda_{j^*}}\,\mu_i, \quad \text{and} \quad \mu_{p^*} = \sum_{i \ne p^*} -\frac{\lambda_i}{\lambda_{p^*}}\,\mu_i.$$

Now, rewriting the Bayes Plausibility condition, we get:

$$\tau(\mu_1)\mu_1 + \dots + \tau(\mu_{j^*})\Big(\sum_{i \ne j^*} -\frac{\lambda_i}{\lambda_{j^*}}\,\mu_i\Big) + \dots + \tau(\mu_k)\mu_k = \mu_0 \iff \sum_{i \ne j^*}\Big(\tau(\mu_i) - \tau(\mu_{j^*})\frac{\lambda_i}{\lambda_{j^*}}\Big)\mu_i = \mu_0,$$

and analogously,

$$\sum_{i \ne p^*}\Big(\tau(\mu_i) - \tau(\mu_{p^*})\frac{\lambda_i}{\lambda_{p^*}}\Big)\mu_i = \mu_0.$$

Now, we will show that $\forall i \ne j^*$, $\tau(\mu_i) - \lambda_i\frac{\tau(\mu_{j^*})}{\lambda_{j^*}} \ge 0$, and $\forall i \ne p^*$, $\tau(\mu_i) - \lambda_i\frac{\tau(\mu_{p^*})}{\lambda_{p^*}} \ge 0$.

If $\lambda_i = 0$, the inequalities hold trivially.

If $\lambda_i > 0$, the inequalities are equivalent to $\frac{\tau(\mu_i)}{\lambda_i} \ge \frac{\tau(\mu_{j^*})}{\lambda_{j^*}}$ and $\frac{\tau(\mu_i)}{\lambda_i} \ge \frac{\tau(\mu_{p^*})}{\lambda_{p^*}}$. In both cases, the condition holds, because $\lambda_{j^*}$ is negative and $p^*$ is chosen to minimize this ratio.

If $\lambda_i < 0$, the inequalities are equivalent to $\frac{\tau(\mu_i)}{\lambda_i} \le \frac{\tau(\mu_{j^*})}{\lambda_{j^*}}$ and $\frac{\tau(\mu_i)}{\lambda_i} \le \frac{\tau(\mu_{p^*})}{\lambda_{p^*}}$. In both cases, the condition holds, because $j^*$ is chosen to maximize this ratio and $\lambda_{p^*}$ is positive.

Moreover, note that $\sum_{i \ne j^*}\Big(\tau(\mu_i) - \lambda_i\frac{\tau(\mu_{j^*})}{\lambda_{j^*}}\Big) = (1 - \tau(\mu_{j^*})) + \tau(\mu_{j^*})\frac{\lambda_{j^*}}{\lambda_{j^*}} = 1$, and analogously for $p^*$. Therefore, we can define $\tau'$ and $\tau''$ respectively from $\tau$ by dropping $\mu_{j^*}$ or $\mu_{p^*}$, and we maintain Bayes Plausibility using convex weights $\Big(\tau(\mu_i) - \lambda_i\frac{\tau(\mu_{j^*})}{\lambda_{j^*}}\Big)$ and $\Big(\tau(\mu_i) - \lambda_i\frac{\tau(\mu_{p^*})}{\lambda_{p^*}}\Big)$.

Now, writing $E_{\tau'} \hat{u}^S - E_{\tau} \hat{u}^S$ and $E_{\tau''} \hat{u}^S - E_{\tau} \hat{u}^S$, we get:

$$E_{\tau'} \hat{u}^S - E_{\tau} \hat{u}^S = \sum_{i \ne j^*}\Big(\tau(\mu_i) - \lambda_i\frac{\tau(\mu_{j^*})}{\lambda_{j^*}}\Big)\hat{u}^S(\mu_i) - \sum_{i \le k}\tau(\mu_i)\hat{u}^S(\mu_i),$$
$$E_{\tau''} \hat{u}^S - E_{\tau} \hat{u}^S = \sum_{i \ne p^*}\Big(\tau(\mu_i) - \lambda_i\frac{\tau(\mu_{p^*})}{\lambda_{p^*}}\Big)\hat{u}^S(\mu_i) - \sum_{i \le k}\tau(\mu_i)\hat{u}^S(\mu_i),$$

which is equivalent to

$$E_{\tau'} \hat{u}^S - E_{\tau} \hat{u}^S = -\frac{\tau(\mu_{j^*})}{\lambda_{j^*}}\Big(\sum_{i \ne j^*}\lambda_i\hat{u}^S(\mu_i)\Big) - \tau(\mu_{j^*})\hat{u}^S(\mu_{j^*}),$$
$$E_{\tau''} \hat{u}^S - E_{\tau} \hat{u}^S = -\frac{\tau(\mu_{p^*})}{\lambda_{p^*}}\Big(\sum_{i \ne p^*}\lambda_i\hat{u}^S(\mu_i)\Big) - \tau(\mu_{p^*})\hat{u}^S(\mu_{p^*}).$$

Suppose $E_{\tau'} \hat{u}^S - E_{\tau} \hat{u}^S < 0$ and $E_{\tau''} \hat{u}^S - E_{\tau} \hat{u}^S < 0$. This implies:

$$-\frac{1}{\lambda_{j^*}}\Big(\sum_{i \ne j^*}\lambda_i\hat{u}^S(\mu_i)\Big) - \hat{u}^S(\mu_{j^*}) < 0, \quad \text{and} \quad -\frac{1}{\lambda_{p^*}}\Big(\sum_{i \ne p^*}\lambda_i\hat{u}^S(\mu_i)\Big) - \hat{u}^S(\mu_{p^*}) < 0$$
$$\iff \frac{1}{\lambda_{j^*}}\Big(\sum_{i \ne j^*}\lambda_i\hat{u}^S(\mu_i)\Big) + \hat{u}^S(\mu_{j^*}) > 0, \quad \text{and} \quad \frac{1}{\lambda_{p^*}}\Big(\sum_{i \ne p^*}\lambda_i\hat{u}^S(\mu_i)\Big) + \hat{u}^S(\mu_{p^*}) > 0.$$

However, note that by assumption, $\lambda_{j^*}$ and $\lambda_{p^*}$ have opposite signs. Multiplying the first inequality by $\lambda_{j^*} < 0$ and the second inequality by $\lambda_{p^*} > 0$, we must have:

$$\sum_{i \le k}\lambda_i\hat{u}^S(\mu_i) < 0, \quad \text{and} \quad \sum_{i \le k}\lambda_i\hat{u}^S(\mu_i) > 0,$$

which is a contradiction. So $E_{\tau'} \hat{u}^S - E_{\tau} \hat{u}^S < 0$ and $E_{\tau''} \hat{u}^S - E_{\tau} \hat{u}^S < 0$ cannot both hold, and $\tau'$ or $\tau''$ must yield weakly higher expected utility for the sender.

Replace $\tau$ with the information structure that yields weakly higher utility using the process defined above, which drops one belief that is affinely dependent. If the resulting information structure is affinely independent, we're done. If not, we can repeat the process described above and we will either reach an affinely independent set of vectors before we get to two, or we reach two vectors, which must be affinely independent. This completes the proof.

Proof of lemma 5

Existence and uniqueness follow from Choquet's Theorem because $\tau$ is a simplex, by the affine independence condition. Now, given the convex weights $\tau = (\tau(\mu_1), \dots, \tau(\mu_k))$, one can transform them to the Cartesian coordinates for $\mu_0$ by using

$$T_\tau = \begin{bmatrix} \mu_{1,1} & \mu_{2,1} & \dots & \mu_{k,1} \\ \mu_{1,2} & \mu_{2,2} & \dots & \mu_{k,2} \\ \vdots & \vdots & \ddots & \vdots \\ \mu_{1,n} & \mu_{2,n} & \dots & \mu_{k,n} \\ 1 & 1 & \dots & 1 \end{bmatrix}$$

where $\mu_{i,j}$ is the $j$'th coordinate of the $i$'th posterior in $\mathrm{supp}(\tau)$. $T_\tau$ is a matrix with dimensions $(n+1, k)$, with linearly independent columns, which is guaranteed by the affine independence of $\mathrm{supp}(\tau)$. Let us denote $\mu_0 = (\mu_{0,1}, \dots, \mu_{0,n}, 1)$, which is the $(1, n+1)$ vector of Cartesian coordinates of $\mu_0$ with an added 1 for the $(n+1)$'th coordinate. Then $T_\tau \tau = \mu_0$.
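The matrix $T_\tau$ can be illustrated numerically. The following is a minimal sketch, not part of the original proof: the function names `barycentric_matrix` and `recover_weights` and the three-state example are hypothetical, and the Moore-Penrose pseudo-inverse is used as one convenient choice of left inverse, which is valid only when the columns of $T_\tau$ are linearly independent.

```python
import numpy as np

def barycentric_matrix(posteriors):
    """Build T_tau: posteriors as columns, with a final row of ones appended."""
    P = np.asarray(posteriors, dtype=float).T            # shape (n, k)
    return np.vstack([P, np.ones((1, P.shape[1]))])      # shape (n + 1, k)

def recover_weights(posteriors, prior):
    """Solve T_tau @ w = (prior, 1) for the convex weights w.

    Assumes the posteriors are affinely independent, so that T_tau has
    full column rank and pinv(T_tau) acts as a left inverse.
    """
    T = barycentric_matrix(posteriors)
    if np.linalg.matrix_rank(T) < T.shape[1]:
        raise ValueError("posteriors are not affinely independent")
    target = np.append(np.asarray(prior, dtype=float), 1.0)
    return np.linalg.pinv(T) @ target

# Hypothetical example with three states and three posteriors.
posteriors = [(0.8, 0.1, 0.1), (0.1, 0.8, 0.1), (0.1, 0.1, 0.8)]
weights = np.array([0.5, 0.3, 0.2])
prior = weights @ np.asarray(posteriors)                 # Bayes-plausible prior
recovered = recover_weights(posteriors, prior)
```

Under these assumptions, `recovered` coincides with `weights` up to floating-point error, which reflects the uniqueness of the convex weights for an affinely independent support.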

We also know the left inverse of $T_\tau$ exists by affine independence, denoted $T^L_\tau$, which has dimensions $(k, n+1)$. Similarly, we can define $T_{\tau'}$ for any $\tau' \in \zeta$. Then, we have:

$$T_\tau \tau = \mu_0, \qquad T_{\tau'} \tau' = \mu_0 \iff \tau = T^L_\tau T_{\tau'} \tau',$$

where $T^L_\tau T_{\tau'}$ is an affine transformation that takes the convex weights of $\mu_0$ with respect to the $k$-simplex $\tau'$ and maps them to the convex weights of $\mu_0$ with respect to the $k$-simplex $\tau$. The map is bounded, because the two information structures are bounded polytopes. Hence it is Lipschitz continuous, because bounded affine transformations are Lipschitz continuous. Hence, $\tau$ is uniformly continuous. This completes the proof.

Proof of Theorem 1

The following definition will be useful.

Definition 6. $S_a^{a'} \subset R_a$ denotes the region where the sender-preferred action $a'$ is taken in region $R_a$. Formally, $S_a^{a'} \subset R_a$ is defined as

$$S_a^{a'} := \{\mu \in \Delta(\Omega) : \mu \in R_a \text{ and } a' \in \hat{A}(\mu),\ \hat{u}^S(a', \mu) \ge \hat{u}^S(\tilde{a}, \mu)\ \forall \tilde{a} \in \hat{A}(\mu)\}.$$

We start by creating auxiliary payoff candidates. This is only for illustrative purposes to the reader. First define $\mathcal{F}_a = \{F_a^i : F_a^i \text{ is a facet of } R_a\}$ and $F = \{F_i : \forall i = 1, \dots, k,\ F_i \in \mathcal{F}_a \text{ for some } a \in A\}$. $\hat{u}^S$ is uniformly continuous in the interior of $R_a$, as $\hat{u}^S$ is piecewise linear in the interior of $R_a$. Then, by the Kirszbraun Theorem, we can extend $\hat{u}^S|_{\mathrm{int}(R_a)}$ to a Lipschitz continuous function $\hat{u}^S_a$ defined over $R_a$ with the same Lipschitz constant. Hence $\hat{u}^S_a$ is a uniformly continuous function over $R_a$.

Before proceeding, the reader should note the following. Consider an information structure supported on the boundary of an action zone $R_a$. It is possible that a belief $\mu_l \in \mathrm{Bd}(R_a)$ also lies on the boundary of another action zone, $\mu_l \in \mathrm{Bd}(R_{a'})$. The induced payoff of this information structure can be represented in two ways: $\sum_{i \ne l} \tau(\mu_i)\hat{u}^S(a(\mu_i)) + \tau(\mu_l)\hat{u}^S_a(a(\mu_l))$ and $\sum_{i \ne l} \tau(\mu_i)\hat{u}^S(a(\mu_i)) + \tau(\mu_l)\hat{u}^S_{a'}(a(\mu_l))$. But note that, since we are focusing on the sender-preferred equilibrium, the realized payoff at equilibrium corresponds to the higher of these payoffs. Hence, the maximizing payoff we obtain will always correspond to the true payoffs, as auxiliary payoffs are always dominated by true payoffs. This can be done analogously via showing that we can limit our attention to those action zones such that sender utility can be extended uniformly continuously to the boundaries.

For each $i = 1, \dots, k$, $\tau(\mu_i)\hat{u}^S_a(\mu_i)$ is a product of uniformly continuous bounded functions, hence it is uniformly continuous.
Therefore the overall sum is uniformly continuous, since a finite sum of uniformly continuous functions is uniformly continuous.

If the sender objective on $\zeta_F$, which is $E_\tau \hat{u}^S|_{\zeta_F} : \zeta_F \to \mathbb{R}$, is uniformly continuous, then it can be extended to a continuous function on the closure of $\zeta_F$, denoted $E_\tau \hat{u}^S_a|_{\bar{\zeta}_F} : \bar{\zeta}_F \to \mathbb{R}$. $E_\tau \hat{u}^S_a|_{\bar{\zeta}_F}$ attains a maximum over $\bar{\zeta}_F$ by the Weierstrass theorem, since $\zeta_F$ is bounded and its closure is compact by the Heine-Borel Theorem.

Next, observe that $\zeta_F$ can be written as the intersection of three sets, $\zeta_F = \Sigma \cap I_F \cap \mathcal{A}$, where we define:

1- The set of all Bayes Plausible information structures as $\Sigma = \{(\mu_1, \dots, \mu_k) \in \Delta(\Omega)^k : \mu_0 \in \mathrm{co}(\{\mu_1, \dots, \mu_k\})\}$.

2- The set of information structures with $\mathrm{supp}(\tau)$ on $F$ as $I_F$.

3- The set of affinely independent information structures: $\mathcal{A} = \{(\mu_1, \dots, \mu_k) \in \Delta(\Omega)^k : (\mu_1, \dots, \mu_k) \text{ is affinely independent}\}$.

First, note that $I_F$ is a closed subset of $\mathbb{R}^{k \times N}$ because Cartesian products of closed sets are closed. We proceed by proving the following claim.

Lemma 9. $\Sigma$ is closed.

Proof.

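We will show this by contradiction. Suppose not. Then $\exists \tau \in \bar{\Sigma}$ s.t. $\mu_0 \notin \mathrm{co}(\mathrm{supp}(\tau))$. If $\tau \in \Sigma$, then $\mu_0 \in \mathrm{co}(\mathrm{supp}(\tau))$, by definition of $\Sigma$. Then $\tau \in \bar{\Sigma} \setminus \Sigma = \mathrm{Bd}(\Sigma)$. Then $\tau$ is a limit point of $\Sigma$. Denote the elements of $\mathrm{supp}(\tau) = \{\mu_1, \dots, \mu_k\}$.

So there exists a sequence $\{\tilde{\tau}^n\}_{n \in \mathbb{N}} = (\tilde{\mu}^n_1, \dots, \tilde{\mu}^n_k)$ s.t. $\tilde{\tau}^n \in \Sigma\ \forall n \in \mathbb{N}$ and $\{\tilde{\tau}^n\}_{n \in \mathbb{N}}$ converges to $\tau$. Since we supposed that $\tau \notin \Sigma$, there exists no $(\alpha_1, \dots, \alpha_k)$ that satisfies $\alpha_i > 0$, $\sum_{i=1}^{k} \alpha_i = 1$ and $\sum_{i=1}^{k} \alpha_i \mu_i = \mu_0$. Furthermore, since $\tilde{\tau}^n \in \Sigma$ for each $n \in \mathbb{N}$, we have unique $\tilde{\alpha}^n = (\tilde{\alpha}^n_1, \dots, \tilde{\alpha}^n_k)$ s.t. $\sum_{i=1}^{k} \tilde{\alpha}^n_i \tilde{\mu}^n_i = \mu_0$ for each $n \in \mathbb{N}$ by lemma 3.2. Then there must exist a $\delta > 0$ such that $\forall n \in \mathbb{N}$, $\|\sum_{i=1}^{k} \tilde{\alpha}^n_i \tilde{\mu}^n_i - \sum_{i=1}^{k} \tilde{\alpha}^n_i \mu_i\| = \|\sum_{i=1}^{k} \tilde{\alpha}^n_i (\tilde{\mu}^n_i - \mu_i)\| > \delta$, where $\|\cdot\|$ denotes the Euclidean norm. Also, by the fact that $\tilde{\tau}^n \to \tau$, we have that $\forall \varepsilon > 0\ \exists n \in \mathbb{N}$ s.t. $\|\tilde{\tau}^n - \tau\|_F < \varepsilon$, where $\|\cdot\|_F$ denotes the Frobenius norm on $\mathbb{R}^{k \times N}$. So we get:

$$\varepsilon > \|\tilde{\tau}^n - \tau\|_F = \sum_{i=1}^{k} \|\tilde{\mu}^n_i - \mu_i\| > \sum_{i=1}^{k} \tilde{\alpha}^n_i \|\tilde{\mu}^n_i - \mu_i\|$$
$$\ge \Big\|\sum_{i=1}^{k} \tilde{\alpha}^n_i (\tilde{\mu}^n_i - \mu_i)\Big\| = \Big\|\sum_{i=1}^{k} \tilde{\alpha}^n_i \tilde{\mu}^n_i - \sum_{i=1}^{k} \tilde{\alpha}^n_i \mu_i\Big\| > \delta$$

where the equality in the first line follows from the definition of the Frobenius norm, and the second line follows from applying Jensen's inequality. Then picking $\varepsilon = \delta$, we have $\delta > \delta$, a contradiction. Therefore, for any $\tau \in \bar{\Sigma}$, $\mu_0 \in \mathrm{co}(\mathrm{supp}(\tau))$. This shows that $\Sigma$ is closed. $\blacksquare$

Let $\mathbb{F}$ denote the finite set of all possible collections $F$. We can characterize the sender's maximization problem as follows:

$$\max_{F \in \mathbb{F}} \big(V(F, \mu_0, k)\big)$$

where $V(F, \mu_0, k) = \max_{\tau \in \zeta_F} E_\tau \hat{u}^S_a|_{\zeta_F}$ if the maximum exists, with the added convention that $V(F, \mu_0, k) = -\infty$ if the feasible set is empty or the maximum doesn't exist for a given collection $F$.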
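Define $\bar{V}(F, \mu_0, k)$ analogously for $\max_{\tau \in \bar{\zeta}_F} E_\tau \hat{u}^S_a|_{\bar{\zeta}_F}$.

If $\bar{\zeta}_F$ is nonempty, from the first part of the proof, we know that $E_\tau \hat{u}^S_a|_{\bar{\zeta}_F}$ attains a maximum since it is a continuous function over a compact set. Note that $\zeta_F$ is always nonempty for some $F$: trivially, we can find two facets of the $R_a$ for which $\mu_0 \in R_a$, and define the line segment passing through $\mu_0$ with endpoints at these two facets as a Bayes Plausible, affinely independent information structure. (If $\mu_0$ is an element of a single facet of $R_a$, then the argument still applies where the two points are on the same facet.) Therefore $\max_{F \in \mathbb{F}} \big(\bar{V}(F, \mu_0, k)\big)$ also exists. Now we will show that the maximum of the original problem also exists by showing:

$$\max_{F \in \mathbb{F}} \big(V(F, \mu_0, k)\big) = \max_{F \in \mathbb{F}} \big(\bar{V}(F, \mu_0, k)\big).$$

Denote the maximizing collection of facets for the second problem as $F^* \in \mathbb{F}$ and let $\tau^* \in \bar{\zeta}_{F^*}$ be the corresponding information structure for the maximization problem $\max_{\tau \in \bar{\zeta}_{F^*}} E_\tau \hat{u}^S_a|_{\bar{\zeta}_{F^*}}$. If $\tau^* \in \zeta_{F^*}$, we're done and $\max_{F \in \mathbb{F}} \big(V(F, \mu_0, k)\big)$ also exists. For the other case, suppose $\tau^* \in \bar{\zeta}_{F^*} \setminus \zeta_{F^*}$.

From the argument in the first part of this proof, we know:

$$\bar{\zeta}_F = \overline{I_F \cap \Sigma \cap \mathcal{A}} \subseteq \bar{I}_F \cap \bar{\Sigma} \cap \bar{\mathcal{A}}.$$

But we also have $\bar{I}_F = I_F$ and $\bar{\Sigma} = \Sigma$. Hence $\bar{\zeta}_F \subseteq I_F \cap \Sigma \cap \bar{\mathcal{A}}$. But then:

$$\bar{\zeta}_F \setminus \zeta_F \subseteq \big(I_F \cap \Sigma \cap \bar{\mathcal{A}}\big) \setminus \big(I_F \cap \Sigma \cap \mathcal{A}\big) = \big(I_F \cap \Sigma \cap (\bar{\mathcal{A}} \setminus \mathcal{A})\big).$$

So we must have $\tau^* \in \big(I_F \cap \Sigma \cap (\bar{\mathcal{A}} \setminus \mathcal{A})\big)$. However, if $\tau^* \in (\bar{\mathcal{A}} \setminus \mathcal{A})$, it is not affinely independent. Then, by Theorem 4, we can always find some $F' \in \mathbb{F}$ and $\tau' \in \zeta_{F'}$ such that $E_{\tau'} \hat{u}^S_a|_{\zeta_{F'}} \ge E_{\tau^*} \hat{u}^S_a|_{\bar{\zeta}_{F^*}}$, contradicting the fact that $\tau^*$ is the maximum. Therefore $\tau^* \in \bar{\zeta}_{F^*} \setminus \zeta_{F^*}$ cannot hold, and the maximum of the original problem will always exist. This completes the proof of Theorem 1.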
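The two conditions that recur in this argument, Bayes Plausibility and affine independence of the support, can be checked numerically for a candidate support. The sketch below is only an illustration under stated assumptions: the function names and the three-state example are hypothetical, and the least-squares step recovers the weights reliably only when the support is affinely independent, which is why that check is done first.

```python
import numpy as np

def is_affinely_independent(points, tol=1e-9):
    """Points are affinely independent iff the differences to the first point have full rank."""
    pts = np.asarray(points, dtype=float)
    if len(pts) <= 1:
        return True
    diffs = pts[1:] - pts[0]
    return np.linalg.matrix_rank(diffs, tol=tol) == len(pts) - 1

def bayes_plausible_weights(points, prior, tol=1e-8):
    """Return convex weights w with w @ points == prior, or None if no such weights are found.

    Intended for affinely independent supports, where the weights are unique.
    """
    pts = np.asarray(points, dtype=float)
    T = np.vstack([pts.T, np.ones(len(pts))])              # append the sum-to-one row
    target = np.append(np.asarray(prior, dtype=float), 1.0)
    w, *_ = np.linalg.lstsq(T, target, rcond=None)
    if np.allclose(T @ w, target, atol=tol) and np.all(w >= -tol):
        return w
    return None

def is_feasible(points, prior):
    """Check affine independence first, then Bayes Plausibility."""
    return is_affinely_independent(points) and bayes_plausible_weights(points, prior) is not None

# Hypothetical three-state example.
prior = np.array([0.4, 0.3, 0.3])
support = [(0.7, 0.2, 0.1), (0.1, 0.4, 0.5)]               # prior is their midpoint
dependent_support = support + [tuple(prior)]               # adds a collinear third point
```

Here `support` passes both checks with weights close to `(0.5, 0.5)`, while `dependent_support` fails the affine independence check because its three points are collinear.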
Proof of Theorem 2

Let $\tau^*$ be the optimal information structure solving the sender's maximization problem given in definition 4, and suppose for a contradiction that $\sup\{z \mid (\mu_0, z) \in CH_k(\hat{u}^S)\} \ne E_{\tau^*} \hat{u}^S$.

For the first case, let $\sup\{z \mid (\mu_0, z) \in CH_k(\hat{u}^S)\} < E_{\tau^*} \hat{u}^S$. However, taking the beliefs in $\mathrm{supp}(\tau^*) = \{\mu_1, \dots, \mu_k\}$, we know that by the feasibility of $\tau^*$, $\exists \{\tau^*(\mu_1), \dots, \tau^*(\mu_k)\}$ such that $\sum_{i \le k} \tau^*(\mu_i)\mu_i = \mu_0$ and $\sum_{i \le k} \tau^*(\mu_i) = 1$, $1 \ge \tau^*(\mu_i) \ge 0$. Thus, by definition 5, $(\mu_0, E_{\tau^*} \hat{u}^S) \in CH_k(\hat{u}^S)$. Therefore, we cannot have $\sup\{z \mid (\mu_0, z) \in CH_k(\hat{u}^S)\} < E_{\tau^*} \hat{u}^S$.

For the other case, let $\sup\{z \mid (\mu_0, z) \in CH_k(\hat{u}^S)\} > E_{\tau^*} \hat{u}^S$. Since $(\mu_0, z) \in CH_k(\hat{u}^S)$, take the set of points $\{\hat{u}^S(\mu_1'), \dots, \hat{u}^S(\mu_k')\}$ and convex weights $\{\alpha_1, \dots, \alpha_k\}$ with $\sum_{i \le k} \alpha_i \mu_i' = \mu_0$ and $\sum_{i \le k} \alpha_i \hat{u}^S(\mu_i') = z$, also satisfying $\sum_{i \le k} \alpha_i = 1$, $1 \ge \alpha_i \ge 0$. We know these points and weights must exist by definition 5. Now observe that $\tau' = \{\mu_1', \dots, \mu_k'\}$ must be a feasible solution to the sender's maximization problem. $\{\mu_1', \dots, \mu_k'\}$ must be elements of some facets, because otherwise by theorem 3, we can show the existence of another information structure with higher expected utility, contradicting the fact that $(\mu_0, z)$ is a supremum. It must also be the case that $\{\mu_1', \dots, \mu_k'\}$ are affinely independent, because otherwise by theorem 4, we can contradict $(\mu_0, z)$ being a supremum again. We know that $\tau'$ satisfies Bayes Plausibility by the definition given above. Therefore $\tau' \in \zeta_F$ for some facet combination $F$, and it could have been picked instead of $\tau^*$ in the maximization problem, contradicting the optimality of $\tau^*$. This completes the proof of Theorem 2.

Proof of Theorem 3

Suppose $\tau_k$ is the optimal information structure with $k$ signals, and $\tau_{k-1}$ is the optimal information structure with $k-1$ signals. Denote by $V^*(k)$, $V^*(k-1)$ the utilities obtained using these information structures.

Let $\mathrm{supp}(\tau_k) = \{\mu_1, \dots, \mu_k\}$. Observe that we can create a $(k-1)$-dimensional information structure by merging $\mu_1, \mu_2$, and define a new posterior as their mixture:

$$\mu_{1,2} = \frac{\tau_k(\mu_1)}{\tau_k(\mu_1)+\tau_k(\mu_2)}\,\mu_1 + \frac{\tau_k(\mu_2)}{\tau_k(\mu_1)+\tau_k(\mu_2)}\,\mu_2$$

and define the new information structure with $\mathrm{supp}(\tau') = \{\mu_{1,2}, \mu_3, \dots, \mu_k\}$, which maintains Bayes Plausibility with the new weights $\{(\tau_k(\mu_1)+\tau_k(\mu_2)), \tau_k(\mu_3), \dots, \tau_k(\mu_k)\}$.

Now, we can define $k$ different information structures containing $k-1$ posteriors, built from the mixtures $\mu_{1,2}, \mu_{2,3}, \dots, \mu_{k-1,k}, \mu_{k,1}$, where we mix the consecutive posteriors $\mu_l, \mu_{l+1}$ and use the weights defined above to satisfy Bayes Plausibility. By the optimality of $\tau_{k-1}$ among the information structures with $k-1$ signals, we have:

$$V^*(k-1) \ge (\tau_k(\mu_1)+\tau_k(\mu_2))\,u^S(\mu_{1,2}) + \tau_k(\mu_3)u^S(\mu_3) + \dots + \tau_k(\mu_k)u^S(\mu_k),$$
$$V^*(k-1) \ge \tau_k(\mu_1)u^S(\mu_1) + (\tau_k(\mu_2)+\tau_k(\mu_3))\,u^S(\mu_{2,3}) + \dots + \tau_k(\mu_k)u^S(\mu_k),$$
$$\vdots$$
$$V^*(k-1) \ge \tau_k(\mu_1)u^S(\mu_1) + \dots + (\tau_k(\mu_{k-1})+\tau_k(\mu_k))\,u^S(\mu_{k-1,k}),$$
$$V^*(k-1) \ge \tau_k(\mu_2)u^S(\mu_2) + \tau_k(\mu_3)u^S(\mu_3) + \dots + (\tau_k(\mu_1)+\tau_k(\mu_k))\,u^S(\mu_{k,1}),$$

where $\mu_{l,l+1} = \frac{\tau_k(\mu_l)}{\tau_k(\mu_l)+\tau_k(\mu_{l+1})}\,\mu_l + \frac{\tau_k(\mu_{l+1})}{\tau_k(\mu_l)+\tau_k(\mu_{l+1})}\,\mu_{l+1}$. Each posterior $\mu_i$ is left unmerged in $k-2$ of these inequalities and is merged in the remaining two. Dividing all inequalities by $k$ and summing up, we have:

$$V^*(k-1) \ge \frac{k-2}{k}\,V^*(k) + \frac{2}{k}\,\tilde{V} \ge \frac{k-2}{k}\,V^*(k)$$

where $\tilde{V}$ is the utility gained from the $k$ dimensional information structure consisting of the posteriors $\{\mu_{1,2}, \mu_{2,3}, \dots, \mu_{k-1,k}, \mu_{k,1}\}$, each induced with weight $\frac{1}{2}(\tau_k(\mu_l)+\tau_k(\mu_{l+1}))$, and the last inequality uses $\tilde{V} \ge 0$.
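The averaging step can be checked numerically. The sketch below is an illustration only: it assumes a binary state so that each posterior is a single number, uses an arbitrary nonnegative stand-in for the sender's utility `u`, and all names are hypothetical. It verifies the counting identity behind the step, namely that the average of the $k$ merged values equals $\frac{k-2}{k}$ times the original value plus $\frac{2}{k}$ times the value of the structure built from all $k$ mixtures.

```python
import numpy as np

def merged_value(weights, posteriors, u, l):
    """Sender value after pooling posteriors l and l+1 (cyclically) into their mixture."""
    k = len(weights)
    j = (l + 1) % k
    w = weights[l] + weights[j]
    mix = (weights[l] * posteriors[l] + weights[j] * posteriors[j]) / w
    rest = sum(weights[i] * u(posteriors[i]) for i in range(k) if i not in (l, j))
    return w * u(mix) + rest

def all_mixtures_value(weights, posteriors, u):
    """Value of the k-point structure made of the k cyclic mixtures, each with half the pooled weight."""
    k = len(weights)
    total = 0.0
    for l in range(k):
        j = (l + 1) % k
        w = weights[l] + weights[j]
        mix = (weights[l] * posteriors[l] + weights[j] * posteriors[j]) / w
        total += 0.5 * w * u(mix)
    return total

# Hypothetical example: binary state, posterior = probability of the high state.
u = lambda m: m ** 2                      # arbitrary nonnegative stand-in utility
weights = np.array([0.1, 0.2, 0.3, 0.4])
posteriors = np.array([0.1, 0.4, 0.6, 0.9])
k = len(weights)

value_k = float(np.sum(weights * u(posteriors)))
average_merged = np.mean([merged_value(weights, posteriors, u, l) for l in range(k)])
identity_rhs = (k - 2) / k * value_k + 2 / k * all_mixtures_value(weights, posteriors, u)
```

Since every merged structure uses only $k-1$ posteriors, each merged value is bounded above by $V^*(k-1)$, so the same bound applies to `average_merged`.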
This implies the following upper bound on the value of an additional signal at $k-1$ signals:

$$V^*(k) - V^*(k-1) \le \frac{2}{k}\,V^*(k).$$

Equivalently, the following relationship must hold between the maximum utilities attainable with $k$ and $k-1$ signals:

$$\frac{k-2}{k}\,V^*(k) \le V^*(k-1) \le V^*(k).$$

Proofs of the statements in section 4.1.1

Let p E, (cid:126)E q denote an Euclidean aﬃne space with E being an aﬃne space over the set ofreals such that the associated vector space is an Euclidian vector space. We will call E theEuclidean Space and (cid:126)E the space of its translations. For this example we will focus on three34imensional Euclidian aﬃne space i.e. (cid:126)E has dimension 3. We equip (cid:126)E with Euclidean dotproduct as its inner product, inducing the Euclidian norm as a metric. To simplify notation,we will simply write p R , (cid:126) R q . Given this structure, we can deﬁne the unitary simplex inthe aﬃne space R by the following set where ω i corresponds to the point with 1 in its i th coordinate and 0 in all of its other coordinates. We deﬁne the state space Ω “ t ω , ω , ω u .The simplex then becomes:∆ p Ω q “ " µ P R | µ “ λ ω ` λ ω ` λ ω such that ÿ i “ λ i “ ą λ i ą @ i P t , , u * Building on the problem deﬁnition in the main text, we focus on Bayesian Persuasion gameswhere the receiver preferences are described with thresholds, i.e. the receiver prefers action a i P t a , a , a u if and only if the posterior belief µ s P ∆ p Ω q such that µ s p ω i q ě ¯ π , and prefers a otherwise. Hence, we can say that for i P t , , u , j P t , , , u and j ‰ i we have E µ s r u R p a i , ω qs ě E µ s r u R p a j , ω qs if and only if µ s p ω i q ą ¯ π . Deﬁne δ “ p , ´ ¯ π, ´p ´ ¯ π qq , δ “ p ´ ¯ π, , ´p ´ ¯ π qq and δ “ p ´ ¯ π, ´p ´ ¯ π q , q and Γ “ p ¯ π, , ´ ¯ π q , Γ “ p , ¯ π, ´ ¯ π q and Γ “ p , ´ ¯ π, ¯ π q . The action zones will become: R i “ t µ s P ∆ p ω q| µ is ě ¯ π i u “ ∆ p ω q X tp µ ´ Γ i q ¨ δ i ě | µ P R u , where ¨ denotes the Euclidean dot product. Proof of lemma 6

Let us ﬁrst characterize the set ∆ c . We have ∆ c “ ∆ p Ω qz co p R Y R Y R qq . We notethat: co p R Y R q “ co pt ω , p ¯ π, ´ ¯ π, q , p ¯ π, , ´ ¯ π q , ω , p ´ ¯ π, ¯ π, q , p , ¯ π, ´ ¯ π quq“ co t ω , p ¯ π, , ´ ¯ π q , ω , p , ¯ π, ´ ¯ π qu (2)and similarly for co p R Y R q and co p R Y R q we have thatco p R Y R q “ co t ω , p ¯ π, ´ ¯ π, q , ω , p , ´ ¯ π, ¯ π qu (3)co p R Y R q “ co t ω , p ´ ¯ π, , ¯ π q , ω , p ´ ¯ π, , ¯ π qu (4)The second line follows from the ﬁrst line since the t ω , p ¯ π, , ´ ¯ π q , ω , p , ¯ π, ´ ¯ π qu corresponds to the extreme points of co pt ω , p ¯ π, ´ ¯ π, q , p ¯ π, , ´ ¯ π q , ω , p ´ ¯ π, ¯ π, q , p , ¯ π, ´ ¯ π quq . Similarly using equation (2), (3) and (4), co p R i Y R j q can be identiﬁed as the intersectionof a half space and the simplex i.e.co p R Y R q “ ∆ p Ω q X tp µ ´ p ¯ π, , ´ ¯ π qq ¨ p´ ¯ π, ¯ π, q ě | µ P R u (5)co p R Y R q “ ∆ p Ω q X tp µ ´ p ¯ π, ´ ¯ π, qq ¨ p´ ¯ π, , ¯ π q ě | µ P R u (6)co p R Y R q “ ∆ p Ω q X tp µ ´ p ´ ¯ π, ¯ π, qq ¨ p , ´ ¯ π, ¯ π q ě | µ P R u (7) co denotes convex hull operator and co k denotes k -convex hull i.e. co k p A q are the points that can berepresented as convex combination of k elements in A .

35o we can deﬁne ∆ c Ă ∆ p Ω q as ∆ c “ ∆ p Ω qz co p R Y R Y R q . By (5), (6) and (7) we cansee that ∆ c is deﬁned as∆ c “ t µ “ p µ , µ , µ q P ∆ p Ω q|@ i P t , , u , µ i ą ´ ¯ π u By deﬁnition of ∆ c and ∆ p Ω q this set is non-empty if and only if ¯ π ą . Proof of lemma 7

We can identify the upper bounds through the following problem: V p , µ q “ max i Pt , , u ˆ max µ P ∆ c ,µ i P R i ,µ P R ´ d p µ i , µ q d p µ , µ q ˙ subject to µ P co p µ i , µ q . First note that by the symmetry of the problem choice of i is not relevant. Withoutloss of generality we pick i “

1. Moreover, the constraint that µ P co p µ i , µ q impliesthat we are searching for a point with the goal of minimizing the distance with µ i andmaximizing the distance with µ . The maximizing triple is therefore p µ ˚ , µ ˚ , µ ˚ q with µ ˚ “p ´ ¯ π, ´ ¯ π, π ´ q , µ ˚ “ p ´ ¯ π , ´ ¯ π , ¯ π q µ ˚ “ p , , q . The solution follows from twoobservations. One is that given two points µ and µ i there is a unique line passing throughthese points hence µ is identiﬁed to be the furthest point on that line such that µ P R .The line always intersects with R as otherwise µ R ∆ c by construction. Then we choose µ and µ i to minimize d p µ , µ i q where d p µ , µ i q is measured in the space of translations of R .Given this solution, we have that: ||p ¯ π, ´ ¯ π , ´ ¯ π q ´ p π ´ q , ´ ¯ π, ´ ¯ π || “ ? p ´ ¯ π q||p ¯ π, ´ ¯ π , ´ ¯ π q ´ p , , qq|| “ ?

62 ¯ π Giving us that V p , µ q “ π ´ π . Similarly, we can solve: V p , µ q “ min i Pt , , u ˆ max µ i P R i ,µ P R ˆ min µ P ∆ c ´ d p µ i , µ q d p µ , µ q ˙˙ subject to µ P co p µ i , µ q . We observe that the point µ ˚ “ B “ p , , q is a solution. This follows from the fact that B is the barycenter of the simplex, and R , R and R are deﬁned with the same threshold ¯ π .Thus, any prior µ ‰ B implies that the µ is closer to one of the action zones. Minimizingthe objective, we pick µ ˚ “ B . Now given this choice, we choose µ to maximize leading tothe choice of µ ˚ “ p , , q and µ ˚ “ p ´ ¯ π , ´ ¯ π , ¯ π q .Interestingly, the posteriors induced in the optimal information structure for the twoproblems are the same, but they are induced with diﬀerent probabilities. This follows from36he fact that the hyperplanes deﬁning the action zones is parallel to one of the hyperplanesdeﬁning the simplex. So we can write V p , µ q “ π . Proof of corollary 2

Observe that with ﬁxed ¯ π “ {

3, we have V p , µ q “ “ V p , µ q . Also, V p , µ q “ π ´ π is increasing in ¯ π and V p , µ q “ π is decreasing in ¯ π . By continuity of distance, the ob-jective function in the deﬁnition of V p , µ q and V p , µ q are continuous. So for any other µ P ∆ c , V p , µ q takes every value between V p , µ q and V p , µ q by intermediate value the-orem. By deﬁnition of value of precision, V p , µ q ą implies decreasing value of precisionand V p , µ q ă implies increasing value of precision.37 Additional Results and Details

B.1 Properties of ˆ u S and sender-preferred zones Deﬁnition 7. S a a Ă R a denotes the region where the sender preferred action a is takenin region R a . Formally S a a Ă R a is deﬁned as S a a : “ t µ P ∆ p Ω q : µ P R a and a P ˆ A p µ q ˆ u S p a , µ q ě ˆ u S p ˜ a, µ q @ ˜ a P ˆ A p µ qu . Remark . Observe that by deﬁnition we have that @ a, a P A we have that S a a Ď S a a . Lemma 10. @ a, a P A S a a is closed and convex. Proof.

We can deﬁne S a a “ ´ X a ‰ a (cid:32) µ P R a : ř i ă ď Ω µ p ω q ` u S p a, ω q ´ u S p a , ω q ˘ ě ( a P A p µ q ¯ ,which is intersection of ﬁnitely many half-spaces and closed, convex set R a . (cid:4) Lemma 11. @ a, a P A , ˆ u S is an aﬃne function over S a a . Proof.

For every posterior µ P ∆ p Ω q the receiver is indiﬀerent between taking actions a P ˆ A p µ q . For every µ P S a a receiver takes action a , by deﬁnition of sender preferred equi-librium. Given a ﬁxed action a , ˆ u S p a q “ E µ p u S p a, ω qq , which is aﬃne over the simplex. (cid:4) Corollary 3. @ a P A , ˆ u S is a continuous function over int p R a q . Remark . ˆ u S has jump discontinuities only at µ P ∆ p µ q such that µ P R a X R a with R a X R a “ Bd p R a q X Bd p R a q . B.2 Properties of ˆ u R and receivers preferences for signal spacecardinality Lemma 12.

In finite persuasion games, receiver utility in equilibrium $\max_{a \in A} \hat{u}^R(a, \omega)$ is convex over $\Delta(\Omega)$. In fact, it is a polyhedral convex function.

Proof.

Observe that $\max_{a \in A} \hat{u}^R(a, \omega) = \max\{E_\mu u^R(a', \omega)\}_{a' \in A}$. $E_\mu u^R(a', \omega)$ denotes the expected utility for a fixed action $a' \in A$, which is an affine function over $\Delta(\Omega)$, and therefore convex. Then we have that the epigraph of $\max_{a \in A} \hat{u}^R(a, \omega)$ is a polyhedral convex set. $\blacksquare$

An immediate implication is the following.

Corollary 4.

Let τ be the optimal information structure with k -signals and τ be the optimalinformation structure with with k ` signals. If τ and τ are Blackwell comparable we havethat receiver prefers τ over τ . The corollary follows from the deﬁnition of Blackwell comparability, and the fact thatthe receiver preferences must be convex. f is a polyhedral convex function if and only if its epigraph is polyhedral, as deﬁned in Rockafellar(1970). .3 Formal preferences for example 4.2.1 (Optimal Advice Seek-ing) We say that the sender’s utility only depends on the action, and a and a are preferredover a and a , and the default action is the least preferred action, which we call a . Forthe parametric example drawn in ﬁgure 5, we set u s p a q “ u s p a q “ u s p a q “ u s p a q “ u s p a q “ ω , ω or ω are high enough, they prefer a , a , a respectively. The defaultaction is a , which is taken when the beliefs are ‘leaning towards’ ω , and there is anotheraction a , which is taken when the beliefs are ‘leaning away from’ ω but are not suﬃcientlyclose to ω or ω . Formally, for the example in the ﬁgure, we deﬁne receiver utility as follows: u r p ω , a q “ ´ , u r p ω , a q “ , u r p ω , a q “ u r p ω , a q “ { , u r p ω , a q “ { , u r p ω , a q “ { u r p ω , a q “ ´ { , u r p ω , a q “ { , u r p ω , a q “ ´ { u r p ω , a q “ ´ { , u r p ω , a q “ ´ { , u r p ω , a q “ { u r p ω , a q “ ´ { , u r p ω , a q “ { , u r p ω , a q “ { B.4 Simplicity in Persuasion

In the main text, we have shown that we can restrict attention to affinely independent structures while searching for the optimal information structure. The goal of this section is to clarify the connection between affine independence of information structures, preferences towards simplicity and cognitive costs arising from complexity. We formalize cognitive costs by making the sender not only care about the payoffs of the persuasion game, but also the complexity of the information structures implemented.

Our approach and definition of complexity is motivated by the seminal paper of Rubinstein (1986), who studies complexity of automata strategies in repeated games. We opt for a similar simple formalization that defines complexity of an information structure by the number of different posteriors induced, i.e. the cardinality of the support of $\tau \in \Delta(\Delta(\Omega))$. This can be analogously thought of as having a mental cost for each posterior induced by a signaling strategy. We work on the limiting case of infinitesimal costs. Thus, the sender primarily cares about the payoff, and cares lexicographically, only secondarily, about the number of posteriors induced. Formally, we can define the preference relation $\succ$ of the sender by defining $\tau \succ \tau'$ if $(E_\tau \hat{u}^s, -|\mathrm{supp}(\tau)|) >_L (E_{\tau'} \hat{u}^s, -|\mathrm{supp}(\tau')|)$, where $>_L$ is the usual lexicographic order on $\mathbb{R}^2$.

This notion of complexity is fairly simple and intuitive, and captures some important considerations. The simplest way to motivate the cost of an additional signal is by assuming that generating higher dimensional signals is costly, and committing to an information structure with more signals and more action recommendations implies that the sender should invest in more capacity to send each different signal that is sent with positive probability.

Given a standard persuasion game with no limitations on the signal space and a sender who has preferences for simplicity, we can extend the result of Theorem 2. We can now state affine independence as a necessary condition of optimality and state that for every information structure $\tau$ whose support $\mu$ is not affinely independent there exists a strictly better information structure that is preferred by the sender. The result follows from the construction provided in the proof of Theorem 2. Existence of the optimal information structure is again established by Theorem 3.

These observations present an additional property of affinely independent information structures, as they also happen to be the simplest (in the sense of the lexicographic order defined above) possible information structures, within the set of information structures that achieve the same utility level. Hence, our analysis of Bayesian Persuasion with coarse communication yields a general solution to Bayesian Persuasion games where the agents have preferences for simplicity.

The lexicographic preference order defined above is analogous to having infinitesimal costs for additional signals. In general, using our definition for the value of precision, the sender can decide whether it is worth incurring the cost of an additional signal when costs are non-trivial.

C Extension: Continuum of States

In this section, we will extend our results to the case where the state of the world $\omega$ can take values in a continuum, i.e. $\Omega = [a, b]$. Without loss of generality, set $a = 0$, $b = 1$. Let $\tau$ be a signal or an information structure, and the signal space be $S$ with cardinality $K$. The general setting is akin to Gentzkow and Kamenica (2016).

Suppose the action of the receiver only depends on the expected value of the state variable, $E_\mu(\omega)$, where $\mu$ is a posterior belief (a probability distribution) over $\Omega$. Let $F$ be the CDF of the prior belief, with the mean $m_0$. A signal realization $s \in S$ will induce a posterior belief with CDF $\mu_s$.

Footnote: $(x_1, x_2) >_L (y_1, y_2)$ if and only if $x_1 > y_1$, or $x_1 = y_1$ and $x_2 > y_2$. That is to say that $\tau \succ \tau'$ if and only if $E_\tau \hat{u}^s > E_{\tau'} \hat{u}^s$, or $E_\tau \hat{u}^s = E_{\tau'} \hat{u}^s$ and $|\mathrm{supp}(\tau)| < |\mathrm{supp}(\tau')|$.

$\tau$ will induce at most $K$ different posterior CDF's, denoted $\{\mu_1, \dots, \mu_k\}$ with corresponding means $\{m_1, \dots, m_k\}$. Note that $\tau$ will now induce a probability distribution over posterior means. Denote the CDF of this distribution of posterior means by $G$.

We make the following assumptions: the set of actions $A$ has cardinality $M$ and there exist cutoffs $\gamma_1, \dots, \gamma_m$ such that when $E_\mu(\omega) \in [\gamma_i, \gamma_{i+1}]$, the action $a_i$ is optimal for the receiver. Additionally, we assume that the sender's utility depends only on the receiver's action and that $u$ is an affine-closed function, and satisfies regularity conditions, defined in Dworczak and Martini (2019). Further, assume that the prior CDF, $F$, is continuous and has full support over $\Omega$.
These assumptions ensure that the optimal signal creates a distribution of posterior means which is a monotone partitional signal. A monotone partitional signal partitions the state space into at most $K$ continuous intervals such that for any interval in $\{[x_i, x_{i+1}]\}_{i=1}^{K}$, all the mass of $G$ is on $E(X \mid X \in [x_i, x_{i+1}])$.

Let $\underline{c}$ be the integral of the posterior mean function for the completely uninformative signal, which is equal to 0 below the prior mean and is a linear function with slope 1 above the prior mean. Similarly, let $\overline{c}$ be the integral of the posterior mean function for the fully revealing signal (which uses infinitely many signals). This signal reveals the state exactly, so $\overline{c}$ is equal to the integral of the prior.

Gentzkow and Kamenica (2016) show that the function $c$ for any signal must lie between $\underline{c}$ and $\overline{c}$. Note that both of these depend on the prior. Now note the following observation: the cardinality of the signal space, $K$, determines how many 'kinks' the function $c$ will have.

It is straightforward to observe that, with $k$ monotone partitional signals, we will have $k$ 'kinks' and a $(k+1)$-piecewise linear function $c$. This follows from the fact that we are interested in the integral of $G$. Therefore the sender's problem reduces to choosing the location of these $k$ kinks and the slope of the function $c$ at each kink, subject to the constraint that $c$ lies between $\underline{c}$ and $\overline{c}$. Recall our assumption of the existence of action cutoffs $\gamma_1, \dots, \gamma_m$ such that when $E_\mu(\omega) \in [\gamma_i, \gamma_{i+1}]$, the action $a_i$ is optimal for the receiver. The relationship between $\gamma_1, \dots, \gamma_m$ and the signal partitions will not be obvious when $K < M$.

More precisely, let $c_G$ denote the integral of $G$, $c_G(x) = \int_0^x G(t)\,dt$. $c_G$ is a convex function, and we can analyze $c$ instead of analyzing signal distributions, as in Gentzkow and Kamenica (2016). This definition also makes our focus on piecewise linear functions clearer.
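The bounds described above can be checked numerically in a simple special case. The following is a minimal sketch, assuming a uniform prior on $[0,1]$ (so $\overline{c}(x) = x^2/2$ and $\underline{c}(x) = \max(0, x - 1/2)$); the function names and the example cut points are hypothetical and are not taken from the paper. For a monotone partition, $G$ is a step function that jumps at each posterior mean, so $c_G$ is piecewise linear with one kink per signal.

```python
# Illustrative sketch only: uniform prior on [0, 1] is an assumption.

def c_lower(x, prior_mean=0.5):
    """Integral of the posterior-mean CDF under the uninformative signal."""
    return max(0.0, x - prior_mean)

def c_upper(x):
    """Integral of the prior CDF F(t) = t (fully revealing signal)."""
    return 0.5 * x * x

def c_partition(x, cuts):
    """c_G(x) for the monotone partition with cut points `cuts` (0 = cuts[0] < ... < cuts[-1] = 1).

    Each interval [lo, hi] has probability hi - lo and posterior mean (lo + hi) / 2,
    so c_G(x) = sum_i p_i * max(0, x - m_i).
    """
    total = 0.0
    for lo, hi in zip(cuts[:-1], cuts[1:]):
        prob = hi - lo
        mean = 0.5 * (lo + hi)
        total += prob * max(0.0, x - mean)
    return total

cuts = [0.0, 0.3, 0.7, 1.0]              # hypothetical partition with K = 3 signals
grid = [i / 1000 for i in range(1001)]

# c_G should lie between the two bounds everywhere (small tolerance for rounding).
inside_band = all(
    c_lower(x) - 1e-12 <= c_partition(x, cuts) <= c_upper(x) + 1e-12 for x in grid
)

# Kinks of c_G sit at the posterior means, one per signal.
kinks = [0.5 * (lo + hi) for lo, hi in zip(cuts[:-1], cuts[1:])]
```

In this example $c_G$ touches $\overline{c}$ exactly at the cut points and has three kinks, one for each interval of the partition.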
Gentzkow and Kamenica (2016) show that each function in this interval can be represented by a signaling policy and vice versa. For tractability, we focus on solving the problem by choosing a function between $\underline{c}$ and $\overline{c}$ instead of finding signaling policies. Let $\gamma_1, \dots, \gamma_m$ be the action cutoffs, and let $c(x)$ be the chosen function. Action 1 is taken when $c_G(x) < \gamma_1$; let $U_1$ be the sender's utility when action 1 is taken. Action 2 is taken when $\gamma_1 \le c_G(x) < \gamma_2$; let $U_2$ be the sender's utility when action 2 is taken, and so forth. The sender's utility is then

$$U(c) = \sum_{k=1}^{m} \big( c'(\gamma_k) - c'(\gamma_{k-1}) \big)\, U_k,$$

where $c'$ denotes the left derivative, with the convention that $\gamma_0 = c'(\gamma_0) = 0$. Define the set $\mathcal{F}_k$ as:

$$\mathcal{F}_k = \Big\{ f \in C[0,1] \;\Big|\; \exists \text{ a partitioning of } [0,1] \text{ into } k \text{ intervals } \{s_l\}_{l=1}^{k} = \{[0, x_1], (x_1, x_2], \dots, (x_{k-2}, x_{k-1}], (x_{k-1}, 1]\} \text{ and } \{\varphi_l \in \mathbb{R}\}_{l=1}^{k} \text{ such that } k < K,\ \exists M \in \mathbb{N}\ \forall l \in \{1, \dots, k\}:\ 0 \le \varphi_l < M,\ \varphi_l \le \varphi_{l+1}, \text{ and each } s_l \text{ is connected and has non-zero measure, where } f \text{ can be written as}$$

$$f(x) = \mathbb{1}_{x \in s_1}\,(\varphi_1 x) + \sum_{l=2}^{k} \mathbb{1}_{x \in s_l} \Big( \varphi_l x - \sum_{j=2}^{l} (\varphi_j - \varphi_{j-1})\, x_{j-1} \Big) \Big\}.$$

Given the definitions and the signal space of focus, we establish existence of an optimal information structure for the sender.
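To make the piecewise linear form and the utility formula concrete, here is a minimal sketch that evaluates a function of the form used in the definition of the set above from its slopes $\varphi_l$ and breakpoints $x_l$, and then computes $U(c)$ from left derivatives at the cutoffs. All function names and the numerical example are hypothetical; the example assumes a uniform prior and the two-signal partition $\{[0, 0.4], (0.4, 1]\}$, whose $c_G$ has slopes $0$, $0.4$, $1$ and kinks at the posterior means $0.2$ and $0.7$.

```python
from bisect import bisect_left

def intercepts(slopes, breaks):
    """Intercepts c_1 = 0, c_l = c_{l-1} - (phi_l - phi_{l-1}) * x_{l-1}, which keep f continuous."""
    out = [0.0]
    for l in range(1, len(slopes)):
        out.append(out[-1] - (slopes[l] - slopes[l - 1]) * breaks[l - 1])
    return out

def evaluate(x, slopes, breaks):
    """f(x) = phi_l * x + c_l on the piece containing x (pieces are right-closed)."""
    piece = bisect_left(breaks, x)
    return slopes[piece] * x + intercepts(slopes, breaks)[piece]

def left_derivative(x, slopes, breaks):
    """Left derivative of f at x; set to 0 at x <= 0 to mirror the convention c'(gamma_0) = 0."""
    if x <= 0.0:
        return 0.0
    return slopes[bisect_left(breaks, x)]

def sender_utility(slopes, breaks, cutoffs, payoffs):
    """U(c) = sum_k (c'(gamma_k) - c'(gamma_{k-1})) * U_k with cutoffs = [gamma_0, ..., gamma_m]."""
    total = 0.0
    for k in range(1, len(cutoffs)):
        mass = (left_derivative(cutoffs[k], slopes, breaks)
                - left_derivative(cutoffs[k - 1], slopes, breaks))
        total += mass * payoffs[k - 1]
    return total

# Hypothetical example values (not from the paper).
slopes = [0.0, 0.4, 1.0]     # phi_1 <= phi_2 <= phi_3
breaks = [0.2, 0.7]          # kinks at the two posterior means
cutoffs = [0.0, 0.5, 1.0]    # gamma_0, gamma_1, gamma_2
payoffs = [0.0, 1.0]         # U_1, U_2

utility = sender_utility(slopes, breaks, cutoffs, payoffs)
```

Here the weight on $U_2$ is $c'(1) - c'(0.5) = 1 - 0.4$, the probability that the posterior mean lies above the cutoff $0.5$, which is how the formula turns slopes of $c$ into action probabilities.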

Theorem 4. $U(c_G)$ attains its maximum over $\mathcal{F}_k$.

Proof. The proof proceeds by a series of lemmas:

Lemma 13. $\mathcal{F}_k$ is pre-compact.

Proof. By the Arzelà-Ascoli theorem, to prove pre-compactness it suffices to show equicontinuity and equi-boundedness. Note that the way $\mathcal{F}_k$ is defined ensures that its elements are Lipschitz continuous. Equi-boundedness then follows trivially: each $f \in \mathcal{F}_k$ satisfies $f(0) = 0$ and has slopes in $[0, M)$, so $|f| \le M$ on $[0,1]$. For equicontinuity, take $M \in \mathbb{N}$ to be the largest Lipschitz constant for the set of functions in $\mathcal{F}_k$; a set of functions with a bounded Lipschitz constant forms an equicontinuous set. ∎

Lemma 14. $\mathcal{F}_k$ is closed.

Proof.

Suppose there exists a sequence of functions where $\forall n \in \mathbb{N}$, $f_n \in \mathcal{F}_k$ and $f_n \to f$ uniformly. We will show that $f \in \mathcal{F}_k$.

First, observe that all $f_n$ are Lipschitz continuous, and therefore $f$ must be Lipschitz continuous, in addition to being convex. Therefore $f$ is differentiable almost everywhere. Let the set $D \subset [0,1]$ represent the set of points where $f$ is differentiable. Since $f_n \to f$ uniformly and $f_n, f$ are convex, we have that $\forall x \in D$, $f_n'(x) \to f'(x)$. We proceed by proving the following claim.

Claim. $(0,1) \setminus D$ can have at most cardinality $K$.

Suppose not. Pick $K+1$ points from $(0,1) \setminus D$ and call this set $X$. By Claim 2 in the proof of Lemma 15, $\forall x \in X$ we can find $h(x) > 0$ such that $\forall h \in [0, h(x))$, there exists some $N_{h(x)} \in \mathbb{N}$ such that $\forall n > N_{h(x)}$, $f_n(x)$ and $f_n(x - h)$ are on the same linear piece. Similarly, we can also find $q(x) > 0$ such that $\forall q \in [0, q(x))$, there exists some $N_{q(x)} \in \mathbb{N}$ such that $\forall n > N_{q(x)}$, $f_n(x)$ and $f_n(x + q)$ are on the same linear piece. Since there are $K+1$ points in $X$, a finite number, we can define $q^* = \min_{x \in X} q(x)$, $h^* = \min_{x \in X} h(x)$, and $N^* = \max_{x \in X} \max(N_{h(x)}, N_{q(x)})$.

Since $f$ is differentiable almost everywhere, for every $x \in X \subseteq (0,1) \setminus D$ there must exist $\epsilon_1(x) > 0$ such that $f$ is differentiable in the interval $(x - \epsilon_1(x), x)$, and also $\epsilon_2(x) > 0$ such that $f$ is differentiable in the interval $(x, x + \epsilon_2(x))$. Let $\epsilon^* = \min_{x \in X} \min(\epsilon_1(x), \epsilon_2(x))$.

Define $\epsilon = \min(h^*, q^*, \epsilon^*)$. Now, $\forall x \in X$ and $\forall n > N^*$, we have that $f_n' = c_1(x)$ within the interval $(x - \epsilon, x)$ and $f_n' = c_2(x)$ within the interval $(x, x + \epsilon)$, for some constants $c_1(x), c_2(x)$. The intervals $(x - \epsilon, x)$ and $(x, x + \epsilon)$ are contained in the set $D$ for every value of $x$, by definition. Because $f_n' \to f'$ within the set $D$, we must have $f' = c_1(x)$ within $(x - \epsilon, x)$ and $f' = c_2(x)$ within $(x, x + \epsilon)$.

Since $f$ is continuous and convex, and $\forall x \in X$, $f'(x)$ does not exist, we must have that $\forall x$, $c_1(x) < c_2(x)$. However, this implies that $\forall n > N^*$, $f_n$ also has a kink at each of the $K+1$ points of $X$, which contradicts $f_n \in \mathcal{F}_k$, i.e., $f_n$ cannot be $K$-piecewise linear. This completes the proof that $(0,1) \setminus D$ can have at most cardinality $K$.

Without loss of generality, suppose the set has cardinality $K$. The case where the cardinality is less than $K$ is analogous. Let us order the elements of $(0,1) \setminus D$ as $0 < x_1 < x_2 < \dots < x_K < 1$. Take the collection of intervals whose union is $[0,1]$ as $\{s_l\}_{l=1}^{K} = \{[0, x_1], (x_1, x_2], \dots, (x_K, 1]\}$. Within the interior of each interval, $f$ is differentiable, hence we must have $f_n' \to f'$. Observe that $f'$ can take at most $K+1$ distinct values, since otherwise the convergence of $f_n'$ cannot hold. Moreover, $f'$ must be constant within the interior of each interval, since otherwise the cardinality of $(0,1) \setminus D$ would exceed $K$.

Therefore, we can write $\forall l < K$: $\forall x \in \mathrm{int}(s_l)$, $\varphi_l = f'(x)$, and hence $f(x) = \varphi_l x + c_l$ for some constant $c_l$. Moreover, since $\forall n \in \mathbb{N}$, $f_n$ and $f$ are continuous, $\forall x \in (0,1) \setminus D$ we must have:

$$\lim_{\epsilon \to 0} f(x + \epsilon) = \lim_{\epsilon \to 0} f(x - \epsilon) \;\Longrightarrow\; \lim_{\epsilon \to 0} \varphi_{l+1}(x + \epsilon) + c_{l+1} = \lim_{\epsilon \to 0} \varphi_l (x - \epsilon) + c_l.$$

Therefore, to preserve continuity we must have $c_{l+1} - c_l = -(\varphi_{l+1} - \varphi_l)\,x$. Also, observe that within the first interval $[0, x_1]$ we have $f_n' \to f' = \varphi_1$ and $f_n(x) = \varphi_{1,n}\,x \to f(x) = \varphi_1 x + c_1$. It follows that we must have $c_1 = 0$ and, for $l \ge 2$, $c_l = -\sum_{i=2}^{l} (\varphi_i - \varphi_{i-1})\,x_{i-1}$. Therefore, $f$ must have the desired form and $f \in \mathcal{F}_k$. This completes the proof that $\mathcal{F}_k$ is closed. ∎

Corollary 5. $\mathcal{F}_k$ is compact.

Proof.

Follows from the two lemmas above and the definition of a pre-compact set. ∎
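Compactness is one half of the existence argument; together with continuity of $U$ (Lemma 15) it guarantees that a maximizer exists. As a purely illustrative sketch of what such a maximizer looks like, the code below searches over two-signal monotone partitions $\{[0, t], (t, 1]\}$ under an assumed uniform prior. The cutoffs, payoffs, grid, and tie-breaking rule (a posterior mean equal to a cutoff triggers the higher action) are hypothetical choices made for this example, not values from the paper. Exact rational arithmetic is used so that ties at the cutoffs are resolved without rounding error.

```python
from bisect import bisect_right
from fractions import Fraction

def sender_value(t, thresholds, payoffs):
    """Sender's expected payoff from the partition {[0, t], (t, 1]} under a uniform prior."""
    value = Fraction(0)
    for lo, hi in ((Fraction(0), t), (t, Fraction(1))):
        prob = hi - lo
        if prob == 0:
            continue  # a degenerate interval carries no probability mass
        mean = (lo + hi) / 2
        # Number of thresholds at or below the posterior mean picks the action
        # (assumed tie-breaking: equality triggers the higher action).
        action = bisect_right(thresholds, mean)
        value += prob * payoffs[action]
    return value

# Hypothetical example: three actions, two signals (K = 2 < M = 3).
thresholds = [Fraction(2, 5), Fraction(4, 5)]
payoffs = [Fraction(0), Fraction(1), Fraction(3)]

grid = [Fraction(i, 100) for i in range(101)]
best_t = max(grid, key=lambda t: sender_value(t, thresholds, payoffs))
best_value = sender_value(best_t, thresholds, payoffs)
```

In this example the best cut point on the grid places the lower posterior mean exactly at the first cutoff, which illustrates why the relationship between the action cutoffs and the signal partition is not obvious when there are fewer signals than actions.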

Lemma 15. $U(c)$ is continuous over $\mathcal{F}_k$.

Proof.

Let $f_n \in C$ be a sequence of convex functions such that $f_n \to f$ uniformly. This implies $d(f_n, f) = \sup\{|f_n(x) - f(x)| : x \in [0,1]\} \to 0$ as $n \to \infty$. We need to show $U(f_n) \to U(f)$. Since $U$ depends only on the left derivatives at the fixed and exogenous points $\gamma_1, \dots, \gamma_k$, it suffices to show that these left derivatives converge.

Uniform convergence implies pointwise convergence, therefore $f$ is convex. Since $f$ is convex, left and right derivatives exist at every point. For any value $\gamma$ and for any $\epsilon > 0$, we need to show $\exists N \in \mathbb{N}$ such that $\forall n > N$, $|f_n'(\gamma) - f'(\gamma)| < \epsilon$, where we write the left derivative at $\gamma$ as:

$$f'(\gamma) = \lim_{h \to 0^-} \frac{f(\gamma + h) - f(\gamma)}{h}.$$

We proceed by proving two useful claims.

Claim 1. $\exists h_1 > 0$ such that $\forall\, 0 \le h < h_1$, $f(\gamma - h)$ and $f(\gamma)$ are on the same linear piece, meaning that $f(\gamma - h) = \beta(\gamma - h)$ and $f(\gamma) = \beta\gamma$ for some $\beta > 0$. This implies $f'(\gamma - h) = f'(\gamma)$, $\forall\, 0 \le h < h_1$.

Proof. Follows from the fact that, in our definition, each interval of the piecewise linear function is connected and has strictly positive measure. ∎

Claim 2. $\exists h_2 > 0$ that satisfies the following: $\forall h \in [0, h_2)$, there exists some $N_h \in \mathbb{N}$ for which it holds that $\forall n > N_h$, $f_n(\gamma - h)$ and $f_n(\gamma)$ are on the same linear piece.

Proof.

Suppose not. Then for any given $h_2 > 0$ and for all $h \in [0, h_2)$, there exists no such $N_h$. This means that $\forall n \in \mathbb{N}$, $f_n(\gamma - h)$ and $f_n(\gamma)$ are not on the same linear piece. This implies that, for any $0 \le h < h_2$ and for any $n$, there must be some $\beta_n, \theta_n$ where $f_n(\gamma - h) = \beta_n(\gamma - h)$ and $f_n(\gamma) = \theta_n \gamma$, with $\beta_n < \theta_n$ by convexity. Thus, $|f_n(\gamma) - f_n(\gamma - h)| = |(\theta_n - \beta_n)\gamma + \beta_n h|$.

However, each $f_n$ is also continuous, by convexity. This implies that, at the point $\gamma$: $\forall \epsilon > 0$, $\exists \delta > 0$ such that if $|x - \gamma| < \delta$, then $|f_n(x) - f_n(\gamma)| < \epsilon$. For any $f_n$, choose $\epsilon = (\theta_n - \beta_n)\gamma$. Then there exists some $\delta$ such that $|x - \gamma| < \delta$ implies $|f_n(x) - f_n(\gamma)| < (\theta_n - \beta_n)\gamma$. But then we can choose $h$ such that both $h < h_2$ and $h < \delta$ are satisfied. This means that we will have

$$|f_n(\gamma) - f_n(\gamma - h)| = |(\theta_n - \beta_n)\gamma + \beta_n h| = (\theta_n - \beta_n)\gamma + \beta_n h$$

from the first argument, and

$$|f_n(\gamma) - f_n(\gamma - h)| < (\theta_n - \beta_n)\gamma$$

from the second argument. Therefore we have reached a contradiction. This completes the proof of Claim 2. ∎

Proceeding with the proof of Lemma 15, recall that uniform convergence implies pointwise convergence, therefore $f$ is convex, and left and right derivatives exist at every point. For any value $\gamma$ and for any $\epsilon > 0$, we need to show $\exists N \in \mathbb{N}$ such that $\forall n > N$, $|f_n'(\gamma) - f'(\gamma)| < \epsilon$, where we write the left derivative at $\gamma$ as:

$$f'(\gamma) = \lim_{h \to 0^+} \frac{f(\gamma - h) - f(\gamma)}{-h}.$$

Suppose an $\epsilon > 0$ is given. Pick $h < \min\{h_1, h_2\}$. We have that:

$$f'(\gamma) - \epsilon < \frac{f(\gamma - h) - f(\gamma)}{-h} = f'(\gamma) = f'(\gamma - h) < f'(\gamma) + \epsilon.$$

For the picked number $h$, by Claim 2, let $N_h$ be the number where $\forall n > N_h$, $f_n(\gamma - h)$ and $f_n(\gamma)$ are on the same linear piece. Since $f_n$ converges to $f$, there exists $N_c \in \mathbb{N}$ such that $\forall n > N_c$:

$$f'(\gamma) - \epsilon < \frac{f_n(\gamma - h) - f_n(\gamma)}{-h} < f'(\gamma) + \epsilon.$$

Let $N > \max\{N_h, N_c\}$. Then, $\forall n > N$, the convergence result holds, and $f_n(\gamma - h)$ and $f_n(\gamma)$ are on the same linear piece. The following argument holds for all $n > N$: since $f_n(\gamma - h)$ and $f_n(\gamma)$ are on the same linear piece, the left derivatives are the same at these two points and

$$f_n'(\gamma) = \frac{f_n(\gamma - h) - f_n(\gamma)}{-h}.$$

By direct substitution into the inequality above:

$$f'(\gamma) - \epsilon < f_n'(\gamma) < f'(\gamma) + \epsilon \;\Longrightarrow\; -\epsilon < f_n'(\gamma) - f'(\gamma) < \epsilon \;\Longleftrightarrow\; |f_n'(\gamma) - f'(\gamma)| < \epsilon.$$

Therefore the left derivatives converge and $U(f_n) \to U(f)$, which completes the proof that $U(c)$ is continuous over $\mathcal{F}_k$. ∎

With all the lemmas in hand, the proof of Theorem 4 follows immediately from the topological extreme value theorem. We have proved the existence of an optimal monotone partitional information structure. ∎

Topological extreme value theorem. Let $(S, d_S)$ and $(\mathbb{R}, d)$ be metric spaces, where $d$ is the usual Euclidean metric defined for all $x, y$ by $d(x, y) = |x - y|$. Let $X \subseteq S$ be a compact subset of $S$ and let $f : S \to \mathbb{R}$ be continuous on all of $X$. Then $f(X)$ is closed and bounded in $\mathbb{R}$, and $f$ achieves its supremum and infimum on $X$; that is, there exist $p, q \in X$ such that $f(p) = \sup\{f(x) : x \in X\}$ and $f(q) = \inf\{f(x) : x \in X\}$.

References

Alfsen, E. M. (1965): “On the geometry of Choquet simplexes,” Mathematica Scandinavica, 15, 97–110.

Aumann, R. J. and M. Maschler (1995): Repeated Games with Incomplete Information, MIT Press.

Bergemann, D. and S. Morris (2016): “Information design, Bayesian persuasion, and Bayes correlated equilibrium,” American Economic Review, 106, 586–91.

Bloedel, A. W. and I. R. Segal (2018): “Persuasion with Rational Inattention,” Available at SSRN 3164033.

Dughmi, S., D. Kempe, and R. Qiang (2016): “Persuasion with limited communication,” in Proceedings of the 2016 ACM Conference on Economics and Computation, ACM, 663–680.

Dworczak, P. and G. Martini (2019): “The simple economics of optimal persuasion,” Journal of Political Economy, 127, 000–000.

Gentzkow, M. and E. Kamenica (2014): “Costly persuasion,” American Economic Review, 104, 457–62.

Gentzkow, M. and E. Kamenica (2016): “A Rothschild-Stiglitz approach to Bayesian persuasion,” American Economic Review, 106, 597–601.

Ichihashi, S. (2019): “Limiting Sender’s information in Bayesian persuasion,” Games and Economic Behavior, 117, 276–288.

Jager, G., L. P. Metzger, and F. Riedel (2011): “Voronoi languages: Equilibria in cheap-talk games with high-dimensional types and few signals,” Games and Economic Behavior, 73, 517–537.

Kamenica, E. and M. Gentzkow (2011): “Bayesian Persuasion,” American Economic Review, 101, 2590–2615.

Le Treust, M. and T. Tomala (2019): “Persuasion with limited communication capacity,” Journal of Economic Theory, 104940.

Lipnowski, E. and L. Mathevet (2017): “Simplifying Bayesian Persuasion,” Tech. rep., mimeo.

Lipnowski, E. and L. Mathevet (2018): “Disclosure to a psychological audience,” American Economic Journal: Microeconomics, 10, 67–93.

Rockafellar, R. T. (1970): Convex Analysis, vol. 28, Princeton University Press.

Rubinstein, A. (1986): “Finite automata play the repeated prisoner’s dilemma,” Journal of Economic Theory, 39, 83–96.

Tsakas, E. and N. Tsakas (2018): “Noisy persuasion,” Available at SSRN 2940681.

Volund, R. T. (2018): “Bayesian Persuasion on Compact Subsets,” Tech. rep., Aarhus University.

Warren, J. (1996): “Barycentric coordinates for convex polytopes,” Advances in Computational Mathematics, 6, 97–108.

Warren, J. (2003): “On the uniqueness of barycentric coordinates,” Contemporary Mathematics, 334, 93–100.

Warren, J., S. Schaefer, A. N. Hirani, and M. Desbrun (2007): “Barycentric coordinates for convex sets,” Advances in Computational Mathematics, 27, 319–338.

Wei, D. (2018): “Persuasion Under Costly Learning.”

Yaglom, I. M. and V. G. Boltyansky (1961):