Persuasion with Coarse Communication ∗ Yunus C. Aybas † ⓡ Eray Turkel ‡ July 20, 2020
Abstract
Persuasion is an exceedingly difficult task. A leading cause of this difficulty is the misalignment of preferences, which is studied extensively by the literature on persuasion games. However, the difficulty of communication also has a first order effect on the outcomes and welfare of agents. Motivated by this observation, we study a model of Bayesian Persuasion in which the communication between the sender and the receiver is constrained. This is done by allowing the cardinality of the signal space to be less than the cardinality of the action space and the state space, which limits the number of action recommendations that the sender can make. Existence of a maximum to the sender’s problem is proven and its properties are characterized. This generalizes the standard Bayesian Persuasion framework, in which existence results rely on the assumption of rich signal spaces. We analyze the sender’s willingness to pay for an additional signal as a function of the prior belief, which can be interpreted as the value of precise communication. We provide an upper bound for this value which applies to all finite persuasion games. While increased precision is always better for the sender, we show that the receiver might prefer coarse communication. This is done by analyzing a game of advice seeking, where the receiver has the ability to choose the size of the signal space.

∗ We are grateful to Avidit Acharya, Steven Callander, Mine Su Erturk, Françoise Forges, Matthew Gentzkow, Edoardo Grillo, Matthew Jackson, Semih Kara, Tarik Kara, Emin Karagozoglu, Cem Tutuncu, Robert Wilson, Kemal Yildiz, and seminar participants in Stanford University and Bilkent University for helpful comments. Author names are randomized via AEA author randomization tool. Randomization can be verified at .
† Stanford University, Department of Economics. [email protected]
‡ Stanford University, Graduate School of Business. [email protected]

Introduction
Communication is difficult. This is especially true when the content is very complicated, and the messages are relayed through coarse or imperfect channels. Financial analysts make recommendations to their clients about taking long or short positions on assets. Some firms give simple recommendations such as ‘buy’, ‘sell’ or ‘hold’, while others give more fine-grained advice, including ‘strong buy’ or ‘strong sell’. Credit rating agencies use letter grades with the goal of communicating the riskiness of an investment. What is the effect of this coarseness in communication on agents who are interacting strategically? Under what conditions would it be optimal to send or receive information through coarse channels?

Motivated by these questions, this paper studies an information design problem between two rational agents, a sender and a receiver, in a setting with coarse communication. The difficulty in communication arises because the underlying space of possible states of the world is large relative to the set of signals that can be used to describe it.

The framework of Bayesian Persuasion (Kamenica and Gentzkow 2011), and in general, information design (Bergemann and Morris 2016), analyzes strategic communication between agents who might have misaligned preferences. In the canonical model, the sender designs the informational environment of the receiver through a signaling scheme, creating beliefs that will induce desirable actions. The ability of the sender to commit to a signaling scheme distinguishes this framework from the literature on cheap talk, where similar restrictions on signal spaces have been studied by Jager et al.
(2011), among others.

We will start by giving an overview of our contributions to the rapidly expanding field of information design and Bayesian Persuasion, and then provide a review of related work to explain how our results complement the existing literature.

First, we generalize the Bayesian Persuasion framework to settings where the signal space is coarse, i.e., limited in its cardinality. The standard model assumes the existence of a large signal space, which is rich enough to describe the state of the world perfectly, or induce all possible actions, depending on which one of the constraints is binding. This assumption is used to show the existence of a solution using Carathéodory’s theorem and other tools from convex analysis. We establish the existence and describe the properties of a sender-optimal information structure in a setting where the signal space is cardinality-constrained, hence the standard tools for proving existence cannot be used.

Second, we analyze the effects of coarse communication for the sender and the receiver. We show that a larger signal space always weakly improves the sender’s utility, so a sender would be willing to pay to get access to an additional signal. We call the sender’s willingness to pay for an additional signal the price of precision, and provide an upper bound for it which applies to any finite Bayesian Persuasion game. The upper bound result is derived by using a novel insight linking higher and lower dimensional information structures: namely, given a higher dimensional information structure, we can combine some of the induced posteriors while still maintaining Bayes Plausibility, and create lower dimensional information structures. Doing this in a systematic way, we can show an upper bound on the gap in achievable utilities between optimal information structures under different cardinality constraints.
We also analyze how the price of precision depends on the location of the prior, and the difficulty (for the sender) of inducing beneficial actions while maintaining Bayes Plausibility. We show that the price of precision can be non-monotonic: e.g., the second signal can be more valuable than the third one, or vice versa.

Next, we show that the effect of additional signals on the receiver’s utility is ambiguous in general. We analyze a game of optimal advice seeking, where the receiver has the ability to choose the size of the signal space. Intuitively, this is a setting where the receiver can ask for simple or complicated recommendations from the sender. This framework can capture situations where the receiver has power over the communication procedure. We show through an example that there exist equilibria where the receiver optimally chooses to ask for ‘simple advice’ with fewer action recommendations. Through our example we also show that restricting the cardinality of the signal space might not lead to less informative information structures, in the sense of Blackwell informativeness.

Finally, our analysis of this problem makes a theoretical contribution by developing a novel method for finding sender-optimal information structures in persuasion games. A key insight we develop is using Choquet’s Theorem to analyze optimal information structures, represented as probability measures over the extreme points of low-dimensional simplices. This characterization is closely related to the study of generalized barycentric coordinates (Warren 1996, 2003; Warren et al. 2007). This approach allows us to solve information design problems where concavification methods cannot be used directly. In addition to our theoretical results, we provide intuitive geometric tools to analyze persuasion games in settings with coarse signal spaces.
The concavification method developed in Kamenica and Gentzkow (2011) analyzes the convex hull of the hypograph of the sender utility, which can be inspected to understand the properties of optimal information structures. We define the related concept of the k-convex hull of a set, which is the set of points that can be represented as the convex combination of at most k points. We show that the set of achievable utilities for a sender in a game where the signal space has cardinality $|S| = k$ is given by the k-convex hull of the hypograph of sender utility.

Previous work on persuasion games has introduced costs for generating precise information structures, where the costs are usually motivated through information theoretic foundations: an example is assuming that the costs are proportional to the reduction in the entropy of prior beliefs (Gentzkow and Kamenica 2014). This approach still allows the sender to make arbitrarily many action recommendations subject to a cost, and the existence results rely on having a high dimensional signal space. Similarly, limitations to the informativeness of posteriors in a persuasion game can arise endogenously in a setting where the receiver has mental costs associated with processing more informative signals. This phenomenon has been analyzed under various specific preference structures, where the sender chooses to induce less informative posteriors due to increasing costs for paying attention to informative signals on the receiver side (Wei 2018; Bloedel and Segal 2018; Lipnowski and Mathevet 2018).

While we assume exogenous restrictions on the signal space to prove our main results, we provide multiple applications in section 4, where our model can be used to analyze settings in which limitations on the signal space can arise endogenously.
Our analysis of advice seeking games, where a receiver determines the cardinality of the signal space, is similar to the setting with binary states and signals analyzed in Ichihashi (2019), in which the receiver limits the Blackwell informativeness of the signals. As we will see in one of our examples, optimal information structures under different cardinality constraints are not always Blackwell comparable, so using Blackwell informativeness constraints and cardinality constraints will lead to different outcomes in general.

In a paper related to ours, Dughmi et al. (2016) examine the properties of a persuasion game with a restricted number of signals, but in the specific context of bilateral trade with assumptions on the underlying preference structure. They also prove the NP-hardness of approximating optimal sender utility in general persuasion games with coarse signals. Our focus is on proving existence, characterizing the properties of the sender-optimal information structure and analyzing the various implications of coarse communication rather than the computational complexity of calculating the equilibrium sender utility.

Two recent papers analyze noisy persuasion games with similar motivating questions. Le Treust and Tomala (2019) study a repeated game of persuasion, where the sender has limited opportunities to intervene and send information through a noisy and cardinality-constrained channel. While they don’t prove the existence of a maximum, their main result is an upper bound on achievable utilities by the sender. They also show that this bound is reached in the limit where the number of repetitions of the underlying game approaches infinity. Their result can be modified to apply to our setting with noiseless channels and a coarse signal space, giving an upper bound on the utility of the sender. This asymptotic result is shown by making an elegant connection to Shannon’s coding theorem.
Similarly, Tsakas and Tsakas (2018) focus on persuasion through noisy communication channels in a single persuasion game. They show that the effect of noise on sender utility is ambiguous in general, and within the class of symmetric noisy communication channels, more noise makes the sender worse off.

Our theoretical results complement the asymptotic framework of Le Treust and Tomala (2019): we focus on a single game, prove the existence of an optimal solution, and give a sharp characterization of its properties. We also provide an upper bound result on achievable utilities with cardinality constraints, which provides a bound on the loss of utility due to coarseness in communication that applies to all finite persuasion games. We also show that coarse communication always makes the sender weakly worse off, as opposed to the case with noise where the effect is ambiguous, as is shown in Tsakas and Tsakas (2018). Our analyses of the value of precision and games of advice seeking also provide substantive applications for constrained persuasion games in various market settings.

While noisy channels make communication between parties more difficult, the restrictions on implementable information structures are different compared to cardinality constraints. Le Treust and Tomala (2019) show that asymptotically, all that matters for sender utility is the channel’s capacity, which is affected both by the inherent noise in communication and the cardinality of the signal space. However, noisy and coarse signals have substantively different implications for the optimal information structure and achievable utilities for the sender in finite games. Noise prevents the sender from inducing posteriors where the receiver is certain about the state of the world, and there are no explicit restrictions on the number of inducible actions. Thus, the receiver can never be perfectly informed and there is always residual uncertainty in beliefs.
With cardinality constraints, while the sender can induce informative posteriors, it’s never possible to perfectly inform the receiver about all states of the world at the same time. Thus, the sender has to prioritize some of the actions that can be induced with its limited capabilities while also maintaining Bayes Plausibility, which leads to different outcomes.

Mathematically, with noisy channels, the sender’s choice is restricted to information structures in which posteriors are not too close to the extreme points of the simplex. With cardinality constraints, there are no restrictions on the locations of the posteriors, but the sender’s problem reduces to optimally choosing a lower dimensional object embedded in a higher dimensional probability simplex (e.g., a line segment within the 3-simplex, or a triangle within the 4-simplex). This is also why we cannot use the intuitive concavification approach in our setting. Suppose the signal space is constrained to have cardinality $K$. It will not be possible to achieve all utility levels on the convex hull of the hypograph of sender utility. Specifically, if a utility level can only be achieved as a convex combination of more than $K$ points, it will not be implementable in our setting. This insight will be clarified when we define the concept of a K-convex hull.

The rest of the paper is organized as follows. Section 2 provides a simple example and highlights some of the insights that will be analyzed in later sections. We introduce our model and provide our existence results in section 3. Section 4 provides applications for our model, where we analyze the value of precise communication, optimal advice seeking, and preferences for simple signals in persuasion. We conclude in section 5. All proofs and additional results appear in the appendix.

A simple example: Financial Advice
We analyze a simple example with 3 states and 3 actions. The sender is a financial institution and the receiver is a risk neutral customer, looking for advice on a financial position. The customer can take a long or short position on an asset, or do nothing. There is a fixed amount of the asset that the customer can choose to buy or sell. The value of the asset can increase, in which case the optimal action is to take a long position; it can decrease, in which case the optimal action is a short position; or it can hold steady, in which case the optimal action is doing nothing. Suppose, for simplicity, that the value of the asset can increase or decrease by 1. There is also a risk free asset which the customer can purchase as an outside option, which provides a small return of $r = .3$. In addition, the institution can charge commissions to the customer, denoted by $c$, which can be any real number. The payoff of the institution is the commission it can charge. The payoff for the customer is 0 if no action is taken, 1 if the correct position is taken, and $-1$ if the wrong position is taken.

Let $p^+, p^0, p^-$ denote the common beliefs that the asset’s value will increase, hold steady, or decrease, respectively, where $p^+ + p^0 + p^- = 1$. The sender (financial institution) and the receiver (customer) share a prior $\mu_0$ which is in the interior of the three dimensional simplex. The sender commits to a signaling mechanism, using signals from a finite set $S$, where $|S|$ is either 3 or 2 in the cases considered below. A signaling mechanism is a collection of probability distributions over $S$, one for each realization of the (uncertain) state of the world. The sender commits to this strategy prior to the realization of the state, and cannot change it afterwards. The receiver observes the signal (not the state of the world), and uses Bayesian updating to obtain posterior probabilities of each state conditional on the observed signal. It should be noted that signals $s \in S$ do not have an intrinsic meaning, but obtain their meaning in equilibrium via the announced signaling mechanism. After the signal is realized and the posterior beliefs are formed, the sender decides on the commission that will be charged. Finally, the receiver chooses their action.

Formally, the receiver’s expected payoff will be $p^+ - p^- - r - c$ when taking a long position, and $p^- - p^+ - r - c$ when taking a short position. For any given belief, the receiver will choose to take a position over doing nothing if and only if $|p^- - p^+| - r - c \ge 0$. The sender can therefore extract all the surplus by optimally setting $c = |p^- - p^+| - r$. Note that the commissions will be higher if the posterior beliefs approach the extreme points of the simplex. Intuitively, the financial institution can charge higher commissions for inducing more precise posteriors, from customers that will have very optimistic or very pessimistic beliefs about the asset.

With $|S| = 3$, finding the optimal information structure and calculating the maximum sender utility achievable is easily done by inspecting the concavification of sender utility. The optimal information structure will induce beliefs on the extreme points of the simplex: the customer is absolutely sure about what will happen to the asset when he receives a signal. Letting the prior be $\mu_0 = (p^+, p^0, p^-) = (0.3, 0.4, 0.3)$, the optimal information structure will induce $(1, 0, 0)$ with probability 0.3, $(0, 1, 0)$ with probability 0.4, and $(0, 0, 1)$ with probability 0.3.

With $|S| = 2$, solving for the optimal information structure is not straightforward: we can no longer use concavification. It turns out that the optimal information structure in this case induces the belief $(1, 0, 0)$ with probability 0.3, and induces the belief $(0, 0.57, 0.43)$ with probability 0.7. The customer is still willing to take a short position, but the beliefs are now less extreme compared to the case with three signals. The resulting expected utility for the institution is therefore lower.
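The comparison between the two information structures can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the prior $(0.3, 0.4, 0.3)$ implied by the stated probabilities, takes the second two-signal posterior to be exactly $(0, 4/7, 3/7)$, treats the value of `r` as an illustrative parameter, and assumes the sender earns 0 whenever the customer prefers to do nothing. All function names are hypothetical.

```python
from fractions import Fraction as F

def sender_utility(belief, r):
    """Commission extracted at a posterior (p_plus, p_zero, p_minus).

    Assumption: the sender charges c = |p_minus - p_plus| - r when that is
    nonnegative, and earns 0 when the customer prefers to do nothing.
    """
    p_plus, _p_zero, p_minus = belief
    return max(abs(p_minus - p_plus) - r, F(0))

def expected_utility(structure, r):
    """Expected sender utility of a list of (posterior, probability) pairs."""
    return sum(prob * sender_utility(belief, r) for belief, prob in structure)

def mean_posterior(structure):
    """Probability-weighted average of the posteriors (Bayes Plausibility check)."""
    return tuple(sum(prob * belief[j] for belief, prob in structure)
                 for j in range(3))

prior = (F(3, 10), F(4, 10), F(3, 10))  # reconstructed prior, an assumption
r = F(3, 10)                            # illustrative outside-option return

three_signals = [((F(1), F(0), F(0)), F(3, 10)),
                 ((F(0), F(1), F(0)), F(4, 10)),
                 ((F(0), F(0), F(1)), F(3, 10))]
two_signals = [((F(1), F(0), F(0)), F(3, 10)),
               ((F(0), F(4, 7), F(3, 7)), F(7, 10))]

for name, structure in [("3 signals", three_signals), ("2 signals", two_signals)]:
    # Both structures must average back to the prior.
    assert mean_posterior(structure) == prior
    print(name, expected_utility(structure, r))
```

Under these assumptions both structures are Bayes Plausible, and the two-signal structure gives the sender strictly less than the three-signal one, in line with the discussion above.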
Figure 1: The financial advice example. The first plot shows the sender utility function over the simplex. The second plot shows the concavification of sender utility, where the red dot corresponds to the maximum utility achievable by the sender. The third plot shows the optimal information structure for the sender using two signals. The black dot corresponds to the maximum utility achievable with 2 signals, shown together with the maximum utility achievable with 3 signals.
The example demonstrates some of the key insights that will be generalized in this paper. First, the induced beliefs are located on the boundaries of the regions where the receiver’s action is fixed. This is developed further in section 3.2. Second, the search for an optimal information structure is equivalent to searching for the highest value achievable by taking the convex combination of two points from the graph of the sender utility function. We formalize this insight in section 3.3. Third, the utility achievable by the sender is lower with two signals, hence the sender would be willing to pay to get access to additional signals. The loss in sender utility will be defined as the ‘value of precision’ in section 4.1, where we analyze it in more detail and provide upper bounds.
Model and Results
There are two agents, a sender and a receiver, who are communicating about an uncertain state of the world. The state of the world $\omega$ can take values from a finite set $\Omega$, which has cardinality $|\Omega| = n$. There are finitely many actions $a \in A$ that can be taken by the receiver, where $|A| = m$. The two agents have utility functions which depend on the state of the world and the receiver’s action, respectively denoted by $u_S, u_R : \Omega \times A \to \mathbb{R}$. The agents share a prior belief about the state of the world, $\mu_0$, which is assumed to be in the interior of $\Delta(\Omega)$, denoted by $\mathrm{int}(\Delta(\Omega))$. It is common knowledge that the agents hold a shared prior. The sender chooses a signaling policy, which is a collection of conditional probability mass functions $\{\pi(\cdot \mid \omega)\}_{\omega \in \Omega}$ over the signal space $S$ with cardinality $|S| = k$. Critically, we assume $k < \min\{m, n\}$. With this assumption we focus on coarse communication. Thus, the sender cannot induce all possible actions or describe the state of the world perfectly, and has to decide which actions to induce through coarse communication while also maintaining Bayes Plausibility.

Each signal realization $s \in S$ induces a posterior which is formed through Bayesian updating: the receiver observes a signal realization $s \in S$ and forms a posterior belief $\mu_s \in \Delta(\Omega)$. Hence, we can think of the collection $\{\pi(\cdot \mid \omega)\}_{\omega \in \Omega}$ as inducing a probability measure over posteriors. We denote this probability measure over posteriors by $\tau \in \Delta(\Delta(\Omega))$. $\tau$ is a discrete probability measure with support $\mathrm{supp}(\tau) = \boldsymbol{\mu} = \{\mu_s\}_{s \in S}$. Throughout the paper, $\tau$ will be called an information structure. Naturally, by the restrictions imposed above, we will have $1 < |\mathrm{supp}(\tau)| \le k$. The vector of probabilities of inducing a posterior belief that is in the support of the information structure $\tau$ will be denoted by $\tau(\boldsymbol{\mu})$.
Formally, for $\mu_s \in \mathrm{supp}(\tau)$, the probability that $\mu_s$ will be induced is given by $\tau(\mu_s) = \sum_{\omega \in \Omega} \pi(s \mid \omega)\, \mu_0(\omega)$.

After forming the posterior $\mu_s$, the receiver chooses an action from the set $\hat{A}(\mu_s) = \arg\max_{a \in A} \mathbb{E}_{\omega \sim \mu_s} u_R(a, \omega)$. If the receiver is indifferent between multiple actions, we assume that the indifference is resolved by picking the action that is preferred by the sender. If there are multiple such elements that maximize the sender’s utility, we pick an element from $\hat{A}(\mu_s)$ arbitrarily. We denote the selected action by $\hat{a}(\mu_s)$.

The sender’s utility when the posterior $\mu_s$ is induced will be $\hat{u}_S(\mu_s) = \mathbb{E}_{\omega \sim \mu_s} u_S(\hat{a}(\mu_s), \omega)$. Similarly, the receiver’s utility will be $\hat{u}_R(\mu_s) = \mathbb{E}_{\omega \sim \mu_s} u_R(\hat{a}(\mu_s), \omega)$. The expected utility of the sender under $\tau$ is denoted by $\mathbb{E}_{\mu_s \sim \tau} \hat{u}_S(\mu_s)$, viewed as a map $\Delta(\Delta(\Omega)) \to \mathbb{R}$. We similarly define the expected receiver utility under $\tau$ by $\mathbb{E}_{\mu_s \sim \tau} \hat{u}_R(\mu_s)$.

Notation: $\Delta(\Omega)$ denotes the simplex over $\Omega = \{\omega_1, \omega_2, \ldots, \omega_n\}$. $\mu_s$ denotes the posterior induced by $s$, which is a generic element of $S$, and $\mu_i$ denotes the $i$th entry of $\boldsymbol{\mu} = \mathrm{supp}(\tau)$. So we use $\mu_i$ to refer to a specific entry of $\boldsymbol{\mu}$, and $\mu_s$ to refer to the generic posterior the receiver forms upon observing a generic signal $s \in S$. The notation $\mathbb{E}_{\omega \sim \mu_s}$ is used to denote the expectation over the random variable $\omega$ taken with respect to the measure $\mu_s$. When the random variable is clear, we will just use the measure that gives the probability distribution in the subscript.

For a distribution of posteriors to be feasibly induced in the persuasion game with shared priors, we need the expected value of the posterior beliefs to be equal to the prior belief. This is the only restriction imposed by Bayesian Rationality (Kamenica and Gentzkow 2011), which we can state formally by $\mathbb{E}_{\mu_s \sim \tau}\, \mu_s = \sum_{\mu_s \in \mathrm{supp}(\tau)} \mu_s\, \tau(\mu_s) = \mu_0$.
The sender’s goal is therefore finding the optimal $\tau$, which is described by the problem:

\[
\max_{\tau \in \Delta(\Delta(\Omega))} \mathbb{E}_{\mu_s \sim \tau}\, \hat{u}_S(\mu_s)
\quad \text{subject to} \quad |\mathrm{supp}(\tau)| \le k \ \text{ and } \ \mathbb{E}_{\tau}(\mu_s) = \mu_0 .
\]

Formulating the sender’s problem as a search for an optimal information structure $\tau$ rather than a search for signal functions $\{\pi(\cdot \mid \omega)\}_{\omega \in \Omega}$ makes the problem more tractable. Given a feasible information structure $\tau$ and corresponding probabilities for each posterior belief $\{\tau(\mu_s)\}_{s \in S}$ in our model, we can always find the related signal functions by writing $\pi(s \mid \omega) = \mu_s(\omega)\, \tau(\mu_s) / \mu_0(\omega)$. The more interesting problem of finding the probability measure $\{\tau(\mu_s)\}_{s \in S}$ that makes $\tau$ Bayes Plausible given only the posterior beliefs $\boldsymbol{\mu} = \{\mu_1, \ldots, \mu_k\}$ will also be discussed when we present our existence result. We will show that these probabilities are uniquely defined if $\boldsymbol{\mu}$ consists of affinely independent posteriors.

The constraint on the signal space makes solving the sender’s problem considerably more difficult compared to the standard Bayesian Persuasion framework. Note that it is no longer possible to use Carathéodory’s theorem to show the existence of an optimal signal. The achievable set of utilities can shrink considerably for the sender, compared to the baseline model with unrestricted communication.

One simple implication of limiting the dimensionality of the signal space is that it is not possible to induce posterior distributions supported exclusively on the extreme points of the $n$-dimensional simplex $\Delta(\Omega)$. This is because the convex hull of $k < n$ extreme points of the simplex cannot include $\mu_0$, which is assumed to be in $\mathrm{int}(\Delta(\Omega))$. The limitation also constrains the sender’s ability to make action recommendations: the sender can no longer create information structures that will induce all possible actions. Thus the sender must decide which actions are worth inducing with the limited signals they can use.
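The mapping from an information structure back to a signaling policy, $\pi(s \mid \omega) = \mu_s(\omega)\,\tau(\mu_s)/\mu_0(\omega)$, can be illustrated with a short sketch. The function name and the numerical inputs below are assumptions chosen for illustration (they reuse the two-signal structure from the financial advice example, with the second posterior taken as exactly $(0, 4/7, 3/7)$); this is not code from the paper.

```python
from fractions import Fraction as F

def signaling_policy(posteriors, weights, prior):
    """Return pi with pi[s][w] = mu_s(w) * tau(mu_s) / mu_0(w).

    posteriors: list of posterior beliefs mu_s (one tuple per signal)
    weights:    list of probabilities tau(mu_s), in the same order
    prior:      interior prior mu_0 (all entries strictly positive)
    """
    n = len(prior)
    return [[mu_s[w] * tau_s / prior[w] for w in range(n)]
            for mu_s, tau_s in zip(posteriors, weights)]

# Illustrative inputs (assumptions): three states, two signals.
prior = (F(3, 10), F(2, 5), F(3, 10))
posteriors = [(F(1), F(0), F(0)), (F(0), F(4, 7), F(3, 7))]
weights = [F(3, 10), F(7, 10)]

pi = signaling_policy(posteriors, weights, prior)

# For every state, the conditional probabilities over signals sum to one,
# which holds exactly when the structure is Bayes Plausible.
for w in range(len(prior)):
    assert sum(pi[s][w] for s in range(len(posteriors))) == 1
```

In this illustration the recovered policy sends the first signal exactly when the first state occurs and the second signal otherwise, so the two signals partition the state space.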
We will show how we can find the optimal information structure through a constructive proof, and present its properties. Existence can also be shown by using the upper semi-continuity of sender utility over the search space and showing the compactness of the set of Bayes Plausible information structures inducible with coarse signals. Our approach provides additional insights about the properties of lower dimensional optimal information structures and gives us an explicit method for finding them.

We use the underlying preference structure for the sender and the receiver to simplify the search for an optimal information structure $\tau$. Formally, we can define subsets of $\Delta(\Omega)$ where the receiver’s action is constant, and use the fact that sender utility is convex within these subsets. The properties presented in lemmas 1 and 2 have been applied in the context of persuasion games where the receiver has psychological preferences over different posterior beliefs (see Lipnowski and Mathevet (2017, 2018); Volund (2018)).
Definition 1.
The set $R_a \subseteq \Delta(\Omega)$ is the set of beliefs where the action $a$ is receiver-optimal:

\[
R_a = \{\mu_i \in \Delta(\Omega) : a \in \hat{A}(\mu_i)\} .
\]

$\mathcal{R} = \{R_a\}_{a \in A}$ is the collection consisting of these sets for every action $a \in A$.

Lemma 1.
For every action $a \in A$, the set $R_a$ is closed and convex.

Lemma 2.
The sender’s utility $\hat{u}_S$ is convex when restricted to each set $R_a$.

Lemma 1 follows from the fact that each $R_a$ can be written as the intersection of finitely many closed half spaces. The proof of lemma 2 uses the definition of $\hat{u}_S$, which is a function of sender-optimal actions at every belief. For any two points $\mu'$, $\mu''$ in a given $R_a$, let the sender-optimal action be $\hat{a}(\mu)$ at their convex combination $\mu$. This action must be among the set of receiver-optimal actions for the two original points. Since the action $\hat{a}(\mu)$ is defined as the action that maximizes sender utility among the set of receiver-optimal actions $\hat{A}(\mu)$, and we have $\hat{a}(\mu) \in \hat{A}(\mu')$ and $\hat{a}(\mu) \in \hat{A}(\mu'')$, convexity of $\hat{u}_S$ follows.

Let us define beneficial information structures as $\tau$ with $\mathbb{E}_{\tau}(\hat{u}_S) \ge \hat{u}_S(\mu_0)$. These are information structures that give the sender weakly higher utility compared to the default action, which can be achieved by sending no information. Throughout the paper, we will maintain the assumption that beneficial information structures exist: the other case is trivial and the sender always prefers sending no information.

The first two lemmas show us that in the subspace where the receiver’s action is fixed, the sender prefers inducing mean-preserving spreads in beliefs. In the model with unrestricted communication, these properties reduce the search for an optimal information structure to a more tractable optimization problem, since the optimal information structure must be supported by the outer points of the sets $\mathcal{R} = \{R_a\}_{a \in A}$ as described in Lipnowski and Mathevet (2017). With coarse communication, we can prove a similar result. The next lemma formally states that an information structure can always be weakly improved by changing it in a way that maintains Bayes Plausibility and moving all posteriors to the boundaries of an action region.
In other words, the sender can restrict their search to posteriors that make the receiver indifferent between multiple actions (and posteriors located on the boundaries of the n-simplex $\Delta(\Omega)$), with no loss in utility. This result reduces the size of our search space considerably, and provides tractability in higher dimensional problems.

In the appendix, we also establish that sender utility is a continuous and piecewise linear function in the interior of these sets. $\mathcal{R}$ is a finite cover of $\Delta(\Omega)$.

Lemma 3.
Let $\tau$ be a feasible distribution of posteriors satisfying Bayes Plausibility, that is also beneficial for the sender. Suppose that $\exists\, \mu_a \in \mathrm{supp}(\tau)$ such that $\mu_a \in \mathrm{int}(R_a)$ for some $R_a \in \mathcal{R}$. Then, there exists a $\mu_k \in \mathrm{Bd}(R_a)$ and a Bayes Plausible $\tau' \neq \tau$ where $\mathrm{supp}(\tau') = (\mathrm{supp}(\tau) \setminus \{\mu_a\}) \cup \{\mu_k\}$ such that $\mathbb{E}_{\tau'} \hat{u}_S \ge \mathbb{E}_{\tau} \hat{u}_S$.

An immediate corollary of lemma 3 is the following result.
Corollary 1.
The sender’s search for an optimal information structure can be restricted to information structures $\tau$ with the following property: $\forall \mu_s \in \mathrm{supp}(\tau)$, $\nexists R_a \in \mathcal{R}$ such that $\mu_s \in \mathrm{int}(R_a)$.

The proof explicitly constructs the information structure $\tau'$, and uses the convexity of $\hat{u}_S$ within each $R_a$. The outline of the argument is the following. Let $\tau$ be our original Bayes Plausible information structure, with the corresponding probabilities $\{\tau(\mu_i)\}_{i \le k}$ where $\sum_{i \le k} \tau(\mu_i)\, \mu_i = \mu_0$. Let $\mu_a \in \mathrm{supp}(\tau)$ be in the interior of some $R_a$ and define the ray originating from $\mu_0$ and passing through $\mu_a$. This ray will intersect the boundary of $R_a$ at two points $\mu'$ and $\mu''$, since $R_a$ is compact and convex. By convexity of $\hat{u}_S$ within $R_a$, sender utility must be weakly higher at one of those two points. First, we show that we can replace $\mu_a$ with one of these two points and still maintain Bayes Plausibility. Since we’re changing $\mu_a$ along the ray defined above, we can change $\{\tau(\mu_i)\}_{i \le k}$ in a way that maintains Bayes Plausibility. Note that greedily replacing $\mu_a$ with the point that provides higher utility within $R_a$ might not always improve the expected sender utility $\mathbb{E}_{\tau}(\hat{u}_S)$, and the overall effect of this change depends on the relative positions of $\mu_0$, $\mu_a$, $\mu'$ and $\mu''$, and the change in the probabilities $\{\tau(\mu_i)\}_{i \le k}$ that will maintain Bayes Plausibility. A carefully constructed argument relying on convexity shows that replacing $\mu_a$ with either $\mu'$ or $\mu''$ will always yield weakly higher expected utility, where the decision on which point to choose depends on the changes in $\{\tau(\mu_i)\}_{i \le k}$.

While this simplifies the search, to solve and characterize the sender maximization problem in a tractable way, we still need to understand how the probabilities $\{\tau(\mu_i)\}_{i \le k}$ change as we make changes to the beliefs in $\mathrm{supp}(\tau)$ under the restriction that $\mathbb{E}_{\tau}(\mu_s) = \sum_{i \le k} \tau(\mu_i)\, \mu_i = \mu_0$.
Each Bayes Plausible information structure $\tau$ defines a lower dimensional compact convex polytope embedded in the space $\Delta(\Omega) \subset \mathbb{R}^n$, with the extreme points $\mathrm{supp}(\tau)$. Bayes Plausibility implies $\mu_0$ must be in the convex hull of the information structure, $\mu_0 \in \mathrm{co}(\mathrm{supp}(\tau))$, with the representation $\sum_{i \le k} \tau(\mu_i)\, \mu_i = \mu_0$, where $\mathrm{co}$ denotes the convex hull operator. This representation can be thought of as a discrete probability measure over the convex polytope $\mathrm{co}(\mu_1, \ldots, \mu_k)$, with positive values only on the extreme points. The probability measure $\{\tau(\mu_i)\}_{i \le k}$ may not be unique in general. However, if $\mathrm{supp}(\tau)$ consists of affinely independent beliefs, we can show that the representation is, indeed, unique using Choquet’s Theorem.

We proceed by showing that we can restrict our search of optimal information structures to the set of affinely independent information structures without any loss. The next lemma shows that any affinely dependent information structure can be modified by dropping some beliefs to reach affine independence, weakly increasing sender utility and maintaining Bayes Plausibility at every step. The proof is independent of lemma 3 and holds for a general case of information design problems with or without constrained signal spaces.

Lemma 4.
Let $\tau$ be a feasible distribution of posteriors satisfying Bayes Plausibility. Suppose that $\operatorname{supp}(\tau)$ is not affinely independent. Then, there must exist a Bayes Plausible $\tau' \ne \tau$ such that $\operatorname{supp}(\tau')$ is affinely independent and $E_{\tau'} \hat{u}_S \ge E_{\tau} \hat{u}_S$.

Intuitively, for the sender, inducing affinely dependent beliefs is not a good use of signals because some beliefs are redundant. The proof details how we can always find a belief that is optimal to drop from the information structure. Because the beliefs can be written as an affine combination of each other, we can always choose a belief to drop such that the change in $\{\tau(\mu_i)\}_{i \le k}$ guarantees weakly higher sender utility. We use the relationship between the convex weights characterizing $\mu_0$, which are $\{\tau(\mu_i)\}_{i \le k}$, and the set of affine weights that allows us to characterize beliefs in terms of each other.

Lemma 4 states that we can restrict our search to affinely independent information structures, or in other words, lower dimensional simplices contained in the n-simplex $\Delta(\Omega)$. This gives us the uniqueness of the probability measure $\{\tau(\mu_i)\}_{i \le k}$ representing $\mu_0$ through Choquet's theorem. The statement of this well known result (e.g., see Alfsen (1965)) is as follows.

Theorem (Choquet Theorem). Suppose that $P$ is a metrizable compact convex subset of a locally convex Hausdorff topological vector space, and that $\mu_0$ is an element of $P$. Then there is a probability measure $\tau$ on $P$ which represents $\mu_0$, i.e. $\sum_{p \in P} \tau(p)\, p = \mu_0$, s.t. $\operatorname{supp}(\tau) = \operatorname{Ext}(P)$, where $\operatorname{Ext}(P)$ denotes the extreme points of $P$. Furthermore, if $\operatorname{Ext}(P)$ is affinely independent, this probability measure $\tau$ is unique.

We turn to the question of how this probability measure changes as we change beliefs in $\operatorname{supp}(\tau)$. If we perturb the set of beliefs induced (while maintaining Bayes Plausibility), we would like to be able to analyze how the corresponding probability of inducing each belief changes.
We can do this by using the fact that the convex hull of the posterior beliefs induced is a compact and convex polytope. Lemma 5.
Let µ P int p ∆ p Ω qq , define ζ Ă R k ˆ N as the set of affinely independent set ofposteriors with cardinality k that are Bayes Plausible. @ µ “ p µ , . . . , µ k q P ζ , there existsa unique probability distribution over τ P ∆ p ∆ p Ω qq with support µ and ř i ď k τ p µ i q µ i “ µ .Moreover, τ : ζ Ñ R k is uniformly continuous. Existence, smoothness and uniqueness of this probability measure can be analyzed through barycentriccoordinates by making use of existing work on generalized barycentric coordinates on convex sets (Warren1996, 2003; Warren et al. 2007), but we use Choquet Theory as a more convenient tool. A similar result is theorem 19.3 in (Rockafellar 1970), which shows that orthogonal projection of poly-hedral convex set P Ă R N on subspace L is another polyhedral convex set and linear maps map polyhedralconvex sets to polyhedral convex sets in finite dimensional vector spaces. µ and characterizes the change in theweights τ p µ q as we change µ to µ as a matrix operation. With this result, we can formulatethe sender’s problem as a search over Bayes Plausible information structures that are affinelyindependent, with the added constraint that @ µ i P supp p τ q , µ i is in the boundary of some R a . The boundary of a given R a consists of facets of a polytope. Each set R a will have atmost m facets, which can be seen from their definition. Definition 2.
Choose at most $k$ facets from any collection of polytopes from $\mathcal{R} = \{R_a\}_{a \in A}$, and denote them by $F = \{F_i : \exists a \in A,\ F_i \text{ is a facet of } R_a\}$. For a given collection $F$, denote the restriction of the sender utility function $\hat{u}_S$ to a facet $F_i \in F$ by $\hat{u}_{S_i}$. Define the set of affinely independent Bayes Plausible information structures that are supported on $F$ by:

$$\zeta_F = \zeta \cap \left( \bigtimes_{i=1}^{k} F_i \right),$$

where $\bigtimes_{i=1}^{k} F_i$ denotes the Cartesian product of the sets $F_i$. Since each $F_i$ is a subset of $\mathbb{R}^n$ and $\zeta$ is a subset of $\mathbb{R}^{k \times N}$, $\zeta_F$ is also a subset of $\mathbb{R}^{k \times N}$.

The above definition allows us to characterize the sender's maximization as a search for the facet collection on which the optimal information structure is supported. Definition 3.
Given a collection $F = \{F_i\}_{i \le k}$, the sender's problem subject to the constraint that the information structure must be supported on the facets $F$ is given by the following:

$$\max_{\tau} \sum_{i \le k} \tau(\mu_i)\, \hat{u}_{S_i}(\mu_i) \quad \text{subject to: } \{\mu_1, \ldots, \mu_k\} \in \zeta_F.$$

Denote the maximized value of this problem (if a maximum exists) by $V(F, \mu_0, k)$, with the added convention that $V(F, \mu_0, k) = -\infty$ if the feasible set $\zeta_F$ is empty. This convention is required because it might be impossible to represent $\mu_0$ for some collection of facets. Definition 4.
Let $\mathcal{F}$ denote the finite set of all possible collections $F$. We can characterize the sender's maximization problem as follows:

$$\max_{F \in \mathcal{F}} V(F, \mu_0, k).$$

With this definition, we can show that the sender's maximization problem is well defined and an optimal information structure will always exist.
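For a fixed affinely independent support, the probabilities entering the sender's objective are pinned down by Bayes Plausibility, so the objective can be evaluated once the posteriors are chosen. The Python sketch below illustrates this by solving the linear system $\sum_i \tau_i \mu_i = \mu_0$, $\sum_i \tau_i = 1$. It is only an illustration under stated assumptions: the function names `bayes_plausible_weights` and `sender_value` and the `payoff` callback are hypothetical, and the linear solve stands in for the Choquet-based argument used in the text.

```python
import numpy as np

def bayes_plausible_weights(prior, posteriors, tol=1e-9):
    """Return the unique weights tau with sum_i tau_i * mu_i = prior.

    Raises ValueError if the posteriors are affinely dependent or if the
    prior lies outside their convex hull.
    """
    M = np.asarray(posteriors, dtype=float)   # shape (k, n), one belief per row
    prior = np.asarray(prior, dtype=float)
    k = M.shape[0]

    # Rows of A: the n belief coordinates plus the "weights sum to one" row.
    A = np.vstack([M.T, np.ones(k)])
    # Beliefs are affinely independent iff the augmented columns are
    # linearly independent.
    if np.linalg.matrix_rank(A) != k:
        raise ValueError("posteriors are not affinely independent")

    b = np.append(prior, 1.0)
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    if not np.allclose(A @ w, b, atol=tol) or np.any(w < -tol):
        raise ValueError("prior is not in the convex hull of the posteriors")
    return w

def sender_value(prior, posteriors, payoff):
    """Expected sender payoff of the information structure with this support."""
    w = bayes_plausible_weights(prior, posteriors)
    return float(sum(wi * payoff(np.asarray(mu, dtype=float))
                     for wi, mu in zip(w, posteriors)))
```

Maximizing `sender_value` over supports drawn from a given facet collection, and then over collections, mirrors the two-level structure of the problem in definition 4; the sketch itself only evaluates one candidate support.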
Theorem 1.
An optimal information structure τ P ∆ p ∆ p Ω qq Ă R k ˆ N maximizing the senderobjective function exists. Note that this definition allows us to use the same set R a arbitrarily many times in defining the collectionF. F i . Moreover, the set offeasible points ζ F might not be a compact set. Hence, the proof relies on a non-trivial two-step continuous extension argument. We first define the continuous extension of the senderutility over the closures of the relevant sets, for which a maximum must always exist by anapplication of simple topological extreme value theorem. We then show that the originalproblem must attain the same maximum with the modified problem for the continuous ex-tension, through an application of Theorem 4. Our analysis of cardinality-constrained signal spaces also is also useful for Bayesian Per-suasion games with rich signal spaces where agents have preferences for simplicity. In astandard Bayesian Persuasion where k ě min t| A | , | Ω |u , suppose the sender cares about thesimplicity of the induced information structures, in addition to the utility received. In ap-pendix B.4, we analyze this setting by defining an intuitive preference structure for thesender, and show that affinely independent information structures will be chosen at an equi-librium. We proceed by showing that the solution to the maximization problem in definition 4 willbe equivalent to a geometric characterization of the optimum. We call this characterizationthe k-concavification of sender utility. This will connect our solution technique to the con-cavification approach widely used in the Bayesian Persuasion literature.Let CH p ˆ u S q denote the convex hull of the hypograph of ˆ u S , in the space R n . Withunrestricted communication, the point p µ , z q P CH p ˆ u S q Ă R n represents a sender pay-off z which can achieved by an information structure when the prior is µ . 
This is the foundation of the concavification technique, first used in repeated games and then applied to Bayesian Persuasion and information design (Aumann and Maschler 1995; Kamenica and Gentzkow 2011). In canonical persuasion games, the existence of an optimal signal is usually proven by referencing extremal representation theorems from convex analysis. For any $(\mu, z) \in CH(\hat{u}_S)$, Caratheodory's theorem assures the existence of a $\tau$ such that $\mu \in \operatorname{co}(\operatorname{supp}(\tau))$ and $|\operatorname{supp}(\tau)| \le n + 1$, where co denotes the convex hull operator. Note that the last condition prevents us from using this theorem in our setting, since $n + 1$ may exceed the number of available signals $k$.

(An alternative proof of existence is to show that the sender utility can be extended to an upper semi-continuous function defined over a compact set, but the proof we provide is more constructive in nature and yields an algorithm for finding the equilibrium.)

(Since $\hat{u}_S : \Delta(\Omega) \to \mathbb{R}$, we can represent any belief $\mu$ with $|\Omega| - 1 = n - 1$ coordinates and $\hat{u}_S(\mu)$ with a real number, so $(\mu, z) \in \mathbb{R}^n$.)

With restricted communication, the point $(\mu, z) \in CH(\hat{u}_S)$ might not be feasible if representing $(\mu, z)$ requires a convex combination of more than $k$ points from the hypograph of $\hat{u}_S$. A prior belief-utility pair $(\mu, z)$ will only be feasible if it can be contained in the convex hull of $k$ or fewer points from the hypograph of $\hat{u}_S$. To represent achievable utilities, therefore, we need the following definition. Definition 5.
Given a set $A \subseteq \mathbb{R}^n$ and a positive integer $k \le n$, define the set of points that can be represented as the convex combination of at most $k$ points in $A$ as the k-Convex Hull of $A$, denoted $\operatorname{co}_k(A)$. Formally, $a \in \operatorname{co}_k(A)$ if and only if there exists a set of at most $k$ points $\{a_1, \ldots, a_k\} \subseteq A$ and a set of weights $\{\gamma_1, \ldots, \gamma_k\}$ which satisfy $\sum_{i \le k} \gamma_i = 1$ and $\forall i,\ 1 \ge \gamma_i \ge 0$, such that $a = \sum_{i \le k} \gamma_i a_i$. Therefore, we can write:

$$\operatorname{co}_k(A) = \Big\{ a \in \mathbb{R}^n : \exists \{a_1, \ldots, a_k\} \subseteq A,\ \exists \{\gamma_1, \ldots, \gamma_k\} \text{ with } \gamma_i \in \mathbb{R} \text{ s.t. } \sum_{i \le k} \gamma_i = 1 \text{ and } 1 \ge \gamma_i \ge 0,\ a = \sum_{i \le k} \gamma_i a_i \Big\}.$$

Let $CH_k(\hat{u}_S)$ denote the k-convex hull of the hypograph of $\hat{u}_S$, in the space $\mathbb{R}^n$. Note that if $(\mu_0, z) \in CH_k(\hat{u}_S)$, there exists an information structure $\tau$ with $|\operatorname{supp}(\tau)| \le k$ and $E_\tau(\hat{u}_S) = z$. Defining $V(\mu_0) = \sup\{z \mid (\mu_0, z) \in CH_k(\hat{u}_S)\}$, we get the largest payoff the sender can achieve when the prior is $\mu_0$. If $V(\mu_0) = z$, then we have $k$ beliefs such that $\sum_{i \le k} \tau(\mu_i)\mu_i = \mu_0$ for some set of weights $\{\tau(\mu_1), \ldots, \tau(\mu_k)\}$ and $\sum_{i \le k} \tau(\mu_i)\hat{u}_S(\mu_i) = z$. This gives us the following equivalence between k-concavification and our previous result. Theorem 2.
Let τ be the optimal information structure that solves the sender’s maximiza-tion problem given in definition 4. Then sup t z |p µ , z q P CH k p ˆ u S qu “ E τ ˆ u S . Going back to the financial advice example, we can see in figure 2 that the optimal payofffor the sender given µ can be observed by inspecting the 2-convex hull of the sender utility.The comparison with the regular convex hull (3-convex hull) reveals that the achieved util-ity must be lower. The optimal information structure can thus be determined by inspecting CH k p ˆ u S q Ă R n . We can further analyze the implications of restricting the signal space on sender’s utility.Let V ˚ p k, µ q be the value the sender objective function attains at µ when the signal spaceis restricted to have k elements. Then V ˚ p k ` , µ q ´ V ˚ p k, µ q is what the sender wouldbe willing to pay to increase the dimensionality of the signal space by one, given the fixedprior µ . This can be intuitively interpreted as the value of precision for the sender. Notethat when k ě min t| Ω | , | A |u , the value of precision will be equal to zero by the results in15 igure 2: The supremum of the 3-convex hull and the 2-convex hull of sender utility from the financialadvice example. The left figure shows the maximum achievable utility with 3 signals, and the right figureshows the maximum achievable utility with 2 signals as a function of the prior beliefs. The dots correspondto the prior belief given in the example ( µ “ p . , . , . q ). Kamenica and Gentzkow (2011). Therefore we focus exclusively on the coarse communica-tion setting in which k ă min t| Ω | , | A |u .The value of precision depends on the structure of the sender and receiver utility func-tions, and the location of the prior belief µ . It critically depends on what actions the sendercan induce while still maintaining Bayes Plausibility. 
If maintaining Bayes Plausibility withlower dimensional signals requires inducing actions with lower payoffs, or inducing a pos-terior located in a low-payoff yielding portion of an action region, then the sender will bewilling to pay more for more precise communication.We establish an upper bound on the value of precision, or equivalently, a lower bound onthe utility achievable with k ´ V ˚ p k, µ q and V ˚ p k ´ , µ q , the loss in utility cannot be too high. Theorem 3.
Suppose | S | “ k ě , and the sender utility function u S is positive everywhere.Then, the following upper bound must hold for the value of precision at k ´ signals: V ˚ p k, µ q ´ V ˚ p k ´ , µ q ď k V ˚ p k, µ q Thus, we show that the utility attainable with k ´ k ´ k V ˚ p k q and V ˚ p k q . This provides a lower bound on the utility loss from using smaller signal spaces,as a function of utility achievable with unrestricted communication. Let τ ˚ k and τ ˚ k ´ bethe optimal information structures using k and k ´ τ ˚ k can be ‘collapsed’ to get an information structure with k ´ τ ˚ k ´ . We can construct k different k ´ τ ˚ k pairwise and leaving the rest of the posteriors the same as τ ˚ k . Theutilities provided by these new information structures are related to V ˚ p k, µ q , because theycontain k ´ τ ˚ k . The resulting inequalities yieldthe lower bound in theorem 3. We will show that the value of precision can be non-monotone in general. We analyze anexample with 3 states of the world to demonstrate how the behavior of the value of precisioncan depend on the location of the prior. In our example we will see that V ˚ p , µ q ´ V ˚ p , µ q can be greater or less than V ˚ p , µ q ´ V ˚ p , µ q . We will also demonstrate how this differ-ence depends on the difficulty of inducing beneficial actions for the sender.Let Ω “ t ω , ω , ω u . There are four actions available to the receiver A “ t a , a , a , a u .We consider a Bayesian Persuasion game where the sender has an optimal action for eachstate and a default safe action. This can be represented with receiver preferences of the form: u R p a, ω i q “ $’&’% a “ a ´ ¯ π ¯ π if a “ a i @ i P t , , u´ a ‰ a i @ i P t , , u These preferences can be used to model situations in which for each state ω i action a i isoptimal, and mismatching the state i.e. taking action a j j ‰ j ‰ i is costly, with costnormalized to unity. Finally, a is the safe action. 
Such receiver preferences lead to actionthresholds over the simplex of posterior beliefs.Let us denote µ s p ω i q by µ is , where µ is is the i th coordinate of a given posterior belief µ s .One can think of µ s p ω q as the probability distribution over Ω induced by µ s .For each state, there is a corresponding preferred action a i which is taken by the receiverif and only if the receiver believes the state of the world is ω i with at least probability¯ π . Specifically, the receiver prefers action a i P t a , a , a u if and only if the posterior belief µ s P ∆ p Ω q such that µ is ě ¯ π , and prefers a otherwise. Hence, we can say that for i P t , , u , j P t , , , u and j ‰ i we have that E µ s r u R p a i , ω qs ě E µ s r u R p a j , ω qs if and only if µ is ą ¯ π .The action zones for these receiver preferences can be represented as: R i “ t µ s P ∆ p ω q| µ is ě ¯ π u @ ω P Ω, u s p a , ω q “ u s p a i , ω q “
1. Thus, thesender only cares about actions and not the states, and aims to induce the non-default ac-tions. The parameter ¯ π can be interpreted as the difficulty of inducing the beneficial actionsfor the sender.Given this structure, it should be obvious that sender can attain a payoff of 1 by using3-signal information structures. This follows from the fact that for every prior µ P ∆ p Ω q with µ “ p µ , µ , µ q the sender can use the information structure p , , q with probability µ , p , , q with probability µ and p , , q with probability µ . This information structurecorresponds to τ p µ s q P ∆ p ∆ p Ω qq with τ pp , , qq “ µ , τ pp , , qq “ µ , τ pp , , qq “ µ .We have that E τ u s p a p ω q , ω q “
1. Every point inside simplex can be represented as the con-vex combination of the extreme points of the simplex, hence achieving the maximal utilitywith 3 signals is possible for every interior prior.With 1-signal information structures (i.e. no information transmission at all), we havethat the payoff sender can achieve is E µ u s p a p µ q , ω q “ µ P R i @ i P t , , u µ thatare in R , as for priors in R i for i P t , , u the maximal payoff can be obtained withno information transmission at all. We define ∆ c as the set where two-signal informationstructures attain lower payoff than three-signal information structures. The following lemmastates the values of ¯ π such that this set is non-empty. Lemma 6. ∆ c ‰ H if and only if ¯ π ě . For thresholds ¯ π ď , two-dimensional information structures suffice for achieving maxi-mal utility. We restrict attention to cases where ¯ π ą . In this regime, we can state that forany prior in ∆ c , the utility attained by two-signal information structures is bounded withintwo values. Lemma 7. If ¯ π ą we have that V p , µ q ă V p , µ q “ for every µ P ∆ c and V p , µ q “ V p , µ q “ for every µ R ∆ c . Moreover, @ µ P ∆ c , V p , µ q ą V p , µ q ą V p , µ q , where V p , µ q “ π ´ π and V p , µ q “ π . In figure 4, we plot V p , µ q and V p , µ q as a function of the action threshold ¯ π . Thefollowing is an immediate implication of lemma 7. Fixing the preferences of the sender andthe receiver, for some prior beliefs, the value of an additional signal is an increasing function,and for others, it is decreasing. 18 igure 3: On the left, we have the action threshold ¯ π “ so it is possible to maintain Bayes Plausibilitywhen inducing non-default actions for every prior. On the right, ¯ π ą , so for some beliefs, we have tomix the default action and the non-default action when constrained to 2 signals. The dark red, blue andgreen regions are the beneficial action regions. The yellow middle region is the default action region. 
The orange region in the right figure corresponds to $\Delta_c$. Figure 4:
Achievable utilities with two signals for $\mu_0 \in \Delta_c$, as a function of the action threshold $\bar{\pi}$. The blue line depicts the minimum of the equilibrium sender utility among all $\mu_0 \in \Delta_c$, and the yellow line denotes the maximum value. Corollary 2.
Depending on the location of the prior inside $\Delta_c$, the value of precision can be increasing or decreasing with respect to additional signals. That is, the value of the third signal, $V^*(3, \mu_0) - V^*(2, \mu_0)$, exceeds the value of the second signal, $V^*(2, \mu_0) - V^*(1, \mu_0)$, for some priors in $\Delta_c$ and falls short of it for others.

The priors for which the value of precision is increasing are the ones that are the furthest away from the beneficial action regions. For the sender who only has access to two signals, the only way to induce favorable actions with these priors is by also inducing the default action with high probability, getting an expected utility below 0.5. Therefore, the value of the second signal is also below 0.5. Getting access to the third signal allows the sender to maintain Bayes Plausibility without inducing the default action, guaranteeing a payoff of 1. Hence, the value of the third signal is higher than 0.5. For the priors at which the value of precision is decreasing, the comparison is reversed: the value of the second signal is higher than the value of the third signal.

Note that additional signals always weakly increase the sender utility, because the feasible set in the optimization problem is expanding. This is not necessarily the case for the receiver, as we will see in our next application.
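The two-signal payoffs discussed in this example can be approximated by brute force. The Python sketch below is an illustration only and relies on assumptions that are not taken verbatim from the text: three states, a sender payoff normalized to 1 whenever some coordinate of the posterior reaches the threshold $\bar{\pi}$ and to 0 under the default action, and hypothetical names (`sender_payoff`, `two_signal_value`) and grid sizes. For each candidate posterior $a$ on a grid over the simplex, the second posterior is placed on the ray from $a$ through the prior, $b = \mu_0 + t(\mu_0 - a)$, and the weight $t/(1+t)$ on $a$ keeps the pair Bayes Plausible.

```python
import itertools
import numpy as np

def sender_payoff(mu, threshold, tol=1e-9):
    """1 if some state is believed with probability >= threshold, else 0."""
    return 1.0 if np.max(mu) >= threshold - tol else 0.0

def two_signal_value(prior, threshold, grid=20, steps=50):
    """Grid approximation of the best sender payoff with at most two signals."""
    prior = np.asarray(prior, dtype=float)
    best = sender_payoff(prior, threshold)        # no information transmitted
    for i, j in itertools.product(range(grid + 1), repeat=2):
        if i + j > grid:
            continue
        a = np.array([i, j, grid - i - j], dtype=float) / grid
        d = prior - a                             # direction from a to the prior
        falling = d < -1e-12                      # coordinates of b that shrink with t
        if not falling.any():
            continue                              # a coincides with the prior
        # Largest step that keeps b inside the simplex.
        t_max = np.min(prior[falling] / -d[falling])
        for t in t_max * np.linspace(0.0, 1.0, steps + 1)[1:]:
            b = prior + t * d
            w_a = t / (1.0 + t)                   # Bayes Plausible weight on a
            value = (w_a * sender_payoff(a, threshold)
                     + (1.0 - w_a) * sender_payoff(b, threshold))
            best = max(best, value)
    return best
```

Under these assumptions, at the uniform prior the search attains the full payoff of 1 with two signals when the threshold is 0.5, but not when the threshold is 0.8, where one of the two posteriors necessarily falls in the default region. The grid makes this an approximation in general, not an exact solver.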
Our model also can be used to analyze the optimal advice seeking behavior of a receiver.Suppose, before the game described in section 3.1 takes place, the receiver can choose thecardinality of the signal space | S | “ k .Letting the receiver decide the cardinality of the signal space allows them to change theoutcome in their favor. The receiver can choose to ask for “simple advice” consisting offewer action recommendations rather than a more complicated one. We will show throughan example that the receiver will not always prefer using rich signal spaces.First, observe that if there is perfect alignment between the receiver and the sender’sutilities, so that ˆ u R “ ˆ u S , the receiver will always pick the maximum number of signalspossible. This is because the sender’s utility (and therefore the receiver’s utility) is weaklyincreasing in the number of signals available.Let us now turn to the more interesting case of misalignment. Receiver’s preferencesover the number of signals will depend on the location of the prior and the degree of themisalignment between the sender and the receiver. We will make this more clear with thefollowing example. Suppose there are three states t ω , ω , ω u “ Ω and five actions A “ t a , a , a , a , a u where a denotes the default action taken at the prior belief µ “ p { , { , { q . For simplicity,suppose the sender’s utility depends only on the actions taken, and the default action is theworst outcome. The sender prefers inducing the actions a , a and a over a , and a and a are preferred over a and a .Receiver preferences are such that the optimal actions are a , a , a whenever the beliefsare certain enough, meaning that upon observing signal s P S it is the case that µ s p ω i q ą ¯ π for i P t , , u . Moreover, whenever µ s p ω q ă ¯ π and µ s p ω q ă ¯ π but µ s p ω q ` µ s p ω q ą ¯ π the20eceiver takes action a . 
This means that there are two different actions that the receiveroptimally takes when the beliefs are uncertain, which are a (uncertain but leaning ω ) and a (uncertain but leaning ω or ω ). Figure 5 plots these preferences along with the optimal2 and 3-signal information structures. The full utility function for the receiver is given inthe appendix B.3.We consider the following game: the receiver will pick the cardinality of the signal space k “ | S | first, sender observes this choice and picks the optimal Bayes plausible informationstructure with k signals. By sequential rationality and our previous calculations we can char-acterize the sender’s behavior using our results. Namely, the sender will pick the optimalinformation structure for the k -constrained Bayesian Persuasion game, given the choice of k by the receiver. Hence, the receiver will pick k “ | Ω | such that expected receiver utility ismaximized.It is easy to verify that for every equilibrium (PBE) of this game the receiver will pick k “
2, as plotted in figure 5. For the receiver’s choice of k “
2, the sender will pick theinformation structure described by the red line in the lower left box in figure 5, inducing a and a . Off path, for the choice of k “ k “
3, the sender will pick the information structureshown with the blue triangle in the upper right corner in figure 5, inducing a , a , a . Hence,receivers picks k to be equal to 2. The lower right plot in figure 5 shows how the two-signalinformation structure, three-signal information structure and the single-signal informationstructure compare in terms of expected utility for the receiver.We see that there is a misalignment between the receiver and the sender preferences. Thereceiver prefers outcomes that are more certain about ω and ω whereas the sender onlywants to induce actions a and a and does not care about certainty in beliefs. The senderideally wants to induce uncertain posteriors leading to action a and a with high probability.Limiting the sender to two signals, the receiver can force the sender to induce morecertain beliefs about ω . This is because under the Bayes Plausibility constraint, imposing k “ ω . With k “
3, the senderoptimally induces vague posteriors about ω and ω .This example shows that the receiver might prefer to limit senders ability to communicateand opt for simple advice. More elaborately, in the game considered above, we see that thereceiver prefers the expected outcome with two signals ( k “
2) over three signals ( k “ k “
3) over no communication at all ( k “ See appendix B.3 for utility functions. igure 5: Partial misalignment and optimal advice seeking. In all figures, the black perpendicular line at(1/3,1/3,1/3) represents the location of the prior. The top left figure depicts the sender’s utility over thesimplex, which depends only on the actions taken by the receiver. The top right figure shows the optimal3-signal information structure, and the bottom left figure shows the optimal 2-signal information structure.The bottom right figure depicts the receiver’s utility, with the optimal 2-signal (red line) and 3-signal (bluesurface) information structures. For the receiver, utility with 2 signals (red point) is higher than the utilitywith 3 signals (blue point). that no useful information can be transmitted at all.The example also demonstrates an interesting property of cardinality constrained per-suasion games. We see that the optimal information structures chosen with k “ k “ This presents an interesting avenue for studying coarse communication. InformationDesign and Bayesian Persuasion literature generally focuses on a variety of examples -e.g.Judge-Prosecutor communication (Kamenica and Gentzkow 2011)- where the receiver hassome power on the communication procedure. One way to reflect this power is letting thereceiver pick the cardinality of the sender’s signal space. The example above shows thatthe receiver may prefer to pick k to be less than the cardinality of the action and statespace. Our framework can be used to analyze these interactions in detail by characterizingthe solutions to Bayesian Persuasion problems with coarse signal spaces. By Corollary 4 in appendix B.2. Conclusion
We set out to analyze the effect of coarseness in strategic communication. The value of precise communication in a game where a sender is trying to persuade a receiver is characterized, and an upper bound for this value which applies to all finite persuasion games is presented. This is done by proving the existence and characterizing the properties of an optimal information structure in a game of persuasion with constrained signal spaces, which was left unexplored by previous literature. In doing so, we develop a novel way of solving finite Bayesian Persuasion problems and finding optimal information structures using Choquet's Theorem. Our work complements the asymptotic upper bound results of Le Treust and Tomala (2019) on infinitely repeated persuasion games with noisy and coarse channels, and the results in Tsakas and Tsakas (2018) on finite persuasion games over noisy channels. We show that constrained signal spaces create non-trivial difficulties for the sender in a persuasion game and demonstrate how we can analyze the outcomes using k-convex hulls. In settings where a receiver is asking for advice from a sender with misaligned preferences, we show that it might be optimal to ask for simple recommendations. This gives us a better understanding of settings in which the communication between parties can be limited by the receiver (or a regulator).

With this general model, we can apply our framework to various settings where we would like to learn about the willingness to pay of a sender for more precise messages. Some of the most important questions studied using persuasion games can now be analyzed from this new viewpoint. How much would a politician be willing to spend to design a more detailed policy experiment to convince voters? How much would a lobbyist be willing to pay to send a more precise action recommendation to the politician that they are trying to persuade? How much would a firm trying to send product information to a potential customer be willing to pay for a longer, more detailed ad?

Our model can also be used to study competition between senders who have access to signal spaces with different degrees of complexity, or the problem of a sender trying to persuade a heterogeneous set of agents using public or private signals with different dimensionalities. We leave these questions for future work.

Appendices
A Proofs
Proof of Lemma 1
Given $a \in A$, $R_a$ is the intersection of $\Delta(\Omega)$, which is closed and convex, and finitely many closed half spaces defined by $\{\mu \in \mathbb{R}^{|\Omega|} : \sum_{\omega \in \Omega} \mu(\omega)\,(u(a, \omega) - u(a', \omega)) \ge 0\}_{a' \in A}$. It is therefore closed and convex. Proof of Lemma 2
Follows directly from Volund (2018), Theorem 1 or Lipnowski and Mathevet (2017), Theorem 1.
Proof of Lemma 3
We prove this claim by explicitly constructing $\tau'$. Using the convexity of $\hat{u}_S$ within $R_a$, we can find two alternative beliefs $\mu'$, $\mu''$ in $\operatorname{Bd}(R_a)$, such that replacing the interior belief with one of these two beliefs maintains Bayes Plausibility and weakly increases $E_\tau \hat{u}_S$.

Let $\operatorname{supp}(\tau) = \{\mu_1, \mu_2, \ldots, \mu_k\}$. Since $\tau$ satisfies Bayes Plausibility, we have $\mu_0 = \sum_{i=1}^{k} \tau(\mu_i)\mu_i$ for some $\tau(\mu_1), \ldots, \tau(\mu_k)$, which satisfy $\sum_i \tau(\mu_i) = 1$, and $\forall i \in \{1, \ldots, k\}$, $1 > \tau(\mu_i) > 0$. We want to show that we can construct $\tau'$ which satisfies Bayes Plausibility and $E_{\tau'} \hat{u}_S \ge E_{\tau} \hat{u}_S$.

Without loss of generality, let $\mu_k \in \operatorname{supp}(\tau)$ be the belief ($\mu_k \ne \mu_0$) such that for some $R_k \in \mathcal{R}$, $\mu_k \in \operatorname{int}(R_k)$. Consider the ray from $\mu_0$ passing through $\mu_k$, parameterized as $\{\mu_0 + s(\mu_k - \mu_0),\ s \in \mathbb{R}_+\}$.

First, assume $\mu_0 \notin R_k$. Since $\mu_k \in \operatorname{int}(R_k)$, and $R_k$ is closed, bounded, and convex, the line segment passing through the interior point $\mu_k$ intersects $\operatorname{Bd}(R_k)$ at two points (Yaglom and Boltyansky 1961). Let these two points be denoted as $\mu_k'$ and $\mu_k''$. Since these two points also lie on the ray passing through $\mu_k$ originating from $\mu_0$, they can be written in a parametric form. Therefore, for some $\delta > 0$ and $1 > \gamma > 0$:

$$\mu_k' = \mu_0 + (1 + \delta)(\mu_k - \mu_0) = \mu_k + \delta(\mu_k - \mu_0),$$
$$\mu_k'' = \mu_0 + (1 - \gamma)(\mu_k - \mu_0) = \mu_k - \gamma(\mu_k - \mu_0).$$

Moreover, we can write our original point $\mu_k$ as a convex combination of these two points as $\frac{\gamma}{\gamma + \delta}\mu_k' + \frac{\delta}{\gamma + \delta}\mu_k'' = \mu_k$. Note that by convexity of $\hat{u}_S$ within $R_k$, we get:

$$\frac{\gamma}{\gamma + \delta}\hat{u}_S(\mu_k') + \frac{\delta}{\gamma + \delta}\hat{u}_S(\mu_k'') \ge \hat{u}_S(\mu_k). \qquad (1)$$

Now, let us define two new information structures, $\tau'$ and $\tau''$, by replacing $\mu_k$ with $\mu_k'$ and $\mu_k''$, respectively. We will now show that we maintain Bayes Plausibility with these new information structures. Lemma 8.
The new information structures $\tau'$ and $\tau''$, constructed as described above, are Bayes Plausible. Proof.
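Start with comparing $\tau'$ with $\tau$. We have $\operatorname{supp}(\tau) = \{\mu_1, \ldots, \mu_k\}$ and $\operatorname{supp}(\tau') = \{\mu_1, \ldots, \mu_{k-1}, \mu_k'\}$. We know that $\tau$ is Bayes Plausible, so we have $\mu_0 = \sum_{i=1}^{k} \tau(\mu_i)\mu_i$ for some $\tau(\mu_1), \ldots, \tau(\mu_k)$, which satisfy $\sum_i \tau(\mu_i) = 1$, and $\forall i$, $1 > \tau(\mu_i) > 0$. We also know that $\mu_k' = \mu_0 + (1 + \delta)(\mu_k - \mu_0) = \mu_k + \delta(\mu_k - \mu_0)$. Let us define a new probability distribution $\tau' \in \Delta(\Delta(\Omega))$ representing $\mu_0$, i.e. $\mu_0 = \sum_{i < k} \tau'(\mu_i)\mu_i + \tau'(\mu_k')\mu_k'$. Simple algebra reveals that this equality will hold for $\tau'$:

$$\tau'(\mu_i) = \frac{\tau(\mu_i)(1 + \delta)}{1 + \delta - \tau(\mu_k)\delta} \ \text{ for } i < k, \quad \text{and} \quad \tau'(\mu_k') = \frac{\tau(\mu_k)}{1 + \delta - \tau(\mu_k)\delta}.$$

Note that $1 + \delta - \tau(\mu_k)\delta > 0$, and $\tau(\mu_k) < 1 + \delta - \tau(\mu_k)\delta$, so $1 > \tau'(\mu_k') > 0$. Also note that $\tau(\mu_i)(1 + \delta) < 1 + \delta - \tau(\mu_k)\delta$ since $\frac{\tau(\mu_i) - 1}{1 - \tau(\mu_k) - \tau(\mu_i)} < 0 < \delta$. Therefore $\forall i$, $1 > \tau'(\mu_i) > 0$. Finally:

$$\sum_{i \le k} \tau'(\mu_i) = \frac{(1 + \delta)\sum_{i < k} \tau(\mu_i)}{1 + \delta - \tau(\mu_k)\delta} + \frac{\tau(\mu_k)}{1 + \delta - \tau(\mu_k)\delta} = \frac{1 + \delta - \tau(\mu_k)\delta}{1 + \delta - \tau(\mu_k)\delta} = 1.$$

Similarly, take $\tau''$. We have $\operatorname{supp}(\tau) = \{\mu_1, \ldots, \mu_k\}$ and $\operatorname{supp}(\tau'') = \{\mu_1, \ldots, \mu_{k-1}, \mu_k''\}$. We know that $\mu_k'' = \mu_0 + (1 - \gamma)(\mu_k - \mu_0) = \mu_k - \gamma(\mu_k - \mu_0)$. Let us define a new probability distribution $\tau'' \in \Delta(\Delta(\Omega))$ representing $\mu_0$, i.e. $\mu_0 = \sum_{i < k} \tau''(\mu_i)\mu_i + \tau''(\mu_k'')\mu_k''$. Simple algebra reveals that this equality will hold for $\tau''$:

$$\tau''(\mu_i) = \frac{\tau(\mu_i)(1 - \gamma)}{1 - \gamma + \tau(\mu_k)\gamma} \ \text{ for } i < k, \quad \text{and} \quad \tau''(\mu_k'') = \frac{\tau(\mu_k)}{1 - \gamma + \tau(\mu_k)\gamma}.$$

Note that $1 - \gamma + \tau(\mu_k)\gamma > 0$ since $\gamma < 1$. Also, $\tau(\mu_i)(1 - \gamma) < 1 - \gamma + \tau(\mu_k)\gamma$ since $\forall i$, $\tau(\mu_i) < 1$. Therefore, $\forall i$, $1 > \tau''(\mu_i) > 0$. Finally:

$$\sum_{i \le k} \tau''(\mu_i) = \frac{(1 - \gamma)\sum_{i < k} \tau(\mu_i)}{1 - \gamma + \tau(\mu_k)\gamma} + \frac{\tau(\mu_k)}{1 - \gamma + \tau(\mu_k)\gamma} = \frac{1 - \gamma + \tau(\mu_k)\gamma}{1 - \gamma + \tau(\mu_k)\gamma} = 1. \qquad \blacksquare$$

We are now ready to prove the main result. Let $E_{\tau'} \hat{u}_S$ and $E_{\tau''} \hat{u}_S$ be the sender's utility under the new information structures $\tau'$ and $\tau''$. Using the definitions of $\tau'$, $\tau''$, we can calculate the difference between these new values and the sender's utility under $\tau$, which is simply $E_\tau \hat{u}_S = \sum_{i \le k} \tau(\mu_i)\hat{u}_S(\mu_i)$. Simple algebra shows the following, where the leading factors are strictly positive:

$$E_{\tau'} \hat{u}_S - E_{\tau} \hat{u}_S = \frac{\tau(\mu_k)}{1 + \delta - \tau(\mu_k)\delta}\left[\hat{u}_S(\mu_k') - \hat{u}_S(\mu_k) + \delta\left(\Big(\sum_{i \le k} \tau(\mu_i)\hat{u}_S(\mu_i)\Big) - \hat{u}_S(\mu_k)\right)\right],$$

$$E_{\tau''} \hat{u}_S - E_{\tau} \hat{u}_S = \frac{\tau(\mu_k)}{1 - \gamma + \tau(\mu_k)\gamma}\left[\hat{u}_S(\mu_k'') - \hat{u}_S(\mu_k) - \gamma\left(\Big(\sum_{i \le k} \tau(\mu_i)\hat{u}_S(\mu_i)\Big) - \hat{u}_S(\mu_k)\right)\right].$$

For a contradiction, suppose that both $E_{\tau'} \hat{u}_S - E_{\tau} \hat{u}_S < 0$ and $E_{\tau''} \hat{u}_S - E_{\tau} \hat{u}_S < 0$. Dividing by the positive leading factors, rearranging the bracketed terms and multiplying with $\gamma$ and $\delta$ respectively, we get:

$$\gamma \hat{u}_S(\mu_k') + \gamma\delta\left(\Big(\sum_{i \le k} \tau(\mu_i)\hat{u}_S(\mu_i)\Big) - \hat{u}_S(\mu_k)\right) < \gamma \hat{u}_S(\mu_k),$$

and

$$\delta \hat{u}_S(\mu_k'') - \delta\gamma\left(\Big(\sum_{i \le k} \tau(\mu_i)\hat{u}_S(\mu_i)\Big) - \hat{u}_S(\mu_k)\right) < \delta \hat{u}_S(\mu_k).$$

Adding the two inequalities implies:

$$\delta \hat{u}_S(\mu_k'') + \gamma \hat{u}_S(\mu_k') < (\delta + \gamma)\hat{u}_S(\mu_k).$$

However, by inequality (1) implied by convexity, we have:

$$\frac{\gamma}{\gamma + \delta}\hat{u}_S(\mu_k') + \frac{\delta}{\gamma + \delta}\hat{u}_S(\mu_k'') \ge \hat{u}_S(\mu_k) \iff \delta \hat{u}_S(\mu_k'') + \gamma \hat{u}_S(\mu_k') \ge (\delta + \gamma)\hat{u}_S(\mu_k).$$

Therefore, we get a contradiction, so $E_{\tau'} \hat{u}_S - E_{\tau} \hat{u}_S < 0$ and $E_{\tau''} \hat{u}_S - E_{\tau} \hat{u}_S < 0$ cannot hold simultaneously. Hence $E_{\tau'} \hat{u}_S - E_{\tau} \hat{u}_S \ge 0$ or $E_{\tau''} \hat{u}_S - E_{\tau} \hat{u}_S \ge 0$.

Next, assume $\mu_0, \mu_k \in R_k$. Since $\mu_k \in \operatorname{int}(R_k)$, and $R_k$ is closed, bounded, and convex, the ray originating from $\mu_0$ passing through $\mu_k$ intersects $\operatorname{Bd}(R_k)$ at a single point, which we will denote by $\mu_k'$. Since $\mu_k'$ lies on this line, for some $\delta > 0$, we will have:

$$\mu_k' = \mu_0 + (1 + \delta)(\mu_k - \mu_0) = \mu_k + \delta(\mu_k - \mu_0).$$

Moreover, we can write $\mu_k$ as a convex combination of $\mu_k'$ and $\mu_0$, where $\frac{\mu_k'}{1 + \delta} + \frac{\delta\mu_0}{1 + \delta} = \mu_k$.

Consider a new information structure $\tau'$, where we replace $\mu_k$ with $\mu_k'$ in $\tau$, implying $\operatorname{supp}(\tau') = \{\mu_1, \ldots, \mu_{k-1}, \mu_k'\}$. Similar to the first part of the proof, we construct a probability distribution $\tau' \in \Delta(\Delta(\Omega))$ that represents $\mu_0$, i.e. we need $\{\tau'(\mu_i)\}_{i \le k}$ to satisfy $\mu_0 = \sum_{i < k} \tau'(\mu_i)\mu_i + \tau'(\mu_k')\mu_k'$. Simple algebra reveals that this equality will hold for $\tau'$:

$$\tau'(\mu_i) = \frac{\tau(\mu_i)(1 + \delta)}{1 + \delta - \tau(\mu_k)\delta} \ \text{ for } i < k, \quad \text{and} \quad \tau'(\mu_k') = \frac{\tau(\mu_k)}{1 + \delta - \tau(\mu_k)\delta}.$$

Since the original information structure is assumed to be beneficial, we know that the payoff is better than the payoff under the receiver's default action. This implies:

$$\hat{u}_S(\mu_0) \le \sum_{i \le k} \tau(\mu_i)\hat{u}_S(\mu_i).$$

Also by the convexity of $\hat{u}_S$ within $R_k$, we know that:

$$\frac{\hat{u}_S(\mu_k')}{1 + \delta} + \frac{\delta\,\hat{u}_S(\mu_0)}{1 + \delta} \ge \hat{u}_S(\mu_k).$$

Now, let us calculate the difference in expected sender payoff between $\tau'$ and $\tau$. We find that:

$$E_{\tau'} \hat{u}_S - E_{\tau} \hat{u}_S = \frac{\tau(\mu_k)}{1 + \delta - \tau(\mu_k)\delta}\left[\hat{u}_S(\mu_k') - \hat{u}_S(\mu_k) + \delta\left(\Big(\sum_{i \le k} \tau(\mu_i)\hat{u}_S(\mu_i)\Big) - \hat{u}_S(\mu_k)\right)\right].$$

For a contradiction, suppose that $E_{\tau'} \hat{u}_S - E_{\tau} \hat{u}_S <$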
0. This implies the following:ˆ u S p µ k q ´ ˆ u S p µ k q` ă δ ˜ ˆ u S p µ k q ´ ˜ÿ i ď k τ p µ i q ˆ u S p µ i q ¸¸ ď δ ` ˆ u S p µ k q ´ ˆ u S p µ q ˘ ô ` δ ˆ u S p µ k q ` δ ` δ ˆ u S p µ q ă ˆ u S p µ k q . This contradicts ˆ u S p µ k q ` δ ` δ ˆ u S p µ q ` δ ě ˆ u S p µ k q , which we know to be true from the convexity ofˆ u S within R k . Therefore E τ ˆ u S ´ E τ ˆ u S ě µ k P supp p τ q that is not on the boundary of a region R a through the stepsdescribed above, we can reach a τ that yields weakly higher utility for the sender. Thiscompletes the proof. Proof of lemma 4
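Let $\operatorname{supp}(\tau) = \{\mu_1, \dots, \mu_k\}$ be affinely dependent. Then there must exist $\{\lambda_1, \dots, \lambda_k\}$, not all zero, such that $\sum_{i \le k}\lambda_i = 0$ and $\sum_{i \le k}\lambda_i\mu_i = 0$. Since $\tau$ is Bayes Plausible, we have $\mu_0 = \sum_{i=1}^{k}\tau(\mu_i)\mu_i$ for some $\tau(\mu_1), \dots, \tau(\mu_k)$, which satisfy $\sum_i \tau(\mu_i) = 1$ and $\forall i$, $1 > \tau(\mu_i) > 0$. Because the weights sum to zero, among $\{\lambda_1, \dots, \lambda_k\}$ some elements must be positive and some negative. Among the subset with negative weights, pick $j^*$ such that $\frac{\tau(\mu_j)}{\lambda_j}$ is maximized. Among the subset with positive weights, pick $p^*$ such that $\frac{\tau(\mu_p)}{\lambda_p}$ is minimized. Now, we can write
\[
\mu_{j^*} = \sum_{i \ne j^*} -\frac{\lambda_i}{\lambda_{j^*}}\mu_i, \qquad \mu_{p^*} = \sum_{i \ne p^*} -\frac{\lambda_i}{\lambda_{p^*}}\mu_i.
\]
Rewriting the Bayes Plausibility condition, we get
\[
\tau(\mu_1)\mu_1 + \dots + \tau(\mu_{j^*})\Big(\sum_{i \ne j^*} -\frac{\lambda_i}{\lambda_{j^*}}\mu_i\Big) + \dots + \tau(\mu_k)\mu_k = \mu_0
\iff \sum_{i \ne j^*}\Big(\tau(\mu_i) - \tau(\mu_{j^*})\frac{\lambda_i}{\lambda_{j^*}}\Big)\mu_i = \mu_0,
\]
and analogously,
\[
\sum_{i \ne p^*}\Big(\tau(\mu_i) - \tau(\mu_{p^*})\frac{\lambda_i}{\lambda_{p^*}}\Big)\mu_i = \mu_0.
\]
Now, we will show that $\forall i \ne j^*$, $\tau(\mu_i) - \frac{\lambda_i\tau(\mu_{j^*})}{\lambda_{j^*}} \ge 0$, and $\forall i \ne p^*$, $\tau(\mu_i) - \frac{\lambda_i\tau(\mu_{p^*})}{\lambda_{p^*}} \ge 0$.

If $\lambda_i = 0$, the inequalities hold trivially.

If $\lambda_i > 0$, the inequalities are equivalent to $\frac{\tau(\mu_i)}{\lambda_i} \ge \frac{\tau(\mu_{j^*})}{\lambda_{j^*}}$ and $\frac{\tau(\mu_i)}{\lambda_i} \ge \frac{\tau(\mu_{p^*})}{\lambda_{p^*}}$. In both cases the condition holds, because $\lambda_{j^*}$ is negative and $p^*$ is chosen to minimize this ratio.

If $\lambda_i < 0$, the inequalities are equivalent to $\frac{\tau(\mu_i)}{\lambda_i} \le \frac{\tau(\mu_{j^*})}{\lambda_{j^*}}$ and $\frac{\tau(\mu_i)}{\lambda_i} \le \frac{\tau(\mu_{p^*})}{\lambda_{p^*}}$. In both cases the condition holds, because $j^*$ is chosen to maximize this ratio and $\lambda_{p^*}$ is positive.

Moreover, note that $\sum_{i \ne j^*}\Big(\tau(\mu_i) - \frac{\lambda_i\tau(\mu_{j^*})}{\lambda_{j^*}}\Big) = (1 - \tau(\mu_{j^*})) + \frac{\tau(\mu_{j^*})\lambda_{j^*}}{\lambda_{j^*}} = 1$, and analogously for $p^*$. Therefore, we can define $\tau'$ and $\tau''$ respectively from $\tau$ by dropping $\mu_{j^*}$ or $\mu_{p^*}$, and we maintain Bayes Plausibility using the convex weights $\Big(\tau(\mu_i) - \frac{\lambda_i\tau(\mu_{j^*})}{\lambda_{j^*}}\Big)$ and $\Big(\tau(\mu_i) - \frac{\lambda_i\tau(\mu_{p^*})}{\lambda_{p^*}}\Big)$.

Now, writing $\mathbb{E}_{\tau'}\hat{u}^S - \mathbb{E}_{\tau}\hat{u}^S$ and $\mathbb{E}_{\tau''}\hat{u}^S - \mathbb{E}_{\tau}\hat{u}^S$, we get
\[
\mathbb{E}_{\tau'}\hat{u}^S - \mathbb{E}_{\tau}\hat{u}^S = \sum_{i \ne j^*}\Big(\tau(\mu_i) - \frac{\lambda_i\tau(\mu_{j^*})}{\lambda_{j^*}}\Big)\hat{u}^S(\mu_i) - \sum_{i \le k}\tau(\mu_i)\hat{u}^S(\mu_i),
\]
\[
\mathbb{E}_{\tau''}\hat{u}^S - \mathbb{E}_{\tau}\hat{u}^S = \sum_{i \ne p^*}\Big(\tau(\mu_i) - \frac{\lambda_i\tau(\mu_{p^*})}{\lambda_{p^*}}\Big)\hat{u}^S(\mu_i) - \sum_{i \le k}\tau(\mu_i)\hat{u}^S(\mu_i),
\]
which is equivalent to
\[
\mathbb{E}_{\tau'}\hat{u}^S - \mathbb{E}_{\tau}\hat{u}^S = -\frac{\tau(\mu_{j^*})}{\lambda_{j^*}}\Big(\sum_{i \ne j^*}\lambda_i\hat{u}^S(\mu_i)\Big) - \tau(\mu_{j^*})\hat{u}^S(\mu_{j^*}),
\]
\[
\mathbb{E}_{\tau''}\hat{u}^S - \mathbb{E}_{\tau}\hat{u}^S = -\frac{\tau(\mu_{p^*})}{\lambda_{p^*}}\Big(\sum_{i \ne p^*}\lambda_i\hat{u}^S(\mu_i)\Big) - \tau(\mu_{p^*})\hat{u}^S(\mu_{p^*}).
\]
Suppose $\mathbb{E}_{\tau'}\hat{u}^S - \mathbb{E}_{\tau}\hat{u}^S < 0$ and $\mathbb{E}_{\tau''}\hat{u}^S - \mathbb{E}_{\tau}\hat{u}^S < 0$. Dividing by $\tau(\mu_{j^*}) > 0$ and $\tau(\mu_{p^*}) > 0$, this implies
\[
-\frac{1}{\lambda_{j^*}}\Big(\sum_{i \ne j^*}\lambda_i\hat{u}^S(\mu_i)\Big) - \hat{u}^S(\mu_{j^*}) < 0, \quad \text{and} \quad -\frac{1}{\lambda_{p^*}}\Big(\sum_{i \ne p^*}\lambda_i\hat{u}^S(\mu_i)\Big) - \hat{u}^S(\mu_{p^*}) < 0
\]
\[
\iff \frac{1}{\lambda_{j^*}}\Big(\sum_{i \ne j^*}\lambda_i\hat{u}^S(\mu_i)\Big) + \hat{u}^S(\mu_{j^*}) > 0, \quad \text{and} \quad \frac{1}{\lambda_{p^*}}\Big(\sum_{i \ne p^*}\lambda_i\hat{u}^S(\mu_i)\Big) + \hat{u}^S(\mu_{p^*}) > 0.
\]
However, note that by assumption $\lambda_{j^*}$ and $\lambda_{p^*}$ have opposite signs. Multiplying the first inequality by $\lambda_{j^*} < 0$ and the second inequality by $\lambda_{p^*} > 0$, we must have
\[
\sum_{i \le k}\lambda_i\hat{u}^S(\mu_i) < 0, \quad \text{and} \quad \sum_{i \le k}\lambda_i\hat{u}^S(\mu_i) > 0,
\]
which is a contradiction. So $\mathbb{E}_{\tau'}\hat{u}^S - \mathbb{E}_{\tau}\hat{u}^S < 0$ and $\mathbb{E}_{\tau''}\hat{u}^S - \mathbb{E}_{\tau}\hat{u}^S < 0$ cannot both hold, and either $\tau'$ or $\tau''$ must yield weakly higher expected utility for the sender.

Replace $\tau$ with the information structure that yields weakly higher utility using the process defined above, which drops one belief that is affinely dependent. If the resulting information structure is affinely independent, we are done. If not, we can repeat the process described above, and we will either reach an affinely independent set of vectors before we get to two, or we reach two vectors, which must be affinely independent. This completes the proof.

Proof of lemma 5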
Existence and uniqueness follow from Choquet's Theorem, because the convex hull of $\operatorname{supp}(\tau)$ is a simplex by the affine independence condition. Now, given the convex weights $\tau = (\tau(\mu_1), \dots, \tau(\mu_k))$, one can transform them to the Cartesian coordinates of $\mu_0$ by using
\[
T_\tau = \begin{bmatrix}
\mu_{1,1} & \mu_{2,1} & \dots & \mu_{k,1} \\
\mu_{1,2} & \mu_{2,2} & \dots & \mu_{k,2} \\
\vdots & \vdots & \ddots & \vdots \\
\mu_{1,n} & \mu_{2,n} & \dots & \mu_{k,n} \\
1 & 1 & \dots & 1
\end{bmatrix}
\]
where $\mu_{i,j}$ is the $j$'th coordinate of the $i$'th posterior in $\operatorname{supp}(\tau)$. $T_\tau$ is a matrix with dimensions $(n+1, k)$, with linearly independent columns, which is guaranteed by the affine independence of $\operatorname{supp}(\tau)$. Let us denote $\tilde{\mu}_0 = (\mu_{0,1}, \dots, \mu_{0,n}, 1)$, the vector of Cartesian coordinates of $\mu_0$ with an added 1 as the $(n+1)$'th element. Then $T_\tau \tau = \tilde{\mu}_0$.
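The relation $T_\tau \tau = \tilde{\mu}_0$ can be inverted numerically. The following is a minimal sketch, not part of the original argument: it assumes the posteriors are affinely independent and recovers the weights through the normal equations $(T_\tau^{\top} T_\tau)\tau = T_\tau^{\top}\tilde{\mu}_0$, which is one standard way to apply a left inverse. The function names `build_T`, `solve_square` and `weights_from_prior` are hypothetical, and exact rational arithmetic is used only to keep the illustration free of rounding error.

```python
from fractions import Fraction


def build_T(posteriors):
    """Stack posteriors as columns and append a final row of ones: shape (n+1, k)."""
    n = len(posteriors[0])
    rows = [[mu[j] for mu in posteriors] for j in range(n)]
    rows.append([Fraction(1)] * len(posteriors))
    return rows


def solve_square(A, b):
    """Gauss-Jordan elimination for a square, non-singular system A x = b."""
    k = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(k):
        # Raises StopIteration if the system is singular (affinely dependent support).
        pivot = next(r for r in range(col, k) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(k):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][k] for i in range(k)]


def weights_from_prior(posteriors, prior):
    """Recover the unique convex weights tau with T tau = (prior, 1)."""
    T = build_T(posteriors)
    target = list(prior) + [Fraction(1)]
    k = len(posteriors)
    rows = range(len(T))
    # Normal equations: (T^T T) tau = T^T target.
    TtT = [[sum(T[r][i] * T[r][j] for r in rows) for j in range(k)] for i in range(k)]
    Ttb = [sum(T[r][i] * target[r] for r in rows) for i in range(k)]
    return solve_square(TtT, Ttb)
```

For instance, with three affinely independent posteriors over three states and a prior constructed from known weights, `weights_from_prior` returns those same weights, which illustrates the uniqueness claim.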
We also know that the left inverse of $T_\tau$ exists by affine independence. We denote it $T^L_\tau$, which has dimensions $(k, n+1)$. Similarly, we can define $T_{\tau'}$ for any $\tau' \in \zeta$. Then, we have
\[
T_\tau \tau = \tilde{\mu}_0, \qquad T_{\tau'}\tau' = \tilde{\mu}_0 \quad \Longrightarrow \quad \tau = T^L_\tau T_{\tau'}\tau',
\]
where $T^L_\tau T_{\tau'}$ is an affine transformation that takes the convex weights of $\mu_0$ with respect to the $k$-simplex $\tau'$ and maps them to the convex weights of $\mu_0$ with respect to the $k$-simplex $\tau$. The map is bounded, because the two information structures are bounded polytopes. Hence it is Lipschitz continuous, because bounded affine transformations are Lipschitz continuous. Hence, $\tau$ is uniformly continuous. This completes the proof.

Proof of Theorem 1
The following definition will be useful.
Definition 6. $S_a^{a'} \subset R_a$ denotes the region where the sender-preferred action $a'$ is taken in region $R_a$. Formally, $S_a^{a'} \subset R_a$ is defined as
\[
S_a^{a'} := \{\mu \in \Delta(\Omega) : \mu \in R_a,\ a' \in \hat{A}(\mu), \text{ and } \hat{u}^S(a', \mu) \ge \hat{u}^S(\tilde{a}, \mu)\ \ \forall \tilde{a} \in \hat{A}(\mu)\}.
\]

We start by creating auxiliary payoff candidates. This is only for illustrative purposes. First define
\[
\mathcal{F}^a = \{F^a_i : F^a_i \text{ is a facet of } R_a\} \quad \text{and} \quad F = \{F_i : \forall i = 1, \dots, k,\ F_i \in \mathcal{F}^a \text{ for some } a \in A\}.
\]
$\hat{u}^S$ is uniformly continuous in the interior of $R_a$, as $\hat{u}^S$ is piecewise linear in the interior of $R_a$. Then, by the Kirszbraun Theorem, we can extend $\hat{u}^S|_{\operatorname{int}(R_a)}$ to a Lipschitz continuous function $\hat{u}^S_a$ defined over $R_a$ with the same Lipschitz constant. Hence $\hat{u}^S_a$ is a uniformly continuous function over $R_a$.

Before proceeding, the reader should note the following. Consider an information structure supported on the boundary of an action zone $R_a$. There is a possibility that a posterior $\mu_l \in \operatorname{Bd}(R_a)$ also lies on the boundary of another zone, $\mu_l \in \operatorname{Bd}(R_{a'})$. The induced payoff of this information structure can be represented in two ways: $\sum_{i \ne l}\tau(\mu_i)\hat{u}^S(a(\mu_i)) + \tau(\mu_l)\hat{u}^S_a(a(\mu_l))$ and $\sum_{i \ne l}\tau(\mu_i)\hat{u}^S(a(\mu_i)) + \tau(\mu_l)\hat{u}^S_{a'}(a(\mu_l))$. But note that, since we are focusing on the sender-preferred equilibrium, the realized payoff at equilibrium corresponds to the representation that yields the higher payoff. Hence, the maximizing payoff we obtain will always correspond to the true payoffs, as auxiliary payoffs are always dominated by true payoffs. This can be done analogously by showing that we can limit our attention to those action zones such that sender utility can be extended uniformly continuously to the boundaries.

For each $i = 1, \dots, k$, $\tau(\mu_i)\hat{u}^S_a(\mu_i)$ is a product of uniformly continuous bounded functions, hence it is uniformly continuous.
Therefore the overall sum is uniformly continuous, since a finite sum of uniformly continuous functions is uniformly continuous.

If the sender objective on $\zeta_F$, which is $\mathbb{E}_\tau\hat{u}^S|_{\zeta_F} : \zeta_F \to \mathbb{R}$, is uniformly continuous, then it can be extended to a continuous function on the closure of $\zeta_F$, denoted $\mathbb{E}_\tau\hat{u}^S_a|_{\overline{\zeta_F}} : \overline{\zeta_F} \to \mathbb{R}$. $\mathbb{E}_\tau\hat{u}^S_a|_{\overline{\zeta_F}}$ attains a maximum over $\overline{\zeta_F}$ by the Weierstrass theorem, since $\zeta_F$ is bounded and its closure is compact by the Heine-Borel Theorem.

Next, observe that $\zeta_F$ can be written as the intersection of three sets, $\zeta_F = \Sigma \cap I_F \cap A$, where we define:

1- The set of all Bayes Plausible information structures as $\Sigma = \{(\mu_1, \dots, \mu_k) \in \Delta(\Omega)^k : \mu_0 \in \operatorname{co}(\{\mu_1, \dots, \mu_k\})\}$.

2- The set of information structures with $\operatorname{supp}(\tau)$ on $F$ as $I_F$.

3- The set of affinely independent information structures: $A = \{(\mu_1, \dots, \mu_k) \in \Delta(\Omega)^k : (\mu_1, \dots, \mu_k) \text{ is affinely independent}\}$.

First, note that $I_F$ is a closed subset of $\mathbb{R}^{k \times N}$, because Cartesian products of closed sets are closed. We proceed by proving the following claim.

Lemma 9. $\Sigma$ is closed.

Proof.
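We will show this by contradiction. Suppose not. Then $\exists \tau \in \overline{\Sigma}$ s.t. $\mu_0 \notin \operatorname{co}(\operatorname{supp}(\tau))$. If $\tau \in \Sigma$, then $\mu_0 \in \operatorname{co}(\operatorname{supp}(\tau))$ by definition of $\Sigma$. Then $\tau \in \overline{\Sigma} \setminus \Sigma$, so $\tau$ is a limit point of $\Sigma$. Denote the elements of $\operatorname{supp}(\tau) = \{\mu_1, \dots, \mu_k\}$.

So there exists a sequence $\{\tilde{\tau}^n\}_{n \in \mathbb{N}}$, $\tilde{\tau}^n = (\tilde{\mu}^n_1, \dots, \tilde{\mu}^n_k)$, s.t. $\tilde{\tau}^n \in \Sigma$ $\forall n \in \mathbb{N}$ and $\{\tilde{\tau}^n\}_{n \in \mathbb{N}}$ converges to $\tau$. Since we supposed that $\tau \notin \Sigma$, there exists no $(\alpha_1, \dots, \alpha_k)$ that satisfies $\alpha_i > 0$, $\sum_{i=1}^k \alpha_i = 1$ and $\sum_{i=1}^k \alpha_i\mu_i = \mu_0$. Furthermore, since $\tilde{\tau}^n \in \Sigma$ for each $n \in \mathbb{N}$, we have a unique $\tilde{\alpha}^n = (\tilde{\alpha}^n_1, \dots, \tilde{\alpha}^n_k)$ s.t. $\sum_{i=1}^k \tilde{\alpha}^n_i\tilde{\mu}^n_i = \mu_0$ for each $n \in \mathbb{N}$ by lemma 3.2. Then there must exist a $\delta > 0$ such that $\forall n \in \mathbb{N}$, $\big\|\sum_{i=1}^k \tilde{\alpha}^n_i\tilde{\mu}^n_i - \sum_{i=1}^k \tilde{\alpha}^n_i\mu_i\big\| = \big\|\sum_{i=1}^k \tilde{\alpha}^n_i(\tilde{\mu}^n_i - \mu_i)\big\| > \delta$, where $\|\cdot\|$ denotes the Euclidean norm. Also, by the fact that $\tilde{\tau}^n \to \tau$, we have that $\forall \varepsilon > 0$ $\exists n \in \mathbb{N}$ s.t. $\|\tilde{\tau}^n - \tau\|_F < \varepsilon$, where $\|\cdot\|_F$ denotes the Frobenius norm on $\mathbb{R}^{k \times N}$. So we get:
\[
\begin{aligned}
\varepsilon > \|\tilde{\tau}^n - \tau\|_F &= \sum_{i=1}^k \|\tilde{\mu}^n_i - \mu_i\| > \sum_{i=1}^k \tilde{\alpha}^n_i\|\tilde{\mu}^n_i - \mu_i\| \\
&\ge \Big\|\sum_{i=1}^k \tilde{\alpha}^n_i(\tilde{\mu}^n_i - \mu_i)\Big\| = \Big\|\sum_{i=1}^k \tilde{\alpha}^n_i\tilde{\mu}^n_i - \sum_{i=1}^k \tilde{\alpha}^n_i\mu_i\Big\| > \delta,
\end{aligned}
\]
where the equality in the first line follows from the definition of the Frobenius norm, and the second line follows from applying Jensen's inequality. Then, picking $\varepsilon = \delta$, we have $\delta > \delta$, a contradiction. Therefore, for any $\tau \in \overline{\Sigma}$, $\mu_0 \in \operatorname{co}(\operatorname{supp}(\tau))$. This shows that $\Sigma$ is closed. $\blacksquare$

Let $\mathbb{F}$ denote the finite set of all possible collections $F$. We can characterize the sender's maximization problem as follows:
\[
\max_{F \in \mathbb{F}} \big(V(F, \mu_0, k)\big),
\]
where $V(F, \mu_0, k) = \max_{\tau \in \zeta_F} \mathbb{E}_\tau\hat{u}^S_a|_{\zeta_F}$ if the maximum exists, with the added convention that $V(F, \mu_0, k) = -\infty$ if the feasible set is empty or the maximum does not exist for a given collection $F$.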
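Define $\overline{V}(F, \mu_0, k)$ analogously for $\big(\max_{\tau \in \overline{\zeta_F}} \mathbb{E}_\tau\hat{u}^S_a|_{\overline{\zeta_F}}\big)$.

If $\overline{\zeta_F}$ is nonempty, from the first part of the proof we know that $\mathbb{E}_\tau\hat{u}^S_a|_{\overline{\zeta_F}}$ attains a maximum, since it is a continuous function over a compact set. Note that $\zeta_F$ is always nonempty for some $F$: trivially, we can find two facets of the $R_a$ for which $\mu_0 \in R_a$, and define the line segment passing through $\mu_0$ with endpoints at these two facets as a Bayes Plausible, affinely independent information structure. (Footnote: If $\mu_0$ is an element of a single facet of $R_a$, then the argument still applies where the two points are on the same facet.) Therefore $\max_{F \in \mathbb{F}} \big(\overline{V}(F, \mu_0, k)\big)$ also exists. Now we will show that the maximum of the original problem also exists by showing
\[
\max_{F \in \mathbb{F}} \big(V(F, \mu_0, k)\big) = \max_{F \in \mathbb{F}} \big(\overline{V}(F, \mu_0, k)\big).
\]
Denote the maximizing collection of facets for the second problem as $F^* \in \mathbb{F}$ and let $\tau^* \in \overline{\zeta_{F^*}}$ be the corresponding information structure for the maximization problem $\max_{\tau \in \overline{\zeta_{F^*}}} \mathbb{E}_\tau\hat{u}^S_a|_{\overline{\zeta_{F^*}}}$.

If $\tau^* \in \zeta_{F^*}$, we are done and $\max_{F \in \mathbb{F}} \big(V(F, \mu_0, k)\big)$ also exists. For the other case, suppose $\tau^* \in \overline{\zeta_{F^*}} \setminus \zeta_{F^*}$.

From the argument in the first part of this proof, we know
\[
\overline{\zeta_F} = \overline{I_F \cap \Sigma \cap A} \subseteq \overline{I_F} \cap \overline{\Sigma} \cap \overline{A}.
\]
But we also have $I_F = \overline{I_F}$ and $\Sigma = \overline{\Sigma}$. Hence $\overline{\zeta_F} \subseteq I_F \cap \Sigma \cap \overline{A}$. But then
\[
\overline{\zeta_F} \setminus \zeta_F \subseteq \big(I_F \cap \Sigma \cap \overline{A}\big) \setminus \big(I_F \cap \Sigma \cap A\big) = I_F \cap \Sigma \cap (\overline{A} \setminus A).
\]
So we must have $\tau^* \in I_F \cap \Sigma \cap (\overline{A} \setminus A)$. However, if $\tau^* \in (\overline{A} \setminus A)$, it is not affinely independent. Then, by Theorem 4, we can always find some $F' \in \mathbb{F}$ and $\tau' \in \zeta_{F'}$ such that $\mathbb{E}_{\tau'}\hat{u}^S_a|_{\zeta_{F'}} \ge \mathbb{E}_{\tau^*}\hat{u}^S_a|_{\overline{\zeta_{F^*}}}$, so the maximal value is also attained inside the original feasible set. Therefore the case $\tau^* \in \overline{\zeta_{F^*}} \setminus \zeta_{F^*}$ poses no obstacle, and the maximum of the original problem will always exist. This completes the proof of Theorem 1.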
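The last step of this argument relies on replacing an affinely dependent information structure with one supported on fewer posteriors, using the reweighting from the proof of Lemma 4. The following is a small illustrative sketch of that reweighting, under the assumption that the dependence coefficients $\lambda$ are already known; the function names `drop_dependent` and `induced_prior` are hypothetical, and beliefs over two states are represented by a single probability for brevity.

```python
from fractions import Fraction


def drop_dependent(tau, lam):
    """Return the two candidate weight vectors obtained by dropping one posterior.

    tau: positive Bayes-plausible weights summing to 1.
    lam: affine dependence coefficients (sum(lam) == 0, sum(lam_i * mu_i) == 0).
    The dropped posterior is marked with weight None.
    """
    negatives = [i for i, l in enumerate(lam) if l < 0]
    positives = [i for i, l in enumerate(lam) if l > 0]
    # j*: maximise tau/lambda over negative lambdas; p*: minimise it over positive ones.
    j_star = max(negatives, key=lambda i: tau[i] / lam[i])
    p_star = min(positives, key=lambda i: tau[i] / lam[i])

    def reweight(drop):
        ratio = tau[drop] / lam[drop]
        return [None if i == drop else tau[i] - lam[i] * ratio
                for i in range(len(tau))]

    return reweight(j_star), reweight(p_star)


def induced_prior(weights, posteriors):
    """Weighted average of the posteriors that are still in the support."""
    return sum(w * mu for w, mu in zip(weights, posteriors) if w is not None)
```

Both returned weight vectors are nonnegative, sum to one, and average back to the same prior, so either can replace the original structure; the proof of Lemma 4 shows that at least one of them does not lower the sender's expected utility.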
Proof of Theorem 2

Let $\tau$ be the optimal information structure solving the sender's maximization problem given in definition 4, and suppose for a contradiction that $\sup\{z \mid (\mu_0, z) \in CH_k(\hat{u}^S)\} \ne \mathbb{E}_\tau\hat{u}^S$.

For the first case, let $\sup\{z \mid (\mu_0, z) \in CH_k(\hat{u}^S)\} < \mathbb{E}_\tau\hat{u}^S$. However, taking the beliefs in $\operatorname{supp}(\tau) = \{\mu_1, \dots, \mu_k\}$, we know that by the feasibility of $\tau$, $\exists \{\tau(\mu_1), \dots, \tau(\mu_k)\}$ such that $\sum_{i \le k}\tau(\mu_i)\mu_i = \mu_0$ and $\sum_{i \le k}\tau(\mu_i) = 1$, $1 \ge \tau(\mu_i) \ge 0$. Thus, by definition 5, $(\mu_0, \mathbb{E}_\tau\hat{u}^S) \in CH_k(\hat{u}^S)$. Therefore, we cannot have $\sup\{z \mid (\mu_0, z) \in CH_k(\hat{u}^S)\} < \mathbb{E}_\tau\hat{u}^S$.

For the other case, let $\sup\{z \mid (\mu_0, z) \in CH_k(\hat{u}^S)\} > \mathbb{E}_\tau\hat{u}^S$. Since $(\mu_0, z) \in CH_k(\hat{u}^S)$, take the set of points $\{\hat{u}^S(\mu_1'), \dots, \hat{u}^S(\mu_k')\}$ and convex weights $\{\alpha_1, \dots, \alpha_k\}$ with $\sum_{i \le k}\alpha_i\mu_i' = \mu_0$ and $\sum_{i \le k}\alpha_i\hat{u}^S(\mu_i') = z$, also satisfying $\sum_{i \le k}\alpha_i = 1$, $1 \ge \alpha_i \ge 0$. We know these points and weights must exist by definition 5. Now observe that $\tau' = \{\mu_1', \dots, \mu_k'\}$ must be a feasible solution to the sender's maximization problem. $\{\mu_1', \dots, \mu_k'\}$ must be elements of some facets, because otherwise, by theorem 3, we can show the existence of another information structure with higher expected utility, contradicting the fact that $(\mu_0, z)$ is a supremum. It must also be the case that $\{\mu_1', \dots, \mu_k'\}$ are affinely independent, because otherwise, by theorem 4, we can contradict $(\mu_0, z)$ being a supremum again. We know that $\tau'$ satisfies Bayes Plausibility by the definition given above. Therefore $\tau' \in \zeta_F$ for some facet combination $F$, and it could have been picked instead of $\tau$ in the maximization problem, contradicting the optimality of $\tau$. This completes the proof of Theorem 2.

Proof of Theorem 3
Suppose $\tau_k$ is the optimal information structure with $k$ signals, and $\tau_{k-1}$ is the optimal information structure with $k-1$ signals. Denote by $V^*(k)$ and $V^*(k-1)$ the utilities obtained using these information structures.

Let $\operatorname{supp}(\tau_k) = \{\mu_1, \dots, \mu_k\}$. Observe that we can create an information structure with $k-1$ signals by taking two posteriors, $\mu_1$ and $\mu_2$, and defining a new posterior as their mixture:
\[
\mu_{1,2} = \frac{\tau_k(\mu_1)}{\tau_k(\mu_1) + \tau_k(\mu_2)}\mu_1 + \frac{\tau_k(\mu_2)}{\tau_k(\mu_1) + \tau_k(\mu_2)}\mu_2.
\]
Define the new information structure with $\operatorname{supp}(\tau') = \{\mu_{1,2}, \mu_3, \dots, \mu_k\}$, which maintains Bayes Plausibility with the new weights $\{(\tau_k(\mu_1) + \tau_k(\mu_2)), \tau_k(\mu_3), \dots, \tau_k(\mu_k)\}$.

Now, we can define $k$ different information structures containing $k-1$ posteriors each, built from the merged posteriors $\mu_{1,2}, \mu_{2,3}, \dots, \mu_{k-1,k}, \mu_{k,1}$, where we mix the consecutive posteriors $\mu_l, \mu_{l+1}$ and use the weights defined above to satisfy Bayes Plausibility. By the optimality of $\tau_{k-1}$ among the information structures with $k-1$ signals, we have:
\[
\begin{aligned}
V^*(k-1) &\ge (\tau_k(\mu_1) + \tau_k(\mu_2))\,u^S(\mu_{1,2}) + \tau_k(\mu_3)u^S(\mu_3) + \dots + \tau_k(\mu_k)u^S(\mu_k), \\
V^*(k-1) &\ge \tau_k(\mu_1)u^S(\mu_1) + (\tau_k(\mu_2) + \tau_k(\mu_3))\,u^S(\mu_{2,3}) + \dots + \tau_k(\mu_k)u^S(\mu_k), \\
&\ \ \vdots \\
V^*(k-1) &\ge \tau_k(\mu_1)u^S(\mu_1) + \dots + (\tau_k(\mu_{k-1}) + \tau_k(\mu_k))\,u^S(\mu_{k-1,k}), \\
V^*(k-1) &\ge \tau_k(\mu_2)u^S(\mu_2) + \tau_k(\mu_3)u^S(\mu_3) + \dots + (\tau_k(\mu_1) + \tau_k(\mu_k))\,u^S(\mu_{k,1}),
\end{aligned}
\]
where $\mu_{l,l+1} = \frac{\tau_k(\mu_l)}{\tau_k(\mu_l) + \tau_k(\mu_{l+1})}\mu_l + \frac{\tau_k(\mu_{l+1})}{\tau_k(\mu_l) + \tau_k(\mu_{l+1})}\mu_{l+1}$ and $\mu_{k,1}$ is the analogous mixture of $\mu_k$ and $\mu_1$. Each posterior $\mu_i$ appears unmerged in exactly $k-2$ of these $k$ inequalities. Dividing all inequalities by $k$ and summing up, we have
\[
V^*(k-1) \ge \frac{k-2}{k}V^*(k) + \frac{2}{k}\tilde{V} \ge \frac{k-2}{k}V^*(k),
\]
where the last inequality uses $\tilde{V} \ge 0$, and $\tilde{V}$ is the utility gained from the $k$ dimensional information structure consisting of the posteriors $\{\mu_{1,2}, \mu_{2,3}, \dots, \mu_{k-1,k}, \mu_{k,1}\}$.
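The merging step in this argument is easy to check numerically. The sketch below is an illustration only, with hypothetical function names `merge_consecutive` and `mean_belief`: it merges two consecutive posteriors (cyclically, so that the last is merged with the first) and lets one confirm that the resulting structure with one fewer posterior still averages to the prior.

```python
from fractions import Fraction


def merge_consecutive(posteriors, weights, l):
    """Merge posterior l with its cyclic successor; return (posteriors, weights)."""
    k = len(posteriors)
    a, b = l % k, (l + 1) % k
    w = weights[a] + weights[b]
    # The merged posterior is the weight-proportional mixture of the two.
    merged = tuple((weights[a] * x + weights[b] * y) / w
                   for x, y in zip(posteriors[a], posteriors[b]))
    keep = [i for i in range(k) if i not in (a, b)]
    return ([merged] + [posteriors[i] for i in keep],
            [w] + [weights[i] for i in keep])


def mean_belief(posteriors, weights):
    """Average belief induced by an information structure."""
    n = len(posteriors[0])
    return tuple(sum(w * mu[j] for mu, w in zip(posteriors, weights))
                 for j in range(n))
```

Applying `merge_consecutive` for every `l` in `range(k)` produces the $k$ candidate structures used in the inequalities above.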
This implies the following upper bound on the value of an additional signal at $k-1$ signals:
\[
V^*(k) - V^*(k-1) \le \frac{2}{k}V^*(k).
\]
Equivalently, the following relationship must hold between the maximum utilities attainable with $k$ and $k-1$ signals:
\[
\frac{k-2}{k}V^*(k) \le V^*(k-1) \le V^*(k).
\]

Proofs of the statements in section 4.1.1
Let p E, (cid:126)E q denote an Euclidean affine space with E being an affine space over the set ofreals such that the associated vector space is an Euclidian vector space. We will call E theEuclidean Space and (cid:126)E the space of its translations. For this example we will focus on three34imensional Euclidian affine space i.e. (cid:126)E has dimension 3. We equip (cid:126)E with Euclidean dotproduct as its inner product, inducing the Euclidian norm as a metric. To simplify notation,we will simply write p R , (cid:126) R q . Given this structure, we can define the unitary simplex inthe affine space R by the following set where ω i corresponds to the point with 1 in its i th coordinate and 0 in all of its other coordinates. We define the state space Ω “ t ω , ω , ω u .The simplex then becomes:∆ p Ω q “ " µ P R | µ “ λ ω ` λ ω ` λ ω such that ÿ i “ λ i “ ą λ i ą @ i P t , , u * Building on the problem definition in the main text, we focus on Bayesian Persuasion gameswhere the receiver preferences are described with thresholds, i.e. the receiver prefers action a i P t a , a , a u if and only if the posterior belief µ s P ∆ p Ω q such that µ s p ω i q ě ¯ π , and prefers a otherwise. Hence, we can say that for i P t , , u , j P t , , , u and j ‰ i we have E µ s r u R p a i , ω qs ě E µ s r u R p a j , ω qs if and only if µ s p ω i q ą ¯ π . Define δ “ p , ´ ¯ π, ´p ´ ¯ π qq , δ “ p ´ ¯ π, , ´p ´ ¯ π qq and δ “ p ´ ¯ π, ´p ´ ¯ π q , q and Γ “ p ¯ π, , ´ ¯ π q , Γ “ p , ¯ π, ´ ¯ π q and Γ “ p , ´ ¯ π, ¯ π q . The action zones will become: R i “ t µ s P ∆ p ω q| µ is ě ¯ π i u “ ∆ p ω q X tp µ ´ Γ i q ¨ δ i ě | µ P R u , where ¨ denotes the Euclidean dot product. Proof of lemma 6
Let us first characterize the set ∆ c . We have ∆ c “ ∆ p Ω qz co p R Y R Y R qq . We notethat: co p R Y R q “ co pt ω , p ¯ π, ´ ¯ π, q , p ¯ π, , ´ ¯ π q , ω , p ´ ¯ π, ¯ π, q , p , ¯ π, ´ ¯ π quq“ co t ω , p ¯ π, , ´ ¯ π q , ω , p , ¯ π, ´ ¯ π qu (2)and similarly for co p R Y R q and co p R Y R q we have thatco p R Y R q “ co t ω , p ¯ π, ´ ¯ π, q , ω , p , ´ ¯ π, ¯ π qu (3)co p R Y R q “ co t ω , p ´ ¯ π, , ¯ π q , ω , p ´ ¯ π, , ¯ π qu (4)The second line follows from the first line since the t ω , p ¯ π, , ´ ¯ π q , ω , p , ¯ π, ´ ¯ π qu corresponds to the extreme points of co pt ω , p ¯ π, ´ ¯ π, q , p ¯ π, , ´ ¯ π q , ω , p ´ ¯ π, ¯ π, q , p , ¯ π, ´ ¯ π quq . Similarly using equation (2), (3) and (4), co p R i Y R j q can be identified as the intersectionof a half space and the simplex i.e.co p R Y R q “ ∆ p Ω q X tp µ ´ p ¯ π, , ´ ¯ π qq ¨ p´ ¯ π, ¯ π, q ě | µ P R u (5)co p R Y R q “ ∆ p Ω q X tp µ ´ p ¯ π, ´ ¯ π, qq ¨ p´ ¯ π, , ¯ π q ě | µ P R u (6)co p R Y R q “ ∆ p Ω q X tp µ ´ p ´ ¯ π, ¯ π, qq ¨ p , ´ ¯ π, ¯ π q ě | µ P R u (7) co denotes convex hull operator and co k denotes k -convex hull i.e. co k p A q are the points that can berepresented as convex combination of k elements in A .
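As a numerical companion to this characterization, the sketch below checks membership in $\Delta_c$ for a belief over three states. It assumes the threshold zones $R_i = \{\mu : \mu(\omega_i) \ge \bar{\pi}\}$ from the setup, and it uses the observation that the convex hull of two such zones is the part of the simplex where the remaining coordinate is at most $1 - \bar{\pi}$. The function names are hypothetical.

```python
def in_pairwise_hull(mu, pi_bar):
    """True if mu lies in co(R_i ∪ R_j) for some pair of zones.

    Assumption: co(R_i ∪ R_j) is the set of beliefs whose third coordinate
    (the one not in {i, j}) is at most 1 - pi_bar.
    """
    return any(m <= 1 - pi_bar for m in mu)


def in_delta_c(mu, pi_bar):
    """True if no two-signal structure supported on two zones can represent mu."""
    return not in_pairwise_hull(mu, pi_bar)


def delta_c_nonempty(pi_bar):
    """All three coordinates can exceed 1 - pi_bar only if 3 * (1 - pi_bar) < 1."""
    return pi_bar > 2 / 3
```

Under these assumptions the barycenter of the simplex belongs to $\Delta_c$ whenever the set is nonempty, since it maximizes the smallest coordinate.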
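So we can define $\Delta_c \subset \Delta(\Omega)$ as $\Delta_c = \Delta(\Omega) \setminus \operatorname{co}_2(R_1 \cup R_2 \cup R_3)$. By (5), (6) and (7) we can see that $\Delta_c$ is defined as
\[
\Delta_c = \{\mu = (\mu_1, \mu_2, \mu_3) \in \Delta(\Omega) \mid \forall i \in \{1, 2, 3\},\ \mu_i > 1 - \bar{\pi}\}.
\]
By definition of $\Delta_c$ and $\Delta(\Omega)$, this set is non-empty if and only if $\bar{\pi} > \frac{2}{3}$.

Proof of lemma 7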
We can identify the upper bounds through the following problem: V p , µ q “ max i Pt , , u ˆ max µ P ∆ c ,µ i P R i ,µ P R ´ d p µ i , µ q d p µ , µ q ˙ subject to µ P co p µ i , µ q . First note that by the symmetry of the problem choice of i is not relevant. Withoutloss of generality we pick i “
1. Moreover, the constraint that µ P co p µ i , µ q impliesthat we are searching for a point with the goal of minimizing the distance with µ i andmaximizing the distance with µ . The maximizing triple is therefore p µ ˚ , µ ˚ , µ ˚ q with µ ˚ “p ´ ¯ π, ´ ¯ π, π ´ q , µ ˚ “ p ´ ¯ π , ´ ¯ π , ¯ π q µ ˚ “ p , , q . The solution follows from twoobservations. One is that given two points µ and µ i there is a unique line passing throughthese points hence µ is identified to be the furthest point on that line such that µ P R .The line always intersects with R as otherwise µ R ∆ c by construction. Then we choose µ and µ i to minimize d p µ , µ i q where d p µ , µ i q is measured in the space of translations of R .Given this solution, we have that: ||p ¯ π, ´ ¯ π , ´ ¯ π q ´ p π ´ q , ´ ¯ π, ´ ¯ π || “ ? p ´ ¯ π q||p ¯ π, ´ ¯ π , ´ ¯ π q ´ p , , qq|| “ ?
62 ¯ π Giving us that V p , µ q “ π ´ π . Similarly, we can solve: V p , µ q “ min i Pt , , u ˆ max µ i P R i ,µ P R ˆ min µ P ∆ c ´ d p µ i , µ q d p µ , µ q ˙˙ subject to µ P co p µ i , µ q . We observe that the point µ ˚ “ B “ p , , q is a solution. This follows from the fact that B is the barycenter of the simplex, and R , R and R are defined with the same threshold ¯ π .Thus, any prior µ ‰ B implies that the µ is closer to one of the action zones. Minimizingthe objective, we pick µ ˚ “ B . Now given this choice, we choose µ to maximize leading tothe choice of µ ˚ “ p , , q and µ ˚ “ p ´ ¯ π , ´ ¯ π , ¯ π q .Interestingly, the posteriors induced in the optimal information structure for the twoproblems are the same, but they are induced with different probabilities. This follows from36he fact that the hyperplanes defining the action zones is parallel to one of the hyperplanesdefining the simplex. So we can write V p , µ q “ π . Proof of corollary 2
Observe that with fixed ¯ π “ {
3, we have V p , µ q “ “ V p , µ q . Also, V p , µ q “ π ´ π is increasing in ¯ π and V p , µ q “ π is decreasing in ¯ π . By continuity of distance, the ob-jective function in the definition of V p , µ q and V p , µ q are continuous. So for any other µ P ∆ c , V p , µ q takes every value between V p , µ q and V p , µ q by intermediate value the-orem. By definition of value of precision, V p , µ q ą implies decreasing value of precisionand V p , µ q ă implies increasing value of precision.37 Additional Results and Details
B.1 Properties of ˆ u S and sender-preferred zones Definition 7. S a a Ă R a denotes the region where the sender preferred action a is takenin region R a . Formally S a a Ă R a is defined as S a a : “ t µ P ∆ p Ω q : µ P R a and a P ˆ A p µ q ˆ u S p a , µ q ě ˆ u S p ˜ a, µ q @ ˜ a P ˆ A p µ qu . Remark . Observe that by definition we have that @ a, a P A we have that S a a Ď S a a . Lemma 10. @ a, a P A S a a is closed and convex. Proof.
We can define S a a “ ´ X a ‰ a (cid:32) µ P R a : ř i ă ď Ω µ p ω q ` u S p a, ω q ´ u S p a , ω q ˘ ě ( a P A p µ q ¯ ,which is intersection of finitely many half-spaces and closed, convex set R a . (cid:4) Lemma 11. @ a, a P A , ˆ u S is an affine function over S a a . Proof.
For every posterior $\mu \in \Delta(\Omega)$ the receiver is indifferent between taking actions $a \in \hat{A}(\mu)$. For every $\mu \in S_a^{a'}$ the receiver takes action $a'$, by definition of sender-preferred equilibrium. Given a fixed action $a'$, $\hat{u}^S(a') = \mathbb{E}_\mu(u^S(a', \omega))$, which is affine over the simplex. $\blacksquare$

Corollary 3. $\forall a \in A$, $\hat{u}^S$ is a continuous function over $\operatorname{int}(R_a)$.

Remark. $\hat{u}^S$ has jump discontinuities only at $\mu \in \Delta(\Omega)$ such that $\mu \in R_a \cap R_{a'}$ with $R_a \cap R_{a'} = \operatorname{Bd}(R_a) \cap \operatorname{Bd}(R_{a'})$.

B.2 Properties of $\hat{u}^R$ and receiver's preferences for signal space cardinality

Lemma 12. In finite persuasion games, receiver utility in equilibrium $\max_{a \in A}\hat{u}^R(a, \mu)$ is convex over $\Delta(\Omega)$. In fact, it is a polyhedral convex function.

Proof. Observe that $\max_{a \in A}\hat{u}^R(a, \mu) = \max\{\mathbb{E}_\mu u^R(a', \omega) : a' \in A\}$. $\mathbb{E}_\mu u^R(a', \omega)$ denotes the expected utility for a fixed action $a' \in A$, which is an affine function over $\Delta(\Omega)$, and therefore convex. The pointwise maximum of finitely many affine functions has an epigraph equal to the intersection of finitely many closed half-spaces, so the epigraph of $\max_{a \in A}\hat{u}^R(a, \mu)$ is a polyhedral convex set. $\blacksquare$

An immediate implication is the following.
Corollary 4.
Let τ be the optimal information structure with k -signals and τ be the optimalinformation structure with with k ` signals. If τ and τ are Blackwell comparable we havethat receiver prefers τ over τ . The corollary follows from the definition of Blackwell comparability, and the fact thatthe receiver preferences must be convex. f is a polyhedral convex function if and only if its epigraph is polyhedral, as defined in Rockafellar(1970). .3 Formal preferences for example 4.2.1 (Optimal Advice Seek-ing) We say that the sender’s utility only depends on the action, and a and a are preferredover a and a , and the default action is the least preferred action, which we call a . Forthe parametric example drawn in figure 5, we set u s p a q “ u s p a q “ u s p a q “ u s p a q “ u s p a q “ ω , ω or ω are high enough, they prefer a , a , a respectively. The defaultaction is a , which is taken when the beliefs are ‘leaning towards’ ω , and there is anotheraction a , which is taken when the beliefs are ‘leaning away from’ ω but are not sufficientlyclose to ω or ω . Formally, for the example in the figure, we define receiver utility as follows: u r p ω , a q “ ´ , u r p ω , a q “ , u r p ω , a q “ u r p ω , a q “ { , u r p ω , a q “ { , u r p ω , a q “ { u r p ω , a q “ ´ { , u r p ω , a q “ { , u r p ω , a q “ ´ { u r p ω , a q “ ´ { , u r p ω , a q “ ´ { , u r p ω , a q “ { u r p ω , a q “ ´ { , u r p ω , a q “ { , u r p ω , a q “ { B.4 Simplicity in Persuasion
In the main text, we have shown that we can restrict attention to affinely independent structures while searching for the optimal information structure. The goal of this section is to clarify the connection between affine independence of information structures, preferences towards simplicity, and cognitive costs arising from complexity. We formalize cognitive costs by making the sender care not only about the payoffs of the persuasion game, but also about the complexity of the information structures implemented.

Our approach and definition of complexity is motivated by the seminal paper of Rubinstein (1986), who studies complexity of automata strategies in repeated games. We opt for a similarly simple formalization that defines the complexity of an information structure by the number of different posteriors induced, i.e. the cardinality of the support of $\tau \in \Delta(\Delta(\Omega))$. This can be thought of as having a mental cost for each posterior induced by a signaling strategy. We work on the limiting case of infinitesimal costs. Thus, the sender primarily cares about the payoff, and cares lexicographically, only secondarily, about the number of posteriors induced. Formally, we can define the preference relation $\succ$ of the sender by defining $\tau \succ \tau'$ if $(\mathbb{E}_\tau\hat{u}^s, -|\operatorname{supp}(\tau)|) >_L (\mathbb{E}_{\tau'}\hat{u}^s, -|\operatorname{supp}(\tau')|)$, where $>_L$ is the usual lexicographic order on $\mathbb{R}^2$.

This notion of complexity is fairly simple and intuitive, and captures some important considerations. The simplest way to motivate the cost of an additional signal is by assuming that generating higher dimensional signals is costly, and that committing to an information structure with more signals and more action recommendations implies that the sender should invest in more capacity to send each different signal that is sent with positive probability.

Given a standard persuasion game with no limitations on the signal space and a sender who has preferences for simplicity, we can extend the result of Theorem 2.
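The lexicographic preference defined above has a direct computational reading. As a small illustrative sketch (the function name `sender_prefers` and its arguments are hypothetical), tuple comparison in Python is lexicographic, so comparing the pairs of expected payoff and negated support size reproduces the order $>_L$:

```python
def sender_prefers(payoff_a, support_size_a, payoff_b, support_size_b):
    """True if structure A is strictly preferred to B under the lexicographic order.

    Payoff is compared first; ties are broken in favour of the smaller support.
    """
    return (payoff_a, -support_size_a) > (payoff_b, -support_size_b)
```

With equal payoffs the structure with fewer posteriors wins, while any strict payoff advantage dominates regardless of support size.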
We can now state affine independence as a necessary condition of optimality, and state that for every information structure $\tau$ whose support is not affinely independent there exists a strictly better information structure that is preferred by the sender. The result follows from the construction provided in the proof of Theorem 2. Existence of the optimal information structure is again established by Theorem 3.

These observations present an additional property of affinely independent information structures, as they also happen to be the simplest (in the sense of the lexicographic order defined above) possible information structures within the set of information structures that achieve the same utility level. Hence, our analysis of Bayesian Persuasion with coarse communication yields a general solution to Bayesian Persuasion games where the agents have preferences for simplicity.

The lexicographic preference order defined above is analogous to having infinitesimal costs for additional signals. In general, using our definition for the value of precision, the sender can decide whether it is worth incurring the cost of an additional signal when costs are non-trivial.

C Extension: Continuum of States
In this section, we extend our results to the case where the state of the world $\omega$ can take values in a continuum, i.e. $\Omega = [a, b]$. Without loss of generality, set $a = 0$, $b = 1$. Let $\tau$ be a signal or an information structure, and let the signal space be $S$ with cardinality $K$. The general setting is akin to Gentzkow and Kamenica (2016).

Suppose the action of the receiver only depends on the expected value of the state variable, $\mathbb{E}_\mu(\omega)$, where $\mu$ is a posterior belief (a probability distribution) over $\Omega$. Let $F$ be the CDF of the prior belief, with mean $m_0$. A signal realization $s \in S$ will induce a posterior belief with CDF $\mu_s$.

Footnote: $(x_1, x_2) >_L (y_1, y_2)$ if and only if $x_1 > y_1$, or $x_1 = y_1$ and $x_2 > y_2$. That is to say that $\tau \succ \tau'$ if and only if $\mathbb{E}_\tau\hat{u}^s > \mathbb{E}_{\tau'}\hat{u}^s$, or $\mathbb{E}_\tau\hat{u}^s = \mathbb{E}_{\tau'}\hat{u}^s$ and $|\operatorname{supp}(\tau)| < |\operatorname{supp}(\tau')|$.

A signal $\tau$ will induce at most $K$ different posterior CDFs, denoted $\{\mu_1, \dots, \mu_k\}$, with corresponding means $\{m_1, \dots, m_k\}$. Note that $\tau$ will now induce a probability distribution over posterior means. Denote the CDF of this distribution of posterior means by $G$.

We make the following assumptions. The set of actions $A$ has cardinality $M$, and there exist cutoffs $\gamma_1, \dots, \gamma_m$ such that when $\mathbb{E}_\mu(\omega) \in [\gamma_i, \gamma_{i+1}]$, the action $a_i$ is optimal for the receiver. Additionally, we assume that the sender's utility depends only on the receiver's action and that $u$ is an affine-closed function that satisfies the regularity conditions defined in Dworczak and Martini (2019). Further, assume that the prior CDF, $F$, is continuous and has full support over $\Omega$.
These assumptions ensure that the optimal signal creates a distribution of posterior means which is a monotone partitional signal.

A monotone partitional signal partitions the state space into at most $K$ contiguous intervals such that, for any interval $[x_i, x_{i+1}]$ of the partition, all the mass of $G$ is on $E(X \mid X \in [x_i, x_{i+1}])$.

Let $\underline{c}$ be the integral of the posterior mean function for the completely uninformative signal, which is equal to 0 below the prior mean and is a linear function with slope 1 above the prior mean. Similarly, let $\overline{c}$ be the integral of the posterior mean function for the fully revealing signal (which uses infinitely many signals). This signal reveals the state exactly, so $\overline{c}$ is equal to the integral of the prior.

It is shown by Gentzkow and Kamenica (2016) that the function $c$ for any form of signal must lie between $\underline{c}$ and $\overline{c}$. Note that both of these depend on the prior. Now note the following observation: the cardinality of the signal space, $K$, determines how many 'kinks' the function $c$ will have.

It is straightforward to observe that, with $k$ monotone partitional signals, we will have $k$ 'kinks' and a piecewise linear function $c$ with $k+1$ pieces. This follows from the fact that we are interested in the integral of $G$. Therefore the sender's problem reduces to choosing the location of these $k$ kinks and the slope of the function $c$ at each kink, subject to the constraint that $c$ lies between $\underline{c}$ and $\overline{c}$. Recall our assumption of the existence of action cutoffs $\gamma_1, \dots, \gamma_m$ such that when $E_\mu(\omega) \in [\gamma_i, \gamma_{i+1}]$, the action $a_i$ is optimal for the receiver. The relationship between $\gamma_1, \dots, \gamma_m$ and the signal partitions will not be obvious when $K < M$.

More precisely, let $c_G$ denote the integral of $G$, $c_G(x) = \int_0^x G(t)\,dt$. $c_G$ is a convex function, and we can analyze $c$ instead of analyzing signal distributions, as in Gentzkow and Kamenica (2016). This definition also makes our focus on piecewise linear functions clearer.
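The construction above can be checked numerically in a simple case. The following is a minimal sketch, assuming a uniform prior on $[0,1]$ and a hypothetical three-interval partition (neither is taken from the text). Under these assumptions the posterior means are the interval midpoints, the uninformative bound is $\max(0, x - 1/2)$, and the fully revealing bound is $x^2/2$. The function names are illustrative only.

```python
def partition_signal(cutpoints):
    """Posterior means and probabilities of a monotone partitional signal.

    Assumes a uniform prior on [0, 1] (an illustrative choice), so each
    interval's mass is its length and its posterior mean is its midpoint.
    """
    pairs = list(zip(cutpoints[:-1], cutpoints[1:]))
    probs = [hi - lo for lo, hi in pairs]
    means = [(lo + hi) / 2.0 for lo, hi in pairs]
    return means, probs


def c_G(x, means, probs):
    """c_G(x) = integral of G from 0 to x, with G the CDF of posterior means.

    For a discrete G this equals sum_i p_i * max(0, x - m_i).
    """
    return sum(p * max(0.0, x - m) for m, p in zip(means, probs))


def c_lower(x, prior_mean=0.5):
    """Uninformative signal: 0 below the prior mean, slope 1 above it."""
    return max(0.0, x - prior_mean)


def c_upper(x):
    """Fully revealing signal under the uniform prior: integral of F(t) = t."""
    return x * x / 2.0


cutpoints = [0.0, 0.25, 0.6, 1.0]  # hypothetical partition, K = 3 intervals
means, probs = partition_signal(cutpoints)

# Check the sandwich c_lower <= c_G <= c_upper on a grid, up to rounding.
grid = [i / 1000.0 for i in range(1001)]
tol = 1e-12
within_bounds = all(
    c_lower(x) - tol <= c_G(x, means, probs) <= c_upper(x) + tol for x in grid
)

# The slope of c_G on each linear piece is the cumulative mass of means.
slopes = [0.0]
for p in probs:
    slopes.append(slopes[-1] + p)

num_kinks = len(means)    # one kink at each posterior mean
num_pieces = len(slopes)  # K + 1 linear pieces

print(within_bounds, num_kinks, num_pieces)
```

In this example $c_G$ touches the upper bound exactly at the partition cutpoints, which is one way to see why choosing kink locations and slopes is equivalent to choosing the partition.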
Gentzkow and Kamenica (2016) show that each function between $\underline{c}$ and $\overline{c}$ can be represented by a signaling policy and vice versa. For tractability, we solve the problem by choosing a function between $\underline{c}$ and $\overline{c}$ instead of finding signaling policies. Let $\gamma_1, \dots, \gamma_m$ be the action cutoffs, and let $c(x)$ be the chosen function, with $\underline{c} \le c \le \overline{c}$. Action 1 is taken when $c'_G(x) < \gamma_1$; let $U_1$ be the sender's utility when action 1 is taken. Action 2 is taken when $\gamma_1 \le c'_G(x) < \gamma_2$; let $U_2$ be the sender's utility when action 2 is taken, and so forth. The sender's utility is then

$$U(c) = \sum_{k=1}^{m} \big( c'(\gamma_k) - c'(\gamma_{k-1}) \big) U_k,$$

with the convention that $\gamma_0 = c'(\gamma_0) = 0$, where $c'$ denotes the left derivative of $c$.

We define the set of candidate functions $c$ as follows. $F_k$ is the set of all $f \in C[0,1]$ for which there exist a partitioning of $[0,1]$ into $k$ intervals,

$$\{s_l\}_{l=1}^{k} = \{(0, x_1], (x_1, x_2], \dots, (x_{k-2}, x_{k-1}], (x_{k-1}, 1]\},$$

and slopes $\{\varphi_l \in \mathbb{R}\}_{l=1}^{k}$ such that $k < K$, $\exists M \in \mathbb{N}$ with $0 \le \varphi_l < M$ and $\varphi_l \le \varphi_{l+1}$ for all $l \in \{1, \dots, k\}$, each $s_l$ is connected and has non-zero measure, and $f$ can be written as

$$f(x) = \mathbb{1}_{x \in s_1} (\varphi_1 x) + \sum_{l=2}^{k} \mathbb{1}_{x \in s_l} \Big( \varphi_l x - \sum_{j=2}^{l} (\varphi_j - \varphi_{j-1})\, x_{j-1} \Big).$$

Given these definitions and the signal space of focus, we establish existence of an optimal information structure for the sender.
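As an illustration of the sender's objective, the sketch below evaluates the utility for a discrete distribution of posterior means. It reads the derivative of $c$ at a cutoff as a left derivative, that is, as the mass of posterior means strictly below that cutoff, and it assumes the last cutoff lies above every posterior mean. The cutoffs, utilities, and uniform prior are hypothetical values chosen for the example, with $K = 2$ signals and $M = 3$ actions.

```python
def mass_below(x, means, probs):
    """Left derivative of c_G at x: mass of posterior means strictly below x."""
    return sum(p for m, p in zip(means, probs) if m < x)


def sender_utility(means, probs, cutoffs, utilities):
    """Sum over actions k of (c'(gamma_k) - c'(gamma_{k-1})) * U_k.

    `cutoffs` lists gamma_1 <= ... <= gamma_m, and the starting value
    c'(gamma_0) = 0 is used as the convention. The last cutoff is assumed
    to lie above every posterior mean so that all mass is counted.
    """
    total, previous = 0.0, 0.0
    for gamma_k, u_k in zip(cutoffs, utilities):
        current = mass_below(gamma_k, means, probs)
        total += (current - previous) * u_k
        previous = current
    return total


# Hypothetical primitives: uniform prior on [0, 1], M = 3 actions, K = 2 signals.
cutoffs = [0.3, 0.7, 1.0]
utilities = [0.0, 1.0, 3.0]

# Two-interval monotone partition [0, 0.5], (0.5, 1]: midpoints are the means.
coarse = sender_utility([0.25, 0.75], [0.5, 0.5], cutoffs, utilities)

# Uninformative signal: a single posterior mean equal to the prior mean.
uninformative = sender_utility([0.5], [1.0], cutoffs, utilities)

print(coarse, uninformative)
```

With these hypothetical numbers the two-signal partition skips the middle action entirely, which gives a small concrete instance of how cutoffs and partition cells can fail to line up when $K < M$.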
Theorem 4. $U(c_G)$ attains its maximum over $F_k$.

Proof.
The proof proceeds by a series of lemmas:
Lemma 13. $F_k$ is pre-compact.

Proof. By the Arzelà-Ascoli theorem, to prove pre-compactness it suffices to show equicontinuity and equiboundedness. Note that the way $F_k$ is defined ensures that its elements are Lipschitz continuous. Equiboundedness then follows trivially. For equicontinuity, pick $M \in \mathbb{N}$ that is the largest Lipschitz constant for the set of functions in $F_k$; a set of functions with a bounded Lipschitz constant forms an equicontinuous set. ∎

Lemma 14. $F_k$ is closed.

Proof.
Suppose there exists a sequence of functions where $\forall n \in \mathbb{N}$, $f_n \in F_k$ and $f_n \to f$ uniformly. We will show that $f \in F_k$.

First, observe that all $f_n$ are Lipschitz continuous, and therefore $f$ must be Lipschitz continuous, in addition to being convex. Therefore $f$ is differentiable almost everywhere. Let the set $D \subset [0,1]$ represent the set of points where $f$ is differentiable.

Since $f_n \to f$ uniformly and $f_n, f$ are convex, we have that $\forall x \in D$, $f'_n(x) \to f'(x)$. We proceed by proving the following claim.

Claim. $(0,1) \setminus D$ can have at most cardinality $K$.

Suppose not. Pick $K+1$ points from $(0,1) \setminus D$ and call this set $X$. By Claim 2 (proved in the proof of Lemma 15 below), $\forall x \in X$ we can find $h(x) > 0$ such that $\forall h \in [0, h(x))$ there exists some $N_{h(x)} \in \mathbb{N}$ such that $\forall n > N_{h(x)}$, $f_n(x)$ and $f_n(x - h)$ are on the same linear piece. Similarly, we can also find $q(x) > 0$ such that $\forall q \in [0, q(x))$ there exists some $N_{q(x)} \in \mathbb{N}$ such that $\forall n > N_{q(x)}$, $f_n(x)$ and $f_n(x + q)$ are on the same linear piece. Since there are $K+1$ points in $X$, we can define $q^* = \min_{x \in X} q(x)$, $h^* = \min_{x \in X} h(x)$, and $N^* = \max_{x \in X} \max(N_{h(x)}, N_{q(x)})$.

Since $f$ is differentiable almost everywhere, for every $x \in X \subseteq (0,1) \setminus D$ there must exist $\epsilon_1(x) > 0$ such that $f$ is differentiable in the interval $(x - \epsilon_1(x), x)$, and also $\epsilon_2(x) > 0$ such that $f$ is differentiable in the interval $(x, x + \epsilon_2(x))$. Let $\epsilon^* = \min_{x \in X} \min(\epsilon_1(x), \epsilon_2(x))$.

Define $\epsilon = \min(h^*, q^*, \epsilon^*)$. Now, $\forall x \in X$ and $\forall n > N^*$, we have that $f'_n = c_1(x)$ within the interval $(x - \epsilon, x)$ and $f'_n = c_2(x)$ within the interval $(x, x + \epsilon)$, for some constants $c_1(x), c_2(x)$. The intervals $(x - \epsilon, x)$ and $(x, x + \epsilon)$ are contained in the set $D$ for every value of $x$, by definition.
By the fact that within the set $D$, $f'_n \to f'$, we must have $f' = c_1(x)$ within $(x - \epsilon, x)$ and $f' = c_2(x)$ within $(x, x + \epsilon)$.

Since $f$ is continuous and convex, and $\forall x \in X$, $f'(x)$ does not exist, we must have that $\forall x \in X$, $c_1(x) < c_2(x)$. However, this implies that $\forall n > N^*$, $f'_n$ also takes at least $K+2$ distinct values, which contradicts $f_n \in F_k$, i.e., $f_n$ cannot be $K$-piecewise linear. This completes the proof that $(0,1) \setminus D$ can have at most cardinality $K$.

Without loss of generality, suppose the set has cardinality $K$. The case where the cardinality is less than $K$ is analogous. Let us order the elements of $(0,1) \setminus D$ as $0 < x_1 < x_2 < \cdots < x_K < 1$. Take the collection of intervals whose union is $[0,1]$ as $\{s_l\}_{l=1}^{K} = \{[0, x_1], (x_1, x_2], \dots, (x_K, 1]\}$. Within the interior of each interval, $f$ is differentiable, hence we must have $f'_n \to f'$. Observe that $f'$ can take at most $K+1$ distinct values, since otherwise $f'_n \to f'$ with $f_n \in F_k$ cannot hold. Moreover, $f'$ must be constant within the interior of each interval, since otherwise the cardinality of $(0,1) \setminus D$ would exceed $K$.

Therefore, we can write $\forall l < K$: $\forall x \in \mathrm{int}(s_l)$, $\varphi_l = f'(x)$, and hence $f(x) = \varphi_l x + c_l$ for some $c_l$. Moreover, since $\forall n \in \mathbb{N}$, $f_n$ and $f$ are continuous, $\forall x \in (0,1) \setminus D$ we must have:

$$\lim_{\epsilon \to 0} f(x + \epsilon) = \lim_{\epsilon \to 0} f(x - \epsilon) \iff \lim_{\epsilon \to 0} \varphi_{l+1}(x + \epsilon) + c_{l+1} = \lim_{\epsilon \to 0} \varphi_l (x - \epsilon) + c_l .$$

Therefore, to preserve continuity we must have $c_{l+1} - c_l = -(\varphi_{l+1} - \varphi_l)\, x$. Also, observe that within the first interval $[0, x_1]$ we have $f'_n \to f' = \varphi_1$ and $f_n(x) = \varphi_{1,n}\, x \to f(x) = \varphi_1 x + c_1$. It follows that we must have $c_1 = 0$ and, for $l \ge 2$, $c_l = -\sum_{i=2}^{l} (\varphi_i - \varphi_{i-1})\, x_{i-1}$. Therefore, $f$ must have the desired form and $f \in F_k$. This completes the proof that $F_k$ is closed. ∎

Corollary 5. $F_k$ is compact.

Proof.
Follows from the two lemmas above and the definition of a pre-compact set. ∎
Lemma 15. $U(c)$ is continuous over $F_k$.

Proof.
Let $f_n \in C$ be a sequence of convex functions such that $f_n \to f$ uniformly. This implies $d(f_n, f) = \sup\{|f_n(x) - f(x)| : x \in [0,1]\} \to 0$ as $n \to \infty$. We need to show $U(f_n) \to U(f)$. Since $U$ depends only on the left derivatives at the fixed and exogenous points $\gamma_1, \dots, \gamma_k$, it suffices to show that these left derivatives converge; then we will have $U(f_n) \to U(f)$. Uniform convergence implies pointwise convergence, therefore $f$ is convex.

Since $f$ is convex, left and right derivatives exist at every point. For any $\gamma$ value and for any $\epsilon > 0$, we need to show $\exists N \in \mathbb{N}$ such that $\forall n > N$, $|f'_n(\gamma) - f'(\gamma)| < \epsilon$, where we write the left derivative at $\gamma$ as:

$$f'(\gamma) = \lim_{h \to 0^-} \frac{f(\gamma + h) - f(\gamma)}{h}.$$

We proceed by proving two useful claims.
Claim 1. $\exists h_1 > 0$ such that $\forall\, 0 \le h < h_1$, $f(\gamma - h)$ and $f(\gamma)$ are on the same linear piece, meaning that $f(\gamma - h) = \beta(\gamma - h)$ and $f(\gamma) = \beta\gamma$ for some $\beta > 0$. This implies $f'(\gamma - h) = f'(\gamma)$, $\forall\, 0 \le h < h_1$.

Proof. Follows from the fact that, in our definition, each interval of the piecewise linear function is connected and has strictly positive measure. ∎
Claim 2. $\exists h_2 > 0$ that satisfies the following: $\forall h \in [0, h_2)$, there exists some $N_h \in \mathbb{N}$ for which it holds that $\forall n > N_h$, $f_n(\gamma - h)$ and $f_n(\gamma)$ are on the same linear piece.

Proof.
Suppose not. Then for any given $h_2 > 0$ and for all $h \in [0, h_2)$, there exists no such $N_h$; that is, $\forall n \in \mathbb{N}$, $f_n(\gamma - h)$ and $f_n(\gamma)$ are not on the same linear piece. This implies that for any $0 \le h < h_2$ and any $n$, there must be some $\beta_n, \theta_n$ with $f_n(\gamma - h) = \beta_n(\gamma - h)$ and $f_n(\gamma) = \theta_n \gamma$, where $\beta_n < \theta_n$ by convexity. Thus, $|f_n(\gamma) - f_n(\gamma - h)| = |(\theta_n - \beta_n)\gamma + \beta_n h|$.

However, each $f_n$ is also continuous, by convexity. This implies that, at the point $\gamma$: $\forall \epsilon > 0$ $\exists \delta > 0$ such that if $|x - \gamma| < \delta$, then $|f_n(x) - f_n(\gamma)| < \epsilon$.

For any $f_n$, choose $\epsilon = (\theta_n - \beta_n)\gamma$. Then there exists some $\delta$ such that $|x - \gamma| < \delta$ implies $|f_n(x) - f_n(\gamma)| < (\theta_n - \beta_n)\gamma$. But then we can choose $h$ such that $h < h_2$ and $h < \delta$. This means that we have

$$|f_n(\gamma) - f_n(\gamma - h)| = |(\theta_n - \beta_n)\gamma + \beta_n h| = (\theta_n - \beta_n)\gamma + \beta_n h$$

from the first argument, and

$$|f_n(\gamma) - f_n(\gamma - h)| < (\theta_n - \beta_n)\gamma$$

from the second argument. Therefore we have reached a contradiction. This completes the proof of Claim 2. ∎

Proceeding with the proof of Lemma 15, we have that uniform convergence implies pointwise convergence, therefore $f$ is convex. Since $f$ is convex, left and right derivatives exist at every point. For any $\gamma$ value and for any $\epsilon > 0$, we need to show $\exists N \in \mathbb{N}$ such that $\forall n > N$, $|f'_n(\gamma) - f'(\gamma)| < \epsilon$, where we write the left derivative at $\gamma$ as:

$$f'(\gamma) = \lim_{h \to 0^+} \frac{f(\gamma - h) - f(\gamma)}{-h}.$$

Suppose an $\epsilon > 0$ is given. Pick $h$ with $0 < h < \min\{h_1, h_2\}$. We have that:

$$f'(\gamma) - \epsilon < \frac{f(\gamma - h) - f(\gamma)}{-h} = f'(\gamma) = f'(\gamma - h) < f'(\gamma) + \epsilon.$$

For the picked number $h$, by Claim 2, let $N_h$ be the number such that $\forall n > N_h$, $f_n(\gamma - h)$ and $f_n(\gamma)$ are on the same linear piece.

Since $f_n$ converges to $f$, there exists $N_c \in \mathbb{N}$ such that $\forall n > N_c$:

$$f'(\gamma) - \epsilon < \frac{f_n(\gamma - h) - f_n(\gamma)}{-h} < f'(\gamma) + \epsilon.$$

Let $N > \max\{N_h, N_c\}$. Then, $\forall n > N$, the convergence result holds and $f_n(\gamma - h)$ and $f_n(\gamma)$ are on the same linear piece. The following argument holds for all $n > N$. Since $f_n(\gamma - h)$ and $f_n(\gamma)$ are on the same linear piece, the left derivatives are the same at these two points and

$$f'_n(\gamma) = \frac{f_n(\gamma - h) - f_n(\gamma)}{-h}.$$

By direct substitution into the inequality above:

$$f'(\gamma) - \epsilon < f'_n(\gamma) < f'(\gamma) + \epsilon \iff -\epsilon < f'_n(\gamma) - f'(\gamma) < \epsilon \iff |f'_n(\gamma) - f'(\gamma)| < \epsilon.$$

Therefore the left derivatives converge and $U(f_n) \to U(f)$, which completes the proof that $U(c)$ is continuous over $F_k$. ∎

With all the lemmas, the proof of Theorem 4 follows immediately by the topological extreme value theorem. We have proved the existence of an optimal monotone partitional information structure. ∎

[Footnote on the topological extreme value theorem: Let $(S, d_S)$ and $(\mathbb{R}, d)$ be metric spaces, where $d$ is the usual Euclidean metric defined for all $x, y \in \mathbb{R}$ by $d(x, y) = |x - y|$. Also let $X \subseteq S$ be a compact subset of $S$ and let $f : S \to \mathbb{R}$ be continuous on all of $X$. Then $f(X)$ is closed and bounded in $\mathbb{R}$ and $f$ achieves its supremum and infimum on $X$; that is, there exist $p, q \in X$ such that $f(p) = \sup\{f(x) : x \in X\}$ and $f(q) = \inf\{f(x) : x \in X\}$.]

References

Alfsen, E. M. (1965): “On the geometry of Choquet simplexes,”
Mathematica Scandinavica, 15, 97–110.
Aumann, R. J. and M. Maschler (1995): Repeated Games with Incomplete Information, MIT Press.
Bergemann, D. and S. Morris (2016): “Information design, Bayesian persuasion, and Bayes correlated equilibrium,” American Economic Review, 106, 586–91.
Bloedel, A. W. and I. R. Segal (2018): “Persuasion with Rational Inattention,” Available at SSRN 3164033.
Dughmi, S., D. Kempe, and R. Qiang (2016): “Persuasion with limited communication,” in Proceedings of the 2016 ACM Conference on Economics and Computation, ACM, 663–680.
Dworczak, P. and G. Martini (2019): “The simple economics of optimal persuasion,” Journal of Political Economy, 127, 000–000.
Gentzkow, M. and E. Kamenica (2014): “Costly persuasion,” American Economic Review, 104, 457–62.
Gentzkow, M. and E. Kamenica (2016): “A Rothschild-Stiglitz approach to Bayesian persuasion,” American Economic Review, 106, 597–601.
Ichihashi, S. (2019): “Limiting Sender’s information in Bayesian persuasion,” Games and Economic Behavior, 117, 276–288.
Jager, G., L. P. Metzger, and F. Riedel (2011): “Voronoi languages: Equilibria in cheap-talk games with high-dimensional types and few signals,” Games and Economic Behavior, 73, 517–537.
Kamenica, E. and M. Gentzkow (2011): “Bayesian Persuasion,” American Economic Review, 101, 2590–2615.
Le Treust, M. and T. Tomala (2019): “Persuasion with limited communication capacity,” Journal of Economic Theory, 104940.
Lipnowski, E. and L. Mathevet (2017): “Simplifying Bayesian Persuasion,” Tech. rep., mimeo.
Lipnowski, E. and L. Mathevet (2018): “Disclosure to a psychological audience,” American Economic Journal: Microeconomics, 10, 67–93.
Rockafellar, R. T. (1970): Convex Analysis, vol. 28, Princeton University Press.
Rubinstein, A. (1986): “Finite automata play the repeated prisoner’s dilemma,” Journal of Economic Theory, 39, 83–96.
Tsakas, E. and N. Tsakas (2018): “Noisy persuasion,” Available at SSRN 2940681.
Volund, R. T. (2018): “Bayesian Persuasion on Compact Subsets,” Tech. rep., Aarhus University.
Warren, J. (1996): “Barycentric coordinates for convex polytopes,” Advances in Computational Mathematics, 6, 97–108.
Warren, J. (2003): “On the uniqueness of barycentric coordinates,” Contemporary Mathematics, 334, 93–100.
Warren, J., S. Schaefer, A. N. Hirani, and M. Desbrun (2007): “Barycentric coordinates for convex sets,” Advances in Computational Mathematics, 27, 319–338.
Wei, D. (2018): “Persuasion Under Costly Learning.”
Yaglom, I. M. and V. G. Boltyansky (1961):