Bayesian Persuasion under Ex Ante and Ex Post Constraints
Yakov Babichenko (Technion–Israel Institute of Technology, [email protected]) · Inbal Talgam-Cohen (Technion–Israel Institute of Technology, [email protected]) · Konstantin Zabarnyi (Technion–Israel Institute of Technology, [email protected])

December 6, 2020

This research has been supported by the Israel Science Foundation (grant No. 336/18). The first author's research has been partially supported by a U.S.-Israel Binational Science Foundation grant.
Abstract
Bayesian persuasion, as introduced by Kamenica and Gentzkow in 2011, is the study of information sharing policies among strategic agents. A prime example is signaling in online ad auctions: what information should a platform signal to an advertiser regarding a user when selling the opportunity to advertise to her? Practical considerations such as preventing discrimination, protecting privacy or acknowledging limited attention of the information receiver impose constraints on information sharing. Despite their importance in real-life applications, such constraints are not usually taken into account in the Bayesian persuasion literature. In this work, we propose and analyze a simple way to mathematically model such constraints as restrictions on Receiver's admissible posterior beliefs.

We consider two families of constraints – ex ante and ex post, where the latter limits each instance of Sender-Receiver communication, while the former more general family can also pose restrictions in expectation. For both families, we show existence of a simple optimal signaling scheme in the sense of a small number of signals; our bounds for signal numbers are tight. We then provide an additive bi-criteria FPTAS for an optimal constrained signaling scheme when the number of states of nature is constant; we improve the approximation to single-criteria under a Slater-like regularity condition. The FPTAS holds under standard assumptions, and more relaxed assumptions yield a PTAS. Finally, we bound the ratio between Sender's optimal utility under convex ex ante constraints and the corresponding ex post constraints. We demonstrate how this bound can be applied to find an approximately welfare-maximizing constrained signaling scheme in ad auctions.
1 Introduction

In many real-life situations, one entity relies on information revealed by another entity to decide which action to take. Call the former and the latter entities Receiver and Sender, respectively. Sender has the power to commit to a revelation policy, a.k.a. a signaling scheme. Sender would like to strategically design such a scheme to persuade Receiver to act in Sender's interest. Mathematically, a signaling scheme transforms Receiver's prior belief about how some unknown state of nature is distributed into a posterior belief, which determines Receiver's action.
We propose a theoretical model for constrained Bayesian persuasion under general families of constraints that we call ex ante and ex post constraints. Ex ante constraints are statistical limitations on the amount of information Receiver may learn when the Sender-Receiver communication is repeated over time; ex post constraints are a strong particular case restricting the information passage on every instance of the communication. These constraint families have various significant applications, including reducing discrimination and protecting individual privacy in online ad auctions.
Our results and paper organization.
Let m and k be the numbers of constraints and states of nature, respectively. Section 2 formally defines our model and describes the main motivations. Section 3 proves tight bounds on the support size of an optimal constrained signaling scheme: k + m for ex ante constraints and k for ex post; the latter bound is the same as in the original setting of Kamenica and Gentzkow. The support size of a signaling scheme is a common measure of its complexity, similar to menu-size complexity in auctions [18; 19]. Section 4 provides an additive bi-criteria FPTAS for an optimal signaling scheme when k is constant and improves it to single-criteria under a Slater-like regularity condition. This result holds for standard constraints – including Kullback–Leibler (KL) divergence, entropy and norm constraints (such as variation distance) – and standard objective functions: Lipschitz-continuous (corresponding to Receiver having a continuum of actions) or piecewise constant (for a finite Receiver action space). Although these objective and constraint families capture a wide range of scenarios, the same algorithm remains an additive bi-criteria PTAS – which improves to single-criteria under a Slater-like condition – for even more general families. Section 5 shows that for constant m, convex constraints and a wide family of objective functions, ex ante constraints outperform ex post constraints by a constant multiplicative factor. Subsection 5.1 concludes with applications to ad auctions with an exponentially large space of states of nature, using a generalization of the setting of Badanidiyuru et al. [4].

Technical challenges.
Ex ante constraints raise technical challenges not usually encountered in the literature on persuasion. In our model, we cannot restrict attention to straightforward policies [22], in which Sender recommends an action to Receiver in an incentive-compatible way. These policies are a very central tool in persuasion problems and are widely applied across the literature [see, e.g., 10], but they are not descriptive enough for determining whether a given ex ante constraint is satisfied. In particular, an optimal signaling scheme in our model cannot be described by a finite linear program (LP); rather, we need sophisticated tools from infinite-dimensional linear programming to prove our main existence result. Note that we do not assume Receiver's action space is finite, but even such a simplifying assumption would not have resolved these issues.
Related work. The seminal work of Kamenica and Gentzkow [22] introduces Bayesian persuasion and characterizes Sender's optimal signaling scheme using the concavification approach. Among the works on algorithmic aspects of persuasion, we mention a negative result of Dughmi and Xu [11], which is relevant to the hardness of approximating Sender's optimal utility; see [10] for a comprehensive survey of computational results.

In the context of auctions, an early work on signaling information is the classic paper of Milgrom and Weber [26]. Emek et al. [16] and Miltersen and Sheffet [27] apply a computational approach to signaling in auctions; Fu et al. [17] study signaling in the revenue-maximizing Myerson auction [29]; Badanidiyuru et al. [4] study it in the welfare-maximizing second-price auction with exponentially many states of nature (they also study a version called "bipartite signaling", which has a combinatorial flavor different than ours, in an auction setting with the strong assumption that bidder values are known); and Daskalakis et al. [9] design the signaling and auction mechanisms simultaneously.

The most closely related works to our own are the following: (a) Our algorithmic approach in Section 4 is related to that of Cheng et al. [8], as both use discretization and linear programming to achieve an additive FPTAS. (b) Dughmi et al. [12, 13] study constrained persuasion, but their constraints are on the complexity of the Sender-Receiver communication as measured by message length or number of signaled features, and so are different from ours. Ichihashi [20] considers persuasion by a Sender who is constrained in the information she can acquire (and therefore, send) and characterizes the set of possible equilibrium outcomes. Our Theorems 3.1 and 3.3 are related to this literature in that they indicate that constraints on persuasion do not cause a significant blow-up in the number of signals needed to persuade optimally. (c) Vølund [34] studies a model of persuasion on compact subsets, which is equivalent to our ex post constraints; there is no parallel in that work to ex ante constraints, and the results on ex post in the two works do not overlap.

In Subsection 2.3, we discuss motivating applications of ex ante and ex post constraints, including limited attention, as well as privacy protection in online ad auctions. Lipnowski et al. [24] and Bloedel and Segal [5] study persuasion with limited attention – see Subsection 2.3 for details. Eilat et al. [15] study ex ante and ex post privacy constraints in the design of auctions rather than persuasion schemes. Ichihashi [21] studies the economic implications of online consumer privacy; in his model, the consumer, rather than the seller, plays the role of Sender. It is important to note that the differential privacy paradigm [see 14] does not apply to privacy protection in online ad auctions: the state of nature about which information is revealed represents characteristics of an individual rather than statistics of a large population, and it is inherent to ad personalization that these characteristics influence the outcome in a non-negligible way.
2 Model

We consider Bayesian persuasion with a single Sender and a single Receiver, as introduced by Kamenica and Gentzkow [22]. Fix a space of k states of nature Ω and a commonly-known prior distribution p on them. Take some compact nonempty set A to be Receiver's action space. Introduce two random variables ω and x, representing the state of nature and Receiver's action, respectively. Fix a Sender's utility function ũ_s : A × Ω → R≥0 and a Receiver's utility function u_r : A × Ω → R≥0. The Sender-Receiver communication is specified by a signaling scheme Σ, a.k.a. a signaling policy, which is a randomized function from Ω to some set of signals (this notion will be formalized soon). Sender must commit to Σ before learning ω.

Denote by ∆(Ω) the set of probability distributions over Ω. Consider it to be a subset of [0, 1]^k, with the i-th coordinate being the probability assigned to the i-th element of Ω.

Let σ be the actual signal realization. Note that σ induces an updated distribution on Ω in Receiver's view, called the posterior distribution or the posterior. Let p_σ ∈ ∆(Ω) be the posterior induced by σ. The set of the posteriors induced by signal realizations of Σ with a positive probability (or density, in the non-discrete case) is called the support of Σ and is denoted supp(Σ).

Formally, Σ is a distribution, unconditional on the state of nature, over the elements of ∆(Ω) that belong to supp(Σ). For any ω₀ ∈ Ω, assuming ω = ω₀, Σ induces a conditional distribution over the elements of supp(Σ) that specifies how Sender chooses the signal realization when ω = ω₀. Denote this distribution by Σ(ω₀). Note that given the distributions p and Σ, it can be computed by Bayes' law.

For simplicity, we introduce the following notation for the expectation of a function of the posterior over the elements of supp(Σ) according to Σ:

Notation 2.1.
For a function f : ∆(Ω) → R:

E[Σ, f] := E_{p_σ∼Σ}[f(p_σ)] = E_{ω∼p, p_σ∼Σ(ω)}[f(p_σ)].

By [22], a distribution Σ represents a signaling scheme iff Σ is Bayes-plausible, that is:

∀ω ∈ Ω : p[ω] = E[Σ, p_σ[ω]].

The persuasion process runs as follows: (1) Sender commits to a signaling policy Σ. (2) Sender discovers the state of nature ω. (3) Sender transmits a signal realization σ to Receiver, according to Σ(ω). (4) Receiver chooses an action x ∈ A s.t. x ∈ argmax_{x′∈A}(E_{ω′∼p_σ}[u_r(x′, ω′)]); assume, as is standard, that ties are broken in Sender's favour. (5) Sender gets utility of ũ_s(x, ω), while Receiver gets utility of u_r(x, ω).

Since x depends only on p_σ, there exists ū_s : ∆(Ω) × Ω → R≥0 s.t. ũ_s(x, ω) ≡ ū_s(p_σ, ω). Define u_s : ∆(Ω) → R≥0 by u_s(p_σ) := E_{ω′∼p_σ}[ū_s(p_σ, ω′)].
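As a concrete illustration of these definitions, the following Python sketch (ours; all names are illustrative, not from the paper) represents a finite-support signaling scheme as a list of posteriors with weights, checks Bayes-plausibility, and recovers the conditional distributions Σ(ω) via Bayes' law.

import numpy as np

def is_bayes_plausible(prior, posteriors, weights, tol=1e-9):
    # A scheme is Bayes-plausible iff the weighted average of its posteriors is the prior.
    barycenter = sum(w * np.asarray(q, dtype=float) for w, q in zip(weights, posteriors))
    return np.allclose(barycenter, prior, atol=tol)

def conditional_signal_dist(prior, posteriors, weights):
    # Sigma(omega): Pr[signal | state omega] = weight(signal) * p_sigma[omega] / p[omega],
    # by Bayes' law. Rows index signals, columns index states.
    P = np.asarray(posteriors, dtype=float)
    w = np.asarray(weights, dtype=float)
    return (w[:, None] * P) / np.asarray(prior, dtype=float)[None, :]

# Example: k = 2 states, uniform prior, full revelation.
prior = np.array([0.5, 0.5])
posteriors = [[1.0, 0.0], [0.0, 1.0]]
weights = [0.5, 0.5]
assert is_bayes_plausible(prior, posteriors, weights)
print(conditional_signal_dist(prior, posteriors, weights))  # identity: each state reveals itself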
Remark 2.2. From now on we shall consider u_s instead of ũ_s or ū_s, thus assuming that Sender's utility is state-of-nature-independent. This is w.l.o.g. for our theorems from Sections 3-4, since the passage from ū_s to u_s preserves the conditions required there (being upper semi-continuous, continuous, piecewise constant or O(1)-Lipschitz); in the state-dependent setting, ū_s(·, ω) has to satisfy the theorem requirements imposed on u_s for every ω ∈ Ω. While one cannot apply the results of Section 5 to the state-dependent case without strengthening Assumption 5.2, the natural applications to ad auctions discussed there have state-independent Sender's utility.
Throughout we make the following assumption, which is a relaxation of the standard assumption in the persuasion literature that u_s is continuous. In particular, this assumption encompasses a u_s that is a threshold function.

Assumption 2.3.
The function u_s is upper semi-continuous.

So far we have described the setting of Kamenica and Gentzkow [22]. However, in our model we do not allow Sender to choose among all Bayes-plausible signaling schemes, but only among schemes that satisfy certain restrictions (see Subsection 2.3 for motivation). We define two general families of constraints: ex ante and ex post. A constraint of the latter type restricts the admissible values of a certain function of p_σ for every possible p_σ, while a constraint of the former type restricts only the expectation of such a function.

Definition 2.4 (Ex ante constraints). An ex ante constraint on a signaling scheme Σ is a constraint of the form E[Σ, f] ≤ c for continuous f : ∆(Ω) → R and a constant c ∈ R.

Definition 2.5 (Ex post constraints). An ex post constraint on a signaling scheme Σ is a constraint of the form ∀p_σ ∈ supp(Σ) : f(p_σ) ≤ c for continuous f : ∆(Ω) → R and a constant c ∈ R.

For a constraint defined as in either of the previous two definitions, we say that the constraint is specified by the function f and the constant c. A constraint specified by a convex f and some constant c is called convex.

Observation 2.6.
Ex post constraints are a special case of ex ante constraints.
Indeed, an ex post constraint specified by some f and c is equivalent to the ex ante constraint specified by max{f, c} and c. Note that if f is convex, then so is max{f, c}.

Every ex ante constraint can be transformed into a (stronger) ex post constraint by "erasing the expectation", and vice versa. Formally:
An ex post and an ex ante constraint correspond to each other if they are specified by the same function and the same constant.
Definition 2.8.
Given a set of constraints, a signaling scheme satisfying all of them is called valid . Definition 2.9.
A set of constraints is called trivial if every signaling scheme satisfies it.
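The difference between the two families is easy to state in code. The sketch below (ours, for illustration) checks an ex ante constraint and the corresponding ex post constraint for a finite-support scheme, using the KL divergence from the prior as the specifying function f; a rare but very informative signal typically passes the ex ante test and fails the ex post one.

import numpy as np

def kl_to_prior(posterior, prior):
    # f(p_sigma) = D_KL(p_sigma || p): how much more informative the posterior is than the prior.
    q, p = np.asarray(posterior, dtype=float), np.asarray(prior, dtype=float)
    mask = q > 0
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])))

def satisfies_ex_ante(f, c, posteriors, weights):
    # E[Sigma, f] <= c: binds only in expectation over signal realizations.
    return sum(w * f(q) for w, q in zip(weights, posteriors)) <= c

def satisfies_ex_post(f, c, posteriors, weights):
    # f(p_sigma) <= c for every posterior in supp(Sigma): binds on every instance.
    return all(f(q) <= c for w, q in zip(weights, posteriors) if w > 0)

prior = np.array([0.5, 0.5])
posteriors = [np.array([0.95, 0.05]), np.array([0.45, 0.55])]  # Bayes-plausible with these weights
weights = [0.1, 0.9]
f = lambda q: kl_to_prior(q, prior)
print(satisfies_ex_ante(f, 0.1, posteriors, weights))  # True: the rare informative signal is diluted
print(satisfies_ex_post(f, 0.1, posteriors, weights))  # False: that signal alone exceeds the cap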
2.3 Motivating Applications

In many applications of Kamenica and Gentzkow's model, Sender may not be able to reveal as much information as would theoretically be optimal due to imposed constraints. Such constraints can originate from sources including law, professional integrity, political agreements, public opinion and limited attention.
Online ad auctions.
In this first motivating example, the auctioneer – an advertising platform – is Sender, while the set of bidders – which are advertisers – is Receiver. (We treat the bidders as a single Receiver since they all get the same signal; private signaling poses additional challenges [3] and is left for future work.) The profile of the web user who is about to view the ad is the state of nature. This profile is known to the auctioneer, but not to the bidders; every signal reveals information about it. Such information revelation should be restricted by both privacy and fairness considerations.

The constraint families we introduce are suitable for protecting privacy: following Eilat et al. [15], privacy protection can be modeled as imposing a threshold on the KL divergence from the prior to the posterior. The KL divergence quantifies how much more informative the posterior is compared to the prior due to extra information about the user provided by the signal realization. On the one hand, an ex post constraint on the KL divergence provides a relatively robust protection of individual privacy by ruling out sending a very informative signal even with only a small probability. On the other hand, the corresponding ex ante constraint protects privacy on the group level – e.g., it limits Receiver's ability to learn the shopping habits of certain population groups, since the posterior is close, on average, to the prior.

Regarding fairness: an ex post constraint specified by the function (−min_{ω′∈Ω}{p_σ[ω′]}) and a negative constant lower bounds the frequency of a population group in the posterior, thus ensuring its proportional inclusion. (If the prior over population groups is not uniform, then we can easily add weights to this constraint: −min_{ω′∈Ω}{b_{ω′}·p_σ[ω′]}.) An ex ante constraint of this form ensures that on average, the advertiser does not get enough information to discriminate against particular groups.
Limited attention.
A second motivating example involves constraints arising from Receiver's limited attention span. As Simon noted, "a wealth of information creates a poverty of attention" [31]. Our model enables limiting the signaled information so that it "fits" within Receiver's limited attention. Following the rational inattention literature [32], define the attention required from Receiver to process Sender's signal σ as the entropy of the posterior p_σ. (Bloedel and Segal [5] use the mutual information of p_σ and Receiver's perception of it after paying limited attention as the measure of the attention invested by Receiver; in our model, Receiver always pays full attention, thus the mutual information coincides with the entropy of p_σ.) By constraining the entropy – either in expectation (i.e., ex ante) or of every posterior (i.e., ex post) – we enable Receiver to process the signal despite her limited attention (where the limit is either in expectation or per signal, respectively). A concrete application from Bloedel and Segal [5] includes a busy executive as Receiver, one of her advisors as Sender and constraints on the signaled information enforced by keeping meetings and briefings short (on average or per meeting). (An alternative model of [24; 5] allows Sender to "flood" Receiver with information, but Receiver strategically chooses what to pay attention to. Constrained persuasion might be viewed as a restriction that simply avoids flooding Receiver with information, either in expectation – the ex ante model – or always – the ex post model.)
3 Optimal Signaling Schemes with Small Support

In this section, we prove that for a set of m ex ante constraints, there exists an optimal valid signaling scheme with support size at most k + m. For a set of ex post constraints, we prove that the analogous bound is k, just as in the unconstrained setting of Kamenica and Gentzkow. We show that both bounds are tight.

Theorem 3.1 (Existence of an optimal valid signaling scheme under ex ante constraints with a linear-sized support). Fix m ex ante constraints. Then either there exists an optimal valid signaling scheme with support size at most k + m, or the set of valid signaling schemes is empty.

At a high level, we translate the problem into an infinite LP. We first prove that the target function of the infinite LP is upper semi-continuous. Secondly, we show, using infinite-dimensional optimization tools, that it must attain a maximum at an extreme point of the feasible set. Thirdly, we argue that every extreme point has a finite support of bounded size, analyzing the effect of adding the constraints one by one by considering hyperplanes specifying the constraints. Finally, we improve the bound on the support size of each extreme point using a finite LP.
Proof of Theorem 3.1.
Denote the function and the constant specifying the i-th ex ante constraint (1 ≤ i ≤ m) by f_i and c_i, respectively. We aim to solve:

max E[Σ, u_s]
s.t. p[ω] = E[Σ, p_σ[ω]]  ∀ω ∈ Ω
     E[Σ, f_i] ≤ c_i      ∀1 ≤ i ≤ m

This is an infinite LP. For the first step, consider a sequence {µ_n}_{n≥1} of feasible probability measures on ∆(Ω) that converges to some feasible probability measure µ. Since ∆(Ω) (equipped with the Euclidean metric) is separable, we get from a well-known result (e.g., Theorem 4.2 from [33], which states that for a separable metric space (X, d), convergence of measures in the Lévy–Prokhorov metric and weak convergence of measures are equivalent) that µ_n weakly converges to µ. u_s is upper semi-continuous and defined on a compact set, thus it is bounded from above. Therefore, we get from one of the equivalent definitions of weak convergence of measures:

lim sup_{n→∞} E[µ_n, u_s] ≤ E[µ, u_s].

Therefore, the target function in the infinite LP is upper semi-continuous with respect to the Lévy–Prokhorov metric on the space of the feasible probability measures and the usual metric on R≥0. This completes our first step.

The target function is upper semi-continuous and linear, and the feasible set of measures is compact and convex, thus Bauer's maximum principle (e.g., Theorem 7.69 from [1]) yields that an optimum is attained at an extreme point (unless the feasible set is empty and no valid signaling scheme exists), which completes our second step. It remains to show that every extreme point of the feasible set has support of size at most k + m.

A general approach adapted from [30; 28; 23] shows that every extreme point has a finite support with size at most 2^{k+m}. This is because every constraint in the infinite LP is defined by an appropriate hyperplane, and when we add the hyperplanes one by one, the maximal size of the support of extreme points is at most doubled upon each addition. This completes our third step.

Finally, let us discretize our LP by setting |supp(Σ)| ≤ 2^{k+m} and considering each of the infinitely many candidates for supp(Σ) separately. Every such candidate defines a finite LP with 2^{k+m} variables and k + m constraints (note that we should add a constraint ensuring that the probability masses in Σ sum up to 1, but we may remove one of the k Bayes-plausibility constraints, as it follows from the other constraints). Furthermore, every extreme point of the infinite LP is also an extreme point of a certain corresponding finite LP. Therefore, every extreme point is supported on at most k + m coordinates, which completes the proof.
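The finite LP from the final step of the proof is easy to set up explicitly. The following sketch is ours (it presumes scipy is available and a candidate support is given; all names are illustrative): it maximizes E[Σ, u_s] over weights on a fixed support, subject to Bayes-plausibility and the ex ante constraints. Basic feasible solutions of this LP have at most k + m nonzero weights, matching the bound of Theorem 3.1.

import numpy as np
from scipy.optimize import linprog

def best_scheme_on_support(prior, support, u_s, fs, cs):
    # Maximize E[Sigma, u_s] over weights on a fixed candidate support, subject to
    # Bayes-plausibility (equalities) and m ex ante constraints (inequalities).
    Q = np.asarray(support, dtype=float)      # rows: candidate posteriors over k states
    n, k = Q.shape
    obj = -np.array([u_s(q) for q in Q])      # linprog minimizes, so negate the objective
    # Keep normalization and drop one (redundant) Bayes-plausibility equation.
    A_eq = np.vstack([Q.T[:-1], np.ones(n)])
    b_eq = np.append(np.asarray(prior, dtype=float)[:-1], 1.0)
    A_ub = np.array([[f(q) for q in Q] for f in fs]) if fs else None
    b_ub = np.asarray(cs, dtype=float) if fs else None
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return (res.x, -res.fun) if res.success else (None, None)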
Proposition 3.2. The bound from Theorem 3.1 on the support size is tight for every k and m.

We provide a constructive proof of Proposition 3.2 in Appendix A. Now we establish a stronger bound for ex post constraints.
Theorem 3.3 (Existence of an optimal valid signaling scheme under ex post constraints with a linear-sized support). Fix a set of ex post constraints. Then either there exists an optimal valid signaling scheme with support size at most k, or the set of valid signaling schemes is empty.

The proof is similar to that of Theorem 3.1, but in the infinite LP we ignore the ex post constraints; rather, we replace in the proof ∆(Ω) by the compact subset of possible posteriors under the ex post constraints.

Proof of Theorem 3.3.
The given ex post constraints specify a set S ⊆ ∆(Ω) to which elements of the support of a valid signaling scheme are allowed to belong. This set is bounded since ∆(Ω) is bounded, and it is closed since the functions specifying the constraints are continuous; hence S is compact. (If the ex post constraints are convex, the result follows directly from the concavification approach of Kamenica and Gentzkow [22].) We have the following optimization problem:

max E[Σ, u_s]
s.t. p[ω] = E[Σ, p_σ[ω]]  ∀ω ∈ Ω
     supp(Σ) ⊆ S

This is an infinite LP, with the "variables" being the distribution Σ over elements of S (rather than of ∆(Ω), as in the proof of Theorem 3.1). The rest of the proof is exactly as for Theorem 3.1, with only the two following differences:
1. We consider the metric space (equipped with the Lévy–Prokhorov metric) of the feasible probability measures on S (rather than on ∆(Ω)); thus, in the rest of the proof ∆(Ω) is replaced everywhere with S.
2. We have only k linear constraints (instead of k + m), which improves the bound on the extreme point support size at our third step and – more importantly – at our fourth and final step.

Observation 3.4.
The bound from Theorem 3.3 is achieved, e.g., by u_s(p_σ) := ||p_σ||_∞ and a set of trivial ex post constraints.

4 Algorithmic Results

In this section, we provide positive computational results for a constant number of states of nature k. We focus on constant k since a hardness result of Dughmi and Xu [11] implies that unless P = NP, there is neither an additive PTAS nor a constant-factor multiplicative approximation of the optimal Sender's utility in poly(k)-time, even for piecewise constant u_s. (Their result is on public persuasion with multiple Receivers, which can be replaced by a single Receiver with a large action space.) Our results are for ex ante constraints; by Observation 2.6, they hold also for ex post constraints. Throughout this section, we assume that both u_s and the functions specifying the constraints are given by explicit formulae and can be evaluated at every point in constant time.

Call a function L-Lipschitz if its Lipschitz constant is at most L. Our first main result is an additive bi-criteria approximation (Theorem 4.5). Part 1 of Theorem 4.5 is an additive bi-criteria FPTAS for O(1)-Lipschitz or piecewise constant u_s and a natural constraint family that includes entropy, KL divergence and norms. This result encompasses the utility functions that naturally arise in applications of Bayesian persuasion: piecewise constant if Receiver has finitely many actions and O(1)-Lipschitz if Receiver has a continuum of actions [10]. Specifically, we show how to compute in poly(m, 1/ǫ)-time a signaling scheme achieving utility that is additively at most ǫ-far from optimal and violating each of the m ex ante constraints by at most ǫ; Bayes-plausibility is satisfied precisely. Part 2 of Theorem 4.5 is an additive bi-criteria PTAS, which holds under even weaker assumptions: u_s should be either continuous or piecewise constant, and there are no limitations on the ex ante constraints. The same approximation algorithm implies both parts of Theorem 4.5.

Our second main result (Theorem 4.7) is an improvement of the bi-criteria approximations from Theorem 4.5 to single-criteria; it requires imposing a Slater-like regularity condition on the ex ante constraints. To summarize: firstly, for continuous u_s, there exists an additive bi-criteria PTAS for an optimal signaling scheme; secondly, the same algorithm is an additive bi-criteria FPTAS when both u_s and the functions specifying the ex ante constraints are O(1)-Lipschitz; both results improve to single-criteria approximations under a Slater-like regularity condition.

Corollary 4.1 (of Theorems 4.5, 4.7). Suppose that k is constant, u_s is continuous and given are m ex ante constraints s.t. the set of valid signaling schemes is nonempty. Then for every ǫ > 0, there exists an algorithm that computes an additively ǫ-optimal signaling scheme that violates each ex ante constraint by at most ǫ, with running time of:
1. poly(m), provided that ǫ is constant.
2. poly(m, 1/ǫ), provided that both u_s and the functions specifying the ex ante constraints are O(1)-Lipschitz.
Furthermore, if there exists a signaling scheme satisfying each ex ante constraint with strict inequality, then the above algorithm can be improved so that each ex ante constraint is satisfied precisely.
Here we present an additive bi-criteria FPTAS (Theorem 4.5, part 1) for O(1)-Lipschitz or piecewise constant Sender's utility functions, under ex ante constraints specified by functions which may include entropy, KL divergence and any norm of p_σ − p (such as the well-known variation distance between probability measures). In particular, one can restrict D_KL(p′_σ || p′), where p′_σ and p′ are the distributions induced by p_σ and p (respectively) on some partition of Ω; that is, some elements of Ω are united when computing the KL divergence. Practically, it can be exploited in online ad auctions to limit the expected information disclosure on habits of a certain social group; such a group is represented by a subset of Ω.

Assumption 4.2 (u_s is O(1)-Lipschitz or piecewise constant – required for the additive bi-criteria FPTAS). u_s is either O(1)-Lipschitz or piecewise constant, with a constant number of pieces, s.t. each piece covers a convex polygon in ∆(Ω) with a constant number of vertices.

Assumption 4.3 (The ex ante constraints are specified by O(1)-Lipschitz functions, entropy or KL divergence – required for the additive bi-criteria FPTAS). Each ex ante constraint is specified either by an O(1)-Lipschitz function or by a function of the form:

b · Σ_{1≤j≤l} ( Σ_{ω′∈Ω_j} p_σ[ω′] ) · ln( ( Σ_{ω′∈Ω_j} p_σ[ω′] ) / b_j ),

where {Ω_j}_{1≤j≤l} is a partition of Ω and b, b_1, ..., b_l are constants (b_1, ..., b_l > 0).

We further show that under no assumptions on the ex ante constraints and under a weaker assumption on u_s – being continuous or piecewise constant – the same algorithm provides an additive bi-criteria PTAS (Theorem 4.5, part 2). Note that every norm on ∆(Ω) ⊆ R^k is O(1)-Lipschitz, which is sufficient to satisfy Assumption 4.3.

Assumption 4.4 (u_s is continuous or piecewise constant – relaxation of Assumption 4.2; required for the additive bi-criteria PTAS). u_s is either continuous or piecewise constant, with a constant number of pieces, s.t. each piece covers a convex polygon in ∆(Ω) with a constant number of vertices.

Theorem 4.5 (An additive bi-criteria FPTAS/PTAS for an optimal valid signaling scheme). Fix a constant k and fix m ex ante constraints s.t. the set of valid signaling schemes is nonempty.
1. Suppose that u_s satisfies Assumption 4.2 and the ex ante constraints satisfy Assumption 4.3. Then for every ǫ > 0, there exists a poly(m, 1/ǫ)-time algorithm that computes an additively ǫ-optimal signaling scheme that violates each ex ante constraint by at most ǫ.
2. Suppose that u_s satisfies Assumption 4.4. Then for every constant ǫ > 0, there exists a poly(m)-time algorithm that computes an additively ǫ-optimal signaling scheme that violates each ex ante constraint by at most ǫ.

So far, we have demonstrated how to find a near-optimal signaling scheme that satisfies the ex ante constraints after slightly relaxing them. The relaxation is required to avoid degenerate cases. For example, finding the root of a polynomial with a single real root can be described in the language of ex ante constraints. This problem has a unique feasible distribution, and if we do not relax the constraints, any algorithm missing the exact real root cannot give a satisfactory approximation. Theorem 4.5 can be improved under a regularity condition disallowing such degenerate cases.
Assumption 4.6 (Slater-like regularity condition). There exists a signaling scheme satisfying all the given ex ante constraints with strict inequality.
Theorem 4.7 (An additive FPTAS/PTAS for an optimal valid signaling scheme). Fix a constant k and fix m ex ante constraints satisfying Assumption 4.6.
1. Suppose that u_s satisfies Assumption 4.2 and the ex ante constraints satisfy Assumption 4.3. Then for every ǫ > 0, there exists a poly(m, 1/ǫ)-time algorithm that computes an additively ǫ-optimal valid signaling scheme.
2. Fix a constant ǫ > 0 and suppose that u_s satisfies Assumption 4.4. Then there exists a poly(m)-time algorithm that computes an additively ǫ-optimal valid signaling scheme.

In this subsection, we first formulate and prove Lemma 4.10 (together with two technical assumptions), which is the main step in the proofs of our results from Section 4. The first and second parts of Theorem 4.5 follow from this lemma, with t(1/ǫ) := 1/ǫ and t(1/ǫ) := 1, respectively; proof details are given in Appendix B. Then we strengthen Lemma 4.10 by adding the regularity Assumption 4.6 to get Lemma 4.11, and we prove the latter. Note that the proof of Theorem 4.7 is exactly as for Theorem 4.5, but it uses Lemma 4.11 rather than Lemma 4.10.

Assumption 4.8 (Parameterized by t(1/ǫ)). For every ǫ > 0 and every M = poly(t(poly(1/ǫ))), one can compute in poly(t(poly(1/ǫ)))-time an explicit formula for an upper semi-continuous piecewise constant u_{ǫ,M} : ∆(Ω) → R≥0 s.t.:
• Every piece of u_{ǫ,M} covers a region of ∆(Ω) which is a convex polygon with diameter at most ǫ/M.
• The total number of vertices of the above regions of ∆(Ω) is poly(t(poly(1/ǫ))).
• For every q ∈ ∆(Ω) we have: 0 ≤ u_{ǫ,M}(q) − u_s(q) ≤ ǫ.

Assumption 4.9 (Parameterized by t(1/ǫ)). For every 1 ≤ i ≤ m, the i-th ex ante constraint is specified by f_i : ∆(Ω) → R s.t. for every ǫ > 0, one can compute in poly(t(poly(1/ǫ)))-time an explicit formula for a poly(t(poly(1/ǫ)))-Lipschitz function g_i : ∆(Ω) → R s.t. for every q ∈ ∆(Ω):

0 ≤ f_i(q) − g_i(q) ≤ ǫ.

Lemma 4.10.
Suppose that k is constant, u_s satisfies Assumption 4.8 with t(1/ǫ) and we have m ex ante constraints satisfying Assumption 4.9 with t(1/ǫ). Then either the set of valid signaling schemes is empty, or for every ǫ > 0, there exists a poly(m, t(poly(1/ǫ)))-time algorithm that computes an additively ǫ-optimal signaling policy that violates each ex ante constraint by at most ǫ.

The proof of Lemma 4.10 first strengthens Assumption 4.9 and assumes that the constraints are specified by poly(t(poly(1/ǫ)))-Lipschitz functions. Then we restrict ourselves to a grid consisting of the vertices of the pieces of u_{ǫ,M}, where M is the maximal Lipschitz constant among the functions specifying the constraints, and output the resultant optimal valid signaling scheme for u_{ǫ,M} rather than u_s. Finally, we estimate the loss in Sender's utility and the constraint values using the approximability guarantees.

Proof of Lemma 4.10.
We strengthen Assumption 4.9 to the following: the i-th ex ante constraint (1 ≤ i ≤ m) is specified by a poly(t(poly(1/ǫ)))-Lipschitz function f_i : ∆(Ω) → R and some constant c_i. The original lemma follows from applying the lemma under the strengthened Assumption 4.9 with ǫ replaced by ǫ/2 and the f_i's replaced by the g_i's (computed for ǫ/2). This is because the original Assumption 4.9 ensures that upon replacing f_i with g_i, every valid signaling scheme remains valid and E[Σ, f_i] decreases by at most ǫ/2.

Now we prove the lemma under the strengthened Assumption 4.9. Suppose that a valid signaling scheme exists and let OPT be Sender's expected utility under an optimal valid scheme. Fix ǫ > 0 and let M be the maximal Lipschitz constant among the f_i's. Compute an explicit formula for u_{ǫ,M}. Let q_1, ..., q_n be the vertices of the regions of ∆(Ω) covered by the pieces of u_{ǫ,M}. Let us solve the following:

max E[Σ, u_{ǫ,M}]
s.t. p[ω] = E[Σ, p_σ[ω]]  ∀ω ∈ Ω
     supp(Σ) ⊆ {q_1, ..., q_n}
     E[Σ, f_i] ≤ c_i + ǫ   ∀1 ≤ i ≤ m

This problem defines a finite LP with n variables and k + m constraints (as in the proof of Theorem 3.1, we add a constraint for all the probability masses in Σ to sum up to 1 and remove one of the k Bayes-plausibility constraints); this LP can be solved in time poly(n, k + m) = poly(m, t(poly(1/ǫ))). We return its solution Σ as the desired signaling scheme.

By the design of our LP, Σ is Bayes-plausible and violates each ex ante constraint by at most ǫ. Take now a valid optimal signaling scheme Σ_OPT (for Sender's utility function u_s rather than u_{ǫ,M}). For every piece of u_{ǫ,M}, move all the probability weight in Σ_OPT from the region covered by this piece to the extreme points of that region in an expectation-preserving way (so Bayes-plausibility still holds) and denote the resultant signaling scheme by Σ′_OPT. Since the diameter of every such region is at most ǫ/M and the ex ante constraints have Lipschitz constants ≤ M, we get that each ex ante constraint is violated by at most (ǫ/M) · M = ǫ. Thus, Σ′_OPT is a feasible solution to our LP, so E[Σ′_OPT, u_{ǫ,M}] ≤ E[Σ, u_{ǫ,M}].

Since u_{ǫ,M} is upper semi-continuous and piecewise constant, we have: E[Σ_OPT, u_{ǫ,M}] ≤ E[Σ′_OPT, u_{ǫ,M}]. Furthermore, the third bullet from Assumption 4.8 yields: E[Σ, u_{ǫ,M}] − E[Σ, u_s] ≤ ǫ and E[Σ_OPT, u_s] ≤ E[Σ_OPT, u_{ǫ,M}]. Combining the last four inequalities implies: E[Σ, u_s] ≥ E[Σ_OPT, u_s] − ǫ = OPT − ǫ.
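A minimal sketch of the algorithm just described (ours, under the strengthened Assumption 4.9, assuming scipy): build a grid of mesh at most ǫ/M on ∆(Ω), relax each ex ante constraint by ǫ, and solve the resulting finite LP over grid-supported schemes.

import itertools
import numpy as np
from scipy.optimize import linprog

def simplex_grid(k, r):
    # All points of Delta(Omega) whose coordinates are multiples of 1/r (stars and bars);
    # these play the role of the vertices of the pieces of u_{eps,M}.
    points = []
    for bars in itertools.combinations(range(r + k - 1), k - 1):
        prev, parts = -1, []
        for b in bars:
            parts.append(b - prev - 1)
            prev = b
        parts.append(r + k - 2 - prev)
        points.append(np.array(parts, dtype=float) / r)
    return points

def fptas_scheme(prior, u_s, fs, cs, eps, M):
    # The LP from the proof: maximize E[Sigma, u_s] over grid-supported schemes, with each
    # ex ante constraint relaxed by eps; Bayes-plausibility holds exactly.
    k = len(prior)
    grid = simplex_grid(k, int(np.ceil(M / eps)))
    obj = -np.array([u_s(q) for q in grid])                        # linprog minimizes
    A_eq = np.vstack([np.array(grid).T[:-1], np.ones(len(grid))])  # Bayes + normalization
    b_eq = np.append(np.asarray(prior, dtype=float)[:-1], 1.0)
    A_ub = np.array([[f(q) for q in grid] for f in fs])
    b_ub = np.asarray(cs, dtype=float) + eps
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return {tuple(q): w for q, w in zip(grid, res.x) if w > 1e-12}, -res.fun

For constant k the grid has poly(M/ǫ) points, so the LP size matches the running time claimed in the lemma.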
Now we formulate and prove Lemma 4.11 – a strengthening of Lemma 4.10 used to prove Theorem 4.7.

Lemma 4.11 (Parameterized by t(1/ǫ)). Suppose that k is constant, u_s satisfies Assumption 4.8 with t(1/ǫ) and we have m ex ante constraints satisfying Assumption 4.9 with t(1/ǫ) and Assumption 4.6. Then for every ǫ > 0, there exists a poly(m, t(poly(1/ǫ)))-time algorithm computing an additively ǫ-optimal valid signaling policy.

The algorithm applies Lemma 4.10 to a persuasion problem with strengthened ex ante constraints. The analysis compares the output to a convex combination of two outputs of Lemma 4.10 – one might violate the ex ante constraints and the other satisfies them with strict inequality. We use the proof of Lemma 4.10 to bound the utility loss.

Proof of Lemma 4.11. u_s is upper semi-continuous and defined on a compact set, thus it is bounded from above by some constant C; assume w.l.o.g. that C > 2. Let OPT be Sender's optimal utility for a valid scheme. Restrict ourselves to small enough values of 0 < ǫ < C s.t. strengthening each ex ante constraint by ǫ leaves the set of valid signaling schemes nonempty (this is possible by Assumption 4.6; to be precise, we assume that an upper bound on such values of ǫ is known in advance). We return the signaling scheme Σ outputted by the algorithm from Lemma 4.10 on 0.1ǫ and the problem obtained by strengthening each ex ante constraint by 0.1ǫ. Then Σ satisfies the original constraints; it remains to bound its utility loss compared to OPT.

Let Σ′ be the output of Lemma 4.10 on 0.1ǫ and the original problem; denote by Σ″ the output of Lemma 4.10 on 0.1ǫ and the problem obtained by strengthening each original ex ante constraint by ǫ.

Let M be the maximal Lipschitz constant among the g_i's from Assumption 4.9. Then M is not affected by adding constant factors to the constraints; furthermore, note that by Assumption 4.8, u_{0.1ǫ²,M} can also serve as u_{0.1ǫ,M} (since 0.1ǫ² < 0.1ǫ and 1/(0.1ǫ²) = poly(1/(0.1ǫ))). Therefore, by the proof of Lemma 4.10, we can assume w.l.o.g. that Σ, Σ′, Σ″ are all supported on the vertices of the pieces of u_{0.1ǫ²,M}; furthermore, (1/(1+0.1ǫ))·Σ′ + (0.1ǫ/(1+0.1ǫ))·Σ″ satisfies each original ex ante constraint. Note that Σ is 0.1ǫ-additively-optimal among the schemes supported on the above extreme points and satisfying the original ex ante constraints, since Σ is exactly optimal among such schemes if we replace u_s with u_{0.1ǫ,M}, by the proof of Lemma 4.10. Thus:

E[Σ, u_s] ≥ E[(1/(1+0.1ǫ))·Σ′ + (0.1ǫ/(1+0.1ǫ))·Σ″, u_s] − 0.1ǫ ≥ E[Σ′, u_s]/(1+0.1ǫ) + 0.1ǫ·E[Σ″, u_s]/(1+0.1ǫ) − 0.1ǫ ≥ OPT − (0.1ǫ/(1+0.1ǫ))·OPT − 0.2ǫ ≥ OPT − ǫ,

where the last transition follows from OPT ≤ C and the restriction on ǫ.
5 Ex Ante vs. Ex Post Constraints

In this section, we bound the multiplicative gap in Sender's optimal utility between ex ante constraints and the corresponding ex post constraints; we apply our bound to signaling in ad auctions in Subsection 5.1.

In full generality, the gap can be arbitrarily large, even for k = 2 states of nature and m = 1 convex constraint:

Example 5.1.
Fix ǫ ∈ (0, 1/2); take Ω = {0, 1} with a uniform prior; define f(p_σ) := p_σ[ω = 1] and c := (1+ǫ)/2. Let u_s(p_σ) be 0 if p_σ[ω = 1] ∈ [0, 1/2] and 2·p_σ[ω = 1] − 1 otherwise. The ex ante constraint specified by f and c allows full revelation, which yields expected utility of 1/2 for Sender. Convexity of u_s implies that under the corresponding ex post constraint, there exists an optimal signaling scheme for which always p_σ[ω = 1] ∈ {0, c}; straightforward calculations show that Sender's optimal utility is ǫ/(1+ǫ). Thus, the multiplicative gap tends to ∞ as ǫ tends to 0.

We identify a multiplicatively-relaxed Jensen assumption on u_s parameterized by M ≥ 1, which combined with convexity of the m constraints yields a multiplicative bound of M^m on the gap between ex ante and ex post constraints.

Assumption 5.2 (Parameterized by M ≥ 1). For every λ ∈ [0, 1] and p_σ^1, p_σ^2 ∈ ∆(Ω):

λ·u_s(p_σ^1) + (1−λ)·u_s(p_σ^2) ≤ M · u_s(λ·p_σ^1 + (1−λ)·p_σ^2).

For example, in Appendix C, we show that Assumption 5.2 holds with M = 2 for both the welfare and the revenue utility functions in the single-item, second-price auction setting. We note that there are utilities u_s for which the assumption does not hold for any finite M: those u_s that "grow too slowly" near 0 (in particular, if u_s maps a nonzero measure of the domain to 0, as in Example 5.1).

Theorem 5.3 (A bound on the multiplicative gap between ex ante and ex post constraints). Suppose that u_s satisfies Assumption 5.2 with parameter M ≥ 1. Fix m convex ex ante constraints and let Σ_ex ante be a valid signaling scheme. Then there exists Σ_ex post, a valid signaling scheme under the corresponding m ex post constraints, s.t.:

E[Σ_ex post, u_s] ≥ (1/M^m) · E[Σ_ex ante, u_s].

The proof runs Algorithm 1 for each constraint separately. This algorithm repeatedly pools a posterior violating the ex post constraint with a posterior satisfying this constraint with a strict inequality, replacing one of them by a posterior on which the ex post constraint is tight and decreasing the probability mass assigned to the other posterior. This process stops since each iteration decreases the number of posteriors in supp(Σ) on which the ex post constraint is not tight. The constraint convexity assures that the resultant scheme satisfies the ex post constraint; Assumption 5.2 implies that the multiplicative loss caused by the pooling process (for each constraint) is at most M. Formally, we start with the following lemma.

Lemma 5.4.
Suppose that u_s satisfies Assumption 5.2 with some M ≥ 1. Let Σ_ex ante be a signaling scheme with a finite support satisfying a convex ex ante constraint specified by some f and c. Then the output of Algorithm 1 on Σ_ex ante is a signaling scheme Σ_ex post satisfying the corresponding ex post constraint, s.t.:

E[Σ_ex post, u_s] ≥ (1/M) · E[Σ_ex ante, u_s].

Assuming Lemma 5.4, let us prove Theorem 5.3.

Algorithm 1: Ex ante to ex post
Input: A signaling scheme Σ with a finite support satisfying E[Σ, f] ≤ c.
Parameters: A continuous convex function f : ∆(Ω) → R, a constant c.
Output: An updated signaling scheme Σ with a multiplicative expected utility loss of at most M compared to the input, s.t. ∀p_σ ∈ supp(Σ) : f(p_σ) ≤ c.

  S ← supp(Σ) ∩ f⁻¹((−∞, c)); T ← supp(Σ) ∩ f⁻¹((c, ∞)).
  while S, T ≠ ∅ do
    Take q_S ∈ S, q_T ∈ T; r_S ← Pr_{p_σ∼Σ}[p_σ = q_S], r_T ← Pr_{p_σ∼Σ}[p_σ = q_T].
    Find λ ∈ (0, 1) s.t. f(λ·q_S + (1−λ)·q_T) = c; define q_c := λ·q_S + (1−λ)·q_T; supp(Σ) ← supp(Σ) ∪ {q_c}.
    if λ·r_T ≥ (1−λ)·r_S then
      supp(Σ) ← supp(Σ) \ {q_S}; r_c ← r_S/λ; r_T ← r_T − (1−λ)·r_S/λ; r_S ← 0.
    else
      supp(Σ) ← supp(Σ) \ {q_T}; r_c ← r_T/(1−λ); r_S ← r_S − λ·r_T/(1−λ); r_T ← 0.
    end if
    Update Σ (and S, T) according to r_S, r_T, r_c.
  end while
  return Σ
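A direct Python rendering of Algorithm 1 (ours, for illustration; λ is found by binary search, which suffices since f is continuous), demonstrated on Example 5.1:

import numpy as np

def ex_ante_to_ex_post(scheme, f, c, tol=1e-12):
    # Algorithm 1: repeatedly pool a posterior with f < c and one with f > c into a
    # posterior where the constraint is tight. `scheme` maps posterior tuples to weights;
    # f is continuous and convex, and E[scheme, f] <= c is assumed to hold.
    scheme = {q: r for q, r in scheme.items() if r > tol}
    while True:
        S = [q for q in scheme if f(q) < c - tol]
        T = [q for q in scheme if f(q) > c + tol]
        if not S or not T:
            return scheme
        qS_key, qT_key = S[0], T[0]
        qS, qT = np.array(qS_key), np.array(qT_key)
        rS, rT = scheme.pop(qS_key), scheme.pop(qT_key)
        lo, hi = 0.0, 1.0   # f > c at lam = 0 (pure q_T), f < c at lam = 1 (pure q_S)
        for _ in range(100):
            mid = 0.5 * (lo + hi)
            if f(tuple(mid * qS + (1 - mid) * qT)) > c:
                lo = mid
            else:
                hi = mid
        lam = 0.5 * (lo + hi)
        qc_key = tuple(lam * qS + (1 - lam) * qT)
        if lam * rT >= (1 - lam) * rS:   # q_S's mass is exhausted first
            updates = {qc_key: rS / lam, qT_key: rT - (1 - lam) * rS / lam}
        else:                            # q_T's mass is exhausted first
            updates = {qc_key: rT / (1 - lam), qS_key: rS - lam * rT / (1 - lam)}
        for q, r in updates.items():
            if r > tol:
                scheme[q] = scheme.get(q, 0.0) + r

# Demo on Example 5.1 with eps = 0.1: utility drops from 1/2 to eps/(1+eps), about 0.0909.
eps = 0.1
f = lambda q: q[1]                       # f(p_sigma) = p_sigma[omega = 1]
c = (1 + eps) / 2
full_revelation = {(1.0, 0.0): 0.5, (0.0, 1.0): 0.5}
pooled = ex_ante_to_ex_post(full_revelation, f, c)
u = lambda q: 0.0 if q[1] <= 0.5 else 2 * q[1] - 1
print(sum(r * u(q) for q, r in pooled.items()))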
By Theorem 3.1, assume w.l.o.g. that Σ_ex ante has a finite support. Let us run Algorithm 1 for j = 1, 2, ..., m on Σ_ex ante, f_j and c_j (updating the signaling scheme repeatedly) and let Σ_ex post be the final output. Applying Lemma 5.4 m times, together with the convexity of the constraints – which ensures that pooling cannot increase the expected value of any constraint function – implies that Σ_ex post satisfies the theorem requirements.

It remains to prove the lemma.

Proof of Lemma 5.4.
Note that Algorithm 1 terminates after at most |supp(Σ_ex ante)| iterations, since each iteration decreases |S ∪ T|, and throughout the algorithm run we have S ∪ T ⊆ supp(Σ_ex ante). The update rules ensure that r_S, r_T and r_c after each update are nonnegative and their sum equals r_S + r_T before the update; furthermore, these updates preserve Bayes-plausibility, as q_c = λ·q_S + (1−λ)·q_T and:

0·q_S + (r_T − (1−λ)·r_S/λ)·q_T + (r_S/λ)·(λ·q_S + (1−λ)·q_T) = r_S·q_S + r_T·q_T,

and also:

(r_S − λ·r_T/(1−λ))·q_S + 0·q_T + (r_T/(1−λ))·(λ·q_S + (1−λ)·q_T) = r_S·q_S + r_T·q_T.

Therefore, Σ remains a Bayes-plausible probability distribution throughout the algorithm run. In addition, the convexity of f implies that the expectation of f never increases. Hence, when the algorithm stops, we must have T = ∅. Thus, Σ_ex post satisfies the ex post constraint specified by f and c. Moreover, Assumption 5.2 implies that the multiplicative loss in the expected Sender's utility compared to Σ_ex ante is at most M.

In Appendix D, we prove the following facts on tightness of Theorem 5.3 and our analysis. We leave as an open question the tightness of Theorem 5.3 for general m.

Proposition 5.5.
1. Our analysis is tight for any m and M = 2 .
2. The bound from Theorem 5.3 on the multiplicative gap between ex ante and ex post constraints is tight for m = 1 and any M.
3. This gap grows with m and can be at least m + 1.

5.1 Application to Ad Auctions

We apply Theorem 5.3 to the important domain of signaling in ad auctions. We use a generalization of the "Bayesian Valuation Setting" of Badanidiyuru et al. [4] and add to it constraints on the signaling scheme.

Consider a single-item second-price auction with n bidders. Recall from Section 2 that the item being sold is the opportunity to show an online advertisement to a web user, whose characteristics are known to the auctioneer, but not to the bidders. Each bidder targets a certain set of users to whom showing her ad would be most valuable, and the auctioneer signals information about which targeted sets the user belongs to.

In the language of persuasion, Sender is the auctioneer while Receiver is the set of bidders. Take Ω := {0, 1}^n, where the i-th coordinate specifies whether the web user is in the i-th advertiser's targeted set; denote by ω = (ω_1, ..., ω_n) the state of nature; let p be some commonly-known prior distribution. Assume further that for every 1 ≤ i ≤ n, the i-th bidder has a private type t_i; for every 1 ≤ i ≤ n, the valuation v_i of the i-th bidder is determined by a nonnegative function v_i(ω_i, t_i). Fix 2n continuously differentiable CDFs L_i and H_i (1 ≤ i ≤ n) and assume that v_i(0, t_i) ∼_{t_i} L_i and v_i(1, t_i) ∼_{t_i} H_i for every 1 ≤ i ≤ n. The auction runs as follows:
1. The auctioneer commits to a valid signaling scheme Σ, an allocation rule and a payment rule.
2. The auctioneer discovers the state of nature ω ∈ Ω.
3. The auctioneer broadcasts a public signal realization σ according to Σ(ω).
4. The bidders update their expected valuations using p_σ and report their bids to the auctioneer.
5. The auction outcome is determined by the allocation and the payment rules.

Define Sender's utility u_s(p_σ) to be the expected welfare – the winner's value – over t_1, ..., t_n, for a posterior p_σ. Explicitly:

u_s(p_σ) := E_{t_1,...,t_n}[max{p_σ[ω_1 = 0]·v_1(0, t_1) + p_σ[ω_1 = 1]·v_1(1, t_1), ..., p_σ[ω_n = 0]·v_n(0, t_n) + p_σ[ω_n = 1]·v_n(1, t_n)}].

In Appendix C, we prove the following result.
Proposition 5.6. u_s – the expected (over the bidders' private types) welfare in a single-item second-price auction with signaling – satisfies Assumption 5.2 with M = 2. (Note that we use M = 2 in our applications. Unlike Badanidiyuru et al. [4], we assume neither that the t_i's are i.i.d. nor that v_1 ≡ ... ≡ v_n.)
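For intuition, the welfare objective u_s can be estimated by Monte Carlo over the private types. The sketch below is ours: it takes the posterior marginals Pr[ω_i = 1] as input (only the marginals matter in the formula above) and, purely for illustration, assumes i.i.d. uniform types and a specific valuation map; the setting allows far more general L_i, H_i.

import numpy as np

rng = np.random.default_rng(0)

def expected_welfare(marginals, v, n_samples=100_000):
    # u_s(p_sigma): expected highest expected-valuation, where bidder i's expected value is
    # p_sigma[omega_i = 0] * v_i(0, t_i) + p_sigma[omega_i = 1] * v_i(1, t_i).
    m = np.asarray(marginals, dtype=float)
    t = rng.random((n_samples, len(m)))      # illustrative assumption: t_i ~ U[0, 1], i.i.d.
    vals = (1 - m) * v(0, t) + m * v(1, t)   # bidder-wise expected valuations
    return float(vals.max(axis=1).mean())    # welfare of the second-price auction = highest value

# Illustration: being targeted (omega_i = 1) doubles a bidder's valuation.
v = lambda w, t: (1 + w) * t
print(expected_welfare([0.2, 0.8], v))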
Example 5.7. Take a single ex ante constraint specified by the function (−min_{ω′∈Ω}{b_{ω′}·p_σ[ω = ω′]}) with some constant weights {b_{ω′}}_{ω′∈Ω}. As mentioned in Section 2, this constraint is a possible model for anti-discrimination. Finding the optimal valid scheme Σ*_ex ante is an open question. However, the corresponding ex post constraint is simple to handle – it restricts the posteriors to an appropriate simplex, and since u_s (the social welfare) is convex, the optimal scheme Σ*_ex post is supported precisely on the vertices of this simplex, and is uniquely specified by Bayes-plausibility. Theorem 5.3, combined with Proposition 5.6, shows that Σ*_ex post is a 1/2-approximation to Σ*_ex ante.

6 Discussion

We initiate the study of ex ante- and ex post-constrained Bayesian persuasion with applications to ad auctions and limited attention. A future research direction, especially considering Theorem 5.3, is studying (nearly) optimal signaling schemes under common ex post constraints, such as KL divergence. Another interesting direction is constrained persuasion with private signaling, e.g., when Sender's utility is a function of the set of Receivers who adopt a certain action [3].
References

[1] Charalambos D. Aliprantis and Kim C. Border. Infinite Dimensional Analysis: A Hitchhiker's Guide. Springer, 2006.
[2] Gerry Antioch. Persuasion is now 30 per cent of US GDP: Revisiting McCloskey and Klamer after a quarter of a century. Economic Roundup, 1:1, 2013.
[3] Itai Arieli and Yakov Babichenko. Private Bayesian persuasion. Journal of Economic Theory, 182:185–217, 2019.
[4] Ashwinkumar Badanidiyuru, Kshipra Bhawalkar, and Haifeng Xu. Targeting and signaling in ad auctions. In Proceedings of the 29th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2545–2563. SIAM, 2018.
[5] Alexander W. Bloedel and Ilya R. Segal. Persuasion with rational inattention, 2018. Available at SSRN 3164033.
[6] Elisa L. Celis, Sayash Kapoor, Farnood Salehi, and Nisheeth K. Vishnoi. Controlling polarization in personalization: An algorithmic framework. In Proceedings of the Conference on Fairness, Accountability, and Transparency, pages 160–169, 2019.
[7] Elisa L. Celis, Anay Mehrotra, and Nisheeth K. Vishnoi. Toward controlling discrimination in online ad auctions. In International Conference on Machine Learning, pages 4456–4465, 2019.
[8] Yu Cheng, Ho Y. Cheung, Shaddin Dughmi, Ehsan Emamjomeh-Zadeh, Li Han, and Shang H. Teng. Mixture selection, mechanism design, and signaling. In 56th Annual IEEE Symposium on Foundations of Computer Science, FOCS, pages 1426–1445. IEEE, 2015.
[9] Constantinos Daskalakis, Christos H. Papadimitriou, and Christos Tzamos. Does information revelation improve revenue? In Proceedings of the 2016 ACM Conference on Economics and Computation, EC, pages 233–250, 2016.
[10] Shaddin Dughmi. Algorithmic information structure design: a survey. ACM SIGecom Exchanges, 15(2):2–24, 2017.
[11] Shaddin Dughmi and Haifeng Xu. Algorithmic persuasion with no externalities. In Proceedings of the 2017 ACM Conference on Economics and Computation, pages 351–368, 2017.
[12] Shaddin Dughmi, Nicole Immorlica, and Aaron Roth. Constrained signaling in auction design. In Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1341–1357. SIAM, 2014.
[13] Shaddin Dughmi, Nicole Immorlica, Ryan O'Donnell, and Li-Yang Tan. Algorithmic signaling of features in auction design. In Algorithmic Game Theory – 8th International Symposium, SAGT, pages 150–162. Springer, 2015.
[14] Cynthia Dwork and Aaron Roth. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, 9(3-4):211–407, 2014.
[15] Ran Eilat, Kfir Eliaz, and Xiaosheng Mu. Optimal privacy-constrained mechanisms. CEPR Discussion Paper No. DP13536, 2019.
[16] Yuval Emek, Michal Feldman, Iftah Gamzu, Renato Paes Leme, and Moshe Tennenholtz. Signaling schemes for revenue maximization. ACM Transactions on Economics and Computation (TEAC), 2(2):1–19, 2014.
[17] Hu Fu, Patrick Jordan, Mohammad Mahdian, Uri Nadav, Inbal Talgam-Cohen, and Sergei Vassilvitskii. Ad auctions with data. In International Symposium on Algorithmic Game Theory, pages 168–179. Springer, 2012.
[18] Sergiu Hart and Noam Nisan. The Menu-Size Complexity of Auctions. Center for the Study of Rationality, 2013.
[19] Sergiu Hart and Noam Nisan. Approximate revenue maximization with multiple items. Journal of Economic Theory, 172:313–347, 2017.
[20] Shota Ichihashi. Limiting sender's information in Bayesian persuasion. Games and Economic Behavior, 117:276–288, 2019.
[21] Shota Ichihashi. Online privacy and information disclosure by consumers. American Economic Review, 110(2):569–95, 2020.
[22] Emir Kamenica and Matthew Gentzkow. Bayesian persuasion. American Economic Review, 101(6):2590–2615, 2011.
[23] Alan F. Karr. Extreme points of certain sets of probability measures, with applications. Mathematics of Operations Research, 8(1):74–85, 1983.
[24] Elliot Lipnowski, Laurent Mathevet, and Dong Wei. Attention management. American Economic Review: Insights, 2(1):17–32, 2020.
[25] Donald McCloskey and Arjo Klamer. One quarter of GDP is persuasion. The American Economic Review, 85(2):191–195, 1995.
[26] Paul R. Milgrom and Robert J. Weber. A theory of auctions and competitive bidding. Econometrica: Journal of the Econometric Society, pages 1089–1122, 1982.
[27] Peter Bro Miltersen and Or Sheffet. Send mixed signals: earn more, work less. In Proceedings of the 13th ACM Conference on Electronic Commerce, pages 234–247, 2012.
[28] H.P. Mulholland and C.A. Rogers. Representation theorems for distribution functions. Proceedings of the London Mathematical Society, 3(2):177–223, 1958.
[29] Roger B. Myerson. Optimal auction design. Mathematics of Operations Research, 6(1):58–73, 1981.
[30] Hans Richter. Parameterfreie Abschätzung und Realisierung von Erwartungswerten. Blätter der DGVFM, 3(2):147–162, 1957.
[31] Herbert Alexander Simon. Designing organizations for an information-rich world. International Library of Critical Writings in Economics, 70:187–202, 1996.
[32] Christopher Albert Sims. Implications of rational inattention. Journal of Monetary Economics, 50(3):665–690, 2003.
[33] Onno van Gaans. Probability measures on metric spaces, 2003. Lecture notes.
[34] Rune Tørnoe Vølund. Bayesian persuasion on compact subsets. Theoretical Models in Behavioral Economics, pages 64–77, 2018.
A Proof of Proposition 3.2
Proof.
Fix k, m and some Ω of size k. We shall define m ex ante constraints and an upper semi-continuous u_s s.t. any optimal signaling scheme has support size exactly k + m.

Let e_1, ..., e_k be the standard basis of R^k. Take q_1, ..., q_m to be m distinct interior points of ∆(Ω) \ {e_1, ..., e_k} with (q_1 + ... + q_m)/m = (1/k, ..., 1/k); set u_s to be 1 on {q_1, ..., q_m}, 1/2 on {e_1, ..., e_k} and 0 on ∆(Ω) \ {e_1, ..., e_k, q_1, ..., q_m}; let f_i (1 ≤ i ≤ m) be some nonnegative continuous function which is 1 on q_i and 0 on the other q_j's; set c_1 = ... = c_m = 1/(2m); choose p = (1/k, ..., 1/k).

Then u_s is upper semi-continuous and the f_i's are continuous. Furthermore, no valid signaling scheme under the ex ante constraints specified by the f_i's and the c_i's assigns probability greater than 1/(2m) to each q_i; thus the utility of a valid scheme is at most:

m · (1/(2m)) · 1 + (1 − m · (1/(2m))) · (1/2) = 3/4.

A utility of exactly 3/4 is achieved by valid schemes that assign probability of exactly 1/(2m) to each q_i and split the remaining probability between e_1, ..., e_k. Bayes-plausibility implies that exactly one such scheme exists, assigning probability of 1/(2m) to each q_i and probability of 1/(2k) to each e_j. Thus, there exists a single optimal valid scheme, which has support size exactly k + m.

Observation A.1.
One can modify our construction to make u_s continuous. For example, one can make u_s decrease quickly to 0 on all the rays originating from any q_i or e_j and require f_i to be greater than 1 in a deleted neighbourhood of q_i on which u_s is nonzero.

B Proof of Theorem 4.5
Proof of Theorem 4.5, part 1. Suppose that u_s is either O(1)-Lipschitz or piecewise constant, having a constant number of pieces, with each piece covering a convex polygon in ∆(Ω) having a constant number of vertices. Then u_s satisfies Assumption 4.8 with t(1/ǫ) := 1/ǫ. Indeed, to define u_{ǫ,M}, one can divide ∆(Ω) into poly(1/ǫ) simplices of diameters at most ǫ/M s.t. the supremum and the infimum of u_s on every simplex differ by at most ǫ, and then set u_{ǫ,M} on every such simplex to be the supremum of u_s on it.

An ex ante constraint specified by an O(1)-Lipschitz function trivially satisfies Assumption 4.9 with t(1/ǫ) := 1/ǫ – simply define g_i := f_i. Consider now an ex ante constraint specified by a function of the form:

f_i(p_σ) := b · Σ_{1≤j≤l} ( Σ_{ω′∈Ω_j} p_σ[ω′] ) · ln( ( Σ_{ω′∈Ω_j} p_σ[ω′] ) / b_j ),

where {Ω_j}_{1≤j≤l} is a partition of Ω, b_1, ..., b_l > 0 and b is constant (the case b = 0 is trivial). We shall show that one can assume w.l.o.g. that f_i(p_σ) ≡ Σ_{ω′∈Ω} p_σ[ω′] ln p_σ[ω′]; then we shall prove that the corresponding ex ante constraint satisfies Assumption 4.9 with t(1/ǫ) := 1/ǫ.

First, assume w.l.o.g. that b = 1: by dividing both f_i and c_i by |b| we can assume b ∈ {−1, 1}; then note that if g_i fits for f_i, then −ǫ − g_i fits for −f_i. Secondly, assume w.l.o.g. that l = k and |Ω_j| = 1 for every 1 ≤ j ≤ l – just replace Ω with Ω′ := {Ω_j}_{1≤j≤l}. Thirdly, by adding a linear function to f_i, assume w.l.o.g. that f_i(p_σ) ≡ Σ_{ω′∈Ω} p_σ[ω′] ln p_σ[ω′]; this is possible since adding an O(1)-Lipschitz function does not affect the satisfaction of Assumption 4.9.

Let p′ be the center of ∆(Ω) and let S_ǫ be the contraction (homothety) of ∆(Ω) centered at p′ with coefficient 1 − ǫ². The restriction of f_i to S_ǫ is poly(1/ǫ)-Lipschitz, since on S_ǫ one has: ||∇f_i(p_σ)|| = O(Σ_{ω′∈Ω} |ln p_σ[ω′]|) = O(Σ_{ω′∈Ω} 1/p_σ[ω′]) = poly(1/ǫ). Extend this restriction of f_i to a function g̃_i : ∆(Ω) → R s.t. for every q ∈ ∆(Ω) \ S_ǫ, g̃_i(q) equals the value of f_i on the projection of q onto the closed, convex and nonempty set S_ǫ. Finally, set g_i ≡ g̃_i − ǫ/2.

Then g_i is poly(1/ǫ)-Lipschitz, since f_i restricted to S_ǫ is poly(1/ǫ)-Lipschitz and projection onto a closed, convex and nonempty set is 1-Lipschitz. It remains to check that 0 ≤ f_i(q) − g_i(q) ≤ ǫ for every q ∈ ∆(Ω). It is immediate for q ∈ S_ǫ. Fix now q ∈ ∆(Ω) \ S_ǫ. It is enough to show that |f_i(q) − g̃_i(q)| ≤ ǫ/2. Indeed, by the definition of g̃_i, g̃_i(q) = f_i(q′), where q′ is the projection of q onto S_ǫ. By the choice of S_ǫ we have ||q − q′|| ≤ ǫ². Therefore, the change in f_i between q and q′ is at most (for small enough ǫ):

k·|ǫ² ln ǫ²| = O(ǫ²|ln ǫ|) = o(ǫ),

so |f_i(q) − g̃_i(q)| = |f_i(q) − f_i(q′)| = o(ǫ) ≤ ǫ/2. Thus, f_i indeed satisfies Assumption 4.9 with t(1/ǫ) := 1/ǫ, as desired.

We proved that u_s and the constraints satisfy Assumptions 4.8 and 4.9 with t(1/ǫ) := 1/ǫ; thus, Theorem 4.5, part 1 follows from Lemma 4.10.
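The construction of g_i can be made concrete. The following sketch (ours, illustrating the proof's entropy case) projects q onto the contraction S_ǫ using the standard sorting-based Euclidean projection onto a scaled simplex, evaluates f_i there, and shifts down by ǫ/2.

import numpy as np

def project_to_scaled_simplex(z, s=1.0):
    # Euclidean projection onto {y >= 0, sum(y) = s}: standard sorting-based algorithm.
    u = np.sort(z)[::-1]
    css = np.cumsum(u) - s
    idx = np.arange(1, len(z) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1.0)
    return np.maximum(z - theta, 0.0)

def lipschitz_entropy_surrogate(eps):
    # g_i from the proof (entropy case): evaluate f_i(q) = sum_w q[w] ln q[w] on the
    # projection of q onto the (1 - eps^2)-contraction S_eps of Delta(Omega) around
    # its center, then shift down by eps/2.
    def g(q):
        q = np.asarray(q, dtype=float)
        k = len(q)
        low = eps ** 2 / k                # every coordinate in S_eps is at least eps^2 / k
        proj = low + project_to_scaled_simplex(q - low, 1.0 - k * low)
        return float(np.sum(proj * np.log(proj))) - eps / 2
    return g

g = lipschitz_entropy_surrogate(0.1)
print(g(np.array([1.0, 0.0])))  # finite everywhere, and within eps below f_i at this vertex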
Proof of Theorem 4.5, part 2. Fix a constant ε > 0. If u_s is piecewise constant, with a constant number of pieces, s.t. each piece covers a convex polygon in ∆(Ω) having a constant number of vertices – it satisfies Assumption 4.8 with t(1/ε) := 1: to define u_{ε,M}, one can just refine the pieces of u_s by division into simplices of diameters at most ε/M. If u_s is continuous, then from the compactness of ∆(Ω) and the Heine–Cantor theorem, we get that u_s is uniformly continuous. Therefore, u_s satisfies Assumption 4.8 with t(1/ε) := 1: to define u_{ε,M}, one can divide ∆(Ω) into simplices of small enough diameter – at most ε/M, and such that the supremum and the infimum of u_s on every simplex differ by at most ε – and set u_{ε,M} on every such simplex to be the supremum of u_s on it.

Furthermore, each ex ante constraint satisfies Assumption 4.9 with t(1/ε) := 1. Indeed, given a continuous f_i, the compactness of ∆(Ω) and the Heine–Cantor theorem imply that f_i is uniformly continuous. To define g_i, one should divide ∆(Ω) into simplices of small enough diameter; then one should temporarily set g_i ≡ f_i on the vertices of the simplices and extend g_i linearly on each simplex; finally, one should shift g_i slightly down so that it is never above f_i.

Therefore, u_s and the constraints satisfy Assumptions 4.8 and 4.9 with t(1/ε) := 1; hence, Theorem 4.5, part 2 follows from Lemma 4.10.
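For intuition, here is a one-dimensional Python sketch of this piecewise-linear construction of g_i (ours, for illustration). With |Ω| = 2, a posterior is a point x ∈ [0, 1]; the continuous constraint function f below is a hypothetical example, not one from the paper. Uniform continuity guarantees that the downward shift, and hence f − g, is small once the grid is fine enough.

```python
import numpy as np

f = lambda x: np.sin(5 * x) + 1.0       # some continuous, nonnegative f_i

grid = np.linspace(0.0, 1.0, 101)       # the "simplices" are grid intervals
interp = lambda x: np.interp(x, grid, f(grid))  # piecewise-linear interpolant

xs = np.linspace(0.0, 1.0, 10001)
shift = np.max(interp(xs) - f(xs))      # smallest downward shift making g <= f
g = lambda x: interp(x) - shift

gap = f(xs) - g(xs)
print(gap.min() >= 0, gap.max())        # nonnegative and small
```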
C Proposition 5.6 – Proof and Similar Results

In this appendix, we formulate and prove Lemma C.1; we use it to prove Proposition 5.6 and to demonstrate analogous results for other auction settings, including revenue maximization in single-item second-price auctions and welfare maximization in sponsored search (slot) auctions.
Lemma C.1.
Fix 1 ≤ j ≤ n and n linear functions g_1, ..., g_n : ∆(Ω) → R_{≥0}. Define g^j : ∆(Ω) → R_{≥0} by setting g^j(y), for every y ∈ ∆(Ω), to be the j-th maximal number among g_1(y), ..., g_n(y). Then g^j satisfies Assumption 5.2 with M = 2.

Proof. Fix y, z ∈ ∆(Ω) and λ ∈ [0, 1]. We need to show that
\[ \lambda g^j(y) + (1 - \lambda) g^j(z) \le 2 g^j(\lambda y + (1 - \lambda) z). \]
Assume w.l.o.g. that λg^j(y) ≥ (1 − λ)g^j(z) and that g_1(y) ≥ g_2(y) ≥ ... ≥ g_n(y). Then we get:
\[ \lambda g^j(y) + (1 - \lambda) g^j(z) \le 2\lambda g^j(y) = 2\lambda \min_{1 \le i \le j} \{g_i(y)\} \le 2 \min_{1 \le i \le j} \{\lambda g_i(y) + (1 - \lambda) g_i(z)\} \stackrel{(*)}{=} 2 \min_{1 \le i \le j} \{g_i(\lambda y + (1 - \lambda) z)\} \le 2 g^j(\lambda y + (1 - \lambda) z), \]
where the second inequality uses the nonnegativity of the g_i's, the last inequality holds since the minimum of any j of the values g_1(λy + (1 − λ)z), ..., g_n(λy + (1 − λ)z) is at most their j-th maximal value, and (*) follows from the linearity of the g_i's.
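As a sanity check, the following Python snippet (ours) tests the inequality of Lemma C.1 on random instances, taking the linear functions to be g_i(y) = (Ay)_i for a nonnegative matrix A, which makes each g_i nonnegative on the simplex.

```python
import numpy as np

rng = np.random.default_rng(0)

def g_j(A, y, j):
    # j-th maximal value (1-indexed) among the linear functions g_i(y) = (A @ y)_i.
    return np.sort(A @ y)[::-1][j - 1]

n, k = 5, 4
A = rng.random((n, k))                   # rows = nonnegative linear functions

for _ in range(10000):
    y, z = rng.dirichlet(np.ones(k)), rng.dirichlet(np.ones(k))
    lam = rng.random()
    j = rng.integers(1, n + 1)
    lhs = lam * g_j(A, y, j) + (1 - lam) * g_j(A, z, j)
    assert lhs <= 2 * g_j(A, lam * y + (1 - lam) * z, j) + 1e-12
print("Lemma C.1 inequality verified on random instances")
```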
Now we prove – using Lemma C.1 – Proposition 5.6, and discuss analogous results for other auction settings.

Proof of Proposition 5.6. Note that u_s is an expectation of a maximum of linear nonnegative functions. By Lemma C.1, every term in the expectation satisfies Assumption 5.2 with M = 2; thus, the same holds for the expectation, since the defining inequality of Assumption 5.2 is preserved under taking expectations.
The applications of Theorem 5.3 go beyond the setting of Section 5, which was chosen for the sake of simplicity. In particular, the revenue in single-item second-price auctions and the welfare in sponsored search (slot) auctions are also expectations of certain linear combinations of functions of the form g^j described in Lemma C.1. Therefore, they too satisfy Assumption 5.2 with M = 2.
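To illustrate the second-price case, here is a small Python sketch (ours; it assumes bidders truthfully bid their posterior-expected values, and the value matrix V is hypothetical data for illustration). If bidder i's value in state ω is V[i, ω], her expected value at posterior p is the linear function (Vp)_i, so the revenue at p is the second-highest of n linear nonnegative functions – i.e., g^2 of Lemma C.1, which gives Assumption 5.2 with M = 2.

```python
import numpy as np

rng = np.random.default_rng(1)
n_bidders, k = 4, 3
V = rng.random((n_bidders, k))           # hypothetical nonnegative values V[i, w]

def revenue(p):
    # Second-price revenue at posterior p: the second-highest expected value.
    return np.sort(V @ p)[::-1][1]

# Lemma C.1 (with j = 2) gives Assumption 5.2 with M = 2 for this revenue:
y, z, lam = rng.dirichlet(np.ones(k)), rng.dirichlet(np.ones(k)), 0.3
assert lam * revenue(y) + (1 - lam) * revenue(z) \
       <= 2 * revenue(lam * y + (1 - lam) * z)
```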
D Proof of Proposition 5.5

Proof.
1. Fix m and k = 2^m. We shall show that the analysis from Section 5 is tight for M = 2. Take Ω := {0, 1}^m and u_s(p_σ) := ||p_σ||_∞; let p be uniform over Ω. By Lemma C.1 (with j = 1 and g_{ω'}(p_σ) := p_σ[ω = ω'] for every ω' ∈ Ω), u_s satisfies Assumption 5.2 with M = 2. Define for every 1 ≤ i ≤ m:
\[ f_i(p_\sigma) := p_\sigma[\omega_i = 1] = \sum_{\omega' = (\omega'_1, \ldots, \omega'_m) \in \Omega \,:\, \omega'_i = 1} p_\sigma[\omega = \omega'] \]
and c_i := 1/2. For every p_σ ∈ ∆(Ω), denote by R[p_σ, i] the posterior obtained from p_σ by assigning to each ω' ∈ Ω the probability assigned by p_σ to the state of nature obtained from ω' by reversing its i-th bit.

Consider the following m runs of Algorithm 1 for the m constraints. Start with Σ representing full revelation (which is valid under the ex ante constraints specified by the f_i's and the c_i's). Then on the i-th run of Algorithm 1, pool every posterior p_σ with R[p_σ, i]. Inductively, just before two posteriors are pooled together, they have equal probability weights in the signaling scheme; therefore, their probability weights are moved entirely to the new posterior that the pooling creates. Note that for every 1 ≤ i ≤ m, the i-th run of Algorithm 1 is legal, since it only pools posteriors having f_i = 0 with posteriors having f_i = 1; furthermore, at the end of the i-th run, all the posteriors in supp(Σ) have f_i = 1/2. Moreover, inductively, at the end of the i-th run, every posterior in supp(Σ) specifies deterministically the last m − i bits of ω and induces a uniform distribution on {0, 1}^i over the prefix of length i of ω. Therefore, after the m-th run, we end with supp(Σ) = {p} (i.e., the no revelation policy), yielding expected Sender's utility of 1/k. We started with the full revelation policy, yielding utility of 1; thus, the total multiplicative utility loss is 1/k = 2^{−m} = M^{−m}. (A schematic simulation of this pooling process appears after the proof.)
2. Assume that m = 1 and fix M ≥ 1. We shall define u_s satisfying Assumption 5.2 with M and an ex ante constraint outperforming the corresponding ex post constraint by a multiplicative factor of M.

Take: Ω := {0, 1}; p uniform over Ω; f_1 := p_σ[ω = 1]; c_1 := 1/2; and
\[ u_s(p_\sigma) := \frac{1}{M} + \left| p_\sigma[\omega = 1] - \frac{1}{2} \right| \cdot \frac{2(M - 1)}{M}. \]
Then u_s(p_σ) ∈ [1/M, 1] for every p_σ ∈ ∆(Ω); thus, u_s satisfies Assumption 5.2 with M. Furthermore, f_1 is linear, thus convex. Under the ex post constraint specified by f_1 and c_1, the only valid signaling scheme has support {p}, so the optimal expected Sender's utility is 1/M; the corresponding ex ante constraint allows full revelation, which yields expected Sender's utility of 1. Therefore, we have a multiplicative gap of M between the two constraint types (see the second sketch after this proof).

3. Fix m and k = m + 1. We shall prove that the multiplicative gap between ex ante and ex post constraints can be m + 1 for M = 2.

Take Ω := {1, ..., k} with p uniform over Ω; set u_s(p_σ) := ||p_σ||_∞, f_i := p_σ[ω = i] and c_i := 1/k (1 ≤ i ≤ m). As explained in our proof of part 1, u_s satisfies Assumption 5.2 with M = 2. On the one hand, under the ex post constraints specified by the f_i's and the c_i's, the only valid signaling scheme has support {p}, yielding expected Sender's utility of 1/k. On the other hand, the corresponding ex ante constraints allow full revelation, yielding utility of 1. Thus, we get a multiplicative gap of k = m + 1.
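The following Python sketch (ours, for illustration – a schematic simulation of the pooling runs described in part 1, not the paper's implementation of Algorithm 1) tracks the expected Sender's utility after each run; it decreases from 1 under full revelation to 2^{−m} = M^{−m} under no revelation, as claimed.

```python
import numpy as np
from itertools import product

m = 3
states = list(product([0, 1], repeat=m))       # Omega = {0,1}^m, k = 2^m states
idx = {w: i for i, w in enumerate(states)}

def flip(w, i):
    return w[:i] + (1 - w[i],) + w[i + 1:]

# Full revelation: one degenerate posterior per state, each with weight 1/k.
scheme = {}
for w in states:
    p = np.zeros(len(states)); p[idx[w]] = 1.0
    scheme[tuple(p)] = 1.0 / len(states)

def expected_utility(scheme):
    # E[u_s] with u_s(p) = ||p||_inf.
    return sum(wt * max(p) for p, wt in scheme.items())

print(expected_utility(scheme))                # 1.0 under full revelation

for i in range(m):                             # the i-th run of Algorithm 1
    pooled = {}
    for p, wt in scheme.items():
        # R[p, i]: reindex p by reversing the i-th bit of each state.
        r = np.array(p)[[idx[flip(w, i)] for w in states]]
        key = tuple((np.array(p) + r) / 2)     # pool p with R[p, i]
        pooled[key] = pooled.get(key, 0.0) + wt
    scheme = pooled
    print(i + 1, len(scheme), expected_utility(scheme))
# After the m-th run the scheme is no revelation, with utility 1/k = 2^{-m}.
```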
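Finally, a quick numerical check (ours) of the part-2 construction, for an arbitrary choice of M: u_s stays within [1/M, 1], the ex post constraint forces the prior (utility 1/M), and full revelation under the ex ante constraint yields utility 1, for a gap of M.

```python
import numpy as np

M = 5.0
u = lambda x: 1 / M + abs(x - 0.5) * 2 * (M - 1) / M  # x = p_sigma[w = 1]

xs = np.linspace(0.0, 1.0, 1001)
assert np.all((1 / M <= u(xs)) & (u(xs) <= 1 + 1e-12))  # u_s in [1/M, 1]

# Ex post: every posterior must satisfy f_1 = x <= 1/2; Bayes-plausibility then
# forces x = 1/2 (the prior), with utility u(1/2) = 1/M. Ex ante: full
# revelation (x in {0, 1} with equal weights) has E[f_1] = 1/2 <= c_1.
print(u(0.5), 0.5 * u(0.0) + 0.5 * u(1.0))    # 1/M versus 1: a gap of M
```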