arXiv [math.PR]

A law of large numbers for weighted plurality
Joe Neeman ∗ November 14, 2018
Abstract
Consider an election between $k$ candidates in which each voter votes randomly (but not necessarily independently) and suppose that there is a single candidate that every voter prefers (in the sense that each voter is more likely to vote for this special candidate than any other candidate). Suppose we have a voting rule that takes all of the votes and produces a single outcome and suppose that each individual voter has little effect on the outcome of the voting rule. If the voting rule is a weighted plurality, then we show that with high probability, the preferred candidate will win the election. Conversely, we show that this statement fails for all other reasonable voting rules.

This result is an extension of one by Häggström, Kalai and Mossel, who proved the above in the case $k = 2$.

1 Introduction

For elections between two candidates, it is well known that voting rules in which every voter has a small effect are good rules in the sense that they “aggregate information well”: if every voter has a small bias towards the same candidate then that candidate will win with overwhelming probability. When voters vote independently, this fact was noted by Margulis [4] and Russo [5], whose results were later strengthened by Kahn, Kalai and Linial [3] and by Talagrand [6].

When the voters are not independent, the situation is more complicated. It is no longer true, then, that every reasonable voting rule aggregates well. In fact, [2] show that if we want the aggregation to hold for every distribution of the voters, then weighted majority functions are the only option. We extend their result to the non-binary case.

The author would like to thank Elchanan Mossel for suggesting this problem and providing fruitful discussions.

∗ Department of Statistics, U.C. Berkeley.

2 Definitions and results
In the introduction, we made a few allusions to “reasonable” voting rules. Let us now say precisely what that means: we will require that our voting rules do not have a built-in preference for any alternative. This is a common assumption, and its definition is standard (see, e.g., [1]). In what follows, the notation $[k]$ stands for the set $\{0, \dots, k-1\}$.

Definition 2.1. A function $f \colon [k]^n \to [k]$ is neutral if $f(\sigma(x)) = \sigma(f(x))$ for all $x \in [k]^n$ and all permutations $\sigma$ on $[k]$, where $\sigma(x)_i = \sigma(x_i)$.

Note that in the case $k = 2$, a function is neutral if, and only if, it is anti-symmetric according to the definition in [2].
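To make Definition 2.1 concrete, the following minimal Python sketch checks neutrality by brute force over all permutations $\sigma$ of $[k]$ and all vote profiles $x \in [k]^n$. The names `is_neutral`, `dictator` and `constant_zero` are illustrative assumptions that do not come from the paper, and the enumeration costs $k! \cdot k^n$ evaluations, so it is only meant for very small $k$ and $n$.

```python
from itertools import permutations, product


def is_neutral(f, k, n):
    """Brute-force check of Definition 2.1: f(sigma(x)) == sigma(f(x))."""
    for sigma in permutations(range(k)):          # sigma[a] is the image of alternative a
        for x in product(range(k), repeat=n):     # every vote profile in [k]^n
            relabelled = tuple(sigma[v] for v in x)
            if f(relabelled) != sigma[f(x)]:
                return False
    return True


def dictator(x):
    # Hypothetical example rule: the outcome is the first voter's vote.
    return x[0]


def constant_zero(x):
    # Hypothetical example rule with a built-in preference for alternative 0.
    return 0


print(is_neutral(dictator, k=3, n=3))       # True
print(is_neutral(constant_zero, k=3, n=3))  # False
```

The dictator rule passes the check, while the rule that always elects alternative 0 fails it, since relabelling the alternatives does not relabel its output.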
Example 2.2. When $k = 2$ and $n$ is odd, then the simple majority function (for which $f(x) = 1$ if and only if $\#\{i : x_i = 1\} > \#\{i : x_i = 0\}$) is neutral. On the other hand, if $n$ is even then in order to fully specify the simple majority function, we need to say what happens in the case of a tie; the choice of tie-breaking rule will determine whether the resulting function is neutral. For example, if we define $f(x) = x_1$ for every tied configuration $x$, then $f$ is neutral. On the other hand, if $f(x)$ is the same fixed alternative for every tied configuration $x$, then $f$ is not neutral.

The example can be extended to $k \ge 3$. In this case, consider the tie-breaking rule $f(x) = x_i$ where $i$ is the smallest possible number for which $x_i$ is equal to one of the tied alternatives. This tie-breaking rule is neutral, and it is more natural than setting $f(x) = x_1$ because it guarantees that the output of $f$ is one of the tied alternatives.

Let us say precisely what we mean by a weighted plurality function. The definition that we take here generalizes the definition from [2] of a weighted majority function.
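Before turning to that definition, the tie-breaking rule of Example 2.2 can be made concrete with a short Python sketch of an unweighted plurality function. The function name `plurality_first_tied` is an assumption made for illustration, and voters are indexed from 0 in the code rather than from 1 as in the text.

```python
from collections import Counter


def plurality_first_tied(x):
    """Unweighted plurality; a tie goes to the vote of the lowest-indexed
    voter who chose one of the tied alternatives."""
    counts = Counter(x)
    top = max(counts.values())
    tied = {a for a, c in counts.items() if c == top}
    # Scan voters in index order and return the first vote among the tied alternatives.
    return next(v for v in x if v in tied)


print(plurality_first_tied((1, 0, 0, 2, 2)))  # 0
print(plurality_first_tied((1, 2, 2, 0, 0)))  # 2
```

In the first profile alternatives 0 and 2 are tied, and the rule returns 0 rather than the first voter's vote 1, which is not among the tied alternatives. The second profile is the first one with 0 and 2 swapped, and the output is swapped accordingly, as neutrality requires.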
Definition 2.3. A function $f \colon [k]^n \to [k]$ is a weighted plurality function if there exist weights $w_1, \dots, w_n \in \mathbb{R}_{\ge 0}$ such that $\sum_i w_i = 1$ and for all $a, b \in [k]$, $f(x) = a$ implies that
\[
\sum_{i : x_i = a} w_i \ge \sum_{i : x_i = b} w_i.
\]

Note that the above definition does not prescribe a particular behavior if a tie occurs between two alternatives. If the weights are chosen so that ties never occur, then the weighted plurality function is clearly neutral. Moreover, for any set of weights we can construct a neutral weighted plurality function with those weights by following the tie-breaking rule outlined in Example 2.2.

2.2 The influence of a voter
The final notion that we need before stating our result is a way to quantify the power of a single voter. When $k = 2$, the notion of effect is well-established and can be found, for example, in [2]. However, there does not seem to be a well-established way of quantifying the effect of voters for non-binary social choice functions. Here, we propose a definition that closely resembles the one used in [2] for binary functions.
Definition 2.4. Let $f$ be a function $[k]^n \to [k]$ and fix a probability distribution $P$ on $[k]^n$. The effect of voter $i$ is
\[
e_i(f, P) = \sum_{j \in [k]} \bigl( P(f(X) = j \mid X_i = j) - P(f(X) = j \mid X_i \ne j) \bigr),
\]
where $X$ is a random variable distributed according to $P$.

Note that for the case $k = 2$, the preceding definition reduces to
\[
e_i(f, P) = 2 \bigl( P(f(X) = 1 \mid X_i = 1) - P(f(X) = 1 \mid X_i = 0) \bigr),
\]
which is just twice the definition in [2] of a voter's effect. Also, the effect is closely related to the correlation between the voters and the outcome:
\[
P(f(X) = j \mid X_i = j) - P(f(X) = j \mid X_i \ne j)
= \frac{\mathrm{Cov}\bigl(1_{\{f = j\}}, 1_{\{X_i = j\}}\bigr)}{P(X_i = j)\, P(X_i \ne j)}
\ge 4\, \mathrm{Cov}\bigl(1_{\{f = j\}}, 1_{\{X_i = j\}}\bigr)
\]
and so $e_i(f, P) \ge 4 \sum_j \mathrm{Cov}\bigl(1_{\{f = j\}}, 1_{\{X_i = j\}}\bigr)$.

Example 2.5.
The simplest example of $e_i(f, P)$ is when $P$ is a product measure (i.e., the $X_i$ are independent) and the function $f$ does not depend on its $i$-th coordinate; in that case, $P(f(X) = j \mid X_i = j) = P(f(X) = j \mid X_i \ne j)$ for all $j$ and so $e_i(f, P) = 0$. If $P$ is a distribution such that $X_1 = X_2 = \cdots = X_n$ with probability 1, and if $f$ is a plurality function, then $P(f(X) = j \mid X_i = j) = 1$ for all $j$, while $P(f(X) = j \mid X_i \ne j) = 0$; hence, $e_i(f, P) = k$ for all $i$.

For a less trivial example, suppose that the $X_i$ are independent and uniformly distributed on $[k]$. Let $f$ be an unweighted plurality function. Then the Central Limit Theorem implies that $e_i(f, P) = O(1/\sqrt{n})$ as $n \to \infty$.

On the other hand, suppose that $f$ is still an unweighted plurality function and the $X_i$ are independent, but now $P(X_i = 1) > P(X_i = j) + \delta$ for some $\delta > 0$ and all $j \ne 1$. Then Hoeffding's inequality implies that $P(f(X) = 1 \mid X_i) \ge 1 - \exp(-\delta^2 n / C)$ for some constant $C > 0$ and sufficiently large $n$, regardless of the value of $X_i$. In particular, this implies that $e_i(f, P) = O(\exp(-\delta^2 n / C))$. Compared to the case where the $X_i$ are uniformly distributed, this demonstrates that $e_i(f, P)$ can depend strongly on $P$, even when $P$ is restricted to being a product measure.

2.3 The main result

Our main theorem is the following:
Theorem 2.6. (a) For every $\delta > 0$ and $\epsilon > 0$, there is a $\tau > 0$ such that for every weighted plurality function $f$ with weights $w_i$ and every probability distribution $P$ on $[k]^n$, if $e_i(f, P) \le \tau$ for all $i \in [n]$ and there is a set $A \subset [k]$ such that
\[
\sum_i w_i P(X_i = a) \ge \sum_i w_i P(X_i = b) + \delta
\]
for all $a \in A$ and all $b \notin A$, then $P(f(X) \in A) \ge 1 - \epsilon$.

(b) If $f$ is not a weighted plurality function then there exists a probability distribution $P$ on $[k]^n$ such that $P(X_i = 0) > P(X_i = 1)$ for all $i \in [n]$ but $P(f(X) = 1) = 1$ (and hence $e_i(f, P) = 0$ for all $i$).

We remark that the Theorem is constructive in the sense that we can give an algorithm (based on solving a linear program) which either constructs some weights $w_i$ witnessing the fact that $f$ is a weighted plurality, or a probability distribution $P$ satisfying part (b).

Parts (a) and (b) of Theorem 2.6 are converse to one another in the following sense: under the hypothesis of small effects, part (a) says that if there is a gap between the popularity of the most popular alternatives $A$ and the less popular alternatives $A^c$ then a weighted plurality function will choose an alternative in $A$. Part (b) shows that this property fails for every function that is not a weighted plurality. Note that part (a) has an important special case, which is closer to the statement of [2]: if $P(X_i = a) \ge P(X_i = b) + \delta$ for all $i \in [n]$ and all $b \ne a$, then $f(X) = a$ with high probability if the effects are small enough.

The remainder of the paper is devoted to the proof of Theorem 2.6.

Proof of Theorem 2.6 (a).
This part of the proof follows very closely the argument in [2]. Suppose that $f$ is a weighted plurality function with weights $w_i$. The first step is to show that $f$ is “correlated” in some sense with each voter: define $p_{ij} = P(X_i = j)$ and let $W_j$ be the (random) weight assigned to alternative $j$: $W_j = \sum_{i : X_i = j} w_i$. Then
\[
\begin{aligned}
\mathbb{E} \sum_{i=1}^n w_i \sum_{j \in [k]} 1_{\{f(X) = j\}} \bigl( 1_{\{X_i = j\}} - p_{ij} \bigr)
&= \mathbb{E} \Bigl( \sum_{i,j} w_i 1_{\{f(X) = j\}} 1_{\{X_i = j\}} - \sum_{i,j} 1_{\{f(X) = j\}} w_i p_{ij} \Bigr) \\
&= \mathbb{E} \sum_{i,j} w_i 1_{\{f(X) = j\}} 1_{\{X_i = j\}} - \sum_j P(f = j)\, \mathbb{E} W_j.
\end{aligned}
\tag{1}
\]
Now, let $\alpha_j = P(f = j)$ and set $\tilde\alpha_j = \alpha_j / \bigl(\sum_{i \in A} \alpha_i\bigr)$ for $j \in A$ and $\tilde\alpha_j = 0$ otherwise. Then
\[
\mathbb{E} \sum_{i,j} w_i 1_{\{f(X) = j\}} 1_{\{X_i = j\}}
= \mathbb{E} \sum_j 1_{\{f(X) = j\}} W_j
\ge \mathbb{E} \sum_j 1_{\{f(X) = j\}} \sum_i \tilde\alpha_i W_i
= \sum_i \tilde\alpha_i\, \mathbb{E} W_i
\tag{2}
\]
since the winning alternative always has at least as much weight as any convex combination of alternatives. Since $\min_{j \in A} \mathbb{E} W_j \ge \max_{j \notin A} \mathbb{E} W_j + \delta$, we can plug (2) into (1) to obtain
\[
(1) \ge \sum_j \tilde\alpha_j\, \mathbb{E} W_j - \sum_j \alpha_j\, \mathbb{E} W_j \ge \sum_{j \in A} (\tilde\alpha_j - \alpha_j)\, \delta = \delta P(f \notin A).
\]
Recalling that $e_i(f, P) \ge 4 \sum_j \mathrm{Cov}\bigl(1_{\{f = j\}}, 1_{\{X_i = j\}}\bigr)$, we have
\[
\delta P(f \notin A) \le \mathbb{E} \sum_{i=1}^n w_i \sum_{j \in [k]} 1_{\{f(X) = j\}} \bigl( 1_{\{X_i = j\}} - p_{ij} \bigr) \le \frac{1}{4} \sum_i w_i e_i(f, P) \le \frac{\tau}{4}.
\]
Hence $P(f(X) \notin A) \le \epsilon$ provided that we choose $\tau$ small enough that $\epsilon \ge \tau / (4\delta)$.

The proof of the second part of the theorem follows the idea of [2], in that we use linear programming duality to find a witness for $f$ being a weighted plurality function. However, the details of the proof are quite different, since [2] uses a well-known linear program (the fractional vertex cover of a hypergraph) which does not extend beyond $k = 2$. Instead, we write down a linear program such that if the primal has a non-negative value, then $f$ is a weighted plurality function. Otherwise, the dual has a small value and the dual variables witness the claim of Theorem 2.6 (b).
In particular, note that this proof provides the algorithm that we mentioned after the statement of Theorem 2.6.

First we make a trivial observation that will simplify our linear program considerably: if a function is neutral, it is easier to check whether it is a weighted plurality because it is not necessary to try all possible combinations of $a, b \in [k]$:

Proposition 2.7.
Suppose $f \colon [k]^n \to [k]$ is neutral. Then $f$ is a weighted plurality if and only if there exist weights $w_1, \dots, w_n \in \mathbb{R}_{\ge 0}$ with $\sum_i w_i = 1$ such that $f(x) = 1$ implies that
\[
\sum_{i : x_i = 1} w_i \ge \sum_{i : x_i = 0} w_i.
\]

We can write a linear program for checking whether a given neutral function $f$ is a weighted plurality. The variables for this program are $t$ (written as $t^+ - t^-$); $w_i$ for each $i \in [n]$; and $g_x$ for each $x \in [k]^n$ for which $f(x) = 1$. In standard form, the primal program is the following:
\[
\begin{array}{lll}
\text{maximize} & t^+ - t^- & \\
\text{subject to} & g_x \ge 0 & \text{for all } x \in [k]^n \text{ such that } f(x) = 1 \\
 & w_i \ge 0 & \text{for all } i \in [n] \\
 & t^+ \ge 0 & \\
 & t^- \ge 0 & \\
 & \sum_i w_i = 1 & \\
 & \sum_{i : x_i = 1} w_i - \sum_{i : x_i = 0} w_i - g_x - (t^+ - t^-) = 0 & \text{for all } x \in [k]^n \text{ with } f(x) = 1.
\end{array}
\]

Proposition 2.8.
Let $t^*$ be the value of the above linear program. If $t^* \ge 0$ then $f$ is a weighted plurality function.

Proof. Let $w_i$, $g_x$, $t^+$ and $t^-$ be feasible points such that $t^+ - t^- \ge 0$. Then, for all $x$ with $f(x) = 1$,
\[
\sum_{i : x_i = 1} w_i - \sum_{i : x_i = 0} w_i = g_x + (t^+ - t^-) \ge 0,
\]
and so $f$ satisfies the conditions of Proposition 2.7.

Now consider the dual program; since the primal is in standard form, the dual is easy to write down. Let the dual variables be $a$ and $q_x$ for all $x$ such that $f(x) = 1$. Then the dual program is:
\[
\begin{array}{lll}
\text{minimize} & a^+ - a^- & \\
\text{subject to} & \sum_{x : f(x) = 1} q_x \le -1 & \\
 & -\sum_{x : f(x) = 1} \bigl( 1_{\{x_i = 0\}} - 1_{\{x_i = 1\}} \bigr) q_x + (a^+ - a^-) \ge 0 & \text{for all } i \in [n] \\
 & q_x \le 0 & \text{for all } x \text{ such that } f(x) = 1 \\
 & a^+ \le 0 & \\
 & a^- \le 0. &
\end{array}
\]

Proposition 2.9.
Let $a^*$ be the value of the above dual program. If $a^* < 0$ then there exists a probability distribution on $[k]^n$ such that $P(X_i = 0) > P(X_i = 1)$ for all $i$ but $f(X) = 1$ almost surely.

Proof. Choose a feasible point with $a^+ - a^- < 0$ and set $p_x = q_x / \bigl(\sum_y q_y\bigr)$. Since every $q_x \le 0$ and $\sum_y q_y \le -1 < 0$, we have $p_x \ge 0$ and $\sum_x p_x = 1$, so we can define a probability distribution by $P(X = x) = p_x$ when $f(x) = 1$ and $P(X = x) = 0$ otherwise. Then $f(X) = 1$ almost surely. Moreover, $a^+ - a^- < 0$ implies that
\[
\sum_{x : f(x) = 1} 1_{\{x_i = 1\}} q_x > \sum_{x : f(x) = 1} 1_{\{x_i = 0\}} q_x
\]
for all $i$. Dividing by the negative quantity $\sum_y q_y$ reverses the inequality. Thus,
\[
P(X_i = 1) = \sum_{x : f(x) = 1} 1_{\{x_i = 1\}} p_x < \sum_{x : f(x) = 1} 1_{\{x_i = 0\}} p_x = P(X_i = 0)
\]
for all $i$.

To conclude the proof of Theorem 2.6, note that both the primal and dual programs are feasible and bounded and so $a^* = t^*$.

References

[1] S.J. Brams and P.C. Fishburn. Voting procedures.
Handbook of social choice and welfare, 1:173–236, 2002.
[2] O. Häggström, G. Kalai, and E. Mossel. A law of large numbers for weighted majority. Advances in Applied Mathematics, 37(1):112–123, 2006.
[3] J. Kahn, G. Kalai, and N. Linial. The influence of variables on Boolean functions. In Proceedings of the 29th Annual Symposium on Foundations of Computer Science, pages 68–80. IEEE Computer Society, 1988.
[4] G. Margulis. Probabilistic characteristic of graphs with large connectivity. Problems Info. Transmission, 10:174–179, 1977.
[5] L. Russo. An approximate zero-one law. Probability Theory and Related Fields, 61(1):129–139, 1982.