Arrow's Theorem Through a Fixpoint Argument
L.S. Moss (Ed.): TARK 2019, EPTCS 297, 2019, pp. 175–188, doi:10.4204/EPTCS.297.12
© Frank M. V. Feys & Helle Hvid Hansen. This work is licensed under the Creative Commons Attribution License.
Frank M. V. Feys Helle Hvid Hansen
Faculty of Technology, Policy and Management, Delft University of Technology, Delft, The Netherlands
We present a proof of Arrow’s theorem from social choice theory that uses a fixpoint argument. Specifically, we use Banach’s result on the existence of a fixpoint of a contractive map defined on a complete metric space. Conceptually, our approach shows that dictatorships can be seen as “stable points” (fixpoints) of a certain process.
Keywords
Social choice theory · voting · Arrow’s impossibility theorem · Banach’s fixpoint theorem · dictatorship · force · fixpoint · metric

Arrow’s impossibility theorem, introduced in Kenneth Arrow’s seminal monograph Social Choice and Individual Values [2], deals with the issue of aggregating the preferences of a group of individuals into a single collective preference that appropriately represents the group.

Arrow defined a social welfare function to be a function that aggregates a collection of individual preferences into a societal preference, also called social choice. Preferences are defined as linear orders, where candidates are ranked from top to bottom. He considered the following three desirable properties that a social welfare function, which can be thought of as an election scheme or voting rule, might satisfy:

• Pareto condition. If all voters rank some candidate higher than another one, then the social choice should do so as well.
• Independence of irrelevant alternatives. The societal ranking of any two candidates depends only on their relative rankings in the individual rankings, and on nothing else.
• Non-dictatoriality. There is no dictator, meaning that there is no voter whose individual preference always coincides with the societal outcome.

Arrow’s impossibility theorem states that these seemingly mild conditions of reasonableness cannot be simultaneously satisfied by any voting rule.

In this paper we provide a new proof of this theorem that uses Banach’s fixpoint theorem, which states that a contractive map on a complete metric space has a unique fixpoint [3]. Our proof method shows that dictatorships can be seen as “stable points” (fixpoints) of a certain process.

We give a sketch of our proof. First we define a metric parametrized by a probability distribution on profiles. The distance between two voting rules is the probability under the given distribution that the outcome of the election is different. The use of a distribution has the benefit that it allows for profiles to be considered concurrently. Inspired by the technical notion of influence from Boolean analysis [17], we then introduce a notion that we call “force” (a related idea was already presented in [8]). The force of a voter on a given voting rule is the probability that the outcome of the election coincides with that voter’s preference. We use this notion to define a map that changes voting rules by considering the least powerful voters’ votes and transferring them to the most powerful voter. This map is shown to be a contraction. The process converges towards its unique fixpoint, because of Banach’s fixpoint theorem, and we show this fixpoint to be the set of dictators. One technical point that is fundamental to our approach is that we must consider voting rules only up to equivalence via permutation of the voters, leading to a quotient metric space. This requires us to see to it that the original metric extends appropriately to the quotient metric. A certain group action takes care of that. Our technical development (metric space) is parametric in the distribution, and we study all notions in as much generality as possible. In the final step, we pick a specific distribution and derive Arrow’s theorem from it.

Unlike most previous proofs, our proof does not require us to manipulate specific profiles. Instead, it offers an analytic perspective in terms of fixpoints and convergence rather than a combinatorial one. The use of a probability distribution for measuring distance between voting rules is crucial in that respect. Our employment of fixpoints connects Arrow’s theorem to the literature of mathematical economics, where many equilibrium concepts, such as Nash equilibrium [16], arise as fixpoints.

We finish the introduction by discussing related work. A vast number of proofs have appeared since Arrow’s first demonstration of the impossibility theorem. Arrow’s original proof proceeded by showing the existence of a “decisive” voter. This approach was then refined by others, such as Blau [6] and Kirman and Sondermann [14], who used ultrafilters. Barbera [5] replaced the notion of a decisive voter with the weaker notion of a “pivotal” voter. In later work, Geanakoplos [11] and Reny [19] sharpened the latter approach by introducing the concept of an “extremely pivotal” voter in order to obtain shorter proofs. Even though the aforementioned proofs are all distinct, they are all of a similar, combinatorial nature: One proves results about properties such as decisiveness or pivotalness by defining and manipulating profiles in a clever way that allows for the desirable properties to be taken advantage of. To our knowledge, Kalai’s [13] proof of Arrow’s theorem based on Fourier analysis on the Boolean cube was the first approach to take a different stance. Kalai’s core idea was to calculate the probability of having a Condorcet cycle under some probability distribution. One advantage of this approach is that, by using a theorem from Boolean analysis by Friedgut, Kalai, and Naor [9], it produces a robust, quantitative version of the theorem, in the following sense: The more one seeks to avoid Condorcet’s paradox, the more the election scheme will look like a dictator. Our approach is similar to Kalai’s in that we use quantitative methods, but it differs from it as we prove the classical version of Arrow’s theorem and not a quantitative one.
Contents of this paper.
In Section 2, we give a brief overview of the basic formal concepts that are used in this paper. We also formulate Arrow’s theorem in a precise way. In Section 3, we present our fixpoint-based proof of Arrow’s theorem. Finally, we conclude and discuss future work in Section 4. The proofs that are not given in the main body of the text can be found in the appendix.
In this section we introduce all basic notions that will be used later.
In this subsection we recall the basic Arrovian framework, going back to Arrow’s original work [2].

We assume that the $n$ voters are linearly ordered, so without loss of generality we may suppose that the set of voters is $\{1, 2, \ldots, n\}$ with the natural ordering. Given a set $A$, we let $L(A)$ be the set of strict total orders on $A$. For our purposes, $A$ will be the set of candidates of the election, which we shall always assume to be finite. We identify an element of $L(A)$ with a complete listing of the elements of $A$ in order of preference. If $\ell \in L(A)$ and $a, b \in A$, we also write $a \mathrel{\ell} b$ to mean that $(a, b) \in \ell$. We shall use $x = (x_1, x_2, \ldots, x_n)$ to refer to an element of $L(A)^n$. Such an element is usually called a profile in social choice theory. The element $x_i$ is the individual preference of voter $i$.

Definition 2.1
A voting rule for $n$ voters and set of candidates $A$ is a map $L(A)^n \to L(A)$.

Here the interpretation is that if the preferences of the $n$ voters are respectively $x_1, x_2, \ldots, x_n \in L(A)$, then the outcome of the election under the voting rule $f$ is $f(x_1, x_2, \ldots, x_n) \in L(A)$. The latter is the so-called social choice given how the electorate voted.

A particularly bad type of voting rule is the following:

Definition 2.2
The $n$ projection maps $L(A)^n \to L(A)$ are called dictatorships. The $i$-th dictator is denoted as $\mathrm{Dict}_i$, where $i = 1, 2, \ldots, n$. We furthermore define $\mathrm{DICT}_n = \{ \mathrm{Dict}_i \mid i = 1, 2, \ldots, n \}$.

Note that $\mathrm{Dict}_i(x) = x_i$ for all $x = (x_1, x_2, \ldots, x_n) \in L(A)^n$. We write $\mathrm{DICT}_n$ shorthand as $\mathrm{DICT}$.

Definition 2.3
Let $f : L(A)^n \to L(A)$ be a voting rule for $n$ voters and set of candidates $A$. We say that $f$ satisfies the
• Pareto property if for all $a, b \in A$ and all $x = (x_1, x_2, \ldots, x_n) \in L(A)^n$, if $a \mathrel{x_i} b$ for all $i$, then $a \mathrel{f(x)} b$.
• independence of irrelevant alternatives (IIA) property in case for all $a, b \in A$ and for all profiles $x = (x_1, x_2, \ldots, x_n) \in L(A)^n$ and $x' = (x'_1, x'_2, \ldots, x'_n) \in L(A)^n$, if ($a \mathrel{x_i} b$ iff $a \mathrel{x'_i} b$) for all $i$, then ($a \mathrel{f(x)} b$ iff $a \mathrel{f(x')} b$).

The Pareto property speaks for itself: It states that for any candidates $a$ and $b$, if the whole electorate prefers $a$ to $b$, then surely also in the social choice $a$ ought to be preferred to $b$. The independence of irrelevant alternatives property is a bit more difficult to grasp. It says that the relative social ranking of two alternatives only depends on their relative individual rankings. In that sense there should be no dependence on any other alternative (hence “irrelevant”).

Definition 2.4
We let $\mathrm{PIIA}_n = \{ f : L(A)^n \to L(A) \mid f \text{ satisfies Pareto and IIA} \}$.

We write $\mathrm{PIIA}_n$ shorthand as $\mathrm{PIIA}$. Note that $\mathrm{DICT}_n \subseteq \mathrm{PIIA}_n$.

Arrow set out to investigate whether there was a non-dictatorial voting rule satisfying both the Pareto condition and IIA. The answer turned out to be negative.

Theorem (Arrow [2])
A voting rule for at least three candidates that satisfies the Pareto property and IIA must be a dictatorship, i.e., $\mathrm{PIIA}_n = \mathrm{DICT}_n$ for all $n \geq 1$.

We assume basic knowledge of metric spaces such as introduced in, e.g., [1]. Here we just recall a few basic definitions. A metric space $(X, d)$ is a set $X$ equipped with a metric $d$.

A crucial element in our proof is Banach’s fixpoint theorem [3]. We recall that a function $F : X \to X$ on a metric space $X$ is contractive if there is a $C < 1$ such that $d(F(x_1), F(x_2)) \leq C \cdot d(x_1, x_2)$ for all $x_1, x_2 \in X$. A fixpoint of $F$ is an element $x^*$ such that $F(x^*) = x^*$. For any map $F : X \to X$, by $F^{(n)}$ we mean the $n$-fold composition $F \circ \cdots \circ F$.

Theorem 2.5 (Banach Fixpoint Theorem)
Let $(X, d)$ be a non-empty complete metric space. If $F : X \to X$ is contractive then $F$ has a unique fixpoint $x^*$. For all $x \in X$, $x^* = \lim_{n \to \infty} F^{(n)}(x)$.
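To make the convergence statement of Theorem 2.5 concrete, here is a minimal Python sketch. It is an illustration only and not part of the proof: the map `half_plus_one`, the helper `iterate_to_fixpoint`, and the tolerance are assumptions chosen for the example. On the real line with the usual distance, $F(x) = x/2 + 1$ is contractive with $C = 1/2$, and its unique fixpoint is $2$.

```python
def iterate_to_fixpoint(F, x0, tol=1e-12, max_iter=10_000):
    """Iterate x0, F(x0), F(F(x0)), ... until two successive values differ by at most tol."""
    x = x0
    for _ in range(max_iter):
        nxt = F(x)
        if abs(nxt - x) <= tol:
            return nxt
        x = nxt
    raise RuntimeError("no convergence within max_iter steps")


def half_plus_one(x):
    # Contractive on the reals with C = 1/2: |F(x) - F(y)| = |x - y| / 2.
    # The unique fixpoint solves x = x/2 + 1, i.e. x = 2.
    return x / 2 + 1


# Two very different starting points converge to the same fixpoint.
fix_from_zero = iterate_to_fixpoint(half_plus_one, 0.0)
fix_from_far = iterate_to_fixpoint(half_plus_one, 1000.0)
```

The starting point does not matter, which is the uniqueness part of the theorem seen numerically.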
When $X$ is a compact metric space it suffices to show that $d(F(x_1), F(x_2)) < d(x_1, x_2)$ for all distinct $x_1, x_2 \in X$ to conclude that $F$ is a contraction. This is in particular true if $X$ is finite. Recall that if there is an $n$ such that $F^{(n)}$ is a contraction, then $F$ has a unique fixpoint (see, e.g., [18]). We shall use this fact later.

Let $(X, d)$ be any metric space for which on the underlying set $X$ an equivalence relation $\sim$ is defined. For reasons that will become clear later, we wish to extend the metric on $X$ to a metric on $X/{\sim}$. In general, the following construction exists (see, e.g., [7] for the details). A chain $C$ between points $x \in X$ and $y \in X$ is a sequence of points $x = a_1 \sim b_1, a_2 \sim b_2, \ldots, a_n \sim b_n = y$, and the length of such a chain is defined as $\mathrm{length}(C) = \sum_{i=1}^{n-1} d(b_i, a_{i+1})$ (this sum is understood to be zero when $n = 1$). We then define
\[ d_\sim([x], [y]) = \inf \{ \mathrm{length}(C) \mid C \text{ is a chain between } x \text{ and } y \}. \qquad (1) \]

It is easy to see that $d_\sim$ is well-defined and satisfies all axioms of a pseudometric (meaning that the distance between two distinct points can be zero). Note though that $d_\sim([x], [y]) = 0 \Rightarrow [x] = [y]$ is not generally true. However, in case $d$ is such that the infimum is always attained (which happens in particular in case $X$ is finite), then $d_\sim$ is, in fact, a metric:

Lemma 2.6
Let $(X, d)$ be a metric space and $\sim$ an equivalence relation on $X$. If the infimum in $d_\sim$ is always attained, then $d_\sim$ is a metric on $X/{\sim}$.

Proof. Indeed, if $d_\sim([x], [y]) = 0$, then $0 = d_\sim([x], [y]) = \mathrm{length}(C)$ for some chain $C$ between $x$ and $y$ given by $x = a_1 \sim b_1, a_2 \sim b_2, \ldots, a_n \sim b_n = y$, so $d(b_i, a_{i+1}) = 0$ for all $i$. Since $d$ is a metric, we get $b_i = a_{i+1}$ for all $i$. Thus, $x = a_1 \sim a_2 \sim \cdots \sim a_n \sim y$, which implies $x \sim y$ by transitivity. Hence, $[x] = [y]$.

Nonetheless, the defining formula for $d_\sim$ is difficult to work with. The following result gives a sufficient condition for the formula to reduce to an easier one. It uses the notion of action and orbit under a group action (see, e.g., [20]). A group action of a group $G$ on a set $X$ is a map $\varphi : G \times X \to X$ such that $\varphi(e, x) = x$ for all $x \in X$, where $e$ is the identity element of $G$, and $\varphi(g, \varphi(g', x)) = \varphi(gg', x)$ for all $g, g' \in G$ and $x \in X$. The orbit of an $x \in X$ under the group action $\varphi$ is the set $\{ \varphi(g, x) \mid g \in G \}$. We also recall that the set of isometries on a metric space forms a group, with function composition as group law. By a group of isometries we mean a subgroup of that group.

Proposition 2.7
If $(X, d)$ is a metric space endowed with an equivalence relation $\sim$ where the equivalence classes are the orbits of the action of a group of isometries on $(X, d)$, then it holds for all $[x], [y] \in X/{\sim}$ that $d_\sim([x], [y]) = \inf \{ d(x', y') \mid x' \in [x], y' \in [y] \}$.

Proof. This is part of Theorem 2.1 from [7].
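As a sanity check on Proposition 2.7, the following Python sketch compares the chain-based definition (1) with the orbit formula on a toy space. Everything in it is an assumption made for illustration: the space is $\{0, \ldots, 5\}$ with $d(x, y) = |x - y|$, and the group of isometries consists of the identity and the reflection $x \mapsto 5 - x$. The infimum over chains is computed as a shortest-path problem in which steps between equivalent points cost nothing.

```python
from itertools import product

points = range(6)  # assumed toy metric space {0, ..., 5}


def d(x, y):
    return abs(x - y)


def reflect(x):
    # An isometry of the toy space: |(5 - x) - (5 - y)| = |x - y|.
    return 5 - x


def orbit(x):
    # Orbit of x under the two-element group {identity, reflect}.
    return frozenset({x, reflect(x)})


def orbit_formula(x, y):
    # Right-hand side of Proposition 2.7: minimum distance between the two orbits.
    return min(d(a, b) for a in orbit(x) for b in orbit(y))


def chain_distance():
    # Definition (1) as all-pairs shortest paths (Floyd-Warshall):
    # moving between equivalent points is free, any other step costs d.
    w = {(x, y): (0 if orbit(x) == orbit(y) else d(x, y))
         for x in points for y in points}
    for k, i, j in product(points, repeat=3):  # k is the outermost loop variable
        w[i, j] = min(w[i, j], w[i, k] + w[k, j])
    return w


chains = chain_distance()
```

On this example both computations agree for every pair of points, as the proposition predicts for orbits of a group of isometries.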
In this section, we give the new proof of Arrow’s theorem. The general idea is to define a contraction thathas the set of dictators as its unique fixpoint.
We wish to use Banach’s fixpoint theorem. In order to do so, we need to view our voting rules as part of some metric space. For technical reasons, we shall focus on voting rules that satisfy the Pareto property. Thus, we let $M$ be the set of all voting rules $L(A)^n \to L(A)$ that are Pareto.

The idea to view voting rules as a metric space where the metric is based on a distribution is inspired by earlier work from the Boolean analysis approach to social choice theory, such as Mossel [15].

Definition 3.1
Given a probability distribution $\mu$ on $L(A)^n$, we define a map $d_\mu : M \times M \to \mathbb{R}$ by
\[ d_\mu(f, g) = \Pr_{x \sim \mu}[f(x) \neq g(x)]. \]

That is, $d_\mu(f, g)$ is the probability under $\mu$ that $f$ and $g$ produce a different outcome. For any set $X$, by $\mathbf{1}_X$ we mean the indicator function on $X$. Note that
\[ d_\mu(f, g) = \Pr_{x \sim \mu}[f(x) \neq g(x)] = \sum_{x \in L(A)^n} \mu(x)\, \mathbf{1}_{f(x) \neq g(x)}. \]

Throughout this paper, $\mu$ is a distribution on $L(A)^n$, but for brevity we often simply say that “$\mu$ is a distribution”. For example, taking $\mu$ to be the uniform distribution is known in the literature as the impartial culture assumption [10].

Under a mild assumption on $\mu$ we obtain our desired metric space.

Proposition 3.2
If $\mu$ is a probability distribution on $L(A)^n$ with full support, then $d_\mu$ is a metric on $M$.

Proof. Clearly $d_\mu(f, f) = 0$ for all $f \in M$. Since $\mu$ has full support, $d_\mu(f, g) = \Pr_{x \sim \mu}[f(x) \neq g(x)] = 0$ if and only if $f = g$. Symmetry follows immediately. To prove the triangle inequality, let $f, g, h \in M$. Note that for any $x$, $f(x) \neq h(x)$ implies $f(x) \neq g(x)$ or $g(x) \neq h(x)$. Thus,
\[ \Pr_{x \sim \mu}[f(x) \neq h(x)] \leq \Pr_{x \sim \mu}[f(x) \neq g(x) \text{ or } g(x) \neq h(x)], \]
and the latter is at most $\Pr_{x \sim \mu}[f(x) \neq g(x)] + \Pr_{x \sim \mu}[g(x) \neq h(x)]$.

Note that the metric space $(M, d_\mu)$ is complete and compact for any $\mu$, since it is finite.

The proof idea is to define a contraction that has the set of dictators as its unique fixpoint. We start by defining a map $\Phi$ on $M$. Essentially, given a voting rule $f$, we will define $\Phi(f)$ as the voting rule in which the “least forceful” voter’s vote is replaced by the “most forceful” voter’s vote. This idea is inspired by, though different from, the notion of influence that is well-studied in Boolean analysis [17]. The influence of the $i$-th voter is defined as the probability that the $i$-th vote affects the outcome (assuming there are only two candidates). In political science and voting theory, influence also goes by the name voting power. Voting power can be thought of as the ability of a legislator, by his vote, to affect the passage or defeat of a measure [4].

To make the aforementioned idea formal, we define what we mean by the force of a voter.

Definition 3.3
Given a probability distribution $\mu$ on $L(A)^n$ and voter $i$, we define the force of voter $i$ on voting rule $f$ under distribution $\mu$ as
\[ F^i_\mu[f] = \Pr_{x \sim \mu}[f(x) = x_i]. \]
The set of most forceful voters is
\[ \mathrm{MostForce}_\mu[f] = \operatorname{argmax}_{i \in \{1, 2, \ldots, n\}} F^i_\mu[f]. \]
The set of least forceful voters is
\[ \mathrm{LeastForce}_\mu[f] = \operatorname{argmin}_{i \in \{1, 2, \ldots, n\}} F^i_\mu[f]. \]

The notion of force was first introduced in [8].
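The notions introduced so far (profiles, dictators, the Pareto and IIA properties of Definition 2.3, the distance $d_\mu$ of Definition 3.1, and the force of Definition 3.3) can be explored by brute force on a small election. The Python sketch below is illustrative only. It assumes three candidates, three voters indexed from 0 rather than 1, and the uniform distribution. The helper names and the Borda rule with alphabetical tie-breaking are assumptions introduced here as an example of a Pareto rule that is not IIA; none of them come from the paper.

```python
from fractions import Fraction
from itertools import permutations, product

CANDIDATES = "abc"                                   # assumed toy candidate set A
N_VOTERS = 3                                         # assumed number of voters n
ORDERS = list(permutations(CANDIDATES))              # L(A), best candidate first
PROFILES = list(product(ORDERS, repeat=N_VOTERS))    # L(A)^n
UNIFORM = {x: Fraction(1, len(PROFILES)) for x in PROFILES}  # full-support mu


def prefers(order, a, b):
    """True if candidate a is ranked above candidate b in the given order."""
    return order.index(a) < order.index(b)


def dictator(i):
    """Dict_i for a 0-based voter index i."""
    return lambda x: x[i]


def borda(x):
    """Borda count with alphabetical tie-breaking (illustrative rule only)."""
    score = {c: sum(len(CANDIDATES) - 1 - xi.index(c) for xi in x)
             for c in CANDIDATES}
    return tuple(sorted(CANDIDATES, key=lambda c: (-score[c], c)))


def is_pareto(f):
    """Pareto: whenever every voter ranks a above b, so does f."""
    return all(
        prefers(f(x), a, b)
        for x in PROFILES
        for a, b in permutations(CANDIDATES, 2)
        if all(prefers(xi, a, b) for xi in x)
    )


def satisfies_iia(f):
    """IIA: the social a-vs-b ranking depends only on the voters' a-vs-b rankings."""
    for a, b in permutations(CANDIDATES, 2):
        outcome_for = {}
        for x in PROFILES:
            pattern = tuple(prefers(xi, a, b) for xi in x)
            outcome = prefers(f(x), a, b)
            if outcome_for.setdefault(pattern, outcome) != outcome:
                return False
    return True


def distance(mu, f, g):
    """d_mu(f, g) = Pr_{x ~ mu}[f(x) != g(x)]."""
    return sum(p for x, p in mu.items() if f(x) != g(x))


def force(mu, f, i):
    """F^i_mu[f] = Pr_{x ~ mu}[f(x) = x_i]."""
    return sum(p for x, p in mu.items() if f(x) == x[i])
```

Under the uniform distribution, two distinct dictators disagree exactly when the two voters submit different rankings, so `distance(UNIFORM, dictator(0), dictator(1))` is $5/6$. Dictator 0 has force $1$ for voter 0 and force $1/6$ for every other voter.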
Definition 3.4
Let $\mu$ be a given distribution with full support. Given a voting rule $f \in M$, we define a new voting rule $\Phi_\mu(f) \in M$ by $\Phi_\mu(f)(x_1, \ldots, x_n) = f(y_1, \ldots, y_n)$, where
\[ y_i = \begin{cases} x_i & \text{if } i \notin \mathrm{LeastForce}_\mu[f], \\ x_{\min \mathrm{MostForce}_\mu[f]} & \text{else.} \end{cases} \]
This defines a map $\Phi_\mu : M \to M$, also written for short as $\Phi$ if $\mu$ is understood from the context.

So what happens when passing from $f$ to $\Phi(f)$ is that all voters with least force have their vote replaced by the vote of the first voter that has maximal force. Note that $\Phi(f)$ is Pareto if $f$ is Pareto, so $\Phi$ is well-defined on $M$. The choice to consistently pick specifically the first voter in Definition 3.4 is arbitrary. We might as well always pick the last voter, for instance. Our choice is merely a technicality to ensure that the votes are transferred to one specific voter.

Proposition 3.5
For all $i$ we have $\Phi(\mathrm{Dict}_i) = \mathrm{Dict}_i$.

Proof. Voter $i$ is the unique most forceful voter of $\mathrm{Dict}_i$. It is irrelevant how others vote, and we have $\Phi(\mathrm{Dict}_i)(x_1, \ldots, x_n) = x_i$ for all $(x_1, \ldots, x_n)$. Hence, $\Phi(\mathrm{Dict}_i) = \mathrm{Dict}_i$.

Note that $\Phi$ cannot be a contraction, since every dictator is a fixed point of $\Phi$, contradicting the uniqueness of the fixed point in Banach’s fixpoint theorem. The idea is therefore to consider all dictators to be “the same”, i.e., to consider them to be equivalent under some equivalence relation.

Given an equivalence relation $\sim$, we write $[x]$ for the equivalence class of $x$ under $\sim$. We shall write $M_\sim$ for $M/{\sim}$ throughout.

The remainder of the proof is organized as follows. We define an equivalence relation $\sim$ using a notion of permutation-invariance of voting rules as well as our notion of force. We then show that $\mathrm{DICT}$ is an equivalence class of $\sim$. The use of permutations gives rise to a notion of permutation-invariance for distributions. We then show that for all distributions $\mu$ on $L(A)^n$ that have full support and are permutation-invariant, the following hold.
(i) The map $\Phi_\mu : M_\mu \to M_\mu$ is compatible with $\sim$, where $M_\mu$ is the metric space resulting from $\mu$, and hence extends to a map $(\Phi_\mu)_\sim : (M_\mu)_\sim \to (M_\mu)_\sim$.
(ii) The map $(\Phi_\mu)_\sim$ has $\mathrm{DICT}$ as unique fixpoint.
Finally, we prove Arrow’s theorem by showing the existence of a concrete distribution $\mu$ satisfying the abovementioned requirements.

M and Metric on M∼

We first have to find an appropriate equivalence relation on $M$. To do that, we use the following notion.

Definition 3.6
Given a permutation $\pi : n \to n$ of the voters, we define $\vec{\pi}(x)$ as $(x_{\pi(1)}, \ldots, x_{\pi(n)})$ for all tuples $x = (x_1, \ldots, x_n) \in L(A)^n$. If $f$ is a voting rule, we define a new voting rule, written as $f \circ \vec{\pi}$, by $(f \circ \vec{\pi})(x) = f(\vec{\pi}(x))$ for all $x \in L(A)^n$.

A permutation $\pi : n \to n$ is in what follows often written without types simply as $\pi$. Observe that $\vec{\pi} : L(A)^n \to L(A)^n$ is a bijection because $\pi$ is a bijection.

Definition 3.7
We define the relation $\sim$ on $M$ by letting $f \sim g$ iff (i) $f = g$, or (ii) there exists a permutation $\pi : n \to n$ such that $f = g \circ \vec{\pi}$ and $f$ has a unique voter with maximal force.

In this way we obtain our desired equivalence relation.
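The map $\Phi_\mu$ of Definition 3.4 and the composition $f \circ \vec{\pi}$ of Definition 3.6 can be sketched in Python as follows. This is a hedged illustration under stated assumptions, not the authors' code: it uses three candidates, three voters indexed from 0, the uniform distribution, and a Borda rule with alphabetical tie-breaking as an assumed example of a Pareto rule in $M$. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

CANDIDATES = "abc"                                # assumed toy candidate set
N = 3                                             # assumed number of voters
ORDERS = list(permutations(CANDIDATES))           # L(A)
PROFILES = list(product(ORDERS, repeat=N))        # L(A)^n
UNIFORM = {x: Fraction(1, len(PROFILES)) for x in PROFILES}


def dictator(i):
    return lambda x: x[i]


def borda(x):
    # Illustrative Pareto rule: Borda count, ties broken alphabetically.
    score = {c: sum(len(CANDIDATES) - 1 - xi.index(c) for xi in x)
             for c in CANDIDATES}
    return tuple(sorted(CANDIDATES, key=lambda c: (-score[c], c)))


def forces(mu, f):
    # List of F^i_mu[f] for i = 0, ..., N-1.
    return [sum(p for x, p in mu.items() if f(x) == x[i]) for i in range(N)]


def phi(mu, f):
    """Definition 3.4: least forceful voters copy the first most forceful voter."""
    fs = forces(mu, f)
    least = {i for i in range(N) if fs[i] == min(fs)}
    target = min(i for i in range(N) if fs[i] == max(fs))

    def new_rule(x):
        y = tuple(x[target] if i in least else x[i] for i in range(N))
        return f(y)

    return new_rule


def compose_perm(f, pi):
    """f composed with the profile rearrangement (x_{pi(0)}, ..., x_{pi(N-1)})."""
    return lambda x: f(tuple(x[pi[i]] for i in range(N)))


def same_rule(f, g):
    return all(f(x) == g(x) for x in PROFILES)
```

In this setting `phi(UNIFORM, dictator(1))` agrees with `dictator(1)` on every profile, matching Proposition 3.5. Composing `dictator(0)` with the permutation `(1, 0, 2)` yields `dictator(1)`, which is why all dictators fall into one class of $\sim$. For the symmetric Borda rule all forces are equal, so every voter is both least and most forceful and a single application of `phi` already yields `dictator(0)`.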
Proposition 3.8
The relation $\sim$ is an equivalence relation on $M$.

Proof. Clearly $\sim$ is reflexive, as equality is contained in $\sim$. To show symmetry, let $f \sim g$, so that $f = g$, or $f = g \circ \vec{\pi}$ for some $\pi$ where $f$ has a unique maximal force voter. The former case $f = g$ is trivial. In the latter case, clearly $g = f \circ \vec{\pi}^{-1}$, and because $f$ has a unique maximal force voter, the relation $f = g \circ \vec{\pi}$ implies that the same holds for $g$. Now to show transitivity, let $f \sim g$ and $g \sim h$. Thus, $f = g$, or $f = g \circ \vec{\pi}$ for some $\pi$ where $f$ has a unique maximal force voter. Also, $g = h$, or $g = h \circ \vec{\tau}$ for some $\tau$ where $g$ has a unique maximal force voter. There are four cases. We will only show two of them, as they are all easy.

Suppose $f = g$ and $g = h \circ \vec{\tau}$ where $g$ has a unique maximal force voter. Then since $g$ has a unique maximal force voter, the same holds for $h$, so the relation $f = h \circ \vec{\tau}$ implies that this is also true for $f$. If $f = g \circ \vec{\pi}$ and $g = h \circ \vec{\tau}$, where both $f$ and $g$ have a unique maximal force voter, we have that $f = h \circ (\vec{\tau} \circ \vec{\pi})$.

Note that $\mathrm{DICT} \subseteq M$, so it makes sense to speak about $[\mathrm{Dict}_i]$, and that the equivalence classes of $\sim$ are of the following two types:
• If $f$ does not have a unique maximal force voter, then $[f] = \{ f \}$.
• If $f$ does have a unique maximal force voter, then $[f] = \{ f \circ \vec{\pi} \mid \pi : n \to n \}$.

Proposition 3.9
The set $\mathrm{DICT}$ is an equivalence class of $\sim$.

Proof. For any $i$, we show that we have $[\mathrm{Dict}_i] = \mathrm{DICT}$. Let $f \in M$ be such that $f \sim \mathrm{Dict}_i$. If $f = \mathrm{Dict}_i$, surely $f \in \mathrm{DICT}$. Otherwise, we have that $f = \mathrm{Dict}_i \circ \vec{\pi}$ for some permutation $\pi$. Suppose $\pi(i) = j$. A trivial computation then shows that $f = \mathrm{Dict}_j$, so $f \in \mathrm{DICT}$. We now show that for any $j$, $\mathrm{Dict}_i \sim \mathrm{Dict}_j$. Indeed, if $\pi$ is the permutation that switches $i$ and $j$ and is the identity elsewhere, then $\mathrm{Dict}_i = \mathrm{Dict}_j \circ \vec{\pi}$. Furthermore, it is clear that any dictator has a unique voter with maximal force.

Let $\mu$ be a distribution with full support such that $d = d_\mu$ is a metric on $M$ (by Proposition 3.2), and consider the quotient metric $d_\sim$ defined in (1) for the relation $\sim$ from Definition 3.7. Since $M$ is finite, we obtain from Lemma 2.6 that $d_\sim$ is a metric on $M_\sim = M/{\sim}$. However, as we saw in that same subsection, the defining formula for $d_\sim$ is convoluted, so we set out to identify a subgroup of isometries that will allow us to apply Proposition 2.7, as that would give us a more convenient formula to work with. To do that, we shall need the following notion.

Definition 3.10
A distribution $\mu$ on $L(A)^n$ is $n$-permutation-invariant if for all $\pi : n \to n$, $\mu \circ \vec{\pi} = \mu$.

The condition $\mu = \mu \circ \vec{\pi}$ for all permutations $\pi$ demands that $\mu(x) = \mu(x')$ whenever $x'$ and $x$ are rearrangements of one another. Mathematically, it ensures that the probability distribution $\mu$ is well-defined up to permutation equivalence of profiles.

For each permutation $\pi : n \to n$, let $J_\pi : M \to M : f \mapsto f \circ \vec{\pi}$. This map is well-defined, as $f \circ \vec{\pi}$ is Pareto if $f$ is.

Proposition 3.11
Let $\mu$ be a distribution on $L(A)^n$ with full support and such that $\mu$ is $n$-permutation-invariant. Then $J_\pi$ is an isometry of $M$, for each $\pi$.

Proof. From the bijectivity of $\pi$, it follows that $J_\pi$ is bijective. Let $\pi$ be a permutation and $f, g \in M$. Then we need to show that $d_\mu(f, g) = d_\mu(f \circ \vec{\pi}, g \circ \vec{\pi})$. We calculate:
\[
\begin{aligned}
\Pr_{x \sim \mu}[(f \circ \vec{\pi})(x) \neq (g \circ \vec{\pi})(x)]
&= \sum_{x \in L(A)^n} \mu(x) \cdot \mathbf{1}_{(f \circ \vec{\pi})(x) \neq (g \circ \vec{\pi})(x)} \\
&= \sum_{x \in L(A)^n} \mu(\vec{\pi}^{-1}(x)) \cdot \mathbf{1}_{(f \circ \vec{\pi})(\vec{\pi}^{-1}(x)) \neq (g \circ \vec{\pi})(\vec{\pi}^{-1}(x))} \\
&= \sum_{x \in L(A)^n} \mu(\vec{\pi}^{-1}(x)) \cdot \mathbf{1}_{f(x) \neq g(x)}.
\end{aligned}
\]
Hence, $\Pr_{x \sim \mu}[(f \circ \vec{\pi})(x) \neq (g \circ \vec{\pi})(x)] = \Pr_{x \sim \mu \circ \vec{\pi}^{-1}}[f(x) \neq g(x)]$. Note that $\mu \circ \vec{\pi}^{-1} : L(A)^n \to [0, 1]$ is indeed a probability distribution on $L(A)^n$. The proof is complete after observing that $\mu \circ \vec{\pi}^{-1} = \mu$.

Let $G = \{ J_\pi \mid \pi \text{ is a permutation} \}$. From Proposition 3.11 it follows that $G$ is a subgroup of the group of isometries of $M$. We define a map $\cdot : G \times M \to M$ by
\[ J_\pi \cdot f = \begin{cases} J_\pi(f) & \text{if } f \text{ has a unique maximal force voter,} \\ f & \text{otherwise.} \end{cases} \]

Proposition 3.12
The operation $\cdot$ is a group action of the group $G$ on $M$ and its orbits coincide with the equivalence classes under $\sim$.

Proof. Note that $\mathrm{id} : M \to M$ is the identity of $G$, and $\mathrm{id} \cdot f = f$ for all $f \in M$. Also, $(g \circ h) \cdot f = g \cdot (h \cdot f)$ for all $g, h \in G$ and $f \in M$. Indeed, this easily follows by making the case distinction whether $f$ has a unique maximal force voter.

The orbit of an $f \in M$ under this group action is equal to $G \cdot f = \{ J_\pi \cdot f \mid \pi \}$. Now, $J_\pi \cdot f$ is $J_\pi(f) = f \circ \vec{\pi}$ if $f$ has a unique maximal force voter, and $f$ otherwise. Hence, $G \cdot f = \{ J_\pi \cdot f \mid \pi \} = [f]$. Thus, the orbits of the group action coincide with the equivalence classes under $\sim$.

Proposition 3.13
Let $\mu$ be a distribution with full support and with $\mu = \mu \circ \vec{\pi}$ for all permutations $\pi$. Let $d_\sim$ be the metric on $M/{\sim}$ based on the metric $d = d_\mu$ on $M$ (see Lemma 2.6). Then for all $[f], [g] \in M/{\sim}$ we have
\[ d_\sim([f], [g]) = \min \{ d(f', g') \mid f' \in [f]_\sim, g' \in [g]_\sim \}. \]

Proof. This follows from Proposition 2.7, Proposition 3.11, and Proposition 3.12.

Φ to a Map with Unique Fixpoint

From now on, unless specifically mentioned otherwise, we shall always assume that $\mu$ is a distribution on $L(A)^n$ with full support and with $\mu = \mu \circ \vec{\pi}$ for all permutations $\pi$, such that Proposition 3.13 applies. Such a $\mu$ obviously exists: the uniform distribution is an example, but one can easily obtain many other examples simply by giving one representative of each permutation equivalence class of profiles (meaning profiles $x$ and $y$ are equivalent iff there is a permutation $\pi$ such that $\vec{\pi}(x) = y$) a non-zero weight. Later we will exploit this property.

Our aim in this subsection is to extend the map $\Phi : M \to M$ to a map $M_\sim \to M_\sim$ where, as we recall, $M_\sim = M/{\sim}$. The following is a technical result that we shall use later.

Lemma 3.14
If $f = g \circ \vec{\pi}$ for $\pi$, then $F^i_\mu[g] = F^{\pi(i)}_\mu[f]$ for each $i = 1, 2, \ldots, n$.

Proof. For any $i$, let $p_i : L(A)^n \to L(A)$ be the $i$-th projection map. Then we have
\[ F^i_\mu[g] = \Pr_{x \sim \mu}[g(x) = x_i] = \sum_x \mu(x) \cdot \mathbf{1}_{g(x) = p_i(x)} = \sum_x \mu(\vec{\pi}(x)) \cdot \mathbf{1}_{g(\vec{\pi}(x)) = p_i(\vec{\pi}(x))}. \]
Note that $p_i(\vec{\pi}(x)) = x_{\pi(i)}$ for each $x$, so
\[ F^i_\mu[g] = \sum_x \mu(\vec{\pi}(x)) \cdot \mathbf{1}_{g(\vec{\pi}(x)) = x_{\pi(i)}}. \]
By assumption, $\mu \circ \vec{\pi} = \mu$ and $f = g \circ \vec{\pi}$, so the conclusion follows.

We now show that $\Phi$ respects the equivalence relation $\sim$, so that it extends to the quotient.

Proposition 3.15
For all $f, g \in M$, if $f \sim g$ then $\Phi(f) \sim \Phi(g)$.

Proof. Let $f, g \in M$ be such that $f \sim g$. If $f = g$, then clearly $\Phi(f) = \Phi(g)$. Now suppose there is a $\pi$ such that $f = g \circ \vec{\pi}$, and that $f$ (and hence also $g$) has a unique voter with maximal force. We need to show that $\Phi(f) \sim \Phi(g)$. In fact, we will show that $\Phi(f) = \Phi(g) \circ \vec{\pi}$. From Lemma 3.14 we know that $F^i_\mu[g] = F^{\pi(i)}_\mu[f]$ for each $i$. This implies that $\pi$ maps the first (and only) most forceful voter of $g$ to the first (and only) most forceful voter of $f$. Similarly, $\pi$ maps the least forceful voters in $g$ to the least forceful voters in $f$. Applying the definition of $\Phi$ (see Definition 3.4), we see that $\Phi(f) = \Phi(g) \circ \vec{\pi}$.

Definition 3.16
We define $\Phi_\sim : M_\sim \to M_\sim$ by $\Phi_\sim([f]) = [\Phi(f)]$.

We want to point out that, although we have not written it explicitly, $\Phi_\sim$ (and $\Phi$ alike) does depend on a chosen distribution $\mu$. By Proposition 3.15, $\Phi_\sim$ is well-defined. We also define for each $i = 1, 2, \ldots, n$ a map $s_i : L(A)^n \to L(A)^n$ by $s_i(x_1, \ldots, x_n) = (x_i, \ldots, x_i)$.

The following is a technical result.

Lemma 3.17
For any $f \in M$ and $i$, voter $i$ is the unique voter with maximal force on $f \circ s_i$. Moreover, $[f \circ s_i] = \{ f \circ s_k \mid k = 1, 2, \ldots, n \}$ for each $i$.

Proof. Since $f$ is Pareto, we have
\[ F^j_\mu[f \circ s_i] = \Pr_{x \sim \mu}[f(s_i(x)) = x_j] = \Pr_{x \sim \mu}[x_i = x_j] \]
for any $j$. Now as $\mu$ is assumed to have full support, $\Pr_{x \sim \mu}[x_i = x_j] = 1$ iff $x_i = x_j$ for all $x \in L(A)^n$, i.e., iff $i = j$. This proves that $i$ is the unique voter with maximal force.

From the first part we know that $f \circ s_i$ has a unique voter with maximal force. Hence we have $[f \circ s_i] = \{ (f \circ s_i) \circ \vec{\pi} \mid \pi \} = \{ f \circ s_k \mid k = 1, 2, \ldots, n \}$.

Proposition 3.18
The map $\Phi_\sim : M_\sim \to M_\sim$ has a unique fixpoint.

Proof. Since $M_\sim$ is finite, it suffices to show $d_\sim(\Phi^{(n)}_\sim([f]), \Phi^{(n)}_\sim([g])) < d_\sim([f], [g])$ for all distinct $[f], [g] \in M_\sim$. So let $[f], [g] \in M_\sim$. With every iteration of $\Phi$ on $f$, at least one voter loses their vote as it is taken over by the most forceful voter. Thus, there is an $i$ such that $\Phi^{(n)}(f)(x_1, \ldots, x_n) = f(x_i, \ldots, x_i)$ for all $(x_1, \ldots, x_n)$, or in other words, $\Phi^{(n)}(f) = f \circ s_i$. There is similarly a $j$ such that $\Phi^{(n)}(g) = g \circ s_j$.

We have
\[ d_\sim(\Phi^{(n)}_\sim([f]), \Phi^{(n)}_\sim([g])) = d_\sim([\Phi^{(n)}(f)], [\Phi^{(n)}(g)]) = d_\sim([f \circ s_i], [g \circ s_j]). \]
Applying Proposition 3.13 and Lemma 3.17, we obtain
\[ d_\sim(\Phi^{(n)}_\sim([f]), \Phi^{(n)}_\sim([g])) = \min_{k, l} d(f \circ s_k, g \circ s_l). \]
Since $f$ and $g$ are Pareto, we have
\[ (f \circ s_k)(x_1, \ldots, x_n) = f(x_k, \ldots, x_k) = x_k \quad \text{and} \quad (g \circ s_l)(x_1, \ldots, x_n) = g(x_l, \ldots, x_l) = x_l \]
for all $(x_1, \ldots, x_n) \in L(A)^n$. Thus, $\min_{k, l} d(f \circ s_k, g \circ s_l) = \min_{k, l} \Pr_{x \sim \mu}[x_k \neq x_l] = 0$, and from this it follows that whenever $d_\sim([f], [g]) \neq 0$ we have $d_\sim(\Phi^{(n)}_\sim([f]), \Phi^{(n)}_\sim([g])) = 0 < d_\sim([f], [g])$. So $\Phi_\sim$ collapses all equivalence classes after $n$ steps. Note that $\Phi^{(n)}_\sim([f]) = \mathrm{DICT}$ for all $f \in M$.

We now show that this fixpoint is the equivalence class of dictators.

Proposition 3.19
It holds that $\Phi_\sim(\mathrm{DICT}) = \mathrm{DICT}$.

Proof. From Proposition 3.5 we know that $\Phi(\mathrm{Dict}_i) \sim \mathrm{Dict}_i$ for each $i$. This implies the equality $\Phi_\sim(\mathrm{DICT}) = \Phi_\sim([\mathrm{Dict}_i]) = [\Phi(\mathrm{Dict}_i)] = [\mathrm{Dict}_i] = \mathrm{DICT}$, where we used Proposition 3.9.

We shall now develop a technical result needed for proving Arrow’s theorem. For any $i$, we define $v_i : L(A)^n \to L(A)^{n-1}$ by $v_i(x) = x_{-i}$, where $x_{-i}$ is the same vector as $x$ but with the $i$-th component left out (so $x = (x_i, x_{-i})$ for all $x$ and $i$). Given any distribution $\mu$ on $L(A)^{n-1}$ and $i = 1, 2, \ldots, n$, we can associate with $\mu$ a real-valued map $\mu^{[i]}$ on $L(A)^n$ by defining
\[ \mu^{[i]}(x) = \frac{\sum_{\tau : n \to n} \mu(v_i(\vec{\tau}(x)))}{n!\, |L(A)|}. \qquad (2) \]

Lemma 3.20
Let $\mu$ be a distribution on $L(A)^{n-1}$ and $i \in \{1, 2, \ldots, n\}$. Then $\mu^{[i]}$ is a distribution on $L(A)^n$ and $\mu^{[i]}$ is $n$-permutation-invariant. If moreover $\mu$ has full support, then also $\mu^{[i]}$ has full support.

Proof. To show that $\mu^{[i]}$ is a distribution, note that for a fixed permutation $\tau : n \to n$,
\[ \sum_{x \in L(A)^n} \mu(v_i(\vec{\tau}(x))) = \sum_{x \in L(A)^n} \mu(v_i(x)) = \sum_{x_i} \sum_{x_{-i}} \mu(x_{-i}) = \sum_{x_i} 1 = |L(A)|. \]
Thus, as the number of permutations $n \to n$ is $n!$, we get
\[ n!\, |L(A)| \sum_{x \in L(A)^n} \mu^{[i]}(x) = \sum_{\tau} \sum_{x \in L(A)^n} \mu(v_i(\vec{\tau}(x))) = n!\, |L(A)|, \]
so the values of $\mu^{[i]}$ sum to $1$.

We now show that $\mu^{[i]}$ is $n$-permutation-invariant. Let $\pi : n \to n$. We show that $\mu^{[i]} \circ \vec{\pi} = \mu^{[i]}$. For any $x \in L(A)^n$, we have
\[ (n!\, |L(A)|) \big( (\mu^{[i]} \circ \vec{\pi})(x) \big) = \sum_{\tau : n \to n} \mu(v_i(\vec{\tau}(\vec{\pi}(x)))) = \sum_{\sigma : n \to n} \mu(v_i(\vec{\sigma}(x))) = (n!\, |L(A)|) \big( \mu^{[i]}(x) \big). \]
The last claim is trivial, so the proof is complete.

The following is a technical lemma.
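Before moving on, the construction in (2) and the claims of Lemma 3.20 can be checked numerically on a small case. The Python sketch below is only an illustration under assumptions introduced here: three candidates, $n = 3$ voters indexed from 0, and an arbitrary non-uniform full-support distribution `mu` on $L(A)^{n-1}$ whose weights are proportional to $1, 2, 3, \ldots$. The helper names `permute`, `drop`, and `lift` are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

CANDIDATES = "abc"                                      # assumed toy candidate set
N = 3                                                   # assumed number of voters n
ORDERS = list(permutations(CANDIDATES))                 # L(A)
PROFILES = list(product(ORDERS, repeat=N))              # L(A)^n
SMALL_PROFILES = list(product(ORDERS, repeat=N - 1))    # L(A)^(n-1)
VOTER_PERMS = list(permutations(range(N)))              # all permutations of the voters

# Assumed example: a non-uniform, full-support distribution on L(A)^(n-1).
total_weight = sum(range(1, len(SMALL_PROFILES) + 1))
mu = {y: Fraction(k + 1, total_weight) for k, y in enumerate(SMALL_PROFILES)}


def permute(pi, x):
    """The rearranged profile (x_{pi(0)}, ..., x_{pi(N-1)})."""
    return tuple(x[pi[j]] for j in range(N))


def drop(i, x):
    """v_i: the profile x with its i-th component left out."""
    return x[:i] + x[i + 1:]


def lift(small_mu, i):
    """Equation (2): average small_mu(v_i(tau(x))) over all voter permutations tau."""
    norm = factorial(N) * len(ORDERS)
    return {
        x: sum(small_mu[drop(i, permute(tau, x))] for tau in VOTER_PERMS) / norm
        for x in PROFILES
    }


lifted = lift(mu, N - 1)  # mu^{[n]} with 0-based voter index n - 1
```

Exact rational arithmetic via `Fraction` avoids rounding issues, so the three properties of Lemma 3.20 (total mass 1, permutation-invariance, full support) can be compared with exact equality in this example.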
Lemma 3.21

Let $n \geq 2$, and let $g : L(A)^{n-1} \to L(A)$ be a voting rule for $n-1$ voters. If $f : L(A)^n \to L(A)$ is such that $f(x_1, \dots, x_{n-1}, x_n) = g(x_1, \dots, x_{n-1})$, and $\mu$ is an $(n-1)$-permutation-invariant distribution on $L(A)^{n-1}$, then $F^n_{\mu^{[n]}}[f] \leq 2/(n\,|L(A)|)$, and for each $i = 1, 2, \dots, n-1$ we have $F^i_{\mu^{[n]}}[f] \geq \frac{1}{n} F^i_\mu[g]$.

Proof.
Let $i = 1, 2, \dots, n-1$. We show that $F^i_{\mu^{[n]}}[f] \geq \frac{1}{n} F^i_\mu[g]$. More precisely, we will show that

$$F^i_{\mu^{[n]}}[f] = \frac{1}{n} F^i_\mu[g] + \frac{1}{n\,|L(A)|} \sum_{j=1}^{n-1} \mu(x_{-j})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_i}.$$

We have

$$F^i_{\mu^{[n]}}[f] = \frac{1}{n!\,|L(A)|} \sum_{x_1,\dots,x_n} \sum_{\tau} \mu(x_{\tau(1)}, \dots, x_{\tau(n-1)})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_i}.$$

For each $i = 1, 2, \dots, n$, let $V_i = \{\tau : n \to n \mid \tau(i) = n\}$. To start, note that

$$\begin{aligned}
&\sum_{x_1,\dots,x_{n-1},x_n} \sum_{\tau \in V_n} \mu(x_{\tau(1)}, \dots, x_{\tau(n-1)})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_i} \\
&\quad = \sum_{x_1,\dots,x_{n-1},x_n} \sum_{\tau \in V_n} \mu(x_1, \dots, x_{n-1})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_i} && \text{(since $\mu$ is $(n-1)$-permutation-invariant)} \\
&\quad = (n-1)! \sum_{x_1,\dots,x_{n-1},x_n} \mu(x_1, \dots, x_{n-1})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_i} && \text{(since $|V_n| = (n-1)!$)} \\
&\quad = (n-1)! \sum_{x_1,\dots,x_{n-1}} \sum_{x_n} \mu(x_1, \dots, x_{n-1})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_i} && (3) \\
&\quad = (n-1)!\,|L(A)| \sum_{x_1,\dots,x_{n-1}} \mu(x_1, \dots, x_{n-1})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_i} \\
&\quad = (n-1)!\,|L(A)|\, F^i_\mu[g].
\end{aligned}$$

Also, for each $i \neq n$, note that $V_i = \bigcup_{j=1}^{n-1} H_{ij}$ where $H_{ij} = \{\tau \mid \tau(i) = n \text{ and } \tau(n) = j\}$. It is clear that $|H_{ij}| = (n-2)!$. If $\tau \in H_{ij}$, then clearly $\mu(x_{\tau(1)}, \dots, x_{\tau(n-1)}) = \mu(x_{-j})$. Now fix $x_1, \dots, x_{n-1}, x_n$, and a $j \in \{1, 2, \dots, n-1\}$. Then

$$\begin{aligned}
\sum_{i=1}^{n-1} \sum_{\tau \in H_{ij}} \mu(x_{\tau(1)}, \dots, x_{\tau(n-1)})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_i}
&= \sum_{i=1}^{n-1} \sum_{\tau \in H_{ij}} \mu(x_{-j})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_i} \\
&= (n-1)\,|H_{ij}|\, \mu(x_{-j})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_i} \\
&= (n-1)(n-2)!\, \mu(x_{-j})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_i} \\
&= (n-1)!\, \mu(x_{-j})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_i}.
\end{aligned}$$

Summing this expression over all $j = 1, 2, \dots, n-1$, we get

$$(n-1)!\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_i} \sum_{j=1}^{n-1} \mu(x_{-j}).$$

Thus,

$$\begin{aligned}
F^i_{\mu^{[n]}}[f] &= \frac{1}{n!\,|L(A)|} \sum_{x_1,\dots,x_n} \sum_{\tau} \mu(x_{\tau(1)}, \dots, x_{\tau(n-1)})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_i} \\
&= \frac{1}{n!\,|L(A)|} \sum_{x_1,\dots,x_n} \sum_{\tau \in V_n} \mu(x_{\tau(1)}, \dots, x_{\tau(n-1)})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_i} \\
&\quad + \frac{1}{n!\,|L(A)|} \sum_{x_1,\dots,x_n} \sum_{\tau \in \bigcup_{i=1}^{n-1} V_i} \mu(x_{\tau(1)}, \dots, x_{\tau(n-1)})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_i}.
\end{aligned}$$
Plugging in our calculations from above, we conclude that

$$F^i_{\mu^{[n]}}[f] = \frac{1}{n} F^i_\mu[g] + \frac{1}{n\,|L(A)|} \sum_{j=1}^{n-1} \mu(x_{-j})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_i}.$$

If we repeat the steps from above, but with $i$ replaced by $n$, then the whole reasoning is the same, except in (3): there, we get

$$(n-1)! \sum_{x_1,\dots,x_{n-1}} \sum_{x_n} \mu(x_1, \dots, x_{n-1})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_n} = (n-1)! \sum_{x_1,\dots,x_{n-1}} \mu(x_1, \dots, x_{n-1}) = (n-1)!.$$

This lets us conclude that

$$F^n_{\mu^{[n]}}[f] = \frac{1}{n\,|L(A)|} + \frac{1}{n\,|L(A)|} \sum_{j=1}^{n-1} \mu(x_{-j})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_n}.$$

Since $\sum_{j=1}^{n-1} \mu(x_{-j})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_n} \leq 1$, we get $F^n_{\mu^{[n]}}[f] \leq 2/(n\,|L(A)|)$.

Let $\varepsilon > 0$ be such that $\varepsilon < 1 - 2/|L(A)|$. Note that this is possible precisely because $|L(A)| > 2$, since $|A| \geq 3$, i.e., there are at least three candidates. Fix any $y \in L(A)$. Let $G = \{(y, \dots, y)\}$. We define a particular distribution $\mu_*$ on $L(A)^{n-1}$, as follows: $\mu_*$ gives weight $1 - \varepsilon$ to $(y, \dots, y)$ and spreads the remaining $\varepsilon$ out over all other profiles. That is, for any $x \in L(A)^{n-1}$, we let

$$\mu_*(x) = \begin{cases} 1 - \varepsilon & \text{if } x \in G, \\[4pt] \dfrac{\varepsilon}{|L(A)|^{n-1} - 1} & \text{if } x \notin G. \end{cases}$$

Note that $\mu_*$ is $(n-1)$-permutation-invariant. Note also that for any Pareto voting rule $g : L(A)^{n-1} \to L(A)$ we have $g(x, x, \dots, x) = x$ for all $x \in L(A)$.

Lemma 3.22
Let $g : L(A)^{n-1} \to L(A)$ be a voting rule for $n-1$ voters that is Pareto, and let $f : L(A)^n \to L(A)$ be $f(x_1, \dots, x_{n-1}, x_n) = g(x_1, \dots, x_{n-1})$. Then it holds that voter $n$ is the unique least forceful voter of $f$ with respect to $\mu_*^{[n]}$.

Proof. By Lemma 3.21 it suffices to show that $F^i_{\mu_*}[g] > 2/|L(A)|$ for each $i = 1, 2, \dots, n-1$. We have

$$F^i_{\mu_*}[g] = \sum_{(x_1,\dots,x_{n-1})} \mu_*(x_1, \dots, x_{n-1})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_i} \geq \sum_{(x_1,\dots,x_{n-1}) \in G} \mu_*(x_1, \dots, x_{n-1})\, \mathbf{1}_{g(x_1,\dots,x_{n-1}) = x_i},$$

and this equals $\mu_*(y, \dots, y) = 1 - \varepsilon$ since $g$ is Pareto. By choice of $\varepsilon$, we have $1 - \varepsilon > 2/|L(A)|$.

Finally we arrive at the proof of Arrow's theorem.

Proof of Arrow's theorem. Assume towards a contradiction that $n \geq 2$ and $g \in \mathrm{PIIA}_{n-1} \setminus \mathrm{DICT}_{n-1}$. We define $f : L(A)^n \to L(A)$ by $f(x_1, \dots, x_{n-1}, x_n) = g(x_1, \dots, x_{n-1})$. Note the following.

(1) $f \in \mathrm{PIIA}$ since it satisfies IIA and Pareto, as $g$ does.

(2) $f$ is not a dictator: clearly none of the first $n-1$ voters is a dictator of $f$, since otherwise $g$ were a dictator, and if the $n$-th voter were the dictator then $g(x_1, \dots, x_{n-1}) = x_n$ for all $(x_1, \dots, x_{n-1}, x_n)$, in contradiction with the fact that $g$ is a function.

Hence $f \in \mathrm{PIIA} \setminus \mathrm{DICT}$.

We take $\mu$ to be the $\mu_*^{[n]}$ that we just introduced. By construction, $\mu_*^{[n]}$ has full support, satisfies $\mu_*^{[n]} = \mu_*^{[n]} \circ \vec{\pi}$ for all permutations $\pi$ (by Lemma 3.20), and voter $n$ is the unique voter with least force on $f$ with respect to $\mu_*^{[n]}$ (by Lemma 3.22). This proves that $\Phi^{\mu_*^{[n]}}(f) = f$. Therefore, $\Phi^{\mu_*^{[n]}}_\sim([f]) = [f]$. Proposition 3.18 and Proposition 3.19 imply that $[f] = \mathrm{DICT}$. In particular, $f \in \mathrm{DICT}$. Contradiction. □
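As an illustrative aside, the construction of $\mu^{[i]}$ in (2) and the claims of Lemma 3.20 can be checked by brute force on a small instance. The sketch below is not part of the proof. It assumes three candidates, $n = 3$, the convention that a permutation $\tau$ acts on a profile by $x \mapsto (x_{\tau(1)}, \dots, x_{\tau(n)})$, and a hypothetical rule $g$ that simply copies the first ballot; the helper names (`mu_star`, `permute`, `drop`, `lift`, `force`) are ours. It also checks the lower bound $F^i_{\mu^{[n]}}[f] \geq \frac{1}{n} F^i_\mu[g]$ from Lemma 3.21 for $i < n$.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

CANDIDATES = ("a", "b", "c")             # assumed instance with |A| = 3
ORDERS = list(permutations(CANDIDATES))  # L(A): the 6 linear orders on A
N = 3                                    # voters of f; g has N - 1 voters


def mu_star(eps, y, m):
    """mu_* on L(A)^m: weight 1 - eps on (y, ..., y), eps spread evenly elsewhere."""
    top = (y,) * m
    rest = eps / (len(ORDERS) ** m - 1)
    return {x: (1 - eps if x == top else rest) for x in product(ORDERS, repeat=m)}


def permute(tau, x):
    """Assumed convention: tau sends x to (x_tau(1), ..., x_tau(n))."""
    return tuple(x[t] for t in tau)


def drop(i, x):
    """v_i: leave out the i-th component (0-based index here)."""
    return x[:i] + x[i + 1:]


def lift(mu, i, n):
    """mu^[i] as in equation (2): sum over all tau, divided by n! * |L(A)|."""
    norm = factorial(n) * len(ORDERS)
    return {
        x: sum(mu[drop(i, permute(tau, x))] for tau in permutations(range(n))) / norm
        for x in product(ORDERS, repeat=n)
    }


def force(mu, rule, i):
    """F^i_mu[rule]: probability that the outcome equals voter i's ballot."""
    return sum(p for x, p in mu.items() if rule(x) == x[i])


def g(x):
    """Hypothetical Pareto rule for N - 1 voters: copy the first ballot."""
    return x[0]


def f(x):
    """f ignores the last voter and applies g to the others."""
    return g(x[:-1])


eps = Fraction(1, 2)                 # any 0 < eps < 1 - 2/|L(A)| = 2/3 would do
mu = mu_star(eps, ORDERS[0], N - 1)  # mu_* on L(A)^(N-1)
mu_n = lift(mu, N - 1, N)            # mu_*^[n]; voter n sits at index N - 1

is_distribution = sum(mu_n.values()) == 1
is_invariant = all(
    mu_n[permute(pi, x)] == mu_n[x]
    for pi in permutations(range(N))
    for x in mu_n
)
has_full_support = min(mu_n.values()) > 0
lower_bounds_hold = all(
    force(mu_n, f, i) >= force(mu, g, i) / N for i in range(N - 1)
)
print(is_distribution, is_invariant, has_full_support, lower_bounds_hold)
```

Exact rational arithmetic via `fractions.Fraction` avoids rounding issues in the equality tests; on this instance all four checks evaluate to `True`.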
The main goal of this paper has been to show that Arrow's impossibility theorem can be proved using Banach's fixpoint theorem. Our approach involved coming up with an appropriate equivalence relation, and then defining a contraction on the resulting equivalence space whose unique fixpoint is the set of dictators. The concept of force of a voter, as well as thinking about voting rules as elements of a metric space based on a probability distribution, are inspired by the Boolean analysis approach to social choice, a line of research initiated by [13] and further developed by others [15].

Our proof of Arrow's theorem is different in spirit from most of the previous proofs in that it does not involve manipulations of specific profiles. The Pareto property is fundamental in our analysis and we used it ubiquitously, as our original metric space consists of all Pareto voting rules. Interestingly, however, in our proof we did not use the IIA property explicitly; we only used it by noting that it was preserved under a certain operation. This makes us wonder if it would be possible to get other impossibility results by considering other properties (that are preserved under the same operation).

An advantage of our perspective on Arrow's theorem is that it establishes a link between this pivotal result of mathematical economics and a concept often surfacing in that area: fixpoints. The notion of fixpoint also connects the theorem better to the area of computer science, where fixpoints are omnipresent and sometimes can lead to algorithms, although this does not seem to be the case here. A possible direction for future work is to analyze if similar results in the area, such as the Gibbard-Satterthwaite theorem [12], can be proved via a related fixpoint argument. It would also be worthwhile to study the relationship between our notion of force and notions like decisiveness or pivotalness that other proofs use.
Acknowledgments. We would like to thank Thomas Santoli for pointing out in a discussion that the permutation structure gives rise to a group action, which ultimately led to Proposition 3.13, and Ronald de Wolf for inspiring us to formalize and study the notion of force.
References

[1] Aleksandr V. Arhangel'skij & Lev S. Pontryagin (1990): General Topology: Basic Concepts and Constructions. Dimension Theory. I. Springer-Verlag.
[2] Kenneth J. Arrow (1951): Social Choice and Individual Values. New York.
[3] Stefan Banach (1922): Sur les opérations dans les ensembles abstraits et leur application aux équations intégrales. Fundamenta Mathematicae.
[4] John F. Banzhaf III (1965): Weighted Voting Doesn't Work: A Mathematical Analysis. Rutgers L. Rev. 19, p. 317.
[5] Salvador Barbera (1980): Pivotal Voters: A New Proof of Arrow's Theorem. Economics Letters.
[6] Julian H. Blau (1972): A Direct Proof of Arrow's Theorem. Econometrica: Journal of the Econometric Society, pp. 61–67, doi:10.2307/1909721.
[7] Francesca Cagliari, Barbara Di Fabio & Claudia Landi (2015): The Natural Pseudo-distance as a Quotient Pseudo-metric, and Applications. In: Forum Mathematicum, 27, De Gruyter, pp. 1729–1742.
[8] Frank M. V. Feys (2015): Fourier Analysis for Social Choice. Master's thesis, Universiteit van Amsterdam, the Netherlands.
[9] Ehud Friedgut, Gil Kalai & Assaf Naor (2002): Boolean Functions Whose Fourier Transform is Concentrated on the First Two Levels. Advances in Applied Mathematics.
[10] Mark B. Garman & Morton I. Kamien (1968): The Paradox of Voting: Probability Calculations. Behavioral Science.
[11] John Geanakoplos (2005): Three Brief Proofs of Arrow's Impossibility Theorem. Economic Theory.
[12] Allan Gibbard (1973): Manipulation of Voting Schemes: A General Result. Econometrica.
[13] Gil Kalai (2002): A Fourier-theoretic Perspective on the Condorcet Paradox and Arrow's Theorem. Advances in Applied Mathematics.
[14] Alan P. Kirman & Dieter Sondermann (1972): Arrow's Theorem, Many Agents, and Invisible Dictators. Journal of Economic Theory.
[15] Elchanan Mossel (2012): A Quantitative Arrow Theorem. Probability Theory and Related Fields.
[16] John F. Nash (1950): Equilibrium Points in n-Person Games. Proceedings of the National Academy of Sciences.
[17] Ryan O'Donnell (2014): Analysis of Boolean Functions. Cambridge University Press, doi:10.1017/CBO9781139814782.
[18] Vittorino Pata (2014): Fixed Point Theorems and Applications. Politecnico di Milano.
[19] Philip J. Reny (2001): Arrow's Theorem and the Gibbard-Satterthwaite Theorem: A Unified Approach. Economics Letters.