Arrow's Theorem Through a Fixpoint Argument

Abstract

We present a proof of Arrow's theorem from social choice theory that uses a fixpoint argument. Specifically, we use Banach's result on the existence of a fixpoint of a contractive map defined on a complete metric space. Conceptually, our approach shows that dictatorships can be seen as fixpoints of a certain process.


L.S. Moss (Ed.): TARK 2019, EPTCS 297, 2019, pp. 175–188, doi:10.4204/EPTCS.297.12

© Frank M. V. Feys & Helle Hvid Hansen. This work is licensed under the Creative Commons Attribution License.


Frank M. V. Feys Helle Hvid Hansen

Faculty of Technology, Policy and Management, Delft University of Technology, Delft, The Netherlands


Keywords

Social choice theory · voting · Arrow’s impossibility theorem · Banach’s fixpoint theorem · dictatorship · force · fixpoint · metric

Arrow’s impossibility theorem, introduced in Kenneth Arrow’s seminal monograph Social Choice and Individual Values [2], deals with the issue of aggregating the preferences of a group of individuals into a single collective preference that appropriately represents the group.

Arrow defined a social welfare function to be a function that aggregates a collection of individual preferences into a societal preference, also called social choice. Preferences are defined as linear orders, where candidates are ranked from top to bottom. He considered the following three desirable properties that a social welfare function, which can be thought of as an election scheme or voting rule, might satisfy:

• Pareto condition. If all voters rank some candidate higher than another one, then the social choice should do so as well.

• Independence of irrelevant alternatives. The societal ranking of any two candidates depends only on their relative rankings in the individual rankings, and on nothing else.

• Non-dictatoriality. There is no dictator, meaning that there is no voter whose individual preference always coincides with the societal outcome.

Arrow’s impossibility theorem states that these seemingly mild conditions of reasonableness cannot be simultaneously satisfied by any voting rule.

In this paper we provide a new proof of this theorem that uses Banach’s fixpoint theorem, which states that a contractive map on a complete metric space has a unique fixpoint [3]. Our proof method shows that dictatorships can be seen as “stable points” (fixpoints) of a certain process.

We give a sketch of our proof. First we define a metric parametrized by a probability distribution on profiles. The distance between two voting rules is the probability under the given distribution that the outcome of the election is different. The use of a distribution has the benefit that it allows for profiles to be considered concurrently. Inspired by the technical notion of influence from Boolean analysis [17], we then introduce a notion that we call “force” (a related idea was already presented in [8]). The force of a voter on a given voting rule is the probability that the outcome of the election coincides with that voter’s preference. We use this notion to define a map that changes voting rules by considering the least powerful voters’ votes and transferring them to the most powerful voter. This map is shown to be a contraction. The process converges towards its unique fixpoint, because of Banach’s fixpoint theorem, and we show this fixpoint to be the set of dictators. One technical point that is fundamental to our approach is that we must consider voting rules only up to equivalence via permutation of the voters, leading to a quotient metric space. This requires us to see to it that the original metric extends appropriately to the quotient metric. A certain group action takes care of that. Our technical development (metric space) is parametric in the distribution, and we study all notions in as much generality as possible. In the final step, we pick a specific distribution and derive Arrow’s theorem from it.

Unlike most previous proofs, our proof does not require us to manipulate specific profiles; it offers an analytic perspective in terms of fixpoints and convergence rather than a combinatorial one. The use of a probability distribution for measuring distance between voting rules is crucial in that respect. Our employment of fixpoints connects Arrow’s theorem to the literature of mathematical economics, where many equilibrium concepts, such as Nash equilibrium [16], arise as a fixpoint.

We finish the introduction by discussing related work. A vast number of proofs have appeared since Arrow’s first demonstration of the impossibility theorem. Arrow’s original proof proceeded by showing the existence of a “decisive” voter. This approach was then refined by others, such as Blau [6] and Kirman and Sondermann [14], who used ultrafilters. Barbera [5] replaced the notion of a decisive voter with the weaker notion of a “pivotal” voter. In later work, Geanakoplos [11] and Reny [19] sharpened the latter approach by introducing the concept of an “extremely pivotal” voter in order to obtain shorter proofs. Even though the aforementioned proofs are all distinct, they are all of a similar, combinatorial nature: One proves results about properties such as decisiveness or pivotalness by defining and manipulating profiles in a clever way that allows for the desirable properties to be taken advantage of. To our knowledge, Kalai’s [13] proof of Arrow’s theorem based on Fourier analysis on the Boolean cube was the first approach to take a different stance. Kalai’s core idea was to calculate the probability of having a Condorcet cycle under some probability distribution. One advantage of this approach is that, by using a theorem from Boolean analysis by Friedgut, Kalai, and Naor [9], it produces a robust, quantitative version of the theorem, in the following sense: The more one seeks to avoid Condorcet’s paradox, the more the election scheme will look like a dictator. Our approach is similar to Kalai’s in that we use quantitative methods, but it differs from it as we prove the classical version of Arrow’s theorem and not a quantitative one.

Contents of this paper.

In Section 2, we give a brief overview of the basic formal concepts that are used in this paper. We also formulate Arrow’s theorem in a precise way. In Section 3, we present our fixpoint-based proof of Arrow’s theorem. Finally, we conclude and discuss future work in Section 4. The proofs that are not given in the main body of the text can be found in the appendix.

In this section we introduce all basic notions that will be used later.

In this subsection we recall the basic Arrovian framework, going back to Arrow’s original work [2].

We assume that the $n$ voters are linearly ordered, so without loss of generality we may suppose that the set of voters is $\{1, 2, \ldots, n\}$ with the natural ordering. Given a set $A$, we let $L(A)$ be the set of strict total orders on $A$. For our purposes, $A$ will be the set of candidates of the election, which we shall always […] $L(A)$ with a complete listing of the elements of $A$ in order of preference. If $\ell \in L(A)$ and $a, b \in A$, we also write $a \mathrel{\ell} b$ to mean that $(a, b) \in \ell$. We shall use $x = (x_1, x_2, \ldots, x_n)$ to refer to an element of $L(A)^n$. Such an element is usually called a profile in social choice theory. The element $x_i$ is the individual preference of voter $i$.

Definition 2.1

A voting rule for $n$ voters and set of candidates $A$ is a map $L(A)^n \to L(A)$.

Here the interpretation is that if the preferences of the $n$ voters are respectively $x_1, x_2, \ldots, x_n \in L(A)$, then the outcome of the election under the voting rule $f$ is $f(x_1, x_2, \ldots, x_n) \in L(A)$. The latter is the so-called social choice given how the electorate voted.

A particularly bad type of voting rule is the following:

Definition 2.2

The $n$ projection maps $L(A)^n \to L(A)$ are called dictatorships. The $i$-th dictator is denoted as $\mathrm{Dict}_i$, where $i = 1, 2, \ldots, n$. We furthermore define $\mathrm{DICT}_n = \{\mathrm{Dict}_i \mid i = 1, 2, \ldots, n\}$.

Note that $\mathrm{Dict}_i(x) = x_i$ for all $x = (x_1, x_2, \ldots, x_n) \in L(A)^n$. We write $\mathrm{DICT}_n$ shorthand as DICT.

Definition 2.3

Let $f : L(A)^n \to L(A)$ be a voting rule for $n$ voters and set of candidates $A$. We say that $f$ satisfies the

• Pareto property if for all $a, b \in A$ and all $x = (x_1, x_2, \ldots, x_n) \in L(A)^n$, if $a \mathrel{x_i} b$ for all $i$, then $a \mathrel{f(x)} b$.

• independence of irrelevant alternatives (IIA) property if for all $a, b \in A$ and for all profiles $x = (x_1, x_2, \ldots, x_n) \in L(A)^n$ and $x' = (x'_1, x'_2, \ldots, x'_n) \in L(A)^n$, if ($a \mathrel{x_i} b$ iff $a \mathrel{x'_i} b$) for all $i$, then ($a \mathrel{f(x)} b$ iff $a \mathrel{f(x')} b$).

The Pareto property speaks for itself: It states that for any candidates $a$ and $b$, if the whole electorate prefers $a$ to $b$, then surely also in the social choice $a$ ought to be preferred to $b$. The independence of irrelevant alternatives property is a bit more difficult to grasp. It says that the relative social ranking of two alternatives only depends on their relative individual rankings. In that sense there should be no dependence on any other alternative (hence “irrelevant”).

Definition 2.4

We let

$$\mathrm{PIIA}_n = \{ f : L(A)^n \to L(A) \mid f \text{ satisfies Pareto and IIA} \}.$$

We write $\mathrm{PIIA}_n$ shorthand as PIIA. Note that $\mathrm{DICT}_n \subseteq \mathrm{PIIA}_n$.

Arrow set out to investigate whether there was a non-dictatorial voting rule satisfying both the Pareto condition and IIA. The answer turned out to be negative.

Theorem (Arrow [2])

A voting rule for at least three candidates that satisfies the Pareto property and IIA must be a dictatorship, i.e.,

$$\mathrm{PIIA}_n = \mathrm{DICT}_n \quad \text{for all } n \geq 1.$$

We assume basic knowledge of metric spaces such as introduced in, e.g., [1]. Here we just recall a few basic definitions. A metric space $(X, d)$ is a set $X$ equipped with a metric.

A crucial element in our proof is Banach’s fixpoint theorem [3]. We recall that a function $F : X \to X$ on a metric space $X$ is contractive if there is a $C < 1$ such that $d(F(x_1), F(x_2)) \leq C \cdot d(x_1, x_2)$ for all $x_1, x_2 \in X$. A fixpoint of $F$ is an element $x^*$ such that $F(x^*) = x^*$. For any map $F : X \to X$, by $F^{(n)}$ we mean the $n$-fold composition $F \circ \cdots \circ F$.

Theorem 2.5 (Banach Fixpoint Theorem)

Let $(X, d)$ be a non-empty complete metric space. If $F : X \to X$ is contractive then $F$ has a unique fixpoint $x^*$. For all $x \in X$, $x^* = \lim_{n \to \infty} F^{(n)}(x)$.
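The convergence statement of Theorem 2.5 can be illustrated numerically. The following is a minimal sketch that is not part of the paper's development: it assumes the real line with the usual distance and a hypothetical contractive map $F(x) = x/2 + 1$ (constant $C = 1/2$, fixpoint $2$). The names `contraction` and `iterate_to_fixpoint` and the tolerance are illustrative choices.

```python
def contraction(x):
    """Hypothetical contractive map on the reals: |F(x) - F(y)| = |x - y| / 2."""
    return 0.5 * x + 1.0


def iterate_to_fixpoint(F, x0, tol=1e-12, max_steps=10_000):
    """Iterate x, F(x), F(F(x)), ... until two successive iterates are within tol."""
    x = x0
    for _ in range(max_steps):
        nxt = F(x)
        if abs(nxt - x) <= tol:
            return nxt
        x = nxt
    raise RuntimeError("no convergence within max_steps")


# Different starting points approach the same fixpoint, as the theorem predicts.
print(iterate_to_fixpoint(contraction, 100.0))
print(iterate_to_fixpoint(contraction, -50.0))
```

Both runs approach the value $2$, which is the only solution of $x = x/2 + 1$.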

When $X$ is a compact metric space it suffices to show that $d(F(x_1), F(x_2)) < d(x_1, x_2)$ for all distinct $x_1, x_2 \in X$ to conclude that $F$ is a contraction. This is in particular true if $X$ is finite. Recall that if there is an $n$ such that $F^{(n)}$ is a contraction, then $F$ has a unique fixpoint (see, e.g., [18]). We shall use this fact later.

Let $(X, d)$ be any metric space for which on the underlying set $X$ an equivalence relation $\sim$ is defined. For reasons that will become clear later, we wish to extend the metric on $X$ to a metric on $X/{\sim}$. In general, the following construction exists (see, e.g., [7] for the details). A chain $C$ between points $x \in X$ and $y \in X$ is a sequence of points $x = a_1 \sim b_1, a_2 \sim b_2, \ldots, a_n \sim b_n = y$, and the length of such a chain is defined as $\mathrm{length}(C) = \sum_{i=1}^{n-1} d(b_i, a_{i+1})$ (this sum is understood to be zero when $n = 1$). The quotient distance is then given by

$$d_\sim([x], [y]) = \inf \{ \mathrm{length}(C) \mid C \text{ is a chain between } x \text{ and } y \}. \qquad (1)$$

It is easy to see that $d_\sim$ is well-defined and satisfies all axioms of a pseudometric (meaning that the distance between two distinct points can be zero). Note though that $d_\sim([x], [y]) = 0 \Rightarrow [x] = [y]$ is not generally true. However, in case $d$ is such that the infimum is always attained (which happens in particular in case $X$ is finite), then $d_\sim$ is, in fact, a metric:

Lemma 2.6

Let $(X, d)$ be a metric space and $\sim$ an equivalence relation on $X$. If the infimum in $d_\sim$ is always attained, then $d_\sim$ is a metric on $X/{\sim}$.

Proof. Indeed, if $d_\sim([x], [y]) = 0$, then $0 = d_\sim([x], [y]) = \mathrm{length}(C)$ for some chain $C$ between $x$ and $y$ given by $x = a_1 \sim b_1, a_2 \sim b_2, \ldots, a_n \sim b_n = y$, so $d(b_i, a_{i+1}) = 0$ for all $i$. Since $d$ is a metric, we get $b_i = a_{i+1}$ for all $i$. Thus, $x = a_1 \sim a_2 \sim \cdots \sim a_n \sim y$, which implies $x \sim y$ by transitivity. Hence, $[x] = [y]$.

Nonetheless, the defining formula for $d_\sim$ is difficult to work with. The following result gives a sufficient condition for the formula to reduce to an easier one. It uses the notion of action and orbit under a group action (see, e.g., [20]). A group action of a group $G$ on a set $X$ is a map $\varphi : G \times X \to X$ such that $\varphi(e, x) = x$ for all $x \in X$, where $e$ is the identity element of $G$, and $\varphi(g, \varphi(g', x)) = \varphi(gg', x)$ for all $g, g' \in G$ and $x \in X$. The orbit of an $x \in X$ under the group action $\varphi$ is the set $\{ \varphi(g, x) \mid g \in G \}$. We also recall that the set of isometries on a metric space forms a group, with function composition as group law. With a group of isometries we mean a subgroup of that group.

Proposition 2.7

If $(X, d)$ is a metric space endowed with an equivalence relation $\sim$ where the equivalence classes are the orbits of the action of a group of isometries on $(X, d)$, then it holds for all $[x], [y] \in X/{\sim}$ that

$$d_\sim([x], [y]) = \inf \{ d(x', y') \mid x' \in [x],\ y' \in [y] \}.$$

Proof. This is part of Theorem 2.1 from [7].
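The simplified formula of Proposition 2.7 can be tried out on a toy example. The sketch below is purely illustrative and assumes a setting that does not appear in the paper: $X = \{0, 1\}^3$ with the Hamming distance, and the group of coordinate permutations, which act by isometries. The function names are hypothetical.

```python
from itertools import permutations


def hamming(x, y):
    """Hamming distance: number of coordinates in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def orbit(x):
    """Orbit of x under all coordinate permutations (each is a Hamming isometry)."""
    return {tuple(x[i] for i in p) for p in permutations(range(len(x)))}


def quotient_distance(x, y):
    """Minimum distance between representatives, as in Proposition 2.7 (finite case)."""
    return min(hamming(xp, yp) for xp in orbit(x) for yp in orbit(y))


print(quotient_distance((1, 0, 0), (0, 0, 1)))  # same orbit
print(quotient_distance((1, 1, 0), (0, 0, 1)))
print(quotient_distance((1, 1, 1), (0, 0, 0)))
```

In this toy setting two strings lie in the same orbit exactly when they contain the same number of ones, so the quotient distance is the difference between those counts.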

In this section, we give the new proof of Arrow’s theorem. The general idea is to define a contraction that has the set of dictators as its unique fixpoint.

We wish to use Banach’s fixpoint theorem. In order to do so, we need to view our voting rules as part of some metric space. For technical reasons, we shall focus on voting rules that satisfy the Pareto property. Thus, we let $M$ be the set of all voting rules $L(A)^n \to L(A)$ that are Pareto.

The idea to view voting rules as a metric space where the metric is based on a distribution is inspired by earlier work from the Boolean analysis approach to social choice theory, such as Mossel [15]. Viewing […]

Definition 3.1

Given a probability distribution $\mu$ on $L(A)^n$, we define a map $d_\mu : M \times M \to \mathbb{R}$ by

$$d_\mu(f, g) = \Pr_{x \sim \mu}[f(x) \neq g(x)].$$

That is, $d_\mu(f, g)$ is the probability under $\mu$ that $f$ and $g$ produce a different outcome. For any set $X$, by $1_X$ we mean the indicator function on $X$. Note that

$$d_\mu(f, g) = \Pr_{x \sim \mu}[f(x) \neq g(x)] = \sum_{x \in L(A)^n} \mu(x)\, 1_{f(x) \neq g(x)}.$$

Throughout this paper, $\mu$ is a distribution on $L(A)^n$, but for brevity we often simply say that “$\mu$ is a distribution”. For example, taking $\mu$ to be the uniform distribution is known in the literature as the impartial culture assumption [10].

Under a mild assumption on $\mu$ we obtain our desired metric space.

Proposition 3.2

If $\mu$ is a probability distribution on $L(A)^n$ with full support, then $d_\mu$ is a metric on $M$.

Proof. Clearly $d_\mu(f, f) = 0$ for all $f \in M$. Since $\mu$ has full support, $d_\mu(f, g) = \Pr_{x \sim \mu}[f(x) \neq g(x)] = 0$ implies $f = g$. Symmetry follows immediately. To prove the triangle inequality, let $f, g, h \in M$. Note that for any $x$, $f(x) \neq h(x)$ implies $f(x) \neq g(x)$ or $g(x) \neq h(x)$. Thus,

$$\Pr_{x \sim \mu}[f(x) \neq h(x)] \leq \Pr_{x \sim \mu}[f(x) \neq g(x) \text{ or } g(x) \neq h(x)],$$

and the latter is at most $\Pr_{x \sim \mu}[f(x) \neq g(x)] + \Pr_{x \sim \mu}[g(x) \neq h(x)]$.

Note that the metric space $(M, d_\mu)$ is complete and compact for any $\mu$, since it is finite.

The proof idea is to define a contraction that has the set of dictators as its unique fixpoint. We start by defining a map $\Phi$ on $M$. Essentially, given a voting rule $f$, we will define $\Phi(f)$ as the voting rule in which the “least forceful” voter’s vote is replaced by the “most forceful” voter’s vote. This idea is inspired by, though different from, the notion of influence that is well-studied in Boolean analysis [17]. The influence of the $i$-th voter is defined as the probability that the $i$-th vote affects the outcome (assuming there are only two candidates). In political science and voting theory, influence also goes by the name voting power. Voting power can be thought of as the ability of a legislator, by his vote, to affect the passage or defeat of a measure [4].

To make the aforementioned idea formal, we define what we mean by the force of a voter.

Definition 3.3

Given a probability distribution $\mu$ on $L(A)^n$ and voter $i$, we define the force of voter $i$ on voting rule $f$ under distribution $\mu$ as

$$F^i_\mu[f] = \Pr_{x \sim \mu}[f(x) = x_i].$$

The set of most forceful voters is

$$\mathrm{MostForce}_\mu[f] = \operatorname{argmax}_{i \in \{1, 2, \ldots, n\}} F^i_\mu[f].$$

The set of least forceful voters is

$$\mathrm{LeastForce}_\mu[f] = \operatorname{argmin}_{i \in \{1, 2, \ldots, n\}} F^i_\mu[f].$$

The notion of force was first introduced in [8].
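The distance of Definition 3.1 and the force of Definition 3.3 are finite sums, so they can be computed by brute force for small elections. The following is a minimal sketch under assumptions that are ours and not the paper's: three candidates, two voters, the uniform distribution, voters indexed from 0, and illustrative function names throughout.

```python
from fractions import Fraction
from itertools import permutations, product

CANDIDATES = ("a", "b", "c")                # the set A (illustrative choice)
ORDERS = list(permutations(CANDIDATES))     # L(A): each order lists candidates best-first
N = 2                                       # number of voters (illustrative choice)
PROFILES = list(product(ORDERS, repeat=N))  # L(A)^n


def uniform_distribution():
    """Uniform distribution on profiles (the impartial culture assumption)."""
    weight = Fraction(1, len(PROFILES))
    return {x: weight for x in PROFILES}


def dictator(i):
    """Dict_i as a Python function; voters are indexed from 0 here."""
    return lambda x: x[i]


def distance(mu, f, g):
    """d_mu(f, g): probability under mu that f and g produce different outcomes."""
    return sum(mu[x] for x in PROFILES if f(x) != g(x))


def force(mu, f, i):
    """Force of voter i on f: probability under mu that the outcome equals x_i."""
    return sum(mu[x] for x in PROFILES if f(x) == x[i])


def most_forceful(mu, f):
    forces = [force(mu, f, i) for i in range(N)]
    return [i for i in range(N) if forces[i] == max(forces)]


def least_forceful(mu, f):
    forces = [force(mu, f, i) for i in range(N)]
    return [i for i in range(N) if forces[i] == min(forces)]


mu = uniform_distribution()
print(distance(mu, dictator(0), dictator(1)))  # probability that the two voters disagree
print(force(mu, dictator(0), 0), force(mu, dictator(0), 1))
print(most_forceful(mu, dictator(0)), least_forceful(mu, dictator(0)))
```

Exact rational arithmetic is used so that the argmax and argmin comparisons are not affected by floating-point rounding.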

Definition 3.4

Let $\mu$ be a given distribution with full support. Given a voting rule $f \in M$, we define a new voting rule $\Phi_\mu(f) \in M$ by $\Phi_\mu(f)(x_1, \ldots, x_n) = f(y_1, \ldots, y_n)$, where

$$y_i = \begin{cases} x_i & \text{if } i \notin \mathrm{LeastForce}_\mu[f], \\ x_{\min \mathrm{MostForce}_\mu[f]} & \text{else.} \end{cases}$$

This defines a map $\Phi_\mu : M \to M$, also written for short as $\Phi$ if $\mu$ is understood from the context.

So what happens when passing from $f$ to $\Phi(f)$ is that all voters with least force have their vote replaced by the vote of the first voter that has maximal force. Note that $\Phi(f)$ is Pareto if $f$ is Pareto, so $\Phi$ is well-defined on $M$. The choice to consistently pick specifically the first voter in Definition 3.4 is arbitrary. We might as well always pick the last voter, for instance. Our choice is merely a technicality to ensure that the votes are transferred to one specific voter.

Proposition 3.5

For all $i$ we have $\Phi(\mathrm{Dict}_i) = \mathrm{Dict}_i$.

Proof. Voter $i$ is the unique most forceful voter of $\mathrm{Dict}_i$. It is irrelevant how others vote, and we have $\Phi(\mathrm{Dict}_i)(x_1, \ldots, x_n) = x_i$ for all $(x_1, \ldots, x_n)$. Hence, $\Phi(\mathrm{Dict}_i) = \mathrm{Dict}_i$.

Note that $\Phi$ cannot be a contraction, since every dictator is a fixed point of $\Phi$, contradicting the uniqueness of the fixed point in Banach’s fixpoint theorem. The idea is therefore to consider all dictators to be “the same”, i.e., to consider them to be equivalent under some equivalence relation.

Given an equivalence relation $\sim$, we write $[x]$ for the equivalence class of $x$ under $\sim$. We shall write $M_\sim$ for $M/{\sim}$ throughout.

The remainder of the proof is organized as follows. We define an equivalence relation $\sim$ using a notion of permutation-invariance of voting rules as well as our notion of force. We then show that DICT is an equivalence class of $\sim$. The use of permutations gives rise to a notion of permutation-invariance for distributions. We then show that for all distributions $\mu$ on $L(A)^n$ that have full support and are permutation-invariant, the following hold.

(i) The map $\Phi_\mu : M^\mu \to M^\mu$ is compatible with $\sim$, where $M^\mu$ is the metric space resulting from $\mu$, and hence extends to a map $\Phi^\mu_\sim : M^\mu_\sim \to M^\mu_\sim$.

(ii) The map $\Phi^\mu_\sim$ has DICT as unique fixpoint.

Finally, we prove Arrow’s theorem by showing the existence of a concrete distribution $\mu$ satisfying the abovementioned requirements.

[…] $M$ and Metric on $M_\sim$

We first have to find an appropriate equivalence relation on $M$. To do that, we use the following notion.

Definition 3.6

Given a permutation $\pi : n \to n$ of the voters, we define $\vec{\pi}(x)$ as $(x_{\pi(1)}, \ldots, x_{\pi(n)})$ for all tuples $x = (x_1, \ldots, x_n) \in L(A)^n$. If $f$ is a voting rule, we define a new voting rule, written as $f \circ \vec{\pi}$, by $(f \circ \vec{\pi})(x) = f(\vec{\pi}(x))$ for all $x \in L(A)^n$.

A permutation $\pi : n \to n$ is in what follows often written without types simply as $\pi$. Observe that $\vec{\pi} : L(A)^n \to L(A)^n$ is a bijection because $\pi$ is a bijection.

Definition 3.7

We define the relation $\sim$ on $M$ by letting $f \sim g$ iff (i) $f = g$, or (ii) there exists a permutation $\pi : n \to n$ such that $f = g \circ \vec{\pi}$ and $f$ has a unique voter with maximal force.

In this way we obtain our desired equivalence relation.
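The map of Definition 3.4 and the permuted rule $f \circ \vec{\pi}$ of Definition 3.6 can be sketched in the same brute-force style. This is an illustration under our own assumptions (three candidates, two voters, the uniform distribution, voters indexed from 0), not the authors' implementation; `example_rule` is a hypothetical Pareto rule introduced only for the demonstration.

```python
from fractions import Fraction
from itertools import permutations, product

ORDERS = list(permutations(("a", "b", "c")))  # L(A) for an illustrative A
N = 2                                         # illustrative number of voters
PROFILES = list(product(ORDERS, repeat=N))    # L(A)^n
MU = {x: Fraction(1, len(PROFILES)) for x in PROFILES}  # uniform, full support


def force(f, i):
    """Force of voter i on f under MU."""
    return sum(MU[x] for x in PROFILES if f(x) == x[i])


def phi(f):
    """One application of the map from Definition 3.4 (voters indexed from 0)."""
    forces = [force(f, i) for i in range(N)]
    least = {i for i in range(N) if forces[i] == min(forces)}
    target = min(i for i in range(N) if forces[i] == max(forces))
    # Least forceful voters have their vote replaced by the target voter's vote.
    return lambda x: f(tuple(x[target] if i in least else x[i] for i in range(N)))


def permute(f, pi):
    """f composed with the profile permutation induced by pi (a tuple of indices)."""
    return lambda x: f(tuple(x[pi[i]] for i in range(N)))


def same_rule(f, g):
    """Extensional equality of two voting rules on all profiles."""
    return all(f(x) == g(x) for x in PROFILES)


def dictator(i):
    return lambda x: x[i]


def example_rule(x):
    """Hypothetical Pareto rule: follow voter 0 if they rank 'a' first, else voter 1."""
    return x[0] if x[0][0] == "a" else x[1]


print(all(same_rule(phi(dictator(i)), dictator(i)) for i in range(N)))  # Proposition 3.5
print(same_rule(permute(dictator(0), (1, 0)), dictator(1)))  # dictators are permutations of each other
print(same_rule(phi(example_rule), dictator(1)))
```

In this small instance voter 1 is the most forceful voter of `example_rule`, so a single application of `phi` already hands voter 0's vote to voter 1 and yields a dictatorship.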

Proposition 3.8

The relation $\sim$ is an equivalence relation on $M$.

Proof. Clearly $\sim$ is reflexive, since equality is contained in $\sim$. To show symmetry, let $f \sim g$, so that $f = g$, or $f = g \circ \vec{\pi}$ for some $\pi$ where $f$ has a unique maximal force voter. The former case $f = g$ is trivial. In the latter case, clearly $g = f \circ \vec{\pi}^{-1}$, and because $f$ has a unique maximal force voter, the relation $f = g \circ \vec{\pi}$ implies that the same holds for $g$. Now to show transitivity, let $f \sim g$ and $g \sim h$. Thus, $f = g$, or $f = g \circ \vec{\pi}$ for some $\pi$ where $f$ has a unique maximal force voter. Also, $g = h$, or $g = h \circ \vec{\tau}$ for some $\tau$ where $g$ has a unique maximal force voter. There are four cases. We will only show two of them, as they are all easy.

Suppose $f = g$ and $g = h \circ \vec{\tau}$ where $g$ has a unique maximal force voter. Then since $g$ has a unique maximal force voter, the same holds for $h$, so the relation $f = h \circ \vec{\tau}$ implies that this is also true for $f$. If $f = g \circ \vec{\pi}$ and $g = h \circ \vec{\tau}$, where both $f$ and $g$ have a unique maximal force voter, we have that $f = h \circ (\vec{\tau} \circ \vec{\pi})$.

Note that $\mathrm{DICT} \subseteq M$, so it makes sense to speak about $[\mathrm{Dict}_i]$, and that the equivalence classes of $\sim$ are of the following two types:

• If $f$ does not have a unique maximal force voter, then $[f] = \{f\}$.

• If $f$ does have a unique maximal force voter, then $[f] = \{ f \circ \vec{\pi} \mid \pi : n \to n \}$.

Proposition 3.9

The set DICT is an equivalence class of $\sim$.

Proof. For any $i$, we show that we have $[\mathrm{Dict}_i] = \mathrm{DICT}$. Let $f \in M$ be such that $f \sim \mathrm{Dict}_i$. If $f = \mathrm{Dict}_i$, surely $f \in \mathrm{DICT}$. Otherwise, we have that $f = \mathrm{Dict}_i \circ \vec{\pi}$ for some permutation $\pi$. Suppose $\pi(i) = j$. A trivial computation then shows that $f = \mathrm{Dict}_j$, so $f \in \mathrm{DICT}$. We now show that for any $j$, $\mathrm{Dict}_i \sim \mathrm{Dict}_j$. Indeed, if $\pi$ is the permutation that switches $i$ and $j$ and is the identity elsewhere, then $\mathrm{Dict}_i = \mathrm{Dict}_j \circ \vec{\pi}$. Furthermore, it is clear that any dictator has a unique voter with maximal force.

Let $\mu$ be a distribution with full support such that $d = d_\mu$ is a metric on $M$ (by Proposition 3.2), and consider the quotient metric $d_\sim$ defined in (1) for the relation $\sim$ from Definition 3.7. Since $M$ is finite, we obtain from Lemma 2.6 that $d_\sim$ is a metric on $M_\sim = M/{\sim}$. However, as we saw in that same subsection, the defining formula for $d_\sim$ is convoluted, so we set out to identify a subgroup of isometries that will allow us to apply Proposition 2.7, as that would give us a more convenient formula to work with. To do that, we shall need the following notion.

Definition 3.10

A distribution $\mu$ on $L(A)^n$ is $n$-permutation-invariant if for all $\pi : n \to n$, $\mu \circ \vec{\pi} = \mu$.

The condition $\mu = \mu \circ \vec{\pi}$ for all permutations $\pi$ demands that $\mu(x) = \mu(x')$ whenever $x'$ and $x$ are rearrangements of one another. Mathematically, it ensures that the probability distribution $\mu$ is well-defined up to permutation equivalence of profiles.

For each permutation $\pi : n \to n$, let $J_\pi : M \to M : f \mapsto f \circ \vec{\pi}$. This map is well-defined, as $f \circ \vec{\pi}$ is Pareto if $f$ is.

Proposition 3.11

Let $\mu$ be a distribution on $L(A)^n$ with full support and such that $\mu$ is $n$-permutation-invariant. Then $J_\pi$ is an isometry of $M$, for each $\pi$.

Proof. From the bijectivity of $\pi$, it follows that $J_\pi$ is bijective. Let $\pi$ be a permutation and $f, g \in M$. Then we need to show that $d_\mu(f, g) = d_\mu(f \circ \vec{\pi}, g \circ \vec{\pi})$. We calculate:

$$\begin{aligned}
\Pr_{x \sim \mu}[(f \circ \vec{\pi})(x) \neq (g \circ \vec{\pi})(x)]
&= \sum_{x \in L(A)^n} \mu(x) \cdot 1_{(f \circ \vec{\pi})(x) \neq (g \circ \vec{\pi})(x)} \\
&= \sum_{x \in L(A)^n} \mu(\vec{\pi}^{-1}(x)) \cdot 1_{(f \circ \vec{\pi})(\vec{\pi}^{-1}(x)) \neq (g \circ \vec{\pi})(\vec{\pi}^{-1}(x))} \\
&= \sum_{x \in L(A)^n} \mu(\vec{\pi}^{-1}(x)) \cdot 1_{f(x) \neq g(x)}.
\end{aligned}$$

Hence, $\Pr_{x \sim \mu}[(f \circ \vec{\pi})(x) \neq (g \circ \vec{\pi})(x)] = \Pr_{x \sim \mu \circ \vec{\pi}^{-1}}[f(x) \neq g(x)]$. Note that $\mu \circ \vec{\pi}^{-1} : L(A)^n \to [0, 1]$ is indeed a probability distribution on $L(A)^n$. The proof is complete after observing that $\mu \circ \vec{\pi}^{-1} = \mu$.

Let $G = \{ J_\pi \mid \pi \text{ is a permutation} \}$. From Proposition 3.11 it follows that $G$ is a subgroup of the group of isometries of $M$. We define a map $\cdot : G \times M \to M$ by

$$J_\pi \cdot f = \begin{cases} J_\pi(f) & \text{if } f \text{ has a unique maximal force voter,} \\ f & \text{otherwise.} \end{cases}$$

Proposition 3.12

The operation $\cdot$ is a group action of the group $G$ on $M$ and its orbits coincide with the equivalence classes under $\sim$.

Proof. Note that $\mathrm{id} : M \to M$ is the identity of $G$, and $\mathrm{id} \cdot f = f$ for all $f \in M$. Also, $(g \circ h) \cdot f = g \cdot (h \cdot f)$ for all $g, h \in G$ and $f \in M$. Indeed, this easily follows by making the case distinction whether $f$ has a unique maximal force voter.

The orbit of an $f \in M$ under this group action is equal to $G \cdot f = \{ J_\pi \cdot f \mid \pi \}$. Now, $J_\pi \cdot f$ is $J_\pi(f) = f \circ \vec{\pi}$ if $f$ has a unique maximal force voter, and $f$ otherwise. Hence, $G \cdot f = \{ J_\pi \cdot f \mid \pi \} = [f]$. Thus, the orbits of the group action coincide with the equivalence classes under $\sim$.

Proposition 3.13

Let $\mu$ be a distribution with full support and with $\mu = \mu \circ \vec{\pi}$ for all permutations $\pi$. Let $d_\sim$ be the metric on $M/{\sim}$ based on the metric $d = d_\mu$ on $M$ (see Lemma 2.6). Then for all $[f], [g] \in M/{\sim}$ we have

$$d_\sim([f], [g]) = \min \{ d(f', g') \mid f' \in [f]_\sim,\ g' \in [g]_\sim \}.$$

Proof. This follows from Proposition 2.7, Proposition 3.11, and Proposition 3.12.

[…] $\Phi$ to a Map with Unique Fixpoint

From now on, unless specifically mentioned otherwise, we shall always assume that $\mu$ is a distribution on $L(A)^n$ with full support and with $\mu = \mu \circ \vec{\pi}$ for all permutations $\pi$, such that Proposition 3.13 applies. Such a $\mu$ obviously exists: the uniform distribution is an example, and one can easily obtain many other examples simply by giving one representative of each permutation equivalence class of profiles (meaning profiles $x$ and $y$ are equivalent iff there is a permutation $\pi$ such that $\vec{\pi}(x) = y$) a non-zero weight. Later we will exploit this property.

Our aim in this subsection is to extend the map $\Phi : M \to M$ to a map $M_\sim \to M_\sim$ where, as we recall, $M_\sim = M/{\sim}$. The following is a technical result that we shall use later.

Lemma 3.14

If $f = g \circ \vec{\pi}$ for $\pi$, then $F^i_\mu[g] = F^{\pi(i)}_\mu[f]$ for each $i = 1, 2, \ldots, n$.

Proof. For any $i$, let $p_i : L(A)^n \to L(A)$ be the $i$-th projection map. Then we have

$$F^i_\mu[g] = \Pr_{x \sim \mu}[g(x) = x_i] = \sum_x \mu(x) \cdot 1_{g(x) = p_i(x)} = \sum_x \mu(\vec{\pi}(x)) \cdot 1_{g(\vec{\pi}(x)) = p_i(\vec{\pi}(x))}.$$

Note that $p_i(\vec{\pi}(x)) = x_{\pi(i)}$ for each $x$, so

$$F^i_\mu[g] = \sum_x \mu(\vec{\pi}(x)) \cdot 1_{g(\vec{\pi}(x)) = x_{\pi(i)}}.$$

By assumption, $\mu \circ \vec{\pi} = \mu$ and $f = g \circ \vec{\pi}$, so the conclusion follows.

We now show that $\Phi$ respects the equivalence relation $\sim$, so that it extends to the quotient.

Proposition 3.15

For all $f, g \in M$, if $f \sim g$ then $\Phi(f) \sim \Phi(g)$.

Proof. Let $f, g \in M$ be such that $f \sim g$. If $f = g$, then clearly $\Phi(f) = \Phi(g)$. Now suppose there is a $\pi$ such that $f = g \circ \vec{\pi}$, and that $f$ (and hence also $g$) has a unique voter with maximal force. We need to show that $\Phi(f) \sim \Phi(g)$. In fact, we will show that $\Phi(f) = \Phi(g) \circ \vec{\pi}$. From Lemma 3.14 we know that $F^i_\mu[g] = F^{\pi(i)}_\mu[f]$ for each $i$. This implies that $\pi$ maps the first (and only) most forceful voter of $g$ to the first (and only) most forceful voter of $f$. Similarly, $\pi$ maps the least forceful voters in $g$ to the least forceful voters in $f$. Applying the definition of $\Phi$ (see Definition 3.4), we see that $\Phi(f) = \Phi(g) \circ \vec{\pi}$.

Definition 3.16

We define $\Phi_\sim : M_\sim \to M_\sim$ by $\Phi_\sim([f]) = [\Phi(f)]$.

We want to point out that, although we have not written it explicitly, $\Phi_\sim$ (and $\Phi$ alike) does depend on a chosen distribution $\mu$. By Proposition 3.15, $\Phi_\sim$ is well-defined. We also define for each $i = 1, 2, \ldots, n$ a map $s_i : L(A)^n \to L(A)^n$ by $s_i(x_1, \ldots, x_n) = (x_i, \ldots, x_i)$.

The following is a technical result.

Lemma 3.17

For any $f \in M$ and $i$, voter $i$ is the unique voter with maximal force on $f \circ s_i$. Moreover, $[f \circ s_i] = \{ f \circ s_k \mid k = 1, 2, \ldots, n \}$ for each $i$.

Proof. Since $f$ is Pareto, we have

$$F^j_\mu[f \circ s_i] = \Pr_{x \sim \mu}[f(s_i(x)) = x_j] = \Pr_{x \sim \mu}[x_i = x_j]$$

for any $j$. Now as $\mu$ is assumed to have full support, $\Pr_{x \sim \mu}[x_i = x_j] = 1$ iff $x_i = x_j$ for all $x \in L(A)^n$, i.e., iff $i = j$. This proves that $i$ is the unique voter with maximal force.

From the first part we know that $f \circ s_i$ has a unique voter with maximal force. Hence we have $[f \circ s_i] = \{ (f \circ s_i) \circ \vec{\pi} \mid \pi \} = \{ f \circ s_k \mid k = 1, 2, \ldots, n \}$.

Proposition 3.18

The map $\Phi_\sim : M_\sim \to M_\sim$ has a unique fixpoint.

Proof. Since $M_\sim$ is finite, it suffices to show $d_\sim(\Phi^{(n)}_\sim([f]), \Phi^{(n)}_\sim([g])) < d_\sim([f], [g])$ for all distinct $[f], [g] \in M_\sim$. So let $[f], [g] \in M_\sim$. With every iteration of $\Phi$ on $f$, at least one voter loses their vote as it is taken over by the most forceful voter. Thus, there is an $i$ such that $\Phi^{(n)}(f)(x_1, \ldots, x_n) = f(x_i, \ldots, x_i)$ for all $(x_1, \ldots, x_n)$, or in other words, $\Phi^{(n)}(f) = f \circ s_i$. There is similarly a $j$ such that $\Phi^{(n)}(g) = g \circ s_j$.

We have

$$d_\sim(\Phi^{(n)}_\sim([f]), \Phi^{(n)}_\sim([g])) = d_\sim([\Phi^{(n)}(f)], [\Phi^{(n)}(g)]) = d_\sim([f \circ s_i], [g \circ s_j]).$$

Applying Proposition 3.13 and Lemma 3.17, we obtain

$$d_\sim(\Phi^{(n)}_\sim([f]), \Phi^{(n)}_\sim([g])) = \min_{k, l} d(f \circ s_k, g \circ s_l).$$

Since $f$ and $g$ are Pareto, we have $(f \circ s_k)(x_1, \ldots, x_n) = f(x_k, \ldots, x_k) = x_k$ and $(g \circ s_l)(x_1, \ldots, x_n) = g(x_l, \ldots, x_l) = x_l$ for all $(x_1, \ldots, x_n) \in L(A)^n$. Thus, $\min_{k, l} d(f \circ s_k, g \circ s_l) = \min_{k, l} \Pr_{x \sim \mu}[x_k \neq x_l] = 0$, and from this it follows that whenever $d_\sim([f], [g]) \neq 0$, we have $d_\sim(\Phi^{(n)}_\sim([f]), \Phi^{(n)}_\sim([g])) = 0 < d_\sim([f], [g])$. So $\Phi_\sim$ collapses all equivalence classes after $n$ steps. Note that $\Phi^{(n)}_\sim([f]) = \mathrm{DICT}$ for all $f \in M$.

We now show that this fixpoint is the equivalence class of dictators.

Proposition 3.19

It holds that $\Phi_\sim(\mathrm{DICT}) = \mathrm{DICT}$.

Proof. From Proposition 3.5 we know that $\Phi(\mathrm{Dict}_i) \sim \mathrm{Dict}_i$ for each $i$. This implies the equality $\Phi_\sim(\mathrm{DICT}) = \Phi_\sim([\mathrm{Dict}_i]) = [\Phi(\mathrm{Dict}_i)] = [\mathrm{Dict}_i] = \mathrm{DICT}$, where we used Proposition 3.9.

We shall now develop a technical result needed for proving Arrow’s theorem. For any $i$, we define $v_i : L(A)^n \to L(A)^{n-1}$ by $v_i(x) = x_{-i}$, where $x_{-i}$ is the same vector as $x$ but with the $i$-th component left out (so $x = (x_i, x_{-i})$ for all $x$ and $i$). Given any distribution $\mu$ on $L(A)^{n-1}$ and $i = 1, 2, \ldots, n$, we can associate with $\mu$ a real-valued map $\mu^{[i]}$ on $L(A)^n$ by defining

$$\mu^{[i]}(x) = \frac{\sum_{\tau : n \to n} \mu(v_i(\vec{\tau}(x)))}{n!\, |L(A)|}. \qquad (2)$$

Lemma 3.20

Let $\mu$ be a distribution on $L(A)^{n-1}$ and $i \in \{1, 2, \ldots, n\}$. Then $\mu^{[i]}$ is a distribution on $L(A)^n$ and $\mu^{[i]}$ is $n$-permutation-invariant. If moreover $\mu$ has full support, then also $\mu^{[i]}$ has full support.

Proof. To show that $\mu^{[i]}$ is a distribution, note that for a fixed permutation $\tau : n \to n$,

$$\sum_{x \in L(A)^n} \mu(v_i(\vec{\tau}(x))) = \sum_{x \in L(A)^n} \mu(v_i(x)) = \sum_{x_i} \sum_{x_{-i}} \mu(x_{-i}) = \sum_{x_i} 1 = |L(A)|.$$

Thus, as the number of permutations $n \to n$ is $n!$, we get

$$n!\, |L(A)| \sum_{x \in L(A)^n} \mu^{[i]}(x) = \sum_\tau \sum_{x \in L(A)^n} \mu(v_i(\vec{\tau}(x))) = n!\, |L(A)|,$$

so the values of $\mu^{[i]}$ sum to $1$.

We now show that $\mu^{[i]}$ is $n$-permutation-invariant. Let $\pi : n \to n$. We show that $\mu^{[i]} \circ \vec{\pi} = \mu^{[i]}$. For any $x \in L(A)^n$, we have

$$(n!\, |L(A)|)\,((\mu^{[i]} \circ \vec{\pi})(x)) = \sum_{\tau : n \to n} \mu(v_i(\vec{\tau}(\vec{\pi}(x)))) = \sum_{\sigma : n \to n} \mu(v_i(\vec{\sigma}(x))) = (n!\, |L(A)|)\,(\mu^{[i]}(x)).$$

The last claim is trivial, so the proof is complete.

The following is a technical lemma.
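The construction in (2) and the first two claims of Lemma 3.20 can be checked by brute force on a small instance. The sketch below is illustrative only and rests on our own assumptions: three candidates, $n = 3$, voters indexed from 0, and an arbitrary non-uniform distribution with full support on $L(A)^{n-1}$. The name `lift` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

ORDERS = list(permutations(("a", "b", "c")))  # L(A) for an illustrative A
N = 3                                         # illustrative number of voters
PERMS = list(permutations(range(N)))          # all permutations of the voters
PROFILES = list(product(ORDERS, repeat=N))    # L(A)^n


def lift(mu, i):
    """Build mu^[i] on L(A)^n from a distribution mu on L(A)^(n-1), following (2)."""
    norm = len(PERMS) * len(ORDERS)  # n! * |L(A)|
    lifted = {}
    for x in PROFILES:
        total = Fraction(0)
        for tau in PERMS:
            tx = tuple(x[tau[k]] for k in range(N))  # the permuted profile
            total += mu[tx[:i] + tx[i + 1:]]         # drop component i, then apply mu
        lifted[x] = total / norm
    return lifted


# An arbitrary non-uniform distribution with full support on L(A)^(n-1).
small_profiles = list(product(ORDERS, repeat=N - 1))
total_weight = sum(range(1, len(small_profiles) + 1))
mu = {x: Fraction(k + 1, total_weight) for k, x in enumerate(small_profiles)}

lifted = lift(mu, N - 1)
print(sum(lifted.values()))  # total probability mass of the lifted map
print(all(lifted[tuple(x[pi[k]] for k in range(N))] == lifted[x]
          for pi in PERMS for x in PROFILES))  # permutation invariance
```

Note that `mu` itself is deliberately not permutation-invariant; the averaging over all permutations in (2) is what makes the lifted map invariant.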

Lemma 3.21

Let $n \geq$ […], and let $g : L(A)^{n-1} \to L(A)$ be a voting rule for $n-1$ voters. If $f : L(A)^n \to L(A)$ is such that $f(x_1, \ldots, x_{n-1}, x_n) = g(x_1, \ldots, x_{n-1})$, and $\mu$ is an $(n-1)$-permutation-invariant distribution on $L(A)^{n-1}$, then $F^n_{\mu^{[n]}}[f] \leq 2/(n|L(A)|)$, and for each $i = 1, 2, \ldots, n-1$ we have $F^i_{\mu^{[n]}}[f] \geq \frac{1}{n} F^i_\mu[g]$.

Proof. Let $i \in \{1, 2, \ldots, n-1\}$. We show that $F^i_{\mu^{[n]}}[f] \geq \frac{1}{n} F^i_\mu[g]$. More precisely, we will show that
\[ F^i_{\mu^{[n]}}[f] = \frac{1}{n} F^i_\mu[g] + \frac{1}{n|L(A)|} \sum_{j=1}^{n-1} \mu(x_{-j})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_i}. \]
We have
\[ F^i_{\mu^{[n]}}[f] = \frac{1}{n!\,|L(A)|} \sum_{x_1,\ldots,x_n} \sum_{\tau} \mu(x_{\tau(1)}, \ldots, x_{\tau(n-1)})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_i}. \]
For each $k = 1, 2, \ldots, n$, let $V_k = \{\tau : n \to n \mid \tau(k) = n\}$. To start, note that
\[
\begin{aligned}
&\sum_{x_1,\ldots,x_{n-1},x_n} \sum_{\tau \in V_n} \mu(x_{\tau(1)}, \ldots, x_{\tau(n-1)})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_i} \\
&\quad= \sum_{x_1,\ldots,x_{n-1},x_n} \sum_{\tau \in V_n} \mu(x_1, \ldots, x_{n-1})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_i} && \text{(since $\mu$ is $(n-1)$-permutation-invariant)} \\
&\quad= (n-1)! \sum_{x_1,\ldots,x_{n-1},x_n} \mu(x_1, \ldots, x_{n-1})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_i} && \text{(since $|V_n| = (n-1)!$)} \\
&\quad= (n-1)! \sum_{x_1,\ldots,x_{n-1}} \sum_{x_n} \mu(x_1, \ldots, x_{n-1})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_i} && (3) \\
&\quad= (n-1)!\,|L(A)| \sum_{x_1,\ldots,x_{n-1}} \mu(x_1, \ldots, x_{n-1})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_i} \\
&\quad= (n-1)!\,|L(A)|\, F^i_\mu[g].
\end{aligned}
\]
Also, for each $k \neq n$, note that $V_k = \bigcup_{j=1}^{n-1} H_{kj}$ where $H_{kj} = \{\tau \mid \tau(k) = n \text{ and } \tau(n) = j\}$. It is clear that $|H_{kj}| = (n-2)!$. If $\tau \in H_{kj}$, then clearly $\mu(x_{\tau(1)}, \ldots, x_{\tau(n-1)}) = \mu(x_{-j})$. Now fix $x_1, \ldots, x_{n-1}, x_n$ and a $j \in \{1, 2, \ldots, n-1\}$. Then, with $i$ still denoting the fixed voter chosen at the start of the proof,
\[
\begin{aligned}
\sum_{k=1}^{n-1} \sum_{\tau \in H_{kj}} \mu(x_{\tau(1)}, \ldots, x_{\tau(n-1)})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_i}
&= \sum_{k=1}^{n-1} \sum_{\tau \in H_{kj}} \mu(x_{-j})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_i} \\
&= (n-1)\,|H_{kj}|\, \mu(x_{-j})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_i} \\
&= (n-1)(n-2)!\, \mu(x_{-j})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_i} \\
&= (n-1)!\, \mu(x_{-j})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_i}.
\end{aligned}
\]
Summing this expression over all $j = 1, 2, \ldots, n-1$, we get
\[ (n-1)!\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_i} \sum_{j=1}^{n-1} \mu(x_{-j}). \]
Thus,
\[
\begin{aligned}
F^i_{\mu^{[n]}}[f] &= \frac{1}{n!\,|L(A)|} \sum_{x_1,\ldots,x_n} \sum_{\tau} \mu(x_{\tau(1)}, \ldots, x_{\tau(n-1)})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_i} \\
&= \frac{1}{n!\,|L(A)|} \sum_{x_1,\ldots,x_n} \sum_{\tau \in V_n} \mu(x_{\tau(1)}, \ldots, x_{\tau(n-1)})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_i} \\
&\quad+ \frac{1}{n!\,|L(A)|} \sum_{x_1,\ldots,x_n} \sum_{\tau \in \bigcup_{k=1}^{n-1} V_k} \mu(x_{\tau(1)}, \ldots, x_{\tau(n-1)})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_i}.
\end{aligned}
\]

Plugging in our calculations from above, we conclude that
\[ F^i_{\mu^{[n]}}[f] = \frac{1}{n} F^i_\mu[g] + \frac{1}{n|L(A)|} \sum_{j=1}^{n-1} \mu(x_{-j})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_i}. \]
If we repeat the steps from above, but with $i$ replaced by $n$, then the whole reasoning is the same, except in (3): there, we get
\[ (n-1)! \sum_{x_1,\ldots,x_{n-1}} \sum_{x_n} \mu(x_1, \ldots, x_{n-1})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_n} = (n-1)! \sum_{x_1,\ldots,x_{n-1}} \mu(x_1, \ldots, x_{n-1}) = (n-1)!. \]
This lets us conclude that
\[ F^n_{\mu^{[n]}}[f] = \frac{1}{n|L(A)|} + \frac{1}{n|L(A)|} \sum_{j=1}^{n-1} \mu(x_{-j})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_n}. \]
Since $\sum_{j=1}^{n-1} \mu(x_{-j})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_n} \leq 1$, we obtain $F^n_{\mu^{[n]}}[f] \leq 2/(n|L(A)|)$.

Let $\varepsilon > 0$ be such that $\varepsilon < 1 - 2/|L(A)|$. Note that this is possible precisely because $|L(A)| > 2$, which holds since $|A| \geq 3$, i.e., there are at least three candidates. Fix any $y \in L(A)$. Let $G = \{(y, \ldots, y)\}$. We define a particular distribution $\mu_*$ on $L(A)^{n-1}$ as follows: $\mu_*$ gives weight $1 - \varepsilon$ to $(y, \ldots, y)$ and spreads the remaining $\varepsilon$ out over all other profiles. That is, for any $x \in L(A)^{n-1}$, we let
\[
\mu_*(x) =
\begin{cases}
1 - \varepsilon & \text{if } x \in G, \\
\dfrac{\varepsilon}{|L(A)|^{n-1} - 1} & \text{if } x \notin G.
\end{cases}
\]
Note that $\mu_*$ is $(n-1)$-permutation-invariant. Moreover, for any Pareto voting rule $g : L(A)^{n-1} \to L(A)$ we have $g(x, x, \ldots, x) = x$ for all $x \in L(A)$.

Lemma 3.22

Let $g : L(A)^{n-1} \to L(A)$ be a voting rule for $n-1$ voters that is Pareto, and let $f : L(A)^n \to L(A)$ be given by $f(x_1, \ldots, x_{n-1}, x_n) = g(x_1, \ldots, x_{n-1})$. Then voter $n$ is the unique least forceful voter of $f$ with respect to $\mu_*^{[n]}$.

Proof. By Lemma 3.21 it suffices to show that $F^i_{\mu_*}[g] > 2/|L(A)|$ for each $i = 1, 2, \ldots, n-1$. We have
\[ F^i_{\mu_*}[g] = \sum_{(x_1,\ldots,x_{n-1})} \mu_*(x_1, \ldots, x_{n-1})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_i} \geq \sum_{(x_1,\ldots,x_{n-1}) \in G} \mu_*(x_1, \ldots, x_{n-1})\, \mathbf{1}_{g(x_1,\ldots,x_{n-1}) = x_i}, \]
and this equals $\mu_*(y, \ldots, y) = 1 - \varepsilon$ since $g$ is Pareto. By choice of $\varepsilon$, we have $1 - \varepsilon > 2/|L(A)|$.

Finally we arrive at the proof of Arrow's theorem.

Proof of Arrow's theorem. Assume towards a contradiction that $n \geq$ […] and that there is a $g \in \mathrm{PIIA}_{n-1} \setminus \mathrm{DICT}_{n-1}$. We define $f : L(A)^n \to L(A)$ by $f(x_1, \ldots, x_{n-1}, x_n) = g(x_1, \ldots, x_{n-1})$. Note the following.

(1) $f \in \mathrm{PIIA}$ since it satisfies IIA and Pareto, as $g$ does.

(2) $f$ is not a dictator: clearly none of the first $n-1$ voters can be the dictator, since otherwise $g$ would be a dictator, and if the $n$-th voter were the dictator then $g(x_1, \ldots, x_{n-1}) = x_n$ for all $(x_1, \ldots, x_{n-1}, x_n)$, in contradiction with the fact that $g$ is a function.

Hence $f \in \mathrm{PIIA} \setminus \mathrm{DICT}$.

We take $\mu$ to be the $\mu_*^{[n]}$ that we just introduced. By construction, $\mu_*^{[n]}$ has full support, satisfies $\mu_*^{[n]} = \mu_*^{[n]} \circ \vec{\pi}$ for all permutations $\pi$ (by Lemma 3.20), and voter $n$ is the unique voter with least force on $f$ with respect to $\mu_*^{[n]}$ (by Lemma 3.22). This proves that $\Phi_{\mu_*^{[n]}}(f) = f$. Therefore, $(\Phi_{\mu_*^{[n]}})_\sim([f]) = [f]$. Proposition 3.18 and Proposition 3.19 imply that $[f] = \mathrm{DICT}$. In particular, $f \in \mathrm{DICT}$. Contradiction. $\Box$
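The distribution $\mu_*$ and the bound $F^i_{\mu_*}[g] \geq 1 - \varepsilon$ from the proof of Lemma 3.22 can be illustrated on a small instance. The Python sketch below is a hedged illustration, not part of the paper: the function names are hypothetical, the choice $\varepsilon = 1/10$ is arbitrary within the allowed range, and the rule `g` (voter 1 decides) is used only as an easy example of a Pareto rule. It checks that $\mu_*$ sums to one, that it is permutation-invariant, and that the force of each voter exceeds $2/|L(A)|$.

```python
from fractions import Fraction
from itertools import permutations, product


def mu_star(orders, m, y, eps):
    """mu_* on L(A)^m: mass 1 - eps on (y, ..., y), eps shared equally by the rest."""
    profiles = list(product(orders, repeat=m))
    peak = (y,) * m
    rest = eps / (len(profiles) - 1)
    return {x: (1 - eps if x == peak else rest) for x in profiles}


def force(mu, g, i):
    """F^i_mu[g] = sum over x of mu(x) * [g(x) == x_i], with i 1-based."""
    return sum(p for x, p in mu.items() if g(x) == x[i - 1])


def is_permutation_invariant(mu, m):
    """Check mu(x_pi(1), ..., x_pi(m)) == mu(x) for all permutations pi."""
    return all(
        mu[tuple(x[pi[k]] for k in range(m))] == mu[x]
        for pi in permutations(range(m))
        for x in mu
    )


orders = list(permutations("abc"))  # |L(A)| = 6 for three candidates
m = 2                               # m = n - 1 voters
eps = Fraction(1, 10)               # any 0 < eps < 1 - 2/|L(A)| = 2/3 would do
mu = mu_star(orders, m, orders[0], eps)
g = lambda x: x[0]                  # hypothetical Pareto rule: voter 1 decides

print(sum(mu.values()) == 1, is_permutation_invariant(mu, m))
print([force(mu, g, i) >= 1 - eps > Fraction(2, len(orders)) for i in (1, 2)])
```

The sketch only exercises the properties of $\mu_*$ and the lower bound on the forces of $g$; it does not compute forces with respect to $\mu_*^{[n]}$.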

The main goal of this paper has been to show that Arrow's impossibility theorem can be proved using Banach's fixpoint theorem. Our approach involved coming up with an appropriate equivalence relation, and then defining a contraction on the resulting equivalence space whose unique fixpoint is the set of dictators. The concept of force of a voter, as well as thinking about voting rules as elements of a metric space based on a probability distribution, are inspired by the Boolean analysis approach to social choice, a line of research initiated by [13] and further developed by others [15].

Our proof of Arrow's theorem is different in spirit from most of the previous proofs in that it does not involve manipulations of specific profiles. The Pareto property is fundamental in our analysis and we used it ubiquitously, as our original metric space consists of all Pareto voting rules. Interestingly, however, we did not use the IIA property explicitly in our proof; we only used it by noting that it was preserved under a certain operation. This makes us wonder if it would be possible to get other impossibility results by considering other properties that are preserved under the same operation.

An advantage of our perspective on Arrow's theorem is that it establishes a link between this pivotal result of mathematical economics and a concept often surfacing in that area: fixpoints. The notion of fixpoint also connects the theorem better to the area of computer science, where fixpoints are omnipresent and sometimes can lead to algorithms, although this does not seem to be the case here. A possible direction for future work is to analyze if similar results in the area, such as the Gibbard-Satterthwaite theorem [12], can be proved via a related fixpoint argument. It would also be worthwhile to study the relationship between our notion of power and notions like decisiveness or pivotalness that other proofs use.

Acknowledgments.

We would like to thank Thomas Santoli for pointing out in a discussion that the permutation structure gives rise to a group action, which ultimately led to Proposition 3.13, and Ronald de Wolf for inspiring us to formalize and study the notion of force.

References

[1] Aleksandr V. Arhangel'skij & Lev S. Pontryagin (1990): General Topology: Basic Concepts and Constructions. Dimension Theory. I. Springer-Verlag.

[2] Kenneth J. Arrow (1951): Social Choice and Individual Values. New York.

[3] Stefan Banach (1922): Sur les opérations dans les ensembles abstraits et leur application aux équations intégrales. Fundamenta Mathematicae.

[4] Weighted Voting Doesn't Work: A Mathematical Analysis. Rutgers L. Rev. 19, p. 317.

[5] Salvador Barbera (1980): Pivotal Voters: A New Proof of Arrow's Theorem. Economics Letters.

[6] Julian H. Blau (1972): A Direct Proof of Arrow's Theorem. Econometrica: Journal of the Econometric Society, pp. 61–67, doi:10.2307/1909721.

[7] Francesca Cagliari, Barbara Di Fabio & Claudia Landi (2015): The Natural Pseudo-distance as a Quotient Pseudo-metric, and Applications. In: Forum Mathematicum, 27, De Gruyter, pp. 1729–1742.

[8] Frank M. V. Feys (2015): Fourier Analysis for Social Choice. Master's thesis, Universiteit van Amsterdam, the Netherlands.

[9] Ehud Friedgut, Gil Kalai & Assaf Naor (2002): Boolean Functions Whose Fourier Transform is Concentrated on the First Two Levels. Advances in Applied Mathematics.

[10] The Paradox of Voting: Probability Calculations. Behavioral Science.

[11] Three Brief Proofs of Arrow's Impossibility Theorem. Economic Theory.

[12] Manipulation of Voting Schemes: A General Result. Econometrica.

[13] A Fourier-theoretic Perspective on the Condorcet Paradox and Arrow's Theorem. Advances in Applied Mathematics.

[14] Arrow's Theorem, Many Agents, and Invisible Dictators. Journal of Economic Theory.

[15] A Quantitative Arrow Theorem. Probability Theory and Related Fields.

[16] Equilibrium Points in n-Person Games. Proceedings of the National Academy of Sciences.

[17] Analysis of Boolean Functions. Cambridge University Press, doi:10.1017/CBO9781139814782.

[18] Vittorino Pata (2014): Fixed Point Theorems and Applications. Politecnico di Milano.

[19] Philip J. Reny (2001): Arrow's Theorem and the Gibbard-Satterthwaite Theorem: A Unified Approach. Economics Letters