Polarization in Networks: Identification-alienation Framework
Kenan Huremović †, Ali Ozkes ‡
September 16, 2020
Abstract
We introduce a model of polarization in networks as a unifying setting for the measurement of polarization that covers a wide range of applications. We consider a substantially general setup for this purpose: node- and edge-weighted, undirected, and connected networks. We generalize the axiomatic characterization of Esteban and Ray (1994) and show that only a particular instance within this class can be used justifiably to measure polarization in networks.
JEL codes:
D63, D70, P16
Keywords: measurement, networks, polarization
∗ We thank Joan Esteban, Jan Fałkowski, and Fernando Vega-Redondo for their comments, and Debraj Ray for our discussion at the beginning of the project. K. Huremović acknowledges financial support from ANR (project ANR 18-CE26-0020-01).
† IMT School for Advanced Studies Lucca, Piazza S. Francesco, 19, 55100 Lucca, Italy. E-mail: [email protected]
‡ Wirtschaftsuniversität Wien, Institute for Markets and Strategy, Welthandelsplatz 1, 1020, Vienna, Austria. E-mail: [email protected]
Introduction
Polarization in a population denotes an intensified disconnect among its groups. The analysis of the sources and the consequences of polarization depends highly on what is measured and how, which, in turn, is strictly contingent on the particular context. For instance, while in the context of American politics polarization is perceived as the division of masses into the cultural camps of liberals and conservatives, in the context of European multi-party parliaments, it is seen as the existence of ideologically cohesive and distinct party blocks. So even the term "political polarization" is not indicative of what is being measured and how. Existing literature reflects this complexity, and there is an abundance of measures without a unified formalism that applies to comparable contexts.
Although there are substantial differences among existing measures across fields, one ubiquitous feature can be identified. Namely, most of the current measures are proposed in settings with a uni-dimensional scalar attribute on which the polarization is assumed to occur. However, conflicts in societies are in general related to an irreducibly complex set of attributes, and most of the empirical work relies on categorical data on various characteristics. Dimensionality reduction approaches are called for in many instances, because the existing polarization measures allow for only a uni-dimensional, or at most a bi-dimensional, domain (Hill and Tausanovitch, 2015). However, reduced dimensions can be questionable for their capacity to represent the actual phenomenon of interest (Kam et al., 2017).
In this paper, we propose the formalism of network theory to study the measurement of polarization, which delivers the desired generality and spans a large variety of contexts. We fully characterize a polarization measure following the axiomatic setting introduced by Esteban and Ray (1994) (henceforth ER) for distributions on the real line.
As in ER, we restrict ourselves to distributions with finite support.
Our setup is built on undirected networks in which both nodes and links are weighted. A node in the network represents a certain attribute or grouping of individuals in the population. The weight of a node corresponds to the number of individuals in the population that are characterized by the attribute or are members of the group (e.g., a political party). The weighted links describe (direct) bilateral relationships between nodes. This setup is quite general and can represent a wide range of settings in which measuring polarization is an issue of first-order importance. We describe a number of important examples in the […]
Footnote: See Fiorina et al. (2005) and Maoz and Somer-Topcu (2010) for the two different contexts.
Footnote: Examples include ethnolinguistics (Montalvo and Reynal-Querol, 2008), ethnic power relations (Wimmer et al., 2009), and political retweets (Conover et al., 2011).
[…] $\alpha$, which captures the importance of identification in the effective antagonism, can take. Our first result shows, quite surprisingly, that this class is narrowed down to a unique value, i.e., $\alpha = 1$ (Theorem 1). Note that adaptations of these axioms are neither trivial nor straightforward, as networks allow for a much larger generality in representing […]
Footnote: MRQ also identify $\alpha = 1$ in their setup, which is a special case of ours. Furthermore, their axiomatization is different from ours and from that of ER.
Footnote: […], $\alpha^* \simeq 1.6$.
It is desirable that polarization measures attain their maximum at the symmetric bipolar distribution. Contrary to the real intervals, in networks there can be any finite number of nodes with maximal distance between them.
Still, we show that any measure within the family we characterize is maximized at the symmetric bipolar distribution, that is, when the population is symmetrically distributed among the two most distant nodes in the network (Proposition 1).
Finally, we show that if we restrict our attention to particular classes of networks emerging in certain domains, such as language trees (class of tree networks) or income distributions (class of line networks), one of the axioms, Axiom 3, can be weakened in a systematic way to allow for a wider class of measures that can be used consistently (Theorem 2). For instance, in the special case of line networks that can be used to represent income distributions, our set of axioms and the class of measures reduce to the ones in ER.
Related literature
It is a challenge to pay fair tribute to the ever-growing literature on the measurement of polarization. Here, we refer to a set of papers in different domains and discuss a few closely related ones. We mention several other works in Section 5.
Polarization is studied in social sciences (particularly in economics and political science) in relation to economic inequality (Esteban et al., 2007, Esteban and Ray, 2012, Zhang and Kanbur, 2001), social conflict (Desmet et al., 2017, Montalvo and Reynal-Querol, 2008, Østby, 2008), political economy (Aghion et al., 2004, Desmet et al., 2012, Lindqvist and Östling, 2010), international relations (Maoz, 2006b), political ideologies (Abramowitz and Saunders, 2008, Fiorina and Abrams, 2008, Lelkes, 2016, Martin and Yurukoglu, 2017, Ozdemir and Ozkes, 2014), political sentiments (Boxell et al., 2017, Garcia et al., 2015), and social attitudes (DiMaggio et al., 1996, Lee et al., 2014, McCright and Dunlap, 2011), among others.
Footnote: ER propose further restrictions in that regard by imposing an additional axiom (Axiom 4) that brings about a lower bound, i.e., $\alpha \ge 1$.
We want to emphasize that we are not the first to consider an ER-type approach to the measurement of polarization in networks. For instance, both Esteban and Ray (1999) and Esteban and Ray (2011) explore this issue. However, to the best of our knowledge, this is the first paper to provide an axiomatic characterization for measures of polarization in networks. Fowler (2006a,b) and Maoz (2006b) are among the leading examples where network formalism is proposed for the measurement of polarization, without an axiomatic treatment. Finally, Permanyer and D'Ambrosio (2015) characterize a distinct family of measures for categorical attributes by using the identification-alienation framework and a number of additional axioms.
The rest of the paper is organized as follows. In Section 2 we describe the environment we study and illustrate the wide applicability of our approach. In Section 3 we define polarization, state the axioms, and deliver our major results. In Section 4 we discuss the importance of network structure in terms of polarization and formally illustrate the connection between our work and previous literature. We conclude in Section 5.
We consider a population in which individuals belong to $n >$ […] groups. For each group $i$, $\pi_i \ge 0$ denotes the size of group $i$. When $\pi_i = 0$ we say that group $i$ is empty. Vector $\pi \in \mathbb{R}^n_{\ge 0}$ describes the distribution of a population among $n$ groups.
Bilateral relationships between groups are described with an undirected weighted graph (UWG) $g$, with the set of nodes, $N(g)$, equal to the set of groups, and the set of undirected links (or edges) $E(g) \subseteq \{\{i,j\} : i, j \in N(g) \text{ and } i \neq j\}$. As usual, we denote the edge between nodes $i$ and $j$ in graph $g$ with $ij$ and the weight of that edge with $g_{ij} \ge 0$. While we treat weights quite generally, it is useful to think of $g_{ij}$ as the direct distance between two connected nodes $i$ and $j$: a higher $g_{ij}$ implies a weaker connection between $i$ and $j$. For the remaining part of the paper we write $ij \in g$ instead of $\{i,j\} \in E(g)$ to indicate that there is an edge between nodes $i$ and $j$ in $g$. When nodes $i$ and $j$ are not directly connected, we write $ij \notin g$. Moreover, since groups are represented as nodes, we use the words group and node interchangeably.
Footnote: Esteban and Ray (1999) arrive at $\alpha = 1$ in their attempt to connect the intensity of conflict to polarization, without an axiomatic discussion, while Esteban and Ray (2011) supplement the four axioms in Duclos et al. (2004) with a fifth axiom that delivers $\alpha = 1$.
Footnote: The measure Maoz (2006b) uses is developed in the unpublished working paper by Maoz (2006a), and while inspired by Duclos et al. (2004), it is only shown to satisfy an extended and qualitatively different set of properties.
Footnote: The particular interpretation of weights $(g_{ij})_{i,j \in N}$ depends on the application, as we demonstrate in Section 2.1.
We restrict our attention to connected graphs, i.e., graphs in which there is a path connecting any two nodes. The distance between nodes $i$ and $j$ in $g$, denoted with $d_g(i,j)$, is measured using the notion of the shortest path. That is, while there may be different routes one can take to reach node $j$ starting from node $i$ and moving along the links in $g$, the distance between $i$ and $j$ is the length of the shortest path. This notion of distance, also known as the geodesic distance, is the standard in graph theory and the theory of networks (Newman, 2003, Jackson, 2008).
Let $G_n$ denote the set of all UWGs with $n$ nodes, and let $\{G_n\}_{n \in \mathbb{N}}$ denote the family of all UWGs with any finite number of nodes. The main object of our analysis is the ordered pair $(g, \pi) \in G_n \times \mathbb{R}^n_{\ge 0}$, which represents a weighted (node-weighted and link-weighted) network. We use $\mathcal{N}$ to denote the set of all networks with a finite number of nodes.
In the special case when $\pi = \mathbf{1}$, $(g, \pi)$ coincides with the standard notion of a (link-)weighted network. If, additionally, $g_{ij} = 1$ whenever $ij \in g$, then $(g, \pi)$ is a binary network. Thus $(g, \pi)$ is a fairly general object that can be used to represent any undirected network we observe, allowing for weights on nodes and edges. In Section 4 we show that any distribution studied in ER or any classification covered by MRQ can be represented as a network.
A polarization measure is a mapping $P : \mathcal{N} \to \mathbb{R}_{\ge 0}$ that assigns to each network $(g, \pi) \in \mathcal{N}$ a non-negative real number.
Before turning to the axiomatic analysis, we discuss a number of examples in which data can be represented as a network and measuring polarization is of interest.
We consider several networks that arise in politics, each of which encodes a different aspect of the prevailing political climate.
In particular, we consider situations in which a collection of individuals express their preferences over alternatives, natural examples of which include a parliament voting on bills and an electorate choosing among candidates. We discuss how these two can be modeled as networks in order to measure elite and mass polarization.
We start with the case of a parliament with possibly more than two parties. Let there be $N \in \mathbb{N}$ representatives denoted by $R = \{1, \dots, N\}$ and $T \in \mathbb{N}$ parties denoted by $\mathcal{T} = \{t_1, \dots, t_T\}$. Suppose there are $k \in \mathbb{N}$ bills that are sponsored by representatives, either individually or in groups, which are thereafter voted for approval in the parliament. Let $v_{ij} \in \{0, 1\}$ denote the vote of $i$ for the bill $j \in \{1, 2, \dots, k\}$ and $V = \{0, 1\}^k$ denote the set of possible vote combinations.
Footnote: We consider only connected graphs in this paper. Our insights can be extended, in a somewhat ad hoc manner, to cases when $g$ is unconnected, for instance, by defining the distance between nodes from different components of $g$ to be equal to the longest path between any two connected nodes in $g$.
Footnote: Alternatively, one can think of $(g, \pi)$ as a distribution $\pi$ on graph $g$.
Footnote: See Kearney (2019) for a review focusing on networks in the political domain from a general perspective.
Network of representatives, $(g', \pi')$, link-weighted.
The set of nodes in graph $g'$ is $R = \{1, \dots, N\}$. For any two representatives $i$ and $j$, let $g'_{ij} \ge 0$ […] Thus, $g'_{ij}$ stands for the (inverse of the) strength of their connection, where $g'_{ij} = 0$ indicates that $i$ and $j$ always vote the same way. When they never vote the same way on any bill, they are not directly connected, hence $ij \notin g'$. The size of every node $i \in R$ is $\pi'_i = 1$, thus $\pi' = \mathbf{1}$, as each node represents a unique representative. An example of such networks can be found in Andris et al. (2015).
Network of co-sponsorships, $(\hat g', \hat\pi')$, unweighted.
The set of nodes in $\hat g'$ is $R = \{1, \dots, N\}$. $\hat g'_{ij} = 1$ if $i$ and $j$ co-sponsored at least one bill together, and $ij \notin \hat g'$ otherwise. The size of each node $i \in R$ is $\hat\pi'_i = 1$, thus $\hat\pi' = \mathbf{1}$, since each node represents a unique representative. Fowler (2006a) studies this type of network.
Network of votes, $(\tilde g', \tilde\pi')$, node-weighted.
The set of nodes in graph $\tilde g'$ is $V = \{v_1, \dots, v_{2^k}\}$. Two nodes (vote combinations) $v_i$ and $v_j$ are connected, i.e., $ij \in \tilde g'$, whenever $v_i$ and $v_j$ differ only in a single coordinate (bill). Each link in $\tilde g'$ has weight 1. $\tilde\pi'_i$ denotes the number of individuals with voting profile $v_i$, and $\tilde\pi'$ is the corresponding distribution. Brams et al. (2007) and Moody and Mucha (2013), among others, study this type of network.
Network of parties, $(\bar g', \bar\pi')$, node- and link-weighted.
The set of nodes in $\bar g'$ is $\mathcal{T} = \{t_1, \dots, t_T\}$. $\bar g'_{ij}$ denotes the share of bills on which a majority of representatives in both parties vote the same way. Thus, $ij \notin \bar g'$ indicates that there is no bill that is supported (or opposed) by a majority of representatives in both parties. The size of a node $t_i \in \mathcal{T}$, $\bar\pi'_i$, denotes the number of seats of party $i$ in the parliament. See […]
Footnote: Alternatively, one can model that two representatives are connected (with weight 1) if they vote together for more than 50% of the bills and not connected otherwise, in which case we would have an unweighted network.
Footnote: The fact that $g'_{ij} = 0$ does not indicate that the link between $i$ and $j$ does not exist, but that the distance between $i$ and $j$ is 0.
Footnote: Alternatively, $\hat g'_{ij}$ may reflect how many bills $i$ and $j$ co-sponsored together, in which case we would have a link-weighted network.
Footnote: $\bar g'_{ij}$ captures the ideological distance, i.e., the extent to which the policies of two parties overlap, which can be measured in different ways. Maoz and Somer-Topcu (2010) take, for instance, the similarities in party manifestos.
Footnote: Alternatively, $\bar\pi'$ can be taken as $\mathbf{1}$, disregarding party sizes and focusing on closeness among parties, in which case we would have a link-weighted network.
$P(g', \pi')$ tells us how polarized the policy positions of representatives based on their vote histories are, regardless of their party affiliations, whereas $P(\bar g', \bar\pi')$ measures the party-level polarization. Also, $P(\tilde g', \tilde\pi')$ is informative about the polarization with respect to policy space, while $P(\hat g', \hat\pi')$ captures the polarization among representatives with respect to policy cooperation.
Footnote: We write $P(g, \pi)$ in place of $P((g, \pi))$ with a slight abuse of notation.
For illustration, let us more closely compare networks $(g', \pi')$ and $(\tilde g', \tilde\pi')$, which are based on exactly the same data, i.e., votes on bills. Consider the following example with 3 bills and 8 representatives, where "+" in (1) represents approval for a bill and "−" represents disapproval. (The representative indices in the column headers did not survive in the source.)
$$\begin{array}{c|ccccc} & R - R & R & R - R & R & R \\ \hline \text{I} & + & - & - & + & + \\ \text{II} & - & + & + & - & + \\ \text{III} & - & - & + & + & + \end{array} \tag{1}$$
Panel (a) of Figure 1 below shows the corresponding network of representatives, whereas panel (b) shows the corresponding network of votes. Since the two networks describe two different sets of relations in the legislation, we may expect that the measured level of polarization differs between them. Nevertheless, any polarization measure in our framework is applicable to both cases. To obtain a deeper insight, for instance, one can also compare polarization of networks representing different types of relationships with a suitable normalization, e.g., by dividing the polarization index by the maximal value it can attain.
Footnote: Both networks $(g', \pi')$ and $(\tilde g', \tilde\pi')$ have a level of "structural regularity." Graph $g'$ leads to a complete network structure in the sense that each node is connected to any other node, even though there is a substantial heterogeneity across weights of the links. Graph $\tilde g'$ has a lattice structure. This is by no means necessary for our approach, which is applicable to connected networks with arbitrary structure. For instance, as in Andris et al. (2015), two representatives can be connected if they vote the same way sufficiently many times; then $g'$ will not have the complete graph structure. Co-sponsorship networks such as $(\hat g', \hat\pi')$ have, in general, quite irregular structures, as in Fowler (2006b).
Figure 1: Two possible network representations of the same profile of votes of representatives. (a) Network of representatives. Nodes denote representatives and two nodes are not connected if they do not agree on any issue. The thickness of edges indicates weights. (b) Network of votes. The nodes represent all possible vote combinations, e.g., 100 represents the approval of only the first bill.
We next turn to the case of mass polarization. Our example is concerned with an electorate choosing among candidates for an office (or individuals expressing preferences over policy alternatives such as remain, soft-Brexit, and hard-Brexit). Let there be a set of alternatives $X = \{x_1, \dots, x_m\}$ and each individual $i \in \{1, \dots, n\}$ be endowed with a preference $P_i \subseteq X \times X$ that is a linear order, i.e., a complete, antisymmetric, and transitive binary relation on $X$. Let $L$ denote the set of all preferences over $X$ and $L^n$ be the set of profiles of preferences.
Network of preferences, $(g'', \pi'')$.
The set of nodes is $L = (p_1, \dots, p_{m!})$. Two nodes $p_i$ and $p_j$ are connected with $g''_{ij} = 1$ whenever $p_i$ can be obtained from $p_j$ by switching only one binary preference, i.e., the Kemeny distance between $p_i$ and $p_j$ is 1 (Kemeny, 1959). We denote with $\pi''_i$ the number of individuals with preference $p_i$, and with $\pi''$ the corresponding distribution. See Cervone et al. (2012) for a study on preference networks.
For an illustration, let $\{a, b, c\}$ be the set of alternatives and consider the preference profile with 11 individuals represented by (2), where each column lists a ranking from top to bottom under the number of individuals holding it.
$$\begin{array}{cccc} 2 & 3 & 2 & 4 \\ \hline a & b & c & c \\ b & a & a & b \\ c & c & b & a \end{array} \tag{2}$$
Footnote: A network of preferences can be represented as a special network of votes, in which each bill represents a pairwise comparison of alternatives and transitivity is imposed.
Footnote: Often without explicitly using the language of networks, graph theoretical representations of preferences are studied widely in the social choice literature. There is also a growing interest in measuring polarization in preference profiles, as in Can et al. (2015, 2017). Note that network $(g'', \pi'')$ could alternatively be defined using a weighted metric as in Can (2014).
Figure 2: A distribution over a preference network with 3 alternatives and 11 individuals.
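The construction of the network of preferences can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the paper: the function names `preference_network` and `geodesic_distances` are hypothetical, and rankings are encoded as strings such as "abc" (best alternative first). Two rankings are linked when one is obtained from the other by swapping two adjacent alternatives, which is the single binary-preference switch described above; since every link has weight 1, breadth-first search gives the geodesic distance.

```python
from collections import deque
from itertools import permutations


def preference_network(alternatives):
    """Nodes are linear orders; links join orders one adjacent swap apart."""
    nodes = ["".join(p) for p in permutations(alternatives)]
    edges = {v: set() for v in nodes}
    for v in nodes:
        for k in range(len(v) - 1):
            # Swap the alternatives in positions k and k + 1.
            w = v[:k] + v[k + 1] + v[k] + v[k + 2:]
            edges[v].add(w)
    return edges


def geodesic_distances(edges, source):
    """Breadth-first search; valid here because every link has weight 1."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in edges[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


network = preference_network("abc")
# Node weights taken from profile (2): 2, 3, 2 and 4 individuals.
profile = {"abc": 2, "bac": 3, "cab": 2, "cba": 4}
dist_from_abc = geodesic_distances(network, "abc")
```

With three alternatives the six rankings form a cycle, so opposite rankings such as abc and cba are three links apart, which agrees with their Kemeny distance.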
While we paid close attention to examples of networks from the political domain, our approach can naturally be applied in a much wider range of applications, not necessarily confined to those that are commonly studied using networks. For instance, our setting can be adopted to study multidimensional polarization in any distribution with a discrete support. To see how, take the example of polarization in a society with respect to income and education (both measured on some discrete, increasing scale). The set of all pairs of income ($\iota$) and education ($\epsilon$) levels defines the set of nodes in the network. Two nodes $x = (x_\iota, x_\epsilon)$ and $y = (y_\iota, y_\epsilon)$ are connected, with link $xy$ of weight $s$ ($g_{xy} = s$), if, for instance, $|x_\iota - y_\iota| + |x_\epsilon - y_\epsilon| = s$, that is, if the Manhattan distance between $x$ and $y$ is equal to $s$.
Other potential applications include conflicts between groups (Esteban and Ray, 1999, 2011), private provision of public goods (Bramoullé and Kranton, 2007), research output and citation networks (Leskovec et al., 2005), friendship networks (Calvó-Armengol et al., 2009), and trust networks (Richardson et al., 2003).
To recall, our objective in this paper is two-fold. First, we propose network theory as a unifying formalism to study polarization without any constraint on dimensionality. Second, we present a theoretical foundation for a family of polarization measures in this setting. For the latter, we closely follow the axiomatic approach in ER, who envisage polarization as the aggregate antagonism in a population, based on the identification and alienation among individuals.
First, as in ER, we require polarization measures to satisfy the following property that ensures invariance of the measure with respect to the size of the population $\sum_{i \in N(g)} \pi_i$. Thus, in fact, $\pi$ may also represent a probability mass function.
Assumption 1 (Homotheticity) $P(g, \pi) \ge P(g', \pi') \implies P(g, \lambda\pi) \ge P(g', \lambda\pi')$ for all $(g, \pi), (g', \pi') \in \mathcal{N}$ and $\lambda > 0$.
The antagonism between individuals depends on how they identify themselves and how alienated they feel from others. In the network setup we propose, individuals in a population are identified only with their definitive attributes, which are represented as nodes in the network. As emphasized before, these attributes are by no means restricted to singletons or a uni-dimensional space.
The effect of the feeling of identification of each individual on her antagonism towards another is measured in relation to the presence of others that share the same attributes, hence are in the same node. This effect is the basis of the intra-group homogeneity, and we denote it with $I(\pi_i)$. Thus, when the nodes represent individuals, each individual feels the same level of identification, whereas when nodes represent groups of individuals, the identification an individual feels is a function of the size of its node ($I(\pi_i)$). The only assumption we make on the identification function $I : \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge 0}$ is that $I(\pi_i) > 0$ whenever $\pi_i > 0$.
Footnote: This implies that two groups (nodes) of the same size exhibit the same level of identification. While potentially restrictive, this is standard in the identification-alienation framework (Esteban and Ray, 1994, 2012).
The alienation component is a function of the distance between individuals, $a(d(i, j))$. We assume that the alienation function $a : \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge 0}$ is a continuous and nondecreasing function with $a(0) = 0$.
Finally, the effective antagonism of group $i$ towards group $j$ is measured by a continuous and strictly increasing function $T(I_i, a_{ij})$ of the identification of group $i$, $I_i = I(\pi_i)$, and the alienation between groups $i$ and $j$, $a_{ij} = a(d(i, j))$, satisfying $T(I_i, 0) = 0$. As in ER, we consider polarization measures $P : \mathcal{N} \to \mathbb{R}_{\ge 0}$ defined as the sum of effective antagonisms:
$$P(g, \pi) = \sum_{i=1}^{n} \sum_{j=1}^{n} \pi_i \pi_j \, T\big(I(\pi_i), a(d_g(i, j))\big). \tag{3}$$
As we shall see, our axioms will pin down a specific functional form for $T\big(I(\pi_i), a(d_g(i, j))\big)$.
Our goal is to follow the axiomatization in ER as closely as possible, and modify it only when the network setting requires. As it turns out, the first two axioms can be restated with only slight changes in the nomenclature. Axiom 3 needs an important adjustment.
Axiom 1
Data: Network $(g, \pi)$ with $n \ge$ […] nodes such that $\pi_x > \pi_y = \pi_z > 0$ and $\pi_i = 0$ $\forall i \in N(g) \setminus \{x, y, z\}$. Furthermore, $d_g(x, y) \le d_g(x, z)$.
Statement: Fix $\pi_x$ and $d_g(x, y)$. There exist $\epsilon > 0$ and $\mu = \mu(\pi_x, d_g(x, y)) > 0$ such that $d_g(y, z) < \epsilon$ and $\pi_y < \mu\pi_x$ imply that for any $(g', \pi') \in \mathcal{N}$ with $n \ge$ […] nodes such that $\pi'_{x'} = \pi_x$, $\pi'_{w'} = \pi_y + \pi_z$, $d_{g'}(x', w') = \frac{1}{2}\big(d_g(x, y) + d_g(x, z)\big)$ and $\pi'_{i'} = 0$, $i' \in N(g') \setminus \{x', w'\}$, we have $P(g', \pi') > P(g, \pi)$.
Axiom 1 captures the situations where two small groups join while keeping the (average) distance the same.
Figure 3: Axiom 1. (a) The move shown by arrows increases polarization. (b) Axiom 1 applies when the new node is further away from the two small nodes as well.
Suppose in $(g, \pi)$ there is a node with a large group and there are two other smaller and equal-sized groups that are close to each other but further away from the larger group. Then network $(g', \pi')$, in which the smaller groups are joined at a node which is located in $g'$ at a distance equal to their average distance (in $g$) to the large group, is more polarized. Figure 3 illustrates such moves. Note that the distance of the fourth node to the smaller nodes is not restricted in the axiom, allowing for moves such as the one depicted in panel (b) of Figure 3.
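The move described above can be checked numerically. The sketch below is only illustrative and rests on an assumption that is not part of the axiom itself: it picks one member of family (3), namely $I(\pi) = \pi^{\alpha}$, $a(d) = d$ and $T(I, a) = I \cdot a$, so that each ordered pair $(i, j)$ contributes $\pi_i^{1+\alpha} \pi_j \, d(i, j)$. The group sizes and distances are made-up numbers chosen to respect the triangle inequality, and the function name `polarization` is hypothetical.

```python
def polarization(sizes, dist, alpha=1.0):
    """Sum of pi_i^(1+alpha) * pi_j * d(i, j) over ordered pairs (assumed form)."""
    total = 0.0
    for i, pi_i in sizes.items():
        for j, pi_j in sizes.items():
            if i != j:  # d(i, i) = 0, so same-node pairs contribute nothing
                total += pi_i ** (1 + alpha) * pi_j * dist[frozenset((i, j))]
    return total


# Before: a large group x and two small, equal-sized groups y and z
# that are close to each other but far from x.
sizes_before = {"x": 10.0, "y": 1.0, "z": 1.0}
dist_before = {
    frozenset(("x", "y")): 4.0,
    frozenset(("x", "z")): 4.4,
    frozenset(("y", "z")): 0.5,
}

# After: y and z are merged at node w, placed at their average distance from x.
sizes_after = {"x": 10.0, "w": 2.0}
dist_after = {frozenset(("x", "w")): (4.0 + 4.4) / 2}

p_before = polarization(sizes_before, dist_before)
p_after = polarization(sizes_after, dist_after)
```

In this example the merged network is the more polarized one for each of the values of $\alpha$ tried in the check (0.5, 1 and 1.5), in line with what Axiom 1 demands of a measure.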
Axiom 2
Data: Network $(g, \pi)$ with $n \ge$ […] nodes such that $\pi_x > \pi_z > 0$, $\pi_y > 0$, and $\pi_i = 0$, $\forall i \in N(g) \setminus \{x, y, z\}$. Furthermore, $d_g(x, z) > d_g(x, y) > d_g(y, z)$.
Statement: There exists $\epsilon > 0$ such that for any network $(g', \pi')$ with $(\pi'_{x'}, \pi'_{y'}, \pi'_{z'}) = (\pi_x, \pi_y, \pi_z)$, and $\pi'_{i'} = 0$, $i' \in N(g') \setminus \{x', y', z'\}$ such that $d_g(x, z) = d_{g'}(x', z')$, […]
Figure 4: Axiom 2.
Note that the described move makes the middle group closer to the smaller group, but its new location does not have to be close to its original position, as seen in panel (b) of Figure 4. This kind of move is not possible on the real line.
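A move of this kind can also be illustrated numerically. The sketch below makes two assumptions that should be read as such: it uses the same illustrative functional form as a stand-in for family (3), with each ordered pair contributing $\pi_i^{1+\alpha} \pi_j \, d(i, j)$, and it takes the move to raise $d(x, y)$ by exactly the amount $\delta$ by which it lowers $d(y, z)$ while $d(x, z)$ stays fixed. The numbers and the helper names `polarization` and `closed_form_gain` are hypothetical.

```python
def polarization(sizes, dist, alpha=1.0):
    """Sum of pi_i^(1+alpha) * pi_j * d(i, j) over ordered pairs (assumed form)."""
    total = 0.0
    for i in sizes:
        for j in sizes:
            if i != j:
                total += sizes[i] ** (1 + alpha) * sizes[j] * dist[frozenset((i, j))]
    return total


def closed_form_gain(p, q, r, delta, alpha=1.0):
    """Change in polarization when only the distances involving y move by delta."""
    return delta * (q ** (1 + alpha) * (p - r)
                    + q * (p ** (1 + alpha) - r ** (1 + alpha)))


# x is the larger extreme group, z the smaller one, y the middle group.
sizes = {"x": 5.0, "y": 2.0, "z": 3.0}
before = {frozenset("xy"): 6.0, frozenset("yz"): 5.0, frozenset("xz"): 10.0}

delta = 0.5  # y moves towards z: d(x, y) grows and d(y, z) shrinks by delta
after = {
    frozenset("xy"): 6.0 + delta,
    frozenset("yz"): 5.0 - delta,
    frozenset("xz"): 10.0,
}

gain = polarization(sizes, after) - polarization(sizes, before)
```

Under these assumptions the gain is positive whenever the group at $x$ is larger than the group at $z$, because both terms of the closed form are then positive.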
Axiom 3
Data: Network $(g, \pi)$ with $n \ge$ […] nodes such that $\pi_x > 0$, $\pi_y = \pi_z > 0$ and $\pi_i = 0$ $\forall i \in N(g) \setminus \{x, y, z\}$. Furthermore, $d_g(x, y) = d_g(x, z) = d > 0$.
Statement: For any $\Delta \in (0, \pi_x/2]$ and any network $(g', \pi')$ with $(\pi'_{x'}, \pi'_{y'}, \pi'_{z'}) = (\pi_x - 2\Delta, \pi_y + \Delta, \pi_z + \Delta)$, and $\pi'_{i'} = 0$, $i' \in N(g') \setminus \{x', y', z'\}$ such that $d_{g'}(x', y') = d_{g'}(x', z') = d$ and $d_g(y, z) = d_{g'}(y', z')$, we have $P(g', \pi') > P(g, \pi)$ whenever $d_g(y, z) = cd$, for any $c > 1$.
Axiom 3 states that as long as the distance between the two lateral groups is greater than the distance between the "middle group" and a lateral group, a network in which individuals from the group in the middle are reallocated to extreme points will exhibit higher polarization. Note that the relative size of the group in node $x$ is not restricted. Furthermore, in a network, $d_g(x, y) = d_g(x, z) = d$ implies only that $d_g(y, z) \le 2d$ (by the triangle inequality), whereas on the real line $y \neq z$ and $|x - y| = |z - x| = d$ imply that $|z - y| = 2d$. We will come back to this crucial point in Section 4.
We are now ready to state our central result, which identifies the measures of polarization in networks that satisfy Axioms 1–3.
Theorem 1
A polarization measure $P$ of the family defined in (3) satisfies Axioms 1–3 and homotheticity if and only if
$$P(g, \pi) = K \sum_{i \in N(g)} \sum_{j \in N(g)} \pi_i^{2} \pi_j \, d_g(i, j), \tag{4}$$
for some constant $K > 0$.
Footnote: Axiom 2 is rather weak as it applies to only those (small) moves such that an increase in distance from one extreme is equal to a decrease in the distance to the other extreme.
Figure 5: Axiom 3 dictates that the dissolution of the middle group into two extreme nodes increases polarization.
Proof.
Sufficiency. Without loss of generality set $K = 1$. We prove that Axiom 1 and Axiom 2 are satisfied for
$$P_\alpha(g, \pi) = K \sum_{i \in N(g)} \sum_{j \in N(g)} \pi_i^{1+\alpha} \pi_j \, d_g(i, j), \tag{5}$$
whenever $\alpha > 0$. Clearly, (5) becomes (4) when $\alpha = 1$. Establishing this claim is important also for the proof of Theorem 2.
Axiom 1. Let $\pi_x = p$ and $\pi_y = \pi_z = q$. Using $d_{g'}(x', w') = \frac{d_g(x, y) + d_g(x, z)}{2}$ we get that
$$P_\alpha(g, \pi) = p^{1+\alpha} q \, d_g(x, y) + p^{1+\alpha} q \, d_g(x, z) + 2 q^{1+\alpha} q \, d_g(y, z) + q^{1+\alpha} p \, d_g(x, y) + q^{1+\alpha} p \, d_g(x, z),$$
while
$$P_\alpha(g', \pi') = p^{1+\alpha} (2q) \frac{d_g(x, y) + d_g(x, z)}{2} + (2q)^{1+\alpha} p \, \frac{d_g(x, y) + d_g(x, z)}{2}.$$
After simplification we get:
$$P_\alpha(g, \pi) = \big(d_g(x, y) + d_g(x, z)\big)\big(p^{1+\alpha} q + q^{1+\alpha} p\big) + 2 q^{2+\alpha} d_g(y, z),$$
and
$$P_\alpha(g', \pi') = \big(d_g(x, y) + d_g(x, z)\big)\big(p^{1+\alpha} q + q^{1+\alpha} p\big) + (2^{\alpha} - 1)\big(d_g(x, y) + d_g(x, z)\big) q^{1+\alpha} p,$$
which implies $P_\alpha(g', \pi') > P_\alpha(g, \pi)$ whenever
$$(2^{\alpha} - 1)\big(d_g(x, y) + d_g(x, z)\big) p > 2 q \, d_g(y, z).$$
When $d_g(y, z)$ is small enough ($d_g(y, z) < \epsilon$) this inequality will hold for any $\alpha > 0$ and $q$ small enough relative to $p$ ($q < \mu p$), as required by Axiom 1.
Axiom 2. Let $\pi_x = p$, $\pi_y = q$, and $\pi_z = r$. Since $d_g(x, z) = d_{g'}(x', z')$, only the terms involving the middle group change, and subtracting we get:
$$P_\alpha(g', \pi') - P_\alpha(g, \pi) = q^{1+\alpha}\big[p\big(d_{g'}(x', y') - d_g(x, y)\big) + r\big(d_{g'}(z', y') - d_g(y, z)\big)\big] + q\big[p^{1+\alpha}\big(d_{g'}(x', y') - d_g(x, y)\big) + r^{1+\alpha}\big(d_{g'}(z', y') - d_g(y, z)\big)\big],$$
which is positive for any $\alpha > 0$ and $r < p$, since $d_{g'}(x', y') - d_g(x, y) = d_g(y, z) - d_{g'}(z', y') > 0$, and therefore $P_\alpha$ satisfies Axiom 2.
Axiom 3. We now show that $P$ satisfies Axiom 3. To this end let $d_g(x, y) = d_g(x, z) = d$, and let $d_g(y, z) = cd$ with $c > 1$. Furthermore, let $\pi_x = p + 2\Delta$ and $\pi_y = \pi_z = q - \Delta$. We can write:
$$P_\alpha((g, \pi); \Delta) = 2cd\,(q - \Delta)^{2+\alpha} + 2d\,(p + 2\Delta)(q - \Delta)\big((p + 2\Delta)^{\alpha} + (q - \Delta)^{\alpha}\big). \tag{6}$$
To prove that $P$ satisfies Axiom 3 it is sufficient to show that
$$\frac{\partial P_\alpha((g, \pi); \Delta)}{\partial \Delta}\bigg|_{\Delta = 0, \alpha = 1} < 0$$
for all $(p, q) \gg 0$, except for at most one ratio $p/q$. Differentiating (6) at $\Delta = 0$ and dividing by $2d$ ($> 0$) we get:
$$\frac{\partial P_\alpha}{\partial \Delta}\bigg|_{\Delta = 0} < 0 \iff -p^{\alpha}\big(p - 2(1 + \alpha) q\big) + q^{\alpha}\big(-(1 + \alpha) p + 2q - (2 + \alpha) c q\big) < 0.$$
Dividing by $p^{1+\alpha} > 0$ and denoting $z = q/p$ we get:
$$\frac{\partial P_\alpha}{\partial \Delta}\bigg|_{\Delta = 0} < 0 \iff f(z, \alpha, c) < 0,$$
where $f : \mathbb{R}^2_{\ge 0} \times [1, 2] \to \mathbb{R}$ is defined with:
$$f(z, \alpha, c) = (1 + \alpha)\left[2z - z^{\alpha} + \frac{z^{1+\alpha}\big(2 - c(2 + \alpha)\big)}{1 + \alpha}\right] - 1. \tag{7}$$
One can easily verify that $f(z, 1, c) < 0$ for any $c \in [1, 2]$, except at $z = 1$ when $c = 1$: indeed, $f(z, 1, c) = -(3c - 2) z^2 + 2z - 1$ is a quadratic function in $z$ with negative leading coefficient and discriminant $12(1 - c)$, which is negative for $c > 1$ and zero for $c = 1$. Therefore $P$ satisfies Axiom 3 as well.
Necessity.
The proof is analogous to the proof of Theorem 1 in ER. We describe it briefly, and refer the reader to ER for the detailed derivation. Axioms 1–2 imply that function $T$ is linear in its second argument, thus $\theta(\pi,\delta) \equiv T(I(\pi), a(d_g(i,j)))$ can be written as $\theta(\pi,\delta) = \phi(\pi)\delta$. Furthermore, Axiom 1 implies that $\phi(\cdot)$ is an increasing function. Homotheticity implies that $\phi(\pi) = K\pi^{\alpha}$ for some constants $(K,\alpha) \gg 0$. Finally, Axiom 3 implies $f(z,\alpha,1) \le 0$ for all $z$. As seen in the first part of the proof, this holds for $\alpha = 1$. To see that it does not hold for any other value $\alpha > 0$, first note that $f(1,\alpha,1) = 0$. Furthermore, function $f(z,\alpha,1)$ is increasing in $z$ at $z = 1$ when $\alpha < 1$ and decreasing in $z$ at $z = 1$ when $\alpha > 1$. By continuity, there exist $\epsilon_1 > 0$ and $\epsilon_2 > 0$ such that $f(1+\epsilon_1,\alpha,1) > 0$ when $\alpha < 1$, and $f(1-\epsilon_2,\alpha,1) > 0$ when $\alpha > 1$.

(Footnote: See Kawada et al. (2018) for a solution to a technical problem arising from the original formulation of Axiom 1 in ER.)

For comparison, the family of measures characterized by ER is
$$P^{ER}(\pi) = K\sum_{i=1}^{n}\sum_{j=1}^{n} \pi_i^{1+\alpha}\pi_j\,|i-j|, \tag{8}$$
with $K > 0$ and $\alpha \in (0,\alpha^*]$, with $\alpha^* \simeq 1.6$. The main difference between (4) and (8) is that the index in (4) implies $\alpha = 1$. The reason for this difference lies in the nature of the distances, discussed in relation with Axiom 3. It requires that a move from a middle mass ($\pi_x$) to the lateral points ($\pi_y$ and $\pi_z$) equidistant from the middle increases polarization whenever they are individually further away from each other than they are to the midpoint. Contrary to the real line, in $(g,\pi)$, $d_g(y,z)$ is not determined by $d_g(x,y) = d_g(x,z)$, and in fact it can very well happen that $d_g(y,z) < d_g(x,y)$ even when $d_g(x,y) = d_g(x,z)$. We revisit this important matter in Section 4.3 below. Note that Axioms 1 and 2 also require adaptation for the network setup, but these adaptations are minor and do not have important implications for the form of the characterized family of measures.

Intuitively, a society is polarized if it can be grouped into a small number of homogeneous groups of similar sizes that are very different from each other, and polarization is often conceptualized to capture the level of bipolarity (or bimodality). Thus, it is desirable that a polarization measure is maximized at a bipolar distribution. A bipolar network is one where the population is split equally into two extreme (most distant) nodes. The maximal distance between two nodes in graph $g$ is called the diameter of $g$ and is denoted by $d(g)$. For any graph $g$ let $\pi_B(g)$ denote the distribution in which the population is split equally across two nodes at distance $d(g)$. Our next result shows that $(g,\pi_B(g))$ is more polarized than any other network $(g,\pi)$ under any measure within our characterization.

Proposition 1
$P(g,\pi_B(g)) > P(g,\pi)$ for any $(g,\pi)$ with $\pi \ne \pi_B(g)$ and any measure $P$ defined in (4).

Proof.
We first prove that for any network $(g,\pi)$ such that $\pi$ has at least 4 nonzero mass points, there exists a 3-node network $(g^*,\pi^*)$ with $g^*_{ij} = d(g)$ for $i,j \in N(g^*)$, and $\sum_{i=1}^{3}\pi^*_i = \sum_{i \in N(g)}\pi_i$ such that $P(g,\pi) < P(g^*,\pi^*)$. The proof is constructive. Assume, without loss of generality, that in $(g,\pi)$ we have $\pi_1 \ge \pi_2 \ge \cdots \ge \pi_n$ with $\pi_k > \pi_{k+1} = 0$ for some $k \ge 4$. Fixing $K = 1$ in (4) (without loss of generality), we have
$$P(g,\pi) = \sum_{i=1}^{k}\sum_{j=1}^{k}\pi_i^{2}\pi_j\, d_g(i,j) \le d(g)\Bigg[\sum_{i=1}^{k-2}\sum_{\substack{j=1\\ j\ne i}}^{k-2}\pi_i^{2}\pi_j + \pi_{k-1}^{2}\sum_{\substack{j=1\\ j\ne k-1}}^{k}\pi_j + \pi_k^{2}\sum_{\substack{j=1\\ j\ne k}}^{k}\pi_j + \pi_{k-1}\sum_{j=1}^{k-2}\pi_j^{2} + \pi_k\sum_{j=1}^{k-2}\pi_j^{2}\Bigg]. \tag{9}$$
Denote the right hand side expression in (9) with $P(g',\pi')$, where $g'_{ij} = d(g)$ for all $i,j \in N(g)$ and $\pi'_i = \pi_i$ for all $i \in N(g)$. Consider now a change in $(g',\pi')$ such that the masses in nodes $k$ and $k-1$ are joined, resulting in $(g'',\pi'')$. Simple algebra gives:
$$P(g'',\pi'') = d(g)\Bigg[\sum_{i=1}^{k-2}\sum_{\substack{j=1\\ j\ne i}}^{k-2}\pi_i^{2}\pi_j + (\pi_{k-1}+\pi_k)^{2}\sum_{j=1}^{k-2}\pi_j + (\pi_{k-1}+\pi_k)\sum_{j=1}^{k-2}\pi_j^{2}\Bigg].$$
Subtracting $P(g',\pi')$ we get:
$$P(g'',\pi'') - P(g',\pi') = d(g)\,\pi_{k-1}\pi_k\Bigg[2\sum_{j=1}^{k-2}\pi_j - (\pi_{k-1}+\pi_k)\Bigg] > 0, \tag{10}$$
where the inequality follows from the choice of $k$ and $k-1$ and from $k \ge 4$. Thus, for any network $(g,\pi)$ with $|N(g)| \ge 4$ we have $P(g,\pi) < P(g'',\pi'')$. If $k = 4$, $(g^*,\pi^*) = (g'',\pi'')$. If $k > 4$, the above described procedure of joining masses can be repeated, next for nodes $k-1$ and $k-2$, until only three nodes with positive mass remain.

We now consider three cases for $(g,\pi)$:

(i) $|\{i \in N(g) : \pi_i > 0\}| = 2$. Let $i$ and $j$ be the two nodes with positive mass. Clearly
$$P(g,\pi_B(g)) = \frac{d(g)}{4}(\pi_i+\pi_j)^{3} \ge d(g)\,\pi_i\pi_j(\pi_i+\pi_j) \ge d_g(i,j)\,\pi_i\pi_j(\pi_i+\pi_j) = P(g,\pi),$$
with at least one strict inequality for any $(g,\pi) \ne (g,\pi_B)$.

(ii) $|\{i \in N(g) : \pi_i > 0\}| = 3$. Consider $(g',\pi')$ obtained from $(g,\pi)$ such that $\pi = \pi'$, $N(g) = N(g')$, $E(g) = E(g')$ and $g'_{ij} = d(g)$ for any $i,j \in N(g')$. Clearly $P(g,\pi) \le P(g',\pi')$. Construct $\pi''$ from $\pi'$ by reallocating mass from any point to the other two points as in Axiom 3. We have:
$$P(g,\pi_B(g)) = P(g',\pi_B(g)) \ge P(g',\pi'') > P(g',\pi'),$$
where the first inequality is implied by (i), while the second is a consequence of Axiom 3.

(iii) $|\{i \in N(g) : \pi_i > 0\}| \ge 4$. The claim follows from the first part of the proof and (ii).

(Footnotes: See Foster and Wolfson (2010) for a discussion on bipolarity of income distributions and DiMaggio et al. (1996) for a more general discussion on bimodality, among others. More formally, $d(g) = \max_{i,j \in N(g)} d_g(i,j)$; see Vega-Redondo (2007) or Jackson (2008).)
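Proposition 1 can be sanity-checked numerically on a small example. The sketch below is illustrative only and is not part of the paper's argument: the helper names (`geodesic_distances`, `polarization`), the four-node path graph, and the grid of candidate distributions are all assumptions introduced here. It evaluates the measure with $\alpha = 1$ and $K = 1$ and compares the bipolar distribution against every distribution on a coarse grid of the simplex.

```python
from itertools import product


def geodesic_distances(n, edges):
    """All-pairs shortest-path distances (Floyd-Warshall).

    edges: iterable of (i, j, weight) for an undirected, connected graph.
    """
    inf = float("inf")
    d = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for i, j, w in edges:
        d[i][j] = d[j][i] = min(d[i][j], float(w))
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def polarization(dist, pi, alpha=1.0, K=1.0):
    """K * sum_i sum_j pi_i^(1+alpha) * pi_j * d_g(i, j)."""
    n = len(pi)
    return K * sum(
        pi[i] ** (1 + alpha) * pi[j] * dist[i][j]
        for i in range(n)
        for j in range(n)
    )


# Hypothetical example: path 0 - 1 - 2 - 3 with unit link weights (diameter 3).
dist = geodesic_distances(4, [(0, 1, 1), (1, 2, 1), (2, 3, 1)])

# Bipolar distribution: half of the population on each of the two most distant nodes.
pi_bipolar = (0.5, 0.0, 0.0, 0.5)
p_bipolar = polarization(dist, pi_bipolar)

# Brute-force comparison set: all distributions on a grid with step 0.1.
steps = 10
grid = [
    tuple(c / steps for c in combo)
    for combo in product(range(steps + 1), repeat=4)
    if sum(combo) == steps
]
best = max(grid, key=lambda pi: polarization(dist, pi))

print(p_bipolar, best)
```

A grid search is of course no substitute for the proof; it only confirms that, on this example, no grid point beats the bipolar distribution.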
Discussion
In this section, we first discuss some important properties of the measures we characterize in relation to the structure of networks. Then we show how our work is related to previous papers in the literature. We conclude this section with a discussion on how the weakening of Axiom 3 can relate our characterization to the one in ER, by exactly describing the relationship between the importance of identification ($\alpha$) and the network structure.

We first want to emphasize that the structure of a graph $g$ determines the distance between any two nodes in $N(g)$. A change in the structure of a graph $g$, e.g., deleting a link, may affect the measured levels of polarization, even if $\pi$ stays the same. Although empty (zero-weight) nodes do not directly contribute to the level of polarization, they may be important "indirectly" if, for instance, they are located on the shortest path between some non-empty nodes. Figure 6 illustrates this point.

Figure 6 (panels: (a) $(g,\pi)$, (b) $(g',\pi)$, (c) $(g'',\pi'')$): Three networks where each link has weight 1 and each node except node 3 has weight 1 ($\pi_3 = 0$). $(g',\pi)$ is obtained from $(g,\pi)$ by deleting a link. $(g'',\pi'')$ is obtained from $(g,\pi)$ by deleting node 3. Note that $\pi_i = \pi''_i$ for all $i \in N(g'')$. We have $P(g,\pi) < P(g',\pi) = P(g'',\pi'')$.

Next, we want to note that, given Proposition 1, $d(g) > d(g')$ implies $P(g,\pi_B(g)) > P(g',\pi_B(g'))$. That is, comparing two bipolar networks, the larger the diameter, the higher the polarization.

Finally, in the special case when $\pi = \mathbf{1}$, $P(g,\pi)$ is proportional to the average shortest path in the graph $g$. Thus, the closer the individuals are, on average, the less polarized the network is.
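Two of the structural remarks above lend themselves to a quick numerical illustration: deleting a link can change measured polarization while $\pi$ is unchanged, and with unit node weights $P$ is proportional to the average shortest path. The sketch below uses a hypothetical four-node cycle (not the networks of Figure 6) with $\alpha = 1$ and $K = 1$; all function and variable names are assumptions made for this example.

```python
def geodesic_distances(n, edges):
    """All-pairs shortest-path distances (Floyd-Warshall) on an undirected graph."""
    inf = float("inf")
    d = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for i, j, w in edges:
        d[i][j] = d[j][i] = min(d[i][j], float(w))
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def polarization(dist, pi, alpha=1.0, K=1.0):
    """K * sum_i sum_j pi_i^(1+alpha) * pi_j * d_g(i, j)."""
    n = len(pi)
    return K * sum(
        pi[i] ** (1 + alpha) * pi[j] * dist[i][j]
        for i in range(n)
        for j in range(n)
    )


def average_shortest_path(dist):
    """Mean geodesic distance over ordered pairs of distinct nodes."""
    n = len(dist)
    total = sum(dist[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))


n = 4
ones = [1.0] * n  # every node has weight 1

# Hypothetical networks: a 4-cycle, and the path left after deleting link {3, 0}.
cycle_edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1)]
path_edges = cycle_edges[:-1]

d_cycle = geodesic_distances(n, cycle_edges)
d_path = geodesic_distances(n, path_edges)

p_cycle = polarization(d_cycle, ones)
p_path = polarization(d_path, ones)

# With pi = 1 the ratio P / (average shortest path) equals n * (n - 1) in both cases.
ratio_cycle = p_cycle / average_shortest_path(d_cycle)
ratio_path = p_path / average_shortest_path(d_path)

print(p_cycle, p_path, ratio_cycle, ratio_path)
```

Deleting the link lengthens some geodesics, so polarization rises from the cycle to the path even though the node weights are identical.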
(Footnote: The average shortest path in a network is closely related to the "closeness" measure (Vega-Redondo, 2007; Jackson, 2008).)

We argue that the settings considered in ER and MRQ are special cases of our setting, and hence our results can be seen as generalizations of theirs. To start with, recall that ER consider distributions on the real line. Let $\pi$ be a distribution with a set of $N$ mass points. Consider graph $g$ with $N$ nodes such that $g_{ij} = |i-j|$ for any two adjacent mass points $i$ and $j$ on the real line, and $ij \notin g$ otherwise. Indeed, we can represent any distribution on an $m$-dimensional space with a finite number of mass points as a network by simply setting $g_{ij} = \|i-j\|$, where $\|\cdot\|$ can be any norm.

In the setting considered in MRQ the distance between any two different groups equals 1. It is immediate to note that this setting can be described by the network $(g,\pi)$ where $g$ is the complete graph such that $g_{ij} = 1$ for any pair of nodes $i,j \in N(g)$. MRQ propose a different set of axioms.

Axiom 3 requires that the described change in $(g,\pi)$ leads to an increase in polarization only when the distance between lateral nodes is at least as large as the distance between the center node and lateral nodes. We now discuss less demanding versions of Axiom 3, labeled systematically as Axiom 3($c$), in which we require that the scenario in Axiom 3 leads to an increase in polarization only if the lateral nodes are "far enough" (quantified by the scalar $c$) from each other. This is of interest also because some settings imply a specific network structure in which there is a clear lower bound for the distance between two lateral nodes contemplated in Axiom 3. For instance, as we saw before, any discrete distribution on the real line can be represented with a line network. On any line network, the distance between lateral nodes is double the distance between the middle node and a lateral node, as it is on the real line.
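The two embeddings described above can be sketched in a few lines. The example below is a hedged illustration with made-up mass points and weights: it builds the line network for three points on the real line, checks that geodesic distances reproduce $|i-j|$ and that the lateral nodes are twice as far apart as each is from the middle node, and then evaluates the MRQ-style complete graph with unit distances. The helper names and the closed-form comparison $\sum_i \pi_i^2(1-\pi_i)$ (valid when $\sum_i \pi_i = 1$, $\alpha = 1$, $K = 1$) are assumptions introduced here.

```python
def geodesic_distances(n, edges):
    """All-pairs shortest-path distances (Floyd-Warshall) on an undirected graph."""
    inf = float("inf")
    d = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for i, j, w in edges:
        d[i][j] = d[j][i] = min(d[i][j], float(w))
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def polarization(dist, pi, alpha=1.0, K=1.0):
    """K * sum_i sum_j pi_i^(1+alpha) * pi_j * d_g(i, j)."""
    n = len(pi)
    return K * sum(
        pi[i] ** (1 + alpha) * pi[j] * dist[i][j]
        for i in range(n)
        for j in range(n)
    )


# Hypothetical mass points on the real line (in increasing order) and their masses.
points = [-1.0, 0.0, 1.0]
pi = [0.25, 0.5, 0.25]
n = len(points)

# Line network: only adjacent mass points are linked; link weight = gap between them.
line_edges = [(i, i + 1, points[i + 1] - points[i]) for i in range(n - 1)]
d_line = geodesic_distances(n, line_edges)

# Geodesic distances on the line network coincide with |x_i - x_j|.
matches_real_line = all(
    abs(d_line[i][j] - abs(points[i] - points[j])) < 1e-12
    for i in range(n)
    for j in range(n)
)

# Lateral nodes (0 and 2) versus middle-to-lateral distance: the ratio is c = 2.
c_ratio = d_line[0][2] / d_line[0][1]

# MRQ-style setting: complete graph in which every pair of groups is at distance 1.
complete_edges = [(i, j, 1.0) for i in range(n) for j in range(i + 1, n)]
d_complete = geodesic_distances(n, complete_edges)
p_complete = polarization(d_complete, pi)
closed_form = sum(p ** 2 * (1 - p) for p in pi)

print(matches_real_line, c_ratio, p_complete, closed_form)
```

The ratio of 2 is what makes line networks fall under the $c = 2$ case discussed next.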
Axiom 3($c$)
Data: Network $(g,\pi)$ with $n \ge 3$ nodes, $\pi_x > 0$, $\pi_y = \pi_z > 0$ and $\pi_i = 0$ for all $i \in N(g)\setminus\{x,y,z\}$. Furthermore, $d_g(x,y) = d_g(x,z) = d > 0$.
Statement: Fix $c \in (1,2]$. For any $\epsilon \in (0,\pi_x]$ and any network $(g',\pi')$ with $(\pi'_{x'},\pi'_{y'},\pi'_{z'}) = (\pi_x - \epsilon,\ \pi_y + \epsilon/2,\ \pi_z + \epsilon/2)$, and $\pi'_{i'} = 0$, $i' \in N(g')\setminus\{x',y',z'\}$, such that $d_{g'}(x',y') = d_{g'}(x',z') = d$ and $d_g(y,z) = d_{g'}(y',z')$, we have $P(g',\pi') > P(g,\pi)$ whenever $d_g(y,z) \ge dc$.

(Footnotes: The line-network construction is not the unique way to represent a discrete distribution with $n$ mass points as a network. However, any consistent representation that relies on the same metric will lead to a network with the same polarization. The logical dependence between our axioms and the ones in MRQ is an interesting question that is left for future research.)

Figure 7: Axiom 3($c$) requires that the move shown by the arrows should increase polarization if $\bar{c} \ge c$.

When $c = 1$, we have the same statement as in Axiom 3, while for $c = 2$ we have essentially Axiom 3 in ER. The particular value of $c$ has important implications for the resulting measure of polarization, as stated in Theorem 2.

Theorem 2
Fix $c \in (1,2]$. There exists an interval $[\underline{\alpha}(c),\bar{\alpha}(c)] \subseteq (0,\alpha^*]$ with $\alpha^* \simeq 1.6$ such that the polarization measure $P$ of the family defined in (3) satisfies Axioms 1, 2, and 3($c$) and homotheticity if and only if
$$P_\alpha(g,\pi) = K\sum_{i \in N(g)}\sum_{j \in N(g)} \pi_i^{1+\alpha}\pi_j\, d_g(i,j) \tag{11}$$
for some constant $K > 0$ whenever $\alpha \in [\underline{\alpha}(c),\bar{\alpha}(c)]$. Furthermore, $c_1 > c_2 \implies [\underline{\alpha}(c_2),\bar{\alpha}(c_2)] \subset [\underline{\alpha}(c_1),\bar{\alpha}(c_1)]$.

Proof. See the proof of Theorem 1 for the proofs of claims regarding Axiom 1 and Axiom 2 (the Sufficiency and the Necessity part). Similarly, Axiom 3($c$) holds iff $\alpha$ is such that $f(z,\alpha,c) < 0$ for all $z$, where $f$ is defined in (7). To conclude the proof, two observations about $v(\alpha,c) = \max_{z \ge 0} f(z,\alpha,c)$ are important. First, $v$ is increasing in $\alpha \in (1,2]$ for any fixed $c \in (1,2]$ and changes sign on the considered interval. Thus, there exists $\bar{\alpha}(c)$ such that $v(\alpha,c) \le 0$ for $\alpha \in (1,\bar{\alpha}(c)]$. Since $v(\alpha,c)$ is decreasing in $c$, $\bar{\alpha}(c)$ is increasing in $c$. Second, for $\alpha < 1$ and any $c$, $v$ decreases in $\alpha$ whenever $v(\alpha,c) \ge 0$, and $v(1,c) < 0$ for $c > 1$. This implies the existence of $\underline{\alpha}(c) \in [0,1)$ such that $v(\alpha,c) \le 0$ for $\alpha \in [\underline{\alpha}(c),1]$. Since $v$ decreases in $c$, we have that $\underline{\alpha}(c)$ decreases in $c$. From these two observations we conclude $2 \ge c_1 > c_2 > 1 \implies [\underline{\alpha}(c_2),\bar{\alpha}(c_2)] \subset [\underline{\alpha}(c_1),\bar{\alpha}(c_1)]$.

(Footnote: See Lemma 1 and 2 in Appendix A for the formal statements and proofs of these two observations.)

Theorem 2 shows that as we make Axiom 3 less demanding, the range of values of parameter $\alpha$ for which our axioms are satisfied expands monotonically. In particular, if we restrict ourselves to line networks, then the network structure implies that any move described in Axiom 3 falls under Axiom 3($c$) for $c = 2$, and Axioms 1, 2 and 3($c$) can be seen as restatements of Axioms 1–3 in ER.

Finally, it should be noted that the claim in Proposition 1 holds only for measures characterized in Theorem 1, and not for any other measure as in (5) with $\alpha \ne 1$. To see this, take any graph $g$ such that $N(g) = \{x,y,z\}$ with $0 < g_{xy} = g_{xz} \le g_{yz}$. Then for any $\alpha \in \mathbb{R}_{\ge 0}\setminus\{1\}$, there exist a distribution $\pi \ne \pi_B(g)$ and $\epsilon > 0$ such that $P_\alpha(g,\pi) > P_\alpha(g,\pi_B(g))$ whenever $g_{yz} = g_{xz} + \epsilon$. This is a direct consequence of the fact that for $\alpha \ne 1$, $P_\alpha$ does not satisfy Axiom 3 when $c$ is arbitrarily close to 1.

We have introduced a model of polarization in networks. This model can be used to study the levels and trends of polarization in a wide range of applications. In Section 2, we discussed several examples from political processes in parliaments and public preferences. As pointed out before, the potential of our proposal is by no means restricted to these examples. To name a few areas beyond the domain of polity, for which a recent survey is provided by Battaglini and Patacchini (2019), Bail (2016) constructs weighted networks between advocacy organizations based on the frequency of words in the shared vocabulary of their posts. Stewart et al. (2018) construct retweet networks to study the impact of suspicious troll activity on the levels of polarization on Twitter (see also Conover et al., 2011). Farrell (2016) constructs a network of organizations based on the activities of affiliates to study polarization on climate change issues among organizations. O'Connor and Weatherall (2018) propose the network formalism to study polarization in scientific communities around beliefs based on scientific knowledge. DiFonzo et al. (2013) employ a network-based approach to capture polarization of rumor beliefs in the context of social impact theory.

Reconstructing the axiomatic analysis of ER, we characterized a family of measures within our model. Importing the axiomatic approach requires careful attention due to the distinct nature of the geodesic distance on networks compared to the Euclidean distance on the real line. Our characterization result shows that the class of measures characterized by ER carries over almost intact to networks. The only bite is in the value of the parameter for the effect of identification on effective antagonism. We find that $\alpha = 1$ is a necessary and sufficient condition for the measures of polarization in the form of aggregate antagonisms to satisfy the aforementioned axioms, together with homotheticity. We demonstrate that polarization is maximized when the population is allocated on the two most distant nodes in the network. Finally, we discuss how restricting to a specific class of network structures may expand the class of polarization measures.

Our model can be further developed along different dimensions. One promising avenue for future research pertains to extending the measures so as to capture the intra-group heterogeneity, which could also be described as a network. In that case, the identification function should additionally depend on the within-group structure.
Another direction for future research concerns the existence of interesting characterizations outside the identification-alienation framework but with the same axioms, as these two are independent.

References

Abramowitz, A. I. and Saunders, K. L. (2008). Is polarization a myth? The Journal of Politics, 70(2):542–555.
Aghion, P., Alesina, A., and Trebbi, F. (2004). Endogenous political institutions. The Quarterly Journal of Economics, 119(2):565–611.
Andris, C., Lee, D., Hamilton, M. J., Martino, M., Gunning, C. E., and Selden, J. A. (2015). The rise of partisanship and super-cooperators in the US House of Representatives. PLoS ONE, 10(4):e0123507.
Bail, C. A. (2016). Combining natural language processing and network analysis to examine how advocacy organizations stimulate conversation on social media. Proceedings of the National Academy of Sciences, 113(42):11823–11828.
Battaglini, M. and Patacchini, E. (2019). Social networks in policy making. Annual Review of Economics.
Boxell, L., Gentzkow, M., and Shapiro, J. M. (2017). Greater internet use is not associated with faster growth in political polarization among US demographic groups. Proceedings of the National Academy of Sciences, 114(40):10612–10617.
Bramoullé, Y. and Kranton, R. (2007). Public goods in networks. Journal of Economic Theory, 135(1):478–494.
Brams, S. J., Kilgour, D. M., and Sanver, M. R. (2007). A minimax procedure for electing committees. Public Choice, 132(3-4):401–420.
Calvó-Armengol, A., Patacchini, E., and Zenou, Y. (2009). Peer effects and social networks in education. The Review of Economic Studies, 76(4):1239–1267.
Can, B. (2014). Weighted distances between preferences. Journal of Mathematical Economics, 51:109–115.
Can, B., Ozkes, A., and Storcken, T. (2015). Measuring polarization in preferences. Mathematical Social Sciences, 78:76–79.
Can, B., Ozkes, A., and Storcken, T. (2017). Generalized measures of polarization in preferences. Technical report, Aix-Marseille School of Economics, France.
Cervone, D. P., Dai, R., Gnoutcheff, D., Lanterman, G., Mackenzie, A., Morse, A., Srivastava, N., and Zwicker, W. S. (2012). Voting with rubber bands, weights, and strings. Mathematical Social Sciences, 64(1):11–27.
Conover, M. D., Ratkiewicz, J., Francisco, M., Gonçalves, B., Menczer, F., and Flammini, A. (2011). Political polarization on Twitter. In Fifth International AAAI Conference on Weblogs and Social Media.
Desmet, K., Ortuño-Ortín, I., and Wacziarg, R. (2012). The political economy of linguistic cleavages. Journal of Development Economics, 97(2):322–338.
Desmet, K., Ortuño-Ortín, I., and Wacziarg, R. (2017). Culture, ethnicity, and diversity. American Economic Review, 107(9):2479–2513.
DiFonzo, N., Bourgeois, M. J., Suls, J., Homan, C., Stupak, N., Brooks, B. P., Ross, D. S., and Bordia, P. (2013). Rumor clustering, consensus, and polarization: Dynamic social impact and self-organization of hearsay. Journal of Experimental Social Psychology, 49(3):378–399.
DiMaggio, P., Evans, J., and Bryson, B. (1996). Have Americans' social attitudes become more polarized? American Journal of Sociology, 102(3):690–755.
Duclos, J.-Y., Esteban, J., and Ray, D. (2004). Polarization: concepts, measurement, estimation. Econometrica, 72(6):1737–1772.
Esteban, J., Gradín, C., and Ray, D. (2007). An extension of a measure of polarization, with an application to the income distribution of five OECD countries. The Journal of Economic Inequality, 5(1):1–19.
Esteban, J. and Ray, D. (1994). On the measurement of polarization. Econometrica: Journal of the Econometric Society, pages 819–851.
Esteban, J. and Ray, D. (1999). Conflict and distribution. Journal of Economic Theory, 87(2):379–415.
Esteban, J. and Ray, D. (2011). Linking conflict to inequality and polarization. American Economic Review, 101(4):1345–74.
Esteban, J. and Ray, D. (2012). Comparing polarization measures. Oxford Handbook of Economics of Peace and Conflict, pages 127–151.
Farrell, J. (2016). Corporate funding and ideological polarization about climate change. Proceedings of the National Academy of Sciences, 113(1):92–97.
Fiorina, M. P. and Abrams, S. J. (2008). Political polarization in the American public. Annu. Rev. Polit. Sci., 11:563–588.
Fiorina, M. P., Abrams, S. J., and Pope, J. C. (2005). Culture war. The myth of a polarized America, 3.
Foster, J. E. and Wolfson, M. C. (2010). Polarization and the decline of the middle class: Canada and the US. The Journal of Economic Inequality, 8(2):247–273.
Fowler, J. H. (2006a). Connecting the Congress: A study of cosponsorship networks. Political Analysis, 14(4):456–487.
Fowler, J. H. (2006b). Legislative cosponsorship networks in the US House and Senate. Social Networks, 28(4):454–465.
Garcia, D., Abisheva, A., Schweighofer, S., Serdült, U., and Schweitzer, F. (2015). Ideological and temporal components of network polarization in online political participatory media. Policy & Internet, 7(1):46–79.
Hill, S. J. and Tausanovitch, C. (2015). A disconnect in representation? Comparison of trends in congressional and public polarization. The Journal of Politics, 77(4):1058–1075.
Jackson, M. O. (2008). Social and economic networks. Princeton University Press, Princeton.
Kam, C., Indridason, I., Bianco, W., et al. (2017). Polarization in multiparty systems.
Kawada, Y., Nakamura, Y., and Sunada, K. (2018). A characterization of the Esteban–Ray polarization measures. Economics Letters, 169:35–37.
Kearney, M. W. (2019). Analyzing change in network polarization. New Media & Society, 21(6):1380–1402.
Kemeny, J. G. (1959). Mathematics without numbers. Daedalus, 88(4):577–591.
Lee, J. K., Choi, J., Kim, C., and Kim, Y. (2014). Social media, network heterogeneity, and opinion polarization. Journal of Communication, 64(4):702–722.
Lelkes, Y. (2016). Mass polarization: Manifestations and measurements. Public Opinion Quarterly, 80(S1):392–410.
Leskovec, J., Kleinberg, J., and Faloutsos, C. (2005). Graphs over time: densification laws, shrinking diameters and possible explanations. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, pages 177–187. ACM.
Lindqvist, E. and Östling, R. (2010). Political polarization and the size of government. American Political Science Review, 104(3):543–565.
Maoz, Z. (2006a). Network polarization. Mimeo.
Maoz, Z. (2006b). Network polarization, network interdependence, and international conflict, 1816–2002. Journal of Peace Research, 43(4):391–411.
Maoz, Z. and Somer-Topcu, Z. (2010). Political polarization and cabinet stability in multiparty systems: A social networks analysis of European parliaments, 1945–98. British Journal of Political Science, 40(4):805–833.
Martin, G. J. and Yurukoglu, A. (2017). Bias in cable news: Persuasion and polarization. American Economic Review, 107(9):2565–99.
McCright, A. M. and Dunlap, R. E. (2011). The politicization of climate change and polarization in the American public's views of global warming, 2001–2010. The Sociological Quarterly, 52(2):155–194.
Montalvo, J. G. and Reynal-Querol, M. (2008). Discrete polarisation with an application to the determinants of genocides. The Economic Journal, 118(533):1835–1865.
Moody, J. and Mucha, P. J. (2013). Portrait of political party polarization. Network Science, 1(1):119–121.
Newman, M. E. (2003). The structure and function of complex networks. SIAM Review, 45(2):167–256.
O'Connor, C. and Weatherall, J. O. (2018). Scientific polarization. European Journal for Philosophy of Science, 8(3):855–875.
Østby, G. (2008). Polarization, horizontal inequalities and violent civil conflict. Journal of Peace Research, 45(2):143–162.
Ozdemir, U. and Ozkes, A. (2014). Measuring public preferential polarization. Ecole Polytechnique Departement d'Economie, Cahier de recherche, (2014-06).
Permanyer, I. and D'Ambrosio, C. (2015). Measuring social polarization with ordinal and categorical data. Journal of Public Economic Theory, 17(3):311–327.
Richardson, M., Agrawal, R., and Domingos, P. (2003). Trust management for the semantic web. In International Semantic Web Conference, pages 351–368. Springer.
Stewart, L. G., Arif, A., and Starbird, K. (2018). Examining trolls and polarization with a retweet network. In Proc. ACM WSDM, workshop on misinformation and misbehavior mining on the web.
Vega-Redondo, F. (2007). Complex social networks. Number 44. Cambridge University Press.
Wimmer, A., Cederman, L.-E., and Min, B. (2009). Ethnic politics and armed conflict: A configurational analysis of a new global data set. American Sociological Review, 74(2):316–337.
Zhang, X. and Kanbur, R. (2001). What difference do polarisation measures make? An application to China. Journal of Development Studies, 37(3):85–98.
Appendix: Proofs
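In what follows, we denote the maximal value of parameter $\alpha$ in ER with $\alpha^*$ (so that $\alpha^* \simeq 1.6$).

Lemma 1
Let $1 < \alpha \le \alpha^*$ and $c \in (1,2]$. There exists $\bar{\alpha} = \bar{\alpha}(c) \in (1,\alpha^*]$ such that $\max_{z \ge 0} f(z,\alpha,c) \le 0$ whenever $\alpha \le \bar{\alpha}$. Furthermore, $\bar{\alpha}$ is increasing in $c$.

Proof of Lemma 1. For $\alpha \ge 1$, $f$ is concave in $z$. Thus, the maximum of $f$ is given by the first order condition:
$$\frac{1}{2}(1+\alpha)\big(2 - \alpha z^{\alpha-1} + (2 - (2+\alpha)c)\,z^{\alpha}\big) = 0. \tag{12}$$
Taking the derivative of the value function $v(\alpha,c) = \max_{z \ge 0} f(z,\alpha,c)$ with respect to $c$, and applying the envelope theorem, we get:
$$\frac{\partial v}{\partial c} = \frac{\partial f}{\partial z}\frac{\partial z}{\partial c} + \frac{\partial f}{\partial c} = \frac{\partial f}{\partial c} = -\frac{1}{2}(\alpha+2)\,z^{\alpha+1} < 0,$$
so the value function is (strictly) decreasing in $c$. This means that $v(\alpha,c) > v(\alpha,2)$ for any $c \in [1,2)$. It follows from the properties of $f$ in (7) discussed in the proof of Theorem 1 that, for any $c > 1$, $v(1,c) < 0$. Since $f(z,2,2) > 0$ for some $z$, as pointed out in ER (p. 833), and $v(\alpha,c)$ is decreasing in $c$, it must be that $v(2,c) > 0$ for any $c \in (1,2]$. Since $v(1,c) < 0$ and $v(2,c) > 0$ for any $c \in (1,2]$, to establish the existence of $\bar{\alpha}(c)$ from the claim of the Lemma, we show that $v(\alpha,c)$ is increasing in $\alpha$. Indeed:
$$\frac{\partial v}{\partial \alpha} = \frac{\partial f}{\partial z}\frac{\partial z}{\partial \alpha} + \frac{\partial f}{\partial \alpha} = \frac{\partial f}{\partial \alpha} = \frac{1}{2}\Big(2z - z^{\alpha}\big(1 + cz + (1 + \alpha + (-2 + (2+\alpha)c)\,z)\ln z\big)\Big).$$
To see that the above derivative is positive, first note that the first order condition (12) implies that at the maximum of $f$:
$$z^{\alpha-1} = \frac{2}{\alpha - z\big(2 - (2+\alpha)c\big)}. \tag{13}$$
Equation (13) together with the fact that $\alpha - 1 > 0$ and $c > 1$ implies $z < 1$. Indeed, if $z > 1$, then $\alpha - z\big(2 - (2+\alpha)c\big)$ would be greater than 2 since $(2+\alpha)c > 3$, so that by (13) $z^{\alpha-1} < 1$, which contradicts $z > 1$. Plugging (13) into the expression for $\frac{\partial v}{\partial \alpha}$ from above, we get:
$$\frac{\partial v}{\partial \alpha} = \frac{1}{2}\left(2z - \frac{2z}{\alpha - z\big(2 - (2+\alpha)c\big)}\big(1 + cz + (1 + \alpha + (-2 + (2+\alpha)c)\,z)\ln z\big)\right) = -\frac{z\Big[(1-\alpha) + (2 - c - \alpha c)\,z + \big(1 + \alpha + (-2 + 2c + \alpha c)\,z\big)\ln z\Big]}{\alpha - z\big(2 - (2+\alpha)c\big)},$$
which is clearly positive for $z < 1$, $\alpha \ge 1$, and $c > 1$. Hence, for any $c \in [1,2]$ there exists $\bar{\alpha}(c) \in [1,2]$ such that $\max_{z \ge 0} f(z,\alpha,c) \le 0$ for $\alpha \le \bar{\alpha}(c)$ (with equality only when $\alpha = \bar{\alpha}$). Finally, $\frac{\partial v}{\partial c} < 0$ and $\frac{\partial v}{\partial \alpha} > 0$ imply that $\bar{\alpha}(c)$ increases with $c$ for $c \in (1,2]$.

Lemma 2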
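Let $0 \le \alpha < 1$ and $c \in (1,2]$. There exists $\underline{\alpha} = \underline{\alpha}(c) \in [0,1)$ such that $\max_{z \ge 0} f(z,\alpha,c) \ge 0$ whenever $\alpha \le \underline{\alpha}(c)$. Furthermore, $\underline{\alpha}$ is decreasing in $c$.

Proof of Lemma 2. Let $\alpha < 1$. We first prove that $f(z,\alpha,c) \ge 0$ implies $z \ge 1$. Then we show that for $z \ge 1$, $f$ is decreasing in $\alpha$. Therefore $v(\alpha,c) = \max_{z \ge 0} f(z,\alpha,c)$ is decreasing in $\alpha$.

To show that $f(z,\alpha,c) \ge 0 \Rightarrow z \ge 1$, note that $f(z,\alpha,c) \ge 0 \Rightarrow f(z,\alpha,1) \ge 0$, since $f$ is decreasing in $c$. Hence, to show $f(z,\alpha,c) \ge 0 \Rightarrow z \ge 1$ it suffices to show $f(z,\alpha,1) \ge 0 \Rightarrow z \ge 1$. To prove this implication, we show that $z < 1 \Rightarrow f(z,\alpha,1) < 0$. We have
$$f(z,\alpha,1) = \frac{1}{2}\big(-1 + 2(1+\alpha)z - (1+\alpha)z^{\alpha} - \alpha z^{1+\alpha}\big) < \frac{1}{2}\big(-1 + (1+\alpha)(2z - z) - \alpha z^{1+\alpha}\big) < \frac{1}{2}\big(-1 + (1+\alpha)z - \alpha z^{2}\big) = \frac{1}{2}(1 - \alpha z)(z - 1) < 0,$$
where the inequalities follow from the fact that $\alpha, z < 1$. Next, we show that if $z \ge$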
1, then f is decreasing in α . We have that: ∂f∂α = 12 (cid:2) z − z α − cz α +1 − z α (1 + α + 2( c − z + αcz ) ln z (cid:3) We need to show that 2 z − z α − cz α +1 − z α (1 + α + 2( c − z + αcz ) ln z ≤
0. Dividingby z , we obtain the inequality2 z − α ≤ cz + (1 + α + 2( c − z + αcz ) ln z, which holds as the LHS is not greater than 2, because z ≥ α − < c ≥ z ≥ f ( z, α, c ) ≥ f is decreasing in α . This implies that v ( α, c ) =29ax z ≥ f ( z, α, c ) is decreasing in α whenever v ( α, c ) ≥
0. We choose α ( c ) to be equal to azero of function v ( α, c ), whenever this zero exists on [0 , v is decreasing in c , and decreasing in α whenever v ( α, c ) ≥ α ( c ) decreaseswhen cc