Predicting Strategic Voting Behavior with Poll Information
Roy Fairstein, Adam Lauz, Kobi Gal, Reshef Meir
Abstract
The question of how people vote strategically under uncertainty has attracted much attention in several disciplines. Theoretical decision models have been proposed which vary in their assumptions on the sophistication of the voters and on the information made available to them about others' preferences and their voting behavior. This work focuses on modeling strategic voting behavior under poll information. It proposes a new heuristic for voting behavior that weighs the success of each candidate according to the poll score against the utility of the candidate given the voter's preferences. The model weights can be tuned individually for each voter. We compared this model with other relevant voting models from the literature on data obtained from a recently released large-scale study. We show that the new heuristic outperforms all other tested models. The prediction errors of the model can be partly explained by inconsistent voters who vote for (weakly) dominated candidates.
Introduction

It is well accepted that people often vote strategically in political and other situations, taking into account not just their preferences, but also beliefs about how their vote would affect the outcome [7, 2]. Researchers in economics, political science, and more recently in the computational social choice community, have suggested various mathematical models to capture the strategic decision that a voter faces [6, 5, 8]. The gains from a certain action depend not only on the preferences of the voter, but also on the votes of others. Thus, part of the difficulty in predicting a voter's decision arises from the fact that there is uncertainty about others' voting decisions, i.e., about what can be inferred from the poll about the actual votes. Theoretical models describe this uncertainty in the following different ways, which can lead to different predictions of a voter's actions.

• Expected utility maximization. A rational voter maximizes her expected utility with respect to a probability distribution over the actions of the other voters. The distribution itself may be given exogenously (e.g., by a poll with known variance, as in our model), or derived via equilibrium analysis from the uncertain preferences of the other voters. Such models were developed mainly in the economics literature and are sometimes known as the "calculus of voting" [15, 11, 12].

• Voting heuristics. In these models the voter uses some (typically simple) function that states which action to take in any given situation. The voter is not assumed to be rational, and may not even have a cardinal utility measure or an explicit probabilistic representation of the different outcomes. For example, according to the 2-pragmatist heuristic, the voter behaves as if only the two candidates leading the poll are participating [14].

• Bounded rationality. These models present a mid-point between utility maximization and heuristics.
The voter makes a rational strategic decision based on a heuristic belief, rather than an accurate probabilistic belief. One example of such a model is local dominance [10], which assumes that each voter derives a set of possible outcomes based on a poll, and then selects a non-dominated action within these outcomes.

We evaluated the different models on data obtained from Tal et al. [16], who implemented several voting scenarios in controlled experiments involving humans, varying the number of voters, the poll information, and voters' preferences. Our main findings are that the AU model outperforms all other models in all scenarios. The k-pragmatist heuristic model, which considers only a limited number of candidates when making decisions, comes in second. The bounded-rational models obtained the worst performance.

Our first contribution is an extensive evaluation of various decision models on real-world data. We use the data of Tal et al. [16], where human subjects with dictated preferences are exposed to a poll over three candidates and make a single voting decision under the Plurality rule. This is the simplest possible setting that involves a nontrivial strategic decision. This is the first time that these models are tested against voting decisions with poll information; in fact, for some of them this is the first empirical test at all.

Our second contribution is a new heuristic voting rule, inspired by a similar model of Bowman et al. [4], that takes into account both the utility of a candidate and its attainability. The
Attainability-Utility (AU) decision model outperforms all other decision models we tested in predicting human votes. This contributes to the understanding of the factors that determine people's strategic voting, and can lead to new theories of voting behavior that combine rational and boundedly rational behavior.

Related work

We are not aware of another controlled experiment where voters face multiple strategic decisions with poll information. Yet, similar experiments were conducted in which groups of human players voted strategically with dictated preference profiles. Closest to our work is a recent paper by Tyszler and Schram [17], who showed that the strategic behavior of voters in the lab is consistent with a quantal best-response equilibrium. The main difference is that their subjects played a strategic game against other human players, and the information they had was the preferences of the other voters rather than poll information. Similar game-theoretic experiments along that line were conducted in [7, 2, 18]. In particular, these studies have shown that strategic voting in the lab increases with the amount of information subjects received about others' preferences and actions.

A different line of work in political science compared theoretical models against actual votes in political elections (using exit polls to obtain the truthful preferences). For example, Blais et al. [3] tested the calculus of voting model on empirical data from political elections, focusing on the voter's decision to vote or abstain. They concluded that the model has some explanatory power but is far from explaining the data completely, and did not compare it to other decision models. In contrast, Abramson et al. [1] concluded that voting behavior in US primary elections is consistent with the calculus of voting, but observed obvious strategic behavior only in ∼
13% of the voters, a bit higher than the fraction that seem to vote at random. In contrast to controlled experiments, such datasets typically contain few decisions by each voter (usually just one), and are thus insufficient to test decision models against individual behavior.
Preliminaries
In this section we provide the necessary background for our work. An (anonymous) score aggregation rule with m candidates C is a function f : ℕ^m → 2^C \ {∅}, mapping vectors of candidates' scores to a subset of winning candidates. In particular, the Plurality rule lets each voter vote for a single candidate, collects the total number of votes s ∈ ℕ^m, and selects f(s) = argmax_{c∈C} s(c).

We consider a single voter who faces a decision: to vote for one of several candidates C. The voter has a cardinal utility function u : C → ℝ, where u(c) is the utility for the voter if candidate c wins (a different utility for each candidate). The utility of a subset of winners W ⊆ C is u(W) = (1/|W|) Σ_{c∈W} u(c). Denote by U(C) the set of all utility functions over the set C. We denote by f(s + c) the outcome of the score vector s with one additional vote for c.

Prior to her vote, the voter is faced with poll information, which is a point estimate of the candidates' scores under the Plurality voting rule. Formally, the poll is a vector s ∈ ℕ^m, where s(c) is the number of voters expected to vote for candidate c. There is a joint probability distribution D ∈ Δ(ℕ^m × ℕ^m) over pairs of "real outcomes" and polls. The voter is not explicitly informed of this distribution.

A decision model (for Plurality with m candidates and a poll) is a function M : U(C) × ℕ^m → C, where M(u, s) ∈ C is the vote of a voter with utility function u, using decision model M, given a poll s. We use a superscript for the name of the decision model (e.g., M^Truth), and subscripts to denote voter-specific parameters, if relevant. We restrict our attention to deterministic decision models in this work. We demonstrate with two simple examples. First, the decision model of a voter who is always truthful regardless of the poll is M^Truth(u, s) = argmax_{c∈C} u(c).

Next, consider a rational voter who believes the poll to be a completely accurate representation of the other votes.
Such a voter can predict that the outcome of voting for c is f(s + c), and her decision will be M^BR(u, s) ∈ argmax_{c∈C} u(f(s + c)), i.e., a "best response" to the votes of the other voters (with some assumption on how to vote when there are multiple best responses).

For exposition, we introduce a running example with 5 candidates, and specify which candidate the voter will choose under every decision model.

Example 1 (Running example). The set of candidates is C = {q_1, ..., q_5}. A voter's utility is described by the vector u = (40, ...) (preferences are lexicographic). Poll scores are given by s = (25, ...). Figure 1 shows the scores of all candidates graphically. Both M^Truth and M^BR always select q_1.

In this section we briefly describe some decision-making models of voting behavior from the literature, one for each of the approaches specified above (heuristic, rational, and bounded-rational). In Section 3 we describe our decision model developed for this study, and in Section 4 we provide a detailed comparison of this model to the decision-making models below, as well as to several baseline models.

A priori, this distribution could be arbitrary, but in most realistic cases there is some correlation between the real score of a candidate and its score in the poll.
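The two baseline models above can be sketched in a few lines of Python. This is a minimal sketch: the function names, zero-based candidate indices, and the tie-breaking toward the more preferred candidate are our own illustrative choices, not taken from the paper.

```python
# Sketch of the two baseline decision models M^Truth and M^BR under Plurality.

def truthful_vote(u):
    """M^Truth: vote for the most preferred candidate, ignoring the poll."""
    return max(range(len(u)), key=lambda c: u[c])

def winners(s):
    """Plurality winner set f(s): all candidates with maximal score."""
    top = max(s)
    return [c for c in range(len(s)) if s[c] == top]

def utility_of_outcome(u, ws):
    """u(W): average utility over the (possibly tied) winner set W."""
    return sum(u[c] for c in ws) / len(ws)

def best_response_vote(u, s):
    """M^BR: treat the poll as the exact votes of the others and vote so
    that f(s + c) maximizes utility (ties broken toward the preferred
    candidate -- one possible assumption on multiple best responses)."""
    def value(c):
        s_plus = list(s)
        s_plus[c] += 1
        return (utility_of_outcome(u, winners(s_plus)), u[c])
    return max(range(len(u)), key=value)
```

For instance, a voter with (hypothetical) utilities u = [10, 5, 0] facing a poll s = [1, 4, 4] is truthful for candidate 0, but best-responds by compromising on candidate 1, which breaks the tie at the top in her favor.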
[Figure 1: The poll s from Example 1, and the candidate selected by each decision model: Prag with k = 2 and k = 4; CV with η = 8 and η = 10000; LD and LD+LB, each with two values of r.]

k-pragmatist

The first model we consider is the simple k-pragmatist heuristic [14]. Formally, let B_k(s) contain the k candidates with the highest scores in s; then the pragmatist decision model with parameter k selects the most preferred candidate among them, i.e., M^Prag_k(u, s) = argmax_{c∈B_k(s)} u(c). We allow k to be an individual parameter that differs from voter to voter.

In Figure 1 we see that for k = 2 the voter looks only at the two leading candidates, and therefore votes for the one she prefers among them. For k = 4, the voter considers all candidates except the one ranked last in the poll as possible winners, and therefore votes for her most preferred candidate.

Calculus of voting
The calculus of voting suggests that a voter always votes in a way that maximizes her expected utility [15, 12]. The complications of the model usually arise from the fact that the voter is assumed to know only the other voters' preferences, and uses an equilibrium model to predict their votes. However, we consider a simpler version where the distribution of votes is given exogenously [11].

Recall that we defined D as a joint distribution over actual scores and polls. We denote by D(s) the distribution over the actual scores, conditional on poll scores s. Denote by

P_{s,D}(x, y) = Pr_{s′∼D(s)}[ (f(s′) = {x} and f(s′ + y) = {x, y}) or (f(s′) = {x, y} and f(s′ + y) = {y}) ],

the probability that the voter is pivotal for y versus x when the poll is s; that is, that voting for y makes y a joint or unique winner. Then, the voter votes so as to maximize her expected utility:

M^CV_D(u, s) = argmax_{c∈C} Σ_{c′≠c} P_{s,D}(c′, c) (u(c) − u(c′)).

To make the CV model concrete, we need to pin it down to a specific distribution D. For this paper, we use the way that scores were generated in the experiment of Tal et al. [16]. Specifically, given a poll s of an n-voter population, the actual score vector s′ is obtained by sampling n votes from a multinomial distribution whose parameters are (s(c)/n)_{c∈C}. We use P_{s,η} as a shorthand for P_{s,D} when D(s) is a multinomial distribution with η voters, as explained above.
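The pivot probabilities P_{s,η} rarely need to be computed exactly; they can be estimated by straightforward Monte-Carlo sampling. The sketch below is our own illustration (function names and the tie-breaking toward the more preferred candidate are assumptions, not from the paper): it draws score vectors of η votes from the multinomial distribution above, accumulates the expected-utility gain of each vote over the pivotal events, and returns the argmax.

```python
import random

def pivot_vote_cv(u, s, eta, trials=20000, rng=None):
    """Monte-Carlo sketch of the M^CV_eta rule: sample eta votes with
    probabilities s(c)/n, detect the two pivotal-event types from the
    definition of P(x, y), and vote to maximize expected utility gain
    (ties broken toward the more preferred candidate)."""
    rng = rng or random.Random(0)
    m = len(s)
    gain = [0.0] * m
    for _ in range(trials):
        sprime = [0] * m
        for c in rng.choices(range(m), weights=s, k=eta):
            sprime[c] += 1
        top = max(sprime)
        lead = [c for c in range(m) if sprime[c] == top]
        for y in range(m):
            if len(lead) == 1 and y != lead[0] and sprime[y] == top - 1:
                # f(s') = {x} and f(s' + y) = {x, y}: voting y creates a tie
                gain[y] += u[y] - u[lead[0]]
            elif len(lead) == 2 and y in lead:
                # f(s') = {x, y} and f(s' + y) = {y}: voting y breaks the tie
                x = lead[0] if lead[1] == y else lead[1]
                gain[y] += u[y] - u[x]
    return max(range(m), key=lambda c: (gain[c], u[c]))
```

With a seeded generator the sketch is deterministic; for a voter with hypothetical utilities [10, 5, 0] and poll [1, 4, 4] with η = 9, the frequent ties between the two leaders make the compromise vote for candidate 1 the expected-utility maximizer.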
When η = n (i.e., the true number of voters), M^CV_η selects the candidate that exactly maximizes the voter's expected utility, since P_{s,n}(x, y) is the true probability that the voter is pivotal. However, the M^CV_η decision model allows for a more flexible, bounded-rational decision: when η < n the voter overestimates her true pivot probability, and thus her influence on the outcome, whereas η > n means that the voter underestimates her influence.

In Figure 1 we see that for η = 8 the voter believes she is pivotal with sufficiently high probability to substantially increase the chance of a more preferred candidate to win. However, for η = 10000, the voter believes that any tie except one between the two leading candidates is highly improbable, and therefore votes for the more preferred of the two.

Local dominance
Under the local dominance model [10, 9], the voter has an 'uncertainty parameter' r. Given a poll s with n participants, the voter considers as possible (without assigning any explicit probability) all score vectors s′ such that |s(c) − s′(c)| ≤ r·n for all c ∈ C. Then, the voter selects an undominated action (i.e., candidate) given this set of possible outcomes. Meir et al. [10] characterize the undominated candidates:

• Let W be all candidates whose score in s is at least max_{c∈C} s(c) − rn.
• If |W| ≥ 2, then the undominated candidates are all candidates in W except the least preferred one.
• If |W| = 1, then all candidates are undominated.

Denote by U(s, u, r) the set of undominated candidates in poll s for a voter with utility u and parameter r. The decision model of such a voter is

M^LD_r(u, s) = argmax_{c∈U(s,u,r)} u(c).

This assumes that the voter selects the most preferred undominated candidate, if more than one exists.

In Figure 1 we see that for r = 0.01 the voter believes that the poll is very accurate (the score of each candidate may change by at most rn = 3 votes), and there is only one possible winner. In this case, the voter remains truthful and M^LD_{0.01}(u, s) = q_1. When r = 0.
08, the voter believes that the poll is less accurate, and hence W contains the three leading candidates.

The Attainability-Utility heuristic

We suggest a new heuristic M^AU that separately evaluates the attainability (an approximation of the chance of success of each candidate according to the poll score) and the utility of each candidate given the voter's preferences. It selects the candidate that maximizes their weighted geometric mean. The heuristic is partly inspired by a rule that was used in the simulations of Bowman et al. [4] for multi-issue voting. Given a poll s, we compute the attainability of each candidate c similarly to [4]:

a_β(c, s) = 1 / (1 + exp(−β · (s(c)/n − 1/m))) ∈ [0, 1].

Then, for some small constant ε >
0, we define:

M^AU_{α,β}(u, s) = argmax_{c∈C} H^AU_{α,β}(u, s, c), where H^AU_{α,β}(u, s, c) = (ε + u(c))^α · (ε + a_β(c, s))^{2−α}.

(In [4], attainability was computed for each issue separately, and there were additional factors such as learning from the past. All factors were multiplied to obtain the heuristic attractiveness of the candidate.)

Figure 2: Attainability as a function of candidate poll scores for different values of β.

Intuitively, the α parameter trades off the relative importance of attainability and utility: α = 0 means the voter always selects the candidate with maximal score, and α = 2 means the voter is always truthful. The β parameter can be thought of as the accuracy of the poll in the eyes of the voter, similarly to the role of the parameter r in the LD model and η in the CV model. Figure 2 shows how β affects the attainability score a_β(c, s). Candidates that are tied have the same attainability. A high β means that a small advantage in score translates to a large gap in attainability.

Table 1 shows the AU model's behavior on Example 1 with different parameters. For α close to 2, the voter tends to be truthful; however, when there is a big gap in votes between candidates, β determines which of the top preferences is chosen: when β is big, the gap causes a bigger difference in the heuristic scores and pushes the voter toward the more attainable candidate, whereas when β is small, the gap is less taken into account and the voter votes for the more preferred one. In contrast, when α is close to 0, the model considers the poll as more important: when β is large, small changes in the poll have more effect on the decision and only the leading candidates have non-negligible attainability; for small β the differences in the poll have less effect, and the voter's fourth preference is chosen. Notice that no matter what the parameters are, the model will never choose to vote for the two trailing candidates, since they
Notice that nomatter what the parameters are, the model will never choose to vote for q or q , since theyare each dominated by another candidate with higher score and utility. H ( q ) H ( q ) H ( q ) H ( q ) H ( q ) M AU α,β ( u, s ) α = 1 . , β = 30 382.9 q α = 1 . , β = 10 q α = 0 . , β = 30 ≈ ≈ q α = 0 . , β = 10 0.16 0.77 0.11 q Table 1: M AU α,β heuristic score and decision in Example 1, for various parameter values. This is the only case where ε > u ( c ) may be 0. Methodology
Dataset
We evaluated the different models on data obtained from Tal et al. [16], who implemented several voting scenarios in controlled experiments involving humans. Some of this data is publicly available at votelib.org. The data was obtained from 595 distinct subjects. Each subject played up to 20 rounds of voting with 3 candidates, each time with different preferences and poll information. The poll provided a noisy indication of the results of the voting. The voting instances can be divided into six different "scenarios" corresponding to different orders of the candidates' scores in the poll once preferences are held fixed (see the two leftmost columns in Table 2). We denote the candidates by Q for the most preferred, Q′ for the second, and Q′′ for the least preferred. The reward was 10¢ for each round where Q was elected, 5¢ for Q′, and 0¢ for Q′′. Note that only in scenarios E and F, where Q is ranked last in the poll, may the voter have a monetary incentive to vote for Q′; she never has an incentive to vote for Q′′.

In addition to our decision model M^AU_{α,β}, we evaluate the following single-parameter decision models described in Section 2.1: M^Prag_k, M^CV_η, and M^LD_r. To these models, we add several other baselines.

Voter type based model
Tal, Meir and Gal [16] identified three distinct types of voter behavior, albeit without suggesting an explicit decision model:

1. Voters who are always truthful (TRT voters, about 10%-15% of subjects);
2. Voters who often compromise when Q is ranked last (CMP voters, about 40% of subjects), and otherwise vote truthfully;
3. Voters who often compromise AND select the leader Q′ when it is ranked first (LB voters, about 50% of subjects).

They also identified a subgroup of subjects who select unjustified actions (a candidate c for which there is a c′ that is both more preferred and higher-ranked) more than once. The behavior of these voters (about 5%-10% of the dataset) is naturally harder to predict for any decision model. We analyze the results for all subjects, but return to the issue of unjustified actions and voters in Section 5.3.

Based on this distinction of types, we consider the simple TMG decision model M^TMG_T. The parameter T ∈ {TRT, CMP, LB} is the voter type. It is defined as follows:

• M^TMG_TRT(u, s) = M^Truth(u, s) = Q;
• M^TMG_CMP(u, s) = Q′ if Q is ranked last in s, and Q otherwise;
• M^TMG_LB(u, s) = Q′ if Q′ is ranked first in s, and M^TMG_LB(u, s) = M^TMG_CMP(u, s) otherwise.

Local-Dominance with Leader bias
Note that the findings of [16] indicate a strong tendency to bias toward the leader of the poll, which is not taken into account in the local dominance model. We thus consider a "leader-biased" variation of the local dominance model: M^{LD+LB}_r(u, s) = M^LD_r(u, s) if |W| ≥ 2, and otherwise M^{LD+LB}_r(u, s) is the single candidate in W.
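Both variants follow directly from the characterization of undominated candidates. A minimal sketch (function names are our own; utilities are assumed distinct, so ties in preference need no special handling):

```python
def ld_possible_winners(s, r):
    """W: candidates whose poll score is within r*n of the leader's."""
    n, top = sum(s), max(s)
    return [c for c in range(len(s)) if s[c] >= top - r * n]

def local_dominance_vote(u, s, r):
    """M^LD_r: most preferred undominated candidate.
    If |W| >= 2, the undominated candidates are W minus its least
    preferred member; if |W| = 1, every candidate is undominated."""
    W = ld_possible_winners(s, r)
    if len(W) >= 2:
        worst = min(W, key=lambda c: u[c])
        undominated = [c for c in W if c != worst]
    else:
        undominated = list(range(len(u)))
    return max(undominated, key=lambda c: u[c])

def ld_leader_bias_vote(u, s, r):
    """M^{LD+LB}_r: as LD when |W| >= 2, otherwise vote for the single
    possible winner (the poll leader)."""
    W = ld_possible_winners(s, r)
    if len(W) >= 2:
        return local_dominance_vote(u, s, r)
    return W[0]
```

For hypothetical utilities [10, 5, 0] and a lopsided poll [2, 3, 15] with small r, LD leaves the voter truthful while LD+LB makes her join the runaway leader.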
In Figure 1 we see that this model acts similarly to the LD model; however, when there is only one possible winner, this model allows the voter to be leader-biased and vote for her fourth preference instead of being truthful.

Black-box neural network predictor
Another baseline we used was a general black-box classifier. We extracted about 30 relevant features, including the poll scores, the differences in poll scores, the voter's utility, and the voter type as identified in [16]. The "decision model" M^NN then feeds all features to the classifier, which predicts an action in C. The classifier consisted of a single-hidden-layer feed-forward neural network. The input nodes represented features that summarized voters' preferences, the poll information that was provided to them, and information about the voter types. The classifier was implemented using the nnet package of R.

Prediction and parameter fitting
The prediction was performed using the leave-one-out method. For each voter we excluded one of her rounds, one by one. Using the rest of the rounds, we learned the relevant model parameters and predicted what the voter would do in the excluded round.
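As an illustration of this procedure for the AU model, the sketch below fits (α, β) per voter by grid search under leave-one-out. The candidate grids, the ε value, and all function names are our own assumptions, not taken from the paper.

```python
import math

EPS = 1e-6  # stands in for the small constant epsilon in the AU model

def attainability(c, s, beta):
    """a_beta(c, s) = 1 / (1 + exp(-beta * (s(c)/n - 1/m)))."""
    n, m = sum(s), len(s)
    return 1.0 / (1.0 + math.exp(-beta * (s[c] / n - 1.0 / m)))

def au_vote(u, s, alpha, beta):
    """M^AU: argmax of (eps + u(c))^alpha * (eps + a_beta(c, s))^(2 - alpha)."""
    def h(c):
        return (EPS + u[c]) ** alpha * (EPS + attainability(c, s, beta)) ** (2.0 - alpha)
    return max(range(len(u)), key=h)

def fit_au_leave_one_out(rounds, alphas, betas):
    """Leave-one-out evaluation for one voter: for each held-out round,
    pick (alpha, beta) by grid search on the remaining rounds, predict
    the held-out vote, and count correct predictions.
    `rounds` is a list of (u, s, vote) triples."""
    correct = 0
    for i, (u, s, vote) in enumerate(rounds):
        train = rounds[:i] + rounds[i + 1:]
        best = max(
            ((a, b) for a in alphas for b in betas),
            key=lambda ab: sum(au_vote(tu, ts, *ab) == tv for tu, ts, tv in train),
        )
        correct += au_vote(u, s, *best) == vote
    return correct
```

For instance, a hypothetical voter with utilities [10, 5, 0] who always votes truthfully is matched by α = 2 (utility dominates), and all of her held-out rounds are then predicted correctly; with α = 0.5 the heuristic would instead compromise on a strong poll leader.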
Confusion matrices
The predictions of a specific decision model result in a confusion matrix: the entry A(x, y) in the matrix specifies how many times the model predicted x while the actual voter action was y (a matrix where all off-diagonal entries are 0 indicates perfect prediction). Both rows and columns are sorted Q, Q′, Q′′. For example, in one confusion matrix from our data, 441 samples (4.7% of the data) were cases where the studied model predicted Q′ but the voter selected Q.

Performance measures
From the confusion matrix we compute standard measures for multi-class prediction problems [13]. These include precision and recall, as well as the f-measure, which is the harmonic mean of precision and recall, for every candidate c ∈ C:

prec(c) = A(c, c) / Col_A(c);  recall(c) = A(c, c) / Row_A(c);  F(c) = 2 prec(c) · recall(c) / (prec(c) + recall(c)),

where Col_A(c) = Σ_{c′∈C} A(c′, c) and Row_A(c) = Σ_{c′∈C} A(c, c′). Since there are three possible actions, we calculate a single f-measure by weighting the f-measure of each action by the number of times this action was played:

F_A = (1/‖A‖) Σ_{c∈C} Row_A(c) F(c).

In the example matrix above, F(Q′) = 0.87.

https://cran.r-project.org/web/packages/nnet/index.html

Results and Analysis
Table 2 shows the f-measure of each decision model. We emphasize that the individual parameters of each voter were learned using leave-one-out to avoid overfitting. The results are separated by poll scenario, as each scenario reflects a different strategic decision. The f-measures are also presented graphically as the solid bars in Figure 3.

[Table 2: f-measure of each decision model (AU, LD, LD+LB, CV, Prag, TMG, NN) in each scenario: A (Q > Q′ > Q′′, ~15% of instances), B (Q > Q′′ > Q′, ~11%), C (Q′ > Q > Q′′, ~14%), D (Q′′ > Q > Q′, ~16%), E (Q′ > Q′′ > Q, ~22%), F (Q′′ > Q′ > Q, ~19%).]

From Table 2 and Figure 3 we can derive the following insights:

• In Scenarios A and B, all decision models (except NN) predict that voters are always truthful, and thus have the same high performance.
• In all of Scenarios C-F, the AU model outperforms all other models.
• The "sophisticated" bounded-rational models CV and LD have the worst performance. In particular, they demonstrate poor performance in Scenario C, where voters' decisions are influenced by leader bias [16].
• The k-pragmatist heuristic performs surprisingly well, considering its utter simplicity and the fact that it only allows three types of voters (for k = 1, 2, 3).
• The LB variant of local dominance strictly improves on plain local dominance, placing it roughly on par with k-pragmatist and the neural network predictor.
• Scenario F is the most difficult one for almost all models, with even the best models attaining a relatively low f-measure.

The data we use to fit the parameters of each voter is sparse. Each voter has at most 20 samples, and in some scenarios only 1 or 2 samples (or none at all). Therefore, even leaving out a single sample may significantly hurt performance. In order to find the maximum explanatory power of each model, we re-calculated the f-measure for each model using the entire dataset both as a training set and a test set. Clearly this approach is susceptible to overfitting, so it only provides an upper bound on the prediction ability of each model. These upper bounds appear as striped bars in Fig. 3. Note that our AU model still outperforms all other models when we compare upper bounds (with the slight exception of the NN model in Scenarios A and B).
The numerical upper bounds can be found in Table 3 in the appendix.

Figure 3: The f-measure of each decision model in all scenarios. The solid bars show the prediction performance, whereas the striped bars show the upper bound on the performance of each model.

Next, we dig deeper into the results. We want to see what kinds of mistakes the AU decision model tends to make, for example, whether these mistakes concentrate on a specific subset of subjects or scenarios. These insights could be used later to improve the model, and to design further experimental evaluation.

Errors by poll size
First, the model seems to perform equally well for all poll sizes (see Table 4 in the appendix), even for sizes as different as n = 8 and n = 10000. We thus conclude that the size of the poll is not a major factor in explaining the prediction errors.

[Figure 4: Confusion matrices of the AU model for scenarios C (Q′ > Q > Q′′), D (Q′′ > Q > Q′), E (Q′ > Q′′ > Q), and F (Q′′ > Q′ > Q).]
Errors by type
Figure 4 shows the confusion matrices of the AU decision model in all the "interesting" scenarios C-F. Recall that all the off-diagonal entries indicate prediction errors, where the column is the predicted action (Q, Q′ or Q′′, in that order) and the row is the action of the subject. As can be seen, most of the prediction errors in Scenario C are due to under-prediction of voting for the leader Q′. In contrast, most of the errors in Scenario E are due to over-prediction of a strategic compromise Q′. We can also see why Scenario F is the hardest, as all three actions are played frequently.

Errors by subject
Every decision model can capture the behavior of some human subjects better than others. To check how well different subjects are predicted, we computed the confusion matrices and f-measure for each of the 595 subjects, when actions are predicted by M^AU. An f-measure of 1 means that all actions of a subject were predicted correctly.

Figure 5 shows the distribution of subjects' individual f-measures. We can see that about 46% of the subjects are predicted very well (f-measure over 0.9), 29% are predicted reasonably well (f-measure over 0.8), and the remaining subjects (about 25%) have an f-measure of less than 0.8. This means that most of the prediction errors are due to a relatively small subset of subjects. One possible explanation is that these are the subjects who played fewer games, so that it is harder for the model to learn their parameters; however, we get a similar distribution after omitting subjects who played under 10 games. The main question is thus whether voters whose behavior is not predicted well follow a different decision model than AU, or are somehow inherently unpredictable.

Figure 5: A histogram showing, for every f-measure F, how many subjects have an f-measure of F.

Inherently inconsistent behavior
To answer the above question, we considered types of behavior that would be 'inherently unpredictable.' For example, [16] categorized a sample (s, a) (action a ∈ C under poll s) as "unjustified" if there is another candidate a′ that 'dominates' the selected action a (a′ is more preferred than a and s(a′) ≥ s(a)). They showed that voters with at least two unjustified actions have a random component in their behavior.

We suggest an additional criterion that is based on inconsistency among a voter's own actions. We say that a sample (s, a) is inconsistent if there is another sample (s*, a*) of the same voter, such that: (i) a* ≠ a; (ii) s*(a) ≥ s(a); and (iii) s*(a′) ≤ s(a′) for all a′ ≠ a. In words, a is in a weakly better position in s* than in s, but the voter still prefers to vote for another candidate a*.

Figure 6 (left) shows the f-measure of all voters, classified by their consistency type. We can see that the left tail of the histogram (i.e., almost all voters with low f-measure) consists of voters who are either "unjustified" or "inconsistent."

Figure 6: Left: f-measure per voter, for voters with more than 10 samples. We divided the population into "unjustified" voters who played at least two unjustified actions; "inconsistent" voters; and all others.
Right: Prediction accuracy in each scenario, divided by action type.

This might suggest that prediction cannot be significantly improved. We thus tested how many of the prediction errors themselves were of dominated/inconsistent actions. This can be seen in Figure 6 (right). The plaid gray bars represent all prediction errors that cannot be explained away as dominated or inconsistent actions. We conclude that while most of the prediction errors are indeed due to voters who sometimes behave inconsistently, most of the actual errors are "plaid", especially in Scenario F. Thus there is still room for improvement of our decision models.
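The inconsistency criterion above is mechanical enough to state in code. The following sketch (our own naming; it compares raw poll scores across samples, exactly as in conditions (i)-(iii)) flags a sample as inconsistent:

```python
def is_inconsistent(sample, others):
    """Return True if sample (s, a) is inconsistent with some other
    sample (s_star, a_star) of the same voter: the chosen candidate a
    was in a weakly better position in s_star, yet the voter chose a
    different candidate there."""
    s, a = sample
    for s_star, a_star in others:
        if a_star == a:
            continue  # condition (i): a different action was taken
        if s_star[a] >= s[a] and all(
            s_star[c] <= s[c] for c in range(len(s)) if c != a
        ):
            return True  # conditions (ii) and (iii) hold
    return False
```

For example, a voter who chose candidate 0 under poll (5, 9, 9) but chose candidate 1 under (6, 8, 8), where candidate 0 stood weakly better on every coordinate, is inconsistent; if the second poll had been (4, 10, 9) instead, condition (ii) would fail and the pair would be consistent.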
It seems that the Attainability-Utility heuristic explains quite well the behavior of most subjects in the data, except those with inherent inconsistencies in their replies. To improve the model, we can perform more experiments that use different utilities for the candidates (different utility gaps, negative utility, etc.) and more than 3 candidates. Such experiments can expose behavior that does not occur in the current data. Interestingly, the NN black-box model sometimes successfully predicts "unjustified" actions, and we can try to understand when this is possible.

Since the α and β parameters correspond to natural cognitive inclinations, their distribution can in principle reveal important information on the types of strategic voters in the population. Unfortunately, it is hard to discern clear patterns in the parameter distribution. Indeed, it seems that the distribution of the β parameter is bi-modal with peaks at 15 and 30 (see appendix). However, this is probably an artifact of the way we select parameters for subjects with a large range of optimal parameters (like truthful voters). More experimentation under different conditions is required if we want to better understand the population structure. One pattern that does stand out is that α values seem to be much higher in the 'small n' condition. This may suggest that not only the relative attainability matters: when n is low and the voter has a substantial chance to be pivotal, the importance of utility increases.

References

[1] Paul R. Abramson, John H. Aldrich, Phil Paolino, and David W. Rohde. Sophisticated voting in the 1988 presidential primaries. American Political Science Review, 86(1):55–69, 1992.

[2] Anna Bassi. Voting systems and strategic manipulation: an experimental study. Technical report, mimeo, 2008.

[3] André Blais, Robert Young, and Miriam Lapp. The calculus of voting: An empirical test.
European Journal of Political Research , 37(2):181–201, 2000.[4] Clark Bowman, Jonathan K Hodge, and Ada Yu. The potential of iterative voting tosolve the separability problem in referendum elections.
Theory and decision , 77(1):111–124, 2014.[5] Samir Chopra, Eric Pacuit, and Rohit Parikh. Knowledge-theoretic properties of strate-gic voting. Presented in JELIA-04, Lisbon, Portugal, 2004.[6] Vincent Conitzer, Toby Walsh, and Lirong Xia. Dominating manipulations in votingwith partial information. In
AAAI’11 , pages 638–643, 2011.[7] Robert Forsythe, Thomas Rietz, Roger Myerson, and Robert Weber. An experimentalstudy of voting rules and polls in three candidate elections.
International Journal ofGame Theory , 25(3):355–383, 1996.[8] Umberto Grandi, Andrea Loreggia, Francesca Rossi, Kristen Brent Venable, and TobyWalsh. Restricted manipulation in iterative voting: Condorcet efficiency and bordascore. In
ADT’13 , pages 181–192. Springer, 2013.[9] Reshef Meir. Plurality voting under uncertainty. In
AAAI’15 , 2015.[10] Reshef Meir, Omer Lev, and Jeffrey S. Rosenschein. A local-dominance theory of votingequilibria. In
ACM-EC’14 , pages 313–330, 2014.[11] Samuel Merrill. Strategic decisions under one-stage multi-candidate voting systems.
Public Choice , 36(1):115–134, 1981.[12] Roger B. Myerson and Robert J. Weber. A theory of voting equilibria.
The AmericanPolitical Science Review , 87(1):102–114, 1993.[13] David Martin Powers. Evaluation: from precision, recall and f-measure to roc, in-formedness, markedness and correlation.
International Journal of Machine LearningTechnology , 2(1):37.[14] Annemieke Reijngoud and Ulle Endriss. Voter response to iterated poll information.In , pages 635–644, 2012.[15] William H Riker and Peter C Ordeshook. A theory of the calculus of voting.
Americanpolitical science review , 62(1):25–42, 1968.[16] Maor Tal, Reshef Meir, and Ya’akov (Kobi) Gal. A study of human behavior in onlinevoting. In
Proceedings of the 2015 International Conference on Autonomous Agents andMultiagent Systems, AAMAS 2015, Istanbul, Turkey, May 4-8, 2015 , pages 665–673,2015. Full version available from https://tinyurl.com/yczxugoj .17] Marcelo Tyszler and Arthur Schram. Information and strategic voting.
Experimentaleconomics , 19(2):360–381, 2016.[18] Karine Van der Straeten, Jean-Fran¸cois Laslier, Nicolas Sauger, and Andr´e Blais.Strategic, sincere, and heuristic voting under four election rules: an experimental study.
Social Choice and Welfare , 35(3):435–472, 2010.
A Additional Results

scenario   AU     LD     LD+LB  CV     Prag   TMG    NN
A,B        0.903  0.903  0.903  0.903  0.903  0.903  0.922
C          0.870  0.389  0.819  0.389  0.703  0.693  0.741
D          0.784  0.657  0.719  0.657  0.730  0.657  0.733
E          0.813  0.676  0.795  0.730  0.766  0.659  0.744
F          0.728  0.525  0.609  0.533  0.640  0.411  0.638
total      0.841  0.671  0.795  0.708  0.781  0.720  0.784

Table 3: Upper bounds on the performance of each model.

f-measure  total  scenario C  scenario D  scenario E  scenario F
n < 10     0.775  0.747       0.767       0.735       0.534
n ≈ 100    0.739  0.710       0.653       0.746       0.588

B Black box classifier
The "decision model" M_NN is a single-hidden-layer feed-forward neural network classifier. It consists of an input layer whose nodes represent the features in the dataset, a "hidden" layer, and an output layer that uses the softmax activation function to output a classification into one of the possible classes in C. Data flows in one direction from input to output, with no recurrences, in contrast to recurrent neural networks. The input nodes are connected to the output nodes through the hidden-layer nodes (neurons) by weighted edges. The learning algorithm is supervised: it takes a set of class-labeled records and iteratively adjusts the weights on the edges by comparing the output for each input record to its class label. Because training is iterative, the model can easily be updated with records it has not yet seen. We use a configuration of 3 units in the hidden layer. In our domain the vote records consist of raw and generated features; some of the generated features are normalized by the number of votes in the poll configuration. The class label of each record is the selected preference of the voter, which is one of {Q, Q′, Q′′}. Using feature selection techniques we selected the following features:

(a) Poll and preference information:
candidates' poll votes, normalized poll gaps between candidates, the preference order, the normalized gap between the leader and the most preferred candidate, and the scenario, which is the combination of the preference order and the poll information.

(b) Voter information: A-ratios, which are the number of rounds the voter selected action A divided by the number of rounds it was available, where A is the action we determine from the selected preference (Q, Q′ or Q′′) and the order of the preferences in the poll (namely the scenario). We also use the voter type feature, which is one of {TRT, LB, OTHER} and is determined by threshold values over the A-ratios (for example, if the TRT-ratio exceeds the threshold, the type is TRT, i.e., Truthful).

Figure 7: A scatter plot of the α and β parameters of each of the 595 voters.

Figure 8: Distribution of the α parameter for n < 10 (left) and n ≥ 100 (right).

Figure 9: Distribution of the β parameter for n < 10 (left) and n ≥ 100 (right).

Roy Fairstein, Ben Gurion University, Israel. Email: [email protected]
Adam Lauz, Ben Gurion University, Israel. Email: [email protected]
Kobi Gal, Ben Gurion University, Israel. Email: [email protected]
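As an illustration of the architecture described in Appendix B, a single-hidden-layer feed-forward classifier with 3 hidden units and a softmax output over the three classes {Q, Q′, Q′′} can be sketched as below. This is a hedged re-implementation: the feature count, the randomly initialized weights, and the tanh hidden activation are placeholder assumptions, not the trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

n_features = 5   # placeholder: depends on the selected features
n_hidden = 3     # as in the text: 3 hidden units
n_classes = 3    # one class per possible vote: Q, Q', Q''

# Randomly initialized weights stand in for the trained parameters.
W1 = rng.normal(scale=0.1, size=(n_features, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_classes))
b2 = np.zeros(n_classes)

def softmax(z):
    z = z - z.max()  # shift by the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def predict(x):
    """One feed-forward pass: input -> hidden (tanh assumed) -> softmax."""
    h = np.tanh(x @ W1 + b1)
    return softmax(h @ W2 + b2)

probs = predict(rng.normal(size=n_features))
print(probs)  # a probability distribution over {Q, Q', Q''}, summing to 1
```

Training would adjust W1, b1, W2, b2 by comparing each record's output distribution to its class label, as the supervised procedure in the text describes.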