Modelling the Impact of Scandals: the case of the 2017 French Presidential Election
Yassine Bouachrine and Carole Adam
This is an ENSIMAG internship report originally written in June 2019 by intern Yassine Bouachrine under the supervision of Carole Adam.
Abstract
This paper proposes an agent-based simulation of a presidential election, inspired by the French 2017 presidential election. The simulation is based on data extracted from polls, media coverage, and Twitter. The main contribution is to consider the impact of scandals and media bashing on the result of the election. In particular, it is shown that scandals can lead to higher abstention at the election, as voters have no relevant candidate left to vote for. The simulation is implemented in Unity 3D and is available to play online.
Keywords: agent-based simulation, computational social choice, voting models
During the 2017 French presidential election, the media played a major role in shifting public opinion away from the election's favorite, François Fillon. The seriousness of the accusations against the candidate led to Fillon plummeting in the polls. We try to model the impact of both conventional and social media through scandal diffusion, in order to better understand the dynamics underlying the voting process.

There are a variety of existing models for the voting process. However, most of them see voting as a discrete event, while it is actually a sample, at a given instant, of a system in continuous evolution. This paper tries to shed some light on improvements that could take into account the inherent dynamism that comes with the interactions of the voters.

Another issue is that most computer voting models are adapted to the American context. In France, multiple candidates participate in the first round of the election, but in reality few of them are actually considered by the voters, even when they are aligned with the voters' ideals. This is due to the two-round nature of the voting process in France, which leads voters to cast their vote strategically for candidates who have a chance of making it out of the first round.

The Voter Autrement experiment (https://vote.imag.fr/) explored the effects of this strategic voting in 2017 by testing various voting systems during the presidential election. The results showed that the alternative voting methods yield vastly different results (in terms of who is elected), especially for candidates such as J-L. Mélenchon and B. Hamon (systematic improvement) or F. Fillon and M. Le Pen (systematic decline). Other voting methods such as candidate ranking make the strategic vote useless, since every candidate gets a chance to make it to the second round.

Finally, the context of the 2017 presidential election was unusual with regard to the state of the French political scene. It was the first presidential election since the split of the centre-right party (UMP), and it took place amid the overall dissatisfaction of the population with the French Socialist Party (PS, left wing). This is elaborated on in the Voting models section, as it is relevant to studying the impact of such a context on the existing models.

Our goal here is to model the impact of scandals over the course of the election and the change in the opinion of the voters. Last year, A. Soutif [13] tried to model the election process using an agent-based model by feeding the agents the results of the polls reported by the media. His goal in that study was to model the impact of the polls on the votes, and on the strategic vote in particular, as polls give information on which candidates have a chance of making it out of the first round. Our approach aims to complement this work by showing the additional impact of media through the diffusion of scandals during the campaign.

The first part of the paper (Section 2) focuses on the data analysis upon which the model is built. The second part of the paper (Section 3) addresses voting models and our implementation of the suggested model.
Building a model requires the availability of sufficient and relevant data about the phenomenon we are trying to model.

In our case, we took the aggregated poll results from various polling organizations [14], shown in Figure 1, and interpolated linearly where data was missing (mostly for B. Hamon in the early polls). We then compared the results of these polls with the evolution of the presence of the candidates across traditional media and Twitter.

Figure 1: Evolution of the polls during the elections (Source: Wikipedia [15])

Google Trends (https://trends.google.fr/trends/) is a Google website that offers statistics on Google Search queries. We used it to evaluate the search queries associated with the top five candidates in the News section; the results are shown in Figure 2. An interesting point is how F. Fillon dominates the media presence in the early months of the election, alongside B. Hamon, whose presence we can ignore since it is mostly due to the French Socialist Party presidential primary.

We can see that media coverage of F. Fillon spikes at multiple points in time (Figure 3), which is likely due to "Penelopegate", a scandal associated with the alleged fictitious employment of members of Fillon's family. Further, the media frenzy surrounding "Penelopegate" looks correlated with the evolution of the polls, as shown in Figure 3.

… Le Canard Enchaîné that led to him dropping significantly, whereas in M. Le Pen's case, it is the poll results that led to an increase in coverage, and then her poll results dropped as a result of this increased coverage.
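The linear interpolation of missing poll values mentioned at the start of this section can be sketched as follows. This is a minimal illustration and not the code used for the report: the function name `fill_missing_linear`, the use of `None` for a missing score, and the choice to reuse the nearest known value at the edges of the series are all assumptions.

```python
def fill_missing_linear(days, scores):
    """Fill None entries of `scores` by linear interpolation over `days`.

    `days` must be sorted in increasing order. Gaps before the first or
    after the last known score reuse the nearest known value (assumption).
    """
    known = [(d, s) for d, s in zip(days, scores) if s is not None]
    if not known:
        raise ValueError("at least one known score is required")

    filled = []
    for d, s in zip(days, scores):
        if s is not None:
            filled.append(s)
            continue
        before = [(kd, ks) for kd, ks in known if kd < d]
        after = [(kd, ks) for kd, ks in known if kd > d]
        if before and after:
            # Straight line between the two surrounding known polls
            (d0, s0), (d1, s1) = before[-1], after[0]
            filled.append(s0 + (s1 - s0) * (d - d0) / (d1 - d0))
        else:
            # No known value on one side: copy the nearest known score
            filled.append((before[-1] if before else after[0])[1])
    return filled
```

For a candidate absent from one early poll, a call such as `fill_missing_linear([0, 1, 2, 4], [10.0, None, 14.0, 18.0])` places the missing value on the line between its two neighbours.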
The limitation of the information we can leverage from Google Trends is that it tells us nothing about the nature or content of the coverage. We cannot know for sure whether the increasing number of articles is mostly positive or negative. We could deduce that a posteriori by observing the impact on the polls, but that is what we are actually trying to model. Therefore, we used Twitter data and analyzed the tweets during the primary round to try to visualize opinion trends during the election.
Two datasets were used in our work:
• The first one is from Kaggle, an online exchange platform for data scientists where users can publish datasets, among other things. This Kaggle dataset contains tweets sampled during the elections. It is very rich, but the data collection rate varies and some tweets appear to be truncated. More details about this dataset can be found at the source.
• The other dataset is courtesy of E. Duble, research engineer at LIG. It contains an anonymized collection of geotagged tweets that mention the top hashtags used during the election. The importance of hashtags has been shown, for instance, by the Politoscope project [10].

2.2 Clustering

Building a vectorial representation of the tweets can be done in various ways. M. Campr and K. Ježek [3] provided a performance evaluation of various methods for paragraph vectorization. At first, we opted for Tweet2Vec [6], which relies on character-based representations (as opposed to word representations for Doc2Vec [8]) that perform better for content such as tweets. However, Tweet2Vec uses hashtag prediction to train the model, which is limiting for our use case since we already have a restricted number of hashtags. It also takes longer to train than word-based models.

We used Facebook fastText to generate word embeddings, and enhanced them with their term frequency-inverse document frequency (TF-IDF, [12]) scores in the corpus. We then computed the tweet embeddings as the average of the word embeddings.

Performing Principal Component Analysis (PCA, [16]) on the 100-dimensional tweet embeddings did not yield very good results, as the embeddings are already built to minimize collinearity.

In this section, we provide an overview of existing models and their limitations before presenting the model we built through the observation of the data.
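As an illustration of the tweet-embedding step of Section 2.2, the sketch below scales each word vector by a TF-IDF score and averages the result per tweet. It is not the report's code. It assumes that "enhancing" an embedding means multiplying it by its TF-IDF score, that the average is taken over the distinct known words of the tweet, and that the IDF is the plain `log(N / df)` form. The dictionary `word_vectors` is a hypothetical stand-in for vectors obtained from a trained fastText model.

```python
import math
from collections import Counter


def idf_weights(tokenized_tweets):
    """Inverse document frequency log(N / df) for every word in the corpus."""
    n_docs = len(tokenized_tweets)
    doc_freq = Counter(w for tweet in tokenized_tweets for w in set(tweet))
    return {w: math.log(n_docs / df) for w, df in doc_freq.items()}


def tweet_embedding(tokens, word_vectors, idf):
    """Average of the tweet's word vectors, each scaled by its TF-IDF score.

    Words without a vector are skipped; returns None if no word is known.
    """
    counts = Counter(t for t in tokens if t in word_vectors)
    if not counts:
        return None
    dim = len(next(iter(word_vectors.values())))
    total = [0.0] * dim
    for word, count in counts.items():
        tf = count / len(tokens)            # term frequency within the tweet
        weight = tf * idf.get(word, 0.0)    # TF-IDF score of the word
        for i, x in enumerate(word_vectors[word]):
            total[i] += weight * x
    return [x / len(counts) for x in total]
```

With this plain IDF, a word that appears in every tweet gets a weight of zero, so very common tokens do not contribute to the tweet vector. Smoothed IDF variants avoid that if it is not desired.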
A taxonomy of existing models is outside the scope of this paper; good resources are available for that [1, 9, 7]. We focus on some limitations of the existing models, starting with the psychological model. As hinted at in the introduction, the context of the election makes partisan identification hard to rely on: the schism of the centre-right party reshaped the political scene entirely. Also, the overall dissatisfaction with F. Hollande hurt the Socialist Party. Towards the end of his term, his popularity rating was lower than Macron's was during the Gilets Jaunes protests [11]. On top of that, the appearance of new actors such as En Marche further shook the scene. En Marche made retrospective voting irrelevant, as the party had never held office. The novelty Macron brought to the table, and his ambition of uniting the political parties, gave him a considerable advantage.

All of these circumstances made the election very volatile. Even more sophisticated models such as the funnel of causality have to be rethought. Media has to have a bigger role in this funnel, especially social media, as it has been shown to be a good indicator of standings [2], almost as good as traditional polls. And …

Figure 5: Funnel of causality (source: [4])

Facebook fastText: https://fasttext.cc/
Our model is an enhancement of proximity models, in order to take into account the diffusion of scandalsand the movement of neighboring agents.
The environment is a 100 × 100 units 2D plane. There are two types of active agents: candidates and voters.

For the simulation, we define:
• The appeasement delta $\Delta_\alpha \in [0, \ldots]$, the rate at which the repulsion of the candidate diminishes
• The falloff rate for the potential of the scandals $\Delta_\rho \in [0, \ldots]$
• The maximum openness for the voters $\sigma_{max} \in [0, \ldots]$, which defines how far a voter considers his surroundings
• The maximum tolerance for the voters $\theta_{max} \in [0, +\infty[$

For candidates $C$, we define:
• Position at time $t$ as $\psi_t \in [0, \ldots]$, initialized manually
• Repulsion at time $t$ as $\gamma_t \in [0, \ldots]$ with $\gamma_0 = 0$
• A list $S$ of scandals, with $S_i$ being the $i$-th one

For voters $V$, we define:
• Position at time $t$ as $\psi_t \in [0, \ldots]$ with $\psi_0 \sim U(0, \ldots)$ (distance to a candidate is inversely proportional to agreement with this candidate)
• Openness as $\sigma \in [0, \sigma_{max}]$ with $\sigma \sim \mathrm{clamp}_{[0, \ldots]}(\mathcal{N}(\ldots, \ldots)) \, \sigma_{max}$ (the radius in which a voter considers agents around him)
• Charisma as $\kappa \in [0, \ldots]$ with $\kappa \sim \mathrm{clamp}_{[0, \ldots]}(\mathcal{N}(\ldots, \ldots))$ (the chance to influence others around)
• Tolerance as $\theta \in [0, \ldots]$ with $\theta \sim \mathrm{clamp}_{[0, \ldots]}(\mathcal{N}(\ldots, \ldots)) \, \theta_{max}$ (the threshold for repulsion before dismissing a candidate completely)
• Conformity as $\eta \in [0, \ldots]$ with $\eta \sim \mathrm{clamp}_{[0, \ldots]}(\mathcal{N}(\ldots, \ldots))$

For scandals, we define:
• Potential at time $t$ as $\rho_t \in [0, \ldots]$ with $\rho_0$ initialized manually by the user (parameter in the simulation)

3.2.3 Simulation update

At each time-step of the simulation, we update the entities. To simplify the equations, we assume that the values are clamped to their domain.

For scandals, the potential decreases with time, at the falloff rate:

$$\rho_{t+1}(x) = \rho_t(x) - \Delta_\rho \qquad (1)$$

For candidates, the position is static, and the repulsion increases with each scandal and decreases with time at the pace set by the appeasement delta:

$$\gamma_{t+1}(x) = \gamma_t(x) - \Delta_\alpha + \sum_{y \in S(x)} \rho_{t+1}(y) \qquad (2)$$

For voters, the position evolves as their opinion about the different candidates evolves based on their surroundings:

$$\psi_{t+1}(x) = \psi_t(x) + \operatorname*{argmin}_{y} \left\{ \|\psi_t(x) - \psi_t(y)\| \;\middle|\; y \in C \wedge \gamma_{t+1}(y) < \theta(x) \right\} \frac{1}{1 + \gamma_{t+1}(y)} + \eta \sum_{y \in V \wedge \|\psi_t(x) - \psi_t(y)\| < \sigma(x)} \kappa(y) \left( \psi_t(x) - \psi_t(y) \right) \qquad (3)$$

When the simulation stops, each voter votes for the closest candidate still considered in the openness radius around him. If there are none, the voter withholds his vote (chooses abstention).

The simulation is built in Unity 2018.3.5f1 (Unity 3D: https://unity.com/). It allowed for faster prototyping and also supports a wide range of platforms to run the simulation on. The simulation is available to play online at http://lig-tdcge.imag.fr/votsim/ or to download as a WebGL export (https://ensiwiki.ensimag.fr/index.php?title=IRL_-_Modélisation_de_la_dynamique_des_opinions_des_électeurs).

A first screen lets the user select the values of the global parameters of the simulation (Figure 6): the number of voters and candidates, the appeasement delta (rate at which the scandals decrease, which determines the duration of their effect on opinions), and the maximum values for tolerance and openness (individual values of all agents are then set randomly under this boundary).

Figure 6: Parameter selection before starting the simulation

On the next screen, the user can modify the simulation speed. Voters are moving in the environment towards or away from the candidates. The user can also trigger a scandal and choose its intensity and …
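The scandal and candidate update rules, equations (1) and (2), can be sketched in Python as below. This is only an illustration and not the Unity implementation: the class and function names are hypothetical, and clamping every value to [0, 1] is an assumption, since the text only states that values are clamped to their domain.

```python
from dataclasses import dataclass, field


def clamp(value, lo=0.0, hi=1.0):
    """Clamp value to [lo, hi]; the default bounds are an assumption."""
    return max(lo, min(hi, value))


@dataclass
class Scandal:
    potential: float  # rho_t, chosen by the user when the scandal is triggered


@dataclass
class Candidate:
    repulsion: float = 0.0  # gamma_t, starts at 0
    scandals: list = field(default_factory=list)  # S(x), the candidate's scandals


def step(candidates, delta_alpha, delta_rho):
    """Advance all scandals and candidates by one time-step."""
    for cand in candidates:
        # Equation (1): each scandal loses potential at the falloff rate
        for scandal in cand.scandals:
            scandal.potential = clamp(scandal.potential - delta_rho)
        # Equation (2): repulsion decays by the appeasement delta and grows
        # by the updated potential of every scandal attached to the candidate
        pressure = sum(s.potential for s in cand.scandals)
        cand.repulsion = clamp(cand.repulsion - delta_alpha + pressure)


cand = Candidate(scandals=[Scandal(potential=0.5)])
for _ in range(3):
    step([cand], delta_alpha=0.1, delta_rho=0.2)
    print(round(cand.repulsion, 3))  # prints 0.2, then 0.2, then 0.1
```

In this run the repulsion first rises while the scandal still has potential, then starts to fall once the scandal has faded and only the appeasement term remains.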
Regarding the motivations behind the model, we wanted to enhance the existing models with the observations made from our data analysis. First, regarding the initialization, the justification for the uniform distribution of the voters' positions is the unique context of the 2017 election that we detailed earlier, with many voters not knowing which parties to consider. A fine-tuned Gaussian mixture model could also be explored.

Most of the reasoning behind the model is based on the reactions to Penelopegate, with some voters completely turning their backs on F. Fillon (which we model by a scandal being above their tolerance threshold) and some only showing hesitation (tolerance threshold not reached). A temporary dip followed by a partial recovery in the polls supports the model: regardless of the further media bashing around the event, voters have a threshold over which further coverage has no effect.

Over the simulated scenarios, one of the most interesting observations is that scandals tend to be associated with an increased abstention rate. In our model this is represented by the voter moving too far from all candidates (due to repulsion generated by scandals about their favourite candidates, or to diverging opinions with the others), so that no valid candidate is still present in the openness radius when the election comes; in that case the voter prefers to choose abstention. Our model can therefore reproduce and explain the 2017 presidential election's high abstention rate in the first round, at 22.23% [5].
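The abstention mechanism described above can be illustrated with a small sketch of the final voting rule. It is an interpretation and not the simulator's code: the function name `cast_vote` and the tuple layout for candidates are hypothetical, and a candidate is taken to be "still considered" when it lies within the voter's openness radius and its repulsion is below the voter's tolerance.

```python
import math


def cast_vote(voter_pos, openness, tolerance, candidates):
    """Return the name of the chosen candidate, or None for abstention.

    `candidates` is an iterable of (name, position, repulsion) tuples,
    with positions given as (x, y) pairs.
    """
    best_name, best_dist = None, math.inf
    for name, pos, repulsion in candidates:
        dist = math.dist(voter_pos, pos)
        in_radius = dist < openness        # candidate is within the openness radius
        tolerated = repulsion < tolerance  # scandals have not exceeded tolerance
        if in_radius and tolerated and dist < best_dist:
            best_name, best_dist = name, dist
    return best_name
```

A voter whose only nearby candidate has become too repulsive, and whose other options lie outside the openness radius, gets `None`, which corresponds to abstention in the model.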
We have seen that media coverage of the campaign scandals can have a big impact on the election results. The simulation showed that scandals can totally shape the result of the election and that scandals benefit the candidates closest on the political spectrum. The more interesting finding was how scandals impact the abstention rate, which is in agreement with the observations made in the context of the French 2017 presidential election and the high abstention rate recorded.

There is still much to do to reach a unified model; a first step in that direction would be enhancing the simulation with the results from A. Soutif's experiments regarding the impact of the polls [13]. We could then initialize the model to match the French political scene at the beginning of the first round and test whether it corresponds to the observed election results. If the model is validated, we could explore alternative scenarios for the election: how different scandals could have led to different results, and particularly what would have happened if there had been no scandals involving the pre-campaign favorite F. Fillon.
References
[1] Rui Antunes. Theoretical models of voting behaviour. Exedra, 4(1):145–70, 2010.
[2] David Anuta, Josh Churchin, and Jiebo Luo. Election bias: Comparing polls and Twitter in the 2016 US election. arXiv preprint arXiv:1701.06232, 2017.
[3] Michal Campr and Karel Ježek. Comparing semantic models for evaluating automatic document summarization. In International Conference on Text, Speech, and Dialogue, pages 252–260. Springer, 2015.
[4] Russell J Dalton et al. Citizen politics in Western democracies: Public opinion and political parties in the United States, Great Britain, West Germany, and France. Chatham House, Chatham, NJ, 1988.
[5] Laurent de Boissieu. Participation et abstention aux élections. France politique, 2016.
[6] Bhuwan Dhingra, Zhong Zhou, Dylan Fitzpatrick, Michael Muehl, and William W Cohen. Tweet2vec: Character-based distributed representations for social media. Technical report, arXiv, 2016. https://arxiv.org/abs/1605.03481.
[7] Marco Giugni. Theoretical models of voting behaviour. https://baripedia.org/wiki/Theoretical_models_of_voting_behaviour, 2011.
[8] Jey Han Lau and Timothy Baldwin. An empirical evaluation of doc2vec with practical insights into document embedding generation. arXiv preprint arXiv:1607.05368, 2016.
[9] Michael S Lewis-Beck and Mary Stegmaier. Economic models of voting. In The Oxford handbook of political behavior. Oxford, 2007.
[10] Maziyar Panahi, Noe Gaumont, and David Chavalarias. How political communities are using hashtags against their opponents in upcoming French presidential election. https://politoscope.org/2017/04/hashtags-matter-presidentielle2017/.
[11] Opinion Publique. Popularité de François Hollande : Chiffres et analyses de la popularité de François Hollande et de son gouvernement. Opinion Publique: Sondages publiés et analyses sur les élections, la politique et les sujets de société, 2014. https://opinionpublique.wordpress.com/category/popularite-de-francois-hollande/.
[12] Juan Ramos et al. Using tf-idf to determine word relevance in document queries. In Proceedings of the first instructional conference on machine learning, volume 242.1, pages 29–48, 2003.
[13] Albin Soutif. Simulation multi-agent de systèmes de vote. Technical report, ENSIMAG. Projet d'Introduction à la Recherche en Laboratoire, encadré par Carole Adam et Sylvain Bouveret, 2017. https://ensiwiki.ensimag.fr/index.php?title=Albin_Soutif_:_Simulation_multi-agent_de_syst%C3%A8mes_de_vote.
[14] Wikipedia. Opinion polling for the 2017 French presidential election. https://en.wikipedia.org/wiki/Opinion_polling_for_the_2017_French_presidential_election, 2017.
[15] Wikipedia. Évolution des intentions de vote au premier tour de l'élection. https://commons.wikimedia.org/wiki/File:Evolution_des_intentions_de_vote_à_l’élection_présidentielle_2017.png, 2017.
[16] Svante Wold, Kim Esbensen, and Paul Geladi. Principal component analysis. Chemometrics and intelligent laboratory systems, 2(1-3):37–52, 1987.