Nowcasting the Stance of Social Media Users in a Sudden Vote: The Case of the Greek Referendum
Adam Tsakalidis (University of Warwick; The Alan Turing Institute)
Nikolaos Aletras (University of Sheffield)
Alexandra I. Cristea (Durham University; University of Warwick)
Maria Liakata (University of Warwick; The Alan Turing Institute)
ABSTRACT
Modelling user voting intention in social media is an important research area, with applications in analysing electorate behaviour, online political campaigning and advertising. Previous approaches mainly focus on predicting national general elections, which are regularly scheduled and where data of past results and opinion polls are available. However, there is no evidence of how such models would perform during a sudden vote under time-constrained circumstances. That poses a more challenging task compared to traditional elections, due to its spontaneous nature. In this paper, we focus on the 2015 Greek bailout referendum, aiming to nowcast on a daily basis the voting intention of 2,197 Twitter users. We propose a semi-supervised multiple convolution kernel learning approach, leveraging temporally sensitive text and network information. Our evaluation under a real-time simulation framework demonstrates the effectiveness and robustness of our approach against competitive baselines, achieving a significant 20% increase in F-score compared to solely text-based models.
CCS CONCEPTS
• Information systems → Web mining; Social networks; • Applied computing → Voting / election technologies; • Social and professional topics → User characteristics

KEYWORDS
social media; Greek referendum; natural language processing; multiple kernel learning; convolution kernels; Twitter; polarisation

1 INTRODUCTION
Predicting user voting stance and final results in elections using social media content is an important area of research in social media analysis [21, 39], with applications in online political campaigning and advertising [11, 25]. It also provides political scientists with tools for qualitative analysis of electoral behaviour on a large scale [2]. Previous approaches mainly focus on predicting national general elections, which are regularly scheduled and where data of past results and opinion polls are available [33, 60]. However, there is no evidence of how such models would work during a sudden and major political event under time-constrained circumstances. That forms a more challenging task compared to general elections, due to its spontaneous nature [35]. Building robust methods for inferring the voting intention of social media users under such circumstances is important for political campaign strategists and decision makers.

Our work focuses on nowcasting the voting intention of Twitter users in the 2015 Greek bailout referendum, which was announced on June 27th. We aim to predict each user's stance (YES/NO) at different time points during the entire pre-electoral period. For this purpose, we collect a large stream of tweets in Greek and manually annotate a set of users for testing. We also collect a set of users for training via distant supervision. We predict the voting intention of the test users during the eight-day period until the day of the referendum with a multiple convolution kernel learning model. The latter allows us to leverage both temporally sensitive textual and network information. Collecting all the available tweets written in Greek enables us to study user language use and network dynamics in a complete way. We demonstrate the effectiveness and robustness of our approach, achieving a significant 20% increase in F-score against competitive text-based baselines.
We also show the importance of combining text and network information for inferring users' voting intention. Our paper makes the following contributions:
• We present the first systematic study on nowcasting the voting intention of Twitter users during a sudden and major political event.
• We demonstrate that network and language information are complementary, by combining them with multiple convolution kernels.
• We highlight the importance of the temporal modelling of text for capturing the voting intention of Twitter users.
• We provide qualitative insights on the political discourse and user behaviour during this major political crisis.
2 RELATED WORK
Most previous work on predicting electoral results focuses on forecasting the final outcome. Early approaches based on word counts [63] fail to generalize well [21, 29, 39]. Lampos et al. [33] presented a bilinear model based on text and user information, using opinion polls as the target variable. Tsakalidis et al. [60] similarly predicted the election results in different countries using Twitter and polls, while others used sentiment analysis methods and past results [8, 10, 44, 53]. More recently, Swamy et al. [56] presented a method to forecast the results of the latest US presidential election from user predictions on Twitter. The key difference between our task and this strand of previous work lies in its spontaneous and time-sensitive nature. Incorporating opinion polls or past results is not feasible, due to the time-constrained referendum period and the lack of previous referendum cases, respectively. Previous work on predicting the outcomes of referendums [9, 22, 36] is also different to our task, since they do not attempt to predict a single user's voting intention, but rather make use of aggregated data coming from multiple users to predict the voting share of only a few test instances.

On the user level, most past work has focused on identifying the political leaning (left/right) of a user. Early work by Rao et al. [50] explored the linguistic aspect of the task; follow-up work has also incorporated features based on the user's network [1, 16, 46, 65], leading to improvements in performance. However, most of this work predicts the (static) political ideology of clearly separated groups of users who are either declaring their political affiliation in their profiles, or following specific accounts related to a political party.
This has been demonstrated to be problematic when applying such models on users that do not express political opinion [12]. Preoţiuc-Pietro et al. [49] proposed instead a non-binary, seven-point scale for measuring the self-reported political orientation of Twitter users, showcasing that the task is more difficult for users who are not necessarily declaring their political ideology. Our work goes beyond political ideology prediction, by simulating a real-world setting on a dynamically evolving situation for which there is no prior knowledge.

A smaller body of research has focused on tasks that go beyond the classic left/right political leaning prediction. Fang et al. [19] predicted the stance of Twitter users in the 2014 Scottish Independence referendum by analysing topics in related online discussions. In a related task, Zubiaga et al. [66] classified user stance in three independence movements, while Stewart et al. [55] analysed user linguistic identity in the Catalan referendum. Albeit relevant, none of these works have actually studied the problem under a real-time evaluation setting, or during a sudden event where the time between announcement and voting day is extremely limited (e.g., less than two weeks). Previous work on social media analysis during the Greek bailout referendum [4, 40] has not studied the task of inferring user voting intention, whereas most of the past work on opinion mining in social media in the Greek language has focused primarily on tasks related to sentiment analysis [30, 45, 61]. To the best of our knowledge, this is the first work to (a) infer user voting intention under sudden circumstances and a major political crisis; and (b) model user information over time under such settings.

3 THE GREEK BAILOUT REFERENDUM
The period of the Greek economic crisis before the bailout referendum (2009-2015) was characterized by extreme political turbulence, when Greece faced six straight years of economic recession and five consecutive years under two bailout programs [62]. Greek governments agreed to implement austerity measures, in order to secure loans and avoid bankruptcy – a fact that caused massive unrest and demonstrations. During the same period, political parties regardless of their side on the left-right political spectrum were divided into pro-austerity and anti-austerity camps, while the traditional two-party system received a big blow [6, 52, 58].

The Greek bailout referendum was announced on June 27th, asking citizens whether they agreed (YES/NO) with the new bailout deal proposed by the Troika (a decision group formed by the European Commission, the European Central Bank and the International Monetary Fund to deal with the Greek economic crisis) to the Greek Government, in order to extend its credit line. The final result was 61.3%-38.7% in favor of the NO vote. For more details on the Greek crisis, refer to Tsebelis [62].

4 TASK DEFINITION
Our aim is to classify a Twitter user either as a YES or a NO voter in the Greek bailout referendum over the eight-day period starting right before its announcement (26/6, day 0) and ending on the last day before it took place (4/7, day 8). We assume a training set of users D_t = {(x_t^(1), y^(1)), ..., (x_t^(n), y^(n))}, where x_t^(i) is a representation of user i up to time step t ∈ [0, ..., 8] and y^(i) ∈ {YES, NO}. Given D_t, we want to learn a function f_t that maps a user j to her or his stance ŷ^(j) = f_t(x_t^(j)) at time t. Then, we update our model with new information shared by the users in our training set up to t+1, to predict the test users' voting intention at t+1. Therefore, we mimic a real-time setup, where we nowcast user voting intention, starting from the moment before the announcement of the referendum, until the day of the referendum. Sections 5 and 6 present how we develop the training dataset D_t and the function f_t, respectively.

5 DATA
Using the Twitter Streaming API during the period 18/6–16/7, we collected 14.62M tweets in Greek (from 304K users) containing at least one of 283 common Greek stopwords (as per Twitter Streaming API limitations: https://developer.twitter.com/en/docs/basics/rate-limiting), starting eight days before the announcement of the referendum and stopping 11 days after the referendum date (see Figure 1). This provides us with a rare opportunity to study the interaction patterns among the users in a rather complete and unbiased setting, as opposed to the vast majority of past works, which track event-related keywords only. For example, Antonakaki et al. [4] collected 0.3M tweets using popular referendum-related hashtags during 25/06–05/07 – we have collected 6.4M tweets during the same period. In the rest of this section, we provide details on how we processed the data in order to generate our training set in a semi-supervised way (5.1) and how we annotated the users that were used as our test set (5.2).
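The daily nowcasting protocol described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the dictionary keys, the `train_model` callable and the per-day user representations are hypothetical placeholders.

```python
def nowcast(train_users, test_users, train_model, days=range(9)):
    """Daily nowcasting loop (sketch): at each day t, refit f_t on the
    training users' information observed up to t, then classify every
    test user who has tweeted at least once up to t."""
    predictions = {}
    for t in days:
        # training-set representations and labels up to day t
        X = [u["repr_by_day"][t] for u in train_users]
        y = [u["label"] for u in train_users]
        f_t = train_model(X, y)  # model is retrained every day
        # only test users with at least one tweet so far are classified
        active = [u for u in test_users if u["first_tweet_day"] <= t]
        predictions[t] = {u["id"]: f_t(u["repr_by_day"][t]) for u in active}
    return predictions
```

With a trivial stand-in classifier, the loop produces per-day predictions only for users active up to that day.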
5.1 TRAINING SET
Manually creating a training set would have required annotating users based on their voting preference on an issue that they had not been aware of prior to the referendum announcement. However, the same does not hold for certain accounts (e.g., major political parties) whose stance on austerity had been known a priori, given their manifestos and previous similar votes in parliament [52]. Such accounts can be used as seeds to form a semi-supervised task, under the hypothesis that users who are re-tweeting a political party more often than others are likely to follow its stance in the referendum, once this is announced. Hence, we compile a set of 267 seed accounts (148 YES, 119 NO) focusing on the pre-announcement period, including: (1) political parties; (2) members of parliament (MPs); and (3) political party members.

Figure 1: Number of tweets in Greek per hour. The period highlighted in red indicates the nine evaluation time points (see Section 4).

• Political Parties: We add as seeds the Twitter accounts of nine major and minor parties with a known stance on austerity before the referendum (5 YES, 4 NO, see Table 1). We excluded KKE (Greek Communist Party), since an active official Twitter account did not exist at the time. We assume that the pro-austerity parties will back the bailout proposal (YES), while the anti-austerity parties will reject it (NO). The pro-/anti-austerity stance of the parties was known before the referendum, since the pro-austerity parties had already backed previous bailout programs in parliament or had a clear favorable stance towards them, whereas the opposite holds for the anti-austerity parties [52].
• MPs: The accounts of the (300) MPs of these parties were manually extracted and added as seeds. 153 such accounts were identified (82 YES, 71 NO), labelled according to the austerity stance of their affiliated party.
• Political Party Members: We finally compiled a set of politically related keywords to look up in Twitter user account names and descriptions (names/abbreviations of the nine parties and keywords such as "candidate"). We identified 257 accounts (133 YES, 124 NO), which were manually inspected by human experts to filter out irrelevant ones (e.g., the word "River" might not refer to the political party); we kept only those that had at least one tweet during the period preceding the announcement of the referendum (44 NO, 61 YES).

To expand the set of seed accounts, we calculate for every user u in our dataset during the pre-announcement period his/her score as:

    score(u) = PMI(u, YES) − PMI(u, NO),
Table 1: Political position, austerity stance, referendum stance and national election result (January 2015) of the political parties that are used as seeds in our modelling.

Party                                     Position      Auster.  Referend.  Jan 15 (%)
SYRIZA (ΣΥΡΙΖΑ)                           Left          anti     NO         36.34
New Democracy (Νέα Δημοκρατία)            Centre-right  pro      YES        27.81
Golden Dawn (Χρυσή Αυγή)                  Far-right     anti     NO         6.28
The River (Το Ποτάμι)                     Centre        pro      YES        6.05
Independent Greeks (Ανεξάρτητοι Έλληνες)  Right         anti     NO         4.75
PASOK (ΠΑΣΟΚ)                             Centre-left   pro      YES        4.68
KIDISO (ΚΙΔΗΣΟ)                           Centre-left   pro      YES        2.47
ANTARSYA (ΑΝΤΑΡΣΥΑ)                       Far-left      anti     NO         0.64
Creation Again (Δημιουργία Ξανά)          Centre-right  pro      YES        –
where PMI(u, lbl) is the pointwise mutual information between a certain user and the respective seeding class (YES/NO). A high (low) score implies that the user is often endorsing YES-related (NO-related) accounts, and is thus more likely to follow their stance after the referendum is announced. This approach has been successfully applied to other related natural language processing tasks, such as building sentiment analysis lexical resources using a pre-defined list of seed words [43]. To assign class labels to the users based on their scores, we set up a threshold tr = n · max(|scores|), with n ∈ [0, 1]. We assign the label YES to a user u if score(u) > tr, or NO if score(u) < −tr. Setting n = 0 would imply that we are assigning the label YES whenever the user has re-tweeted more YES-supporting accounts (and inversely), which might result in a low-quality training set, whereas higher values for n would imply a smaller (but higher-quality) training set. During development, we empirically set n so as to keep users who are fairly closer to one class than the other. From the final set of 5,430 users that have re-tweeted any seed account, 2,121 were kept (along with the seed accounts) as our training set (965 YES, 1,156 NO).

Table 2: Number of users (u) and tweets (t) used in our experiments per evaluation day (26/06–04/07).

5.2 TEST SET
For evaluation purposes, we generate a test set of active users that are likely to participate in political conversations on Twitter. First, we identify all users having tweeted at least 10 times after the referendum announcement (86,000 users).
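The seed-expansion scoring of Section 5.1 can be sketched as below. This is an illustrative toy version: the data structures (a retweet-count mapping and a seed-label mapping) are ours, and a zero PMI contribution is assumed when a user has never retweeted a class (the paper does not specify this edge case).

```python
import math

def pmi_score(retweets, user, seed_labels):
    """score(u) = PMI(u, YES) - PMI(u, NO) over retweet counts.
    `retweets` maps (user, seed_account) -> count; `seed_labels` maps
    seed account -> 'YES'/'NO'. Unseen (user, class) pairs contribute 0."""
    total = sum(retweets.values())
    user_total = sum(c for (u, s), c in retweets.items() if u == user)
    score = 0.0
    for lbl, sign in (("YES", 1), ("NO", -1)):
        joint = sum(c for (u, s), c in retweets.items()
                    if u == user and seed_labels[s] == lbl)
        lbl_total = sum(c for (u, s), c in retweets.items()
                        if seed_labels[s] == lbl)
        if joint:
            pmi = math.log((joint / total) /
                           ((user_total / total) * (lbl_total / total)))
            score += sign * pmi
    return score

def assign_label(score, tr):
    # keep only users clearly closer to one class than the other
    if score > tr:
        return "YES"
    if score < -tr:
        return "NO"
    return None  # user discarded from the training set
```

Thresholding with tr = n · max(|scores|) then keeps only confidently labelled users.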
From the 500 most popular hashtags in their tweets, we selected those that were clearly related to the referendum (189), which were then manually annotated with respect to potentially conveying the user's voting intention (e.g., "yesgreece", "no", as opposed to neutral ones, such as "referendum"). Finally, we selected a random sample of 2,700 users (out of 22K) that had used more than three such hashtags, to be manually annotated – without considering any user from the training set. This is standard practice in related work [20, 55] and enables us to evaluate our models on a high quality test set, as opposed to previous related work which relies on keyword-matching approaches to generate the test set [19, 66].

Two authors of the paper (Greek native speakers) annotated each of the users in the test set, using the tweets after the referendum announcement. Each annotator was allowed to label an account as YES, NO, or N/A, if uncertain. There was an agreement on 2,365 users, with Cohen's κ in the upper part of the 'substantial' agreement band [5] – and higher still if the N/A labels are not considered – revealing high quality in the annotations. We discarded all accounts labelled as N/A by an annotator and used the remaining accounts where the annotators agreed for the final test set, resulting in 2,197 users – similar test set sizes are used in related tasks [18]. The resulting NO/YES user distribution is imbalanced.
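Inter-annotator agreement of the kind reported above is typically computed as Cohen's κ; a minimal self-contained sketch (toy labels, not the paper's annotations):

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two annotators' label sequences:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    labels = set(a) | set(b)
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n          # observed
    p_e = sum((a.count(l) / n) * (b.count(l) / n)        # expected
              for l in labels)
    return (p_o - p_e) / (1 - p_e)
```

Dropping the N/A labels before computing κ, as done above, typically raises agreement when most disagreements involve the "uncertain" class.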
6 MODELS
Convolution kernels are composed of sub-kernels operating on the item level to build an overall kernel for the object level [14, 23], and can be used with any kernel-based model, such as Support Vector Machines (SVMs) [27]. Such kernels have been applied in various NLP tasks [14, 31, 37, 64]. Here we build upon the approach of Lukasik and Cohn [37] by combining convolution kernels operating on available (1) text; and (2) network information.

Let a, b denote two objects (e.g., social network users), represented by two M × N matrices Z_a and Z_b respectively, where M denotes the number of items representing the object and N the dimensionality of an item vector. For example, an item can be a user's tweet or network information. A kernel K between the two objects (users) a and b over Z_a and Z_b is defined as:

    K_z(a, b) = (1 / (|Z_a| |Z_b|)) Σ_{i,j} k_z(z_a^i, z_b^j),    (1)

where k_z is any standard kernel function, such as a linear or a radial basis function (RBF). One can also normalise K_z by dividing its entries K_z(i, j) by sqrt(K_z(i, i) K_z(j, j)).

The resulting kernel has the ability to capture the similarities across objects on a per-item basis. However, unless restricted to operate on consecutive items (time-wise), it ignores their temporal aspect. Given a set of associated timestamps T_o = {t_o^1, ..., t_o^N} for the items of each object o, Lukasik and Cohn [37] proposed to combine the temporal and the item aspects as:

    K_zt(a, b) = (1 / (|Z_a| |Z_b|)) Σ_{i,j} k_z(z_a^i, z_b^j) k_t(t_a^i, t_b^j),    (2)

where k_t is any valid kernel function operating on the timestamps of the items. Here, K_zt is a matrix capturing the similarities across users by leveraging both the information between pairs of items and their temporal interaction.
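Equations 1 and 2 can be sketched directly. The following toy version uses a linear item kernel and an RBF time kernel, matching the kernel choices reported in this section; it is an illustration under those assumptions, not the authors' code.

```python
import math

def conv_kernel(Za, Zb, Ta=None, Tb=None, gamma=0.5):
    """Convolution kernel between two users (sketch of Eqs. 1-2).
    Za, Zb: lists of item vectors (e.g., tweet embeddings);
    Ta, Tb: optional item timestamps enabling the temporal variant."""
    def k_lin(u, v):            # linear kernel on item vectors
        return sum(x * y for x, y in zip(u, v))
    def k_rbf(s, t):            # RBF kernel on timestamps
        return math.exp(-gamma * (s - t) ** 2)
    total = 0.0
    for i, zi in enumerate(Za):
        for j, zj in enumerate(Zb):
            k = k_lin(zi, zj)
            if Ta is not None:  # Eq. 2: weight by temporal similarity
                k *= k_rbf(Ta[i], Tb[j])
            total += k
    return total / (len(Za) * len(Zb))
```

Evaluating the function over all pairs of users yields the kernel matrix that is fed to the SVM.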
Text kernels. Let a, b denote two users in a social network, posting messages W_a = {w_a^1, ..., w_a^N} and W_b = {w_b^1, ..., w_b^M}, with associated timestamps T_a = {t_a^1, ..., t_a^N} and T_b = {t_b^1, ..., t_b^M}, respectively. We assume that a message w_i^j of user i at time j is represented by the mean k-dimensional embedding [41] of its constituent terms. This way, we obtain the text convolution kernels K_w and K_wt by simply replacing Z and z with W and w, respectively, in Equations 1 and 2. Following Lukasik and Cohn [37], we opted for a linear kernel operating on text and an RBF on time.

Network kernels. Let us assume a set of directed weighted graphs G = {G_1(N_1, E_1), ..., G_t(N_t, E_t)}, where G_i(N_i, E_i) represents the retweeting activity graph of the N_i users at a time point i ∈ T = {1, ..., t}. Let L_a ∈ R^{N×k}, L_b ∈ R^{M×k} denote the resulting matrices of a k-dimensional, network-based user representation for two users a and b across time. Contrary to the textual vector representation w_i^j, which is defined over a fixed space given a pre-defined vocabulary, user network vector representations (e.g., graph embeddings [57]) are computed at each time step on a different network structure. Thus, a standard similarity score between two user representations at time points t and t+1 cannot be used, since the network vector spaces are different. To accommodate this, at each time point t we calculate the median vectors L_t^YES and L_t^NO for each class of our training examples and update the respective user vector as:

    L*_{tu} = d(L_t^YES, L_{tu}) − d(L_t^NO, L_{tu}),

using some distance metric d (for simplicity, we opted for the linear distance). If a user has not retweeted, his/her original network representation l_{tu} is calculated as the average across all user representations at t. Finally, the network convolution kernels K_n and K_nt are computed using Equations 1 and 2, respectively, by simply replacing Z with L* and z with l*.
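The per-timestep network transform above can be sketched as follows. Euclidean distance is used here as a stand-in for the paper's "linear distance" (an assumption), and the helper names are ours.

```python
import math
from statistics import median

def median_vector(vectors):
    # element-wise median of a list of k-dimensional vectors
    return [median(col) for col in zip(*vectors)]

def network_feature(user_vec, yes_vecs, no_vecs):
    """L*_tu = d(L_t^YES, L_tu) - d(L_t^NO, L_tu): negative values mean
    the user sits closer to the YES class median in this timestep's
    embedding space, positive values closer to NO."""
    d_yes = math.dist(median_vector(yes_vecs), user_vec)
    d_no = math.dist(median_vector(no_vecs), user_vec)
    return d_yes - d_no
```

Because the feature is a signed difference of distances to class medians, it stays comparable across time steps even though each step's embedding space is different.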
Similarly to the text kernels, we use a linear kernel k_n for the network and an RBF kernel k_t for time.

Combining text and network kernels. We can combine the text and network convolution kernels by summing them: K_sum = K_w + K_wt + K_n + K_nt. This implies the simplistic assumption that the different information sources contribute equally to our target. While this might hold for a small number of carefully designed kernels, it lacks the ability to generalise over multiple kernels of potentially noisy representations.

Convolution kernel SVMs. Convolution kernels can be used with any kernel-based model; here, we use them with SVMs. First, an SVM_s operates on a single information source s ∈ {w, n}, i.e., SVM_w for text and SVM_n for network. Second, an SVM_st takes temporal information into account, combined with text (SVM_wt) and network (SVM_nt) information, respectively. Finally, we combine the text and the network information using a linear summation (K_sum) of the respective kernels (SVM_sum).

Multiple convolution kernel learning (MCKL). Multiple kernel learning methods learn a weight for each kernel, instead of assigning equal importance to all of them, allowing more flexibility. Such approaches have been extensively used in tasks where different data modalities exist [26, 48, 59]. We build upon the approach of Sonnenburg et al. [54] to build a model based on labelled instances i ∈ I, combining the different convolution kernels K_s with weights w_s > 0 s.t. Σ_s w_s = 1, and apply:

    f(x) = sign( Σ_{i∈I} α_i Σ_s w_s K_s(x, x_i) + b ).

The parameters α_i, the bias term b and the kernel weights are estimated by minimising:

    min  γ − Σ_{i∈I} α_i
    w.r.t.  γ ∈ R, α ∈ R_+^{|I|}
    s.t.  0 ≤ α_i ≤ C ∀i,    Σ_{i∈I} α_i y_i = 0,
          (1/2) Σ_{i∈I} Σ_{j∈I} α_i α_j y_i y_j K_s(x_i, x_j) ≤ γ ∀s.

This way, the four convolution kernels are calculated individually and subsequently combined in a weighted scheme, accounting for their contribution to the prediction task.
This also allows us to combine external and asynchronous information (e.g., news articles), while adding other kernels capturing different aspects of the users (e.g., images) is straightforward.
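A fixed-weight version of the kernel combination can be sketched as below. In the paper the weights w_s are learned via the optimisation above; here they are supplied by hand, so this is illustrative only (and the labels y_i are kept explicit, as in the standard SVM decision function).

```python
def combine_kernels(kernels, weights):
    """Weighted sum of precomputed kernel matrices (lists of lists):
    K = sum_s w_s * K_s."""
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, kernels))
             for j in range(n)] for i in range(n)]

def decision(k_row, alphas, ys, b):
    """f(x) = sign(sum_i alpha_i y_i K(x, x_i) + b), where k_row holds
    the combined-kernel values between x and each training instance."""
    s = sum(a * y * k for a, y, k in zip(alphas, ys, k_row)) + b
    return 1 if s >= 0 else -1
```

Replacing the hand-set weights with learned ones is exactly what distinguishes MCKL from the plain K_sum model.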
7 EXPERIMENTAL SETUP
Word embeddings. We obtain word embeddings by training word2vec [41] on a collection of 14.7M non-retweeted tweets obtained by [61], collected in the exact same way as our dataset over a separate time period. We performed standard pre-processing steps, including lowercasing, tokenising, removal of non-alphabetic characters, and replacement of URLs, mentions and all-upper-case words with identifiers. We used the CBOW architecture, opting for a 5-token window around the target word, discarding all words appearing less than 5 times and using negative sampling with 5 "noisy" examples. After training, each word is represented as a 50-dimensional vector. Each tweet in our training and test set is represented by averaging each dimension of its constituent words.
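The described pre-processing and mean-embedding representation can be sketched as follows. The placeholder token names (`<url>`, `<mention>`, `<allcaps>`) and the exact regular expressions are our assumptions; the paper only names the steps.

```python
import re

def preprocess(tweet):
    """Lowercase; replace URLs, mentions and all-caps words with
    identifier tokens; strip other non-alphabetic characters;
    whitespace-tokenise. Greek letters are kept."""
    tweet = re.sub(r"https?://\S+", " <url> ", tweet)
    tweet = re.sub(r"@\w+", " <mention> ", tweet)
    tweet = re.sub(r"\b[A-ZΑ-Ω]{2,}\b", " <allcaps> ", tweet)
    tweet = tweet.lower()
    tweet = re.sub(r"[^a-zα-ωάέήίόύώϊϋΐΰ<>\s]", " ", tweet)
    return tweet.split()

def tweet_vector(tokens, embeddings, dim=50):
    # mean embedding of the tweet's in-vocabulary words
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
```

Note that the all-caps replacement must run before lowercasing, otherwise the signal is lost.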
Network embeddings. We trained LINE [57] embeddings at different time steps, by training on the graphs {G_1(N_1, E_1), ..., G_T(N_T, E_T)}, where N_i is the set of users and E_i is the (directed, weighted) set of retweets amongst N_i up to time i. We chose the "retweet" rather than the "user mention" network due to its more polarised nature, as indicated by past work [15]. LINE was preferred over alternative models [47, 51] due to its ability to model directed weighted graphs. We construct the network G_t every 12 hours, based on the retweets among all users up to time t, and LINE is trained on G_t to create 50-dimensional user representations. We used the second-order proximity, since it performed better than the first-order in early experimentation; we also refrained from concatenating the two, to keep the dimensionality relatively low. Note that the "following" network cannot be constructed from the JSON objects returned by the Twitter Streaming API; achieving this would require a very large number of API calls and cannot be done accurately in a realistic scenario.
Our MCKL and SVM models are fed with the convolution kernels operating on the tweet level (for TEXT) and on each NETWORK representation (derived every 12 hours), based on the tweets and re-tweeting activity, respectively, of the users up to the current evaluation time point.
Baselines. We compare our proposed methods against competitive baselines that are commonly used in social media mining tasks, trained on feature aggregates [38, 66]. We obtain a TEXT representation of a user at each time step t by averaging embedding values across all his/her tweets until t. Similarly, a user NETWORK representation is computed from the retweeting graph up until t. Finally, we train a regularised Logistic Regression (LR) [34], a feed-forward neural network (FF) [24], a Random Forest (RF) [7] and an SVM.
Model parameters. Parameter selection for our models and the baselines is performed using 5-fold cross-validation on the training set. We experiment with different regularisation strengths for LR, different numbers of trees (50, 100, ..., 500) for RF, and different kernels (linear, RBF) and values of the parameters C and γ for SVMs. For FF, we stack dense layers, each followed by a ReLU activation and a 20% dropout layer, and a final layer with a sigmoid activation function. We train the network using the Adam optimiser [32] with the binary cross-entropy loss function, and experiment with different numbers of hidden layers (1, 2), units per layer (10, 25, 50, 75, 100, 150, 200), batch sizes (10, 25, 50, 75, 100) and numbers of epochs (10, 25, 50, 100). For MCKL, we experiment with the same C values as in the SVM and additionally apply a regulariser on the kernel weights.

Evaluation. We train and test our models based on the data collected on a daily basis (every midnight), starting from the day before the announcement of the referendum (day 0) until the day before its due date (day 8). This way, we mimic a real-time setting and gain better evaluation insights. To evaluate our models, we compute the macro-average F-score, which forms a more challenging metric compared to micro-averaging, given the imbalanced distribution of our test set. At each evaluation time point t, we use information about the users in our training set up to t, to classify the test users that have tweeted at least once up to t (note that all of the users in our training set have tweeted before the announcement of the referendum, thus the size of the training set in terms of number of users remains constant). This results in a different number of test instances per day (see Table 2). However, we did not observe any major differences in our evaluation when excluding newly added users. Parameter selection is performed on every evaluation day, using 5-fold cross-validation on the training set.
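The evaluation metric above can be made concrete with a short, self-contained sketch of macro-averaged F-score (standard definition, not code from the paper):

```python
def macro_f1(y_true, y_pred, labels=("YES", "NO")):
    """Macro-averaged F-score: per-class F1, averaged with equal class
    weight - which is what makes it stricter than micro-averaging
    under class imbalance."""
    f1s = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)
```

Because each class contributes equally to the average, always predicting the majority class scores poorly, which is why the metric suits the imbalanced test set.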
8 RESULTS
Figure 2 presents the macro-average F-scores obtained by the compared methods on all days from the announcement to the day of the referendum. As expected, the closer the evaluation is to the referendum date, the more accurate the models become, since more information becomes available for each user. Table 3 shows the average (across-all-days) F-score of each model.

Figure 2: Macro-average F-score across all evaluation days using TEXT, NETWORK and BOTH user representations.

Table 3: Average F-score and standard deviation across all evaluation days using TEXT, NETWORK and BOTH user representations. SVM_s and SVM_st denote the SVM with convolution kernels (SVM_w, SVM_n) and (SVM_wt, SVM_nt), respectively.

Model     TEXT    NETWORK  BOTH
LR        63.55
FF        68.19
RF        61.27
SVM       68.51
SVM_s                      –
SVM_st                     –
SVM_sum   –       –        85.22
MCKL      –       –

Temporal convolution kernels using TEXT (SVM_wt) significantly outperform the best text-based baseline (Kruskal-Wallis test against SVM), with an average of 11.8% and 17.2% absolute and relative improvement, respectively. This demonstrates the model's ability to capture the similarities between different users on a per-tweet basis, compared to simpler models using tweet aggregates. Also, SVM_w and SVM_wt implicitly capture similarities in the retweeting activity of the users. This is important, since network information might not be easily accessible (e.g., due to API limitations), while it is expensive to compute at each time step. Hence, one can use SVM_wt to model user-written content and partially capture network information.

Classification accuracy consistently improves when using the NETWORK representation (i.e., graph embeddings). RF achieves 94% F-score on the day before the referendum, whereas the worst-performing baseline (FF) still achieves 80.66% F-score on average. SVM_nt provides a small boost (1.6% on average) compared to the vanilla SVM, which uses only the user representations derived at the current time point. This implies that the current network structure is indicative of users' voting intention, probably because the referendum was the dominant topic of discussion at the time, e.g., most of the retweeting activity was relevant to the referendum (see Section 9). This is also in line with recent findings of Aletras and Chamberlain [3] on predicting occupation class and income, where network information is more predictive than language.

When combining the user text and network representations (BOTH), the baselines fail to improve over using only NETWORK. In contrast, our MCKL improves by 4.28% over the best performing single convolution kernel model (SVM_nt). This demonstrates that MCKL can effectively combine information from both representations by weighting their importance, and further improve the accuracy of the best performing single-representation model. Overall, MCKL significantly outperforms the best performing text-based baseline by approximately 20% in F-score (Kruskal-Wallis test).

Figure 3: Change in performance (mean/standard deviation) compared to the results in Figure 2, after 100 experiments with added noisy features.

Robustness to noise. Due to the semi-supervised nature of our task, it is impossible to judge whether the small difference between MCKL and RF stems from a better designed model. Furthermore, it is difficult to assess MCKL's effectiveness with respect to its ability to generalise over multiple and potentially noisy feature sources. To assess the robustness of the best performing models (MCKL, RF) operating on BOTH information sources, we perform experiments adding random noise to their input. We assume that there is a noisy source generating an extra K-dimensional representation X for every user, which we add as extra input to the models. We set K = 25, so that (a) we account for a smaller noisy input compared to our features (25 vs 50), and (b) X ∼ N(0, 1).

Our results indicate that RF is more sensitive to the noisy input than MCKL (see Figure 3). On average, RF achieves a small boost (0.04%) in performance with the added noise. That, together with the higher standard deviation, reveals the vulnerability of RF to potential corruption and stochasticity introduced in the input. On the contrary, MCKL is consistently robust, achieving only a tiny reduction in performance on average across all days (0.02%), while the respective average standard deviation is lower than that of RF (0.12 vs 0.41). This robustness is highly desirable in cases of such sudden political events, and it also indicates that we can add kernels capturing different properties of our task (e.g., user-related information, images, etc.) without having to decide a priori which of them are indeed predictive of the user's voting intention. We plan to investigate this in future work.

9 ANALYSIS
In this section, we provide insights into the temporal variation observed in the users' shared content and the network structure during this major political crisis. Besides offering a qualitative analysis of this time period, we believe that this analysis also sheds light on (a) the reasons that trigger the significant improvement in performance of convolution kernel methods operating on TEXT; and (b) the reason that our non-temporally-sensitive baselines are rather competitive with our convolution kernel models when using NETWORK information.
We are interested in investigating which politics-related entities the voters from each side are most likely to mention. We expect this to shed light on the main focus of discussion in the political debates between the YES/NO voters that occurred after the announcement of the referendum. For this purpose, two authors manually compiled two lists of n-grams containing different ways of referring to (a) the six major political parties and (b) their leaders (see Table 1). We represent every YES/NO user in the test set by the aggregated tf-idf values of the n-grams (1-3) appearing in their concatenated tweets; we then compute the score of an n-gram n as PMI(n, YES) − PMI(n, NO). A positive score implies that n is highly associated with users who support the YES vote, and vice versa. (Note that Greek is a fully inflected language; we opted not to apply stemming, because inflected word forms carry meaningful information.)

Figure 4 shows that the parties and leaders that supported one side mostly appear in tweets of users supporting the opposite side. This is more evident when we consider tweets shared by the users after the announcement of the referendum. Examining the content of highly retweeted tweets revealed sarcasm and hostility towards the opposite side in the majority of them (see Table 5). Hostility is a frequent phenomenon in public debates [28] and our findings corroborate previous work showing that political discourse on Twitter is polarised [15, 20].

Figure 4: Scores of n-grams related to the political parties/leaders, pre (18/06-26/06) and post (27/06-05/07) the referendum announcement.

Finally, we examine the temporal variation of language over the same two periods. Table 4 shows the most similar words (translated to English) to the yes and no words, measured by cosine similarity, when training word2vec on the tweets of each time period.

Table 4: Most similar words to YES and NO (translated to English), when training word2vec on different time periods.

        Before the announcement (18/06-26/06)     After the announcement (27/06-05/07)
  YES   no, ok, nah, alright, sure, usrmnt,       no, abstain, referendum, KKE, question,
        hahaha, alrighty, but, so                 invalid, euro, clearly, clash, nai
  NO    yes, only, sure, so (slang), disagree,    yes, abstain, KKE, referendum, clash,
        mainly, especially, obviously,            question, people, invalid, vote,
        so (abbrv), agree                         clearly

The difference of the cosine similarities cos_post − cos_pre between the yes/no vectors and each of their corresponding most similar words over these two periods is shown in Figure 5. After the announcement, the context of the two words shifts towards the political domain. This might explain why text aggregates become noisy, as shown in our results: convolution kernels are able to filter out this noise, since they operate on the tweet level while also taking time into account. We plan to study the semantic variation of language [17] in a more fine-grained way in future work.

We explore the differences in the retweeting behaviour of users over the same two periods ((a) before the announcement of the referendum and (b) after it, until the day of the referendum), by training two different LINE embedding models using the tweets of each period respectively. Figure 6 shows plots of the first two dimensions of the graph embeddings before and after the announcement, obtained using principal component analysis. The results unveil the effects of the referendum announcement and provide insights into the effectiveness of NETWORK information for predicting voting intention, as demonstrated in our results. Before the announcement, YES and NO users appear to have similar retweeting behaviour, which changes after the announcement.
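The per-period cosine-similarity comparison behind Figure 5 can be sketched as follows. This is a minimal illustration with toy two-dimensional vectors standing in for the word2vec embeddings of each period; the function names and the toy vectors are ours.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def context_shift(emb_pre, emb_post, target, words):
    """cos_post(target, w) - cos_pre(target, w) for each word w.

    emb_pre / emb_post map words to vectors trained on the tweets of
    each period (word2vec in the paper; any embedding dict works
    here). A positive value means w moved closer to the target word
    after the announcement.
    """
    return {w: cosine(emb_post[target], emb_post[w])
               - cosine(emb_pre[target], emb_pre[w])
            for w in words}

# Toy example: "referendum" is unrelated to "no" before the
# announcement, and nearly synonymous with it afterwards.
emb_pre = {"no": np.array([1.0, 0.0]), "referendum": np.array([0.0, 1.0])}
emb_post = {"no": np.array([1.0, 0.0]), "referendum": np.array([1.0, 0.1])}
shift = context_shift(emb_pre, emb_post, "no", ["referendum"])
```

Applied to the two real embedding models, large positive shifts for political vocabulary are exactly the pattern summarised in Figure 5.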
This finding illustrates the political homophily of the social network [13] and highlights the extremely polarised pre-election period [62].

Table 5: Examples of highly retweeted tweets after the announcement of the referendum.
  - "They say that there is a long queue of people in ATMs but they show only 6 people waiting; this is not a queue, this is PASOK."
  - "Looking for any angry tweets by SYRIZA fans concerning Kasidiaris's (Golden Dawn MP) release from prison. Have you seen any?"
  - "I want to write something funny regarding the statements made by Kammenos (Ind. Greeks leader), but I cannot find something funnier than the statements made by Kammenos."

Figure 5: Difference in cosine similarity (cos_post(w_no/yes, w) − cos_pre(w_no/yes, w)) between the no/yes (red/blue) word vectors w_no/yes and each of their most similar words in the two periods.

Figure 6: Network representations of YES/NO (blue/red) users, before (above) and after (below) the referendum announcement.

Next, we question whether the distance between the two classes of users changes through time according to the time points at which real-world events occur. To answer this, we compute the network embeddings of the train and test users every 12 hours, as in our experiments, and represent every class (YES/NO) at a certain time point t by the average representations (avg_Y^t, avg_N^t) of the corresponding users in the training set at t. Then, for every user u in the test set, we use the cosine similarity cos to calculate:

    network_score_u^t = cos(u^t, avg_Y^t) − cos(u^t, avg_N^t).

Finally, we calculate the average score of the YES and the NO users in the test set (network_Y^t, network_N^t) at every time point t and normalise the corresponding time series s.t. network_Y(0) = network_N(0) = 0. We also employ an alternative approach, generating the network embeddings in a seven-day sliding-window fashion and following the same process.

The results are shown in Figure 7. In both cases, the YES/NO users start to deviate from each other right after the announcement of the referendum, with an upward/downward YES/NO trend until the day of the referendum. This is effectively captured in our modelling and might explain the high accuracy achieved even by our baseline models, which are trained using the network representation of the users on the last day only. However, the YES/NO users start to approach each other again after the referendum day only in the sliding-window approach, since in our modelling the representations are built from retweet aggregates over the whole period. While this does not seem to have affected our performance, exploring the temporal structure of network formations through time is of vital importance for longer-lasting electoral cases.

Figure 7: Normalised difference of similarity of YES/NO (blue/red) users in our modelling (left) and in a sliding-window approach (right).
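The class-separation score above can be sketched as follows. This is a minimal illustration with toy two-dimensional embeddings standing in for the LINE network embeddings at one time point t; the function name and toy vectors are ours.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def network_score(user_vec, train_yes, train_no):
    """cos(u^t, avg_Y^t) - cos(u^t, avg_N^t) for one test user at time t.

    The two class centroids are the mean network embeddings of the
    YES / NO training users at time t; a positive score places the
    test user closer to the YES centroid.
    """
    avg_yes = np.mean(train_yes, axis=0)
    avg_no = np.mean(train_no, axis=0)
    return cosine(user_vec, avg_yes) - cosine(user_vec, avg_no)

# Toy example: the test user retweets like the YES training users.
train_yes = np.array([[1.0, 0.0], [0.9, 0.1]])
train_no = np.array([[0.0, 1.0], [0.1, 0.9]])
score = network_score(np.array([1.0, 0.1]), train_yes, train_no)
```

Averaging this score over the YES and NO test users at every 12-hour step, and normalising both series to start at zero, yields the curves of Figure 7.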
10 LIMITATIONS AND FUTURE WORK
Despite working under a real-time simulation setting, we are aware that our results should be interpreted with some caution, owing to the selection of the users in our test set. The limitations stem from the fact that we selected highly active users who used at least three polarised hashtags in their tweets after the announcement of the referendum. As previous work has shown [12, 49], we expect that the performance of any model is likely to drop if tested on a random sample of Twitter users. In future work, we plan to investigate this by annotating a random sample of Twitter users and comparing the performance on the two test sets.

We also plan to assess the ability of MCKL to generalise, by exploring different referendum cases and incorporating more sources of information into our modelling. Finally, we plan to study the temporal variation of language and network structure in a more fine-grained way.
11 CONCLUSION
We presented a distantly supervised multiple convolution kernel approach, leveraging temporally sensitive language and network information to nowcast the voting stance of Twitter users during the 2015 Greek bailout referendum. Following a real-time evaluation setting, we demonstrated the effectiveness and robustness of our approach against competitive baselines, showcasing the importance of temporal modelling for our task.

In particular, we showed that temporal modelling of the content generated by social media users provides a significant boost in performance (11%-19% in F-score) compared to traditional feature-aggregate approaches. Also, in line with past work on inferring the political ideology of social media users [1, 16], we showed that the network structure (in our case, the retweet network) of social media users is more predictive of their voting intention than the content they share. By combining these two temporally sensitive aspects of our task (text, network) via a multiple kernel learning approach, we further boost performance, leading to an overall significant 20% increase in F-score against the best performing, solely text-based feature-aggregate baseline. Finally, we provided qualitative insights into the shift in online discussions and the polarisation phenomena that occurred during this period, which are effectively captured by our temporal modelling approach.
ACKNOWLEDGEMENTS
The current work was supported by the EPSRC through the University of Warwick's Centre for Doctoral Training in Urban Science and Progress (grant EP/L016400/1) and through The Alan Turing Institute (grant EP/N510129/1).
REFERENCES
[1] Faiyaz Al Zamal, Wendy Liu, and Derek Ruths. 2012. Homophily and Latent Attribute Inference: Inferring Latent Attributes of Twitter Users from Neighbors. In ICWSM.
[2] John H Aldrich, Rachel K Gibson, Marta Cantijoch, and Tobias Konitzer. 2016. Getting out the vote in the social media era: Are digital tools changing the extent, nature and impact of party contacting in elections? Party Politics 22, 2 (2016), 165-178.
[3] Nikolaos Aletras and Benjamin Paul Chamberlain. 2018. Predicting Twitter User Socioeconomic Attributes with Network and Language Information. In Proceedings of the 29th ACM Conference on Hypertext and Social Media (HT '18). 20-24.
[4] Despoina Antonakaki, Dimitris Spiliotopoulos, Christos V Samaras, Polyvios Pratikakis, Sotiris Ioannidis, and Paraskevi Fragopoulou. 2017. Social media analysis during political turbulence. PloS one 12, 10 (2017), e0186836.
[5] Ron Artstein and Massimo Poesio. 2008. Inter-coder agreement for Computational Linguistics. Computational Linguistics 34, 4 (2008), 555-596.
[6] Anna Bosco and Susannah Verney. 2012. Electoral Epidemic: The Political Cost of Economic Crisis in Southern Europe, 2010-11. South European Society and Politics 17, 2 (2012), 129-154.
[7] Leo Breiman. 2001. Random forests. Machine Learning 45, 1 (2001), 5-32.
[8] Pete Burnap, Rachel Gibson, Luke Sloan, Rosalynd Southern, and Matthew Williams. 2016. 140 characters to victory? Using Twitter to predict the UK 2015 General Election. Electoral Studies 41, Supplement C (2016), 230-233.
[9] Fabio Celli, Evgeny Stepanov, Massimo Poesio, and Giuseppe Riccardi. 2016. Predicting Brexit: Classifying agreement is better than sentiment and pollsters. In Proceedings of the Workshop on Computational Modeling of People's Opinions, Personality, and Emotions in Social Media. 110-118.
[10] Andrea Ceron, Luigi Curini, Stefano M Iacus, and Giuseppe Porro. 2014. Every tweet counts? How sentiment analysis of social media can improve our knowledge of citizens' political preferences with an application to Italy and France. New Media & Society 16, 2 (2014), 340-358.
[11] Derrick L. Cogburn and Fatima K. Espinoza-Vasquez. 2011. From Networked Nominee to Networked Nation: Examining the Impact of Web 2.0 and Social Media on Political Participation and Civic Engagement in the 2008 Obama Campaign. Journal of Political Marketing 10, 1-2 (2011), 189-213.
[12] Raviv Cohen and Derek Ruths. 2013. Classifying Political Orientation on Twitter: It's not Easy!. In ICWSM.
[13] Elanor Colleoni, Alessandro Rozza, and Adam Arvidsson. 2014. Echo chamber or public sphere? Predicting political orientation and measuring political homophily in Twitter using big data. Journal of Communication 64, 2 (2014), 317-332.
[14] Michael Collins and Nigel Duffy. 2002. Convolution kernels for natural language. In NIPS. 625-632.
[15] Michael Conover, Jacob Ratkiewicz, Matthew R Francisco, Bruno Gonçalves, Filippo Menczer, and Alessandro Flammini. 2011. Political polarization on Twitter. In ICWSM, Vol. 133. 89-96.
[16] Michael D Conover, Bruno Gonçalves, Jacob Ratkiewicz, Alessandro Flammini, and Filippo Menczer. 2011. Predicting the Political Alignment of Twitter Users. In Privacy, Security, Risk and Trust (PASSAT) and 2011 IEEE Third International Conference on Social Computing (SocialCom). IEEE, 192-199.
[17] Marco Del Tredici and Raquel Fernández. 2017. Semantic Variation in Online Communities of Practice. In IWCS.
[18] Leon Derczynski, Kalina Bontcheva, Maria Liakata, Rob Procter, Geraldine Wong Sak Hoi, and Arkaitz Zubiaga. 2017. SemEval-2017 Task 8: RumourEval: Determining rumour veracity and support for rumours. In SemEval. 69-76.
[19] Anjie Fang, Iadh Ounis, Philip Habel, Craig Macdonald, and Nut Limsopatham. 2015. Topic-centric classification of Twitter user's political orientation. In SIGIR. 791-794.
[20] Kiran Garimella, Gianmarco De Francisci Morales, Aristides Gionis, and Michael Mathioudakis. 2018. Political Discourse on Social Media: Echo Chambers, Gatekeepers, and the Price of Bipartisanship. In WWW. 913-922.
[21] Daniel Gayo-Avello. 2012. I Wanted to Predict Elections with Twitter and all I got was this Lousy Paper - A Balanced Survey on Election Prediction using Twitter Data. arXiv preprint arXiv:1204.6441 (2012).
[22] Miha Grčar, Darko Cherepnalkoski, Igor Mozetič, and Petra Kralj Novak. 2017. Stance and influence of Twitter users regarding the Brexit referendum. Computational Social Networks 4, 1 (2017), 6.
[23] David Haussler. 1999. Convolution Kernels on Discrete Structures. Technical Report. Department of Computer Science, University of California at Santa Cruz.
[24] Kurt Hornik, Maxwell Stinchcombe, and Halbert White. 1989. Multilayer feedforward networks are universal approximators. Neural Networks 2, 5 (1989), 359-366.
[25] Philip N. Howard. 2005. Deep Democracy, Thin Citizenship: The Impact of Digital Media in Political Campaign Strategy. The Annals of the American Academy of Political and Social Science.
[26] In NIPS Workshop on Multimodal Machine Learning, Montreal, Quebec, Vol. 898.
[27] Thorsten Joachims. 1998. Text categorization with support vector machines: Learning with many relevant features. In ECML. 137-142.
[28] Charlotte Jørgensen. 1998. Public debate - an act of hostility? Argumentation.
[29] Social Science Computer Review 30, 2 (2012), 229-234.
[30] Georgios Kalamatianos, Dimitrios Mallis, Symeon Symeonidis, and Avi Arampatzis. 2015. Sentiment analysis of Greek tweets and hashtags using a sentiment lexicon. In Proceedings of the 19th Panhellenic Conference on Informatics. ACM, 63-68.
[31] Jonghoon Kim, François Rousseau, and Michalis Vazirgiannis. 2015. Convolutional sentence kernel from word embeddings for short text categorization. In EMNLP. 775-780.
[32] Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
[33] Vasileios Lampos, Daniel Preoţiuc-Pietro, and Trevor Cohn. 2013. A user-centric model of voting intention from Social Media. In ACL, Vol. 1. 993-1003.
[34] Saskia Le Cessie and Johannes C Van Houwelingen. 1992. Ridge estimators in logistic regression. Applied Statistics (1992), 191-201.
[35] Lawrence Leduc. 2002. Opinion change and voting behaviour in referendums. European Journal of Political Research 41, 6 (2002), 711-732.
[36] Julio Cesar Amador Diaz Lopez, Sofia Collignon-Delmar, Kenneth Benoit, and Akitaka Matsuo. 2017. Predicting the Brexit Vote by Tracking and Classifying Public Opinion Using Twitter Data. Statistics, Politics and Policy 8, 1 (2017), 85-104.
[37] Michal Lukasik and Trevor Cohn. 2016. Convolution Kernels for Discriminative Learning from Streaming Text. In AAAI. 2757-2763.
[38] Jing Ma, Wei Gao, Zhongyu Wei, Yueming Lu, and Kam-Fai Wong. 2015. Detect rumors using time series of social context information on microblogging websites. In CIKM. 1751-1754.
[39] Panagiotis T Metaxas, Eni Mustafaraj, and Dani Gayo-Avello. 2011. How (not) to predict elections. In Privacy, Security, Risk and Trust and IEEE Third International Conference on Social Computing. 165-171.
[40] Asimina Michailidou. 2017. Twitter, Public Engagement and the Eurocrisis: More than an Echo Chamber? In Social Media and European Politics. Springer, 241-266.
[41] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems. 3111-3119.
[42] Renato Miranda Filho, Jussara M Almeida, and Gisele L Pappa. 2015. Twitter population sample bias and its impact on predictive outcomes: a case study on elections. In Advances in Social Networks Analysis and Mining. 1254-1261.
[43] Saif Mohammad, Svetlana Kiritchenko, and Xiaodan Zhu. 2013. NRC-Canada: Building the State-of-the-Art in Sentiment Analysis of Tweets. In SemEval, Vol. 2. 321-327.
[44] Brendan O'Connor, Ramnath Balasubramanyan, Bryan R Routledge, and Noah A Smith. 2010. From tweets to polls: Linking text sentiment to public opinion time series. In ICWSM, Vol. 11. 1-2.
[45] Elisavet Palogiannidi, Polychronis Koutsakis, Elias Iosif, and Alexandros Potamianos. 2016. Affective Lexicon Creation for the Greek Language. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia, May 23-28, 2016.
[46] Marco Pennacchiotti and Ana-Maria Popescu. 2011. A Machine Learning Approach to Twitter User Classification. ICWSM 11, 1 (2011), 281-288.
[47] Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. 2014. Deepwalk: Online learning of social representations. In SIGKDD. 701-710.
[48] Soujanya Poria, Haiyun Peng, Amir Hussain, Newton Howard, and Erik Cambria. 2017. Ensemble application of convolutional neural networks and multiple kernel learning for multimodal sentiment analysis. Neurocomputing 261 (2017), 217-230.
[49] Daniel Preoţiuc-Pietro, Ye Liu, Daniel Hopkins, and Lyle Ungar. 2017. Beyond Binary Labels: Political Ideology Prediction of Twitter Users. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Vol. 1. 729-740.
[50] Delip Rao, David Yarowsky, Abhishek Shreevats, and Manaswi Gupta. 2010. Classifying Latent User Attributes in Twitter. In Proceedings of the 2nd International Workshop on Search and Mining User-generated Contents. ACM, 37-44.
[51] Georgios Rizos, Symeon Papadopoulos, and Yiannis Kompatsiaris. 2017. Multilabel user classification using the community structure of online networks. PloS one 12, 3 (2017), e0173347.
[52] Wolfgang Rüdig and Georgios Karyotis. 2013. Beyond the Usual Suspects? New Participants in Anti-Austerity Protests in Greece. Mobilization: An International Quarterly 18, 3 (2013), 313-330.
[53] Lei Shi, Neeraj Agarwal, Ankur Agrawal, Rahul Garg, and Jacob Spoelstra. 2012. Predicting US primary elections with Twitter. In Social Network and Social Media Analysis: Methods, Models and Applications, NIPS.
[54] Sören Sonnenburg, Gunnar Rätsch, Christin Schäfer, and Bernhard Schölkopf. 2006. Large scale multiple kernel learning. Journal of Machine Learning Research 7, Jul (2006), 1531-1565.
[55] Ian Stewart, Yuval Pinter, and Jacob Eisenstein. 2018. Sí o no, qué penses? Catalonian Independence and Linguistic Identity on Social Media. In NAACL-HLT.
[56] Sandesh Swamy, Alan Ritter, and Marie-Catherine de Marneffe. 2017. "I have a feeling Trump will win..................": Forecasting Winners and Losers from User Predictions on Twitter. In EMNLP. 1583-1592.
[57] Jian Tang, Meng Qu, Mingzhe Wang, Ming Zhang, Jun Yan, and Qiaozhu Mei. 2015. LINE: Large-scale information network embedding. In WWW. 1067-1077.
[58] Eftichia Teperoglou and Emmanouil Tsatsanis. 2014. Dealignment, De-legitimation and the Implosion of the Two-Party System in Greece: The Earthquake Election of 6 May 2012. Journal of Elections, Public Opinion and Parties 24, 2 (2014), 222-242.
[59] Adam Tsakalidis, Maria Liakata, Theo Damoulas, Brigitte Jellinek, Weisi Guo, and Alexandra Cristea. 2016. Combining heterogeneous user generated data to sense well-being. In COLING. 3007-3018.
[60] Adam Tsakalidis, Symeon Papadopoulos, Alexandra I Cristea, and Yiannis Kompatsiaris. 2015. Predicting elections for multiple countries using Twitter and polls. IEEE Intelligent Systems 30, 2 (2015), 10-17.
[61] Adam Tsakalidis, Symeon Papadopoulos, Rania Voskaki, Kyriaki Ioannidou, Christina Boididou, Alexandra I Cristea, Maria Liakata, and Yiannis Kompatsiaris. 2018. Building and evaluating resources for sentiment analysis in the Greek language. Language Resources and Evaluation (2018), 1-24.
[62] George Tsebelis. 2016. Lessons from the Greek crisis. Journal of European Public Policy 23, 1 (2016), 25-41.
[63] Andranik Tumasjan, Timm Oliver Sprenger, Philipp G Sandner, and Isabell M Welpe. 2010. Predicting elections with Twitter: What 140 characters reveal about political sentiment. ICWSM 10, 1 (2010), 178-185.
[64] Kateryna Tymoshenko, Daniele Bonadiman, and Alessandro Moschitti. 2016. Convolutional neural networks vs. convolution kernels: Feature engineering for answer sentence reranking. In NAACL-HLT. 1268-1278.
[65] Svitlana Volkova, Glen Coppersmith, and Benjamin Van Durme. 2014. Inferring User Political Preferences from Streaming Communications. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Vol. 1. 186-196.
[66] Arkaitz Zubiaga, Bo Wang, Maria Liakata, and Rob Procter. 2017. Stance Classification of Social Media Users in Independence Movements. arXiv preprint arXiv:1702.08388 (2017).