An organized review of key factors for fake news detection
FOCUS ARTICLE
Nuno Guimaraes | Alvaro Figueira PhD | Luis Torgo PhD
CRACS-INESCTEC and University of Porto, Porto, 4200-465, Portugal
Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, NS B3H 1W5, Canada
Correspondence
Nuno Guimaraes, CRACS-INESCTEC and University of Porto, Porto, 4200-465, Portugal
Email: [email protected]
Present address
CRACS-INESCTEC and University of Porto, Porto, 4200-465, Portugal
Funding information
Nuno Guimaraes thanks the Fundação para a Ciência e Tecnologia (FCT), Portugal for the Ph.D. Grant (SFRH/BD/129708/2017). The work of L. Torgo was undertaken, in part, thanks to funding from the Canada Research Chairs program.
Fake news in social media has quickly become one of the most discussed topics in today's society. With false information proliferating and causing a significant impact in the political, economic, and social domains, research efforts to analyze and automatically identify this type of content have been conducted in the past few years. In this paper, we attempt to summarize the principal findings on the topic of fake news in social media, highlighting the main research paths taken and giving particular focus to the detection of fake news and bot accounts.
KEYWORDS
fake news detection, social media, data science, bot detection, machine learning

| INTRODUCTION
The problem of fake news is not recent. In fact, there have been several examples in history before the rise of social media (and the Internet itself). One of the most impactful in modern history was the claim that the HIV virus was fabricated in a United States facility (Boghardt, 2009). This rumor circulated during 1983 and was later captured by a television newscast. Although it was subsequently debunked, the consequences are still present today, since some studies suggest the existence of a high percentage of believers in HIV-related hoaxes (Bogart and Thorburn, 2005; Klonoff and Landrine, 1999).

With social networks such as Twitter and Facebook, this type of content has platforms where it can be diffused and propagated at a pace that was impossible with other mediums. Furthermore, a recent study concluded that approximately 68% of American adults use (at least occasionally) social media for their daily news consumption (Shearer and Matsa, 2018). Consequently, fake news easily reach their target audience and proliferate in this ecosystem, making them one of the most challenging problems in today's society.

Due to the large quantity of user-generated data in social media, manually verifying all the content published or spread is infeasible. Therefore, researchers are using data mining methods and tools to tackle the fake news problem in this medium.

In this work, we cover the state of the art on social media platforms to analyse, detect, and minimize the propagation of fake news. We exclude some research topics outside the social media spectrum, such as the detection of fake news articles, stance detection, and the development of fact-checking knowledge graphs.

| FAKE NEWS IN SOCIAL MEDIA
Although the first studies on fake news in social media were published several years before (Castillo et al., 2011; Qazvinian et al., 2011), it was during the 2016 United States presidential election that the term became massively popular. Until then, similar problems were tackled in the literature, such as the analysis and detection of rumors (Jin et al., 2013). However, the fake news concept is slightly different and is closer to the concept of disinformation (i.e. false information spread or published with the intention of deceiving). Nevertheless, current literature uses the term loosely, so, in order to present a more complete review, we include rumors and disinformation as fake news and use the terms interchangeably.

| Data Annotation
The growth of fake news led to the rise of fact-checking entities such as Snopes and PolitiFact, whose purpose is to debunk claims, or Media Bias/Fact Check, which offers a large quantity of sources that are known to publish false content. The majority of these fact-checking providers use expert-based annotations in two or more labels. Snopes, for example, has 14 different labels, such as True, Mostly True, Mostly False, False, Outdated, Scam, and Unproven. On the other hand, Media Bias/Fact Check uses 5 different bias labels (left, left center, least, right center, right) as well as labels like conspiracy, pro-science, and questionable sources.

Several studies in data science use these providers to generate large datasets to study fake news. For example, the dataset LIAR (Wang, 2017) is composed of 12.8k claims extracted from PolitiFact, and the dataset used in Bovet and Makse (2019) relies on the annotations of Media Bias/Fact Check to extract fake news tweets. Nevertheless, depending on the task, these sources may not be enough. Thus, several studies rely on experts to manually annotate data.

It is also worth mentioning that, unlike other more traditional text annotation tasks, research in fake news does not commonly use crowd-sourced annotation (i.e. delegating the annotations to the wisdom of the crowd/volunteers). This can be justified by the complexity of the task and by the fact that, unlike in other tasks, the goal of fake news is precisely to deceive the reader, which could lead to poorly annotated data if the annotator does not have proper training on the task.

| Social Media
Social media is the main medium for the propagation of fake news and consequently has been a largely studied area. There are several different social media platforms with distinct characteristics and easiness of access to retrieve content. The two most well-known are Facebook and Twitter, which are also the main sources of fake news diffusion. The major difference between the two is that Twitter is a microblogging service with a 280-character limit on each post, while on Facebook the limit is near 60,000. Also, concerning data accessibility, Twitter has an open API that allows the extraction of data regarding public posts and accounts, something that Facebook has discontinued. This fact, associated with the limitations that the Facebook API imposes (for example, the impossibility of extracting posts regarding specific keywords), has led the majority of studies of fake news in social media to use Twitter for data extraction.

There is also more region-specific research on fake news that resorts to other social media platforms. For example, in Brazil and India, WhatsApp proved to be an important medium for the diffusion of false information (Goel et al., 2018). This platform is intended to be used as an instant messaging platform. However, in some countries it is more commonly used as a social network, since users are added to large groups without knowing all the participants. In terms of data accessibility, WhatsApp is a secure and private messaging platform. Therefore, data retrieval must be done manually or through automated scripts running on the client. Nevertheless, an analysis of misinformation circulating in WhatsApp groups was conducted by Resende et al. (2019), and even a prototype system for fact-checking has been developed to tackle the problem (Melo et al., 2019). Another example is the Chinese Twitter-like platform Weibo. This social network has similar features to Twitter and allows the use of an API to extract posts.
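To make the accessibility difference concrete, keyword-based extraction (possible with Twitter's API but not Facebook's) amounts to filtering posts whose text matches the query terms. The sketch below is our own minimal illustration over an already-collected sample; the post dictionaries and their `user`/`text` fields are hypothetical, not the exact API payload:

```python
# Minimal sketch: keyword-based filtering of collected posts.
# The post schema below is illustrative, not the exact Twitter API payload.

def filter_posts_by_keywords(posts, keywords):
    """Return posts whose text contains at least one keyword (case-insensitive),
    mimicking the keyword-based extraction step described above."""
    lowered = [k.lower() for k in keywords]
    return [p for p in posts if any(k in p["text"].lower() for k in lowered)]

# Hypothetical sample of previously collected posts.
sample = [
    {"user": "account_a", "text": "Breaking: earthquake reported in Chile"},
    {"user": "account_b", "text": "Lovely weather today"},
    {"user": "account_c", "text": "EARTHQUAKE rumors are spreading fast"},
]

matches = filter_posts_by_keywords(sample, ["earthquake"])
print([p["user"] for p in matches])  # ['account_a', 'account_c']
```

In practice this filtering is done server-side by the platform's search endpoint; the point here is only that Twitter exposes such queries while Facebook's API does not.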
Finally, platforms such as Reddit and 4chan also proved to be fake news spreaders. One of the most well-known examples was the Pizzagate conspiracy theory, which started propagating in these forums (Aisch et al., 2016). Thus, some research in the area also contemplates these platforms as case studies (Kang et al., 2015; Dang et al., 2016; Zannettou et al., 2017).

In the next section, we elaborate on the different topics inside the area and how the data previously mentioned applies to the research conducted. We will also give particular emphasis to the micro-blogging platforms, since they are the focus of the majority of the studies due to the previously mentioned data accessibility issues.

| RESEARCH TOPICS
There are three major areas of research in data science for fake news in social media:

• the analysis of fake news and of the accounts that publish fake news, which provides insights about the characteristics of a social media post or account;
• the detection of false information and of accounts whose purpose is to spread false information, normally achieved with the use of machine and deep learning models;
• the analysis of the propagation of fake news throughout the network, with the intention of mitigating it or studying the fact-checking effects on some nodes/users.

| Analysis
The majority of studies that analyse fake news in social media are conducted with respect to a certain event, such as the Chile earthquake (Mendoza et al., 2010), the Mumbai blast in India (Gupta, 2011), the bombings at the 2013
Boston marathon (Gupta et al., 2013; Starbird et al., 2014), the 2016 United States election (Bovet and Makse, 2019; Jin et al., 2017), and the "Brexit" referendum (Bastos and Mercea, 2019; Grčar et al., 2017). The main exception until now is the study published by Vosoughi et al. (2018), which covers a time period from 2006 to 2017. Several results are important to highlight. First, fake news travel much faster through the network than real or credible news (Vosoughi et al., 2018), beginning sometimes with a slow propagation, but once they become viral, their diffusion quickly increases (Gupta et al., 2013). Furthermore, fake news posts tend to increase in important events such as elections (Vosoughi et al., 2018; Jin et al., 2017). Concerning fact-checking or credible news diffusion, the majority of the studies agree that fake news propagate faster and in higher quantities than real news or fact-checking content. For example, in Starbird et al. (2014), the authors claim that there is a misinformation-to-correction ratio of 44:1. This goes against previous findings (Mendoza et al., 2010), which support that there is a 1:1 fake news to correction ratio. In a more recent study (Shao et al., 2018), the reported ratio is 1:17, thus highlighting an absence of agreement on this subject.

To better comprehend the users' accountability concerning the problem of fake news in social media, we shift our focus to the analysis of accounts that are largely responsible for its dissemination. In Shu et al. (2018), the authors claim that accounts that trust fake news are registered earlier and have a high following/followers ratio (i.e. these users tend to follow more accounts than to have "followers"). The majority of studies also agree on the importance of social bots for the spreading of fake news in social media. Social bots can be defined as algorithms that produce content automatically and interact with humans. By definition, social bots are not malicious (for example, some act as news aggregators).
However, malicious social bots have the goal of modifying or influencing behavior, causing a major impact in real-world scenarios, whether by shifting public opinion in elections or by affecting the stock market (Ferrara et al., 2016). It is estimated that social bots account for around 15% of Twitter users (Varol et al., 2017). Furthermore, in specific events (like elections or tragedies), they act like "super-spreaders", since several studies suggest that a large volume of tweets diffusing fake news can be attributed to a small number of bot accounts (Bastos and Mercea, 2019; Shao et al., 2017), and the majority of this activity happens at an early stage (i.e. a few moments after a fake news article is published for the first time). Social bots also present different strategies with respect to the information they spread. A recent study analyzed the key strategies used by social bots to disseminate content in the wake of an important event (the Parkland shooting in Florida). The findings suggest that 36% of bots retweeted content that criticizes the actors involved in the shooting (such as the police and mainstream media). Other strategies applied by social bots were instilling doubt, sharing reliable information (showing that not all bots are malicious), spreading conspiracy theories, political organization, and commercial gain (Kitzie et al., 2018). Nevertheless, it is important to highlight that, although social bots amplified discussion on social networks, it is the human-operated accounts that are largely responsible for the diffusion of bot-generated content (Ferrara et al., 2016; Kitzie et al., 2018). On the other hand, since bots are very active in sharing fake news at an early stage, a bot classification system capable of a timely detection can be an efficient strategy to avoid the propagation of fake news through the network (Shao et al., 2017, 2018). The research towards bot and fake news detection systems is discussed in the next section.

| Detection
In terms of detection of fake news in social media, we can identify two main tasks. The first aims to predict if a social network post is false (or a similar concept). More formally, given a post with a list of predictors/features $\{X_1, X_2, \ldots, X_n\}$ and a target variable $Y$, we aim at approximating the unknown function $f$ such that $Y = f(X_1, X_2, \ldots, X_n)$, with $Y$ taking two possible values/labels (e.g. false or true, misinformation or reliable). In some applications, the target variable $Y$ can have more than two values (fake/reliable/satire), making it a multi-class classification task.

The second task concerns bots, which play an important role in the diffusion of fake news in social media. It is normally approached as a classification task (i.e. labeling an account as being a bot or a human) (Davis et al., 2016), although there are also studies that approach the problem as a multi-class classification task, since they consider an intermediate type of account (the cyborg): a mainly automated account with rare human intervention (Chu et al., 2012).

| Input Features
In both tasks, when applying machine learning techniques, it is necessary to analyse and select important features that are able to discriminate among the different class labels. In tasks related to the detection of fake news, we can look at characteristics of the post, the text, and the user. Commonly, post-based features include the number of hashtags, mentions, and links, and the weekday of publication. Concerning textual features, besides the number of words and the length of the text, sentiment and subjectivity analysis are often used in fake news detection tasks. This is justified by the emotional tone that fake news texts have. Part-of-speech (POS) tags are also usually extracted, like the number of nouns and pronouns (in 1st, 2nd, and 3rd person), as well as exclamation and question punctuation (due to the absence of formality in false content). Several studies include a large set of these types of features (Boididou et al., 2018; Volkova et al., 2017; Krishnan and Chen, 2018; Mendoza et al., 2010; Helmstetter and Paulheim, 2018). The psychological meaning of words is often analyzed using the LIWC tool (Tausczik and Pennebaker, 2010) due to the psycho-linguistic characteristics of the text. Finally, some studies (Hamidian and Diab, 2015; Helmstetter and Paulheim, 2018; Volkova et al., 2017; Jin et al., 2017) also use bag-of-words or word embedding models to create a large set of features based on the text of the post.

The third main group of features concerns the user or account that publishes the post, as well as the historical behavior of the user. This group is also used in bot detection tasks.
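Several of the post- and text-based features above can be computed directly from a post's raw text. The sketch below is our own illustration (the feature names and regular expressions are not taken from any specific study):

```python
import re

def extract_post_features(text: str) -> dict:
    """Compute a few common post/text features used in fake news detection."""
    words = text.split()
    return {
        "n_hashtags": len(re.findall(r"#\w+", text)),       # post-based
        "n_mentions": len(re.findall(r"@\w+", text)),       # post-based
        "n_links": len(re.findall(r"https?://\S+", text)),  # post-based
        "n_words": len(words),                              # textual
        "text_length": len(text),                           # textual
        "n_exclamations": text.count("!"),                  # punctuation cue
        "n_questions": text.count("?"),                     # punctuation cue
    }

features = extract_post_features(
    "BREAKING!!! @user1 says #hoax is real? http://example.com"
)
print(features["n_hashtags"], features["n_exclamations"])  # 1 3
```

Sentiment, subjectivity, POS, and LIWC features would be layered on top of such a vector with dedicated NLP tooling, and the resulting feature vectors fed to the classifiers discussed below.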
Features in this group include the number of followers and friends (since a high number of friends but a low number of followers can provide cues regarding the type of account), verification status (a verified account is unlikely to be a bot), account age and number of posts (a recent account with a high number of posts could possibly be a bot) (Helmstetter and Paulheim, 2018; Mendoza et al., 2010; Krishnan and Chen, 2018), and the absence/presence of a biography, profile picture, and banner (Boididou et al., 2018). With data from Weibo, several studies also use the user's gender and username type (Yang et al., 2012; Wu et al., 2015; Zhang et al., 2015). Groups of features used less frequently in fake news detection tasks include propagation and link-based features. Examples include the number of retweets/shares and replies/comments, and the analysis of the cascade of retweets (depth, maximum sub-tree, and maximum node) in the social-based group (Mendoza et al., 2010). Link credibility via WOT score and Alexa rank is also used (Boididou et al., 2018).

In bot/fake user detection, besides the user features previously mentioned, additional account-based features are also considered, such as the type of client (mobile, web, API...), the number of favorite tweets, and the length of the description. In addition, features that analyze default account settings (such as the existence of a profile picture or a banner) and features based on the past behavior of users are more frequent. Examples include hashtag, mention, and URL ratios in past tweets (Dickerson et al., 2014; Shu et al., 2018; Chu et al., 2012; Varol et al., 2017; Erşahin et al., 2017; Azab et al., 2016). Features based on the text of past tweets are less common for this task. However, Shu et al. (2018) rely on the users' writing style to predict the gender, age, and other psychological characteristics. In addition, the work in Varol et al.
(2017) also relies on users' past tweets to derive sentiment features for the bot detection task.

We proceed to summarize the models and evaluation metrics commonly used in the presented detection tasks, as well as the best performances achieved.

| Model Types and Evaluation Metrics
Fake news detection is commonly portrayed as a text mining classification task. Therefore, the metrics used for evaluating models built for this task are similar to those of other text classification tasks, such as sentiment analysis or document classification. True positives, false positives, true negatives, and false negatives are normally computed. However, it is precision, recall, F1-score (macro and micro), and accuracy that are the focus of the evaluation of each system. These metrics are adopted according to the imbalance of the data used and the type of task (multi-class or binary). In some studies, the area under the curve (AUC) is also used.

Models that performed well in more traditional text mining tasks were adopted in the context of fake news detection in social media. For example, the studies by Castillo et al. (2011), Krishnan and Chen (2018), and Hamidian and Diab (2015) use Decision Trees and achieve an F-measure between 0.83 and 0.86. On the other hand, Zhang et al. (2015), Wu et al. (2015), Yang et al. (2012), and again Krishnan and Chen (2018) use Support Vector Machines for the task, accomplishing F1-scores between 0.74 and 0.90. Other approaches include the use of ensemble models (0.90 F1-score) (Helmstetter and Paulheim, 2018) and Convolutional Neural Networks (0.95 accuracy) (Volkova et al., 2017). A more uncommon approach is the harmonic boolean label crowdsourcing presented in Tacchini et al. (2017), which relies on the users' social feedback to predict if a post is a hoax or not. Although the authors describe excellent results (99% accuracy), the model presented relies on crowdsourcing the opinion of users based on past behaviour. Thus, it seems infeasible to apply this model in the absence of social feedback, making it unsuitable in an early detection scenario.

Shifting our focus to the fake user/bot detection task, this problem is also generally addressed as a classification task, with several classification algorithms tested and evaluated. In several studies, Random Forests achieve a good performance in distinguishing human and bot accounts, with F1-scores ranging from 0.91 to 0.96 (Azab et al., 2016; Gilani et al., 2017). Furthermore, the same model proves to be efficient in a three-label classification scenario (human, bot, and cyborg), achieving an AUC score of 0.95 (Chu et al., 2012). Naive Bayes is also an often used model, accomplishing similar results (Azab et al., 2016; Erşahin et al., 2017).

| Propagation
In misinformation or fake news propagation models, users are commonly represented as nodes, and the edges as the connections of the users to their friends/followers. When a fake news post starts propagating in the network, each node is assigned a probability of being "affected" by that post. Thus, when analyzing fake news from a network propagation perspective, the problem can be compared with the spreading of an infectious disease, where a node (user) can be infected with a certain probability (Kermack and McKendrick, 1938). That probability may vary according to several factors. First, not all users believe in "fake news", thus it is important to distinguish them into three classes: the "persuaders", whose goal is to spread and support fake news content; the "gullible users", who are easily influenced by fake news content; and the "clarifiers", who are immune to fake news and may confront infected users with fact-checking content (Shu et al., 2019). Homophily and social influence theories contribute to the importance of the friends' network in the "fake news contamination" of a gullible user. Accordingly, the probability of a user believing false information can be computed depending on the beliefs of the friends (i.e. a user who has friends that believe in fake news has a higher probability of being infected) (Wu et al., 2016). Several models and user roles have been proposed based on this approach. Tambuscio et al. (2015) develop a model for the propagation of rumors based on similar user roles (believer, fact-checker, susceptible) and three probabilistic phenomena: spread (the user spreads the rumor), verify (the user fact-checks the rumor), and forget (the user forgets the news). Another study (Litou et al., 2016) considers competing information spreading simultaneously in the network (i.e. the simultaneous spreading of fake news and reliable content).
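Epidemic-style dynamics of this kind can be sketched as a simple stochastic simulation. The graph, transition probabilities, and update rule below are our own illustrative choices in the spirit of the believer/fact-checker/susceptible roles, not the exact model of Tambuscio et al. (2015):

```python
import random

# States: 'S' (susceptible), 'B' (believer), 'F' (fact-checker).
# Illustrative transition probabilities, not calibrated values from any study.
P_SPREAD, P_VERIFY, P_FORGET = 0.3, 0.1, 0.05

def step(states, neighbors, rng):
    """One synchronous update of a hoax-spreading process on a graph."""
    new = dict(states)
    for node, state in states.items():
        nbr_states = [states[n] for n in neighbors[node]]
        if state == "S":
            # Susceptible users may start believing if a neighbor believes.
            if "B" in nbr_states and rng.random() < P_SPREAD:
                new[node] = "B"
        elif state == "B":
            # Believers may fact-check the rumor or simply forget it.
            if rng.random() < P_VERIFY:
                new[node] = "F"
            elif rng.random() < P_FORGET:
                new[node] = "S"
    return new

rng = random.Random(42)
# Small illustrative ring network with one initial believer.
neighbors = {i: [(i - 1) % 10, (i + 1) % 10] for i in range(10)}
states = {i: "S" for i in range(10)}
states[0] = "B"
for _ in range(50):
    states = step(states, neighbors, rng)
counts = {s: list(states.values()).count(s) for s in "SBF"}
print(counts)
```

Varying the probabilities in such a sketch (e.g. raising the verify rate) reproduces, qualitatively, the kind of question these models study: how much fact-checking is needed before the rumor dies out.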
Furthermore, time is an important factor in this model, since the probability of a user reading a "fake news" post from its close connections decreases with time.

Some important results arise from these studies. First, fact-checking activity on the network does not need to occur in large quantities to cancel the propagation of fake news content, and even when a rumor is removed from the network, the fact-checking of users that believe the rumor continues (Tambuscio et al., 2015). Second, the percentage of users that are protected against misinformation increases when the propagation time constraints are more relaxed, and it is smaller when the time constraints to spread the information are more restricted (i.e. when there is an urgency to spread content, more users are infected) (Litou et al., 2016). These results support and help to explain the results of other fake news analysis studies, namely that, in the occurrence of an event, the diffusion of fake news tends to occur in higher quantities (Vosoughi et al., 2018) and that human accounts are mainly responsible for its propagation (Ferrara et al., 2016; Kitzie et al., 2018).

| CONCLUSION
The fake news problem led to an overall increase in the number of studies published on the topic (Figueira et al., 2019). In this work, we present a comprehensive overview of the research and application of data mining techniques for fake news in social media. Although the current studies highlight several important results for understanding fake news, it is our conviction that the problem is still being tackled in a fine-grained fashion and in a time-independent manner, with a focus on event-based analysis and detection. With the exception of the work by Vosoughi et al. (2018), there seems to be an absence of long-term studies around the analysis of fake news in social media. Regarding the models and systems developed, it would be important to evaluate if these can withstand the passage of time and the change of context in fake news. For example, can models developed in the context of past events be used to tackle the disinformation in social media regarding the 2019 novel coronavirus? We argue that the evolution concerning content and social feedback must be studied to understand if the models and features used, and trained, in past experiences are still applicable today. We do believe that the capability of keeping the relevance of the features and the models' performance in different domains and temporal contexts is an essential step towards the detection and mitigation of fake news in social media.

REFERENCES
Aisch, G., Huang, J. and Kang, C. (2016) Dissecting the #PizzaGate Conspiracy Theories. The New York Times.
Azab, A. E., Idrees, A. M., Mahmoud, M. A. and Hefny, H. (2016) Fake account detection in Twitter based on minimum weighted feature set. International Journal of Computer, Electrical, Automation, Control and Information Engineering, 13–18.
Bastos, M. T. and Mercea, D. (2019) The Brexit Botnet and User-Generated Hyperpartisan News. Social Science Computer Review, 38–54.
Bogart, L. and Thorburn, S. (2005) Are HIV/AIDS conspiracy beliefs a barrier to HIV prevention among African Americans? Journal of Acquired Immune Deficiency Syndromes (1999), 213–218.
Boghardt, T. (2009) Soviet bloc intelligence and its AIDS disinformation campaign.
Boididou, C., Papadopoulos, S., Zampoglou, M., Apostolidis, L., Papadopoulou, O. and Kompatsiaris, Y. (2018) Detection and visualization of misleading content on Twitter. International Journal of Multimedia Information Retrieval, 71–86.
Bovet, A. and Makse, H. (2019) Influence of fake news in Twitter during the 2016 US presidential election. Nature Communications, 7.
Castillo, C., Mendoza, M. and Poblete, B. (2011) Information Credibility on Twitter.
Chu, Z., Gianvecchio, S., Wang, H. and Jajodia, S. (2012) Detecting automation of Twitter accounts: Are you a human, bot, or cyborg? IEEE Transactions on Dependable and Secure Computing, 811–824.
Dang, A., Smit, M., Moh'D, A., Minghim, R. and Milios, E. (2016) Toward understanding how users respond to rumours in social media. Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2016, 777–784.
Davis, C. A., Varol, O., Ferrara, E., Flammini, A. and Menczer, F. (2016) BotOrNot: A system to evaluate social bots. In WWW '16 Companion: Proceedings of the 25th International Conference Companion on World Wide Web, 273–274. ACM.
Dickerson, J. P., Kagan, V. and Subrahmanian, V. S. (2014) Using sentiment to detect bots on Twitter: Are humans more opinionated than bots? ASONAM 2014 - Proceedings of the 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 620–627.
Erşahin, B., Aktaş, Ö., Kılınç, D. and Akyol, C. (2017) Twitter fake account detection. 388–392.
Ferrara, E., Varol, O., Davis, C., Menczer, F. and Flammini, A. (2016) The rise of social bots. Communications of the ACM, 96–104.
Figueira, Á., Guimaraes, N. and Torgo, L. (2019) A brief overview on the strategies to fight back the spread of false information. Journal of Web Engineering, 319–352.
Gilani, Z., Kochmar, E. and Crowcroft, J. (2017) Classification of Twitter Accounts into Automated Agents and Human Users. Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM '17, 489–496.
Goel, V., Raj, S. and Ravichandran, P. (2018) How WhatsApp leads mobs to murder in India. The New York Times. Online; posted on 18-Jul-2018.
Grčar, M., Cherepnalkoski, D., Mozetič, I. and Kralj Novak, P. (2017) Stance and influence of Twitter users regarding the Brexit referendum. Computational Social Networks.
Gupta, A. (2011) Twitter Explodes with Activity in Mumbai Blasts! A Lifeline or an Unmonitored Daemon in the Lurking? precog.iiitd.edu.in, 1–17.
Gupta, A., Lamba, H. and Kumaraguru, P. (2013) $1.00 per RT #BostonMarathon #PrayForBoston: Analyzing fake content on Twitter. eCrime Researchers Summit, eCrime.
Hamidian, S. and Diab, M. T. (2015) Rumor Detection and Classification for Twitter Data. Proceedings of SOTICS 2015: The Fifth International Conference on Social Media Technologies, Communication, and Informatics, 71–77.
Helmstetter, S. and Paulheim, H. (2018) Weakly supervised learning for fake news detection on Twitter. Proceedings of the 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2018, 274–277.
Jin, F., Dougherty, E., Saraf, P., Cao, Y. and Ramakrishnan, N. (2013) Epidemiological modeling of news and rumors on Twitter. In Proceedings of the 7th Workshop on Social Network Mining and Analysis, SNAKDD '13. New York, NY, USA: Association for Computing Machinery.
Jin, Z., Cao, J., Guo, H., Zhang, Y., Wang, Y. and Luo, J. (2017) Detection and analysis of 2016 US presidential election related rumors on Twitter. In Social, Cultural, and Behavioral Modeling (eds. D. Lee, Y.-R. Lin, N. Osgood and R. Thomson), 14–24. Cham: Springer International Publishing.
Kang, B., Höllerer, T. and O'Donovan, J. (2015) Believe it or not? Analyzing information credibility in microblogs. Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2015, 611–616.
Kermack, W. O. and McKendrick, A. G. (1938) A contribution to the mathematical theory of epidemics. The American Mathematical Monthly, 446.
Kitzie, V. L., Mohammadi, E. and Karami, A. (2018) "Life never matters in the DEMOCRATS MIND": Examining strategies of retweeted social bots during a mass shooting event. Proceedings of the Association for Information Science and Technology, 254–263.
Klonoff, E. A. and Landrine, H. (1999) Do Blacks believe that HIV/AIDS is a government conspiracy against them? Preventive Medicine, 451–457.
Krishnan, S. and Chen, M. (2018) Identifying tweets with fake news. Proceedings - 2018 IEEE 19th International Conference on Information Reuse and Integration for Data Science, IRI 2018, 460–464.
Litou, I., Kalogeraki, V., Katakis, I. and Gunopulos, D. (2016) Real-time and cost-effective limitation of misinformation propagation. Proceedings - IEEE International Conference on Mobile Data Management, 158–163.
Melo, P., Messias, J., Resende, G., Garimella, K., Almeida, J. and Benevenuto, F. (2019) WhatsApp Monitor: A fact-checking system for WhatsApp. Proceedings of the International AAAI Conference on Web and Social Media, 676–677.
Mendoza, M., Poblete, B. and Castillo, C. (2010) Twitter under crisis: Can we trust what we RT? SOMA 2010 - Proceedings of the 1st Workshop on Social Media Analytics, 71–79.
Qazvinian, V., Rosengren, E., Radev, D. R. and Mei, Q. (2011) Rumor has it: Identifying misinformation in microblogs. Conference on Empirical Methods in Natural Language Processing, 1589–1599.
Resende, G., Melo, P., Sousa, H., Messias, J., Vasconcelos, M., Almeida, J. and Benevenuto, F. (2019) (Mis)information dissemination in WhatsApp: Gathering, analyzing and countermeasures. In The World Wide Web Conference, WWW '19, 818–828. New York, NY, USA: Association for Computing Machinery.
Shao, C., Ciampaglia, G. L., Varol, O., Yang, K., Flammini, A. and Menczer, F. (2017) The spread of low-credibility content by social bots. Nature Communications.
Shao, C., Hui, P. M., Wang, L., Jiang, X., Flammini, A., Menczer, F. and Ciampaglia, G. L. (2018) Anatomy of an online misinformation network. PLoS ONE, 1–23.
Shearer, E. and Matsa, K. E. (2018) News use across social media platforms 2018.
Shu, K., Bernard, H. R. and Liu, H. (2019) Studying Fake News via Network Analysis: Detection and Mitigation, 43–65. Cham: Springer International Publishing.
Shu, K., Wang, S. and Liu, H. (2018) Understanding user profiles on social media for fake news detection. In Proceedings - IEEE 1st Conference on Multimedia Information Processing and Retrieval, MIPR 2018, 430–435. Institute of Electrical and Electronics Engineers Inc.
Starbird, K., Maddock, J., Orand, M., Achterman, P. and Mason, R. M. (2014) Rumors, False Flags, and Digital Vigilantes: Misinformation on Twitter after the 2013 Boston Marathon Bombing. iConference 2014 Proceedings.
Tacchini, E., Ballarin, G., Della Vedova, M. L., Moret, S. and de Alfaro, L. (2017) Some Like it Hoax: Automated Fake News Detection in Social Networks, 1–12. arXiv preprint.
Tambuscio, M., Ruffo, G., Flammini, A. and Menczer, F. (2015) Fact-checking Effect on Viral Hoaxes: A Model of Misinformation Spread in Social Networks. 977–982.
Tausczik, Y. R. and Pennebaker, J. W. (2010) The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology, 24–54.
Varol, O., Ferrara, E., Davis, C. A., Menczer, F. and Flammini, A. (2017) Online human-bot interactions: Detection, estimation, and characterization. In International AAAI Conference on Web and Social Media, 280–289. AAAI.
Volkova, S., Shaffer, K., Jang, J. Y. and Hodas, N. (2017) Separating facts from fiction: Linguistic models to classify suspicious and trusted news posts on Twitter. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 647–653. Vancouver, Canada: Association for Computational Linguistics.
Vosoughi, S., Roy, D. and Aral, S. (2018) The spread of true and false news online. Science, 1146–1151.
Wang, W. Y. (2017) "Liar, liar pants on fire": A new benchmark dataset for fake news detection. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 422–426. Association for Computational Linguistics.
Wu, K., Yang, S. and Zhu, K. Q. (2015) False rumors detection on Sina Weibo by propagation structures. Proceedings - International Conference on Data Engineering, 651–662.
Wu, L., Morstatter, F., Hu, X. and Liu, H. (2016) Mining Misinformation in Social Media. In Big Data in Complex and Social Networks, chap. 5. New York: Taylor & Francis.
Yang, F., Yu, X., Liu, Y. and Yang, M. (2012) Automatic detection of rumor on Sina Weibo.