Assessing Individual and Community Vulnerability to Fake News in Social Networks
Bhavtosh Rath · Wei Gao · Jaideep Srivastava
Abstract
The plague of false information, popularly called fake news, has affected the lives of news consumers ever since the rise of social media. Understanding the spread of false information in social networks has therefore gained a lot of attention in the literature. While most proposed models perform content analysis of the information, not much work has been done to explore the community structures that also play an important role in determining how people get exposed to it. In this paper we build on the idea of Computational Trust in social networks to propose a novel Community Health Assessment model against fake news. Based on the concepts of neighbor, boundary and core nodes of a community, we propose novel evaluation metrics to quantify the vulnerability of nodes (individual level) and communities (group level) to spreading false information. Our model hypothesizes that if the boundary nodes trust the neighbor nodes of a community who are spreaders, the densely connected core nodes of the community are highly likely to become spreaders. We test our model with communities generated using three popular community detection algorithms, based on two new datasets of information spreading networks collected from Twitter. Our experimental results show that the proposed metrics perform clearly better on networks spreading false information than on those spreading true information, indicating that our community health assessment model is effective.
B. Rath and J. Srivastava
Department of Computer Science & Engineering, University of Minnesota, USA
E-mail: {rathx082, srivasta}@umn.edu

W. Gao
School of Information Systems, Singapore Management University
E-mail: [email protected]
1 Introduction

The use of social media platforms like Facebook, Twitter and WhatsApp is ubiquitous in modern times, making them powerful tools for information propagation and consumption. However, such goodness inevitably gets accompanied by the bad due to the innate vulnerability of human users to misinformation, which can be witnessed in the tremendous problem of fake news spreading [16].
Fake news is a recently coined term that means fabricated news. It refers to newsworthy claims that may have no basis in fact, but are presented as being factually truthful.
It gets spread when someone propagates it online via various endorsements such as replying, sharing or re-posting without validating the authenticity of the content.

There is a great deal of interest in the research community in understanding fake news spreading, as summarized by Sharma et al. [37]. Our approach is orthogonal to these, focusing on assessing the vulnerability of social networks to false information spreading. Specifically, our focus is on people and the online communities they create, with the goal of identifying how vulnerable individuals and communities are to believing false information. We propose the Community Health Assessment model, which introduces the ideas of neighbor, boundary and core nodes of a community and proposes novel metrics to quantify the vulnerability of an individual and of the community itself. From a public health perspective, determining whether a piece of news is fake or not is akin to determining whether a virus is injurious to health, while our approach is akin to determining whether an individual or community is vulnerable to being infected by the virus. Thus, the proposed approach provides a complementary perspective, and can be useful in inoculating individuals and communities against the spread of fake news.

To assess the vulnerability of users and their communities, we propose methods to quantify the likelihood of a boundary node of a community believing a news item sent from its immediate neighbors, and also quantify the likelihood of a community's entire boundary node set believing its neighborhood, i.e., the set of nodes outside the community that are connected to at least one member of the community. Intuitively, if an external node infects a member of a community, the likelihood of the entire community getting infected increases due to the high connectivity among community members. Thus, while assessing the vulnerability of a community, we focus on examining the influence of information propagated from external nodes into the community rather than considering the internal propagation of the news within the community. We evaluate our model on multiple real-world information spreading networks from Twitter.

In this paper, we make the following three novel contributions:
– We propose the Community Health Assessment model that introduces the ideas of neighbor, boundary and core nodes for a community structure in a social network.
– We propose metrics to quantify the vulnerability of nodes and communities to fake news spreading from outside. The health analogy here is that fake news is akin to infection, and quantifying vulnerability is akin to assessing immunity to the spread of infection.
– We present an evaluation of the proposed metrics using two datasets based on a set of fact-checked news events, one from snopes.com and the other from a fact-checking website in India. We demonstrate that our proposed metrics can much better assess the vulnerability of social networks to fake news than to regular news.

To the best of our knowledge, this is the first work to systematically quantify the vulnerability of online users and communities to fake news.

The rest of the paper is organized as follows: We first discuss the Related Work, followed by an explanation of the Community Health Assessment model and the preliminary ideas that it builds upon. We then explain the algorithm to quantify the vulnerability metrics. Next, in the Experiments and Results section we explain the data collection process, the datasets, the metrics used for evaluation, and the results. Finally, we provide concluding remarks and summarize the scope of future work.
2 Related Work

We briefly describe prior work in three broadly related domains: Misinformation Detection, Rumor Spreading Models and Computational Trust.

2.1 Misinformation Detection

There has been a surge of interest among researchers over the past few years in building models to detect misinformation. Most approaches in the literature model content-based and network-based characteristics of the misinformation. These methods include approaches to capture the style and language of articles [10], hyperpartisan news content [24] and cues that map language to perceived levels of credibility [20]. Many classification models distinguishing true and false information have also been proposed. Perez-Rosas et al. [23] proposed a fake news detection model using linguistic features. Yang et al. [32] proposed a classification model using client- and location-based features extracted from micro-blogging websites. Network-based approaches that try to model the propagation structures of false information have also been proposed [13,9,25,18]. The use of neural networks has gained strong attention recently. Convolutional neural networks [3], recurrent neural networks [17] and numerous variants of these fundamental models have shown promising results. Zhang et al. [42] applied a graph neural network model that aggregates textual information from news articles, creators and subjects to identify news veracity. Ma et al. [40] applied generative adversarial networks to counter rumor dissemination by generating confusing training examples to challenge the discriminator's detection capacity. Shu et al. [41] applied an attention mechanism that captures both news contents and user comments to propose
an explainable fake news detection system. Khattar et al. [38] used textual and visual information in a variational autoencoder model coupled with a binary classifier for the task of fake news detection. Lu and Li [39] integrated an attention mechanism with graph neural networks, using text information and propagation structure to identify whether the source information is fake or not.

2.2 Rumor Spreading Models

Infection spread models from epidemiology, namely SIR (Susceptible, Infected, Recovered) [21], SIS (Susceptible, Infected, Susceptible) [15], SEIZ (Susceptible, Exposed, Infected, Skeptic) [44], SIHR (Spreaders, Ignorants, Hibernators, Removed) [33] and their variants [45,44,43,46,47] have been widely used to model information spreading, including rumors. Modelling rumor spreading as cascade structures in social networks is also well studied [9,31]. Other models have been proposed to identify the source of rumor spreading [29,35]. Fan et al. [8] proposed a model to maximize rumor containment within a fixed number of initial protectors and a given time deadline. Social networks are naturally composed of disjoint communities, with relations formed within communities stronger than relations formed across communities. Focusing on such communities to understand rumor spread is a domain with a lot of research potential. Fan et al. [5] proposed an approach to identify a minimal set of boundary nodes that would prevent the spread of rumors from neighboring communities. Nguyen et al. [22] proposed a community-based heuristic method to find the smallest set of highly influential nodes whose decontamination with good information would contain rumor spreading. Vosoughi et al. [31] is another closely related work that empirically investigated the spread of true and false news online on a large real-world data repository from Twitter and concluded that false news spreads faster and deeper in networks compared to true news. Susceptibility of users to fake news has also been studied from a content analysis [50] and information diffusion [48] perspective. What makes this work novel is that we propose content-agnostic metrics based on the underlying network structure.

2.3 Computational Trust

Computational social scientists have been interested in quantifying the concept of trust in various domains [1], with online social networks being one of them [30]. One of the first works in the area of trust propagation in networks was by Ziegler and Lausen [36]. Some researchers have attempted to understand the role of trust in message propagation during time-critical situations [11]. Others have worked on assigning scores to nodes in a trust network based on various structural aspects. Kamvar et al. [14] proposed EigenTrust to rate trust scores of peers in a P2P network. Mishra and Bhattacharya [19] proposed an iterative
matrix convergence algorithm to compute the bias and prestige of nodes in a network. Inspired by the HITS algorithm, Roy et al. [27] proposed the Trust in Social Media (TSM) algorithm to compute a pair of complementary trust scores for every node in a social network, upon which our vulnerability measures are built.

Fig. 1: The three levels of nodes that a piece of information can affect during its propagation. The neighbor nodes (N), boundary nodes (B) and core nodes (C) are represented in black, grey and white, respectively. Boundary edges (connecting neighbor and boundary nodes) are represented by dotted lines.

3 Community Health Assessment Model

A social network has the characteristic property of exhibiting community structures, which are formed based on inter-node interactions. Communities tend to be modular groups where within-group members are highly connected, and across-group members are loosely connected. Modularity refers to the ratio of the density of edges inside a community to that of the edges outside the community. Thus, based on edge density, members within a community tend to have a higher degree of trust in each other than members across different communities. Also, there is variation in the level of inter-member trust across different communities due to varying modularities. If such communities are exposed to false information being propagated from neighboring nodes, the likelihood of the whole community getting infected is high. Thus it is important to identify vulnerable communities that lie in the path of false information spreading in order to protect them and thereby limit the overall influence of false information in the network.
Motivated by this idea, we propose the Community Health Assessment model. We first propose the ideas of neighbor, boundary and core nodes of a community, and then propose metrics to quantify the vulnerability of nodes and communities based on fundamental measures of trust. Figure 1 illustrates the three groups of nodes, with respect to a community, that are affected during the process of information spreading:
– Neighbor nodes: These nodes are directly connected to at least one node of the community. The set of neighbor nodes is denoted by N. They are not a part of the community.
– Boundary nodes: These are community nodes that are directly connected to at least one neighbor node. The set of boundary nodes is denoted by B. Edges connecting neighbor nodes to boundary nodes are called boundary edges.
– Core nodes: These nodes are only connected to members within the community. The set of core nodes is denoted by C.

3.1 Preliminaries

In social media studies, researchers have used social networks to understand how trust manifests among users. An inspiring work is the Trust in Social Media (TSM) algorithm, which assigns a pair of complementary trust scores to each actor, called Trustingness and Trustworthiness scores [27]. Trustingness quantifies the propensity of an actor to trust its neighbors, and Trustworthiness quantifies the willingness of the neighbors to trust the actor. The TSM algorithm takes a user network, i.e., a directed graph G(V, E), as input, together with a specified convergence criterion or a maximum permitted number of iterations. In each iteration, for every node in the network, trustingness and trustworthiness scores are computed using the equations below:

ti(v) = Σ_{∀x ∈ out(v)} [ w(v, x) / (1 + (tw(x))^s) ]    (1)

tw(u) = Σ_{∀x ∈ in(u)} [ w(x, u) / (1 + (ti(x))^s) ]    (2)

where u, v, x ∈ V are user nodes, ti(v) and tw(u) are the trustingness and trustworthiness scores of v and u, respectively, w(v, x) is the weight of the edge from v to x, out(v) is the set of outgoing edges of v, in(u) is the set of incoming edges of u, and s is the involvement score of the network. Involvement is basically the potential risk an actor takes when creating a link in the network, and is set empirically to a constant. Once the trust scores are calculated for each node in the network, TSM normalizes the scores so that both the sum of trustworthiness and the sum of trustingness over all nodes in the network equal 1. However, a salient problem with such a normalization method is that the scale of the scores depends on the size of the network: when the network is very large, the resulting scores become extraordinarily small. To deal with this issue, min-max normalization based on the logarithm of the scores output by TSM can be used to normalize the scores into the range (0, 1]. Details about the TSM algorithm can be found in [26].

Believability is an edge score derived from the Trustingness and Trustworthiness scores [25]. It helps us quantify the potential, or strength, of directed edges to transmit information by capturing the intensity of the connection between the sender and receiver.
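As an illustration, the iterative computation in Eqs. 1 and 2 might be sketched as follows. This is a simplified sketch, not the authors' released implementation; the initialization, edge weights, involvement score s, and fixed iteration count are illustrative assumptions:

```python
# Simplified sketch of the TSM iteration (Eqs. 1-2). Initialization,
# edge weights, involvement score s, and the fixed iteration count are
# illustrative assumptions, not the published implementation.

def tsm_scores(edges, num_iter=10, s=0.5):
    """edges: dict mapping a directed edge (u, v) to its weight w(u, v)."""
    nodes = {u for u, _ in edges} | {v for _, v in edges}
    ti = {n: 1.0 for n in nodes}  # trustingness
    tw = {n: 1.0 for n in nodes}  # trustworthiness
    for _ in range(num_iter):
        new_ti = dict.fromkeys(nodes, 0.0)
        new_tw = dict.fromkeys(nodes, 0.0)
        for (u, v), w in edges.items():
            # Eq. 1: edge (u, v) is an out-edge of u
            new_ti[u] += w / (1.0 + tw[v] ** s)
            # Eq. 2: edge (u, v) is an in-edge of v
            new_tw[v] += w / (1.0 + ti[u] ** s)
        ti, tw = new_ti, new_tw
    return ti, tw
```

Min-max normalization of the logarithm of these raw scores, as described above, would then map them into (0, 1].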
Believability for a directed edge is computed as a function of the trustworthiness of the sender and the trustingness of the receiver. More specifically, given users u and v in the context of microblogs such as Twitter, a directed edge from u to v exists if u follows v. Believability quantifies the strength with which u trusts v when u decides to follow v. Therefore, u is very likely to believe v if:
1. v has a high trustworthiness score, i.e., v is highly likely to be trusted by other users in the network, or
2. u has a high trustingness score, i.e., u is highly likely to trust others.
The believability score is thus proportional to these two values, which can be jointly determined and computed as follows:

Believability(u → v) = tw(v) ∗ ti(u)    (3)

This idea was previously applied in [25], where a classification model was built to identify rumor spreaders in a Twitter user network based on the believability measure. Based on [34], information posted by a person the reader has deliberately selected to follow on Twitter is perceived as useful and trustworthy, which intuitively implies that the follow relation can be considered a proxy for trust.

3.2 Vulnerability Metrics

Motivation:
False information generally gets very low coverage from mainstream news platforms (such as press or television), so an important factor contributing to a user's decision to spread fake news on social media is their inherent trust in the neighbor endorsing it. On the other hand, a user is more likely to endorse a true news item, since it is typically endorsed by multiple credible news sources. Thus, we hypothesize that the less credible nature of false information makes it much more reliant on users' trust relationships for spreading further than true news is. Consequently, we propose our vulnerability
metrics based upon the idea of computational trust, particularly believability, for assessing the health of individuals and communities encountering false information.

Fig. 2: Illustration of vulnerability to false information spreading. Red nodes are the fake news spreaders and are C1 and C3's neighbor nodes. Dotted lines denote edges connecting C1 and C3's neighbor nodes to boundary nodes.
An Illustrative Example: We further illustrate the idea of our proposed vulnerability metrics through Figure 2. In this example, red nodes in community C2 represent fake news spreaders, and C1 and C3 are two other communities with identical structure. We see that C3 and C1 have 3 and 2 boundary nodes, respectively, that are directly connected to the fake news spreaders in C2 (with the edges represented by dotted lines). Based on edge count alone, one would think that C3 is more vulnerable to fake news spreading than C1. However, this speculation does not hold, because the boundary nodes of C3, which have low trustingness scores, are connected to spreaders in C2 with low trustworthiness scores, while the boundary nodes of C1, which have high trustingness scores, are connected to spreaders in C2 with high trustworthiness scores. Therefore, information is more likely to flow from C2 to C1 than to C3, i.e., C1 is more vulnerable. Our proposed metric is expected to identify C1 as more vulnerable than C3.

With believability (Eq. 3), which is defined on top of the trustingness and trustworthiness scores derived from the TSM algorithm, we now derive the metrics to quantify the vulnerability of nodes and communities to false information spreading. The proposed vulnerability metrics help us quantify the likelihood of boundary nodes and communities believing some information spreading from their neighbors. We assume that the information spreading is widespread outside of the community, i.e., at least some of the neighbor nodes of the community are spreaders. We define the node- and community-level metrics as follows:
1. Vulnerability of a boundary node, V(b): This metric measures the likelihood of a boundary node b becoming a spreader. It is important to note that the method used to quantify the vulnerability of a boundary node can be generalized to any node. The metric is derived as follows: The likelihood of node b believing an immediate neighbor n is a function of the trustworthiness of the neighbor n (n ∈ N_b, where N_b is the set of all neighbor nodes of b) and the trustingness of b, and is quantified as bel_nb = tw(n) ∗ ti(b), that is, Believability(n → b). Thus, the likelihood that b is not vulnerable to n can be quantified as (1 − bel_nb). Generalizing this, the likelihood of b not being vulnerable to any of its neighbor nodes is ∏_{∀n ∈ N_b} (1 − bel_nb). Therefore, the likelihood of b believing any of its neighbors, i.e., the vulnerability of the boundary node b, is computed as:

V(b) = 1 − ∏_{∀n ∈ N_b} (1 − bel_nb)    (4)

2. Vulnerability of a community, Ṽ(C): To compute the vulnerability of a community, we take the community health perspective, i.e., the vulnerability of the community to information approaching from the neighbor nodes (outside the community) towards the boundary nodes (the circumference of the community). As this scenario does not include information diffusion within the community, the metric is independent of the core nodes of the community. This metric measures the likelihood of the boundary node set of a community C (B_C) believing information from any of its neighbors. The metric is derived as follows: Following the idea in 1), the likelihood that a boundary node b is not vulnerable to its neighbors can be quantified as (1 − V(b)). Generalizing this to all b ∈ B_C, the likelihood that none of the boundary nodes of a community is vulnerable to its neighbors can be quantified as ∏_{∀b ∈ B_C} (1 − V(b)). Thus, the likelihood of community C being vulnerable to any of its neighbors, i.e., the vulnerability of the community, is defined as:

Ṽ(C) = 1 − ∏_{∀b ∈ B_C} (1 − V(b))    (5)

The pseudo-code of the algorithm to generate the vulnerability metrics is provided in Algorithm 1.
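As a concrete sketch, Eqs. 3–5 might be implemented as below. The data layout and helper names are illustrative assumptions, not the authors' released code:

```python
# Illustrative sketch of Eqs. 3-5; dict-based data layout is an assumption.
from math import prod

def believability(n, b, ti, tw):
    # Eq. 3 for boundary node b and its neighbor n: bel_nb = tw(n) * ti(b)
    return tw[n] * ti[b]

def node_vulnerability(b, neighbors, ti, tw):
    # Eq. 4: V(b) = 1 - prod over n in N_b of (1 - bel_nb)
    return 1.0 - prod(1.0 - believability(n, b, ti, tw) for n in neighbors)

def community_vulnerability(boundary_to_neighbors, ti, tw):
    # Eq. 5: V~(C) = 1 - prod over b in B_C of (1 - V(b))
    return 1.0 - prod(
        1.0 - node_vulnerability(b, ns, ti, tw)
        for b, ns in boundary_to_neighbors.items()
    )
```

For example, a boundary node with trustingness 0.5 and neighbors of trustworthiness 0.4 and 0.1 gets V(b) = 1 − (1 − 0.2)(1 − 0.05) = 0.24.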
4 Experiments and Results

4.1 Data Collection

We collected two datasets, DS1 and DS2, summarized in Figure 3, consisting of tweets associated with news articles with confirmed ground truth of veracity from various fact-checking websites. DS1 consists of three types of news networks: news labeled Mixture (M1, M2, M3, M4), which indicates that the news has significant elements of both truth and falsity in it; news labeled False (F1, F2, F3, F4), which indicates that the
Algorithm 1: Vulnerability Metrics Computation

Input: G(V, E): spreaders' follower-following network
Output: V(b): vulnerability of each boundary node; Ṽ(C): vulnerability of each community

(ti, tw) ∀v ∈ V ← trust scores using TSM(G)
φ ← disjoint communities in G
B_C ← set of boundary nodes for community C ∈ φ
N_b ← set of neighbor nodes for boundary node b
for each C ∈ φ do
    for each b ∈ B_C do
        for each n ∈ N_b do
            bel_nb = tw(n) ∗ ti(b)
        end for
        V(b) = 1 − ∏_{∀n ∈ N_b} (1 − bel_nb)
    end for
    Ṽ(C) = 1 − ∏_{∀b ∈ B_C} (1 − V(b))
end for

Table 1: Metadata for DS1
Network | ID | No. of nodes | No. of edges | Snopes link
Mixture | M1, M2, M3, M4 | | |
False | F1, F2, F3, F4 | | |
True | T1, T2, T3, T4 | | |

primary elements of the news are basically false; and news labeled True (T1, T2, T3, T4), which indicates that the primary elements of a claim are basically true. DS2 consists of ten news events N1, . . . , N10 collected from fact-checking websites based in India, with each news event containing a false information network, denoted FN1, . . . , FN10, and its corresponding refutation information network, denoted TN1, . . . , TN10. Refutation information can be defined as true information that fact-checks a specific item of false information. It is created soon after a false information item is debunked and tends to co-exist with the false information. We use (F ∪ T)N1, . . . , (F ∪ T)N10 to denote the networks obtained by combining the false and refutation networks for specific news events. The metadata about the news events of DS1 and DS2 are presented in Tables 1 and 2.

Table 2: Metadata for DS2
ID | No. of nodes | No. of edges | Link of debunked article
N1–N7 | | |
N8 | | | navbharattimes.indiatimes.com/viral-adda/fake-news-buster/news-about-former-srilankan-cricketer-sanath-jayasuriyas-death-is-a-hoax-24858/
N9 | | | smhoaxslayer.com/%E2%80%8Bimported-dogs-stone-pelters-for-kasmir-or-imported-entire-video-for-inciting-communal-hatred/
N10 | | |
Fig. 3: Dataset summary.

We identified the specific source tweet related to each information item in question. For evaluation of the metrics, we then identified all the spreaders of the source tweet associated with the news, comprising the source tweeter (identified using the Twitter API) and the list of retweeters (accessible through twren.ch or the Twitter search API). We considered the follower-following network of the spreaders, obtained from the Twitter API, as a proxy for the social network. A code implementation and sample dataset are also provided at https://github.com/BhavtoshRath/Vulnerability-Metrics.

To evaluate our proposed metrics we used the collection of twenty-two different news spreading networks. We ran the TSM algorithm [27] on each follower-following network to compute the trustingness and trustworthiness scores for every node in the network. We then identified disjoint communities by trying three community detection algorithms that are popular for large networks: Louvain [2] solves an optimization problem that tries to maximize the modularity of communities. The Infomap [7] algorithm is based on the principles of information theory. In contrast to maximizing modularity, the fundamental approach of Infomap is to utilize flows in the graph. It uses the map equation framework, which characterizes community detection as the problem of finding a minimum-information description of a random walk process. Label Propagation [6] starts off by assigning a unique label to each node, and then iteratively assigns each node the label most common amongst its neighbors. As a greedy algorithm, Label Propagation is more efficient, being linear in the number of edges of the graph. For each of the communities generated, we identified the sets of boundary and neighbor nodes and then computed the vulnerability metrics (see Algorithm 1). The network statistics based on the Community Health Assessment model for DS1 and DS2 are shown in Tables 3–6.
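To make the Label Propagation idea concrete, a minimal synchronous sketch is shown below. The cited algorithm [6] and library implementations differ in update order and tie-breaking; the adjacency layout and smallest-label tie-breaking here are illustrative assumptions:

```python
# Minimal synchronous sketch of Label Propagation (toy variant; the
# published algorithm and library versions differ in update order and
# tie-breaking). Ties are broken by choosing the smallest label.
from collections import Counter

def label_propagation(adj, num_iter=20):
    """adj: dict mapping each node to the set of its (undirected) neighbors."""
    labels = {n: n for n in adj}  # start with a unique label per node
    for _ in range(num_iter):
        new_labels = {}
        for n, neigh in adj.items():
            if not neigh:
                new_labels[n] = labels[n]
                continue
            counts = Counter(labels[m] for m in neigh)
            # most common neighbor label; smallest label on ties
            new_labels[n] = max(sorted(counts), key=counts.get)
        if new_labels == labels:
            break  # converged
        labels = new_labels
    return labels
```

On a toy graph of two triangles joined by a single edge, the two triangles end up with distinct labels, i.e., two communities.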
A general observation is that the Label Propagation algorithm tends to generate a larger number of communities, while Infomap generates fewer communities. Louvain gives more balanced results in terms of the size and count of communities.

4.2 Evaluation of Metrics

To measure how well the proposed metrics quantify the vulnerability of nodes and communities, we evaluate the quality of the ranking of boundary nodes and communities based on vulnerability scores, in comparison with the ground-truth ranking of nodes and communities derived from the news spread in the network. We adopt ranking evaluation measures widely used in the Information Retrieval literature [28].

A vulnerable boundary node is highly likely to have strong believability with its neighbors. We thus consider the ground truth of a vulnerable node to be a node which retweets. The ground-truth vulnerability of boundary nodes is binary, as we only have information on whether a node retweets or not. We thus evaluate this metric using Average Precision@k and Mean Average Precision.

Average Precision@k (AP@k): We first compute Precision@k (viz. the top-k vulnerable boundary nodes based on the metric, as a percentage of the spreader boundary nodes in a community) and then compute Average Precision@k (AP@k), viz. the average of the Precision@k values over all communities in a network.

Mean Average Precision (MAP): Mean Average Precision is computed as the mean of the average precision scores for the top-k boundary nodes over all communities in a network. The formula to compute MAP is given below. (From here on, L: Louvain, I: Infomap, LP: Label Propagation in all tables.)
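A small sketch of these ranking measures (standard IR-style definitions; the data layout is an illustrative assumption):

```python
# Sketch of Precision@k and AP@k (standard IR-style definitions; the
# (ranked nodes, spreader set) data layout is an illustrative assumption).

def precision_at_k(ranked, spreaders, k):
    """Fraction of the top-k ranked boundary nodes that are spreaders."""
    return sum(1 for n in ranked[:k] if n in spreaders) / k

def average_precision_at_k(communities, k):
    """AP@k: average of Precision@k over all communities in a network.
    communities: list of (ranked_boundary_nodes, spreader_set) pairs."""
    return sum(precision_at_k(r, s, k) for r, s in communities) / len(communities)
```

For example, if the top-2 ranked nodes of a community contain one actual spreader, Precision@2 is 0.5.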
Table 3: Community statistics for DS1. For each network and community detection algorithm, the columns give: number of communities (C); avg. nodes per C; avg. infected nodes per C; avg. boundary (B) edges; avg. B nodes; avg. neighbor nodes; avg. infected B nodes; avg. infected neighbor (N) nodes.

M1 | L | 54 | 45,004 | 53 | 69,040 | 7,107 | 14,401 | 47 | 774
M1 | I | 36 | 68,148 | 81 | 5,594 | 1,778 | 1,408 | 59 | 376
M1 | LP | 786 | 3,038 | 4 | 603 | 215 | 266 | 3 | 38
M2 | L | 67 | 54,764 | 34 | 28,250 | 3,300 | 13,717 | 32 | 494
M2 | I | 5 | 733,843 | 459 | 1,274 | 716 | 453 | 74 | 120
M2 | LP | 931 | 3,941 | 2 | 1,080 | 264 | 620 | 2 | 50
M3 | L | 72 | 89,756 | 39 | 20,406 | 2,878 | 11,371 | 36 | 412
M3 | I | 14 | 461,604 | 202 | 49,791 | 7,848 | 20,097 | 186 | 558
M3 | LP | 1,341 | 4,819 | 2 | 1,150 | 240 | 702 | 2 | 60
M4 | L | 99 | 35,477 | 27 | 10,606 | 2,285 | 2,996 | 23 | 484
M4 | I | 37 | 94,924 | 72 | 16,081 | 3,933 | 3,764 | 66 | 480
M4 | LP | 1,637 | 2,146 | 2 | 709 | 191 | 292 | 2 | 50
F1 | L | 28 | 67,262 | 103 | 218,939 | 14,547 | 34,442 | 99 | 1,028
F1 | I | 8 | 235,416 | 360 | 1,482 | 775 | 616 | 81 | 143
F1 | LP | 480 | 3,924 | 6 | 933 | 340 | 455 | 5 | 40
F2 | L | 50 | 99,626 | 57 | 51,664 | 5,793 | 21,101 | 51 | 660
F2 | I | 4 | 1,245,330 | 708 | 1,454 | 760 | 637 | 89 | 118
F2 | LP | 677 | 7,358 | 4 | 2,318 | 396 | 1,542 | 4 | 84
F3 | L | 15 | 52,147 | 31 | 417,933 | 16,382 | 52,259 | 31 | 365
F3 | I | 133 | 5,881 | 3 | 6,722 | 1,075 | 3,225 | 3 | 157
F3 | LP | 15 | 52,147 | 31 | 5,227 | 2,285 | 2,514 | 24 | 83
F4 | L | 15 | 33,544 | 19 | 338,248 | 13,848 | 56,711 | 19 | 246
F4 | I | 38 | 13,241 | 8 | 11,255 | 2,182 | 5,484 | 8 | 171
F4 | LP | 7 | 71,880 | 41 | 1,779 | 992 | 744 | 22 | 64
T1 | L | 47 | 232,538 | 59 | 47,189 | 2,171 | 42,783 | 39 | 246
T1 | I | 34 | 321,450 | 82 | 5,792 | 1,390 | 2,261 | 52 | 189
T1 | LP | 1,283 | 8,519 | 2 | 2,151 | 202 | 1,724 | 2 | 54
T2 | L | 37 | 25,758 | 5 | 4,150 | 509 | 3,095 | 3 | 36
T2 | I | 9 | 105,893 | 22 | 5,650 | 1,418 | 1,777 | 17 | 60
T2 | LP | 159 | 5,994 | 1 | 1,102 | 189 | 752 | 1 | 25
T3 | L | 27 | 79,849 | 26 | 10,135 | 1,942 | 5,251 | 18 | 180
T3 | I | 629 | 3,428 | 1 | 1,266 | 161 | 641 | 1 | 124
T3 | LP | 209 | 10,315 | 3 | 1,138 | 303 | 584 | 3 | 46
T4 | L | 89 | 17,202 | 12 | 4,511 | 908 | 1,502 | 10 | 205
T4 | I | 1,206 | 1,269 | 1 | 544 | 92 | 271 | 1 | 99
T4 | LP | 797 | 1,921 | 1 | 723 | 164 | 279 | 1 | 53
Table 4: Community statistics for false information in DS2. Columns as in Table 3.

FN1 | L | 37 | 23,935 | 25 | 9,654 | 1,482 | 3,116 | 20 | 163
FN1 | I | 3 | 295,199 | 314 | 17,466 | 4,786 | 3,224 | 159 | 373
FN1 | LP | 220 | 4,025 | 4 | 1,299 | 322 | 623 | 4 | 46
FN2 | L | 66 | 39,510 | 69 | 35,877 | 2,274 | 16,655 | 62 | 562
FN2 | I | 6 | 434,605 | 759 | 1 | 1 | 1 | 1 | 1
FN2 | LP | 280 | 9,313 | 16 | 2,148 | 250 | 1,571 | 9 | 40
FN3 | L | 53 | 44,215 | 65 | 23,464 | 2,774 | 8,280 | 59 | 443
FN3 | I | 2,497 | 956 | 1 | 926 | 67 | 558 | 1 | 117
FN3 | LP | 313 | 7,628 | 11 | 1,519 | 347 | 955 | 7 | 40
FN4 | L | 37 | 55,031 | 24 | 8,758 | 1,102 | 6,539 | 20 | 144
FN4 | I | 2 | 1,018,081 | 447 | 3,744 | 1,123 | 1,959 | 75 | 84
FN4 | LP | 214 | 9,515 | 4 | 1,918 | 299 | 1,356 | 4 | 43
FN5 | L | 47 | 11,037 | 17 | 10,685 | 1,617 | 3,319 | 17 | 234
FN5 | I | 738 | 703 | 1 | 1,107 | 107 | 587 | 1 | 155
FN5 | LP | 119 | 4,359 | 7 | 1,646 | 426 | 848 | 5 | 49
FN6 | L | 26 | 10,653 | 6 | 2,140 | 422 | 1,085 | 5 | 50
FN6 | I | 4 | 69,246 | 40 | 1,734 | 584 | 564 | 17 | 64
FN6 | LP | 97 | 2,856 | 2 | 768 | 152 | 379 | 2 | 29
FN7 | L | 20 | 7,230 | 4 | 1,261 | 381 | 324 | 3 | 27
FN7 | I | 117 | 1,236 | 1 | 337 | 74 | 92 | 1 | 29
FN7 | LP | 35 | 4,131 | 2 | 724 | 245 | 207 | 2 | 16
FN8 | L | 17 | 23,188 | 7 | 1,479 | 308 | 911 | 5 | 49
FN8 | I | 4 | 98,551 | 30 | 3,439 | 1,060 | 1,253 | 17 | 60
FN8 | LP | 83 | 4,749 | 1 | 494 | 117 | 200 | 1 | 22
FN9 | L | 43 | 11,092 | 11 | 3,673 | 802 | 1,219 | 10 | 135
FN9 | I | 487 | 979 | 1 | 538 | 77 | 224 | 1 | 90
FN9 | LP | 162 | 2,944 | 3 | 830 | 221 | 356 | 3 | 32
FN10 | L | 55 | 19,506 | 22 | 5,681 | 853 | 2,455 | 20 | 200
FN10 | I | 1,045 | 1,027 | 1 | 570 | 52 | 281 | 1 | 98
FN10 | LP | 216 | 4,967 | 6 | 1,066 | 220 | 641 | 5 | 33
Table 5: Community statistics for refutation information in DS2. Columns as in Table 3.

TN1 | L | 40 | 11,338 | 10 | 5,856 | 856 | 3,018 | 8 | 96
TN1 | I | 2 | 226,769 | 200 | 1,260 | 606 | 151 | 37 | 58
TN1 | LP | 154 | 2,945 | 3 | 1,564 | 274 | 1,019 | 2 | 39
TN2 | L | 47 | 9,226 | 10 | 3,562 | 540 | 1,648 | 9 | 103
TN2 | I | 472 | 919 | 1 | 581 | 58 | 327 | 1 | 80
TN2 | LP | 167 | 2,597 | 3 | 1,042 | 169 | 641 | 3 | 34
TN3 | L | 15 | 86,491 | 32 | 7,305 | 987 | 5,160 | 10 | 67
TN3 | I | 457 | 2,839 | 1 | 757 | 64 | 437 | 1 | 80
TN3 | LP | 84 | 15,445 | 6 | 1,497 | 260 | 1,032 | 5 | 29
TN4 | L | 45 | 23,522 | 11 | 5,399 | 590 | 3,950 | 10 | 102
TN4 | I | 523 | 2,024 | 1 | 740 | 58 | 502 | 1 | 77
TN4 | LP | 214 | 4,946 | 2 | 1,211 | 167 | 827 | 2 | 39
TN5 | L | 15 | 17,513 | 6 | 4,895 | 305 | 4,376 | 2 | 28
TN5 | I | 2 | 131,346 | 46 | 5,650 | 936 | 2,874 | 5 | 45
TN5 | LP | 40 | 6,567 | 2 | 1,769 | 159 | 1,449 | 2 | 19
TN6 | L | 9 | 7,458 | 4 | 772 | 220 | 333 | 3 | 10
TN6 | I | 103 | 652 | 1 | 106 | 25 | 48 | 1 | 16
TN6 | LP | 26 | 2,582 | 1 | 376 | 112 | 99 | 1 | 13
TN7 | L | 20 | 3,067 | 6 | 1,280 | 267 | 478 | 5 | 38
TN7 | I | 2 | 30,666 | 57 | 1,636 | 871 | 370 | 26 | 22
TN7 | LP | 49 | 1,252 | 2 | 648 | 149 | 290 | 2 | 21
TN8 | L | 4 | 310,826 | 23 | 2,152 | 233 | 1,723 | 7 | 31
TN8 | I | 2 | 621,653 | 47 | 1,968 | 601 | 943 | 15 | 42
TN8 | LP | 64 | 19,427 | 1 | 465 | 98 | 208 | 1 | 21
TN9 | L | 20 | 13,821 | 5 | 1,482 | 324 | 836 | 4 | 37
TN9 | I | 3 | 92,143 | 32 | 233 | 81 | 132 | 9 | 16
TN9 | LP | 78 | 3,544 | 1 | 564 | 120 | 244 | 1 | 28
TN10 | L | 5 | 29,757 | 7 | 1,098 | 283 | 643 | 5 | 20
TN10 | I | 49 | 3,036 | 1 | 214 | 54 | 84 | 1 | 14
TN10 | LP | 31 | 4,800 | 1 | 347 | 119 | 90 | 1 | 13
Table 6: Community statistics for the false and refutation information networks combined in DS2. Columns: No. of communities (C); Avg. No. of nodes/C; Avg. No. of infected nodes/C; Avg. No. of boundary (B) edges; Avg. No. of B nodes; Avg. No. of neighbor (N) nodes; Avg. No. of infected B nodes; Avg. No. of infected N nodes. Algorithms: L = Louvain, I = Infomap, LP = Label Propagation.

(F∪T)N1   L    40     30,764   33   11,340  2,005  4,302   27   216
          I    5      246,112  267  18,909  4,486  2,997   177  496
          LP   287    4,288    5    1,718   353    982     4    53
(F∪T)N2   L    61     47,556   82   42,135  2,893  18,601  74   603
          I    6      483,488  836  1       1      1       1    1
          LP   321    9,037    16   2,362   284    1,759   10   47
(F∪T)N3   L    48     51,030   79   29,521  3,331  10,170  72   574
          I    2,647  925      1    952     68     549     1    105
          LP   316    7,751    12   1,488   344    929     7    41
(F∪T)N4   L    41     64,961   33   16,483  1,784  10,743  32   230
          I    1,240  2,148    1    964     67     645     1    99
          LP   419    6,357    3    2,044   242    1,387   3    54
(F∪T)N5   L    35     21,636   25   14,600  2,047  4,522   23   219
          I    807    938      1    1,075   105    572     1    148
          LP   142    5,333    6    2,023   418    1,210   5    51
(F∪T)N6   L    31     10,574   6    2,278   398    1,333   5    5
          I    217    1,511    1    545     73     284     1    56
          LP   115    2,850    2    834     150    442     2    31
(F∪T)N7   L    27     7,188    7    1,976   404    664     6    41
          I    3      64,692   61   6,504   1,590  1,535   52   79
          LP   78     2,488    2    882     217    359     2    27
(F∪T)N8   L    14     114,353  15   3,801   371    3,106   5    28
          I    3      533,649  70   9,649   1,620  6,213   33   95
          LP   123    13,016   2    755     122    465     2    26
(F∪T)N9   L    40     18,008   14   4,912   964    2,089   10   109
          I    3      240,101  192  397     174    194     17   27
          LP   192    3,752    3    1,006   237    479     3    39
(F∪T)N10  L    50     23,956   25   6,125   898    2,915   17   161
          I    1,096  1,093    1    576     51     291     1    98
          LP   228    5,253    5    1,152   231    709     5    36
Table 7: Evaluation of vulnerability of boundary nodes for DS1. For each metric the three columns give results for communities generated by L = Louvain, I = Infomap and LP = Label Propagation.

      AP@1                 AP@5                 AP@10                AP@15                MAP
      L     I     LP       L     I     LP       L     I     LP       L     I     LP       L     I     LP
M1    0.759 0.676 0.712    0.736 0.548 0.519    0.606 0.543 0.533    0.661 0.505 0.566    0.672 0.546 0.555
M2    0.818 0.749 0.907    0.769 0.733 0.799    0.821 0.699 0.999    0.733 0.666 0.999    0.785 0.733 0.875
M3    0.805 0.642 0.878    0.567 0.509 0.749    0.590 0.512 0.674    0.524 0.586 0.833    0.596 0.577 0.751
M4    0.468 0.714 0.750    0.366 0.674 0.633    0.323 0.523 0.659    0.325 0.454 0.799    0.350 0.569 0.660
F1    0.892 0.749 0.855    0.824 0.679 0.999    0.922 0.499 0.799    0.899 0.422 0.999    0.876 0.552 0.905
F2    0.819 0.999 0.874    0.727 0.499 0.839    0.741 0.399 0.924    0.706 0.266 0.999    0.714 0.518 0.900
F3    0.933 0.945 0.933    0.955 0.999 0.999    0.999 0.999 0.999    0.999 0.999 0.999    0.972 0.985 0.995
F4    0.999 0.999 0.999    0.955 0.999 0.999    0.979 0.999 0.999    0.999 0.999 0.999    0.991 0.999 0.999
T1    0.222 0.531 0.868    0.424 0.492 0.716    0.439 0.349 0.479    0.377 0.344 0.533    0.450 0.424 0.644
T2    0.548 0.374 0.482    0.299 0.399 0.999    0.049 0.299 0.699    0.033 0.033 0.466    0.173 0.264 0.726
T3    0.666 0.470 0.913    0.519 0.499 0.999    0.299 0.499 0.899    0.266 0.433 0.799    0.391 0.479 0.900
T4    0.449 0.464 0.699    0.399 0.000 0.479    0.409 0.000 0.499    0.362 0.000 0.366    0.399 0.106 0.500
Mavg
Favg
Tavg

The MAP is computed as (1/K) * sum_{k=1}^{K} AP(k), where K denotes the total number of communities in the network.

A community with a larger number of spreader boundary nodes is more vulnerable to news penetration. As most communities of a network typically have only a few spreader boundary nodes, it is not feasible to use the node ranking metrics above for evaluating community vulnerability. We thus rank the communities by their vulnerability scores and compare with the ground-truth ranking given by the relative count of spreader boundary nodes in the community. We use Kendall's tau, a correlation measure for ordinal data, as the evaluation metric. A Kendall's tau close to 1 indicates strong agreement, and one close to -1 indicates strong disagreement, between the evaluated and ground-truth rankings.
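As an illustration of the AP and MAP computations described above, the following sketch ranks the boundary nodes of each community and averages precision over the ranks where true spreaders appear. This is not the authors' implementation; the function names and list-based inputs are our own assumptions.

```python
def average_precision(ranked_nodes, spreaders, top_k):
    """AP over the top_k ranked boundary nodes of one community:
    precision is taken at each rank where a true spreader appears,
    then averaged over the spreaders found."""
    hits, precisions = 0, []
    for rank, node in enumerate(ranked_nodes[:top_k], start=1):
        if node in spreaders:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(communities, top_k):
    """MAP = (1/K) * sum over the K communities of their AP,
    where each community is a (ranked_nodes, spreader_set) pair."""
    return sum(average_precision(r, s, top_k)
               for r, s in communities) / len(communities)
```

For example, with ranked nodes ['a', 'b', 'c'] and spreaders {'a', 'c'}, hits occur at ranks 1 and 3, giving AP = (1 + 2/3) / 2.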
Kendall's tau (τ): Let rel = [rel_1, rel_2, ..., rel_n] represent the 'relevant' ranked list of n communities based on ground-truth vulnerability (quantified as the fraction of boundary nodes that are spreaders), and ret = [ret_1, ret_2, ..., ret_n] represent the 'retrieved' ranked list of communities based on our proposed vulnerability metric. Let P denote the number of concordant pairs, Q the number of discordant pairs, T the number of pairs tied only in rel, and U the number of pairs tied only in ret. If a tie occurs for the same pair in both rel and ret, it is not added to either T or U. Then we calculate τ = (P − Q)/sqrt((P + Q + T) * (P + Q + U)).

4.3 Results on DS1
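This tie-aware definition (the τ_b form) can be transcribed directly. The sketch below is our own, assuming the two lists hold per-community vulnerability values in a shared community order:

```python
from itertools import combinations
from math import sqrt

def kendall_tau_b(rel, ret):
    """Kendall's tau between a ground-truth ranking (rel) and a
    retrieved ranking (ret); pairs tied in both lists count toward
    neither T nor U, as described in the text."""
    P = Q = T = U = 0
    for i, j in combinations(range(len(rel)), 2):
        a = rel[i] - rel[j]        # ordering of the pair in rel
        b = ret[i] - ret[j]        # ordering of the pair in ret
        if a == 0 and b == 0:      # tied in both lists: ignored
            continue
        if a == 0:                 # tied only in rel
            T += 1
        elif b == 0:               # tied only in ret
            U += 1
        elif (a > 0) == (b > 0):   # same ordering: concordant
            P += 1
        else:                      # opposite ordering: discordant
            Q += 1
    return (P - Q) / sqrt((P + Q + T) * (P + Q + U))
```

Identical rankings yield τ = 1 and fully reversed rankings yield τ = -1, matching the agreement/disagreement interpretation above.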
Table 7 shows the evaluation of the vulnerability of boundary nodes for DS1. For the twelve networks we show the Average Precision for k = 1, 5, 10 and 15 and compute the MAP over the top-15 results.

AP@1 shows how well we are able to identify the first spreader boundary node based on our metric. Our metric is able to identify the most vulnerable boundary node of communities in false news networks with an average precision of over 90%. As expected, our metrics show better performance particularly on fake news networks, followed by the mixture and then the true news networks, and the average precision for the rest of the k values shows a similar trend. Metrics for Louvain- and Infomap-based communities follow this trend for the remaining k values. However, Label Propagation communities for k=3 evaluate with an AP of 90.25% averaged over the false news networks, which is over 35% and 20% better than the mixture and the true networks, respectively. In this case, true news networks are ranked better than mixture news networks. While k=5 also shows a similar trend, for the rest of the k values Label Propagation-based communities show better performance on the mixture than on the true news networks. This inconsistency in evaluation could be attributed to the fact that the label propagation algorithm tends to generate a larger number of communities. Thus, the average community size is much smaller, causing the communities to have sparser boundary and neighbor node sets.

We also observe that the MAP averaged over the false news networks is 47.86% better than the mixture and 150% better than the true news networks for Louvain-based communities; 25.94% better than the mixture and 139.9% better than the true news networks for Infomap-based communities; and 33.72% better than the mixture and 37.14% better than the true news networks for Label Propagation-based communities.

Table 8: Evaluation of vulnerability of communities in DS1 (Kendall's tau τ)

     τM1     τM2    τM3     τM4     τF1    τF2    τF3    τF4    τT1     τT2     τT3     τT4
L    -0.027  0.003  -0.149  -0.035  0.050  0.164  0.457  0.161  -0.045  -0.255  -0.090  -0.030
I
LP

Fig. 4: Distribution of vulnerability score of spreaders in DS1
Therefore, we are able to identify the most vulnerable boundary nodes of communities in false news networks with an average MAP of over 75%.

Table 8 shows the evaluation results of the proposed metric to compute the vulnerability of a community for DS1. For the twelve networks the table shows the Kendall's tau value (τ) for communities generated using the three algorithms. We observe that the τ for mixture and true news networks tends to have a negative correlation with the ground-truth community ranking. False news networks, on the other hand, show a positive correlation, with high values of 0.642, 0.667, 0.457 and 0.714 for F1, F2, F3 and F4 respectively. Figure 4 shows that the vulnerability scores of false news spreaders have high variance, while those of mixture news spreaders (except M1) have less variance, and true news spreaders have the least variance. Thus we can conclude that trust-based vulnerability metrics are able to distinguish between spreaders with high and low vulnerability much better for false news spreaders than for true news spreaders (where most spreaders are assigned similar scores). This in turn affects the performance of the community vulnerability metrics in a similar way.
Fig. 5: Case study of spreaders in mixture networks

On observing the trustingness and trustworthiness scores of the spreaders of mixture news networks, as shown in Figure 5, we notice that most spreaders of M1 have high trustingness and low trustworthiness scores, compared to M2, M3 and M4, whose spreaders have low trustingness and high trustworthiness scores. The source of M1 was tweeted by a conservative with political undertones, and it is known that conservatives are more likely to share fake news [49]. The information shows a spreading pattern similar to fake news, as spreaders with high trustingness scores shared M1 without fact-checking the claim, unlike the source and spreaders of M2, M3 and M4, who are not political conservatives.

4.4 Results on DS2
For the thirty networks (three each for the ten news events) we show the Average Precision for k = 1, 5, 10 and 15 and compute the MAP over the top-15 results. Based on AP@1, our metric is able to identify the most vulnerable boundary node with average precision (aggregated over all news events) of 0.735, 0.672 and 0.694 for the false, refutation and combined networks respectively when communities are generated using Louvain; 0.705, 0.501 and 0.628 using Infomap; and 0.744, 0.501 and 0.577 using Label Propagation. As in DS1, we observe that our proposed metrics are able to identify spreaders in false information networks with higher precision than spreaders
Table 9: Evaluation of vulnerability of boundary nodes in DS2. For each metric the three columns give results for communities generated by L = Louvain, I = Infomap and LP = Label Propagation.

       AP@1                 AP@5                 AP@10                AP@15                MAP
       L     I     LP       L     I     LP       L     I     LP       L     I     LP       L     I     LP
FN1    0.729 0.999 0.502    0.866 0.533 0.999    0.785 0.333 0.799    0.644 0.222 0.766    0.78  0.48  0.825
TN1    0.624 0     0.571    0.819 0     0.999    0.766 0     0.999    0.799 0     0.999    0.766 0     0.872
N1     0.799 0.799 0.577    0.799 0.599 0.899    0.789 0.299 0.899    0.752 0.244 0.999    0.789 0.44  0.897
FN2    0.728 0.999 0.728    0.599 0     0.933    0.735 0     0.899    0.776 0     0.933    0.711 0.133 0.881
TN2    0.702 0.314 0.62     0.616 0     0.499    0.516 0     0.999    0.733 0     0.999    0.565 0.079 0.831
N2     0.745 0.999 0.669    0.614 0     0.699    0.737 0     0.899    0.747 0     0.933    0.697 0.133 0.806
FN3    0.666 0.49  0.541    0.584 0.799 0.999    0.774 0.899 0.999    0.745 0.733 0.999    0.691 0.808 0.874
TN3    0.733 0.302 0.607    0.949 0.199 0.999    0.799 0.099 0.999    0.933 0.066 0.999    0.916 0.174 0.973
N3     0.76  0.532 0.555    0.592 0.879 0.999    0.723 0.849 0.999    0.683 0.866 0.999    0.667 0.846 0.885
FN4    0.599 0.999 0.556    0.699 0.299 0.999    0.585 0.099 0.899    0.866 0.066 0.999    0.637 0.282 0.878
TN4    0.622 0.363 0.523    0.516 0     0.699    0.419 0     0.599    0.666 0     0.999    0.531 0.046 0.722
N4     0.707 0.369 0.579    0.687 0.999 0.799    0.662 0.499 0.966    0.59  0.399 0.866    0.652 0.62  0.786
FN5    0.914 0.711 0.957    0.899 0.199 0.999    0.924 0.099 0.999    0.895 0.133 0.999    0.907 0.228 0.997
TN5    0.599 0.999 0.824    0     0.399 0.399    0     0     0.199    0     0     0.133    0.073 0.279 0.347
N5     0.857 0.57  0.666    0.89  0.399 0.699    0.957 0.299 0.599    0.893 0.266 0.999    0.911 0.362 0.726
FN6    0.769 0.499 0.762    0.519 0.533 0.999    0.499 0.599 0.899    0.466 0.533 0.666    0.549 0.581 0.867
TN6    0.666 0.562 0.923    0     0     0.599    0.099 0     0        0     0     0        0.08  0.037 0.301
N6     0.612 0.349 0.565    0.599 0     0.699    0.533 0     0.899    0.866 0     0.599    0.599 0.173 0.733
FN7    0.749 0.499 0.914    0.399 0     0.399    0.099 0     0.199    0.066 0     0.133    0.285 0.033 0.314
TN7    0.649 0.999 0.833    0.633 0.499 0.999    0.749 0.099 0.899    0.533 0.066 0.666    0.688 0.321 0.836
N7     0.481 0.999 0.538    0.519 0.733 0.999    0.433 0.749 0.899    0.355 0.533 0.666    0.467 0.74  0.809
FN8    0.705 0.999 0.724    0.533 0.733 0        0.499 0.566 0        0.333 0.199 0        0.517 0.592 0.097
TN8    0.499 0.499 0.721    0.499 0.499 0        0     0.349 0        0     0     0        0.218 0.352 0.048
N8     0.499 0.999 0.442    0.733 0.933 0.199    0.499 0.599 0.099    0.333 0.422 0.066    0.546 0.713 0.2
FN9    0.72  0.377 0.849    0.672 0     0.999    0.599 0     0.999    0.483 0     0.933    0.582 0.048 0.952
TN9    0.631 0.666 0.558    0.499 0.199 0.199    0.249 0.099 0.099    0.333 0.066 0        0.409 0.215 0.133
N9     0.724 0.333 0.526    0.619 0.199 0.899    0.539 0.099 0.699    0.516 0.066 0.999    0.568 0.154 0.817
FN10   0.773 0.475 0.912    0.576 0.666 0.899    0.622 0.399 0.999    0.552 0.366 0.999    0.605 0.474 0.97
TN10   0.999 0.31  0.599    0.199 0     0        0.199 0     0        0.133 0     0        0.253 0.02  0.039
N10    0.759 0.332 0.657    0.599 0.599 0.899    0.614 0.349 0.999    0.599 0.266 0.999    0.627 0.389 0.937
Favg
Tavg
Mavg

Table 10: Evaluation of vulnerability of communities in DS2 (Kendall's tau τ)

       τF                       τT                       τF∪T
       L      I      LP         L      I      LP         L      I      LP
N1
N2
N3     -0.351 -0.001 0.012      0.2    -0.04  0.06       -0.411 -0.012 0.044
N4
N5     -0.073 0.003  0.075      -0.238 -1     0.01       -0.055 0.02   0.051
N6     -0.113 1      0.039      -0.055 0.017  0.052      -0.092 0.033  -0.038
N7     -0.284 0.052  0.109      0.157  -1     -0.062     -0.065 0.333  0.066
N8     -0.088 0.333  -0.035     -0.333 -1     -0.089     -0.076 -0.333 0.011
N9
N10    -0.019 0.017  0.067      -0.399 -0.022 -0.027     -0.025 -0.028 0.001

in refutation information networks. This can be attributed to the fact that a person's motivation to spread refutation information (whose validity is more certain) is driven more by the nature of the content, unlike false information (whose content is not validated), where spreading is driven less by the content and more by the trust dynamics with the endorser. The metric's performance in identifying false information spreaders in the combined networks is slightly affected by the presence of refutation information spreading dynamics, but is still better than on the refutation-only networks.

Trends do not vary drastically for other values of k, with Label Propagation performing slightly better than Louvain, and Infomap performing worst. We also observe that certain vulnerability scores are drastically low. This can be attributed to the quality of the disjoint communities generated by the community detection algorithm.
In scenarios where the number of communities is too low or too high, there is large variation in the boundary and neighbor node counts of the communities, which affects the metric score computation.

Through MAP we aggregate the precision scores for the top-15 spreader boundary nodes. We observe MAP scores of 0.626/0.366/0.766 for the false information networks, 0.449/0.152/0.51 for the refutation information networks, and 0.652/0.457/0.759 for the combined networks using L/I/LP respectively.

Table 10 shows the evaluation results of the proposed metric to compute the vulnerability of a community for DS2. Similar to Table 8, the τ values for false information networks tend to have more values greater than zero (i.e., positive correlation) compared to refutation information networks.

5 Conclusion

We propose novel metrics based on the concept of believability, derived from computational trust measures, to compute the vulnerability of nodes and communities to news spread, and show that the metrics are much more sensitive to false information. We confirm our hypothesis that false information has to rely on strong trust among spreaders to propagate, while true or refuting information does not. Through experiments on two real-world datasets of large information spreading networks on Twitter, we show that our proposed metrics can identify the vulnerable nodes and communities with high precision. While detection of fake news spreading is a widely studied problem, its containment is not. We believe that the proposed model can be used to identify vulnerable individuals and communities in order to build content-agnostic fake news spread prevention models. We thus propose the
Community Health Assessment model as a pre-liminary idea that exploits the structural characteristics of social networks toidentify nodes and communities that are most vulnerable to news spreading.As part of future work we would like to extend the proposed ideas tounderstand the dynamics of news spreading within a community (i.e. throughcore nodes). We would also like to include temporal features of news spreadinginto our model.
References
1. Donovan Artz and Yolanda Gil. A survey of trust in computer science and the semantic web. Web Semantics: Science, Services and Agents on the World Wide Web, 5(2):58–71, 2007.
2. Vincent D Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne Lefebvre. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10):P10008, 2008.
3. Fedor Borisyuk, Albert Gordo, and Viswanath Sivakumar. Rosetta: Large scale system for text detection and recognition in images. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 71–79. ACM, 2018.
4. Walter Quattrociocchi, Antonio Scala, and Cass R Sunstein. Echo chambers on Facebook. Available at SSRN 2795110, 2016.
5. Lidan Fan, Zaixin Lu, Weili Wu, Bhavani Thuraisingham, Huan Ma, and Yuanjun Bi. Least cost rumor blocking in social networks. In Distributed Computing Systems (ICDCS), 2013 IEEE 33rd International Conference on, pages 540–549. IEEE, 2013.
6. Usha Nandini Raghavan, Réka Albert, and Soundar Kumara. Near linear time algorithm to detect community structures in large-scale networks. Physical Review E, 76(3):036106, 2007.
7. Martin Rosvall and Carl T Bergstrom. Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences, 105(4):1118–1123, 2008.
8. Lidan Fan, Weili Wu, Xuming Zhai, Kai Xing, Wonjun Lee, and Ding-Zhu Du. Maximizing rumor containment in social networks with constrained time. Social Network Analysis and Mining, 4(1):214, 2014.
9. Adrien Friggeri, Lada A Adamic, Dean Eckles, and Justin Cheng. Rumor cascades. In ICWSM, 2014.
10. Benjamin D Horne and Sibel Adali. This just in: Fake news packs a lot in title, uses simpler, repetitive content in text body, more similar to satire than real news. arXiv preprint arXiv:1703.09398, 2017.
11. Muhammad Imran, Carlos Castillo, Fernando Diaz, and Sarah Vieweg. Processing social media messages in mass emergency: A survey. ACM Computing Surveys (CSUR), 47(4):67, 2015.
12. Fang Jin, Edward Dougherty, Parang Saraf, Yang Cao, and Naren Ramakrishnan. Epidemiological modeling of news and rumors on Twitter. In Proceedings of the 7th Workshop on Social Network Mining and Analysis, page 8. ACM, 2013.
13. Zhiwei Jin, Juan Cao, Yongdong Zhang, and Jiebo Luo. News verification by exploiting conflicting social viewpoints in microblogs. In AAAI, pages 2972–2978, 2016.
14. Sepandar D Kamvar, Mario T Schlosser, and Hector Garcia-Molina. The EigenTrust algorithm for reputation management in P2P networks. In Proceedings of the 12th International Conference on World Wide Web, pages 640–651. ACM, 2003.
15. Masahiro Kimura, Kazumi Saito, and Hiroshi Motoda. Efficient estimation of influence functions for SIS model on social networks. In IJCAI, pages 2046–2051, 2009.
16. Srijan Kumar and Neil Shah. False information on web and social media: A survey. arXiv preprint arXiv:1804.08559, 2018.
17. Jing Ma, Wei Gao, Prasenjit Mitra, Sejeong Kwon, Bernard J Jansen, Kam-Fai Wong, and Meeyoung Cha. Detecting rumors from microblogs with recurrent neural networks. In IJCAI, pages 3818–3824, 2016.
18. Jing Ma, Wei Gao, and Kam-Fai Wong. Detect rumors in microblog posts using propagation structure via kernel learning. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 708–717, 2017.
19. Abhinav Mishra and Arnab Bhattacharya. Finding the bias and prestige of nodes in networks based on trust scores. In Proceedings of the 20th International Conference on World Wide Web, pages 567–576. ACM, 2011.
20. Tanushree Mitra, Graham P Wright, and Eric Gilbert. A parsimonious language model of social media credibility across disparate events. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing, pages 126–145. ACM, 2017.
21. Mark EJ Newman. Spread of epidemic disease on networks. Physical Review E, 66(1):016128, 2002.
22. Nam P Nguyen, Guanhua Yan, My T Thai, and Stephan Eidenbenz. Containment of misinformation spread in online social networks. In Proceedings of the 4th Annual ACM Web Science Conference, pages 213–222. ACM, 2012.
23. Verónica Pérez-Rosas, Bennett Kleinberg, Alexandra Lefevre, and Rada Mihalcea. Automatic detection of fake news. arXiv preprint arXiv:1708.07104, 2017.
24. Martin Potthast, Johannes Kiesel, Kevin Reinartz, Janek Bevendorff, and Benno Stein. A stylometric inquiry into hyperpartisan and fake news. arXiv preprint arXiv:1702.05638, 2017.
25. Bhavtosh Rath, Wei Gao, Jing Ma, and Jaideep Srivastava. From retweet to believability: Utilizing trust to identify rumor spreaders on Twitter. In Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017, pages 179–186. ACM, 2017.
26. Atanu Roy. Computational trust at various granularities in social networks. 2015.
27. Atanu Roy, Chandrima Sarkar, Jaideep Srivastava, and Jisu Huh. Trustingness & trustworthiness: A pair of complementary trust measures in a social network. In Advances in Social Networks Analysis and Mining (ASONAM), 2016 IEEE/ACM International Conference on, pages 549–554. IEEE, 2016.
28. Hinrich Schütze, Christopher D Manning, and Prabhakar Raghavan. Introduction to Information Retrieval, volume 39. Cambridge University Press, 2008.
29. Devavrat Shah and Tauhid Zaman. Rumors in a network: Who's the culprit? IEEE Transactions on Information Theory, 57(8):5163–5181, 2011.
30. Wanita Sherchan, Surya Nepal, and Cecile Paris. A survey of trust in social networks. ACM Computing Surveys (CSUR), 45(4):47, 2013.
31. Soroush Vosoughi, Deb Roy, and Sinan Aral. The spread of true and false news online. Science, 359(6380):1146–1151, 2018.
32. Fan Yang, Yang Liu, Xiaohui Yu, and Min Yang. Automatic detection of rumor on Sina Weibo. In Proceedings of the ACM SIGKDD Workshop on Mining Data Semantics, page 13. ACM, 2012.
33. Laijun Zhao, Hongxin Cui, Xiaoyan Qiu, Xiaoli Wang, and Jiajia Wang. SIR rumor spreading model in the new media age. Physica A: Statistical Mechanics and its Applications, 392(4):995–1003, 2013.
34. Dejin Zhao and Mary Beth Rosson. How and why people Twitter: The role that micro-blogging plays in informal communication at work. In Proceedings of the ACM 2009 International Conference on Supporting Group Work, pages 243–252, 2009.
35. Kai Zhu and Lei Ying. Information source detection in the SIR model: A sample-path-based approach. IEEE/ACM Transactions on Networking (TON), 24(1):408–421, 2016.
36. C-N Ziegler and Georg Lausen. Spreading activation models for trust propagation. In e-Technology, e-Commerce and e-Service, 2004. EEE'04. 2004 IEEE International Conference on, pages 83–97. IEEE, 2004.
37. Karishma Sharma, Feng Qian, He Jiang, Natali Ruchansky, Ming Zhang, and Yan Liu. Combating fake news: A survey on identification and mitigation techniques. ACM Transactions on Intelligent Systems and Technology (TIST), 2019.
38. Dhruv Khattar, Jaipal Singh Goud, Manish Gupta, and Vasudeva Varma. MVAE: Multimodal variational autoencoder for fake news detection. In The World Wide Web Conference, pages 2915–2921, 2019.
39. Yi-Ju Lu and Cheng-Te Li. GCAN: Graph-aware co-attention networks for explainable fake news detection on social media. arXiv preprint arXiv:2004.11648, 2020.
40. Jing Ma, Wei Gao, and Kam-Fai Wong. Detect rumors on Twitter by promoting information campaigns with generative adversarial learning. In The World Wide Web Conference, pages 3049–3055, 2019.
41. Kai Shu, Limeng Cui, Suhang Wang, Dongwon Lee, and Huan Liu. dEFEND: Explainable fake news detection. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 395–405, 2019.
42. Jiawei Zhang, Bowen Dong, and S Yu Philip. FakeDetector: Effective fake news detection with deep diffusive neural network. In , pages 1826–1829. IEEE, 2020.
43. Tinggui Chen, Jiawen Shi, Jianjun Yang, Guodong Cong, and Gongfa Li. Modeling public opinion polarization in group behavior by integrating SIRS-based information diffusion process. Complexity, 2020, 2020.
44. Fang Jin, Edward Dougherty, Parang Saraf, Yang Cao, and Naren Ramakrishnan. Epidemiological modeling of news and rumors on Twitter. In Proceedings of the 7th Workshop on Social Network Mining and Analysis, pages 1–9, 2013.
45. Abdelmajid Khelil, Christian Becker, Jing Tian, and Kurt Rothermel. An epidemic model for information diffusion in MANETs. In Proceedings of the 5th ACM International Workshop on Modeling Analysis and Simulation of Wireless and Mobile Systems, pages 54–60, 2002.
46. Yu Liu, Bai Wang, Bin Wu, Suiming Shang, Yunlei Zhang, and Chuan Shi. Characterizing super-spreading in microblog: An epidemic-based information propagation model. Physica A: Statistical Mechanics and its Applications, 463:202–218, 2016.
47. Xiaobin Rui, Fanrong Meng, Zhixiao Wang, Guan Yuan, and Changjiang Du. SPIR: The potential spreaders involved SIR model for information diffusion in social networks. Physica A: Statistical Mechanics and its Applications, 506:254–269, 2018.
48. Tuan-Anh Hoang and Ee-Peng Lim. Virality and susceptibility in information diffusions. In Proceedings of the Sixth International Conference on Weblogs and Social Media, 2012.
49. Andrew Guess, Jonathan Nagler, and Joshua Tucker. Less than you think: Prevalence and predictors of fake news dissemination on Facebook. Science Advances, 5(1):eaau4586, 2019.
50. Tracy Jia Shen, Robert Cowell, Aditi Gupta, Thai Le, Amulya Yadav, and Dongwon Lee. How gullible are you? Predicting susceptibility to fake news. In