Temporal Analysis of Influence to Predict Users' Adoption in Online Social Networks
Ericsson Marin, Ruocheng Guo, and Paulo Shakarian
Arizona State University, Tempe, Arizona, USA
Abstract.
Different measures have been proposed to predict whether individuals will adopt a new behavior in online social networks, given the influence produced by their neighbors. In this paper, we show that one can achieve significant improvement over these standard measures by extending them to consider a pair of time constraints. These constraints provide a better proxy for social influence, showing a stronger correlation with the probability of influence as well as a stronger ability to predict influence.
Research has shown that measures which leverage people's ego networks correlate with influence, i.e., the confidence with which their neighbors adopt a new behavior [1]. In this paper, we introduce two time constraints to improve these measures: Susceptible Span and Forgettable Span. Susceptible Span ($\tau_{sus}$) refers to the interval during which people receive social signals from their neighbors (possible influencing actions), blinding individuals to connections that are no longer interesting. Forgettable Span ($\tau_{fos}$) refers to the interval before an influencer's action is forgotten by his neighbors, due to human brain limitations. These constraints define evolving graphs where influence is better measured, as illustrated in Figure 1.

[Figure 1 legend: inactive user (non-adopter); active user (adopter); $\tau_{sus} = 4$, $\tau_{fos} = 2$.]

Fig. 1.
At time $t$, the ego node 0 has no neighbors. At $t+1$, he has 1 neighbor (node 1 at the bottom). From this moment, node 0 is aware of node 1's actions. At $t+2$, node 0 has 2 neighbors, nodes 1 and 2. This cumulative process continues until $t+5$, when node 1 is no longer a neighbor of node 0, since $\tau_{sus}$ was defined as 4. After this time limit, node 0 can no longer see the actions of node 1. The illustration also shows node 0's memory inside the balloons. As we set $\tau_{fos} = 2$, node 2 (activated at $t+2$) fades from node 0's memory after $t+4$, when node 0 is no longer influenced by him. Therefore, at $t+7$, node 0 is activated only by nodes 3, 4 and 5.

The contributions of this paper are: we introduce a framework to consider $\tau_{sus}$ and $\tau_{fos}$ in social influence; we examine the correlation of 10 social network measures with influence under different conditions; and we compare the adoption prediction performance of our method with others [1,2,3], showing relevant improvements. For instance, we obtained up to a 92.31% gain in the correlation of a simple count of the "active" neighbors with the probability of influence. Considering adoption prediction, the F1 score improves from 0.606 (using the state-of-the-art [1]) to 0.689 for active neighbors. Similar results are found for the other measures analysed.

This paper is structured as follows: Section II presents the related work. Section III formalizes a framework to consider the time constraints in our networks. Section IV presents the experimental setup to produce samples. Section V introduces the social influence measures and their corresponding gains in correlation coefficient. Section VI details the classification experiments and results for the adoption prediction problem. Finally, Section VII concludes the paper.
Many works have been proposed to measure social influence and predict users' adoption. For instance, the seminal work of Kempe et al. [4] describes two popular models for diffusion in social networks that were generalized to the General Threshold Model. In this model, the collective influence from a node's infected neighbors triggers his infection once his threshold is exceeded. Later, Goyal et al. [3] leveraged a variety of models based on pair-wise influence probability, finding that the probability of adoption increases with more adopters amongst friends. With an alternative approach, Zhang et al. [1] proposed influence locality, developing two instantiated functions based on pair-wise influence but also on structural diversity to predict adoptions. Comparing two different perspectives, Fink et al. [2] proposed probabilistic contagion models for simple and complex contagion, with the latter producing a superior fit for themed hashtags.

In these works, the authors only slightly explored the dynamic aspect of social influence. Here, we take the next step and apply a pair of time constraints to our networks, finding that influence is better measured and predicted dynamically.
In this section, we describe the notation of this work. We denote a set of users $V$ as the nodes in a directed network $G = (V, E)$, a set of topics (hashtags) $\Theta$, and a set of discrete time points $T$. We use the symbols $v$, $\theta$, $t$ to represent a specific node, topic and time point. Nodes are either active or inactive w.r.t. $\theta$: an active node (adopter) is a user who retweeted a tweet with $\theta$. We denote an activity log $A$ (containing all retweets) as a set of tuples of the form $\langle v_1, v_2, \theta, t \rangle$, where $v_1, v_2 \in V$. Each tuple describes that "$v_1$ adopted $\theta$ retweeting $v_2$ at time $t$", creating a directed edge $(v_1, v_2) \in E$. The intuition behind this edge is that $v_1$ can be influenced by $v_2$ with respect to $\theta'$ if $v_2$ eventually adopts $\theta'$ after $t$.

Finally, we integrate into our model the two proposed time constraints $\tau_{sus}$ and $\tau_{fos}$. Because of them, the neighborhood of a user can change over time, affecting the social influence measures that result in his decision to adopt a topic. We therefore define the set of neighbors of a node $v$ at time $t$ as:

$$\eta_{v,t} = \{ v' \mid \exists \langle v, v', \theta, t' \rangle \in A, \text{ s.t. } t' \leq t \text{ and } t - t' \leq \tau_{sus} \}$$

$\eta_{v,t}$ is the set of users whose adoptions from $t - \tau_{sus}$ until $t$ will be presented to $v$. After $t' + \tau_{sus}$, the adoptions of $v'$ will not influence $v$. Then, we introduce $\eta^{\theta}_{v,t}$ as the set of users that can influence $v$ to adopt $\theta$ at time $t$:

$$\eta^{\theta}_{v,t} = \{ v' \in \eta_{v,t'} \mid \exists \langle v', v'', \theta, t'' \rangle \in A, \text{ s.t. } t' \leq t'',\ t'' - t' \leq \tau_{sus},\ t'' \leq t \text{ and } t - t'' \leq \tau_{fos} \}$$

Consequently, after $t'' + \tau_{fos}$, the fact that $v'$ adopted $\theta$ is forgotten by $v$, and $v'$ no longer influences $v$ in terms of $\theta$.
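The two set definitions above can be sketched in code. The following is a minimal, unoptimized Python illustration, not the authors' implementation: the `Activity` tuple, the function names, and the reading of $\eta^{\theta}_{v,t}$ as "there exists a link $\langle v, v', \cdot, t' \rangle$ and an adoption $\langle v', v'', \theta, t'' \rangle$ satisfying the four inequalities" are assumptions made for this example.

```python
from typing import NamedTuple


class Activity(NamedTuple):
    """One log entry <v1, v2, theta, t>: v1 adopted theta by retweeting v2 at time t."""
    retweeter: str   # v1 (hypothetical field name)
    source: str      # v2 (hypothetical field name)
    topic: str       # theta
    time: float      # t, in hours


def neighbors(log, v, t, tau_sus):
    """eta_{v,t}: users that v retweeted (on any topic) at most tau_sus hours before t."""
    return {a.source for a in log
            if a.retweeter == v and a.time <= t and t - a.time <= tau_sus}


def active_neighbors(log, v, theta, t, tau_sus, tau_fos):
    """eta^theta_{v,t}: neighbors whose adoption of theta v could see and still remembers."""
    result = set()
    for link in log:                      # candidate <v, v', *, t'>
        if link.retweeter != v:
            continue
        for adoption in log:              # candidate <v', v'', theta, t''>
            if (adoption.retweeter == link.source
                    and adoption.topic == theta
                    and link.time <= adoption.time               # t' <= t''
                    and adoption.time - link.time <= tau_sus     # t'' - t' <= tau_sus
                    and adoption.time <= t                       # t'' <= t
                    and t - adoption.time <= tau_fos):           # t - t'' <= tau_fos
                result.add(link.source)
    return result


# Toy log (made-up data): user "0" retweets "1" and "2"; user "2" adopts hashtag "h".
log = [
    Activity("0", "1", "x", 1),
    Activity("0", "2", "x", 2),
    Activity("2", "9", "h", 2),
]
print(neighbors(log, "0", 6, tau_sus=4))
print(active_neighbors(log, "0", "h", 4, tau_sus=4, tau_fos=2))
```

The nested loop is quadratic in the log size; on a log with millions of retweets one would presumably index activities by user first, but that optimization is left out to keep the correspondence with the formulas visible.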
Using these generated dynamic networks, we want to measure the influence produced by the individuals' active neighbors.

This section details our dataset, how we collect samples using different values for the time constraints, which filters of users' activity are applied, and how we measure the correlation of our features with the probability of adoption.
Dataset description. The dataset we use is provided by [5]. It contains 1,687,700 retweets ($k$), made by 314,756 users (the histogram fits a power law with $p_k \approx k^{-\ldots}$), about 226,488 hashtags on Twitter, from March 24 to April 25, 2012.

Sampling. Following previous works [1], we create balanced sets of samples for our experiments. For a given activity $\langle v, v'', \theta, t \rangle$, which corresponds to a positive sample, we create a negative sample by uniformly drawing a user $v'$ from the set $\{ v' \mid v \in \eta^{\theta}_{v',t} \wedge \langle v', u, \theta, t' \rangle \notin A,\ \forall u \in \eta^{\theta}_{v',t} \}$. This set includes all users under the influence of $v$ w.r.t. $\theta$ at $t$ who did not adopt $\theta$ in our dataset. Then, we create $\langle v', v, \theta, t \rangle$ as the related negative sample for $\langle v, v'', \theta, t \rangle$, keeping the same timestamp for both users so that they have similar intervals to accumulate influence.

Filters. In addition, we apply 4 filters to exclude users with fewer actions than a given threshold, as their behaviors are hardly explainable by influence measurements [2]. We label R30 and R60 the filters for users who retweeted at least 30 or 60 times, and H20 and H40 the filters for users who retweeted at least 20 or 40 hashtags, respectively. This also enables us to test the robustness of $\tau_{sus}$ and $\tau_{fos}$ under a variety of conditions.

Correlation between measures and adoption probability. Here, we study how each time constraint correlates with the probability of adoption using the Pearson correlation coefficient. The idea is to identify the values of $\tau_{sus}$ and $\tau_{fos}$, each drawn from a set of 12 values (in hours), that produce high-quality influence measurements (high positive correlation with adoption probability).

Table 1 describes the 10 measures (in 7 categories) that we use to estimate the influence in users' active neighborhoods. We define the measures based on the activity $a = \langle v, v', \theta, t \rangle$ by which we create samples. Then, we show the gain (or loss) of correlation coefficient with heat maps, plotting the 144 combinations of $\tau_{sus}$ and $\tau_{fos}$. Cells in the lower right corner have values equal to 0, as $(\tau_{sus}, \tau_{fos}) = (720, \ldots)$. Figure 2(a) shows the heat maps for filters R60 and H40 with the gain (or loss) of correlation coefficients between Number of Active Neighbors (NAN) and probability of adoption. Previous works with no time constraints [3,2] argue that a positive correlation is expected here. Even so, many cells present gains for both filters, where combinations of $\tau_{sus}$ and $\tau_{fos}$ boost NAN's ability to explain users' behaviors under influence. Moreover, hot red cells dominate the lower left region of both heat maps, where $\tau_{sus}$ and $\tau_{fos}$ are relatively high and low respectively, especially when $\tau_{sus} \geq 168$ and $\tau_{fos} \leq 48$, with gains in [9.84%, 92.31%].

Figure 2(b) presents the heat maps for Personal Network Exposure (PNE). Similar to NAN, hot red cells are mainly distributed where $\tau_{sus} \geq 168$ and $\tau_{fos} \leq 24$. Although the gains are smaller, in [1.18%, 11.76%], we later show that PNE obtains high gains in classification performance. From this moment on, we plot the heat maps only for filter R60, since we get similar results for H40.

Table 1.
Categories of Features.

| Category | Feature | Formula |
|---|---|---|
| Connectivity | Number of Active Neighbors (NAN) [3] | $NAN^{\theta}_{v,t} = \lvert \eta^{\theta}_{v,t} \rvert$ |
| Connectivity | Personal Network Exposure (PNE) [6] | $PNE^{\theta}_{v,t} = \frac{\lvert \eta^{\theta}_{v,t} \rvert}{\lvert \eta_{v,t} \rvert}$ |
| Temporal | Continuous Decay of Influence (CDI) [3] | $CDI^{\theta}_{v,t} = \sum_{u \in \eta^{\theta}_{v,t}} \exp\left(-\frac{t_l - t_u}{\sigma}\right)$ |
| Recurrence | Previous Reposts (PRR) [7] | $PRR^{\theta}_{v,t} = \sum_{\theta'} \sum_{u \in \eta^{\theta}_{v,t}} \sum_{t' \leq t} \lvert \langle v, u, \theta', t' \rangle \rvert$ |
| Transitivity | Closed Triads (CLT) [7] | $CLT^{\theta}_{v,t} = \sum_{\{u,z\} \in \eta^{\theta}_{v,t},\, u \neq z} f((u,z)^{\theta}_{t})$ |
| Transitivity | Clustering Coefficient (CLC) [7] | $CLC^{\theta}_{v,t} = \frac{\sum_{\{u,z\} \in \eta^{\theta}_{v,t},\, u \neq z} g((u,z)^{\theta}_{t})}{\lvert \eta^{\theta}_{v,t} \rvert}$ |
| Centrality | Hubs (HUB) [7] | $HUB^{\theta}_{v,t} = \sum_{u \in \eta^{\theta}_{v,t}} h(u,t)$ |
| Reciprocity | Mutual Reposts (MUR) [7] | $MUR^{\theta}_{v,t} = \sum_{u \in \eta^{\theta}_{v,t}} i(u,t)$ |
| Structural Diversity | Active Strongly Connected Components Count (ACC) [8] | $ACC^{\theta}_{v,t} = \lvert P(\eta^{\theta}_{v,t}) \rvert$ |
| Structural Diversity | Active Strongly Connected Components Ratio (ACR) [8] | $ACR^{\theta}_{v,t} = \frac{\lvert P(\eta^{\theta}_{v,t}) \rvert}{\lvert P(\eta_{v,t}) \rvert}$ |

The auxiliary symbols in the table are defined as follows.

CDI: $t_l$ is the time when the latest neighbor in $\eta^{\theta}_{v,t}$ adopted $\theta$, and $\sigma$ is the globally longest identified time delay for adoption.

CLT: $f((u,z)^{\theta}_{t}) = 1$ if $\langle u, z, \theta, t' \rangle \in A \wedge t' \leq t$, and $0$ otherwise.

CLC: $g((u,z)^{\theta}_{t}) = 1$ if $\langle u, z, \theta, t_z \rangle \in A \wedge t_z \leq t$, and $0$ otherwise.

HUB: $h(u,t) = 1$ if $\sum_{\theta'} \sum_{x \in V} \sum_{t' \leq t} \lvert \langle x, u, \theta', t' \rangle \rvert \geq \gamma$, and $0$ otherwise, where $\gamma$ is the minimal number of messages retweeted. Upon some analysis, we set $\gamma = 104$, corresponding to 0.042% of all retweets. To reach this value, users should be retweeted at least …

MUR: $i(u,t) = 1$ if $\langle u, v', \theta, t' \rangle \in A \wedge t' \leq t$, and $0$ otherwise.

ACC and ACR: the function $P(V') : V' \rightarrow C$ maps the set of nodes $V'$ to the set of strongly connected components $C$.
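To make the simplest entries of Table 1 concrete, here is a small Python sketch of NAN, PNE and CDI computed from already-built neighbor sets. The function names, the dictionary of adoption times, the zero returned for an empty neighborhood, and the reading of the CDI exponent as $-(t_l - t_u)/\sigma$ are assumptions of this example rather than details confirmed by the paper.

```python
import math


def nan_feature(active):
    """NAN: number of active neighbors, |eta^theta_{v,t}|."""
    return len(active)


def pne_feature(active, neighbors):
    """PNE: share of current neighbors that are active.

    Returning 0.0 for an empty neighborhood is a convention assumed here.
    """
    return len(active) / len(neighbors) if neighbors else 0.0


def cdi_feature(adoption_times, sigma):
    """CDI: sum over active neighbors u of exp(-(t_l - t_u) / sigma).

    adoption_times maps each active neighbor to the time t_u at which it
    adopted the topic; t_l is the latest of those times.
    """
    if not adoption_times:
        return 0.0
    t_latest = max(adoption_times.values())
    return sum(math.exp(-(t_latest - t_u) / sigma)
               for t_u in adoption_times.values())


# Made-up example: 2 of 4 neighbors are active, adopting at hours 0 and 5.
print(nan_feature({"a", "b"}))
print(pne_feature({"a", "b"}, {"a", "b", "c", "d"}))
print(cdi_feature({"a": 0.0, "b": 5.0}, sigma=5.0))
```

Under this reading, the most recent adopter always contributes exactly 1 to CDI, and older adoptions contribute exponentially less.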
Figure 2(c) shows the heat map for Continuous Decay of Influence (CDI). The highest gains, in [1.72%, 60.34%], are observed when $\tau_{sus} \geq 168$ and $\tau_{fos} \leq \ldots$

Fig. 2.
Gains (red) or losses (blue) of correlation between each social influence measure and adoption probability when the time constraints are applied.
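The paper does not spell out how a heat-map cell is computed, so the following Python sketch is only one plausible reading: samples are grouped by feature value, the adoption probability of each group is the fraction of positive samples, the Pearson coefficient is taken between feature values and those probabilities, and the gain is the relative change with respect to the coefficient obtained without time constraints. All function names and the numbers in the usage lines are hypothetical.

```python
import math
from collections import defaultdict


def pearson(xs, ys):
    """Pearson correlation coefficient (assumes neither sequence is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def adoption_probability(samples):
    """Group (feature_value, adopted) pairs by value; return values and adoption rates."""
    counts = defaultdict(lambda: [0, 0])          # value -> [adopters, total]
    for value, adopted in samples:
        counts[value][0] += int(adopted)
        counts[value][1] += 1
    values = sorted(counts)
    return values, [counts[v][0] / counts[v][1] for v in values]


def correlation_gain(r_with, r_without):
    """Relative gain (in %) of the coefficient with time constraints over the one without."""
    return 100.0 * (r_with - r_without) / abs(r_without)


# Hypothetical samples: (number of active neighbors, did the user adopt?).
samples = [(1, False), (1, False), (2, True), (2, False), (3, True), (3, True)]
values, probs = adoption_probability(samples)
print(pearson(values, probs))
# Hypothetical coefficients for one (tau_sus, tau_fos) cell versus the unconstrained case.
print(correlation_gain(0.50, 0.40))
```

A positive result corresponds to a red cell in Figure 2 and a negative one to a blue cell, if the gain is indeed defined as a relative change.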
Figure 2(d) shows the heat map for Previous Reposts (PRR). There is a slight tendency for higher values of $\tau_{fos}$ and lower values of $\tau_{sus}$ to result in higher gains when compared to the previous features, with gains in [4.76%, 156%].

Figure 2(e) presents the heat map for Closed Triads (CLT). Higher gains, in [3.33%, 233.33%], are spread through the majority of cells. However, we can still observe the best gains over the area where high values of $\tau_{sus}$ and low values of $\tau_{fos}$ are found, especially when $\tau_{sus} \geq 144$ and $\tau_{fos} \leq 48$. Figure 2(f) shows the heat map for Clustering Coefficient (CLC). Small values for both time constraints produce higher correlation gains, in [4.84%, 66.67%].

Figure 2(g) presents the heat map for Hubs (HUB). We again observe the hot spots where $\tau_{sus}$ and $\tau_{fos}$ are relatively high and low respectively, with $\tau_{sus} \geq 96$ and $\tau_{fos} \leq 48$. The gains, in [21.05%, 488.24%], are the highest.

Figure 2(h) presents the heat map for Mutual Reposts (MUR), another cumulative measurement whose correlation increases with both time constraints. However, we can observe higher gains, in [1.61%, 35.48%], in the area where the values of $\tau_{sus}$ are relatively high and the values of $\tau_{fos}$ are intermediate.

Figure 2(i) shows the heat map for Active Strongly Connected Components Count (ACC). Hot cells are found where $\tau_{sus}$ and $\tau_{fos}$ are relatively high and low respectively, mainly when $\tau_{sus} \geq 168$ and $\tau_{fos} \leq 24$, with gains in [9.09%, 300%]. Figure 2(j) presents the heat map for Active Strongly Connected Components Ratio (ACR). The gains, in [1.47%, 45.59%], are spread through the cells, mainly where the values of $\tau_{sus}$ and $\tau_{fos}$ are both relatively small.

This section presents our classification experiments and results for the adoption prediction task, detailing the training and testing sets and the baseline comparisons.
Training and testing. Our 10 social influence measures are treated here as features in a machine learning task, so that we can measure their performance for adoption prediction both individually and combined. We sort our samples chronologically, using the first 90% for training and the rest for testing (obeying causality, which is neglected by some works). We use 2 classifiers, Logistic Regression [9] and Random Forest [9], but only report the F1 score for Random Forest under the R60 filter, since Logistic Regression and the other filters produce comparable results.
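The chronological split and the reported metric can be sketched as follows. This is a dependency-free Python illustration, not the authors' pipeline: the sample layout (a dict with a `time` key), the function names, and the hand-written F1 computation are assumptions, and the classifier itself is left out.

```python
def chronological_split(samples, train_fraction=0.9):
    """Sort samples by timestamp and cut once, so training never sees the future."""
    ordered = sorted(samples, key=lambda s: s["time"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def f1_score(y_true, y_pred):
    """F1 for the positive (adopter) class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up samples with timestamps in hours and a binary adoption label.
samples = [{"time": t, "label": t % 2} for t in range(10, 0, -1)]
train, test = chronological_split(samples)
print(len(train), len(test))
print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

Splitting by time rather than at random is what keeps the evaluation causal: every test sample is at least as recent as every training sample.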
Baselines. We compare our model with 3 baselines, Influence Locality (LRC-Q) [1], Static Bernoulli (SB) [3] and Complex Probability Model (CPM) [2], to check whether our method outperforms them and whether the baselines improve with $\tau_{sus}$ and $\tau_{fos}$.

Individual Feature Analysis. Table 2 presents the individual classification performance of our 10 features. As done in the previous section, we run an experiment for each combination of $\tau_{sus}$ and $\tau_{fos}$, sorting this table by the performance gain. The time constraints boosted the performance in all cases, with gains in F1 score in [7.23%, 23.2%]. In the great majority of cases, $\tau_{sus}$ shows values greater than $\tau_{fos}$, repeating the correlation gain pattern.

Combined Feature Analysis. In Table 2, we also present the classification performance when the 10 features are combined as "All", showing an improvement of 10.54% when applying time constraints. The pattern observed for the individual features (social influence is better measured when $\tau_{sus} > \tau_{fos}$) is found again for the combined features, with performance achieving the best improvement when $\tau_{sus} = 336$ and $\tau_{fos} = 48$.

Table 2.
Baselines, Individual and Combined Feature Performances.
| Method | $\tau_{sus}$ (hours) | $\tau_{fos}$ (hours) | F1 score w/ time constraints | Improvement | F1 score w/o time constraints |
|---|---|---|---|---|---|
| PNE | 168 | 24 | 0.658 | 23.2% | 0.534 |
| CLC | 72 | 16 | 0.652 | 22.0% | 0.534 |
| ACR | 336 | 120 | 0.632 | 18.7% | 0.532 |
| CLT | 144 | 8 | 0.657 | 17.3% | 0.560 |
| NAN | 144 | 72 | 0.689 | 15.6% | 0.596 |
| CDI | 144 | 16 | 0.677 | 13.5% | 0.596 |
| ACC | 144 | 16 | 0.675 | 13.2% | 0.596 |
| HUB | 120 | 16 | 0.630 | 9.99% | 0.573 |
| PRR | 72 | 8 | 0.672 | 9.82% | 0.612 |
| MUR | 336 | 72 | 0.712 | 7.23% | 0.664 |
| All | 336 | 48 | 0.755 | 10.54% | 0.683 |
| LRC-Q | 72 | 16 | 0.657 | 8.41% | 0.606 |
| SB | 72 | 8 | 0.675 | 8.69% | 0.621 |
| CPM | 72 | 8 | 0.689 | 12.58% | 0.612 |
We interpret these results as follows: (1) users start losing attention to their neighbors after 2 weeks ($\tau_{sus} = 336$ hours) if they no longer retweet them; (2) users no longer remember the activations of their neighbors after approximately 2 days ($\tau_{fos} = 48$ hours).
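The improvement percentages in Table 2 appear consistent with a plain relative change of the F1 score, and the interpretation above relies on converting hours into days. The short Python check below assumes that formula (the paper does not state it explicitly) and uses a handful of F1 values copied from Table 2; the function name is hypothetical.

```python
def relative_improvement(f1_with, f1_without):
    """Relative F1 gain in percent; assumed formula, not stated in the paper."""
    return 100.0 * (f1_with - f1_without) / f1_without


# (F1 with time constraints, F1 without), taken from Table 2.
rows = {
    "PNE": (0.658, 0.534),
    "NAN": (0.689, 0.596),
    "All": (0.755, 0.683),
    "CPM": (0.689, 0.612),
}
for name, (with_tc, without_tc) in rows.items():
    print(f"{name}: {relative_improvement(with_tc, without_tc):.2f}%")

# Best combined setting: tau_sus = 336 hours, tau_fos = 48 hours.
print(336 / 24, "days;", 48 / 24, "days")
```

Because the table reports F1 scores rounded to three decimals, recomputed percentages can differ from the printed ones in the last digit.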
Performance of Baseline Methods. Finally, Table 2 includes the results of the baselines. The time constraints boost all performances, with gains of 8.41% for LRC-Q, 8.69% for SB and 12.58% for CPM. These results highlight the effectiveness of $\tau_{sus}$ and $\tau_{fos}$, also consolidating the pattern detected before: $\tau_{sus} > \tau_{fos}$. In addition, our model outperforms the baselines in both situations: when we use only an individual feature such as MUR, and when we use all features combined, with improvements in [3.33%, 9.6%] (compared with CPM).

In this paper, we introduce a pair of time constraints to show how the dynamic graphs produced by them better capture the influence between users over time (especially when $\tau_{sus}$ and $\tau_{fos}$ are relatively high and low respectively). We validate our model under diverse conditions, detailing how it outperforms the state-of-the-art methods that aim to predict users' adoption. We also demonstrate how these constraints can be used to improve the performance of other approaches, enabling practical usage of the concepts for social influence prediction.

Acknowledgments
Some of the authors of this paper are supported by CNPq-Brazil, AFOSR Young Investigator Program (YIP) grant FA9550-15-1-0159, ARO grant W911NF-15-1-0282, and the DoD Minerva program.