Sentiment Paradoxes in Social Networks: Why Your Friends Are More Positive Than You?
Xinyi Zhou, Shengmin Jin, Reza Zafarani
Data Lab, Department of EECS, Syracuse University
{zhouxinyi, shengmin, reza}@data.syr.edu

Abstract

Most people consider their friends to be more positive than themselves, exhibiting a Sentiment Paradox. Psychology research attributes this paradox to human cognition bias. With the goal of understanding this phenomenon, we study sentiment paradoxes in social networks. Our work shows that social connections (friends, followees, or followers) of users are indeed (not just illusively) more positive than the users themselves. This is mostly due to positive users having more friends. We identify five sentiment paradoxes at different network levels ranging from triads to large-scale communities. Empirical and theoretical evidence is provided to validate the existence of such sentiment paradoxes. By investigating the relationships between the sentiment paradox and other well-developed network paradoxes, i.e., the friendship paradox and the activity paradox, we find that user sentiments are positively correlated with their number of friends but rarely with their social activity. Finally, we demonstrate how sentiment paradoxes can be used to predict user sentiments.
Introduction
Sentiment analysis, also known as opinion mining, analyzes individual opinions, sentiments, and attitudes towards various entities such as individuals, products, organizations, and topics (Liu 2012). Relying on advancements in natural language processing and machine learning (Ravi and Ravi 2015), existing studies in sentiment analysis have made substantial progress towards classifying and predicting sentiments of independent individuals and groups in social networks, focusing on tasks such as content sentiment prediction and review spam detection (Breck and Cardie 2017).

However, existing studies have paid less attention to sentiments among interacting users, whose sentiments may be dependent. With the unavoidable peer influence in social networks (Lewis, Gonzalez, and Kaufman 2012), it is essential to consider user interactions when studying their sentiments, especially in large-scale social networks. For example, Lin et al. find that the stress levels of users are closely related to those of their friends on social media (Lin et al. 2017).

A common observation with respect to sentiments of interacting users is that many users feel their friends are more positive than themselves, experiencing a sentiment paradox. There have been many discussions on why this phenomenon takes place, with psychology research linking it to human cognition biases. For example, Jordan et al. (2011) show that most people have a tendency to underestimate the negative feelings of others. With many users in social networks experiencing a sentiment paradox (being less positive than their friends), can we attribute all such perceptions to human cognition biases alone? In other words, do sentiment paradoxes exist not only in user cognition, but also in reality?
The Present Work: Sentiment Paradoxes in Networks.
We investigate whether users are indeed less positive than their social connections (friends, followees, or followers) in social networks. Possible interpretations for the existence (or non-existence) of the sentiment paradox are provided by mining the relationships between sentiment paradoxes and other well-established network paradoxes, i.e., the friendship paradox and the activity paradox. Finally, as an application, we show how sentiment paradoxes can be used to predict user sentiments (positive or negative).

Overall, the specific contributions of this paper are:
1. Five sentiment paradoxes are identified in both undirected (friend) and directed (follower and followee) social networks and at multiple network levels (triad-, community-, and network-level). Our work shows that for most users their friends are indeed more positive than them, mostly due to the fact that more positive users are more likely to have more friends, followers, and followees;
2. We empirically and mathematically verify each paradox, where our mathematical analysis allows us to determine whether such a paradox is expected to exist;
3. We investigate the connections between the sentiment paradox and two other well-established network paradoxes: the friendship paradox and the activity paradox. Our results reveal factors that can determine the (1) existence and the (2) magnitude of sentiment paradoxes; and
4. We demonstrate the role that the sentiment paradox can play in practical applications, i.e., in predicting a user's sentiment by looking at the sentiments of his or her social connections.

The remainder of the work is organized as follows. The experimental setup is presented first, followed by a formal definition of the general sentiment paradox, and sentiment paradoxes in triads and communities. Then, we investigate the connections of the sentiment paradox to other network paradoxes, which helps determine the existence and magnitude of sentiment paradoxes. One application of sentiment paradoxes, i.e., user sentiment prediction, is provided next. Finally, a literature review and some conclusions are provided.

Experimental Setup
To study sentiment paradoxes at different network levels, proper data that contains user sentiments and their network information (e.g., friends or communities joined) is required.
Dataset. We have crawled a large-scale dataset from LiveJournal (Zafarani and Liu 2009; Jin and Zafarani 2017). LiveJournal is a popular blogging and social networking site, where users can maintain a blog, journal, or a diary. Data collected from LiveJournal has several advantages:
1. Sentiments are directly provided: when posting blogs, users can report their sentiments by selecting a mood (e.g., excited, busy or angry, see Appendix for a sample user post with its mood), which provides access to sentiment ground truth;
2. Both undirected (friends) and directed (followees and followers) relationships of users exist, i.e., a directed and an undirected network. Note that these relationships are separate: a user can choose to subscribe to (follow) another person without approval, and/or befriend that person (with approval) so that the two users can share some private posts. Hence, two users can follow each other (i.e., two directed edges in the directed network), but not be friends (no edge between them in the undirected network); and
3. Community membership information is explicitly available on user profiles (i.e., no need to detect communities using community detection, which can be subjective (Fortunato 2010)). Users can decide to create or join communities. Each community is often related to some topic and users in the same community often share similar interests.

We have collected the following data spanning more than 10 years (from 1999 to 2010) (Zafarani and Liu 2009; Jin and Zafarani 2017): (i) users and their posts to obtain user sentiments; (ii) friendships and followee/follower relationships among users; and (iii) community memberships for all users. We only retain users with ten or more posts to exclude occasionally active or inactive users. We plot the post distribution of these excluded users, which is provided in the Appendix and indicates that most (∼ […] positive (+, e.g., cheerful, excited and happy), negative (−, e.g., angry, annoyed and depressed) or neutral (0, e.g., busy). Some statistics on our data are provided in Table 1. The data is released at: http://data.syr.edu/get/EmotionPatterns
Table 1: Data Statistics (columns: Data, Number)

Figure 1: User Sentiment Distribution (SWB values). Horizontal axis: SWB, from −1 to 1; vertical axis: Users.
User Sentiments.
Traditionally, to obtain user sentiment values, one can rely on self-assessment surveys, which are time-consuming for a large number of users. Here, we adopt an automatic approach that investigates the historical posts of users (see Definition 1) (Bollen et al. 2011).
Definition 1 (Subjective Well-Being (SWB)) Assume user $u$ has $N_p(u)$ positive posts and $N_n(u)$ negative posts. The SWB value of $u$, denoted by $S(u)$, is

$$S(u) = \frac{N_p(u) - N_n(u)}{N_p(u) + N_n(u)}. \qquad (1)$$

Note that $S(u) \in [-1, +1]$, where $-1$ shows an extremely negative user and $+1$, an extremely positive user.

Sentiment Distribution.
The distribution of user sentiments can be obtained by plotting the SWB distribution. As observed in Figure 1, the SWB distribution approximately follows a normal distribution $N(\mu, \sigma)$, which aligns with findings on sentiment distributions in other social networks (e.g., that of Twitter (Ferrara and Yang 2015)). Using a normal fit, we obtain the SWB distribution, which is N(0, […]).

Sentiment Paradox
In this section, we mainly focus on a “general” sentiment paradox, which can be observed among all users of a network. We first present the definition of the sentiment paradox, followed by experiments to verify its existence and mathematical proofs on whether the paradox is expected to exist.

Footnote: Strictly, what we study is a component of the SWB rather than SWB itself, as SWB includes both affective and cognitive parts.
Definition.
The sentiment paradox, or network sentiment paradox, can be summarized as
Paradox 1 (Sentiment Paradox)
Your friends, followees, or followers are more positive than you.
Empirical Verification.
To verify whether the sentiment paradox exists, we take the following three steps:
I. User Sentiment Assignment.
We calculate how positive or negative users are by computing their SWB (Definition 1).
II. Computing Paradox Magnitude.
Consider a user whose SWB value is less than the (i) mean or (ii) median of the SWB values of his or her connections. We can consider three types of connections: friends, followees, or followers. We consider this user as being less positive than his or her connections and denote the proportion of such users in a social network as the sentiment paradox magnitude:

Definition 2 (Sentiment Paradox Magnitude) Consider a social network with a set of users $U = \{u_i\}$, $i = 1, 2, \cdots, n$, each with a SWB value $S(u_i)$. For each user $u_i$, we denote her connections (either friends, followers, or followees) by $c_{ij}$, $j = 1, 2, \cdots, m$. The sentiment paradox magnitude of the network is calculated by

$$M = \frac{\sum_{u_i} I\big(S(u_i) < \bar{S}(c_{ij})\big)}{\|U\|}, \qquad (2)$$

where $I(a < b) = 1$ when $a < b$ and is 0 otherwise. The value $\bar{S}(c_{ij})$ is the (i) mean or the (ii) median of the $S(c_{ij})$'s.

When the magnitude is greater than 0.5, we say the sentiment paradox strongly holds in the network. When the magnitude is less than or equal to 0.5, but is still greater than both the proportion of users that are more positive than their connections and the proportion of users that are as positive as their connections, we say the paradox weakly holds in the network.
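Steps I and II can be illustrated with a minimal sketch of Definitions 1 and 2. The function names, the dictionary-based network representation, and the toy star network are assumptions made for illustration only; the treatment of users without connections (counted in the denominator, never in the numerator) is also an assumption, since the text does not specify it.

```python
from statistics import mean, median


def swb(n_pos, n_neg):
    """Definition 1: S(u) = (N_p - N_n) / (N_p + N_n).

    Undefined (raises ZeroDivisionError) when a user has neither
    positive nor negative posts.
    """
    return (n_pos - n_neg) / (n_pos + n_neg)


def paradox_magnitude(S, connections, agg=mean):
    """Definition 2: share of users whose SWB is below the aggregate
    (mean or median) SWB of their connections.

    S           -- dict: user -> SWB value
    connections -- dict: user -> list of connected users
    Assumption: users without connections never satisfy the indicator.
    """
    holds = 0
    for user, value in S.items():
        conn = connections.get(user, [])
        if conn and value < agg([S[c] for c in conn]):
            holds += 1
    return holds / len(S)


# Hypothetical toy network: one hub ("a") connected to three leaves.
S = {"a": swb(11, 9), "b": swb(4, 6), "c": swb(7, 13), "d": swb(3, 7)}
connections = {"a": ["b", "c", "d"], "b": ["a"], "c": ["a"], "d": ["a"]}

print(paradox_magnitude(S, connections, agg=mean))    # 0.75
print(paradox_magnitude(S, connections, agg=median))  # 0.75
```

In this toy network the three leaves are less positive than their only connection (the hub), so the magnitude is 3/4 and the paradox would strongly hold under the 0.5 threshold.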
III. Assessing Statistical Significance.
To assess the statistical significance of our findings, we compute the difference between the observed and expected paradox magnitudes. To compute the expected paradox magnitude, we maintain the SWB distribution of users and their network structure, but randomly assign a SWB value to each user. After random assignments, we recalculate the paradox magnitude. We conduct this experiment 1,000 times, and record the average magnitude, which is the expected paradox magnitude. To assess how significant the difference between observed and expected paradox magnitudes is, we compute surprise (Leskovec, Huttenlocher, and Kleinberg 2010):
Definition 3 (Surprise)
In a social network with $N$ users, if the paradox magnitude is $M$ and the expected paradox magnitude is $M_{Expected}$ ($M_{Expected} \neq 0$ and $M_{Expected} \neq 1$), the surprise value is

$$\text{surprise} = \frac{N (M - M_{Expected})}{\sqrt{N \cdot M_{Expected} (1 - M_{Expected})}}. \qquad (3)$$

Table 2: Empirical Verification of Sentiment Paradox. Observed proportions greater than 0.5 indicate that the sentiment paradox strongly holds within networks. The observed proportions are higher than the expected ones, and the difference is statistically significant, as the surprise values are on the order of tens.

(a) User Sentiments vs. Average Sentiments of Connections

Network    Sentiment Paradox  Observed (users)  Observed (%)  Exp.    Surprise
Friends    Holds              […]               […]           […]     […]
Friends    Does not hold      […]               […]           […]     […]
Friends    Unknown            79                0.10%         0.00%   -
Friends    Total              […]               […]
Followees  Holds              […]               […]           […]     […]
Followees  Does not hold      […]               […]           […]     […]
Followees  Unknown            906               1.10%         1.09%   0.47
Followees  Total              […]               […]
Followers  Holds              […]               […]           […]     […]
Followers  Does not hold      […]               […]           […]     […]
Followers  Unknown            420               0.84%         0.82%   0.40
Followers  Total              […]               […]

(b) User Sentiments vs. Median Sentiments of Connections

Network    Sentiment Paradox  Observed (users)  Observed (%)  Exp.    Surprise
Friends    Holds              […]               […]           […]     […]
Friends    Does not hold      […]               […]           […]     […]
Friends    Unknown            148               0.19%         0.04%   20.76
Friends    Total              […]               […]
Followees  Holds              […]               […]           […]     […]
Followees  Does not hold      […]               […]           […]     […]
Followees  Unknown            […]               […]           […]     […]
Followees  Total              […]               […]
Followers  Holds              […]               […]           […]     […]
Followers  Does not hold      […]               […]           […]     […]
Followers  Unknown            […]               […]           […]     […]
Followers  Total              […]               […]
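Step III and Definition 3 can be sketched as follows. The random assignment is implemented here as a permutation of the observed SWB values over a fixed network, which keeps both the SWB distribution and the structure unchanged; this is one reading of the procedure and an assumption, as are all function names and the toy data.

```python
import math
import random
from statistics import mean


def magnitude(S, connections):
    """Share of users less positive than the mean SWB of their connections."""
    holds = sum(
        1
        for u, s in S.items()
        if connections.get(u) and s < mean(S[c] for c in connections[u])
    )
    return holds / len(S)


def expected_magnitude(S, connections, runs=1000, seed=0):
    """Average magnitude over `runs` random reassignments of SWB values.

    Assumption: reassignment = random permutation of the observed values.
    """
    rng = random.Random(seed)
    users = list(S)
    values = [S[u] for u in users]
    total = 0.0
    for _ in range(runs):
        rng.shuffle(values)
        total += magnitude(dict(zip(users, values)), connections)
    return total / runs


def surprise(n_users, m_observed, m_expected):
    """Definition 3; requires 0 < m_expected < 1."""
    return (n_users * (m_observed - m_expected)
            / math.sqrt(n_users * m_expected * (1 - m_expected)))


# Hypothetical toy star network (far too small for a meaningful test).
S = {"a": 0.1, "b": -0.2, "c": -0.3, "d": -0.4}
connections = {"a": ["b", "c", "d"], "b": ["a"], "c": ["a"], "d": ["a"]}

m_obs = magnitude(S, connections)
m_exp = expected_magnitude(S, connections)
print(m_obs, m_exp, surprise(len(S), m_obs, m_exp))
```

Because the statistic is normalized by the standard deviation of a binomial count, values on the order of tens correspond to differences of many standard deviations.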
A surprise value on the order of tens is highly significant, indicating that p-values are nearly zero.

Following this three-step process, we obtain the results in Table 2, where “Holds” (“Does not hold”) indicates that users are less (more) positive than their connections. “Unknown” indicates that users are as positive as their connections. In both undirected and directed networks, irrespective of using the mean or the median, we make the following three observations:
1. The sentiment paradox strongly holds within the network, as the observed sentiment paradox magnitudes (user proportions) for all networks are greater than 0.5;
2. The observed paradox magnitude values are all higher than the expected paradox magnitudes; and
3. The surprise values are on the order of tens, which indicates that the observed paradox magnitudes are all statistically significant.

Table 3: Sentiment Paradoxes at the Triad and Community Levels

Each paradox column lists: Observed (users), Observed (%), Exp., Surp.

Network    Sentiment Paradox  Triad                     Common-neighbor       Community                 Common-interest
Friends    Holds              […]                       […]                   […]                       […]
Friends    Does not hold      […]                       […]                   […]                       […]
Friends    Unknown            686, 3.26%, 3.32%, -0.47  0, 0.00%, 0.00%, -    783, 2.01%, 1.97%, -0.57  55, 0.14%, 0.06%, 6.67
Friends    Total              […]                       […]                   […]                       […]
Followees  Holds              […]                       […]                   […]                       […]
Followees  Does not hold      […]                       […]                   […]                       […]
Followees  Unknown            […]                       […]                   […]                       […]
Followees  Total              […]                       […]                   […]                       […]
Followers  Holds              […]                       […]                   […]                       […]
Followers  Does not hold      […]                       […]                   […]                       […]
Followers  Unknown            […]                       […]                   […]                       […]
Followers  Total              […]                       […]                   […]                       […]
Theoretical Verification.
We observe empirically from Table 2 that the expected magnitudes for the sentiment paradox to hold and not hold are almost the same, indicating that the paradox is not expected to exist within networks. Theorem 1 theoretically justifies this empirical observation.
Theorem 1
If the SWB values of users in a network follow a normal distribution $N(\mu, \sigma)$, the SWB value of a user is expected to be equal to the (i) mean and (ii) median of SWB values of his connections (friends, followees, or followers), i.e., a user is expected to be as positive as his connections.

Proof 1 Assume random variable $S \in [-1, 1]$, which denotes the SWB values of users, follows a normal distribution $N(\mu, \sigma)$. Assume we sample $n$ times from this distribution, where $n$ is the number of users in the network. For each user $u_i$, $i = 1, 2, \cdots, n$, we have two sample sets: (i) $S_u^i$, with size one, as the SWB value of user $u_i$; (ii) $S_f^i$, as the SWB values of the connections (friends, followees, or followers) of user $u_i$. Let $\bar{S}_u$ denote the sample mean of $S_u^i$, and $\bar{S}_f$ the sample mean of $S_f^i$. Note that $\bar{S}_u = S(u_i)$ as $\|S_u^i\| = 1$. $S \sim N(\mu, \sigma)$ indicates $\bar{S}_u \sim N(\mu, \sigma)$ and $\bar{S}_f \sim N(\mu, \sigma_c)$, for some $\sigma_c$. Hence, $E(\bar{S}_u) = E(\bar{S}_f) = \mu$ and $E(\bar{S}_u - \bar{S}_f) = 0$, which indicates that the expected SWB values of users are equal to the expected average SWB values of their connections. For the median, the proof is similar, as the median and the mean are the same value in a normal distribution.

Sentiment Paradox in Triads
Triads (a group of three connected people) are crucial components of networks, especially when investigating ideas such as structural balance (i.e., a friend of a friend is a friend), clusterability (i.e., friends form small groups) and transitivity (i.e., A is a friend of B, B is a friend of C, so A is a friend of C). In this section, we study sentiments among interacting users in triads. We will investigate if a sentiment paradox exists at the triad level, and aim to provide explanations on the existence (or lack) of such a paradox.

To explore if the sentiment paradox holds within triads, assume user $u_i$, $i = 1, 2, \cdots, n$ is a member of triads $t_j$, $j = 1, 2, \cdots, m$. Within each $t_j$, we compare the sentiment (i.e., the SWB value) of $u_i$ with the mean and median of that of his two connections (friends, followees, or followers). Note that at the triad level, the results based on either the mean or the median should be the same because each user has no more than two connections within a triad. If in the majority of triads that $u_i$ is a member of, $u_i$ is less positive than his connections, we consider $u_i$ as a user exhibiting the sentiment paradox at the triad level. Then, we compute the proportion of such users in the overall network, and conduct significance analysis similar to how it was conducted in the last section. Note that such a paradox is not expected to exist, as proved in Theorem 1. However, Table 3 provides the empirical results, which can be summarized as:

Paradox 2 (Triad Sentiment Paradox)
Your friends, followees, or followers in a triad are more positive than you.
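The triad-level procedure can be sketched for the undirected friendship network as below. Two points are assumptions rather than details given in the text: a triad is taken to be a closed triangle (three mutually connected users), and the proportion is computed over users who belong to at least one triad. The function name and toy data are hypothetical.

```python
from itertools import combinations


def triad_paradox_magnitude(S, adj):
    """Share of users who are less positive than the mean SWB of their
    two triad partners in a strict majority of the triads they belong to.

    S   -- dict: user -> SWB value
    adj -- dict: user -> set of friends (undirected, symmetric)
    Assumptions: triad = closed triangle; denominator = users in >= 1 triad.
    """
    exhibiting = 0
    eligible = 0
    for u in adj:
        # All pairs of u's friends that are also friends with each other.
        triads = [(v, w) for v, w in combinations(sorted(adj[u]), 2)
                  if w in adj[v]]
        if not triads:
            continue
        eligible += 1
        less = sum(S[u] < (S[v] + S[w]) / 2 for v, w in triads)
        if less > len(triads) / 2:
            exhibiting += 1
    return exhibiting / eligible if eligible else 0.0


# Hypothetical toy network: triangle a-b-c plus a pendant user d.
S = {"a": 0.3, "b": -0.1, "c": 0.0, "d": 0.5}
adj = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}

print(triad_paradox_magnitude(S, adj))  # two of the three triangle members
```

With only two partners per triad, the mean and the median coincide, which is why a single comparison per triad suffices here.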
On the other hand, one can think of a triad as a pair of users sharing a common neighbor. This observation motivates us to verify whether there is a sentiment paradox between users and their connections with whom users share common neighbors. Hence, we conduct an experiment similar to the one performed to validate the sentiment paradox in the last section, except that we only compare sentiments between users and a subset of their connections with whom users share a triad, i.e., have at least one common neighbor. The paradox is not expected to exist either, as proved in Theorem 1. However, the empirical results in Table 3 show that:
Paradox 3 (Common-neighbor Sentiment Paradox)
Your friends (followees or followers) with whom you share friends (followees or followers) are more positive than you.
Sentiment Paradox in Communities
Similar to triads, communities also play an important role in understanding social networks (Fortunato 2010). We take a similar approach to triad-level paradoxes and study sentiments among interacting users within communities. However, we highlight that unlike triads, communities can have different sizes (i.e., number of members) and different levels of interactions among their members (i.e., different densities). Hence, in addition to investigating whether a sentiment paradox exists at the community level, we also assess whether the existence or magnitude of such paradoxes depends on the size or level of connections within communities. Similar to triads, we do not expect a sentiment paradox to exist at the community level, as proved in Theorem 1.

Figure 2: Relations Between Sentiment Paradox and (i) Community Size (upper three), and (ii) Community Density (lower three). The proportion of communities within which the paradox holds becomes larger as communities become larger or denser, ultimately reaching 0.7 (and at times, over 0.9), while the expected magnitude is always around 0.5. Panels: (a) Friendships, (b) Followees, (c) Followers; horizontal axes: Community Size and Community Density; vertical axis: % Communities; series: Holds, Does not hold, and Unknown, each observed (obs.) and expected (exp.).

First, we assume user $u_i$, $i = 1, 2, \cdots, n$ is involved in communities $c_k$, $k = 1, 2, \cdots, p$. For each $c_k$, we compare the sentiment (i.e., the SWB value) of $u_i$ with the mean and median of that of his connections (friends, followees, or followers) within the community. If in a majority of communities that $u_i$ belongs to, $u_i$ is less positive than his connections in the community, we say $u_i$ exhibits the paradox. Finally, we compute the fraction of such users in the network and perform statistical significance analysis. Table 3 has the results, which we summarize as the following paradox:

Paradox 4 (Community Sentiment Paradox)
Your friends, followees, or followers within a community are more positive than you.
Additionally, we conduct an experiment similar to the one performed to verify the common-neighbor sentiment paradox in the last section, in which we only compare sentiments between users and a subset of their connections with whom users share a community. Statistical significance is computed in the same way as before.

The results are shown in Table 3. We observe a sentiment paradox among such users. As users in our dataset mostly form communities around a common interest (one community often refers to a certain topic), we denote this paradox as the common-interest sentiment paradox:

Paradox 5 (Common-interest Sentiment Paradox)
Your friends, followees, or followers with whom you share some interests are more positive than you.
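The community-level comparison behind Paradox 4 can be sketched as follows. The function name, the data layout, and the choice to compute the fraction over users who have at least one connection inside at least one of their communities are assumptions; the text does not state the denominator explicitly.

```python
from statistics import mean


def community_paradox_magnitude(S, adj, communities, agg=mean):
    """Share of users who, in a strict majority of their communities,
    are less positive than the aggregate SWB of their connections
    inside that community.

    S           -- dict: user -> SWB value
    adj         -- dict: user -> set of connections
    communities -- dict: community name -> set of members
    Assumption: communities where a user has no connection are skipped.
    """
    exhibiting = 0
    eligible = 0
    for u in S:
        compared = less = 0
        for members in communities.values():
            if u not in members:
                continue
            inside = [S[v] for v in adj.get(u, ()) if v in members]
            if not inside:
                continue
            compared += 1
            less += S[u] < agg(inside)
        if compared == 0:
            continue
        eligible += 1
        if less > compared / 2:
            exhibiting += 1
    return exhibiting / eligible if eligible else 0.0


# Hypothetical toy data: two overlapping communities.
S = {"a": 0.4, "b": -0.2, "c": 0.1, "d": -0.3}
adj = {"a": {"b", "c"}, "b": {"a"}, "c": {"a", "d"}, "d": {"c"}}
communities = {"x": {"a", "b", "c"}, "y": {"c", "d"}}

print(community_paradox_magnitude(S, adj, communities))  # 0.5
```

The common-interest variant (Paradox 5) would instead pool, for each user, all connections that share at least one community with that user and apply the network-level comparison to that subset.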
Impact of Community Variations.
To assess the impact of variations in communities on the sentiment paradox, we measure the paradox magnitude by changing the community size (i.e., number of members/nodes) or community density (i.e., number of connections/edges). We vary the community size from 1 to 1,200, and community density from 1 to (i) 2,000 in the undirected network, and (ii) 4,000 in the directed network. Then, we calculate the paradox magnitude within these communities. The results are in Figure 2.

We observe from Figure 2 that the proportion of communities within which the paradox holds becomes larger as communities become larger or denser, ultimately reaching 0.7 (and at times, over 0.9), while the expected magnitude is always around 0.5. Even when the community size or its density is very small, the observed proportion of communities where the sentiment paradox holds is always higher than the expected proportion, and the observed proportion of communities where the sentiment paradox does not hold is always lower than the expected values.

Footnote: Community size and density both follow a power-law-like distribution. Only around ten (less than 0.005%) communities exist in which the number of users is greater than 1,200, or the number of friendships among users is greater than 2,000, or the number of following and follower relationships among users is greater than 4,000.

Figure 3: Friendship Paradox (left) and Activity Paradox (right). Both paradoxes hold for a majority of users and thus exist in the network. Vertical axis: % Users; bars: Holds and Does not hold, for Mean and Median.
Connections to Network Paradoxes
Social networks exhibit many counter-intuitive properties. We assess the connection between the sentiment paradox and two of the most commonly observed network paradoxes: (1) the friendship paradox and (2) the activity paradox, which provides opportunities to investigate the relationships among user sentiments, social connections and activities.
Friendship Paradox
One of the most well-known network paradoxes is the friendship paradox, first observed by Feld (Feld 1991), which states that users have fewer friends than their friends, on average. The paradox also holds for the median value (Hodas, Kooti, and Lerman 2014). In our data, in addition to sentiment paradoxes at different network levels, we observe a friendship paradox for most users (both mean and median, see Figure 3). Here, we explore the interplay between node degrees and the sentiment paradox, motivated by the following facts:

1. A user with degree $d$ contributes his SWB value $d$ times to the average SWB distribution of friends of users. We illustrate this fact using an example.

Example 1 Consider a simple undirected friendship network (see Figure 4) with four users $u_1$, $u_2$, $u_3$, and $u_4$, whose corresponding SWB values are $+0.1$, $-0.2$, $-0.3$, and $-0.4$. For user $u_1$, the average SWB value of his friends is $(-0.2 - 0.3 - 0.4)/3 = -0.3$. The average SWB values of the friends of $u_2$, $u_3$, and $u_4$ are all $+0.1$. Thus, user $u_1$ with degree three contributes his SWB value three times (as a friend of $u_2$, $u_3$, and $u_4$) to the average SWB distribution of friends of users, while other users, with degree one, contribute their SWB values only once.
2. Compared with the distribution of user sentiments (SWBs, see Figure 1), the distributions of the mean and median of sentiments of friends, followees or followers of users are skewed to the right (see Figure 5 for friends), i.e., the latter distributions are a weighted version of the former one, weighting those with comparatively high SWB values more.

Footnote: Both network paradoxes are expected to exist based on the mean value, as the distributions of node degrees and user activity exhibit a heavy tail (Hodas, Kooti, and Lerman 2014), which is different from that of user sentiments.

Figure 4: An Illustration for Example 1. A star network of four users with SWB values +0.1 (center), −0.2, −0.3, and −0.4.

Figure 5: Distribution of Mean of Friend Sentiments (left), and Median of Friend Sentiments (right). Compared to the distribution of user sentiments (SWBs, see Figure 1), the distributions of the mean and median of sentiments of friends of users are skewed to the right, i.e., µ increases. Panels: Mean SWB of Friends (µ = 0.0, σ = 0.2) and Median SWB of Friends (µ = 0.0, σ = 0.2); horizontal axis: from −1 to 1; vertical axis: Users.

Given these two facts, it is natural to study whether users with relatively high (in-, out-) degrees are more positive (i.e., have higher SWB values) than those with relatively low (in-, out-) degrees. We verify this hypothesis in two ways.

I. Without labeling a user as positive, negative or neutral, we directly compute the correlation coefficient between user sentiments (SWB values) and their number of (i) friends (degrees), (ii) followees (out-degrees), and (iii) followers (in-degrees). Results are presented in Table 4, which indicates that the sentiments of users are positively correlated with their number of friends, followees and followers, with p-values approaching zero (i.e., results are highly significant).

Table 4: Correlations Between User Sentiments (SWB) and Number of Social Connections. Correlations are all positive and highly significant as p-values approach zero. Correlation coefficients for the three pairs of SWB and number of social connections: […] (p-value → 0), […] (p-value → 0), […] (p-value → 0).

We further visualize such correlations, where a least-square fit of the trend is provided in Figure 6. It further validates that the SWB value of users is positively related to their (in-, out-) degrees, especially when SWB values are between −0.[…] and +0.[…]. Concretely, the (in-, out-) degree of users with SWB value +0.[…] is about six more than that of users with SWB value −0.[…]. In other words, more positive users usually have more friends, followees and followers.

Figure 6: Relations Between SWB and (i) Degrees (left), (ii) Out-degrees (middle), and (iii) In-degrees (right) of Users. SWB values of users have positive relations with (in-, out-) degrees, in particular, when SWB values are between −0.[…] and +0.[…]. Horizontal axes: SWB, from −1 to 1; vertical axes: Degree, Out-degree, In-degree.

Note that the group of users with extreme sentiments (i.e., whose SWB values approach −1 or +1) is not representative enough, as it occupies a very small proportion (less than 7%) of the population. In Figure 6, it can be observed that such users seem to have significantly larger degrees. However, such a phenomenon can be attributed to the degree distribution, which is almost power-law and has a heavy tail. Once one user in the group has a significantly larger degree, it easily leads to a peak when using the least-square fit. Hence, our conclusion here is obtained mainly based on users with SWB values between −0.[…] and +0.[…], as these users are more representative than users whose sentiments are not in this range.

Table 5: Average Number of Social Connections for Positive, Negative and Neutral Users. In general, positive users have (30% to 45%) more social connections than the others.

(a) When SWB values of users are between −1 and +1

Users         Friends  Followees  Followers
Positive (+)  5.09     8.01       8.16
Negative (−)  3.58     5.99       5.88
Neutral (0)   3.70     5.90       5.80
Overall       4.25     6.88       6.88

(b) When SWB values are between −0.[…] and +0.[…]

Users         Friends  Followees  Followers
Positive (+)  5.05     7.94       8.07
Negative (−)  3.67     6.06       5.96
Neutral (0)   3.70     5.90       5.80
Overall       4.27     6.87       6.88

II. We further consider all users and group them based on being positive ($S(u) > 0$), negative ($S(u) < 0$), or neutral ($S(u) = 0$). The average (in-, out-) degree for users within each group is then calculated and provided in Table 5(a).
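Both verification approaches (I and II) can be sketched in a few lines. The text does not name the correlation coefficient used, so Pearson's coefficient is assumed here; the function names and the toy values are hypothetical and do not reproduce the paper's data.

```python
import math
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient (assumed; the text only says
    'correlation coefficient')."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def mean_degree_by_group(S, degree):
    """Approach II: average degree of positive (S > 0), negative (S < 0)
    and neutral (S = 0) users."""
    groups = {"positive": [], "negative": [], "neutral": []}
    for user, value in S.items():
        if value > 0:
            groups["positive"].append(degree[user])
        elif value < 0:
            groups["negative"].append(degree[user])
        else:
            groups["neutral"].append(degree[user])
    return {name: mean(vals) for name, vals in groups.items() if vals}


# Hypothetical toy data: SWB values and number of friends per user.
S = {"a": 0.5, "b": 0.2, "c": 0.0, "d": -0.3, "e": -0.6}
degree = {"a": 6, "b": 4, "c": 3, "d": 3, "e": 2}

users = list(S)
r = pearson([S[u] for u in users], [degree[u] for u in users])
print(r)                                # positive for this toy data
print(mean_degree_by_group(S, degree))
```

The same two functions apply unchanged to out-degrees (followees) and in-degrees (followers) by passing a different `degree` mapping.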
Figure 7: Relationship Between User Sentiments (i.e., SWB) and Activity. User activity (i.e., the average number of user posts per 30 days) is rarely affected by user SWB values between −0.[…] and +0.[…]. Horizontal axis: SWB, from −1 to 1; vertical axis: Average Posts.

Table 5(a) shows that positive users have more friends, followees, and followers compared to the other users, which is also above the averages computed for all users. In particular, the number of friends, followees, and followers of positive users is on average 30% to 45% greater than that of negative users. Additionally, we also compute the average (in-, out-) degree of users whose SWB values are between −0.[…] and +0.[…]. The results are shown in Table 5(b), which leads to the same conclusion.

Activity Paradox
In social networks such as Twitter (Hodas, Kooti, and Lerman 2013) and Digg (Hodas, Kooti, and Lerman 2014), researchers have discovered the existence of an activity paradox: users are less active than their friends, on average. We observe an activity paradox in our data, though less strongly than the friendship paradox (see Figure 3), which inspires us to explore the potential relationships between user activity and sentiments. We quantify user activity as follows:
Definition 4 (User Activity)
Suppose user $u$ has posted $n$ posts in a social network, where the first post was published on date $d_1$ and the last one was posted on date $d_n$. The activity of user $u$ is defined as

$$A(u) = \frac{\Delta t}{d_n - d_1}\, n, \qquad (4)$$

where $\Delta t$ is the size of the time window in which we measure activity. Note that $\frac{d_n - d_1}{\Delta t}$ indicates how many $\Delta t$'s (e.g., months) a user has been active on the network and $A(u)$ indicates the average number of posts of user $u$ in a $\Delta t$ period.

The relation between user sentiments (i.e., SWB values) and activity (i.e., the average number of posts of users per $\Delta t = 30$ days) is shown in Figure 7. We observe that the value of $\Delta t$ does not influence the result. There seems to be a slight positive correlation; however, the number of user posts is rarely affected by the user SWB values if between −0.[…] and +0.[…], which covers 93% of our users. Therefore, we do not consider user activity to have a significant impact on the sentiment paradox.

Table 6: Feature List

Feature Group (number of features): Features
General Sentiment Paradox (6): Mean and median of SWB values of one's social connections (friends, followees, or followers)
Triad Sentiment Paradox (6): Mean and median of SWB values of one's social connections in a triad
Common-neighbor Sentiment Paradox (6): Mean and median of SWBs of one's social connections with whom he shares common neighbors
Community Sentiment Paradox (6): Mean and median of SWB values of one's social connections in a community
Common-interest Sentiment Paradox (6): Mean and median of SWBs of one's social connections with whom he shares common interests
Friendship Paradox (9): The number of degrees, in-degrees and out-degrees of oneself, and the mean and median of degrees, in-degrees and out-degrees of one's social connections

Table 7: Distribution of Positive, Negative and Neutral Users

User          Number   Proportion
Positive (+)  50,705   43.92%
Negative (−)  61,066   52.90%
Neutral (0)   3,673    3.18%
Total         115,444  100.00%

User Sentiment Prediction
Sentiment paradoxes reveal a relationship between users and their social connections (friends, followees, or followers) at the triad, community, and network levels: in general, users are less positive than their social connections. In this section, we demonstrate how a user's sentiment (positive or negative) can be predicted from the general sentiments of their social connections at the triad, community, and network levels. Before elaborating, we provide the distribution of positive, negative, and neutral users in Table 7.
To predict user sentiments (positive or negative), we treat the task as a binary classification problem within a supervised machine learning framework. Specifically, we represent each user as a set of features inspired by the five validated sentiment paradoxes and by the friendship paradox, which has been shown to be correlated with the sentiment paradox. The features are presented in Table 6. Several common supervised classifiers are then trained and used to predict user sentiments (positive or negative) under ten-fold cross-validation. Results are evaluated by accuracy (ACC) and AUC.
Table 9 provides the overall results. They are obtained with XGBoost (Chen and Guestrin 2016), which performs best among the supervised classifiers considered, including logistic regression, decision trees, naïve Bayes, random forests, and Support Vector Machine (SVM); see Table 8 for their performance comparison. The results in Table 9 indicate that (1) among single sentiment paradoxes, the general one performs best in predicting user sentiments; (2) combining all sentiment paradoxes outperforms using any single one; and (3) using all features (the five sentiment paradoxes plus the correlated friendship paradox) yields the highest accuracy (62%) with an AUC of 60%, the same AUC as the five sentiment paradoxes combined.

Table 8: Performance Comparison of Various Supervised Classifiers in Predicting User Sentiments. XGBoost performs best among all classifiers.

Classifier             ACC    AUC
Logistic Regression    .613   .590
Decision Tree          .596   .580
Naïve Bayes            .600   .580
Random Forest          .590   .580
SVM                    .587   .573
XGBoost                .620   .600

Table 9: Performance Comparison of Various Feature Groups in Predicting User Sentiments. Among single sentiment paradoxes, the general one performs best. Combining all sentiment paradoxes outperforms using any single one.

Feature Group                                    ACC    AUC
General Sentiment Paradox                        .601   .581
Triad Sentiment Paradox                          .589   .568
Common-neighbor Sentiment Paradox                .592   .571
Community Sentiment Paradox                      .590   .569
Common-interest Sentiment Paradox                .589   .569
All Sentiment Paradoxes                          .617   .600
All Sentiment Paradoxes + Friendship Paradox     .620   .600
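The evaluation protocol used in this section (binary labels, ten-fold cross-validation, ACC and AUC) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the pipeline behind Tables 8 and 9: the data are synthetic, the single feature (a hypothetical "mean sentiment of a user's connections") stands in for the feature groups of Table 6, and a one-feature nearest-centroid rule replaces XGBoost so that the sketch needs only the Python standard library. All function names are illustrative.

```python
import random


def auc_score(labels, scores):
    """AUC as the chance a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and split the indices into k disjoint folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(features, labels, k=10, seed=0):
    """k-fold CV of a one-feature nearest-centroid classifier; returns mean (ACC, AUC)."""
    accs, aucs = [], []
    for test_idx in k_fold_indices(len(labels), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        # "Training": class means of the feature, estimated on the training folds only.
        pos_vals = [features[i] for i in train_idx if labels[i] == 1]
        neg_vals = [features[i] for i in train_idx if labels[i] == 0]
        pos_mean = sum(pos_vals) / len(pos_vals)
        neg_mean = sum(neg_vals) / len(neg_vals)
        threshold = (pos_mean + neg_mean) / 2.0
        sign = 1.0 if pos_mean >= neg_mean else -1.0
        # Score the held-out fold; a positive score means "predict positive user".
        scores = [sign * (features[i] - threshold) for i in test_idx]
        truth = [labels[i] for i in test_idx]
        preds = [1 if s > 0 else 0 for s in scores]
        accs.append(sum(p == t for p, t in zip(preds, truth)) / len(truth))
        aucs.append(auc_score(truth, scores))
    return sum(accs) / k, sum(aucs) / k


# Synthetic stand-in data (an assumption): positive users (label 1) have
# slightly more positive connections on average than negative users (label 0).
rng = random.Random(42)
labels = [rng.randint(0, 1) for _ in range(1000)]
features = [rng.gauss(0.1 if y == 1 else -0.1, 0.5) for y in labels]

acc, auc = cross_validate(features, labels, k=10)
print(f"ACC={acc:.3f} AUC={auc:.3f}")
```

Estimating the threshold inside each training split keeps the held-out fold unseen, which is the property that makes cross-validated ACC and AUC comparable across classifiers and feature groups.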
Related Work
Numerous studies have examined network paradoxes, especially the friendship paradox. For example, the friendship paradox has been observed in many online networks (e.g., Quora (Iyer 2018) and Twitter (Hodas, Kooti, and Lerman 2013)) and offline networks (Pires, Marquitti, and Guimaraes Jr 2017). Kooti et al. (Hodas, Kooti, and Lerman 2014) have observed and proved that the friendship paradox, when measured by the mean value, must exist in social networks because node degrees follow heavy-tailed distributions. A recent study shows that the friendship paradox can help identify popular users by connecting it with friendship strength among users (Bagrow, Danforth, and Mitchell 2017). Nettasinghe and Krishnamurthy utilize the friendship paradox to design randomized polling methods for social networks (Nettasinghe and Krishnamurthy 2018). In addition to the friendship paradox, recent literature has explored other network paradoxes such as the user activity paradox, the happiness paradox (Bollen et al. 2017), and the scientific collaboration paradox, which indicates that researchers on average have fewer coauthors, citations, and publications, and a lower h-index, than their collaborators (Fotouhi, Momeni, and Rabbat 2014; Benevenuto, Laender, and Alves 2016; Eom and Jo 2014). Research on these non-friendship paradoxes, however, is at an early stage: potential explanations for their existence and their applications have rarely been investigated.
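The mean-based friendship paradox mentioned above can be illustrated with a small sketch. It relies on the standard identity that the mean degree of a node reached by following a random edge is ⟨k²⟩/⟨k⟩, which is at least the mean degree ⟨k⟩ whenever degrees vary; this is a textbook argument and not necessarily the exact derivation used in the cited work. The toy graphs and function names are illustrative assumptions.

```python
def mean_degree(adj):
    """Average number of friends per user: <k>."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def mean_friend_degree(adj):
    """Average degree of a 'friend' (a random edge endpoint): <k^2> / <k>."""
    degrees = [len(nbrs) for nbrs in adj.values()]
    return sum(d * d for d in degrees) / sum(degrees)


# Toy undirected graphs as adjacency lists (hypothetical examples).
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

print(mean_degree(star))             # 1.6
print(mean_friend_degree(star))      # 2.5
print(mean_degree(triangle))         # 2.0
print(mean_friend_degree(triangle))  # 2.0
```

In the star graph the hub is a friend of every other node, so friends have more friends on average (2.5 versus 1.6); in the triangle all degrees are equal and the two means coincide.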
Conclusion
This work is motivated by a limitation of current sentiment analysis studies, which have not considered interacting users in social networks, and by the phenomenon that people often consider their friends to be more positive than themselves, which psychology often attributes to human cognition biases. We present five sentiment paradoxes at the triad, community, and network levels, all empirically and mathematically validated in undirected networks (i.e., with friendships) and directed networks (i.e., with follower and followee relationships). By studying the relations between the sentiment paradox and various characteristics of networks and users, we observe that (i) sentiment distributions determine the expected (non-)existence of sentiment paradoxes; (ii) node degrees (i.e., the number of social connections of users) are positively correlated with user sentiments; and (iii) there is no clear pattern between user sentiments and user activity. These connections (though not causal) can account for the existence and magnitude of sentiment paradoxes in social networks, which cannot be attributed solely to human cognition bias, as the paradoxes generally exist in social networks. Additionally, we provide a first demonstration of how our findings can be applied to predicting user sentiments. In the future, we will further analyze causal relationships between users' connections (degrees) and sentiments. Sentiment paradoxes in dynamic social networks, as well as in the “like” and “comments” networks, will be part of our future studies.
References
[Bagrow, Danforth, and Mitchell 2017] Bagrow, J. P.; Danforth, C. M.; and Mitchell, L. 2017. Which friends are more popular than you?: Contact strength and the friendship paradox in social networks. In Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017, 103–108. ACM.
[Benevenuto, Laender, and Alves 2016] Benevenuto, F.; Laender, A. H.; and Alves, B. L. 2016. The H-index paradox: your coauthors have a higher H-index than you do. Scientometrics
Artificial Life
EPJ Data Science
The Oxford Handbook of Computational Linguistics 2nd edition. Oxford University Press.
[Chen and Guestrin 2016] Chen, T., and Guestrin, C. 2016. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–794. ACM.
[Eom and Jo 2014] Eom, Y.-H., and Jo, H.-H. 2014. Generalized friendship paradox in complex networks: The case of scientific collaboration. Scientific Reports
American Journal of Sociology
PeerJ Computer Science
Physics Reports
International Conference on Social Informatics, 339–352. Springer.
[Hodas, Kooti, and Lerman 2013] Hodas, N. O.; Kooti, F.; and Lerman, K. 2013. Friendship paradox redux: Your friends are more interesting than you. ICWSM
Proceedings of the ICWSM, 8–10.
[Iyer 2018] Iyer, S. 2018. Friendship paradoxes on Quora. In Guide to Big Data Applications. Springer. 205–244.
[Jin and Zafarani 2017] Jin, S., and Zafarani, R. 2017. Emotions in social networks: Distributions, patterns, and models. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 1907–1916.
[Jordan et al. 2011] Jordan, A. H.; Monin, B.; Dweck, C. S.; Lovett, B. J.; John, O. P.; and Gross, J. J. 2011. Misery has more company than people think: Underestimating the prevalence of others' negative emotions. Personality and Social Psychology Bulletin
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 1361–1370. ACM.
[Lewis, Gonzalez, and Kaufman 2012] Lewis, K.; Gonzalez, M.; and Kaufman, J. 2012. Social selection and peer influence in an online social network. Proceedings of the National Academy of Sciences
IEEE Transactions on Knowledge and Data Engineering
Synthesis Lectures on Human Language Technologies
arXiv preprint arXiv:1802.06505.
[Pires, Marquitti, and Guimaraes Jr 2017] Pires, M. M.; Marquitti, F. M.; and Guimaraes Jr, P. R. 2017. The friendship paradox in species-rich ecological networks: Implications for conservation and monitoring. Biological Conservation
Knowledge-Based Systems
Appendix
Post Distribution of Inactive Users
In our experiments, we retain only users with ten or more posts, in order to exclude occasionally active or inactive users. The post distribution of the excluded users is presented in Figure 8. The distribution indicates that a substantial number of the users not considered in our study have posted nothing.
Illustration of User Post
When posting blogs on LiveJournal, users can explicitly report their sentiments by selecting a mood. An illustration can be seen in Figure 9, where the mood is Chipper.
Sentiment Polarity Identification of Moods
There are 132 moods available on LiveJournal. The sentiment polarity (positive, neutral, or negative) of these moods is determined as shown in Table 10.
Figure 8: Post Distribution of Inactive Users
Figure 9: An Illustrated User Post with Mood Chipper
Table 10: Moods and Their Sentiment Polarity

Polarity   Mood
Positive   amused; accomplished; artistic; bouncy; calm; cheerful; content; creative; complacent; determined; excited; ecstatic; energetic; full; good; giggly; grateful; happy; hopeful; high; impressed; jubilant; loved; peaceful; productive; pleased; rejuvenated; sympathetic; satisfied; thankful; thoughtful; working
Neutral    awake; blah; blank; busy; chipper; contemplative; ditzy; dorky; drained; drunk; flirty; geeky; groggy; horny; hot; hyper; indescribable; intimidated; mellow; nerdy; okay; optimistic; recumbent; refreshed; relaxed; rushed; shocked; sleepy; surprised
Negative