Archive | 2019
Edge attribute-enhanced community discovery in social networks
Abstract
Social networks like Facebook and Twitter have become an important part of our lives. We use them for different purposes: posting opinions about events, products or services, getting news updates, or catching up with friends, family and other users with similar interests. The enormous volume of data available from social networks has provided many opportunities for social network mining, deepening our understanding of social behaviour and how social recommendation systems operate. Community discovery is one of the fundamental challenges in social network mining. It aids in detecting users with dense connections and similar behaviours. Community discovery can be applied to static or dynamic structures; static structures represent long-lived relationships between users, such as following-follower links on Twitter, and dynamic structures consider the frequency, sentiment, or time of interactions. Currently, many community discovery methods focus on the static structure to detect communities, and other useful attributes that exist in the network's dynamic structure are overlooked, including edge attributes and node attributes. Edge attributes may include the volume of interactions between users, the sentiment of interactions, and the time of interactions. Node attributes may include users' interests or the level of their influence. In this thesis, we aim to enhance community discovery in social networks using edge attributes. To show the importance of using edge attributes in community discovery, we initially analyse Twitter data and make two main observations. The first is that, within any fixed time interval, only a limited number of users interact or communicate over the links of their static structure, for example with their followers. It therefore makes sense to find active communities that are biased towards the temporal interactions of their users, rather than relying solely on a static structure.
Communities detected from this new perspective will provide time-variant social relationships or recommendations in their social networks, and these may significantly improve the applicability of social data analytics. The second observation is that positive sentiment is contagious within communities: members of a community share positive tweets more than negative ones over time. We find a strong correlation between positive sentiment and the size of a community. These observations on sentiment shed light on the presence of like-minded users within communities, which will appeal to social network companies that use viral marketing and recommendation systems. These observations motivate us to propose approaches that enhance community discovery using edge attributes. We present three sub-problems, each related to the larger problem of discovering social network communities using edge attributes. The first sub-problem is to discover temporal interaction biased (TIB) communities. A TIB community is an active community with constant interactions among its members; this significantly assists the application of social analytics. We develop an influence propagation model that gives the highest weight to active edges, or to inactive edges close to active edges. We then redesign the activity-biased community model by extending the classical density-based community detection metric. Lastly, we develop two different expansion-driven algorithms to find the activity-biased densest community efficiently. We verify the effectiveness of the extended community metric and the efficiency of the algorithms using real datasets, comparing them with well-known methods. The second sub-problem is to partition a graph for overlapping community discovery in a parallel environment. We develop an approach using an objective function based on clique structure.
The partition aims to enhance all overlapping community detection methods that rely on clique structure and make them suitable for parallel execution. We verify and compare the efficiency of the partitioning in sequential and parallel environments for the influence propagation model and TIB community discovery. Our results indicate that our approach consumes little memory and shows a significant decrease in computation time. The third sub-problem is to discover positive–persistent communities in social networks. This problem addresses the possibility of discovering like-minded users in a social network based on the sentiment of their interactions within communities. Members of positive–persistent communities share more positive than negative sentiment in a maximal time interval with a cohesive structure. To detect these communities, we propose two models, evaluate them using Twitter datasets, and then compare the results against a clique-based model and Infomap community detection. Our results show that our models efficiently and effectively detect positive–persistent communities, as they group all like-minded users who consistently interact over time with positive sentiments. Positive–persistent communities are useful for applications such as marketing and recommendation systems that seek to make use of communities with many like-minded users.
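
As an illustration of the positive–persistent idea only, the following minimal sketch checks, for a given group of users, in which consecutive time windows their internal interactions carry more positive than negative sentiment. It is not one of the two models proposed in the thesis: the interaction tuple layout, the fixed window length, and the function names `sentiment_balance_per_window` and `longest_positive_interval` are all assumptions made for this example, and the cohesive-structure requirement is deliberately left out.

```python
from collections import defaultdict

def sentiment_balance_per_window(interactions, members, window):
    """Return {window_index: positives - negatives} for member-to-member interactions.

    Assumed (hypothetical) input format: (source, target, timestamp, sentiment),
    where sentiment is +1, -1, or 0 (neutral interactions are ignored).
    """
    balance = defaultdict(int)
    for src, dst, timestamp, sentiment in interactions:
        if src in members and dst in members and sentiment != 0:
            balance[timestamp // window] += 1 if sentiment > 0 else -1
    return dict(balance)

def longest_positive_interval(balance):
    """Return (first_window, last_window) of the longest run of consecutive
    windows whose balance is strictly positive, or (None, None) if there is none.
    Windows without any interaction break a run."""
    best, best_len = (None, None), 0
    run_start, prev = None, None
    for w in sorted(balance):
        if balance[w] > 0:
            if run_start is None or w != prev + 1:
                run_start = w  # start a new run
            prev = w
            if w - run_start + 1 > best_len:
                best_len = w - run_start + 1
                best = (run_start, w)
        else:
            run_start, prev = None, None  # a non-positive window ends the run
    return best

# Toy data, invented for illustration only.
interactions = [
    ("a", "b", 0, +1), ("b", "c", 5, +1),
    ("a", "c", 12, +1), ("c", "a", 14, -1), ("b", "a", 15, +1),
    ("a", "b", 25, -1), ("a", "d", 26, +1),  # "d" is outside the group
    ("b", "c", 31, +1),
]
members = {"a", "b", "c"}
balance = sentiment_balance_per_window(interactions, members, window=10)
print(balance)
print(longest_positive_interval(balance))
```

On the toy data the group is positive in windows 0 and 1, negative in window 2, and positive again in window 3, so the longest positive interval spans windows 0 to 1. A fuller treatment would also have to verify that the group remains structurally cohesive over that interval, which this sketch does not attempt.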