Xufei Wang
Arizona State University
Publication
Featured research published by Xufei Wang.
Data Mining and Knowledge Discovery | 2012
Lei Tang; Xufei Wang; Huan Liu
The pervasiveness of Web 2.0 and social networking sites has enabled people to interact with each other easily through various social media. For instance, popular sites like Del.icio.us, Flickr, and YouTube allow users to comment on shared content (bookmarks, photos, videos), and users can tag their favorite content. Users can also connect with one another, and subscribe to or become a fan or a follower of others. These diverse activities result in a multi-dimensional network among actors, forming group structures with group members sharing similar interests or affiliations. This work systematically addresses two challenges. First, it is challenging to effectively integrate interactions over multiple dimensions to discover hidden community structures shared by heterogeneous interactions. We show that representative community detection methods for single-dimensional networks can be presented in a unified view. Based on this unified view, we present and analyze four possible integration strategies to extend community detection from single-dimensional to multi-dimensional networks. In particular, we propose a novel integration scheme based on structural features. Another challenge is the evaluation of different methods without ground truth information about community membership. We employ a novel cross-dimension network validation (CDNV) procedure to compare the performance of different methods. We use synthetic data to deepen our understanding, and real-world data to compare integration strategies as well as baseline methods on a large scale. We further study the computational time of different methods, the effect of normalization during integration, sensitivity to related parameters, and alternative community detection methods for integration.
International Conference on Data Mining | 2009
Lei Tang; Xufei Wang; Huan Liu
With the pervasive availability of Web 2.0 and social networking sites, people can interact with each other easily through various social media. For instance, popular sites like Del.icio.us, Flickr, and YouTube allow users to comment on shared content (bookmarks, photos, videos), and users can tag their own favorite content. Users can also connect to each other, and subscribe to or become a fan or a follower of others. These diverse individual activities result in a multi-dimensional network among actors, forming cross-dimension group structures with group members sharing certain similarities. It is challenging to effectively integrate the network information of multiple dimensions in order to discover cross-dimension group structures. In this work, we propose a two-phase strategy to identify the hidden structures shared across dimensions in multi-dimensional networks. We extract structural features from each dimension of the network via modularity analysis, and then integrate them all to find a robust community structure among actors. Experiments on synthetic and real-world data validate the superiority of our strategy, enabling the analysis of collective behavior underlying diverse individual activities on a large scale.
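The two-phase idea in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: it assumes unweighted, undirected dimensions, keeps only one structural feature per dimension (the leading eigenvector of the modularity matrix B = A - d d^T / 2m), and splits actors into two groups by sign rather than running a clustering step. The function names `modularity_matrix`, `leading_eigenvector`, and `shared_communities` are hypothetical.

```python
def modularity_matrix(n, edges):
    """Phase 1 helper: B = A - d d^T / (2m) for one network dimension."""
    A = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1.0
    deg = [sum(row) for row in A]
    two_m = sum(deg)
    return [[A[i][j] - deg[i] * deg[j] / two_m for j in range(n)]
            for i in range(n)]


def leading_eigenvector(M, iters=500):
    """Power iteration on M + shift*I (assumes M is symmetric).

    The shift is a Gershgorin bound, so every shifted eigenvalue is
    non-negative and the iteration converges to the eigenvector of the
    largest eigenvalue of M.
    """
    n = len(M)
    shift = max(sum(abs(x) for x in row) for row in M)
    x = [float(i + 1) for i in range(n)]  # fixed start keeps runs repeatable
    for _ in range(iters):
        y = [sum(M[i][j] * x[j] for j in range(n)) + shift * x[i]
             for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    return x


def shared_communities(n, dimensions):
    """Phase 2: integrate per-dimension features and split actors in two."""
    # One structural feature per dimension (illustrative simplification).
    features = [leading_eigenvector(modularity_matrix(n, edges))
                for edges in dimensions]
    # Gram matrix X X^T of the stacked features; its leading eigenvector is
    # the top left singular vector of X, unaffected by per-feature sign flips.
    gram = [[sum(f[i] * f[j] for f in features) for j in range(n)]
            for i in range(n)]
    u = leading_eigenvector(gram)
    return [1 if v > 0 else 0 for v in u]


# Two dimensions over the same six actors: the same two triangles,
# joined by a different bridge edge in each dimension.
triangles = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]
labels = shared_communities(6, [triangles + [(2, 3)], triangles + [(0, 5)]])
```

In this toy input both dimensions agree on the triangle structure, so actors 0 to 2 receive one label and actors 3 to 5 the other. Which group is labeled 0 or 1 depends on the eigenvector sign and carries no meaning.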
International Conference on Data Mining | 2010
Xufei Wang; Lei Tang; Huiji Gao; Huan Liu
The increasing popularity of social media is shortening the distance between people. Social activities, e.g., tagging in Flickr, bookmarking in Delicious, and tweeting on Twitter, are reshaping people’s social lives and redefining their social roles. People with shared interests tend to form groups in social media, and users within the same community are likely to exhibit similar social behavior (e.g., going to the same movies, having similar political viewpoints), which in turn reinforces the community structure. The multiple interactions in social activities entail that the community structures are often overlapping, i.e., one person is involved in several communities. We propose a novel co-clustering framework, which takes advantage of networking information between users and tags in social media, to discover these overlapping communities. In our method, users are connected via tags and tags are connected to users. This explicit representation of users and tags is useful for understanding group evolution by looking at who is interested in what. The efficacy of our method is supported by empirical evaluation on both synthetic and online social networking data.
Frontiers of Computer Science in China | 2012
Jiliang Tang; Xufei Wang; Huiji Gao; Xia Hu; Huan Liu
Social media websites allow users to exchange short texts such as tweets via microblogs and user status in friendship networks. Their limited length, pervasive abbreviations, and coined acronyms and words exacerbate the problems of synonymy and polysemy, and bring about new challenges to data mining applications such as text clustering and classification. To address these issues, we dissect some potential causes and devise an efficient approach that enriches data representation by employing machine translation to increase the number of features from different languages. Then we propose a novel framework which performs multi-language knowledge integration and feature reduction simultaneously through matrix factorization techniques. The proposed approach is evaluated extensively in terms of effectiveness on two social media datasets from Facebook and Twitter. With its significant performance improvement, we further investigate potential factors that contribute to the improved performance.
International Conference on Social Computing | 2011
Huiji Gao; Xufei Wang; Geoffrey Barbier; Huan Liu
The efficiency with which governments and nongovernmental organizations (NGOs) are able to respond to a crisis and provide relief to victims has gained increased attention. This emphasis coincides with significant events such as tsunamis, hurricanes, earthquakes, and environmental disasters occurring during the last decade. Crowdsourcing applications such as Twitter, Ushahidi, and Sahana have proven useful for gathering information about a crisis yet have limited utility for response coordination. In this paper, we briefly describe the shortfalls of current crowdsourcing applications applied to disaster relief coordination and discuss one approach aimed at facilitating efficient collaborations amongst disparate organizations responding to a crisis.
IEEE Transactions on Knowledge and Data Engineering | 2012
Lei Tang; Xufei Wang; Huan Liu
The study of collective behavior aims to understand how individuals behave in a social networking environment. Oceans of data generated by social media like Facebook, Twitter, Flickr, and YouTube present opportunities and challenges to study collective behavior on a large scale. In this work, we aim to learn to predict collective behavior in social media. In particular, given information about some individuals, how can we infer the behavior of unobserved individuals in the same network? A social-dimension-based approach has been shown to be effective in addressing the heterogeneity of connections present in social media. However, the networks in social media are normally of colossal size, involving hundreds of thousands of actors. The scale of these networks entails scalable learning of models for collective behavior prediction. To address the scalability issue, we propose an edge-centric clustering scheme to extract sparse social dimensions. With sparse social dimensions, the proposed approach can efficiently handle networks of millions of actors while demonstrating prediction performance comparable to that of other nonscalable methods.
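As a rough illustration of the edge-centric idea in the abstract above, the sketch below clusters edges instead of nodes and then gives each node the clusters of its incident edges as its sparse social dimensions. It is a toy version under stated assumptions, not the paper's algorithm: it uses plain k-means with squared Euclidean distance, seeds the centroids from evenly spaced edges so the run is deterministic, and uses dense vectors for readability. The function name `edge_centric_dimensions` is hypothetical.

```python
def edge_centric_dimensions(n, edges, k, iters=20):
    """Cluster edges with k-means, then map edge clusters back to nodes."""
    # Each edge is one instance: an indicator vector over its two endpoints.
    vecs = []
    for u, v in edges:
        x = [0.0] * n
        x[u] = x[v] = 1.0
        vecs.append(x)

    m = len(edges)
    # Assumption for repeatability: seed centroids from evenly spaced edges.
    seeds = [i * (m - 1) // max(k - 1, 1) for i in range(k)]
    centroids = [vecs[s][:] for s in seeds]

    assign = [0] * m
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assign = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2
                                  for a, b in zip(x, centroids[c])))
            for x in vecs
        ]
        # Update step: each centroid becomes the mean of its member edges.
        for c in range(k):
            members = [vecs[i] for i in range(m) if assign[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]

    # A node belongs to every cluster that contains one of its edges, so a
    # node with few edges has few affiliations and the result stays sparse.
    dims = [set() for _ in range(n)]
    for (u, v), c in zip(edges, assign):
        dims[u].add(c)
        dims[v].add(c)
    return dims


# Two triangles sharing node 2 (a "bowtie").
bowtie = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]
dims = edge_centric_dimensions(5, bowtie, k=2)
```

On this input the shared node 2 ends up in both edge clusters while every other node belongs to exactly one, which is the kind of overlapping yet sparse affiliation the abstract refers to.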
Knowledge and Information Systems | 2013
Xufei Wang; Lei Tang; Huan Liu; Lei Wang
A recent surge of participatory web and social media has created a new laboratory for studying human relations and collective behavior on an unprecedented scale. In this work, we study the predictive power of social connections to determine the preferences or behaviors of individuals such as whether a user supports a certain political view, whether one likes a product, whether she would like to vote for a presidential candidate, etc. Since an actor is likely to participate in multiple different communities with each regulating the actor’s behavior in varying degrees, and a natural hierarchy might exist between these communities, we propose to zoom into a network at multiple different resolutions and determine which communities reflect a targeted behavior. We develop an efficient algorithm to extract a hierarchy of overlapping communities. Empirical results on social media networks demonstrate the promising potential of the proposed approach in real-world applications.
International Symposium on Computational Intelligence and Design | 2015
Lei Tang; Xufei Wang; Huan Liu
The pervasiveness of Web 2.0 and social networking sites has enabled people to interact with each other easily through various social media. For instance, popular sites like Del.icio.us, Flickr, and YouTube allow users to comment on shared content (bookmarks, photos, videos), and users can tag their favorite content. Users can also connect with one another, and subscribe to or become a fan or a follower of others. These diverse activities result in a multi-dimensional network among actors, forming group structures with group members sharing similar interests or affiliations. This work systematically addresses two challenges. First, it is challenging to effectively integrate interactions over multiple dimensions to discover hidden community structures shared by heterogeneous interactions. We show that representative community detection methods for single-dimensional networks can be presented in a unified view. Based on this unified view, we present and analyze four possible integration strategies to extend community detection from single-dimensional to multi-dimensional networks. In particular, we propose a novel integration scheme based on structural features. Another challenge is the evaluation of different methods without ground truth information about community membership. We employ a novel cross-dimension network validation procedure to compare the performance of different methods. We use synthetic data to deepen our understanding, and real-world data to compare integration strategies as well as baseline methods on a large scale. We further study the computational time of different methods, the effect of normalization during integration, sensitivity to related parameters, and alternative community detection methods for integration.
MSM/MUSE'11 Proceedings of the 2011 International Conference on Modeling and Mining Ubiquitous Social Media - 2011 International Workshop on Modeling Social Media and 2011 International Workshop on Mining Ubiquitous and Social Environments | 2011
Jiliang Tang; Xufei Wang; Huan Liu
Community detection is an unsupervised learning task that discovers groups such that group members share more similarities or interact more frequently among themselves than with people outside the groups. In social media, link information can reveal heterogeneous relationships of various strengths, but can often be noisy. Since different sources of data in social media can provide complementary information, e.g., bookmarking and tagging data indicate user interests, frequency of commenting suggests the strength of ties, etc., we propose to integrate social media data of multiple types to improve the performance of community detection. We present a joint optimization framework to integrate multiple data sources for community detection. Empirical evaluation on both synthetic data and real-world social media data shows significant performance improvement of the proposed approach. This work elaborates the need for and challenges of multi-source integration of heterogeneous data types, and provides a principled way of multi-source community detection.
ACM Transactions on Intelligent Systems and Technology | 2011
Lei Tang; Xufei Wang; Huan Liu
The prolific use of participatory Web and social networking sites is reshaping the ways in which people interact with one another. It has become a vital part of human social life in both the developed and developing world. People sharing certain similarities or affiliations tend to form communities within social media. At the same time, they participate in various online activities: content sharing, tagging, posting status updates, etc. These diverse activities leave behind traces of their social life, providing clues to understand changing social structures. A large body of existing work focuses on extracting cohesive groups based on network topology, but little attention is paid to understanding the changing social structures. In order to help explain the formation of a group, we explore different group-profiling strategies to construct descriptions of a group. This research can assist network navigation, visualization, and analysis, as well as monitoring and tracking the ebbs and flows of different groups in evolving networks. By exploiting information collected from real-world social media sites, extensive experiments are conducted to evaluate group-profiling results. The pros and cons of different group-profiling strategies are analyzed with concrete examples. We also show some potential applications based on group profiling. Interesting findings with discussions are reported.