Yuxiao Dong
University of Notre Dame
Publication
Featured research published by Yuxiao Dong.
international conference on data mining | 2012
Yuxiao Dong; Jie Tang; Sen Wu; Jilei Tian; Nitesh V. Chawla; Jinghai Rao; Huanhuan Cao
Link prediction and recommendation is a fundamental problem in social network analysis. The key challenge of link prediction comes from the sparsity of networks: the number of links that could potentially form far exceeds the number that actually do. Most previous work tries to solve the problem in a single network; little research focuses on capturing the general principles of link formation across heterogeneous networks. In this work, we give a formal definition of link recommendation across heterogeneous networks. Then we propose a ranking factor graph model (RFG) for predicting links in social networks, which effectively improves the predictive performance. Motivated by the intuition that people make friends in different networks according to similar principles, we find several social patterns that are general across heterogeneous networks. With the general social patterns, we develop a transfer-based RFG model that combines them with network structure information. This model provides insight into the fundamental principles that drive link formation and network evolution. Finally, we verify the predictive performance of the presented transfer model on 12 pairs of transfer cases. Our experimental results demonstrate that the transfer of general social patterns indeed helps the prediction of links.
knowledge discovery and data mining | 2017
Yuxiao Dong; Nitesh V. Chawla; Ananthram Swami
We study the problem of representation learning in heterogeneous networks. Its unique challenges come from the existence of multiple types of nodes and links, which limit the feasibility of the conventional network embedding techniques. We develop two scalable representation learning models, namely metapath2vec and metapath2vec++. The metapath2vec model formalizes meta-path-based random walks to construct the heterogeneous neighborhood of a node and then leverages a heterogeneous skip-gram model to perform node embeddings. The metapath2vec++ model further enables the simultaneous modeling of structural and semantic correlations in heterogeneous networks. Extensive experiments show that metapath2vec and metapath2vec++ are able to not only outperform state-of-the-art embedding models in various heterogeneous network mining tasks, such as node classification, clustering, and similarity search, but also discern the structural and semantic correlations between diverse network objects.
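The meta-path-based random walk described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `metapath_walk`, the toy graph `ADJ`, and the type map `NODE_TYPE` are hypothetical, and the heterogeneous skip-gram training step is omitted. The sketch assumes a symmetric meta-path such as author-paper-venue-paper-author, so that the type pattern can repeat.

```python
import random

# Hypothetical toy academic network: authors (A), papers (P), one venue (V).
ADJ = {
    "a1": ["p1"], "a2": ["p2"],
    "p1": ["a1", "v1"], "p2": ["a2", "v1"],
    "v1": ["p1", "p2"],
}
NODE_TYPE = {"a1": "A", "a2": "A", "p1": "P", "p2": "P", "v1": "V"}


def metapath_walk(adj, node_type, start, metapath, walk_length, rng=None):
    """Random walk whose node types follow `metapath`, repeated cyclically.

    `metapath` must start and end with the same type (e.g. A-P-V-P-A),
    so the last element is dropped when building the repeating cycle.
    """
    rng = rng or random.Random()
    if metapath[0] != metapath[-1]:
        raise ValueError("meta-path must be symmetric to be repeated")
    if node_type[start] != metapath[0]:
        raise ValueError("start node type must match the meta-path")
    cycle = metapath[:-1]
    walk = [start]
    while len(walk) < walk_length:
        wanted = cycle[len(walk) % len(cycle)]
        # Only neighbors of the type required by the meta-path are eligible.
        candidates = [n for n in adj[walk[-1]] if node_type[n] == wanted]
        if not candidates:
            break  # dead end: no neighbor of the required type
        walk.append(rng.choice(candidates))
    return walk


walk = metapath_walk(ADJ, NODE_TYPE, "a1", ["A", "P", "V", "P", "A"], 9,
                     rng=random.Random(0))
print(walk)
```

Walks generated this way would then serve as the "sentences" from which a skip-gram model learns node embeddings; that step is left out here.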
data and knowledge engineering | 2013
Chuan Shi; Yanan Cai; Di Fu; Yuxiao Dong; Bin Wu
Research on community detection in complex networks has surged in recent years, since communities often play important roles in network systems. However, many real networks have more complex overlapping community structures. This paper proposes a novel algorithm to discover overlapping communities. Unlike conventional algorithms based on node clustering, the proposed algorithm is based on link clustering. Since links usually represent unique relations among nodes, link clustering discovers groups of links that share the same characteristics, so nodes naturally belong to multiple communities. The algorithm applies genetic operations to cluster links. An effective encoding schema is designed, and the number of communities is determined automatically. Experiments on both artificial networks and real networks validate the effectiveness and efficiency of the proposed algorithm.
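A minimal sketch can show why clustering links yields overlapping node communities. It assumes the link-to-cluster assignment is already given; the genetic algorithm and encoding schema from the paper are not reproduced, and the names `node_communities`, `overlapping_nodes`, and the toy edge labels are hypothetical.

```python
from collections import defaultdict


def node_communities(edge_labels):
    """Map each link cluster to the set of nodes its links touch."""
    communities = defaultdict(set)
    for (u, v), label in edge_labels.items():
        communities[label].update((u, v))
    return dict(communities)


def overlapping_nodes(communities):
    """Return nodes that belong to more than one community."""
    membership = defaultdict(set)
    for label, nodes in communities.items():
        for node in nodes:
            membership[node].add(label)
    return {node for node, labels in membership.items() if len(labels) > 1}


# Two triangles sharing node "c"; each link carries exactly one cluster label.
edge_labels = {
    ("a", "b"): 0, ("b", "c"): 0, ("a", "c"): 0,
    ("c", "d"): 1, ("d", "e"): 1, ("c", "e"): 1,
}
communities = node_communities(edge_labels)
print(communities)
print(overlapping_nodes(communities))
```

Each link has a single label, yet node "c" ends up in both communities because its incident links fall into different clusters.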
knowledge discovery and data mining | 2015
Yuxiao Dong; Jing Zhang; Jie Tang; Nitesh V. Chawla; Bai Wang
We study the problem of link prediction in coupled networks, where we have the structure information of one (source) network and the interactions between this network and another (target) network. The goal is to predict the missing links in the target network. The problem is extremely challenging as we do not have any information about the structure of the target network. Moreover, the source and target networks are usually heterogeneous and have different types of nodes and links. How can the structure information in the source network be used to predict links in the target network? How can the heterogeneous interactions between the two networks be leveraged for the prediction task? We propose a unified framework, CoupledLP, to solve the problem. Given two coupled networks, we first leverage atomic propagation rules to automatically construct implicit links in the target network, addressing the challenge of target network incompleteness, and then propose a coupled factor graph model that incorporates the meta-paths extracted from the coupled part of the two networks to transfer heterogeneous knowledge. We evaluate the proposed framework on two different genres of datasets: disease-gene (DG) and mobile social networks. In the DG networks, we aim to use the disease network to predict the associations between genes. In the mobile networks, we aim to use the mobile communication network of one mobile operator to infer the network structure of its competitors. On both datasets, the proposed CoupledLP framework outperforms several alternative methods. The proposed problem of coupled link prediction and the corresponding framework have both scientific and business applications in biology and social networks.
european conference on machine learning | 2015
Yuxiao Dong; Fabio Pinelli; Yiannis Gkoufas; Zubair Nabi; Francesco Calabrese; Nitesh V. Chawla
The pervasiveness and availability of mobile phone data offer the opportunity to discover usable knowledge about crowd behavior in urban environments. Cities can leverage such knowledge to provide better services, such as public transport planning, optimized resource allocation, and a safer environment. Call Detail Record (CDR) data represents a practical data source for detecting and monitoring unusual events, considering the high level of mobile phone penetration compared with GPS-equipped and open devices. In this paper, we propose a methodology that is able to detect unusual events from CDR data, which typically has low accuracy in terms of space and time resolution. Moreover, we introduce a concept of unusual event that involves a large number of people who exhibit unusual mobility behavior. Our careful consideration of the issues that come from coarse-grained CDR data ultimately leads to a completely general framework that can detect unusual crowd events from CDR data effectively and efficiently. Through extensive experiments on real-world CDR data for a large city in Africa, we demonstrate that our method can detect unusual events with 16% higher recall and over 10 times higher precision, compared to state-of-the-art methods. We implement a visual analytics prototype system to help end users analyze detected unusual crowd events to best suit different application scenarios. To the best of our knowledge, this is the first work on the detection of unusual events from CDR data that considers its temporal and spatial sparseness and the distinction between users' unusual activities and daily routines.
advanced data mining and applications | 2011
Yanan Cai; Chuan Shi; Yuxiao Dong; Qing Ke; Bin Wu
european conference on machine learning | 2013
Yuxiao Dong; Jie Tang; Tiancheng Lou; Bin Wu; Nitesh V. Chawla
Scientific Reports | 2015
Yang Yang; Yuxiao Dong; Nitesh V. Chawla
cloud data management | 2011
Yuxiao Dong; Qing Ke; Yanan Cai; Bin Wu; Bai Wang
knowledge discovery and data mining | 2017
Yuxiao Dong; Hao Ma; Zhihong Shen; Kuansan Wang
Research on community detection in complex networks has surged in recent years, since communities often play special roles in network systems. However, many community structures in the real world are overlapping. For example, a professor collaborates with researchers in different fields. In this paper, we propose a novel algorithm to discover overlapping communities. Unlike conventional algorithms based on node clustering, our algorithm is based on edge clustering. Since edges usually represent unique relations among nodes, edge clustering discovers groups of edges that share the same characteristics, so nodes naturally belong to multiple communities. The proposed algorithm applies a novel genetic algorithm to cluster edges. A scalable encoding schema is designed, and the number of communities is determined automatically. Experiments on both artificial networks and real networks validate the effectiveness and efficiency of the algorithm.