Zhilin Yang
Carnegie Mellon University
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Zhilin Yang.
knowledge discovery and data mining | 2015
Yutao Zhang; Jie Tang; Zhilin Yang; Jian Pei; Philip S. Yu
More often than not, people are active in more than one social network. Identifying users from multiple heterogeneous social networks and integrating the different networks is a fundamental issue in many applications. The existing methods tackle this problem by estimating pairwise similarity between users in two networks. However, those methods suffer from potential inconsistency of matchings between multiple networks. In this paper, we propose COSNET (COnnecting heterogeneous Social NETworks with local and global consistency), a novel energy-based model, to address this problem by considering both local and global consistency among multiple networks. An efficient subgradient algorithm is developed to train the model by converting the original energy-based objective function into its dual form. We evaluate the proposed model on two different genres of data collections: SNS and Academia, each consisting of multiple heterogeneous social networks. Our experimental results validate the effectiveness and efficiency of the proposed model. On both data collections, the proposed COSNET method significantly outperforms several alternative methods by up to 10-30% (p << 0:001, t-test) in terms of F1-score. We also demonstrate that applying the integration results produced by our method can improve the accuracy of expert finding, an important task in social networks.
meeting of the association for computational linguistics | 2017
Bhuwan Dhingra; Hanxiao Liu; Zhilin Yang; William W. Cohen; Ruslan Salakhutdinov
In this paper we study the problem of answering cloze-style questions over documents. Our model, the Gated-Attention (GA) Reader, integrates a multi-hop architecture with a novel attention mechanism, which is based on multiplicative interactions between the query embedding and the intermediate states of a recurrent neural network document reader. This enables the reader to build query-specific representations of tokens in the document for accurate answer selection. The GA Reader obtains state-of-the-art results on three benchmarks for this task--the CNN \& Daily Mail news stories and the Who Did What dataset. The effectiveness of multiplicative interaction is demonstrated by an ablation study, and by comparing to alternative compositional operators for implementing the gated-attention. The code is available at this https URL
meeting of the association for computational linguistics | 2017
Zhilin Yang; Junjie Hu; Ruslan Salakhutdinov; William W. Cohen
We study the problem of semi-supervised question answering----utilizing unlabeled text to boost the performance of question answering models. We propose a novel training framework, the Generative Domain-Adaptive Nets. In this framework, we train a generative model to generate questions based on the unlabeled text, and combine model-generated questions with human-generated questions for training question answering models. We develop novel domain adaptation algorithms, based on reinforcement learning, to alleviate the discrepancy between the model-generated data distribution and the human-generated data distribution. Experiments show that our proposed framework obtains substantial improvement from unlabeled text.
web search and data mining | 2014
Zhilin Yang; Jie Tang; Bin Xu; Chunxiao Xing
We study the problem of active learning for networked data, where samples are connected with links and their labels are correlated with each other. We particularly focus on the setting of using the probabilistic graphical model to model the networked data, due to its effectiveness in capturing the dependency between labels of linked samples. We propose a novel idea of connecting the graphical model to the information diffusion process, and precisely define the active learning problem based on the non-progressive diffusion model. We show the NP-hardness of the problem and propose a method called MaxCo to solve it. We derive the lower bound for the optimal solution for the active learning setting, and develop an iterative greedy algorithm with provable approximation guarantees. We also theoretically prove the convergence and correctness of MaxCo. We evaluate MaxCo on four different genres of datasets: Coauthor, Slashdot, Mobile, and Enron. Our experiments show a consistent improvement over other competing approaches.
conference on information and knowledge management | 2014
Zhilin Yang; Jie Tang; Yutao Zhang
Mining high-speed data streams has become an important topic due to the rapid growth of online data. In this paper, we study the problem of active learning for streaming networked data. The goal is to train an accurate model for classifying networked data that arrives in a streaming manner by querying as few labels as possible. The problem is extremely challenging, as both the data distribution and the network structure may change over time. The query decision has to be made for each data instance sequentially, by considering the dynamic network structure. We propose a novel streaming active query strategy based on structural variability. We prove that by querying labels we can monotonically decrease the structural variability and better adapt to concept drift. To speed up the learning process, we present a network sampling algorithm to sample instances from the data stream, which provides a way for us to handle large volume of streaming data. We evaluate the proposed approach on four datasets of different genres: Weibo, Slashdot, IMDB, and ArnetMiner. Experimental results show that our model performs much better (+5-10% by F1-score on average) than several alternative methods for active learning over streaming networked data.
knowledge discovery and data mining | 2013
Yang Yang; Jianfei Wang; Yutao Zhang; Wei Chen; Jing Zhang; Honglei Zhuang; Zhilin Yang; Bo Ma; Zhanpeng Fang; Sen Wu; Xiaoxiao Li; Debing Liu; Jie Tang
Online social networks become a bridge to connect our physical daily life and the virtual Web space, which not only provides rich data for mining, but also brings many new challenges. In this paper, we present a novel Social Analytic Engine (SAE) for large online social networks. The key issues we pursue in the analytic engine are concerned with the following problems: 1) at the micro-level, how do people form different types of social ties and how people influence each other? 2) at the meso-level, how do people group into communities? 3) at the macro-level, what are the hottest topics in a social network and how the topics evolve over time? We propose methods to address the above questions. The methods are general and can be applied to various social networking data. We have deployed and validated the proposed analytic engine over multiple different networks and validated the effectiveness and efficiency of the proposed methods.
international conference on machine learning | 2016
Zhilin Yang; William W. Cohen; Ruslan Salakhutdinov
neural information processing systems | 2016
Zhilin Yang; Ye Yuan; Yuexin Wu; William W. Cohen; Ruslan Salakhutdinov
arXiv: Computation and Language | 2016
Zhilin Yang; Ruslan Salakhutdinov; William W. Cohen
neural information processing systems | 2017
Zihang Dai; Zhilin Yang; Fan Yang; William W. Cohen; Ruslan Salakhutdinov