Yeyun Gong
Fudan University
Publication
Featured research published by Yeyun Gong.
conference on information and knowledge management | 2016
Qi Zhang; Yeyun Gong; Jindou Wu; Haoran Huang; Xuanjing Huang
On Twitter-like social media sites, re-posting the statuses or tweets of other users is usually considered the key mechanism for spreading information. Predicting whether a user will retweet a given tweet has received increasing attention in recent years. Previous methods approached the problem using various linguistic features, users' personal information, and many other manually constructed features. Feature engineering is usually laborious, and it relies on external sources that are difficult to obtain or not always available. Recently, deep learning methods have been adopted in industry and the research community for their ability to learn optimal features automatically, and they can achieve state-of-the-art performance in many areas, such as natural language processing, computer vision, and image classification. In this work, we proposed a novel attention-based deep neural network that incorporates contextual and social information for this task. We used embeddings to represent the user, the user's attention interests, the author, and the tweet, respectively. To train and evaluate the proposed method, we also constructed a large dataset collected from Twitter. Experimental results showed that the proposed method could achieve better results than previous state-of-the-art methods.
empirical methods in natural language processing | 2016
Qi Zhang; Yang Wang; Yeyun Gong; Xuanjing Huang
Keyphrases can provide highly condensed and valuable information that allows users to quickly acquire the main ideas. The task of automatically extracting them has received considerable attention in recent decades. Unlike previous studies, which usually focused on automatically extracting keyphrases from documents or articles, this study considered the problem of automatically extracting keyphrases from tweets. Because of the length limitations of Twitter-like sites, the performance of existing methods usually drops sharply. We proposed a novel deep recurrent neural network (RNN) model that combines keywords and context information to address this problem. To evaluate the proposed method, we also constructed a large-scale dataset collected from Twitter. The experimental results showed that the proposed method performed significantly better than previous methods.
international joint conference on artificial intelligence | 2017
Qi Zhang; Jiawen Wang; Haoran Huang; Xuanjing Huang; Yeyun Gong
In microblogging services, users usually use hashtags to mark keywords or topics. With the rapid growth of social networks, the task of automatically recommending hashtags has received considerable attention in recent years. Previous works focused only on the use of textual information. However, many microblog posts contain not only text but also corresponding images. These images can provide additional information that is not included in the text, which could help improve the accuracy of hashtag recommendation. Motivated by the successful use of the attention mechanism, we propose a co-attention network incorporating textual and visual information to recommend hashtags for multimodal tweets. Experimental results on data collected from Twitter demonstrated that the proposed method can achieve better performance than state-of-the-art methods using textual information only.
empirical methods in natural language processing | 2015
Yeyun Gong; Qi Zhang; Xuanjing Huang
In recent years, the task of recommending hashtags for microblogs has been given increasing attention. Various methods have been proposed to study the problem from different aspects. However, most of the recent studies have not considered the differences in the types or uses of hashtags. In this paper, we introduce a novel nonparametric Bayesian method for this task. Based on the Dirichlet Process Mixture Models (DPMM), we incorporate the type of hashtag as a hidden variable. The results of experiments on the data collected from a real world microblogging service demonstrate that the proposed method outperforms state-of-the-art methods that do not consider these aspects. By taking these aspects into consideration, the relative improvement of the proposed method over the state-of-the-art methods is around 12.2% in F1-score.
Neurocomputing | 2018
Yeyun Gong; Qi Zhang; Xuanjing Huang
In microblogging services, authors can use hashtags to mark keywords or topics in microblogs. Many live social media applications (e.g., microblog retrieval, classification) can benefit greatly from these manually labeled tags. However, only a small portion of microblogs contain hashtags. Moreover, many microblog posts contain not only textual content but also images. These visual resources also provide valuable information that is not included in the textual content. To recommend hashtags for these multimodal microblogs, in this work, we propose a novel generative method that incorporates textual and visual information. Experimental results on the data collected from real world microblogging services demonstrate that the proposed method outperforms state-of-the-art methods using either textual or visual information. The relative improvement of the proposed method over the text-only method is more than 17.1% in F1-score.
Science in China Series F: Information Sciences | 2017
Yeyun Gong; Qi Zhang; Xiaoying Han; Xuanjing Huang
In microblogs, authors use hashtags to mark keywords or topics. These manually labeled tags can be used to benefit various live social media applications (e.g., microblog retrieval, classification). However, because only a small portion of microblogs contain hashtags, recommending hashtags for use in microblogs is a worthwhile exercise. In addition, human inference often relies on the intrinsic grouping of words into phrases, yet existing work uses only unigrams to model corpora. In this work, we propose a novel phrase-based topical translation model to address this problem. We use the bag-of-phrases model to better capture the underlying topics of posted microblogs. We regard the phrases and hashtags in a microblog as two different languages that are talking about the same thing. Thus, the hashtag recommendation task can be viewed as a translation process from phrases to hashtags. To handle the topical information of microblogs, the proposed model regards translation probability as being topic specific. We test the methods on data collected from real-world microblogging services. The results demonstrate that the proposed method outperforms state-of-the-art methods that use the unigram model. Summary of innovations: In recent years, hashtag recommendation for microblogs has received wide attention. Current hashtag recommendation models operate mainly at the word level. However, a phrase usually expresses a single meaning, so it is unreasonable to align each word in a phrase to a different hashtag. This paper therefore proposes a phrase-level hashtag recommendation method.
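The idea of treating hashtag recommendation as topic-specific translation from phrases to hashtags can be illustrated with a minimal sketch. This is not the authors' model or inference procedure: the function name `rank_hashtags`, the uniform weighting over phrases, and all probabilities in the toy tables are hypothetical, and the sketch assumes the topic distribution and translation table have already been estimated elsewhere.

```python
from collections import defaultdict


def rank_hashtags(phrases, topic_probs, translation_probs, top_k=3):
    """Rank hashtags for one microblog (illustrative only).

    phrases: phrases extracted from the post (bag-of-phrases).
    topic_probs: assumed mapping topic -> P(topic | post).
    translation_probs: assumed mapping (topic, phrase) -> {hashtag: P(hashtag | phrase, topic)}.
    Each phrase gets equal weight, which is a simplifying assumption.
    """
    scores = defaultdict(float)
    for topic, p_topic in topic_probs.items():
        for phrase in phrases:
            candidates = translation_probs.get((topic, phrase), {})
            for hashtag, p_trans in candidates.items():
                # Topic-weighted translation probability, averaged over phrases.
                scores[hashtag] += p_topic * p_trans / len(phrases)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy, made-up numbers purely for illustration.
toy_phrases = ["world cup", "final match"]
toy_topic_probs = {"sports": 0.8, "tech": 0.2}
toy_translation_probs = {
    ("sports", "world cup"): {"#worldcup": 0.7, "#football": 0.3},
    ("sports", "final match"): {"#football": 0.6, "#worldcup": 0.4},
    ("tech", "world cup"): {"#streaming": 1.0},
}

print(rank_hashtags(toy_phrases, toy_topic_probs, toy_translation_probs, top_k=2))
```

Because the whole phrase "world cup" is looked up as a single key, it maps to hashtags as a unit rather than word by word, which is the point the abstract makes against unigram alignment.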
conference on information and knowledge management | 2013
Qi Zhang; Jihua Kang; Yeyun Gong; Huan Chen; Yaqian Zhou; Xuanjing Huang
Map search has received considerable attention in recent years. With map search, users can specify target locations with textual queries. However, these queries do not always include well-formed addresses or place names. They may contain transpositions, misspellings, fragments and so on. Queries may significantly differ from items stored in the spatial database. In this paper, we propose to connect this task to the semi-structured retrieval problem. A novel factor graph-based semi-structured retrieval framework is introduced to incorporate concept weighting, attribute selection, and word-based similarity metrics together. We randomly sampled a number of queries from logs of a commercial map search engine and manually labeled their categories and relevant results for analysis and evaluation. The results of several experimental comparisons demonstrate that our method outperforms both state-of-the-art semi-structured retrieval methods and some commercial systems in retrieving freeform location queries.
National CCF Conference on Natural Language Processing and Chinese Computing | 2017
Jin Qian; Yeyun Gong; Qi Zhang; Xuanjing Huang
The hierarchical Dirichlet process model has been successfully used for extracting the topical or semantic content of documents and other kinds of sparse count data. With the growth of social media, the amounts of both textual information and social structural information have increased. To incorporate the information contained in these structures, in this paper, we propose a novel non-parametric model, the social hierarchical Dirichlet process (sHDP). We assume that the topic distributions of documents are similar to each other if their authors are related in social networks. The proposed method extends the hierarchical Dirichlet process model. We evaluate the utility of our method by applying it to three data sets: papers from NIPS proceedings, a subset of articles from Cora, and microblogs with social network information. Experimental results demonstrate that the proposed method can achieve better performance than state-of-the-art methods on all three data sets.
national conference on artificial intelligence | 2015
Qi Zhang; Yeyun Gong; Ya Guo; Xuanjing Huang
international conference on computational linguistics | 2014
Qi Zhang; Yeyun Gong; Xuyang Sun; Xuanjing Huang