Xian-Ling Mao
Beijing Institute of Technology
Publication
Featured research published by Xian-Ling Mao.
international conference on tools with artificial intelligence | 2014
Zhirun Liu; Heyan Huang; Xiaochi Wei; Xian-Ling Mao
Recently, authority ranking has received increasing interest in both academia and industry, and it is applicable to many problems such as discovering influential nodes and building recommendation systems. Various graph-based ranking approaches such as PageRank have been used to rank authors and papers separately in homogeneous networks. In this paper, we take venue information into consideration and propose a novel graph-based ranking framework, Tri-Rank, to co-rank authors, papers and venues simultaneously in heterogeneous networks. The framework is flexible: it ranks authors, papers and venues iteratively in a mutually reinforcing way to achieve a more integrated and fair ranking. We conduct extensive experiments on data collected from the ACM Digital Library. The experimental results show that Tri-Rank is more effective and efficient than state-of-the-art baselines, including PageRank, HITS and Co-Rank, in ranking authors. The papers and venues ranked by Tri-Rank also indicate that its rankings are reasonable.
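As a point of reference for the graph-based ranking mentioned above, the following is a minimal sketch of PageRank, one of the baselines the abstract names. It is not Tri-Rank itself, whose update rules are not given here. The function name `pagerank`, the toy citation graph, and the damping value of 0.85 are illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank on a dict {node: [outgoing neighbours]}."""
    nodes = sorted(set(links) | {v for targets in links.values() for v in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the teleportation share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this node's damped score evenly over its out-links.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its damped score uniformly.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
        rank = new_rank
    return rank


# Hypothetical toy citation graph: p1 and p2 cite p3, and p3 cites p4.
citations = {"p1": ["p3"], "p2": ["p3"], "p3": ["p4"], "p4": []}
scores = pagerank(citations)
```

A co-ranking framework in the spirit of the abstract would maintain one such score vector per node type (authors, papers, venues) and let each vector feed into the others at every iteration.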
Science in China Series F: Information Sciences | 2016
Lili Mei; Heyan Huang; Xiaochi Wei; Xian-Ling Mao
New words could benefit many NLP tasks such as sentence chunking and sentiment analysis. However, automatic new word extraction is a challenging task because new words usually have no fixed language pattern and may even appear as new meanings of existing words. To tackle these problems, this paper proposes a novel method to extract new words. It not only considers domain specificity but also combines multiple kinds of statistical language knowledge. First, we apply a filtering algorithm to obtain a candidate list of new words. Then, we employ the statistical language knowledge to extract the top-ranked new words. Experimental results show that our proposed method is able to extract a large number of new words from both Chinese and English corpora, and notably outperforms the state-of-the-art methods. Moreover, we demonstrate that our method increases the accuracy of Chinese word segmentation by 10% on a corpus containing new words.
Highlights: 1. This paper proposes a new word extraction method based on domain specificity and statistical language knowledge. First, a garbage-string filtering method based on domain specificity removes garbage strings and yields a list of candidate new words; then new words are extracted using statistical language knowledge (word frequency, cohesion, and degree of freedom). Experiments verify the effectiveness, language independence, and domain independence of the method. 2. The method effectively improves the segmentation performance of Chinese word segmentation systems.
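The three statistics named in the highlights (word frequency, cohesion, and degree of freedom) can be illustrated with a minimal sketch. The exact formulas used in the paper are not stated here, so this sketch assumes commonly used definitions: cohesion as the weakest ratio p(w) / (p(left) p(right)) over all binary splits of the candidate, and freedom as the smaller of the left and right neighbour-character entropies. The function name `candidate_stats` and the toy text are hypothetical.

```python
import math
from collections import Counter


def candidate_stats(text, candidate):
    """Return (frequency, cohesion, freedom) of a candidate string in text."""
    n, k = len(text), len(candidate)
    if k < 2:
        raise ValueError("candidate must have at least two characters")
    positions = [i for i in range(n - k + 1) if text[i:i + k] == candidate]
    freq = len(positions)
    if freq == 0:
        return 0, 0.0, 0.0

    def prob(s):
        # Relative frequency of substring s, normalised by text length.
        count = sum(1 for i in range(n - len(s) + 1) if text[i:i + len(s)] == s)
        return count / n

    # Cohesion (assumed definition): weakest association over all binary splits.
    cohesion = min(
        prob(candidate) / (prob(candidate[:j]) * prob(candidate[j:]))
        for j in range(1, k)
    )

    def entropy(counter):
        total = sum(counter.values())
        if total == 0:
            return 0.0
        return -sum(c / total * math.log(c / total) for c in counter.values())

    # Freedom (assumed definition): diversity of neighbouring characters.
    left = Counter(text[i - 1] for i in positions if i > 0)
    right = Counter(text[i + k] for i in positions if i + k < n)
    freedom = min(entropy(left), entropy(right))
    return freq, cohesion, freedom


# Hypothetical toy text: "zorp" appears three times with varied neighbours.
freq, cohesion, freedom = candidate_stats("azorpb czorpd ezorpf", "zorp")
```

Under these assumed definitions, a candidate that is frequent, whose parts rarely occur apart, and that appears in varied contexts scores high on all three measures.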
Multimedia Tools and Applications | 2018
Yi-Kun Tang; Xian-Ling Mao; Heyan Huang; Xuewen Shi; Guihua Wen
Recently, topic modeling has been widely used to discover abstract topics in the multimedia field. Most existing topic models are based on the assumption of a three-layer hierarchical Bayesian structure, i.e. each document is modeled as a probability distribution over topics, and each topic is a probability distribution over words. However, this assumption is not optimal. Intuitively, it is more reasonable to assume that each topic is a probability distribution over concepts, and that each concept is a probability distribution over words, i.e. to add a latent concept layer between the topic layer and the word layer of the traditional three-layer assumption. In this paper, we verify the proposed assumption by incorporating it into two representative topic models, obtaining two novel topic models. Extensive experiments were conducted on the proposed models and the corresponding baselines, and the results show that the proposed models significantly outperform the baselines in terms of case study and perplexity, which means the new assumption is more reasonable than the traditional one.
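The four-layer assumption described above (document, topic, concept, word) can be sketched as a generative sampling procedure. This is only an illustration of the layering, not the models proposed in the paper: the distributions below are hypothetical toy values, the names `sample` and `generate_document` are made up for the example, and no inference procedure is shown.

```python
import random


def sample(dist, rng):
    """Draw one outcome from a {outcome: probability} dict."""
    outcomes = list(dist)
    return rng.choices(outcomes, weights=[dist[o] for o in outcomes], k=1)[0]


def generate_document(doc_topics, topic_concepts, concept_words, length, seed=0):
    """Generate words by sampling topic, then concept, then word."""
    rng = random.Random(seed)
    words = []
    for _ in range(length):
        topic = sample(doc_topics, rng)               # document -> topic
        concept = sample(topic_concepts[topic], rng)  # topic -> concept (added layer)
        words.append(sample(concept_words[concept], rng))  # concept -> word
    return words


# Hypothetical toy distributions for a single document.
doc_topics = {"sports": 0.7, "music": 0.3}
topic_concepts = {
    "sports": {"team": 0.6, "score": 0.4},
    "music": {"instrument": 1.0},
}
concept_words = {
    "team": {"club": 0.5, "squad": 0.5},
    "score": {"goal": 0.5, "point": 0.5},
    "instrument": {"guitar": 0.5, "piano": 0.5},
}
words = generate_document(doc_topics, topic_concepts, concept_words, length=8)
```

Dropping the concept step and letting each topic emit words directly recovers the traditional three-layer structure the abstract contrasts against.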
Chinese National Conference on Social Media Processing | 2015
Lili Mei; Heyan Huang; Xiaochi Wei; Peng Yuan; Xian-Ling Mao
New network words could benefit many NLP tasks such as Chinese word segmentation and sentiment analysis. However, automatic extraction of new network words is a challenging task because such words usually have no fixed language pattern and may even appear as new meanings of existing words. To tackle these problems, this paper proposes a novel approach, FCL, to extract new network words. It not only considers domain specificity but also combines multiple kinds of statistical language knowledge. First, we apply a filtering algorithm to obtain a list of candidate new words. Then, we employ the statistical language knowledge to extract the top-ranked new network words. Experimental results show that our proposed approach is able to extract a large number of new network words and notably outperforms the state-of-the-art methods. Moreover, we demonstrate that our approach increases the accuracy of word segmentation by 10% on a corpus containing new words.
Neurocomputing | 2018
Xian-Ling Mao; Yi-Jing Hao; Dan Wang; Heyan Huang
Query completion has long proved useful in helping users explore and express their information needs. In general search, such completions can be generated from a large-scale query log and other accessory information. However, how to generate query completions for community-based Question Answering (cQA) search without a query log remains a challenging problem. In this work, we propose a novel query completion algorithm for cQA search based on ranking cQA questions with entity and phrase information, and we have developed a demonstration system. Without relying on a query log, this method clearly helps users complete their queries. Empirical experiments on a large-scale cQA dataset show that the proposed algorithm can successfully improve user experience.
IEEE Transactions on Knowledge and Data Engineering | 2017
Xiaochi Wei; Heyan Huang; Liqiang Nie; Hanwang Zhang; Xian-Ling Mao; Tat-Seng Chua
Sentence auto-completion is an important feature that saves users many keystrokes by providing suggestions as they type. Despite its value, existing sentence auto-completion methods, such as query completion models, can hardly be applied to the object completion problem in sentences of the form (subject, verb, object), due to the complexity of natural language description and the data deficiency problem. To address this, we treat an SVO sentence as a three-element triple (subject, sentence pattern, object), and cast the sentence object completion problem as an element inference problem. The elements in all triples are encoded into a unified low-dimensional embedding space by our proposed TRANSFER model, which leverages an external knowledge base to strengthen representation learning. With such representations, we can provide reliable candidates for the missing element using a linear model. Extensive experiments on a real-world dataset validate our model. We have also successfully applied the proposed model to answer candidate selection in factoid question answering systems, which further demonstrates the applicability of the TRANSFER model.
Chinese National Conference on Social Media Processing | 2017
Yi-Kun Tang; Xian-Ling Mao; Yi-Jing Hao; Cheng Xu; Heyan Huang
In the past ten years, powerful new algorithms based on efficient data structures have been proposed to solve the problem of Approximate Nearest Neighbors (ANN) search. To find the nearest neighbors in probability-distribution-type data, the existing Locality Sensitive Hashing (LSH) algorithms for vector-type data can be used directly. However, these methods do not consider the special properties of probability distributions. In this paper, based on those special properties, we present a novel LSH scheme adapted to angular distance for ANN search in high-dimensional probability distributions. We define the specific hashing functions and prove their locality sensitivity. We also propose a Sequential Interleaving algorithm based on the “Unbalance Effect” of Euclidean and angular metrics for probability distributions. Finally, we experimentally compare our methods with the state-of-the-art LSH algorithms for ANN search on six public image databases. The results show that the proposed algorithms provide far better accuracy in ANN search than the baselines.
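For context on LSH under angular distance, here is a minimal sketch of the classical random-hyperplane (sign of random projection) scheme for generic vectors. It is not the scheme proposed in the paper, whose hashing functions are not given here. The helper names `make_hyperplanes`, `signature`, and `hamming`, as well as the toy distribution, are illustrative assumptions.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplane normals of dimension dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(h * x for h, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(a, b):
    """Number of differing bits between two signatures."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical toy probability distribution over four outcomes.
planes = make_hyperplanes(dim=4, n_bits=16)
p = [0.5, 0.25, 0.125, 0.125]
p_scaled = [2.0 * x for x in p]  # same direction, different length
sig_p = signature(p, planes)
sig_scaled = signature(p_scaled, planes)
```

Because each bit depends only on the sign of a projection, the signature ignores vector length and reflects direction only; for two vectors, the expected fraction of differing bits equals the angle between them divided by pi.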
Chinese National Conference on Social Media Processing | 2017
Dan Wang; Heyan Huang; Hua-Kang Lin; Xian-Ling Mao
Approximate Nearest Neighbors (ANN) search has attracted much attention in recent years. Hashing is a promising approach to ANN search and has been widely used in large-scale image retrieval tasks. However, most existing hashing methods are designed for single-labeled data. On multi-labeled data, those methods treat two images as similar if they share at least one common label, which cannot preserve the order relations in multi-labeled data. Meanwhile, most hashing methods are based on hand-crafted features, which are costly. To solve these two problems, we propose a novel supervised hashing method to learn hash codes for multi-labeled data. In particular, we first extract order-preserving data features through a deep convolutional neural network. The order-preserving features are then used to learn hash codes. Extensive experiments on two real-world public datasets show that the proposed method outperforms state-of-the-art baselines in image retrieval tasks.
international conference on artificial intelligence | 2015
Xiaochi Wei; Heyan Huang; Chin-Yew Lin; Xin Xin; Xian-Ling Mao; Shangguang Wang
national conference on artificial intelligence | 2018
Dan Wang; Heyan Huang; Chi Lu; Bo-Si Feng; Guihua Wen; Liqiang Nie; Xian-Ling Mao