Xiaojiang Huang
Peking University
Publication
Featured research published by Xiaojiang Huang.
Empirical Methods in Natural Language Processing | 2015
Gang Luo; Xiaojiang Huang; Chin-Yew Lin; Zaiqing Nie
Extracting named entities in text and linking extracted names to a given knowledge base are fundamental tasks in applications for text understanding. Existing systems typically run a named entity recognition (NER) model to extract entity names first, then run an entity linking model to link extracted names to a knowledge base. NER and linking models are usually trained separately, and the mutual dependency between the two tasks is ignored. We propose JERL, Joint Entity Recognition and Linking, to jointly model NER and linking tasks and capture the mutual dependency between them. It allows the information from each task to improve the performance of the other. To the best of our knowledge, JERL is the first model to optimize the NER and linking tasks fully jointly. In experiments on the CoNLL’03/AIDA data set, JERL outperforms state-of-the-art NER and linking systems, with improvements of 0.4% absolute F1 for NER on CoNLL’03 and 0.36% absolute precision@1 for linking on AIDA.
Meeting of the Association for Computational Linguistics | 2014
Hongzhao Huang; Yunbo Cao; Xiaojiang Huang; Heng Ji; Chin-Yew Lin
Wikification for tweets aims to automatically identify each concept mention in a tweet and link it to a concept referent in a knowledge base (e.g., Wikipedia). Due to the shortness of a tweet, a collective inference model incorporating global evidence from multiple mentions and concepts is more appropriate than a non-collective approach that links each mention one at a time. In addition, it is challenging to generate sufficient high-quality labeled data for supervised models at low cost. To tackle these challenges, we propose a novel semi-supervised graph regularization model to incorporate both local and global evidence from multiple tweets through three fine-grained relations. In order to identify semantically related mentions for collective inference, we detect meta path-based semantic relations through social networks. Compared to the state-of-the-art supervised model trained from 100% labeled data, our proposed approach achieves comparable performance with 31% labeled data and obtains 5% absolute F1 gain with 50% labeled data.
Knowledge and Information Systems | 2014
Xiaojiang Huang; Xiaojun Wan; Jianguo Xiao
Comparative news summarization aims to highlight the commonalities and differences between two comparable news topics by using human-readable sentences. The summary ought to focus on the salient comparative aspects of both topics, and at the same time, it should describe the representative properties of each topic appropriately. In this study, we propose a novel approach for generating comparative news summaries. We consider cross-topic pairs of semantically related concepts as evidence of comparativeness and consider topic-related concepts as evidence of representativeness. The score of a summary is estimated by summing up the weights of the evidence in the summary. We formalize the summarization task as an optimization problem of selecting proper sentences to maximize this score and address the problem by using a mixed integer programming model. The experimental results demonstrate the effectiveness of our proposed model.
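The scoring idea in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract describes a mixed integer programming model, while the code below swaps in an exhaustive search over sentence subsets, which is only practical for a handful of sentences. The topic labels, concepts, weights, sentence lengths, and the length budget are all hypothetical values chosen for illustration, and the rule that a cross-topic pair counts only when each concept appears in a selected sentence of its own topic is an assumption.

```python
from itertools import combinations


def summary_score(selected, sentences, rep_weights, comp_weights):
    """Sum the weights of the evidence covered by the selected sentences."""
    concepts_a, concepts_b = set(), set()
    for i in selected:
        topic, concepts, _length = sentences[i]
        (concepts_a if topic == "A" else concepts_b).update(concepts)
    covered = concepts_a | concepts_b
    # Representativeness: each topic-related concept counts once if covered.
    rep = sum(w for c, w in rep_weights.items() if c in covered)
    # Comparativeness (assumed rule): a cross-topic pair counts once if each
    # side of the pair is covered by a sentence from its own topic.
    comp = sum(
        w for (ca, cb), w in comp_weights.items()
        if ca in concepts_a and cb in concepts_b
    )
    return rep + comp


def best_summary(sentences, rep_weights, comp_weights, max_length):
    """Exhaustively pick the sentence subset with the highest score
    whose total length stays within max_length (stand-in for the MIP)."""
    best, best_score = (), 0.0
    n = len(sentences)
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            if sum(sentences[i][2] for i in subset) > max_length:
                continue
            score = summary_score(subset, sentences, rep_weights, comp_weights)
            if score > best_score:
                best, best_score = subset, score
    return best, best_score


# Hypothetical toy data: (topic, concepts in the sentence, length in words).
sentences = [
    ("A", {"gold", "beijing"}, 10),
    ("A", {"weather"}, 8),
    ("B", {"medal", "london"}, 10),
    ("B", {"traffic"}, 8),
]
rep_weights = {"beijing": 1.0, "london": 1.0, "weather": 0.5, "traffic": 0.5}
comp_weights = {("gold", "medal"): 2.0, ("weather", "traffic"): 1.0}

best, score = best_summary(sentences, rep_weights, comp_weights, max_length=20)
print(best, score)
```

With this toy data, the search selects the first sentence of each topic, because together they cover the highest-weighted cross-topic pair as well as one representative concept per topic.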
Web Information Systems Engineering | 2012
Xiaojiang Huang; Xiaojun Wan; Jianguo Xiao
In this paper, we propose a novel comparative web search system, BiCWS, which can mine cognitive differences from web search results in a multi-language setting. Given a topic represented by two queries in two languages (each a translation of the other), the corresponding web search results for the two queries are first retrieved by using a general web search engine, and then the bilingual facets for the topic are mined by using a bilingual search results clustering algorithm. The semantics in Wikipedia are leveraged to improve the bilingual clustering performance. After that, the semantic distributions of the search results over the mined facets are visually presented, which can reflect the cognitive differences in the bilingual communities. Experimental results show the effectiveness of our proposed system.
Web Information Systems Engineering | 2012
Xiaojiang Huang; Xiaojun Wan; Jianguo Xiao
Comparison is a popular way for people to discover the commonalities and differences between two entities (e.g., products, people, companies, or events). It would be very useful to automatically provide comparison results for the user. The prerequisite step of this task is to find comparable entities. In this paper, we propose a novel Web mining system to address the task of finding comparable entities for a given single entity. First, the system uses a bootstrapping method to find candidate entities for the given entity through natural language analysis in the snippets of search engine results. Then, the system uses set expansion techniques to find more candidate entities through semi-structured HTML analysis in the downloaded web pages. Finally, the system uses a supervised learning method to classify the candidate entities as either comparable or incomparable by incorporating linguistic, statistical, and semantic features. Experimental results demonstrate that our proposed framework can outperform the baseline systems.
Archive | 2015
Gang Luo; Xiaojiang Huang; Chin-Yew Lin; Zaiqing Nie
Meeting of the Association for Computational Linguistics | 2011
Xiaojiang Huang; Xiaojun Wan; Jianguo Xiao
Pacific Rim International Conference on Artificial Intelligence | 2008
Xiaojiang Huang; Xiaojun Wan; Jianwu Yang; Jianguo Xiao
Theory and Applications of Categories | 2010
Houping Jia; Xiaojiang Huang; Tengfei Ma; Xiaojun Wan; Jianguo Xiao
International Joint Conference on Natural Language Processing | 2011
Xiaojun Wan; Liang Zong; Xiaojiang Huang; Tengfei Ma; Houping Jia; Yuqian Wu; Jianguo Xiao