Wanxiang Che | Researchain

Publication

Featured research published by Wanxiang Che.

meeting of the association for computational linguistics | 2014

Learning Semantic Hierarchies via Word Embeddings

Ruiji Fu; Jiang Guo; Bing Qin; Wanxiang Che; Haifeng Wang; Ting Liu

Semantic hierarchy construction aims to build structures of concepts linked by hypernym‐hyponym (“is-a”) relations. A major challenge for this task is the automatic discovery of such relations. This paper proposes a novel and effective method for the construction of semantic hierarchies based on word embeddings, which can be used to measure the semantic relationship between words. We identify whether a candidate word pair has hypernym‐hyponym relation by using the word-embedding-based semantic projections between words and their hypernyms. Our result, an F-score of 73.74%, outperforms the state-of-the-art methods on a manually labeled test dataset. Moreover, combining our method with a previous manually-built hierarchy extension method can further improve F-score to 80.29%.
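
One way to read the "semantic projections" idea above is as a linear map that sends a word's embedding close to the embedding of its hypernym. The sketch below is an illustration only, assuming a single global projection matrix fitted by plain gradient descent on a squared-error objective and a fixed distance threshold; the abstract does not state the training objective or the decision rule, and the names `fit_projection`, `is_hypernym_pair`, and the toy vectors are all hypothetical.

```python
import math

def project(phi, x):
    """Apply the projection matrix phi to the embedding x."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]

def fit_projection(pairs, dim, lr=0.1, epochs=500):
    """Fit a dim x dim matrix phi that minimizes the mean squared
    distance between phi @ x and y over (x, y) training pairs.
    Assumption: one global matrix, full-batch gradient descent."""
    phi = [[0.0] * dim for _ in range(dim)]
    n = len(pairs)
    for _ in range(epochs):
        grad = [[0.0] * dim for _ in range(dim)]
        for x, y in pairs:
            pred = project(phi, x)
            for i in range(dim):
                err = pred[i] - y[i]
                for j in range(dim):
                    grad[i][j] += 2.0 * err * x[j] / n
        for i in range(dim):
            for j in range(dim):
                phi[i][j] -= lr * grad[i][j]
    return phi

def is_hypernym_pair(phi, x, y, threshold):
    """Accept (x, y) as a hyponym-hypernym pair when the projected
    embedding of x lands within `threshold` of the embedding of y."""
    return math.dist(project(phi, x), y) < threshold

# Toy 2-dimensional "embeddings" (made up for illustration): each pair is
# (hyponym vector, hypernym vector), generated by the matrix [[1, 1], [0, 1]].
training_pairs = [
    ([1.0, 0.0], [1.0, 0.0]),
    ([0.0, 1.0], [1.0, 1.0]),
    ([1.0, 1.0], [2.0, 1.0]),
    ([1.0, -1.0], [0.0, -1.0]),
]
phi = fit_projection(training_pairs, dim=2)
```

With real embeddings the training pairs would come from a labeled list of hyponym-hypernym word pairs, and the threshold would be tuned on held-out data.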

empirical methods in natural language processing | 2014

Revisiting Embedding Features for Simple Semi-supervised Learning

Jiang Guo; Wanxiang Che; Haifeng Wang; Ting Liu

Recent work has shown success in using continuous word embeddings learned from unlabeled data as features to improve supervised NLP systems, which is regarded as a simple semi-supervised learning mechanism. However, fundamental problems on effectively incorporating the word embedding features within the framework of linear models remain. In this study, we investigate and analyze three different approaches, including a new proposed distributional prototype approach, for utilizing the embedding features. The presented approaches can be integrated into most of the classical linear models in NLP. Experiments on the task of named entity recognition show that each of the proposed approaches can better utilize the word embedding features, among which the distributional prototype approach performs the best. Moreover, the combination of the approaches provides additive improvements, outperforming the dense and continuous embedding features by nearly 2 points of F1 score.

conference on computational natural language learning | 2009

Multilingual Dependency-based Syntactic and Semantic Parsing

Wanxiang Che; Zhenghua Li; Yongqiang Li; Yuhang Guo; Bing Qin; Ting Liu

Our CoNLL 2009 Shared Task system includes three cascaded components: syntactic parsing, predicate classification, and semantic role labeling. A pseudo-projective high-order graph-based model is used in our syntactic dependency parser. A support vector machine (SVM) model is used to classify predicate senses. Semantic role labeling is achieved using maximum entropy (MaxEnt) model based semantic role classification and integer linear programming (ILP) based post inference. Our system won first place in the joint task, in both the closed and open challenges.

international joint conference on natural language processing | 2015

Cross-lingual Dependency Parsing Based on Distributed Representations

Jiang Guo; Wanxiang Che; David Yarowsky; Haifeng Wang; Ting Liu

This paper investigates the problem of cross-lingual dependency parsing, aiming at inducing dependency parsers for low-resource languages while using only training data from a resource-rich language (e.g. English). Existing approaches typically don’t include lexical features, which are not transferable across languages. In this paper, we bridge the lexical feature gap by using distributed feature representations and their composition. We provide two algorithms for inducing cross-lingual distributed representations of words, which map vocabularies from two different languages into a common vector space. Consequently, both lexical features and non-lexical features can be used in our model for cross-lingual transfer. Furthermore, our framework is able to incorporate additional useful features such as cross-lingual word clusters. Our combined contributions achieve an average relative error reduction of 10.9% in labeled attachment score as compared with the delexicalized parser, trained on English universal treebank and transferred to three other languages. It also significantly outperforms McDonald et al. (2013) augmented with projected cluster features on identical data.
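
The "relative error reduction" reported above is usually computed against the baseline's remaining error rather than its raw score. A minimal sketch of that arithmetic follows, assuming scores are percentages out of 100; the function name and the example numbers are hypothetical, and the abstract does not say how the per-language values were averaged.

```python
def relative_error_reduction(baseline_score, new_score, max_score=100.0):
    """Fraction of the baseline's error that the new system removes.

    Error is taken as (max_score - score), so a baseline at 80.0 has
    20.0 points of error left to reduce.
    """
    baseline_error = max_score - baseline_score
    if baseline_error <= 0:
        raise ValueError("baseline has no error left to reduce")
    new_error = max_score - new_score
    return (baseline_error - new_error) / baseline_error

# Made-up scores, not taken from the paper: moving from 80.0 to 82.0
# removes 2.0 of the 20.0 error points.
example = relative_error_reduction(80.0, 82.0)
```

Under this definition a small absolute gain can correspond to a sizeable relative reduction when the baseline is already strong.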

meeting of the association for computational linguistics | 2014

Character-Level Chinese Dependency Parsing

Meishan Zhang; Yue Zhang; Wanxiang Che; Ting Liu

Recent work on Chinese analysis has led to large-scale annotations of the internal structures of words, enabling character-level analysis of Chinese syntactic structures. In this paper, we investigate the problem of character-level Chinese dependency parsing, building dependency trees over characters. Character-level information can benefit downstream applications by offering flexible granularities for word segmentation while improving word-level dependency parsing accuracies. We present novel adaptations of two major shift-reduce dependency parsing algorithms to character-level parsing. Experimental results on the Chinese Treebank demonstrate improved performances over word-based parsing methods.

IEEE Transactions on Audio, Speech, and Language Processing | 2008

Semantic Role Labeling Using a Grammar-Driven Convolution Tree Kernel

Min Zhang; Wanxiang Che; Guodong Zhou; Aiti Aw; Chew Lim Tan; Ting Liu; Sheng Li

Convolution tree kernel has shown promising results in semantic role labeling (SRL). However, this kernel does not consider much linguistic knowledge in kernel design and only performs hard matching between subtrees. To overcome these constraints, this paper proposes a grammar-driven convolution tree kernel for SRL by introducing more linguistic knowledge. Compared with the standard convolution tree kernel, the proposed grammar-driven kernel has two advantages: 1) grammar-driven approximate substructure matching, and 2) grammar-driven approximate tree node matching. The two approximate matching mechanisms enable the proposed kernel to better explore linguistically motivated structured knowledge. Experiments on the CoNLL-2005 SRL shared task and the PropBank I corpus show that the proposed kernel outperforms the standard convolution tree kernel significantly. Moreover, we present a composite kernel to integrate a feature-based polynomial kernel and the proposed grammar-driven convolution tree kernel for SRL. Experimental results show that our composite kernel-based method significantly outperforms the previously best-reported ones.

ACM Transactions on Asian Language Information Processing | 2008

Using a Hybrid Convolution Tree Kernel for Semantic Role Labeling

Wanxiang Che; Min Zhang; Aiti Aw; Chew Lim Tan; Ting Liu; Sheng Li

As a kind of Shallow Semantic Parsing, Semantic Role Labeling (SRL) is gaining more attention as it benefits a wide range of natural language processing applications. Given a sentence, the task of SRL is to recognize semantic arguments (roles) for each predicate (target verb or noun). Feature-based methods have achieved much success in SRL and are regarded as the state-of-the-art methods for SRL. However, these methods are less effective in modeling structured features. As an extension of feature-based methods, kernel-based methods are able to capture structured features more efficiently in a much higher dimension. Application of kernel methods to SRL has been achieved by selecting the tree portion of a predicate and one of its arguments as feature space, which is called the predicate-argument feature (PAF) kernel. The PAF kernel captures the syntactic tree structure features using a convolution tree kernel; however, it does not distinguish between the path structure and the constituent structure. In this article, a hybrid convolution tree kernel is proposed to model different linguistic objects. The hybrid convolution tree kernel consists of two individual convolution tree kernels. They are a Path kernel, which captures predicate-argument link features, and a Constituent Structure kernel, which captures the syntactic structure features of arguments. Evaluations on the data sets of the CoNLL-2005 SRL shared task and the Chinese PropBank (CPB) show that our proposed hybrid convolution tree kernel statistically significantly outperforms the previous tree kernels. Moreover, in order to maximize the system performance, we present a composite kernel by combining our hybrid convolution tree kernel method with a feature-based method extended by the polynomial kernel. The experimental results show that the composite kernel achieves better performance than each of the individual methods and outperforms the best reported system on the CoNLL-2005 corpus when only one syntactic parser is used and on the CPB corpus when automated syntactic parse results and correct syntactic parse results are used, respectively.

conference on computational natural language learning | 2008

A Cascaded Syntactic and Semantic Dependency Parsing System

Wanxiang Che; Zhenghua Li; Yuxuan Hu; Yongqiang Li; Bing Qin; Ting Liu; Sheng Li

We describe our CoNLL 2008 Shared Task system in this paper. The system includes two cascaded components: a syntactic and a semantic dependency parser. A first-order projective MSTParser is used as our syntactic dependency parser. To overcome the MSTParser's inability to model more global information, we add a relabeling stage after parsing to distinguish some confusable labels, such as ADV, TMP, and LOC. Besides adding predicate identification and classification stages, our semantic dependency parsing simplifies the traditional four-stage semantic role labeling into two stages: a maximum entropy based argument classification and an ILP-based post inference. Finally, we obtain an overall labeled macro F1 of 82.66, which ranked second in the closed challenge.

empirical methods in natural language processing | 2014

Domain Adaptation for CRF-based Chinese Word Segmentation using Free Annotations

Yijia Liu; Yue Zhang; Wanxiang Che; Ting Liu; Fan Wu

Supervised methods have been the dominant approach for Chinese word segmentation. The performance can drop significantly when the test domain is different from the training domain. In this paper, we study the problem of obtaining partial annotation from freely available data to help Chinese word segmentation on different domains. Different sources of free annotations are transformed into a unified form of partial annotation and a variant CRF model is used to leverage both fully and partially annotated data consistently. Experimental results show that the Chinese word segmentation model benefits from free partially annotated data. On the SIGHAN Bakeoff 2010 data, we achieve results that are competitive with the best reported in the literature.

International Journal of Computer Processing of Languages | 2011

Appraisal Expression Recognition with Syntactic Path for Sentence Sentiment Classification

Yanyan Zhao; Bing Qin; Wanxiang Che; Ting Liu

An appraisal expression is described as a collocation of the polarity word and its modified target, which can be considered as an atomic unit expressing an evaluative stance towards a target. Recognizing appraisal expressions is essential for sentence sentiment classification. However, relevant research remains limited. This paper proposes a novel method that uses syntactic paths to recognize appraisal expressions. Compared with previous work, the proposed syntactic path based method has two advantages: 1) it automatically explores syntactic knowledge, and 2) it covers more syntactic relationships between polarity words and targets. Based on these, this paper applies appraisal expressions to sentence sentiment classification. Some novel features based on appraisal expressions, including semantic features, syntactic features, lexical features and polarity features, are designed to classify sentiment sentences as positive or negative. Experimental results on the camera and MP3 player domains show that the proposed appraisal expression based method outperforms other sentence sentiment classification methods. Moreover, we present a composite classifier to integrate our appraisal expression based feature set and the commonly used feature set. Experimental results show that our composite classifier performs better than either feature set alone.
