Chuhan Wu
Tsinghua University
Publication
Featured research published by Chuhan Wu.
Knowledge Based Systems | 2018
Chuhan Wu; Fangzhao Wu; Sixing Wu; Zhigang Yuan; Yongfeng Huang
Aspect term extraction (ATE) and opinion target extraction (OTE) are two important tasks in fine-grained sentiment analysis. Existing approaches to ATE and OTE are mainly based on rules or machine learning methods. Rule-based methods are usually unsupervised, but they cannot make use of high-level features. Although supervised learning approaches usually outperform rule-based ones, they need a large number of labeled samples to train their models, and these samples are expensive and time-consuming to annotate. In this paper, we propose a hybrid unsupervised method that combines rules and machine learning methods to address the ATE and OTE tasks. First, we use chunk-level linguistic rules to extract nominal phrase chunks and regard them as candidate opinion targets and aspects. Then we filter irrelevant candidates based on domain correlation. Finally, we use the texts with extracted chunks as pseudo-labeled data to train a deep gated recurrent unit (GRU) network for aspect term extraction and opinion target extraction. Experiments on benchmark datasets validate the effectiveness of our approach in extracting opinion targets and aspects with minimal manual annotation.
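The step of turning extracted chunks into pseudo-labeled training data can be illustrated with a minimal sketch. The abstract does not state the tagging scheme or the data layout, so the BIO tags, the `bio_tags` helper, and the span format (start inclusive, end exclusive) are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Tuple


def bio_tags(tokens: List[str], chunks: List[Tuple[int, int]]) -> List[str]:
    """Convert candidate chunk spans into BIO tags (hypothetical helper).

    Each span is (start, end) with start inclusive and end exclusive.
    The BIO scheme is an assumption; the abstract does not name one.
    """
    tags = ["O"] * len(tokens)
    for start, end in chunks:
        if start >= end:
            continue  # skip empty or malformed spans
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags


# A toy review sentence with one candidate chunk, "battery life".
tokens = ["The", "battery", "life", "is", "great"]
chunks = [(1, 3)]
pseudo_labels = bio_tags(tokens, chunks)
```

Token and tag sequences of this kind could then be fed to any sequence labeling model, such as the GRU network the abstract mentions.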
Expert Systems With Applications | 2019
Sixing Wu; Fangzhao Wu; Yue Chang; Chuhan Wu; Yongfeng Huang
Sentiment lexicons play an important role in sentiment analysis systems. In most existing sentiment lexicons, each sentiment word or phrase is given a sentiment label or score. However, a sentiment word may express different sentiment orientations when describing different targets. It is beneficial but challenging to incorporate knowledge of opinion targets into a sentiment lexicon. In this paper, we propose an automatic approach to construct a target-specific sentiment lexicon, in which each term is an opinion pair consisting of an opinion target and an opinion word. The approach solves two principal problems in the construction process, namely opinion target extraction and opinion pair sentiment classification. An unsupervised algorithm is proposed to extract high-quality opinion pairs. Both semantic and syntactic features are incorporated in the algorithm to extract opinion pairs containing correct opinion targets. A group of opinion pairs is generated, and a framework is proposed to classify their sentiment polarities. Knowledge from available resources, including a general-purpose sentiment lexicon and a thesaurus, and context knowledge, including syntactic relations and sentiment information in sentences, are extracted and integrated in a unified framework to calculate the sentiment scores of opinion pairs. Experimental results on product review datasets in different domains demonstrate the effectiveness of our method in target-specific sentiment lexicon construction, which can improve the performance of opinion target extraction and opinion pair sentiment classification. In addition, our lexicon also achieves better performance in target-level sentiment classification than several general-purpose sentiment lexicons.
international conference natural language processing | 2018
Yubo Chen; Hongtao Liu; Chuhan Wu; Zhigang Yuan; Minyu Jiang; Yongfeng Huang
Distantly supervised relation extraction is an efficient method for finding novel relational facts in very large corpora without expensive manual annotation. However, distant supervision inevitably leads to the wrong-label problem, and these noisy labels substantially hurt the performance of relation extraction. Existing methods usually use multi-instance learning and selective attention to reduce the influence of noise. However, they usually cannot fully utilize the supervision information or eliminate the effect of noise. In this paper, we propose a method called Neural Instance Selector (NIS) to solve these problems. Our approach contains three modules: a sentence encoder to encode input texts into hidden vector representations, an NIS module to filter the less informative sentences via multilayer perceptrons and logistic classification, and a selective attention module to select the important sentences. Experimental results show that our method can effectively filter noisy data and achieve better performance than several baseline methods.
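The filter-then-attend idea described above can be sketched on a single bag of sentences. This is a simplified illustration, not the NIS model itself: the sentence vectors, filter logits, and attention scores are supplied as plain numbers instead of being produced by trained networks, and the `select_and_attend` name, the 0.5 threshold, and the fallback to the full bag are assumptions.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def select_and_attend(
    vectors: List[List[float]],
    filter_logits: List[float],
    attn_scores: List[float],
    threshold: float = 0.5,
) -> List[float]:
    """Filter sentences by a logistic score, then softmax-weight the rest.

    All inputs are hypothetical stand-ins for trained model outputs.
    """
    # Keep sentences whose logistic score passes the (assumed) threshold.
    kept = [i for i, z in enumerate(filter_logits) if sigmoid(z) >= threshold]
    if not kept:
        kept = list(range(len(vectors)))  # assumed fallback: keep the whole bag

    # Numerically stable softmax over the attention scores of kept sentences.
    top = max(attn_scores[i] for i in kept)
    exps = [math.exp(attn_scores[i] - top) for i in kept]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the kept sentence vectors gives the bag representation.
    dim = len(vectors[0])
    return [sum(w * vectors[i][d] for w, i in zip(weights, kept)) for d in range(dim)]


# The third sentence has a high attention score but is removed by the filter.
bag = select_and_attend(
    vectors=[[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]],
    filter_logits=[2.0, 2.0, -3.0],
    attn_scores=[0.0, 0.0, 10.0],
)
```

In this toy bag the two surviving sentences have equal attention scores, so the bag representation is their average.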
international conference natural language processing | 2018
Junxin Liu; Fangzhao Wu; Chuhan Wu; Yongfeng Huang; Xing Xie
Chinese word segmentation (CWS) is an important task for Chinese NLP. Recently, many neural network based methods have been proposed for CWS. However, these methods require a large number of labeled sentences for model training and usually cannot utilize the useful information in Chinese dictionaries. In this paper, we propose two methods to exploit dictionary information for CWS. The first one is based on pseudo-labeled data generation, and the second one is based on multi-task learning. The experimental results on two benchmark datasets validate that our approach can effectively improve the performance of Chinese word segmentation, especially when training data is insufficient.
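One way to picture pseudo-labeled data generation from a dictionary is sketched below. The abstract does not describe the generation procedure, so everything here is an assumption for illustration: words are sampled at random from a toy dictionary, concatenated into a pseudo sentence, and labeled with BMES tags (Begin, Middle, End, Single). The helper names are hypothetical.

```python
import random
from typing import List, Tuple


def word_to_bmes(word: str) -> List[str]:
    """Label each character of a word with a BMES tag (assumed scheme)."""
    if len(word) == 1:
        return ["S"]
    return ["B"] + ["M"] * (len(word) - 2) + ["E"]


def make_pseudo_sentence(
    dictionary: List[str], n_words: int, rng: random.Random
) -> Tuple[str, List[str]]:
    """Sample words from the dictionary and build a labeled pseudo sentence."""
    words = [rng.choice(dictionary) for _ in range(n_words)]
    chars = "".join(words)
    tags = [tag for word in words for tag in word_to_bmes(word)]
    return chars, tags


# Toy dictionary; a real one would contain many thousands of entries.
toy_dictionary = ["自然语言", "处理", "的", "分词"]
sentence, tags = make_pseudo_sentence(toy_dictionary, n_words=5, rng=random.Random(0))
```

Each character of the pseudo sentence receives exactly one tag, so the output can be mixed with genuinely labeled sentences when training a segmentation model.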
conference on information and knowledge management | 2018
Fangzhao Wu; Chuhan Wu; Junxin Liu
Supervised learning methods are widely used in sentiment classification. However, when the sentiment distribution is imbalanced, the performance of these methods declines. In this paper, we propose an effective approach for imbalanced sentiment classification. In our approach, multiple balanced subsets are sampled from the imbalanced training data, and a multi-task learning based framework is proposed to learn a robust sentiment classifier from these subsets collaboratively. In addition, we incorporate prior knowledge of sentiment expressions extracted from both existing sentiment lexicons and massive unlabeled data into our approach to enhance the learning of the sentiment classifier in the imbalanced scenario. Experimental results on benchmark datasets validate the effectiveness of our approach in improving imbalanced sentiment classification.
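Sampling multiple balanced subsets from imbalanced data can be sketched as follows. The abstract does not give the sampling scheme, so this version assumes per-class sampling without replacement down to the size of the smallest class; the `balanced_subsets` name and its parameters are hypothetical, and the multi-task learning step is not shown.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def balanced_subsets(
    samples: Sequence[str],
    labels: Sequence[Hashable],
    n_subsets: int,
    seed: int = 0,
) -> List[List[Tuple[str, Hashable]]]:
    """Draw n_subsets class-balanced subsets from imbalanced data.

    Assumed scheme: every class is sampled without replacement down to
    the size of the smallest class, independently for each subset.
    """
    rng = random.Random(seed)
    by_label: Dict[Hashable, List[str]] = defaultdict(list)
    for text, label in zip(samples, labels):
        by_label[label].append(text)

    size = min(len(texts) for texts in by_label.values())
    subsets = []
    for _ in range(n_subsets):
        subset = []
        for label, texts in by_label.items():
            subset.extend((text, label) for text in rng.sample(texts, size))
        rng.shuffle(subset)
        subsets.append(subset)
    return subsets


# Toy data: six positive reviews and two negative ones.
texts = ["p1", "p2", "p3", "p4", "p5", "p6", "n1", "n2"]
labels = ["pos"] * 6 + ["neg"] * 2
subsets = balanced_subsets(texts, labels, n_subsets=3)
```

Each subset keeps both negative reviews and a different random pair of positive ones, so a classifier trained on any single subset sees a balanced label distribution.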
conference on information and knowledge management | 2018
Fangzhao Wu; Chuhan Wu; Junxin Liu
It is important to detect social spammers and spam messages on microblogging platforms. Existing methods usually handle the detection of social spammers and spam messages as two separate tasks using supervised learning techniques. However, labeled samples are usually scarce and manual annotation is expensive. In this paper, we propose a semi-supervised collaborative learning approach to jointly detect social spammers and spam messages on microblogging platforms. In our approach, the social spammer classifier and the spam message classifier are collaboratively trained by exploiting the inherent relatedness between these tasks. In addition, unlabeled samples are incorporated into model training with the help of the social contexts of users and messages. Experiments on a real-world dataset show that our approach can effectively improve the performance of both social spammer detection and spam message detection.
north american chapter of the association for computational linguistics | 2018
Chuhan Wu; Fangzhao Wu; Sixing Wu; Junxin Liu; Zhigang Yuan; Yongfeng Huang
north american chapter of the association for computational linguistics | 2018
Chuhan Wu; Fangzhao Wu; Sixing Wu; Zhigang Yuan; Yongfeng Huang
north american chapter of the association for computational linguistics | 2018
Chuhan Wu; Fangzhao Wu; Sixing Wu; Zhigang Yuan; Junxin Liu; Yongfeng Huang
north american chapter of the association for computational linguistics | 2018
Chuhan Wu; Fangzhao Wu; Junxin Liu; Zhigang Yuan; Sixing Wu; Yongfeng Huang