Duyu Tang
Harbin Institute of Technology
Publications
Featured research published by Duyu Tang.
Empirical Methods in Natural Language Processing | 2015
Duyu Tang; Bing Qin; Ting Liu
Document-level sentiment classification remains a challenge: encoding the intrinsic relations between sentences in the semantic meaning of a document. To address this, we introduce a neural network model that learns vector-based document representations in a unified, bottom-up fashion. The model first learns sentence representations with a convolutional neural network or long short-term memory. Afterwards, the semantics of sentences and their relations are adaptively encoded in the document representation with a gated recurrent neural network. We conduct document-level sentiment classification on four large-scale review datasets from IMDB and the Yelp Dataset Challenge. Experimental results show that: (1) our neural model outperforms several state-of-the-art algorithms; (2) the gated recurrent neural network dramatically outperforms a standard recurrent neural network in document modeling for sentiment classification.
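As a toy illustration (not the authors' implementation; dimensions and parameters here are random placeholders), the document composition step can be sketched as a gated recurrent unit run over pre-computed sentence vectors, with the final hidden state serving as the document vector:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # hidden/embedding size (toy value)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy gated recurrent unit parameters (randomly initialized for illustration)
Wz, Uz = rng.normal(size=(d, d)), rng.normal(size=(d, d))
Wr, Ur = rng.normal(size=(d, d)), rng.normal(size=(d, d))
Wh, Uh = rng.normal(size=(d, d)), rng.normal(size=(d, d))

def gru_step(h, x):
    z = sigmoid(Wz @ x + Uz @ h)            # update gate
    r = sigmoid(Wr @ x + Ur @ h)            # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))
    return (1 - z) * h + z * h_tilde

# Sentence vectors (random here; in the paper they come from a CNN or LSTM)
sentence_vecs = rng.normal(size=(3, d))
h = np.zeros(d)
for s in sentence_vecs:
    h = gru_step(h, s)
doc_vec = h  # document vector = final hidden state
print(doc_vec.shape)
```

In the paper, the sentence encoders and the gated document model are trained jointly for the sentiment objective rather than assembled from fixed parts as above.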
International Joint Conference on Natural Language Processing | 2015
Duyu Tang; Bing Qin; Ting Liu
Neural network methods have achieved promising results for sentiment classification of text. However, these models use only the semantics of texts, ignoring the users who express the sentiment and the products being evaluated, both of which strongly influence how the sentiment of a text is interpreted. In this paper, we address this issue by incorporating user- and product-level information into a neural network approach for document-level sentiment classification. Users and products are modeled using vector space models, whose representations capture important global clues such as individual preferences of users or overall qualities of products. Such global evidence in turn facilitates the embedding learning procedure at the document level, yielding better text representations. By combining evidence at the user, product, and document levels in a unified neural framework, the proposed model achieves state-of-the-art performance on IMDB and Yelp datasets.
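A minimal sketch of how user- and product-level vectors might be combined with a document vector for classification (the lookup tables, dimensions, and classifier here are hypothetical toy stand-ins, not the paper's architecture, in which all representations are learned jointly):

```python
import numpy as np

rng = np.random.default_rng(2)
d, n_classes = 4, 2

# Toy lookup tables for user and product vectors (learned jointly in the paper)
user_vecs = {"u1": rng.normal(size=d)}
product_vecs = {"p1": rng.normal(size=d)}

# Softmax classifier over the concatenated user/product/document evidence
W = rng.normal(size=(n_classes, 3 * d))
b = np.zeros(n_classes)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def predict(doc_vec, user, product):
    feats = np.concatenate([doc_vec, user_vecs[user], product_vecs[product]])
    return softmax(W @ feats + b)

probs = predict(rng.normal(size=d), "u1", "p1")
print(probs)  # distribution over sentiment classes
```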
International Conference on Computational Linguistics | 2014
Duyu Tang; Furu Wei; Bing Qin; Ting Liu; Ming Zhou
In this paper, we develop a deep learning system for message-level Twitter sentiment classification. Among the 45 submitted systems, including the SemEval 2013 participants, our system (Coooolll) is ranked 2nd on the Twitter2014 test set of SemEval 2014 Task 9. Coooolll is built in a supervised learning framework by concatenating sentiment-specific word embedding (SSWE) features with state-of-the-art hand-crafted features. We develop a neural network with a hybrid loss function to learn SSWE, which encodes the sentiment information of tweets in the continuous representation of words. To obtain large-scale training corpora, we train SSWE from 10M tweets collected with positive and negative emoticons, without any manual annotation. Our system can be easily re-implemented with the publicly available sentiment-specific word embeddings.
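The hybrid objective can be sketched as a weighted sum of two hinge-style ranking losses, one over contexts and one over sentiment (a simplification with assumed unit margins; in the actual model the scores come from a neural network over true and corrupted n-grams):

```python
def hybrid_loss(f_ctx_true, f_ctx_corrupt, f_sent_true, f_sent_corrupt, alpha=0.5):
    """Hinge-style combination of a context ranking loss and a sentiment
    ranking loss, in the spirit of SSWE's hybrid objective. Scores are
    plain floats here; in the model they are neural network outputs."""
    loss_context = max(0.0, 1.0 - f_ctx_true + f_ctx_corrupt)
    loss_sentiment = max(0.0, 1.0 - f_sent_true + f_sent_corrupt)
    return alpha * loss_context + (1.0 - alpha) * loss_sentiment

# Both margins satisfied -> zero loss
print(hybrid_loss(2.0, 0.5, 1.5, -0.2))  # 0.0
```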
Meeting of the Association for Computational Linguistics | 2014
Li Dong; Furu Wei; Chuanqi Tan; Duyu Tang; Ming Zhou; Ke Xu
We propose the Adaptive Recursive Neural Network (AdaRNN) for target-dependent Twitter sentiment classification. AdaRNN adaptively propagates the sentiments of words to the target depending on the context and syntactic relationships between them. It consists of multiple composition functions, and we model the adaptive sentiment propagation as distributions over these composition functions. Experimental studies show that AdaRNN improves over the baseline methods. Furthermore, we introduce a manually annotated dataset for target-dependent Twitter sentiment analysis.
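The idea of a distribution over composition functions can be sketched as follows (a toy numpy version with random parameters; AdaRNN additionally conditions the choice on syntactic context rather than on the child vectors alone):

```python
import numpy as np

rng = np.random.default_rng(3)
d, n_comp = 4, 3

# A pool of composition functions, each a linear map over [left; right] (toy)
G = rng.normal(size=(n_comp, d, 2 * d))
# Scores deciding which composition fits the current pair (toy)
S = rng.normal(size=(n_comp, 2 * d))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def adaptive_compose(left, right):
    pair = np.concatenate([left, right])
    weights = softmax(S @ pair)  # distribution over composition functions
    # Parent vector = softly mixed output of all composition functions
    return np.tanh(sum(w * (g @ pair) for w, g in zip(weights, G)))

parent = adaptive_compose(rng.normal(size=d), rng.normal(size=d))
print(parent.shape)
```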
Empirical Methods in Natural Language Processing | 2016
Duyu Tang; Bing Qin; Ting Liu
We introduce a deep memory network for aspect-level sentiment classification. Unlike feature-based SVMs and sequential neural models such as LSTM, this approach explicitly captures the importance of each context word when inferring the sentiment polarity of an aspect. The importance degrees and the text representation are calculated with multiple computational layers, each of which is a neural attention model over an external memory. Experiments on laptop and restaurant datasets demonstrate that our approach performs comparably to a state-of-the-art feature-based SVM system, and substantially better than LSTM and attention-based LSTM architectures. On both datasets we show that multiple computational layers improve performance. Moreover, our approach is also fast: with a CPU implementation, the deep memory network with 9 layers is 15 times faster than LSTM.
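The attention hops over the external memory can be sketched roughly like this (simplified to dot-product attention with toy random vectors; the paper scores each memory slot with a small network over the context-word and aspect vectors):

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_words, n_hops = 4, 5, 3

memory = rng.normal(size=(n_words, d))  # context word embeddings (toy)
aspect = rng.normal(size=d)             # aspect embedding (toy)
W = rng.normal(size=(d, d))             # linear layer between hops (toy)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

vec = aspect
for _ in range(n_hops):
    scores = memory @ vec   # one score per context word
    attn = softmax(scores)  # importance of each context word
    read = attn @ memory    # weighted sum over the memory
    vec = W @ vec + read    # combine with a transform of the previous hop
print(vec.shape)  # final vector fed to the sentiment classifier
```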
IEEE Transactions on Knowledge and Data Engineering | 2016
Duyu Tang; Furu Wei; Bing Qin; Nan Yang; Ting Liu; Ming Zhou
We propose learning sentiment-specific word embeddings, dubbed sentiment embeddings, in this paper. Existing word embedding learning algorithms typically use only the contexts of words and ignore the sentiment of texts. This is problematic for sentiment analysis because words with similar contexts but opposite sentiment polarity, such as good and bad, are mapped to neighboring word vectors. We address this issue by encoding the sentiment information of texts (e.g., sentences and words) together with the contexts of words in sentiment embeddings. By combining context- and sentiment-level evidence, the nearest neighbors in the sentiment embedding space are not only semantically similar but also tend to share the same sentiment polarity. To learn sentiment embeddings effectively, we develop a number of neural networks with tailored loss functions, and automatically collect massive texts with sentiment signals, such as emoticons, as training data. Sentiment embeddings can be naturally used as word features for a variety of sentiment analysis tasks without feature engineering. We apply sentiment embeddings to word-level sentiment analysis, sentence-level sentiment classification, and building sentiment lexicons. Experimental results show that sentiment embeddings consistently outperform context-based embeddings on several benchmark datasets for these tasks. This work provides insights on the design of neural networks for learning task-specific word embeddings in other natural language processing tasks.
Web Search and Data Mining | 2015
Duyu Tang
In this paper, we propose a representation learning research framework for document-level sentiment analysis. Given a document as input, document-level sentiment analysis aims to automatically classify its sentiment/opinion (such as thumbs up or thumbs down) based on the textual information. Despite the success of feature engineering in many previous studies, hand-coded features do not capture the semantics of texts well. In this research, we argue that learning sentiment-specific semantic representations of documents is crucial for document-level sentiment analysis. We decompose document semantics into four cascaded constituents: (1) word representation, (2) sentence structure, (3) sentence composition, and (4) document composition. Specifically, we learn sentiment-specific word representations, which simultaneously encode the contexts of words and the sentiment supervision of texts in a continuous representation space. Following the principle of compositionality, we learn sentiment-specific sentence structures and sentence-level composition functions to produce the representation of each sentence from the representations of the words it contains. The semantic representations of documents are obtained through document composition, which leverages sentiment-sensitive discourse relations and the sentence representations.
Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery | 2015
Duyu Tang; Bing Qin; Ting Liu
Sentiment analysis (also known as opinion mining) is an active research area in natural language processing. It aims at identifying, extracting, and organizing sentiments from user-generated texts in social networks, blogs, or product reviews. Over the past 15 years, many studies in the literature have exploited machine learning approaches to solve sentiment analysis tasks from different perspectives. Since the performance of a machine learner heavily depends on the choice of data representation, many studies have been devoted to building powerful feature extractors with domain expertise and careful engineering. Recently, deep learning approaches have emerged as powerful computational models that discover intricate semantic representations of texts automatically from data, without feature engineering. These approaches have improved the state of the art in many sentiment analysis tasks, including sentiment classification of sentences/documents, sentiment extraction, and sentiment lexicon learning. In this paper, we provide an overview of successful deep learning approaches for sentiment analysis tasks, lay out the remaining challenges, and offer some suggestions to address them. WIREs Data Mining Knowl Discov 2015, 5:292–303. doi: 10.1002/widm.1171
IEEE Transactions on Audio, Speech, and Language Processing | 2015
Duyu Tang; Bing Qin; Furu Wei; Li Dong; Ting Liu; Ming Zhou
In this paper, we propose a joint segmentation and classification framework for sentence-level sentiment classification. It is widely recognized that phrasal information is crucial for sentiment classification. However, existing sentiment classification algorithms typically treat a sentence as a word sequence, which does not effectively handle the inconsistent sentiment polarity between a phrase and the words it contains, such as {“not bad,” “bad”} and {“a great deal of,” “great”}. We address this issue by developing a joint framework for sentence-level sentiment classification that simultaneously generates useful segmentations and predicts sentence-level polarity based on the segmentation results. Specifically, we develop a candidate generation model to produce segmentation candidates of a sentence; a segmentation ranking model to score the usefulness of a segmentation candidate for sentiment classification; and a classification model to predict the sentiment polarity of a segmentation. We train the joint framework directly from sentences annotated with only sentiment polarity, without using any syntactic or sentiment annotations at the segmentation level. We conduct sentiment classification experiments on two benchmark datasets: a tweet dataset and a review dataset. Experimental results show that: 1) our method performs comparably with state-of-the-art methods on both datasets; 2) jointly modeling segmentation and classification outperforms pipelined baseline methods in various experimental settings.
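Candidate generation can be illustrated by exhaustively enumerating segmentations with a bounded phrase length (a simplified stand-in; the paper's candidate generation model learns to produce and prune candidates rather than enumerating them all):

```python
def segmentations(words, max_len=3):
    """Enumerate candidate segmentations of a word sequence into phrases
    of length <= max_len. Each candidate is a list of phrase strings."""
    if not words:
        return [[]]
    results = []
    for k in range(1, min(max_len, len(words)) + 1):
        head = " ".join(words[:k])
        for rest in segmentations(words[k:], max_len):
            results.append([head] + rest)
    return results

cands = segmentations("not bad at all".split())
print(len(cands))  # 7 candidate segmentations of a 4-word sentence
```

A downstream ranking model would then score candidates such as `["not bad", "at all"]`, which keeps the polarity-flipping phrase "not bad" intact, above the fully split `["not", "bad", "at", "all"]`.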
NLPCC | 2013
Duyu Tang; Bing Qin; Ting Liu; Zhenghua Li
This paper studies the emotion classification task on microblogs. Given a message, we classify its emotion as happy, sad, angry, or surprised. Existing methods mostly use bag-of-words representations or manually designed features to train supervised or distantly supervised models. However, manually engineering features is time-consuming and insufficient to capture the complex linguistic phenomena of microblogs. In this study, to overcome these problems, we utilize pseudo-labeled data, which has been extensively explored for distant supervision learning and for training language models in Twitter sentiment analysis, to learn sentence representations with the Deep Belief Network algorithm. Experimental results in a supervised learning framework show that, using the pseudo-labeled data, the representation learned by the Deep Belief Network outperforms the Principal Component Analysis based and Latent Dirichlet Allocation based representations. Incorporating the Deep Belief Network based representation into the basic features further improves performance.