Jingbo Zhu
Northeastern University
Publications
Featured research published by Jingbo Zhu.
IEEE Transactions on Affective Computing | 2011
Jingbo Zhu; Huizhen Wang; Muhua Zhu; Benjamin K. Tsou; Matthew Y. Ma
Opinion polling has traditionally been done via customer satisfaction studies in which questions are carefully designed to gather customer opinions about target products or services. This paper studies aspect-based opinion polling from unlabeled free-form textual customer reviews without requiring customers to answer any questions. First, a multi-aspect bootstrapping method is proposed to learn the aspect-related terms of each aspect that are used for aspect identification. Second, an aspect-based segmentation model is proposed to segment a multi-aspect sentence into multiple single-aspect units as basic units for opinion polling. Finally, an aspect-based opinion polling algorithm is presented in detail. Experiments on real Chinese restaurant reviews demonstrated that our approach can achieve 75.5 percent accuracy in aspect-based opinion polling tasks. The proposed opinion polling method does not require labeled training data. It is thus easy to implement and applicable to other languages (e.g., English) or other domains such as product or movie reviews.
International Conference on Computational Linguistics | 2008
Jingbo Zhu; Huizhen Wang; Tianshun Yao; Benjamin K. Tsou
This paper addresses two issues of active learning. First, because uncertainty sampling often fails by selecting outliers, this paper presents a new selective sampling technique, sampling by uncertainty and density (SUD), in which a k-Nearest-Neighbor-based density measure is adopted to determine whether an unlabeled example is an outlier. Second, a technique of sampling by clustering (SBC) is applied to build a representative initial training data set for active learning. Finally, we implement a new active learning algorithm that combines the SUD and SBC techniques. The experimental results from three real-world data sets show that our method outperforms competing methods, particularly at the early stages of active learning.
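The SUD idea described above can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the choices here are assumptions: uncertainty is taken as the entropy of the classifier's class probabilities, density as the average cosine similarity to the k most similar unlabeled examples, and the two are multiplied. The function names (`entropy`, `knn_density`, `sud_select`) are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy of a class-probability distribution (assumed uncertainty measure)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def cosine(u, v):
    """Cosine similarity between two feature vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def knn_density(i, vectors, k):
    """Average similarity of example i to its k most similar neighbours (assumed density measure)."""
    sims = sorted(
        (cosine(vectors[i], v) for j, v in enumerate(vectors) if j != i),
        reverse=True,
    )
    top = sims[:k]
    return sum(top) / len(top) if top else 0.0

def sud_select(vectors, probs, k=2):
    """Return the index of the example with the highest uncertainty * density score."""
    scores = [entropy(p) * knn_density(i, vectors, k) for i, p in enumerate(probs)]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy pool: three clustered examples and one outlier (index 3).
vectors = [[1.0, 0.0], [0.9, 0.1], [0.95, 0.05], [0.0, 1.0]]
# Hypothetical classifier posteriors; the outlier is the most uncertain.
probs = [[0.6, 0.4], [0.9, 0.1], [0.9, 0.1], [0.5, 0.5]]
chosen = sud_select(vectors, probs, k=2)
```

In this toy pool, plain uncertainty sampling would pick the outlier, while the combined score prefers an uncertain example that lies in a dense region.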
Conference on Information and Knowledge Management | 2009
Jingbo Zhu; Huizhen Wang; Benjamin K. Tsou; Muhua Zhu
This paper presents an unsupervised approach to aspect-based opinion polling from raw textual reviews without explicit ratings. The contribution of this paper is threefold. First, a multi-aspect bootstrapping algorithm is proposed to learn, from unlabeled data, the aspect-related terms of each aspect used for aspect identification. Second, an unsupervised segmentation model is proposed to address the challenge of identifying multiple single-aspect units in a multi-aspect sentence. Finally, an aspect-based opinion polling algorithm is presented. Experiments on real Chinese restaurant reviews show that our opinion polling method can achieve 75.5% precision.
IEEE Transactions on Audio, Speech, and Language Processing | 2010
Jingbo Zhu; Huizhen Wang; Benjamin K. Tsou; Matthew Y. Ma
To solve the knowledge bottleneck problem, active learning has been widely used for its ability to automatically select the most informative unlabeled examples for human annotation. One of the key enabling techniques of active learning is uncertainty sampling, which uses one classifier to identify the unlabeled examples with the least confidence. Uncertainty sampling often runs into problems when outliers are selected. To solve the outlier problem, this paper presents two techniques, sampling by uncertainty and density (SUD) and density-based re-ranking. Both techniques prefer not only the most informative example in terms of the uncertainty criterion, but also the most representative example in terms of the density criterion. Experimental results of active learning for word sense disambiguation and text classification tasks using six real-world evaluation data sets demonstrate the effectiveness of the proposed methods.
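One plausible reading of density-based re-ranking is sketched below. The abstract does not describe the procedure, so this is an assumption: keep the `top_n` most uncertain examples, then reorder that shortlist by density. The function name `rerank_by_density` and the precomputed score lists are hypothetical.

```python
def rerank_by_density(uncertainty, density, top_n):
    """Shortlist the top_n most uncertain examples, then order them by density.

    uncertainty and density are parallel lists of precomputed scores
    (how they are computed is left open; this is only an illustration).
    """
    shortlist = sorted(
        range(len(uncertainty)), key=lambda i: uncertainty[i], reverse=True
    )[:top_n]
    return sorted(shortlist, key=lambda i: density[i], reverse=True)

# Toy scores: example 3 is the most uncertain but lies in a sparse region.
uncertainty = [0.67, 0.33, 0.33, 0.69]
density = [0.99, 0.99, 0.99, 0.08]
ranked = rerank_by_density(uncertainty, density, top_n=2)
```

With `top_n=1` this reduces to plain uncertainty sampling; larger shortlists give the density criterion more influence.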
Meeting of the Association for Computational Linguistics | 2014
Ji Ma; Yue Zhang; Jingbo Zhu
In this paper, we address the problem of web-domain POS tagging using a two-phase approach. The first phase learns representations that capture regularities underlying web text. The representation is integrated as features into a neural network that serves as a scorer for an easy-first POS tagger. Parameters of the neural network are trained using guided learning in the second phase. Experiments on the SANCL 2012 shared task show that our approach achieves 93.15% average tagging accuracy, which is the best accuracy reported so far on this data set, higher than those given by ensembled syntactic parsers.
Conference on Information and Knowledge Management | 2009
Jingbo Zhu; Muhua Zhu; Huizhen Wang; Benjamin K. Tsou
Aspect-based sentiment summarization systems generally use sentences associated with relevant aspects extracted from the reviews as the basis for summarization. However, in real reviews, a single sentence often expresses opinions on several aspects. This paper proposes a two-stage segmentation model to address the challenge of identifying multiple single-aspect and single-polarity units in one sentence, namely aspect-based sentence segmentation. Our model deals with both aspect change and polarity change occurring in the input sentence. Experiments on restaurant reviews show that our model outperforms state-of-the-art linear text segmentation methods.
Empirical Methods in Natural Language Processing | 2009
Tong Xiao; Mu Li; Dongdong Zhang; Jingbo Zhu; Ming Zhou
Binarization of Synchronous Context-Free Grammars (SCFG) is essential for achieving polynomial time complexity of decoding in SCFG-parsing-based machine translation systems. In this paper, we first investigate the excess edge competition issue caused by a left-heavy binary SCFG derived with the method of Zhang et al. (2006). Then we propose a new binarization method to mitigate the problem by exploring alternative equivalent binary SCFGs. We present an algorithm that iteratively improves the resulting binary SCFG, and empirically show that our method can improve a string-to-tree statistical machine translation system based on the synchronous binarization method of Zhang et al. (2006) on the NIST machine translation evaluation tasks.
International Conference Natural Language Processing | 2010
Xiaoxu Fei; Huizhen Wang; Jingbo Zhu
This paper addresses the issue of sentiment word identification given an opinionated sentence, which is very important in sentiment analysis tasks. The most common way to tackle this problem is to use a readily available sentiment lexicon such as HowNet or SentiWordNet to determine whether a word is a sentiment word. In practice, however, words in the lexicon sometimes do not express sentiment in a given context, while other words outside the lexicon do. To address this challenge, this paper presents an approach based on a maximum-entropy classification model to identify sentiment words given an opinionated sentence. Experimental results show that our approach outperforms baseline lexicon-based methods.
Meeting of the Association for Computational Linguistics | 2014
Ji Ma; Yue Zhang; Jingbo Zhu
Modern statistical dependency parsers assign lexical heads to punctuations as well as words. Punctuation parsing errors lead to low parsing accuracy on words. In this work, we propose an alternative approach to addressing punctuation in dependency parsing. Rather than assigning lexical heads to punctuations, we treat punctuations as properties of their neighbouring words, used as features to guide the parser to build the dependency graph. Integrating our method with an arc-standard parser yields a 93.06% unlabelled attachment score, which is the best accuracy by a single-model transition-based parser reported so far.
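The idea of treating punctuation as a property of neighbouring words can be illustrated with a small preprocessing sketch. This is not the paper's feature design, which the abstract does not specify; it is an assumption that each punctuation token is recorded as a right-hand property of the preceding word and a left-hand property of the following word, then dropped from the token sequence. The function name `attach_punctuation` is hypothetical.

```python
import string

def is_punct(token):
    """True if the token consists only of ASCII punctuation characters."""
    return token != "" and all(ch in string.punctuation for ch in token)

def attach_punctuation(tokens):
    """Remove punctuation tokens and record them on the neighbouring words.

    Returns a list of dicts with keys "word", "left" and "right", where
    "left"/"right" hold the punctuation immediately before/after the word.
    """
    words = []
    pending = []  # punctuation seen since the last word
    for tok in tokens:
        if is_punct(tok):
            pending.append(tok)
            if words:
                words[-1]["right"].append(tok)
        else:
            words.append({"word": tok, "left": pending, "right": []})
            pending = []  # start a fresh list for the next word
    return words

result = attach_punctuation(["Hello", ",", "world", "!"])
```

A parser could then consult the `left` and `right` fields as features while building arcs over words only.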
IEEE Transactions on Affective Computing | 2012
Jingbo Zhu; Chunliang Zhang; Matthew Y. Ma
This paper explores the problem of content-based rating inference from online opinion-based texts, which often express differing opinions on multiple aspects. To sufficiently capture information from various aspects, we propose an aspect-based segmentation algorithm to first segment a user review into multiple single-aspect textual parts, and an aspect-augmentation approach to generate the aspect-specific feature vector of each aspect for aspect-based rating inference. To tackle the problem of inconsistent rating annotation, we present a tolerance-based criterion to optimize training sample selection for parameter updating during the model training process. Finally, we present a collaborative rating inference model which explores meaningful correlations between ratings across a set of aspects of user opinions for multi-aspect rating inference. We compared our proposed methods with several other approaches, and experiments on real Chinese restaurant reviews demonstrated that our approaches achieve significant improvements over the others.