Publication


Featured research published by Yongfeng Huang.


Neurocomputing | 2016

Structured microblog sentiment classification via social context regularization

Fangzhao Wu; Yongfeng Huang; Yangqiu Song

Microblog sentiment analysis is a fundamental problem for many interesting applications. Existing microblog sentiment classification methods judge the sentiment polarity mainly according to textual content. However, since microblog messages are very short and noisy, and their sentiment polarities are often ambiguous and context-dependent, the accuracy of microblog sentiment classification is usually unsatisfactory. Fortunately, microblog messages lie in social media and contain rich social contexts. The social context information often implies sentiment connections between microblog messages. For example, a microblogging user usually expresses the same sentiment when posting multiple messages towards the same topic. Motivated by these observations, in this paper we propose a structured microblog sentiment classification (SMSC) framework. Our framework can combine social context information with textual content information to improve microblog sentiment classification accuracy. Two kinds of social contexts are used in our framework, i.e., social connections between microblog messages brought by the same author and social connections brought by social relations between users. In our framework, social context information is formulated as the graph structure over the sentiments of microblog messages. The objective function of our framework is a tradeoff between the agreement with content-based sentiment predictions and the consistency with social contexts. An efficient optimization algorithm is introduced to solve our framework. Experimental results on two Twitter sentiment analysis benchmark datasets indicate that our method can outperform baseline methods consistently and significantly.
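The tradeoff described in the abstract above, agreement with content-based predictions versus consistency with social contexts, can be illustrated with a small graph-regularized least-squares sketch. This is not the SMSC formulation from the paper, whose exact objective is not given here. It assumes a hypothetical objective `sum_i (y_i - p_i)^2 + lam * sum_(i,j) w_ij * (y_i - y_j)^2`, where `p_i` is the content-based score of message `i` and each edge `(i, j, w_ij)` links two messages related by social context (for example, same author). The names `refine_sentiments`, `content_scores`, `edges`, and `lam` are illustrative.

```python
def refine_sentiments(content_scores, edges, lam=1.0, iters=200):
    """Refine per-message sentiment scores with a graph regularizer.

    Assumed objective (illustrative, not the paper's):
        sum_i (y_i - p_i)^2 + lam * sum_{(i, j, w)} w * (y_i - y_j)^2
    Setting the gradient to zero gives, for each message i,
        y_i = (p_i + lam * sum_j w_ij * y_j) / (1 + lam * sum_j w_ij),
    which is iterated here Jacobi-style until it settles.
    """
    n = len(content_scores)
    neighbors = [[] for _ in range(n)]
    for i, j, w in edges:
        neighbors[i].append((j, w))
        neighbors[j].append((i, w))

    y = list(content_scores)
    for _ in range(iters):
        new_y = []
        for i in range(n):
            total_w = sum(w for _, w in neighbors[i])
            pull = sum(w * y[j] for j, w in neighbors[i])
            new_y.append((content_scores[i] + lam * pull) / (1.0 + lam * total_w))
        y = new_y
    return y


# Message 2 looks slightly negative from text alone, but it is linked
# (e.g., same author, same topic) to two clearly positive messages.
refined = refine_sentiments([0.9, 0.8, -0.1], [(0, 2, 1.0), (1, 2, 1.0)])
```

In this toy case the ambiguous third message is pulled toward the positive side by its neighbors, while a message with no edges keeps its content-based score unchanged.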


Conference on Information and Knowledge Management | 2015

Social Spammer and Spam Message Co-Detection in Microblogging with Social Context Regularization

Fangzhao Wu; Jinyun Shu; Yongfeng Huang; Zhigang Yuan

The popularity of microblogging platforms, such as Twitter, makes them important for information dissemination and sharing. However, they are also recognized as ideal places by spammers to conduct social spamming. Massive social spammers and spam messages heavily hurt the user experience and hinder the healthy development of microblogging systems. Thus, effectively detecting the social spammers and spam messages in microblogging is of great value. Existing studies mainly regard social spammer detection and spam message detection as two separate tasks. However, social spammers and spam messages have strong connections, since social spammers tend to post more spam messages and spam messages have high probabilities to be posted by social spammers. Combining social spammer detection with spam message detection has the potential to boost the performance of each task. In this paper, we propose a unified framework for social spammer and spam message co-detection in microblogging. Our framework utilizes the posting relations between users and messages to combine social spammer detection with spam message detection. In addition, we extract the social relations between users as well as the connections between messages, and incorporate them into our framework as regularization terms over the prediction results. Besides, we introduce an efficient optimization method to solve our framework. Extensive experiments on a real-world microblog dataset demonstrate that our framework can significantly and consistently improve the performance of both social spammer detection and spam message detection.


Neurocomputing | 2016

Co-detecting social spammers and spam messages in microblogging via exploiting social contexts

Fangzhao Wu; Jinyun Shu; Yongfeng Huang; Zhigang Yuan

Microblogging websites, such as Twitter, have become popular platforms for information dissemination and sharing. However, they are also full of spammers who frequently conduct social spamming on them. Massive social spammers and spam messages heavily hurt the user experience and hinder the healthy development of microblogging systems. Thus, effectively detecting the social spammers and spam messages is of great value to both microblogging users and websites. Existing studies usually treat social spammer detection and spam message detection as two separate tasks. However, social spammers and spam messages have strong inherent connections, since social spammers tend to post more spam messages and spam messages have high probabilities to be posted by social spammers. Thus combining social spammer detection with spam message detection has the potential to boost the performance of both tasks. In this paper, we propose a unified approach for social spammer and spam message co-detection in microblogging. Our approach utilizes the posting relations between users and messages to combine social spammer detection with spam message detection. In addition, we extract the social relations between users and the connections between messages to refine detection results. We regard these social contexts as the graph structure over the detection results and incorporate them into our approach as regularization terms. Besides, we introduce an efficient optimization algorithm to solve the model of our approach and propose an accelerated method to tackle the most time-consuming step. Extensive experiments on a real-world microblog dataset demonstrate that our approach can improve the performance of both social spammer detection and spam message detection effectively and efficiently.
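The idea of linking the two detection tasks through posting relations can be sketched with a toy alternating update. This is an illustrative simplification, not the model from the paper: it uses only the user-to-message posting relation and omits the user-user and message-message regularization terms the abstract mentions. It assumes a hypothetical objective `sum_i (u_i - a_i)^2 + sum_k (m_k - b_k)^2 + alpha * sum_k (m_k - u_poster(k))^2`, where `a_i` and `b_k` are independent spam scores for users and messages. All function and parameter names are made up for the example.

```python
def co_detect(user_prior, msg_prior, poster, alpha=1.0, iters=100):
    """Jointly refine spammer scores and spam-message scores.

    user_prior[i]: independent spam score for user i (assumed given).
    msg_prior[k]:  independent spam score for message k (assumed given).
    poster[k]:     index of the user who posted message k.

    Each step minimizes the assumed objective exactly over one group of
    variables (users, then messages) while the other group is held fixed.
    """
    posts = [[] for _ in user_prior]
    for k, i in enumerate(poster):
        posts[i].append(k)

    u = list(user_prior)
    m = list(msg_prior)
    for _ in range(iters):
        # A user's score is pulled toward the scores of their messages.
        u = [
            (user_prior[i] + alpha * sum(m[k] for k in posts[i]))
            / (1.0 + alpha * len(posts[i]))
            for i in range(len(u))
        ]
        # A message's score is pulled toward the score of its poster.
        m = [
            (msg_prior[k] + alpha * u[poster[k]]) / (1.0 + alpha)
            for k in range(len(m))
        ]
    return u, m


# User 0 looks like a spammer; their first message looks benign in isolation.
users, messages = co_detect([0.9, 0.1], [0.2, 0.8, 0.1], poster=[0, 0, 1])
```

Here the benign-looking message from the likely spammer ends up with a higher spam score than its content alone suggested, which is the kind of mutual reinforcement the abstract describes.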


Information Sciences | 2016

Microblog sentiment classification with heterogeneous sentiment knowledge

Fangzhao Wu; Yangqiu Song; Yongfeng Huang

Microblogging services, such as Twitter, are very popular for information release and dissemination. Analyzing the sentiments in massive microblog messages is useful for sensing the public's opinions on various topics, which has wide applications in both academic and industrial fields. However, microblog sentiment analysis is a challenging task, because microblog messages are short and noisy, and contain massive user-invented acronyms and informal words. It is expensive and time-consuming to manually annotate sufficient samples for training an accurate and robust microblog sentiment classifier. Fortunately, unlabeled microblog messages can provide a lot of useful sentiment knowledge. For example, emoticons are frequently used in microblog messages and they usually indicate sentiment orientations. In this paper, we propose to extract useful sentiment knowledge from massive unlabeled messages to enhance microblog sentiment classification. Three kinds of sentiment knowledge, i.e., contextual similarity knowledge, word-sentiment knowledge, and contextual polarity knowledge, are explored. We propose a unified framework to incorporate the heterogeneous sentiment knowledge into the learning of microblog sentiment classifiers. An efficient optimization method based on ADMM is introduced to solve the model of our framework and an accelerated algorithm is proposed to tackle the most time-consuming step. Extensive experiments were conducted on three benchmark Twitter datasets. The experimental results show that our approach can improve the performance of microblog sentiment classification effectively and efficiently.
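The observation that emoticons usually indicate sentiment orientation can be turned into a rough word-sentiment signal from unlabeled messages. The sketch below is only an illustration of that idea, not the extraction procedure used in the paper: it treats emoticons as noisy labels and scores each word with a smoothed log-odds ratio of its co-occurrence with positive versus negative emoticons. The emoticon sets, the whitespace tokenization, and the name `word_sentiment_scores` are assumptions made for the example.

```python
import math
from collections import Counter

# Small illustrative emoticon sets; a real system would use larger lists.
POSITIVE = {":)", ":-)", ":D"}
NEGATIVE = {":(", ":-(", ":'("}


def word_sentiment_scores(messages, smoothing=1.0):
    """Score words by co-occurrence with positive vs. negative emoticons.

    Messages without emoticons, or with both kinds, are skipped.
    The score is log((pos + s) / (neg + s)); positive values suggest
    positive sentiment, negative values suggest negative sentiment.
    """
    pos_counts, neg_counts = Counter(), Counter()
    for text in messages:
        raw = text.split()
        has_pos = any(t in POSITIVE for t in raw)
        has_neg = any(t in NEGATIVE for t in raw)
        if has_pos == has_neg:  # no emoticon, or conflicting emoticons
            continue
        words = {t.lower() for t in raw if t not in POSITIVE and t not in NEGATIVE}
        target = pos_counts if has_pos else neg_counts
        for w in words:
            target[w] += 1

    vocab = set(pos_counts) | set(neg_counts)
    return {
        w: math.log((pos_counts[w] + smoothing) / (neg_counts[w] + smoothing))
        for w in vocab
    }


scores = word_sentiment_scores(
    ["great day :)", "great news :D", "awful traffic :(", "no emoticon here"]
)
```

Scores like these could serve as weak prior knowledge for a classifier; how the paper actually encodes word-sentiment knowledge in its framework is not specified in the abstract.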


Decision Support Systems | 2016

Towards building a high-quality microblog-specific Chinese sentiment lexicon

Fangzhao Wu; Yongfeng Huang; Yangqiu Song; Shixia Liu

Due to the huge popularity of microblogging services, microblogs have become important sources of customer opinions. Sentiment analysis systems can provide useful knowledge to decision support systems and decision makers by aggregating and summarizing the opinions in massive microblogs automatically. The most important component of sentiment analysis systems is the sentiment lexicon. However, the performance of traditional sentiment lexicons on microblog sentiment analysis is far from satisfactory, especially for Chinese. In this paper, we propose a data-driven approach to build a high-quality microblog-specific sentiment lexicon for Chinese microblog sentiment analysis systems. The core of our method is a unified framework that incorporates three kinds of sentiment knowledge for sentiment lexicon construction, i.e., the word-sentiment knowledge extracted from microblogs with emoticons, the sentiment similarity knowledge extracted from word associations among all the messages, and the prior sentiment knowledge extracted from existing sentiment lexicons. In addition, in order to improve the coverage of our sentiment lexicon, we propose an effective method to detect popular new words in microblogs, which considers not only word distributions over texts, but also their distributions over users. The detected new words with strong sentiment are incorporated in our sentiment lexicon. We built a microblog-specific Chinese sentiment lexicon on a large microblog dataset with more than 17 million messages. Experimental results on two microblog sentiment datasets show that our microblog-specific sentiment lexicon can significantly improve the performance of microblog sentiment analysis.

An effective and efficient method to detect the popular user-invented new words in Chinese microblogs. Three kinds of heterogeneous sentiment knowledge are extracted for building the sentiment lexicon. A unified framework incorporating various kinds of sentiment knowledge for microblog-specific sentiment lexicon construction. Our microblog-specific sentiment lexicon outperforms existing sentiment lexicons.


Information Fusion | 2017

Domain-specific sentiment classification via fusing sentiment knowledge from multiple sources

Fangzhao Wu; Yongfeng Huang; Zhigang Yuan

Extract and fuse four kinds of sentiment knowledge from multiple sources. A unified model to fuse knowledge for domain-specific sentiment classification. An efficient algorithm to solve the model of our approach. Extensive experimental results validate the effectiveness and efficiency of our approach.

Analyzing the sentiments in massive user-generated online data, such as product reviews and microblogs, has become a hot research topic. It can help customers, companies and expert systems make more informed decisions. Sentiment analysis is widely known as a domain-dependent problem. Different domains usually have different sentiment expressions and a general sentiment classifier is not suitable for all domains. A natural solution to this problem is to train a domain-specific sentiment classifier for each target domain. However, the labeled data in the target domain is usually insufficient, and it is costly and time-consuming to annotate enough samples. In order to tackle this problem, we propose a novel approach to train domain-specific sentiment classifiers by fusing the sentiment knowledge from multiple sources. Sentiment information from four sources is extracted and fused in our approach. The first source is sentiment lexicons, which contain sentiment polarities of general sentiment words. The second source is the sentiment classifiers of multiple source domains. The third source is the unlabeled data in the target domain, from which we extract domain-specific sentiment relations among words. The fourth source is the labeled data in the target domain. We propose a unified framework to fuse these four kinds of sentiment knowledge and train a domain-specific sentiment classifier for the target domain. In addition, we present an efficient optimization algorithm to solve the model of our approach. Extensive experiments are conducted on both an Amazon product review dataset and a Twitter dataset. Experimental results show that by fusing the sentiment information extracted from multiple sources, our approach can effectively improve the performance of sentiment classification and reduce the dependence on labeled data. For instance, our approach can achieve an accuracy of 87.22% in the Kitchen domain when only 200 samples in the target domain are labeled. The performance improvements of our approach compared with a purely supervised sentiment classifier are 8.98% and 7.92% on the Amazon and Twitter datasets respectively.


Conference on Information and Knowledge Management | 2013

Review rating prediction based on the content and weighting strong social relation of reviewers

Bingkun Wang; Yulin Min; Yongfeng Huang; Xing Li; Fangzhao Wu

Review rating is more helpful than binary review classification for many decision processes such as consumption decision-making, company product quality tracking and public opinion mining. When rating, reviewers are influenced not only by their own subjective feelings, but also by others' ratings of the same product. Existing review rating prediction methods are mainly based on the content of reviews; they consider only the subjective factors of reviewers and ignore the influence of other people in the reviewers' social relations. Motivated by this, we propose a review rating prediction method that incorporates the characteristics of reviewers' social relations, as regularization constraints, into content-based methods. In addition, we propose a method to classify the social relations of reviewers into strong social relations and ordinary social relations. Strong social relations are given higher weight than ordinary social relations when the two kinds of social relations are incorporated into content-based methods. Experiments on two real movie review datasets demonstrate that considering different social relations yields better performance than both content-based methods and a method that treats social relations as a whole.


Knowledge-Based Systems | 2018

A hybrid unsupervised method for aspect term and opinion target extraction

Chuhan Wu; Fangzhao Wu; Sixing Wu; Zhigang Yuan; Yongfeng Huang

Aspect term extraction (ATE) and opinion target extraction (OTE) are two important tasks in the field of fine-grained sentiment analysis. Existing approaches to ATE and OTE are mainly based on rules or machine learning methods. Rule-based methods are usually unsupervised, but they cannot make use of high-level features. Although supervised learning approaches usually outperform the rule-based ones, they need a large number of labeled samples to train their models, which are expensive and time-consuming to annotate. In this paper, we propose a hybrid unsupervised method which can combine rules and machine learning methods to address ATE and OTE tasks. First, we use chunk-level linguistic rules to extract nominal phrase chunks and regard them as candidate opinion targets and aspects. Then we propose to filter irrelevant candidates based on domain correlation. Finally, we use these texts with extracted chunks as pseudo-labeled data to train a deep gated recurrent unit (GRU) network for aspect term extraction and opinion target extraction. The experiments on benchmark datasets validate the effectiveness of our approach in extracting opinion targets and aspects with minimal manual annotation.
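The middle steps of the pipeline above, filtering candidate chunks and turning the survivors into pseudo labels, can be sketched as follows. This is a rough illustration under stated assumptions, not the method from the paper: the abstract does not say how domain correlation is measured, so a simple relative-frequency ratio between an in-domain corpus and a general corpus stands in for it, and BIO tagging is assumed as the pseudo-label scheme. The names `filter_candidates`, `bio_tags`, and `min_ratio` are hypothetical, and the frequencies are made-up toy values.

```python
def filter_candidates(candidates, domain_freq, general_freq, min_ratio=2.0):
    """Keep chunks that are much more frequent in-domain than in general text.

    candidates:   iterable of token tuples, e.g. ("battery", "life").
    domain_freq:  relative frequency of each chunk in the target domain.
    general_freq: relative frequency of each chunk in a general corpus.
    The frequency ratio is a stand-in for the paper's domain correlation.
    """
    kept = []
    for chunk in candidates:
        d = domain_freq.get(chunk, 0.0)
        g = general_freq.get(chunk, 0.0)
        if d > 0 and d >= min_ratio * g:
            kept.append(chunk)
    return kept


def bio_tags(tokens, chunks):
    """Produce B/I/O pseudo labels by matching kept chunks in a sentence."""
    tags = ["O"] * len(tokens)
    for chunk in chunks:
        n = len(chunk)
        for start in range(len(tokens) - n + 1):
            span_free = all(t == "O" for t in tags[start:start + n])
            if tuple(tokens[start:start + n]) == tuple(chunk) and span_free:
                tags[start] = "B"
                for k in range(start + 1, start + n):
                    tags[k] = "I"
    return tags


tokens = ["the", "battery", "life", "is", "great"]
candidates = [("battery", "life"), ("the",)]
kept = filter_candidates(
    candidates,
    domain_freq={("battery", "life"): 0.01, ("the",): 0.05},
    general_freq={("battery", "life"): 0.0001, ("the",): 0.06},
)
labels = bio_tags(tokens, kept)
```

Token and tag sequences produced this way are the kind of pseudo-labeled data that could then be fed to a sequence tagger such as the GRU network the abstract mentions.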


IEEE Transactions on Knowledge and Data Engineering | 2017

Collaboratively Training Sentiment Classifiers for Multiple Domains

Fangzhao Wu; Zhigang Yuan; Yongfeng Huang

We propose a collaborative multi-domain sentiment classification approach to train sentiment classifiers for multiple domains simultaneously. In our approach, the sentiment information in different domains is shared to train more accurate and robust sentiment classifiers for each domain when labeled data is scarce. Specifically, we decompose the sentiment classifier of each domain into two components, a global one and a domain-specific one. The global model can capture the general sentiment knowledge and is shared by various domains. The domain-specific model can capture the specific sentiment expressions in each domain. In addition, we extract domain-specific sentiment knowledge from both labeled and unlabeled samples in each domain and use it to enhance the learning of domain-specific sentiment classifiers. Besides, we incorporate the similarities between domains into our approach as regularization over the domain-specific sentiment classifiers to encourage the sharing of sentiment information between similar domains. Two kinds of domain similarity measures are explored, one based on textual content and the other one based on sentiment expressions. Moreover, we introduce two efficient algorithms to solve the model of our approach. Experimental results on benchmark datasets show that our approach can effectively improve the performance of multi-domain sentiment classification and significantly outperform baseline methods.
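The decomposition into a global component and a domain-specific component can be illustrated with a toy linear scorer. This is only a sketch under the assumption of linear classifiers over bag-of-words features. The weights below are set by hand for illustration rather than learned, and the joint training, knowledge extraction, and domain-similarity regularization described in the abstract are not shown. The names `score`, `predict`, `w_global`, and `domain_weights` are hypothetical.

```python
def score(features, w_global, w_domain):
    """Linear score using the sum of global and domain-specific weights."""
    return sum(
        value * (w_global.get(name, 0.0) + w_domain.get(name, 0.0))
        for name, value in features.items()
    )


def predict(features, w_global, domain_weights, domain):
    """Return +1 (positive) or -1 (negative) for a sample from `domain`."""
    w_domain = domain_weights.get(domain, {})
    return 1 if score(features, w_global, w_domain) >= 0 else -1


# Hand-set illustrative weights: "great" is positive everywhere, while
# "easy" is assumed positive for electronics and negative for movies.
w_global = {"great": 1.0, "terrible": -1.0}
domain_weights = {
    "electronics": {"easy": 0.9},
    "movies": {"easy": -0.8},
}

sample = {"easy": 1.0}
electronics_label = predict(sample, w_global, domain_weights, "electronics")
movies_label = predict(sample, w_global, domain_weights, "movies")
```

The shared weights carry general sentiment words across all domains, while each domain's own weights capture expressions whose polarity depends on the domain.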


Conference on Information and Knowledge Management | 2016

Sentiment Domain Adaptation with Multi-Level Contextual Sentiment Knowledge

Fangzhao Wu; Sixing Wu; Yongfeng Huang; Songfang Huang; Yong Qin

Sentiment domain adaptation is widely studied to tackle the domain-dependence problem in the sentiment analysis field. Existing domain adaptation methods usually train a sentiment classifier in a source domain and adapt it to the target domain using transfer learning techniques. However, when the sentiment feature distributions of the source and target domains are significantly different, the adaptation performance will heavily decline. In this paper, we propose a new sentiment domain adaptation approach by adapting the sentiment knowledge in general-purpose sentiment lexicons to a specific domain. Since the general sentiment words of general-purpose sentiment lexicons usually convey consistent sentiments in different domains, they have better generalization performance than the sentiment classifier trained in a source domain. In addition, we propose to extract various kinds of contextual sentiment knowledge from massive unlabeled samples in the target domain and formulate them as sentiment relations among sentiment expressions. These relations can propagate the sentiment information in general sentiment words to massive domain-specific sentiment expressions. Besides, we propose a unified framework to incorporate these different kinds of sentiment knowledge and learn an accurate domain-specific sentiment classifier for the target domain. Moreover, we propose an efficient optimization algorithm to solve the model of our approach. Extensive experiments on benchmark datasets validate the effectiveness and efficiency of our approach.

Collaboration


Top Co-Authors

Fangzhao Wu

Microsoft Research Asia (China)

Yangqiu Song

Hong Kong University of Science and Technology

Jinyun Shu

Beijing University of Posts and Telecommunications
