Kunpeng Zhang | Researchain

Publication

Featured research published by Kunpeng Zhang.

International Conference on Data Mining | 2011

SES: Sentiment Elicitation System for Social Media Data

Kunpeng Zhang; Yu Cheng; Yusheng Xie; Daniel Honbo; Ankit Agrawal; Diana Palsetia; Kathy Lee; Wei-keng Liao; Alok N. Choudhary

Social media is becoming a major and popular technological platform that allows users to discuss and share information. Information is generated and managed through either computers or mobile devices by one person and consumed by many others. Most of this user-generated content is textual, found on social networks (Facebook, LinkedIn), microblogging services (Twitter), and blogs (Blogspot, WordPress). Looking for valuable nuggets of knowledge, such as capturing and summarizing sentiments from this huge amount of data, could help users make informed decisions. In this paper, we develop a sentiment identification system called SES, which implements three different sentiment identification algorithms. We augment basic compositional semantic rules in the first algorithm. In the second algorithm, we argue that sentiment should not simply be classified as positive, negative, or objective, but should be given a continuous score that reflects its degree. All word scores are calculated based on a large volume of customer reviews. Due to the special characteristics of social media texts, we propose a third algorithm that takes emoticons, negation word position, and domain-specific words into account. Furthermore, a machine learning model is employed on features derived from the outputs of the three algorithms. We conduct our experiments on user comments from Facebook and tweets from Twitter. The results show that Random Forest achieves better accuracy than decision tree, neural network, and logistic regression. We also propose a flexible way to represent document sentiment based on the sentiments of the sentences it contains. SES is available online.
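The abstract above mentions continuous sentiment scores, emoticons, negation words, and a document sentiment derived from its sentences. The following is only a minimal sketch of those ideas, not the SES implementation: the lexicons `WORD_SCORES`, `EMOTICON_SCORES`, and `NEGATIONS`, the rule that a negation flips the next scored word, and the averaging in `document_score` are all assumptions made for illustration.

```python
import re

# Hypothetical toy lexicons; SES derives its word scores from customer reviews.
WORD_SCORES = {"good": 0.6, "great": 0.9, "bad": -0.6, "terrible": -0.9}
EMOTICON_SCORES = {":)": 0.5, ":(": -0.5}
NEGATIONS = {"not", "never", "no"}


def sentence_score(sentence):
    """Return a continuous sentiment score for one sentence."""
    score = 0.0
    negate = False
    for token in sentence.lower().split():
        # Emoticons are matched before punctuation is stripped.
        if token in EMOTICON_SCORES:
            score += EMOTICON_SCORES[token]
            continue
        word = token.strip(".,!?")
        if word in NEGATIONS:
            # Assumption: a negation flips the next scored word only.
            negate = True
            continue
        if word in WORD_SCORES:
            value = WORD_SCORES[word]
            score += -value if negate else value
            negate = False
    return score


def document_score(text):
    """Average the sentence scores (one possible aggregation, assumed here)."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
    if not sentences:
        return 0.0
    return sum(sentence_score(s) for s in sentences) / len(sentences)
```

For instance, `sentence_score("not good")` yields a negative score because the negation flips the polarity of "good".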

International Conference on Electronic Commerce | 2011

Mining millions of reviews: a technique to rank products based on importance of reviews

Kunpeng Zhang; Yu Cheng; Wei-keng Liao; Alok N. Choudhary

As online shopping becomes increasingly popular, many shopping web sites encourage existing customers to add reviews of the products they have purchased. These reviews influence the purchasing decisions of potential customers. At Amazon.com, for instance, some products receive hundreds of reviews. It is overwhelming and time-consuming for most customers to read, comprehend, and make decisions based on all of these reviews. Customers most likely end up reading only a small fraction of the reviews, usually in the order in which they are presented on the product page. Various product review factors, such as content related to product quality, time of the review, content related to product durability, and historically older positive customer reviews, will have different impacts on a product's ranking. Thus, the automated mining of product reviews and opinions to produce a recalculated product ranking score is a valuable tool that would allow potential customers to make more informed decisions. In this paper, we present a product ranking model that applies weights to product review factors to calculate a product's ranking score. Our experiments use customer reviews from Amazon.com as input to our product ranking model, which produces product ranking results that closely relate to the product's sales ranking as reported by the retailer.
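As a rough illustration of applying weights to review factors, the sketch below computes a weighted average of review sentiment. It is not the model from the paper: the `Review` fields, the exponential recency decay with `half_life_days`, and the use of helpful votes as a weight are hypothetical choices made only to show the general shape of such a score.

```python
from dataclasses import dataclass


@dataclass
class Review:
    sentiment: float    # hypothetical polarity of the review text, in [-1, 1]
    age_days: float     # days since the review was posted
    helpful_votes: int  # hypothetical "helpful" vote count (not from the paper)


def review_weight(review, half_life_days=365.0):
    """Weight a review by recency and helpfulness (both assumed factors)."""
    recency = 0.5 ** (review.age_days / half_life_days)
    helpfulness = 1.0 + review.helpful_votes
    return recency * helpfulness


def product_score(reviews):
    """Weighted average of review sentiment; 0.0 when there are no reviews."""
    total_weight = sum(review_weight(r) for r in reviews)
    if total_weight == 0:
        return 0.0
    weighted = sum(review_weight(r) * r.sentiment for r in reviews)
    return weighted / total_weight
```

With these assumptions, a fresh positive review outweighs a year-old negative one, so a product with one of each still receives a positive score.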

IEEE Intelligent Systems | 2014

MuSES: Multilingual Sentiment Elicitation System for Social Media Data

Yusheng Xie; Zhengzhang Chen; Kunpeng Zhang; Yu Cheng; Daniel Honbo; Ankit Agrawal; Alok N. Choudhary

A multilingual sentiment identification system (MuSES) implements three different sentiment identification algorithms. The first algorithm augments previous compositional semantic rules by adding rules specific to social media. The second algorithm defines a scoring function that measures the degree of a sentiment, instead of simply classifying a sentiment into binary polarities. All such scores are calculated based on a large volume of customer reviews. Due to the special characteristics of social media texts, a third algorithm takes emoticons, negation word position, and domain-specific words into account. In addition, a proposed label-free process transfers multilingual sentiment knowledge between different languages. The authors conduct their experiments on user comments from Facebook, tweets from Twitter, and multilingual product reviews from Amazon.

International Conference on Big Data | 2013

Elver: Recommending Facebook pages in cold start situation without content features

Yusheng Xie; Zhengzhang Chen; Kunpeng Zhang; Chen Jin; Yu Cheng; Ankit Agrawal; Alok N. Choudhary

Recommender systems are vital to the success of online retailers and content providers. One particular challenge in recommender systems is the “cold start” problem. The word “cold” refers to items that have not yet been rated by any user or to users who have not yet rated any items. We propose Elver to recommend and optimize page-interest targeting on Facebook. Existing techniques for cold recommendation mostly rely on content features when user ratings are lacking. Since it is very hard to construct universally meaningful features for the millions of Facebook pages, Elver makes minimal assumptions about content features. Elver employs an iterative matrix completion technique and a nonnegative factorization procedure to work with meagre content inklings. Experiments on Facebook data show the effectiveness of Elver at different levels of sparsity.

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2012

Sentiment identification by incorporating syntax, semantics and context information

Kunpeng Zhang; Yusheng Xie; Yu Cheng; Daniel Honbo; Doug Downey; Ankit Agrawal; Wei-keng Liao; Alok N. Choudhary

This paper proposes a method based on conditional random fields to incorporate sentence structure (syntax and semantics) and context information to identify sentiments of sentences within a document. It also proposes and evaluates two different active learning strategies for labeling sentiment data. The experiments with the proposed approach demonstrate a 5-15% improvement in accuracy on Amazon customer reviews compared to existing supervised learning and rule-based methods.

Conference on Information and Knowledge Management | 2012

Probabilistic macro behavioral targeting

Yusheng Xie; Yu Cheng; Daniel Honbo; Kunpeng Zhang; Ankit Agrawal; Alok N. Choudhary; Yi Gao; Jiangtao Gou

We investigate a class of emerging online marketing challenges in social networks, and we formally define macro behavioral targeting (MBT) as the marketing efforts that appeal to a massive targeted population with non-personalized broadcasting. Building on this problem formulation, we describe a probabilistic graphical model for MBT. In our model, we derive the prior distributions from scratch because existing applications of graphical models / Bayesian networks cannot fully capture the unique characteristics of MBT. In the derivation, we propose an approximation method to circumvent an intractable situation where order statistics need to be calculated from exponentially increasing computations. In the experiments, we present case studies on real Facebook data.

International Conference on Data Mining | 2011

Learning to Group Web Text Incorporating Prior Information

Yu Cheng; Kunpeng Zhang; Yusheng Xie; Ankit Agrawal; Wei-keng Liao; Alok N. Choudhary

Clustering similar items for web text has become increasingly important in many Web and Information Retrieval applications. For several kinds of web text data, it is much easier to obtain external information, other than textual features, that can be used to improve the performance of clustering analysis. This external information, called prior information, indicates label signs and pairwise constraints on sample points. We propose a unifying framework that can incorporate prior information about cluster membership for web text cluster analysis and develop a novel semi-supervised clustering model. The proposed framework offers several advantages over existing semi-supervised approaches. First, most previous work handles labeled data by converting it to pairwise constraints, which leads to much more computation. The proposed approach can handle pairwise constraints and labeled data simultaneously, so that the computation is greatly reduced. Second, the framework allows us to obtain this prior information automatically or with only a little human effort, making it possible to boost clustering performance relatively easily. We evaluated the proposed method on the real-world problems of automatically grouping online news feeds and web blog messages. Experimental results indicate that the proposed framework incorporating prior information can indeed lead to statistically significant clustering improvements over approaches with access only to textual features.

Meeting of the Association for Computational Linguistics | 2014

Active Learning with Constrained Topic Model

Yi Yang; Shimei Pan; Doug Downey; Kunpeng Zhang

Latent Dirichlet Allocation (LDA) is a topic modeling tool that automatically discovers topics from a large collection of documents. It is one of the most popular text analysis tools currently in use. In practice, however, the topics discovered by LDA do not always make sense to end users. In this extended abstract, we propose an active learning framework that interactively and iteratively acquires user feedback to improve the quality of learned topics. We conduct experiments to demonstrate its effectiveness with simulated user input on a benchmark dataset.

Advances in Social Networks Analysis and Mining | 2013

A probabilistic graphical model for brand reputation assessment in social networks

Kunpeng Zhang; Doug Downey; Zhengzhang Chen; Yusheng Xie; Yu Cheng; Ankit Agrawal; Wei-keng Liao; Alok N. Choudhary

Social media has become a popular platform that connects people who share information, in particular personal opinions. Through such a fast information exchange mechanism, the reputation of individuals, consumer products, or business companies can be quickly built up within a social network. Recently, applications that mine social network data have started to emerge, finding communities that share the same interests for marketing purposes. Knowing the reputation of social network entities, such as celebrities or business companies, can help develop better strategies for election campaigns or new product advertisements. In this paper, we propose a probabilistic graphical model to collectively measure the reputations of entities in social networks. By collecting and analyzing a large amount of user activity on Facebook, our model can effectively and efficiently rank entities, such as presidential candidates, professional sports teams, music bands, and companies, based on their social reputation. The proposed model produces results largely consistent with two publicly available systems, movie ranking in the Internet Movie Database and business school ranking by U.S. News & World Report, with correlation coefficients of 0.75 and -0.71, respectively.

International Joint Conference on Natural Language Processing | 2015

Reducing infrequent-token perplexity via variational corpora

Yusheng Xie; Pranjal Daga; Yu Cheng; Kunpeng Zhang; Ankit Agrawal; Alok N. Choudhary

The recurrent neural network (RNN) is recognized as a powerful language model (LM). We investigate its performance portfolio in more depth: it performs well on frequent grammatical patterns but much less so on less frequent terms. Such a portfolio is expected and desirable in applications like autocomplete, but is less useful in social content analysis, where many creative, unexpected usages occur (e.g., URL insertion). We adapt a generic RNN model and show that, with variational training corpora and epoch unfolding, the model improves its performance on the task of URL insertion suggestions.