Publication


Featured research published by Keke Cai.


international acm sigir conference on research and development in information retrieval | 2011

Social context summarization

Zi Yang; Keke Cai; Jie Tang; Li Zhang; Zhong Su; Juanzi Li

We study a novel problem of social context summarization for Web documents. Traditional summarization research has focused on extracting informative sentences from standard documents. With the rapid growth of online social networks, abundant user generated content (e.g., comments) associated with the standard documents is available. Which parts of a document do social users really care about? How can we generate summaries for standard documents by considering both the informativeness of sentences and the interests of social users? This paper explores such an approach by modeling Web documents and social contexts in a unified framework. We propose a dual wing factor graph (DWFG) model, which utilizes the mutual reinforcement between Web documents and their associated social contexts to generate summaries. An efficient algorithm is designed to learn the proposed factor graph model. Experimental results on a Twitter data set validate the effectiveness of the proposed model. By leveraging the social context information, our approach obtains significant improvement in summarization performance (on average, +5.0% to +17.3%) over several alternative methods (CRF, SVM, LR, PR, and DocLead).


Machine Learning | 2011

Topic level expertise search over heterogeneous networks

Jie Tang; Jing Zhang; Ruoming Jin; Zi Yang; Keke Cai; Li Zhang; Zhong Su

In this paper, we present a topic level expertise search framework for heterogeneous networks. Unlike traditional Web search engines, which perform retrieval and ranking at document level (or at object level), we investigate the problem of expertise search at topic level over heterogeneous networks. In particular, we study this problem in an academic search and mining system, which extracts and integrates academic data from the distributed Web. We present a unified topic model to simultaneously model topical aspects of different objects in the academic network. Based on the learned topic models, we investigate the expertise search problem from three dimensions: ranking, citation tracing analysis, and topical graph search. Specifically, we propose a topic level random walk method for ranking the different objects. In citation tracing analysis, we aim to uncover how a piece of work influences its follow-up work. Finally, we have developed a topical graph search function, based on the topic modeling and citation tracing analysis. Experimental results show that various expertise search and mining tasks can indeed benefit from the proposed topic level analysis approach.


international conference on data mining | 2011

Patent Maintenance Recommendation with Patent Information Network Model

Xin Jin; W. Scott Spangler; Ying Chen; Keke Cai; Rui Ma; Li Zhang; Xian Wu; Jiawei Han

Patents are of crucial importance for businesses, because they provide legal protection for the invented techniques, processes or products. A patent can be held for up to 20 years. However, large maintenance fees need to be paid to keep it enforceable. If the patent is deemed not valuable, the owner may decide to abandon it by ceasing to pay the maintenance fees, thereby reducing cost. For large companies or organizations, making such decisions is difficult because too many patents need to be investigated. In this paper, we introduce the new patent mining problem of automatic patent maintenance prediction, and propose a systematic solution that analyzes patents to recommend patent maintenance decisions. We model the patents as a heterogeneous time-evolving information network and propose new patent features to build a model for ranked prediction of whether to maintain or abandon a patent. In addition, a network-based refinement approach is proposed to further improve the performance. We have conducted experiments on the large scale United States Patent and Trademark Office (USPTO) database, which contains over four million granted patents. The results show that our technique can achieve high performance.


Web Intelligence and Agent Systems: An International Journal | 2010

Leveraging sentiment analysis for topic detection

Keke Cai; W. Scott Spangler; Ying Chen; Li Zhang

The emergence of new social media such as blogs, message boards, news, and web content in general has dramatically changed the ecosystems of corporations. Consumers, non-profit organizations, and other forms of communities are extremely vocal about their opinions and perceptions of companies and their brands on the web. The ability to leverage such “voice of the web” to gain consumer, brand, and market insights can be truly differentiating and valuable to today's corporations. In particular, one important form of insight can be derived from sentiment analysis of web content. Sentiment analysis traditionally emphasizes classification of web comments into positive, neutral, and negative categories. This paper goes beyond sentiment classification by focusing on techniques that can detect the topics that are highly correlated with the positive and negative opinions. Such techniques, when coupled with sentiment classification, can help business analysts understand both the overall sentiment scope and the drivers behind the sentiment. In this paper, we describe our overall sentiment analysis system, which consists of such sentiment analysis techniques, including a bootstrapping method for word polarity weighting, automatic filtering and expansion of domain words, and a sentiment classification method. We then detail a novel topic detection method using point-wise mutual information and term frequency distribution. We demonstrate the effectiveness of our overall approach via several case studies on different social media data sets.
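As a rough illustration of the point-wise mutual information idea mentioned in the abstract above, the following minimal sketch scores how strongly a term is associated with a sentiment label. It is not the authors' method: the function name `pmi_by_sentiment`, the document-level counting scheme, and the toy data are all assumptions made for this example, and the term frequency distribution component is not modeled.

```python
import math
from collections import Counter


def pmi_by_sentiment(docs):
    """Score term/label association with point-wise mutual information.

    docs: list of (tokens, label) pairs (hypothetical input format).
    Returns {(term, label): PMI in bits}, using document-level presence:
    PMI = log2( P(term, label) / (P(term) * P(label)) ).
    """
    n_docs = len(docs)
    label_counts = Counter(label for _, label in docs)
    term_counts = Counter()
    joint_counts = Counter()
    for tokens, label in docs:
        for term in set(tokens):  # count each term once per document
            term_counts[term] += 1
            joint_counts[(term, label)] += 1

    scores = {}
    for (term, label), joint in joint_counts.items():
        p_joint = joint / n_docs
        p_term = term_counts[term] / n_docs
        p_label = label_counts[label] / n_docs
        scores[(term, label)] = math.log2(p_joint / (p_term * p_label))
    return scores


# Toy comments, invented purely for illustration.
docs = [
    (["battery", "great"], "pos"),
    (["battery", "dies"], "neg"),
    (["screen", "great"], "pos"),
    (["battery", "weak"], "neg"),
]
scores = pmi_by_sentiment(docs)
```

In this toy data, "battery" receives a positive score for the negative label and a negative score for the positive label, which is the kind of signal a topic detector could use to surface drivers behind negative sentiment.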


international conference on data mining | 2009

Topic Distributions over Links on Web

Jie Tang; Jing Zhang; Jeffrey Xu Yu; Zi Yang; Keke Cai; Rui Ma; Li Zhang; Zhong Su

It is well known that Web users create links with different intentions. However, a key question, which is not well studied, is how to categorize the links and how to quantify the strength of the influence of one web page on another when a link connects the two. In this paper, we focus on the problem of link semantics analysis, and propose a novel supervised learning approach to build a model based on a link-labeled and link-weighted training graph, where a link-label represents the category of a link and a link-weight represents the influence of one web page on the other in a link. Based on the model built, we categorize links and quantify the influence of web pages on others in a large graph in the same application domain. We discuss our proposed approach, namely Pairwise Restricted Boltzmann Machines (PRBMs), and conduct extensive experimental studies to demonstrate the effectiveness of our approach using large real datasets.


IBM Journal of Research and Development | 2010

EagleEye: entity-centric business intelligence for smarter decisions

Li Zhang; Shenghua Bao; Honglei Guo; Huijia Zhu; Xiaoxun Zhang; Keke Cai; Ben Fei; Xian Wu; Zhenyu Guo; Zhong Su

This paper describes EagleEye, which is an intelligent system that provides business intelligence through advanced data mining and text analytics. Unlike traditional search engines, EagleEye is entity oriented, and an entity can be an organization, a person, or a place. Given an entity name, the basic function of EagleEye is to generate a consolidated view of the entity information it gathers from many disparate data sources and to organize and categorize it, and automatically detect entity relationships. EagleEye can also analyze the opinions of entities, evaluate whether they are positive or negative, and provide insight into many aspects of consumer sentiment toward product brands. This type of information can enable enterprises to manage the reputation of their brands and to respond more quickly to changes in the marketplace. We present the key technologies--such as entity-name grouping, entity-relation extraction, and entity-oriented opinion mining--that were developed to support these functions. EagleEye has been successfully deployed to a number of clients across a variety of industries in China. Several case studies are presented to demonstrate in practice the capability and business value of EagleEye.


conference on information and knowledge management | 2016

BigNet 2016: First Workshop on Big Network Analytics

Jie Tang; Keke Cai; Zhong Su; Hanghang Tong; Michalis Vazirgiannis; Yang Yang

The first ACM international workshop on big network analytics is held in Indianapolis, Indiana, USA on October 24, 2016 and co-located with the 25th ACM Conference on Information and Knowledge Management (CIKM). The main objective of the workshop is to provide a forum for presenting the most recent advances in mining big networks to unearth rich knowledge. It is related to information retrieval, Web mining, social network analysis, and computational advertising. The anticipated outcome includes a fruitful discussion about the emerging challenges in this field, the development of novel theories for mining big networks, and the motivation of interesting applications. The broader anticipated outcome includes fostering future research directions, publishing high quality papers, attracting new researchers to this field, and producing concrete solutions to existing problems.


web age information management | 2014

How Do People Communicate through Different Social Connections

Keke Cai; Yu Zhao; Jie Tang; Li Zhang; Zhong Su

This paper presents a comparison study to identify the communication patterns of people through different social connections. Advances in technology have brought many communication channels into daily life, such as e-mail, blogs and micro-blogs, and mobile telecommunication. Now and in the future, it is going to be critical that people use multiple channels of communication to reach others. Understanding people’s choice of communication channels is becoming quite important. In this paper, we select two of the most significant channels for comparison: online social networks, with Twitter as the representative, and mobile telecommunication. A corresponding social network is constructed for each communication channel. Based on these networks, we conduct a series of investigations, including temporal analysis, geographical analysis, and topological analysis. Overall, we find that people’s communication through different channels differs in various aspects.


conference on information and knowledge management | 2010

Understanding retweeting behaviors in social networks

Zi Yang; Jingyi Guo; Keke Cai; Jie Tang; Juanzi Li; Li Zhang; Zhong Su


Archive | 2009

SYSTEMS AND METHODS FOR DETECTING SENTIMENT-BASED TOPICS

Keke Cai; Ying Chen; W. Scott Spangler; Li Zhang
