Riddhiman Ghosh
Hewlett-Packard
Publication
Featured research published by Riddhiman Ghosh.
knowledge discovery and data mining | 2013
Arjun Mukherjee; Abhinav Kumar; Bing Liu; Junhui Wang; Meichun Hsu; Malu Castellanos; Riddhiman Ghosh
Opinionated social media such as product reviews are now widely used by individuals and organizations for decision making. However, for profit or fame, people try to game the system through opinion spamming (e.g., writing fake reviews) to promote or demote target products. In recent years, fake review detection has attracted significant attention from both the business and research communities. However, because the human labeling needed for supervised learning and evaluation is difficult, the problem remains highly challenging. This work proposes a novel angle on the problem by modeling spamicity as latent. An unsupervised model, called the Author Spamicity Model (ASM), is proposed. It works in the Bayesian setting, which facilitates modeling the spamicity of authors as latent and allows us to exploit various observed behavioral footprints of reviewers. The intuition is that opinion spammers have different behavioral distributions than non-spammers. This creates a distributional divergence between the latent population distributions of two clusters: spammers and non-spammers. Model inference results in learning the population distributions of the two clusters. Several extensions of ASM that leverage different priors are also considered. Experiments on a real-life Amazon review dataset demonstrate the effectiveness of the proposed models, which significantly outperform state-of-the-art competitors.
conference on information and knowledge management | 2013
Zhiyuan Chen; Arjun Mukherjee; Bing Liu; Meichun Hsu; Malu Castellanos; Riddhiman Ghosh
Topic models have been widely used to discover latent topics in text documents. However, they may produce topics that are not interpretable for an application. Researchers have proposed incorporating prior domain knowledge into topic models to help produce coherent topics. The knowledge used in existing models is typically domain-dependent and assumed to be correct. However, one key weakness of this knowledge-based approach is that it requires the user to know the domain very well and to be able to provide knowledge suitable for the domain, which is not always the case because in most real-life applications, the user wants to find what they do not know. In this paper, we propose a framework to leverage general knowledge in topic models. Such knowledge is domain-independent. Specifically, we use one form of general knowledge, i.e., lexical semantic relations of words such as synonyms, antonyms and adjective attributes, to help produce more coherent topics. However, there is a major obstacle: a word can have multiple meanings/senses, and each meaning often has a different set of synonyms and antonyms. Not every meaning is suitable or correct for a domain, and wrong knowledge can result in poor-quality topics. To deal with wrong knowledge, we propose a new model, called GK-LDA, which is able to effectively exploit the knowledge of lexical relations in dictionaries. To the best of our knowledge, GK-LDA is the first such model that can incorporate domain-independent knowledge. Our experiments using online product reviews show that GK-LDA performs significantly better than existing state-of-the-art models.
international world wide web conferences | 2009
Riddhiman Ghosh; Mohamed Dekhil
In this paper we describe techniques for the discovery and construction of user profiles. Leveraging the emergent data web, our system addresses the problem of sparseness of user profile information currently faced by both asserted and inferred profile systems. Our prototype implementation employs a profile mediator that dynamically builds the most suitable user profile for a particular service or interaction in real time.
international conference on management of data | 2011
Malu Castellanos; Umeshwar Dayal; Meichun Hsu; Riddhiman Ghosh; Mohamed Dekhil; Yue Lu; Lei Zhang; Mark Schreiman
The rise of Web 2.0, with its increasingly popular social sites like Twitter, Facebook, blogs and review sites, has motivated people to express their opinions publicly and more frequently than ever before. This has fueled the emerging field known as sentiment analysis, whose goal is to translate the vagaries of human emotion into hard data. LCI is a social channel analysis platform that taps into what is being said to understand the sentiment, with the particular ability to do so in near real time. LCI integrates novel algorithms for sentiment analysis and a configurable dashboard with different kinds of charts, including dynamic ones that change as new data is ingested. LCI has been researched and prototyped at HP Labs in close interaction with the Business Intelligence Solutions (BIS) Division and a few customers. This paper presents an overview of the architecture and some of its key components and algorithms, focusing in particular on how LCI deals with Twitter and illustrating its capabilities with selected use cases.
international world wide web conferences | 2008
Riddhiman Ghosh; Mohamed Dekhil
In this paper, we discuss challenges and provide solutions for capturing and maintaining accurate models of user profiles using semantic web technologies, by aggregating and sharing distributed fragments of user profile information spread over multiple services. Our framework for profile management allows for evolvable, extensible and expressive user profiles. We have implemented a prototype, targeting the retail domain, on the HP Labs Retail Store Assistant.
extending database technology | 2012
Malu Castellanos; Meichun Hsu; Umeshwar Dayal; Riddhiman Ghosh; Mohamed Dekhil; Carlos Ceja; Marcial Puchi; Perla Ruiz
The rapid proliferation of online forums has made it possible for people to share their intentions, wishes and experiences by posting comments with the aim of getting advice from other members of the forum. Extracting intentions from these comments provides valuable insight for companies, which can exploit it to gain a competitive edge. However, given the very large volume of such online comments, manually extracting intentions is impractical, time-consuming and expensive. Companies need tools that analyze the text to extract intentions and details about them. In this paper we propose to demo one such tool, called Intention Insider, which has been developed at HP Labs in close collaboration with business units and a few selected customers. The tool can ingest content from online forums or from uploaded files and quickly sift through very large amounts of comments to extract intention information. This information is loaded into a data warehouse to be correlated with other structured data and queried to produce interactive reports and dynamic visualizations that facilitate its exploration at detailed and aggregate levels.
human factors in computing systems | 2008
Jhilmil Jain; Riddhiman Ghosh; Mohamed Dekhil
In this paper we present a prototype for capturing retail-related consumer intent using multiple devices and multimodal input formats such as text, audio, and still images. The prototype was used in a longitudinal user study to analyze the process that consumers go through in order to make purchasing decisions. Based on these findings, we recommend desirable features for information management systems specifically designed for the retail environment.
conference on information and knowledge management | 2013
Hyun Duk Kim; Malu Castellanos; Meichun Hsu; ChengXiang Zhai; Umeshwar Dayal; Riddhiman Ghosh
In this paper, we propose a novel opinion summarization problem called compact explanatory opinion summarization (CEOS), which aims to extract within-sentence explanatory text segments from input opinionated texts to help users better understand the detailed reasons behind sentiments. We propose and study general methods for identifying candidate boundaries and scoring the explanatoriness of text segments using Hidden Markov Models. We create new data sets and use a new evaluation measure to evaluate CEOS. Experimental results show that the proposed methods are effective for generating an explanatory opinion summary, outperforming a standard text summarization method.
international conference on human computer interaction | 2009
Jhilmil Jain; Riddhiman Ghosh; Mohamed Dekhil
In this paper we present a prototype for creating shopping lists using multiple input devices such as desktops, smartphones, and landline or cell phones, and in multimodal formats such as structured text, audio, still images, video, unstructured text and annotated media. The prototype was used by 10 participants in a two-week longitudinal study. The goal was to analyze the process that users go through in order to create and manage shopping-related projects. Based on these findings, we recommend desirable features for personal information management systems specifically designed for managing collaborative shopping lists.
international world wide web conferences | 2011
Malu Castellanos; Riddhiman Ghosh; Yue Lu; Lei Zhang; Perla Ruiz; Mohamed Dekhil; Umeshwar Dayal; Meichun Hsu
The rise of Twitter, blogs, review sites and social sites has motivated people to express their opinions publicly and more frequently than ever before. This has fueled the emerging field known as sentiment analysis, whose goal is to translate the vagaries of human emotion into hard data. LivePulse is a tool that taps into the growing business interest in what is being said online, with the particular characteristic of doing so in real time. LivePulse integrates novel algorithms for sentiment analysis and a configurable dashboard with different kinds of dynamic charts that change as new data is ingested. It also provides support to drill down and visually explore the sentiment scores to understand how they were computed and what emotions are expressed about a given aspect or topic. Our tool has been researched and prototyped at HP Labs in close interaction with internal and external customers, whose valuable feedback has been crucial for improving the tool. This paper presents an overview of LivePulse's architecture and functionality, and illustrates how it would be demoed.