Manish A. Bhide | Researchain

Publication

Featured researches published by Manish A. Bhide.

very large data bases | 2012

Real time discovery of dense clusters in highly dynamic graphs: identifying real world events in highly dynamic environments

Manoj K. Agarwal; Krithi Ramamritham; Manish A. Bhide

Due to their real-time nature, microblog streams are a rich source of dynamic information, for example, about emerging events. Existing techniques for discovering such events from a microblog stream in real time (such as Twitter trending topics) have several lacunae when used for discovering emerging events; extant graph-based event detection techniques are not practical in microblog settings due to their complexity; and conventional techniques, which have been developed for blogs, web pages, etc., and involve the use of keyword search, are only useful for finding information about known events. Hence, in this paper, we present techniques to discover events that are unfolding in microblog message streams in real time so that such events can be reported as soon as they occur. We model the problem as discovering dense clusters in highly dynamic graphs. Despite many recent advances in graph analysis, ours is the first technique to identify dense clusters in massive and highly dynamic graphs in real time. Given the characteristics of microblog streams, in order to find clusters without missing any events, we propose and exploit a novel graph property which we call the short-cycle property. Our algorithms find these clusters efficiently in spite of rapid changes to the microblog streams. Further, we present a novel ranking function to identify the important events. Besides proving the correctness of our algorithms, we show their practical utility by evaluating them using real-world microblog data. These evaluations demonstrate our technique's ability to discover, with high precision and recall, emerging events in high-intensity data streams in real time. Many recent web applications create data which can be represented as massive dynamic graphs. Our technique can be easily extended to discover, in real time, interesting patterns in such graphs.

international conference on data engineering | 2008

Enhanced Business Intelligence using EROCS

Manish A. Bhide; Venkat Chakravarthy; Ajay Gupta; Himanshu Gupta; Mukesh K. Mohania; Kriti Puniyani; Prasan Roy; Sourashis Roy; Vibhuti S. Sengar

The EROCS technology automatically links unstructured data with relevant structured data from an external relational database. We demonstrate how EROCS can be used for enhancing business intelligence by allowing OLAP tools to analyze structured and unstructured data in a consolidated manner. Our demonstration showcases the use of EROCS in exploiting latent information in customer emails, which helps in building a complete view of the customer. This results in new insights about the business which are not possible with the existing state of the art.

international conference on management of data | 2007

LIPTUS: associating structured and unstructured information in a banking environment

Manish A. Bhide; Ajay Gupta; Rahul Gupta; Prasan Roy; Mukesh K. Mohania; Zenita Ichhaporia

Growing competition has made today's banks understand the value of knowing their customers better. In this paper, we describe a tool, LIPTUS, that associates customer interactions (emails and transcribed phone calls) with customer and account profiles stored in an existing data warehouse. The associations discovered by LIPTUS enable analytics spanning the customer and account profiles on one hand and the meta-data associated with or derived from the interaction (using text mining techniques) on the other. We illustrate the value derived from this consolidated analysis through specific customer intelligence applications. LIPTUS is today being extensively used in a large bank in India. A highlight of this paper is a discussion of the technical challenges encountered while building LIPTUS and deploying it on real-life customer data.

international conference on autonomic computing | 2004

Policy framework for automatic data management

Manish A. Bhide; Ajay Gupta; Mukul Joshi; Mukesh K. Mohania; Shree Raman

The popularity of e-business has led to exponential and unstructured growth in the application space, coupled with an increase in database size. This has led to an increase in the complexity of the database management task. Moreover, organizations are increasingly concerned about the privacy of data. Thus, managing such a large, ever-growing, and privacy-preserving database is a complex and time-consuming task. In this paper, we describe a policy-based framework for autonomic database management using business objects. Our system automatically manages data based on events.

very large data bases | 2009

XPEDIA: XML processing for data integration

Manish A. Bhide; Manoj K. Agarwal; Amir Bar-or; Sriram Padmanabhan; Srinivas K. Mittapalli; Girish Venkatachaliah

Data integration engines increasingly need to provide sophisticated processing options for XML data. In the past, it was adequate for these engines to support basic shredding and XML generation capabilities. However, with the steady growth of XML in applications and databases, integration platforms need to provide more direct operations on XML as well as improve the scalability and efficiency of these operations. In this paper, we describe a robust and comprehensive framework for performing Extract-Transform-Load (ETL) of XML. This includes (i) a full computational model and engine capabilities to perform these operations in an ETL flow, (ii) an approach to pushing down XML operations into a database engine capable of supporting XML processing, and (iii) methods to apply partitioning techniques to provide scalable, parallel processing for large XML documents. We describe experimental results showing the effectiveness of these techniques.
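
For readers unfamiliar with the term, "basic shredding" generally means decomposing an XML document into flat relational rows. The sketch below illustrates only that baseline idea using Python's standard library; it is not the XPEDIA framework, and the `orders`/`order`/`item` element names and the `shred_orders` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

def shred_orders(xml_text):
    """Flatten a hypothetical <orders> document into (order_id, sku, qty) rows."""
    root = ET.fromstring(xml_text)
    rows = []
    for order in root.findall("order"):
        order_id = order.get("id")
        # Each nested <item> becomes one relational row keyed by its parent order.
        for item in order.findall("item"):
            rows.append((order_id, item.get("sku"), int(item.get("qty"))))
    return rows

# Hypothetical input document, invented for illustration.
doc = """<orders>
  <order id="1"><item sku="A" qty="2"/><item sku="B" qty="1"/></order>
  <order id="2"><item sku="A" qty="5"/></order>
</orders>"""

rows = shred_orders(doc)
for row in rows:
    print(row)
```

Rows produced this way can be loaded into an ordinary table, which is the kind of basic capability the abstract describes as no longer sufficient on its own.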

international conference on ubiquitous information management and communication | 2008

New trends in information integration

Mukesh K. Mohania; Manish A. Bhide

The growth of organizations invariably leads to the creation of multiple isolated data sources which are totally disconnected from each other. This leads to reduced efficiency and a lack of complete knowledge of the enterprise and its customers while making critical business decisions. This is the classical information integration problem, which has become the biggest pain point for enterprises today. Information integration has received considerable attention from researchers in academia as well as industry in the recent past. Information integration refers to the category of middleware which lets applications access data as though it were in a single database. It enables the integration of data and content sources so as to provide real-time read and write access, transform data for business analysis and data interchange, and place data for performance, currency and availability. The integration problem arises mainly due to the complex and heterogeneous environments in the enterprise. Typical example applications of information integration include: integrating transcribed call records with the data warehouse, integrating patient reports with the relational data about the patients, integrating mobile phone call description records made from different regions, integrating inventory and store information from different warehouses, etc.

international conference on data engineering | 2009

Keyword Search over Dynamic Categorized Information

Manish A. Bhide; Venkatesan T. Chakaravarthy; Krithi Ramamritham; Prasan Roy

Consider an information repository whose content is categorized. A data item (in the repository) can belong to multiple categories and new data is continuously added to the system. In this paper, we describe a system, CS*, which takes a keyword query and returns the relevant top-K categories. In contrast, traditional keyword search returns the top-K documents (i.e., data items) relevant to a user query. The need to dynamically categorize new data and also update the meta-data required for fast responses to user queries poses interesting challenges. The brute-force approach of updating the meta-data by comparing each new data item with all the categories is impractical due to (i) the large cost involved in finding the categories associated with a data item and (ii) the high rate of arrival of new data items. We show that a sampling-based approach which provides statistical guarantees on the reported results is also impracticable. We hence develop the CS* approach, whose effectiveness results from its ability to focus on a strategically chosen subset of categories on the one hand and a subset of new data on the other. Given a query, CS* finds the top-K categories with high accuracy even in time-constrained situations. An experimental evaluation of the CS* system using real-world data shows that it can easily achieve accuracy in excess of 90%, whereas other approaches demand at least 57% more resources (i.e., processing power) to provide similar results. Our experimental results also show that, contrary to expectations, if the rate of arrival of data items doubles, CS* continues to provide high accuracy without a significant increase in resources, whereas other approaches require more than double the resources.
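
To make the difference between top-K categories and top-K documents concrete, here is a minimal brute-force sketch that scores every item against the query and aggregates the scores per category. This is the kind of exhaustive baseline the abstract calls impractical at scale, not the CS* approach itself; the scoring rule (count of matching words), the function name, and the sample data are assumptions made for illustration.

```python
from collections import defaultdict

def top_k_categories(items, query, k):
    """Rank categories by the total keyword-match score of their items."""
    terms = set(query.lower().split())
    scores = defaultdict(int)
    for text, categories in items:
        # Assumed score: number of word occurrences matching a query term.
        match = sum(1 for word in text.lower().split() if word in terms)
        if match == 0:
            continue
        # An item may belong to several categories; credit each of them.
        for category in categories:
            scores[category] += match
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [category for category, _ in ranked[:k]]

# Hypothetical repository: (item text, categories the item belongs to).
items = [
    ("stream query processing", ["databases", "streams"]),
    ("xml query engine", ["databases", "xml"]),
    ("graph clustering", ["graphs"]),
]

result = top_k_categories(items, "query", 2)
print(result)
```

Every call rescans all items, so the cost grows with both the repository size and the arrival rate of new data, which is the bottleneck the abstract describes.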

international conference on distributed computing systems | 2007

Efficient Execution of Continuous Incoherency Bounded Queries over Multi-Source Streaming Data

Manish A. Bhide; Krithi Ramamritham; Mukund Agrawal

On-line decision making often involves query processing over time-varying data which arrives in the form of data streams from distributed locations. In such environments, a user application is typically interested in the value of some function defined over the data items. For example, a traffic management system can make control decisions based on the observed traffic at major intersections; stock investors can manage their investments based on the value of their portfolios. In this paper, we present a system that supports pull-based data refresh and query processing techniques where such queries access data from multiple distributed sources. Key challenges in supporting such Continuous Multi-Data Incoherency Bounded Queries lie in minimizing network and source overheads without loss of fidelity in the query responses provided to users. We address these challenges by using mathematically sound approaches based on Gradient Descent and Constraint Optimization, which allow us to adapt the refresh frequencies of the dynamically changing data and adjust the quality of service provided to different users.
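
As a rough illustration of what an incoherency bound means for a multi-item query, the sketch below assumes a linear portfolio-style query (a weighted sum of item values). If each cached value is within a per-item bound of its source, the query result is off by at most the weighted sum of those bounds. The function names and numbers are hypothetical, and the paper's Gradient Descent and Constraint Optimization methods for choosing refresh frequencies are not shown.

```python
def query_value(weights, values):
    """Evaluate the assumed linear query: sum of weight * value."""
    return sum(w * v for w, v in zip(weights, values))

def worst_case_query_error(weights, item_bounds):
    """Upper bound on the query error when item i is stale by at most item_bounds[i]."""
    return sum(abs(w) * b for w, b in zip(weights, item_bounds))

def needs_refresh(cached, fresh, bound):
    """True when the cached value has drifted beyond its per-item bound."""
    return abs(fresh - cached) > bound

# Hypothetical portfolio: 10 shares of one stock and 5 of another.
weights = [10, 5]
item_bounds = [0.5, 1.0]   # allowed staleness per stock price
query_bound = 10.0         # incoherency the user tolerates on the portfolio value

error = worst_case_query_error(weights, item_bounds)
print(error <= query_bound)
```

Looser per-item bounds mean fewer pulls from the sources but a larger worst-case error, which is the trade-off between overhead and fidelity that the abstract refers to.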

european symposium on research in computer security | 2005

A generic XACML based declarative authorization scheme for java

Rajeev Gupta; Manish A. Bhide

Security and authorization play a very important role in the development, deployment and functioning of software systems. With Java being the most popular platform for component-based software and systems, Java security plays a key role in enterprise systems. The major drawback in the security support provided by J2EE and J2SE is the absence of a standard way to support instance-level access control. JAAS does provide some help, but it is not without its share of problems. XACML, the newest security-related standard, provides a simple, standard way to represent security policies. In this paper, we propose a unique way to extend JAAS technology so that it can support class-instance-level access control in a declarative manner. We then showcase how this extension can be molded into the XACML architecture, thereby providing an end-to-end, standards-based access control specification and implementation for J2SE and J2EE applications. The major advantage of our technique is that, being declarative, it does not require any change to the security code when either the security policies are changed or the security infrastructure is deployed in a new environment.

international conference on data engineering | 2003

Towards bringing database management task in the realm of IT non-experts

Ajay Gupta; Manish A. Bhide; Mukesh K. Mohania

Internet-enabled services have led to an increase in the size and complexity of databases, making the database administration task very complex. This necessitates the hiring of skilled personnel for the management of data, which is a bane for industries, especially in developing countries. We at IBM India Research Lab have been developing technologies (i.e., database administration tools) wherein the database administrator, who may be an inexperienced person, can define error-free policies for managing and maintaining the distributed information repositories, and can define data access privileges for users at fine-grained as well as coarse-grained levels using natural-language-like constructs. These policies represent the actions that need to be carried out when specific database/temporal events occur within the system, or occur externally and are notified to the system. In this paper, we discuss the architecture of our policy-based database administration and access control system and outline some of the technical challenges in this area.