Vasudha Bhatnagar | Researchain

Publication

Featured research published by Vasudha Bhatnagar.

Computing | 2015

An overview of the commercial cloud monitoring tools: research dimensions, design issues, and state-of-the-art

Khalid Alhamazani; Rajiv Ranjan; Karan Mitra; Fethi A. Rabhi; Prem Prakash Jayaraman; Samee Ullah Khan; Adnene Guabtni; Vasudha Bhatnagar

Cloud monitoring involves dynamically tracking the Quality of Service (QoS) parameters related to virtualized resources (e.g., VMs, storage, network, appliances), the physical resources they share, the applications running on them, and the data hosted on them. Configuring applications and resources in a cloud computing environment is challenging given the large number of heterogeneous cloud resources. Further, at any given point in time, the cloud resource configuration (number of VMs, types of VMs, number of appliance instances, etc.) may need to change to meet application QoS requirements under uncertainties (resource failure, resource overload, workload spike, etc.). Hence, cloud monitoring tools can assist cloud providers or application developers in: (i) keeping their resources and applications operating at peak efficiency, (ii) detecting variations in resource and application performance, (iii) accounting for service level agreement violations of certain QoS parameters, and (iv) tracking the leave and join operations of cloud resources due to failures and other dynamic configuration changes. In this paper, we identify and discuss the major research dimensions and design issues related to engineering cloud monitoring tools. We further discuss how these research dimensions and design issues are handled by current academic research as well as by commercial monitoring tools.

Data Warehousing and Knowledge Discovery | 1999

K-means Clustering Algorithm for Categorical Attributes

Shyam K. Gupta; K Sambasiva Rao; Vasudha Bhatnagar

Efficient partitioning of large data sets into homogeneous clusters is a fundamental problem in data mining. Hierarchical clustering methods are not adaptable because of their high computational complexity. K-means based algorithms give promising results for their efficiency. However, their use is often limited to numeric data. The quality of the clusters produced depends on the initialization of the clusters and the order in which the data are processed. The proposed algorithm is based on the K-means philosophy but removes the numeric data limitation.
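As a rough illustration of how a K-means style loop can be carried over to categorical attributes, the sketch below replaces Euclidean distance with a simple mismatch count and replaces cluster means with per-attribute modes. This is a generic k-modes-style sketch, not the algorithm from the paper; the function names, the toy records, and the choice of the first k distinct records as initial centers are all assumptions made for the example.

```python
from collections import Counter

def mismatch(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))

def mode_of(records):
    """Per-attribute most frequent value (ties go to the first value seen)."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))

def cluster_categorical(records, k, max_iter=20):
    # Assumption: seed the centers with the first k distinct records.
    centers = []
    for r in records:
        if r not in centers:
            centers.append(r)
        if len(centers) == k:
            break
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest center under the mismatch count.
        new_labels = [
            min(range(len(centers)), key=lambda j: mismatch(r, centers[j]))
            for r in records
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Update step: each center becomes the mode of its members.
        for j in range(len(centers)):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:
                centers[j] = mode_of(members)
    return labels, centers

records = [
    ("red", "small", "round"),
    ("blue", "large", "square"),
    ("red", "small", "oval"),
    ("blue", "large", "round"),
    ("red", "medium", "round"),
]
labels, centers = cluster_categorical(records, k=2)
print(labels)   # cluster index per record
print(centers)  # one mode vector per cluster
```

Because the seeds are taken from the start of the input, reordering `records` can change the result, which mirrors the sensitivity to initialization and processing order noted in the abstract.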

Publications of the Astronomical Society of the Pacific | 2010

Results from the Supernova Photometric Classification Challenge

Richard Kessler; Bruce A. Bassett; Pavel Belov; Vasudha Bhatnagar; Heather Campbell; A. Conley; Joshua A. Frieman; Alexandre Glazov; S. González-Gaitán; Renée Hlozek; Saurabh W. Jha; Stephen Kuhlmann; Martin Kunz; Hubert Lampeitl; Ashish A. Mahabal; James Newling; Robert C. Nichol; David Parkinson; Ninan Sajeeth Philip; Dovi Poznanski; Joseph W. Richards; Steven A. Rodney; Masao Sako; Donald P. Schneider; Maximilian D. Stritzinger; Melvin Varughese

We report results from the Supernova Photometric Classification Challenge (SNPhotCC), a publicly released mix of simulated supernovae (SNe), with types (Ia, Ibc, and II) selected in proportion to their expected rates. The simulation was realized in the griz filters of the Dark Energy Survey (DES) with realistic observing conditions (sky noise, point-spread function, and atmospheric transparency) based on years of recorded conditions at the DES site. Simulations of non-Ia-type SNe are based on spectroscopically confirmed light curves that include unpublished non-Ia samples donated from the Carnegie Supernova Project (CSP), the Supernova Legacy Survey (SNLS), and the Sloan Digital Sky Survey-II (SDSS-II). A spectroscopically confirmed subset was provided for training. We challenged scientists to run their classification algorithms and report a type and photo-z for each SN. Participants from 10 groups contributed 13 entries for the sample that included a host-galaxy photo-z for each SN and nine entries for the sample that had no redshift information. Several different classification strategies resulted in similar performance, and for all entries the performance was significantly better for the training subset than for the unconfirmed sample. For the spectroscopically unconfirmed subset, the entry with the highest average figure of merit for classifying SNe Ia has an efficiency of 0.96 and an SN Ia purity of 0.79. As a public resource for the future development of photometric SN classification and photo-z estimators, we have released updated simulations with improvements based on our experience from the SNPhotCC, added samples corresponding to the Large Synoptic Survey Telescope (LSST) and the SDSS-II, and provided the answer keys so that developers can evaluate their own analysis.
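The efficiency and purity quoted above can be computed from a list of true and predicted types. The sketch below uses the usual definitions (efficiency as the fraction of true SNe Ia that are selected, purity as the fraction of selected objects that are true SNe Ia); the function name and the toy labels are hypothetical, and the challenge's own figure of merit is not reproduced here.

```python
def efficiency_and_purity(true_types, predicted_types, target="Ia"):
    """Return (efficiency, purity) for one target class."""
    pairs = list(zip(true_types, predicted_types))
    true_pos = sum(1 for t, p in pairs if t == target and p == target)
    n_target = sum(1 for t, _ in pairs if t == target)    # all true targets
    n_selected = sum(1 for _, p in pairs if p == target)  # all objects tagged as target
    efficiency = true_pos / n_target if n_target else 0.0
    purity = true_pos / n_selected if n_selected else 0.0
    return efficiency, purity

# Toy labels, invented for illustration only.
true_types = ["Ia", "Ia", "Ia", "Ia", "II", "Ibc", "II", "II"]
predicted = ["Ia", "Ia", "Ia", "II", "Ia", "Ibc", "II", "II"]
eff, pur = efficiency_and_purity(true_types, predicted)
print(eff, pur)
```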

Data Science Journal | 2006

The impact of data mining techniques on medical diagnostics

Siri Krishan Wasan; Vasudha Bhatnagar; Harleen Kaur

Medical data mining has great potential for exploring the hidden patterns in the data sets of the medical domain. These patterns can be utilized for clinical diagnosis. However, the available raw medical data are widely distributed, heterogeneous in nature, and voluminous. These data need to be collected in an organized form. The collected data can then be integrated to form a hospital information system. Data mining technology provides a user-oriented approach to novel and hidden patterns in the data. Data mining and statistics both strive towards discovering patterns and structures in data. Statistics deals with heterogeneous numbers only, whereas data mining deals with heterogeneous fields. We identify a few areas of healthcare where these techniques can be applied to healthcare databases for knowledge discovery. In this paper we briefly examine the impact of data mining techniques, including artificial neural networks, on medical diagnostics.

International Database Engineering and Applications Symposium | 2006

PBIRCH: A Scalable Parallel Clustering algorithm for Incremental Data

Ashwani Garg; Ashish Mangla; Neelima Gupta; Vasudha Bhatnagar

We present a parallel version of BIRCH with the objective of enhancing scalability without compromising the quality of clustering. The incoming data is distributed in a cyclic manner (or a block-cyclic manner if the data is bursty) to balance the load among processors. The algorithm is implemented on a message-passing, shared-nothing model. Experiments show that for very large data sets the algorithm scales nearly linearly with the increasing number of processors. Experiments also show that the clusters obtained by PBIRCH are comparable to those obtained using BIRCH.
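The cyclic and block-cyclic distribution mentioned above can be sketched as a mapping from arrival index to processor. This is only an illustration of the two assignment schemes under assumed names (`cyclic_owner`, `distribute`, `block_size`); it does not model message passing or the BIRCH tree itself.

```python
def cyclic_owner(index, num_procs, block_size=1):
    """Processor that receives the point with the given arrival index.

    block_size=1 gives plain cyclic distribution; a larger block_size
    gives block-cyclic distribution (whole blocks are dealt out in turn).
    """
    return (index // block_size) % num_procs

def distribute(points, num_procs, block_size=1):
    """Split an incoming sequence into one list per processor."""
    buckets = [[] for _ in range(num_procs)]
    for i, p in enumerate(points):
        buckets[cyclic_owner(i, num_procs, block_size)].append(p)
    return buckets

stream = list(range(8))
print(distribute(stream, num_procs=3))                # cyclic
print(distribute(stream, num_procs=3, block_size=2))  # block-cyclic
```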

Knowledge and Information Systems | 2014

Clustering data streams using grid-based synopsis

Vasudha Bhatnagar; Sharanjit Kaur; Sharma Chakravarthy

Continually advancing technology has made it feasible to capture data online for onward transmission as a steady flow of newly generated data points, termed a data stream. The continuity and unboundedness of data streams make storing the data and scanning it multiple times impractical for the purpose of knowledge discovery. The need to learn structures from data in a streaming environment has been a driving force in making clustering a popular technique for knowledge discovery from data streams. The continuous nature of streaming data makes it infeasible to look for point membership among the clusters discovered so far, necessitating the use of a synopsis structure to consolidate incoming data points. This synopsis is exploited to build a clustering scheme that meets subsequent user demands. The proposed Exclusive and Complete Clustering (ExCC) algorithm captures non-overlapping clusters in data streams with mixed attributes, such that each point either belongs to some cluster or is an outlier/noise. The algorithm is robust, adapts to changes in data distribution, and detects succinct outliers on the fly. It deploys a fixed-granularity grid structure as the synopsis and performs clustering by coalescing dense regions in the grid. Speed-based pruning is applied to the synopsis prior to clustering to ensure the currency of the discovered clusters. Extensive experimentation supports these claims of robustness, on-the-fly outlier detection, and adaptivity to changes in the data distribution. The ExCC algorithm is further evaluated for performance and compared with other contemporary algorithms.
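The idea of a fixed-granularity grid synopsis with clustering by coalescing dense cells can be sketched as follows. This is a minimal sketch for numeric points only, not ExCC itself: it omits mixed attributes and speed-based pruning, and the class name, `cell_width`, and `min_points` threshold are assumptions made for the example.

```python
from collections import defaultdict
from itertools import product

class GridSynopsis:
    """Counts of stream points per fixed-width grid cell."""

    def __init__(self, cell_width):
        self.cell_width = cell_width
        self.counts = defaultdict(int)

    def cell_of(self, point):
        return tuple(int(x // self.cell_width) for x in point)

    def insert(self, point):
        # Each point is seen once and only its cell count is kept.
        self.counts[self.cell_of(point)] += 1

    def clusters(self, min_points):
        """Coalesce adjacent dense cells into clusters (sets of cells)."""
        dense = {c for c, n in self.counts.items() if n >= min_points}
        seen, result = set(), []
        for start in sorted(dense):
            if start in seen:
                continue
            stack, group = [start], set()
            seen.add(start)
            while stack:
                cell = stack.pop()
                group.add(cell)
                # Visit every neighbouring cell, including diagonals.
                for offset in product((-1, 0, 1), repeat=len(cell)):
                    nb = tuple(c + o for c, o in zip(cell, offset))
                    if nb in dense and nb not in seen:
                        seen.add(nb)
                        stack.append(nb)
            result.append(group)
        return result

grid = GridSynopsis(cell_width=1.0)
for p in [(0.2, 0.3), (0.5, 0.5), (1.1, 0.4), (1.6, 0.9),
          (5.2, 5.1), (5.7, 5.5), (9.0, 0.1)]:
    grid.insert(p)
found = grid.clusters(min_points=2)
print(found)
```

In this toy run the point at (9.0, 0.1) falls in a sparse cell and belongs to no cluster, which is how a grid-based scheme can treat it as an outlier.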

Knowledge and Information Systems | 2005

Architecture for knowledge discovery and knowledge management

Shyam K. Gupta; Vasudha Bhatnagar; Siri Krishan Wasan

In this paper, we propose the I-MIN model for knowledge discovery and knowledge management in evolving databases. The model splits the KDD process into three phases. The schema designed during the first phase abstracts the generic mining requirements of the KDD process and provides a mapping between the generic KDD process and (user) specific KDD subprocesses. The generic process is executed periodically during the second phase, and windows of condensed knowledge called knowledge concentrates are created. During the third phase, which corresponds to actual mining by the end users, specific KDD subprocesses are invoked to mine knowledge concentrates. The model provides a set of mining operators for the development of mining applications to discover and renew, preserve and reuse, and share knowledge for effective knowledge management. These operators can be invoked either by using a declarative query language or by writing applications. The architectural proposal emulates a DBMS-like environment for the managers, administrators and end users in the organization. Knowledge management functions, like sharing and reuse of the discovered knowledge among the users and periodic updating of the discovered knowledge, are supported. Complete documentation and control of all the KDD endeavors in an organization are facilitated by the I-MIN model. This helps in structuring and streamlining the KDD operations in an organization.

Data Warehousing and Knowledge Discovery | 2004

Novelty Framework for Knowledge Discovery in Databases

Ahmed Sultan Al-Hegami; Vasudha Bhatnagar; Naveen Kumar

Knowledge Discovery in Databases (KDD) is an iterative process that aims at extracting interesting, previously unknown and hidden patterns from huge databases. The use of objective measures of interestingness in popular data mining algorithms often leads to another data mining problem, although of reduced complexity. A reduction in the volume of the discovered rules is desirable in order to improve the efficiency of the overall KDD process. Subjective measures of interestingness are required to achieve this. In this paper we study the novelty of the discovered rules as a subjective measure of interestingness. We propose a framework to quantify the novelty of the discovered rules in terms of their deviations from the known rules. The computations are carried out using the importance that the user gives to different deviations. The computed degree of novelty is then compared with a user-given threshold to report novel rules to the user. We implement the proposed framework and experiment with some public datasets. The experimental results are quite promising.

Progress in Artificial Intelligence | 2014

Accuracy-diversity based pruning of classifier ensembles

Vasudha Bhatnagar; Manju Bhardwaj; Shivam Sharma; Sufyan Haroon

Classification ensemble methods have recently drawn serious attention due to their ability to appreciably improve prediction performance. Since smaller ensembles are preferred for storage and efficiency reasons, ensemble pruning is an important step in the construction of classifier ensembles. In this paper, we propose a heuristic method to obtain an optimal ensemble from a given pool of classifiers. The proposed accuracy–diversity based pruning algorithm takes into account the accuracy of individual classifiers as well as the pairwise diversity amongst these classifiers. The algorithm performs a systematic bottom-up search and conditionally grows sub-ensembles by adding diverse pairs of classifiers to the candidates with relatively higher accuracies. The ultimate aim is to deliver the smallest ensemble with the highest achievable accuracy in the pool. The performance study on UCI datasets demonstrates that the proposed algorithm rarely misses the optimal ensemble, thus establishing confidence in the quality of the heuristics employed by the algorithm.
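The two quantities the pruning heuristic weighs, individual accuracy and pairwise diversity, can be illustrated on stored predictions. The sketch below uses the disagreement rate as the diversity measure, which is one common choice and an assumption here; the paper's own measure and search procedure are not reproduced, and the prediction vectors are invented.

```python
def accuracy(predictions, truth):
    """Fraction of instances a single classifier labels correctly."""
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)

def disagreement(pred_a, pred_b):
    """Pairwise diversity: fraction of instances where two classifiers differ."""
    return sum(a != b for a, b in zip(pred_a, pred_b)) / len(pred_a)

# Toy validation labels and classifier outputs, for illustration only.
truth = [1, 0, 1, 1, 0]
clf_a = [1, 0, 1, 0, 0]
clf_b = [1, 1, 1, 1, 0]

print(accuracy(clf_a, truth), accuracy(clf_b, truth))
print(disagreement(clf_a, clf_b))
```

Here both classifiers are equally accurate but err on different instances, which is the situation in which combining them can help.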

Data Warehousing and Knowledge Discovery | 2010

Mining closed itemsets in data stream using formal concept analysis

Anamika Gupta; Vasudha Bhatnagar; Naveen Kumar

Mining of frequent closed itemsets has been shown to be more efficient than mining frequent itemsets for generating non-redundant association rules. The task is challenging in a data stream environment because of the unbounded nature and no-second-look characteristics of the stream. In this paper, we propose an algorithm, CLICI, for mining all recent closed itemsets in the landmark window model of an online data stream. The algorithm consists of an online component, which processes the transactions arriving in the stream without candidate generation and updates the synopsis appropriately. The offline component is invoked on demand to mine all frequent closed itemsets. The user can explore and experiment by specifying the support threshold dynamically. The synopsis, CILattice, stores all recent closed itemsets in the stream. It is based on the Concept Lattice, a core structure of Formal Concept Analysis (FCA). Storing closed itemsets in the form of a lattice facilitates the generation of non-redundant association rules, which is the main motivation behind using a lattice-based synopsis. Experimental evaluation using synthetic and real-life datasets demonstrates the scalability of the algorithm.
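For readers unfamiliar with the term, an itemset is closed when no proper superset occurs in exactly the same transactions. The sketch below checks this on a small static batch using the closure operator (the intersection of all transactions containing the itemset). It is a definition-level illustration only, not CLICI or the CILattice synopsis, and the function names and transactions are made up.

```python
def closure(itemset, transactions):
    """Intersection of all transactions that contain the itemset.

    Returns None when no transaction contains it (support is zero).
    """
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        return None
    return frozenset.intersection(*covering)

def is_closed(itemset, transactions):
    """An itemset is closed when it equals its own closure."""
    return closure(itemset, transactions) == itemset

transactions = [
    frozenset("abc"),
    frozenset("ab"),
    frozenset("ac"),
    frozenset("b"),
]

print(is_closed(frozenset("a"), transactions))  # 'a' occurs without any fixed companion
print(closure(frozenset("c"), transactions))    # 'c' always occurs together with 'a'
```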