Mukesh K. Mohania | Researchain

Publication

Featured research published by Mukesh K. Mohania.

Data and Knowledge Engineering | 2001

Active data warehouses: complementing OLAP with analysis rules

Thomas Thalhammer; Michael Schrefl; Mukesh K. Mohania

Conventional data warehouses are passive: all tasks related to analysing data and making decisions must be carried out manually by analysts. Today's data warehouse and OLAP systems offer little support for automating decision tasks that occur frequently and for which well-established decision procedures are available. Such functionality can be provided by extending the conventional data warehouse architecture with analysis rules, which mimic the work of an analyst during decision making. Analysis rules extend the basic event/condition/action (ECA) rule structure with mechanisms to analyse data multidimensionally and to make decisions. The resulting architecture is called an active data warehouse.
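The event/condition/action structure described in this abstract can be illustrated with a minimal sketch. It is not the paper's rule language: the class `AnalysisRule`, the helper names, the toy cube and the threshold of 100 units are all assumptions made for illustration. The condition aggregates a small two-dimensional cube along one dimension, which stands in for the multidimensional analysis an analyst would perform.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A toy "cube": (product, region) -> units sold. The data is invented.
Cube = Dict[Tuple[str, str], int]


@dataclass
class AnalysisRule:
    """Hypothetical ECA-style rule: fires on `event`, analyses the cube, acts."""
    event: str
    condition: Callable[[Cube], List[str]]  # returns the products to act on
    action: Callable[[str], str]            # returns a decision for one product


def units_by_product(cube: Cube) -> Dict[str, int]:
    """Roll the cube up over the region dimension."""
    totals: Dict[str, int] = {}
    for (product, _region), units in cube.items():
        totals[product] = totals.get(product, 0) + units
    return totals


def low_sellers(cube: Cube, threshold: int = 100) -> List[str]:
    """Condition: products whose total sales fall below an assumed threshold."""
    totals = units_by_product(cube)
    return sorted(p for p, total in totals.items() if total < threshold)


def fire(rule: AnalysisRule, event: str, cube: Cube) -> List[str]:
    """Evaluate the rule only when its triggering event occurs."""
    if event != rule.event:
        return []
    return [rule.action(product) for product in rule.condition(cube)]


cube = {
    ("tea", "north"): 40, ("tea", "south"): 30,
    ("coffee", "north"): 90, ("coffee", "south"): 80,
}
rule = AnalysisRule("end_of_quarter", low_sellers, lambda p: f"withdraw {p}")
decisions = fire(rule, "end_of_quarter", cube)
print(decisions)
```

Keeping the condition as a function over the cube means the same rule shape can carry any aggregation, which is the part of the idea that goes beyond a plain database trigger.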

Conference on Information and Knowledge Management | 2008

Minimum-effort driven dynamic faceted search in structured databases

Senjuti Basu Roy; Haidong Wang; Gautam Das; Ullas Nambiar; Mukesh K. Mohania

In this paper, we propose minimum-effort driven navigational techniques for enterprise database systems based on the faceted search paradigm. Our proposed techniques dynamically suggest facets for drilling down into the database such that the cost of navigation is minimized. At every step, the system asks the user a question or a set of questions on different facets and depending on the user response, dynamically fetches the next most promising set of facets, and the process repeats. Facets are selected based on their ability to rapidly drill down to the most promising tuples, as well as on the ability of the user to provide desired values for them. Our facet selection algorithms also work in conjunction with any ranked retrieval model where a ranking function imposes a bias over the user preferences for the selected tuples. Our methods are principled as well as efficient, and our experimental study validates their effectiveness on several application scenarios.
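One simple way to picture facet selection is sketched below. It is a generic heuristic, not the cost model of the paper: it assumes the user picks each facet value with probability proportional to how many tuples carry it, and it ignores both the user's ability to answer and any ranking function. The function names and the sample rows are hypothetical.

```python
from collections import Counter
from typing import Dict, List

Row = Dict[str, str]


def expected_remaining(rows: List[Row], facet: str) -> float:
    """Expected number of tuples left after the user fixes a value for `facet`.

    Assumes a value v is chosen with probability count(v) / n, after which
    count(v) tuples remain, so the expectation is sum(count(v)^2) / n.
    """
    counts = Counter(row[facet] for row in rows)
    n = len(rows)
    return sum(c * c for c in counts.values()) / n


def best_facet(rows: List[Row], facets: List[str]) -> str:
    """Pick the facet that narrows the result set the most in expectation."""
    return min(facets, key=lambda f: expected_remaining(rows, f))


rows = [
    {"make": "A", "color": "red"},
    {"make": "A", "color": "blue"},
    {"make": "B", "color": "red"},
    {"make": "C", "color": "red"},
]
print(best_facet(rows, ["make", "color"]))
```

Asking about `make` leaves 1.5 tuples on average under these assumptions, against 2.5 for `color`, so the sketch asks about `make` first and would then repeat the choice on the remaining tuples.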

Symposium on Principles of Database Systems | 2007

Decision trees for entity identification: approximation algorithms and hardness results

Venkatesan T. Chakaravarthy; Vinayaka Pandit; Sambuddha Roy; Pranjal Awasthi; Mukesh K. Mohania

We consider the problem of constructing decision trees for entity identification from a given relational table. The input is a table containing information about a set of entities over a fixed set of attributes and a probability distribution over the set of entities that specifies the likelihood of the occurrence of each entity. The goal is to construct a decision tree that identifies each entity unambiguously by testing the attribute values such that the average number of tests is minimized. This classical problem finds such diverse applications as efficient fault detection, species identification in biology, and efficient diagnosis in the field of medicine. Prior work mainly deals with the special case where the input table is binary and the probability distribution over the set of entities is uniform. We study the general problem involving arbitrary input tables and arbitrary probability distributions over the set of entities. We consider a natural greedy algorithm and prove an approximation guarantee of O(r_K · log N), where N is the number of entities and K is the maximum number of distinct values of an attribute. The value r_K is a suitably defined Ramsey number, which is at most log K. We show that it is NP-hard to approximate the problem within a factor of Ω(log N), even for binary tables (i.e. K = 2). Thus, for the case of binary tables, our approximation algorithm is optimal up to constant factors (since r_2 = 2). In addition, our analysis indicates a possible way of resolving a Ramsey-theoretic conjecture by Erdős.
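The problem statement can be made concrete with a small greedy sketch. The abstract does not spell out the greedy rule, so the rule used here is an assumption: at each node, test the attribute that separates the largest number of entity pairs still under consideration. The function names and the toy table are hypothetical, and the sketch makes no claim to reproduce the paper's approximation guarantee.

```python
from itertools import combinations


def pairs_separated(rows, ids, attr):
    """Number of entity pairs in `ids` that take different values on `attr`."""
    return sum(1 for a, b in combinations(ids, 2) if rows[a][attr] != rows[b][attr])


def build_tree(rows, ids, n_attrs):
    """Return a leaf (an entity index) or a node (attr, {value: subtree})."""
    if len(ids) == 1:
        return ids[0]
    # Assumed greedy rule: the attribute that distinguishes the most pairs.
    best = max(range(n_attrs), key=lambda a: pairs_separated(rows, ids, a))
    if pairs_separated(rows, ids, best) == 0:
        raise ValueError("two entities agree on every attribute")
    groups = {}
    for i in ids:
        groups.setdefault(rows[i][best], []).append(i)
    return (best, {v: build_tree(rows, g, n_attrs) for v, g in groups.items()})


def expected_tests(tree, probs, depth=0):
    """Average number of attribute tests, weighted by entity probability."""
    if not isinstance(tree, tuple):
        return probs[tree] * depth
    _attr, children = tree
    return sum(expected_tests(child, probs, depth + 1) for child in children.values())


# Toy table: three entities over two binary attributes.
rows = [(0, 0), (0, 1), (1, 0)]
probs = [0.25, 0.25, 0.5]
tree = build_tree(rows, [0, 1, 2], n_attrs=2)
print(tree, expected_tests(tree, probs))
```

On this table the tree tests attribute 0 first, which identifies the most likely entity after one test, and the average cost is 1.5 tests. Note that the selection rule in the sketch looks only at pairs and uses the probabilities solely to evaluate the finished tree.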

Information Sciences | 2002

Mobile data and transaction management

Sanjay Kumar Madria; Mukesh K. Mohania; Sourav S. Bhowmick; Bharat K. Bhargava

The mobile computing paradigm has emerged due to advances in wireless or cellular networking technology. This rapidly expanding technology poses many challenging research problems in the area of mobile database systems. Mobile users can access information independent of their physical location through wireless connections. However, accessing and manipulating information without restricting users to specific locations complicates data processing activities. There are computing constraints that make mobile database processing different from wired distributed database computing. In this paper, we survey the fundamental research challenges particular to mobile database computing, review some of the proposed solutions and identify some of the upcoming research challenges. We discuss interesting research areas, which include mobile location data management, transaction processing and broadcast, cache management and replication, and query processing. We highlight new upcoming research directions in mobile digital libraries, mobile data warehousing, mobile workflow, and mobile web and e-commerce.

Conference on Information and Knowledge Management | 2008

Efficient techniques for document sanitization

Venkatesan T. Chakaravarthy; Himanshu Gupta; Prasan Roy; Mukesh K. Mohania

Sanitization of a document involves removing sensitive information from the document, so that it may be distributed to a broader audience. Such sanitization is needed while declassifying documents involving sensitive or confidential information such as corporate emails, intelligence reports, medical records, etc. In this paper, we present the ERASE framework for performing document sanitization in an automated manner. ERASE can be used to sanitize a document dynamically, so that different users get different views of the same document based on what they are authorized to know. We formalize the problem and present algorithms used in ERASE for finding the appropriate terms to remove from the document. Our preliminary experimental study demonstrates the efficiency and efficacy of the proposed algorithms.

Electronic Commerce and Web Technologies | 2002

A Model for XML Schema Integration

Kalpdrum Passi; Louise Lane; Sanjay Kumar Madria; Bipin C. Sakamuri; Mukesh K. Mohania; Sourav S. Bhowmick

We define an object-oriented data model called XSDM (XML Schema Data Model) and present a graphical representation of XML Schema integration. The integration process comprises three layers: pre-integration, comparison and integration. During pre-integration, each schema expressed in XML Schema notation is read and converted into XSDM notation. During the comparison phase, correspondences as well as conflicts between elements are identified. During the integration phase, conflict resolution, restructuring and merging of the initial schemas take place to obtain the global schema.

International Conference on Big Data | 2012

Cloud Computing and Big Data Analytics: What Is New from Databases Perspective?

Rajeev Gupta; Himanshu Gupta; Mukesh K. Mohania

Many industries, such as telecom, health care, retail, pharmaceutical, financial services, etc., generate large amounts of data. Gaining critical business insights by querying and analyzing such massive amounts of data is becoming the need of the hour. The warehouses and solutions built around them are unable to provide reasonable response times in handling expanding data volumes. One can either perform analytics on large volumes of data once every few days, or perform transactions on small amounts of data in seconds. With the new requirements, one needs to ensure real-time or near real-time response for huge amounts of data. In this paper we outline challenges in analyzing big data, both data at rest and data in motion. For big data at rest we describe two kinds of systems: (1) NoSQL systems for interactive data serving environments; and (2) systems for large-scale analytics based on the MapReduce paradigm, such as Hadoop. The NoSQL systems are designed to have a simpler key-value based data model with in-built sharding; hence, they work seamlessly in a distributed cloud-based environment. In contrast, one can use Hadoop-based systems to run long-running decision support and analytical queries consuming and possibly producing bulk data. For processing data in motion, we present use cases and illustrative algorithms of data stream management systems (DSMS). We also illustrate applications which can use these two kinds of systems to quickly process massive amounts of data.
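The MapReduce paradigm mentioned in this abstract can be sketched in a few lines of single-process Python. This is only an illustration of the map, shuffle and reduce steps; it does not use Hadoop or any distributed runtime, and the function names and the call records are invented for the example.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and stream out (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group values by key, as the framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


# Hypothetical telecom call records: (subscriber, minutes).
calls = [("alice", 3), ("bob", 5), ("alice", 4)]
pairs = map_phase(calls, lambda rec: [(rec[0], rec[1])])
totals = reduce_phase(shuffle(pairs), lambda key, values: sum(values))
print(totals)
```

In a real deployment the map and reduce calls would run on many machines and the shuffle would move data across the network; the sketch keeps only the data flow.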

Archive | 1999

Advances in Database Technologies

Yahiko Kambayashi; Dik Lun Lee; Ee-Peng Lim; Mukesh K. Mohania; Yoshifumi Masunaga

In this paper, we propose a fuzzy attribute-oriented induction method for knowledge discovery in relational databases. This method is adapted from the DBLearn system by representing background knowledge with fuzzy thesauri and fuzzy labels. These models make it possible to take into account the inherent imprecision and uncertainty of the domain representation. We also show the power of fuzzy thesauri and linguistic variables to describe gradations in the generalization process and to handle exceptions.

International Conference on Conceptual Modeling | 1998

Recent Advances and Research Problems in Data Warehousing

Sunil Samtani; Mukesh K. Mohania; Vijay Kumar; Yahiko Kambayashi

In recent years, the database community has witnessed the emergence of a new technology, namely data warehousing. A data warehouse is a global repository that stores pre-processed queries on data which resides in multiple, possibly heterogeneous, operational or legacy sources. The information stored in the data warehouse can be easily and efficiently accessed for making effective decisions. On-Line Analytical Processing (OLAP) tools access data from the data warehouse for complex data analysis, such as multidimensional data analysis, and decision support activities. Current research has led to new developments in all aspects of data warehousing; however, there are still a number of problems that need to be solved to make data warehousing effective. In this paper, we discuss recent developments in data warehouse modelling, view maintenance, and parallel query processing. A number of technical issues for exploratory research are presented and possible solutions are discussed.

Acta Informatica | 2007

On the equivalence between FDs in XML and FDs in relations

Millist W. Vincent; Jixue Liu; Mukesh K. Mohania

With the growing use of the eXtensible Markup Language (XML) in database technology as a format for the permanent storage of data, the topic of functional dependencies in XML (XFDs) has assumed increased importance because of its central role in database design. Recently, two different approaches have been proposed for defining an XFD. The first uses the concept of a ‘tree tuple’, whereas the second uses the concept of a ‘closest node’. In general, the two approaches are not comparable, but are comparable when a Document Type Definition is present and there is no missing information in the XML document. The first contribution of this article shows that when the two XFD definitions are comparable, the definitions are equivalent, and so there is essentially a common definition of an XFD in complete XML documents. The second contribution is to provide justification for the definition of a ‘closest node’ XFD. We show that if a complete flat relation is mapped to an XML document by an arbitrary sequence of nest operations, the XML document satisfies a ‘closest node’ XFD if and only if the relation satisfies the corresponding functional dependency. The class of XML documents generated in this fashion is a subset of the class of XML documents for which the two definitions of XFDs coincide. Hence ‘tree tuple’ and ‘closest node’ XFDs both capture the semantics of FDs when a complete relation is mapped to an XML document via arbitrary nesting.
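The relational side of this result can be illustrated with a short sketch: a check for a functional dependency in a complete flat relation, and a single nest operation of the kind used to map the relation toward a nested document. The XFD definitions themselves ('tree tuple' and 'closest node') are not implemented here. The function names, the representation of rows as dictionaries and the sample data are assumptions for the example.

```python
def satisfies_fd(rows, lhs, rhs):
    """True if no two rows agree on the `lhs` attributes but differ on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, val) != val:
            return False
    return True


def nest(rows, attr):
    """Group rows that agree on all attributes except `attr`.

    The values of `attr` within each group are collected into a list,
    which is the nested (set-valued) attribute.
    """
    groups = {}
    for row in rows:
        rest = tuple(sorted((k, v) for k, v in row.items() if k != attr))
        groups.setdefault(rest, []).append(row[attr])
    return [dict(rest, **{attr: values}) for rest, values in groups.items()]


rows = [
    {"emp": "e1", "dept": "d1", "proj": "p1"},
    {"emp": "e1", "dept": "d1", "proj": "p2"},
    {"emp": "e2", "dept": "d2", "proj": "p1"},
]
print(satisfies_fd(rows, ["emp"], ["dept"]))
print(nest(rows, "proj"))
```

Here `emp -> dept` holds in the flat relation, and nesting on `proj` yields one entry per employee with a list of projects, which is the kind of structure a nested XML document would then represent.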
