Publication


Featured research published by Inderpal Narang.


international conference on data engineering | 2008

Constant-Time Query Processing

Vijayshankar Raman; Garret Swart; Lin Qiao; Frederick R. Reiss; Vijay Dialani; Donald Kossmann; Inderpal Narang; Richard S. Sidle

Query performance in current systems depends significantly on tuning: how well the query matches the available indexes, materialized views, etc. Even in a well-tuned system, there are always some queries that take much longer than others. This frustrates users, who increasingly want consistent response times to ad hoc queries. We argue that query processors should instead aim for constant response times for all queries, with no assumption about tuning. We present Blink, our first attempt at this goal, that runs every query as a table scan over a fully denormalized database, with hash group-by done along the way. To make this scan efficient, Blink uses a novel compression scheme that horizontally partitions tuples by frequency, thereby compressing skewed data almost down to entropy, even while producing long runs of fixed-length, easily parseable values. We also present a scheme for evaluating a conjunction of range and equality predicates in SIMD fashion over compressed tuples, and different schemes for efficient hash-based aggregation within the L2 cache. An experimental study with a suite of arbitrary single-block SQL queries over a TPC-H-like schema suggests that constant-time queries can be efficient.
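
The frequency-partitioning idea at the heart of Blink's compression can be illustrated with a short sketch. The Python below is a minimal rendering, with hypothetical bank widths of 1, 2, and 4 bits (the real system derives widths from the measured value distribution and includes further machinery): values are ranked by frequency, the most common ones land in the narrowest bank, and every code within a bank has the same fixed length, which is what keeps the scan cheap.

```python
from collections import Counter

def build_banks(column, widths=(1, 2, 4)):
    """Partition a column's distinct values by frequency into banks.

    Bank i holds at most 2**widths[i] values, each encoded with a
    fixed widths[i]-bit code: frequent values get short codes, so
    skewed data compresses toward its entropy while codes within a
    bank stay fixed-length and easy to parse.
    """
    ranked = [v for v, _ in Counter(column).most_common()]
    banks, start = [], 0
    for bits in widths:
        chunk = ranked[start:start + 2 ** bits]
        if not chunk:
            break
        banks.append((bits, {v: code for code, v in enumerate(chunk)}))
        start += len(chunk)
    return banks

def encode(value, banks):
    """Return (bank_number, fixed-width code) for a value."""
    for bank_no, (bits, codes) in enumerate(banks):
        if value in codes:
            return bank_no, codes[value]
    raise KeyError(value)

# Example: a skewed column where 'US' dominates.
column = ["US"] * 90 + ["DE"] * 6 + ["FR"] * 3 + ["NZ"]
banks = build_banks(column)
print(encode("US", banks))   # -> (0, 0): the 1-bit bank
print(encode("NZ", banks))   # -> (1, 1): rare value, wider bank
```

On the skewed column above, the dominant value costs only one bit per occurrence, which is how such data approaches its entropy even though every code in a given bank has the same width.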


international conference on management of data | 2002

Coordinating backup/recovery and data consistency between database and file systems

Suparna Bhattacharya; C. Mohan; Karen W. Brannon; Inderpal Narang; Hui-I Hsiao; Mahadevan Subramanian

Managing a combined store consisting of database data and file data in a robust and consistent manner is a challenge for database systems and content management systems. In such a hybrid system, images, videos, engineering drawings, etc. are stored as files on a file server, while meta-data referencing/indexing such files is created and stored in a relational database to take advantage of efficient search. In this paper we describe solutions for two potentially problematic aspects of such a data management system: backup/recovery and data consistency. We present algorithms for performing backup and recovery of the DBMS data in a coordinated fashion with the files on the file servers. Our algorithms for coordinated backup and recovery have been implemented in the IBM DB2/DataLinks product [1]. We also propose an efficient solution to the problem of maintaining consistency between the content of a file and the associated meta-data stored in the DBMS, from a reader's point of view, without holding long-duration locks on meta-data tables. In this model, an object is directly accessed and edited in place through normal file system APIs using a reference obtained via an SQL query on the database. To relate file modifications to meta-data updates, the user issues an update through the DBMS and commits both file and meta-data updates together.
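
The paper's protocol is specific to DB2/DataLinks, but the reader-side idea, validating rather than locking, can be sketched generically. The snippet below is an optimistic-validation sketch, not the DataLinks API: it assumes, purely for illustration, that each file carries a version stamp that the metadata row records at commit time, so a reader can detect a concurrent update and retry instead of holding a long-duration lock on the meta-data table.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE meta (name TEXT PRIMARY KEY, path TEXT, version INTEGER)")

# Stand-in for the file server: path -> (version_stamp, content).
file_server = {}

def commit_update(name, path, content):
    """Writer: install new file content, then record the matching
    version stamp in the metadata row in a short transaction."""
    new_version = db.execute(
        "SELECT COALESCE(MAX(version), 0) + 1 FROM meta").fetchone()[0]
    file_server[path] = (new_version, content)          # file update
    with db:                                            # short metadata txn
        db.execute("INSERT OR REPLACE INTO meta VALUES (?, ?, ?)",
                   (name, path, new_version))

def read_consistent(name, retries=3):
    """Reader: no long-duration lock on the metadata table. Validate
    the file's version stamp against the metadata row and retry if a
    concurrent update slipped in between the two reads."""
    for _ in range(retries):
        path, version = db.execute(
            "SELECT path, version FROM meta WHERE name = ?",
            (name,)).fetchone()
        file_version, content = file_server[path]
        if file_version == version:
            return content
    raise RuntimeError("could not obtain a consistent read")

commit_update("drawing-42", "/files/d42", b"revision 1")
print(read_consistent("drawing-42"))
```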


international conference on management of data | 1994

ARIES/CSA: a method for database recovery in client-server architectures

C. Mohan; Inderpal Narang

This paper presents an algorithm, called ARIES/CSA (Algorithm for Recovery and Isolation Exploiting Semantics for Client-Server Architectures), for performing recovery correctly in client-server (CS) architectures. In CS, the server manages the disk version of the database. The clients, after obtaining database pages from the server, cache them in their buffer pools. Clients perform their updates on the cached pages and produce log records. The log records are buffered locally in virtual storage and later sent to the single log at the server. ARIES/CSA supports write-ahead logging (WAL), fine-granularity (e.g., record) locking, partial rollbacks, and flexible buffer management policies like steal and no-force. It does not require that the clocks on the clients and the server be synchronized. Checkpointing by the server and the clients allows for flexible and easier recovery.
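
A minimal sketch of the client-side log buffering and the write-ahead-logging rule described above, with all names hypothetical and LSN assignment, checkpointing, and rollback simplified away: a client may buffer log records in local memory, but must ship the records covering a page to the server's single log before that page itself is written out.

```python
class ServerLog:
    def __init__(self):
        self.records = []          # the single log at the server
    def append(self, recs):
        self.records.extend(recs)

class Client:
    def __init__(self, server_log):
        self.server_log = server_log
        self.buffer = []           # log records buffered locally
        self.page_lsn = {}         # page -> LSN of its latest update
        self.next_lsn = 0

    def update(self, page):
        """Log an update locally; the record is NOT yet at the server."""
        lsn = self.next_lsn
        self.next_lsn += 1
        self.buffer.append((lsn, page))
        self.page_lsn[page] = lsn
        return lsn

    def force_log(self, up_to_lsn):
        """Ship buffered records with LSN <= up_to_lsn to the server."""
        send = [r for r in self.buffer if r[0] <= up_to_lsn]
        self.buffer = [r for r in self.buffer if r[0] > up_to_lsn]
        self.server_log.append(send)

    def write_page(self, page, disk):
        # WAL rule: every log record describing the page must reach
        # the server's log before the page itself is written out.
        self.force_log(self.page_lsn[page])
        disk[page] = self.page_lsn[page]   # LSN stands in for contents

log = ServerLog()
c = Client(log)
c.update("P1"); c.update("P1")
disk = {}
c.write_page("P1", disk)
# The server's log now covers every update recorded on the page.
assert max(lsn for lsn, _ in log.records) >= disk["P1"]
```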


extending database technology | 1992

Efficient Locking and Caching of Data in the Multisystem Shared Disks Transaction Environment

C. Mohan; Inderpal Narang

This paper describes a technique for use when multiple instances of a data base management system (DBMS), each with its own cache (buffer pool), can directly read and modify any data stored on a set of shared disks. Global locking and coherency control protocols are necessary in this context for assuring transaction consistency and for maintaining coherency of the data cached in the multiple caches. The coordination amongst the systems is performed by a set of local lock managers (LLMs) and a global lock manager (GLM). This typically involves sending messages. We describe a technique, called LP locking, which saves locking calls when the granularity of locking by transactions is the same as the granularity of caching by the cache manager. The savings are gained by making the LLMs hide from the GLM the distinction between a transaction lock, called the L lock, and a cache-ownership lock, called the P lock, for the same object. The L and P locks for an object, though distinct at an LLM, are known as a single lock at the GLM. An LLM can grant an L or P lock request on an object locally if the combined lock mode of the L and P locks already held on that object by that LLM is equal to or higher than the requested mode. Such optimizations save messages between the LLMs and the GLM. Our ideas apply also to the client-server environment which has become very popular in the OODBMS area and to the distributed shared memory environment.
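
The local-grant rule is stated directly in the abstract and lends itself to a short sketch. The Python below uses a simplified mode ordering NONE < S < X (the real protocol supports richer modes and full conflict checking at the GLM): an LLM grants an L or P request locally whenever the combined mode of the L and P locks it already holds on the object covers the requested mode, saving a message to the GLM.

```python
NONE, S, X = 0, 1, 2   # ordered lock modes

class GlobalLockManager:
    def __init__(self):
        self.table = {}    # obj -> {llm: mode}; one lock per (obj, LLM)
    def grant(self, obj, llm, mode):
        owners = self.table.setdefault(obj, {})
        # Conflict checking against other LLMs is omitted in this sketch.
        owners[llm] = max(owners.get(llm, NONE), mode)

class LocalLockManager:
    def __init__(self, glm):
        self.glm = glm
        self.held = {}     # obj -> {'L': mode, 'P': mode}

    def request(self, obj, kind, mode):
        """kind is 'L' (transaction lock) or 'P' (cache-ownership lock)."""
        locks = self.held.setdefault(obj, {'L': NONE, 'P': NONE})
        combined = max(locks.values())    # combined mode of L and P
        if combined >= mode:
            # Local grant: no message to the GLM, which already sees
            # this LLM holding a single lock of sufficient mode.
            locks[kind] = max(locks[kind], mode)
            return "granted locally"
        # Otherwise escalate; the GLM sees L and P as one lock.
        self.glm.grant(obj, self, mode)
        locks[kind] = mode
        return "granted via GLM"

glm = GlobalLockManager()
llm = LocalLockManager(glm)
print(llm.request("page7", 'P', S))   # first touch: message to GLM
print(llm.request("page7", 'L', S))   # covered by held P lock: local
```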


very large data bases | 2009

Autonomic query parallelization using non-dedicated computers: an evaluation of adaptivity options

Norman W. Paton; Jorge Buenabad-Chávez; Mengsong Chen; Vijayshankar Raman; Garret Swart; Inderpal Narang; Daniel M. Yellin; Alvaro A. A. Fernandes

Writing parallel programs that can take advantage of non-dedicated processors is much more difficult than writing such programs for networks of dedicated processors. In a non-dedicated environment such programs must use autonomic techniques to respond to the unpredictable load fluctuations that prevail in the computational environment. In adaptive query processing (AQP), several techniques have been proposed for dynamically redistributing processor load assignments throughout a computation to take account of varying resource capabilities, but we know of no previous study that compares their performance. This paper presents a simulation-based evaluation of these autonomic parallelization techniques in a uniform environment and compares how well they improve the performance of the computation. Four published strategies are compared with a new algorithm that seeks to overcome some weaknesses identified in the existing approaches. In addition, we explore the use of techniques from online algorithms to provide a firm foundation for determining when to adapt in two of the existing algorithms. The evaluations identify situations in which each strategy may be used effectively and in which it should be avoided.
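
The strategies compared in the paper differ in when and how they adapt, but they share one underlying move: re-dividing the remaining work according to each machine's recently observed throughput. A generic sketch of that move (not any one of the published algorithms):

```python
def redistribute(remaining_work, observed_rates):
    """Split remaining work units among workers in proportion to the
    throughput each has recently demonstrated."""
    total_rate = sum(observed_rates.values())
    workers = sorted(observed_rates)
    shares, allocated = {}, 0
    for w in workers[:-1]:
        shares[w] = int(remaining_work * observed_rates[w] / total_rate)
        allocated += shares[w]
    shares[workers[-1]] = remaining_work - allocated   # remainder
    return shares

# Worker B sits on a loaded, non-dedicated machine and has slowed
# down, so the next period assigns it a smaller slice.
print(redistribute(1000, {"A": 50.0, "B": 10.0, "C": 40.0}))
# -> {'A': 500, 'B': 100, 'C': 400}
```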


international conference on management of data | 1993

An efficient and flexible method for archiving a data base

C. Mohan; Inderpal Narang

We describe an efficient method for supporting incremental and full archiving of data bases (e.g., individual files). Customers archive their data bases quite frequently to minimize the duration of data outage. Because of the growing sizes of data bases and the ever increasing need for high availability of data, the efficiency of the archive copy utility is very important. The method presented here minimizes interference with concurrent transactions by not acquiring any locks on the data being copied. It significantly reduces disk I/Os by not keeping on data pages any extra tracking information in connection with archiving. These features make the archive copy operation more efficient in terms of resource consumption compared to other methods. The method is also flexible in that it optionally supports direct copying of data from disks, bypassing the DBMS's buffer pool. This reduces buffer pool pollution and processing overheads, and allows the utility to take advantage of device geometries for efficiently retrieving data. We also describe extensions to the method to accommodate the multisystem shared disks transaction environment. The method gracefully tolerates system failures during the archive copy operation.
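
A hedged sketch of the lock-free ("fuzzy") copying idea: since no locks are taken, concurrent updates can land mid-copy, so the archive records where the log stood when the copy began, and restore repairs the copy by replaying the log from that point, applying a record only when it is newer than the page. The code below is a toy rendering of that pattern, not the paper's algorithm:

```python
def archive(pages, log):
    """Copy pages without any locks; concurrent updates may be caught
    mid-copy, so remember where the log stood when the copy began."""
    start_lsn = len(log)               # log position at copy start
    return start_lsn, dict(pages)

def restore(start_lsn, copy, log):
    """Bring a fuzzy copy to a consistent state: replay log records
    from start_lsn, applying one only if it is newer than the page."""
    pages = dict(copy)
    for lsn in range(start_lsn, len(log)):
        page, data = log[lsn]
        page_lsn, _ = pages.get(page, (-1, None))
        if lsn > page_lsn:             # page LSN makes replay idempotent
            pages[page] = (lsn, data)
    return pages

log = []                               # (page, new_data) per record
pages = {"P1": (-1, "old")}
def apply(page, data):                 # a logged, in-place update
    log.append((page, data))
    pages[page] = (len(log) - 1, data)

apply("P1", "v1")
start_lsn, copy = archive(pages, log)  # copy taken with no locks held
apply("P1", "v2")                      # update during/after the copy
print(restore(start_lsn, copy, log))   # -> {'P1': (1, 'v2')}
```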


international conference on distributed computing systems | 1992

Data base recovery in shared disks and client-server architectures

C. Mohan; Inderpal Narang

Solutions to the problem of performing recovery correctly in shared-disks (SD) and client-server (CS) architectures are presented. In SD, all the disks containing the data bases are shared among multiple instances of the database management system (DBMS). In CS, the server manages the disk version of the data base. The clients, after obtaining database pages from the server, cache them in their buffer pools. Clients perform their updates on the cached pages and produce log records. In write-ahead logging (WAL) systems, a monotonically increasing value called the log sequence number (LSN) is associated with each log record. Every database page contains the LSN of the log record describing the most recent update to that page. This is required for proper recovery after a system failure. A technique for generating monotonically increasing LSNs in SD and CS architectures without using synchronized clocks is presented.
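
One standard way to achieve this, sketched below under simplifying assumptions (the paper's protocol has more machinery): each system assigns a new LSN that dominates both its own last LSN and the LSN already on the page, so per-page LSNs stay monotonic across systems without any clock agreement.

```python
class System:
    def __init__(self):
        self.last_lsn = 0

    def update(self, page):
        # The new LSN must exceed whatever LSN the page carries, even
        # if another system with a "faster" counter wrote it last.
        lsn = max(self.last_lsn, page["lsn"]) + 1
        self.last_lsn = lsn
        page["lsn"] = lsn
        return lsn

a, b = System(), System()
page = {"lsn": 0}
a.update(page); a.update(page); a.update(page)   # a emits LSNs 1..3
print(b.update(page))   # b's counter is behind, yet it emits LSN 4
```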


international conference on management of data | 2007

Lazy, adaptive rid-list intersection, and its application to index anding

Vijayshankar Raman; Lin Qiao; Wei Han; Inderpal Narang; Ying-Lin Chen; Kou-Horng Yang; Fen-Lin Ling

RID-list (row ID list) intersection is a common strategy in query processing, used in star joins, column stores, and even search engines. To apply a conjunction of predicates on a table, a query processor does index lookups to form sorted RID-lists (or bitmaps) of the rows matching each predicate, then intersects the RID-lists via an AND-tree, and finally fetches the corresponding rows to apply any residual predicates and aggregates. This process can be expensive when the RID-lists are large. Furthermore, the performance is sensitive to the order in which RID-lists are intersected together, and to treating the right predicates as residuals. If the optimizer chooses a wrong order or a wrong residual, due to a poor cardinality estimate, the resulting plan can run orders of magnitude slower than expected. We present a new algorithm for RID-list intersection that is both more efficient and more robust than this standard algorithm. First, we avoid forming the RID-lists up front, and instead form them lazily as part of the intersection. This reduces the associated I/O and sort cost significantly, especially when the data distribution is skewed. It also ameliorates the problem of wrong residual table selection. Second, we do not intersect the RID-lists via an AND-tree, because this is vulnerable to cardinality mis-estimations. Instead, we use an adaptive set intersection algorithm that performs well even when the cardinality estimates are wrong. We present detailed experiments of this algorithm on data with varying distributions to validate its efficiency and predictability.
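
Adaptive set intersection is a known family of techniques; the sketch below shows a textbook member of it, intersecting sorted RID-lists with galloping (doubling) search, rather than the paper's exact lazy algorithm. The order of probing is driven by the data at the lists' cursors instead of being fixed up front, which is what makes it robust to cardinality mis-estimates.

```python
from bisect import bisect_left

def gallop(lst, target, lo):
    """Find the first index >= lo whose value is >= target, by doubling
    the step and then binary-searching the bracketed range: cheap when
    the next match is nearby, still logarithmic when it is far away."""
    step, hi = 1, lo + 1
    while hi < len(lst) and lst[hi] < target:
        hi += step
        step *= 2
    return bisect_left(lst, target, lo, min(hi + 1, len(lst)))

def adaptive_intersect(lists):
    """Intersect sorted RID-lists adaptively: take the largest value at
    the cursors as the eliminator and gallop every other list forward
    to it, so no fixed AND-tree ordering is ever committed to."""
    pos = [0] * len(lists)
    out = []
    while all(pos[i] < len(lists[i]) for i in range(len(lists))):
        current = [lists[i][pos[i]] for i in range(len(lists))]
        target = max(current)
        if min(current) == target:       # all cursors agree: a match
            out.append(target)
            pos = [p + 1 for p in pos]
        else:
            for i in range(len(lists)):
                if lists[i][pos[i]] < target:
                    pos[i] = gallop(lists[i], target, pos[i])
    return out

r1 = [1, 3, 4, 9, 11, 12, 30, 31]
r2 = [4, 5, 6, 7, 8, 9, 10, 11, 31]
r3 = list(range(40))                     # a large, dense RID-list
print(adaptive_intersect([r1, r2, r3]))  # -> [4, 9, 11, 31]
```

Note how the dense list r3 never dictates the work done: the sparse lists pull the cursors forward in large jumps, which is the behavior a static AND-tree loses when the optimizer mis-orders its inputs.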


The Grid 2: Blueprint for a New Computing Infrastructure | 2004

Data Access, Integration, and Management

Malcolm P. Atkinson; Ann L. Chervenak; Peter Z. Kunszt; Inderpal Narang; Norman W. Paton; Dave Pearson; Arie Shoshani; Paul Watson

This chapter presents and analyzes application requirements, technical solutions, and open issues associated with maintaining and manipulating information on and for the Grid. It introduces an architectural framework for structuring data-oriented services and presents a number of important data-oriented services, highlighting current solutions and discussing future directions. Data have a multifaceted relationship with Grid infrastructure. Many Grid applications have significant data access and analysis requirements; virtually every scientific, engineering, medical, and decision-support application depends on accessing distributed heterogeneous collections of structured data. In addition, the Grid itself uses many structured data collections for its operation and administration; service data elements constitute just one example. As Grid technology becomes more sophisticated and autonomic, the number, volume, and diversity of these collections will increase. Hence, systematic data access and integration methods are important, not only for Grid applications, but for the Grid itself.


Ibm Systems Journal | 1997

DB2's use of the coupling facility for data sharing

Jeffrey William Josten; C. Mohan; Inderpal Narang; James Zu-chia Teng

We examine the problems encountered in extending DATABASE 2™ (DB2®) for Multiple Virtual Storage/Enterprise Systems Architecture (MVS/ESA™), also called DB2 for OS/390™, an industrial-strength relational database management system originally designed for a single-system environment, to support the multisystem shared-data architecture. The multisystem data sharing function was delivered in DB2 Version 4. DB2 data sharing requires a System/390® Parallel Sysplex™ environment because DB2's use of the coupling facility technology plays a central role in delivering highly efficient and scalable data sharing functions. We call this the shared-data architecture because the coupling facility is a unique feature that it employs.
