Publications


Featured research published by Mehul A. Shah.


Symposium on Operating Systems Principles | 2007

Sinfonia: a new paradigm for building scalable distributed systems

Marcos Kawazoe Aguilera; Arif Merchant; Mehul A. Shah; Alistair Veitch; Christos Karamanolis

We propose a new paradigm for building scalable distributed systems. Our approach does not require dealing with message-passing protocols -- a major complication in existing distributed systems. Instead, developers just design and manipulate data structures within our service called Sinfonia. Sinfonia keeps data for applications on a set of memory nodes, each exporting a linear address space. At the core of Sinfonia is a novel minitransaction primitive that enables efficient and consistent access to data, while hiding the complexities that arise from concurrency and failures. Using Sinfonia, we implemented two very different and complex applications in a few months: a cluster file system and a group communication service. Our implementations perform well and scale to hundreds of machines.
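The minitransaction primitive bundles compare items, read items, and write items, and commits only if every compare item matches the data currently stored on the memory nodes. Below is a minimal single-process sketch of that idea; the class and function names, the dict-backed "memory nodes", and the sequential commit are illustrative assumptions, not Sinfonia's actual API or its two-phase commit protocol across memory nodes.

    # Illustrative sketch of a minitransaction-style interface (not Sinfonia's real API).
    class MiniTransaction:
        def __init__(self):
            self.compares = []   # (node_id, addr, length, expected_bytes)
            self.reads = []      # (node_id, addr, length)
            self.writes = []     # (node_id, addr, data)

        def add_compare(self, node, addr, expected):
            self.compares.append((node, addr, len(expected), expected))

        def add_read(self, node, addr, length):
            self.reads.append((node, addr, length))

        def add_write(self, node, addr, data):
            self.writes.append((node, addr, data))

    def execute(txn, memory_nodes):
        """Atomically check compare items, then perform reads and apply writes.
        Returns (committed, read_results)."""
        for node, addr, length, expected in txn.compares:
            if memory_nodes[node].get(addr, b"\x00" * length) != expected:
                return False, []                  # a compare item failed: abort
        results = [memory_nodes[node].get(addr, b"\x00" * length)
                   for node, addr, length in txn.reads]
        for node, addr, data in txn.writes:
            memory_nodes[node][addr] = data
        return True, results

    # Example: update address 0 on node 0 only if it still holds b"old".
    nodes = {0: {0: b"old"}, 1: {}}
    txn = MiniTransaction()
    txn.add_compare(0, 0, b"old")
    txn.add_write(0, 0, b"new")
    txn.add_write(1, 8, b"copy")
    print(execute(txn, nodes))    # (True, [])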


International Conference on Management of Data | 2007

JouleSort: a balanced energy-efficiency benchmark

Suzanne Rivoire; Mehul A. Shah; Parthasarathy Ranganathan; Christos Kozyrakis

The energy efficiency of computer systems is an important concern in a variety of contexts. In data centers, reducing energy use improves operating cost, scalability, reliability, and other factors. For mobile devices, energy consumption directly affects functionality and usability. We propose and motivate JouleSort, an external sort benchmark for evaluating the energy efficiency of a wide range of computer systems, from clusters to handhelds. We list the criteria, challenges, and pitfalls from our experience in creating a fair energy-efficiency benchmark. Using a commercial sort, we demonstrate a JouleSort system that is over 3.5x as energy-efficient as last year's estimated winner. This system is quite different from those currently used in data centers. It consists of a commodity mobile CPU and 13 laptop drives connected by server-style I/O interfaces.
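JouleSort fixes the amount of data to sort and measures the total energy consumed, so the figure of merit is records sorted per Joule. A toy calculation of that metric; all numbers below are made up purely for illustration, not results from the paper.

    # Hypothetical inputs: record count, elapsed time, and average wall power.
    records_sorted = 1_000_000_000        # records in the fixed sort workload
    elapsed_seconds = 1_800               # measured wall-clock time of the sort
    average_power_watts = 100.0           # measured wall power during the run

    energy_joules = average_power_watts * elapsed_seconds
    records_per_joule = records_sorted / energy_joules
    print(f"{records_per_joule:,.0f} records sorted per Joule")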


International Conference on Management of Data | 2010

Analyzing the energy efficiency of a database server

Dimitris Tsirogiannis; Stavros Harizopoulos; Mehul A. Shah

Rising energy costs in large data centers are driving an agenda for energy-efficient computing. In this paper, we focus on the role of database software in affecting, and, ultimately, improving the energy efficiency of a server. We first characterize the power-use profiles of database operators under different configuration parameters. We find that common database operations can exercise the full dynamic power range of a server, and that the CPU power consumption of different operators, for the same CPU utilization, can differ by as much as 60%. We also find that for these operations CPU power does not vary linearly with CPU utilization. We then experiment with several classes of database systems and storage managers, varying parameters that span from different query plans to compression algorithms and from physical layout to CPU frequency and operating system scheduling. Contrary to what recent work has suggested, we find that within a single node intended for use in scale-out (shared-nothing) architectures, the most energy-efficient configuration is typically the highest performing one. We explain under which circumstances this is not the case, and argue that these circumstances do not warrant a retargeting of database system optimization goals. Further, our results reveal opportunities for cross-node energy optimizations and point out directions for new scale-out architectures.
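One way to see why the highest-performing single-node configuration tends to also be the most energy-efficient is that a server's fixed (idle) power is paid regardless of how fast work completes. The configurations and wattages below are assumptions chosen only to illustrate that energy per query, total power divided by throughput, favors the faster configuration when idle power dominates.

    # Toy comparison with assumed numbers (not measurements from the paper).
    configs = {
        # name: (queries_per_second, active_power_watts_above_idle)
        "high-frequency, fast plan": (200.0, 120.0),
        "low-frequency, slow plan":  (120.0, 60.0),
    }
    idle_power_watts = 150.0   # assumed baseline draw of the server

    for name, (qps, active_w) in configs.items():
        total_power = idle_power_watts + active_w
        joules_per_query = total_power / qps
        print(f"{name}: {joules_per_query:.2f} J/query")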


International Conference on Management of Data | 2009

Query processing techniques for solid state drives

Dimitris Tsirogiannis; Stavros Harizopoulos; Mehul A. Shah; Janet L. Wiener; Goetz Graefe

Solid state drives perform random reads more than 100x faster than traditional magnetic hard disks, while offering comparable sequential read and write bandwidth. Because of their potential to speed up applications, as well as their reduced power consumption, these new drives are expected to gradually replace hard disks as the primary permanent storage media in large data centers. However, although they may benefit applications that stress random reads immediately, they may not improve database applications, especially those running long data analysis queries. Database query processing engines have been designed around the speed mismatch between random and sequential I/O on hard disks and their algorithms currently emphasize sequential accesses for disk-resident data. In this paper, we investigate data structures and algorithms that leverage fast random reads to speed up selection, projection, and join operations in relational query processing. We first demonstrate how a column-based layout within each page reduces the amount of data read during selections and projections. We then introduce FlashJoin, a general pipelined join algorithm that minimizes accesses to base and intermediate relational data. FlashJoin's binary join kernel accesses only the join attributes, producing partial results in the form of a join index. Subsequently, its fetch kernel retrieves the attributes for later nodes in the query plan as they are needed. FlashJoin significantly reduces memory and I/O requirements for each join in the query. We implemented these techniques inside Postgres and experimented with an enterprise SSD drive. Our techniques improved query runtimes by up to 6x for queries ranging from simple relational scans and joins to full TPC-H queries.
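The following is a minimal in-memory sketch of the two-kernel structure described above: the join kernel touches only the join attribute and emits a join index of row-id pairs, and the fetch kernel later materializes only the columns downstream operators need. The table representation (lists of dicts), the hash join, and the function names are simplifying assumptions; the actual FlashJoin operates on page-level column layouts inside Postgres.

    def join_kernel(left, right, key):
        """Read only the join attribute from each input and emit a join index:
        a list of (left_rowid, right_rowid) pairs."""
        build = {}
        for rid, row in enumerate(left):
            build.setdefault(row[key], []).append(rid)
        join_index = []
        for rid, row in enumerate(right):
            for l_rid in build.get(row[key], []):
                join_index.append((l_rid, rid))
        return join_index

    def fetch_kernel(join_index, left, right, needed_left, needed_right):
        """Fetch only the attributes later operators need, using the row ids
        recorded in the join index."""
        for l_rid, r_rid in join_index:
            out = {c: left[l_rid][c] for c in needed_left}
            out.update({c: right[r_rid][c] for c in needed_right})
            yield out

    # Example: join two tiny tables on "okey", then fetch one column from each side.
    orders = [{"okey": 1, "cust": "A"}, {"okey": 2, "cust": "B"}]
    items = [{"okey": 1, "price": 10}, {"okey": 1, "price": 5}, {"okey": 2, "price": 7}]
    idx = join_kernel(orders, items, "okey")
    print(list(fetch_kernel(idx, orders, items, ["cust"], ["price"])))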


Very Large Data Bases | 2008

A practical scalable distributed B-tree

Marcos Kawazoe Aguilera; Wojciech M. Golab; Mehul A. Shah

Internet applications increasingly rely on scalable data structures that must support high throughput and store huge amounts of data. These data structures can be hard to implement efficiently. Recent proposals have overcome this problem by giving up on generality and implementing specialized interfaces and functionality (e.g., Dynamo [4]). We present the design of a more general and flexible solution: a fault-tolerant and scalable distributed B-tree. In addition to the usual B-tree operations, our B-tree provides some important practical features: transactions for atomically executing several operations in one or more B-trees, online migration of B-tree nodes between servers for load-balancing, and dynamic addition and removal of servers for supporting incremental growth of the system. Our design is conceptually simple. Rather than using complex concurrency and locking protocols, we use distributed transactions to make changes to B-tree nodes. We show how to extend the B-tree and keep additional information so that these transactions execute quickly and efficiently. Our design relies on an underlying distributed data sharing service, Sinfonia [1], which provides fault tolerance and a light-weight distributed atomic primitive. We use this primitive to commit our transactions. We implemented our B-tree and show that it performs comparably to an existing open-source B-tree and that it scales to hundreds of machines. We believe that our approach is general and can be used to implement other distributed data structures easily.
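A self-contained toy of the core idea, optimistic traversal followed by a transactional commit that validates the version of every node the client relied on, is sketched below. The server layout, the version fields, and the single split-key "tree" are assumptions for illustration; node splits, server migration, and the actual Sinfonia-based commit are omitted.

    # Toy: B-tree nodes live on "servers" as versioned records; an update commits
    # only if every traversed node still has the version the client observed.
    servers = {0: {}, 1: {}}   # server_id -> {node_id: (version, entries_dict)}
    servers[0]["root"] = (1, {"split_key": 50, "left": (1, "leafA"), "right": (1, "leafB")})
    servers[1]["leafA"] = (1, {})
    servers[1]["leafB"] = (1, {})

    def read_node(ref):
        server_id, node_id = ref
        return servers[server_id][node_id]

    def commit(validations, writes):
        """All-or-nothing commit: check every (ref, version), then apply writes."""
        for ref, expected_version in validations:
            if read_node(ref)[0] != expected_version:
                return False                       # concurrent change: abort
        for ref, new_entries in writes:
            server_id, node_id = ref
            version, entries = servers[server_id][node_id]
            entries.update(new_entries)
            servers[server_id][node_id] = (version + 1, entries)
        return True

    def insert(key, value):
        while True:                                # optimistic retry loop
            root_ref = (0, "root")
            root_version, root = read_node(root_ref)
            leaf_ref = root["left"] if key < root["split_key"] else root["right"]
            leaf_version, _ = read_node(leaf_ref)
            validations = [(root_ref, root_version), (leaf_ref, leaf_version)]
            if commit(validations, [(leaf_ref, {key: value})]):
                return

    insert(7, "value-for-7")
    print(servers[1]["leafA"])    # (2, {7: 'value-for-7'})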


IEEE Computer | 2007

Models and Metrics to Enable Energy-Efficiency Optimizations

Suzanne Rivoire; Mehul A. Shah; P. Ranganathan; Christos Kozyrakis; J. Meza

Power consumption and energy efficiency are important factors in the initial design and day-to-day management of computer systems. Researchers and system designers need benchmarks that characterize energy efficiency to evaluate systems and identify promising new technologies. To predict the effects of new designs and configurations, they also need accurate methods of modeling power consumption.
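As a concrete example of the modeling side, a widely used first-order model predicts system power as idle power plus a term proportional to utilization. The sketch below uses assumed wattages, not figures from the article; note that the database-server study listed above finds CPU power is not linear in utilization, which is exactly why such models need calibration against measurement.

    def modeled_power(utilization, p_idle=150.0, p_busy=250.0):
        """First-order power model: predict system watts from utilization in [0, 1].
        p_idle and p_busy are assumed calibration points, not measured values."""
        return p_idle + (p_busy - p_idle) * utilization

    for u in (0.0, 0.25, 0.5, 1.0):
        print(f"utilization {u:.0%}: {modeled_power(u):.0f} W")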


Principles of Distributed Computing | 2011

Analyzing consistency properties for fun and profit

Wojciech M. Golab; Xiaozhou Li; Mehul A. Shah

Motivated by the increasing popularity of eventually consistent key-value stores as a commercial service, we address two important problems related to the consistency properties in a history of operations on a read/write register (i.e., the start time, finish time, argument, and response of every operation). First, we consider how to detect a consistency violation as soon as one happens. To this end, we formulate a specification for online verification algorithms, and we present such algorithms for several well-known consistency properties. Second, we consider how to quantify the severity of the violations, if a history is found to contain consistency violations. We investigate two quantities: one is the staleness of the reads, and the other is the commonality of violations. For staleness, we further consider time-based staleness and operation-count-based staleness. We present efficient algorithms that compute these quantities. We believe that addressing these problems helps both key-value store providers and users adopt data consistency as an important aspect of key-value store offerings.
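To make the staleness idea concrete, the sketch below computes a simplified time-based staleness for each read in a small hand-written history: how long before the read started was the returned value superseded by a newer completed write. The operation layout, the single register, and this particular staleness definition are simplifications for illustration, not the paper's exact algorithms.

    from collections import namedtuple

    Op = namedtuple("Op", "kind start finish value")   # kind is 'w' (write) or 'r' (read)

    history = [
        Op("w", 0.0, 1.0, "a"),
        Op("w", 2.0, 3.0, "b"),
        Op("r", 5.0, 6.0, "a"),    # stale: 'b' finished at t=3.0, read began at t=5.0
        Op("r", 6.0, 7.0, "b"),    # fresh
    ]

    def read_staleness(history):
        writes = [op for op in history if op.kind == "w"]
        for op in history:
            if op.kind != "r":
                continue
            # latest completed write of a different value before the read began
            newer = [w.finish for w in writes
                     if w.finish <= op.start and w.value != op.value]
            # latest completed write of the returned value before the read began
            own = [w.finish for w in writes
                   if w.finish <= op.start and w.value == op.value]
            if newer and own and max(newer) > max(own):
                yield op, op.start - max(newer)    # time since the value was superseded
            else:
                yield op, 0.0

    for op, staleness in read_staleness(history):
        print(op.value, staleness)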


Dependable Systems and Networks | 2010

Efficient eventual consistency in Pahoehoe, an erasure-coded key-blob archive

Eric Anderson; Xiaozhou Li; Arif Merchant; Mehul A. Shah; Kevin Smathers; Joseph Tucek; Mustafa Uysal; Jay J. Wylie

Cloud computing demands cheap, always-on, and reliable storage. We describe Pahoehoe, a key-value cloud storage system we designed to store large objects cost-effectively with high availability. Pahoehoe stores objects across multiple data centers and provides eventual consistency so that it remains available during network partitions. Pahoehoe uses erasure codes to store objects with high reliability at low cost. Its use of erasure codes distinguishes Pahoehoe from other cloud storage systems, and presents a challenge for efficiently providing eventual consistency. We describe Pahoehoe's put, get, and convergence protocols—convergence being the decentralized protocol that ensures eventual consistency. We use simulated executions of Pahoehoe to evaluate the efficiency of convergence, in terms of message count and message bytes sent, for failure-free and expected failure scenarios (e.g., partitions and server unavailability). We describe and evaluate optimizations to the naïve convergence protocol that reduce the cost of convergence in all scenarios.
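The erasure-coding aspect can be illustrated with the simplest possible code, two data fragments plus one XOR parity fragment, so the blob can be rebuilt after losing any single fragment. The parameters and padding below are assumptions for illustration; Pahoehoe's actual codes, fragment placement across data centers, and convergence protocol are not modeled here.

    def encode(value: bytes):
        """Split a value into 2 data fragments and 1 XOR parity fragment."""
        if len(value) % 2:
            value += b"\x00"      # pad to an even length (a real system would record the original length)
        half = len(value) // 2
        d0, d1 = value[:half], value[half:]
        parity = bytes(a ^ b for a, b in zip(d0, d1))
        return [d0, d1, parity]

    def decode(fragments):
        """Reconstruct from any two of [d0, d1, parity]; None marks a lost fragment."""
        d0, d1, parity = fragments
        if d0 is None:
            d0 = bytes(a ^ b for a, b in zip(d1, parity))
        if d1 is None:
            d1 = bytes(a ^ b for a, b in zip(d0, parity))
        return d0 + d1

    frags = encode(b"hello world!")
    frags[0] = None               # simulate losing one fragment
    print(decode(frags))          # b'hello world!'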


International Symposium on Low Power Electronics and Design | 2009

Tracking the power in an enterprise decision support system

Justin Meza; Mehul A. Shah; Parthasarathy Ranganathan; Mike Fitzner; Judson E. Veazey

Enterprises rely on decision support systems to influence critical business choices. At the same time, IT-related power costs are growing and are a key concern for enterprise executives. Yet, there is little work to date characterizing the power use of decision support systems. Towards this end, we present the first holistic measurements and analysis of an audit-class system running the TPC-H decision support benchmark at the 300GB scale. We first provide a breakdown of the system's power use into its core hardware components. We then explore its power-performance tradeoffs. This investigation shows that there is ample room to improve its energy use without sacrificing much performance. Moreover, the most energy-efficient configuration depends on the workload. These results suggest that, going forward, database software has an important role to play in optimizing for energy use.
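The kind of per-component breakdown such a characterization produces can be summarized as below; the wattages and run length are placeholders for illustration, not measurements from the paper.

    # Placeholder per-component power figures and run length (not paper data).
    component_watts = {"CPUs": 420.0, "DRAM": 180.0, "storage": 900.0, "other": 300.0}
    run_seconds = 3600.0                  # length of the measured benchmark run

    total_watts = sum(component_watts.values())
    for name, watts in component_watts.items():
        energy_kj = watts * run_seconds / 1000.0
        print(f"{name}: {watts:.0f} W ({watts / total_watts:.0%} of total), {energy_kj:.0f} kJ")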


Data Management on New Hardware | 2012

Hathi: durable transactions for memory using flash

Mohit Saxena; Mehul A. Shah; Stavros Harizopoulos; Michael M. Swift; Arif Merchant

Recent architectural trends---cheap, fast solid-state storage, inexpensive DRAM, and multi-core CPUs---provide an opportunity to rethink the interface between applications and persistent storage. To leverage these advances, we propose a new system architecture called Hathi that provides an in-memory transactional heap made persistent using high-speed flash drives. With Hathi, programmers can make consistent concurrent updates to in-memory data structures that survive system failures. Hathi focuses on three major design goals: ACID semantics, a simple programming interface, and fine-grained programmer control. Hathi relies on software transactional memory to provide a simple concurrent interface to in-memory data structures, and extends it with persistent logs and checkpoints to add durability. To reduce the cost of durability, Hathi uses two main techniques. First, it provides split-phase and partitioned commit interfaces that allow programmers to overlap commit I/O with computation and to avoid unnecessary synchronization. Second, it uses partitioned logging, which reduces contention on in-memory log buffers and exploits internal SSD parallelism. We find that our implementation of Hathi can achieve 1.25 million txns/s with a single SSD.
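A sketch of the split-phase commit idea under assumed interfaces: commit_begin() serializes the transaction's writes and starts the log I/O on a background thread, returning a handle, so the caller can overlap further computation with commit I/O and block only in commit_wait(). The function names and the file-backed log are illustrative, not Hathi's actual API; partitioned logging and checkpoints are omitted.

    import os
    import tempfile
    from concurrent.futures import ThreadPoolExecutor

    _log_path = os.path.join(tempfile.gettempdir(), "hathi_sketch.log")
    _io_pool = ThreadPoolExecutor(max_workers=1)

    def _write_log_record(record: bytes):
        with open(_log_path, "ab") as log:
            log.write(record + b"\n")
            log.flush()
            os.fsync(log.fileno())    # durability: force the record to stable storage

    def commit_begin(writes: dict):
        """Serialize the transaction's writes and start the log I/O; return a handle."""
        record = repr(sorted(writes.items())).encode()
        return _io_pool.submit(_write_log_record, record)

    def commit_wait(handle):
        """Block until the commit record is durable."""
        handle.result()

    handle = commit_begin({"x": 1, "y": 2})
    # ... overlap other computation here while the log write is in flight ...
    commit_wait(handle)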
