Swapnil Patil | Researchain

Publication

Featured researches published by Swapnil Patil.

symposium on cloud computing | 2011

YCSB++: benchmarking and performance debugging advanced features in scalable table stores

Swapnil Patil; Milo Polte; Kai Ren; Wittawat Tantisiriroj; Lin Xiao; Julio Lopez; Garth A. Gibson; Adam Fuchs; Billie Rinaldi

Inspired by Google's BigTable, a variety of scalable, semi-structured, weak-semantic table stores have been developed and optimized for different priorities such as query speed, ingest speed, availability, and interactivity. As these systems mature, performance benchmarking will advance from measuring the rate of simple workloads to understanding and debugging the performance of advanced features such as ingest speed-up techniques and function shipping filters from client to servers. This paper describes YCSB++, a set of extensions to the Yahoo! Cloud Serving Benchmark (YCSB) to improve performance understanding and debugging of these advanced features. YCSB++ includes multi-tester coordination for increased load and eventual consistency measurement, multi-phase workloads to quantify the consequences of work deferment and the benefits of anticipatory configuration optimization such as B-tree pre-splitting or bulk loading, and abstract APIs for explicit incorporation of advanced features in benchmark tests. To enhance performance debugging, we customized an existing cluster monitoring tool to gather the internal statistics of YCSB++, table stores, system services like HDFS, and operating systems, and to offer easy post-test correlation and reporting of performance behaviors. YCSB++ features are illustrated in case studies of two BigTable-like table stores, Apache HBase and Accumulo, developed to emphasize high ingest rates and fine-grained security.

sensor, mesh and ad hoc communications and networks | 2004

Serial data fusion using space-filling curves in wireless sensor networks

Swapnil Patil; Samir R. Das; Asis Nasipuri

This paper considers serial fusion as a mechanism for collaborative signal detection. The advantage of this technique is that it can use only the sensor observations that are really necessary for signal detection and thus can be very communication efficient. We develop the signal processing mechanisms for serial fusion based on simple models. We also develop a space-filling curve-based routing mechanism for message routing to implement serial fusion. We demonstrate via simulations that serial fusion with curve-based routing performs better, both in terms of detection errors and message cost, relative to commonly used mechanisms such as parallel fusion with a tree-based aggregation scheme.

ieee international conference on high performance computing data and analytics | 2014

IndexFS: scaling file system metadata performance with stateless caching and bulk insertion

Kai Ren; Qing Zheng; Swapnil Patil; Garth A. Gibson

The growing size of modern storage systems is expected to exceed billions of objects, making metadata scalability critical to overall performance. Many existing distributed file systems only focus on providing highly parallel fast access to file data, and lack a scalable metadata service. In this paper, we introduce a middleware design called IndexFS that adds support to existing file systems such as PVFS, Lustre, and HDFS for scalable high-performance operations on metadata and small files. IndexFS uses a table-based architecture that incrementally partitions the namespace on a per-directory basis, preserving server and disk locality for small directories. An optimized log-structured layout is used to store metadata and small files efficiently. We also propose two client-based storm-free caching techniques: bulk namespace insertion for creation-intensive workloads such as N-N checkpointing, and stateless consistent metadata caching for hot spot mitigation. By combining these techniques, we have demonstrated that IndexFS scales to 128 metadata servers. Experiments show our out-of-core metadata throughput outperforming existing solutions such as PVFS, Lustre, and HDFS by 50% to two orders of magnitude.

petascale data storage workshop | 2007

GIGA+: scalable directories for shared file systems

Swapnil Patil; Garth A. Gibson; Samuel Lang; Milo Polte

There is an increasing use of high-performance computing (HPC) clusters with thousands of compute nodes that, with the advent of multi-core CPUs, will impose a significant challenge for storage systems: The ability to scale to handle I/O generated by applications executing in parallel in tens of thousands of threads. One such challenge is building scalable directories for cluster storage - i.e., directories that can store billions to trillions of entries and handle hundreds of thousands of operations per second.

international conference on embedded networked sensor systems | 2003

Poster abstract: serial data aggregation using space-filling curves in wireless sensor networks

Swapnil Patil; Samir R. Das

Many applications require that sensor observations in a given geographic region be aggregated or fused in a serial fashion. This requires a routing path to be constructed through all sensors in that region. This paper investigates efficient network traversal techniques to construct such a path using the novel concept of space-filling curves.
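
As a rough illustration of the idea in this abstract (not the authors' routing protocol), the sketch below orders sensors along a Hilbert curve, one common space-filling curve. It assumes sensors sit on integer grid coordinates in a 2^order by 2^order region; the names hilbert_index and serial_path are hypothetical and chosen only for this example.

```python
def hilbert_index(order, x, y):
    """Distance of grid cell (x, y) along a Hilbert curve that covers
    a 2^order x 2^order grid (standard bitwise conversion)."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def serial_path(sensors, order):
    """Visit order for serial aggregation: sensors sorted by curve distance.
    Assumption: each sensor is an (x, y) tuple of grid coordinates."""
    return sorted(sensors, key=lambda p: hilbert_index(order, p[0], p[1]))


if __name__ == "__main__":
    grid = [(x, y) for x in range(4) for y in range(4)]
    print(serial_path(grid, order=2))
```

On a fully populated grid, consecutive sensors on this path are always grid neighbors, which is the locality property that makes such curves attractive for building a single traversal path.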

ieee international conference on high performance computing data and analytics | 2012

A Case for Scaling HPC Metadata Performance through De-specialization

Swapnil Patil; Kai Ren; Garth A. Gibson

Lack of a highly scalable and parallel metadata service is the Achilles heel for many cluster file system deployments in both the HPC world and the Internet services world. This is because most cluster file systems have focused on scaling the data path, i.e., providing high-bandwidth parallel I/O to files that are gigabytes in size. But with the proliferation of massively parallel applications that produce metadata-intensive workloads, such as large numbers of simultaneous file creates and large-scale storage management, cluster file systems also need to scale metadata performance. To realize these goals, this paper makes a case for a scalable metadata service middleware that layers on existing cluster file system deployments and distributes file system metadata, including the namespace tree, small directories and large directories, across many servers. Our key idea is to effectively synthesize a concurrent indexing technique to distribute metadata with a tabular, on-disk representation of all file system metadata.

symposium on operating systems principles | 2005

What the protocol stack missed: the transfer service

Niraj Tolia; David G. Andersen; Michael Kaminsky; Swapnil Patil

This WIP proposes a new architecture for applications that perform bulk data transfers. This architecture, called DOT (for data-oriented transfer), cleanly separates out two functions that are commingled in today's applications. Using DOT, applications perform content negotiation to determine what content to send. They then pass that data object to the transfer service to perform the actual data transmission. This separation increases application flexibility, enables the rapid development of innovative transfer mechanisms, reduces developer effort, and allows increased efficiency through cross-application sharing.

ad hoc networks | 2003

IDEA: An Iterative-Deepening Algorithm for Energy-Efficient Querying in Ad Hoc Sensor Networks

Swapnil Patil

Data-centric ad hoc sensor networks make efficient searching a crucial and challenging operation. Dynamic topologies make flooding the most widely adopted solution, at the cost of high bandwidth congestion that leads to inefficient use of resources and low network lifetime. This paper presents IDEA, an efficient querying and searching technique for ad hoc sensor networks that reduces average energy consumption while maintaining the capacity and performance of the network. IDEA is based on iterative-deepening search, which checkpoints the flooding of requests based on the results. This is further extended to a token-based approach called T-IDEA, which involves local decisions made by nodes to determine their participation in a virtual searching network. Results show that IDEA and T-IDEA significantly reduce energy consumption compared to classical flooding approaches. In addition, T-IDEA presents a highly distributed, self-supervising topology formation that performs very well at increasing the lifetime of the ad hoc sensor network.
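
The general idea of iterative-deepening search over a flooded network can be sketched as follows. This is a generic illustration, not the IDEA or T-IDEA protocol itself: the graph representation, the message-counting cost model, the depth schedule, and the names flood and iterative_deepening_query are all assumptions made for this example.

```python
from collections import deque


def flood(graph, source, has_data, ttl):
    """Hop-limited flood from source.
    Returns (nodes that hold the data, number of messages sent)."""
    seen = {source}
    frontier = deque([(source, 0)])
    hits = [source] if has_data(source) else []
    messages = 0
    while frontier:
        node, depth = frontier.popleft()
        if depth == ttl:
            continue  # hop limit reached: this node does not forward
        for nbr in graph[node]:
            messages += 1  # every transmission to a neighbor costs one message
            if nbr not in seen:
                seen.add(nbr)
                if has_data(nbr):
                    hits.append(nbr)
                frontier.append((nbr, depth + 1))
    return hits, messages


def iterative_deepening_query(graph, source, has_data, depths):
    """Flood with increasing hop limits; stop at the first depth with a hit."""
    total = 0
    for ttl in depths:
        hits, messages = flood(graph, source, has_data, ttl)
        total += messages
        if hits:
            return hits, total
    return [], total


if __name__ == "__main__":
    # Hypothetical six-node chain 0-1-2-3-4-5; the data sits one hop away.
    chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
    print(iterative_deepening_query(chain, 0, lambda n: n == 1, (1, 2, 4)))
    print(flood(chain, 0, lambda n: n == 1, ttl=5))
```

In this toy model the saving depends on how close the data is: nearby data is found with few messages, while distant data can cost more than a single full flood because the shallow rounds are repeated.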

networked systems design and implementation | 2006