Network


Latest external collaborations at the country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Ramya Prabhakar is active.

Publication


Featured research published by Ramya Prabhakar.


International Conference on Cloud Computing | 2012

MROrchestrator: A Fine-Grained Resource Orchestration Framework for MapReduce Clusters

Bikash Sharma; Ramya Prabhakar; Seung-Hwan Lim; Mahmut T. Kandemir; Chita R. Das

Efficient resource management in data centers and clouds running large distributed data processing frameworks like MapReduce is crucial for enhancing the performance of hosted applications and increasing resource utilization. However, existing resource scheduling schemes in Hadoop MapReduce allocate resources at the granularity of fixed-size, static portions of nodes, called slots. In this work, we show that MapReduce jobs have widely varying demands for multiple resources, making the static and fixed-size slot-level resource allocation a poor choice both from the performance and resource utilization standpoints. Furthermore, lack of coordination in the management of multiple resources across nodes prevents dynamic slot reconfiguration, and leads to resource contention. Motivated by this, we propose MROrchestrator, a MapReduce resource Orchestrator framework, which can dynamically identify resource bottlenecks, and resolve them through fine-grained, coordinated, and on-demand resource allocations. We have implemented MROrchestrator on two 24-node native and virtualized Hadoop clusters. Experimental results with a suite of representative MapReduce benchmarks demonstrate up to 38% reduction in job completion times, and up to 25% increase in resource utilization. We further demonstrate the performance boost in existing resource managers like NGM and Mesos, when augmented with MROrchestrator.
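
A minimal sketch of the fine-grained orchestration idea, assuming a hypothetical Task record and invented thresholds; the real MROrchestrator hooks into Hadoop's slot management, not this toy loop.

```python
# Illustrative sketch only: per-epoch, per-resource rebalancing in the spirit
# of MROrchestrator. Task, thresholds, and the rebalance step are invented.
from dataclasses import dataclass, field

@dataclass
class Task:
    task_id: str
    usage: dict = field(default_factory=dict)       # observed utilization, e.g. {"cpu": 0.95}
    allocation: dict = field(default_factory=dict)  # current allocation, e.g. {"cpu": 2.0}

BOTTLENECK = 0.90  # a resource above this utilization is a bottleneck
SLACK = 0.50       # a resource below this utilization can donate capacity
STEP = 0.25        # fraction of the donor's allocation moved per epoch

def rebalance(tasks):
    """One orchestration epoch: for each resource, shift capacity from tasks
    with slack to tasks bottlenecked on it, instead of keeping fixed slots."""
    resources = {r for t in tasks for r in t.allocation}
    for r in resources:
        starved = [t for t in tasks if t.usage.get(r, 0) >= BOTTLENECK]
        donors = [t for t in tasks if t.usage.get(r, 1) <= SLACK]
        for hungry, donor in zip(starved, donors):
            delta = donor.allocation[r] * STEP
            donor.allocation[r] -= delta
            hungry.allocation[r] += delta

tasks = [
    Task("map_01", {"cpu": 0.97, "mem": 0.30}, {"cpu": 2.0, "mem": 4.0}),
    Task("map_02", {"cpu": 0.35, "mem": 0.92}, {"cpu": 2.0, "mem": 4.0}),
]
rebalance(tasks)
print({t.task_id: t.allocation for t in tasks})
```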


International Conference on Distributed Computing Systems | 2011

Provisioning a Multi-tiered Data Staging Area for Extreme-Scale Machines

Ramya Prabhakar; Sudharshan S. Vazhkudai; Young-Jae Kim; Ali Raza Butt; Min Li; Mahmut T. Kandemir

Massively parallel scientific applications, running on extreme-scale supercomputers, produce hundreds of terabytes of data per run, driving the need for storage solutions to improve their I/O performance. Traditional parallel file systems (PFS) in high performance computing (HPC) systems are unable to keep up with such high data rates, creating a storage wall. In this work, we present a novel multi-tiered storage architecture comprising hybrid node-local resources to construct a dynamic data staging area for extreme-scale machines. Such a staging ground serves as an impedance matching device between applications and the PFS. Our solution combines diverse resources (e.g., DRAM, SSD) in such a way as to approach the performance of the fastest component technology and the cost of the least expensive one. We have developed an automated provisioning algorithm that aids in meeting the checkpointing performance requirement of HPC applications, by using a least-cost storage configuration. We evaluate our approach using both an implementation on a large-scale cluster and a simulation driven by six years' worth of Jaguar supercomputer job logs, and show that our approach, by choosing an appropriate storage configuration, achieves 41.5% cost savings with only negligible impact on performance.
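
A hedged sketch of the least-cost provisioning idea: choose the cheapest mix of node-local resources whose aggregate bandwidth meets the checkpointing target. Tier names, bandwidths, and prices are invented, and the brute-force search stands in for the paper's provisioning algorithm.

```python
from itertools import product

# Invented per-unit characteristics: (bandwidth in GB/s, cost).
TIERS = {"dram": (5.0, 10.0), "ssd": (1.0, 2.0)}
MAX_UNITS = 8  # illustrative per-tier limit

def provision(required_bw_gbs):
    """Return the least-cost (cost, configuration) meeting the bandwidth
    target, by brute force over small configurations."""
    best = None
    for counts in product(range(MAX_UNITS + 1), repeat=len(TIERS)):
        bw = sum(n * TIERS[t][0] for n, t in zip(counts, TIERS))
        cost = sum(n * TIERS[t][1] for n, t in zip(counts, TIERS))
        if bw >= required_bw_gbs and (best is None or cost < best[0]):
            best = (cost, dict(zip(TIERS, counts)))
    return best

print(provision(12.0))  # cheapest configuration sustaining 12 GB/s
```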


Cluster Computing and the Grid | 2009

Markov Model Based Disk Power Management for Data Intensive Workloads

Rajat Garg; Seung Woo Son; Mahmut T. Kandemir; Padma Raghavan; Ramya Prabhakar

In order to meet the increasing demands of present and upcoming data-intensive computer applications, there has been a major shift in the disk subsystem, which now consists of more disks with higher storage capacities and higher rotational speeds. These changes have made the disk subsystem a major consumer of power, making disk power management an important issue. Prior work has considered spinning down the disk during periods of idleness or serving requests at lower rotational speeds when performance is not an issue. Accurately predicting future disk idle periods is crucial to such schemes. This paper presents a novel disk-idleness prediction mechanism based on Markov models and explains how this mechanism can be used in conjunction with a three-speed disk. Our experimental evaluation using a diverse set of workloads indicates that (i) prediction accuracies achieved by the proposed scheme are very good (87.5% on average); (ii) it generates significant energy savings over the traditional power-saving method of spinning down the disk when idle (35.5% on average); (iii) it performs better than a previously proposed multi-speed disk management scheme (19% on average); and (iv) the performance penalty is negligible (less than 1% on average). Overall, our implementation and experimental evaluation using both synthetic disk traces and traces extracted from real applications demonstrate the feasibility of a Markov-model-based approach to saving disk power.
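
A minimal sketch of Markov-model-based idle-period prediction: idle-period lengths are discretized into classes, a first-order chain is learned from the observed sequence, and the predicted next class would drive the speed/spin-down choice. Class boundaries and the trace are invented.

```python
from collections import defaultdict

def classify(idle_ms):
    # Map an idle-period length to a coarse class (boundaries are made up).
    if idle_ms < 100:
        return "short"
    if idle_ms < 2000:
        return "medium"
    return "long"

def train(idle_periods_ms):
    """Count class-to-class transitions in the observed idle-period trace."""
    counts = defaultdict(lambda: defaultdict(int))
    classes = [classify(p) for p in idle_periods_ms]
    for prev, nxt in zip(classes, classes[1:]):
        counts[prev][nxt] += 1
    return counts

def predict(counts, current_class):
    """Predict the most probable next idle class given the current one."""
    nxt = counts.get(current_class)
    return max(nxt, key=nxt.get) if nxt else "short"  # conservative default

trace = [50, 80, 1500, 3000, 2500, 40, 90, 1800, 2600]
model = train(trace)
print(predict(model, classify(trace[-1])))  # predicted class of next idle period
```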


International Middleware Conference | 2012

Taking garbage collection overheads off the critical path in SSDs

Myoungsoo Jung; Ramya Prabhakar; Mahmut T. Kandemir

Solid state disks (SSDs) have the potential to revolutionize the storage system landscape, mostly due to their good random access performance compared to hard disks. However, garbage collection (GC) in SSDs introduces significant latencies and large performance variations, which renders widespread adoption of SSDs difficult. To address this issue, we present a novel garbage collection strategy, consisting of two components, called Advanced Garbage Collection (AGC) and Delayed Garbage Collection (DGC), that operate collectively to migrate GC operations from busy periods to idle periods. More specifically, AGC is employed to defer GC operations to idle periods in advance, based on the type of the idle periods and on-demand GC needs, whereas DGC complements AGC by handling the collections that could not be handled by AGC. Our comprehensive experimental analysis reveals that the proposed strategies provide stable SSD performance by significantly reducing GC overheads. Compared to the state-of-the-art GC strategies P-FTL, L-FTL, and H-FTL, our AGC+DGC scheme reduces GC overheads, on average, by about 66.7%, 96.7%, and 98.2%, respectively.
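
A rough sketch of the AGC/DGC intuition: run reclamation in idle windows and, when space runs critically low during a busy period, force only as much collection as needed. The scheduler class and watermark are invented and no real flash translation layer is modeled.

```python
from collections import deque

class GCScheduler:
    def __init__(self, free_blocks, low_watermark):
        self.free_blocks = free_blocks
        self.low_watermark = low_watermark  # below this, GC can't be delayed
        self.pending_gc = deque()           # reclamations queued for idle time

    def on_write(self, blocks_consumed):
        """Busy-period path: consume space; force GC only if critically low."""
        self.free_blocks -= blocks_consumed
        while self.free_blocks < self.low_watermark and self.pending_gc:
            self.free_blocks += self.pending_gc.popleft()  # forced, on critical path

    def mark_reclaimable(self, blocks):
        """Record reclaimable space instead of collecting immediately (DGC-like)."""
        self.pending_gc.append(blocks)

    def on_idle(self):
        """Idle-period path: drain deferred GC work ahead of demand (AGC-like)."""
        while self.pending_gc:
            self.free_blocks += self.pending_gc.popleft()

sched = GCScheduler(free_blocks=100, low_watermark=20)
sched.mark_reclaimable(10)
sched.mark_reclaimable(30)
sched.on_write(85)        # drops to 15 < watermark: one forced collection
sched.on_idle()           # remaining deferred work runs off the critical path
print(sched.free_blocks)  # 55
```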


IEEE International Conference on High Performance Computing, Data, and Analytics | 2009

Dynamic storage cache allocation in multi-server architectures

Ramya Prabhakar; Shekhar Srikantaiah; Christina M. Patrick; Mahmut T. Kandemir

We introduce a dynamic and efficient shared cache management scheme, called Maxperf, that manages the aggregate cache space in multi-server storage architectures such that the service level objectives (SLOs) of concurrently executing applications are satisfied and any spare cache capacity is proportionately allocated according to the marginal gains of the applications to maximize performance. We use a combination of Neville's algorithm and a linear programming model to discover the required storage cache partition size, on each server, for every application accessing that server. Experimental results show that our algorithm enforces partitions to provide stronger isolation to applications, meets application-level SLOs even in the presence of dynamically changing storage cache requirements, and significantly improves the I/O latency of individual applications as well as the overall I/O latency compared to two alternative storage cache management schemes and a state-of-the-art single-server storage cache management scheme extended to multi-server architectures.
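
A hedged sketch of the two ingredients the abstract names: Neville's algorithm interpolates each application's hit-rate-versus-cache-size curve from a few measured points, and spare capacity is then handed out by marginal gain. The sample points are invented, and the greedy loop stands in for Maxperf's per-server linear program.

```python
def neville(xs, ys, x):
    """Evaluate the polynomial interpolating (xs, ys) at x (Neville's algorithm)."""
    p = list(ys)
    for k in range(1, len(xs)):
        for i in range(len(xs) - k):
            p[i] = ((x - xs[i + k]) * p[i] + (xs[i] - x) * p[i + 1]) / (xs[i] - xs[i + k])
    return p[0]

# Measured (cache MB, hit rate) samples per application -- illustrative only.
CURVES = {
    "app_a": ([0, 64, 128, 256], [0.00, 0.40, 0.55, 0.65]),
    "app_b": ([0, 64, 128, 256], [0.00, 0.20, 0.38, 0.60]),
}

def allocate(total_mb, step_mb=16):
    """Greedy marginal-gain allocation of spare cache capacity."""
    alloc = {app: 0 for app in CURVES}
    for _ in range(total_mb // step_mb):
        def gain(app):
            xs, ys = CURVES[app]
            return neville(xs, ys, alloc[app] + step_mb) - neville(xs, ys, alloc[app])
        winner = max(alloc, key=gain)
        alloc[winner] += step_mb
    return alloc

print(allocate(256))  # spare capacity split by interpolated marginal gain
```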


Cluster Computing and the Grid | 2009

MPISec I/O: Providing Data Confidentiality in MPI-I/O

Ramya Prabhakar; Christina M. Patrick; Mahmut T. Kandemir

Applications performing scientific computations or processing streaming media benefit significantly from parallel I/O, as they operate on large data sets that require substantial I/O. MPI-I/O is a commonly used library interface for performing I/O efficiently in parallel applications. Optimizations like collective I/O embedded in MPI-I/O allow multiple processes executing in parallel to perform I/O by merging requests of other processes and sharing them later. In such a scenario, preserving the confidentiality of disk-resident data against unauthorized accesses by processes, without significantly impacting application performance, is a challenging task. In this paper, we evaluate the impact of ensuring data confidentiality in MPI-I/O on the performance of parallel applications and provide an enhanced interface, called MPISec I/O, which incurs an overhead of only 5.77% over MPI-I/O in the best case, and about 7.82% on average.
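
A toy sketch of the underlying pattern: transparently encrypt buffers on the write path and decrypt on the read path, so data is confidential at rest while the application keeps its usual I/O interface. This stand-in uses the `cryptography` package's Fernet recipe rather than the DES/AES setups the paper evaluates, and plain files rather than MPI-I/O; the class and path are hypothetical.

```python
from cryptography.fernet import Fernet

class SecureFile:
    """Hypothetical wrapper: ciphertext on disk, plaintext in the application."""

    def __init__(self, path, key):
        self.path = path
        self.cipher = Fernet(key)

    def write(self, data: bytes):
        # Encrypt before the data ever reaches the disk subsystem.
        with open(self.path, "wb") as f:
            f.write(self.cipher.encrypt(data))

    def read(self) -> bytes:
        # Decrypt after reading the ciphertext back from disk.
        with open(self.path, "rb") as f:
            return self.cipher.decrypt(f.read())

key = Fernet.generate_key()
sf = SecureFile("block.enc", key)
sf.write(b"disk-resident payload")
print(sf.read())  # b'disk-resident payload'
```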


EuroMPI 2010: Proceedings of the 17th European MPI Users' Group Meeting on Recent Advances in the Message Passing Interface | 2010

Automated tracing of I/O stack

Seong Jo Kim; Yuanrui Zhang; Seung Woo Son; Ramya Prabhakar; Mahmut T. Kandemir; Christina M. Patrick; Wei-keng Liao; Alok N. Choudhary

Efficient execution of parallel scientific applications requires high-performance storage systems designed to meet their I/O requirements. Most high-performance I/O intensive applications access multiple layers of the storage stack during their disk operations. A typical I/O request from these applications may include accesses to high-level libraries such as MPI I/O, executing on clustered parallel file systems like PVFS2, which are in turn supported by native file systems like Linux. In order to design and implement parallel applications that exercise this I/O stack, it is important to understand the flow of I/O calls through the entire storage system. Such understanding helps in identifying the potential performance and power bottlenecks in different layers of the storage hierarchy. To trace the execution of the I/O calls and to understand the complex interactions of multiple user-libraries and file systems, we propose an automatic code instrumentation technique, which enables us to collect detailed statistics of the I/O stack. Our proposed I/O tracing tool traces the flow of I/O calls across different layers of an I/O stack, and can be configured to work with different file systems and user-libraries. It also analyzes the collected information to generate output in terms of different user-specified metrics of interest.
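
A minimal sketch of call-level instrumentation in the spirit of this tracing tool: wrap the functions at each layer, record call counts and time, and aggregate per layer. The decorator, layer names, and stand-in functions are invented; the real tool instruments MPI-I/O, PVFS2, and the native file system via automated code insertion.

```python
import time
from collections import defaultdict
from functools import wraps

STATS = defaultdict(lambda: {"calls": 0, "seconds": 0.0})

def traced(layer):
    """Decorator attributing a function's call count and time to an I/O layer."""
    def wrap(fn):
        @wraps(fn)
        def inner(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                STATS[layer]["calls"] += 1
                STATS[layer]["seconds"] += time.perf_counter() - start
        return inner
    return wrap

@traced("fs")
def fs_write(data):       # stands in for a native file-system write
    time.sleep(0.001)

@traced("mpiio")
def mpiio_write(data):    # stands in for an MPI-I/O layer call
    fs_write(data)        # higher layers call into lower ones

mpiio_write(b"x" * 4096)
print(dict(STATS))        # per-layer call counts and cumulative time
```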


IEEE International Conference on High Performance Computing, Data, and Analytics | 2011

Virtual I/O caching: dynamic storage cache management for concurrent workloads

Michael R. Frasca; Ramya Prabhakar; Padma Raghavan; Mahmut T. Kandemir

A leading cause of reduced or unpredictable application performance in distributed systems is contention at the storage layer, where resources are multiplexed among many concurrent data intensive workloads. We target the shared storage cache, used to alleviate disk I/O bottlenecks, and propose a new caching paradigm to both improve performance and reduce memory requirements for HPC storage systems. We present the virtual I/O cache, a dynamic scheme to manage a limited storage cache resource. Application behavior and the corresponding performance of a chosen replacement policy are observed at run time, and a mechanism is designed to mitigate suboptimal behavior and increase cache efficiency. We further use the virtual I/O cache to isolate concurrent workloads and globally manage physical resource allocation towards system level performance objectives. We evaluate our scheme using twenty I/O intensive applications and benchmarks. Average hit rate gains over 17% were observed for isolated workloads, as well as cache size reductions near 80% for equivalent performance levels. Our largest concurrent workload achieved hit rate gains over 23%, and an over 80% iso-performance cache reduction.
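
A rough sketch of the run-time monitoring idea only: observe the hit rate the replacement policy actually achieves and react when it underperforms. Here the reaction is a policy switch between LRU and MRU eviction, which is an invented stand-in for the paper's more general mitigation and allocation mechanism; the window and threshold are also made up.

```python
from collections import OrderedDict

class AdaptiveCache:
    def __init__(self, capacity, window=200):
        self.data = OrderedDict()
        self.capacity = capacity
        self.policy = "lru"
        self.window, self.hits, self.accesses = window, 0, 0

    def access(self, key):
        self.accesses += 1
        if key in self.data:
            self.hits += 1
            self.data.move_to_end(key)  # refresh recency
        else:
            if len(self.data) >= self.capacity:
                # LRU evicts the coldest entry; MRU evicts the hottest, which
                # behaves better under large looping scans.
                self.data.popitem(last=(self.policy == "mru"))
            self.data[key] = True
        if self.accesses == self.window:
            if self.hits / self.window < 0.05:  # policy is clearly misbehaving
                self.policy = "mru" if self.policy == "lru" else "lru"
            self.hits = self.accesses = 0

cache = AdaptiveCache(capacity=100)
for _ in range(3):            # a looping scan larger than the cache:
    for block in range(150):  # pathological for LRU, tolerable for MRU
        cache.access(block)
print(cache.policy)           # the monitor has switched away from LRU
```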


International Conference on Supercomputing | 2010

Adaptive multi-level cache allocation in distributed storage architectures

Ramya Prabhakar; Shekhar Srikantaiah; Mahmut T. Kandemir; Christina M. Patrick

The increasing complexity of large-scale applications and the continuous growth in their data set sizes, combined with slow improvements in disk access latencies, have made I/O a performance bottleneck. While there are several ways of improving the I/O access latencies of data-intensive applications, one promising approach is to use different layers of the I/O subsystem to cache recently and/or frequently used data so that the number of I/O requests reaching the disk is reduced. These different layers of caches across the storage hierarchy introduce the need for efficient cache management schemes to derive maximum performance benefits. Several state-of-the-art multi-level storage cache management schemes focus on optimizing aggregate hit rate or overall I/O latency, while being agnostic to Service Level Objectives (SLOs). Also, most existing work focuses on cache replacement algorithms for managing storage caches and discusses exclusive caching techniques in the context of a multi-level cache hierarchy. However, the orthogonal problem of allocating storage cache space to multiple, simultaneously-running applications in a multi-level hierarchy of storage caches with multiple storage servers has remained open. In this work, using a combination of a per-application latency model and a linear programming model, we proportion storage caches dynamically among multiple concurrently-executing applications, across the different levels of the storage hierarchy and across multiple servers, to provide isolation to applications while satisfying application-level SLOs. Further, our algorithm improves overall system performance significantly.
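
A hedged sketch of what a per-application latency model across cache levels can look like: expected I/O latency as a probability-weighted sum over levels, where each level's hit rate depends on the space allocated there. The hit-rate curve, level latencies, and SLO value are invented; the actual scheme couples such a model with a linear program across servers.

```python
LEVEL_LATENCY_MS = {"client": 0.05, "server": 0.5, "disk": 8.0}

def hit_rate(alloc_mb, half_mb=128):
    """Toy diminishing-returns hit-rate curve: h(s) = s / (s + half_mb)."""
    return alloc_mb / (alloc_mb + half_mb)

def expected_latency(client_mb, server_mb):
    """Probability-weighted latency across client cache, server cache, disk."""
    h1 = hit_rate(client_mb)
    h2 = hit_rate(server_mb)
    return (h1 * LEVEL_LATENCY_MS["client"]
            + (1 - h1) * h2 * LEVEL_LATENCY_MS["server"]
            + (1 - h1) * (1 - h2) * LEVEL_LATENCY_MS["disk"])

# An SLO check of the kind an allocator would enforce per application:
slo_ms = 4.0
print(expected_latency(64, 256), expected_latency(64, 256) <= slo_ms)
```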


Fourth International IEEE Security in Storage Workshop | 2007

Securing Disk-Resident Data through Application Level Encryption

Ramya Prabhakar; Seung Woo Son; Christina M. Patrick; Sri Hari Krishna Narayanan; Mahmut T. Kandemir

Confidentiality of disk-resident data is critical for end-to-end security of storage systems. While there are several widely used mechanisms for ensuring confidentiality of data in transit, techniques for providing confidentiality when data is stored in a disk subsystem are relatively new. As opposed to prior file-system-based approaches to this problem, this paper proposes an application-level solution, which allows encryption of select data blocks. We make three major contributions: 1) quantifying the tradeoffs between confidentiality and performance; 2) evaluating a reuse-distance-oriented approach for selective encryption of disk-resident data; and 3) proposing a profile-guided approach that approximates the behavior of the reuse-distance-oriented approach. The experiments with five applications that manipulate disk-resident data sets clearly show that our approach enables us to study the confidentiality/performance tradeoffs. Using our approach, it is possible to reduce the performance degradation due to encryption/decryption overheads by 46.5% on average when DES is used as the encryption mechanism, and by 30.63% when AES is used.
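
A small sketch of the reuse-distance-oriented selection idea: compute each block's minimum reuse distance from an access trace and encrypt only the blocks that are reused rarely, keeping decryption overhead off the hot path. The trace, threshold, and selection rule are illustrative, not the paper's exact policy.

```python
def min_reuse_distances(trace):
    """For each block, the smallest number of distinct other blocks touched
    between two consecutive accesses to it (infinity if never reused)."""
    last_seen, distance = {}, {}
    for i, block in enumerate(trace):
        if block in last_seen:
            between = len(set(trace[last_seen[block] + 1 : i]))
            distance[block] = min(distance.get(block, float("inf")), between)
        else:
            distance.setdefault(block, float("inf"))
        last_seen[block] = i
    return distance

def blocks_to_encrypt(trace, threshold):
    """Select blocks whose reuse distance exceeds the threshold: they are
    touched rarely, so their decryption cost is paid rarely."""
    return {b for b, d in min_reuse_distances(trace).items() if d > threshold}

trace = ["a", "b", "a", "c", "d", "e", "f", "c", "a"]
print(blocks_to_encrypt(trace, threshold=2))  # {'b', 'c', 'd', 'e', 'f'}
```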

Collaboration


Dive into Ramya Prabhakar's collaborations.

Top Co-Authors

Mahmut T. Kandemir (Pennsylvania State University)
Christina M. Patrick (Pennsylvania State University)
Rajat Garg (Pennsylvania State University)
Shekhar Srikantaiah (Pennsylvania State University)
Padma Raghavan (Pennsylvania State University)
Seung Woo Son (Argonne National Laboratory)
Michael R. Frasca (Pennsylvania State University)
Yuanrui Zhang (Pennsylvania State University)