Himabindu Pucha | Researchain

Publication

Featured research published by Himabindu Pucha.

IEEE Transactions on Computers | 2012

Exploiting Spatio-Temporal Tradeoffs for Energy-Aware MapReduce in the Cloud

Michael Cardosa; Aameek Singh; Himabindu Pucha; Abhishek Chandra

MapReduce is a distributed computing paradigm widely used for building large-scale data processing applications. When used in cloud environments, MapReduce clusters are dynamically created using virtual machines (VMs) and managed by the cloud provider. In this paper, we study the energy efficiency problem for such MapReduce clouds. We describe a unique spatio-temporal tradeoff that includes efficient spatial fitting of VMs on servers to achieve high utilization of machine resources, as well as balanced temporal fitting of servers with VMs having similar runtimes to ensure a server runs at a high utilization throughout its uptime. We propose VM placement algorithms that explicitly incorporate these tradeoffs. In addition, we propose techniques that dynamically scale MapReduce clusters to reduce energy consumption further while ensuring that jobs meet or improve their expected runtimes. Our algorithms achieve energy savings over existing placement techniques, and an additional optimization technique further achieves savings while simultaneously improving job performance.
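
The temporal-fitting idea in this abstract can be illustrated with a toy model. The sketch below is not the paper's placement algorithm; it assumes that every VM occupies one slot, that a server stays powered on until its longest-running VM finishes, and that energy is proportional to total server uptime. The function names `total_uptime` and `group_by_runtime` are hypothetical.

```python
def total_uptime(placement):
    """Sum of server uptimes, where each server runs as long as its longest VM.

    `placement` is a list of servers; each server is a list of VM runtimes.
    Assumption: energy is proportional to this total uptime.
    """
    return sum(max(server) for server in placement if server)


def group_by_runtime(runtimes, capacity):
    """Place VMs with similar runtimes together by sorting, then filling servers.

    Assumption: each VM takes exactly one of `capacity` slots on a server.
    """
    ordered = sorted(runtimes)
    return [ordered[i:i + capacity] for i in range(0, len(ordered), capacity)]


# Four VMs (two short, two long) on servers with two slots each.
mixed = [[1, 10], [1, 10]]                      # short and long VMs share servers
grouped = group_by_runtime([1, 10, 1, 10], 2)   # similar runtimes share servers

print("mixed uptime:  ", total_uptime(mixed))
print("grouped uptime:", total_uptime(grouped))
```

Under these assumptions, mixing short and long VMs keeps both servers on for the full duration of the long jobs, while grouping by runtime lets one server shut down early.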

Symposium on Operating Systems Principles | 2010

Energy proportionality for storage: impact and feasibility

Wendy Belluomini; Joseph S. Glider; Karan Gupta; Himabindu Pucha

This paper highlights the growing importance of storage energy consumption in a typical data center, and asserts that storage energy research should drive towards a vision of energy proportionality for achieving significant energy savings. Our analysis of real-world enterprise workloads shows a potential energy reduction of 40-75% using an ideally proportional system. We then present a preliminary analysis of appropriate techniques to achieve proportionality, chosen to match both application requirements and workload characteristics. Based on the techniques we have identified, we believe that energy proportionality is achievable in storage systems at a time scale that will make sense in real world environments.

IBM Journal of Research and Development | 2011

GPFS-SNC: an enterprise storage framework for virtual-machine clouds

Karan Gupta; Reshu Jain; Ioannis Koltsidas; Himabindu Pucha; Prasenjit Sarkar; Mark James Seaman; Dinesh Subhraveti

In a typical cloud computing environment, the users are provided with storage and compute capacity in the form of virtual machines. The underlying infrastructure for these services typically comprises large distributed clusters of commodity machines and direct-attached storage in concert with a server virtualization layer. The focus of this paper is on an enterprise storage framework that supports the timely and resource-efficient deployment of virtual machines in such a cloud environment. The proposed framework makes use of innovations in the General Parallel File System-Shared Nothing Clusters (GPFS®-SNC) file system, supports optimal allocation of resources to virtual machines in a hypervisor-agnostic fashion, achieves low latency when provisioning for new virtual machines, and adapts to the input-output needs of each virtual-machine instance in order to achieve high performance for all types of applications.

International Conference on Computer Communications | 2010

Efficient Similarity Estimation for Systems Exploiting Data Redundancy

Kanat Tangwongsan; Himabindu Pucha; David G. Andersen; Michael Kaminsky

Many modern systems exploit data redundancy to improve efficiency. These systems split data into chunks, generate identifiers for each of them, and compare the identifiers among other data items to identify duplicate chunks. As a result, chunk size becomes a critical parameter for the efficiency of these systems: it trades potentially improved similarity detection (smaller chunks) with increased overhead to represent more chunks. Unfortunately, the similarity between files increases unpredictably with smaller chunk sizes, even for data of the same type. Existing systems often pick one chunk size that is "good enough" for many cases because they lack efficient techniques to determine the benefits at other chunk sizes. This paper addresses this deficiency via two contributions: (1) we present multi-resolution (MR) handprinting, an application-independent technique that efficiently estimates similarity between data items at different chunk sizes using a compact, multi-size representation of the data; (2) we then evaluate the application of MR handprints to workloads from peer-to-peer, file transfer, and storage systems, demonstrating that the chunk size selection enabled by MR handprints can lead to real improvements over using a fixed chunk size in these systems.
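
The baseline that this abstract describes (split data into chunks, generate an identifier per chunk, compare identifiers) can be sketched as follows. This is a minimal illustration, not MR handprinting itself. It assumes fixed-size chunks, SHA-1 digests as identifiers, and Jaccard similarity over the identifier sets; deployed systems often use content-defined chunk boundaries instead. The function names are hypothetical.

```python
import hashlib


def chunk_ids(data: bytes, chunk_size: int) -> set:
    """Split data into fixed-size chunks and return the set of chunk digests."""
    return {
        hashlib.sha1(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    }


def similarity(a: bytes, b: bytes, chunk_size: int) -> float:
    """Jaccard similarity between the chunk-identifier sets of two items."""
    ids_a, ids_b = chunk_ids(a, chunk_size), chunk_ids(b, chunk_size)
    if not ids_a and not ids_b:
        return 1.0  # two empty items are treated as identical
    return len(ids_a & ids_b) / len(ids_a | ids_b)


# Two 16-byte items that differ in a single byte.
sample_a = b"AAAABBBBCCCCDDDD"
sample_b = b"AAAABBBBCCCCDDXD"

for size in (16, 8, 4):
    print(size, similarity(sample_a, sample_b, size))
```

With these inputs, the measured similarity rises as the chunk size shrinks, at the cost of more identifiers to store and compare, which is the tradeoff the abstract points to.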

International Conference on Cloud Computing | 2011

Exploiting Spatio-temporal Tradeoffs for Energy-Aware MapReduce in the Cloud

Michael Cardosa; Aameek Singh; Himabindu Pucha; Abhishek Chandra

MapReduce is a distributed computing paradigm widely used for building large-scale data processing applications. When used in cloud environments, MapReduce clusters are dynamically created using virtual machines (VMs) and managed by the cloud provider. In this paper, we study the energy efficiency problem for such MapReduce clusters in private cloud environments that are characterized by repeated batch execution of jobs. We describe a unique spatio-temporal tradeoff that includes efficient spatial fitting of VMs on servers to achieve high utilization of machine resources, as well as balanced temporal fitting of servers with VMs having similar runtimes to ensure a server runs at a high utilization throughout its uptime. We propose VM placement algorithms that explicitly incorporate these tradeoffs. Our algorithms achieve energy savings over existing placement techniques, and an additional optimization technique further achieves savings while simultaneously improving job performance.

IEEE International Conference on High Performance Computing, Data, and Analytics | 2011

STEAMEngine: Driving MapReduce provisioning in the cloud

Michael Cardosa; Piyush Narang; Abhishek Chandra; Himabindu Pucha; Aameek Singh

MapReduce has gained in popularity as a distributed data analysis paradigm, particularly in the cloud, where MapReduce jobs are run on virtual clusters. The provisioning of MapReduce jobs in the cloud is an important problem for optimizing several user as well as provider-side metrics, such as runtime, cost, throughput, energy, and load. In this paper, we present an intelligent provisioning framework called STEAMEngine that consists of provisioning algorithms to optimize these metrics through a set of common building blocks. These building blocks enable spatio-temporal tradeoffs unique to MapReduce provisioning: along with their resource requirements (spatial component), a MapReduce job runtime (temporal component) is a critical element for any provisioning algorithm. We also describe two novel provisioning algorithms, a user-driven performance optimization and a provider-driven energy optimization, that leverage these building blocks. Our experimental results based on an Amazon EC2 cluster and a local Xen/Hadoop cluster show the benefits of STEAMEngine through improvements in performance and energy via the use of these algorithms and building blocks.

IEEE International Conference on Cloud Computing Technology and Science | 2009