Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Sriram Govindan is active.

Publication


Featured research published by Sriram Govindan.


virtual execution environments | 2007

Xen and co.: communication-aware CPU scheduling for consolidated xen-based hosting platforms

Sriram Govindan; Arjun R. Nath; Amitayu Das; Bhuvan Urgaonkar; Anand Sivasubramaniam

Recent advances in software and architectural support for server virtualization have created interest in using this technology in the design of consolidated hosting platforms. Since virtualization enables easier and faster application migration as well as secure co-location of antagonistic applications, higher degrees of server consolidation are likely to result in such virtualization-based hosting platforms (VHPs). We identify a key shortcoming in existing virtual machine monitors (VMMs) that proves to be an obstacle in operating hosting platforms, such as Internet data centers, under conditions of such high consolidation: CPU schedulers that are agnostic to the communication behavior of modern, multi-tier applications. We develop a new communication-aware CPU scheduling algorithm to alleviate this problem. We implement our algorithm in the Xen VMM and build a prototype VHP on a cluster of servers. Our experimental evaluation with realistic Internet server applications and benchmarks demonstrates the performance/cost benefits and the wide applicability of our algorithms. For example, the TPC-W benchmark exhibited improvements in average response times of up to 35% for a variety of consolidation scenarios. A streaming media server hosted on our prototype VHP was able to satisfactorily service up to 3.5 times as many clients as one running on the default Xen.
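The core idea, giving preferential CPU access to VMs with pending network traffic, can be sketched in a few lines (a toy model, not the paper's Xen implementation; all names are illustrative):

```python
# Toy sketch of communication-aware scheduling: among runnable VMs,
# one with pending network I/O is picked ahead of purely CPU-bound
# VMs, so a packet does not wait behind long CPU bursts for its
# destination VM to be scheduled.
from collections import deque

def pick_next(run_queue, pending_io):
    """Return the next VM to run: prefer a VM with pending I/O,
    falling back to plain FIFO order among the rest."""
    for vm in run_queue:
        if vm in pending_io:
            run_queue.remove(vm)
            return vm
    return run_queue.popleft() if run_queue else None

run_queue = deque(["cpu_vm1", "web_vm", "cpu_vm2"])
pending_io = {"web_vm"}          # a request just arrived for web_vm
first = pick_next(run_queue, pending_io)
```

In the sketch, the CPU-bound VM ahead of `web_vm` in the queue no longer delays the response to the waiting network request.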


international symposium on computer architecture | 2011

Benefits and limitations of tapping into stored energy for datacenters

Sriram Govindan; Anand Sivasubramaniam; Bhuvan Urgaonkar

Datacenter power consumption has a significant impact on both its recurring electricity bill (Op-ex) and one-time construction costs (Cap-ex). Existing work optimizing these costs has relied primarily on throttling devices or workload shaping, both with performance degrading implications. In this paper, we present a novel knob of energy buffer (eBuff) available in the form of UPS batteries in datacenters for this cost optimization. Intuitively, eBuff stores energy in UPS batteries during “valleys” - periods of lower demand, which can be drained during “peaks” - periods of higher demand. UPS batteries are normally used as a fail-over mechanism to transition to captive power sources upon utility failure. Furthermore, frequent discharges can cause UPS batteries to fail prematurely. We conduct detailed analysis of battery operation to figure out feasible operating regions given such battery lifetime and datacenter availability concerns. Using insights learned from this analysis, we develop peak reduction algorithms that combine the UPS battery knob with existing throttling based techniques for minimizing datacenter power costs. Using an experimental platform, we offer insights about Op-ex savings offered by eBuff for a wide range of workload peaks/valleys, UPS provisioning, and application SLA constraints. We find that eBuff can be used to realize 15-45% peak power reduction, corresponding to 6-18% savings in Op-ex across this spectrum. eBuff can also play a role in reducing Cap-ex costs by allowing tighter overbooking of power infrastructure components and we quantify the extent of such Cap-ex savings. To our knowledge, this is the first paper to exploit stored energy - typically lying untapped in the datacenter - to address the peak power draw problem.
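The eBuff idea of draining batteries during peaks and recharging during valleys can be sketched as follows (a simplified model with hypothetical numbers; it ignores the battery lifetime and availability constraints the paper analyzes):

```python
# Sketch of battery-based peak shaving: cap grid draw at a threshold
# by discharging stored energy during peaks and recharging during
# valleys, subject to battery capacity and a charge-rate limit.
def shave_peaks(demand, cap, battery, capacity, charge_rate=5.0):
    """demand: per-interval power draw (kW); cap: target grid cap (kW);
    battery: initial stored energy; capacity: maximum stored energy.
    Returns the resulting per-interval grid-draw series."""
    grid = []
    for d in demand:
        if d > cap:                      # peak: drain the battery
            assist = min(d - cap, battery)
            battery -= assist
            grid.append(d - assist)
        else:                            # valley: recharge
            refill = min(cap - d, charge_rate, capacity - battery)
            battery += refill
            grid.append(d + refill)
    return grid

demand = [40, 90, 100, 30, 95]
grid = shave_peaks(demand, cap=80, battery=20, capacity=25)
```

With enough stored energy the peaks are clipped to the cap; once the battery runs dry (the third and fifth intervals above), the residual peak passes through to the grid.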


symposium on cloud computing | 2011

Cuanta: quantifying effects of shared on-chip resource interference for consolidated virtual machines

Sriram Govindan; Jie Liu; Aman Kansal; Anand Sivasubramaniam

Workload consolidation is very attractive for cloud platforms for several reasons, including reduced infrastructure costs, lower energy consumption, and ease of management. Advances in virtualization hardware and software continue to improve resource isolation among consolidated workloads, but one form of resource interference has yet to see a widely adopted commercial solution: the interference due to shared processor caches. Existing solutions for handling cache interference require new hardware features, extensive software changes, or reduce the achieved overall throughput. A crucial requirement for effective consolidation is the ability to predict the impact of cache interference among consolidated workloads. In this paper, we present a practical technique for predicting performance interference due to shared processor caches that works on current processor architectures and requires minimal software changes. While performance degradation can be empirically measured for a given placement of consolidated workloads, the number of possible placements grows exponentially with the number of workloads, so actually measuring degradation for every possible placement is impractical. Our technique predicts the degradation for any possible placement using only a linear number of measurements, and can be used to select the most efficient consolidation pattern for given performance and resource constraints. An average prediction error of less than 4% is achieved across a wide variety of benchmark workloads, using the Xen VMM on Intel Core 2 Duo and Nehalem quad-core processor platforms. We also illustrate the usefulness of our prediction technique in realizing better workload placement decisions for given performance and resource cost objectives.
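The linear-measurement idea can be illustrated with a toy model (hypothetical profile and footprint numbers; the paper's actual methodology runs workloads against cache-pressure generators on real hardware): each workload is profiled once against a few levels of synthetic cache pressure, and degradation under any placement is then predicted from the co-runners' cache footprints rather than measured.

```python
# Sketch: a per-workload profile maps "fraction of cache taken by
# others" to measured slowdown. Predicting a placement then reduces
# to estimating how much cache the co-runners occupy and reading the
# nearest profiled pressure level - no per-placement measurement.
def predict_degradation(profile, co_runner_footprints, cache_size):
    """profile: {occupied_fraction: slowdown} measured once per workload;
    co_runner_footprints: cache demand (MB) of each co-located VM."""
    occupied = min(sum(co_runner_footprints) / cache_size, 1.0)
    # pick the profiled pressure level closest to the predicted occupancy
    level = min(profile, key=lambda f: abs(f - occupied))
    return profile[level]

profile = {0.0: 1.00, 0.25: 1.03, 0.5: 1.10, 0.75: 1.22, 1.0: 1.35}
slow = predict_degradation(profile, co_runner_footprints=[2.0, 1.0],
                           cache_size=6.0)
```

Profiling each of N workloads at a handful of pressure levels costs O(N) runs, yet yields predictions for all exponentially many placements.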


european conference on computer systems | 2009

Statistical profiling-based techniques for effective power provisioning in data centers

Sriram Govindan; Jeonghwan Choi; Bhuvan Urgaonkar; Anand Sivasubramaniam; Andrea Baldini

Current capacity planning practices based on heavy over-provisioning of power infrastructure hurt (i) the operational costs of data centers as well as (ii) the computational work they can support. We explore a combination of statistical multiplexing techniques to improve the utilization of the power hierarchy within a data center. At the highest level of the power hierarchy, we employ controlled under-provisioning and over-booking of the power needs of hosted workloads. At the lower levels, we introduce the novel notion of soft fuses to flexibly distribute provisioned power among hosted workloads based on their needs. Our techniques are built upon a measurement-driven profiling and prediction framework to characterize key statistical properties of the power needs of hosted workloads and their aggregates. We characterize the gains in terms of the amount of computational work (CPU cycles) per provisioned unit of power, a metric we call Computation per Provisioned Watt (CPW). Our technique is able to double the CPW offered by a Power Distribution Unit (PDU) running the e-commerce benchmark TPC-W compared to conventional provisioning practices. Over-booking the PDU by 10% based on tails of power profiles yields a further improvement of 20%. Reactive techniques implemented on our Xen VMM-based servers dynamically modulate CPU DVFS states to ensure power draw below the limits imposed by soft fuses. Finally, the information captured in our profiles also provides ways of controlling application performance degradation despite overbooking: the 95th percentile of TPC-W session response time only grew from 1.59 sec to 1.78 sec, a degradation of 12%.
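The overbooking intuition, that per-server peaks rarely coincide and so the tail of the aggregate power profile sits well below the sum of individual peaks, can be demonstrated with synthetic numbers (all values hypothetical, not the paper's measurements):

```python
# Sketch of statistical multiplexing of power: provision for a high
# percentile of the *aggregate* profile instead of the sum of
# per-server peaks. Peaks are rare and uncorrelated, so the aggregate
# tail is far below the worst-case sum.
import random
import statistics

random.seed(0)
servers, intervals = 10, 1000
# per-server samples: mostly ~150 W with occasional ~250 W peaks
samples = [[250 if random.random() < 0.05 else 150
            for _ in range(intervals)] for _ in range(servers)]

sum_of_peaks = sum(max(s) for s in samples)      # conventional provisioning
aggregate = [sum(s[t] for s in samples) for t in range(intervals)]
p99 = statistics.quantiles(aggregate, n=100)[98] # 99th-percentile draw
```

Provisioning the PDU for `p99` rather than `sum_of_peaks` is exactly the over-booking knob; the residual tail risk is what the soft fuses and reactive DVFS throttling then guard against.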


architectural support for programming languages and operating systems | 2012

Leveraging stored energy for handling power emergencies in aggressively provisioned datacenters

Sriram Govindan; Di Wang; Anand Sivasubramaniam; Bhuvan Urgaonkar

Datacenters spend $10-25 per watt in provisioning their power infrastructure, regardless of the watts actually consumed. Since peak power needs arise rarely, provisioning power infrastructure for them can be expensive. One can, thus, aggressively under-provision infrastructure assuming that simultaneous peak draw across all equipment will happen rarely. The resulting non-zero probability of emergency events where power needs exceed provisioned capacity, however small, mandates graceful reaction mechanisms to cap the power draw instead of leaving it to disruptive circuit breakers/fuses. Existing strategies for power capping use temporal knobs local to a server that throttle the rate of execution (using power modes), and/or spatial knobs that redirect/migrate excess load to regions of the datacenter with more power headroom. We show these mechanisms to have performance-degrading ramifications and propose an entirely orthogonal solution that leverages existing UPS batteries to temporarily augment the utility supply during emergencies. We build an experimental prototype to demonstrate such power capping on a cluster of 8 servers, each with an individual battery, and implement several online heuristics in the context of different datacenter workloads to evaluate their effectiveness in handling power emergencies. We show that: (i) our battery-based solution can handle emergencies of short duration on its own, (ii) it can supplement existing reaction mechanisms to enhance their efficacy for longer emergencies, and (iii) batteries provide feasible options even when other knobs do not suffice.


modeling, analysis, and simulation on computer and telecommunication systems | 2008

Profiling, Prediction, and Capping of Power Consumption in Consolidated Environments

Jeonghwan Choi; Sriram Govindan; Bhuvan Urgaonkar; Anand Sivasubramaniam

Consolidation of workloads has emerged as a key mechanism to dampen the rapidly growing energy expenditure within enterprise-scale data centers. To gainfully utilize consolidation-based techniques, we must be able to characterize the power consumption of groups of co-located applications. Such characterization is crucial for effective prediction and enforcement of appropriate limits on power consumption - power budgets - within the data center. We identify two kinds of power budgets: (i) an average budget to capture an upper bound on long-term energy consumption within that level and (ii) a sustained budget to capture any restrictions on sustained draw of current above a certain threshold. Using a simple measurement infrastructure, we derive power profiles - statistical descriptions of the power consumption of applications. Based on insights gained from detailed profiling of several applications, both individual and consolidated, we develop models for predicting average and sustained power consumption of consolidated applications. We conduct an experimental evaluation of our techniques on a Xen-based server that consolidates applications drawn from a diverse pool. For a variety of consolidation scenarios, we are able to predict average power consumption within a 5% error margin and sustained power within a 10% error margin. Our sustained power prediction techniques allow us to predict close yet safe upper bounds on the sustained power consumption of consolidated applications.
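A toy check of the two budget notions for this paper, an average budget and a sustained budget, can be written directly (thresholds and trace values are hypothetical; this is not the paper's prediction model):

```python
# Sketch of enforcing the two budget types on a power trace: the
# average budget bounds long-term mean draw, while the sustained
# budget bounds how long draw may stay above a threshold.
def violates_budgets(trace, avg_budget, sustain_thresh, sustain_len):
    """True if mean draw exceeds avg_budget, or draw stays above
    sustain_thresh for sustain_len or more consecutive intervals."""
    if sum(trace) / len(trace) > avg_budget:
        return True
    run = 0
    for p in trace:
        run = run + 1 if p > sustain_thresh else 0
        if run >= sustain_len:
            return True
    return False

violates_a = violates_budgets([100, 120, 180, 180, 110],
                              avg_budget=150, sustain_thresh=170,
                              sustain_len=3)   # brief peak: within budget
violates_b = violates_budgets([100, 120, 180, 180, 180],
                              avg_budget=150, sustain_thresh=170,
                              sustain_len=3)   # long peak: violation
```

The second trace trips both conditions: its mean exceeds the average budget, and its draw stays above the sustained threshold for three consecutive intervals.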


dependable systems and networks | 2014

Characterizing Application Memory Error Vulnerability to Optimize Datacenter Cost via Heterogeneous-Reliability Memory

Yixin Luo; Sriram Govindan; Bikash Sharma; Mark Santaniello; Justin Meza; Aman Kansal; Jie Liu; Badriddine Khessib; Kushagra Vaid; Onur Mutlu

Memory devices represent a key component of datacenter total cost of ownership (TCO), and techniques used to reduce errors that occur on these devices increase this cost. Existing approaches to providing reliability for memory devices pessimistically treat all data as equally vulnerable to memory errors. Our key insight is that there exists a diverse spectrum of tolerance to memory errors in new data-intensive applications, and that traditional one-size-fits-all memory reliability techniques are inefficient in terms of cost. For example, we found that while traditional error protection increases memory system cost by 12.5%, some applications can achieve 99.00% availability on a single server with a large number of memory errors without any error protection. This presents an opportunity to greatly reduce server hardware cost by provisioning the right amount of memory reliability for different applications. Toward this end, in this paper, we make three main contributions to enable highly reliable servers at low datacenter cost. First, we develop a new methodology to quantify the tolerance of applications to memory errors. Second, using our methodology, we perform a case study of three new data-intensive workloads (an interactive web search application, an in-memory key-value store, and a graph mining framework) to identify new insights into the nature of application memory error vulnerability. Third, based on our insights, we propose several new hardware/software heterogeneous-reliability memory system designs to lower datacenter cost while achieving high reliability, and discuss their trade-offs. We show that our new techniques can reduce server hardware cost by 4.7% while achieving 99.90% single-server availability.


IEEE Transactions on Computers | 2009

Xen and Co.: Communication-Aware CPU Management in Consolidated Xen-Based Hosting Platforms

Sriram Govindan; Jeonghwan Choi; Arjun R. Nath; Amitayu Das; Bhuvan Urgaonkar; Anand Sivasubramaniam

Recent advances in software and architectural support for server virtualization have created interest in using this technology in the design of consolidated hosting platforms. Since virtualization enables easier and faster application migration as well as secure colocation of antagonistic applications, higher degrees of server consolidation are likely to result in such virtualization-based hosting platforms (VHPs). We identify two shortcomings in existing virtual machine monitors (VMMs) that prove to be obstacles in operating hosting platforms, such as Internet data centers, under conditions of such high consolidation: 1) CPU schedulers that are agnostic to the communication behavior of modern, multitier applications and 2) inadequate or inaccurate mechanisms for accounting the CPU overheads of I/O virtualization. We develop a new communication-aware CPU scheduling algorithm and a CPU usage accounting mechanism. We implement our algorithms in the Xen VMM and build a prototype VHP on a cluster of 36 servers. Our experimental evaluation with realistic Internet server applications and benchmarks demonstrates the performance/cost benefits and the wide applicability of our algorithms. For example, the TPC-W benchmark exhibited improvements in average response times between 20 and 35 percent for a variety of consolidation scenarios. A streaming media server hosted on our prototype VHP was able to satisfactorily service up to 3.5 times as many clients as one running on the default Xen.


architectural support for programming languages and operating systems | 2014

Underprovisioning backup power infrastructure for datacenters

Di Wang; Sriram Govindan; Anand Sivasubramaniam; Aman Kansal; Jie Liu; Badriddine Khessib

While there has been prior work to underprovision the power distribution infrastructure for a datacenter to save costs, the ability to underprovision the backup power infrastructure, which contributes significantly to capital costs, is little explored. There are two main components in the backup infrastructure - Diesel Generators (DGs) and UPS units - which can both be underprovisioned (or even removed) in terms of their power and/or energy capacities. However, embarking on such underprovisioning mandates studying several ramifications - the resulting cost savings, the lower availability, and the performance and state-loss consequences on individual applications - concurrently. This paper presents the first such study, considering the cost, availability, performance, and application consequences of underprovisioning the backup power infrastructure. We present a framework to quantify the cost of backup capacity that is provisioned, and implement techniques leveraging existing software and hardware mechanisms to provide as seamless an operation as possible for an application within the provisioned backup capacity during a power outage. We evaluate the cost-performance-availability trade-offs for different levels of backup underprovisioning for applications with diverse reliance on the backup infrastructure. Our results show that one may be able to do away with DGs entirely, compensating with additional UPS energy capacity, to significantly cut costs and still handle power outages lasting as long as 40 minutes (which constitute the bulk of outages). Further, we can push the limits of outage duration that can be handled in a cost-effective manner if applications are willing to tolerate degraded performance during the outage. Our evaluations also show that different applications react differently to the outage-handling mechanisms, and that the efficacy of the mechanisms is sensitive to the outage duration. The insights from this paper can spur new opportunities for future work on backup power infrastructure optimization.
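The feasibility question at the heart of underprovisioning backup power, whether UPS energy alone can ride through an outage, reduces to a simple energy balance (a minimal sketch with hypothetical numbers; the paper's framework also models cost and availability):

```python
# Sketch: can the UPS energy on hand carry a given load through an
# outage of duration t, possibly after shedding load to a degraded
# level for the duration of the outage?
def rides_through(ups_kwh, load_kw, outage_min, degraded_kw=None):
    """Return True if stored UPS energy covers the outage, optionally
    assuming load is shed to degraded_kw for the whole outage."""
    draw = degraded_kw if degraded_kw is not None else load_kw
    return ups_kwh >= draw * outage_min / 60.0

full = rides_through(ups_kwh=50, load_kw=100, outage_min=40)
shed = rides_through(ups_kwh=50, load_kw=100, outage_min=40,
                     degraded_kw=60)
```

In this sketch, a 40-minute outage at full load exceeds the stored energy, but shedding load to a degraded level lets the same UPS capacity ride it out, mirroring the paper's observation that tolerating degraded performance extends the coverable outage duration.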


2012 International Green Computing Conference (IGCC) | 2012

The need for speed and stability in data center power capping

Arka Aloke Bhattacharya; David E. Culler; Aman Kansal; Sriram Govindan; Sriram Sankar


Collaboration


Dive into Sriram Govindan's collaborations.

Top Co-Authors

Anand Sivasubramaniam, Pennsylvania State University
Bhuvan Urgaonkar, Pennsylvania State University
Di Wang, Pennsylvania State University
Bikash Sharma, Pennsylvania State University