Publication


Featured research published by Mukil Kesavan.


Symposium on Cloud Computing | 2010

Differential virtual time (DVT): rethinking I/O service differentiation for virtual machines

Mukil Kesavan; Ada Gavrilovska; Karsten Schwan

This paper investigates what it takes to provide I/O service differentiation and performance isolation for virtual machines on individual multicore nodes in cloud platforms. Sharing I/O between VMs is fundamentally different from sharing I/O between processes, because guest VM operating systems use adaptive resource management mechanisms such as TCP congestion avoidance and disk I/O schedulers. These mechanisms are generally sensitive to the magnitude and rate of change of service latencies; a service differentiation framework for I/O that fails to address these latency concerns causes undue performance degradation and hence insufficient isolation between VMs. The paper addresses this problem with the notion of Differential Virtual Time (DVT), which provides service differentiation with performance isolation for VM guest OS resource management mechanisms. DVT is realized within a proportional-share I/O scheduling framework for the Xen hypervisor, and its use requires no changes to guest OSs. DVT is applied to message-based I/O, but is also applicable to subsystems like disk I/O. Experimental results with DVT-based I/O scheduling for representative applications demonstrate the utility and effectiveness of the approach.
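The proportional-share scheduling idea that DVT builds on can be sketched in a few lines. This is an illustrative virtual-time scheduler, not the paper's Xen implementation; the class and method names are invented for the sketch:

```python
import heapq

class ProportionalShareIOScheduler:
    """Minimal virtual-time proportional-share scheduler sketch.

    Each VM has a weight; each request advances the VM's virtual time
    by cost / weight, and requests are dispatched in virtual-time
    order, so a VM with twice the weight receives roughly twice the
    service. (DVT layers latency smoothing on top of a scheduler like
    this; that part is omitted here.)
    """

    def __init__(self):
        self.weights = {}  # vm -> weight
        self.vtime = {}    # vm -> accumulated virtual time
        self.queue = []    # (virtual finish time, seq, vm, request)
        self.seq = 0       # tie-breaker preserving submission order

    def add_vm(self, vm, weight):
        self.weights[vm] = weight
        self.vtime[vm] = 0.0

    def submit(self, vm, request, cost=1.0):
        # Heavier-weighted VMs accumulate virtual time more slowly,
        # so their requests sort earlier and get dispatched more often.
        self.vtime[vm] += cost / self.weights[vm]
        heapq.heappush(self.queue, (self.vtime[vm], self.seq, vm, request))
        self.seq += 1

    def dispatch(self):
        if not self.queue:
            return None
        _, _, vm, request = heapq.heappop(self.queue)
        return vm, request
```

With a weight-2 VM and a weight-1 VM submitting requests alternately, the weight-2 VM is dispatched roughly twice as often, which is the service-differentiation property the paper starts from.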


ASME 2011 Pacific Rim Technical Conference and Exhibition on Packaging and Integration of Electronic and Photonic Systems, MEMS and NEMS: Volume 2 | 2011

Spatially-Aware Optimization of Energy Consumption in Consolidated Data Center Systems

Hui Chen; Mukil Kesavan; Karsten Schwan; Ada Gavrilovska; Pramod Kumar; Yogendra Joshi

Energy efficiency in data center operation depends on many factors, including power distribution, thermal load and consequent cooling costs, and IT management in terms of how and where IT load is placed and moved under changing request loads. Current methods provided by vendors consolidate IT loads onto the smallest number of machines needed to meet application requirements. This paper's goal is to gain further improvements in energy efficiency by also making such methods "spatially aware", so that load is placed onto machines in ways that respect the efficiency of both cooling and power usage, across and within racks. To help implement spatially aware load placement, we propose a model-based reinforcement learning method to learn and then predict the thermal distribution of different placements for incoming workloads. The method is trained with actual data captured in a fully instrumented data center facility. Experimental results showing notable differences in total power consumption for representative application loads indicate the utility of a two-level spatially-aware workload management (SpAWM) technique in which (i) load is distributed across racks in ways that recognize differences in cooling efficiencies and (ii) within racks, load is distributed so as to take into account cooling effectiveness due to local air flow.
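The two-level placement idea can be illustrated with a greedy sketch. The data layout, efficiency scores, and function name below are assumptions for illustration; the paper's method learns a thermal model via reinforcement learning rather than using fixed per-rack and per-host efficiency numbers:

```python
def spatially_aware_place(load_units, racks):
    """Greedy sketch of two-level spatially-aware placement.

    `racks` maps rack name -> {"cooling_eff": float,
                               "hosts": {host: {"eff": float, "free": int}}}.
    Level 1: prefer the rack with the best cooling efficiency.
    Level 2: within that rack, prefer the most efficient host that
    still has free capacity. Returns {unit: (rack, host)}.
    """
    placement = {}
    ranked_racks = sorted(racks, key=lambda r: -racks[r]["cooling_eff"])
    for unit in load_units:
        for rack in ranked_racks:
            hosts = racks[rack]["hosts"]
            candidates = [h for h in hosts if hosts[h]["free"] > 0]
            if not candidates:
                continue  # rack full; fall through to next-best rack
            best = max(candidates, key=lambda h: hosts[h]["eff"])
            hosts[best]["free"] -= 1
            placement[unit] = (rack, best)
            break
    return placement
```

The sketch captures only the two-level preference order (racks by cooling efficiency, then hosts by local airflow effectiveness); the interesting part of SpAWM is predicting those efficiencies from the learned thermal model.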


2012 7th Open Cirrus Summit | 2012

Xerxes: Distributed Load Generator for Cloud-scale Experimentation

Mukil Kesavan; Ada Gavrilovska; Karsten Schwan

With the growing acceptance of cloud computing as a viable computing paradigm, a number of research and real-life dynamic cloud-scale resource allocation and management systems have been developed over the last few years. An important problem facing system developers is the evaluation of such systems at scale. In this paper we present the design of a distributed load generation framework, Xerxes, that can generate appropriate resource load patterns across varying data center scales, thereby representing various cloud load scenarios. Toward this end, we first characterize the resource consumption of four distributed cloud applications that represent some of the most widely used classes of applications in the cloud. We then demonstrate how, using Xerxes, these patterns can be directly replayed at scale, potentially even beyond what is easily achievable through application reconfiguration. Furthermore, Xerxes allows for additional parameter manipulation and exploration of a wide range of load scenarios. Finally, we demonstrate the ability to use Xerxes with publicly available data center traces, which can be replayed across data centers with different configurations. Our experiments are conducted on a 700-node, 2800-core private cloud data center, virtualized with the VMware vSphere virtualization stack. The benefits of such a microbenchmark for cloud-scale experimentation include: (i) decoupling load scaling from application logic, (ii) resilience to faults and failures, since applications tend to crash altogether when some components fail, particularly at scale, and (iii) ease of testing and the ability to understand system behavior in a variety of actual or anticipated scenarios.
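The core fan-out idea, replaying a cluster-level load trace across nodes independently of any application logic, might look like the sketch below. The function and parameter names are invented for illustration; the real framework drives actual CPU, memory, disk, and network load rather than computing a schedule:

```python
def replay_schedule(trace, nodes, interval_s=1.0):
    """Sketch of fanning a utilization trace out across data center nodes.

    `trace` is a list of CPU-utilization fractions, one per interval;
    every node is assigned the same duty cycle, so aggregate load
    follows the trace regardless of how many nodes participate. This
    is what decouples load scaling from application logic. Returns
    {node: [(start_time_s, busy_seconds_in_interval), ...]}.
    """
    schedule = {}
    for node in nodes:
        slots = []
        for i, util in enumerate(trace):
            # Clamp to [0, 1]: a duty cycle cannot be negative or >100%.
            busy = max(0.0, min(1.0, util)) * interval_s
            slots.append((i * interval_s, busy))
        schedule[node] = slots
    return schedule
```

A per-node agent would then spin-loop for `busy` seconds and sleep for the remainder of each interval; because nodes act independently, the loss of one node degrades the aggregate pattern only slightly instead of crashing the whole benchmark.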


2011 International Green Computing Conference and Workshops | 2011

CACM: Current-aware capacity management in consolidated server enclosures

Hui Chen; Meina Song; Junde Song; Ada Gavrilovska; Karsten Schwan; Mukil Kesavan

Using virtualization to consolidate servers is a routine method for reducing power consumption in data centers. Current practice, however, assumes homogeneous servers that operate in a homogeneous physical environment. Experimental evidence collected in our mid-size, fully instrumented data center challenges those assumptions by finding that chassis construction can significantly influence cooling power usage. In particular, the two power domains in a single chassis can have different levels of power efficiency, and further, power consumption is affected by the differences in electrical current levels across these two domains. This paper describes experiments designed to validate these facts, followed by a proposed current-aware capacity management system (CACM) that controls resource allocation across power domains by periodically migrating virtual machines among servers. The method not only fully accounts for the influence of current difference between the two domains, but also enforces power caps and safety levels for node temperatures. Comparisons with industry-standard techniques that are not aware of physical constraints show that current-awareness can improve performance as well as power consumption, yielding about 16% energy savings. Such savings indicate the utility of adding physical awareness to the ways in which IT systems are managed.
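A single rebalancing step in the current-aware spirit can be sketched as follows. The threshold, data layout, and single-VM move are illustrative assumptions, not the paper's CACM controller:

```python
def pick_migration(domains, current_cap):
    """Sketch of one current-aware rebalancing decision.

    `domains` maps power-domain name -> {"current_a": float,
                                         "vms": {vm: load}}.
    If the current drawn by the hottest domain exceeds that of the
    coolest by more than 10% of the cap (an invented threshold), move
    the smallest VM from hot to cool; otherwise do nothing. Returns
    (vm, src_domain, dst_domain) or None.
    """
    hot = max(domains, key=lambda d: domains[d]["current_a"])
    cold = min(domains, key=lambda d: domains[d]["current_a"])
    imbalance = domains[hot]["current_a"] - domains[cold]["current_a"]
    if hot == cold or imbalance <= 0.1 * current_cap or not domains[hot]["vms"]:
        return None
    # Smallest VM first: cheapest migration that reduces the imbalance.
    vm = min(domains[hot]["vms"], key=domains[hot]["vms"].get)
    return (vm, hot, cold)
```

A real controller would run a loop of such steps, additionally checking node temperature safety levels and per-domain power caps before committing each migration.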


International Conference on Distributed Computing Systems | 2017

Fault-Scalable Virtualized Infrastructure Management

Mukil Kesavan; Ada Gavrilovska; Karsten Schwan

Large-scale virtualized datacenters require considerable automation in infrastructure management in order to operate efficiently. Automation is impaired, however, by the fact that deployments are prone to multiple types of subtle faults due to hardware failures, software bugs, misconfiguration, crashes, performance-degraded hardware, etc. Existing Infrastructure-as-a-Service (IaaS) management stacks incorporate little to no resilience measures to shield end users from such cloud provider-level failures and poor performance. This paper proposes and evaluates extensions to IaaS stacks that mask faults in a fault-agnostic manner while ensuring that the overheads can be proportional to observed failure rates. We also demonstrate that infrastructure automation services and end-user applications can use service-specific knowledge, together with our new interface, to achieve better outcomes.
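The idea of masking faults with overhead proportional to the observed failure rate can be illustrated with adaptive retries. The function names and the independence-of-failures assumption are mine, not the paper's interface:

```python
import math

def attempts_needed(failure_rate, target_success=0.99):
    """Smallest n with 1 - failure_rate**n >= target_success.

    Sketch of failure-rate-proportional overhead: a reliable
    deployment (low observed rate) pays ~1 attempt per operation,
    a flaky one pays more. Assumes independent failures.
    """
    p = min(max(failure_rate, 0.0), 0.999)
    if p == 0.0:
        return 1
    return max(1, math.ceil(math.log(1 - target_success) / math.log(p)))

def run_masked(op, failure_rate, target_success=0.99):
    """Run `op` up to attempts_needed(...) times, masking any failure.

    Fault-agnostic: the caller never learns whether the retried fault
    was a crash, a timeout, or a misconfiguration; only the final
    exception propagates if every attempt fails.
    """
    last_err = None
    for _ in range(attempts_needed(failure_rate, target_success)):
        try:
            return op()
        except Exception as e:
            last_err = e
    raise last_err
```

A management stack would feed `failure_rate` from its own monitoring, so the retry budget shrinks automatically as the infrastructure becomes healthier, which is the proportional-overhead property the abstract describes.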


IEEE International Conference on High Performance Computing, Data and Analytics | 2008

Performance implications of virtualizing multicore cluster machines

Adit Ranadive; Mukil Kesavan; Ada Gavrilovska; Karsten Schwan


WIOV'10 Proceedings of the 2nd conference on I/O virtualization | 2010

On disk I/O scheduling in virtual machines

Mukil Kesavan; Ada Gavrilovska; Karsten Schwan


IEEE International Conference on Cloud Computing Technology and Science | 2013

Practical Compute Capacity Management for Virtualized Datacenters

Mukil Kesavan; Irfan Ahmad; Orran Krieger; Ravi Soundararajan; Ada Gavrilovska; Karsten Schwan


International Conference on Cluster Computing | 2008

Active CoordinaTion (ACT) - toward effectively managing virtualized multicore clouds

Mukil Kesavan; Adit Ranadive; Ada Gavrilovska; Karsten Schwan


International Conference on Autonomic Computing | 2014

Exploring Graph Analytics for Cloud Troubleshooting

Chengwei Wang; Karsten Schwan; Brian Laub; Mukil Kesavan; Ada Gavrilovska

Collaboration


Dive into Mukil Kesavan's collaborations.

Top Co-Authors

Ada Gavrilovska, Georgia Institute of Technology
Karsten Schwan, Georgia Institute of Technology
Hui Chen, Beijing University of Posts and Telecommunications
Adit Ranadive, Georgia Institute of Technology
Junde Song, Beijing University of Posts and Telecommunications
Meina Song, Beijing University of Posts and Telecommunications
Brian Laub, Georgia Institute of Technology
Chengwei Wang, Georgia Institute of Technology