2019 18th International Symposium on Parallel and Distributed Computing (ISPDC) | 2019

Portfolio Scheduling for Managing Operational and Disaster-Recovery Risks in Virtualized Datacenters Hosting Business-Critical Workloads

 
 
 

Abstract


Cloud datacenters are increasingly hosting business workloads. Such long-running, on-demand workloads raise important challenges in datacenter operation, requiring efficient online scheduling of workloads with unprecedented characteristics under strict service level agreements (SLAs). In this work, we propose an approach to manage the risk of not meeting SLAs. Our approach is based on portfolio scheduling, which is an online scheduling technique that dynamically selects a scheduling algorithm from a set (portfolio), subject to a possibly changing utility function. Ours is the first datacenter-scheduling approach to consider operational and disaster-recovery risks. Using trace-based simulation with traces collected from a commercial multi-datacenter environment, we give evidence that portfolio scheduling is able to mitigate risks significantly better than its constituent scheduling algorithms and better than datacenter engineers.
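The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three placement policies, the `risk_utility` function, the rejection penalty of 10, and all other names are hypothetical choices made for illustration. The sketch simulates every policy in the portfolio on the same pending workload, scores each outcome with the current utility function, and selects the policy with the lowest score.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A policy maps (VM demand, free capacity per host) to a host index,
# or None when no host can take the VM. All policies here are illustrative.
Policy = Callable[[int, List[int]], Optional[int]]


def first_fit(demand: int, free: List[int]) -> Optional[int]:
    for i, cap in enumerate(free):
        if cap >= demand:
            return i
    return None


def best_fit(demand: int, free: List[int]) -> Optional[int]:
    fits = [i for i, cap in enumerate(free) if cap >= demand]
    return min(fits, key=lambda i: free[i]) if fits else None


def worst_fit(demand: int, free: List[int]) -> Optional[int]:
    fits = [i for i, cap in enumerate(free) if cap >= demand]
    return max(fits, key=lambda i: free[i]) if fits else None


def simulate(policy: Policy, demands: List[int],
             capacities: List[int]) -> Tuple[List[int], int]:
    """Place each VM with the given policy; return free capacity and rejections."""
    free = list(capacities)
    rejected = 0
    for demand in demands:
        host = policy(demand, free)
        if host is None:
            rejected += 1
        else:
            free[host] -= demand
    return free, rejected


def risk_utility(free: List[int], rejected: int, capacities: List[int]) -> float:
    """Hypothetical risk score (lower is better): rejected VMs dominate,
    peak host utilization breaks ties."""
    peak = max((c - f) / c for c, f in zip(capacities, free))
    return 10 * rejected + peak


def select_policy(portfolio: Dict[str, Policy], demands: List[int],
                  capacities: List[int],
                  utility: Callable[[List[int], int, List[int]], float]
                  ) -> Tuple[str, Dict[str, float]]:
    """Simulate every policy in the portfolio and pick the lowest-risk one."""
    scores = {}
    for name, policy in portfolio.items():
        free, rejected = simulate(policy, demands, capacities)
        scores[name] = utility(free, rejected, capacities)
    return min(scores, key=scores.get), scores


PORTFOLIO: Dict[str, Policy] = {
    "first_fit": first_fit,
    "best_fit": best_fit,
    "worst_fit": worst_fit,
}
```

In an online setting, `select_policy` would be called again at each scheduling interval, and the `utility` argument could be swapped when operational priorities change, which is what "a possibly changing utility function" refers to.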

Pages 94-102
DOI 10.1109/ISPDC.2019.00022
Language English
Journal 2019 18th International Symposium on Parallel and Distributed Computing (ISPDC)
