IEEE Transactions on Services Computing | 2019

Joint Scheduling of Overlapping MapReduce Phases: Pair Jobs for Optimization

 
 

Abstract


MapReduce comprises three phases: map, shuffle, and reduce. Because the map phase is CPU-intensive and the shuffle phase is I/O-intensive, the two phases can run in parallel. This paper studies the joint scheduling of overlapping map and shuffle phases to minimize the average job makespan. Two new concepts, the strong pair and the weak pair, are introduced. Two jobs form a strong pair if the shuffle and map workloads of one job equal the map and shuffle workloads of the other job, respectively. Two jobs form a weak pair if their total map workload equals their total shuffle workload. We prove that if the entire set of jobs can be decomposed into strong pairs, then the optimal schedule can execute jobs pairwise, each pair being a strong pair. Building on this insight, several offline and online scheduling policies are proposed, and extensions are made based on weak pairs. Experiments driven by real data validate the efficiency and effectiveness of the proposed policies.
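The two pair definitions above can be illustrated with a minimal Python sketch. It is not the paper's scheduling policy: the `Job` class, its field names, and the helper `decompose_into_strong_pairs` are hypothetical names introduced here for illustration, and workloads are assumed to be directly comparable numbers (integers in the example, so that exact equality is meaningful).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Job:
    # Hypothetical job model: one number per phase workload.
    name: str
    map_work: int
    shuffle_work: int


def is_strong_pair(a: Job, b: Job) -> bool:
    # Strong pair: shuffle and map workloads of one job equal the
    # map and shuffle workloads of the other job, respectively.
    return a.shuffle_work == b.map_work and a.map_work == b.shuffle_work


def is_weak_pair(a: Job, b: Job) -> bool:
    # Weak pair: total map workload equals total shuffle workload.
    return a.map_work + b.map_work == a.shuffle_work + b.shuffle_work


def decompose_into_strong_pairs(jobs: List[Job]) -> Optional[List[Tuple[Job, Job]]]:
    # Illustrative helper (an assumption, not from the paper): try to split
    # the whole job set into strong pairs; return None if that is impossible.
    # A job with workloads (m, s) can only pair with a job of workloads (s, m),
    # so greedy matching on that key is sufficient.
    waiting: Dict[Tuple[int, int], List[Job]] = {}
    pairs: List[Tuple[Job, Job]] = []
    for job in jobs:
        partner_key = (job.shuffle_work, job.map_work)
        candidates = waiting.get(partner_key)
        if candidates:
            pairs.append((candidates.pop(), job))
        else:
            own_key = (job.map_work, job.shuffle_work)
            waiting.setdefault(own_key, []).append(job)
    if any(waiting.values()):
        return None  # at least one job has no strong-pair partner
    return pairs


jobs = [Job("A", 3, 5), Job("B", 5, 3), Job("C", 4, 4), Job("D", 4, 4)]
pairs = decompose_into_strong_pairs(jobs)
```

Every strong pair also satisfies the weak-pair condition, since swapping the two workloads leaves the totals equal; the converse does not hold, for example with workloads (2, 6) and (7, 3).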

Pages 1-1
DOI 10.1109/TSC.2018.2875698
Language English
Journal IEEE Transactions on Services Computing
