Workflow Scheduling in the Cloud with Weighted Upward-rank Priority Scheme Using Random Walk and Uniform Spare Budget Splitting
Hang Zhang, Xiaoying Zheng∗, Ye Xia, and Mingqi Li

Abstract—We study the difficult problem of how to schedule complex workflows with precedence constraints under a limited budget in the cloud environment. We first formulate the scheduling problem as an integer programming problem, which can be optimized and used as the baseline of performance. We then consider the traditional approach of scheduling jobs in a prioritized order based on the upward-rank of each job. For those jobs with no precedence constraints among themselves, the plain upward-rank priority scheme assigns priorities in an arbitrary way. We propose a job prioritization scheme that uses Markov chain stationary probabilities as a measure of the importance of jobs. The scheme keeps the precedence order for the jobs that have precedence constraints between each other, and assigns priorities according to the jobs' importance for the jobs without precedence constraints. We finally design a uniform spare budget splitting strategy, which splits the spare budget uniformly across all the jobs. We test our algorithms on a variety of workflows, including FFT, Gaussian elimination, typical scientific workflows, randomly generated workflows, and workflows from an in-production cluster of an online streaming service company. We compare our algorithms with state-of-the-art algorithms. The empirical results show that the uniform spare budget splitting scheme outperforms the scheme that splits the budget in proportion to extra demand on average for most cases, and that the Markov-chain-based prioritization further improves the workflow makespan.
Index Terms—Workflow scheduling, heterogeneous clouds, budget constraints, precedence constraints, schedule length.
I. INTRODUCTION

THERE is an increasing trend to use the cloud for complex workflows, such as scientific computing workflows and big-data analytics [1] [2] [3]. Customers submit their workflow processing requests, together with their budgets, to the cloud. The workflow management system in the cloud assigns the processing requests to appropriate virtual machines (VMs) by jointly considering the requests, the VM capabilities and the budget. Hopefully, the customers' service level agreements will be met and the objective of the cloud provider will be optimized. However, current workflow management systems are inadequate for scheduling complex workflows with diverse requirements and heterogeneous virtual machines. This has resulted in long processing latency, wasted cloud resources, and poor return on investment.

∗ Corresponding author. H. Zhang is with the School of Computer Engineering and Science, Shanghai University, Shanghai, 200444, China, and also with the Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai, 201210, China. E-mail: [email protected]. X. Zheng and M. Li are with the Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai, 201210, China. E-mail: zhengxy,[email protected]. Y. Xia is with the Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL 32611, USA. E-mail: [email protected]fl.edu.

This paper investigates a workflow scheduling problem in the cloud with budget constraints. More specifically, a set of workflows is to be placed in the cloud. Each workflow has multiple computation jobs, with precedence constraints among them. For each workflow, we can use a directed acyclic graph (DAG) to represent the precedence constraints of the jobs. A job has an execution time, which depends on where the job is placed and how much computing resource is allocated to it. A job has minimum computation resource requirements, including CPU power and memory requirements.
The jobs are placed on a limited set of VMs. The customer is charged only for the period when a VM is used, i.e., on a pay-as-you-go basis. This describes the use case of on-demand VMs in Amazon EC2. With respect to the jobs, the decision problem we consider in this paper is to decide where and when to place each job, i.e., which VM will execute each job and when the execution starts. The precedence constraints and the budget constraints must be satisfied. Furthermore, all the resource capacity constraints at the placement targets must also be respected. The optimization objective is to minimize the processing time of the set of workflows, i.e., the makespan of the workflows.

Scheduling of a workflow represented by a directed task graph is a well-known NP-complete problem in general [4] [5]. The precedence constraints among jobs make the scheduling hard, and many efforts have been made to find efficient heuristics in the areas of parallel computing and grid computing. Topcuoglu et al. proposed the upward-rank based heuristic in [6] to tackle the precedence constraints. In the upward-rank based approach, each job computes, as its upward rank, its accumulated processing time from the exit job upward to itself along the critical (i.e., the longest) path. Jobs are then scheduled in non-increasing order of their ranks. For the jobs with precedence constraints, the upward-rank based scheme assigns the priorities in a reasonable way; but, for those jobs with no precedence constraints among themselves, the upward-rank priority scheme assigns priorities in an arbitrary fashion. In this work, we propose to assign priorities to those unrelated jobs considering the jobs' importance in the global DAG topology. We construct a random walk on the (extended) workflow DAG, and apply the random walk's stationary distribution probabilities as the jobs' importance (i.e., weights). The rationale is that the stationary probabilities are computed recursively across the global topology and carry the global information of all states (jobs) propagated back to each state (job); therefore, the resulting stationary probabilities reflect the jobs' importance in the global topology. Another issue is that, in parallel computing and grid computing, workflow computing often aims to minimize the makespan without considering the cost of the computing facility. In the era of the cloud, the leasing cost of the cloud facility brings a new challenge in scheduling DAG-based workflows. Since jobs are scheduled in a prioritized order and often greedily, how the budget is split and reserved for each job remains a heuristic. In this work, we propose to reserve the minimum required budget for each job, and to assign the spare budget uniformly across the jobs.

We summarize the contributions of our work.
• We formulate an integer programming model of the DAG-based workflow scheduling problem with budget constraints. The model can be evaluated by integer programming solvers such as
Gurobi [7], and the solution can be used as the performance baseline of different heuristics.
• We propose a weighted upward-rank priority scheme that assigns the scheduling priorities to the jobs. It leads to improved performance on average when compared with the plain upward-rank priority scheme in [6]. The weights in our scheme are the stationary probabilities of a random walk on the workflow digraphs.
• We assign the spare budget uniformly across all the jobs. The empirical results show that, for most cases, the uniform spare-budget-splitting scheme outperforms on average the scheme of splitting the budget in proportion to extra demand.

The remainder of the paper is organized as follows. In Section II, we discuss related works. In Section III, we formulate the workflow scheduling problem as an integer programming problem. We describe the weighted upward-rank priority scheme based on a random walk and the uniform spare budget splitting heuristic in Section V. We evaluate the heuristic on empirical test cases in Section VI. Finally, we draw the conclusion in Section VII.

II. RELATED WORKS
DAG-based workflow scheduling has been extensively studied in the literature of parallel computing and grid computing. The survey paper [8] summarized a wide spectrum of algorithms for DAG-based workflow scheduling in a multi-processor environment, including branch-and-bound, integer-programming, searching, randomization, and genetic algorithms. Topcuoglu et al. proposed the Heterogeneous Earliest-Finish-Time (HEFT) algorithm in [6]. The HEFT algorithm first computes the upward-rank of each task by traversing the task graph; it then sorts the tasks non-increasingly by their upward-rank values, and assigns the tasks in the sorted list to the available fastest processor. Upward-rank based task prioritization achieves good performance and has become an important approach to DAG-based workflow scheduling. Daoud and Kharma studied a similar problem in [9] and designed the Longest Dynamic Critical Path (LDCP) algorithm. The LDCP algorithm introduces a DAG for each processor, named DAGP, with the sizes of all the tasks set to their computation costs on that specific processor. It computes the upward-rank of each task within a DAGP to obtain more precise task priorities. The task with the highest upward-rank among all DAGPs is assigned to the proper processor, and all DAGPs are updated after the assignment. A tie is broken by choosing the task with the largest number of outgoing edges. LDCP achieves better scheduling performance than HEFT, but with higher complexity. The work in [10] studies the problem of minimizing the execution time of a workflow in heterogeneous environments and designs an ant-colony based heuristic algorithm. The heuristic generates task sequences considering both the forward and backward (i.e., global) dependency of tasks, where the forward dependency is defined as the number of predecessors, and the backward dependency is defined as the number of successors, respectively.
The algorithm searches for a suitable machine with a greedy minimum strategy in each round of searching. The work in [10] aligns with our view that not only the jobs on the critical path but also the other jobs should be accounted for when we compute the scheduling priorities.

As more and more workflows are moved to the cloud, scheduling DAG-based workflows faces the new challenge of scheduling tasks under budget constraints. Recently, several studies have worked on the budget-constrained workflow makespan minimization problem in the cloud environment [2] [11] [12]. Wang and Shi [2] consider a special κ-stage MapReduce-like workflow where each stage consists of a batch of concurrent jobs. Their approach is to first greedily allocate budget to the slowest job of each stage across all the stages, hoping to minimize the execution time of each stage. It then gradually refines the budget allocation across the stages and schedules the concurrent jobs of each stage based on the budget. Shu and Wu [11] study a workflow mapping problem to minimize the workflow makespan under a budget constraint in public clouds. The work assumes that a job consists of homogeneous tasks and that there is an unlimited number of VMs in the cloud. It pre-computes the most expensive schedule and the cheapest schedule based on the concept of the critical path, and applies binary search to find an approximate solution. The work in [12] considers a budget-constrained workflow scheduling heuristic in a heterogeneous cloud environment. The heuristic algorithm schedules the tasks in a prioritized order based on the upward-rank of each task [6]. The main idea of the algorithm is that it splits and reserves the budget for each individual task. It first assigns each task the minimum budget equal to the cost of using the cheapest VM; then, the remaining budget is split so that each task gets an additional share in proportion to the cost difference between using the cheapest VM and using the most expensive VM.
Hence, by reserving the minimum budget for each task, the algorithm is guaranteed to find a feasible solution. By splitting the extra budget in proportion to each task's extra cost demand, the heuristic reserves more spare budget for the tasks with lower priorities. These tasks then enjoy more flexibility in selecting better VMs. Sakellariou et al. considered the facility cost in a grid environment [13]. They propose two approaches, LOSS and GAIN, to find a minimum-makespan solution under a budget constraint. The LOSS approach starts with the scheduling solution achieved by the HEFT algorithm, and keeps swapping tasks to cheaper machines until the budget constraint is satisfied. The GAIN approach starts with the solution with the cheapest cost, and keeps swapping tasks to faster machines whenever there is available budget. The work in [14] extends the HEFT algorithm in [6] and proposes a Budget-constrained HEFT algorithm (BHEFT). The BHEFT algorithm assigns scheduling priorities based on the upward rank. It splits the budget among the tasks based on each task's average cost over different resources; if there is additional spare budget, the spare budget is assigned to each task in proportion to its demand. With the budget for each individual task, the BHEFT algorithm always assigns the affordable fastest resource to a task. Arabnejad and Barbosa worked on a similar DAG scheduling problem in [15] and proposed the HBCS algorithm. The task prioritization is also based on the upward-rank. The HBCS algorithm computes a worthiness indicator which jointly considers the cost, the remaining budget and the speed of each processor, and assigns a task to the processor with the highest worthiness.

Some studies consider the min-cost workflow scheduling problem under a processing deadline constraint. Abrishami et al. proposed the IaaS Cloud Partial Critical Paths (IC-PCP) algorithm in [16] to minimize the execution cost of a workflow under a deadline constraint. The key ideas are the critical parent and partial critical paths (PCPs). The critical parent of a task is its unassigned parent that has the latest finish time. A PCP consists of a task and its critical parents. The algorithm schedules the tasks in a PCP as a pack, and assigns them to the cheapest VM which can meet the sub-deadline of the PCP. Sahni and Vidyarthi proposed the just-in-time (JIT-C) algorithm in a follow-up work of the IC-PCP [17]. It first checks the feasibility of the customer's deadline requirement.
With a feasible deadline, the algorithm starts from the entry tasks and steps into a monitoring control loop. Within each control loop, it identifies the tasks whose parent tasks have been scheduled and are running, and assigns each of these tasks to the cheapest VM satisfying its sub-deadline requirement.

Regarding the scheduling of multiple workflows, several different scheduling strategies have been proposed. The work in [18] focuses on how to schedule multiple workflows onto a set of heterogeneous resources and minimize the makespan. It proposes four policies to create a composite DAG, including common entry and common exit node, level-based ordering, alternating DAGs, and ranking-based composition. It defines a slowdown metric as the ratio of the finish time achieved when a workflow is scheduled individually to the finish time achieved when the workflow is scheduled together with other workflows. It aims to achieve fairness across workflows by minimizing the largest slowdown value when scheduling jobs. The work in [19] uses a heterogeneous priority rank value that includes the out-degree of a task as a weight in the evaluation of task priorities. It further proposes three scheduling strategies across multiple workflows: round-robin, priority-based, and a trade-off between round-robin and priority. Rodriguez and Buyya [20] proposed an elastic resource provisioning and scheduling algorithm for multiple workflows, which aims to minimize the overall cost of leasing resources while meeting the independent deadline constraints of the workflows.

Wang and Xia explored using mixed integer programming (MIP) to formulate and solve complex workflow scheduling problems as building blocks of large-scale scheduling problems [21]. The scheduling problems considered in [21] are minimization of the cost under a deadline constraint. Meena et al. [22] aimed at finding schedules that minimize the execution cost while meeting the deadline in a cloud computing environment.
They employed a PerVar parameter to record the variation of the performance of VMs and proposed a Cost Effective Genetic Algorithm (CEGA) to generate schedules. Li et al. [23] focused on a problem similar to [22] and captured the dynamic performance fluctuations of VMs by a time-series-based approach. With the VM performance forecast information, they designed a genetic algorithm that fulfills the Service Level Agreement. The work in [24] develops a scheduling system to minimize the expected monetary cost given user-specified probabilistic deadline guarantees in IaaS clouds. It focuses on dealing with the price and performance dynamics in clouds and does not assume precedence constraints in workflows. Zheng et al. [25] studied the problem of improving the utility of cloud computing by allowing partial execution of jobs. The workflows in the clouds consist of parallel homogeneous preemptable tasks without precedence constraints. The work proposes efficient online multi-resource allocation algorithms. Champati and Liang considered the job-machine assignment problem in a setting where jobs have placement constraints and machines are heterogeneous [26], again without precedence constraints. They developed an efficient algorithm to minimize the sum-cost.

III. PROBLEM FORMULATION
In this section, we describe the cloud system and the problem formulation. The formulation here overlaps with the one in [21]. Assume there is a set of cloud computing workflows denoted by W = {1, 2, ..., W}. Each workflow w ∈ W contains one or more jobs. The total pool of jobs is denoted by J = {1, 2, ..., J}. Each job j ∈ J can belong to only one workflow w ∈ W. Let J_w denote the set of jobs belonging to workflow w. For job j, the minimum CPU requirement is denoted by c_j, and the minimum memory requirement is denoted by m_j. In a workflow, a job can depend on other jobs, i.e., a job cannot start until some other jobs finish execution. The job dependency is usually captured by a workflow DAG. Each job in the workflow is a node in the graph, and the dependency relations are denoted by directed edges between nodes. It is more convenient for us to represent the job dependency DAG as a matrix L = (l_{ij}), ∀i, j ∈ J. If job i depends on job j, we set l_{ij} = 1; l_{ij} = 0 means that job i does not depend on job j. If l_{ij} = 1, then the start time of job i should be no earlier than the finish time of job j, which is a precedence constraint.

For the cloud system resource, we consider a set of virtual machines (VMs) V = {1, 2, ..., V}, possibly of different types and capabilities. Let C_k represent the number of vCPUs of VM k, and M_k represent the amount of memory of VM k. We assume a discrete time model, where time is divided into a sequence of time slots 1, 2, ..., T, with, for instance, a time slot lasting a fixed number of minutes. At any time slot t, there can be at most one job allocated to any VM. We also assume non-preemptive scheduling of jobs. Let us characterize the amount of computation of job j in terms of vCPU-time-slots, denoted by h_j. Therefore, when job j runs on VM k, the running time of job j, R_{jk}, can be computed as R_{jk} = h_j / C_k, which is measured in number of time slots. We consider the popular pay-as-you-go cloud computing model that charges based on the operating time of VMs. Suppose that after running VM k for a unit time, the user is charged a cost of p_k. Suppose all the workflows in question belong to the same user, who has a total budget of D.
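To make the timing and cost model concrete, here is a small sketch with hypothetical numbers; rounding the running time up to whole slots is our reading of the slot-based model, not stated explicitly in the text.

```python
import math

def running_time(h_j, C_k):
    # R_jk = h_j / C_k: h_j vCPU-time-slots of work spread over C_k vCPUs,
    # rounded up to the discrete slot grid (our assumption).
    return math.ceil(h_j / C_k)

def job_cost(p_k, R_jk):
    # Pay-as-you-go: price per slot of VM k times the slots occupied.
    return p_k * R_jk

# Hypothetical job: h_j = 12 vCPU-time-slots on a 4-vCPU VM priced 5 per slot.
R = running_time(12, 4)
print(R, job_cost(5, R))  # 3 15
```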
We consider the problem of minimizing the finish time of all the workflows, i.e., the makespan, subject to the budget constraint and various other constraints. More specifically, for each job, we decide the VM and the starting time slot to which the job is assigned. The goal is that the overall VM leasing cost is within the budget D and the makespan of all the workflows is minimized.

Next, we specify the various constraints. Let us denote the job-VM assignment decision by the binary variables x^t_{jk}. We set x^t_{jk} = 1 if and only if job j is assigned to VM k and starts at time slot t. For each job j, exactly one of the x^t_{jk} is equal to 1:

\sum_{k \in V} \sum_{t \in T} x^t_{jk} = 1, \quad \forall j \in J.   (1)

When we choose the appropriate VM for job j, job j's minimum resource requirements must be satisfied:

\sum_{k \in V} \sum_{t \in T} C_k x^t_{jk} \ge c_j, \quad \forall j \in J.   (2)

\sum_{k \in V} \sum_{t \in T} M_k x^t_{jk} \ge m_j, \quad \forall j \in J.   (3)

Let us discuss the precedence constraint. We note that the precedence constraint is active only if l_{ij} = 1. The start time of job i can be written as \sum_{k \in V} \sum_{t \in T} t x^t_{ik}. The finish time of job j can be written as \sum_{k \in V} \sum_{t \in T} (t + R_{jk}) x^t_{jk}. The precedence constraint says that if job i depends on job j, then job i cannot start earlier than the finish time of job j:

\Big( \sum_{k \in V} \sum_{t \in T} t x^t_{ik} - \sum_{k \in V} \sum_{t \in T} (t + R_{jk}) x^t_{jk} \Big) l_{ij} \ge 0, \quad \forall i, j \in J.   (4)

There is one additional constraint that at most one job runs on a VM at any time:

\sum_{i \in J} \sum_{r = \max(0, t - R_{ik} + 1)}^{t} x^r_{ik} \le 1, \quad \forall k \in V, \; t \in T.   (5)

We explain the constraint (5) in more detail. If job i's execution occupies time slot t of VM k, then job i's start time is from the set \{\max(0, t - R_{ik} + 1), ..., t\}. This is equivalent to saying that \sum_{r = \max(0, t - R_{ik} + 1)}^{t} x^r_{ik} = 1 for job i. According to the non-preemptive requirement, at any time slot t, for any VM k, there is at most one job that can occupy VM k at time t.
Therefore, we have (5). We show that (5) is sufficient to guarantee the existence of a non-preemptive schedule. Suppose for job j, x^s_{jk} = 1 for some time slot s and some VM k. For each time slot t from s to s + R_{jk} - 1, together with (5) and x^s_{jk} = 1, we have

\sum_{i \in J} \sum_{r = \max(0, t - R_{ik} + 1)}^{t} x^r_{ik} = 1.   (6)

Thus, for each i ≠ j, x^r_{ik} = 0 for r ∈ \{\max(0, t - R_{ik} + 1), ..., t\}. By varying t from s to s + R_{jk} - 1, we see that job i cannot start on \{\max(0, s - R_{ik} + 1), ..., s + R_{jk} - 1\}. We conclude that no other job can interfere with job j's execution.

Let the variable d denote an upper bound on the finish time of all the workflows. We have

\sum_{k \in V} \sum_{t \in T} (t + R_{jk}) x^t_{jk} \le d, \quad \forall j \in J.   (7)

The budget constraint of executing the workflows can be written as

\sum_{k \in V} \sum_{j \in J} p_k R_{jk} \sum_{t \in T} x^t_{jk} \le D.   (8)

The workflow scheduling problem with the pay-as-you-go pricing model can be written as follows:

Min-Makespan:  \min d   (9)
s.t. (1)-(5), (7), (8),
x binary, d integer.

Note that data transfer costs between jobs are not directly considered in the formulation (9). We assume that data transfer takes place in the internal network of a datacenter, and that the transfer rate is stable. Therefore, the data transfer time between each pair of jobs is a constant and can be included as a part of the job's running time R_{jk} [17] [22] [27].

A. Solve the problem by MIP software
The Min-Makespan problem (9) is a complex integer programming problem and is usually hard to solve.
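For intuition, the formulation can also be checked by brute force on a toy instance. The sketch below, with all data hypothetical, enumerates every (VM, start-slot) assignment for two jobs and keeps the best makespan that satisfies the precedence, capacity and budget constraints of (9):

```python
from itertools import product

# Brute-force baseline for a toy Min-Makespan instance (all data hypothetical):
# 2 jobs, 2 VMs, job 2 depends on job 1, budget D, horizon of T slots.
R = {(1, 1): 2, (1, 2): 1, (2, 1): 2, (2, 2): 1}  # R[j, k]: slots of job j on VM k
p = {1: 1, 2: 3}                                  # p[k]: price per slot of VM k
dep = {2: [1]}                                    # precedence: job 2 depends on job 1
D, T = 5, 8

best = None
for (k1, t1), (k2, t2) in product(product([1, 2], range(T)), repeat=2):
    sched = {1: (k1, t1), 2: (k2, t2)}
    finish = {j: t + R[j, k] for j, (k, t) in sched.items()}
    # precedence constraint (4): start no earlier than every parent's finish
    if any(sched[j][1] < finish[i] for j, par in dep.items() for i in par):
        continue
    # capacity constraint (5): at most one job per VM per slot
    busy = [(k, s) for j, (k, t) in sched.items() for s in range(t, finish[j])]
    if len(busy) != len(set(busy)):
        continue
    # budget constraint (8)
    if sum(p[k] * R[j, k] for j, (k, _) in sched.items()) > D:
        continue
    makespan = max(finish.values())
    best = makespan if best is None else min(best, makespan)
print(best)  # 3: e.g. job 1 on the fast VM 2, then job 2 on the cheap VM 1
```

Running both jobs on the fast VM would cost 6 and exceed the budget of 5, so the optimum mixes a fast and a cheap VM; the same tension drives the budget-splitting heuristics later in the paper.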
Gurobi is state-of-the-art MIP software, and is capable of solving small to medium sized problems. We will use
Gurobi to solve some instances of the Min-Makespan problem. The goal is to provide a baseline for performance comparison with the heuristic algorithm that we will propose in Section V.

IV. A MOTIVATING EXAMPLE
Consider a workflow with 12 jobs, shown in Fig. 1. There are 3 VMs, and the leasing cost of each VM is shown in Table I. The execution time of each job on each VM is shown in Table II.

In the well-known priority-based greedy algorithm in [6], each job is assigned an upward-rank value. The
Fig. 1: A workflow example with 12 jobs, job n1 through job n12.

TABLE I: Leasing cost of the VMs in the example of Fig. 1.

VM    | 1 | 2 | 3
Price | 3 | 5 | 6

jobs are sorted in non-increasing order of their upward-ranks, and the resulting ordered list gives the priorities according to which the jobs are assigned to the VMs.
A. Job scheduling priorities

The upward-rank of a job j is defined recursively as

\bar{R}_j = \frac{\sum_{k \in V_j} R_{jk}}{|V_j|},   (10)

w_{j_{exit}} = \bar{R}_{j_{exit}},   (11)

w_j = \bar{R}_j + \max_{i \in succ(j)} \{ w_i \}.   (12)

In (10), V_j = {k | k ∈ V and C_k ≥ c_j, M_k ≥ m_j} is the set of VMs that have the capacity to accept job j. Then, \bar{R}_j is the average processing time of job j over the VMs in the set V_j. The set succ(j) is the set of all successor jobs of job j in the workflow DAG. The upward-rank of the exit job in the DAG, w_{j_{exit}}, is defined as its average processing time, as in (11). The upward-rank of any other job, w_j, is computed recursively by traversing upward from the exit job, as in (12). In fact, the upward-rank of a job is the sum of the \bar{R} values along the critical (the longest in terms of upward-rank) path from the exit job to the current job. In the upward-rank-based job scheduling in [6], all the jobs are sorted non-increasingly according to the upward-rank; the job with the highest upward-rank is scheduled first, and is assigned a VM by a separate job-VM matching algorithm, such as the HBCS algorithm in [15]. We call the priority generation scheme in [6] the plain upward-rank priority scheme.

TABLE II: Execution time of the jobs on each VM.

In the plain upward-rank priority scheme, equation (12) ensures that the upward-rank of a job is higher than those of all its successors (including the non-immediate successors). Therefore, a job is selected for VM assignment with a higher priority than all its successors. However, for the jobs that have no precedence constraints among each other, the upward-rank is not a good enough indicator of a job's scheduling priority.

For instance, in Fig. 1, jobs n3, n8 and n9 do not depend on each other. As shown in Table III, the plain upward-ranks of n3, n8 and n9 are all 38. Thus, the tie across jobs n3, n8 and n9 needs to be broken arbitrarily in scheduling. But, based on the DAG in Fig.
1, jobs n8 and n9 are more intricately related with other jobs in the workflow, and, to shorten the workflow makespan, it might be worthwhile to assign higher priorities to n8 and n9. We will later propose a weighted upward-rank priority scheme in Section V. In Table III, we show the ranks and the corresponding scheduling order of the jobs under both schemes. With the weighted ranks, jobs n8 and n9 have higher priorities than job n3, and will be scheduled earlier than n3. After generating the priority list, we apply the HBCS algorithm from [15] to assign each job to a VM. Tables IV and V show the final scheduling results for the two priority generation schemes, respectively. In Table IV, job n3 is scheduled before n8 and n9. Job n3 occupies the faster VM 2, and the final makespan is 78. In Table V, n8 and n9 are assigned higher priorities because of our new priority generation scheme. Job n8 can choose the faster VM, which results in a makespan of 64. Hence, in assigning job scheduling priorities, we need to evaluate the importance of a job by considering not only the jobs on its critical path but also its relationship with the other jobs.

B. Budget splitting
In HBCS, the spare budget is preferentially assigned to the jobs with higher priority. Because of the greedy nature of HBCS, the jobs with higher priorities tend to use more expensive and faster VMs, whereas the jobs with lower priorities often do not have many options because the remaining balance is more limited.

From Table IV and Table V, it can be seen that the available budget for the jobs with lower priorities is very limited under both priority generation schemes. If we split the spare budget evenly, as shown in Table VI, more budget is allocated to the jobs with lower priorities. These jobs enjoy more flexibility in selecting better VMs, which results in a shorter
TABLE III: Rank values and scheduling order for the jobs under the two priority generation schemes.

Job | Upward Rank | Scheduling Order | Weighted Rank | Scheduling Order
n1  | 67 | 1  | 73.15 | 1
n2  | 54 | 2  | 58.29 | 2
n3  | 38 | 6  | 40.72 | 8
n4  | 50 | 3  | 54.17 | 3
n5  | 49 | 5  | 53.14 | 5
n6  | 50 | 3  | 54.17 | 3
n7  | 25 | 11 | 27.33 | 11
n8  | 38 | 6  | 41.81 | 6
n9  | 38 | 6  | 41.81 | 6
n10 | 28 | 9  | 31.23 | 9
n11 | 26 | 10 | 28.36 | 10
n12 | 14 | 12 | 16.00 | 12
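The entries of Table IV can be cross-checked against the per-slot prices of Table I (3, 5 and 6 for VMs 1, 2 and 3): each job's cost equals the price of its assigned VM times the number of occupied slots, and the totals reproduce the reported cost and makespan.

```python
# Cross-check of Table IV: cost = (price per slot of the assigned VM) x
# (finish - start), with per-slot prices 3, 5, 6 for VMs 1, 2, 3 (Table I).
price = {1: 3, 2: 5, 3: 6}
rows = [  # (job, cost, start, finish, VM), copied from Table IV
    (1, 42, 0, 7, 3), (2, 65, 7, 20, 2), (4, 39, 7, 20, 1),
    (6, 39, 20, 33, 1), (5, 65, 20, 33, 2), (3, 55, 33, 44, 2),
    (8, 36, 33, 45, 1), (9, 24, 45, 53, 1), (10, 35, 53, 60, 2),
    (11, 36, 53, 65, 1), (7, 18, 65, 71, 1), (12, 35, 71, 78, 2),
]
for job, cost, start, finish, vm in rows:
    assert cost == price[vm] * (finish - start)
total_cost = sum(r[1] for r in rows)  # 489, as reported under Table IV
makespan = max(r[3] for r in rows)    # 78, as reported under Table IV
print(total_cost, makespan)           # 489 78
```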
makespan, as shown in Table VI. The conclusion is that the spare budget should be split across the jobs more evenly.

TABLE IV: Final scheduling results using HBCS with the plain upward-rank priority scheme (budget = 500).

n_i | Budget | Cost | Saved Budget | Start Time | Finish Time | VM Assigned
1   | 100 | 42 | 58 | 0  | 7  | 3
2   | 115 | 65 | 50 | 7  | 20 | 2
4   | 89  | 39 | 50 | 7  | 20 | 1
6   | 89  | 39 | 50 | 20 | 33 | 1
5   | 86  | 65 | 21 | 20 | 33 | 2
3   | 72  | 55 | 17 | 33 | 44 | 2
8   | 47  | 36 | 11 | 33 | 45 | 1
9   | 35  | 24 | 11 | 45 | 53 | 1
10  | 46  | 35 | 11 | 53 | 60 | 2
11  | 47  | 36 | 11 | 53 | 65 | 1
7   | 29  | 18 | 11 | 65 | 71 | 1
12  | 46  | 35 | 11 | 71 | 78 | 2
Actual cost = 489, makespan = 78.

V. A HEURISTIC ALGORITHM
Motivated by the example in Section IV, we develop a heuristic algorithm to solve the Min-Makespan problem. The algorithm has two key components. One is the weighted upward-rank priority scheme, which uses the stationary distribution of a random walk on the DAG as the weights. The other is uniform spare budget splitting. For scheduling multiple workflows, we make an extended DAG by adding pseudo entry and exit nodes to connect the multiple DAGs. The scheduling priorities of the jobs across all the workflows are computed based on the extended DAG. In Fig. 2, we show two typical workflow DAGs. By adding a pseudo entry node and a pseudo exit node, we obtain the extended DAG shown in Fig. 3.

TABLE V: Final scheduling results using HBCS with our weighted upward-rank priority scheme (budget = 500).

n_i | Budget | Cost | Saved Budget | Start Time | Finish Time | VM Assigned
1   | 100 | 42 | 58 | 0  | 7  | 3
2   | 115 | 65 | 50 | 7  | 20 | 2
4   | 89  | 39 | 50 | 7  | 20 | 1
6   | 89  | 42 | 47 | 7  | 14 | 3
5   | 83  | 48 | 35 | 14 | 22 | 3
8   | 65  | 55 | 10 | 20 | 31 | 2
9   | 34  | 24 | 10 | 22 | 30 | 1
3   | 61  | 55 | 6  | 31 | 42 | 2
10  | 41  | 35 | 6  | 42 | 49 | 2
11  | 42  | 40 | 2  | 49 | 57 | 2
7   | 20  | 18 | 2  | 42 | 48 | 1
12  | 37  | 35 | 2  | 57 | 64 | 2
Actual cost = 498, makespan = 64.

TABLE VI: Final scheduling results using the plain upward-rank priority scheme and uniform spare budget splitting (budget = 500).

n_i | Budget | Cost | Saved Budget | Start Time | Finish Time | VM Assigned
1   | 47 | 42 | 5  | 0  | 7  | 3
2   | 67 | 65 | 2  | 7  | 20 | 2
4   | 46 | 39 | 7  | 7  | 20 | 1
6   | 50 | 42 | 8  | 7  | 14 | 3
5   | 49 | 48 | 1  | 14 | 22 | 3
3   | 57 | 55 | 2  | 20 | 31 | 2
8   | 37 | 30 | 7  | 22 | 27 | 3
9   | 36 | 24 | 12 | 22 | 30 | 1
10  | 52 | 35 | 17 | 31 | 38 | 2
11  | 57 | 36 | 21 | 30 | 42 | 1
7   | 44 | 18 | 26 | 42 | 48 | 1
12  | 66 | 35 | 31 | 48 | 55 | 2
Actual cost = 469 ≤ budget, makespan = 55.
Fig. 2: An example of two workflows.

A. Weighted upward-rank priority scheme using random walk
According to the discussion in Section IV, when we compute a job's scheduling priority, we need to consider both the jobs on the critical path and the other jobs as well. We follow the upward-rank based priority scheme originally proposed in [6]. We propose to construct a random walk on the (extended) workflow DAG, and extend the plain scheme by applying the random walk's stationary distribution probabilities as weights on the plain ranks. More specifically, for each job j, the plain upward rank represents the accumulated processing time of the successors on its critical path, and its weight (i.e., the stationary probability π_j) represents the importance of job j in the global DAG topology. The rationale is that, if a job is more intricately related with other jobs in the topology, the job is more important and deserves a higher priority, as discussed in Section IV. The stationary probability vector π
of the random walk on the workflow DAG can be interpreted as the recurrence probability of each state in the limiting distribution. Generally, if a state j's stationary probability π_j is higher than those of the other states, it implies that the system prefers to transit from other states to state j, and state j is more important. Hence the vector π is a good indicator of the importance of the jobs and can be used as the weights of the plain upward-rank.

Fig. 3: The extended DAG obtained by adding pseudo entry and exit nodes. The numbers on the edges are the transition probabilities of the digraph.

We describe the detailed procedure of constructing the random walk. Because the DAG is acyclic, we add directed edges to the DAG from each exit node to each entry node. In the new graph, the set of successors of any node j is not empty. The random walk is on this digraph. Let the transition probability from job (state) j to job (state) i be denoted by p_{ji}. We set

p_{ji} = \frac{1}{|succ(j)|}, \quad \forall i, j \text{ where } l_{ij} = 1.   (13)

Thus, from state j, the random walk will visit its immediate successors with equal probabilities. Note that, if job i does not depend on job j, then p_{ji} = 0. We show the transition probabilities of an example DAG in Fig. 3.

Let π_j denote the stationary probability of state j. The stationary probabilities can be computed by solving the following equations:

\sum_{j \in J} \pi_j = 1,   (14)

\sum_{i} p_{ij} \pi_i = \pi_j, \quad \forall j.   (15)

TABLE VII: VM types in the experiments.
VM Type       vCPU   Memory (GiB)   Price ($/hour)
t2.micro         1        1          0.0116
t2.medium        2        4          0.0464
m5.xlarge        4       16          0.192
m5.2xlarge       8       32          0.384
m5.4xlarge      16       64          0.768
m5.12xlarge     48      192          2.304
c5.large         2        4          0.085
c5.xlarge        4        8          0.17
c5.2xlarge       8       16          0.34
c5.4xlarge      16       32          0.68
c5.9xlarge      36       72          1.53
c5.18xlarge     72      144          3.06
r4.large         2       15.25       0.133
r4.xlarge        4       30.5        0.266
r4.2xlarge       8       61          0.532
r4.4xlarge      16      122          1.064
r4.8xlarge      32      244          2.128
i3.xlarge        4       30.5        0.312
i3.2xlarge       8       61          0.624
i3.4xlarge      16      122          1.248
i3.8xlarge      32      244          2.496
g3.4xlarge      16      122          1.14
g3.8xlarge      32      244          2.28
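The stationary distribution defined by equations (13)–(15) can be computed directly. Below is a minimal Python sketch (our own illustration, not the paper's code); it assumes `succ[j]` lists job j's immediate successors in the extended digraph, i.e., with the exit-to-entry edges already added so that no successor set is empty.

```python
import numpy as np

def stationary_distribution(succ, n_jobs):
    """Solve eqs. (14)-(15) for the random walk defined by eq. (13).

    succ[j] lists the immediate successors of job j in the extended
    digraph (exit-to-entry edges included), so succ[j] is never empty.
    """
    # Eq. (13): from state j, move to each successor with equal probability.
    P = np.zeros((n_jobs, n_jobs))
    for j, successors in succ.items():
        for i in successors:
            P[j, i] = 1.0 / len(successors)
    # Stationary equations: pi = pi P (eq. 15) with sum(pi) = 1 (eq. 14).
    # Solving the overdetermined linear system is robust even when the
    # walk is periodic, as it is on a layered DAG with back edges.
    A = np.vstack([P.T - np.eye(n_jobs), np.ones(n_jobs)])
    b = np.zeros(n_jobs + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi
```

For the diamond-shaped DAG 0→{1,2}→3 with the back edge 3→0, this yields π = (1/3, 1/6, 1/6, 1/3): the entry and exit jobs, which every path must traverse, receive the largest weights.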
We use the stationary probability π_j as a measure of the importance of job j. The weighted upward-ranks are defined recursively as follows:

w_{j_{exit}} = \bar{R}_{j_{exit}} \pi_{j_{exit}},  (16)

w_j = \bar{R}_j \pi_j + \max_{i \in \mathrm{succ}(j)} \{ w_i \}.  (17)

B. Uniform spare budget splitting
After the jobs' scheduling priorities are determined, we need to split the budget across the jobs. In order to guarantee that each job can rent a VM, a job j needs to receive a minimum budget, denoted by D^{min}_j and given by

D^{min}_j = \min_{k \in \mathcal{V}_j} \{ p_k R_{jk} \}.  (18)

Thus, a feasible budget D should be no less than the aggregate of the jobs' minimum budgets, i.e.,

D \geq \sum_{j \in \mathcal{J}} D^{min}_j.  (19)

We propose to split the spare budget D − \sum_{j \in \mathcal{J}} D^{min}_j evenly across the jobs. Hence, the reserved budget of each job j is computed as

D^{reserve}_j = D^{min}_j + \frac{D − \sum_{j \in \mathcal{J}} D^{min}_j}{|\mathcal{J}|}.  (20)

We summarize the overall scheduling algorithm in Algorithm 1. Note that in Step e), we can also use the plain upward-rank priority scheme. The resulting algorithm is still new compared with the algorithm in [6], because of the new way of splitting the spare budget: uniform splitting.

Algorithm 1
Multiple workflow scheduling with the weighted upward-rank priority scheme and uniform spare budget splitting

• Initialize:
  – Step a): Add a pseudo entry node and a pseudo exit node with computation costs h_{entry} = h_{exit} = 0.
  – Step b): Make the entry nodes of all workflows immediate successors of the pseudo entry node, and the exit nodes of all workflows immediate ancestors of the pseudo exit node.
  – Step c): Assign the Markov chain transition probabilities according to equation (13).
  – Step d): Compute the Markov chain stationary probabilities based on equations (14) and (15).
  – Step e): Compute the weighted upward-ranks as in equations (16) and (17).
  – Step f): Sort all jobs in non-increasing order of w_j.
  – Step g): Compute the reserved budget of each job j according to equation (20).
  – Step h): Set the remaining balance D_{remain} to be 0.
• Step 1: Remove the job j with the highest w_j from the sorted list.
• Step 2: Select a VM k from the VM set \mathcal{V}_j, where VM k is the fastest VM for job j within the budget limit, as follows:

  \min R_{jk}  (21)
  s.t. p_k R_{jk} \leq D^{reserve}_j + D_{remain}, \quad k \in \mathcal{V}_j,

  and assign job j to VM k. If problem (21) has multiple solutions, we choose the cheapest VM. Any further tie is broken arbitrarily.
• Step 3: Recompute the remaining balance D_{remain} as

  D_{remain} = D_{remain} + (D^{reserve}_j − p_k R_{jk}).  (22)

• Step 4: If the job list is empty, exit; else, go to Step 1.

VI. EXPERIMENTS
In this section, we present the comparative evaluation results of our algorithms, the MSLBL [12], HBCS [15] and BHEFT [14] algorithms, and the optimal baseline solution generated by Gurobi. In Table VIII, we list the shorthands for these algorithms, which are used throughout this section. We first describe a single-workflow scenario, where various experimental cases and algorithms are tested and the results reported. Then, we move to a multiple-workflow scenario, where we compare our algorithm with random and round-robin priority generation schemes. In the experiments, we use a broad range of workloads, including workflows from real applications and randomly generated workflows.

TABLE VIII: Shorthands for different algorithms.
BAVE      Algorithm 1 with the plain upward-rank priority scheme
BAVE_M    Algorithm 1 with the weighted upward-rank priority scheme
MSLBL     Algorithm in [12] with the plain upward-rank priority scheme
MSLBL_M   Algorithm in [12] with the weighted upward-rank priority scheme
HBCS      Algorithm in [15]
BHEFT     Algorithm in [14]
Gurobi    The optimal baseline solution generated by Gurobi
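For concreteness, the components that BAVE and BAVE_M evaluate in the experiments below, the weighted upward-ranks of equations (16)–(17), the uniform spare-budget split of equation (20), and the greedy VM selection of problem (21) with the rollover of equation (22), can be sketched as follows. This is a minimal illustration under our own naming conventions (`avg_time[j]` for \bar{R}_j, `price[k]` for p_k, `runtime[(j, k)]` for R_{jk}, `vm_sets[j]` for \mathcal{V}_j), not the paper's code.

```python
def weighted_upward_ranks(avg_time, succ, pi):
    """Eqs. (16)-(17): upward-rank with each job's time scaled by pi_j.
    Exit jobs (empty successor set) reduce to the base case (16)."""
    w = {}
    def rank(j):
        if j not in w:
            tail = max((rank(i) for i in succ.get(j, [])), default=0.0)
            w[j] = avg_time[j] * pi[j] + tail
        return w[j]
    for j in avg_time:
        rank(j)
    return w

def reserved_budgets(price, runtime, vm_sets, D):
    """Eqs. (18)-(20): minimum budgets plus an equal share of the spare."""
    d_min = {j: min(price[k] * runtime[(j, k)] for k in vms)
             for j, vms in vm_sets.items()}
    spare = D - sum(d_min.values())
    if spare < 0:
        raise ValueError("infeasible budget: eq. (19) is violated")
    share = spare / len(d_min)
    return {j: d_min[j] + share for j in d_min}

def greedy_assign(priority_order, price, runtime, vm_sets, reserve):
    """Steps 1-4 of Algorithm 1: fastest affordable VM per job, with the
    unspent balance rolled over to later jobs (problem (21), eq. (22))."""
    assignment, remain = {}, 0.0
    for j in priority_order:                       # non-increasing w_j
        budget = reserve[j] + remain
        feasible = [k for k in vm_sets[j]
                    if price[k] * runtime[(j, k)] <= budget]
        # Fastest VM within budget; cheapest among equally fast ones.
        k = min(feasible, key=lambda k: (runtime[(j, k)], price[k]))
        assignment[j] = k
        remain = budget - price[k] * runtime[(j, k)]
    return assignment
```

The rollover in `greedy_assign` is what distinguishes the uniform split in practice: a high-priority job that finds a cheap, fast VM passes its unspent reserve down the priority list.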
A. Workflow setup
We use four types of real-world workflows: the Fast Fourier transform (FFT) parallel application, the Gaussian elimination parallel application [6], scientific workflows, and real in-production workflows from an Internet streaming service company in China.

In generating the FFT workflows, we use a parameter m to set the size of the FFT application. The number of jobs is N = 2m − 1 + m log₂ m, where m = 2^k for some integer k. Furthermore, an FFT workflow enjoys a symmetry: the aggregated execution time of the jobs on any path from the starting job to any of the exit jobs is equal. Thus, any path in an FFT DAG is a critical path. For the Gaussian elimination workflows, the number of jobs is set to be N = (n² + n − 2)/2, where n is the number of rows of a square matrix. We also evaluate other scientific workflows, including Montage, CyberShake, Epigenomics, LIGO Inspiral Analysis and SIPHT, which are generated by an open-source scientific workflow generator [28].

Finally, we obtained the logs of an in-production cluster from an Internet streaming service company in China. During the logging period, the cluster carried workflows including MapReduce, Spark, Hive and Shell jobs. A workflow may contain multiple jobs, and a job may contain multiple parallel tasks. The logs show that most workflows contain only a small number of jobs; the remaining workflows contain many more jobs, and these large workflows occupy the majority of the CPU and memory resources. We evaluate the algorithms on typical workflows with different numbers of jobs.

B. Other parameters
We resort to simulation to compare the algorithms. All simulation experiments are conducted on a PC platform with an Intel Core CPU. We use the 23 VM types tabulated in Table VII, which follow the VM setup of Amazon's EC2 as closely as we can [29]. To see the influence of the number of available VMs, we test with three different levels of VM sufficiency: Scarce, Normal and Sufficient. In the Scarce case, the number of VMs is half of the number of jobs; in the Normal case, the two numbers are equal; in the Sufficient case, the number of VMs exceeds the number of jobs. In all three cases, a fixed fraction of the VM instances is assigned to the VM types with fewer vCPUs in Table VII, and the remaining instances are assigned to the VM types with more vCPUs. Finally, the number of instances of each VM type is generated randomly.

We also vary the budget as in (23):

D = D_{min} + \phi (D_{max} − D_{min}),  (23)

where D_{min} is the cost of the cheapest schedule, and D_{max} is the cost obtained by the HEFT algorithm. The budget level factor \phi \in \{0, 0.25, 0.5, 0.75, 1\} is used to vary the budget level.

Finally, an algorithm may sometimes fail to find a feasible schedule, either because of the high complexity of the algorithm or due to its greedy nature. When that happens, a failure is reported. We report the algorithm success rates in the results.

C. Summary of performance ranking
We first summarize the overall performance of Gurobi, BAVE, BAVE_M, MSLBL, MSLBL_M, HBCS and BHEFT by counting their ranks in terms of the obtained makespans. For each test case, we order the algorithms in increasing order of makespan; we then count their ranks for each type of workflow. We also use the average ranked value (AR) proposed in [14] to evaluate the performance of the algorithms. The value AR is defined as

AR = \frac{R_1 + 2R_2 + 3R_3 + 4R_4}{N_{cases}},  (24)

where N_{cases} is the number of test cases, and R_i is the count for rank i. A smaller AR value indicates a better average performance. In Tables IX–XVII, the rank counts and the associated AR values are reported. For brevity, only the counts for the first three places are shown. For the Gurobi, HBCS and BHEFT algorithms, which sometimes fail to find feasible solutions, the AR values are not reported. By inspecting the AR values of all workflow types, we draw the following conclusions.

• For FFT and randomly generated workflows, the BAVE algorithms achieve the best performance on average. The uniform extra budget splitting scheme outperforms the scheme of splitting the budget in proportion to extra demand. The weighted priority scheme cannot further improve the makespan when combined with the uniform extra budget splitting scheme; for FFT workflows, the weighted scheme even results in longer makespans in several test cases. Nevertheless, when we apply the weighted priority scheme to the MSLBL algorithm, the makespans are reduced.
• For Gaussian and the other scientific workflows, and for the workflows obtained from the in-production cluster, the BAVE_M algorithm achieves the best performance on average. Both the weighted priority and the uniform extra budget splitting schemes help to improve the makespans.

We conclude that the weighted priority scheme using the random walk and the uniform spare budget splitting strategy help to improve the makespans on average for most of the test cases.
TABLE IX: Ranking counts for the FFT workflow, 75 test cases.

           Rank 1  Rank 2  Rank 3  AR
Gurobi       15      0       0     /
BAVE         64      9       2     1.17
BAVE_M       63      8       3     1.23
MSLBL        31     35       8     1.72
MSLBL_M      32     34       9     1.69
HBCS         13      9      10     /
BHEFT         9     13       3     /
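As a sanity check, the AR values in Table IX can be reproduced from equation (24). The table reports only the first three rank counts, so in the sketch below we infer the rank-4 count as the remainder of the 75 cases, which is an assumption on our part:

```python
def average_rank(counts, n_cases):
    """AR of eq. (24); counts = (R1, R2, R3, R4)."""
    return sum((i + 1) * r for i, r in enumerate(counts)) / n_cases

# Rank-1..3 counts for two rows of Table IX (75 FFT test cases).
table_ix = {"BAVE": (64, 9, 2), "MSLBL": (31, 35, 8)}
for name, (r1, r2, r3) in table_ix.items():
    r4 = 75 - (r1 + r2 + r3)          # inferred rank-4 count
    print(name, round(average_rank((r1, r2, r3, r4), 75), 2))
# Matches the reported values 1.17 and 1.72.
```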
TABLE X: Ranking counts for the Gaussian workflow, 60 test cases.

           Rank 1  Rank 2  Rank 3  AR
Gurobi       15      0       0     /
BAVE         35     14       5     1.70
BAVE_M       41     16       3     1.36
MSLBL        15     10      20     2.60
MSLBL_M      18     14      22     2.26
HBCS         12      0       0     /
BHEFT         3     10       0     /
TABLE XI: Ranking counts for the CyberShake workflow, 15 test cases.

           Rank 1  Rank 2  Rank 3  AR
BAVE          6      3       4     2.13
BAVE_M       10      4       1     1.40
MSLBL         6      4       5     1.93
MSLBL_M       8      5       2     1.60
HBCS          3      0       0     /
BHEFT         3      0       0     /
TABLE XII: Ranking counts for the Epigenomics workflow, 15 test cases.

           Rank 1  Rank 2  Rank 3  AR
BAVE          6      2       4     2.27
BAVE_M       10      4       1     1.40
MSLBL         4      2       6     2.53
MSLBL_M       7      6       2     1.67
HBCS          3      0       0     /
BHEFT         2      1       0     /
TABLE XIII: Ranking counts for the Inspiral workflow, 15 test cases.

           Rank 1  Rank 2  Rank 3  AR
BAVE          7      0       3     2.40
BAVE_M        8      5       1     1.67
MSLBL         4      1       6     2.67
MSLBL_M       6      7       2     1.73
HBCS          3      0       0     /
BHEFT         1      0       0     /
TABLE XIV: Ranking counts for the Montage workflow, 15 test cases.

           Rank 1  Rank 2  Rank 3  AR
BAVE          8      3       1     1.93
BAVE_M       11      1       3     1.47
MSLBL         3      4       4     2.60
MSLBL_M       5      3       6     2.20
HBCS          3      0       0     /
BHEFT         1      1       0     /

TABLE XV: Ranking counts for the Sipht workflow, 15 test cases.
           Rank 1  Rank 2  Rank 3  AR
BAVE          7      1       0     2.47
BAVE_M        6      3       6     2.00
MSLBL         4      5       4     2.33
MSLBL_M       8      5       1     1.67
HBCS          3      0       0     /
BHEFT         1      1       0     /
TABLE XVI: Ranking counts for the random workflow, 45 test cases.

           Rank 1  Rank 2  Rank 3  AR
Gurobi       15      0       0     /
BAVE         32     12       1     1.31
BAVE_M       32     12       1     1.31
MSLBL        11      8      19     2.48
MSLBL_M      11     18      15     2.13
HBCS          8      1       0     /
BHEFT         9      1       0     /
TABLE XVII: Ranking counts for the workflow obtained from an Internet company, 45 test cases.

           Rank 1  Rank 2  Rank 3  AR
Gurobi       15      0       0     /
BAVE         20     19       3     1.75
BAVE_M       33     12       0     1.26
MSLBL        10      8      18     2.60
MSLBL_M      10     13      18     2.35
HBCS          7      2       0     /
BHEFT         0      4       1     /
Finally, in order to separate the improvement achieved by the weighted priority scheme from that achieved by the uniform spare budget splitting scheme, we compare the algorithm ranking results between BAVE and MSLBL, and between BAVE and BAVE_M, in Fig. 4 and Fig. 5, respectively. In Fig. 4, algorithm BAVE outperforms algorithm MSLBL on average for most workflow types, except CyberShake and Sipht. These results show the advantage of the uniform spare budget splitting scheme. Fig. 5 shows that the weighted priority scheme improves the average performance compared with the plain priority scheme, by decreasing the AR, for most workflow types except FFT and random.
D. Detailed experimental results
In this section, we plot and discuss the detailed performanceof each test case.
1) FFT:
In Fig. 6, we show the normalized makespans for the FFT workflows. In each test case, the makespans obtained by the different algorithms are normalized with respect to the smallest one. For instance, in Fig. 6(a), we show an FFT workflow with N = 15 jobs. All the algorithms are tested with different levels of VM sufficiency and budget. Gurobi achieves the best makespan with Sufficient VMs and a budget level of ϕ = 1.0, and hence the makespans obtained by the other algorithms and settings are normalized with respect to this optimal value in Fig. 6(a).

Fig. 4: Comparison of ranking counts between BAVE and MSLBL.

Fig. 5: Comparison of ranking counts between BAVE and BAVE_M.

In the small experiment with N = 15 jobs, Gurobi always achieves the best makespan, which can be used as the performance baseline. In the Scarce VM sufficiency case, the results show that the budget level ϕ has a great impact on the resulting makespan, which drops quickly as ϕ increases. In the Normal case, the makespan improves with ϕ when ϕ is small, and the difference in makespans between the lower and higher budget levels is significantly narrowed. When there are Sufficient supplies of VMs, the makespan is further reduced when the budget is plentiful. We notice that Gurobi produces solutions with slightly better makespans only in the Scarce cases and in the low-budget case under Normal VM sufficiency. The algorithms BAVE and BAVE_M achieve almost the same performance as Gurobi. The makespans produced by MSLBL and MSLBL_M are slightly worse in the Scarce cases. The HBCS and BHEFT algorithms cannot always find a solution.

When the size of the workflow increases to N = 95 jobs, Gurobi fails to find a solution: the number of binary variables in the problem formulation becomes extremely large for an integer programming problem. In the experiments with N = 95, 223, 1151 and 2559 jobs, the algorithms BAVE and BAVE_M achieve the best makespan in almost all the test cases. The algorithms HBCS and BHEFT cannot always find a solution.

Fig. 6: The normalized makespan with various FFT workflows; N is the number of jobs. (a) N = 15, (b) N = 95, (c) N = 223, (d) N = 1151, (e) N = 2559.
2) Gaussian elimination:
For the Gaussian elimination workflows, the BAVE and BAVE_M algorithms achieve the best makespan in all test cases at the high budget levels. At the lower budget levels, BAVE always outperforms MSLBL, and BAVE_M always outperforms MSLBL_M. The HBCS and BHEFT algorithms perform poorly in most cases. In most cases, the algorithms with the weighted upward-rank priority generation scheme achieve a better makespan. Overall, the proposed BAVE and BAVE_M work well for Gaussian elimination workflows.

We also observe an unusual test case. In the test case with N = 1175, Sufficient VMs and a low budget level, the makespans obtained by BAVE, BAVE_M, MSLBL and MSLBL_M are significantly larger than those of the corresponding test case with Normal VM sufficiency. This is because the set of VM instances is generated randomly and can be substantially different for different VM sufficiency levels; an increase in VM sufficiency does not necessarily lead to a performance improvement. Such rare cases occur occasionally in the other tests as well.

Fig. 7: The normalized makespan with various Gaussian elimination workflows; N is the number of jobs. (a) N = 14, (b) N = 665, (c) N = 1175, (d) N = 1829.

Fig. 8: The normalized makespan with other scientific workflows; N is the number of jobs. (a) CyberShake with N = 1000, (b) Epigenomics with N = 997, (c) Inspiral with N = 1000, (d) Montage with N = 1000, (e) Sipht with N = 1000.
3) Other scientific workflows:
We show the evaluation results of the other scientific workflows in Fig. 8. For the CyberShake workflow with N = 1000 jobs, the BAVE algorithm performs no better than MSLBL; when combined with the weighted priority scheme, the makespans are reduced, and BAVE_M performs the best on average. For the Sipht workflow, although the BAVE and BAVE_M algorithms perform no better than MSLBL and MSLBL_M, the weighted priority scheme produces shorter makespans than the plain one in most cases. For the other kinds of workflows, the BAVE and BAVE_M algorithms perform the best in most test cases. The performance of HBCS and BHEFT is poor when the budget is not sufficient.

Fig. 9: The normalized makespan with various randomly generated workflows; N is the number of jobs. (a) N = 61, (b) N = 1000, (c) N = 2000.
4) Randomly generated workflows:
We tested various randomly generated workflows, and the main results are shown in Fig. 9. Both the BAVE and BAVE_M algorithms perform the best compared with the other algorithms. There is no significant performance difference between the plain and weighted priority schemes.

Fig. 10: The normalized makespan with various workflows from an Internet streaming service company; N is the number of tasks.
5) Workflows from an Internet streaming service company:
Finally, we tested workflows obtained from an Internet streaming service company. The workflows obtained from the in-production cluster contain multiple jobs, and each job may contain multiple parallel tasks; therefore, we measure the scale of each workflow by the number of tasks it carries. The workflow with N = 39 tasks is the largest test case that Gurobi can solve. The workflow with N = 1453 tasks is a medium-sized one with the highest occurrence rate among all medium-sized workflows. The workflow with N = 9113 tasks is the largest workflow we obtained. The results in Fig. 10 show that BAVE and BAVE_M outperform the other algorithms, and that the weighted priority scheme achieves better performance than the plain one.

Fig. 11: The scheduling success rates of different algorithms with FFT workflows.
6) Success rate of finding a schedule:
The HBCS and BHEFT algorithms cannot find solutions when the budget or the set of available VMs is limited, for all the workflows we tested. The other four algorithms can always produce a solution. In Fig. 11, we plot the scheduling success rates for the FFT workflows; the success rates for the other workflows follow a similar pattern.

E. Multiple workflows
We conducted tests on multiple workflows. For each workflow type discussed in Section VI-A, we generate a large workflow; all workflows are combined to create a mixed set with N = 9744 jobs. In the test, we vary the number of VMs and test four algorithms: BAVE, BAVE_M, round robin, and random. For the random strategy, we run multiple random tests for each test case and report the average results. The achieved makespans are summarized in Tables XVIII, XIX and XX. The results show that with more flexible budgets, the BAVE and BAVE_M algorithms achieve better makespans than the round-robin and random strategies. The BAVE_M algorithm outperforms the plain BAVE algorithm in most cases.

TABLE XVIII: Multiple large-sized workflows; VM sufficiency: Scarce, N = 9744.

ϕ      BAVE    BAVE_M   ROUND ROBIN   RANDOM
0.00   10334   10334    10334         10334
0.25    7083    6912     7114          7244.06
0.50    3740    3385     3876          4219.71
0.75    1018    1936     1788          2124.2
1.00     878     855      879          1091.17
TABLE XIX: Multiple large-sized workflows; VM sufficiency: Normal, N = 9744.

ϕ      BAVE   BAVE_M   ROUND ROBIN   RANDOM
0.00   8791   8791     8791          8791
0.25   6015   5843     6400          6344.45
0.50   2908   3195     3342          3211.83
0.75    834    815     1278          1348.84
1.00    815    815      891           894.06
TABLE XX: Multiple large-sized workflows; VM sufficiency: Sufficient, N = 9744.

ϕ      BAVE   BAVE_M   ROUND ROBIN   RANDOM
0.00   6989   6989     6989          6989
0.25   4971   4616     4447          4709.12
0.50   1900   1707     2195          2179.32
0.75    725    695      970           984.96
1.00    725    695      777           777.74
VII. CONCLUSION

DAG-based complex workflows are becoming a significant workload in the cloud. In scheduling workflows, the budget constraint is an important consideration due to the pay-as-you-go nature of the cloud. In this paper, we formulate the workflow scheduling problem with budget constraints as an integer programming model. Improving upon the plain upward-rank priority scheme, we propose a weighted scheme that uses the stationary probabilities of a random walk on the digraph as the weights. We further design a uniform spare budget splitting strategy, which assigns the spare budget uniformly across all the jobs. The empirical results show that the uniform spare budget splitting scheme outperforms the earlier scheme that splits the spare budget in proportion to extra demand, and that the weighted priority scheme further improves the workflow makespan. The advantage of the weighted priority scheme is due to its ability to evaluate the jobs' global importance in the workflow, by considering not only the jobs on the critical path but also those off the critical path. Because of the diversity and complexity of workflow types in production, there may be other unknown factors yet to be studied. Deeper analysis of the structural characteristics of different workflows may lead to new discoveries and help design further improved task priority assignment strategies. For instance, we can borrow the idea proposed in LDCP [9], which assigns a higher priority to a job with more children whenever there is a tie. Such refinements, which rely on deep analysis of workflow topologies, will be a direction of future research.

ACKNOWLEDGMENT

This work was supported by the Shanghai Committee of Science and Technology, China (Grant No. 14510722300, 18DZ2203900).

REFERENCES

[1] G. Juve, E. Deelman, G. B. Berriman, B. P. Berman, and P. Maechling, "An evaluation of the cost and performance of scientific workflows on Amazon EC2," J. Grid Comput., vol. 10, no. 1, pp. 5–21, Mar. 2012.
[2] Y. Wang and W. Shi, "Budget-driven scheduling algorithms for batches of MapReduce jobs in heterogeneous clouds," IEEE Transactions on Cloud Computing, vol. 2, no. 3, pp. 306–319, July 2014.
[3] M. A. Rodriguez and R. Buyya, "Deadline based resource provisioning and scheduling algorithm for scientific workflows on clouds," IEEE Transactions on Cloud Computing, vol. 2, no. 2, pp. 222–235, April 2014.
[4] J. Lenstra, A. R. Kan, and P. Brucker, "Complexity of machine scheduling problems," in Studies in Integer Programming, ser. Annals of Discrete Mathematics, P. Hammer, E. Johnson, B. Korte, and G. Nemhauser, Eds. Elsevier, 1977, vol. 1, pp. 343–362.
[5] A. S. Schulz, "Scheduling to minimize total weighted completion time: Performance guarantees of LP-based heuristics and lower bounds," in Integer Programming and Combinatorial Optimization, W. H. Cunningham, S. T. McCormick, and M. Queyranne, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 1996, pp. 301–315.
[6] H. Topcuoglu, S. Hariri, and M.-Y. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260–274, Mar 2002.
[7] Gurobi Optimization: The state-of-the-art mathematical programming solver for prescriptive analytics.
[8] Y.-K. Kwok and I. Ahmad, "Static scheduling algorithms for allocating directed task graphs to multiprocessors," ACM Comput. Surv., vol. 31, no. 4, pp. 406–471, Dec. 1999.
[9] M. I. Daoud and N. Kharma, "A high performance algorithm for static task scheduling in heterogeneous distributed computing systems," Journal of Parallel and Distributed Computing, vol. 68, no. 4, pp. 399–409, 2008.
[10] B. Xiang, B. Zhang, and L. Zhang, "Greedy-Ant: ant colony system-inspired workflow scheduling for heterogeneous computing," IEEE Access, vol. 5, pp. 11404–11412, 2017.
[11] T. Shu and C. Q. Wu, "Performance optimization of Hadoop workflows in public clouds through adaptive task partitioning," in IEEE INFOCOM 2017 - IEEE Conference on Computer Communications, May 2017, pp. 1–9.
[12] W. Chen, G. Xie, R. Li, Y. Bai, C. Fan, and K. Li, "Efficient task scheduling for budget constrained parallel applications on heterogeneous cloud computing systems," Future Gener. Comput. Syst., vol. 74, no. C, pp. 1–11, Sep. 2017.
[13] R. Sakellariou, H. Zhao, E. Tsiakkouri, and M. D. Dikaiakos, Scheduling Workflows with Budget Constraints. Boston, MA: Springer US, 2007, pp. 189–202.
[14] W. Zheng and R. Sakellariou, "Budget-deadline constrained workflow planning for admission control," Journal of Grid Computing, vol. 11, no. 4, pp. 633–651, Dec 2013.
[15] H. Arabnejad and J. G. Barbosa, "A budget constrained scheduling algorithm for workflow applications," Journal of Grid Computing, vol. 12, no. 4, pp. 665–679, Dec 2014.
[16] S. Abrishami, M. Naghibzadeh, and D. H. Epema, "Deadline-constrained workflow scheduling algorithms for infrastructure as a service clouds," Future Generation Computer Systems, vol. 29, no. 1, pp. 158–169, 2013.
[17] J. Sahni and D. P. Vidyarthi, "A cost-effective deadline-constrained dynamic scheduling algorithm for scientific workflows in a cloud environment," IEEE Transactions on Cloud Computing, vol. 6, no. 1, pp. 2–18, Jan 2018.
[18] H. Zhao and R. Sakellariou, "Scheduling multiple DAGs onto heterogeneous systems," in Proceedings of the 20th International Conference on Parallel and Distributed Processing, ser. IPDPS'06, 2006, pp. 159–159.
[19] X. Guoqi, L. Liangjiao, Y. Liu, and L. Renfa, "Scheduling trade-off of dynamic multiple parallel workflows on heterogeneous distributed computing systems," Concurrency and Computation: Practice and Experience, vol. 29, no. 2, p. e3782, 2017.
[20] M. A. Rodriguez and R. Buyya, "Scheduling dynamic workloads in multi-tenant scientific workflow as a service platforms," Future Generation Computer Systems, vol. 79, pp. 739–750, 2018.
[21] Y. Wang, Y. Xia, and S. Chen, "Using integer programming for workflow scheduling in the cloud," in , June 2017, pp. 138–146.
[22] J. Meena, M. Kumar, and M. Vardham, "Cost effective genetic algorithm for workflow scheduling in cloud under deadline constraint," IEEE Access, vol. 4, pp. 5065–5082, 2016.
[23] W. Li, Y. Xia, M. Zhou, X. Sun, and Q. Zhu, "Fluctuation-aware and predictive workflow scheduling in cost-effective infrastructure-as-a-service clouds," IEEE Access, vol. 6, pp. 61488–61502, 2018.
[24] A. C. Zhou, B. He, and C. Liu, "Monetary cost optimizations for hosting workflow-as-a-service in IaaS clouds," IEEE Transactions on Cloud Computing, vol. 4, no. 1, pp. 34–48, Jan 2016.
[25] Z. Zheng and N. B. Shroff, "Online multi-resource allocation for deadline sensitive jobs with partial values in the cloud," in IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications, April 2016, pp. 1–9.
[26] J. P. Champati and B. Liang, "Efficient minimization of sum and differential costs on machines with job placement constraints," in IEEE INFOCOM 2017 - IEEE Conference on Computer Communications, May 2017, pp. 1–9.
[27] M. Mao and M. Humphrey, "Auto-scaling to minimize cost and meet application deadlines in cloud workflows," in International Conference for High Performance Computing, Networking, Storage and Analysis, November 2011, pp. 1–12.
[28] R. F. da Silva, W. Chen, G. Juve, K. Vahi, and E. Deelman, "Community resources for enabling research in distributed scientific workflows," in 2014 IEEE 10th International Conference on e-Science (eScience 2014).