Additive Approximation Schemes for Load Balancing Problems
Moritz Buchem, Lars Rohwedder, Tjark Vredeveld, Andreas Wiese
July 21, 2020
Abstract
In this paper we introduce the concept of additive approximation schemes and apply it to load balancing problems. Additive approximation schemes aim to find a solution with an absolute error in the objective of at most εh for some suitable parameter h. In the case that the parameter h provides a lower bound, an additive approximation scheme implies a standard multiplicative approximation scheme and can be much stronger when h ≪ OPT. On the other hand, when no PTAS exists (or is unlikely to exist), additive approximation schemes can provide a different notion of approximation.

We consider the problem of assigning jobs to identical machines with given lower and upper bounds for the loads of the machines. This setting generalizes problems like makespan minimization, the Santa Claus problem (on identical machines), and the envy-minimizing Santa Claus problem. For the last problem, in which the objective is to minimize the difference between the maximum and minimum load, the optimal objective value may be zero and hence it is NP-hard to obtain any multiplicative approximation guarantee. For this class of problems we present additive approximation schemes for h = p_max, the maximum processing time of the jobs.

Our technical contribution is two-fold. First, we introduce a new relaxation based on integrally assigning slots to machines and fractionally assigning jobs to the slots. We refer to this relaxation as the slot-MILP. We identify structural properties of (near-)optimal solutions of the slot-MILP, which allow us to solve it in polynomial time, assuming that there are O(1) different lower and upper bounds on the machine loads (which is the relevant setting for the three problems mentioned above). The second technical contribution is a local-search based algorithm which rounds a solution to the slot-MILP, introducing an additive error on the target load intervals of at most ε · p_max.

∗ Maastricht University, Netherlands, [email protected]
† EPFL, Switzerland, lars.rohwedder@epfl.ch, supported by the Swiss National Science Foundation project 200021-184656
‡ Maastricht University, Netherlands, [email protected]
§ Universidad de Chile, Chile, [email protected]
Introduction
In traditional analysis of approximation algorithms, one tries to find a (multiplicative) guarantee ρ such that the algorithm finds a solution of value at most (or at least, in case of maximization problems) ρ · OPT, where OPT is the optimal solution value. An approximation scheme is a family of approximation algorithms with performance guarantee ρ = (1 + ε) (or (1 − ε) for maximization problems) for any ε > 0. In this paper, we introduce the concept of additive approximation schemes. The goal is to design a family of algorithms that find a solution with value not more than ε · h away from the optimal solution value, where h is chosen to be a suitable parameter of the problem instance. Formally, we define an additive approximation scheme as follows.

Definition 1. An additive approximation scheme is a family of algorithms that finds, on any instance I and for every ε > 0, a solution with value A(I) satisfying |A(I) − OPT(I)| ≤ εh, where h is a suitably chosen parameter of instance I.

In general, we are interested in additive approximation schemes that run in polynomial time, i.e., of the form |I|^{f(1/ε)}, where |I| denotes the size of the input and f is some computable function. Additive approximation schemes are particularly interesting in the following two scenarios.

1. When the problem at hand admits a PTAS and h ≪ OPT, one obtains a stronger guarantee than the PTAS.
2. When there cannot exist a PTAS, or even any multiplicative guarantee for the problem, additive approximation schemes give an alternative notion for approximating the problem. A notable example is the case when it is NP-complete to decide whether OPT = 0, as then no multiplicative approximation guarantee can be obtained.

Additive approximation has received only little attention in the literature. Notable exceptions include Vizing's algorithm that finds an edge coloring with at most Δ + 1 colors, where Δ is the maximum degree of a graph [29]. As Δ is a lower bound on the minimum number of colors needed, this result implies an additive 1-approximation. Also, Alon et al. [2] present an additive εn² approximation algorithm for the edge deletion problem to obtain a graph with a monotone property. This falls into an additive approximation scheme for parameter h = n², which is in fact an upper bound and not a lower bound on the minimum number of edges to be deleted.

In this paper, we apply the concept of additive approximation schemes to scheduling and load balancing problems, which are among the classical problems in the literature on approximation algorithms, starting with the seminal work of Graham [11]. In these problems n jobs need to be processed by one of m machines. A job has processing time p_j and the load of machine i is the sum of the processing times of the jobs assigned to i. The goal is to find a schedule, which can be represented by an assignment of jobs to the machines, that optimizes an objective function over the machine loads. Since it is strongly NP-hard to decide whether there is a schedule that assigns the same load to each machine (see [10]), most non-trivial load balancing problems of this form are also strongly NP-hard. This observation has led to extensive research on approximation algorithms. In the following, we consider three variations of load balancing problems.

Variations. The first objective function is to minimize the maximum machine load, i.e., to minimize the makespan. This is one of the most classical scheduling problems on parallel machines and has led to the first approximation algorithms [11, 12].
Sahni [27] showed that the problem admits an FPTAS for a constant number of machines, and Hochbaum and Shmoys [14] found a PTAS if the number of machines is part of the input. Since then, there has been lively research in improving the running time, e.g., to an EPTAS [15, 6, 16].

The second objective function that we consider is the Santa Claus problem, also known as max-min allocation [5]. Here, the goal is to maximize the minimum load, i.e., to make the least loaded machine as full as possible. Bansal and Sviridenko [4] coined the term Santa Claus problem when they studied it in the restricted assignment setting. This objective is considered to measure the fairness of the allocation. The case of identical machines was also considered by Woeginger [30], who presents a PTAS.

As a third and final objective function, we consider minimizing the maximum envy, which is defined as the maximum load minus the minimum load. This objective has been considered by Lipton et al. [22]. While in the Santa Claus problem fairness is measured by the minimum load of a machine, in this setting fairness is measured by the difference between the maximum and minimum load. Note that it is strongly NP-hard to decide whether or not the envy is 0. Therefore, unless P = NP, there cannot exist any polynomial time approximation algorithm with any (multiplicative) performance guarantee.

It is notable that for all three variants a simple greedy algorithm, which assigns the jobs iteratively to the least loaded machine, gives an additive error of p_max. This guarantee is incomparable to the error of ε · OPT of a PTAS.
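To make the greedy bound concrete, the following is a minimal sketch (our own illustration, not code from the paper) of Graham's list scheduling: each job goes to the currently least loaded machine, so every final load lies within p_max of the average load, and hence the additive error is at most p_max for all three objectives.

```python
import heapq

def greedy_assign(processing_times, m):
    """Assign each job to the currently least loaded machine.

    Returns the sorted machine loads.  Every load ends up within
    p_max of the average load, hence within p_max of OPT for the
    makespan, Santa Claus, and envy objectives."""
    heap = [(0, i) for i in range(m)]  # (load, machine) min-heap
    heapq.heapify(heap)
    for p in processing_times:
        load, i = heapq.heappop(heap)
        heapq.heappush(heap, (load + p, i))
    return sorted(load for load, _ in heap)
```

For instance, `greedy_assign([3, 3, 2, 2, 2], m=2)` yields loads 5 and 7, so the envy is 2 ≤ p_max = 3, even though the optimal envy here is 0 (split the jobs as 3+3 and 2+2+2).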
Our contribution.
In this paper, we present additive approximation schemes for load balancing problems on identical machines with parameter h = p_max. For the makespan and Santa Claus objectives this gives a significant improvement over the greedy algorithm mentioned above while also dominating the guarantees of the known PTASs; for minimizing the maximum envy this demonstrates how additive approximation schemes can lead to non-trivial guarantees when no multiplicative guarantees are possible.

For the mentioned load balancing problems this new perspective on the analysis of approximation guarantees is particularly interesting because it requires fundamentally new methods: most standard (multiplicative) PTASs first round the processing times, find optimal solutions to the rounded instance, and then transform these into a solution for the original instance. These methods do not work for additive approximation schemes, because the rounding of all jobs directly adds an error of ε Σ_j p_j, which is too large compared to ε · p_max. Therefore, there is need for new (non-trivial) machinery. We present a new relaxation for this general class of load balancing problems, which we call the slot-MILP. The slot-MILP can be interpreted as a strengthened variant of the assignment-LP, the relaxation that allows jobs to be assigned to the machines fractionally. In the slot-MILP we first group the jobs of similar size, but we do not round them (unlike previous PTASs). However, in addition to the constraints of the assignment-LP, we require an integral number of jobs of each group to be assigned to each machine. This property can be implemented using integer variables. Alternatively, the relaxation can also be thought of as assigning slots (for the groups of jobs) integrally to machines and then the jobs fractionally to the slots.

A straightforward application of Lenstra's algorithm fails, since the number of integer variables is linear in the input size. Instead, we manage to solve the slot-MILP using non-trivial structural properties combined with dynamic programming. While the additive integrality gap of the assignment-LP can be as large as p_max, this gap is only ε · p_max for the slot-MILP. We show this using a rounding procedure inspired by a local search method for the restricted assignment problem [28, 18, 19]. The local search algorithm repeatedly moves jobs between machines, eventually converging to a good solution. Although in the restricted assignment problem no polynomial running time bound is known for the local search procedure, in our case we obtain such a bound for our local search.

Our results extend to a more general setting in which for each machine we are given a target interval for its load, with at most O(1) different such intervals across all machines. Our solution then violates the desired load on each machine by at most ± ε · p_max (or we assert that no solution exists for the given loads).

Other related work.
The case of small values of p_max has also been considered from a parameterized point of view: if all processing times are integers, then it is possible to obtain a running time that is fixed-parameter tractable (FPT) in the parameter k = p_max [23]. In other words, there is an algorithm that finds an exact solution in time f(k) · |I|^{O(1)} for some computable function f.

Other variants of load balancing problems have been considered in the paper by Alon et al. [1]. They identify conditions on the objective function under which the resulting load balancing or machine scheduling problem admits a PTAS. Bansal and Sviridenko [4] consider the Santa Claus problem in a more general setting where the processing time of a job j also depends on the machine i on which it is processed, denoted by p_ij. They considered a restricted assignment setting in which each job is only allowed to be processed on a subset of the machines, but then the processing time is identical over all these machines, i.e., p_ij ∈ {p_j, ∞}. Bansal and Sviridenko [4] presented an approximation algorithm with a performance guarantee dependent on the number of machines. Feige [9] showed that the integrality gap of the configuration LP for this setting is constant, using the Lovász local lemma. Through the work of Moser and Tardos [24] and Haeupler et al. [13] this could be turned into a polynomial time algorithm. Asadpour et al. [3] present a local search method with a constant performance guarantee; however, it is unknown whether this method runs in polynomial time.

Related to additive approximation algorithms are several papers on the bin packing problem. Jansen et al. [17] present an additive approximation algorithm with constant additive error that runs in time exponential in the optimal number of bins plus a polynomial in the number of items to be packed. Hence, this algorithm is only useful when the optimal number of bins is small. On the other hand, Karmarkar and Karp [20] gave an algorithm that runs in time polynomial in the number of items, giving a solution with at most (1 + ε)OPT + O(1/ε) bins. Such a bound is called an asymptotic PTAS (APTAS), as the additive term vanishes when OPT is sufficiently large. This was subsequently improved by [26] and [7]. Ophelders et al. [25] showed that a simple local search algorithm for the so-called Equitable Hamiltonian Cycle problem finds a solution that is at most a constant away from the optimal solution value.

We introduce a new alternative relaxation for a general class of load balancing problems in machine scheduling. We first formally define this class of load balancing problems as the target load balancing problem.

Definition 2.
In the target load balancing problem we are given a set of jobs J with a processing time p_j for each j ∈ J and a set of machines M with values ℓ_i, u_i for each machine i ∈ M. The goal is to assign each job j ∈ J to a machine i ∈ M such that for each machine i ∈ M the load of i (i.e., the sum of the processing times of the jobs assigned to i) is in the interval [ℓ_i, u_i].

This generalizes the load balancing settings mentioned earlier. For example, in P||C_max every machine has a target load interval with ℓ_i = 0 and u_i = T, where T is a guess on the optimal makespan. For a given instance of the problem, we define K to be the number of different target load intervals [ℓ_i, u_i] of the machines in M, i.e., K = |{[ℓ_i, u_i] | i ∈ M}|. We will assume that K = O(1).

Let ε > 0 and assume w.l.o.g. that 1/ε ∈ N. Our task is to either assert that there is no solution for the given instance or to find a solution in which the load of each machine i is in the interval [ℓ_i − ε · p_max, u_i + ε · p_max] with p_max := max_{j∈J} p_j, i.e., violating the target load range of each machine by at most ε · p_max. First we partition the jobs into sets J_1, ..., J_{1/ε}, where for k = 1, ..., 1/ε the set J_k contains all jobs j ∈ J with p_j ∈ ((k − 1)ε · p_max, kε · p_max]. We define a new relaxation for this problem in which for each machine i and each k = 1, ..., 1/ε we specify integrally how many jobs from J_k are assigned to i (one may imagine that this defines slots for jobs from J_k on i). Then the jobs from J_k are assigned fractionally to these slots. We denote by the slot-MILP the following relaxation.

min 0
s.t.  Σ_{i∈M} x_{i,j} = 1                 ∀ j ∈ J
      Σ_{j∈J} p_j x_{i,j} ≥ ℓ_i           ∀ i ∈ M                         (1)
      Σ_{j∈J} p_j x_{i,j} ≤ u_i           ∀ i ∈ M                         (2)
      Σ_{j∈J_k} x_{i,j} = y_{i,k}         ∀ i ∈ M, ∀ k ∈ {1, ..., 1/ε}
      x_{i,j} ≥ 0                         ∀ j ∈ J, i ∈ M
      y_{i,k} ∈ ℕ                         ∀ i ∈ M, k ∈ {1, ..., 1/ε}

In the slot-MILP the integer variables define exactly how many jobs of a type are assigned to a machine but do not imply a specific load based on rounded processing times. The load of a machine is based on an assignment that satisfies the distribution of slots among the machines.

Since the slot-MILP contains (1/ε) · |M| integral variables, it is not clear how to solve it in polynomial time. Nevertheless, we present two methods of efficiently solving the slot-MILP given that K = O(1). The first method gives an exact solution while the second method gives a solution that slightly violates the target load intervals. Afterwards, we show how to round a fractional solution of the slot-MILP to an integral solution, while violating the load interval [ℓ_i, u_i] for each machine i ∈ M by at most ε · p_max. In the following we give sketches of the proofs of the structural properties used to develop our solution methods. For detailed proofs we refer to Appendix A.

We make use of a structural property to find an exact solution to the slot-MILP. Note that in this case an exact solution is one that satisfies (1) and (2). This structure allows us to guess the values of the integral variables in polynomial time; the remaining problem is then only a linear program. Given a solution (x, y), for each machine i ∈ M let y_i denote the (1/ε)-tuple (y_{i,1}, ..., y_{i,1/ε}). We show that there are solutions in which there are not too many different vectors y_i. This uses similar arguments to [8].

Lemma 3.
There is a solution (x, y) to the slot-MILP such that for all i, i′ ∈ M with [ℓ_i, u_i] = [ℓ_{i′}, u_{i′}] and y_i ≡ y_{i′} (mod 2) it follows that y_i = y_{i′}.

Proof sketch. Let (x, y) be an optimal solution to the slot-MILP and consider two machines i₁, i₂ with [ℓ_{i₁}, u_{i₁}] = [ℓ_{i₂}, u_{i₂}]. Suppose that y_{i₁} ≡ y_{i₂} (mod 2) but y_{i₁} ≠ y_{i₂}. We construct a new solution (x′, y′) that only changes the assignments of i₁ and i₂ but leaves the jobs of all other machines untouched. Intuitively, we assign to i₁ and i₂ the average load of both machines. We define x′_{i₁,j} = x′_{i₂,j} = (x_{i₁,j} + x_{i₂,j})/2 for each j ∈ J and y′_{i₁,k} = y′_{i₂,k} = (y_{i₁,k} + y_{i₂,k})/2 for each k ∈ {1, ..., 1/ε}. Also, we set x′_{i,j} = x_{i,j} and y′_{i,k} = y_{i,k} for all i ∈ M \ {i₁, i₂}, all j ∈ J, and all k ∈ {1, ..., 1/ε}. As y_{i₁} ≡ y_{i₂} (mod 2) and (x, y) is feasible, the solution (x′, y′) is feasible as well: each job remains fully assigned and no machine is assigned more jobs of a type than it has slots. Furthermore, the load only changes on machines i₁ and i₂. However, as the new load on these machines becomes the average of the previous loads, all machine loads satisfy their respective target loads.

For details of the proof we refer to Appendix A.1. Using Lemma 3 we can solve the slot-MILP in polynomial time if K = O(1).

Lemma 4.
We can solve the slot-MILP in time m^{O(K · 2^{1/ε})} · n^{O((K/ε) · 2^{1/ε})}.

Proof. We first guess all values y_{i,k} (up to permutations of machines) of the optimal solution due to Lemma 3 as follows. We say that two machines i, i′ ∈ M are of the same type in a solution if y_i = y_{i′} and [ℓ_i, u_i] = [ℓ_{i′}, u_{i′}]. Then Lemma 3 implies that there are only K · 2^{1/ε} different machine types. For each of these K · 2^{1/ε} types we guess (1) the number of machines having this type and (2) for each k ∈ {1, ..., 1/ε} the value y_{i,k} for each machine i ∈ M of this type. Note that the machines are identical and hence it suffices to guess the number of machines of each type, rather than guessing which exact machine is of which type. The total number of guesses is bounded by m^{O(K · 2^{1/ε})} · n^{O((K/ε) · 2^{1/ε})}. Then the remaining problem is only a linear program (LP), since all integral variables of the slot-MILP are already fixed. If our guess was correct then the LP must have a feasible solution.
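The heart of the averaging step in the proof of Lemma 3 is that the parity condition keeps the averaged slot counts integral. A small numerical check of this step (our own illustration; the function name is ours):

```python
def average_machines(y1, y2):
    """Average the slot-count vectors of two machines of the same type.

    Under the hypothesis of Lemma 3 the two vectors agree coordinate-wise
    modulo 2, so the averaged vector is again integral, i.e., a valid
    vector of slot counts."""
    assert all((a - b) % 2 == 0 for a, b in zip(y1, y2)), "parity violated"
    return [(a + b) // 2 for a, b in zip(y1, y2)]
```

For example, averaging y_{i₁} = (3, 0, 2) and y_{i₂} = (1, 4, 0) gives (2, 2, 1); without the parity condition the average could be fractional and the averaged solution would leave the feasible region of the slot-MILP.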
The solution based on Lemma 3 can be found in double exponential time with respect to the number of job types 1/ε and is an exact solution to the slot-MILP. In the following we show that, using a different (slightly more complicated) structural property, one can find an additive δ-approximate solution to the slot-MILP in single exponential time with respect to 1/ε and polynomial in 1/δ, i.e., even with δ := 1/n^{O(1)} we obtain polynomial running time. Here, δ-approximate means that we find a solution to a weaker version of the slot-MILP with ℓ′_i = ℓ_i − δ · p_max and u′_i = u_i + δ · p_max for every i. We refer to this weaker version as slot-MILP′.

The algorithm is based on a different structural property than the one proved in Lemma 3. Given a solution (x, y) of the slot-MILP, for each machine i ∈ M and each k ∈ {1, ..., 1/ε}, we denote by z_{i,k} the average size of the jobs of type k on machine i, defined by

z_{i,k} · y_{i,k} = Σ_{j∈J_k} p_j x_{i,j}.

In the case that y_{i,k} = 0 this allows us to freely choose the value of z_{i,k}, which is important for the structural property in the following lemma. We prove that there is always a solution to the slot-MILP and an ordering of the machines such that for each k ∈ {1, ..., 1/ε} the values z_{i,k} are non-decreasing and on each prefix of length ℓ of the machines the total size of the slots for the jobs in J_k is at least as large as the total size of the y_{σ(1),k} + ··· + y_{σ(ℓ),k} smallest jobs in J_k. For each integer n′ let J_k^min(n′) ⊆ J_k be the n′ smallest jobs in J_k.

Lemma 5. There is an optimal solution (x, y) for the slot-MILP, a corresponding vector {z_{i,k}}_{i∈M, k∈{1,...,1/ε}}, and an ordering σ: {1, ..., |M|} → M such that

Σ_{ℓ′=1}^{ℓ} y_{σ(ℓ′),k} z_{σ(ℓ′),k} ≥ Σ_{j ∈ J_k^min(y_{σ(1),k} + ··· + y_{σ(ℓ),k})} p_j   ∀ k ∈ {1, ..., 1/ε}, ∀ ℓ ∈ {1, ..., |M|}   (3)

Σ_{i∈M} y_{i,k} z_{i,k} = Σ_{j∈J_k} p_j   ∀ k ∈ {1, ..., 1/ε}   (4)

z_{σ(ℓ),k} ≤ z_{σ(ℓ+1),k}   ∀ k ∈ {1, ..., 1/ε}, ∀ ℓ ∈ {1, ..., |M| − 1}.   (5)

Proof sketch.
Conditions (3) and (4) follow from feasibility. Condition (5) can be established by a potential function argument: we show that a solution minimizing this potential function has to fulfill condition (5), as otherwise we can swap some of the jobs of the same type between two machines and decrease the potential function while not decreasing the total load on the machines.

We introduce a dynamic program that uses the property from Lemma 5. Intuitively, our DP guesses the machines in the ordering σ one after the other. When it guesses the next machine i, it first guesses the type of the machine, i.e., the values ℓ_i and u_i, and then it guesses for each k ∈ {1, ..., 1/ε} the value z_{i,k} and the number of jobs y_{i,k} from J_k on machine i. In order to bound the running time we need to consider rounded values of z_{i,k}. Therefore, the DP ensures that the conditions (3) and (5) on the vectors y, z from Lemma 5 are satisfied and that condition (4) as well as the upper and lower bounds on the load of each machine i are only violated by a small extent. The following lemma shows that this is sufficient in order to compute an approximate solution to the slot-MILP based on the vectors y, z.

Lemma 6.
Suppose that we are given an ordering σ: {1, ..., |M|} → M and vectors {y_{i,k}, z_{i,k}}_{i∈M, k∈{1,...,1/ε}} such that conditions (3) and (5) hold. Moreover, assume that for each i ∈ M it holds that

ℓ_i ≤ Σ_{k=1}^{1/ε} y_{i,k} z_{i,k} ≤ u_i + δ · p_max   (6)

and that for each k ∈ {1, ..., 1/ε} condition (4) is only slightly violated as follows:

Σ_{j∈J_k} p_j ≤ Σ_{i∈M} y_{i,k} z_{i,k} ≤ Σ_{j∈J_k} p_j + δε · p_max.   (7)

Then we can compute a vector {x_{i,j}}_{i∈M, j∈J} such that (x, y) is a solution to slot-MILP′ in time O(mn).

Proof sketch. We first find a (fractional) assignment vector {x_{i,j}}_{i∈M, j∈J} satisfying

Σ_{j∈J_k} p_j x_{i,j} ≤ y_{i,k} z_{i,k}   (8)

for all i ∈ M and k ∈ {1, ..., 1/ε}. We do so by assigning jobs for each type k independently. We start with a solution that assigns the smallest jobs to the first machine, and so on. Whenever a machine does not satisfy (8), we can fractionally swap jobs between this machine and a machine with smaller index. Once we have established that no machine is overloaded, we establish that no machine is underloaded by too much, i.e., the following holds for all i and k:

Σ_{j∈J_k} p_j x_{i,j} ≥ y_{i,k} z_{i,k} − δε · p_max.   (9)

The goal of our DP is to compute vectors {y_{i,k}, z_{i,k}}_{i∈M, k∈{1,...,1/ε}} that satisfy the conditions of Lemma 6. The key insight is that when we consider the next machine i′ in the ordering, we do not need to remember the vectors {y_{i,k}, z_{i,k}} of all previously considered machines i; it suffices to remember the number of previously assigned jobs from each set J_k, the current left-hand side of inequality (3) for each k, the vector {z_{i′′,k}}_{k∈{1,...,1/ε}} of the previously considered machine i′′, and for each type of machine the number of previously guessed machines of this type.
At each iteration the DP then guesses the type of machine i and the vector {y_{i,k}, z_{i,k}}_{k∈{1,...,1/ε}} such that the new solution, consisting of the guess for machine i and the remembered solution for the previous machines, satisfies the conditions stated in Lemma 7. If none of the guesses satisfies these conditions, the DP cell corresponding to this iteration remains empty.

Lemma 7. Let δ > 0. There is an algorithm with a running time of m^{K+1} · (n/(δε))^{O(1/ε)} which either finds a δ-approximate solution to the slot-MILP or asserts that the slot-MILP is infeasible.

The proof of Lemma 7 and a detailed description of the dynamic program are given in Appendix A.2.
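The water-filling step in the proof sketch of Lemma 6, which assigns the jobs of one type fractionally against the capacities y_{i,k} · z_{i,k} in the order σ, can be pictured as follows (a simplified sketch of ours for a single type k; function and variable names are our own, and the last machine absorbs any rounding slack):

```python
def fill_fractionally(job_sizes, capacities):
    """Assign jobs of one type (smallest first) fractionally to machines
    in the order sigma; capacities[i] plays the role of y_{i,k} * z_{i,k}.

    Returns x with x[i][j] = fraction of job j placed on machine i.
    Assumes the total capacity roughly covers the total job size
    (condition (7)); the final machine absorbs any leftover."""
    order = sorted(range(len(job_sizes)), key=lambda j: job_sizes[j])
    x = [[0.0] * len(job_sizes) for _ in capacities]
    i, room = 0, capacities[0]
    for j in order:
        remaining = job_sizes[j]
        while remaining > 1e-9:
            take = min(remaining, room)
            x[i][j] += take / job_sizes[j]  # fraction of job j on machine i
            remaining -= take
            room -= take
            if room <= 1e-9:
                if i + 1 < len(capacities):
                    i += 1
                    room = capacities[i]
                else:
                    room = float("inf")  # last machine absorbs the slack
    return x
```

For instance, with job sizes (2, 4) and capacities (3, 3), the job of size 2 goes fully to the first machine, and the job of size 4 is split: a quarter on the first machine and three quarters on the second.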
We assume that we are given a solution to the slot-MILP via the algorithm due to Lemma 4, or an approximate solution, i.e., a solution to slot-MILP′, via the algorithm due to Lemma 7. In this section, we describe an algorithm with a running time of n^{O(1)} that computes an integral solution to the slot-MILP (or slot-MILP′) which for each machine i ∈ M violates the target load by at most ε · p_max. For a solution to the slot-MILP this implies that Σ_{j∈J} p_j x_{i,j} ∈ [ℓ_i − ε · p_max, u_i + ε · p_max]. For a solution to slot-MILP′ the target load violation is the error due to the approximate solution plus the error due to the rounding, i.e., after rounding the solution it holds that Σ_{j∈J} p_j x_{i,j} ∈ [ℓ_i − δ · p_max − ε · p_max, u_i + δ · p_max + ε · p_max]. Note that the running time bound is independent of K. In the following we show how the rounding procedure works and that the claims hold for an exact solution to the slot-MILP; the same arguments hold if the initial solution is a solution to slot-MILP′.

We imagine that each machine i ∈ M has y_{i,k} slots for the jobs in J_k, for each k ∈ {1, ..., 1/ε}. We say that these slots are of type k. Notice that Σ_{i∈M} y_{i,k} = |J_k|. We compute an initial solution by assigning each job j ∈ J_k to an arbitrary slot of type k. In this solution there might be a machine i whose load is not in [ℓ_i − ε · p_max, u_i + ε · p_max], i.e., the load is too small or too large. We now present a local search algorithm that repeatedly swaps pairs of jobs from the same set J_k such that eventually each machine i ∈ M has a load in [ℓ_i − ε · p_max, u_i + ε · p_max]. Note that these swaps maintain the number of jobs from each set J_k on each machine.

We describe how to perform one iteration of the local search algorithm. Each iteration aims at finding a pair of jobs that can be swapped. Let M₀ be the set of machines i ∈ M that have a load strictly greater than u_i + ε · p_max.
Consider a k ∈ {1, ..., 1/ε} such that a job j ∈ J_k is assigned to a machine i ∈ M₀. We would like to exchange j for a smaller job j′ ∈ J_k that is assigned to a machine i′ ∉ M₀. Thus, consider all jobs j′ ∈ J_k with p_{j′} < p_j which are assigned to a machine i′ ∉ M₀. If the load of i′ is at most u_{i′} then we exchange j and j′, which completes the swap. We try to perform such a swap for each k ∈ {1, ..., 1/ε} such that a job j ∈ J_k is assigned to a machine in M₀. If we did not perform a swap then let M₁ denote the set of machines i′ ∈ M having a job j′ which we tried to swap with a job j on a machine i ∈ M₀, i.e., M₁ contains all machines i′ ∈ M \ M₀ for which there exists a machine i ∈ M₀ and a k ∈ {1, ..., 1/ε} such that there is a job j ∈ J_k assigned to i and a job j′ ∈ J_k assigned to i′ with p_{j′} < p_j. Observe that each machine i′ ∈ M₁ has a load of more than u_{i′}.

Now we repeat this procedure: suppose that we constructed sets of machines M₀, ..., M_ℓ. For each k ∈ {1, ..., 1/ε} such that there is a job j ∈ J_k assigned to a machine i ∈ M_ℓ, consider all jobs j′ ∈ J_k with p_{j′} < p_j which are assigned to a machine i′ ∉ M₀ ∪ ... ∪ M_ℓ. If the load on one such machine i′ is at most u_{i′}, then we exchange j and j′, which completes the swap. In particular, we do not reuse the constructed sets M₀, ..., M_ℓ for the next swap but forget these sets before the next swap starts. Otherwise, if each considered machine i′ has a load of strictly more than u_{i′}, we construct a set M_{ℓ+1} consisting of all these machines i′ and continue in the current iteration.

Suppose that at the beginning of a swap there is no machine i ∈ M that has a load strictly greater than u_i + ε · p_max. Then a second stage of the local search algorithm takes place. We take the current solution and perform an analogous procedure in order to ensure that each machine i ∈ M has a load of at least ℓ_i − ε · p_max.
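The swap-based first stage described above can be compressed into the following sketch (our own simplification: it searches directly for any admissible swap instead of growing the sets M₀, M₁, ... layer by layer, so it illustrates the swaps but not the structure used in the running-time analysis). Note that because j and j′ belong to the same class, their sizes differ by at most ε · p_max, so the receiving machine ends up with load at most u_{i′} + ε · p_max.

```python
def first_stage(p, assign, cls, upper, eps_pmax):
    """First-stage swap local search (simplified sketch).

    p[j]      processing time of job j
    assign[j] machine currently holding job j (modified in place)
    cls[j]    class index k of job j
    upper[i]  upper target load u_i of machine i

    Repeatedly swaps a job j on an overloaded machine (load > u_i + eps_pmax)
    with a smaller job j2 of the same class on a machine whose load is at
    most its upper bound, until no machine is overloaded."""
    def load(i):
        return sum(p[j] for j in range(len(p)) if assign[j] == i)

    while True:
        over = {i for i in range(len(upper)) if load(i) > upper[i] + eps_pmax}
        if not over:
            return assign
        swap = next(
            ((j, j2)
             for j in range(len(p)) if assign[j] in over
             for j2 in range(len(p))
             if cls[j2] == cls[j] and p[j2] < p[j]
             and assign[j2] not in over
             and load(assign[j2]) <= upper[assign[j2]]),
            None)
        if swap is None:
            # by Lemma 8 this cannot happen when starting from a
            # feasible slot-MILP solution
            return assign
        j, j2 = swap
        assign[j], assign[j2] = assign[j2], assign[j]
```

This direct search does not come with the paper's bound on the number of swaps; the layered construction and the BFS view below are what yield the polynomial running time.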
Initially define M₀ to be the set of all machines i ∈ M with a load strictly less than ℓ_i − ε · p_max. Suppose that we constructed sets of machines M₀, ..., M_ℓ. For each k ∈ {1, ..., 1/ε} such that there is a job j ∈ J_k assigned to a machine i ∈ M_ℓ, consider all jobs j′ ∈ J_k with p_{j′} > p_j which are assigned to a machine i′ ∉ M₀ ∪ ... ∪ M_ℓ. If the load on one such machine i′ is at least ℓ_{i′}, then we exchange j and j′, which completes the swap. Otherwise, if each such machine i′ has a load of strictly less than ℓ_{i′}, we construct a set M_{ℓ+1} consisting of all these machines i′ and continue. The algorithm terminates if the load of each machine i ∈ M is within [ℓ_i − ε · p_max, u_i + ε · p_max]. In the following we use first stage to refer to the part of the algorithm that establishes that all loads are at most u_i + ε · p_max and second stage for the part that establishes that all loads are at least ℓ_i − ε · p_max.

We now show that the algorithm terminates in n^{O(1)} time. Then, by construction, it outputs a solution in which each machine i ∈ M has a load in the interval [ℓ_i − ε · p_max, u_i + ε · p_max]. We first show that in each iteration of the first stage of the algorithm we can find a pair of jobs j, j′ to swap.

Lemma 8.
In each iteration the algorithm finds two jobs j, j′ that it swaps, and finding such a pair can be done in time O(n²).

Proof. We prove this for the first stage of the algorithm; a similar argument works for the second stage (see Appendix B). Suppose towards contradiction that the algorithm does not find two jobs j, j′ ∈ J_k such that j is assigned to a machine i ∈ M_ℓ with load more than u_i + ε · p_max and j′ is assigned to a machine i′ ∉ M₀ ∪ ... ∪ M_ℓ with p_{j′} < p_j. This means that the machines in M₀ ∪ ... ∪ M_ℓ are assigned the smallest jobs of each type k, while each of these machines i has load more than u_i. Hence, even a fractional assignment of jobs cannot reduce the total load on the machines in M₀ ∪ ··· ∪ M_ℓ. This means that in any fractional assignment at least one of these machines must have a load greater than its upper bound, which contradicts the feasibility of the slot-MILP solution.

As every job is fully assigned to a machine, each job of a type k which is assigned to a machine in the sets M₀, M₁, ..., M_ℓ is considered exactly once and compared to every other job of type k not assigned to these machines. As the existence of a pair was shown for an arbitrary k, this gives a worst-case running time of O(n²).

Next, we outline the proof that the first stage of the algorithm always terminates and that this happens after at most O(n²) swaps. As Lemma 8 states that each swap can be found in time O(n²), this shows that the first stage finishes in time n^{O(1)}. To this end, we give an alternative formulation of the first stage of the algorithm as a repeated breadth-first search (BFS). We construct a weighted, directed graph. It contains one special vertex, the source s, and one vertex for each slot, that is, |J| + 1 vertices in total. Each non-source vertex is associated with a machine and a size class. The slots of the same machine form a clique: there is an edge from each slot to the other with weight 0. Furthermore, there is an edge of weight 1 from slot u to v when (1) u and v are not on the same machine, (2) u and v belong to the same size class, and (3) u is currently assigned a larger job than v. Additionally, there is an edge of weight 1 from the source to every slot on a machine i with load more than u_i + ε · p_max. The algorithm performs a BFS on the graph above starting in s. Once it reaches a machine i with load at most u_i, it selects the edge (u, v) over which the machine was reached and swaps the jobs assigned to the slots u and v. This is continued until every machine i has a load of at most u_i + ε · p_max. We first show that the distance from s to any slot in the graph does not decrease by a swap.

Lemma 9.
The distance from s to any slot does not decrease by a swap.

Using Lemma 9 and a potential function that is bounded by n³ we show the following.

Lemma 10.
The first stage of the algorithm terminates after at most O(n³) swaps.

The detailed proofs of Lemmas 9 and 10 are deferred to Appendix B, where the counterparts for the second stage are also shown (see Lemmas 16 and 17). It is important to note that in the second stage the goal is to fix machines that have load less than ℓ_i − ǫ·p_max. We do so by swapping larger jobs on machines with load at least ℓ_i with smaller jobs (of the same type) on underloaded machines. As we never increase the load of a machine that has load at least ℓ_i − ǫ·p_max, this process does not lead to any violations of the machine upper bounds. Hence, Lemmas 8-10 and Lemmas 16 and 17 give the following result on the running time and additive approximation guarantee of the local search algorithm.

Lemma 11.
Given a solution to the slot-MILP, in time n^{O(1)} we can compute an integral solution to the slot-MILP such that Σ_{j∈J} p_j x_{i,j} ∈ [ℓ_i − ǫ·p_max, u_i + ǫ·p_max] for each machine i ∈ M.

By combining Lemmas 7 and 11 we obtain our main theorem. Our running time is polynomial if K = O(1), which is the relevant case for our applications to makespan minimization, Santa Claus on identical machines, etc.

Theorem 12.
There is an algorithm for the target load balancing problem with a running time of m^{K+1} · n^{O(1/ǫ)} that computes a solution in which the load of each machine i ∈ M is in [ℓ_i − ǫ·p_max, u_i + ǫ·p_max], or asserts that there is no feasible solution.

We can use Theorem 12 in order to obtain additive approximation schemes for makespan minimization, the
Santa Claus problem and the envy-minimizing Santa Claus problem on identical machines. The idea is to guess the target load intervals up to multiples of ǫ·p_max and then apply the algorithm due to Theorem 12. As in these problems each machine has the same target load interval, we have K = 1.

For P||C_max and the Santa Claus problem the guessing procedure requires only O(1/ǫ) guesses. As every machine is assigned the same target load interval, i.e., ℓ_i = ℓ and u_i = u, we only need to guess the upper bound for P||C_max and the lower bound for the Santa Claus problem. For P||C_max we set ℓ = 0 and guess u as a multiple of ǫ·p_max within the interval [(1/m)·Σ_{j=1}^{n} p_j, (1/m)·Σ_{j=1}^{n} p_j + p_max]. For the Santa Claus problem we set u = Σ_{j=1}^{n} p_j and guess ℓ within the interval [(1/m)·Σ_{j=1}^{n} p_j − p_max, (1/m)·Σ_{j=1}^{n} p_j].

Corollary 13. There is an algorithm for P||C_max with a running time of m²·n^{O(1/ǫ)} that computes a solution with makespan at most OPT + ǫ·p_max.

Corollary 14.
There is an algorithm for the Santa Claus problem on identical machines with a running time of m²·n^{O(1/ǫ)} that computes a solution in which each machine has a load of at least OPT − ǫ·p_max.

For the envy-minimizing
Santa Claus problem we need to guess both ℓ and u simultaneously from the same intervals above. This gives a total number of O(1/ǫ²) guesses.

Corollary 15.
There is an algorithm for envy-minimization on identical machines with a running time of m²·n^{O(1/ǫ)} which computes a solution with envy at most OPT + ǫ·p_max.

In this paper we introduced the concept of additive approximation schemes, where the goal is to design algorithms that find solutions whose absolute deviation of the objective value from the optimal value is at most ǫ·h for a (natural) parameter h of the respective problem. Additive approximation schemes are interesting, among others, in the following situations. First, when the underlying problem does not admit a (traditional) PTAS, or even any (multiplicative) approximation guarantee, e.g., in the case that the optimal value is 0, additive approximation schemes can give an alternative notion for approximating the problem. Secondly, in case h provides a lower bound on the optimum, an additive approximation scheme immediately implies a (traditional) approximation scheme. However, when h ≪ OPT, the additive approximation scheme can be much stronger. We applied this concept to load balancing problems on identical machines, finding an additive PTAS for makespan minimization, the Santa Claus problem and the envy-minimizing Santa Claus problem. We did so by introducing a new relaxation, the slot-MILP, and showing how to solve and round this relaxation. The running time of our method is exponential in both 1/ǫ and the number of different target load intervals K.

Therefore, we leave two open questions with respect to additive approximation schemes for the load balancing problems considered here. Firstly, for P||C_max there is an EPTAS [1, 16], i.e., an algorithm with a running time of the form f(1/ǫ)·n^{O(1)} for some function f. We leave as an open question to find an additive approximation scheme for the problem with such a running time, or to rule out that one exists.
Secondly, we leave open to find an additive approximation scheme for the target load balancing problem when the number K of different machine types is super-constant. Note that our rounding algorithm from Section 3 works for arbitrary K, but it is not clear whether one can solve the slot-MILP in this case (approximately) in polynomial time. When the number of machines is not part of the input, then P_m||C_max admits an FPTAS, i.e., an approximation scheme with running time polynomial in the input size and 1/ǫ [27]. Using the ideas in this paper, it can easily be shown that all three versions with a constant number of machines admit an additive FPTAS. On the other hand, when the number of machines is part of the input, then using similar arguments as in [10], we can show that there cannot exist an additive FPTAS unless P = NP, as these problems are strongly NP-hard. For the case of unrelated machines, that is, when the processing times, now denoted by p_{ij}, depend on the machine as well as the job, it can be shown that, unless P = NP, for each of the objectives considered in this paper there does not exist an additive approximation algorithm with guarantee less than p_max/2, using the reduction from [21]. Therefore, the existence of an additive PTAS for this setting is also ruled out unless P = NP.

Another interesting direction of future research is to study the concept of additive approximation schemes for other types of problems.

Acknowledgements
We wish to thank José Verschae, Alexandra Lassota and Klaus Jansen for helpful discussions on this problem.
References

[1] N. Alon, Y. Azar, G.J. Woeginger, and T. Yadid. Approximation schemes for scheduling on parallel machines. Journal of Scheduling, 1:55–66, 1998.
[2] N. Alon, A. Shapira, and B. Sudakov. Additive approximation for edge-deletion problems. Annals of Mathematics, 170:371–411, 2009.
[3] A. Asadpour, U. Feige, and A. Saberi. Santa Claus meets hypergraph matchings. ACM Transactions on Algorithms, 8:24:1–24:9, 2012.
[4] N. Bansal and M. Sviridenko. The Santa Claus problem. In Proceedings of the Thirty-Eighth Annual ACM Symposium on Theory of Computing (STOC 2006), pages 31–40, 2006.
[5] D. Chakrabarty. Max-Min Allocation, pages 1–4. Springer US, Boston, MA, 2008.
[6] L. Chen, K. Jansen, and G. Zhang. On the optimality of approximation schemes for the classical scheduling problem. In Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2014), pages 657–668, 2014.
[7] W. Fernandez de la Vega and G.S. Lueker. Bin packing can be solved within 1+ǫ in linear time. Combinatorica, 1:349–355, 1981.
[8] F. Eisenbrand and G. Shmonin. Carathéodory bounds for integer cones. Operations Research Letters, 34(5):564–568, 2006.
[9] U. Feige. On allocations that maximize fairness. In Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2008), pages 287–293, 2008.
[10] M.R. Garey and D.S. Johnson. "Strong" NP-completeness results: motivation, examples, and implications. Journal of the ACM, 25:499–508, 1978.
[11] R.L. Graham. Bounds for certain multiprocessing anomalies. Bell System Technical Journal, 45:1563–1581, 1966.
[12] R.L. Graham. Bounds on multiprocessing timing anomalies. SIAM Journal on Applied Mathematics, 17:416–429, 1969.
[13] B. Haeupler, B. Saha, and A. Srinivasan. New constructive aspects of the Lovász local lemma. Journal of the ACM, 58, 2011.
[14] D.S. Hochbaum and D.B. Shmoys. Using dual approximation algorithms for scheduling problems: theoretical and practical results. Journal of the ACM, 34:144–162, 1987.
[15] K. Jansen. An EPTAS for scheduling jobs on uniform processors: using an MILP relaxation with a constant number of integral variables. SIAM Journal on Discrete Mathematics, 24:457–485, 2010.
[16] K. Jansen, K.-M. Klein, and J. Verschae. Closing the gap for makespan scheduling via sparsification techniques. Mathematics of Operations Research, to appear, 2020.
[17] K. Jansen, S. Kratsch, D. Marx, and I. Schlotter. Bin packing with fixed number of bins revisited. Journal of Computer and System Sciences, 79:39–49, 2013.
[18] K. Jansen and L. Rohwedder. On the configuration-LP of the restricted assignment problem. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2017), pages 2670–2678, 2017.
[19] K. Jansen and L. Rohwedder. A quasi-polynomial approximation for the restricted assignment problem. In Integer Programming and Combinatorial Optimization, 19th International Conference (IPCO 2017), pages 305–316, 2017.
[20] N. Karmarkar and R.M. Karp. An efficient approximation scheme for the one-dimensional bin-packing problem. In 23rd IEEE Symposium on Foundations of Computer Science (FOCS 1982), pages 312–320, 1982.
[21] J.K. Lenstra, D.B. Shmoys, and É. Tardos. Approximation algorithms for scheduling unrelated parallel machines. Mathematical Programming, 46:259–271, 1990.
[22] R.J. Lipton, E. Markakis, E. Mossel, and A. Saberi. On approximately fair allocations of indivisible goods. In Proceedings of the 5th ACM Conference on Electronic Commerce (EC 2004), pages 125–131, 2004.
[23] M. Mnich and A. Wiese. Scheduling and fixed-parameter tractability. Mathematical Programming, 154(1-2):533–562, 2015.
[24] R. Moser and G. Tardos. A constructive proof of the general Lovász local lemma. Journal of the ACM, 57, 2010.
[25] T. Ophelders, R. Lambers, F.C.R. Spieksma, and T. Vredeveld. A note on equitable Hamiltonian cycles, 2020. Submitted manuscript.
[26] S.A. Plotkin, D.B. Shmoys, and É. Tardos. Fast approximation algorithms for fractional packing and covering problems. Mathematics of Operations Research, 20:257–301, 1995.
[27] S.K. Sahni. Algorithms for scheduling independent tasks. Journal of the ACM, 23(1):116–127, 1976.
[28] O. Svensson. Santa Claus schedules jobs on unrelated machines. SIAM Journal on Computing, 41(5):1318–1341, 2012.
[29] V.G. Vizing. On an estimate of the chromatic class of a p-graph (in Russian). Diskret. Analiz, 3:25–30, 1964.
[30] G.J. Woeginger. A polynomial-time approximation scheme for maximizing the minimum machine completion time. Operations Research Letters, 20:149–154, 1997.
A Detailed proofs of Section 2
A.1 Proof of Lemma 3
Proof.
Let (x, y) be a solution to the slot-MILP and assume that (x, y) minimizes

Σ_{i∈M} Σ_{k=1}^{1/ǫ} (y_{i,k})². (10)

Now suppose toward contradiction that there are i_1, i_2 with y_{i_1} ≡ y_{i_2} mod 2, but y_{i_1} ≠ y_{i_2}. We construct a new solution x′ which has a lower value of (10). We set x′_{i,j} = x_{i,j} for all i ∉ {i_1, i_2} and x′_{i_1,j} = x′_{i_2,j} = (x_{i_1,j} + x_{i_2,j})/2. In other words, we evenly distribute all jobs between i_1 and i_2. Let us first check that the solution remains feasible. Let j ∈ J. Then

Σ_{i∈M} x′_{i,j} = x′_{i_1,j} + x′_{i_2,j} + Σ_{i∉{i_1,i_2}} x′_{i,j} = x_{i_1,j} + x_{i_2,j} + Σ_{i∉{i_1,i_2}} x_{i,j} = Σ_{i∈M} x_{i,j} = 1.

For all machines i ∉ {i_1, i_2} the load does not change and, hence, the load of machine i remains within [ℓ_i, u_i]. For i_1 and i_2, which are machines of the same type (so ℓ_{i_1} = ℓ_{i_2} and u_{i_1} = u_{i_2}), we argue

Σ_{j∈J} p_j x′_{i_1,j} = Σ_{j∈J} p_j x′_{i_2,j} = Σ_{j∈J} p_j (x_{i_1,j} + x_{i_2,j})/2 = (1/2) Σ_{j∈J} p_j x_{i_1,j} + (1/2) Σ_{j∈J} p_j x_{i_2,j} ≤ (1/2) u_{i_1} + (1/2) u_{i_2} = u_{i_1} = u_{i_2}

and

Σ_{j∈J} p_j x′_{i_1,j} = Σ_{j∈J} p_j x′_{i_2,j} = (1/2) Σ_{j∈J} p_j x_{i_1,j} + (1/2) Σ_{j∈J} p_j x_{i_2,j} ≥ (1/2) ℓ_{i_1} + (1/2) ℓ_{i_2} = ℓ_{i_1} = ℓ_{i_2}.

Hence, the solution remains feasible. As for the integrality constraints, again the machines i ∉ {i_1, i_2} do not change. Let k ∈ {1, . . . , 1/ǫ}. Since y_{i_1,k} ≡ y_{i_2,k} mod 2, we have that y_{i_1,k} + y_{i_2,k} is even. It follows that

Σ_{j∈J_k} x′_{i_1,j} = Σ_{j∈J_k} x′_{i_2,j} = (y_{i_1,k} + y_{i_2,k})/2

is integral. Now it remains to show that (10) has decreased. Notice that by convexity of the square function

(y′_{i_1,k})² + (y′_{i_2,k})² = 2 ((y_{i_1,k} + y_{i_2,k})/2)² ≤ (y_{i_1,k})² + (y_{i_2,k})²

and strict inequality holds when y_{i_1,k} ≠ y_{i_2,k}. Since this is the case for at least one k and all machines i ∉ {i_1, i_2} do not change, we have that (10) has decreased. A contradiction.

A.2 Details on Dynamic Program of Section 2.2
Here we present detailed proofs of Lemmas 5-7 and a detailed description of the dynamic program.
Proof of Lemma 5.
Conditions (3) and (4) follow directly from feasibility of the solution.

To show condition (5), let (x, y) be a solution with corresponding average load vector {z_{i,k}}_{i∈M, k∈{1,...,1/ǫ}}, where the values z_{i,k} with y_{i,k} = 0 are chosen appropriately. Let ẑ_1 ≤ ··· ≤ ẑ_n̄ be an ordering of the n̄ = |{(i, k) : y_{i,k} > 0}| values z_{i,k} for all i ∈ M, k ∈ {1, . . . , 1/ǫ} with y_{i,k} > 0. Assume that (x, y) is the solution maximizing the following potential function

Σ_{i=1}^{n̄} (2n)^{m/ǫ − i} ẑ_i. (11)

We will now show that in this case we can iteratively find an ordering of machines such that condition (5) holds, and otherwise get a contradiction with respect to the potential function. Let i be the machine minimizing Σ_{k=1}^{1/ǫ} z_{i,k}. All other machines i′ must satisfy one of the following two cases: (1) z_{i,k} ≤ z_{i′,k} for all k ∈ {1, ..., 1/ǫ}, or (2) z_{i,k} > z_{i′,k} for some k. If (1) holds for all machines i′ we relabel machine i as machine 1. Otherwise, let i′ ≠ i be a machine such that for some k_1

z_{i,k_1} > z_{i′,k_1}. (12)

Then, as i minimizes Σ_{k=1}^{1/ǫ} z_{i,k}, we know that there must exist k_2 ≠ k_1 with

z_{i,k_2} < z_{i′,k_2}. (13)

As we can freely choose the value of z_{ī,k′} whenever y_{ī,k′} = 0, we know that y_{i,k_1}, y_{i,k_2}, y_{i′,k_1}, y_{i′,k_2} > 0. We now gradually exchange jobs of J_{k_1} and J_{k_2} between i and i′ without changing the total load on either of the machines. Indeed, there must be some j_1, j′_1 ∈ J_{k_1} with x_{i,j_1} > 0, x_{i′,j′_1} > 0, and p_{j_1} > p_{j′_1}. Conversely, there are j_2, j′_2 ∈ J_{k_2} with x_{i,j_2} > 0, x_{i′,j′_2} > 0, and p_{j_2} < p_{j′_2}. For some δ_1, δ_2 > 0 we now augment the solution in the following way:

x_{i,j′_1} ← x_{i,j′_1} + δ_1    x_{i,j′_2} ← x_{i,j′_2} + δ_2
x_{i,j_1} ← x_{i,j_1} − δ_1    x_{i,j_2} ← x_{i,j_2} − δ_2
x_{i′,j′_1} ← x_{i′,j′_1} − δ_1    x_{i′,j′_2} ← x_{i′,j′_2} − δ_2
x_{i′,j_1} ← x_{i′,j_1} + δ_1    x_{i′,j_2} ← x_{i′,j_2} + δ_2

It is easy to see that for δ_1 and δ_2 sufficiently small each variable remains non-negative. Moreover, each job remains fully assigned and the number of jobs of J_{k_1} and J_{k_2} assigned to i and i′ remains the same. By setting δ_2 = δ_1 (p_{j_1} − p_{j′_1})/(p_{j′_2} − p_{j_2}) the load on each of the two machines stays the same. Furthermore, as p_{j_1} > p_{j′_1} and p_{j′_2} > p_{j_2}, we have that δ_1, δ_2 > 0. We choose δ_1 maximal such that all x variables remain non-negative and the inequalities (12) and (13) still hold or turn to equality. This means that we decreased z_{i,k_1} by δ_1(p_{j_1} − p_{j′_1})/y_{i,k_1} and z_{i′,k_2} by δ_2(p_{j′_2} − p_{j_2})/y_{i′,k_2}. At the same time we increased z_{i,k_2} by δ_2(p_{j′_2} − p_{j_2})/y_{i,k_2} and z_{i′,k_1} by δ_1(p_{j_1} − p_{j′_1})/y_{i′,k_1}. Since z_{i′,k_1} and z_{i,k_2} (the respective smaller z-variables for i and i′ that we change) increase by at least δ_1(p_{j_1} − p_{j′_1})/n and z_{i,k_1} and z_{i′,k_2} decrease by at most δ_1(p_{j_1} − p_{j′_1}), we have that (11) increases. This gives a contradiction. As we can repeat this argument iteratively assuming that machines {1, . . . , i} are correctly sorted for some i ∈ {1, . . . , m}, we have that there exists a solution (x, y) with vector {z_{i,k}}_{i∈M, k∈{1,...,1/ǫ}} such that condition (5) holds.

Proof of Lemma 6.
We first show that there exists an assignment vector {x_{i,j}}_{i∈M, j∈J} satisfying

Σ_{j∈J_k} p_j x_{i,j} ≤ y_{i,k} z_{i,k} (14)

for all i ∈ M and k ∈ {1, . . . , 1/ǫ}. In order to do so we use condition (3) of Lemma 5. We find this assignment independently for all k. We start by assigning J_k^min(y_{1,k}) (completely) to machine 1, then J_k^min(y_{1,k} + y_{2,k}) \ J_k^min(y_{1,k}) to machine 2, etc. This assignment does not necessarily have the desired property (14). Hence, we repair the property iteratively for i = 2, . . . , m. Machine 1 clearly satisfies (14) because of (3). Let i ∈ {2, . . . , m} such that all machines 1, . . . , i − 1 satisfy (14). In iteration i we do not touch any of the machines i + 1, . . . , m. Hence, when repairing machine i we may assume that machines 1, . . . , i contain only J_k^min(y_{1,k} + ··· + y_{i,k}). If machine i satisfies (14) we are done and continue with i + 1. Otherwise, we know that there is a job j with p_j > z_{i,k} and x_{i,j} > 0. Moreover, because of condition (3) we have

Σ_{i′=1}^{i} Σ_{j∈J_k^min(y_{1,k}+···+y_{i,k})} p_j x_{i′,j} = Σ_{j∈J_k^min(y_{1,k}+···+y_{i,k})} p_j ≤ y_{1,k} z_{1,k} + ··· + y_{i,k} z_{i,k}. (15)

Since i violates (14) there must be some i′ < i satisfying (14) with strict inequality. In particular, there is a job j′ with p_{j′} < z_{i′,k} ≤ z_{i,k} < p_j and x_{i′,j′} > 0. We now choose an α > 0 and exchange j′ and j between i and i′ as follows:

x_{i,j′} ← x_{i,j′} + α    x_{i′,j′} ← x_{i′,j′} − α
x_{i,j} ← x_{i,j} − α    x_{i′,j} ← x_{i′,j} + α

Clearly, the solution remains feasible. We choose α maximal such that either i′ satisfies (14) with equality, i satisfies (14), x_{i,j} = 0, or x_{i′,j′} = 0. The choice of α makes sure that each pair j, j′ that can be exchanged like this will only be exchanged once. This procedure is repeated until i satisfies (14).
As the procedure is repeated for all i and possibly has to check all pairs of every job type in each exchange, we have a running time of O(mn²).

Next, we claim that for all i and k we have

Σ_{j∈J_k} p_j x_{i,j} ≥ y_{i,k} z_{i,k} − δǫ · p_max. (16)

To prove this claim, assume by contradiction that for some machine i′ (16) does not hold. Then by (14) and condition (4), we have that

Σ_{j∈J_k} Σ_{i∈M} x_{i,j} p_j < Σ_{i∈M} y_{i,k} z_{i,k} − δǫ · p_max ≤ Σ_{j∈J_k} p_j. (17)

This contradicts the fact that all jobs are fully assigned and thus Σ_{j∈J_k} Σ_{i∈M} p_j x_{i,j} = Σ_{j∈J_k} p_j. Hence, we have that

Σ_{k=1}^{1/ǫ} Σ_{j∈J_k} p_j x_{i,j} ≥ Σ_{k=1}^{1/ǫ} y_{i,k} z_{i,k} − δ · p_max ≥ ℓ_i − δ · p_max, (18)

where the last inequality follows from condition (6).

Based on Lemmas 5 and 6, the goal of our DP is to compute vectors {y_{i,k}, z_{i,k}}_{i∈M, k∈{1,...,1/ǫ}} that satisfy the conditions of Lemma 6. The key insight is that when we consider the next machine i′ in the ordering, we do not need to remember the vectors {y_{i,k}, z_{i,k}} of all previously considered machines i; it suffices to remember the number of previously assigned jobs from each set J_k, the current left-hand side of inequality (3) for each k, the vector {z_{i′′,k}}_{k∈{1,...,1/ǫ}} of the previously considered machine i′′, and, for each type of machine, the number of previously guessed machines of this type. We say that two machines i, i′ ∈ M are of the same type if [ℓ_i, u_i] = [ℓ_{i′}, u_{i′}]. Recall that K denotes the number of different types of machines. Let K := {[ℓ_i, u_i] | i ∈ M} and define ℓ^(1), ..., ℓ^(|K|), u^(1), ..., u^(|K|) such that K = {[ℓ^(1), u^(1)], ..., [ℓ^(|K|), u^(|K|)]}. For each r ∈ {1, ..., K} let m_r denote the number of machines i ∈ M such that [ℓ_i, u_i] = [ℓ^(r), u^(r)].
We say that such a machine is of type r.

We introduce a DP-table with one cell for each combination of

• a value i ∈ {1, ..., m} indicating the number of machines that have already been considered,
• a value m′_r ∈ {0, ..., m_r} for each type r ∈ {1, ..., |K|} indicating the number of machines of type r for which we have already defined vectors {y_{i′,k}, z_{i′,k}}_{k∈{1,...,1/ǫ}}; intuitively, let M′ denote these machines,
• a vector {z_{i,k}}_{k∈{1,...,1/ǫ}} where z_{i,k} ∈ {0, δǫ·p_max/n, 2δǫ·p_max/n, ..., p_max} for each k ∈ {1, . . . , 1/ǫ}; the value z_{i,k} corresponds to the average load of class k on the currently considered machine,
• a vector {y_{i,k}}_{k∈{1,...,1/ǫ}} where y_{i,k} ∈ {0, 1, 2, ..., n_k} for each k ∈ {1, . . . , 1/ǫ}; the value y_{i,k} corresponds to the number of jobs of class k on the currently considered machine,
• for each k ∈ {1, ..., 1/ǫ}
  – a value n′_k ∈ {0, ..., |J_k|} indicating the number of jobs in J_k that were previously assigned to machines in M′,
  – a value S_k ∈ {0, δǫ·p_max/n, 2δǫ·p_max/n, ..., n·p_max} which corresponds to the value Σ_{i′=1}^{i} y_{i′,k} z_{i′,k}.

Each cell corresponds to the subproblem of checking whether there is a solution using machines {1, . . . , i} such that each machine type r ∈ {1, . . . , K} is used m′_r times, machine i is assigned y_{i,k} jobs of each class k with average load z_{i,k}, for each class k a total of n′_k jobs is assigned, and the total volume assigned of each class k is S_k. Due to the ranges of the values defining a DP-cell, the size of the DP table is m^{K+1} · (n/(δǫ))^{O(1/ǫ)}.
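To make the prefix-based cell structure concrete, the following toy sketch (Python; the instance data and all names are hypothetical) runs a drastically simplified version of this DP: jobs are grouped into size classes, machines are processed one by one in a fixed order, and the state only remembers how many jobs of each class are still unassigned. The rounded average loads z_{i,k}, the volumes S_k, and the per-type counters of the actual DP are omitted, so this illustrates only the recursion scheme, not the algorithm above.

```python
from functools import lru_cache

# Toy instance (hypothetical): size class k has class_counts[k] jobs,
# all of (rounded) size class_sizes[k].
class_sizes = [3, 5]
class_counts = [4, 2]
# machine types r with target load intervals [l_r, u_r] and multiplicities m_r;
# fixing one machine order up front replaces the per-type counters of the DP
types = [((10, 11), 1), ((5, 6), 2)]
machines = [interval for interval, mult in types for _ in range(mult)]

def feasible():
    @lru_cache(maxsize=None)
    def solve(i, remaining):
        """Can machines i, i+1, ... absorb the 'remaining' jobs per class?"""
        if i == len(machines):
            return all(r == 0 for r in remaining)
        lo, up = machines[i]

        def guesses(k, load, rem):
            # enumerate how many jobs of each class machine i receives
            if k == len(class_sizes):
                return lo <= load <= up and solve(i + 1, tuple(rem))
            for y in range(rem[k] + 1):
                rem2 = list(rem)
                rem2[k] -= y
                if guesses(k + 1, load + y * class_sizes[k], rem2):
                    return True
            return False

        return guesses(0, 0, list(remaining))

    return solve(0, tuple(class_counts))

print(feasible())  # True: e.g. loads 10 (two 5s), 6 (two 3s), 6 (two 3s)
```

One design point carries over from the text: memoizing on the pair (machine index, unassigned-jobs vector) is exactly why only a prefix summary, and not the full history of assignments, has to be stored.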
When considering cell

(i, {m′_r}_{r∈{1,...,|K|}}, {z_{i,k}}_{k∈{1,...,1/ǫ}}, {y_{i,k}}_{k∈{1,...,1/ǫ}}, {n′_k}_{k∈{1,...,1/ǫ}}, {S_k}_{k∈{1,...,1/ǫ}}),

the DP proceeds as follows: for every machine type r* ∈ {1, ..., K} it checks whether for some {z̃_{i−1,k}}_{k∈{1,...,1/ǫ}} and {ỹ_{i−1,k}}_{k∈{1,...,1/ǫ}} the value of the following cell is true:

(i − 1, {m̃_r}_{r∈{1,...,|K|}}, {z̃_{i−1,k}}_{k∈{1,...,1/ǫ}}, {ỹ_{i−1,k}}_{k∈{1,...,1/ǫ}}, {ñ′_k}_{k∈{1,...,1/ǫ}}, {S̃_k}_{k∈{1,...,1/ǫ}}),

where m̃_r = m′_r for each r ≠ r* and m̃_{r*} = m′_{r*} − 1, ñ′_k = n′_k − y_{i,k} for each k ∈ {1, . . . , 1/ǫ}, and S̃_k = S_k − y_{i,k} z_{i,k} for each k ∈ {1, . . . , 1/ǫ}. Then we need to check whether the following conditions hold for all k ∈ {1, ..., 1/ǫ}:

S_k ≤ δǫ · p_max + Σ_{j∈J_k} p_j (19)
z_{i,k} ≥ z̃_{i−1,k} (20)

If these conditions are true, then there exists a solution corresponding to the considered DP cell. Filling each cell takes time K · (n/(δǫ))^{O(1/ǫ)}.

Finally, for each possible value of {z_{m,k}}_{k∈{1,...,1/ǫ}} and {y_{m,k}}_{k∈{1,...,1/ǫ}} we check whether there exists a solution for the DP cell

(m, {m_r}_{r∈{1,...,|K|}}, {z_{m,k}}_{k∈{1,...,1/ǫ}}, {y_{m,k}}_{k∈{1,...,1/ǫ}}, {n′_k}_{k∈{1,...,1/ǫ}}, {S_k}_{k∈{1,...,1/ǫ}})

in which the correct number of each machine type is considered, all jobs are assigned, and for every k ∈ {1, . . . , 1/ǫ} we have that

Σ_{j∈J_k} p_j ≤ S_k ≤ Σ_{j∈J_k} p_j + δǫ · p_max.

If this is the case we use standard backward recursion to find vectors {z_{i,k}}_{k∈{1,...,1/ǫ}} and {y_{i,k}}_{k∈{1,...,1/ǫ}} for all i ∈ {1, ..., m} and an assignment of machine types to machine indices, and use Lemma 6 to obtain a solution (x, y) to slot-MILP′. If there is no such solution we assert that there is no solution to the original relaxation, i.e., to the slot-MILP.

Proof of Lemma 7.
The running time follows from the size of the DP table and the time it takes to validate a specific DP cell. This amounts to a running time of m^{K+1} · (n/(δǫ))^{O(1/ǫ)}.

For the correctness of the DP we need two observations: (1) due to conditions (19) and (20) and the way we check whether a solution corresponding to a DP cell exists, there exists a solution for machine i if and only if there is a solution for machine i − 1; hence, we can indeed find a solution via backward recursion; and (2) if for some {z_{m,k}}_{k∈{1,...,1/ǫ}} and {y_{m,k}}_{k∈{1,...,1/ǫ}} there is a solution for the cell

(m, {m_r}_{r∈{1,...,|K|}}, {z_{m,k}}_{k∈{1,...,1/ǫ}}, {y_{m,k}}_{k∈{1,...,1/ǫ}}, {n′_k}_{k∈{1,...,1/ǫ}}, {S_k}_{k∈{1,...,1/ǫ}}),

then we can apply Lemma 6 to find a solution to slot-MILP′. If there is no such solution, we know that due to our rounding of the z-values there is also no solution satisfying Lemma 5. This implies that there is no solution to the slot-MILP.

B Details on the local search algorithm
We first show that Lemma 8 also holds for the second stage of the algorithm.
Proof of Lemma 8 for the second stage.
We want to find a job j of type k assigned to a machine with load less than ℓ_i and swap it with some job j′ of the same type with p_{j′} > p_j currently assigned to a machine i′ with load at least ℓ_{i′}. Suppose towards contradiction that no such pair of jobs exists. Then we know that the machines in M_1 ∪ ··· ∪ M_ℓ are assigned the largest jobs of type k while having a load less than their specific lower bounds. This implies that even in a fractional solution the total load on these machines cannot be increased and, hence, at least one machine violates its lower target. This gives a contradiction.

As every job is fully assigned to a machine, each job of a type k which is assigned to a machine in the sets M_1, M_2, . . . , M_ℓ is considered exactly once and compared to every other job of type k not assigned to these machines. As the existence of a pair was shown for an arbitrary k, this gives a worst-case running time of O(n²).

Next, we present the proofs of Lemmas 9 and 10.

Proof of Lemma 9.
We observe how the graph changes when swapping two jobs. Throughout the proof the distance d(w) of a slot w is the distance from s to w before the swap is executed. Clearly, removing any edges from the graph cannot decrease the distances of vertices. Edges between slots of the same machine do not change, and no new edges can be added from the source, since during the execution of the algorithm a machine with load at most u_i + ǫ·p_max will never exceed u_i + ǫ·p_max. Hence, it suffices to look at the changes in edges of weight 1. Although such an edge (u, v) might be added to the graph, we will show that this happens only when d(v) ≤ d(u) + 1. Adding this edge cannot decrease any distances, since the first part of a shortest path using (u, v) could always be replaced by a path to v without this edge.

Now we have to check that these are the only changes made to the graph. Let u, v be the slots in which we exchange the jobs. The size of the job in u decreases; the size of the job in v increases. We only need to look at the incoming and outgoing edges of u and v, since all other edges remain the same.

Consider the incoming edges of u. Since the size of the job in u decreases, there could be new incoming edges. Let (w, u) be an edge of weight 1 that is added. This means the job in slot w has a larger size than the new job in u. Either w is a slot on the same machine as v, or (w, v) is in the graph before the swap. The former case implies that d(w) = d(v) = d(u) + 1. In the latter case we have d(u) = d(v) − 1 ≤ d(w). No outgoing edge from u can be added, since the size of u's job decreases.

Now consider v. Since the size of its job increases, no incoming edge can be added. As for the outgoing edges, let (v, w) be an outgoing edge added by the swap. Then either u and w are slots on the same machine, or (u, w) was in the graph before the swap. In the former case, d(w) = d(u) = d(v) − 1. In the latter case, d(w) ≤ d(u) + 1 = d(v).
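Since the slot graph only has edges of weight 0 and 1, the breadth-first search above can be realized as a standard deque-based "0-1 BFS". A minimal, self-contained sketch (the graph here is an illustrative example, not the slot graph of the paper):

```python
from collections import deque

def zero_one_bfs(adj, source):
    """adj: vertex -> list of (neighbor, weight) with weight in {0, 1}.
    Returns shortest distances from source; weight-0 edges keep a vertex
    on the same distance level, weight-1 edges increase it by one."""
    dist = {source: 0}
    dq = deque([source])
    while dq:
        v = dq.popleft()
        for w, weight in adj[v]:
            d = dist[v] + weight
            if d < dist.get(w, float("inf")):
                dist[w] = d
                # weight-0 neighbors go to the front so they are
                # explored before any strictly farther vertex
                dq.appendleft(w) if weight == 0 else dq.append(w)
    return dist

# tiny example: s -0-> a -1-> b -0-> c
adj = {"s": [("a", 0)], "a": [("b", 1)], "b": [("c", 0)], "c": []}
print(zero_one_bfs(adj, "s"))  # {'s': 0, 'a': 0, 'b': 1, 'c': 1}
```

In the algorithm's terms, the vertices at distance at most ℓ correspond to the slots of the machines M_1 ∪ ··· ∪ M_ℓ reachable within ℓ swaps.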
Proof of Lemma 10.
Let j_1, . . . , j_n be the jobs of J in increasing order of size. We claim that the potential

Σ_{i=1}^{n} i · d(j_i)

increases with every swap. Here d(j_i) denotes the distance from s to the slot to which j_i is assigned. Since the potential is integral and bounded by n³, the claim follows. Let j_k, j_h be the jobs that are swapped. Assume that k < h, i.e., p_{j_k} < p_{j_h}. Let d, d′ be the distance functions before and after the swap. Based on Lemma 9 we have d′(j_i) ≥ d(j_i) for all i ∉ {k, h}, as well as d′(j_h) ≥ d(j_k) and d′(j_k) ≥ d(j_h), since these jobs swapped their slots. Moreover, d(j_k) > d(j_h), since the swap moves the larger job j_h along a weight-1 edge towards the more distant slot of j_k. It follows that

k · d′(j_k) + h · d′(j_h) ≥ k · d(j_h) + h · d(j_k) = k · d(j_h) + (h − k) · d(j_k) + k · d(j_k) > h · d(j_h) + k · d(j_k).

For the second stage, the analogues of Lemmas 9 and 10 follow from similar arguments. We again construct a graph on |J| + 1 vertices. The difference in the construction is that there is an edge from s to every slot on a machine with load less than ℓ_i − ǫ·p_max, and there is an edge of weight 1 from slot u to v when (1) u and v are not on the same machine, (2) u and v belong to the same size class, and (3) u is currently assigned a smaller job than v. Again, the breadth-first search starts at s, and once a machine with load at least ℓ_i is reached, the algorithm selects the jobs assigned to the vertices u and v corresponding to the edge (u, v) over which the machine was reached and swaps them. This procedure is continued until for every machine i the load is at least ℓ_i − ǫ·p_max. Due to Lemma 8 each iteration finishes in O(n²).

Lemma 16.
The distance from s to any slot does not decrease by a swap in the second stage of the algorithm.

Proof. Let d(w) denote the distance from s to slot w before a swap. Clearly, removing edges does not decrease the distance from s to any vertex. Edges between slots of the same machine do not change, and we do not add new edges from the source, since we only decrease the load on machines with load at least ℓ_i and, hence, do not decrease any machine load below ℓ_i − ǫ·p_max. So we only have to consider the changes in edges of weight 1. Such an edge (u, v) might be added to the graph, but only when d(v) ≤ d(u) + 1. Adding this edge will not decrease any distances, since the first part of a shortest path using (u, v) can always be replaced by a path to v not using (u, v). We have to check that these are the only changes made to the graph.

Let u and v be the slots in which we exchange the jobs. This implies that the size of the job assigned to slot u increases and the size of the job assigned to slot v decreases. Next we look at the incoming edges of u and outgoing edges of v (all others remain the same). As the size of the job assigned to slot u increases, there could be a new incoming edge from slots with smaller jobs. Let (w, u) be such an edge. Then either slot w is on the same machine as v, or (w, v) was an edge of weight 1 before the swap. The former case implies that d(w) = d(v) = d(u) + 1, and in the latter case d(u) = d(v) − 1 ≤ d(w). In both cases we have that d(u) ≤ d(w) + 1, which satisfies the property above. Next consider the new outgoing edges of v. Let (v, w) be such an edge. Then the size of the job assigned to w is larger than the job now assigned to v, which was originally assigned to u. So either u and w are on the same machine and d(w) = d(u) = d(v) − 1, or (u, w) was an edge of weight 1 in the original graph and d(w) ≤ d(u) + 1 = d(v).
The second stage of the algorithm terminates after at most O(n³) swaps.

Proof. Let j_1, . . . , j_n be the jobs in J in decreasing order of size. We claim that the potential

Σ_{i=1}^{n} i · d(j_i)

increases with every swap. Let j_k and j_h be the jobs that were swapped and assume that k < h, i.e., p_{j_k} > p_{j_h}. This implies that the edge from the slot of j_h to the slot of j_k was deleted and new edges were constructed as described above. Let d and d′ be the distances before and after the swap, respectively. Based on Lemma 16 we have that d′(j_i) ≥ d(j_i) for all i ∉ {h, k}, as well as d′(j_h) ≥ d(j_k) and d′(j_k) ≥ d(j_h). Furthermore, we know that d(j_k) > d(j_h). From this it follows that

k · d′(j_k) + h · d′(j_h) ≥ k · d(j_h) + h · d(j_k) = k · d(j_h) + (h − k) · d(j_k) + k · d(j_k) > h · d(j_h) + k · d(j_k).

This concludes the proof.

Hence, similar to the first stage, the second stage of the algorithm finishes in time n^{O(1)}.
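As a closing illustration of the first-stage swap rule, the following sketch (Python; the instance data are hypothetical) repeatedly swaps a job on an overloaded machine with a strictly smaller job of the same size class elsewhere. It scans for any admissible swap instead of following the BFS ordering analyzed above, so termination is only enforced here by an iteration cap; the BFS-based potential argument is what guarantees termination in the paper.

```python
def first_stage(sizes, cls, assign, u, eps, p_max, max_iters=1000):
    """Swap same-class jobs until every machine's load is at most
    u[i] + eps*p_max, mirroring the first-stage rule of Lemma 8.
    sizes[j], cls[j]: size and size class of job j; assign[j]: its machine."""
    m, n = len(u), len(sizes)
    load = lambda i: sum(sizes[j] for j in range(n) if assign[j] == i)
    for _ in range(max_iters):
        over = [i for i in range(m) if load(i) > u[i] + eps * p_max]
        if not over:
            return assign                      # all upper targets met
        done = False
        for j in range(n):                     # a job on an overloaded machine
            if assign[j] not in over or done:
                continue
            for j2 in range(n):                # strictly smaller same-class job
                if (assign[j2] not in over and cls[j2] == cls[j]
                        and sizes[j2] < sizes[j]):
                    assign[j], assign[j2] = assign[j2], assign[j]
                    done = True
                    break
        if not done:
            return None                        # no admissible swap exists
    return None

# toy instance: two machines, upper targets u = [4, 4], eps*p_max = 1
sizes, cls = [3, 3, 2, 2], [0, 0, 0, 0]
result = first_stage(sizes, cls, [0, 0, 1, 1], u=[4, 4], eps=1 / 3, p_max=3)
loads = [sum(s for s, i in zip(sizes, result) if i == mach) for mach in (0, 1)]
print(loads)  # [5, 5]: both loads are at most u[i] + eps*p_max = 5
```

On this instance a single swap (a size-3 job for a size-2 job) brings the overloaded machine from load 6 down to 5, within the additive ǫ·p_max slack, without pushing the other machine above its relaxed bound.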