Complexity of Scheduling Few Types of Jobs on Related and Unrelated Machines

Abstract

The task of scheduling jobs to machines while minimizing the total makespan, the sum of weighted completion times, or a norm of the load vector is among the oldest and most fundamental tasks in combinatorial optimization. Since all of these problems are in general NP-hard, much attention has been given to the regime where there is only a small number $k$ of job types, but possibly the number of jobs $n$ is large; this is the few job types, high-multiplicity regime. Despite many positive results, the hardness boundary of this regime was not understood until now. We show that makespan minimization on uniformly related machines ($Q|HM|C_{\max}$) is NP-hard already with 6 job types, and that the related Cutting Stock problem is NP-hard already with 8 item types. For the more general unrelated machines model ($R|HM|C_{\max}$), we show that if either the largest job size $p_{\max}$ or the number of jobs $n$ is polynomially bounded in the instance size $|I|$, there are algorithms with complexity $|I|^{\mathrm{poly}(k)}$. Our main result is that this is unlikely to be improved, because $Q||C_{\max}$ is W[1]-hard parameterized by $k$ already when $n$, $p_{\max}$, and the numbers describing the speeds are polynomial in $|I|$; the same holds for $R|HM|C_{\max}$ (without speeds) when the job sizes matrix has rank 2. Our positive and negative results also extend to the objectives $\ell_2$-norm minimization of the load vector and, partially, sum of weighted completion times $\sum w_j C_j$. Along the way, we answer affirmatively the question whether makespan minimization on identical machines ($P||C_{\max}$) is fixed-parameter tractable parameterized by $k$, extending our understanding of this fundamental problem. Together with our hardness results for $Q||C_{\max}$, this implies that the complexity of $P|HM|C_{\max}$ is the only remaining open case.

Martin Koutecký

Computer Science Institute, Charles University, Czech Republic

Johannes Zink

Institut für Informatik, Universität Würzburg, Germany

2012 ACM Subject Classification

Theory of computation → Fixed parameter tractability; Theory of computation → Scheduling algorithms

Keywords and phrases

Scheduling, cutting stock, hardness, parameterized complexity

Acknowledgements

We thank the organizers of the HOMONOLO 2019 workshop for providing a warm and stimulating research environment, which gave birth to the initial ideas of this paper. We also thank the anonymous reviewers for their helpful remarks. M. Koutecký was partially supported by Charles University project UNCE/SCI/004 and by the project 19-27871X of GA ČR.

Makespan minimization is arguably the most natural and most studied scheduling problem: in the parallel machines model, we have $m$ machines and $n$ jobs with sizes $p_1, \dots, p_n$, and the task is to assign the jobs to machines such that the largest sum of sizes of jobs on any machine is minimized. Seen differently, this is the (decision version of the) Bin Packing problem: can a set of items be packed into a given number of bins?

Bin Packing is NP-hard, so it is natural to ask which restrictions make it polynomial-time solvable. Say there are only $k$ distinct item sizes $p_1, \dots, p_k$, so that the items are given by a vector of multiplicities $n_1, \dots, n_k$ with $n = \sum_{j=1}^{k} n_j$; let $p_{\max} = \max_j p_j$. Goemans and Rothvoss [10] showed that Bin Packing can be solved in time $(\log p_{\max})^{f(k)} \mathrm{poly}\log n$ for some function $f$. Note that makespan minimization is polynomial when $k$ is fixed by simple dynamic programming; the difficult question is whether it is still polynomial in the high-multiplicity setting where jobs are encoded by the multiplicity vector $\mathbf{n} = (n_1, \dots, n_k)$. By the equivalence with scheduling, Goemans and Rothvoss showed that high-multiplicity makespan minimization on identical machines is polynomial if the number of job types $k$ is fixed.

Since 2014, considerable attention has been given to studying the complexity of various scheduling problems in the regime with few job types [3, 11–14, 19–21, 25], and similar techniques have been used to obtain approximation algorithms [15, 17, 23]. However, any answer to the following simple and natural question was curiously missing:

What is the most restricted machine model in which high-multiplicity makespan minimization becomes NP-hard, even when the number of job types is fixed?

There are three main machine models in scheduling: identical, uniformly related, and unrelated machines. In the uniformly related machines model, machine $M_i$ (for $i \in [m]$) additionally has a speed $s_i$, and processing a job of size $p_j$ takes time $p_j / s_i$ on such a machine. In the unrelated machines model, each machine $M_i$ (for $i \in [m]$) has its own vector of job sizes $\mathbf{p}^i = (p^i_1, \dots, p^i_k)$, so that $p^i_j$ is the time to process a job of type $j$ on machine $M_i$. The makespan minimization problem in the identical, uniformly related, and unrelated machines model is denoted shortly as $P||C_{\max}$, $Q||C_{\max}$, and $R||C_{\max}$ [22], respectively, with the high-multiplicity variant being $P|HM|C_{\max}$ and analogously for the other models. Notice that the job sizes matrix $p$ of a $Q||C_{\max}$ instance is of rank 1: the vector $\mathbf{p}^i$ for machine $M_i$ is simply $\mathbf{p}/s_i$ for $\mathbf{p} = (p_1, \dots, p_k)$, and $p = \mathbf{p} \cdot (1/\mathbf{s})^\top$ for the speeds vector $\mathbf{s} = (s_1, \dots, s_m)$. Hence, the rank of the job sizes matrix has been studied [1–3] as a helpful measure of complexity of an $R||C_{\max}$ instance: intuitively, the smaller the rank, the closer the instance is to $Q||C_{\max}$. We answer the question above:

Theorem 1. $Q|HM|C_{\max}$ is NP-hard already for 6 job types.

The Cutting Stock problem relates to Bin Packing in the same way as $Q||C_{\max}$ relates to $P||C_{\max}$: instead of all bins having the same capacity, there are now several bin types with different capacities and costs, and the task is to pack all items into bins of minimum total cost. Cutting Stock is a famous and fundamental problem whose study dates back to the ground-breaking work of Gilmore and Gomory [9]. It is thus surprising that the natural question whether Cutting Stock with a fixed number of item types is polynomial or NP-hard has not been answered until now:

Theorem 2. Cutting Stock is NP-hard already with 8 item types.

Parameterized Complexity.

A more precise complexity landscape can be obtained by taking the perspective of parameterized complexity: we say that a problem is fixed-parameter tractable (FPT, or in FPT, for short) parameterized by a parameter $k$ if there is an algorithm solving any instance $I$ in time $f(k)\,\mathrm{poly}(|I|)$, for some computable function $f$. On the other hand, showing that a problem is W[1]-hard means it is unlikely to have such an algorithm, and the best one might hope for is a complexity of the form $|I|^{f(k)}$; we then say that a problem is in XP (or that it has an XP algorithm); see the textbook [6].

Footnote (on the running time of [10] quoted above): The complexity stated in [10] is $(\log \max\{C_{\max}, n\})^{f(k)} \mathrm{poly}\log n$, but a close inspection of their proof reveals that (a) the dependence on $n$ is unnecessary, and (b) it is possible to use a better bound on the number of vertices of a polytope and obtain the complexity $(\log p_{\max})^{f(k)} \mathrm{poly}\log n$ stated here.

The hard instance $I$ from Theorem 1 is encoded by a job sizes matrix $p$, a job multiplicities vector $\mathbf{n}$, and a machine speeds vector $\mathbf{s}$ which all contain long numbers, i.e., entries with encoding length $\Omega(|I|)$. What happens when some of $p$, $\mathbf{n}$, and $\mathbf{s}$ are restricted to numbers bounded by $\mathrm{poly}(|I|)$, or, equivalently, if they are encoded in unary?

A note of caution: since we allow speeds to be rational, and the encoding length of a fraction $p/q$ is $\lceil \log p \rceil + \lceil \log q \rceil$, a $Q||C_{\max}$ instance with $\mathbf{s}$ of polynomial length might translate to an $R||C_{\max}$ instance with $p$ of exponential length. This is because for $p$ to be integer, one needs to scale it up by the least common multiple of the denominators in $\mathbf{s}$, which may be exponential in $m$. Thus, with respect to the magnitude of $\mathbf{n}$ and $p$, $R|HM|C_{\max}$ cannot be treated as a generalization of $Q|HM|C_{\max}$. This is why in the following we deal with both problems and not just the seemingly more or less general one.
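The scaling effect described in the note of caution can be illustrated with a small sketch. It is only an illustration under assumptions: the function name `integer_size_matrix` is hypothetical, job sizes are taken to be integers, and the scaling factor used here is the least common multiple of the denominators that appear after dividing by the speeds (not necessarily the smallest factor that works).

```python
from fractions import Fraction
from math import lcm  # available since Python 3.9


def integer_size_matrix(p, speeds):
    """Scale a related-machines instance to an integer unrelated-machines matrix.

    p      -- list of integer job sizes, one per job type (assumption)
    speeds -- list of positive rational machine speeds
    Returns (scale, rows), where rows[i][j] = scale * p[j] / speeds[i].
    """
    # Processing time of type j on machine i is p[j] * (1 / speeds[i]).
    inverse = [1 / Fraction(s) for s in speeds]
    # One common factor that clears every denominator of the reciprocal speeds.
    scale = lcm(*(f.denominator for f in inverse))
    rows = [[int(pj * f * scale) for pj in p] for f in inverse]
    return scale, rows


# Speeds 2, 3, 5, 7 are tiny numbers, yet the common factor is their product.
scale, rows = integer_size_matrix([1, 4], [2, 3, 5, 7])
```

With the first $m$ primes as speeds, the factor equals the product of those primes, which grows exponentially in $m$ although every individual speed stays small.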
For $Q|HM|C_{\max}$, we denote by $p_{\max}$ the largest job size before scaling, i.e., if $p = \mathbf{p} \cdot (1/\mathbf{s})^\top$, then $p_{\max} = \|\mathbf{p}\|_\infty$.

Having $n$ polynomially bounded is equivalent to giving each job explicitly; note that in this setting $R|HM|C_{\max}$ strictly generalizes $Q|HM|C_{\max}$. A simple DP handles this case:

Theorem 3. $\{R, Q\}|HM|C_{\max}$ and $\{R, Q\}||C_{\max}$ can be solved in time $m \cdot n^{O(k)}$, hence $\{R, Q\}||C_{\max}$ is in XP parameterized by $k$.

A similar situation occurs if $n$ is allowed to be large, but $p$ is polynomially bounded, although the use of certain integer programming tools [7] is required:

Theorem 4. $\{R, Q\}|HM|C_{\max}$ can be solved in time $p_{\max}^{O(k^2)} m \log m \log n$, hence $\{R, Q\}|HM|C_{\max}$ are in XP parameterized by $k$ if $p_{\max}$ is given in unary.

Our main result is that an FPT algorithm for $Q|HM|C_{\max}$ is unlikely to exist even when $\mathbf{n}$, $\mathbf{p}$, and $\mathbf{s}$ are encoded in unary, and for $R|HM|C_{\max}$ even when the rank of $p$ is 2:

Theorem 5. $X||C_{\max}$ is W[1]-hard parameterized by the number of job types with
(a) $X = Q$ and $\mathbf{n}$, $\mathbf{p}$, and $\mathbf{s}$ given in unary,
(b) $X = R$ and $\mathbf{n}$ and $p$ given in unary and $\mathrm{rank}(p) = 2$.

We use a result of Jansen et al. [16] as the basis of our hardness reduction. They show that Bin Packing is W[1]-hard parameterized by the number of bins even if the items are given in unary. In the context of scheduling, this means that $P||C_{\max}$ is W[1]-hard parameterized by the number of machines already when $p_{\max}$ is polynomially bounded. However, it is non-obvious how to "transpose" the parameters, that is, how to go from many job types and few machines to few job types and many machines which differ as little as possible (i.e., only by their speeds, or only in a low-rank way). We first show W[1]-hardness of Balanced Bin Packing, where we additionally require that the number of items in each bin is identical, parameterized by the number of bins, even for tight instances in which each bin has to be full. Using this additional property, we are able to construct an $R|HM|C_{\max}$ instance of makespan $T$ in which optimal solutions are in bijection with optimal packings of the encoded Balanced Bin Packing instance. Our $R|HM|C_{\max}$ instance uses one job type to "block out" a large part of a machine's capacity so that its remaining capacity depends on the item the machine represents, and all other job types have sizes independent of which machine they run on. Since the capacity of a machine exactly corresponds to its speed, omitting those "blocker" jobs and setting the machine speeds gives a hard instance for $Q|HM|C_{\max}$.

Let us go back to $P|HM|C_{\max}$. As mentioned previously, Goemans and Rothvoss showed that if the largest job size $p_{\max}$ is polynomially bounded, the problem is FPT because $(\log p_{\max})^{f(k)} \mathrm{poly}\log n \le g(k) \cdot p_{\max}^{o(1)} \mathrm{poly}\log n$ [6, Exercise 3.18]. We answer the remaining question whether the problem is in FPT also when all jobs are given explicitly:

Theorem 6. $P||C_{\max}$ is FPT parameterized by $k$.

This result partially answers [24, Question 5], which asks for an FPT algorithm for $P|HM|C_{\max}$. Obtaining this answer turns out to be surprisingly easy: we reduce the job sizes by a famous algorithm of Frank and Tardos [8] and then apply the algorithm of Goemans and Rothvoss [10], which is possible precisely when $n$ is sufficiently small. This extends our understanding of the complexity of $P|HM|C_{\max}$: the problem is FPT if either the largest job or the number of jobs is not too large. Hence, the remaining (and major) open problem is the complexity of $P|HM|C_{\max}$ parameterized by $k$, without any further assumptions on the magnitude of $p_{\max}$ or $n$. In light of this, our result that already $Q|HM|C_{\max}$ is NP-hard when $p_{\max}$ and $n$ are large, and W[1]-hard if both are polynomially bounded, may be interpreted as an indication that the magnitude of $n$ and $p_{\max}$ plays a surprisingly important role, and that $P|HM|C_{\max}$ may in fact not be FPT parameterized by $k$.

Table 1. Overview of the computational hardness of $\{P, Q, R\}|\{\_, HM\}|\{C_{\max}, \ell_2, \sum w_j C_j\}$ relative to the number of job types $k$. A question mark denotes an open case.

Objective | $P \| \ldots$ | $Q \| \ldots$ | $R \| \ldots$ | $P \mid HM \mid \ldots$ | $Q \mid HM \mid \ldots$ | $R \mid HM \mid \ldots$
$C_{\max}$ | FPT (Thm. 6) | XP (Theorem 3), W[1]-hard (Thm. 5) | XP (Theorem 3), W[1]-hard (Thm. 5) | poly. time for const. $k$ | NP-hard for $k \ge 6$ | NP-hard for const. $k$
$\ell_2$ | ? | XP (Theorem 3), W[1]-hard (Cor. 23) | XP (Theorem 3), W[1]-hard (Cor. 23) | ? | NP-hard for const. $k$ | NP-hard for const. $k$
$\sum w_j C_j$ | ? | XP (Theorem 3), ? | XP (Theorem 3), W[1]-hard (Cor. 27) | ? | ? | NP-hard for const. $k$

Other Objectives.

Besides minimum makespan, two important scheduling objectives are minimization of the sum of weighted completion times, denoted $\sum w_j C_j$, and minimization of the $\ell_2$-norm of the load vector. We show that our algorithms and hardness results (almost always) translate to these objectives as well. Let us now introduce them formally.

The load $L_i$ of a machine $M_i$ is the total size of jobs assigned to it. In $R|HM|\ell_2$, the task is to find a schedule minimizing $\|(L_1, \dots, L_m)\|_2 = \sqrt{\sum_{i=1}^{m} L_i^2}$. Note that this is isotonic (order preserving) to the function $\sum_{i=1}^{m} L_i^2$, and because this leads to simpler proofs, we instead study the problem $R|HM|\ell_2^2$. The completion time of a job, denoted $C_j$, is the time it finishes its execution in a schedule. In the $R|HM|\sum w_j C_j$ problem, each job is additionally given a weight $w_j$, and the task is to minimize $\sum w_j C_j$.

We show that the hard instance for $R|HM|C_{\max}$ is also hard for $\ell_2$, and with the right choice of weights is also hard for $\sum w_j C_j$. We also obtain hardness of $Q|HM|\ell_2$ by a different and more involved choice of speeds, but the case of $Q|HM|\sum w_j C_j$ remains open so far. To extend the $C_{\max}$ reduction to other objectives, we use the "tightness" of our hardness instance to show that any "non-tight" schedule must increase the $\ell_2$ norm of the load vector by at least some amount. This is not enough for $R|HM|\sum w_j C_j$ because the value $\sum w_j C_j$ is proportional to the load vector plus other terms, and we need to bound those remaining terms (Lemma 24) in order to transfer the argument from $\ell_2$ to $\sum w_j C_j$. We point out that these hardness results are delicate and non-trivial even if at first sight they may appear as "just" modifying the hard instance of $Q|HM|C_{\max}$.

We give an overview of our results in Table 1.

We consider zero a natural number, i.e., $0 \in \mathbb{N}$. We write vectors in boldface (e.g., $\mathbf{x}, \mathbf{y}$) and their entries in normal font (e.g., the $i$-th entry of a vector $\mathbf{x}$ is $x_i$).
If it is clear from context that $\mathbf{x}^\top \mathbf{y}$ is a dot-product of $\mathbf{x}$ and $\mathbf{y}$, we just write $\mathbf{x}\mathbf{y}$ [4]. We use $\log := \log_2$, i.e., all our logarithms are base 2. For $n, m \in \mathbb{N}$, we write $[n, m] = \{n, n+1, \dots, m\}$ and $[n] = [1, n]$.

Makespan Minimization on Unrelated Machines ($R|HM|C_{\max}$)

Input: $n$ jobs of $k$ types, job multiplicities $n_1, \dots, n_k$, i.e., $n_1 + \dots + n_k = n$ and $n_j$ is the number of jobs of type $j$, $m$ unrelated machines, for each $i \in [m]$ a job sizes vector $\mathbf{p}^i = (p^i_1, \dots, p^i_k) \in (\mathbb{N} \cup \{+\infty\})^{k}$ where $p^i_j$ is the processing time of a job of type $j$ on machine $M_i$, a number $T$.

Find: An assignment of jobs to machines and non-overlapping (with respect to each machine) time slots such that every machine finishes by time $T$.

Notice that our definition uses a high-multiplicity encoding of the input, that is, jobs are not given explicitly, one by one, but "in bulk" by a vector of multiplicities. Because this allows compactly encoding instances which would otherwise be of exponential size, the two problems actually have different complexities and deserve a notational distinction: we denote by $R||C_{\max}$ the problem where jobs are given explicitly, and by $R|HM|C_{\max}$ the problem defined above; see also the discussion in [20].

Recall that in $R|HM|\ell_2$, the task is to minimize $\|(L_1, \dots, L_m)\|_2$, where $L_i$ is the sum of sizes of jobs assigned to machine $M_i$ for $i \in [m]$. In $R|HM|\sum w_j C_j$, each job $j$ has a weight $w_j$, and a schedule determines a job's completion time $C_j$. The task is then to minimize $\sum w_j C_j$.

The job sizes matrix $p \in \mathbb{R}_+^{k \times m}$ has rank $r$ if it can be written as a product of matrices $C \in \mathbb{R}^{k \times r}$ and $D \in \mathbb{R}^{r \times m}$. For example, in $Q||C_{\max}$, each machine has a speed $s_i \in \mathbb{R}_+$, and $\mathbf{p}^i = \mathbf{p}/s_i$ for some $\mathbf{p} \in \mathbb{N}^k$, so $p = \mathbf{p}(1/\mathbf{s})^\top$, where $\mathbf{s} = (s_1, \dots, s_m)$, hence $p$ has rank 1. In the identical machines model, $\mathbf{p}^i = \mathbf{p}$ for all $i \in [m]$, and we denote it $P||C_{\max}$. The decision variant of $P||C_{\max}$ is equivalent to Bin Packing:

Bin Packing

Input: $n$ items of sizes $a_1, \dots, a_n$, $k$ bins, each with capacity $B$.

Find: An assignment of items to bins such that the total size of items in each bin is $\le B$.

Unary Bin Packing is Bin Packing where all $a_1, \dots, a_n$ are encoded in unary, or, equivalently, $a_{\max} = \max_i a_i$ is bounded polynomially in $n$. Balanced Bin Packing is Bin Packing with the additional requirement on the solution that the number of items assigned to each bin is the same, hence $n/k$; note that $n$ has to be divisible by $k$ for any instance to be feasible. An instance of Bin Packing is tight if the total size of items $\sum_i a_i$ is equal to $k \cdot B$, which means that if an instance has a packing, then each bin is used fully.

We wish to highlight the geometric structure of $R|HM|C_{\max}$ by formulating it as an ILP and making several observations about it. We have a variable $x^i_j$ for each job type $j \in [k]$ and machine $M_i$ (with $i \in [m]$) specifying how many jobs of type $j$ are scheduled to run on machine $M_i$. There are two types of constraints, besides the obvious bounds $\mathbf{0} \le \mathbf{x}^i \le \mathbf{n}$ for each $i \in [m]$. The first enforces that each job is scheduled somewhere, and the second ensures that the sum of job sizes on each machine is at most $T$, meaning each machine finishes by time $T$:

\[ \sum_{i=1}^{m} x^i_j = n_j \quad \forall j \in [k], \tag{1} \]
\[ \sum_{j=1}^{k} x^i_j p^i_j \le T \quad \forall i \in [m]. \tag{2} \]

Knop and Koutecký [19] show that this ILP has N-fold format, i.e., it has the general form

\[ \min f(\mathbf{x}) : E^{(N)} \mathbf{x} = \mathbf{b},\ \mathbf{l} \le \mathbf{x} \le \mathbf{u},\ \mathbf{x} \in \mathbb{Z}^{Nt}, \quad \text{with} \quad E^{(N)} = \begin{pmatrix} E^1_1 & E^2_1 & \cdots & E^N_1 \\ E^1_2 & 0 & \cdots & 0 \\ 0 & E^2_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & E^N_2 \end{pmatrix}. \]

Here, $r, s, t, N \in \mathbb{N}$, $E^{(N)}$ is an $(r + Ns) \times Nt$-matrix, $E^i_1 \in \mathbb{Z}^{r \times t}$ and $E^i_2 \in \mathbb{Z}^{s \times t}$, for all $i \in [N]$, are integer matrices, and $f$ is some separable convex function. Specifically for $R||C_{\max}$, $f \equiv 0$, the matrices corresponding to equations (1)–(2) are $E^i_1 = I$ and $E^i_2 = \mathbf{p}^i$, for each $i \in [m]$, $\mathbf{b} = (\mathbf{n}, T, \dots, T)$ is an $r + Ns = (k + m)$-dimensional vector, and $\mathbf{l} = \mathbf{0}$ and $\mathbf{u} = (\mathbf{n}, \mathbf{n}, \dots, \mathbf{n})$ are $Nt = (mk)$-dimensional vectors. We note that N-fold IP formulations are also known for $R|HM|\{\ell_2, \sum w_j C_j\}$ [19, 20].

A simple dynamic programming algorithm gives:

Theorem 3. $\{R, Q\}|HM|\{C_{\max}, \ell_2, \sum w_j C_j\}$ can be solved in time $m \cdot n^{O(k)}$, hence $\{R, Q\}||\{C_{\max}, \ell_2, \sum w_j C_j\}$ are in XP parameterized by $k$.

Proof. We will describe a simple dynamic programming (DP) algorithm. Call a vector $\mathbf{x}^i \in \mathbb{N}^k$ satisfying the constraint (2), i.e., $\mathbf{p}^i \mathbf{x}^i \le T$, a configuration of machine $M_i$. We will construct a DP table $D$ indexed by $i \in [0, m]$ and by $k$-dimensional integer vectors upper bounded by $\mathbf{n}$; each value of the table is a 0/1 bit. The intended meaning is that, for $i \in [m]$ and $\mathbf{n}' \le \mathbf{n}$, $D[i, \mathbf{n}'] = 1$ iff the subinstance consisting of jobs $\mathbf{n}'$ and the first $i$ machines is feasible. Initialize $D$ to be all-zero, and set $D[0, \mathbf{0}] = 1$. Then, consecutively for $i = 1, \dots, m$, and for each $\mathbf{0} \le \mathbf{n}' \le \mathbf{n}$, set $D[i, \mathbf{n}'] = 1$ if $D[i-1, \mathbf{n}' - \mathbf{x}^i] = 1$ and $\mathbf{x}^i$ is a configuration of machine $M_i$. In other words, for each $i = 1, \dots, m$, construct the set $\mathcal{C}_i$ of configurations of machine $M_i$, and then, for each $\mathbf{n}'$ with $D[i-1, \mathbf{n}'] = 1$, set $D[i, \mathbf{n}' + \mathbf{x}^i] = 1$ for each $\mathbf{x}^i \in \mathcal{C}_i$ if $\mathbf{n}' + \mathbf{x}^i \le \mathbf{n}$. Finally, the instance is feasible if $D[m, \mathbf{n}] = 1$. In each iteration, we go over all $\mathbf{n}' \le \mathbf{n}$, of which there are at most $n^k$ many, and for each of them, we try to add each element of $\mathcal{C}_i$, of which there are also at most $n^k$ many. In total, the algorithm makes $m \cdot n^k \cdot n^k = m \cdot n^{2k}$ steps.

The adaptation of this DP to $\ell_2$ and $\sum w_j C_j$ is straightforward. Say that a configuration is any vector $\mathbf{x}^i \le \mathbf{n}$. The value of a configuration $\mathbf{x}^i$ on machine $M_i$ is $f_i(\mathbf{x}^i) = (\mathbf{p}^i \mathbf{x}^i)^2$ for $\ell_2$. For $\sum w_j C_j$, it has been shown [19] that the contribution of a machine $M_i$ scheduling jobs $\mathbf{x}^i$ is a quadratic convex function $f_i$ in terms of $\mathbf{x}^i$. Then, $D[i, \mathbf{n}'] = \min_{\mathbf{x}^i \le \mathbf{n}'} f_i(\mathbf{x}^i) + D[i-1, \mathbf{n}' - \mathbf{x}^i]$. □

Theorem 3 (with a worse complexity bound) can also be shown in a somewhat roundabout way by manipulating the ILP formulation (1)–(2). This approach will eventually give us the result that $P||C_{\max}$ is FPT parameterized by $k$. We need the following result:

Proposition 7 (Frank and Tardos [8]). Given a rational vector $\mathbf{w} \in \mathbb{Q}^d$ and an integer $M$, there is a strongly polynomial algorithm which finds a $\bar{\mathbf{w}} \in \mathbb{Z}^d$ such that for every integer point $\mathbf{x} \in [-M, M]^d$, we have $\mathbf{w}\mathbf{x} \ge 0 \Leftrightarrow \bar{\mathbf{w}}\mathbf{x} \ge 0$ and $\|\bar{\mathbf{w}}\|_\infty \le 2^{O(d^3)} M^{O(d^2)}$.

Lemma 8. It is possible to compute in strongly-polynomial time for each $i \in [m]$ a vector $\bar{\mathbf{p}}^i \in \mathbb{N}^k$ and an integer $\bar{T}_i \in \mathbb{N}$ such that replacing constraint (2) with $\bar{\mathbf{p}}^i \mathbf{x}^i \le \bar{T}_i$ does not change the set of feasible integer solutions, and $\|\bar{\mathbf{p}}^i, \bar{T}_i\|_\infty \le 2^{O(k^3)} n^{O(k^2)}$.

Proof. Fix some $i \in [m]$ and consider the inequality (2), which is $\mathbf{p}^i \mathbf{x}^i \le T$. Applying Proposition 7 to $(\mathbf{p}^i, T)$ and $M = n$ gives a vector $(\bar{\mathbf{p}}^i, \bar{T}_i)$ such that for all $\mathbf{0} \le \mathbf{x}^i \le \mathbf{n}$,

\[ (\mathbf{p}^i, T)(\mathbf{x}^i, -1) \le 0 \Leftrightarrow (\bar{\mathbf{p}}^i, \bar{T}_i)(\mathbf{x}^i, -1) \le 0, \]

which means that replacing $\mathbf{p}^i \mathbf{x}^i \le T$ by $\bar{\mathbf{p}}^i \mathbf{x}^i \le \bar{T}_i$ in (2) does not change the set of feasible solutions, and the bound on $\|\bar{\mathbf{p}}^i, \bar{T}_i\|_\infty$ follows immediately from Proposition 7. □

We will use the fact that N-fold IP can be solved efficiently:

Proposition 9 ([5, 7, 18]). A feasibility instance of N-fold IP can be solved in time $(\|E^{(N)}\|_\infty rs)^{O(r^2 s + s^2)} Nt \log Nt \log \|\mathbf{u} - \mathbf{l}\|_\infty$.

Alternative proof of Theorem 3 for $C_{\max}$. By Lemma 8, we can reduce $\|E^{(N)}\|_\infty$ down to $2^{O(k^3)} n^{O(k^2)}$. Since $r = k$, $t = k$, $s = 1$, $N = m$, and $\|\mathbf{u} - \mathbf{l}\|_\infty \le n$, applying Proposition 9 to such a reduced instance gives an $n^{O(k^5)} m \log m \log n$ algorithm. Dealing with $\ell_2$ and $\sum w_j C_j$ is analogous, see Lemma 11. □

While this is worse than the DP above, notice that this approach also gives:

Theorem 6. $P||C_{\max}$ is FPT parameterized by $k$.

Proof. Apply Lemma 8 to a given $P||C_{\max}$ instance, which gives a new job-sizes vector $\bar{\mathbf{p}} \in \mathbb{N}^k$ and a new time bound $\bar{T} \in \mathbb{N}$. Goemans and Rothvoss [10] have shown that $P||C_{\max}$ with $k$ job types can be solved in time $(\log p_{\max})^{2^{O(k)}} \mathrm{poly}\log n$. Plugging in $p_{\max} \le 2^{O(k^3)} n^{O(k^2)}$ gives $\log p_{\max} \le \log\bigl(2^{O(k^3)} n^{O(k^2)}\bigr) = O(k^3 + k^2 \log n)$. Hence, the algorithm runs in time $(k^3 \log n)^{2^{O(k)}} = (k^3)^{2^{O(k)}} \cdot (\log n)^{2^{O(k)}}$. To verify that this is indeed an FPT runtime (i.e., $f(k)\,\mathrm{poly}(n)$ for some computable $f$), we use a simple observation [6, Exercise 3.18] that $(\log \alpha)^{\beta} \le 2^{\beta^2/2} \alpha^{o(1)}$. Taking $\alpha = n$ and $\beta = 2^{O(k)}$ gives $(\log n)^{2^{O(k)}} \le 2^{2^{O(k)}} n^{o(1)}$ and we are done. □

Remark 10. The algorithm of [10] shows that $P|HM|C_{\max}$ is FPT in $k$ if $p_{\max}$ is given in unary. To the best of our knowledge, it has not been observed before that $P|HM|C_{\max}$ is FPT in $k$ if $n$ is polynomially bounded by the input length, i.e., that $P||C_{\max}$ is FPT in $k$. Thus, Theorem 6 shows that the remaining (and indeed hard) open problem is the complexity of $P|HM|C_{\max}$ for instances where both $\mathbf{p}$ and $\mathbf{n}$ contain large numbers.

A straightforward adaptation of the proof of Lemma 8 where we reduce each row of the constraint $E^i_2 \mathbf{x}^i = \mathbf{b}^i$ separately gives the following more general statement:

Lemma 11. Given an N-fold IP instance and $M \in \mathbb{N}$, one can in strongly-polynomial time compute $\bar{E}^i_2$ and $\bar{\mathbf{b}}^i$, for each $i \in [N]$, such that if $\|\mathbf{u} - \mathbf{l}\|_\infty \le M$, then

\[ \{\mathbf{x} \in \mathbb{Z}^{Nt} \mid E^{(N)} \mathbf{x} = \mathbf{b},\ \mathbf{l} \le \mathbf{x} \le \mathbf{u}\} = \{\mathbf{x} \in \mathbb{Z}^{Nt} \mid \bar{E}^{(N)} \mathbf{x} = \bar{\mathbf{b}},\ \mathbf{l} \le \mathbf{x} \le \mathbf{u}\}, \]

where $\bar{E}^{(N)}$ is obtained from $E^{(N)}$ by replacing $E^i_2$ with $\bar{E}^i_2$ and $\bar{\mathbf{b}}$ is obtained from $\mathbf{b}$ by replacing $\mathbf{b}^i$ with $\bar{\mathbf{b}}^i$, for each $i \in [N]$, and $\|\bar{E}^i_2, \bar{\mathbf{b}}^i\|_\infty \le 2^{O(t^3)} M^{O(t^2)}$.

How to deal with instances whose jobs have polynomially bounded sizes, but come in large multiplicities? Actually, the fact that $R|HM|C_{\max}$ belongs to XP parameterized by $k$ if $p_{\max}$ is polynomially bounded follows by solving the N-fold IP (1)–(2) using Proposition 9:

Theorem 4. $\{R, Q\}|HM|\{C_{\max}, \ell_2, \sum w_j C_j\}$ can be solved in time $p_{\max}^{O(k^2)} m \log m \log n$.

To obtain a result like this one can first solve the LP relaxation of (1)–(2), and then use a "proximity theorem" to show that some integral optimum is at distance at most $p_{\max}^{O(k^2)} \cdot m$ [7, Theorem 59] from any optimum of the LP relaxation. This yields an $\{R, Q\}|HM|\{C_{\max}, \ell_2, \sum w_j C_j\}$ instance where roughly $p_{\max}^{k^2} \cdot m$ jobs are left to be scheduled and which can be solved using Theorem 3.

To adapt the model (1)–(2) for uniformly related machines, one has a single vector $\mathbf{p} \in \mathbb{N}^{k}$ of "unscaled" processing times, and the right hand side of constraint (2) becomes $\lfloor T \cdot s_i \rfloor$ for a machine of speed $s_i$. For $\ell_2$, the objective $f$ of the N-fold formulation becomes $f(\mathbf{x}) = \sum_{i=1}^{m} (\mathbf{p}^i \mathbf{x}^i)^2$, which is almost separable convex (one needs to add an auxiliary variable $z_i$ and a constraint $z_i = \mathbf{p}^i \mathbf{x}^i$ to express it as separable).
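The dynamic program of Theorem 3, to which the residual instance above is handed, can be sketched for the $C_{\max}$ feasibility question as follows. This is a minimal sketch under assumptions, not the authors' implementation: the function name `feasible` is hypothetical, all job sizes are assumed to be finite non-negative integers, and the table $D[i, \cdot]$ is stored as a set of reachable multiplicity vectors.

```python
from itertools import product


def feasible(p, n, T):
    """Decide whether all jobs fit on the machines within makespan T.

    p -- p[i][j] is the size of a job of type j on machine i (finite integers)
    n -- n[j] is the number of jobs of type j
    T -- target makespan
    """
    k = len(n)
    # D[0, 0] = 1: with no machines, only the empty job vector is schedulable.
    reachable = {tuple([0] * k)}
    for sizes in p:  # process machines one after another
        # Configurations of this machine: vectors x <= n with load at most T.
        configs = [
            x
            for x in product(*(range(nj + 1) for nj in n))
            if sum(xj * s for xj, s in zip(x, sizes)) <= T
        ]
        nxt = set()
        for done in reachable:
            for x in configs:
                new = tuple(d + xj for d, xj in zip(done, x))
                if all(a <= b for a, b in zip(new, n)):
                    nxt.add(new)
        reachable = nxt
    # D[m, n] = 1 iff the full multiplicity vector is reachable.
    return tuple(n) in reachable
```

Both the number of reachable vectors and the number of configurations are at most $\prod_j (n_j + 1)$, which matches the $m \cdot n^{O(k)}$ shape of the bound; for example, `feasible([[2, 3], [3, 2]], [2, 2], 4)` places both jobs of the first type on the first machine and both jobs of the second type on the second machine.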
For $\sum w_j C_j$, the modification of the N-fold formulation is analogous but slightly more complicated; the approach is identical to the one described by Knop and Koutecký [19].

It is an open problem whether the $p_{\max}^{O(k^2)}$ parameter dependence can be improved: even in the setting with short jobs where $p_{\max} \le k$, the best algorithm for $Q|HM|C_{\max}$ has a dependence of $k^{k^2}$ [19, 21].

Lemma 12. Bin Packing reduces to Balanced Bin Packing such that (a) $a'_{\max} = a_{\max} + 1$, (b) $B' = B + n$, (c) $k' = k$, (d) $n' = nk$, and (e) tightness is preserved, where $n'$, $k'$, $B'$, $a'_{\max}$ are the parameters of the new Balanced Bin Packing instance.

Proof. Given an instance of Bin Packing, we obtain an instance of Balanced Bin Packing by increasing the size of each item by 1, setting the new bin capacity to be $B' = B + n$, and adding $n(k-1)$ new items of size 1. Observe that all items of size 1 are "new" items. It is also clear that $a'_{\max} = a_{\max} + 1$.

To show that we preserve feasibility of instances, take any solution of the Bin Packing instance and add new items of size zero such that each bin contains precisely $n$ items. Now if we increase the size of each item by 1 (including the new items of size zero) and the capacity of each bin by $n$, we obtain a feasible solution of the newly constructed Balanced Bin Packing instance.

For the other direction, assume for the sake of contradiction that the Balanced Bin Packing instance has a solution, but the original Bin Packing instance does not. Consider a solution of Balanced Bin Packing, subtract 1 from the size of each item and $n$ from the capacity of each bin (note that there are $n$ items per bin), and remove items of size zero. This is a solution to the instance of Bin Packing, a contradiction.

Regarding tightness, note that the sum of item sizes has increased by exactly $nk$ because we have increased the size by 1 for $n$ "old" items, and added $n(k-1)$ "new" items of size 1. Hence, if the total size of items of the original instance was $kB$, it became $kB + nk = k(B + n)$, and since $B' = B + n$ is the new bin capacity, the Balanced Bin Packing instance is tight iff the Bin Packing instance was. □
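The construction in the proof of Lemma 12 is short enough to write out. The following is a small sketch with a hypothetical function name `to_balanced`; it assumes item sizes are positive integers and only builds the new instance, without solving it.

```python
def to_balanced(items, k, B):
    """Map a Bin Packing instance to a Balanced Bin Packing instance.

    items -- list of positive integer item sizes a_1, ..., a_n
    k     -- number of bins
    B     -- bin capacity
    Returns (new_items, k, new_capacity).
    """
    n = len(items)
    # Every old item grows by 1; n(k - 1) new items of size 1 are added.
    new_items = [a + 1 for a in items] + [1] * (n * (k - 1))
    # Each bin holds exactly n items in a balanced packing, so capacity grows by n.
    return new_items, k, B + n


# A tight instance (total size 8 = 2 * 4) stays tight (total size 16 = 2 * 8).
new_items, new_k, new_B = to_balanced([3, 1, 2, 2], 2, 4)
```

In this example the new instance has $n' = nk = 8$ items, capacity $B' = B + n = 8$, and largest item $a'_{\max} = a_{\max} + 1 = 4$, in line with items (a) to (e) of the lemma.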

Corollary 13. Balanced Bin Packing is NP-hard, even for tight instances.

Corollary 14. Unary Balanced Bin Packing is W[1]-hard parameterized by the number of bins, even for tight instances.

$Q||C_{\max}$ and $R||C_{\max}$

Let us describe our hard instance $I$. Given a tight instance of Balanced Bin Packing with $k$ bins of capacity $B$ and $m$ items, all items sum up to $\sum_{i \in [m]} a_i = k \cdot B =: A$. We construct a $Q|HM|C_{\max}$ instance with $m$ machines and $3k$ job types.

The high level idea is as follows. We use machine $M_i$ to encode the assignment of item $a_i$ to a bin, so we have $m$ machines. We have job types $\alpha^1_j$, $\alpha^2_j$ (we will refer to both of them as $\alpha^\times_j$), and $\beta_j$ for $j \in [k]$; we refer to a job of type $\alpha^\times_j$ for any $j$ as a job of type $\alpha$ or an $\alpha$-type job, and similarly for $\beta$. For the sake of simplicity, we sometimes do not distinguish between a job and a job type, e.g., by executing $\alpha^\times_j$ we mean executing a job of type $\alpha^\times_j$. Our goal is to ensure that a specific schedule, which we henceforth call perfect, is optimal. In a perfect schedule, $M_i$ gets precisely $a_i$ jobs of type $\alpha^1_j$, $A - a_i$ jobs of type $\alpha^2_j$, and one job of type $\beta_j$ for some $j \in [k]$. There is no other job on $M_i$. This corresponds to putting $a_i$ into the $j$-th bin. Hence, for each $j \in [k]$, there are $m/k$ machines where only jobs of types $\alpha^1_j$, $\alpha^2_j$, and $\beta_j$ appear together, and they represent a packing of the corresponding items into the $j$-th bin.

Let us specify the parameters of $I$. The target makespan is $T = 3kA^3$; note that we will show that the feasible schedules are precisely the perfect schedules and they have the property that each machine finishes exactly at time $T$. Jobs of type $\beta$ are by far the largest on all machines. We set, for $j \in [k]$,

\[ p_{\alpha^1_j} = kA^2 + A(k-j) + 1, \qquad p_{\alpha^2_j} = kA^2 + A(k-j), \qquad p_{\beta_j} = 2kA^3 - A^2(k-j); \]

note that as $j$ increases, so does $p_{\beta_j}$.
Complementary to $p_{\beta_j}$, as $j$ increases, $p_{\alpha^\times_j}$ decreases. To show hardness of $Q||C_{\max}$, we give each machine $M_i$ a specific speed depending on $a_i$. The unscaled load of a machine $M_i$, denoted $\bar{L}_i$, is the sum of sizes of jobs assigned to $M_i$ before speed scaling. In a perfect schedule, it is

\[ \bar{L}^*_i = a_i\bigl(kA^2 + A(k-j) + 1\bigr) + (A - a_i)\bigl(kA^2 + A(k-j)\bigr) + 2kA^3 - A^2(k-j) = A\bigl(kA^2 + A(k-j)\bigr) + a_i + 2kA^3 - A^2(k-j) = 3kA^3 + a_i = T + a_i. \tag{3} \]

The machine speed $s_i$ of machine $M_i$ is

\[ s_i = \frac{T + a_i}{T} = \frac{3kA^3 + a_i}{3kA^3}. \]

Observe that in a perfect schedule each machine $M_i$ finishes exactly by time

\[ \frac{\bar{L}^*_i}{s_i} = \frac{T + a_i}{\frac{T + a_i}{T}} = T = 3kA^3. \tag{4} \]

The sizes of jobs of type $\alpha^1_j$ and $\alpha^2_j$ are almost identical, except jobs of type $\alpha^1_j$ are slightly longer. For each $j \in [k]$, we have job multiplicities

\[ n_{\alpha^1_j} = \frac{A}{k} = B, \qquad n_{\alpha^2_j} = \frac{Am}{k} - B = \frac{(m-1)A}{k}, \qquad n_{\beta_j} = \frac{m}{k}. \]

Here $m/k$ is an integer because any Balanced Bin Packing instance must have a number of items divisible by $k$ in order to be feasible.

Lemma 15. Balanced Bin Packing with tight instances reduces to $Q|HM|C_{\max}$ such that (a) the number of machines equals the number of items, (b) the number of job types equals $3k$, where $k$ is the number of bins, (c) the job sizes and job multiplicities are bounded by $O(A^4)$, where $A$ is the sum of all items of the input instance, (d) the machine speeds are rational numbers with numerator and denominator in $O(A^4)$, and (e) the feasible schedules are precisely perfect schedules, in which all machines finish exactly at time $T = 3kA^3$.

Proof.

Clearly, all involved numbers are in $O(A^4)$ (w.l.o.g. we assume $k, m \in O(A)$). The other parameters are clear from the description of the hard instance $I$ above. It remains to prove the correctness of our reduction.

On the one hand, if there is a solution $S$ of the corresponding instance of Balanced Bin Packing, we construct a (feasible) perfect schedule for $I$ as follows. If, in $S$, $a_i$ is assigned to the $j$-th bin, to machine $M_i$ we assign $a_i$ jobs of type $\alpha_j^1$, $A - a_i$ jobs of type $\alpha_j^0$, and one job of type $\beta_j$. According to equations (3) and (4), this assignment has makespan $T$ and, clearly, all jobs are assigned to some machine.

On the other hand, assume that $I$ is feasible, meaning there is an assignment of jobs to machines not exceeding the target makespan $T$. Let us analyze the structure of such a schedule $\sigma$. First we observe that instead of considering for a machine $M_i$ the makespan $T$, which is the sum of job lengths divided by its speed $s_i$, we can equivalently consider $T \cdot s_i = T + a_i$ as its capacity: this is the sum of (unscaled) job lengths it can process. Per machine, there is exactly one job of type $\beta_j$ for some $j \in [k]$, since we can execute at most one $\beta$-type job on each machine and we have to place $m$ such jobs onto $m$ machines. So each machine is in one set $\mathcal{M}_j$, where $\mathcal{M}_j$ is the set of $m/k$ machines that process a job of type $\beta_j$. Having scheduled a job of type $\beta_j$ to a machine, we can execute on this machine at most $A$ jobs of type $\alpha_{j'}^\times$ for any $j'$. In particular, observe that even on a machine that executes $\beta_1$, which is the smallest of the $\beta$-type jobs, we cannot add $A + 1$ jobs of type $\alpha_k^0$, which is the smallest of the $\alpha_j^\times$ job types, without exceeding $T + \max_i a_i$.

For each $j \in [k]$, there are $Am/k$ jobs of type $\alpha_j^\times$, so $Am$ $\alpha$-type jobs in total. Thus, there are exactly $A$ $\alpha$-type jobs on each machine from $\mathcal{M}_j$. Observe that on a machine from $\mathcal{M}_j$, we cannot use a job $\alpha_{j'}^\times$, where $j' < j$, as this would exceed $T + a_i$. Therefore, we have to execute $A$ jobs of type $\alpha_k^\times$ on each machine from $\mathcal{M}_k$. Thus, all jobs of type $\alpha_k^\times$ have to be executed by machines in $\mathcal{M}_k$. Consequently, we have to execute $A$ jobs of type $\alpha_{k-1}^\times$ on each machine of $\mathcal{M}_{k-1}$ since there are no more jobs of type $\alpha_k^\times$ available. This argument inductively propagates for all $j = k, k-1, k-2, \dots, 1$. Hence, on each machine the remaining space is at most $a_{\max} < A < p_t$ for any job type $t$, so no other job can be scheduled. (This maximum can only be reached if there are $A$ jobs of type $\alpha_j^0$ and no jobs of type $\alpha_j^1$ on a machine.)

Consider the sizes of the jobs that have to be executed on a machine. There can be at most $a_i$ jobs of type $\alpha_j^1$ on each machine $M_i$. Hence we have, for each $j \in [k]$,
$$\frac{A}{k} \le \sum_{M_i \in \mathcal{M}_j} a_i \tag{5}$$
because all $A/k$ jobs of type $\alpha_j^1$ are assigned to machines of $\mathcal{M}_j$. Moreover, we have
$$\sum_{j \in [k]} \sum_{M_i \in \mathcal{M}_j} a_i = A.$$
So if there was a $j \in [k]$ with $A/k < \sum_{M_i \in \mathcal{M}_j} a_i$, then there would be a $j' \in [k]$ with $A/k > \sum_{M_i \in \mathcal{M}_{j'}} a_i$. Since this would contradict Equation (5), we have
$$\sum_{M_i \in \mathcal{M}_j} a_i = \frac{A}{k} = B$$
and $a_i$ jobs of type $\alpha_j^1$ on each $M_i \in \mathcal{M}_j$ for each $j \in [k]$. Hence, $\sigma$ is perfect and the sets $\{a_i \mid M_i \in \mathcal{M}_j\}$ for each $j \in [k]$ are a solution for the corresponding instance of Balanced Bin Packing. ◀

We can easily adjust our hardness instance $I$ of $Q|HM|C_{\max}$ to an instance $I_R$ of $R|HM|C_{\max}$. Instead of machine speeds depending, for machine $M_i$, on $a_i$, we will use a larger makespan $T_R$ to host a new "blocker" job type $\gamma$, whose length is machine-dependent and leaves space $T + a_i$ on each machine, which was previously the capacity of a machine with speed $s_i$.

▶ Lemma 16.

Balanced Bin Packing with tight instances reduces to $R|HM|C_{\max}$ such that (a) the number of machines equals the number of items, (b) the number of job types equals $3k + 1$, where $k$ is the number of bins, (c) the job sizes and job multiplicities are bounded by $O(A^4)$, where $A$ is the sum of all items of the Balanced Bin Packing instance, (d) in any feasible schedule, all machines finish precisely by time $T_R = 7kA^3$, and (e) the job sizes matrix $p$ has rank 2.

Proof. In the new hardness instance $I_R$ for $R|HM|C_{\max}$, we use the same job types with the same lengths and multiplicities as in $I$, which is our hardness instance for $Q|HM|C_{\max}$. We introduce a new job type $\gamma$ with
$$p_\gamma^i = 4kA^3 - a_i, \qquad n_\gamma = m.$$
Observe that $\gamma$ is the only job type that is machine-dependent. However, its variation between machines is only $-a_i$, which is small compared to its total length. A perfect schedule for $I_R$ is a perfect schedule for $I$ with one additional job of type $\gamma$ assigned to each machine. Again, the parameters are clear from the definition of $I_R$ and we prove the correctness next.

On the one hand, if there is a solution $S$ of the corresponding instance of Balanced Bin Packing, we construct a perfect schedule for $I_R$ as follows. If, in $S$, $a_i$ is assigned to the $j$-th bin, we assign to machine $M_i$ $a_i$ jobs of type $\alpha_j^1$, $A - a_i$ jobs of type $\alpha_j^0$, one job of type $\beta_j$, and one job of type $\gamma$. This assignment has makespan $T_R$ and, clearly, all jobs are assigned to some machine.

On the other hand, assume that the instance $I_R$ is feasible, meaning there is a schedule $\sigma$ not exceeding the target makespan $T_R$. Again, let us analyze the structure of such a solution. Per machine, there is exactly one job of type $\gamma$ since we can execute at most one such job on each machine. The space remaining on machine $M_i$ after executing a job of type $\gamma$ is
$$T_R - p_\gamma^i = 7kA^3 - (4kA^3 - a_i) = 3kA^3 + a_i.$$
This is precisely the capacity of machine $M_i$ in $I$ as described in the proof of Lemma 15. After scheduling all jobs of type $\gamma$, the same job types with the same lengths and multiplicities remain. Thus, the rest of the analysis is the same.

It remains to show that the rank of the job sizes matrix $p$ is 2. Define a matrix $C$ whose rows are indexed by the job types as follows. The row for job type $t \in \{\alpha_j^1, \alpha_j^0, \beta_j\}$ (for every $j \in [k]$) is $(p_t, 0)$, and the row for $\gamma$ is $(4kA^3, -1)$. Define a matrix $D$ whose columns are indexed by the machines as follows: column $i \in [m]$ is $(1, a_i)$. It is easy to verify that $C \cdot D = p$. ◀

Applying the reductions of Lemmas 15 and 16 to Balanced Bin Packing with 2 bins, we have that $Q|HM|C_{\max}$ and $R|HM|C_{\max}$ are NP-hard with 6 and 7 job types, respectively. For $R|HM|C_{\max}$, the number of job types can be reduced to 4, and similar ideas can be used to reduce the number of job types required by the previously described reduction for general $k$.

▶ Theorem 1. $Q|HM|C_{\max}$ is NP-hard already with 6 job types.

▶ Theorem 17. $R|HM|C_{\max}$ is NP-hard already with 4 job types and with $p$ of rank 2.

Proof of Theorem 17.

We will modify the reduction described in Lemma 16 to use only 4types of jobs if the number of bins k = 2. First, we remove the job type γ to get to 6 diﬀerenttypes of jobs. Recall that p iγ = 4 kA − a i . For the 4 kA , we will account for when adjustingthe makespan and we add the − a i to the β -type jobs (now p iβ j = 2 kA − A ( k − j ) − a i ).Second, we blow up the makespan by a factor of A/ (7 k ). So we have T = A .To reduce to 5 diﬀerent types of jobs, we remove all jobs of type β . Still, we want A times a job of type α × on every machine of M . So its size will be around A . To distinguishbetween α and α and get a dependency of machine M i on item a i , we add A − a i andsubtract a i , respectively. So, for i ∈ [ m ], we have p iα = A + A − a i and p iα = A − a i . (6)As in the previous reduction, we can ﬁt a i times p iα and A − a i times p iα to a machine,which then needs precisely the makespan T .To reduce to 4 diﬀerent types of jobs, we remove all jobs of type α and we change thelength of α to p iα = A (7)for all i ∈ [ m ]. We lengthen the job of type β to p iβ = A − a i A , (8)which is the makespan T minus a i times p iα . Note that the rank of p is still just 2: the rowsof C are ( A + A, −

1) for α , ( A , −

1) for α , ( A ,

0) for α , and ( A , − A ) for β , and D isdeﬁned as before.It remains to show the correctness of this reduction. Clearly, if there is a solution to theinstance of Balanced Bin Packing with 2 bins (i.e. a partition), we can assign the jobs tothe machines as in the perfect schedule from Lemma 15 ignoring β and α .Assume there is a solution of the obtained instance of R | HM | C max . On half of themachines, there is a job of type β . On these machines, namely M , there is no space for ajob of type α × . So, all Am/ α × are scheduled to the m/ M . Asthere cannot be more than A jobs of type α × on a machine, there are precisely A jobs oftype α × on each machine of M —at most a i of which can be α . Thus, the free space on sucha machine is at most a max < A , so there is no job of type α on these machines. To scheduleall A/ α j for j ∈ [2], we have to choose M j such that the corresponding itemsizes in the Balanced Bin Packing instance sum up to at least A/

2. As the total sum ofitems is A , both partitions correspond to items summing up to precisely A/

2. This yields aequal partition of the items. (cid:74)
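The constructions behind Lemmas 15 and 16 can be checked numerically on a small example. The following Python sketch builds both instances from a tight Balanced Bin Packing instance, evaluates the loads of a perfect schedule, and verifies the rank-2 factorization $C \cdot D = p$. It assumes $T = 3kA^3$, $T_R = 7kA^3$, $p_\gamma^i = 4kA^3 - a_i$ and the job sizes of $I$ as in the definition of the hard instance; all function names and labels are illustrative and do not come from the paper.

```python
from fractions import Fraction


def build_instances(items, k):
    """Parameters of I (related machines) and I_R (unrelated machines)."""
    A = sum(items)
    T, T_R = 3 * k * A**3, 7 * k * A**3
    sizes = {}  # machine-independent job types, keyed by hypothetical labels
    for j in range(1, k + 1):
        sizes[("alpha1", j)] = k * A**2 + A * (k - j) + 1
        sizes[("alpha0", j)] = k * A**2 + A * (k - j)
        sizes[("beta", j)] = 2 * k * A**3 - A**2 * (k - j)
    speeds = [Fraction(T + a, T) for a in items]  # s_i in I
    gamma = [4 * k * A**3 - a for a in items]     # p^i_gamma in I_R
    return A, T, T_R, sizes, speeds, gamma


def perfect_loads(items, bins, k):
    """Unscaled loads in I when item i is packed into bin bins[i] (1-based)."""
    A, _, _, sizes, _, _ = build_instances(items, k)
    return [
        a * sizes[("alpha1", j)] + (A - a) * sizes[("alpha0", j)] + sizes[("beta", j)]
        for a, j in zip(items, bins)
    ]


def rank_two_factors(items, k):
    """Return C, D and the job sizes matrix p of I_R (rows: types, columns: machines)."""
    A, _, _, sizes, _, gamma = build_instances(items, k)
    C = [[size, 0] for size in sizes.values()] + [[4 * k * A**3, -1]]
    D = [[1] * len(items), list(items)]
    p = [[size] * len(items) for size in sizes.values()] + [gamma]
    return C, D, p


def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]
```

For the items $(1, 2, 3, 4)$ packed into $k = 2$ bins as $\{1, 4\}$ and $\{2, 3\}$, the unscaled loads are $T + a_i$, every machine of $I$ finishes at time $T$, and adding the $\gamma$ job brings every machine of $I_R$ to exactly $T_R$.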

The complexity of $Q|HM|C_{\max}$ ($R|HM|C_{\max}$) with fewer than 6 (4) job types remains open. From Lemmas 15 and 16 and the hardness of Corollary 14, we also get our main result:

▶ Theorem 5. $X||C_{\max}$ is $W[1]$-hard parameterized by the number of job types with
(a) $X = Q$ and $n$, $p$, and $s$ given in unary.
(b) $X = R$ and $n$ and $p$ given in unary and $\mathrm{rank}(p) = 2$.

NP-hardness of Cutting Stock

Cutting Stock
Input: $k$ item types of sizes $p = (p_1, \dots, p_k) \in \mathbb{N}^k$ and multiplicities $n = (n_1, \dots, n_k) \in \mathbb{N}^k$, $m$ bin types with sizes $s = (s_1, \dots, s_m) \in \mathbb{N}^m$ and costs $c = (c_1, \dots, c_m) \in \mathbb{N}^m$.
Find: A vector $x = (x_1, \dots, x_m) \in \mathbb{N}^m$ of how many bins to buy of each size, and a packing of items to those bins, such that the total cost $cx$ is minimized.

The difficulty in transferring hardness from $Q|HM|C_{\max}$ to Cutting Stock is in enforcing that each bin type is used exactly once.

▶ Lemma 18. $Q|HM|C_{\max}$ with $k$ job types and $m$ machines reduces to Cutting Stock with $k + 2$ item types and $m$ bin types.

Proof.

We will set the sizes of bin types as 3-dimensional vectors, whose interpretation as numbers is straightforward by choosing the base of each coordinate sufficiently large to prevent carry when summing. For machine $M_i$ with capacity $T + a_i$, we add a bin type of size and cost $(1, 2^{i-1}, T + a_i)$. For each original job type $t$ of size $p_t$, there is an item type of size $(0, 0, p_t)$ with the same multiplicity $n_t$. We will add two new item types: there are $m$ items of type $\eta$ which have size $(1, 0, 0)$, and $2^m - 1$ items of type $\nu$ which have size $(0, 1, 0)$. The budget is $C = (m, 2^m - 1, mT + A)$.

Clearly, a feasible schedule translates easily to a packing: buying each bin type exactly once costs exactly $C$, the original item types are packed according to the feasible schedule, and we pack one $\eta$-type item and $2^{i-1}$ $\nu$-type items into the bin corresponding to machine $M_i$.

In the other direction, first notice that we have to use at least $m$ bins to pack the $\eta$-type items, and at most $m$ bins are affordable due to the budget $C$. We want to show that we have to use each bin type exactly once. Focus on the second coordinates of the 3-dimensional vectors. Since the total size of items with respect to these coordinates is $2^m - 1$, which is precisely the affordable capacity, a solution to Cutting Stock must buy $m$ bins with total capacity $2^m - 1$. This is equivalent to decomposing the number $2^m - 1$ into $m$ numbers which are powers of 2, namely from $2^0, 2^1, \dots, 2^{m-1}$. Clearly, the unique decomposition is $2^m - 1 = 2^0 + 2^1 + \dots + 2^{m-1}$. Hence, the unique way to obtain capacity $C$ by buying $m$ bins is to buy one bin of each type, concluding the proof. ◀

Note that the $W[1]$-hardness of $Q||C_{\max}$ does not immediately imply $W[1]$-hardness of Cutting Stock when $p$, $n$, $c$ are given in unary, because the construction of Lemma 18 blows up each of $p$, $n$, $c$: it introduces large costs, items $\eta$ with large size, and items $\nu$ with large multiplicity.

Using our hardness of $Q|HM|C_{\max}$ with 6 job types together with Lemma 18 yields:

▶ Theorem 2. Cutting Stock is NP-hard already with 8 item types.

$Q||\ell_2$ and $R||\ell_2$

We will now transfer our hardness reduction to the $\ell_2$ norm. Remember that the speed $s_i$ of machine $M_i$ depended linearly on $T + a_i$ (normalized by $1/T$ for all machines). For the $\ell_2$ norm, we observe that the machine speed affects the objective value by its square. So a machine whose speed we double contributes only a fourth to the objective value. Then, one can construct an instance where it is more beneficial to schedule more than the loads of a perfect schedule to the faster machines, leaving the slower machines rather empty.

To still apply our argument that the perfect schedules, which precisely correspond to bin packings, are the only ones admitting an optimal schedule, we adjust the machine speeds. The speed should be a value in the order of $\sqrt{T + a_i}$. We use the ceiling function to have rational machine speeds. However, for our reduction it is crucial that machines $M_i$ and $M_j$ have a different speed if $a_i \neq a_j$. To make each $\lceil \sqrt{T + a_i} \rceil$ different from $\lceil \sqrt{T + a_i - 1} \rceil$, we scale up $\sqrt{T + a_i}$ by a sufficiently large factor. We will see that we can set this factor to be $(T + a_{\max})$, which results, for machine $M_i$, in a new machine speed of
$$s_i = \left\lceil (T + a_{\max}) \sqrt{T + a_i} \right\rceil. \tag{9}$$
In the following we will use $\ell_2^2$, which is the square of the $\ell_2$ norm and is isotonic to it. Recall that the unscaled load of $M_i$ is $\bar{L}_i = L_i \cdot s_i = \sum_{t=1}^{\tau} p_t^i x_t^i$, where $x^i = (x_1^i, \dots, x_\tau^i)$ is the vector of job multiplicities scheduled to machine $M_i$, and $\tau$ is the number of job types.

▶ Lemma 19.

The hardness instance $I$ with modified $s_i$ is also hard for $Q|HM|\ell_2$ with target value $\sum_{i=1}^{m} \left((T + a_i)/s_i\right)^2$.

Proof. As before, if the instance of Balanced Bin Packing has a solution where item $a_i$ is assigned to the $j$-th bin, we construct a perfect schedule, where we assign $a_i$ jobs of type $\alpha_j^1$, $A - a_i$ jobs of type $\alpha_j^0$ and one job of type $\beta_j$ to machine $M_i$ for each $i \in [m]$. As this gives us load $(T + a_i)/s_i$ on machine $M_i$, we reach precisely the target objective value $\sum_{i=1}^{m} \left((T + a_i)/s_i\right)^2$ for the $\ell_2^2$ objective.

For the other direction, assume there is a schedule $\sigma$ of jobs to machines such that the objective value is at most $\sum_{i=1}^{m} \left((T + a_i)/s_i\right)^2$. We distinguish two cases.

Case 1: The unscaled load of machine $M_i$ is $T + a_i$, for each $i \in [m]$. Observe that the objective value of $\sigma$ equals the prescribed threshold objective value $\sum_{i=1}^{m} \left((T + a_i)/s_i\right)^2$. By Lemma 15 (e), we know that such a schedule is perfect and exists if and only if there is a solution to the corresponding Balanced Bin Packing instance.

Case 2: There is an $i \in [m]$ such that $M_i$ has unscaled load different from $T + a_i$. Consider the unscaled loads $\bar{L} = (\bar{L}_1, \dots, \bar{L}_m)$ scheduled to each of the machines in $\sigma$. Since the total unscaled load is independent of the schedule, we can reach $\bar{L}$ from the "perfect" unscaled load distribution $(T + a_1, \dots, T + a_m)$ of a perfect schedule (as it appears in Case 1) by iteratively moving a portion of the load from one machine to another. Note that we do not speak of moving jobs here. For this argument, we only consider the unscaled load of each machine as an integral number and ignore the jobs. In this process
- $m$ iterations of re-distribution are sufficient; in each step we take the machine with the smallest deviation (minimizing $\Delta_i = |\bar{L}_i - (T + a_i)|$) and move $\Delta_i$ integral units of load from it or to it (depending on the direction of the deviation). Note that there exists some other machine $M_j$ to/from which to move because we chose $i$ to minimize $\Delta_i$.
- the load of each machine monotonically increases, decreases, or remains unchanged, i.e., we do not first add and then remove a portion of load or the other way around.

We show that in every step the objective value only increases, hence this case cannot occur as we already matched the threshold objective value in the "perfect" distribution of Case 1. Consider one such step. We move load $r \ge 1$ to machine $M_i$ and take it from machine $M_j$. Before, we have already moved in total $z_i \ge 0$ to $M_i$ and we have already removed in total $z_j \ge 0$ from $M_j$. If $M_i$ is slower than $M_j$, then the objective value definitely increases. Hence, we assume $s_i \ge s_j$ (this implies $a_i \ge a_j$). So it remains to show
$$\left(\frac{T + a_i + z_i}{s_i}\right)^2 + \left(\frac{T + a_j - z_j}{s_j}\right)^2 < \left(\frac{T + a_i + z_i + r}{s_i}\right)^2 + \left(\frac{T + a_j - z_j - r}{s_j}\right)^2$$
$$\Leftrightarrow \quad s_i^2 \left(2r(T + a_j - z_j) - r^2\right) < s_j^2 \left(2r(T + a_i + z_i) + r^2\right)$$
$$\Leftrightarrow \quad \frac{s_i^2}{s_j^2} < \frac{2(T + a_i + z_i) + r}{2(T + a_j - z_j) - r}. \tag{10}$$
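The equivalence leading to inequality (10) can be checked with exact arithmetic. The Python sketch below assumes the speeds $s_i = \lceil (T + a_{\max}) \sqrt{T + a_i} \rceil$ from equation (9) and the squared $\ell_2$ objective; it computes the ceiling via an integer square root to avoid floating-point error. The function names are ours and purely illustrative.

```python
from fractions import Fraction
from math import isqrt


def speed(T, a, a_max):
    """s_i = ceil((T + a_max) * sqrt(T + a)), computed exactly with integers."""
    n = (T + a_max) ** 2 * (T + a)  # (b * sqrt(x))^2 with b = T + a_max
    root = isqrt(n)
    return root if root * root == n else root + 1


def objective_increases(T, a_i, a_j, z_i, z_j, r, s_i, s_j):
    """Direct comparison: does moving r units from M_j to M_i raise the objective?"""
    before = Fraction(T + a_i + z_i, s_i) ** 2 + Fraction(T + a_j - z_j, s_j) ** 2
    after = Fraction(T + a_i + z_i + r, s_i) ** 2 + Fraction(T + a_j - z_j - r, s_j) ** 2
    return before < after


def condition_10(T, a_i, a_j, z_i, z_j, r, s_i, s_j):
    """The simplified condition (10) on the ratio of squared speeds."""
    lhs = Fraction(s_i**2, s_j**2)
    rhs = Fraction(2 * (T + a_i + z_i) + r, 2 * (T + a_j - z_j) - r)
    return lhs < rhs
```

For parameters with $r \ge 1$ and $2(T + a_j - z_j) > r$, both functions return the same truth value, which mirrors the chain of equivalences above.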
Next, we analyze the machine speed $s_i$ as defined in equation (9), in order to establish inequality (10), the last inequality derived above. Recall that we scale up $\sqrt{T + a_i}$ by a sufficiently large factor $b$ to make each $\lceil b\sqrt{T + a_i} \rceil$ different from $\lceil b\sqrt{T + a_i - 1} \rceil$. If the difference between $\sqrt{T + a_i}$ and $\sqrt{T + a_i - 1}$ is $d$, then it must hold that $b > 1/d$; since $d \ge \sqrt{T + a_{\max}} - \sqrt{T + a_{\max} - 1}$, it suffices that $b > 1/\left(\sqrt{T + a_{\max}} - \sqrt{T + a_{\max} - 1}\right)$. We have chosen $b = T + a_{\max}$, since $x > 1/(\sqrt{x} - \sqrt{x - 1})$ for $x \ge 4$. Hence, we conclude
$$\left\lceil (T + a_{\max}) \sqrt{T + a_i} \right\rceil < (T + a_{\max}) \sqrt{T + a_i + 1}. \tag{11}$$
With this inequality in hand, we finally show the correctness of inequality (10):
$$\frac{s_i^2}{s_j^2} = \frac{\left\lceil (T + a_{\max}) \sqrt{T + a_i} \right\rceil^2}{\left\lceil (T + a_{\max}) \sqrt{T + a_j} \right\rceil^2} < \frac{(T + a_{\max})^2 (T + a_i + 1)}{(T + a_{\max})^2 (T + a_j)} = \frac{2(T + a_i) + 2}{2(T + a_j)}$$
$$\overset{(r \ge 1)}{\le} \frac{2(T + a_i) + 2r}{2(T + a_j)} \overset{(a_i \ge a_j)}{<} \frac{2(T + a_i) + r}{2(T + a_j) - r} \le \frac{2(T + a_i + z_i) + r}{2(T + a_j - z_j) - r}.$$
◀

Similarly, we can transfer our hardness instance to $R|HM|\ell_2$.

▶ Lemma 20.

The hardness instance $I_R$ is hard for $R|HM|\ell_2$ with target value $m \cdot T_R^2$.

Proof. Again, if the instance of Balanced Bin Packing has a solution where item $a_i$ is assigned to the $j$-th bin, we construct a perfect schedule, where we schedule $a_i$ jobs of type $\alpha_j^1$, $A - a_i$ jobs of type $\alpha_j^0$, one job of type $\beta_j$, and one job of type $\gamma$ to machine $M_i$ for each $i \in [m]$. As this gives us processing time $T_R$ per machine, we precisely reach the target objective value of $mT_R^2$ for the $\ell_2^2$ objective.

For the other direction, assume there is a schedule of jobs to machines such that the objective value is at most $mT_R^2 = 49mk^2A^6$. We distinguish three cases.

Case 1: The load of each machine is at most $T_R = 7kA^3$. Such a schedule would thus have makespan $T_R$ and is feasible for $R|HM|C_{\max}$ with target makespan $T_R$. By Lemma 16, we know that such a schedule exists if and only if there is a solution to the corresponding Balanced Bin Packing instance. By property (d) of Lemma 16, it admits an objective value of precisely $mT_R^2$ for the $\ell_2^2$ objective.

Case 2a: There is a machine with load greater than $T_R = 7kA^3$, and on each machine there is precisely one job of type $\gamma$. Since the processing time for all $\alpha$- and $\beta$-type jobs is the same on all machines and we have exactly one job of type $\gamma$ per machine, the total load is independent of the schedule and is $m \cdot T_R$. Fixing the total load, the $\ell_2^2$ objective reaches its minimum uniquely by distributing the load evenly; see e.g. [19, Proof of Theorem 3]. Thus, the objective $mT_R^2$ can only be reached if the load of every machine is $T_R$, so this case cannot occur.

Case 2b: There is a machine which schedules at least two jobs of type $\gamma$. In this case, we exploit Claim 21, which we prove next. Again, it contradicts our assumption of $\sigma$ having objective value at most $mT_R^2$. So this case cannot occur either.

▷ Claim 21.

Any schedule in Case 2b has objective value strictly greater than r · mT R with r = ( m − . / ( m − mT R by atleast( r − mT R = 0 . m − · mk A > . k A . Proof:

The dependence of p iγ on the choice of a machine M i is only subtracting a i . So we get alower bound on the total sum of job sizes of all jobs in any schedule if we subtract m times a max (as we have m jobs of type γ ). This yields a total sum T = X j ∈ [ k ] (cid:16) n α j p α j + n α j p α j + n β j p β j (cid:17) + m (4 kA − a max )=7 mkA + A − ma max . The machine where we have scheduled two jobs of type γ has load at least T R ≥ · (4 kA − a max ) > kA − A .

This is already greater than T R , which is in turn at least T /m . Hence, we assume for therest of the proof that T R is exactly 8 kA − A and the remaining processing time T − T R is distributed equally across the other m − L avg of the remaining machines is L avg = T − T R m − mkA + A − ma max ) − (8 kA − A ) m − > mkA − kA − mAm − . Hence, the objective value of such a schedule is at least (cid:0) kA − A (cid:1) + ( m − (cid:18) mkA − kA − mAm − (cid:19) > mk A − k A − mkA + 49 m k A − mk A − m kA m − > mk A m − − m − mA m − ≥ rmT R , where m − − m − mA m − ≥ r = m − m − m ≥ · Balanced Bin Packing as described in the proof ofLemma 12, and we can assume that A ≥ · m as otherwise we could scale up the itemsof the Balanced Bin Packing instance by a factor of 46 · (cid:74) The following corollaries follow immediately from Lemmas 19 and 20; as before, it islikely that one might improve this to 4 job types. (cid:73)

Corollary 22. $X|HM|\ell_2$ is NP-hard already for $t$ job types with
(a) $X = Q$, $t = 6$.
(b) $X = R$, $t = 7$, and $\mathrm{rank}(p) = 2$.

▶ Corollary 23. $X||\ell_2$ is $W[1]$-hard parameterized by the number of job types with
(a) $X = Q$ and $n$, $p$, and $s$ given in unary.
(b) $X = R$ and $n$ and $p$ given in unary and $\mathrm{rank}(p) = 2$.

$R||\sum w_j C_j$

We will define weights in the hardness instance $I_R$ from Lemma 16. Denote by $\rho_j^i = w_j / p_j^i$ the Smith ratio of a job $j$ on machine $M_i$, where $w_j$ is its weight. It is known that given an assignment of jobs to machines, an optimal schedule is obtained by executing jobs ordered by their Smith ratios (on each machine) non-increasingly [26]. It suffices to restrict ourselves to such schedules, and an assignment of jobs to machines describes such a schedule.

We would like to use the same approach as for $\ell_2$ (Lemma 20) because it is known that $\sum w_j C_j$ and $\ell_2$ are often (not always) closely related. However, because the size of a job of type $\gamma$ depends both on $j$ and the machine $M_i$, yet its weight only depends on $j$, it is impossible to express an exact objective value of the perfect schedule from the previous sections. This would make the argument of an analogue of Case 2a of Lemma 20 invalid and a no-instance of Balanced Bin Packing might reduce to a yes-instance of $R||\sum w_j C_j$. The contribution of all $\alpha$- and $\beta$-type jobs to the sum of weighted completion times is always the same as they and their weights are machine-independent. However, the contribution of jobs of type $\gamma$ depends on the machine, while its weight is machine-independent. If we schedule to each machine exactly one job of type $\gamma$, then we will have each machine-dependent processing time once, and across all machines their contribution is independent of the schedule and we can specify an exact target objective value. Consequently, we can apply the same argumentation for Case 1 and Case 2a as in Lemma 20. For Case 2b, we will exploit the claim in the proof of Lemma 20 once again and combine it with a gap argument (Lemma 24).

To obtain the weighted hardness instance $I_R^w$, we define the following weights for our hardness instance $I_R$ from Section 4.2. For the $\alpha$- and $\beta$-type jobs the weight equals the processing time, and for the job type $\gamma$ it is slightly greater:
$$w_{\alpha_j^\times} = p_{\alpha_j^\times}, \qquad w_{\beta_j} = p_{\beta_j}, \qquad w_\gamma = 4kA^3 \ (= p_\gamma^i + a_i \text{ for each } i \in [m]).$$

▶ Lemma 24. Let $\sigma$ be any schedule of the weighted hardness instance, let $(L_1, L_2, \dots, L_m)$ be its load vector, and $L := \frac{1}{2} \left(\sum_{i=1}^{m} L_i^2\right)$. Let
$$\Gamma = \frac{1}{2k} \sum_{j=1}^{k} \left(A w_{\alpha_j^1}^2 + (m-1) A w_{\alpha_j^0}^2 + m w_{\beta_j}^2\right), \quad \Delta_1 = \frac{1}{2} \sum_{i=1}^{m} p_\gamma^i w_\gamma, \quad \Delta_2 = \frac{1}{2} \sum_{i=1}^{m} p_\gamma^i \cdot a_i, \quad \Delta = \Delta_1 + \Delta_2,$$
$$\Delta^{\min}_{\text{linear}} = \frac{1}{2} m (w_\gamma - a_{\max}) w_\gamma, \quad \text{and} \quad \Delta^{\min}_{\text{quadr}} = \frac{1}{2} m (w_\gamma - a_{\max}) a_{\max}.$$
(a) The value of $\sigma$ under $\sum w_j C_j$ is at least $L + \Gamma + \Delta^{\min}_{\text{linear}} + \Delta^{\min}_{\text{quadr}}$.
(b) If $\sigma$ schedules one $\gamma$ job per machine, then the value of $\sigma$ under $\sum w_j C_j$ is $L + \Gamma + \Delta$.

Proof of Lemma 24. First notice that the Smith ratio of all $\alpha$- and $\beta$-type jobs is 1, and the Smith ratio of the jobs of type $\gamma$ is strictly greater than 1, so the jobs of type $\gamma$ will always be executed first. We use the following description of the objective function due to Knop and Koutecký [19]. Assume that the $\tau$ job types are ordered according to their Smith ratios with respect to some machine $M_i$ (with $i \in [m]$) as $t = 1, \dots, \tau$, that $x_t^i$ is the number of jobs of type $t$ scheduled on machine $M_i$, and that $z_t^i = \sum_{\ell=1}^{t} p_\ell^i x_\ell^i$ is the time spent processing the first $t$ job types. Define $\rho_{\tau+1}^i = 0$. Then the contribution of machine $M_i$ to the total $\sum w_j C_j$ objective is
$$\frac{1}{2} \sum_{t=1}^{\tau} \left[ (z_t^i)^2 \left(\rho_t^i - \rho_{t+1}^i\right) + p_t^i w_t x_t^i \right].$$
In our case, the coefficients of $(z_t^i)^2$ for any $\alpha$- and $\beta$-type except the last one will be 0 because their Smith ratios are identical, hence $\rho_t^i - \rho_{t+1}^i = 0$. The term of the last $\alpha$- or $\beta$-type will have $z_t^i = L_i$, the load of machine $M_i$, and its coefficient is $\rho_\tau^i - \rho_{\tau+1}^i = 1 - 0$, so this term is $\frac{1}{2} L_i^2$. Hence, summing those terms over all machines gives $L$, and we are left to account for (a) the quadratic terms corresponding to the jobs of type $\gamma$, and (b) the linear terms $p_t^i w_t x_t^i$.

First, we consider the linear terms for the $\alpha$- and $\beta$-type jobs. Since the sizes of these jobs are independent of the machines, we just sum them up without knowing to which machine they are scheduled. For each $j \in [k]$, we have $A/k$ jobs of type $\alpha_j^1$, $(m-1)A/k$ jobs of type $\alpha_j^0$ and $m/k$ jobs of type $\beta_j$. Hence, across all $j \in [k]$ this is
$$\frac{1}{2} \sum_{j=1}^{k} \left( \frac{A}{k} \cdot p_{\alpha_j^1} w_{\alpha_j^1} + \frac{(m-1)A}{k} \cdot p_{\alpha_j^0} w_{\alpha_j^0} + \frac{m}{k} \cdot p_{\beta_j} w_{\beta_j} \right) = \frac{1}{2k} \sum_{j=1}^{k} \left( A w_{\alpha_j^1}^2 + (m-1) A w_{\alpha_j^0}^2 + m w_{\beta_j}^2 \right) = \Gamma.$$
Now, we consider the jobs of type $\gamma$. Let us first assume that we have scheduled exactly one job of type $\gamma$ per machine. This means that across all machines, every possible quadratic and linear term appears precisely once. So for the linear terms, we get
$$\frac{1}{2} \sum_{i=1}^{m} p_\gamma^i w_\gamma = \Delta_1.$$
For the quadratic terms, we get
$$\frac{1}{2} \sum_{i=1}^{m} (p_\gamma^i)^2 \left( \frac{w_\gamma}{p_\gamma^i} - 1 \right) = \frac{1}{2} \sum_{i=1}^{m} p_\gamma^i \left( w_\gamma - p_\gamma^i \right) = \frac{1}{2} \sum_{i=1}^{m} p_\gamma^i \cdot a_i = \Delta_2.$$
Let us now drop the assumption that we have scheduled exactly one job of type $\gamma$ per machine and determine a lower bound for the objective value of an arbitrary schedule. Still, $L$ and $\Gamma$ have the structure described above. Thus, we specify a lower bound by minimizing the linear and the quadratic terms for the jobs of type $\gamma$. Clearly, they are minimum if we schedule each of the $m$ jobs to the machine where it has the smallest size, which is the machine $M_i$ corresponding to item $a_{\max}$. Consequently, we have (since $p_\gamma^i = w_\gamma - a_{\max}$)
$$\frac{1}{2} m (w_\gamma - a_{\max}) w_\gamma = \Delta^{\min}_{\text{linear}} \quad \text{and} \quad \frac{1}{2} m (w_\gamma - a_{\max}) a_{\max} = \Delta^{\min}_{\text{quadr}}.$$
◀

With this lemma at hand, it is not difficult to show that the weighted hardness instance indeed reduces Balanced Bin Packing to $R|HM|\sum w_j C_j$ as before:

▶ Lemma 25.

The weighted hardness instance $I_R^w$ is hard for $R|HM|\sum w_j C_j$.

Proof of Lemma 25. Set the target objective value to be $\frac{1}{2} mT_R^2 + \Gamma + \Delta$. Note that a perfect schedule satisfies the condition that every machine executes exactly one job of type $\gamma$ and the load of every machine is at most $T_R$, hence by Lemma 24, the value of a perfect schedule is precisely the target value. Assume a schedule $\sigma$ is given whose $\sum w_j C_j$ objective is at most the target objective. We again distinguish three cases:

Case 1: The load of each machine is at most $T_R$. This is again a schedule of makespan at most $T_R$ and the analysis of Lemma 16 applies. Hence, $\sigma$ is a perfect schedule.

Case 2a: Each machine contains exactly one $\gamma$-type job and there is a machine with load more than $T_R$. By Lemma 24, we know that such a schedule has an objective value of $L + \Gamma + \Delta$. In this sum, $\Gamma$ and $\Delta$ are constant and independent of the loads of the machines. We use the same argument as in Lemma 20. As the objective value of $\frac{1}{2} mT_R^2 + \Gamma + \Delta$ is matched precisely if the total load is distributed evenly (i.e. Case 1), re-distributing the same total load unevenly increases the quadratic term $L$. Hence, this case cannot occur.

Case 2b:

There is a machine which schedules at least jobs of type γ . By Lemma 24(a), theobjective value of σ is at least L + Γ + ∆ minlinear + ∆ minquadr . Let’s compare this to the target value kmT R + Γ + ∆ + ∆ summand by summand. As shown in the proof of Lemma 20,in Case 2b we have ( P mi =1 L i ) − mT R ≥ ( r − mT R . However, we now have L = P mi =1 L i .Plugging in, we get L − mT R ≥

12 ( r − mT R > . k A . Of course, Γ is the same in both sums. Consider each of the km summands of ∆ .Compared to its counterpart in ∆ minlinear , it is greater by at most w γ a max < kA . Similarly,consider each of the km summands of ∆ . Compared to its counterpart in ∆ minquadr , itis greater by at most a < A . Combining all m summands of both of these sums, wehave at most 5 mkA . This in turn is at most 0 . kA because without loss of generality, wecan assume that A > m as otherwise we could scale up the items of the Balanced BinPacking instance by a factor of 20.So in total, the value of σ is greater by at least 0 . k A minus at most 0 . kA , andthus cannot attain the target objective value, so this case also does not occur. (cid:74)(cid:73) Corollary 26. R | HM | P w j C j is NP -hard already with 7 job types and rank( p ) = 2 . (cid:73) Corollary 27. R || P w j C j is W [ ] -hard parameterized by the number of job types, even if n and p are given in unary and rank( p ) = 2 . We conclude with a few interesting questions raised by our results:We have shown that Q | HM | C max and R | HM | C max are NP -hard with 6 and 4 job types,respectively. What is the complexity for smaller numbers of job types? We are notaware of any positive result about either problem, including Cutting Stock , even for 2job/item types.Recall the question whether P | HM | C max parameterized by the number of job types k isin FPT or not. Our results provide some guidance for how one could use the interplay ofhigh multiplicity of jobs and large job sizes to show hardness.Is

Cutting Stock W [ ]-hard when the input data is given in unary?We haven’t yet investigated jobs with release times and due dates and minimization ofmakespan, weighted ﬂow time, or weighted tardiness, already on one machine. The workof Knop et al. [21] shows that for example 1 | r j , d j |{ C max , P w j F j , P w j T j } parameterizedby the number of job types k is in XP when p max is polynomially bounded. Is it FPT or W [ ]-hard? References Aditya Bhaskara, Ravishankar Krishnaswamy, Kunal Talwar, and Udi Wieder. Minimummakespan scheduling with low rank processing times. In

Proceedings of the twenty-fourth annual ACM-SIAM symposium on Discrete algorithms, pages 937–947. SIAM, 2013.
[2] Lin Chen, Klaus Jansen, and Guochuan Zhang. On the optimality of exact and approximation algorithms for scheduling problems. Journal of Computer and System Sciences, 96:1–32, 2018.
[3] Lin Chen, Dániel Marx, Deshi Ye, and Guochuan Zhang. Parameterized and approximation results for scheduling with a low rank processing time matrix. In Heribert Vollmer and Brigitte Vallée, editors, volume 66 of LIPIcs, pages 22:1–22:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017. doi:10.4230/LIPIcs.STACS.2017.22.
[4] Michele Conforti, Gérard Cornuéjols, Giacomo Zambelli, et al. Integer programming, volume 271. Springer, 2014.
[5] Jana Cslovjecsek, Friedrich Eisenbrand, and Robert Weismantel. N-fold integer programming via LP rounding. arXiv preprint arXiv:2002.07745, 2020.
[6] Marek Cygan, Fedor V. Fomin, Lukasz Kowalik, Daniel Lokshtanov, Dániel Marx, Marcin Pilipczuk, Michal Pilipczuk, and Saket Saurabh. Parameterized Algorithms. Springer, 2015. doi:10.1007/978-3-319-21275-3.
[7] Friedrich Eisenbrand, Christoph Hunkenschröder, Kim-Manuel Klein, Martin Koutecký, Asaf Levin, and Shmuel Onn. An algorithmic theory of integer programming. Technical report, 2019. http://arxiv.org/abs/1904.01361.
[8] András Frank and Éva Tardos. An application of simultaneous diophantine approximation in combinatorial optimization.
Combinatorica, 7(1):49–65, 1987.
[9] P. C. Gilmore and R. E. Gomory. A linear programming approach to the cutting-stock problem. Oper. Res., 9:849–859, 1961.
[10] Michel X. Goemans and Thomas Rothvoß. Polynomiality for bin packing with a constant number of item types. In Proc. SODA 2014, pages 830–839, 2014.
[11] Danny Hermelin, Shlomo Karhi, Michael Pinedo, and Dvir Shabtay. New algorithms for minimizing the weighted number of tardy jobs on a single machine. Annals of Operations Research, pages 1–17, 2018.
[12] Danny Hermelin, Michael Pinedo, Dvir Shabtay, and Nimrod Talmon. On the parameterized tractability of single machine scheduling with rejection. European Journal of Operational Research, 273(1):67–73, 2019.
[13] Klaus Jansen. New algorithmic results for bin packing and scheduling. In Dimitris Fotakis, Aris Pagourtzis, and Vangelis Th. Paschos, editors,
Algorithms and Complexity, pages 10–15, Cham, 2017. Springer International Publishing.
[14] Klaus Jansen and Kim-Manuel Klein. About the structure of the integer cone and its application to bin packing. In Proc. SODA 2017, pages 1571–1581, 2017.
[15] Klaus Jansen, Kim-Manuel Klein, Marten Maack, and Malin Rau. Empowering the configuration-IP - new PTAS results for scheduling with setups times. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2018.
[16] Klaus Jansen, Stefan Kratsch, Dániel Marx, and Ildikó Schlotter. Bin packing with fixed number of bins revisited. Journal of Computer and System Sciences, 79(1):39–49, 2013.
[17] Klaus Jansen, Alexandra Lassota, and Marten Maack. Approximation algorithms for scheduling with class constraints. arXiv preprint arXiv:1909.11970, 2019.
[18] Klaus Jansen, Alexandra Lassota, and Lars Rohwedder. Near-linear time algorithm for n-fold ILPs via color coding. arXiv preprint arXiv:1811.00950, 2018.
[19] Dušan Knop and Martin Koutecký. Scheduling meets n-fold integer programming. Journal of Scheduling, 21:493–503, 2018.
[20] Dušan Knop and Martin Koutecký. Scheduling kernels via configuration LP. CoRR, abs/2003.02187, 2020. URL: https://arxiv.org/abs/2003.02187.
[21] Dušan Knop, Martin Koutecký, Asaf Levin, Matthias Mnich, and Shmuel Onn. Multitype integer monoid optimization and applications. Technical report, 2019. http://arxiv.org/abs/1909.07326.
[22] Eugene L. Lawler, Jan Karel Lenstra, Alexander H. G. Rinnooy Kan, and David B. Shmoys. Sequencing and scheduling: Algorithms and complexity. In S. C. Graves, A. H. G. Rinnooy Kan, and P. H. Zipkin, editors, Handbooks in Operations Research and Management Science: Logistics of Production and Inventory, volume 4, pages 445–522, Amsterdam-London-New York-Tokyo, 1993. North-Holland Publishing Company.
[23] Asaf Levin. Approximation schemes for the generalized extensible bin packing problem. arXiv preprint arXiv:1905.09750, 2019.
[24] Matthias Mnich and René van Bevern. Parameterized complexity of machine scheduling: 15 open problems. Computers & OR, 100:254–261, 2018.
[25] Matthias Mnich and Andreas Wiese. Scheduling and fixed-parameter tractability. Mathematical Programming, 154(1-2):533–562, 2015.
[26] Wayne E. Smith. Various optimizers for single-stage production.