Explorable Uncertainty in Scheduling with Non-Uniform Testing Times
Susanne Albers and Alexander Eckl

Department of Informatics, Technical University of Munich, Boltzmannstr. 3, 85748 Garching, Germany
[email protected], [email protected]

Advanced Optimization in a Networked Economy, Technical University of Munich, Arcisstraße 21, 80333 Munich, Germany
Abstract.
The problem of scheduling with testing in the framework of explorable uncertainty models environments where some preliminary action can influence the duration of a task. In the model, each job has an unknown processing time that can be revealed by running a test. Alternatively, jobs may be run untested for the duration of a given upper limit. Recently, Dürr et al. [5] have studied the setting where all testing times are of unit size and have given lower and upper bounds for the objectives of minimizing the sum of completion times and the makespan on a single machine. In this paper, we extend the problem to non-uniform testing times and present the first competitive algorithms. The general setting is motivated, for example, by online user surveys for market prediction or querying centralized databases in distributed computing. Introducing general testing times gives the problem a new flavor and requires updated methods with new techniques in the analysis. We present constant competitive ratios for the objective of minimizing the sum of completion times in the deterministic case, both in the non-preemptive and the preemptive setting. For the preemptive setting, we additionally give a first lower bound. We also present a randomized algorithm with an improved competitive ratio. Furthermore, we give tight competitive ratios for the objective of minimizing the makespan, both in the deterministic and the randomized setting.
Keywords:
Online Scheduling · Explorable Uncertainty · Competitive Analysis · Single Machine · Sum of Completion Times · Makespan
In scheduling environments, uncertainty is a common consideration for optimization problems. Commonly, results are either based on worst-case considerations or on a random distribution over the input. These approaches are known as robust
Work supported by Deutsche Forschungsgemeinschaft (DFG), GRK 2201, and by the European Research Council, Grant Agreement No. 691672, project APEG.
Corresponding author, e-mail: [email protected]

optimization and stochastic optimization, respectively. However, it is often the case that unknown information can be attained through investing some additional resources, e.g. time, computing power or money. In his seminal paper, Kahan [12] first introduced the notion of explorable or queryable uncertainty to model obtaining additional information for a problem at a given cost during the runtime of an algorithm. Since then, these kinds of problems have been explored in different optimization contexts, for example in the framework of combinatorial, geometric or function-value optimization tasks.

Recently, Dürr et al. [5] introduced a model for scheduling with testing on a single machine within the framework of explorable uncertainty. In their approach, a number of jobs with unknown processing times are given. Testing takes one unit of time and reveals the processing time. If a job is executed untested, the time it takes to run the job is given by an upper bound. The novelty of their approach lies in having tests executed directly on the machine running the jobs, as opposed to considering tests separately.

In view of this model, a natural extension is to consider non-uniform testing times to allow for a wider range of problems. Dürr et al. state that for certain applications it is appropriate to consider a broader variation of testing times and leave this question open for future research.

Situations where a preliminary action, operation or test can be executed before a job are manifold and include a wide range of real-life applications. In the following, we discuss a small selection of such problems and emphasize cases with heterogeneous testing requirements. Consider first a situation where an online user survey can help predict market demand and production times.
The time needed to produce the necessary amount of goods for the given demand is only known after conducting the survey. Depending on its scope and size, the invested costs for the survey may vary significantly.

As a second example, we look at distributed computing in a setting with many distributed local databases and one centralized master server. At the local stations, only estimates of some data values are stored; in order to obtain the true value, one must query the master server. It depends on the distance and connection quality from any localized database to the master how much time and resources this requires. Olston and Widom [15] have considered this setting in detail.

Another possible example is the acquisition of a house through an agent giving us more information about its value, location, condition, etc., but demanding a price for her services. This payment could vary based on the price of the house, the amount of work of the agent or the number of competitors.

In their paper, Dürr et al. [5] mention fault diagnosis in maintenance and medical treatment, file compression for transmissions, and running jobs in an alternative fast mode whose availability can be determined through a test. Generally, any situation involving diverse cost and duration estimates, e.g. in construction work, manufacturing or insurance, falls into our category of possible applications.
In view of all these examples, we investigate non-uniform testing in the scope of explorable uncertainty on a single machine as introduced by [5]. We study whether algorithms can be extended to this non-uniform case and, if not, how we can find new methods for it.
We consider n jobs to be scheduled on a single machine. Every job j has an unknown processing time p_j and a known upper bound u_j. It holds that 0 ≤ p_j ≤ u_j for all j. Each job also has a testing time t_j ≥
0. A job can either be executed untested, which takes time u_j, or be tested and then executed, which takes a total time of t_j + p_j. Note that a tested job does not necessarily have to be executed right after its test; it may be delayed arbitrarily while the algorithm tests or executes other jobs.

Since only the upper bounds are initially known to the algorithm, the task can be viewed as an online problem with an adaptive adversary. The actual processing time p_j is only realized after job j has been tested by the algorithm. In the randomized case, the adversary knows the distribution of the random input parameters of an algorithm, but not their outcome.

We denote the completion time of a job j as C_j and primarily consider the objective of minimizing the total sum of completion times Σ_j C_j. As a secondary objective, we also investigate the simpler goal of minimizing the makespan max_j C_j. We use competitive analysis to compare the value produced by an algorithm with an optimal offline solution.

Clearly, in the offline setting where all processing times are known, an optimal schedule can be determined directly: if t_j + p_j ≤ u_j, then job j is tested, otherwise it is run untested. For the sum of completion times, the jobs are therefore scheduled in order of non-decreasing min(t_j + p_j, u_j). Any algorithm for the online problem not only has to decide whether to test a given job or not, but also in which order to run all tests and executions of both untested and tested jobs. For the makespan objective, the ordering of the jobs does not matter and an optimal offline algorithm decides the testing by the same principle as above.

Our setting is directly based on the problem of scheduling uncertain jobs on a single machine with explorable processing times, introduced by Dürr et al. [5] in 2018. They only consider the special case where t_j ≡ 1.
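The offline-optimal rule described in the problem definition above can be made concrete with a minimal sketch (our own illustration, not part of the paper; the function name is hypothetical): each job runs for its optimal running time ρ_j = min(u_j, t_j + p_j), and scheduling by non-decreasing ρ_j minimizes the sum of completion times.

```python
def offline_optimum(jobs):
    """Optimal offline sum of completion times.

    jobs: list of (u_j, t_j, p_j) with 0 <= p_j <= u_j.
    A job is tested exactly when t_j + p_j <= u_j, so its optimal
    running time is rho_j = min(u_j, t_j + p_j); shortest-first is optimal.
    """
    rhos = sorted(min(u, t + p) for (u, t, p) in jobs)
    total, clock = 0, 0
    for rho in rhos:
        clock += rho      # completion time of the job scheduled now
        total += clock
    return total

# First job is worth testing (t + p = 3 <= u = 10); second is not (t + p = 6 > u = 4).
print(offline_optimum([(10, 1, 2), (4, 5, 1)]))  # rhos = [3, 4] -> 3 + 7 = 10
```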
Testing and executing jobs on a single machine can be viewed as part of the research area of queryable uncertainty or explorable uncertainty. The first seminal paper on dealing with uncertainty by querying parts of the input was published in 1991 by Kahan [12]. In his paper, Kahan considers a set of elements with uncertain values that lie in a closed interval. He explores approximation guarantees for the number of queries necessary to obtain the maximum and median value of the uncertain elements.

Since then, there has been a large amount of research concerned with the objective of minimizing the number of queries needed to obtain a solution. A variety of numerical, geometric and combinatorial problems have been studied in this framework; the following is a selection of these publications. Next to Kahan, Feder et al. [9], Khanna and Tan [13], and Gupta et al. [11] have also considered the objective of determining different function values, in particular the k-smallest value and the median. Bruce et al. [2] have analysed geometric tasks, specifically the Maximal Points and Convex Hull problems. They have also introduced the notion of witness sets as a general concept for queryable uncertainty, which was then generalized by Erlebach et al. [7]. Olston and Widom [15] researched caching problems while allowing for some inaccuracy in the objective function. Other studied combinatorial problems include minimum spanning tree [7, 14], shortest path [8], knapsack [10] and boolean trees [3]. See also the survey by Erlebach and Hoffmann [6] for an overview of research in this area.

A related type of problem within optimization under uncertainty are settings where the cost of the queries is a direct part of the objective function. Most notably, the paper by Dürr et al. [5] falls into this category. There, the tests necessary to obtain additional information about the runtime of the jobs are executed on the same machine as the jobs themselves.
Other examples include Weitzman's original Pandora's Box problem [18], where n independent random variables are probed to maximize the highest revealed value. Every probing incurs a price directly subtracted from the objective function. Recently, Singla [17] introduced the 'price of information' model to describe receiving information in exchange for a probing price. He gives approximation ratios for various well-known combinatorial problems with stochastic uncertainty.

In this paper, we provide the first algorithms for the more general scheduling with testing problem where testing times can be non-uniform. Consult Table 1 for an overview of results for both the non-uniform and uniform versions of the problem. All ratios provided without citation are introduced in this paper; the remaining results are presented in [5].

For the problem of scheduling uncertain jobs with non-uniform testing times on a single machine, our results are the following: a deterministic 4-competitive algorithm for the objective of minimizing the sum of completion times and a randomized 3.3794-competitive algorithm for the same objective. If we allow preemption, that is, to cancel the execution of a job at any time and start
Table 1.
Overview of results
Objective | Type | General tests | Uniform tests | Lower bound
Σ C_j | deterministic | 4 | 2 [5] | 1.8546 [5]
Σ C_j | randomized | 3.3794 | 1.7453 [5] | 1.6257 [5]
Σ C_j | determ. preemptive | 2φ ≈ 3.2361 | – | 1.2
max C_j | deterministic | φ ≈ 1.618 | φ [5] | φ [5]
max C_j | randomized | 4/3 | 4/3 [5] | 4/3 [5]

working on a different job, then we can improve the deterministic case to be 2φ-competitive. Here, φ ≈ 1.618 denotes the golden ratio. For the objective of minimizing the makespan, we give a tight φ-competitive algorithm in the deterministic case and a tight 4/3-competitive algorithm in the randomized case.

Our approaches handle non-uniform testing times in a novel fashion distinct from the methods of [5]. As we show in Appendix A, the idea of scheduling untested jobs with small upper bounds at the beginning of the schedule, which works well in the uniform case, fails to generalize to non-uniform tests. Additionally, describing parameterized worst-case instances becomes intractable in the presence of an arbitrary number of different testing times.

In place of these methods, we compute job completion times by cross-examining contributions of other jobs in the schedule. We determine tests based on the ratio between the upper bound and the given test time, and pay specific attention to sorting the involved executions and tests in a suitable way.

The paper is structured as follows: Sections 2 and 3 examine the deterministic and randomized cases, respectively. Various algorithms are presented and their competitive ratios proven. We extend the optimal results for the objective of minimizing the makespan from the uniform case to general testing times in Section 4. Finally, we conclude with some open problems.

In this section, we introduce our basic algorithm and prove deterministic upper bounds for the non-preemptive as well as the preemptive case. The basic structure introduced in Section 2.1 works as a framework for other algorithms presented later. We give a detailed analysis of the deterministic algorithm and prove that it is 4-competitive if parameters are chosen accordingly. In Section 2.2 we prove that an algorithm for the preemptive case is 3.2361-competitive and that no preemptive algorithm can have a ratio better than 1.2.

We now present the elemental framework of our algorithm, which we call (α, β)-SORT.
As input, the algorithm has two real parameters α, β ≥ 1.
Algorithm 1: (α, β)-SORT

  T ← ∅, N ← ∅, σ_j ≡ 0
  foreach j ∈ [n] do
    if u_j ≥ α·t_j then add j to T; set σ_j ← β·t_j;
    else add j to N; set σ_j ← u_j;
  end
  while N ∪ T ≠ ∅ do
    choose j_min ∈ argmin_{j ∈ N ∪ T} σ_j;
    if j_min ∈ N then execute j_min untested; remove j_min from N;
    else if j_min ∈ T then
      if j_min not tested then test j_min; set σ_{j_min} ← p_{j_min};
      else execute j_min; remove j_min from T;
    end
  end

The algorithm is divided into two phases. First, we decide for each job whether we test this job or not, based on the ratio u_j/t_j. This gives us a partition of [n] into the disjoint sets T = {j ∈ [n] : ALG tests j} and N = {j ∈ [n] : ALG runs j untested}. In the second phase, we always attend to the job j_min with the currently smallest scaling time σ_j. The scaling time is the time needed for the next step of executing j:

• If j is in N, then σ_j = u_j.
• If j is in T and has not been tested, then σ_j = β·t_j.
• If j is in T and has already been tested, then σ_j = p_j.

Note that in the second case above, we 'stretch' the scaling time by multiplying with β ≥
1. The intention behind this stretching is that testing a job, unlike executing it, does not immediately lead to the job being completed. Therefore the parameter β artificially lowers the relevance of testing in the ordering of our algorithm. Note that the actual time needed for testing remains t_j.

In the following, we show that the above algorithm achieves a provably good competitive ratio. The parameters are kept general in the proof and are then optimized in a final step. We present the computations with general parameters for a clearer picture of the proof structure, which we will reuse in later sections. In the final optimization step it will turn out that setting α = β = 1 yields a best-possible competitive ratio of 4.

Theorem 1.
The (1,1)-SORT algorithm is 4-competitive for the objective of minimizing the sum of completion times.

Proof. For the purpose of estimating the algorithmic result against the optimum, let ρ_j := min(u_j, t_j + p_j) be the optimal running time of job j. Without loss of generality, we order the jobs s.t. ρ_1 ≥ ... ≥ ρ_n. Hence the objective value of the optimum is

  OPT = Σ_{j=1}^n j · ρ_j.   (1)

Additionally, let

  p^A_j := t_j + p_j if j ∈ T, and p^A_j := u_j if j ∈ N,   (2)

be the algorithmic running time of j, i.e. the time the algorithm spends on running job j.

We start our analysis by comparing p^A_j to the optimal runtime ρ_j for a single job, summarized in the following proposition:

Proposition 1.
(a) ∀ j ∈ T: t_j ≤ ρ_j and p_j ≤ ρ_j.
(b) ∀ j ∈ T: p^A_j ≤ (1 + 1/α) · ρ_j.
(c) ∀ j ∈ N: p^A_j ≤ α · ρ_j.

Part (a) directly estimates testing and running times of tested jobs against the values of the optimum. We will use this extensively when computing the completion times of the jobs. The proof of parts (b) and (c) is very similar to the proof of Theorem 14 in [5] for uniform testing times. We refer to the appendix for a complete write-down of the proof. Note that instead of considering a single bound, we split the upper bound on the algorithmic running time p^A_j into different results for tested (b) and untested (c) jobs. This allows us to differentiate between the cases in the proof of Lemma 1 in more detail. We will often make use of this proposition to upper bound the algorithmic running time in later sections.

To obtain an estimate of the completion time C_j, we consider the contribution c(k, j) of all jobs k ∈ [n] to C_j. We define c(k, j) to be the amount of time the algorithm spends scheduling job k before the completion of j. Obviously it holds that c(k, j) ≤ p^A_k. The following central lemma computes an improved upper bound on the contribution c(k, j), using a rigorous case distinction over all possible configurations of k and j:

Lemma 1 (Contribution Lemma).
Let j ∈ [n] be a given job. The completion time of j can be written as

  C_j = Σ_{k ∈ [n]} c(k, j).
Additionally, for the contribution of k to j it holds that

  c(k, j) ≤ max( (1 + 1/β)·α, 1 + 1/β, 1 + β ) · ρ_j.

Refer to Appendix D.2 for the proof. Depending on whether j and k are tested or not, the lemma computes various upper bounds on the contribution using estimates from Proposition 1. Finally, the given bound on c(k, j) is achieved by taking the maximum over the different cases.

Recall that the jobs are ordered by non-increasing optimal execution times ρ_j, which by Proposition 1 are directly tied to the algorithmic running times. Hence, the jobs k with small indices are the 'bad' jobs with possibly large running times. For jobs with k ≤ j we therefore use the index-independent upper bound from the Contribution Lemma. Jobs with large indices k > j are handled separately, and we directly estimate them using their running time p^A_k.

By Lemma 1 and Proposition 1(b),(c) we have

  C_j = Σ_{k>j} c(k, j) + Σ_{k≤j} c(k, j)
      ≤ Σ_{k>j} p^A_k + Σ_{k≤j} max( (1 + 1/β)·α, 1 + 1/β, 1 + β ) · ρ_j
      ≤ Σ_{k>j} max(α, 1 + 1/α) · ρ_k + max( (1 + 1/β)·α, 1 + 1/β, 1 + β ) · j · ρ_j.

Finally, we sum over all jobs j:

  Σ_{j=1}^n C_j ≤ Σ_{j=1}^n Σ_{k=j+1}^n max(α, 1 + 1/α) · ρ_k + Σ_{j=1}^n max( (1 + 1/β)·α, 1 + 1/β, 1 + β ) · j · ρ_j
               = max(α, 1 + 1/α) · Σ_{j=1}^n (j − 1)·ρ_j + max( (1 + 1/β)·α, 1 + 1/β, 1 + β ) · Σ_{j=1}^n j · ρ_j
               ≤ ( max(α, 1 + 1/α) + max( (1 + 1/β)·α, 1 + 1/β, 1 + β ) ) · Σ_{j=1}^n j · ρ_j
               = f(α, β) · OPT,

where f(α, β) denotes the expression in the last parenthesis. Minimizing f(α, β) on the domain α, β ≥ 1 yields the optimal parameters α = β = 1 and a value of f(1,
1) = 4. We conclude that (1,1)-SORT is 4-competitive.

The parameter selection α = 1, β = 1 is optimal for the closed upper bound formula we obtained in our proof. It is possible and somewhat likely that a different parameter choice leads to better overall results for the algorithm. In the optimal makespan algorithm (see Section 4) the value of α is higher, suggesting that α = 1, which leads to testing all non-trivial jobs, might not be the best choice. The problem structure and the approach by Dürr et al. [5] also motivate setting β to some higher value than 1. For our proof, setting the parameters as we did is optimal.

In the appendix, we take advantage of this somewhat unexpected parameter outcome to prove that, for any choice of parameters, (α, β)-SORT is not better than 2-competitive.

The goal of this section is to show that if we allow jobs to be preempted, there exists a 2φ ≈ 3.2361-competitive algorithm based on the well-known Round Robin rule, which is used frequently in preemptive machine scheduling [16, Chapters 3.7, 5.6, 12.4]. The scheduling time frame is divided into very small equal-sized units. The Round Robin algorithm then cycles through all jobs, tending to each job for exactly one unit of time before switching to the next. It ensures that at any time the amounts by which any two jobs have been processed differ by at most one time unit [16].

The Round Robin algorithm is typically applied when job processing times are completely unknown. In our setting, we are actually given some upper bounds for our processing times and may invest testing time to find out the actual values. Despite having more information, it turns out that treating all job processing times as unknown in a Round Robin setting gives a provably good result. The only way we employ upper bounds and testing times is again to decide which jobs will be tested and which will not. We again do this at the beginning of our schedule for all given jobs.
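Since the next algorithm reuses the testing rule from the first phase of Algorithm 1, it may help to see (α, β)-SORT end-to-end first. The following is a minimal simulation (our own sketch, not from the paper; `sort_schedule` is a hypothetical name), run here with the optimal parameters α = β = 1:

```python
import heapq

def sort_schedule(jobs, alpha=1.0, beta=1.0):
    """Simulate (alpha, beta)-SORT on jobs = [(u_j, t_j, p_j), ...].

    p_j stays hidden from the scheduler until job j has been tested.
    Returns the sum of completion times of the produced schedule.
    """
    heap = []  # entries: (scaling time sigma, job index, state)
    for j, (u, t, p) in enumerate(jobs):
        if u >= alpha * t:
            heapq.heappush(heap, (beta * t, j, 'untested'))  # job joins T
        else:
            heapq.heappush(heap, (u, j, 'run_untested'))     # job joins N
    clock, total = 0.0, 0.0
    while heap:
        sigma, j, state = heapq.heappop(heap)
        u, t, p = jobs[j]
        if state == 'untested':       # run the test, revealing p_j
            clock += t
            heapq.heappush(heap, (p, j, 'tested'))
        elif state == 'tested':       # execute a previously tested job
            clock += p
            total += clock
        else:                         # execute the job untested
            clock += u
            total += clock
    return total

print(sort_schedule([(10, 1, 2), (4, 5, 1)]))  # tests job 0, runs job 1 untested -> 10.0
```

On this instance the simulation happens to match the offline optimum; in general (1,1)-SORT is only guaranteed to stay within factor 4 of it.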
The rule to decide testing is exactly the same as in the first phase of Algorithm 1: if u_j/t_j ≥ α, then test j, otherwise run j untested. Again, α is a parameter that is to be determined. It will turn out that setting α = φ gives the best result.

The pseudo-code for the Golden Round Robin algorithm is given in Algorithm 2. Essentially, the algorithm first decides for all jobs whether to test them and then runs a regular Round Robin scheme on the algorithmic running time p^A_j, which is defined as in (2).

Theorem 2.
The Golden Round Robin algorithm is 2φ ≈ 3.2361-competitive in the preemptive setting for the objective of minimizing the sum of completion times. This analysis is tight.

Algorithm 2:
Golden Round Robin

  T ← ∅, N ← ∅, σ_j ≡ 0
  foreach j ∈ [n] do
    if u_j ≥ φ·t_j then add j to T; set σ_j ← t_j;
    else add j to N; set σ_j ← u_j;
  end
  while ∃ j ∈ [n] not completely scheduled do
    run Round Robin on all jobs using σ_j as their processing times;
    let j_min be the first job to finish during the current execution;
    if j_min ∈ T and j_min tested but not executed then
      set σ_{j_min} ← p_{j_min} and keep j_min in the Round Robin rotation;
    end
  end

We only provide a sketch of the proof here; the full proof can be found in Appendix D.3.
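As an illustration (our own sketch, not from the paper; `golden_round_robin` is a hypothetical name), the scheme can be simulated in its idealized processor-sharing limit, where all unfinished jobs progress at the same rate and a completed test immediately re-enters the rotation with the revealed processing time:

```python
def golden_round_robin(jobs, phi=(1 + 5 ** 0.5) / 2):
    """Idealized Golden Round Robin on jobs = [(u_j, t_j, p_j), ...].

    remaining[j] holds the time still owed to j's current stage:
    its test time t_j (if u_j >= phi * t_j) or its upper bound u_j.
    All active jobs share the machine equally; when a test finishes,
    the revealed p_j becomes the job's new requirement.
    Returns the sum of completion times.
    """
    remaining, stage = {}, {}
    for j, (u, t, p) in enumerate(jobs):
        tested = u >= phi * t
        remaining[j] = t if tested else u
        stage[j] = 'test' if tested else 'exec'
    clock, total = 0.0, 0.0
    while remaining:
        j_min = min(remaining, key=remaining.get)  # next stage to finish
        step = remaining[j_min]
        clock += step * len(remaining)  # wall time while sharing equally
        for j in remaining:
            remaining[j] -= step
        if stage[j_min] == 'test':      # test done: now owe the revealed p_j
            stage[j_min] = 'exec'
            remaining[j_min] = jobs[j_min][2]
            if remaining[j_min] == 0:   # zero processing time finishes now
                del remaining[j_min]
                total += clock
        else:
            del remaining[j_min]
            total += clock
    return total

print(golden_round_robin([(10, 1, 2)]))  # test for 1, then run for 2 -> 3.0
```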
Proof (Proof sketch).
We set α = φ and use Proposition 1(b),(c) to bound the algorithmic running time p^A_j of a job j by its optimal running time ρ_j:

  p^A_j ≤ φ · ρ_j.

We then compute the contribution of a job k to a fixed job j by grouping jobs based on their finishing order in the schedule. This allows us to estimate the completion time of job j:

  C_j ≤ Σ_{k>j} p^A_k + j · p^A_j.

Finally, we sum over all jobs to receive ALG ≤ 2φ · OPT.

To show that the analysis is tight, we provide an example where the algorithmic solution approaches a value of 2φ · OPT as the number of jobs goes to infinity.

The following theorem additionally provides a lower bound for the deterministic preemptive setting, giving us a first simple lower bound for this case. The proof is based on the lower bound provided in [5] for the deterministic non-preemptive case. We refer to Appendix D.4 for this proof.
Theorem 3.
No algorithm in the preemptive deterministic setting can be better than 1.2-competitive.

In this section we introduce randomness to further improve the competitive ratio of Algorithm 1. There are two natural places to randomize: the decision which jobs to test, and the decision about the ordering of the jobs. These decisions directly correspond to the parameters α and β.

Making α randomized, for instance, could be achieved by defining α as a random variable with density function f_α : [1, ∞) → R≥0 and testing j if and only if r_j := u_j/t_j ≥ α. Then the probability for testing j would be given by p = ∫_1^{r_j} f_α(x) dx. Using a random variable α like this would make the analysis unnecessarily complicated; therefore we directly consider the probability p without defining a density, and let p depend on r_j. This additionally allows us to compute the probability of testing independently for each job.

Introducing randomness for β is even harder. The choice of β influences multiple jobs at the same time, therefore independence is hard to establish. Additionally, β frequently appears in the denominators of our analysis, hindering computations with expected values. We therefore forgo using randomness for the β-parameter and focus on α in this paper. We encourage future research to try their hand at making β random.

We give a short pseudo-code of our randomized algorithm in Algorithm 3. It is given a parameter function p(r_j) and a parameter β, both of which are to be determined later.

Algorithm 3:
Randomized-SORT

  T ← ∅, N ← ∅, σ_j ≡ 0
  foreach j ∈ [n] do
    add j to T with probability p(r_j) and set σ_j ← β·t_j;
    otherwise add it to N and set σ_j ← u_j;
  end
  while N ∪ T ≠ ∅ do
    choose j_min ∈ argmin_{j ∈ N ∪ T} σ_j;
    if j_min ∈ N then execute j_min untested; remove j_min from N;
    else if j_min ∈ T then
      if j_min not tested then test j_min; set σ_{j_min} ← p_{j_min};
      else execute j_min; remove j_min from T;
    end
  end

Theorem 4.
Randomized-SORT is 3.3794-competitive for the objective of minimizing the sum of completion times.

Proof. Again, we let ρ_1 ≥ ... ≥ ρ_n denote the ordered optimal running times of jobs 1, ..., n. The optimal objective value is given by (1). Fix jobs j and k. For easier readability, we write p instead of p(r_j). Since the testing decision is now made randomly, the algorithmic running time p^A_j as well as the contribution c(k, j) are now random variables. It holds that

  p^A_j = t_j + p_j with probability p, and p^A_j = u_j with probability 1 − p.

For the values of c(k, j) we consult the case distinctions from the proof of the Contribution Lemma 1. If j ∈ N, one can easily determine that c(k, j) ≤ (1 + 1/β)·u_j in all cases. Note that for this we did not need the final estimates with parameter α from the case distinction. Therefore this upper bound holds deterministically as long as we assume j ∈ N. By extension it also trivially holds for the expectation of c(k, j):

  E[c(k, j) | j untested] ≤ (1 + 1/β)·u_j.

Doing the same for the case distinction of j ∈ T, we get

  E[c(k, j) | j tested] ≤ max( (1 + β)·t_j, (1 + 1/β)·p_j, t_j + p_j ).

For the expected value of the contribution we have, by the law of total expectation:

  E[c(k, j)] = E[c(k, j) | j untested] · Pr[j untested] + E[c(k, j) | j tested] · Pr[j tested]
            ≤ (1 + 1/β)·u_j · (1 − p) + max( (1 + β)·t_j, (1 + 1/β)·p_j, t_j + p_j ) · p.

Note that this estimate of the expected value is independent of any parameters of k. That means, for fixed j we estimate the contribution to be the same for all jobs with small index k ≤ j. Of course, as before, for the jobs with large index k > j we may alternatively directly use the algorithmic running time of k:

  E[c(k, j)] ≤ E[p^A_k].
Putting the above arguments together, we use the Contribution Lemma and linearity of expectation to estimate the completion time of j:

  E[C_j] = Σ_{k ∈ [n]} E[c(k, j)] ≤ Σ_{k>j} E[p^A_k] + Σ_{k≤j} E[c(k, j)].

For the total objective value of the algorithm we receive, again using linearity of expectation:

  E[Σ_{j=1}^n C_j] ≤ Σ_{j=1}^n (j − 1)·E[p^A_j] + Σ_{j=1}^n j · E[c(k, j)]
                  ≤ Σ_{j=1}^n (j − 1)·( u_j·(1 − p) + (t_j + p_j)·p )
                    + Σ_{j=1}^n j·( (1 + 1/β)·u_j·(1 − p) + max( (1 + β)·t_j, (1 + 1/β)·p_j, t_j + p_j )·p )
                  ≤ Σ_{j=1}^n j · λ_j(β, p),

where we define

  λ_j(β, p) := ( u_j + (1 + 1/β)·u_j ) · (1 − p) + ( t_j + p_j + max( (1 + β)·t_j, (1 + 1/β)·p_j, t_j + p_j ) ) · p.

Having computed this first estimate for the objective of the algorithm, we now consider the ratio λ_j(β, p)/ρ_j on its own. If we can prove an upper bound for this ratio, the same bound holds as competitive ratio for our algorithm. Hence the goal is to choose parameters β and p, where p may depend on j, such that λ_j(β, p)/ρ_j is as small as possible. In the best case, we want to compute

  min_{β ≥ 1, p ∈ [0,1]} max_j λ_j(β, p)/ρ_j.

Lemma 2.
There exist parameters β̂ ≥ 1 and p̂ ∈ [0, 1] s.t.

  max_j λ_j(β̂, p̂)/ρ_j ≤ 3.3794.

The choice of parameters is given in the proof of the lemma, which can be found in Appendix D.5. During the proof we use computer-aided computations with Mathematica. The Mathematica code can be found in Appendix E and additionally on the webpage [1] for download.

To conclude the proof of the theorem, we write

  E[Σ_{j=1}^n C_j] ≤ Σ_{j=1}^n j · λ_j(β̂, p̂) ≤ 3.3794 · Σ_{j=1}^n j · ρ_j = 3.3794 · OPT.

In this section, we consider the objective of minimizing the makespan of our schedule. It turns out that we are able to prove the same tight algorithmic bounds for this objective function as Dürr et al. in the unit-time testing case, both for deterministic and randomized algorithms. The decisions of the algorithms only depend on the ratio r_j = u_j/t_j. Refer to the appendix for the proofs.

Theorem 5.
The algorithm that tests job j iff r_j ≥ φ is φ-competitive for the objective of minimizing the makespan. No deterministic algorithm can achieve a smaller competitive ratio.

Theorem 6.
The randomized algorithm that tests job j with probability p = 1 − 1/(r_j² − r_j + 1) is 4/3-competitive for the objective of minimizing the makespan. No randomized algorithm can achieve a smaller competitive ratio.

In this paper, we introduced the first algorithms for the problem of scheduling with testing on a single machine with general testing times, which arises in the context of settings where a preliminary action can influence cost, duration or difficulty of a task. For the objective of minimizing the sum of completion times, we presented a 4-approximation for the deterministic case and a 3.3794-approximation for the randomized case; in the preemptive deterministic setting, we gave a 2φ ≈ 3.2361-competitive algorithm as well as a first lower bound. For the objective of minimizing the makespan, we proved tight competitive ratios of φ ≈ 1.618 and 4/3 in the deterministic and randomized settings, respectively.
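As a closing numerical sanity check of the makespan bounds in Theorems 5 and 6 (our own illustration, not part of the paper), one can evaluate the per-job worst case directly: normalize t_j = 1, so r = u_j, and let the adversary choose p_j ∈ {0, u_j}:

```python
PHI = (1 + 5 ** 0.5) / 2  # golden ratio; note 1 + 1/PHI == PHI

def det_ratio(r):
    """Worst-case ratio of the deterministic rule 'test iff r >= PHI'
    on one job with r = u/t and t = 1."""
    if r >= PHI:
        # tested: cost 1 + p_j; adversary picks p_j = 0 (OPT 1) or p_j = u (OPT r)
        return max(1.0, (1 + r) / r)
    # untested: cost u = r regardless of p_j; OPT is 1 or r
    return max(r, 1.0)

def rand_ratio(r):
    """Worst-case expected ratio when testing with prob. p = 1 - 1/(r^2 - r + 1)."""
    p = 1 - 1 / (r * r - r + 1)
    return max(p + (1 - p) * r,                  # p_j = 0: E[cost] vs. OPT = 1
               (p * (1 + r) + (1 - p) * r) / r)  # p_j = u: E[cost] vs. OPT = r

rs = [1 + i / 100 for i in range(500)]
print(max(det_ratio(r) for r in rs))   # stays just below PHI ~ 1.618
print(max(rand_ratio(r) for r in rs))  # about 1.333, the 4/3 bound at r = 2
```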
References

2. Bruce, R., Hoffmann, M., Krizanc, D., Raman, R.: Efficient update strategies for geometric computing with uncertainty. Theory of Computing Systems (4), 411–423 (2005). https://doi.org/10.1007/s00224-004-1180-4
3. Charikar, M., Fagin, R., Guruswami, V., Kleinberg, J., Raghavan, P., Sahai, A.: Query strategies for priced information. Journal of Computer and System Sciences (4), 785–819 (2002). https://doi.org/10.1006/jcss.2002.1828
4. Dürr, C., Erlebach, T., Megow, N., Meißner, J.: An adversarial model for scheduling with testing (2017)
5. Dürr, C., Erlebach, T., Megow, N., Meißner, J.: Scheduling with explorable uncertainty. In: Karlin, A.R. (ed.) 9th Innovations in Theoretical Computer Science Conference (ITCS 2018). Leibniz International Proceedings in Informatics (LIPIcs), vol. 94, pp. 30:1–30:14. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany (2018). https://doi.org/10.4230/LIPIcs.ITCS.2018.30
6. Erlebach, T., Hoffmann, M.: Query-competitive algorithms for computing with uncertainty. Bulletin of the EATCS (116) (2015)
7. Erlebach, T., Hoffmann, M., Krizanc, D., Mihalák, M., Raman, R.: Computing minimum spanning trees with uncertainty. In: Albers, S., Weil, P. (eds.) 25th International Symposium on Theoretical Aspects of Computer Science. Leibniz International Proceedings in Informatics (LIPIcs), vol. 1, pp. 277–288. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany (2008). https://doi.org/10.4230/LIPIcs.STACS.2008.1358
8. Feder, T., Motwani, R., O'Callaghan, L., Olston, C., Panigrahy, R.: Computing shortest paths with uncertainty. Journal of Algorithms (1), 1–18 (2007). https://doi.org/10.1016/j.jalgor.2004.07.005
9. Feder, T., Motwani, R., Panigrahy, R., Olston, C., Widom, J.: Computing the median with uncertainty. SIAM Journal on Computing (2), 538–547 (2003). https://doi.org/10.1137/S0097539701395668
10. Goerigk, M., Gupta, M., Ide, J., Schöbel, A., Sen, S.: The robust knapsack problem with queries.
Computers and Operations Research , 12 – 22 (2015).https://doi.org/10.1016/j.cor.2014.09.01011. Gupta, M., Sabharwal, Y., Sen, S.: The update complexity of selection and re-lated problems. In: Chakraborty, S., Kumar, A. (eds.) IARCS Annual Confer-ence on Foundations of Software Technology and Theoretical Computer Science(FSTTCS 2011). Leibniz International Proceedings in Informatics (LIPIcs), vol. 13,pp. 325–338. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Ger-many (2011). https://doi.org/10.4230/LIPIcs.FSTTCS.2011.32512. Kahan, S.: A model for data in motion. In: Proceedings of the Twenty-Third Annual ACM Symposium on Theory of Computing. p. 265277. STOC91, Association for Computing Machinery, New York, NY, USA (1991).https://doi.org/10.1145/103418.10344913. Khanna, S., Tan, W.C.: On computing functions with uncertainty. In: Proceedingsof the Twentieth ACM SIGMOD-SIGACT-SIGART Symposium on Principles ofDatabase Systems. p. 171182. PODS 01, Association for Computing Machinery,New York, NY, USA (2001). https://doi.org/10.1145/375551.37557714. Megow, N., Meiner, J., Skutella, M.: Randomization helps computing a minimumspanning tree under uncertainty. SIAM Journal on Computing (4), 1217–1240(2017). https://doi.org/10.1137/16M108837515. Olston, C., Widom, J.: Offering a precision-performance tradeoff for aggregationqueries over replicated data. In: 26th International Conference on Very Large DataBases (VLDB 2000). p. 144155. VLDB 00, Morgan Kaufmann Publishers Inc., SanFrancisco, CA, USA (2000)16. Pinedo, M.L.: Scheduling: Theory, Algorithms, and Systems. Springer InternationalPublishing, 5 edn. (2016)6 S. Albers and A. Eckl17. Singla, S.: The price of information in combinatorial optimization. In: Proceedingsof the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms. p.25232532. SODA 18, Society for Industrial and Applied Mathematics, USA (2018)18. Weitzman, M.L.: Optimal search for the best alternative. Econometrica (3),641–654 (1979). 
https://doi.org/10.2307/191041219. Yao, A.C.: Probabilistic computations: Toward a unified measure of complexity.In: 18th Annual Symposium on Foundations of Computer Science (sfcs 1977). pp.222–227 (1977)xplorable Uncertainty in Scheduling with Non-Uniform Testing Times 17 A Jobs with small upper limits
In this section, we motivate our approaches by showing why the algorithm for the uniform case cannot simply be extended to the general problem. An important insight of Dürr et al. was that jobs with small upper bounds can be scheduled immediately without testing at the beginning of any competitive algorithm. It turns out this does not generalize to non-uniform testing, which means that a new idea is necessary to deal with jobs whose upper bounds are small compared to their testing times.

Dürr et al. [5] proved for the special case of uniform testing times t_j ≡ λ that an algorithm may start by scheduling all jobs j with u_j < λ untested in increasing order of u_j. Clearly, this statement does not hold directly for the general case, since a single job with 0 < u_j < λ and p_j = t_j = 0 must be tested to yield a finite competitive ratio.

It seems intuitive to extend this idea to non-uniform testing by instead considering the ratio u_j/t_j between upper bounds and testing times. We show via a short counterexample that, for any λ ≥ 1, first scheduling all jobs with u_j/t_j < λ untested leads to an arbitrarily bad result.

Consider the following instance: Given an integer m and a small real number ε >
0, we have m jobs with u_j = p_j = λ and t_j = 1 for j = 1, ..., m. Clearly, all these jobs lie just at the limit λ and are not considered for execution at the beginning of the schedule. Additionally, we have a single extra job, job 0, with u_0 = m², t_0 = m²/λ + ε and p_0 = 0.

An algorithm obeying the small upper limit rule schedules job 0 first, since we have u_0/t_0 < λ because of ε >
0. Afterwards the remaining jobs are scheduledoptimally, meaning the algorithm runs them untested. Since the order of theremaining jobs is irrelavant, we may assume w.l.o.g. that the algorithm ordersthem by 1 , . . . , m .For the completion time of j we get C = m and for the other jobs we have C j = m + j · λ . In total the value of the algorithm is:ALG = m (cid:88) j =0 C j = C + m (cid:88) j =1 C j = m + m (cid:88) j =1 ( m + j · λ )= m + m (cid:18) λ (cid:19) + m λ j =1 , . . . , m untested and then, leaving the large job j for last, tests and runsit. We note here that the optimum only schedules j last if m /λ + ε > λ . We can guarantee this by choosing m large enough.OPT = m (cid:88) j =0 C j = m (cid:88) j =1 C j + C = m (cid:88) j =1 ( j · λ ) + λm + m λ + ε + 0= m (cid:18) λ λ (cid:19) + m λ ε Now, if we let m → ∞ and ε →
0, it is clear that

ALG/OPT → ∞.

The problem with scheduling job 0 first is that, while it might be reasonable to run it untested, all m small jobs have to wait for it to finish before being scheduled, leading to a non-optimal result. Hence, the problematic decision is the order of the jobs rather than whether the algorithm tests or not. The algorithm we propose in Section 2 takes into account the ratio between upper bounds and testing times, while additionally making sure that the execution length of both tested and untested jobs is considered in the order of the schedule.

B Lower bounds for (α, β)-SORT
We give two concise lower bounds for the performance of (α, β)-SORT.

First, we show that (1, 1)-SORT is no better than 3-competitive. Consider n jobs with u_j = p_j = 1 and t_j = 1 − ε for all jobs j. Since u_j/t_j ≥ 1, the algorithm tests every job, and since βt_j = 1 − ε < 1 = p_j, it also runs all tests before executing anything. In total, we have an algorithmic value of

ALG = n²(1 − ε) + n(n + 1)/2.

Comparatively, the optimal runtime of a job is ρ_j = 1. This leads to a value of OPT = n(n + 1)/2. Therefore, we get

ALG/OPT = [n²(1 − ε) + n(n + 1)/2] / [n(n + 1)/2] → 3 for n → ∞ and ε → 0.

Additionally, we can strengthen our analysis by proving a lower bound for the algorithm with general parameter choices α, β ≥ 1. We will show that (α, β)-SORT is no better than 2-competitive for all such values.

Consider first the case α > 2. Then the instance consisting of a single job with u_j = 2, t_j = 1 and p_j = 0 is executed untested by the algorithm, while the optimum tests the job. We have ALG/OPT = 2.

For the second case 1 ≤ α ≤ 2, we subdivide into two more cases based on the value of β. Let us start with β < 2. Consider for this n jobs with u_j = 2, t_j = 1, p_j = 2. Since α ≤ 2, all jobs are tested by the algorithm. In particular, since p_j = 2 > β = βt_j, all jobs are tested before any executions happen. In total, this gives an algorithmic value of

ALG = n·n + 2·n(n + 1)/2.

Similarly, the optimal value is OPT = 2·n(n + 1)/2, and therefore

ALG/OPT = [n² + n(n + 1)] / [n(n + 1)] → 2 for n → ∞.

Finally, consider 1 ≤ α ≤ 2 and β ≥
2. We need two sets of jobs for this. First, we have a set of n jobs with u_j = β, t_j = 1 − ε, p_j = β, similar to the previous instance. Since u_j = β ≥ α > α(1 − ε) = αt_j, these jobs are all tested by the algorithm. Since p_j = β > β(1 − ε) = βt_j, all jobs are tested before any are executed. Second, we have a set of m jobs with u_k = M, t_k = 1 + ε, p_k = 0, where M is some large number that does not play a role in either solution. Since the upper bound is large, both the optimum and the algorithm test these jobs. Because βt_k > p_j > βt_j for all jobs k ∈ [m], j ∈ [n], we know that the algorithm sorts the executions as follows: First, all jobs in [n] are tested. Then, all these jobs in [n] are executed. Finally, all jobs in [m] are tested and then immediately executed. In total, this gives us an algorithmic value of

ALG = (1 − ε)·n·(n + m) + β·n(n + 1)/2 + βn·m + (1 + ε)·m(m + 1)/2,

while the optimal value is

OPT = (1 + ε)·m(m + 1)/2 + (1 + ε)·m·n + β·n(n + 1)/2.

Now we choose m = n and let n → ∞ and ε → 0, giving us

ALG/OPT → (3β + 5)/(β + 3).

This ratio is minimal over β ≥ 2 at β = 2, with a value of 11/5 > 2, finalizing the proof of the lower bound.
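The counting in the last case can be checked numerically. The following Python sketch (the helper names are ours, not from the paper) replays the execution order derived above for the two job sets with m = n and compares the resulting ratio of the sums of completion times with (3β + 5)/(β + 3):

```python
def alg_value(n, m, beta, eps):
    """Sum of completion times of the schedule derived in the text: tests of
    the n beta-jobs, then their executions, then test plus immediate
    execution of each of the m large-bound jobs."""
    t = n * (1 - eps)          # all n tests (length 1 - eps) run first
    total = 0.0
    for j in range(1, n + 1):  # executions of the n jobs, length beta each
        t += beta
        total += t
    for k in range(1, m + 1):  # test (1 + eps), then run (p_k = 0)
        t += 1 + eps
        total += t
    return total

def opt_value(n, m, beta, eps):
    """Optimum: test and run the m jobs first, then run the n jobs untested."""
    t = total = 0.0
    for k in range(1, m + 1):
        t += 1 + eps
        total += t
    for j in range(1, n + 1):
        t += beta              # untested, rho_j = u_j = beta
        total += t
    return total

beta, eps = 2.0, 1e-6
for n in (10, 100, 1000):
    print(n, alg_value(n, n, beta, eps) / opt_value(n, n, beta, eps))
print("limit:", (3 * beta + 5) / (beta + 3))
```

For β = 2 the printed ratios approach 11/5 = 2.2 as n grows, matching the limit derived above.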
C A simple algorithm for the uniform case
In the following, we present a simpler 2-competitive algorithm for the unit-testing problem as compared to the Threshold algorithm from [5]. In the newest version of their complete paper [4], Dürr et al. add a note about this simpler algorithm, which they call DelayAll, and prove its competitive ratio. We present an alternative proof using our methods to highlight the differences between the proof techniques. The algorithm forces tests for all jobs designated for testing before executing any tested job. Since this leads to the same competitive ratio as the current best-known result, it is an argument for the relevance of testing further jobs as opposed to executing already tested jobs at any point in the schedule. Note that the optimal parameter choice of β = 1 in Algorithm 1 reflects this as well. The algorithm first runs jobs with u_j < 2 untested, then tests all remaining jobs and finally executes them in SPT order.

Algorithm 4:
Force Testing

T ← ∅, N ← ∅;
foreach j ∈ [n] do
    if u_j ≥ 2 then add j to T;
    else add j to N;
end
run all jobs j ∈ N untested;
test all jobs j ∈ T;
run all j ∈ T in order of SPT;

Lemma 3.
Force Testing is a 2-competitive algorithm for the objective of minimizing the sum of completion times in the unit-sized testing case.

Proof.
Again, we let ρ_1 ≥ ... ≥ ρ_n denote the ordered optimal running times of jobs 1, ..., n. The optimal objective value is given by (1). By Lemma 1 of [5] we may assume that all jobs fulfill u_j ≥ 2. Define π(j) to be the SPT order of the processing times p_j, such that p_{π⁻¹(1)} ≥ ... ≥ p_{π⁻¹(n)}. This means the job with the largest processing time has index 1 in this ordering, similar to the order of the ρ_j. Then the value of the algorithm can be computed as follows: Since all jobs are tested first, an amount of n·n is added to the objective. Afterwards, the last job in the ordering π (i.e. the job with the shortest processing time) contributes its processing time n times, the second-to-last job contributes its processing time n − 1 times, and so on. In total,

ALG = n² + Σ_{j∈[n]} π(j)·p_j.

The SPT rule is optimal on a single machine, see e.g. [16]. Therefore, if we re-order the jobs in the final step of the algorithm, the result can only get worse. We re-order according to the optimal order of the ρ_j and receive

Σ_{j∈[n]} π(j)·p_j ≤ Σ_{j∈[n]} j·p_j.

Using n² ≤ 2·Σ_{j=1}^{n} j, we have

ALG ≤ Σ_{j∈[n]} j·(2 + p_j).

If ρ_j = 1 + p_j, then it holds that 2 + p_j ≤ 2(1 + p_j) = 2ρ_j. Similarly, if ρ_j = u_j, then by u_j ≥ 2 and p_j ≤ u_j it holds that 2 + p_j ≤ u_j + u_j = 2ρ_j. Inserting this into our estimation, we receive

ALG ≤ 2·Σ_{j∈[n]} j·ρ_j = 2·OPT.

This analysis is tight, as can be seen by a simple example of n jobs with large upper bounds and processing times p_j = 0. While the algorithm obliviously runs all tests first and has a completion time of n for every job, the optimum tests and immediately runs every job, resulting in a value of n(n + 1)/2. By this example we also see that, while this oblivious algorithm has the same theoretical competitive ratio as the Threshold algorithm in [5], there are instances where Threshold clearly performs better.

We also note that the idea of forcing tests cannot be extended to non-unit testing times. In the presence of arbitrarily large testing times, we may not prioritize these tests over potentially small execution times.
D Proofs
D.1 Proof of Proposition 1
Proof.
(a) Let j ∈ T. Assume ρ_j = t_j + p_j. Then the result follows immediately from the non-negativity of testing and running times. On the other hand, let ρ_j = u_j. Then, since j ∈ T and α ≥ 1, we have t_j ≤ (1/α)·u_j ≤ ρ_j. By definition of the upper bound we also have p_j ≤ ρ_j.
(b) Let j ∈ T. By definition, u_j/t_j ≥ α and p^A_j = t_j + p_j. If the optimum tests j, then ρ_j = t_j + p_j and therefore p^A_j = ρ_j. If, on the other hand, ρ_j = u_j, then

p^A_j/ρ_j = (t_j + p_j)/u_j = t_j/u_j + p_j/u_j ≤ 1/α + 1.

(c) Now let j ∈ N, hence u_j/t_j < α as well as p^A_j = u_j. If the optimum doesn't test j, then ρ_j = u_j and therefore p^A_j = ρ_j. If it does, then ρ_j = t_j + p_j and

p^A_j/ρ_j = u_j/(t_j + p_j) ≤ u_j/t_j < α.

D.2 Proof of Lemma 1
Proof.
Let j ∈ [n]. By definition, the completion time C_j is the point in time in the schedule at which the entire execution of j, including a potential test in our case, is finished. Since our algorithm schedules all jobs on a single machine without waiting times, the completion time is equal to the amount of time the algorithm spends scheduling any job (including j itself) before reaching this point. Hence, by the definition of the contribution,

C_j = Σ_{k∈[n]} c(k, j).

Now let additionally k ∈ [n] and let c(k, j) be the contribution of k to the completion time of j. We do a rigorous case distinction over the possible values of c(k, j), depending on the testing status of j and k. Consider Figure 1 for an overview of the case distinction for j ∈ N and Figure 2 for an overview for j ∈ T.
Fig. 1. Case distinction for the contribution to a job j ∈ N. The connecting lines are labeled with the relation between the involved parameters. The values with blue background correspond to c(k, j) and are numbered from 1 to 5.

We start with j ∈ N. The cases are numbered exactly according to Figure 1.
1. k ∈ N, u_k ≤ u_j: The contribution is c(k, j) = u_k ≤ u_j = p^A_j ≤ αρ_j by Prop. 1(c).
2. k ∈ N, u_k > u_j: c(k, j) = 0.
3. k ∈ T, βt_k ≤ u_j, p_k ≤ u_j: c(k, j) = t_k + p_k ≤ (1 + 1/β)·u_j ≤ (1 + 1/β)·αρ_j by Prop. 1(c).
4. k ∈ T, βt_k ≤ u_j, p_k > u_j: c(k, j) = t_k ≤ (1/β)·u_j ≤ (1/β)·αρ_j by Prop. 1(c).
5. k ∈ T, βt_k > u_j: c(k, j) = 0.
By taking the maximum over the above cases, we know that for j ∈ N and any k ∈ [n]:

c(k, j) ≤ (1 + 1/β)·αρ_j. (3)
Fig. 2. Case distinction for the contribution to a job j ∈ T. The connecting lines are labeled with the relation between the involved parameters. The values with blue background correspond to c(k, j) and are numbered from 1 to 9.

Now consider j ∈ T; the cases are numbered as seen in Figure 2.
1. k ∈ N, u_k ≤ βt_j: The contribution is c(k, j) = u_k ≤ βt_j ≤ βρ_j by Prop. 1(a).
2. k ∈ N, u_k > βt_j, u_k ≤ p_j: c(k, j) = u_k ≤ p_j ≤ ρ_j by Prop. 1(a).
3. k ∈ N, u_k > βt_j, u_k > p_j: c(k, j) = 0.
4. k ∈ T, t_k ≤ t_j, p_k ≤ βt_j: c(k, j) = t_k + p_k ≤ (1 + β)·t_j ≤ (1 + β)·ρ_j by Prop. 1(a).
5. k ∈ T, t_k ≤ t_j, p_k > βt_j, p_k ≤ p_j: c(k, j) = t_k + p_k ≤ t_j + p_j = p^A_j ≤ (1 + 1/α)·ρ_j by Prop. 1(b).
6. k ∈ T, t_k ≤ t_j, p_k > βt_j, p_k > p_j: c(k, j) = t_k ≤ t_j ≤ ρ_j by Prop. 1(a).
7. k ∈ T, t_k > t_j, βt_k ≤ p_j, p_k ≤ p_j: c(k, j) = t_k + p_k ≤ (1 + 1/β)·p_j ≤ (1 + 1/β)·ρ_j by Prop. 1(a).
8. k ∈ T, t_k > t_j, βt_k ≤ p_j, p_k > p_j: c(k, j) = t_k ≤ (1/β)·p_j ≤ (1/β)·ρ_j by Prop. 1(a).
9. k ∈ T, t_k > t_j, βt_k > p_j: c(k, j) = 0.
We again take the maximum over all cases, which yields that for j ∈ T and any k ∈ [n]:

c(k, j) ≤ max(1 + 1/α, 1 + β, 1 + 1/β)·ρ_j. (4)

Combining equations (3) and (4), we achieve our desired bound:

c(k, j) ≤ max((1 + 1/β)·α, 1 + 1/α, 1 + β, 1 + 1/β)·ρ_j = max((1 + 1/β)·α, 1 + 1/α, 1 + β)·ρ_j.

D.3 Proof of Theorem 2
Proof.
As before, we let ρ_1 ≥ ... ≥ ρ_n denote the ordered optimal running times of jobs 1, ..., n. The optimal objective value is given by (1). The decision which jobs to test is exactly the same in Golden Round Robin as in the original (α, β)-SORT. By Proposition 1(b),(c), the algorithmic running time (2) of every job is bounded by

p^A_j ≤ max(α, 1 + 1/α)·ρ_j.

Minimizing this upper bound makes it clear why we set α = φ in the Golden Round Robin algorithm. We have

p^A_j ≤ φ·ρ_j. (5)

Similar to the previous proofs, we consider how much a job k contributes to the completion time C_j for any two jobs k, j. Contrary to before, the contribution now only depends on the algorithmic running times of k and j: If p^A_k ≥ p^A_j, then j finishes first in the round robin scheme and k contributes exactly p^A_j to the completion time of j. Otherwise, if p^A_k < p^A_j, then k is done first and contributes its entire running time p^A_k. We define J̄_j := {k ∈ [n] : p^A_k ≥ p^A_j} and J_j := {k ∈ [n] : p^A_k < p^A_j}. Then:

C_j = Σ_{k∈J̄_j} p^A_j + Σ_{k∈J_j} p^A_k.

Just as before, we divide the jobs into 'good' (k > j) and 'bad' (k ≤ j) jobs. For the good jobs, we use p^A_k, and for the bad jobs we instead use p^A_j. We use the properties of the sets J̄_j and J_j to estimate as follows:

C_j = Σ_{k∈J̄_j, k>j} p^A_j + Σ_{k∈J̄_j, k≤j} p^A_j + Σ_{k∈J_j, k>j} p^A_k + Σ_{k∈J_j, k≤j} p^A_k
    ≤ Σ_{k∈J̄_j, k>j} p^A_k + Σ_{k∈J̄_j, k≤j} p^A_j + Σ_{k∈J_j, k>j} p^A_k + Σ_{k∈J_j, k≤j} p^A_j
    = Σ_{k>j} p^A_k + Σ_{k≤j} p^A_j
    = Σ_{k>j} p^A_k + j·p^A_j.

For the sum of completion times, we receive

Σ_{j=1}^{n} C_j ≤ Σ_{j=1}^{n} Σ_{k=j+1}^{n} p^A_k + Σ_{j=1}^{n} j·p^A_j = Σ_{j=1}^{n} (j − 1)·p^A_j + Σ_{j=1}^{n} j·p^A_j ≤ 2·Σ_{j=1}^{n} j·p^A_j ≤ 2φ·Σ_{j=1}^{n} j·ρ_j = 2φ·OPT,

where the last inequality follows from (5).

We also show that this analysis of the Golden Round Robin algorithm is tight. For this, consider n jobs with u_j = p_j = 1 and t_j = 1/φ for all jobs. Since u_j/t_j = φ ≥ φ, the algorithm tests all jobs and therefore runs a round robin scheme on n jobs with runtime p^A_j = 1 + 1/φ = φ. We receive ALG = n²φ. On the other hand, OPT does not test anything and has a total value of OPT = n(n + 1)/2, so that

ALG/OPT = n²φ / [n(n + 1)/2] → 2φ for n → ∞.

D.4 Proof of Theorem 3
Proof.
We prove the theorem by reducing a worst-case scenario in the preemptive setting to the worst case provided for the non-preemptive setting in [5]. More specifically, we prove that, given a preemptive algorithm for the adversarial scenario defined in Section 3.2 of [5], there exists a non-preemptive algorithm with a competitive ratio at least as good as that of the given preemptive algorithm. Similarly, the optimal offline algorithm is always non-preemptive.

We are given an instance with t_j = 1 and u_j = ū for all jobs. The adversary chooses δ ∈ [0, 1] and then decides the runtime p_j of all jobs as follows:
• If a job is executed untested by the algorithm, set p_j = 0.
• If a job is tested by the algorithm, set p_j = u_j.
• If the number of jobs already decided is larger than δn, then always set p_j = 0.

This strategy can easily be extended to the preemptive case, where we just have to make sure that, once an algorithm decides to test a job, it may not retract this decision later in order to deceive the adversary. If an algorithm starts testing a job (or executing it untested) even for a very small amount of time, it must continue to abide by this decision.

Assume now that we are given a schedule produced by an algorithm ALG_pre, which may be preemptive. We know that all jobs run untested by ALG_pre have processing time p_j = 0, as do all jobs that were decided after more than δn other jobs had already been fixed. The rest of the jobs have processing time p_j = u_j.

We fix an ordering of the execution instances of the algorithm, which are defined as the exact points in time where the algorithm finishes either an untested execution, a test, or the execution of an already tested job.

We then define a new algorithm ALG*, which will be non-preemptive, by the following rule: Go through the schedule of ALG_pre starting at time 0. Whenever an execution instance is encountered in the schedule of ALG_pre, schedule the corresponding execution or test completely without preemption in ALG*.

By definition, the ordering of the execution instances stays the same from ALG_pre to ALG*. Therefore, the completion time of any job can only improve, i.e. C*_j ≤ C^pre_j. Additionally, the exact same set of jobs is tested as before, and the set of δn jobs that is decided first by the adversary is unchanged. Hence, the behavior of the adversary is the same for both algorithms and the optimal schedule does not change.

Combining these arguments, we know that the competitive ratio of the non-preemptive algorithm cannot be worse than that of the preemptive version:

ALG*/OPT ≤ ALG_pre/OPT.

To complete the proof, we cite Section 3.2 of [5], where it is proven that, for instances with t_j = 1 and u_j = ū and a suitable choice of ū, no non-preemptive algorithm against this adversary can beat the claimed lower bound. By the reduction above, the same bound carries over to the preemptive setting.

D.5 Proof of Lemma 2
Proof.
Let λ_j(β, p) be defined as in the proof of Theorem 4. We want to find values β̂ and p̂ such that max_j λ_j(β̂, p̂)/ρ_j is as small as possible.

Because ρ_j = min(u_j, t_j + p_j) is difficult to handle, we have to do a case distinction on its value. We start with ρ_j = u_j. Recall that p_j ≤ u_j and that we defined r_j = u_j/t_j. For better readability, we drop the index j of r_j during this computation. We have

λ_j(β, p)/ρ_j = [u_j + (1 + 1/β)·u_j]/u_j · (1 − p) + [t_j + p_j + max((1 + β)·t_j, (1 + 1/β)·p_j, t_j + p_j)]/u_j · p
≤ (2 + 1/β)·(1 − p) + (1/r + 1 + max((1 + β)/r, 1 + 1/β, 1 + 1/r))·p
= (1/r − 1 − 1/β + max((1 + β)/r, 1 + 1/β, 1 + 1/r))·p + 2 + 1/β,

which is a linear function in p with strictly positive slope. Now let us look at ρ_j = t_j + p_j. We utilize p_j ≥ 0 and t_j > 0:

λ_j(β, p)/ρ_j = [u_j + (1 + 1/β)·u_j]/(t_j + p_j) · (1 − p) + [t_j + p_j + max((1 + β)·t_j, (1 + 1/β)·p_j, t_j + p_j)]/(t_j + p_j) · p
≤ (r + (1 + 1/β)·r)·(1 − p) + (1 + max(1 + β, 1 + 1/β, 1))·p
= (2 + β − (2 + 1/β)·r)·p + (2 + 1/β)·r,

where we used β ≥ 1. This is a linear function in p with strictly negative slope whenever r > β(2 + β)/(2β + 1). We observe that r is the only parameter dependent on j in either of these formulas. Therefore, taking the maximum over all j as required is equivalent to taking the maximum over values of r ≥ 1, and we may choose p depending on r with 0 ≤ p ≤ 1. The minimal maximum of two linear functions where one is strictly increasing and the other strictly decreasing is attained at their intersection. In the cases where the second function is in fact increasing, the minimal maximum may be attained somewhere else and have a smaller value, but we can ignore this and still use the intersection (as long as the intersection point lies in [0, 1]).

We use the mathematical solver Mathematica for some of the following computations. Please consult Appendix E or the webpage [1] for the complete Mathematica code for this section. Using this code, we compute the intersection of the two functions and receive the following value for p:

p(r) = [r² + 2βr² − r − 2βr] / [r² + 2βr² − r − 3βr − β²r + β + βr·max((1 + β)/r, 1 + 1/β, 1 + 1/r)]. (6)

For most feasible values of β and r, this fraction lies between 0 and 1; we argue below that our final choice of p̂ fulfills this requirement.

We insert this value for p into either of the above linear functions. Afterwards, the result depends only on r and β. Since r is our job parameter, we must consider the worst case (i.e. the maximum) in dependence of r ≥ 1. Therefore, we run a parameter search for β such that this worst-case value is minimized. See the Mathematica code for the exact computation. The result of the search was β̂ ≈ 1.2. We set β = β̂ and p(r) as in equation (6). For this value of β, the definition of p(r) is non-negative for all r ≥ 1. However, for some values of r the definition is larger than 1. This is obviously not admissible and therefore we choose

p̂(r) := min(p(r), 1).

Consult Figure 3 for an illustration of this definition.

Fig. 3. Graphs of p and p̂. For values of r larger than r̂, p exceeds 1.

Since we now restricted the choice of p̂ to 1 for some values of r, we need to make sure that the maximum value of our two functions for p = 1 in these cases is smaller than the upper bound already provided. Otherwise, our worst-case estimate would increase. Using Mathematica, we determine that the value of p(r) is larger than 1 only for values r > r̂, and that max_{r>r̂} λ_j(β̂, 1)/ρ_j does not exceed the bound obtained for r ≤ r̂. In total, for all r ≥ 1,

max_j λ_j(β̂, p̂(r))/ρ_j = max( max_{1≤r≤r̂} λ_j(β̂, p(r))/ρ_j , max_{r>r̂} λ_j(β̂, 1)/ρ_j ) ≤ 3.3794.

This concludes the proof.
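The intersection computation can be cross-checked without a computer algebra system. The sketch below re-implements the two linear-in-p bounds in Python as derived above (β = 1.2 is used as an approximation of the searched parameter and is an assumption of this sketch), computes the intersection point, and verifies that both bounds agree there and that the capped choice p̂ stays in [0, 1]:

```python
def bounds(beta, r):
    """Slopes and intercepts of the two linear-in-p upper bounds on
    lambda_j / rho_j derived above."""
    m = max((1 + beta) / r, 1 + 1 / beta, 1 + 1 / r)
    a1, b1 = 1 / r - 1 - 1 / beta + m, 2 + 1 / beta             # case rho_j = u_j
    a2, b2 = 2 + beta - (2 + 1 / beta) * r, (2 + 1 / beta) * r  # case rho_j = t_j + p_j
    return a1, b1, a2, b2

def p_intersect(beta, r):
    a1, b1, a2, b2 = bounds(beta, r)
    return (b2 - b1) / (a1 - a2)

beta = 1.2  # assumed approximate result of the parameter search
for r in (1.5, 2.0, 3.0, 5.0):
    a1, b1, a2, b2 = bounds(beta, r)
    p_hat = min(p_intersect(beta, r), 1.0)  # the capped choice \hat{p}(r)
    print(r, round(p_hat, 4), round(a1 * p_hat + b1, 4), round(a2 * p_hat + b2, 4))
```

The printed values also illustrate the capping: for larger r the intersection point exceeds 1 and p̂ is clamped to 1, in line with Figure 3.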
D.6 Proof of Theorem 5
Proof.
Since every job contributes the same amount to the objective regardless of where in the schedule it is placed, we can assume that worst-case instances consist of only a single job. This statement is formalized in Lemma 20 of the full version of Dürr et al. [4]. We therefore consider a single job j with upper bound u_j, processing time p_j and testing time t_j.

As seen in some of the previous proofs, a case distinction on the value of OPT = ρ_j = min(u_j, t_j + p_j) is usually a good strategy. Therefore, consider first OPT = u_j. If the algorithm tests j, then by definition ALG = t_j + p_j as well as r_j ≥ φ. Hence

ALG/OPT = (t_j + p_j)/u_j ≤ 1/r_j + 1 ≤ 1/φ + 1 = φ,

where we used p_j ≤ u_j and the defining property of the golden ratio. If, on the other hand, the algorithm does not test j, then ALG = u_j = OPT.

Now consider OPT = t_j + p_j. If the algorithm tests j, then ALG = t_j + p_j = OPT. If it does not, then we have ALG = u_j as well as r_j < φ, and therefore

ALG/OPT = u_j/(t_j + p_j) ≤ u_j/t_j = r_j < φ,

where we used p_j ≥ 0.

The analysis is tight: consider a single job j with u_j = φ and t_j = 1. An algorithm that tests the job has competitive ratio (1 + φ)/φ = 1 + 1/φ = φ if the adversary sets p_j = u_j. An algorithm that doesn't test j has competitive ratio φ for the case p_j = 0.

D.7 Proof of Theorem 6
Proof.
Consider a worst-case instance consisting of a single job j with upper bound u_j, processing time p_j, and testing time t_j. We can compute the expected value of the algorithmic solution:

E[ALG] = (t_j + p_j)·p + u_j·(1 − p).

Again, we do a case distinction. If OPT = u_j, then

E[ALG]/OPT = [(t_j + p_j)·p + u_j·(1 − p)]/u_j ≤ (1/r_j + 1)·p + 1 − p.

Otherwise, if OPT = t_j + p_j, then

E[ALG]/OPT = [(t_j + p_j)·p + u_j·(1 − p)]/(t_j + p_j) ≤ p + r_j·(1 − p).

To achieve a good competitive ratio, we want to minimize the maximum of these two functions. We again do this by computing their intersection point as in the proof of Lemma 2. To simplify the presentation, we only show that the resulting value for p is indeed optimal: We insert p = 1 − 1/(r_j² − r_j + 1) into both expressions and receive, after a bit of algebra,

E[ALG]/OPT ≤ r_j²/(r_j² − r_j + 1).

This function is maximized at r_j = 2 with value 4/3.

For the lower bound, consider the distribution over instances consisting of a single job with u_j = 2, t_j = 1 that has p_j = 0 and p_j = 2, both with probability 1/2. Both the deterministic algorithm that tests the job and the one that doesn't test have expected makespan 2. The optimal solution has an expected value of 3/2. Therefore, every deterministic algorithm is at best 4/3-competitive against this distribution, and by Yao's principle [19] no randomized algorithm can be better than 4/3-competitive.

E Mathematica code for the randomized algorithm
See appended pages.

In[1]:= f1[beta_, r_, p_] := p (1/r - 1 - 1/beta + Max[(1 + beta)/r, 1 + 1/beta, 1 + 1/r]) + 2 + 1/beta

In[2]:= f2[beta_, r_, p_] := p (2 + beta - (2 + 1/beta) r) + (2 + 1/beta) r

In[3]:= (* test for positive slope of f1 *)
        Reduce[{1/r - 1 - 1/beta + Max[(1 + beta)/r, 1 + 1/beta, 1 + 1/r] > 0, beta >= 1, r >= 1}]
Out[3]= beta >= 1 && r >= 1

In[4]:= (* test for negative slope of f2 *)
        Reduce[{2 + beta - (2 + 1/beta) r < 0, beta >= 1, r >= 1}]
Out[4]= beta >= 1 && r > (2 beta + beta^2)/(1 + 2 beta)

In[5]:= (* find intersection of f1 and f2 *)
        Reduce[{f1[beta, r, p] == f2[beta, r, p], r >= 1, beta >= 1}]
Out[5]= ... && p == (-r - 2 beta r + r^2 + 2 beta r^2) / (beta - r - 3 beta r - beta^2 r + r^2 + 2 beta r^2 + beta r Max[(1 + beta)/r, 1 + 1/beta, 1 + 1/r])

In[6]:= (* insert the value for p into f1 *)
        fun[beta_?NumericQ] := Maximize[{f1[beta, r, (-r - 2 beta r + r^2 + 2 beta r^2) / (beta - r - 3 beta r - beta^2 r + r^2 + 2 beta r^2 + beta r Max[(1 + beta)/r, 1 + 1/beta, 1 + 1/r])], r >= 1}, r][[1]]

In[7]:= Plot[{fun[beta]}, {beta, 0, 3}, PlotRange -> {3, 8}]
Out[7]= (plot)

In[8]:= (* find beta >= 1 such that fun[beta] is minimized *)
        NMinimize[{fun[beta], 2 >= beta >= 1}, beta, MaxIterations -> ...]
Out[8]= {..., {beta -> ...}}

In[9]:= beta = ...;

In[10]:= (* find the maximizing value of r *)
         Maximize[{f1[beta, r, (-r - 2 beta r + r^2 + 2 beta r^2) / (beta - r - 3 beta r - beta^2 r + r^2 + 2 beta r^2 + beta r Max[(1 + beta)/r, 1 + 1/beta, 1 + 1/r])], r >= 1}, r]
Out[10]= {..., {r -> ...}}

In[11]:= (* for completeness, also compute the worst case for f2 *)
         Maximize[{f2[beta, r, (-r - 2 beta r + r^2 + 2 beta r^2) / (beta - r - 3 beta r - beta^2 r + r^2 + 2 beta r^2 + beta r Max[(1 + beta)/r, 1 + 1/beta, 1 + 1/r])], r >= 1}, r]
Out[11]= {..., {r -> ...}}

In[14]:= (* insert beta into p *)
         pfun[r_] := (-r - 2 beta r + r^2 + 2 beta r^2) / (beta - r - 3 beta r - beta^2 r + r^2 + 2 beta r^2 + beta r Max[(1 + beta)/r, 1 + 1/beta, 1 + 1/r])

In[15]:= (* since p is > 1 for some r, cap it at 1 *)
         phatfun[r_] := Min[pfun[r], 1]

In[16]:= Plot[{pfun[r], phatfun[r]}, {r, 1, 10}, PlotRange -> {...}, AxesLabel -> {r, ...}, PlotLabels -> {"p", "p^"}]
Out[16]= (plot)

In[17]:= (* compute the interval where p < 1 *)
         Reduce[pfun[r] < 1]
         Reduce: Reduce was unable to solve the system with inexact coefficients. The answer was obtained by solving a corresponding exact system and numericizing the result.
Out[17]= ... < r < ...

In[18]:= (* compute the interval where p > 1 *)
         Reduce[pfun[r] > 1]
         Reduce: Reduce was unable to solve the system with inexact coefficients. The answer was obtained by solving a corresponding exact system and numericizing the result.
Out[18]= r > ...

In[19]:= rhat = ...;

In[20]:= (* make sure the value of f1 for large r isn't larger than our current worst case *)
         Maximize[{f1[beta, r, 1], r >= rhat}, r]
Out[20]= {..., {r -> ...}}

In[21]:= (* make sure the value of f2 for large r isn't larger than our current worst case *)
         Maximize[{f2[beta, r, 1], r >= rhat}, r]
Out[21]= {..., {r -> ...}}