The VCG Mechanism for Bayesian Scheduling
Yiannis Giannakopoulos †  Maria Kyropoulou ‡

March 28, 2017
Abstract
We study the problem of scheduling m tasks to n selfish, unrelated machines in order to minimize the makespan, where the execution times are independent random variables, identical across machines. We show that the VCG mechanism, which myopically allocates each task to its best machine, achieves an approximation ratio of O(ln n / ln ln n). This improves significantly on the previously best known bound of O(m/n) for prior-independent mechanisms, given by Chawla et al. [STOC'13] under the additional assumption of Monotone Hazard Rate (MHR) distributions. Although we demonstrate that this is in general tight, if we do maintain the MHR assumption, then we get improved, (small) constant bounds for m ≥ n ln n i.i.d. tasks, while we also identify a sufficient condition on the distribution that yields a constant approximation ratio regardless of the number of tasks.

We consider the problem of scheduling tasks to machines, where the processing times of the tasks are stochastic and the machines are strategic. The goal is to minimize the expected completion time (a.k.a. makespan) of any machine, where the expectation is taken over the randomness of the processing times and the possible randomness of the mechanism. We are interested in the performance, i.e. the expected makespan, of truthful mechanisms compared to the optimal algorithm that does not take the incentives of the machines into consideration. This problem, which we call the
Bayesian scheduling problem, was previously considered by Chawla et al. [8].

Scheduling problems constitute a very rich and intriguing area of research [21]. In one of the most fundamental cases, the goal is to schedule m tasks to n parallel machines while minimizing the makespan, when the processing times of the tasks are selected by an adversary in an arbitrary way and can depend on the machine to which they are allocated. However, the assumption that the machines will blindly follow the instructions of a central authority (scheduler) was eventually challenged, especially due to the rapid growth of the Internet and its use as a primary computing platform. Motivated by this, in their seminal paper Nisan and Ronen [29] introduced a mechanism-design approach to the scheduling problem: the processing times of the tasks are now private information of the machines, and each machine declares to the mechanism how much time it requires to execute each task. The mechanism then outputs the allocation of tasks to machines, as well as monetary compensations to the machines for their work, based solely on these declarations. In fact, the mechanism has to decide the output in advance, for any possible matrix of processing times the machines can report. Each machine is assumed to be rational and strategic, so, given the mechanism and the true processing times,

∗ Supported by ERC Advanced Grant 321171 (ALGAME) and EPSRC grant EP/M008118/1. A preliminary version of this paper appeared in WINE'15 [18].
† Department of Computer Science, University of Liverpool. Email: [email protected]
A significant part of this work was carried out while the first author was a PhD student at the University of Oxford.
‡ Department of Computer Science, University of Oxford. Email: [email protected]

its declarations are chosen in order to minimize the processing time/cost it has to spend for the execution of the allocated tasks minus the payment it will receive. In this scenario, the goal is to design a truthful mechanism that minimizes the makespan; truthful mechanisms define the allocation and payment functions so that the machines don't have an incentive to misreport their true processing-time capabilities. We will refer to this model as the prior free scheduling problem, as opposed to the stochastic model we discuss next.

In the Bayesian scheduling problem [8], the time a specific machine requires in order to process a task is drawn from a distribution. We consider one of the fundamental questions posed by the algorithmic mechanism design literature, which is about quantifying the potential performance loss of a mechanism due to the requirement for truthfulness. In the Bayesian scheduling setting, this question translates to:
What is the maximum ratio (for any distribution of processing times) of the expected makespan of the best truthful mechanism over the expected optimal makespan (ignoring the machines' incentives)?
In this paper we tackle this question by considering a well-known and natural truthful mechanism, the
Vickrey-Clarke-Groves mechanism (VCG) [34, 11, 20]. VCG can be defined for very general mechanism design settings. In the special case of scheduling unrelated machines, it has a very simple interpretation: greedily and myopically allocate each task to a machine that minimizes its processing time. It is a well-known fact that VCG is a truthful mechanism in a very strong sense; truth-telling is a dominant strategy for the machines. Because of the notorious lack of characterization results for truthfulness in restricted domains such as scheduling, VCG (or more generally, affine maximizers) is the standard and obvious choice to consider for the Bayesian scheduling problem. We stress here that for the scheduling domain (and for any additive domain) the VCG allocation and payments can be computed in polynomial time. Also, it is important to note that VCG is a prior-independent mechanism, i.e. it does not require any knowledge of the prior distribution from which the processing times are drawn.

Prior-independence is a very strong property, and is an important feature for mechanisms used in stochastic settings. Being robust with respect to prior distributions facilitates applicability in real systems, while at the same time bypassing the pessimistic inapproximability results of worst-case analysis. The idea is that we would like the mechanisms we use, without relying on any knowledge of the distribution of the processing times of the tasks, to still perform well compared to the optimal mechanism that is tailored for the particular distribution.

Chawla et al. [8] were the first to examine the Bayesian scheduling problem while considering the importance of prior-independence. They study the following two mechanisms:
Bounded overload with parameter c. Allocate tasks to machines such that the sum of the processing times of all tasks is minimized, subject to placing at most ⌈cm/n⌉ tasks at any machine.

Sieve and bounded overload with parameters c, β, and δ. Fix a partition of the machines into two sets of sizes (1 − δ)n and δn. Ignoring all processing times which exceed β (i.e. setting them equal to infinity), run VCG on the first set of machines (equivalently, run VCG on the first set of machines plus a dummy machine with processing time β on all tasks; the case where a task has processing time exactly equal to β can be ignored without loss of generality for continuous distributions). For the tasks that remain unallocated, run the bounded overload mechanism with parameter c on the second set of machines.

The above mechanisms are inspired by maximal-in-range (affine maximizer) mechanisms [30] and threshold mechanisms, as these are essentially the only non-trivial truthful mechanisms we know for the scheduling domain. One would expect that the simplest of these mechanisms, which is the VCG mechanism, would be the first to be considered. Indeed, VCG is the most natural, truthful, simple, polynomial-time computable, and prior-independent mechanism. Still, its performance for the Bayesian scheduling problem had not been analyzed before.

Our results.
We prove an asymptotically tight bound of Θ(ln n / ln ln n) for the approximation ratio of VCG for the Bayesian scheduling problem, under the sole assumption that the machines are a priori identical. This bound is achieved by showing that the worst-case input for VCG is actually one where the tasks are all of unit weight (point-mass distributions). This resembles a balls-in-bins type scenario from which the bound is implied.

Whenever the processing times of the tasks are i.i.d. and drawn from an MHR continuous distribution, VCG is shown to be 2(1 + n ln n / m)-approximate for the Bayesian scheduling problem. This immediately implies a constant bound of at most 4 when m ≥ n ln n. We also get an improved bound of 1 + √2 when m ≥ n², using a different approach. For the complementary case of m ≤ n ln n, we identify a property of the distribution of processing times such that VCG again achieves a constant approximation. We observe that important representatives of the class of MHR distributions, that is, the uniform distribution on [0, 1] as well as exponential distributions, do satisfy this property, so for these distributions VCG is 4-approximate regardless of the number of tasks. We note however that this is not the case for all MHR distributions.

The continuity assumption plays a fundamental role in the above results. In particular, we give a lower bound of Ω(ln n / ln ln n) for the case of i.i.d. processing times that uses a discrete MHR distribution. Finally, we also consider the bounded overload and the sieve and bounded overload mechanisms that were studied by Chawla et al. [8], and present some instances that lower-bound their performance.

Related Work.
One of the fundamental papers on the approximability of scheduling with unrelated machines is by Lenstra et al. [25], who provide a polynomial-time algorithm that approximates the optimal makespan within a factor of 2. They also prove that it is NP-hard to approximate the optimal makespan within a factor of 3/2. In the strategic setting, Nisan and Ronen [29] showed that VCG achieves an n-approximation of the optimal makespan, while no truthful mechanism can achieve an approximation ratio better than 2. Note that the upper bound immediately carries over to the Bayesian and the prior-independent scheduling case. The lower bound has been improved by Christodoulou et al. [10] and Koutsoupias and Vidali [23] to 2.61, while Ashlagi et al. [2] prove the tightness of the upper bound for deterministic anonymous mechanisms. In contrast to the negative result on the prior free setting presented in [2], truthful mechanisms can achieve sublinear approximation when the processing times are stochastic. In fact, we prove here that VCG can achieve a sublogarithmic approximation, and even a constant one in some cases, while similar bounds for other mechanisms have also been presented by Chawla et al. [8].

For the special case of related machines, where the private information of each machine is a single value, Archer and Tardos [1] were the first to give a 3-approximation truthful-in-expectation mechanism, while by now truthful PTASes are also known [9, 14, 17]. Putting computational considerations aside, the best truthful mechanism in this single-dimensional setting is also optimal. Lavi and Swamy [24] managed to prove a constant approximation ratio for a special, yet multi-dimensional, scheduling problem; they consider the case where the processing times of each task can take one of two fixed values. Yu [35] then generalized this result to two-range-values, while Lu and Yu [27] and Lu [26] gave constant (better than 1.6) bounds for the case of two machines.

Daskalakis and Weinberg [12] consider computationally tractable approximations with respect to the best (Bayesian) truthful mechanism when the processing times of the tasks follow distributions (with finite support) that are known to the mechanism designer. In fact the authors provide a reduction of this problem to an algorithmic problem. Chawla et al. [7] showed that there can be no approximation-preserving reductions from mechanism design to algorithm design for the makespan objective; however, the authors in [12] bypass this inapproximability by considering the design of bi-criterion approximation algorithms.

Prior-independent mechanisms have been mostly considered in the context of optimal auction design, where the goal is to design an auction mechanism that maximizes the seller's revenue. Inspired by the work of Dhangwatnotai et al. [15], Devanur et al. [13] and Roughgarden et al. [32] independently provide approximation mechanisms for multi-dimensional settings, with recent follow-up work by Goldner and Karlin [19] and Azar et al. [4]. Moreover, Dughmi et al. [16] identify conditions under which VCG obtains a constant fraction of the optimal revenue, while Hartline and Roughgarden [22] prove Bulow-Klemperer type results for VCG. Prior-robust optimization is also discussed by Sivan [33].

Chawla et al. [8] are the first to consider prior-independent mechanisms for the (Bayesian) scheduling problem. They introduce two variants of the VCG mechanism and bound their approximation ratios. In particular, the bounded overload mechanism is prior-independent and achieves an O(m/n) approximation of the expected optimal makespan when the processing times of the tasks are drawn from machine-identical MHR distributions. For the case where the processing times of the tasks are i.i.d.
from an MHR distribution, the authors prove that sieve and bounded overload mechanisms can achieve an O(√ln n) approximation of the expected optimal makespan, as well as an approximation ratio of O((ln ln n)²) under the additional assumption that there are at least n ln n tasks. We note that to achieve these improved approximation ratios, a sieve and bounded overload mechanism needs to have access to a small piece of information regarding the distribution of the processing times, in particular the expectation of the minimum of a certain number of draws (in contrast to VCG, which requires no distributional information whatsoever).

The VCG mechanism is strongly represented in the above works. Its simplicity and amenability to practice strongly motivate a detailed analysis of its performance for the Bayesian scheduling problem. From our results, it turns out that in general VCG performs better than the previously analyzed prior-independent mechanisms, applies to wider settings with fewer restrictions on the distributions and, of course, it is simpler. To summarize and clarify this comparison with the previous prior-independent mechanisms of Chawla et al. [8], we note that the only case where VCG demonstrates a worse approximation ratio is when the number of tasks is asymptotically very close to that of the machines, in particular m = o(n ln n / ln ln n), and, in addition, we are in a restricted setting where the execution times have to be drawn from necessarily non-identical, MHR distributions. For example, for m = n tasks with processing times drawn from machine-identical MHR distributions which however differ across tasks, the bounded overload mechanism of Chawla et al. [8] would be O(1)-approximate, while VCG would have an approximation ratio of Θ(ln n / ln ln n).
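The Θ(ln n / ln ln n) balls-in-bins behaviour invoked above is easy to observe empirically. The following is an illustrative Monte Carlo sketch (the code and function names are ours, not part of the paper's analysis):

```python
import math
import random

def expected_max_load(n, m, trials=2000, seed=0):
    """Monte Carlo estimate of the expected maximum load when m balls are
    thrown into n bins uniformly at random. On identical tasks, VCG with
    i.i.d. processing times allocates exactly like this."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        loads = [0] * n
        for _ in range(m):
            loads[rng.randrange(n)] += 1
        total += max(loads)
    return total / trials

# For m = n the estimate tracks the ln n / ln ln n growth rate.
for n in (10, 100, 1000):
    print(n, expected_max_load(n, n), math.log(n) / math.log(math.log(n)))
```

The simulated maximum load grows slowly with n, in line with the sublogarithmic rate, while the average load per bin stays at 1.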
However, a point worth mentioning here is that the constant hidden within the O(1) notation above is 800, while the one in the upper bound O(ln n / ln ln n) of VCG comes directly from a balls-in-bins analysis and is therefore 1 + o(1).

The model. Assume that we have n unrelated parallel machines and m ≥ n tasks that need to be scheduled to these machines. Let t_ij denote the processing time of task j on machine i. In the Bayesian scheduling problem, each t_ij is independently drawn from some probability distribution D_ij. In this paper we mainly consider the machine-identical setting, that is, the processing times of a specific task j are drawn from the same distribution D_j for all machines. This is a standard assumption for the problem (see also [8]). We also consider the case where both machines and tasks are a priori identical, and the processing times t_ij are all i.i.d. drawn from the same distribution D. The goal is to design a truthful mechanism that minimizes the expected makespan of the schedule.

We consider the VCG mechanism, the most natural and standard choice for a truthful mechanism. Thus, we henceforth assume that the machines always declare their true processing times. VCG minimizes the total workload by allocating each task to the machine that minimizes its processing time. So, if α denotes the allocation function of VCG (we omit the dependence on t for clarity of presentation) then, for any task j, α_ij = 1 for some machine i such that t_ij = min_{i'} t_{i'j}, and otherwise α_ij = 0. Without loss of generality we assume that in case of a tie, the machine is chosen uniformly at random. The expected makespan of VCG is then computed as

E[VCG(t)] = E[ max_i Σ_{j=1}^m α_ij t_ij ].

In what follows, we use the variable Y_{i,j} to denote the processing time of task j on machine i under VCG, that is, Y_{i,j} = α_ij t_ij.
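Concretely, the VCG allocation just described is a per-task argmin with uniform tie-breaking. The following minimal sketch (our code; names are illustrative) makes this explicit:

```python
import random

def vcg_allocation(t, rng=random):
    """Given a matrix t with t[i][j] = processing time of task j on machine i,
    return for each task the machine VCG assigns it to: a machine attaining
    min_i t[i][j], with ties broken uniformly at random."""
    n, m = len(t), len(t[0])
    assignment = []
    for j in range(m):
        best = min(t[i][j] for i in range(n))
        winners = [i for i in range(n) if t[i][j] == best]
        assignment.append(rng.choice(winners))
    return assignment

def vcg_makespan(t, assignment):
    """Makespan of the schedule produced by the allocation above."""
    loads = [0.0] * len(t)
    for j, i in enumerate(assignment):
        loads[i] += t[i][j]
    return max(loads)

# Example: 2 machines, 3 tasks (no ties, so the outcome is deterministic).
t = [[1.0, 5.0, 2.0],
     [3.0, 1.0, 1.0]]
a = vcg_allocation(t)          # → [0, 1, 1]
print(vcg_makespan(t, a))      # → 2.0 (machine 1 receives tasks 1 and 2)
```

Note that the allocation is myopic: each task is placed independently of the others, which is exactly what enables the balls-in-bins analysis later in the paper.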
We also denote by Y_i = Σ_{j=1}^m Y_{i,j} the workload of machine i.

Note that in the machine-identical setting α_ij = 1 with probability 1/n for any task j. So, VCG exhibits a balls-in-bins type behaviour in this setting, as the machine that minimizes the processing time of each task, and hence the machine that will be allocated the task, is chosen uniformly at random for each task. We thus know from traditional balls-in-bins analysis that the expected maximum number of tasks allocated to any machine will be Θ(ln n / ln ln n) whenever m = Θ(n). For more precise balls-in-bins type bounds see Raab and Steger [31]. We will use the following theorem to prove in Section 3 that the above instance that yields the Θ(ln n / ln ln n) bound is actually the worst-case scenario for VCG:

Theorem 1 (Berenbrink et al. [6]). Consider two vectors w ∈ R^m and w′ ∈ R^{m′} with m′ ≤ m, with their values in non-increasing order (that is, w_1 ≥ w_2 ≥ ... ≥ w_m and w′_1 ≥ w′_2 ≥ ... ≥ w′_{m′}). If the following two conditions hold:
(i) Σ_{j=1}^m w_j = Σ_{j=1}^{m′} w′_j,
(ii) Σ_{j=1}^k w′_j ≥ Σ_{j=1}^k w_j for all k ∈ [m′],
then the expected maximum load when allocating m′ balls with weights according to w′ is at least equal to the expected maximum load when allocating m balls with weights according to w, uniformly at random to the same number of bins.

Following [6], we say that vector w′ majorizes w whenever w and w′ satisfy Conditions (i) and (ii) of Theorem 1.

Probability preliminaries.
We now give some additional notation regarding properties of distributions that will be used in the analysis.

Let T be a random variable following a probability distribution D. Assuming we perform n independent draws from D, we use T[r : n] to denote the r-th order statistic (the r-th smallest) of the resulting values, following the notation from [8]. In particular, T[1 : n] denotes the minimum of n draws from D, while T[1 : n][m : m] denotes the maximum value of m independent experiments, where each one is the minimum of n draws from D. Note that for t_ij ∼ D_j, the expected processing time of machine i for task j under VCG is

E[Y_{i,j}] = Pr[α_ij = 1] · E[t_ij | α_ij = 1] = (1/n) E[T[1 : n]].   (1)

We note here that for continuous distributions, such events of ties occur with zero probability.
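Equation (1) can be verified numerically. For instance, for T uniform on [0, 1] we have E[T[1 : n]] = 1/(n + 1), so each task contributes 1/(n(n + 1)) in expectation to a given machine. A small Monte Carlo sketch (our code, for illustration only):

```python
import random

def mean_min(n, draws=100000, seed=1):
    """Monte Carlo estimate of E[T[1:n]] for T ~ Uniform[0, 1]."""
    rng = random.Random(seed)
    return sum(min(rng.random() for _ in range(n)) for _ in range(draws)) / draws

n = 5
est = mean_min(n)
print(est, 1 / (n + 1))   # estimate vs. the exact value 1/(n+1)
# By (1), the expected contribution of this task to a fixed machine
# under VCG is est / n, i.e. roughly 1/(n(n+1)).
```

The same routine, applied with other base distributions, can be used to sanity-check the order-statistic quantities T[1 : n][m : m] appearing below.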
In this work we also consider the class of probability distributions that have a monotone hazard rate (MHR). A continuous distribution with pdf f and cdf F is MHR if its hazard rate h(x) = f(x)/(1 − F(x)) is a (weakly) increasing function. The definition for discrete MHR distributions is similar, only the hazard rate of a discrete distribution is defined as h(x) = Pr[X = x]/Pr[X ≥ x] (see e.g. Barlow et al. [5]). The following two technical lemmas demonstrate properties of MHR distributions. The proofs can be found in Appendix A.

Lemma 1. If T is a continuous MHR random variable, then for every positive integer n, its first order statistic T[1 : n] is also MHR.

Lemma 2. For any continuous MHR random variable X and any positive integer r, E[X^r] ≤ r! · E[X]^r.

We now introduce the notion of k-stretched distributions. The property that identifies these distributions plays an important role in the approximation ratio of VCG, as we will see later in the analysis (Theorem 7).

Definition 1.
Given a function k over the positive integers, we call a distribution k-stretched if its order statistics satisfy E[T[1 : n][n : n]] ≥ k(n) · E[T[1 : n]] for all positive integers n.

We will use the following result by Aven to bound the expected makespan of VCG.

Theorem 2 (Aven [3]). If X_1, X_2, ..., X_n are (not necessarily independent) random variables with mean μ and variance σ², then E[max_i X_i] ≤ μ + √(n − 1) · σ.

Finally, we use the notation introduced in the probability preliminaries to present some known bounds on the expected optimal makespan. So, if given a matrix of processing times t we denote its optimal makespan by OPT(t), we wish to bound E_t[OPT(t)] (we omit the dependence on t for clarity of presentation). Part of the notorious difficulty of the scheduling problem stems exactly from the lack of general, closed-form formulas for the optimal makespan. However, the following two easy lower bounds are widely used (see e.g. [8]):

Observation 3.
If the processing times are drawn from machine-identical distributions, then the expected optimal makespan is bounded by

E[OPT] ≥ max{ E[max_j T_j[1 : n]], (1/n) Σ_{j=1}^m E[T_j[1 : n]] },

where T_j follows the distribution corresponding to task j.

In this section we provide results on the performance of the VCG mechanism for the Bayesian scheduling problem, under different assumptions on the number of tasks (compared to the machines) and different distributional assumptions on their processing times. Our first result shows that VCG is O(ln n / ln ln n)-approximate in the general case, without assuming identical tasks or even MHR distributions. We then consider some additional assumptions under which VCG achieves a constant approximation of the expected optimal makespan. In what follows, an allocation where all machines have the same workload will be called fully balanced.

Theorem 4. VCG is O(ln n / ln ln n)-approximate for the Bayesian scheduling problem with n a priori identical machines.

As we will see later in Theorem 10, this result is in general tight. In order to prove Theorem 4 we will make use of the following lemma:
Lemma 3. If VCG is ρ-approximate for the prior free scheduling problem with identical machines on inputs for which the optimal allocation is fully balanced, then VCG is ρ-approximate for the Bayesian scheduling problem where the machines are a priori identical.

Proof. We will show that for any instance of Bayesian scheduling with a priori identical machines, there exists a prior free scheduling instance with identical machines for which the approximation ratio of VCG is at least the same. In fact, there exists such a prior free instance for which the optimal allocation is fully balanced.

Consider a Bayesian scheduling instance where t_ij ∼ D_j for tasks j ∈ [m] and machines i ∈ [n]. Let ρ ≥ 1 be such that E_t[VCG(t)] = ρ · E_t[OPT(t)]. Then, conditioning on the minimum processing times of the tasks, there exists an m-dimensional vector (t*_1, ..., t*_m) such that

E_t[ VCG(t) | min_i t_{i1} = t*_1 ∧ ... ∧ min_i t_{im} = t*_m ] ≥ ρ · E_t[ OPT(t) | min_i t_{i1} = t*_1 ∧ ... ∧ min_i t_{im} = t*_m ].

Notice that, once such a minimum processing time t*_j has been fixed for all tasks, the only randomization remaining within the expected makespan of VCG is the one with respect to the identities of the machines having processing time t*_j and the possible internal tie-breaking; thus, if we let t* denote the time matrix where task j has processing time t_ij = t*_j for all machines i, it holds that

E_t[ VCG(t) | min_i t_{i1} = t*_1 ∧ ... ∧ min_i t_{im} = t*_m ] = VCG(t*).
Also, once we have fixed the smallest element in every column of an input matrix t (a column contains the processing times of a single task on all the machines), reducing all other values of a column j to be equal to that minimum value t*_j can only improve the optimal makespan, so

E_t[ OPT(t) | min_i t_{i1} = t*_1 ∧ ... ∧ min_i t_{im} = t*_m ] ≥ OPT(t*).

Combining the above, we get that indeed
VCG(t*) ≥ ρ · OPT(t*).

It remains to be shown that, without loss of generality, t* gives rise to an optimal (prior free) allocation that is fully balanced, that is, all machines have exactly the same workload (equal to the optimal makespan). Indeed, if that is not the case, then for any machine whose workload is strictly below the optimal makespan, we can slightly increase the processing time t*_j of one of its tasks j without affecting the optimal makespan, while at the same time that increase can only make the performance of VCG worse.

We are now ready to prove Theorem 4. Lemma 3 essentially reduces the analysis of
VCG for the Bayesian scheduling problem with identical machines to that of a simple weighted balls-in-bins setting:
Proof of Theorem 4.
From Lemma 3, it is enough to analyze the performance of VCG on input matrices where the processing time of each task is the same across all machines and the optimal schedule is fully balanced. Without loss of generality (by scaling) it can further be assumed that the optimal makespan is exactly 1. Then, since VCG breaks ties uniformly at random, the problem reduces to analyzing the expected maximum (weighted) load when throwing m balls with weights (w_1, ..., w_m) = w uniformly at random into n bins, where Σ_{j=1}^m w_j = n. By Theorem 1, that maximum load is upper bounded by the expected maximum load of throwing n (unit-weight) balls into n bins, because the n-dimensional unit vector 1_n majorizes w: 1_n's components sum up to n, and also w_j ≤ 1 for all j (due to the assumption that the optimal makespan is 1). By classic balls-in-bins results (see e.g. [28, 31]), the expected maximum load of any machine is upper bounded by Θ(ln n / ln ln n).

We now focus on the special but important case where both tasks and machines are a priori identical:

Theorem 5.
VCG is 2(1 + n ln n / m)-approximate for the Bayesian scheduling problem with i.i.d. processing times drawn from a continuous MHR distribution.

Proof. Let T be a random variable following the distribution from which the execution times t_ij are drawn. Following the notation introduced above, the workload of a machine i is given by the random variable Y_i = Σ_{j=1}^m Y_{i,j}. Then, for the expected makespan E[max_i Y_i] and any real s > 0,

e^{s · E[max_i Y_i]} ≤ E[e^{s · max_i Y_i}] = E[max_i e^{s Y_i}] ≤ Σ_{i=1}^n E[e^{s Y_i}] = Σ_{i=1}^n Π_{j=1}^m E[e^{s Y_{i,j}}] = n · E[e^{s Y_{1,1}}]^m,   (2)

where we have used Jensen's inequality based on the convexity of the exponential function, and the fact that for a fixed machine i the random variables Y_{i,j}, j = 1, ..., m, are independent (the processing times are i.i.d. and VCG allocates each task independently of the others). We now bound the term E[e^{s Y_{1,1}}]:

E[e^{s Y_{1,1}}] = E[ Σ_{r=0}^∞ (s Y_{1,1})^r / r! ] = 1 + Σ_{r=1}^∞ s^r E[Y_{1,1}^r] / r! = 1 + (1/n) Σ_{r=1}^∞ s^r E[T[1:n]^r] / r! ≤ 1 + (1/n) Σ_{r=1}^∞ (s · E[T[1:n]])^r,

where for the last inequality we have used the fact that the first order statistic of an MHR distribution is also MHR (Lemma 1), together with Lemma 2. Then, by choosing s = s* ≡ 1/(2 E[T[1:n]]) we get that

E[e^{s* Y_{1,1}}] ≤ 1 + (1/n) Σ_{r=1}^∞ 2^{−r} = 1 + 1/n,

and (2) yields

E[max_i Y_i] ≤ ln( n · E[e^{s* Y_{1,1}}]^m ) / s* ≤ 2 ln( n (1 + 1/n)^m ) E[T[1:n]] ≤ 2 ln( n e^{m/n} ) E[T[1:n]] = 2 (ln n + m/n) E[T[1:n]].   (3)

But from Observation 3 we know that E[OPT] ≥ (m/n) E[T[1:n]] for the case of i.i.d. execution times, and the theorem follows.

Notice that Theorem 5 in particular implies that VCG achieves a small, constant approximation ratio whenever the number of tasks is slightly more than that of machines:

Corollary 6.
VCG is 4-approximate for the Bayesian scheduling problem with m ≥ n ln n i.i.d. tasks drawn from a continuous MHR distribution.

We next deal with the complementary case of m ≤ n ln n. Recall the notion of k-stretched distributions introduced in Definition 1.

Theorem 7.
VCG is (4 ln n / k(n))-approximate for the Bayesian scheduling problem with m ≤ n ln n i.i.d. tasks drawn from a k-stretched MHR distribution.

Proof. From (3) we can deduce that the approximation ratio of VCG is upper bounded by

2 (ln n + m/n) E[T[1:n]] / E[OPT] ≤ 2 (ln n + m/n) E[T[1:n]] / E[T[1:n][m:m]] ≤ 2 (ln n + m/n) E[T[1:n]] / E[T[1:n][n:n]] ≤ 2 (ln n + m/n) / k(n) ≤ 4 ln n / k(n),

where we have used Observation 3 and the fact that n ≤ m ≤ n ln n.

In particular, we note that Theorem 7 yields a constant approximation ratio for VCG for the important special cases where the processing times are drawn independently from the uniform distribution on [0, 1] or any exponential distribution. Indeed, for an exponential distribution the minimum T[1 : n] is again exponential with mean E[T]/n, and the maximum of n independent such minima has expectation H_n · E[T]/n, where H_n ≥ ln n is the n-th harmonic number; an analogous computation holds for the uniform distribution on [0, 1]. Thus both distributions are (ln n)-stretched, and Theorem 7 applies with k(n) = ln n:

Corollary 8.
VCG is 4-approximate for the Bayesian scheduling problem with i.i.d. processing times drawn from the uniform distribution on [0, 1] or an exponential distribution.

We point out that the above corollary cannot be generalized to hold for all MHR distributions, as the lower bound in Theorem 10 implies. For example, it is not very difficult to check that, by taking ε → 0 for the uniform distribution over [1, 1 + ε], no stretch factor k(n) = Ω(ln n) can be guaranteed. For our final positive result, we present an improved constant bound on the approximation ratio of VCG when we have many tasks:

Theorem 9.
VCG is (1 + √2)-approximate for the Bayesian scheduling problem with m ≥ n² tasks with i.i.d. processing times drawn from a continuous MHR distribution.

Proof. We use Theorem 2 to bound the performance of VCG in this setting. In order to do so, we first bound the expectation and the variance of the workload of a single machine. From (1), for the workload Y_i of any machine i we have:

E[Y_i] = Σ_{j=1}^m E[Y_{i,j}] = (1/n) Σ_j E[T[1:n]] = (m/n) E[T[1:n]].

To compute the variance of the workload of machine i, we note that the random variables Y_{i,j} are independent with respect to j, for any fixed machine i, and thus we get

Var[Y_i] = Σ_{j=1}^m Var[Y_{i,j}] = Σ_{j=1}^m ( E[Y_{i,j}²] − E[Y_{i,j}]² ) ≤ Σ_{j=1}^m E[Y_{i,j}²] = Σ_{j=1}^m E[α_ij t_ij²] = (1/n) Σ_{j=1}^m E[T[1:n]²] = (m/n) E[T[1:n]²].
We are now ready to use Theorem 2 and bound the expected makespan:

E[max_i Y_i] ≤ E[Y_1] + √(n − 1) · √(Var[Y_1])
≤ (m/n) E[T[1:n]] + √m · √(E[T[1:n]²])
≤ (m/n) E[T[1:n]] + √2 · √m · E[T[1:n]]
≤ (1 + √2) (m/n) E[T[1:n]]
≤ (1 + √2) E[OPT],

where the third inequality follows from Lemma 2 (and Lemma 1), for the fourth inequality we use the assumption that m ≥ n², and to complete the proof, the last inequality uses a lower bound on E[OPT] from Observation 3.

In this section we prove some lower bounds on the performance of VCG under different distributional assumptions on the processing times. In an attempt at a clear comparison of VCG with the mechanisms that were previously considered for the Bayesian scheduling problem (in [8]), we provide instances that lower-bound their performance as well.
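The flavour of these lower-bound instances can be previewed numerically. The following Monte Carlo sketch (our code) simulates VCG on the point-mass construction of Theorem 10 below, with n − 1 unit tasks; the tiny tasks, whose total size is only 1, are omitted here since they barely affect the maximum load:

```python
import random

def vcg_makespan_unit_tasks(n, trials=400, seed=6):
    """Simulated expected VCG makespan on n - 1 unit tasks with identical
    processing times on all n machines. Uniform tie-breaking means every
    task lands on a uniformly random machine, so the makespan is exactly
    a balls-in-bins maximum load."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        loads = [0] * n
        for _ in range(n - 1):
            loads[rng.randrange(n)] += 1
        total += max(loads)
    return total / trials

# The optimal makespan of the full instance is 1 (one machine takes all the
# tiny tasks, every other machine takes one unit task), while the simulated
# VCG makespan grows roughly like ln n / ln ln n.
print(vcg_makespan_unit_tasks(100))
```

Since the optimum stays at 1 while the simulated makespan grows with n, the approximation ratio of VCG on this family is unbounded as n → ∞.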
Theorem 10.
For any number of tasks, there exists an instance of the Bayesian scheduling problem where VCG is no better than Ω(ln n / ln ln n)-approximate and the processing times are drawn from machine-identical continuous MHR distributions.

Proof. Consider an instance with n identical machines and m tasks where, for any machine i, task j has processing time t_ij = 1 with probability 1 for j = 1, ..., n − 1, and t_ij = 1/(m − n + 1) with probability 1 for j = n, ..., m. From classical results of balls-in-bins analysis (see also the proof of Theorem 4) we can deduce that the expected maximum number of unit-weight tasks allocated to any machine by VCG is Ω(ln n / ln ln n). On the other hand, there exists an allocation that achieves a makespan equal to 1: allocate all of the m − n + 1 "small" tasks to a single machine, and allocate each of the remaining unit-cost tasks to a different machine. The theorem follows by noticing that we can without loss of generality replace these point-mass distributions on 1 and 1/(m − n + 1) with uniform distributions over small intervals around those points.

Notice that when the number of tasks equals that of the machines, i.e. m = n, the bad instance in the lower bound of Theorem 10 is in fact an i.i.d. instance where the tasks are identical as well and all t_ij's are drawn from the same distribution, and not just an instance with only the machines being identical. However, if we restrict our focus to discrete distributions, then we can strengthen that lower bound to hold for i.i.d. distributions for essentially any number of tasks, and not only for m = n:

Theorem 11.
For any number of m = O(n e^{n/3}) tasks, there exists an instance of the Bayesian scheduling problem where VCG is not better than Ω(ln n / ln ln n)-approximate and the tasks have i.i.d. processing times drawn from a discrete MHR distribution.

Proof. Consider an instance with n identical machines and m tasks where the processing times t_ij are drawn from {0, 1} such that t_ij = 1 with probability (n/m)^{1/n} ≡ p and t_ij = 0 with probability 1 − p. Notice that this is a well-defined distribution, since for all m ≥ n we have p < 1. Furthermore, it is easy to check that this distribution is MHR: its hazard rate at 0 is Pr[t_ij = 0] / Pr[t_ij ≥ 0] = 1 − p, and at 1 it is Pr[t_ij = 1] / Pr[t_ij ≥ 1] = p/p = 1.

Next, let M be the random variable denoting the number of tasks whose best processing time over all machines is non-zero, that is, M = |{j | min_i t_ij = 1}|. Then M follows a binomial distribution with probability of success p^n and m trials, since the probability of a task having processing time 1 at all machines (success) is p^n, while there are m tasks in total. Given the definition of p, the average number of tasks that will end up requiring a processing time of 1 on every machine is E[M] = m·p^n = n. Also, using Chernoff bounds, we can derive that Pr[M ≥ 2n] ≤ e^{−n/3} and Pr[M ≤ n/2] ≤ e^{−n/8}.

As we have argued before, we can use classical results from balls-in-bins analysis to bound the performance of VCG. So, if n/2 < M < 2n, we know that the expected makespan will be Ω(ln n / ln ln n), since each such task has processing time 1 on all machines. That event happens almost surely, with probability at least 1 − e^{−n/3} − e^{−n/8} = 1 − o(1).

On the other hand, we next show that the mechanism that simply balances the M "expensive" tasks across the machines (by allocating ⌈M/n⌉ of them to every machine) achieves a constant makespan, hence providing a constant upper bound on the optimal makespan:

E[OPT] ≤ Pr[M < 2n] · ⌈2n/n⌉ + Pr[M ≥ 2n] · ⌈m/n⌉ ≤ 2 + e^{−n/3} · (m/n + 1) ≤ 3 + m/(n e^{n/3}) = O(1).

Notice however that Theorem 11 still leaves open the possibility for continuous
MHR distributions to perform better (see also Theorem 9 and Corollary 6).

We finally conclude with a couple of simple observations, for the sake of completeness. First, our initial requirement (see Section 2) for identical machines (which is a standard one, see [8]) is crucial for guaranteeing any non-trivial approximation ratios on the performance of VCG:
Observation 12.
There exists an instance of the Bayesian scheduling problem where VCG is not better than n-approximate, even when the tasks are identically distributed according to continuous MHR distributions.

Proof. Assume m/n is an integer, and give as input the point-mass distributions t_1j = 1 − ε and t_ij = 1 for all j ∈ [m] and i = 2, 3, ..., n, where ε ∈ (0, 1). The VCG mechanism allocates all jobs to machine 1, for a makespan of m · (1 − ε), while the algorithm that assigns m/n jobs to each machine achieves a makespan of at most (m/n) · 1, resulting in a ratio of n as ε → 0. Without loss, the above analysis carries over even if we replace the point-mass distributions with uniform distributions over a small interval around the values 1 − ε and 1. These distributions are MHR, which concludes the proof.

We now present some lower bounds on the performance of the mechanisms analyzed by Chawla et al. [8]. A definition of these mechanisms can be found in the introduction. The following demonstrates that the analysis of the approximation ratio for the class of bounded overload mechanisms presented in [8] is asymptotically tight.

(Footnote to the proof of Theorem 11: there we use the following forms of the Chernoff bound, with β = 1 and β = 1/2 respectively: for all 0 < β < 1, Pr[X ≥ (1 + β)µ] ≤ e^{−β²µ/3} and Pr[X ≤ (1 − β)µ] ≤ e^{−β²µ/2}, for any binomial random variable X with mean µ.)

Observation 13. For any number of m ≥ n tasks, there exists an instance of the Bayesian scheduling problem where a bounded overload mechanism with parameter c is not better than min{c·m/n, n − 1}-approximate and the processing times are drawn from machine-identical continuous MHR distributions.

Proof. Consider the instance of Theorem 10 and recall that the optimal makespan is equal to 1. We note that since each task has the same processing time at any machine, all possible allocations such that no machine is assigned more than c·m/n tasks are valid outputs of bounded overload mechanisms with parameter c. Now consider the bounded overload mechanism which fixes an ordering of the machines and then breaks ties according to that ordering. This mechanism would allocate at least min{c·m/n, n − 1} unit-cost tasks to the first machine in its ordering.

The same instance can be used to bound the performance of the bounded overload mechanism with parameter c that breaks ties uniformly at random as well. Having sufficiently many tasks (m = Ω(n ln n / ln ln n)) implies that the mechanism behaves almost like the VCG mechanism while allocating the unit-cost tasks, assuming they are the first to be allocated.
This gives a lower bound of Ω(ln n / ln ln n) on the approximation ratio of this mechanism as well.

Similar instances can provide lower bounds on the performance of the class of sieve and bounded overload mechanisms with parameters c, β, and δ, even for the case of i.i.d. processing times. To see this, notice that if all tasks have t_ij = 1 with probability 1 on any machine (so that T[1:k] = 1 for any k), and we choose the threshold β < 1 and m ≤ n ln n, then a sieve and bounded overload mechanism with parameters c, β, and δ immediately reduces to a bounded overload mechanism with parameter c on δn machines.

Acknowledgements:
We want to thank Elias Koutsoupias for useful discussions.
References

[1] A. Archer and É. Tardos. Truthful mechanisms for one-parameter agents. In FOCS, pages 482–491, 2001.
[2] I. Ashlagi, S. Dobzinski, and R. Lavi. Optimal lower bounds for anonymous scheduling mechanisms. Math. Oper. Res., 37(2):244–258, 2012.
[3] T. Aven. Upper (lower) bounds on the mean of the maximum (minimum) of a number of random variables. Journal of Applied Probability, 22(3):723–728, 1985.
[4] P. D. Azar, R. Kleinberg, and S. M. Weinberg. Prophet inequalities with limited information. In SODA, pages 1358–1377, 2014.
[5] R. E. Barlow, A. W. Marshall, and F. Proschan. Properties of probability distributions with monotone hazard rate. Ann. Math. Statist., 34(2):375–389, 1963.
[6] P. Berenbrink, T. Friedetzky, Z. Hu, and R. Martin. On weighted balls-into-bins games. Theoretical Computer Science, 409(3):511–520, 2008.
[7] S. Chawla, N. Immorlica, and B. Lucier. On the limits of black-box reductions in mechanism design. In STOC, pages 435–448, 2012.
[8] S. Chawla, J. D. Hartline, D. Malec, and B. Sivan. Prior-independent mechanisms for scheduling. In STOC, pages 51–60, 2013.
[9] G. Christodoulou and A. Kovács. A deterministic truthful PTAS for scheduling related machines. SIAM J. Comput., 42(4):1572–1595, 2013.
[10] G. Christodoulou, E. Koutsoupias, and A. Vidali. A lower bound for scheduling mechanisms. Algorithmica, 55(4):729–740, 2009.
[11] E. H. Clarke. Multipart pricing of public goods. Public Choice, 11(1):17–33, 1971.
[12] C. Daskalakis and S. M. Weinberg. Bayesian truthful mechanisms for job scheduling from bi-criterion approximation algorithms. In SODA, pages 1934–1952, 2015.
[13] N. R. Devanur, J. D. Hartline, A. R. Karlin, and C. T. Nguyen. Prior-independent multi-parameter mechanism design. In WINE, pages 122–133, 2011.
[14] P. Dhangwatnotai, S. Dobzinski, S. Dughmi, and T. Roughgarden. Truthful approximation schemes for single-parameter agents. SIAM J. Comput., 40(3):915–933, 2011.
[15] P. Dhangwatnotai, T. Roughgarden, and Q. Yan. Revenue maximization with a single sample. Games and Economic Behavior, 91:318–333, 2015.
[16] S. Dughmi, T. Roughgarden, and M. Sundararajan. Revenue submodularity. Theory of Computing, 8(1):95–119, 2012.
[17] L. Epstein, A. Levin, and R. van Stee. A unified approach to truthful scheduling on related machines. Mathematics of Operations Research, 41(1):1243–1252, 2013.
[18] Y. Giannakopoulos and M. Kyropoulou. The VCG mechanism for Bayesian scheduling. In E. Markakis and G. Schäfer, editors, Web and Internet Economics (WINE), volume 9470 of Lecture Notes in Computer Science, pages 343–356. Springer Berlin Heidelberg, 2015. doi: 10.1007/978-3-662-48995-6_25. URL http://arxiv.org/abs/1509.07455.
[19] K. Goldner and A. R. Karlin. A prior-independent revenue-maximizing auction for multiple additive bidders. In WINE, 2016.
[20] T. Groves. Incentives in teams. Econometrica, 41(4):617–631, July 1973.
[21] L. A. Hall. Approximation algorithms for scheduling. In D. S. Hochbaum, editor, Approximation Algorithms for NP-hard Problems, pages 1–45. PWS, Boston, 1997.
[22] J. D. Hartline and T. Roughgarden. Simple versus optimal mechanisms. In EC, pages 225–234, 2009.
[23] E. Koutsoupias and A. Vidali. A lower bound of 1 + φ for truthful scheduling mechanisms. Algorithmica, 66(1):211–223, 2013.
[24] R. Lavi and C. Swamy. Truthful mechanism design for multidimensional scheduling via cycle monotonicity. Games and Economic Behavior, 67(1):99–124, 2009.
[25] J. K. Lenstra, D. B. Shmoys, and É. Tardos. Approximation algorithms for scheduling unrelated parallel machines. Math. Program., 46:259–271, 1990.
[26] P. Lu. On 2-player randomized mechanisms for scheduling. In WINE, pages 30–41, 2009.
[27] P. Lu and C. Yu. An improved randomized truthful mechanism for scheduling unrelated machines. In STACS, pages 527–538, 2008.
[28] R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, 1995.
[29] N. Nisan and A. Ronen. Algorithmic mechanism design. Games and Economic Behavior, 35(1/2):166–196, 2001.
[30] N. Nisan and A. Ronen. Computationally feasible VCG mechanisms. J. Artif. Int. Res., 29(1):19–47, 2007.
[31] M. Raab and A. Steger. "Balls into bins" - a simple and tight analysis. In RANDOM, pages 159–170, 1998.
[32] T. Roughgarden, I. Talgam-Cohen, and Q. Yan. Supply-limiting mechanisms. In EC, pages 844–861, 2012.
[33] B. Sivan. Prior Robust Optimization. PhD thesis, University of Wisconsin-Madison, 2013.
[34] W. Vickrey. Counterspeculation, auctions, and competitive sealed tenders. Journal of Finance, 16(1):8–37, March 1961.
[35] C. Yu. Truthful mechanisms for two-range-values variant of unrelated scheduling. Theor. Comput. Sci., 410(21-23):2196–2206, May 2009.
A Omitted Proofs from Section 2
Lemma 1. If T is a continuous MHR random variable, then for any positive integer n its first order statistic T[1:n] is also MHR.

Proof. If T is a continuous real random variable with cdf F and pdf f, then the cdf and pdf of T[1:n] are F_(1)(x) = 1 − (1 − F(x))^n and f_(1)(x) = n·f(x)·(1 − F(x))^{n−1}, respectively. So, the hazard rate of T[1:n] is

f_(1)(x) / (1 − F_(1)(x)) = n·(1 − F(x))^{n−1}·f(x) / (1 − F(x))^n = n·f(x) / (1 − F(x)),

which is increasing since f(x)/(1 − F(x)) is increasing.

Lemma 2.
For any continuous MHR random variable X and any positive integer r, E[X^r] ≤ r!·E[X]^r.

Proof. For any nonnegative integer s, denote the normalized moments λ_s ≡ E[X^s]/s!. Then from [5, p. 384] we know that for all integers i ≥ 0 and t > s > 0,

(λ_{i+t} / λ_i)^s ≤ (λ_{i+s} / λ_i)^t.

By selecting t = r, s = 1 and i = 0, this inequality gives λ_r ≤ λ_1^r. We get the desired inequality by noticing that λ_r = E[X^r]/r!, λ_1 = E[X] and λ_0 = E[1] = 1.

The continuity assumption in Lemma 2 is essential, as demonstrated by the following example: consider a discrete random variable X over {0, 1} with Pr[X = 0] = 1/2 + ε and Pr[X = 1] = 1/2 − ε, for some small ε > 0. This distribution is MHR, since its hazard rate at 0 and 1 respectively is h(0) = Pr[X = 0]/Pr[X ≥ 0] = 1/2 + ε and h(1) = Pr[X = 1]/Pr[X ≥ 1] = 1. However, it is easy to see that E[X²] = E[X] = Pr[X = 1] = 1/2 − ε, and thus E[X²]/E[X]² = 1/E[X] > 2.

B Proof of Corollary 8
Throughout this section we will use the fact that if T is a random variable with cdf F, then for any positive integer n the cdfs of the first and last order statistics T[1:n] and T[n:n] are given by

F_(1)(x) = 1 − (1 − F(x))^n  and  F_(n)(x) = F(x)^n,   (4)

respectively.

Lemma 4. If T is a uniform random variable over [0, 1], then for all positive integers n, m,

E[T[1:n]] = 1/(n + 1)  and  E[T[1:n][m:m]] = 1 − m·B(m, 1 + 1/n),

where B(x, y) ≡ ∫_0^1 t^{x−1}(1 − t)^{y−1} dt is the beta function.

Proof. If T is a uniformly distributed random variable over [0, 1], then its cdf is given by F(x) = x, x ∈ [0, 1]. The first equality is easy, since from (4) the cdf of T[1:n] is 1 − (1 − x)^n, thus its expectation is ∫_0^1 (1 − x)^n dx = 1/(n + 1). For the second one, again from (4) it is straightforward to see that the cdf of T[1:n][m:m] is [1 − (1 − x)^n]^m, so its expectation is ∫_0^1 (1 − [1 − (1 − x)^n]^m) dx = 1 − ∫_0^1 [1 − (1 − x)^n]^m dx. Next we compute the value of this integral

I(m) ≡ ∫_0^1 [1 − (1 − x)^n]^m dx.

We have:

I(m) = ∫_0^1 [1 − (1 − x)^n]^{m−1} (1 − (1 − x)^n) dx
  = I(m − 1) − ∫_0^1 [1 − (1 − x)^n]^{m−1} (1 − x)^n dx
  = I(m − 1) − (1/(nm)) ∫_0^1 ([1 − (1 − x)^n]^m)′ (1 − x) dx
  = I(m − 1) − (1/(nm)) [(1 − (1 − x)^n)^m (1 − x)]_{x=0}^{x=1} − (1/(nm)) ∫_0^1 [1 − (1 − x)^n]^m dx
  = I(m − 1) − (1/(nm)) I(m),

where the second-to-last step is integration by parts and the boundary term vanishes. This means that

I(m) = I(m − 1) / (1 + 1/(nm)),  with  I(1) = ∫_0^1 (1 − (1 − x)^n) dx = 1 − 1/(n + 1).

Solving the above recurrence gives

I(m) = (1 · 2 ··· m) / ((1 + 1/n)(2 + 1/n) ··· (m + 1/n)) = m·Γ(m)·Γ(1 + 1/n) / Γ(m + 1 + 1/n) = m·B(m, 1 + 1/n),

where Γ denotes the (complete) gamma function.

Lemma 5. If T is an exponentially distributed random variable with parameter λ, then for all positive integers n, m,

E[T[1:n]] = 1/(λn)  and  E[T[1:n][m:m]] = H_m/(λn),

where H_m = 1 + 1/2 + ··· + 1/m is the m-th harmonic number.

Proof. If T is exponentially distributed, then its cdf is given by F(x) = 1 − e^{−λx}, x ∈ [0, ∞), where λ is a positive real parameter. The first equality is again easy, since from (4) the cdf of T[1:n] is 1 − (e^{−λx})^n, thus its expectation is ∫_0^∞ e^{−nλx} dx = 1/(λn). For the second one, from (4) it is straightforward to see that the cdf of T[1:n][m:m] is (1 − e^{−λnx})^m, so its expectation is

∫_0^∞ (1 − (1 − e^{−λnx})^m) dx
  = (1/(λn)) ∫_0^∞ (1 − (1 − e^{−y})^m) dy,  by changing y = λnx,
  = (1/(λn)) ∫_0^1 (1 − (1 − z)^m)/z dz,  by changing z = e^{−y},
  = (1/(λn)) ∫_0^1 (1 − w^m)/(1 − w) dw,  by changing w = 1 − z,
  = (1/(λn)) ∫_0^1 Σ_{k=0}^{m−1} w^k dw = (1/(λn)) Σ_{k=0}^{m−1} 1/(k + 1) = H_m/(λn).

To conclude the proof of Corollary 8, from Lemma 4 we deduce that the stretch factor of the uniform distribution is

E[T[1:n][n:n]] / E[T[1:n]] = (n + 1) [1 − n·B(n, 1 + 1/n)],

and from Lemma 5 the stretch factor for the exponential distribution with parameter λ is

E[T[1:n][n:n]] / E[T[1:n]] = H_n.

It can be verified that both of the above quantities are lower-bounded by ln n.