Optimal Multiserver Scheduling with Unknown Job Sizes in Heavy Traffic
ZIV SCULLY,
Carnegie Mellon University, USA
ISAAC GROSOF,
Carnegie Mellon University, USA
MOR HARCHOL-BALTER,
Carnegie Mellon University, USA

We consider scheduling to minimize mean response time of the M/G/k queue with unknown job sizes. In the single-server $k = 1$ case, the optimal policy is the Gittins policy, but it is not known whether Gittins or any other policy is optimal in the multiserver case. Exactly analyzing the M/G/k under any scheduling policy is intractable, and Gittins is a particularly complicated policy that is hard to analyze even in the single-server case.

In this work we introduce monotonic Gittins (M-Gittins), a new variation of the Gittins policy, and show that it minimizes mean response time in the heavy-traffic M/G/k for a wide class of finite-variance job size distributions. We also show that the monotonic shortest expected remaining processing time (M-SERPT) policy, which is simpler than M-Gittins, is a 2-approximation for mean response time in the heavy-traffic M/G/k under similar conditions. These results constitute the most general optimality results to date for the M/G/k with unknown job sizes. Our techniques build upon work by Grosof et al. [15], who study simple policies, such as SRPT, in the M/G/k; Bansal et al. [6], Kamphorst and Zwart [19], and Lin et al. [23], who analyze mean response time scaling of simple policies in the heavy-traffic M/G/1; and Aalto et al. [3, 4] and Scully et al. [32, 33], who characterize and analyze the Gittins policy in the M/G/1.

Scheduling to minimize mean response time of the M/G/k queue is an important problem in queueing theory. In the single-server $k = 1$ case with known job sizes, the shortest remaining processing time (SRPT) policy is easily shown to be optimal [29]. If the scheduler does not know job sizes, which is very often the case in practical systems, then a more complex policy called the Gittins policy is known to be optimal [3, 4, 12]. The Gittins policy tailors its priority scheme to the job size distribution, and it takes a simple form in certain special cases.
For example, for distributions with decreasing hazard rate (DHR), Gittins becomes the foreground-background (FB) policy, so FB is optimal in the M/G/1 for DHR job size distributions [3, 4, 11].

In contrast to the M/G/1, much less is known about optimal scheduling in the M/G/k with $k \ge 2$, with the only nontrivial results holding under heavy traffic. For known job sizes, recent work by Grosof et al. [15] shows that a multiserver analogue of SRPT is optimal in the heavy-traffic M/G/k. For unknown job sizes, Grosof et al. [15] address only the case of DHR job size distributions, showing that a multiserver analogue of FB is optimal in the heavy-traffic M/G/k. But in general, optimal scheduling is an open problem for unknown job sizes, even in heavy traffic. We therefore ask:

What scheduling policy minimizes mean response time in the heavy-traffic M/G/k with unknown job sizes and general job size distribution?

A job's response time, also called sojourn time or latency, is the amount of time between its arrival and its completion.
FB is the policy that prioritizes the job of least age, meaning the job that has been served the least so far. It is also known as least attained service (LAS).
Here "heavy traffic" refers to the limit as the system load approaches capacity for a fixed number of servers.
Both the SRPT and FB optimality results of Grosof et al. [15] hold under technical conditions similar to finite variance.
This is a very difficult question. In order to answer it, we draw upon several recent lines of work in scheduling theory.
• As part of their heavy-traffic optimality proofs, Grosof et al. [15] use a tagged job method to stochastically bound M/G/k response time under each of SRPT and FB relative to M/G/1 response time (Fig. 2.1) under the same policy.
• Lin et al. [23] and Kamphorst and Zwart [19] characterize the heavy-traffic scaling of M/G/1 mean response time under SRPT and FB, respectively.
• Scully et al. [33] show that a policy called monotonic shortest expected remaining processing time (M-SERPT), which is considerably simpler than Gittins, has M/G/1 mean response time within a constant factor of that of Gittins.

While these prior results do not answer the question on their own, together they suggest a plan of attack for proving optimality in the heavy-traffic M/G/k.

When searching for a policy to minimize mean response time, a natural candidate is a multiserver analogue of Gittins. As a first step, one might hope to use the tagged job method of Grosof et al. [15] to stochastically bound M/G/k response time under Gittins relative to M/G/1 response time. Unfortunately, the tagged job method does not apply to multiserver Gittins, because it relies on both stochastic and worst-case properties of the scheduling policy, whereas Gittins has poor worst-case properties.

One of our key ideas is to introduce a new variant of Gittins, called monotonic Gittins (M-Gittins), that has better worst-case properties than Gittins while maintaining similar stochastic properties. This allows us to generalize the tagged job method [15] to M-Gittins, thus bounding its M/G/k response time relative to its M/G/1 response time.

Our M/G/k analysis of M-Gittins reduces the question of whether M-Gittins is optimal in the heavy-traffic M/G/k to analyzing the heavy-traffic scaling of M-Gittins's M/G/1 mean response time. However, there are no heavy-traffic scaling results for the M/G/1 under policies other than SRPT [23], FB [19], first-come, first-served (FCFS) [21, 22], and a small number of other simple policies [6, 9]. To remedy this, we derive heavy-traffic scaling results for M-Gittins in the M/G/1. It turns out that analyzing M-Gittins directly is very difficult. Fortunately, M-Gittins has a simpler cousin, M-SERPT, which Scully et al. [33] introduce and analyze. We analyze M-SERPT in heavy traffic as a key stepping stone in our heavy-traffic analysis of M-Gittins.

This paper makes the following contributions:
• We introduce the M-Gittins policy and prove that it minimizes mean response time in the heavy-traffic M/G/k for a large class of finite-variance job size distributions (Theorem 3.1).
• We also prove that the simple and practical M-SERPT policy is a 2-approximation for mean response time in the heavy-traffic M/G/k for a large class of finite-variance job size distributions (Theorem 3.2).
• We characterize the heavy-traffic scaling of mean response time in the M/G/1 under Gittins, M-Gittins, and M-SERPT (Theorem 3.3).

Section 3 formally states these results and compares them to prior work. Their proofs rely on a large collection of intermediate results, which we outline in detail in Section 4 and prove in Sections 5–7.
We consider an M/G/k queue with arrival rate $\lambda$ and job size distribution $X$. Each of the $k$ servers has speed $1/k$, so regardless of the number of servers, the total service rate is 1 and the system load is $\rho = \lambda \mathbf{E}[X]$. This allows us to easily compare the M/G/k system to a single-server M/G/1 system, as illustrated in Fig. 2.1. We assume a preempt-resume model with no preemption overhead. This means that a single-server M/G/1 system can simulate any M/G/k policy by time-sharing between $k$ jobs.

Fig. 2.1. Single-Server and $k$-Server Systems (single-server system: arrival rate $\lambda$, one server of speed 1; $k$-server system: arrival rate $\lambda$, $k$ servers of speed $1/k$ each)

Throughout this paper we consider the $\rho \to 1$ heavy-traffic limit. This is the $\lambda \to 1/\mathbf{E}[X]$ limit with the job size distribution $X$ and number of servers $k$ held constant.

We write $F$ for the cumulative distribution function of $X$ and $\overline{F}(x) = 1 - F(x)$ for its tail. We assume that $X$ has a continuous, piecewise-monotonic hazard rate
$$h(x) = \frac{\frac{d}{dx} F(x)}{\overline{F}(x)}.$$
We also frequently work with the expected remaining size of a job at age $a$, which is $\mathbf{E}[X - a \mid X > a]$. We assume it, too, is continuous and piecewise-monotonic as a function of $a$.

The above assumptions on hazard rate and expected remaining size are not restrictive and serve primarily to simplify presentation. It is very likely that our proofs can be generalized to relax them.

All of the scheduling policies considered in this work are in the class of SOAP policies [32], generalized to a multiserver setting. In a single-server setting, a SOAP policy $\pi$ is specified by a rank function
$$r_\pi : \mathbb{R}_+ \to \mathbb{R},$$
which maps a job's age, namely the amount of service it has received so far, to its rank, or priority level. Single-server SOAP policies work by always serving the job of minimal rank, breaking ties in FCFS fashion.

As an example, FB is a SOAP policy with $r_{\mathrm{FB}}(a) = a$. Because lower age corresponds to lower rank, FB prioritizes the job of least age.

A multiserver SOAP policy uses the same rank function as its single-server analogue. The only difference is that the system can serve up to $k$ jobs, so a multiserver SOAP policy works as follows:
• If there are at most $k$ jobs in the system, serve all of them.
• If there are more than $k$ jobs in the system, serve the $k$ jobs of minimal rank, breaking ties in FCFS fashion.

A function is piecewise-monotonic if, roughly speaking, it switches between increasing and decreasing finitely many times in any compact interval.
The full SOAP class allows a job's rank to depend on both its age and its "static" characteristics, such as its size or class, but we do not use this generality in this paper.
When multiple jobs are tied for least age, FB shares the server among all such jobs because the rank function is increasing. See Scully et al. [32, Appendix B] for details.
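The multiserver SOAP rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `(arrival_order, age)` job representation and the names `jobs_to_serve` and `rank_fb` are hypothetical and do not come from the paper, and the sketch picks jobs at a single instant, so it does not model the server sharing among tied jobs that arises under an increasing rank function.

```python
from typing import Callable, List, Tuple

# Hypothetical job representation (an assumption): (arrival_order, age).
Job = Tuple[int, float]


def jobs_to_serve(jobs: List[Job], rank: Callable[[float], float], k: int) -> List[Job]:
    """Return the jobs a k-server SOAP policy serves at this instant."""
    # Sort by rank first, then by arrival order, which gives FCFS tie-breaking.
    by_priority = sorted(jobs, key=lambda job: (rank(job[1]), job[0]))
    # The slice covers both cases: all jobs if there are at most k,
    # otherwise the k jobs of minimal rank.
    return by_priority[:k]


def rank_fb(age: float) -> float:
    """FB rank function from the text: r_FB(a) = a."""
    return age


jobs = [(0, 3.0), (1, 1.0), (2, 1.0), (3, 0.5)]
print(jobs_to_serve(jobs, rank_fb, 2))
```

With `k = 2`, the job of age 0.5 is selected first, and the tie between the two jobs of age 1.0 is resolved in favor of the earlier arrival.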
Fig. 2.2. Rank Function Examples (panels: $r_{\text{SERPT}}(a)$ with $r_{\text{M-SERPT}}(a)$, and $r_{\text{Gittins}}(a)$ with $r_{\text{M-Gittins}}(a)$, each plotted against age $a$)
We often compare the $k$-server variant of a policy $\pi$ to its single-server analogue. When it is necessary to distinguish between them, we write $\pi$-$k$ for the $k$-server version of a policy, so $\pi$-1 is the single-server version. We write $T^{\pi\text{-}k}_x$ for the size-conditional response time distribution of jobs of size $x$ under $\pi$-$k$, and we write $T^{\pi\text{-}k}$ for the overall response time distribution.

There are four main policies we consider in this work: SERPT, M-SERPT, Gittins, and M-Gittins. None of the policies need job size information, but each uses the job size distribution to tune its rank function. As an example, Fig. 2.2 shows the four rank functions for a bounded distribution with nonmonotonic hazard rate.

Definition 2.1. The shortest expected remaining processing time (SERPT) policy is the SOAP policy with rank function
$$r_{\text{SERPT}}(a) = \mathbf{E}[X - a \mid X > a] = \frac{\int_a^\infty \overline{F}(t)\,dt}{\overline{F}(a)}.$$

As a reminder, lower rank means better priority, so, as hinted by its name, SERPT prioritizes the job of least expected remaining size.

Definition 2.2. The monotonic SERPT (M-SERPT) policy is the SOAP policy with monotonic rank function
$$r_{\text{M-SERPT}}(a) = \max_{b \in [0, a]} r_{\text{SERPT}}(b).$$

Definition 2.3. The Gittins policy is the SOAP policy with rank function
$$r_{\text{Gittins}}(a) = \inf_{b > a} \frac{\mathbf{E}[\min\{X, b\} - a \mid X > a]}{\mathbf{P}\{X \le b \mid X > a\}} = \inf_{b > a} \frac{\int_a^b \overline{F}(t)\,dt}{\overline{F}(a) - \overline{F}(b)}.$$

Definition 2.4. The monotonic Gittins (M-Gittins) policy is the SOAP policy with monotonic rank function
$$r_{\text{M-Gittins}}(a) = \max_{b \in [0, a]} r_{\text{Gittins}}(b).$$

The M-Gittins and M-SERPT policies, which both have monotonic rank functions, are the primary focus of this paper. Some of our intermediate results apply more broadly to any policy with a monotonic rank function.
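To make Definitions 2.1 through 2.4 concrete, the following Python sketch approximates the four rank functions on a uniform grid of ages. It is a rough numerical illustration, not the authors' method: the function name `rank_functions` and its parameters are hypothetical, the integral of the tail uses the trapezoid rule, the infimum in the Gittins rank is taken only over grid points, and the tail is assumed to be positive on $[0, a_{\max})$, non-constant after every grid age, and negligible beyond $a_{\max}$.

```python
from itertools import accumulate
from typing import Callable, Dict, List


def rank_functions(
    tail: Callable[[float], float], a_max: float, n: int
) -> Dict[str, List[float]]:
    """Approximate the SERPT, M-SERPT, Gittins, and M-Gittins ranks on a grid."""
    step = a_max / n
    ages = [i * step for i in range(n + 1)]
    fbar = [tail(a) for a in ages]

    # integral[i] approximates the integral of the tail over [0, ages[i]].
    integral = [0.0]
    for i in range(n):
        integral.append(integral[-1] + 0.5 * (fbar[i] + fbar[i + 1]) * step)

    serpt: List[float] = []
    gittins: List[float] = []
    for i in range(n):  # skip the last grid point, where the tail may be 0
        # Definition 2.1: expected remaining size at age ages[i].
        serpt.append((integral[n] - integral[i]) / fbar[i])
        # Definition 2.3: infimum over later ages b, restricted to the grid.
        ratios = [
            (integral[j] - integral[i]) / (fbar[i] - fbar[j])
            for j in range(i + 1, n + 1)
            if fbar[i] > fbar[j]
        ]
        gittins.append(min(ratios))

    return {
        "ages": ages[:n],
        "SERPT": serpt,
        # Definitions 2.2 and 2.4: running maximum over ages in [0, a].
        "M-SERPT": list(accumulate(serpt, max)),
        "Gittins": gittins,
        "M-Gittins": list(accumulate(gittins, max)),
    }


# Example: job sizes uniform on [0, 1], whose tail is 1 - a.
ranks = rank_functions(lambda a: 1.0 - a, 1.0, 100)
print(ranks["SERPT"][50], ranks["M-SERPT"][50])
```

For the uniform distribution, the expected remaining size $(1 - a)/2$ decreases with age, so the SERPT and Gittins ranks decrease while their monotonic versions stay constant at their value at age 0.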
Definition 2.5. A SOAP policy $\pi$ is monotonic if its rank function is nondecreasing, meaning $r_\pi(a) \le r_\pi(b)$ for all ages $a < b$.

Figure 2.2 shows the SERPT, M-SERPT, Gittins, and M-Gittins rank functions for a bounded distribution with nonmonotonic hazard rate. Notice that SERPT and Gittins are not monotonic. This makes it hard to analyze their M/G/k response time (Appendix A). In contrast, M-SERPT and M-Gittins are monotonic: their rank functions alternate between constant regions and strictly increasing regions.

While the rank functions of Gittins and SERPT may not be monotonic, they are still well behaved under our assumptions on the job size distribution.

Lemma 2.6. Under the assumption that the job size distribution $X$ has continuous and piecewise-monotonic hazard rate and expected remaining size functions, each of $r_{\text{SERPT}}$, $r_{\text{M-SERPT}}$, $r_{\text{Gittins}}$, and $r_{\text{M-Gittins}}$ is continuous and piecewise-monotonic.

Proof. It suffices to prove the claims for $r_{\text{SERPT}}$ and $r_{\text{Gittins}}$, because taking a running maximum preserves both continuity and piecewise monotonicity. The claim for $r_{\text{SERPT}}$ is exactly our assumption on expected remaining size, and the claim for $r_{\text{Gittins}}$ is a known result [4, Theorem 1]. □
We consider several classes of job size distributions in this paper. We briefly describe each class before giving the formal definitions.
• The $\mathrm{OR}(-\infty, -1)$ class (Definition 2.7) contains, roughly speaking, distributions with Pareto-like tails.
  – We focus especially on the $\mathrm{OR}(-\infty, -2)$ subclass, all members of which have finite variance.
• The $\mathrm{MDA}(\Lambda)$ class (Definition 2.12) contains, roughly speaking, distributions with smooth tails that are lighter than Pareto tails. It includes, among others, exponential, normal, log-normal, Weibull, and Gamma distributions.
• The QDHR and QIMRL classes (Definitions 2.8 and 2.9) are relaxations of the well-known decreasing hazard rate (DHR) and increasing mean residual lifetime (IMRL) classes [1–4, 11, 27, 28, 34]. QDHR contains distributions whose hazard rate is roughly decreasing with age, even if it is not perfectly monotonic, and QIMRL contains distributions with roughly increasing expected remaining size.
  – We focus especially on the subclasses $\mathrm{MDA}(\Lambda) \cap \mathrm{QDHR}$ and $\mathrm{MDA}(\Lambda) \cap \mathrm{QIMRL}$.
• The ENBUE class (Definition 2.10) is a relaxation of the well-known new better than used in expectation (NBUE) class [3, 4, 34]. It contains distributions whose expected remaining size reaches a global maximum at some age.
  – We focus especially on the Bounded subclass, which contains all bounded distributions.

The nonincreasing case is less interesting, because all nonincreasing rank functions encode FCFS.
Because the NBUE terminology originates in reliability analysis, the word "better" here means "longer".

These classes play two different roles in our analysis.
• Some of the classes broadly characterize the asymptotic behavior of the tail $\overline{F}$. These include $\mathrm{OR}(-\infty, -1)$, $\mathrm{MDA}(\Lambda)$, and ENBUE. Virtually all job size distributions of interest are in one of these classes, so requiring membership in one of them, as in Theorem 3.3, should not be viewed as a major restriction.
• Some of the classes impose additional conditions on the job size distribution that help us bound the M-Gittins and M-SERPT rank functions (Section 6). These include QDHR, QIMRL, and Bounded. While these classes are much broader than those previously studied (Section 3.1), they do not cover all distributions of interest. Requiring membership in one of them, as in Theorems 3.1 and 3.2, represents a genuine restriction.
Definition 2.7. A function $f$ is $O$-regularly varying if there exist exponents $\beta' \ge \alpha' > 0$ and constants $C, x_0 > 0$ such that for all $y \ge x \ge x_0$,
$$\frac{1}{C} \left(\frac{y}{x}\right)^{-\beta'} \le \frac{f(y)}{f(x)} \le C \left(\frac{y}{x}\right)^{-\alpha'}.$$
We write $\mathrm{OR}(-\beta, -\alpha)$ for the set of $O$-regularly varying functions where the exponents $\alpha'$ and $\beta'$ above may be chosen such that $\alpha < \alpha' \le \beta' < \beta$. We use the same $\mathrm{OR}(-\beta, -\alpha)$ notation to represent the class of distributions whose tails are in $\mathrm{OR}(-\beta, -\alpha)$.

Definition 2.8. A job size distribution is in the quasi-decreasing hazard rate class, denoted QDHR, if there exist a strictly increasing function $m : \mathbb{R}_+ \to \mathbb{R}_+$, an exponent $\gamma \ge 1$, and constants $C, x_0 > 0$ such that for all $x \ge x_0$,
$$m(x) \le \frac{1}{h(x)} \le m(C x^\gamma).$$

Definition 2.9. A job size distribution is in the quasi-increasing mean residual lifetime class, denoted QIMRL, if there exist a strictly increasing function $m : \mathbb{R}_+ \to \mathbb{R}_+$, an exponent $\gamma \ge 1$, and constants $C, x_0 > 0$ such that for all $x \ge x_0$,
$$m(x) \le \mathbf{E}[X - x \mid X > x] \le m(C x^\gamma).$$

Definition 2.10. A job size distribution is in the eventually new better than used in expectation class, denoted ENBUE, if there exists an age $a^* \ge 0$ such that for all $x \ne a^*$,
$$\mathbf{E}[X - a^* \mid X > a^*] \ge \mathbf{E}[X - x \mid X > x].$$

Definition 2.11. A job size distribution is in the bounded class, denoted Bounded, if there exists $x_{\max} < \infty$ such that $\overline{F}(x_{\max}) = 0$.

Definition 2.12. A job size distribution is said to be in the Gumbel domain of attraction, denoted $\mathrm{MDA}(\Lambda)$, under certain conditions specified in extreme value theory [26].

The exact characterization of $\mathrm{MDA}(\Lambda)$ is outside the scope of this paper. The most important property is that distributions in $\mathrm{MDA}(\Lambda)$ are lighter-tailed than all Pareto distributions.

Lemma 2.13. If $X \in \mathrm{MDA}(\Lambda)$, then $\overline{F}(x) = o(x^{-\alpha})$ for all $\alpha > 0$.

Proof. The result follows from a known characterization of $\mathrm{MDA}(\Lambda)$ [26, Proposition 1.4]. □

This is not the standard definition of $O$-regular variation, but it is equivalent to it [8, Section 2.2.1]. Specifically, our $\mathrm{OR}(-\beta, -\alpha)$ contains the $O$-regularly varying functions whose Matuszewska indices are in the interval $(-\beta, -\alpha)$.

We now present our main results, explaining how they relate to prior work in Section 3.1. We begin with our heavy-traffic M/G/k optimality result.

Theorem 3.1.
In an M/G/k, if $X \in \mathrm{OR}(-\infty, -2) \cup (\mathrm{MDA}(\Lambda) \cap \mathrm{QDHR}) \cup \mathrm{Bounded}$, then
$$\lim_{\rho \to 1} \frac{\mathbf{E}[T^{\text{M-Gittins-}k}]}{\mathbf{E}[T^{\text{Gittins-1}}]} = 1.$$
In such cases, M-Gittins-$k$ is optimal for mean response time in heavy traffic.

The M-Gittins policy is based on the Gittins policy, which is somewhat complex to describe and compute. Fortunately, the M-SERPT policy, which can be much simpler to compute [33], also performs well in the heavy-traffic M/G/k.

Theorem 3.2. In an M/G/k, if $X \in \mathrm{OR}(-\infty, -2) \cup (\mathrm{MDA}(\Lambda) \cap (\mathrm{QDHR} \cup \mathrm{QIMRL})) \cup \mathrm{Bounded}$, then
$$\lim_{\rho \to 1} \frac{\mathbf{E}[T^{\text{M-SERPT-}k}]}{\mathbf{E}[T^{\text{Gittins-1}}]} \le 2.$$
In such cases, M-SERPT-$k$ is a 2-approximation for mean response time in heavy traffic.

Theorems 3.1 and 3.2 apply to a broad class of finite-variance job size distributions. Roughly speaking, $\mathrm{OR}(-\infty, -2)$ covers heavy-tailed distributions, and $\mathrm{MDA}(\Lambda)$ covers non-heavy-tailed distributions that are unbounded (Section 2.2). Assuming membership in these sets is standard for heavy-traffic analysis [19]. The main restriction the results impose is on $\mathrm{MDA}(\Lambda)$ distributions, for which we additionally require membership in QDHR or QIMRL. While slightly relaxing this restriction is possible, removing it entirely appears to be very difficult (Section 8).

A key step in the proofs of Theorems 3.1 and 3.2 is analyzing M-Gittins and M-SERPT in the heavy-traffic M/G/1. This analysis is itself a new result of independent interest. Notably, it extends to ordinary Gittins in addition to M-Gittins, thus characterizing the optimal heavy-traffic scaling attainable by any scheduling policy in the setting of unknown job sizes.

Theorem 3.3. Let $\pi$-1 be one of Gittins-1, M-Gittins-1, or M-SERPT-1. If $X \in \mathrm{OR}(-2, -1)$, then in the $\rho \to 1$ limit,
$$\mathbf{E}[T^{\pi\text{-}1}] = \Theta\left(\log \frac{1}{1 - \rho}\right),$$
and if $X \in \mathrm{OR}(-\infty, -2) \cup \mathrm{MDA}(\Lambda) \cup \mathrm{ENBUE}$, then
$$\mathbf{E}[T^{\pi\text{-}1}] = \Theta\left(\frac{1}{(1 - \rho) \cdot r_{\text{M-SERPT}}\left(\overline{F}_e^{-1}(1 - \rho)\right)}\right),$$
where $\overline{F}_e^{-1}$ is the inverse of the tail of the excess of $X$, namely
$$\overline{F}_e(x) = \frac{1}{\mathbf{E}[X]} \int_x^\infty \overline{F}(t)\,dt.$$

For example, we only need the QDHR and QIMRL assumptions to prove Theorems 6.3 and 6.5, so we could instead assume the results of those theorems.
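As a sanity check on the second expression in Theorem 3.3, one can work it out by hand for an exponential job size distribution, which Section 2.2 lists as a member of $\mathrm{MDA}(\Lambda)$. The example below is an illustration added here, not a computation from the paper, and the rate parameter $\mu$ is a symbol introduced for it.

```latex
% Worked example (illustrative): X exponential with rate \mu, held fixed as \rho \to 1.
\begin{align*}
\overline{F}(x) &= e^{-\mu x}, \qquad \mathbf{E}[X] = \frac{1}{\mu}, \\
r_{\text{SERPT}}(a) &= \frac{\int_a^\infty e^{-\mu t}\,dt}{e^{-\mu a}} = \frac{1}{\mu}
  \quad\Longrightarrow\quad
  r_{\text{M-SERPT}}(a) = \max_{b \in [0, a]} \frac{1}{\mu} = \frac{1}{\mu}, \\
\mathbf{E}[T^{\pi\text{-}1}]
  &= \Theta\left(\frac{1}{(1 - \rho) \cdot r_{\text{M-SERPT}}\left(\overline{F}_e^{-1}(1 - \rho)\right)}\right)
   = \Theta\left(\frac{\mu}{1 - \rho}\right)
   = \Theta\left(\frac{1}{1 - \rho}\right).
\end{align*}
```

Because the rank function is constant, the argument $\overline{F}_e^{-1}(1 - \rho)$ plays no role here. The resulting scaling agrees with the classical M/M/1 mean response time under FCFS, $\mathbf{E}[X]/(1 - \rho)$.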
Theorem 3.1 is the first result proving optimality of a scheduling policy in the heavy-traffic M/G/k with unknown job sizes and general job size distribution. As mentioned in Section 1, the only prior results of this type were shown by Grosof et al. [15], who prove similar results for SRPT and FB, the latter for decreasing hazard rate (DHR) job size distributions.
• SRPT was shown to be optimal in the heavy-traffic M/G/k for job size distributions whose tail has upper Matuszewska index less than $-2$. This is somewhat broader than the precondition of Theorem 3.1, though it is still limited to finite-variance distributions.
  – Given that SRPT is designed for known job sizes while M-Gittins is designed for unknown job sizes, Theorem 3.1 complements the prior SRPT results.
• FB was shown to be optimal in the heavy-traffic M/G/k for job size distributions in the class $\mathrm{DHR} \cap (\mathrm{OR}(-\infty, -2) \cup \mathrm{MDA}(\Lambda))$ [15, Theorem 7.13]. The DHR class is much more restrictive than QDHR, so this is much narrower than the precondition of Theorem 3.1.
  – Given that FB is equivalent to M-Gittins in the DHR case [3, 4], Theorem 3.1 subsumes the prior FB results.

While Grosof et al. [15, Theorem 7.13] claim that this result applies to all distributions in DHR with upper Matuszewska index less than $-2$, their proof incorrectly cites the preconditions of results of Kamphorst and Zwart [19]. Correcting the precondition narrows the result to what we state here.

There is another result that follows from two prior works that complements Theorem 3.1, although to the best of our knowledge it has never been explicitly stated. Köllerström [21, 22] shows that under FCFS, the mean response times in the M/G/1 and M/G/k converge. This means that if Gittins and M-Gittins happen to be equivalent to FCFS for a given job size distribution, then FCFS minimizes mean response time in the heavy-traffic M/G/k. Aalto et al. [3, 4] show this occurs exactly for job size distributions in the new better than used in expectation (NBUE) class, which includes some distributions that Theorem 3.1 does not cover.

Finally, versions of the Gittins policy have been shown to be heavy-traffic optimal for two discrete-state versions of the M/G/k queue [13, 14]. These models support some features our model does not, such as multiple job classes, but discretizing the state space imposes some limitations. Specifically, Glazebrook and Niño-Mora [14] require each job to be composed of phases where each phase has exponentially distributed size; and Glazebrook [13] allows nonexponential job size distributions but discretizes time and additionally requires ENBUE job size distributions (Definition 2.10). In contrast, Theorem 3.1 applies to heavy-tailed and other non-ENBUE job size distributions that are of practical importance in computer systems [5, 10, 17, 25].

Theorem 3.2 shows that a simple scheduling policy, namely M-SERPT, has mean response time within a constant factor of optimal in the heavy-traffic M/G/k with unknown job sizes and general job size distribution. Specifically, we show M-SERPT is a 2-approximation. This complements the result of Scully et al. [33], who show that in the M/G/1, M-SERPT is a 5-approximation for mean response time at all loads. Our result is tighter and applies to multiserver systems, not just single-server systems, but it applies only in heavy traffic. The techniques we introduce could be useful for tightening the upper bound on M-SERPT's M/G/1 approximation ratio, which is conjectured to be 2 [33].

Theorem 3.3 characterizes the heavy-traffic scaling of M/G/1 mean response time under Gittins, M-Gittins, and M-SERPT. There are three other policies whose heavy-traffic scaling has been characterized: FB, SRPT, and a policy called randomized multilevel feedback (RMLF) [7, 18]. We now compare Theorem 3.3 to each of these prior results.

Kamphorst and Zwart [19] study FB in heavy traffic. They show that if $X \in \mathrm{OR}(-2, -1)$, then
$$\mathbf{E}[T^{\text{FB-1}}] = \Theta\left(\log \frac{1}{1 - \rho}\right),$$
matching the first expression in Theorem 3.3. They also show that if $X \in \mathrm{OR}(-\infty, -2) \cup \mathrm{MDA}(\Lambda)$, then
$$\mathbf{E}[T^{\text{FB-1}}] = \Theta\left(\frac{1}{(1 - \rho) \cdot r_{\text{SERPT}}\left(\overline{F}_e^{-1}(1 - \rho)\right)}\right).$$
This is similar to the second expression in Theorem 3.3, except it replaces the monotonic $r_{\text{M-SERPT}}$ with the nonmonotonic $r_{\text{SERPT}}$, which pinpoints the suboptimality of FB's heavy-traffic scaling.

Lin et al. [23] study SRPT in heavy traffic. They show that if $X \in \mathrm{OR}(-2, -1)$, then
$$\mathbf{E}[T^{\text{SRPT-1}}] = \Theta\left(\log \frac{1}{1 - \rho}\right),$$
and if $\overline{F}$ has upper Matuszewska index less than $-2$, which covers $X \in \mathrm{OR}(-\infty, -2) \cup \mathrm{MDA}(\Lambda)$, then
$$\mathbf{E}[T^{\text{SRPT-1}}] = \Theta\left(\frac{1}{(1 - \rho) \cdot \overline{G}^{-1}(1 - \rho)}\right),$$
where
$$\overline{G}(x) = 1 - \frac{\mathbf{E}[X \mathbb{1}(X \le x)]}{\mathbf{E}[X]} = \overline{F}_e(x) + \frac{x \overline{F}(x)}{\mathbf{E}[X]}.$$

Recall that SRPT minimizes mean response time in the presence of job size information, whereas Gittins does not use job size information, so the heavy-traffic scaling of SRPT is a lower bound on that of Gittins. By comparing the above result for SRPT with our result for Gittins (Theorem 3.3), we learn when knowledge of job sizes yields an asymptotic improvement in mean response time.
• For $X \in \mathrm{OR}(-2, -1)$, meaning $X$ is heavy-tailed with infinite variance, the heavy-traffic scaling of Gittins matches that of SRPT.
• For $X \in \mathrm{OR}(-\infty, -2)$, meaning $X$ is heavy-tailed with finite variance, the heavy-traffic scaling of Gittins still matches that of SRPT. Specifically, we later show $r_{\text{M-SERPT}}(a) = \Theta(a)$ (Theorem 6.2), and one can also show $\overline{G}^{-1}(1 - \rho) = \Theta(\overline{F}_e^{-1}(1 - \rho))$.
• For $X \in \mathrm{MDA}(\Lambda)$, meaning $X$ is not heavy-tailed, one can show $r_{\text{M-SERPT}}(a) = o(a)$ [26], implying Gittins has worse heavy-traffic scaling than SRPT in those cases.

We see that, roughly speaking, Gittins matches the heavy-traffic scaling of SRPT if and only if the job size distribution is heavy-tailed. We conclude that knowledge of job sizes yields an asymptotic improvement in mean response time for non-heavy-tailed job size distributions.

Bansal et al. [6] study RMLF in heavy traffic. They show that if $\mathbf{E}[X^\alpha] < \infty$ for some $\alpha > 2$, then
$$\mathbf{E}[T^{\text{RMLF-1}}] = O\left(\mathbf{E}[T^{\text{SRPT-1}}] \cdot \log \frac{1}{1 - \rho}\right). \quad (3.1)$$
Because Gittins minimizes M/G/1 mean response time, this serves as an upper bound on the heavy-traffic scaling of Gittins. However, as previously discussed when comparing Theorem 3.3 to prior results on SRPT, there are cases where Gittins matches the heavy-traffic scaling of SRPT, so our result is a tighter bound. With that said, requiring $\mathbf{E}[X^\alpha] < \infty$ for some $\alpha > 2$ …

Key Definitions
• (Section 2.2) Job size distribution classes: QDHR, $\mathrm{OR}(-\infty, -2)$, $\mathrm{MDA}(\Lambda)$, etc.
• (Sections 4 and 5) Single-server quantities: $\mathbf{E}[Q^{\pi\text{-}1}]$, $\mathbf{E}[R^{\pi\text{-}1}]$, and $\mathbf{E}[S^{\pi\text{-}1}]$.
• (Section 4.1) Age cutoffs: $y^\pi_x$ and $z^\pi_x$.

Proof Steps
• (Section 5) Compare M/G/k to M/G/1: $\mathbf{E}[T^{\pi\text{-}k}] \le \mathbf{E}[Q^{\pi\text{-}1}] + k \mathbf{E}[R^{\pi\text{-}1}] + (k - 1) \mathbf{E}[S^{\pi\text{-}1}]$, whereas $\mathbf{E}[T^{\pi\text{-}1}] = \mathbf{E}[Q^{\pi\text{-}1}] + \mathbf{E}[R^{\pi\text{-}1}]$.
• Show $\mathbf{E}[Q^{\pi\text{-}1}]$ dominates $\mathbf{E}[R^{\pi\text{-}1}]$ and $\mathbf{E}[S^{\pi\text{-}1}]$ in the $\rho \to 1$ limit.
  – (Section 6) Job size distribution classes imply bounds on age cutoffs: for example, if $X \in \mathrm{QDHR}$, then $z^\pi_x = O(x^\gamma)$ for some $\gamma \ge 1$.
  – (Section 7) Job size distribution classes and bounds on age cutoffs imply $\mathbf{E}[Q^{\pi\text{-}1}]$ dominates: for example, if $X \in \mathrm{MDA}(\Lambda)$ and $z^\pi_x = O(x^\gamma)$ for some $\gamma \ge 1$, then $\mathbf{E}[S^{\pi\text{-}1}] = o(\mathbf{E}[Q^{\pi\text{-}1}])$.
• (Section 4.4) Compare M-Gittins-$k$ and M-SERPT-$k$ to Gittins-1.
  – M-Gittins-$k$ vs. Gittins-1: prior work shows $\mathbf{E}[Q^{\text{M-Gittins-1}}] \le \mathbf{E}[T^{\text{Gittins-1}}]$, implying $\lim_{\rho \to 1} \mathbf{E}[T^{\text{M-Gittins-}k}] / \mathbf{E}[T^{\text{Gittins-1}}] = 1$.
  – M-SERPT-$k$ vs. Gittins-1: prior work shows $\mathbf{E}[Q^{\text{M-SERPT-1}}] \le 2 \mathbf{E}[T^{\text{Gittins-1}}]$, implying $\lim_{\rho \to 1} \mathbf{E}[T^{\text{M-SERPT-}k}] / \mathbf{E}[T^{\text{Gittins-1}}] \le 2$.
Throughout, $\pi$ stands for either M-Gittins or M-SERPT.

Fig. 4.1. Proof Overview
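The way the proof steps in Fig. 4.1 combine for M-Gittins can be written out as a short chain of inequalities. This is a sketch assembled from the steps listed in the figure, assuming the dominance step has already established $\mathbf{E}[R^{\pi\text{-}1}] = o(\mathbf{E}[Q^{\pi\text{-}1}])$ and $\mathbf{E}[S^{\pi\text{-}1}] = o(\mathbf{E}[Q^{\pi\text{-}1}])$ for $\pi$ = M-Gittins; the formal argument is the one the paper gives in Section 4.4.

```latex
% Sketch for \pi = M-Gittins in the \rho \to 1 limit, with k fixed.
\begin{align*}
1 \le \frac{\mathbf{E}[T^{\pi\text{-}k}]}{\mathbf{E}[T^{\text{Gittins-1}}]}
  &\le \frac{\mathbf{E}[Q^{\pi\text{-}1}] + k\,\mathbf{E}[R^{\pi\text{-}1}] + (k - 1)\,\mathbf{E}[S^{\pi\text{-}1}]}
            {\mathbf{E}[T^{\text{Gittins-1}}]} \\
  &= \frac{\mathbf{E}[Q^{\pi\text{-}1}]\,(1 + o(1))}{\mathbf{E}[T^{\text{Gittins-1}}]}
  \le 1 + o(1).
\end{align*}
```

The lower bound of 1 holds because Gittins-1 is optimal in the M/G/1 and a single-server system can simulate any M/G/k policy by time-sharing (Section 2). The last inequality uses $\mathbf{E}[Q^{\text{M-Gittins-1}}] \le \mathbf{E}[T^{\text{Gittins-1}}]$.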
Our main goal is to show that M-Gittins minimizes M/G/k mean response time in the $\rho \to 1$ limit. That is, we want to show
$$\mathbf{E}[T^{\text{M-Gittins-}k}] \le \mathbf{E}[T^{\text{Gittins-1}}] + o(\mathbf{E}[T^{\text{Gittins-1}}]). \quad (4.1)$$
The only existing technique for proving a bound like (4.1) is the M/G/k tagged job method of Grosof et al. [15]. In general, tagged job methods work as follows [15, 16, 20, 24, 30–32, 35]: one focuses on a "tagged" job $J$ throughout its time in the system, tracking how much each other job delays $J$. The amount of time for which another job can delay $J$ is called the relevant work due to that other job. The specific M/G/k tagged job method [15] relates the amount of relevant work in an M/G/k under $\pi$-$k$ to the amount of relevant work in an M/G/1 under $\pi$-1.

As a first approach, we might try to prove a result like (4.1) for Gittins-$k$ using the M/G/k tagged job method. Unfortunately, the method turns out not to work for Gittins, because Gittins can have a nonmonotonic rank function. It turns out that under nonmonotonic rank functions, jobs can contribute more relevant work in an M/G/k than in an M/G/1 (Appendix A), resulting in a much looser response time bound.

Our key insight is that we can generalize the M/G/k tagged job method of Grosof et al. [15] to any SOAP policy, provided it has a monotonic rank function. In Theorem 5.1 we show that for any monotonic SOAP policy $\pi$,
$$\mathbf{E}[T^{\pi\text{-}k}] \le \mathbf{E}[Q^{\pi\text{-}1}] + k \mathbf{E}[R^{\pi\text{-}1}] + (k - 1) \mathbf{E}[S^{\pi\text{-}1}], \quad (4.2)$$
where the quantities on the right hand side, defined formally in Section 5, can be thought of as follows:
• $Q^{\pi\text{-}1}$ and $R^{\pi\text{-}1}$ are distributions called waiting time and residence time, respectively [32]. Response time in the M/G/1 is the sum of waiting time and residence time.
• $S^{\pi\text{-}1}$ is a new distribution we call inflated residence time, which is similar to residence time but longer.

Proving (4.2) is the first stepping stone to proving Theorem 3.1 because it reduces an M/G/k analysis to an M/G/1 analysis. Only the $\mathbf{E}[R^{\pi\text{-}1}]$ and $\mathbf{E}[S^{\pi\text{-}1}]$ coefficients depend on $k$, so to prove Theorem 3.1, we show the $\mathbf{E}[Q^{\pi\text{-}1}]$ term dominates in the $\rho \to 1$ limit when $\pi$ is M-Gittins. Figure 4.1 gives an overview of the main proof steps.

In the remainder of this section, our goal is to bound $\mathbf{E}[Q^{\pi\text{-}1}]$, $\mathbf{E}[R^{\pi\text{-}1}]$, and $\mathbf{E}[S^{\pi\text{-}1}]$, where $\pi$ is either M-Gittins or M-SERPT. We begin in Section 4.1 by explaining in more detail the concepts of relevant work and of waiting, residence, and inflated residence time. In doing so, we introduce age cutoffs, quantities which characterize the relevant work due to each job. It turns out that to bound $\mathbf{E}[Q^{\pi\text{-}1}]$, $\mathbf{E}[R^{\pi\text{-}1}]$, and $\mathbf{E}[S^{\pi\text{-}1}]$, we first need to bound the age cutoffs. Section 4.2 presents our age cutoff bounds, deferring proofs to Section 6, and Section 4.3 presents our bounds on $\mathbf{E}[Q^{\pi\text{-}1}]$, $\mathbf{E}[R^{\pi\text{-}1}]$, and $\mathbf{E}[S^{\pi\text{-}1}]$, deferring proofs to Section 7. Finally, in Section 4.4, we formally prove Theorems 3.1–3.3 by combining the intermediate results discussed throughout this section.

In this section we give intuition for the tagged job method, deferring some formalities to Section 5. Recall that the tagged job method works by focusing on the journey of a "tagged" job $J$ through the system. Roughly speaking, the relevant work due to any other job is the amount of time by which that job delays $J$'s departure. A key insight from the M/G/1 SOAP analysis [32] is that to figure out how much another job delays $J$, we need to look not at $J$'s current rank but at its worst future rank. This is because even if $J$ has priority over another job at first, if $J$'s rank later increases, the other job can get priority.

Suppose that $J$ has size $x$. Under a monotonic SOAP policy $\pi$, such as M-Gittins or M-SERPT, the worst future rank $J$ will have is always the rank it will have just before completion, namely $r_\pi(x)$. The amount of relevant work due to another job $J'$ is the amount of time $J'$ is served while $J$ is in the system until $J'$ either completes or reaches rank $r_\pi(x)$. Due to the FCFS tiebreaking rule (Section 2.1), exactly what "reaches" means depends on when $J'$ arrives.
• New jobs, those that arrive after $J$, contribute relevant work until they first have rank greater than or equal to $r_\pi(x)$. This occurs at a specific age called the new job age cutoff, denoted $y^\pi_x$.
• Old jobs, those that arrive before $J$, contribute relevant work until they first have rank strictly greater than $r_\pi(x)$. This occurs at a specific age called the old job age cutoff, denoted $z^\pi_x$.

Figure 4.2 illustrates the new job and old job age cutoffs $y^\pi_x$ and $z^\pi_x$, which are formally defined below. Roughly speaking,
• if $r_\pi$ is increasing at $x$, then $y^\pi_x = x = z^\pi_x$; and
• if $r_\pi$ is constant at $x$, then $y^\pi_x$ and $z^\pi_x$ are the endpoints of the constant region containing $x$.

As Fig. 4.2 illustrates, we always have
$$y^\pi_x \le x \le z^\pi_x. \quad (4.3)$$

Fig. 4.2. New Job and Old Job Age Cutoffs (rank $r_\pi$ plotted against age; a size $x$ in a constant region has $y^\pi_x < x < z^\pi_x$, and a size $x'$ in an increasing region has $y^\pi_{x'} = x' = z^\pi_{x'}$)

Definition 4.1.
Let $\pi$ be a monotonic SOAP policy. The new job age cutoff and old job age cutoff of size $x$ are, respectively,
$$y_x^\pi = \sup\{a \ge 0 \mid r_\pi(a) < r_\pi(x)\}, \qquad z_x^\pi = \sup\{a \ge 0 \mid r_\pi(a) \le r_\pi(x)\}.$$
When the policy in question is clear, we drop the superscript $\pi$.

One can use new job and old job age cutoffs to write M/G/1 mean response time under a monotonic SOAP policy [33]. As a first step, we write M/G/1 response time $T^{\pi\text{-}1}$ as a sum of two parts, called waiting time $Q^{\pi\text{-}1}$ and residence time $R^{\pi\text{-}1}$ [32]:
$$\mathbf{E}[T^{\pi\text{-}1}] = \mathbf{E}[Q^{\pi\text{-}1}] + \mathbf{E}[R^{\pi\text{-}1}].$$
We define waiting and residence times formally in Section 5. For now, we just need to know that their means can be written in terms of $y_x^\pi$ and $z_x^\pi$. Specifically, Scully et al. [33, Propositions 4.7 and 4.8] show
$$\mathbf{E}[Q^{\pi\text{-}1}] = \int_0^\infty \frac{\tau(z_x^\pi)}{\overline{\rho}(y_x^\pi)\,\overline{\rho}(z_x^\pi)}\,\mathrm{d}F(x), \qquad \mathbf{E}[R^{\pi\text{-}1}] = \int_0^\infty \frac{x}{\overline{\rho}(y_x^\pi)}\,\mathrm{d}F(x), \tag{4.4}$$
where $\overline{\rho}$ and $\tau$ are defined as
$$\overline{\rho}(a) = 1 - \lambda\,\mathbf{E}[\min\{X, a\}] = 1 - \int_0^a \lambda \overline{F}(t)\,\mathrm{d}t, \qquad \tau(a) = \frac{\lambda}{2}\,\mathbf{E}[\min\{X, a\}^2] = \int_0^a \lambda t \overline{F}(t)\,\mathrm{d}t. \tag{4.5}$$
The proof of Theorem 5.1 explains the intuition behind (4.4).

The significance of (4.2) is that it expresses M/G/k response time in terms of waiting and residence times, which are M/G/1 quantities. It also features a third quantity called inflated residence time $S^{\pi\text{-}1}$. We define inflated residence time formally in Section 5. For now, we just need to know that its mean,
$$\mathbf{E}[S^{\pi\text{-}1}] = \int_0^\infty \frac{z_x^\pi}{\overline{\rho}(y_x^\pi)}\,\mathrm{d}F(x), \tag{4.6}$$
can be written in terms of $y_x^\pi$ and $z_x^\pi$. Note that $\mathbf{E}[R^{\pi\text{-}1}] \le \mathbf{E}[S^{\pi\text{-}1}]$.

Recall that proving our main results rests on characterizing the heavy-traffic scaling of $\mathbf{E}[Q^\pi]$, $\mathbf{E}[R^\pi]$, and $\mathbf{E}[S^\pi]$, where $\pi$ is either M-Gittins or M-SERPT. As we see in (4.4) and (4.6), both $y_x^\pi$ and $z_x^\pi$ feature prominently in the formulas of $\mathbf{E}[Q^\pi]$, $\mathbf{E}[R^\pi]$, and $\mathbf{E}[S^\pi]$.
This means the first step of characterizing the heavy-traffic scaling of $\mathbf{E}[Q^\pi]$, $\mathbf{E}[R^\pi]$, and $\mathbf{E}[S^\pi]$ is understanding $y_x^\pi$ and $z_x^\pi$. This is the subject of Section 6, in which we prove bounds on $y_x^\pi$ and $z_x^\pi$ for a wide class of job size distributions. Table 4.1 summarizes these results. The main takeaway is that $y_x^\pi$ and $z_x^\pi$ are always polynomially bounded relative to $x$.

The new job and old job age cutoffs of $x$ are equivalent to what Scully et al. [33] call the previous and next hill ages of $x$.

Table 4.1. New Job and Old Job Age Cutoff Bounds

| Size Distribution | Quantity | Bound | Reference |
|---|---|---|---|
| $\mathrm{OR}(-\infty, -1)$ | $y_x^{\text{M-Gittins-1}}$ | $\Theta(x)$ | Theorem 6.4 |
| | $z_x^{\text{M-Gittins-1}}$ | $\Theta(x)$ | |
| | $y_x^{\text{M-SERPT-1}}$ | $\Theta(x)$ | Theorem 6.2 |
| | $z_x^{\text{M-SERPT-1}}$ | $\Theta(x)$ | |
| $\mathrm{QDHR}$ | $y_x^{\text{M-Gittins-1}}$ | $\Omega(x^{1/\gamma})$ for some $\gamma \ge 1$ | |
| | $z_x^{\text{M-Gittins-1}}$ | $O(x^{\gamma})$ for some $\gamma \ge 1$ | |
| $\mathrm{QDHR} \cup \mathrm{QIMRL}$ | $y_x^{\text{M-SERPT-1}}$ | $\Omega(x^{1/\gamma})$ for some $\gamma \ge 1$ | |
| | $z_x^{\text{M-SERPT-1}}$ | $O(x^{\gamma})$ for some $\gamma \ge 1$ | |

These bounds on $y_x^\pi$ and $z_x^\pi$ are critical for characterizing heavy-traffic scaling of $\mathbf{E}[Q^{\pi\text{-}1}]$, $\mathbf{E}[R^{\pi\text{-}1}]$, and $\mathbf{E}[S^{\pi\text{-}1}]$.

Armed with bounds on age cutoffs, we are ready to characterize heavy-traffic scaling of mean waiting, residence, and inflated residence times. This is the subject of Section 7, in which
• Theorems 7.4, 7.5, 7.9 and 7.10 characterize M-SERPT’s heavy-traffic scaling; and
• Theorems 7.11 and 7.12 characterize M-Gittins’s heavy-traffic scaling in terms of M-SERPT’s.
Table 4.2 summarizes these results. The main takeaway of the table is that for all of the finite-variance job size distribution classes considered, if $\pi$ is either M-Gittins or M-SERPT, $\mathbf{E}[Q^{\pi\text{-}1}]$ dominates $\mathbf{E}[R^{\pi\text{-}1}]$ and $\mathbf{E}[S^{\pi\text{-}1}]$, with the latter sometimes requiring an additional condition. Specifically,
• $\mathbf{E}[Q^{\pi\text{-}1}]$ grows polynomially in $1/(1 - \rho)$, whereas
• $\mathbf{E}[R^{\pi\text{-}1}]$ and $\mathbf{E}[S^{\pi\text{-}1}]$ grow subpolynomially in $1/(1 - \rho)$.

We now prove our main results. The proofs of Theorems 3.1 and 3.2 both follow the same three main steps, where $\pi$ is M-Gittins or M-SERPT, respectively:
• Theorem 5.1 bounds $\mathbf{E}[T^{\pi\text{-}k}]$ in terms of M/G/1 quantities.
• The results in Table 4.2 show $\lim_{\rho \to 1} \mathbf{E}[T^{\pi\text{-}k}] / \mathbf{E}[Q^{\pi\text{-}1}] = 1$.
• Prior work relates $\mathbf{E}[Q^{\pi\text{-}1}]$ to $\mathbf{E}[T^{\text{Gittins-1}}]$.

Proof of Theorem 3.1.
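An M/G/1 can simulate any M/G/k policy by sharing the server, so the fact that Gittins minimizes M/G/1 mean response time means $\mathbf{E}[T^{\text{M-Gittins-}k}] / \mathbf{E}[T^{\text{Gittins-1}}] \ge 1$. It therefore suffices to show $\lim_{\rho \to 1} \mathbf{E}[T^{\text{M-Gittins-}k}] / \mathbf{E}[T^{\text{Gittins-1}}] \le 1$. Theorem 5.1 implies
$$\frac{\mathbf{E}[T^{\text{M-Gittins-}k}]}{\mathbf{E}[Q^{\text{M-Gittins-1}}]} \le 1 + \frac{k\,\mathbf{E}[R^{\text{M-Gittins-1}}] + (k - 1)\,\mathbf{E}[S^{\text{M-Gittins-1}}]}{\mathbf{E}[Q^{\text{M-Gittins-1}}]}.$$
Theorems 7.5 and 7.9–7.12 imply that the second term vanishes in the $\rho \to 1$ limit. Prior work shows
$$\mathbf{E}[Q^{\text{M-Gittins-1}}] \le \mathbf{E}[Q^{\text{Gittins-1}}] \le \mathbf{E}[T^{\text{Gittins-1}}], \tag{4.7}$$
implying the desired result. □

(That is, for all the classes in Table 4.2 except $\mathrm{OR}(-2, -1)$.)

Table 4.2. Heavy-Traffic Scaling of Waiting, Residence, and Inflated Residence Times

| Size Distribution | Quantity | Heavy-Traffic Scaling | Reference |
|---|---|---|---|
| $\mathrm{OR}(-2, -1)$ | $\mathbf{E}[Q^{\pi\text{-}1}]$ | $O(-\log(1 - \rho))$ | Theorems 7.4 and 7.11 |
| | $\mathbf{E}[R^{\pi\text{-}1}]$ | $O(-\log(1 - \rho))$ | |
| $\mathrm{OR}(-\infty, -2)$ | $\mathbf{E}[Q^{\pi\text{-}1}]$ | $\Omega((1 - \rho)^{-\delta})$ for some $\delta > 0$ | |
| | $\mathbf{E}[R^{\pi\text{-}1}]$ | $O(-\log(1 - \rho))$ | |
| | $\mathbf{E}[S^{\pi\text{-}1}]$ | $O(-\log(1 - \rho))$ | Theorems 7.9 and 7.12 |
| $\mathrm{MDA}(\Lambda)$ | $\mathbf{E}[Q^{\pi\text{-}1}]$ | $\Omega((1 - \rho)^{-(1 - \varepsilon)})$ for all $\varepsilon > 0$ | |
| | $\mathbf{E}[R^{\pi\text{-}1}]$ | $O((1 - \rho)^{-\varepsilon})$ for all $\varepsilon > 0$ | |
| $\mathrm{MDA}(\Lambda) \cap \mathrm{QDHR}$ | $\mathbf{E}[S^{\pi\text{-}1}]$ | $O((1 - \rho)^{-\varepsilon})$ for all $\varepsilon > 0$ | |
| $\mathrm{MDA}(\Lambda) \cap \mathrm{QIMRL}$ | $\mathbf{E}[S^{\text{M-SERPT-1}}]$ | $O((1 - \rho)^{-\varepsilon})$ for all $\varepsilon > 0$ | |
| $\mathrm{ENBUE}$ | $\mathbf{E}[Q^{\pi\text{-}1}]$ | $\Theta((1 - \rho)^{-1})$ | Theorems 7.5 and 7.11 |
| | $\mathbf{E}[R^{\pi\text{-}1}]$ | $\Theta(1)$ | |
| $\mathrm{Bounded}$ | $\mathbf{E}[S^{\pi\text{-}1}]$ | $\Theta(1)$ | Theorems 7.5 and 7.12 |

These bounds hold when $\pi$ is either M-Gittins or M-SERPT, except for the $\mathrm{MDA}(\Lambda) \cap \mathrm{QIMRL}$ case, in which the bound holds only for M-SERPT.

Proof of Theorem 3.2.
Theorem 5.1 implies
$$\frac{\mathbf{E}[T^{\text{M-SERPT-}k}]}{\mathbf{E}[Q^{\text{M-SERPT-1}}]} \le 1 + \frac{k\,\mathbf{E}[R^{\text{M-SERPT-1}}] + (k - 1)\,\mathbf{E}[S^{\text{M-SERPT-1}}]}{\mathbf{E}[Q^{\text{M-SERPT-1}}]}.$$
Theorems 7.5, 7.9 and 7.10 imply that the second term vanishes in the $\rho \to 1$ limit. Prior work shows
$$\mathbf{E}[Q^{\text{M-SERPT-1}}] \le 2\,\mathbf{E}[Q^{\text{M-Gittins-1}}],$$
which combines with (4.7) to imply the desired result. □

To prove Theorem 3.3, we simply combine the results in Table 4.2.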
Proof of Theorem 3.3.
We examine each case in turn.
• For $X \in \mathrm{OR}(-2, -1)$, we use Theorems 7.4 and 7.11.
• For $X \in \mathrm{OR}(-\infty, -2) \cup \mathrm{MDA}(\Lambda)$, we use Theorems 7.9–7.11.
• For $X \in \mathrm{ENBUE}$, we have $r_{\text{M-SERPT}}(a) = \Theta(1)$ by Definition 2.10, so we use Theorems 7.5 and 7.11. □

M/G/k RESPONSE TIME BOUND

This section bounds M/G/k mean response time under any monotonic SOAP policy $\pi$. The notation used in Theorem 5.1 below is summarized in Table 5.1.

While Scully et al. [33, Lemma 5.6] mention Gittins instead of M-Gittins, they prove the desired statement for M-Gittins as an intermediate step of their proof.
Table 5.1. Summary of Notation

| Notation | Description | Reference |
|---|---|---|
| $\pi$-$k$ | $k$-server version of SOAP policy $\pi$ | Section 2.1 |
| $\overline{\rho}(a)$, $\tau(a)$ | functions of moments of $\min\{X, a\}$ | (4.5) |
| $y_x^\pi$, $z_x^\pi$ | new job and old job age cutoffs | Definition 4.1 |
| $T^{\pi\text{-}k}$ | response time under $\pi$-$k$ | Section 2.1 |
| $Q^{\pi\text{-}1}$ | waiting time under $\pi$-1 | (4.4) |
| $R^{\pi\text{-}1}$ | residence time under $\pi$-1 | (4.4) |
| $S^{\pi\text{-}1}$ | inflated residence time under $\pi$-1 | (4.6) |

Additionally, $T_x^{\pi\text{-}k}$ is size-conditional response time for size $x$, and similarly for $Q_x^{\pi\text{-}1}$, $R_x^{\pi\text{-}1}$, and $S_x^{\pi\text{-}1}$.

Theorem 5.1.
For any monotonic SOAP policy $\pi$,
$$\mathbf{E}[T_x^{\pi\text{-}k}] \le \frac{1}{\overline{\rho}(y_x^\pi)} \left( \frac{\tau(z_x^\pi)}{\overline{\rho}(z_x^\pi)} + kx + (k - 1) z_x^\pi \right), \tag{5.1}$$
and therefore
$$\mathbf{E}[T^{\pi\text{-}k}] \le \mathbf{E}[Q^{\pi\text{-}1}] + k\,\mathbf{E}[R^{\pi\text{-}1}] + (k - 1)\,\mathbf{E}[S^{\pi\text{-}1}].$$

Proof.
In order to bound M/G/k mean response time, we use a tagged job method in the style of Grosof et al. [15], but we generalize it to allow an arbitrary monotonic SOAP policy $\pi$. We consider an arbitrary “tagged” job $J$ of size $x$ arriving to a steady-state system. Our goal is to analyze the distribution of $J$’s response time.

The first step is a shift in perspective: instead of thinking about time passing, we reason in terms of work completed. Since each of the $k$ servers works at rate $1/k$, the system can complete work at rate 1. While $J$ is in the system, servers sometimes complete work and are sometimes left idle. This means $J$’s response time is the sum of
• the amount of work completed while $J$ is in the system and
• the amount of work “wasted”, meaning service capacity left idle, while $J$ is in the system.
We bound $J$’s response time by bounding the total amount of work above. We do so by dividing it into several pieces:
• Tagged work: the work of $J$ itself.
• Virtual work: work on jobs prioritized behind $J$, plus wasted work due to servers left idle.
• Relevant work: work on jobs prioritized ahead of $J$. We divide this into two subcategories:
  – Old relevant work: relevant work on old jobs, namely those present when $J$ arrives.
  – New relevant work: relevant work on new jobs, namely those that arrive after $J$.

For the first two categories, we have the same simple bound as Grosof et al. [15]: tagged work and virtual work add up to at most $kx$. This is because tagged work is $J$’s size $x$, and the scheduling policy ensures that a server only completes virtual work while $J$ is in service at another server. However, bounding the two relevant work categories is more complicated than in Grosof et al. [15].

We begin by asking: what rank must a job have to contribute to relevant work? Note that the job $J$ will never have rank greater than its rank upon completion, $r_\pi(x)$, since $\pi$ is a monotonic policy. As a result, all new relevant work is from jobs with rank strictly less than $r_\pi(x)$, and all old relevant work is from jobs with rank less than or equal to $r_\pi(x)$. We can put this in terms of the age cutoffs defined in Definition 4.1:
• jobs contribute new relevant work up to at most age $y_x^\pi$, and
• jobs contribute old relevant work up to at most age $z_x^\pi$.
In the rest of this proof, $y_x$ and $z_x$ refer to $y_x^\pi$ and $z_x^\pi$, respectively.

To help us bound the amount of old relevant work completed while $J$ is in the system, we define a new concept: the amount of relevant work in the M/G/k system under $\pi$.

Definition 5.2.
Let $\mathrm{RelWork}_x^{\pi\text{-}k}(t)$ denote the amount of work in the M/G/k at time $t$ which is relevant to a job $J$ of size $x$:
$$\mathrm{RelWork}_x^{\pi\text{-}k}(t) = \sum_{\text{jobs } J'} \bigl( \min\{z_x, x_{J'}\} - a_{J'}(t) \bigr)^+,$$
where $x_{J'}$ is the size of job $J'$ and $a_{J'}(t)$ is its age at time $t$. We write $\mathrm{RelWork}_x^{\pi\text{-}k}$ for the steady-state distribution of the amount of relevant work in the M/G/k system.

Since $J$ is a Poisson arrival, $\mathrm{RelWork}_x^{\pi\text{-}k}$ is the distribution of the amount of relevant work in the system when $J$ arrives. That amount is an upper bound on the amount of old relevant work that will be completed while $J$ is in the system.

To bound new relevant work, note that if a job $J'$ of size $x'$ arrives while $J$ is in the system, then $J'$ contributes at most $\min\{x', y_x\}$ new relevant work. As a result, new relevant work can be upper bounded by considering a transformed M/G/1 system in which the job size distribution is
$$X_{y_x} =_{\mathrm{st}} \min\{X, y_x\}.$$
The amount of new relevant work that arrives to our real system is upper bounded by the total amount of work that arrives to the transformed system. Let $B_{y_x}(w)$ be the length of a busy period in the transformed M/G/1 system started by an initial amount of work $w$. If $w$ is the total amount of tagged, virtual, and old relevant work, then the amount of new relevant work is at most $B_{y_x}(w) - w$. Combining our bounds, we obtain
$$T_x^{\pi\text{-}k} \le_{\mathrm{st}} B_{y_x}\bigl( kx + \mathrm{RelWork}_x^{\pi\text{-}k} \bigr).$$
Applying Lemma 5.3, stated and proven later in this section, yields
$$T_x^{\pi\text{-}k} \le_{\mathrm{st}} B_{y_x}\bigl( kx + \mathrm{RelWork}_x^{\pi\text{-}1} + (k - 1) z_x \bigr). \tag{5.2}$$
Taking expectations gives us
$$\mathbf{E}[T_x^{\pi\text{-}k}] \le \frac{\mathbf{E}[\mathrm{RelWork}_x^{\pi\text{-}1}] + kx + (k - 1) z_x}{\overline{\rho}(y_x)}.$$
Because $\pi$-1 is work conserving with respect to relevant work, the Pollaczek-Khinchine formula tells us
$$\mathbf{E}[\mathrm{RelWork}_x^{\pi\text{-}1}] = \frac{\tau(z_x)}{\overline{\rho}(z_x)},$$
which completes the proof of (5.1).

To connect (5.1) to the quantities $\mathbf{E}[Q^\pi]$, $\mathbf{E}[R^\pi]$, and $\mathbf{E}[S^\pi]$, we rewrite (5.2) as
$$T_x^{\pi\text{-}k} \le_{\mathrm{st}} B_{y_x}(\mathrm{RelWork}_x^{\pi\text{-}1}) + \sum_{i=1}^{k} B_{y_x}(x) + \sum_{i=1}^{k-1} B_{y_x}(z_x), \tag{5.3}$$
where all of the relevant busy periods are independent. Prior work on SOAP policies [32, 33] gives names to some of the distributions on the right-hand side.
• The size-conditional waiting time for size $x$ is the random variable $Q_x^{\pi\text{-}1} =_{\mathrm{st}} B_{y_x}(\mathrm{RelWork}_x^{\pi\text{-}1})$, and waiting time is $Q^{\pi\text{-}1} =_{\mathrm{st}} Q_X^{\pi\text{-}1}$.
• The size-conditional residence time for size $x$ is the random variable $R_x^{\pi\text{-}1} =_{\mathrm{st}} B_{y_x}(x)$, and residence time is $R^{\pi\text{-}1} =_{\mathrm{st}} R_X^{\pi\text{-}1}$.
• As there is no concise name for $B_{y_x}(z_x)$ in prior work, we define size-conditional inflated residence time for size $x$ to be the random variable $S_x^{\pi\text{-}1} =_{\mathrm{st}} B_{y_x}(z_x)$, and we define inflated residence time to be $S^{\pi\text{-}1} =_{\mathrm{st}} S_X^{\pi\text{-}1}$.
With these definitions in place, (5.3) gives us
$$T_x^{\pi\text{-}k} \le_{\mathrm{st}} Q_x^{\pi\text{-}1} + \sum_{i=1}^{k} R_x^{\pi\text{-}1} + \sum_{i=1}^{k-1} S_x^{\pi\text{-}1},$$
so the result follows by taking the expectation of $T^{\pi\text{-}k} =_{\mathrm{st}} T_X^{\pi\text{-}k}$. □

We define waiting, residence, and inflated residence times in terms of relevant busy periods. Waiting and residence times also have natural definitions as components of M/G/1 response time [32, 33], but we do not need them in this paper.

Theorem 5.1 applies only to monotonic SOAP policies. It is tempting to try to apply the same technique to SOAP policies with nonmonotonic rank functions, but as we discuss in Appendix A, the argument does not readily generalize.

The proof of Theorem 5.1 assumes a bound on $\mathrm{RelWork}_x^{\pi\text{-}k}$. We prove the bound in the following lemma, which generalizes a similar lemma of Grosof et al. [15, Lemma 7.10].

Lemma 5.3.
Let $\Delta_x(t) = \mathrm{RelWork}_x^{\pi\text{-}k}(t) - \mathrm{RelWork}_x^{\pi\text{-}1}(t)$. Then $\Delta_x(t) \le (k - 1) z_x^\pi$ for all times $t$, and therefore
$$\mathrm{RelWork}_x^{\pi\text{-}k} \le_{\mathrm{st}} \mathrm{RelWork}_x^{\pi\text{-}1} + (k - 1) z_x^\pi.$$

Proof.
Throughout this proof, $z_x$ refers to $z_x^\pi$. We consider a pair of coupled systems with the same arrival sequence:
• System 1, an M/G/1 using $\pi$-1; and
• System $k$, an M/G/k using $\pi$-$k$.
Our approach is to bound the difference in relevant work between Systems 1 and $k$ at any time $t$.

Call a job relevant if it has age less than $z_x$. These are the only jobs that contribute relevant work. To bound $\Delta_x(t)$, we divide times $t$ into two types of intervals:
• few-jobs intervals, during which there are fewer than $k$ relevant jobs in System $k$; and
• many-jobs intervals, during which there are at least $k$ relevant jobs in System $k$.
Note that both types of intervals are defined based on System $k$ alone, so System 1 may or may not have relevant jobs during either type of interval.

Any time $t$ is in either a few-jobs interval or a many-jobs interval. If $t$ is in a few-jobs interval, the argument is simple: there are at most $k - 1$ relevant jobs in System $k$ at time $t$, each contributing at most $z_x$ relevant work, so
$$\Delta_x(t) \le \mathrm{RelWork}_x^{\pi\text{-}k}(t) \le (k - 1) z_x.$$
Suppose instead that $t$ is in a many-jobs interval. Let $s \le t$ be the start of the many-jobs interval containing $t$. We will show
$$\Delta_x(t) \le \Delta_x(s) \le (k - 1) z_x.$$

We begin by showing $\Delta_x(t) \le \Delta_x(s)$. Note that arrivals do not affect $\Delta_x$, because the two systems experience the same arrivals and have the same definition of relevant work. Next, note that service to irrelevant jobs does not affect $\Delta_x$, because irrelevant jobs never become relevant under $\pi$, since $\pi$ is a monotonic policy. In fact, the only way that $\Delta_x$ changes over a many-jobs period is due to service to relevant jobs. System $k$ serves relevant jobs on all $k$ servers throughout a many-jobs period, completing relevant work at rate 1. System 1 may or may not serve relevant jobs during a many-jobs period, so it completes relevant work at rate at most 1. This means $\Delta_x(t) \le \Delta_x(s)$, as desired.

All that remains is to show that $\Delta_x(s) \le (k - 1) z_x$. Recall that $s$ is the start of a many-jobs interval. Many-jobs intervals cannot start due to irrelevant jobs becoming relevant, because $\pi$ is a monotonic policy. This means each many-jobs interval starts due to a relevant job arriving while System $k$ has $k - 1$ relevant jobs. Arrivals do not affect $\Delta_x$, as discussed above. This means $\Delta_x(s) = \Delta_x(s^-)$, where $s^-$ is the instant before the arrival that starts the many-jobs interval. But $s^-$ is in a few-jobs interval, so
$$\Delta_x(s) = \Delta_x(s^-) \le (k - 1) z_x.$$ □

Fig. 6.1. Relationship Between SERPT and M-SERPT Rank Functions. Here $y_x$ stands for $y_x^{\text{M-SERPT}}$, and similarly for $z_x$, $y_{x'}$, and $z_{x'}$.

We now have a bound on M/G/k mean response time under monotonic SOAP policies $\pi$, including M-Gittins and M-SERPT. The bound (Theorem 5.1) is expressed in terms of $\mathbf{E}[Q^{\pi\text{-}1}]$, $\mathbf{E}[R^{\pi\text{-}1}]$, and $\mathbf{E}[S^{\pi\text{-}1}]$, quantities which in turn are expressed in terms of the new job and old job age cutoffs $y_x^\pi$ and $z_x^\pi$. In order to prove optimality of M-Gittins in the heavy-traffic M/G/k, we need to understand the heavy-traffic behavior of $\mathbf{E}[Q^{\pi\text{-}1}]$, $\mathbf{E}[R^{\pi\text{-}1}]$, and $\mathbf{E}[S^{\pi\text{-}1}]$, which, as we will see in Section 7, boils down to understanding the behavior of $y_x^\pi$ and $z_x^\pi$ in the $x \to \infty$ limit. This section is thus devoted to asymptotically bounding the new job and old job age cutoffs, and more generally the rank functions, of M-Gittins and M-SERPT.

Recall from Definition 2.2 that SERPT’s rank function is used to define M-SERPT’s. The following lemma shows that the two rank functions are equal at the new job and old job age cutoffs, and similarly for Gittins and M-Gittins. Figure 6.1 gives an intuitive picture of the result.

Lemma 6.1.
The SERPT and M-SERPT rank functions are related by
$$r_{\text{SERPT}}(y_x^{\text{M-SERPT}}) = r_{\text{M-SERPT}}(y_x^{\text{M-SERPT}}) = r_{\text{M-SERPT}}(x) = r_{\text{M-SERPT}}(z_x^{\text{M-SERPT}}) = r_{\text{SERPT}}(z_x^{\text{M-SERPT}}),$$
and analogously for Gittins and M-Gittins.

Proof.
We prove the statement for SERPT and M-SERPT, as the proof for Gittins and M-Gittins is analogous. Throughout this proof, $y_x$ and $z_x$ refer to $y_x^{\text{M-SERPT}}$ and $z_x^{\text{M-SERPT}}$, respectively. The illustration in Fig. 6.1 may provide helpful intuition for the following argument.

We first show the outer equalities. Definition 4.1 implies that $r_{\text{M-SERPT}}$ is increasing in the intervals $(y_x - \delta, y_x)$ and $(z_x, z_x + \delta)$ for some $\delta > 0$. By Definition 2.2, for $r_{\text{M-SERPT}}$ to be increasing at age $a$, we must have $r_{\text{M-SERPT}}(a) = r_{\text{SERPT}}(a)$, so continuity of $r_{\text{M-SERPT}}$ (Lemma 2.6) implies the outer equalities.

By (4.3) and the monotonicity of $r_{\text{M-SERPT}}$, it remains only to show $r_{\text{M-SERPT}}(y_x) = r_{\text{M-SERPT}}(z_x)$. This is immediate if $y_x = z_x$, and if $y_x < z_x$, then $r_{\text{M-SERPT}}$ is constant over the interval $[y_x, z_x)$, so the result follows by the continuity of $r_{\text{M-SERPT}}$ (Lemma 2.6). □
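Lemma 6.1 can be illustrated numerically. The sketch below is not from the paper: it assumes a hypothetical job size distribution (a 50/50 mixture of Uniform(2, 3) and an exponential with mean 4), chosen only because its SERPT rank first decreases and then increases, so the M-SERPT rank has a flat region. It computes the SERPT rank on an age grid, takes the running maximum to obtain the M-SERPT rank, and reads off grid approximations of the age cutoffs from Definition 4.1. All function and variable names are illustrative.

```python
import bisect
import itertools
import math

def tail(t):
    """Tail of an illustrative job size distribution (an assumption, not from
    the paper): a 50/50 mixture of Uniform(2, 3) and Exponential with mean 4."""
    uniform_tail = 1.0 if t < 2.0 else max(3.0 - t, 0.0)
    return 0.5 * uniform_tail + 0.5 * math.exp(-t / 4.0)

def integrated_tail(a):
    """Closed-form integral of tail(t) over t in [a, infinity)."""
    if a < 2.0:
        uniform_part = (2.0 - a) + 0.5
    else:
        uniform_part = max(3.0 - a, 0.0) ** 2 / 2.0
    return 0.5 * uniform_part + 2.0 * math.exp(-a / 4.0)

def r_serpt(a):
    """SERPT rank: expected remaining size E[X - a | X > a]."""
    return integrated_tail(a) / tail(a)

STEP = 0.001
AGES = [i * STEP for i in range(6001)]  # age grid on [0, 6]
R_SERPT = [r_serpt(a) for a in AGES]
# M-SERPT rank: running maximum of the SERPT rank over ages [0, a].
R_MSERPT = list(itertools.accumulate(R_SERPT, max))

def age_cutoffs(x):
    """Grid approximations of (y_x, z_x) from Definition 4.1 under M-SERPT."""
    rank = R_MSERPT[min(round(x / STEP), len(AGES) - 1)]
    first_at_least = bisect.bisect_left(R_MSERPT, rank)  # first age with rank >= r(x)
    first_above = bisect.bisect_right(R_MSERPT, rank)    # first age with rank > r(x)
    y = AGES[first_at_least]
    # If the rank never exceeds r(x) on the grid, report an infinite cutoff.
    z = AGES[first_above - 1] if first_above < len(AGES) else math.inf
    return y, z
```

For a size inside the flat region, such as `age_cutoffs(1.0)`, the two cutoffs are the endpoints of that region, and `r_serpt` evaluated at either endpoint agrees with the M-SERPT rank of the size up to the grid resolution. Ties are detected by exact floating-point equality, which is reliable for this flat region but may split a plateau that is constant only up to rounding.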
In this section we show two bounds on $y_x^{\text{M-SERPT}}$ and $z_x^{\text{M-SERPT}}$, each subject to a different assumption on the job size distribution.

Theorem 6.2. If $X \in \mathrm{OR}(-\infty, -1)$, then
$$r_{\text{SERPT}}(a) = \Theta(a), \quad r_{\text{M-SERPT}}(a) = \Theta(a), \quad y_x^{\text{M-SERPT}} = \Theta(x), \quad z_x^{\text{M-SERPT}} = \Theta(x).$$

Proof.
By Definition 2.7, there exists $\alpha > 1$ such that
$$r_{\text{SERPT}}(a) = \int_a^\infty \frac{\overline{F}(t)}{\overline{F}(a)}\,\mathrm{d}t \le O(1) \int_a^\infty \left( \frac{t}{a} \right)^{-\alpha} \mathrm{d}t = O(a),$$
and $r_{\text{SERPT}}(a) = \Omega(a)$ follows similarly. This implies
$$r_{\text{M-SERPT}}(a) = \max_{b \in [0, a]} r_{\text{SERPT}}(b) = \max_{b \in [0, a]} \Theta(b) = \Theta(a),$$
so the result follows from Lemma 6.1. □

Theorem 6.3. If $X \in \mathrm{QDHR} \cup \mathrm{QIMRL}$ with exponent $\gamma$, then
$$y_x^{\text{M-SERPT}} = \Omega(x^{1/\gamma}), \qquad z_x^{\text{M-SERPT}} = O(x^{\gamma}).$$

Proof.
The QDHR case follows from Theorem 6.5 (Section 6.2) and a result of Scully et al. [33, Eq. (3.8)] stating
$$y_x^{\text{M-Gittins}} \le y_x^{\text{M-SERPT}} \le z_x^{\text{M-SERPT}} \le z_x^{\text{M-Gittins}},$$
so only the QIMRL case remains. In the rest of this proof, $y_x$ and $z_x$ refer to $y_x^{\text{M-SERPT}}$ and $z_x^{\text{M-SERPT}}$, respectively. By (4.3), it suffices to show $z_x = O(y_x^\gamma)$. Because $X \in \mathrm{QIMRL}$ with exponent $\gamma$, there exists a strictly increasing function $m : \mathbb{R}_+ \to \mathbb{R}_+$ such that for all ages $a$,
$$a \le m^{-1}\bigl( r_{\text{SERPT}}(a) \bigr) \le O(a^\gamma).$$
The result follows by plugging in $a = y_x$ and $a = z_x$ and applying Lemma 6.1. □

In this section we show two bounds on $y_x^{\text{M-Gittins}}$ and $z_x^{\text{M-Gittins}}$, each subject to a different assumption on the job size distribution.

Theorem 6.4. If $X \in \mathrm{OR}(-\infty, -1)$, then
$$y_x^{\text{M-Gittins}} = \Theta(x), \qquad z_x^{\text{M-Gittins}} = \Theta(x).$$

Theorem 6.5. If $X \in \mathrm{QDHR}$ with exponent $\gamma$, then
$$y_x^{\text{M-Gittins}} = \Omega(x^{1/\gamma}), \qquad z_x^{\text{M-Gittins}} = O(x^{\gamma}).$$

These bounds are harder to prove than their M-SERPT counterparts from Section 6.1. The most important component is the following definition, which helps us better understand the M-Gittins rank function and relate it to the simpler M-SERPT rank function.
Definition 6.6.
The time per completion over an age interval $(a, b]$ is
$$\eta(a, b) = \frac{\mathbf{E}[\min\{X, b\} - a \mid X > a]}{\mathbf{P}\{X < b \mid X > a\}} = \frac{\int_a^b \overline{F}(t)\,\mathrm{d}t}{\overline{F}(a) - \overline{F}(b)}.$$
We extend this definition to the $b \to a$ and $b \to \infty$ limits:
$$\eta(a, a) = \frac{1}{h(a)}, \qquad \eta(a, \infty) = \mathbf{E}[X - a \mid X > a].$$

We can write the rank functions of SERPT, M-SERPT, Gittins, and M-Gittins in terms of $\eta$ as
$$\begin{aligned}
r_{\text{SERPT}}(a) &= \eta(a, \infty), \\
r_{\text{M-SERPT}}(a) &= \max_{b \in [0, a]} \eta(b, \infty), \\
r_{\text{Gittins}}(a) &= \min_{b \in [a, \infty]} \eta(a, b), \\
r_{\text{M-Gittins}}(a) &= \max_{b \in [0, a]} \min_{c \in [b, \infty]} \eta(b, c).
\end{aligned} \tag{6.1}$$
Armed with Definition 6.6 and (6.1), we are ready to prove Theorems 6.4 and 6.5. The former proof relies on some technical lemmas that we defer to Section 6.3.

Proof of Theorem 6.4.
Throughout this proof, $y_x$ and $z_x$ refer to $y_x^{\text{M-Gittins}}$ and $z_x^{\text{M-Gittins}}$, respectively. By (4.3), it suffices to show there exist $C_0, x_0 > 0$ such that for all $x \ge x_0$,
$$z_x \le C_0\, y_x.$$
We will set $C_0 \ge 2$, which covers the $z_x \le 2 y_x$ case. The rest of the proof is thus devoted to the $z_x > 2 y_x$ case. Our approach is to show there exist $C_1, C_2 > 0$ such that for all $x \ge x_0$,
$$C_1\, y_x \ge r_{\text{Gittins}}(y_x) \ge C_2\, z_x. \tag{6.2}$$

We begin with the upper bound on $r_{\text{Gittins}}(y_x)$. By Lemma 6.1, we have $r_{\text{Gittins}}(y_x) = r_{\text{M-Gittins}}(y_x)$ for all sizes $x$, and by (6.1), we have $r_{\text{M-Gittins}}(a) \le r_{\text{M-SERPT}}(a)$ for all ages $a$. Combining these observations with Theorem 6.2 implies $r_{\text{Gittins}}(y_x) = O(y_x)$ and thereby implies the desired upper bound from (6.2).

We now turn to the lower bound on $r_{\text{Gittins}}(y_x)$. This requires Lemmas 6.7 and 6.8, which are facts about $\eta$ that we prove in Section 6.3. Combining Lemma 6.7 with (6.1) and the fact that we are in the $z_x > 2 y_x$ case gives us
$$r_{\text{Gittins}}(y_x) = \eta(y_x, z_x) \ge \eta\left( \frac{z_x}{2}, z_x \right).$$
By Lemma 6.8, there exist $C_2, x_1 > 0$ such that for all $x$ with $z_x / 2 > x_1$,
$$\eta\left( \frac{z_x}{2}, z_x \right) \ge C_2\, z_x,$$
implying the desired lower bound from (6.2). □

Our time per completion function is the reciprocal of what Aalto et al. [3, 4] call the efficiency function.

This would be more subtle if $\lim_{x \to \infty} y_x$ were finite, but Theorem 6.2 and a result of Aalto et al. [4, Proposition 9] imply $\lim_{x \to \infty} y_x = \infty$.

Proof of Theorem 6.5.
Throughout this proof, $y_x$ and $z_x$ refer to $y_x^{\text{M-Gittins}}$ and $z_x^{\text{M-Gittins}}$, respectively. By (4.3), it suffices to show $z_x = O(y_x^\gamma)$. Because $X \in \mathrm{QDHR}$ with exponent $\gamma$, there exists a strictly increasing function $m : \mathbb{R}_+ \to \mathbb{R}_+$ such that for all sizes $x$,
$$m(x) \le \frac{1}{h(x)} \le m(O(x^\gamma)).$$
We have $r_{\text{Gittins}}(y_x) \le 1 / h(y_x)$ by (6.1), and Lemma 6.1 implies $r_{\text{Gittins}}(z_x) = r_{\text{Gittins}}(y_x)$, so
$$r_{\text{Gittins}}(z_x) \le m(O(y_x^\gamma)).$$
It remains only to lower bound $r_{\text{Gittins}}(z_x)$. We do so using the observation that for any age $a$,
$$r_{\text{Gittins}}(a) = \min_{b \in [a, \infty]} \eta(a, b) = \left( \max_{b \in [a, \infty]} \frac{\int_a^b \overline{F}(t) h(t)\,\mathrm{d}t}{\int_a^b \overline{F}(t)\,\mathrm{d}t} \right)^{-1} \ge \Bigl( \sup_{b > a} h(b) \Bigr)^{-1} = \inf_{b > a} \frac{1}{h(b)} \ge m(a),$$
where the first inequality follows from viewing the ratio of integrals as a weighted average. Plugging in $a = z_x$ implies $m(z_x) \le m(O(y_x^\gamma))$, so the result follows because $m$ is strictly increasing. □

Lemma 6.7.
For all sizes $x$ and ages $a$, if $y_x < a < z_x$, then
$$r_{\text{Gittins}}(y_x) = \eta(y_x, z_x) \ge \eta(a, z_x).$$

Proof.
A property of the Gittins index [12, Lemma 2.2] implies $r_{\text{Gittins}}(y_x) = \eta(y_x, z_x)$. In particular, for any $a \ne z_x$,
$$\eta(y_x, a) \ge \eta(y_x, z_x). \tag{6.3}$$
A basic property of the $\eta$ function [33, Eq. (D.3)] is that for any $d < e < f$,
$$\eta(d, e) \ge \eta(d, f) \iff \eta(d, f) \ge \eta(e, f).$$
Plugging in $d = y_x$, $e = a$, and $f = z_x$ and applying (6.3) yields $\eta(y_x, z_x) \ge \eta(a, z_x)$, as desired. □

Lemma 6.8. If $X \in \mathrm{OR}(-\infty, -1)$, then there exist constants $C, x_0 > 0$ such that for all $b > a > x_0$,
$$\eta(a, b) \ge C a \left( 1 - \frac{a}{b} \right).$$

The proof given by Gittins et al. [12] is in a discrete setting, but essentially the same proof carries over to our continuous setting.
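Before the proof, the inequality of Lemma 6.8 can be sanity-checked on a concrete case. The sketch below is an illustration, not part of the paper: it assumes a Pareto tail $t^{-\beta}$ on $t \ge 1$, for which the tail ratio equals $(t/a)^{-\beta}$ exactly, so the constant suggested by the proof is $1/(\beta - 1)$ when $\beta \ge 2$ and $1$ when $1 < \beta < 2$. The function names and the small grid of test points are hypothetical choices.

```python
import math

def eta(a, b, beta):
    """Time per completion eta(a, b) for the Pareto tail t**(-beta) on t >= 1,
    assuming 1 <= a < b."""
    # Integral of the tail over [a, b].
    served = (a ** (1.0 - beta) - b ** (1.0 - beta)) / (beta - 1.0)
    # Probability mass completed in (a, b]: tail(a) - tail(b).
    completed = a ** (-beta) - b ** (-beta)
    return served / completed

def lemma_6_8_constant(beta):
    """Constant C from the two cases of the proof, assuming the tail ratio
    is exactly (t / a) ** (-beta), as it is for this Pareto tail."""
    return 1.0 / (beta - 1.0) if beta >= 2.0 else 1.0

def worst_slack(beta, ages=(1.0, 2.0, 10.0, 100.0),
                ratios=(1.01, 1.5, 2.0, 10.0, 1000.0)):
    """Minimum of eta(a, b) - C * a * (1 - a / b) over a small grid of (a, b).
    Lemma 6.8 predicts a nonnegative value."""
    c = lemma_6_8_constant(beta)
    return min(
        eta(a, a * ratio, beta) - c * a * (1.0 - 1.0 / ratio)
        for a in ages
        for ratio in ratios
    )
```

Calling `worst_slack(1.5)`, `worst_slack(2.0)`, or `worst_slack(3.0)` should return a nonnegative number. A finite grid cannot prove the lemma; it only guards against sign or exponent mistakes when reading the bound.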
Proof.
We can write $\eta(a, b)$ as
$$\eta(a, b) = \frac{\int_a^b \overline{F}(t) / \overline{F}(a)\,\mathrm{d}t}{1 - \overline{F}(b) / \overline{F}(a)} \ge \int_a^b \frac{\overline{F}(t)}{\overline{F}(a)}\,\mathrm{d}t.$$
Because $X \in \mathrm{OR}(-\infty, -1)$, there exist $\beta > 1$ and $C_1, x_1 > 0$ such that for all $t > a > x_1$,
$$\frac{\overline{F}(t)}{\overline{F}(a)} \ge C_1 \left( \frac{t}{a} \right)^{-\beta}.$$
For all $b > a > x_1$, we have
$$\eta(a, b) \ge C_1 \int_a^b \left( \frac{t}{a} \right)^{-\beta} \mathrm{d}t = \frac{C_1 a}{\beta - 1} \left( 1 - \left( \frac{b}{a} \right)^{-(\beta - 1)} \right).$$
We now consider two cases: $\beta \ge 2$ and $1 < \beta < 2$. If $\beta \ge 2$, then $(b/a)^{-(\beta - 1)} \le a/b$ and therefore
$$\eta(a, b) \ge \frac{C_1 a}{\beta - 1} \left( 1 - \frac{a}{b} \right), \tag{6.4}$$
so setting $C = C_1 / (\beta - 1)$ and $x_0 = x_1$ suffices. If $1 < \beta < 2$, we use the fact that for all $u > 0$,
$$u^{\beta - 1} \le 1 + (\beta - 1)(u - 1).$$
Substituting $u = a/b$ and combining this with the bound on $\eta(a, b)$ obtained before the case split yields
$$\eta(a, b) \ge C_1 a \left( 1 - \frac{a}{b} \right),$$
so setting $C = C_1$ and $x_0 = x_1$ suffices. □

In this section we characterize the heavy-traffic scaling of mean waiting, residence, and inflated residence times, which are the M/G/1 quantities that appear in Theorem 5.1. Because M-SERPT is a simpler policy than M-Gittins, our approach is to first study M-SERPT’s heavy-traffic scaling (Sections 7.2 and 7.3) and then show that the results extend to M-Gittins (Section 7.4).
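To make these M/G/1 quantities concrete, the following sketch evaluates (4.4) and (4.6) by numerical integration. It is only an illustration under stated assumptions: job sizes are exponential with mean 1 (so $\lambda = \rho$), and the age cutoffs are supplied by hand for two simple monotonic rank functions, a constant rank ($y_x = 0$, $z_x = \infty$) and a strictly increasing rank ($y_x = z_x = x$). The helper names are hypothetical.

```python
import math

def make_model(load):
    """Exponential job sizes with mean 1 (an assumption), so lambda equals rho."""
    lam = load

    def rho_bar(a):
        # rho_bar(a) = 1 - lambda * E[min{X, a}], from (4.5).
        return 1.0 - lam * (1.0 - math.exp(-a))

    def tau(a):
        # tau(a) = (lambda / 2) * E[min{X, a}^2], from (4.5).
        if math.isinf(a):
            return lam  # E[X^2] = 2 for an exponential with mean 1
        return lam * (1.0 - math.exp(-a) * (1.0 + a))

    return rho_bar, tau

def integrate_against_density(g, upper=50.0, step=0.001):
    """Midpoint rule for the integral of g(x) dF(x), with density exp(-x),
    truncated at `upper`."""
    n = int(upper / step)
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * step
        total += g(x) * math.exp(-x)
    return total * step

def mean_times(load, y, z):
    """Return (E[Q], E[R], E[S]) from (4.4) and (4.6) for cutoff functions y, z."""
    rho_bar, tau = make_model(load)
    q = integrate_against_density(
        lambda x: tau(z(x)) / (rho_bar(y(x)) * rho_bar(z(x))))
    r = integrate_against_density(lambda x: x / rho_bar(y(x)))
    s = integrate_against_density(lambda x: z(x) / rho_bar(y(x)))
    return q, r, s
```

For example, `mean_times(0.9, lambda x: 0.0, lambda x: math.inf)` corresponds to the constant-rank case and `mean_times(0.9, lambda x: x, lambda x: x)` to the strictly increasing case. With exponential job sizes both should give $\mathbf{E}[Q] + \mathbf{E}[R]$ close to the M/M/1 value $1/(1 - \rho)$, which serves as a consistency check on the formulas. In the constant-rank case the sketch reports an infinite $\mathbf{E}[S]$ because $z_x = \infty$.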
Before starting the heavy-traffic analyses of M-Gittins and M-SERPT, we introduce some new notation. Let
$$H_\rho(x) = \frac{\overline{F}(x)}{\overline{\rho}(x)}.$$

Definition 7.1.
The key M/G/1 response time quantities , or simply “key quantities”, of a monotonicSOAP policy 𝜋 are the following:I 𝜋𝑄 = ∫ ∞ (cid:0) 𝐻 𝜌 ( 𝑦 𝜋𝑥 ) + 𝐻 𝜌 ( 𝑧 𝜋𝑥 ) (cid:1) 𝜆𝜏 ( 𝑧 𝜋𝑥 ) 𝐹 ( 𝑥 ) 𝜌 ( 𝑥 ) d 𝑥, II 𝜋𝑄 = ∫ ∞ 𝜆𝑥𝐻 𝜌 ( 𝑦 𝜋𝑥 ) · 𝐹 ( 𝑥 ) 𝐹 ( 𝑦 𝜋𝑥 ) d 𝑥, II 𝜋𝑅 = ∫ ∞ 𝜆𝑧 𝜋𝑥 𝐻 𝜌 ( 𝑦 𝜋𝑥 ) 𝐻 𝜌 ( 𝑧 𝜋𝑥 ) · 𝐹 ( 𝑥 ) 𝐹 ( 𝑦 𝜋𝑥 ) d 𝑥, III 𝜋𝑅 = ∫ ∞ 𝐻 𝜌 ( 𝑦 𝜋𝑥 ) · 𝐹 ( 𝑥 ) 𝐹 ( 𝑦 𝜋𝑥 ) d 𝑥, ptimal Multiserver Scheduling with Unknown Job Sizes in Heavy Traffic 23 II 𝜋𝑆 = II 𝜋𝑅 , III 𝜋𝑆 = ∫ ∞ 𝐻 𝜌 ( 𝑦 𝜋𝑥 ) d 𝑥 . When the policy in question is clear, we drop the superscript 𝜋 .In Theorems B.1–B.3 (Appendix B) we show that for any monotonic SOAP policy 𝜋 , E [ 𝑄 𝜋 ] = I 𝜋𝑄 + II 𝜋𝑄 , E [ 𝑅 𝜋 ] = II 𝜋𝑅 + III 𝜋𝑅 , E [ 𝑆 𝜋 ] = II 𝜋𝑆 + III 𝜋𝑆 . Bounding mean waiting, residence, and inflated residence times thus amounts to bounding the keyquantities.For the most of the rest of this section we focus on the case where 𝜋 is M-SERPT, deferring theM-Gittins case to Section 7.4. Until then, 𝑦 𝑥 , 𝑧 𝑥 , and the key quantities are understood to have animplicit superscript M-SERPT.The most important step of bounding the key quantities is bounding 𝐻 𝜌 ( 𝑦 𝑥 ) and 𝐻 𝜌 ( 𝑧 𝑥 ) . As afirst step, we bound 𝐻 𝜌 ( 𝑥 ) . Let 𝐹 𝑒 ( 𝑥 ) = E [ 𝑋 ] ∫ ∞ 𝑥 𝐹 ( 𝑡 ) d 𝑡 (7.1)be the tail of the excess of 𝑋 . We can write 𝜌 ( 𝑥 ) as 𝜌 ( 𝑥 ) = ( − 𝜌 ) + 𝜌𝐹 𝑒 ( 𝑥 ) . (7.2)This means that for all 𝜀 ∈ [ , ] , we have 𝐻 𝜌 ( 𝑥 ) ≤ 𝐹 ( 𝑥 ) max { − 𝜌, 𝜌𝐹 𝑒 ( 𝑥 )} ≤ 𝐹 ( 𝑥 )( − 𝜌 ) 𝜀 ( 𝜌𝐹 𝑒 ( 𝑥 )) − 𝜀 = 𝐹 ( 𝑥 ) 𝜀 𝐻 ( 𝑥 ) − 𝜀 ( − 𝜌 ) 𝜀 𝜌 − 𝜀 , (7.3)where 𝐻 ( 𝑥 ) = 𝐹 ( 𝑥 )/ 𝐹 𝑒 ( 𝑥 ) = lim 𝜌 → 𝐻 𝜌 ( 𝑥 ) . This bound is useful because it separates 𝐻 𝜌 ( 𝑥 ) ’sdependence on 𝑥 and 𝜌 : the numerator depends only on 𝑥 , and the denominator depends onlyon 𝜌 . We will typically choose 𝜀 to be either 0 or arbitrarily small.Having bounded 𝐻 𝜌 ( 𝑥 ) in (7.3), we now turn to bounding 𝐻 𝜌 ( 𝑦 𝑥 ) and 𝐻 𝜌 ( 𝑧 𝑥 ) . 
Recalling the definition of $r_{\text{SERPT}}$ (Definition 2.1),
$$H(x) = \frac{\overline{F}(x)}{\overline{F}_e(x)} = \frac{\mathbf{E}[X]}{r_{\text{SERPT}}(x)},$$
so Lemma 6.1 and the monotonicity of $r_{\text{M-SERPT}}$ imply
$$H(y_x) = H(z_x) = \frac{\mathbf{E}[X]}{r_{\text{M-SERPT}}(x)} = O(1). \tag{7.4}$$
Combining this with (7.3) yields bounds on $H_\rho(y_x)$ and $H_\rho(z_x)$, though the bounds still have $\overline{F}(y_x)$ and $\overline{F}(z_x)$ terms. To better understand $H_\rho(y_x)$ and $H_\rho(z_x)$, we need to use our results from Section 6 in arguments that depend on what class of distributions contains $X$. We do this over the course of Sections 7.2 and 7.3.

In this section we study the heavy-traffic scaling of M-SERPT’s waiting, residence, and inflated residence times for infinite-variance job size distributions, specifically those in $\mathrm{OR}(-2, -1)$. With that said, many of the intermediate results we prove will also be useful for the finite-variance $\mathrm{OR}(-\infty, -2)$ case (Section 7.3).

Suppose that $X \in \mathrm{OR}(-\infty, -1)$. Combining Theorem 6.2 and (7.4) gives us
$$y_x, z_x = \Theta(x), \qquad H(y_x), H(z_x) = \Theta\left( \frac{1}{x} \right). \tag{7.5}$$
This alone is enough to bound all of the key quantities except $\mathrm{I}_Q$.

Lemma 7.2.
Under M-SERPT, if $X \in \mathrm{OR}(-\infty, -1)$, then
$$\mathrm{II}_Q,\ \mathrm{II}_R,\ \mathrm{III}_R,\ \mathrm{II}_S,\ \mathrm{III}_S = O\left( \log \frac{1}{1 - \rho} \right).$$

Proof.
Our approach is to use the fact that, by (4.5), ∫ ∞ 𝐻 𝜌 ( 𝑥 ) d 𝑥 = ∫ ∞ 𝐹 ( 𝑥 ) 𝜌 ( 𝑥 ) d 𝑥 = E [ 𝑋 ] 𝜌 log 11 − 𝜌 . (7.6)Because II 𝑅 = II 𝑆 and III 𝑅 ≤ III 𝑆 , it suffices to show that the integrands of II 𝑄 , II 𝑆 , and III 𝑆 are all 𝑂 ( 𝐻 𝜌 ( 𝑥 )) .We begin by showing that III 𝑆 ’s integrand is 𝑂 ( 𝐻 𝜌 ( 𝑥 )) . By (7.5) and the fact that 𝑋 ∈ OR (−∞ , − ) ,we have 𝐹 ( 𝑦 𝑥 ) = 𝐹 ( Θ ( 𝑥 )) = Θ ( 𝐹 ( 𝑥 )) , which yields 𝐻 𝜌 ( 𝑦 𝑥 ) = 𝐹 ( 𝑦 𝑥 ) 𝜌 ( 𝑦 𝑥 ) ≤ 𝐹 ( 𝑦 𝑥 ) 𝜌 ( 𝑥 ) = 𝑂 ( 𝐹 ( 𝑥 )) 𝜌 ( 𝑥 ) = 𝑂 ( 𝐻 𝜌 ( 𝑥 )) . (7.7)This implies the desired bound for III 𝑆 and III 𝑅 .We show II 𝑆 ’s integrand is 𝑂 ( 𝐻 𝜌 ( 𝑥 )) by applying (7.3) with 𝜀 =
0, (7.5), and (7.7): 𝜆𝑧 𝑥 𝐻 𝜌 ( 𝑦 𝑥 ) 𝐻 𝜌 ( 𝑧 𝑥 ) ≤ 𝜆𝑧 𝑥 𝐻 𝜌 ( 𝑦 𝑥 ) 𝐻 ( 𝑧 𝑥 ) = 𝑂 ( 𝐻 𝜌 ( 𝑥 )) . This implies the desired bound for II 𝑆 and II 𝑅 . Similarly, 𝜆𝑥𝐻 𝜌 ( 𝑦 𝑥 ) · 𝐹 ( 𝑥 ) 𝐹 ( 𝑦 𝑥 ) ≤ 𝜆𝑥𝐻 𝜌 ( 𝑦 𝑥 ) 𝐻 ( 𝑦 𝑥 ) = 𝑂 ( 𝐻 𝜌 ( 𝑥 )) , implying the bound for II 𝑄 . (cid:3) It remains only to characterize the heavy-traffic scaling of I 𝑄 . Treating the OR (−∞ , − ) caserequires some additional care, so we defer it to Section 7.3, focusing on the OR (− , − ) case fornow. The first step is to bound 𝜏 ( 𝑥 ) . Lemma 7.3. If 𝑋 ∈ OR (− , − ) , then 𝜏 ( 𝑥 ) = Θ ( 𝑥 𝐹 ( 𝑥 )) . Proof.
By Definition 2.7, there exists 𝛽 ∈ ( , ) such that 𝜏 ( 𝑥 ) 𝐹 ( 𝑥 ) = ∫ 𝑥 𝜆𝑡𝐹 ( 𝑡 ) 𝐹 ( 𝑥 ) d 𝑡 ≤ 𝑂 ( ) ∫ 𝑥 𝑡 (cid:16) 𝑡𝑥 (cid:17) − 𝛽 d 𝑡 = 𝑂 ( 𝑥 ) , and similarly for the lower bound. (cid:3) ptimal Multiserver Scheduling with Unknown Job Sizes in Heavy Traffic 25 We now have bounds on every term in I 𝑄 ’s integrand, allowing us to bound I 𝑄 and therebymean response time. Theorem 7.4. If 𝑋 ∈ OR (− , − ) , then in the 𝜌 → limit, E [ 𝑄 M-SERPT-1 ] = 𝑂 (cid:18) log 11 − 𝜌 (cid:19) , E [ 𝑅 M-SERPT-1 ] = 𝑂 (cid:18) log 11 − 𝜌 (cid:19) , and therefore E [ 𝑇 M-SERPT-1 ] = 𝑂 (cid:18) log 11 − 𝜌 (cid:19) . Proof.
By Lemma 7.2, it suffices to upper bound I 𝑄 . We compute (cid:0) 𝐻 𝜌 ( 𝑦 𝑥 ) + 𝐻 𝜌 ( 𝑧 𝑥 ) (cid:1) 𝜆𝜏 ( 𝑧 𝑥 ) 𝐹 ( 𝑥 ) 𝜌 ( 𝑥 ) ≤ (cid:0) 𝐻 ( 𝑦 𝑥 ) + 𝐻 ( 𝑧 𝑥 ) (cid:1) 𝜆𝜏 ( 𝑧 𝑥 ) 𝐻 ( 𝑥 ) 𝜌 ( 𝑥 ) [by (7.3)] = (cid:0) 𝐻 ( 𝑦 𝑥 ) + 𝐻 ( 𝑧 𝑥 ) (cid:1) 𝑂 ( 𝑧 𝑥 𝐹 ( 𝑧 𝑥 )) · 𝐻 ( 𝑥 ) 𝜌 ( 𝑥 ) [by Lem. 7.3] = 𝑂 ( 𝐹 ( 𝑥 )) 𝜌 ( 𝑥 ) [by (7.5)] = 𝑂 ( 𝐻 𝜌 ( 𝑥 )) , so (7.6) implies the desired bound. (cid:3) We now turn to finite-variance job size distributions, specifically those in OR (−∞ , − ) , MDA ( Λ ) ,and ENBUE . We begin with the simplest case, which is
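ENBUE.

Theorem 7.5. If $X \in \mathrm{ENBUE}$, then in the $\rho \to 1$ limit,
$$\mathbf{E}[Q^{\text{M-SERPT-1}}] = \Theta\left( \frac{1}{1 - \rho} \right), \qquad \mathbf{E}[R^{\text{M-SERPT-1}}] = \Theta(1),$$
and therefore
$$\mathbf{E}[T^{\text{M-SERPT-1}}] = \Theta\left( \frac{1}{1 - \rho} \right).$$
If additionally $X \in \mathrm{Bounded}$, then in the $\rho \to 1$ limit,
$$\mathbf{E}[S^{\text{M-SERPT-1}}] = \Theta(1).$$

Proof.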
Let 𝑥 max be the supremum of 𝑋 ’s support, so we may have 𝑥 max = ∞ . Because 𝑋 ∈ ENBUE ,there exists age 𝑎 ∗ < 𝑥 max such that • 𝑟 M-SERPT ( 𝑎 ) < 𝑟 M-SERPT ( 𝑎 ∗ ) for all 𝑎 < 𝑎 ∗ , and • 𝑟 M-SERPT ( 𝑎 ) = 𝑟 M-SERPT ( 𝑎 ∗ ) for all 𝑎 ≥ 𝑎 ∗ .This means • 𝑦 𝑥 ≤ 𝑎 ∗ for all sizes 𝑥 , • 𝑧 𝑥 ≤ 𝑎 ∗ for all sizes 𝑥 ≤ 𝑎 ∗ , and • 𝑧 𝑥 = 𝑥 max for all sizes 𝑥 > 𝑎 ∗ . Because 𝜌 ( 𝑎 ∗ ) < 𝜌 ( 𝑥 max ) = − 𝜌, applying (4.4) yields E [ 𝑄 M-SERPT-1 ] = Θ ( ) + ∫ ∞ 𝑎 ∗ 𝜏 ( 𝑥 max ) 𝜌 ( 𝑎 ∗ ) · ( − 𝜌 ) d 𝐹 ( 𝑥 ) = Θ (cid:18) − 𝜌 (cid:19) , E [ 𝑅 M-SERPT-1 ] = Θ ( ) + ∫ ∞ 𝑎 ∗ 𝑥𝜌 ( 𝑎 ∗ ) d 𝐹 ( 𝑥 ) = Θ ( ) . If additionally 𝑋 ∈ Bounded , then 𝑥 max < ∞ , so E [ 𝑆 M-SERPT-1 ] = Θ ( ) + ∫ ∞ 𝑎 ∗ 𝑥 max 𝜌 ( 𝑎 ∗ ) d 𝐹 ( 𝑥 ) = Θ ( ) . (cid:3) We now turn to the OR (−∞ , − ) and MDA ( Λ ) cases, which require the following technicallemma. Lemma 7.6.
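Let
\[
L_\pi(u) = \frac{1}{r_\pi\bigl(\bar F_e^{-1}(1/u)\bigr)},
\]
where $\pi$ is SERPT or M-SERPT. If $X \in \mathrm{OR}(-\infty, -2)$, then $L_{\text{SERPT}}, L_{\text{M-SERPT}} \in \mathrm{OR}(-1, 0)$, and if $X \in \mathrm{MDA}(\Lambda)$, then $L_{\text{SERPT}}, L_{\text{M-SERPT}} \in \mathrm{OR}(-\varepsilon, \varepsilon)$ for all $\varepsilon > 0$.

Proof.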
Because $L_{\text{M-SERPT}}$ is the nonincreasing envelope of $L_{\text{SERPT}}$, it suffices to prove the result for $L_{\text{SERPT}}$. The $\mathrm{OR}(-\infty, -2)$ case follows from closure properties of Matuszewska indices [19, Lemmas 4.5 and 4.6]. The $\mathrm{MDA}(\Lambda)$ case follows from a result of Kamphorst and Zwart [19, Section 4.2.2] which states that if $X \in \mathrm{MDA}(\Lambda)$, then $L_{\text{SERPT}}$ is slowly varying, a property implying $L_{\text{SERPT}} \in \mathrm{OR}(-\varepsilon, \varepsilon)$ for all $\varepsilon > 0$. □

One implication of Lemma 7.6 is that if $X \in \mathrm{MDA}(\Lambda)$, then
\[
H(x) = O\bigl(\bar F(x)^{-\varepsilon}\bigr) \quad \text{for all } \varepsilon > 0. \tag{7.8}
\]

We are now ready to tackle the $\mathrm{OR}(-\infty, -2)$ and $\mathrm{MDA}(\Lambda)$ cases. As in Section 7.2, we begin by bounding the five key quantities other than $\mathrm{I}_Q$. Lemma 7.2 does so for $\mathrm{OR}(-\infty, -2)$, and the following lemma does so for $\mathrm{MDA}(\Lambda)$.

Lemma 7.7.
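Under M-SERPT, if $X \in \mathrm{MDA}(\Lambda)$, then
\[
\mathrm{II}_Q, \mathrm{II}_R, \mathrm{III}_R, \mathrm{II}_S = O\left(\frac{1}{(1 - \rho)^\varepsilon}\right) \quad \text{for all } \varepsilon > 0.
\]
If additionally $X \in \mathrm{MDA}(\Lambda) \cap (\mathrm{QDHR} \cup \mathrm{QIMRL})$, then
\[
\mathrm{III}_S = O\left(\frac{1}{(1 - \rho)^\varepsilon}\right) \quad \text{for all } \varepsilon > 0.
\]

Proof.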
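Our overall approach is to use (7.3) on each key quantity to bound it by an expression of the form
\[
(1 - \rho)^{-\varepsilon} \cdot \int_0^\infty \Phi(\varepsilon, x) \, dx,
\]
where $\Phi(\varepsilon, x)$ does not depend on $\rho$. The challenge is then to show that the integral converges for arbitrarily small $\varepsilon > 0$.

We begin with two bounds on $H_\rho(y_x) \cdot \bar F(x) / \bar F(y_x)$, a term which appears in the integrands of several key quantities. By (4.3),
\begin{align}
H_\rho(y_x) \cdot \frac{\bar F(x)}{\bar F(y_x)} &\le H_\rho(y_x), \tag{7.9} \\
H_\rho(y_x) \cdot \frac{\bar F(x)}{\bar F(y_x)} &= \frac{\bar F(x)}{\bar\rho(y_x)} \le \frac{\bar F(x)}{\bar\rho(x)} = H_\rho(x). \tag{7.10}
\end{align}
Combining (7.10) with (7.6) implies the desired bound for $\mathrm{III}_R$.

We now bound $\mathrm{II}_Q$. To do so, we apply (7.3) twice, choosing $\varepsilon = 0$ for $H_\rho(y_x)$ and arbitrarily small $\varepsilon > 0$ for $H_\rho(x)$:
\begin{align*}
\mathrm{II}_Q &\le \int_0^\infty \lambda x H_\rho(y_x) H_\rho(x) \, dx && \text{[by (7.10)]} \\
&\le \frac{1}{(1 - \rho)^\varepsilon} \int_0^\infty \lambda x \bar F(x)^\varepsilon H(y_x) H(x)^{1 - \varepsilon} \, dx && \text{[by (7.3)]} \\
&\le \frac{O(1)}{(1 - \rho)^\varepsilon} \int_0^\infty x \bar F(x)^\varepsilon \bar F(x)^{-\varepsilon^2 (2 - \varepsilon)} \, dx && \text{[by (7.4), (7.8)]} \\
&\le \frac{O(1)}{(1 - \rho)^\varepsilon} \int_0^\infty x^{-\alpha\varepsilon} \, dx, && \text{[by Lem. 2.13]}
\end{align*}
where we may choose any $\alpha > 0$. Choosing $\alpha > 1/\varepsilon$ makes the integral converge, so $\mathrm{II}_Q = O((1 - \rho)^{-\varepsilon})$.

The computation for $\mathrm{II}_S$ is similar:
\begin{align*}
\mathrm{II}_S &\le \frac{1}{(1 - \rho)^\varepsilon} \int_0^\infty \lambda z_x \bar F(z_x)^\varepsilon H(y_x) H(z_x)^{1 - \varepsilon} \, dx && \text{[by (7.3), (7.9)]} \\
&\le \frac{O(1)}{(1 - \rho)^\varepsilon} \int_0^\infty z_x^{-\alpha\varepsilon} \, dx. && \text{[by (7.4), Lem. 2.13]}
\end{align*}
Because $z_x \ge x$, the integral converges if we choose $\alpha > 1/\varepsilon$, so $\mathrm{II}_S = O((1 - \rho)^{-\varepsilon})$. This also covers $\mathrm{II}_R$ because $\mathrm{II}_R = \mathrm{II}_S$.

If additionally $X \in \mathrm{MDA}(\Lambda) \cap (\mathrm{QDHR} \cup \mathrm{QIMRL})$ with exponent $\gamma$, then we can similarly bound $\mathrm{III}_S$:
\begin{align*}
\mathrm{III}_S &\le \frac{1}{(1 - \rho)^\varepsilon} \int_0^\infty \bar F(y_x)^\varepsilon H(y_x)^{1 - \varepsilon} \, dx && \text{[by (7.3)]} \\
&\le \frac{O(1)}{(1 - \rho)^\varepsilon} \int_0^\infty y_x^{-\alpha\varepsilon} \, dx && \text{[by (7.4), Lem. 2.13]} \\
&\le \frac{O(1)}{(1 - \rho)^\varepsilon} \int_0^\infty x^{-\alpha\varepsilon/\gamma} \, dx, && \text{[by Thm. 6.3]}
\end{align*}
so choosing $\alpha > \gamma/\varepsilon$ shows that $\mathrm{III}_S = O((1 - \rho)^{-\varepsilon})$. □

It remains only to characterize the heavy-traffic scaling of $\mathrm{I}_Q$.

Lemma 7.8.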
Under M-SERPT, if $X \in \mathrm{OR}(-\infty, -2) \cup \mathrm{MDA}(\Lambda)$, then
\[
\mathrm{I}_Q = \Theta\left(\frac{1}{(1 - \rho) \cdot r_{\text{M-SERPT}}\bigl(\bar F_e^{-1}(1 - \rho)\bigr)}\right).
\]

Proof.
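Because $\mathbf{E}[X^2] < \infty$, we have $\tau(x) = \Theta(1)$, so by (7.3) and (7.4),
\[
\mathrm{I}_Q = \int_0^\infty \frac{\Theta(1)}{r_{\text{M-SERPT}}(x)} \cdot \frac{\bar F(x)}{\bar\rho(x)^2} \, dx.
\]
For the lower bound, we integrate up to $\bar F_e^{-1}(1 - \rho)$ instead of $\infty$. For $x \le \bar F_e^{-1}(1 - \rho)$, we have $\bar F_e(x) \ge 1 - \rho$, so (7.2) implies
\[
\rho \bar F_e(x) \le \bar\rho(x) \le (1 + \rho) \bar F_e(x).
\]
Using this fact along with the monotonicity of $r_{\text{M-SERPT}}$ yields
\begin{align*}
\mathrm{I}_Q &\ge \frac{\Omega(1)}{r_{\text{M-SERPT}}\bigl(\bar F_e^{-1}(1 - \rho)\bigr)} \int_0^{\bar F_e^{-1}(1 - \rho)} \frac{\bar F(x)}{\bar F_e(x)^2} \, dx \\
&= \frac{\Omega(1)}{r_{\text{M-SERPT}}\bigl(\bar F_e^{-1}(1 - \rho)\bigr)} \left(\frac{1}{\bar F_e\bigl(\bar F_e^{-1}(1 - \rho)\bigr)} - 1\right) && \text{[by (7.1)]} \\
&= \Omega\left(\frac{1}{(1 - \rho) \cdot r_{\text{M-SERPT}}\bigl(\bar F_e^{-1}(1 - \rho)\bigr)}\right).
\end{align*}
For the upper bound, we split the integration region at $\bar F_e^{-1}(1 - \rho)$:
\[
\mathrm{I}_Q = \int_0^{\bar F_e^{-1}(1 - \rho)} \frac{O(1)}{r_{\text{M-SERPT}}(x)} \cdot \frac{\bar F(x)}{\bar\rho(x)^2} \, dx + \int_{\bar F_e^{-1}(1 - \rho)}^\infty \frac{O(1)}{r_{\text{M-SERPT}}(x)} \cdot \frac{\bar F(x)}{\bar\rho(x)^2} \, dx. \tag{7.11}
\]
The second integral in (7.11) is simple to bound using the monotonicity of $r_{\text{M-SERPT}}$:
\begin{align*}
\int_{\bar F_e^{-1}(1 - \rho)}^\infty \frac{O(1)}{r_{\text{M-SERPT}}(x)} \cdot \frac{\bar F(x)}{\bar\rho(x)^2} \, dx
&\le \frac{O(1)}{r_{\text{M-SERPT}}\bigl(\bar F_e^{-1}(1 - \rho)\bigr)} \int_{\bar F_e^{-1}(1 - \rho)}^\infty \frac{\bar F(x)}{\bar\rho(x)^2} \, dx \\
&\le \frac{O(1)}{r_{\text{M-SERPT}}\bigl(\bar F_e^{-1}(1 - \rho)\bigr)} \left(\frac{1}{1 - \rho} - \frac{1}{1 - \rho + \rho \bar F_e\bigl(\bar F_e^{-1}(1 - \rho)\bigr)}\right) && \text{[by (4.5), (7.2)]} \\
&= O\left(\frac{1}{(1 - \rho) \cdot r_{\text{M-SERPT}}\bigl(\bar F_e^{-1}(1 - \rho)\bigr)}\right).
\end{align*}
To bound the first integral in (7.11), we change variables to $u = 1/\bar F_e(x)$:
\begin{align*}
\int_0^{\bar F_e^{-1}(1 - \rho)} \frac{O(1)}{r_{\text{M-SERPT}}(x)} \cdot \frac{\bar F(x)}{\bar\rho(x)^2} \, dx
&\le \int_0^{\bar F_e^{-1}(1 - \rho)} \frac{O(1)}{r_{\text{M-SERPT}}(x)} \cdot \frac{\bar F(x)}{\bar F_e(x)^2} \, dx && \text{[by (7.2)]} \\
&= \int_1^{1/(1 - \rho)} \frac{O(1)}{r_{\text{M-SERPT}}\bigl(\bar F_e^{-1}(1/u)\bigr)} \, du \\
&= O(1) \int_1^{1/(1 - \rho)} L_{\text{M-SERPT}}(u) \, du,
\end{align*}
where $L_{\text{M-SERPT}}$ is as in Lemma 7.6. By Lemma 7.6, we have $L_{\text{M-SERPT}} \in \mathrm{OR}(-1, \infty)$, so a result in Karamata theory [8, Theorem 2.6.1] implies
\[
\int_1^v L_{\text{M-SERPT}}(u) \, du = O\bigl(v L_{\text{M-SERPT}}(v)\bigr)
\]
in the $v \to \infty$ limit. Letting $v = 1/(1 - \rho)$ yields the desired bound.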
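The Karamata-type bound used in the last step can be illustrated numerically. The sketch below is only a sanity check under an assumed example: it takes the hypothetical function $L(u) = u^{-\beta}$ with $\beta = 0.5$ (not a function from the paper) and checks that the ratio of $\int_1^v L(u)\,du$ to $v L(v)$ stays below $1/(1 - \beta)$ as $v$ grows. The helper name `karamata_ratio` is introduced here for illustration.

```python
import math

def karamata_ratio(beta, v, n=20000):
    """Ratio of int_1^v u^(-beta) du to v * L(v), where L(u) = u^(-beta).

    The integral is computed with a midpoint rule after substituting
    u = exp(s), which turns it into int_0^{log v} exp((1 - beta) * s) ds.
    """
    top = math.log(v)
    h = top / n
    integral = h * sum(math.exp((1.0 - beta) * (i + 0.5) * h) for i in range(n))
    return integral / (v * v ** (-beta))

BETA = 0.5  # assumed exponent in (0, 1), chosen only for illustration
ratios = [karamata_ratio(BETA, 10.0 ** k) for k in range(1, 7)]

# For this L, the exact ratio is (1 - v^(beta - 1)) / (1 - beta),
# so it is bounded by 1 / (1 - beta) for every v > 1.
bound = 1.0 / (1.0 - BETA)
```

In this example the ratio increases toward the bound but never reaches it, which is the behavior the $O(v L(v))$ statement requires.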
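□

Having characterized the heavy-traffic scaling of all the key quantities, the main heavy-traffic results for $\mathrm{OR}(-\infty, -2)$ and $\mathrm{MDA}(\Lambda)$ follow easily.

Theorem 7.9. If $X \in \mathrm{OR}(-\infty, -2)$, then in the $\rho \to 1$ limit,
\begin{align*}
\mathbf{E}[Q_{\text{M-SERPT-1}}] &= \Theta\left(\frac{1}{(1 - \rho) \cdot r_{\text{M-SERPT}}\bigl(\bar F_e^{-1}(1 - \rho)\bigr)}\right) = \Omega\left(\frac{1}{(1 - \rho)^\delta}\right) \quad \text{for some } \delta > 0, \\
\mathbf{E}[R_{\text{M-SERPT-1}}] &\le \mathbf{E}[S_{\text{M-SERPT-1}}] = \Theta\left(\log \frac{1}{1 - \rho}\right),
\end{align*}
and therefore
\[
\mathbf{E}[T_{\text{M-SERPT-1}}] = \Theta\left(\frac{1}{(1 - \rho) \cdot r_{\text{M-SERPT}}\bigl(\bar F_e^{-1}(1 - \rho)\bigr)}\right).
\]

Proof.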
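After applying Lemmas 7.2 and 7.8, it remains only to show $\mathrm{I}_Q = \Omega((1 - \rho)^{-\delta})$. Using $L_{\text{M-SERPT}}$ from Lemma 7.6, we can rewrite Lemma 7.8 as
\[
\mathrm{I}_Q = \Theta\left(\frac{1}{1 - \rho} \, L_{\text{M-SERPT}}\left(\frac{1}{1 - \rho}\right)\right). \tag{7.12}
\]
By Lemma 7.6, we have $L_{\text{M-SERPT}} \in \mathrm{OR}(-1, 0)$, which means there exists $\beta \in (0, 1)$ such that $L_{\text{M-SERPT}}(u) = \Omega(u^{-\beta})$ in the $u \to \infty$ limit. Letting $\delta = 1 - \beta$ and $u = 1/(1 - \rho)$ yields the desired bound. □

Theorem 7.10. If $X \in \mathrm{MDA}(\Lambda)$, then in the $\rho \to 1$ limit,
\begin{align*}
\mathbf{E}[Q_{\text{M-SERPT-1}}] &= \Theta\left(\frac{1}{(1 - \rho) \cdot r_{\text{M-SERPT}}\bigl(\bar F_e^{-1}(1 - \rho)\bigr)}\right) = \Omega\left(\frac{1}{(1 - \rho)^{1 - \varepsilon}}\right) \quad \text{for all } \varepsilon > 0, \\
\mathbf{E}[R_{\text{M-SERPT-1}}] &= O\left(\frac{1}{(1 - \rho)^\varepsilon}\right) \quad \text{for all } \varepsilon > 0,
\end{align*}
and therefore
\[
\mathbf{E}[T_{\text{M-SERPT-1}}] = \Theta\left(\frac{1}{(1 - \rho) \cdot r_{\text{M-SERPT}}\bigl(\bar F_e^{-1}(1 - \rho)\bigr)}\right).
\]
If additionally $X \in \mathrm{MDA}(\Lambda) \cap (\mathrm{QDHR} \cup \mathrm{QIMRL})$, then
\[
\mathbf{E}[S_{\text{M-SERPT-1}}] = O\left(\frac{1}{(1 - \rho)^\varepsilon}\right) \quad \text{for all } \varepsilon > 0.
\]

Proof.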
After applying Lemmas 7.7 and 7.8, it remains only to show $\mathrm{I}_Q = \Omega((1 - \rho)^{-(1 - \varepsilon)})$. This follows from (7.12) and Lemma 7.6, similarly to the proof of Theorem 7.9. □

Having characterized heavy-traffic scaling under M-SERPT, we now do the same for Gittins and M-Gittins. Our first result shows that the mean waiting and residence times of Gittins and M-Gittins have the same heavy-traffic scaling as that of M-SERPT. Note that the precondition holds for all of the job size distributions we consider in Section 7.3.

Theorem 7.11.
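In the $\rho \to 1$ limit,
\[
\mathbf{E}[R_{\text{Gittins-1}}], \mathbf{E}[R_{\text{M-Gittins-1}}] = O(\mathbf{E}[R_{\text{M-SERPT-1}}]),
\]
and if $\mathbf{E}[R_{\text{M-SERPT-1}}] = O(\mathbf{E}[Q_{\text{M-SERPT-1}}])$, then
\[
\mathbf{E}[Q_{\text{Gittins-1}}], \mathbf{E}[Q_{\text{M-Gittins-1}}] = \Theta(\mathbf{E}[Q_{\text{M-SERPT-1}}]).
\]

Proof.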
The residence time result follows immediately from results of Scully et al. [33, Eq. (3.8) and Proposition 4.8], which imply
\[
\mathbf{E}[R_{\text{Gittins-1}}] \le \mathbf{E}[R_{\text{M-Gittins-1}}] \le \mathbf{E}[R_{\text{M-SERPT-1}}].
\]
For waiting time, we first invoke further results of Scully et al. [33, Proposition 4.7 and Lemma 5.6], which imply
\[
\mathbf{E}[Q_{\text{Gittins-1}}] \ge \mathbf{E}[Q_{\text{M-Gittins-1}}] \ge \mathbf{E}[Q_{\text{M-SERPT-1}}].
\]
It thus suffices to show $\mathbf{E}[Q_{\text{Gittins-1}}] = O(\mathbf{E}[Q_{\text{M-SERPT-1}}])$. Because Gittins minimizes mean response time [3, 4, 12], we have
\[
\mathbf{E}[Q_{\text{Gittins-1}}] \le \mathbf{E}[T_{\text{Gittins-1}}] \le \mathbf{E}[T_{\text{M-SERPT-1}}] = \mathbf{E}[Q_{\text{M-SERPT-1}}] + \mathbf{E}[R_{\text{M-SERPT-1}}],
\]
so the result follows from the $\mathbf{E}[R_{\text{M-SERPT-1}}] = O(\mathbf{E}[Q_{\text{M-SERPT-1}}])$ precondition. □

Our final heavy-traffic result shows that for certain job size distributions, under M-Gittins, mean waiting time dominates mean inflated residence time. The conditions are the same as those shown for M-SERPT over the course of Section 7.3, except $\mathrm{QDHR} \cup \mathrm{QIMRL}$ is replaced by QDHR.

Theorem 7.12. If $X \in \mathrm{OR}(-\infty, -2) \cup (\mathrm{MDA}(\Lambda) \cap \mathrm{QDHR}) \cup \mathrm{Bounded}$, then in the $\rho \to 1$ limit,
\[
\mathbf{E}[S_{\text{M-Gittins-1}}] = o(\mathbf{E}[Q_{\text{M-Gittins-1}}]).
\]
More specifically, $\mathbf{E}[S_{\text{M-Gittins-1}}]$ obeys the same scaling bounds as shown for $\mathbf{E}[S_{\text{M-SERPT-1}}]$ in Theorems 7.5, 7.9 and 7.10.

Proof.
The proof is very similar to the proofs of analogous results for M-SERPT (Theorems 7.5, 7.9 and 7.10), so we just describe the differences.
• If $X \in \mathrm{OR}(-\infty, -2)$, we follow the same proof as Theorem 7.9 and the lemmas it requires, except we use Theorem 6.4 to bound $y_x^{\text{M-Gittins}}$ and $z_x^{\text{M-Gittins}}$.
• If $X \in \mathrm{MDA}(\Lambda) \cap \mathrm{QDHR}$, we follow the same proof as Theorem 7.10 and the lemmas it requires, except we use Theorem 6.5 to bound $y_x^{\text{M-Gittins}}$ and $z_x^{\text{M-Gittins}}$.
• If $X \in \mathrm{Bounded}$, we follow the same proof as Theorem 7.5, except we use a result of Aalto et al. [4, Proposition 9] to justify the existence of the critical age $a^*$. □

With some extra effort, one can show it also holds for $X \in \mathrm{OR}(-2, -1)$.

We study optimal scheduling in the M/G/k to minimize mean response time. This problem is solved by the Gittins policy for the single-server $k = 1$ case. We introduce M-Gittins (Definition 2.4) and show that it minimizes mean response time in the heavy-traffic M/G/k for a large class of finite-variance job size distributions (Theorem 3.1). We also show that the simple and practical M-SERPT policy is a 2-approximation for mean response time in the heavy-traffic M/G/k under similar conditions (Theorem 3.2). As a byproduct of our M/G/k study, we obtain results characterizing the heavy-traffic scaling of M/G/1 mean response time under Gittins, M-Gittins, and M-SERPT (Theorem 3.3).

A natural question to ask is whether the conditions under which we prove M-Gittins's optimality can be relaxed, particularly the QDHR and Bounded assumptions. The difficulty lies in the fact that for some job size distributions, the bound in Theorem 5.1 is not strong enough because inflated residence time is infinite. It is possible that the techniques used by Köllerström [21, 22] to analyze the heavy-traffic M/G/k under FCFS could be helpful, seeing as FCFS has infinite inflated residence time.

Another major open question is analyzing the performance of M-Gittins outside of the heavy-traffic limit. In the single-server case, one can generalize the techniques of Scully et al. [33] to show that M-Gittins is a 3-approximation for M/G/1 mean response time at all loads. However, the multiserver case remains open.

ACKNOWLEDGMENTS
This work was supported by NSF grants CMMI-1938909, XPS-1629444, and CSR-1763701; and a Google 2020 Faculty Research Award.
REFERENCES
[1] Samuli Aalto and Urtzi Ayesta. 2006. Mean delay analysis of multi level processor sharing disciplines. In INFOCOM 2006. 25th IEEE International Conference on Computer Communications. Proceedings. IEEE, 1–11.
[2] S Aalto and U Ayesta. 2006. On the nonoptimality of the foreground-background discipline for IMRL service times. Journal of Applied Probability 43, 2 (2006), 523–534.
[3] Samuli Aalto, Urtzi Ayesta, and Rhonda Righter. 2009. On the Gittins index in the M/G/1 queue. Queueing Systems 63, 1 (2009), 437–458.
[4] Samuli Aalto, Urtzi Ayesta, and Rhonda Righter. 2011. Properties of the Gittins index with application to optimal scheduling. Probability in the Engineering and Informational Sciences 25, 03 (2011), 269–288.
[5] Martin F. Arlitt and Carey L. Williamson. 1996. Web server workload characterization: The search for invariants. ACM SIGMETRICS Performance Evaluation Review 24, 1 (1996), 126–137.
[6] Nikhil Bansal, Bart Kamphorst, and Bert Zwart. 2018. Achievable performance of blind policies in heavy traffic. Mathematics of Operations Research 43, 3 (2018), 949–964.
[7] Luca Becchetti and Stefano Leonardi. 2004. Nonclairvoyant scheduling to minimize the total flow time on single and parallel machines. Journal of the ACM (JACM) 51, 4 (2004), 517–539.
[8] N. Bingham, C. Goldie, and J. Teugels. 1987. Regular Variation. Cambridge University Press.
[9] Yan Chen and Jing Dong. 2020. Scheduling with service-time information: The power of two priority classes. (2020). Preprint.
[10] M. E. Crovella and A. Bestavros. 1997. Self-similarity in World Wide Web traffic: evidence and possible causes. IEEE/ACM Transactions on Networking 5, 6 (1997), 835–846.
[11] Hanhua Feng and Vishal Misra. 2003. Mixed scheduling disciplines for network flows. In ACM SIGMETRICS Performance Evaluation Review, Vol. 31. ACM, 36–39.
[12] John C. Gittins, Kevin D. Glazebrook, and Richard Weber. 2011. Multi-armed Bandit Allocation Indices. John Wiley & Sons.
[13] Kevin D Glazebrook. 2003. An analysis of Klimov's problem with parallel servers. Mathematical Methods of Operations Research 58, 1 (2003), 1–28.
[14] Kevin D Glazebrook and José Niño-Mora. 2001. Parallel scheduling of multiclass M/M/m queues: Approximate and heavy-traffic optimization of achievable performance. Operations Research 49, 4 (2001), 609–623.
[15] Isaac Grosof, Ziv Scully, and Mor Harchol-Balter. 2018. SRPT for multiserver systems. Performance Evaluation v a.2018.10.001
[16] Mor Harchol-Balter. 2013. Performance Modeling and Design of Computer Systems: Queueing Theory in Action (1st ed.). Cambridge University Press, New York, NY, USA.
[17] Mor Harchol-Balter and Allen B. Downey. 1997. Exploiting Process Lifetime Distributions for Dynamic Load Balancing. ACM Trans. Comput. Syst. 15, 3 (Aug. 1997), 253–285. https://doi.org/10.1145/263326.263344
[18] Bala Kalyanasundaram and Kirk R Pruhs. 1997. Minimizing flow time nonclairvoyantly. In Proceedings 38th Annual Symposium on Foundations of Computer Science. IEEE, 345–352.
[19] Bart Kamphorst and Bert Zwart. 2020. Heavy-Traffic Analysis of Sojourn Time Under the Foreground–Background Scheduling Policy. Stochastic Systems 10, 1 (2020), 1–28. https://doi.org/10.1287/stsy.2019.0036
[20] Leonard Kleinrock. 1976. Queueing Systems, Volume 2: Computer Applications. Vol. 66. Wiley New York.
[21] Julian Köllerström. 1974. Heavy Traffic Theory for Queues with Several Servers. I. Journal of Applied Probability
Journal of Applied Probability
ACM SIGMETRICS Performance Evaluation Review, Vol. 38. ACM, 12–14.
[24] Natalia Osipova, Urtzi Ayesta, and Konstantin Avrachenkov. 2009. Optimal policy for multi-class scheduling in a single server queue. In Teletraffic Congress, 2009. ITC 21 2009. 21st International. IEEE, 1–8.
[25] Kihong Park and Walter Willinger. 2000. Self-Similar Network Traffic: An Overview. Self-Similar Network Traffic and Performance Evaluation (2000), 1–38.
[26] Sidney I Resnick. 2013. Extreme values, regular variation and point processes. Springer.
[27] Rhonda Righter and J George Shanthikumar. 1989. Scheduling multiclass single server queueing systems to stochastically maximize the number of successful departures. Probability in the Engineering and Informational Sciences 3, 3 (1989), 323–333.
[28] Rhonda Righter, J George Shanthikumar, and Genji Yamazaki. 1990. On extremal service disciplines in single-stage queueing systems. Journal of Applied Probability 27, 2 (1990), 409–416.
[29] Linus Schrage. 1968. A proof of the optimality of the shortest remaining processing time discipline. Operations Research 16, 3 (1968), 687–690.
[30] Linus E Schrage. 1967. The queue M/G/1 with feedback to lower priority queues. Management Science 13, 7 (1967), 466–474.
[31] Linus E Schrage and Louis W Miller. 1966. The queue M/G/1 with the shortest remaining processing time discipline. Operations Research 14, 4 (1966), 670–684.
[32] Ziv Scully, Mor Harchol-Balter, and Alan Scheller-Wolf. 2018. SOAP: One Clean Analysis of All Age-Based Scheduling Policies. Proc. ACM Meas. Anal. Comput. Syst. 2, 1, Article 16 (April 2018), 30 pages. https://doi.org/10.1145/3179419
[33] Ziv Scully, Mor Harchol-Balter, and Alan Scheller-Wolf. 2020. Simple Near-Optimal Scheduling for the M/G/1. Proc. ACM Meas. Anal. Comput. Syst. 4, 1, Article 11 (March 2020), 29 pages. https://doi.org/10.1145/3379477
[34] Moshe Shaked and J George Shanthikumar. 2007. Stochastic orders. Springer Science & Business Media.
[35] Adam Wierman, Mor Harchol-Balter, and Takayuki Osogami. 2005. Nearly insensitive bounds on SMART scheduling. In ACM SIGMETRICS Performance Evaluation Review, Vol. 33. ACM, 205–216.
A DIFFICULTY OF M/G/k ANALYSIS FOR NONMONOTONIC RANK FUNCTIONS
In this appendix we explain why Theorem 5.1 does not readily generalize to SOAP policies with nonmonotonic rank functions.

Recall that the proof of Theorem 5.1 considers a tagged job $J$ of size $x$ and considers several categories of work completed while $J$ is in the system. Our focus here is on relevant work, which is work on jobs that are prioritized ahead of $J$. Let $s_x^{\pi\text{-}k}$ be the maximum age at which a new job, namely one that arrives after $J$, can contribute relevant work under $\pi$-$k$. When $\pi$ is monotonic, $s_x^{\pi\text{-}k}$ does not depend on the number of servers $k$. Specifically, we have $s_x^{\pi\text{-}k} = y_x^\pi$. The problem for nonmonotonic SOAP policies $\pi$ is that, as we show below, we can have $s_x^{\pi\text{-}k} > s_x^{\pi\text{-}1}$ when $k \ge 2$.

We first need $y_x^\pi$ and $z_x^\pi$ generalized to all SOAP policies $\pi$.
• If $\pi$ is monotonic, then $y_x^\pi$ and $z_x^\pi$ are given by Definition 4.1.
• If $\pi$ is nonmonotonic, we can define $y_x^\pi$ and $z_x^\pi$ in terms of a monotonic SOAP policy related to $\pi$ [33]. Specifically, letting M-$\pi$ be the monotonic SOAP policy with rank function
\[
r_{\text{M-}\pi}(a) = \max_{b \in [0, a]} r_\pi(b),
\]
we define $y_x^\pi = y_x^{\text{M-}\pi}$ and $z_x^\pi = z_x^{\text{M-}\pi}$.

[Fig. A.1. Age Cutoffs for Nonmonotonic Rank Functions. The figure plots $r_\pi(a)$ and $r_{\text{M-}\pi}(a)$ against age $a$, marking the ages $y_x^\pi = s_x^{\pi\text{-}1}$, $b$, $c = s_x^{\pi\text{-}2}$, $x$, and $z_x^\pi$.]

Consider the example SOAP policy $\pi$ and tagged job size $x$ shown in Fig. A.1. In the single-server $k = 1$ case, we have $s_x^{\pi\text{-}1} = y_x^\pi$. To see why, consider the moment a new job $J'$ reaches age $y_x^\pi$ while the tagged job $J$ is still in the system. For this to occur, it must be that $J$ is also at age $y_x^\pi$, because otherwise $J$ would have priority over $J'$. With both $J$ and $J'$ at the same rank, the FCFS tiebreaker prioritizes $J$. Thereafter, $J$ never has rank worse than $r_\pi(y_x^\pi)$, so $J'$ remains stuck at age $y_x^\pi$ and is never prioritized over $J$.

We now reconsider the same example from Fig. A.1 but with $k \ge 2$ servers. Then $J'$ can receive service even while $J$ has better rank because $J$ and $J'$ can occupy different servers simultaneously. This means $J'$ no longer gets stuck at age $y_x^\pi$. In particular, if $J$ reaches age $c$ and $J'$ passes age $b$, then $J'$ contributes relevant work between ages $b$ and $c$. Therefore, $s_x^{\pi\text{-}k} = c > s_x^{\pi\text{-}1}$ for $k \ge 2$.

The analysis behind Theorem 5.1 assumes that each new job $J'$ will contribute relevant work until it completes or reaches age $s_x^{\pi\text{-}k}$. This is a worst-case estimate, because the tagged job $J$ might complete before $J'$ completes or reaches age $s_x^{\pi\text{-}k}$. When $\pi$ is monotonic, we have $s_x^{\pi\text{-}k} = s_x^{\pi\text{-}1}$, so this overestimate is tight enough to compare the mean response times under $\pi$-$k$ and $\pi$-1. However, when $\pi$ is nonmonotonic, it may be that $s_x^{\pi\text{-}k} > s_x^{\pi\text{-}1}$, as explained above, so we do not obtain a tight comparison between the $\pi$-$k$ and $\pi$-1 systems. This suggests generalizing Theorem 5.1 to nonmonotonic SOAP policies requires not relying as heavily on worst-case quantities like $s_x^{\pi\text{-}k}$.

B NEW FORMULAS FOR MEAN WAITING AND RESIDENCE TIMES
In this appendix we prove the following new formulas for mean waiting, residence, and inflated residence times.
Theorem B.1.
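Under any monotonic SOAP policy $\pi$,
\[
\mathbf{E}[Q_{\pi\text{-}1}] = \int_0^\infty \left( \left( \frac{\bar F(y_x^\pi)}{\bar\rho(y_x^\pi)} + \frac{\bar F(z_x^\pi)}{\bar\rho(z_x^\pi)} \right) \frac{\lambda \tau(z_x^\pi) \bar F(x)}{\bar\rho(x)^2} + \frac{\lambda x \bar F(y_x^\pi) \bar F(x)}{\bar\rho(y_x^\pi)^2} \right) dx.
\]

Theorem B.2.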
Under any monotonic SOAP policy $\pi$,
\[
\mathbf{E}[R_{\pi\text{-}1}] = \int_0^\infty \left( \frac{\lambda z_x^\pi \bar F(x) \bar F(z_x^\pi)}{\bar\rho(y_x^\pi) \bar\rho(z_x^\pi)} + \frac{\bar F(x)}{\bar\rho(y_x^\pi)} \right) dx.
\]

Theorem B.3.
Under any monotonic SOAP policy $\pi$,
\[
\mathbf{E}[S_{\pi\text{-}1}] = \int_0^\infty \left( \frac{\lambda z_x^\pi \bar F(x) \bar F(z_x^\pi)}{\bar\rho(y_x^\pi) \bar\rho(z_x^\pi)} + \frac{\bar F(y_x^\pi)}{\bar\rho(y_x^\pi)} \right) dx.
\]

Proving these results requires new technical machinery for, roughly speaking, performing integration by parts on expressions involving $y_x^\pi$ and $z_x^\pi$, such as those in (4.4). Appendix B.1 introduces the general technical machinery, which Appendix B.2 then applies to prove the above results.

Throughout this appendix, $\partial$ denotes the derivative operator, and $[t_1, \ldots, t_n \mapsto \mathrm{RHS}]$ denotes the function that maps variables $t_1, \ldots, t_n$ to expression RHS.

B.1 Integration by Parts with Hills and Valleys
Definition B.4. A hill-valley partition of $\mathbb{R}_+$ is a sequence
\[
0 = u_0 \le v_0 < u_1 < v_1 < u_2 < v_2 < \ldots.
\]
Intervals of the form $(u_i, v_i]$ are called valleys, and intervals of the form $(v_i, u_{i+1}]$ are called hills.

Definition B.5.
Functions $y, z : \mathbb{R}_+ \to \mathbb{R}_+$ are a hill-valley pair for a given hill-valley partition if for each valley $(u_i, v_i]$,
\[
y(x) = u_i, \quad z(x) = v_i, \quad \text{for all } x \in (u_i, v_i],
\]
and for each hill $(v_i, u_{i+1}]$,
\[
y(x) = x, \quad z(x) = x, \quad \text{for all } x \in (v_i, u_{i+1}].
\]
For compactness, we write $y_x = y(x)$ and $z_x = z(x)$.

It is simple to check that for any monotonic SOAP policy $\pi$, the pair $y^\pi, z^\pi$ (Definition 4.1) is a hill-valley pair.

Definition B.6.
For functions $\Phi : \mathbb{R}_+ \to \mathbb{R}_+$, we define the difference ratio operator $\Delta$ as follows:
\[
\Delta\Phi(\langle u, v \rangle) =
\begin{cases}
\dfrac{\Phi(v) - \Phi(u)}{v - u} & \text{if } u \ne v \\[1ex]
\partial\Phi(u) & \text{if } u = v,
\end{cases}
\]
where $\partial$ is the derivative operator. Similarly, for functions with multiple arguments, $\Delta_i$ is a version of $\Delta$ that works on the $i$th argument:
\[
\Delta_i\Phi(\ldots, \langle u, v \rangle, \ldots) = \Delta[t \mapsto \Phi(\ldots, t, \ldots)](\langle u, v \rangle).
\]
Like $\partial$, it is easily seen that $\Delta$ is a linear operator. When applied to polynomials, $\Delta$ elegantly generalizes $\partial$. For example,
\[
\Delta\left[t \mapsto \frac{1}{t}\right](\langle u, v \rangle) = -\frac{1}{uv}. \tag{B.1}
\]
The $\Delta$ operator also obeys various chain-rule-like identities. We highlight the two we use below.

Lemma B.7.
Let $\Phi, \Psi : \mathbb{R} \to \mathbb{R}$ be differentiable. For all $u, v \in \mathbb{R}$,
\[
\Delta[t \mapsto \Phi(\Psi(t))](\langle u, v \rangle) = \Delta\Phi(\langle \Psi(u), \Psi(v) \rangle) \, \Delta\Psi(\langle u, v \rangle).
\]

Proof. If $u = v$, this is the chain rule. If $u \ne v$ but $\Psi(u) = \Psi(v)$, then both sides are 0. If $\Psi(u) \ne \Psi(v)$, then the result follows by a simple computation. □

We borrow the terms "hill" and "valley" from Scully et al. [33], who use a similar concept to analyze SOAP policies, but this definition is abstracted away from the details of SOAP.

As a corner case, we consider the first hill or valley to also include 0.
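The difference ratio operator and the identities above are easy to check numerically. The following is a minimal sketch, not part of the paper: the function name `delta`, the test points, and the choices $\Phi = \sin$ and $\Psi(t) = t^2$ are all assumptions made for illustration, and the $u = v$ case is approximated with a central difference rather than an exact derivative.

```python
import math

def delta(phi, u, v, h=1e-6):
    """Difference ratio of phi over <u, v> (Definition B.6).

    When u == v, the derivative is approximated by a central difference
    with step h, which is an approximation chosen for this sketch.
    """
    if u != v:
        return (phi(v) - phi(u)) / (v - u)
    return (phi(u + h) - phi(u - h)) / (2.0 * h)

u, v = 2.0, 5.0  # arbitrary test points

# Identity (B.1): Delta[t -> 1/t](<u, v>) = -1/(u v).
lhs_b1 = delta(lambda t: 1.0 / t, u, v)
rhs_b1 = -1.0 / (u * v)

# Lemma B.7 with Phi = sin and Psi(t) = t^2.
def psi(t):
    return t * t

lhs_b7 = delta(lambda t: math.sin(psi(t)), u, v)
rhs_b7 = delta(math.sin, psi(u), psi(v)) * delta(psi, u, v)

# Degenerate case u == v of (B.1): the derivative of 1/t at t = 2 is -1/4.
diag_b1 = delta(lambda t: 1.0 / t, 2.0, 2.0)
```

Both sides of each identity agree up to floating-point error in the $u \ne v$ case, and the $u = v$ case matches the ordinary derivative up to the finite-difference error.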
Lemma B.8.
Let $\Phi : \mathbb{R}^2 \to \mathbb{R}$ be differentiable. For all $u, v \in \mathbb{R}$,
\[
\Delta[t \mapsto \Phi(t, t)](\langle u, v \rangle) = \Delta_2\Phi(u, \langle u, v \rangle) + \Delta_1\Phi(\langle u, v \rangle, v).
\]

Proof. If $u = v$, this is the multivariable chain rule. If $u \ne v$,
\begin{align*}
(v - u) \, \Delta[t \mapsto \Phi(t, t)](\langle u, v \rangle)
&= \Phi(v, v) - \Phi(u, u) \\
&= \Phi(v, v) - \Phi(u, v) + \Phi(u, v) - \Phi(u, u) \\
&= (v - u) \bigl( \Delta_1\Phi(\langle u, v \rangle, v) + \Delta_2\Phi(u, \langle u, v \rangle) \bigr).
\end{align*}
□

The most important result of this appendix is the following lemma, which formulates a version of integration by parts that works for hill-valley pairs despite their discontinuity.
Lemma B.9.
Let $y, z$ be a hill-valley pair, $\Phi : \mathbb{R}_+^3 \to \mathbb{R}$ be differentiable, $P : \mathbb{R}_+ \to \mathbb{R}$ be differentiable, and $\bar P(x) = c - P(x)$ for some $c \in \mathbb{R}$. If
\[
\bar P(0) \Phi(0, 0, z_0) = 0, \qquad \lim_{x \to \infty} \bar P(x) \Phi(y_x, x, z_x) = 0,
\]
then
\[
\int_0^\infty \Phi(y_x, x, z_x) \, \partial P(x) \, dx = \int_0^\infty \Bigl( \bar P(y_x) \Delta_3\Phi(y_x, y_x, \langle y_x, z_x \rangle) + \bar P(x) \partial_2\Phi(y_x, x, z_x) + \bar P(z_x) \Delta_1\Phi(\langle y_x, z_x \rangle, z_x, z_x) \Bigr) dx.
\]

Proof.
For each valley $(u, v]$,
\begin{align*}
\int_u^v \Phi(y_x, x, z_x) \, \partial P(x) \, dx
&= \int_u^v \bar P(x) \partial_2\Phi(u, x, v) \, dx + \bar P(u) \Phi(u, u, v) - \bar P(v) \Phi(u, v, v) \\
&= \int_u^v \bar P(x) \partial_2\Phi(u, x, v) \, dx + \bar P(u) \Phi(u, u, u) - \bar P(v) \Phi(v, v, v) \\
&\quad + (v - u) \bar P(u) \Delta_3\Phi(u, u, \langle u, v \rangle) + (v - u) \bar P(v) \Delta_1\Phi(\langle u, v \rangle, v, v) \\
&= \int_u^v \Bigl( \bar P(u) \Delta_3\Phi(u, u, \langle u, v \rangle) + \bar P(x) \partial_2\Phi(u, x, v) + \bar P(v) \Delta_1\Phi(\langle u, v \rangle, v, v) \Bigr) dx \\
&\quad + \bar P(u) \Phi(u, u, u) - \bar P(v) \Phi(v, v, v) \\
&= \int_u^v \Bigl( \bar P(y_x) \Delta_3\Phi(y_x, y_x, \langle y_x, z_x \rangle) + \bar P(x) \partial_2\Phi(y_x, x, z_x) + \bar P(z_x) \Delta_1\Phi(\langle y_x, z_x \rangle, z_x, z_x) \Bigr) dx \\
&\quad + \bar P(u) \Phi(u, u, u) - \bar P(v) \Phi(v, v, v).
\end{align*}
For each hill $(v, u]$,
\begin{align*}
\int_v^u \Phi(y_x, x, z_x) \, \partial P(x) \, dx
&= \int_v^u \bar P(x) \, \partial[t \mapsto \Phi(t, t, t)](x) \, dx + \bar P(v) \Phi(v, v, v) - \bar P(u) \Phi(u, u, u) \\
&= \int_v^u \Bigl( \bar P(x) \partial_3\Phi(x, x, x) + \bar P(x) \partial_2\Phi(x, x, x) + \bar P(x) \partial_1\Phi(x, x, x) \Bigr) dx \\
&\quad + \bar P(v) \Phi(v, v, v) - \bar P(u) \Phi(u, u, u) \\
&= \int_v^u \Bigl( \bar P(y_x) \Delta_3\Phi(y_x, y_x, \langle y_x, z_x \rangle) + \bar P(x) \partial_2\Phi(y_x, x, z_x) + \bar P(z_x) \Delta_1\Phi(\langle y_x, z_x \rangle, z_x, z_x) \Bigr) dx \\
&\quad + \bar P(v) \Phi(v, v, v) - \bar P(u) \Phi(u, u, u).
\end{align*}
Summing the hill and valley expressions over all hills and valleys, most of the non-integral terms cancel out, and the two that remain are 0 by assumption:
\begin{align*}
\int_0^\infty \Phi(y_x, x, z_x) \, \partial P(x) \, dx
&= \int_0^\infty \Bigl( \bar P(y_x) \Delta_3\Phi(y_x, y_x, \langle y_x, z_x \rangle) + \bar P(x) \partial_2\Phi(y_x, x, z_x) + \bar P(z_x) \Delta_1\Phi(\langle y_x, z_x \rangle, z_x, z_x) \Bigr) dx \\
&\quad + \bar P(0) \Phi(0, 0, z_0) - \lim_{x \to \infty} \bar P(x) \Phi(y_x, x, z_x).
\end{align*}
□

Our final two lemmas show that integrals using $\Delta$ can sometimes be turned into integrals using $\partial$.

Lemma B.10.
Let $y, z$ be a hill-valley pair and $\Phi : \mathbb{R}_+^3 \to \mathbb{R}_+$ be differentiable with respect to its second argument. Then
\[
\int_0^\infty \Delta_2\Phi(y_x, \langle y_x, z_x \rangle, z_x) \, dx = \int_0^\infty \partial_2\Phi(y_x, x, z_x) \, dx.
\]

Proof.
For each valley $(u, v]$,
\begin{align*}
\int_u^v \Delta_2\Phi(y_x, \langle y_x, z_x \rangle, z_x) \, dx
&= \int_u^v \Delta_2\Phi(u, \langle u, v \rangle, v) \, dx \\
&= (v - u) \, \Delta_2\Phi(u, \langle u, v \rangle, v) \\
&= \Phi(u, v, v) - \Phi(u, u, v) \\
&= \int_u^v \partial_2\Phi(u, x, v) \, dx \\
&= \int_u^v \partial_2\Phi(y_x, x, z_x) \, dx.
\end{align*}
For each hill $(v, u]$,
\begin{align*}
\int_v^u \Delta_2\Phi(y_x, \langle y_x, z_x \rangle, z_x) \, dx
&= \int_v^u \Delta_2\Phi(x, \langle x, x \rangle, x) \, dx \\
&= \int_v^u \partial_2\Phi(x, x, x) \, dx \\
&= \int_v^u \partial_2\Phi(y_x, x, z_x) \, dx.
\end{align*}
Summing the hill and valley expressions over all hills and valleys yields the desired result. □
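As a sanity check on Lemma B.10, the identity can be tested numerically on a small example. The sketch below is illustrative only: the partition (valleys $(0, 1]$ and $(2, 4]$, hills $(1, 2]$ and $(4, 6]$), the function $\Phi(y, x, z) = (1 + y)\sin x + z x^2$, and all helper names are assumptions, and the integrals are truncated to $[0, 6]$, which is harmless here because the proof establishes the identity separately on each hill and valley.

```python
import math

# Assumed hill-valley partition on [0, 6]: valleys (0, 1] and (2, 4];
# everything else in (0, 6] is a hill.
VALLEYS = [(0.0, 1.0), (2.0, 4.0)]

def y_z(x):
    """Return (y_x, z_x) as in Definition B.5 for the partition above."""
    for u, v in VALLEYS:
        if u < x <= v:
            return u, v
    return x, x  # on a hill, y_x = z_x = x

def phi(y, x, z):
    """An arbitrary smooth test function of (y, x, z)."""
    return (1.0 + y) * math.sin(x) + z * x * x

def d2_phi(y, x, z):
    """Exact partial derivative of phi in its second argument."""
    return (1.0 + y) * math.cos(x) + 2.0 * z * x

def delta2_phi(y, u, v, z):
    """Difference ratio of phi in its second argument over <u, v>."""
    if u != v:
        return (phi(y, v, z) - phi(y, u, z)) / (v - u)
    return d2_phi(y, u, z)

def midpoint_integral(f, a, b, n):
    """Midpoint rule; with n = 6000 on [0, 6] no cell straddles a boundary."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def lhs_integrand(x):
    y, z = y_z(x)
    return delta2_phi(y, y, z, z)

def rhs_integrand(x):
    y, z = y_z(x)
    return d2_phi(y, x, z)

lhs = midpoint_integral(lhs_integrand, 0.0, 6.0, 6000)
rhs = midpoint_integral(rhs_integrand, 0.0, 6.0, 6000)
```

On valleys the left-hand integrand is constant while the right-hand integrand varies with $x$, yet the two integrals agree up to quadrature error, which is exactly what the lemma asserts.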
Lemma B.11.
Let $y, z$ be a hill-valley pair and both $\Phi : \mathbb{R}_+^3 \to \mathbb{R}$ and $\Psi : \mathbb{R}_+ \to \mathbb{R}$ be differentiable. Then
\[
\int_0^\infty \Delta[t \mapsto \Phi(y_x, \Psi(t), z_x)](\langle y_x, z_x \rangle) \, dx = \int_0^\infty \Delta_2\Phi(y_x, \langle \Psi(y_x), \Psi(z_x) \rangle, z_x) \, \partial\Psi(x) \, dx.
\]

Proof.
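We compute
\begin{align*}
\int_0^\infty \Delta[t \mapsto \Phi(y_x, \Psi(t), z_x)](\langle y_x, z_x \rangle) \, dx
&= \int_0^\infty \Delta_2\Phi(y_x, \langle \Psi(y_x), \Psi(z_x) \rangle, z_x) \, \Delta\Psi(\langle y_x, z_x \rangle) \, dx && \text{[by Lem. B.7]} \\
&= \int_0^\infty \Delta_2\bigl[ u, t, v \mapsto \Delta_2\Phi(u, \langle \Psi(u), \Psi(v) \rangle, v) \cdot \Psi(t) \bigr](y_x, \langle y_x, z_x \rangle, z_x) \, dx \\
&= \int_0^\infty \Delta_2\Phi(y_x, \langle \Psi(y_x), \Psi(z_x) \rangle, z_x) \, \partial\Psi(x) \, dx. && \text{[by Lem. B.10]}
\end{align*}
□

B.2 Proofs of New Formulas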
We now apply the theory developed in Appendix B.1 to prove Theorems B.1–B.3. Throughout the proofs, $y_x$ and $z_x$ refer to $y_x^\pi$ and $z_x^\pi$, respectively. Recall that $y, z$ form a hill-valley pair (Definition B.5) under any monotonic SOAP policy $\pi$.

Proof of Theorem B.1.
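We compute
\begin{align*}
\mathbf{E}[Q_{\pi\text{-}1}]
&= \int_0^\infty \frac{\tau(z_x)}{\bar\rho(y_x) \bar\rho(z_x)} \, dF(x) && \text{[by (4.4)]} \\
&= \int_0^\infty \left( \frac{\bar F(y_x)}{\bar\rho(y_x)} \Delta\left[t \mapsto \frac{\tau(t)}{\bar\rho(t)}\right](\langle y_x, z_x \rangle) + \frac{\bar F(z_x) \tau(z_x)}{\bar\rho(z_x)} \Delta\left[t \mapsto \frac{1}{\bar\rho(t)}\right](\langle y_x, z_x \rangle) \right) dx && \text{[by Lem. B.9]} \\
&= \int_0^\infty \left( \frac{\bar F(y_x)}{\bar\rho(y_x)^2} \Delta\tau(\langle y_x, z_x \rangle) + \frac{\bar F(y_x) \tau(z_x)}{\bar\rho(y_x)} \Delta\left[t \mapsto \frac{1}{\bar\rho(t)}\right](\langle y_x, z_x \rangle) \right. \\
&\qquad\quad \left. {} + \frac{\bar F(z_x) \tau(z_x)}{\bar\rho(z_x)} \Delta\left[t \mapsto \frac{1}{\bar\rho(t)}\right](\langle y_x, z_x \rangle) \right) dx && \text{[by Lem. B.8]} \\
&= \int_0^\infty \left( \frac{\bar F(y_x)}{\bar\rho(y_x)^2} \partial\tau(x) + \tau(z_x) \left( \frac{\bar F(y_x)}{\bar\rho(y_x)} + \frac{\bar F(z_x)}{\bar\rho(z_x)} \right) \partial\left[t \mapsto \frac{1}{\bar\rho(t)}\right](x) \right) dx, && \text{[by Lem. B.10]}
\end{align*}
which equals the desired result by (4.5). □

Proof of Theorem B.2.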
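We compute
\begin{align*}
\mathbf{E}[R_{\pi\text{-}1}]
&= \int_0^\infty \frac{x}{\bar\rho(y_x)} \, dF(x) && \text{[by (4.4)]} \\
&= \int_0^\infty \left( z_x \bar F(z_x) \Delta\left[t \mapsto \frac{1}{\bar\rho(t)}\right](\langle y_x, z_x \rangle) + \frac{\bar F(x)}{\bar\rho(y_x)} \right) dx && \text{[by Lem. B.9]} \\
&= \int_0^\infty \left( -\frac{z_x \bar F(z_x)}{\bar\rho(y_x) \bar\rho(z_x)} \partial\bar\rho(x) + \frac{\bar F(x)}{\bar\rho(y_x)} \right) dx, && \text{[by (B.1), Lem. B.11]}
\end{align*}
which equals the desired result by (4.5). □

Proof of Theorem B.3.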
Very similarly to the proof of Theorem B.2, we compute
\begin{align*}
\mathbf{E}[S_{\pi\text{-}1}]
&= \int_0^\infty \frac{z_x}{\bar\rho(y_x)} \, dF(x) && \text{[by (4.6)]} \\
&= \int_0^\infty \left( z_x \bar F(z_x) \Delta\left[t \mapsto \frac{1}{\bar\rho(t)}\right](\langle y_x, z_x \rangle) + \frac{\bar F(y_x)}{\bar\rho(y_x)} \right) dx && \text{[by Lem. B.9]} \\
&= \int_0^\infty \left( -\frac{z_x \bar F(z_x)}{\bar\rho(y_x) \bar\rho(z_x)} \partial\bar\rho(x) + \frac{\bar F(y_x)}{\bar\rho(y_x)} \right) dx, && \text{[by (B.1), Lem. B.11]}
\end{align*}
which equals the desired result by (4.5).
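The formulas of Theorems B.1 and B.2 can be compared numerically against the (4.4) forms used at the start of the proofs, in the simplest setting where every age is a hill. The sketch below is a rough check under stated assumptions rather than a general implementation: it assumes a strictly increasing rank function so that $y_x = z_x = x$, exponential job sizes with mean 1, arrival rate $\lambda = 0.5$, the definitions $\bar\rho(x) = 1 - \lambda \int_0^x \bar F(t)\,dt$ and $\tau(x) = \int_0^x \lambda t \bar F(t)\,dt$, and truncation of the integrals at $x = 60$. All function names are introduced here for illustration.

```python
import math

LAM = 0.5  # assumed arrival rate; with E[X] = 1 the load is rho = 0.5

def tail(x):
    """Tail of X ~ Exp(1); it also equals the density for this distribution."""
    return math.exp(-x)

def rho_bar(x):
    """1 - rho(x), where rho(x) = LAM * (integral of the tail over [0, x])."""
    return 1.0 - LAM * (1.0 - math.exp(-x))

def tau(x):
    """Integral of LAM * t * tail(t) over [0, x], in closed form."""
    return LAM * (1.0 - (1.0 + x) * math.exp(-x))

def integrate(f, a=0.0, b=60.0, n=60000):
    """Midpoint rule; the tail beyond x = 60 is negligible here."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Residence time with y_x = z_x = x: the (4.4) form versus Theorem B.2.
r_direct = integrate(lambda x: x / rho_bar(x) * tail(x))
r_b2 = integrate(
    lambda x: LAM * x * tail(x) ** 2 / rho_bar(x) ** 2 + tail(x) / rho_bar(x)
)

# Waiting time with y_x = z_x = x: the (4.4) form versus Theorem B.1.
q_direct = integrate(lambda x: tau(x) / rho_bar(x) ** 2 * tail(x))
q_b1 = integrate(
    lambda x: 2.0 * tail(x) / rho_bar(x) * LAM * tau(x) * tail(x) / rho_bar(x) ** 2
    + LAM * x * tail(x) ** 2 / rho_bar(x) ** 2
)
```

In this all-hill case the new formulas reduce to ordinary integration by parts, so agreement of each pair is expected; the check does not exercise the valley terms, which are where the hill-valley machinery is needed.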