[PDF] Erlang Redux: An Ansatz Method for Solving the M/M/m Queue

Abstract

This exposition presents a novel approach to solving an M/M/m queue for the waiting time and the residence time. The motivation comes from an algebraic solution for the residence time of the M/M/1 queue. The key idea is the introduction of an ansatz transformation, defined in terms of the Erlang B function, that avoids the more opaque derivation based on applied probability theory. The only prerequisite is an elementary knowledge of the Poisson distribution, which is already necessary for understanding the M/M/1 queue. The approach described here supersedes our earlier approximate morphing transformation.

Full PDF

EErlang Redux

An Ansatz Method for Solving the M/M/m Queue

Neil J. Gunther

Performance Dynamics Company, Castro Valley, CA 94552 [email protected]

Abstract

This exposition presents a novel approach to solving an M/M/m queue for the wait-ing time and the residence time. The motivation comes from an algebraic solution for theresidence time of the M/M/1 queue. The key idea is the introduction of an ansatz transfor-mation, deﬁned in terms of the Erlang B function, that avoids the more opaque derivationbased on applied probability theory. The only prerequisite is an elementary knowledge ofthe Poisson distribution, which is already necessary for understanding the M/M/1 queue.The approach described here supersedes our earlier approximate morphing transformation.

The multi-server M/M/m queue arises in the performance analysis of such systems as: call cen-ters, manufacturing, communications networks, multicore computers, and multithreaded soft-ware applications. Unfortunately, those who should be applying M/M/m models to the perfor-mance analysis of their designs and architectures are often not schooled in applied probabilitytheory. This situation cries out for a more intuitive approach to understanding multi-serverqueues—along the lines of the algebraic approach used to develop the residence time for anM/M/1 queue [1, 2]. However, this apparently simple objective has proved more difﬁcult thanone might reasonably expect. A previous attempt to meet this goal was based on our morphing model approximation toM/M/m [3, 4]. The residence time formula in the morphing model is simpler mathematicallyand more intuitive than the exact solution based on the original Erlang C function [5, Eq. 5].Nonetheless, it is only an approximation. A similar approach, but one that produces the exactsolution, has remained desirable.Here, we present a method that achieves the desired goal. Our approach arises from a con-ﬂuence of several observations that had been overlooked previously. In particular: (i) we focuson the mean waiting time W m , rather than the residence time R m (as was done in the morphing The situation is reminiscent of one that Kepler must have faced in going from circular to elliptic orbits. In-troducing even a modest amount of eccentricity causes profound complications for expressing and calculating thecircumference of an ellipse. Subsequently, others developed a variety of approximations. a r X i v : . [ c s . PF ] A ug odel), (ii) W m can be expressed as a transformation of R : a fast M/M/1 residence time,(iii) the transformation function Φ B takes us from the Erlang B function to the Erlang C func-tion, (iv) since these are probability functions, Φ B must exist on the interval [0 , , and therefore(v) it cannot be deﬁned in terms of queue attributes, such as unbounded queue length. Theseobservations, taken collectively, then allow us to reprise the logic of the previous morphingderivation to arrive at the exact waiting time and residence time formulæ for an M/M/m queue.The structure of this paper is as follows. In Section 2, we review the algebraic treatment ofthe M/M/1 queue. Section 3 reviews the morphing model, that transforms m parallel M/M/1queues into a single fast M/M/1 queue, in agreement with the residence time characteristics ofan M/M/m queue. The morphing transformation function φ ρ , which is a ﬁnite geometric seriesin the server utilization ρ , produces only an approximate solution for R m . Section 4 returns tothe original problem but, replaces φ ρ with Φ B to recover the exact R m . The iron law of residence time is R = S + W (1)where S is the mean service time and W the mean waiting time. The waiting time for M/M/1can be viewed as being the due to the number of customers in the system, Q , ahead of you whenyou join the queue, i.e., W = QS . Furthermore, the number of customers in the system can bedetermined from Little’s law, Q = λR , where λ is the mean arrival rate.Substituting Little’s law into (1) produces R = S + QS = S + ( λR ) S = S + R ( λS )= S + R ρ where we have denoted the server utilization by ρ = λS . A ﬁnal rearrangement yields R = S − ρ (2)which is the canonical expression for the M/M/1 residence time [1, 2] but, derived here withoutresorting to the usual applied probability theory found in standard texts [6–10]. The subscript in(2) has been introduced to distinguish the number of servers, m , in the queueing facility for latercomparisons. Notice the restriction ρ < in (2) to prevent the queue length from becominginﬁnite (unstable queue). Remark 1.

Although (2) —and similar equations that appear throughout—relates mean valuesof the respective metrics, it is not a so-called operational law [1] because these metrics dependon the underlying statistical distribution. Morphing M/M/m

We would like to apply the same algebraic treatment to an M/M/2 queue and ultimately, itsM/M/m generalization, especially for more practical applications [9, 11] and pedagogic pur-poses [12]. Remark 2.

It is important to note that the arrival rate λ needs to be doubled for m = 2 ifthe capacity of both servers is to be fully utilized. Since neither server can be more than 100%busy, the corresponding server utilization has to be deﬁned as ρ = λS in order that ρ < . Since ρ << , we expect R < R if the denominator in (2) is replaced by − ρ , viz., R = S − ρ (3)Moreover, we can interpret ρ as representing the smaller probability that both servers are busysimultaneously. Indeed, (3) agrees with the exact solution based on the Erlang’s C function [5].Generalizing this observation led to the morphing model [3, 4] R m ( φ ) = (cid:18) S − ρ (cid:19) φ ρ (4)where φ ρ = 1 − ρ − ρ m (5)is the sum of a ﬁnite geometric series and ρ = λSm < (6)is the per-server utilization.Table (1) Correction terms for the morphing approximation (5) m Integer polynomials P m ( ρ ) − ρ + 1 − ρ + 1 ρ + ρ − ρ − ρ + 4 ρ − ρ − ρ − ρ + 75 ρ − ρ − ρ − ρ − − ρ − ρ + 30 ρ + 35 ρ + 20 ρ + 5 ρ + 12005 ρ + 2058 ρ − ρ − ρ − ρ − ρ − ρ + 12288 ρ + 3584 ρ − ρ − ρ − ρ − ρ − ρ − Equation (4) formally captures the idea that an M/M/m queue is load-dependent in such a waythat it can be regarded as “morphing” between two types of virtual queueing facilities: It is noteworthy that [1] does not derive or discuss the equivalent of the M/M/m queue. ery low load: M/M/m acts like a set of m parallel M/M/1 queues with very little waiting-lineformation. Very heavy load:

M/M/m becomes a single M/M/1 queue with a server that is m times fasterthan a parallel queue server.According to (4), adding another server ( m = 3 ) corresponds to a residence time given by R ( φ ) = S − ρ which is incorrect. The exact expression, based on the Erlang C function (11), is R = S + 3 ρ S ρ + ρ + 3 ρ (7)The difﬁculty with (7), however, is that it cannot be further simpliﬁed, and the algebraicform is completely inscrutable by comparison with the morphing model. All intuition is lost.Part of the trouble stems from the fact that ﬁnite m introduces a truncated exponential seriesand, unlike (5) in the morphing model, there is no simple closed-form expression. Thus, weare stuck on the horns of a dilemma: the morphing model is much more intuitively appealing(particularly for pedagogy) but it is only an approximation. On a beneﬁcial note, although (4)is an approximation, the error ∆ R m ( φ ) < ln( m / )1 + ln( m ) (8)is bounded above by 25% for extremely large m values [4]. In practice, the error is typicallybetween 5% and 10% and that makes the morphing model useful for quick engineering esti-mates [11]. Different approximations for M/M/m queue metrics have been reported by others.See e.g., [13, 14].One way out of this dilemma is to ﬁnd the correction factor that takes us from (4) to theexact solution. Indeed, the corrected version of (4) can be written as [4] R m = S − (cid:12)(cid:12)(cid:12)(cid:12) c m P m − ( ρ ) (cid:12)(cid:12)(cid:12)(cid:12) ρ m (9)where P m − ( ρ ) is the deﬂated polynomial associated with P m ( ρ ) = c m ρ m + . . . + c ρ + c ρ + c ρ + c Example integer coefﬁcients, c m , are shown in Table 1 for m = 1 , , . . . . Clearly, the cor-rection polynomials are just as complicated as the terms in the exact Erlang C function so, notmuch progress has been achieved by comparison with the morphing model.The denominator in (5), when analytically continued to complex ρ , has zeros that corre-spond to roots of unity that lie on the circumference of the unit disk in Fig. 1. Conversely, zerosof the corrected denominator in (9) lie on the interior of the unit disk. As m increases, thosezeros move further away from the circumference and converge on the Szeg˝o bound [4,15]. Evenwithout understanding the mathematical construction, Fig. 1 offers a striking visualization ofthe complexity with which we are dealing. 4 = - - Re ( (cid:1) ) - (cid:1)(cid:1) Im ( (cid:1) ) Szego curve

Figure (1) Zeros ( dots ) of the polynomials in Table 1 for m = 1 , , , . . . , . Zeros of the morphing approximation ( greendots ) lie symmetrically on the circumference of the unit disk. Zeros ofthe corrected solutions ( blue dots ) lie in the interior of the unit disk andconverge on the tear-drop shaped Szeg˝o bound ( red curve ). Progress toward an algebraic derivation of the exact solution, while at the same time adheringto the objectives of Sections 1 and 2, can be made by noting that the mathematical limitationsof the morphing construction (and why it is only an approximation) can be attributed to thefollowing assumptions:1. Modifying R , rather than W , is the wrong starting point.2. Unlike M/M/1, both W m and R m are state-dependent.3. The low-trafﬁc limit corresponds to m delay servers, not parallel M/M/1 queues.The last point refers to the assumption that the morphing transformation (4) assumes m paral-lel M/M/1 queues, with mostly empty waiting lines, in low-trafﬁc limit, whereas there are nowaiting states at low load. 5ith assumption 1 in mind, we now turn our attention to the canonical exact form of theM/M/m waiting time [3, 6–8] W m = C ( m, ρ ) Sm (1 − ρ ) (10)Here, C ( m, ρ ) is the well-known Erlang C function , which we write as C ( m, a ) = A m (1 − ρ ) S k + A m (11)with a = mρ , A m = a m /m ! and S k = (cid:80) m − k a k /k ! .We want to determine C ( m, ρ ) by means of a less abstract procedure than that found ineither Erlang’s original paper [5] or standard queueing theory texts [6–8]. The main idea is toreprise the approach used to derive the morphing model but, instead of φ ρ deﬁned by (5), replaceit with an ansatz transformation function Φ B ( m, ρ ) to derive the the equivalent of C ( m, ρ ) in amore intuitive way. Once we determine the equivalent of C ( m, ρ ) , the M/M/m waiting time isdeﬁned by (10), and the corresponding residence time R m follows from (1). λ Βλ (1 −Β ) λ m λ Βλ (1 −Β ) λ m Figure (2) A fraction Bλ of offered calls is rejected and lost from the sys-tem when all m servers become instantaneously busy. The trafﬁc intensity a = λS can be arbitrarily large. In this section we adopt the teletrafﬁc parlance of Erlang’s paper [5]. We could start with apure delay center, i.e., M/M/ ∞ , where calls arrive with mean Poisson rate λ and are servicedby an inﬁnite number of servers, each having a mean exponentially-distributed service period S . Since a call always ﬁnds an available operator, no waiting occurs and the mean time spent inthe system is simply R m = S .However, with assumption 3 in mind, it is more appropriate to start with an M/M/m/m queuethat has a ﬁnite number of servers but still no waiting states allowed. That restriction causescalls to be lost from the system with probability B = B ( m, ρ ) , as depicted in Fig. 2. Thus, thequeue length can never exceed m calls in service. This is the Erlang loss model [6–10] with B being Erlang’s B function [5, Eq. 1]. Following the notation in (11), we write it as B ( m, a ) = A m S k + A m (12) Arnold Allen has described using (11) to calculate the Erlang C function as an unnatural act.

6n M/M/m queue, on the other hand, has waiting states. In order to include those addi-tional states, we ﬁrst introduce a “bucket” in Fig. 3 to capture the Bλ rejected calls. Thesecaptured calls are placed in an ordered list, i.e., callers take a number. The bucket does notchange the operation of the M/M/m/m queue in any way. λ m λ Bucket λ (1 −Β ) λ mBucket Βλ Figure (3) A bucket is introduced to capture the rejected calls as an orderedlist. Callers take a number.Deﬁning R = S/m − ρ (13)to represent the M/M/1 residence time (2) but with an m -times faster service facility, (10) canbe rewritten as W m = R Φ B (14)where Φ B is a transformation to be determined. Equation (14) says that the M/M/m waitingtime can be regarded as a proportion of the fast residence time R . That fraction is given by Φ B . Equation (14) is on the same logical footing as (4) in the morphing model.Next, the servers in Fig. 3 are repositioned behind the bucket (with respect to the directionof trafﬁc ﬂow). Consequently, the bucket now collects all incoming calls since there can be norejected calls. This is the ﬁrst signiﬁcant differece from Figs. 2 and 3. In this conﬁguration, thebucket would accumulate calls indeﬁnitely, due to the fact that none are being serviced, and thestate-space would therefore become inﬁnite. λ (1 −Β ) λ mBucket Βλλ

Bucket m

Figure (4) Next, the servers in Fig. 3.are repositioned behind the bucket.The bucket now collects all incoming calls, not just rejected calls, but noneare being serviced. A.K. Erlang called them “waiting arrangements” rather than a queue. Callers would presumably wait on theline for the operator to ﬁnally connect their call manually instead of hanging up.

7o avoid the “overﬂow” problem in Fig. 4, the bucket has a hole drilled into its base suchthat calls can be serviced from it in FIFO order. This is the second signiﬁcant change. Itcorresponds to the

Erlang C function in terms of how it relates to the Erlang B function.Moreover, new arrivals are appended to the ordered list of calls already in the bucket, whichis equivalent to joining the tail of a waiting line. With servicing restored, the mean number ofrequests in the bucket reaches steady-state equilibrium and the number of waiting calls becomesbounded. That number, in turn, determines the mean waiting time W m in the queue of Fig. 5. λ m λ Figure (5) To service the collected calls, the bucket has a hole drilled intoits base such that calls are serviced in FIFO order. New arrivals are appendedto the ordered list of calls already in the bucket. The trafﬁc intensity is nowbounded above by a = m . Remark 3 (Utilization) . There is a constraint on the per-server utilization ρ in both an M/M/m/mqueue and an M/M/m queue. Since the effective arrival rate at the M/M/m/m servers, due lostcalls in Fig. 2, is only (1 − B ) λ , the per-server utilization is ρ = (1 − B ) am < (15) and only approaches 100% busy at large trafﬁc intensities. With the leaking bucket in place(Fig. 5), B = 0 so, the per-server utilization becomes ρ = am < (16) which means that a < m , in order to maintain queue stability. The difference between (15) and (16) is shown in Fig. 6. Arriving calls in Figs. 2 and 3 arePoisson distributed, and that introduces a tendency toward longer inter-arrival periods, relativeto the mean S . On the other hand, an available M/M/m server instantaneously retrieves the nextcall from the head of the waiting line (the hole in the bucket of Fig. 5) and thus, it saturatesmore rapidly. The progression from Fig. 2 to Fig. 5 essentially extends the queueing states from a ﬁnite state-space in M/M/m/m to an inﬁnite state-space in M/M/m. We need to include the waiting calls ofFig. 5 into the transformation function of (14). We know from both M/M/1 and the morphingmodel that unbounded waiting states are generally identiﬁed with the inﬁnite geometric series − ρ = 1 + ρ + ρ + ρ + . . . (17)8 ��  �� a ρ a / m ( - B ) a / m Figure (6) The per-server utilization in M/M/m/m only approaches 100%busy as the trafﬁc intensity a becomes very large. M/M/m per-server utiliza-tion saturates more rapidly.familiar in many queue-theoretic formulæ.Equation (17) provides a clue as to how we might deﬁne Φ B , starting with B in Fig. 2 but,also including those waiting states. However, we cannot deﬁne Φ B in the same way as (17)because Erlang C in (11) is a probability function that satisﬁes the following conditions:1. C ( m, a ) ∈ [0 , for m = 1 , , , . . . and a ≥ .2. C ( m = 1 , a ) is linear-rising in Fig. 7b, as expected for M/M/1. For a > , Erlang C isconstant, i.e., C (1 , a ) = 1 , since the server remains saturated at 100% busy. Of course,in this region, an M/M/1 queue becomes unstable.3. More generally, C ( m, a ) is convex up to a = m .4. In the low trafﬁc limit a → , we assume C (cid:39) B (cf. Fig. 7a), and similarly for ourtranformation function, Φ B (cid:39) B .5. In the heavy trafﬁc limit a → m , we know C → , which suggests Φ B → B/B .These considerations lead to the following anzatz for Φ B : Φ B = B ( m, ρ )1 − [1 − B ( m, ρ )] ρ (18)Example expressions of (18) are shown in Table 2.To further substantiate the choice of (18), we consider the light and heavy trafﬁc limits W m = (cid:26) as ρ = (cid:15) (very light trafﬁc) R as ρ = 1 − (cid:15) (very heavy trafﬁc) (19)where (cid:15) is a vanishingly small quantity. 9 ��  �� a B ( m,a ) m = = = = = (a) Blocked call probability, B(m,a) ��  �� a C ( m,a ) m = = = = = (b) Probability that call waits, C(m,a) Figure (7) Erlang B and C curves as functions of the trafﬁc intensity a = λS . Under very low load, ρ = (cid:15) , the waiting time (14) becomes W m = S/m − (cid:15) (cid:20) B ( m, (cid:15) )1 − [1 − B ( m, (cid:15) )] (cid:15) (cid:21) Since B ( m, ρ ) (cid:39) when ρ = (cid:15) , W m vanishes. Substituting into (1), the residence time is R m = S , which also corresponds to Fig. 2 in the low-trafﬁc limit. Under very high load, ρ = 1 − (cid:15) , and (14) becomes W m = S/m − (1 − (cid:15) ) (cid:20) B ( m, − (cid:15) )1 − [1 − B ( m, − (cid:15) )](1 − (cid:15) ) (cid:21) (20)From Fig. 7a, we see B ( m, ρ ) (cid:28) B ( m, a ) and thus, for a given value of ρ and m , B ( m, ρ ) can be replaced by a constant δ < . Applying this to (20) produces W m = Sm(cid:15) (cid:20) δ − [1 − δ ](1 − (cid:15) ) (cid:21) = Sm(cid:15) (cid:20) δ − (1 − δ − (cid:15) − δ(cid:15) ) (cid:21) = Sm(cid:15) (21)where we have invoked the additional reasonable assumption δ (cid:29) (cid:15) . Finally, (21) becomes W m = Sm (1 − ρ ) = R which is identical to (13), viz., an m -speed M/M/1 server: a result that is also in agreement withthe morphing model of Section 3. As expected, it also corresponds to (10) under heavy trafﬁcsince Erlang C reaches probability one as ρ approaches 100% busy.10able (2) Examples of Φ B ( m, a ) with a = mρ . m B ( m , a ) [ − ( − B ( m , a )) ρ ] − Φ B ( m , a )1 ρ ρ ρ ρ ρ ρ (1+ ρ ) 1+2 ρ +2 ρ ρ ρ ρ ρ ρ (2+3 ρ (1+ ρ )) 2+6 ρ +9 ρ +9 ρ ρ (4+3 ρ ) 9 ρ ρ (4+3 ρ ) ρ ρ (3+2 ρ (3+4 ρ (1+ ρ ))) 3+4 ρ (3+2 ρ (3+4 ρ (1+ ρ )))3+ ρ (9+4 ρ (3+2 ρ )) 32 ρ ρ (9+4 ρ (3+2 ρ )) ρ ρ (24+5 ρ (12+5 ρ (4+5 ρ (1+ ρ )))) 24+5 ρ (24+5 ρ (12+5 ρ (4+5 ρ (1+ ρ ))))24+ ρ (96+5 ρ (36+5 ρ (8+5 ρ ))) 625 ρ ρ (96+5 ρ (36+5 ρ (8+5 ρ ))) ρ ρ (5+3 ρ (5+ ρ (10+3 ρ (5+6 ρ (1+ ρ ))))) 5+6 ρ (5+3 ρ (5+ ρ (10+3 ρ (5+6 ρ (1+ ρ )))))5+ ρ (25+6 ρ (10+3 ρ (5+ ρ (5+3 ρ )))) 324 ρ ρ (25+6 ρ (10+3 ρ (5+ ρ (5+3 ρ )))) Our purpose here has been to offer a more intuitive derivaton of M/M/m queueing metrics, notto promote (18) as a computational device. Computing (18) is equivalent to computing (11).However, if one should want to use Φ B ( m, a ) for calculations or other instruction, then it isclear that B ( m, a ) has to be evaluated ﬁrst.Rather than using (12) which, to paraphrase Arnold Allen: is hardly more “natural” than(11), Erlang B can more easily be computed using the iterative algorithm [16] in listing 1.Listing (1) R code to compute the Erlang B function erlangB < − function (m, a) { eB < − a / (1 + a) if (m == 1) { return (eB) } for (k in 2:m) { eB < − eB ∗ a / (a ∗ eB + k) } return (eB) } If R, or similar statistical software, is already being employed, one can make direct useof the Poisson PMF (probability mass function) and CDF (cumulative distribution function) tosimplify the code in listing 2.Listing (2) Compute Erlang B from the Poisson PMF and CDF erlangB < − function (m, a) { return ( dpois (m, a) / ppois (m, a )) } Example calculations computed in this way are summarized in Table 3.11able (3) Example M/M/m metrics with mean service time S = 1 [5] m a B ( m, a ) Poisson Φ B C ( m, a ) W m R m R m ( φ ) The goal of algebraically deriving the exact residence time for an M/M/m queue—motivatedby the same approach to M/M/1—has ﬁnally been achieved here. Several subtle observationsare needed to enable this result: (a) focus on the waiting time W m , rather than the residencetime R m , (b) make M/M/m/m the starting point (rather than parallel M/M/1 queues), (c) thediagrams in Figs. 2–5 aid development of the ansatz Φ B , (d) Φ B must conform to a probabilityfunction, and (e) equation (18) modiﬁes the fast residence time R , not R These observationsalso facilitated reprising the morphing model derivation to verify our ansatz.Equation (18) can also be derived formally from (11) and (12) but, their respective startingpoints rely on conventional applied probability theory methods, which it has been our objectiveto avoid. Indeed, the same expression for the Erlang C function is known in the literature [7, 8],especially for the purpose of programmatic computation. [1] E. D. Lazowska, J. Zahorjan, G. S. Graham, K. C. Sevcik,

Quantitative System Per-formance: Computer System Analysis Using Queueing Network Models , Prentice-Hall(1984)[2] N.J. Gunther,

The Practical Performance Analyst , McGraw-Hill (1998)[3] N.J. Gunther,

Analyzing Computer System Performance with Perl::PDQ , Springer (2005)[4] N.J. Gunther, “Morphing M/M/m: A New View of An Old Queue,” IFORS 21st Conf.Intl. Federation of Op. Research Soc., July 17–21, Quebec City, Canada (2017)[5] A. Erlang, “Solution of Some Problems in the Theory of Probabilities of Signiﬁcance inAutomatic Telephone Exchanges,”

Electroteknikeren , v. 13, p. 5 (1917)[6] L. Kleinrock,

Queueing Systems: Vol. I , Wiley (1975)[7] A. Allen,

Probability, Statistics and Queueing Theory , Academic Press (1990)[8] T. G. Robertazzi,

Computer Networks and Systems: Queueing Theory and PerformanceEvaluation , 3rd edition, Springer (2000) 129] D. Bertsekas and R. Gallager,

Data Networks , Prentic-Hall (1987)[10] D. Gross and C.M. Harris,

Fundamentals of Queueing Theory , 3rd edition, Wiley (1998)[11] N.J. Gunther,

Guerrilla Capacity Planning: A Tactical Approach to Planning for HighlyScalable Applications and Services , Springer (2007)[12] N.J. Gunther, Guerrilla Training Classes, Performance Dynamics Educational Services, [13] H. Sakasegawa “An Approximation Formula L q (cid:39) α · ρ β / (1 − ρ ))