Erlang Redux: An Ansatz Method for Solving the M/M/m Queue
EErlang Redux
An Ansatz Method for Solving the M/M/m Queue
Neil J. Gunther
Performance Dynamics Company, Castro Valley, CA 94552 [email protected]
Abstract
This exposition presents a novel approach to solving an M/M/m queue for the wait-ing time and the residence time. The motivation comes from an algebraic solution for theresidence time of the M/M/1 queue. The key idea is the introduction of an ansatz transfor-mation, defined in terms of the Erlang B function, that avoids the more opaque derivationbased on applied probability theory. The only prerequisite is an elementary knowledge ofthe Poisson distribution, which is already necessary for understanding the M/M/1 queue.The approach described here supersedes our earlier approximate morphing transformation.
The multi-server M/M/m queue arises in the performance analysis of such systems as: call cen-ters, manufacturing, communications networks, multicore computers, and multithreaded soft-ware applications. Unfortunately, those who should be applying M/M/m models to the perfor-mance analysis of their designs and architectures are often not schooled in applied probabilitytheory. This situation cries out for a more intuitive approach to understanding multi-serverqueues—along the lines of the algebraic approach used to develop the residence time for anM/M/1 queue [1, 2]. However, this apparently simple objective has proved more difficult thanone might reasonably expect. A previous attempt to meet this goal was based on our morphing model approximation toM/M/m [3, 4]. The residence time formula in the morphing model is simpler mathematicallyand more intuitive than the exact solution based on the original Erlang C function [5, Eq. 5].Nonetheless, it is only an approximation. A similar approach, but one that produces the exactsolution, has remained desirable.Here, we present a method that achieves the desired goal. Our approach arises from a con-fluence of several observations that had been overlooked previously. In particular: (i) we focuson the mean waiting time W m , rather than the residence time R m (as was done in the morphing The situation is reminiscent of one that Kepler must have faced in going from circular to elliptic orbits. In-troducing even a modest amount of eccentricity causes profound complications for expressing and calculating thecircumference of an ellipse. Subsequently, others developed a variety of approximations. a r X i v : . [ c s . PF ] A ug odel), (ii) W m can be expressed as a transformation of R : a fast M/M/1 residence time,(iii) the transformation function Φ B takes us from the Erlang B function to the Erlang C func-tion, (iv) since these are probability functions, Φ B must exist on the interval [0 , , and therefore(v) it cannot be defined in terms of queue attributes, such as unbounded queue length. Theseobservations, taken collectively, then allow us to reprise the logic of the previous morphingderivation to arrive at the exact waiting time and residence time formulæ for an M/M/m queue.The structure of this paper is as follows. In Section 2, we review the algebraic treatment ofthe M/M/1 queue. Section 3 reviews the morphing model, that transforms m parallel M/M/1queues into a single fast M/M/1 queue, in agreement with the residence time characteristics ofan M/M/m queue. The morphing transformation function φ ρ , which is a finite geometric seriesin the server utilization ρ , produces only an approximate solution for R m . Section 4 returns tothe original problem but, replaces φ ρ with Φ B to recover the exact R m . The iron law of residence time is R = S + W (1)where S is the mean service time and W the mean waiting time. The waiting time for M/M/1can be viewed as being the due to the number of customers in the system, Q , ahead of you whenyou join the queue, i.e., W = QS . Furthermore, the number of customers in the system can bedetermined from Little’s law, Q = λR , where λ is the mean arrival rate.Substituting Little’s law into (1) produces R = S + QS = S + ( λR ) S = S + R ( λS )= S + R ρ where we have denoted the server utilization by ρ = λS . A final rearrangement yields R = S − ρ (2)which is the canonical expression for the M/M/1 residence time [1, 2] but, derived here withoutresorting to the usual applied probability theory found in standard texts [6–10]. The subscript in(2) has been introduced to distinguish the number of servers, m , in the queueing facility for latercomparisons. Notice the restriction ρ < in (2) to prevent the queue length from becominginfinite (unstable queue). Remark 1.
Although (2) —and similar equations that appear throughout—relates mean valuesof the respective metrics, it is not a so-called operational law [1] because these metrics dependon the underlying statistical distribution. Morphing M/M/m
We would like to apply the same algebraic treatment to an M/M/2 queue and ultimately, itsM/M/m generalization, especially for more practical applications [9, 11] and pedagogic pur-poses [12]. Remark 2.
It is important to note that the arrival rate λ needs to be doubled for m = 2 ifthe capacity of both servers is to be fully utilized. Since neither server can be more than 100%busy, the corresponding server utilization has to be defined as ρ = λS in order that ρ < . Since ρ << , we expect R < R if the denominator in (2) is replaced by − ρ , viz., R = S − ρ (3)Moreover, we can interpret ρ as representing the smaller probability that both servers are busysimultaneously. Indeed, (3) agrees with the exact solution based on the Erlang’s C function [5].Generalizing this observation led to the morphing model [3, 4] R m ( φ ) = (cid:18) S − ρ (cid:19) φ ρ (4)where φ ρ = 1 − ρ − ρ m (5)is the sum of a finite geometric series and ρ = λSm < (6)is the per-server utilization.Table (1) Correction terms for the morphing approximation (5) m Integer polynomials P m ( ρ ) − ρ + 1 − ρ + 1 ρ + ρ − ρ − ρ + 4 ρ − ρ − ρ − ρ + 75 ρ − ρ − ρ − ρ − − ρ − ρ + 30 ρ + 35 ρ + 20 ρ + 5 ρ + 12005 ρ + 2058 ρ − ρ − ρ − ρ − ρ − ρ + 12288 ρ + 3584 ρ − ρ − ρ − ρ − ρ − ρ − Equation (4) formally captures the idea that an M/M/m queue is load-dependent in such a waythat it can be regarded as “morphing” between two types of virtual queueing facilities: It is noteworthy that [1] does not derive or discuss the equivalent of the M/M/m queue. ery low load: M/M/m acts like a set of m parallel M/M/1 queues with very little waiting-lineformation. Very heavy load:
M/M/m becomes a single M/M/1 queue with a server that is m times fasterthan a parallel queue server.According to (4), adding another server ( m = 3 ) corresponds to a residence time given by R ( φ ) = S − ρ which is incorrect. The exact expression, based on the Erlang C function (11), is R = S + 3 ρ S ρ + ρ + 3 ρ (7)The difficulty with (7), however, is that it cannot be further simplified, and the algebraicform is completely inscrutable by comparison with the morphing model. All intuition is lost.Part of the trouble stems from the fact that finite m introduces a truncated exponential seriesand, unlike (5) in the morphing model, there is no simple closed-form expression. Thus, weare stuck on the horns of a dilemma: the morphing model is much more intuitively appealing(particularly for pedagogy) but it is only an approximation. On a beneficial note, although (4)is an approximation, the error ∆ R m ( φ ) < ln( m / )1 + ln( m ) (8)is bounded above by 25% for extremely large m values [4]. In practice, the error is typicallybetween 5% and 10% and that makes the morphing model useful for quick engineering esti-mates [11]. Different approximations for M/M/m queue metrics have been reported by others.See e.g., [13, 14].One way out of this dilemma is to find the correction factor that takes us from (4) to theexact solution. Indeed, the corrected version of (4) can be written as [4] R m = S − (cid:12)(cid:12)(cid:12)(cid:12) c m P m − ( ρ ) (cid:12)(cid:12)(cid:12)(cid:12) ρ m (9)where P m − ( ρ ) is the deflated polynomial associated with P m ( ρ ) = c m ρ m + . . . + c ρ + c ρ + c ρ + c Example integer coefficients, c m , are shown in Table 1 for m = 1 , , . . . . Clearly, the cor-rection polynomials are just as complicated as the terms in the exact Erlang C function so, notmuch progress has been achieved by comparison with the morphing model.The denominator in (5), when analytically continued to complex ρ , has zeros that corre-spond to roots of unity that lie on the circumference of the unit disk in Fig. 1. Conversely, zerosof the corrected denominator in (9) lie on the interior of the unit disk. As m increases, thosezeros move further away from the circumference and converge on the Szeg˝o bound [4,15]. Evenwithout understanding the mathematical construction, Fig. 1 offers a striking visualization ofthe complexity with which we are dealing. 4 = - - Re ( (cid:1) ) - (cid:1)(cid:1) Im ( (cid:1) ) Szego curve
Figure (1) Zeros ( dots ) of the polynomials in Table 1 for m = 1 , , , . . . , . Zeros of the morphing approximation ( greendots ) lie symmetrically on the circumference of the unit disk. Zeros ofthe corrected solutions ( blue dots ) lie in the interior of the unit disk andconverge on the tear-drop shaped Szeg˝o bound ( red curve ). Progress toward an algebraic derivation of the exact solution, while at the same time adheringto the objectives of Sections 1 and 2, can be made by noting that the mathematical limitationsof the morphing construction (and why it is only an approximation) can be attributed to thefollowing assumptions:1. Modifying R , rather than W , is the wrong starting point.2. Unlike M/M/1, both W m and R m are state-dependent.3. The low-traffic limit corresponds to m delay servers, not parallel M/M/1 queues.The last point refers to the assumption that the morphing transformation (4) assumes m paral-lel M/M/1 queues, with mostly empty waiting lines, in low-traffic limit, whereas there are nowaiting states at low load. 5ith assumption 1 in mind, we now turn our attention to the canonical exact form of theM/M/m waiting time [3, 6–8] W m = C ( m, ρ ) Sm (1 − ρ ) (10)Here, C ( m, ρ ) is the well-known Erlang C function , which we write as C ( m, a ) = A m (1 − ρ ) S k + A m (11)with a = mρ , A m = a m /m ! and S k = (cid:80) m − k a k /k ! .We want to determine C ( m, ρ ) by means of a less abstract procedure than that found ineither Erlang’s original paper [5] or standard queueing theory texts [6–8]. The main idea is toreprise the approach used to derive the morphing model but, instead of φ ρ defined by (5), replaceit with an ansatz transformation function Φ B ( m, ρ ) to derive the the equivalent of C ( m, ρ ) in amore intuitive way. Once we determine the equivalent of C ( m, ρ ) , the M/M/m waiting time isdefined by (10), and the corresponding residence time R m follows from (1). λ Βλ (1 −Β ) λ m λ Βλ (1 −Β ) λ m Figure (2) A fraction Bλ of offered calls is rejected and lost from the sys-tem when all m servers become instantaneously busy. The traffic intensity a = λS can be arbitrarily large. In this section we adopt the teletraffic parlance of Erlang’s paper [5]. We could start with apure delay center, i.e., M/M/ ∞ , where calls arrive with mean Poisson rate λ and are servicedby an infinite number of servers, each having a mean exponentially-distributed service period S . Since a call always finds an available operator, no waiting occurs and the mean time spent inthe system is simply R m = S .However, with assumption 3 in mind, it is more appropriate to start with an M/M/m/m queuethat has a finite number of servers but still no waiting states allowed. That restriction causescalls to be lost from the system with probability B = B ( m, ρ ) , as depicted in Fig. 2. Thus, thequeue length can never exceed m calls in service. This is the Erlang loss model [6–10] with B being Erlang’s B function [5, Eq. 1]. Following the notation in (11), we write it as B ( m, a ) = A m S k + A m (12) Arnold Allen has described using (11) to calculate the Erlang C function as an unnatural act.
6n M/M/m queue, on the other hand, has waiting states. In order to include those addi-tional states, we first introduce a “bucket” in Fig. 3 to capture the Bλ rejected calls. Thesecaptured calls are placed in an ordered list, i.e., callers take a number. The bucket does notchange the operation of the M/M/m/m queue in any way. λ m λ Bucket λ (1 −Β ) λ mBucket Βλ Figure (3) A bucket is introduced to capture the rejected calls as an orderedlist. Callers take a number.Defining R = S/m − ρ (13)to represent the M/M/1 residence time (2) but with an m -times faster service facility, (10) canbe rewritten as W m = R Φ B (14)where Φ B is a transformation to be determined. Equation (14) says that the M/M/m waitingtime can be regarded as a proportion of the fast residence time R . That fraction is given by Φ B . Equation (14) is on the same logical footing as (4) in the morphing model.Next, the servers in Fig. 3 are repositioned behind the bucket (with respect to the directionof traffic flow). Consequently, the bucket now collects all incoming calls since there can be norejected calls. This is the first significant differece from Figs. 2 and 3. In this configuration, thebucket would accumulate calls indefinitely, due to the fact that none are being serviced, and thestate-space would therefore become infinite. λ (1 −Β ) λ mBucket Βλλ
Bucket m
Figure (4) Next, the servers in Fig. 3.are repositioned behind the bucket.The bucket now collects all incoming calls, not just rejected calls, but noneare being serviced. A.K. Erlang called them “waiting arrangements” rather than a queue. Callers would presumably wait on theline for the operator to finally connect their call manually instead of hanging up.
7o avoid the “overflow” problem in Fig. 4, the bucket has a hole drilled into its base suchthat calls can be serviced from it in FIFO order. This is the second significant change. Itcorresponds to the
Erlang C function in terms of how it relates to the Erlang B function.Moreover, new arrivals are appended to the ordered list of calls already in the bucket, whichis equivalent to joining the tail of a waiting line. With servicing restored, the mean number ofrequests in the bucket reaches steady-state equilibrium and the number of waiting calls becomesbounded. That number, in turn, determines the mean waiting time W m in the queue of Fig. 5. λ m λ Figure (5) To service the collected calls, the bucket has a hole drilled intoits base such that calls are serviced in FIFO order. New arrivals are appendedto the ordered list of calls already in the bucket. The traffic intensity is nowbounded above by a = m . Remark 3 (Utilization) . There is a constraint on the per-server utilization ρ in both an M/M/m/mqueue and an M/M/m queue. Since the effective arrival rate at the M/M/m/m servers, due lostcalls in Fig. 2, is only (1 − B ) λ , the per-server utilization is ρ = (1 − B ) am < (15) and only approaches 100% busy at large traffic intensities. With the leaking bucket in place(Fig. 5), B = 0 so, the per-server utilization becomes ρ = am < (16) which means that a < m , in order to maintain queue stability. The difference between (15) and (16) is shown in Fig. 6. Arriving calls in Figs. 2 and 3 arePoisson distributed, and that introduces a tendency toward longer inter-arrival periods, relativeto the mean S . On the other hand, an available M/M/m server instantaneously retrieves the nextcall from the head of the waiting line (the hole in the bucket of Fig. 5) and thus, it saturatesmore rapidly. The progression from Fig. 2 to Fig. 5 essentially extends the queueing states from a finite state-space in M/M/m/m to an infinite state-space in M/M/m. We need to include the waiting calls ofFig. 5 into the transformation function of (14). We know from both M/M/1 and the morphingmodel that unbounded waiting states are generally identified with the infinite geometric series − ρ = 1 + ρ + ρ + ρ + . . . (17)8 ��� �� a ρ a / m ( - B ) a / m Figure (6) The per-server utilization in M/M/m/m only approaches 100%busy as the traffic intensity a becomes very large. M/M/m per-server utiliza-tion saturates more rapidly.familiar in many queue-theoretic formulæ.Equation (17) provides a clue as to how we might define Φ B , starting with B in Fig. 2 but,also including those waiting states. However, we cannot define Φ B in the same way as (17)because Erlang C in (11) is a probability function that satisfies the following conditions:1. C ( m, a ) ∈ [0 , for m = 1 , , , . . . and a ≥ .2. C ( m = 1 , a ) is linear-rising in Fig. 7b, as expected for M/M/1. For a > , Erlang C isconstant, i.e., C (1 , a ) = 1 , since the server remains saturated at 100% busy. Of course,in this region, an M/M/1 queue becomes unstable.3. More generally, C ( m, a ) is convex up to a = m .4. In the low traffic limit a → , we assume C (cid:39) B (cf. Fig. 7a), and similarly for ourtranformation function, Φ B (cid:39) B .5. In the heavy traffic limit a → m , we know C → , which suggests Φ B → B/B .These considerations lead to the following anzatz for Φ B : Φ B = B ( m, ρ )1 − [1 − B ( m, ρ )] ρ (18)Example expressions of (18) are shown in Table 2.To further substantiate the choice of (18), we consider the light and heavy traffic limits W m = (cid:26) as ρ = (cid:15) (very light traffic) R as ρ = 1 − (cid:15) (very heavy traffic) (19)where (cid:15) is a vanishingly small quantity. 9 ��� �� a B ( m,a ) m = = = = = (a) Blocked call probability, B(m,a) ���� �� a C ( m,a ) m = = = = = (b) Probability that call waits, C(m,a) Figure (7) Erlang B and C curves as functions of the traffic intensity a = λS . Under very low load, ρ = (cid:15) , the waiting time (14) becomes W m = S/m − (cid:15) (cid:20) B ( m, (cid:15) )1 − [1 − B ( m, (cid:15) )] (cid:15) (cid:21) Since B ( m, ρ ) (cid:39) when ρ = (cid:15) , W m vanishes. Substituting into (1), the residence time is R m = S , which also corresponds to Fig. 2 in the low-traffic limit. Under very high load, ρ = 1 − (cid:15) , and (14) becomes W m = S/m − (1 − (cid:15) ) (cid:20) B ( m, − (cid:15) )1 − [1 − B ( m, − (cid:15) )](1 − (cid:15) ) (cid:21) (20)From Fig. 7a, we see B ( m, ρ ) (cid:28) B ( m, a ) and thus, for a given value of ρ and m , B ( m, ρ ) can be replaced by a constant δ < . Applying this to (20) produces W m = Sm(cid:15) (cid:20) δ − [1 − δ ](1 − (cid:15) ) (cid:21) = Sm(cid:15) (cid:20) δ − (1 − δ − (cid:15) − δ(cid:15) ) (cid:21) = Sm(cid:15) (21)where we have invoked the additional reasonable assumption δ (cid:29) (cid:15) . Finally, (21) becomes W m = Sm (1 − ρ ) = R which is identical to (13), viz., an m -speed M/M/1 server: a result that is also in agreement withthe morphing model of Section 3. As expected, it also corresponds to (10) under heavy trafficsince Erlang C reaches probability one as ρ approaches 100% busy.10able (2) Examples of Φ B ( m, a ) with a = mρ . m B ( m , a ) [ − ( − B ( m , a )) ρ ] − Φ B ( m , a )1 ρ ρ ρ ρ ρ ρ (1+ ρ ) 1+2 ρ +2 ρ ρ ρ ρ ρ ρ (2+3 ρ (1+ ρ )) 2+6 ρ +9 ρ +9 ρ ρ (4+3 ρ ) 9 ρ ρ (4+3 ρ ) ρ ρ (3+2 ρ (3+4 ρ (1+ ρ ))) 3+4 ρ (3+2 ρ (3+4 ρ (1+ ρ )))3+ ρ (9+4 ρ (3+2 ρ )) 32 ρ ρ (9+4 ρ (3+2 ρ )) ρ ρ (24+5 ρ (12+5 ρ (4+5 ρ (1+ ρ )))) 24+5 ρ (24+5 ρ (12+5 ρ (4+5 ρ (1+ ρ ))))24+ ρ (96+5 ρ (36+5 ρ (8+5 ρ ))) 625 ρ ρ (96+5 ρ (36+5 ρ (8+5 ρ ))) ρ ρ (5+3 ρ (5+ ρ (10+3 ρ (5+6 ρ (1+ ρ ))))) 5+6 ρ (5+3 ρ (5+ ρ (10+3 ρ (5+6 ρ (1+ ρ )))))5+ ρ (25+6 ρ (10+3 ρ (5+ ρ (5+3 ρ )))) 324 ρ ρ (25+6 ρ (10+3 ρ (5+ ρ (5+3 ρ )))) Our purpose here has been to offer a more intuitive derivaton of M/M/m queueing metrics, notto promote (18) as a computational device. Computing (18) is equivalent to computing (11).However, if one should want to use Φ B ( m, a ) for calculations or other instruction, then it isclear that B ( m, a ) has to be evaluated first.Rather than using (12) which, to paraphrase Arnold Allen: is hardly more “natural” than(11), Erlang B can more easily be computed using the iterative algorithm [16] in listing 1.Listing (1) R code to compute the Erlang B function erlangB < − function (m, a) { eB < − a / (1 + a) if (m == 1) { return (eB) } for (k in 2:m) { eB < − eB ∗ a / (a ∗ eB + k) } return (eB) } If R, or similar statistical software, is already being employed, one can make direct useof the Poisson PMF (probability mass function) and CDF (cumulative distribution function) tosimplify the code in listing 2.Listing (2) Compute Erlang B from the Poisson PMF and CDF erlangB < − function (m, a) { return ( dpois (m, a) / ppois (m, a )) } Example calculations computed in this way are summarized in Table 3.11able (3) Example M/M/m metrics with mean service time S = 1 [5] m a B ( m, a ) Poisson Φ B C ( m, a ) W m R m R m ( φ ) The goal of algebraically deriving the exact residence time for an M/M/m queue—motivatedby the same approach to M/M/1—has finally been achieved here. Several subtle observationsare needed to enable this result: (a) focus on the waiting time W m , rather than the residencetime R m , (b) make M/M/m/m the starting point (rather than parallel M/M/1 queues), (c) thediagrams in Figs. 2–5 aid development of the ansatz Φ B , (d) Φ B must conform to a probabilityfunction, and (e) equation (18) modifies the fast residence time R , not R These observationsalso facilitated reprising the morphing model derivation to verify our ansatz.Equation (18) can also be derived formally from (11) and (12) but, their respective startingpoints rely on conventional applied probability theory methods, which it has been our objectiveto avoid. Indeed, the same expression for the Erlang C function is known in the literature [7, 8],especially for the purpose of programmatic computation. [1] E. D. Lazowska, J. Zahorjan, G. S. Graham, K. C. Sevcik,
Quantitative System Per-formance: Computer System Analysis Using Queueing Network Models , Prentice-Hall(1984)[2] N.J. Gunther,
The Practical Performance Analyst , McGraw-Hill (1998)[3] N.J. Gunther,
Analyzing Computer System Performance with Perl::PDQ , Springer (2005)[4] N.J. Gunther, “Morphing M/M/m: A New View of An Old Queue,” IFORS 21st Conf.Intl. Federation of Op. Research Soc., July 17–21, Quebec City, Canada (2017)[5] A. Erlang, “Solution of Some Problems in the Theory of Probabilities of Significance inAutomatic Telephone Exchanges,”
Electroteknikeren , v. 13, p. 5 (1917)[6] L. Kleinrock,
Queueing Systems: Vol. I , Wiley (1975)[7] A. Allen,
Probability, Statistics and Queueing Theory , Academic Press (1990)[8] T. G. Robertazzi,
Computer Networks and Systems: Queueing Theory and PerformanceEvaluation , 3rd edition, Springer (2000) 129] D. Bertsekas and R. Gallager,
Data Networks , Prentic-Hall (1987)[10] D. Gross and C.M. Harris,
Fundamentals of Queueing Theory , 3rd edition, Wiley (1998)[11] N.J. Gunther,
Guerrilla Capacity Planning: A Tactical Approach to Planning for HighlyScalable Applications and Services , Springer (2007)[12] N.J. Gunther, Guerrilla Training Classes, Performance Dynamics Educational Services, [13] H. Sakasegawa “An Approximation Formula L q (cid:39) α · ρ β / (1 − ρ ))