Product Forms for FCFS Queueing Models with Arbitrary Server-Job Compatibilities: An Overview
Kristen Gardner and Rhonda Righter
June 2020
Abstract
In recent years a number of models involving different compatibilities between jobs and servers in queueing systems, or between agents and resources in matching systems, have been studied, and, under Markov assumptions and appropriate stability conditions, the stationary distributions have been shown to have product forms. We survey these results and show how, under an appropriate detailed description of the state, many are corollaries of similar results for the Order Independent Queue. We also discuss how to use the product form results to determine distributions for steady-state response times.
Systems in which servers are flexible in the types of customers that they can serve, and customers are flexible in the servers at which they can be processed, are very common in a wide range of practical settings. In call centers, service representatives may be trained to handle different subsets of requests, or may speak different languages. A customer who speaks only Spanish can be helped by a representative who speaks only Spanish, or by a representative who speaks both Spanish and English, or by a representative who speaks both Spanish and Mandarin. In computer systems, some jobs may be able to run only on those servers that have the job’s data stored locally, other jobs may require a server with a particular combination of resources, and still other jobs may be able to run on any server. In ride-sharing systems, drivers will only be assigned to users that are “nearby” in some sense.

This type of model is called a skill-based server model in the call center literature. In the scheduling literature, the compatibility constraints between job classes and servers are called eligibility constraints or processing set restrictions, and the models are typically deterministic. For matching models, compatibilities may be location based. While the language and notation used to describe these models differ across research communities, the common idea in all of the above examples is that the system consists of multiple servers and multiple classes of jobs, with a bipartite graph structure indicating which classes of jobs can be served by which servers.

The examples above, and more broadly the “flexible job/server” models that exist in the literature, vary in precisely how the bipartite matching structure is used to assign servers to jobs. We mainly consider two service models, which we call the “collaborative” and “noncollaborative” models. In the collaborative model, multiple servers can work together, with additive service rate, to process a single job.
This matches the computer systems setting, in which the same (replicated) job can run on several different servers at once. In the noncollaborative model, a customer can only enter service at a single server. This matches the structure of a call center, in which a single customer cannot speak with multiple representatives at the same time. In both cases, we think of there being a single central queue for all customers. When a server becomes available, it begins working on the next compatible job in the queue, in first-come first-served (FCFS) order. In the noncollaborative case, we must also specify which server will serve an arriving job that finds multiple idle compatible servers. We will consider two policies: Assign Longest Idle Server (ALIS), which is analogous to FCFS, and Random Assignment to Idle Servers (RAIS).

An additional feature of many models of service systems with job/server compatibilities is redundancy, or job replication, i.e., the possibility of sending multiple copies of the same job to multiple servers. For example, this is a common practice in computer systems to combat unpredictable system variability; the hope is that the job may experience a significantly shorter response time at one of the servers. Similarly, one idea for reducing wait times on organ transplant waitlists is to allow patients to join the waitlist in multiple geographic areas at the same time. Patients are restricted in which waitlists they can join based on travel time: should an organ become available at a particular hospital, the patient must be able to travel to that hospital within a relatively short time frame to receive the transplant. Generally systems with redundancy are not modeled as a central FCFS queue as described above. Instead, each server has its own dedicated queue and an arriving job can join the queues of multiple servers.
In the collaborative case, multiple copies of the same job can run on different servers at the same time, and when the first copy completes service all other copies are removed immediately from other servers or queues. This is called cancel-on-completion or late cancellation. In the noncollaborative case, all other copies of a job are removed from the system as soon as the first copy enters service. This is called cancel-on-start or early cancellation, and it is also equivalent to sending a single copy to the queue with the least work. In both cases, the cancellations occur without penalty.

While the central FCFS queue and the job redundancy model describe very different system dynamics, the two views turn out to be sample-path equivalent, provided that service times are exponentially distributed and i.i.d. across jobs and servers. We will explore this relationship, as well as other model equivalences, in what follows.

Throughout most of this paper, we will make a few key assumptions: that jobs of each class arrive according to independent Poisson processes, that service times are exponentially distributed and i.i.d. across jobs and servers, and that the scheduling discipline is FCFS. Under these assumptions, we will see that the models introduced above, as well as related models, exhibit product-form stationary distributions. Indeed, product forms hold for several different state descriptors, each of which provides different advantages in understanding system behavior. We first consider the most detailed state descriptor, which tracks the classes of all jobs in the system. This description lends itself to a concise proof of the product form for the collaborative model, due to the Order Independence (OI) results of Berezner and Krzesinski [14], [34]. We show that for the noncollaborative model, both the job queue and the idle server queue are OI queues, resulting in a product of product forms. We extend these arguments to collaborative and noncollaborative models with abandonments.
We also show that the same product-form stationary distribution holds for several related models, including new results for two-sided matching models with arrivals of both jobs and servers, and for make-to-stock inventory models with back ordering.

Following the development of the product-form stationary distributions, we turn to using these results to derive system performance metrics. We begin with class-based response time distributions. We show that if there is a job class that is compatible with all servers, that class has an exponentially distributed response time for the collaborative model; indeed, the response time for that class is the same as it would be for the M/M/1 queue in which all jobs are fully flexible. For the noncollaborative model, the queueing time for that fully flexible class is a mixture of a mass at 0 and an exponential random variable. We use this result to show response time distributions for all job classes in the collaborative model, and queueing time distributions for all classes in the noncollaborative model, when the compatibility matching has a nested structure.

Product-form distributions for an alternative, partially aggregated state descriptor have been derived in the literature; we show that these results also follow as corollaries to the detailed product forms. The partially aggregated state description allows us to derive per-class response time distributions, conditioned on the set of busy servers and the order of the jobs they are currently serving.

We briefly discuss a related queueing model in which the state description is the number of jobs of each class in the system (per-class aggregation). While this state space no longer yields a Markovian description of the system evolution, it has the same steady-state per-class mean performance measures (mean number in system, probability the system is empty) as the collaborative system.
This state descriptor yields a simple, recursive approach to derive the system load and mean response time for our models when they are not nested.

We note that the product forms discussed in this paper are not the same as those obtained in the well-known Jackson and Kelly networks [31, 32]. The standard Jackson and Kelly product forms arise in networks of queues, where the state of the network can be expressed as a product of the states at each queue. In contrast, in this paper we primarily concentrate on the internal product-form structure of the steady-state distribution for a single queue. While most of our focus is on single nodes with flexible jobs and servers, these nodes are quasi-reversible under our modeling assumptions, so a network of such nodes also has a product form stationary distribution. That is, in steady state the distributions of each node will be as if they were operating independently, as is the case in Jackson and Kelly networks.

Throughout the paper we provide pointers to the relevant literature in context.

Figure 1: The system consists of $J$ classes of jobs, $M$ servers, and a bipartite matching structure indicating which job classes can be served by which servers.

Figure 2: The system can be viewed, equivalently, as (a) having a single FCFS queue, or as being a distributed system in (b) the collaborative model or (c) the noncollaborative model. Panels: (a) Central FCFS queue, (b) Collaborative model, (c) Noncollaborative model.

We note at the outset that our analysis requires a heavy dose of notation that we will often reuse and abuse in the interest of readability and ease of understanding. Notation that we use throughout the paper is summarized in Table 1.

There are $J$ job classes with Poisson arrivals at rates $\lambda_i$, $M$ parallel servers with exponential service rates $\mu_m$, and a bipartite graph matching structure indicating which servers can serve which job classes (see Figure 1).
For job class $i$, let $S_i = \{ j : \text{server } j \text{ can serve class } i \}$, and for a subset of job classes, $A$, let $S(A) = \bigcup_{i \in A} S_i$ be the set of servers that can serve those classes. For example, for the system shown in Figure 1, S = { , } and for A = { , } , S ( A ) = { , , } . For server $j$, let $C_j = \{ i : \text{server } j \text{ can serve class } i \}$ be the set of job classes it can serve, and for a subset of servers, $B$, let $C(B) = \bigcup_{j \in B} C_j$ be the set of job classes that can be served by servers in $B$. For example, in Figure 1, C = { , , J } and for B = { , } , C ( B ) = { , , , J } . For a subset of job classes, $A$, let $\mu(A) = \sum_{m \in S(A)} \mu_m$ and $\lambda(A) = \sum_{i \in A} \lambda_i$ be the total service rate and arrival rate for job classes in $A$, and, abusing notation, for a subset of servers, $B$, let $\mu(B) = \sum_{m \in B} \mu_m$ and $\lambda(B) = \sum_{i \in C(B)} \lambda_i$ be the total service rate and arrival rate for servers in $B$. It will be clear from the context whether the arguments of $\lambda$ and $\mu$ are job classes or servers. Finally, let $\mu = \sum_{m=1}^{M} \mu_m$ and $\lambda = \sum_{i=1}^{J} \lambda_i$ be the total system service rate and total system arrival rate respectively.

Throughout, we will assume for stability that $\lambda(A) < \mu(A)$ for all subsets of job classes $A$. We note that this condition is both necessary and sufficient for stability in the model described above (in particular, given i.i.d. exponential service times); absent this modeling assumption stability is a much more complicated question (see Section 7 for a more detailed discussion).

We primarily consider two models of service: the collaborative model and the noncollaborative model. In the noncollaborative model, a job can only be served by a single server. When the first copy of a job enters service, all other copies are removed from the system immediately without penalty.
A job that arrives to the system and finds multiple idle compatible servers begins service on one of those servers, chosen according to some assignment rule. We consider two assignment rules. Under Assign Longest Idle Server (ALIS), the arriving job begins service on the compatible server that has been idle for the longest time. Under Random Assignment to Idle Servers (RAIS), the arriving job chooses an idle server randomly; this selection must be drawn from a particular distribution, which we discuss in more detail in Section 3.3.

In the collaborative model, a job may be in service at multiple servers at the same time. When the first copy of a job completes service, all other copies are removed from the system immediately without penalty. A job that is in service at a set of servers $B$ receives combined service rate $\mu(B)$, hence the job experiences an exponential service time with rate $\mu(B)$. Unlike in the noncollaborative case, no assignment rule is needed for an arriving job that finds multiple idle compatible servers; such a job simply enters service on all of the idle compatible servers. Note that in the collaborative case (but not in the noncollaborative case), we can assume without loss of generality that the set of job classes a server can serve is unique to that server, i.e., $C_i \neq C_j$ for $i \neq j$. This follows because of the FCFS and collaborative assumptions; if $C_i = C_j$ for $i \neq j$, then servers $i$ and $j$ will always be serving the same (oldest compatible) job, so they can be considered to be a single server with rate $\mu_i + \mu_j$. In the noncollaborative model we allow multiple servers that are identical in their service rates and their sets of compatible job classes.

There are two equivalent ways of viewing the system dynamics. In the first, shown in Figure 2(a), all arriving jobs join a single FCFS queue. When a server $j$ becomes available, it begins working on the first job in the central queue that has class $i \in C_j$.
In the collaborative model, the “queue” contains all jobs in the system, including those currently in service, so that a newly available server may begin working on a job that is already in service at some other server. In the noncollaborative model, the queue contains only those jobs that are not in service. The second system view is that of a distributed system, in which each server has its own queue and works on the jobs in its queue in FCFS order. Here, an arriving job of class $i$ joins the queue at all servers in $S_i$. In the collaborative model (Figure 2(b)), multiple copies of the job may be in service at different servers. For example, in Figure 2(b) the class-1 job shown at the head of the queue at both servers 1 and 2 is in service at both servers. In the noncollaborative model (Figure 2(c)), only one copy of a job can be in service. In our example, the class-1 job shown at the head of the queue at server 1 is in service at server 1, and its other copy has been removed from the queue at server 2. Another equivalent model in the noncollaborative case is to assume, again, that each server has its own dedicated queue, and that each arriving job in class $i$ is routed to the server in $S_i$ with the least work (i.e., Join-the-Shortest-Work among compatible servers) [10, 11].

Throughout the remainder of this paper, we will rely primarily on the central-queue view of the system when developing our state descriptors. We introduce here the notation used in the state descriptors. This notation captures a great deal of information about the system, and each state descriptor uses a slightly different subset of this information to capture different aspects of the system dynamics. We elaborate further on the specific state descriptors in the sections that follow.

Let $\vec{c}_n = (c_1, \ldots, c_n)$ denote the classes of all jobs in the central queue, where $c_i$ is the class of the $i$th job in the queue in order of arrival (so $c_1$ is the class of the oldest job, and $c_n$ is the class of the most recent arrival). As noted above, for the collaborative model the “queue” refers to all jobs in the system, including those in service, whereas for the noncollaborative model the “queue” refers to only those jobs that are not in service. Let $\vec{b}_l = (b_1, \ldots, b_l)$ be the vector of busy servers in the arrival order of the jobs that they are serving (so $b_1$ is serving the oldest job in the system, and $b_l$ is serving the most recent arrival among the jobs in service). We use $\vec{z}_m$ to denote, in the noncollaborative model, an interleaving of $\vec{c}_n$ and $\vec{b}_l$ ordered by job arrival time, where the state tracks the job class for positions corresponding to jobs in the queue, and it tracks the busy server for positions corresponding to jobs in service. Let $\vec{s}_k = (s_1, \ldots, s_k)$ be a vector of idle servers in the order in which they became idle, where $l + k = M$. We use $\vec{d}_l = (d_1, \ldots, d_l)$ to denote the classes of the jobs currently in service, where $d_i$ is the class of the job in service at server $b_i$. The vector $\vec{n}_l = (n_1, \ldots, n_l)$ denotes the number of jobs waiting to be served “between” the jobs in service. That is, $n_i$ gives the number of jobs that arrived after the job in service at server $b_i$ and before the job in service at server $b_{i+1}$. Finally, $\vec{x}_J = (x_1, \ldots, x_J)$ denotes the number of jobs of each class in the system; $x_i$ is the number of class-$i$ jobs in the system.

Notation                      Definition
System notation:
$J$                           Number of job classes
$M$                           Number of servers
$\lambda_i$                   Arrival rate of class-$i$ jobs
$\lambda = \sum_i \lambda_i$  Total system arrival rate
$\mu_j$                       Service rate at server $j$
$\mu = \sum_j \mu_j$          Total system service rate
$S_i$                         The set of servers that can serve class-$i$ jobs
$S(A)$                        The set of servers that can serve any job class in subset $A$
$C_j$                         The set of job classes that can be served by server $j$
$C(B)$                        The set of job classes that can be served by any server in subset $B$
State descriptor notation:
$\vec{c}_n$                   Classes of all jobs in the queue, in order of arrival
$\vec{b}_l$                   Busy servers, in the order in which they became busy
$\vec{s}_k$                   Idle servers, in the order in which they became idle
$\vec{d}_l$                   Classes of jobs in service, in the order in which they entered service
$\vec{n}_l$                   Number of jobs not in service in between consecutive jobs in service
$\vec{x}_J$                   Number of jobs of each class in the system
$\vec{z}_m$                   Interleaving of jobs in the queue and busy servers, in order of job arrival times

Table 1: Summary of notation. Top section: system notation. Bottom section: notation used in state descriptors.
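The set-valued notation and the stability condition $\lambda(A) < \mu(A)$ can be made concrete with a small sketch. The three-class, three-server system below, its rates, and all function names are hypothetical choices for illustration; they are not the system of Figure 1.

```python
from itertools import combinations

# Hypothetical example data (an assumption, not the system of Figure 1).
S = {1: {1, 2}, 2: {2, 3}, 3: {1, 2, 3}}   # S[i]: servers compatible with class i
lam = {1: 0.5, 2: 0.7, 3: 0.4}             # arrival rates lambda_i
mu = {1: 1.0, 2: 1.0, 3: 1.0}              # service rates mu_m


def servers_for(A):
    """S(A): servers that can serve at least one job class in A."""
    out = set()
    for i in A:
        out |= S[i]
    return out


def classes_for(B):
    """C(B): job classes that can be served by at least one server in B."""
    return {i for i in S if S[i] & set(B)}


def mu_of_classes(A):
    """mu(A): total service rate of the servers in S(A)."""
    return sum(mu[m] for m in servers_for(A))


def lam_of_classes(A):
    """lambda(A): total arrival rate of the classes in A."""
    return sum(lam[i] for i in A)


def is_stable():
    """Check lambda(A) < mu(A) for every nonempty subset A of job classes."""
    classes = sorted(S)
    for r in range(1, len(classes) + 1):
        for A in combinations(classes, r):
            if not lam_of_classes(A) < mu_of_classes(A):
                return False
    return True
```

The check enumerates all $2^J - 1$ nonempty subsets, so it is only practical for a small number of job classes.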
We first consider the most complete descriptions of the state: the detailed state descriptor tracks the classes of all jobs in the order of their arrival, denoted by $\vec{c}_n$. In the noncollaborative case, the two assignment rules that we consider (ALIS and RAIS) also require us to track some information about the servers. Under ALIS (Section 3.2), the state descriptor includes the vector $\vec{s}_k$, which tracks all idle servers in the order in which they became idle. Under RAIS (Section 3.3), the state descriptor is $\vec{z}_m$, which is an interleaving of $\vec{c}_n$ and $\vec{b}_l$, where $\vec{b}_l$ tracks all busy servers ordered by the arrival times of the jobs they are serving.

For both the collaborative and noncollaborative (ALIS and RAIS) models, we show that the stationary distribution for the above state descriptor exhibits a product form. We begin with the collaborative case, which is a special case of what are known as “Order Independent” (OI) queues, so named because the total service rate given to all jobs in the queue depends only on their classes, not on their order.

At the end of the section, we discuss related models that also have product-form stationary distributions.

The system state is $\vec{c}_n = (c_1, \ldots, c_n)$, where $c_i$ is the class of the $i$'th job in the system in order of arrival, including both jobs that are in the queue and jobs that are in service (possibly at more than one server). The subscript $n$ can take on the values $0, 1, \ldots$; we will generally leave this implicit. Let $\mathcal{C}$ be the set of all such states. Abusing notation, let $S(\vec{c}_n) = S(\{c_1, \ldots, c_n\})$ be the set of servers that can serve at least one of the jobs in the queue, and let

$$\mu(\vec{c}_n) := \mu(\{c_1, \ldots, c_n\}) = \sum_{m \in S(\vec{c}_n)} \mu_m \tag{1}$$

be the total rate of service to jobs in the queue.
Also, define $\Delta_j(\vec{c}_n)$ as the (marginal) rate of service given to the $j$'th job in the queue, so $\sum_{j=1}^{n} \Delta_j(\vec{c}_n) = \mu(\vec{c}_n)$, and

$$\Delta_j(\vec{c}_n) = \sum_{m \in S(\vec{c}_j) \setminus S(\vec{c}_{j-1})} \mu_m = \sum_{m \in S(\vec{c}_j)} \mu_m - \sum_{m \in S(\vec{c}_{j-1})} \mu_m = \mu(\vec{c}_j) - \mu(\vec{c}_{j-1}).$$

Figure 3: An example state in the collaborative model, in which the first entry of $\vec{c}_n$ is 1 (the remaining entries and the rate subscripts are missing here). The class-2 job immediately behind the job at the head of the queue is in service at server 3, and the class-4 job is in service at server 4; two jobs receive service rate 0. The total rate of service given to all jobs is $\mu(\vec{c}_n)$, a sum of four server rates.

Note that, for the collaborative model, the total service rate $\mu(\vec{c}_n)$ is independent of the order of the jobs in the queue, and the service rate allocated to the $j$'th job doesn’t depend on the jobs (if any) after job $j$ in the queue. That is, our collaborative model is a special case of an Order Independent queue, defined as follows.
Definition 3.1. A queue is said to be Order Independent (OI) if it satisfies the following properties for all $\vec{c}_n$:
(i) $\Delta_j(\vec{c}_n) = \Delta_j(\vec{c}_j)$ for $j \le n$,
(ii) $\mu(\vec{c}_n)$ is the same for any permutation of $c_1, \ldots, c_n$,
(iii) $\mu(c) > 0$ for any class $c$.

Properties (i)-(iii) are essentially the same as those defined by Krzesinski [34], though Krzesinski’s definition generalizes property (i) to also allow for an extra multiplicative service rate factor based on the number in queue. Our collaborative model can be generalized in this way to have speed scaling, i.e., a total service capacity of $\gamma(n)$ when there are $n$ jobs in the system. Under this generalization, $\mu_m$ would be interpreted as the proportion of the total capacity used by server $m$, for $m \in S(\vec{c}_n)$. The addition of a speed-scaling factor is straightforward, but complicates the notation, so we do not include it here. Similarly, it is straightforward to include an arrival scaling (or rejection) factor, so that arrivals of class $c$ occur according to a Poisson process with rate $r(n) \lambda_c$ when the number in queue is $n$, but, again, we do not include it for ease of exposition.

Property (iii) ensures irreducibility of the Markov chain. Property (ii) guarantees that the total rate of transitions out of any state $\vec{c}_n$ depends only on the set of customers in the queue and not on their order. A consequence of (i) and (ii) is that $\Delta_j(\vec{c}_j)$ does not depend on the order of the first $j - 1$ jobs.

Definition 3.2.
A queue is called quasi-reversible if its state at time $t$ is independent of
• arrival times after time $t$
• departure times before time $t$.

An equivalent definition is that the stationary distribution for the queue satisfies partial balance, i.e., for any state and any class $c$, the steady-state rate out of the state due to a class-$c$ arrival equals the steady-state rate into the state due to a class-$c$ departure, and the rate out of the state due to a departure equals the rate in due to an arrival. Theorem 3.3 shows that the OI properties are sufficient for partial balance for the product-form distribution, and therefore for quasi-reversibility of the system.

Theorem 3.3. (Berezner, Kriel, Krzesinski [13], Krzesinski [34]) For any OI queue, including the collaborative model, the system is quasi-reversible and the stationary distribution is given by

$$\pi^C(\vec{c}_n) = \pi^C(\emptyset) \prod_{i=1}^{n} \frac{\lambda_{c_i}}{\mu(\vec{c}_i)} = \frac{\lambda_{c_n}}{\mu(\vec{c}_n)} \pi^C(\vec{c}_{n-1}) \tag{2}$$

as long as $G := \sum_{n, \vec{c}_n \in \mathcal{C}} \prod_{i=1}^{n} \frac{\lambda_{c_i}}{\mu(\vec{c}_i)} < \infty$. Then $\pi^C(\emptyset) = 1/G$ is the probability the system is empty.

Proof. We will show that the product form of equation (2) satisfies partial balance. First note that equation (2) immediately satisfies the condition that the rate out of any state $\vec{c}_n$ due to a departure equals the rate into the state due to an arrival: $\mu(\vec{c}_n) \pi^C(\vec{c}_n) = \lambda_{c_n} \pi^C(\vec{c}_{n-1})$. Now we show that under the product-form probabilities (2), the rate out of any state $\vec{c}_n$ due to a class-$c$ arrival equals the rate into the state due to a class-$c$ departure, $\forall c$:

$$\begin{aligned}
\pi^C(\vec{c}_n) \lambda_c &= \sum_{j=0}^{n} \pi^C(c_1, \ldots, c_j, c, c_{j+1}, \ldots, c_n) \, \Delta_{j+1}(c_1, \ldots, c_j, c, c_{j+1}, \ldots, c_n) \\
&= \sum_{j=0}^{n} \pi^C(c_1, \ldots, c_j, c, c_{j+1}, \ldots, c_n) \, \Delta_{j+1}(\vec{c}_j, c) \qquad \text{(Property (i))} \\
&= \frac{\lambda_{c_n}}{\mu(\vec{c}_n, c)} \sum_{j=0}^{n-1} \pi^C(c_1, \ldots, c_j, c, c_{j+1}, \ldots, c_{n-1}) \, \Delta_{j+1}(\vec{c}_j, c) + \frac{\lambda_c}{\mu(\vec{c}_n, c)} \pi^C(\vec{c}_n) \, \Delta_{n+1}(\vec{c}_n, c) \qquad \text{((2) and Property (ii))}.
\end{aligned}$$

We will show this by induction on $n$. For $n = 0$, $\pi^C(\emptyset) \lambda_c = \pi^C(c) \mu(c)$ is immediate, given property (iii). Assume partial balance holds for the product-form probabilities (2) for any $\vec{c}_{n-1}$, i.e.,

$$\pi^C(\vec{c}_{n-1}) \lambda_c = \sum_{j=0}^{n-1} \pi^C(c_1, \ldots, c_j, c, c_{j+1}, \ldots, c_{n-1}) \, \Delta_{j+1}(\vec{c}_j, c).$$

Then we need to show

$$\pi^C(\vec{c}_n) \lambda_c = \frac{\lambda_{c_n}}{\mu(\vec{c}_n, c)} \sum_{j=0}^{n-1} \pi^C(c_1, \ldots, c_j, c, c_{j+1}, \ldots, c_{n-1}) \, \Delta_{j+1}(\vec{c}_j, c) + \frac{\lambda_c}{\mu(\vec{c}_n, c)} \pi^C(\vec{c}_n) \, \Delta_{n+1}(\vec{c}_n, c).$$

From the induction hypothesis, and the definition of $\Delta_{n+1}(\vec{c}_n, c)$, the right-hand side is:

$$\begin{aligned}
&\frac{\lambda_{c_n}}{\mu(\vec{c}_n, c)} \pi^C(\vec{c}_{n-1}) \lambda_c + \frac{\lambda_c}{\mu(\vec{c}_n, c)} \pi^C(\vec{c}_n) \left[ \mu(\vec{c}_n, c) - \mu(\vec{c}_n) \right] \\
&\quad = \frac{\lambda_{c_n}}{\mu(\vec{c}_n, c)} \pi^C(\vec{c}_{n-1}) \lambda_c + \lambda_c \pi^C(\vec{c}_n) - \frac{\lambda_c}{\mu(\vec{c}_n, c)} \frac{\lambda_{c_n}}{\mu(\vec{c}_n)} \pi^C(\vec{c}_{n-1}) \mu(\vec{c}_n) \\
&\quad = \pi^C(\vec{c}_n) \lambda_c.
\end{aligned}$$

In the collaborative example in Figure 3, recalling that $\mu$ is the total service rate, the stationary probability of the depicted state is the six-factor product given by (2), $\pi^C(\vec{c}_n) = \pi^C(\emptyset) \prod_{i=1}^{6} \lambda_{c_i} / \mu(\vec{c}_i)$, in which the last two denominators satisfy $\mu(\vec{c}_5) = \mu(\vec{c}_6) = \mu$.
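As a numerical companion to Theorem 3.3, the following minimal sketch evaluates the total rate $\mu(\vec{c}_n)$, the marginal rates $\Delta_j(\vec{c}_n)$, and the unnormalized product $\prod_i \lambda_{c_i}/\mu(\vec{c}_i)$ for the collaborative model. The two-class, three-server system and the function names are assumptions made for illustration; the example is not the state of Figure 3.

```python
# Hypothetical example data (an assumption for illustration only).
S = {1: {1, 2}, 2: {2, 3}}      # S[i]: servers compatible with class i
lam = {1: 0.5, 2: 0.5}          # arrival rates lambda_i
mu = {1: 1.0, 2: 1.0, 3: 1.0}   # service rates mu_m


def mu_state(state):
    """mu(c_n): total rate of servers compatible with at least one job."""
    servers = set()
    for c in state:
        servers |= S[c]
    return sum(mu[m] for m in servers)


def marginal_rates(state):
    """Delta_j(c_n) = mu(c_j) - mu(c_{j-1}) for j = 1, ..., n."""
    rates, prev = [], 0.0
    for j in range(1, len(state) + 1):
        cur = mu_state(state[:j])
        rates.append(cur - prev)
        prev = cur
    return rates


def weight(state):
    """Unnormalized product form: prod_i lambda_{c_i} / mu(c_i).

    Multiplying by pi(empty) = 1/G gives the stationary probability in (2).
    """
    w = 1.0
    for i in range(1, len(state) + 1):
        w *= lam[state[i - 1]] / mu_state(state[:i])
    return w


def class_arrival_balance_gap(state, c):
    """Difference between the two sides of the class-c partial balance equation."""
    lhs = weight(state) * lam[c]
    rhs = 0.0
    for j in range(len(state) + 1):
        bigger = state[:j] + (c,) + state[j:]   # insert class c at position j+1
        rhs += weight(bigger) * marginal_rates(bigger)[j]
    return lhs - rhs
```

For the state `(1, 1, 2)` the second class-1 job receives no service because servers 1 and 2 are already working on the first one, while the class-2 job is served by server 3 alone.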
We reiterate that the skill-based collaborative queue is a special case of an OI queue. Other queues that are OI are the (noncollaborative) M/M/K queue with heterogeneous servers, the M/M/$\infty$ queue, the M/M/1 queue under processor sharing, and the Multiserver Station with Concurrent Classes of Customers (MSCCC) queue [34]. The MSCCC queue is a multi-class M/M/K/FCFS queue with the restriction that at most $B_c$ customers of class $c$ can be in service (noncollaboratively) at the same time. The M/M/1/LCFS queue is not an OI queue even though it is a symmetric queue in the sense of Kelly, and is therefore quasi-reversible.

The following corollaries, generalizing the OI queue, follow immediately from Theorem 3.3.

Corollary 3.4.
The departure process from an OI queue is a Poisson process; thus a network of OI queues will have a product-form stationary distribution.
Consider an order independent queue with abandonments, where a job of class $i$ abandons the system after an exponential time with rate $\gamma_i$. This model also fits within the OI framework (i.e., properties (i)-(iii) are satisfied), so again has a product-form stationary distribution.

Corollary 3.5.
In an order independent queue with abandonments,

$$\pi^{CA}(\vec{c}_n) = \pi^{CA}(\emptyset) \prod_{i=1}^{n} \frac{\lambda_{c_i}}{\mu(\vec{c}_i)} = \frac{\lambda_{c_n}}{\mu(\vec{c}_n)} \pi^{CA}(\vec{c}_{n-1}), \tag{3}$$

where

$$\mu(\vec{c}_j) = \sum_{i=1}^{j} \gamma_{c_i} + \sum_{m \in S(\vec{c}_j)} \mu_m \quad \text{and} \quad \Delta_j(\vec{c}_n) = \mu(\vec{c}_j) - \mu(\vec{c}_{j-1}) = \gamma_{c_j} + \sum_{m \in S(\vec{c}_j) \setminus S(\vec{c}_{j-1})} \mu_m.$$

As Berezner and Krzesinski [14] show, the product form result for OI queues also extends easily to OI loss models, where, following their terminology, we use the term loss in the general sense that arriving jobs may be rejected or lost, depending on the current state. For the product-form to continue to hold, the acceptance, or truncated, region must satisfy the truncation property: the job acceptance/rejection decision is also order independent and rejection is more likely with more jobs. In particular, letting $\mathcal{C}_T$ comprise the states $(\vec{c}_n, c)$ in which jobs of class $c$ are accepted when the state just before their arrival is $\vec{c}_n$, we have the following.

Definition 3.6.
A set of states $\mathcal{C}_T$ satisfies the truncation property if: (i) $\vec{c}_n \in \mathcal{C}_T \Rightarrow P(\vec{c}_n) \subseteq \mathcal{C}_T$, where $P(\vec{c}_n)$ denotes the set of permutations of $\vec{c}_n$, and (ii) $\vec{c}_n \in \mathcal{C}_T \Rightarrow \vec{c}_{n-1} \in \mathcal{C}_T$. That is, using part (i) of the truncation property, if a job would be accepted with a given set of jobs in the queue, it will still be accepted if any job is removed from that set.

Letting $\vec{x} = (x_1, \ldots, x_J)$ be the per-class aggregated state for $\vec{c}_n$ (which is sufficient for the acceptance/rejection decision because of its OI property), the truncation property means the acceptance region for $\vec{x}$ is coordinately convex. That is, the rejection decision is a threshold decision, such that arrivals of type $c$ are rejected if $x_c > t(x_1, \ldots, x_{c-1}, x_{c+1}, \ldots, x_J)$ for some function $t$. Simple examples include having an upper bound on the total number of jobs, or having upper bounds on the number in each job class.

The product form, now for $\vec{c}_n \in \mathcal{C}_T$, is exactly the same, except for the normalizing constant. In other words, the stationary probability of being in a state in $\mathcal{C}_T$ for the loss model is the same as the conditional probability of being in that state in the model without losses, given that the state is in $\mathcal{C}_T$. Let $\vec{C} \in \mathcal{C}$ be the random variable representing the state of the original collaborative system, with no rejections, in steady state, i.e., $\vec{C} \sim \pi^C$. Let $\vec{C}_T \in \mathcal{C}_T$ and $\pi^{CT}$ be similarly defined for the model with rejections.

Corollary 3.7.
For an OI queue with job rejection, if the acceptance region C T satisfies the truncationproperty, then P { (cid:126)C T = (cid:126)c n } = P { (cid:126)C = (cid:126)c n | (cid:126)C ∈ C T } = π CT ( (cid:126)c n ) = π CT ( ∅ ) n (cid:89) i =1 λ c i µ ( (cid:126)c i ) = λ c n µ ( (cid:126)c n ) π C ( (cid:126)c n − ) for (cid:126)c n ∈ C T ,where π CT ( ∅ ) = π C ( ∅ ) /P { (cid:126)C ∈ C T } . roof. To see that π T has the given product form, note that for states and transitions to states in C T thesame partial balance equations hold as for the original OI queue, and for transitions where some of the statesare not in C T , the partial balance equations are easily seen to reduce to 0 = 0 because of the truncationproperty. For example, if (cid:126)c n ∈ C T , but ( (cid:126)c n , c ) / ∈ C T , then the rate out of (cid:126)c n due to a class- c arrival is 0, andthe rate into (cid:126)c n due to a class- c departure is also 0, because π CT ( (cid:126)c n , c ) = 0 for all permutations of ( (cid:126)c n , c ).Also, for (cid:126)c n ∈ C T , P { (cid:126)C = (cid:126)c n | (cid:126)C ∈ C T } = π C ( ∅ ) n (cid:81) i =1 λ ci µ ( (cid:126)c i ) (cid:80) j (cid:80) (cid:126)c j ∈C T π C ( (cid:126)c j ) = G n (cid:89) i =1 λ c i µ ( (cid:126)c i )where G = π C ( ∅ ) /P { (cid:126)C ∈ C T } is a normalizing constant, and, because of the form of π CT , G = π CT ( ∅ ).The following special cases will be useful later. Let the subscript − A represent the system where all jobclasses in A are removed. Let the subscript (cid:96) B represent a reduced system in which all the servers in set B are removed, as well as all job classes that are compatible with those servers, i.e., job class i is removedif S i ∩ B (cid:54) = ∅ . Note that if the original system is stable, such subsystems will also be stable. Corollary 3.8. (i)
For all $\vec c_n \in \mathcal{C}_{-A}$,
$$P\{\vec C = \vec c_n \mid \vec C \in \mathcal{C}_{-A}\} = P\{\vec C_{-A} = \vec c_n\} = \pi^C_{-A}(\vec c_n).$$
(ii) For all $\vec c_n \in \mathcal{C}_{\vdash B}$,
$$P\{\vec C = \vec c_n \mid \vec C \in \mathcal{C}_{\vdash B}\} = P\{\vec C_{\vdash B} = \vec c_n\} = \pi^C_{\vdash B}(\vec c_n).$$

We mention here a recent extension of the OI queue by Comte and Dorsman [20], the “pass and swap” queue. In their model there is an undirected graph linking the classes of the OI queue, such that an edge between two classes indicates that they are “swappable.” The service process satisfies the conditions of the OI queue, but now completing (or replaced) jobs replace later, swappable jobs in the queue. A job that completes or is replaced and that finds no later swappable job leaves the system. They show that the same product form steady-state distribution holds for the pass and swap queue.
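Corollary 3.7 can be sanity-checked numerically on a small instance. The following is a minimal sketch, assuming a hypothetical collaborative system with two job classes and two servers, truncated to at most `N_MAX` jobs (a region that satisfies the truncation property). The rates in `LAM` and `MU`, the compatibility sets in `COMPAT`, and all function names are illustrative choices, not taken from the paper. It checks, in exact arithmetic, that the product-form weights satisfy global balance for the truncated chain.

```python
from fractions import Fraction
from itertools import product

# Hypothetical example data (not from the paper).
LAM = {1: Fraction(1), 2: Fraction(2)}   # arrival rate of each job class
MU = {1: Fraction(3), 2: Fraction(2)}    # service rate of each server
COMPAT = {1: {1}, 2: {1, 2}}             # S_c: servers compatible with class c
N_MAX = 3                                # truncation: at most N_MAX jobs in system


def mu(state):
    """mu(c_1..c_n): total rate of servers compatible with some class in `state`."""
    servers = set().union(*(COMPAT[c] for c in state)) if state else set()
    return sum((MU[m] for m in servers), Fraction(0))


def weight(state):
    """Unnormalized product form: prod_i lambda_{c_i} / mu(c_1..c_i)."""
    w = Fraction(1)
    for i in range(1, len(state) + 1):
        w *= LAM[state[i - 1]] / mu(state[:i])
    return w


def states(n_max):
    """All ordered class sequences of length at most n_max."""
    return [s for n in range(n_max + 1) for s in product(LAM, repeat=n)]


def balance_gap(state, n_max):
    """Probability flow out of `state` minus flow into it (truncated chain)."""
    arrivals_allowed = len(state) < n_max
    out_rate = mu(state) + (sum(LAM.values()) if arrivals_allowed else 0)
    flow_out = out_rate * weight(state)
    flow_in = Fraction(0)
    if state:
        # Entered by the arrival of the job now in the last position.
        flow_in += LAM[state[-1]] * weight(state[:-1])
    if arrivals_allowed:
        # Entered by a departure from position j+1 of a state one job longer.
        for c in LAM:
            for j in range(len(state) + 1):
                bigger = state[:j] + (c,) + state[j:]
                delta = mu(bigger[:j + 1]) - mu(bigger[:j])
                flow_in += delta * weight(bigger)
    return flow_out - flow_in


if __name__ == "__main__":
    gaps = [balance_gap(s, N_MAX) for s in states(N_MAX)]
    print("global balance holds:", all(g == 0 for g in gaps))
```

Normalizing `weight` over `states(N_MAX)` then gives the truncated stationary distribution, which by construction coincides with the untruncated product form conditioned on the truncated region.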
We now turn to the noncollaborative model, in which a job is only allowed to enter service on one server and services are completed nonpreemptively. For this model we must also specify which server is used when an arriving job finds multiple idle and compatible servers; in this section we assume that this is according to Assign Longest Idle (compatible) Server (ALIS), and we will use the superscript ALIS for the stationary distribution.

For the noncollaborative ALIS model we define the state as $(\vec c_n, \vec s_k)$, where $c_i$ is the class of the $i$'th oldest job that is not receiving service, and $s_i$ is the idle server that has been idle $i$'th longest (out of $k$ that are idle). Note that unlike in the collaborative model, here the $\vec c_n$ vector includes only those jobs that are in the queue waiting for service (we will call this the job queue) and not jobs that are in service.

Figure 4 shows an example of a possible state in the noncollaborative ALIS model. The state here is $(3, \ldots, 2;\ 4, \ldots)$. For each class $c$ that appears in the job queue, all servers in $S(c)$ are serving jobs that arrived earlier than that class-$c$ job. For example, we can tell from the state of the job queue that server 2 is serving a job that arrived earlier than the first class-2 job in the queue.

We define the set of valid states, $\mathcal{X}^{ALIS}$, as those states $(\vec c_n, \vec s_k)$ such that $s_i \notin S(\vec c_n)$, $i = 1, \ldots, k$. That is, $\mathcal{X}^{ALIS} = \mathcal{C}^C_{\vdash \vec s_k} \times \mathcal{S}$, where $\mathcal{S}$ is the set of all permutations of all subsets of $\{1, \ldots, M\}$, and $\mathcal{C}^C_{\vdash \vec s_k}$ is the set of valid states for the system queue (including jobs in service) for the reduced collaborative model with the servers in $\vec s_k$ removed. Defining, as before,
$$\mu(\vec c_n) = \sum_{m \in S(\vec c_n)} \mu_m,$$
$\mu(\vec c_n)$ is the rate at which one of the first $n$ jobs in the job queue leaves the queue (and enters service), and $\Delta_j(\vec c_n) = \Delta_j(\vec c_j) = \mu(\vec c_j) - \mu(\vec c_{j-1})$ is the rate at which the $j$'th job in the job queue leaves the queue (and enters service). The OI properties (i)-(iii) given in Definition 3.1 continue to hold for $\Delta_j(\vec c_n)$ and $\mu(\vec c_n)$. Indeed, given $\vec s_k$, the job queue is an OI loss queue. In state $(\vec c_n, \vec s_k)$ an arrival of class $c$ will be rejected from the job queue (and it will remove a server from the idle-server queue) if $s_i \in S(c)$ for some $i = 1, \ldots, k$. The state-dependent acceptance region for the job queue, $S(\vec s_k)$, satisfies the truncation property of the OI loss queue given in Definition 3.6.

We now consider the idle server queue. Let $\lambda(\vec s_j)$ be the rate of arrivals of jobs that are compatible with one of the first $j$ (idle) servers, i.e., the rate of departures from the idle server queue when it is in state $\vec s_j$. For $k \ge j$, let
$$\Delta^\lambda_j(\vec s_k) = \lambda(\vec s_j) - \lambda(\vec s_{j-1}) = \sum_{i \in C(\vec s_j) \setminus C(\vec s_{j-1})} \lambda_i \ge 0$$
be the rate at which the $j$'th idle server will become busy (leave the idle server queue). Note that we have the same OI properties (i)-(iii) for $\lambda(\vec s_k)$ and $\Delta^\lambda_j(\vec s_k)$ as we had for $\mu(\vec c_n)$ and $\Delta_j(\vec c_n)$:
(i) $\Delta^\lambda_j(\vec s_k) = \Delta^\lambda_j(\vec s_j)$ for $j \le k$,
(ii) $\lambda(\vec s_j)$ is the same for any permutation of $s_1, \ldots, s_j$ (order independence),
(iii) $\lambda(s) > 0$ for any server $s$.

That is, given $\vec c_n$, the idle server queue is also an OI loss queue, where we can think of servers of type $s$ arriving according to a Poisson process at rate $\mu_s$, but if the server is already in the queue in state $\vec s_k$, or if it will remain busy serving another job, i.e., if $s \in \vec s_k \cup S(\vec c_n)$, then the arrival is rejected.
Hence, the acceptance region for the idle-server queue, given $\vec c_n$, also satisfies the truncation property.

The stationary distribution of the noncollaborative ALIS model has a “product of product forms” distribution, with a product form component for the job queue and one for the idle server queue.

Theorem 3.9. (Adan et al. [5]) For the noncollaborative model, under FCFS for jobs and ALIS for servers, and given the stability condition, for $(\vec c_n, \vec s_k) \in \mathcal{X}^{ALIS}$,
$$\pi^{ALIS}(\vec c_n, \vec s_k) = \pi^{ALIS}(\emptyset, \emptyset) \prod_{i=1}^{n} \frac{\lambda_{c_i}}{\mu(\vec c_i)} \prod_{j=1}^{k} \frac{\mu_{s_j}}{\lambda(\vec s_j)} \qquad (4)$$
$$= \frac{\lambda_{c_n}}{\mu(\vec c_n)}\, \pi^{ALIS}(\vec c_{n-1}, \vec s_k) = \frac{\mu_{s_k}}{\lambda(\vec s_k)}\, \pi^{ALIS}(\vec c_n, \vec s_{k-1}), \qquad (5)$$
where $\pi^{ALIS}(\emptyset, \emptyset)$ is a normalizing constant equal to the probability that all servers are busy and that there are no jobs waiting in the queue.

Proof. Note that if there is an arrival to or departure from the job queue, the state of the idle servers does not change, and if there is an arrival to or departure from the set of idle servers, the state of the job queue does not change. Thus, the proof that the product form with $\pi^{ALIS}(\vec c_n, \vec s_k) = \frac{\lambda_{c_n}}{\mu(\vec c_n)} \pi^{ALIS}(\vec c_{n-1}, \vec s_k)$ satisfies partial balance for job queue arrivals and departures, for fixed $\vec s_k$, is exactly the same as our earlier proof for Theorem 3.3, using the truncation property of our acceptance region for job queue arrivals. For the idle server queue, fixing $\vec c_n$, the proof that the rate out of $(\vec c_n, \vec s_k)$ due to departures of idle servers equals the rate in due to arrivals of idle servers is immediate from $\pi^{ALIS}(\vec c_n, \vec s_k) = \frac{\mu_{s_k}}{\lambda(\vec s_k)} \pi^{ALIS}(\vec c_n, \vec s_{k-1})$. Finally, the proof that the rate of leaving state $(\vec c_n, \vec s_k)$ due to server $s$ becoming idle (arriving to the idle server queue) equals the rate of entering the state due to server $s$ becoming busy (leaving the queue), for $s \notin \vec s_k$, is also very similar to our earlier induction proof. Note that to transition out of state $(\vec c_n, \vec s_k)$ due to the arrival of server $s$ to the idle server queue, $s$ must both be busy and not have any compatible jobs in the job queue, i.e., $s \notin \vec s_k \cup S(\vec c_n)$.

In the example shown in Figure 4, with $n = 4$ waiting jobs and $k = 2$ idle servers, the stationary probability is
$$\pi^{ALIS}(\vec c_n, \vec s_k) = \pi^{ALIS}(\emptyset, \emptyset) \left(\frac{\lambda_{c_1}}{\mu(\vec c_1)}\right) \left(\frac{\lambda_{c_2}}{\mu(\vec c_2)}\right) \left(\frac{\lambda_{c_3}}{\mu(\vec c_3)}\right) \left(\frac{\lambda_{c_4}}{\mu(\vec c_4)}\right) \left(\frac{\mu_{s_1}}{\lambda(\vec s_1)}\right) \left(\frac{\mu_{s_2}}{\lambda(\vec s_2)}\right),$$
where, for this state, $\mu(\vec c_1)$ is the rate of a single server, each of $\mu(\vec c_2)$, $\mu(\vec c_3)$, and $\mu(\vec c_4)$ is the sum of two server rates, $\lambda(\vec s_1)$ is the arrival rate of a single job class, and $\lambda(\vec s_2)$ is the sum of two class arrival rates.

We next consider the noncollaborative model where, instead of using ALIS to choose an idle server among compatible servers for an arriving job, servers are chosen randomly among idle compatible servers with appropriate probabilities that depend only on the set of busy (or idle) servers [37]. The results given in [37] use the partially aggregated state space we discuss in Section 5, but, as we will show, a product-form result also holds for a detailed state descriptor similar to the one used for the collaborative model. Unlike under ALIS, under RAIS the order of the idle servers no longer matters. Instead, for this version of the model we keep track of the busy servers, $\vec b_l$, where there are $l$ busy servers, and where the servers are ordered by the arrival times of the jobs they are serving. To obtain a product-form stationary distribution, we need our state to be even more detailed: we must track not only the order of the busy servers, but the positions of the busy servers within the job queue.
Our detailed state description for RAIS is thus $\vec z_m$, where $m$ denotes the number of jobs in the system (including both jobs in the queue and jobs in service), and $z_i$ is associated with the $i$'th job in the system in order of arrival. This is similar to the state $\vec c_n$ used for the collaborative model, with one key difference: the $\vec z_m$ state does not track the classes of jobs that are in service; instead, it tracks the servers that are serving them. That is, we let $z_i = c$ if the $i$'th job in the system has not started service (it is in the job queue), and $z_i = b$ if the $i$'th job in the system is in service on server $b$. Note that $\vec z_m$ consists of an interleaving of the states of the job queue, $\vec c_n$, and of the busy server queue, $\vec b_l$. The possible states for $\vec z_m$, $\mathcal{X}^{RAIS}$, are such that $\vec b_l \subseteq \{1, \ldots, M\}$, and for any position $i$, if $z_i = c$, all compatible servers are serving earlier arrivals, i.e., $S(c) \subseteq \vec z_{i-1}$, because of the FCFS service discipline.

In order to completely define the RAIS policy, we must specify the probability that an arriving job enters service at compatible idle server $s \notin \{b_1, \ldots, b_l\}$. When the set of ordered busy servers is $\vec b_j$, let $\lambda^a_s(\vec b_j)$ represent the activation rate of idle server $s \notin \{b_1, \ldots, b_j\}$ (the rate of going from state $\vec b_j$ to $(\vec b_j, s)$ for any $\vec c_n$). We allow the activation rates to depend only on the set of busy servers, not on their order. Indeed, as Visschers et al. showed for their aggregated state description of this model [37], in order for the stationary distribution to have a product form, we need the following stronger condition, called the assignment condition. Let
$$\Pi_\lambda(\vec b_l) = \prod_{j=1}^{l} \lambda^a_{b_j}(\vec b_{j-1}).$$
The assignment condition requires that the probabilities for routing to compatible idle servers be chosen so that $\Pi_\lambda(\vec b_l)$ depends only on the set of busy servers, not on their order (i.e., so that $\Pi_\lambda(\vec b_l)$ is the same for any permutation of $b_1, \ldots, b_l$). Visschers et al. show that it is always possible to choose assignment probability distributions so that the assignment condition holds [37]; the derivation involves solving a max flow problem for each subset of busy servers.

One way to interpret the assignment condition is to consider the loss system in which customers are not allowed to queue, so that the state is just $\vec b_l$, the set of busy servers. Then the assignment condition, along with the fact that $\mu(\vec b_l)$ doesn't depend on the order of busy servers, reduces to Kolmogorov's criterion for reversibility of Markov chains, namely that the product of the transition probabilities along any path from a state back to itself is the same if the states are traversed in the reverse order. For example, consider the path traversing the states $\emptyset, (u), (u, v), (v), \emptyset$, where $u$ and $v$ are two servers. Then the probability of traversing that path is $C \lambda^a_u(\emptyset) \lambda^a_v(u) \mu_u \mu_v$, where $C$ is an appropriate normalizing constant, and the probability for the reverse path, in which $v$ is activated first and finishes first, is $C \lambda^a_v(\emptyset) \lambda^a_u(v) \mu_u \mu_v$. These are the same, given the assignment condition: $\lambda^a_u(\emptyset) \lambda^a_v(u) = \lambda^a_v(\emptyset) \lambda^a_u(v)$. Indeed, Adan, Hurkens, and Weiss showed that the loss model (under the assignment condition) is reversible, and has a product-form stationary distribution [4].

Let $\mu(\vec z_i) = \sum_{j=1}^{i} I(z_j) \mu_{z_j}$, where $I(z_j)$ is an indicator that is 1 if $z_j$ corresponds to a busy server and 0 if it corresponds to a job in the job queue.
Note that $\mu(\vec z_m)$ satisfies the conditions for order independence, with $\Delta_j(\vec z_m) = \Delta_j(\vec z_j) = \mu(\vec z_j) - \mu(\vec z_{j-1})$. Let $\lambda_{z_i}(\vec z_i) = \lambda_{z_i}(\vec z_m) = \lambda_c$ if $z_i = c$ for some job class $c$, and, if $z_i = b$ for some busy server $b$, let $\lambda_{z_i}(\vec z_i) = \lambda_{z_i}(\vec z_m) = \lambda^a_b(\vec b_{k(i)})$, where $k(i) = \sum_{j=1}^{i-1} I(z_j)$ is the number of busy servers in the first $i - 1$ positions of $\vec z_m$. Note that $\lambda_{z_m}(\vec z_m)$ is the same for any permutation of $\vec z_{m-1}$ regardless of whether $z_m$ is a waiting job or a busy server, and, from the assignment condition, for any $\vec z_m \in \mathcal{X}^{RAIS}$, the product of the rates for the last two positions is unchanged when those two entries are swapped:
$$\lambda_{z_{m-1}}(\vec z_{m-1})\, \lambda_{z_m}(\vec z_m) = \lambda_{z_{m-1}}(\vec z_{m-2}, z_m)\, \lambda_{z_m}(\vec z_{m-2}, z_m, z_{m-1}).$$

Theorem 3.10.
For the noncollaborative model, under FCFS for jobs and random assignment to idle servers, and under the assignment condition and the stability condition, for $\vec z_m \in \mathcal{X}^{RAIS}$,
$$\pi^{RAIS}(\vec z_m) = \pi^{RAIS}(\emptyset) \prod_{i=1}^{m} \frac{\lambda_{z_i}(\vec z_i)}{\mu(\vec z_i)} = \frac{\lambda_{z_m}(\vec z_m)}{\mu(\vec z_m)}\, \pi^{RAIS}(\vec z_{m-1}),$$
where $\pi^{RAIS}(\emptyset)$ is a normalizing constant that represents the probability that the system is empty (i.e., that there are no busy servers and no jobs in the queue).

Before proving Theorem 3.10, we give a brief example of the system state and stationary probability under RAIS. Consider the example in Figure 4. The state under RAIS is $\vec z_m = (z_1, \ldots, z_6)$, in which $z_1 = 3_b$ and $z_3$ are busy servers and $z_2, z_4, z_5, z_6$ are waiting jobs. Here we use a subscript of $b$ or $c$ to indicate whether the entry in $\vec z_m$ corresponds to a job that is waiting for service in the job queue ($c$) or to a busy server ($b$). Note that for jobs in the job queue the notation $i_c$ indicates that the job in this position is class-$i$, whereas for busy servers the notation $j_b$ indicates that server $j$ is in this position (that is, we do not track the classes of jobs in service). The state $\vec z_m$ is an interleaving of $\vec c_n = (3, c_2, c_3, 2)$ and $\vec b_l = (3, b_2)$, and its stationary probability is
$$\pi^{RAIS}(\vec z_m) = \pi^{RAIS}(\emptyset) \left(\frac{\lambda^a_3(\emptyset)}{\mu_3}\right) \left(\frac{\lambda_3}{\mu_3}\right) \left(\frac{\lambda^a_{b_2}(3)}{\mu_3 + \mu_{b_2}}\right) \left(\frac{\lambda_{c_2}}{\mu_3 + \mu_{b_2}}\right) \left(\frac{\lambda_{c_3}}{\mu_3 + \mu_{b_2}}\right) \left(\frac{\lambda_2}{\mu_3 + \mu_{b_2}}\right).$$

We are now ready to prove Theorem 3.10.
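The product form of Theorem 3.10 and the assignment condition can both be evaluated mechanically. The sketch below is illustrative only: the encoding of a state as a list of `("b", server)` and `("c", job_class)` pairs, the dictionary `act` keyed by `(server, frozenset_of_busy_servers)`, and the function names are assumptions made for this example, not notation from the paper. It computes the unnormalized weight $\prod_i \lambda_{z_i}(\vec z_i) / \mu(\vec z_i)$ and checks that $\Pi_\lambda$ is permutation invariant.

```python
from fractions import Fraction
from itertools import combinations, permutations


def pi_lambda(order, act):
    """Pi_lambda(b_1..b_l): product of activation rates along an activation order."""
    total, busy = Fraction(1), frozenset()
    for b in order:
        total *= act[(b, busy)]
        busy = busy | {b}
    return total


def assignment_condition_holds(servers, act):
    """True if Pi_lambda depends only on the set of busy servers, not their order."""
    for size in range(2, len(servers) + 1):
        for subset in combinations(servers, size):
            values = {pi_lambda(order, act) for order in permutations(subset)}
            if len(values) != 1:
                return False
    return True


def rais_weight(z, lam, mu, act):
    """Unnormalized pi^RAIS(z) = prod_i lambda_{z_i}(z_i) / mu(z_i)."""
    weight, busy, rate = Fraction(1), frozenset(), Fraction(0)
    for kind, label in z:
        if kind == "b":
            # Busy server: activation rate given the servers that became busy earlier.
            arrival = act[(label, busy)]
            busy = busy | {label}
            rate += mu[label]          # mu(z_i) includes the server in position i
        else:
            # Waiting job of class `label`: plain class arrival rate.
            arrival = lam[label]
        weight *= arrival / rate
    return weight


# Hypothetical instance: one job class compatible with both of two servers.
lam = {1: Fraction(1)}
mu = {1: Fraction(1), 2: Fraction(2)}
act = {
    (1, frozenset()): Fraction(1, 2),   # both idle: route to each with probability 1/2
    (2, frozenset()): Fraction(1, 2),
    (2, frozenset({1})): Fraction(1),   # only server 2 idle
    (1, frozenset({2})): Fraction(1),   # only server 1 idle
}

if __name__ == "__main__":
    print(assignment_condition_holds([1, 2], act))
    print(rais_weight([("b", 1), ("b", 2), ("c", 1)], lam, mu, act))
```

Keying `act` by a `frozenset` builds in the requirement that activation rates depend only on the set of busy servers; the assignment condition is the stronger requirement that `pi_lambda` be order independent.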
Proof. Fix $\vec z_m \in \mathcal{X}^{RAIS}$, with corresponding $\vec b_l$. We will show that partial balance holds in three steps:
1. The rate out of state $\vec z_m$ due to a service completion equals the rate into state $\vec z_m$ due to an arrival.
2. The rate out of state $\vec z_m$ due to server $b$ becoming busy equals the rate into state $\vec z_m$ due to server $b$ becoming idle.
3. The rate out of state $\vec z_m$ due to a class-$c$ job arrival to the job queue equals the rate into state $\vec z_m$ due to a class-$c$ departure from the job queue.

Step 1. First suppose $z_m = b_l$, so $\vec z_m = (\vec z_{m-1}, b_l)$. Then our product form immediately satisfies $\mu(\vec z_m) \pi^{RAIS}(\vec z_m) = \lambda^a_{b_l}(\vec b_{l-1}) \pi^{RAIS}(\vec z_{m-1})$, i.e., the rate of transitions out of state $\vec z_m$ due to a service completion equals the rate into $\vec z_m$ due to a new server arrival, i.e., of server $b_l$ going from idle to busy and serving the most recently arriving job. If $z_m \neq b_l$ then it is not possible to enter state $\vec z_m$ with an idle server becoming busy. Now suppose $z_m = c_n$. In this case we have, for our product form, $\mu(\vec z_m) \pi^{RAIS}(\vec z_m) = \lambda_{c_n} \pi^{RAIS}(\vec z_{m-1})$, i.e., the rate of transitions out of state $\vec z_m$ due to a service completion equals the rate into $\vec z_m$ due to a new arrival to the job queue. Note that $c_n$ is such that $S(c_n) \subseteq \vec b_l$.

Step 2. We now show that under the product-form probabilities above, the rate out of state $\vec z_m$ due to the (external) arrival of any server $b \notin \vec b_l$ to the busy server queue equals the rate into the state due to server $b$'s departure from the busy server queue. Note that because $\vec z_m \in \mathcal{X}^{RAIS}$ and $b \notin \vec b_l$, none of the jobs in the job queue are compatible with server $b$, so a job completion at server $b$ in state $(z_1, \ldots, z_j, b, z_{j+1}, \ldots, z_m)$ will result in server $b$ leaving the busy server queue. Using the OI properties of $\mu(\vec z_m)$, and that $\lambda_{z_m}(\vec z_m)$ is the same for any permutation of $\vec z_{m-1}$, we need to show that the given product form satisfies
$$\pi^{RAIS}(\vec z_m)\, \lambda_{z_{m+1}}(\vec z_m, b) = \sum_{j=0}^{m} \pi^{RAIS}(z_1, \ldots, z_j, b, z_{j+1}, \ldots, z_m)\, \Delta_{j+1}(\vec z_j, b)$$
$$= \frac{\lambda_{z_{m+1}}(\vec z_{m-1}, b, z_m)}{\mu(\vec z_m, b)} \sum_{j=0}^{m-1} \pi^{RAIS}(z_1, \ldots, z_j, b, z_{j+1}, \ldots, z_{m-1})\, \Delta_{j+1}(\vec z_j, b) + \frac{\lambda_{z_{m+1}}(\vec z_m, b)}{\mu(\vec z_m, b)}\, \pi^{RAIS}(\vec z_m)\, \Delta_{m+1}(\vec z_m, b). \qquad (6)$$
We use induction on $m$; the induction hypothesis is that
$$\pi^{RAIS}(\vec z_{m-1})\, \lambda_{z_m}(\vec z_{m-1}, b) = \sum_{j=0}^{m-1} \pi^{RAIS}(z_1, \ldots, z_j, b, z_{j+1}, \ldots, z_{m-1})\, \Delta_{j+1}(\vec z_j, b).$$
Then the RHS of (6) is
$$\frac{\lambda_{z_{m+1}}(\vec z_{m-1}, b, z_m)}{\mu(\vec z_m, b)}\, \pi^{RAIS}(\vec z_{m-1})\, \lambda_{z_m}(\vec z_{m-1}, b) + \frac{\lambda_{z_{m+1}}(\vec z_m, b)}{\mu(\vec z_m, b)}\, \pi^{RAIS}(\vec z_m) \left[\mu(\vec z_m, b) - \mu(\vec z_m)\right]$$
$$= \frac{\lambda_{z_{m+1}}(\vec z_{m-1}, b, z_m)}{\mu(\vec z_m, b)}\, \pi^{RAIS}(\vec z_{m-1})\, \lambda_{z_m}(\vec z_{m-1}, b) + \lambda_{z_{m+1}}(\vec z_m, b)\, \pi^{RAIS}(\vec z_m) - \frac{\lambda_{z_{m+1}}(\vec z_m, b)}{\mu(\vec z_m, b)} \cdot \frac{\lambda_{z_m}(\vec z_m)}{\mu(\vec z_m)}\, \pi^{RAIS}(\vec z_{m-1})\, \mu(\vec z_m)$$
$$= \pi^{RAIS}(\vec z_m)\, \lambda_{z_{m+1}}(\vec z_m, b),$$
where $\lambda_{z_m}(\vec z_{m-1}, b)\, \lambda_{z_{m+1}}(\vec z_{m-1}, b, z_m) = \lambda_{z_m}(\vec z_m)\, \lambda_{z_{m+1}}(\vec z_m, b)$ from the assignment condition, so the first and last terms cancel.

Step 3. Finally, we show that the rate out of $\vec z_m$ due to a class-$c$ arrival to the job queue equals the rate into state $\vec z_m$ due to a class-$c$ job queue departure, for each $c$ such that $S(c) \subseteq \vec b_l$. Fix $c$ and $\vec z_m$ and call the class-$c$ job whose departure causes the system to enter state $\vec z_m$ the tagged job. Let $\vec z'_{m+1}$ denote the system state just before the tagged job leaves the job queue. The transition from $\vec z'_{m+1}$ to $\vec z_m$ is triggered by a service completion at some server $b \in S(c)$. In $\vec z'_{m+1}$ it must be the case that $b$, and all other servers in $S(c)$, are serving jobs that arrived earlier than the tagged job. At the service completion on server $b$, the job it is working on leaves, and server $b$ takes the position of the tagged job. Therefore, server $b$'s position in $\vec z_m$, after the service completion, must be after all the other servers in $S(c)$. Call this position $\kappa$. Before the transition, in state $\vec z'_{m+1}$, the tagged job must be in position $\kappa + 1$, and server $b$ must be in position $j + 1 \le \kappa$. Thus, we need to show that
$$\pi^{RAIS}(\vec z_m)\, \lambda_c = \sum_{j=0}^{\kappa-1} \pi^{RAIS}(z_1, \ldots, z_j, b, z_{j+1}, \ldots, z_{\kappa-1}, c, z_{\kappa+1}, \ldots, z_m)\, \Delta_{j+1}(\vec z_j, b).$$
First suppose $z_m = b$, i.e., $\kappa = m$. Then we want to show that
$$\pi^{RAIS}(\vec z_{m-1}, b)\, \lambda_c = \sum_{j=0}^{m-1} \pi^{RAIS}(z_1, \ldots, z_j, b, z_{j+1}, \ldots, z_{m-1}, c)\, \Delta_{j+1}(\vec z_j, b). \qquad (7)$$
Suppose, using induction on $m$, that
$$\pi^{RAIS}(\vec z_{m-2}, b)\, \lambda_c = \sum_{j=0}^{m-2} \pi^{RAIS}(z_1, \ldots, z_j, b, z_{j+1}, \ldots, z_{m-2}, c)\, \Delta_{j+1}(\vec z_j, b).$$
For $j < m - 1$,
$$\pi^{RAIS}(z_1, \ldots, z_j, b, z_{j+1}, \ldots, z_{m-1}, c) = \frac{\lambda_c}{\mu(\vec z_{m-1}, b)} \cdot \frac{\lambda_{z_{m-1}}(\vec z_{m-1})}{\mu(\vec z_{m-1}, b)}\, \pi^{RAIS}(z_1, \ldots, z_j, b, z_{j+1}, \ldots, z_{m-2})$$
$$= \pi^{RAIS}(z_1, \ldots, z_j, b, z_{j+1}, \ldots, z_{m-2}, c, z_{m-1})$$
$$= \frac{\lambda_{z_{m-1}}(\vec z_{m-1})}{\mu(\vec z_{m-1}, b)}\, \pi^{RAIS}(z_1, \ldots, z_j, b, z_{j+1}, \ldots, z_{m-2}, c).$$
Thus, the RHS of (7) is
$$\frac{\lambda_{z_{m-1}}(\vec z_{m-1})}{\mu(\vec z_{m-1}, b)} \sum_{j=0}^{m-2} \pi^{RAIS}(z_1, \ldots, z_j, b, z_{j+1}, \ldots, z_{m-2}, c)\, \Delta_{j+1}(\vec z_j, b) + \frac{\lambda_c}{\mu(\vec z_{m-1}, b)}\, \pi^{RAIS}(\vec z_{m-1}, b)\, \Delta_m(\vec z_{m-1}, b)$$
$$= \frac{\lambda_{z_{m-1}}(\vec z_{m-1})}{\mu(\vec z_{m-1}, b)}\, \pi^{RAIS}(\vec z_{m-2}, b)\, \lambda_c + \frac{\lambda_c}{\mu(\vec z_{m-1}, b)}\, \pi^{RAIS}(\vec z_{m-1}, b) \left[\mu(\vec z_{m-1}, b) - \mu(\vec z_{m-1})\right]$$
$$= \frac{\lambda_{z_{m-1}}(\vec z_{m-1})}{\mu(\vec z_{m-1}, b)}\, \pi^{RAIS}(\vec z_{m-2}, b)\, \lambda_c + \lambda_c\, \pi^{RAIS}(\vec z_{m-1}, b) - \frac{\lambda_c}{\mu(\vec z_{m-1}, b)} \cdot \frac{\lambda_{z_{m-1}}(\vec z_{m-1})}{\mu(\vec z_{m-1})}\, \pi^{RAIS}(\vec z_{m-2}, b)\, \mu(\vec z_{m-1})$$
$$= \lambda_c\, \pi^{RAIS}(\vec z_{m-1}, b).$$
Now suppose $\kappa < m$, so $\vec z_m = (z_1, \ldots, z_{\kappa-1}, b, z_{\kappa+1}, \ldots, z_m)$. Then we want to show that
$$\pi^{RAIS}(z_1, \ldots, z_{\kappa-1}, b, z_{\kappa+1}, \ldots, z_m)\, \lambda_c = \sum_{j=0}^{\kappa-1} \pi^{RAIS}(z_1, \ldots, z_j, b, z_{j+1}, \ldots, z_{\kappa-1}, c, z_{\kappa+1}, \ldots, z_m)\, \Delta_{j+1}(\vec z_j, b),$$
i.e.,
$$\prod_{i=\kappa+1}^{m} \frac{\lambda_{z_i}(\vec z_i)}{\mu(\vec z_i)}\, \pi^{RAIS}(\vec z_{\kappa-1}, b)\, \lambda_c = \prod_{i=\kappa+1}^{m} \frac{\lambda_{z_i}(\vec z_i)}{\mu(\vec z_i)} \sum_{j=0}^{\kappa-1} \pi^{RAIS}(z_1, \ldots, z_j, b, z_{j+1}, \ldots, z_{\kappa-1}, c)\, \Delta_{j+1}(\vec z_j, b).$$
From our previous argument we have
$$\pi^{RAIS}(\vec z_{\kappa-1}, b)\, \lambda_c = \sum_{j=0}^{\kappa-1} \pi^{RAIS}(z_1, \ldots, z_j, b, z_{j+1}, \ldots, z_{\kappa-1}, c)\, \Delta_{j+1}(\vec z_j, b),$$
and the result follows.

The product form stationary probabilities for the collaborative model and the ALIS noncollaborative model both include the term $\prod_{i=1}^{n} \frac{\lambda_{c_i}}{\mu(\vec c_i)}$. Given the similarities in the stationary distributions, it is natural to ask whether the two systems also are similar in their more detailed evolution. Indeed, Adan et al.
observed that when all servers are busy (i.e., the idle-server queue is empty, $\vec s_k = \emptyset$ under ALIS) the path-wise evolution of the state $\vec c_n$ (i.e., the jobs in queue) in the noncollaborative model is the same as the evolution of $\vec c_n$ (jobs in system) in the collaborative model [5]. We generalize this observation to relate the path-wise evolution of the two systems conditioned on the set of idle servers. Note that while the set of idle servers is fixed, we need not worry about how jobs are assigned to idle servers.

Observation 3.11. Conditioned on the set of idle servers, $\vec s_k$, and while those servers remain idle, the path-wise evolution of $\vec c_n$ (jobs in queue) for the noncollaborative model (under either RAIS or ALIS) is the same as that of $\vec c_n$ (jobs in system) for the truncated collaborative model with the servers in $\vec s_k$ removed.

Observation 3.11 tells us that, with coupled arrivals and service completions and the same initial $\vec c_n$, a service completion removes a job from the system for the collaborative model and removes the corresponding job from the job queue in the noncollaborative model. In the noncollaborative model, another job that does not appear in $\vec c_n$ will also leave the system (and will be replaced at the server by the job leaving the job queue).

The path-wise correspondence between the collaborative and noncollaborative models will be useful when we move from the stationary distribution to performance metrics such as per-class response time distributions. In Section 4, we will see that these performance metrics often are more straightforward to derive in the collaborative model. The path-wise relationship between the two models allows us to apply our results in the collaborative model to the noncollaborative model.

We note that the path-wise coupling still holds for general (coupled) arrival processes, not just Poisson processes.

Our path-wise equivalence between the job queue in the noncollaborative model conditioned on the set of busy servers and the system queue for the collaborative model with the idle servers removed is somewhat analogous to the observation of Borst et al. of the equivalence between the jobs in system in a processor sharing model and the jobs in queue for a nonpreemptive random-order-of-service model [17].

Two generalizations, combining aspects of the collaborative and noncollaborative models, have recently been introduced using the notion of “tokens” [12], [19].
In these models tokens generalize the notion of servers in the noncollaborative model. There is a bipartite compatibility matching between job classes and tokens, jobs must have tokens to enter service, and a token can be assigned to only one job at a time. Jobs of class $i$ arrive according to a Poisson process at rate $\lambda_i$, and can be matched to tokens in set $S_i$. Ayesta et al. allow jobs to wait for tokens and assume that when an arriving job sees multiple idle compatible tokens, it is assigned a token according to RAIS (or RAIT: Random Assignment to Idle Tokens) [12]. Comte assumes a loss model, in which jobs that arrive when no compatible tokens are available are lost, and that idle tokens are assigned according to ALIS (or ALIT: Assign Longest Idle Token) [19]. We describe these models in more detail below.

In the model of Ayesta et al. [12], given the set of busy tokens $\vec b_l$, listed in the order of the arrival times of the jobs they are serving, and idle token $s$, the activation rate $\lambda^a_s(\vec b_l)$ (i.e., the rate at which $s$ will be assigned to an arriving compatible job) satisfies the same assignment condition as in the noncollaborative RAIS model. The service process, given ordered busy tokens $\vec b_l$, is generalized from the skill-based collaborative model to the OI queue. That is, defining $\Delta_j(\vec b_l)$ as the (marginal) rate of service given to the job with the $j$'th busy token and $\mu(\vec b_l) = \sum_{k=1}^{l} \Delta_k(\vec b_l)$, the following OI conditions are assumed, as in Definition 3.1:
(i) $\Delta_k(\vec b_l) = \Delta_k(\vec b_k)$ for $k \le l$,
(ii) $\mu(\vec b_l)$ is the same for any permutation of $b_1, \ldots, b_l$ (order independence),
(iii) $\mu(b) > 0$ for any token $b$.

Like Krzesinski [34], Ayesta et al. also allow the service rate to be multiplied by a factor that is a function of the total number of tokens in service. We continue to omit that factor for simplicity.

Let us define the state, as we did for the noncollaborative RAIS model, as $\vec z_m$, where $z_i$ is associated with the $i$'th arrival in the system, $z_i = c$ if the arrival is of class $c$ and does not yet have a token, and $z_i = b$ if it has token $b$. We also define $\lambda_{z_i}(\vec z_i)$ as we did for the noncollaborative RAIS model. Note that our proof of Theorem 3.10 did not use the particular form of $\mu(\vec b_l)$, only its OI properties. (In particular, we showed the result for general $\mu(\vec z_m)$ and $\Delta_j(\vec z_m)$.) Hence, Theorem 3.10 also holds for the token model.

As Ayesta et al. note [12], the noncollaborative model is recovered when tokens correspond to the servers of the noncollaborative model, and the original OI queue (including the collaborative model) is recovered when each arriving job immediately obtains a token directly corresponding to its class (so there is an infinite supply of tokens, and activation rates need not be included).

Comte introduced a related, multi-layered, token loss model that generalizes the noncollaborative model operating under ALIS [19]. The terminology and notation used in Comte's model are a bit different from ours; Comte refers to job “type” where we use job “class,” and to token “classes” where we use “tokens.” (We allow distinct tokens to have the same job class compatibilities and speeds.) In Comte's model, and in contrast to that of Ayesta et al., a job that arrives when there is no available compatible token is lost, and a job that arrives to find multiple compatible tokens takes the token that has been idle longest. Hence, for Comte, the state is $(\vec b_l, \vec s_k)$, where $\vec b_l$ is the set of busy tokens listed in the order of the arrival times of their corresponding jobs and $\vec s_k$ is the set of idle tokens in the order in which they became idle. Note that, because jobs cannot wait for tokens, there is no $\vec c_n$ component of the state, so $\vec b_l$ corresponds directly to $\vec z_m$ of the noncollaborative RAIS model. Also, tokens alternate between being busy and idle, and therefore $\vec b_l$ lists the busy tokens in the order in which they became busy, i.e., their order of arrival to the busy token queue.

Instead of assuming a generic OI service process $\mu(\vec b_l)$ for serving tokens, as in Ayesta et al.'s model, Comte assumes a collaborative service model. That is, there is another bipartite matching layer between tokens and servers that defines the total service rate $\mu(\vec b_l)$ when the ordered set of busy tokens is $\vec b_l$, and such that $\mu(\vec b_l)$ satisfies the OI conditions. Note that when the idle token queue is in state $\vec s_k$, the rate at which tokens leave, $\lambda(\vec s_k)$, is the rate at which jobs compatible with one of the idle tokens arrive, and, as in the noncollaborative ALIS model, $\lambda(\vec s_k)$ also satisfies the OI conditions of Definition 3.1. Because there is a finite set of tokens, we have a closed network of two OI queues, and because OI queues are quasi-reversible, the closed token (CT) network also has a product-form distribution. In particular, for $(\vec b_l, \vec s_k) \in \mathcal{X}^T$, where $\mathcal{X}^T$ is the set of states such that each token appears exactly once, i.e., $l + k$ equals the total number of tokens and $(\vec b_l, \vec s_k)$ is an arbitrary permutation of the set of tokens, we have
$$\pi^{CT}(\vec b_l, \vec s_k) = G^{CT} \prod_{i=1}^{l} \frac{1}{\mu(\vec b_i)} \prod_{i=1}^{k} \frac{1}{\lambda(\vec s_i)},$$
where $G^{CT}$ is a normalizing constant. This result would also hold assuming a general OI process for “serving” busy tokens rather than the collaborative service model.

Adan et al.
[5] introduced a matching model, called the directed bipartite matching (DBM) model, with a bipartite matching between servers and jobs, and in which both servers and jobs arrive according to Poisson processes, but only jobs can queue to wait for servers. Servers of type $j$ arrive according to an independent Poisson process with rate $\mu_j$, and the other parameters of the model are the same as for the collaborative model. The state is again $\vec c_n$. An arriving server matches with the first compatible job in the queue, if any, and the server, along with its job if there is one, immediately leaves. The DBM model captures important features of organ transplant waitlists, where patients wait for organs, but unmatched organs are lost, and where compatibilities are determined by biological factors such as blood types, as well as the locations of the patients and organs. As Adan et al. show, the Markov chain for this model is sample-path equivalent to that of the collaborative model; in particular, the departure rate from the queue in state $\vec c_n$ is $\mu(\vec c_n)$ as defined earlier. Therefore the matching model has the same, product-form, stationary distribution given in Theorem 3.3. The result also holds for a more general, OI matching, i.e., when there are no server types, but a job will be matched to a server at rate $\mu(\vec c_n)$ when the state is $\vec c_n$ and $\mu(\vec c_n)$ satisfies the OI conditions (i)-(iii).
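The DBM dynamics described above are simple to state in code. The following minimal sketch uses hypothetical names (`server_arrival`, `departure_rate`) and an illustrative compatibility map `compat` from job class to the set of compatible server types; none of these come from [5]. It shows how an arriving server removes the oldest compatible job, and why the total departure rate in state $\vec c_n$ is $\mu(\vec c_n)$.

```python
def server_arrival(queue, server, compat):
    """Job queue after a server of type `server` arrives in the DBM model.

    The server matches the first (oldest) compatible job, if any, and both
    leave; a server that finds no compatible job is lost and the queue is
    unchanged.
    """
    for pos, job_class in enumerate(queue):
        if server in compat[job_class]:
            return queue[:pos] + queue[pos + 1:]
    return queue


def departure_rate(queue, mu, compat):
    """mu(c_n): total arrival rate of server types compatible with a queued job."""
    servers = set()
    for job_class in queue:
        servers |= compat[job_class]
    return sum(mu[s] for s in servers)


# Hypothetical example: 3 job classes, 4 server types.
compat = {1: {1, 2}, 2: {2}, 3: {3}}
mu = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}
queue = (3, 1, 1, 2)   # job classes in order of arrival, oldest first

if __name__ == "__main__":
    print(server_arrival(queue, 2, compat))   # oldest job compatible with server 2 leaves
    print(server_arrival(queue, 4, compat))   # server 4 matches nothing
    print(departure_rate(queue, mu, compat))
```

Only server types compatible with at least one queued job can trigger a departure, which is why the rate depends on the set of classes present and not on their order.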
The state process for the matching model is also equivalent to the queue process of a variant ofthe noncollaborative model in which we keep all the servers busy by assigning a server that becomes idleand that does not find a waiting compatible job a “dummy job.” This might be appropriate in a call centercontext in which servers that would be otherwise idle handle outgoing calls or email.If we ignore the timing between arrivals and departures in the matching model described above, we havean equivalent discrete-time model, in which at most one event (a job arrival or a server arrival/job departure)can occur in any time slot. Now λ c is the probability of a class- c arrival and, when the state is (cid:126)c n , µ ( (cid:126)c n )represents the probability of a job completion (or a job-server matching) in the next time slot, µ ( (cid:126)c n ) ≤ µ ( (cid:126)c n ) satisfies the OIconditions. Then the transitions of the Markov chain (cid:126)c n for the discrete-time queue will be sample-pathequivalent to the transitions of the embedded Markov chain for the continuous-time OI queue, and, again,the same product form will hold for the steady-state distribution. The DBM special case, with server/job16ompatibilities, is considered by Weiss [38]. Here a server of type s arrives with probability µ s and matcheswith the earliest compatible job if there is one; the server immediately departs (along with any matchingjob).The DBM model discussed in [5] does not allow unmatched servers to wait for jobs. We now show thatthe DBM model can be extended to include a “server queue” in which unmatched servers wait in FCFM(first-come-first-matched) order. This yields something somewhat analogous to the noncollaborative ALISmodel. For stability, we must have an upper bound, K , on the server queue. Let us call this (new) modelthe DBM( K ) model. 
Then the stability condition for the DBM($K$) model will be the same as for the DBM = DBM(0) model, i.e., $\lambda(A) \le \mu(A)$ for all subsets of job classes $A$, where $\mu(A)$ is the rate of arrivals of servers compatible with job classes in $A$ in the continuous-time model, and is the probability of such an arrival in the discrete-time version. The state is $(\vec{c}_n, \vec{s}_k) \in X_{DBM(K)}$, where $\vec{c}_n$ is the set of waiting jobs in arrival order, $\vec{s}_k$ is the set of waiting servers in arrival order, and the set of valid states, $X_{DBM(K)}$, comprises those states $(\vec{c}_n, \vec{s}_k)$ such that $k \le K$ and $s_i \notin S(\vec{c}_n)$, $i = 1, \ldots, k$. Again, both the job queue and the server queue are order independent, so the steady-state distribution will have the same form as that of the noncollaborative ALIS model, though the latter includes a particular loss model for the idle-server queue, so its set of valid states is restricted to $\vec{s}_k$ such that each server appears in $\vec{s}_k$ at most once.

Theorem 3.12.
For the stable directed bipartite matching model with a finite buffer for servers $K$, DBM($K$),
\[
\pi_{DBM(K)}(\vec{c}_n, \vec{s}_k) = \pi_{DBM(K)}(\emptyset, \emptyset) \prod_{i=1}^{n} \frac{\lambda_{c_i}}{\mu(\vec{c}_i)} \prod_{j=1}^{k} \frac{\mu_{s_j}}{\lambda(\vec{s}_j)}, \quad \forall (\vec{c}_n, \vec{s}_k) \in X_{DBM(K)}.
\]

By symmetry, a similar result holds if the server buffer is infinite, but the job buffer is bounded by some $N$. Now the stability condition is $\mu(B) \le \lambda(B)$ for all subsets of server types $B$.

Note that the continuous-time DBM($K$) model also models a make-to-stock inventory system with a bipartite graph representing preferences of customer classes for certain types of items. Customers of class $i$ are willing to purchase any of the items in $S_i$. Items of type $j$ are produced according to a Poisson process at rate $\mu_j$ as long as the total number of items is less than the overall base-stock level $K$. Queueing customers represent back orders. Also, from Corollary 3.7, the result holds when we have different base-stock levels for different types of items.

Our results also extend to DBM models with abandonments and finite or infinite buffers. These models are appropriate for car sharing applications and other two-sided queues, where, for example, classes of jobs and types of servers correspond to location preferences. Suppose jobs (riders) of class $i$ arrive (request rides) according to a Poisson process with rate $\lambda_i$, and will wait for an exponential time at rate $\gamma_i$ before abandoning their request. Servers (drivers) of type $j$ arrive according to a Poisson process at rate $\mu_j$ and will wait an exponential time at rate $\nu_j$ for a rider before leaving the platform. We assume a bipartite matching graph as defined earlier. Because of the abandonments, stability will not be an issue, even for infinite buffers. We have the following.

Theorem 3.13.
For the directed bipartite matching model with abandonments (DBMA) and finite or infinite buffers for jobs and servers,
\[
\pi_{DBMA}(\vec{c}_n, \vec{s}_k) = \pi_{DBMA}(\emptyset, \emptyset) \prod_{i=1}^{n} \frac{\lambda_{c_i}}{\mu(\vec{c}_i)} \prod_{j=1}^{k} \frac{\mu_{s_j}}{\lambda(\vec{s}_j)}, \quad \forall (\vec{c}_n, \vec{s}_k) \in X_{DBMA},
\]
where
\[
\mu(\vec{c}_j) = \sum_{i=1}^{j} \gamma_{c_i} + \sum_{m \in S(\vec{c}_j)} \mu_m, \qquad
\lambda(\vec{s}_j) = \sum_{i=1}^{j} \nu_{s_i} + \sum_{m \in C(\vec{s}_j)} \lambda_m,
\]
and $X_{DBMA}$ is the set of states $(\vec{c}_n, \vec{s}_k)$ such that $s_j \notin S(\vec{c}_n)$, $j = 1, \ldots, k$, and $c_i \notin S(\vec{s}_k)$, $i = 1, \ldots, n$.

Figure 5: An example of a nested system.

Moyal, Bušić, and Mairesse show reversibility and a product-form stationary distribution for a FCFM matching model with a General (not necessarily bipartite) Matching (GM) graph, and with sequential individual (non-paired) arrivals, under a given stability condition [35]. For this model, instead of jobs and servers we have “agents” of $J$ different classes, with agent classes corresponding to nodes in the compatibility graph; the set of agent classes compatible with class $c$, $S(c)$, is its set of neighbors in the compatibility graph. The set of valid states, $\mathcal{C}_{GM}$, comprises those states $\vec{c}_n$ such that $c_i \notin S(\vec{c}_n)$, $i = 1, \ldots, n$. Among the arrival processes Moyal et al. consider is i.i.d. arrivals where the probability of a class-$c$ arrival is $\mu_c$. Given the classes of unmatched agents ordered by their arrival times, $\vec{c}_n$, let $\mu(\vec{c}_n)$ be the probability the next arrival is compatible with one of those agents. Again, $\mu(\vec{c}_n)$ satisfies the OI conditions (now in discrete time), so the stationary distribution for the GM model, assuming stability, is
\[
\pi_{GM}(\vec{c}_n) = \pi_{GM}(\emptyset) \prod_{i=1}^{n} \frac{\mu_{c_i}}{\mu(\vec{c}_i)} = \frac{\mu_{c_n}}{\mu(\vec{c}_n)}\, \pi_{GM}(\vec{c}_{n-1}) \quad \text{for } \vec{c}_n \in \mathcal{C}_{GM}.
\]
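The GM product form can be evaluated directly from the compatibility graph. The following is a minimal sketch, not taken from [35]: the four-node graph, the arrival probabilities, and the function names are all hypothetical, chosen only so that every independent set has less arrival probability than its neighborhood. It computes the unnormalized weight $\prod_{i} \mu_{c_i} / \mu(\vec{c}_i)$ and checks the one-step recursion stated above.

```python
def compatible_classes(agents, neighbors):
    """S(c_1..c_i): classes compatible with at least one waiting agent."""
    result = set()
    for a in agents:
        result |= neighbors[a]
    return result


def is_valid_state(state, neighbors):
    """Valid GM states contain no pair of mutually compatible agents."""
    return not (set(state) & compatible_classes(state, neighbors))


def gm_weight(state, arrival_prob, neighbors):
    """Unnormalized pi_GM(state) = prod_i mu_{c_i} / mu(c_1, ..., c_i)."""
    weight = 1.0
    for i, c in enumerate(state, start=1):
        # mu(c_1..c_i): probability the next arrival matches a waiting agent
        neighborhood = compatible_classes(state[:i], neighbors)
        weight *= arrival_prob[c] / sum(arrival_prob[b] for b in neighborhood)
    return weight


# Hypothetical graph (an assumption for illustration): triangle 1-2-3
# with a pendant node 4 attached to node 1.
neighbors = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1}}
arrival_prob = {1: 0.4, 2: 0.2, 3: 0.2, 4: 0.2}

state = (4, 2, 4)  # oldest unmatched agent first
w = gm_weight(state, arrival_prob, neighbors)
print(is_valid_state(state, neighbors), w)
```

Normalizing these weights over all valid states would give $\pi_{GM}$; the sketch stops at the unnormalized weight because the state space is infinite.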
Adan and Weiss consider the Paired Bipartite Matching (PBM) model in which server-job pairs arrive sequentially, where the job is type $i$ and the server is type $j$ independently and with respective probabilities $\lambda_i/\lambda$ and $\mu_j/\mu$, and both unmatched jobs and unmatched servers wait for matches [1]. An arriving job (server) is matched to the first compatible waiting server (job) if there is one and they both immediately leave; otherwise the job (server) waits for a match. Adan and Weiss show that the associated Markov chain satisfies partial balance and has a product-form stationary distribution, under the stability condition. That is,
\[
\pi_{PBM}(\vec{c}_n, \vec{s}_n) = \pi_{PBM}(\emptyset) \prod_{i=1}^{n} \frac{\lambda_{c_i}}{\mu(\vec{c}_i)} \frac{\mu_{s_i}}{\lambda(\vec{s}_i)}.
\]
Note that for the PBM model, there are always the same number of unmatched jobs and servers. Adan et al. show that there exists a unique FCFM (first-come first-matched) matching for the PBM model, and that the process is reversible under an “exchange transformation” that interchanges matching servers and customers [6].
In the previous section we developed product forms for the stationary distributions of the detailed states for variants of OI queues, but these product forms do not readily yield other important performance measures, such as response time distributions. It turns out that we will get simple, elegant results for response times in the collaborative model for a particular system structure called a nested system (see Figure 5 for an example). As noted in Observation 3.11, conditioned on the set of busy servers the noncollaborative queue state has the same sample-path evolution as the collaborative system state for a system with only the busy servers available. A consequence of this result is that our results for collaborative response times (Section 4.1) carry over to noncollaborative queueing times (Section 4.2).

Formally, a nested system is one in which, for any two job classes $i \ne j$, the sets of servers with which they are compatible, $S_i$ and $S_j$, are such that $S_i \subset S_j$ or $S_j \subset S_i$ or $S_i \cap S_j = \emptyset$. This means that nested systems can be recursively defined, starting with their most flexible job class, as follows.

All nested systems have a most flexible job class, $i$, that is compatible with all the servers in the system, and if we remove class $i$ from the system it decomposes into two or more nonoverlapping nested subsystems, each with its own fully flexible job class. These in turn can be decomposed by removing the fully flexible class until we get down to systems consisting of single job classes. Figure 5 shows an example of a nested system; if the fully flexible class 6 is removed, the system decomposes into one nested system consisting of servers 1 and 2 and job classes 1 and 2, and another nested system consisting of servers 3, 4, and 5 and job classes 3, 4, and 5.

We begin our response time derivations with the collaborative model, and first determine the response time of a class that is fully flexible, which, as we will see, has an exponential distribution.
The derivation for the fully flexible class does not require the system to be nested, but later it will help us to develop general response times in nested systems. We note that the results for nested systems were first derived by Gardner et al. [23] using an alternative state descriptor specific to nested systems; here we provide a new derivation that follows directly from the detailed states used in Section 3.

Let $\mathcal{C} = \{\vec{c}_n, n = 0, 1, \ldots\}$ be the set of all states $\vec{c}_n$ for the original collaborative model, and let $\vec{C}$ be a random variable representing the state of the collaborative system in steady state, i.e., $\vec{C} \sim \pi$. We use the subscript $-i$ to represent a reduced system without class $i$, $i = 1, \ldots, J$. From Corollary 3.7, we have that for $\vec{c}_n \in \mathcal{C}_{-i}$,
\[
P\{\vec{C} = \vec{c}_n \mid \vec{C} \in \mathcal{C}_{-i}\} = \pi_{\mathcal{C}_{-i}}(\vec{c}_n) = \pi_{\mathcal{C}_{-i}}(\emptyset) \prod_{k=1}^{n} \frac{\lambda_{c_k}}{\mu(\vec{c}_k)},
\]
where $\pi_{\mathcal{C}_{-i}}(\emptyset) = \pi_{\mathcal{C}}(\emptyset) / P\{\vec{C} \in \mathcal{C}_{-i}\}$.

Suppose there is one class, call it class $J$, that is fully flexible in the bipartite compatibility matching, i.e., $S_J = \{1, \ldots, M\}$. We condition on there being at least one class-$J$ job in the system, $\vec{C} \in \mathcal{C} \setminus \mathcal{C}_{-J}$, so we know all servers will be busy. A possible state is $(\vec{c}_n, J, \vec{a}_m)$, where the first class-$J$ job is in position $n + 1$, $\vec{c}_n \in \mathcal{C}_{-J}$ represents the classes of jobs ahead of the first class-$J$ job in order of arrival, and $\vec{a}_m \in \mathcal{C}$ represents the classes of jobs after the first class-$J$ job in order of arrival. Then, because the denominator for all the terms corresponding to the first class-$J$ job and the jobs after it is the total service rate $\mu$, we have
\[
\pi_{\mathcal{C}}(\vec{c}_n, J, \vec{a}_m) = \pi_{\mathcal{C}}(\emptyset) \prod_{i=1}^{n} \frac{\lambda_{c_i}}{\mu(\vec{c}_i)} \cdot \frac{\lambda_J}{\mu} \cdot \prod_{k=1}^{m} \frac{\lambda_{a_k}}{\mu}.
\]
Let $\vec{C}_{\text{before}}$ be the conditional state of the jobs before the first class-$J$ job, given there is such a job.
Then, for $\vec{c}_n \in \mathcal{C}_{-J}$,
\[
P\{\vec{C}_{\text{before}} = \vec{c}_n\}
= \frac{\pi_{\mathcal{C}}(\emptyset) \prod_{i=1}^{n} \frac{\lambda_{c_i}}{\mu(\vec{c}_i)} \left( \frac{\lambda_J}{\mu} \sum_{m,\, \vec{a}_m \in \mathcal{C}} \prod_{k=1}^{m} \frac{\lambda_{a_k}}{\mu} \right)}
{\left( \frac{\lambda_J}{\mu} \sum_{m,\, \vec{a}_m \in \mathcal{C}} \prod_{k=1}^{m} \frac{\lambda_{a_k}}{\mu} \right) \sum_{j,\, \vec{c}_j \in \mathcal{C}_{-J}} \pi_{\mathcal{C}}(\emptyset) \prod_{i=1}^{j} \frac{\lambda_{c_i}}{\mu(\vec{c}_i)}}
= \frac{\pi_{\mathcal{C}}(\emptyset)}{P\{\vec{C} \in \mathcal{C}_{-J}\}} \prod_{i=1}^{n} \frac{\lambda_{c_i}}{\mu(\vec{c}_i)}
= \pi_{\mathcal{C}_{-J}}(\vec{c}_n).
\]
That is, the first class-$J$ job “sees” the steady-state distribution for the collaborative model with class $J$ removed. Similarly, letting $\vec{C}_{\text{after}}$ be the conditional state for the jobs after the first class-$J$ job, given there is one, we have
\[
P\{\vec{C}_{\text{after}} = \vec{a}_m\} = A \prod_{k=1}^{m} \frac{\lambda_{a_k}}{\mu},
\]
where the normalizing constant is
\[
A = \left( \sum_{m,\, \vec{a}_m \in \mathcal{C}} \prod_{k=1}^{m} \frac{\lambda_{a_k}}{\mu} \right)^{-1}
= \left( \sum_{m=0}^{\infty} \prod_{k=1}^{m} \sum_{i=1}^{J} \frac{\lambda_i}{\mu} \right)^{-1}
= \left( \sum_{m=0}^{\infty} \left( \frac{\lambda}{\mu} \right)^{m} \right)^{-1}
= 1 - \rho,
\]
with $\rho = \lambda/\mu$. Also, given there is at least one class-$J$ job, $\vec{C}_{\text{before}}$ and $\vec{C}_{\text{after}}$ are independent. Finally, letting $N_J$ be the total number of class-$J$ jobs in the system in steady state, we have
\[
P\{N_J \ge 1\} = \frac{\lambda_J}{\mu} \sum_{j,\, \vec{c}_j \in \mathcal{C}_{-J}} \pi_{\mathcal{C}}(\emptyset) \prod_{i=1}^{j} \frac{\lambda_{c_i}}{\mu(\vec{c}_i)} \sum_{m,\, \vec{a}_m \in \mathcal{C}} \prod_{k=1}^{m} \frac{\lambda_{a_k}}{\mu}
= \frac{\lambda_J}{\mu} \cdot \frac{P\{N_J = 0\}}{1 - \rho}.
\]
Substituting $P\{N_J \ge 1\} = 1 - P\{N_J = 0\}$ and solving, we obtain $P\{N_J = 0\} = 1 - \rho_J$, where $\rho_J = \frac{\lambda_J}{\mu - (\lambda - \lambda_J)}$.

Note that $P\{\vec{C}_{\text{after}} = \vec{a}_m\}$ is the same as the probability of state $\vec{a}_m$ in a multiclass M/M/1 queue with service rate $\mu$. Let $\hat{N}$ be the total number of jobs after the first class-$J$ job in steady state. From standard results for the M/M/1 queue, we have that $\hat{N} \sim \text{geom}(1 - \rho)$, where $Y \sim \text{geom}(p)$ means $P\{Y = n\} = p(1 - p)^n$, $n = 0, 1, \ldots$.
We can also obtain this result by summing the product form result above: $P\{\hat{N} = n\} = \sum_{\vec{a}_n \in \mathcal{C}} P\{\vec{C}_{\text{after}} = \vec{a}_n\}$. Each of the $\hat{N}$ jobs is independently class $i$ with probability $\lambda_i/\lambda$, so $\hat{N}_J$, the number of class-$J$ jobs after the first class-$J$ job, is also geometrically distributed, $\hat{N}_J \sim \text{geom}(1 - \rho_J)$. More generally, $\hat{N}_i$, the number of class-$i$ jobs after the first class-$J$ job, has a geometric distribution, $\hat{N}_i \sim \text{geom}\left(1 - \frac{\lambda_i}{\mu - \lambda + \lambda_i}\right)$. This is a consequence of the following simple lemma regarding Bernoulli splitting of geometric random variables, with $p = \rho = \lambda/\mu$ and $q_i = \lambda_i/\mu$; we include the proof for completeness.

Lemma 4.1.
Let $Y \sim \text{geom}(1 - p)$, i.e., $Y$ is the number of failures before the first success in i.i.d. Bernoulli trials with failure probability $p$. Let $Y_i$ be the number of type-$i$ failures before the first success in i.i.d. Bernoulli trials with success probability $1 - p$ and type-$i$ failure probability $q_i$, with $\sum_i q_i = p$, so $Y_i \mid Y \sim \text{Binomial}(Y, q_i/p)$. Then $Y_i \sim \text{geom}(1 - q_i/(q_i + 1 - p))$.

Proof. When we are counting the number of type-$i$ failures before the first success, we can ignore the other types of failures. That is, we can just look at the trials that result in either type-$i$ failures or success. Conditioned on the trial being either a success or a type-$i$ failure, the probability that it is a type-$i$ failure is $q_i/(q_i + 1 - p)$, so $Y_i$ counts failures before the first success in i.i.d. trials with this failure probability.

Because $N_J = I\{N_J > 0\}(\hat{N}_J + 1)$, and $\hat{N}_J \sim \text{geom}(1 - \rho_J)$, and, as we showed above, $P\{N_J = 0\} = 1 - \rho_J$, we have the following.

Corollary 4.2. $N_J \sim \text{geom}(1 - \rho_J)$.

Summarizing our observations so far, we have the following.
Theorem 4.3.
For the collaborative model with a fully flexible job class $J$,
(i) The steady-state distribution for the system conditioned on there being no class-$J$ job is the same as that of a reduced system where there are no class-$J$ jobs, $\pi_{\mathcal{C}_{-J}}$.
(ii) The distribution of the state of the system ahead of the first class-$J$ job, given there is one, is also $\pi_{\mathcal{C}_{-J}}$.
(iii) The distribution of the state of the system after the first class-$J$ job, given there is one, is the same as the distribution of a multiclass M/M/1 queue with arrival rate $\lambda$ and service rate $\mu$.
(iv) The number of class-$J$ jobs in the system in steady state, $N_J$, satisfies $N_J \sim \text{geom}(1 - \rho_J)$, i.e., it is the same as in an M/M/1 queue with arrival rate $\lambda_J$ and service rate $\hat{\mu}_J = \mu - (\lambda - \lambda_J)$.

Let $T_i$ be the response time (total time in system) for a class-$i$ job in steady state for our collaborative model, and let $T^{M/M/1}(\lambda, \mu)$ be the steady-state response time of a job in a standard M/M/1 queue with arrival rate $\lambda$ and service rate $\mu$, i.e., $T^{M/M/1}(\lambda, \mu)$ is exponentially distributed with rate $\mu - \lambda$ as long as $\lambda < \mu$. Let $T_i^Q$ and $T_Q^{M/M/1}(\lambda, \mu)$ be similarly defined for steady-state time in queue.

Corollary 4.4.
For the collaborative model with a fully flexible job class $J$,
(i) $\pi_{\mathcal{C}}(\emptyset) = \pi_{\mathcal{C}_{-J}}(\emptyset)(1 - \rho_J)$,
(ii) $T_J \sim T^{M/M/1}(\lambda_J, \hat{\mu}_J) \sim T^{M/M/1}(\lambda, \mu)$, and $T_J^Q \sim T_Q^{M/M/1}(\lambda_J, \hat{\mu}_J)$.

Proof. (i) From (i) and (iv) of Theorem 4.3 we have $\pi_{\mathcal{C}_{-J}}(\emptyset) = \pi_{\mathcal{C}}(\emptyset)/P\{N_J = 0\} = \pi_{\mathcal{C}}(\emptyset)/(1 - \rho_J)$.

(ii) Distributional Little’s law tells us, for any $\lambda_a$ and $L$, that if the number of jobs in a queueing system is geometrically distributed with mean $L$, jobs arrive at rate $\lambda_a$, and jobs are served in FCFS order, then the response time is exponentially distributed with mean $L/\lambda_a$. The result follows from (iv) of Theorem 4.3 with arrival rate $\lambda_a = \lambda_J$ and mean number in system
\[
L = \frac{\rho_J}{1 - \rho_J} = \frac{\lambda_J}{(\mu - (\lambda - \lambda_J)) - \lambda_J} = \frac{\lambda_J}{\mu - \lambda}.
\]
Thus, the queueing system for class-$J$ jobs in steady state is stochastically indistinguishable from a single-class M/M/1 queue with only class-$J$ jobs and with effective service rate $\hat{\mu}_J = \mu - (\lambda - \lambda_J)$.

Our results for a fully flexible class in the collaborative model can be extended to general OI queues. Suppose we have an OI queue, so the service rate as a function of the ordered list of job classes, $\mu(\vec{c}_n)$, satisfies conditions (i)-(iii) of Section 3.1, and suppose there is a maximal service rate $\mu$, such that $\mu(\vec{c}_n) \le \mu$ for any state $\vec{c}_n$. Also suppose there is a job class $J$ such that for any state $\vec{c}_n$ in which the first class-$J$ job is in position $k$, $k \le n$, $\mu(\vec{c}_n) = \mu(\vec{c}_k) = \mu$. Then a class-$J$ job will “block” jobs behind it in the OI queue in the same way a fully flexible job blocks jobs behind it in the skill-based collaborative queue, and Theorem 4.3 and Corollary 4.4 still hold.

Recall that a nested system has a fully flexible job class, $J$, and if class $J$ is removed, it decomposes into two or more nonoverlapping nested subsystems.
Thus, each job class $i$, by removing job classes $j$ such that $S_i \subset S_j$ or $S_i \cap S_j = \emptyset$, defines a nested subsystem with servers $S_i$ and job classes $j$ that require servers $S_j \subseteq S_i$, and where class $i$ is fully flexible. That is, for a subset $S$ of servers, let $R(S) = \overline{C(\overline{S})} = \{1, \ldots, N\} \setminus C(\{1, \ldots, M\} \setminus S)$ be the job classes that require (i.e., that are only compatible with) servers in $S$. The nested subsystem defined by job class $i$ consists of servers $k \in S_i$ and job classes $j \in R(S_i)$, i.e., it is the reduced system without the servers $\{1, \ldots, M\} \setminus S_i$. Let $\hat{\mu}_i = \mu(S_i) - \lambda(R(S_i)) + \lambda_i$ be the effective service capacity for class $i$ in this subsystem, and let $\rho_i = \lambda_i/\hat{\mu}_i$. We will show that the overall response time for class-$i$ jobs is the sum of the queueing times for classes $j$ with $S_i \subset S_j$, plus the response time for class $i$ given those classes are gone (so it is the most flexible class in its subsystem). Note that, as we observed for class $J$, $T^{M/M/1}(\lambda_i, \hat{\mu}_i) \sim T^{M/M/1}(\lambda(R(S_i)), \mu(S_i))$.

Theorem 4.5.
In a nested collaborative system, for any job class $i$,
\[
T_i \sim T^{M/M/1}(\lambda_i, \hat{\mu}_i) + \sum_{j : S_i \subset S_j} T_Q^{M/M/1}(\lambda_j, \hat{\mu}_j),
\]
where all the terms are independent. Also, $\pi_{\mathcal{C}}(\emptyset) = \prod_{j=1}^{J} (1 - \rho_j)$.

Proof. We start with the response time result. Let class $G$ be fully flexible in one of the subsystems obtained when class $J$ is removed. That is, $G$ is such that $S_j \subset S_G$ or $S_j \cap S_G = \emptyset$ for all $j \ne G, J$. We will show that $T_G \sim T_Q^{M/M/1}(\lambda_J, \hat{\mu}_J) + T^{M/M/1}(\lambda_G, \hat{\mu}_G)$. The result will follow by repeating the argument.

From PASTA and (iv) of Theorem 4.3, an arriving (tagged) class-$G$ job in steady state will “see” $N_J$ class-$J$ jobs in the system, and it will not be able to start service until all of those $N_J$ class-$J$ jobs have left the system. That is, if there are class-$J$ jobs in the system, the tagged job must wait until the end of a class-$J$ busy period, which, for an M/M/1 queue, is the same as the class-$J$ response time. Thus, the time the tagged job must wait until the system is empty of class-$J$ jobs is $I\{N_J > 0\}\, T_J \sim T_J^Q \sim T_Q^{M/M/1}(\lambda_J, \hat{\mu}_J)$.

If $N_J = 0$ when the tagged class-$G$ job arrives, then from (i) of Theorem 4.3, it will “see” the reduced system in steady state, with distribution $\pi_{\mathcal{C}_{-J}}$. If $N_J > 0$, then, from quasi-reversibility, the state left behind by a class-$J$ job will have the same distribution as that seen upon arrival. Therefore, given it is the last class-$J$ job, i.e., it leaves behind no class-$J$ jobs, the state it leaves behind has the distribution $\pi_{\mathcal{C}_{-J}}$, again from (i) of Theorem 4.3. Thus, once there are no class-$J$ jobs, the tagged job sees independent subsystems defined by the fully flexible class in each. The subsystems that do not include class $G$ will have no effect on our tagged job. Hence, applying Corollary 4.4 to the subsystem with $G$ instead of $J$ as the most flexible class, we have that the class-$G$ response time given there are no class-$J$ jobs is $T_G \mid N_J = 0 \sim T^{M/M/1}(\lambda_G, \hat{\mu}_G)$, and the overall response time result follows.

From our earlier observations, $\pi_{\mathcal{C}_{-J}}(\emptyset) = \pi_{\mathcal{C}}(\emptyset)/P\{\vec{C} \in \mathcal{C}_{-J}\} = \pi_{\mathcal{C}}(\emptyset)/P\{N_J = 0\} = \pi_{\mathcal{C}}(\emptyset)/(1 - \rho_J)$, so $\pi_{\mathcal{C}}(\emptyset) = (1 - \rho_J)\, \pi_{\mathcal{C}_{-J}}(\emptyset)$. If there are no class-$J$ jobs, the system decomposes into $K$ independent subsystems, each with its own fully flexible class, $G_k$, $k = 1, \ldots, K$. Writing $\pi_k$ for the stationary distribution of subsystem $k$ and $\pi_{k, -G_k}$ for that of subsystem $k$ with class $G_k$ removed, independence and Corollary 4.4(i) give
\[
\pi_{\mathcal{C}_{-J}}(\emptyset) = \prod_{k=1}^{K} \pi_k(\emptyset) = \prod_{k=1}^{K} (1 - \rho_{G_k})\, \pi_{k, -G_k}(\emptyset).
\]
Repeating the argument within each subsystem we get $\pi_{\mathcal{C}}(\emptyset) = \prod_{i=1}^{J} (1 - \rho_i)$.

We have already established that the effective service time of the fully flexible class $J$, $S_J^{\text{eff}} = T_J - T_J^Q$, is exponentially distributed with rate $\hat{\mu}_J = \mu - (\lambda - \lambda_J)$. We can also see this from our result above. Define the effective service time of a (tagged) class-$J$ job as the time from which it first has no class-$J$ jobs ahead of it until it completes service.
At the time this effective service period starts, the system the tagged job sees will decompose into $K$ independent subsystems, each with its own fully flexible class, $G_k$, $k = 1, \ldots, K$, and the tagged job will join each of those subsystems as a fully flexible job for the subsystem (viewing the collaborative model as a cancel-on-completion redundancy system). From Corollary 4.4 applied to $G_k$ in subsystem $k$, the response time of the fully flexible class within the subsystem will have the same distribution as the response time in the corresponding M/M/1 queue, so
\[
S_J^{\text{eff}} = \min_{k = 1, \ldots, K} \left\{ T^{M/M/1}\big(\lambda_{G_k},\, \mu(S_{G_k}) - (\lambda(R(S_{G_k})) - \lambda_{G_k})\big) \right\}
\sim \text{Exp}\left( \sum_{k=1}^{K} \big( \mu(S_{G_k}) - \lambda(R(S_{G_k})) \big) \right)
\sim \text{Exp}(\mu - (\lambda - \lambda_J)),
\]
using the fact that the minimum of independent exponentials is exponential with the sum of the rates, together with $\sum_k \mu(S_{G_k}) = \mu$ and $\sum_k \lambda(R(S_{G_k})) = \lambda - \lambda_J$.

As an example, consider the W model in which class-$i$ jobs can only be served by server $i$, $i = 1, 2$, and class-3 jobs can be served by either server. Then $T_3 \sim T^{M/M/1}(\lambda_3, \mu - \lambda_1 - \lambda_2)$ and $T_i \sim T_Q^{M/M/1}(\lambda_3, \mu - \lambda_1 - \lambda_2) + T^{M/M/1}(\lambda_i, \mu_i)$, $i = 1, 2$.

Let $T_i^Q \mid B$ be the stationary time in the job queue for a class-$i$ job in the noncollaborative model, given that the set of busy servers is $B = \{1, \ldots, M\}$ (i.e., all servers are busy). Then, from Observation 3.11, we know $T_i^Q \mid B$ has the same distribution as the response time for class-$i$ jobs in the collaborative model. Therefore, from Theorem 4.5, we have the following.

Theorem 4.6.
In a nested noncollaborative system, for any job class $i$, given busy servers $B = \{1, \ldots, M\}$,
\[
T_i^Q \mid B \sim T^{M/M/1}(\lambda_i, \hat{\mu}_i) + \sum_{j : S_i \subset S_j} T_Q^{M/M/1}(\lambda_j, \hat{\mu}_j),
\]
where $\hat{\mu}_j = \mu(S_j) - \lambda(R(S_j)) + \lambda_j$ and all the terms are independent.

The result can be generalized for class $i$, if some servers are idle but all the servers in $S_i$ are busy, as follows. Fix $i$ and the set of busy servers $B \supseteq S_i$, and let $y$ be such that $S_i \subseteq S_y \subseteq B$ and $\nexists\, j \ne y$ such that $S_y \subset S_j \subseteq B$. That is, class $y$ determines a nested subsystem of busy servers in which class $y$ is fully flexible, and there are no jobs of class $j$ such that $S_y \subset S_j$ in the job queue. Therefore, an arriving class-$i$ job sees a reduced system, without the servers $\{1, \ldots, M\} \setminus S_y$, consisting only of the servers in $S_y$ and job classes $j \in R(S_y)$, and in which all the servers in $S_y$ are busy. We have the following.

Corollary 4.7.
In a nested noncollaborative system, for any job class $i$, given the servers in $B \supseteq S_i$ are busy,
\[
T_i^Q \mid B \sim T^{M/M/1}(\lambda_i, \hat{\mu}_i) + \sum_{j : S_i \subset S_j \subseteq S_y} T_Q^{M/M/1}(\lambda_j, \hat{\mu}_j),
\]
where $\hat{\mu}_j = \mu(S_j) - \lambda(R(S_j)) + \lambda_j$ and all the terms are independent.
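The decomposition in Theorems 4.5 and 4.6 is easy to evaluate numerically. The following is a minimal sketch, assuming a nested system with distinct server sets per class; the function names and the numerical rates for the W model are hypothetical and chosen only for illustration. It computes $R(S)$, the effective rates $\hat{\mu}_j$, the mean of $T_i$ in the collaborative model (equivalently, the mean of $T_i^Q \mid B$ in the noncollaborative model when all servers are busy), and $\pi_{\mathcal{C}}(\emptyset) = \prod_j (1 - \rho_j)$.

```python
def required_classes(servers, compat):
    """R(S): job classes whose compatible servers all lie in S."""
    return {c for c, s in compat.items() if s <= servers}


def effective_rate(i, compat, lam, mu):
    """mu_hat_i = mu(S_i) - lambda(R(S_i)) + lambda_i."""
    s_i = compat[i]
    return (sum(mu[m] for m in s_i)
            - sum(lam[c] for c in required_classes(s_i, compat))
            + lam[i])


def mean_response_time(i, compat, lam, mu):
    """E[T_i] from Theorem 4.5: one M/M/1 response time plus the
    M/M/1 queueing times of every strictly more flexible class."""
    mu_hat_i = effective_rate(i, compat, lam, mu)
    total = 1.0 / (mu_hat_i - lam[i])            # mean of Exp(mu_hat_i - lam_i)
    for j, s_j in compat.items():
        if compat[i] < s_j:                       # S_i is a strict subset of S_j
            mu_hat_j = effective_rate(j, compat, lam, mu)
            rho_j = lam[j] / mu_hat_j
            total += rho_j / (mu_hat_j - lam[j])  # mean M/M/1 time in queue
    return total


def empty_probability(compat, lam, mu):
    """pi(empty) = prod_j (1 - rho_j), with rho_j = lam_j / mu_hat_j."""
    prob = 1.0
    for j in compat:
        prob *= 1.0 - lam[j] / effective_rate(j, compat, lam, mu)
    return prob


# W model with hypothetical rates: classes 1 and 2 are dedicated,
# class 3 can use either server.
compat = {1: frozenset({1}), 2: frozenset({2}), 3: frozenset({1, 2})}
lam = {1: 0.3, 2: 0.2, 3: 0.4}
mu = {1: 1.0, 2: 1.0}

for i in compat:
    print(i, effective_rate(i, compat, lam, mu),
          mean_response_time(i, compat, lam, mu))
print(empty_probability(compat, lam, mu))
```

With these rates $\hat{\mu}_3 = \mu - \lambda_1 - \lambda_2 = 1.5$ and $\hat{\mu}_i = \mu_i$ for $i = 1, 2$, which agrees with the W model expressions given earlier.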
We can use our results for queueing times to obtain response time distributions for the special case in which the service rate is the same at all servers; that is, $\mu_j = \mu/M$ for all servers $j = 1, \ldots, M$. We do this by conditioning on the set of busy servers seen by an arriving job. Define $I_i$ as an indicator that all the servers in $S_i$ are busy (other servers may also be busy). If an arriving class-$i$ job finds an idle compatible server, it will immediately enter service; otherwise it must wait in the job queue before entering service. Hence we have the following, where $T_i$ is the class-$i$ response time.

Corollary 4.8.
In a nested noncollaborative system, for any job class $i$, given the servers in $B \supseteq S_i$ are busy,
\[
T_i \sim \text{Exp}(\mu/M) + I_i \left( T^{M/M/1}(\lambda_i, \hat{\mu}_i) + \sum_{j : S_i \subset S_j \subseteq S_y} T_Q^{M/M/1}(\lambda_j, \hat{\mu}_j) \right),
\]
where $\hat{\mu}_j = \mu |S_j|/M - \lambda(R(S_j)) + \lambda_j$ and all the terms are independent.

Unlike for the collaborative model, the results in the previous section require $\mu_j = \mu/M$ for all servers $j = 1, \ldots, M$. This condition ensures that a job’s service time is the same regardless of the server on which it runs. If we were instead to allow different servers to have different rates, the analysis would change in several ways. First, for a job that finds multiple compatible servers idle, we need to further condition on the server on which the job runs. Under RAIS this is determined probabilistically according to the assignment rule; the probabilities can be determined using the process described in [37]. Under ALIS this is determined by which server has been idle longest. Second, for a job that finds all compatible servers busy we still need to determine the server on which the job ultimately runs. With the current approach, we would need to determine the probability that a class-$i$ job completes on server $j$ in the collaborative model; this would then be equal to the probability that a class-$i$ job runs on server $j$ in the noncollaborative model. Unfortunately, computing this quantity appears to be complicated.

A final challenge in the noncollaborative model is that it is difficult, in general, to compute the probability that various subsets of servers are busy. While this analysis is tractable in certain small nested systems, for example, in the W model, the form of these probabilities is not particularly clean or intuitive. In larger nested systems, we believe that the probabilities needed to perform the requisite conditioning are unlikely to have a clean closed-form solution, even for a symmetric nested system.
Section 4 provides one approach for understanding the form of the per-class response time distributions innested systems. In this section, we turn to a second approach that uses an alternative, partially aggregated,state description, which gives us conditional queueing times, given the busy servers in the order of the jobsthey are serving, for general, possibly non-nested, systems. Like the detailed states considered in Section 3,the partially aggregated states also provide a Markov description for the model and also yield a productform stationary distribution.
Instead of tracking the classes of all jobs in the system, we now track the number of jobs in the queue in between jobs in service, but not their individual classes. Let $l$ denote the number of jobs currently in service. The partially aggregated state includes the vector $\vec{n}_l = (n_1, \ldots, n_l)$, where $n_i$ denotes the number of jobs in the queue (not in service) that arrived after the $i$th job in service and before the $(i+1)$st job in service. Under both ALIS and RAIS, we track the busy servers in the arrival order of the jobs they are serving (but not the classes of the jobs in service); for the ALIS version we also track the idle servers in the order in which they became idle.

The partially aggregated state description for noncollaborative models was first introduced by Adan, Visschers, and Weiss [1, 37]. In these papers, the stationary distribution was derived directly using partial balance for the partially aggregated states. In this section we provide an alternative derivation that involves aggregating the stationary probabilities for the detailed states discussed in Section 3.

Let us first consider the noncollaborative model with the RAIS policy, in which arrivals finding multiple idle compatible servers are assigned to a server at random with appropriate probabilities that depend on the set of busy servers. The partially aggregated state is $(\vec{b}_l, \vec{n}_l)$, where $l$ is the number of busy servers (which in the noncollaborative model is the same as the number of jobs in service), $\vec{b}_l$ is the set of busy servers in order of the arrival times of the jobs they are serving, and $n_i$, $i = 1, \ldots, l$, is the number of jobs waiting for one of busy servers $b_1, \ldots, b_i$. Thus, server $b_1$ is serving the oldest job, the next $n_1$ jobs to have arrived are waiting for (require) server $b_1$, i.e., their classes are in $R(b_1)$, the $(n_1 + 2)$nd oldest job is being served by $b_2$, the next $n_2$ jobs are only compatible with $b_1$ or $b_2$ or both, i.e., their classes are in $R(\vec{b}_2)$, and so on. The corresponding detailed state, $\vec{z}_m$, is such that $z_1 = b_1$, $z_{n_1 + 2} = b_2$, etc., and $m = l + \sum_{i=1}^{l} n_i$. Thus, in the partially aggregated state $(\vec{b}_l, \vec{n}_l)$ there are $l$ jobs in service and $\sum_{i=1}^{l} n_i$ jobs in the queue. When the set of busy servers is $\vec{b}_j$, let $\lambda^a_s(\vec{b}_j)$ represent the activation rate of idle server $s \notin \{b_1, \ldots, b_j\}$ (the rate of going from state $(\vec{b}_j, \vec{n}_j)$ to $((\vec{b}_j, s), (\vec{n}_j, 0))$). The assignment probabilities must satisfy the following assignment condition for routing a compatible job to idle server $b_j$ given busy servers $\vec{b}_{j-1}$: $\prod_{j=1}^{l} \lambda^a_{b_j}(\vec{b}_{j-1})$ must be the same for any permutation of $b_1, \ldots, b_l$. This is the same assignment condition that we use in Section 3 for the detailed state description.

Let $\alpha(\vec{b}_j) = \frac{\lambda(R(\vec{b}_j))}{\mu(\vec{b}_j)}$. We use the notation $\pi_{RAIS'}$ to denote the partially aggregated stationary distribution under RAIS (in contrast with $\pi_{RAIS}$, which denotes the stationary distribution for the detailed state description).
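The load factors $\alpha(\vec{b}_j)$ depend only on the ordered prefix of busy servers. The following is a minimal sketch of that computation; it assumes that $\mu(\vec{b}_j)$ is the sum of the service rates of servers $b_1, \ldots, b_j$, and the function name and the W-model rates are hypothetical.

```python
def alpha_prefixes(busy, compat, lam, mu):
    """Return [alpha(b_1), alpha(b_1, b_2), ...] for the ordered busy servers.

    alpha(b_1..b_j) = lambda(R(b_1..b_j)) / mu(b_1..b_j), where R(.) is the
    set of classes that can only be served by servers in the prefix.
    Assumption: mu(b_1..b_j) is the total rate of the servers in the prefix.
    """
    alphas = []
    prefix = set()
    for b in busy:
        prefix.add(b)
        required = [c for c, s in compat.items() if s <= prefix]
        arrival_rate = sum(lam[c] for c in required)
        service_rate = sum(mu[m] for m in prefix)
        alphas.append(arrival_rate / service_rate)
    return alphas


# W model with hypothetical rates; server 2 holds the oldest job in service.
compat = {1: frozenset({1}), 2: frozenset({2}), 3: frozenset({1, 2})}
lam = {1: 0.3, 2: 0.2, 3: 0.4}
mu = {1: 1.0, 2: 1.0}

alphas = alpha_prefixes((2, 1), compat, lam, mu)
print(alphas)
```

Here only class-2 jobs can queue behind the first busy server, while all three classes can queue once both servers are busy, so the second factor is the overall load $\lambda/\mu$.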
Theorem 5.1. (Visschers et al. [37])
\[
\begin{aligned}
\pi_{RAIS'}(\vec{b}_l, \vec{n}_l)
&= \pi_{RAIS}(\emptyset) \prod_{j=1}^{l} \frac{\lambda^a_{b_j}(\vec{b}_{j-1})}{\mu(\vec{b}_j)}\, \alpha(\vec{b}_j)^{n_j}
= \pi_{RAIS}(\emptyset) \prod_{j=1}^{l} \alpha(\vec{b}_j)^{n_j} \prod_{j=1}^{l} \frac{\lambda^a_{b_j}(\vec{b}_{j-1})}{\mu(\vec{b}_j)} \\
&= \pi_{RAIS}(\vec{n}_l \mid \vec{b}_l)\, \pi_{RAIS}(\vec{b}_l)
= \prod_{j=1}^{l} \big(1 - \alpha(\vec{b}_j)\big)\, \alpha(\vec{b}_j)^{n_j}\, \pi_{RAIS}(\vec{b}_l).
\end{aligned}
\]

The proof given by Visschers et al. involves showing directly that local balance holds for the partially aggregated states. Below we give an alternative proof, which follows by summing the stationary probabilities (given in Theorem 3.10) of states $\vec{z}_m$ that are consistent with $(\vec{b}_l, \vec{n}_l)$.

Proof.
We begin by recalling that $\vec{z}_m$ is an interleaving of states $\vec{c}_n$ and $\vec{b}_l$, where $m = n + l$. That is, letting $k_i = \sum_{j=1}^{i} n_j$, we can write
\[
\vec{z}_m = (b_1, c_1, \ldots, c_{n_1}, b_2, \ldots, b_j, c_{k_{j-1}+1}, \ldots, c_{k_{j-1}+n_j}, b_{j+1}, \ldots, b_l, c_{k_{l-1}+1}, \ldots, c_{k_{l-1}+n_l}).
\]
Let $C(\vec{b}_l, \vec{n}_l)$ denote the set of states $\vec{z}_m$ compatible with $(\vec{b}_l, \vec{n}_l)$. We then have
\[
\begin{aligned}
\pi_{RAIS'}(\vec{b}_l, \vec{n}_l)
&= \sum_{\vec{z}_m \in C(\vec{b}_l, \vec{n}_l)} \pi_{RAIS}(\vec{z}_m) \\
&= \pi_{RAIS}(\emptyset) \sum_{\vec{z}_m \in C(\vec{b}_l, \vec{n}_l)} \prod_{i=1}^{m} \frac{\lambda_{z_i}(\vec{z}_i)}{\mu(\vec{z}_i)} \\
&= \pi_{RAIS}(\emptyset) \sum_{\vec{z}_m \in C(\vec{b}_l, \vec{n}_l)} \prod_{j=1}^{l} \left( \frac{\lambda^a_{b_j}(\vec{b}_{j-1})}{\mu(\vec{b}_j)} \prod_{i=k_{j-1}+1}^{k_{j-1}+n_j} \frac{\lambda_{c_i}}{\mu(\vec{b}_j)} \right) \\
&= \pi_{RAIS}(\emptyset) \prod_{j=1}^{l} \left( \frac{\lambda^a_{b_j}(\vec{b}_{j-1})}{\mu(\vec{b}_j)} \prod_{i=k_{j-1}+1}^{k_{j-1}+n_j} \frac{\sum_{c \in R(\vec{b}_j)} \lambda_c}{\mu(\vec{b}_j)} \right) \\
&= \pi_{RAIS}(\emptyset) \prod_{j=1}^{l} \left( \frac{\lambda^a_{b_j}(\vec{b}_{j-1})}{\mu(\vec{b}_j)} \left( \frac{\lambda(R(\vec{b}_j))}{\mu(\vec{b}_j)} \right)^{n_j} \right) \\
&= \pi_{RAIS}(\emptyset) \prod_{j=1}^{l} \alpha(\vec{b}_j)^{n_j} \prod_{j=1}^{l} \frac{\lambda^a_{b_j}(\vec{b}_{j-1})}{\mu(\vec{b}_j)}.
\end{aligned}
\]

We now turn to the ALIS policy, in which arrivals finding multiple idle compatible servers are assigned to the one that has been idle longest. Because the order of the idle servers now affects the system evolution, we now consider the aggregate state $(\vec{s}_{M-l}, \vec{b}_l, \vec{n}_l)$, where $\vec{s}_{M-l}$ is the set of $M - l$ idle servers in the order in which they became idle, and $\vec{b}_l$ and $\vec{n}_l$ are defined as in the RAIS model.

Theorem 5.2.
(Adan and Weiss [1])
$$\pi_{\mathrm{ALIS}'}(\vec s_{M-l}, \vec b_l, \vec n_l) = \pi_{\mathrm{ALIS}}(\emptyset, \emptyset)\, \frac{\prod_{j=1}^{l} \alpha(R(\vec b_j))^{n_j}}{\prod_{j=1}^{l} \mu(\vec b_j) \prod_{j=1}^{M-l} \lambda(\vec s_j)},$$
where $\pi_{\mathrm{ALIS}}(\emptyset, \emptyset)$ is a normalizing constant.

As noted by Adan and Weiss [1], aggregating the stationary distribution under ALIS over all permutations of the idle servers, $\vec s_{M-l}$, yields the same stationary distribution as under RAIS.

Corollary 5.3.
$$\sum_{P(\vec s_{M-l})} \pi_{\mathrm{ALIS}'}(\vec s_{M-l}, \vec b_l, \vec n_l) = \pi_{\mathrm{RAIS}'}(\vec b_l, \vec n_l).$$

Corollary 5.3 tells us that the conditional stationary distribution of the time in queue, given the set of busy servers, is the same under ALIS as under RAIS. Under both policies, conditioned on $\vec b_l$, the number of jobs waiting in the queue between busy servers $b_j$ and $b_{j+1}$ is geometrically distributed with parameter $1 - \alpha(R(\vec b_j)) = 1 - \lambda(R(\vec b_j))/\mu(\vec b_j)$, from Theorem 5.1. Moreover, each of these jobs is of class $c$ with probability $\lambda_c/\lambda(R(\vec b_j))$. Therefore, from Lemma 4.1, conditioned on $\vec b_l$, the number of class-$i$ jobs waiting in the queue between busy servers $b_j$ and $b_{j+1}$ is geometrically distributed with parameter
$$1 - \frac{\lambda_i}{\mu(\vec b_j) - \lambda(R(\vec b_j)) + \lambda_i}.$$
Hence, from distributional Little's law, and because class-$i$ jobs are served in order, the time a class-$i$ job will spend in the “subsystem” behind the servers $\vec b_j$ is the same as the response time for a standard M/M/1 queue with arrival rate $\lambda_i$ and service rate $\mu(\vec b_j) - \lambda(R(\vec b_j)) + \lambda_i$. Therefore, depending on $\vec b_l$, a job arriving in steady state will either start service immediately, or wait a sum of exponential times before entering service.

Theorem 5.4.
The queueing time for a class-$i$ job, given busy servers $\vec b_l$, is
$$T^i_Q(\vec b_l) = I(i \in R(\vec b_l)) \sum_{j = f(i, \vec b_l)}^{l} T_{M/M/1}\bigl(\lambda_i,\ \mu(\vec b_j) - \lambda(R(\vec b_j)) + \lambda_i\bigr),$$
where $I(\cdot)$ is the indicator function, and
$$f(i, \vec b_l) = \arg\min\{j : 0 \le j \le l,\ i \in R(\vec b_j)\} = \arg\max\{j : 0 \le j \le l,\ b_j \in S(i)\}$$
is the largest indexed busy machine that is compatible with job class $i$.

We note that, unlike the results for nested systems derived in Section 4, here the per-class queueing time distribution is conditioned on the ordered vector of busy servers. In general it is not straightforward to obtain closed-form expressions for the probability that a job sees a particular $\vec b_l$. Hence, while this form is insightful in terms of interpreting the time in queue as that in a tandem series of M/M/1 queues, the form does not permit an easy derivation of mean time in queue or other exact performance metrics.

In this section we note briefly that a similar partially aggregated state can be defined for the collaborative model. Here the partially aggregated state description is $(\vec d_l, \vec n_l)$, where $\vec d_l$ gives the classes of all jobs currently in service, and $\vec n_l$ gives the number of jobs in the queue (receiving no service) in between those jobs in service. This is similar to the $(\vec b_l, \vec n_l)$ state used for RAIS, except that now we track the classes of the jobs in service rather than the servers processing these jobs. We define $\mu(\vec d_i)$ as the total service rate given to the first $i$ jobs that are receiving service, and $R(\vec d_i)$ as the classes of jobs that require one of the servers serving the jobs in $\vec d_i$. That is, $c \in R(\vec d_i)$ if $S_c \subseteq S(\vec d_i)$. Note that for $d_i$ to be in service, given $\vec d_{i-1}$, we must have $d_i \notin \vec d_{i-1}$.
Let $\alpha(\vec d_i) = \lambda(R(\vec d_i)) / \mu(\vec d_i)$.

Proposition 5.5.
For $l = 0, \ldots, M$, $\vec d_l$ such that $d_i \notin \vec d_{i-1}$ for $i = 2, \ldots, l$, and $n_i = 0, 1, \ldots$ for $i = 1, \ldots, l$,
$$\pi_{C'}(\vec d_l, \vec n_l) = \pi_C(\emptyset) \prod_{j=1}^{l} \frac{\lambda_{d_j}}{\mu(\vec d_j)}\, \alpha(\vec d_j)^{n_j}.$$

The proof follows a similar argument to the proof of Theorem 5.1 by aggregating detailed states consistent with $(\vec d_l, \vec n_l)$; we omit the details.

As for the noncollaborative system, we can use Proposition 5.5 and the distributional form of Little's Law to determine the distribution of $T^i_Q(\vec d_l)$, the conditional waiting time until a job starts service on at least one server for any class $i$, given $\vec d_l$.

Corollary 5.6.
$$T^i_Q(\vec d_l) = I(i \in R(\vec d_l)) \sum_{j = f(i, \vec d_l)}^{l} T_{M/M/1}\bigl(\lambda_i,\ \mu(\vec d_j) - \lambda(R(\vec d_j)) + \lambda_i\bigr),$$
where $I(\cdot)$ is the indicator function, and
$$f(i, \vec d_l) = \arg\min\{j : 0 \le j \le l,\ i \in R(\vec d_j)\} = \arg\max\{j : 0 \le j \le l,\ S(d_j) \cap S(i) \ne \emptyset\}$$
is the largest indexed job in service that is using a server in $S_i$.

Corollary 5.7.
$$T^i_Q \sim \sum_{l} \sum_{\vec d_l} T^i_Q(\vec d_l)\, I(\vec d_l).$$

In this section we consider class-based performance measures for the OI queue (and hence the collaborative model, and for the queue of the noncollaborative model given all servers busy) with general, non-nested, bipartite structures. Here it is useful to define the per-class aggregated state $x = (x_1, \ldots, x_N)$, where $x_i$ is the total number of type $i$ jobs. Let $C(x)$ be the set of states of the form $\vec c_n$ that are consistent with $x$, i.e., $\vec c_n \in C(x)$ if and only if $x_i = \sum_{j=1}^{n} I\{c_j = i\}$ for all classes $i$, so $n = \sum_i x_i$.
Abusing notation, let $\mu(x)$ be the total service rate in state $x$; from OI property (ii) (see Section 3), $\mu(x) = \mu(\vec c_n)$ for all states $\vec c_n \in C(x)$. Then $\pi_X(x) = \sum_{\vec c_n \in C(x)} \pi(\vec c_n)$, where $\pi_X(x)$ is the steady-state probability of aggregate state $x$ and, by aggregating the partial balance equations, is given by
$$\mu(x)\, \pi_X(x) = \sum_{i : x_i > 0} \lambda_i\, \pi_X(x - e_i),$$
where $e_i$ is a vector of appropriate length containing a 1 in position $i$ and 0's elsewhere. Then $\pi_X(x)$ also has a product form. The result follows from summing the product form characterizations of $\pi(\vec c_n)$.

Theorem 6.1. (Bonald and Comte [15], Krzesinski [34])
$$\pi_X(x) = \pi^C_X(\emptyset)\, \Phi(x) \prod_{i=1}^{N} \lambda_i^{x_i},$$
where
$$\Phi(x) = \frac{1}{\mu(x)} \sum_{i : x_i > 0} \Phi(x - e_i), \qquad \Phi(\emptyset) = 1,$$
and $\pi_X(\emptyset) = \pi_C(\emptyset)$ is a normalizing constant equal to the probability that the system is empty.

Note that the aggregate state description does not capture the dynamics of the OI queue and is not a Markov description for the original system. While the order of the jobs, given $x$, does not matter for the total service rate, $\mu(x)$, it does matter for the amount of service received by the $j$'th job in the queue, $\Delta_j(\vec c_j) = \Delta_j(\vec c_n)$. We therefore need to know $\vec c_n$ (or at least the class of the job in service on each server) to know the rate out of the state due to a class-$i$ departure.

As observed by Bonald and Comte, the stationary aggregate distribution $\pi^C_X(x)$ given in Theorem 6.1 is also the stationary distribution of a single-server system consisting of $N$ job classes with Poisson arrivals at rates $\lambda_i$ and state-dependent exponential service rates such that class-$i$ jobs are served according to processor sharing at rate
$$\phi_i(x) = \frac{\Phi(x - e_i)}{\Phi(x)}\, I(x_i > 0).$$
We call the single-server model with service rates $\phi_i(x)$ the aggregate model.
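As an illustration of the recursion for $\Phi$, the following Python sketch evaluates $\Phi(x)$, the unnormalized probabilities $\Phi(x) \prod_i \lambda_i^{x_i}$, and the aggregate-model rates $\phi_i(x)$ on a small instance. The two-server, two-class example (`MU`, `LAM`, `COMPAT`) and all function names are illustrative assumptions, not taken from the paper. The total rate $\mu(x)$ is computed under the assumption of the collaborative model, in which every server compatible with at least one class present in $x$ is working.

```python
from functools import lru_cache

# Hypothetical example (not from the paper): 2 servers, 2 job classes.
MU = (1.0, 1.0)            # MU[k]: service rate of server k
LAM = (0.5, 0.5)           # LAM[i]: arrival rate of class i
COMPAT = ((0,), (0, 1))    # COMPAT[i]: servers able to serve class i


def total_rate(x):
    """mu(x): sum of the rates of all servers compatible with a present class."""
    busy = {k for i, count in enumerate(x) if count > 0 for k in COMPAT[i]}
    return sum(MU[k] for k in busy)


def minus_unit(x, i):
    """Return x - e_i as a tuple."""
    return x[:i] + (x[i] - 1,) + x[i + 1:]


@lru_cache(maxsize=None)
def phi_weight(x):
    """Phi(x) = (1 / mu(x)) * sum over {i : x_i > 0} of Phi(x - e_i), Phi(0) = 1."""
    if sum(x) == 0:
        return 1.0
    total = sum(phi_weight(minus_unit(x, i)) for i in range(len(x)) if x[i] > 0)
    return total / total_rate(x)


def service_rate(i, x):
    """phi_i(x) = Phi(x - e_i) / Phi(x) when x_i > 0, and 0 otherwise."""
    if x[i] == 0:
        return 0.0
    return phi_weight(minus_unit(x, i)) / phi_weight(x)


def unnormalized_prob(x):
    """Phi(x) * prod_i LAM[i] ** x_i, proportional to pi_X(x)."""
    weight = phi_weight(x)
    for lam, count in zip(LAM, x):
        weight *= lam ** count
    return weight


def idle_prob_truncated(limit):
    """Approximate pi_X(empty) by summing over states with every x_i <= limit."""
    states = ((a, b) for a in range(limit + 1) for b in range(limit + 1))
    return 1.0 / sum(unnormalized_prob(x) for x in states)
```

Because the recursion gives $\sum_{i : x_i > 0} \Phi(x - e_i) = \mu(x)\Phi(x)$, summing `service_rate(i, x)` over `i` should reproduce `total_rate(x)` up to rounding. The truncated sum in `idle_prob_truncated` is only a numerical approximation of the normalizing constant and assumes the example is stable.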
Bonald and Comte also noted the following relationship between the aggregate model and the collaborative model:
$$\phi_i(x) = \sum_{\vec c_n \in C(x)} \frac{\pi_C(\vec c_n)}{\pi_X(x)}\, \mu'_i(\vec c_n),$$
where $\mu'_i(\vec c_n)$ is the service rate of the first class-$i$ job in state $\vec c_n$ in the collaborative model. (Because of the FCFS service discipline, no other class-$i$ jobs will be in service.) Note that from Theorem 6.1, $\sum_i \phi_i(x) = \mu(x)$.

The service rates $\phi_i(x)$ also satisfy the following balance property:
$$\phi_i(x)\, \phi_j(x - e_i) = \phi_j(x)\, \phi_i(x - e_j)$$
for $x_i, x_j > 0$. The balance property is analogous to the assignment condition required for the noncollaborative model with random assignment to idle servers to have a product-form stationary distribution. Furthermore, the balance property leads to Kolmogorov's criterion being satisfied, and therefore it leads to the system being reversible. Because the aggregate model is reversible, it is also insensitive to the job size distributions.

While the stationary distribution of the states $x$ is the same in the aggregate model and the collaborative model, as noted above, the underlying system dynamics are very different. Bonald and Comte propose applying a round robin-like scheduling algorithm to the original collaborative model to approximate the behavior of the aggregate model [15]. Under their algorithm, server $j$ serves the first compatible job in the queue, as in the original, collaborative, FCFS model, but, after an exponential time with rate $\theta_j$, server $j$ interrupts the job in service and that job is moved to the back of the queue. This is analogous to approximating processor sharing with round robin for a single server and job class. Bonald and Comte note that the aggregate model, using balanced fair processor sharing, is insensitive in that the steady-state distribution does not depend on the job size distribution.

Because the aggregate model and the collaborative model have the same steady-state distribution for the aggregate states, we can use the aggregate model to efficiently compute aggregate performance measures for the collaborative model. In particular, Bonald et al. [16] give a recursion based on successively removing servers for computing the system idle probability, $\pi_C(\emptyset) = \pi_X(\emptyset)$, as follows.

Let $C$ be the set of all (detailed) states $\vec c_n$ for the original collaborative model. Recall that the subscript $|{-k}$ represents a reduced system without server $k$, i.e., in which server $k$ as well as the job classes in $C_k$ are removed.
Let $\psi_k$ be the probability that server $k$ is idle in the original collaborative system. Then, from Corollary 3.8, we have that $\pi_C(\vec c_n \mid \text{server } k \text{ is idle}) = \pi_{C|-k}(\vec c_n)$ for $\vec c_n \in C_{|-k}$, so $\pi_C(\vec c_n) = \pi_{C|-k}(\vec c_n)\, \psi_k$. Then $\pi_C(\emptyset) = \pi_X(\emptyset) = \pi_{C|-k}(\emptyset)\, \psi_k$ (and hence $\psi_k$) can be computed recursively:

Proposition 6.2. (Bonald et al. [16])
$$\pi_C(\emptyset) = \pi_X(\emptyset) = \frac{(1 - \rho)\, \mu}{\sum_{k=1}^{M} \dfrac{\mu_k}{\pi_{C|-k}(\emptyset)}},$$
where $\rho = \lambda/\mu$ is the system load.

Proof. Algebra, using $\pi_{C|-k}(\emptyset) = \pi_C(\emptyset)/\psi_k$, gives us that the equation above is equivalent to
$$\sum_{k=1}^{M} \mu_k \psi_k = \mu - \lambda.$$
This just represents two ways of computing the long-run rate of “dummy” transitions, i.e., potential service completions at idle servers.

Recall that the collaborative model is equivalent to the directed bipartite matching model, in which servers of type $k$ arrive according to a Poisson process at rate $\mu_k$, and arriving servers that do not find compatible jobs (unmatched servers) immediately leave the system. Here the interpretation of $\psi_k$ is the probability that an arriving server of type $k$ is unmatched, and “dummy” transitions correspond to arrivals of unmatched servers. See Weiss [38] for an alternative algorithm to compute $\psi_k$ and $\pi_C(\emptyset)$, as well as for computing the long-run matching rates of class $i$ jobs with class $k$ servers.

The mean number of jobs $L$ and the mean number of class-$i$ jobs $L^i$, with $L_{|-k}$ and $L^i_{|-k}$ similarly defined for the reduced system without server $k$ and its compatible job classes, can be similarly recursively calculated. Note that $L_{|-k}$ and $L^i_{|-k}$ are also the conditional mean numbers of jobs and of class-$i$ jobs in the original system, given server $k$ is idle. Let $\bar S_i = \{1, \ldots, M\} \setminus S_i$ denote the set of servers that cannot serve class-$i$ jobs, and let $\rho_i = \lambda_i / (\mu - (\lambda - \lambda_i))$ be the load of an M/M/1 queue with arrival rate $\lambda_i$ and service rate $\mu - (\lambda - \lambda_i)$.

Proposition 6.3.
$$L^i = \frac{\lambda_i + \sum_{k \in \bar S_i} \mu_k \psi_k L^i_{|-k}}{\mu - \lambda} = \frac{\lambda_i}{\mu - \lambda} + \sum_{k \in \bar S_i} \frac{\mu_k \psi_k}{\mu - \lambda}\, L^i_{|-k},$$
$$L = \sum_{i=1}^{J} L^i = \frac{\lambda + \sum_{k=1}^{M} \mu_k \psi_k L_{|-k}}{\mu - \lambda} = \frac{\lambda}{\mu - \lambda} + \frac{\sum_{k=1}^{M} \mu_k \psi_k L_{|-k}}{\sum_{k=1}^{M} \mu_k \psi_k}.$$

Note that $\lambda_i/(\mu - \lambda) = \rho_i/(1 - \rho_i)$ is the mean number of class-$i$ jobs in an M/M/1 queue with arrival rate $\lambda_i$ and service rate $\mu - (\lambda - \lambda_i)$ (the maximal service rate available to class-$i$ jobs), as we argued in Section 4. It also represents the mean number of class-$i$ jobs in the collaborative (not necessarily nested) model given all the servers are busy. Also, as noted above, $\mu_k \psi_k / (\mu - \lambda)$ represents the proportion of dummy transitions due to server $k$ being idle, so the second set of terms in the above expression represents the additional expected jobs due to “wasted” service because of job/server incompatibilities. Note that servers in $S_i$ will not be idle if there are class-$i$ jobs in the collaborative system.

Bonald et al. use the results above to obtain explicit results for special cases, such as redundancy-$d$, where all jobs are replicated to a randomly chosen subset of $d$ servers, and line structures in which job classes can be ordered so that for any server $k$, the classes of jobs it can serve are consecutive, i.e., classes $i_1, i_1 + 1, \ldots, i_2$ for some $i_1 < i_2$ [16].
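The recursions of Propositions 6.2 and 6.3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of Bonald et al.: the two-server instance (`MU`, `LAM`, `COMPAT`) and all function names are hypothetical, every reduced system is assumed to be stable, and the reduced system on a server subset keeps exactly those classes whose compatible servers all lie in that subset.

```python
from functools import lru_cache

# Hypothetical example (not from the paper): class "a" can use server 1 only,
# class "b" can use servers 1 and 2.
MU = {1: 1.0, 2: 1.0}                  # server service rates mu_k
LAM = {"a": 0.5, "b": 0.5}             # class arrival rates lambda_i
COMPAT = {"a": {1}, "b": {1, 2}}       # S_i: servers compatible with class i
ALL_SERVERS = frozenset(MU)


def remaining_classes(servers):
    """Classes kept in the reduced system: all their servers are still present."""
    return [c for c, s in COMPAT.items() if s <= servers]


@lru_cache(maxsize=None)
def idle_probability(servers):
    """pi(empty) for the system restricted to `servers` (Proposition 6.2)."""
    classes = remaining_classes(servers)
    if not classes:
        return 1.0  # no job can ever be present
    mu = sum(MU[k] for k in servers)
    lam = sum(LAM[c] for c in classes)
    denom = sum(MU[k] / idle_probability(servers - {k}) for k in servers)
    return (mu - lam) / denom


def server_idle_probability(k, servers=ALL_SERVERS):
    """psi_k = pi(empty) / pi_{|-k}(empty)."""
    return idle_probability(servers) / idle_probability(servers - {k})


@lru_cache(maxsize=None)
def mean_class_jobs(cls, servers=ALL_SERVERS):
    """L^i for the system restricted to `servers` (Proposition 6.3)."""
    classes = remaining_classes(servers)
    if cls not in classes:
        return 0.0
    mu = sum(MU[k] for k in servers)
    lam = sum(LAM[c] for c in classes)
    total = LAM[cls]
    for k in servers - COMPAT[cls]:    # servers that cannot serve this class
        psi_k = server_idle_probability(k, servers)
        total += MU[k] * psi_k * mean_class_jobs(cls, servers - {k})
    return total / (mu - lam)
```

For this small instance the recursion can be checked by hand from the detailed product form: the idle probability is 1/3, and the mean numbers of class "a" and class "b" jobs are 7/6 and 1/2. Memoizing on the server subset keeps the number of evaluated reduced systems at most $2^M$, which is practical only for small $M$ or for structured compatibility graphs.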
Nested structures are a special case of line structures, so the above recursions represent an alternative method for deriving mean performance metrics to the approach given in Section 4, where, of course, mean response times follow immediately from Little's Law.

We note that though the results for this section are for the collaborative model, in light of our observation that the queue process in the noncollaborative model given all servers are busy has the same distribution as the overall process for the collaborative model, we can apply the results above to the noncollaborative case. That is, e.g., $L^i$ as computed above will equal the expected number of class-$i$ jobs in the queue (not receiving service) in steady state, given all the servers are busy, for the noncollaborative model.

The majority of this paper has focused on surveying results related to product-form stationary distributions and derivations of performance metrics in the collaborative and noncollaborative systems, under a few key assumptions: that service times are exponentially distributed and i.i.d. (across jobs and across replicas of the same job, in the collaborative model), and that the service discipline is FCFS. There are several lines of work that relax one or more of these assumptions; such relaxations preclude product-form results, and as such fall outside the scope of this paper. In this section we provide a brief outline of some of the related work.

We begin with related work within the i.i.d. exponential model. Several papers have considered a scheduling policy that gives priority to less flexible jobs over more flexible jobs; this policy is known as “dedicated customers first” in the noncollaborative system and as “least redundant first” in the collaborative system. Such a policy has been shown to be optimal in the sense that it stochastically maximizes the departure rate, in both the noncollaborative system [7] and the collaborative system [23].
Furthermore, in the collaborative case mean response time is decreasing and convex under this policy as the proportion of jobs that are more flexible increases [24]. In a similar vein, the effect of increasing the “degree” of flexibility (i.e., the number of servers with which each job is compatible) in systems with FCFS scheduling has been studied. In both the noncollaborative [8] and collaborative [33, 22] systems, mean response time is decreasing and convex as the degree of redundancy increases. Gurvich and Whitt [26, 27, 28] consider other routing and scheduling policies for the noncollaborative model in the many-server heavy-traffic regime.

The system in which all jobs have the same degree of flexibility and are assigned a set of compatible servers uniformly at random is one special case of the system structure considered in this paper. This special case, often referred to as a “redundancy-$d$” system (the $d$ indicating the degree of flexibility), has received considerable attention in the literature because the symmetric system structure makes analysis more feasible in many cases. Indeed, the redundancy-$d$ system is another example of a system in which, under the i.i.d. exponential and FCFS assumptions of this paper, it is feasible to aggregate the product-form stationary distribution to derive performance metrics [22, 11].

Several papers have noted that when the i.i.d. exponential assumptions are eliminated, mean response time no longer decreases as $d$ increases in the collaborative model [25]; hence recently there has been a focus on developing new dispatching and scheduling policies for systems with correlated or general service times [25, 29, 30, 36]. Finally, some work focuses on deriving the stability region under redundancy-$d$, which becomes much more complicated absent the i.i.d. exponential assumptions [9].
This paper presents an overview of product-form results in systems with flexible jobs and servers, in which a bipartite graph structure specifies which job classes can be served by which servers. We primarily focus on two models for service: the collaborative model, in which multiple servers can work together to serve a single job at a faster rate, and the noncollaborative model, in which each job is permitted to enter service on only one server. Both models have been studied extensively in the literature; this survey brings together the two models, as well as several other related systems, using a common language and set of notation. Our hope is that this will allow readers to draw new connections among these similar systems. Along the way, we have presented several new results that highlight the relationships between models and that show how results derived in one model can be used to obtain insights for the other.

One of the primary goals in analyzing queueing systems is to determine response time distributions; in multi-class systems such as those considered in this paper, we wish to derive per-class response time distributions. Each of the three state descriptors that we consider allows us to make partial progress towards this goal. Using the detailed state descriptor of Section 3, we can derive per-class response time distributions for the special case of nested systems. Using the partially aggregated states of Section 5, we can derive per-class queueing time distributions in general (not necessarily nested) systems, but now conditioned on the ordered set of busy servers. Using the per-class aggregated states of Section 6, we can derive unconditional per-class mean performance metrics in general systems, but this approach does not yield distributional results. Each approach has its advantages and disadvantages; we believe that a unifying analysis that provides unconditional per-class response time distributions is likely to be infeasible, but this remains an open question.
We thank Gideon Weiss, Erol Pekoz, and Jan-Pieter Dorsman for their careful reading and valuable feedback.
References

[1] Adan, I. J. B. F., and G. Weiss. (2014) A skill based parallel service system under FCFS-ALIS – steady state, overloads, and abandonments. Stoch. Syst., 4(1): 250–299.
[2] Adan, I. J. B. F., and G. Weiss. (2012) Exact FCFS matching rates for two infinite multi-type sequences. Oper. Res. 60(2): 475–489.
[3] Adan, I. J. B. F., and G. Weiss. (2012) A loss system with skill-based servers under assign to longest idle server policy. Prob. Eng. Inf. Sci. 26: 307–321.
[4] Adan, I. J. B. F., C. Hurkens, and G. Weiss. (2010) A reversible Erlang loss system with multitype customers and multitype servers. Prob. in the Eng. and Inf. Sci. 24: 535–548.
[5] Adan, I. J. B. F., I. Kleiner, R. Righter, and G. Weiss. (2018) FCFS parallel service systems and matching models. Perf. Eval. 127: 253–272.
[6] Adan, I. J. B. F., A. Bušić, J. Mairesse, and G. Weiss. (2018) Reversibility and further properties of FCFS infinite bipartite matching, Math of OR, 43: 598–621.
[7] Akgun, O., R. Righter, and R. Wolff. (2012) Partial flexibility in routing and scheduling. Adv. Appl. Prob., 45: 637–691.
[8] Akgun, O., R. Righter, and R. Wolff. (2011) Understanding the marginal impact of customer flexibility. Queueing Systems 71(1-2): 5–23.
[9] Anton, E., U. Ayesta, M. Jonckheere, and I.M. Verloop. (2019) On the stability of redundancy models. Preprint.
[10] Ayesta, U., T. Bodas, and I.M. Verloop. (2018) On redundancy-d with cancel-on-start a.k.a Join-shortest-work (d), MAMA Workshop, SIGMETRICS.
[11] Ayesta, U., T. Bodas, and I.M. Verloop. (2018) On a unifying product form framework for redundancy models, IFIP Performance.
[12] Ayesta, U., T. Bodas, J. L. Dorsman, and I. M. Verloop. (2019) A token-based central queue with order-independent service rates. arXiv Preprint, arXiv:1902.02137.
[13] Berezner, S.A., C.F. Kriel, and A.E. Krzesinski. (1995) Quasi-reversible multiclass queues with order independent departure rates, Queueing Systems, 19: 345–359.
[14] Berezner, S.A., and A.E. Krzesinski. (1996) Order independent loss queues, Queueing Systems, 23: 331–335.
[15] Bonald, T., and C. Comte. (2017) Balanced fair resource sharing in computer clusters, Perform. Eval. 116: 70–83.
[16] Bonald, T., C. Comte, and F. Mathieu. (2019) Performance of balanced fairness in resource pools: A recursive approach, ACM SIGMETRICS Perform. Eval. Rev. 46: 125–127.
[17] Borst, S.C., O.J. Boxma, J.A. Morrison, and R. Núñez Queija. (2003) The equivalence between processor sharing and service in random order. Operations Research Letters 31: 254–262.
[18] Caldentey, R., E. H. Kaplan, and G. Weiss. (2009) FCFS infinite bipartite matching of servers and customers. Adv. Appl. Probab. 41(3): 695–730.
[19] Comte, C. (2019) Dynamic load balancing with tokens, Computer Communications 144: 76–88.
[20] Comte, C., and J.-P. Dorsman. (2020) Pass-and-swap queues. Preprint.
[21] Gardner, K., S. Zbarsky, S. Doroudi, M. Harchol-Balter, E. Hyytia, and A. Scheller-Wolf. (2016) Queueing with redundant requests: exact analysis. Queueing Systems 83: 227–259.
[22] Gardner, K., M. Harchol-Balter, A. Scheller-Wolf, M. Velednitsky, and S. Zbarsky. (2017) Redundancy-d: The power of d choices for redundancy. Operations Research 65(4): 1078–1094.
[23] Gardner, K., M. Harchol-Balter, E. Hyytia, and R. Righter. (2017) Scheduling for efficiency and fairness in systems with redundancy. Performance Evaluation 116: 1–25.
[24] Gardner, K., E. Hyytia, and R. Righter. (2019) A little redundancy goes a long way: Convexity in redundancy systems. Performance Evaluation 131: 22–42.
[25] Gardner, K., M. Harchol-Balter, A. Scheller-Wolf, and B. Van Houdt. (2017) A better model for job redundancy: Decoupling server slowdown and job size. Transactions on Networking 25(6): 3353–3367.
[26] Gurvich, I., and W. Whitt. (2009) Scheduling flexible servers with convex delay costs in many-server service systems. Manufacturing & Service Operations Management, 11(2): 237–253.
[27] Gurvich, I., and W. Whitt. (2009) Queue-and-idleness-ratio controls in many-server service systems. Mathematics of Operations Research, 34(2): 363–396.
[28] Gurvich, I., and W. Whitt. (2010) Service-level differentiation in many-server service systems via queue-ratio routing. Operations Research, 58(2): 316–328.
[29] Hellemans, T., and B. van Houdt. (2018) Analysis of redundancy(d) with identical replicas, IFIP Performance.
[30] Hellemans, T., and B. van Houdt. (2018) On the Power-of-d-choices with least loaded server selection, Proc. ACM Meas. Anal. Comput. Syst. 2 (27).
[31] Jackson, J. (1957) Networks of waiting lines. Operations Research 5: 516–523.
[32] Kelly, F.P. (1979) Reversibility and Stochastic Networks, Wiley, Chichester, UK.
[33] Kim, Y., R. Righter, and R. Wolff. (2009) Job replication on multiserver systems. Adv. Appl. Prob. 41: 546–575.
[34] Krzesinski, A.E. (2011) Order independent queues, in: R.J. Boucherie, N.M. van Dijk (Eds.),