Set Consensus: Captured by a Set of Runs with Ramifications
Abstract
Are (set-)consensus objects necessary? This paper's answer is negative. We show that the availability of consensus objects can be replaced by restricting the set of runs we consider. In particular, we concentrate on the set of runs of the Iterated Immediate Snapshot model (IIS), and given the object we identify this restricted subset of IIS runs. We further show that given an (m,k)-set consensus, an object that provides k-set consensus among m processors, in a system of n processors, n > m, we do not need to use the precise power of the objects but rather their effective cumulative set-consensus power. E.g., when n = 3, m = 2, and k = 1 and all 3 processors are active, then we only use 2-set consensus among the 3 processors, as if 2-processor consensus is not available. We do this until at least one of the 3 processors obtains an output. We show that this suggests a new direction in the design of algorithms when consensus objects are involved.

Introduction
We present three contributions:

1. Showing equivalence between MK, the SWMR system equipped with (m,k)-set consensus, and an IIS subset of runs RMK,

2. Showing how to compile programs written for MK to be run on RMK, and

3. Using the observation of how this compiler works to design algorithms with RMK in mind, instead of the explicit objects. The paper illustrates this possibility on two elementary examples. This opens the possibility of a "new design paradigm" that replaces "designing with objects" with "designing with reduced concurrency."

The first two contributions are theoretical in nature. We show their feasibility without worrying about the complexity of the various implementations. These contributions are small steps in a rather ambitious agenda, and they make the agenda more credible. The agenda involves establishing the following:

1. Any interesting distributed computing model can solve ε-agreement (there is no interesting model below read-write wait-free),

2. Any interesting distributed problem, at the possibility level, can be captured by a network of tasks (tasks are the functions of distributed computing),

3. Any interesting distributed system is equivalent to some subset of runs of the wait-free IIS model (read-write and the knowledge of the possible runs captures any possible "knowledge relation" among processors),

4. Any question about the solvability of a task T in a (sub-IIS) model M can be reduced to the question of wait-free solvability of a task T(M) (it is all wait-free).

The contribution to the agenda is in showing that indeed the MK model (a very interesting distributed system) corresponds to a subset of runs of the IIS.

The second, less theoretical, contribution is a "new algorithm design paradigm" that emerges from the theoretical results: any task solvable with set consensus can be solved by a "wait-free" algorithm tuned to the variable concurrency provided by the objects.

There are three main technical ideas involved:

1. Describe the availability of (m,k)-set consensus as an affine task [12] T (a task posed as a subcomplex of a subdivided simplex). With this task every processor can determine a k-set consensus value for any combination of m processors out of the (n choose m) possible.
Then we approximate T (using the colored simplicial approximation theorem [14, 4]) by a subset of IIS runs. This is done by completing T to a subdivided simplex, approximating it, and choosing runs that land in T.

2. Show that with (m,k)-set consensus, j processors can implement k⌊j/m⌋ + min(k, j mod m)-set consensus. Use this in conjunction with the simulation of SWMR on IIS [4, 11], the Generalized State-Machine Replication (GSMR) method [18], and the RC simulation [3], to run programs written for MK on RMK. The GSMR converts MK to n read-write threads progressing with maximum asynchrony of k⌊j/m⌋ + min(k, j mod m), where j is the cardinality of the set of active processors, processors that arrived with an input but did not obtain an output as yet. The n fixed threads work as BG [19, 20] processors. The synchrony among them, as shown in the RC simulation, allows us to do away with the Extended-BG simulation [21] used for colorful tasks. Finally, the use of (m,k)-set consensus in GSMR is replaced by the fact that we run in RMK; GSMR has an iterated structure too, as it evolves in rounds that use "fresh" variables in each round. This allows us to simulate a GSMR round with a fixed number of iterations of RMK. This extends the equivalence between IIS and SWMR-SM from wait-free [4, 11], 0-1 tasks [23, 11], and t-resiliency [1], to consensus objects.

3. From the theory above it is clear that programs that call on (m,k)-objects can be compiled to run without the objects over RMK. Can one write programs "directly" with RMK in mind, without "cheating" and using the compiler? We speculate that when the question is precisely formulated the answer will be positive. We show two rather simple examples of algorithms that do not "cheat."

Since the quest to make the paper self-contained was judged as hopeless, the traditional Model section is abbreviated from [12] and put in the appendix. In the following sections we elaborate on the three points: 1. the idea behind getting RMK from IIS with (m,k)-objects, 2. how to compile MK programs to run on RMK, and 3. examples of direct design for RMK. The next section goes into the details of the idea presented in point 1 above, and the section after it gives the technical details of how to get the cumulative set-consensus power of (m,k)-objects. Finally, the obligatory Conclusions.

n-processor IIS runs that solve (m,k)-set consensus among any m processors, n ≥ m, and are implementable by IIS with (m,k)-objects

The (soft-wired [22]) object of (m,k)-set consensus can be invoked by m processors, and to each invocation it returns one of the inputs provided to it by some invocation, such that at most k distinct invocation values are returned in total [13].

What would RMK be for (m,k)-objects? In [10], k-processor consensus is established as a distinct IS task in which, in addition, a processor "knows" the k−1 other invocations; we extend this idea to (m,k) to build an affine complex that solves T.

Finally, we do want to capture runs as a subset of a single model, which we chose to be the IIS. For that we show how to replace a composite chunk of the rounds we described with an equivalent chunk from IIS. We use the idea from [12] that every affine task, that is, a task that is a subcomplex of a subdivided simplex (or for that matter any subcomplex of a chromatic subdivided simplex), can be equated with a set of IIS runs. The composite chunk we have produced is an affine task T.

To capture any affine task as a subset of IIS runs, we notice that every colored subdivided simplex can be approximated by enough iterations of IIS [14, 4]; we complete the affine task to a colored subdivided simplex, approximate it, and choose the prefixes of runs that fall into T. The iteration of this chunk ad infinitum is RMK.
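Using notation made precise later in the paper (T is completed to a chromatic subdivided simplex, q is the number of approximation iterations, and we write RMK_1 here for a single chunk; the name is ours), the chunk extraction can be summarized, as a sketch, by

    RMK_1 = { σ ∈ Chr^q(s^{n−1}) : |σ| ⊆ |T| },

the set of q-round IIS prefixes whose simplexes land in T; iterating this chunk ad infinitum yields RMK.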
RMK can run MK Programs
Let Π be a program in MK, i.e., threads of reads and writes to SWMR shared memory, with the threads invoking (m,k)-set consensus objects. How do we execute Π in RMK? Processors in RMK can solve (m,k)-set consensus, but how do they coordinate the local states calling on a copy of (m,k) to know what the outcome is, since the virtual call to (m,k) in RMK happens in a single round, while the call to (m,k) in Π takes place by the processors simulating Π at different rounds of the simulation? Conceptually, the answer is simple: just the first round in which the object is invoked matters. Later processors will adopt a value from the first-round processors. The implementation of this simple idea, unfortunately, requires heavy machinery.

We draw on three simulations: simulating SWMR-SM on IIS [4, 11], simulating free-for-all execution that builds on the replicated multi-state-machine in [18] (GSMR), and finally drawing on the companion submission called the RC simulation [3] that replaces the EBG simulation [21] by considering a constant number n of BG simulators but increasing and reducing their concurrency.

We elaborate on each of these simulations in turn. The view from 20,000 feet is as follows: processors run in RMK and drive a GSMR system. The role of a processor is to get its input into the simulation so that its thread can be executed. It is ignored (simulated as departed) once its thread has an output. The GSMR just gives steps to n BG simulators. The fewer processors there are, or the higher the power of consensus they have, the higher the synchrony of the BG simulators. The n BG simulators, through the RC mechanism, execute Π (we need RC since we do not run BG in the traditional way of wait-free simulators but as simulators with a certain level of synchrony).
Our target machine in this paper is a subset of runs of the iterated system IIS. As the first step we would like to run the GSMR replication system of [18] in IIS. The replication system was written for SWMR-SM, but luckily it has a round structure. At the beginning of a round all processors invoke set consensus and then communicate within a SWMR mechanism. The crucial observation is that from round to round GSMR uses "fresh" variables. Thus the simulation of a round of GSMR takes a fixed number of rounds in IIS: a fixed number of rounds of RMK (the chunk) to get the set consensus required, and then a fixed number of rounds to simulate the GSMR communication in a round (posting proposals, doing Commit-Adopt, etc.).

Thus, our run in RMK alternates fixed-size chunks of solving set consensus with fixed-size chunks of rounds that simulate the read-write round of GSMR, then set consensus again, etc.
The scheme proposed in the Concur paper [18] shows how to generalize the single state-machine approach to distributed computing [8] using consensus, to the case of (∞,k)-set consensus, in short k-set consensus. The state-machine approach [8] shows how, using consensus, processors can coordinate to replicate a linear order of proposed commands. The GSMR assumes processors want to place commands on k distinct machines. It shows how, using k-set consensus, they can replicate putting commands on these k machines, with the progress guarantee that the placing of commands on at least one machine will progress.

A trivial corollary of the technique behind GSMR in [18] shows that with k-set consensus, n processors can place commands on n state machines with guaranteed progress on n − (k − 1) of them: with k-set consensus, n processors can simulate n threads where in each round at least n − (k − 1) of the threads advance. Thus if k is small relative to n, the scheme simulates an execution with a high level of synchrony. With consensus, the execution of all the n threads will be synchronous!

The main innovation of this section is to consider the state machines to be read-write threads. Processors running GSMR read their local replica, which may lag, or be ahead of, another replica. Based on their local read, they propose this value as a command for the next read steps of the threads (the writes will be inferred from the value of the read). All these possibly distinct read values are proposed. Any one of them decided for a thread is a valid read value of the thread, since the threads read asynchronously. Thus the idea is to use GSMR as an execution scheme for read-write threads, similar to the logic of the BG simulation.

But the threads of Π we are given are not only read-write threads; they also invoke (m,k)-objects. The next idea is to replace each (m,k)-object in Π with a BG safe-agreement (SA) task [19]. A solution to an SA task (see appendix) is read-write with an await(condition) statement. The burden we have is to show that these await statements will still allow progress of at least one thread. When that thread outputs, the RMK processor associated with it (the one that brought its input) will be simulated as departed, the synchrony of GSMR will increase, the concurrency of executing Π will hopefully decrease, and this will allow another thread to progress.

Who are these n threads we simulate? We do not simulate directly the threads of Π. GSMR is built with a fixed set of threads (state machines) in mind. The effective set consensus RMK provides to drive GSMR is implicit rather than explicit, as it depends on the number of virtual arrivals and virtual departures of processors. This will affect the number of state machines that will progress.
To do away with this complication, GSMR runs a fixed number n of threads of processors that behave like BG simulators: they run all over Π with some rule determining which thread of Π can advance while some SA's are "waiting."

We could run a variable number of threads according to getting a GSMR thread that can progress. This variable number of threads is a problem, since when synchrony grows and the number of threads shrinks, threads that were active before and are not active now may interfere with an SA task. In the past this problem was solved by the Extended-BG simulation [21]. Here we solve it in a more elegant way by fixing the number of BG simulators to n, and letting the synchrony change. Running the BG simulation with partially synchronous BG simulators, something that has not been done before, is described in a companion submission under the name RC-simulation. The RC-simulation changes the BG scheme by determining that an SA is blocked [19] only after some delay, to let live simulators have a chance to terminate their execution of the core of the SA (all but the await statement). Thus, the RC-simulation [3] is an elegant substitute for EBG [21].

The crux of the RC simulation was explained above. In more detail, suppose we have n BG simulators with at most one fault. We know [19, ?] that any number of processors with at most a single possible fault is effectively two processors. The original BG simulation converts the above to let BG simulators number 1 and 2 take steps and all the rest "skip." At least one of BG simulators 1 and 2 will take a step. To do this we need the premise that at most a single processor might fail. The RC simulation lets all take steps. All will go to some SA; one will finish the core of the SA first without knowing the outcome. Should it proceed to another SA? Maybe all the processors are alive and synchronous but they have started the SA at different times.
If it proceeds to another SA, the concurrency of the execution of Π might grow unnecessarily (and, say, in the case of solving Renaming [16], will require more space than necessary). The crux of the RC simulation is to show that if the decision to proceed to the next SA is delayed enough (as a function of the number of active SA's), then if a simulator does proceed, it is accounted for by one simulator being too slow. I.e., the execution was not completely synchronous. In the case of at most one fault, after two SA's are active, the delay guarantees that at least one SA of the two will terminate.

(m,k)-objects

Now that we have reduced the execution to execution by BG processors in an RC simulation, we use what we prove later: that with (m,k)-set consensus, j processors can solve k⌊j/m⌋ + min(k, j mod m)-set consensus. This will be the effective number of BG processors in the RC simulation. Since the SA for every (m,k)-set consensus needs at least k + 1 simulators to be sitting in the middle of the safe-agreement code, we get that at least one BG simulator can find a thread to execute.

As mentioned in the subsection above, with (m,k)-set consensus, j processors can solve cumulative set consensus: k⌊j/m⌋ + min(k, j mod m)-set consensus. Take m = 2, k = 1. The object is now 2-processor consensus.

The most elementary task solvable by n processors is Test-and-Set (TST). In TST one of the participants outputs "win" while the others output "lose." What if at any point in the execution all we have is the cumulative set-consensus power of the 2-processor consensus? Can we do TST? Of course we can, as we can take any TST implementation and run it through the compiler we described. But then we replace objects with safe agreement, etc. Can we do it directly? Of course this question is not formalized, but we will rely on Supreme Court Justice Potter Stewart's saying: "I know it when I see it."

It is elementary for TST.
Processors do cumulative set consensus and write the id they obtained in shared memory. A processor that afterwards does not see its id written outputs "lose" and departs (virtually). A processor that arrives late and sees any id written outputs "lose." Continue inductively with the processors that saw their id written in shared memory. Notice that this solution is linearizable [2].

All algorithms in Common2 [7, 6] "TST everything that moves." It feels like the use of TST requires a different mind-set than wait-free. Indeed, the group involved in Common2 over the years [7, 6] seems to be the same "old hands," people who developed intuition in the use of TST. Let us therefore take the second most elementary task solvable by 2-processor consensus: Tight-Renaming [5].

In Tight-Renaming each of k participating processors outputs a unique integer in the range 1 to k. The standard TSTed way to solve it is for processors to TST the integers 1, 2, ..., in order, with the winner outputting the integer it won. Can we accomplish the same using only the cumulative set-consensus power of 2-processor consensus?

If the number of arrivals is 2k or 2k − 1, the cumulative power is k-set consensus, so at most k processors obtain distinct ids. These at most k processors can now solve Adaptive-Renaming [16] in the available range of at least 1 to 2k − 1. When one processor outputs, it writes its output in shared memory and departs. The rest of the processors continue inductively, using the integers that were not claimed by being written to shared memory. This solution is not linearizable.

This solution to tight renaming illuminates how the "wait-free logic" of Adaptive-Renaming [16] spills over to the same problem type when 2-processor consensus is available. It will be interesting to push the wait-free logic to Fetch-and-Add and SWAP [7, 6]. More importantly, it will be interesting to formalize the question and actually prove it can always be done.
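For concreteness, the inductive TST protocol above can be simulated sequentially in a few lines. The lockstep rounds and the pairwise-consensus oracle below are our own simplifications for illustration (real executions are asynchronous, and the oracle stands in for the cumulative set-consensus power); this is a sketch, not the shared-memory implementation.

```python
def pairwise_consensus(active, p):
    """Oracle sketch: processors, ranked by id, agree in pairs; each pair's
    2-processor consensus returns the pair's minimum id (a leftover singleton
    keeps its own id). This yields ceil(j/2) distinct values for j processors."""
    ranked = sorted(active)
    i = ranked.index(p)
    group = ranked[(i // 2) * 2 : (i // 2) * 2 + 2]
    return min(group)

def tst(procs, oracle=pairwise_consensus):
    """Inductive TST sketch: in every round the active processors write the id
    the oracle returned to them; anyone whose own id was not written loses and
    departs. Since fewer ids than processors are written, the active set
    shrinks until a single winner remains."""
    active = list(procs)
    while len(active) > 1:
        written = {oracle(active, p) for p in active}  # ids written this round
        active = [p for p in active if p in written]   # survivors; the rest lose
    return active[0]                                   # the unique winner
```

For instance, `tst([3, 1, 2])` returns 1 in this sketch (the minimum id wins under the pairwise-minimum oracle); late arrivals that see any id already written would simply output "lose."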
RMK
We first present the task of (m,k)-set consensus among any m processors as an affine complex C(n,m,k), where n denotes the number of processors. An affine complex is a subcomplex of a chromatic finitely-subdivided simplex A. To create C(n,m,k) we first describe how we create C(m,m,k), i.e., what is the subcomplex we talk about when the number of processors n = m. The complex C(m,m,k) is a subcomplex of Chr^2(s^{m−1}), the second standard chromatic subdivision of the (m−1)-simplex s^{m−1}. To get C(m,m,k) we purge from Chr^2(s^{m−1}) all the simplexes that are not part of the subdivision of a face of dimension k−1 of an elementary simplex of Chr(s^{m−1}); i.e., we hollow out Chr^2(s^{m−1}), keeping only simplexes that touch such a (k−1)-dimensional face.

Lemma 3.1
Consider the runs that correspond to C(m,m,k); then in a model of these runs we can solve k-set consensus among m processors, and C(m,m,k), when considered as a task, is solvable in a read-write wait-free m-processor SWMR memory with access to (m,k)-set consensus.

Proof. (⇒) For every run in a simplex of C(m,m,k), after two immediate snapshots a processor obtains a vertex of its color in C(m,m,k). It then returns an id from the smallest-cardinality face of Chr(s^{m−1}) of all the simplexes which contain it. By the unproven property we skipped, the cardinality of the output set is k or less.

(⇐) Processors use (m,k)-set consensus to determine at most k corners (0-dimensional faces) of Chr(s^{m−1}). They then execute the convergence algorithm [14, 4] from these corners to return a simplex of C(m,m,k). □

To construct C(n,m,k), we consider all the (n choose m) combinations of m processors in some order, comb_1, ..., comb_{(n choose m)}. We take the (n−1)-simplex s^{n−1} and take the face that corresponds to comb_1. We subdivide it according to C(m,m,k) and cone off this subdivision with the rest of the vertices not in comb_1. Now we have a complex which is a subcomplex of a colored subdivided simplex. We take all the faces of elementary simplexes in the subdivision that correspond to comb_2. We subdivide each such face according to C(m,m,k) and then, in each simplex, cone this subdivision off with the rest of the vertices. We continue this for all combinations to get C(n,m,k).

C(n,m,k) can be completed to a colored subdivision of s^{n−1}. Viewing the completion as a wait-free solvable task, there exists q such that Chr^q(s^{n−1}) approximates the task. We now consider all the simplexes of Chr^q(s^{n−1}) that land in C(n,m,k) to be the first q rounds of runs in RMK, denoted RMK_1.

Lemma 3.2
Consider the runs that correspond to RMK_1; then in the model of these runs we can solve (m,k)-set consensus among all n processors, and RMK_1, when considered as a task, is solvable in a read-write wait-free n-processor SWMR memory with access to (m,k)-set consensus.

Proof. Since each simplex of RMK_1 resides in a simplex of C(n,m,k), we can just identify back the process of the subdivision we described above, and in turn each processor can answer its choice of value for each m-combination it belongs to. In the opposite direction, our construction of RMK_1 just used (m,k)-objects and read-write. □

To create RMK we now iterate RMK_1 ad infinitum.

j processors using (m,k)-objects

Theorem 4.1
Given (m,k)-set consensus, j out of n processors can implement k⌊j/m⌋ + min(k, j mod m)-set consensus.

Proof.
Suppose we knew who the j processors are. W.l.o.g. [22] we assume soft-wired objects, i.e., the software controls that at most m processors will invoke each (m,k)-set consensus object. To implement k⌊j/m⌋ + min(k, j mod m)-set consensus, processors rank themselves. We then arrange the objects in order, the first object covering the lowest m ranked processors, etc., and a processor invokes the object that covers the range of ranks that includes its rank.

To "know" j, processors march through layers 1 to n. A processor starts at layer 1 with its id as input id. Inductively, it writes at layer i all the ids it encountered at layer i−1, and takes a snapshot of all the ids written. If the number of distinct ids written is i, it invokes the appropriate object at layer i, with the inductive output id from layer i−1, obtains an input id, and writes the input id in shared memory. It then looks back at the number of arrivals to the layer. If it is still i, it departs with its input id. If it is larger than i, it continues with the input id it got from the object, and all the ids it encountered, to layer i+1.

On the other hand, if at layer i the number of ids it observed is larger than i, it continues to layer i+1 with either its input id to layer i, in case it did not see an input id written at layer i, or else it adopts an input id written at layer i as its input id to layer i+1.

The algorithm appears as Algorithm 1. It works since if any processor returns, it has seen the registration cardinality unchanged. Correspondingly, processors that come later will adopt a value from that layer. A concurrent processor that failed to see the registration cardinality unchanged obviously has a value from this layer. Since we go from 1 to n, some processor must return at some point. Notice the invariant that the value IdSeen written to a layer is of cardinality greater than or equal to the layer index. □

Shared Array C[1..n, 1..n] initialized to ∅
Shared Array D[1..n, 1..n] initialized to ∅
Local IdSeen, a set of processor ids, and InId, an input id, initialized to {MyId} and MyId, respectively

for j = 1 to n do
    C[j, MyId] := IdSeen
    Snap := ∪_l C[j, l]
    if |Snap| = j then
        InId := invoke, with InId, the object according to MyId's rank in Snap
        D[j, MyId] := {InId}
        IdSeen := ∪_l C[j, l]
        if |IdSeen| = j then return InId
    else if ∪_l D[j, l] ≠ ∅ then
        InId := an element of ∪_l D[j, l]
end

Algorithm 1: Extracting the cumulative power of soft-wired objects.
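The ranking arrangement in the proof can be sketched in a few lines of Python. The SetConsensus class below is a hypothetical sequential stand-in for an (m,k)-object (an adversary handing out up to k distinct proposed values), not the shared-memory layered implementation:

```python
def cumulative_power(j, m, k):
    """Effective set-consensus power of j processors using (m,k)-objects."""
    return k * (j // m) + min(k, j % m)

class SetConsensus:
    """Sketch of an (m,k)-set consensus object: validity (every output was
    proposed) and k-agreement (at most k distinct outputs), with an adversary
    that cycles through the first k distinct proposals to make the bound tight."""
    def __init__(self, k):
        self.k, self.vals, self.calls = k, [], 0
    def invoke(self, v):
        if len(self.vals) < self.k and v not in self.vals:
            self.vals.append(v)               # at most k distinct values kept
        out = self.vals[self.calls % len(self.vals)]
        self.calls += 1
        return out

def arrange_and_invoke(ids, m, k):
    """The arrangement from the proof: processors rank themselves; the first
    object covers the lowest m ranks, the next object the next m ranks, etc."""
    objects, outs = {}, {}
    for rank, pid in enumerate(sorted(ids)):
        obj = objects.setdefault(rank // m, SetConsensus(k))
        outs[pid] = obj.invoke(pid)
    return outs
```

With n = 3, m = 2, k = 1, as in the abstract, `arrange_and_invoke([10, 20, 30], 2, 1)` yields at most `cumulative_power(3, 2, 1) = 2` distinct outputs: the pair of lowest-ranked processors agrees on one value, and the leftover processor keeps its own.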
RMK
Notice that the algorithm above is layered, i.e., each layer uses "fresh" objects, be they set-consensus objects or read-write registers. This means that every layer can be simulated [4, 11] in RMK in some fixed number of iterations: a processor that moved from layer i to layer i+1 does not "interfere" any more with writes at layer i. To see that the (m,k)-objects do not need to be persistent objects, we make a further observation. Each layer can be further partitioned into three phases:

1. The first phase is a read-write phase in which a processor writes the ids it encountered so far and takes a snapshot,

2. In the second phase a processor invokes an (m,k)-object with its id, and

3. The third phase is again a read-write phase in which a processor writes the id returned to it by the (m,k)-object, and looks back at the ids now written in the first phase.

Each of the three phases can be bounded a priori by some constant number of iterations. Thus the boundary of the middle phase is well defined, and that is where processors in RMK simulate their invocations of the (m,k)-set consensus objects. Notice that in this phase processors in RMK in fact implement the soft-wired objects from hard-wired objects.
Conclusions

We have shown the existence of, and gave a constructive algorithm for, the sub-IIS model that corresponds to any set-consensus objects. We remark in passing that generalizing this to any combination of such objects is straightforward. This adds evidence to support the quest for a canonical distributed model, namely, some subset of IIS runs, supporting the thesis that any distributed computing system can be captured by a subset of IIS.

We showed further evidence for another thesis, equating wait-free SWMR-SM equipped with any task to the IIS model equipped with the same. Namely, it was previously shown for the 0-1 family of tasks, and now we enlarged it to consensus tasks. Surprisingly, it is still unknown what the sub-IIS model for the 0-1 family of tasks is.

Finally, and probably the least foundational but the most sexy part of the paper, is the subsection that shows a new possibility of designing algorithms when consensus objects are available. In fact, we plan in the future to reproduce all the algorithms in Common2 [7] in that spirit.
References

[1] Zohir Bouzid, Eli Gafni, Petr Kuznetsov: Live Equals Fast in Iterated Models. CoRR abs/1402.2446 (2014).

[2] Maurice Herlihy, Jeannette M. Wing: Linearizability: A Correctness Condition for Concurrent Objects. ACM Trans. Program. Lang. Syst. 12(3): 463-492 (1990).

[3] Pierre Fraigniaud, Eli Gafni, Sergio Rajsbaum, Matthieu Roy: Automatically Adjusting Concurrency to the Level of Synchrony. Submitted to DISC 2014.

[4] Elizabeth Borowsky, Eli Gafni: A Simple Algorithmically Reasoned Characterization of Wait-Free Computations (Extended Abstract). PODC 1997: 189-198.

[5] Yehuda Afek, Eli Gafni, Opher Lieber: Tight Group Renaming on Groups of Size g Is Equivalent to g-Consensus. DISC 2009: 111-126.

[6] Yehuda Afek, Adam Morrison, Guy Wertheim: From Bounded to Unbounded Concurrency Objects and Back. PODC 2011: 119-128.

[7] Yehuda Afek, Eytan Weisberger, Hanan Weisman: A Completeness Theorem for a Class of Synchronization Objects (Extended Abstract). PODC 1993: 159-170.

[8] Leslie Lamport: Time, Clocks, and the Ordering of Events in a Distributed System. Commun. ACM 21(7): 558-565 (1978).

[9] Yehuda Afek, Eli Gafni: Asynchrony from Synchrony. ICDCN 2013: 225-239.

[10] Yehuda Afek, Eytan Weisberger: The Instancy of Snapshots and Commuting Objects. J. Algorithms 30(1): 68-105 (1999).

[11] Eli Gafni, Sergio Rajsbaum: Distributed Programming with Tasks. OPODIS 2010: 205-218.

[12] Eli Gafni, Petr Kuznetsov, Ciprian Manolescu: A Generalized Asynchronous Computability Theorem. CoRR abs/1304.1220 (2013). To appear in PODC 2014.

[13] Soma Chaudhuri: Agreement Is Harder than Consensus: Set Consensus Problems in Totally Asynchronous Systems. PODC 1990: 311-324.

[14] Maurice Herlihy, Nir Shavit: The Topological Structure of Asynchronous Computability. J. ACM 46(6): 858-923 (1999).

[15] Yehuda Afek, Hagit Attiya, Danny Dolev, Eli Gafni, Michael Merritt, Nir Shavit: Atomic Snapshots of Shared Memory. Proc. 9th ACM Symposium on Principles of Distributed Computing (PODC 1990), ACM Press, pp. 1-13, 1990.

[16] Hagit Attiya, Amotz Bar-Noy, Danny Dolev, David Peleg, Rüdiger Reischuk: Renaming in an Asynchronous Environment. J. ACM 37(3): 524-548 (1990).

[17] Elizabeth Borowsky, Eli Gafni: Immediate Atomic Snapshots and Fast Renaming (Extended Abstract). PODC 1993: 41-51.

[18] Eli Gafni, Rachid Guerraoui: Generalized Universality. CONCUR 2011: 17-27.

[19] Elizabeth Borowsky, Eli Gafni: Generalized FLP Impossibility Result for t-Resilient Asynchronous Computations. STOC 1993: 91-100.

[20] Elizabeth Borowsky, Eli Gafni, Nancy A. Lynch, Sergio Rajsbaum: The BG Distributed Simulation Algorithm. Distributed Computing 14(3): 127-146, 2001.

[21] Eli Gafni: The Extended BG-Simulation and the Characterization of t-Resiliency. STOC 2009: 85-92.

[22] Elizabeth Borowsky, Eli Gafni, Yehuda Afek: Consensus Power Makes (Some) Sense! (Extended Abstract). PODC 1994: 363-372.

[23] Eli Gafni: The 0-1-Exclusion Families of Tasks. OPODIS 2008: 246-258.

Appendix
A Sub-IIS models
In this section, we describe our perspective on the Iterated Immediate Snapshot (IIS) model [4] and give examples of sub-IIS models.
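To make the model below concrete, a single immediate-snapshot (IS) round can be sketched in a few lines. The schedule argument, an ordered partition of the processors into concurrency classes, is our own illustrative device for selecting one particular run:

```python
def immediate_snapshot(schedule, inputs):
    """One IS round: the processors of each concurrency class write their
    values together and then snapshot together, so each one sees the inputs
    of its own class and of all earlier classes."""
    memory, views = {}, {}
    for block in schedule:
        for p in block:            # all processors of the class write...
            memory[p] = inputs[p]
        snap = dict(memory)        # ...then all of them snapshot
        for p in block:
            views[p] = snap
    return views
```

In IIS, the output view of one IS is submitted as the input to the next. For example, with schedule `[[0], [1, 2]]` and inputs `{0: 'a', 1: 'b', 2: 'c'}`, processor 0 sees only its own input while 1 and 2 each see all three; note that the views are ordered by containment, the defining property of immediate snapshot.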
A.1 The IIS model
Our base model is the IIS. It consists of an infinite sequence of IS tasks [17], IS_1, IS_2, .... Processors start by submitting their inputs to IS_1 and subsequently taking the output as the input to the next IS in the sequence.

Let R be the set of runs in IIS. A processor is participating if it went through IS_1. A processor is live if it goes on ad infinitum.

Some processors might not be seen by others, or not seen infinitely often. This may allow removing (a suffix of) their appearance in a run r and still leave some processors unaware that we did this surgery. It is easy to see that this surgery has a well-defined unique "skeleton." The set of processors that are live in the skeleton sk is called fast(sk). Since the skeleton is unique, the set fast(r) will denote the fast set of the skeleton of r. In a run r, processors that are not in fast(r) are in slow(r).

A.2 Examples of models
We define a sub-IIS model M to be any subset of R.

Example A.1 The wait-free (or completely asynchronous) model WF is the set R itself. The interpretation of WF is that anything can happen (all sorts of step interleavings are allowed).

Example A.2 For t ≤ n, the t-resilient model Res_t consists of the runs r ∈ R such that |fast(r)| ≥ n + 1 − t. This is the model in which at most t processes are slow.

Example A.3 For k ≤ n + 1, the k-obstruction-free model OF_k consists of all the runs r in which no more than k processes are fast, i.e., |fast(r)| ≤ k. This model was previously discussed in [?], following a suggestion of Guerraoui.

Example A.4 More generally, consider the model with adversary A [?], which we denote by M_adv(A). Here, A is any subset of the power set of {0, 1, ..., n}. We then define M_adv(A) to consist of all runs r such that slow(r) ∈ A.

B Topological definitions
We assume the reader is familiar with the by-now standard terminology used in distributed computing of chromatic complexes, subdivided simplexes, etc.

We denote by Chr^k(s) the k'th iterated chromatic subdivision of the simplex s, and by |Chr^k(s)| we denote some standard embedding of it in R^n. Since every simplex of Chr^k(s) corresponds to a partition of R by k-round prefixes, if we continue this process to infinity we get that every point in the embedding |s| of s is a unique subset of R. All runs at a point share the same skeleton.

C Tasks
C.1 Definitions

A task T = (I, O, ∆) on n + 1 processes {p_0, ..., p_n} consists of two finite, pure n-dimensional chromatic complexes I and O, together with a chromatic multi-map ∆ : I → O. The input complex I specifies the possible input values, the output complex O specifies the possible output values, and ∆ describes which output values are allowed for a given input. The colors specify to which process each input or output value corresponds.

A task is called input-less if the input complex is the standard simplex s, colored by the identity. Then each process starts with only its own id as input.

C.2 Affine tasks
Many examples of input-less tasks can be constructed as follows. Let L ⊆ Chr^k s be a pure n-dimensional subcomplex of the k'th chromatic subdivision of s, for some k. For each face t ⊆ s, the intersection L ∩ Chr^k t is a subcomplex of Chr^k s; we assume that this subcomplex is pure of the same dimension as t (and possibly empty).

We define an input-less task (s, L, ∆) by setting ∆(t) = L ∩ Chr^k t for any face t ⊆ s. Tasks constructed like this are called affine. To depict an affine task, we can simply draw the corresponding complex L.

By abuse of notation, we will usually write L for the affine task (s, L, ∆). We chose the name affine because if we have a task L as above, the geometric realizations of the simplices of L can be depicted as lying on affine subspaces of R^n. Similar terminology appears in algebraic geometry, where one talks about affine varieties.

C.3 Task Solvability
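The solvability conditions formalized in this subsection treat a protocol as a partial map from views to output vertices. The following sketch (ours; the protocol is modeled as a plain dict, and view/vertex names are hypothetical) checks the stability requirement of Definition C.1, condition (1), along a single process's sequence of views: undefined before some round k_0, and returning one fixed output vertex from k_0 on.

```python
# Sketch (ours) of Definition C.1, condition (1), for one process along one
# run: the protocol (a partial map, here a dict from views to output
# vertices) must be undefined on the process's views before some round k0,
# and from k0 on must keep returning the same output vertex.

def stable_output(protocol, views):
    """views: the process's views in rounds 1, 2, ... of the run, in order."""
    decided = None
    for view in views:
        if view not in protocol:
            if decided is not None:       # defined, then undefined: illegal
                return False
        elif decided is None:
            decided = protocol[view]      # first defined round plays the role of k0
        elif protocol[view] != decided:   # output changed after k0: illegal
            return False
    return True
```

For instance, a dict mapping the views "ab" and "abc" to the same vertex passes on the view sequence ["a", "ab", "abc"], while any protocol whose output changes or disappears after the first defined round fails.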
Informally, a task T = (I, O, ∆) is solvable in a sub-IIS model M if for all runs r ∈ M, the infinitely participating processes output, and their output is a subsimplex of the allowed outputs for the participating processes. An output is the result of a protocol. For us, when dealing with solvability rather than complexity, a protocol is just a partial map from views to outputs. Thus, requiring an infinitely participating process to output means requiring that eventually it will have a view that is mapped by the protocol to an output value.

We define the set V = V(I) to consist of all possible views view(p_i, ω, k) in all runs r ∈ R, for all processes p_i, simplices ω ∈ I, and integers k ≥ 0. Formally, a protocol Π for the task T is a partial map from V to the set of vertices in the output complex O.

(Note that in the definition of a multi-map we allowed images to be empty. This is somewhat non-standard, as it means that processes in a task do not have to output. If one prefers to avoid that, for every task T = (I, O, ∆) we can construct a new, equivalent task T+ = (I+, O+, ∆+) as follows. We let I+ = I. The output complex O+ is obtained from O by adding extra vertices v_0, . . . , v_n (with v_i corresponding to "no output" for process i); moreover, for each simplex σ in O, we add an n-simplex σ+ in O+ by adjoining the vertices v_i for the colors i not represented in σ. Finally, we let ∆+(τ) = (∆(τ))+.)

Definition C.1
A task T = (I, O, ∆) is solvable in a sub-IIS model M if there exists a protocol Π for T such that for all r ∈ M (with r = S_1, S_2, . . . as before):

1. For each p_i, and for each n-dimensional simplex ω ∈ I, there exist k_0 and a vertex v of O colored i, such that:
• For all k < k_0, view(p_i, ω, k) ∉ domain(Π);
• For all k ≥ k_0 such that p_i ∈ S_k, we have Π(view(p_i, ω, k)) = v.
(This condition is satisfied vacuously if p_i is not infinitely participating, because we can find k_0 such that p_i did not take k_0 steps in r, so p_i ∉ S_k for k ≥ k_0.)

2. For all k, {Π(view(p_i, ω, k)) | view(p_i, ω, k) ∈ domain(Π)} is a sub-simplex of a simplex in ∆(ω ∩ χ^{-1}(part(r))).

In every run r ∈ M, condition (1) above requires every infinitely participating process to eventually produce an output, and condition (2) requires the produced output to respect the task specification ∆ given the inputs of the participating processes.

C.4 Safe-Agreement (SA) Task
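The safe-agreement conditions spelled out in this subsection amount to a simple check on the vector of outputs. The sketch below is ours, not the paper's: ⊥ is modeled as None, and condition (1) is read as validity (every non-⊥ output is some participant's input), which together with condition (3) gives agreement on a single v_q.

```python
# Sketch (ours) of the safe-agreement output conditions: each processor
# outputs ⊥ (here None) or a proposed input, at least one output is non-⊥,
# and all non-⊥ outputs agree on some processor's input v_q.

def valid_sa(inputs, outputs):
    """inputs, outputs: dicts mapping processor id -> value; None stands for ⊥."""
    non_bot = [v for v in outputs.values() if v is not None]
    validity = all(v in inputs.values() for v in non_bot)   # condition (1)
    progress = len(non_bot) >= 1                            # condition (2)
    agreement = len(set(non_bot)) <= 1                      # condition (3)
    return validity and progress and agreement
```

For example, with inputs {p: 1, q: 2}, the outputs {p: ⊥, q: 2} and {p: 2, q: 2} are valid, while all-⊥, disagreeing outputs, or a value nobody proposed are not.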
We present here safe-agreement as a task. In the literature safe-agreement is specified operationally [20].

1. Processor p ∈ P outputs ⊥ or its input v_p,
2. At least one processor p ∈ P does not output ⊥,
3. All processors that do not output ⊥ output the same v_q, q ∈ P.

The SA task can be solved wait-free [19]. A wait-free solution to SA is an SA-module. An SA-module also asks processors to post their output in shared memory. The implication is that either a processor p ∈ P that terminated knows a non-⊥ output q to SA, or, if not, there is at least one processor p′ ∈ P that has not terminated the SA-module. The idea is that either processors know the value of the election in the SA, or otherwise one processor p′ invoked SA but has not returned (or returned but did not post its return value, which is always the next thing to do after a return) from SA, and is consequently blocked from executing. We call a processor that invoked SA, did not post an output, and an output qq