Oblivious Collaboration
Yehuda Afek, Yakov Babichenko, Uriel Feige, Eli Gafni, Nati Linial, Benny Sudakov
aa r X i v : . [ c s . D C ] J un Oblivious Collaboration
Yehuda Afek ∗ Yakov Babichenko † Uriel Feige ‡ Eli Gafni § Nati Linial ¶ Benny Sudakov k May 4, 2018
Communication is a crucial ingredient in every kind of collaborative work. But what is theleast possible amount of communication required for a given task? We formalize this question byintroducing a new framework for distributed computation, called oblivious protocols .We investigate the power of this model by considering two concrete examples, the musicalchairs task
M C ( n, m ) and the well-known Renaming problem. The
M C ( n, m ) game is played by n players (processors) with m chairs. Players can occupy chairs, and the game terminates as soon aseach player occupies a unique chair. Thus we say that player P is in conflict if some other player Q is occupying the same chair, i.e., termination means there are no conflicts. By known resultsfrom distributed computing, if m ≤ n −
2, no strategy of the players can guarantee termination.However, there is a protocol with m = 2 n − scheduler chooses an arbitrary nonempty set of players, and for each of them provides only one bitof information, specifying whether the player is currently in conflict or not. A player notified not tobe in conflict halts and never changes its chair, whereas a player notified to be in conflict changes itschair according to its deterministic program. Remarkably, even with this minimal communicationtermination can be guaranteed with only m = 2 n − ∗ The Blavatnik School of Computer Science, Tel-Aviv University, Israel 69978. [email protected] † Department of Mathematics, Hebrew University, Jerusalem 91904, Israel [email protected] ‡ Department of Computer Science and Applied Mathematics Weizmann Institute of Science Rehovot 76100, [email protected]. The author holds the Lawrence G. Horowitz Professorial Chair at the Weizmann Institute.Work supported in part by The Israel Science Foundation (grant No. 873/08). § Computer Science Department, Univ. of California, LA, CA 95024, [email protected] ¶ School of Computer Science and Engineering, Hebrew University, Jerusalem 91904, Israel [email protected] k Department of Mathematics, UCLA, Los Angeles, CA, 90095. [email protected]. Research supported inpart by NSF CAREER award DMS-0812005 and by a USA-Israeli BSF grant
Introduction
In every distributed algorithm each processor must occasionally observe the activities of other pro-cessors. This can be realized by explicit communication primitives (such as by reading the messagesthat other processors send, or by inspecting some publicly accessible memory cell into which theywrite), or by sensing an effect on the environment due to the actions of other processors (such as inCarrier Sense Multiple Access channels with Collision Detection, CSMA/CD). Here we consider twosevere limitations on the processors’ behavior and ask how this affects the system’s computationalpower: (i) A processor can only post a proposal for its own output, (ii) Each processor is “blind-folded” and is only occasionally provided with the least possible amount of information, namelya single bit that indicates whether its current state is “good” or “bad”. Here “bad/good” standsfor whether or not this state conflicts with the global-state desired by the processor. Moreover, wealso impose the requirement that algorithms are deterministic (use no randomization). This newminimalist model, properly defined, is called the oblivious model . This model might appear to besignificantly weaker than other (deterministic) models studied in distributed computing. Yet, weshow that two natural problems in this field, renaming [1, 2] and musical chairs [9], can be solvedoptimally within the highly limited oblivious model. Furthermore, we discuss the efficiency ofoblivious solutions and the relations between the oblivious model and the read/write model whichis a thoroughly studied model in distributed computing [14].The oblivious model can be described and formalized in two different ways: (i) in terms ofthe operations available to individual processors, or (ii) in terms of an oblivious oracle (as in theabstract). In either case, associated with every state of a participating processor is a proposedoutput, so that the state at which a processor halts thus defines its final output. In our model,an oracle mediates between the processors. The only way a processor can sense its environment isby querying the oracle about a single predicate on the current vector of outputs of the processors.Based on the single bit answer the processor needs to either halt with its current output, or proceedwith its computation and propose a new output. But how can a processor’s computation proceed?It has no information about the state of other processors (beyond the one bit that tells it thatit must proceed), and we are forbidding randomization. Consequently, a processor’s proposedoutput can depend only on its current state, and therefore the sequence of states that processor p i traverses is simply an infinite word π i over the alphabet of possible outputs. Upon receivinga negative answer from the oracle, processor p i in state π i [ k ] moves to state π i [ k + 1]. Given thedefinition of a computational task, it is up to the programmer to design the words π i and thequery that each processor poses to the oracle under which that task is always realized properly.Our only assumption is that the oracle correctly answers the queries, and a processor eventuallyhalts/proceeds to the next state in his word upon a bad/good response from the oracle.The Musical Chairs , M C ( n, m ) task involves n processors p , . . . , p n and, m chairs numbered1 , . . . , m . Each processor p i starts in an arbitrary chair, dictated by the input. If the input chairsare all unique, all processors are good and the input is the output. 
If not all input chairs are unique,the task calls for each processor to capture a chair in exclusion.The Renaming ( n, m ) task is a close relative of M C ( n, m ). There are m slots (chairs) numbered1 , . . . , m and each participant has to capture a slot in exclusion. The processors have no input. Ifonly k < n processors participate, then each has to capture (output) a unique slot from the firstmin (2 k − , m ) slots. If all the n processors participate then they each capture one of the m slotsin exclusion.In Section 2 we define the oblivious model in detail. For the MC and the renaming problemswe use the collision query - a processor is good iff it is the only one to propose the current chair.1e show that in this case the general oblivious model simplifies considerably. These simplificationslater help us produce an optimal solution.Remarkably, for each processor we produce a program which is a single cyclic word on analphabet of chairs. Furthermore, for the MC task the program can be started at any chair inthe word. This provides for self stabilization [5, 6]. Namely, consider a system state where eachprocessor occupies an exclusive chair and there are no conflicts. Suppose that the system getsperturbed, and program counters change arbitrarily. This may create conflicts, but the system willnevertheless resettle obliviously in finite time into a conflict-free safe state.Here are the main contributions of the present paper:1. Introduction of the general oblivious model and its specialization to the problems at hand.2. A proof that there are tasks that are solvable in a read/write wait-free manner, but notsolvable obliviously.3. Characterization of the minimal m for which there is an M C ( n, m ) oblivious algorithm: Theorem 1
There is an oblivious
M C ( n, m ) algorithm if and only if m ≥ n − . Moreover, for all
N > n there exist N words on m chairs such that any n out of the N wordsconstitute an oblivious M C ( n, n −
1) algorithm.4. Likewise, for the Renaming problem
Theorem 2
There is an oblivious
Renaming ( n, m ) algorithm if and only if m ≥ n − .
5. A lower bound on the number of chairs required in the oblivious MC task is derived byreduction from the renaming task, which in turn is derived from the read/write wait-freemodel.6. The words in Theorem 1 use the least number of chairs, namely m = 2 n −
1. However, thelengths of these words grows doubly exponentially in n . Are there oblivious MC algorithmswith much shorter words? Even length O ( n )? Perhaps even length m ? How long can thescheduler survive? Here we consider systems with N ≥ n words (programs) and any n out ofthe N should constitute a solution of MC. We call these M C ( n, m ) systems with N words. Theorem 3
For every N ≥ n , almost every choice of N random words of length cn log N inan alphabet of m = 7 n letters is an M C ( n, m ) system with N full words (words that containevery letter in , . . . , m ). Moreover, every schedule on these words terminates in O ( n log N ) steps. Here c is an absolute constant.
7. Since we are dealing with full words (words that contain every letter in 1 , . . . , m ) and weseek to make them short, we are ultimately led to consider the case where each word is apermutation on [ m ]. At the moment the main reason to study this question is its aestheticappeal. We can design permutation-based oblivious M C ( n, n −
1) algorithms for very small n (provably for n = 3, computer assisted proof for n = 4). We suspect that no such constructionsare possible for large values of n , but we are unable at present to show this. We do know,though that 2 heorem 4 For every integer d ≥ there is a collection of N = n d permutations on m = cn symbols such that every n of these permutations constitute an oblivious M C ( n, m ) algorithm.The constant c depends only on d . In fact, this holds for almost every choice of N randompermutations on [ m ] . We should stress that our proofs of Theorems 3 and 4 are purely existential. The explicitconstruction of such systems of words remains largely open, though we do have some resultsin this direction, e.g.,
Theorem 5
For every integer d ≥ there is an explicitly constructed collection of N = n d permutations on m = O d ( n ) symbols such that every n of these permutations constitute anoblivious M C ( n, m ) algorithm. Most of the technical results in this paper concern the design of oblivious algorithms for the MCtask, either with the least possible number of chairs (namely, m = 2 n − m = O ( n )chairs. These results then extend to the renaming task. The purpose of this section is to highlightseveral additional aspects of the subject.We start with a number of simple observations. (i) An oblivious M C ( n, m ) algorithm cannotinclude any two identical words. Otherwise the corresponding players might move together inlock-step, constantly being in collision. Hence it is essential that no two processors have the sameprogram. (ii) For every oblivious M C ( n, m ) algorithm with finite words, there is a finite upperbound on the number of moves a processor can make before termination. This is because thereare only finitely many system states, and in a terminating sequence of moves no system state canbe visited twice. (iii) In fact, for every collection of finite words there is a directed graph whosevertices are all the system states. Edges correspond to the possible transitions. The collection ofwords constitute an oblivious MC protocol iff this graph is acyclic. These observations depend onthe assumption that the algorithm is deterministic.Our oblivious algorithms for MC have a number of additional desirable properties. For every n > m = 2 n − N > n we design N periodic words that are full (i.e., contain everychair) with the following properties: for every choice of n or fewer of the N words, for everychoice of states on these words, each word is guaranteed to reach a chair not shared by any otherword. There is an upper bound (that depends only on N ) on the number of steps taken by anyword, and moreover, this guarantee holds even if other words fault and no longer change states.Hence our oblivious algorithms can be run in dynamic settings in which the set of players in thesystem keeps changing. It is still guaranteed to reach a conflict free state provided that there aresufficiently long intervals without dynamic changes. Moreover, this protocol can withstand variouskinds of faults, e.g., non-faulty processors can complete their computations even in the presence offaulty processors. To illustrate this idea, consider a company that manufactures N communicationdevices, each of which can use any one of m frequencies. If several such devices happen to be atthe same vicinity, and simultaneously transmit at the same frequency, then interference occurs.Devices can (i) Move in or out of the area, (ii) Hop to a frequency of choice and transmit at thisfrequency, (iii) Sense whether there are other transmissions in this frequency. The company wantsto provide the following guarantee: If no more than n devices reside in the same geographicalarea, then no device will suffer more than M interference events for some specific integer M . Ouroblivious MC algorithms would guarantee this by pre-installing in each device a list of frequencies(a word in our terminology), and having the device hop to the next frequency on its list (in a cyclic3ashion) in response to any interference it encounters. No communication beyond the ability tosense interference is needed.In Section 2 we present a formal model in which our oblivious algorithms apply, placing itwithin a known standard framework for distributed computing. 
The model presented in Section 2does not attempt to capture all possible interpretations of our MC protocols. For example, themodel concerns tasks that terminate, whereas our protocols work equally well in reactive systemsthat keep adapting to a changing environment. What the formal model does capture is importantconnections with previous works in distributed computing, from which a lower bound of m ≥ n − m ≥ n −
1. The two crucial aspects are the asynchrony of the model, and the fact that our algorithmsare deterministic (no randomization). In a synchronous setting, where in every time step, everyprocessor involved in a collision moves to its next state), m = n suffices, even for oblivious protocols.(This can be proven using the techniques of Theorem 3. Details omitted.) Likewise, m = n suffice ifrandomization is allowed – with probability 1 eventually there are no collisions. However, no specificupper bound on the number of steps can be guaranteed in this case. Moreover, if the randomizedalgorithms is run using pseudorandom generators (rather than true randomness) the argumentbreaks. For any fixed seed of a pseudorandom generator, the algorithm becomes deterministic andthe lower bound m ≥ n − m ≥ n − m = n ) contains n distinct single-letter words. Another requirement is that ifthe input chairs are all unique, all processors are good and the input is the output. Without sucha requirement, the processors might simply ignore the initial input and the trivial oblivious MCalgorithm would still apply. Hence the lower bound of m ≥ n − m ≥ n − m ≥ n −
1. That presentation clarifies the minimal requirements that are needed in order tomake the lower bound work. In particular, it is not necessary that one can dictate an arbitrarystarting chair for each processor – dictating one of two chairs suffices.As noted, we design oblivious
M C ( n, m ) protocols with m = 2 n −
1. Part of our work alsoconcerns analyzing what ratios between m and n one can obtain using collections of randomlychosen words as in Theorem 3. As explained in the introduction, this allows us to present moreefficient deterministic oblivious programs – though random words seem to need more chairs, theycan reach conflict free configurations more quickly. Moreover, the use of random words is a designprinciple that can be applied to design oblivious algorithms for other tasks as well. Developing anunderstanding of what they can achieve and techniques for their analysis is likely to pay off in thelong run. One of the major questions that remain open in our work is whether randomly chosen4ords can be used to design deterministic oblivious MC protocols with m = 2 n − A task [13] is a distributed computational problem involving several processes (or processors).There is an upper bound denoted by n on the number of processes that may participate in thetask. Each participating process starts with a private input value, exchanges information withother participating processes (for example, by writing to and reading from a common memory),and halts with an output value. A nonparticipating process is indistinguishable to other processesfrom a process that is participating but has not yet performed any observable operation (suchas a write operation). The task is specified by a relation ∆ that associates with every inputvector (one element per participating processor) a set of output vectors that are allowed given thisinput. For notational convenience, the input and output vectors are of dimension n (even when thenumber of participating processors is smaller) and the corresponding entries for nonparticipatingprocessors are denoted there by the special symbol ⊥ . We use the notation v inp and v out todenote these vectors, though the reader should note that the subscripts inp and out might be a bitmisleading (the ⊥ entries for nonparticipating processors are neither true inputs nor true outputs,but only notation indicating that the processors are not participating). Given our conventionregarding ⊥ , an input vector v inp implicitly describes which are the participating processors, namely, P rtc ( v inp ) = { p i | v inp ( i ) = ⊥} . Restating our conventions regarding notation for nonparticipatingprocessors we have that for ( v inp , v out ) ∈ ∆ it must hold that v inp ( i ) = ⊥ iff v out ( i ) = ⊥ . The Musical Chairs task:
In the musical chairs task
M C ( n, m ) there are n ≥ { p , . . . , p n } , and a set of chairs { , . . . , m } . Each participating processor starts in an arbitrarychair, dictated by its input, and it has to capture a chair in exclusion. If the input chairs are allunique, all processors must output their input. The formal definition, following the notations of[13] is: v inp ( i ) , v out ( i ) ∈ { , , ..., m, ⊥} .1. If ∀ i, j ∈ P rtc ( v inp ) , v inp ( j ) = v inp ( i ) then v out = v inp , and2. Else ∀ i, j ∈ P rtc ( v inp ) , v out ( j ) = v out ( i ). The Renaming task:
In the
Renaming ( n, m ) task there are n ≥ { p , . . . , p n } , and m slots numbered 1 , . . . , m . Each participant has to capture a slot in exclusion. Formally, theprocessors have no input, though for notational convenience we shall assume that participatingprocessors have the input 1. If only k < n processors participate, then each has to capture (output)a unique slot from the first min (2 k − , m ) slots. If all the n processors participate then they eachcapture one of the m slots in exclusion. The formal definition, following the notations of [13] is: v inp ( i ) ∈ { , ⊥} and v out ( i ) ∈ { , , ..., m, ⊥}
1. If | P rtc ( v inp ) | = k < n then v out ( i ) ∈ { , , ..., k − , ⊥} and ∀ i, j ∈ P rtc ( v inp ) , v out ( j ) = v out ( i ), and2. If | P rtc ( v inp ) | = n then v out ( i ) ∈ { , , ..., m, ⊥} and ∀ i, j ∈ P rtc ( v inp ) , v out ( j ) = v out ( i ).5 .2 The Oblivious Model The
Oblivious model is an asynchronous distributed computing model in which each processor,at each point of time, exposes an output value it currently proposes, and may receive at mostone bit of information. This bit indicates whether its proposed output is legal with respect tothe other currently proposed outputs (and hence the processor may halt) or not (and then theprocessor should continue the computation). If a processor decides to halt at the current state,then its proposed output is its final output. We denote the set of possible output values by O .A system configuration (or configuration for short) is a vector of n elements, one per processor,whose entries come from the set O ∪ {⊥} . Here ⊥ represents a processor that has not yet proposedany output, either because it is not participating, or because it was not scheduled yet to proposean output (these two cases are indistinguishable to other processors. An entry from O representsthe output a corresponding processor proposes in the configuration. In an oblivious algorithmcorrectly designed for a given task, eventually all participating processors must halt, and the finalconfiguration must be a legal output vector in the task specification.The defining feature of the oblivious model is that each processor may receive only one bitof information about the system configuration in each computation step; whether the currentconfiguration is illegal and it should change its state (and thus its proposed output), or whetherit may halt in its current state. The fact that a processor p i is not informed to change its statedoes not necessarily mean that the current configuration is legal. For example, the configurationmight be illegal, but changing p i ’s state would not get the system any closer to a legal configuration.However, in a correct algorithm (program) at least one processor is notified to change its state in anillegal configuration. The choice of function specifying for each configuration which processors maychange their state and which may halt and output is up to the algorithm designer. The algorithmprovides each processor with a predicate on configurations, specifying in which configuration itchanges its state, and in which it may halt. In the most general setting the predicate providedfor each processor may depend on its input, possibly a different predicate for different inputs.However, throughout an execution one predicate is used for each processor. Our formal model doesnot exclude the use of arbitrary complex predicates, but oblivious algorithms have greater appealwhen the predicates involved are simple and natural. For the two tasks considered in this paper,the same collision predicate is used by all the processors for any input.Initially, and as a function of its input, each processor p i selects a word π i over O , and a predicate pred i on the set of of all configurations. The first letter in π i is p i ’s input, i.e., π i [1] = input i ∈ O .For tasks such as renaming in which a processor need not have any input, the first letter is set tobe an output that is valid if no other processor participates (which explains why in the definitionof renaming we used the convention that the input to participating processors is 1).We describe the system using the notion of an omnipotent know-all scheduler called asyn-chronous (other schedulers with different names are described in the sequel). Execution under thecontrol of the asynchronous scheduler proceeds in rounds. 
The scheduler maintains a set P ofparticipating processors, a set E ⊂ P of enabled processors, and a set DON E (disjoint from P ) ofprocessors that have already halted. These sets are initially empty. In each round the schedulerperforms the following sequence of operations. It may add some not yet participating processors to P . It may evaluate the predicate pred i for some subset of processors in P \ E . If pred i evaluatesto true, the scheduler adds processor p i to the set E . Otherwise, if it evaluates to false, it removes p i from P and adds it to the set DON E . Finally, the scheduler selects a subset SE ⊆ E , removesit from E , and moves each p i ∈ SE to its next letter in π i . I.e., the current output of p i is replacedby the next one in its program, π i . This completes the round.6n oblivious algorithm solves a task if for every input vector, the scheduler is forced to eventuallyplace all participating processors in the DON E set. At that point it can no longer continue, andthe final configuration is such that ( v inp , v out ) ∈ ∆, the relation that defines the task.A well known model for distributed computing is the read/write wait-free model, that weshall sometimes simply refer to as read/write. The main features of this model is that processorscommunicate via read and write operations, scheduling is asynchronous, and every task is completedby a processor in a finite number of steps (regardless of the actions of other processors; this is thewait-free property). See [12] or [13] for more details. The asynchronous scheduler for obliviousalgorithms mimics the behavior of an asynchronous read/write algorithm on configurations. ThusTheorem 6 below can be proved simply by having each processor emulate the scheduler throughreads (snapshots) and writes of its newly proposed output in shared memory. Theorem 6
Every task that is solvable obliviously is solvable read-write wait-free.
Proof.
Given an oblivious distributed program to solve a task we provide a read-write wait-freealgorithm to solve the same task. In the read-write system the shared memory has one single writermulti reader register for each processor, in which the processor publishes its currently proposedoutput. W.l.o.g., we can replace each read by an atomic snapshot [3].Initially, as a function of its input, each processor writes its first proposed output, and uploadsits predicate for the run. Then the processor repeatedly takes a snapshot and writes its next outputin its oblivious program until a snapshot evaluates to false. A snapshot evaluated by the predicateto false, corresponds to a configuration in which a processor was added to the
DON E set.An execution in the read-write model is thus a linear sequence of reads (snapshots) and writesand it corresponds to an execution in the oblivious model in the following way: All the processorsthat observe the same snapshot are those that the asynchronous scheduler evaluates their predicateat the same round. Those evaluated to true are added to the enabled set E , and those evaluatedto false are added to the DON E set and stopped forever. The set of writes that occur after thissnapshot, and before the next snapshot, correspond to the subset of enabled processors that thescheduler move to their next letter in their program π . Since the scheduler must stop with thecorrect output vector so will the read/write algorithm. (cid:4) Thus the oblivious model is subsumed by the read/write model. Is this a proper inclusion? Toclarify the answer we introduce an intermediate class of tasks that we call Output Negotiation,or ON . It includes those tasks solvable read-write wait-free in a system where writing is in theoblivious model (processors can only expose their proposed outputs), whereas reading is as in thegeneral read/write model (a processor can read all exposed information rather than only a singlepredicate). By definition, every obliviously solvable task is ON solvable. Corollary 7
Every obliviously solvable task is in ON . Obviously, ON is a subset of read/write, and in Theorem 8 below we show that this inclusionis proper. Consequently the oblivious model is a strict subset of read/write. Theorem 8
There exists a task,
AntiM C , that is solvable read-write wait-free but does not belongto ON . Proof.
The task
AntiM C is a variation on epsilon agreement [7]. It is a task with 3 processorswhose input and output are each a number in { , . . . , } . A processor running alone must output7ts input. Otherwise all the outputs must be one of two consecutive numbers (5 and 1 are not consecutive). Formally, AntiMC on 3 proccesors: v inp ( i ) , v out ( i ) ∈ { , , ..., , ⊥}
1. If | P rtc ( v inp ) | = 1 then, v inp = v out ,2. Else ∀ i, j ∈ P rtc ( v inp ) , | v out ( j ) − v out ( i ) | ≤ ǫ -agreement.Let us provide a few more details. Participating processors do not only post a proposed output,but also an integer weight (in the range 1 to W ) that specifies how “confident” they are in theiroutput. We now describe the actions of a participating processor. Initially, it posts its input as aproposed output, and posts a weight of 1. Thereafter, the processor performs “rounds” in its ownspeed (determined by an asynchronous scheduler). In a round the processor inspects the proposedoutputs and posted weights of all other processors (assigning weight 0 to those processors who didnot yet post anything), computes a weighted average of all proposed outputs (including its own),and posts the integer nearest to it as a new proposed output. It also raises its weight by 1 and postsits new weight. Processors halt (with their current proposed output as their final output) when thetotal weight reaches W . Choosing W to be sufficiently large guarantees that all final outputs arewithin 1 of each other. Further details are omitted.We now show that AntiMC is not solvable just by communicating outputs. Observe that wemay assume that a processor first posts its input. (If a processor performs read operations beforeposting any output we may schedule the read operations before any other processor posted anoutput, and hence eventually the processor is forced to post its input.) Consider the input vector(1 , ,
3) and two scheduling scenarios. In the first scenario, schedule p first (with input 1) andcontinue to schedule only p . Eventually p must terminate at 1. Now schedule p (with input 5)and let it post its input. In the second scenario reverse the roles of p to terminate with 5 and p to have just posted 1. Observe that in the first scenario the outputs should eventually be in { , } and in the second scenario in { , } . Now schedule p (with input 3) and let it run withoutinterference until termination. Both scenarios are indistinguishable to p , and whatever it outputsis incompatible with at least one of the scenarios. (cid:4) M C ( n, n − In Sections 3 and 4 we show that
M C ( n, n −
1) and Renaming( n, n −
1) are solvable obliviously.Renaming( n, n −
2) is unsolvable read-write wait-free [8, 11], and hence not solvable obliviouslyeither. Theorem 9 shows a reduction from Renaming( n, n −
2) to
M C ( n, n − M C ( n, n −
2) is not solvable read-write wait-free, and hence also not solvable obliviously.
Theorem 9
Renaming ( n, n − is read-write wait-free reducible to M C ( n, n − . Proof.
Whenever we say algorithm in the proof, we shall mean a read/write wait-free distributedalgorithm.Suppose that there is an algorithm for the
M C ( n, n −
2) task. Recall that there is an algorithmfor the Renaming( n − , n −
3) task. By using both algorithms, we shall design an algorithm forthe Renaming( n, n −
2) task. The basic observation is that if fewer than n processors participatethen Renaming( n, n −
2) is equivalent to Renaming( n − , n − n processors participateRenaming( n, n −
2) is equivalent to
M C ( n, n − n − , n −
3) or in
M C ( n, n − n, n − n , the processor joins the M C ( n, n −
2) task, with a private input value of 1. If on theother hand the count is smaller, the processor first joins the Renaming( n − , n −
3) task (againwith input 1). However, a processor that completes the Renaming( n − , n −
3) task is not done(because by the time of its arrival and the time that it completed the Renaming( n − , n −
3) taskit could be that additional processors arrived and are running the
M C ( n, n −
2) task, thus leadingto incompatibilities in the outputs). Instead, it joins the
M C ( n, n −
2) task, using his output fromRenaming( n − , n −
3) as input to
M C ( n, n − n − , n −
3) task is at most n −
1, and hence this task runs properly.2. If the total number of participating processors is at most n − n − , n −
3) task, and hence legal for Renaming( n, n − n then the output is that of the M C ( n, n − n, n − (cid:4) The definition of oblivious algorithms in Section 2.2 postulates that as a function of its input,each processor selects an (infinite) sequence of outputs. For the Renaming task, processors haveno input (or alternatively, are assumed to always have the input 1), and hence each processor hasonly one sequence. For
M C there are m possible inputs that a processor may have, and hence ourmodel allows each processor to have m different sequences, one for each input. Nevertheless, ourconstructions of oblivious algorithms all have the property that the same sequence is used for allinputs. Moreover, we consider finite sequences over which the processor goes cyclically. In the M C task one can designate m locations in the word, each corresponding to a possible output that hasbeen dictated by the input to the processor. The infinite word for each input is then attained byadvancing cyclically on the word starting from that designated location. In fact, we strengthen thescheduler; If an output appears in the word more than once, every appearance of the output is avalid starting point for the M C program (providing the scheduler with more choices). This makesthe
M C program self-stabilizing [5, 6], as mentioned in the introduction.
Our general model for oblivious algorithms is described using the asynchronous scheduler (Section2.2). The asynchronous scheduler enjoys a large degree of freedom in choosing which processor9o move. To simplify the design and analysis of oblivious algorithms, it is convenient to con-sider simpler schedulers that have fewer degrees of freedom, but are nevertheless equivalent to theasynchronous scheduler in their power to prevent successful completion of tasks. Our obliviousalgorithms for MC and Renaming use only a simple collision predicate. That is, a processor canbecome enabled by the asynchronous scheduler only if it is involved in a collision, and may be movedto the
DON E set only if not involved in a collision. The simple nature of this collision predicateallows us to present a sequence of schedulers that appear to be successively weaker, though all arein fact equivalent (with respect to MC and Renaming). The results in this section will be presentedonly with respect to the MC task, but at the end of this section we explain how to extend them toRenaming.
Terminology.
Whenever we say that two schedulers are equivalent it means that a collectionof n words over an alphabet of m chairs forms an oblivious M C ( n, m ) algorithm with respect toone scheduler if and only if it forms an oblivious M C ( n, m ) algorithm with respect to the otherscheduler. I.e., one scheduler has an infinite run from some initial configuration with a set of n words if and only if the other scheduler has.Recall that the asynchronous scheduler maintains several sets, P (for those processors that areparticipating), E (for those processors that may move at the current or some future round), and DON E (for those processors that will move no more). The set E gives the asynchronous schedulerits flexibility and freedom to move processors that have been in conflict at some point in the future.We now present a scheduler that makes only limited use of the set P , and does not use the set DON E . Quiescent scheduler.
It is a scheduler for which the set P never changes. That is, every processorthat participates in the execution is added to P (and posts a proposed output) immediately asthe execution begins (rather than at a point in time determined by the scheduler). Moreover,no processor is ever told to halt (and hence there is no need for the set DON E ). Processorsthat are never told to move from some point, simply become quiescent. When all processors arequiescent the scheduler has no more moves, and the execution terminates. Other than putting allparticipating processors in P upfront and not having a DON E set, the quiescent scheduler behaveslike the asynchronous scheduler. An oblivious
M C ( n, m ) protocol is required to force the quiescentscheduler to reach a configuration in which E is empty and there are no collisions. Conversely, aquiescent scheduler foils a proposed oblivious algorithm if it can generate an infinite execution inwhich in every configuration either E is nonempty or there is a collision. Proposition 10
The asynchronous scheduler and the quiescent scheduler are equivalent.
Proof.
Asynchronous scheduler at least as strong as quiescent scheduler.
All moves available tothe quiescent scheduler are also available to the asynchronous scheduler. Hence if the quiescentscheduler has an infinite run, the asynchronous scheduler can force an infinite execution as well (byimitating the quiescent scheduler).
Quiescent scheduler at least as strong as asynchronous scheduler.
For an asynchronous schedulerto foil a proposed oblivious
M C ( n, m ) algorithm, it needs to generate an infinite execution. Thequiescent scheduler can imitate the asynchronous scheduler with the following differences. When-ever the asynchronous scheduler places a processor in the DON E set (there are at most n rounds inwhich this happens), the quiescent scheduler does not do so (and drops the round if no other actionwas taken in this round). Whenever the asynchronous scheduler places a processor in P (there areat most n rounds in which this happens), the quiescent scheduler instead places the processor in P in the first round. All other moves of the asynchronous scheduler remain legal for the quiescent10cheduler, and hence infinite executions for the asynchronous scheduler result in infinite executionsfor the quiescent scheduler. (cid:4) Our next goal is to get rid of the set E . Immediate scheduler.
The immediate scheduler is similar to the quiescent scheduler, exceptthat it does not maintain a set E of enabled processors. Instead, in each round it can only selectprocessors that are currently involved in a collision and move them. It is important to note that in around the immediate scheduler does not need to select all processors that are involved in a collision– it may select a nonempty subset of its choice. An oblivious M C ( n, m ) protocol is required toforce the immediate scheduler to reach a configuration in which there are no collisions. Conversely,an immediate scheduler foils a proposed oblivious M C ( n, m ) algorithm if it can generate an infiniteexecution never reaching a configuration in which there are no more collisions. Proposition 11
The quiescent scheduler and the immediate scheduler are equivalent.
Proof.
The immediate scheduler is a special case of the quiescent scheduler (essentially it placesprocessors in E and moves them at the same round). Hence it remains to show that the immediatescheduler is at least as strong as the quiescent scheduler. This is equivalent to showing the followingstatement: whenever there is an infinite run of the quiescent scheduler, there is also an infinite runof a quiescent scheduler in which whenever it places a processor in E , it moves it in the same round.We prove this last statement by a double induction on the round t (increasing) and the number k of processors that violate this statement in round t (decreasing until k = 0, and thus causing t to increase). Our inductive proof has the property that some rounds might become empty in theprocess (contain no action on behalf of the scheduler). However, despite this, every infinite runtransforms into an infinite run, because the number of processor moves is kept unchanged.Given a proposed oblivious M C ( n, m ) algorithm and an infinite execution by the quiescentscheduler, let t be the first round in which there is a processer added to E and not moved in thesame round, and let k ≥ t . Pick an arbitrary processor p added to E in round t and not moved in this round. If p is not moved even in any future round,simply do not put p in E . This decreases k and the inductive step is done. Alternatively, if p ismoved in a future round, say round t ′ > t , we consider two cases. In one case there is some round t ” with t < t ” ≤ t ′ in which p is involved in a collision. In this case, rather than placing p in E inround t , simply do this in round t ” > t instead. This decreases k and the inductive step is done. Inthe other case, there is no such round t ”. In this case, move p in round t rather than round t ′ . Thisalso decreases k by one. Observe that all moves available to the scheduler between rounds t and t ′ are still available also after this change in the scheduler, since p could not contribute to enablingprocessors within this interval of rounds. (cid:4) Having eliminated the sets E and DON E , we now turn our attention to limiting the numberof processors that can be moved in a round.
Pairwise immediate scheduler.
This is similar to the immediate scheduler but with the followingrestriction. In every round, the pairwise immediate scheduler can select any two processors currentlyin collision with each other, and move either one of them, or the other, or both. Equivalently, inevery round either only one processor (involved in a collision) moves, or two processors that sharethe same chair.
Proposition 12
The immediate scheduler and the pairwise immediate scheduler are equivalent. roof. The pairwise immediate scheduler is a special case of the immediate scheduler. Hence itremains to show that whenever there is an infinite run with the immediate scheduler, there is alsoan infinite run with the pairwise immediate scheduler. We prove this last statement by a doubleinduction on the round t and the number k of processors that move in round t .Given a proposed oblivious M C ( n, m ) algorithm and an infinite execution by the immediatescheduler, let t be the first round in which the moves were not consistent with a pairwise immediatescheduler and let k be the number of processors that move in round t . There are two cases toconsider. In one case the set SE of processors that moved shared in round t the same chair and k ≥
3. Break round t into two rounds, pushing future rounds by one. In the first of them (round t ) move only one of the processors from SE and in the second round (round t + 1) move the rest(they can still move because there are at least two of them). This completes the inductive stepwith respect to t . The other case is that the set SE of processors that moved in round t collidedon at least two different chairs. Pick one of these chairs, say chair c , and let SE ( c ) be the set ofthose processors in SE that in round t collide in chair c . Break round t into two rounds, pushingfuture rounds by one. In the first of them (round t ) move only those processors in SE ( c ), and inthe second round (round t + 1) move those processors in SE − SE ( c ). This completes the inductivestep (as either k decreased or t increased). (cid:4) The use of the pairwise immediate scheduler (which as we showed is equivalent to the asyn-chronous scheduler) helps simplify the proofs of theorems 1, 2 and 4. However, for the proof ofTheorem 3 even the pairwise immediate scheduler has too many degrees of freedom. It is true thatit has to pick only one pair of processors to move (and then either move only one or both of them),but it is still free to pick a pair of its choice (among those pairs that collide). We would like toeliminate this degree of freedom.
Canonical Scheduler . The canonical scheduler is similar to the pairwise immediate scheduler butwith the following difference. In every round in which there is a collision, one designates a canonicalpair . This is a pair of processors currently in collision with each other, but they are not chosenby the scheduler, but rather dictated to the scheduler. Given the canonical pair, the scheduler canmove either one of the these processors, or the other, or both. But how is the canonical pair chosen?In the current paper this does not really matter to us, as long as the choice is deterministic. Forconcreteness, we shall assume the following procedure. Consider all pairs of processors and fix anarbitrary order on them. In a configuration with a collision, the canonical pair is the first pair ofplayers in the order that share a chair.We now prove the equivalence of the canonical scheduler with the immediate scheduler (theproof does not become any simpler if we replace in it immediate scheduler by pairwise immediatescheduler).
Proposition 13
The immediate scheduler and the canonical scheduler are equivalent.
Proof.
The canonical scheduler is a special case of the immediate scheduler. Hence it remains toshow that whenever there is an infinite run of the immediate scheduler, there is also an infinite runwith the canonical scheduler. We prove this last statement by induction on the round t .Given a proposed oblivious M C ( n, m ) algorithm and an infinite execution by the immediatescheduler, let t be the first round in which the moves were not consistent with a canonical scheduler.That is, the canonical pair at round t consists of two processors (say P and P , without loss ofgenerality) that collide on a chair (say, chair c ), whereas the immediate scheduler moved at leastone processer not from the canonical pair. We consider several cases.12 ase 1. The immediate scheduler never moves P in any round from t onwards. In this casemove P in round t . Note that all moves (except for the move just performed, moving P awayfrom c ) performed by the immediate scheduler from round t onwards are still available to thisscheduler (because chair c remains occupied). Hence the total number of moves in the scheduledid not change, whereas t increases by one, completing the inductive step. The same argument canbe applied with P and P exchanged. Case 2.
The immediate scheduler moves P out of c in a later round than it moves P . Inthis case move P in round t . Again, all moves (except for the move just performed, moving P away from c ) performed by the immediate scheduler from round t onwards are still available tothis scheduler. The same argument can be applied with P and P exchanged. Case 3.
The immediate scheduler moves both P and P out of c in the same round t ′ ≥ t .There are two subcases to consider. In one, there is no processor other than P and P on chair c in any of the rounds t, . . . , t ′ . In this subcase, move P and P in round t (pushing future rounds byone). All moves performed by the immediate scheduler from round t to t ′ are still available to thisscheduler. The other subcase is that there is some round t ≤ t ” ≤ t ′ in which some other processorsay P is on chair c . Consider the largest such t ”. Move P in round t (pushing future roundsby one) and P in round t ” + 1 (the round that previous to the pushing of rounds was round t ”),together with whoever else is moved at that round. (cid:4) Remark.
The results of this section apply also for Renaming and not only for
M C . However,the proofs for Renaming need to be slightly changed. The difference is that in Renaming anoblivious algorithm fails not only if the scheduler manages to exhibit an infinite execution, butalso if the scheduler manages to make a processor output a value larger than 2 k − k isthe number of participating processors. Modifying the proofs so that they handle also this form offailure is straightforward, and we omit the details. n − chairs In this section we prove the upper bound that is stated in Theorem 1. We start with somepreliminaries. The length of a word w is denoted by | w | . The concatenation of words is denotedby ◦ . The r -th power of w is denoted by w r = w ◦ w . . . ◦ w ( r times). Given a word π and aletter c , we denote by c ⊗ π the word in which the letters are alternately c and a letter from π inconsecutive order. For example if π = 2343 and c = 1 then c ⊗ π = 12131413. A collection of words π , π , ..., π n is called terminal if no schedule can fully traverse even one of the π i . Note that wecan construct a terminal collection from any M C algorithm just by raising each word to a highenough power.We now introduce some of our basic machinery in this area. We first show how to extendterminal sets of words.
Proposition 14
Let n, m, N be integers with < n < m . Let Π = { π , . . . π N } be a collection of m -full words such thatevery n of these words form an oblivious M C ( n, m ) algorithm . (1) Then Π can be extended to a set of N + 1 m -full words that satisfy condition (1). roof. Suppose that for every choice of n words from Π and for every initial configuration noschedule lasts more than t steps. (By the pigeonhole principle t ≤ L n , where L is the lengthof the longest word in Π). For a word π , let π ′ be defined as follows: If | π | ≥ t , then π ′ = π .Otherwise it consists of the first t letters in π r where r > | π | /t . The new word that we introduceis π N +1 = π ′ ◦ π ′ ◦ . . . ◦ π ′ n . It is a full word, since it contains the full word π as a sub-word.We need to show that every set Π ′ of n − π N +1 constitute anoblivious M C ( n, m ) algorithm. Observe that in any infinite schedule involving these words, theword π N +1 must move infinitely often. Otherwise, if it remains on a letter c from some point on,replace the word π N +1 by an arbitrary word from Π − Π ′ and stay put on the letter c in this word.This contradicts our assumption concerning Π. (Note that this word contains the letter c by ourfullness assumption.) But π N +1 moves infinitely often, and it is a concatenation of n words whereasΠ ′ contains only n − π N +1 must reach the beginning of a word π α for some π α Π ′ . From this point onward, π N +1 cannot proceed for t additional steps, contraryto our assumption. (cid:4) Note that by repeated application of Proposition 14, we can construct an arbitrarily largecollection of m -full words that satisfy condition (1).We next deal with the following situation: Suppose that π , π , ..., π m is a terminal collection,and we concatenate an arbitrary word σ to one of the words π i . We show that by raising all wordsto a high enough power we again have a terminal collection in our hands. Lemma 15
Let π , π , ..., π p be a terminal collection of full words over some alphabet. Let σ be anarbitrary full word over the same alphabet. Then the collection ( π ) k , ( π ) k , ..., ( π i − ) k , ( π i ◦ σ ) , ( π i +1 ) k , ..., ( π p ) k is terminal as well, for every ≤ i ≤ p , and every k ≥ | π i | + | σ | . Proof.
We split the run of any schedule on these words into periods through which we do notmove along the word ( π i ◦ σ ) . We claim that throughout a single period we do not traverse afull copy of π j in our progress along the word ( π j ) k . The argument is the same as in the proofof Proposition 14. By pasting all these periods together, we conclude that during a time intervalin which we advance ≤ | π i | + | σ | − π i ◦ σ ) every other word ( π j ) k traverses at most | π i | + | σ | − π j . In particular, there is a whole π j in the j -th word inthe collection that is never visited. If the schedule ends in this way, no word is fully traversed, andour claim holds.So let us consider what happens when a schedule makes ≥ | π i | + | σ | steps along the word( π i ◦ σ ) . We must reach at some moment the start of π i in our traversal of the word ( π i ◦ σ ) . Butour underlying assumption implies that from here on, no word can fully traverse the corresponding π k (including π i ). Again, no word is fully traversed, as claimed. (cid:4) Lemma 15 yields immediately:
Corollary 16
Let π , π , ..., π p be a terminal collection of full word over some alphabet, and let π p +1 , π p +2 , ..., π n be arbitrary full words over the same alphabet. Then the collection ( π ◦ π ◦ ... ◦ π n ) , ( π ) k , ( π ) k , ..., ( π i − ) k , ( π i +1 ) k , ..., ( π p ) k is terminal as well. This holds for every ≤ i ≤ p and k ≥ P ni =1 | π i | . This is a special case of Lemma 15 where σ = π i +1 ◦ . . . π n ◦ π . . . ◦ π i − .14 .2 The MC( n, n − ) upper bound The proof we present shows somewhat more than Theorem 1 says. We do this, since the schedulercan “trade” a player P for a chair c . Namely, he can keep P constantly on chair c . This allowsthe scheduler to move any other player past c -chairs. In other words this effectively means theelimination of chair c from all other words. This suggests the following definition: If π is a wordover alphabet C and B ⊆ C , we denote by π ( B ) the word obtained from π by deleting from it theletters from C \ B .Our construction is recursive. An inductive step should add one player (i.e., a word) and twochairs. We carry out this step in two installments: In the first we add a single chair and in thesecond one we add a chair and a player. Both steps are accompanied by conditions that counterthe above-mentioned trading option. Proposition 17
For every integer n ≥ • There exist full words s , s , ..., s n over the alphabet { , , ..., n − } such that s ( A ) , s ( A ) , ..., s p ( A ) is a terminal collection for every p ≤ n , and every subset A ⊆ { , , ..., n − } of cardinality | A | = 2 p − . • There exist full words w , w , ..., w n over alphabet { , ..., n } , such that w ( B ) , w ( B ) , ..., w p ( B ) is a terminal collection for every p ≤ n , and every subset B ⊆ { , , ..., n } of cardinality | B | = 2 p − . The words s , s , ..., s n in Proposition 17 constitute a terminal collection and are hence anoblivious M C ( n, n −
1) algorithm that proves the upper bound part of Theorem 1. In the rest ofthis section we prove Proposition 17.
Proof.
As mentioned, the proof is by induction on n . For n = 1 clearly s = 11 and w = 1122 satisfythe conditions.In the induction step we use the existence of s , s , ..., s n to construct w , w , ..., w n . Likewisethe construction of s , s , ..., s n +1 builds on the existence of w , w , ..., w n . The transition from w , w , ..., w n to s , s , ..., s n +1 : To simplify notations we assume that the words w , w , ..., w n in the alphabet { , , ..., n + 1 } (rather than { , , ..., n } ) satisfy the proposition. Let k := P | w i | and define: s : = 1 ⊗ (( w ◦ w ◦ ... ◦ w n ) n +1) ) ∀ i = 2 , . . . n + 1 s i : = ( w i − ) k (2 n +1) ◦ A ⊆ { , , ..., n + 1 } of cardinality | A | = 2 p − p ≤ n + 1, and let us showthat s ( A ) , s ( A ) , ..., s p ( A ) is a terminal collection. There are two cases to consider:We first assume 1 / ∈ A . This clearly implies that p ≤ n (or else A = { , , ..., n + 1 } and inparticular 1 ∈ A ). In this case the collection is: s ( A ) : = (( w ( A ) ◦ w ( A ) ◦ ... ◦ w n ( A )) n +1) ) ∀ i = 2 , . . . p s i ( A ) : = ( w i − ( A )) k (2 n +1)
15y the induction hypothesis, the collection w ( A ) , w ( A ) , ..., w p − ( A ) , w p ( A ) is terminal. Weapply Corollary 16 and conclude that( w ( A ) ◦ w ( A ) ◦ ... ◦ w n ( A )) , ( w ( A )) k , ( w ( A )) k , ..., ( w p − ( A )) k is terminal as well. But the s i are obtained by taking (2 n + 1)-th powers of these words, so that s ( A ) , s ( A ) , ..., s p ( A ) is terminal as needed.We now consider what happens when 1 ∈ A .We define F := ( w ( A ) ◦ w ( A ) ◦ ... ◦ w n ( A )) and for for j >
1, let F j := ( w j − ( A )) k . We referto F i as the i -th block. In our construction each word has 2 n + 1 blocks, ignoring chair 1.At any moment throughout a schedule we denote by O the set of players in { P , P , ..., P p } thatcurrently occupy chair 1. We show that during a period in which the set O remains unchanged,no player can traverse a whole block. The proof splits according to whether O is empty or not.Assume first that O = ∅ , and pick some i > P i occupies chair 1 during the currentperiod. As long as O remains unchanged, P i stays on chair 1, so the words that the other playersrepeatedly traverse are as follows: For P it is w ( A \{ } ) ◦ w ( A \{ } ) ◦ ... ◦ w n ( A \{ } )and for P j with p ≥ j = i ≥ w j − ( A \{ } )We now show that no player can traverse a whole block (as defined above). Observe that thecollection { w ν ( A \{ } ) | ν = 1 , . . . , p − } (including, in particular the word w i − ( A \{ } )) is terminal.This follows from the induction hypothesis, because | A \{ }| = 2 p −
2, and because the property ofbeing terminal is maintained under the insertion of new chairs into words. Applying Corollary 16to this terminal collection implies that this collection of blocks is terminal as well.We turn to consider the case O = ∅ . In this case player 1 cannot advance from a none-1chair to the next none-1 chair, since the two are separated by the presently unoccupied chair 1.We henceforth assume that player P stays put on chair c = 1, but our considerations remainvalid even if at some moment player P moves to chair 1. (If this happens, he will necessarilystay there, since O = ∅ ). We are in a situation where players P , P , ..., P p traverse the words w ( A \{ , c } ) , w ( A \{ , c } ) , ..., w p − ( A \{ , c } ) (chair c which is occupied by player P can be safelyeliminated from these words). But | A \{ , c }| = 2 p −
3, so by the induction hypothesis no playercan traverse a whole w i ( A \{ , c } ), so no player can traverse a whole block.We just saw that during a period in which the set O remains unchanged, no player can traversea whole block.Finally, assume towards contradiction that P j fully traverses s j for some index j , and considerthe first occurrence of such an event. It follows that P j has traversed 2 n + 1 blocks, so that the set O must have changed at least 2 n + 1 times during the process. However, for O to change, some P i must either move to, or away from a 1-chair in s i . But 1 occurs exactly once in s i , so every P i can account for at most two changes in O , a contradiction. The transition from s , s , ..., s n to w , w , ..., w n : We assume that the words s , s , ..., s n in the alphabet { , , ..., n } satisfy the proposition. Let k := P | s i | and define: 16 : = 1 ⊗ (( s ◦ s ◦ ... ◦ s n ) n +1) ) ∀ i = 2 , . . . , n w i : = ( s i − ) k (2 n +1) ◦ B ⊆ { , , ..., n } with | B | = 2 p −
Then

$$w_1(B) = 1 \otimes \big((s_1(B) \circ s_2(B) \circ \cdots \circ s_n(B))^{2n+1}\big), \qquad w_i(B) = (s_{i-1}(B))^{k(2n+1)} \circ 1 \quad \forall i = 2, \ldots, p.$$

The argument is now exactly as in the previous case, exchanging $s$ with $w$ and $A$ with $B$ (in this case the induction hypothesis is on the $s_i$ and we prove the statement for the $w_i$). So exactly the same considerations prove that $w_1(B), w_2(B), \ldots, w_p(B)$ is a terminal collection. $\blacksquare$
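Whether a given collection of words is an oblivious MC protocol can, for small instances, be checked mechanically: every schedule must terminate, i.e., the finite digraph of configurations, with an edge for every advance of a nonempty set of currently conflicted players, must contain no reachable cycle. Below is a naive Python sketch of such a checker under these assumptions; it ignores halting, which only gives the scheduler more freedom.

```python
from itertools import combinations, product

def is_oblivious_mc(words):
    """words: list of cyclic words (lists of chairs). Returns True iff no
    schedule can run forever when, at each step, the scheduler advances an
    arbitrary nonempty set of currently conflicted players by one letter.
    Intended for small examples only (exhaustive search)."""
    n = len(words)

    def conflicted(state):
        chairs = [words[i][state[i]] for i in range(n)]
        return [i for i in range(n) if chairs.count(chairs[i]) > 1]

    seen, stack = set(), set()

    def dfs(state):
        if state in stack:
            return False                     # an infinite schedule exists
        if state in seen:
            return True
        seen.add(state); stack.add(state)
        conf = conflicted(state)
        for r in range(1, len(conf) + 1):    # any nonempty set of movers
            for movers in combinations(conf, r):
                nxt = list(state)
                for i in movers:
                    nxt[i] = (nxt[i] + 1) % len(words[i])
                if not dfs(tuple(nxt)):
                    return False
        stack.discard(state)
        return True

    return all(dfs(s) for s in product(*(range(len(w)) for w in words)))
```

For example, `is_oblivious_mc([[1, 2, 3], [2, 3, 1], [3, 1, 2]])` must return False, since with $n = 3$ players and $m = 3 \le 2n - 2$ chairs no protocol can guarantee termination. Exhaustive search is exponential in the total word length; the discussion at the end of the paper notes that even placing this decision problem in a low complexity class is open.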
The Renaming(n, 2n − 1) algorithm

The ideas developed to solve the musical chairs problem and prove Theorem 1 turn out to yield as well an answer to the oblivious Renaming problem and a proof of Theorem 2. The rules are the same as in the $MC$ problem, except that the scheduler cannot select the initial positions, and every word is started at its first letter. In order to prove Theorem 2 we should construct a collection of full words $\Pi_N = \{s_1, s_2, \ldots, s_N\}$ over the alphabet $[2N-1]$ such that for every $n \le N$ and for every set of $n$ words from $\Pi_N$ the following holds: Every schedule that starts from the first letter in each of these words reaches a safe configuration, and all players only visit chairs from the set $\{1, \ldots, 2n-1\}$.

We note that our construction yields very long words - triply exponential in $N$. It is an interesting challenge to accomplish this with substantially shorter words.

Proof. [Theorem 2] By Proposition 14 and Theorem 1, we can construct for each $1 \le i, n \le N$ a word $\pi_{i,n}$ that is $[2n-1]$-full, such that every $n$ words in the set $\{\pi_{i,n} \mid i = 1, \ldots, N\}$ constitute an oblivious $MC(n, 2n-1)$ protocol.
We show that, with a proper choice of the exponents $l_1, \ldots, l_N$, the Theorem holds with the words

$$s_i = \pi_{i,1}^{l_1} \circ \pi_{i,2}^{l_2} \circ \cdots \circ \pi_{i,N}^{l_N}.$$
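Assembling these words is mechanical. A minimal sketch (here `pi_i` stands for the list $[\pi_{i,1}, \ldots, \pi_{i,N}]$, assumed given, and `exps` for the exponents $l_1, \ldots, l_N$; both names are ours):

```python
def renaming_word(pi_i, exps):
    """s_i = pi_{i,1}^{l_1} o pi_{i,2}^{l_2} o ... o pi_{i,N}^{l_N},
    each word represented as a list of chairs."""
    out = []
    for word, l in zip(pi_i, exps):
        out += word * l
    return out
```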
The theorem follows if we can show that for every $1 \le n \le N$ and every subset $J \subseteq [N]$ of cardinality $|J| = n$ the following holds: In every possible schedule that starts each word in $\{s_j \mid j \in J\}$ from its first letter, no player reaches a position beyond the subword $\pi_{j,n}^{l_n}$. Consider any point in such a schedule. Say that player $P_j$ (for some $j \in J$) is leading if it currently resides in the stretch $\pi_{j,n}^{l_n}$ of $s_j$. Otherwise, we say that $j$ is trailing. We observe that during a period of time in which no trailing player changes position, no leading player can traverse a complete copy of $\pi_{j,n}$. To see this, consider an arbitrary $MC$ schedule with the words $\{\pi_{j,n} \mid j \in J\}$. We start this schedule as follows: Every leading player maintains his position from the original renaming schedule, and every trailing player stays put on the same chair that he is currently occupying. (Such a chair can be found in the word $\pi_{j,n}$ since it is $[2n-1]$-full.) Recall that the words $\{\pi_{j,n} \mid j \in J\}$ constitute an oblivious $MC(n, 2n-1)$ protocol. It follows that no leading player $P_j$ can traverse more than $1 + \sum_{j' \in J} \sum_{\nu < n} l_\nu |\pi_{j',\nu}|$ complete copies of $\pi_{j,n}$: between two consecutive such traversals some trailing player must move, and the latter sum bounds the total number of moves available to trailing players. Choosing the exponents inductively so that $l_n$ exceeds this bound guarantees that no player ever reaches a position beyond $\pi_{j,n}^{l_n}$, as required. $\blacksquare$

Oblivious MC with O(n) chairs

We start with an observation that puts Theorems 3 and 4 (as well as Theorem 1) in an interesting perspective. The expected number of pairwise collisions in a random configuration is exactly $\binom{n}{2}/m$. In particular, when $m \gg n^2$, most configurations are safe (namely, have no collisions). Therefore, it is not surprising that in this range of parameters $n$ random words would yield an oblivious $MC(n,m)$ algorithm. However, when $m = O(n)$, only an exponentially small fraction of configurations are safe, and the existence of oblivious $MC(n,m)$ algorithms is far from obvious.

O(n) chairs, allowing repetitions

Theorem 3 can be thought of as a (nonconstructive) derandomization of the randomized MC algorithm in which players choose their next chair at random (and future random decisions of players are not accessible to the scheduler). Standard techniques for derandomizing random processes involve taking a union bound over all possible bad events, which in our case corresponds to a union bound over all possible schedules. The asynchronous scheduler has too many options (and so does the immediate scheduler), making a union bound too wasteful. For this reason, we shall consider in this section the canonical scheduler, which is as powerful as the asynchronous scheduler (see Section 2.5). In every unsafe configuration the choice of the canonical pair is deterministic, and the canonical scheduler has only three possible moves to choose from, which makes it viable to use a union bound. We now prove Theorem 3.

Proof. Each of the $N$ words is chosen independently at random as a sequence of $L$ chairs, where each chair in the sequence is chosen independently at random. We show that with high probability (probability tending to 1 as the value of the constant $c$ grows), this choice satisfies Theorem 3. It is easy to verify that in this random construction, with high probability, all words are full. To see this, note that the probability that chair $j$ is missing from word $i$ is $((m-1)/m)^L$. Consequently, the probability that a word chosen this way is not full is at most $m((m-1)/m)^L$, and the expected number of non-full words is at most $m \cdot N \cdot ((m-1)/m)^L$. With our choice of parameters $m = 7n$ and $L = cn \log N$, we see that $m \cdot N \cdot ((m-1)/m)^L = o(1)$, provided that $c$ is large enough.

In our approach to the proof we keep track of all possible schedules. To this end we use a "logbook" that is the complete ternary tree $T$ of depth $L$ rooted at $r$. Associated with every node $v$ of $T$ is a random variable $X_v$. The values taken by $X_v$ are system configurations. For a given choice of words and an initial system configuration we define the value of $X_r$ to be the chosen initial configuration. Every node $v$ has three children corresponding to the three possible next configurations that are available to the canonical scheduler at configuration $X_v$.

Another important ingredient of the proof is a potential function (defined below) that maps system configurations to the nonnegative reals. It is also convenient to define an (artificial) "empty" configuration of potential 0. Every safe configuration has potential 1, and every non-empty unsafe configuration has potential greater than 10. If the node $u$ is a descendant of $v$ and the system configuration $X_v$ is safe, then we define $X_u$ to be the empty configuration. We thus also associate with every node $v$ of $T$ a nonnegative random variable $P_v$, the potential of the (random) configuration $X_v$.
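In code, the potential function about to be defined (a configuration with $i$ occupied chairs gets potential $x^{n-i}$) reads as follows; representing the empty configuration by `None` is our choice, and `x` is the constant fixed below.

```python
def potential(config, x):
    """config: tuple of the chairs currently occupied by the n players,
    or None for the artificial 'empty' configuration."""
    if config is None:
        return 0.0
    n, occupied = len(config), len(set(config))
    return x ** (n - occupied)   # safe (all chairs distinct) => potential 1
```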
The main step of the proof is to show that if $v_1, v_2, v_3$ are the three children of $v$, then $\sum_{i=1}^{3} E(P_{v_i}) \le r \cdot E(P_v)$ for some constant $r \le 0.99$. (Note that this inequality holds as well if $X_v$ is either safe or empty.) This exponential drop implies that

$$E\Big(\sum_{v \text{ a leaf of } T} P_v\Big) = \sum_{v \text{ a leaf of } T} E(P_v) = o(1),$$

provided that $L$ is large enough. This implies that with probability $1 - o(1)$ (over the choice of random words) all leaves of $T$ correspond to an empty configuration. In other words, every schedule terminates in fewer than $L$ steps.

We turn to the details of the proof. A configuration with $i$ occupied chairs is defined to have potential $x^{n-i}$, where $x > 10$. The potential never exceeds $x^{n-1}$, and it equals 1 iff the configuration is safe. Consider a configuration of potential $x^{n-i}$ (with $i < n$), where the canonical pair is $(\alpha, \beta)$. It has three children representing the move of either $\alpha$ or $\beta$ or both. Let us denote $\rho = i/m$ and $\rho' = (i-1)/m$. When a single player moves, the number of occupied chairs can stay unchanged, which happens with probability $\rho$. With probability $1 - \rho$ one more chair becomes occupied and the potential gets divided by $x$. Consider next what happens when both players move. Here the possible outcomes (in terms of the number of occupied chairs) depend on whether there is an additional player $\gamma$ currently co-occupying the same chair as $\alpha$ and $\beta$. It suffices to perform the analysis in the less favorable case in which there is no such player $\gamma$, as this provides an upper bound on the potential also for the case that there is such a player. With probability $(\rho')^2$ both $\alpha$ and $\beta$ move to occupied chairs and the potential gets multiplied by $x$. With probability $\rho'(1-\rho') + (1-\rho')\rho = (\rho + \rho')(1-\rho')$ the number of occupied chairs (and hence the potential) does not change. With probability $(1-\rho')(1-\rho)$ the number of occupied chairs grows by one and the potential gets divided by $x$.

It follows that if $v$ is a node of $T$ with children $v_1, v_2, v_3$, and if the configuration $X_v$ is unsafe and nonempty, then

$$\sum_{i=1}^{3} E(P_{v_i}) \le E(P_v)\Big(2\rho + 2(1-\rho)/x + (\rho')^2 x + (\rho + \rho')(1-\rho') + (1-\rho)(1-\rho')/x\Big).$$

Recall that $x > 10$ and $\rho' < \rho < 1$. This implies that the last expression increases if $\rho'$ is replaced by $\rho$, and thereafter it is maximized when $\rho$ attains its largest possible value $q = (n-1)/m$. We conclude that

$$\sum_{i=1}^{3} E(P_{v_i}) \le E(P_v)\Big(2q + 2(1-q)/x + q^2 x + 2q(1-q) + (1-q)^2/x\Big).$$

We may take $q = 1/7$ (recall $m = 7n$) and $x = 23/2$ to get $\sum_{i=1}^{3} E(P_{v_i}) \le r \cdot E(P_v)$ for some $r < 0.99$. This guarantees an exponential decrease in the expected sum of potentials and hence termination, as we now explain. It follows that for every initial configuration the expected sum of potentials of all leaves at depth $L$ does not exceed $x^{n-1}$ (the largest possible potential) times $r^L$. On the other hand, if there is at least one leaf $v$ for which the configuration $X_v$ is neither safe nor empty, then the sum of potentials at depth $L$ is at least $x > 1$. Our aim is to show that with high probability (over the choice of $N$ words) all runs have length less than $L$: (i) for every choice of $n$ out of the $N$ words, (ii) for each selection of an initial configuration, and (iii) for every canonical scheduler's strategy. The $n$ words can be chosen in $\binom{N}{n}$ ways. For every $n$ words, there are $L^n$ possible initial configurations. The probability of a length-$L$ run from a given configuration is at most $x^{n-1} r^L$, where $x = 23/2$ and $r < 0.99$. Therefore our claim is proved if $\binom{N}{n} \cdot L^n \cdot x^{n-1} \cdot r^L = o(1)$.
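Plugging the chosen constants into the drop factor confirms the claim numerically:

```python
q, x = 1 / 7, 23 / 2
r = 2*q + 2*(1 - q)/x + q**2 * x + 2*q*(1 - q) + (1 - q)**2 / x
print(r)   # 0.9782..., safely below 0.99
```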
The inequality $\binom{N}{n} \cdot L^n \cdot x^{n-1} \cdot r^L = o(1)$ clearly holds if we let $L = cn \log N$ with $c$ a sufficiently large constant. This completes the proof of Theorem 3. $\blacksquare$

A careful analysis of the proof of Theorem 3 shows that it actually works as long as $m/n$ exceeds an explicit constant smaller than the 7 used above. It would be interesting to determine the value of $\liminf_{n \to \infty} m/n$ for which $n$ long enough random words over an $m$-letter alphabet constitute, with high probability, an oblivious $MC(n,m)$ protocol.

Permutations over O(n) chairs

The argument we used to prove Theorem 3 is inappropriate for the proof of Theorem 4. Theorem 4 deals with random permutations, whereas in the proof of Theorem 3 we use words of length $\Omega(n \log n)$. (Longer words are crucial there for two main reasons: to guarantee that words are full and to avoid wrap-around. The latter property is needed to guarantee independence.) Indeed, in proving Theorem 4 our arguments are substantially different. In particular, we work with a pairwise immediate scheduler, and unlike the proof of Theorem 3, there does not appear to be any significant benefit (e.g., no significant reduction in the ratio $m/n$) if a canonical scheduler is used instead. We first prove the special case $N = n$ of Theorem 4.

Theorem 18 If $m \ge cn$ where $c > 0$ is a sufficiently large constant, then there is a family of $n$ permutations on $[m]$ which constitute an oblivious $MC(n,m)$ protocol.

We actually show that, with high probability, a set of random permutations $\pi_1, \ldots, \pi_n$ has the property that in every possible schedule the players visit at most $L = O(m \log m)$ chairs. Our analysis uses the approach of deferring random decisions until they are actually needed. For each of the $m^n$ possible initial configurations, we consider all possible sequences of $L$ locations. For each such sequence we fill in the chairs in the locations in the sequence at random, and prove that the probability that this sequence represents a possible schedule is extremely small - so small that even if we take a union bound over all initial configurations and over all sequences of length $L$, we are left with a probability much smaller than 1.

The main difficulty in the proof is that since $L \gg m$, some players may completely traverse their permutation (even more than once), and therefore the chairs in these locations are no longer random. To address this, we partition the sequence of moves into $L/t$ blocks, where in each block players visit a total of $t$ locations. We can and will assume that $t$ divides $L$. We take $t = \delta m$ for some sufficiently small constant $\delta$, and $n = \epsilon m$, where $\epsilon$ is a constant much smaller than $\delta$. This choice of parameters implies that within a block, chairs are essentially random and independent. To deal with dependencies among different blocks, we classify players (and their corresponding permutations) as light or heavy. A player is light if during the whole schedule (of length $L$) it visits at most $t/\log m = o(t)$ locations. A player that visits more than $t/\log m$ locations during the whole schedule is heavy. Observe that for light players, the probability of encountering a particular chair in some given location is at most $\frac{1}{m - o(t)} \le \frac{1 + o(1)}{m}$. Hence the chairs encountered by light players are essentially random and independent (up to negligible error terms). Thus it is the heavy players that introduce dependencies among blocks. Every heavy player visits more than $t/\log m$ locations, so $n_h$, the number of heavy players, does not exceed $(L \log m)/t = O(\log^2 m)$.
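Spelled out, with $L = Cm \log m$ and $t = \delta m$:

$$n_h \le \frac{L}{t/\log m} = \frac{L \log m}{t} = \frac{C m \log^2 m}{\delta m} = \frac{C}{\delta}\, \log^2 m.$$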
The fact that the number of heavy players is small is used in our proof to limit the dependencies among blocks. The following lemma is used to show that in every block of length $t$ the number of locations that are visited by heavy players is not too large; consequently, sufficiently many locations are visited by light players. In the lemma we use the following notation. A segment of $k$ locations in a permutation is said to have volume $k - 1$. Given a collection of locations, a chair is unique if it appears exactly once in these locations.

Lemma 19 Let $n_h \le m/\log^2 m$ and let $\delta > 0$ be a sufficiently small constant. Consider $n$ random permutations over $[m]$. Select any $n_h$ of the permutations and a starting location in each of them. Choose next intervals in the selected permutations with total volume $t'$ for some $t/2 \le t' \le t$. With probability $1 - o(1)$, for every such set of choices at least $4t'/5$ of the chairs in the chosen intervals are unique.

Proof. We first note that we will be using the lemma with $n_h = O(\log^2 m)$. Also, if a list of letters contains $u$ unique letters (i.e., letters that appear exactly once) and $r$ repeated letters (i.e., letters appearing at least twice), then it has $d = u + r$ distinct letters and length $\lambda \ge u + 2r$. In particular $d \le (\lambda + u)/2$.

There are $\binom{n}{n_h}$ ways of choosing $n_h$ of the permutations. Then, there are $m^{n_h}$ choices for the initial configuration. We denote by $s_i$ the volume of the $i$-th interval, so that $\sum_{i=1}^{n_h} s_i = t'$. Therefore there are $\binom{t' + n_h - 1}{n_h - 1} \le m^{n_h}$ ways of choosing the intervals with total volume $t'$. Since the volume of every interval is at most $t'$, the probability that a particular chair resides at a particular location in such an interval is at most $1/(m - t')$. This is because the permutation is random and at most $t'$ chairs appeared so far in this interval. Therefore the probability that a sequence of $t'$ labels involves fewer than $0.9t'$ distinct chairs is at most

$$\binom{m}{0.9t'} \Big(\frac{0.9t'}{m-t'}\Big)^{t'} \le \Big(\frac{em}{0.9t'}\Big)^{0.9t'} \Big(\frac{0.9t'}{m-t'}\Big)^{t'} \le e^{t'} \Big(\frac{m}{m-t'}\Big)^{0.9t'} \Big(\frac{t'}{m-t'}\Big)^{0.1t'} \le e^{2t'} (2\delta)^{0.1t'} \ll e^{-t'}.$$

Explanation: The set of chairs that appear in these intervals can be chosen in $\binom{m}{0.9t'}$ ways. The probability that a particular location in this union of intervals is assigned a chair from the chosen set does not exceed $\frac{0.9t'}{m-t'}$. In addition, $m/(m-t') \le 1 + 2\delta$, $t'/(m-t') \le 2\delta$, and $\delta$ is a very small constant.
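For the final step it suffices, for instance, that $\delta$ be small enough that (no attempt is made to optimize the constant):

$$e^{2t'} (2\delta)^{t'/10} \le e^{-t'} \iff 2 + \tfrac{1}{10}\ln(2\delta) \le -1 \iff \delta \le \tfrac{1}{2} e^{-30}.$$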
Now we take a union bound over all choices of $n_h$ permutations, all starting locations, and all collections of intervals with total volume $t'$. It follows that the probability that there is a choice of intervals of volume $t'$ that span at most $n_h$ permutations and contain fewer than $0.9t'$ distinct chairs is at most $m^{3n_h} e^{-t'} = o(1)$. In the above notation $\lambda = t'$ and $d \ge 0.9t'$, which yields $u \ge 0.8t'$, as claimed. $\blacksquare$

Since the conclusion of this lemma holds with probability $1 - o(1)$, we can assume that our set of permutations satisfies it. In particular, in every collection of intervals in these permutations with total volume $t/2 \le t' \le t$ that reside in $O(\log^2 m)$ permutations, there are at least $4t'/5$ unique chairs.

We partition the $L$ locations visited by players into blocks of $t$ locations each. We analyze the possible runs by considering first the breakpoints profile, namely where each block starts and ends on each of the $n$ words. There are $m^n$ possible choices for the starting locations. If, in a particular block, player $i$ visits $s_i$ chairs, then $\sum_{i=1}^{n} s_i = t$. Consequently the parameters $s_1, \ldots, s_n$ can be chosen in $\binom{t+n-1}{n-1} \le 2^{t+n}$ ways. There are $L/t$ blocks, so the total number of possible breakpoints profiles is at most $m^n (2^{t+n})^{L/t} \le m^n 4^L$ (here we used the fact that $t > n$). Clearly, by observing the breakpoints profile we can tell which players are light and which are heavy. We recall that there are at most $O(\log^2 m)$ heavy players, and that the premise of Lemma 19 can be assumed to hold.

Let us fix an arbitrary particular breakpoints profile $\beta$. We wish to estimate the probability (over the random choice of chairs) that some legal sequence of moves by the pairwise immediate scheduler yields this breakpoints profile $\beta$. Let $B$ be an arbitrary block in $\beta$, and let $p(B)$ denote the probability, over the choice of random chairs and conditioned on the contents of all previous blocks in $\beta$, that there is a legal sequence of moves by the pairwise immediate scheduler that produces this block $B$.

Lemma 20 For $p(B)$ as defined above we have $p(B) \le 8^{-t}$.

Proof. The total number of chairs encountered in block $B$ is $n \ll t$ (for the initial locations) plus $t$ (for the moves). Recall that the set of heavy players is determined by the breakpoints profile $\beta$. Hence within block $B$ it is clear which are the heavy players and which are the light players. Let $t_h$ (resp. $t_\ell = t - t_h$) be the number of chairs visited by heavy (resp. light) players in this block. The proof now breaks into two cases, depending on the value of $t_h$.

Case 1: $t_h \le 0.1t$. Light players altogether visit $n + t_\ell$ chairs ($n$ initial locations plus $t_\ell$ moves). If $u$ of these chairs are unique, then they visit at most $(n + t_\ell + u)/2$ distinct chairs. A unique chair is either (i) a chair where a player terminates his walk, and there are at most $n$ of those, or (ii) a chair that a light player traverses due to a collision with a heavy player, and there are at most $t_h$ of those. Consequently, the number of distinct chairs visited by light players does not exceed $(n + t_\ell + n + t_h)/2 \le n + t/2$.

Fix the set $S$ of $n + t/2$ distinct chairs that we are allowed to use. There are $\binom{m}{n + t/2}$ choices for $S$. Now assign chairs to the locations one by one, in an arbitrary order. Each location has probability at most $(1 + o(1)) \frac{n + t/2}{m}$ of receiving a chair in $S$. Since we are dealing here with light players, we have exposed only $o(m)$ chairs for each of them (in $B$ and in previous blocks of $\beta$), and as mentioned above, this can increase the probability by no more than a $1 + o(1)$ factor. Hence the probability that the segments traversed by the light players contain only $n + t/2$ distinct chairs is at most

$$\binom{m}{n + t/2} \Big((1+o(1)) \frac{n + t/2}{m}\Big)^{t_\ell} \le \Big(\frac{em}{n + t/2}\Big)^{n + t/2} \Big((1+o(1)) \frac{n + t/2}{m}\Big)^{t_\ell} \le (2e)^{t} \Big(\frac{n + t/2}{m}\Big)^{(t_\ell - t_h)/2 - n} \le (2e)^t (t/m)^{t/3} < 8^{-t}.$$

Here we used that $t_h + t_\ell = t$, $t_h \le 0.1t$, $t_\ell \ge 0.9t$, and $n \ll t \ll m$.
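The exponent bookkeeping in the middle inequality, spelled out:

$$t_\ell - \Big(n + \frac{t}{2}\Big) = t_\ell - n - \frac{t_\ell + t_h}{2} = \frac{t_\ell - t_h}{2} - n \ \ge\ \frac{0.9t - 0.1t}{2} - n \ \ge\ \frac{t}{3},$$

using $t_h \le 0.1t$, $t_\ell \ge 0.9t$ and $n \ll t$.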
Case 2: $t_h \ge 0.1t$. Let us first reveal the chairs visited by the heavy players. By Lemma 19, we find there at least $4t_h/5$ unique chairs. Apart from the at most $n \ll t_h$ chairs where a heavy player terminates his walk in this block, a heavy player can move off a unique chair only due to a collision with a light player. Hence the $t_\ell$ locations visited by light players must include all of at least $0.7t_h$ pre-specified chairs. We bound the probability of this as follows. First choose for each of the $0.7t_h$ pre-specified chairs a particular location where it should appear in the intervals of light players. The number of such choices is at most $t_\ell^{0.7t_h}$. As mentioned above, the probability that a particular chair is assigned to some specific location is $(1+o(1))/m$. Therefore the probability that $0.7t_h$ pre-specified chairs appear in the light intervals is at most $t_\ell^{0.7t_h} ((1+o(1))/m)^{0.7t_h}$. Thus the probability that a schedule satisfying the condition of the lemma exists is at most

$$t_\ell^{0.7t_h} \Big(\frac{1+o(1)}{m}\Big)^{0.7t_h} \le (2t/m)^{0.7t_h} \le (2t/m)^{t/15} < 8^{-t},$$

where we used that $n \ll t \ll m$. $\blacksquare$

Lemma 20 implies an upper bound of $p(B)^{L/t} = 8^{-L}$ on the probability that there is a legal sequence of moves by the pairwise immediate scheduler that gives rise to the breakpoints profile $\beta$. Taking a union bound over all breakpoints profiles (whose number is at most $m^n 4^L \le 5^L$, by our choice of $L = Cm \log m$ for a sufficiently large constant $C$), Theorem 18 is proved.

Observe that the proof of Theorem 18 easily extends to the case where there are $N = m^{O(1)}$ random permutations out of which one chooses $n$. We simply need to multiply the number of possibilities by $N^n$, a term that can be absorbed by increasing $m$, similar to the way the term $m^n$ is absorbed. In Lemma 19 we need to replace $\binom{n}{n_h}$ by $\binom{N}{n_h}$, and the proof goes through without any change (because $n_h$ is so small). This proves Theorem 4.

Explicit construction with permutations and m = O(n²)

In this section we present, for every integer $d \ge 1$, $n^d$ permutations on $m = O(d^2 n^2)$ chairs such that every $n$ of these permutations constitute an oblivious $MC(n,m)$ algorithm. This proves Theorem 5. We let $LCS(\pi, \sigma)$ stand for the length of the longest common subsequence of the two permutations $\pi$ and $\sigma$, considered cyclically. (That is, we may rotate $\pi$ and $\sigma$ arbitrarily to maximize the length of the resulting longest common subsequence.) The following easy claim is useful.

Proposition 21 Let $\pi_1, \ldots, \pi_n$ be permutations of $\{1, \ldots, m\}$ such that $LCS(\pi_i, \pi_j) \le r$ for all $i \neq j$. If $m > (n-1)r$, then in every schedule none of the $\pi_i$ is fully traversed.

Proof. By contradiction. Consider a schedule in which one of the permutations is fully traversed, and say that $\pi_1$ is the first permutation to be fully traversed. Each move along $\pi_1$ reflects a collision with some other player. Hence there is a permutation $\pi_i$, $i > 1$, that accounts for at least $m/(n-1)$ of the collisions along $\pi_1$. The chairs at which these collisions occurred appear in the same cyclic order in $\pi_1$ and in $\pi_i$. Consequently, $r \ge LCS(\pi_1, \pi_i) \ge \frac{m}{n-1}$, a contradiction. $\blacksquare$

This yields an inexplicit oblivious $MC(n,m)$ algorithm with $m = O(n^2)$, since (even exponentially) large families of permutations of $[m]$ exist in which every two permutations have an LCS of only $O(\sqrt{m})$. We omit the easy details. On the other hand, we should notice that by [4] this approach is inherently limited and can at best yield bounds of the form $m = O(n^{3/2})$. We now present an explicit construction that uses some algebra.

Lemma 22 Let $p$ be a prime power, let $d$ be a positive integer, and let $m = p^2$. Then there is an explicit family of $(1-o(1)) m^d$ permutations of an $m$-element set, where the LCS of every two permutations is at most $4d\sqrt{m}$.

Proof. Let $F$ be the finite field of order $p$. Let $M := F \times F$, and $m = p^2 = |M|$. Let $f$ be a polynomial of degree $2d$ over $F$ with vanishing constant term, and let $j \in F$. We call the set $B_{f,j} = \{(x, f(x) + j) \mid x \in F\}$ a block. We associate with $f$ the following permutation $\pi_f$ of $M$: It starts with an arbitrary ordering of the elements of $B_{f,0}$, followed by $B_{f,1}$ arbitrarily ordered, then $B_{f,2}$, etc. A polynomial of degree $r$ over a field has at most $r$ roots. It follows that for every two polynomials $f \neq g$ as above and any $i, j \in F$, the blocks $B_{f,i}$ and $B_{g,j}$ have at most $2d$ elements in common. There are $(p-1) \cdot p^{2d-1} = (1-o(1)) m^d$ such polynomials. There are $p$ blocks in $\pi_f$ and in $\pi_g$, and a common (cyclic) subsequence meets the pairs of blocks $(B_{f,i}, B_{g,j})$ in a cyclically monotone order, hence at most $2p$ such pairs, so that $LCS(\pi_f, \pi_g) \le 4dp$, as claimed. $\blacksquare$
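The construction of Lemma 22 is easy to make concrete. The sketch below does so over a prime field (the lemma allows prime powers, but restricting to primes keeps the arithmetic elementary):

```python
# Sketch of the explicit construction of Lemma 22 for a prime p. Each
# permutation of the m = p^2 points (x, y) of F x F is a concatenation of
# the p blocks B_{f,j} = {(x, f(x) + j)}, listed for j = 0, 1, ..., p-1.

def poly_permutation(coeffs, p):
    """coeffs = (a_1, ..., a_{2d}) of f(x) = a_1 x + ... + a_{2d} x^{2d},
    a polynomial with vanishing constant term over F_p. Returns pi_f
    as a list of the p^2 points, block by block."""
    def f(x):
        y, xp = 0, 1
        for a in coeffs:
            xp = (xp * x) % p      # x^1, x^2, ..., x^{2d}
            y = (y + a * xp) % p
        return y
    perm = []
    for j in range(p):             # block B_{f,j}, x in the natural order
        perm += [(x, (f(x) + j) % p) for x in range(p)]
    return perm
```

Each returned list enumerates the $m = p^2$ points of $F \times F$ exactly once, and two outputs built from distinct polynomials intersect in at most $2d$ points per pair of blocks, which is what the LCS bound rests on.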
Discussion and open problems

In this paper we introduced the notion of oblivious distributed algorithms. Our main results concern the design of oblivious MC algorithms. We showed that $m \ge 2n - 1$ chairs are necessary and sufficient for an oblivious $MC$ algorithm with $n$ processors. However, our construction involves very long words, and it is interesting to find explicit constructions with $m = 2n - 1$ and substantially shorter words. We also showed that oblivious $MC(n,m)$ algorithms exist with $m = O(n)$ and relatively short full words. We still do not have explicit constructions of such protocols. We would also like to determine $\liminf m/n$ such that $n$ random words over an $m$-letter alphabet tend to constitute an oblivious $MC(n,m)$ algorithm. Computer simulations strongly suggest that for random permutations a value of $m = 2n - 1$ is not enough. We have found oblivious $MC(n, 2n-1)$ algorithms using permutations for $n = 3$ and $n = 4$ (for the latter the proof of correctness is computer-assisted); for larger $n$ the question is open. We would also like to know for which $m \ge 2n - 1$ there are collections of $N = m + 1$ (not necessarily full) words such that every $\min\{n, N\}$ of them form an oblivious $MC$ algorithm when starting at the initial chair of each word.

Our proof that $m \ge 2n - 1$ chairs are necessary raises a natural computational problem: decide whether a given collection of words constitutes an oblivious $MC$ algorithm. This can be viewed as the problem of whether some digraph contains a directed cycle or not. The point is that the digraph is presented in a very compact form. It is not hard to place this problem in PSPACE, but is it in a lower complexity class, such as co-NP or P?

There are interesting foundational questions related to different models in distributed computing. We have defined here the Output Negotiation (ON) model, and showed that it is properly included in the read/write model. It follows by definition that the oblivious model is included in the ON model. It would be interesting to know whether this last inclusion is proper.

References

[1] Attiya H., Bar-Noy A., Dolev D., Peleg D., and Reischuk R., Renaming in an asynchronous environment. J. ACM, 37(3):524–548, 1990.
[2] Afek Y., Attiya H., Fouren A., Stupp G., and Touitou D., Long-lived renaming made adaptive. Proc. PODC 1999, 91–103.
[3] Afek Y., Attiya H., Dolev D., Gafni E., Merritt M., and Shavit N., Atomic snapshots of shared memory. J. ACM, 40(4):873–890, 1993.
[4] Beame P., Blais E., and Ngoc D., Longest common subsequences in sets of permutations. http://arxiv4.library.cornell.edu/abs/0904.1615?context=math
[5] Dijkstra E. W., Self-stabilizing systems in spite of distributed control. Communications of the ACM, 17(11):643–644, 1974.
[6] Dolev S., Self-Stabilization. MIT Press, ISBN 0-262-04178-2.
[7] Dolev D., Lynch N. A., Pinter S., Stark E. W., and Weihl W. E., Reaching approximate agreement in the presence of faults. Symposium on Reliability in Distributed Software and Database Systems.
[8–11] … Proc. 19th Int'l Symposium on Distributed Computing (DISC'05), Springer-Verlag LNCS; … Proc. 8th Latin American Theoretical Informatics (LATIN'06), Springer-Verlag LNCS; … SRDS …
[12] Herlihy M. P., Wait-free synchronization. ACM Transactions on Programming Languages and Systems, 13(1):124–149, 1991.
[13] Herlihy M. P. and Shavit N., The topological structure of asynchronous computability. J. ACM, 46(6):858–923, 1999.
[14] Lamport L., On interprocess communication, Part 1: Models, Part 2: Algorithms. Distributed Computing, 1(2):77–101, 1986.