Fast Graphical Population Protocols
Dan Alistarh · [email protected] · IST Austria
Rati Gelashvili · [email protected] · University of Toronto
Joel Rybicki · [email protected] · IST Austria
Abstract.
Let G be a graph on n nodes. In the stochastic population protocol model, a collection of n indistinguishable, resource-limited nodes collectively solve tasks via pairwise interactions. In each interaction, two randomly chosen neighbors first read each other's states, and then update their local states. A rich line of research has established tight upper and lower bounds on the complexity of fundamental tasks, such as majority and leader election, in this model, when G is a clique. Specifically, in the clique, these tasks can be solved fast, i.e., in n polylog n pairwise interactions, with high probability, using at most polylog n states per node. In this work, we consider the more general setting where G is an arbitrary graph, and present a technique for simulating protocols designed for fully-connected networks in any connected regular graph. Our main result is a simulation that is efficient on many interesting graph families: roughly, the simulation overhead is polylogarithmic in the number of nodes, and quadratic in the inverse of the graph conductance. As a sample application, we show that, in any regular graph with conductance φ, both leader election and exact majority can be solved in φ^{-2} · n polylog n pairwise interactions, with high probability, using at most φ^{-2} · polylog n states per node. This shows that there are fast and space-efficient population protocols for leader election and exact majority on graphs with good expansion properties. We believe our results will prove generally useful, as they allow efficient technology transfer between the well-mixed (clique) case, and the under-explored spatial setting.

Introduction
Since the early days of computer science, there has been significant interest in developing an algorithmic theory of molecular and biological systems [53]. In distributed computing, population protocols [9] have become a popular model for investigating the collective computational power of large collections of communication-bounded agents with limited computational capabilities. This model consists of n identical agents, seen as finite state machines, and computation proceeds via pairwise interactions of the agents, which trigger local state transitions. The sequence of interactions is provided by a scheduler, which picks pairs of agents to interact. Upon every interaction, the selected agents observe each other's states, and then update their local states. The goal is to have the system reach a configuration satisfying a given predicate, where e.g. all agents agree on a common output value (consensus/majority), or a single agent is assigned a special leader state (leader election).

Early work in population protocols focused on the computational power of the model, i.e., the class of predicates which can be computed by population protocols under various interaction graphs [9, 11]. More recently, the focus has shifted from computability towards understanding complexity thresholds, which often come in the form of fundamental time-versus-space complexity trade-offs, e.g. [5, 7, 12, 18, 21, 33, 37, 39]; for recent surveys please see [36] and [4]. This line of work almost exclusively focuses on the uniform stochastic scheduler, where each interaction pair is chosen uniformly at random among all pairs of agents in the population, and the time complexity of a protocol is measured by the number of interactions needed to solve a task. The uniform stochastic scheduler is analogous to having a large well-mixed solution of interacting particles, an assumption often used for modelling chemical reactions between molecules.
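To make the scheduler dynamics concrete, the following sketch (ours, purely illustrative; the rule and function name are not from the paper) simulates a simple one-way information-dissemination rule under the uniform stochastic scheduler on the clique:

```python
import random

def run_epidemic(n, seed=0):
    """One run of a one-way 'epidemic' under the uniform stochastic
    scheduler on the clique: in each step an ordered pair of agents is
    chosen uniformly at random, and an informed initiator informs the
    responder. Returns the parallel time (steps / n) until all n agents
    are informed. Illustrative sketch, not a protocol from the paper."""
    rng = random.Random(seed)
    informed = [False] * n
    informed[0] = True
    count, steps = 1, 0
    while count < n:
        u, v = rng.sample(range(n), 2)  # two distinct agents; order matters
        steps += 1
        if informed[u] and not informed[v]:
            informed[v] = True
            count += 1
    return steps / n
```

On the clique this kind of dynamics completes in Θ(log n) parallel time with high probability, which is the style of guarantee that the present paper transfers from the clique to regular graphs with good expansion.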
However, many natural systems exhibit spatial structure; this is, for instance, the case for biochemical communication in e.g. bacterial biofilms [47]. This structure can greatly influence the system dynamics. Indeed, it is well-known that there is a qualitative difference between the computational power of population protocols in the clique and in other interaction graphs under the adversarial scheduler: any other connected interaction graph can simulate adversarial interactions on the clique graph by shuffling the states of the nodes [9], and population protocols on some interaction graphs can compute a strictly larger set of predicates than protocols on the clique; see e.g. [14] for a survey of computability results.

However, much less is known about the complexity of basic tasks in general interaction graphs under the stochastic scheduler. So far, only a handful of protocols have been analysed on general graphs. Existing analyses tend to be complex, and specialised to specific algorithms on limited graph classes [17, 29, 35, 45, 46]. This is natural: given the intricate dependencies which arise due to the underlying graph structure, the design and analysis of protocols in the spatial setting is understood to be challenging.
Our approach.
In this paper, we develop a general approach for efficiently translating algorithmic ideas and techniques for population protocols from the non-spatial setting of well-mixed interactions, which is now fairly well-understood, to the general spatial setting, in which the interactions are dictated by an arbitrary regular graph.

Our work proposes a "technology transfer" approach, which allows the analyses in the simpler clique case to directly generalise to arbitrary regular graphs. The computational costs of this reduction are additional factors which depend polylogarithmically on the number of nodes n, and quadratically on the inverse conductance of the graph. Roughly speaking, we show that an efficient synchronous protocol for the clique, where all nodes interact simultaneously in lockstep every round, implies an efficient asynchronous population protocol in the spatial setting, where interactions happen sequentially in a pairwise fashion, on graphs with good expansion properties. Thus, we can establish new complexity bounds for population protocols in regular graphs by analysing the much simpler clique setting.

The model. We consider stochastic population protocols in systems whose underlying topology is a connected graph G = (V, E). Each node v ∈ V represents a single agent in a population of size n = |V|. The agents are oblivious to the graph G, in the sense that they do not have access to the structure of G, not even their own immediate neighbourhood. However, we assume non-uniform protocols, that is, the protocol may depend on n and the expansion properties of the graph. Computation proceeds asynchronously, in sequential pairwise interaction steps, and pairings follow a stochastic scheduler.
In each step, the scheduler selects an ordered pair of neighbouring nodes (u, v) uniformly at random to interact, where node u is the initiator and node v is the responder. Informally, population protocols are often given using chemical reaction equations that describe the possible state transitions using rules of the form

A + B → C + D,

which indicate that when an initiator in state A interacts with a responder in state B, they switch to states C and D, respectively. We assume that the scheduler gives each node a single random bit per interaction; this is a fairly standard assumption (see e.g. [38]). Thus, an execution in this model is given by a random sequence of oriented edges and the random bits provided to the nodes.

Stabilisation and complexity measures.
In addition to the state transition rules, a protocol describes how to map an input value to an initial state and how to map the current state to an output value. A configuration after t interactions is given by the states of all nodes. A configuration is stable if (1) the outputs remain unchanged upon subsequent interactions and (2) the output is a correct solution to the task. A protocol stabilises in T steps if it reaches a stable configuration within T steps. The step complexity is the number of interactions until the system has reached a stable configuration. We follow the common convention of measuring parallel time as step complexity divided by n. The state complexity is the number of distinct states per node.

Leader election and exact majority.
For the most part, we focus on two classic tasks, which serve as our running examples. In leader election, the task is to choose a single node as a leader among n initially identical nodes and assign all other nodes as followers. In the exact majority task, each node is given a single bit as input, and the task is to stabilise to a configuration where every node outputs the input value which had the higher initial count.

Edge expansion in graphs.
Before presenting our results, we recall some graph-theoretic notions. A graph G = (V, E) is d-regular if every node v ∈ V is adjacent to exactly d other nodes. Unless otherwise specified, we assume throughout that all graphs are regular. For any set S ⊆ V, the edge boundary of S is the set ∂S ⊆ E of edges with exactly one endpoint in S. The edge expansion of the graph G is

β = min { |∂S| / |S| : S ⊆ V, |S| ≤ n/2 }.

Figure 1: The graphical population protocol model. In each step, a random edge {u, v} is selected and the nodes u and v interact (blue nodes). Examples of graph classes covered by our construction: (a) regular high-girth expanders, (b) complete bipartite graphs, (c) toroidal grids.

For regular graphs, we express our bounds in terms of the ratio d/β; its inverse β/d corresponds to the conductance of G.

Our results. In this work, we provide reductions showing that standard problems in population protocols can be solved efficiently under graphical stochastic schedulers, with step and state complexities bounded by the expansion and size of the underlying graph. In summary, our contributions are as follows:

(1) We give a general framework for simulating a large class of synchronous protocols designed for fully-connected networks in the graphical stochastic population protocol model. Thus, the user can design efficient (and simple to analyse) synchronous algorithms in a clique model, and transport the analysis automatically to the population protocol model on a large class of interaction graphs.
For instance, on any d-regular graph with edge expansion β > 0, the resulting overhead in parallel time and state complexity is in the order of (d/β)^2 · polylog n. We introduce the synchronous model, called the k-token shuffling model, in the next section.

(2) As concrete applications of the simulation framework, we show that for any d-regular graph with edge expansion β > 0, there exist protocols for leader election and exact majority that stabilise both in expectation and with high probability in (d/β)^2 · polylog n parallel time and use (d/β)^2 · polylog n states. While we made no specific effort to optimise the degree of the polylog n term, the maximum degree is at most 6 (and often smaller).

(3) To complement these results following from the simulation, we also show that, on any graph G with diameter diam(G) and m edges, leader election can be solved both in expectation and with high probability in O(diam(G) · m log n) parallel time, using a constant-state protocol. This result is based on a new running time analysis of the token-based protocol by Beauquier, Blanchard and Burman [16] under the uniform stochastic scheduler on G, via the meeting and coalescence times of certain correlated random walks.

Throughout, we use the phrase "with high probability" to mean that we can choose the constants of the protocol so that the probability that the protocol fails to stabilise (in the given time) is at most 1/n^λ for any given constant λ > 0.

Bounds for specific graph families.
The second result immediately implies upper bounds for the time and state complexity of majority and leader election in various graph families, parameterised by the quantity (d/β)^2:

Graphs    | Task | States              | Parallel time        | Ref.     | Note
cliques   | EM   | 4                   | O(n log n)           | [35]     | Ω(n) parallel time necessary [6].
          | EM   | O(log n)            | Θ(log n)             | [34]     | Optimal for certain protocols [7].
          | LE   | O(1)                | Θ(n)                 | [33]     | Optimal O(1)-state protocol.
          | LE   | Θ(log log n)        | Θ(log n)             | [21]     | Lower bounds in [6, 51].
connected | EM   | 4                   | poly(n)              | [17, 35] | Various bounds (*)
          | LE   | O(1)                | O(diam(G) · m log n) | new      | Complexity analysis of [16].
d-regular | EM   | (d/β)^2 · polylog n | (d/β)^2 · polylog n  | new      | Also stabilises in non-reg. graphs.
          | LE   | (d/β)^2 · polylog n | (d/β)^2 · polylog n  | new      | Also stabilises in non-reg. graphs.
Table 1: Protocols for exact majority (EM) and leader election (LE) for different graph classes. The state complexity is the number of states used by the protocol. The parallel time column gives the expected parallel time (expected number of steps divided by n) to stabilise. (*) In [35], the running time of the protocol is bounded by the initial discrepancy in the inputs and the spectral properties of the contact rate matrix; bounds in terms of n are only given for select graph classes (paths, cycles, stars, random graphs and cliques). No bounds on parallel time sublinear in n are given in [35].

– In sparse graphs with good expansion properties, such as constant-degree graphs with constant edge expansion (Figure 1a), we obtain polylogarithmic parallel time and state complexity overhead. This implies that these good expanders admit fast population protocols with small state complexity, despite being sparser than the clique.

– In dense graphs, we obtain similar bounds whenever d/β ∈ polylog n holds. This includes the class of d-dimensional hypercubes with n = 2^d nodes, but also highly dense clique-like graphs, such as regular complete multipartite graphs (Figure 1b), where the degree and expansion are both Θ(n).

– In D-dimensional toroidal grids, we get algorithms with n^{2/D} polylog n parallel time and state complexity. These sparse 2D-regular graphs include cycles (1-dimensional toroidal grids), two-dimensional grids (Figure 1c), three-dimensional lattices, and so on.

While our protocols for leader election and exact majority guarantee fast stabilisation in regular graphs with high conductance, they will stabilise in polynomial expected time in any connected graph. Moreover, the results can be carried over to certain classes of non-regular graphs, provided that they are not highly irregular and have high expansion; we discuss this in Section 10. Table 1 gives a summary of our main results and prior protocols.

New complexity trade-offs.
It is known that, for both leader election and exact majority on cliques, constant-state protocols are necessarily slower than protocols with super-constant states [7, 33] (see also Table 1). It is still an open question to which extent a similar phenomenon holds for general interaction graphs.

In this context, we provide significantly improved upper bounds for super-constant state protocols, suggesting a complexity gap between constant and super-constant state protocols for some graph classes. Specifically, on d-regular graphs with good expansion, such that d/β ∈ polylog n, we provide fast, polylogarithmic time protocols for both leader election and exact majority. This opens a significant complexity gap relative to known constant-state protocols on graphs. Specifically, the 4-state exact majority protocol for general graphs [35] requires Ω(n) parallel time even in regular graphs with high expansion, if node degrees are Θ(n). (A simple example is the complete bipartite graph given in Figure 1b.) Yet, our protocols guarantee stabilisation in only polylog n parallel time in both low and high degree graphs, as long as d/β is at most polylog n.

Interestingly, this advantage appears to be lost in graphs of low expansion: e.g. in cycles (1-dimensional toroidal grids) the super-constant state protocols take n^2 polylog n time to stabilise, whereas the constant-state protocols for exact majority and leader election run in O(n^2 log n) expected parallel time. (The upper bound for majority follows from [35], whereas the leader election bound follows from our work.)

Paper outline. In Section 2, we give a high-level overview of our techniques and a summary of our main technical results. Section 3 discusses related work. Section 4 gives formal definitions, while Sections 5 to 8 develop our framework, including applications.
Section 9 gives an analysis of a constant-state protocol for leader election that stabilises in polynomial expected time in any connected graph. Finally, in Section 10, we conclude by discussing some open problems.
Technical overview

We now give a technical overview of our results. Our reduction framework combines several techniques from different areas, but ultimately the approach can be distilled to a few basic ingredients.
The first ingredient: A synchronous token shuffling model.
We introduce a simple synchronous model for fully-connected networks, which can be simulated by population protocols under the stochastic scheduler. In this model, we assume a synchronous system of n nodes that communicate using a small number of distinct token types. The model is parameterised by an integer k > 0. The computation proceeds in discrete rounds, where in each round nodes perform the following actions in lock-step:

(1) at the start of a round, every node v generates exactly k tokens based on its current state,
(2) all of the nk tokens generated are shuffled uniformly at random among all the nodes, so that each node is assigned exactly k tokens, and
(3) every node v updates its local state based on its current state and the (ordered sequence of) k tokens it received.

To emphasise the role of k, we refer to the model as the k-token shuffling model. We typically assume that both k and the number of distinct token types are (small) constants. An R-round execution of this model is given by a sequence σ_1, . . . , σ_R of random i.i.d. uniform permutations on the set of nk tokens. We call such a sequence an R-round synchronous schedule. Figure 2 illustrates executions of this model for k = 1 and k = 2.

As the token shuffling model is synchronous, it often simplifies the design and analysis of algorithms compared to the asynchronous population protocol model. In terms of algorithm design, the model can be used in several ways:

– One-way interactions:
For k ≥ 1, we can model "pull interactions" [41] by having each node copy its local state onto a token at the start of every round. After shuffling the tokens, each node holds a sample of k states of randomly chosen nodes. For k = 1, this corresponds to synchronous one-way interactions, where in each round every node v reads the state of some other node σ(v) given by a random permutation σ : V → V. This is illustrated in Figure 2a.

Figure 2: The synchronous k-token shuffling model with 5 nodes for k = 1 and k = 2. Rectangles are nodes and the small circles are tokens. In each round, nodes generate k tokens based on their current state. Then all nk tokens are shuffled randomly. After this, nodes update their state based on the vector of k tokens they hold. (a) An execution of a protocol in the 1-token shuffling model. The arrows between tokens represent the random permutation used to shuffle tokens. (b) An execution of a protocol for k = 2. Each node sends and receives two tokens.

– Simulating pairwise interactions:
For k = 2, we can simulate two-way interactions, where two agents simultaneously read each other's states and update their local states, in a population V′ of 2n virtual agents. Every node v ∈ V stores the state of two virtual agents v_1, v_2 ∈ V′. In each round, the nodes shuffle the (states of the) two virtual agents they hold by copying each agent onto a token. At the end of a round, every node v simulates the interaction between the two virtual agents it holds, and updates their states. This corresponds to a synchronous (virtual) interaction pattern given by a random perfect matching. More generally, for k ≥ 2, we can simulate k-wise interactions, used in e.g. chemical reaction network (CRN) models [23].

For our running examples, the leader election and exact majority tasks, we can use the first approach (one-way interactions) to adapt a standard population protocol for information dissemination to solve leader election [37] in the synchronous 1-token shuffling model, assuming nodes can make random local coin flips. Alternatively, we can generate synthetic coin flips in the synchronous model when k > 1 using deterministic transition rules. We use the second approach (simulating pairwise interactions) to run the cancellation-doubling dynamics employed by many exact majority protocols [7, 12].

Finally, we note that from a biological perspective, the tokens can be interpreted as e.g. signalling molecules, which bind to a limited number of receptors at cells (nodes). However, we do not pursue such analogies further.

The second ingredient: Token shuffling as a card shuffling process.
In order to simulate the schedules of the k-token shuffling model, we devise a token shuffling scheme that can be implemented by a graphical population protocol. This leads to the following card shuffling process on a graph with n nodes, whose mixing time we analyse. Suppose we are given a deck of nk cards and we place a stack of k cards on every node. In each step, select a node v uniformly at random and

– with probability 1/2, move the card on the top of the node's stack to the bottom of the stack,
– with probability 1/4, exchange the top card of v with the top card of a random neighbour of v,
– with probability 1/4, do nothing.

This process corresponds to a random walk on the symmetric group S_{nk}. The special case of k = 1 is known as the interchange process, which has been analysed on various graphs [24, 30, 32, 40, 48, 54].
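The process above can be simulated directly; the following is a minimal sketch of ours (function name and representation are illustrative assumptions; index 0 plays the role of the top of a stack):

```python
import random

def k_stack_interchange(adj, k, steps, seed=0):
    """Simulate the k-stack interchange process on a graph given by
    adjacency lists `adj`. Each node v starts with the stack of cards
    [v*k, ..., v*k + k - 1]; cards are just integers 0..n*k-1."""
    rng = random.Random(seed)
    n = len(adj)
    stacks = [list(range(v * k, (v + 1) * k)) for v in range(n)]
    for _ in range(steps):
        v = rng.randrange(n)  # pick a node uniformly at random
        r = rng.random()
        if r < 0.5:
            stacks[v].append(stacks[v].pop(0))  # top card to the bottom
        elif r < 0.75:
            u = rng.choice(adj[v])              # swap tops with a neighbour
            stacks[v][0], stacks[u][0] = stacks[u][0], stacks[v][0]
        # with the remaining probability 1/4: do nothing
    return stacks
```

Run long enough on, say, a 4-cycle (the setting of Figure 3), the deck approaches a uniformly random arrangement; note that every move is a rotation or a swap, so the multiset of cards is always preserved.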
Figure 3: Interchange dynamics on a 4-cycle. In each step, blue cards are swapped. Top row: the 1-stack interchange process. Bottom row: the 2-stack interchange process. In each step, a randomly selected node either moves its top card to the bottom of its stack or exchanges it with the top card of a randomly selected neighbour.

For the case k > 1, we refer to the process as the k-stack interchange process, and to our knowledge, this process has not been explicitly studied before. The process is illustrated in Figure 3. Inspired by Jonasson [40], we bound the mixing time of this process on general graphs using the path comparison method of Diaconis and Saloff-Coste [30] and leveraging well-known results by Leighton and Rao [42]. For d-regular graphs and any constant k > 0, we obtain a simple (although somewhat loose) upper bound showing that the distribution of the nk cards mixes close to the uniform distribution in the order of (d/β)^2 · n log n steps; see Section 5 for details.

Corollary 1.
Let G be a d-regular graph with edge expansion β > 0. For any constant k > 0, the mixing time of the k-stack interchange process on G is

O( (d/β)^2 · n log n ).

The third ingredient: Leaderless graphical phase clocks.
A key gadget when simulating the synchronous k-token shuffling model in the asynchronous graphical population model is a construction of a bounded-state, leaderless phase clock for well-behaved graphs, which we provide in Section 6. This generalises a known approach from the clique case, e.g. [7, 18], by leveraging the graphical version of the two-choice load balancing process [15, 49]. The phase clock approximates the number of interactions the system has taken by having each node keep a local interaction counter. In each step, exactly one of the counters is incremented. The protocol guarantees that in any regular graph the difference between any two counters is O(d/β · log n), with high probability, for polynomially many steps.

While the phase clock construction we provide works on a large class of graphs, including all connected regular graphs and also certain non-regular and weighted graphs, other clock protocols could be used in our construction as well. For example, several phase clocks for the clique have been proposed in the population protocol literature, e.g. [7, 12, 37, 39, 52]. It is possible that adapting some of these clocks to the graphical setting could improve the bounds guaranteed by our construction. However, in general, it remains an open problem to determine bounds for phase clocks and related load balancing processes in various graph classes [8, 49].

The fourth ingredient: Simulating synchronous token shuffling protocols. In Section 7, we construct a simulation scheme in the population protocol model that can simulate any synchronous algorithm in the k-token shuffling model. We do this by repeatedly running iterations of the card shuffling process (where cards represent tokens), carefully synchronised by the phase clock.
The simulation protocol is guaranteed to correctly simulate polynomially many synchronous rounds, with high probability. However, since the card shuffling process can only be run for finitely many steps, the distribution of tokens will never reach a perfectly uniform distribution. Thus, the tokens are not guaranteed to be permuted uniformly at random, but only almost uniformly at random. We circumvent this issue by a statistical indistinguishability argument: no polynomial-time synchronous protocol in the token shuffling model can distinguish between executions where the tokens are distributed uniformly at random and almost uniformly at random. This shows that any synchronous protocol that stabilises with high probability under uniform schedules will also do so under the simulated almost-uniform schedules. We thus arrive at the main technical result of our paper.

Theorem 2.
Let k > 0 be a constant and A be a synchronous k-token shuffling protocol on n nodes, where X is the set of local states and Y the set of token types used by the protocol A. If A solves the task Π with high probability in R ∈ poly(n) rounds, then there exists a stochastic population protocol B that also solves task Π with high probability on any n-node d-regular graph G with edge expansion β > 0. The step complexity T(B) and state complexity S(B) of the protocol B satisfy

T(B) ∈ O(R · n · ζ)  and  S(B) ∈ O(|X| · |Y|^k · ζ),  where  ζ = log n · ( d/β + τ_mix / n ),

and τ_mix is the mixing time of the k-stack interchange process on G.

The general mixing time bound given by Corollary 1 implies τ_mix / n ∈ O((d/β)^2 log n), and hence ζ ∈ O((d/β)^2 log^2 n), so the protocol B given by Theorem 2 satisfies

T(B) ∈ O( (d/β)^2 · R n log^2 n )  and  S(B) ∈ O( |X| · |Y|^k · (d/β)^2 · log^2 n ).

We note that it is sometimes possible to get better bounds on the mixing time of the k-stack process for specific graph classes than the one given by Corollary 1.

The fifth ingredient: Fast synchronous protocols for leader election and exact majority.
Finally, to instantiate efficient graphical population protocols for majority and leader election, in Section 8 we provide their counterparts in the synchronous token shuffling model:

– There is a synchronous 2-token shuffling protocol for the leader election task that stabilises in O(log n) rounds with high probability, uses O(log n) states per node and two token types.
– There is a synchronous 2-token shuffling protocol for the exact majority task that stabilises in O(log n) rounds with high probability, uses O(log n) states and five token types.

Plugging the above results into Theorem 2, we get the following corollary.

Corollary 3.
For any d-regular graph G with edge expansion β > 0, there exist stochastic population protocols for leader election and exact majority which stabilise with high probability. Their parallel time and state complexities are both bounded by (d/β)^2 polylog n.

The last ingredient: Backup protocols. We note that our simulation framework guarantees stabilisation with high probability only, that is, the obtained protocols may fail with low probability (e.g. when the phase clocks become desynchronised). Nevertheless, we can obtain always-correct protocols, i.e., protocols with finite expected stabilisation time, by switching to an always correct, but possibly much slower, "backup protocol" in the unlikely executions where the fast protocols fail. We choose the backup protocols as follows:

– For exact majority, we can use the 4-state protocol by Draief and Vojnović [35], which stabilises in polynomial expected time on any connected graph.
– For leader election, we can use the 6-state protocol of Beauquier et al. [16]. In Section 9, we show that the protocol stabilises in O(diam(G) · m log n) expected parallel time in any connected graph under the stochastic scheduler. This is done by analysing the meeting times of certain correlated random walks that arise in the population protocol model.

Combining the backup protocols with the faster protocols yields protocols whose expected parallel time to stabilise is bounded by (d/β)^2 polylog n, with only a constant overhead in the state complexity.

Related work

Our framework builds on technical tools developed in several different fields, ranging from graphical load balancing [49] to random walks on finite groups [30]. We now give a brief overview of related work on population protocols, interacting particle systems, and token shuffling processes.
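To see why constant-state dynamics of this kind stabilise, slowly but surely, on any connected graph, the following bare-bones coalescence sketch (ours; a simplified stand-in, not the 6-state token-based protocol of [16]) captures the core mechanism:

```python
import random

def coalescing_leader_election(adj, seed=0):
    """All nodes start as leaders; when two leaders interact, the
    responder becomes a follower. Under the uniform stochastic
    scheduler on the edges of a connected graph, this terminates with
    a single leader in finite expected time. Returns the number of
    interactions taken and the id of the surviving leader."""
    rng = random.Random(seed)
    edges = [(u, v) for u in range(len(adj)) for v in adj[u] if u < v]
    leader = [True] * len(adj)
    steps = 0
    while sum(leader) > 1:
        u, v = rng.choice(edges)  # uniform stochastic scheduler
        steps += 1
        if leader[u] and leader[v]:
            leader[v] = False
    return steps, leader.index(True)
```

The full backup protocols additionally let followers stabilise their outputs; the sketch only shows why the leader count eventually coalesces to one on any connected graph.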
Population protocols on graphs.
Already the pioneering work on population protocols considered interactions restricted to pairs of nodes adjacent in some underlying interaction graph [9]. However, the initial line of research focused mostly on the computational power of the model, particularly in the setting with constantly many states per node, e.g. [9–11, 22, 25].

The differences between the computational power of fully-connected and general bounded-degree graphs were established early [9, 13]. Under the adversarial scheduler, the clique graph is the weakest interaction graph in terms of computability. Protocols in e.g. bounded-degree graphs can compute a strictly larger set of predicates than protocols in the clique [14]. While constant-space protocols on clique graphs can compute exactly those predicates definable in Presburger arithmetic [10], population protocols on a bounded-degree graph can simulate the computation of a linear-space Turing machine and compute many network predicates of the underlying graph [10]. We refer to the survey of Aspnes and Ruppert [14] for a more detailed overview of the early results in this area.

We also note that many self-stabilising protocols, i.e. protocols that recover from arbitrary transient failures corrupting the local states of nodes, have been studied in general graphs [13]. In particular, self-stabilising leader election has received considerable attention in the literature. A large portion of the research in this area has focused on computability issues and on identifying on which interaction graphs the problem can be solved, e.g. under what kinds of oracles [13, 16]. However, as self-stabilising leader election is generally a strictly harder problem than the non-self-stabilising variant we consider, it is often necessary to make additional assumptions on the model or the underlying graph structure when working with self-stabilisation.
Regardless, in terms of complexity, Chen and Chen [26] gave a constant-state protocol with exponential stabilisation time in directed cycles and 2-dimensional toroidal grids. Later, they gave a protocol for d-regular graphs using O(d) states [27]. Recently, Yokota, Sudo and Toshimitsu [55] reported a leader election protocol for directed cycles that runs in O(n) parallel time using O(N n) states, where N ≥ n is a known upper bound on the population size.

Space-time complexity tradeoffs in the clique.
More recently, there has been a flurry of interest in determining the fundamental space-time trade-offs for key tasks, such as majority and leader election [6, 7, 18, 20, 21, 33, 35, 39, 45]. For example, Doty and Soloveichik [33] established that there is no constant-state protocol for leader election that runs in o(n^2) expected steps, and Alistarh, Aspnes, Eisenstat, Gelashvili and Rivest [6] showed that the bound on the expected number of steps also holds for any protocol using o(log log n) states per node.

Berenbrink, Giakkoupis and Kling [21] eventually established a matching upper bound for leader election by giving a protocol that uses O(log log n) states and elects a leader in the optimal number of Θ(n log n) steps. Similarly to leader election, a parallel line of research focused on bounding the complexity of the exact majority task [5, 7, 39]. Gąsieniec, Stachowiak and Uznański [39] reported a protocol that computes exact majority w.h.p. in O(n log n) steps using O(log n) states. Doty, Eftekhari and Severson [34] followed up with a protocol that solves the problem in optimal expected time O(log n) using O(log n) states.

Understanding complexity on general graphs.
The vast majority of the work investigating the complexity of population protocols has focused on the case where the underlying graph is a clique [4, 36]. Two natural justifications for this choice are that: (1) the clique is a reasonable approximation of the well-mixed solution assumption, and (2) the analysis of population protocols can be difficult enough even without additional graph structure complicating matters. The recent survey [36] points out that running time on general graphs is poorly understood, and sets this as an open question. Bounds on non-complete graphs have been studied for exact [35] and approximate majority [45, 46], with some recent work considering plurality consensus [17, 28, 29] in a closely-related model.

In this paper, we provide a general and efficient way of leveraging the recent progress in the clique model to general regular graphs. We show that time- and space-efficient algorithms – by which we mean polylogarithmic parallel time and space complexity – can be obtained in well-behaved graph topologies. Reminiscent of the state shuffling approach used to show that the clique is the weakest interaction graph under adversarial schedules, our work also employs a shuffling procedure in a graph to simulate pairwise interactions between any pairs of nodes. However, a key difference is that we aim to simulate pairwise interactions under the uniform stochastic scheduler, as (the analyses of) fast protocols in the clique exploit the fact that the pairwise interactions are uniformly random among all pairs [4, 36]. Thus, one of the main technical challenges is to devise a shuffling procedure that guarantees that the simulated interactions are (almost) uniform.
Besides population protocols on graphs, there is a long line of research investigating various dynamics of interacting particle systems on graphs [2]. For example, the dynamics are often (but not always) assumed to evolve in a synchronous fashion, where in each round all particles make random choices independently of all other particles. This allows the use of many powerful techniques for analysing independent random walks on graphs [44]. In comparison, the stochastic population protocol model is asynchronous, as in each step only two randomly selected neighbouring nodes interact, and the new states of the interacting nodes tend to be highly correlated.

Coalescing random walks and the voter model.
Cooper, Elsässer, Ono and Radzik [28] analysed the coalescence time of independent random walks on a graph in terms of the expansion properties of the graph. In this process, each node initially holds a unique particle, and in each time step every particle – independently of all other particles – randomly moves to another node. When two particles meet, they unite into a single particle, which continues to follow a random walk and coalesces with any other particle it meets. The expected coalescence time of the random walks is closely related to the consensus time of the classic voter dynamics, where in each round every node switches to the opinion of a randomly chosen neighbour.

In this work, we use token-based population protocols on graphs, where the tokens are shuffled between nodes during an interaction and the tokens, instead of coalescing, may also interact in other ways. For example, we analyse the running time of a constant-state leader election protocol by Beauquier et al. [16], which is used as a subroutine for our fast leader election protocol. The constant-state protocol employs tokens that, instead of coalescing upon meeting, update their state; moreover, the tokens do not perform independent random walks. Nevertheless, we can use a similar approach as Cooper et al. [28] to study the coalescence time of non-independent random walks and establish time bounds on the constant-state leader election protocol under the stochastic scheduler.

Token-based processes have also been used to implement efficient, randomised rumour spreading protocols. For example, Berenbrink, Giakkoupis and Kling [19] analysed the cover time of a synchronous coalescing-branching random walk on regular graphs: initially there is a single token located at some node. In each round, (1) every node v that has a token splits its token into k parts and sends them to k randomly chosen neighbours, and (2) at the end of the round, all tokens at a single node coalesce into a single token.
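The coalescing random walk process described above is easy to simulate directly. The sketch below (with a small complete graph as a hypothetical input, and a step cap purely for safety) counts the synchronous rounds until all particles have merged:

```python
import random

def coalescence_rounds(adj, rng, max_rounds=10_000):
    """Coalescing random walks: one particle starts on every node; in each
    synchronous round every particle moves to a uniformly random neighbour,
    and particles landing on the same node merge into one."""
    positions = set(adj)                      # one particle per node
    rounds = 0
    while len(positions) > 1 and rounds < max_rounds:
        positions = {rng.choice(adj[v]) for v in positions}
        rounds += 1
    return rounds

# Hypothetical instance: the complete graph on 8 nodes.
n = 8
adj = {v: [u for u in range(n) if u != v] for v in range(n)}
```

Note that the protocols analysed in this paper are asynchronous and their tokens interact rather than merge; this snippet only illustrates the classic synchronous process.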
Similarly to our work, they use conductance to bound the behaviour of this process in regular graphs, and show that for k = 2 the cover time of the coalescing-branching process is of order φ⁻² · log n rounds.

Plurality consensus and token shuffling.
In plurality consensus, there are k > 1 opinions and the task is to agree on an opinion supported by the most nodes. Berenbrink, Friedetzky, Kling, Mallmann-Trenn and Wastell [17] present a protocol for the plurality consensus problem in a synchronous pull-based interaction model. Their protocol also circulates tokens, and samples their count periodically (after mixing) to estimate opinion counts, running into the issue that the token movements are correlated. The authors provide a generalisation of a result by Sauerwald and Sun [50] in order to show that the joint token distribution is negatively correlated, and therefore the token counting mechanism concentrates.

In this work, we also employ a token exchange protocol, and encounter non-trivial correlation issues. However, we resolve these issues differently: we characterise the distribution of the token interactions using the k-stack interchange process, and bound its total variation distance relative to the uniform distribution, showing that the two distributions are indistinguishable in polynomial time with high probability. More generally, the goal of our construction is different, as we aim to provide a general framework to efficiently simulate pairwise random node interactions.

Our work builds on the work done on card shuffling processes. These processes have a long and rich history [1, 24, 30–32, 40, 48, 54]. While many of these processes are simple to describe, they are often surprisingly challenging to analyse. Here we focus on key results related to the interchange process, where the cards are placed on the nodes of a graph and shuffling is performed by randomly exchanging cards between adjacent nodes. We note that much of the work has aimed to identify sharp bounds on the mixing time of the interchange process on various graphs. Diaconis and Shahshahani [31] gave sharp bounds of the order Θ(n log n) on the mixing time of the random transpositions shuffle, i.e., the interchange process on the clique.
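The random transpositions shuffle just mentioned is simple to simulate: sampling two positions i and j independently and uniformly and swapping them yields the identity with probability 1/N (when i = j) and each specific transposition with probability 2/N². A minimal sketch:

```python
import random

def random_transposition_step(deck, rng):
    """One step of the random transpositions shuffle on a deck of N cards."""
    i = rng.randrange(len(deck))
    j = rng.randrange(len(deck))   # i == j happens with probability 1/N: a lazy step
    deck[i], deck[j] = deck[j], deck[i]

def shuffle(deck, steps, rng):
    for _ in range(steps):
        random_transposition_step(deck, rng)
    return deck
```

By the Diaconis–Shahshahani bound cited above, on N cards on the order of N log N such steps suffice to mix.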
Aldous [1] established that the mixing time of the interchange process on the path is between Ω(n³) and O(n³ log n); later Wilson [54] showed that the mixing time is in fact Θ(n³ log n). Diaconis and Saloff-Coste [30] developed a powerful technique for upper bounding the mixing time of a random walk on a finite group by comparing it to another walk with known behaviour via certain Dirichlet forms. Our analyses of the k-stack interchange process also rely on this comparison technique. A decade later, Wilson [54] gave a general technique for proving lower bounds for many shuffling processes. In particular, he showed that the mixing time on the two-dimensional √n × √n grid is Θ(n² log n), and Ω(n log n) on the hypercube. Subsequently, Jonasson [40] gave additional upper and lower bounds on the interchange process on various graphs, including showing that the mixing time on the hypercube and constant-degree expanders is at most O(n log² n), and O(ρmn log n) on any m-edge graph with radius ρ. For a further exposition of this area, we refer to [43].

In this work, we introduce and analyse a generalisation of the interchange process, called the k-stack interchange process, where each node holds k > 1 cards instead of one.

Graphs.
Let G = (V, E) be a graph, where V is the set of vertices and E is the set of edges. Throughout, we use n = |V| and m = |E|. The degree of a node is the number of neighbours it has, and a graph is d-regular if all vertices have degree d. For any S ⊆ V, we define

∂S = { {u, v} ∈ E : u ∈ S, v ∈ V \ S }   and   β = min{ |∂S| / |S| : S ⊆ V, 1 ≤ |S| ≤ n/2 },

where ∂S is the set of edges with exactly one endpoint in S and β is the edge expansion of the graph G. That is, |∂S| ≥ β · |S| holds for all S ⊆ V such that 1 ≤ |S| ≤ n/2.

Probability distributions.
Let E be a finite set. We say µ : E → [0, 1] is a probability distribution on E if Σ_{x ∈ E} µ(x) = 1 holds. We use X ∼ µ to denote a random variable sampled according to µ, so that Pr[X = x] = µ(x). For any distribution µ and set A we write µ(A) = Σ_{x ∈ A} µ(x). The uniform distribution on E is the distribution ν defined by ν(x) = 1/|E|. The support of µ is the set { x : µ(x) > 0 }. The total variation distance between two distributions µ₁ and µ₂ on E is given by

‖µ₁ − µ₂‖_TV = (1/2) Σ_{x ∈ E} |µ₁(x) − µ₂(x)| = max_{A ⊆ E} |µ₁(A) − µ₂(A)|.

We say that µ is ε-uniform on E if ‖µ − ν‖_TV ≤ ε.

Permutations and the symmetric group.
Let N > 0 be a positive integer and [N] = {0, . . . , N − 1}. A permutation on [N] is a bijection from [N] to [N]. The symmetric group S_N over [N] is the group consisting of the set of all permutations on [N], with function composition as the group operation and identity element id defined by id(i) = i. The inverse x⁻¹ of an element x ∈ S_N is the map satisfying x⁻¹ · x = x · x⁻¹ = id. A transposition (i j) ∈ S_N of i and j is the permutation that swaps the elements i and j, but leaves all other elements in place. We say that a set H ⊆ S_N generates S_N if every element of S_N can be expressed as a finite product of elements in H and their inverses. We use · and ◦ interchangeably to denote function composition.

Random walks on symmetric groups.
Let µ be a symmetric probability distribution on S_N, i.e., µ(x) = µ(x⁻¹) for all x. The random walk on S_N with increment distribution µ is a discrete-time Markov chain with state space S_N. In each step, an element x ∼ µ is chosen according to the distribution µ and the chain moves from state y to state xy. Thus, the probability of transitioning from state x to state yx is µ(y). The holding probability of the random walk is α = µ(id). The following lemma summarises some useful properties of such random walks; see e.g. [43] for proofs.

Lemma 4.
Let µ be an increment distribution for a random walk on the symmetric group S_N.
(1) The uniform distribution ν on S_N is a stationary distribution for the random walk.
(2) The random walk is reversible if and only if µ is symmetric.
(3) The random walk is irreducible if and only if the support of µ generates S_N.
(4) If µ(id) > 0, then the random walk is aperiodic.

Mixing times.
Let ν be the uniform distribution on S_N and let p^(t) be the probability distribution over the states of the chain after t steps. Following [30], we define the ℓ_s-norm and the normalised ℓ_s-distance to stationarity for s ≥ 1 as

‖µ‖_s = ( Σ_x |µ(x)|^s )^{1/s}   and   d_s(t) = |S_N|^{1−1/s} · ‖p^(t) − ν‖_s.

The total variation distance and the normalised distances satisfy

2 ‖p^(t) − ν‖_TV = d₁(t) ≤ d₂(t),

where the latter inequality follows from the Cauchy–Schwarz inequality. We define τ(ε) = min{ t : d₂(t) ≤ ε } and τ_mix = τ(1/4), where τ_mix is the mixing time of the random walk. We refer to τ(ε) as the ε-mixing time. Note that τ(ε) ≤ ⌈log ε⁻¹⌉ · τ_mix.

Distributed tasks.
Let Σ and Γ be finite sets of input and output labels, respectively. A task Π on a set of n nodes is a function Π that maps any input labelling z : [n] → Σ to a set Π(z) ⊆ Γ^V of feasible output labellings. That is, z′ : [n] → Γ is a feasible output labelling for input z if z′ ∈ Π(z). If Π(z) = ∅, then we say that z is an infeasible input. We focus on two tasks:
– In the leader election task, the input is the constant function z(v) = 1, and the output labelling z′ is feasible iff there exists v ∈ V such that z′(v) = 1 and z′(u) = 0 for all u ≠ v.
– In the majority task, the input is a function z : V → {0, 1} and an output labelling z′ is feasible iff it is the constant function z′(v) = b, where b is the input value held by the majority of the nodes. As is conventional, an input with equally many zeros and ones is taken to be infeasible.

Graphical stochastic population protocols. Let G = (V, E) be a graph. In the graphical stochastic population model, abbreviated as PP(G), the computation proceeds asynchronously: in each time step t > 0, a stochastic scheduler picks uniformly at random an ordered pair (u, v) of nodes to interact, where {u, v} ∈ E. The node u is called the initiator and v the responder. During an interaction, the nodes u and v read each other's states and update their local states according to a given protocol. We assume that nodes have access to independent and uniform random bits. Specifically, upon each interaction, both u and v are provided with a single random bit each. We note that this assumption is common in the context of population protocols, e.g.
[37], and can be justified practically by the fact that chemical reaction network (CRN) implementations can directly obtain random bits given the structure of their interactions [23]. Formally, a protocol in this model is a tuple A = (f, ℓ_in, ℓ_out), where f : S × {0, 1} × S × {0, 1} → S × S is the state transition function, S is the set of states, ℓ_in : Σ → S maps inputs to initial states, and ℓ_out : S → Γ maps states to outputs. An asynchronous schedule is a random sequence (e_t)_{t ≥ 1} of pairs e_t = (u, v). An execution is the sequence (x_t)_{t ≥ 0} of configurations given by

( x_{t+1}(u), x_{t+1}(v) ) = f( x_t(u), q_t(u), x_t(v), q_t(v) )   and   x_{t+1}(w) = x_t(w) for w ∈ V \ {u, v},

where (u, v) = e_{t+1} and q_t(u) ∈ {0, 1} is the random bit provided to a node during its interaction. The output of the protocol at step t is given by z′_t = ℓ_out ◦ x_t. We say that the protocol A stabilises on input z by time T if z′_{t+1} = z′_t and z′_t ∈ Π(z) hold for all t ≥ T. We say that A solves the task Π with probability at least p in T(A) steps if the protocol stabilises by time T(A) on any input with probability at least p. The state complexity of the protocol is S(A) = |S|, i.e., the number of states used by the protocol.

Synchronous token shuffling protocols.
In the synchronous k-token shuffling model, we assume that there are n agents which communicate in a round-based fashion using tokens. In each round,
(1) every node v generates exactly k tokens based on its current state,
(2) all nk tokens are shuffled uniformly at random so that each node is assigned exactly k tokens,
(3) every node v updates its local state based on its current state and the k tokens it received.
Let X and Y be finite sets. Formally, an algorithm in this token shuffling model is a tuple (f, g, ℓ_in, ℓ_out). The function f : X × Y^k → X is a state transition function, and g : X → Y^k determines which tokens each node creates at the start of each round. Here, X represents the set of states a node can take, and Y is the set of values a token can take. The function ℓ_in : Σ → X maps input values to initial states and ℓ_out : X → Γ maps the state of a node onto an output value. A configuration is a map x : V → X. An execution is a sequence (x_r)_{r ≥ 0} of configurations, where x_r(v) gives the state of node v at the end of round r. The initial state of node v is x_0(v) = ℓ_in(z(v)), where z(v) is the input of node v. A synchronous schedule is a sequence (σ_r)_{r ≥ 1}, where the permutation σ_r ∈ S_{nk} describes how the tokens are shuffled in round r. For any y : [N] → Y and node v, we let y(v_0, . . . , v_{k−1}) = (y(v_0), . . . , y(v_{k−1})), where v_i = vk + i. The execution induced by the synchronous schedule (σ_r)_{r ≥ 1} on input z is defined by

y_{r+1}(v_0, . . . , v_{k−1}) = (g ◦ x_r)(v)   and   x_{r+1}(v) = f( x_r(v), (y_{r+1} ◦ σ_{r+1})(v_0, . . . , v_{k−1}) ),

where y_{r+1}(v_0, . . . , v_{k−1}) and (y_{r+1} ◦ σ_{r+1})(v_0, . . . , v_{k−1}), respectively, are the k tokens generated and received by node v during round r + 1.
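A round of this model translates directly into code. The sketch below uses hypothetical transition functions g and f (here: every node emits k copies of its state and adopts the maximum value it sees) purely for illustration:

```python
import random

def shuffle_round(states, g, f, k, rng):
    """One round of the synchronous k-token shuffling model under the uniform
    scheduler: (1) every node generates k tokens via g, (2) all n*k tokens are
    permuted uniformly at random, (3) every node applies f to its state and
    the k tokens it receives."""
    tokens = [tok for x in states for tok in g(x)]    # step (1)
    rng.shuffle(tokens)                               # step (2)
    return [f(x, tuple(tokens[v * k:(v + 1) * k]))    # step (3)
            for v, x in enumerate(states)]

# Illustrative (hypothetical) protocol: spread the maximum input value.
g = lambda x: (x,) * 2                 # k = 2 copies of the current state
f = lambda x, toks: max((x,) + toks)   # adopt the largest value observed
```

Repeated rounds of this toy protocol converge to the all-maximum configuration, mirroring how information disperses through uniform token shuffles.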
The output of node v at the end of round r is (ℓ_out ◦ x_r)(v). When designing algorithms in this model, we assume the uniform synchronous scheduler, which picks each permutation independently and uniformly at random from the set of all permutations S_N.

Shuffling on graphs: the k-stack interchange process

We now describe and analyse the so-called k-stack interchange process, which is a variant of the classic interchange process. Both processes are examples of random walks on the symmetric group [30, 43]. In Section 7, we repeatedly run this process to simulate the synchronous schedules of the k-token shuffling model. The running time of this simulation will be bounded by the mixing time of the k-stack interchange process. We analyse the mixing time using the path comparison method of Diaconis and Saloff-Coste [30] combined with a flow result of Leighton and Rao [42].

A routing on a graph G is a map f that takes every pair of vertices onto a path f(u, v) in G connecting the vertices u and v. The congestion of a routing is the maximum number of paths with a common edge, and its dilation is the length of the longest path in the routing. We say that graph G is (C, D)-routable if there exists a routing with congestion C and dilation D. We make use of the following lemma, which follows from the results of Leighton and Rao [42, Theorem 18].

Lemma 5. If G is a d-regular graph with edge expansion β > 0, then G is (C, D)-routable with

C ∈ O( n log n / β )   and   D ∈ O( d log n / β ).

The k-stack interchange process. Let G = (V, E) be a graph with n vertices {0, . . . , n − 1} and let N = kn for k > 0. Consider the following shuffling process, where each node of G holds a stack of exactly k cards.
In every time step, one ofthe following actions is taken:(1) with probability / , move the top card of a random node to the bottom of its stack,(2) with probability / , choose a random edge { u, v } and swap the top cards of u and v ,(3) with probability / , do nothing.We refer to this process as the k -stack interchange process on G . The special case of k = 1 is theclassic interchange process on G with holding probability / , as the first rule does not do anythingon stacks of size 1. For k > , the holding probability will be / . The process for k = 1 and k = 2 are illustrated in Figure 3 (given in Section 2).Later, we will see that this shuffling process can be implemented in the population protocolmodel assuming that each node is given a single random bit per interaction, i.e., with a total of tworandom bits per interaction. The following result bounds the mixing time of the k -stack interchangeprocess on a ( C, D ) -routable graph. Theorem 6.
Suppose G is a (C, D)-routable graph with n vertices and m edges. For any constant k > 0, the k-stack interchange process has mixing time

O( n log n · max{ CDm/n², D } ).

Together with Lemma 5, the above theorem immediately implies the following general (but somewhat loose) bound on the mixing time of the k-stack interchange process on regular graphs.

Corollary 1. Let G be a d-regular graph with edge expansion β > 0. For any constant k > 0, the mixing time of the k-stack interchange process on G is

O( (d/β)² · n log³ n ).

While this bound is fairly general, it can be off by polylogarithmic factors for some graphs. For example, for cycles and cliques, Corollary 1 yields the bounds O(n³ log³ n) and O(n log³ n), respectively, while a direct application of Theorem 6 can be used to obtain the respective bounds of O(n³ log n) and O(n log n) by using the trivial routings in these graphs. Similarly to the approach of Jonasson [40] in the case of the classic interchange process, our analysis relies on the comparison method developed by Diaconis and Saloff-Coste [30], which we overview now.
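Before turning to the analysis, the process itself admits a very short simulation. The sketch below keeps one stack of k cards per node; the rule probabilities are left as parameters of the sketch:

```python
import random
from collections import deque

def kstack_step(stacks, edges, rng, p_rotate=0.5, p_swap=0.25):
    """One step of the k-stack interchange process: rotate a uniformly random
    stack (top card to the bottom) with probability p_rotate, swap the top
    cards across a uniformly random edge with probability p_swap, and do
    nothing (a lazy step) with the remaining probability."""
    r = rng.random()
    if r < p_rotate:
        s = stacks[rng.randrange(len(stacks))]
        s.append(s.popleft())                                      # rule (1)
    elif r < p_rotate + p_swap:
        u, v = rng.choice(edges)
        stacks[u][0], stacks[v][0] = stacks[v][0], stacks[u][0]    # rule (2)
    # rule (3): otherwise, hold

# Example: k = 3 cards per node on a 4-cycle; card i of node u is u*k + i.
n, k = 4, 3
stacks = [deque(range(v * k, (v + 1) * k)) for v in range(n)]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
rng = random.Random(7)
for _ in range(2000):
    kstack_step(stacks, edges, rng)
```

Every step permutes the N = nk cards, so the multiset of cards and the stack sizes are invariants of the process.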
The path comparison method.
Let µ and ˜µ be increment distributions whose supports H and ˜H generate the symmetric group S_N. For each element a ∈ ˜H, choose a representation a = x₁ · · · x_k, where k is odd and x_i ∈ H for 1 ≤ i ≤ k. Let |a| = k and let N(x, a) denote the number of times x appears in the representation of a. Define

B(x) = (1/µ(x)) Σ_{a ∈ S_N} ˜µ(a) N(x, a) |a|   and   B = max_{x ∈ H} B(x).

The quantity B is called the bottleneck ratio of the representation and it is useful for bounding the ℓ₂-distance to stationarity. We use the following special case of a lemma from [30, Lemma 5].

Lemma 7 ([30]). Let µ and ˜µ be symmetric increment distributions that generate the symmetric group S_N. Let B be the bottleneck ratio for a representation as defined above. Then

d₂(t)² ≤ N! · exp(−t/B) + ˜d₂(⌊t/B⌋)².

Random transpositions shuffle.
We compare the k-stack interchange process against the random transpositions shuffle, whose mixing behaviour is well understood. The random transpositions shuffle is a random walk on the symmetric group given by the increment distribution

µ(x) = 1/N if x = id,  2/N² if x = (i j),  and 0 otherwise.

Diaconis and Shahshahani [31] give the following bound on the mixing time of the random transpositions shuffle.

Lemma 8.
Let µ be the increment distribution for the random transpositions shuffle on S_N. There exists a universal constant C such that for t = ⌊N(log N + c)⌋,

d₂(t) ≤ Ce⁻ᶜ.

Proof of Theorem 6

We now give the proof of Theorem 6, which we break into parts.
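The proof below manipulates permutations as products of transpositions and cycles. For reference, composition, inverses, and transpositions on [N] can be realised concretely as follows (a small helper sketch, with permutations represented as tuples):

```python
def compose(x, y):
    """(x . y)(i) = x(y(i)) for permutations represented as tuples on [N]."""
    return tuple(x[y[i]] for i in range(len(x)))

def inverse(x):
    """The inverse permutation: inverse(x)[x[i]] = i."""
    inv = [0] * len(x)
    for i, xi in enumerate(x):
        inv[xi] = i
    return tuple(inv)

def transposition(N, i, j):
    """The transposition (i j) in S_N: swaps i and j, fixes everything else."""
    t = list(range(N))
    t[i], t[j] = t[j], t[i]
    return tuple(t)
```

For instance, composing a transposition with itself yields the identity, matching the odd-length representation bookkeeping used in the proof.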
Increment distribution for the k-stack interchange process. In the following, we label the cards from {0, . . . , nk − 1} and write u_i = uk + i for u ∈ [n] and i ∈ [k], so that u_i denotes the ith card of node u ∈ V. Thus, u_0 is the top card and u_{k−1} is the bottom card of the stack located at node u. Let σ_u = (u_0 u_{k−1} u_{k−2} . . . u_1) denote the permutation which moves the top card of u to the bottom of its stack. Recall that (u_0 v_0) is the transposition along an edge for neighbouring u ≠ v. The increment distribution µ of the random walk is given by

µ(x) = 1/(2n) if x = σ_u for some u ∈ V,  1/(4m) if x = (u_0 v_0) for some {u, v} ∈ E,  1/4 if x = id,  and 0 otherwise.

Thus, the support of the increment distribution of the k-stack interchange process is

H = {id} ∪ { σ_u : u ∈ V } ∪ { (u_0 v_0) : {u, v} ∈ E }.

Comparison with random transpositions shuffle.
Let ˜µ be the increment distribution of the random transpositions shuffle on S_N and ˜H = { (u_i v_j) : u, v ∈ V, i, j ∈ [k] } the support of ˜µ. We start by choosing a representation for each transposition (u_i v_j) ∈ ˜H in terms of an odd number of elements in H. After this, we bound the bottleneck ratio B of the chosen set of representations and apply Lemma 7 to bound the mixing time.

Bounding the bottleneck ratio.
Consider the partition ˜H = {id} ∪ ˜H₁ ∪ ˜H₂, where

˜H₁ = { (u_i u_j) : u ∈ V, 0 ≤ i < j < k }   and   ˜H₂ = { (u_i v_j) : u, v ∈ V, u ≠ v, i, j ∈ [k] }.

For each a ∈ ˜H we find a representation in terms of elements in H depending on which of the three parts a belongs to. The identity element of ˜H can be represented by the identity of H, so we need to only find odd-length representations for elements in ˜H₁ and ˜H₂:
– Suppose a = (u_i u_j) ∈ ˜H₁. We can represent this as

(u_i u_j) = σ_u^{k−i} · (u_0 v_0) · σ_u^{k−j+i} · (u_0 v_0) · σ_u^{j−i} · (u_0 v_0) · σ_u^{i},

where σ_u^i stands for i repetitions of σ_u and v is a fixed neighbour of u in G. The length |a| of the representation is 2k + 3 and N(x, a) ≤ max{2k, 3} for each x ∈ H.
– Suppose a = (u_i v_j) ∈ ˜H₂. Since the graph G is (C, D)-routable, there exists a path u = w_0, . . . , w_ℓ = v of length 1 ≤ ℓ ≤ D connecting u and v in G. Let

ρ_uv = (w_0 w_1) · · · (w_{ℓ−1} w_ℓ) · (w_{ℓ−2} w_{ℓ−1}) · · · (w_0 w_1)

be the resulting sequence of 2ℓ − 1 transpositions. With this, we can represent a = (u_i v_j) as

(u_i v_j) = σ_u^{k−i} · σ_v^{k−j} · ρ_uv · σ_v^{j} · σ_u^{i}.

Now |(u_i v_j)| = 2k + 2ℓ − 1 ≤ 2(k + D) − 1. Note that σ_u and σ_v are both used at most k times and ρ_uv uses each edge-wise transposition at most twice. Hence N(x, a) ≤ 2k.

Next, we bound B(x) for each x ∈ H. There are again three cases to consider:
– Suppose x = id. Since the identity element is only used to represent itself, we have that

B(id) = ˜µ(id)/µ(id) = 4/(kn) ∈ O(1).

– Suppose x = σ_u ∈ H for some u. Note that µ(σ_u) = 1/(2n), and σ_u appears in the representations of k(k − 1)/2 elements of ˜H₁ and of at most (n − 1)k² elements of ˜H₂. Thus,

B(x) = (1/µ(x)) Σ_{a ∈ S_N} ˜µ(a) N(x, a) |a| = (4n/N²) ( Σ_{a ∈ ˜H₁} N(x, a)|a| + Σ_{a ∈ ˜H₂} N(x, a)|a| ).

The sum over ˜H₁ is bounded by O(k⁴) and the sum over ˜H₂ by O(nk⁴ + nk³D).
Hence, B(x) ∈ O(k² + Dk).
– Suppose x = (u_0 v_0) ∈ H for some {u, v} ∈ E. Note that µ((u_0 v_0)) = 1/(4m), and (u_0 v_0) is used in at most k² representations in ˜H₁ and in at most Ck² representations in ˜H₂. Hence,

B(x) = (1/µ(x)) Σ_{a ∈ S_N} ˜µ(a) N(x, a) |a| = (8m/(nk)²) ( Σ_{a ∈ ˜H₁} N(x, a)|a| + Σ_{a ∈ ˜H₂} N(x, a)|a| ).

The sum over ˜H₁ is bounded by O(k³) and the sum over ˜H₂ by O(Ck³ + CDk²). Thus, B(x) ∈ O(CDmk/n²).

Therefore, the bottleneck ratio B is bounded by

B = max_{x ∈ H} B(x) ∈ O( k · max{ CDm/n², D } ).

Bounding the mixing time.
We can now bound the mixing time of the k-stack interchange process using the bound on the bottleneck ratio B and Lemma 7. Note that we can choose t ∈ Θ(BN log N) so that the following inequalities hold:

˜d₂(⌊t/B⌋) ≤ 1/8   and   exp(−t/B) ≤ N^{−2N} ≤ 1/(64 · N!).

The first inequality is obtained using Lemma 8. The total variation distance is then bounded using Lemma 7 and the fact that d₁(t) ≤ d₂(t), yielding

d₁(t) ≤ d₂(t) ≤ √( N! · exp(−t/B) ) + ˜d₂(⌊t/B⌋) ≤ 1/8 + 1/8 ≤ 1/e.

The claim of Theorem 6 follows by observing that the mixing time is bounded by O(BN log N), which for constant k is

O( n log n · max{ CDm/n², D } ),

as claimed. This concludes the proof of Theorem 6.

Decentralised phase clocks
In this section, we describe a bounded phase clock construction for the stochastic population protocol model on regular graphs. However, the construction can also be generalised to non-regular graphs, assuming that the degrees do not deviate too much from the average degree; see Appendix A. The construction generalises the approach of Alistarh et al. [7], which was used to build a leaderless phase clock on cliques, leveraging the classic two-choice load balancing process [15] and the analysis of Peres et al. [49] for graphical balanced allocations.
Recall that in the graphical stochastic population protocol model PP(G), two randomly chosen adjacent nodes interact in each step t > 0. Let φ > 0 be an integer and consider a protocol C with state variables c(v) ∈ {0, . . . , φ − 1} for each v ∈ V. The variable c(v) represents the value (i.e. phase) of the clock at node v. Let c(v, t) be the clock value node v has at the end of time step t (regardless of whether it was active during that step). We define the distance D between two clock values and the skew ∆ of the clock at the end of step t, respectively, as follows:

D(x, y) = min{ |x − y|, φ − |x − y| }   and   ∆(t) = max_{u,v ∈ V} D( c(u, t), c(v, t) ).

We say that the protocol C implements a (φ, γ, κ)-clock if for all t ≥ 0 the following hold:
(1) Pr[∆(t) ≥ γ] < t/n^κ, and
(2) c(v, t + 1) = c(v, t) + 1 mod φ for exactly one v ∈ V and c(u, t + 1) = c(u, t) for all u ∈ V \ {v}.
Intuitively, the parameter φ is the length of a phase, γ bounds the skew of the clock, and κ is a constant controlling the probability of failure. The above two properties guarantee that the clocks (1) have a skew bounded by γ for polynomially many steps with high probability; and (2) make progress (at some node) in each step. We say that a clock protocol C fails at time step t if the event ∆(t) ≥ γ occurs. Several types of phase clocks have been proposed in the population protocol literature, satisfying various guarantees, e.g. [7, 12, 37, 39, 52]. The above formulation is similar to that of [7] and proves convenient for our analysis.

Let G be a graph and suppose that each node of G contains a bin, which is initially empty. Consider the process where, in each step, a directed edge (u, v) is sampled uniformly at random and a ball is placed into the least loaded of the two bins at the endpoints of the edge (in case of ties, place the ball into bin u).
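The circular distance and skew just defined translate directly into code (a small sketch):

```python
def clock_distance(x, y, phi):
    """D(x, y): distance between two clock values on the cycle of length phi."""
    return min(abs(x - y), phi - abs(x - y))

def skew(clock_values, phi):
    """Delta: the largest pairwise circular distance among the clock values."""
    return max(clock_distance(a, b, phi)
               for a in clock_values for b in clock_values)
```

For example, with φ = 8 the values 0 and 7 are at distance 1, since the clock values wrap around modulo φ.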
Let ℓ(u, t) be the number of balls placed into bin u ∈ V by the end of step t and use

∆*(t) = max_{v ∈ V} ℓ(v, t) − min_{u ∈ V} ℓ(u, t)

to denote the gap between the most and least loaded bin. Peres et al. [49] analysed (a generalisation of) the above load balancing process. Their results imply the following bound for regular graphs (see Appendix A for details).

Lemma 9.
Suppose G = (V, E) is a d-regular graph with n nodes and edge expansion β > 0. For any constant κ > 0, there exists a constant c(κ) such that for all t > 0 the gap satisfies

Pr[ ∆*(t) > c(κ) · (d/β) · log n ] < t/n^κ.
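The graphical two-choice process behind Lemma 9 can be simulated in a few lines. The sketch samples a uniformly random directed edge and places a ball into the lesser-loaded endpoint, breaking ties towards the initiator:

```python
import random
from itertools import combinations

def two_choice(edges, n, steps, rng):
    """Simulate the graphical balanced-allocation process; return the loads."""
    load = [0] * n
    for _ in range(steps):
        u, v = rng.choice(edges)
        if rng.random() < 0.5:        # orient the undirected edge uniformly
            u, v = v, u
        target = v if load[v] < load[u] else u   # ties go to the initiator u
        load[target] += 1
    return load

# Example: the complete graph on 4 nodes.
K4 = list(combinations(range(4), 2))
loads = two_choice(K4, 4, 400, random.Random(3))
```

Even after many steps, the two-choice rule keeps the gap between the most and least loaded bins small, which is exactly the property the phase clock below inherits.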
We use the above result to obtain bounded phase clocks in the PP(G) model. When G is d-regular with edge expansion β, the skew of the phase clock is O((d/β) log n) for polynomially many steps with high probability. For example, for constant-degree expanders, the skew bound is O(log n). We note that this is the only place in our framework where the initiator/responder distinction is used.

Theorem 10.
Suppose G = (V, E) is a d-regular graph with n nodes and edge expansion β > 0. There exists a (φ, γ, κ)-clock for PP(G) that uses φ states per node for any κ > 0 and

φ ≥ 2γ   and   γ ≥ c(κ) · (d/β) · log n.

In particular, the clock fails with probability at most t/n^κ during the first t steps.

Proof. Fix the parameters κ, γ and φ. We devise the clock protocol for PP(G), where each node holds a counter value c(v, t) ∈ [φ]. Each node initialises its clock value to c(v,
0) = 0. Define

M_φ(x, y) = max(x, y) if |x − y| < φ/2, and min(x, y) otherwise.

The clock protocol is now defined by the following update rule. When nodes {u₁, u₂} ∈ E interact, where u₁ is the initiator and u₂ is the responder, they perform the following:
– If c(u₁, t) = c(u₂, t), then the initiator u₁ increments its clock value by one modulo φ.
– Otherwise, the node u_i whose clock value is behind, i.e. the node u_i for which M_φ(c(u₁, t), c(u₂, t)) = c(u_{3−i}, t) holds, increments its clock by one modulo φ.
We argue that this protocol implements a (φ, γ, κ)-clock, i.e., properties (1) and (2) are satisfied. Note that the above rules guarantee that in either case exactly one of the nodes increments its clock value by one modulo φ. This implies property (2).

It remains to show that property (1) holds. Note that for any nonnegative integers a₁ and a₂ such that |a₁ − a₂| < φ/2 we have max(a₁, a₂) = a_i ⟺ M_φ(r₁, r₂) = r_i, where r_i = a_i mod φ. We use this observation to show that if ∆*(t′) < γ holds for all 0 ≤ t′ ≤ t, then in each step t′ both the unbounded and the bounded process increment the counter of the same node. In particular, this implies that ∆*(t′) = ∆(t′), and Lemma 9 yields that Pr[∆*(t) > γ] = Pr[∆(t) > γ] < t/n^κ.

We proceed by induction on t. The base case t = 0 is vacuous, as ∆(0) = ∆*(0) = 0. Suppose the claim holds for some t. Let (u₁, u₂) be the (t + 1)th interaction pair. The induction hypothesis yields that c(u, t) = ℓ(u, t) mod φ for each u ∈ V and ∆(t) = ∆*(t) < γ. Hence, we have that M_φ(c(u₁, t), c(u₂, t)) = c(u_i, t) if and only if max(ℓ(u₁, t), ℓ(u₂, t)) = ℓ(u_i, t). If the counters of u₁ and u₂ differ at step t + 1, then the unbounded and bounded processes increment the counter of the same bin, namely the less loaded one. In case of ties, both processes increment the counter corresponding to u₁.
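In code, the bounded update rule reads as follows (a sketch, under the reading that the node whose clock is circularly behind advances, mirroring a ball going to the lesser-loaded bin):

```python
def m_phi(x, y, phi):
    """The circularly 'ahead' clock value: the maximum when the values are
    close, the minimum once they have wrapped around modulo phi."""
    return max(x, y) if abs(x - y) < phi / 2 else min(x, y)

def interact(c, u1, u2, phi):
    """One clock interaction with initiator u1: on a tie the initiator steps;
    otherwise the node whose value is behind steps (modulo phi)."""
    if c[u1] == c[u2] or m_phi(c[u1], c[u2], phi) == c[u2]:
        c[u1] = (c[u1] + 1) % phi
    else:
        c[u2] = (c[u2] + 1) % phi
```

For instance, with φ = 8 a node at value 7 is behind a node at value 0, so it is the one that advances (wrapping to 0).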
Hence, the unbounded balls-into-bins process and the bounded clock process increment the counter of the same node.

In this section, we give our main technical result: synchronous protocols in the fully-connected token shuffling model can be simulated in the graphical, stochastic population protocol model.

Theorem 2.
Let k > 0 be a constant and A be a synchronous k-token shuffling protocol on n nodes, where X is the set of local states and Y the set of token types used by the protocol A. If A solves the task Π with high probability in R ∈ poly(n) rounds, then there exists a stochastic population protocol B that also solves task Π with high probability on any n-node d-regular graph G with edge expansion β > 0. The step complexity T(B) and state complexity S(B) of the protocol B satisfy

T(B) ∈ O(R · n · ζ) and S(B) ∈ O(|X| · |Y|^k · ζ), where ζ = log n · ((d/β)^2 + τ_mix/n)

and τ_mix is the mixing time of the k-stack interchange process on G.

Notation.
The rest of this section is dedicated to proving this theorem. Throughout, we fix R = R(n) ∈ poly(n) and ε = 1/n^a < 1/(Rn^λ) for a sufficiently large constant a > 0. Let G = (V, E) be a d-regular n-node graph and N = kn. We use µ to denote the increment distribution of the k-stack interchange process on the graph G. The support of µ is the set H ⊆ S_N, we write ν for the uniform distribution on S_N, and τ = τ(ε) is the ε-mixing time of the k-stack interchange process.

We now give a stochastic population protocol that simulates uniform schedules of the synchronous token shuffling model. The protocol simulates the random walk made by the k-stack interchange process, synchronised by phase clocks.

Clock setup.
Let C be a (φ, γ, κ)-clock with parameters given by

γ ∈ O((d/β)^2 log n), φ = γ + ϑ, ϑ = 2τ/n + 3γ, t* = (Rφ + γ)n,

that fails (i.e., the clock skew becomes γ or greater) with probability at most 1/n^λ during the first t* steps. Since φ ≥ γ, R ∈ poly(n), and t* ∈ poly(n) hold, such a protocol exists by Theorem 10 for any constant λ > 0. The fact that t* is polynomially bounded follows from Corollary 1 and from the fact that β ≥ 1/n for any regular connected graph, so τ ≤ ⌈log 1/ε⌉ · τ_mix ∈ poly(n), and hence φ, γ ∈ poly(n).

The token shuffling protocol.
The parameter ϑ is used as a special threshold value for the token shuffling protocol. We assume that each node v holds exactly k tokens, ordered from 0 to k − 1, in the same manner as cards are ordered in the k-stack interchange process. We say that the first token is the top token. We say that node u is receptive whenever its clock satisfies c(u) < ϑ and suspended otherwise. When nodes in {u, v} interact, they apply the following rule:
(1) If both are receptive, that is, c(u) < ϑ and c(v) < ϑ holds, then
(a) Let q(u) and q(v) be the random coin flips of u and v, respectively.
(b) If q(u) = q(v) = 0, then u and v swap their top tokens.
(c) If q(u) ≠ q(v), then the node whose coin flip is 1 moves its top token to the bottom of its stack; the other node does nothing.
(d) If q(u) = q(v) = 1, then do nothing.
(2) Otherwise, do nothing.
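One interaction of this rule can be sketched as follows (a toy sketch with names of our choosing; stacks are deques with index 0 as the top token, and the injected `rng` stands in for the nodes' random bits):

```python
import random
from collections import deque

def shuffle_step(stacks, clocks, u, v, theta, rng=random):
    """One pairwise interaction of the token-shuffling rule.
    stacks[w] is node w's ordered token stack (index 0 = top token);
    a node is receptive iff its clock value is below the threshold theta."""
    if clocks[u] >= theta or clocks[v] >= theta:
        return                                   # a suspended node: do nothing
    qu, qv = rng.randint(0, 1), rng.randint(0, 1)
    if qu == 0 and qv == 0:
        stacks[u][0], stacks[v][0] = stacks[v][0], stacks[u][0]  # swap tops
    elif qu != qv:
        w = u if qu == 1 else v                  # the node whose coin is 1
        stacks[w].rotate(-1)                     # its top token goes to the bottom
    # qu == qv == 1: nothing happens
```

Only one random bit per node per interaction is consumed, matching the remark below about the randomness budget of the framework.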
Figure 4: The dynamics of the shuffling protocol for k = 1. Circles filled with white and red denote receptive and suspended nodes, respectively. The blue arrows connect nodes that exchange their tokens in the given step. Red lines denote steps where at least one of the interacting nodes is suspended, and thus no swap is made. (a) Initially all nodes are receptive and swap tokens with their interaction partners. After sufficiently many interactions, nodes become suspended and refrain from swapping tokens. (b) Eventually all nodes are suspended. The highlighted panel shows the resulting permutation, which will act as the interaction pattern for the simulated round. (c) As the phase clocks reset back to 0, nodes become receptive again, and the tokens are shuffled once more.

We note that the protocol uses at most one random bit per node per interaction and that this is the only part of our framework where the random bits provided to the nodes are used. The interacting nodes exchange at most 4 bits (i.e., whether they are receptive or not, and the results of their coin flips) in addition to the contents of the swapped tokens in Step (1b). Finally, observe that when all nodes are receptive, the tokens are shuffled according to the increment distribution µ of the k-stack interchange process on G. Figure 4 illustrates the dynamics of the shuffling protocol in the case k = 1.

We now analyse the above shuffling protocol. Let c(u, t) denote the clock value of node u at the end of step t, with c(u, 0) = 0 and t(v, 0) = 0. We say that the clock of node u resets at time step t if its value transitions from φ − 1 to 0. For r ≥ 0, define
– t(v, r + 1) = min{t > t(v, r) : c(v, t) = 0}; the step when v resets its clock for the rth time,
– t_min(r) = min{t(v, r) : v ∈ V}; the earliest step when some clock is reset for the rth time,
– t_max(r) = max{t(v, r) : v ∈ V}; the latest step when some clock is reset for the rth time.
Similarly, we define the times with respect to the events when the clocks reach the value ϑ:
– s(v, r) = min{t > t(v, r) : c(v, t) = ϑ},
– s_min(r) = min{s(v, r) : v ∈ V},
– s_max(r) = max{s(v, r) : v ∈ V}.

Lemma 11. With high probability, the following inequalities hold:
(1) t_max(R + 1) ≤ t* = (Rφ + γ)n,
(2) s_min(r) − t_max(r) ≥ τ for each 1 ≤ r ≤ R,
(3) t_max(r) < s_max(r) < t_min(r + 1) for each 1 ≤ r ≤ R.

Proof. Recall that the clock protocol works correctly with high probability for the first t* steps. We now assume that this event occurs.

For the first claim, we show that all nodes have incremented their clock at least Rφ times after t* steps. For the sake of contradiction, suppose that some node v has incremented its clock fewer than Rφ times during the first t* steps. By the second property of the clock protocol, in every step 1 ≤ t ≤ t* some node increments its clock value by one (modulo φ). Hence, the nodes in V \ {v} have incremented their clocks at least (Rφ + γ)(n − 1) times in total. By the pigeonhole principle, some node u ≠ v has incremented its clock at least Rφ + γ times. However, this contradicts the property that the clock skew is less than γ in each step 1 ≤ t ≤ t*. Since each node has incremented its clock at least Rφ times, each node has reset its clock R times, so t_max(R + 1) ≤ t*.

For the second claim, observe that during an interval of 2γn steps there must exist a node that has incremented its clock 2γ times by the pigeonhole principle.
By the first property of the clock protocol the skew is less than γ, so we get that

t_max(r) < t_min(r) + 2γn.

Again, since the skew of the clock is less than γ, and in each step at most one node increments its clock counter, the time until some node reaches the clock value ϑ after step t_min(r) satisfies

s_min(r) ≥ t_min(r) + (ϑ − γ)n.

Combining these two bounds and recalling that ϑ = 2τ/n + 3γ yields

s_min(r) − t_max(r) > (ϑ − γ)n − 2γn = (ϑ − 3γ)n = 2τ ≥ τ.

Finally, the third claim follows from the fact that φ = γ + ϑ and that the skew is bounded by γ.

Distribution of tokens.
We now show that the distribution of the tokens mixes to an ε-uniform distribution during the intervals {t_max(r) + 1, ..., s_min(r)} for 1 ≤ r ≤ R. Let π_0 = id and let π_t denote the locations of the tokens after t steps of the shuffling protocol. Define σ_0 = id and

σ_r = π_{s_max(r)} for 1 ≤ r ≤ R.

Observe that σ_r = ρ_3 · ρ_2 · ρ_1 · σ_{r−1}, where each ρ_i is a product of elements from the support H ⊆ S_N of the increment distribution µ of the k-stack interchange process:
– ρ_1 = x_{t_max(r)} ··· x_{s_max(r−1)+1} (only a subset of nodes have become receptive for the rth time),
– ρ_2 = x_{s_min(r)} ··· x_{t_max(r)+1} (all nodes are receptive),
– ρ_3 = x_{s_max(r)} ··· x_{s_min(r)+1} (a subset of nodes have become suspended for the rth time).
(Recall that permutations are applied from right to left.) Observe that while each x_i is a random element of H, only the elements of ρ_2 are guaranteed to be distributed according to the increment distribution µ of the k-stack interchange process. The elements of ρ_1 and ρ_3 are skewed towards the identity permutation, as some nodes are suspended whenever their clock values are in {ϑ, ..., φ − 1}. The next lemma establishes that this does not interfere with the mixing behaviour.

Lemma 12. Let 0 ≤ r < R. For any A ⊆ S_N, we have |Pr[σ_{r+1} ∈ A | σ_r] − ν(A)| ≤ ε.

Proof. Suppose σ_r is given. For brevity, let π = ρ_2 · ρ_1 · σ_r, so that σ_{r+1} = ρ_3 · π. Define p(x) = Pr[π = x | σ_r] and p'(x) = Pr[σ_{r+1} = x | σ_r]. Observe that |p(A) − ν(A)| ≤ ε, as ρ_2 is given by a sequence of at least τ elements sampled according to the increment distribution µ. We show that |p'(A) − ν(A)| ≤ ε. Let y · A denote the set {yx : x ∈ A}.
By expanding p'(A) using conditional probabilities, we can write

p'(A) = Pr[ρ_3 · π ∈ A | σ_r]
      = Σ_{y ∈ S_N} Pr[π ∈ y^{-1} · A and ρ_3 = y | σ_r]
      = Σ_{y ∈ S_N} Pr[ρ_3 = y | π ∈ y^{-1} · A, σ_r] · Pr[π ∈ y^{-1} · A | σ_r]
      = Σ_{y ∈ S_N} q(y) · p(y^{-1} · A),

where q(y) = Pr[ρ_3 = y | π ∈ y^{-1} · A, σ_r] is a probability distribution on S_N, so q(S_N) = Σ_y q(y) = 1. Since ν(A) = ν(z · A) for any z ∈ S_N, applying the triangle inequality gives

|p'(A) − ν(A)| = |Σ_{y ∈ S_N} q(y) · p(y^{-1} · A) − ν(A)|
             = |Σ_{y ∈ S_N} q(y) · [p(y^{-1} · A) − ν(y^{-1} · A)]|
             ≤ Σ_{y ∈ S_N} q(y) · |p(y^{-1} · A) − ν(y^{-1} · A)|
             ≤ Σ_{y ∈ S_N} q(y) · ε ≤ ε.

Using the shuffling protocol in the population protocol model, we can simulate an R-round algorithm A in the synchronous k-token shuffling model. Let f : X × Y^k → X be the state transition function and g : X → Y^k be the token generation function of the algorithm A. Recall that the sets X and Y denote the set of local state variables and token types, respectively.

The simulation protocol.
Each node v keeps a variable a(v) ∈ X to simulate the local state of the synchronous protocol A. In addition, the node holds k variables b_0(v), ..., b_{k−1}(v) ∈ Y that are used to store the sent and received tokens. The variable a(v) is initialised to the initial state x_0(v) of node v in the algorithm A, and b_0(v), ..., b_{k−1}(v) are initialised to the values given by g(x_0(v)). When node v interacts (in the asynchronous population protocol model), v updates its state according to the following rules:
(1) Run the clock and the shuffling protocol, using b_0(v), ..., b_{k−1}(v) to hold the k tokens.
(2) If c(v) = ϑ, then update the simulated state variable and generate new tokens by setting a(v) ← f(a(v), b_0(v), ..., b_{k−1}(v)) and b_0(v), ..., b_{k−1}(v) ← g(a(v)).

We show that the above algorithm simulates an execution of the synchronous algorithm A under the schedule σ_1, ..., σ_R given by the shuffling protocol. To this end, define x_0(v) = a(v, 0) and x_r(v) = a(v, s(v, r)) for all 1 ≤ r ≤ R.

Lemma 13.
The sequence (x_r)_{0 ≤ r ≤ R} is an execution induced by the schedule (σ_r)_{1 ≤ r ≤ R}.

Proof. Observe that each node v updates the variables a(v), b_0(v), ..., b_{k−1}(v) only during the steps s(v, 1), ..., s(v, R), when its local clock has reached the threshold value ϑ. Lemma 11 implies that with high probability every node updates these variables for the rth time during the interval {s_min(r) + 1, ..., s_max(r) + 1} for any 1 ≤ r ≤ R. In particular, this happens before step t_min(r + 1), when the first node becomes receptive for the (r + 1)th time. By letting y_{r+1}(v, i) be the value of b_i(v) after being updated for the rth time and y'_{r+1} = y_{r+1} ∘ σ_{r+1}, we get that the configuration x_{r+1} satisfies, with high probability, x_{r+1}(v) = f(x_r(v), y'_{r+1}(v, 0), ..., y'_{r+1}(v, k − 1)). Thus, the sequence given by x_0, ..., x_R is an execution induced by the schedule σ_1, ..., σ_R.

The schedules provided by the shuffling protocol are only ε-uniform, as the shuffling process is executed for finitely many steps. As our last stepping stone, we show that this does not matter: any synchronous protocol behaves statistically similarly under ε-uniform and uniform schedules. To formalise this, let Φ be the distribution of the sequence (σ_1, ..., σ_R) ∈ S_N^R of permutations generated by the shuffling protocol under the assumption that the clock protocol works correctly for the first t* time steps. Let ν_R = ν × ··· × ν denote the distribution of a sequence of R independently and uniformly sampled random permutations from S_N. That is, ν_R is the distribution of the uniform R-round schedules. We start with the following algebraic inequality.

Lemma 14.
Let a_i, b_i ∈ R_+ for 1 ≤ i ≤ t. Then

|∏_{i=1}^{t} a_i − ∏_{j=1}^{t} b_j| ≤ Σ_{i=1}^{t} |a_i − b_i| (∏_{k=1}^{i−1} a_k)(∏_{h=i+1}^{t} b_h).

Proof.
For all 0 ≤ i ≤ t, define

c_i = (∏_{k=1}^{i} a_k)(∏_{h=i+1}^{t} b_h) and d_i = (a_i − b_i)(∏_{k=1}^{i−1} a_k)(∏_{h=i+1}^{t} b_h).

The claim follows by observing that d_i = c_i − c_{i−1} holds and

|∏_{i=1}^{t} a_i − ∏_{j=1}^{t} b_j| = |c_t − c_0| = |Σ_{i=1}^{t} (c_i − c_{i−1})| ≤ Σ_{i=1}^{t} |c_i − c_{i−1}| = Σ_{i=1}^{t} |d_i|.

Lemma 15.
The total variation distance between Φ and ν_R satisfies ||Φ − ν_R||_TV ≤ εR.

Proof. Let A = A_1 × ··· × A_R ⊆ S_N^R. Since the sequence σ_1, ..., σ_R is Markovian, we can write

Φ(A) = Pr[(σ_1, ..., σ_R) ∈ A] = Pr[σ_1 ∈ A_1] · ∏_{i=2}^{R} Pr[σ_i ∈ A_i | σ_{i−1} ∈ A_{i−1}] = ∏_{i=1}^{R} φ_i(A),

where φ_1(A) = Pr[σ_1 ∈ A_1] and φ_i(A) = Pr[σ_i ∈ A_i | σ_{i−1} ∈ A_{i−1}] for i > 1. Recall that σ_0 = id. For notational convenience, let ν_i(A) = ν(A_i). By applying Lemma 14, we obtain

|Φ(A) − ν_R(A)| ≤ Σ_{i=1}^{R} |φ_i(A) − ν_i(A)| (∏_{k=1}^{i−1} φ_k(A))(∏_{h=i+1}^{R} ν_h(A))
             = Σ_{i=1}^{R} |Pr[σ_i ∈ A_i | σ_{i−1} ∈ A_{i−1}] − |A_i|/N!| (∏_{k=1}^{i−1} φ_k(A))(∏_{h=i+1}^{R} ν_h(A))
             ≤ Σ_{i=1}^{R} ε (∏_{k=1}^{i−1} φ_k(A))(∏_{h=i+1}^{R} ν_h(A)) ≤ Σ_{i=1}^{R} ε = εR,

where the second inequality follows from Lemma 12 and the last from the fact that the products are over probabilities. The claim now follows, as

||Φ − ν_R||_TV = max_{A ⊆ S_N^R} |Φ(A) − ν_R(A)| ≤ εR.

We are almost ready to show our main technical result. We recall the following standard result.
Lemma 16.
Let µ and ν be probability distributions over a finite domain Ω. For any function F : Ω → Ω', the total variation distance satisfies ||F(µ) − F(ν)||_TV ≤ ||µ − ν||_TV.

With all the pieces now in place, we can establish our simulation theorem.
Theorem 2.
Let k > 0 be a constant and A be a synchronous k-token shuffling protocol on n nodes, where X is the set of local states and Y the set of token types used by the protocol A. If A solves the task Π with high probability in R ∈ poly(n) rounds, then there exists a stochastic population protocol B that also solves task Π with high probability on any n-node d-regular graph G with edge expansion β > 0. The step complexity T(B) and state complexity S(B) of the protocol B satisfy

T(B) ∈ O(R · n · ζ) and S(B) ∈ O(|X| · |Y|^k · ζ), where ζ = log n · ((d/β)^2 + τ_mix/n)

and τ_mix is the mixing time of the k-stack interchange process on G.

Proof. Let A be the synchronous k-token shuffling protocol. Since the protocol works with high probability, assume it succeeds with probability at least p ≥ 1 − 1/n^h, where h is a constant we choose later. Using the simulation protocol, we construct a graphical population protocol B with the claimed properties. Recall that ε = 1/n^a, where a was an arbitrary constant. We set a so that εR ≤ 1/n^λ holds. By Lemma 13 the shuffling protocol simulates the execution of A induced by an R-round ε-uniform synchronous schedule (σ_r)_{1 ≤ r ≤ R} with high probability. This takes at most t* = (Rφ + γ)n steps by Lemma 11.

Recall that φ ∈ O(γ + τ/n), where γ is the bound on the clock skew and τ = τ(ε) is the ε-mixing time of the k-stack interchange process. Since τ ≤ ⌈log 1/ε⌉ · τ_mix ∈ O(log n · τ_mix), we get from Theorem 10 the following bounds:

γ ∈ O((d/β)^2 log n), φ ∈ O(log n · ((d/β)^2 + τ_mix/n)), t* ∈ O(Rn log n · ((d/β)^2 + τ_mix/n)).

The bound on t* establishes the claimed bound on the step complexity of B. For the state complexity, note that each node v stores the variables for the clock c(v) ∈ [φ], the local state a(v) ∈ X of the simulated protocol A, and the k tokens b_0(v), ..., b_{k−1}(v) ∈ Y.
This takes φ · |X| · |Y|^k states, establishing the bound on the state complexity.

It remains to argue that B solves the task Π with probability at least p − 1/n^λ. The output of algorithm A on input z under any R-round synchronous schedule Ξ is given by F_z(Ξ), where F_z is a computable function. Let D = F_z(Φ) and D' = F_z(ν_R) be the probability distributions of outputs in the executions of the algorithm A induced, respectively, by the simulated ε-uniform schedules given by Φ and the uniform schedules given by ν_R. By Lemma 15 and Lemma 16, we have that

||D − D'||_TV = ||F_z(Φ) − F_z(ν_R)||_TV ≤ ||Φ − ν_R||_TV ≤ εR ≤ 1/n^λ.

Therefore, the probability that the output z' under the execution induced by the ε-uniform schedule on input z satisfies z' ∈ Π(z) is

D(Π(z)) ≥ D'(Π(z)) − 1/n^λ ≥ p − 1/n^λ ≥ 1 − 1/n^h − 1/n^λ.

Thus, the output of protocol B is feasible with high probability. Since A stabilises in R rounds, nodes can set the output of B to be the output of A at the end of the Rth simulated round. Thus, the output of B stabilises as well.

We now give fast (and simple) algorithms in the fully-connected synchronous token shuffling model. These can be transported to the graphical, asynchronous population protocol setting using Theorem 2. We illustrate two types of algorithms:
(1) A leader election algorithm that uses one-way communication with k > 1 tokens. The protocol uses a one-way information dissemination protocol and a protocol for generating synthetic coins in the token shuffling model.
(2) An exact majority algorithm simulating two-way interactions in a population of 2n virtual agents. The algorithm uses the classic cancellation-doubling dynamics.
While we use straightforward adaptations of ideas from prior algorithms (see e.g.
[36] for a general overview), for the sake of completeness, we provide a full analysis of the algorithms we use in the synchronous token shuffling model.

Before we proceed, we establish some useful lemmas and facts. We start with the following observation about a particular quadratic recurrence.
Lemma 17.
For any n > x > 0 and r ≥ 0, the expression g(r) = n(x/n)^{2^r} is the closed-form solution of the quadratic recurrence

g(r) = g(r − 1)^2 / n if r > 0, and g(0) = x.

Moreover, g(r) ≤ 1/n^λ holds for all r ≥ log n + log ln n + log(λ + 1).

Proof. We show the identity via induction. The base case r = 0 is given by g(0) = n(x/n)^{2^0} = x. For the inductive step, we have

g(r + 1) = g(r)^2 / n = (1/n)[n(x/n)^{2^r}]^2 = n(x/n)^{2 · 2^r} = n(x/n)^{2^{r+1}}.

The second claim follows from the inequality (1 − 1/n)^{nx} ≤ e^{−x}, since

g(r) ≤ n(1 − 1/n)^{2^r} ≤ n(1 − 1/n)^{(λ+1)n ln n} ≤ ne^{−(λ+1) ln n} ≤ 1/n^λ.

Remark 18. We make use of the following elementary facts.
– The law of total expectation: for random variables X and Y on the same probability space, E[X] = E[E[X | Y]].
– Markov's inequality: for any nonnegative random variable X and real value a > 0, Pr[X ≥ a] ≤ E[X]/a.
– The union bound: for any events A_1, ..., A_n, we have Pr[∪_{i=1}^{n} A_i] ≤ Σ_{i=1}^{n} Pr[A_i].

We start by adapting a classic broadcast primitive to the k-token shuffling model. This protocol often goes by the names of "one-way epidemics" and "rumour spreading". We assume each node v is given an input z(v) from a set Σ with a total order on the values. The protocol computes the maximum value given as input. This can be used as an information dissemination protocol or to agree on a common value given as input.

One-way epidemics protocol.
The algorithm works in the k-token shuffling model for any k > 0. At the start of the protocol, each node v initialises a local state variable a(v) to its input value z(v). In every round, each node v performs the following steps:
(1) Generate k tokens of type a(v).
(2) Use one round to shuffle the generated tokens.
(3) After receiving k tokens y_0, ..., y_{k−1}, set a(v) ← max{a(v), y_0, ..., y_{k−1}}.
The number of states and token types used by the algorithm is |Σ|.

Lemma 19.
After O(log n) rounds, every node v satisfies a(v) = max_{u ∈ V} z(u) with high probability.

Proof. Let U_r denote the set of nodes that have not set their local state variable a(·) to the maximum input value after r rounds. Fix a constant λ > 0 and let R = ⌈log((λ + 1) n ln n)⌉ ∈ O(log n). We prove the lemma by showing that for all r ≥ R the probability that U_r is nonempty is at most 1/n^λ.

For each v ∈ V, let Y_r(v) be the indicator variable for the event that node v receives at least one token with the maximum input value in round r. By Step (3) of the protocol, if this event occurs, then v sets a(v) to the maximum input value at the end of round r. In particular, this implies that v ∉ U_{r'} for all r' ≥ r. Note that Pr[Y_{r+1}(v) = 1 | |U_r| = b] ≥ 1 − b/n. Define the random variable X_r = |U_r| for each r ≥ 0. By linearity of expectation, we have

E[X_{r+1} | X_r = b] = b − E[Σ_{v ∈ U_r} Y_{r+1}(v)] = b − Σ_{v ∈ U_r} E[Y_{r+1}(v)] ≤ b − b(1 − b/n) = b^2/n.

By the law of total expectation, we get E[X_r] = E[E[X_r | X_{r−1}]] ≤ g(r), where g(r) is the recurrence of Lemma 17 with x = |U_0| < n. By Markov's inequality and the second claim of Lemma 17, we get Pr[X_R ≥ 1] ≤ E[X_R] ≤ g(R) ≤ 1/n^λ.

We now consider the leader election problem, where the goal is to select a single node as a leader. Again, we adapt a well-known strategy used by leader election protocols in the standard population protocol model: each leader candidate iteratively (1) flips a random coin and (2) becomes a follower if another leader candidate had a coin flip with a larger value [36, 37].

In order to implement step (1) we need access to random bits. However, recall that by our definitions, the state transition and token generation functions in the token shuffling model are deterministic.
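Before continuing, a round of the one-way epidemics protocol of Lemma 19 can be sketched in a few lines of Python (a toy simulation with names of our choosing; the token shuffle is idealised as a uniformly random reassignment of the kn tokens):

```python
import random

def epidemics_round(state, k, rng=random):
    """One synchronous round of one-way epidemics in the k-token shuffling
    model: every node emits k tokens carrying its current value, all k*n
    tokens are reshuffled, and each node keeps the maximum of its own
    value and the k values it receives."""
    tokens = [a for a in state for _ in range(k)]   # k tokens per node
    rng.shuffle(tokens)                             # idealised uniform shuffle
    return [max(state[v], *tokens[k * v:k * (v + 1)])
            for v in range(len(state))]
```

Iterating this round O(log n) times spreads the maximum input value to every node with high probability, as in Lemma 19.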
While we could "lift" random bits from the underlying stochastic population protocol model, we instead opt for a clean method for generating synthetic coin flips in the k-token shuffling model with k > 1.

Synthetic coin flips.
Let k > 1 and consider the k-token shuffling model, where each node v receives exactly k tokens. Recall that these tokens are ordered from 0 to k − 1. We leverage this property to generate synthetic coin flips in one round as follows:
(1) Each node generates a single 0-token and a single 1-token.
(2) Use one round to shuffle the generated tokens.
(3) Output the value of the first token.
While the coin flips of different nodes are not independent, the probability that a node outputs 1 is 1/2. Thus, in expectation half of the nodes output 1.

Leader election protocol.
We assume that the input specifies a (nonempty) subset of nodes that start as leader candidates. Let ℓ(v) ∈ {0, 1} be a local variable of node v denoting whether it considers itself a leader candidate. In a single iteration, each node v executes the following:
(1) Generate a synthetic coin flip b(v) in one round.
(2) Run Θ(log n) rounds of the broadcast protocol of Lemma 19 with input ℓ(v) · b(v) ∈ {0, 1}.
(3) If ℓ(v) = 1 and b(v) = 0, then set ℓ(v) ← 0 if the broadcast protocol had output 1.
In every round, node v uses the value ℓ(v) as its current output value. Each iteration of the protocol takes Θ(log n) rounds, and the protocol uses Θ(log n) states and constantly many token types. We show that with high probability, the protocol reduces the number of leader candidates to one after O(log n) iterations, and hence, in O(log^2 n) rounds. The remaining candidate is the elected leader.

Note that a leader candidate ceases to be a candidate only if its local coin flip was 0 and the broadcast protocol informs it that some other leader candidate had a local coin flip with value one. Thus, we never end up in a situation where there are no leader candidates remaining.

Theorem 20.
There is a synchronous 2-token shuffling protocol for the leader election task that stabilises in O(log^2 n) rounds with high probability, uses O(log n) states per node and two token types.

Proof. Assume that the broadcast protocol in Step (2) succeeds in each of the first Θ(log n) iterations of the leader election protocol; this event happens with high probability. Let L_i be the (random) set of leader candidates after the ith iteration and X_i = |L_i| the random variable indicating the number of leader candidates after the ith iteration.

For each leader candidate v ∈ L_i, let B_i(v) be a random variable indicating whether node v had 1 as its (i + 1)th coin flip. Note that E[B_i(v)] = 1/2 for any v ∈ L_i. Let p(a) be the probability that each of the |L_i| = a leader candidates has 0 as its coin flip. If this event occurs, then no leader candidate gets removed. Observe that p(a) ≤ 1/2^a. Assuming a > 1 and using linearity of expectation, we get

E[X_{i+1} | X_i = a] = a · p(a) + (1 − p(a)) · E[Σ_{v ∈ L_i} B_i(v)] = a[p(a) + (1 − p(a))/2] ≤ 3a/4.

Thus, by the law of total expectation, E[X_t] ≤ n(3/4)^t holds. For any constant λ > 0 we can set t = (λ + 1) log_{4/3} n. By Markov's inequality,

Pr[X_t > 1] ≤ E[X_t] ≤ n(3/4)^t ≤ 1/n^λ.

Hence, with high probability, only one leader candidate remains after t ∈ O(log n) iterations. As each iteration takes Θ(log n) rounds, the algorithm stabilises in O(log^2 n) rounds with high probability, as desired. The state complexity bound comes from the fact that in Step (2) nodes count up to Θ(log n) rounds. The algorithm uses two token types.

We now obtain a protocol for the exact majority task in the 2-token shuffling model. We use the 2-token shuffling model to simulate a cancellation-doubling population protocol in a population of 2n virtual agents, where the agents interact synchronously according to a randomly chosen perfect matching. In every round, each node receives two tokens of types A and B, and generates two new tokens for the next round by applying a rule of the form A + B → C + D. The rules used guarantee that with high probability all tokens get converted to the value held by the initial majority of input values.
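A round of this style of simulation can be sketched as follows (a toy sketch: the shuffle is idealised as a uniformly random pairing of the 2n tokens, `'E'` is our stand-in name for the empty token ∅, and only the cancellation rule is shown):

```python
import random

def majority_round(tokens, rule, rng):
    """One synchronous round of the two-token simulation: the 2n tokens
    are reshuffled, and each of the n nodes applies a two-token rule
    A + B -> C + D to the pair of tokens it now holds."""
    rng.shuffle(tokens)                      # idealised uniform shuffle
    out = []
    for i in range(0, len(tokens), 2):       # each consecutive pair = one node
        out.extend(rule(tokens[i], tokens[i + 1]))
    return out

def cancel(a, b):
    """Cancellation rule: opposite opinions annihilate into empty tokens."""
    if {a, b} == {'0', '1'}:
        return ('E', 'E')
    return (a, b)
```

Since every application of `cancel` removes one token of each opinion, the discrepancy between majority and minority counts is preserved, which is the invariant the analysis below relies on.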
Hence, as output of the protocol, each node v can use an (arbitrary) value held by one of its tokens.

The exact majority protocol.
Let N = 2n and t = (λ + 1) log_{4/3} N, where λ > 0 is an arbitrary constant. Each node v initially creates two tokens that take the input value z(v) ∈ {0, 1}. After this, the algorithm consists of repeatedly running the following rules:
(1) For t consecutive rounds, apply the cancellation rules Z + Z̄ → ∅ + ∅ for Z ∈ {0, 1}, where Z̄ = 1 − Z.
(2) For t consecutive rounds, apply the doubling rules Z + ∅ → Z/2 + Z/2 for Z ∈ {0, 1}.
(3) Apply the promotion rule Z/2 → Z for each token of type Z/2 with Z ∈ {0, 1}.

Figure 5: The cancellation-doubling dynamics with 2n = 14 tokens and n = 7 nodes. Blue tokens have the initial majority. (a) A single round of the cancellation phase. White rectangles represent empty tokens. (b) Two rounds of the doubling phase. The small circular tokens are split tokens. (c) The promotion rule promotes all split tokens into full tokens at the end of the doubling phase.

Step (1) is called the cancellation phase and Step (2) the doubling phase. The protocol uses exactly five types of tokens: 0, 1, 0/2, 1/2 and ∅. Tokens of type Z ∈ {0, 1} represent an "opinion" on what the majority value is. The tokens of type Z/2 are called split tokens. A token of type ∅ is called an empty token. The idea is that (1) opposing opinions cancel out during the cancellation phase and (2) the number of majority tokens doubles in the doubling phase.

In each round, every node holds two tokens y_0 and y_1. If one of the tokens is nonempty, then the node outputs the largest value held by its nonempty tokens. Otherwise, if both tokens of a node are empty, i.e., y_0 = y_1 = ∅, then the node outputs its input value (in this case the protocol has not yet stabilised). We show that the algorithm stabilises in O(log N) = O(log n) rounds with high probability, i.e., the system reaches a configuration where all generated tokens take the majority input value.

Analysis.
We trace the usual steps taken in analysing cancellation-doubling dynamics [7, 12, 36]. Let A_i and B_i denote the number of majority and minority tokens after i iterations, respectively. Define the discrepancy as ∆_i = A_i − B_i, with ∆_0 > 0 being the initial discrepancy. In addition, we use A'_i, B'_i, and C_i to denote the number of majority, minority, and empty tokens after the ith cancellation phase.

The algorithm maintains the invariant A_i > B_i with high probability: the cancellation rule removes exactly one majority and one minority token each time it is applied. The doubling phase guarantees that A_{i+1} = 2A'_i and B_{i+1} = 2B'_i hold with high probability. Finally, once A_i = N holds, that is, all tokens have taken the majority value, then A_{i+1} = N holds. This ensures that once all tokens are of the same type, they remain so for all subsequent rounds.

Lemma 21. Let i ≥ 0. If A_i > B_i holds, then one of the following holds with high probability:
(1) B'_i = 0, that is, there are no minority tokens after the ith cancellation phase, or
(2) C_i ≥ N/2, that is, there are at least N/2 empty tokens after the ith cancellation phase.

Proof. Observe that B_i < A_i < N/4 implies the second condition C_i ≥ N/2. Thus, assume that A_i ≥ N/4 holds. Consider the event that a fixed minority token b is not cancelled during the t rounds of the cancellation phase, conditioned on there being at least N/4 majority tokens in each round of this phase. The probability that b meets a majority token in a given round is at least 1/4. Hence, the probability that b is not cancelled during any of these t rounds is at most

(1 − 1/4)^t = (3/4)^t = 1/N^{λ+1}.

Taking the union bound over all B_i < N/2 minority tokens yields that either (1) all minority tokens get cancelled with probability at least 1 − 1/N^λ, or (2) there are fewer than N/4 majority tokens remaining after the ith cancellation phase. In the latter case, B'_i ≤ A'_i < N/4, and hence C_i ≥ N − 2 · N/4 = N/2. This proves the lemma.

Lemma 22.
If C_i ≥ N/2 holds, then ∆_{i+1} = 2∆_i holds with high probability.

Proof. We say that a token of type Z ∈ {0, 1} splits if it activates the rule Z + ∅ → Z/2 + Z/2. Observe that after a nonempty token of type Z splits during the i-th doubling phase, it becomes a token of type Z/2 and cannot split again before the (i + 1)-th doubling phase.

Recall that, by assumption, there are C_i ≥ N/2 empty tokens. Hence, the probability that a nonempty token splits in a single round of the doubling phase is at least 1/4, since at most N/4 nonempty tokens can split (each removing an empty token from the system). Therefore, the probability that a nonempty token does not split is at most

(1 − 1/4)^t = (3/4)^t = 1/N^{λ+1}.

By the union bound, the probability that some nonempty token does not split is at most 1/N^λ. Consequently, ∆_{i+1} = 2(A′_i − B′_i) = 2(A_i − B_i) = 2∆_i.

Bounding the number of iterations.
Define the following two random variables:

K_1 = min{ i : B_i = 0 } and K_2 = min{ i : A_i = N }.

Here K_1 indicates the iteration after which no tokens taking the minority value are present anymore. The variable K_2 is the first iteration by which all tokens have been converted to the majority value. Note that K_2 ≥ K_1, since even after no minority tokens remain, some empty tokens may remain after the doubling phases.

Lemma 23.
The random variable K_1 satisfies K_1 ≤ ⌈log N⌉ + 1 with high probability.

Proof. Let K = ⌈log N⌉ + 1. Suppose that B_i > 0 holds for all 0 ≤ i ≤ K. By Lemma 21, C_i ≥ N/2 holds with high probability. By Lemma 22, we then get that ∆_{i+1} = 2∆_i, and hence ∆_i ≥ 2^i, with high probability. This implies that ∆_K > N, contradicting the assumption that B_K > 0.

Lemma 24.
The random variable K_2 satisfies K_2 ∈ O(log N) with high probability.

Proof. Note that if K_2 = K_1, then the claim holds by Lemma 23. Hence, assume that K_2 > K_1 holds and that K_1 is fixed. For K_1 ≤ i < K_2, let U_i be the set of empty tokens at the end of iteration i. Let Y_i(u) be an indicator variable for the event that u ∈ U_i gets consumed during the first round of iteration i + 1. Since Pr[Y_i(u) = 1 | C_i = c] = 1 − c/N holds, the expected number of empty tokens after iteration i + 1 satisfies

E[C_{i+1} | C_i = c] ≤ c − E[ Σ_{u ∈ U_i} Y_i(u) ] = c − Σ_{u ∈ U_i} E[Y_i(u)] = c − c(1 − c/N) = c²/N.

By the law of total expectation, E[C_{i+1}] = E[ E[C_{i+1} | C_{K_1}] ] ≤ g(i + 1 − K_1), where g is the recurrence from Lemma 17 with x_0 = C_{K_1} and n = N. For t′ = log N + log(λ + 1) + log ln N, the second claim of Lemma 17 yields

Pr[ C_{t′+K_1} ≥ 1 ] ≤ E[ C_{t′+K_1} ] ≤ g(t′) ≤ 1/N^λ ≤ 1/n^λ.

Hence, K_2 ≤ K_1 + t′ ∈ O(log N) with high probability.

Theorem 25.
There is a synchronous 2-token shuffling protocol for the exact majority task that stabilises in O(log² n) rounds with high probability, uses O(log n) states, and five token types.

Proof. By Lemma 24, the system reaches a configuration in which all tokens take the majority value within O(log N) iterations with high probability. As each iteration takes O(log N) = O(log n) rounds, the algorithm stabilises in O(log² n) rounds with high probability. The state complexity is O(log n), as the nodes count up to t ∈ O(log n) in each iteration and there are only 5 token types.

Finally, we address the following technical detail: our simulation framework and the simulated synchronous algorithms are guaranteed to work correctly and stabilise only with high probability, and therefore, the protocols may fail with low probability. To obtain always-correct protocols, i.e., ones with finite expected stabilisation time, we specify "backup protocols", which are run in the unlikely cases where either the simulation framework fails (e.g. the phase clocks become desynchronised) or the fast synchronous algorithm fails. This problem also occurs in the context of fast clique-based algorithms, e.g. [3, 7], and we adopt similar mitigation strategies.
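As a quick sanity check, the cancellation-doubling dynamics are easy to simulate. The Python sketch below is ours and not part of the paper: it replaces the 2-token shuffling protocol with an idealised uniform random pairing of the N = 2n tokens in every round, so it only illustrates the token dynamics, not the graphical simulation; the token names and seed are arbitrary choices.

```python
import math
import random

def cancellation_doubling(inputs, lam=2, seed=0):
    """Sketch of the cancellation-doubling dynamics on N = 2n tokens.

    Tokens: '0', '1' (opinions), '0/2', '1/2' (split), 'E' (empty).
    A uniform random pairing of all tokens stands in for the shuffling protocol.
    """
    rng = random.Random(seed)
    tokens = [str(z) for z in inputs for _ in range(2)]  # two tokens per node
    N = len(tokens)
    t = (lam + 1) * math.ceil(math.log(N, 4 / 3))

    def run_rounds(rule, reps):
        for _ in range(reps):
            rng.shuffle(tokens)
            for i in range(0, N, 2):
                tokens[i], tokens[i + 1] = rule(tokens[i], tokens[i + 1])

    def cancel(a, b):  # cancellation: 0 + 1 -> E + E
        return ('E', 'E') if {a, b} == {'0', '1'} else (a, b)

    def double(a, b):  # doubling: Z + E -> Z/2 + Z/2
        if a in '01' and b == 'E':
            return a + '/2', a + '/2'
        if b in '01' and a == 'E':
            return b + '/2', b + '/2'
        return a, b

    for _ in range(2 * (math.ceil(math.log2(N)) + 1)):  # O(log N) iterations
        run_rounds(cancel, t)   # cancellation phase
        run_rounds(double, t)   # doubling phase
        tokens[:] = [y[0] if y.endswith('/2') else y for y in tokens]  # promotion
    return tokens

out = cancellation_doubling([1] * 60 + [0] * 40)
assert '1' in out              # the majority opinion survives (deterministically)
assert set(out) <= {'1', 'E'}  # minority tokens are gone with high probability
```

Running this with a 60/40 split of inputs leaves only majority-opinion and empty tokens, matching the behaviour established by Lemmas 21–24.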
Switching to backup protocols.
Note that since the probability of failure of the fast protocols can be made polynomially small, to get polynomial expected stabilisation time it suffices to have a backup protocol with polynomial expected stabilisation time and small state complexity.

The backup protocol, if necessary, is initiated as follows. If some node notices disagreement or inconsistent states after the fast protocol's supposed stabilisation time, it initiates a signalling message, which is propagated further by all nodes that receive it. This signal forces all nodes to switch to executing the reliable (but potentially slow) backup protocol. Since the backup is only executed with low probability, and has negligible space cost, it does not affect the overall complexity of the fast exact majority protocol in the graphical population protocol model.
Backup for exact majority.
In the case of the exact majority protocol, we can directly adopt the same solution as in the classic clique setting [7]: use the four-state exact majority algorithm analysed by Draief and Vojnović [35] as a backup protocol. This algorithm works in arbitrary connected graphs and has polynomial expected stabilisation time.

Backup for leader election.
For leader election, we use the six-state leader election algorithm given by Beauquier et al. [16], who studied this protocol under the adversarial (non-stochastic) scheduler. In Section 9, we show that this protocol has polynomial expected stabilisation time under the stochastic scheduler on any connected graph.

Switching to the backup protocol can be done as follows: once a node has executed the fast protocol for sufficiently many rounds, it switches to the slow protocol using its current state (whether it is a leader candidate or not) as input for the constant-state protocol. Any node that observes during an interaction that some other node has switched to the slow protocol does so as well.
In this section, we analyse the leader election protocol of Beauquier et al. [16] under the uniformstochastic scheduler. We establish the following result.
Theorem 26.
There exists a protocol for leader election that uses six states and stabilises in any graph G in O(diam(G) · nm log n) steps with high probability and in expectation, where n is the number of nodes, m the number of edges, and diam(G) the diameter of G.

For any connected graph, the protocol stabilises in O(n⁴ log n) steps, but for sparse and low-diameter graphs the running time is better. For example, constant-degree regular expanders have diameter O(log n) and O(n) edges, and thus the algorithm stabilises in O(n² log² n) steps in expanders. In D-dimensional toroidal grids of constant dimension, we get a stabilisation time of O(n^{2+1/D} log n) steps, since such graphs have diameter O(n^{1/D}) and O(n) edges. Note that these bounds are roughly a factor n larger than the bound given by Corollary 3. However, as discussed, we can employ the slower protocol as a backup protocol for the fast leader election protocol of Section 8 to get finite expected stabilisation time.

First, we recall that in the classic clique setting leader election can be solved by a simple 2-state protocol, where each node keeps track of whether it is a leader candidate or a follower. Whenever two leader candidates interact, the initiator remains a leader candidate while the responder becomes a follower; no other type of interaction changes the state of the nodes.

The token-based protocol uses a similar approach. However, unlike in the clique, it may be impossible for two leader candidates to directly interact: they may not have a common edge in G. Instead, the nodes use tokens to interact indirectly: in each step, the interacting nodes update their status and exchange their tokens, and at every time step each node holds exactly one token.

There are three types of tokens: black, white, and inactive. Initially, each leader candidate creates a black token. In each step, the interacting nodes exchange their tokens.
Whenever two black tokens meet, exactly one of them turns into a white token while the other remains black. Informally, black tokens represent the presence of a leader candidate that has not yet been cancelled. A white token represents a leader candidate that will eventually become a follower. Whenever a node that considers itself a leader candidate receives a white token, it changes its own status into a follower and deactivates the token. The invariant maintained by the protocol is that the total number of non-inactive tokens present in the system equals the number of leader candidates. By continuously shuffling the tokens, it is eventually guaranteed that the total number of black tokens becomes one and all other tokens become inactive.

The protocol. Formally, the state of each node v is a tuple (ℓ, y), where ℓ ∈ {leader, follower} is a bit indicating whether node v is a leader candidate, and y ∈ {black, white, inactive} denotes the type of the token held by the node. As input, each node is given a bit indicating whether it is initially a leader candidate. Every node v initialises its state using the following rules:
– If v is a leader candidate, then it sets ℓ(v) ← leader and y(v) ← black.
– Otherwise, it sets ℓ(v) ← follower and y(v) ← inactive.
When two neighbouring nodes u and v are selected to interact by the scheduler, we say that (also) the tokens held by the nodes interact. On every interaction, where node u is the initiator and v is the responder, the states are updated as follows:
(1) If y(u) = y(v) = black holds, then y(v) ← white.
That is, if both tokens are black, then the token of the responder v is coloured white.
(2) If the token held by u is white, y(u) = white, and node v is a leader, ℓ(v) = leader, then
– node v designates itself a follower, i.e., sets ℓ(v) ← follower, and
– node u sets the type of its token to y(u) ← inactive.
(3) Finally, the nodes u and v swap their tokens y(u) and y(v).
For the formal proof of correctness, we refer to [16]. Here, we focus only on bounding the time for the protocol to stabilise under the uniform stochastic scheduler on G.

To establish bounds on the stabilisation time of the leader election protocol, we analyse the hitting time and meeting time of tokens performing random walks on the graph G. The stabilisation time of the token-based leader election protocol can then be bounded using these quantities. Before we proceed, we note the differences between the classic random walk process on a graph and the random walks made by the tokens in our process.

Random walks on graphs.
Recall that the classic random walk on a graph G is the following Markov chain: initially, a random walker (i.e. a token) is placed on some node v of G. In each step, the random walker moves from v to a neighbour u of v chosen uniformly at random. A natural extension is to consider multiple, independent random walkers moving on the nodes of G: several walkers may be placed on the nodes of G, and in every step each walker moves to a new random node independently of all the other walkers.

In contrast, in the population protocol model, we have to consider multiple tokens performing correlated random walks on G: in every step, exactly two tokens move along the same edge, which is sampled uniformly at random. Nevertheless, we can carefully adapt arguments analogous to those used to analyse the classic random walk (see e.g. [44]) and the coalescence time of independent random walks, as used by Cooper et al. [28]. Naturally, the bounds we obtain are somewhat different, as the underlying sampling process is different, and we do not aim for sharp bounds.

Hitting times for irreducible Markov chains. We start by recalling the following elementary result about hitting times of Markov chains; see e.g. [43, Proposition 1.19]. For states x and y, the expected hitting time H(x, y) is

H(x, y) = E[ min{ t ≥ 1 : X_t = y } | X_0 = x ].

For x ≠ y, H(x, y) is the expected number of steps until the chain starting in state x reaches state y. For x = y, the value gives the expected first return time to state x.

Lemma 27.
For any finite and irreducible Markov chain, the stationary distribution π satisfies

π(x) = 1 / H(x, x) for every state x.

Note that the above lemma does not require the Markov chain to be aperiodic. Indeed, the chains we consider will be periodic.
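Lemma 27 can be verified exactly on a toy example. The following check (ours, not from the text) computes both sides for a two-state chain, where the stationary distribution and the expected return time admit closed forms.

```python
from fractions import Fraction

# Two-state chain: from state 0 move to 1 with prob p, from 1 to 0 with prob q.
p, q = Fraction(1, 3), Fraction(1, 5)

# Stationary distribution: solving pi = pi P gives pi(0) = q / (p + q).
pi0 = q / (p + q)

# Expected first return time to state 0 by first-step analysis:
# stay with prob 1 - p (return in one step), or move to state 1 and wait
# a geometric number of steps with mean 1/q before coming back.
H00 = (1 - p) * 1 + p * (1 + 1 / q)

assert H00 == 1 / pi0  # Lemma 27: pi(x) = 1 / H(x, x)
```

Exact rational arithmetic via `fractions.Fraction` avoids any floating-point ambiguity in the comparison.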
Random walk of a single token.
Let G = (V, E) be a simple, connected graph on n ≥ 2 nodes, with m ≥ 1 edges. We start by analysing the walk performed by a single token of the leader election algorithm under the population protocol model. More precisely, we consider the following process. Initially, a token is placed on a node of G. In each time step, an edge of G is sampled uniformly at random. If the token is located at an endpoint of the sampled edge, then it moves to the other endpoint of that edge. Otherwise, the token stays where it is.

Formally, this corresponds to a Markov chain on the state space V. The probability that the chain transitions from state u to state v is given by

P(u, v) = 1/m if {u, v} ∈ E,
P(u, v) = 1 − d(u)/m if u = v, and
P(u, v) = 0 otherwise,

for every u, v ∈ V. That is, if the token is on node u, the probability that the token moves to node v in the next step is the probability that an edge between the two nodes, if one exists, is chosen. The resulting Markov chain is irreducible, since the graph G is connected. Note that this random walk differs from the classic random walk on G, in which the token moves to an adjacent node in every time step. First, we show that the uniform distribution on V is the stationary distribution for this chain.

Lemma 28.
The stationary distribution of the walk on G is π(v) = 1/n for every v ∈ V.

Proof. Let π be the uniform distribution on V. Recall that πP denotes the application of the transition matrix P to π. To establish the claim, we need to verify that π = πP holds. For this, we observe that

(πP)(v) = Σ_u π(u) P(u, v) = (1/n) Σ_u P(u, v) = 1/n = π(v),

since each column sum of P equals d(v)/m + 1 − d(v)/m = 1.

Next, using elementary arguments, we can bound the hitting time of any pair of nodes; see e.g. [44]. In particular, we make use of the worst-case expected hitting time defined by

H_max = max{ H(u, v) : u, v ∈ V, u ≠ v }.

The following lemma shows that H_max < diam(G) · nm.

Lemma 29. For any graph G, H(u, v) < diam(G) · nm for all u, v ∈ V.

Proof. By Lemma 27 and Lemma 28, we have H(u, u) = 1/π(u) = n. On the other hand, by calculating the expected return time in another way, we observe that

H(u, u) = 1 − d(u)/m + (1/m) Σ_{{u,w} ∈ E} (1 + H(w, u)) = n.

Thus, we get the inequality

Σ_{{u,w} ∈ E} (1 + H(w, u)) < nm.

In particular, we have that H(w, u) < nm for any edge {u, w} ∈ E. Since G has diameter diam(G), there is a path u = u_0, …, u_k = v of length k ≤ diam(G) between any two nodes u and v. Hence, by linearity of expectation, we get that

H(u, v) ≤ Σ_{i=0}^{k−1} H(u_i, u_{i+1}) < diam(G) · nm.

Meeting time of two tokens.
We now consider the situation where a distinct token is placed on each node of G. In each time step, an edge is chosen uniformly at random. Whenever the edge {u, v} is sampled, the tokens at nodes u and v exchange places. Note that, individually, each token performs a random walk, but the random walks are not independent.

We say that two tokens meet at time t if the edge {u, v} is sampled at time step t and the two tokens are located at the nodes u and v, respectively. From now on, we uniquely label the tokens from 1 to n and define the random variable M(a, b) as the number of time steps until tokens a and b first meet, starting from the initial configuration. If a = b, then we follow the convention that M(a, b) = 0. We are interested in bounding the largest first meeting time between any two pairs of tokens. To this end, we define

M = max{ M(a, b) : a, b ∈ [n] } and M_max = max_{a,b} E[M(a, b)].

The random variable M is the largest first meeting time between any pair of tokens in the token shuffling process, and the quantity M_max is the worst-case expected first meeting time between any two tokens.

Lemma 30.
The expected worst-case first meeting time satisfies M_max ∈ O(diam(G) · nm).

Proof. We keep track of the locations of the two tokens and of the parity of the number of times the two tokens have met. To this end, we define the graph G* = (V*, E*) with V* = V*_0 ∪ V*_1, where

V*_b = { (v_1, v_2, b) : v_1, v_2 ∈ V, v_1 ≠ v_2 } for b ∈ {0, 1},

and {(u_1, u_2, b), (v_1, v_2, b′)} ∈ E* if either of the following two conditions holds:
(1) b = b′, u_i = v_i and {u_{3−i}, v_{3−i}} ∈ E for some i ∈ {1, 2}, or
(2) b ≠ b′ and (u_1, u_2) = (v_2, v_1).

One can check that the degree of any node x = (v_1, v_2, b) ∈ V* is d(x) = d(v_1) + d(v_2). Define the transition matrix

P*(x, y) = 1/m if {x, y} ∈ E*,
P*(x, y) = 1 − d(x)/m if x = y, and
P*(x, y) = 0 otherwise.

Consider an arbitrary initial configuration and two tokens located at nodes v_1 and v_2 of G. The expected meeting time of these two tokens is the same as the expected time for the random walk given by P* to reach some node of V*_1 from (v_1, v_2, 0) ∈ V*_0. By the same arguments as in Lemma 28 and Lemma 29, we get that this hitting time is at most O(diam(G) · nm), since |V*| ∈ Θ(n²), |E*| ∈ Θ(nm), and the diameter of G* is Θ(diam(G)).

Fix an arbitrary constant c ≥ 1 and set

T = ⌈2 · max{H_max, M_max}⌉, R = ⌈(c + 2) log n⌉, and T* = RT.
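The correlated swap process is straightforward to simulate. The sketch below (ours; the cycle C_8, the seed, and the slack factor 10 are arbitrary choices) estimates the expected first meeting time of two antipodal tokens on a cycle and checks it against the O(diam(G) · nm) bound of Lemma 30.

```python
import random

def meeting_time(n, a, b, rng):
    """Steps until the tokens starting on nodes a and b of the cycle C_n
    first meet, i.e. until the sampled edge has the two tokens at its endpoints."""
    pos_a, pos_b = a, b
    steps = 0
    while True:
        u = rng.randrange(n)          # sample a uniform random edge {u, u+1} of C_n
        v = (u + 1) % n
        steps += 1
        if {pos_a, pos_b} == {u, v}:  # both tokens on the sampled edge: they meet
            return steps

        def move(p):                  # tokens on the sampled edge swap endpoints
            return v if p == u else (u if p == v else p)

        pos_a, pos_b = move(pos_a), move(pos_b)

rng = random.Random(1)
n, diam, m = 8, 4, 8                  # cycle C_8: n = m = 8, diameter 4
trials = 2000
est = sum(meeting_time(n, 0, 4, rng) for _ in range(trials)) / trials
assert 0 < est < 10 * diam * n * m    # comfortably within the O(diam(G) * nm) bound
```

Only the two tracked tokens matter here: swaps involving other tokens never change their positions, so the remaining tokens need not be simulated.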
We show that within T* steps all pairs of tokens have met, with high probability.

Remark 31. For any 0 < p < 1, the following identity holds:

Σ_{k=0}^∞ (k + 1) p^k = 1/(1 − p)².

Remark 32. Let A_0, …, A_R be events. Then

Pr[ ⋂_{i=0}^R A_i ] = Pr[A_0] · Π_{i=1}^R Pr[ A_i | A_0 ∩ ⋯ ∩ A_{i−1} ].

Lemma 33.
We have Pr[M ≥ T*] ≤ 1/n^c and E[M] ≤ 4T*.

Proof. Let M_t(a, b) be the first meeting time between tokens a and b after step t, and let M_t = max_{a,b} M_t(a, b). Note that M(a, b) = M_0(a, b) and M = M_0.

First, we show that for any a, b ∈ [n] and t ≥ 0, the inequality Pr[M_t(a, b) ≥ T*] ≤ 1/n^{c+2} holds. For a = b the claim is vacuous, so assume a ≠ b. By Markov's inequality, the probability that tokens a and b do not meet within T − 1 steps, starting from any configuration x_t, is

Pr[M_t(a, b) ≥ T] ≤ E[M_t(a, b)] / T ≤ M_max / T ≤ 1/2.

By repeating the experiment R times, we observe that the probability that tokens a and b do not meet within T* = RT steps (starting from any configuration x_t) satisfies

Pr[M_t(a, b) ≥ RT] = Pr[M_t(a, b) ≥ T] · Π_{r=1}^{R−1} Pr[ M_{t+Tr}(a, b) ≥ T | M_{t+T(r−1)}(a, b) ≥ T ] ≤ Π_{r=1}^{R} 1/2 = (1/2)^{⌈(c+2) log n⌉} ≤ 1/n^{c+2}.

Next, we bound the probability that some pair of tokens has not met within T* steps by applying the union bound:

Pr[M_t ≥ T*] ≤ Σ_{a,b} Pr[M_t(a, b) ≥ T*] ≤ Σ_{a,b} 1/n^{c+2} ≤ n²/n^{c+2} = 1/n^c.

Observe that for any k > 1 we have

Pr[M_t ≥ kT*] = Pr[M_t ≥ T*] · Π_{i=1}^{k−1} Pr[ M_{t+iT*} ≥ T* | M_{t+(i−1)T*} ≥ T* ] ≤ Π_{i=1}^{k} 1/n^c ≤ (1/2)^k,

since n ≥ 2 and c ≥ 1. To bound the expectation, observe that

E[M] = E[M_0] ≤ Σ_{k=0}^∞ (k + 1) T* · Pr[M ≥ kT*] ≤ T* Σ_{k=0}^∞ (k + 1)/2^k = 4T*.
The random variable C satisfies Pr[C ≥ T*] ≤ 1/n^c.

Proof. Observe that C corresponds to the time when the last pair of black tokens meets. Now C ≤ max{ M(a, b) : a ≠ b } = M, and the claim follows from Lemma 33.

Let L be the stabilisation time of the protocol, that is, the time until there is exactly one leader candidate remaining. Recall that a leader candidate becomes a follower if it receives a white token from some other node, and a follower never becomes a candidate again. Thus, a node is a leader candidate at step t if and only if it has not been hit by a white token. Whenever a white token hits a leader candidate, the token becomes inactive. This ensures that a single leader is always elected, as at most n − 1 white tokens are created during the execution of the protocol.

Lemma 35.
Let u and v be distinct nodes. The probability that a token starting from u does not hit v within T* steps is at most 1/n^{c+2}.

Proof. By Lemma 29 and Markov's inequality, the probability that a token starting from u does not hit node v within T steps is bounded by

H(u, v)/T ≤ H_max/(2 H_max) ≤ 1/2.

Again, by repeating the experiment R times, we get that the token starting from u fails to hit v within RT = T* steps with probability at most 2^{−R} ≤ 1/n^{c+2}.

Lemma 36.
The random variable L satisfies Pr[L ≥ 2T*] ≤ 2/n^c and E[L] ≤ 8T*.

Proof. Observe that, conditioned on the event that only one black token remains, the probability that some node v does not become a follower is bounded by the probability that node v does not receive a white token, which is in turn bounded by the probability that some token does not visit v within T* steps. Hence,

Pr[L ≥ t + T* | C < t] ≤ Σ_v Pr[ node v is not hit by a white token by time t + T* | C < t ]
≤ Σ_v Pr[ node v is not hit by some token ]
≤ Σ_v Σ_a Pr[ node v is not hit by token a by time t + T* ]
≤ Σ_v Σ_a 1/n^{c+2} ≤ 1/n^c,

where in the second-to-last step we applied Lemma 35 and in the last step the fact that there are at most n leader candidate nodes and n tokens. By the law of total probability, we get that

Pr[L ≥ 2T*] = Pr[L ≥ 2T* | C < T*] · Pr[C < T*] + Pr[L ≥ 2T* | C ≥ T*] · Pr[C ≥ T*] ≤ (1/n^c) · (1 − 1/n^c) + 1/n^c ≤ 2/n^c,

where in the second-to-last step we applied the bound Pr[C ≥ T*] ≤ 1/n^c given by Lemma 34. Since c ≥ 1, considering repeated stabilisation attempts yields

E[L] ≤ 2T* Σ_{k=0}^∞ (k + 1) Pr[L > 2kT*] ≤ 2T* Σ_{k=0}^∞ (k + 1) (1/2)^k = 8T*.
Proof.
The protocol uses only six states, as each node stores only whether it is a leader candidate and the type of its token. Moreover, T* = O(max{H_max, M_max} · log n), and H_max, M_max ∈ O(diam(G) · nm) by Lemma 29 and Lemma 30. Thus, the protocol stabilises in O(diam(G) · nm log n) steps with high probability and in expectation.
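The protocol and the uniform stochastic scheduler are also easy to exercise end-to-end. The Python sketch below is ours (not from [16]): it runs the six-state protocol on a small cycle and checks that exactly one leader candidate survives; the graph, seed, candidate set, and step budget are arbitrary choices.

```python
import random

def leader_election(n, candidates, steps, seed=0):
    """Sketch of the six-state leader election protocol on the cycle C_n
    under the uniform stochastic scheduler."""
    rng = random.Random(seed)
    leader = [v in candidates for v in range(n)]
    token = ['black' if v in candidates else 'inactive' for v in range(n)]

    for _ in range(steps):
        u = rng.randrange(n)                      # sample a uniform edge {u, u+1}
        v = (u + 1) % n
        if rng.random() < 0.5:                    # random initiator/responder roles
            u, v = v, u
        if token[u] == token[v] == 'black':       # rule (1): one black turns white
            token[v] = 'white'
        if token[u] == 'white' and leader[v]:     # rule (2): white demotes a leader
            leader[v] = False
            token[u] = 'inactive'
        token[u], token[v] = token[v], token[u]   # rule (3): swap the tokens
    return sum(leader)

# Three initial candidates on a 10-cycle; 200000 interactions vastly exceed the
# O(diam(G) * nm log n) stabilisation bound for such a tiny instance.
assert leader_election(10, {0, 3, 7}, 200_000) == 1
```

The run also illustrates the invariant used in the analysis: the number of leader candidates always equals the number of black plus white tokens, so at least one candidate remains at all times.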
Note that the running time is determined by max{H_max, M_max}. For cliques, the expected hitting time for any u ≠ v is

Σ_{t=1}^∞ t · (1/m) (1 − 1/m)^{t−1} = m.

Similarly, the expected meeting time of any two tokens is of the same order as the hitting time. Thus, the algorithm solves leader election with high probability in O(n² log n) steps in the clique. Indeed, it is known that any constant-space leader election protocol requires Ω(n²) steps in expectation to stabilise [33].

As our main result, we established a general framework for simulating clique-based protocols in arbitrary, connected regular graphs. We now conclude by briefly discussing some limitations of our approach and summarising the key problems left open by this work:
– We assume that the nodes have access to a single random bit per interaction. The random bits are used only by the shuffling protocol of Section 7 to avoid technical parity issues arising in the mixing of the random walks on the symmetric group. It seems plausible that this assumption can be avoided, by exploiting the stochastic nature of the population protocol scheduler to e.g. generate synthetic coins [7], or by arguing that these parity issues are avoided by virtue of having a random number of shuffling steps.
– We assume that in each interaction step in the population protocol model, one of the interacting nodes is assigned to be an initiator and the other a responder, to provide elementary symmetry-breaking. This is again a common assumption in the population protocol literature. The simulation framework uses this assumption only in the construction of the phase clock, where in certain situations ties need to be broken. It again seems plausible that this assumption can be avoided, but this would necessitate revisiting the involved graphical load balancing argument of Peres et al. [49] with a different tie-breaking procedure.
– We focus on regular interaction graphs. The justification for this assumption is two-fold.
First, this assumption is only used once: in Section 6, to obtain clean bounds for the skew of the phase clock. However, upon close inspection, we notice that this regularity assumption can be relaxed in many cases if the minimum and maximum degrees do not deviate too much from the average degree of the graph. As Theorem 6 can be used to bound the mixing time of the interchange process in non-regular graphs as well, we can use our simulation framework to obtain fast leader election and exact majority algorithms also on some non-regular graphs. Please see Appendix A.2 for a formal statement and an illustration. Second, regular graphs are also justified by the fact that they provide an immediate extension of the notion of parallel time: the expected number of interactions in any time interval is the same for all nodes, and prior work on this problem has naturally focused on them [29, 35]. Nevertheless, obtaining bounds for phase clocks and related load balancing processes in non-regular graphs remains an interesting open problem.
– The simulation overhead has a polylogarithmic dependency on n. To simplify the presentation, we have made no particular effort to optimise the degree of this polylogarithmic dependency. The dependency can be improved by providing better bounds on the k-stack interchange process. Indeed, even in the case of the well-studied (1-stack) interchange process, exact bounds on mixing time remain an open question for many graph classes [40]. Improved bounds for these processes imply better running time bounds for our simulations.
– Our complexity bounds have a quadratic dependency on d/β. We conjecture that a polynomial dependency on the expansion properties is necessary for step complexity, and leave the investigation of tight space-time trade-offs for population protocols in the general graphical setting as an intriguing open problem.

Acknowledgements
We thank Giorgi Nadiradze for pointing out the generalisation of the phase clock construction to non-regular graphs. We also thank the anonymous reviewers for their useful comments on earlier versions of this manuscript. This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No. 805223 ScaleML), and from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 840605.
References

[1] David Aldous. Random walks on finite groups and rapidly mixing Markov chains. In
Séminaire de Probabilités XVII 1981/82, pages 243–297. Springer, 1983.

[2] David Aldous and James Allen Fill. Reversible Markov chains and random walks on graphs, 2002. Unfinished monograph, recompiled 2014, available at .

[3] Dan Alistarh and Rati Gelashvili. Polylogarithmic-time leader election in population protocols. In Proc. 42nd International Colloquium on Automata, Languages, and Programming (ICALP 2015), pages 479–491, 2015. doi:10.1007/978-3-662-47666-6_38.

[4] Dan Alistarh and Rati Gelashvili. Recent algorithmic advances in population protocols. SIGACT News, 49(3):63–73, 2018. doi:10.1145/3289137.3289150.

[5] Dan Alistarh, Rati Gelashvili, and Milan Vojnović. Fast and exact majority in population protocols. In Proc. 34th ACM Symposium on Principles of Distributed Computing (PODC 2015), pages 47–56, 2015. doi:10.1145/2767386.2767429.

[6] Dan Alistarh, James Aspnes, David Eisenstat, Rati Gelashvili, and Ronald L Rivest. Time-space trade-offs in population protocols. In Proc. 28th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2017), pages 2560–2579, 2017. doi:10.1137/1.9781611974782.169.

[7] Dan Alistarh, James Aspnes, and Rati Gelashvili. Space-optimal majority in population protocols. In
Proc. 29th ACM-SIAM Symposium on Discrete Algorithms (SODA 2018). SIAM, 2018. doi:10.1137/1.9781611975031.144.

[8] Dan Alistarh, Giorgi Nadiradze, and Amirmojtaba Sabour. Dynamic averaging load balancing on cycles. In Proc. 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020), volume 168, pages 7:1–7:16, 2020. doi:10.4230/LIPIcs.ICALP.2020.7.

[9] Dana Angluin, James Aspnes, Zoë Diamadi, Michael J Fischer, and René Peralta. Computation in networks of passively mobile finite-state sensors.
Distributed Computing, 18(4):235–253, 2006. doi:10.1007/s00446-005-0138-3.

[10] Dana Angluin, James Aspnes, and David Eisenstat. Stably computable predicates are semilinear. In Proc. 25th ACM Symposium on Principles of Distributed Computing (PODC 2006), pages 292–299, 2006. doi:10.1145/1146381.1146425.

[11] Dana Angluin, James Aspnes, David Eisenstat, and Eric Ruppert. The computational power of population protocols. Distributed Computing, 20(4):279–304, 2007. doi:10.1007/s00446-007-0040-2.

[12] Dana Angluin, James Aspnes, and David Eisenstat. Fast computation by population protocols with a leader. Distributed Computing, 21(3):183–199, 2008. doi:10.1007/s00446-008-0067-z.

[13] Dana Angluin, James Aspnes, Michael J Fischer, and Hong Jiang. Self-stabilizing population protocols. ACM Transactions on Autonomous and Adaptive Systems (TAAS), 3(4):1–28, 2008. doi:10.1007/11795490_10.

[14] James Aspnes and Eric Ruppert. An introduction to population protocols. In Middleware for Network Eccentric and Mobile Applications, pages 97–120. Springer, 2009. doi:10.1007/978-3-540-89707-1_5.

[15] Yossi Azar, Andrei Z. Broder, Anna R. Karlin, and Eli Upfal. Balanced allocations. SIAM Journal on Computing, 29(1):180–200, 1999. doi:10.1137/S0097539795288490.

[16] Joffroy Beauquier, Peva Blanchard, and Janna Burman. Self-stabilizing leader election in population protocols over arbitrary communication graphs. In Proc. 17th International Conference on Principles of Distributed Systems (OPODIS 2013), pages 38–52. Springer, 2013. doi:10.1007/978-3-319-03850-6_4. URL https://hal.archives-ouvertes.fr/hal-00867287v2.

[17] Petra Berenbrink, Tom Friedetzky, Peter Kling, Frederik Mallmann-Trenn, and Chris Wastell. Plurality consensus in arbitrary graphs: Lessons learned from load balancing. In
Proc. 24th Annual European Symposium on Algorithms (ESA 2016), volume 57, pages 10:1–10:18, 2016. doi:10.4230/LIPIcs.ESA.2016.10.

[18] Petra Berenbrink, Robert Elsässer, Tom Friedetzky, Dominik Kaaser, Peter Kling, and Tomasz Radzik. A population protocol for exact majority with O(log^{5/3} n) stabilization time and Θ(log n) states. In Proc. 32nd International Symposium on Distributed Computing (DISC 2018), pages 10:1–10:18, 2018. doi:10.4230/LIPIcs.DISC.2018.10.

[19] Petra Berenbrink, George Giakkoupis, and Peter Kling. Tight bounds for coalescing-branching random walks on regular graphs. In
Proc. 29th Annual ACM-SIAM Symposium on DiscreteAlgorithms (SODA 2018) , pages 1715–1733. SIAM, 2018. doi:10.1137/1.9781611975031.112.[20] Petra Berenbrink, Dominik Kaaser, Peter Kling, and Lena Otterbach. Simple and efficient leaderelection. In
Proc. 1st Symposium on Simplicity in Algorithms (SOSA 2018) , pages 9:1–9:11,2018. doi:10.4230/OASIcs.SOSA.2018.9.[21] Petra Berenbrink, George Giakkoupis, and Peter Kling. Optimal time and space leader election inpopulation protocols. In
Proc. 52nd Annual ACM SIGACT Symposium on Theory of Computing(STOC 2020) , pages 119–129, 2020. doi:10.1145/3357713.3384312.[22] Michael Blondin, Javier Esparza, and Stefan Jaax. Large flocks of small birds: on the minimalsize of population protocols. In
Proc. 35th Symposium on Theoretical Aspects of ComputerScience (STACS 2018) , pages 16:1–16:14, 2018. doi:10.4230/LIPIcs.STACS.2018.16.[23] Robert Brijder. Computing with chemical reaction networks: a tutorial.
Natural Computing, 18(1):119–137, 2019. doi:10.1007/s11047-018-9723-9.
[24] Pietro Caputo, Thomas M. Liggett, and Thomas Richthammer. Proof of Aldous’ spectral gap conjecture. Journal of the American Mathematical Society, 23(3):831–851, 2010. doi:10.1090/S0894-0347-10-00659-4.
[25] Ioannis Chatzigiannakis, Othon Michail, Stavros Nikolaou, Andreas Pavlogiannis, and Paul G. Spirakis. Passively mobile communicating machines that use restricted space. In Proc. 7th ACM SIGACT/SIGMOBILE International Workshop on Foundations of Mobile Computing, pages 6–15, 2011.
[26] Hsueh-Ping Chen and Ho-Lin Chen. Self-stabilizing leader election. In
Proc. 38th Symposium on Principles of Distributed Computing (PODC 2019), pages 53–59, 2019. doi:10.1145/3293611.3331616.
[27] Hsueh-Ping Chen and Ho-Lin Chen. Self-stabilizing leader election in regular graphs. In Proc. 39th Symposium on Principles of Distributed Computing (PODC 2020), pages 210–217, New York, NY, USA, 2020. Association for Computing Machinery. doi:10.1145/3382734.3405733.
[28] Colin Cooper, Robert Elsässer, Hirotaka Ono, and Tomasz Radzik. Coalescing random walks and voting on connected graphs.
SIAM Journal on Discrete Mathematics, 27(4):1748–1758, 2013. doi:10.1137/120900368.
[29] Colin Cooper, Tomasz Radzik, Nicolás Rivera, and Takeharu Shiraga. Fast plurality consensus in regular expanders. In Proc. 31st International Symposium on Distributed Computing (DISC 2017), pages 13:1–13:16, 2017. doi:10.4230/LIPIcs.DISC.2017.13.
[30] Persi Diaconis and Laurent Saloff-Coste. Comparison techniques for random walk on finite groups. The Annals of Probability, 21(4):2131–2156, 1993.
[31] Persi Diaconis and Mehrdad Shahshahani. Generating a random permutation with random transpositions.
Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 57(2):159–179, 1981.
[32] A. B. Dieker. Interlacings for random walks on weighted graphs and the interchange process. SIAM Journal on Discrete Mathematics, 24(1):191–206, 2010. doi:10.1137/090775361.
[33] David Doty and David Soloveichik. Stable leader election in population protocols requires linear time. Distributed Computing, 31(4):257–271, 2018. doi:10.1007/s00446-016-0281-z.
[34] David Doty, Mahsa Eftekhari, and Eric Severson. A stable majority population protocol using logarithmic time and states, 2020. URL https://arxiv.org/abs/2012.15800. arXiv:2012.15800.
[35] Moez Draief and Milan Vojnović. Convergence speed of binary interval consensus.
SIAM Journal on Control and Optimization, 50(3):1087–1109, 2012. doi:10.1137/110823018.
[36] Robert Elsässer and Tomasz Radzik. Recent results in population protocols for exact majority and leader election. Bulletin of the EATCS, 126, 2018. URL http://bulletin.eatcs.org/index.php/beatcs/article/view/549/546.
[37] Leszek Gąsieniec and Grzegorz Stachowiak. Fast space optimal leader election in population protocols. In Proc. 29th ACM-SIAM Symposium on Discrete Algorithms (SODA 2018), 2018. doi:10.1137/1.9781611975031.169.
[38] Leszek Gąsieniec, Grzegorz Stachowiak, and Przemyslaw Uznański. Almost logarithmic-time space optimal leader election in population protocols. In Proc. 31st ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2019), 2019. doi:10.1145/3323165.3323178.
[39] Leszek Gąsieniec, Grzegorz Stachowiak, and Przemyslaw Uznański. Time and space optimal exact majority population protocols, 2020. URL https://arxiv.org/abs/2011.07392.
[40] Johan Jonasson. Mixing times for the interchange process.
Latin American Journal of Probability and Mathematical Statistics, 9(2):667–683, 2012.
[41] Richard Karp, Christian Schindelhauer, Scott Shenker, and Berthold Vöcking. Randomized rumor spreading. In Proc. 41st Annual Symposium on Foundations of Computer Science (FOCS 2000), pages 565–574, 2000. doi:10.1109/SFCS.2000.892324.
[42] Tom Leighton and Satish Rao. Multicommodity max-flow min-cut theorems and their use in designing approximation algorithms. Journal of the ACM, 46(6):787–832, 1999. doi:10.1145/331524.331526.
[43] David A. Levin and Yuval Peres.
Markov Chains and Mixing Times. American Mathematical Society, 2nd edition, 2017.
[44] László Lovász. Random walks on graphs: A survey. Combinatorics, Paul Erdős is Eighty, 2(1):1–46, 1993.
[45] George B. Mertzios, Sotiris E. Nikoletseas, Christoforos Raptopoulos, and Paul G. Spirakis. Determining majority in networks with local interactions and very small local memory. In Proc. 41st International Colloquium on Automata, Languages, and Programming (ICALP 2014), pages 871–882, 2014. doi:10.1007/978-3-662-43948-7_72.
[46] George B. Mertzios, Sotiris E. Nikoletseas, Christoforos L. Raptopoulos, and Paul G. Spirakis. Determining majority in networks with local interactions and very small local memory.
DistributedComputing , 30(1):1–16, 2017. doi:10.1007/s00446-016-0277-8.[47] Carey D. Nadell, Knut Drescher, and Kevin R. Foster. Spatial structure, cooperation and com-petition in biofilms.
Nature Reviews Microbiology , 14(9):589, 2016. doi:10.1038/nrmicro.2016.84.[48] Roberto Imbuzeiro Oliveira. Mixing of the symmetric exclusion processes in terms of thecorresponding single-particle random walk.
The Annals of Probability , 41(2):871–913, 2013.[49] Yuval Peres, Kunal Talwar, and Udi Wieder. Graphical balanced allocations and the (1 + β ) -choice process. Random Structures and Algorithms , 47(4):760–775, 2014. doi:10.1002/rsa.20558.[50] Thomas Sauerwald and He Sun. Tight bounds for randomized load balancing on arbitrarynetwork topologies. In
Proc. 53rd Annual IEEE Symposium on Foundations of Computer Science (FOCS 2012), pages 341–350, 2012. doi:10.1109/FOCS.2012.86.
[51] Yuichi Sudo and Toshimitsu Masuzawa. Leader election requires logarithmic time in population protocols. Parallel Processing Letters, 30(01):2050005, 2020. doi:10.1142/S012962642050005X.
[52] Yuichi Sudo, Fukuhito Ooshita, Taisuke Izumi, Hirotsugu Kakugawa, and Toshimitsu Masuzawa. Logarithmic expected-time leader election in population protocol model. In Proc. International Symposium on Stabilizing, Safety, and Security of Distributed Systems (SSS 2019), pages 323–337, 2019. doi:10.1007/978-3-030-34992-9_26.
[53] Alan M. Turing. The chemical basis of morphogenesis. Philosophical Transactions of the Royal Society of London, Series B, 237(641):37–72, 1952.
[54] David B. Wilson. Mixing times of lozenge tiling and card shuffling Markov chains. The Annals of Applied Probability, 14(1):274–325, 2004.
[55] Daisuke Yokota, Yuichi Sudo, and Toshimitsu Masuzawa. Time-optimal self-stabilizing leader election on rings in population protocols. In Proc. International Symposium on Stabilizing, Safety, and Security of Distributed Systems (SSS 2020), pages 301–316. Springer, 2020. doi:10.1007/978-3-030-64348-5_24.
Details on decentralised phase clocks
A.1 Proof of Lemma 9
Let V be a collection of n bins, which are initially empty, and let µ be a probability distribution on V × V. Consider the process where, at every time step t > 0, a pair (u, v) ∼ µ is sampled and a ball is placed into the least loaded of these two bins (in case of ties, the ball is placed into bin u). Let ∆*(t) be the difference between the loads of the most and least loaded bins after step t. Define E(S) as the set of edges that have at least one endpoint in S, that is, E(S) = {{u, v} ∈ E : {u, v} ∩ S ≠ ∅}. The distribution µ on V × V is said to be δ-expanding for δ > 0 if for all S ⊆ V with |S| ≤ n/2,

(1) µ(E(S)) ≥ (1 + δ)|S|/n, and
(2) µ(E(S) \ ∂S) ≤ (1 − δ)|S|/n

hold, where ∂S denotes the edge boundary of S. Peres et al. [49] showed that when the measure µ is well-behaved, in the sense that it is δ-expanding, then the gap is bounded by O(log n/δ) at every step t, with high probability.

Lemma 37 ([49]). Let κ > 0 be a constant and µ be a δ-expanding measure on V × V. Then there exists a constant c(κ) such that for any t > 0 the gap ∆*(t) satisfies Pr[∆*(t) > c(κ) log n/δ] < t/n^κ.

The following observation establishes that the uniform distribution on the edges of a regular, connected graph is always δ-expanding for some δ > 0. This in turn implies Lemma 9.

Lemma 38.
Suppose G is d-regular with edge expansion β > 0. Then the uniform distribution ξ on the edges of G is (β/d)-expanding.

Proof. Let S ⊆ V be such that |S| ≤ n/2. Note that |∂S| ≥ β|S|, as the graph has edge expansion β. Since the graph is d-regular, it has m = nd/2 edges and |S|d/2 ≤ |E(S)| ≤ |S|d. Moreover, writing out(v, S) for the number of neighbours of v outside S, every edge of E(S) is counted twice in the sum Σ_{v∈S} (deg(v) + out(v, S)), so 2|E(S)| = Σ_{v∈S} (d + out(v, S)) ≥ |S|(d + β). Now

ξ(E(S)) = |E(S)|/m ≥ (1/(nd)) Σ_{v∈S} (d + β) = (|S|/n)(1 + β/d).

This shows the first condition. For the second condition, observe that 2|E(S) \ ∂S| = Σ_{v∈S} in(v, S) = Σ_{v∈S} (d − out(v, S)) ≤ |S|(d − β), where in(v, S) denotes the number of neighbours of v inside S, and hence

ξ(E(S) \ ∂S) = |E(S) \ ∂S|/m ≤ (1/(nd)) Σ_{v∈S} (d − β) = (|S|/n)(1 − β/d).
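As an aside, both the balls-into-bins process and the expansion bound of Lemma 38 are easy to sanity-check on small graphs. The following Python sketch is our own illustration, not part of the protocol or of [49]: the function names and the choice of the cycle C_8 as a test graph are ours. It simulates the two-choice process driven by the uniform edge measure, and brute-forces the edge expansion β of a small d-regular graph to verify both δ-expanding conditions for δ = β/d.

```python
import random
from itertools import combinations

def two_choice_gap(edges, n, steps, seed=0):
    """Sample an edge (u, v) uniformly at each step and drop a ball into
    the less loaded endpoint bin (ties go to u); return the final gap
    between the most and least loaded of the n bins."""
    rng = random.Random(seed)
    load = [0] * n
    for _ in range(steps):
        u, v = rng.choice(edges)
        if load[v] < load[u]:
            load[v] += 1
        else:
            load[u] += 1
    return max(load) - min(load)

def check_expanding(nodes, edges, d):
    """Brute-force the edge expansion beta of a small d-regular graph and
    assert both delta-expanding conditions of the uniform edge measure
    for delta = beta / d, as in Lemma 38.  Exponential in n: only for
    sanity checks on tiny graphs."""
    n, m = len(nodes), len(edges)
    boundary = lambda S: sum(1 for e in edges if len(set(e) & S) == 1)
    touching = lambda S: sum(1 for e in edges if set(e) & S)  # |E(S)|
    subsets = [set(S) for k in range(1, n // 2 + 1)
               for S in combinations(nodes, k)]
    beta = min(boundary(S) / len(S) for S in subsets)
    delta = beta / d
    for S in subsets:
        # condition (1): xi(E(S)) >= (1 + delta)|S|/n
        assert touching(S) / m >= (1 + delta) * len(S) / n - 1e-9
        # condition (2): xi(E(S) \ dS) <= (1 - delta)|S|/n
        assert (touching(S) - boundary(S)) / m <= (1 - delta) * len(S) / n + 1e-9
    return beta, delta
```

On the cycle C_8 (d = 2) this reports β = 1/2 and δ = 1/4, and condition (1) holds with equality for a contiguous arc of four nodes, so the bound of Lemma 38 is tight for this graph.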
Finally, we observe that the phase clock construction is not restricted to regular graphs. The uniform distribution on the edges of G is δ-expanding whenever (1) the minimum and maximum degree do not deviate too much from the average degree α = 2m/n, and (2) the expansion is sufficiently large compared to the average and minimum degree.

Lemma 39.
Let G be a graph with edge expansion β, minimum degree d, maximum degree ∆, and average degree α. If G satisfies

(a) β + d > α, and
(b) d + ∆ ≤ 2α,

then the uniform distribution ξ on the edges of G is δ-expanding for δ = (β + d)/α − 1 > 0.

Proof. Let S ⊆ V be such that |S| ≤ n/2. We use d ≤ deg(v) ≤ ∆ to denote the degree of node v in G, out(v, S) to denote the number of neighbours v has outside the set S, and in(v, S) to denote the number of neighbours v has in the set S. First, observe that

ξ(E(S)) = |E(S)|/m
        = (1/m) Σ_{v∈S} (out(v, S) + in(v, S)/2)
        = (1/m) Σ_{v∈S} (out(v, S) + (deg(v) − out(v, S))/2)
        = (1/(2m)) Σ_{v∈S} (out(v, S) + deg(v))
        ≥ |S|(β + d)/(2m) = (|S|/n) · (β + d)/α = (1 + δ) · |S|/n.

Thus, Condition (1) of a δ-expanding measure is satisfied. For the second condition, we note that

ξ(E(S) \ ∂S) = |E(S) \ ∂S|/m
             = (1/(2m)) Σ_{v∈S} in(v, S)
             = (1/(2m)) Σ_{v∈S} (deg(v) − out(v, S))
             ≤ |S|(∆ − β)/(2m) = (|S|/n) · (∆ − β)/α ≤ (1 − δ) · |S|/n,

where the last inequality follows from condition (b), since (∆ − β)/α ≤ 1 − δ = 2 − (β + d)/α is equivalent to d + ∆ ≤ 2α.

Note that one can also apply the construction to non-uniform probability distributions over E (i.e., weighted graphs), as long as the distribution is δ-expanding for some δ > 0.

An example graph.
For a simple non-regular graph that satisfies the above conditions, take a complete bipartite graph K_{2r,2r} on 4r nodes and, on one side, add r² edges to form a complete bipartite subgraph on those 2r nodes. Now n = 4r and m = 4r² + r² = 5r². One can check that the average degree is α = 2m/n = 5r/2, the minimum degree is 2r, the maximum degree is 3r, and β ≥ r. Thus, the uniform measure is 1/5-expanding.
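The arithmetic of this example can be double-checked programmatically. The sketch below is our own illustration (the function names are ours, and it assumes the reading of the example as K_{2r,2r} with a complete bipartite subgraph added between the two halves of one side): it builds the graph, recomputes the degree statistics, and evaluates the conditions of Lemma 39.

```python
def example_graph(r):
    """Complete bipartite graph K_{2r,2r} on 4r nodes, with r*r extra
    edges forming a complete bipartite subgraph between the two halves
    of one side.  Returns (n, edge list)."""
    left, right = list(range(2 * r)), list(range(2 * r, 4 * r))
    edges = [(u, v) for u in left for v in right]
    edges += [(u, v) for u in left[:r] for v in left[r:]]
    return 4 * r, edges

def lemma39_delta(beta, d_min, d_max, n, m):
    """Return delta = (beta + d_min)/alpha - 1 if conditions (a)
    beta + d_min > alpha and (b) d_min + d_max <= 2*alpha of Lemma 39
    hold, where alpha = 2m/n is the average degree; else None."""
    alpha = 2 * m / n
    if beta + d_min > alpha and d_min + d_max <= 2 * alpha:
        return (beta + d_min) / alpha - 1
    return None
```

For instance, with r = 3 the construction yields n = 12 nodes, m = 45 edges, minimum degree 6 and maximum degree 9, and with β = r the helper returns δ = 1/5, matching the claim above. In the regular case (d_min = d_max = d, m = nd/2) the formula reduces to δ = β/d, recovering Lemma 38.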