Fast Graphical Population Protocols
Dan Alistarh · [email protected] · IST Austria
Rati Gelashvili · [email protected] · University of Toronto
Joel Rybicki · [email protected] · IST Austria
Abstract.
Let G be a graph on n nodes. In the stochastic population protocol model, a collection of n indistinguishable, resource-limited nodes collectively solve tasks via pairwise interactions. In each interaction, two randomly chosen neighbors first read each other's states, and then update their local states. A rich line of research has established tight upper and lower bounds on the complexity of fundamental tasks, such as majority and leader election, in this model, when G is a clique. Specifically, in the clique, these tasks can be solved fast, i.e., in n polylog n pairwise interactions, with high probability, using at most polylog n states per node. In this work, we consider the more general setting where G is an arbitrary graph, and present a technique for simulating protocols designed for fully-connected networks in any connected regular graph. Our main result is a simulation that is efficient on many interesting graph families: roughly, the simulation overhead is polylogarithmic in the number of nodes, and quadratic in the inverse of the graph conductance. As a sample application, we show that, in any regular graph with conductance φ, both leader election and exact majority can be solved in φ^{-2} · n polylog n pairwise interactions, with high probability, using at most φ^{-2} · polylog n states per node. This shows that there are fast and space-efficient population protocols for leader election and exact majority on graphs with good expansion properties. We believe our results will prove generally useful, as they allow efficient technology transfer between the well-mixed (clique) case, and the under-explored spatial setting.

Introduction
Since the early days of computer science, there has been significant interest in developing an algorithmic theory of molecular and biological systems [53]. In distributed computing, population protocols [9] have become a popular model for investigating the collective computational power of large collections of communication-bounded agents with limited computational capabilities. This model consists of n identical agents, seen as finite state machines, and computation proceeds via pairwise interactions of the agents, which trigger local state transitions. The sequence of interactions is provided by a scheduler, which picks pairs of agents to interact. Upon every interaction, the selected agents observe each other's states, and then update their local states. The goal is to have the system reach a configuration satisfying a given predicate, where e.g. all agents agree on a common output value (consensus/majority), or a single agent is assigned a special leader state (leader election).

Early work in population protocols focused on the computational power of the model, i.e., the class of predicates which can be computed by population protocols under various interaction graphs [9, 11]. More recently, the focus has shifted from computability towards understanding complexity thresholds, which often come in the form of fundamental time-versus-space complexity trade-offs, e.g. [5, 7, 12, 18, 21, 33, 37, 39]; for recent surveys please see [36] and [4]. This line of work almost exclusively focuses on the uniform stochastic scheduler, where each interaction pair is chosen uniformly at random among all pairs of agents in the population, and the time complexity of a protocol is measured by the number of interactions needed to solve a task. The uniform stochastic scheduler is analogous to having a large well-mixed solution of interacting particles, an assumption often used for modelling chemical reactions between molecules.
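To make the scheduler dynamics concrete, the following sketch (ours, purely illustrative; the rule and function name are not from the paper) simulates a simple one-way information-dissemination rule under the uniform stochastic scheduler on the clique:

```python
import random

def run_epidemic(n, seed=0):
    """One run of a one-way 'epidemic' under the uniform stochastic
    scheduler on the clique: in each step an ordered pair of agents is
    chosen uniformly at random, and an informed initiator informs the
    responder. Returns the parallel time (steps / n) until all n agents
    are informed. Illustrative sketch, not a protocol from the paper."""
    rng = random.Random(seed)
    informed = [False] * n
    informed[0] = True
    count, steps = 1, 0
    while count < n:
        u, v = rng.sample(range(n), 2)  # two distinct agents; order matters
        steps += 1
        if informed[u] and not informed[v]:
            informed[v] = True
            count += 1
    return steps / n
```

On the clique this kind of dynamics completes in Θ(log n) parallel time with high probability, which is the style of guarantee that the present paper transfers from the clique to regular graphs with good expansion.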
However, many natural systems exhibit spatial structure; this is, for instance, the case for biochemical communication in e.g. bacterial biofilms [47]. This structure can greatly influence the system dynamics. Indeed, it is well-known that there is a qualitative difference between the computational power of population protocols in the clique and in other interaction graphs under the adversarial scheduler: any other connected interaction graph can simulate adversarial interactions on the clique graph by shuffling the states of the nodes [9], and population protocols on some interaction graphs can compute a strictly larger set of predicates than protocols on the clique; see e.g. [14] for a survey of computability results.

However, much less is known about the complexity of basic tasks in general interaction graphs under the stochastic scheduler. So far, only a handful of protocols have been analysed on general graphs. Existing analyses tend to be complex, and specialised to specific algorithms on limited graph classes [17, 29, 35, 45, 46]. This is natural: given the intricate dependencies which arise due to the underlying graph structure, the design and analysis of protocols in the spatial setting is understood to be challenging.
Our approach.
In this paper, we develop a general approach for efficiently translating algorithmic ideas and techniques for population protocols from the non-spatial setting of well-mixed interactions, which is now fairly well-understood, to the general spatial setting, in which the interactions are dictated by an arbitrary regular graph.

Our work proposes a "technology transfer" approach, which allows the analyses in the simpler clique case to directly generalise to arbitrary regular graphs. The computational costs of this reduction are additional factors which depend polylogarithmically on the number of nodes n, and quadratically on the inverse conductance of the graph. Roughly speaking, we show that an efficient synchronous protocol for the clique, where all nodes interact simultaneously in lockstep every round, implies an efficient asynchronous population protocol in the spatial setting, where interactions happen sequentially in a pairwise fashion, on graphs with good expansion properties. Thus, we can establish new complexity bounds for population protocols in regular graphs by analysing the much simpler clique setting.

The model. We consider stochastic population protocols in systems whose underlying topology is a connected graph G = (V, E). Each node v ∈ V represents a single agent in a population of size n = |V|. The agents are oblivious to the graph G, in the sense that they do not have access to the structure of G, not even their own immediate neighbourhood. However, we assume non-uniform protocols, that is, the protocol may depend on n and the expansion properties of the graph. Computation proceeds asynchronously, in sequential pairwise interaction steps, and pairings follow a stochastic scheduler.
In each step, the scheduler selects an ordered pair of neighbouring nodes (u, v) uniformly at random to interact, where node u is the initiator and node v is the responder. Informally, population protocols are often given using chemical reaction equations that describe the possible state transitions using rules of the form

A + B → C + D,

which indicate that when an initiator in state A interacts with a responder in state B, they switch to states C and D, respectively. We assume that the scheduler gives each node a single random bit per interaction; this is a fairly standard assumption (see e.g. [38]). Thus, an execution in this model is given by a random sequence of oriented edges and the random bits provided to the nodes.

Stabilisation and complexity measures.
In addition to the state transition rules, a protocol describes how to map an input value to an initial state and how to map the current state to an output value. A configuration after t interactions is given by the states of all nodes. A configuration is stable if (1) the outputs remain unchanged upon subsequent interactions and (2) the output is a correct solution to the task. A protocol stabilises in T steps if it reaches a stable configuration within T steps. The step complexity is the number of interactions until the system has reached a stable configuration. We follow the common convention of measuring parallel time as step complexity divided by n. The state complexity is the number of distinct states per node.

Leader election and exact majority.
For the most part, we focus on two classic tasks, which serve as our running examples. In leader election, the task is to choose a single node as a leader among n initially identical nodes and assign all other nodes as followers. In the exact majority task, each node is given a single bit as input, and the task is to stabilise to a configuration where every node outputs the input value which had the higher initial count.

Edge expansion in graphs.
Before presenting our results, we recall some graph-theoretic notions. A graph G = (V, E) is d-regular if every node v ∈ V is adjacent to exactly d other nodes. Unless otherwise specified, we assume throughout that all graphs are regular. For any set S ⊆ V, the edge boundary of S is the set ∂S ⊆ E of edges with exactly one endpoint in S. The edge expansion of the graph G is

β = min { |∂S| / |S| : S ⊆ V, |S| ≤ n/2 }.

Figure 1: The graphical population protocol model. In each step, a random edge {u, v} is selected and the nodes u and v interact (blue nodes). Examples of graph classes covered by our construction: (a) regular high-girth expanders, (b) complete bipartite graphs, (c) toroidal grids.

For regular graphs, we express our bounds in terms of the ratio d/β; its inverse β/d corresponds to the conductance of G.

Our results. In this work, we provide reductions showing that standard problems in population protocols can be solved efficiently under graphical stochastic schedulers, with step and state complexities bounded by the expansion and size of the underlying graph. In summary, our contributions are as follows:

(1) We give a general framework for simulating a large class of synchronous protocols designed for fully-connected networks in the graphical stochastic population protocol model. Thus, the user can design efficient (and simple to analyse) synchronous algorithms in a clique model, and transport the analysis automatically to the population protocol model on a large class of interaction graphs.
For instance, on any d-regular graph with edge expansion β > 0, the resulting overhead in parallel time and state complexity is in the order of (d/β)^2 · polylog n. We introduce the synchronous model, called the k-token shuffling model, in the next section.

(2) As concrete applications of the simulation framework, we show that for any d-regular graph with edge expansion β > 0, there exist protocols for leader election and exact majority that stabilise both in expectation and with high probability in (d/β)^2 · polylog n parallel time and use (d/β)^2 · polylog n states. While we made no specific effort to optimise the degree of the polylog n term, the maximum degree is at most 6 (and often smaller).

(3) To complement these results following from the simulation, we also show that, on any graph G with diameter diam(G) and m edges, leader election can be solved both in expectation and with high probability in O(diam(G) · m log n) parallel time, using a constant-state protocol. This result is based on a new running time analysis of the token-based protocol by Beauquier, Blanchard and Burman [16] under the uniform stochastic scheduler on G, via the meeting and coalescence times of certain correlated random walks.

Throughout, we use the phrase "with high probability" to mean that we can choose the constants of the protocol so that the probability that the protocol fails to stabilise (in the given time) is at most 1/n^λ for any given constant λ > 0.

Bounds for specific graph families.
The second result immediately implies upper bounds for the time and state complexity of majority and leader election in various graph families, parameterised by the quantity (d/β)^2:

Graphs    | Task | States              | Parallel time        | Ref.     | Note
cliques   | EM   | 4                   | O(n log n)           | [35]     | Ω(n) parallel time necessary [6].
          | EM   | O(log n)            | Θ(log n)             | [34]     | Optimal for certain protocols [7].
          | LE   | O(1)                | Θ(n)                 | [33]     | Optimal O(1)-state protocol.
          | LE   | Θ(log log n)        | Θ(log n)             | [21]     | Lower bounds in [6, 51].
connected | EM   | 4                   | poly(n)              | [17, 35] | Various bounds (*)
          | LE   | O(1)                | O(diam(G) · m log n) | new      | Complexity analysis of [16].
d-regular | EM   | (d/β)^2 · polylog n | (d/β)^2 · polylog n  | new      | Also stabilises in non-reg. graphs.
          | LE   | (d/β)^2 · polylog n | (d/β)^2 · polylog n  | new      | Also stabilises in non-reg. graphs.
Table 1: Protocols for exact majority (EM) and leader election (LE) for different graph classes. The state complexity is the number of states used by the protocol. The parallel time column gives the expected parallel time (expected number of steps divided by n) to stabilise. (*) In [35], the running time of the protocol is bounded by the initial discrepancy in the inputs and the spectral properties of the contact rate matrix; bounds in terms of n are only given for select graph classes (paths, cycles, stars, random graphs and cliques). No bounds on parallel time sublinear in n are given in [35].

– In sparse graphs with good expansion properties, such as constant-degree graphs with constant edge expansion (Figure 1a), we obtain polylogarithmic parallel time and state complexity overhead. This implies that these good expanders admit fast population protocols with small state complexity, despite being sparser than the clique.

– In dense graphs, we obtain similar bounds whenever d/β ∈ polylog n holds. This includes the class of d-dimensional hypercubes with n = 2^d nodes, but also highly dense clique-like graphs, such as regular complete multipartite graphs (Figure 1b), where the degree and expansion are both Θ(n).

– In D-dimensional toroidal grids, we get algorithms with n^{2/D} polylog n parallel time and state complexity. These sparse 2D-regular graphs include cycles (1-dimensional toroidal grids), two-dimensional grids (Figure 1c), three-dimensional lattices, and so on.

While our protocols for leader election and exact majority guarantee fast stabilisation in regular graphs with high conductance, they will stabilise in polynomial expected time in any connected graph. Moreover, the results can be carried over to certain classes of non-regular graphs, provided that they are not highly irregular and have high expansion; we discuss this in Section 10. Table 1 gives a summary of our main results and prior protocols.

New complexity trade-offs.
It is known that, for both leader election and exact majority on cliques, constant-state protocols are necessarily slower than protocols with super-constant states [7, 33] (see also Table 1). It is still an open question to which extent a similar phenomenon holds for general interaction graphs.

In this context, we provide significantly improved upper bounds for super-constant state protocols, suggesting a complexity gap between constant and super-constant state protocols for some graph classes. Specifically, on d-regular graphs with good expansion, such that d/β ∈ polylog n, we provide fast, polylogarithmic time protocols for both leader election and exact majority. This opens a significant complexity gap relative to known constant-state protocols on graphs. Specifically, the 4-state exact majority protocol for general graphs [35] requires Ω(n) parallel time even in regular graphs with high expansion, if node degrees are Θ(n). (A simple example is the complete bipartite graph given in Figure 1b.) Yet, our protocols guarantee stabilisation in only polylog n parallel time in both low and high degree graphs, as long as d/β is at most polylog n.

Interestingly, this advantage appears to be lost in graphs of low expansion: e.g. in cycles (1-dimensional toroidal grids) the super-constant state protocols take n^2 polylog n time to stabilise, whereas the constant-state protocols for exact majority and leader election run in O(n^2 log n) expected parallel time. (The upper bound for majority follows from [35], whereas the leader election bound follows from our work.)

Paper outline. In Section 2, we give a high-level overview of our techniques and a summary of our main technical results. Section 3 discusses related work. Section 4 gives formal definitions, while Sections 5 to 8 develop our framework, including applications.
Section 9 gives an analysis of a constant-state protocol for leader election that stabilises in polynomial expected time in any connected graph. Finally, in Section 10, we conclude by discussing some open problems.
Technical overview

We now give a technical overview of our results. Our reduction framework combines several techniques from different areas, but ultimately the approach can be distilled to a few basic ingredients.
The first ingredient: A synchronous token shuffling model.
We introduce a simple synchronous model for fully-connected networks, which can be simulated by population protocols under the stochastic scheduler. In this model, we assume a synchronous system of n nodes that communicate using a small number of distinct token types. The model is parameterised by an integer k > 0. The computation proceeds in discrete rounds, where in each round nodes perform the following actions in lock-step:

(1) at the start of a round, every node v generates exactly k tokens based on its current state,
(2) all of the nk tokens generated are shuffled uniformly at random among all the nodes, so that each node is assigned exactly k tokens, and
(3) every node v updates its local state based on its current state and the (ordered sequence of) k tokens it received.

To emphasise the role of k, we refer to the model as the k-token shuffling model. We typically assume that both k and the number of distinct token types are (small) constants. An R-round execution of this model is given by a sequence σ_1, . . . , σ_R of random i.i.d. uniform permutations on the set of nk tokens. We call such a sequence an R-round synchronous schedule. Figure 2 illustrates executions of this model for k = 1 and k = 2.

As the token shuffling model is synchronous, it often simplifies the design and analysis of algorithms compared to the asynchronous population protocol model. In terms of algorithm design, the model can be used in several ways:

– One-way interactions:
For k ≥ 1, we can model "pull interactions" [41] by having each node copy its local state onto a token at the start of every round. After shuffling the tokens, each node holds a sample of k states of randomly chosen nodes. For k = 1, this corresponds to synchronous one-way interactions, where in each round every node v reads the state of some other node σ(v) given by a random permutation σ : V → V. This is illustrated in Figure 2a.

Figure 2: The synchronous k-token shuffling model with 5 nodes for k = 1 and k = 2. Rectangles are nodes and the small circles are tokens. In each round, nodes generate k tokens based on their current state. Then all nk tokens are shuffled randomly. After this, nodes update their state based on the vector of k tokens they hold. (a) An execution of a protocol in the 1-token shuffling model. The arrows between tokens represent the random permutation used to shuffle tokens. (b) An execution of a protocol for k = 2. Each node sends and receives two tokens.

– Simulating pairwise interactions:
For k = 2, we can simulate two-way interactions, where two agents simultaneously read each other's states and update their local states, in a population V′ of 2n virtual agents. Every node v ∈ V stores the state of two virtual agents v_1, v_2 ∈ V′. In each round, the nodes shuffle the (states of the) two virtual agents they hold by copying each agent onto a token. At the end of a round, every node v simulates the interaction between the two virtual agents it holds, and updates their states. This corresponds to a synchronous (virtual) interaction pattern given by a random perfect matching. More generally, for k ≥ 2, we can simulate k-wise interactions, used in e.g. chemical reaction network (CRN) models [23].

For our running examples, the leader election and exact majority tasks, we can use the first approach (one-way interactions) to adapt a standard population protocol for information dissemination to solve leader election [37] in the synchronous 1-token shuffling model, assuming nodes can make random local coin flips. Alternatively, we can generate synthetic coin flips in the synchronous model when k > 1 using deterministic transition rules. We use the second approach (simulating pairwise interactions) to run the cancellation-doubling dynamics employed by many exact majority protocols [7, 12].

Finally, we note that from a biological perspective, the tokens can be interpreted as e.g. signalling molecules, which bind to a limited number of receptors at cells (nodes). However, we do not pursue such analogies further.

The second ingredient: Token shuffling as a card shuffling process.
In order to simulate the schedules of the k-token shuffling model, we devise a token shuffling scheme that can be implemented by a graphical population protocol. This leads to the following card shuffling process on a graph with n nodes, whose mixing time we analyse. Suppose we are given a deck of nk cards and we place a stack of k cards on every node. In each step, select a node v uniformly at random and

– with probability 1/2, move the card on the top of the node's stack to the bottom of the stack,
– with probability 1/4, exchange the top card of v with the top card of a random neighbour of v,
– with probability 1/4, do nothing.

This process corresponds to a random walk on the symmetric group S_{nk}. The special case of k = 1 is known as the interchange process, which has been analysed on various graphs [24, 30, 32, 40, 48, 54].
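The process above can be simulated directly; the following is a minimal sketch of ours (function name and representation are illustrative assumptions; index 0 plays the role of the top of a stack):

```python
import random

def k_stack_interchange(adj, k, steps, seed=0):
    """Simulate the k-stack interchange process on a graph given by
    adjacency lists `adj`. Each node v starts with the stack of cards
    [v*k, ..., v*k + k - 1]; cards are just integers 0..n*k-1."""
    rng = random.Random(seed)
    n = len(adj)
    stacks = [list(range(v * k, (v + 1) * k)) for v in range(n)]
    for _ in range(steps):
        v = rng.randrange(n)  # pick a node uniformly at random
        r = rng.random()
        if r < 0.5:
            stacks[v].append(stacks[v].pop(0))  # top card to the bottom
        elif r < 0.75:
            u = rng.choice(adj[v])              # swap tops with a neighbour
            stacks[v][0], stacks[u][0] = stacks[u][0], stacks[v][0]
        # with the remaining probability 1/4: do nothing
    return stacks
```

Run long enough on, say, a 4-cycle (the setting of Figure 3), the deck approaches a uniformly random arrangement; note that every move is a rotation or a swap, so the multiset of cards is always preserved.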
Figure 3: Interchange dynamics on a 4-cycle. In each step, blue cards are swapped. Top row: the 1-stack interchange process. Bottom row: the 2-stack interchange process. In each step, a randomly selected node either moves its top card to the bottom of its stack or exchanges it with the top card of a randomly selected neighbour.

For the case k > 1, we refer to the process as the k-stack interchange process, and to our knowledge, this process has not been explicitly studied before. The process is illustrated in Figure 3. Inspired by Jonasson [40], we bound the mixing time of this process on general graphs using the path comparison method of Diaconis and Saloff-Coste [30] and leveraging well-known results by Leighton and Rao [42]. For d-regular graphs and any constant k > 0, we obtain a simple (although somewhat loose) upper bound showing that the distribution of the nk cards mixes close to the uniform distribution in the order of (d/β)^2 · n log n steps; see Section 5 for details.

Corollary 1.
Let G be a d-regular graph with edge expansion β > 0. For any constant k > 0, the mixing time of the k-stack interchange process on G is

O( (d/β)^2 · n log n ).

The third ingredient: Leaderless graphical phase clocks.
A key gadget when simulating the synchronous k-token shuffling model in the asynchronous graphical population model is a construction of a bounded-state, leaderless phase clock for well-behaved graphs, which we provide in Section 6. This generalises a known approach from the clique case, e.g. [7, 18], by leveraging the graphical version of the two-choice load balancing process [15, 49]. The phase clock approximates the number of interactions the system has taken by having each node keep a local interaction counter. In each step, exactly one of the counters is incremented. The protocol guarantees that in any regular graph the difference between any two counters is O(d/β · log n), with high probability, for polynomially many steps.

While the phase clock construction we provide works on a large class of graphs, including all connected regular graphs and also certain non-regular and weighted graphs, other clock protocols could be used in our construction as well. For example, several phase clocks for the clique have been proposed in the population protocol literature, e.g. [7, 12, 37, 39, 52]. It is possible that adapting some of these clocks to the graphical setting could improve the bounds guaranteed by our construction. However, in general, it remains an open problem to determine bounds for phase clocks and related load balancing processes in various graph classes [8, 49].

The fourth ingredient: Simulating synchronous token shuffling protocols. In Section 7, we construct a simulation scheme in the population protocol model that can simulate any synchronous algorithm in the k-token shuffling model. We do this by repeatedly running iterations of the card shuffling process (where cards represent tokens), carefully synchronised by the phase clock.
The simulation protocol is guaranteed to correctly simulate polynomially many synchronous rounds, with high probability. However, since the card shuffling process can only be run for finitely many steps, the distribution of tokens will never reach a perfectly uniform distribution. Thus, the tokens are not guaranteed to be permuted uniformly at random, but only almost uniformly at random. We circumvent this issue by a statistical indistinguishability argument: no polynomial-time synchronous protocol in the token shuffling model can distinguish between executions where the tokens are distributed uniformly at random and almost uniformly at random. This shows that any synchronous protocol that stabilises with high probability under uniform schedules will also do so under the simulated almost-uniform schedules. We thus arrive at the main technical result of our paper.

Theorem 2.
Let k > 0 be a constant and A be a synchronous k-token shuffling protocol on n nodes, where X is the set of local states and Y the set of token types used by the protocol A. If A solves the task Π with high probability in R ∈ poly(n) rounds, then there exists a stochastic population protocol B that also solves task Π with high probability on any n-node d-regular graph G with edge expansion β > 0. The step complexity T(B) and state complexity S(B) of the protocol B satisfy

T(B) ∈ O(R · n · ζ)  and  S(B) ∈ O(|X| · |Y|^k · ζ),  where  ζ = log n · ( d/β + τ_mix / n ),

and τ_mix is the mixing time of the k-stack interchange process on G.

The general mixing time bound given by Corollary 1 implies τ_mix / n ∈ O((d/β)^2 log n), and hence ζ ∈ O((d/β)^2 log^2 n), so the protocol B given by Theorem 2 satisfies

T(B) ∈ O( (d/β)^2 · R n log^2 n )  and  S(B) ∈ O( |X| · |Y|^k · (d/β)^2 · log^2 n ).

We note that it is sometimes possible to get better bounds on the mixing time of the k-stack process for specific graph classes than the one given by Corollary 1.

The fifth ingredient: Fast synchronous protocols for leader election and exact majority.
Finally, to instantiate efficient graphical population protocols for majority and leader election, in Section 8 we provide their counterparts in the synchronous token shuffling model:

– There is a synchronous 2-token shuffling protocol for the leader election task that stabilises in O(log n) rounds with high probability, uses O(log n) states per node and two token types.
– There is a synchronous 2-token shuffling protocol for the exact majority task that stabilises in O(log n) rounds with high probability, uses O(log n) states and five token types.

Plugging the above results into Theorem 2, we get the following corollary.

Corollary 3.
For any d-regular graph G with edge expansion β > 0, there exist stochastic population protocols for leader election and exact majority which stabilise with high probability. Their parallel time and state complexities are both bounded by (d/β)^2 polylog n.

The last ingredient: Backup protocols. We note that our simulation framework guarantees stabilisation with high probability only, that is, the obtained protocols may fail with low probability (e.g. when the phase clocks become desynchronised). Nevertheless, we can obtain always-correct protocols, i.e., protocols with finite expected stabilisation time, by switching to an always correct, but possibly much slower, "backup protocol" in the unlikely executions where the fast protocols fail. We choose the backup protocols as follows:

– For exact majority, we can use the 4-state protocol by Draief and Vojnović [35], which stabilises in polynomial expected time on any connected graph.
– For leader election, we can use the 6-state protocol of Beauquier et al. [16]. In Section 9, we show that the protocol stabilises in O(diam(G) · m log n) expected parallel time in any connected graph under the stochastic scheduler. This is done by analysing the meeting times of certain correlated random walks that arise in the population protocol model.

Combining the backup protocols with the faster protocols yields protocols whose expected parallel time to stabilise is bounded by (d/β)^2 polylog n, with only a constant overhead in the state complexity.

Related work

Our framework builds on technical tools developed in several different fields, ranging from graphical load balancing [49] to random walks on finite groups [30]. We now give a brief overview of related work on population protocols, interacting particle systems, and token shuffling processes.
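To see why constant-state dynamics of this kind stabilise, slowly but surely, on any connected graph, the following bare-bones coalescence sketch (ours; a simplified stand-in, not the 6-state token-based protocol of [16]) captures the core mechanism:

```python
import random

def coalescing_leader_election(adj, seed=0):
    """All nodes start as leaders; when two leaders interact, the
    responder becomes a follower. Under the uniform stochastic
    scheduler on the edges of a connected graph, this terminates with
    a single leader in finite expected time. Returns the number of
    interactions taken and the id of the surviving leader."""
    rng = random.Random(seed)
    edges = [(u, v) for u in range(len(adj)) for v in adj[u] if u < v]
    leader = [True] * len(adj)
    steps = 0
    while sum(leader) > 1:
        u, v = rng.choice(edges)  # uniform stochastic scheduler
        steps += 1
        if leader[u] and leader[v]:
            leader[v] = False
    return steps, leader.index(True)
```

The full backup protocols additionally let followers stabilise their outputs; the sketch only shows why the leader count eventually coalesces to one on any connected graph.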
Population protocols on graphs.
Already the pioneering work on population protocols considered interactions restricted to pairs of nodes adjacent in some underlying interaction graph [9]. However, the initial line of research focused mostly on the computational power of the model, particularly in the setting with constantly many states per node, e.g. [9–11, 22, 25].

The differences between the computational power of fully-connected and general bounded-degree graphs were established early [9, 13]. Under the adversarial scheduler, the clique graph is the weakest interaction graph in terms of computability. Protocols in e.g. bounded-degree graphs can compute a strictly larger set of predicates than protocols in the clique [14]. While constant-space protocols on clique graphs can compute exactly those predicates definable in Presburger arithmetic [10], population protocols on a bounded-degree graph can simulate the computation of a linear-space Turing machine and compute many network predicates of the underlying graph [10]. We refer to the survey of Aspnes and Ruppert [14] for a more detailed overview of the early results in this area.

We also note that many self-stabilising protocols, i.e. protocols that recover from arbitrary transient failures corrupting the local states of nodes, have been studied in general graphs [13]. In particular, self-stabilising leader election has received considerable attention in the literature. A large portion of the research in this area has focused on computability issues and on identifying on which interaction graphs the problem can be solved, e.g. under what kinds of oracles [13, 16]. However, as self-stabilising leader election is generally a strictly harder problem than the non-self-stabilising variant we consider, it is often necessary to make additional assumptions on the model or the underlying graph structure when working with self-stabilisation.
Regardless, in terms of complexity, Chen and Chen [26] gave a constant-state protocol with exponential stabilisation time in directed cycles and 2-dimensional toroidal grids. Later, they gave a protocol for d-regular graphs using O(d) states [27]. Recently, Yokota, Sudo and Toshimitsu [55] reported a leader election protocol for directed cycles that runs in O(n) parallel time using O(N n) states, where N ≥ n is a known upper bound on the population size.

Space-time complexity tradeoffs in the clique.
More recently, there has been a flurry of interest in determining the fundamental space-time trade-offs for key tasks, such as majority and leader election [6, 7, 18, 20, 21, 33, 35, 39, 45]. For example, Doty and Soloveichik [33] established that there is no constant-state protocol for leader election that runs in o(n^2) expected steps, and Alistarh, Aspnes, Eisenstat, Gelashvili and Rivest [6] showed that the bound on the expected number of steps also holds for any protocol using o(log log n) states per node.

Berenbrink, Giakkoupis and Kling [21] eventually established a matching upper bound for leader election by giving a protocol that uses O(log log n) states and elects a leader in the optimal number of Θ(n log n) steps. Similarly to leader election, a parallel line of research focused on bounding the complexity of the exact majority task [5, 7, 39]. Gąsieniec, Stachowiak and Uznański [39] reported a protocol that computes exact majority w.h.p. in O(n log n) steps using O(log n) states. Doty, Eftekhari and Severson [34] followed up with a protocol that solves the problem in optimal expected time O(log n) using O(log n) states.

Understanding complexity on general graphs.
The vast majority of the work investigating the complexity of population protocols has focused on the case where the underlying graph is a clique [4, 36]. Two natural justifications for this choice are that: (1) the clique is a reasonable approximation of the well-mixed solution assumption, and (2) the analysis of population protocols can be difficult enough even without additional graph structure complicating matters. The recent survey [36] points out that running time on general graphs is poorly understood, and sets this as an open question. Bounds on non-complete graphs have been studied for exact [35] and approximate majority [45, 46], with some recent work considering plurality consensus [17, 28, 29] in a closely-related model.

In this paper, we provide a general and efficient way of leveraging the recent progress in the clique model to general regular graphs. We show that time- and space-efficient algorithms – by which we mean polylogarithmic parallel time and space complexity – can be obtained in well-behaved graph topologies. Reminiscent of the state shuffling approach used to show that the clique is the weakest interaction graph under adversarial schedules, our work also employs a shuffling procedure in a graph to simulate pairwise interactions between any pairs of nodes. However, a key difference is that we aim to simulate pairwise interactions under the uniform stochastic scheduler, as (the analyses of) fast protocols in the clique exploit the fact that the pairwise interactions are uniformly random among all pairs [4, 36]. Thus, one of the main technical challenges is to devise a shuffling procedure that guarantees that the simulated interactions are (almost) uniform.
Besides population protocols on graphs, there is a long line of research investigating various dynamics of interacting particle systems on graphs [2]. For example, the dynamics are often (but not always) assumed to evolve in a synchronous fashion, where in each round all particles make random choices independently of all other particles. This allows the use of many powerful techniques for analysing independent random walks on graphs [44]. In comparison, the stochastic population protocol model is asynchronous, as in each step only two randomly selected neighbouring nodes interact, and the new states of the interacting nodes tend to be highly correlated.

Coalescing random walks and the voter model.
Cooper, Elsässer, Ono and Radzik [28] analysed the coalescence time of independent random walks on a graph in terms of the expansion properties of the graph. In this process, each node initially holds a unique particle, and in each time step every particle – independently of all other particles – randomly moves to another node. When two particles meet, they unite into a single particle, which continues to follow a random walk and coalesces with any other particle it meets. The expected coalescence time of the random walks is closely related to the consensus time of the classic voter dynamics, where in each round every node switches to the opinion of a randomly chosen neighbour.

In this work, we use token-based population protocols on graphs, where the tokens are shuffled between nodes during an interaction and the tokens, instead of coalescing, may also interact in other ways. For example, we analyse the running time of a constant-state leader election protocol by Beauquier et al. [16], which is used as a subroutine for our fast leader election protocol. The constant-state protocol employs tokens that, instead of coalescing upon meeting, update their state; moreover, the tokens do not perform independent random walks. Nevertheless, we can use a similar approach as Cooper et al. [28] to study the coalescence time of non-independent random walks and establish time bounds on the constant-state leader election protocol under the stochastic scheduler.

Token-based processes have also been used to implement efficient, randomised rumour spreading protocols. For example, Berenbrink, Giakkoupis and Kling [19] analysed the cover time of a synchronous coalescing-branching random walk on regular graphs: initially there is a single token located at some node. In each round, (1) every node v that has a token splits its token into k parts and sends them to k randomly chosen neighbours, and (2) at the end of the round, all tokens at a single node coalesce into a single token.
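The coalescing random walk process described above is easy to simulate directly. The sketch below (with a small complete graph as a hypothetical input, and a step cap purely for safety) counts the synchronous rounds until all particles have merged:

```python
import random

def coalescence_rounds(adj, rng, max_rounds=10_000):
    """Coalescing random walks: one particle starts on every node; in each
    synchronous round every particle moves to a uniformly random neighbour,
    and particles landing on the same node merge into one."""
    positions = set(adj)                      # one particle per node
    rounds = 0
    while len(positions) > 1 and rounds < max_rounds:
        positions = {rng.choice(adj[v]) for v in positions}
        rounds += 1
    return rounds

# Hypothetical instance: the complete graph on 8 nodes.
n = 8
adj = {v: [u for u in range(n) if u != v] for v in range(n)}
```

Note that the protocols analysed in this paper are asynchronous and their tokens interact rather than merge; this snippet only illustrates the classic synchronous process.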
Similarly to our work, they use conductance to bound the behaviour of this process in regular graphs, and show that for k = 2 the cover time of the coalescing-branching process is of order φ⁻² · log n rounds.

Plurality consensus and token shuffling.
In plurality consensus, there are k > 1 opinions and the task is to agree on an opinion supported by the most nodes. Berenbrink, Friedetzky, Kling, Mallmann-Trenn and Wastell [17] present a protocol for the plurality consensus problem in a synchronous pull-based interaction model. Their protocol also circulates tokens, and samples their count periodically (after mixing) to estimate opinion counts, running into the issue that the token movements are correlated. The authors provide a generalisation of a result by Sauerwald and Sun [50] in order to show that the joint token distribution is negatively correlated, and therefore the token counting mechanism concentrates.

In this work, we also employ a token exchange protocol, and encounter non-trivial correlation issues. However, we resolve these issues differently: we characterise the distribution of the token interactions using the k-stack interchange process, and bound its total variation distance relative to the uniform distribution, showing that the two distributions are indistinguishable in polynomial time with high probability. More generally, the goal of our construction is different, as we aim to provide a general framework to efficiently simulate pairwise random node interactions.

Our work builds on the work done on card shuffling processes. These processes have a long and rich history [1, 24, 30–32, 40, 48, 54]. While many of these processes are simple to describe, they are often surprisingly challenging to analyse. Here we focus on key results related to the interchange process, where the cards are placed on the nodes of a graph and shuffling is performed by randomly exchanging cards between adjacent nodes. We note that much of the work has aimed to identify sharp bounds on the mixing time of the interchange process on various graphs. Diaconis and Shahshahani [31] gave sharp bounds of the order Θ(n log n) on the mixing time of the random transpositions shuffle, i.e., the interchange process on the clique.
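The random transpositions shuffle just mentioned is simple to simulate: sampling two positions i and j independently and uniformly and swapping them yields the identity with probability 1/N (when i = j) and each specific transposition with probability 2/N². A minimal sketch:

```python
import random

def random_transposition_step(deck, rng):
    """One step of the random transpositions shuffle on a deck of N cards."""
    i = rng.randrange(len(deck))
    j = rng.randrange(len(deck))   # i == j happens with probability 1/N: a lazy step
    deck[i], deck[j] = deck[j], deck[i]

def shuffle(deck, steps, rng):
    for _ in range(steps):
        random_transposition_step(deck, rng)
    return deck
```

By the Diaconis–Shahshahani bound cited above, on N cards on the order of N log N such steps suffice to mix.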
Aldous [1] established that the mixing time of the interchange process on the path is between Ω(n³) and O(n³ log n); later Wilson [54] showed that the mixing time is in fact Θ(n³ log n). Diaconis and Saloff-Coste [30] developed a powerful technique for upper bounding the mixing time of a random walk on a finite group by comparing it to another walk with known behaviour via certain Dirichlet forms. Our analyses of the k-stack interchange process also rely on this comparison technique. A decade later, Wilson [54] gave a general technique for proving lower bounds for many shuffling processes. In particular, he showed that the mixing time on the two-dimensional √n × √n grid is Θ(n² log n), and Ω(n log n) on the hypercube. Subsequently, Jonasson [40] gave additional upper and lower bounds on the interchange process on various graphs, including showing that the mixing time on the hypercube and constant-degree expanders is at most O(n log² n), and O(ρmn log n) on any m-edge graph with radius ρ. For a further exposition of this area, we refer to [43].

In this work, we introduce and analyse a generalisation of the interchange process, called the k-stack interchange process, where each node holds k > 1 cards instead of one.

Graphs.
Let G = (V, E) be a graph, where V is the set of vertices and E is the set of edges. Throughout, we use n = |V| and m = |E|. The degree of a node is the number of neighbours it has, and a graph is d-regular if all vertices have degree d. For any S ⊆ V, we define

∂S = { {u, v} ∈ E : u ∈ S, v ∈ V \ S }   and   β = min{ |∂S| / |S| : S ⊆ V, 1 ≤ |S| ≤ n/2 },

where ∂S is the set of edges with exactly one endpoint in S and β is the edge expansion of the graph G. That is, |∂S| ≥ β · |S| holds for all S ⊆ V such that 1 ≤ |S| ≤ n/2.

Probability distributions.
Let E be a finite set. We say µ : E → [0, 1] is a probability distribution on E if Σ_{x ∈ E} µ(x) = 1 holds. We use X ∼ µ to denote a random variable sampled according to µ, so that Pr[X = x] = µ(x). For any distribution µ and set A we write µ(A) = Σ_{x ∈ A} µ(x). The uniform distribution on E is the distribution ν defined by ν(x) = 1/|E|. The support of µ is the set { x : µ(x) > 0 }. The total variation distance between two distributions µ₁ and µ₂ on E is given by

‖µ₁ − µ₂‖_TV = (1/2) Σ_{x ∈ E} |µ₁(x) − µ₂(x)| = max_{A ⊆ E} |µ₁(A) − µ₂(A)|.

We say that µ is ε-uniform on E if ‖µ − ν‖_TV ≤ ε.

Permutations and the symmetric group.
Let N > 0 be a positive integer and [N] = {0, . . . , N − 1}. A permutation on [N] is a bijection from [N] to [N]. The symmetric group S_N over [N] is the group consisting of the set of all permutations on [N], with function composition as the group operation and identity element id defined by id(i) = i. The inverse x⁻¹ of an element x ∈ S_N is the map satisfying x⁻¹ · x = x · x⁻¹ = id. A transposition (i j) ∈ S_N of i and j is the permutation that swaps the elements i and j, but leaves all other elements in place. We say that a set H ⊆ S_N generates S_N if every element of S_N can be expressed as a finite product of elements in H and their inverses. We use · and ◦ interchangeably to denote function composition.

Random walks on symmetric groups.
Let µ be a symmetric probability distribution on S_N, i.e., µ(x) = µ(x⁻¹) for all x. The random walk on S_N with increment distribution µ is a discrete-time Markov chain with state space S_N. In each step, an element x ∼ µ is chosen according to the distribution µ and the chain moves from state y to state xy. Thus, the probability of transitioning from state x to state yx is µ(y). The holding probability of the random walk is α = µ(id). The following lemma summarises some useful properties of such random walks; see e.g. [43] for proofs.

Lemma 4.
Let µ be an increment distribution for a random walk on the symmetric group S_N.
(1) The uniform distribution ν on S_N is a stationary distribution for the random walk.
(2) The random walk is reversible if and only if µ is symmetric.
(3) The random walk is irreducible if and only if the support of µ generates S_N.
(4) If µ(id) > 0, then the random walk is aperiodic.

Mixing times.
Let ν be the uniform distribution on S_N and let p^(t) be the probability distribution over the states of the chain after t steps. Following [30], we define the ℓ_s-norm and the normalised ℓ_s-distance to stationarity for s ≥ 1 as

‖µ‖_s = ( Σ_x |µ(x)|^s )^{1/s}   and   d_s(t) = |S_N|^{1−1/s} · ‖p^(t) − ν‖_s.

The total variation distance and the normalised distances satisfy

2 ‖p^(t) − ν‖_TV = d₁(t) ≤ d₂(t),

where the latter inequality follows from the Cauchy–Schwarz inequality. We define τ(ε) = min{ t : d₂(t) ≤ ε } and τ_mix = τ(1/4), where τ_mix is the mixing time of the random walk. We refer to τ(ε) as the ε-mixing time. Note that τ(ε) ≤ ⌈log ε⁻¹⌉ · τ_mix.

Distributed tasks.
Let Σ and Γ be finite sets of input and output labels, respectively. A task Π on a set of n nodes is a function Π that maps any input labelling z : [n] → Σ to a set Π(z) ⊆ Γ^V of feasible output labellings. That is, z′ : [n] → Γ is a feasible output labelling for input z if z′ ∈ Π(z). If Π(z) = ∅, then we say that z is an infeasible input. We focus on two tasks:
– In the leader election task, the input is the constant function z(v) = 1, and the output labelling z′ is feasible iff there exists v ∈ V such that z′(v) = 1 and z′(u) = 0 for all u ≠ v.
– In the majority task, the input is a function z : V → {0, 1} and an output labelling z′ is feasible iff it is the constant function z′(v) = b, where b is the input value held by the majority of the nodes. As is conventional, an input with equally many zeros and ones is taken to be infeasible.

Graphical stochastic population protocols. Let G = (V, E) be a graph. In the graphical stochastic population model, abbreviated as PP(G), the computation proceeds asynchronously: in each time step t > 0, a stochastic scheduler picks uniformly at random an ordered pair (u, v) of nodes to interact, where {u, v} ∈ E. The node u is called the initiator and v the responder. During an interaction, the nodes u and v read each other's states and update their local states according to a given protocol. We assume that nodes have access to independent and uniform random bits. Specifically, upon each interaction, both u and v are provided with a single random bit each. We note that this assumption is common in the context of population protocols, e.g.
[37], and can be justified practically by the fact that chemical reaction network (CRN) implementations can directly obtain random bits given the structure of their interactions [23]. Formally, a protocol in this model is a tuple A = (f, ℓ_in, ℓ_out), where f : S × {0, 1} × S × {0, 1} → S × S is the state transition function, S is the set of states, ℓ_in : Σ → S maps inputs to initial states, and ℓ_out : S → Γ maps states to outputs. An asynchronous schedule is a random sequence (e_t)_{t ≥ 1} of pairs e_t = (u, v). An execution is the sequence (x_t)_{t ≥ 0} of configurations given by

( x_{t+1}(u), x_{t+1}(v) ) = f( x_t(u), q_t(u), x_t(v), q_t(v) )   and   x_{t+1}(w) = x_t(w) for w ∈ V \ {u, v},

where (u, v) = e_{t+1} and q_t(u) ∈ {0, 1} is the random bit provided to a node during its interaction. The output of the protocol at step t is given by z′_t = ℓ_out ◦ x_t. We say that the protocol A stabilises on input z by time T if z′_{t+1} = z′_t and z′_t ∈ Π(z) hold for all t ≥ T. We say that A solves the task Π with probability at least p in T(A) steps if the protocol stabilises by time T(A) on any input with probability at least p. The state complexity of the protocol is S(A) = |S|, i.e., the number of states used by the protocol.

Synchronous token shuffling protocols.
In the synchronous k-token shuffling model, we assume that there are n agents which communicate in a round-based fashion using tokens. In each round,
(1) every node v generates exactly k tokens based on its current state,
(2) all nk tokens are shuffled uniformly at random so that each node is assigned exactly k tokens,
(3) every node v updates its local state based on its current state and the k tokens it received.
Let X and Y be finite sets. Formally, an algorithm in this token shuffling model is a tuple (f, g, ℓ_in, ℓ_out). The function f : X × Y^k → X is a state transition function, and g : X → Y^k determines which tokens each node creates at the start of each round. Here, X represents the set of states a node can take, and Y is the set of values a token can take. The function ℓ_in : Σ → X maps input values to initial states and ℓ_out : X → Γ maps the state of a node onto an output value. A configuration is a map x : V → X. An execution is a sequence (x_r)_{r ≥ 0} of configurations, where x_r(v) gives the state of node v at the end of round r. The initial state of node v is x_0(v) = ℓ_in(z(v)), where z(v) is the input of node v. A synchronous schedule is a sequence (σ_r)_{r ≥ 1}, where the permutation σ_r ∈ S_{nk} describes how the tokens are shuffled in round r. For any y : [N] → Y and node v, we let y(v_0, . . . , v_{k−1}) = (y(v_0), . . . , y(v_{k−1})), where v_i = vk + i. The execution induced by the synchronous schedule (σ_r)_{r ≥ 1} on input z is defined by

y_{r+1}(v_0, . . . , v_{k−1}) = (g ◦ x_r)(v)   and   x_{r+1}(v) = f( x_r(v), (y_{r+1} ◦ σ_{r+1})(v_0, . . . , v_{k−1}) ),

where y_{r+1}(v_0, . . . , v_{k−1}) and (y_{r+1} ◦ σ_{r+1})(v_0, . . . , v_{k−1}), respectively, are the k tokens generated and received by node v during round r + 1.
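A round of this model translates directly into code. The sketch below uses hypothetical transition functions g and f (here: every node emits k copies of its state and adopts the maximum value it sees) purely for illustration:

```python
import random

def shuffle_round(states, g, f, k, rng):
    """One round of the synchronous k-token shuffling model under the uniform
    scheduler: (1) every node generates k tokens via g, (2) all n*k tokens are
    permuted uniformly at random, (3) every node applies f to its state and
    the k tokens it receives."""
    tokens = [tok for x in states for tok in g(x)]    # step (1)
    rng.shuffle(tokens)                               # step (2)
    return [f(x, tuple(tokens[v * k:(v + 1) * k]))    # step (3)
            for v, x in enumerate(states)]

# Illustrative (hypothetical) protocol: spread the maximum input value.
g = lambda x: (x,) * 2                 # k = 2 copies of the current state
f = lambda x, toks: max((x,) + toks)   # adopt the largest value observed
```

Repeated rounds of this toy protocol converge to the all-maximum configuration, mirroring how information disperses through uniform token shuffles.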
The output of node v at the end of round r is (ℓ_out ◦ x_r)(v). When designing algorithms in this model, we assume the uniform synchronous scheduler, which picks each permutation independently and uniformly at random from the set of all permutations S_N.

Shuffling on graphs: the k-stack interchange process

We now describe and analyse the so-called k-stack interchange process, which is a variant of the classic interchange process. Both processes are examples of random walks on the symmetric group [30, 43]. In Section 7, we repeatedly run this process to simulate the synchronous schedules of the k-token shuffling model. The running time of this simulation will be bounded by the mixing time of the k-stack interchange process. We analyse the mixing time using the path comparison method of Diaconis and Saloff-Coste [30] combined with a flow result of Leighton and Rao [42].

A routing on a graph G is a map f that takes every pair of vertices onto a path f(u, v) in G connecting the vertices u and v. The congestion of a routing is the maximum number of paths with a common edge, and its dilation is the length of the longest path in the routing. We say that graph G is (C, D)-routable if there exists a routing with congestion C and dilation D. We make use of the following lemma, which follows from the results of Leighton and Rao [42, Theorem 18].

Lemma 5. If G is a d-regular graph with edge expansion β > 0, then G is (C, D)-routable with

C ∈ O( n log n / β )   and   D ∈ O( d log n / β ).

The k-stack interchange process. Let G = (V, E) be a graph with n vertices {0, . . . , n − 1} and let N = kn for k > 0. Consider the following shuffling process, where each node of G holds a stack of exactly k cards.
In every time step, one ofthe following actions is taken:(1) with probability / , move the top card of a random node to the bottom of its stack,(2) with probability / , choose a random edge { u, v } and swap the top cards of u and v ,(3) with probability / , do nothing.We refer to this process as the k -stack interchange process on G . The special case of k = 1 is theclassic interchange process on G with holding probability / , as the first rule does not do anythingon stacks of size 1. For k > , the holding probability will be / . The process for k = 1 and k = 2 are illustrated in Figure 3 (given in Section 2).Later, we will see that this shuffling process can be implemented in the population protocolmodel assuming that each node is given a single random bit per interaction, i.e., with a total of tworandom bits per interaction. The following result bounds the mixing time of the k -stack interchangeprocess on a ( C, D ) -routable graph. Theorem 6.
Suppose G is a (C, D)-routable graph with n vertices and m edges. For any constant k > 0, the k-stack interchange process has mixing time

O( n log n · max{ CDm/n², D } ).

Together with Lemma 5, the above theorem immediately implies the following general (but somewhat loose) bound on the mixing time of the k-stack interchange process on regular graphs.

Corollary 1. Let G be a d-regular graph with edge expansion β > 0. For any constant k > 0, the mixing time of the k-stack interchange process on G is

O( (d/β)² · n log³ n ).

While this bound is fairly general, it can be off by polylogarithmic factors for some graphs. For example, for cycles and cliques, Corollary 1 yields the bounds O(n³ log³ n) and O(n log³ n), respectively, while a direct application of Theorem 6 can be used to obtain the respective bounds of O(n³ log n) and O(n log n) by using the trivial routings in these graphs. Similarly to the approach of Jonasson [40] in the case of the classic interchange process, our analysis relies on the comparison method developed by Diaconis and Saloff-Coste [30], which we overview now.
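Before turning to the analysis, the process itself admits a very short simulation. The sketch below keeps one stack of k cards per node; the rule probabilities are left as parameters of the sketch:

```python
import random
from collections import deque

def kstack_step(stacks, edges, rng, p_rotate=0.5, p_swap=0.25):
    """One step of the k-stack interchange process: rotate a uniformly random
    stack (top card to the bottom) with probability p_rotate, swap the top
    cards across a uniformly random edge with probability p_swap, and do
    nothing (a lazy step) with the remaining probability."""
    r = rng.random()
    if r < p_rotate:
        s = stacks[rng.randrange(len(stacks))]
        s.append(s.popleft())                                      # rule (1)
    elif r < p_rotate + p_swap:
        u, v = rng.choice(edges)
        stacks[u][0], stacks[v][0] = stacks[v][0], stacks[u][0]    # rule (2)
    # rule (3): otherwise, hold

# Example: k = 3 cards per node on a 4-cycle; card i of node u is u*k + i.
n, k = 4, 3
stacks = [deque(range(v * k, (v + 1) * k)) for v in range(n)]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
rng = random.Random(7)
for _ in range(2000):
    kstack_step(stacks, edges, rng)
```

Every step permutes the N = nk cards, so the multiset of cards and the stack sizes are invariants of the process.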
The path comparison method.
Let µ and ˜µ be increment distributions whose supports H and ˜H generate the symmetric group S_N. For each element a ∈ ˜H, choose a representation a = x₁ · · · x_k, where k is odd and x_i ∈ H for 1 ≤ i ≤ k. Let |a| = k and let N(x, a) denote the number of times x appears in the representation of a. Define

B(x) = (1/µ(x)) Σ_{a ∈ S_N} ˜µ(a) N(x, a) |a|   and   B = max_{x ∈ H} B(x).

The quantity B is called the bottleneck ratio of the representation and it is useful for bounding the ℓ₂-distance to stationarity. We use the following special case of a lemma from [30, Lemma 5].

Lemma 7 ([30]). Let µ and ˜µ be symmetric increment distributions that generate the symmetric group S_N. Let B be the bottleneck ratio for a representation as defined above. Then

d₂(t)² ≤ N! · exp(−t/B) + ˜d₂(⌊t/B⌋)².

Random transpositions shuffle.
We compare the k-stack interchange process against the random transpositions shuffle, whose mixing behaviour is well understood. The random transpositions shuffle is a random walk on the symmetric group given by the increment distribution

µ(x) = 1/N if x = id,  2/N² if x = (i j),  and 0 otherwise.

Diaconis and Shahshahani [31] give the following bound on the mixing time of the random transpositions shuffle.

Lemma 8.
Let µ be the increment distribution for the random transpositions shuffle on S_N. There exists a universal constant C such that for t = ⌊N(log N + c)⌋,

d₂(t) ≤ Ce⁻ᶜ.

Proof of Theorem 6

We now give the proof of Theorem 6, which we break into parts.
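The proof below manipulates permutations as products of transpositions and cycles. For reference, composition, inverses, and transpositions on [N] can be realised concretely as follows (a small helper sketch, with permutations represented as tuples):

```python
def compose(x, y):
    """(x . y)(i) = x(y(i)) for permutations represented as tuples on [N]."""
    return tuple(x[y[i]] for i in range(len(x)))

def inverse(x):
    """The inverse permutation: inverse(x)[x[i]] = i."""
    inv = [0] * len(x)
    for i, xi in enumerate(x):
        inv[xi] = i
    return tuple(inv)

def transposition(N, i, j):
    """The transposition (i j) in S_N: swaps i and j, fixes everything else."""
    t = list(range(N))
    t[i], t[j] = t[j], t[i]
    return tuple(t)
```

For instance, composing a transposition with itself yields the identity, matching the odd-length representation bookkeeping used in the proof.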
Increment distribution for the k-stack interchange process. In the following, we label the cards from {0, . . . , nk − 1} and write u_i = uk + i for u ∈ [n] and i ∈ [k], so that u_i denotes the ith card of node u ∈ V. Thus, u_0 is the top card and u_{k−1} is the bottom card of the stack located at node u. Let σ_u = (u_0 u_{k−1} u_{k−2} . . . u_1) denote the permutation which moves the top card of u to the bottom of its stack. Recall that (u_0 v_0) is the transposition along an edge for neighbouring u ≠ v. The increment distribution µ of the random walk is given by

µ(x) = 1/(2n) if x = σ_u for some u ∈ V,  1/(4m) if x = (u_0 v_0) for some {u, v} ∈ E,  1/4 if x = id,  and 0 otherwise.

Thus, the support of the increment distribution of the k-stack interchange process is

H = {id} ∪ { σ_u : u ∈ V } ∪ { (u_0 v_0) : {u, v} ∈ E }.

Comparison with random transpositions shuffle.
Let ˜µ be the increment distribution of the random transpositions shuffle on S_N and ˜H = { (u_i v_j) : u, v ∈ V, i, j ∈ [k] } the support of ˜µ. We start by choosing a representation for each transposition (u_i v_j) ∈ ˜H in terms of an odd number of elements in H. After this, we bound the bottleneck ratio B of the chosen set of representations and apply Lemma 7 to bound the mixing time.

Bounding the bottleneck ratio.
Consider the partition ˜H = {id} ∪ ˜H₁ ∪ ˜H₂, where

˜H₁ = { (u_i u_j) : u ∈ V, 0 ≤ i < j < k }   and   ˜H₂ = { (u_i v_j) : u, v ∈ V, u ≠ v, i, j ∈ [k] }.

For each a ∈ ˜H we find a representation in terms of elements in H depending on which of the three parts a belongs to. The identity element of ˜H can be represented by the identity of H, so we need to only find odd-length representations for elements in ˜H₁ and ˜H₂:
– Suppose a = (u_i u_j) ∈ ˜H₁. We can represent this as

(u_i u_j) = σ_u^{k−i} · (u_0 v_0) · σ_u^{k−j+i} · (u_0 v_0) · σ_u^{j−i} · (u_0 v_0) · σ_u^{i},

where σ_u^i stands for i repetitions of σ_u and v is a fixed neighbour of u in G. The length |a| of the representation is 2k + 3 and N(x, a) ≤ max{2k, 3} for each x ∈ H.
– Suppose a = (u_i v_j) ∈ ˜H₂. Since the graph G is (C, D)-routable, there exists a path u = w_0, . . . , w_ℓ = v of length 1 ≤ ℓ ≤ D connecting u and v in G. Let

ρ_uv = (w_0 w_1) · · · (w_{ℓ−1} w_ℓ) · (w_{ℓ−2} w_{ℓ−1}) · · · (w_0 w_1)

be the resulting sequence of 2ℓ − 1 transpositions. With this, we can represent a = (u_i v_j) as

(u_i v_j) = σ_u^{k−i} · σ_v^{k−j} · ρ_uv · σ_v^{j} · σ_u^{i}.

Now |(u_i v_j)| = 2k + 2ℓ − 1 ≤ 2(k + D) − 1. Note that σ_u and σ_v are both used at most k times and ρ_uv uses each edge-wise transposition at most twice. Hence N(x, a) ≤ 2k.

Next, we bound B(x) for each x ∈ H. There are again three cases to consider:
– Suppose x = id. Since the identity element is only used to represent itself, we have that

B(id) = ˜µ(id)/µ(id) = 4/(kn) ∈ O(1).

– Suppose x = σ_u ∈ H for some u. Note that µ(σ_u) = 1/(2n), and σ_u appears in the representations of k(k − 1)/2 elements of ˜H₁ and of at most (n − 1)k² elements of ˜H₂. Thus,

B(x) = (1/µ(x)) Σ_{a ∈ S_N} ˜µ(a) N(x, a) |a| = (4n/N²) ( Σ_{a ∈ ˜H₁} N(x, a)|a| + Σ_{a ∈ ˜H₂} N(x, a)|a| ).

The sum over ˜H₁ is bounded by O(k⁴) and the sum over ˜H₂ by O(nk⁴ + nk³D).
Hence, B(x) ∈ O(k² + Dk).
– Suppose x = (u_0 v_0) ∈ H for some {u, v} ∈ E. Note that µ((u_0 v_0)) = 1/(4m), and (u_0 v_0) is used in at most k² representations in ˜H₁ and in at most Ck² representations in ˜H₂. Hence,

B(x) = (1/µ(x)) Σ_{a ∈ S_N} ˜µ(a) N(x, a) |a| = (8m/(nk)²) ( Σ_{a ∈ ˜H₁} N(x, a)|a| + Σ_{a ∈ ˜H₂} N(x, a)|a| ).

The sum over ˜H₁ is bounded by O(k³) and the sum over ˜H₂ by O(Ck³ + CDk²). Thus, B(x) ∈ O(CDmk/n²).

Therefore, the bottleneck ratio B is bounded by

B = max_{x ∈ H} B(x) ∈ O( k · max{ CDm/n², D } ).

Bounding the mixing time.
We can now bound the mixing time of the k-stack interchange process using the bound on the bottleneck ratio B and Lemma 7. Note that we can choose t ∈ Θ(BN log N) so that the following inequalities hold:

˜d₂(⌊t/B⌋) ≤ 1/8   and   exp(−t/B) ≤ N^{−2N} ≤ 1/(64 · N!).

The first inequality is obtained using Lemma 8. The total variation distance is then bounded using Lemma 7 and the fact that d₁(t) ≤ d₂(t), yielding

d₁(t) ≤ d₂(t) ≤ √( N! · exp(−t/B) ) + ˜d₂(⌊t/B⌋) ≤ 1/8 + 1/8 ≤ 1/e.

The claim of Theorem 6 follows by observing that the mixing time is bounded by O(BN log N), which for constant k is

O( n log n · max{ CDm/n², D } ),

as claimed. This concludes the proof of Theorem 6.

Decentralised phase clocks
In this section, we describe a bounded phase clock construction for the stochastic population protocol model on regular graphs. However, the construction can also be generalised to non-regular graphs, assuming that the degrees do not deviate too much from the average degree; see Appendix A. The construction generalises the approach of Alistarh et al. [7], which was used to build a leaderless phase clock on cliques, leveraging the classic two-choice load balancing process [15] and the analysis of Peres et al. [49] for graphical balanced allocations.
Recall that in the graphical stochastic population protocol model PP(G), two randomly chosen adjacent nodes interact in each step t > 0. Let φ > 0 be an integer and consider a protocol C with state variables c(v) ∈ {0, . . . , φ − 1} for each v ∈ V. The variable c(v) represents the value (i.e. phase) of the clock at node v. Let c(v, t) be the clock value node v has at the end of time step t (regardless of whether it was active during that step). We define the distance D between two clock values and the skew ∆ of the clock at the end of step t, respectively, as follows:

D(x, y) = min{ |x − y|, φ − |x − y| }   and   ∆(t) = max_{u,v ∈ V} D( c(u, t), c(v, t) ).

We say that the protocol C implements a (φ, γ, κ)-clock if for all t ≥ 0 the following hold:
(1) Pr[∆(t) ≥ γ] < t/n^κ, and
(2) c(v, t + 1) = c(v, t) + 1 mod φ for exactly one v ∈ V and c(u, t + 1) = c(u, t) for all u ∈ V \ {v}.
Intuitively, the parameter φ is the length of a phase, γ bounds the skew of the clock, and κ is a constant controlling the probability of failure. The above two properties guarantee that the clocks (1) have a skew bounded by γ for polynomially many steps with high probability; and (2) make progress (at some node) in each step. We say that a clock protocol C fails at time step t if the event ∆(t) ≥ γ occurs. Several types of phase clocks have been proposed in the population protocol literature, satisfying various guarantees, e.g. [7, 12, 37, 39, 52]. The above formulation is similar to that of [7] and proves convenient for our analysis.

Let G be a graph and suppose that each node of G contains a bin, which is initially empty. Consider the process where, in each step, a directed edge (u, v) is sampled uniformly at random and a ball is placed into the least loaded of the two bins at the endpoints of the edge (in case of ties, place the ball into bin u).
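The circular distance and skew just defined translate directly into code (a small sketch):

```python
def clock_distance(x, y, phi):
    """D(x, y): distance between two clock values on the cycle of length phi."""
    return min(abs(x - y), phi - abs(x - y))

def skew(clock_values, phi):
    """Delta: the largest pairwise circular distance among the clock values."""
    return max(clock_distance(a, b, phi)
               for a in clock_values for b in clock_values)
```

For example, with φ = 8 the values 0 and 7 are at distance 1, since the clock values wrap around modulo φ.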
Let ℓ(u, t) be the number of balls placed into bin u ∈ V by the end of step t and use

∆*(t) = max_{v ∈ V} ℓ(v, t) − min_{u ∈ V} ℓ(u, t)

to denote the gap between the most and least loaded bin. Peres et al. [49] analysed (a generalisation of) the above load balancing process. Their results imply the following bound for regular graphs (see Appendix A for details).

Lemma 9.
Suppose G = (V, E) is a d-regular graph with n nodes and edge expansion β > 0. For any constant κ > 0, there exists a constant c(κ) such that for all t > 0 the gap satisfies

Pr[ ∆*(t) > c(κ) · (d/β) · log n ] < t/n^κ.
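The graphical two-choice process behind Lemma 9 can be simulated in a few lines. The sketch samples a uniformly random directed edge and places a ball into the lesser-loaded endpoint, breaking ties towards the initiator:

```python
import random
from itertools import combinations

def two_choice(edges, n, steps, rng):
    """Simulate the graphical balanced-allocation process; return the loads."""
    load = [0] * n
    for _ in range(steps):
        u, v = rng.choice(edges)
        if rng.random() < 0.5:        # orient the undirected edge uniformly
            u, v = v, u
        target = v if load[v] < load[u] else u   # ties go to the initiator u
        load[target] += 1
    return load

# Example: the complete graph on 4 nodes.
K4 = list(combinations(range(4), 2))
loads = two_choice(K4, 4, 400, random.Random(3))
```

Even after many steps, the two-choice rule keeps the gap between the most and least loaded bins small, which is exactly the property the phase clock below inherits.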
We use the above result to obtain bounded phase clocks in the PP(G) model. When G is d-regular with edge expansion β, the skew of the phase clock is O((d/β) log n) for polynomially many steps with high probability. For example, for constant-degree expanders, the skew bound is O(log n). We note that this is the only place in our framework where the initiator/responder distinction is used.

Theorem 10.
Suppose G = (V, E) is a d-regular graph with n nodes and edge expansion β > 0. There exists a (φ, γ, κ)-clock for PP(G) that uses φ states per node for any κ > 0 and

φ ≥ 2γ   and   γ ≥ c(κ) · (d/β) · log n.

In particular, the clock fails with probability at most t/n^κ during the first t steps.

Proof. Fix the parameters κ, γ and φ. We devise the clock protocol for PP(G), where each node holds a counter value c(v, t) ∈ [φ]. Each node initialises its clock value to c(v,
0) = 0. Define

M_φ(x, y) = max(x, y) if |x − y| < φ/2, and min(x, y) otherwise.

The clock protocol is now defined by the following update rule. When nodes {u₁, u₂} ∈ E interact, where u₁ is the initiator and u₂ is the responder, they perform the following:
– If c(u₁, t) = c(u₂, t), then the initiator u₁ increments its clock value by one modulo φ.
– Otherwise, the node u_i whose clock value is behind, i.e. the node u_i for which M_φ(c(u₁, t), c(u₂, t)) = c(u_{3−i}, t) holds, increments its clock by one modulo φ.
We argue that this protocol implements a (φ, γ, κ)-clock, i.e., properties (1) and (2) are satisfied. Note that the above rules guarantee that in either case exactly one of the nodes increments its clock value by one modulo φ. This implies property (2).

It remains to show that property (1) holds. Note that for any nonnegative integers a₁ and a₂ such that |a₁ − a₂| < φ/2 we have max(a₁, a₂) = a_i ⟺ M_φ(r₁, r₂) = r_i, where r_i = a_i mod φ. We use this observation to show that if ∆*(t′) < γ holds for all 0 ≤ t′ ≤ t, then in each step t′ both the unbounded and the bounded process increment the counter of the same node. In particular, this implies that ∆*(t′) = ∆(t′), and Lemma 9 yields that Pr[∆*(t) > γ] = Pr[∆(t) > γ] < t/n^κ.

We proceed by induction on t. The base case t = 0 is vacuous, as ∆(0) = ∆*(0) = 0. Suppose the claim holds for some t. Let (u₁, u₂) be the (t + 1)th interaction pair. The induction hypothesis yields that c(u, t) = ℓ(u, t) mod φ for each u ∈ V and ∆(t) = ∆*(t) < γ. Hence, we have that M_φ(c(u₁, t), c(u₂, t)) = c(u_i, t) if and only if max(ℓ(u₁, t), ℓ(u₂, t)) = ℓ(u_i, t). If the counters of u₁ and u₂ differ at step t + 1, then the unbounded and bounded processes increment the counter of the same bin, namely the less loaded one. In case of ties, both processes increment the counter corresponding to u₁.
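In code, the bounded update rule reads as follows (a sketch, under the reading that the node whose clock is circularly behind advances, mirroring a ball going to the lesser-loaded bin):

```python
def m_phi(x, y, phi):
    """The circularly 'ahead' clock value: the maximum when the values are
    close, the minimum once they have wrapped around modulo phi."""
    return max(x, y) if abs(x - y) < phi / 2 else min(x, y)

def interact(c, u1, u2, phi):
    """One clock interaction with initiator u1: on a tie the initiator steps;
    otherwise the node whose value is behind steps (modulo phi)."""
    if c[u1] == c[u2] or m_phi(c[u1], c[u2], phi) == c[u2]:
        c[u1] = (c[u1] + 1) % phi
    else:
        c[u2] = (c[u2] + 1) % phi
```

For instance, with φ = 8 a node at value 7 is behind a node at value 0, so it is the one that advances (wrapping to 0).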
Hence, the unbounded balls-into-bins process and the bounded clock process increment the counter of the same node.

In this section, we give our main technical result: synchronous protocols in the fully-connected token shuffling model can be simulated in the graphical, stochastic population protocol model.

Theorem 2.
Let k > 0 be a constant and A be a synchronous k-token shuffling protocol on n nodes, where X is the set of local states and Y the set of token types used by the protocol A. If A solves the task Π with high probability in R ∈ poly(n) rounds, then there exists a stochastic population protocol B that also solves task Π with high probability on any n-node d-regular graph G with edge expansion β > 0. The step complexity T(B) and state complexity S(B) of the protocol B satisfy

T(B) ∈ O(R · n · ζ) and S(B) ∈ O(|X| · |Y|^k · ζ), where ζ = log n · ((d/β)^2 + τ_mix/n)

and τ_mix is the mixing time of the k-stack interchange process on G.

Notation.
The rest of this section is dedicated to proving this theorem. Throughout, we fix R = R(n) ∈ poly(n) and ε = 1/n^a < 1/(Rn^λ) for a sufficiently large constant a > 0. Let G = (V, E) be a d-regular n-node graph and N = kn. We use µ to denote the increment distribution of the k-stack interchange process on the graph G. The support of µ is the set H ⊆ S_N, we write ν for the uniform distribution on S_N, and τ = τ(ε) is the ε-mixing time of the k-stack interchange process.

We now give a stochastic population protocol that simulates uniform schedules of the synchronous token shuffling model. The protocol simulates the random walk made by the k-stack interchange process, synchronised by phase clocks.

Clock setup.
Let C be a (φ, γ, κ)-clock with parameters given by

γ ∈ O((d/β)^2 log n), φ = γ + ϑ, ϑ = 2τ/n + 3γ, t* = (Rφ + γ)n,

that fails (i.e., the clock skew becomes γ or greater) with probability at most 1/n^λ during the first t* steps. Since φ ≥ γ, R ∈ poly(n), and t* ∈ poly(n) hold, such a protocol exists by Theorem 10 for any constant λ > 0. The fact that t* is polynomially bounded follows from Corollary 1 and from the fact that β ≥ 1/n for any regular connected graph, so τ ≤ ⌈log 1/ε⌉ · τ_mix ∈ poly(n), and hence φ, γ ∈ poly(n).

The token shuffling protocol.
The parameter ϑ is used as a special threshold value for the token shuffling protocol. We assume that each node v holds exactly k tokens, ordered from 0 to k − 1, in the same manner as cards are ordered in the k-stack interchange process. We say that the first token is the top token. We say that node u is receptive whenever its clock satisfies c(u) < ϑ and suspended otherwise. When nodes in {u, v} interact, they apply the following rule:
(1) If both are receptive, that is, c(u) < ϑ and c(v) < ϑ holds, then
(a) Let q(u) and q(v) be the random coin flips of u and v, respectively.
(b) If q(u) = q(v) = 0, then u and v swap their top tokens.
(c) If q(u) ≠ q(v), then the node whose coin flip is 1 moves its top token to the bottom of its stack; the other node does nothing.
(d) If q(u) = q(v) = 1, then do nothing.
(2) Otherwise, do nothing.
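One interaction of this rule can be sketched as follows (a toy sketch with names of our choosing; stacks are deques with index 0 as the top token, and the injected `rng` stands in for the nodes' random bits):

```python
import random
from collections import deque

def shuffle_step(stacks, clocks, u, v, theta, rng=random):
    """One pairwise interaction of the token-shuffling rule.
    stacks[w] is node w's ordered token stack (index 0 = top token);
    a node is receptive iff its clock value is below the threshold theta."""
    if clocks[u] >= theta or clocks[v] >= theta:
        return                                   # a suspended node: do nothing
    qu, qv = rng.randint(0, 1), rng.randint(0, 1)
    if qu == 0 and qv == 0:
        stacks[u][0], stacks[v][0] = stacks[v][0], stacks[u][0]  # swap tops
    elif qu != qv:
        w = u if qu == 1 else v                  # the node whose coin is 1
        stacks[w].rotate(-1)                     # its top token goes to the bottom
    # qu == qv == 1: nothing happens
```

Only one random bit per node per interaction is consumed, matching the remark below about the randomness budget of the framework.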
Figure 4: The dynamics of the shuffling protocol for k = 1. Circles filled with white and red denote receptive and suspended nodes, respectively. The blue arrows connect nodes that exchange their tokens in the given step. Red lines denote steps where at least one of the interacting nodes is suspended, and thus no swap is made. (a) Initially all nodes are receptive and swap tokens with their interaction partners. After sufficiently many interactions, nodes become suspended and refrain from swapping tokens. (b) Eventually all nodes are suspended. The highlighted panel shows the resulting permutation, which will act as the interaction pattern for the simulated round. (c) As the phase clocks reset back to 0, nodes become receptive again, and the tokens are shuffled once more.

We note that the protocol uses at most one random bit per node per interaction and that this is the only part of our framework where the random bits provided to the nodes are used. The interacting nodes exchange at most 4 bits (i.e., whether they are receptive or not, and the results of their coin flips) in addition to the contents of the swapped tokens in Step (1b). Finally, observe that when all nodes are receptive, the tokens are shuffled according to the increment distribution µ of the k-stack interchange process on G. Figure 4 illustrates the dynamics of the shuffling protocol in the case k = 1.

We now analyse the above shuffling protocol. Let c(u, t) denote the clock value of node u at the end of step t, with c(u, 0) = 0 and t(v, 0) = 0. We say that the clock of node u resets at time step t if its value transitions from φ − 1 to 0. For r ≥ 0, define
– t(v, r + 1) = min{t > t(v, r) : c(v, t) = 0}; the step when v resets its clock for the rth time,
– t_min(r) = min{t(v, r) : v ∈ V}; the earliest step when some clock is reset for the rth time,
– t_max(r) = max{t(v, r) : v ∈ V}; the latest step when some clock is reset for the rth time.
Similarly, we define the times with respect to the events when the clocks reach the value ϑ:
– s(v, r) = min{t > t(v, r) : c(v, t) = ϑ},
– s_min(r) = min{s(v, r) : v ∈ V},
– s_max(r) = max{s(v, r) : v ∈ V}.

Lemma 11. With high probability, the following inequalities hold:
(1) t_max(R + 1) ≤ t* = (Rφ + γ)n,
(2) s_min(r) − t_max(r) ≥ τ for each 1 ≤ r ≤ R,
(3) t_max(r) < s_max(r) < t_min(r + 1) for each 1 ≤ r ≤ R.

Proof. Recall that the clock protocol works correctly with high probability for the first t* steps. We now assume that this event occurs.

For the first claim, we show that all nodes have incremented their clock at least Rφ times after t* steps. For the sake of contradiction, suppose that some node v has incremented its clock fewer than Rφ times during the first t* steps. By the second property of the clock protocol, in every step 1 ≤ t ≤ t* some node increments its clock value by one (modulo φ). Hence, the nodes in V \ {v} have incremented their clocks at least (Rφ + γ)(n − 1) times in total. By the pigeonhole principle, some node u ≠ v has incremented its clock at least Rφ + γ times. However, this contradicts the property that the clock skew is less than γ in each step 1 ≤ t ≤ t*. Since each node has incremented its clock at least Rφ times, each node has reset its clock R times, so t_max(R + 1) ≤ t*.

For the second claim, observe that during an interval of 2γn steps there must exist a node that has incremented its clock 2γ times by the pigeonhole principle.
By the first property of the clock protocol the skew is less than γ, so we get that

t_max(r) < t_min(r) + 2γn.

Again, since the skew of the clock is less than γ, and in each step at most one node increments its clock counter, the time until some node reaches the clock value ϑ after step t_min(r) satisfies

s_min(r) ≥ t_min(r) + (ϑ − γ)n.

Combining these two bounds and recalling that ϑ = 2τ/n + 3γ yields

s_min(r) − t_max(r) > (ϑ − γ)n − 2γn = (ϑ − 3γ)n = 2τ ≥ τ.

Finally, the third claim follows from the fact that φ = γ + ϑ and that the skew is bounded by γ.

Distribution of tokens.
We now show that the distribution of the tokens mixes to an ε-uniform distribution during the intervals {t_max(r) + 1, ..., s_min(r)} for 1 ≤ r ≤ R. Let π_0 = id and let π_t denote the locations of the tokens after t steps of the shuffling protocol. Define σ_0 = id and

σ_r = π_{s_max(r)} for 1 ≤ r ≤ R.

Observe that σ_r = ρ_3 · ρ_2 · ρ_1 · σ_{r−1}, where each ρ_i is a product of elements from the support H ⊆ S_N of the increment distribution µ of the k-stack interchange process:
– ρ_1 = x_{t_max(r)} ··· x_{s_max(r−1)+1} (only a subset of nodes have become receptive for the rth time),
– ρ_2 = x_{s_min(r)} ··· x_{t_max(r)+1} (all nodes are receptive),
– ρ_3 = x_{s_max(r)} ··· x_{s_min(r)+1} (a subset of nodes have become suspended for the rth time).
(Recall that permutations are applied from right to left.) Observe that while each x_i is a random element of H, only the elements of ρ_2 are guaranteed to be distributed according to the increment distribution µ of the k-stack interchange process. The elements of ρ_1 and ρ_3 are skewed towards the identity permutation, as some nodes are suspended whenever their clock values are in {ϑ, ..., φ − 1}. The next lemma establishes that this does not interfere with the mixing behaviour.

Lemma 12. Let 0 ≤ r < R. For any A ⊆ S_N, we have |Pr[σ_{r+1} ∈ A | σ_r] − ν(A)| ≤ ε.

Proof. Suppose σ_r is given. For brevity, let π = ρ_2 · ρ_1 · σ_r, so that σ_{r+1} = ρ_3 · π. Define p(x) = Pr[π = x | σ_r] and p'(x) = Pr[σ_{r+1} = x | σ_r]. Observe that |p(A) − ν(A)| ≤ ε, as ρ_2 is given by a sequence of at least τ elements sampled according to the increment distribution µ. We show that |p'(A) − ν(A)| ≤ ε. Let y · A denote the set {yx : x ∈ A}.
By expanding p'(A) using conditional probabilities, we can write

p'(A) = Pr[ρ_3 · π ∈ A | σ_r]
      = Σ_{y ∈ S_N} Pr[π ∈ y^{-1} · A and ρ_3 = y | σ_r]
      = Σ_{y ∈ S_N} Pr[ρ_3 = y | π ∈ y^{-1} · A, σ_r] · Pr[π ∈ y^{-1} · A | σ_r]
      = Σ_{y ∈ S_N} q(y) · p(y^{-1} · A),

where q(y) = Pr[ρ_3 = y | π ∈ y^{-1} · A, σ_r] is a probability distribution on S_N, so q(S_N) = Σ_y q(y) = 1. Since ν(A) = ν(z · A) for any z ∈ S_N, applying the triangle inequality gives

|p'(A) − ν(A)| = |Σ_{y ∈ S_N} q(y) · p(y^{-1} · A) − ν(A)|
             = |Σ_{y ∈ S_N} q(y) · [p(y^{-1} · A) − ν(y^{-1} · A)]|
             ≤ Σ_{y ∈ S_N} q(y) · |p(y^{-1} · A) − ν(y^{-1} · A)|
             ≤ Σ_{y ∈ S_N} q(y) · ε ≤ ε.

Using the shuffling protocol in the population protocol model, we can simulate an R-round algorithm A in the synchronous k-token shuffling model. Let f : X × Y^k → X be the state transition function and g : X → Y^k be the token generation function of the algorithm A. Recall that the sets X and Y denote the set of local state variables and token types, respectively.

The simulation protocol.
Each node v keeps a variable a(v) ∈ X to simulate the local state of the synchronous protocol A. In addition, the node holds k variables b_0(v), ..., b_{k−1}(v) ∈ Y that are used to store the sent and received tokens. The variable a(v) is initialised to the initial state x_0(v) of node v in the algorithm A, and b_0(v), ..., b_{k−1}(v) are initialised to the values given by g(x_0(v)). When node v interacts (in the asynchronous population protocol model), v updates its state according to the following rules:
(1) Run the clock and the shuffling protocol, using b_0(v), ..., b_{k−1}(v) to hold the k tokens.
(2) If c(v) = ϑ, then update the simulated state variable and generate new tokens by setting a(v) ← f(a(v), b_0(v), ..., b_{k−1}(v)) and b_0(v), ..., b_{k−1}(v) ← g(a(v)).

We show that the above algorithm simulates an execution of the synchronous algorithm A under the schedule σ_1, ..., σ_R given by the shuffling protocol. To this end, define x_0(v) = a(v, 0) and x_r(v) = a(v, s(v, r)) for all 1 ≤ r ≤ R.

Lemma 13.
The sequence (x_r)_{0 ≤ r ≤ R} is an execution induced by the schedule (σ_r)_{1 ≤ r ≤ R}.

Proof. Observe that each node v updates the variables a(v), b_0(v), ..., b_{k−1}(v) only during the steps s(v, 1), ..., s(v, R), when its local clock has reached the threshold value ϑ. Lemma 11 implies that with high probability every node updates these variables for the rth time during the interval {s_min(r) + 1, ..., s_max(r) + 1} for any 1 ≤ r ≤ R. In particular, this happens before step t_min(r + 1), when the first node becomes receptive for the (r + 1)th time. By letting y_{r+1}(v, i) be the value of b_i(v) after being updated for the rth time and y'_{r+1} = y_{r+1} ∘ σ_{r+1}, we get that the configuration x_{r+1} satisfies, with high probability, x_{r+1}(v) = f(x_r(v), y'_{r+1}(v, 0), ..., y'_{r+1}(v, k − 1)). Thus, the sequence given by x_0, ..., x_R is an execution induced by the schedule σ_1, ..., σ_R.

The schedules provided by the shuffling protocol are only ε-uniform, as the shuffling process is executed for finitely many steps. As our last stepping stone, we show that this does not matter: any synchronous protocol behaves statistically similarly under ε-uniform and uniform schedules. To formalise this, let Φ be the distribution of the sequence (σ_1, ..., σ_R) ∈ S_N^R of permutations generated by the shuffling protocol under the assumption that the clock protocol works correctly for the first t* time steps. Let ν_R = ν × ··· × ν denote the distribution of a sequence of R independently and uniformly sampled random permutations from S_N. That is, ν_R is the distribution of the uniform R-round schedules. We start with the following algebraic inequality.

Lemma 14.
Let a_i, b_i ∈ R_+ for 1 ≤ i ≤ t. Then

|∏_{i=1}^{t} a_i − ∏_{j=1}^{t} b_j| ≤ Σ_{i=1}^{t} |a_i − b_i| (∏_{k=1}^{i−1} a_k)(∏_{h=i+1}^{t} b_h).

Proof.
For all 0 ≤ i ≤ t, define

c_i = (∏_{k=1}^{i} a_k)(∏_{h=i+1}^{t} b_h) and d_i = (a_i − b_i)(∏_{k=1}^{i−1} a_k)(∏_{h=i+1}^{t} b_h).

The claim follows by observing that d_i = c_i − c_{i−1} holds and

|∏_{i=1}^{t} a_i − ∏_{j=1}^{t} b_j| = |c_t − c_0| = |Σ_{i=1}^{t} (c_i − c_{i−1})| ≤ Σ_{i=1}^{t} |c_i − c_{i−1}| = Σ_{i=1}^{t} |d_i|.

Lemma 15.
The total variation distance between Φ and ν_R satisfies ||Φ − ν_R||_TV ≤ εR.

Proof. Let A = A_1 × ··· × A_R ⊆ S_N^R. Since the sequence σ_1, ..., σ_R is Markovian, we can write

Φ(A) = Pr[(σ_1, ..., σ_R) ∈ A] = Pr[σ_1 ∈ A_1] · ∏_{i=2}^{R} Pr[σ_i ∈ A_i | σ_{i−1} ∈ A_{i−1}] = ∏_{i=1}^{R} φ_i(A),

where φ_1(A) = Pr[σ_1 ∈ A_1] and φ_i(A) = Pr[σ_i ∈ A_i | σ_{i−1} ∈ A_{i−1}] for i > 1. Recall that σ_0 = id. For notational convenience, let ν_i(A) = ν(A_i). By applying Lemma 14, we obtain

|Φ(A) − ν_R(A)| ≤ Σ_{i=1}^{R} |φ_i(A) − ν_i(A)| (∏_{k=1}^{i−1} φ_k(A))(∏_{h=i+1}^{R} ν_h(A))
             = Σ_{i=1}^{R} |Pr[σ_i ∈ A_i | σ_{i−1} ∈ A_{i−1}] − |A_i|/N!| (∏_{k=1}^{i−1} φ_k(A))(∏_{h=i+1}^{R} ν_h(A))
             ≤ Σ_{i=1}^{R} ε (∏_{k=1}^{i−1} φ_k(A))(∏_{h=i+1}^{R} ν_h(A)) ≤ Σ_{i=1}^{R} ε = εR,

where the second inequality follows from Lemma 12 and the last from the fact that the products are over probabilities. The claim now follows, as

||Φ − ν_R||_TV = max_{A ⊆ S_N^R} |Φ(A) − ν_R(A)| ≤ εR.

We are almost ready to show our main technical result. We recall the following standard result.
Lemma 16.
Let µ and ν be probability distributions over a finite domain Ω. For any function F : Ω → Ω', the total variation distance satisfies ||F(µ) − F(ν)||_TV ≤ ||µ − ν||_TV.

With all the pieces now in place, we can establish our simulation theorem.
Theorem 2.
Let k > 0 be a constant and A be a synchronous k-token shuffling protocol on n nodes, where X is the set of local states and Y the set of token types used by the protocol A. If A solves the task Π with high probability in R ∈ poly(n) rounds, then there exists a stochastic population protocol B that also solves task Π with high probability on any n-node d-regular graph G with edge expansion β > 0. The step complexity T(B) and state complexity S(B) of the protocol B satisfy

T(B) ∈ O(R · n · ζ) and S(B) ∈ O(|X| · |Y|^k · ζ), where ζ = log n · ((d/β)^2 + τ_mix/n)

and τ_mix is the mixing time of the k-stack interchange process on G.

Proof. Let A be the synchronous k-token shuffling protocol. Since the protocol works with high probability, assume it succeeds with probability at least p ≥ 1 − 1/n^h, where h is a constant we choose later. Using the simulation protocol, we construct a graphical population protocol B with the claimed properties. Recall that ε = 1/n^a, where a was an arbitrary constant. We set a so that εR ≤ 1/n^λ holds. By Lemma 13 the shuffling protocol simulates the execution of A induced by an R-round ε-uniform synchronous schedule (σ_r)_{1 ≤ r ≤ R} with high probability. This takes at most t* = (Rφ + γ)n steps by Lemma 11.

Recall that φ ∈ O(γ + τ/n), where γ is the bound on the clock skew and τ = τ(ε) is the ε-mixing time of the k-stack interchange process. Since τ ≤ ⌈log 1/ε⌉ · τ_mix ∈ O(log n · τ_mix), we get from Theorem 10 the following bounds:

γ ∈ O((d/β)^2 log n), φ ∈ O(log n · ((d/β)^2 + τ_mix/n)), t* ∈ O(Rn log n · ((d/β)^2 + τ_mix/n)).

The bound on t* establishes the claimed bound on the step complexity of B. For the state complexity, note that each node v stores the variables for the clock c(v) ∈ [φ], the local state a(v) ∈ X of the simulated protocol A, and the k tokens b_0(v), ..., b_{k−1}(v) ∈ Y.
This takes φ · |X| · |Y|^k states, establishing the bound on the state complexity.

It remains to argue that B solves the task Π with probability at least p − 1/n^λ. The output of algorithm A on input z under any R-round synchronous schedule Ξ is given by F_z(Ξ), where F_z is a computable function. Let D = F_z(Φ) and D' = F_z(ν_R) be the probability distributions of outputs in the executions of the algorithm A induced, respectively, by the simulated ε-uniform schedules given by Φ and the uniform schedules given by ν_R. By Lemma 15 and Lemma 16, we have that

||D − D'||_TV = ||F_z(Φ) − F_z(ν_R)||_TV ≤ ||Φ − ν_R||_TV ≤ εR ≤ 1/n^λ.

Therefore, the probability that the output z' under the execution induced by the ε-uniform schedule on input z satisfies z' ∈ Π(z) is

D(Π(z)) ≥ D'(Π(z)) − 1/n^λ ≥ p − 1/n^λ ≥ 1 − 1/n^h − 1/n^λ.

Thus, the output of protocol B is feasible with high probability. Since A stabilises in R rounds, nodes can set the output of B to be the output of A at the end of the Rth simulated round. Thus, the output of B stabilises as well.

We now give fast (and simple) algorithms in the fully-connected synchronous token shuffling model. These can be transported to the graphical, asynchronous population protocol setting using Theorem 2. We illustrate two types of algorithms:
(1) A leader election algorithm that uses one-way communication with k > 1 tokens. The protocol uses a one-way information dissemination protocol and a protocol for generating synthetic coins in the token shuffling model.
(2) An exact majority algorithm simulating two-way interactions in a population of 2n virtual agents. The algorithm uses the classic cancellation-doubling dynamics.
While we use straightforward adaptations of ideas from prior algorithms (see e.g.
[36] for a general overview), for the sake of completeness, we provide a full analysis of the algorithms we use in the synchronous token shuffling model.

Before we proceed, we establish some useful lemmas and facts. We start with the following observation about a particular quadratic recurrence.
Lemma 17.
For any n > x > 0 and r ≥ 0, the expression g(r) = n(x/n)^{2^r} is the closed-form solution of the quadratic recurrence

g(r) = g(r − 1)^2 / n if r > 0, and g(0) = x.

Moreover, g(r) ≤ 1/n^λ holds for all r ≥ log n + log ln n + log(λ + 1).

Proof. We show the identity via induction. The base case r = 0 is given by g(0) = n(x/n)^{2^0} = x. For the inductive step, we have

g(r + 1) = g(r)^2 / n = (1/n)[n(x/n)^{2^r}]^2 = n(x/n)^{2 · 2^r} = n(x/n)^{2^{r+1}}.

The second claim follows from the inequality (1 − 1/n)^{nx} ≤ e^{−x}, since

g(r) ≤ n(1 − 1/n)^{2^r} ≤ n(1 − 1/n)^{(λ+1)n ln n} ≤ ne^{−(λ+1) ln n} ≤ 1/n^λ.

Remark 18. We make use of the following elementary facts.
– The law of total expectation: for random variables X and Y on the same probability space, E[X] = E[E[X | Y]].
– Markov's inequality: for any nonnegative random variable X and real value a > 0, Pr[X ≥ a] ≤ E[X]/a.
– The union bound: for any events A_1, ..., A_n, we have Pr[∪_{i=1}^{n} A_i] ≤ Σ_{i=1}^{n} Pr[A_i].

We start by adapting a classic broadcast primitive to the k-token shuffling model. This protocol often goes by the names of "one-way epidemics" and "rumour spreading". We assume each node v is given an input z(v) from a set Σ with a total order on the values. The protocol computes the maximum value given as input. This can be used as an information dissemination protocol or to agree on a common value given as input.

One-way epidemics protocol.
The algorithm works in the k-token shuffling model for any k > 0. At the start of the protocol, each node v initialises a local state variable a(v) to its input value z(v). In every round, each node v performs the following steps:
(1) Generate k tokens of type a(v).
(2) Use one round to shuffle the generated tokens.
(3) After receiving k tokens y_0, ..., y_{k−1}, set a(v) ← max{a(v), y_0, ..., y_{k−1}}.
The number of states and token types used by the algorithm is |Σ|.

Lemma 19.
After O(log n) rounds, every node v satisfies a(v) = max_{u ∈ V} z(u) with high probability.

Proof. Let U_r denote the set of nodes that have not set their local state variable a(·) to the maximum input value after r rounds. Fix a constant λ > 0 and let R = ⌈log((λ + 1) n ln n)⌉ ∈ O(log n). We prove the lemma by showing that for all r ≥ R the probability that U_r is nonempty is at most 1/n^λ.

For each v ∈ V, let Y_r(v) be the indicator variable for the event that node v receives at least one token with the maximum input value in round r. By Step (3) of the protocol, if this event occurs, then v sets a(v) to the maximum input value at the end of round r. In particular, this implies that v ∉ U_{r'} for all r' ≥ r. Note that Pr[Y_{r+1}(v) = 1 | |U_r| = b] ≥ 1 − b/n. Define the random variable X_r = |U_r| for each r ≥ 0. By linearity of expectation, we have

E[X_{r+1} | X_r = b] = b − E[Σ_{v ∈ U_r} Y_{r+1}(v)] = b − Σ_{v ∈ U_r} E[Y_{r+1}(v)] ≤ b − b(1 − b/n) = b^2/n.

By the law of total expectation, we get E[X_r] = E[E[X_r | X_{r−1}]] ≤ g(r), where g(r) is the recurrence of Lemma 17 with x = |U_0| < n. By Markov's inequality and the second claim of Lemma 17, we get Pr[X_R ≥ 1] ≤ E[X_R] ≤ g(R) ≤ 1/n^λ.

We now consider the leader election problem, where the goal is to select a single node as a leader. Again, we adapt a well-known strategy used by leader election protocols in the standard population protocol model: each leader candidate iteratively (1) flips a random coin and (2) becomes a follower if another leader candidate had a coin flip with a larger value [36, 37].

In order to implement step (1) we need access to random bits. However, recall that by our definitions, the state transition and token generation functions in the token shuffling model are deterministic.
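Before continuing, a round of the one-way epidemics protocol of Lemma 19 can be sketched in a few lines of Python (a toy simulation with names of our choosing; the token shuffle is idealised as a uniformly random reassignment of the kn tokens):

```python
import random

def epidemics_round(state, k, rng=random):
    """One synchronous round of one-way epidemics in the k-token shuffling
    model: every node emits k tokens carrying its current value, all k*n
    tokens are reshuffled, and each node keeps the maximum of its own
    value and the k values it receives."""
    tokens = [a for a in state for _ in range(k)]   # k tokens per node
    rng.shuffle(tokens)                             # idealised uniform shuffle
    return [max(state[v], *tokens[k * v:k * (v + 1)])
            for v in range(len(state))]
```

Iterating this round O(log n) times spreads the maximum input value to every node with high probability, as in Lemma 19.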
While we could "lift" random bits from the underlying stochastic population protocol model, we instead opt for a clean method for generating synthetic coin flips in the k-token shuffling model with k > 1.

Synthetic coin flips.
Let k > 1 and consider the k-token shuffling model, where each node v receives exactly k tokens. Recall that these tokens are ordered from 0 to k − 1. We leverage this property to generate synthetic coin flips in one round as follows:
(1) Each node generates a single 0-token and a single 1-token.
(2) Use one round to shuffle the generated tokens.
(3) Output the value of the first token.
While the coin flips of different nodes are not independent, the probability that a node outputs 1 is 1/2. Thus, in expectation half of the nodes output 1.

Leader election protocol.
We assume that the input specifies a (nonempty) subset of nodes that start as leader candidates. Let ℓ(v) ∈ {0, 1} be a local variable of node v denoting whether it considers itself a leader candidate. In a single iteration, each node v executes the following:
(1) Generate a synthetic coin flip b(v) in one round.
(2) Run Θ(log n) rounds of the broadcast protocol of Lemma 19 with input ℓ(v) · b(v) ∈ {0, 1}.
(3) If ℓ(v) = 1 and b(v) = 0, then set ℓ(v) ← 0 if the broadcast protocol had output 1.
In every round, node v uses the value ℓ(v) as its current output value. Each iteration of the protocol takes Θ(log n) rounds, and the protocol uses Θ(log n) states and constantly many token types. We show that with high probability, the protocol reduces the number of leader candidates to one after O(log n) iterations, and hence, in O(log^2 n) rounds. The remaining candidate is the elected leader.

Note that a leader candidate ceases to be a candidate only if its local coin flip was 0 and the broadcast protocol informs it that some other leader candidate had a local coin flip with value one. Thus, we never end up in a situation where there are no leader candidates remaining.

Theorem 20.
There is a synchronous 2-token shuffling protocol for the leader election task that stabilises in O(log^2 n) rounds with high probability, uses O(log n) states per node and two token types.

Proof. Assume that the broadcast protocol in Step (2) succeeds in each of the first Θ(log n) iterations of the leader election protocol; this event happens with high probability. Let L_i be the (random) set of leader candidates after the ith iteration and X_i = |L_i| the random variable indicating the number of leader candidates after the ith iteration.

For each leader candidate v ∈ L_i, let B_i(v) be a random variable indicating whether node v had 1 as its (i + 1)th coin flip. Note that E[B_i(v)] = 1/2 for any v ∈ L_i. Let p(a) be the probability that each of the |L_i| = a leader candidates has 0 as its coin flip. If this event occurs, then no leader candidate gets removed. Observe that p(a) ≤ 1/2^a. Assuming a > 1 and using linearity of expectation, we get

E[X_{i+1} | X_i = a] = a · p(a) + (1 − p(a)) · E[Σ_{v ∈ L_i} B_i(v)] = a[p(a) + (1 − p(a))/2] ≤ 3a/4.

Thus, by the law of total expectation, E[X_t] ≤ n(3/4)^t holds. For any constant λ > 0 we can set t = (λ + 1) log_{4/3} n. By Markov's inequality,

Pr[X_t > 1] ≤ E[X_t] ≤ n(3/4)^t ≤ 1/n^λ.

Hence, with high probability, only one leader candidate remains after t ∈ O(log n) iterations. As each iteration takes Θ(log n) rounds, the algorithm stabilises in O(log^2 n) rounds with high probability, as desired. The state complexity bound comes from the fact that in Step (2) nodes count up to Θ(log n) rounds. The algorithm uses two token types.

We now obtain a protocol for the exact majority task in the 2-token shuffling model. We use the 2-token shuffling model to simulate a cancellation-doubling population protocol in a population of 2n virtual agents, where the agents interact synchronously according to a randomly chosen perfect matching. In every round, each node receives two tokens of types A and B, and generates two new tokens for the next round by applying a rule of the form A + B → C + D. The rules used guarantee that with high probability all tokens get converted to the value held by the initial majority of input values.
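A round of this style of simulation can be sketched as follows (a toy sketch: the shuffle is idealised as a uniformly random pairing of the 2n tokens, `'E'` is our stand-in name for the empty token ∅, and only the cancellation rule is shown):

```python
import random

def majority_round(tokens, rule, rng):
    """One synchronous round of the two-token simulation: the 2n tokens
    are reshuffled, and each of the n nodes applies a two-token rule
    A + B -> C + D to the pair of tokens it now holds."""
    rng.shuffle(tokens)                      # idealised uniform shuffle
    out = []
    for i in range(0, len(tokens), 2):       # each consecutive pair = one node
        out.extend(rule(tokens[i], tokens[i + 1]))
    return out

def cancel(a, b):
    """Cancellation rule: opposite opinions annihilate into empty tokens."""
    if {a, b} == {'0', '1'}:
        return ('E', 'E')
    return (a, b)
```

Since every application of `cancel` removes one token of each opinion, the discrepancy between majority and minority counts is preserved, which is the invariant the analysis below relies on.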
Hence, as output of the protocol, each node v can use an (arbitrary) value held by one of its tokens.

The exact majority protocol.
Let N = 2n and t = (λ + 1) log_{4/3} N, where λ > 0 is an arbitrary constant. Each node v initially creates two tokens that take the input value z(v) ∈ {0, 1}. After this, the algorithm consists of repeatedly running the following rules:
(1) For t consecutive rounds, apply the cancellation rules Z + Z̄ → ∅ + ∅ for Z ∈ {0, 1}, where Z̄ = 1 − Z.
(2) For t consecutive rounds, apply the doubling rules Z + ∅ → Z/2 + Z/2 for Z ∈ {0, 1}.
(3) Apply the promotion rule Z/2 → Z for each token of type Z/2 with Z ∈ {0, 1}.

Figure 5: The cancellation-doubling dynamics with 2n = 14 tokens and n = 7 nodes. Blue tokens have the initial majority. (a) A single round of the cancellation phase. White rectangles represent empty tokens. (b) Two rounds of the doubling phase. The small circular tokens are split tokens. (c) The promotion rule promotes all split tokens into full tokens at the end of the doubling phase.

Step (1) is called the cancellation phase and Step (2) the doubling phase. The protocol uses exactly five types of tokens: 0, 1, 0/2, 1/2 and ∅. Tokens of type Z ∈ {0, 1} represent an "opinion" on what the majority value is. The tokens of type Z/2 are called split tokens. A token of type ∅ is called an empty token. The idea is that (1) opposing opinions cancel out during the cancellation phase and (2) the number of majority tokens doubles in the doubling phase.

In each round, every node holds two tokens y_0 and y_1. If one of the tokens is nonempty, then the node outputs the largest value held by its nonempty tokens. Otherwise, if both tokens of a node are empty, i.e., y_0 = y_1 = ∅, then the node outputs its input value (in this case the protocol has not yet stabilised). We show that the algorithm stabilises in O(log N) = O(log n) rounds with high probability, i.e., the system reaches a configuration where all generated tokens take the majority input value.

Analysis.
We trace the usual steps taken in analysing cancellation-doubling dynamics [7, 12, 36]. Let A_i and B_i denote the number of majority and minority tokens after i iterations, respectively. Define the discrepancy as ∆_i = A_i − B_i, with ∆_0 > 0 being the initial discrepancy. In addition, we use A'_i, B'_i, and C_i to denote the number of majority, minority, and empty tokens after the ith cancellation phase.

The algorithm maintains the invariant A_i > B_i with high probability: the cancellation rule removes exactly one majority and one minority token each time it is applied. The doubling phase guarantees that A_{i+1} = 2A'_i and B_{i+1} = 2B'_i hold with high probability. Finally, once A_i = N holds, that is, all tokens have taken the majority value, then A_{i+1} = N holds. This ensures that once all tokens are of the same type, they remain so for all subsequent rounds.

Lemma 21. Let i ≥ 0. If A_i > B_i holds, then one of the following holds with high probability:
(1) B'_i = 0, that is, there are no minority tokens after the ith cancellation phase, or
(2) C_i ≥ N/2, that is, there are at least N/2 empty tokens after the ith cancellation phase.

Proof. Observe that B_i < A_i < N/4 implies the second condition C_i ≥ N/2. Thus, assume that A_i ≥ N/4 holds. Consider the event that a fixed minority token b is not cancelled during the t rounds of the cancellation phase, conditioned on there being at least N/4 majority tokens in each round of this phase. The probability that b meets a majority token in a given round is at least 1/4. Hence, the probability that b is not cancelled during any of these t rounds is at most

(1 − 1/4)^t = (3/4)^t = 1/N^{λ+1}.

Taking the union bound over all B_i < N/2 minority tokens yields that either (1) all minority tokens get cancelled with probability at least 1 − 1/N^λ, or (2) there are fewer than N/4 majority tokens remaining after the ith cancellation phase. In the latter case, B'_i ≤ A'_i < N/4, and hence C_i ≥ N − 2 · N/4 = N/2. This proves the lemma.

Lemma 22.
If C_i ≥ N/2 holds, then ∆_{i+1} = 2∆_i holds with high probability.

Proof. We say that a token of type Z ∈ {0, 1} splits if it activates the rule Z + ∅ → Z/2 + Z/2. Observe that after a nonempty token of type Z splits during the i-th doubling phase, it becomes a token of type Z/2 and cannot split again before the (i + 1)-th doubling phase.

Recall that, by assumption, there are C_i ≥ N/2 empty tokens. Hence, the probability that a nonempty token splits in a single round of the doubling phase is at least 1/4, since at most N/4 nonempty tokens can split (each removing an empty token from the system). Therefore, the probability that a nonempty token does not split is at most

(1 − 1/4)^t = (3/4)^t = 1/N^{λ+1}.

By the union bound, the probability that some nonempty token does not split is at most 1/N^λ. Consequently, ∆_{i+1} = 2(A′_i − B′_i) = 2(A_i − B_i) = 2∆_i.

Bounding the number of iterations.
Define the following two random variables:

K_1 = min{ i : B_i = 0 } and K_2 = min{ i : A_i = N }.

Here K_1 indicates the iteration after which no tokens taking the minority value are present anymore. The variable K_2 is the first iteration by which all tokens have been converted to the majority value. Note that K_2 ≥ K_1, since even after no minority tokens remain, some empty tokens may remain after the doubling phases.

Lemma 23.
The random variable K_1 satisfies K_1 ≤ ⌈log N⌉ + 1 with high probability.

Proof. Let K = ⌈log N⌉ + 1. Suppose that B_i > 0 holds for all 0 ≤ i ≤ K. By Lemma 21, C_i ≥ N/2 holds with high probability. By Lemma 22, we then get that ∆_{i+1} = 2∆_i, and hence ∆_i ≥ 2^i, with high probability. This implies that ∆_K > N, contradicting the assumption that B_K > 0.

Lemma 24.
The random variable K_2 satisfies K_2 ∈ O(log N) with high probability.

Proof. Note that if K_2 = K_1, then the claim holds by Lemma 23. Hence, assume that K_2 > K_1 holds and that K_1 is fixed. For K_1 ≤ i < K_2, let U_i be the set of empty tokens at the end of iteration i. Let Y_i(u) be an indicator variable for the event that u ∈ U_i gets consumed during the first round of iteration i + 1. Since Pr[Y_i(u) = 1 | C_i = c] = 1 − c/N holds, the expected number of empty tokens after iteration i + 1 satisfies

E[C_{i+1} | C_i = c] ≤ c − E[ Σ_{u ∈ U_i} Y_i(u) ] = c − Σ_{u ∈ U_i} E[Y_i(u)] = c − c(1 − c/N) = c²/N.

By the law of total expectation, E[C_{i+1}] = E[ E[C_{i+1} | C_{K_1}] ] ≤ g(i + 1 − K_1), where g is the recurrence from Lemma 17 with x_0 = C_{K_1} and n = N. For t′ = log N + log(λ + 1) + log ln N, the second claim of Lemma 17 yields

Pr[ C_{t′+K_1} ≥ 1 ] ≤ E[ C_{t′+K_1} ] ≤ g(t′) ≤ 1/N^λ ≤ 1/n^λ.

Hence, K_2 ≤ K_1 + t′ ∈ O(log N) with high probability.

Theorem 25.
There is a synchronous 2-token shuffling protocol for the exact majority task that stabilises in O(log² n) rounds with high probability, uses O(log n) states, and five token types.

Proof. By Lemma 24, the system reaches a configuration in which all tokens take the majority value within O(log N) iterations with high probability. As each iteration takes O(log N) = O(log n) rounds, the algorithm stabilises in O(log² n) rounds with high probability. The state complexity is O(log n), as the nodes count up to t ∈ O(log n) in each iteration and there are only 5 token types.

Finally, we address the following technical detail: our simulation framework and the simulated synchronous algorithms are guaranteed to work correctly and stabilise only with high probability, and therefore, the protocols may fail with low probability. To obtain always-correct protocols, i.e., ones with finite expected stabilisation time, we specify "backup protocols", which are run in the unlikely cases where either the simulation framework fails (e.g. the phase clocks become desynchronised) or the fast synchronous algorithm fails. This problem also occurs in the context of fast clique-based algorithms, e.g. [3, 7], and we adopt similar mitigation strategies.
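As a quick sanity check, the cancellation-doubling dynamics are easy to simulate. The Python sketch below is ours and not part of the paper: it replaces the 2-token shuffling protocol with an idealised uniform random pairing of the N = 2n tokens in every round, so it only illustrates the token dynamics, not the graphical simulation; the token names and seed are arbitrary choices.

```python
import math
import random

def cancellation_doubling(inputs, lam=2, seed=0):
    """Sketch of the cancellation-doubling dynamics on N = 2n tokens.

    Tokens: '0', '1' (opinions), '0/2', '1/2' (split), 'E' (empty).
    A uniform random pairing of all tokens stands in for the shuffling protocol.
    """
    rng = random.Random(seed)
    tokens = [str(z) for z in inputs for _ in range(2)]  # two tokens per node
    N = len(tokens)
    t = (lam + 1) * math.ceil(math.log(N, 4 / 3))

    def run_rounds(rule, reps):
        for _ in range(reps):
            rng.shuffle(tokens)
            for i in range(0, N, 2):
                tokens[i], tokens[i + 1] = rule(tokens[i], tokens[i + 1])

    def cancel(a, b):  # cancellation: 0 + 1 -> E + E
        return ('E', 'E') if {a, b} == {'0', '1'} else (a, b)

    def double(a, b):  # doubling: Z + E -> Z/2 + Z/2
        if a in '01' and b == 'E':
            return a + '/2', a + '/2'
        if b in '01' and a == 'E':
            return b + '/2', b + '/2'
        return a, b

    for _ in range(2 * (math.ceil(math.log2(N)) + 1)):  # O(log N) iterations
        run_rounds(cancel, t)   # cancellation phase
        run_rounds(double, t)   # doubling phase
        tokens[:] = [y[0] if y.endswith('/2') else y for y in tokens]  # promotion
    return tokens

out = cancellation_doubling([1] * 60 + [0] * 40)
assert '1' in out              # the majority opinion survives (deterministically)
assert set(out) <= {'1', 'E'}  # minority tokens are gone with high probability
```

Running this with a 60/40 split of inputs leaves only majority-opinion and empty tokens, matching the behaviour established by Lemmas 21–24.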
Switching to backup protocols.
Note that since the probability of failure of the fast protocols can be made polynomially small, to get polynomial expected stabilisation time it suffices to have a backup protocol with polynomial expected stabilisation time and small state complexity.

The backup protocol, if necessary, is initiated as follows. If some node notices disagreement or inconsistent states after the fast protocol's supposed stabilisation time, it initiates a signalling message, which is propagated further by all nodes that receive it. This signal forces all nodes to switch to executing the reliable (but potentially slow) backup protocol. Since the backup is only executed with low probability, and has negligible space cost, it does not affect the overall complexity of the fast exact majority protocol in the graphical population protocol model.
Backup for exact majority.
In the case of the exact majority protocol, we can directly adopt the same solution as in the classic clique setting [7]: use the four-state exact majority algorithm analysed by Draief and Vojnović [35] as a backup protocol. This algorithm works in arbitrary connected graphs and has polynomial expected stabilisation time.

Backup for leader election.
For leader election, we use the six-state leader election algorithm given by Beauquier et al. [16], who studied this protocol under the adversarial (non-stochastic) scheduler. In Section 9, we show that this protocol has polynomial expected stabilisation time under the stochastic scheduler on any connected graph.

Switching to the backup protocol can be done as follows: once a node has executed the fast protocol for sufficiently many rounds, it switches to the slow protocol using its current state (whether it is a leader candidate or not) as input for the constant-state protocol. Any node that observes during an interaction that some other node has switched to the slow protocol does so as well.
In this section, we analyse the leader election protocol of Beauquier et al. [16] under the uniformstochastic scheduler. We establish the following result.
Theorem 26.
There exists a protocol for leader election that uses six states and stabilises in any graph G in O(diam(G) · nm log n) steps with high probability and in expectation, where n is the number of nodes, m the number of edges, and diam(G) the diameter of G.

For any connected graph, the protocol stabilises in O(n⁴ log n) steps, but for sparse and low-diameter graphs the running time is better. For example, constant-degree regular expanders have diameter O(log n) and O(n) edges, and thus the algorithm stabilises in O(n² log² n) steps in expanders. In D-dimensional toroidal grids of constant dimension, we get a stabilisation time of O(n^{2+1/D} log n) steps, since such graphs have diameter O(n^{1/D}) and O(n) edges. Note that these bounds are roughly a factor n larger than the bound given by Corollary 3. However, as discussed, we can employ the slower protocol as a backup protocol for the fast leader election protocol of Section 8 to get finite expected stabilisation time.

First, we recall that in the classic clique setting leader election can be solved by a simple 2-state protocol, where each node keeps track of whether it is a leader candidate or a follower. Whenever two leader candidates interact, the initiator remains a leader candidate while the responder becomes a follower; no other type of interaction changes the state of the nodes.

The token-based protocol uses a similar approach. However, unlike in the clique, it may be impossible for two leader candidates to directly interact: they may not have a common edge in G. Instead, the nodes use tokens to interact indirectly: in each step, the interacting nodes update their status and exchange their tokens, and at every time step each node holds exactly one token.

There are three types of tokens: black, white, and inactive. Initially, each leader candidate creates a black token. In each step, the interacting nodes exchange their tokens.
Whenever two black tokens meet, exactly one of them turns into a white token while the other remains black. Informally, black tokens represent the presence of a leader candidate that has not yet been cancelled. A white token represents a leader candidate that will eventually become a follower. Whenever a node that considers itself a leader candidate receives a white token, it changes its own status into a follower and deactivates the token. The invariant maintained by the protocol is that the total number of non-inactive tokens present in the system equals the number of leader candidates. By continuously shuffling the tokens, it is eventually guaranteed that the total number of black tokens becomes one and all other tokens become inactive.

The protocol. Formally, the state of each node v is a tuple (ℓ, y), where ℓ ∈ {leader, follower} is a bit indicating whether node v is a leader candidate, and y ∈ {black, white, inactive} denotes the type of the token held by the node. As input, each node is given a bit indicating whether it is initially a leader candidate. Every node v initialises its state using the following rules:
– If v is a leader candidate, then it sets ℓ(v) ← leader and y(v) ← black.
– Otherwise, it sets ℓ(v) ← follower and y(v) ← inactive.
When two neighbouring nodes u and v are selected to interact by the scheduler, we say that (also) the tokens held by the nodes interact. On every interaction, where node u is the initiator and v is the responder, the states are updated as follows:
(1) If y(u) = y(v) = black holds, then y(v) ← white.
That is, if both tokens are black, then the token of the responder v is coloured white.
(2) If the token held by u is white, y(u) = white, and node v is a leader, ℓ(v) = leader, then
– node v designates itself a follower, i.e., sets ℓ(v) ← follower, and
– node u sets the type of its token to y(u) ← inactive.
(3) Finally, the nodes u and v swap their tokens y(u) and y(v).
For the formal proof of correctness, we refer to [16]. Here, we focus only on bounding the time for the protocol to stabilise under the uniform stochastic scheduler on G.

To establish bounds on the stabilisation time of the leader election protocol, we analyse the hitting time and meeting time of tokens performing random walks on the graph G. The stabilisation time of the token-based leader election protocol can then be bounded using these quantities. Before we proceed, we note the differences between the classic random walk process on a graph and the random walks made by the tokens in our process.

Random walks on graphs.
Recall that the classic random walk on a graph G is the following Markov chain: initially, a random walker (i.e. a token) is placed on some node v of G. In each step, the random walker moves from v to a neighbour u of v chosen uniformly at random. A natural extension is to consider multiple, independent random walkers moving on the nodes of G: several walkers may be placed on the nodes of G, and in every step each walker moves to a new random node independently of all the other walkers.

In contrast, in the population protocol model, we have to consider multiple tokens performing correlated random walks on G: in every step, exactly two tokens move along the same edge, which is sampled uniformly at random. Nevertheless, we can carefully adapt arguments analogous to those used to analyse the classic random walk (see e.g. [44]) and the coalescence time of independent random walks, as used by Cooper et al. [28]. Naturally, the bounds we obtain are somewhat different, as the underlying sampling process is different, and we do not aim for sharp bounds.

Hitting times for irreducible Markov chains. We start by recalling the following elementary result about hitting times of Markov chains; see e.g. [43, Proposition 1.19]. For states x and y, the expected hitting time H(x, y) is

H(x, y) = E[ min{ t ≥ 1 : X_t = y } | X_0 = x ].

For x ≠ y, H(x, y) is the expected number of steps until the chain starting in state x reaches state y. For x = y, the value gives the expected first return time to state x.

Lemma 27.
For any finite and irreducible Markov chain, the stationary distribution π satisfies

π(x) = 1 / H(x, x) for every state x.

Note that the above lemma does not require the Markov chain to be aperiodic. Indeed, the chains we consider will be periodic.
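Lemma 27 can be verified exactly on a toy example. The following check (ours, not from the text) computes both sides for a two-state chain, where the stationary distribution and the expected return time admit closed forms.

```python
from fractions import Fraction

# Two-state chain: from state 0 move to 1 with prob p, from 1 to 0 with prob q.
p, q = Fraction(1, 3), Fraction(1, 5)

# Stationary distribution: solving pi = pi P gives pi(0) = q / (p + q).
pi0 = q / (p + q)

# Expected first return time to state 0 by first-step analysis:
# stay with prob 1 - p (return in one step), or move to state 1 and wait
# a geometric number of steps with mean 1/q before coming back.
H00 = (1 - p) * 1 + p * (1 + 1 / q)

assert H00 == 1 / pi0  # Lemma 27: pi(x) = 1 / H(x, x)
```

Exact rational arithmetic via `fractions.Fraction` avoids any floating-point ambiguity in the comparison.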
Random walk of a single token.
Let G = (V, E) be a simple, connected graph on n ≥ 2 nodes, with m ≥ 1 edges. We start by analysing the walk performed by a single token of the leader election algorithm under the population protocol model. More precisely, we consider the following process. Initially, a token is placed on a node of G. In each time step, an edge of G is sampled uniformly at random. If the token is located at an endpoint of the sampled edge, then it moves to the other endpoint of that edge. Otherwise, the token stays where it is.

Formally, this corresponds to a Markov chain on the state space V. The probability that the chain transitions from state u to state v is given by

P(u, v) = 1/m if {u, v} ∈ E,
P(u, v) = 1 − d(u)/m if u = v, and
P(u, v) = 0 otherwise,

for every u, v ∈ V. That is, if the token is on node u, the probability that the token moves to node v in the next step is the probability that an edge between the two nodes, if one exists, is chosen. The resulting Markov chain is irreducible, since the graph G is connected. Note that this random walk differs from the classic random walk on G, in which the token moves to an adjacent node in every time step. First, we show that the uniform distribution on V is the stationary distribution for this chain.

Lemma 28.
The stationary distribution of the walk on G is π(v) = 1/n for every v ∈ V.

Proof. Let π be the uniform distribution on V. Recall that πP denotes the application of the transition matrix P to π. To establish the claim, we need to verify that π = πP holds. For this, we observe that

(πP)(v) = Σ_u π(u) P(u, v) = (1/n) Σ_u P(u, v) = 1/n = π(v),

since each column sum of P equals d(v)/m + 1 − d(v)/m = 1.

Next, using elementary arguments, we can bound the hitting time of any pair of nodes; see e.g. [44]. In particular, we make use of the worst-case expected hitting time defined by

H_max = max{ H(u, v) : u, v ∈ V, u ≠ v }.

The following lemma shows that H_max < diam(G) · nm.

Lemma 29. For any graph G, H(u, v) < diam(G) · nm for all u, v ∈ V.

Proof. By Lemma 27 and Lemma 28, we have H(u, u) = 1/π(u) = n. On the other hand, by calculating the expected return time in another way, we observe that

H(u, u) = 1 − d(u)/m + (1/m) Σ_{{u,w} ∈ E} (1 + H(w, u)) = n.

Thus, we get the inequality

Σ_{{u,w} ∈ E} (1 + H(w, u)) < nm.

In particular, we have that H(w, u) < nm for any edge {u, w} ∈ E. Since G has diameter diam(G), there is a path u = u_0, …, u_k = v of length k ≤ diam(G) between any two nodes u and v. Hence, by linearity of expectation, we get that

H(u, v) ≤ Σ_{i=0}^{k−1} H(u_i, u_{i+1}) < diam(G) · nm.

Meeting time of two tokens.
We now consider the situation where a distinct token is placed on each node of G. In each time step, an edge is chosen uniformly at random. Whenever the edge {u, v} is sampled, the tokens at nodes u and v exchange places. Note that, individually, each token performs a random walk, but the random walks are not independent.

We say that two tokens meet at time t if the edge {u, v} is sampled at time step t and the two tokens are located at the nodes u and v, respectively. From now on, we uniquely label the tokens from 1 to n and define the random variable M(a, b) as the number of time steps until tokens a and b first meet, starting from the initial configuration. If a = b, then we follow the convention that M(a, b) = 0. We are interested in bounding the largest first meeting time between any two pairs of tokens. To this end, we define

M = max{ M(a, b) : a, b ∈ [n] } and M_max = max_{a,b} E[M(a, b)].

The random variable M is the largest first meeting time between any pair of tokens in the token shuffling process, and the quantity M_max is the worst-case expected first meeting time between any two tokens.

Lemma 30.
The expected worst-case first meeting time satisfies M_max ∈ O(diam(G) · nm).

Proof. We keep track of the locations of the two tokens and of the parity of the number of times the two tokens have met. To this end, we define the graph G* = (V*, E*) with V* = V*_0 ∪ V*_1, where

V*_b = { (v_1, v_2, b) : v_1, v_2 ∈ V, v_1 ≠ v_2 } for b ∈ {0, 1},

and {(u_1, u_2, b), (v_1, v_2, b′)} ∈ E* if either of the following two conditions holds:
(1) b = b′, u_i = v_i and {u_{3−i}, v_{3−i}} ∈ E for some i ∈ {1, 2}, or
(2) b ≠ b′ and (u_1, u_2) = (v_2, v_1).

One can check that the degree of any node x = (v_1, v_2, b) ∈ V* is d(x) = d(v_1) + d(v_2). Define the transition matrix

P*(x, y) = 1/m if {x, y} ∈ E*,
P*(x, y) = 1 − d(x)/m if x = y, and
P*(x, y) = 0 otherwise.

Consider an arbitrary initial configuration and two tokens located at nodes v_1 and v_2 of G. The expected meeting time of these two tokens is the same as the expected time for the random walk given by P* to reach some node of V*_1 from (v_1, v_2, 0) ∈ V*_0. By the same arguments as in Lemma 28 and Lemma 29, we get that this hitting time is at most O(diam(G) · nm), since |V*| ∈ Θ(n²), |E*| ∈ Θ(nm), and the diameter of G* is Θ(diam(G)).

Fix an arbitrary constant c ≥ 1 and set

T = ⌈2 · max{H_max, M_max}⌉, R = ⌈(c + 2) log n⌉, and T* = RT.
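The correlated swap process is straightforward to simulate. The sketch below (ours; the cycle C_8, the seed, and the slack factor 10 are arbitrary choices) estimates the expected first meeting time of two antipodal tokens on a cycle and checks it against the O(diam(G) · nm) bound of Lemma 30.

```python
import random

def meeting_time(n, a, b, rng):
    """Steps until the tokens starting on nodes a and b of the cycle C_n
    first meet, i.e. until the sampled edge has the two tokens at its endpoints."""
    pos_a, pos_b = a, b
    steps = 0
    while True:
        u = rng.randrange(n)          # sample a uniform random edge {u, u+1} of C_n
        v = (u + 1) % n
        steps += 1
        if {pos_a, pos_b} == {u, v}:  # both tokens on the sampled edge: they meet
            return steps

        def move(p):                  # tokens on the sampled edge swap endpoints
            return v if p == u else (u if p == v else p)

        pos_a, pos_b = move(pos_a), move(pos_b)

rng = random.Random(1)
n, diam, m = 8, 4, 8                  # cycle C_8: n = m = 8, diameter 4
trials = 2000
est = sum(meeting_time(n, 0, 4, rng) for _ in range(trials)) / trials
assert 0 < est < 10 * diam * n * m    # comfortably within the O(diam(G) * nm) bound
```

Only the two tracked tokens matter here: swaps involving other tokens never change their positions, so the remaining tokens need not be simulated.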
We show that within T* steps all pairs of tokens have met, with high probability.

Remark 31. For any 0 < p < 1, the following identity holds:

Σ_{k=0}^∞ (k + 1) p^k = 1/(1 − p)².

Remark 32. Let A_0, …, A_R be events. Then

Pr[ ⋂_{i=0}^R A_i ] = Pr[A_0] · Π_{i=1}^R Pr[ A_i | A_0 ∩ ⋯ ∩ A_{i−1} ].

Lemma 33.
We have Pr[M ≥ T*] ≤ 1/n^c and E[M] ≤ 4T*.

Proof. Let M_t(a, b) be the first meeting time between tokens a and b after step t, and let M_t = max_{a,b} M_t(a, b). Note that M(a, b) = M_0(a, b) and M = M_0.

First, we show that for any a, b ∈ [n] and t ≥ 0, the inequality Pr[M_t(a, b) ≥ T*] ≤ 1/n^{c+2} holds. For a = b the claim is vacuous, so assume a ≠ b. By Markov's inequality, the probability that tokens a and b do not meet within T − 1 steps, starting from any configuration x_t, is

Pr[M_t(a, b) ≥ T] ≤ E[M_t(a, b)] / T ≤ M_max / T ≤ 1/2.

By repeating the experiment R times, we observe that the probability that tokens a and b do not meet within T* = RT steps (starting from any configuration x_t) satisfies

Pr[M_t(a, b) ≥ RT] = Pr[M_t(a, b) ≥ T] · Π_{r=1}^{R−1} Pr[ M_{t+Tr}(a, b) ≥ T | M_{t+T(r−1)}(a, b) ≥ T ] ≤ Π_{r=1}^{R} 1/2 = (1/2)^{⌈(c+2) log n⌉} ≤ 1/n^{c+2}.

Next, we bound the probability that some pair of tokens has not met within T* steps by applying the union bound:

Pr[M_t ≥ T*] ≤ Σ_{a,b} Pr[M_t(a, b) ≥ T*] ≤ Σ_{a,b} 1/n^{c+2} ≤ n²/n^{c+2} = 1/n^c.

Observe that for any k > 1 we have

Pr[M_t ≥ kT*] = Pr[M_t ≥ T*] · Π_{i=1}^{k−1} Pr[ M_{t+iT*} ≥ T* | M_{t+(i−1)T*} ≥ T* ] ≤ Π_{i=1}^{k} 1/n^c ≤ (1/2)^k,

since n ≥ 2 and c ≥ 1. To bound the expectation, observe that

E[M] = E[M_0] ≤ Σ_{k=0}^∞ (k + 1) T* · Pr[M ≥ kT*] ≤ T* Σ_{k=0}^∞ (k + 1)/2^k = 4T*.
The random variable C satisfies Pr[C ≥ T*] ≤ 1/n^c.

Proof. Observe that C corresponds to the time when the last pair of black tokens meets. Now C ≤ max{ M(a, b) : a ≠ b } = M, and the claim follows from Lemma 33.

Let L be the stabilisation time of the protocol, that is, the time until there is exactly one leader candidate remaining. Recall that a leader candidate becomes a follower if it receives a white token from some other node, and a follower never becomes a candidate again. Thus, a node is a leader candidate at step t if and only if it has not been hit by a white token. Whenever a white token hits a leader candidate, the token becomes inactive. This ensures that a single leader is always elected, as at most n − 1 white tokens are created during the execution of the protocol.

Lemma 35.
Let u and v be distinct nodes. The probability that a token starting from u does not hit v within T* steps is at most 1/n^{c+2}.

Proof. By Lemma 29 and Markov's inequality, the probability that a token starting from u does not hit node v within T steps is bounded by

H(u, v)/T ≤ H_max/(2 H_max) ≤ 1/2.

Again, by repeating the experiment R times, we get that the token starting from u fails to hit v within RT = T* steps with probability at most 2^{−R} ≤ 1/n^{c+2}.

Lemma 36.
The random variable L satisfies Pr[L ≥ 2T*] ≤ 2/n^c and E[L] ≤ 8T*.

Proof. Observe that, conditioned on the event that only one black token remains, the probability that some node v does not become a follower is bounded by the probability that node v does not receive a white token, which is in turn bounded by the probability that some token does not visit v within T* steps. Hence,

Pr[L ≥ t + T* | C < t] ≤ Σ_v Pr[ node v is not hit by a white token by time t + T* | C < t ]
≤ Σ_v Pr[ node v is not hit by some token ]
≤ Σ_v Σ_a Pr[ node v is not hit by token a by time t + T* ]
≤ Σ_v Σ_a 1/n^{c+2} ≤ 1/n^c,

where in the second-to-last step we applied Lemma 35 and in the last step the fact that there are at most n leader candidate nodes and n tokens. By the law of total probability, we get that

Pr[L ≥ 2T*] = Pr[L ≥ 2T* | C < T*] · Pr[C < T*] + Pr[L ≥ 2T* | C ≥ T*] · Pr[C ≥ T*] ≤ (1/n^c) · (1 − 1/n^c) + 1/n^c ≤ 2/n^c,

where in the second-to-last step we applied the bound Pr[C ≥ T*] ≤ 1/n^c given by Lemma 34. Since c ≥ 1, considering repeated stabilisation attempts yields

E[L] ≤ 2T* Σ_{k=0}^∞ (k + 1) Pr[L > 2kT*] ≤ 2T* Σ_{k=0}^∞ (k + 1) (1/2)^k = 8T*.
Proof.
The protocol uses only six states, as each node stores only whether it is a leader candidate and the type of its token. Moreover, T* = O(max{H_max, M_max} · log n), and H_max, M_max ∈ O(diam(G) · nm) by Lemma 29 and Lemma 30. Thus, the protocol stabilises in O(diam(G) · nm log n) steps with high probability and in expectation.
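The protocol and the uniform stochastic scheduler are also easy to exercise end-to-end. The Python sketch below is ours (not from [16]): it runs the six-state protocol on a small cycle and checks that exactly one leader candidate survives; the graph, seed, candidate set, and step budget are arbitrary choices.

```python
import random

def leader_election(n, candidates, steps, seed=0):
    """Sketch of the six-state leader election protocol on the cycle C_n
    under the uniform stochastic scheduler."""
    rng = random.Random(seed)
    leader = [v in candidates for v in range(n)]
    token = ['black' if v in candidates else 'inactive' for v in range(n)]

    for _ in range(steps):
        u = rng.randrange(n)                      # sample a uniform edge {u, u+1}
        v = (u + 1) % n
        if rng.random() < 0.5:                    # random initiator/responder roles
            u, v = v, u
        if token[u] == token[v] == 'black':       # rule (1): one black turns white
            token[v] = 'white'
        if token[u] == 'white' and leader[v]:     # rule (2): white demotes a leader
            leader[v] = False
            token[u] = 'inactive'
        token[u], token[v] = token[v], token[u]   # rule (3): swap the tokens
    return sum(leader)

# Three initial candidates on a 10-cycle; 200000 interactions vastly exceed the
# O(diam(G) * nm log n) stabilisation bound for such a tiny instance.
assert leader_election(10, {0, 3, 7}, 200_000) == 1
```

The run also illustrates the invariant used in the analysis: the number of leader candidates always equals the number of black plus white tokens, so at least one candidate remains at all times.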
Note that the running time is determined by max{H_max, M_max}. For cliques, the expected hitting time for any u ≠ v is

Σ_{t=1}^∞ t · (1/m) (1 − 1/m)^{t−1} = m.

Similarly, the expected meeting time of any two tokens is of the same order as the hitting time. Thus, the algorithm solves leader election with high probability in O(n² log n) steps in the clique. Indeed, it is known that any constant-space leader election protocol requires Ω(n²) steps in expectation to stabilise [33].

As our main result, we established a general framework for simulating clique-based protocols in arbitrary, connected regular graphs. We now conclude by briefly discussing some limitations of our approach and summarising the key problems left open by this work:
– We assume that the nodes have access to a single random bit per interaction. The random bits are used only by the shuffling protocol of Section 7 to avoid technical parity issues arising in the mixing of the random walks on the symmetric group. It seems plausible that this assumption can be avoided, by exploiting the stochastic nature of the population protocol scheduler to e.g. generate synthetic coins [7], or by arguing that these parity issues are avoided by virtue of having a random number of shuffling steps.
– We assume that in each interaction step in the population protocol model, one of the interacting nodes is assigned to be an initiator and the other a responder, to provide elementary symmetry-breaking. This is again a common assumption in the population protocol literature. The simulation framework uses this assumption only in the construction of the phase clock, where in certain situations ties need to be broken. It again seems plausible that this assumption can be avoided, but this would necessitate revisiting the involved graphical load balancing argument of Peres et al. [49] with a different tie-breaking procedure.
– We focus on regular interaction graphs. The justification for this assumption is two-fold.
First, this assumption is only used once: in Section 6, to obtain clean bounds for the skew of the phase clock. However, upon close inspection, we notice that this regularity assumption can be relaxed in many cases if the minimum and maximum degrees do not deviate too much from the average degree of the graph. As Theorem 6 can be used to bound the mixing time of the interchange process in non-regular graphs as well, we can use our simulation framework to obtain fast leader election and exact majority algorithms also on some non-regular graphs. Please see Appendix A.2 for a formal statement and an illustration. Second, regular graphs are also justified by the fact that they provide an immediate extension of the notion of parallel time: the expected number of interactions in any time interval is the same for all nodes, and prior work on this problem has naturally focused on them [29, 35]. Nevertheless, obtaining bounds for phase clocks and related load balancing processes in non-regular graphs remains an interesting open problem.
– The simulation overhead has a polylogarithmic dependency on n. To simplify the presentation, we have made no particular effort to optimise the degree of this polylogarithmic dependency. The dependency can be improved by providing better bounds on the k-stack interchange process. Indeed, even in the case of the well-studied (1-stack) interchange process, exact bounds on mixing time remain an open question for many graph classes [40]. Improved bounds for these processes imply better running time bounds for our simulations.
– Our complexity bounds have a quadratic dependency on d/β. We conjecture that a polynomial dependency on the expansion properties is necessary for step complexity, and leave the investigation of tight space-time trade-offs for population protocols in the general graphical setting as an intriguing open problem.

Acknowledgements
We thank Giorgi Nadiradze for pointing out the generalisation of the phase clock construction to non-regular graphs. We also thank the anonymous reviewers for their useful comments on earlier versions of this manuscript. This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No. 805223 ScaleML), and from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 840605.
References

[1] David Aldous. Random walks on finite groups and rapidly mixing Markov chains. In
Séminaire de Probabilités XVII 1981/82, pages 243–297. Springer, 1983.

[2] David Aldous and James Allen Fill. Reversible Markov chains and random walks on graphs, 2002. Unfinished monograph, recompiled 2014, available at .

[3] Dan Alistarh and Rati Gelashvili. Polylogarithmic-time leader election in population protocols. In Proc. 42nd International Colloquium on Automata, Languages, and Programming (ICALP 2015), pages 479–491, 2015. doi:10.1007/978-3-662-47666-6_38.

[4] Dan Alistarh and Rati Gelashvili. Recent algorithmic advances in population protocols. SIGACT News, 49(3):63–73, 2018. doi:10.1145/3289137.3289150.

[5] Dan Alistarh, Rati Gelashvili, and Milan Vojnović. Fast and exact majority in population protocols. In Proc. 34th ACM Symposium on Principles of Distributed Computing (PODC 2015), pages 47–56, 2015. doi:10.1145/2767386.2767429.

[6] Dan Alistarh, James Aspnes, David Eisenstat, Rati Gelashvili, and Ronald L Rivest. Time-space trade-offs in population protocols. In Proc. 28th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2017), pages 2560–2579, 2017. doi:10.1137/1.9781611974782.169.

[7] Dan Alistarh, James Aspnes, and Rati Gelashvili. Space-optimal majority in population protocols. In
Proc. 29th ACM-SIAM Symposium on Discrete Algorithms (SODA 2018). SIAM, 2018. doi:10.1137/1.9781611975031.144.

[8] Dan Alistarh, Giorgi Nadiradze, and Amirmojtaba Sabour. Dynamic averaging load balancing on cycles. In Proc. 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020), volume 168, pages 7:1–7:16, 2020. doi:10.4230/LIPIcs.ICALP.2020.7.

[9] Dana Angluin, James Aspnes, Zoë Diamadi, Michael J Fischer, and René Peralta. Computation in networks of passively mobile finite-state sensors.
Distributed Computing, 18(4):235–253, 2006. doi:10.1007/s00446-005-0138-3.

[10] Dana Angluin, James Aspnes, and David Eisenstat. Stably computable predicates are semilinear. In Proc. 25th ACM Symposium on Principles of Distributed Computing (PODC 2006), pages 292–299, 2006. doi:10.1145/1146381.1146425.

[11] Dana Angluin, James Aspnes, David Eisenstat, and Eric Ruppert. The computational power of population protocols. Distributed Computing, 20(4):279–304, 2007. doi:10.1007/s00446-007-0040-2.

[12] Dana Angluin, James Aspnes, and David Eisenstat. Fast computation by population protocols with a leader. Distributed Computing, 21(3):183–199, 2008. doi:10.1007/s00446-008-0067-z.

[13] Dana Angluin, James Aspnes, Michael J Fischer, and Hong Jiang. Self-stabilizing population protocols. ACM Transactions on Autonomous and Adaptive Systems (TAAS), 3(4):1–28, 2008. doi:10.1007/11795490_10.

[14] James Aspnes and Eric Ruppert. An introduction to population protocols. In Middleware for Network Eccentric and Mobile Applications, pages 97–120. Springer, 2009. doi:10.1007/978-3-540-89707-1_5.

[15] Yossi Azar, Andrei Z. Broder, Anna R. Karlin, and Eli Upfal. Balanced allocations. SIAM Journal on Computing, 29(1):180–200, 1999. doi:10.1137/S0097539795288490.

[16] Joffroy Beauquier, Peva Blanchard, and Janna Burman. Self-stabilizing leader election in population protocols over arbitrary communication graphs. In Proc. 17th International Conference on Principles of Distributed Systems (OPODIS 2013), pages 38–52. Springer, 2013. doi:10.1007/978-3-319-03850-6_4. URL https://hal.archives-ouvertes.fr/hal-00867287v2.

[17] Petra Berenbrink, Tom Friedetzky, Peter Kling, Frederik Mallmann-Trenn, and Chris Wastell. Plurality consensus in arbitrary graphs: Lessons learned from load balancing. In
Proc. 24th Annual European Symposium on Algorithms (ESA 2016), volume 57, pages 10:1–10:18, 2016. doi:10.4230/LIPIcs.ESA.2016.10.

[18] Petra Berenbrink, Robert Elsässer, Tom Friedetzky, Dominik Kaaser, Peter Kling, and Tomasz Radzik. A population protocol for exact majority with O(log^{5/3} n) stabilization time and Θ(log n) states. In Proc. 32nd International Symposium on Distributed Computing (DISC 2018), pages 10:1–10:18, 2018. doi:10.4230/LIPIcs.DISC.2018.10.

[19] Petra Berenbrink, George Giakkoupis, and Peter Kling. Tight bounds for coalescing-branching random walks on regular graphs. In
Proc. 29th Annual ACM-SIAM Symposium on DiscreteAlgorithms (SODA 2018) , pages 1715–1733. SIAM, 2018. doi:10.1137/1.9781611975031.112.[20] Petra Berenbrink, Dominik Kaaser, Peter Kling, and Lena Otterbach. Simple and efficient leaderelection. In
Proc. 1st Symposium on Simplicity in Algorithms (SOSA 2018) , pages 9:1–9:11,2018. doi:10.4230/OASIcs.SOSA.2018.9.[21] Petra Berenbrink, George Giakkoupis, and Peter Kling. Optimal time and space leader election inpopulation protocols. In
Proc. 52nd Annual ACM SIGACT Symposium on Theory of Computing(STOC 2020) , pages 119–129, 2020. doi:10.1145/3357713.3384312.[22] Michael Blondin, Javier Esparza, and Stefan Jaax. Large flocks of small birds: on the minimalsize of population protocols. In
Proc. 35th Symposium on Theoretical Aspects of ComputerScience (STACS 2018) , pages 16:1–16:14, 2018. doi:10.4230/LIPIcs.STACS.2018.16.[23] Robert Brijder. Computing with chemical reaction networks: a tutorial.
Natural Computing, 18(1):119–137, 2019. doi:10.1007/s11047-018-9723-9.
[24] Pietro Caputo, Thomas M. Liggett, and Thomas Richthammer. Proof of Aldous’ spectral gap conjecture. Journal of the American Mathematical Society, 23(3):831–851, 2010. doi:10.1090/S0894-0347-10-00659-4.
[25] Ioannis Chatzigiannakis, Othon Michail, Stavros Nikolaou, Andreas Pavlogiannis, and Paul G. Spirakis. Passively mobile communicating machines that use restricted space. In Proc. 7th ACM SIGACT/SIGMOBILE International Workshop on Foundations of Mobile Computing, pages 6–15, 2011.
[26] Hsueh-Ping Chen and Ho-Lin Chen. Self-stabilizing leader election. In
Proc. 38th Symposium on Principles of Distributed Computing (PODC 2019), pages 53–59, 2019. doi:10.1145/3293611.3331616.
[27] Hsueh-Ping Chen and Ho-Lin Chen. Self-stabilizing leader election in regular graphs. In Proc. 39th Symposium on Principles of Distributed Computing (PODC 2020), pages 210–217, New York, NY, USA, 2020. Association for Computing Machinery. doi:10.1145/3382734.3405733.
[28] Colin Cooper, Robert Elsässer, Hirotaka Ono, and Tomasz Radzik. Coalescing random walks and voting on connected graphs.
SIAM Journal on Discrete Mathematics, 27(4):1748–1758, 2013. doi:10.1137/120900368.
[29] Colin Cooper, Tomasz Radzik, Nicolás Rivera, and Takeharu Shiraga. Fast plurality consensus in regular expanders. In Proc. 31st International Symposium on Distributed Computing (DISC 2017), pages 13:1–13:16, 2017. doi:10.4230/LIPIcs.DISC.2017.13.
[30] Persi Diaconis and Laurent Saloff-Coste. Comparison techniques for random walk on finite groups. The Annals of Probability, 21(4):2131–2156, 1993.
[31] Persi Diaconis and Mehrdad Shahshahani. Generating a random permutation with random transpositions.
Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 57(2):159–179, 1981.
[32] A. B. Dieker. Interlacings for random walks on weighted graphs and the interchange process. SIAM Journal on Discrete Mathematics, 24(1):191–206, 2010. doi:10.1137/090775361.
[33] David Doty and David Soloveichik. Stable leader election in population protocols requires linear time. Distributed Computing, 31(4):257–271, 2018. doi:10.1007/s00446-016-0281-z.
[34] David Doty, Mahsa Eftekhari, and Eric Severson. A stable majority population protocol using logarithmic time and states, 2020. URL https://arxiv.org/abs/2012.15800. arXiv:2012.15800.
[35] Moez Draief and Milan Vojnović. Convergence speed of binary interval consensus.
SIAM Journal on Control and Optimization, 50(3):1087–1109, 2012. doi:10.1137/110823018.
[36] Robert Elsässer and Tomasz Radzik. Recent results in population protocols for exact majority and leader election. Bulletin of the EATCS, 126, 2018. URL http://bulletin.eatcs.org/index.php/beatcs/article/view/549/546.
[37] Leszek Gąsieniec and Grzegorz Stachowiak. Fast space optimal leader election in population protocols. In Proc. 29th ACM-SIAM Symposium on Discrete Algorithms (SODA 2018), 2018. doi:10.1137/1.9781611975031.169.
[38] Leszek Gąsieniec, Grzegorz Stachowiak, and Przemyslaw Uznański. Almost logarithmic-time space optimal leader election in population protocols. In Proc. 31st ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2019), 2019. doi:10.1145/3323165.3323178.
[39] Leszek Gąsieniec, Grzegorz Stachowiak, and Przemyslaw Uznański. Time and space optimal exact majority population protocols, 2020. URL https://arxiv.org/abs/2011.07392.
[40] Johan Jonasson. Mixing times for the interchange process.
Latin American Journal of Probability and Mathematical Statistics, 9(2):667–683, 2012.
[41] Richard Karp, Christian Schindelhauer, Scott Shenker, and Berthold Vöcking. Randomized rumor spreading. In Proc. 41st Annual Symposium on Foundations of Computer Science (FOCS 2000), pages 565–574, 2000. doi:10.1109/SFCS.2000.892324.
[42] Tom Leighton and Satish Rao. Multicommodity max-flow min-cut theorems and their use in designing approximation algorithms. Journal of the ACM, 46(6):787–832, 1999. doi:10.1145/331524.331526.
[43] David A. Levin and Yuval Peres.
Markov Chains and Mixing Times. American Mathematical Society, 2nd edition, 2017.
[44] László Lovász. Random walks on graphs: A survey. Combinatorics, Paul Erdős is Eighty, 2(1):1–46, 1993.
[45] George B. Mertzios, Sotiris E. Nikoletseas, Christoforos Raptopoulos, and Paul G. Spirakis. Determining majority in networks with local interactions and very small local memory. In Proc. 41st International Colloquium on Automata, Languages, and Programming (ICALP 2014), pages 871–882, 2014. doi:10.1007/978-3-662-43948-7_72.
[46] George B. Mertzios, Sotiris E. Nikoletseas, Christoforos L. Raptopoulos, and Paul G. Spirakis. Determining majority in networks with local interactions and very small local memory.
DistributedComputing , 30(1):1–16, 2017. doi:10.1007/s00446-016-0277-8.[47] Carey D. Nadell, Knut Drescher, and Kevin R. Foster. Spatial structure, cooperation and com-petition in biofilms.
Nature Reviews Microbiology , 14(9):589, 2016. doi:10.1038/nrmicro.2016.84.[48] Roberto Imbuzeiro Oliveira. Mixing of the symmetric exclusion processes in terms of thecorresponding single-particle random walk.
The Annals of Probability , 41(2):871–913, 2013.[49] Yuval Peres, Kunal Talwar, and Udi Wieder. Graphical balanced allocations and the (1 + β ) -choice process. Random Structures and Algorithms , 47(4):760–775, 2014. doi:10.1002/rsa.20558.[50] Thomas Sauerwald and He Sun. Tight bounds for randomized load balancing on arbitrarynetwork topologies. In
Proc. 53rd Annual IEEE Symposium on Foundations of Computer Science (FOCS 2012), pages 341–350, 2012. doi:10.1109/FOCS.2012.86.
[51] Yuichi Sudo and Toshimitsu Masuzawa. Leader election requires logarithmic time in population protocols. Parallel Processing Letters, 30(01):2050005, 2020. doi:10.1142/S012962642050005X.
[52] Yuichi Sudo, Fukuhito Ooshita, Taisuke Izumi, Hirotsugu Kakugawa, and Toshimitsu Masuzawa. Logarithmic expected-time leader election in population protocol model. In Proc. International Symposium on Stabilizing, Safety, and Security of Distributed Systems (SSS 2019), pages 323–337, 2019. doi:10.1007/978-3-030-34992-9_26.
[53] Alan M. Turing. The chemical basis of morphogenesis. Philosophical Transactions of the Royal Society of London, Series B, 237(641):37–72, 1952.
[54] David B. Wilson. Mixing times of lozenge tiling and card shuffling Markov chains. The Annals of Applied Probability, 14(1):274–325, 2004.
[55] Daisuke Yokota, Yuichi Sudo, and Toshimitsu Masuzawa. Time-optimal self-stabilizing leader election on rings in population protocols. In Proc. International Symposium on Stabilizing, Safety, and Security of Distributed Systems (SSS 2020), pages 301–316. Springer, 2020. doi:10.1007/978-3-030-64348-5_24.
Details on decentralised phase clocks
A.1 Proof of Lemma 9
Let V be a collection of n bins, which are initially empty, and let µ be a probability distribution on V × V. Consider the process where, at every time step t > 0, a pair (u, v) ∼ µ is sampled and a ball is placed into the least loaded of these two bins (in case of ties, the ball is placed into bin u). Let ∆*(t) be the difference between the loads of the most and least loaded bins after step t. Define E(S) as the set of edges that have at least one endpoint in S, that is, E(S) = {{u, v} ∈ E : {u, v} ∩ S ≠ ∅}. The distribution µ on V × V is said to be δ-expanding for δ > 0 if for all S ⊆ V with |S| ≤ n/2,

(1) µ(E(S)) ≥ (1 + δ)|S|/n, and
(2) µ(E(S) \ ∂S) ≤ (1 − δ)|S|/n

hold, where ∂S denotes the edge boundary of S. Peres et al. [49] showed that when the measure µ is well-behaved, in the sense that it is δ-expanding, then the gap is bounded by O(log n/δ) at every step t, with high probability.

Lemma 37 ([49]). Let κ > 0 be a constant and µ be a δ-expanding measure on V × V. Then there exists a constant c(κ) such that for any t > 0 the gap ∆*(t) satisfies Pr[∆*(t) > c(κ) log n/δ] < t/n^κ.

The following observation establishes that the uniform distribution on the edges of a regular, connected graph is always δ-expanding for some δ > 0. This in turn implies Lemma 9.

Lemma 38.
Suppose G is d-regular with edge expansion β > 0. Then the uniform distribution ξ on the edges of G is (β/d)-expanding.

Proof. Let S ⊆ V be such that |S| ≤ n/2. Note that |∂S| ≥ β|S|, as the graph has edge expansion β. Since the graph is d-regular, it has m = nd/2 edges and |S|d/2 ≤ |E(S)| ≤ |S|d. Moreover, writing out(v, S) for the number of neighbours of v outside S, every edge of E(S) is counted twice in the sum Σ_{v∈S} (deg(v) + out(v, S)), so 2|E(S)| = Σ_{v∈S} (d + out(v, S)) ≥ |S|(d + β). Now

ξ(E(S)) = |E(S)|/m ≥ (1/(nd)) Σ_{v∈S} (d + β) = (|S|/n)(1 + β/d).

This shows the first condition. For the second condition, observe that 2|E(S) \ ∂S| = Σ_{v∈S} in(v, S) = Σ_{v∈S} (d − out(v, S)) ≤ |S|(d − β), where in(v, S) denotes the number of neighbours of v inside S, and hence

ξ(E(S) \ ∂S) = |E(S) \ ∂S|/m ≤ (1/(nd)) Σ_{v∈S} (d − β) = (|S|/n)(1 − β/d).
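As an aside, both the balls-into-bins process and the expansion bound of Lemma 38 are easy to sanity-check on small graphs. The following Python sketch is our own illustration, not part of the protocol or of [49]: the function names and the choice of the cycle C_8 as a test graph are ours. It simulates the two-choice process driven by the uniform edge measure, and brute-forces the edge expansion β of a small d-regular graph to verify both δ-expanding conditions for δ = β/d.

```python
import random
from itertools import combinations

def two_choice_gap(edges, n, steps, seed=0):
    """Sample an edge (u, v) uniformly at each step and drop a ball into
    the less loaded endpoint bin (ties go to u); return the final gap
    between the most and least loaded of the n bins."""
    rng = random.Random(seed)
    load = [0] * n
    for _ in range(steps):
        u, v = rng.choice(edges)
        if load[v] < load[u]:
            load[v] += 1
        else:
            load[u] += 1
    return max(load) - min(load)

def check_expanding(nodes, edges, d):
    """Brute-force the edge expansion beta of a small d-regular graph and
    assert both delta-expanding conditions of the uniform edge measure
    for delta = beta / d, as in Lemma 38.  Exponential in n: only for
    sanity checks on tiny graphs."""
    n, m = len(nodes), len(edges)
    boundary = lambda S: sum(1 for e in edges if len(set(e) & S) == 1)
    touching = lambda S: sum(1 for e in edges if set(e) & S)  # |E(S)|
    subsets = [set(S) for k in range(1, n // 2 + 1)
               for S in combinations(nodes, k)]
    beta = min(boundary(S) / len(S) for S in subsets)
    delta = beta / d
    for S in subsets:
        # condition (1): xi(E(S)) >= (1 + delta)|S|/n
        assert touching(S) / m >= (1 + delta) * len(S) / n - 1e-9
        # condition (2): xi(E(S) \ dS) <= (1 - delta)|S|/n
        assert (touching(S) - boundary(S)) / m <= (1 - delta) * len(S) / n + 1e-9
    return beta, delta
```

On the cycle C_8 (d = 2) this reports β = 1/2 and δ = 1/4, and condition (1) holds with equality for a contiguous arc of four nodes, so the bound of Lemma 38 is tight for this graph.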
Finally, we observe that the phase clock construction is not restricted to regular graphs. The uniform distribution on the edges of G is δ-expanding whenever (1) the minimum and maximum degree do not deviate too much from the average degree α = 2m/n, and (2) the expansion is sufficiently large compared to the average and minimum degree.

Lemma 39.
Let G be a graph with edge expansion β, minimum degree d, maximum degree ∆, and average degree α. If G satisfies

(a) β + d > α, and
(b) d + ∆ ≤ 2α,

then the uniform distribution ξ on the edges of G is δ-expanding for δ = (β + d)/α − 1 > 0.

Proof. Let S ⊆ V be such that |S| ≤ n/2. We use d ≤ deg(v) ≤ ∆ to denote the degree of node v in G, out(v, S) to denote the number of neighbours v has outside the set S, and in(v, S) to denote the number of neighbours v has in the set S. First, observe that

ξ(E(S)) = |E(S)|/m
        = (1/m) Σ_{v∈S} (out(v, S) + in(v, S)/2)
        = (1/m) Σ_{v∈S} (out(v, S) + (deg(v) − out(v, S))/2)
        = (1/(2m)) Σ_{v∈S} (out(v, S) + deg(v))
        ≥ |S|(β + d)/(2m) = (|S|/n) · (β + d)/α = (1 + δ) · |S|/n.

Thus, Condition (1) of a δ-expanding measure is satisfied. For the second condition, we note that

ξ(E(S) \ ∂S) = |E(S) \ ∂S|/m
             = (1/(2m)) Σ_{v∈S} in(v, S)
             = (1/(2m)) Σ_{v∈S} (deg(v) − out(v, S))
             ≤ |S|(∆ − β)/(2m) = (|S|/n) · (∆ − β)/α ≤ (1 − δ) · |S|/n,

where the last inequality follows from condition (b), since (∆ − β)/α ≤ 1 − δ = 2 − (β + d)/α is equivalent to d + ∆ ≤ 2α.

Note that one can also apply the construction to non-uniform probability distributions over E (i.e., weighted graphs), as long as the distribution is δ-expanding for some δ > 0.

An example graph.
For a simple non-regular graph that satisfies the above conditions, take a complete bipartite graph K_{2r,2r} on 4r nodes and, on one side, add r² edges to form a complete bipartite subgraph on those 2r nodes. Now n = 4r and m = 4r² + r² = 5r². One can check that the average degree is α = 2m/n = 5r/2, the minimum degree is 2r, the maximum degree is 3r, and β ≥ r. Thus, the uniform measure is 1/5-expanding.
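The arithmetic of this example can be double-checked programmatically. The sketch below is our own illustration (the function names are ours, and it assumes the reading of the example as K_{2r,2r} with a complete bipartite subgraph added between the two halves of one side): it builds the graph, recomputes the degree statistics, and evaluates the conditions of Lemma 39.

```python
def example_graph(r):
    """Complete bipartite graph K_{2r,2r} on 4r nodes, with r*r extra
    edges forming a complete bipartite subgraph between the two halves
    of one side.  Returns (n, edge list)."""
    left, right = list(range(2 * r)), list(range(2 * r, 4 * r))
    edges = [(u, v) for u in left for v in right]
    edges += [(u, v) for u in left[:r] for v in left[r:]]
    return 4 * r, edges

def lemma39_delta(beta, d_min, d_max, n, m):
    """Return delta = (beta + d_min)/alpha - 1 if conditions (a)
    beta + d_min > alpha and (b) d_min + d_max <= 2*alpha of Lemma 39
    hold, where alpha = 2m/n is the average degree; else None."""
    alpha = 2 * m / n
    if beta + d_min > alpha and d_min + d_max <= 2 * alpha:
        return (beta + d_min) / alpha - 1
    return None
```

For instance, with r = 3 the construction yields n = 12 nodes, m = 45 edges, minimum degree 6 and maximum degree 9, and with β = r the helper returns δ = 1/5, matching the claim above. In the regular case (d_min = d_max = d, m = nd/2) the formula reduces to δ = β/d, recovering Lemma 38.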