Lower Bounds on the State Complexity of Population Protocols
Philipp Czerner, Javier Esparza ∗
{czerner, esparza}@in.tum.de
Department of Informatics, TU München, Germany
February 24, 2021
Population protocols are a model of computation in which an arbitrary number of indistinguishable finite-state agents interact in pairs. The goal of the agents is to decide by stable consensus whether their initial global configuration satisfies a given property, specified as a predicate on the set of all initial configurations. The state complexity of a predicate is the number of states of a smallest protocol that computes it. Previous work by Blondin et al. has shown that the counting predicates x ≥ η have state complexity O(log η) for leaderless protocols and O(log log η) for protocols with leaders. We obtain the first non-trivial lower bounds: the state complexity of x ≥ η is Ω(log log log η) for leaderless protocols, and the inverse of a non-elementary function for protocols with leaders.
1. Introduction
Population protocols are a model of computation in which an arbitrary number of indistinguishable finite-state agents interact in pairs to decide if their initial global configuration satisfies a given property. Population protocols were introduced in [5, 6] to study the theoretical properties of networks of mobile sensors with very limited computational resources, but they are also very strongly related to chemical reaction networks, a discrete model of chemistry in which agents are molecules that change their states due to collisions.

Population protocols decide a property by stable consensus. Each state of an agent is assigned a binary output (yes/no), and in a correct protocol starting at a global configuration, all agents eventually reach a consensus by reaching the set of states whose output is the correct answer to the question "does the initial configuration satisfy the property?", and staying in it forever. A typical example of a property decidable by population protocols is majority: initially agents are in one of two initial states, say A and B, and the property to be decided is whether the number of agents in A is larger than the number of agents in B or not. In a seminal paper, Angluin et al. showed that population protocols can decide exactly the properties expressible in Presburger arithmetic, the first-order theory of addition [9].

In order to define the runtime of a protocol one assumes that at each step a pair of agents is selected uniformly at random and allowed to interact. The parallel runtime is then defined as the expected number of interactions until a stable consensus is reached (i.e. until the property is decided), divided by the number of agents.

∗ This work was supported by an ERC Advanced Grant (787367: PaVeS) and by the Research Training Network of the Deutsche Forschungsgemeinschaft (DFG) (378803395: ConVeY).
Even though the parallel runtime is computed using a discrete model, under reasonable, commonly accepted assumptions, the result coincides with the runtime of a continuous-time stochastic model. Many papers have investigated the parallel runtime of population protocols, and several landmark results have been obtained. In [6] it was shown that every Presburger property can be decided in O(n log n) parallel time, where n is the number of agents, and [8] showed that population protocols with a fixed number of leaders can compute all Presburger predicates in polylogarithmic parallel time. (Loosely speaking, leaders are auxiliary agents that do not form part of the population of "normal" agents, but can interact with them to help them decide the property.) More recent results have studied protocols for majority in which the number of states grows with the number of agents, and shown that polylogarithmic time is achievable by protocols without leaders, even for very slow growth functions, see e.g. [2, 3, 4, 15, 18].

However, many protocols have a high number of states. For example, a quick estimate shows that the fast protocol for majority implicitly described in [8] has tens of thousands of states. This is an obstacle to implementations of protocols in chemistry, where the number of states corresponds to the number of chemical species participating in the reactions. Moreover, the number of states is of fundamental importance because it plays the role of memory in sequential computational models: the total memory available to a protocol is the product of the logarithm of the number of states and the number of agents. Despite these facts, the state complexity of a Presburger property, defined as the minimal number of states of any protocol deciding the property, has received comparatively little attention. In [13, 12] Blondin et al.
have shown that every predicate representable by a boolean combination of threshold and modulo constraints (every Presburger formula can be put into this form) of length n, with numbers encoded in binary, can be decided by a protocol with p(n) states, for some polynomial p. In particular, it is not difficult to see that every property of the form x ≥ η, asking whether the number of agents is at least η, can be decided by a leaderless protocol with O(log η) states, and that x ≥ η has protocols (with leaders) with O(log log η) states.

(Notice that the time-space trade-off results of [2, 3, 4, 15, 18] refer to a more general model in which the number of states of a protocol grows with the number n of agents; in other words, a property is decided by a family of protocols, one for each value of n. Trade-off results bound the growth rate needed to compute a predicate within a given time. We study the minimal number of states of a single protocol that decides the property for all n.)

However, to the best of our knowledge, there exist no lower bounds on the state complexity, i.e. a bound showing that a protocol for x ≥ η needs Ω(f(η)) states for some function f. This question, which was left open in [13], is notoriously hard due to its relation to fundamental questions in the theory of Vector Addition Systems.

In this paper we show that every protocol, with or without leaders, needs a number of states that, roughly speaking, grows like the inverse Ackermann function, and (our main result) that every leaderless protocol for x ≥ η needs Ω(log log log η) states. The proof of the first bound relies on results on the maximal length of controlled antichains of N^d, a topic in combinatorics with a long tradition in the study of Vector Addition Systems and other models of computation, see e.g. [22, 17, 1, 25, 10]. The triple exponential bound requires us to develop new theory for a generalisation of the antichain condition.

The paper is organised as follows.
Section 2 introduces population protocols, state complexity, and its inverse, the busy beaver function for population protocols. Instead of lower bounds on state complexity, for convenience we present upper bounds on the busy beaver function. Section 3 presents some results on the mathematical structure of stable sets of configurations that are used throughout the paper. Section 4 shows an Ackermannian upper bound on the busy beaver function, valid for protocols with or without leaders, and explains why this surprisingly large bound might be optimal. Section 5 gives a triple exponential upper bound on the busy beaver function for leaderless protocols.
2. Population Protocols and State Complexity
For sets A, B we write A^B to denote the set of functions f : B → A. If B is finite we call the elements of N^B multisets over B, and the elements of R^B vectors of dimension |B|. Arithmetic operations on vectors in R^B are defined as usual, extending the vectors with zeroes if necessary. For example, if B ⊆ B′, x ∈ R^B and y ∈ R^{B′} then x + y ∈ R^{B′} is defined by (x + y)_b = x_b + y_b, where x_b = 0 for every b ∈ B′ \ B. For x, y ∈ R^B we write x ≤ y if x_i ≤ y_i for all i ∈ B, and x ⪇ y if x ≤ y and x ≠ y. Abusing language we identify an element b ∈ B with the one-element multiset containing it, i.e. with the x ∈ N^B such that x_b = 1 and x_i = 0 for i ≠ b. We also write |x| := Σ_{b∈B} x_b for the total number of elements in a multiset x ∈ N^B, and 1 to denote the all-ones vector of appropriate dimension. Finally, given a vector v ∈ R^k, we define ‖v‖ = Σ_{i=1}^k |v_i| and ‖v‖_∞ = max_{i=1}^k |v_i|.

We recall the population protocol model of [5, 7], with explicit mention of leader agents. A population protocol is a tuple P = (Q, T, L, X, I, O) where Q is a finite set of states; T ⊆ Q² × Q² is a set of transitions; L ∈ N^Q is the leader multiset; X is a finite set of input variables; I : X → Q is the input mapping; and O : Q → {0, 1} is the output mapping.

Inputs and configurations. An input is a multiset v ∈ N^X such that |v| ≥ 2, and a configuration is a multiset m ∈ N^Q such that |m| ≥ 2. Intuitively, a configuration represents a population of agents where m(q) denotes the number of agents in state q. The initial configuration for input v is defined as IC(v) := L + Σ_{x∈X} v(x)·I(x). Abusing language, throughout the paper we write IC(i) instead of IC(i·x) to denote the initial configuration for input i ∈ N, if P has a unique input variable, i.e. X = {x}. The output O(m) of a configuration m is b if O(q) = b for every q ∈ Q with m(q) ≥ 1, and undefined otherwise. So a population has output b if all agents have output b.

Executions.
A transition t = ((p, q), (p′, q′)) is enabled in a configuration m if m ≥ p + q, and disabled otherwise. As |m| ≥ 2, every configuration enables at least one transition. If t is enabled in m, then it can be fired, leading to the configuration v := m − p − q + p′ + q′, which we denote m →t v. Given a sequence σ = t_1 t_2 ... t_n of transitions, we write m →σ v if there exist configurations m_1, m_2, ..., m_{n−1} such that m →t_1 m_1 →t_2 m_2 ⋯ m_{n−1} →t_n v, and m →* m′ if m →σ m′ for some sequence σ ∈ T*. For every set of transitions T′ ⊆ T, we write m →T′ m′ if m →t m′ for some t ∈ T′, and m →T′* m′ if m →σ m′ for some sequence σ ∈ T′*. Given a set M of configurations, m →* M denotes that m →* m′ for some m′ ∈ M.

An execution is a sequence of configurations σ = m_0 m_1 ... such that m_i → m_{i+1} for every i ∈ N. The output O(σ) of σ is b if there exists i ∈ N such that O(m_i) = O(m_{i+1}) = ⋯ = b, otherwise O(σ) is undefined.

Executions have the monotonicity property: if m_0 m_1 m_2 ... is an execution, then for every configuration m the sequence (m_0 + m)(m_1 + m)(m_2 + m)... is an execution too. We often say that a statement holds "by monotonicity", meaning that it is a consequence of the monotonicity property.

Computations.
An execution σ = m_0 m_1 ... is fair if for every configuration m the following holds: if |{i ∈ N : m_i →* m}| is infinite, then |{i ∈ N : m_i = m}| is infinite. In other words, fairness ensures that an execution cannot forever avoid a configuration that stays reachable. We say that a population protocol computes a predicate ϕ : N^X → {0, 1} (or decides the property represented by the predicate) if for every v ∈ N^X every fair execution σ starting from IC(v) satisfies O(σ) = ϕ(v). Two protocols are equivalent if they compute the same predicate. It is known that population protocols compute precisely the Presburger-definable predicates [9].

Example 1.
Let P_n = (Q, T, ∅, {x}, I, O) be the protocol where Q := {0, 1, 2, 3, ..., 2^n}, I(x) := 1, O(a) = 1 iff a = 2^n, and for each a, b ∈ Q the set T of transitions contains ((a, b), (0, a + b)) if a + b < 2^n, and ((a, b), (2^n, 2^n)) if a + b ≥ 2^n. It is readily seen that P_n computes x ≥ 2^n with 2^n + 1 states. Intuitively, each agent stores a number, initially 1. When two agents meet, one of them stores the sum of their values and the other one stores 0, with sums capping at 2^n. Once an agent reaches this cap, all agents eventually get converted to 2^n.

Now consider the protocol P′_n = (Q′, T′, ∅, {x}, I′, O′), where Q′ := {0, 2^0, 2^1, ..., 2^n}, I′(x) := 2^0, O′(a) = 1 iff a = 2^n, and T′ contains ((2^i, 2^i), (0, 2^{i+1})) for each 0 ≤ i < n, and ((a, 2^n), (2^n, 2^n)) for each a ∈ Q′. It is easy to see that P′_n also computes x ≥ 2^n, but more succinctly: while P_n has 2^n + 1 states, P′_n has only n + 2 states.
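As a sanity check on the succinct protocol P′_n of Example 1 (as reconstructed above), the following sketch exhaustively explores the reachable configurations of P′_2 (threshold 2^2 = 4) for small inputs and verifies stable consensus; all names and the choice of n = 2 are ours, for illustration only.

```python
from itertools import combinations

N_EXP = 2            # the "n" of Example 1; the threshold is 2**N_EXP = 4
TOP = 2 ** N_EXP

def successors(conf):
    """All configurations reachable from `conf` in one interaction of P'_n."""
    out = set()
    conf = list(conf)
    for i, j in combinations(range(len(conf)), 2):
        for a, b, ia, ib in ((conf[i], conf[j], i, j), (conf[j], conf[i], j, i)):
            if a == b and 0 < a < TOP:      # doubling rule ((2^i, 2^i), (0, 2^{i+1}))
                nxt = conf[:]; nxt[ia], nxt[ib] = 0, 2 * a
                out.add(tuple(sorted(nxt)))
            if b == TOP:                    # conversion rule ((a, 2^n), (2^n, 2^n))
                nxt = conf[:]; nxt[ia], nxt[ib] = TOP, TOP
                out.add(tuple(sorted(nxt)))
    return out

def reachable(conf):
    start = tuple(sorted(conf))
    seen, todo = {start}, [start]
    while todo:
        for s in successors(todo.pop()):
            if s not in seen:
                seen.add(s); todo.append(s)
    return seen

for m in range(2, 7):
    confs = reachable((1,) * m)             # input: m agents in state I'(x) = 1
    if m < TOP:
        # stable 0-consensus: the accepting state 2^n never appears at all
        assert all(TOP not in c for c in confs)
    else:
        # stable 1-consensus: the all-2^n configuration stays reachable from
        # every reachable configuration, so every fair execution ends there
        assert all((TOP,) * m in reachable(c) for c in confs)
print("P'_2 computes x >= 4 on inputs 2..6")
```

Increasing `N_EXP` and the input range checks larger thresholds, at a correspondingly higher exploration cost.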
Leaderless protocols. A protocol P = (Q, T, L, X, I, O) is leaderless if L = ∅, and has |L| leaders otherwise. Protocols with leaders and leaderless protocols compute the same predicates [9]. For L = ∅ we have

λ IC(v) + λ′ IC(v′) = λ (L + Σ_{x∈X} v(x)·I(x)) + λ′ (L + Σ_{x∈X} v′(x)·I(x))
                    = λ Σ_{x∈X} v(x)·I(x) + λ′ Σ_{x∈X} v′(x)·I(x)
                    = IC(λv + λ′v′)

for all inputs v, v′ and λ, λ′ ∈ N. In other words, any linear combination of initial configurations with natural coefficients is also an initial configuration.

Informally, the state complexity of a predicate is the minimal number of states of the protocols that compute it. Given n, we would like to define the function STATE(n) as the maximum state complexity of the predicates of size at most n. However, defining the size of a predicate requires fixing a representation. Population protocols compute exactly the predicates expressible in Presburger arithmetic [9], and so there are at least three natural representations: formulas of Presburger arithmetic, existential formulas of Presburger arithmetic, and semilinear sets [19]. However, the translations between these representations involve superexponential blow-ups. For this reason we focus on threshold predicates of the form x ≥ η, for which the size of the predicate is just the size of η, independently of whether the predicate is described as a formula or a semilinear set. We choose to encode numbers in unary, and so we define STATE(η) as the number of states of the smallest protocol computing x ≥ η.

The inverse of STATE is the function that assigns to a number n the largest η such that a protocol with n states computes x ≥ η. Recall that the busy beaver function assigns to a number n the largest η such that a Turing machine with n states started on a blank tape writes η consecutive ones on the tape and terminates.
Due to this analogy, we call the inverse of STATE the busy beaver function for population protocols, and call protocols computing predicates of the form x ≥ η busy beaver protocols, or just busy beavers.

Definition 1. The busy beaver function BB : N → N is defined as follows: BB(n) is the largest η ∈ N such that the predicate x ≥ η is computed by some leaderless protocol with at most n states. The function BB_L(n) is defined analogously, but for general protocols, possibly with leaders.

In [13] Blondin et al. give lower bounds on the busy beaver function:
Theorem 2 ([13]). BB(n) ∈ 2^{Ω(n)} and BB_L(n) ∈ 2^{2^{Ω(n)}}.

However, to the best of our knowledge no upper bounds have been given.
3. Mathematical Structure of Stable Sets
A set M of configurations is downward closed if m ∈ M and m′ ≤ m implies m′ ∈ M. A pair (µ, S), where µ is a configuration and S ⊆ Q, is a base element of M if µ + N^S ⊆ M. A base of M is a finite set B of base elements such that M = ⋃_{(µ,S)∈B} (µ + N^S). It is well known (an easy consequence of Dickson's lemma) that every downward-closed set of configurations has a base. We define the norm of a base element (µ, S) as ‖(µ, S)‖_∞ := ‖µ‖_∞, and the norm of a base as the maximal norm of its elements. We apply these notions to the stable configurations of the protocol:

Definition 2.
Let b ∈ {0, 1}. A configuration m is b-stable if O(m′) = b for every configuration m′ reachable from m. The set of b-stable configurations is denoted SC_b.

It follows easily from the definitions that a population protocol computes a predicate ϕ : N^X → {0, 1} iff IC(v) →* SC_0 for every input v satisfying ϕ(v) = 0, and IC(v) →* SC_1 for every input v satisfying ϕ(v) = 1.

Lemma 3.
Let P be a protocol with n states. For every b ∈ {0, 1} the set SC_b is downward closed and has a base of norm at most 2^{(2n+1)!+1}. In particular, SC_b has a base with at most ϑ(n) := 2^{(2n+2)!} elements.

Proof. We first show that SC_b is downward closed, by showing that its complement ¬SC_b is upward closed. Assume m ∈ ¬SC_b and m′ ≥ m. We prove m′ ∈ ¬SC_b. Since m ∈ ¬SC_b we have m →* m″ for some m″ such that O(m″) ≠ b. By monotonicity, m′ = m + (m′ − m) →* m″ + (m′ − m), and since O(m″) ≠ b we have O(m″ + (m′ − m)) ≠ b. So m′ ∈ ¬SC_b.

For the second part, let β := 2^{(2n+1)!} and fix a b-stable configuration m. Let S := {q ∈ Q : m_q ≥ 2β}, and define µ ≤ m as follows: µ_i := m_i for i ∉ S and µ_i := 2β for i ∈ S. Since µ ≤ m and m is b-stable, so is µ. We show that (µ, S) is a base element of SC_b, which proves the result. Assume the contrary. Then some configuration m′ ∈ µ + N^S is not b-stable. So m′ →* m″ for some m″ satisfying m″(q) ≥ 1 for some q ∈ Q with O(q) ≠ b; we say that m″ covers q. By Rackoff's Theorem [24], m″ can be chosen so that m′ →σ m″ for a sequence σ of length 2^{2^{O(n log n)}}; a concrete bound is |σ| ≤ β (see [16, Theorem 3.12.11]). Since a transition moves at most two agents out of a given state, σ moves at most 2β agents out of a state. So, by the definition of µ, the sequence σ is also executable from µ, and also leads to a configuration that covers q. But this contradicts that µ is b-stable.

To prove the bound on the number of elements of the base, observe that the number of pairs (µ, S) such that µ has norm at most k and S ⊆ Q is at most (k + 2)^n. Indeed, for each state q there are at most k + 2 possibilities: q ∈ S, or q ∉ S and 0 ≤ µ(q) ≤ k. So ϑ(n) ≤ (2^{(2n+1)!+1} + 2)^n ≤ 2^{(2n+2)!}.

From now on we use the following terminology:

Definition 3. A b-base is a base of SC_b of norm at most 2^{(2n+1)!+1}, and its elements are called b-base elements.
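The base representation of a downward-closed set is directly executable. The sketch below (a hypothetical 3-state example; all names are ours) implements the membership tests m ∈ µ + N^S and m ∈ M for a set M given by a base:

```python
Q = ["a", "b", "c"]   # states of a hypothetical 3-state protocol

def in_base_element(m, mu, S):
    # m ∈ mu + N^S iff m equals mu outside S and dominates mu inside S
    return all(m[q] >= mu[q] if q in S else m[q] == mu[q] for q in Q)

def in_set(m, base):
    # M is the union over its base elements (mu, S) of mu + N^S
    return any(in_base_element(m, mu, S) for mu, S in base)

# base of the downward-closed set "at most one agent in state c"
# (any number of agents in "a" and "b")
base = [({"a": 0, "b": 0, "c": 0}, {"a", "b"}),
        ({"a": 0, "b": 0, "c": 1}, {"a", "b"})]

assert in_set({"a": 5, "b": 2, "c": 1}, base)
assert not in_set({"a": 5, "b": 2, "c": 2}, base)
```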
4. A General Upper Bound on the Busy Beaver Function
Our general strategy to find upper bounds for the busy beaver function BB_L(n) is as follows:

(1) Prove a "Pumping Lemma" stating that if a protocol rejects two inputs a < b satisfying certain conditions, then it rejects all inputs of the form a + λ(b − a) for every λ ≥ 0, and so does not compute the predicate x ≥ η.

(2) Using the Pumping Lemma, we reduce the existence of the inputs a and b to the existence of a finite sequence of vectors of dimension n satisfying certain purely combinatorial properties. Moreover, the size of b is linked to the length of the sequence.

(3) Say a sequence satisfying the properties of (2) is bad (it implies that the protocol does not compute x ≥ η), and otherwise good. We provide a bound B(n) on the maximal length of good sequences.

It follows from (1)-(3) that a protocol with n states cannot compute x ≥ η for any η ≥ B(n). Indeed, if η ≥ B(n) then every sequence of vectors of dimension n and length η is bad. So the sequence satisfies the conditions of the Pumping Lemma, and so the protocol rejects all inputs of the form a + λ(b − a). In this section we follow this strategy to provide an upper bound valid for all protocols. In the next section we apply it again, albeit in a more sophisticated way, to obtain a far better upper bound for leaderless protocols.

Fix a protocol P = (Q, T, L, {x}, I, O) with |Q| = n. We start by stating and proving the Pumping Lemma.

Lemma 4 (Pumping Lemma). If there exist inputs a and b, a 0-base element (µ, S), and configurations m_a, m_b ∈ µ + N^S satisfying (1) m_a ≤ m_b, (2) IC(a) →* m_a, and (3) m_a + IC(b − a) →* m_b, then P rejects a + λ(b − a) for every λ ≥ 0.

Proof. We first claim that m_a + IC(λ(b − a)) →* m_a + λ(m_b − m_a) holds for every λ ≥ 0 (*). The proof is by induction on λ. The basis λ = 0 is trivial. For the induction step let λ ≥ 1. Due to monotonicity, we have

IC(a + λ(b − a)) = IC(a) + IC(b − a) + IC((λ − 1)(b − a))
  →* m_a + IC(b − a) + IC((λ − 1)(b − a))           by (2)
  →* m_a + (m_b − m_a) + IC((λ − 1)(b − a))          by (3)
  →* m_a + (m_b − m_a) + (λ − 1)(m_b − m_a)          by (*)
   = m_a + λ(m_b − m_a)

and the claim is proved. By (1) and m_a, m_b ∈ µ + N^S we have m_a + λ(m_b − m_a) ∈ µ + N^S for every λ ≥ 0. So, by (2) and the claim, SC_0 is reachable from IC(a + λ(b − a)) for every λ ≥ 0. So P rejects a + λ(b − a) for every λ ≥ 0.

Our goal now is to find a bound B(n) such that for every protocol with at most n states there are inputs a < b ≤ B(n) rejected by the protocol and satisfying conditions (1)-(3) of the Pumping Lemma. For this, observe that for every rejected input i we have IC(i) →* m_i for some configuration m_i ∈ SC_0, and so m_i ∈ µ_i + N^{S_i} for some 0-base element (µ_i, S_i). Further, by Lemma 3 we can assume that the 0-base has at most ϑ(n) elements. The triple (µ_i, S_i, m_i) can be seen as a "rejection certificate" for i. The certificate can be verified by checking that m_i ∈ µ_i + N^{S_i}, and finding σ such that IC(i) →σ m_i.

Definition 4.
For every rejected input 2 ≤ i ≤ η − 1, the triple cert(i) := (µ_i, S_i, m_i) is the (rejection) certificate of i. We call (µ_i, S_i) the type of cert(i). The certificate sequence of P is the sequence cert(2) cert(3) ... cert(η − 1).

The conditions of the Pumping Lemma can now be reformulated as follows: there are two inputs a, b such that their certificates have the same type and satisfy condition (1). Since there are at most ϑ(n) types, if the number of rejected inputs exceeds ϑ(n), then there are two inputs a, b such that cert(a) and cert(b) have the same type. However, their certificates need not yet satisfy condition (1). To solve this problem we examine the certificate sequence in more detail. More precisely, we examine the sequence m_2 m_3 ⋯ m_{η−1}. Using the terminology of [17], it is a linearly controlled sequence, meaning that there is a linear control function f : N → N satisfying |m_i| ≤ f(i). Indeed, since IC(i) →* m_i, we have |m_i| = |IC(i)| = |L| + i, and so we can take f(n) = |L| + n.

This allows us to use a result on linearly controlled sequences from [17]. Say a finite sequence v_0, v_1, ..., v_s of vectors of the same dimension is bad if there are two indices 0 ≤ i_1 < i_2 ≤ s such that v_{i_1} ≤ v_{i_2}. Dickson's Lemma shows that every infinite sequence of vectors contains a bad prefix, and the result extends from two to three or any other finite number of indices i_1 < i_2 < ⋯.

The maximal length of good linearly controlled sequences has been studied in [22, 17, 10], and the results have been used to bound the runtime of algorithms checking properties of a number of computational models, including Vector Addition Systems and Counter Machines [21]. The following lemma follows easily from results of [17]. (Unfortunately, in [17] bad sequences are called "good". We risk this confusion to maintain the convention that a bad sequence is a witness of the fact that the protocol does not compute x ≥ η.)
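The badness condition is easy to test on a finite sequence. The following helper (ours, purely illustrative) searches for the witness pair i_1 < i_2 with v_{i_1} ≤ v_{i_2}:

```python
def first_bad_pair(seq):
    """Return (i1, i2) with i1 < i2 and seq[i1] <= seq[i2] componentwise,
    or None if the finite sequence is good."""
    for i2 in range(len(seq)):
        for i1 in range(i2):
            if all(x <= y for x, y in zip(seq[i1], seq[i2])):
                return (i1, i2)
    return None

# a good sequence in dimension 2: no later vector dominates an earlier one
assert first_bad_pair([(1, 2), (2, 1), (0, 3)]) is None
# (1, 1) <= (1, 2) makes this sequence bad
assert first_bad_pair([(2, 0), (1, 1), (1, 2)]) == (1, 2)
```

By Dickson's Lemma every infinite sequence of vectors of N^n contains such a pair; the control function bounds how late the pair can appear.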
Lemma 5 ([17]). For every δ ∈ N and for every elementary function g : N → N, there exists a function F_{δ,g} : N → N at level F_ω of the Fast Growing Hierarchy satisfying the following property: for every infinite sequence v_0, v_1, v_2, ... of vectors of N^n satisfying |v_i| ≤ i + δ, there exist indices i_1 < i_2 < ⋯ < i_{g(n)} ≤ F_{δ,g}(n) such that v_{i_1} ≤ v_{i_2} ≤ ⋯ ≤ v_{i_{g(n)}}.

For the definition of the Fast Growing Hierarchy see [17]. For our purposes it suffices to know that F_ω contains functions that, crudely speaking, grow like the Ackermann function. Using the lemma we obtain:

Theorem 6.
Let P be a population protocol with n states and ℓ leaders computing a predicate x ≥ η for some η ≥ 2. Then η < F_{ℓ,ϑ}(n), where ϑ(n) is the function of Lemma 3.

Proof. We inductively define a sequence m_2, m_3, m_4, ... of configurations of SC_0 ∪ SC_1 satisfying:

(i) IC(i) →* m_i for every i ≥ 2.
(ii) m_i + (j − i)·I(x) →* m_j for every 2 ≤ i ≤ j. (Observe that m_i + (j − i)·I(x) is the result of adding (j − i) agents in state I(x) to m_i.)

Since P computes x ≥ η, for every i ≥ 2 the protocol P starting at IC(i) eventually reaches SC_0 or SC_1 (depending on whether i < η or i ≥ η), and stays there forever. We define m_2, m_3, m_4, ... as follows. First, we let m_2 be any configuration of SC_0 ∪ SC_1 reachable from IC(2). Then, for every i ≥ 2, assume that m_i has already been defined and satisfies IC(i) →* m_i. Observe that IC(i + 1) = IC(i) + I(x). Since IC(i) →* m_i, we also have IC(i + 1) = IC(i) + I(x) →* m_i + I(x). This execution can be extended to a fair run, which eventually reaches SC_0 ∪ SC_1. Let m_{i+1} be any configuration of SC_0 ∪ SC_1 reachable from m_i + I(x).

Let us show that m_2, m_3, m_4, ... satisfies (i) and (ii). Property (i) holds for m_2 by definition, and for i ≥ 2 because IC(i + 1) = IC(i) + I(x) →* m_i + I(x) →* m_{i+1}. For property (ii), by monotonicity and the definition of m_i we have for every 2 ≤ i ≤ j:

m_i + (j − i)·I(x) →* m_{i+1} + (j − i − 1)·I(x) →* ⋯ →* m_{j−1} + I(x) →* m_j

Assume η > F_{ℓ,ϑ}(n). By Lemma 5 there exist ϑ(n) + 1 indices i_0 < i_1 < ⋯ < i_{ϑ(n)} ≤ F_{ℓ,ϑ}(n) such that m_{i_0} ≤ m_{i_1} ≤ ⋯ ≤ m_{i_{ϑ(n)}}. Since P computes x ≥ η and i_{ϑ(n)} ≤ F_{ℓ,ϑ}(n) < η, every m_{i_j} belongs to SC_0. By the definition of ϑ and the pigeonhole principle, there are a < b and a 0-base element (µ, S) such that m_{i_a}, m_{i_b} ∈ µ + N^S. Rename a := i_a and b := i_b. By property (ii) we have m_a + (b − a)·I(x) →* m_b. Since m_a, m_b ∈ µ + N^S, applying Lemma 4 we get that P rejects a + λ(b − a) for every λ ≥ 0. This contradicts that P computes x ≥ η.

The function F_{ℓ,ϑ}(n) grows so fast that one can doubt that the bound is even remotely close to optimal. However, recent results show that this would be less strange than it seems. If a protocol P computes a predicate x ≥ η, then η is the smallest number i such that IC(i) →* SC_1. Therefore, letting BBP(n) denote the set of busy beaver protocols with at most n states, and letting SC_1^P and IC_P denote the set SC_1 and the initial mapping of the protocol P, we obtain:

BB_L(n) = max_{P ∈ BBP(n)} min {i ∈ N | ∃ m ∈ SC_1^P : IC_P(i) →* m}

Consider now two deceptively similar functions. Let
All_1 be the set of configurations m such that O(m) = 1, i.e. all agents are in states with output 1. Further, let Some_1 be the set of configurations m such that O(m) ≠ 0, i.e. at least one agent is in a state with output 1. Finally, let PP(n) denote the set of all protocols with alphabet X = {x}, possibly with leaders, and n states. Notice that we also include the protocols that do not compute any predicate. Define

f_1(n) = max_{P ∈ PP(n)} min {i ∈ N | ∃ m ∈ All_1^P : IC_P(i) →* m}
f_2(n) = max_{P ∈ PP(n)} min {i ∈ N | ∃ m ∈ Some_1^P : IC_P(i) →* m}

Using recent results on Petri nets and Vector Addition Systems [14, 20] it is easy to prove that f_1(n) grows faster than any elementary function. However, a recent result [11] by Balasubramanian et al. shows that f_1(n) is polynomial in n for leaderless protocols! Finally, f_2(n) is at most double exponential in n by a classical result on the coverability problem for Vector Addition Systems [24] due to Rackoff.

These results show that a non-elementary bound on BB_L(n) might well be optimal. However, we now prove that this can only hold for population protocols with leaders. We show BB(n) ∈ 2^{2^{2^{O(n)}}}, i.e. leaderless busy beavers with n states can only compute predicates x ≥ η for numbers η at most triple exponential in n.

The scheme of the proof is similar to that of Theorem 6. In particular, we also rely on a lemma bounding the length of certain good sequences of vectors. However, the definition of a good sequence is a different one, which to the best of our knowledge has not been studied yet. Therefore, instead of resorting to [17] we develop the machinery to prove the lemma ourselves.

The paper [20] considers protocols with one leader, and studies the problem of moving from a configuration with the leader in a state q_in and all other agents in another state r_in, to a configuration with the leader in a state q_f and all other agents in state r_f.
The paper uses results from [14] to show that the smallest number of agents for which this is possible grows faster than any elementary function in the number of states of the protocol. The problem reduces to reaching a configuration with at least one agent in a state of output 1. Rackoff shows that, if this is possible at all, then the configuration can be reached in at most 2^{2^{O(n log n)}} steps, which can only involve 2^{2^{O(n log n)}} agents.

5. An Upper Bound for Leaderless Protocols

The main property of leaderless protocols, already mentioned in Section 2, is that, loosely speaking, initial configurations are closed under linear combinations:

λ IC(i) + λ′ IC(j) = IC(λi + λ′j) for every i, j, λ, λ′ ∈ N

This extends to executions: if IC(i) →* m_i and IC(j) →* m_j, then IC(λi + λ′j) →* λ m_i + λ′ m_j. We make use of this property throughout the section without explicit mention. Observe also that even if the coefficients of the linear combination are nonnegative rationals, we can always multiply them by a suitable constant and obtain an execution.

We reuse the proof strategy described at the beginning of Section 4. In Section 5.1 we state and prove a Pumping Lemma showing that under certain conditions the protocol rejects infinitely many inputs, contradicting that it computes x ≥ η. In Section 5.2 we introduce QCT-sequences of vectors, and show that a bound on the length of the so-called good QCT-sequences implies a bound on η. Finally, Section 5.3 derives a bound on the length of good QCT-sequences.
Notation.
For the rest of the section we fix a leaderless population protocol P = (Q, T, ∅, {x}, I, O) with n states. We first introduce some preliminaries, then formulate a first version of the Pumping Lemma (Lemma 7), and then strengthen it, yielding the final version (Lemma 10).
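The closure of leaderless executions under sums can be tested exhaustively on a toy protocol (a single pairing transition; the example and names are ours): if IC(i) →* m and IC(j) →* m′, then IC(i + j) →* m + m′.

```python
# Toy leaderless protocol: states "u" (input) and "p", single transition
# ((u, u), (p, p)).  Configurations are pairs (#u, #p).
def successors(conf):
    u, p = conf
    return {(u - 2, p + 2)} if u >= 2 else set()

def reachable(i):
    start = (i, 0)                    # IC(i): i agents in the input state
    seen, todo = {start}, [start]
    while todo:
        for s in successors(todo.pop()):
            if s not in seen:
                seen.add(s); todo.append(s)
    return seen

# sums of reachable configurations are reachable from the summed input
for i in range(2, 6):
    for j in range(2, 6):
        big = reachable(i + j)
        for m in reachable(i):
            for m2 in reachable(j):
                assert (m[0] + m2[0], m[1] + m2[1]) in big
print("IC(i) ->* m and IC(j) ->* m' imply IC(i+j) ->* m+m'")
```

The check succeeds precisely because there are no leaders: with L ≠ ∅ the sum of two initial configurations would contain two copies of L and would not be initial.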
Potential reachability and a first Pumping Lemma.
Let t = ((p, q), (p′, q′)) be a transition. By the definition of the reachability relation, for every two configurations m, m′, if m →t m′ then m′ = m − p − q + p′ + q′. This motivates the following definition:

Definition 5.
Let t = ((p, q), (p′, q′)) be a transition. The effect of t is the vector ∆(t) := p′ + q′ − p − q ∈ Z^Q. Given a sequence σ = t_1 ... t_n of transitions, we define its Parikh vector →σ ∈ N^T as the vector mapping each transition to its number of occurrences in σ. We can then define the effect of σ as ∆(σ) := Σ_{t∈T} (→σ)_t ∆(t) = Σ_{t∈σ} ∆(t).

Observe that, since a transition only involves two agents, all components of ∆(t) lie in the interval [−2, 2]. Moreover, m →σ m′ implies m′ = m + ∆(σ). In particular, m′ depends only on the effect of σ. We introduce some useful notations:

Definition 6.
For every sequence σ ∈ T* of transitions, we write m ⇒σ m′ if m′ = m + ∆(σ). Further, we write m ⇒* m′ if there exists σ such that m ⇒σ m′, and say that m′ is potentially reachable from m.

We make the following observations:

• If m →σ m′ then m ⇒σ m′, but the converse does not hold in general.
• For fixed configurations m, m′, whether m ⇒σ m′ holds or not depends only on the Parikh vector →σ.
• If m ⇒σ m′ and m ≥ 2|σ|·1, then m →σ m′. This follows immediately from the fact that each transition moves exactly two agents.

We formulate a first version of a pumping lemma, which we will then strengthen.
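Definitions 5 and 6 are directly executable. The sketch below (names ours) computes ∆(σ) from the transitions of σ and decides potential reachability m ⇒σ m′; the example also shows why ⇒ is strictly weaker than →:

```python
from collections import Counter

def effect(t):
    # Δ(t) = p' + q' - p - q for a transition t = ((p, q), (p', q'))
    (p, q), (p2, q2) = t
    d = Counter([p2, q2])
    d.subtract([p, q])
    return d

def seq_effect(seq):
    # Δ(σ) = Σ_{t ∈ σ} Δ(t); depends only on the Parikh vector of σ
    total = Counter()
    for t in seq:
        total.update(effect(t))
    return total

def potentially_reaches(m, m2, seq):
    # m ⇒σ m2 iff m2 = m + Δ(σ), checked componentwise
    d = seq_effect(seq)
    states = set(m) | set(m2) | set(d)
    return all(m2.get(s, 0) == m.get(s, 0) + d.get(s, 0) for s in states)

t1 = (("a", "a"), ("b", "b"))
t2 = (("b", "b"), ("a", "a"))
# Δ(t1 t2) = 0, so m ⇒^{t1 t2} m for every m; but from {a:1, b:1} the
# sequence is NOT executable (t1 needs two agents in "a")
assert potentially_reaches({"a": 1, "b": 1}, {"a": 1, "b": 1}, [t1, t2])
```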
Lemma 7. If there is γ ∈ N and inputs s, a, b ∈ N such that

(1) IC(s) →* m for some m ≥ 2γ·1,
(2) m + IC(a) →* µ + N^S for some 0-base element (µ, S), and
(3) IC(b) ⇒σ N^S for some σ such that |σ| ≤ γ,

then P rejects s + a + λb for every λ ≥ 0.

Proof. By (2) and (3) there exist configurations v, u ∈ N^S s.t. m + IC(a) →* µ + v and IC(b) ⇒σ u. Since a transition removes at most two agents from a state, and we have m ≥ 2γ·1 and |σ| ≤ γ, the sequence σ is enabled at m, and so also at m + IC(b). By (3) we obtain m + IC(b) →σ m + u. So we get

IC(s + a + λb) = IC(s) + IC(a) + λ IC(b)
  →* m + IC(a) + λ IC(b)                                     by (1)
  →* m + IC(a) + (λ − 1) IC(b) + u →* ⋯ →* m + IC(a) + λu     by (3)
  →* µ + v + λu                                               by (2)

Since (µ, S) is a 0-base element and v, u ∈ N^S, we have µ + v + λu ∈ SC_0, and so P rejects s + a + λb.

A stronger pumping lemma.
In the rest of the section we show that Lemma 7 can be strengthened by fixing the values of γ and s, that is, one only needs to look for inputs a and b in order to “pump”. We first show that the sequence σ can always be chosen of size at most (n + 1)4^{2n}. Lemma 8.
If IC(i) =∗⇒ N^S for some input i ≥ 1, then IC(j) =σ⇒ N^S for some input j ≥ 1 and a sequence σ of length at most (n + 1)4^{2n}.
Proof. We construct a linear system of inequalities, any integer solution of which will yield a desired sequence σ with IC(j) =σ⇒ N^S for some j. Then we apply a well-known bound, showing that a small solution exists.
To construct this system, we use what is known in the analysis of Petri nets as the marking equation (see e.g. [23]). Since the order of transitions in σ does not matter, we consider the Parikh vector σ⃗ ∈ N^T of σ, as defined above. Defining A : Q × T → Z as the matrix where the t-th column is precisely Δ(t) for t ∈ T, we get that the effect of the whole sequence is simply Δ(σ) = A·σ⃗.
Therefore the statement IC(j) =σ⇒ N^S is equivalent to the following system of linear inequalities over the vector u of variables, where v_i := (A_{it})_{t∈T} denotes the i-th row of A, for i ∈ Q, and x ∈ Q is the unique initial state of P:
∃u ∈ N^T :  v_x^T u ≤ −1,  v_i^T u = 0 for all i ∈ Q \ S with i ≠ x,  v_i^T u ≥ 0 for all i ∈ S with i ≠ x.
We know that the above system has a solution for u, so it also has an integer solution with coefficients at most (n + 1)4^n. This bound follows from a result by von zur Gathen and Sieveking [27], combined with the estimate that any square submatrix of A has a determinant of absolute value at most 4^n. The bound on the determinants follows directly from a suitable Laplace expansion of the columns, as each transition t ∈ T has ‖Δ(t)‖ ≤ 4. Since there are |T| ≤ n^4 different transitions, the total number of transitions in σ is at most n^4·(n + 1)4^n ≤ (n + 1)4^{2n}.
This allows us to fix γ := (n + 1)4^{2n}. Now we fix s. We prove a lemma showing that for every number γ there is an input from which we can reach a configuration with at least γ agents in each state (we assume wlog that every state can be populated from some input, otherwise we can remove the state). The proof is in the Appendix. Lemma 9.
For every γ ∈ N there exists an s ≤ γ·n·2^n such that IC(s) −∗→ m for some configuration m ≥ γ.
This allows us to fix s := 2γ·n·2^n ≤ (n + 1)4^{3n}. So together with Lemma 7 we finally get:
Lemma 10 (Pumping Lemma). Let γ := (n + 1)4^{2n} and s := 2γ·n·2^n. Let m_γ be a configuration satisfying m_γ ≥ 2γ and IC(s) −∗→ m_γ, which exists by Lemma 9. If there exist inputs a, b ≥ 1 such that
1. m_γ + IC(a) −∗→ µ + N^S for some 0-base element (µ, S), and
2. IC(b) =∗⇒ N^S,
then P rejects (s + a + λb) for every λ ≥ 0.
QCT-sequences
As we did in Section 4, we define a notion of certificate that an input is rejected, in this case an input larger than s. Assume P computes x ≥ η. By Lemma 9, and since P rejects every input below η, for every i with s + i < η there exist sequences of transitions σ_i and π_i, a 0-base element (µ_i, S_i) and a configuration m_i ∈ N^{S_i} such that
IC(s + i) −σ_i→ m_γ + IC(i) −π_i→ µ_i + m_i   (∗)
Further, for every i we can assume that the sequence σ_iπ_i has minimal length and, by Lemma 3, that (µ_i, S_i) has norm at most 2^{(2n+1)!+1}. Observe that execution (∗) proves IC(s + i) −∗→ SC, and so that P rejects s + i. We define the rejection certificate of i as follows.
Definition 7. Let ℓ := η − 1 − s, and let i = 1, ..., ℓ. The (rejection) certificate of i is the tuple cert(i) := (µ_i, S_i, m_i, pv_i), where µ_i, S_i, m_i are as in (∗), and pv_i is the Parikh vector of the sequence σ_iπ_i, defined as the mapping pv_i : T → N that assigns to every transition the number of times it occurs in σ_iπ_i. Further, we say that S_i is the colour of i. The certificate sequence of P is the sequence Cert = cert(1) cert(2) ... cert(ℓ).
Observe that the type of a certificate (µ_i, S_i, m_i, pv_i) is N^Q × 2^Q × N^Q × N^T, with the constraint that for every q ∈ S_i we have µ_i(q) = 0, and for q ∉ S_i we have m_i(q) = 0. In Section 4 we saw that the certificates introduced there were good linearly controlled sequences, which allowed us to use existing results. For leaderless protocols we proceed in the same way, with the difference that now we cannot resort to the literature, but have to develop the theory ourselves.
Controlled QCT-sequences.
We first show that Cert is also a controlled sequence in a certain sense. The proof is straightforward and can be found in the Appendix.
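The control conditions that the next lemma establishes for Cert can also be checked mechanically. The following sketch is our own illustration — the dict-based encoding of certificates and the function names are not from the paper:

```python
# Sketch: checking the "controlled" conditions on a certificate sequence.
# A certificate is (mu, colour, m, pv), where mu and m map states to
# counts and pv maps transitions to counts. Encoding is illustrative.

def norm(vec):
    """1-norm of a vector given as a dict from keys to non-negative ints."""
    return sum(vec.values())

def is_controlled(certs, s, alpha, beta):
    """Check, for i = 1..len(certs):  ||mu_i|| <= beta,
    ||mu_i|| + ||m_i|| = s + i,  and
    ||mu_i|| + ||m_i|| + ||pv_i|| <= (s + i)**alpha."""
    for i, (mu, _colour, m, pv) in enumerate(certs, start=1):
        if norm(mu) > beta:
            return False
        if norm(mu) + norm(m) != s + i:
            return False
        if norm(mu) + norm(m) + norm(pv) > (s + i) ** alpha:
            return False
    return True

# Toy sequence with s = 2: the sizes ||mu_i|| + ||m_i|| are 3 and 4.
certs = [
    ({"p": 1}, frozenset({"q"}), {"q": 2}, {"t1": 1}),
    ({"p": 1}, frozenset({"q"}), {"q": 3}, {"t1": 2}),
]
print(is_controlled(certs, s=2, alpha=2, beta=1))  # True
```

For the protocol-derived sequence Cert, the constants s, α, β are of course the astronomically larger values given in Corollary 12; the toy values above only exercise the three conditions.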
Lemma 11. Let Cert = cert(1) ... cert(ℓ) be the certificate sequence of P, where cert(i) = (µ_i, S_i, m_i, pv_i). We have ‖µ_i‖ ≤ n·2^{(2n+1)!+1}, ‖µ_i‖ + ‖m_i‖ = s + i, and ‖µ_i‖ + ‖m_i‖ + ‖pv_i‖ ≤ (s + i)^n for all i = 1, ..., ℓ.
Lemma 11 motivates the next definition:
Definition 8. Let Q, C, T be disjoint finite sets of states, colours, and transitions. A QCT-tuple is a four-tuple qct = (µ, c, m, pv), where µ, m ∈ N^Q, c ∈ C, and pv ∈ N^T. A QCT-sequence is a finite sequence of QCT-tuples. A QCT-sequence τ = qct_1, ..., qct_ℓ, where qct_i = (µ_i, c_i, m_i, pv_i), is controlled if there are constants s, α, β such that ‖µ_i‖ ≤ β, ‖µ_i‖ + ‖m_i‖ = s + i, and ‖µ_i‖ + ‖m_i‖ + ‖pv_i‖ ≤ (s + i)^α for all 1 ≤ i ≤ ℓ. We call s, α, β the control parameters of τ and write I(c) := {i : c_i = c} for the indices of elements with colour c ∈ C. We can now reformulate Lemma 11 as:
Corollary 12. Cert is a controlled QCT-sequence with C = 2^Q, and control parameters s ≤ (n + 1)4^{3n}, α = n, and β = n·2^{(2n+1)!+1}.
Linear combinations and good controlled QCT-sequences.
We now show that Cert satisfies a property playing the same role as “goodness” of linearly controlled sequences, but stronger. Intuitively, this makes it much harder to produce long good sequences, which leads to a triple-exponential bound instead of a non-elementary one.
Definition 9. Let qct_1, ..., qct_k be QCT-tuples of the same colour c, where qct_i = (µ_i, c, m_i, pv_i). A tuple (µ, m, pv) ∈ R^Q × R^Q × R^T is a linear combination of qct_1, ..., qct_k if there are coefficients λ_1, ..., λ_k ∈ R such that (µ, m, pv) = Σ_{i=1}^{k} λ_i (µ_i, m_i, pv_i).
Let τ = (qct_i)_{i=1,...,ℓ} be a QCT-sequence. A colour c ∈ C is bad if there is a linear combination qct = (µ, c, m, pv) of (qct_i)_{i∈I(c)} such that µ = 0, m ⪈ 0 (that is, m ≥ 0 and m ≠ 0), and pv ≥ 0. A QCT-sequence is bad if at least one colour is bad, and good otherwise.
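Definition 9 asks for a real linear combination; deciding badness exactly would be a linear program. For intuition only, a small brute-force search over integer coefficients (a sketch of ours, sufficient for tiny hand-made examples but not a decision procedure) can exhibit a bad colour:

```python
# Sketch: exhibiting a bad colour by brute-forcing small integer
# coefficients y with  sum_i y_i*mu_i = 0,  sum_i y_i*m_i >= 0 (and != 0),
# and  sum_i y_i*pv_i >= 0.  Vectors are plain tuples; purely illustrative.
from itertools import product

def combine(vectors, coeffs):
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors))
                 for k in range(len(vectors[0])))

def find_bad_witness(tuples, bound=2):
    """tuples: list of (mu, m, pv) of one colour. Search for coefficients
    y in {-bound..bound}^k witnessing badness as in Definition 9."""
    mus, ms, pvs = zip(*tuples)
    for y in product(range(-bound, bound + 1), repeat=len(tuples)):
        mu = combine(mus, y)
        m = combine(ms, y)
        pv = combine(pvs, y)
        if (all(x == 0 for x in mu) and all(x >= 0 for x in m)
                and any(x > 0 for x in m) and all(x >= 0 for x in pv)):
            return y
    return None

# Two tuples with equal mu and a larger m later, as in the discussion
# below: a scaled variant of the witness (-1, 1) is found first.
same_colour = [((1, 0), (2,), (1,)), ((1, 0), (3,), (2,))]
print(find_bad_witness(same_colour))  # → (-2, 2)
```

A single tuple can never be bad on its own (the only combination with µ = 0 is the zero combination), which the search confirms by returning None.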
Before showing that Cert is a good QCT-sequence, let us give some intuition for this definition. First of all, let us compare the bad sequences of Section 4 with the ones of Definition 9. In Section 4, certificates were triples (µ_i, c_i, m_i), while now they have an extra component (µ_i, c_i, m_i, pv_i). To ease the comparison, ignore the pv component for the moment. A sequence of certificates is bad in the sense of Section 4 if there are indices i < j such that c_i = c_j (i.e. the certificates have the same colour), µ_i = µ_j, and m_j ≥ m_i. So we have µ_j − µ_i = 0 and m_j − m_i ≥ 0, which implies m_j − m_i ⪈ 0 (if m_j − m_i = 0 then µ_i + m_i = µ_j + m_j, and so s + i = ‖µ_i‖ + ‖m_i‖ = ‖µ_j‖ + ‖m_j‖ = s + j, which can only occur if i = j). It follows that the linear combination (µ, m, pv) := −(µ_i, m_i, pv_i) + (µ_j, m_j, pv_j) satisfies the conditions of Definition 9, and so the sequence is also bad in the sense of this definition. But Definition 9 is far more permissive. The sequence is still bad if, for example, we find indices i_1, i_2, j_1, j_2 whose certificates have the same colour and µ-component, and satisfy m_{j_1} + m_{j_2} ≥ m_{i_1} + m_{i_2}; more generally, it is even enough to find (distinct) multisets of indices I and J satisfying |I| = |J| and Σ_{j∈J} m_j ≥ Σ_{i∈I} m_i. So, loosely speaking, while in Section 4 we must wait until we see m_i ≤ m_j for some indices i < j to declare badness, now it suffices to find two multisets I and J of the same size satisfying Σ_{i∈I} m_i ≤ Σ_{j∈J} m_j. Intuitively, this makes it much harder to construct a long good sequence, leading to a triple-exponential bound on the maximal length of good sequences, instead of the non-elementary bound of Section 4. Lemma 13.
Cert is a good QCT-sequence.
Proof. Assume Cert is bad. Then there is a bad colour c and a linear combination (µ, m, pv) of {cert(i) : i ∈ I(c)} that satisfies the conditions of Definition 9. We prove that there exist inputs a and b fulfilling the conditions of the Pumping Lemma (Lemma 10), which contradicts the assumption that P computes x ≥ η.
Let (µ_i, c, m_i, pv_i) := cert(i) for i ∈ I(c) and let y : I(c) → R denote the coefficients of the linear combination (µ, m, pv), meaning that we have Σ_i y_i µ_i = 0, Σ_i y_i m_i ⪈ 0, and Σ_i y_i pv_i ≥ 0. These conditions are invariant under scaling of y, so we may assume wlog that y_i ∈ Z for i ∈ I(c).
As we already noted, potential reachability depends only on the Parikh vector of the transition sequence. So we will extend =⇒ to Parikh vectors by writing m =pv⇒ v for pv ∈ N^T if m =σ⇒ v for some sequence σ ∈ T∗ with σ⃗ = pv. Note that m =pv⇒ v is thus equivalent to m + Σ_{t∈T} pv(t)·Δ(t) = v.
Recall that due to Definition 7, we have
IC(s + i) −σ_i→ m_γ + IC(i) −π_i→ µ_i + m_i   (∗)
for every i ∈ I(c) and sequences σ_i, π_i ∈ T∗, where pv_i = σ⃗_i + π⃗_i.
Let us now define inputs a and b fulfilling the conditions of the Pumping Lemma (Lemma 10). For a we simply pick any element j ∈ I(c) and set a := j. By (∗), condition (1) of Lemma 10 holds for a and (µ, S) := (µ_j, c). It remains to prove (2). Set b := Σ_i y_i (s + i) and pv := Σ_i y_i pv_i. Since Σ_i y_i µ_i = 0, Σ_i y_i m_i ⪈ 0, and Σ_i y_i pv_i ≥ 0, we have
IC(b) = IC(Σ_i y_i (s + i)) = Σ_i y_i IC(s + i) =pv⇒ Σ_i y_i (µ_i + m_i) = Σ_i y_i m_i ⪈ 0.
Since m_i ∈ N^c for i ∈ I(c) we get IC(b) =∗⇒ N^c \ {0}. Transitions preserve the total number of agents, so b > 0.
Bounding good controlled QCT-sequences
We obtain a bound on the length of a good controlled QCT-sequence with control parameters s, α, β. More precisely, our goal is to prove the following theorem:
Theorem 14. The length ℓ of a good QCT-sequence with control parameters s, α, and β satisfies
log ℓ ≤ (log β + 1 + α log(s + 1)) · (3 + α)^{|C|(2|Q|+|T|)}.
Observe that this is a purely combinatorial question, motivated by, but independent from, population protocols.
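For concreteness, the right-hand side of Theorem 14 is easy to evaluate for given parameters. The sketch below is our own illustration (base-2 logarithms are our choice; the base of the logarithm only shifts constants):

```python
# Sketch: evaluating the right-hand side of Theorem 14,
#   log l <= (log beta + 1 + alpha*log(s+1)) * (3+alpha)^(|C|*(2|Q|+|T|)),
# for concrete control parameters. Illustrative only.
from math import log2

def theorem14_bound(s, alpha, beta, n_colours, n_states, n_transitions):
    exponent = n_colours * (2 * n_states + n_transitions)
    return (log2(beta) + 1 + alpha * log2(s + 1)) * (3 + alpha) ** exponent

# Tiny instance: s=1, alpha=1, beta=2, |C|=1, |Q|=1, |T|=1.
print(theorem14_bound(1, 1, 2, 1, 1, 1))  # (1+1+1) * 4**3 = 192.0
```

Even these tiny parameters make the doubly exponential dependence on |Q| visible once one recalls that, for protocols, |C| = 2^n and |T| grows polynomially in n.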
Notation. We collect a number of notations used in the rest of the section.
• τ denotes a QCT-sequence with control parameters s, α, β.
• c ∈ C denotes an arbitrary colour of τ.
• I(c) denotes the set of indices of the elements of τ of colour c.
• For i ∈ I(c), qct_i = (µ_i, c, m_i, pv_i) denotes the i-th element of τ.
• For i ∈ I(c), u_i denotes the concatenation of the vectors m_i and pv_i, for which we use the notation u_i = (m_i; pv_i).
• I∗(c) ⊆ I(c) denotes the set of indices i ∈ I(c) s.t. (u_i; µ_i) is linearly independent from {(u_j; µ_j) : j ∈ I(c), j < i}.
We proceed in several steps:
• In Section 5.3.1 we use Farkas’ Lemma to construct a certificate of goodness for a colour c. A certificate of goodness is a mapping that assigns a real number, called a weight, to each dimension of µ_i and u_i. The mapping itself is called a weighting. We show how to compute basic weightings as the unique solution of a system of equations (Lemma 16).
• In Section 5.3.2 we bound the size of a basic weighting, and transform this bound into a bound on the length of τ (Lemma 19). However, the bound still depends on the size of the vectors u_i, with i ∈ I∗(c).
• In Section 5.3.3 we remove this dependence and prove Theorem 14.
We start by formally defining weightings.
Definition 10. A vector (y, z), where y ∈ R^Q and z ∈ R^Q × R^T, is a weighting for the colour c, also called a c-weighting, if z ≥ 0 and y^T µ_i + z^T u_i = −(s + i) for all i ∈ I(c).
We use Farkas’ Lemma to prove that the existence of a c-weighting is a certificate of goodness for colour c.
Lemma 15. A colour c is good iff it has a weighting.
Proof. As stated in Definition 9, c is a bad colour iff
∃x ∈ R^{I(c)} : Σ_i x_i µ_i = 0 and Σ_i x_i pv_i ≥ 0 and Σ_i x_i m_i ⪈ 0.   (1)
Let A_1 := ((µ_i^T)_{i∈I(c)})^T be the matrix where column i is µ_i, and A_2 := ((u_i^T)_{i∈I(c)})^T. Now (1) is equivalent to
∃x ∈ R^{I(c)} : A_1 x = 0 and A_2 x ≥ 0 and ‖Σ_i x_i m_i‖ > 0.   (2)
Since τ is a controlled QCT-sequence we have 1^T(µ_i + m_i) = ‖µ_i‖ + ‖m_i‖ = s + i. If we assume further that A_1 x = 0 and A_2 x ≥ 0, then ‖Σ_i x_i m_i‖ > 0 is equivalent to
0 < ‖Σ_i x_i m_i‖ = 1^T Σ_i x_i m_i = 1^T (Σ_i x_i m_i + Σ_i x_i µ_i) = Σ_i x_i 1^T(µ_i + m_i) = Σ_i x_i (s + i) =: b^T x,
defining b ∈ R^{I(c)} as b_i := s + i for i ∈ I(c). Hence c is bad iff
∃x ∈ R^{I(c)} : A_1 x = 0 and A_2 x ≥ 0 and b^T x > 0.   (3)
By Farkas’ Lemma, (3) is infeasible iff
∃y ∈ R^Q, z ∈ R^{Q∪T} : A_1^T y + A_2^T z = −b and z ≥ 0.   (4)
So c is good iff (4) is feasible. Moreover, (4) is equivalent to (y, z) being a c-weighting.
A good colour may have multiple weightings, even an infinite convex set of weightings. Similarly to basic solutions of a linear program, we introduce basic weightings of a colour, whose size we will bound using simple linear algebra. Recall that I∗(c) denotes the set of indices i ∈ I(c) s.t. (u_i; µ_i) is linearly independent from {(u_j; µ_j) : j ∈ I(c), j < i}. Two properties of a basic weighting are of interest: (1) it is the unique solution of a linear system of equations, and (2) it has at most |I∗(c)| nonzero components. The proof is a straightforward application of well-known properties of linear inequalities, and is given in the appendix. Lemma 16.
Let c be a good colour. Then there are Y ⊆ Q, Z ⊆ Q ∪ T with |Y| + |Z| = |I∗(c)| such that the system y^T µ_i + z^T u_i = −s − i, for all i ∈ I∗(c), has a unique solution y ∈ R^Y, z ∈ R^Z, and (y, z) is a c-weighting. We refer to such a (y, z) as a basic c-weighting.
The next step is showing that the existence of a basic weighting implies an upper bound on the length of the QCT-sequence. We begin by showing a general bound on a unique solution to a linear system of equations. Again, the proof is routine linear algebra, and can be found in the Appendix.
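The elimination argument behind that bound can be made concrete. The following sketch (our own illustration, not the paper's code) solves a square integer system while keeping every intermediate entry integral, using the same row-update rule analysed in the Appendix proof; the doubling of entry sizes under this rule is exactly what the coming bound quantifies:

```python
# Sketch: integer-preserving Gaussian elimination. To eliminate a
# variable we replace row j by  A[i][i]*row_j - A[j][i]*row_i, which
# keeps all entries integral. Encoding and names are illustrative.
from fractions import Fraction

def solve_integer_system(A, b):
    """Solve A x = b for a square integer system with a unique solution,
    returning the solution as a list of Fractions."""
    n = len(A)
    A = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix
    for i in range(n):
        # pick a pivot row r >= i with a nonzero entry in column i
        r = next(r for r in range(i, n) if A[r][i] != 0)
        A[i], A[r] = A[r], A[i]
        for j in range(n):
            if j != i and A[j][i] != 0:
                p, q = A[i][i], A[j][i]
                A[j] = [p * x - q * y for x, y in zip(A[j], A[i])]
    return [Fraction(A[i][n], A[i][i]) for i in range(n)]

# 2x + y = 3 and x - y = 0 have the unique solution x = y = 1.
print(solve_integer_system([[2, 1], [1, -1]], [3, 0]))
# → [Fraction(1, 1), Fraction(1, 1)]
```

Note the entries of the modified matrix can roughly square at each step; bounding that growth row by row is the content of the next lemma.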
Lemma 17. Let Ax = b denote a linear system of equations with unique solution x, where A ∈ Z^{d×d}, and let g(i) ≥ log max {|A_{ij}| : j} ∪ {|b_i|} denote an upper bound for each row i. Then log ‖x‖∞ ≤ W(g, d), where
W(g, d) := 2^{d−1} − 1 + Σ_{t=1}^{d−1} 2^{d−1−t} g(t) + g(d).
We now use the previous lemma to prove an upper bound on the components of some c-weighting, for each colour c, based on the sizes of the linearly independent vectors µ_i, u_i with i ∈ I∗(c). To refer to these sizes, we set {l_1, ..., l_d} := I∗(c) with l_1 < ... < l_d, and define g_c(i) := log(‖µ_{l_i}‖ + ‖u_{l_i}‖) for i = 1, ..., d. We remark that Definition 8 immediately gives the estimate g_c(i) ≤ α log(s + l_i).
Lemma 18. For each colour c and d := |I∗(c)|, there is a c-weighting (y, z) with log ‖(y; z)‖∞ ≤ W(g_c, d).
Proof. Lemma 16 allows us to construct a c-weighting as the solution to a specific set of linear equations. In particular, we set A to the matrix with A_{ij} := (µ_i)_j for i ∈ I∗(c), j ∈ Y and A_{ij} := (u_i)_j for i ∈ I∗(c), j ∈ Z, and define b as b_i := −s − i for i ∈ I∗(c). Then A(y; z) = b has as unique solution y ∈ R^Y, z ∈ R^Z, where (y, z) is a c-weighting. Now our desired bound follows simply by applying Lemma 17. (Note that |b_i| = s + i ≤ ‖µ_i‖ + ‖u_i‖, so the g_c bound also covers the right-hand side b.)
From this upper bound we can derive a bound on the length of the sequence (restricted to a specific colour c), using that the weights for the u_i must be nonnegative. Lemma 19.
For any colour c and d := |I∗(c)|, we have log max I(c) ≤ log β + W(g_c, d).
Proof. Let (y, z) denote a c-weighting fulfilling the bound of Lemma 18. Hence for every i ∈ I(c) we have y^T µ_i + z^T u_i = −s − i. We know that z^T u_i ≥ 0, as z, u_i ≥ 0, so i ≤ −y^T µ_i − s ≤ ‖y‖∞ ‖µ_i‖. By Definition 8, ‖µ_i‖ ≤ β, which we can plug into the bound of Lemma 18 to get the desired statement.
The bound of Lemma 19 still depends on g_c, i.e. the sizes of the elements with indices in I∗(c). We now show how to move from this bound to the one of Theorem 14. The proof that the expression of Theorem 14 is indeed a bound proceeds by induction on d, i.e. assuming that the bound is correct when I∗(c) contains d linearly independent vectors, we show that it remains correct when it contains d + 1. For this, observe that in controlled sequences a bound on the length of the sequence yields a bound on the size of its vectors. So we use the sizes of the first d linearly independent vectors to derive a bound on the length of the sequence until the (d + 1)-th linearly independent vector, which yields a bound on the size of this vector.
There is a slight complication in that the induction needs to be performed for all colours at once, instead of separately for each colour. Our induction variable is thus the total number of linearly independent vectors (of all colours), which we refer to as P. The induction hypothesis also needs to be chosen carefully. We use that the upper bound on max I(c) (from Lemma 19) is bounded by f(P) for a suitable function f.
Theorem 14. The length ℓ of a good QCT-sequence with control parameters s, α, and β satisfies
log ℓ ≤ (log β + 1 + α log(s + 1)) · (3 + α)^{|C|(2|Q|+|T|)}.
Proof.
Let d_c := |I∗(c)| for c ∈ C and P := Σ_c d_c ≤ |C|(2|Q| + |T|). We will prove the stronger statement that G_c(d_c) ≤ f(P) for all colours c with d_c > 0, where
f(P) := (log β + 1 + α log(s + 1)) · (3 + α)^{P−1}  and  G_c(r) := log β + 2^{r−1} − 1 + Σ_{t=1}^{r−1} 2^{r−1−t} g_c(t) + g_c(r).
This is a stronger statement due to Lemma 19, which shows that log ℓ ≤ G_c(d_c) for some colour c with I(c) ≠ ∅ and thus d_c > 0. The proof will proceed by induction on P.
In the base case we have P = 1 and thus g_c(1) ≤ α log(s + 1) for each c ∈ C, hence G_c(d_c) ≤ f(1) (with d_c ≤ 1).
For the induction step, let j := max ∪_c I∗(c) denote the last index of any linearly independent u_i, i.e. the last index at which P increases, and let c_0 denote the colour of j. For all colours c ≠ c_0, the value of G_c(d_c) does not change, so the induction hypothesis yields G_c(d_c) ≤ f(P − 1) and thus G_c(d_c) ≤ f(P).
For colour c_0, we use the induction hypothesis to get log(j − 1) ≤ f(P − 1) and G_{c_0}(d_{c_0} − 1) ≤ f(P − 1). The size g_{c_0}(d_{c_0}) (i.e. of the vector at index j) can, using the former, be bounded as g_{c_0}(d_{c_0}) ≤ α log(s + j) ≤ α(f(P − 1) + 1). (Here we used log(s + 1) ≤ f(P − 1) and log(a + b) ≤ log(a) + 1 for a ≥ b.) This is then combined with the latter:
G_{c_0}(d_{c_0}) ≤ 2·G_{c_0}(d_{c_0} − 1) + g_{c_0}(d_{c_0}) ≤ 2f(P − 1) + α(f(P − 1) + 1) ≤ (3 + α)f(P − 1) = f(P).
Let us put all the pieces together. Let P be a leaderless protocol with n states computing a predicate x ≥ η, and let s := (n + 1)4^{3n} be the constant of the Pumping Lemma (Lemma 10). We prove η ≤ 2^{2^{2^{O(n)}}}. If η ≤ s then we are done. So assume that η > s.
• Since P rejects inputs s, s + 1, ..., η −
1, the certificate sequence Cert of Definition 7 has length ℓ = η − 1 − s.
• By Corollary 12, Cert is a controlled QCT-sequence with set C := 2^Q of colours, and control parameters s, α := n, and β = n·2^{(2n+1)!+1}. Further, by Lemma 13, Cert is good.
• By Theorem 14, the length ℓ of Cert satisfies
log ℓ ≤ (log β + 1 + α log(s + 1)) · (3 + α)^{|C|(2|Q|+|T|)},
where |C| = 2^{|Q|} = 2^n, |Q| = n, and |T| ≤ n^4 (each transition is determined by a four-tuple of states). This expression is 2^{2^{O(n)}}.
• So η = ℓ + s + 1 is bounded by 2^{2^{2^{O(n)}}}.
This yields a triple-exponential bound on the busy beaver function for leaderless protocols (see Lemma 23 for the precise bound):
Theorem 20. BB(n) ≤ 2^{2^{2^{n + 5 log n + 2}}}, and so STATE(n) ∈ Ω(log log log n).
6. Conclusion
We have obtained the first non-trivial lower bounds on the state complexity of population protocols, a fundamental but very hard question about the model. The obvious open questions are to close the gap between the Ω(log log log n) lower bound and the O(log n) upper bound for the leaderless case, and the even larger gap for protocols with leaders between the lower bound, which is the inverse of a non-elementary function, and (roughly speaking) the O(log log n) upper bound.
References
[1] Sergio Abriola, Santiago Figueira, and Gabriel Senno. Linearizing well quasi-orders and bounding the length of bad sequences.
Theor. Comput. Sci. , 603:3–22, 2015.[2] Dan Alistarh, James Aspnes, David Eisenstat, Rati Gelashvili, and Ronald L.Rivest. Time-space trade-offs in population protocols. In
Proc. 28th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2560–2579. SIAM, 2017. doi:10.1137/1.9781611974782.169.[3] Dan Alistarh, James Aspnes, and Rati Gelashvili. Space-optimal majority in population protocols. In
SODA , pages 2221–2239. SIAM, 2018.[4] Dan Alistarh and Rati Gelashvili. Recent algorithmic advances in populationprotocols.
SIGACT News , 49(3):63–73, 2018.[5] Dana Angluin, James Aspnes, Zoë Diamadi, Michael J. Fischer, and René Peralta.Computation in networks of passively mobile finite-state sensors. In
PODC , pages290–299. ACM, 2004.[6] Dana Angluin, James Aspnes, Zoë Diamadi, Michael J. Fischer, and René Per-alta. Computation in networks of passively mobile finite-state sensors.
DistributedComputing , 18(4):235–253, 2006.[7] Dana Angluin, James Aspnes, Zoë Diamadi, Michael J. Fischer, and René Peralta.Computation in networks of passively mobile finite-state sensors.
Distributed Comput. ,18(4):235–253, 2006.[8] Dana Angluin, James Aspnes, and David Eisenstat. Fast computation by populationprotocols with a leader.
Distributed Comput. , 21(3):183–199, 2008.[9] Dana Angluin, James Aspnes, David Eisenstat, and Eric Ruppert. The computationalpower of population protocols.
Distributed Comput., 20(4):279–304, 2007.[10] A. R. Balasubramanian. Complexity of controlled bad sequences over finite sets of N^d. In LICS, pages 130–140. ACM, 2020.[11] A. R. Balasubramanian, Javier Esparza, and Mikhail A. Raskin. Finding cut-offs in leaderless rendez-vous protocols is easy.
CoRR, abs/2010.09471, 2020. To appear in Proceedings of FOSSACS 2021.[12] Michael Blondin, Javier Esparza, Blaise Genest, Martin Helfrich, and Stefan Jaax. Succinct population protocols for Presburger arithmetic. In
STACS , volume 154 of
LIPIcs , pages 40:1–40:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020.[13] Michael Blondin, Javier Esparza, and Stefan Jaax. Large flocks of small birds: onthe minimal size of population protocols. In
STACS , volume 96 of
LIPIcs, pages 16:1–16:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018.[14] Wojciech Czerwinski, Slawomir Lasota, Ranko Lazic, Jérôme Leroux, and Filip Mazowiecki. The reachability problem for Petri nets is not elementary. In
STOC ,pages 24–33. ACM, 2019.[15] Robert Elsässer and Tomasz Radzik. Recent results in population protocols forexact majority and leader election.
Bull. EATCS, 126, 2018.[16] Javier Esparza. Petri nets lecture notes, 2019. URL: https://archive.model.in.tum.de/um/courses/petri/SS2019/PNSkript.pdf.[17] Diego Figueira, Santiago Figueira, Sylvain Schmitz, and Philippe Schnoebelen. Ackermannian and primitive-recursive bounds with Dickson’s lemma. In
LICS , pages269–278. IEEE Computer Society, 2011.[18] Leszek Gąsieniec and Grzegorz Stachowiak. Enhanced phase clocks, populationprotocols, and fast space optimal leader election.
J. ACM, 68(1), 2020.[19] Christoph Haase. A survival guide to Presburger arithmetic.
ACM SIGLOG News ,5(3):67–82, 2018.[20] Florian Horn and Arnaud Sangnier. Deciding the existence of cut-off in parameterizedrendez-vous networks. In
CONCUR , volume 171 of
LIPIcs , pages 46:1–46:16. SchlossDagstuhl - Leibniz-Zentrum für Informatik, 2020.[21] Jérôme Leroux and Sylvain Schmitz. Reachability in vector addition systems isprimitive-recursive in fixed dimension. In
LICS , pages 1–13. IEEE, 2019.[22] Ken McAloon. Petri nets and large finite sets.
Theor. Comput. Sci. , 32:173–183,1984.[23] Tadao Murata. Petri nets: Properties, analysis and applications.
Proceedings of theIEEE , 77(4):541–580, 1989.[24] Charles Rackoff. The covering and boundedness problems for vector addition systems.
Theor. Comput. Sci., 6:223–231, 1978.[25] Sylvain Schmitz. Complexity hierarchies beyond elementary.
ACM Trans. Comput.Theory , 8(1):3:1–3:36, 2016.[26] Alexander Schrijver.
Theory of Linear and Integer Programming . John Wiley &Sons, Inc., USA, 1986.[27] Joachim von zur Gathen and Malte Sieveking. A bound on solutions of linear integerequalities and inequalities.
Proceedings of the American Mathematical Society ,42(1):155–158, 1978.
A. Appendix
A.1. Proof of Lemma 9
Lemma 9.
For every γ ∈ N there exists an s ≤ γ·n·2^n such that IC(s) −∗→ m for some configuration m ≥ γ.
Proof. Let Q_i := {q ∈ Q : IC(2^i) −∗→ q + m, m ∈ N^Q} \ Q_{i−1} for i = 1, 2, ..., and Q_0 := {x} the set containing just the initial state. Intuitively, Q_i contains the states reachable starting with 2^i agents, but not with 2^{i−1}. We know that each state is reachable from a configuration IC(s) for some s ∈ N, so Q = ∪_{i≥0} Q_i.
It suffices to prove that Q_i = ∅ implies Q_{i+1} = ∅, as then Q = ∪_{i=0}^{n−1} Q_i and each q ∈ Q is reachable starting from IC(2^{n−1}). Assume that this is not the case, i.e. there exist some i and q ∈ Q_i with Q_{i−1} = ∅ and IC(2^i) −σ→ q + m for some m ∈ N^Q. We pick such a q which minimises the length of σ.
This means that the last transition of σ is (q_1, q_2) ↦ (q, q_3) for some q_1, q_2, q_3 ∈ Q. Additionally, q_1, q_2 ∉ Q_i, as they are reachable by shorter sequences. But then, since Q_{i−1} = ∅, we have q_1, q_2 ∈ ∪_{j=0}^{i−2} Q_j, i.e. q_1 and q_2 are each reachable from IC(2^{i−2}). Hence IC(2^{i−1}) −∗→ q_1 + q_2 + m′ −→ q + q_3 + m′ for some m′ ∈ N^Q, contradicting q ∈ Q_i.
A.2. Proof of Lemma 11
Lemma 11. Let Cert = cert(1) ... cert(ℓ) be the certificate sequence of P, where cert(i) = (µ_i, S_i, m_i, pv_i). We have ‖µ_i‖ ≤ n·2^{(2n+1)!+1}, ‖µ_i‖ + ‖m_i‖ = s + i, and ‖µ_i‖ + ‖m_i‖ + ‖pv_i‖ ≤ (s + i)^n for all i = 1, ..., ℓ.
Proof. Let r := s + i. Since IC(s + i) −∗→ µ_i + m_i, we have ‖µ_i‖ + ‖m_i‖ = s + i = r. Further, ‖µ_i‖ ≤ n·‖µ_i‖∞ ≤ n·2^{(2n+1)!+1} follows from Lemma 3. For ‖pv_i‖ we know that it is the number of transitions of a shortest execution leading from IC(r) to µ_i + m_i. Since a shortest execution visits a configuration at most once, the length is bounded by the number of configurations with r agents, which is equal to C(n + r − 1, r). Using r ≥ 3 and n ≥ 2, we obtain
‖µ_i‖ + ‖m_i‖ + ‖pv_i‖ ≤ r + C(n + r − 1, r) = r + Π_{j=1}^{n−1} (r + j)/j ≤ r + (r + 1)·r^{n−2} ≤ r^{n−2}(2r + 1) ≤ r^n.
A.3. Proof of Lemma 16
We need a well-known elementary result from linear algebra.
Theorem 21 ([26, Theorem 7.1]). Let m, n ∈ N, and let a_1, ..., a_m, b ∈ R^n denote vectors. We write t := rank{a_1, ..., a_m, b} for the dimension of the subspace spanned by a_1, ..., a_m and b. Then exactly one of the following holds.
1. b is a nonnegative linear combination of linearly independent vectors from a_1, ..., a_m.
2. There is a c ∈ R^n with c^T b < 0 and c^T a_i ≥ 0 for i = 1, ..., m, where c^T a_i = 0 for t − 1 linearly independent a_i.
The following is an immediate consequence:
Corollary 22. Let A ∈ R^{n×m} and P := {x ∈ R^m : Ax = b, x ≥ 0} be nonempty. Then it has a solution x ∈ P with at most n nonzero components.
Proof. Let a_i denote the i-th column of A. If statement (2) of Theorem 21 were to hold, then for any x ∈ P we would have 0 > c^T b = c^T Ax, where both c^T A and x are nonnegative. This is a contradiction, so (1) must hold instead, which directly implies the desired statement, as A has at most n linearly independent columns.
Now we proceed to prove the Lemma. Lemma 16.
Let c be a good colour. Then there are Y ⊆ Q, Z ⊆ Q ∪ T with |Y| + |Z| = |I∗(c)| such that the system y^T µ_i + z^T u_i = −s − i, for all i ∈ I∗(c), has a unique solution y ∈ R^Y, z ∈ R^Z, and (y, z) is a c-weighting. We refer to such a (y, z) as a basic c-weighting.
Proof. By definition, y ∈ R^Q, z ∈ R^{Q∪T} is a c-weighting iff y^T µ_i + z^T u_i = −s − i for all i ∈ I(c) and z ≥ 0. We know that this system of linear inequalities is feasible, as a c-weighting exists, so its set of solutions does not change if we consider only linearly independent rows, and we get y^T µ_i + z^T u_i = −s − i for i ∈ I∗(c), z ≥ 0. The statement then follows by considering a basic solution of the corresponding linear program.
For completeness, we provide an alternative argument involving Corollary 22. Let y, z denote a solution to the above system, A the matrix where the i-th row is [µ_i −µ_i u_i] (i.e. the concatenation of µ_i, −µ_i, u_i) for i ∈ I∗(c), and set x := [y⁺ y⁻ z], where y⁺, y⁻ ≥ 0 split y into positive and negative components fulfilling y = y⁺ − y⁻. Then Ax = b, where b_i := −s − i. Applying Corollary 22 we then find a solution x with at most |I∗(c)| nonzero components. Picking an x with a minimal number of nonzero components then yields corresponding y, z with a total of at most |I∗(c)| nonzero components. We then define Y and Z as the support of y and z, respectively. If the solution (y, z) were not unique, then it would also be possible to construct a solution (y′, z′), and thus an x′, with smaller support, but that would contradict our choice of x.
A.4. Proof of Lemma 17
Lemma 17.
Let Ax = b denote a linear system of equations with unique solution x, where A ∈ Z^{d×d}, and let g(i) ≥ log max {|A_{ij}| : j} ∪ {|b_i|} denote an upper bound for each row i. Then log ‖x‖∞ ≤ W(g, d), where
W(g, d) := 2^{d−1} − 1 + Σ_{t=1}^{d−1} 2^{d−1−t} g(t) + g(d).
Proof. We perform at most d − 1 elimination steps to isolate x_p, p := argmax_i |x_i|, the largest component of x (modifying A in the process). In iteration i = 1, ..., d − 1, let v denote the i-th row of A. We know that v = 0 cannot occur, as A has full rank, so either the only nonzero element of v is in column p and we solve directly for x_p, or we use v to eliminate one variable from the rest of A, taking care to leave only integer elements. For example, to eliminate element i from another row v′, we would update that row to v_i v′ − v′_i v (similarly for the right-hand side b).
Let a_{ij} := log max {|A_{jt}| : t} ∪ {|b_j|} denote the logarithm of the maximum absolute value of the j-th row of the linear system after iteration i, for j = 1, ..., d and i = 0, ..., d − 1. We then claim
a_{ij} ≤ g(j) + 2^i − 1 + Σ_{t=1}^{i} 2^{i−t} g(t).
For i = 0, i.e. before the first iteration, this reduces to a_{0j} ≤ g(j), which holds. Using row i to eliminate an element from row j in iteration i means that we get 2^{a_{ij}} ≤ 2^{a_{i−1,j}+a_{i−1,i}} + 2^{a_{i−1,i}+a_{i−1,j}}, and therefore
a_{ij} ≤ a_{i−1,j} + a_{i−1,i} + 1 ≤ g(j) + 1 + 2·(2^{i−1} − 1) + 2·Σ_{t=1}^{i−1} 2^{i−1−t} g(t) + g(i) = g(j) + 2^i − 1 + Σ_{t=1}^{i} 2^{i−t} g(t).
As all coefficients remain integral during the computation, we finally have log ‖x‖∞ ≤ a_{d−1,d}, which is the bound we wanted to show.
B. Proof of the Final Bound
Lemma 23. Let |Q| = n ≥ 5, |T| ≤ n^4, s := (n + 1)4^{3n}, ℓ := η − 1 − s, |C| = 2^n, α := n, β := n·2^{(2n+1)!+1}, and
log ℓ ≤ (log β + 1 + α log(s + 1)) · (3 + α)^{|C|(2|Q|+|T|)}.
Then log log log η ≤ n + 5 log(n) + 2.
Proof. We write η_0 := η and η_{i+1} := log η_i, and set P := |C|(2|Q| + |T|) = 2^n(2n + n^4). Then
η_1 = log(ℓ + 1 + s) ≤ (log β + 2 + α log(s + 1)) · (3 + α)^P
≤ ((2n + 1)! + log(n) + 3 + n·log((n + 1)4^{3n} + 1)) · (3 + n)^P
≤ ((2n + 2)! + 4n(2n + 1)) · (3 + n)^P   (log(n + 1) ≤ n)
≤ (2n + 4)! · (3 + n)^P,   (1)
where at (1), and in the second step, we use log(a + b) ≤ log(a) + 1 for 0 < b ≤ a. Next,
η_2 ≤ log((2n + 4)!) + P·log(3 + n) ≤ 2(2n + 4) log(n + 3) + P·log(3 + n)   (n! ≤ n^n)
≤ log(n + 3) · (2^n(2n + n^4) + 2(2n + 4)).
Now we use (2) log(a + b) ≤ log(a) + 1 for 0 < b ≤ a, and (3) log log(n + 3) ≤ log log(n) + 1 for n ≥ 5. Hence
η_3 ≤ log log(n + 3) + log(2^n(2n + n^4) + 2(2n + 4))
≤ log log(n) + 1 + n + log(2n + n^4) + 1   (2), (3)
≤ log log(n) + n + 4 log(n) + 3   (2n + n^4 ≤ 2n^4, log(2n^4) ≤ 4 log(n) + 1)
≤ n + 5 log(n) + 2.   (log log(n) + 1 ≤ log(n))