Lower Bounds on the State Complexity of Population Protocols
Philipp Czerner, Javier Esparza ∗
{czerner, esparza}@in.tum.de
Department of Informatics, TU München, Germany
February 24, 2021
Population protocols are a model of computation in which an arbitrary number of indistinguishable finite-state agents interact in pairs. The goal of the agents is to decide by stable consensus whether their initial global configuration satisfies a given property, specified as a predicate on the set of all initial configurations. The state complexity of a predicate is the number of states of a smallest protocol that computes it. Previous work by Blondin et al. has shown that the counting predicates x ≥ η have state complexity O(log η) for leaderless protocols and O(log log η) for protocols with leaders. We obtain the first non-trivial lower bounds: the state complexity of x ≥ η is Ω(log log log η) for leaderless protocols, and the inverse of a non-elementary function for protocols with leaders.
1. Introduction
Population protocols are a model of computation in which an arbitrary number of indistinguishable finite-state agents interact in pairs to decide if their initial global configuration satisfies a given property. Population protocols were introduced in [5, 6] to study the theoretical properties of networks of mobile sensors with very limited computational resources, but they are also very strongly related to chemical reaction networks, a discrete model of chemistry in which agents are molecules that change their states due to collisions.

Population protocols decide a property by stable consensus. Each state of an agent is assigned a binary output (yes/no), and in a correct protocol starting at a global configuration, all agents eventually reach a consensus by reaching the set of states whose output is the correct answer to the question "does the initial configuration satisfy the property?", and staying in it forever. A typical example of a property decidable by population protocols is majority: initially agents are in one of two initial states, say A and B, and the property to be decided is whether the number of agents in A is larger than the number of agents in B or not. In a seminal paper, Angluin et al. showed that population protocols can decide exactly the properties expressible in Presburger arithmetic, the first-order theory of addition [9].

In order to define the runtime of a protocol one assumes that at each step a pair of agents is selected uniformly at random and allowed to interact. The parallel runtime is then defined as the expected number of interactions until a stable consensus is reached (i.e. until the property is decided), divided by the number of agents.

∗ This work was supported by an ERC Advanced Grant (787367: PaVeS) and by the Research Training Network of the Deutsche Forschungsgemeinschaft (DFG) (378803395: ConVeY).
Even though the parallel runtime is computed using a discrete model, under reasonable, commonly accepted assumptions, the result coincides with the runtime of a continuous-time stochastic model. Many papers have investigated the parallel runtime of population protocols, and several landmark results have been obtained. In [6] it was shown that every Presburger property can be decided in O(n log n) parallel time, where n is the number of agents, and [8] showed that population protocols with a fixed number of leaders can compute all Presburger predicates in polylogarithmic parallel time. (Loosely speaking, leaders are auxiliary agents that do not form part of the population of "normal" agents, but can interact with them to help them decide the property.) More recent results have studied protocols for majority in which the number of states grows with the number of agents, and shown that polylogarithmic time is achievable by protocols without leaders, even for very slow growth functions, see e.g. [2, 3, 4, 15, 18].

However, many protocols have a high number of states. For example, a quick estimate shows that the fast protocol for majority implicitly described in [8] has tens of thousands of states. This is an obstacle to implementations of protocols in chemistry, where the number of states corresponds to the number of chemical species participating in the reactions. Moreover, the number of states is of fundamental importance because it plays the role of memory in sequential computational models: the total memory available to a protocol is the product of the logarithm of the number of states and the number of agents. Despite these facts, the state complexity of a Presburger property, defined as the minimal number of states of any protocol deciding the property, has received comparatively little attention. In [13, 12] Blondin et al.
have shown that every predicate representable by a boolean combination of threshold and modulo constraints (every Presburger formula can be put into this form) of length n, with numbers encoded in binary, can be decided by a protocol with p(n) states, for some polynomial p. In particular, it is not difficult to see that every property of the form x ≥ η, asking whether the number of agents is at least η, can be decided by a leaderless protocol with O(log η) states, and that x ≥ η has protocols (with leaders) with O(log log η) states.

(Notice that the time-space trade-off results of [2, 3, 4, 15, 18] refer to a more general model in which the number of states of a protocol grows with the number n of agents; in other words, a property is decided by a family of protocols, one for each value of n. Trade-off results bound the growth rate needed to compute a predicate within a given time. We study the minimal number of states of a single protocol that decides the property for all n.)

However, to the best of our knowledge, there exist no lower bounds on the state complexity, i.e. a bound showing that a protocol for x ≥ η needs Ω(f(η)) states for some function f. This question, which was left open in [13], is notoriously hard due to its relation to fundamental questions in the theory of Vector Addition Systems.

In this paper we show that every protocol, with or without leaders, needs a number of states that, roughly speaking, grows like the inverse Ackermann function, and (our main result) that every leaderless protocol for x ≥ η needs Ω(log log log η) states. The proof of the first bound relies on results on the maximal length of controlled antichains of N^d, a topic in combinatorics with a long tradition in the study of Vector Addition Systems and other models of computation, see e.g. [22, 17, 1, 25, 10]. The triple exponential bound requires us to develop new theory for a generalisation of the antichain condition.

The paper is organised as follows.
Section 2 introduces population protocols, state complexity, and its inverse, the busy beaver function for population protocols. Instead of lower bounds on state complexity, for convenience we present upper bounds on the busy beaver function. Section 3 presents some results on the mathematical structure of stable sets of configurations that are used throughout the paper. Section 4 shows an Ackermannian upper bound on the busy beaver function, valid for protocols with or without leaders, and explains why this surprisingly large bound might be optimal. Section 5 gives a triple exponential upper bound on the busy beaver function for leaderless protocols.
2. Population Protocols and State Complexity
For sets A, B we write A^B to denote the set of functions f : B → A. If B is finite we call the elements of N^B multisets over B, and the elements of R^B vectors of dimension |B|. Arithmetic operations on vectors in R^B are defined as usual, extending the vectors with zeroes if necessary. For example, if B ⊆ B′, x ∈ R^B and y ∈ R^{B′} then x + y ∈ R^{B′} is defined by (x + y)_b = x_b + y_b, where x_b = 0 for every b ∈ B′ \ B. For x, y ∈ R^B we write x ≤ y if x_i ≤ y_i for all i ∈ B, and x ⪇ y if x ≤ y and x ≠ y. Abusing language we identify an element b ∈ B with the one-element multiset containing it, i.e. with the x ∈ N^B such that x_b = 1 and x_i = 0 for i ≠ b. We also write |x| := Σ_{b∈B} x_b for the total number of elements in a multiset x ∈ N^B, and 1 to denote the all-ones vector of appropriate dimension. Finally, given a vector v ∈ R^k, we define ‖v‖ = Σ_{i=1}^k |v_i| and ‖v‖_∞ = max_{i=1}^k |v_i|.

We recall the population protocol model of [5, 7], with explicit mention of leader agents. A population protocol is a tuple P = (Q, T, L, X, I, O) where Q is a finite set of states; T ⊆ Q² × Q² is a set of transitions; L ∈ N^Q is the leader multiset; X is a finite set of input variables; I : X → Q is the input mapping; and O : Q → {0, 1} is the output mapping.

Inputs and configurations. An input is a multiset v ∈ N^X such that |v| ≥ 2, and a configuration is a multiset m ∈ N^Q such that |m| ≥ 2. Intuitively, a configuration represents a population of agents where m(q) denotes the number of agents in state q. The initial configuration for input v is defined as IC(v) := L + Σ_{x∈X} v(x)·I(x). Abusing language, throughout the paper we write IC(i) instead of IC(i·x) to denote the initial configuration for input i ∈ N, if P has a unique input variable, i.e. X = {x}. The output O(m) of a configuration m is b if O(q) = b for every q ∈ Q with m(q) ≥ 1, and undefined otherwise. So a population has output b if all agents have output b.

Executions.
A transition t = ((p, q), (p′, q′)) is enabled in a configuration m if m ≥ p + q, and disabled otherwise. As |m| ≥ 2, every configuration enables at least one transition. If t is enabled in m, then it can be fired, leading to the configuration v := m − p − q + p′ + q′, which we denote m →t v. Given a sequence σ = t_1 t_2 ... t_n of transitions, we write m →σ v if there exist configurations m_1, m_2, ..., m_{n−1} such that m →t_1 m_1 →t_2 m_2 ⋯ m_{n−1} →t_n v, and m →* m′ if m →σ m′ for some sequence σ ∈ T*. For every set of transitions T′ ⊆ T, we write m →T′ m′ if m →t m′ for some t ∈ T′, and m →T′* m′ if m →σ m′ for some sequence σ ∈ T′*. Given a set M of configurations, m →* M denotes that m →* m′ for some m′ ∈ M.

An execution is a sequence of configurations σ = m_0 m_1 ... such that m_i → m_{i+1} for every i ∈ N. The output O(σ) of σ is b if there exists i ∈ N such that O(m_i) = O(m_{i+1}) = ⋯ = b, otherwise O(σ) is undefined.

Executions have the monotonicity property: if m_0 m_1 m_2 ... is an execution, then for every configuration m the sequence (m_0 + m)(m_1 + m)(m_2 + m)... is an execution too. We often say that a statement holds "by monotonicity", meaning that it is a consequence of the monotonicity property.

Computations.
An execution σ = m_0 m_1 ... is fair if for every configuration m the following holds: if |{i ∈ N : m_i →* m}| is infinite, then |{i ∈ N : m_i = m}| is infinite. In other words, fairness ensures that an execution cannot forever avoid a configuration that stays reachable. We say that a population protocol computes a predicate ϕ : N^X → {0, 1} (or decides the property represented by the predicate) if for every v ∈ N^X every fair execution σ starting from IC(v) satisfies O(σ) = ϕ(v). Two protocols are equivalent if they compute the same predicate. It is known that population protocols compute precisely the Presburger-definable predicates [9].

Example 1.
Let P_n = (Q, T, ∅, {x}, I, O) be the protocol where Q := {0, 1, 2, 3, ..., 2^n}, I(x) := 1, O(a) = 1 iff a = 2^n, and for each a, b ∈ Q the set T of transitions contains ((a, b), (0, a + b)) if a + b < 2^n, and ((a, b), (2^n, 2^n)) if a + b ≥ 2^n. It is readily seen that P_n computes x ≥ 2^n with 2^n + 1 states. Intuitively, each agent stores a number, initially 1. When two agents meet, one of them stores the sum of their values and the other one stores 0, with sums capping at 2^n. Once an agent reaches this cap, all agents eventually get converted to 2^n.

Now consider the protocol P′_n = (Q′, T′, ∅, {x}, I′, O′), where Q′ := {0, 2^0, 2^1, ..., 2^n}, I′(x) := 2^0, O′(a) = 1 iff a = 2^n, and T′ contains ((2^i, 2^i), (0, 2^{i+1})) for each 0 ≤ i < n, and ((a, 2^n), (2^n, 2^n)) for each a ∈ Q′. It is easy to see that P′_n also computes x ≥ 2^n, but more succinctly: while P_n has 2^n + 1 states, P′_n has only n + 2 states.
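As a sanity check on the succinct protocol P′_n of Example 1 (as reconstructed above), the following sketch exhaustively explores the reachable configurations of P′_2 (threshold 2^2 = 4) for small inputs and verifies stable consensus; all names and the choice of n = 2 are ours, for illustration only.

```python
from itertools import combinations

N_EXP = 2            # the "n" of Example 1; the threshold is 2**N_EXP = 4
TOP = 2 ** N_EXP

def successors(conf):
    """All configurations reachable from `conf` in one interaction of P'_n."""
    out = set()
    conf = list(conf)
    for i, j in combinations(range(len(conf)), 2):
        for a, b, ia, ib in ((conf[i], conf[j], i, j), (conf[j], conf[i], j, i)):
            if a == b and 0 < a < TOP:      # doubling rule ((2^i, 2^i), (0, 2^{i+1}))
                nxt = conf[:]; nxt[ia], nxt[ib] = 0, 2 * a
                out.add(tuple(sorted(nxt)))
            if b == TOP:                    # conversion rule ((a, 2^n), (2^n, 2^n))
                nxt = conf[:]; nxt[ia], nxt[ib] = TOP, TOP
                out.add(tuple(sorted(nxt)))
    return out

def reachable(conf):
    start = tuple(sorted(conf))
    seen, todo = {start}, [start]
    while todo:
        for s in successors(todo.pop()):
            if s not in seen:
                seen.add(s); todo.append(s)
    return seen

for m in range(2, 7):
    confs = reachable((1,) * m)             # input: m agents in state I'(x) = 1
    if m < TOP:
        # stable 0-consensus: the accepting state 2^n never appears at all
        assert all(TOP not in c for c in confs)
    else:
        # stable 1-consensus: the all-2^n configuration stays reachable from
        # every reachable configuration, so every fair execution ends there
        assert all((TOP,) * m in reachable(c) for c in confs)
print("P'_2 computes x >= 4 on inputs 2..6")
```

Increasing `N_EXP` and the input range checks larger thresholds, at a correspondingly higher exploration cost.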
Leaderless protocols. A protocol P = (Q, T, L, X, I, O) is leaderless if L = ∅, and has |L| leaders otherwise. Protocols with leaders and leaderless protocols compute the same predicates [9]. For L = ∅ we have

λ IC(v) + λ′ IC(v′) = λ (L + Σ_{x∈X} v(x)·I(x)) + λ′ (L + Σ_{x∈X} v′(x)·I(x))
                    = λ Σ_{x∈X} v(x)·I(x) + λ′ Σ_{x∈X} v′(x)·I(x)
                    = IC(λv + λ′v′)

for all inputs v, v′ and λ, λ′ ∈ N. In other words, any linear combination of initial configurations with natural coefficients is also an initial configuration.

Informally, the state complexity of a predicate is the minimal number of states of the protocols that compute it. Given n, we would like to define the function STATE(n) as the maximum state complexity of the predicates of size at most n. However, defining the size of a predicate requires fixing a representation. Population protocols compute exactly the predicates expressible in Presburger arithmetic [9], and so there are at least three natural representations: formulas of Presburger arithmetic, existential formulas of Presburger arithmetic, and semilinear sets [19]. However, the translations between these representations involve superexponential blow-ups. For this reason we focus on threshold predicates of the form x ≥ η, for which the size of the predicate is just the size of η, independently of whether the predicate is described as a formula or a semilinear set. We choose to encode numbers in unary, and so we define STATE(η) as the number of states of the smallest protocol computing x ≥ η.

The inverse of STATE is the function that assigns to a number n the largest η such that a protocol with n states computes x ≥ η. Recall that the busy beaver function assigns to a number n the largest η such that a Turing machine with n states started on a blank tape writes η consecutive ones on the tape and terminates.
Due to this analogy, we call the inverse of STATE the busy beaver function for population protocols, and call protocols computing predicates of the form x ≥ η busy beaver protocols, or just busy beavers.

Definition 1. The busy beaver function BB : N → N is defined as follows: BB(n) is the largest η ∈ N such that the predicate x ≥ η is computed by some leaderless protocol with at most n states. The function BB_L(n) is defined analogously, but for general protocols, possibly with leaders.

In [13] Blondin et al. give lower bounds on the busy beaver function:
Theorem 2 ([13]). BB(n) ∈ 2^{Ω(n)} and BB_L(n) ∈ 2^{2^{Ω(n)}}.

However, to the best of our knowledge no upper bounds have been given.
3. Mathematical Structure of Stable Sets
A set M of configurations is downward closed if m ∈ M and m′ ≤ m implies m′ ∈ M. A pair (µ, S), where µ is a configuration and S ⊆ Q, is a base element of M if µ + N^S ⊆ M. A base of M is a finite set B of base elements such that M = ⋃_{(µ,S)∈B} (µ + N^S). It is well known (an easy consequence of Dickson's lemma) that every downward-closed set of configurations has a base. We define the norm of a base element (µ, S) as ‖(µ, S)‖_∞ := ‖µ‖_∞, and the norm of a base as the maximal norm of its elements. We apply these notions to the stable configurations of the protocol:

Definition 2.
Let b ∈ {0, 1}. A configuration m is b-stable if O(m′) = b for every configuration m′ reachable from m. The set of b-stable configurations is denoted SC_b.

It follows easily from the definitions that a population protocol computes a predicate ϕ : N^X → {0, 1} iff IC(v) →* SC_0 for every input v satisfying ϕ(v) = 0, and IC(v) →* SC_1 for every input v satisfying ϕ(v) = 1.

Lemma 3.
Let P be a protocol with n states. For every b ∈ {0, 1} the set SC_b is downward closed and has a base of norm at most 2^{(2n+1)!+1}. In particular, SC_b has a base with at most ϑ(n) := 2^{(2n+2)!} elements.

Proof. We first show that SC_b is downward closed, by showing that its complement ¬SC_b is upward closed. Assume m ∈ ¬SC_b and m′ ≥ m. We prove m′ ∈ ¬SC_b. Since m ∈ ¬SC_b we have m →* m″ for some m″ such that O(m″) ≠ b. By monotonicity, m′ = m + (m′ − m) →* m″ + (m′ − m), and since O(m″) ≠ b we have O(m″ + (m′ − m)) ≠ b. So m′ ∈ ¬SC_b.

For the second part, let β := 2^{(2n+1)!} and fix a b-stable configuration m. Let S := {q ∈ Q : m_q ≥ 2β}, and define µ ≤ m as follows: µ_i := m_i for i ∉ S and µ_i := 2β for i ∈ S. Since µ ≤ m and m is b-stable, so is µ. We show that (µ, S) is a base element of SC_b, which proves the result. Assume the contrary. Then some configuration m′ ∈ µ + N^S is not b-stable. So m′ →* m″ for some m″ satisfying m″(q) ≥ 1 for some q ∈ Q with O(q) ≠ b; we say that m″ covers q. By Rackoff's Theorem [24], m″ can be chosen so that m′ →σ m″ for a sequence σ of length 2^{2^{O(n log n)}}; a concrete bound is |σ| ≤ β (see [16, Theorem 3.12.11]). Since a transition moves at most two agents out of a given state, σ moves at most 2β agents out of a state. So, by the definition of µ, the sequence σ is also executable from µ, and also leads to a configuration that covers q. But this contradicts that µ is b-stable.

To prove the bound on the number of elements of the base, observe that the number of pairs (µ, S) such that µ has norm at most k and S ⊆ Q is at most (k + 2)^n. Indeed, for each state q there are at most k + 2 possibilities: q ∈ S, or q ∉ S and 0 ≤ µ(q) ≤ k. So ϑ(n) ≤ (2^{(2n+1)!+1} + 2)^n ≤ 2^{(2n+2)!}.

From now on we use the following terminology:

Definition 3. A b-base is a base of SC_b of norm at most 2^{(2n+1)!+1}, and its elements are called b-base elements.
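The base representation of a downward-closed set is directly executable. The sketch below (a hypothetical 3-state example; all names are ours) implements the membership tests m ∈ µ + N^S and m ∈ M for a set M given by a base:

```python
Q = ["a", "b", "c"]   # states of a hypothetical 3-state protocol

def in_base_element(m, mu, S):
    # m ∈ mu + N^S iff m equals mu outside S and dominates mu inside S
    return all(m[q] >= mu[q] if q in S else m[q] == mu[q] for q in Q)

def in_set(m, base):
    # M is the union over its base elements (mu, S) of mu + N^S
    return any(in_base_element(m, mu, S) for mu, S in base)

# base of the downward-closed set "at most one agent in state c"
# (any number of agents in "a" and "b")
base = [({"a": 0, "b": 0, "c": 0}, {"a", "b"}),
        ({"a": 0, "b": 0, "c": 1}, {"a", "b"})]

assert in_set({"a": 5, "b": 2, "c": 1}, base)
assert not in_set({"a": 5, "b": 2, "c": 2}, base)
```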
4. A General Upper Bound on the Busy Beaver Function
Our general strategy to find upper bounds for the busy beaver function BB_L(n) is as follows:

(1) Prove a "Pumping Lemma" stating that if a protocol rejects two inputs a < b satisfying certain conditions, then it rejects all inputs of the form a + λ(b − a) for every λ ≥ 0, and so does not compute the predicate x ≥ η.

(2) Using the Pumping Lemma, we reduce the existence of the inputs a and b to the existence of a finite sequence of vectors of dimension n satisfying certain purely combinatorial properties. Moreover, the size of b is linked to the length of the sequence.

(3) Say a sequence satisfying the properties of (2) is bad (it implies that the protocol does not compute x ≥ η), and otherwise good. We provide a bound B(n) on the maximal length of good sequences.

It follows from (1)-(3) that a protocol with n states cannot compute x ≥ η for any η ≥ B(n). Indeed, if η ≥ B(n) then every sequence of vectors of dimension n and length η is bad. So the sequence satisfies the conditions of the Pumping Lemma, and so the protocol rejects all inputs of the form a + λ(b − a). In this section we follow this strategy to provide an upper bound valid for all protocols. In the next section we apply it again, albeit in a more sophisticated way, to obtain a far better upper bound for leaderless protocols.

Fix a protocol P = (Q, T, L, {x}, I, O) with |Q| = n. We start by stating and proving the Pumping Lemma.

Lemma 4 (Pumping Lemma). If there exist inputs a and b, a 0-base element (µ, S), and configurations m_a, m_b ∈ µ + N^S satisfying (1) m_a ≤ m_b, (2) IC(a) →* m_a, and (3) m_a + IC(b − a) →* m_b, then P rejects a + λ(b − a) for every λ ≥ 0.

Proof. We first claim that m_a + IC(λ(b − a)) →* m_a + λ(m_b − m_a) holds for every λ ≥ 0 (*). The proof is by induction on λ. The basis λ = 0 is trivial. For the induction step let λ ≥ 1. Due to monotonicity, we have

IC(a + λ(b − a)) = IC(a) + IC(b − a) + IC((λ − 1)(b − a))
  →* m_a + IC(b − a) + IC((λ − 1)(b − a))           by (2)
  →* m_a + (m_b − m_a) + IC((λ − 1)(b − a))          by (3)
  →* m_a + (m_b − m_a) + (λ − 1)(m_b − m_a)          by (*)
   = m_a + λ(m_b − m_a)

and the claim is proved. By (1) and m_a, m_b ∈ µ + N^S we have m_a + λ(m_b − m_a) ∈ µ + N^S for every λ ≥ 0. So, by (2) and the claim, SC_0 is reachable from IC(a + λ(b − a)) for every λ ≥ 0. So P rejects a + λ(b − a) for every λ ≥ 0.

Our goal now is to find a bound B(n) such that for every protocol with at most n states there are inputs a < b ≤ B(n) rejected by the protocol and satisfying conditions (1)-(3) of the Pumping Lemma. For this, observe that for every rejected input i we have IC(i) →* m_i for some configuration m_i ∈ SC_0, and so m_i ∈ µ_i + N^{S_i} for some 0-base element (µ_i, S_i). Further, by Lemma 3 we can assume that the 0-base has at most ϑ(n) elements. The triple (µ_i, S_i, m_i) can be seen as a "rejection certificate" for i. The certificate can be verified by checking that m_i ∈ µ_i + N^{S_i}, and finding σ such that IC(i) →σ m_i.

Definition 4.
For every rejected input 2 ≤ i ≤ η − 1, the triple cert(i) := (µ_i, S_i, m_i) is the (rejection) certificate of i. We call (µ_i, S_i) the type of cert(i). The certificate sequence of P is the sequence cert(2) cert(3) ... cert(η − 1).

The conditions of the Pumping Lemma can now be reformulated as follows: there are two inputs a, b such that their certificates have the same type and satisfy condition (1). Since there are at most ϑ(n) types, if the number of rejected inputs exceeds ϑ(n), then there are two inputs a, b such that cert(a) and cert(b) have the same type. However, their certificates need not yet satisfy condition (1). To solve this problem we examine the certificate sequence in more detail. More precisely, we examine the sequence m_2 m_3 ⋯ m_{η−1}. Using the terminology of [17], it is a linearly controlled sequence, meaning that there is a linear control function f : N → N satisfying |m_i| ≤ f(i). Indeed, since IC(i) →* m_i, we have |m_i| = |IC(i)| = |L| + i, and so we can take f(n) = |L| + n.

This allows us to use a result on linearly controlled sequences from [17]. Say a finite sequence v_0, v_1, ..., v_s of vectors of the same dimension is bad if there are two indices 0 ≤ i_1 < i_2 ≤ s such that v_{i_1} ≤ v_{i_2}. Dickson's Lemma shows that every infinite sequence of vectors contains a bad prefix, and the result extends from two to three or any other finite number of indices i_1 < i_2 < ⋯.

The maximal length of good linearly controlled sequences has been studied in [22, 17, 10], and the results have been used to bound the runtime of algorithms checking properties of a number of computational models, including Vector Addition Systems and Counter Machines [21]. The following lemma follows easily from results of [17]. (Unfortunately, in [17] bad sequences are called "good". We risk this confusion to maintain the convention that a bad sequence is a witness of the fact that the protocol does not compute x ≥ η.)
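The badness condition is easy to test on a finite sequence. The following helper (ours, purely illustrative) searches for the witness pair i_1 < i_2 with v_{i_1} ≤ v_{i_2}:

```python
def first_bad_pair(seq):
    """Return (i1, i2) with i1 < i2 and seq[i1] <= seq[i2] componentwise,
    or None if the finite sequence is good."""
    for i2 in range(len(seq)):
        for i1 in range(i2):
            if all(x <= y for x, y in zip(seq[i1], seq[i2])):
                return (i1, i2)
    return None

# a good sequence in dimension 2: no later vector dominates an earlier one
assert first_bad_pair([(1, 2), (2, 1), (0, 3)]) is None
# (1, 1) <= (1, 2) makes this sequence bad
assert first_bad_pair([(2, 0), (1, 1), (1, 2)]) == (1, 2)
```

By Dickson's Lemma every infinite sequence of vectors of N^n contains such a pair; the control function bounds how late the pair can appear.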
Lemma 5 ([17]). For every δ ∈ N and for every elementary function g : N → N, there exists a function F_{δ,g} : N → N at level F_ω of the Fast Growing Hierarchy satisfying the following property: for every infinite sequence v_0, v_1, v_2, ... of vectors of N^n satisfying |v_i| ≤ i + δ, there exist indices i_1 < i_2 < ⋯ < i_{g(n)} ≤ F_{δ,g}(n) such that v_{i_1} ≤ v_{i_2} ≤ ⋯ ≤ v_{i_{g(n)}}.

For the definition of the Fast Growing Hierarchy see [17]. For our purposes it suffices to know that F_ω contains functions that, crudely speaking, grow like the Ackermann function. Using the lemma we obtain:

Theorem 6.
Let P be a population protocol with n states and ℓ leaders computing a predicate x ≥ η for some η ≥ 2. Then η < F_{ℓ,ϑ}(n), where ϑ(n) is the function of Lemma 3.

Proof. We inductively define a sequence m_2, m_3, m_4, ... of configurations of SC_0 ∪ SC_1 satisfying:

(i) IC(i) →* m_i for every i ≥ 2.
(ii) m_i + (j − i)·I(x) →* m_j for every 2 ≤ i ≤ j. (Observe that m_i + (j − i)·I(x) is the result of adding (j − i) agents in state I(x) to m_i.)

Since P computes x ≥ η, for every i ≥ 2 the protocol P starting at IC(i) eventually reaches SC_0 or SC_1 (depending on whether i < η or i ≥ η), and stays there forever. We define m_2, m_3, m_4, ... as follows. First, we let m_2 be any configuration of SC_0 ∪ SC_1 reachable from IC(2). Then, for every i ≥ 2, assume that m_i has already been defined and satisfies IC(i) →* m_i. Observe that IC(i + 1) = IC(i) + I(x). Since IC(i) →* m_i, we also have IC(i + 1) = IC(i) + I(x) →* m_i + I(x). This execution can be extended to a fair run, which eventually reaches SC_0 ∪ SC_1. Let m_{i+1} be any configuration of SC_0 ∪ SC_1 reachable from m_i + I(x).

Let us show that m_2, m_3, m_4, ... satisfies (i) and (ii). Property (i) holds for m_2 by definition, and for i ≥ 2 because IC(i + 1) = IC(i) + I(x) →* m_i + I(x) →* m_{i+1}. For property (ii), by monotonicity and the definition of m_i we have for every 2 ≤ i ≤ j:

m_i + (j − i)·I(x) →* m_{i+1} + (j − i − 1)·I(x) →* ⋯ →* m_{j−1} + I(x) →* m_j

Assume η > F_{ℓ,ϑ}(n). By Lemma 5 there exist ϑ(n) + 1 indices i_0 < i_1 < ⋯ < i_{ϑ(n)} ≤ F_{ℓ,ϑ}(n) such that m_{i_0} ≤ m_{i_1} ≤ ⋯ ≤ m_{i_{ϑ(n)}}. Since P computes x ≥ η and i_{ϑ(n)} ≤ F_{ℓ,ϑ}(n) < η, every m_{i_j} belongs to SC_0. By the definition of ϑ and the pigeonhole principle, there are a < b and a 0-base element (µ, S) such that m_{i_a}, m_{i_b} ∈ µ + N^S. Rename a := i_a and b := i_b. By property (ii) we have m_a + (b − a)·I(x) →* m_b. Since m_a, m_b ∈ µ + N^S, applying Lemma 4 we get that P rejects a + λ(b − a) for every λ ≥ 0. This contradicts that P computes x ≥ η.

The function F_{ℓ,ϑ}(n) grows so fast that one can doubt that the bound is even remotely close to optimal. However, recent results show that this would be less strange than it seems. If a protocol P computes a predicate x ≥ η, then η is the smallest number i such that IC(i) →* SC_1. Therefore, letting BBP(n) denote the set of busy beaver protocols with at most n states, and letting SC_1^P and IC_P denote the set SC_1 and the initial mapping of the protocol P, we obtain:

BB_L(n) = max_{P ∈ BBP(n)} min {i ∈ N | ∃ m ∈ SC_1^P : IC_P(i) →* m}

Consider now two deceptively similar functions. Let
All_1 be the set of configurations m such that O(m) = 1, i.e. all agents are in states with output 1. Further, let Some_1 be the set of configurations m such that O(m) ≠ 0, i.e. at least one agent is in a state with output 1. Finally, let PP(n) denote the set of all protocols with alphabet X = {x}, possibly with leaders, and n states. Notice that we also include the protocols that do not compute any predicate. Define

f_1(n) = max_{P ∈ PP(n)} min {i ∈ N | ∃ m ∈ All_1^P : IC_P(i) →* m}
f_2(n) = max_{P ∈ PP(n)} min {i ∈ N | ∃ m ∈ Some_1^P : IC_P(i) →* m}

Using recent results on Petri nets and Vector Addition Systems [14, 20] it is easy to prove that f_1(n) grows faster than any elementary function. However, a recent result [11] by Balasubramanian et al. shows that f_1(n) is polynomial in n for leaderless protocols! Finally, f_2(n) is at most double exponential in n by a classical result on the coverability problem for Vector Addition Systems [24] due to Rackoff.

These results show that a non-elementary bound on BB_L(n) might well be optimal. However, we now prove that this can only hold for population protocols with leaders. We show BB(n) ∈ 2^{2^{2^{O(n)}}}, i.e. leaderless busy beavers with n states can only compute predicates x ≥ η for numbers η at most triple exponential in n.

The scheme of the proof is similar to that of Theorem 6. In particular, we also rely on a lemma bounding the length of certain good sequences of vectors. However, the definition of a good sequence is a different one, which to the best of our knowledge has not been studied yet. Therefore, instead of resorting to [17] we develop the machinery to prove the lemma ourselves.

The paper [20] considers protocols with one leader, and studies the problem of moving from a configuration with the leader in a state q_in and all other agents in another state r_in, to a configuration with the leader in a state q_f and all other agents in state r_f.
The paper uses results from [14] to show that the smallest number of agents for which this is possible grows faster than any elementary function in the number of states of the protocol. The problem reduces to reaching a configuration with at least one agent in a state of output 1. Rackoff shows that, if this is possible at all, then the configuration can be reached in at most 2^{2^{O(n log n)}} steps, which can only involve 2^{2^{O(n log n)}} agents.

5. An Upper Bound for Leaderless Protocols

The main property of leaderless protocols, already mentioned in Section 2, is that, loosely speaking, initial configurations are closed under linear combinations:

λ IC(i) + λ′ IC(j) = IC(λi + λ′j) for every i, j, λ, λ′ ∈ N

This extends to executions: if IC(i) →* m_i and IC(j) →* m_j, then IC(λi + λ′j) →* λ m_i + λ′ m_j. We make use of this property throughout the section without explicit mention. Observe also that even if the coefficients of the linear combination are nonnegative rationals, we can always multiply them by a suitable constant and obtain an execution.

We reuse the proof strategy described at the beginning of Section 4. In Section 5.1 we state and prove a Pumping Lemma showing that under certain conditions the protocol rejects infinitely many inputs, contradicting that it computes x ≥ η. In Section 5.2 we introduce QCT-sequences of vectors, and show that a bound on the length of the so-called good QCT-sequences implies a bound on η. Finally, Section 5.3 derives a bound on the length of good QCT-sequences.
Notation.
For the rest of the section we fix a leaderless population protocol P = (Q, T, ∅, {x}, I, O) with n states. We first introduce some preliminaries, then formulate a first version of the Pumping Lemma (Lemma 7), and then strengthen it, yielding the final version (Lemma 10).
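The closure of leaderless executions under sums can be tested exhaustively on a toy protocol (a single pairing transition; the example and names are ours): if IC(i) →* m and IC(j) →* m′, then IC(i + j) →* m + m′.

```python
# Toy leaderless protocol: states "u" (input) and "p", single transition
# ((u, u), (p, p)).  Configurations are pairs (#u, #p).
def successors(conf):
    u, p = conf
    return {(u - 2, p + 2)} if u >= 2 else set()

def reachable(i):
    start = (i, 0)                    # IC(i): i agents in the input state
    seen, todo = {start}, [start]
    while todo:
        for s in successors(todo.pop()):
            if s not in seen:
                seen.add(s); todo.append(s)
    return seen

# sums of reachable configurations are reachable from the summed input
for i in range(2, 6):
    for j in range(2, 6):
        big = reachable(i + j)
        for m in reachable(i):
            for m2 in reachable(j):
                assert (m[0] + m2[0], m[1] + m2[1]) in big
print("IC(i) ->* m and IC(j) ->* m' imply IC(i+j) ->* m+m'")
```

The check succeeds precisely because there are no leaders: with L ≠ ∅ the sum of two initial configurations would contain two copies of L and would not be initial.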
Potential reachability and a first Pumping Lemma.
Let t = ((p, q), (p′, q′)) be a transition. By the definition of the reachability relation, for every two configurations m, m′, if m →t m′ then m′ = m − p − q + p′ + q′. This motivates the following definition:

Definition 5.
Let t = ((p, q), (p′, q′)) be a transition. The effect of t is the vector ∆(t) := p′ + q′ − p − q ∈ Z^Q. Given a sequence σ = t_1 ... t_n of transitions, we define its Parikh vector →σ ∈ N^T as the vector mapping each transition to its number of occurrences in σ. We can then define the effect of σ as ∆(σ) := Σ_{t∈T} (→σ)_t ∆(t) = Σ_{t∈σ} ∆(t).

Observe that, since a transition only involves two agents, all components of ∆(t) lie in the interval [−2, 2]. Moreover, m →σ m′ implies m′ = m + ∆(σ). In particular, m′ depends only on the effect of σ. We introduce some useful notations:

Definition 6.
For every sequence σ ∈ T* of transitions, we write m ⇒σ m′ if m′ = m + ∆(σ). Further, we write m ⇒* m′ if there exists σ such that m ⇒σ m′, and say that m′ is potentially reachable from m.

We make the following observations:

• If m →σ m′ then m ⇒σ m′, but the converse does not hold in general.
• For fixed configurations m, m′, whether m ⇒σ m′ holds or not depends only on the Parikh vector →σ.
• If m ⇒σ m′ and m ≥ 2|σ|·1, then m →σ m′. This follows immediately from the fact that each transition moves exactly two agents.

We formulate a first version of a pumping lemma, which we will then strengthen.
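Definitions 5 and 6 are directly executable. The sketch below (names ours) computes ∆(σ) from the transitions of σ and decides potential reachability m ⇒σ m′; the example also shows why ⇒ is strictly weaker than →:

```python
from collections import Counter

def effect(t):
    # Δ(t) = p' + q' - p - q for a transition t = ((p, q), (p', q'))
    (p, q), (p2, q2) = t
    d = Counter([p2, q2])
    d.subtract([p, q])
    return d

def seq_effect(seq):
    # Δ(σ) = Σ_{t ∈ σ} Δ(t); depends only on the Parikh vector of σ
    total = Counter()
    for t in seq:
        total.update(effect(t))
    return total

def potentially_reaches(m, m2, seq):
    # m ⇒σ m2 iff m2 = m + Δ(σ), checked componentwise
    d = seq_effect(seq)
    states = set(m) | set(m2) | set(d)
    return all(m2.get(s, 0) == m.get(s, 0) + d.get(s, 0) for s in states)

t1 = (("a", "a"), ("b", "b"))
t2 = (("b", "b"), ("a", "a"))
# Δ(t1 t2) = 0, so m ⇒^{t1 t2} m for every m; but from {a:1, b:1} the
# sequence is NOT executable (t1 needs two agents in "a")
assert potentially_reaches({"a": 1, "b": 1}, {"a": 1, "b": 1}, [t1, t2])
```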
Lemma 7. If there is γ ∈ N and inputs s, a, b ∈ N such that

(1) IC(s) →* m for some m ≥ 2γ·1,
(2) m + IC(a) →* µ + N^S for some 0-base element (µ, S), and
(3) IC(b) ⇒σ N^S for some σ such that |σ| ≤ γ,

then P rejects s + a + λb for every λ ≥ 0.

Proof. By (2) and (3) there exist configurations v, u ∈ N^S s.t. m + IC(a) →* µ + v and IC(b) ⇒σ u. Since a transition removes at most two agents from a state, and we have m ≥ 2γ·1 and |σ| ≤ γ, the sequence σ is enabled at m, and so also at m + IC(b). By (3) we obtain m + IC(b) →σ m + u. So we get

IC(s + a + λb) = IC(s) + IC(a) + λ IC(b)
  →* m + IC(a) + λ IC(b)                                     by (1)
  →* m + IC(a) + (λ − 1) IC(b) + u →* ⋯ →* m + IC(a) + λu     by (3)
  →* µ + v + λu                                               by (2)

Since (µ, S) is a 0-base element and v, u ∈ N^S, we have µ + v + λu ∈ SC_0, and so P rejects s + a + λb.

A stronger pumping lemma.
In the rest of the section we show that Lemma 7 can be strengthened by fixing the values of γ and s, that is, one only needs to look for inputs a and b in order to “pump”. We first show that the sequence σ can always be chosen of size at most (n + 1)4^{2n}. Lemma 8.
If IC(i) =∗⇒ N^S for some input i ≥ 1, then IC(j) =σ⇒ N^S for some input j ≥ 1 and a sequence σ of length at most (n + 1)4^{2n}.
Proof. We construct a linear system of inequalities, any integer solution of which will yield a desired sequence σ with IC(j) =σ⇒ N^S for some j. Then we apply a well-known bound, showing that a small solution exists.
To construct this system, we use what is known in the analysis of Petri nets as the marking equation (see e.g. [23]). Since the order of transitions in σ does not matter, we consider the Parikh vector σ⃗ ∈ N^T of σ, as defined above. Defining A : Q × T → Z as the matrix where the t-th column is precisely Δ(t) for t ∈ T, we get that the effect of the whole sequence is simply Δ(σ) = A·σ⃗.
Therefore the statement IC(j) =σ⇒ N^S is equivalent to the following system of linear inequalities over the vector u of variables, where v_i := (A_{it})_{t∈T} denotes the i-th row of A, for i ∈ Q, and x ∈ Q is the unique initial state of P:
∃u ∈ N^T :  v_x^T u ≤ −1,  v_i^T u = 0 for all i ∈ Q \ S with i ≠ x,  v_i^T u ≥ 0 for all i ∈ S with i ≠ x.
We know that the above system has a solution for u, so it also has an integer solution with coefficients at most (n + 1)4^n. This bound follows from a result by von zur Gathen and Sieveking [27], combined with the estimate that any square submatrix of A has a determinant of absolute value at most 4^n. The bound on the determinants follows directly from a suitable Laplace expansion of the columns, as each transition t ∈ T has ‖Δ(t)‖ ≤ 4. Since there are |T| ≤ n^4 different transitions, the total number of transitions in σ is at most n^4·(n + 1)4^n ≤ (n + 1)4^{2n}.
This allows us to fix γ := (n + 1)4^{2n}. Now we fix s. We prove a lemma showing that for every number γ there is an input from which we can reach a configuration with at least γ agents in each state (we assume wlog that every state can be populated from some input, otherwise we can remove the state). The proof is in the Appendix. Lemma 9.
For every γ ∈ N there exists an s ≤ γ·n·2^n such that IC(s) −∗→ m for some configuration m ≥ γ.
This allows us to fix s := 2γ·n·2^n ≤ (n + 1)4^{3n}. So together with Lemma 7 we finally get:
Lemma 10 (Pumping Lemma). Let γ := (n + 1)4^{2n} and s := 2γ·n·2^n. Let m_γ be a configuration satisfying m_γ ≥ 2γ and IC(s) −∗→ m_γ, which exists by Lemma 9. If there exist inputs a, b ≥ 1 such that
1. m_γ + IC(a) −∗→ µ + N^S for some 0-base element (µ, S), and
2. IC(b) =∗⇒ N^S,
then P rejects (s + a + λb) for every λ ≥ 0.
QCT-sequences
As we did in Section 4, we define a notion of certificate that an input is rejected, in this case an input larger than s. Assume P computes x ≥ η. By Lemma 9, and since P rejects every input below η, for every i with s + i < η there exist sequences of transitions σ_i and π_i, a 0-base element (µ_i, S_i) and a configuration m_i ∈ N^{S_i} such that
IC(s + i) −σ_i→ m_γ + IC(i) −π_i→ µ_i + m_i   (∗)
Further, for every i we can assume that the sequence σ_iπ_i has minimal length and, by Lemma 3, that (µ_i, S_i) has norm at most 2^{(2n+1)!+1}. Observe that execution (∗) proves IC(s + i) −∗→ SC, and so that P rejects s + i. We define the rejection certificate of i as follows.
Definition 7. Let ℓ := η − 1 − s, and let i = 1, ..., ℓ. The (rejection) certificate of i is the tuple cert(i) := (µ_i, S_i, m_i, pv_i), where µ_i, S_i, m_i are as in (∗), and pv_i is the Parikh vector of the sequence σ_iπ_i, defined as the mapping pv_i : T → N that assigns to every transition the number of times it occurs in σ_iπ_i. Further, we say that S_i is the colour of i. The certificate sequence of P is the sequence Cert = cert(1) cert(2) ... cert(ℓ).
Observe that the type of a certificate (µ_i, S_i, m_i, pv_i) is N^Q × 2^Q × N^Q × N^T, with the constraint that for every q ∈ S_i we have µ_i(q) = 0, and for q ∉ S_i we have m_i(q) = 0. In Section 4 we saw that the certificates introduced there were good linearly controlled sequences, which allowed us to use existing results. For leaderless protocols we proceed in the same way, with the difference that now we cannot resort to the literature, but have to develop the theory ourselves.
Controlled QCT-sequences.
We first show that Cert is also a controlled sequence in a certain sense. The proof is straightforward and can be found in the Appendix.
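The control conditions that the next lemma establishes for Cert can also be checked mechanically. The following sketch is our own illustration — the dict-based encoding of certificates and the function names are not from the paper:

```python
# Sketch: checking the "controlled" conditions on a certificate sequence.
# A certificate is (mu, colour, m, pv), where mu and m map states to
# counts and pv maps transitions to counts. Encoding is illustrative.

def norm(vec):
    """1-norm of a vector given as a dict from keys to non-negative ints."""
    return sum(vec.values())

def is_controlled(certs, s, alpha, beta):
    """Check, for i = 1..len(certs):  ||mu_i|| <= beta,
    ||mu_i|| + ||m_i|| = s + i,  and
    ||mu_i|| + ||m_i|| + ||pv_i|| <= (s + i)**alpha."""
    for i, (mu, _colour, m, pv) in enumerate(certs, start=1):
        if norm(mu) > beta:
            return False
        if norm(mu) + norm(m) != s + i:
            return False
        if norm(mu) + norm(m) + norm(pv) > (s + i) ** alpha:
            return False
    return True

# Toy sequence with s = 2: the sizes ||mu_i|| + ||m_i|| are 3 and 4.
certs = [
    ({"p": 1}, frozenset({"q"}), {"q": 2}, {"t1": 1}),
    ({"p": 1}, frozenset({"q"}), {"q": 3}, {"t1": 2}),
]
print(is_controlled(certs, s=2, alpha=2, beta=1))  # True
```

For the protocol-derived sequence Cert, the constants s, α, β are of course the astronomically larger values given in Corollary 12; the toy values above only exercise the three conditions.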
Lemma 11. Let Cert = cert(1) ... cert(ℓ) be the certificate sequence of P, where cert(i) = (µ_i, S_i, m_i, pv_i). We have ‖µ_i‖ ≤ n·2^{(2n+1)!+1}, ‖µ_i‖ + ‖m_i‖ = s + i, and ‖µ_i‖ + ‖m_i‖ + ‖pv_i‖ ≤ (s + i)^n for all i = 1, ..., ℓ.
Lemma 11 motivates the next definition:
Definition 8. Let Q, C, T be disjoint finite sets of states, colours, and transitions. A QCT-tuple is a four-tuple qct = (µ, c, m, pv), where µ, m ∈ N^Q, c ∈ C, and pv ∈ N^T. A QCT-sequence is a finite sequence of QCT-tuples. A QCT-sequence τ = qct_1, ..., qct_ℓ, where qct_i = (µ_i, c_i, m_i, pv_i), is controlled if there are constants s, α, β such that ‖µ_i‖ ≤ β, ‖µ_i‖ + ‖m_i‖ = s + i, and ‖µ_i‖ + ‖m_i‖ + ‖pv_i‖ ≤ (s + i)^α for all 1 ≤ i ≤ ℓ. We call s, α, β the control parameters of τ and write I(c) := {i : c_i = c} for the indices of elements with colour c ∈ C. We can now reformulate Lemma 11 as:
Corollary 12. Cert is a controlled QCT-sequence with C = 2^Q, and control parameters s ≤ (n + 1)4^{3n}, α = n, and β = n·2^{(2n+1)!+1}.
Linear combinations and good controlled QCT-sequences.
We now show that Cert satisfies a property playing the same role as “goodness” of linearly controlled sequences, but stronger. Intuitively, this makes it much harder to produce long good sequences, which leads to a triple-exponential bound instead of a non-elementary one.
Definition 9. Let qct_1, ..., qct_k be QCT-tuples of the same colour c, where qct_i = (µ_i, c, m_i, pv_i). A tuple (µ, m, pv) ∈ R^Q × R^Q × R^T is a linear combination of qct_1, ..., qct_k if there are coefficients λ_1, ..., λ_k ∈ R such that (µ, m, pv) = Σ_{i=1}^{k} λ_i (µ_i, m_i, pv_i).
Let τ = (qct_i)_{i=1,...,ℓ} be a QCT-sequence. A colour c ∈ C is bad if there is a linear combination qct = (µ, c, m, pv) of (qct_i)_{i∈I(c)} such that µ = 0, m ⪈ 0 (that is, m ≥ 0 and m ≠ 0), and pv ≥ 0. A QCT-sequence is bad if at least one colour is bad, and good otherwise.
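Definition 9 asks for a real linear combination; deciding badness exactly would be a linear program. For intuition only, a small brute-force search over integer coefficients (a sketch of ours, sufficient for tiny hand-made examples but not a decision procedure) can exhibit a bad colour:

```python
# Sketch: exhibiting a bad colour by brute-forcing small integer
# coefficients y with  sum_i y_i*mu_i = 0,  sum_i y_i*m_i >= 0 (and != 0),
# and  sum_i y_i*pv_i >= 0.  Vectors are plain tuples; purely illustrative.
from itertools import product

def combine(vectors, coeffs):
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors))
                 for k in range(len(vectors[0])))

def find_bad_witness(tuples, bound=2):
    """tuples: list of (mu, m, pv) of one colour. Search for coefficients
    y in {-bound..bound}^k witnessing badness as in Definition 9."""
    mus, ms, pvs = zip(*tuples)
    for y in product(range(-bound, bound + 1), repeat=len(tuples)):
        mu = combine(mus, y)
        m = combine(ms, y)
        pv = combine(pvs, y)
        if (all(x == 0 for x in mu) and all(x >= 0 for x in m)
                and any(x > 0 for x in m) and all(x >= 0 for x in pv)):
            return y
    return None

# Two tuples with equal mu and a larger m later, as in the discussion
# below: a scaled variant of the witness (-1, 1) is found first.
same_colour = [((1, 0), (2,), (1,)), ((1, 0), (3,), (2,))]
print(find_bad_witness(same_colour))  # → (-2, 2)
```

A single tuple can never be bad on its own (the only combination with µ = 0 is the zero combination), which the search confirms by returning None.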
Before showing that Cert is a good QCT-sequence, let us give some intuition for this definition. First of all, let us compare the bad sequences of Section 4 with the ones of Definition 9. In Section 4, certificates were triples (µ_i, c_i, m_i), while now they have an extra component (µ_i, c_i, m_i, pv_i). To ease the comparison, ignore the pv component for the moment. A sequence of certificates is bad in the sense of Section 4 if there are indices i < j such that c_i = c_j (i.e. the certificates have the same colour), µ_i = µ_j, and m_j ≥ m_i. So we have µ_j − µ_i = 0 and m_j − m_i ≥ 0, which implies m_j − m_i ⪈ 0 (if m_j − m_i = 0 then µ_i + m_i = µ_j + m_j, and so s + i = ‖µ_i‖ + ‖m_i‖ = ‖µ_j‖ + ‖m_j‖ = s + j, which can only occur if i = j). It follows that the linear combination (µ, m, pv) := −(µ_i, m_i, pv_i) + (µ_j, m_j, pv_j) satisfies the conditions of Definition 9, and so the sequence is also bad in the sense of this definition. But Definition 9 is far more permissive. The sequence is still bad if, for example, we find indices i_1, i_2, j_1, j_2 whose certificates have the same colour and µ-component, and satisfy m_{j_1} + m_{j_2} ≥ m_{i_1} + m_{i_2}; more generally, it is even enough to find (distinct) multisets of indices I and J satisfying |I| = |J| and Σ_{j∈J} m_j ≥ Σ_{i∈I} m_i. So, loosely speaking, while in Section 4 we must wait until we see m_i ≤ m_j for some indices i < j to declare badness, now it suffices to find two multisets I and J of the same size satisfying Σ_{i∈I} m_i ≤ Σ_{j∈J} m_j. Intuitively, this makes it much harder to construct a long good sequence, leading to a triple-exponential bound on the maximal length of good sequences, instead of the non-elementary bound of Section 4. Lemma 13.
Cert is a good QCT-sequence.
Proof. Assume Cert is bad. Then there is a bad colour c and a linear combination (µ, m, pv) of {cert(i) : i ∈ I(c)} that satisfies the conditions of Definition 9. We prove that there exist inputs a and b fulfilling the conditions of the Pumping Lemma (Lemma 10), which contradicts the assumption that P computes x ≥ η.
Let (µ_i, c, m_i, pv_i) := cert(i) for i ∈ I(c) and let y : I(c) → R denote the coefficients of the linear combination (µ, m, pv), meaning that we have Σ_i y_i µ_i = 0, Σ_i y_i m_i ⪈ 0, and Σ_i y_i pv_i ≥ 0. These conditions are invariant under scaling of y, so we may assume wlog that y_i ∈ Z for i ∈ I(c).
As we already noted, potential reachability depends only on the Parikh vector of the transition sequence. So we will extend =⇒ to Parikh vectors by writing m =pv⇒ v for pv ∈ N^T if m =σ⇒ v for some sequence σ ∈ T∗ with σ⃗ = pv. Note that m =pv⇒ v is thus equivalent to m + Σ_{t∈T} pv(t)·Δ(t) = v.
Recall that due to Definition 7, we have
IC(s + i) −σ_i→ m_γ + IC(i) −π_i→ µ_i + m_i   (∗)
for every i ∈ I(c) and sequences σ_i, π_i ∈ T∗, where pv_i = σ⃗_i + π⃗_i.
Let us now define inputs a and b fulfilling the conditions of the Pumping Lemma (Lemma 10). For a we simply pick any element j ∈ I(c) and set a := j. By (∗), condition (1) of Lemma 10 holds for a and (µ, S) := (µ_j, c). It remains to prove (2). Set b := Σ_i y_i (s + i) and pv := Σ_i y_i pv_i. Since Σ_i y_i µ_i = 0, Σ_i y_i m_i ⪈ 0, and Σ_i y_i pv_i ≥ 0, we have
IC(b) = IC(Σ_i y_i (s + i)) = Σ_i y_i IC(s + i) =pv⇒ Σ_i y_i (µ_i + m_i) = Σ_i y_i m_i ⪈ 0.
Since m_i ∈ N^c for i ∈ I(c) we get IC(b) =∗⇒ N^c \ {0}. Transitions preserve the total number of agents, so b > 0.
Bounding good controlled QCT-sequences
We obtain a bound on the length of a good controlled QCT-sequence with control parameters s, α, β. More precisely, our goal is to prove the following theorem:
Theorem 14. The length ℓ of a good QCT-sequence with control parameters s, α, and β satisfies
log ℓ ≤ (log β + 1 + α log(s + 1)) · (3 + α)^{|C|(2|Q|+|T|)}.
Observe that this is a purely combinatorial question, motivated by, but independent from, population protocols.
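For concreteness, the right-hand side of Theorem 14 is easy to evaluate for given parameters. The sketch below is our own illustration (base-2 logarithms are our choice; the base of the logarithm only shifts constants):

```python
# Sketch: evaluating the right-hand side of Theorem 14,
#   log l <= (log beta + 1 + alpha*log(s+1)) * (3+alpha)^(|C|*(2|Q|+|T|)),
# for concrete control parameters. Illustrative only.
from math import log2

def theorem14_bound(s, alpha, beta, n_colours, n_states, n_transitions):
    exponent = n_colours * (2 * n_states + n_transitions)
    return (log2(beta) + 1 + alpha * log2(s + 1)) * (3 + alpha) ** exponent

# Tiny instance: s=1, alpha=1, beta=2, |C|=1, |Q|=1, |T|=1.
print(theorem14_bound(1, 1, 2, 1, 1, 1))  # (1+1+1) * 4**3 = 192.0
```

Even these tiny parameters make the doubly exponential dependence on |Q| visible once one recalls that, for protocols, |C| = 2^n and |T| grows polynomially in n.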
Notation. We collect a number of notations used in the rest of the section.
• τ denotes a QCT-sequence with control parameters s, α, β.
• c ∈ C denotes an arbitrary colour of τ.
• I(c) denotes the set of indices of the elements of τ of colour c.
• For i ∈ I(c), qct_i = (µ_i, c, m_i, pv_i) denotes the i-th element of τ.
• For i ∈ I(c), u_i denotes the concatenation of the vectors m_i and pv_i, for which we use the notation u_i = (m_i; pv_i).
• I∗(c) ⊆ I(c) denotes the set of indices i ∈ I(c) s.t. (u_i; µ_i) is linearly independent from {(u_j; µ_j) : j ∈ I(c), j < i}.
We proceed in several steps:
• In Section 5.3.1 we use Farkas’ Lemma to construct a certificate of goodness for a colour c. A certificate of goodness is a mapping that assigns a real number, called a weight, to each dimension of µ_i and u_i. The mapping itself is called a weighting. We show how to compute basic weightings as the unique solution of a system of equations (Lemma 16).
• In Section 5.3.2 we bound the size of a basic weighting, and transform this bound into a bound on the length of τ (Lemma 19). However, the bound still depends on the size of the vectors u_i, with i ∈ I∗(c).
• In Section 5.3.3 we remove this dependence and prove Theorem 14.
We start by formally defining weightings.
Definition 10. A vector (y, z), where y ∈ R^Q and z ∈ R^Q × R^T, is a weighting for the colour c, also called a c-weighting, if z ≥ 0 and y^T µ_i + z^T u_i = −(s + i) for all i ∈ I(c).
We use Farkas’ Lemma to prove that the existence of a c-weighting is a certificate of goodness for colour c.
Lemma 15. A colour c is good iff it has a weighting.
Proof. As stated in Definition 9, c is a bad colour iff
∃x ∈ R^{I(c)} : Σ_i x_i µ_i = 0 and Σ_i x_i pv_i ≥ 0 and Σ_i x_i m_i ⪈ 0.   (1)
Let A_1 := ((µ_i^T)_{i∈I(c)})^T be the matrix where column i is µ_i, and A_2 := ((u_i^T)_{i∈I(c)})^T. Now (1) is equivalent to
∃x ∈ R^{I(c)} : A_1 x = 0 and A_2 x ≥ 0 and ‖Σ_i x_i m_i‖ > 0.   (2)
Since τ is a controlled QCT-sequence we have 1^T(µ_i + m_i) = ‖µ_i‖ + ‖m_i‖ = s + i. If we assume further that A_1 x = 0 and A_2 x ≥ 0, then ‖Σ_i x_i m_i‖ > 0 is equivalent to
0 < ‖Σ_i x_i m_i‖ = 1^T Σ_i x_i m_i = 1^T (Σ_i x_i m_i + Σ_i x_i µ_i) = Σ_i x_i 1^T(µ_i + m_i) = Σ_i x_i (s + i) =: b^T x,
defining b ∈ R^{I(c)} as b_i := s + i for i ∈ I(c). Hence c is bad iff
∃x ∈ R^{I(c)} : A_1 x = 0 and A_2 x ≥ 0 and b^T x > 0.   (3)
By Farkas’ Lemma, (3) is infeasible iff
∃y ∈ R^Q, z ∈ R^{Q∪T} : A_1^T y + A_2^T z = −b and z ≥ 0.   (4)
So c is good iff (4) is feasible. Moreover, (4) is equivalent to (y, z) being a c-weighting.
A good colour may have multiple weightings, even an infinite convex set of weightings. Similarly to basic solutions of a linear program, we introduce basic weightings of a colour, whose size we will bound using simple linear algebra. Recall that I∗(c) denotes the set of indices i ∈ I(c) s.t. (u_i; µ_i) is linearly independent from {(u_j; µ_j) : j ∈ I(c), j < i}. Two properties of a basic weighting are of interest: (1) it is the unique solution of a linear system of equations, and (2) it has at most |I∗(c)| nonzero components. The proof is a straightforward application of well-known properties of linear inequalities, and is given in the appendix. Lemma 16.
Let c be a good colour. Then there are Y ⊆ Q, Z ⊆ Q ∪ T with |Y| + |Z| = |I∗(c)| such that the system y^T µ_i + z^T u_i = −s − i, for all i ∈ I∗(c), has a unique solution y ∈ R^Y, z ∈ R^Z, and (y, z) is a c-weighting. We refer to such a (y, z) as a basic c-weighting.
The next step is showing that the existence of a basic weighting implies an upper bound on the length of the QCT-sequence. We begin by showing a general bound on a unique solution to a linear system of equations. Again, the proof is routine linear algebra, and can be found in the Appendix.
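The elimination argument behind that bound can be made concrete. The following sketch (our own illustration, not the paper's code) solves a square integer system while keeping every intermediate entry integral, using the same row-update rule analysed in the Appendix proof; the doubling of entry sizes under this rule is exactly what the coming bound quantifies:

```python
# Sketch: integer-preserving Gaussian elimination. To eliminate a
# variable we replace row j by  A[i][i]*row_j - A[j][i]*row_i, which
# keeps all entries integral. Encoding and names are illustrative.
from fractions import Fraction

def solve_integer_system(A, b):
    """Solve A x = b for a square integer system with a unique solution,
    returning the solution as a list of Fractions."""
    n = len(A)
    A = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix
    for i in range(n):
        # pick a pivot row r >= i with a nonzero entry in column i
        r = next(r for r in range(i, n) if A[r][i] != 0)
        A[i], A[r] = A[r], A[i]
        for j in range(n):
            if j != i and A[j][i] != 0:
                p, q = A[i][i], A[j][i]
                A[j] = [p * x - q * y for x, y in zip(A[j], A[i])]
    return [Fraction(A[i][n], A[i][i]) for i in range(n)]

# 2x + y = 3 and x - y = 0 have the unique solution x = y = 1.
print(solve_integer_system([[2, 1], [1, -1]], [3, 0]))
# → [Fraction(1, 1), Fraction(1, 1)]
```

Note the entries of the modified matrix can roughly square at each step; bounding that growth row by row is the content of the next lemma.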
Lemma 17. Let Ax = b denote a linear system of equations with unique solution x, where A ∈ Z^{d×d}, and let g(i) ≥ log max {|A_{ij}| : j} ∪ {|b_i|} denote an upper bound for each row i. Then log ‖x‖∞ ≤ W(g, d), where
W(g, d) := 2^{d−1} − 1 + Σ_{t=1}^{d−1} 2^{d−1−t} g(t) + g(d).
We now use the previous lemma to prove an upper bound on the components of some c-weighting, for each colour c, based on the sizes of the linearly independent vectors µ_i, u_i with i ∈ I∗(c). To refer to these sizes, we set {l_1, ..., l_d} := I∗(c) with l_1 < ... < l_d, and define g_c(i) := log(‖µ_{l_i}‖ + ‖u_{l_i}‖) for i = 1, ..., d. We remark that Definition 8 immediately gives the estimate g_c(i) ≤ α log(s + l_i).
Lemma 18. For each colour c and d := |I∗(c)|, there is a c-weighting (y, z) with log ‖(y; z)‖∞ ≤ W(g_c, d).
Proof. Lemma 16 allows us to construct a c-weighting as the solution to a specific set of linear equations. In particular, we set A to the matrix with A_{ij} := (µ_i)_j for i ∈ I∗(c), j ∈ Y and A_{ij} := (u_i)_j for i ∈ I∗(c), j ∈ Z, and define b as b_i := −s − i for i ∈ I∗(c). Then A(y; z) = b has as unique solution y ∈ R^Y, z ∈ R^Z, where (y, z) is a c-weighting. Now our desired bound follows simply by applying Lemma 17. (Note that |b_i| = s + i ≤ ‖µ_i‖ + ‖u_i‖, so the g_c bound also covers the right-hand side b.)
From this upper bound we can derive a bound on the length of the sequence (restricted to a specific colour c), using that the weights for the u_i must be nonnegative. Lemma 19.
For any colour c and d := |I∗(c)|, we have log max I(c) ≤ log β + W(g_c, d).
Proof. Let (y, z) denote a c-weighting fulfilling the bound of Lemma 18. Hence for every i ∈ I(c) we have y^T µ_i + z^T u_i = −s − i. We know that z^T u_i ≥ 0, as z, u_i ≥ 0, so i ≤ −y^T µ_i − s ≤ ‖y‖∞ ‖µ_i‖. By Definition 8, ‖µ_i‖ ≤ β, which we can plug into the bound of Lemma 18 to get the desired statement.
The bound of Lemma 19 still depends on g_c, i.e. the sizes of the elements with indices in I∗(c). We now show how to move from this bound to the one of Theorem 14. The proof that the expression of Theorem 14 is indeed a bound proceeds by induction on d, i.e. assuming that the bound is correct when I∗(c) contains d linearly independent vectors, we show that it remains correct when it contains d + 1. For this, observe that in controlled sequences a bound on the length of the sequence yields a bound on the size of its vectors. So we use the sizes of the first d linearly independent vectors to derive a bound on the length of the sequence until the (d + 1)-th linearly independent vector, which yields a bound on the size of this vector.
There is a slight complication in that the induction needs to be performed for all colours at once, instead of separately for each colour. Our induction variable is thus the total number of linearly independent vectors (of all colours), which we refer to as P. The induction hypothesis also needs to be chosen carefully. We use that the upper bound on max I(c) (from Lemma 19) is bounded by f(P) for a suitable function f.
Theorem 14. The length ℓ of a good QCT-sequence with control parameters s, α, and β satisfies
log ℓ ≤ (log β + 1 + α log(s + 1)) · (3 + α)^{|C|(2|Q|+|T|)}.
Proof.
Let d_c := |I∗(c)| for c ∈ C and P := Σ_c d_c ≤ |C|(2|Q| + |T|). We will prove the stronger statement that G_c(d_c) ≤ f(P) for all colours c with d_c > 0, where
f(P) := (log β + 1 + α log(s + 1)) · (3 + α)^{P−1}  and  G_c(r) := log β + 2^{r−1} − 1 + Σ_{t=1}^{r−1} 2^{r−1−t} g_c(t) + g_c(r).
This is a stronger statement due to Lemma 19, which shows that log ℓ ≤ G_c(d_c) for some colour c with I(c) ≠ ∅ and thus d_c > 0. The proof will proceed by induction on P.
In the base case we have P = 1 and thus g_c(1) ≤ α log(s + 1) for each c ∈ C, hence G_c(d_c) ≤ f(1) (with d_c ≤ 1).
For the induction step, let j := max ∪_c I∗(c) denote the last index of any linearly independent u_i, i.e. the last index at which P increases, and let c_0 denote the colour of j. For all colours c ≠ c_0, the value of G_c(d_c) does not change, so the induction hypothesis yields G_c(d_c) ≤ f(P − 1) and thus G_c(d_c) ≤ f(P).
For colour c_0, we use the induction hypothesis to get log(j − 1) ≤ f(P − 1) and G_{c_0}(d_{c_0} − 1) ≤ f(P − 1). The size g_{c_0}(d_{c_0}) (i.e. of the vector at index j) can, using the former, be bounded as g_{c_0}(d_{c_0}) ≤ α log(s + j) ≤ α(f(P − 1) + 1). (Here we used log(s + 1) ≤ f(P − 1) and log(a + b) ≤ log(a) + 1 for a ≥ b.) This is then combined with the latter:
G_{c_0}(d_{c_0}) ≤ 2·G_{c_0}(d_{c_0} − 1) + g_{c_0}(d_{c_0}) ≤ 2f(P − 1) + α(f(P − 1) + 1) ≤ (3 + α)f(P − 1) = f(P).
Let us put all the pieces together. Let P be a leaderless protocol with n states computing a predicate x ≥ η, and let s := (n + 1)4^{3n} be the constant of the Pumping Lemma (Lemma 10). We prove η ≤ 2^{2^{2^{O(n)}}}. If η ≤ s then we are done. So assume that η > s.
• Since P rejects inputs s, s + 1, ..., η −
1, the certificate sequence Cert of Definition 7 has length ℓ = η − 1 − s.
• By Corollary 12, Cert is a controlled QCT-sequence with set C := 2^Q of colours, and control parameters s, α := n, and β = n·2^{(2n+1)!+1}. Further, by Lemma 13, Cert is good.
• By Theorem 14, the length ℓ of Cert satisfies
log ℓ ≤ (log β + 1 + α log(s + 1)) · (3 + α)^{|C|(2|Q|+|T|)},
where |C| = 2^{|Q|} = 2^n, |Q| = n, and |T| ≤ n^4 (each transition is determined by a four-tuple of states). This expression is 2^{2^{O(n)}}.
• So η = ℓ + s + 1 is bounded by 2^{2^{2^{O(n)}}}.
This yields a triple-exponential bound on the busy beaver function for leaderless protocols (see Lemma 23 for the precise bound):
Theorem 20. BB(n) ≤ 2^{2^{2^{n + 5 log n + 2}}}, and so STATE(n) ∈ Ω(log log log n).
6. Conclusion
We have obtained the first non-trivial lower bounds on the state complexity of population protocols, a fundamental but very hard question about the model. The obvious open questions are to close the gap between the Ω(log log log n) lower bound and the O(log n) upper bound for the leaderless case, and the even larger gap for protocols with leaders between the lower bound, which is the inverse of a non-elementary function, and (roughly speaking) the O(log log n) upper bound.
References
[1] Sergio Abriola, Santiago Figueira, and Gabriel Senno. Linearizing well quasi-orders and bounding the length of bad sequences.
Theor. Comput. Sci. , 603:3–22, 2015.[2] Dan Alistarh, James Aspnes, David Eisenstat, Rati Gelashvili, and Ronald L.Rivest. Time-space trade-offs in population protocols. In
Proc. 28th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2560–2579. SIAM, 2017. doi:10.1137/1.9781611974782.169.[3] Dan Alistarh, James Aspnes, and Rati Gelashvili. Space-optimal majority in population protocols. In
SODA , pages 2221–2239. SIAM, 2018.[4] Dan Alistarh and Rati Gelashvili. Recent algorithmic advances in populationprotocols.
SIGACT News , 49(3):63–73, 2018.[5] Dana Angluin, James Aspnes, Zoë Diamadi, Michael J. Fischer, and René Peralta.Computation in networks of passively mobile finite-state sensors. In
PODC , pages290–299. ACM, 2004.[6] Dana Angluin, James Aspnes, Zoë Diamadi, Michael J. Fischer, and René Per-alta. Computation in networks of passively mobile finite-state sensors.
DistributedComputing , 18(4):235–253, 2006.[7] Dana Angluin, James Aspnes, Zoë Diamadi, Michael J. Fischer, and René Peralta.Computation in networks of passively mobile finite-state sensors.
Distributed Comput. ,18(4):235–253, 2006.[8] Dana Angluin, James Aspnes, and David Eisenstat. Fast computation by populationprotocols with a leader.
Distributed Comput. , 21(3):183–199, 2008.[9] Dana Angluin, James Aspnes, David Eisenstat, and Eric Ruppert. The computationalpower of population protocols.
Distributed Comput., 20(4):279–304, 2007.[10] A. R. Balasubramanian. Complexity of controlled bad sequences over finite sets of N^d. In LICS, pages 130–140. ACM, 2020.[11] A. R. Balasubramanian, Javier Esparza, and Mikhail A. Raskin. Finding cut-offs in leaderless rendez-vous protocols is easy.
CoRR, abs/2010.09471, 2020. To appear in Proceedings of FOSSACS 2021.[12] Michael Blondin, Javier Esparza, Blaise Genest, Martin Helfrich, and Stefan Jaax. Succinct population protocols for Presburger arithmetic. In
STACS , volume 154 of
LIPIcs , pages 40:1–40:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020.[13] Michael Blondin, Javier Esparza, and Stefan Jaax. Large flocks of small birds: onthe minimal size of population protocols. In
STACS , volume 96 of
LIPIcs, pages 16:1–16:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018.[14] Wojciech Czerwinski, Slawomir Lasota, Ranko Lazic, Jérôme Leroux, and Filip Mazowiecki. The reachability problem for Petri nets is not elementary. In
STOC ,pages 24–33. ACM, 2019.[15] Robert Elsässer and Tomasz Radzik. Recent results in population protocols forexact majority and leader election.
Bull. EATCS, 126, 2018.[16] Javier Esparza. Petri nets lecture notes, 2019. URL: https://archive.model.in.tum.de/um/courses/petri/SS2019/PNSkript.pdf.[17] Diego Figueira, Santiago Figueira, Sylvain Schmitz, and Philippe Schnoebelen. Ackermannian and primitive-recursive bounds with Dickson’s lemma. In
LICS , pages269–278. IEEE Computer Society, 2011.[18] Leszek Gąsieniec and Grzegorz Stachowiak. Enhanced phase clocks, populationprotocols, and fast space optimal leader election.
J. ACM, 68(1), 2020.[19] Christoph Haase. A survival guide to Presburger arithmetic.
ACM SIGLOG News ,5(3):67–82, 2018.[20] Florian Horn and Arnaud Sangnier. Deciding the existence of cut-off in parameterizedrendez-vous networks. In
CONCUR , volume 171 of
LIPIcs , pages 46:1–46:16. SchlossDagstuhl - Leibniz-Zentrum für Informatik, 2020.[21] Jérôme Leroux and Sylvain Schmitz. Reachability in vector addition systems isprimitive-recursive in fixed dimension. In
LICS , pages 1–13. IEEE, 2019.[22] Ken McAloon. Petri nets and large finite sets.
Theor. Comput. Sci. , 32:173–183,1984.[23] Tadao Murata. Petri nets: Properties, analysis and applications.
Proceedings of theIEEE , 77(4):541–580, 1989.[24] Charles Rackoff. The covering and boundedness problems for vector addition systems.
Theor. Comput. Sci., 6:223–231, 1978.[25] Sylvain Schmitz. Complexity hierarchies beyond elementary.
ACM Trans. Comput.Theory , 8(1):3:1–3:36, 2016.[26] Alexander Schrijver.
Theory of Linear and Integer Programming . John Wiley &Sons, Inc., USA, 1986.[27] Joachim von zur Gathen and Malte Sieveking. A bound on solutions of linear integerequalities and inequalities.
Proceedings of the American Mathematical Society ,42(1):155–158, 1978.
A. Appendix
A.1. Proof of Lemma 9
Lemma 9.
For every γ ∈ N there exists an s ≤ γ·n·2^n such that IC(s) −∗→ m for some configuration m ≥ γ.
Proof. Let Q_i := {q ∈ Q : IC(2^i) −∗→ q + m, m ∈ N^Q} \ Q_{i−1} for i = 1, 2, ..., and Q_0 := {x} the set containing just the initial state. Intuitively, Q_i contains the states reachable starting with 2^i agents, but not with 2^{i−1}. We know that each state is reachable from a configuration IC(s) for some s ∈ N, so Q = ∪_{i≥0} Q_i.
It suffices to prove that Q_i = ∅ implies Q_{i+1} = ∅, as then Q = ∪_{i=0}^{n−1} Q_i and each q ∈ Q is reachable starting from IC(2^{n−1}). Assume that this is not the case, i.e. there exist some i and q ∈ Q_i with Q_{i−1} = ∅ and IC(2^i) −σ→ q + m for some m ∈ N^Q. We pick such a q which minimises the length of σ.
This means that the last transition of σ is (q_1, q_2) ↦ (q, q_3) for some q_1, q_2, q_3 ∈ Q. Additionally, q_1, q_2 ∉ Q_i, as they are reachable by shorter sequences. But then, since Q_{i−1} = ∅, we have q_1, q_2 ∈ ∪_{j=0}^{i−2} Q_j, i.e. q_1 and q_2 are each reachable from IC(2^{i−2}). Hence IC(2^{i−1}) −∗→ q_1 + q_2 + m′ −→ q + q_3 + m′ for some m′ ∈ N^Q, contradicting q ∈ Q_i.
A.2. Proof of Lemma 11
Lemma 11. Let Cert = cert(1) ... cert(ℓ) be the certificate sequence of P, where cert(i) = (µ_i, S_i, m_i, pv_i). We have ‖µ_i‖ ≤ n·2^{(2n+1)!+1}, ‖µ_i‖ + ‖m_i‖ = s + i, and ‖µ_i‖ + ‖m_i‖ + ‖pv_i‖ ≤ (s + i)^n for all i = 1, ..., ℓ.
Proof. Let r := s + i. Since IC(s + i) −∗→ µ_i + m_i, we have ‖µ_i‖ + ‖m_i‖ = s + i = r. Further, ‖µ_i‖ ≤ n·‖µ_i‖∞ ≤ n·2^{(2n+1)!+1} follows from Lemma 3. For ‖pv_i‖ we know that it is the number of transitions of a shortest execution leading from IC(r) to µ_i + m_i. Since a shortest execution visits a configuration at most once, the length is bounded by the number of configurations with r agents, which is equal to C(n + r − 1, r). Using r ≥ 3 and n ≥ 2, we obtain
‖µ_i‖ + ‖m_i‖ + ‖pv_i‖ ≤ r + C(n + r − 1, r) = r + Π_{j=1}^{n−1} (r + j)/j ≤ r + (r + 1)·r^{n−2} ≤ r^{n−2}(2r + 1) ≤ r^n.
A.3. Proof of Lemma 16
We need a well-known elementary result from linear algebra.
Theorem 21 ([26, Theorem 7.1]). Let m, n ∈ N, and let a_1, ..., a_m, b ∈ R^n denote vectors. We write t := rank{a_1, ..., a_m, b} for the dimension of the subspace spanned by a_1, ..., a_m and b. Then exactly one of the following holds.
1. b is a nonnegative linear combination of linearly independent vectors from a_1, ..., a_m.
2. There is a c ∈ R^n with c^T b < 0 and c^T a_i ≥ 0 for i = 1, ..., m, where c^T a_i = 0 for t − 1 linearly independent a_i.
The following is an immediate consequence:
Corollary 22. Let A ∈ R^{n×m} and P := {x ∈ R^m : Ax = b, x ≥ 0} be nonempty. Then it has a solution x ∈ P with at most n nonzero components.
Proof. Let a_i denote the i-th column of A. If statement (2) of Theorem 21 were to hold, then for any x ∈ P we would have 0 > c^T b = c^T Ax, where both c^T A and x are nonnegative. This is a contradiction, so (1) must hold instead, which directly implies the desired statement, as A has at most n linearly independent columns.
Now we proceed to prove the Lemma. Lemma 16.
Let c be a good colour. Then there are Y ⊆ Q, Z ⊆ Q ∪ T with |Y| + |Z| = |I∗(c)| such that the system y^T µ_i + z^T u_i = −s − i, for all i ∈ I∗(c), has a unique solution y ∈ R^Y, z ∈ R^Z, and (y, z) is a c-weighting. We refer to such a (y, z) as a basic c-weighting.
Proof. By definition, y ∈ R^Q, z ∈ R^{Q∪T} is a c-weighting iff y^T µ_i + z^T u_i = −s − i for all i ∈ I(c) and z ≥ 0. We know that this system of linear inequalities is feasible, as a c-weighting exists, so its set of solutions does not change if we consider only linearly independent rows, and we get y^T µ_i + z^T u_i = −s − i for i ∈ I∗(c), z ≥ 0. The statement then follows by considering a basic solution of the corresponding linear program.
For completeness, we provide an alternative argument involving Corollary 22. Let y, z denote a solution to the above system, A the matrix where the i-th row is [µ_i −µ_i u_i] (i.e. the concatenation of µ_i, −µ_i, u_i) for i ∈ I∗(c), and set x := [y⁺ y⁻ z], where y⁺, y⁻ ≥ 0 split y into positive and negative components fulfilling y = y⁺ − y⁻. Then Ax = b, where b_i := −s − i. Applying Corollary 22 we then find a solution x with at most |I∗(c)| nonzero components. Picking an x with a minimal number of nonzero components then yields corresponding y, z with a total of at most |I∗(c)| nonzero components. We then define Y and Z as the support of y and z, respectively. If the solution (y, z) were not unique, then it would also be possible to construct a solution (y′, z′), and thus an x′, with smaller support, but that would contradict our choice of x.
A.4. Proof of Lemma 17
Lemma 17.
Let Ax = b denote a linear system of equations with unique solution x, where A ∈ Z^{d×d}, and let g(i) ≥ log max {|A_{ij}| : j} ∪ {|b_i|} denote an upper bound for each row i. Then log ‖x‖∞ ≤ W(g, d), where
W(g, d) := 2^{d−1} − 1 + Σ_{t=1}^{d−1} 2^{d−1−t} g(t) + g(d).
Proof. We perform at most d − 1 elimination steps to isolate x_p, p := argmax_i |x_i|, the largest component of x (modifying A in the process). In iteration i = 1, ..., d − 1, let v denote the i-th row of A. We know that v = 0 cannot occur, as A has full rank, so either the only nonzero element of v is in column p and we solve directly for x_p, or we use v to eliminate one variable from the rest of A, taking care to leave only integer elements. For example, to eliminate element i from another row v′, we would update that row to v_i v′ − v′_i v (similarly for the right-hand side b).
Let a_{ij} := log max {|A_{jt}| : t} ∪ {|b_j|} denote the logarithm of the maximum absolute value of the j-th row of the linear system after iteration i, for j = 1, ..., d and i = 0, ..., d − 1. We then claim
a_{ij} ≤ g(j) + 2^i − 1 + Σ_{t=1}^{i} 2^{i−t} g(t).
For i = 0, i.e. before the first iteration, this reduces to a_{0j} ≤ g(j), which holds. Using row i to eliminate an element from row j in iteration i means that we get 2^{a_{ij}} ≤ 2^{a_{i−1,j}+a_{i−1,i}} + 2^{a_{i−1,i}+a_{i−1,j}}, and therefore
a_{ij} ≤ a_{i−1,j} + a_{i−1,i} + 1 ≤ g(j) + 1 + 2·(2^{i−1} − 1) + 2·Σ_{t=1}^{i−1} 2^{i−1−t} g(t) + g(i) = g(j) + 2^i − 1 + Σ_{t=1}^{i} 2^{i−t} g(t).
As all coefficients remain integral during the computation, we finally have log ‖x‖∞ ≤ a_{d−1,d}, which is the bound we wanted to show.
B. Proof of the Final Bound
Lemma 23. Let |Q| = n ≥ 5, |T| ≤ n^4, s := (n + 1)4^{3n}, ℓ := η − 1 − s, |C| = 2^n, α := n, β := n·2^{(2n+1)!+1}, and
log ℓ ≤ (log β + 1 + α log(s + 1)) · (3 + α)^{|C|(2|Q|+|T|)}.
Then log log log η ≤ n + 5 log(n) + 2.
Proof. We write η_0 := η and η_{i+1} := log η_i, and set P := |C|(2|Q| + |T|) = 2^n(2n + n^4). Then
η_1 = log(ℓ + 1 + s) ≤ (log β + 2 + α log(s + 1)) · (3 + α)^P
≤ ((2n + 1)! + log(n) + 3 + n·log((n + 1)4^{3n} + 1)) · (3 + n)^P
≤ ((2n + 2)! + 4n(2n + 1)) · (3 + n)^P   (log(n + 1) ≤ n)
≤ (2n + 4)! · (3 + n)^P,   (1)
where at (1), and in the second step, we use log(a + b) ≤ log(a) + 1 for 0 < b ≤ a. Next,
η_2 ≤ log((2n + 4)!) + P·log(3 + n) ≤ 2(2n + 4) log(n + 3) + P·log(3 + n)   (n! ≤ n^n)
≤ log(n + 3) · (2^n(2n + n^4) + 2(2n + 4)).
Now we use (2) log(a + b) ≤ log(a) + 1 for 0 < b ≤ a, and (3) log log(n + 3) ≤ log log(n) + 1 for n ≥ 5. Hence
η_3 ≤ log log(n + 3) + log(2^n(2n + n^4) + 2(2n + 4))
≤ log log(n) + 1 + n + log(2n + n^4) + 1   (2), (3)
≤ log log(n) + n + 4 log(n) + 3   (2n + n^4 ≤ 2n^4, log(2n^4) ≤ 4 log(n) + 1)
≤ n + 5 log(n) + 2.   (log log(n) + 1 ≤ log(n))