Byzantine Agreement with Unknown Participants and Failures
Pankaj Khanchandani
ETH Zurich
Zurich, Switzerland
[email protected]
Roger Wattenhofer
ETH Zurich
Zurich, Switzerland
[email protected]
Abstract—A set of mutually distrusting participants that want to agree on a common opinion must solve an instance of a Byzantine agreement problem. These problems have been extensively studied in the literature. However, most of the existing solutions assume that the participants are aware of n — the total number of participants in the system — and f — an upper bound on the number of Byzantine participants. In this paper, we show that most of the fundamental agreement problems can be solved without affecting resiliency even if the participants do not know the values of (possibly changing) n and f. Specifically, we consider a synchronous system where the participants have unique but not necessarily consecutive identifiers, and give Byzantine agreement algorithms for reliable broadcast, approximate agreement, rotor-coordinator, early terminating consensus and total ordering in static and dynamic systems, all with the optimal resiliency of n > 3f. Moreover, we show that synchrony is necessary, as agreement with probabilistic termination is impossible in a semi-synchronous or asynchronous system if the participants are unaware of n and f.

I. INTRODUCTION
Many modern networks have to be always available, and it may not be possible to know the size of the network or the number of failures in advance, since they may change over time. Consider, for example, a database cluster that requires frequent node scaling because of changing load, or a wireless sensor network that experiences a changing number of faulty or disconnected nodes over time. Nakamoto's blockchain [27] is a prominent example where the network is permissionless, i.e., the network is open to any number of nodes. So, the number of participants and, consequently, the number of failures also change over time. Agreement is a fundamental distributed computing primitive for fault-tolerant networks; however, much of the existing literature assumes that the size n of the network and/or the upper bound f on the number of failures is known to every node [4], [3], [10], [30], [26].

In this paper, we consider fault-prone systems where the nodes do not know the number of nodes n and the maximum number of Byzantine nodes f, and study fundamental agreement problems for such systems, in particular:
• Reliable broadcast — ensures that a message is either accepted by every correct node or by no correct node [28];
• Rotor-coordinator — selects f + 1 leaders for the correct nodes;
• Consensus — every correct node has a binary input and the correct nodes output a common binary value that is an input of some correct node [22];
• Approximate agreement — each correct node has a real number input and has to output a real number that is strictly within the correct input values [13];
• Total ordering — each correct node maintains a total order on the system events while participants may enter and leave subject to n > 3f.

Since a correct node does not know n and f, and a Byzantine node may not announce itself to everyone, there might be more Byzantine nodes in the system than what a correct node thinks. Thus, it may not be possible to achieve a resiliency of n > 3f, which can be achieved when the nodes know n and f.
When f is known and the identifiers are consecutive, it is easy to agree on a set of f + 1 nodes, and consequently ensure the presence of a single correct leader node in the set. We show, however, that these problems can be solved without affecting resiliency even when n and f are not known. Specifically, we give algorithms for solving the above problems in synchronous systems with the resiliency of n > 3f, which is optimal for approximate agreement, reliable broadcast and consensus. We also show that the synchrony assumption is necessary, as otherwise the problem is impossible and there is a non-zero probability of terminating with a disagreement.

An advantage of designing algorithms without the knowledge of n and f is their application to networks where the set of participants changes over time. We illustrate this by extending some of our algorithms to solve Byzantine agreement in dynamic networks. In the case of dynamic networks, single shot problems, where a node acts on one input and terminates with one output, are not very useful. So, we consider an agreement task where nodes are required to totally order the events in a system and design an algorithm for that task.

II. RELATED WORK
If the nodes do not know n and f, then the synchrony assumption is necessary. Otherwise, if the network is asynchronous and the message delays are unbounded, agreement is impossible even with probabilistic termination, as we show later. There is a line of work that deals with this problem using oracles or failure detectors [9], [21], [2]. The idea is that a failure detector supplies information about the faulty participants. But, these works also assume that every node knows f. In [29], the authors consider an asynchronous dynamic system with a failure detector where n and f are unknown, but the failure detector assumed eventually removes the Byzantine nodes.

Several ways of improving the robustness of synchronous systems with Byzantine failures have also been explored. For example, Gallet et al. [11] examine a system that can allocate the same identifier to multiple nodes. In [5], [18], [8], the authors examine a synchronous system with mobile Byzantine faults — those which hop from one node to another across rounds. In [23], [24], the authors consider self-stabilizing agreement problems in the presence of Byzantine faults, i.e., the correct nodes have to recover from an arbitrary initial state even when the Byzantine nodes maliciously prevent the correct nodes from recovering. In [20], the machines are assigned weights and the total weight of the faulty machines is less than a third of the total weight.

The Byzantine agreement problems have a long history since the work by Lamport et al. [22]. They gave an f + 1 round algorithm, with message complexity exponential in n, for n > 3f. They also showed that the resilience of n > 3f is optimal. Berman et al. [6] later improved the message complexity to polynomial in n, while keeping optimal resilience and increasing the number of rounds only by a small constant. Garay et al. [19] further improved the round complexity to exactly f + 1, while retaining optimal resilience and polynomial message complexity. The algorithm by Berman et al.
[6] is well known as the king algorithm and is still commonly used [23], [12], [1]. The approximate agreement algorithm was introduced by Dolev et al. [13] and is a useful primitive in designing distributed algorithms [25], [14]. It also requires n > 3f and has optimal resiliency [17]. Srikanth et al. [28] introduced the reliable broadcast abstraction and its use in dealing with Byzantine failures for n > 3f. As they remark, this resiliency is optimal, as the reliable broadcast abstraction can be used to solve consensus. The algorithms for approximate agreement, reliable broadcast and consensus in this paper generalize the ones from Dolev et al. [13], Srikanth et al. [28] and Berman et al. [6], respectively.

A rotor-coordinator, as also used in [6], is an approach to deal with at most f Byzantine faults by rotating through f + 1 coordinator nodes, thus ensuring that one coordinator is correct. The rotor-coordinator can be easily implemented by rotating through f + 1 nodes when f is known and the identifiers are consecutive. However, it is one of the main bottlenecks when n and f are unknown and the identifiers are non-consecutive.

III. SIGNIFICANCE OF THIS WORK
It is not so difficult to observe that if all the correct nodes broadcast in a round, then each correct node v receives less than n_v/3 messages from the Byzantine nodes — where n_v is the number of messages received by the node v — irrespective of whether the Byzantine nodes broadcast or not. This observation helps in removing the dependency on n and f from the classic known algorithms. However, this observation is not sufficient by itself. Many of the classic algorithms run for a fixed f + 1 rounds, selecting a different leader in each round. This is a non-trivial problem in our setting, since f is not common knowledge and the identifiers are also not consecutive. Algorithm 2 for rotor-coordinator essentially solves this problem.

The classical models studied in the literature do not allow the Byzantine nodes to lie about the number of participants in the network, since it is assumed to be known by every node. Our system model allows the Byzantine nodes to send conflicting information, so that the correct nodes never have consistent information about the number of participants. Therefore, the algorithms designed are robust against a wider range of malicious behavior. This is especially useful for large dynamic systems where it may not be possible to consistently initialize every node with the values of n and f.

On the other hand, participants are assumed to have access to consistent clocks after initialization, since the computation is assumed to proceed in rounds. So, some consistent initialization (synchronization) is still needed. This is somewhat necessary, since we also show that Byzantine consensus cannot be solved with probabilistic termination if the system is semi-synchronous or asynchronous and the participants do not know n and f. This implies that it is impossible to build blockchain systems for solving agreement problems in asynchronous networks when n and f are not known.
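The observation above can be checked exhaustively for small systems. The following sketch (our own illustration; the function name and parameter ranges are assumptions, not from the paper) verifies that whenever n > 3f and all g = n − f correct nodes broadcast, the f_v ≤ f Byzantine messages a node v sees are fewer than a third of the n_v = g + f_v messages it receives:

```python
# Illustration (ours, not the paper's): with n > 3f, a correct node v that
# hears all g = n - f correct nodes sees n_v = g + f_v messages in a round,
# of which f_v <= f come from Byzantine nodes; we check that f_v < n_v / 3.
def byzantine_minority(g: int, f_v: int) -> bool:
    n_v = g + f_v
    return 3 * f_v < n_v  # f_v < n_v / 3, stated without floating point

ok = all(
    byzantine_minority(n - f, f_v)
    for f in range(15)
    for n in range(3 * f + 1, 3 * f + 30)
    for f_v in range(f + 1)  # v may hear any subset of the f Byzantine nodes
)
print(ok)  # True
```

The check reduces to 2f_v < g, which holds because g > 2f ≥ 2f_v whenever n > 3f.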
IV. MODEL

The system consists of n nodes, out of which at most f are faulty. The faulty nodes can behave in any way whatsoever, also known as Byzantine behavior. We call the non-Byzantine nodes correct. The nodes have unique identifiers, which are not necessarily consecutive. Each node knows only its identifier at initialization, apart from a possible input, and does not know any global information like n or f. The system is synchronous and the computation proceeds in rounds. In each round, every node receives the messages that were sent to it in the previous round, does some local computation, and then sends messages to the other nodes to be consumed in the following round. A correct node can broadcast a message to all the nodes or send a message to a specific node that sent a message to it before. The identifier of a node is included in the message it sends, so the receiver of the message can determine its sender. Thus, a Byzantine node cannot forge its identifier when communicating directly. However, it can help other Byzantine nodes to do so indirectly by claiming to have received messages from other, possibly non-existent, nodes. Byzantine nodes can send duplicate messages across rounds, but duplicate messages from the same node in a round are simply discarded.

Note that the only way for a correct node to know about the existence of another node is to receive a message from that node. A Byzantine node may get itself known to only a subset of nodes; however, it can behave as if it already knows all the nodes without having received messages from them. In the rest of the paper, we will sometimes refer to the above model as the id-only model for brevity. We give the following algorithms in the id-only model for n > 3f: reliable broadcast in Section V, rotor-coordinator in Section VI, consensus in Section VII, and approximate agreement in Section VIII.
In Section IX, we show that the synchrony assumption is necessary to solve agreement with probabilistic termination when n and f are unknown. In Section X, we give a parallel version of the consensus algorithm, where several consensus instances can be run in parallel; however, the nodes do not initially agree on the instances to start. In Section XI, we build on the parallel consensus to give algorithms for achieving approximate agreement and total ordering of events in a dynamic network. In Section XII, we discuss the results and some further interesting questions.
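To make the round structure concrete, here is a minimal, hypothetical simulation of the id-only model in Python (our own scaffolding, not part of the paper): each node initially knows only its own, not necessarily consecutive, identifier, and a message broadcast in one round is delivered at the start of the next.

```python
# A minimal sketch (our own scaffolding) of the id-only synchronous model:
# nodes learn of each other only by receiving messages, and a message sent
# in round r is delivered at the start of round r + 1. Byzantine behavior
# could be modeled by subclassing Node.
class Node:
    def __init__(self, ident):
        self.ident = ident   # the only information available at start-up
        self.known = set()   # identifiers heard from so far

    def step(self, inbox):
        """Consume last round's messages; return the message to broadcast."""
        self.known.update(sender for sender, _ in inbox)
        return ("present", None)  # every correct node speaks in every round

def run_rounds(nodes, rounds):
    inbox = {v.ident: [] for v in nodes}
    for _ in range(rounds):
        sent = {v.ident: v.step(inbox[v.ident]) for v in nodes}
        # synchronous delivery: everything broadcast now arrives next round
        inbox = {v.ident: list(sent.items()) for v in nodes}
    return nodes

nodes = run_rounds([Node(i) for i in (3, 17, 42)], rounds=2)
print(sorted(nodes[0].known))  # [3, 17, 42]
```

After two rounds every node has heard every identifier, which is exactly the state the algorithms below rely on when they count n_v.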
V. RELIABLE BROADCAST

Reliable broadcast [28] is an abstraction to deal with the messages sent by the Byzantine nodes. The idea is to enforce that a Byzantine node cannot send contradictory information to different nodes. It can still send around false information, but the abstraction ensures that the same false information is seen by all the correct nodes. Concretely, let s be a designated node that may or may not be correct, and let (m, s) be a message broadcast by s. The message (m, s) is reliably broadcast when the following three properties are satisfied.
1) Correctness: If s is correct, then each correct node accepts (m, s).
2) Unforgeability: If a correct node accepts a message (m, s) and s is a correct node, then the message (m, s) was broadcast or sent to all the nodes by the node s.
3) Relay: If a correct node accepts the message (m, s) in a round r, then each correct node accepts the message (m, s) by the round r + 1.

Algorithm 1 gives an algorithm for a node v to reliably broadcast a message (m, s) sent by a node s in the first round. Note that in Line 10, the value n_v is not the number of messages received in the round r, but the number of nodes that sent at least one message to v until the current round r. Also, the algorithm does not terminate, as the idea is to use this mechanism as a subroutine in another algorithm that implements its own termination mechanism, as we will see for consensus, where a few additional messages per round are used to detect termination. In the following lemmas, we show that the algorithm satisfies the three properties of the reliable broadcast. We will again assume that n > 3f.

Lemma 1. If n > 3f, then Algorithm 1 satisfies the correctness property of the reliable broadcast.

Proof. If the node s is correct, it sends the message (m, s) to all the nodes during the initial broadcast (Line 2). Every good node receives the message and broadcasts echo(m, s) in the next round (Line 7). Let g be the number of good nodes.
Then, in the third round, every correct node receives g echo(m, s) messages. Moreover, the value of n_v in the third round satisfies n_v ≤ n, as n is the maximum number of nodes that can send a message to v. As n > 3f, we have g > 2f, or 3g > 2(f + g) = 2n. Thus, we have g > 2n/3 ≥ 2n_v/3. Therefore, every correct node accepts the message in the third round (Line 17).

Algorithm 1 Reliable broadcast algorithm for a node v to broadcast a message (m, s) sent by a node s in the first round. Each iteration of the loop is a single round.
 1: if v = s then                                  ▷ Round 1
 2:     Broadcast (m, s).
 3: else
 4:     Broadcast present.
 5: end if
 6: if Received (m, s) from s then                 ▷ Round 2
 7:     Broadcast echo(m, s).
 8: end if
 9: for r ← 3 to ∞ do                              ▷ Rounds 3 to ∞
10:     Let n_v be the number of nodes that sent at least one message to v until the round r.
11:     if Received at least n_v/3 echo(m, s) messages
12:             and not accepted (m, s) already then
13:         Broadcast echo(m, s).
14:     end if
15:     if Received at least 2n_v/3 echo(m, s) messages
16:             and not accepted (m, s) already then
17:         Accept (m, s).
18:     end if
19: end for

Lemma 2. If n > 3f and a correct node v receives at least n_v/3 copies of a message m from distinct nodes in a round r, then at least one of those messages was sent by a correct node.

Proof. Let f″_v be the number of faulty nodes that sent m to v in the round r. Since every correct node transmits a message in the first round (Lines 2 and 4), we have n_v ≥ g, where g is the number of good nodes. So, we can write n_v = g + f′_v, where f′_v is the number of faulty nodes that sent at least one message to v until the round r. Using f″_v ≤ f′_v and n_v = g + f′_v, the number of correct nodes G that sent m to v in the round r is at least n_v/3 − f″_v ≥ (g − 2f′_v)/3. As g > 2f, we have G > (2f − 2f′_v)/3 ≥ 0, as f ≥ f′_v, so G ≥ 1. So, at least one correct node sent the message m to v in the round r.

Lemma 3. If n > 3f, then Algorithm 1 satisfies the unforgeability property of the reliable broadcast.

Proof. We need to show that if a correct node accepts a message (m, s) and s is a correct node, then the message (m, s) was broadcast by s.
If a message (m, s) was accepted by a correct node v in a round r_a, then v received at least 2n_v/3 ≥ n_v/3 echo(m, s) messages in the round r_a. Using Lemma 2, at least one of the echo(m, s) messages received by v in the round r_a was sent by a correct node.

Let r_f be the first round when an echo(m, s) message was sent by a correct node u. In the round r_f, the node u either received at least n_u/3 echo(m, s) messages or received the message (m, s) from s (Lines 13 or 7). If u received at least n_u/3 echo(m, s) messages, then using Lemma 2, there is at least one correct node that sent an echo(m, s) message in the previous round. Since r_f is the first round when a correct node sends an echo(m, s) message, the node u must have received the message (m, s) from s in the round r_f. Thus, node s indeed sent the message (m, s). As s is correct, the message (m, s) was broadcast to all the nodes in the first round.

Lemma 4. If n > 3f and a correct node v receives at least 2n_v/3 copies of a message m in a round r, then every correct node u receives at least n_u/3 copies of m in the round r.

Proof. As v receives at least 2n_v/3 messages, at least 2n_v/3 − f″_v of them were sent by the correct nodes, where f″_v is the number of messages received by v from the faulty nodes in the round r. Let f′_v be the number of faulty nodes from which v received at least one message until the round r. Then, we have 2n_v/3 − f″_v = 2(g + f′_v)/3 − f″_v, where g is the number of good nodes. As f″_v ≤ f′_v and f′_v ≤ f by definition, we have 2(g + f′_v)/3 − f″_v ≥ (2g − f)/3. Using n > 3f, i.e., g > 2f, we have (2g − f)/3 = (g + (g − f))/3 > (g + f)/3.
Thus, at least (g + f)/3 correct nodes broadcast the message m, and every correct node receives at least (g + f)/3 copies of m in the round r. For a correct node u, we have (g + f)/3 ≥ (g + f_u)/3 = n_u/3, where f_u is the number of faulty nodes from which u has received at least one message until the round r.

Lemma 5. If n > 3f, then Algorithm 1 satisfies the relay property of the reliable broadcast.

Proof. Let r be the first round in which a correct node v accepts the message (m, s). Then, we show that every correct node accepts the message (m, s) by the round r + 1.

As v accepts the message (m, s) in round r, it received at least 2n_v/3 echo(m, s) messages. Using Lemma 4, each correct node u receives at least n_u/3 echo(m, s) messages in the round r. So, every correct node broadcasts an echo(m, s) message in the round r (Line 13), and each one of them receives g echo(m, s) messages in the round r + 1. As g > 2f, we have 3g > 2(f + g) = 2n. Thus, we have g > 2n/3 ≥ 2n_u/3 for every correct node u. Therefore, every correct node accepts the message (m, s) in the round r + 1.

Using Lemma 1, Lemma 3 and Lemma 5, all the properties of the reliable broadcast are satisfied and we have the following theorem.

Theorem 1. If n > 3f, then Algorithm 1 satisfies the properties of the reliable broadcast in the id-only model.
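As a sanity check of the thresholds, the following single-process simulation (our own harness; the paper itself gives only the pseudocode of Algorithm 1) runs the all-correct case: the sender broadcasts in round 1, everyone echoes in round 2, and in round 3 every node has crossed the 2n_v/3 acceptance threshold, matching the proof of Lemma 1.

```python
# Our own harness for the all-correct case of Algorithm 1: round 1 makes
# every node known (heard), round 2 makes every node echo, and from round 3
# each node applies the 2n_v/3 acceptance rule to the echoes delivered from
# the previous round.
def simulate_all_correct(ids):
    heard = {v: set(ids) for v in ids}   # round 1: everyone broadcast something
    echoed = set(ids)                    # round 2: every correct node echoes
    echoes_seen = {v: set() for v in ids}
    accepted = set()
    for _round in (3, 4):                # run a couple of loop iterations
        for v in ids:
            echoes_seen[v] |= echoed     # deliver last round's echoes
            n_v = len(heard[v])
            if 3 * len(echoes_seen[v]) >= 2 * n_v:   # the 2n_v/3 rule
                accepted.add(v)
    return accepted

ids = {4, 9, 11, 25}
print(simulate_all_correct(ids) == ids)  # True
```

With Byzantine senders the intermediate n_v/3 re-echo rule becomes essential; this sketch only exercises the correctness property.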
VI. ROTOR-COORDINATOR
The purpose of the rotor-coordinator is to have a common coordinator node in each round, where the coordinator node is trusted by everyone in that round. After f + 1 different coordinators are selected, everyone is sure that at least one of those f + 1 selected coordinators was correct, since there are at most f faulty nodes. Algorithm 2 gives the algorithm for selecting a set of f + 1 different coordinators, each one in a separate round.

Algorithm 2
Rotor-coordinator algorithm for a node v. The sets C_v and S_v are used by v to store process identifiers. The set C_v is ordered by the process identifiers in increasing order. We use |C_v| for the size of C_v and C_v[i] for its i-th member, where i ≥ 0. The set B_v holds messages before they are broadcast by v at the end of a round. Note that each iteration of the loop is a single round.
 1: C_v ← ∅                              ▷ Set of candidate coordinators
 2: S_v ← ∅                              ▷ Set of selected coordinators
 3: Broadcast init.                      ▷ Round 1
 4: Broadcast echo(p) if received init from p.   ▷ Round 2
 5: for r ← 0 to ∞ do                    ▷ Rounds 3 up to termination
 6:     B_v ← ∅                          ▷ B_v is broadcast at the round's end
 7:     Let n_v be the number of nodes that sent at least one message to v until the round r.
 8:     if Received at least n_v/3 echo(p)
 9:             and p ∉ C_v then
10:         B_v ← B_v ∪ {echo(p)}
11:     end if
12:     if Received at least 2n_v/3 echo(p)
13:             and p ∉ C_v then
14:         C_v ← C_v ∪ {p}
15:     end if
16:     p ← C_v[r mod |C_v|]             ▷ p is the next coordinator
17:     Let p′ be the coordinator selected in the previous round.
18:     if Received opinion(x) from p′ then
19:         Accept x as the coordinator's opinion.
20:     end if
21:     if p ∈ S_v then
22:         break
23:     end if
24:     S_v ← S_v ∪ {p}
25:     if v = p then                    ▷ if v itself is the coordinator
26:         Let o_v be v's current opinion.
27:         B_v ← B_v ∪ {opinion(o_v)}
28:     end if
29:     Broadcast B_v if it is non-empty.
30: end for

The idea is that every correct node broadcasts its willingness to become a coordinator initially, when the faulty nodes may or may not participate (Line 3). Every correct node v keeps a set of candidate coordinators C_v, which it updates in a reliable broadcast fashion (Lines 10 and 14). In each round, a correct node v selects the coordinator with the next larger identifier, say p, from the set C_v and adds it to the set of selected coordinators S_v (Lines 16 and 24). The node v accepts the opinion from p in the next round as the coordinator's opinion (Line 19) and broadcasts its own opinion as the coordinator's opinion in case v was selected as the coordinator from the set C_v (Line 27).
The node v terminates when it reselects the same node as the coordinator (Line 22). The hope is that by the time a correct node terminates, it has already witnessed a round in which every correct node accepts the opinion of a common and correct coordinator. We start by observing that if a correct node adds p to its set of candidate coordinators C_v, then any other correct node u adds p to its set C_u as well by the next round.

Lemma 6.
If a correct node v adds p to the set C_v in a round r, then any correct node u ≠ v adds p to the set C_u by the round r + 1.

Proof. The set B_v is emptied at the beginning of every round r and is broadcast at the end of the round r. Thus, the algorithm for adding a process identifier p to C_v is the same as that of accepting a message (m, s) in Algorithm 1 if (m, s) = p. So, the lemma follows using Lemma 5 for the relay property of the reliable broadcast.

We call a round a good round if the same node p was selected as a coordinator by every correct node and the node p is correct. In the following, we show that every correct node witnesses a good round before it terminates, if n > 3f. We will call a round a silent round if the set C_v remains unchanged for every correct node v, i.e., no correct node executes Line 14 in that round. A non-silent round is a round that is not silent. We observe that in a silent round, the value of C_v is identical for every correct node v. If it were not, then there is a silent round between a correct node v adding an identifier p to its set C_v and another correct node u ≠ v adding p to its set C_u. This contradicts Lemma 6. The assumption n > 3f is used for reliable broadcast and also to ensure a good round. With a stronger assumption, a good round is easily ensured, but n > 3f suffices with a careful argument, as follows.

Lemma 7. If n > 3f, then every correct node witnesses at least one good round until it terminates.

Proof. Assume for contradiction that a node v terminates in the round with r = r_t without witnessing a good round. Consider a round with r = r_c ≤ r_t. Let F_v ⊆ C_v and G_v ⊆ C_v, respectively, be the set of faulty node identifiers and the set of good or correct node identifiers in C_v when the coordinator node is selected in the round r_c (Line 16). Thus, we have |C_v| = |F_v| + |G_v|.

Using Lemma 1, all the correct node identifiers are added to C_v, even before the first coordinator node is selected.
So, we have |G_v| = g = n − f and |C_v| = |F_v| + n − f. Using n > 3f, we get |C_v| > |F_v| + 2f. Say that there is no correct node u that added a faulty identifier to its set C_u in the round with r = 0. Then, every correct node selects a common coordinator from the set G_v, and v witnesses a good round before termination, a contradiction. Thus, there is a correct node u that adds a faulty identifier to its set C_u in the round with r = 0. For every non-silent round afterwards, at least one faulty node identifier is added to the set C_u of some correct node u. Using Lemma 6, if a faulty node identifier p is added to C_u, every correct node w ≠ u adds p to C_w by the next round. Thus, we have f ≥ n_ns, where n_ns is the number of non-silent rounds prior to the round r_c and starting from the round r = 0. Therefore, we have |C_v| > |F_v| + n_ns.

Moreover, until the round r_c, node v has neither witnessed a good round, nor has it selected the same node again as a coordinator, by our assumption. So, in all the silent rounds prior to the round r_c, a unique faulty node was selected as a coordinator by v. Therefore, if n_s is the number of silent rounds prior to the round r_c, then |F_v| ≥ n_s, since v selects a node as a coordinator only after adding it to the set C_v. So, we have |C_v| > n_s + n_ns.

Since r starts from 0, we have n_s + n_ns = r_c. So, we have |C_v| > r_c and r_c mod |C_v| = r_c. Since the above inequality is true for every round r_c ≤ r_t, a node that was already selected as a coordinator is in the set {C_v[r mod |C_v|] : r < r_c}. Therefore, for selecting the same identifier as a coordinator again, it must be that r > |C_v| > r_c, a contradiction.

Theorem 2. If n > 3f, then every correct node terminates in O(n) rounds and there is a round in which every correct node accepts the opinion of a common and correct coordinator node.

Proof.
As a node terminates as soon as it reselects the same node as a coordinator and there are n nodes in total, the node terminates in at most n rounds. Using Lemma 7, the node also witnesses a good round before termination and accepts the corresponding opinion in the next round (Line 19).
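The selection rule of Algorithm 2 at a single correct node can be sketched as follows (a simplified, hypothetical driver: the reliable-broadcast machinery is replaced by a fixed schedule saying when each candidate identifier enters C_v). The node walks through the sorted candidate set in round-robin fashion and stops when it selects the same identifier twice, by which time every candidate, including late-appearing faulty ones, has been selected once.

```python
# A simplified driver (ours) for the selection rule of Algorithm 2 at one
# correct node: round r picks C_v[r mod |C_v|] from the sorted candidate
# set, and the node terminates on the first repeated selection.
def run_selection(candidate_arrivals, max_rounds=100):
    """candidate_arrivals[r] = identifiers entering C_v at round r."""
    C_v, S_v, schedule = [], set(), []
    for r in range(max_rounds):
        C_v = sorted(set(C_v) | set(candidate_arrivals.get(r, ())))
        p = C_v[r % len(C_v)]       # the next coordinator
        if p in S_v:
            return schedule         # terminate: p selected a second time
        S_v.add(p)
        schedule.append(p)

# Three correct candidates known from round 0; identifier 99 surfaces later.
print(run_selection({0: [5, 12, 30], 3: [99]}))  # [5, 12, 30, 99]
```

Note how the late arrival of 99 stretches the rotation so that it, too, gets a turn before any identifier repeats, which is the mechanism behind Lemma 7.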
VII. CONSENSUS

In this section, we give an O(f) round consensus algorithm in the id-only model, where f is the number of faulty Byzantine nodes in the system. Algorithm 3 gives an algorithm based on [7]. Every correct node v has an input x_v, which is a real number. Again, every correct node has to output a common correct value. If the inputs are all the same, then the output must be that value. We consider real number inputs here, unlike the binary inputs in the consensus definition of Section I, since we use the algorithm later for ordering events in a system, which can be non-binary.

In the following, we prove the correctness of Algorithm 3. We refer to an iteration of the loop as a phase.

Lemma 8. If x_v = x for every correct node at the start of the phase, all the nodes terminate with the output x at the end of the phase.

Proof. Every correct node broadcasts input(x) at the start of the phase. So, every correct node v receives g input(x) messages. As n > 3f, we have g > 2f. Thus, we have 3g > 2(f + g), or g > 2n/3 ≥ 2n_v/3. So, all the correct nodes broadcast prefer(x) (Line 6). Every correct node v receives g ≥ 2n_v/3 prefer(x) messages,

Algorithm 3 An O(f) round consensus algorithm in the id-only model. To initialize the rotor-coordinator in Line 1, run the first four lines of Algorithm 2. To initialize n_v in Line 2, collect the identifiers from which a message has been received, and count them. Later, a node only accepts messages from a node if it counted towards n_v during the initialization, and discards the messages from the other nodes. If a node u receives a message from another node v during initialization but not later inside the loop, then u assumes that v sent the same message as sent by u in the previous round. 'Next Round' is abbreviated as N.R.
 1: Initialize rotor-coordinator.        ▷ Rounds 1 and 2
 2: Initialize n_v.
 3: while true do
 4:     Broadcast input(x_v).            ▷ N.R.
 5:     if Received at least 2n_v/3 input(x) then   ▷ N.R.
 6:         Broadcast prefer(x).
 7:     end if
 8:     if Received at least n_v/3 prefer(x) then   ▷ N.R.
 9:         x_v ← x
10:     end if
11:     if Received at least 2n_v/3 prefer(x) then
12:         Broadcast strongprefer(x).
13:     end if
14:     Execute a round of rotor-coordinator using x_v as v's current opinion.
15:     Let c be the value accepted as the coordinator's opinion.   ▷ N.R.
16:     if Received less than n_v/3 strongprefer(x) then   ▷ N.R.
17:         x_v ← c
18:     end if
19:     if Received at least 2n_v/3 strongprefer(x) then
20:         Terminate and output x.
21:     end if
22: end while

keeps their opinion as x (Line 9), and broadcasts strongprefer(x) (Line 12). Again, each correct node v receives g ≥ 2n_v/3 strongprefer(x) messages and terminates with the output x (Line 20).

Lemma 9.
If a correct node u receives 2n_u/3 copies of a message m and a correct node v receives 2n_v/3 copies of a message m′ in the same round, then at least one correct node sent both m and m′ in the previous round.

Proof. The number of messages G sent by the good nodes is at least 2n_u/3 − f_u + 2n_v/3 − f_v, where f_u is the number of m messages sent to u by the faulty nodes, and f_v is the number of m′ messages sent to v by the faulty nodes. As n_u = g + f_u and n_v = g + f_v, we have G ≥ 4g/3 − (f_u + f_v)/3. We have g > f_u + f_v, since f_u ≤ f, f_v ≤ f and g > 2f. Thus, we have G > g, and at least one correct node sent both m and m′ in the previous round.

Lemma 10.
If a correct node terminates in a phase, then all the other correct nodes have the same opinion at the end of the phase.

Proof.
Say a correct node v terminates with the output x. Then, it received at least 2n_v/3 strongprefer(x) messages. So, using Lemma 4, every correct node u received at least n_u/3 strongprefer(x) messages, and none of them switches to the coordinator's opinion (Line 17). Moreover, at least one correct node u sent a strongprefer(x) message, using Lemma 2. The node u received 2n_u/3 prefer(x) messages. Using Lemma 4, each correct node w received at least n_w/3 prefer(x) messages. It is not possible that a correct node also received n_w/3 prefer(x′) messages, where x ≠ x′. Indeed, if it were so, then using Lemma 2, there is a correct node s that sent prefer(x) and a correct node t that sent prefer(x′). Thus, node s received 2n_s/3 input(x) messages and node t received 2n_t/3 input(x′) messages. Using Lemma 9, a correct node sent both input(x) and input(x′) messages in the same round, a contradiction. Thus, every correct node changed its opinion to x (Line 9), which remains unchanged until the end of the phase.

Lemma 11.
If the coordinator is correct and none of the correct nodes have terminated, then all the correct nodes have the same opinion by the end of the phase.

Proof.
Consider the first phase when the coordinator is correct. Either every correct node v receives less than n_v/3 strongprefer(x) messages, in which case all the correct nodes adopt the coordinator's opinion, have the same opinion by the end of the phase, and we are done. Otherwise, there is a correct node u that received at least n_u/3 strongprefer(x) messages. Using Lemma 2, there is at least one correct node w that sent a strongprefer(x) message. Thus, node w received at least 2n_w/3 prefer(x) messages. Using Lemma 4, every correct node v received at least n_v/3 prefer(x) messages. As before, it is impossible that a correct node v also receives n_v/3 prefer(x′) messages, where x ≠ x′. So, every correct node, including the coordinator, changes its opinion to x (Line 9). Since the coordinator is correct, it sends the same opinion x to all the nodes. Thus, even if some correct nodes decide to change their opinion to the coordinator's opinion, all the correct nodes still have the same opinion at the end of the phase.

We can now combine the previous lemmas into the following theorem.

Theorem 3.
Algorithm 3 solves consensus in O(f) rounds in the id-only model.

Proof. If the correct nodes have the same input x, then they output x using Lemma 8. Otherwise, one of the following happens within O(f) rounds: either a correct node terminates with an output x or a correct coordinator gets picked. In either case, the correct nodes have the same opinion at the end of the phase, and terminate with the same output by the next round using Lemma 8.

III. APPROXIMATE AGREEMENT
In the approximate agreement problem [13], each correct node takes a real number input and outputs a real number. Let i_min and i_max, respectively, be the minimum and the maximum value that is an input of a correct node. Similarly, let o_min and o_max, respectively, be the minimum and maximum value that is output by a correct node. The values output by the correct nodes must satisfy the following conditions.
1) The value output by each correct node is within the input range [i_min, i_max].
2) The output range [o_min, o_max] is strictly smaller than the input range, i.e., (o_max − o_min) < (i_max − i_min) if i_max ≠ i_min.

Algorithm 4
Approximate agreement algorithm for a node v. The input value of the node is i_v.
  Broadcast i_v to all the nodes (including self).
  Let R_v be the multiset of received values and n_v = |R_v|.
  Discard the ⌊n_v/3⌋ smallest and the ⌊n_v/3⌋ largest values from the multiset R_v to obtain the set S_v.
  Output o_v = (min S_v + max S_v)/2, where min S_v and max S_v are the minimum and maximum value of the set S_v, respectively.

Algorithm 4 solves the problem. The following lemma shows that the algorithm satisfies the first property of approximate agreement, i.e., the output range lies within the input range.

Lemma 12. If n > 3f, then o_v ∈ [i_min, i_max] for every correct node v.

Proof. Let g be the number of correct nodes. Then, the node v receives at least g values from the correct nodes after the first round. Let f_v be the number of values received by v from the Byzantine nodes. Therefore, we have f_v ≤ f, as f is the number of faulty nodes and v receives at most one value from each faulty node in a round. As n = f + g and f_v ≤ f, we can rewrite n > 3f as g + f > 2f_v + f. Thus, we have (g + f_v)/3 > f_v, or ⌊(g + f_v)/3⌋ ≥ f_v as f_v is an integer. As n_v = g + f_v, we have ⌊n_v/3⌋ ≥ f_v.

As there are at most f_v faulty values in the multiset R_v and ⌊n_v/3⌋ ≥ f_v, the minimum value min S_v left after discarding the ⌊n_v/3⌋ smallest values from R_v satisfies min S_v ≥ i_min, where i_min is the minimum value received from a correct node. Using a similar argument, the maximum value max S_v satisfies max S_v ≤ i_max. Therefore, the output o_v, which is the average of min S_v and max S_v, satisfies o_v ∈ [i_min, i_max].

Let i_med be the median of the input values at the correct nodes. In the following lemma, we show that the value i_med is never discarded by a correct node while computing the set S_v.

Lemma 13. If n > 3f, then the value i_med ∈ S_v for every correct node v.

Proof. Let g be the number of correct nodes. Using n > 3f and n = g + f, we get f < g/2.
Using n_v = g + f_v and f_v ≤ f, we get ⌊n_v/3⌋ ≤ n_v/3 = (g + f_v)/3 ≤ (g + f)/3. As f < g/2, we get ⌊n_v/3⌋ < g/2. Therefore, even if all the ⌊n_v/3⌋ discarded smallest values are from the good nodes, strictly less than half of the smallest good values are discarded to obtain the set S_v. Similarly, strictly less than half of the largest good values are discarded to obtain the set S_v. Thus, we have i_med ∈ S_v.

Combining the previous two lemmas, we can state the following theorem.

Theorem 4. If n > 3f, then Algorithm 4 achieves approximate agreement in the id-only model.

Proof. Using Lemma 12, the output range lies within the input range and the first property of approximate agreement is satisfied. Using Lemma 13, we have i_med ∈ S_v. Thus, we have max S_v ≥ i_med and min S_v ≤ i_med. Moreover, using Lemma 12, we also get that min S_v ≥ i_min and max S_v ≤ i_max. Therefore, the average o_v = (min S_v + max S_v)/2 lies within the range [(i_min + i_med)/2, (i_med + i_max)/2]. So, the size of the output range satisfies (o_max − o_min) ≤ (i_max − i_min)/2 < (i_max − i_min) if i_max ≠ i_min.

IX. SYNCHRONY IS NECESSARY
In our work, we have assumed that the system is synchronous. Intuitively, this is a necessary assumption: as a node does not know n and f, it does not know how many messages to wait for before deciding. So, it might end up deciding before receiving a message that was delayed for long, and such a decision might be incorrect. The following lemma proves this for consensus.

Lemma 14.
In an asynchronous system where the number of nodes n and an upper bound f on the number of failures are not known to the nodes, consensus is impossible, even with probabilistic termination.

Proof. Assume a system S in which all the nodes are correct. We partition the set of the nodes into sets A and B. A node v has input 0 if v ∈ A and input 1 if v ∈ B. The messages between A and B are arbitrarily delayed. To a node v ∈ A, this is indistinguishable from a system A where the nodes in B are absent, as v only knows its id initially in both S and A. Similarly, the system S is indistinguishable to a node v ∈ B from a system B where the nodes in A are absent. The nodes in A decide 0 in the system A with a non-zero probability, since they only hear from the nodes with the input 0. Similarly, the nodes in B decide 1 in the system B with a non-zero probability. So, the nodes in the system S decide on different values with a non-zero probability.

Similar problems can happen in a semi-synchronous system [15], where the message delays have a fixed upper bound ∆, but its value is unknown to the nodes. However, the previous argument does not work since we cannot arbitrarily delay the messages due to the existence of the fixed upper bound ∆. Instead, we start with the partitions A and B and inductively build an invalidating execution for a union of them.

Lemma 15.
In a semi-synchronous system, where the message delays have a fixed upper bound ∆ and the nodes do not know the value of ∆, n and f, consensus is impossible, even with probabilistic termination.

Proof. Consider a system 𝒜 where all the nodes have input 0 and the message delays are at most ∆_a. Each node v ∈ 𝒜 decides 0 with non-zero probability. Let E_a be such an execution in 𝒜 of duration T_a. Similarly, consider another system ℬ where all the nodes have input 1 and the message delays are at most ∆_b. Let E_b be an execution in ℬ where all the nodes decide 1 in duration T_b. We consider another system S consisting of |𝒜| + |ℬ| nodes, and set the maximum message delay ∆_s > max(∆_a, T_a, ∆_b, T_b). We partition the set S into a set A of |𝒜| nodes and a set B of |ℬ| nodes. The nodes in A have input 0 whereas the nodes in B have input 1. We also assume some bijective mapping between the sets A and 𝒜 and between the sets B and ℬ. We use a′ to denote the counterpart of a in this bijective map.

We construct an execution E_s from E_a and E_b as follows. If a node a ∈ 𝒜 sends a message to a node b ∈ 𝒜, then a′ ∈ S sends the same message to b′. The message sent in S has the same delay as the message sent in 𝒜. If a node a ∈ 𝒜 broadcasts a message to all the nodes in 𝒜, then a′ ∈ S broadcasts the same message to all the nodes in S. The delays for the broadcast messages are assigned as follows. The delay in S for the messages broadcast to the nodes A ⊂ S is the same as the delay of those messages in 𝒜. The delay in S for the messages broadcast to the nodes B ⊂ S is ∆_s. Similarly, we assign message actions and delays to the nodes B ⊂ S. Inductively, a node a ∈ A ⊂ S makes the same decisions as its counterpart a′ ∈ 𝒜, since both of them do not know the value of n and f, and node a makes the (same) decision before it even hears from a node in B.
Similarly, a node b ∈ B ⊂ S makes the same decisions as its counterpart b′ ∈ ℬ. Therefore, there is an execution E_s in S in which a node a ∈ A decides 0 and a node b ∈ B decides 1, a disagreement.

The above argument essentially means that an agreement protocol designed to work without the knowledge of n and f (such as the Bitcoin blockchain [27]) must either assume synchronous execution for guaranteed agreement or sacrifice agreement with some probability.

X. PARALLEL CONSENSUS
In the consensus problem, each correct node had only one opinion and had to output a single opinion in agreement with the other nodes. Later, when a correct node can submit multiple opinions, we need to agree on every opinion submitted by a correct node. Therefore, we consider the parallel consensus problem: Every correct node v has a set of k_v input pairs (id_v^i, x_v^i) for 1 ≤ i ≤ k_v, where x_v^i is an opinion and id_v^i is the identifier of the input pair. Each correct node outputs a set of pairs subject to the following conditions.
1) Validity: If (id, x) is an input pair of every correct node and x ≠ ⊥, then all the correct nodes must output the pair (id, x).
2) Agreement: If a correct node v outputs a pair (id_v, x_v), then all other correct nodes must output (id_v, x_v) as well.
3) Termination: Every correct node outputs a set of pairs in a finite number of rounds.
Note that the rules allow a pair (id_v, x_v) that is an input of a correct node v, but not of all the correct nodes, to be absent from the output of every correct node.

First, we describe the EarlyConsensus(id) algorithm, where every correct node v has at most one input pair (id, x_v), i.e., not all nodes may be aware of the identifier id. The pseudocode is given in Algorithm 5. To help a node v distinguish whether another node u is unaware of id or merely has no preference or no strong preference for an opinion, we use id:nopreference and id:nostrongpreference messages.

Next, we describe the ParallelConsensus algorithm using the previous one: The node v starts the EarlyConsensus(id_v) algorithm for every pair (id_v, x_v) input at v. If the node v first hears id:input, id:prefer or id:strongprefer respectively during the second, third, and fifth round of the first phase and no input pair corresponding to id was present at v, then the node v also starts the EarlyConsensus(id) algorithm from that round.

Theorem 5.
The ParallelConsensus algorithm satisfies the parallel consensus properties.

Proof.
Consider a pair (id, x_v) that is input at a correct node v, where x_v ≠ ⊥. In the first round of the phase, the node v broadcasts id:input(x_v). So, every correct node hears an id:input message in the second round, and fills the missing messages from the correct nodes with id:input(⊥). In the subsequent rounds, if a correct node u does not receive enough messages to send an id:prefer or an id:strongprefer message, then it sends an id:nopreference or an id:nostrongpreference message, respectively. So, the node v does not fill in a message for u. Therefore, the execution of EarlyConsensus(id) is identical to an execution of Algorithm 3, where the input of a correct node v is x_v if (id, x_v) is an actual input and ⊥ if such a pair is absent. Using Theorem 3, every correct node v outputs a pair (id, o_v) in O(f) rounds, such that it is in agreement with the other correct nodes, and is the same as the input (id, x_v) if it is present at all the correct nodes. Discarding the output pairs of the form (id, ⊥) does not affect the agreement and validity properties required by parallel consensus (Line 26).

Algorithm 5 EarlyConsensus(id) algorithm at node v: The node has at most one input pair (id, x_v). The rotor-coordinator and n_v are initialized as in Algorithm 3. Later, a node only accepts messages from a node if it counted towards n_v during the initialization and discards the messages from the other nodes. The types M = {id:input, id:prefer, id:strongprefer} of received messages are counted as follows. If a message of type m ∈ M is received for the first time during the second phase, then it is discarded (considered as not received). If a message of type m ∈ M is received for the first time during the first phase, then the message m(⊥) is substituted for every node u that counted towards n_v during initialization but did not send a type m message. If a node v has already received a type m ∈ M message during the first phase and a node u that counted towards n_v does not send a type m′ ∈ M message in a subsequent round, then for every such node u, the node v substitutes the message of type m′ that it sent most recently. 'Next Round' is abbreviated as N.R.

  Initialize rotor-coordinator. ⊲ Rounds 1 and 2
  Initialize n_v.
  while true do
    if Input pair (id, x_v) present and x_v ≠ ⊥ then
      Broadcast id:input(x_v). ⊲ N.R.
    end if
    if Received at least 2n_v/3 id:input(x_v) then ⊲ N.R.
      Broadcast id:prefer(x_v).
    else
      Broadcast id:nopreference.
    end if
    if Received at least n_v/3 id:prefer(x) then ⊲ N.R.
      id:x_v = x
    end if
    if Received at least 2n_v/3 id:prefer(x) then
      Broadcast id:strongprefer(x).
    else
      Broadcast id:nostrongpreference.
    end if
    Execute a round of rotor-coordinator using x_v as v's current opinion. Let c be the value accepted as the coordinator's opinion. ⊲ N.R.
    if Received less than n_v/3 id:strongprefer(x) then ⊲ N.R.
      id:x_v = c
    end if
    if Received at least 2n_v/3 id:strongprefer(x) then
      Terminate and output (id, x) if x ≠ ⊥.
    end if
  end while

Now, consider that no correct node has an input pair with the identifier id. If we show that no correct node outputs a pair with the identifier id, then we are done. Let r be the first round when a correct node v receives an id message. If r is in the second phase, or is the fourth round (rotor-coordinator) of the first phase, then v simply discards it. Otherwise, the round r can be the second (Line 7), the third (Lines 12 and 15) or the fifth one (Lines 22 and 25) of the first phase. First, consider that r is the second round of the first phase and a correct node v first received the id:input message during round r. Since no other correct node u had an id pair as input, node v fills a default id:input(⊥) for every correct node u ≠ v and decides to broadcast id:prefer(⊥).
Similarly, any other correct node w ≠ v that first received the id:input message in the round r broadcasts id:prefer(⊥). In the next round, every correct node receives an id:prefer(⊥) message. If a correct node heard an id:prefer message for the first time, then it fills a default id:prefer(⊥) for every node u that did not send a message to it. If a correct node p already heard an id message, then we know that it sent id:prefer(⊥) in the previous round and fills the same for the missing messages. Thus, every correct node p receives at least 2n_p/3 id:prefer(⊥) messages, sets id:x_p = ⊥ and broadcasts id:strongprefer(⊥). So, every correct node p receives at least 2n_p/3 id:strongprefer(⊥) messages in the next round and terminates, but does not output an id pair since ⊥ is the associated opinion.

Now, consider that r is the third round of the first phase and a correct node v first hears an id:prefer message in the round r. The node v fills a default id:prefer(⊥) opinion for every correct node u, sets id:x_v = ⊥ and broadcasts id:strongprefer(⊥). In the next round, every correct node hears an id:strongprefer(⊥) message. If a correct node w hears id:strongprefer(⊥) for the first time, it fills the missing messages with the default id:strongprefer(⊥) message. If not, the node w fills the missing messages with what it sent previously, which is again id:strongprefer(⊥). Thus, every correct node w receives at least 2n_w/3 id:strongprefer(⊥) messages and does not output any id pair.

Lastly, consider that r is the fifth round of the first phase and a correct node v first hears an id:strongprefer message in the round r. By assumption, no correct node received an id message before the round r, so no correct node sends an id message before round r. So, the node v fills the default id:strongprefer(⊥) message for every correct node u.
Consequently, the node v receives at least 2n_v/3 id:strongprefer(⊥) messages and does not output an id pair.

XI. APPLICATION TO DYNAMIC NETWORKS
In this section, we see how the protocols that we developed can be applied to networks where the participants enter or leave the system, subject to the constraint that n > 3f.

First, we look into the approximate agreement problem. We use Algorithm 4 in the dynamic setting as well. It is easy to observe that Lemmas 12 and 13 apply even if the participants enter and leave the system in every round, subject to n > 3f. So, the range of the correct values still gets halved in every round with respect to the previous round. However, new nodes entering the system might also increase the range of values at the correct nodes. So, whether the range decreases or increases over time depends on the actual inputs of the nodes entering or leaving the system.

Next, we consider the problem of total ordering of events in a dynamic system. We can run the parallel consensus algorithm in every round to agree on the events that occurred during that round. We just need to make sure that the set of identifiers used for every parallel consensus instance remains consistent. To do that, we have to specify some more details about the model. The adversary can decide the number of nodes that can join the network before every round starts, subject to the constraint that n > 3f remains true when the round starts. Once a node joins the network, it can broadcast to all the nodes that have joined but not left already. A node leaves the network by announcing so to all the participants. A correct node decides itself when to leave. The adversary decides when a faulty node leaves the network. Algorithm 6 lists the pseudocode.

Algorithm 6
Algorithm at a node v to order events in a dynamic network. Initially, the round r is initialized to 0 and S = {v}. Since there could be multiple parallel consensus instances running at the same time, we identify them by the round in which they start by appending the round number to the messages. Also, running a parallel consensus instance with respect to S means recording the value of S at the start of the instance, and only accepting the messages from the node identifiers in S, discarding the rest.

  if v wants to participate then
    Broadcast present. ⊲ Next Round
    Let A_v be the multiset of (ack, t) messages received by v in the next round, where t ≥ 1.
    Initialize r = t + 1, where (ack, t) is the majority in A_v.
    Initialize S to the identifiers which sent a message in A_v.
  end if
  while true do
    r ← r + 1
    I_v^r ← {}
    if Received present from u then
      S ← S ∪ {u}
      Send (ack, r) to u. ⊲ Next Round
    end if
    if v wants to stop participating then
      Broadcast absent. ⊲ Next Round
      Wait and participate in the outstanding parallel consensus instances until termination.
    end if
    if Received absent from u then
      S ← S \ {u}
    end if
    if v witnesses an event m ≠ ⊥ then
      Broadcast (m, r). ⊲ Next Round
    end if
    if Received (m, r − 1) from u then
      I_v^r ← I_v^r ∪ {(u, m)}
    end if
    Start a parallel consensus instance r with the input pairs I_v^r with respect to the set S. ⊲ Next Round
    A round r′ < r is final if r − r′ > 5|S_v^{r′}|/2.
    Let R be the largest round such that all the rounds at most R are final.
    Order the outputs of the consensus instances with identifiers at most R in the order of increasing identifiers, breaking ties arbitrarily.
  end while

In the following, we show that the nodes agree on the sequences that they output in Line 30. Let T_v^r be the sequence output by a correct node v at the end of round r (Line 30). Our goal is that T_v^r satisfies the following two agreement properties.
1) Chain-prefix: For any pair of correct nodes u, v, either T_u^r is a prefix of T_v^r or T_v^r is a prefix of T_u^r.
2) Chain-growth: For every correct node v, events are appended to T_v^r over time, if a correct node submits an event in every round.

Theorem 6.
Algorithm 6 outputs a chain of events that satisfies the chain-prefix and chain-growth properties.

Proof.
Initially, the node v stores the correct round number. By assumption, we have n > 3f in every round. Then, by induction on the rounds, selecting the round number based on the majority of the received ack messages always returns the correct round number for every correct node. Therefore, every correct node that starts a parallel consensus instance in a round r tags it with the same identifier r. Each of these instances is then correct using Theorem 5.

Consider a round r′ that is final with respect to v. Since each phase of Algorithm 5 is five rounds and the initialization is two rounds, the parallel consensus instance r′ terminates by r′ + 5f_{r′} + 2 rounds using Theorem 5, where f_{r′} is the number of faulty nodes in the round r′. Let g_{r′} be the number of good nodes in the round r′ and n_{r′} be the total number of nodes in the round r′. Since we have n_{r′} > 3f_{r′} by assumption, we have |S_v^{r′}| ≥ g_{r′} > 2f_{r′}. Since r′ is final, the current round r > r′ + 5|S_v^{r′}|/2 > r′ + 5f_{r′} + 2. So, the parallel consensus instance r′ has terminated by the previous round and no further output from the consensus instance r′ is produced. Moreover, using Theorem 5, any other correct node u ≠ v has also accepted the same output pairs corresponding to the consensus instance r′. Also, the node u has not accepted any other output pairs corresponding to the consensus instance r′, as that would contradict the agreement property of parallel consensus. Let R_u and R_v respectively be the value of R computed in Line 29 by the nodes u and v. Then, the rounds up to R_min = min{R_u, R_v} are final for both the nodes u and v. Thus, the outputs of the consensus instances up to R_min form the common prefix of T_u^r and T_v^r, which is the chain-prefix property.

Since the parallel consensus instance r′ terminates in O(f_{r′}) rounds, the earliest non-final round eventually becomes final and the chain-growth property is satisfied as well.

XII. DISCUSSION
In this paper, we investigated distributed systems where the participants are aware of neither the size n nor a safe estimate f of the number of Byzantine failures. We examined fundamental distributed computing problems such as approximate agreement, reliable broadcast, rotor-coordinator and consensus, concluding that all of them can be solved with the optimal resiliency of n > 3f. Each of these algorithms illustrated a different method of computing. It is interesting to note that "replacing" f by n_v/3 works in these algorithms although n_v/3 is an incorrect upper bound on the number of failures. An algorithm using a combination of some of the discussed primitives could be "compiled" to work without the knowledge of n and f, keeping resiliency unaffected. We evaluated resiliency in this work, but other metrics such as message complexity, round complexity, etc. do not change much either. For example, the message complexity of reliable broadcast is unaffected compared to the original algorithm, the convergence rate of the approximate agreement algorithm remains unchanged, and the O(f) round complexity of the consensus algorithm is optimal [16].

Removing the knowledge of n and f from the participants has other benefits too. For example, we show in Section XI that the design of agreement algorithms for dynamic networks becomes much easier and the nodes do not need to agree on the number of participants in the network. It also opens up ways to achieve agreement in networks without using information from every node. For example, consider a set of nodes that are already in approximate agreement with each other and a new node joins. Then, the new node can execute Algorithm 4 only with a subset of the nodes to get closer to the value of most of the nodes. Self-stabilizing algorithms may not need to restore the value of n and f.

It is unclear if the resiliency of rotor-coordinator is optimal, a question left for further work.
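The substitution discussed above, replacing the unknown f by the locally computable ⌊n_v/3⌋, is easiest to see in Algorithm 4. The following Python sketch (not from the paper; message delivery is abstracted as the list of values a node heard in a round) shows one trimming round, together with a brute-force check of the counting argument of Lemma 12.

```python
# Sketch of one round of Algorithm 4: each node v trims the floor(n_v/3)
# smallest and largest received values and outputs the midpoint of the rest.

def approx_round(received):
    """Next value of node v, given the list of values it received."""
    n_v = len(received)
    t = n_v // 3                      # local trim count, replacing the unknown f
    s_v = sorted(received)[t:n_v - t] # the set S_v from Algorithm 4
    return (s_v[0] + s_v[-1]) / 2     # o_v = (min S_v + max S_v) / 2

# Brute-force check of Lemma 12's counting step: with g correct senders,
# f_v <= f faulty senders and n = g + f > 3f, we get floor(n_v/3) >= f_v,
# so the trim always covers every faulty value a node actually heard.
for g in range(1, 12):
    for f in range((g - 1) // 2 + 1):     # g > 2f, i.e. n > 3f
        for f_v in range(f + 1):          # v hears at most f faulty values
            assert (g + f_v) // 3 >= f_v

# Example: g = 5 correct inputs, one Byzantine outlier (n = 6 > 3f with f = 1).
out = approx_round([0.0, 1.0, 2.0, 3.0, 4.0, -100.0])
assert 0.0 <= out <= 4.0                  # Lemma 12: output in the input range
```

The outlier -100.0 falls inside the trimmed prefix, so the midpoint stays within the correct input range, which is the point of the ⌊n_v/3⌋ substitution.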
Also, one could examine whether these techniques could benefit semi-synchronous or asynchronous dynamic systems where the rate of change of n is controlled, since without any knowledge about n or f, guaranteed agreement is impossible in such systems.

XIII. ACKNOWLEDGMENTS
We would like to thank Christoph Lenzen for the discussions, reading the draft and suggesting improvements.

REFERENCES

[1] Yehuda Afek, James Aspnes, Edo Cohen, and Danny Vainstein. Brief Announcement: Object Oriented Consensus. In Symposium on Principles of Distributed Computing (PODC), Washington, D.C., July 2017.
[2] Eduardo A. P. Alchieri, Alysson Neves Bessani, Joni da Silva Fraga, and Fabíola Greve. Byzantine Consensus with Unknown Participants. In International Conference On Principles Of Distributed Systems (OPODIS), Luxor, Egypt, December 2008.
[3] James Aspnes. Notes on Theory of Distributed Systems. Chapter 10, February 2018.
[4] Hagit Attiya and Jennifer Welch. Distributed Computing: Fundamentals, Simulations, and Advanced Topics, chapter 5. John Wiley & Sons, 2004.
[5] Nazreen Banu, Samia Souissi, Taisuke Izumi, and Koichi Wada. An Improved Byzantine Agreement Algorithm for Synchronous Systems with Mobile Faults. International Journal of Computer Applications, 2012.
[6] Piotr Berman, Juan A. Garay, and Kenneth J. Perry. Towards Optimal Distributed Consensus. In Symposium on Foundations of Computer Science (FOCS), October 1989.
[7] Piotr Berman, Juan A. Garay, and Kenneth J. Perry. Optimal Early Stopping in Distributed Consensus. In International Workshop on Distributed Algorithms (WDAG), November 1992.
[8] Silvia Bonomi, Antonella Del Pozzo, Maria Potop-Butucaru, and Sébastien Tixeuil. Approximate Agreement under Mobile Byzantine Faults. In International Conference on Distributed Computing Systems (ICDCS), June 2016.
[9] David Cavin, Yoav Sasson, and André Schiper. Consensus with Unknown Participants or Fundamental Self-Organization. In International Conference on Ad-Hoc Networks and Wireless (ADHOC-NOW), Vancouver, BC, Canada, July 2004.
[10] Bernadette Charron-Bost, Matthias Függer, and Thomas Nowak. Approximate Consensus in Highly Dynamic Networks: The Role of Averaging Algorithms. In International Colloquium on Automata, Languages, and Programming (ICALP), July 2015.
[11] Carole Delporte-Gallet, Hugues Fauconnier, Rachid Guerraoui, Anne-Marie Kermarrec, Eric Ruppert, and Hung Tran-The. Byzantine Agreement with Homonyms. In Symposium on Principles of Distributed Computing (PODC), June 2011.
[12] Danny Dolev, Keijo Heljanko, Matti Järvisalo, Janne H. Korhonen, Christoph Lenzen, Joel Rybicki, Jukka Suomela, and Siert Wieringa. Synchronous Counting and Computational Algorithm Design. Journal of Computer and System Sciences, 2016.
[13] Danny Dolev, Nancy A. Lynch, Shlomit S. Pinter, Eugene W. Stark, and William E. Weihl. Reaching Approximate Agreement in the Presence of Faults. Journal of the ACM (JACM), 1986.
[14] Shlomi Dolev and Jennifer L. Welch. Self-Stabilizing Clock Synchronization in the Presence of Byzantine Faults. Journal of the ACM (JACM), 2004.
[15] Cynthia Dwork, Nancy Lynch, and Larry Stockmeyer. Consensus in the Presence of Partial Synchrony. Journal of the ACM (JACM), 1988.
[16] Michael J. Fischer and Nancy A. Lynch. A Lower Bound for the Time to Assure Interactive Consistency. Information Processing Letters, 1982.
[17] Michael J. Fischer, Nancy A. Lynch, and Michael Merritt. Easy Impossibility Proofs for Distributed Consensus Problems. In Symposium on Principles of Distributed Computing (PODC), August 1985.
[18] Juan A. Garay. Reaching (and Maintaining) Agreement in the Presence of Mobile Faults. In International Workshop on Distributed Algorithms (WDAG), Terschelling, Netherlands, September 1994.
[19] Juan A. Garay and Yoram Moses. Fully Polynomial Byzantine Agreement for n > 3t Processors in t + 1 Rounds. SIAM Journal on Computing (SICOMP), 1998.
[20] Vijay K. Garg and John Bridgman. The Weighted Byzantine Agreement Problem. In International Parallel and Distributed Processing Symposium (IPDPS), 2011.
[21] Fabiola Greve and Sebastien Tixeuil. Knowledge Connectivity vs. Synchrony Requirements for Fault-Tolerant Agreement in Unknown Networks. In International Conference on Dependable Systems and Networks (DSN), June 2007.
[22] Leslie Lamport, Robert Shostak, and Marshall Pease. The Byzantine Generals Problem. ACM Transactions on Programming Languages and Systems (TOPLAS), 1982.
[23] Christoph Lenzen and Joel Rybicki. Efficient Counting with Optimal Resilience. In International Symposium on Distributed Computing (DISC), October 2015.
[24] Christoph Lenzen and Joel Rybicki. Self-Stabilising Byzantine Clock Synchronisation is Almost as Easy as Consensus. In International Symposium on Distributed Computing (DISC), Vienna, Austria, October 2017.
[25] Jennifer Lundelius and Nancy Lynch. A New Fault-Tolerant Algorithm for Clock Synchronization. In Symposium on Principles of Distributed Computing (PODC), August 1984.
[26] Hammurabi Mendes, Maurice Herlihy, Nitin Vaidya, and Vijay K. Garg. Multidimensional Agreement in Byzantine Systems. Distributed Computing, 2015.
[27] Satoshi Nakamoto. Bitcoin: A Peer-to-Peer Electronic Cash System. 2008.
[28] T. K. Srikanth and Sam Toueg. Simulating Authenticated Broadcasts to Derive Simple Fault-Tolerant Algorithms. Distributed Computing, 1987.
[29] Erfan Taheri and Mohammad Izadi. Byzantine Consensus for Unknown Dynamic Networks. The Journal of Supercomputing, 2015.
[30] Lewis Tseng and Nitin H. Vaidya. Fault-Tolerant Consensus in Directed Graphs. In Symposium on Principles of Distributed Computing (PODC), July 2015.