Revisiting Optimal Resilience of Fast Byzantine Consensus (Extended Version)
PETR KUZNETSOV,
LTCI, Télécom Paris, Institut Polytechnique Paris, France
ANDREI TONKIKH,
National Research University Higher School of Economics, Russia
YAN X ZHANG,
San José State University, United States
It is a common belief that Byzantine fault-tolerant solutions for consensus are significantly slower than their crash fault-tolerant counterparts. Indeed, in PBFT, the most widely known Byzantine fault-tolerant consensus protocol, it takes three message delays to decide a value, in contrast with just two in Paxos. This motivates the search for fast Byzantine consensus algorithms that can produce decisions after just two message delays in the common case, e.g., under the assumption that the current leader is correct and not suspected by correct processes. The (optimal) two-step latency comes at the cost of lower resilience: fast Byzantine consensus requires more processes to tolerate the same number of faults. In particular, $5f + 1$ processes were believed to be necessary to tolerate $f$ Byzantine failures. In this paper, we present a fast Byzantine consensus algorithm that relies on just $5f - 1$ processes. While the difference of two processes may appear insignificant for large values of $f$, it can be crucial for systems of a smaller scale. In particular, for $f = 1$, our algorithm requires only 4 processes, which is optimal for any (not necessarily fast) partially synchronous Byzantine consensus algorithm.
Consensus [22] is by far the most studied problem in distributed computing. It allows multiple processes to unambiguously agree on a single value. Solving consensus allows one to build a replicated state machine by reaching agreement on each next command to be executed. In other words, it allows a group of processes to act as a single correct process, despite crashes or even malicious behaviour of some of them. Having implemented the replicated state machine, one can easily obtain an implementation of any object with a sequential specification [15, 23]. This makes consensus algorithms ubiquitous in distributed systems used in practice [3, 4, 7, 12, 26]: instead of implementing each service from scratch, software engineers often prefer to have a single highly optimized and well-tested implementation of state machine replication. Hence, it is crucial for consensus algorithms to be as efficient as possible. In particular, it is desirable to minimize the degree of redundancy, i.e., the number of processes that execute the consensus protocol.

It has been proven [22] that in order for a consensus algorithm to work despite the possibility of a malicious adversary taking control over a single participant process (the behaviour we call "a Byzantine failure"), a minimum of 4 processes is required, assuming that the network is partially synchronous (i.e., mostly reliable, but may have periods of instability). More generally, in order to tolerate $f$ Byzantine failures, a minimum of $3f + 1$ processes is required. Another important characteristic of a consensus algorithm is its latency in the common case, when all correct processes agree on the same correct leader process.
The first author was supported by TrustShare Innovation Chair. Authors' addresses: Petr Kuznetsov, LTCI, Télécom Paris, Institut Polytechnique Paris, France; Andrei Tonkikh, National Research University Higher School of Economics, Russia; Yan X Zhang, San José State University, United States.

Most Byzantine fault-tolerant consensus algorithms with optimal resilience ($3f + 1$ processes), such as PBFT [8], need three message delays to decide in the common case. In contrast, in the crash-fault model [...], many consensus algorithms with optimal resilience ($n = 2f + 1$) decide after just two message delays. This gap motivates the study of fast Byzantine consensus algorithms: the class of consensus algorithms that can reach agreement with the same delay as their crash fault-tolerant counterparts.

Kursawe [14] was the first to propose a Byzantine consensus algorithm that could decide a value after just two communication steps. However, the "optimistic fast path" of the algorithm works only if there are no failures at all. Otherwise, the protocol falls back to randomized consensus with larger latency.

Martin and Alvisi [19] proposed the first fast Byzantine consensus algorithm that is able to remain "fast" even in the presence of Byzantine failures. The downside of their algorithm compared to classical algorithms such as PBFT [8] is that it requires $5f + 1$ processes in order to tolerate $f$ Byzantine failures (as opposed to $3f + 1$ in PBFT). [...] $3f + 2t + 1$ processes in order to tolerate $f$ failures and remain fast when the actual number of failures does not exceed $t$ ($t \le f$). It is then shown that $3f + 2t + 1$ is the optimal number of processes for a fast $f$-resilient Byzantine consensus algorithm. We spot an oversight in the lower bound proof by Martin and Alvisi [19].
As we show in this paper, the lower bound of $3f + 2t + 1$ processes only applies to a restricted class of algorithms in which the processes are split into disjoint sets of proposers and acceptors. Surprisingly, if the roles of proposers and acceptors are performed by the same processes, there exists a fast $f$-resilient Byzantine consensus protocol that requires only $5f - 1$ processes. More generally, our protocol requires $n = \max\{3f + 2t - 1, 3f + 1\}$ processes in order to be able to tolerate $f$ Byzantine failures and remain fast (terminate in two message delays) in the presence of up to $t$ Byzantine failures. In particular, to the best of our knowledge, this is the first protocol that is able to remain fast in the presence of a single Byzantine failure ($t = 1$) while maintaining the optimal resilience ($n = 3f + 1$). We also prove a matching lower bound of $n = 3f + 2t - 1$ processes [...]. While for large values of $f$ and $t$ the difference of just two processes may not appear important, it matters for practical deployments with small $f$ and $t$ [...].

We start by recalling our model assumptions in Section 2. We describe our fast Byzantine consensus protocol in Section 3. In Section 4, we discuss the applicability of the previously known lower bound and prove that $3f + 2t - 1$ processes are necessary [...].

We consider a set $\Pi$ of $n$ processes, $p_1, \ldots, p_n$. Every process is assigned an algorithm (a deterministic state machine) that it is expected to follow. A process that deviates from its algorithm (we sometimes also say protocol), by performing a step that is not prescribed by the algorithm or by prematurely stopping taking steps, is called Byzantine. We assume that in an execution of the algorithm, up to $f$ processes can be Byzantine. We sometimes also consider a subset of executions in which up to $t \le f$ processes are Byzantine. Non-Byzantine processes are called correct.

The processes communicate by sending messages across reliable (no loss, no duplication, no creation) point-to-point communication channels. More precisely, if a correct process sends a message to a correct process, the message is eventually received. The adversary may not create messages or modify messages in transit. The channels are authenticated: the sender of each received message can be unambiguously identified.

Every process is assigned a public/private key pair. Every process knows the identifiers and public keys of all other processes. The adversary is computationally bounded and is therefore unable to compute the private keys of correct processes.

We assume a partially synchronous system: there exists an a priori bound $\Delta$ on message delays that holds eventually, i.e., there exists a time after which every message sent by a correct process to a correct process is received within $\Delta$ time units.
This (unknown to the processes) time when the bound starts to hold is called the global stabilization time (GST). For brevity, we neglect time spent on local computations.

Each process $p \in \Pi$ is assigned an input value $x^{in}_p$. At most once in any execution, a correct process can decide on a value $x$ by triggering the callback Decide($x$).

Any infinite execution of a consensus protocol must satisfy the following conditions:

Liveness: Each correct process must eventually decide on some value;
Consistency: No two correct processes can decide on different values;
Validity: We consider two flavors of this property:
    Weak validity: If all processes are correct and propose the same value, then only this value can be decided on;
    Extended validity: If all processes are correct, then only a value proposed by some process can be decided on.

Note that extended validity implies weak validity, but not vice versa (i.e., extended validity is strictly stronger than weak validity). Our algorithm solves consensus with extended validity, while our matching lower bound holds even for consensus with weak validity.
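To make the difference between the two validity flavors concrete, here is a minimal Python sketch that checks consistency and both validity properties on a finished execution in which all processes are correct. The function names and the dictionary representation of inputs and decisions are illustrative assumptions and do not come from the paper.

```python
from typing import Dict, Hashable

# Hypothetical representation (an assumption of this sketch): both maps go
# from a process identifier to a value, and all processes are correct.
Values = Dict[str, Hashable]


def consistency(decisions: Values) -> bool:
    """No two processes decide on different values."""
    return len(set(decisions.values())) <= 1


def weak_validity(inputs: Values, decisions: Values) -> bool:
    """If all processes propose the same value, only that value is decided."""
    proposed = set(inputs.values())
    if len(proposed) != 1:
        return True  # the property says nothing about mixed inputs
    return set(decisions.values()) <= proposed


def extended_validity(inputs: Values, decisions: Values) -> bool:
    """Only a value proposed by some process is decided."""
    return set(decisions.values()) <= set(inputs.values())


# Mixed inputs, but every process decides on a value nobody proposed.
inputs = {"p1": 0, "p2": 1, "p3": 1, "p4": 0}
decisions = {"p1": 7, "p2": 7, "p3": 7, "p4": 7}
print(consistency(decisions))                 # True
print(weak_validity(inputs, decisions))       # True
print(extended_validity(inputs, decisions))   # False
```

The example execution satisfies weak validity vacuously while violating extended validity, which illustrates why the implication does not hold in the reverse direction.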
In this section, we present our fast Byzantine consensus algorithm, assuming a system of $n = 5f - 1$ processes; the generalization to $n = 3f + 2t - 1$ processes is discussed later. Each process maintains its current view number, and each view is associated with a single leader process by an agreed-upon mapping $leader: \mathbb{Z}_{>0} \to \Pi$. For simplicity, let us assume that $leader(v) = p_{(v \bmod n) + 1}$. When all correct processes have the same current view number $v$, we say that process $leader(v)$ is elected.

The processes execute a view synchronization protocol in the background. We do not provide an explicit implementation for it because any implementation from the literature is sufficient [5, 8, 20]. The view synchronization protocol must satisfy the following three properties:

• The view number of a correct process is never decreased;
• In any infinite execution, a correct leader is elected an infinite number of times. In other words, for any point in the execution, there is a moment in the future when a correct leader is elected;
• If a correct leader is elected after GST, no correct process will change its view number for a time period of at least $5\Delta$.

Initially, the view number of each process is 1. Hence, process $leader(1)$ is elected at the beginning of the execution. If $leader(1)$ is correct and the network is synchronous from the beginning of the execution (GST $= 0$), [...]. A process decides on the proposed value once it receives ack messages from a quorum ($n - f$) of processes. Therefore, as long as the leader is correct and the correct processes do not change their views prematurely, every correct process decides after just two communication steps.

When correct processes change their views, they engage in the view change protocol, helping the newly elected leader to obtain a safe value to propose, equipped with a certificate that confirms that the value is safe.
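The leader mapping and the quorum-based decision rule of the normal case, as described above, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the class name `AckCollector` and its methods are hypothetical, processes are identified by their 1-based index, and signature verification is assumed to happen before an ack is counted.

```python
from collections import defaultdict
from typing import Dict, Hashable, Optional, Set, Tuple


def leader(v: int, n: int) -> int:
    """1-based index of the leader of view v, i.e. leader(v) = p_{(v mod n) + 1}."""
    return (v % n) + 1


class AckCollector:
    """Counts acks per (value, view) and decides at n - f distinct senders.

    Signature checks are assumed to have happened before on_ack is called.
    """

    def __init__(self, n: int, f: int) -> None:
        self.quorum = n - f
        self.senders: Dict[Tuple[Hashable, int], Set[int]] = defaultdict(set)
        self.decided: Optional[Hashable] = None

    def on_ack(self, sender: int, value: Hashable, view: int) -> Optional[Hashable]:
        """Record one ack; return the decided value once a quorum is reached."""
        self.senders[(value, view)].add(sender)  # repeated acks count once
        if self.decided is None and len(self.senders[(value, view)]) >= self.quorum:
            self.decided = value  # a process decides at most once
        return self.decided


n, f = 4, 1  # n = 5f - 1 for f = 1
collector = AckCollector(n, f)
for sender in (1, 1, 2):
    collector.on_ack(sender, "x", 1)
print(leader(1, n), collector.decided)  # 2 None
print(collector.on_ack(3, "x", 1))      # x
```

With $n = 4$ and $f = 1$, the quorum size is 3, so the repeated ack from process 1 does not bring the count to a quorum and the decision happens only after a third distinct sender is seen.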
A value is safe in a view if no other value was or will ever be decided in a smaller view.

The view change protocol consists of two phases: first, the leader collects votes from processes and decides which value is safe, and, second, the leader asks $2f + 1$ [...] $n - f$ acknowledgments.

Below we describe the normal case execution and the view change protocol in more detail. We say that a value $x$ is safe in a view $v$ if no value other than $x$ can be decided in a view $v' < v$.

The view change protocol (described in more detail below) provides the new leader with a value $\widehat{x}$ and a certificate $\widehat{\sigma}$ ensuring that $\widehat{x}$ is safe in the current view $v$. The certificate can be independently verified by any process. In the first view ($v = 1$), the leader simply proposes its own input value ($\widehat{x} = x^{in}_{leader(1)}$) and there is no need for such a certificate ($\widehat{\sigma} = \bot$).

To propose a value in the normal case (illustrated in Figure 1a), the leader $q$ sends the message propose($\widehat{x}, v, \widehat{\sigma}, \widehat{\tau}$) to all processes, where $\widehat{\tau} = sign_q((propose, \widehat{x}, v))$.

When a process receives the proposal for the first time in a given view and ensures that $\widehat{\sigma}$ and $\widehat{\tau}$ are valid, it sends an ack message containing the proposed value and a digital signature to every process. Once a process receives $n - f$ signed acknowledgments for the same pair $(\widehat{x}, v)$, it decides on the proposed value $\widehat{x}$.

Fig. 1. Execution examples of our protocol. (a) Normal case execution example: the leader sends propose($\widehat{x}, v, \widehat{\sigma}, \widehat{\tau}$) and the processes reply with ack($\widehat{x}, v, \phi_{ack}$), where $\widehat{\tau} = sign_q((propose, \widehat{x}, v))$ and $q$ is the identifier of the leader process that sends the propose message, and $\phi_{ack} = sign_q((ack, \widehat{x}, v))$, where $q$ is the identifier of the process that sends the ack message. (b) View change execution example: the messages vote($vote_q, \phi_{vote}$), CertReq($\widehat{x}$, votes), and CertAck($\phi_{ca}$), where $\phi_{vote} = sign_q((vote, vote_q, v))$ and $\phi_{ca} = sign_q((CertAck, \widehat{x}, v))$, and $q$ is the identifier of the process that sends the message.

Every process $q$ locally maintains a variable $vote_q$, an estimate of the value to be decided, in the form $(x, u, \sigma, \tau)$, where $x$ is a value, $u$ is a view number, $\sigma$ is the certificate ensuring that $x$ is safe in view $u$, and $\tau$ is a signature for the tuple $(propose, x, u)$ produced by the leader of view $u$. If $vote_q = (x, u, \sigma, \tau)$, we say that process $q$ votes for "value $x$ in view $u$". Initially, the variable $vote_q$ has the special value nil. When a correct process receives a propose message from the leader of its current view for the first time, the process updates its vote by adopting the values from the propose message (before sending the ack message). Note that once a correct process changes its vote from nil to something else, it never changes the vote back to nil. We say that a vote is valid if either it is equal to nil or both $\sigma$ and $\tau$ are valid with respect to $x$ and $u$.

Whenever a correct process $q$ changes its current view (let $v$ be the new view number), it sends the message vote($vote_q, \phi_{vote}$) to the leader of view $v$, where $\phi_{vote} = sign_q((vote, vote_q, v))$. When a correct process finds itself to be the leader of its current view $v$, unless $v = 1$, it executes the view change protocol (illustrated in Figure 1b). First, it waits for $n - f$ valid votes and runs the selection algorithm to determine a safe value to propose ($\widehat{x}$). Then it communicates with other processes to create the certificate $\widehat{\sigma}$.

Selection algorithm.
Let votes be the set of all valid votes received by the leader (with the ids and the signatures of the processes that sent these votes), so that $|votes| \ge n - f$. If all votes in votes are equal to nil, then the leader simply selects its own input value ($x^{in}_{leader(v)}$).

Otherwise, let $w$ be the highest view number contained in a valid vote. If there is only one value $x$ such that there is a valid vote $(x, w, *, *)$ in votes, then $x$ is selected.

Let us now consider the case when there are two or more values with valid votes in view $w$. As a correct leader issues at most one proposal in its view, the only reason for two different valid votes $vote_1 = (x_1, w, \sigma_1, \tau_1)$ and $vote_2 = (x_2, w, \sigma_2, \tau_2)$ to exist is that the leader $q$ of view $w$ is Byzantine (we say that process $q$ has equivocated). We can then treat $\gamma = (vote_1, vote_2)$ as undeniable evidence of $q$'s misbehavior. As we have at most $f$ faulty processes, the leader can then wait for $n - f$ votes not including $q$'s vote (i.e., the leader may need to wait for exactly one more vote if $|votes| = n - f$ and votes contains a vote from $q$). After receiving this additional vote, it may happen that $w$ is no longer the highest view number contained in a valid vote. In this case, the selection algorithm needs to be restarted.

Otherwise, if $w$ remains the highest view number contained in a valid vote, let $votes'$ denote the $n - f$ valid votes from processes other than $q$. We have two cases to consider:

(1) If there is a set $V \subseteq votes'$ of $2f$ valid votes for a value $x$ in view $w$, then $x$ is selected;
(2) If no such value $x$ is found, then any value is safe in view $v$. In this case, the leader simply selects its own input value ($x^{in}_{leader(v)}$).

Certificate creation. Let $\widehat{x}$ be the value selected by the selection algorithm. As we prove in Section 3.3, if the leader honestly follows the selection algorithm as described above, the selected value $\widehat{x}$ will be safe in the current view $v$. However, the leader also needs to create a certificate $\widehat{\sigma}$ that will prove to all other processes that $\widehat{x}$ is safe.

The naive way to do so is to simply let $\widehat{\sigma}$ be the set of all valid votes received by the leader. Any process will be able to verify the authenticity of the votes (by checking the digital signatures) and that the leader followed the selection algorithm correctly (by simulating the selection process locally on the given set of votes).

However, the major problem with this solution is that the certificate sizes will grow without limit in case of a long period of asynchrony. Recall that each vote contains a certificate. If each certificate $\widehat{\sigma}$ consisted of $n - f$ votes, then each vote would contain a certificate of its own, which, in turn, would consist of $n - f$ votes cast in an earlier view, and so on. If this naive approach is implemented carefully, the certificate size (and, hence, the certificate verification time) will be linear with respect to the current view number. While it may be sufficient for some applications (e.g., if long periods of asynchrony are assumed to never happen), a solution with bounded certificate size would be much more appealing.

In order to compress the certificate, we add an additional round-trip to the view change protocol. The leader sends the votes alongside the selected value $\widehat{x}$ to at least $2f + 1$ processes and waits for $f + 1$ of them to confirm the selection. The certificate $\widehat{\sigma}$ is the set of $f + 1$ signatures for the tuple $(CertAck, \widehat{x}, v)$. Intuitively, since there are at most $f$ Byzantine processes in total, it is sufficient to present signatures from $f + 1$ processes to prove that $\widehat{x}$ is safe in view $v$. As a result, the size of a protocol message does not depend on the view number.
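The selection algorithm described above can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: the names `Vote`, `select`, and `NEED_MORE_VOTES` are hypothetical, certificates and signatures are assumed to have been validated already, and the restart after an additional vote is modeled by simply calling `select` again on the enlarged vote set.

```python
from collections import Counter
from typing import Callable, Hashable, List, NamedTuple, Optional

NEED_MORE_VOTES = object()  # sentinel: wait for another vote, then call select again


class Vote(NamedTuple):
    sender: int                 # identifier of the voting process
    value: Optional[Hashable]   # None encodes the special vote nil
    view: int                   # view in which the value was proposed (0 for nil)


def select(votes: List[Vote], n: int, f: int, own_input: Hashable,
           leader_of: Callable[[int], int]):
    """Pick a safe value from at least n - f already validated votes."""
    assert len(votes) >= n - f
    non_nil = [v for v in votes if v.value is not None]
    if not non_nil:
        return own_input                      # all votes are nil
    w = max(v.view for v in non_nil)          # highest view in a valid vote
    top = {v.value for v in non_nil if v.view == w}
    if len(top) == 1:
        return next(iter(top))                # unique value voted for in view w
    q = leader_of(w)                          # two values in view w: q equivocated
    others = [v for v in votes if v.sender != q]
    if len(others) < n - f:
        return NEED_MORE_VOTES                # need n - f votes not counting q's
    counts = Counter(v.value for v in others[:n - f]
                     if v.value is not None and v.view == w)
    for value, count in counts.items():
        if count >= 2 * f:
            return value                      # case (1): 2f votes for one value
    return own_input                          # case (2): any value is safe


n, f = 4, 1
leader_of = lambda v: (v % n) + 1             # leader(1) is process 2
votes = [Vote(1, "a", 1), Vote(2, "b", 1), Vote(3, "a", 1)]
print(select(votes, n, f, "mine", leader_of) is NEED_MORE_VOTES)   # True
votes.append(Vote(4, None, 0))
print(select(votes, n, f, "mine", leader_of))                      # a
```

In the usage example, the two conflicting votes in view 1 expose the leader of that view (process 2), so the new leader waits for one more vote and then finds $2f = 2$ votes for "a" among the processes other than the equivocator.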
It is easy to see that the protocol satisfies the liveness property of consensus: once a correct leader is elected after GST, there is nothing to stop it from driving the protocol to termination. The extended validity property is immediate. Hence, in this section, we focus on consistency. We show that a correct leader always chooses a safe value in the view change protocol.

Our proofs are based on the following three quorum intersection properties (recall that $n = 5f - 1$):

(QI1) Simple quorum intersection: any two sets of $n - f$ processes intersect in at least one correct process. This follows from the pigeonhole principle. It is sufficient to verify that $2(n - f) - n \ge f + 1$, which is equivalent to $n \ge 3f + 1$ and holds for $n = 5f - 1$ whenever $f \ge 1$.

(QI2) Equivocation quorum intersection: if $Q_1 \subseteq \Pi$ such that $|Q_1| = n - f$ and $Q_2 \subseteq \Pi$ such that $|Q_2| = n - f$ and there are at most $f - 1$ Byzantine processes in $Q_2$, then $Q_1 \cap Q_2$ contains at least $2f$ correct processes. Again, by the pigeonhole principle, it is sufficient to verify that $2(n - f) - n \ge (f - 1) + 2f$, which is equivalent to $n \ge 5f - 1$.

(QI3) Equivocation quorum intersection: if $Q \subseteq \Pi$ such that $|Q| = n - f$ and $V \subseteq \Pi$ such that $|V| = 2f$ and there are at most $f - 1$ Byzantine processes in $V$, then $Q \cap V$ contains at least one correct process. It is sufficient to verify that $(n - f) + 2f - n \ge (f - 1) + 1$, which holds for any values of $n$ and $f$, $n \ge f$.

First, let us address the corner case when the leader receives no valid votes other than nil.

Lemma 3.1. If the leader of view $v$ receives nil from $n - f$ different processes during the view change, then any value is safe in $v$.

Proof. Suppose, for contradiction, that at some point of the execution some value $y$ is decided in a view $w'$ smaller than $v$. Consider the set $Q_1 \subseteq \Pi$ of $n - f$ processes that acknowledged value $y$ in $w'$. Consider also the set $Q_2 \subseteq \Pi$ of $n - f$ processes that sent nil to the leader of view $v$. By property (QI1), $Q_1 \cap Q_2$ contains at least one correct process. A correct process only sends messages associated with its current view and it never decreases its current view number. Hence, it cannot send the vote in view $v$ before sending the acknowledgment in view $w'$. If the correct process acknowledged value $y$ in $w'$ before sending the vote to the leader of view $v$, the vote would not have been nil. A contradiction. □

For the rest of this section, let $v$ denote a view number and let $w$ denote the highest view number contained in a valid vote received by the leader of view $v$ during the view change protocol.

Lemma 3.2. No value was or will ever be decided in any view $w'$ such that $w < w' < v$.

Proof. Suppose, for contradiction, that at some point of the execution some value $y$ is decided in $w'$ ($w < w' < v$). Let $Q_1 \subseteq \Pi$ be the set of $n - f$ processes that acknowledged value $y$ in $w'$ and let $Q_2 \subseteq \Pi$ be the set of $n - f$ processes that sent their votes to the leader of view $v$. By property (QI1), $Q_1 \cap Q_2$ contains at least one correct process. A correct process only sends messages associated with its current view and it never decreases its current view number. Hence, it cannot send the vote in view $v$ before sending the acknowledgment in view $w'$.
If the correct process acknowledged value $y$ in $w'$ before sending the vote to the leader of view $v$, the vote would have contained a view number at least as large as $w'$. This contradicts the maximality of $w$. □

Lemma 3.3.
If there is only one value $x$ such that there is a valid vote $(x, w, \sigma, \tau)$, then $x$ is safe in view $v$.

Proof. Suppose, for contradiction, that at some point of the execution some other value $y$ is decided in a view $w'$ smaller than $v$. By the validity of certificate $\sigma$, $w'$ cannot be smaller than $w$. By Lemma 3.2, $w'$ cannot be larger than $w$. Let us consider the remaining case ($w' = w$).

The proof is mostly identical to the proofs of Lemmas 3.1 and 3.2. Nevertheless, we repeat it here for completeness. Let $Q_1 \subseteq \Pi$ be the set of $n - f$ processes that acknowledged value $y$ in $w$. Let $Q_2 \subseteq \Pi$ be the set of $n - f$ processes that sent their votes to the leader of view $v$. By (QI1), $Q_1 \cap Q_2$ contains at least one correct process. A correct process only sends messages associated with its current view and it never decreases its current view number. Hence, it cannot send the vote in view $v$ before sending the acknowledgment in view $w$. If the correct process acknowledged value $y$ in $w$ before sending the vote to the leader of view $v$, the vote would have contained either a view number larger than $w$ (which contradicts the maximality of $w$) or the value $y$ (which contradicts the uniqueness of $x$). □

Lemma 3.4.
If the leader detects an equivocation by process $q$ and receives at least $2f$ valid votes for a value $x$ in view $w$ from processes other than $q$, then $x$ is safe in view $v$.

Proof. Suppose, for contradiction, that at some point of the execution some other value $y$ is decided in a view $w'$ smaller than $v$. By the validity of the certificates attached to the votes cast for value $x$, $w'$ cannot be smaller than $w$. By Lemma 3.2, $w'$ cannot be larger than $w$. Let us consider the remaining case ($w' = w$).

Let $Q \subseteq \Pi$ be the set of $n - f$ processes that acknowledged value $y$ in $w$. Let $V \subseteq \Pi$ be the set of $2f$ processes that cast votes for value $x$ in view $w$. Since $q \notin V$ and $q$ is provably Byzantine, there are at most $f - 1$ Byzantine processes in $V$. By (QI3), there is at least one correct process in $Q \cap V$. A correct process only adopts a vote before acknowledging the value from the vote and it never acknowledges two different values in the same view. Hence, $y = x$. A contradiction. □

Lemma 3.5.
If the leader detects an equivocation by process $q$ and does not receive $2f$ or more valid votes for any value $x$ in view $w$ from processes other than $q$, then any value is safe in $v$.

Proof. Suppose, for contradiction, that at some point of the execution some value $y$ is decided in a view $w'$ smaller than $v$. Let $vote_1 = (y_1, w, \sigma_1, \tau_1)$ and $vote_2 = (y_2, w, \sigma_2, \tau_2)$ be the two valid votes such that $y_1 \ne y_2$. By the validity of $\sigma_1$, no value other than $y_1$ was or will ever be decided in a view smaller than $w$. The same applies for value $y_2$. Hence, no value was or will ever be decided in a view smaller than $w$ (i.e., $w'$ is not smaller than $w$). By Lemma 3.2, $w'$ is not larger than $w$.

Let us consider the remaining case ($w' = w$). Recall that the leader collects $n - f$ votes from processes other than $q$. By (QI2), the leader would have received at least $2f$ votes for the value $y$ in view $w$ or at least one vote for a value in a view larger than $w$. The former contradicts the assumption of the lemma, and the latter contradicts the maximality of $w$. □

By applying the techniques from prior studies on fast Byzantine consensus [2, 14, 19], we can obtain a generalized version of our algorithm. The protocol will tolerate $f$ Byzantine failures and will be able to decide a value in the common case after just two communication steps as long as the actual number of faults does not exceed a threshold $t$ ($t \le f$). The required number of processes will be $\max\{3f + 2t - 1, 3f + 1\}$ (i.e., $n = 3f + 2t - 1$ if $t \ge 1$ and $n = 3f + 1$ if $t = 0$). In particular, for $t = 1$, we obtain a Byzantine consensus protocol with optimal resilience ($n = 3f + 2t - 1 = 3f + 1$ for $t = 1$) that is able to decide a value with optimal latency in the common case in the presence of a single Byzantine fault. To the best of our knowledge, in all prior algorithms with optimal resilience ($n = 3f + 1$) [...].

In this section, we show that any $f$-resilient Byzantine consensus protocol that terminates in two synchronous steps when the number of actual failures does not exceed $t$ (we call such a protocol two-step) requires at least $3f + 2t - 1$ processes. [...] $n = 3f + 2t + 1$ [...] the processes that propose values (called proposers) are disjoint from the processes responsible for replicating the proposed values (called acceptors).

Let $\mathcal{V}$ be the domain of the consensus protocol (i.e., the set of possible input values). We define an initial configuration as a function $I: \Pi \to \mathcal{V}$ that maps processes to their input values. Note that although $I$ maps all processes to some input values, Byzantine processes can behave as if they had a different input.

An execution of the protocol is the tuple $(I, \mathcal{B}, \mathcal{S})$, where $I$ is an initial configuration, $\mathcal{B}$ is the set of Byzantine processes ($|\mathcal{B}| \le f$), and $\mathcal{S}$ is a totally ordered sequence of steps taken by every process with timestamps (consisting of "send message", "receive message", and "timer elapsed" events). We allow multiple events to have the same timestamp, but they, nevertheless, must be arranged in a total order. If $e = (I, \mathcal{B}, \mathcal{S})$, we say that execution $e$ starts from initial configuration $I$.

In the proof of this lower bound, we assume that all processes have access to perfectly synchronized local clocks that show the exact time elapsed since the beginning of the execution. Note that this only strengthens our lower bound. If there is no algorithm implementing fast Byzantine consensus with fewer than $3f + 2t - 1$ processes with synchronized clocks, then clearly there is no such algorithm without synchronized clocks.

In all executions that we consider in the proof, the delivery of a message takes at least $\Delta$ time units.
We refer to events that happen during the half-open time interval $[0, \Delta)$ as the first round, to the events that happen during the half-open time interval $[\Delta, 2\Delta)$ as the second round, and so on. Hence, a message sent in round $i$ may only be delivered in round $i + 1$ [...]. [...] "$i$-th round", we mean its state after all events with timestamp $i\Delta$ or smaller and before any events with higher timestamps.

Lemma 4.1. Actions taken by correct processes during the first round depend exclusively on their inputs (i.e., on the initial configuration).
Proof. Indeed, in the executions that we consider, during the first round, no messages can be delivered. Messages that are sent at time 0 are delivered not earlier than at time $\Delta$, which belongs to the second round. As we only consider deterministic algorithms, all actions taken by the processes in the first round are based on their input values. □

Thanks to the liveness property of consensus, we can choose to only consider finite executions in which every correct process decides on some value at some point. Moreover, by the consistency property of consensus, all correct processes have to decide the same value. Let us call this value the consensus value of an execution and denote it with $c(e)$, where $e$ is an execution.

Given an execution $e$ and a process $p$, the decision view of $p$ in $e$ is the view of $p$ at the moment when it triggers the Decide callback. The view consists of the messages $p$ received (ordered and with the precise time of delivery) together with the state of $p$ in the initial configuration of $e$. Note that the messages received by $p$ after it triggers the callback are not reflected in the decision view.

Let $e_1$ and $e_2$ be two executions, and let $p$ be a process which is correct in $e_1$ and $e_2$. Execution $e_1$ is similar to execution $e_2$ with respect to $p$, denoted as $e_1 \overset{p}{\sim} e_2$, if the decision view of $p$ in $e_1$ is the same as the decision view of $p$ in $e_2$. If $P$ is a set of processes, we use $e_1 \overset{P}{\sim} e_2$ as a shorthand for $\forall p \in P: e_1 \overset{p}{\sim} e_2$.

Lemma 4.2. If there is a correct process $p \in \Pi$ such that $e_1 \overset{p}{\sim} e_2$, then $c(e_1) = c(e_2)$.

Proof. Since we only consider executions where all correct processes decide some value, in executions $e_1$ and $e_2$, process $p$ had to decide values $c(e_1)$ and $c(e_2)$ respectively. However, since, at the moment of the decision, process $p$ is in the same state in both executions and we only consider deterministic processes, $p$ has to make identical decisions in the two executions. Hence, $c(e_1) = c(e_2)$. □

We say that $e = (I, \mathcal{B}, \mathcal{S})$ is a $\mathcal{T}$-faulty two-step execution, where $\mathcal{T} \subseteq \Pi$ and $|\mathcal{T}| = t$, iff:

(1) All processes in $\Pi \setminus \mathcal{T}$ are correct and all processes in $\mathcal{T}$ are Byzantine (i.e., $\mathcal{B} = \mathcal{T}$);
(2) Local computation is instantaneous. In particular, if one process receives a message from another process at some time instant and sends a reply without waiting, the reply is sent at that same instant and arrives exactly $\Delta$ time units later;
(3) Processes in $\mathcal{T}$ honestly follow the protocol during the first round and do not take any actions in later rounds. In particular, they do not send any messages at the time $2\Delta$ or later;
(4) Delivery of each message between each pair of processes takes precisely $\Delta$ time units;
(5) Every correct process makes a decision not later than at the time $2\Delta$.

The following lemma explains how the weak validity property of consensus dictates the output values of $\mathcal{T}$-faulty two-step executions.

Lemma 4.3. For any consensus protocol with weak validity, if all processes have the same input value $x$ ($\forall p: I(p) = x$), for any $\mathcal{T}$-faulty two-step execution $e$ starting from $I$, the consensus value $c(e) = x$.

Proof. Let $\theta$ be the moment in time such that, in execution $e$, by that moment, all correct processes have invoked the Decide callback. Let $e'$ be an execution identical to $e$ with the exception that processes in $\mathcal{T}$ are not Byzantine, but just slow. The messages they send after the first round do not reach other processes until after the moment $\theta$. The processes in $\Pi \setminus \mathcal{T}$ have no way to distinguish $e'$ from $e$ until they receive the delayed messages, which happens only after they invoke the Decide callback. Hence, $e' \overset{\Pi \setminus \mathcal{T}}{\sim} e$ and, by Lemma 4.2, $c(e') = c(e)$. By the weak validity property of the consensus protocol, if all processes have $x$ as their input value, then $c(e') = x$. □

Protocol $\mathcal{P}$ is a two-step consensus protocol if it satisfies the following conditions:

(1) $\mathcal{P}$ is a consensus protocol with weak validity, as defined in Section 2;
(2) $\exists \mathcal{M} \subseteq \Pi$ such that $|\mathcal{M}| \ge$ [...], $\forall I$ (initial configuration): $\forall \mathcal{T} \subseteq \mathcal{M}$ such that $|\mathcal{T}| = t$: there is a $\mathcal{T}$-faulty two-step execution starting from $I$.

In other words, if all Byzantine processes belong to a known set of "suspects" $\mathcal{M}$ and fail by simply crashing at the moment $\Delta$, local computation is immediate, and the network is synchronous, the algorithm must be able to make sure that all processes decide some value after just 2 steps. Otherwise, when the environment is not so gracious (e.g., the network is not synchronous from the beginning or some processes in $\Pi \setminus \mathcal{M}$ are Byzantine), the protocol is allowed to terminate after more than 2 steps. Intuitively, we allow the optimistic fast path of the protocol to rely on the correctness of up to $n - ($[...]$)$ "leaders". We are not aware of any protocol published in the literature that would rely on the correctness of more than one "leader" process, but we want to make our lower bound as general as possible.

As an example, we show that our protocol is a two-step consensus protocol. Suppose we have at least $3f + 2t - 1$ processes and [...] $\ge 1$. Recall that $leader(1)$ is the leader for view 1. Let $p = leader(1)$. We can pick $\mathcal{M} = \Pi \setminus \{p\}$. Then for any initial configuration and any $\mathcal{T}$ with $|\mathcal{T}| = t$, the following $\mathcal{T}$-faulty two-step execution exists:

(1) $p$ proposes its input value $x = x^{in}_p$ at time 0 with the message propose($x, 1, \bot, \widehat{\tau}$);
(2) The other processes, including those in $\mathcal{T}$, honestly follow the protocol and do nothing during the first round;
(3) At time $\Delta$, all processes receive the propose message. Among them, $3f + 2t -$ [...] respond with an acknowledgment ack($x, 1, \phi_{ack}$);
(4) At time $2\Delta$, all the correct processes receive all the ack messages and decide via Decide($x$).

Process $p \in \Pi$ is said to be influential if there are two initial configurations ($I$ and $I'$) such that $\forall q \ne p: I(q) = I'(q)$ and two non-intersecting sets of suspects not including $p$ of size $t$ ($\mathcal{T}, \mathcal{T}' \subseteq \mathcal{M} \setminus \{p\}$, $|\mathcal{T}| = |\mathcal{T}'| = t$, and $\mathcal{T} \cap \mathcal{T}' = \emptyset$) such that there are a $\mathcal{T}$-faulty execution $e$ and a $\mathcal{T}'$-faulty execution $e'$ with different consensus values ($c(e) \ne c(e')$).

Intuitively, a process is influential if its input value under certain circumstances can affect the outcome of the fast path of the protocol. In Theorem 4.6, we prove that, if the number of processes is smaller than $3f + 2t - 1$, an influential process can use its power to force disagreement.

Lemma 4.4.
For any two-step consensus protocol, there is at least one influential process.
Proof. For each $i \in \{0, \dots, n\}$, let $I_i$ be the initial configuration in which the first $i$ processes have the input value 1 and the remaining processes have the input value 0. In particular, in $I_0$, all processes have the input value 0, and, in $I_n$, all processes have the input value 1. By the definition of a two-step consensus protocol, for every $i$ and every $\mathcal{T} \subset \mathcal{M}$ such that $|\mathcal{T}| = t$, there must be a $\mathcal{T}$-faulty two-step execution starting from $I_i$. Moreover, by Lemma 4.3, for every such $\mathcal{T}$, all $\mathcal{T}$-faulty two-step executions starting from $I_0$ (resp., $I_n$) have the consensus value 0 (resp., 1).

Let pred$(i)$ be the predicate “there is a set $\mathcal{T} \subset \mathcal{M} \setminus \{p_i\}$ such that there is a $\mathcal{T}$-faulty two-step execution with consensus value 1 starting from $I_i$”. We know that pred$(0)$ = false and pred$(n)$ = true. Hence, as we consider all numbers from 0 to $n$, pred must change its value from false to true at least once. Let $j \ge 1$ be such a number that pred$(j - 1)$ = false and pred$(j)$ = true, and let $\mathcal{T}_1$ be the set of suspects defined in the predicate. It follows that there is a minimum number $j \ge 1$ and a set $\mathcal{T}_1$ such that there is a $\mathcal{T}_1$-faulty two-step execution with consensus value 1 starting from $I_j$; call this execution $\rho_1$.

Note that $p_j \notin \mathcal{T}_1$. Indeed, if $p_j \in \mathcal{T}_1$, then the input of $p_j$ would not be able to affect the consensus value of any $\mathcal{T}_1$-faulty two-step execution ($p_j$ simply would not participate in that execution). In that case, there would be a $\mathcal{T}_1$-faulty two-step execution with consensus value 1 starting from $I_{j-1}$, which contradicts the choice of $j$.

Let $\mathcal{T}_2 \subset \mathcal{M}$ be a set of suspects such that $|\mathcal{T}_2| = t$, $p_j \notin \mathcal{T}_2$, and $\mathcal{T}_1 \cap \mathcal{T}_2 = \emptyset$. Such a set exists because $|\mathcal{M}| \ge 2t + 1$. By the definition of $j$, all $\mathcal{T}_2$-faulty two-step executions starting from the initial configuration $I_{j-1}$ have consensus value 0, and, by the definition of a two-step consensus protocol, there is at least one such execution; call it $\rho_2$.

We argue that $p_j$ is an influential process. Indeed, $I_{j-1}$ and $I_j$ differ only in the input of process $p_j$, $\rho_1$ and $\rho_2$ are $\mathcal{T}_1$- and $\mathcal{T}_2$-faulty executions starting from $I_j$ and $I_{j-1}$ respectively, $\mathcal{T}_1 \cap \mathcal{T}_2 = \emptyset$, $p_j \notin (\mathcal{T}_1 \cup \mathcal{T}_2)$, and $c(\rho_1) \ne c(\rho_2)$. □

Let us prove that $3f + 2t - 1$ […] $t = 1$.

Lemma 4.5. There is no two-step consensus protocol (assuming weak validity) with $f \ge 1$ and $t = 1$ that can be executed on $3f + 2t - 2 = 3f$ processes.

Proof. This is a special case of the more general lower bound [22], which states that any Byzantine consensus protocol in the partially synchronous model requires at least $3f + 1$ processes. □

Theorem 4.6.
There is no two-step consensus protocol (assuming weak validity) with $f \ge t \ge 2$ that can be executed on $3f + 2t - 2$ processes.

Proof. Suppose, for contradiction, that there is such a two-step consensus protocol that can be executed on a set $\Pi$ of $3f + 2t - 2$ processes with $f \ge t \ge 2$. By Lemma 4.4, there are an influential process $p$, two initial configurations ($I'$ and $I''$) that differ only in the input of process $p$, two sets of suspects ($\mathcal{T}', \mathcal{T}'' \subset \Pi \setminus \{p\}$, $|\mathcal{T}'| = |\mathcal{T}''| = t$, and $\mathcal{T}' \cap \mathcal{T}'' = \emptyset$), and two executions: a $\mathcal{T}'$-faulty execution $\rho'$ starting from $I'$ and a $\mathcal{T}''$-faulty execution $\rho''$ starting from $I''$, such that $c(\rho') \ne c(\rho'')$. Without loss of generality, let us assume that $c(\rho') = 1$ and $c(\rho'') = 0$.

We partition $\Pi \setminus \{p\}$ into five groups: $P_1, \dots, P_5$, where $P_1 = \mathcal{T}''$, $P_5 = \mathcal{T}'$, and $|P_2| = |P_3| = |P_4| = f - 1$. The partition is depicted in Figure 2. Note that $|P_1| + \dots + |P_5| + |\{p\}| = 2t + 3(f - 1) + 1 = 3f + 2t - 2 = |\Pi|$.

[Fig. 2. A visualization for the proof setup of the lower bound when $f \ge t \ge 2$. The rows are executions and the columns are groups (subsets) of the processes, of sizes 1, $t$, $f - 1$, $f - 1$, $f - 1$, $t$ for $\{p\}, P_1, \dots, P_5$. Byzantine groups are marked. The $s_i$ and $t_i$ correspond to the state of the process after the first round.]

[Fig. 3. Executions $\rho_2$ (a) and $\rho_3$ (b). Solid blue lines and dashed green arrows represent messages identical to messages sent in $\rho_1$ and $\rho_5$ respectively. A green tick symbol means that all processes in the group decide a value. Messages from all processes other than $p$ in the first round are identical in all executions and omitted from the picture for clarity. Messages sent in the second round to process $p$ in both executions and to group $P_3$ in $\rho_3$ are also omitted, as these processes are Byzantine and do not take any further steps in these executions after the second round. $s_i$ and $t_i$ represent states of correct processes after the first round in $\rho_1$ and $\rho_5$ respectively. The states of correct processes after the second round in $\rho_3$ are also shown. In $\rho_2$, processes in group $P_2$ are Byzantine, but they pretend to be correct and in the state $s_2$.]

We construct 5 executions, $\rho_1$ through $\rho_5$ ($\rho_1 = \rho''$ and $\rho_5 = \rho'$), such that for all $i \in \{1, \dots, 5\}$, group $P_i$ is Byzantine in $\rho_i$, and for all $j \ne i$, group $P_j$ is correct in $\rho_i$. The influential process $p$ is Byzantine for $\rho_2$, $\rho_3$, $\rho_4$ and is correct for $\rho_1$ and $\rho_5$; hence there are exactly $f$ Byzantine processes in each of $\rho_2$, $\rho_3$, $\rho_4$ (as $|P_2| = |P_3| = |P_4| = f - 1$) and $t \le f$ Byzantine processes in $\rho_1$ and $\rho_5$. Our goal is to show that each pair of adjacent executions is similar for at least one set of correct processes, which will then decide the same value in both. This would naturally lead to a contradiction, since Lemma 4.2 implies that $0 = c(\rho_1) = \dots = c(\rho_5) = 1$.

$p$ can send one of two types of messages in the first round: 0 or 1, where sending 0 causes execution $\rho_1$ to happen and sending 1 causes execution $\rho_5$ to happen. Recall that the initial configurations of $\rho_1$ and $\rho_5$ differ only in the input value of $p$. By Lemma 4.1, all actions taken by correct processes other than $p$ during the first round will be the same in all executions. Additionally, for all $i$, in execution $\rho_i$, the processes in the Byzantine group $P_i$ will act as if they are correct. Hence, the only process that acts differently in different executions during the first round is $p$. For each execution $\rho_i$, let $p$ send 0 to the processes in $P_j$ for $i < j$ and send 1 to the processes in $P_j$ for $i > j$ (processes in $P_i$ are Byzantine in $\rho_i$, so it does not matter what is sent to them). Note that for $i = 1$ and $i = 5$, $p$ is honestly following the protocol for some initial configuration (i.e., sending the same message to all processes). For the other $i$, $p$ is a Byzantine process that is role-playing a different initial configuration for each of the two nontrivial parts of the other processes. Thus, after the first round, each group $P_j$ can be assumed to take just one of two states $s_j$ or $t_j$, where $s_j$ is consistent with $\rho_1$ and $t_j$ is consistent with $\rho_5$, solely dependent on which of the two messages $p$ sends to them. See Figures 2 and 3. We now construct $\rho_2$, $\rho_3$, and $\rho_4$.

Execution $\rho_2$, second round (we will specify the actions of subsequent rounds in $\rho_2$ at a later time):
• Recall that $\{p\} \cup P_2$ are Byzantine. They ($\{p\} \cup P_2$) will send messages to group $P_3$ in exactly the same fashion as in $\rho_1$. For other processes, $P_2$ will act in exactly the same fashion as in $\rho_5$ (i.e., as if they are correct and were in the state $t_2$ after the first round). Process $p$ will simply remain silent for the rest of the execution;
• $P_1$ is now honest, unlike it was in $\rho_1$. However, the messages from $P_1$ to $P_3$ are delayed and do not reach the recipients until after the time $2\Delta$. Other messages sent by $P_1$ in the second round are delivered in a timely fashion;
• $P_3$ is still honest but slow. Its messages will not be received by any other process until a finite moment in time $T$ that we will specify later;
• All messages sent to $P_3$ by processes other than $P_1$ in the second round will be delivered at the exact same times and in the exact same order as in $\rho_1$.

We now look at $P_3$'s perspective. During the time interval $[0, 2\Delta]$, $P_3$ will not be able to distinguish this execution from $\rho_1$, since it receives the exact same messages from $\{p\}$, $P_2$, $P_4$, $P_5$, which all have the same state as in $\rho_1$ after the first round (or in $P_2$'s case can fake the same state), and hears nothing from $P_1$ in both executions. Thus, by the time $2\Delta$, processes in $P_3$ will achieve exactly the same state as in $\rho_1$ and will decide 0 as well (a reminder that this decision is done in silence, as $P_3$'s messages will not be received by anyone else until the time $T$). Therefore, $\rho_1 \overset{P_3}{\sim} \rho_2$.

Execution $\rho_4$, second round: Recall that $\{p\} \cup P_4$ are Byzantine. We perform the exact same construction as for $\rho_2$, but switching the roles by both the symmetries $\rho_i \leftrightarrow \rho_{6-i}$ and $P_i \leftrightarrow P_{6-i}$. In particular, $\{p\} \cup P_4$ will send messages to group $P_3$ in exactly the same fashion as in $\rho_5$. By a symmetric argument as above, we can conclude that by the time $2\Delta$, processes in $P_3$ will achieve exactly the same state as in $\rho_5$ and will decide 1 as well. Therefore, $\rho_4 \overset{P_3}{\sim} \rho_5$.
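As a sanity check on the bookkeeping of this construction, the following Python sketch recomputes the group sizes, the number of Byzantine processes in each execution $\rho_i$, and the first-round state of each group. It is only an illustration and not part of the proof: the helper names (`group_sizes`, `byzantine_count`, `first_round_state`, `check`) are hypothetical, and the state assignment (state $t_j$ for groups with $j < i$, state $s_j$ for groups with $j > i$) is an assumption read off Figure 2.

```python
# Illustrative sketch only; helper names are hypothetical, not from the paper.

def group_sizes(f, t):
    """Sizes of {p} and P_1..P_5 in the partition (assumes f >= t >= 2)."""
    return {"p": 1, 1: t, 2: f - 1, 3: f - 1, 4: f - 1, 5: t}


def byzantine_count(i, f, t):
    """Number of Byzantine processes in execution rho_i."""
    sizes = group_sizes(f, t)
    count = sizes[i]          # group P_i is Byzantine in rho_i
    if i in (2, 3, 4):        # p is Byzantine only in rho_2, rho_3, rho_4
        count += sizes["p"]
    return count


def first_round_state(i, j):
    """State of group P_j after the first round of rho_i.

    Assumption (read off Figure 2): groups with j < i are in the state
    consistent with rho_5 ("t"), groups with j > i are in the state
    consistent with rho_1 ("s"); P_i itself is Byzantine.
    """
    if j == i:
        return "byz"
    return "t" if j < i else "s"


def check(f, t):
    """Check the counting claims of the construction for f >= t >= 2."""
    assert f >= t >= 2
    sizes = group_sizes(f, t)
    # |P_1| + ... + |P_5| + |{p}| = 2t + 3(f - 1) + 1 = 3f + 2t - 2
    assert sum(sizes.values()) == 3 * f + 2 * t - 2
    # No execution has more than f Byzantine processes.
    assert all(byzantine_count(i, f, t) <= f for i in range(1, 6))
    return True


if __name__ == "__main__":
    check(3, 2)
    for i in range(1, 6):
        row = [first_round_state(i, j) for j in range(1, 6)]
        print(i, byzantine_count(i, 3, 2), row)
```

For instance, with $f = 3$ and $t = 2$ the partition covers $3f + 2t - 2 = 11$ processes, executions $\rho_1$ and $\rho_5$ have $t = 2$ Byzantine processes, and $\rho_2$, $\rho_3$, $\rho_4$ have exactly $f = 3$.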
Execution $\rho_3$: First, let us look at what happens during the second round of $\rho_2$ and $\rho_4$:
• In $\rho_2$, the Byzantine $P_2$ sends the same messages to $P_1$, $P_4$, and $P_5$ as its honest version does in $\rho_4$;
• In $\rho_4$, the Byzantine $P_4$ sends the same messages to $P_1$, $P_2$, and $P_5$ as its honest version does in $\rho_2$;
• $P_3$ is slow in both, so everybody has the same interaction with $P_3$;
• $p$ is Byzantine, so it is possible for us to assume that $p$ sends the same messages to each $P_i$ during the two executions.

The important point is this: after the second round, we can assume that every non-$P_3$ process has the same state (or, if Byzantine in one of them, can act as if it were in the same state) in the two executions $\rho_2$ and $\rho_4$. More precisely, we can continue both executions (as long as $P_3$'s messages do not reach anyone) as if the processes in $P_1 \cup P_2 \cup P_4 \cup P_5$ were the only correct processes, starting from some state $\sigma$ after completing their second step.

We now construct $\rho_3$. Recall that $\{p\} \cup P_3$ are Byzantine. We let all processes in $P_3$ crash permanently at the end of the first round. The key idea with $\rho_3$ is that, by the discussion we just had, we can assume that after the second round its (honest) processes $P_1 \cup P_2 \cup P_4 \cup P_5$ are in the same group state $\sigma$ as their counterparts in $\rho_2$ and $\rho_4$. Since there are only $f$ Byzantine processes in total, by the liveness property of consensus, there must exist some execution that obtains consensus at some moment $T$. We pick this execution to be $\rho_3$.

Executions $\rho_2$ and $\rho_4$, later rounds: We are now finally ready to complete executions $\rho_2$ and $\rho_4$. Since they are symmetric, we start by looking at $\rho_2$. We have specified what happens in $\rho_2$ up through the second round. Now, for the future rounds, emulate $\rho_3$ until the moment $T$, keeping $P_3$ silent. This execution is identical to $\rho_3$, so all the correct processes (in particular, all processes in $P_1$) will decide. This means $\rho_2 \overset{P_1}{\sim} \rho_3$. By a symmetric argument, $\rho_3 \overset{P_5}{\sim} \rho_4$.
We have now given a similarity between each adjacent pair of executions, so we have obtained a contradiction. □

While $n = 3f + 2t + 1$ […] proposers, acceptors, and learners. Proposers are “leaders” and they are responsible for choosing a safe value and sending it to acceptors. Acceptors store the proposed values and help the new leader to choose a safe value in case the previous leader crashes. Finally, learners are the processes that trigger the Decide callback and use the decided value (e.g., they can execute replicated state machine commands). In this model, the consensus problem requires all learners to decide the same value. The Byzantine version of Paxos [16] requires the presence of at least one correct proposer and $n = 3f + 1$ acceptors, where $f$ is the possible number of Byzantine faults among acceptors.

In our algorithm, when a correct leader (proposer) sees that some prior leader equivocated, it uses this fact to exclude one acceptor from consideration as it is provably Byzantine. This trick only works when the set of proposers is a subset of the set of acceptors. Moreover, this trick seems to be crucial for achieving the optimal resilience ($n = \max\{3f + 2t - 1, 3f + 1\}$). When the set of proposers is disjoint from the set of acceptors, or even if there is just one proposer that is not an acceptor, it can be shown that $n = 3f + 2t + 1$ […] $n = 3f + 2t + 1$ […] $p$ is no longer an acceptor. Hence, we are left with only 5 groups of acceptors ($P_1, \dots, P_5$) instead of 6 ($\{p\}, P_1, \dots, P_5$). Second, the groups of acceptors $P_2$, $P_3$, and $P_4$ can now be of size $f$ instead of $f - 1$ (as $p$ is no longer counted towards the quota of $f$ Byzantine acceptors). After these two modifications, the proof shows that there is no two-step consensus protocol with $n = |P_1| + \dots + |P_5| = 3f + 2t$ or fewer acceptors.

To the best of our knowledge, Kursawe [14] was the first to implement a fast (two-step) Byzantine consensus protocol. The protocol is able to run with $n = 3f + 1$ […] processes follow the protocol and the network is synchronous.
Otherwise, it falls back to a randomized asynchronous consensus protocol.

Martin and Alvisi [19] present FaB Paxos, a fast Byzantine consensus protocol with $n = 5f + 1$. Moreover, they present a parameterized version of the protocol: it runs on $n = 3f + 2t + 1$ processes ($t \le f$), tolerates $f$ Byzantine failures, and is able to commit after just two steps in the common case when the leader is correct, the network is synchronous, and at most $t$ processes are Byzantine. In the same paper, the authors claim that $n = 3f + 2t + 1$ […] $f$ failures, the algorithm needs $5f +$ […] $f$ Byzantine failures with only $3f +$ […]

ACKNOWLEDGMENTS
The authors are grateful to Jean-Philippe Martin and Lorenzo Alvisi for helpful discussions on their work [19].
REFERENCES
[1] Ittai Abraham, Guy Gueta, Dahlia Malkhi, Lorenzo Alvisi, Rama Kotla, and Jean-Philippe Martin. 2017. Revisiting fast practical byzantine fault tolerance. arXiv preprint arXiv:1712.01367 (2017).
[2] Ittai Abraham, Guy Gueta, Dahlia Malkhi, and Jean-Philippe Martin. 2018. Revisiting fast practical byzantine fault tolerance: Thelma, velma, and zelma. arXiv preprint arXiv:1801.10022 (2018).
[3] Elli Androulaki, Artem Barger, Vita Bortnikov, Christian Cachin, Konstantinos Christidis, Angelo De Caro, David Enyeart, Christopher Ferris, Gennady Laventman, Yacov Manevich, et al. 2018. Hyperledger fabric: a distributed operating system for permissioned blockchains. In Proceedings of the thirteenth EuroSys conference. 1–15.
[4] Alysson Bessani, Joao Sousa, and Eduardo EP Alchieri. 2014. State machine replication for the masses with BFT-SMART. In . IEEE, 355–362.
[5] Manuel Bravo, Gregory Chockler, and Alexey Gotsman. 2020. Making Byzantine consensus live. In . Schloss Dagstuhl-Leibniz-Zentrum für Informatik.
[6] Ethan Buchman. 2016. Tendermint: Byzantine fault tolerance in the age of blockchains. Ph.D. Dissertation.
[7] Mike Burrows. 2006. The Chubby lock service for loosely-coupled distributed systems. In Proceedings of the 7th symposium on Operating systems design and implementation. 335–350.
[8] Miguel Castro, Barbara Liskov, et al. 1999. Practical byzantine fault tolerance. In OSDI, Vol. 99. 173–186.
[9] Allen Clement, Manos Kapritsos, Sangmin Lee, Yang Wang, Lorenzo Alvisi, Mike Dahlin, and Taylor Riche. 2009. Upright cluster services. In Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles. 277–290.
[10] Sisi Duan, Sean Peisert, and Karl N Levitt. 2014. hBFT: speculative Byzantine fault tolerance with minimum cost. IEEE Transactions on Dependable and Secure Computing 12, 1 (2014), 58–70.
[11] Guy Golan Gueta, Ittai Abraham, Shelly Grossman, Dahlia Malkhi, Benny Pinkas, Michael Reiter, Dragos-Adrian Seredinschi, Orr Tamir, and Alin Tomescu. 2019. Sbft: a scalable and decentralized trust infrastructure. In . IEEE, 568–580.
[12] Patrick Hunt, Mahadev Konar, Flavio Paiva Junqueira, and Benjamin Reed. 2010. ZooKeeper: Wait-free Coordination for Internet-scale Systems. In USENIX annual technical conference, Vol. 8.
[13] Ramakrishna Kotla, Lorenzo Alvisi, Mike Dahlin, Allen Clement, and Edmund Wong. 2007. Zyzzyva: speculative byzantine fault tolerance. In Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles. 45–58.
[14] Klaus Kursawe. 2002. Optimistic byzantine agreement. In . IEEE, 262–267.
[15] Leslie Lamport. 1978. Time, Clocks, and the Ordering of Events in a Distributed System. Communications (1978).
[16] Leslie Lamport. 2011. Byzantizing Paxos by refinement. In International Symposium on Distributed Computing. Springer, 211–224.
[17] Leslie Lamport et al. 2001. Paxos made simple. ACM Sigact News 32, 4 (2001), 18–25.
[18] Barbara Liskov and James Cowling. 2012. Viewstamped replication revisited. (2012).
[19] J-P Martin and Lorenzo Alvisi. 2006. Fast byzantine consensus. IEEE Transactions on Dependable and Secure Computing 3, 3 (2006), 202–215.
[20] Oded Naor and Idit Keidar. 2020. Expected linear round synchronization: The missing link for linear Byzantine SMR. arXiv preprint arXiv:2002.07539 (2020).
[21] Brian M Oki and Barbara H Liskov. 1988. Viewstamped replication: A new primary copy method to support highly-available distributed systems. In Proceedings of the seventh annual ACM Symposium on Principles of distributed computing. 8–17.
[22] Marshall Pease, Robert Shostak, and Leslie Lamport. 1980. Reaching agreement in the presence of faults. Journal of the ACM (JACM) 27, 2 (1980), 228–234.
[23] Fred B Schneider. 1990. Implementing fault-tolerant services using the state machine approach: A tutorial. ACM Computing Surveys (CSUR) 22, 4 (1990), 299–319.
[24] Nibesh Shrestha, Mohan Kumar, and SiSi Duan. 2019. Revisiting hbft: Speculative byzantine fault tolerance with minimum cost. arXiv preprint arXiv:1902.08505 (2019).
[25] Yee Jiun Song and Robbert van Renesse. 2008. Bosco: One-step byzantine asynchronous consensus. In International Symposium on Distributed Computing. Springer, 438–450.
[26] Maofan Yin, Dahlia Malkhi, Michael K Reiter, Guy Golan Gueta, and Ittai Abraham. 2019. Hotstuff: Bft consensus with linearity and responsiveness. In