Tame the Wild with Byzantine Linearizability: Reliable Broadcast, Snapshots, and Asset Transfer
SHIR COHEN,
Technion, Israel
IDIT KEIDAR,
Technion, Israel
We formalize Byzantine linearizability, a correctness condition that specifies whether a concurrent object with a sequential specification is resilient against Byzantine failures. Using this definition, we systematically study Byzantine-tolerant emulations of various objects from registers. We focus on three useful objects: reliable broadcast, atomic snapshot, and asset transfer. We prove that there is an f-resilient implementation of such objects from registers with n processes if and only if f < n/2.

Over the last decade, cryptocurrencies have taken the world by storm. The idea of a decentralized bank, independent of personal motives, has gained momentum, and systems like Bitcoin [16], Ethereum [18], and Diem [6] now play a big part in the world's economy. At the core of all of these systems lies the asset transfer problem. In this problem, there are multiple accounts, operated by processes that wish to transfer assets between accounts. This environment raises the need to tolerate the malicious behavior of processes that wish to sabotage the system.

In this work, we consider the shared memory model, which has been somewhat neglected in the Byzantine discussion. We believe that the shared memory model offers an intuitive formulation of the abstract service offered by blockchains. It is well known that it is possible to implement reliable read-write shared memory registers via message passing even if a fraction of the servers are Byzantine [1, 3, 12, 14, 17]. As a result, as long as the processes using the service are not malicious, any fault-tolerant object that can be constructed using registers can also be implemented in the presence of Byzantine servers. However, it is not clear what can be done with such objects when they are used by Byzantine processes. In this work, we answer this question. In Section 4 we define
Byzantine linearizability, a correctness condition applicable to any shared memory object with a sequential specification. We then systematically study the feasibility of implementing various Byzantine linearizable objects from registers.

We observe that existing Byzantine fault-tolerant shared memory constructions [1, 13, 15] in fact implement Byzantine linearizable registers. Such registers are the starting point of our study. When trying to implement more complex objects (e.g., snapshots and asset transfer) using registers, constructions that work in the crash-failure model no longer work when Byzantine processes are involved, and new algorithms, or impossibility results, are needed.

As our first result, we prove in Section 5 that an asset transfer object used by Byzantine processes does not have a wait-free implementation, even when its specification is reduced to support only transfer operations (without reading processes' balances). Furthermore, it cannot be implemented without a majority of correct processes constantly taking steps. Since asset transfer has implementations from both reliable broadcast [5] and snapshots [11], these lower bounds also apply to them. In Section 6, we present a Byzantine linearizable reliable broadcast algorithm with resilience f < n/2, proving that the bound on the resilience is tight. We are not familiar with any previous shared memory constructions of reliable broadcast from registers. Finally, in Section 7 we present a Byzantine linearizable snapshot with the same resilience. In contrast, previous constructions of Byzantine lattice agreement, which can be directly constructed from snapshot [4], required 3f + 1 processes to tolerate f failures.

All in all, we establish a tight bound on the resilience of emulations of three useful shared memory objects from registers. On the one hand, we show that it is impossible to obtain wait-free solutions as in the non-Byzantine model,
and on the other hand, unlike previous works, our solutions do not require n > 3f. Taken jointly, our results yield the following theorem:

Theorem 1. There exist f-resilient Byzantine linearizable implementations of reliable broadcast, snapshot, and asset transfer objects with n processes if and only if f < n/2.

Although the construction of SWMR registers in message passing systems requires n > 3f servers, our improved resilience applies to clients, which are normally less reliable than servers, particularly in the so-called permissioned model, where servers are trusted and clients are ephemeral.

In summary, we make the following contributions:

• Formalizing Byzantine linearizability for any object with a sequential specification.
• Proving that some of the most useful building blocks in distributed computing, such as atomic snapshot and reliable broadcast, do not have f-resilient implementations from SWMR registers when f ≥ n/2 processes are Byzantine.
• Presenting Byzantine linearizable implementations of a reliable broadcast object and a snapshot object with optimal resilience.
To the best of our knowledge, there is no known Byzantine linearizable implementation of reliable broadcast in the literature. Given such an object, there are known implementations of lattice agreement [10, 19], which resembles a snapshot object. However, these constructions require n = 3f + 1 processes. In our work, we present both Byzantine linearizable reliable broadcast and a Byzantine snapshot (from which Byzantine lattice agreement can be constructed [4]), with resilience n = 2f + 1.

The asset transfer object we discuss in this paper was introduced by Guerraoui et al. [11]. Their work formalizes the notion of a cryptocurrency [16]. The highlight of their work is the observation that the asset transfer problem can be solved without consensus. It is enough to maintain a partial order of transactions in the system, and in particular, every process can record its own transactions. They present a wait-free linearizable implementation of asset transfer in crash-failure shared memory, taking advantage of an atomic snapshot object. Thus, we can use their solution, together with our Byzantine snapshot, to solve Byzantine linearizable asset transfer with n = 2f + 1.

In addition, Guerraoui et al. present a Byzantine-tolerant solution in the message passing model. This algorithm utilizes reliable broadcast, where dependencies of transactions are explicitly broadcast along with the transactions. This solution does not translate to a Byzantine linearizable one, but rather to a sequentially consistent asset transfer object. In particular, reads can return old (superseded) values, and transfers may fail due to outdated balance reads.

Finally, recent work by Auvolat et al. [5] continues this line of work. They show that a FIFO order between each pair of processes is sufficient to solve the asset transfer problem.
This is because transfer operations can be executed once a process's balance becomes sufficient to perform a transaction, and there is no need to wait for all causally preceding transactions. However, as a result, their algorithm is not sequentially consistent, or even causally consistent for that matter. For example, assume process p maintains an invariant that its balance is always at least 10, and performs a transfer with amount 5 after another process deposits 5 into its account, increasing its balance to 15. Using the protocol in [5], another process might observe p's balance as 5 if it sees p's outgoing transfer before the causally preceding deposit. Because our solution is Byzantine linearizable, such anomalies are prevented.

We study a distributed system in the shared memory model. Our system consists of a well-known static set Π = {1, . . . , n} of asynchronous processes. These processes have access to some shared memory objects. In the shared memory model, all communication between processes is done through the API exposed by the objects in the system: processes invoke operations that, in turn, return some response to the process. In this work, we assume a reliable shared memory. (Previous works have presented constructions of such reliable shared memory in the message passing model [1, 3, 12, 14, 17].) We further assume an adversary that may adaptively corrupt up to f processes in the course of a run. When the adversary corrupts a process, it is defined as Byzantine and may deviate arbitrarily from the protocol. As long as a process is not corrupted by the adversary, it is correct, follows the protocol, and takes infinitely many steps. In particular, it continues to invoke the object's API infinitely often.

We enrich the model with a public key infrastructure (PKI).
That is, every process is equipped with a public-private key pair used to sign data and verify signatures of other processes. We denote a value v signed by process i as ⟨v⟩_i.

Histories.
An execution of a concurrent protocol consists of operations called by processes. Operations start with an invocation event and end with a response event. Operations take time; invocations and responses mark discrete points in time. These points form the history of an execution. A sub-history of a history H is a sub-sequence of the events of H. A history H is sequential if it begins with an invocation and each invocation, except possibly the last, is immediately followed by a matching response. Operation op is pending in a history H if op is invoked in H but does not have a matching response event.

A history defines a partial order on operations: operation op1 precedes op2 in history H, denoted op1 ≺_H op2, if the response event of op1 precedes the invocation event of op2 in H. Two operations are concurrent if neither precedes the other. In addition, a history is well formed if for each process i, the sub-sequence of i's events in H, denoted H|_i, is sequential.

Linearizability.
A popular correctness condition for concurrent objects is linearizability, which is defined with respect to an object's sequential specification. A linearization of a concurrent history H of object o is a sequential history H′ such that (1) after removing some pending operations from H and completing others by adding matching responses, it contains the same invocations and responses as H′, (2) H′ preserves the partial order ≺_H, and (3) H′ satisfies o's sequential specification.

f-resilient. An algorithm is f-resilient if, as long as at most f processes fail, every correct process eventually returns from each operation it invokes. A wait-free algorithm is a special case where f = n − 1.

Single Writer Multiple Readers Register.
The basic building block in shared memory is a single writer multiple readers (SWMR) register that exposes read and write operations. Such registers are used to construct more complicated objects. The sequential specification of an SWMR register states that every read operation from register R returns the value last written to R. Note that this specification is meaningful only if the register's writer is not Byzantine: a Byzantine process can write arbitrary values to its registers.

Asset Transfer Object.
In [11], the asset transfer problem is formulated as a sequential object type, called
Asset Transfer Object. The asset transfer object maintains a mapping from processes in the system to their balances. Initially, the mapping contains the initial balances of all processes. The object exposes a transfer operation, transfer(src,dst,amount), which can be invoked by process src (only). It withdraws amount from process src's account and deposits it in process dst's account, provided that src's balance was at least amount. It returns a boolean that states whether the transfer was successful (i.e., src had amount to spend). In addition, the object exposes a read(i) operation that returns the current balance of process i. (The definition in [11] allows processes to own multiple accounts. For simplicity, we assume a single account per process in this work.)

In this section we define Byzantine linearizability. Intuitively, we would like to tame Byzantine behavior in a way that provides consistency to correct processes. We linearize the correct processes' operations and offer a degree of freedom to embed additional operations by Byzantine processes.

We denote by H|_correct the projection of a history H to all correct processes. We say that a history H is Byzantine linearizable if H|_correct can be augmented with operations of Byzantine processes such that the completed history is linearizable. In particular, if there are no Byzantine failures, then Byzantine linearizability is simply linearizability. Formally:

Definition 1 (Byzantine Linearizability). A history H is Byzantine linearizable if there exists a history H′ so that H′|_correct = H|_correct and H′ is linearizable.

Similarly to linearizability, we say that an object is Byzantine linearizable if all of its executions are Byzantine linearizable.

Next, we characterize objects for which Byzantine linearizability is meaningful. The most fundamental component in shared memory is read-write registers.
Not surprisingly, such registers, whether single-writer or multi-writer, are de facto Byzantine linearizable without any changes. This is because before every read from a Byzantine process's register invoked by a correct process, one can add a corresponding Byzantine write.

In practice, multiple writers multiple readers (MWMR) registers are useless in a Byzantine environment, as an adversary that controls the scheduler can prevent any communication between correct processes. SWMR registers, however, are still useful for constructing more meaningful objects. Nevertheless, the constructions used in the crash-failure model for linearizable objects do not preserve this property. For instance, if we allow Byzantine processes to run a classic atomic snapshot algorithm [2] using Byzantine linearizable SWMR registers, it will not result in a Byzantine linearizable snapshot object.
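The observation that registers are de facto Byzantine linearizable can be made concrete with a small sketch (our own illustration, not code from the paper): given the values that correct processes' reads returned from a Byzantine writer's register, we explicitly build the augmented history H′ by placing a Byzantine write immediately before each read, and then check H′ against the register's sequential specification.

```python
def byzantine_linearize_register(reads):
    """reads: values returned, in real-time order, by correct processes'
    reads of a Byzantine writer's SWMR register. Returns a sequential
    history in which each read is immediately preceded by a matching
    Byzantine write, witnessing Byzantine linearizability."""
    history = []
    for v in reads:
        history.append(('byz_write', v))  # augmented Byzantine operation
        history.append(('read', v))
    return history

def legal_register_history(history, initial=None):
    """Sequential specification of a register: every read returns the
    value last written."""
    value = initial
    for (op, v) in history:
        if op == 'byz_write':
            value = v
        elif v != value:
            return False
    return True
```

However the Byzantine writer behaves, any sequence of read results passes this check once the writes are inserted, which is exactly why registers need no further machinery, while richer objects do.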
Relationship to Other Correctness Conditions.
Byzantine linearizability provides a simple and intuitive way to capture Byzantine behavior in the shared memory model. We now examine the relationship of Byzantine linearizability with previously suggested correctness conditions involving Byzantine processes. Some works have defined linearization conditions for specific objects, including conditions for SWMR registers [15], a distributed ledger [9], and asset transfer [5]. Our condition coincides with these definitions for the specific objects and thus generalizes all of them. Liskov and Rodrigues [13] presented a correctness condition with additional restrictions. Their correctness notion relies on the idea that Byzantine processes are eventually detected and removed from the system, and focuses on converging to correct system behavior after their departure. While this model is a good fit when the threat model is software bugs or malicious intrusions, it is less appropriate for settings like cryptocurrencies, where Byzantine behavior cannot be expected to eventually stop.

In shared memory, one typically aims for wait-free objects, which tolerate any number of process failures. Indeed, many useful objects have wait-free implementations from SWMR registers in the non-Byzantine case. This includes reliable broadcast, snapshots, and, as recently shown, also asset transfer. We now show that in the Byzantine case, wait-free implementations of these objects are impossible. Moreover, a majority of correct processes is required.
Theorem 2. For any f > 1, there does not exist a Byzantine linearizable implementation of asset transfer that supports only transfer operations in a system with n ≤ 2f processes, f of which can be Byzantine, using only SWMR registers.

Proof. Assume by contradiction that there is such an algorithm, and consider a system with n = 2f processes. Partition Π as follows: Π = A ∪ B ∪ {p, q}, where |A| = f − 1, |B| = f − 1, A ∩ B = ∅, and p, q ∉ A ∪ B. By assumption, |A| ≥ 1; let z be a process in A. Also, by assumption, |B| ≥ 1. The initial balance of all processes but z is 0, and the initial balance of z is 1. We construct four executions, E1–E4, as shown in Figure 1.

Let E1 be an execution where only the processes in A ∪ {p} take steps. First, z performs transfer(z,p,1). Since up to f processes may be faulty, the operation completes, and by the object's sequential specification, it is successful (returns true). Then, p performs a transfer of amount 1. By f-resilience and linearizability, this operation also completes successfully. Note that in E1 no process is actually faulty, but because of f-resilience, progress is achieved while f processes are silent. Similarly, let E2 be an execution where only the processes in A ∪ {q} take steps and all are correct: z performs transfer(z,q,1), followed by q performing a transfer of amount 1.

We now construct E3, where all processes in A ∪ {p} are Byzantine. We first run E1; call the time when it ends t1. At this point, all processes in A ∪ {p} restore their registers to their initial states. Note that no other processes took steps during E1, hence the entire shared memory is now in its initial state. Then, we execute E2. Because we have reset the memory to its initial state, the operations execute the same way. When E2 completes, the processes in A \ {z} restore their registers to their state at time t1.
At this point, the state of z and q is the same as it was at the end of E2, the state of the processes in A \ {z} is the same as it was at the end of E1, and the processes in B are all in their initial states.

We construct E4, where all processes in A ∪ {q} are Byzantine, by executing E2, having all processes in A ∪ {q} reset their memory, executing E1, and then having z and q restore their registers to their state at the end of E2. At this point, the state of z and q is the same as it was at the end of E2, the state of the processes in A \ {z} is the same as it was at the end of E1, and the processes in B are all in their initial states.

We observe that for the processes in B, the configurations at the end of E3 and E4 are indistinguishable, as they did not take any steps and the global memory is the same. By f-resilience, in both E3 and E4, the processes in B, together with one of {p, q}, should be able to make progress at the end of each of these runs. We extend the runs by having p and q invoke transfers of amount 1 to each other. In both runs, the processes in B ∪ {p, q} help them make progress. In E3, p behaves as if it were a correct process whose local state is the same as at the end of E1, and in E4, q behaves as if it were a correct process whose local state is the same as at the end of E2. Thus, E3 and E4 are indistinguishable to all correct processes, and as a result, p and q act the same in both runs. However, by safety, exactly one of their transfers should succeed. In E3, q is correct and transfer(z,q,1) succeeds, allowing q to transfer 1 and disallowing the transfer from p, whereas in E4 the opposite is true. This is a contradiction. □

Guerraoui et al. [11] use an atomic snapshot to implement an asset transfer object in the crash-fault shared memory model. In addition, they handle Byzantine processes in the message-passing model by taking advantage of reliable

Fig. 1.
An asset transfer object does not have an f-resilient implementation for n ≤ 2f.

broadcast. Their atomic snapshot-based asset transfer is wait-free and linearizable, and thus it is Byzantine linearizable assuming a Byzantine linearizable snapshot. Their reliable broadcast-based algorithm, on the other hand, is not linearizable, and therefore not Byzantine linearizable even when using Byzantine linearizable reliable broadcast. Nonetheless, Auvolat et al. [5] have used reliable broadcast to construct an asset transfer object where transfer operations are linearizable (although reads are not).

We note that our lower bound holds for an asset transfer object without read operations. This discussion leads us to the following corollary:

Corollary 1. For any f > 1, there does not exist an f-resilient Byzantine linearizable implementation of an atomic snapshot or reliable broadcast in a system with f ≥ n/2 Byzantine processes using only SWMR registers.
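For reference, the sequential specification of the asset transfer object that these results target can be sketched as follows (our own Python rendering of the object type from [11], assuming a single account per process, as above):

```python
class AssetTransfer:
    """Sequential specification of the asset transfer object of [11]:
    a mapping from processes to balances, with transfer and read."""

    def __init__(self, initial_balances):
        self.balance = dict(initial_balances)

    def transfer(self, src, dst, amount):
        # Invoked by src only; succeeds iff src's balance covers amount.
        if self.balance[src] < amount:
            return False
        self.balance[src] -= amount
        self.balance[dst] += amount
        return True

    def read(self, i):
        # Returns the current balance of process i.
        return self.balance[i]
```

The lower bound above concerns implementations of exactly this object (even with read removed); the sketch only fixes the sequential behavior against which histories are linearized.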
Furthermore, we prove in the following corollary that in order to provide f-resilience, it is required that at least a majority of the correct processes take steps infinitely often.

Corollary 2. For any f > 1, there does not exist an f-resilient Byzantine linearizable implementation of asset transfer in a system with n = 2f + 1 processes, f of which can be Byzantine, using only SWMR registers, if some correct process does not take steps infinitely often.

Proof. Assume by way of contradiction that there exists an f-resilient Byzantine linearizable implementation of asset transfer in a system with n = 2f + 1 processes where there is a correct process c that does not take steps infinitely often. Thus, there is a point t in any execution such that from time t, c does not take any steps. Starting at t, the implementation is equivalent to one in a system with n = 2f processes, f of which may be Byzantine. This is a contradiction to Theorem 2. □

With the acknowledgment that not all is possible, we seek Byzantine linearizable objects that are useful even without a wait-free implementation. One such practical object is reliable broadcast. We already proved in the previous section that it does not have an f-resilient Byzantine linearizable implementation for n ≤ 2f. In this section we provide an implementation for n > 2f, which tolerates f < n/2 faults.

The reliable broadcast object exposes two operations, broadcast(ts,m) and deliver(j,ts). Its classic definition, given for message passing systems [8], requires the following properties:

• Validity: if a correct process broadcasts m, then all correct processes eventually deliver m.
• Agreement: if a correct process delivers m, then all correct processes eventually deliver m.
• Integrity: no process delivers two different messages for the same (ts, j), and if j is correct, then only messages m previously broadcast by j are delivered.

In the shared memory model, we define the sequential specification of reliable broadcast as follows:

Definition 2. A reliable broadcast object exposes two operations, broadcast(ts,v) and deliver(j,ts). deliver(j,ts) returns the value v of the first broadcast(ts,v) invoked by process j before it is called. If j did not invoke broadcast before the deliver, then it returns ⊥.

Whereas in message passing systems reliable broadcast works in a push fashion, where the receipt of a message triggers action at its destination, in the shared memory model processes need to actively pull information from the registers. But if all messages are eventually pulled, the reliable broadcast properties are achieved, as proven in the following lemma.
Lemma 1. A Byzantine linearization of a reliable broadcast object satisfies the three properties of reliable broadcast.

Proof. If a correct process broadcasts m, and all messages are subsequently pulled, then according to Definition 2, all correct processes deliver m, providing validity. For agreement, if a correct process invokes deliver(j,ts) and it returns m, and all messages are later pulled by all correct processes, it follows that all correct processes also invoke deliver(j,ts) and eventually return some m′ ≠ ⊥. Since deliver(j,ts) returns the value v of the first broadcast(ts,v) invoked by process j before it is called, and there is only one first broadcast, we get that m = m′. Lastly, if deliver(j,ts) returns m, then by the specification, j previously invoked broadcast(ts,m). □

In our implementation (given in Algorithm 1), each process has four SWMR registers: send, echo, ready, and deliver, to which we refer as the stages of the broadcast. We follow concepts from Bracha's implementation in the message passing model [7] but leverage the shared memory to improve its resilience from 3f + 1 to 2f + 1. The basic idea is that a process that wishes to broadcast value v writes it in its send register and returns only when it reaches the deliver stage. Throughout the run, processes infinitely often call a refresh function whose role is to help the progress of the system. When refreshing, processes read all registers and help promote broadcast values through the four stages. For a value to be delivered, it has to have been seen and signed by f + 1 processes at the ready stage. Before promoting a value to the ready or deliver stage, a correct process i performs a "double-collect" of the echo registers. Namely, after collecting f + 1 signatures on a value in ready registers, meaning that it was previously written in the echo register of at least one correct process, i re-reads all echo registers to verify that there does not exist a conflicting value (with the same timestamp and sender).
Using this method, concurrent deliver operations "see" each other, and delivery of conflicting values broadcast by a Byzantine process is prevented. Before delivering a value, a process writes it to its deliver register with f + 1 signatures. Once one correct process delivers a value, subsequent deliver calls can witness the f + 1 signatures and copy this value directly from its deliver register.

We make two assumptions on the correct usage of our algorithm. The first is inherently required, as shown in Corollary 2:

Algorithm 1
Shared Memory Bracha: code for process i

shared SWMR registers: send_i, echo_i, ready_i, deliver_i

1:  procedure conflicting-echo(⟨ts, v⟩_j)
2:      return ∃ w ≠ v, k ∈ Π such that ⟨ts, w⟩_j ∈ echo_k

3:  procedure broadcast(ts, val)
4:      send_i ← ⟨ts, val⟩_i
5:      repeat
6:          m ← deliver(i, ts)
7:      until m ≠ ⊥                        ▷ message is deliverable

8:  procedure deliver(j, ts)
9:      refresh()
10:     if ∃ k ∈ Π and v s.t. ⟨⟨ts, v⟩_j, Σ⟩ ∈ deliver_k, where Σ is a set of f + 1 signatures on ⟨ready, ⟨ts, v⟩_j⟩ then
11:         deliver_i ← deliver_i ∪ {⟨⟨ts, v⟩_j, Σ⟩}
12:         return v
13:     return ⊥

14: procedure refresh()
15:     for j ∈ [n] do
16:         m ← send_j
17:         if ∄ ts, val s.t. m = ⟨ts, val⟩_j then continue   ▷ m is not a signed pair
18:         echo_i ← echo_i ∪ {⟨m⟩_i}
19:         if ¬conflicting-echo(m) then
20:             ready_i ← ready_i ∪ {⟨ready, m⟩_i}
21:         if ∃ P ⊆ Π s.t. |P| ≥ f + 1, ∀ k ∈ P : ⟨ready, m⟩_k ∈ ready_k, and ¬conflicting-echo(m) then
22:             deliver_i ← deliver_i ∪ {⟨m, Σ = {⟨ready, m⟩_k | k ∈ P}⟩}   ▷ Σ is the set of f + 1 signatures

Assumption 1. All correct processes infinitely often invoke methods of the reliable broadcast API.
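To make the flow of Algorithm 1 through the send, echo, ready, and deliver stages tangible, here is a toy, sequential Python simulation (our own sketch, not the paper's code: registers are plain dicts, "signatures" are simulated as (signer, payload) tags rather than real cryptographic signatures, and there is no adversarial scheduler):

```python
# Toy simulation of the Shared Memory Bracha stages, n = 2f + 1.
N, F = 3, 1

send = {i: None for i in range(N)}          # send_i
echo = {i: set() for i in range(N)}         # echo_i: (signer, (ts, val, j))
ready = {i: set() for i in range(N)}        # ready_i: (signer, ('ready', m))
deliver_reg = {i: set() for i in range(N)}  # deliver_i: (m, frozenset of ready sigs)

def conflicting_echo(ts, val, j):
    # True iff some echo register holds a different value for the same (ts, j).
    return any(m[0] == ts and m[2] == j and m[1] != val
               for k in range(N) for (_, m) in echo[k])

def refresh(i):
    for j in range(N):
        if send[j] is None:
            continue
        ts, val = send[j]
        m = (ts, val, j)
        echo[i].add((i, m))                          # echo stage
        if not conflicting_echo(ts, val, j):
            ready[i].add((i, ('ready', m)))          # ready stage
        sigs = frozenset((k, ('ready', m)) for k in range(N)
                         if (k, ('ready', m)) in ready[k])
        if len(sigs) >= F + 1 and not conflicting_echo(ts, val, j):
            deliver_reg[i].add((m, sigs))            # deliver stage

def deliver(i, j, ts):
    refresh(i)
    for k in range(N):
        for (m, sigs) in deliver_reg[k]:
            if m[0] == ts and m[2] == j and len(sigs) >= F + 1:
                deliver_reg[i].add((m, sigs))        # copy witnessed value
                return m[1]
    return None  # stands in for ⊥

def broadcast(i, ts, val):
    send[i] = (ts, val)
    v = deliver(i, i, ts)
    while v is None:
        for k in range(F + 1):   # f + 1 correct processes keep refreshing
            refresh(k)
        v = deliver(i, i, ts)
    return v
```

Running broadcast(0, 1, "hello") promotes the value through all four stages once f + 1 simulated processes have refreshed, after which any process's deliver call returns it; the double-collect appears as the second conflicting_echo check guarding the deliver-stage write.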
The second is a straightforward validity assumption:
Assumption 2. Correct processes do not invoke broadcast(ts,val) twice with the same ts.

We now prove our algorithm's correctness. We first make two simple observations:
Observation 1. If process i is correct and v appears in echo_i or ready_i, it is never deleted.

Observation 2. If process j is correct and ⟨⟨ts, v⟩_j, Σ⟩ appears in deliver_k for any process k, then j previously invoked broadcast(ts, v).

Proof. Since we assume unforgeable signatures, j has previously signed ⟨ts, v⟩. By the code, this is only possible if j invoked broadcast(ts, v). □

We next prove the following lemma, identifying invariants of Algorithm 1.
Fig. 2. Concurrent deliver operations.

Lemma 2. Algorithm 1 satisfies the following invariants:
I1: If ⟨⟨ts, v⟩_j, Σ⟩ (where Σ is a set of f + 1 ready signatures) appears in deliver_k for any processes j, k, then ⟨ready, ⟨ts, v⟩_j⟩_c ∈ ready_c for some correct process c.
I2: If ⟨ready, ⟨ts, v⟩_j⟩_c ∈ ready_c for a correct process c, then ⟨ts, v⟩_j ∈ echo_c.
I3: If ⟨ready, ⟨ts, v⟩_j⟩_c appears in ready_c and ⟨ready, ⟨ts, w⟩_j⟩_c′ appears in ready_c′ for any two correct processes c, c′, then v = w.
I4: If ⟨⟨ts, v⟩_j, Σ⟩ appears in deliver_k and ⟨⟨ts, w⟩_j, Σ′⟩ appears in deliver_k′ for any two correct processes k, k′, then v = w.

Proof.
I1: Since ⟨⟨ts, v⟩_j, Σ⟩ appears in deliver_k and it contains a set of f + 1 signatures on ⟨ready, ⟨ts, v⟩_j⟩, at least one correct process c signed ⟨ready, ⟨ts, v⟩_j⟩ and added it to its ready register. By Observation 1, it is not deleted from the register.
I2: Immediate from the code and Observation 1.
I3: Since ⟨ready, ⟨ts, v⟩_j⟩_c appears in ready_c and c is correct, by I2 at least one correct process signed ⟨ts, v⟩_j and added it to its echo register. Let a be the first correct process to do so, and let t1 be the moment of adding ⟨ts, v⟩_j to echo_a (see Figure 2 for an illustration). By Observation 1, it is not deleted from the register. Similarly, let b be the first correct process to add ⟨ts, w⟩_j to echo_b, at time t2. WLOG, t1 ≥ t2. In addition, let d be the first correct process to add ⟨ready, ⟨ts, v⟩_j⟩ to ready_d, and let t3 be the moment of the addition. By I2 it follows that t3 > t1. By Observation 1, the content of these echo and ready registers is not deleted during the run. By the protocol, at some point in time between t1 and t3, d executes line 19 and reads all echo registers. Let t1 < t∗ < t3 be the time when d reads echo_b. Since t1 ≥ t2, we conclude that t∗ > t2.
Since d does not see a conflicting value in echo_b when executing line 19, we get that v = w.
I4: By I1, at least one correct process signed ⟨ready, ⟨ts, v⟩_j⟩ and added it to its ready register, and at least one correct process signed ⟨ready, ⟨ts, w⟩_j⟩ and added it to its ready register. Thus, by I3, v = w. □

Let us examine an execution E of the algorithm, and let H be the history of E. First, we define H_c to be the history H after removing any pending deliver operations and any pending broadcast operations that did not complete line 11 (which is called from line 6). We define H′ to be an augmentation of H_c|_correct as follows: for every Byzantine process j and a value v such that v is returned by deliver(j,ts) for at least one correct process, we add to H′ a broadcast_j(ts,v) operation that begins and ends immediately before the first correct process adds ⟨⟨ts, v⟩_j, Σ⟩ to its deliver register. Since at least one correct process adds this value at line 11, this moment is well defined. We construct a linearization E′ of H′ by defining the following linearization points:

• Let op be a broadcast_i(ts,v) operation by a correct process i that completed line 11. Note that by the code, every completed broadcast operation completes line 11 exactly once, and operations that do not complete this line are removed from H′. The operation linearizes when ⟨⟨ts, v⟩_i, Σ⟩ is added for the first time to the deliver register of a correct process, which occurs either when i executes line 11 or when another correct process executes line 22 beforehand. By the code, these lines are executed between the invocation and the response of the broadcast procedure.
• Let op be a deliver_i(j,ts) operation by a correct process i that completes line 11 and returns v ≠ ⊥ (note that by the code, every completed deliver operation that returns v ≠ ⊥ completes line 11 exactly once).
If i finds ⟨⟨ts, v⟩_j, Σ⟩ for some value v in some correct process's deliver register at line 10, then the operation linearizes when i first reads ⟨⟨ts, v⟩_j, Σ⟩ from a correct process. Otherwise, it linearizes at line 11, when i copies the data to deliver_i.
• If op is a completed deliver_i(j,ts) operation by a correct process i that returns ⊥, it linearizes at the moment of its invocation.
• Every Byzantine broadcast_j(ts,v) operation by process j linearizes at the moment we added it.

In H′ there are no deliver operations by Byzantine processes. The following lemmas prove that E′, the linearization of H′, satisfies the sequential specification.

Lemma 3. For a given deliver(j,ts) operation that returns v ≠ ⊥, there is at least one preceding broadcast operation in E′ of the form broadcast(ts,v) invoked by process j.

Proof. Let op be a deliver_i(j,ts) operation invoked by a correct process i that returns v ≠ ⊥. Let t be the time when ⟨⟨ts, v⟩_j, Σ⟩ is added for the first time to the deliver register of a correct process (where Σ contains f + 1 ready signatures). If j is correct, then by Observation 2, j previously invoked broadcast(ts,v), and that broadcast linearizes at time t. If j is Byzantine, then broadcast(ts,v) by process j is added to H′ immediately before t. There are two options for the linearization point of op. If i finds ⟨⟨ts, v⟩_j, Σ⟩ in some correct process's deliver register at line 10, then op linearizes when i first reads ⟨⟨ts, v⟩_j, Σ⟩ from a correct process, and thus after time t. Otherwise, it linearizes at line 11, when i copies the data to deliver_i, which is also no earlier than time t. □

Lemma 4. For a broadcast_j(ts,v) in E′, there does not exist a broadcast_j(ts,w) in E′ for v ≠ w.

Proof. If j is a correct process, the proof follows from Assumption 2.
If j is Byzantine, broadcast_j(ts, v) is added immediately before the first correct process adds ⟨⟨ts, v⟩_j, σ⟩ to its delivery register. By I4, no correct process adds ⟨⟨ts, w⟩_j, σ⟩ to its delivery register for v ≠ w, so broadcast_j(ts, w) does not appear in E′. □
Lemma. For a given deliver(j, ts) operation that returns ⊥, there is no preceding broadcast operation in H′ of the form broadcast(ts, v) invoked by process j, for v ≠ ⊥.
Proof. Let o be a deliver(j, ts) operation invoked by a correct process i that returns ⊥. Assume by way of contradiction that there is a preceding broadcast(ts, v) operation in H′ invoked by process j, for v ≠ ⊥. By definition, the broadcast linearizes no later than the first addition of ⟨⟨ts, v⟩_j, σ⟩ to a delivery register of a correct process. Thus, since o linearizes at the moment of its invocation, it sees ⟨⟨ts, v⟩_j, σ⟩ in some process' delivery register and returns v ≠ ⊥, a contradiction. □
Next, we prove f-resilience.
Lemma (Liveness). Every correct process that invokes some operation eventually returns.
Proof. If a correct process i invokes a deliver operation, then by the code it returns in constant time. If it invokes broadcast(ts, v), it copies ⟨ts, v⟩_i to send_i. By Assumption 1, all correct processes infinitely often call the reliable broadcast API, and specifically the refresh procedure; they see ⟨ts, v⟩_i and copy it to their echo registers. As signatures are unforgeable and i is correct, they do not find ⟨ts, w⟩_i for any other w ≠ v in any other echo registers, and they copy a signed ⟨ready, ⟨ts, v⟩_i⟩ to their ready registers. By I1, eventually they all see ⟨ready, ⟨ts, v⟩_i⟩ in n − f ready registers and copy ⟨ts, v⟩_i to their deliver registers.
Eventually, n − f correct processes have ⟨ts, v⟩_i in their deliver registers, and since the signatures are valid, the check at line 10 evaluates to true, and i returns v and finishes the repeat loop. □
We conclude the following theorem:
Theorem. Algorithm 1 implements an f-resilient Byzantine linearizable reliable broadcast object for any f < n/2.

In this section, we utilize a reliable broadcast primitive to construct a Byzantine snapshot object with resilience n > 2f. A snapshot [2] is represented as an array of n shared single-writer variables that can be accessed with two operations: update(v), called by process i, updates the i-th entry in the array, and snapshot returns an array. The sequential specification of an atomic snapshot is as follows: the i-th entry of the array returned by a snapshot invocation contains the value v last updated by an update(v) invoked by process i, or its variable's initial value if no update was invoked. Following Corollary 2, we again must require that correct processes perform operations infinitely often. For simplicity, we require that they invoke infinitely many snapshot operations; if processes invoke either snapshots or updates, we can have each update perform a snapshot and ignore its result.
Assumption 3. All correct processes invoke snapshot operations infinitely often.
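To make the object concrete, the sequential specification above can be sketched in a few lines of Python (the class and method names are ours, not the paper's notation):

```python
# Sequential specification of an n-process single-writer snapshot object:
# update(i, v) overwrites entry i; snapshot() returns the whole array.
# Illustrative sketch of the spec only, not the Byzantine algorithm.

class SeqSnapshot:
    def __init__(self, n, init=None):
        self.mem = [init] * n  # one single-writer entry per process

    def update(self, i, v):
        # called by process i to overwrite its own entry
        self.mem[i] = v

    def snapshot(self):
        # returns a copy of the entire array atomically
        return list(self.mem)
```

After `update(0, 'a')`, every later `snapshot()` shows `'a'` in entry 0, and entries never written still hold their initial value.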
Our pseudo-code is presented in Algorithms 2 and 3. During the algorithm, we compare snapshots using the (partial) coordinate-wise order. That is, let A and B be two n-arrays. We say that A ≥ B if ∀i ∈ [n], A[i].ts ≥ B[i].ts.
Recall that all processes invoke snapshot operations infinitely often. In each snapshot instance, correct processes start by collecting values from all registers and broadcasting their collected arrays in "start" messages (messages with timestamp 0). Then, they repeatedly send the identities of the processes from which they delivered start messages, until there exists a round such that the same set of senders is received from n − f processes in that round. Once the set of senders stabilizes, the snapshot is formed as the supremum of the collects in their start messages.
We achieve optimal resilience by waiting for only n − f processes to send the same set. Although there is not necessarily a correct process in the intersection of two sets of size n − f, we leverage the fact that reliable broadcast prevents equivocation to ensure that, nevertheless, there is a common message in the intersection, so two snapshots obtained in the same round are necessarily identical. Moreover, once one process obtains a snapshot S, any snapshot seen in a later round exceeds S.
Each process i collects values from all processes' registers in a shared variable collect_i. When starting a snapshot operation, each process runs update-collect, where it updates its collect array (line 8) and saves it in a local variable c (line 9). When it does so, it updates the j-th entry to be the highest-timestamped value it observes in the j-th entries of all processes' collect arrays (lines 16–18). Then, it initiates the snapshot-aux procedure with a new auxnum tag. Snapshot-aux returns a snapshot, but not necessarily a "fresh" one that reflects all updates that occurred before snapshot
Algorithm 2
Byzantine Snapshot: code for process i

shared SWMR registers:
    ∀j ∈ [n], collect_i[j] ∈ {⊥} ∪ {ℕ × Vals}, with selectors ts and val, initially ⊥
    ∀k ∈ ℕ, savesnap_i[k] ∈ {⊥} ∪ {array of n Vals × set of messages}, with selectors snap and proof, initially ⊥
local variables:
    ts_i ∈ ℕ, initially 0
    ∀j ∈ [n], dts_i[j] ∈ ℕ, initially 0
    r, auxnum ∈ ℕ, initially 0
    p ∈ [n], initially 1
    ∀j ∈ [n], k ∈ ℕ, seen_i[j][k], senders_i ∈ P(Π), initially ∅
    M, a set of messages, initially ∅

 1: procedure update(v)
 2:   for j ∈ [n] do                        ▷ collect current memory state
 3:     update-collect(collect_j)
 4:   ts_i ← ts_i + 1
 5:   collect_i[i] ← ⟨ts_i, v⟩_i            ▷ update local component of collect
 6: procedure snapshot
 7:   for j ∈ [n] do                        ▷ collect current memory state
 8:     update-collect(collect_j)
 9:   c ← collect_i
10:   repeat
11:     auxnum ← auxnum + 1
12:     snap ← snapshot-aux(auxnum)
13:   until snap ≥ c                        ▷ snapshot is newer than the collected state
14:   return snap
15: procedure update-collect(c)
16:   for j ∈ [n] do
17:     if c[j].ts > collect_i[j].ts and c[j] is signed by j then
18:       collect_i[j] ← c[j]

was invoked. Therefore, snapshot-aux is repeatedly called until it collects a snapshot snap such that snap ≥ c, according to the snapshots' partial order (lines 10–13).
By Assumption 3, and since the auxnum variable at each correct process is increased by 1 every time snapshot-aux is called, all correct processes participate in all instances of snapshot-aux. When a correct process invokes a snapshot-aux procedure with auxnum, it first initiates a new reliable broadcast instance at line 28, dedicated to this instance of snapshot-aux. Then, as another preliminary step, it once again updates its collect array using the update-collect procedure (lines 30–31) and broadcasts it to all processes at line 33. During the execution, a correct process delivers messages from all other processes in a round-robin fashion.
The local variable p represents the process from which it currently delivers. In addition, dts[j] maintains the next timestamp to be delivered from j (lines 37, 38 and 49). Note that if the delivered message at some point is ⊥, dts[j] is not increased, so all of j's messages are delivered in order (line 39).
Snapshot-aux proceeds in rounds, which are reflected in the timestamps of the messages broadcast during its execution. Each correct process starts snapshot-aux at round 0, where it broadcasts its collected array; we refer to this as its start message. It then continues to round r + 1 once it has delivered n − f round-r messages (line 51). Each process maintains
Algorithm 3
Byzantine Snapshot auxiliary procedures: code for process i

19: procedure minimum-saved(auxnum)
20:   T ← {S | ∃j ∈ [n], S = savesnap_j[auxnum].snap and savesnap_j[auxnum].proof is a valid proof of S}
21:   if T = ∅ then
22:     return ⊥
23:   min ← infimum(T)                      ▷ returns the minimum value in each index
24:   savesnap_i[auxnum] ← ⟨min, ⋃_{j ∈ [n]} savesnap_j[auxnum].proof⟩
25:   update-collect(min)
26:   return min
27: procedure snapshot-aux(auxnum)
28:   initiate new reliable broadcast instance
29:   M ← ∅
30:   for j ∈ [n] do                        ▷ collect current memory state
31:     update-collect(collect_j)
32:   senders_i ← {i}                       ▷ start message contains collect
33:   broadcast(0, ⟨collect_i⟩_i)
34:   while true do
35:     minsaved ← minimum-saved(auxnum)    ▷ check if there is a saved snapshot
36:     if minsaved ≠ ⊥ then return minsaved
37:     p ← (p mod n) + 1                   ▷ deliver messages in round robin
38:     m ← deliver(p, dts_i[p])            ▷ deliver next message from p
39:     if m = ⊥ then continue
40:     if dts_i[p] = 0 and m contains a signed collect array c then    ▷ start message (round 0)
41:       M ← M ∪ {m}
42:       update-collect(c)
43:       senders_i ← senders_i ∪ {p}
44:     else if m contains a signed set of processes, jsenders then     ▷ round-r message for r > 0
45:       if jsenders ⊈ senders_i then
46:         continue                        ▷ cannot process message, its dependencies are missing
47:       M ← M ∪ {m}
48:       seen_i[p][dts_i[p]] ← jsenders ∪ seen_i[p][dts_i[p] − 1]
49:     dts_i[p] ← dts_i[p] + 1
50:     if received n − f round-r messages for the first time then
51:       r ← r + 1
52:       broadcast(r, ⟨senders_i⟩_i)
53:     if ∃r s.t. |{j | seen_i[j][r] = senders_i}| ≥ n − f then        ▷ stability condition
54:       S ← senders_i
55:       r ← 0
56:       ∀j ∈ [n], k ∈ ℕ, seen_i[j][k] ← ∅
57:       minsaved ← minimum-saved(auxnum)  ▷ re-check for saved snapshot
58:       if minsaved ≠ ⊥ then return minsaved
59:       savesnap_i[auxnum] ← ⟨collect_i, M⟩   ▷ M contains all messages received in this snapshot-aux instance
60:       return collect_i

a local set senders that contains the processes from which it received start messages (line 43). In every round (from 1 onward), processes send the set of processes from which they received start messages (line 52).
Process i maintains a local map seen[j][r] that maps a process j and a round r to the set of processes that j reported to have received start messages from in rounds 1–r (line 48), but only if i has received start messages from all the reported processes (line 45). By doing so, we ensure that if for some correct process i and a round r, seen_i[j][r] contains a process p, then p is also in senders_i. If this condition is not satisfied, the delivered counter for j (dts[j]) is not increased, and this message will be repeatedly delivered until the condition is satisfied.
Once there is a process i such that there exists a round r and a set P of n − f processes j for which seen_i[j][r] is equal to senders_i, we say that the stability condition at line 53 is satisfied for i. At that time, i and n − f − 1 more processes agree on the collected arrays sent at round 0 by processes in senders_i, and collect_i holds the supremum of those collected arrays. This is because whenever it received a start message, it updated its collect, so that currently collect_i reflects all collects sent by processes in senders_i. Thus, i can return its current collect as the snapshot-aux result.
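The stability test at line 53 can be sketched as follows (a Python illustration with assumed data shapes; message delivery, signatures, and round bookkeeping are elided):

```python
# Stability condition sketch: some round r in which at least n - f
# processes j reported a seen set equal to our own senders set.
# seen[j][r] = union of the jsenders sets reported by j in rounds 1..r.

def stable(seen, senders, n, f):
    rounds = {r for per_j in seen.values() for r in per_j}
    for r in sorted(rounds):
        agree = [j for j, per_j in seen.items() if per_j.get(r) == senders]
        if len(agree) >= n - f:
            return r  # stability reached at round r
    return None  # no round is stable yet

# With n = 5, f = 2: three matching reports in round 1 suffice.
senders = {1, 2, 3}
seen = {1: {1: {1, 2, 3}}, 2: {1: {1, 2, 3}}, 3: {1: {1, 2}}, 4: {1: {1, 2, 3}}}
```

Here `stable(seen, senders, 5, 2)` finds round 1: processes 1, 2 and 4 report exactly `senders`, and 3 ≥ n − f.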
Since reliable broadcast prevents Byzantine processes from equivocating, there are n − f − 1 more processes that broadcast the same senders set in that round, and any future round will "see" this set.
To ensure liveness in case some correct processes complete a snapshot-aux instance before all do, we add a helping mechanism. Whenever a correct process successfully completes snapshot-aux, it stores its result in a savesnap map, with the auxnum as the key (either at line 24 or at line 59). This way, once one correct process returns from snapshot-aux, others can read its result at line 35 and return as well. To prevent Byzantine processes from storing invalid snapshots, each entry in the savesnap map is a tuple of the returned array and a proof of the array's validity. The proof is the set of messages received by the process that stores its array in the current instance of snapshot-aux. Using these messages, correct processes can verify the legitimacy of the stored array. If a correct process reads from savesnap a tuple with an invalid proof, it simply ignores it.
We outline the key correctness arguments, highlighting the main lemmas. Formal proofs of these lemmas appear in Appendix A. To prove our algorithm is Byzantine linearizable, we first show that all returned snapshots are totally ordered (by coordinate-wise order):
Lemma 9. If two snapshot operations invoked by correct processes return S_A and S_B, then S_A ≥ S_B or S_A < S_B.
Based on this order, we define a linearization. Then, we show that our linearization preserves real-time order and respects the sequential specification. We construct the linearization E as follows: First, we linearize all snapshot operations of correct processes in the order of their return values. Then, we linearize every update operation by a correct process immediately before the first snapshot operation that "sees" it. We say that a snapshot returning S sees an update by process i that has timestamp ts if S[i].ts ≥ ts. If multiple updates are linearized to the same point (before the same snapshot), we order them by their start times. Finally, we add updates by Byzantine processes as follows: We add update(v) by a Byzantine process i if there is a linearized snapshot that returns S and S[i].val = v. We add the update immediately before any snapshot that sees it.
We next prove that the linearization respects the sequential specification.
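The construction of E described above can be sketched in Python (input shapes are ours; the tie-breaking of updates by start times and the handling of Byzantine updates are elided):

```python
# Linearization sketch: snapshots ordered by their return values,
# each update placed immediately before the first snapshot that sees it.
# Snapshots are arrays of per-process timestamps; updates are (proc, ts).

def snap_leq(a, b):
    # coordinate-wise order: a <= b iff every timestamp in a is <= in b
    return all(x <= y for x, y in zip(a, b))

def linearize(snapshots, updates):
    # sorting by sum is sound only because the returned snapshots are
    # totally ordered under snap_leq (Lemma 9 in the text)
    order = sorted(snapshots, key=sum)
    events, placed = [], set()
    for s in order:
        for (p, ts) in updates:
            if (p, ts) not in placed and s[p] >= ts:  # s "sees" the update
                events.append(('update', p, ts))
                placed.add((p, ts))
        events.append(('snapshot', tuple(s)))
    return events
```

For instance, `linearize([[1, 0], [1, 1]], [(0, 1), (1, 1)])` places update (0, 1) before snapshot (1, 0), and update (1, 1) just before snapshot (1, 1).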
Lemma 10. The i-th entry of the array returned by a snapshot invocation contains the value v last updated by an update(v) invoked by process i in E, or its variable's initial value if no update was invoked.
Because an update is linearized immediately before some snapshot that sees it, and snapshots are monotonically increasing, all following snapshots see the update as well. Next, we prove, in the two following lemmas, that E preserves the real-time order.
Lemma 11. If a snapshot operation invoked by a correct process p with return value S_p precedes a snapshot operation invoked by a correct process q with return value S_q, then S_p ≤ S_q.
Lemma 12. Let S be the return value of a snapshot operation snap_q invoked by a correct process q. Let update_p(v) be an update operation invoked by a correct process p that writes ⟨ts, v⟩ and completes before snap_q starts. Then, S[p].ts ≥ ts.
It follows from Lemma 12 and the definition of E that if an update precedes a snapshot, it is linearized before it, and from Lemma 11 that if a snapshot precedes a snapshot, it is also linearized before it. The following lemma ensures that if an update precedes another update, it is linearized before it. That is, if a snapshot operation sees the second update, it sees the first one.
Lemma 13. If update1 by process p precedes update2 by process q, and a snapshot operation snap by a correct process sees update2, then snap sees update1 as well.
Finally, the next lemma proves the liveness of our algorithm.
Lemma 14 (Liveness). Every correct process that invokes some operation eventually returns.
We conclude the following theorem:
Theorem. Algorithm 2 implements an f-resilient Byzantine linearizable snapshot object for any f < n/2.
Proof. Lemma 9 shows that there is a total order on snapshot operations. Using this order, we have defined a linearization E that satisfies the sequential specification (Lemma 10). We then proved that E also preserves real-time order (Lemmas 11–13). Thus, Algorithm 2 is Byzantine linearizable. In addition, Lemma 14 proves that Algorithm 2 is f-resilient. □

We have studied shared memory constructions in the presence of Byzantine processes. To this end, we have defined Byzantine linearizability, a correctness condition suitable for shared memory algorithms that can tolerate Byzantine behavior. We then used this notion to present both upper and lower bounds on some of the most fundamental components in distributed computing.
We proved that atomic snapshot, reliable broadcast, and asset transfer are all problems that do not have f-resilient emulations from registers when n ≤ 2f. On the other hand, we have presented an algorithm for Byzantine linearizable reliable broadcast with resilience n > 2f. We then used it to implement a Byzantine snapshot with the same resilience. Among other applications, this Byzantine snapshot can be utilized to provide a Byzantine linearizable asset transfer. Thus, we proved a tight bound on the resilience of emulations of asset transfer, snapshot, and reliable broadcast.
Our paper deals with feasibility results and does not focus on complexity measures. In particular, we assume unbounded storage in our constructions. We leave the subject of efficiency as an open question for future work.

REFERENCES
[1] Ittai Abraham, Gregory Chockler, Idit Keidar, and Dahlia Malkhi. 2006. Byzantine disk paxos: optimal resilience with byzantine shared memory.
Distributed Computing 18, 5 (2006), 387–408.
[2] Yehuda Afek, Hagit Attiya, Danny Dolev, Eli Gafni, Michael Merritt, and Nir Shavit. 1993. Atomic snapshots of shared memory. Journal of the ACM (JACM) 40, 4 (1993), 873–890.
[3] Yehuda Afek, David S. Greenberg, Michael Merritt, and Gadi Taubenfeld. 1995. Computing with faulty shared objects. Journal of the ACM (JACM) 42, 6 (1995), 1231–1274.
[4] Hagit Attiya, Maurice Herlihy, and Ophir Rachman. 1992. Efficient atomic snapshots using lattice agreement. In International Workshop on Distributed Algorithms. Springer, 35–53.
[5] Alex Auvolat, Davide Frey, Michel Raynal, and François Taïani. 2020. Money Transfer Made Simple: a Specification, a Generic Algorithm, and its Proof. Bulletin of EATCS 3, 132 (2020).
[6] Mathieu Baudet, Avery Ching, Andrey Chursin, George Danezis, François Garillot, Zekun Li, Dahlia Malkhi, Oded Naor, Dmitri Perelman, and Alberto Sonnino. 2019. State machine replication in the Libra blockchain. The Libra Assn., Tech. Rep. (2019).
[7] Gabriel Bracha. 1987. Asynchronous Byzantine agreement protocols. Information and Computation 75, 2 (1987), 130–143.
[8] Christian Cachin, Rachid Guerraoui, and Luís Rodrigues. 2011. Introduction to Reliable and Secure Distributed Programming. Springer Science & Business Media.
[9] Vicent Cholvi, Antonio Fernandez Anta, Chryssis Georgiou, Nicolas Nicolaou, and Michel Raynal. 2020. Atomic Appends in Asynchronous Byzantine Distributed Ledgers. In . IEEE, 77–84.
[10] Giuseppe Antonio Di Luna, Emmanuelle Anceaume, and Leonardo Querzoni. 2020. Byzantine generalized lattice agreement. In . IEEE, 674–683.
[11] Rachid Guerraoui, Petr Kuznetsov, Matteo Monti, Matej Pavlovič, and Dragos-Adrian Seredinschi. 2019. The consensus number of a cryptocurrency. In Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing. 307–316.
[12] Prasad Jayanti, Tushar Deepak Chandra, and Sam Toueg. 1998. Fault-tolerant wait-free shared objects. Journal of the ACM (JACM) 45, 3 (1998), 451–500.
[13] Barbara Liskov and Rodrigo Rodrigues. 2005. Byzantine clients rendered harmless. In International Symposium on Distributed Computing. Springer, 487–489.
[14] Jean-Philippe Martin, Lorenzo Alvisi, and Michael Dahlin. 2002. Minimal byzantine storage. In International Symposium on Distributed Computing. Springer, 311–325.
[15] Achour Mostéfaoui, Matoula Petrolia, Michel Raynal, and Claude Jard. 2017. Atomic read/write memory in signature-free byzantine asynchronous message-passing systems. Theory of Computing Systems 60, 4 (2017), 677–694.
[16] Satoshi Nakamoto. 2009. Bitcoin: A peer-to-peer electronic cash system. Technical Report. Manubot.
[17] Rodrigo Rodrigues and Barbara Liskov. 2003. Rosebud: A scalable byzantine-fault-tolerant storage architecture. (2003).
[18] Gavin Wood et al. 2014. Ethereum: A secure decentralised generalised transaction ledger. Ethereum project yellow paper.
Quentin Bramas, Rotem Oshman, and Paolo Romano (Eds.). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 4:1–4:16. https://doi.org/10.4230/LIPIcs.OPODIS.2020.416
Appendix A  BYZANTINE SNAPSHOT: CORRECTNESS
Lemma 15. For a correct process i, at each point during an execution, collect_i[i] contains the value signed by i with the highest timestamp until that point.
Proof. By induction on the execution; collect_i[i] can change either at line 5 or at line 18. If it changes at line 5, ts_i is increased and collect_i[i] contains the value with the highest timestamp. By induction, no signed value encountered at line 17 has a timestamp higher than the one in collect_i[i], so it is not updated at line 18. □
Lemma 16. For a correct process i, collect_i is monotonically increasing.
Proof. Let j ∈ [n]. We prove that every time the value in collect_i[j] is updated from c to c′, it holds that c′.ts > c.ts. By the code, collect_i[j] changes either at line 5 or at line 18. In both cases, the value in collect_i[j] is signed by j. If collect_i[j] changes at line 18, then monotonicity is immediate from the condition at line 17. Otherwise, it changes at line 5, indicating that i = j, and monotonicity follows from Lemma 15. □
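The coordinate-wise merge behind update-collect, and the infimum used by minimum-saved, can be sketched as follows (a Python illustration with our own names; signature checks are elided). The merge only ever raises timestamps, which is exactly the monotonicity argued above:

```python
# Entries are (ts, val) pairs; None stands for ⊥ and has timestamp 0.

def entry_ts(e):
    return e[0] if e is not None else 0

def update_collect(collect, c):
    # keep, per index, the higher-timestamped entry (as update-collect does)
    for j, e in enumerate(c):
        if entry_ts(e) > entry_ts(collect[j]):
            collect[j] = e
    return collect

def infimum(arrays):
    # coordinate-wise minimum over timestamps (as minimum-saved uses)
    return [min(col, key=entry_ts) for col in zip(*arrays)]
```

For example, merging `[(2, 'a'), (1, 'x'), (3, 'b')]` into `[None, (1, 'x'), (5, 'y')]` yields `[(2, 'a'), (1, 'x'), (5, 'y')]`: each entry's timestamp can only grow.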
Lemma 10. The i-th entry of the array returned by a snapshot invocation in E contains the value v last updated by an update(v) invoked by process i in E, or its variable's initial value if no update was invoked.
Proof. Let v be the value in the i-th entry of the array returned by a snapshot, with a corresponding timestamp ts_v. By the definition of E, an update by process i with timestamp ts is linearized before the snapshot only if ts_v ≥ ts. If i is correct and multiple update operations by i are linearized at that point, then, since i invokes updates sequentially, by Lemmas 15 and 16 their start times are ordered according to their increasing timestamps. Thus, as updates are linearized by their start times, v matches the value of the last update. If i is Byzantine, since we add updates only for values at the moment they are seen, v must match the value of the last update. Additionally, if v is an initial value, then no updates were linearized before it in E. □
Observation 3. For a snapshot operation invoked by a correct process i, let c be the collected array at line 8 and let S be the return value. Then, S ≥ c.
Proof. Immediate from the condition at line 13. □
Lemma 11. If a snapshot operation invoked by a correct process p with return value S_p precedes a snapshot operation invoked by a correct process q with return value S_q, then S_p ≤ S_q.
Proof. Assume p invokes a snapshot operation snap_p, which returns S_p, before q invokes a snapshot snap_q, returning S_q. Let c be the value of collect_p that q reads at line 8 of snap_q, and let c′ be the value it saves at line 9. At the end of the last snapshot-aux in snap_p, collect_p ≥ S_p, either because the return value is collect_p (if snapshot-aux returns at line 60), or because S_p is reflected in the collect by the end of line 25 if it is a savesnap returned at line 36 or at line 58. Due to the monotonicity of collects (Lemma 16), S_p ≤ c. Because q reads c when calculating c′, c ≤ c′. Finally, by Observation 3, c′ ≤ S_q, and by transitivity, we get that S_p ≤ S_q. □
Lemma 12. Let S be the return value of a snapshot operation snap_q invoked by a correct process q. Let update_p(v) be an update operation invoked by a correct process p that writes ⟨ts, v⟩ and completes before snap_q starts. Then, S[p].ts ≥ ts.
Proof. Let t1 be the time when p completes line 5 in update_p(v) and writes ⟨ts, v⟩. Let t2 be the time when q reads collect_p[p] at line 8 in snap_q. By Lemmas 15 and 16, since p is correct, it follows that collect_p[p].ts ≥ ts at time t2 ≥ t1. Thus, after line 9 in snap_q, collect_q[p].ts ≥ ts, and by Observation 3, S[p].ts ≥ ts. □
Invariant 1. For any correct process i that invokes snapshot-aux(k), collect_i is the supremum of the arrays in the start messages sent by processes in senders_i, from line 33 and until the return value of snapshot-aux(k) is determined at line 23 or at line 60.
Proof. First, at line 33, senders_i contains i itself, and i sends exactly its collect_i array. The argument continues by induction on the steps of snapshot-aux(k). Other than at line 25, collect_i and senders_i change together: whenever i receives a start message with an array c from process p, it updates collect_i with the higher-timestamped values found in c and adds p to senders_i (lines 42–43). □
Definition 3. We say that the stability condition holds for a return value S of snapshot-aux(k) with a round r and a set of processes P if (1) |P| ≥ n − f, (2) there is a set P′ such that for each j ∈ P the union of all jsenders sets sent in j's messages in rounds 1 to r is P′, and (3) S is the supremum of the collects sent in the start messages of members of P′.
Observation 4. If S is returned from snapshot-aux(k) by a correct process i, then S satisfies the stability condition for some set P in some round r.
Proof. Consider two cases.
First, if i returns S at line 60, then the condition is satisfied with the round r that satisfies the condition at line 53 and the set P of n − f processes for which the condition at line 53 holds. P′ is the set in senders_i at the time the condition is satisfied. Since messages are delivered in order, for each j ∈ P the union of the jsenders sets sent in j's messages in rounds 1 to r is exactly seen_i[j][r] = senders_i, so (2) holds with P′ = senders_i. Because the return value is collect_i, (3) follows from Invariant 1.
Second, if i adopts a saved snapshot S with a proof and returns at line 36 or at line 58, then the proof contains n − f messages from some round r, and corresponding start messages satisfying the stability condition. □
Lemma 17. For a given k, let i, j be two correct processes that return S_i, S_j from snapshot-aux(k). Then S_i ≤ S_j or S_i > S_j.
Proof. By Observation 4, S_i satisfies the stability condition for some set P in some round r. Let P′ be the set guaranteed by the definition. Also by Observation 4, S_j satisfies the stability condition for some set Q in some round r′. Let Q′ be the set guaranteed by the definition. Since |P| ≥ n − f and |Q| ≥ n − f, there is at least one process p ∈ P ∩ Q. Due to reliable broadcast, p cannot equivocate with the set of processes jsenders sent in each round of snapshot-aux(k).
• If r = r′: by property (2) of Definition 3, P′ = Q′, and by (3), S_i = S_j.
• If r ≠ r′: assume WLOG r < r′. Since the union of all jsenders sets sent in p's messages in rounds 1 to r′ is a superset of the union of those sent in rounds 1 to r, P′ ⊆ Q′, and then by (3), S_j ≥ S_i. □
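The counting used in this proof can be checked exhaustively for small n (a Python illustration, not from the paper): for n > 2f, any two quorums of size n − f intersect in at least n − 2f ≥ 1 processes, although the common process is not guaranteed to be correct when n ≤ 3f.

```python
# Exhaustively verify the quorum-intersection bound for small n and f.

from itertools import combinations

def min_intersection(n, f):
    q = n - f  # quorum size used by the algorithm
    return min(len(set(a) & set(b))
               for a in combinations(range(n), q)
               for b in combinations(range(n), q))

# e.g. n = 5, f = 2: quorums of size 3 always share at least one process
```

This matches the closed form: the smallest possible overlap of two (n − f)-subsets of [n] is 2(n − f) − n = n − 2f.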
Lemma 18. Let i, j be two correct processes returning S_i, S_j, respectively, from snapshot-aux with auxnum = k, such that S_i > S_j. Then, when j begins any snapshot-aux_j(k′) for k′ > k, collect_j ≥ S_i.
Proof. Since i is correct, by Observation 4, S_i satisfies the stability condition. Let t be a time when the condition is satisfied. At time t, there is at least one correct process q such that collect_q ≥ S_i. We show that either (1) i does not return S_i, or (2) j begins snapshot-aux_j(k′) with collect_j ≥ S_i. If j begins snapshot-aux_j(k′) after t, then when it updates its collect at lines 30–31, it reads the values in collect_q. By Lemma 16, collect_q is greater than or equal to its value at time t. Thus, we get that collect_j ≥ collect_q ≥ S_i, and (2) holds. Otherwise, j saves S_j at line 59 before starting snapshot-aux(k′), which is before time t. Between time t and the time it returns S_i, i checks stored snapshots (at line 21). When it does so, i reads S_j, and since S_i > S_j and i returns the minimal array it sees, (1) holds. □
Lemma 19. If snapshot-aux_i(k) of a correct process i returns S_i, there is a correct process j s.t. j invoked snapshot-aux_j(k) and S_i ≥ c_j, where c_j is the value of collect_j after the collection at line 31 in snapshot-aux(k) at j.
Proof. If snapshot-aux_i(k) returns at line 60, then i returns collect_i, and by Lemma 16, S_i = collect_i is greater than or equal to its value after the collection at line 31, so the lemma holds with j = i. Otherwise, snapshot-aux_i(k) returns S_i at line 36 or at line 58, and S_i is an array saved in savesnap with a proof M signed by a process p. Since i validates S_i, there was a round r such that |{j | seen_p[j][r] = senders_p}| ≥ n − f. Thus, there was at least one correct process j in this set. Since j adds itself to senders_j (line 32), senders_j is broadcast by j at every round (line 52), and it is the set added to seen, the array c_j sent in j's start message is reflected in S_i. This array is exactly the value of collect_j after the collection at line 31 in snapshot-aux(k) at j, and hence S_i ≥ c_j. □
Lemma 9. If two snapshot operations invoked by correct processes return S_A and S_B, then S_A ≥ S_B or S_A < S_B.
Proof. By the code, S_A is the return value of some snapshot-aux(k_A) and S_B is the return value of some snapshot-aux(k_B). WLOG, k_B ≥ k_A.
• If k_B = k_A, the proof follows from Lemma 17.
• If k_B > k_A: By Lemma 19, there is a correct process j that invoked snapshot-aux_j(k_B), collected c_j at line 31 of snapshot-aux_j(k_B) (where c_j is the value of collect_j at that time), and S_B ≥ c_j. Let S_j be the return value of snapshot-aux_j(k_A) (note that j invokes snapshot-aux with increasing auxnums, so such a value exists). Consider two cases. First, if S_A > S_j, then by Lemma 18, S_A ≤ c_j. Thus, S_A ≤ c_j ≤ S_B, and the lemma follows. Otherwise, S_A ≤ S_j. At the end of snapshot-aux_j(k_A), collect_j ≥ S_j, because either the return value is collect_j, or S_j is reflected in the collect by the end of line 25. Due to the monotonicity of collects (Lemma 16), S_j ≤ c_j. We conclude that S_A ≤ S_j ≤ c_j ≤ S_B, as required. □
Lemma 13. If update1 by process p precedes update2 by process q, and a snapshot operation snap by a correct process sees update2, then snap sees update1 as well.
Proof. Let S be the return value of a snapshot that sees update2. By Observation 4, S is the supremum of collect arrays sent at line 33. If S sees update2, then by Lemma 15 it means that S reflects collect_q after line 5 of update2. Earlier, q performed line 3, so update1 was already reflected in collect_q. Hence, S sees update1 as well. □
We now prove the liveness of our snapshot algorithm.
Lemma 20. Every correct process that invokes snapshot-aux(auxnum) eventually returns.
Proof. Assume, by induction on auxnum, that all snapshot-aux instances with k′ < k (if any) have returned at all correct processes. Then, for auxnum = k, all correct processes initiate reliable broadcast instances and broadcast ⟨0, c⟩. This is because all correct processes invoke snapshot infinitely often. Since all messages by correct processes are eventually delivered, they all eventually complete line 50 in each round. Because |senders| is bounded, eventually the senders sets of all correct processes stabilize, and due to reliable broadcast, they contain the same set of processes for all correct processes. Thus, there is a round r for which the condition at line 53 is satisfied. Therefore, at least one correct process returns from snapshot-aux at line 60 (if it did not return sooner). Before returning, it updates its savesnap register at line 59. If it returns at line 36 or at line 58, it also updates its savesnap register at line 24. Every other correct process that has not yet returned from snapshot-aux will read the updated savesnap in the next while iteration and return at line 36. □
Lemma 14. Every correct process that invokes some operation eventually returns.
Proof. If a correct process i invokes an update operation, then by the code it returns in constant time. If i invokes a snapshot operation at time t, let c be the collected array at line 8. Additionally, let k be the maximum auxnum of any snapshot-aux operation that was initiated by some process before time t. By Lemma 20, all snapshot-aux invocations eventually return. At snapshot-aux(k + 1), all correct processes see c at lines 30–31 when they update their collects. Since the return value is the supremum of n − f collect arrays, it is guaranteed that when i executes snapshot-aux(k + 1), the returned value snap will satisfy snap ≥ c. □