BASALT: A Rock-Solid Foundation for Epidemic Consensus Algorithms in Very Large, Very Open Networks

Abstract

Recent works have proposed new Byzantine consensus algorithms for blockchains based on epidemics, a design which enables highly scalable performance at a low cost. These methods, however, critically depend on a secure random peer sampling service: a service that provides a stream of random network nodes where no attacking entity can become over-represented. To ensure this security property, current epidemic platforms use a Proof-of-Stake system to select peer samples. However, such a system limits the openness of the system as only nodes with significant stake can participate in the consensus, leading to an oligopoly situation. Moreover, this design introduces a complex interdependency between the consensus algorithm and the cryptocurrency built upon it. In this paper, we propose a radically different security design for the peer sampling service, based on the distribution of IP addresses to prevent Sybil attacks. We propose a new algorithm, BASALT, that implements our design using a stubborn chaotic search to counter attackers' attempts at becoming over-represented. We show in theory and using Monte Carlo simulations that BASALT provides samples which are extremely close to the optimal distribution even in adversarial scenarios such as attempted Eclipse attacks. Live experiments on a production cryptocurrency platform confirm that the samples obtained using BASALT are equitably distributed amongst nodes, allowing for a system which is both open and where no single entity can gain excessive power.


Alex Auvolat, Yérom-David Bromberg, Davide Frey, François Taïani

Univ Rennes, Inria, CNRS, IRISA, Rennes, France
{alex.auvolat,davide.frey}@inria.fr, {david.bromberg,francois.taiani}@irisa.fr


Keywords: Gossip, Peer Sampling, Distributed System, Byzantine tolerance, Consensus

Blockchain-based systems, such as cryptocurrencies [19] and smart contract platforms [3], are said to be Byzantine Fault Tolerant (BFT for short), i.e. they are able to resist attacks from malicious participants (called Byzantine nodes), making it arbitrarily hard, for instance, for an attacker to forge false transactions or revoke already committed transactions. In particular, decision power over the blockchain’s state must be spread over various network participants in order to prevent an attacker from obtaining full control over the system.

The breakthrough made by Bitcoin [19] allowed for Byzantine fault-tolerance to be achieved in a truly open network, using a Proof-of-Work system that requires participants to solve computationally intensive crypto-puzzles. The difficulty of these crypto-puzzles limits the influence of individual nodes, but encourages a race for computing power, with Bitcoin reported to consume as much electricity as Austria in 2020 [2]. Moreover, the throughput and latency of Proof-of-Work (PoW) systems are restricted by the time between blocks, which must be long enough to ensure security.

Epidemic BFT algorithms.

A particularly interesting area of research in alleviating these issues with Proof-of-Work consists in a new family of BFT algorithms [12, 13, 21] that exploits epidemic mechanisms to provide large-scale protection against Byzantine behaviour. Epidemic algorithms allow for extremely fast dissemination of information in very large networks by means of stochastic peer-to-peer exchanges [9, 17]. Epidemic BFT algorithms exploit this property by repeatedly sampling small sets of random peers in the network, which they then use to estimate the overall system’s state, and ensure coordination and agreement between correct (i.e. non-Byzantine) nodes.

Epidemic BFT approaches critically depend on the availability of good network samples, in the sense that the proportion of Byzantine nodes in a sample should be kept as low as possible, and sampled nodes should be as varied as possible. Providing such samples is the role of a so-called Byzantine-tolerant, or secure, random peer sampling (RPS) service. When such a service is available, these algorithms have the potential to yield much higher throughput than PoW systems at a fraction of the cost [21].

Secure random peer sampling.

Unfortunately, classical RPS algorithms [15, 20, 23] are not resilient to malicious behavior: Byzantine nodes can easily disrupt their execution by flooding honest nodes with Byzantine identifiers. Left unchecked, this strategy has the potential to isolate honest nodes in a so-called Eclipse attack [14, 22], or to partition the system. Moreover, a scheme where peers are sampled with uniform probability is vulnerable to so-called Sybil attacks [10] where a malicious entity creates arbitrarily many network node identifiers that it controls, thus gaining unlimited influence on the network.

Current deployments of epidemic BFT algorithms, such as the AVA cryptocurrency platform [1], rely on a Proof-of-Stake mechanism to ensure that nodes are sampled in a secure way, i.e. that the cost for an attacker of biasing samples in their favor is very high. However, Proof-of-Stake has several known limitations [25]. In essence, Proof-of-Stake consists in building an abstraction of a closed (permissioned) system, where system membership can however evolve dynamically according to the various parties’ economic investments (in the form of token staking). We argue that such an abstraction is too restrictive and in fact not required. Particularly in the case of epidemic BFT algorithms, we show that the required Byzantine-tolerant random peer sampling service can be implemented directly in a much more open fashion, without resorting to Proof-of-Stake to ensure security.

Content of this paper.

In this paper, we revisit the problem of secure peer sampling in large-scale decentralized systems, and propose BASALT, a novel Byzantine-tolerant random peer sampling algorithm. BASALT exhibits close to optimal Byzantine fault tolerance, thus significantly improving on the state-of-the-art [8, 16]. BASALT is designed to operate in Internet-scale permissionless systems while resisting Eclipse and Sybil attacks. At the core of BASALT lies what we have termed a stubborn chaotic search, a greedy epidemic procedure [24] towards random nodes that are implicitly defined in a way that makes it extremely hard for malicious nodes to manipulate the decisions of correct ones. This procedure is parametrized by a target distribution on nodes based on their IP addresses, which we define to defend against Sybil attacks by institutions that own large contiguous portions of the IP address space.

We comprehensively analyze BASALT under a theoretical model based on the power 𝑓 of the attack, which captures the (ideal) probability of sampling malicious nodes as defined by the target distribution. We show that BASALT provides samples in which the proportion of malicious nodes is very close to 𝑓, its theoretical optimum, and that 𝑓 is acceptably small in several real-world scenarios including institutional attacks and botnet attacks. We complement our theoretical model with Monte Carlo simulations that confirm our analysis. Finally, we demonstrate the feasibility and concrete benefits of our technique by deploying BASALT within a live cryptocurrency network using a prototype implementation of BASALT for AvalancheGo [4], the reference engine powering the AVA cryptocurrency network [1, 21]. Our experiments on the AVA network confirm that the samples obtained using BASALT are equitably distributed amongst nodes, allowing for a system which is both open and where no single entity can gain excessive power. Our prototype is publicly available, fully functional, and compatible with the existing AVA network without requiring any protocol changes.

A random peer sampling (RPS) service can be defined as a service that produces a continuous stream (𝑝_𝑖)_{𝑖≥0} of random nodes selected in the network. As stated above, a secure random peer sampling service is faced with the double task of (i) ensuring the largest possible diversity of peers in the stream (𝑝_𝑖)_{𝑖≥0}, while (ii) limiting as much as possible the appearance of malicious nodes in (𝑝_𝑖)_{𝑖≥0}.

We assume a very large system composed of nodes that can either be honest (a.k.a. correct) or malicious (a.k.a. Byzantine). Byzantine nodes may deviate arbitrarily from the prescribed protocol in order to manipulate the decisions taken by correct nodes, for instance to isolate correct nodes or to increase malicious nodes’ representation in the peer sampler’s output. We write 𝑄 for the number of correct nodes in the system.

We consider a communication network where any node can send a message to any other node, and assume that more than a fixed fraction of the messages sent to a node by other non-malicious nodes arrive within a certain delay. Byzantine nodes may collude (share information, coordinate their behaviors), and may send arbitrary messages to an arbitrarily large number of correct nodes per time unit. They cannot, however, completely block the communication between two correct nodes, or read the local memory of correct nodes.

Each node is granted a unique identifier, which we assume to be its IP address. We will use the same notation to refer to a node and to its identifier. We assume that Byzantine nodes may not spoof the IP addresses of other nodes, which can be prevented using a handshaking mechanism [11].

Random peer sampling is often considered under the assumption of a closed, or permissioned system (e.g. [8, 15]), where the whole set of nodes is known and the proportion of malicious nodes is equal to (or bounded by) a small fixed fraction 𝜑. In such a situation, a perfect random peer sampler could be defined as one that samples all nodes uniformly, thus returning a fraction 𝜑 of malicious nodes in the samples it produces.

This assumption is, however, ill-suited to an open network such as the public Internet, which is more akin to a permissionless (open) system. In such a setting, an attacker may control nodes with many times more IP addresses than there are correct nodes, which may then be used to perform a Sybil attack, leading to an increased influence of the attacker in the peer sampler’s output. In particular, an RPS that samples peers uniformly based on their IP addresses is particularly vulnerable to such attacks.

Drawing on the classification from [14], we will consider two paradigmatic scenarios where an attacker attempts a Sybil attack using many IP addresses:
(i) Institutional attacks, launched by an institution or an organization that owns large IP address blocks; and
(ii) Botnet attacks, where many infected machines are controlled by an attacker.

The crucial difference between these two attacks is that in an institutional attack the attacker may control many IP addresses located in a limited number of contiguous address blocks, whereas in a botnet attack the attacker may control a smaller number of addresses spread over the whole IP address space. These properties allow us to implement efficient defenses by biasing our sample selection to limit the influence of any given entity (Section 3.3). From a practical perspective, these two attacks represent the two extremes of a continuous spectrum, as most actual attacks will usually fall somewhere in the middle, a point we return to in our evaluation.

We do not consider network-level attacks such as BGP hijacks in our attack model; however, we discuss these attacks and potential defenses in Section 7.

BASALT leverages three main components. First, it employs a novel sampling approach, termed stubborn chaotic search, that exploits ranking functions to define a dynamic target random graph (i.e. a set of 𝑣 target neighbors for each node) that cannot be controlled by Byzantine nodes. Second, it adopts a hit-counter mechanism that favors the exploration of new peers even in the presence of Byzantine nodes that flood the network with their identities. Finally, it incorporates hierarchical ranking functions that ensure that nodes sample their peers from a variety of address prefixes. The first two mechanisms ensure that the number of Byzantine nodes in a node’s view cannot be increased arbitrarily by attackers. This offers protection from general Byzantine behaviors including those resulting from botnet attacks, as defined above. The third mechanism ensures that nodes sample their peers from a variety of address prefixes, thereby countering institutional attacks where the attacker controls a limited number of entire address prefixes.

Table 1: Parameters of the BASALT algorithm and of its environment.

Environment parameters
  𝑛       Number/equivalent number of nodes                         1000, 10000
  𝑓       Fraction/equivalent fraction of malicious nodes           10%, 30%
  𝑄       Number of correct nodes                                   = (1 − 𝑓)𝑛
  𝐹       Attack force (described in Sec. 4.2.1)                    ≥ …
Algorithm parameters
  𝑣       View size                                                 50 to 200
  𝜏       Exchange interval                                         1 time unit
  𝜌       Sampling rate (peers per time unit)                       ∼ …
  𝑘       Replacement count                                         up to 𝑣/…
Theoretical model variables
  𝑡       Time
  𝑐(𝑡)    Number of correct node identifiers seen                   0 ≤ 𝑐(𝑡) ≤ 𝑄
  𝑏(𝑡)    (Equivalent) number of malicious node identifiers seen    0 ≤ 𝑏(𝑡) ≤ 𝑓𝑛
  𝐵(𝑡)    Probability of sampling a Byzantine node                  = 𝑏(𝑡) / (𝑏(𝑡) + 𝑐(𝑡))

Table 1 shows an overview of the parameters of our algorithm and of its environment, while Algorithm 1 shows its pseudocode. For the sake of clarity, in the following, we use the generic term node to refer to protocol participants, but we use the term peer to refer to a node’s neighbors or potential neighbors.

BASALT nodes implicitly identify a dynamic target random graph by defining target neighbors using a set of random ranking functions. Then, each node greedily attempts to converge towards this implicit definition by repeatedly exchanging neighbor lists with other peers, discovering at each step peers that better match its ranking functions. In the following, we first detail the use of ranking functions to identify target neighbors. Then we discuss how nodes update these ranking functions to make the random graph dynamic.

Identifying neighbors through ranking functions.

Each node maintains a view, view[·], composed of 𝑣 slots. For each slot 𝑖 ∈ {1, …, 𝑣}, it chooses a random seed, noted seed[𝑖] (line 5 of Algorithm 1, and Fig. 1), that defines a corresponding random ranking function, rank_seed[𝑖](·). We then define a node’s 𝑖-th out-neighbor in the target graph as the (correct or malicious) node 𝑝 that minimizes rank_seed[𝑖](𝑝). The function rank_seed[𝑖](·) can be selected to implement specific sampling distributions. For instance, using a simple hash function rank_seed[𝑖](𝑝) = ℎ(⟨seed[𝑖], 𝑝⟩) (where angle brackets represent a tuple) leads to a uniform sampling function, since each peer identifier has the same probability of producing the lowest rank. In Section 3.3, we present how a hierarchical ranking function allows BASALT to foil institutional attacks. For simplicity, we use the shortcut of saying that a peer 𝑝 better matches seed[𝑖] than a peer 𝑝′ if rank_seed[𝑖](𝑝) < rank_seed[𝑖](𝑝′).

When selecting seed[𝑖], a node cannot know the corresponding target identifier. Rather, it stores, in view[𝑖], the identifier that has so far produced the smallest value of rank_seed[𝑖](view[𝑖]) amongst those seen since selecting seed[𝑖]. At startup, each node selects the best matching peers, view[𝑖], from a set of bootstrap peers (line 6). (We discuss the influence of the composition of this bootstrap set in Section 4.2.2.) Nodes then periodically exchange the current contents of their views at lines 7-9 in order to discover new peers that can serve as better matches for the slots in their views. Specifically, every 𝜏 time units (exchange interval), each correct node selects a random peer from its view and sends it a pull request (line 8) to which the recipient, if correct, replies by sending the contents of its current view (line 11). Then, the node selects another peer from its view and sends it a push message containing its current view (line 9). When it receives the reply to the pull request, the node greedily updates any slot view[𝑖] that can be brought closer to its corresponding seed, seed[𝑖], using one of the received identifiers (lines 24-25). The peer to which a push message was sent does the same on its side.

Figure 1: The mechanism of BASALT. The view of node 𝑝 has 𝑣 slots; slot 𝑖 holds a seed 𝑠_𝑖 (seed[·]) and a peer 𝑛_𝑖 (view[·]) selected to minimize rank_{𝑠_𝑖}(𝑛_𝑖); seeds are replaced following a round-robin strategy.

Algorithm 1: The BASALT algorithm

 1  algorithm parameters
 2    see Table 1
 3  initialization
 4    for 𝑖 ∈ 1, …, 𝑣 do
 5      seed[𝑖] ← rand_seed(); view[𝑖] ← ⊥; hits[𝑖] ← 0
 6    𝑟 ← 1; updateSample(bootstrap_peers)
 7  every 𝜏 time units
 8    𝑝 ← selectPeer(); Send ⟨PULL⟩ to 𝑝
 9    𝑞 ← selectPeer(); Send ⟨PUSH, view[·]⟩ to 𝑞
10  on receive ⟨PULL⟩ from 𝑝
11    Send ⟨PUSH, view[·]⟩ to 𝑝
12  on receive ⟨PUSH, [𝑝_1, …, 𝑝_𝑣]⟩ from 𝑝
13    updateSample([𝑝_1, …, 𝑝_𝑣, 𝑝])
14  every 𝑘/𝜌 time units
15    for 𝑖 = 1, …, 𝑘 do
16      𝑟 ← (𝑟 mod 𝑣) + 1
17      Sample view[𝑟]
18      seed[𝑟] ← rand_seed()
19    updateSample(view[·])
20  function updateSample([𝑝_1, …, 𝑝_𝑚])
21    for 𝑖 ∈ 1, …, 𝑣, 𝑝 ∈ [𝑝_1, …, 𝑝_𝑚] do
22      if 𝑝 = view[𝑖] then
23        hits[𝑖] ← hits[𝑖] + 1
24      else if view[𝑖] = ⊥ or rank_seed[𝑖](𝑝) < rank_seed[𝑖](view[𝑖]) then
25        view[𝑖] ← 𝑝; hits[𝑖] ← 0
26  function selectPeer()
27    𝑖 ∈ argmin_{𝑗=1..𝑣}(hits[𝑗])
28    hits[𝑖] ← hits[𝑖] + 1
29    return view[𝑖]

Making the graph dynamic.

To generate a dynamic random graph and enable nodes to continuously generate fresh samples from the network, nodes regularly reset some of their seeds to new random values. This periodic operation, at lines 14-19, first provides the application with 𝑘 peer identifiers representing a random sample of the network (line 17), and then resets the 𝑘 corresponding slots by selecting new seeds seed[𝑟] (line 18). These 𝑘 slots are selected in a round-robin fashion every 𝑘/𝜌 time units. This yields 𝜌 random samples per time unit on average as indicated in Table 1. The node then sets the corresponding entries, view[𝑟], to the identifiers from the current view that best match the new seeds (line 19). When the algorithm returns view[𝑟] as a sample to the application, view[𝑟] effectively results from a random selection amongst all the peer identifiers received since the last reset of seed[𝑟]. A node has no way of knowing if it has found the peer 𝑝 that best matches seed[𝑟] globally (i.e. its target neighbor in the random graph), but by selecting which seeds to reset in a round-robin fashion and by sampling view[𝑟] just before resetting seed[𝑟], the algorithm ensures that a maximum number of identifiers have been seen for each seed when returning the corresponding sample. This optimizes the randomness of the sample for a given budget of peer exchanges and view size. (Other than using more memory, increasing the view size has a non-negligible networking cost as one would typically keep an open TCP connection ready for each peer of the current view. Alternatively, the samples’ randomness could be increased by keeping a log of recently closed connections and re-injecting these peer identifiers when selecting new seeds.)

Parameter 𝜌 controls the number of random samples per time unit, and so the number of slots whose seeds are refreshed at each time unit. With a view size of 𝑣, this means that each slot is refreshed on average every 𝑣/𝜌 time units. The value of 𝑣/𝜌 must therefore be large enough with respect to the exchange interval, 𝜏. Parameter 𝑘 controls, instead, the number of slots that are reset at the same time. A large value of 𝑘 causes the algorithm to explore many slots in parallel, thereby obtaining more diverse samples that help the 𝑘 slots converge faster together. A small value of 𝑘 (e.g. 𝑘 = 1), instead, causes the exploration to occur with most slots in a quasi-converged state. This increases the probability of contacting peers that have already been contacted recently, thereby leading to slower convergence for the 𝑘 unconverged slots. Our simulation experiments confirm that BASALT better resists the presence of Byzantine peers using a batch sampling-and-replacement strategy where 𝑘 can be as high as 𝑣/… (in which case 𝑘/𝜌, like 𝑣/𝜌, must also be at least several exchange intervals, 𝜏), rather than replacing seeds one by one (i.e. setting 𝑘 = 1).

The graph-generation mechanism described above prevents Byzantine peers from influencing the target graph, and thus the views of correct nodes once the network has converged. However, depending on the speeds at which correct nodes discover other correct or Byzantine peers, their intermediate views may suffer from a bias in favor of Byzantine nodes. If this happens, the algorithm will tend to select malicious nodes to push to and pull from (lines 8 and 9), further slowing down convergence.

BASALT mitigates this issue by introducing a hit-counter mechanism that effectively makes the protocol harder to attack. Each node maintains a hit counter variable, hits[𝑖], for each slot 𝑖 ∈ {1, …, 𝑣} in its view. A node sets hits[𝑖] to 0 when initializing the slot, as well as whenever updating view[𝑖] with a peer that better matches the corresponding seed (line 25). Every time a node receives another peer’s view that also contains peer view[𝑖], it increases hits[𝑖] by one (line 23).

When deciding which neighbors to contact, a node always selects one of the peers with the lowest value of hits[𝑖] (line 27). Finally, the node increases the hit counter of the selected peer by 1 to make it less likely to be selected the next time (line 28).

This mechanism has no impact on honest nodes as they should each appear as often in expectation. However, it creates a trade-off for (possibly colluding) malicious nodes that try to be over-represented, as nodes attempting to appear more often will automatically be contacted less. We further discuss this aspect and the possibility of attacks on the hit counter mechanism in Section 4.3.

Central to Algorithm 1, the function rank_seed[𝑖](·) induces a specific sampling distribution of node identifiers. For instance, using simply a hashing function for rank_seed[𝑖](·) yields uniform node sampling. Unfortunately, uniform sampling makes it relatively easy for institutional attackers to gain an overwhelming influence in the system, by taking control of large IP address blocks to implement Sybil attacks (Section 2.2). For instance, as we will see in Section 4.4, controlling one of the largest ISPs would grant an attacker about … IPv4 addresses, a number large enough to thwart most existing decentralized BFT systems. Such attacks are, however, heavily concentrated in a limited number of address ranges by design (∼ …).

Ranking functions and target distributions.

Formally, let 𝑆 be a uniform random variable on 256-bit integers, which corresponds to the sampling of a seed. Let 𝑋 be the random variable corresponding to the best matching peer sample for 𝑆, defined as:

𝑋 = argmin_{𝑝 ∈ 𝒩} rank_𝑆(𝑝)    (1)

where 𝒩 denotes the set of all network nodes. Depending on the definition of rank_𝑆(𝑝), 𝑋 can implement a specific probability distribution on network nodes. This allows us to define the attacker’s power, 𝑓, as the probability of 𝑋 being a malicious node given a specific ranking function rank_𝑆(·). If rank_𝑆(·) is a simple hashing function, rank_𝑆(𝑝) = ℎ(⟨𝑆, 𝑝⟩), nodes are selected uniformly, and 𝑓 corresponds to the fraction 𝜑 of malicious nodes in the system. In the more general case, this probability is no longer equal to 𝜑 but, as we will see in Section 4.1, it plays the same role in our analysis, thus we will also call 𝑓 an equivalent fraction of malicious nodes.

Selecting the ranking function.

In order to counter institutional Sybil attacks, we need to select a ranking function that minimizes 𝑓 by giving malicious nodes a low probability of being selected as best-matching peers (i.e. chosen by the distribution 𝑋). BASALT adopts a ranking function that spreads sampled peers amongst different subnets by exploiting the structure of IP addresses. IP addresses can indeed usually be decomposed into two parts: a prefix, which designates the subnet to which the address belongs (linked to a given Internet service provider), and a local part that identifies a node within that subnet.
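The effect of the ranking function on the attacker’s power 𝑓 can be illustrated with a small Monte Carlo sketch of Eq. 1. Everything specific in it is an assumption made for the example and not taken from the paper: SHA-256 stands in for ℎ, the subnet is taken to be the /8 prefix, and the toy population (180 correct nodes spread over 90 prefixes, 540 Sybil identifiers packed into a single prefix) is invented.

```python
import hashlib
import ipaddress
import random


def h(seed: bytes, value: bytes) -> bytes:
    # Stand-in for h(<S, value>); SHA-256 is an assumption of this sketch.
    return hashlib.sha256(seed + value).digest()


def uniform_rank(seed: bytes, addr: bytes) -> bytes:
    # rank_S(p) = h(<S, p>): every identifier is equally likely to rank lowest.
    return h(seed, addr)


def prefix_rank(seed: bytes, addr: bytes) -> tuple:
    # Rank the /8 prefix first, then the full address (lexicographic order).
    return (h(seed, addr[:1]), h(seed, addr))


def attacker_power(rank, correct, malicious, trials=300, rng_seed=0):
    # Monte Carlo estimate of f = P[argmin_p rank_S(p) is malicious].
    rng = random.Random(rng_seed)
    nodes = correct + malicious
    bad = set(malicious)
    wins = 0
    for _ in range(trials):
        seed = rng.getrandbits(256).to_bytes(32, "big")
        wins += min(nodes, key=lambda p: rank(seed, p)) in bad
    return wins / trials


def packed(ip: str) -> bytes:
    # 4-byte big-endian form of an IPv4 address.
    return ipaddress.ip_address(ip).packed


# Toy population (invented): 180 correct nodes over 90 /8 prefixes,
# 540 Sybil identifiers packed into the single prefix 200.0.0.0/8.
correct = [packed(f"{a}.0.0.{b}") for a in range(1, 91) for b in (1, 2)]
malicious = [packed(f"200.0.{x}.{y}") for x in range(3) for y in range(1, 181)]

f_uniform = attacker_power(uniform_rank, correct, malicious)
f_prefix = attacker_power(prefix_rank, correct, malicious)
print(f"uniform ranking: f ~ {f_uniform:.2f} (exact: {540 / 720:.2f})")
print(f"prefix ranking:  f ~ {f_prefix:.2f} (exact: {1 / 91:.2f})")
```

Under uniform ranking the attacker’s power equals its share of identifiers (540 out of 720 here), whereas ranking on the prefix first caps it at one prefix out of 91, however many addresses the attacker holds inside that prefix.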


A first grouped ranking function.

To illustrate this intuition, suppose that 𝐺(𝑝) corresponds to the prefix of a given length of the IP address, or a country code determined from the address. The following ranking function (based on a lexicographical ordering on values) can be used to sample uniformly amongst the different values of property 𝐺(𝑝), and then uniformly amongst all the peers that have the selected value of 𝐺(𝑝):

rank_𝑆(𝑝) = ⟨ℎ(⟨𝑆, 𝐺(𝑝)⟩), ℎ(⟨𝑆, 𝑝⟩)⟩

Using such a ranking function makes an attack against BASALT harder. In order to gain a power of 𝑓 in the network, a malicious entity would need to control a large number of nodes in at least a fraction 𝑓 of all the values of 𝐺(𝑝) where network nodes exist. For instance, consider an attacker that owns a full IP address block. Uniform node sampling gives the attacker a power of 𝑓 = 𝑞/𝑛, where 𝑞 is the size of the IP block and 𝑛 the total number of nodes in the network. In the group-based sampling model, since an address block is usually associated with a single group (a single country, a single IP address prefix), the attacker only has a power of 𝑓 = (1/|𝐺|) · (𝑞/𝑔), where |𝐺| is the number of different groups, and 𝑔 is the number of nodes present in the particular group of the attacker’s address block. This attacking power is trivially bounded by 1/|𝐺|, and the only way to increase it consists in taking control of many IP addresses in other groups, making such an attack much more costly.

BASALT’s hierarchical ranking function.

In BASALT, we take the grouping approach described above one step further. We adopt a hierarchical ranking function that descends the address hierarchy by sampling uniformly at levels /8, then /16, then /24, and then finally at the level of individual addresses, defined as follows:

rank_𝑆(𝑝) = ⟨ℎ(⟨𝑆, 𝑝_8⟩), ℎ(⟨𝑆, 𝑝_16⟩), ℎ(⟨𝑆, 𝑝_24⟩), ℎ(⟨𝑆, 𝑝⟩)⟩    (2)

where 𝑝_𝑖 corresponds to the prefix constituted of the 𝑖 most significant bits of 𝑝’s IP address. The efficiency of this ranking function in countering institutional Sybil attacks is demonstrated numerically in Section 4.4.

We now use a theoretical continuous model to estimate the value of 𝐵(𝑡), the probability at a time 𝑡 that a given slot of a correct process contains a Byzantine peer identifier, as a function of 𝑓, the attacker’s power. The use of 𝑓 allows us to apply the same analysis to institutional and botnet attacks in a unified reasoning.

We first consider an ideal ‘uniform’ botnet attack, in which Byzantine identifiers follow the same distribution as those of honest nodes. This situation corresponds for instance to a scenario in which a botnet indiscriminately targets the same kind of nodes (e.g. personal machines) as those making up the rest of the system. In this case, the attacker’s power 𝑓 that we introduced in Section 3.3 is simply equal to the fraction of Byzantine nodes in the network, and is independent of BASALT’s hierarchical ranking function. To analyze this attack, we denote by 𝑛 the total number of network nodes (i.e. the network size); the product 𝑓𝑛 denotes the number of Byzantine nodes, and 𝑄 = (1 − 𝑓)𝑛 denotes the number of correct nodes.

In the case of an institutional attack, the attacker’s power 𝑓 depends on the distribution of the address blocks it controls, and represents the probability of selecting a Byzantine identifier using the ranking function rank_𝑆(𝑝) in the hypothetical case that all identifiers in the network are known (Eq. 1).
In this scenario, we define 𝑛 as an equivalent network size, 𝑛 = 𝑄/(1 − 𝑓), where 𝑄 still denotes the number of correct nodes. These definitions satisfy the equality 𝑄 = (1 − 𝑓)𝑛, as in the (uniform) botnet attack, and will allow us to apply the same analysis seamlessly.

The two scenarios above represent the extreme cases of a wider spectrum of attacks. In particular, a botnet attack might not be uniform, as the identifiers controlled by a botnet might be biased towards certain blocks (e.g. in the case of a botnet built by targeting certain organizations, or specific vulnerabilities) that differ from those of honest nodes. In such hybrid cases, the reasoning for institutional attacks applies.

The probability 𝐵(𝑡) of selecting a Byzantine node in a given slot of a node 𝑝 at time 𝑡 depends on two sets of identifiers: the set of correct identifiers seen at a time 𝑡 by 𝑝 on this slot since the last reset, noted 𝒞(𝑡), and the set of Byzantine identifiers seen by 𝑝 over the same period, noted ℬ(𝑡).

One key observation is that, for a fixed 𝒞(𝑡), 𝐵(𝑡) increases as 𝑝 hears of new Byzantine identifiers and ℬ(𝑡) grows, i.e. 𝒞(𝑡) = 𝒞(𝑡′) ∧ ℬ(𝑡) ⊆ ℬ(𝑡′) ⟹ 𝐵(𝑡) ≤ 𝐵(𝑡′), so that for a given 𝒞(𝑡), 𝐵(𝑡) is maximum when the node 𝑝 has learned all Byzantine identifiers circulating in the system.

In the following analysis, we therefore assume a worst case scenario in which correct nodes have been flooded with all existing Byzantine identifiers. (We discuss the actual implementation of this worst case scenario in Section 4.2.1.) For botnet attacks, we have assumed that correct and Byzantine identifiers follow the same distribution, implying that the probability of selecting a correct identifier, 𝐶(𝑡) = 1 − 𝐵(𝑡), is

𝐶(𝑡) = 𝑐(𝑡) / (𝑏_max + 𝑐(𝑡)),    (3)

where 𝑏_max is the total number of Byzantine identifiers, i.e. 𝑏_max = 𝑓𝑛.
(See Appendix A for a detailed derivation.)

For institutional attacks, we assume that the distribution of correct nodes is independent of the sampling distribution introduced by rank(), and we approximate $C(t)$ using the same form as Eq. 3, where $b_{\max}$ becomes an equivalent number of Byzantine identifiers. Considering the case when $p$ knows all correct nodes ($c(t) = Q$), and having defined $n = \frac{Q}{1-f}$, we derive $b_{\max} = f \times \frac{Q}{1-f}$, and hence $b_{\max} = f n$ in this case as well, where $n$ is now the equivalent network size introduced above.

In both attacks, the probability of selecting a Byzantine node, $B(t)$, is therefore driven by the number of correct identifiers known to $p$, $c(t) = |\mathcal{C}(t)|$, and the same system equation can be used to study both cases (modulo the redefinition of $n$ for institutional attacks). For simplicity, we study a version of BASALT without the hit counter-based hardening mechanism, and later discuss its impact in Section 4.3. Algorithm 2 shows the pseudocode corresponding to the hit counter-less version being analyzed. To approximate the system's behavior, we reason using the mean values of $c(t)$ over all nodes and slots, and assume that the values of individual nodes tend to concentrate around their means in practice with high probability, as is usually the case in such stochastic systems.

Algorithm 2: Simplification of Algorithm 1 for theoretical analysis of Section 4.2

    function updateSample([p_1, ..., p_v])
        for i ∈ 1, ..., v do
            for p ∈ [p_1, ..., p_v] do
                if view[i] = ⊥ or h(⟨seed[i], p⟩) < h(⟨seed[i], view[i]⟩) then
                    view[i] ← p

    function selectPeer()
        i ← rand(1, ..., v); return view[i]

We first discuss in more detail the worst-case attack on the BASALT algorithm. We then study the risk of a node becoming isolated under this attack model (i.e. of an Eclipse attack succeeding), before moving on to studying the convergence properties of BASALT assuming no node is ever isolated.
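The simplified procedure of Algorithm 2 can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the authors' implementation: the class name `SimpleView`, the use of SHA-256 as the hash $h$, and the 16-byte seeds are hypothetical choices made here for the example.

```python
import hashlib
import random


def h(seed: bytes, peer: str) -> int:
    """Hash of the pair <seed, peer>; SHA-256 is an assumption of this sketch."""
    return int.from_bytes(hashlib.sha256(seed + peer.encode()).digest(), "big")


class SimpleView:
    """Hit counter-less view: each slot keeps the peer with the lowest hash."""

    def __init__(self, v: int, rng: random.Random):
        self.rng = rng
        # One independent random seed per slot (16 bytes is an arbitrary choice).
        self.seed = [rng.getrandbits(128).to_bytes(16, "big") for _ in range(v)]
        # None plays the role of the empty slot (bottom symbol in Algorithm 2).
        self.view = [None] * v

    def update_sample(self, peers):
        """Offer every received identifier to every slot."""
        for i in range(len(self.view)):
            for p in peers:
                current = self.view[i]
                if current is None or h(self.seed[i], p) < h(self.seed[i], current):
                    self.view[i] = p

    def select_peer(self):
        """Return the content of a uniformly random slot."""
        return self.view[self.rng.randrange(len(self.view))]
```

Because each slot keeps the minimum of a hash over all identifiers seen so far, the content of a slot does not depend on the order or the number of times identifiers are received, which is what makes flooding with already-known identifiers ineffective.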

To identify the worst-case attack, we observe that attackers cannot influence the choices correct nodes make (at line 6 of Algorithm 2, and at lines 16-18 of Algorithm 1); thus they can only manipulate the peer-sampling process by increasing their representation in the views of correct nodes, i.e. the value of $B(t)$. The fact that $B(t)$ grows with the set of Byzantine identifiers the node is aware of, $\mathcal{B}(t)$, suggests that the worst-case scenario arises when Byzantine nodes flood the network with their identifiers in order to increase $\mathcal{B}(t)$ as much as possible. We model this attack scenario as follows:

• A malicious node that receives a pull request returns a view composed of $v$ nodes selected uniformly at random amongst the malicious nodes.
• Regularly, a malicious node sends a push request to randomly selected correct peers, containing similarly a view of $v$ uniformly random malicious peers.

We define the force of the attack, $F$ (distinct from the attacker's power, $f$), as the ratio between the number of push requests sent by a Byzantine node and the number of push requests sent by a correct node in a given time interval. For example, if a Byzantine node sends push requests at the same rate as correct nodes, a force of $F$ corresponds to sending requests to $F$ distinct correct nodes rather than to only one. Alternatively, the force of the attack can also model a situation in which Byzantine nodes send requests more often, or where the network loses more messages from correct nodes than from Byzantine ones.

The worst case corresponds to an arbitrarily large value of $F$, arising when correct nodes receive all the identifiers of Byzantine peers in any arbitrarily small (but non-empty) time interval. This means that apart from the initial state, $\mathcal{B}(t)$ is constant, and its effect can be captured by the term $b_{\max} = f n$ to compute the probabilities of selecting a correct (resp. Byzantine) peer in a slot of a node's view.
We recall that $f n$ represents the total/equivalent number of Byzantine nodes depending on the attack considered. The analysis that follows shows that even in this case, BASALT causes $B$ to converge to a value that is only slightly larger than the attacker's power, $f$. The experimental results of Section 5 analyze instead the actual performance with finite values of $F$.

We start by showing that nodes have a low probability of being isolated. Isolation can happen in two ways: either when a node joins the network for the first time, or when it evicts all correct peers from its view and replaces them with Byzantine peers.

Isolated joining node. In the first case, the unfortunate joining node receives all of the identifiers of Byzantine nodes as soon as it joins. At time $\epsilon$ after joining, $\mathcal{B}(t)$ becomes maximal and $c(\epsilon) = (1 - f_0)\,I$, where $f_0$ is the fraction of Byzantine nodes in the bootstrap sample and $I$ is the size of the bootstrap sample. Since we defined $B(t)$ as the probability of a given slot in the view being occupied by a Byzantine peer, we can write the probability that a node has only Byzantine neighbors as $B(t)^v$:
$$B(t)^v = \left(\frac{b_{\max}}{b_{\max} + c}\right)^v = \left(\frac{1}{1 + \frac{(1 - f_0)\,I}{f n}}\right)^v \tag{4}$$
We can reduce this probability exponentially by increasing $v$, by increasing $I$ or by assuming a lower $f_0$. For instance, supposing $f_0 = 50\%$ of malicious nodes in our bootstrap peer list, by taking a view size of $v = 200$ and a bootstrap peer list size of […] the number of malicious nodes in the network ($I = $ […] $f n$), this probability becomes smaller than […]. Supposing for instance a network of size $n = 10000$ with a fraction $f = 0.$[…] of Byzantine nodes, this only requires a bootstrap set of size $I = 250$ nodes, of which only […] are required to be correct.

Convergence to isolated state.

The second way for a node to become isolated results from resetting the seeds for the slots that still contain correct peers to new seeds that select Byzantine nodes. When such a reset occurs, the probability that all of the non-reset slots are already owned by Byzantine peers is equal to $B(t)^{v-k} = \left(\frac{b_{\max}}{b_{\max} + c(t)}\right)^{v-k}$. When the number of correct nodes seen locally, $c(t)$, is large enough, this probability is negligible.

Let us now study the value of $c(t)$ at the time of a reset, depending on the value of $c(t)$ at the time of the previous reset. For this analysis, we look at a single node of the network and make the hypothesis that other network nodes are well-converged. As we discuss in Sections 4.2.3 and 5, this implies that the fraction of Byzantine nodes in their views approaches $f$ with appropriate algorithm parameters. We write $c_0$ the value of $c(t)$ at the previous reset. The expected number of correct peer identifiers received during the period between the two resets is lower bounded by $\frac{k}{\rho}\,\frac{v}{\tau}\,\frac{c_0}{f n + c_0}\,(1 - f)$. If we write $\Delta c$ the corresponding increase in $c(t)$, i.e. the number of distinct correct peer identifiers received during this time period, we obtain:
$$\Delta c \ge \frac{k\,v\,c_0\,(1 - f)\,(Q - c_0)}{Q\,\tau\,\rho\,(f n + c_0) + k\,v\,c_0\,(1 - f)} \tag{5}$$
(see Appendix B for the full derivation).

Suppose for instance a network of $n = 10000$ nodes with a proportion $f = 0.$[…] of malicious nodes, with algorithm parameters $v = 100$ and $k = 50$. In this system, taking $\tau = 1$ and $\rho = 1$, and supposing that the node we are considering has just joined the network and knows only of $c_0 = 125$ correct node identifiers, we obtain that $\Delta c \ge$ […], i.e. $c(t)$ at the next reset is expected to be at least […]. $B(t)^{v-k}$ is smaller than […] as soon as the number $c(t)$ of correct node identifiers seen is more than […].
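The two isolation bounds (Equations 4 and 5) can be evaluated numerically with a few lines of Python. The sketch below is a worked example under explicit assumptions rather than a reproduction of the paper's figures: the network fraction `f = 0.1` is assumed here because the printed value is incomplete in this text, and the function names are chosen for this example only.

```python
def all_byzantine_probability(b_max, c, slots):
    """(b_max / (b_max + c)) ** slots: chance that `slots` slots are all Byzantine."""
    return (b_max / (b_max + c)) ** slots


def delta_c_lower_bound(k, v, c0, f, n, tau=1.0, rho=1.0):
    """Lower bound of Eq. 5 on the new correct identifiers learned between resets."""
    q = (1.0 - f) * n                      # number of correct nodes, Q = (1 - f) n
    gain = k * v * c0 * (1.0 - f)
    return gain * (q - c0) / (q * tau * rho * (f * n + c0) + gain)


# Assumed scenario: n and the bootstrap values come from the text, f is an assumption.
n, f = 10_000, 0.1
f0, bootstrap_size = 0.5, 250
b_max = f * n
c0 = (1 - f0) * bootstrap_size             # correct identifiers known right after joining

# Eq. 4 with v = 200: probability that a joining node has only Byzantine neighbors.
p_join = all_byzantine_probability(b_max, c0, slots=200)

# Eq. 5 with v = 100, k = 50, then the reset-time probability B(t)^(v - k).
dc = delta_c_lower_bound(k=50, v=100, c0=c0, f=f, n=n)
p_reset = all_byzantine_probability(b_max, c0 + dc, slots=100 - 50)
```

Under these assumed values, both `p_join` and `p_reset` come out below $10^{-10}$, which illustrates how quickly the bounds shrink as the view size or the number of known correct identifiers grows.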
The probability that the node becomes isolated at the next reset is therefore negligible. This guarantee can be made even stronger by increasing the view size $v$. Moreover, if the node is already in a better-converged state with a relatively large $c_0$, the probability of becoming isolated during a reset becomes even smaller.

Now that we have shown that the probability of a node becoming isolated can be made arbitrarily low, we make the following assumption in the rest of the analysis.

Assumption 1. No node is isolated, and all nodes have at least some correct neighbors. In particular, $B(t)^v$ is negligible at all times $t$.

Deriving a continuous model. When Assumption 1 holds, and in the worst-case scenario discussed above (Byzantine nodes have propagated all their identities to all correct nodes), $\mathcal{B}(t)$ is constant and its effect is captured by the term $b_{\max}$, which allows us to write the evolution of $c(t)$ over time as a differential equation, as the sum of contributions resulting from the various parts of the system.

• Pull exchange: every $\tau$ rounds, a node pulls from one peer in its view, which replies by sending $v$ node identifiers. With probability $B(t)$, the node contacts a Byzantine peer. In this case, it receives only Byzantine peer identifiers that it is already aware of (by the worst-case assumption $b_{\max} = f n$). With probability $C(t)$, the node contacts instead a correct peer. In this case, each returned identifier will itself be correct with probability $C(t)$, and if correct, it will have a probability $\frac{c(t)}{(1-f)n}$ of being already known ($(1-f)n$ being the total number of correct nodes). Thus we can express the variation of $c(t)$ over time as a result of a pull operation as follows:
$$\frac{dc}{dt} = \frac{1}{\tau}\left[C(t)^2\, v \left(1 - \frac{c(t)}{(1-f)n}\right)\right].$$
• Push exchange: every $\tau$ rounds, a node pushes to a random node in its view. This push has a $C(t)$ probability of being sent to a correct node. In this case we can apply the same reasoning as above and derive the same contribution to $\frac{dc}{dt}$.
• Sampling and view renewal: every $1/\rho$ rounds, a node resets one of its $v$ slots and forgets the identifiers collected for this slot. Let us write $c(t)$ as $c(t) = \frac{1}{v}\sum_{i=1}^{v} c_i(t)$, where $c_i(t)$ is the number of correct nodes taken into account by slot $i$. Then a single $c_i$ is set to zero every $1/\rho$ rounds. On average, this yields the following contribution to $\frac{dc}{dt}$:
$$\frac{dc}{dt} = -\rho\,\frac{c(t)}{v}.$$
By summing all three above contributions, we obtain our final differential equation:
$$\frac{dc}{dt} = \frac{1}{\tau}\left[C(t)^2\, v \left(1 - \frac{c(t)}{(1-f)n}\right)\right] - \rho\,\frac{c(t)}{v}. \tag{6}$$

Solving the continuous model.

We now solve Equation 6 under Assumption 1 and show that the network converges to a state where the proportion $B$ of Byzantine peers in nodes' views is small even for arbitrarily large values of the attack force, $F$. To this end, we can express $\frac{dB}{dt}$ as $\frac{dB}{dt} = -\frac{b_{\max}}{(b_{\max} + c)^2}\,\frac{dc}{dt}$, and by substituting $\frac{dc}{dt}$ from Equation 6, we obtain:
$$\frac{dB}{dt} = B(1 - B)\left(\frac{\rho}{v} - \frac{v\,(1 - B)(B - f)}{\tau f (1 - f)\, n}\right) \tag{7}$$
To study the constant regime of this system, we write $\frac{dB}{dt} = 0$ and exclude the solutions $B = 0$, which is not compatible with $b_{\max} = f n$, and $B = 1$, which corresponds to the case where Byzantine nodes take over the whole network. We also simplify by setting $\tau = 1$ as its role is symmetrical with that of $\rho$. We obtain after a few steps:
$$(1 - B)(B - f) = \frac{\rho f (1 - f)\, n}{v^2}. \tag{8}$$
The equation exhibits two roots $B_1 < B_2$:
$$B_{1,2} = \frac{1}{2}\left(1 + f \mp \sqrt{(1 - f)^2 - \frac{4 \rho f (1 - f)\, n}{v^2}}\right) \tag{9}$$
When the quantity on the right-hand side of Equation 8 approaches zero, $B_1$ approaches $f$ from above, while $B_2$ approaches 1 from below. Since $\frac{dB}{dt} > 0$ for $B < B_1$ and $B > B_2$, while $\frac{dB}{dt} < 0$ for $B_1 < B < B_2$, $B_1$ corresponds to a stable equilibrium, while $B_2$ corresponds to an unstable one. So we focus our analysis on $B_1$.

With respect to $B_1$, the right-hand side of Equation 8 represents the difference between the proportion of malicious peers in nodes' views, $B(t)$, and their overall proportion in the network, $f$. Ideally, we want to keep this quantity as small as possible, making $B_1$ only slightly larger than $f$.

To this end, we observe that the term $\frac{\rho f (1 - f)\, n}{v^2}$ shrinks proportionally to the square of the view size, $v$. Thus, choosing a large enough view size allows the network to converge to a globally well-mixed state where Byzantine nodes control only slightly more peers in the view than their overall proportion in the network.
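The convergence towards the stable equilibrium can also be checked numerically. The following Python sketch integrates the differential equation for $c(t)$ with a forward-Euler scheme and then verifies that the resulting $B$ satisfies the equilibrium condition $(1 - B)(B - f) = \rho \tau f (1 - f) n / v^2$. It takes the equation at face value as read in this text; the parameter values, the starting point `c0`, the step size and the function name are assumptions made for the illustration.

```python
def integrate_byzantine_share(n, f, v, rho=1.0, tau=1.0, c0=125.0,
                              rounds=3000.0, dt=0.1):
    """Forward-Euler integration of dc/dt; returns B = b_max / (b_max + c) at the end."""
    b_max = f * n                  # worst case: every Byzantine identifier is known
    q = (1.0 - f) * n              # total number of correct nodes
    c = c0
    for _ in range(int(rounds / dt)):
        big_c = c / (b_max + c)    # probability that a slot holds a correct peer
        gain = (big_c * big_c * v * (1.0 - c / q)) / tau   # exchange term
        loss = rho * c / v                                 # slot reset term
        c += dt * (gain - loss)
    return b_max / (b_max + c)


# Assumed example values (not taken from a specific experiment in the paper).
n, f, v, rho = 10_000, 0.1, 160, 1.0
B = integrate_byzantine_share(n=n, f=f, v=v, rho=rho)

# Residual of the equilibrium condition with tau = 1; close to zero at convergence.
residual = (1 - B) * (B - f) - rho * f * (1 - f) * n / v**2
```

With these assumed values the integration starts from a heavily polluted view (about 89% Byzantine) and settles at a share of roughly 14%, slightly above $f = 10\%$, which is the qualitative behaviour described above.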
Moreover, in order to obtain the same stable state value of $B_1$, for fixed values of $f$ and $n$, the view size $v$ should grow proportionally to the square root of the sampling rate, $\sqrt{\rho}$, while, for fixed values of $f$ and $\rho$, it needs to increase proportionally to $\sqrt{n}$.

BASALT's hit counter-based hardening mechanism allows nodes to detect which peers have appeared more often in incoming messages, and prioritize other peers for network exploration. In the case of a standard attack, where malicious peers flood their own identifiers, the hit counter favors the choice of correct peers over malicious ones.

However, the fact that we analyzed a simplified version of BASALT without the hardening mechanism raises the legitimate question of whether the hit counter may degrade the security of the approach by enabling some other attack. To answer this question, let us consider a malicious node or a coalition of malicious nodes that want to influence the sampling operations performed by a correct node.

We start by observing that malicious nodes can neither write nor read the local memories of correct nodes. So they cannot influence sampling operations directly: their only strategy consists in trying to convince a target correct node that some of the correct peers in its view are malicious by increasing their hit counters. To this end, malicious nodes can repeatedly advertise the identifiers of correct peers. But again, they cannot guess which correct peers are in the target correct node's view slots. So their only option consists in advertising a possibly large random set of correct peers in the hope that some of them will be in the correct node's view.

But even this turns out to be counterproductive. If malicious nodes advertise a large number of correct identifiers, it is indeed possible that they may increase the hit counter of some entries in the target's view. But they do so at the cost of increasing their target's value of $c(t)$, i.e. the number of correct peers known to it. Since we already assumed the worst-case scenario of $b_{\max} = f n$, the increase in $c(t)$ can only decrease $B(t)$, thereby making the attack counterproductive.

Table 2: Institutional attack: power $f$ of the adversary, defined as the equivalent fraction of malicious nodes, for different sampling methods, supposing the biggest Internet service provider as an adversary (∼106 million IP addresses, distributed over 5739 blocks), on the IPv4 network. Data sourced from the GeoLite2 Block/ASN dataset.

    Number of honest nodes (Q)    100         1000        10000
    Uniform                       99.9999%    99.999%     99.99%
    By /8 prefix                  49%         28%         27%
    By /16 prefix                 95%         64%         17%
    By /24 prefix                 99.98%      99.8%       98%
    Hierarchical                  47%         21%         10%

[Figure 2 plot. Horizontal axis: number of attacker IP addresses. Vertical axis: probability of Byzantine samples (lower is better). Series: uniform, by /16 prefix, hierarchical (BASALT).]

Figure 2: Institutional attack: probability of sampling a Byzantine node with different ranking functions, calculated for the 100 biggest Internet ASes supposing that each of them is the attacker, and that 1000 honest nodes are uniformly spread amongst the remaining IP address space. Data sourced from the GeoLite2 Block/ASN dataset. Horizontal axis: number of IP addresses controlled by the AS. The sampling probability is calculated from the equivalent fraction by using Equation 9, with a view size of $v = 100$ and a sampling rate of $\rho = 1$.

To illustrate the robustness of BASALT's hierarchical ranking function against an institutional attack, we calculate the power $f$ (here an equivalent fraction of malicious nodes) of a real-world attacker using data from the GeoLite2 Block/ASN dataset [5], and use the equilibrium formula of Equation 9 ($B_1$) to compute the proportion of Byzantine nodes that BASALT would return in such a scenario. We assume that the attacker is an Internet autonomous system (AS) that exploits all the IP addresses it owns to attack BASALT, and that a certain number of honest nodes (100, 1000, 10000) are uniformly spread amongst the remaining currently active IP addresses. Table 2 shows the power $f$ of such an attacker, supposing that the attacker is the Internet AS with the largest number of currently active addresses (106 million in the dataset we used, spread over 5739 blocks). This calculation shows that the hierarchical sampling method reduces the power of the attacker down to 21% when only 1000 honest nodes run BASALT, where it would have been above 99.99% (i.e. full control of the network) using uniform sampling. Figure 2 shows the corresponding probability that BASALT will return Byzantine nodes by applying Equation 9 for the 100 biggest Internet ASes, assuming 1000 uniformly spread honest nodes.
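The step from an equivalent fraction $f$ to a sampling probability can be sketched in Python as follows. This is a hedged illustration, not the script used to produce Figure 2: it solves the equilibrium condition $(1 - B)(B - f) = R$ for its smaller root, with $R = \rho f (1 - f) n / v^2$ taken as read in this text (with $\tau = 1$) and $n = Q / (1 - f)$ the equivalent network size. The function name and the choice to return `None` when no real root exists are assumptions of this example.

```python
import math


def stable_byzantine_share(f, q, v=100, rho=1.0):
    """Smaller root B_1 of (1 - B)(B - f) = R, or None if the roots are not real."""
    n = q / (1.0 - f)                       # equivalent network size
    r = rho * f * (1.0 - f) * n / v**2      # right-hand side of the equilibrium condition
    disc = (1.0 - f) ** 2 - 4.0 * r
    if disc < 0:
        return None                         # no equilibrium below B = 1
    return 0.5 * (1.0 + f - math.sqrt(disc))


# Equivalent fractions of the "Hierarchical" row of Table 2, keyed by Q.
hierarchical = {100: 0.47, 1000: 0.21, 10000: 0.10}
shares = {q: stable_byzantine_share(f, q) for q, f in hierarchical.items()}

# Uniform sampling with Q = 1000 (f = 99.999% in Table 2): no stable equilibrium.
uniform_share = stable_byzantine_share(0.99999, 1000)
```

In this sketch a missing real root is interpreted as the attacker eventually filling the views, which matches the qualitative reading of uniform sampling in Table 2; the numeric values in `shares` depend on the form of the equilibrium condition assumed above.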

We complement our theoretical analysis with Monte Carlo simulations that illustrate BASALT's dynamic behaviour. In this section, we focus on simulating a permissioned system with a known fraction of malicious nodes. We do not simulate the IP address distribution and use the uniform ranking function. As explained above, our observations can be transposed to a permissionless setting, where the attacker's power defined by the hierarchical ranking function plays the role of an equivalent fraction of malicious nodes. We show that BASALT consistently produces samples with fewer malicious peers than the state-of-the-art algorithms Brahms [8] and SPS [16] over a wide range of scenarios. We also show that BASALT converges faster on metrics quantifying the random connectivity of the graph generated by the algorithm, such as the clustering coefficient and mean path length. These metrics are relevant for information dissemination and may thus have an influence on the convergence time of epidemic agreement algorithms.

We evaluate the tested algorithms by simulating a system with $n$ nodes, of which a fraction $f$ implement the malicious behaviour described in Section 4.2.1. We do not simulate message loss or variable link latencies, as our model parameter $F$ (the attack force) already integrates the possibility of message loss (see Section 4.2.1), and variable link latencies can also be modeled as losing messages that arrive after a certain delay. We do not simulate node churn, but consider instead an extreme scenario in which all nodes have just joined the system; this can be seen as an ultimate churn event, in which all nodes are replaced. We vary the two parameters $v$, the view size, and $\rho$, the sampling rate, of the algorithm, as well as the force of the attack, $F$. We fix the exchange interval to $\tau = 1$ (one simulation time step). Unless stated otherwise, we use $F = 10$ and $\rho = 1$. All algorithms were implemented in the same simulation framework written in Rust, totaling about 2500 lines of code (https://github.com/basalt-rps/basalt-sim).

We compare BASALT (Algorithm 1) and its variant without the hardening mechanism (Algorithm 2, BASALT-simple) to two state-of-the-art competitors: Brahms [8] and SPS [16].

SPS was unable to function at all in the tested scenarios: for instance for $n = 1000$, $f = 30\%$, and even with an attack force $F$ of […], 90% of correct nodes rapidly become isolated in the network using SPS and remain so during the whole simulation. In contrast, both BASALT and Brahms were able to prevent all correct nodes from becoming isolated. We have thus decided to exclude SPS from our comparison charts, and concentrate on the comparison of BASALT against Brahms.

[Figure 3 plots. Panels: (a) varying the fraction $f$ of malicious nodes; (b) varying the attack force $F$; (c) varying the sampling rate $\rho$; (d) varying the view size $v$. Vertical axis in all panels: proportion of Byzantine samples (lower is better). Series: Brahms, BASALT (ours), BASALT-simple, Optimal.]

Figure 3: Our algorithm (boxes, blue) consistently provides samples that contain fewer Byzantine nodes than our competitor, Brahms, in a variety of situations. A proportion of 1 of Byzantine samples, as exhibited in Fig. 3a for the highest values of $f$, corresponds to a situation where malicious nodes are able to cause a network partition. Results shown for a network size of 10000 nodes, with a base proportion $f = 0.1$ of malicious nodes. Base values for other parameters are $v = 160$, $\rho = 1$, $F = 10$. BASALT corresponds to the complete version of our algorithm, whereas BASALT-simple corresponds to Algorithm 1 without the hardening mechanism (modifications of Algorithm 2).

To compare Brahms and BASALT on similar grounds, we add to the Brahms algorithm a mechanism that resets some of the hash functions regularly, using the same round-robin strategy as BASALT. Without such a mechanism Brahms would always return the same fixed set of samples, limiting its usability as a random peer sampling algorithm. As we will show just below, adding a reset rate to Brahms makes it less resilient to malicious nodes. In terms of communication overhead, Brahms and BASALT have the same cost. Indeed, both algorithms send a set of peer identifiers of size $v$ when replying to a pull request. For push requests, BASALT uses larger messages since Brahms does not send the view with a push message, only the sending node's identifier, whereas we send the whole view of size $v$. However, supposing $v = 200$ (the maximum in our experiment) and node identifiers of size 4 bytes (such as IPv4 addresses), the size of the communicated information is smaller than one MTU (maximum transmission unit, i.e. maximum size of a single packet, which is about 1500 bytes on the Internet), thus the same number of Internet packets need to be sent by both algorithms.

In our first experiment, we measure the number of Byzantine nodes present in correct nodes' samples on average after 200 simulation time steps. For this experiment, we simulate a network of $n = 10000$ nodes. We fix base parameter values of $f = 10\%$ of malicious nodes, a sampling rate of $\rho = 1$, a view size of $v = 160$ and an attack force of $F = 10$. We then vary the parameters $f$, $\rho$, $v$ and $F$ individually. Figure 3 shows how this proportion evolves for the three algorithms evaluated, as one of the parameters $f$, $\rho$, $v$ and $F$ varies.

Plot 3a shows how the algorithms behave when the proportion of Byzantine nodes in the system varies. BASALT provides close to optimal proportions of Byzantine samples even with many Byzantine nodes, whereas Brahms fails to contain the attack in this domain.

[Figure 4 plot. Vertical axis: convergence time (rounds). Series: Brahms, BASALT (ours), BASALT-simple.]

Figure 4: Time to convergence within 25% of optimal proportion of Byzantine samples, for $n = 1000$, $v = 100$ (on the right part, Brahms does not converge within experiment time).

Plot 3b shows the sensitivity of the algorithms to the force of the attack $F$. These plots show that BASALT is almost insensitive to $F$, whereas Brahms shows an increasing proportion of Byzantine samples when $F$ increases.

Plot 3c shows how the algorithms behave for various values of the sampling rate $\rho$. For low values of $\rho$, both Brahms and BASALT are able to converge to high quality samples; however, such a setting does not provide much utility as the algorithm is unable to frequently return new samples to the application. Increasing the sampling rate $\rho$ results, however, in more disruption of the views, where view slots have a higher risk of being reset before they converge to their target peer. This disruption causes Brahms to collapse for higher values of $\rho$: the network becomes fully disconnected, and the views of correct nodes end up completely polluted by malicious peers. This plot also shows how the hit-counter variant helps BASALT attain better states when $\rho$ is high.

Plot 3d shows how the algorithms behave for various view sizes. For small view sizes, all algorithms are unable to keep the network in a connected state and correct nodes all end up isolated. The plots show that BASALT can keep the network connected using smaller views than Brahms.

In this second experiment, we study the speed at which the algorithms converge to good network states, where they provide samples with low proportions of malicious nodes. Figure 4 shows the time that Brahms and BASALT take to converge to proportions of Byzantine samples that are within 25% of the optimal proportion, for $n = 1000$, $v = 100$, $F = 10$ and $\rho = 1$ and for varying proportions of Byzantine nodes in the network. We show that the convergence time of BASALT remains low for up to 30% of Byzantine nodes, whereas Brahms takes much longer to converge (starting at 20% of Byzantine nodes, it did not converge within the experiment's time).

[Figure 5 plots. Horizontal axis: time steps. Panels: proportion of Byzantine samples; clustering coefficient; mean path length; in-degree decile difference. Series: Brahms, BASALT (ours), BASALT-simple, and Optimal in the first panel.]

Figure 5: Algorithm convergence on several graph quality metrics, for $n = 10000$, $f = 10\%$, $F = 1$, $\rho = 0.$[…], $v = 160$. On all metrics, lower is better: we see that BASALT converges much more rapidly than Brahms.

[Figure 6 plot. Horizontal axis: view size $v$. Vertical axis: maximum achievable sampling rate. Series: Brahms, BASALT (ours), BASALT-simple.]

Figure 6: Maximum achievable sampling rate $\rho$ (see Section 5.4) for 10000 nodes, $f = 10\%$.

Figure 5 shows the evolution of several metrics through time, starting with the number of Byzantine nodes in the view, in our experiment for $n = 10000$, in a favorable situation with $f = 10\%$, $\rho = 0.$[…] and $F = 1$. These plots show that BASALT converges much faster than Brahms to a good network state (Brahms does not converge according to the previous criterion within the time of the experiment). The other plots show metrics for graph quality, where the algorithms exhibit a similar convergence behaviour: clustering coefficient, mean path length and the concentration of in-degrees measured by the difference between the last and the first decile. The clustering coefficient is computed by averaging the local clustering coefficient of correct nodes in a graph where malicious nodes are assumed to be all connected to one another. The mean path length is measured in a graph where malicious nodes are assumed to have no connection in either direction, which models the situation where they do not cooperate in transmitting information between correct nodes.
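Two of the graph quality metrics described above can be sketched in a few lines of Python. This is an illustrative sketch, not the measurement code of the simulator: the representation of the graph as a dictionary from each correct node to the list of peers in its view, the nearest-rank decile rule, and the function names are assumptions made here. Malicious nodes are simply left out of the dictionary keys, which models the case where they have no connection in either direction.

```python
from collections import deque


def mean_path_length(views):
    """Average shortest-path length over reachable ordered pairs of correct nodes."""
    total = pairs = 0
    for src in views:
        dist = {src: 0}
        queue = deque([src])
        while queue:                          # breadth-first search from src
            u = queue.popleft()
            for w in views[u]:
                if w in views and w not in dist:   # ignore peers that are not correct
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")


def in_degree_decile_spread(views):
    """Difference between the last and the first decile of in-degrees (nearest rank)."""
    indeg = {u: 0 for u in views}
    for u, peers in views.items():
        for w in set(peers):
            if w in indeg:
                indeg[w] += 1
    d = sorted(indeg.values())
    return d[int(0.9 * (len(d) - 1))] - d[int(0.1 * (len(d) - 1))]


# Tiny example: a directed ring of four correct nodes.
ring = {0: [1], 1: [2], 2: [3], 3: [0]}
ring_path = mean_path_length(ring)
ring_spread = in_degree_decile_spread(ring)
```

On a well-mixed view graph both values are expected to be small, a short mean path length indicating fast dissemination and a narrow in-degree spread indicating that no node is over-represented.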

We have seen earlier (Plot 3c) that both Brahms and BASALT are sensitive to increased sampling rates, and return more malicious samples when the sampling rate, $\rho$, is high, with Brahms failing completely for too large values of $\rho$.

To investigate this effect further, we run both algorithms for various values of $v$ and $\rho$, and plot the maximum value of $\rho$ that can be used for a given $v$ without causing a network partition. More precisely, a run for a given set of parameters $v$, $\rho$ is successful if, starting from half of the allocated simulation time, no correct node is ever isolated by the malicious peers. Otherwise it has failed. We plot the successful runs with the highest values of $\rho$ for a given $v$. The results of this experiment are shown in Plot 6 for $n = 10000$, $f = 10\%$ and $F = 10$. The areas delineated in Plot 6 correspond to the parameter sets that give successful runs. Our results show that for similar view sizes, BASALT achieves a much higher sampling rate than Brahms, thus providing more utility to the application.
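The success criterion and the selection of the highest usable sampling rate can be expressed compactly. The Python sketch below is an assumption-laden illustration: the record format `(v, rho, isolated_counts)`, where `isolated_counts[t]` is the number of isolated correct nodes at time step `t`, and the function names are hypothetical and not taken from the simulator.

```python
def run_is_successful(isolated_counts):
    """True if no correct node is isolated from the midpoint of the run onwards."""
    half = len(isolated_counts) // 2
    return all(count == 0 for count in isolated_counts[half:])


def max_achievable_rho(runs):
    """Map each view size v to the highest rho among its successful runs."""
    best = {}
    for v, rho, isolated_counts in runs:
        if run_is_successful(isolated_counts) and rho > best.get(v, float("-inf")):
            best[v] = rho
    return best


# Made-up run records, for illustration only.
example_runs = [
    (100, 0.5, [3, 1, 0, 0, 0, 0]),   # recovers before the midpoint: success
    (100, 1.0, [3, 2, 1, 0, 0, 0]),   # success
    (100, 2.0, [3, 2, 2, 1, 1, 4]),   # still isolated nodes late in the run: failure
    (160, 2.0, [1, 0, 0, 0, 0, 0]),   # success
]
frontier = max_achievable_rho(example_runs)
```

A view size with no successful run is simply absent from the result, which corresponds to a view that is too small to keep the network connected at any tested sampling rate.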

We implemented BASALT in the AvalancheGo engine [4], the main implementation of the AVA network [1], which uses the Avalanche consensus algorithm. (Our code is publicly available at https://github.com/basalt-rps/avalanchego-basalt. Our implementation is forked from the official AvalancheGo repository [21]. Our changes are identified by "Basalt RPS Authors".) We picked AVA, as it is the main cryptocurrency network that uses an epidemic, sampling-based consensus, which is the target use case of BASALT. Our implementation, a 500-line patch to the Go source code of AvalancheGo, replaces peer sampling based on stake in a proof-of-stake system by peer sampling based on BASALT, including the hierarchical ranking function described in Section 3.3.

Our implementation integrates seamlessly with the AVA protocol and is fully compatible with the existing network. Our implementation supports managing current outgoing connections according to the BASALT algorithm, instead of keeping connections open to all reachable network nodes as done by the original AvalancheGo implementation. (Unfortunately, we had to disable this behaviour as it led to too many connection attempts and some nodes appeared to have banned our IP addresses as a consequence. A simple modification allows our code to never close connections intentionally: the view maintained by BASALT is only used to sample peers for the Avalanche consensus algorithm, and connections are kept in the background to nodes that have been removed from the view.)

To show that BASALT can be applied as a sampling method that reduces the risk of an institutional attack, we ran a 10-hour experiment where we launched 100 "adversarial" Avalanche nodes on the public AVA network (corresponding to about 20% of total active nodes) in an attempt to bias sampling in their favor using a Sybil attack against one of our nodes. The nodes we launched all had IP addresses located in the same /24 prefix, owned by our research institution. Samples were measured at witness nodes running the BASALT sampling algorithm, as well as the non-hierarchical variant of BASALT and a sampling algorithm based on full network knowledge. Results shown in Table 3 show that using BASALT, the probability of sampling one of our adversarial nodes is brought to about 1%, meaning that the influence of our nodes in the network is extremely limited.

[Figure 7 plot. Axes: times sampled; cumulative sum.]

Figure 7: Behaviour of AvalancheGo modified with BASALT running on the public AVA network, 5-hour experiment. On the left, nodes that are alone in their IP prefix are sampled the most frequently. On the right, nodes that belong to IP prefixes where other network nodes are present are sampled less often.

Table 3: Observed proportion of samples that are nodes controlled by the adversary in our live experiment (see Section 6)

    Algorithm                             Adversary samples
    Full knowledge uniform sampling       18.4%
    BASALT-uniform                        17.5%
    BASALT (hierarchical)                 about 1%
    True proportion of Byzantine nodes    18.8%

To show the wider benefit of BASALT, we plot in Figure 7 the number of times the various nodes of the AVA network were sampled in the experiment. Sorting nodes by a density metric which counts the number of other nodes in the same /8, /16 and /24 prefix reveals that nodes which are isolated in their prefix (on the left of the graph) are sampled more often than nodes which share their IP prefixes with other nodes (on the right). However, all network nodes have a chance of being sampled, and no single node is sampled exceedingly more often than others. This stands in contrast with Proof-of-Stake-based sampling, where sampling frequency is proportional to the stake invested, a mechanism that gives disproportionate power to rich nodes and totally excludes nodes that are not able to invest any stake in the network.
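A density metric of this kind can be sketched with Python's standard `ipaddress` module. The text only states that the metric counts the other nodes sharing the same /8, /16 and /24 prefix; summing the three counts into a single score, as done below, is an assumption of this example, as are the function name and the sample addresses (taken from documentation and private ranges).

```python
import ipaddress
from collections import Counter


def prefix_density(addresses, prefix_lengths=(8, 16, 24)):
    """For each IPv4 address, count other addresses sharing its /8, /16 and /24 prefix."""
    unique = sorted(set(addresses))
    # Networks containing each address, one per prefix length.
    nets = {
        a: [ipaddress.ip_network(f"{a}/{p}", strict=False) for p in prefix_lengths]
        for a in unique
    }
    population = Counter(net for ns in nets.values() for net in ns)
    # Subtract one per prefix so that a node does not count itself.
    return {a: sum(population[net] - 1 for net in ns) for a, ns in nets.items()}


nodes = ["10.0.0.1", "10.0.0.2", "10.1.0.1", "192.0.2.1"]
density = prefix_density(nodes)
ordering = sorted(nodes, key=lambda a: density[a])   # most isolated nodes first
```

Sorting by this score places nodes that are alone in their prefixes first, which corresponds to the left-hand side of the plot described above.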

Random peer sampling in non-adversarial settings is a well-studied problem [15, 23]. Surprisingly, very few works have sought to develop Byzantine-tolerant RPS protocols.

State-of-the-art methods such as Brahms [8] and Secure Peer Sampling [16] (SPS) are based on a classical RPS algorithm, to which is adjoined a mechanism that tries to correct for the over-representation of malicious nodes. In Brahms, the view is not updated if a peer has received more than a certain number of push messages in a given time slot. Albeit vaguely similar to our hit counter mechanism, Brahms’ approach can only work if we assume that malicious nodes have limited total firing power and must therefore target their attack on a specific victim node. Otherwise, they would be able to simply flood the whole network with many pushes and halt the peer sampling algorithm completely. In SPS, nodes try to build some statistical knowledge on node behaviour; however, this mechanism is unable to cope with attacks where malicious nodes send so many messages that correct nodes do not have the time to gather sufficient statistics to block them before becoming isolated. Our protocol, on the other hand, can effectively handle these attacks. Moreover, the majority of these systems do not address risks that exist in real-world networks, such as Sybil attacks. To our knowledge, the only exception is HAPS [6], which is designed specifically to handle Sybil attacks. HAPS, however, only addresses Sybil attacks in which attackers are concentrated in a few IP blocks ("institutional attacks"), by using random walks on a carefully crafted probabilistic tree. Due to its design, it is not immediately clear how HAPS could be extended to counter attackers that are spread out, which BASALT does thanks to its stubborn chaotic search.

Recent works on blockchains have also brought to light the risk of attacks at a more fundamental level than those described in Section 2.2. Network adversaries are malicious entities that gain control of part of the routing infrastructure (Internet autonomous systems, or ASes), in which case they can intercept and modify all the traffic that they are routing, or that attack the routing algorithm itself by advertising Internet prefixes that they do not own, thus attracting traffic that should have gone through another path, a so-called BGP hijack [18].

Note that BGP hijacking attacks are necessarily limited to one or a few IP prefixes, as large-scale routing attacks would likely bring down large parts of the Internet and would be noticed immediately. By spreading connections over a variety of IP prefixes through its rank function, BASALT builds intrinsic resilience to these attacks, as at most a small fraction of nodes’ neighbors will be located in hijacked prefixes. In this way, the global BASALT network is not at risk of being taken down or manipulated by a malicious entity.

However, network attacks might also be used to target specific nodes, to remove them from the global network and make them believe false information about the network’s state (an Eclipse attack). Defenses have been proposed against Eclipse attacks at the network level: for instance, the SABRE network [7] proposes to use additional communication channels, in the form of a network of specialized nodes that are all connected to one another using dedicated channels, and that are located close to end-users so that they can provide a safe service directly to them even in the case of a hijack.

In the case of a blockchain, where the most crucial property to guarantee safety is that all nodes are made aware of new blocks rapidly, the SABRE method is able to help by providing reliable block delivery. For sampling-based methods that use BASALT, SABRE could provide a security mechanism at the application layer to enable detection of network attacks and stop all activity in case they happen, for instance by detecting a discrepancy between a node’s local state and the state of SABRE nodes. This mechanism, however, cannot be used to allow eclipsed nodes to make progress in such a situation, as it does not provide the secure random peer sampling service itself. Finding mechanisms to allow nodes that are eclipsed by a network attack to continue functioning normally when running a sampling-based algorithm is, to the best of our knowledge, still an open problem.

Finally, one could argue that it will be hard to bootstrap a BASALT network containing enough nodes to effectively counter botnet attacks. We note that this problem is exactly the same as in PoW-based cryptocurrencies, as an attacker that gains > 50% of the network’s hashing power can overturn the network in their favor (which is easy to do for smaller cryptocurrencies that don’t have a lot of hashing power allocated to them). A PoW-based cryptocurrency network is secured by members investing in providing lots of hashing power, as is the case e.g. for Bitcoin, in order to make a > 50% attack so costly that it is impossible in practice (or simply not worth it compared to the value of the cryptocurrency that could be stolen). A BASALT-based cryptocurrency is similarly secured by participants investing in running as many nodes as possible from many different IP prefixes, which they have an incentive to do in order to keep the system safe. Moreover, BASALT has the advantage that this investment does not require the waste of tremendous quantities of energy.

We have presented a new algorithm for Byzantine-tolerant random peer sampling on the Internet that uses biased sampling to prevent Sybil attacks. Such an algorithm can be used to implement sampling-based consensus algorithms such as Avalanche. Contrary to sampling algorithms based on Proof-of-Stake, such as those currently in use on the AVA network, BASALT allows the network to be truly open by allowing any Internet user to join the consensus without having to own any cryptocurrency tokens. We expect that in the future the line of research around Byzantine fault-tolerant algorithms based on epidemics will continue to see new developments motivated by gains in performance, and thus we believe that we have brought an important contribution to making such methods applicable in large-scale open networks.

A $C(\mathcal{C}, \mathcal{B})$ IN A BOTNET ATTACK

The probability $C(\mathcal{C}, \mathcal{B})$ depends on the distribution of correct and Byzantine identifiers across the three levels of blocks used in Equation 2. We fix one node $p$ selected randomly amongst $\mathcal{C} \cup \mathcal{B}$, and write $\mathrm{selected}(p)$ for the event that $p$ is selected by the ranking function $\mathrm{rank}_S(\cdot)$:

$$\mathrm{selected}(p) \equiv \Big( p = \operatorname*{argmin}_{q \in \mathcal{C} \cup \mathcal{B}} \mathrm{rank}_S(q) \Big). \tag{10}$$

With this notation we have $C(\mathcal{C}, \mathcal{B}) = \Pr(p \in \mathcal{C} \mid \mathrm{selected}(p))$.

In our model, a botnet attack corresponds to the (ideal) case in which Byzantine and honest nodes follow the same distribution across IP blocks. As a result, they are indistinguishable from the point of view of $\mathrm{rank}_S(\cdot)$, which means here that the events $p \in \mathcal{C}$ and $\mathrm{selected}(p)$ are independent. This independence implies that

$$C(\mathcal{C}, \mathcal{B}) = \Pr(p \in \mathcal{C} \mid \mathrm{selected}(p)) = \Pr(p \in \mathcal{C}) = \frac{|\mathcal{C}|}{|\mathcal{C}| + |\mathcal{B}|}. \tag{11}$$

B DERIVING EQUATION (5)

Based on the result from the coupon collector’s problem, the expected number of uniformly distributed (non-distinct) correct peer identifiers that must be received in order to learn $\Delta c$ new distinct correct peer identifiers amongst $Q$, when $c$ are already known, is:

$$\frac{Q}{Q-c} + \frac{Q}{Q-c-1} + \cdots + \frac{Q}{Q-c-\Delta c+1} \tag{12}$$

The number of uniformly distributed peer identifiers received between the two resets is at least the following expression:

$$\frac{k}{\rho} \cdot \frac{v}{\tau} \cdot \frac{c}{fn+c} \cdot (1-f) \tag{13}$$

where $\frac{k}{\rho}$ is the duration of the considered time slice, $v$ is the number of peer identifiers exchanged at each exchange step, $\tau$ is the time between two exchange steps, $\frac{c}{fn+c}$ is the probability that the exchange was conducted with a correct peer, and $(1-f)$ is the probability that each of the peers of the returned view is correct.

We bound the value of (12) as follows:

$$(12) \le \Delta c \, \frac{Q}{Q-c-\Delta c} \tag{14}$$

Moreover, we have $(12) \ge (13)$. Thus:

$$\Delta c \, \frac{Q}{Q-c-\Delta c} \ge \frac{k}{\rho} \cdot \frac{v}{\tau} \cdot \frac{c}{fn+c} \cdot (1-f)$$

thus

$$\Delta c \, Q \tau \rho \, (fn+c) \ge k v c \, (Q-c-\Delta c)(1-f)$$

thus

$$\Delta c \ge \frac{k v c \, (1-f)(Q-c)}{Q \tau \rho \, (fn+c) + k v c \, (1-f)}$$

which is the result of Equation (5).
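The algebra of this appendix can be checked numerically. The sketch below evaluates expressions (12), (13), the bound (14) and the final lower bound on ∆c with exact rational arithmetic. The function names and all parameter values in the usage example are hypothetical and chosen only for illustration; they are not the settings used in the paper's experiments.

```python
from fractions import Fraction


def coupon_sum(Q: int, c: int, delta_c: int) -> Fraction:
    """Expression (12): expected receptions needed to learn delta_c new identifiers."""
    return sum((Fraction(Q, Q - c - i) for i in range(delta_c)), Fraction(0))


def coupon_upper_bound(Q: int, c: int, delta_c: int) -> Fraction:
    """Right-hand side of (14); requires Q - c - delta_c > 0."""
    return delta_c * Fraction(Q, Q - c - delta_c)


def received_lower_bound(k, rho, v, tau, c, f, n) -> Fraction:
    """Expression (13): (k / rho) * (v / tau) * c / (f * n + c) * (1 - f)."""
    f = Fraction(f)
    return Fraction(k) / rho * Fraction(v) / tau * Fraction(c) / (f * n + c) * (1 - f)


def delta_c_lower_bound(k, rho, v, tau, c, f, n, Q) -> Fraction:
    """Final inequality of the appendix, i.e. the right-hand side of Equation (5)."""
    f = Fraction(f)
    numerator = k * v * c * (1 - f) * (Q - c)
    denominator = Q * tau * rho * (f * n + c) + k * v * c * (1 - f)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    k, rho, v, tau = 5, 1, 50, 1
    c, n, Q = 100, 1000, 1000
    f = Fraction(1, 10)

    bound = delta_c_lower_bound(k, rho, v, tau, c, f, n, Q)
    print("lower bound on delta_c:", float(bound))
    print("(13):", float(received_lower_bound(k, rho, v, tau, c, f, n)))
    print("(12) <= (14) for delta_c = 50:",
          coupon_sum(Q, c, 50) <= coupon_upper_bound(Q, c, 50))
```

Substituting the returned bound for ∆c in the right-hand side of (14) gives exactly the value of (13), which confirms that the last two steps of the derivation are plain rearrangements of the first inequality.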

REFERENCES

IEEE Symposium on Computers and Communications. IEEE.
[7] Maria Apostolaki, Gian Marti, Jan Müller, and Laurent Vanbever. 2019. SABRE: Protecting Bitcoin against Routing Attacks. In NDSS. 1–15.
[8] Edward Bortnikov, Maxim Gurevich, Idit Keidar, Gabriel Kliot, and Alexander Shraer. 2009. Brahms: Byzantine resilient random membership sampling. Computer Networks 53, 13 (2009), 2340–2359.
[9] Alan Demers, Dan Greene, Carl Houser, Wes Irish, John Larson, Scott Shenker, Howard Sturgis, Dan Swinehart, and Doug Terry. 1987. Epidemic algorithms for replicated database maintenance. https://dl.acm.org/citation.cfm?doid=41840.41841
[10] John R Douceur. 2002. The sybil attack. In International workshop on peer-to-peer systems. Springer, 251–260.
[11] Toby Ehrenkranz and Jun Li. 2009. On the state of IP spoofing defense. ACM Transactions on Internet Technology (TOIT) 9, 2 (2009), 1–29.
[12] Rachid Guerraoui, Petr Kuznetsov, Matteo Monti, Matej Pavlovič, and Dragos-Adrian Seredinschi. 2019. The consensus number of a cryptocurrency. In Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing. 307–316.
[13] Rachid Guerraoui, Petr Kuznetsov, Matteo Monti, Matej Pavlovič, and Dragos-Adrian Seredinschi. 2019. Scalable Byzantine reliable broadcast. In . Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik.
[14] Ethan Heilman, Alison Kendler, Aviv Zohar, and Sharon Goldberg. 2015. Eclipse attacks on bitcoin’s peer-to-peer network. In . 129–144.
[15] Márk Jelasity, Spyros Voulgaris, Rachid Guerraoui, Anne-Marie Kermarrec, and Maarten van Steen. 2007. Gossip-based Peer Sampling. ACM Trans. Comput. Syst., Article 8 (2007). https://doi.org/10.1145/1275517.1275520
[16] Gian Paolo Jesi, Alberto Montresor, and Maarten van Steen. 2010. Secure peer sampling. Computer Networks 54, 12 (2010), 2086–2098.
[17] A. Kermarrec, L. Massoulie, and A. J. Ganesh. 2003. Probabilistic reliable dissemination in large-scale systems. IEEE Transactions on Parallel and Distributed Systems 14, 3 (March 2003), 248–258. https://doi.org/10.1109/TPDS.2003.1189583
[18] Maria Apostolaki, Aviv Zohar, and Laurent Vanbever. 2017. Hijacking Bitcoin: Routing Attacks on Cryptocurrencies. In Security and Privacy (SP), 2017 IEEE Symposium on. IEEE.
[19] Satoshi Nakamoto. 2009. Bitcoin: A peer-to-peer electronic cash system.
[20] Brice Nédelec, Julian Tanke, Davide Frey, Pascal Molli, and Achour Mostéfaoui. 2018. An adaptive peer-sampling protocol for building networks of browsers. World Wide Web 21, 3 (May 2018), 629–661. https://doi.org/10.1007/s11280-017-0478-5
[21] Team Rocket. 2018. Snowflake to avalanche: A novel metastable consensus protocol family for cryptocurrencies.
[22] Atul Singh et al. 2006. Eclipse attacks on overlay networks: Threats and defenses. In IEEE INFOCOM. Citeseer.
[23] Spyros Voulgaris, Daniela Gavidia, and Maarten Van Steen. 2005. Cyclon: Inexpensive membership management for unstructured p2p overlays. Journal of Network and Systems Management 13, 2 (2005), 197–217.
[24] Spyros Voulgaris and Maarten van Steen. 2013. VICINITY: A Pinch of Randomness Brings out the Structure. In Middleware 2013 (Lecture Notes in Computer Science), David Eyers and Karsten Schwan (Eds.). Springer Berlin Heidelberg, 21–40.
[25] Fan Zhang, Ittay Eyal, Robert Escriva, Ari Juels, and Robbert Van Renesse. 2017. REM: Resource-efficient mining for blockchains. In 26th USENIX Security Symposium (USENIX Security 17)