Committee selection in DAG distributed ledgers and applications

Abstract

In this paper, we propose several solutions to the committee selection problem among participants of a DAG distributed ledger. Our methods are based on a ledger intrinsic reputation model that serves as a selection criterion. The main difficulty arises from the fact that the DAG ledger is a priori not totally ordered and that the participants need to reach a consensus on participants' reputation. Furthermore, we outline applications of the proposed protocols, including: (i) self-contained decentralized random number beacon; (ii) selection of oracles in smart contracts; (iii) applications in consensus protocols and sharding solutions. We conclude with a discussion on the security and liveness of the proposed protocols by modeling reputation with a Zipf law.



Bartosz Kuśmierz, Sebastian Müller, Angelo Capossele

IOTA Foundation, 10405 Berlin, Germany
Department of Theoretical Physics, Wroclaw University of Science and Technology, Poland
Aix Marseille Université, CNRS, Centrale Marseille, I2M - UMR 7373, 13453 Marseille, France


Keywords: decentralized systems, distributed ledgers, DAG, reputation model, security

In distributed ledger technologies (DLTs), committees play essential roles in various applications, e.g., distributed random number generators, smart contract oracles, consensus mechanisms, or scaling solutions.

The most famous example of a permissionless DLT is the Bitcoin blockchain, introduced in the whitepaper [24] by Satoshi Nakamoto. The blockchain enables network participants to reach consensus in a trustless peer-to-peer network using the so-called proof-of-work (PoW) consensus protocol. In PoW-based blockchains, participants need to solve a cryptographic puzzle to issue the next block. The higher the computing power of a participant, the higher its chances of producing the next block.

In recent years, another consensus mechanism, called proof of stake (PoS), has become popular. In contrast to PoW, in PoS-based cryptocurrencies the creator of the next block is chosen depending on its wealth or stake [31], not on its computing power.

DLTs have already found multiple applications in the financial sector, including online value transfer, digital asset management, and data marketplaces [12,42]. Successive projects inspired by Bitcoin started an entirely new field of smart contracts, which are used for settling online agreements without an intermediary, and opened the possibility of decentralized autonomous organizations [17,26].

Nevertheless, blockchain-based DLTs have problems, which become apparent when the network participants start using the full block capacity. As the number of transactions issued by users became significantly larger than what fits into a block, fees began to increase considerably. Increasing the throughput is one of the main motivations behind the DLT scaling problem, which leads to the (in-)famous blockchain trilemma [5]. The blockchain trilemma states that a DLT can have at most two out of three desired properties: scalability, security, and decentralization. The most problematic part of the trilemma for blockchains is scalability.

Proposed solutions that aim at increasing the scalability and flexibility of DLTs include increasing the block size and issuance frequency, sidechains [7], “layer-two” solutions like the Lightning Network [29], different consensus mechanisms [2], and sharding [21]. A different approach is to change the underlying data structure from a chain to a more general directed acyclic graph (DAG) [30,36].

DAGs have been adopted by a variety of DLT projects, including IOTA [30], Obyte [6], SPECTRE [38], Nano [20], and Aleph Zero [13]. While those projects utilize the DAG structure differently and have adopted different consensus mechanisms, in most of them transactions are graph vertexes that reference multiple previously issued transactions. This property ensures that the graph is acyclic and that the data structure grows with time.

An example of a project that utilizes DAGs is IOTA as described in the original whitepaper [30], which requires every transaction in the DAG, also called the Tangle, to reference exactly two other transactions directly. Transactions also reference other transactions indirectly: we say that $y$ indirectly approves $x$ if there is a directed path of references from $y$ to $x$. As the ledger grows, each accepted transaction gains indirect references, which is ensured by the default tip selection mechanism [30,36,25,18]. The number of indirect references of a given transaction is interpreted as the number of confirming transactions and plays the same role as confirming blocks in Bitcoin, i.e., the more indirect references a transaction gains, the more likely it is to remain a part of the ledger.

The currently deployed implementation of IOTA is a form of technology prototype [15], hereafter referred to as Coordinator-based IOTA, and differs from the original whitepaper version. The most crucial difference is that the consensus is based on so-called milestones, special transactions issued periodically by a privileged node called the Coordinator. Every transaction referenced (directly or indirectly) by a milestone is considered confirmed. While such a system is centralized, the authors of [35] propose a decentralized solution dubbed Coordicide, referred to as post-Coordicide IOTA. One key element of Coordicide is to replace the Coordinator with a decentralized consensus protocol called Fast Probabilistic Consensus (FPC). FPC is a scalable Byzantine-resistant voting scheme [3,23,34] in which nodes vote in rounds. In each round, participants query a random subset of online nodes in the network. A node’s opinion in the next round depends on the replies received and on a random threshold. When most of the nodes in the network use the same random threshold, the system reaches unanimity. The common random thresholds enable the protocol to break meta-stable situations [33].
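The round structure of such a voting scheme can be illustrated with a minimal Python sketch. It is not the FPC specification: the sample size, the threshold range $[0.3, 0.7]$, and all function names are assumptions made for illustration only.

```python
import random


def next_opinion(replies, threshold):
    """Adopt opinion 1 if the share of 1-replies reaches the threshold, else 0."""
    return 1 if sum(replies) / len(replies) >= threshold else 0


def fpc_round(opinions, sample_size, threshold, rng):
    """One voting round: every node queries a random subset of nodes.

    All nodes use the same `threshold`, mirroring the common random
    threshold mentioned in the text.
    """
    updated = []
    for _ in opinions:
        replies = rng.sample(opinions, sample_size)
        updated.append(next_opinion(replies, threshold))
    return updated


def run_fpc(opinions, rounds, sample_size, rng, low=0.3, high=0.7):
    """Run several rounds, drawing one common threshold per round (assumed range)."""
    for _ in range(rounds):
        threshold = rng.uniform(low, high)
        opinions = fpc_round(opinions, sample_size, threshold, rng)
    return opinions
```

Passing a seeded `random.Random` instance as `rng` makes a simulation reproducible; in a real deployment the common threshold would come from a shared randomness source rather than a local generator.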

Let us define a committee as a group of usually trustworthy nodes selected to execute a special task. We assume that participants take part in a DLT that supports a reputation system. Every node can issue transactions, which from now on we call messages. We use the name message to indicate that those objects can include generic data and not only token transfers. The reputation system is needed as a criterion to select a subgroup of all participants and to mitigate Sybil attacks, a common threat in permissionless systems.

We concentrate on the more difficult case of DAG-based DLTs, which promise better scalability and decentralization than blockchains. Unlike blockchains, which are totally ordered by nature, DAGs lack natural “reference points” that could be used to determine the reputation of the participants. The confined structure of blockchains allows for using the reputation calculated for a specific block defined on the protocol level. The complex structure of DAGs does not allow for the adoption of an analogous rule.

This paper proposes a series of committee selection mechanisms for DAG-based DLTs with a reputation system. The committee selection process runs periodically and depends on the reputation of nodes, which is stake or delegated stake. More specifically, the contributions of this paper are the following:

1. we propose several protocols to select a committee in permissionless decentralized systems;
2. we analyze the token distribution of a series of cryptocurrency projects;
3. we model the reputation distribution and analyze the security of the proposed protocols;
4. we discuss several applications of committee selection protocols, including a decentralized random number beacon, smart contracts, consensus mechanisms, and sharding solutions.

The article is organized as follows. In the next section, we discuss previous work related to this paper, mainly on different kinds of decentralized random number beacons and reputation systems. In section 3, we specify the assumptions on the DLT that are required for our protocol to work. Section 4 is devoted to the committee selection, where we give three different methods of achieving consensus on the reputation values. In section 5, we discuss the security of our proposal by modeling the reputation distribution with a Zipf law. Applications of our protocols to dRNGs, smart contracts, consensus mechanisms, and scaling are discussed in section 6. Finally, section 7 outlines further research directions and concludes the paper.

A reputation system in a DLT is any mapping that assigns real numbers to the network participants. Reputation can be objective, when all participants agree on the exact values of the reputation, or subjective, when different nodes have different perceptions of the reputation. However, for a subjective reputation to maintain its utility and play the same function as in social systems, network users should have at least an approximate consensus on its values.

All PoS consensus mechanisms induce a reputation system in which a user’s reputation equals its staked tokens [2]. In the same way, delegated PoS (DPoS) protocols, where staked tokens are delegated to other nodes [11,19], define a natural reputation system. The nodes with the highest delegated stake form a committee of block validators and produce the next blocks. Consensus among a fixed-size committee is easier to achieve than in open systems. An interesting variation on DPoS systems is mana, introduced in the post-Coordicide IOTA network [35]. In this reputation system, each issued message temporarily grants mana to a certain node.

Multiple implementations of sharding in blockchains also require a random assignment of the block validators, e.g., [8]. If validators cannot predict which shard they will be assigned to, any collusion is significantly hampered. When the network also uses a reputation system, the sharding process can be improved by assigning approximately the same total reputation to each shard. An example of such a protocol is RepChain [14], which assigns reputation to nodes based on their behavior in previous rounds.

Both randomness and reputation systems can improve the security, scalability, and liveness of DLTs. A random number beacon, as introduced by Rabin [37], is a service that broadcasts a random number at regular intervals. Randomness produced by an ideal beacon cannot be predicted before being published; however, this property is hard to achieve in centralized systems.

The development of decentralized random number generators (dRNGs) tries to address those problems. In general, a dRNG should provide unpredictable and unbiased randomness that cannot be controlled or easily biased by a single malicious actor. There are different proposals for dRNGs in the literature.

The authors of [1] discuss the extraction of randomness from public blockchains using hashes of blocks. However, certain concerns regarding those solutions have been raised in [27,32]. Other proposals, like RandHound and RandHerd, utilize publicly verifiable secret sharing schemes [40] to generate a collective key shared between committee nodes. If more than a certain threshold of partial secrets is published, the network can recover the random number, i.e., a $(t, n)$-threshold security model. Other proposals include smart contracts, e.g., RANDAO used in Ethereum [10]. Security in such solutions is achieved through the risk of fund confiscation.

An interesting research direction that can improve security and prevent manipulation of the generated randomness is verifiable delay functions (VDFs) [1]. VDFs take a fixed amount of time to compute and cannot be parallelized, but can be verified quickly [41,28]. However, VDF calculations are costly. Moreover, this approach requires further research to ensure that honest users have access to the fastest application-specific integrated circuits (ASICs) specialized in a given VDF.

We propose a protocol for committee selection in DAG-based DLTs with a subjective reputation system.
Every node has a reputation, but a priori there is no perfect consensus on the values of the reputation. We assume that each vertex of the DAG determines a view of the reputation, and subjectivity comes from the fact that there is no unique way of choosing the “reference vertex”. For example, if nodes adopt the simple rule that reputation should be calculated based on the most recently received message, then, due to network delay, two different nodes may disagree on which message is the most recent.

The committee selection is the process of appointing nodes with a sufficiently high reputation.

To perform the three phases above, we require the underlying DAG to satisfy the following properties.

P1 The DAG grows in time, and incoming messages are new vertexes of the DAG.
P2 The DAG allows for the exchange of application messages (optional).
P3 The DAG is immutable and provides a strict criterion for an approximate time of message creation.
P4 The subjective reputation of the nodes can be read from the DAG and is determined by the vertex (different nodes can read the reputation from different vertexes).
P5 Messages in the DAG are signed by the nodes and cannot be counterfeited (i.e., a malicious node cannot fake the origin of a message).

Immutability, P3, guarantees that no message can be removed from the ledger and that no new message can be added in a part of the ledger that suggests it was issued in the past (by attaching it deep into the DAG). Property P3 can be achieved using any of the following methods:

(i) Messages are equipped with enforceable timestamps. Messages with wrong timestamps are rejected by the network.
(ii) Special partial order generating messages are periodically issued into the DAG.

Enforceable timestamps, mentioned in (i), can be achieved when honest nodes automatically reject messages with timestamps too far in the past or the future. A certain level of desynchronization must be allowed to account for network delay and differences in local clocks. However, we require that the network has a specific bound beyond which no message with an older timestamp will be accepted into the ledger (even accounting for network delays). In the edge cases, when part of the network considers a timestamp valid and the rest does not, it is necessary to run a certain kind of consensus mechanism.

Partial order generating (POG) messages, (ii), can be a result of consensus among the nodes, e.g., nodes vote on them. Another option is a proof-of-authority type of consensus, where privileged “validator” nodes issue the POG messages. An example of this consensus type is Coordinator-based IOTA [15], where a particular entity called the Coordinator issues milestones. Similarly, Obyte uses main chain transactions (MCTs), which are indicated by trusted witnesses [6]. Note that when a node issues a message after the $i$-th milestone/MCT, it can be approved only by an $(i+1)$-th order generating message. This procedure generates a certain kind of logical timestamps provided by the milestones/MCTs.

There are two natural ways of determining the reputation value:

(i) the reputation is summed over all of the messages approved by a given message;
(ii) if timestamps are available, the reputation can be summed over all of the messages with timestamps smaller than the DAG vertex’s timestamp.

An example of the DAG’s reputation calculated using the method in (i) is presented in Fig. 1.

When all users have the same view on the reputation, a natural choice for the committee is to take the top $n$ reputation nodes.

An alternative option is to perform a lottery with a probability dependent on the reputation of nodes. The list of lottery participants consists of the $k > n$ nodes with the highest reputation. Using the last random number $X$ produced by the previous committee, nodes calculate the coefficients

$$q_i = f(X, \mathrm{id}_i) \cdot \frac{r_i}{\sum_{j=1}^{k} r_j},$$

where $f$ is a cryptographic hash function with values in $[0, 1]$, $i$ is the index of a node, $r_i$ is its reputation, and $\mathrm{id}_i$ is its identifier. Then, the committee members would be the $n$ nodes with the highest $q_i$.
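As an illustration, the lottery can be sketched in a few lines of Python. The sketch assumes that $f$ is realized by SHA-256 applied to the concatenation of $X$ and the node identifier and scaled to the unit interval; this choice, the tie-breaking by identifier, and the function names are assumptions and not part of the protocol description.

```python
import hashlib


def hash_to_unit_interval(x: bytes, node_id: str) -> float:
    """Map (X, id) to a number in [0, 1] using SHA-256 (an assumed choice of f)."""
    digest = hashlib.sha256(x + node_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") / 2**256


def lottery_committee(reputation: dict, x: bytes, k: int, n: int) -> list:
    """Return the n candidates with the highest q_i among the top-k reputation nodes."""
    # Candidates: the k nodes with the highest reputation (ties broken by id).
    candidates = sorted(reputation, key=lambda i: (-reputation[i], i))[:k]
    total = sum(reputation[i] for i in candidates)
    q = {i: hash_to_unit_interval(x, i) * reputation[i] / total for i in candidates}
    return sorted(candidates, key=lambda i: (-q[i], i))[:n]
```

Since every node evaluates the same deterministic function of $X$ and of the agreed reputation values, all nodes with the same view obtain the same committee.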

[Figure: a DAG rooted at the genesis, in which each message is labeled with the node and the reputation it grants, e.g., “Node A +10”.]

Fig. 1. Each message grants a particular reputation to one node. The reputations are summed over all of the messages approved by a given message. The reputation calculated for the dark blue message takes into account all blue messages: node A: 15, node B: 5, node C: 20. The reputation for the dark green message is obtained by summing over all green messages: node A: 5, node B: 25, node C: 15.
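The summation over approved messages, method (i), can be sketched as a traversal of the past cone of a reference message. The small DAG below is hypothetical and is not the one shown in Fig. 1; the data layout and the convention that the reference message itself is counted are assumptions of this sketch.

```python
from collections import defaultdict


def past_cone(dag: dict, start: str) -> set:
    """All messages approved (directly or indirectly) by `start`, including itself."""
    seen, stack = set(), [start]
    while stack:
        msg = stack.pop()
        if msg in seen:
            continue
        seen.add(msg)
        stack.extend(dag[msg]["parents"])
    return seen


def reputation_view(dag: dict, reference: str) -> dict:
    """Sum the reputation granted by every message in the past cone of `reference`."""
    view = defaultdict(int)
    for msg in past_cone(dag, reference):
        node, amount = dag[msg]["grants"]
        view[node] += amount
    return dict(view)


# Hypothetical DAG: message id -> parents and (node, granted reputation).
dag = {
    "m1": {"parents": [], "grants": ("A", 10)},
    "m2": {"parents": ["m1"], "grants": ("B", 5)},
    "m3": {"parents": ["m1"], "grants": ("C", 10)},
    "m4": {"parents": ["m2", "m3"], "grants": ("A", 5)},
}
```

Calling `reputation_view(dag, "m2")` and `reputation_view(dag, "m3")` yields different views of the same ledger, which is exactly the subjectivity that the committee selection methods have to resolve.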

Cryptocurrency projects are known for their high concentration of hashing power or (staked) tokens, and opening the possibility of low reputation nodes becoming members of the committee may lead to a decrease in security. Thus, we recommend selecting the top $n$ reputation nodes. We discuss this issue in more detail in section 5.

In the following, we present three different methods to find consensus on the nodes’ reputations and, therefore, on the members of the committee. We describe the methods for DAGs with timestamps; the corresponding versions for DAGs with POG messages are straightforward adaptations. We assume that we want to select the committee at time $t_C$, and that $D$ is the bound for accepting dRNG messages in the DAG, i.e., no honest node will accept dRNG messages with timestamps differing by more than $D$ from its local time.

The first method of committee selection requires all nodes interested in committee participation to prepare a special application message. The timestamp of this message determines the value of the reputation of the given node. Application messages can be submitted within an appropriate time window.

Let us denote the length of the application time window by $\Delta_A$; then the time window is

$$[t_C - D - \Delta_A,\; t_C - D].$$

Application messages are used to extract only the reputation of the issuing node, i.e., the reputations of two potential committee members are deduced from different messages. If a node sends more than one application message, the other participants do not consider this node for the selection process. Since messages require the node’s signature, an attacker cannot imitate this type of malicious behavior on behalf of another node.

This method of committee selection has the advantage that only online nodes can apply. Moreover, if a particular node does not want to participate in the committee, e.g., due to planned maintenance or insufficient resources, it can decide not to apply.
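The evaluation of collected application messages can be sketched as follows. The `Application` record, the field names, and the choice to count duplicates only among messages inside the window are assumptions of this sketch; signature verification (property P5) is taken for granted.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Application:
    issuer: str        # node that signed the application message
    timestamp: float   # timestamp carried by the message
    reputation: float  # issuer's reputation read at that timestamp


def select_from_applications(apps, t_c, d, delta_a, n):
    """Pick the top-n applicants among valid application messages."""
    start, end = t_c - d - delta_a, t_c - d
    in_window = [a for a in apps if start <= a.timestamp <= end]
    # Nodes that sent more than one application are not considered at all.
    counts = Counter(a.issuer for a in in_window)
    valid = [a for a in in_window if counts[a.issuer] == 1]
    ranked = sorted(valid, key=lambda a: (-a.reputation, a.issuer))
    return [a.issuer for a in ranked[:n]]
```

For example, with $t_C = 110$, $D = 10$, and $\Delta_A = 10$, only applications with timestamps in $[90, 100]$ are taken into account.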
In general, applications should not be mandatory, as this would be hard to enforce.

The committee selection procedure is open, and any node can issue an application message. However, the committee is formed from the top reputation nodes, and low reputation nodes are unlikely to get a seat. Applications issued by low reputation nodes are likely to be not only redundant but possibly problematic when the network is close to congestion. Thus, we propose the following improvements to the application process.

A node is said to be $M_\ell$ if, according to its view of reputations, it is among the top $\ell$ reputation nodes. Then nodes produce application messages according to the following rules.

If a node $x$ is $M_n$, then it issues an application at the time $t_C - D - \Delta_A$.

For $k > 1$, a node $x$ which is $M_{n \cdot k}$ but not $M_{n \cdot (k-1)}$ submits a committee application only if at the time $t_C - D - \Delta_A/(k-1)$ (according to $x$’s local time perception) there are fewer than $n$ valid application messages with a stated reputation greater than the reputation of $x$. The timestamp of such an application message should be $t_C - D - \Delta_A/(k-1)$. An example of pseudocode for scheduling the sending of an application message is presented in Algorithm 1.

We introduce checkpoints, similar to the POG messages mentioned in (ii). A checkpoint is obtained unpredictably from ordinary messages, i.e., at the time of issuance it is impossible to say that any particular message will become a checkpoint. To achieve that, we require a single random number $X$.

The reputation of the nodes is calculated for this checkpoint message. Moreover, checkpoints can suggest which nodes are online, for example, using the following rule: if the last message issued by a node $x$ in the past cone of the checkpoint is older than a certain threshold, we consider this node offline.

The random number $X$ can be the last random number produced by the previous committee, e.g., using the dRNG described in subsection 6.1, or can come from a different source.
If the random number $X$ is revealed at time $t_C$, the time at which the committee selection should start, the checkpoint is the first message with a timestamp smaller than $t_C - D - X$. Possible ties are broken in favor of the lower hash.

Algorithm 1: Application message send scheduler.

require:
    nodes_reputation(time) → descending ordered list of nodes’ reputation at input time;
    get_position(ID, ordered_list_of_nodes) → ID’s position on the given list;
    get_reputation(ID, time) → ID’s reputation at the given time;
    application_msg_with_rep_higher_than(reputation) → number of application messages in the DAG with reputation higher than the input;
input:
    t_C: time of committee selection;
    D: bound for accepting dRNG messages in the DAG;
    Δ_A: application time window length;
    max_k: maximum index value for sending an application message;
    n: size of the committee;
    my_ID: ID of the node;

if time = t_C − D − Δ_A then
    k ← 2
    while k ≤ max_k do
        wait_until(t_C − D − Δ_A/(k − 1))
        rep_list ← nodes_reputation(t_C − D − Δ_A/(k − 1))
        my_ell ← get_position(my_ID, rep_list)
        my_rep ← get_reputation(my_ID, t_C − D − Δ_A/(k − 1))
        if application_msg_with_rep_higher_than(my_rep) ≥ n then
            return
        if my_ell ≤ k ∗ n then
            send_application_message()
            return
        k ← k + 1
return

The role of the random number $X$ is to improve security. If no participant, including a potential attacker, can predict the timestamp of the future checkpoint, successful attacks are hindered or made impossible. In the following, we describe one attack scenario in the absence of the random factor $X$.

Reputation changes over time, and it is possible that an attacker had a high reputation at one point in time but lost it in the meantime. Such an attacker then tries to place the checkpoint in a part of the DAG that is favorable to them. Without the random factor $X$, the timestamp of the checkpoint would be predictable, namely $t_C - D$. An attacker could mine a message with this timestamp and a minimal hash. By placing it strategically in the DAG, the attacker could manipulate the reputation obtained by summing over all of the messages approved by the checkpoint (points P3 and (i)). This type of manipulation is presented in Fig. 2.

[Figure: timeline of the DAG with marks at $t_C - D$ and $t_C$; labels: “A malicious message with a timestamp $t_C - D$” and “Messages referenced by the malicious message”.]

Fig. 2. An example of a checkpoint manipulation in the absence of the random factor $X$. A malicious transaction (red) with a timestamp $t_C - D$ is issued “deep” in the DAG, even though honest transactions with this timestamp are placed near the blue dashed line. This placement of the checkpoint could promote the attacker’s reputation and diminish other users’ reputation, as only contributions from the pink “cone” would be used.

Note that when the DAG is equipped with POG messages, one of the POG messages can be used as a checkpoint.

Algorithm 2 describes the pseudocode for determining the committee with the checkpoint selection method.

Algorithm 2: Committee selection scheduler for the checkpoint selection method.

require:
    nodes_reputation(time) → descending ordered list of nodes’ reputation at input time;
    get_random_number() → global random number;
    top_nodes(n, rep_list) → top n nodes in terms of reputation;
input:
    t_C: time of committee selection;
    D: network delay;
    n: size of the committee;
output:
    committee: list of nodes selected as committee members;

committee ← {}
if time = t_C then
    x ← get_random_number()
    rep_list ← nodes_reputation(t_C − D − x)
    committee ← top_nodes(n, rep_list)
return committee
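The checkpoint rule can also be written as a short Python sketch. We read “the first message with a timestamp smaller than $t_C - D - X$” as the message with the latest timestamp below that cutoff; this reading, the `Message` record, and the `views` mapping from a message to the reputation view it determines are assumptions of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    msg_id: str
    timestamp: float
    msg_hash: int  # hash interpreted as an integer, used only for tie-breaking


def find_checkpoint(messages, t_c, d, x):
    """Latest message with timestamp below t_C - D - X; ties go to the lower hash."""
    cutoff = t_c - d - x
    eligible = [m for m in messages if m.timestamp < cutoff]
    if not eligible:
        return None
    return min(eligible, key=lambda m: (-m.timestamp, m.msg_hash))


def top_nodes(n, reputation):
    """Top n nodes by reputation (ties broken by node id)."""
    return sorted(reputation, key=lambda node: (-reputation[node], node))[:n]


def select_committee(messages, views, t_c, d, x, n):
    """Read the reputation view attached to the checkpoint and take the top n."""
    checkpoint = find_checkpoint(messages, t_c, d, x)
    if checkpoint is None:
        return []
    return top_nodes(n, views[checkpoint.msg_id])
```

Because the cutoff depends on $X$, which is unknown before $t_C$, a message mined in advance for a fixed timestamp is unlikely to end up as the checkpoint.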

The maximal reputation from an interval approach is a combination of the application message and checkpoint selection methods. The reputation value of a node $x$ is the maximal reputation calculated from all of the messages in the interval $[t_C - D - \Delta_A,\; t_C - D]$. This approach does not require the node $x$ to issue any special message. Note that for $\Delta_A = 0$, it reduces to checkpoint selection without the random factor.

However, the method has a higher computational complexity, since it requires the computation of the reputation for all messages issued in $[t_C - D - \Delta_A,\; t_C - D]$.

The fact that the committee is only a subset of all the nodes may decrease the robustness against malicious actors. The security of the protocol depends on the size of the committee but also on the way the reputation is distributed among the nodes.

Different protocols might define reputation and the methods of gaining it differently. This affects the concentration of reputation and makes it impossible to perform an analysis in full generality. For this reason, we propose to model the distribution of reputation using Zipf laws.

Zipf laws satisfy a universality phenomenon; they appear in numerous different fields of application and have, in particular, also been utilized to model wealth in economic models [16]. In this work, we use a Zipf law to model the proportional reputation of $N$ nodes: the $n$-th largest value $y(n)$ satisfies

$$y(n) = C(s, N)^{-1} n^{-s}, \tag{1}$$

where $C(s, N) = \sum_{n=1}^{N} n^{-s}$, $N$ is the number of nodes, and $s$ is the Zipf parameter. A convenient way to observe a Zipf law is by plotting the data on a log-log graph, with the axes being log(rank order) and log(value). The data conforms to a Zipf law to the extent that the plot is linear, and the value of $s$ can be found using linear regression.

In Fig. 3 we present the distribution of the richest accounts for a series of cryptocurrency projects, restricted to the top holders, which might be considered for the committee. We observe that most of them resemble Zipf laws. Table 1 contains estimates of the corresponding Zipf coefficients. Note that for PoS-based protocols the distribution of the reputation can, to some extent, be approximated by the distribution of the tokens. Post-Coordicide IOTA uses a reputation system called mana that shares some similarities with delegated PoS, see [35], and it is reasonable to assume that the future distribution of mana would be Zipf-like.

[Figure: log-log plot of the token distribution against rank for Bitcoin, Ethereum, IOTA, Obyte, Tether (ETH chain), EOS, and Tron.]

Fig. 3. The token distribution of selected cryptocurrency projects on a log-log scale (June 2020).

Name                 Zipf coefficient
Bitcoin              0.7628
Ethereum             0.756786
IOTA                 0.934275
Obyte                1.14361
Tether (ETH chain)   0.815054
EOS                  0.536744
Tron                 1.02043

Table 1. Coefficients of the Zipf distribution with the best fit to the token distribution of the given cryptocurrency project. Method: linear regression on a log-log scale (June 2020).
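The fitting method named in the caption of Table 1 can be sketched in plain Python. The sketch generates proportional reputations from Eq. (1) and recovers $s$ as minus the least-squares slope of log(value) against log(rank). The function names are ours, and the data used here are synthetic, not the token balances behind Table 1.

```python
import math


def zipf_shares(s: float, n_nodes: int) -> list:
    """Proportional reputation y(1), ..., y(N) from Eq. (1)."""
    norm = sum(rank ** -s for rank in range(1, n_nodes + 1))
    return [rank ** -s / norm for rank in range(1, n_nodes + 1)]


def fit_zipf_exponent(values) -> float:
    """Estimate s as minus the least-squares slope of log(value) vs. log(rank)."""
    ordered = sorted(values, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(v) for v in ordered]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return -slope
```

On exact Zipf data the fit returns the generating exponent up to rounding; on real balances the quality of the fit indicates how well the Zipf model applies.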

We assume that an adversary possesses a fraction $q$ of the total reputation and that it can freely distribute this reputation among arbitrarily many different nodes. Honest nodes share the remaining fraction $1 - q$ of the reputation. We assume that the reputation among the honest nodes is distributed according to a Zipf law.

Overtaking of the committee occurs when the attacker gets a threshold number $t$ of committee seats; the exact value of $t$ might depend on the particular application of the committee.

The cheapest way for an attacker to obtain $t$ seats in the committee is to create $t$ nodes with reputation $y(n - t + 1)(1 - q)$ each, i.e., each just matching the honest node that would otherwise hold the last of these seats. The critical value $q_c$ that allows the attacker to get $t$ seats in the committee is, therefore, given by

$$q_c = t \cdot y(n - t + 1)(1 - q_c). \tag{2}$$

Equivalently,

$$q_c = \frac{t \cdot y(n - t + 1)}{1 + t \cdot y(n - t + 1)}. \tag{3}$$

In Fig. 4 we present the critical value $q_c$ for reaching the majority in the committee.

[Figure: critical value $q_c$ plotted against the Zipf coefficient $s$ and the committee size $n$.]

Fig. 4. Minimal amount of reputation required to overtake the committee for different committee sizes and token distributions modeled with a Zipf distribution, threshold $t = \lfloor n/2 \rfloor + 1$, and total number of nodes $N = 1000$.

In this section, we describe a few applications of the proposed committee selection protocols.
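Before turning to the applications, we note that Eq. (3) is easy to evaluate numerically. The following Python sketch computes $q_c$ for a majority threshold under the Zipf model of Eq. (1); the function names and default arguments are ours.

```python
def zipf_share(rank: int, s: float, n_nodes: int) -> float:
    """Proportional reputation y(rank) of Eq. (1) for N = n_nodes."""
    norm = sum(j ** -s for j in range(1, n_nodes + 1))
    return rank ** -s / norm


def critical_fraction(n: int, s: float, n_nodes: int = 1000, t=None) -> float:
    """Critical adversarial share q_c from Eq. (3); t defaults to a majority."""
    if t is None:
        t = n // 2 + 1
    a = t * zipf_share(n - t + 1, s, n_nodes)
    return a / (1 + a)
```

For $s = 0$ all honest nodes have equal reputation and the expression reduces to $q_c = (t/N)/(1 + t/N)$, which is a convenient sanity check.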

Committee selection protocols allow for the construction of a fully decentralized random number beacon embedded in the DAG ledger structure. An advantage of such an approach is that it relies only on the distributed ledger and does not require any interaction outside the ledger.

Publication of the random numbers would naturally take place in the ledger, where possible interruptions are publicly visible. Moreover, propagation of the random numbers would use the same infrastructure as the ledger.

Gossiping among nodes in the network would decrease the communication overhead of the committee nodes, as they would not need to send the randomness to each interested user. Similarly, if the proposed protocol requires a distributed key generation (DKG) phase (or any other setup phase), messages should be exchanged publicly in the ledger. Such an approach allows for the detection of malicious or malfunctioning committee members who did not participate in the DKG phase. After time $D$, a lack of the corresponding messages can be verifiably proven using the ledger.

This procedure increases robustness in the case of a DKG phase failure. If such a failure is detected, the committee selection is repeated until the DKG phase is successful. Any later iteration of the selection process may exclude the nodes that previously did not participate in the DKG phase. Moreover, if all communication is verifiable on the ledger, such nodes can be punished, e.g., by a loss of their reputation.

Suppose that the dRNG protocol uses a $(t, n)$-threshold scheme, i.e., the $n$ committee nodes publish their parts of the secret in the form of beacon messages, and the next random number can be revealed if $t$ or more beacon messages are published. The procedure of obtaining the random number from the beacon messages requires specific calculations, e.g., Lagrange interpolation.
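To make the interpolation step concrete, the following Python sketch uses plain Shamir secret sharing over a prime field. This is a deliberately simplified stand-in: a production dRNG would rely on threshold signatures or publicly verifiable secret sharing rather than on a dealer who knows the secret, and the field size and function names are assumptions of the sketch.

```python
import random

PRIME = 2**127 - 1  # a Mersenne prime; the field size is an illustrative choice


def make_shares(secret: int, t: int, n: int, rng: random.Random) -> list:
    """Split `secret` into n shares so that any t of them recover it."""
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = sum(c * pow(x, power, PRIME) for power, c in enumerate(coeffs)) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: list) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (x_i, y_i) in enumerate(shares):
        num, den = 1, 1
        for j, (x_j, _) in enumerate(shares):
            if i != j:
                num = num * (-x_j) % PRIME
                den = den * (x_i - x_j) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8 or later).
        secret = (secret + y_i * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

Any $t$ of the $n$ shares passed to `recover_secret` give the same value, which is the property a $(t, n)$-threshold beacon relies on.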
To save every node from performing those calculations, special nodes in the network can gather beacon messages and publish collective beacon messages that already contain the random number. The public can then verify these collective beacon messages against the collective public key.

One main limitation of smart contracts is that they cannot access data outside of the ledger. So-called oracles try to address this problem by providing external data to smart contracts. When multiple parties engage in a smart contract, they have to determine an oracle. An obvious solution is for the contracting parties to agree upon the oracle. However, this poses certain problems, as oracles may be centralized points of failure and because the different parties have to find consensus on the choice of the oracle for each contract. Moreover, contractees must know in advance that the selected oracles are going to provide reliable services upon contract expiry.

The committee selections proposed in Section 3 can be used to improve the security and liveness of smart contracts. For instance, if oracles gain reputations that are recorded in the ledger, then oracles can be determined using the proposed methods. If the committee selection process occurs only upon contract termination, then the majority, if not all, of the selected oracles still provide services. A similar approach is adopted in the blockchain-based Chainlink [9]. Our paper allows for using analogous methods in DAG-based DLTs.

This protocol would also be user-friendly, as contractees are not required to know classifications of oracles. Note that specialization is very likely to occur, and different oracles will probably deliver different types of data depending on the industry and requirements.
For example, sports bets are settled with sports reputation, whereas events in a stock market may use financial reputation. Furthermore, nodes can modify smart contracts themselves to adjust the weight of the oracles' voting power or even make outcomes depend on the distribution of the oracles' opinions.
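The following sketch shows how reputation-based oracle selection and reputation-weighted settlement could fit together. It assumes a hypothetical in-memory view of ledger reputations keyed by category; the names select_oracles and weighted_outcome, the data layout, and the simple weighted-majority rule are illustrative assumptions and not part of the proposed protocols.

```python
from collections import defaultdict


def select_oracles(reputation, category, n):
    """Pick the n oracles with the highest reputation in one category.

    `reputation` maps oracle id -> {category: score}.
    """
    candidates = [
        (scores[category], oracle_id)
        for oracle_id, scores in reputation.items()
        if scores.get(category, 0) > 0
    ]
    # Descending reputation, then ascending id, so that every node
    # derives the same committee from the same ledger state.
    candidates.sort(key=lambda item: (-item[0], item[1]))
    return [(oracle_id, score) for score, oracle_id in candidates[:n]]


def weighted_outcome(committee, reports):
    """Return the reported value backed by the most reputation."""
    weight = defaultdict(float)
    for oracle_id, score in committee:
        if oracle_id in reports:  # silent oracles contribute nothing
            weight[reports[oracle_id]] += score
    if not weight:
        return None
    # Deterministic tie-break on the string form of the value.
    return max(weight.items(), key=lambda kv: (kv[1], str(kv[0])))[0]


# Hypothetical reputations: oracle C has no sports reputation at all.
reputation = {
    "A": {"sports": 50, "finance": 5},
    "B": {"sports": 30},
    "C": {"finance": 40},
    "D": {"sports": 25, "finance": 10},
    "E": {"sports": 5},
}
committee = select_oracles(reputation, "sports", 3)
reports = {"A": "home_win", "B": "draw", "D": "draw"}
outcome = weighted_outcome(committee, reports)
```

In this toy run the sports committee consists of A, B, and D, and the outcome is decided by total reputation (55 for "draw" against 50 for "home_win") rather than by the single most reputable oracle.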

The problem of consensus is much simpler in closed systems, where the number of participants is known and does not change. Unanimity algorithms that work in such an environment include [22,4,39]. However, the closed nature of such protocols makes them not very relevant for decentralization. PoW is open to any user who is willing to solve the cryptographic puzzle; the network does not require any prior knowledge of the user. An intermediate step between closed systems and entirely open, permissionless networks is a system where each user is allowed to set up a node and collect reputation, but only the most reliable nodes contribute to the consensus protocol. An example of such a protocol is EOS. In EOS, delegated stake plays the role of the reputation. The EOS blockchain database expands as a committee of validators with the highest DPoS produces blocks. Validators use a type of asynchronous Byzantine Fault Tolerance to reach consensus among themselves and propagate the new blocks to the rest of the network [19].

An illustration of the procedure of establishing consensus based on a fixed-size closed committee in open and permissionless systems is given in Fig. 5.

[Figure 5 diagram. Stages: open, decentralized protocol; open protocol with a measure of stake, honesty, importance, or reliability; closed system with a fixed number of participants. Transitions: reputation gaining; committee selection.]

Fig. 5. Process of finding a fixed-size closed committee in open permissionless systems with reputation.

Most of the DLTs with committee-based consensus use either pre-selected committees or a blockchain as the underlying database structure (EOS [19], TRON [11]). Our committee selection protocols allow for using similar methods in a more general DAG structure (as long as the DAG satisfies conditions P1-P5).

On the same note, the proposed methods can also improve scaling through sharding solutions. It is straightforward to assign validators to each shard based on their reputation. As in the case of RepChain [14], our solutions can ensure that each shard has approximately the same total validator reputation.
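One simple way to obtain shards with approximately equal total reputation is a greedy heuristic: process validators in order of decreasing reputation and always add the next one to the currently lightest shard. This is a sketch under stated assumptions; the function name assign_to_shards and the greedy rule are illustrative and are not taken from RepChain or from the protocols above, and the heuristic ignores any constraint on the number of validators per shard.

```python
import heapq


def assign_to_shards(reputation, num_shards):
    """Greedily balance total reputation across shards.

    `reputation` maps validator id -> reputation score.
    Returns one (total_reputation, [validator ids]) pair per shard.
    """
    # Min-heap of (current total, shard index); lightest shard on top.
    heap = [(0.0, k) for k in range(num_shards)]
    heapq.heapify(heap)
    members = [[] for _ in range(num_shards)]

    # Largest reputation first; ties broken by id for determinism.
    ordered = sorted(reputation.items(), key=lambda kv: (-kv[1], kv[0]))
    for validator_id, rep in ordered:
        total, k = heapq.heappop(heap)
        members[k].append(validator_id)
        heapq.heappush(heap, (total + rep, k))

    totals = [0.0] * num_shards
    for total, k in heap:
        totals[k] = total
    return [(totals[k], members[k]) for k in range(num_shards)]


# Example with Zipf-like reputations 1/i for validators v01..v12.
reputation = {f"v{i:02d}": 1.0 / i for i in range(1, 13)}
shards = assign_to_shards(reputation, 3)
```

With this rule the gap between the heaviest and the lightest shard never exceeds the largest single reputation, so the balance is only as good as the reputation distribution allows: a very dominant validator still makes its shard heavier than the others.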

In this article, we proposed a committee selection protocol embedded in a DAG distributed ledger structure. We require the DAG to be equipped with an identity and reputation system. We further assumed that the ledger is immutable after some time, i.e., no transaction can be removed from the ledger, and no transaction suggesting that it was issued a long time ago can be added to the ledger. These assumptions can be achieved by enforcement of approximately correct timestamps or POG transactions.

We further discussed methods of reading the reputation from the DAG. The methods include: (1) reputation summed over all of the messages approved by a given vertex in the DAG; (2) reputation summed over all of the messages with timestamps smaller than the timestamp of a given message.

Based on that, we proposed and discussed the following methods of committee selection: application message, checkpoint selection, and the maximal reputation method.

Furthermore, we analyzed the token distribution of a series of cryptocurrency projects, which turned out to follow a Zipf distribution with a parameter s ≈ . We used this fact to model reputation and analyzed the security of our proposal.

Then, we focused on the applications of our committee selection protocols. We proposed an application to produce a fully decentralized random number beacon, which does not require any interaction outside of the ledger. Then, we showed how it could improve the oracles in smart contracts, and finally, we discussed possible advancements in consensus mechanisms and sharding solutions.

Interesting research directions to further improve the security and liveness of the proposed protocols might involve the use of backup committees. These solutions could be used when the primary committee is not fulfilling its duties. For instance, an obvious, although more centralized, option is a pre-selected committee controlled by the community or a consortium of businesses interested in a reliable protocol.
A different option is to select another reputation-based committee from nodes with a reputation index { n + 1 , ..., n } . However, the security of this solution is debatable, as it might be easier for an attacker to overtake it. Figure 4 shows that for n = 20 and t = 11 an attacker can overtake the committee with as little as of the reputation. The value for the secondary committee would be even lower.

Other improvements to the security of the committee include giving multiple seats to the top reputation nodes in the committee, e.g., nodes from { , ..., n/ } would get double or triple identities in the committee. These adaptations increase the reputation requirement to overtake the committee. Other improvements of the liveness can include recovery mechanisms when the committee fails to deliver.

We hope that all mentioned improvements to the security and liveness will stimulate further research on this topic.

References

1. Bonneau, J., Clark, J., Goldfeder, S.: On bitcoin as a public randomness source. Cryptology ePrint Archive, Report 2015/1015 (2015), https://eprint.iacr.org/2015/1015

2. Buterin, V., Griffith, V.: Casper the Friendly Finality Gadget. ArXiv e-prints arXiv:1710.09437 (Oct 2017)

3. Capossele, A., Mueller, S., Penzkofer, A.: Robustness and efficiency of leaderless probabilistic consensus protocols within Byzantine infrastructures (2019)

4. Castro, M., Liskov, B.: Practical byzantine fault tolerance. In: Proceedings of the Third Symposium on Operating Systems Design and Implementation. p. 173–186. OSDI ’99, USENIX Association, USA (1999)

5. Chu, S., Wang, S.: The curses of blockchain decentralization (2018), https://arxiv.org/abs/1810.02937

6. Churyumov, A.: Byteball: A decentralized system for storage and transfer of value (2016), https://byteball.org/Byteball.pdf

7. Croman, K., Decker, C., Eyal, I., Gencer, A.E., Juels, A., Kosba, A., Miller, A., Saxena, P., Shi, E., Sirer, E.G., et al.: On scaling decentralized blockchains. In: International Conference on Financial Cryptography and Data Security. pp. 106–125. Springer (2016)

8. Dang, H., Dinh, T.T.A., Loghin, D., Chang, E.C., Lin, Q., Ooi, B.C.: Towards scaling blockchain systems via sharding. In: Proceedings of the 2019 International Conference on Management of Data. p. 123–140. SIGMOD ’19, Association for Computing Machinery, New York, NY, USA (2019), https://doi.org/10.1145/3299869.3319889

9. Ellis, S., Juels, A., Nazarov, S.: Chainlink a decentralized oracle network (2017), https://link.smartcontract.com/whitepaper

10. Foundation, E.: RANDAO: A DAO working as RNG of Ethereum, https://github.com/randao/randao

11. Foundation, T.: Tron advanced decentralized blockchain platform (2017), https://tron.network/static/doc/white_paper_v_2_0.pdf

12. Giudici, G., Milne, A., Vinogradov, D.: Cryptocurrencies: market analysis and perspectives. Journal of Industrial and Business Economics 47(18), 1972–4977 (2020), https://bitcoin.org/bitcoin.pdf

13. Gągol, A., Leśniak, D., Straszak, D., Świętek, M.: Aleph: Efficient Atomic Broadcast in Asynchronous Networks with Byzantine Nodes. arXiv e-prints arXiv:1908.05156 (Aug 2019)

14. Huang, C., Wang, Z., Chen, H., Hu, Q., Zhang, Q., Wang, W., Guan, X.: Repchain: A reputation based secure, fast and high incentive blockchain system via sharding. CoRR abs/1901.05741 (2019), http://arxiv.org/abs/1901.05741

15. IOTA Foundation: IOTA Reference Implementation. Github, https://github.com/iotaledger/iri/blob/master/src/main/java/com/iota/iri/conf/BaseIotaConfig.java

16. Jones, C.I.: Pareto and Piketty: The macroeconomics of top income and wealth inequality. Journal of Economic Perspectives 29(1), 29–46 (February 2015), https://github.com/EOSIO/Documentation/blob/master/TechnicalWhitePaper.md

20. LeMahieu, C.: Nano: A feeless distributed cryptocurrency network (2017), https://content.nano.org/whitepaper/Nano_Whitepaper_en.pdf

21. Luu, L., Narayanan, V., Zheng, C., Baweja, K., Gilbert, S., Saxena, P.: A secure sharding protocol for open blockchains. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. pp. 17–30. ACM (2016)

22. Miller, A., Xia, Y., Croman, K., Shi, E., Song, D.: The honey badger of bft protocols. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. p. 31–42. CCS ’16, Association for Computing Machinery, New York, NY, USA (2016), https://doi.org/10.1145/2976749.2978399

23. Müller, S., Penzkofer, A., Kuśmierz, B., Camargo, D., Buchanan, W.J.: Fast probabilistic consensus with weighted votes. In: Arai, K., Kapoor, S., Bhatia, R. (eds.) Proceedings of the Future Technologies Conference (FTC) 2020, Volume 2. pp. 360–378. Springer International Publishing, Cham (2021)

24. Nakamoto, S.: Bitcoin: A peer-to-peer electronic cash system (2008), https://bitcoin.org/bitcoin.pdf

25. Penzkofer, A., Kusmierz, B., Capossele, A., Sanders, W., Saa, O.: Parasite chain detection in the IOTA protocol (04 2020), https://arxiv.org/pdf/1803.06559.pdf

26. Peters, G.W., Panayi, E.: Understanding Modern Banking Ledgers Through Blockchain Technologies: Future of Transaction Processing and Smart Contracts on the Internet of Money, pp. 239–278. Springer International Publishing, Cham (2016), https://doi.org/10.1007/978-3-319-42448-4_13

27. Pierrot, C., Wesolowski, B.: Malleability of the blockchain’s entropy. Cryptography and Communications 10, 211–233 (2017)

28. Pietrzak, K.: Simple verifiable delay functions (2018), https://eprint.iacr.org/2018/627

29. Poon, J., Dryja, T.: The bitcoin lightning network: Scalable off-chain instant payments (2016), https://lightning.network/lightning-network-paper.pdf

30. Popov, S.: The tangle (2015), https://iota.org/IOTA_Whitepaper.pdf

31. Popov, S.: A probabilistic analysis of the Nxt forging algorithm. Ledger 1, 69–83 (Dec 2016), http://ledger.pitt.edu/ojs/ledger/article/view/46

32. Popov, S.: On a Decentralized Trustless Pseudo-Random Number Generation Algorithm. Journal of Mathematical Cryptology pp. 37–43 (2017)

33. Popov, S.: Coins, walks and FPC. Youtube (2020)

34. Popov, S., Buchanan, W.J.: FPC-BI: Fast Probabilistic Consensus within Byzantine Infrastructures. https://arxiv.org/abs/1905.10895 (2019)

35. Popov, S., Moog, H., Camargo, D., Capossele, A., Dimitrov, V., Gal, A., Greve, A., Kusmierz, B., Mueller, S., Penzkofer, A., Saa, O., Sanders, W., Vigneri, L., Welz, W., Attias, V.: The coordicide (2020), https://files.iota.org/papers/20200120_Coordicide_WP.pdf

36. Popov, S., Saa, O., Finardi, P.: Equilibria in the Tangle. ArXiv e-prints arXiv:1712.05385 (Dec 2017)

37. Rabin, M.O.: Transaction protection by beacons. Journal of Computer and System Sciences 27(2), 256–267 (1983)

38. Sompolinsky, Y., Lewenberg, Y., Zohar, A.: Spectre: A fast and scalable cryptocurrency protocol. Cryptology ePrint Archive, Report 2016/1159 (2016), https://eprint.iacr.org/2016/1159

39. Stathakopoulou, C., David, T., Vukolic, M.: Mir-BFT: High-Throughput BFT for Blockchains (06 2019)

40. Syta, E., Jovanovic, P., Kogias, E.K., Gailly, N., Gasser, L., Khoffi, I., Fischer, M.J., Ford, B.: Scalable bias-resistant distributed randomness. In: 2017 IEEE Symposium on Security and Privacy (SP). pp. 444–460 (2017)

41. Wesolowski, B.: Efficient verifiable delay functions (2018), https://eprint.iacr.org/2018/623