Smart Contract Vulnerabilities: Vulnerable Does Not Imply Exploited
Daniel Perez
Imperial College London
Benjamin Livshits
Imperial College London
Abstract
In recent years, we have seen a great deal of both academic and practical interest in the topic of vulnerabilities in smart contracts, particularly those developed for the Ethereum blockchain. While most of the work has focused on detecting vulnerable contracts, in this paper we focus on finding how many of these vulnerable contracts have actually been exploited. We survey the 23,327 vulnerable contracts reported by six recent academic projects and find that, despite the amounts at stake, only 1.98% of them have been exploited since deployment. This corresponds to at most 8,487 ETH (~1.7 million USD), or only 0.27% of the 3 million ETH (600 million USD) at stake. We explain these results by demonstrating that the funds are very concentrated in a small number of contracts which are not exploitable in practice.

(We use the exchange rate on 2020-05-16: 1 ETH = 200 USD. For consistency, any monetary amounts denominated in USD are based on this rate.)

When it comes to vulnerability research, especially as it pertains to software security, it is frequently difficult to estimate what fraction of discovered vulnerabilities are exploited in practice. However, public blockchains, with their immutability, ease of access, and what amounts to a replayable execution log for smart contracts, present an excellent opportunity for such an investigation. In this work, we aim to contrast the vulnerabilities reported in smart contracts on the Ethereum [16] blockchain with the actual exploitation of these contracts.

We collect the data shared with us by the authors of six recent papers [24, 31, 32, 35, 39, 51] focusing on finding smart contract vulnerabilities. These academic datasets are significantly bigger in scale than reports we can find in the wild and, because of the sheer number of affected contracts (23,327), represent an excellent study subject.

To make our approach more general, we express six different frequently reported vulnerability classes as Datalog queries computed over relations that represent the state of the Ethereum blockchain. The Datalog-based exploit discovery approach gives more scalability to our process; also, while others have used Datalog for static analysis formulation, we are not aware of it being used to capture the dynamic state of the blockchain over time.

We discover that the amount of smart contract exploitation that occurs in the wild is notably lower than what might be believed, given what is suggested by the sometimes sensational nature of some of the famous cryptocurrency exploits such as TheDAO [45] or the Parity wallet [14] bugs.
Contributions. Our contributions are:

• Datalog formulation. We propose a Datalog-based formulation for performing analysis over Ethereum Virtual Machine (EVM) execution traces. We use this highly scalable approach to analyze a total of more than 20 million transactions from the Ethereum blockchain to search for exploits. We highlight that our analyses run automatically based on the facts that we extract and the rules defining the vulnerabilities we cover in this paper.

• Experimental evaluation of exploitation. We analyze the vulnerabilities reported in six recently published studies and conclude that, although the number of vulnerable contracts and the amount of money at risk is very high, the amount of money actually exploited is several orders of magnitude lower. We discover that out of 23,327 vulnerable contracts worth a total of 3,124,433 ETH, 463 contracts may have been exploited for an amount of 8,487 ETH, which represents only 0.27% of the total amount at stake.

• Proposed explanations. We hypothesize that the main reason for these vast differences is that the amount of exploitable Ether is very low compared to the amount of Ether flagged vulnerable. Indeed, further analysis of the vulnerable contracts and the Ether they contain suggests that a large majority of Ether is held by only a small number of contracts, and that the vulnerabilities reported on these contracts are either false positives or not exploitable in practice. We also confirm that the set of all contracts on the Ethereum blockchain has a similar distribution of wealth to that in our dataset.

To make many of the discussions in this paper more concrete, we present a thorough investigation of the high-value contracts in Appendix A.

The Ethereum [16] platform allows its users to run “smart contracts” on its distributed infrastructure. Ethereum smart contracts are programs which define a set of rules for the governing of associated funds, typically written in a Turing-complete programming language called Solidity [19]. Solidity is similar to JavaScript, yet some notable differences are that it is strongly typed and has built-in constructs to interact with the Ethereum platform. Programs written in Solidity are compiled into low-level untyped bytecode to be executed on the Ethereum platform by the Ethereum Virtual Machine (EVM) [53]. It is important to note that it is also possible to write EVM contracts without using Solidity.

To execute a smart contract, a sender has to send a transaction to the contract and pay a fee which is derived from the contract’s computational cost, measured in units of gas. Each executed instruction consumes an agreed-upon amount of gas [53]. Consumed gas is credited to the miner of the block containing the transaction, while any unused gas is refunded to the sender. In order to avoid system failure stemming from never-terminating programs, transactions specify a gas limit for contract execution [40].
An out-of-gas exception is thrown once this limit has been reached.

Smart contracts themselves have the capability to call another account present on the Ethereum blockchain. This functionality is overloaded, as it is used both to call a function in another contract and to send Ether (ETH), the underlying currency in Ethereum, to an account. A particularity of how this works in Ethereum is that calls from within a contract, also called internal transactions, do not create new transactions and are therefore not directly recorded on-chain. This means that looking at transactions without executing them does not provide enough information to follow the flow of Ether.
In this subsection, we briefly review some of the most common vulnerability types that have been researched and reported for EVM-based smart contracts. We provide a two-letter abbreviation for each vulnerability, which we shall use throughout the remainder of this paper.
Re-Entrancy (RE). When a contract “calls” another account, it can choose the amount of gas it allows the called party to use. If the target account is a contract, it will be executed and can use the provided gas budget. If such a contract is malicious and the gas budget is high enough, it can try to call back into the caller: a re-entrant call. If the caller’s implementation is not re-entrant, for example because it did not update its internal state containing balance information before making the call, the attacker can use this vulnerability to drain funds out of the vulnerable contract [31, 35, 51]. This vulnerability was used in TheDAO exploit [45], essentially causing the Ethereum community to decide to roll back to a previous state using a hard-fork [37]. We provide more details about TheDAO exploit in Section 8.
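The following toy simulation illustrates the re-entrancy pattern described above. It is a sketch in Python rather than Solidity; the Vault class, the attack helper and the amounts are hypothetical and only model the order of operations (pay first, update state last), not gas or the EVM.

```python
class Vault:
    """Toy contract that pays out before updating its internal balances."""

    def __init__(self):
        self.balances = {}
        self.funds = 0

    def deposit(self, who, amount):
        self.balances[who] = self.balances.get(who, 0) + amount
        self.funds += amount

    def withdraw(self, who, on_receive):
        amount = self.balances.get(who, 0)
        if amount == 0 or self.funds < amount:
            return
        self.funds -= amount      # Ether leaves the contract...
        on_receive(amount)        # ...control passes to the recipient...
        self.balances[who] = 0    # ...and the balance is reset too late


def attack(vault, attacker, max_depth):
    """Re-enter withdraw from the receive callback up to max_depth times."""
    stolen = []

    def on_receive(amount):
        stolen.append(amount)
        if len(stolen) < max_depth:
            vault.withdraw(attacker, on_receive)  # re-entrant call

    vault.withdraw(attacker, on_receive)
    return sum(stolen)


vault = Vault()
vault.deposit("victim", 90)
vault.deposit("attacker", 10)
taken = attack(vault, "attacker", max_depth=5)
print(taken, vault.funds)  # the attacker deposited only 10
```

Moving the balance reset before the callback makes every nested withdraw see a zero balance, which is the usual fix for this pattern.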
Unhandled Exceptions (UE). Some low-level operations in Solidity such as send, which is used to send Ether, do not throw an exception on failure, but rather report the status by returning a boolean. If this return value is unchecked, the caller continues its execution even if the payment failed, which can easily lead to inconsistencies [15, 31, 35, 48].
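As a rough illustration of the inconsistency an unchecked return value can cause, the Python sketch below models a send that reports failure through its return value. The function names and the ledger structure are hypothetical; the sketch only contrasts ignoring and checking the result.

```python
def send(recipient_accepts, amount):
    """Model of a low-level send: failure is reported by returning False."""
    return recipient_accepts


def payout_unchecked(ledger, who, recipient_accepts):
    amount = ledger[who]
    send(recipient_accepts, amount)   # return value ignored
    ledger[who] = 0                   # marked as paid even if the send failed


def payout_checked(ledger, who, recipient_accepts):
    amount = ledger[who]
    if not send(recipient_accepts, amount):
        raise RuntimeError("send failed, state left untouched")
    ledger[who] = 0


ledger = {"alice": 100}
payout_unchecked(ledger, "alice", recipient_accepts=False)
print(ledger)  # alice's entry is cleared although nothing was transferred
```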
Locked Ether (LE). Ethereum smart contracts can, like any account on Ethereum, receive Ether. However, there are several reasons for which the received funds might get locked permanently into the contract.

One reason is that the contract may depend on another contract which has been destructed using the SELFDESTRUCT instruction of the EVM, i.e. its code has been removed and its funds transferred. If this was the only way for such a contract to send Ether, it will result in the funds being permanently locked. This is what happened in the Parity Wallet bug in November 2017, locking millions of USD worth of Ether [14]. We provide more details about it in Section 8.

There are also cases where the contract will always run out of gas when trying to send Ether, which could result in locking the contract funds. More details about such issues can be found in [24].
Transaction Order Dependency (TO). In Ethereum, multiple transactions are included in a single block, which means that the state of a contract can be updated multiple times in the same block. If the order of two transactions calling the same smart contract changes the final outcome, an attacker could exploit this property. For example, given a contract which expects participants to submit the solution to a puzzle in exchange for a reward, a malicious contract owner could reduce the amount of the reward when the transaction is submitted.
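The puzzle example can be mimicked with a small Python model in which a "block" is just an ordered list of transactions. The Puzzle class and run_block helper are hypothetical names used for illustration; the point is only that the same two transactions give different payouts depending on their order.

```python
class Puzzle:
    """Toy contract: pays the current reward to the first solver."""

    def __init__(self, reward):
        self.reward = reward
        self.paid = None

    def set_reward(self, new_reward):   # owner-only in a real contract
        if self.paid is None:
            self.reward = new_reward

    def submit_solution(self):
        if self.paid is None:
            self.paid = self.reward


def run_block(transactions):
    """Apply the transactions in order and return what the solver received."""
    puzzle = Puzzle(reward=100)
    for tx in transactions:
        tx(puzzle)
    return puzzle.paid


solver_first = run_block([Puzzle.submit_solution, lambda p: p.set_reward(1)])
owner_first = run_block([lambda p: p.set_reward(1), Puzzle.submit_solution])
print(solver_first, owner_first)
```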
Integer Overflow (IO). Integer overflow and underflow are a common type of bug in many programming languages, but in the context of Ethereum they can have very severe consequences. For example, if a loop counter were to overflow, creating an infinite loop, the funds of a contract could become completely frozen. This can be exploited by an attacker who has a way of increasing the number of iterations of the loop, for example by registering enough users to trigger an overflow.
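The loop-counter scenario can be reproduced with fixed-width arithmetic in Python. This is only a sketch: the 8-bit width, the helper names and the iteration guard are assumptions chosen to keep the demonstration small and terminating.

```python
def add_uint(a, b, bits=8):
    """Add two unsigned integers with wrap-around, as fixed-width arithmetic does."""
    return (a + b) & ((1 << bits) - 1)


def count_iterations(n_users, bits=8, guard=1000):
    """Loop over n_users with a fixed-width counter.

    Returns the number of iterations, or None if the loop would never end
    (detected here with an artificial guard so that the demo terminates).
    """
    i, steps = 0, 0
    while i < n_users:
        i = add_uint(i, 1, bits)
        steps += 1
        if steps > guard:
            return None
    return steps


print(add_uint(255, 1))        # wraps around to 0
print(count_iterations(200))   # terminates normally
print(count_iterations(300))   # the counter can never reach 300
```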
Unrestricted Action (UA). Contracts often perform authorization, by checking the sender of the message, to restrict the type of action that a user can take. Typically, only the owner of a contract should be allowed to destroy the contract or set a new owner. Such an issue can happen not only if the developer forgets to perform critical checks but also if an attacker can execute arbitrary code, for example by being able to control the address of a delegated call [32].

Figure 1: A summary of smart contract analysis tools presented in prior work. [Table contents not recoverable; surviving column headers: Name, Vulnerabilities (RE, UE, LE, TO, IO, UA), Report month, Citation.]
Smart contracts are generally designed to manipulate and hold funds denominated in Ether. This makes them very tempting attack targets, as a successful attack may allow the attacker to directly steal funds from the contract. Given the many common vulnerabilities in smart contracts, some of which we described in the previous section, a large number of tools have been developed to find them automatically [18, 35, 51]. Most of these tools analyze either the contract source code or its compiled EVM bytecode and look for known security issues, such as re-entrancy or transaction order dependency vulnerabilities. We present a summary of these different works in Figure 1. The second and third columns respectively present the reported number of contracts analyzed and contracts flagged vulnerable in each paper. The “vulnerabilities” columns show the type of vulnerabilities that each tool can check for. We present these vulnerabilities in Section 2.1 and give a more detailed description of these tools in Section 8.2.
We give the definitions used in this paper for the terms vulnerable, exploitable and exploited.

vulnerable: A contract is vulnerable if it has been flagged by a static analysis tool as such. As we will see later, this means that some contracts may be vulnerable because of a false positive.

exploitable: A contract is exploitable if it is vulnerable and the vulnerability could be exploited by an external attacker. For example, if the “vulnerability” flagged by a tool is in a function which can only be called by the owner of the contract, the contract would be vulnerable but not exploitable.

exploited: A contract is exploited if it received a transaction on Ethereum’s main network which triggered one of its vulnerabilities. Therefore, a contract can be vulnerable or even exploitable without having been exploited.

Name      Contracts analyzed  Vulnerabilities found  Ether at stake at time of report
Oyente    19,366              7,527                  1,287,032
Zeus      1,120               861                    671,188
Maian     NA                  2,691                  15.59
Securify  29,694              9,185                  724,306
MadMax    91,800              6,039                  1,114,958
teEther   784,344             1,532                  1.55

Figure 2: Summary of the contracts in our dataset.

In this paper, we analyze the vulnerable contracts reported by the following six academic papers: [35], [31], [39], [51], [24] and [32]. To collect information about the addresses analyzed and the vulnerabilities found, we reached out to the authors of the different papers.

Oyente [35] data was publicly available [34]. The authors of the other papers were kind enough to provide us with their datasets. We received all the replies within less than a week of contacting the authors.

We also reached out to the authors of [48], [30] and [15] but could not obtain their datasets, which is why we left these papers out of our analysis.

Our dataset comprises a total of 821,219 contracts, of which 23,327 contracts have been flagged as vulnerable to at least one of the six vulnerabilities described in Section 2. Although we received the data directly from the authors, the numbers of contracts analyzed usually did not match the data reported in the papers, which we show in Figure 1. We believe the two main reasons for this are authors improving their tools after publication and authors not including duplicated contracts in the data they provided us. Therefore, we present the numbers in our dataset, as well as the Ether at stake for vulnerable contracts, in Figure 2. The Ether at stake is computed by summing the balance of all the contracts flagged vulnerable. We use the balance at the time at which each paper was published rather than the current one, as it gives a better sense of the amount of Ether which could potentially have been exploited.
Taxonomy. Rather than reusing existing smart contract vulnerability taxonomies [11] as-is, we adapt them to fit the vulnerabilities analyzed by the tools in our dataset. We do not cover vulnerabilities not analyzed by at least two of the six tools. We settle on the six types of vulnerabilities described in Section 2: re-entrancy (RE), unhandled exception (UE), locked Ether (LE), transaction order dependency (TO), integer overflows (IO) and unrestricted actions (UA). As the papers we survey use different terms and slightly different definitions for each of these vulnerabilities, we map the relevant vulnerabilities to one of the six types of vulnerabilities we analyze. We show how we mapped these vulnerabilities in Figure 5.

Figure 3: Histograms that show the overlap in the contracts analyzed and flagged by examined tools. (a) Overlapping contracts analyzed. (b) Overlapping vulnerabilities flagged. [Plots not recoverable; y-axis: number of contracts.]

Tools            Total  Agreed  Disagreed  % agreement
Oyente/Securify  774    185     589        23.9%
Oyente/Zeus      104    3       101        2.88%
Zeus/Securify    108    2       106        1.85%

Figure 4: Agreement among tools for re-entrancy analysis.

Overlapping vulnerabilities.
In this subsection, we first check how much overlap there is between contracts in our dataset: how many contracts have been analyzed by multiple tools and how many contracts were flagged vulnerable by multiple tools. We note that most papers, except for [35], were written around the same period. We find that 73,627 out of 821,219 contracts have been analyzed by at least two of the tools but only 13,751 by at least three tools. In Figure 3a, we show a histogram of how many different tools analyze a single contract. In Figure 3b, we show the number of tools which flag a single contract as vulnerable to any of the analyzed vulnerabilities. The overlap for both the analyzed and the vulnerable contracts is relatively small. We assume one of the reasons is that some tools work on Solidity code [31] while other tools work on EVM bytecode [35, 51], making the population of contracts available different among tools.

We also find a lot of contradiction in the analysis of the different tools. We choose re-entrancy to illustrate this point, as it is supported by three of the tools we analyze. In Figure 4, we show the agreement between the three tools supporting re-entrancy detection. The Total column shows the total number of contracts analyzed by both tools in the Tools column and flagged by at least one of them as vulnerable to re-entrancy. Oyente and Securify agree on only 23.9% of the contracts, while Zeus agrees with each of the other tools on fewer than 3% of the contracts. This reflects the difficulty of building static analysis tools targeted at the EVM. While we are not trying to evaluate the different tools’ performance, this gives us yet another motivation to find out the impact of the reported vulnerabilities.
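The agreement percentages in Figure 4 follow directly from the Total and Agreed columns. The short Python check below recomputes them; the dictionary layout and function name are ours, and the input numbers are those of Figure 4.

```python
# (total contracts flagged by at least one tool, contracts both tools agree on)
rows = {
    "Oyente/Securify": (774, 185),
    "Oyente/Zeus": (104, 3),
    "Zeus/Securify": (108, 2),
}


def agreement(total, agreed):
    """Percentage of contracts on which both tools agree."""
    return 100.0 * agreed / total


for tools, (total, agreed) in rows.items():
    disagreed = total - agreed
    print(f"{tools}: {disagreed} disagreed, {agreement(total, agreed):.2f}% agreement")
```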
In this section, we describe in detail the different analyses we perform in order to check for exploits of the vulnerabilities described in Section 2.

To check for potential exploits, we perform bytecode-level transaction analysis, whereby we look at the code executed by the contract when carrying out a particular transaction. We use this type of analysis to detect the six types of vulnerabilities presented in Section 2.

To perform our analyses, we first retrieve transaction data for all the contracts in our dataset. Next, to perform bytecode-level analysis, we extract the execution traces for the transactions potentially affecting contracts of interest. We use the EVM’s debug functionality, which gives us the ability to replay transactions while tracing executed instructions. To speed up the data collection process, we patch the Go Ethereum client [10], as opposed to relying on the Remote Procedure Call (RPC) functionality provided by the default Ethereum client.

The extracted traces contain a list of executed instructions, as well as the state of the stack at each instruction. To analyze the traces, we encode them into a Datalog representation; Datalog is a language implementing first-order logic with recursion [29], allowing us to concisely express properties about the execution traces. We use the following domains to encode the information about the traces as Datalog facts, denoting by V the set of program variables and by A the set of Ethereum addresses. We show an overview of the facts that we collect and the relations we use to check for possible exploits in Figure 7. We highlight that our analyses run automatically based on the facts that we extract and the rules that define various violations described in subsequent sections.
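To give an intuition for what "first-order logic with recursion" means in practice, the Python sketch below evaluates a recursive rule of the kind listed in Figure 7b (a depends relation derived from is_output facts) by iterating to a fixpoint. This is a naive illustration, not the engine used in the paper; the function name and the sample variables v1 to v3 are hypothetical.

```python
def transitive_closure(is_output):
    """Naive fixpoint evaluation of the two rules
         depends(x, y) :- is_output(x, y).
         depends(x, y) :- is_output(x, z), depends(z, y).
    """
    depends = set(is_output)
    changed = True
    while changed:
        changed = False
        for (x, z) in is_output:
            for (z2, y) in list(depends):
                if z == z2 and (x, y) not in depends:
                    depends.add((x, y))
                    changed = True
    return depends


# v3 is computed from v2, which is itself computed from v1
is_output = {("v3", "v2"), ("v2", "v1")}
depends = transitive_closure(is_output)
print(sorted(depends))
```

A Datalog engine performs the same kind of iteration, typically with semi-naive evaluation so that only newly derived facts are joined at each round.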
In the EVM, as transactions are executed independently, re-entrancy issues can only occur within a single transaction. Therefore, for re-entrancy to be exploited, there must be a call to an external contract which invokes, directly or indirectly, a re-entrant callback to the calling contract. We therefore start by looking for CALL instructions in the execution traces, while keeping track of the contract currently being executed.

When CALL is executed, the address of the contract to be called as well as the value to be sent can be retrieved by inspecting the values on the stack [53]. Using this information, we can record the call(a1, a2, p) facts described in Figure 7a. We note that a contract can also create a new contract using CREATE and execute a re-entrancy attack using it [43]. We therefore treat this instruction in a similar way to CALL. Using these facts, we then use the query shown in Figure 7c to retrieve potentially malicious re-entrant calls.
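The re-entrancy query can be read as "find two distinct addresses that can each reach the other through a chain of calls". The Python sketch below implements that reading over a set of call facts given as (caller, callee, value) tuples; the function names and the sample addresses A to D are hypothetical, and create facts could be added to the same set.

```python
def call_flow(calls):
    """Pairs (a1, a2) such that a1 reaches a2 through one or more calls."""
    flow = {(a1, a2) for (a1, a2, _value) in calls}
    changed = True
    while changed:
        changed = False
        for (a1, a3, _value) in calls:
            for (b, a2) in list(flow):
                if b == a3 and (a1, a2) not in flow:
                    flow.add((a1, a2))
                    changed = True
    return flow


def reentrant_pairs(calls):
    """Distinct address pairs that call each other, directly or indirectly."""
    flow = call_flow(calls)
    return {(a1, a2) for (a1, a2) in flow if a1 != a2 and (a2, a1) in flow}


# A -> B -> C -> A forms a cycle; D is only ever called
calls = {("A", "B", 0), ("B", "C", 0), ("C", "A", 1), ("A", "D", 5)}
pairs = reentrant_pairs(calls)
print(sorted(pairs))
```

As the text notes, a pair reported this way is only potentially malicious: legitimate callbacks produce the same pattern.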
Analysis correctness. Our analysis for re-entrant calls is sound but not complete. As the EVM executes each contract in a single thread, a re-entrant call must come from a recursive call. For example, given A, B, C and D being functions, a re-entrant call could be generated with a call path such as A → B → C → A. Our tool searches for all mutually-recursive calls; it supports arbitrarily long call paths by using a recursive Datalog rule, making the analysis sound. However, we have no way of assessing whether a re-entrant call is malicious or not, which can lead to false positives.

    Oyente       ZEUS                 Securify                         MadMax                                     Maian     teEther
RE  re-entrancy  re-entrancy          no writes after call             -                                          -         -
UE  callstack    unchecked send       handled exceptions               -                                          -         -
TO  concurrency  tx order dependency  transaction ordering dependency  -                                          -         -
LE  -            failed send          Ether liquidity                  unbounded mass operation, wallet griefing  greedy    -
IO  -            integer overflow     -                                integer overflows                          -         -
UA  -            integer overflow     -                                integer overflows                          prodigal  exploitable

Figure 5: Mapping of the different vulnerabilities analyzed.

(a) Failure handling in Solidity.

    if (!addr.send(100)) { throw; }

(b) EVM instructions for failure handling.

    ; preparing call
    (0x65) CALL
    ; call result pushed on the stack
    (0x69) PUSH1 0x73
    (0x71) JUMPI ; jump to 0x73 if call was successful
    (0x72) REVERT
    (0x73) JUMPDEST

Figure 6: Correctly handled failed send.

When Solidity compiles contracts, methods to send Ether, such as send, are compiled into EVM CALL instructions. We show an example of such a call and its instruction counterpart in Figure 6. If the address passed to CALL belongs to a contract, the EVM executes the code of the contract; otherwise, it executes the necessary instructions to transfer Ether to the address. When the EVM is done executing, it pushes either 1 on the stack, if the CALL succeeded, or 0 otherwise.

To retrieve information about call results, we can therefore check for CALL instructions and use the value pushed on the stack after the call execution. The end of the call execution can be easily found by checking when the depth of the trace returns to the value it had when the CALL instruction was executed; we save this information as call_result(v, n) facts. An important edge case to consider is calls to pre-compiled contracts, which are typically inserted by the compiler and do not require their return value to be checked: they return the result of a computation, for which 0 can be a valid value, so recording them could result in false positives. As pre-compiled contracts have known addresses between 1 and 10, we choose to simply not record call_result facts for such calls.

As shown in Figure 6b, the EVM uses the JUMPI instruction to perform conditional jumps. At the time of writing, this is the only instruction available to execute conditional control flow. We therefore mark all the values used as a condition in JUMPI as in_condition. We can then check for unhandled exceptions by looking for call results which never influence a condition, using the query shown in Figure 7c.

Analysis correctness.
The analysis we perform to check for unhandled exceptions is complete but not sound. All failed calls in the execution of the program are recorded while we accumulate facts about the execution. We then use a recursive Datalog rule to check whether the call result is used directly or indirectly in a condition. We could obtain false negatives if the call result is used in a condition but the condition is not enough to prevent an exploit. However, given that the most prevalent pattern for this vulnerability is the result of send not being used at all [51], and that when the result is used, it is typically done within a require or assert expression, we hypothesize that such false negatives should be very rare.

Although there are several reasons for funds to be locked in a contract, we focus on the case where the contract relies on an external contract which does not exist anymore, as this is the pattern which had the largest financial impact on Ethereum [14]. Such a case can occur when a contract uses another contract as a library to perform some actions on its behalf. To use a contract in this way, the DELEGATECALL instruction is used instead of CALL, as the latter does not preserve call data, such as the sender or the value.

The next important part is the behavior of the EVM when trying to call a contract which does not exist anymore. When a contract is destructed, it is not completely removed per se, but its code is no longer accessible to callers. When a contract tries to call a contract which has been destructed, the call is a no-op rather than a failure, which means that the next instruction will be executed and the call will be marked as successful. To find such patterns, we collect Datalog facts about all the values of the program counter before and after every DELEGATECALL instruction. In particular, we first mark the program counter value at which the call is executed: call_entry(i1 ∈ N, a ∈ A). Then, using the same approach as for unhandled exceptions, we skip the content of the call and mark the program counter value at which the call returns: call_exit(i2 ∈ N).

If the called contract does not exist anymore, i1 + 1 = i2 must hold. Therefore, we can use the Datalog query shown in Figure 7c to retrieve the address of the destructed contract.

(a) Datalog facts.

Fact                                    Description
is_output(v1 ∈ V, v2 ∈ V)               v1 is an output of v2
size(v ∈ V, n ∈ N)                      v has n bits
is_signed(v ∈ V)                        v is signed
in_condition(v ∈ V)                     v is used in a condition
call(a1 ∈ A, a2 ∈ A, p ∈ N)             a1 calls a2 with p Ether
create(a1 ∈ A, a2 ∈ A, p ∈ N)           a1 creates a2 with p Ether
expected_result(v ∈ V, r ∈ Z)           v’s expected result is r
actual_result(v ∈ V, r ∈ Z)             v’s actual result is r
call_result(v ∈ V, n ∈ N)               v is the result of a call and has a value of n
call_entry(i ∈ N, a ∈ A)                contract a is called when program counter is i
call_exit(i ∈ N)                        program counter is i when exiting a call to a contract
tx_sstore(b ∈ N, i ∈ N, k ∈ N)          storage key k is written in transaction i of block b
tx_sload(b ∈ N, i ∈ N, k ∈ N)           storage key k is read in transaction i of block b
caller(v ∈ V, a ∈ A)                    v is the caller with address a
load_data(v ∈ V)                        v contains transaction call data
restricted_inst(v ∈ V)                  v is used by a restricted instruction
selfdestruct(v ∈ V)                     v is used in SELFDESTRUCT

(b) Datalog rule definitions.

    depends(v1 ∈ V, v2 ∈ V) :- is_output(v1, v2).
    depends(v1, v2) :- is_output(v1, v3), depends(v3, v2).
    call_flow(a1 ∈ A, a2 ∈ A, p ∈ Z) :- call(a1, a2, p).
    call_flow(a1 ∈ A, a2 ∈ A, p ∈ Z) :- create(a1, a2, p).
    call_flow(a1, a2, p) :- call(a1, a3, p), call_flow(a3, a2, _).
    inferred_size(v ∈ V, n ∈ N) :- size(v, n).
    inferred_size(v1, n) :- depends(v1, v2), size(v2, n).
    inferred_signed(v ∈ V) :- is_signed(v).
    inferred_signed(v1) :- depends(v1, v2), is_signed(v2).
    condition_flow(v ∈ V, v ∈ V) :- in_condition(v).
    condition_flow(v1, v2) :- depends(v1, v2), in_condition(v2).
    depends_caller(v ∈ V) :- caller(v2, _), depends(v, v2).
    depends_data(v ∈ V) :- load_data(v2, _), depends(v, v2).
    caller_checked(v ∈ V) :- caller(v2, _), condition_flow(v2, v3), v3 < v.

(c) Datalog queries for detecting different vulnerability classes.

Vulnerability                 Query
Re-Entrancy                   call_flow(a1, a2, p1), call_flow(a2, a1, p2), a1 ≠ a2
Unhandled Exception           call_result(v, 0), ¬condition_flow(v, _)
Transaction Order Dependency  tx_sstore(b, t1, i), tx_sload(b, t2, i), t1 ≠ t2
Locked Ether                  call_entry(i1, a), call_exit(i2), i1 + 1 = i2
Integer Overflow              actual_result(v, r1), expected_result(v, r2), r1 ≠ r2
Unrestricted Action           restricted_inst(v), depends_data(v), ¬depends_caller(v), ¬caller_checked(v) ∨ selfdestruct(v), ¬caller_checked(v)

Figure 7: Datalog setup.

Analysis correctness.
The approach we use to detect locked Ether is sound and complete for the class of locked funds vulnerability we focus on. All vulnerable contracts must have a DELEGATECALL instruction. If the issue is present and the called contract has indeed been destructed, it will always result in a no-op call. Our analysis records all of these calls and systematically checks the program counter before and after the execution, making the analysis sound and complete.
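As a sketch of how the locked Ether query can be evaluated once the facts exist, the Python function below joins call_entry and call_exit facts on the condition that the exit position immediately follows the entry position. It assumes the facts have already been extracted from one trace; how the position i is measured is left to the extraction step, and the function name and sample addresses are hypothetical.

```python
def destructed_targets(call_entries, call_exits):
    """Addresses whose delegated call returned at the very next position.

    call_entries: iterable of (i, address) pairs, one per DELEGATECALL
    call_exits:   iterable of positions i at which a call returned
    """
    exits = set(call_exits)
    return {address for (i, address) in call_entries if i + 1 in exits}


# The call at position 10 returns at 11 (no code was executed),
# while the call at position 42 returns much later.
entries = [(10, "0xlibrary"), (42, "0xother")]
exits = [11, 80]
print(destructed_targets(entries, exits))
```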
The first insight to check for exploitation of transaction ordering dependency is that at least two transactions to the same contract must be included in the same block for such an attack to be successful. Furthermore, as shown in [35] or [51], exploiting a transaction ordering dependency vulnerability requires manipulation of the contract’s storage.

The EVM has only one instruction to read from the storage, SLOAD, and one instruction to write to the storage, SSTORE. In the EVM, the location of the storage to use for both of these instructions is passed as an argument, and referred to as the storage key. This key is available on the stack at the time the instruction is called. We go through all the transactions of the contracts and each time we encounter one of these instructions, we record either tx_sload(b ∈ N, i ∈ N, k ∈ N) or tx_sstore(b ∈ N, i ∈ N, k ∈ N), where in each case b is the block number, i is the index of the transaction in the block and k is the storage key being accessed.

The essence of the rule to check for transaction order dependency issues is then to look for patterns where at least two transactions are included in the same block, with one of the transactions writing a key in the storage and another transaction reading the same key. We show the actual rule in Figure 7c.

Analysis correctness.
Our approach to check for transaction order dependencies is sound but not complete. With the definition we use, for a contract to have a transaction order dependency it must have two transactions in the same block which affect the same key in the storage. We check for all such cases, and therefore no false negatives can exist. However, determining whether there is an actual transaction order dependency requires more knowledge about how the storage is used, and our approach could therefore result in false positives.

Integer Overflow
The EVM is completely untyped and expresses everything in terms of 256-bit words. Therefore, types are handled entirely at the compilation level and there is no explicit information about the original types in any execution traces.

To check for integer overflow, we accumulate facts over two passes. In the first pass, we try to recover the sign and size of the different values on the stack. To do so, we use known invariants about the Solidity compilation process. First, any value which is the result of an instruction such as SIGNEXTEND or SDIV can be marked as signed with is_signed(v). Furthermore, SIGNEXTEND being the usual sign extension operation for two’s complement, it is passed both the value to extend and the number of bits of the value. This allows us to retrieve the size of the signed value. We assume any value not explicitly marked as signed to be unsigned. To retrieve the size of unsigned values, we use another behavior of the Solidity compiler.

To work around the lack of types in the EVM, the Solidity compiler inserts an AND instruction to “cast” unsigned integers to their correct value. For example, to emulate a uint8, the compiler inserts AND value 0xff. In the case of a “cast”, the second operand m will always be of the form $m = 16^n - 1$, $n \in \mathbb{N}$, $n = 2^p$, $p \in [1, 6]$. We use this observation to mark values with the corresponding type: uintN where $N = n \times 4$. Variable sizes are stored as size(v, n) facts.

During the second pass, we use the inferred_signed(v) and inferred_size(v, n) rules shown in Figure 7b to retrieve information about the current variable. When no information about the size can be inferred, we over-approximate it to 256 bits, the size of an EVM word. Using this information, we compute the expected value for all arithmetic instructions (e.g. ADD, MUL), as well as the actual result computed by the EVM, and store them as Datalog facts. Finally, we use the query shown in Figure 7c to find instructions which overflow.
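The two passes can be sketched in a few lines of Python. The sketch assumes that cast masks have the form 16^n − 1 with n = 2^p and p between 1 and 6, which is consistent with the uint8 and 0xff example and with N = n × 4; it also models the actual result as the sum truncated to the inferred width. The function names are hypothetical and only unsigned addition is covered.

```python
def mask_to_bits(mask):
    """Return N for a uintN cast mask of the form 16**n - 1, else None."""
    for p in range(1, 7):
        n = 2 ** p                 # number of hexadecimal digits in the mask
        if mask == 16 ** n - 1:
            return 4 * n           # each hexadecimal digit covers 4 bits
    return None


def overflows(a, b, bits=256):
    """Compare the unbounded sum with the fixed-width one."""
    expected = a + b                   # result with unbounded integers
    actual = (a + b) % (1 << bits)     # result truncated to the inferred width
    return expected != actual


print(mask_to_bits(0xff))      # mask used to emulate uint8
print(mask_to_bits(0xfff))     # not a recognized cast mask
print(overflows(200, 100, bits=8))
```

When no mask is found for a value, the default of 256 bits mirrors the over-approximation described above.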
Analysis correctness.
Our analysis for integer overflow is neither sound nor complete. The types are inferred using a heuristic based on properties of the compiler, which should work in most cases but can fail. For example, if a contract contains code which yields AND value 0xff but value is a uint32, our type inference algorithm would wrongly infer that this variable is a uint8. Such an error during type inference could cause both false positives and false negatives. However, this type of issue occurs only when the developer uses bit manipulation with a mask similar to what the Solidity compiler generates. We find that such a pattern is rare enough not to skew our data, and give an estimate of the possible number of contracts which could follow such a pattern in Section 5.5.
4.6 Unrestricted Actions

Unrestricted actions is a broad class of vulnerability, as it can include the ability to set an owner without being allowed to, destruct a contract without permission, or execute arbitrary code. As one of our main goals is to check the exploitation of vulnerable contracts, we stay close to the definitions given by previous works [32] and focus on unrestricted Ether transfer using CALL, unrestricted writes using SSTORE, and code injection using DELEGATECALL or CALLCODE.

First, we need to remind ourselves that the caller, unlike for example the call data, cannot be forged. Therefore, one of the main insights is that if an action is restricted depending on who is calling, there should be a point in the execution trace, before the restricted operation, which conditionally jumps depending on the caller. This is enough for SELFDESTRUCT but not for other instructions, as it would flag a line such as balances[msg.sender] = msg.value as vulnerable. To model this, we track whether the message sender influences the storage key or the address to call. Finally, for code injection, we check whether the passed data influences the address called by DELEGATECALL or CALLCODE.

[Figure 8: RE: Top contracts victim of re-entrancy attack and ETH amounts exploited. Columns: contract address, last transaction, amount exploited.]

Analysis correctness.
Our analysis for unrestricted actions is neither sound nor complete. We take a relatively simple approach of checking whether or not the message sender influences a condition before executing a sensitive instruction. This can result in false negatives because the check could be performed inappropriately, for example by not reverting the transaction when needed, making the analysis unsound. Furthermore, there might be some use cases where it is acceptable to allow any sender to write to the storage, but our analysis would flag such cases as vulnerable, resulting in false positives. We discuss the implications further in Section 5.6.
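As an illustration of the check, the following Python sketch walks over a simplified trace. The `Step` record and its two taint flags are hypothetical stand-ins for the facts extracted from a real execution trace, and the rule applied to DELEGATECALL and CALLCODE (flag when the target depends on call data and no caller-dependent jump precedes it) is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import List

CODE_INJECTION = {"DELEGATECALL", "CALLCODE"}
SENSITIVE = {"CALL", "SSTORE", "SELFDESTRUCT"} | CODE_INJECTION


@dataclass
class Step:
    opcode: str
    depends_on_caller: bool = False    # operand influenced by the message sender
    depends_on_calldata: bool = False  # operand influenced by the call data


def unrestricted_steps(trace: List[Step]) -> List[int]:
    """Return the indices of sensitive instructions not guarded by the caller."""
    flagged = []
    caller_checked = False
    for index, step in enumerate(trace):
        if step.opcode == "JUMPI" and step.depends_on_caller:
            # A conditional jump on the caller restricts what follows.
            caller_checked = True
        elif step.opcode in SENSITIVE and not caller_checked:
            if step.opcode == "SELFDESTRUCT":
                flagged.append(index)
            elif step.opcode in CODE_INJECTION:
                # Code injection: the called address comes from the call data.
                if step.depends_on_calldata:
                    flagged.append(index)
            elif not step.depends_on_caller:
                # CALL / SSTORE whose address or key ignores the sender.
                flagged.append(index)
    return flagged


# balances[msg.sender] = msg.value: the storage key depends on the sender.
print(unrestricted_steps([Step("SSTORE", depends_on_caller=True)]))
# A self-destruct with no prior check on the caller.
print(unrestricted_steps([Step("SELFDESTRUCT")]))
```

As in the analysis itself, the sketch cannot tell whether a caller check is correct, only whether one is present.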
As described in Section 3, the combined amount of Ether contained within all the vulnerable contracts exceeds 3 million ETH, worth 600 million USD. In this section, we present the results for each vulnerability one by one. Our results have been obtained using the methodology described in Section 4; the goal is to show how much of this money is actually at risk.
Methodology.
For each vulnerability, we perform our analysis in two steps. First, we fetch the execution traces of all the transactions up to block 10,200,000 affecting the contracts in our dataset, either directly or through internal transactions. We then run our tool to automatically find the total amount of Ether at risk and report this number. This is the amount we use to later give a total upper bound across all vulnerabilities.

In the second step, we manually analyze the contracts at risk to obtain more insight about the exploits and find interesting patterns. As analyzing all the contracts manually would be impractical, for each vulnerability we manually analyze the contracts with the highest amount of Ether at risk to better understand the reasons behind the vulnerabilities. We then present interesting findings as short case studies.
Runtime performance.
Our analysis runs in linear time and memory with respect to the number of instructions executed by a given transaction. The number of instructions varies widely between transactions, from a few hundred to a few hundred thousand, with an average of around 100,000. Our tool takes on average less than 10 ms (stddev. 20 ms) per transaction, with a maximum of less than 2 seconds for the largest transactions, which is below the timeout of 5 seconds which we set for a single transaction.

RE: Re-Entrancy

There are 4,337 contracts flagged as vulnerable to re-entrancy by [31, 35, 51], with a total of 457,073 transactions. After running the analysis described in Section 4 on all the transactions, we found a total of 116 contracts which contain re-entrant calls. To look for the monetary amount at risk, we compute the sum of the Ether sent between two contracts in transactions containing re-entrant calls. The total amount of Ether exploited using re-entrancy is 6,076 ETH, which is considerable as it represents more than 1,200,000 USD.
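The following Python sketch illustrates one way to flag re-entrant calls in the call tree of a single transaction and to sum the Ether exchanged between the two contracts involved. The `Call` record and the definition used here (a call is re-entrant when its receiver already has an active frame on the call stack) are assumptions made for this example; the actual analysis is the Datalog formulation described in Section 4.

```python
from typing import FrozenSet, List, NamedTuple, Set


class Call(NamedTuple):
    depth: int     # 1 for the top-level call of the transaction
    sender: str
    receiver: str
    value: int     # amount transferred, in wei


def reentrant_pairs(calls: List[Call]) -> Set[FrozenSet[str]]:
    """Return the pairs of contracts linked by a re-entrant call.

    Calls must be listed in execution order (pre-order of the call tree).
    """
    pairs: Set[FrozenSet[str]] = set()
    stack: List[Call] = []
    for call in calls:
        # Frames at this depth or deeper have already returned.
        del stack[call.depth - 1:]
        active = any(frame.receiver == call.receiver for frame in stack)
        if active and call.sender != call.receiver:
            pairs.add(frozenset({call.sender, call.receiver}))
        stack.append(call)
    return pairs


def ether_between_pairs(calls: List[Call]) -> int:
    """Sum the value of all calls exchanged between re-entrant pairs."""
    pairs = reentrant_pairs(calls)
    return sum(c.value for c in calls
               if frozenset({c.sender, c.receiver}) in pairs)


trace = [
    Call(1, "eoa", "attacker", 0),
    Call(2, "attacker", "victim", 0),   # withdraw()
    Call(3, "victim", "attacker", 10),  # payout triggers the fallback
    Call(4, "attacker", "victim", 0),   # re-entrant withdraw()
    Call(5, "victim", "attacker", 10),
]
print(reentrant_pairs(trace), ether_between_pairs(trace))
```

This definition also flags benign callbacks, so the resulting sum should be read as an upper bound rather than as a confirmed loss.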
Manual analysis.
We manually analyze the top contracts in terms of funds lost and present them in Figure 8. Interestingly, one of these three potential exploits has a substantial amount of Ether at stake: 5,881 ETH, which corresponds to around 1,180,000 USD. This address has already been detected as vulnerable by some recent work focusing on re-entrancy [43]. It appears that the contract, which is part of the Maker DAO [9] platform, was found vulnerable by the authors of the contract, who themselves performed an attack to confirm the risk [2].
Sanity checking.
We use two different contracts for sanity checking. First, we look at TheDAO attack, which is the most famous instance of a re-entrancy attack. Our tool detects the following re-entrancy pattern: the malicious account calls TheDAO main account, TheDAO main account calls into the reward account, and the reward account sends the reward to the malicious account, allowing it to perform the re-entrant call into TheDAO main account.

As another sanity check, we look at a contract called SpankChain [6], which is known to have recently been compromised by a re-entrancy attack. We confirm that our approach successfully marks this contract as having been the victim of a re-entrancy attack and correctly identifies the attacker contract.

Finally, we note that our tool finds all the re-entrancy patterns presented by Sereum [43], including delegated and create-based re-entrancy.

[Figure 9: UE: Top contracts affected by unhandled exceptions and ETH amounts at risk. Columns: contract address, amount at risk.]

UE: Unhandled Exceptions

There are 11,427 contracts flagged vulnerable to unhandled exceptions by [31, 35, 51] for a total of more than 3.4 million transactions, which is an order of magnitude larger than what we found for re-entrancy issues.

We find a total of 264 contracts where failed calls have not been checked for, which represents roughly 2% of the flagged contracts. The next goal is to find an upper bound on the amount of Ether at risk because of these unhandled exceptions. We define the upper bound on the money at risk to be the minimum of the balance of the contract at the time of the unhandled exception and the total amount of Ether which has failed to be sent. We then sum the upper bounds of all issues found to obtain a total upper bound. This gives us a total of 271.89 Ether at risk for unhandled exceptions.
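The upper bound can be written down in a few lines of Python. This is a sketch under one reading of the definition, in which the minimum is taken per issue; the `FailedCall` record and the sample values are hypothetical and only serve to show the arithmetic.

```python
from decimal import Decimal
from typing import Iterable, NamedTuple


class FailedCall(NamedTuple):
    failed_value: Decimal  # Ether the unchecked call failed to send
    balance: Decimal       # contract balance when the call failed


def upper_bound_at_risk(issues: Iterable[FailedCall]) -> Decimal:
    """Sum, over all issues, min(balance at failure, Ether not sent)."""
    return sum((min(i.balance, i.failed_value) for i in issues), Decimal(0))


issues = [
    FailedCall(Decimal("2.7"), Decimal("69.3")),  # bounded by the failed amount
    FailedCall(Decimal("5"), Decimal("1.5")),     # bounded by the balance
]
print(upper_bound_at_risk(issues))
```

Taking the minimum reflects that a contract cannot lose more than it held at the time, nor more than it tried to send.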
Manual analysis.
We manually analyze the top contracts and summarize their addresses and the amount at risk in Figure 9. The Solidity code is available for three of these contracts. We confirm that in all cases the issue came from a misuse of a low-level Solidity function such as send.

Investigation of the contract at :
The contract has failed to send a total of 52.90 Ether and still holds a balance of 69.3 Ether at the time of writing. After investigation, we find that the contract is an abandoned pyramid scheme [5]. The contract has a total of 21 calls which failed, all trying to send 2.7 Ether, which appears to have been the reward of the pyramid scheme at this point in time. Unfortunately, the code of this contract was not available for further inspection, but we conclude that there is a high chance that some of the users in the pyramid scheme did not correctly obtain their reward because of this issue.

https://github.com/uni-due-syssec/eth-reentrancy-attack-patterns

[Figure 10: TOD: Top contracts potentially victim of transaction ordering dependency attack. Columns: contract address, first issue, balance.]

LE: Locked Ether

There are 7,285 contracts flagged vulnerable to locked Ether by [51], [24], [39] and [31]. The contracts hold a total value of more than 1.4 million ETH, which is worth more than 200 million USD. We analyze the transactions of the contracts that could potentially be locked by conducting the analysis described in the previous section. Our tool shows that none of the contracts are actually affected by the pattern we check for, i.e., dependency on a contract which had been destructed. We note that our tool currently only covers dependency on a destructed contract as a reason for locked Ether, and patterns such as unbounded mass operations are not yet covered.
Parity wallet.
Contracts affected by the Parity wallet bug [14] were not flagged by the tools we analyzed, and are therefore not present in our dataset. As this is one of the most famous cases of locked Ether, we test our tool on the contracts affected by this bug. To find the contracts, we simply have to use the Datalog query for locked Ether in Figure 7c and insert the value of the Parity wallet address as argument a. Our results for contracts affected by the Parity bug indeed match what others had found in the past [23], with one of the contracts having as much as 306,276 ETH locked.

TO: Transaction Order Dependency

There are 1,881 contracts flagged vulnerable to transaction ordering dependency by [35] and [31]. We run the analysis described in Section 4 on their 3,002,304 transactions and obtain a total of 54 contracts potentially affected by transaction order dependency. To estimate the amount of Ether at risk, we sum up the total value of Ether sent, including by internal transactions, during all the flagged transactions, resulting in a total of 297.2 ETH at risk of transaction order dependency.
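The transaction order dependency check and the resulting estimate can be sketched in Python as follows. The `Access` record is a hypothetical representation of storage accesses extracted from traces, and the rule used (at least two distinct transactions in the same block touch the same storage key of a contract, at least one of them writing it) is an assumption made for this example.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set


class Access(NamedTuple):
    block: int
    tx: str        # transaction identifier
    contract: str
    key: int       # storage key
    is_write: bool


def flagged_transactions(accesses: List[Access]) -> Set[str]:
    """Transactions sharing a written storage key with another one in a block."""
    by_slot = defaultdict(list)
    for access in accesses:
        by_slot[(access.block, access.contract, access.key)].append(access)
    flagged: Set[str] = set()
    for group in by_slot.values():
        transactions = {a.tx for a in group}
        has_write = any(a.is_write for a in group)
        if has_write and len(transactions) > 1:
            flagged |= transactions
    return flagged


def ether_at_risk(accesses: List[Access], value_sent: Dict[str, int]) -> int:
    """Sum the Ether sent (including internally) by all flagged transactions."""
    return sum(value_sent.get(tx, 0) for tx in flagged_transactions(accesses))


accesses = [
    Access(100, "t1", "c", 1, True),
    Access(100, "t2", "c", 1, False),  # same block and key as t1
    Access(101, "t3", "c", 1, True),   # different block
    Access(100, "t4", "c", 2, False),  # different key
]
print(flagged_transactions(accesses),
      ether_at_risk(accesses, {"t1": 3, "t2": 4, "t3": 10}))
```

As the manual analysis of such contracts shows, a read that merely checks a balance is flagged by this rule too, which is why the result is an over-approximation.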
Manual analysis.
For each contract, we find the block where a transaction order dependency could have happened with the highest balance, and report the top contracts with their balance at the time of the issue in Figure 10. We manually investigated the contracts listed, all of which had their source code available. We confirmed that in all the contracts, it was possible for a user to read and write to the same storage location within a single block. We further inspected a contract called WithDrawChildDAO and found that the read was simply for users to check their balance, making the issue benign.

IO: Integer Overflow

There are 2,472 contracts flagged vulnerable to integer overflow, which accounts for a total of more than 1.2 million transactions. We run the approach we described in Section 4 to search for actual occurrences of integer overflows. It is worth noting that for integer overflow analysis we rely on properties of the Solidity compiler. To ensure that the contracts we analyze were compiled using Solidity, we fetched all the available source code for contracts flagged vulnerable to integer overflow from Etherscan [7]. Out of 2,492 contracts, 945 had their source code available and all of them were written in Solidity.
Effects of our formulation.
As mentioned in Section 4.5, some types of bit manipulation in Solidity contracts could result in our type inference heuristic failing. We use the source code we collected here to verify to what extent this could affect our analysis. We find that bit manipulation by itself is already fairly rare in Solidity, with only 244 out of the 2,492 contracts we collected using any sort of bit manipulation. Furthermore, most of the contracts using bit manipulation were using it to manipulate a variable as a bit array, and only ever retrieved a single bit at a time. Such a pattern does not affect our analysis. We found only 33 contracts which used compiler-like masks or similar values, which could actually affect our analysis. This represents about 1.3% of the number of contracts for which the source code was available.

We find a total of 62 contracts with transactions where an integer overflow might have occurred. To find the amount of Ether at stake, we analyze all the transactions which resulted in integer overflows. We approximate the amount by summing the total amount of Ether transferred in and out during a transaction containing an overflow. We find that the total of Ether at stake is 1,842 ETH. This is most likely an over-approximation, but we use this value as our upper bound.
Manual analysis.
We inspect some of the results we obtained a little further to get a better sense of what kind of cases lead to overflows. We find that a very frequent cause of overflow is actually the underflow of unsigned values. We highlight one such case in the following investigation.
Investigation of the contract at :
This contract was flagged positive to both unhandled exceptions and integer overflow by our tool. After inspection, it seems that at block height 1,364,860, the owner tried to reduce the fees but the unsigned value of the fees overflowed and became a huge number. Because of this issue, the contract was then trying to send large amounts of Ether. This resulted in failed calls which happened not to be checked, hence the flag for unhandled exceptions.

[Figure 11: Understanding the exploitation of potentially vulnerable contracts. Columns: vulnerability, vulnerable contracts, total Ether at stake, transactions analyzed, contracts exploited, % of contracts exploited, exploited Ether, % of Ether exploited. Rows: RE, UE, LE, TO, IO, UA, Total.]

UA: Unrestricted Actions

There is a total of 5,163 contracts flagged by [32, 39, 51] as vulnerable to unrestricted actions for a total of 3,871,770 transactions. We use the approach described in Section 4.6 and find a total of 42 contracts having suffered from unrestricted actions, which were all non-restricted self-destructs, but none of them held Ether at the time of the exploit.
Effects of our formulation.
As mentioned in Section 4.6, this analysis is not sound, which means we need to be cautious about false negatives. A contract could have a check on the message sender which is incorrect and be exploited but not be flagged as such. While we hypothesize that it is an edge case, it cannot be completely excluded. However, having an automated method for such a check requires knowing the intent of the programmer, for example through specifications, which is out of the scope of this work. We therefore decide to inspect the contracts in our dataset in more detail to better understand the level of exploitation.
Manual analysis.
The tool teEther flags exploitable contracts, as opposed to simply vulnerable contracts. Therefore, we expect these contracts to be more likely to have been exploited and focus on these for our manual analysis. We fetch all the historical balances of teEther contracts, retrieve the maximum amount held at any point in time, and find the total of these to equal 4,921 Ether. However, we find that 4,867 Ether belonged to 48 different contracts with the exact same bytecode, all of which had the same transaction pattern, which we describe in the following investigation.
Investigation of the contract at :
All contracts with a high historical monetary value found by teEther share the same bytecode, creator and transaction pattern as this contract. The contracts are created by the same address, receive Ether from Kraken, an exchange, and send the same amount to another address a couple of blocks later. We could not find further information about these addresses. We decompile the contract to understand how the contracts were exploitable and find that during the few blocks they held money, exploiting the contract would have been as simple as sending a transaction with the address to which to transfer the funds as argument. This is a very dangerous situation, but because the Ether was usually sent within a minute to another address, an attacker would have needed to be very proactive and use advanced tooling to exploit the contract. This illustrates well how a contract can be exploitable but have little chance of being exploited in practice.
Sanity checking.
As a sanity check, in addition to our test suite, we use one of the most famous instances of an unrestricted action: the destructed Parity wallet library contract. We analyze its transactions and successfully find the unrestricted store instruction which was used to take control of the wallet.
We summarize all our findings, including the number of contracts originally flagged, the amount of Ether at stake, and the amount actually exploited in Figure 11. The “Contracts exploited” column indicates the number of contracts which are flagged as exploited, and “% of contracts exploited” is the percentage of this number with respect to the number of contracts flagged vulnerable. The “Exploited Ether” column shows the maximum amount of Ether that could have been exploited, and the next column shows the percentage this amount represents compared to the total amount at stake. The “Total” row accounts for contracts flagged with more than one vulnerability only once.

Overall, we find that the number of contracts exploited is non-negligible, with about 2% to 4% of vulnerable contracts exploited for 4 of the 6 vulnerabilities covered in our study. However, it is important to note that the percentage of Ether exploited is an order of magnitude lower, with at most 0.4% of the Ether at stake exploited for re-entrancy. This indicates that exploited contracts are usually low-value. We will expand on this argument further in Section 7.

[Figure 12: Ether held in contracts: describing the distribution. (a) Ether held by contracts in our dataset with non-zero balance (axis: contracts count). (b) Cumulative Ether held in the 96 contracts in our dataset containing at least 10 ETH (axes: cumulative percentage of Ether, cumulative amount of Ether).]
In this section, we present the different limitations of our system, and describe how we try to mitigate them.
Soundness vs Completeness.
As for most tools such as this one, we are faced with the trade-off of soundness against completeness. Whenever possible we choose soundness over completeness: three out of six of our analyses are sound. When we cannot have a sound analysis, we are careful to only leave out cases which are unlikely to generate many false negatives. In other words, we try as much as possible to reduce the number of false negatives, even if this means increasing the number of false positives. Indeed, the main goal of our system is to provide us with an upper bound on the amount of potentially exploited Ether, which makes false negatives undesirable. Furthermore, since we manually check the high-value contracts flagged as exploited, false positives will not have a significant influence on the final results. As an example of this trade-off, for re-entrancy we flag any contract which was called using a re-entrant call and lost funds in the process. However, in some cases, it could be an expected behavior and the funds could have been transferred to an address belonging to the same entity.
Dataset.
We only run our tool on the contracts included in our dataset, which means that we might be missing some exploits which actually occurred. Indeed, our dataset contained neither the contracts affected by the Parity wallet bug nor the contract affected by TheDAO hack. However, one of the main goals of this paper is to quantify what fraction of vulnerabilities discovered by analysis tools is exploited in practice, and our current approach allows us to do exactly this.
Other types of attacks.
Our tool and analysis do not cover every existing attack on smart contracts. There are, for example, attacks targeting ERC-20 tokens [42], or other types of DoS attacks, such as wallet griefing [24]. Furthermore, some detected “exploits” could be the results of honeypots [50], but our tool does not cover such cases. Although it would be interesting to also cover such cases, we had to make a decision about the scope of the tool. Therefore, we focus on the vulnerabilities which have been the most covered in the literature, which we hypothesise is representative of how common the vulnerabilities are.
Even considering the limitations of our system, it is clear that the exploitation of smart contracts is vastly lower than what could be expected. In this section, we present some of the factors impacting the actual exploitation of smart contracts.

We believe that a major reason for the difference between the number of vulnerable contracts reported and the number of contracts exploited is the distribution of Ether among contracts. Indeed, only about 2,000 out of the 23,327 contracts in our dataset contain Ether, and most of these contracts have a balance lower than 1 ETH. We show the balance distribution of the contracts containing Ether in our dataset in Figure 12a. Furthermore, the top 10 contracts hold about 95% of the total Ether. We show the cumulative distribution of Ether within the contracts containing more than 10 ETH in Figure 12b. This shows that, as long as the top contracts cannot be exploited, the total amount of Ether that is actually at stake will be nowhere close to the upper bound of “vulnerable” Ether.

To make sure this fact generalizes to the whole Ethereum blockchain and not only our dataset, we fetch the balances of all existing contracts. This gives a total of 15,459,193 contracts. Out of these, only 463,538 contracts have a non-zero balance, which is merely 3% of all the contracts. Out of the contracts with a non-zero balance, the top 10, top 100 and top 1000 account respectively for 54%, 92% and 99% of the total amount of Ether. This shows that our dataset follows the same trend as the whole Ethereum blockchain: a very small number of contracts hold most of the wealth.
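The concentration figures above are simple ratios over a list of balances. A minimal Python sketch of the computation follows; the function name `top_share` and the toy balances are hypothetical, and only the contract counts in the last line come from the text.

```python
from typing import Iterable


def top_share(balances: Iterable[float], k: int) -> float:
    """Fraction of the total Ether held by the k richest non-empty contracts."""
    nonzero = sorted((b for b in balances if b > 0), reverse=True)
    total = sum(nonzero)
    return sum(nonzero[:k]) / total if total else 0.0


# Toy balances in ETH, chosen only to illustrate the computation.
balances = [0, 50, 30, 15, 5, 0]
print(top_share(balances, 1), top_share(balances, 2))

# Share of contracts with a non-zero balance, using the counts reported above.
print(round(463_538 / 15_459_193, 3))
```

The same function applied to the balances of all contracts with k set to 10, 100 and 1000 yields the kind of cumulative shares reported for the whole blockchain.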
Manual inspection of high value contracts.
We decide to manually inspect the top 6 contracts, in terms of balance at the time of writing, marked as vulnerable by any of the tools in our dataset. We focused on the top 6 because it happened to be the number of contracts which currently hold more than 100,000 ETH. These contracts hold a total of 1,695,240 ETH, or 83% of the total of 2,037,521 ETH currently held by all the contracts in our dataset. We give an overview of the findings here and a more in-depth version in Appendix A.
Investigation of the contract at :
The source code for this contract is not available on Etherscan. However, we discovered that it is the multi-signature wallet of the Ethereum foundation [1] and that its source code is available on GitHub [3]. We inspect the code and find that all calls require the sender of the message to be an owner. This by itself is enough to prevent any re-entrant call, as the malicious contract would have to be an owner, which does not make sense. Furthermore, although the version of Oyente used in the paper reported the re-entrancy, more recent versions of the tool no longer report this vulnerability. Therefore, we safely conclude that the re-entrancy issue was a false alert.

[Figure 13: Top six most valuable contracts flagged as vulnerable by at least one tool. Columns: address, Ether balance, deployment date, flagged vulnerabilities.]

We were able to inspect 4 of the 5 other contracts; there is only one contract for which we were unable to find any information. The second, third and fifth contracts in the list were also multi-signature wallets, and exploitation would require a majority owner to be malicious. For example, for Ether to get locked, the owners would have to agree on adding enough extra owners so that all the loops over the owners result in an out-of-gas exception. The remaining contract is known as WithDrawDAO [4]. We did not find any particular issue, but it does use a delegate pattern, which explains the locked Ether vulnerability marked by Zeus.

We present a thorough investigation of the high-value contracts in Appendix A. Overall, all the contracts from Figure 13 that we could analyze seemed quite secure, and the vulnerabilities flagged were definitely not exploitable. Although there are some very rare cases, which we present in Section 8, where funds were stolen from contracts with high Ether balances, these remain exceptions. The facts we presented up to now help us confirm that the amount of Ether at risk on the Ethereum blockchain is nowhere near as high as what is claimed [24, 31].
Some major smart contract exploits have been observed on Ethereum in recent years [45]. These attacks have been analyzed and classified [11], and many tools and techniques have emerged to prevent such attacks [21, 26]. Recent literature has also shown how attacks on Ethereum are evolving with time [55]. In this section, we will first provide details about two of the most prominent historic exploits and then present existing work aimed at increasing smart contract security.
TheDAO exploit.
TheDAO exploit [45] is one of the most infamous bugs on the Ethereum blockchain. Attackers exploited a re-entrancy vulnerability [11] of the contract which allowed for the draining of the contract's funds. The attacker contract could call the function to withdraw funds in a re-entrant manner before its balance on TheDAO was reduced, making it indeed possible to freely drain funds. A total of more than 3.5 million Ether were drained. Given the severity of the attack, the Ethereum community finally agreed on hard-forking.
Parity wallet bug.
The Parity wallet bug [14] is another prominent vulnerability on the Ethereum blockchain, which caused 280 million USD worth of Ether to be frozen on the Parity wallet account. It was due to a very simple vulnerability: a library contract used by the Parity wallet was not initialized correctly and could be destructed by anyone. Once the library was destructed, any call to the Parity wallet would then fail, effectively locking all funds.
There have been a lot of efforts to prevent such attacks and to make smart contracts more secure in general. We will here present some of the tools and techniques which have been presented in the literature and, when relevant, describe how they compare to our work.

Analysis tools can roughly be divided into two categories: static analysis and dynamic analysis tools. Using the term “static” quite loosely, static analysis tools can be defined as tools which catch bugs or vulnerabilities without the need to deploy the smart contracts. Dynamic analysis tools try to detect these by executing the deployed contracts. Our tool fits into the second category.
Static analysis tools.
Static analysis tools have been the main focus of research. This is understandable, given how critical it is to avoid vulnerabilities in a deployed contract. Most of these tools work by analyzing the bytecode or high-level code of contracts and checking for known vulnerable patterns.

Oyente [35] is one of the first tools which has been developed to analyze smart contracts. It uses symbolic execution in combination with the Z3 SMT solver [20] to check for the following vulnerabilities: transaction ordering dependency, re-entrancy and unhandled exceptions.

ZEUS [31] is a static analysis tool which works on the Solidity smart contract and not on the bytecode, making it appropriate to assist development efforts rather than to analyze deployed contracts, for which Solidity code is typically not available. ZEUS transpiles XACML-styled [46] policies to be enforced and the Solidity contract code into LLVM bitcode [33] and uses constrained Horn clauses [13, 36] over it to check that the policy is respected.

Securify [51] is a static analysis tool which checks security properties of the EVM bytecode of smart contracts. It encodes security properties as patterns written in a Datalog-like [52] domain-specific language, and checks either for compliance or violation. Securify infers semantic facts from the contract and interprets the security patterns to check for their violation or compliance by querying the inferred facts. This approach has many similarities with ours, using Datalog to express vulnerability patterns. The major difference is that Securify works on bytecode while our tool works on execution traces.

MadMax [24] has similarities with Securify, as it also encodes properties of the smart contract into Datalog, but it focuses on vulnerabilities related to gas. It is the first tool to detect “unbounded mass operations”, where a loop is bounded by a dynamic property such as the number of users, causing the contract to always run out of gas past a certain number of users. MadMax is built on top of the decompiler implemented by Vandal [15] and is performant enough to analyze all the contracts of the Ethereum blockchain in only 10 hours.

Several other static analysis tools have been developed, some, such as SmartCheck [48], being quite generic and handling many classes of vulnerabilities, and others being more domain-specific, such as Osiris [49] focusing on integer overflows, Maian [39] on unrestricted actions, or Gasper [17] on costly gas patterns. More recently, ETHBMC [22] was designed to also support inter-contract relations, cryptographic hash functions and memcopy-style operations.

Finally, there have also been some efforts to formally verify smart contracts. [28] is one of the first efforts in this direction and defines the EVM using Lem [38], which allows generating definitions for theorem provers such as Coq [12]. [25] presents a complete small-step semantics of EVM bytecode and formalizes it using the F* proof assistant [47]. A similar effort is made in [27] to give an executable formal specification of the EVM using the K Framework [44]. VerX [41] is also a recent work allowing users to write properties about smart contracts which will be formally verified by the tool.
Dynamic analysis tools.
Although dynamic analysis tools have been less studied than their static counterparts, some work has emerged in recent years.

One of the first works in this line is ContractFuzzer [30]. As its name indicates, it uses fuzzing to find vulnerabilities in smart contracts and is capable of detecting a wide range of vulnerabilities such as re-entrancy, locked Ether or unhandled exceptions. The tool generates inputs to the contract and checks, using an instrumented EVM, whether some vulnerabilities have been triggered. An important limitation of this fuzzing approach is that it requires the Application Binary Interface of the contract, which is typically not available for contracts deployed on the main Ethereum network.

Sereum [43] focuses on detecting re-entrancy exploitation at runtime by integrating checks in a modified Go Ethereum client. The tool analyzes runtime traces and uses taint analysis to ensure that no variable accessing the contract storage is used in a re-entrant call. Although there are some similarities with our tool, which also analyzes traces at runtime, Sereum focuses on re-entrancy while our tool is more generic, notably because vulnerability patterns can easily be expressed using Datalog.

teEther [32] also works at runtime but is different from the previous works presented, as it does not try to protect contracts but rather to actively find an exploit for them. It first analyzes the contract bytecode to look for critical execution paths. Critical paths are execution paths which may result in lost funds, for example by sending money to an arbitrary address or being destructed by anyone. To find these paths, it uses an approach close to Oyente [35], combining symbolic execution and Z3 to solve path constraints.

TXSPECTOR [54], which was published soon after the first version of this paper, uses a very similar approach to ours to detect re-entrancy, unchecked calls and suicidal contracts. It also leverages a Datalog approach to detect vulnerabilities but first transforms the transaction traces into a flow graph rather than adding facts about traces directly to the Datalog database. While this does add expressiveness, it makes the analysis significantly more complex, resulting in some analyses timing out on some transactions. Therefore, we believe that this approach could be complementary to ours and used to eliminate potential false positives of our approach.
Summary.
Static analysis tools are typically designed to detect vulnerable contracts, while dynamic analysis tools are designed to detect exploitable contracts. The only exception is Sereum, which detects contracts exploited using re-entrancy. Our work is, to the best of our knowledge, the first attempt to detect contracts exploited using a wide range of vulnerabilities. This is mostly orthogonal to other works and can support analysis tool development efforts by helping to understand what type of exploitation is happening in the wild.
In this paper, we surveyed the 23,327 vulnerable contracts reported by six recent academic projects. We proposed a Datalog-based formulation for performing analysis over EVM execution traces and used it to analyze a total of more than 20 million transactions executed by these contracts. We found that at most 463 out of 23,327 contracts have been subject to exploits but that at most 8,487 ETH (1.7 million USD), or only 0.27% of the 3 million ETH (600 million USD) potentially at risk, was exploited. Finally, we found that a majority of Ether is held by only a small number of contracts and that the vulnerabilities reported on these are either false positives or not exploitable in practice, thus providing a reasonable explanation for our results.

References

[1] Contract with 11,901,464 ether? What does it do?, 2015. [Online; accessed 13-October-2020].
[2] Critical ether token wrapper vulnerability - eth tokens salvaged from potential attacks, 2016. [Online; accessed 13-October-2020].
[3] Source code of the Ethereum Foundation Multisig wallet. https://github.com/ethereum/dapp-bin/blob/master/wallet/wallet.sol, 2017. [Online; accessed 13-October-2020].
[4] The DAO Refunds. https://theethereum.wiki/w/index.php/The_DAO_Refunds, 2017. [Online; accessed 13-October-2020].
[5] What’s become of the ethereumpyramid?, 2017. [Online; accessed 13-October-2020].
[6] We Got Spanked: What We Know So Far. https://medium.com/spankchain/we-got-spanked-what-we-know-so-far-d5ed3a0f38fe, 2018. [Online; accessed 13-October-2020].
[7] Etherscan — Ethereum (ETH) Blockchain Explorer. https://etherscan.io, 2019. [Online; accessed 13-October-2020].
[8] golem — Computing Power. Shared. https://golem.network/, 2019. [Online; accessed 13-October-2020].
[9] MakerDAO. https://makerdao.com/en/, 2019. [Online; accessed 13-October-2020].
[10] Official Go implementation of the Ethereum protocol. https://github.com/ethereum/go-ethereum, 2019. [Online; accessed 13-October-2020].
[11] Nicola Atzei, Massimo Bartoletti, and Tiziana Cimoli. A survey of attacks on ethereum smart contracts sok. In Proceedings of the 6th International Conference on Principles of Security and Trust - Volume 10204, 2017.
[12] Bruno Barras, Samuel Boutin, Cristina Cornes, Judicaël Courant, Jean-Christophe Filliatre, Eduardo Gimenez, Hugo Herbelin, Gerard Huet, Cesar Munoz, Chetan Murthy, et al. The Coq proof assistant reference manual: Version 6.1. PhD thesis, Inria, 1997.
[13] Nikolaj Bjørner, Ken Mcmillan, Andrey Rybalchenko, and Technische Universität München. Program verification as satisfiability modulo theories. In SMT, 2012.
[14] Lorenz Breidenbach, Phil Daian, Ari Juels, and Emin Gün Sirer. An In-Depth Look at the Parity Multisig Bug. http://hackingdistributed.com/2017/07/22/deep-dive-parity-bug/, 2017. [Online; accessed 13-October-2020].
[15] Lexi Brent, Anton Jurisevic, Michael Kong, Eric Liu, François Gauthier, Vincent Gramoli, Ralph Holz, and Bernhard Scholz. Vandal: A scalable security analysis framework for smart contracts. CoRR, abs/1809.03981, 2018.
[16] Vitalik Buterin. A next-generation smart contract and decentralized application platform. Ethereum, (January), 2014.
[17] Ting Chen, Xiaoqi Li, Xiapu Luo, and Xiaosong Zhang. Under-optimized smart contracts devour your money. SANER 2017 - 24th IEEE International Conference on Software Analysis, Evolution, and Reengineering, 2017.
[18] ConsenSys. Mythril Classic. https://github.com/ConsenSys/mythril-classic, 2019. [Online; accessed 13-October-2020].
[19] Chris Dannen. Introducing Ethereum and Solidity: Foundations of Cryptocurrency and Blockchain Programming for Beginners. Apress, Berkeley, CA, USA, 1st edition, 2017.
[20] Leonardo De Moura and Nikolaj Bjørner. Z3: An efficient smt solver. In International conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer, 2008.
[21] Ardit Dika. Ethereum Smart Contracts: Security Vulnerabilities and Security Tools. (December), 2017.
[22] Joel Frank, Cornelius Aschermann, and Thorsten Holz. ETHBMC: A bounded model checker for smart contracts. In , August 2020.
[23] Max Galka. Multisig wallets affected by the Parity wallet bug. https://github.com/elementus-io/parity-wallet-freeze, 2017. [Online; accessed 13-October-2020].
[24] Neville Grech, Michael Kong, Anton Jurisevic, Lexi Brent, Bernhard Scholz, and Yannis Smaragdakis. Madmax: Surviving out-of-gas conditions in ethereum smart contracts. Proceedings of the ACM on Programming Languages, (OOPSLA), October 2018.
[25] Ilya Grishchenko, Matteo Maffei, and Clara Schneidewind. A semantic framework for the security analysis of ethereum smart contracts. In Principles of Security and Trust, Cham, 2018.
[26] Dominik Harz and William Knottenbelt. Towards Safer Smart Contracts: A Survey of Languages and Verification Methods. arXiv preprint arXiv:1809.09805, 2018.
[27] E. Hildenbrandt, M. Saxena, N. Rodrigues, X. Zhu, P. Daian, D. Guth, B. Moore, D. Park, Y. Zhang, A. Stefanescu, and G. Rosu. Kevm: A complete formal semantics of the ethereum virtual machine. In , 2018.
[28] Yoichi Hirai. Defining the Ethereum Virtual Machine for Interactive Theorem Provers. In Workshop on Trusted Smart Contracts, 2017.
[29] Neil Immerman. Descriptive complexity. Graduate texts in computer science. Springer, 1999.
[30] Bo Jiang, Ye Liu, and W. K. Chan. Contractfuzzer: Fuzzing smart contracts for vulnerability detection. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, 2018.
[31] Sukrit Kalra, Seep Goel, Mohan Dhawan, and Subodh Sharma. ZEUS: Analyzing Safety of Smart Contracts. In Proceedings of 25th Annual Network & Distributed System Security Symposium, 2018.
[32] Johannes Krupp and Christian Rossow. teether: Gnawing at ethereum to automatically exploit smart contracts. In , August 2018.
[33] Chris Lattner and Vikram Adve. Llvm: A compilation framework for lifelong program analysis & transformation. In Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization. IEEE Computer Society, 2004.
[34] Loi Luu, Duc-Hiep Chu, Hrishi Olickel, and Prateek Saxena. Oyente Benchmarks. https://oyente.tech/benchmarks/, 2016. [Online; accessed 13-October-2020].
[35] Loi Luu, Duc-Hiep Chu, Hrishi Olickel, Prateek Saxena, and Aquinas Hobor. Making smart contracts smarter. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 2016.
[36] Kenneth L McMillan. Interpolants and symbolic model checking. In International Workshop on Verification, Model Checking, and Abstract Interpretation. Springer, 2007.
[37] Muhammad Izhar Mehar, Charles Louis Shier, Alana Giambattista, Elgar Gong, Gabrielle Fletcher, Ryan Sanayhie, Henry M Kim, and Marek Laskowski. Understanding a revolutionary and flawed grand experiment in blockchain: The dao attack. Journal of Cases on Information Technology (JCIT), 21(1), 2019.
[38] Dominic P Mulligan, Scott Owens, Kathryn E Gray, Tom Ridge, and Peter Sewell. Lem: reusable engineering of real-world semantics. In ACM SIGPLAN Notices, volume 49. ACM, 2014.
[39] Ivica Nikolić, Aashish Kolluri, Ilya Sergey, Prateek Saxena, and Aquinas Hobor. Finding the greedy, prodigal, and suicidal contracts at scale. In Proceedings of the 34th Annual Computer Security Applications Conference, 2018.
[40] Daniel Perez and Benjamin Livshits. Broken metre: Attacking resource metering in EVM. In Proceedings of 27th Annual Network & Distributed System Security Symposium, 2020.
[41] A. Permenev, D. Dimitrov, P. Tsankov, D. Drachsler-Cohen, and M. Vechev. Verx: Safety verification of smart contracts. In , 2020.
[42] R. Rahimian, S. Eskandari, and J. Clark. Resolving the multiple withdrawal attack on erc20 tokens. In , June 2019.
[43] Michael Rodler, Wenting Li, Ghassan O. Karame, and Lucas Davi. Sereum: Protecting existing smart contracts against re-entrancy attacks. In Proceedings of 26th Annual Network & Distributed System Security Symposium, February 2019.
[44] Grigore Roșu and Traian Florin Șerbănuță. An overview of the K semantic framework. Journal of Logic and Algebraic Programming, 79(6), 2010.
[45] US Securities and Exchange Commission. Report of Investigation Pursuant to Section 21(a) of the Securities Exchange Act of 1934: The DAO. Technical report, 2017.
[46] Remon Sinnema and Erik Wilde. eXtensible Access Control Markup Language (XACML) XML Media Type. https://tools.ietf.org/html/rfc7061, 2013. [Online; accessed 13-October-2020].
[47] Nikhil Swamy, Juan Chen, Cédric Fournet, Pierre-Yves Strub, Karthikeyan Bhargavan, and Jean Yang. Secure distributed programming with value-dependent types. SIGPLAN Not., 46(9):266–278, September 2011.
[48] S. Tikhomirov, E. Voskresenskaya, I. Ivanitskiy, R. Takhaviev, E. Marchenko, and Y. Alexandrov. Smartcheck: Static analysis of ethereum smart contracts. In , May 2018.
[49] Christof Ferreira Torres, Julian Schütte, and Radu State. Osiris: Hunting for integer bugs in ethereum smart contracts. In Proceedings of the 34th Annual Computer Security Applications Conference, 2018.
[50] Christof Ferreira Torres, Mathis Steichen, and Radu State. The art of the scam: Demystifying honeypots in ethereum smart contracts. In , August 2019.
[51] Petar Tsankov, Andrei Dan, Dana Drachsler-Cohen, Arthur Gervais, Florian Bünzli, and Martin Vechev. Securify: Practical security analysis of smart contracts. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, 2018.
[52] Jeffrey D Ullman. Principles of database systems. Galgotia publications, 1984.
[53] Gavin Wood. Ethereum yellow paper. http://gavwood.com/paper.pdf, 2014. [Online; accessed 13-October-2020].
[54] Mengya Zhang, Xiaokuan Zhang, Yinqian Zhang, and Zhiqiang Lin. TXSPECTOR: Uncovering attacks in ethereum from transactions. In , August 2020.
[55] Shunfan Zhou, Zhemin Yang, Jie Xiang, Yinzhi Cao, Zhemin Yang, and Yuan Zhang. An ever-evolving game: Evaluation of real-world attacks and defenses in ethereum ecosystem. In , August 2020.
A Investigations
In this appendix, we give a more in-depth security analysis of the top value contracts we presented in Section 7. In particular, we focus on the vulnerabilities detected by the different tools and show how they could, or could not, affect each contract.
This contract has been flagged as being vulnerable to re-entrancy by Oyente. For a contract to be the victim of a re-entrancy attack, it must CALL another contract, sending it enough gas to be able to perform the re-entrant call. In Solidity terms, this means that the contract has to invoke address.call and not explicitly set the gas limit. By looking at the source code [3], we find 2 such instances: one at line 352 in the execute function and another at line 369 in the confirm function. The execute function is protected by the onlyowner modifier, which requires the caller to be an owner of the wallet. This means that for a re-entrant call to work, the malicious contract would need to be an owner of the wallet. The confirm function is protected by the onlymanyowners modifier, which requires at least n owners to agree on confirming a particular transaction before it is executed, where n is agreed upon at contract creation time. Furthermore, confirm will only invoke address.call on a transaction previously created in the execute function.
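The access-control argument above can be made concrete with a small Python model. This is a deliberately simplified sketch and not a translation of the Solidity source [3]: the class `MultiSigModel`, its method signatures and the explicit `caller` argument are assumptions made for illustration, and the outgoing call is only recorded rather than performed.

```python
class MultiSigModel:
    """Toy n-of-m wallet: only owners may act, and a transaction is sent
    only once `required` distinct owners have confirmed it."""

    def __init__(self, owners, required):
        self.owners = set(owners)
        self.required = required
        self.pending = {}        # tx id -> (destination, value)
        self.confirmations = {}  # tx id -> set of owners who confirmed
        self.executed = []       # (destination, value) pairs that were sent

    def _only_owner(self, caller):
        # Stand-in for the owner check enforced by the modifiers.
        if caller not in self.owners:
            raise PermissionError(f"{caller} is not an owner")

    def execute(self, caller, tx_id, destination, value):
        self._only_owner(caller)
        self.pending[tx_id] = (destination, value)
        self.confirmations[tx_id] = set()
        return self.confirm(caller, tx_id)

    def confirm(self, caller, tx_id):
        self._only_owner(caller)
        if tx_id not in self.pending:
            return False         # only transactions created by execute
        self.confirmations[tx_id].add(caller)
        if len(self.confirmations[tx_id]) < self.required:
            return False         # not enough distinct owners yet
        self.executed.append(self.pending.pop(tx_id))
        del self.confirmations[tx_id]
        return True
```

In this model, a caller that is not an owner can never reach the point where funds are sent, and a single owner cannot trigger a transfer alone when `required` is greater than 1. This mirrors the reasoning that a re-entrant call would have to originate from a contract that is itself an owner of the wallet.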
This is the contract for the multi-signature wallet of the Golem project [8] and uses a well-known multi-signature implementation. We use the source code available on Etherscan to perform the audit. This contract is marked with locked Ether by MadMax and integer overflow by Zeus.

We first focus on the locked Ether, which is due to an unbounded mass operation [24]. An unbounded mass operation is flagged when a loop is bounded by a variable whose value could increase, for example the length of an array. This is because if the number of iterations becomes too large, the contract would run out of gas every time, which could indeed result in locked funds. Therefore, we check all the loops in the contract. There are 8 loops in the code, at lines 43, 109, 184, 215, 234, 246, 257 and 265. All the loops except the ones at lines 257 and 265 are bounded by the total number of owners. As an owner can only be added if enough existing owners agree, running out of gas when looping over the owners cannot happen unless the owners agree. The loops at lines 257 and 265 are in a function called filterTransactions and are bounded by the number of transactions. The function filterTransactions is only used by two external getters, getPendingTransactions and getExecutedTransactions, and could therefore not result in Ether getting locked. However, as the number of transactions is ever increasing, if the owners submit enough transactions, the filterTransactions function could indeed need to loop over too many transactions and end up running out of gas on every execution. We estimate the amount of gas used per loop iteration to be around 50 gas, which means that if the number of transactions reaches 100,000, it would require more than 5,000,000 gas to list the transactions, which would probably make all calls run out of gas. The contract has only received a total of 281 transactions in more than 3 years, so it is very unlikely that the number of transactions will increase this much. Nevertheless, this is indeed an issue which should be fixed, most likely by limiting the maximum number of transactions that can be retrieved by getPendingTransactions and getExecutedTransactions.

Next, we look for possible integer overflows. All loops discussed above use a uint as a loop index. In Solidity, uint is a uint256, which makes it impossible to overflow here, given that neither the number of owners nor the number of transactions could ever reach such a number. The only other arithmetic operation performed is owners.length - 1 in the function removeOwner at line 103. This function checks that the owner exists, which means that owners.length will always be at least 1 and owners.length - 1 can therefore never underflow.
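The orders of magnitude used in the discussion of this contract can be checked with a short Python sketch. The figure of 50 gas per iteration is the estimate given above; the helper names `loop_gas` and `checked_sub` are hypothetical and introduced only for this example, and the fixed overhead of the call is ignored.

```python
GAS_PER_ITERATION = 50     # estimate from the text, not a measured value
UINT256_MAX = 2**256 - 1   # largest value a Solidity uint (uint256) can hold


def loop_gas(num_transactions, gas_per_iteration=GAS_PER_ITERATION):
    """Approximate gas spent by a loop over all stored transactions."""
    return num_transactions * gas_per_iteration


def checked_sub(a, b):
    """Unsigned subtraction that raises instead of wrapping around."""
    if b > a:
        raise OverflowError("unsigned underflow")
    return a - b


if __name__ == "__main__":
    # Loop cost at the hypothetical 100,000 transactions and at the
    # 281 transactions the contract has actually received.
    print(loop_gas(100_000), loop_gas(281))
    # A loop index bounded by the number of transactions stays far
    # below the uint256 range.
    print(100_000 < UINT256_MAX)
    # owners.length - 1 is safe as long as owners.length >= 1.
    print(checked_sub(1, 1))
```

With 281 transactions the loop costs on the order of 14,000 gas under this estimate, more than two orders of magnitude below the 5,000,000 gas reached at 100,000 transactions.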
This contract is also a multi-sig wallet, this time owned by Gnosis Ltd. (https://gnosis.io/). We use the source code available on Etherscan to perform the audit. The contract looks very similar to the previous one and has also been marked by MadMax as being vulnerable to locked Ether because of unbounded mass operations. Again, we look at all the loops in the contract and find that, as in the previous contract, it loops exclusively over owners and transactions. As in the previous contract, we assume looping over the owners is safe and look at the loops over the transactions. This contract has two functions looping over transactions, getTransactionCount at line 303 and getTransactionIds at line 351. Both functions are getters which are never called from within the contract. Therefore, no Ether could ever be locked because of this. Unlike in the previous contract, getTransactionIds allows the caller to set the range of transactions to return, which makes the function safe from unbounded mass operations. However, getTransactionCount does loop over all the transactions and, as before, could therefore become unusable at some point, although this is highly unlikely.
This contract is again a multi-sig wallet, this time owned by the Aragon project (https://aragon.org/). We use the contract published on Etherscan for the audit. The source code for this contract is exactly the same as that of the previous contract, except that it lacks a contract called MultiSigWalletWithDailyLimit. This contract was also flagged as being at risk of unbounded mass operations by MadMax; the conclusions are therefore exactly the same as for the previous contract.
This contract is the only one which is very different from the previous ones. It is the WithdrawDAO contract, which has been created for users to get their funds back after TheDAO incident [45]. We use the source code from Etherscan to audit the contract. This contract has been flagged with several vulnerabilities: Securify flagged it with transaction order dependency and unhandled exception, while Zeus flagged it with locked Ether and integer overflow. The contract has two very short functions: withdraw, which allows users to convert their TheDAO tokens back to Ether, and trusteeWithdraw, which allows funds that cannot be withdrawn by regular users to be sent to a trusted address.

We first look at the transaction order dependency. As any user will only ever be able to receive the amount of tokens they possess, the order of the transactions should not be an issue in this contract.

We then look at unhandled exceptions. There is indeed a call to send in trusteeWithdraw which is not checked. Although it is not particularly an issue here, as this does not modify any other state, an error should probably be thrown if the call fails.

We then look at locked Ether. The contract is flagged with locked Ether because of what Zeus classifies as “failed send”. This issue was flagged because if the call to mainDAO.transferFrom always raised an exception, then the call to msg.sender.send would never be reached, indeed preventing users from reclaiming funds. However, in this context, mainDAO is a trusted contract and it is therefore safe to assume that mainDAO.transferFrom will not always fail.

Finally, we look at the integer overflow issue. The only place where an overflow could occur is in trusteeWithdraw at line 23. This could indeed overflow without some assumptions on the different values. For this particular contract, the following assumptions are made:

$$\texttt{this.balance} + \texttt{mainDAO.balanceOf(this)} \geq \texttt{mainDAO.totalSupply()}$$

$$\texttt{mainDAO.totalSupply()} > \texttt{mainDAO.balanceOf(this)}$$

As long as these assumptions hold, which was the case when the contract was deployed, this expression will never overflow. Indeed, if we note $t$ the time before the first call to trusteeWithdraw and $t+1$ the time after it, we have

$$\begin{aligned}
\texttt{this.balance}_{t+1} &= \texttt{this.balance}_{t} - \bigl(\texttt{this.balance}_{t} + \texttt{mainDAO.balanceOf(this)} - \texttt{mainDAO.totalSupply()}\bigr) \\
&= -\texttt{mainDAO.balanceOf(this)} + \texttt{mainDAO.totalSupply()}
\end{aligned}$$

meaning that every subsequent call will compute:

$$\begin{aligned}
&\texttt{this.balance}_{t+1} + \texttt{mainDAO.balanceOf(this)} - \texttt{mainDAO.totalSupply()} \\
&\quad = -\texttt{mainDAO.balanceOf(this)} + \texttt{mainDAO.totalSupply()} + \texttt{mainDAO.balanceOf(this)} - \texttt{mainDAO.totalSupply()} \\
&\quad = 0
\end{aligned}$$

This will always result in sending 0 and will therefore not cause any overflow. If some money is newly received by the contract, the amount received will be transferred the next time trusteeWithdraw