Frontrunner Jones and the Raiders of the Dark Forest: An Empirical Study of Frontrunning on the Ethereum Blockchain
Christof Ferreira Torres
SnT/University of Luxembourg
Ramiro Camino
Luxembourg Institute of Science and Technology
Radu State
SnT/University of Luxembourg
Abstract
Ethereum fostered the inception of a plethora of smart contract applications, ranging from gambling games to decentralized finance. However, Ethereum is also considered a highly adversarial environment, where vulnerable smart contracts will eventually be exploited. Recently, Ethereum’s pool of pending transactions has become a far more aggressive environment. In the hope of making some profit, attackers continuously monitor the transaction pool and try to front-run their victims’ transactions by either displacing or suppressing them, or strategically inserting their transactions. This paper aims to shed some light into what is known as a dark forest and uncover these predators’ actions. We present a methodology to efficiently measure the three types of frontrunning: displacement, insertion, and suppression. We perform a large-scale analysis on more than 11M blocks and identify almost 200K attacks with an accumulated profit of 18.41M USD for the attackers, providing evidence that frontrunning is both lucrative and prevalent.

The concept of frontrunning is not new. In financial markets, brokers act as intermediaries between clients and the market, and thus brokers have an advantage in terms of insider knowledge about potential future buy/sell orders which can impact the market. In this context, frontrunning is executed by prioritizing a broker’s trading actions before executing the client’s orders such that the trader pockets a profit. Frontrunning is illegal in regulated financial markets. However, the recent revolution enabled by decentralized finance (DeFi), where smart contracts and miners replace intermediaries (brokers), is both a blessing and a curse. Removing trusted intermediaries can streamline finance and substantially lower adjacent costs, but misaligned incentives for miners lead to generalized frontrunning, in which market participants behave exactly like unethical brokers used to in the “old” financial world.
Unfortunately, this is already happening at a large scale. Our paper is among the first comprehensive surveys on the extent and impact of this phenomenon. Already in 2017, the Bancor ICO [11] was susceptible to such an attack, among other vulnerabilities, but no real attack was observed in the wild. Concrete frontrunning attacks on the Ethereum blockchain came to the attention of the informed audience through two independently reported attacks and their mitigation approaches. In the first report [10], the researchers tried to recover some liquidity tokens by calling a specific function in a smart contract. Since this function was callable by everyone, the authors (who also compared the pending transactions in the transaction pool to a dark forest full of predators) assumed that their function call could be observed and front-run by bots observing the submitted transactions in the transaction pool. Even though they tried to obfuscate their efforts, their approach failed in the end, and they became a victim of a frontrunning bot. A few months later, a second group of researchers [24] reported a successful recovery using lessons learned from the previously reported incident [10]. The success was due to them mining their transactions privately without broadcasting them to the rest of the network. The researchers used a new functionality provided by SparkPool called the Taichi Network [15]. In this way, the transactions were not available to frontrunning bots but relied entirely on having a reliable and honest mining pool. However, this approach enables centralization and requires users to entrust their transactions to SparkPool. Similar to how honeypots gather intelligence by luring attackers to compromise apparently vulnerable hosts [8], a recent experiment [21] detailed the interactions with two bots and reported relevant assessment on their nature and origin.
Surprisingly, the frontrunning bots do not rely on advanced software development techniques or complex instructions, and code examples on developing such bots are readily available [22, 23].

There are several ways to perform frontrunning attacks. The first survey defining a taxonomy of frontrunning attacks [12] identified three different variants of how these can be performed. To understand these approaches (displacement, insertion, and suppression), a short refresher on gas and transaction fees in Ethereum is given. Transactions, submitted to the Ethereum network, send money and data to smart contract addresses or account addresses. Transactions are confirmed by miners who get paid via a fee that the sender of the transaction pays. This payment also determines the speed/priority with which miners include a transaction in a mined block. Miners have an inherent incentive to include high paying transactions and prioritize them. As such, nodes observing the unconfirmed transactions can front-run by just sending transactions with higher payoffs for miners [9]. The common feature of all three attack types is that by frontrunning a transaction, the initial transaction’s expected outcome is changed. In the case of the first attack (displacement), the outcome of a victim’s original transaction is irrelevant. The second attack type (insertion) manipulates the victim’s transaction environment, thereby leading to an arbitrage opportunity for the attacker. Finally, the third attack (suppression) delays the execution of a victim’s original transaction. Although previous papers [9, 12] have identified decentralized applications which are victims of frontrunning attacks, no scientific study has analyzed the occurrence of these three attacks in the wild on a large scale. The impact of this structural design failure of the Ethereum blockchain is far-reaching.
Many decentralized exchanges implementing token-based marketplaces have passed the 1B USD volume [26] and are prone to the same frontrunning attack vectors because the Ethereum blockchain is used as a significant building block. Frontrunning is not going to disappear any time soon, and the future looks rather grim. We do not expect to have mitigation against frontrunning in the short term. Miners do profit from the fees and thus will always prioritize high yield transactions. Moreover, the trust mechanism in Ethereum is built on the total observability of the confirmed/unconfirmed transactions and is thus by design prone to frontrunning. Our paper sheds light on the long-term history of frontrunning on the Ethereum blockchain and provides the first large-scale data-driven investigation of this type of attack vector. We investigate the real profits made by attackers, differentiated by the specific attack type, and propose the first methodology to detect them efficiently.

Contributions.
We summarize our contributions as follows:
• We propose a methodology that is efficient enough to detect displacement, insertion, and suppression attacks on Ethereum’s past transaction history.
• We run an extensive measurement study and analyze frontrunning attacks on Ethereum for the past five years.
• We identify a total of 199,725 attacks, 1,580 attacker accounts, 526 bots, and over 18.41M USD profit.
• We demonstrate that the identified attacker accounts and bots can be grouped to 137 unique attacker clusters.
• We discuss frontrunning implications and find that miners made a profit of 300K USD due to frontrunners.

This section provides the necessary background to understand our work setting, including smart contracts, transactions, gas economics, and transaction ordering.
The notion of smart contracts was already introduced in 1997 by Nick Szabo [25], but the concept only became a reality with the inception of Ethereum in 2015 [29]. Ethereum proposes two types of accounts: externally owned accounts (EOA) and contract accounts (smart contracts). EOAs are controlled via private keys and have no associated code. Contract accounts, i.e., smart contracts, have associated code but are not controlled via private keys. They operate as fully-fledged programs that are stored and executed across the blockchain. EOAs and smart contracts are identifiable via a unique 160-bit address. Smart contracts are immutable, and they cannot be removed or updated once they have been deployed unless they have been explicitly designed to do so. Besides having a key-value store that enables them to preserve their state across executions, smart contracts also have a balance that keeps track of the amount of ether (Ethereum’s cryptocurrency) that they own. Smart contracts are usually developed using a high-level programming language, such as Solidity [30]. The program code is then compiled into a low-level bytecode representation, which is then interpreted by the Ethereum Virtual Machine (EVM). The EVM is a stack-based virtual machine that supports a set of Turing-complete instructions.
Smart contracts are deployed and executed via transactions. Transactions contain an amount of ether, a sender, a receiver, input data, a gas limit and a gas price. Transactions may only be initiated by EOAs. Smart contract functions are invoked by encoding the function signature and arguments in a transaction’s data field. A fallback function is executed whenever the provided function name is not implemented. Smart contracts may call other smart contracts during execution. Thus, a single transaction may trigger further transactions, so-called internal transactions.
Figure 1: Illustrative examples of the three frontrunning attack types. Panels: (a) Displacement, (b) Insertion, (c) Suppression. Each panel shows the transaction pool (pending transactions) and the proposed block, with transactions ordered by gas price.

Ethereum employs a gas mechanism that assigns a cost to each EVM instruction. This mechanism prevents denial-of-service attacks and ensures termination. When issuing a transaction, the sender has to specify a gas limit and a gas price. The gas limit is specified in gas units and must be large enough to cover the amount of gas consumed by the instructions during a contract’s execution. Otherwise, the execution will terminate abnormally, and its effects will be rolled back. The gas price defines the amount of ether that the sender is willing to pay per unit of gas used. The sender is required to have a balance greater than or equal to gas limit × gas price, but the final transaction fee is computed as gas used × gas price. The price of gas is extremely volatile as it is directly linked to the price of ether. As a result, Breidenbach et al. [6] proposed GasToken, a smart contract that allows users to tokenize gas. The idea is to store gas when ether is cheap and spend it when ether is expensive, thereby allowing users to save on transaction fees. Two versions of GasToken exist, whereby the second version is more efficient than the first one. The first version of GasToken (GST1) exploits the fact that gas is refunded when storage is freed. Hence, gas is saved by writing to storage and liberated when deleting from storage. The second version of GasToken (GST2) exploits the refunding mechanism of removing contracts. Hence, gas is saved by creating contracts and liberated by deleting contracts. In 2020, 1inch released their version of GST2 called ChiToken [1], which includes some optimizations.
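The fee rule described in this paragraph can be illustrated with a small arithmetic sketch. The function names and the concrete gas values are hypothetical and chosen only for illustration; amounts are expressed in wei, where 1 gwei = 10^9 wei.

```python
GWEI = 10**9  # 1 gwei expressed in wei


def required_balance(gas_limit: int, gas_price_wei: int) -> int:
    """Minimum balance the sender must hold: gas limit x gas price."""
    return gas_limit * gas_price_wei


def transaction_fee(gas_used: int, gas_price_wei: int) -> int:
    """Fee that is finally charged: gas used x gas price."""
    return gas_used * gas_price_wei


# Hypothetical transaction: the sender allows up to 100,000 gas units,
# the execution consumes 60,000 of them, at a gas price of 50 gwei.
gas_limit = 100_000
gas_used = 60_000
gas_price = 50 * GWEI

upfront = required_balance(gas_limit, gas_price)
fee = transaction_fee(gas_used, gas_price)
unspent = upfront - fee  # part of the reserved amount that is never charged

print(upfront, fee, unspent)
```

The gap between the two quantities is why the gas limit acts as a safety bound rather than as the price of the transaction.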
A blockchain is essentially a verifiable, append-only list of records in which all transactions are recorded in so-called blocks. This list is maintained by a distributed peer-to-peer (P2P) network of distrusting nodes called miners. Miners follow a consensus protocol that dictates the appending of new blocks. They compete to create a block by solving a cryptographic puzzle. The winner is rewarded with a static block reward and the execution fees from the included transactions [14]. While blockchains prescribe specific rules for consensus, there are only loose requirements for selecting and ordering transactions. Thus, miners get to choose which transactions to include and how to order them inside a block. Nevertheless, 95% of the miners choose and order their transactions based on the gas price to increase their profit, thereby deliberately creating a prioritization mechanism for transactions [31].
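As a minimal sketch of this prioritization, the snippet below orders a toy transaction pool by descending gas price. The `PendingTx` type, the sender labels, and the tie-breaking by arrival order are assumptions made for illustration; real miners are free to apply any policy.

```python
from typing import List, NamedTuple


class PendingTx(NamedTuple):
    sender: str
    gas_price: int  # in gwei
    arrival: int    # order in which the node first saw the transaction


def order_by_gas_price(pool: List[PendingTx]) -> List[PendingTx]:
    """Highest gas price first; ties are broken by arrival (an assumption)."""
    return sorted(pool, key=lambda tx: (-tx.gas_price, tx.arrival))


# The attacker observes the victim's pending transaction and outbids it by 1 gwei.
pool = [
    PendingTx("user", 40, 0),
    PendingTx("victim", 50, 1),
    PendingTx("attacker", 51, 2),
]

ordered = [tx.sender for tx in order_by_gas_price(pool)]
print(ordered)  # the attacker's transaction is placed ahead of the victim's
```

Although the attacker's transaction arrives last, the gas price alone decides its position in this model.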
This section defines our attacker model and introduces the reader to three different types of frontrunning attacks.
Miners, as well as non-miners, can mount frontrunning attacks. Miners are not required to pay a higher gas price to manipulate the order of transactions as they have full control over how transactions are included. Non-miners, on the other hand, are required to pay a higher gas price in order to front-run transactions of other non-miners. Our attacker model assumes an attacker $A$ that is a financially rational non-miner with the capability to monitor the transaction pool for incoming transactions. The attacker $A$ needs to process the transactions in the pool, find a victim $V$ among those transactions and create a given amount of attack transactions $T_{A_i}$ before the victim’s transaction $T_V$ is mined. Usually, $A$ would not be able to react fast enough to perform all these tasks manually. Hence, we assume that the attacker $A$ has at least one computer program $Bot_A$ that automatically performs these tasks. However, $Bot_A$ must be an off-chain program, because contracts cannot react on their own when transactions are added to the pool. Nevertheless, $Bot_A$ needs at least one EOA to act as sender of any attack transaction $T_A$. Using multiple EOAs helps attackers obscure their frontrunning activities, similar to money laundering layering schemes. We refer to these EOAs owned by $A$ as attacker accounts $EOA_{A_j}$ and to the EOA owned by $V$ as victim account $EOA_V$. We assume that attacker $A$ owns a sufficiently large balance across all its attacker accounts $EOA_{A_j}$ from which it can send frontrunning transactions. However, attacker $A$ can also employ smart contracts to hold part of the attack logic. We refer to these smart contracts as bot contracts $BC_{A_k}$, which are called by the attacker accounts $EOA_{A_j}$. Figure 2 provides an overview of our final attacker model.

Figure 2: Attacker model with on-chain and off-chain parts. The attacker bot operates off-chain; the attacker EOAs and the bot contracts operate on-chain.

We describe in the following the taxonomy of frontrunning attacks presented by Eskandari et al. [12].
Displacement.
In a displacement attack an attacker $A$ observes a profitable transaction $T_V$ from a victim $V$ and decides to broadcast its own transaction $T_A$ to the network, where $T_A$ has a higher gas price than $T_V$ such that miners will include $T_A$ before $T_V$ (see Figure 1a). Note that the attacker does not require the victim’s transaction to execute successfully within a displacement attack. For example, imagine a smart contract that awards a user with a prize if they can guess the preimage of a hash. An attacker can wait for a user to find the solution and to submit it to the network. Once observed, the attacker then copies the user’s solution and performs a displacement attack. The attacker’s transaction will then be mined first, thereby winning the prize, and the user’s transaction will be mined last, possibly failing.

Insertion.
In an insertion attack an attacker $A$ observes a profitable transaction $T_V$ from a victim $V$ and decides to broadcast its own two transactions $T_{A_1}$ and $T_{A_2}$ to the network, where $T_{A_1}$ has a higher gas price than $T_V$ and $T_{A_2}$ has a lower gas price than $T_V$, such that miners will include $T_{A_1}$ before $T_V$ and $T_{A_2}$ after $T_V$ (see Figure 1b). This type of attack is also sometimes called a sandwich attack. In this type of attack, the transaction $T_V$ must execute successfully as $T_{A_2}$ depends on the execution of $T_V$. A well-known example of insertion attacks is arbitraging on decentralized exchanges, where an attacker observes a large trade, also known as a whale, sends a buy transaction before the trade, and a sell transaction after the trade.

Suppression.
In a suppression attack, an attacker $A$ observes a transaction $T_V$ from a victim $V$ and decides to broadcast its transactions to the network, which have a higher gas price than $T_V$ such that miners will include $A$’s transactions before $T_V$ (see Figure 1c). The goal of $A$ is to suppress transaction $T_V$ by filling up the block with its transactions such that transaction $T_V$ cannot be included anymore in the next block. This type of attack is also called block stuffing. Every block in Ethereum has a so-called block gas limit. The consumed gas of all transactions included in a block cannot exceed this limit. $A$’s transactions try to consume as much gas as possible to reach this limit such that no other transactions can be included. This type of attack is often used against lotteries where the last purchaser of a ticket wins if no one else purchases a ticket during a specific time window. Attackers can then purchase a ticket and mount a suppression attack for several blocks to prevent other users from purchasing a ticket themselves. Keep in mind that this type of frontrunning attack is expensive.

This section provides an overview of our methodology’s design and implementation details to detect frontrunning attacks in the wild.
As defined in Section 3.1, an attacker $A$ employs one or more off-chain programs to perform its attacks. However, because we have no means to distinguish between the different software agents an attacker $A$ could have, for this study, we consider all of them as part of the same multi-agent system $Bot_A$. Additionally, we cannot recognize the true nature of $A$ or how $Bot_A$ is implemented. Instead, we would like to build a cluster with the $n$ different attacker accounts $EOA_{A_1}, \ldots, EOA_{A_n}$ and the $m$ different bot contracts $BC_{A_1}, \ldots, BC_{A_m}$ to form an identity of $A$. Consequently, in each of the following experiments, we use our detection system’s results to build a graph. Each node is either an attacker account or a bot contract. We make the following two assumptions:

Assumption 1:
Attackers only use their own bot contracts. Hence, when an attacker account sends a transaction to a bot contract, we suspect that both entities belong to the same attacker. Note that one attacker account can send transactions to multiple bot contracts, and bot contracts can receive transactions from multiple attacker accounts.
Assumption 2:
Attackers develop their own bot contracts, and they do not publish the source code of their bot contracts as they do not want to share their secrets with competitors. Hence, when the bytecode of two bot contracts is exactly the same, we suspect that they belong to the same attacker.

Figure 3: An example of how transaction input bytes are mapped into a Bloom filter.

With these assumptions in mind, we create edges between attacker accounts and bot contracts that share at least one attack transaction, and between bots that share the same bytecode. Using the resulting graph, we compute all the connected components. Hence, we interpret each of these connected components as a single attacker cluster.
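The clustering step can be sketched as a connected-components computation over a small graph. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name, the input shapes (a list of account/contract pairs and a bytecode lookup table), and the sample identifiers are all hypothetical.

```python
from collections import defaultdict


def attacker_clusters(attack_txs, bytecode_of):
    """attack_txs: iterable of (attacker_account, bot_contract) pairs.
    bytecode_of: dict mapping each bot contract to its bytecode."""
    graph = defaultdict(set)

    # Assumption 1: an account and the bot contract it calls belong together.
    for account, bot in attack_txs:
        graph[account].add(bot)
        graph[bot].add(account)

    # Assumption 2: bot contracts with identical bytecode belong together.
    bots_by_code = defaultdict(list)
    for bot, code in bytecode_of.items():
        bots_by_code[code].append(bot)
    for bots in bots_by_code.values():
        first = bots[0]
        graph[first]  # make sure the node exists even without edges
        for other in bots[1:]:
            graph[first].add(other)
            graph[other].add(first)

    # Connected components via iterative depth-first search.
    seen, clusters = set(), []
    for start in list(graph):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(component)
    return clusters


# Hypothetical data: bot_b and bot_c share the same bytecode.
attack_txs = [("eoa_1", "bot_a"), ("eoa_2", "bot_a"),
              ("eoa_3", "bot_b"), ("eoa_4", "bot_c")]
bytecode_of = {"bot_a": "0x6001", "bot_b": "0x6002", "bot_c": "0x6002"}

clusters = attacker_clusters(attack_txs, bytecode_of)
print(len(clusters))
```

In this toy input, the shared bytecode merges `eoa_3` and `eoa_4` into one cluster even though they never call the same contract.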
We limit our detection to displacement attacks where attackers observe profitable transactions via the transaction pool and copy these profitable transactions’ input to create their own profitable transactions. While attackers are not required to use a bot contract to mount displacement attacks, using a smart contract allows them to save money as they can abort the execution in case of an unexpected event. Therefore, our detection focuses on finding attackers that use bot contracts to perform internal transactions of copied inputs. The general idea is to detect displacement by checking for every transaction $T$ if there exists a subsequent transaction $T'$ with a gas price lower than $T$ and a transaction index higher than $T$, where the input of $T'$ is contained inside the input of $T$. However, detecting displacement in the wild can become quite challenging due to a large number of possible combinations. A naive approach would compare every transaction to every subsequent transaction in the blockchain, resulting in a combinatorial explosion. Our goal is to follow a more efficient approach that might sacrifice completeness but preserve soundness. We begin by splitting the range of blocks that are to be analyzed into windows of 100 blocks and slide them with an offset of 20 blocks. This approach has the advantage that each window can be analyzed in parallel. Inside each window, we iterate block by block, transaction by transaction, and split the input bytes of each transaction into n-grams of 4 bytes with an offset of 1 byte and check whether at least 95% of the n-grams match n-grams of previous transaction inputs. Each window has its own Bloom filter that memorizes previously observed n-grams. A Bloom filter is a probabilistic data structure that can efficiently tell if a given element may already have been seen before or if it definitely has not been seen before, meaning that Bloom filters may yield false positives, but no false negatives.
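The n-gram matching can be sketched with a small Bloom filter. This is an illustrative sketch under stated assumptions, not the authors' code: the hash functions are derived from SHA-256 by double hashing (an assumption made to stay within the standard library), the filter is sized with the standard Bloom filter formulas with the number of hash functions truncated to an integer, and the input bytes are made up.

```python
import hashlib
import math


def bloom_parameters(n, p):
    """Standard sizing: bit count m and hash count k for n items at rate p."""
    m = math.ceil(-n * math.log(p) / (math.log(2) ** 2))
    k = int(m / n * math.log(2))  # truncated to an integer (assumption)
    return m, k


class BloomFilter:
    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = bytearray((m + 7) // 8)

    def _indices(self, item):
        # Double hashing over a SHA-256 digest (assumption, see lead-in).
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for idx in self._indices(item):
            self.bits[idx // 8] |= 1 << (idx % 8)

    def __contains__(self, item):
        return all(self.bits[idx // 8] & (1 << (idx % 8))
                   for idx in self._indices(item))


def ngrams(data, size=4):
    """Interleaved n-grams: windows of `size` bytes with an offset of 1 byte."""
    return [data[i:i + size] for i in range(len(data) - size + 1)]


def match_ratio(data, bloom):
    """Fraction of the n-grams of `data` that the filter reports as seen."""
    grams = ngrams(data)
    return sum(g in bloom for g in grams) / len(grams) if grams else 0.0


# Hypothetical inputs: the earlier (attacker) input embeds the later (victim) input.
victim_input = bytes.fromhex("deadbeefcafebabe")
attacker_input = bytes.fromhex("0000002a") + victim_input

bloom = BloomFilter(*bloom_parameters(1000, 0.01))
for gram in ngrams(attacker_input):
    bloom.add(gram)

ratio = match_ratio(victim_input, bloom)
print(ratio >= 0.95)  # candidate pair for the exhaustive linear search
```

Because a Bloom filter has no false negatives, an input that is fully embedded in an earlier input always reaches a ratio of 1.0; false positives can only inflate the ratio, which is why a match triggers an exact comparison rather than a finding.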
The idea is first to use a Bloom filter to perform a quick probabilistic search and only perform an exhaustive linear search if the filter finds that at least 95% of a transaction’s n-grams are contained in the filter. Our Bloom filters can hold up to $n$ = 1M elements with a false positive rate $p$ = 1%, which according to Bloom [3], requires having $k$ = 6 different hash functions:

$$m = -\frac{n \ln p}{(\ln 2)^2} \tag{1}$$

$$k = \frac{m}{n} \ln 2 \tag{2}$$

We bootstrapped our 6 hash functions using the Murmur3 hash function as a basis. The result of each hash function is an integer that acts as an index on the Bloom filter’s bit array. The bit array is initialized at the beginning with zeros, and a value of one is set for each index returned by a hash function (see Figure 3). An n-gram is considered to be contained in the filter if all indices of the 6 hash functions are set to one. We use interleaved n-grams because the input of a copied transaction might be included at any position in the attacker’s input. Once our linear search finds two transactions $T_A$ and $T_V$ with matching inputs, we check whether the following three heuristics hold:

Heuristic 1:
The sender of $T_A$ and $T_V$ as well as the receiver of $T_A$ and $T_V$ must be different.
Heuristic 2:
The gas price of $T_A$ must be larger than the gas price of $T_V$.
Heuristic 3:
We split the input of $T_A$ and $T_V$ into sequences of 4 bytes, and the ratio between the number of the sequences must be at least 25%.

Finally, to validate that $T_A$ is a copy of $T_V$, we run in a simulated environment first $T_A$ before $T_V$ and then $T_V$ before $T_A$. We report a finding if the number of executed EVM instructions is different across both runs for $T_A$ and $T_V$, as this means that $T_A$ and $T_V$ influence each other. During our experiments, we noted that some bot contracts included code that checks if the miner address of the block that is currently being executed is not equal to zero. We think that the goal of this mechanism could be to prevent transactions from being run locally.

We limit our detection to insertion attacks on decentralized exchanges (DEXes). At the time of writing, we are not aware of any other use case where insertion attacks are applied in the wild. DEXes are decentralized platforms where users can trade their ERC-20 tokens for ether or other ERC-20 tokens via a smart contract. Uniswap is currently the most popular DEX in terms of locked value with 3.15B USD locked. There exist two genres of DEXes, order book-based DEXes and automated market maker-based (AMM) DEXes. While order book-based DEXes match prices based on so-called ‘bid’ and ‘ask’ orders, AMM-based DEXes match and settle trades automatically on-chain via a smart contract, without the need for a third-party service. AMMs are algorithmic agents that follow a deterministic approach to calculate the price of a token. Uniswap, for example, is an AMM-based DEX, which computes for every trade the price of a token using the equation of a constant product market maker (CPMM):

$$[x] \times [y] = k \tag{3}$$

where $[x]$ is the current reserve of token $x$ and $[y]$ is the current reserve of token $y$. Trades must not change the product $k$ of a pair’s reserve. Thus, if the underlying token reserves decrease as a trader is buying, the token price increases.
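The constant product rule of Equation 3 can be made concrete with a worked sketch that also replays the sandwich pattern introduced earlier for insertion attacks. The reserves, trade sizes, and function names are hypothetical, and trading fees are ignored (an assumption); exact fractions are used so that the arithmetic can be checked by hand.

```python
from fractions import Fraction


def buy_x(reserve_x, reserve_y, y_in):
    """Pay y_in of token y and receive token x while keeping [x] * [y] = k."""
    k = Fraction(reserve_x) * reserve_y
    new_y = reserve_y + y_in
    new_x = k / new_y
    return reserve_x - new_x, new_x, new_y  # (x received, new reserves)


def sell_x(reserve_x, reserve_y, x_in):
    """Pay x_in of token x and receive token y while keeping [x] * [y] = k."""
    k = Fraction(reserve_x) * reserve_y
    new_x = reserve_x + x_in
    new_y = k / new_x
    return reserve_y - new_y, new_x, new_y  # (y received, new reserves)


# Hypothetical pool with 1000 units of each token (k = 1,000,000).
x, y = Fraction(1000), Fraction(1000)

# 1) The attacker buys first, spending 250 y.
attacker_tokens, x, y = buy_x(x, y, 250)
# 2) The victim buys next, spending 750 y at a worse price.
victim_tokens, x, y = buy_x(x, y, 750)
# 3) The attacker sells the tokens bought in step 1.
attacker_proceeds, x, y = sell_x(x, y, attacker_tokens)

profit = attacker_proceeds - 250
print(attacker_tokens, victim_tokens, float(profit))
```

Without the attacker's first trade, the same 750 y would have bought the victim roughly 428.6 tokens instead of 300; the difference is the slippage the attacker converts into profit.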
The same holds in the opposite direction: if the underlying token’s reserve increases while a trader is selling, the token price decreases. Despite being simple, CPMMs are incredibly susceptible to price slippage. Price slippage refers to the difference between a trade’s expected price and the price at which the trade is executed. Given the public nature of blockchains, attackers can observe large buy orders before miners pick them up by monitoring the transaction pool. These large buy orders will have a significant impact on the price of a token. Leveraging this knowledge and the fact that miners order transactions based on transaction fees, attackers can insert their buy order in front of an observed large buy order and insert a sell order after the observed large buy order to profit from the deterministic price calculation. Figure 4 depicts an example of an insertion attack on an AMM-based DEX that uses CPMM. Let us assume that a victim $V$ wants to purchase some tokens at a price $p$. Let us also assume that an attacker $A$ observes $V$’s transaction and sends in two transactions: 1) a buy transaction which also tries to purchase some tokens at a price $p$, but with a gas price higher than $V$, and 2) a sell transaction that tries to sell the purchased tokens, but with a gas price lower than $V$. Since $A$ pays a higher gas price than $V$, $A$’s purchase transaction will be mined first and $A$ will be able to purchase the tokens at price $p$, where $p = p_{A_1}$ (cf. Figure 4). Afterwards, $V$’s transaction will be mined. However, $V$ will purchase tokens at a higher price $p_V$, where $p_V > p_{A_1}$ due to the imbalance in the token reserves (see Equation 3). Finally, $A$’s sell transaction will be mined, for which $A$ will sell its tokens at price $p_{A_2}$, where $p_{A_2} > p_{A_1}$, and therefore $A$ makes a profit.

Figure 4: An illustrative example of an insertion attack on an AMM-based DEX that uses CPMM. Axes: token $x$ reserve and token $y$ reserve. Marked points: $p_{A_1}$ (buy), $p_V$ (buy), $p_{A_2}$ (sell).

Footnote on the locked value reported for Uniswap: https://defipulse.com/

Our detection algorithm exploits the fact that DEXes depend on the ERC-20 token standard. The ERC-20 token standard defines many functions and events that enable users to trade their tokens between each other and across exchanges. In particular, whenever a token is traded, a so-called Transfer event is triggered, and information about the sender, receiver, and the amount is logged on the blockchain. We combine this information with transactional information (e.g., transaction index, gas price, etc.) to detect insertion attacks. We define a transfer event as $E = (s, r, a, c, h, i, g)$, where $s$ is the sender of the tokens, $r$ is the receiver of the tokens, $a$ is the number of transferred tokens, $c$ is the token’s contract address, $h$ is the transaction hash, $i$ is the transaction index, and $g$ is the gas price of the transaction. We detect insertion attacks by iterating block by block through all the transfer events and checking if there are three events $E_{A_1}$, $E_V$, and $E_{A_2}$ for which the following six heuristics hold:

Heuristic 1:
The exchange transfers tokens to $A$ in $E_{A_1}$ and to $V$ in $E_V$, and the exchange receives tokens from $A$ in $E_{A_2}$. Moreover, $A$ transfers tokens in $E_{A_2}$ that it received previously in $E_{A_1}$. Thus, the sender of $E_{A_1}$ must be identical to the sender of $E_V$ as well as the receiver of $E_{A_2}$, and the receiver of $E_{A_1}$ must be identical to the sender of $E_{A_2}$ (i.e., $s_{A_1} = s_V = r_{A_2} \wedge r_{A_1} = s_{A_2}$).
Heuristic 2:
The number of tokens bought by $E_{A_1}$ must be similar to the number of tokens sold by $E_{A_2}$. To avoid false positives, we set a conservative threshold of 1%. Hence, the difference between token amount $a_{A_1}$ of $E_{A_1}$ and token amount $a_{A_2}$ of $E_{A_2}$ cannot be more than 1% (i.e., $\frac{|a_{A_1} - a_{A_2}|}{\max(a_{A_1}, a_{A_2})} \leq 0.01$).
Heuristic 3:
The token contract addresses of $E_{A_1}$, $E_V$, and $E_{A_2}$ must be identical (i.e., $c_{A_1} = c_V = c_{A_2}$).
Heuristic 4:
The transaction hashes of $E_{A_1}$, $E_V$, and $E_{A_2}$ must be dissimilar (i.e., $h_{A_1} \neq h_V \neq h_{A_2}$).
Heuristic 5:
The transaction index of $E_{A_1}$ must be smaller than the transaction index of $E_V$, and the transaction index of $E_V$ must be smaller than the transaction index of $E_{A_2}$ (i.e., $i_{A_1} < i_V < i_{A_2}$).
Heuristic 6:
The gas price of $E_{A_1}$ must be larger than the gas price of $E_V$, and the gas price of $E_{A_2}$ must be less than or equal to the gas price of $E_V$ (i.e., $g_{A_1} > g_V \geq g_{A_2}$).

In suppression, an attacker’s goal is to submit transactions to the network that consume large amounts of gas and fill up the block gas limit to withhold a victim’s transaction. There are several ways to achieve this. The naive approach uses a smart contract that repeatedly executes a sequence of instructions in a loop to consume gas. This strategy can either be controlled or uncontrolled. In a controlled setting, the attacker repeatedly checks how much gas is still left and exits the loop right before all gas is consumed such that no out-of-gas exception is raised. In an uncontrolled setting, the attacker does not repeatedly check how much gas is left and lets the loop run until no more gas is left and an out-of-gas exception is raised. The former strategy does not consume all the gas and does not raise an exception, which makes it less obtrusive, while the latter strategy does consume all the gas but raises an exception, which makes it more obtrusive. However, a third strategy achieves precisely the same result without running code in an infinite loop. If we think about it, the attacker’s goal is not to execute useless instructions but rather to force miners to consume the attacker’s gas units to fill up the block. The EVM proposes two ways to raise an error during execution, either through a revert or an assert. The difference between revert and assert is that the former returns the unused gas to the transaction sender, while the latter consumes the entire gas limit initially specified by the transaction sender. Hence, an attacker can exploit this and call an assert to consume all the provided gas with just one instruction. Our goal is to detect transactions that employ one of the three aforementioned suppression strategies: controlled gas loop, uncontrolled gas loop, and assert.
We start by clustering for each block all transactions with the same receiver, as we assume that attackers send multiple suppression transactions to the same bot contract. Afterwards, we check the following heuristics for each cluster:

Heuristic 1:
The number of transactions within a cluster must be larger than one.
Heuristic 2:
All transactions within the cluster must have consumed more than 21,000 gas units. The goal of this heuristic is to filter out transactions that only transfer money, but do not execute code.
Heuristic 3:
The ratio between gas used and gas limit must be larger than 99% for all transactions within the cluster.

If we happen to find a cluster that fulfils the heuristics mentioned above, we check whether at least one of the neighbouring blocks (i.e., the previous block and the subsequent block) also contains a cluster that satisfies the same heuristics. We assume that an attacker tries to suppress transactions for a sequence of blocks. Finally, we try to detect if an attacker employs one of three suppression strategies by retrieving and analyzing the execution trace of the first transaction in the cluster. An execution trace consists of a sequence of executed instructions. We detect the first strategy by checking if the transaction did not raise an exception and if the instruction sequence [GAS, GT, ISZERO, JUMPI] is executed more than ten times in a loop. We detect the second strategy by checking if the transaction raised an exception via a revert and if the instruction sequence [SLOAD, TIMESTAMP, ADD, SSTORE] is executed more than ten times in a loop. Finally, we detect the third strategy by checking if the transaction raised an exception via an assert.
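The suppression heuristics and the trace-based strategy check can be sketched as follows. This is a simplified illustration, not the authors' implementation: transactions are modelled as plain dictionaries with hypothetical keys (`to`, `gas_used`, `gas_limit`), an execution trace is modelled as a list of opcode names, and the error kind is passed as a string; all sample values are made up.

```python
from collections import defaultdict

BASE_TRANSFER_GAS = 21_000  # gas consumed by a plain ether transfer
MIN_GAS_RATIO = 0.99

CONTROLLED_LOOP = ["GAS", "GT", "ISZERO", "JUMPI"]
UNCONTROLLED_LOOP = ["SLOAD", "TIMESTAMP", "ADD", "SSTORE"]


def suppression_clusters(block_txs):
    """Group a block's transactions by receiver and apply the three heuristics."""
    by_receiver = defaultdict(list)
    for tx in block_txs:
        by_receiver[tx["to"]].append(tx)
    clusters = {}
    for receiver, txs in by_receiver.items():
        if len(txs) <= 1:  # Heuristic 1
            continue
        if any(tx["gas_used"] <= BASE_TRANSFER_GAS for tx in txs):  # Heuristic 2
            continue
        if any(tx["gas_used"] / tx["gas_limit"] <= MIN_GAS_RATIO for tx in txs):  # Heuristic 3
            continue
        clusters[receiver] = txs
    return clusters


def flag_blocks(blocks):
    """Indices of blocks with a cluster whose previous or next block has one too."""
    has_cluster = [bool(suppression_clusters(txs)) for txs in blocks]
    flagged = []
    for i, found in enumerate(has_cluster):
        prev_found = i > 0 and has_cluster[i - 1]
        next_found = i + 1 < len(has_cluster) and has_cluster[i + 1]
        if found and (prev_found or next_found):
            flagged.append(i)
    return flagged


def count_sequence(trace, pattern):
    """Number of positions at which `pattern` occurs in `trace`."""
    n = len(pattern)
    return sum(trace[i:i + n] == pattern for i in range(len(trace) - n + 1))


def classify_strategy(trace, error):
    """error is None, 'revert', or 'assert' (hypothetical encoding)."""
    if error is None and count_sequence(trace, CONTROLLED_LOOP) > 10:
        return "controlled gas loop"
    if error == "revert" and count_sequence(trace, UNCONTROLLED_LOOP) > 10:
        return "uncontrolled gas loop"
    if error == "assert":
        return "assert"
    return None


# Hypothetical data: two stuffed blocks followed by an ordinary one.
stuffing = [{"to": "bot", "gas_used": 995_000, "gas_limit": 1_000_000}] * 2
ordinary = [{"to": "shop", "gas_used": 21_000, "gas_limit": 21_000}]
flagged = flag_blocks([stuffing, stuffing, ordinary])
strategy = classify_strategy(CONTROLLED_LOOP * 11, None)
print(flagged, strategy)
```

The neighbour check mirrors the assumption that suppression spans a sequence of blocks, so an isolated block with a matching cluster is not reported.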
With more than 11M blocks and over 1B transactions, we were compelled to make trade-offs between efficiency and completeness. For instance, to detect displacement attacks, we had to set a window size of 100 blocks, meaning that we could not detect displacement attacks where an attacker’s transaction and a victim’s transaction are more than 100 blocks apart. Another example is insertion detection, where we assume that the attacks occur within the same block. However, this assumption does not always hold, as transactions might be scattered across different blocks during the mining process. Theoretically, it would be possible to attack victims using attacker accounts directly for displacement and suppression attacks. However, our detection heuristics rely on the existence of the bot contracts to identify attackers as a single entity. Considering these limitations, all the results presented in this paper should be interpreted as lower bounds, and they might be solely the tip of the iceberg.
In this section, we analyze the results of our large-scale measurement study on detecting frontrunning in Ethereum.
We implemented our detection modules using Python with roughly 1,700 lines of code (the code and data will be publicly released). We ran our modules on the first 11,300,000 blocks of the Ethereum blockchain, ranging from July 30, 2015 to November 21, 2020. All our experiments were conducted using a machine with 128 GB of memory and 70 Intel(R) Xeon(TM) L5640 CPUs with 12 cores each and clocked at 2.26 GHz, running 64-bit Ubuntu 16.04.6 LTS.
Overall Results.
We identified a total of 2,983 displacement attacks from 49 unique attacker accounts and 25 unique bot contracts. Using the graph analysis defined in Section 4.1, we identified 17 unique attacker clusters.
Profitability.
We compute the gain of an attacker $A$ on each displacement attack by searching how much ether $EOA_A$ receives among the internal transactions triggered by $T_A$. Additionally, we obtain the profit by subtracting the attack cost from the gain, where cost is defined solely by the fees of $T_A$. Finally, for each attack we convert the ether cost and profit into USD by taking the conversion rate valid at the time of the attack.
Attacks.
We can see in Table 1 the distribution of each variable we collected per displacement attack. The cost and the profit do not appear to be very high for most of the attacks, but the distributions of both variables present very long tails to the right. Additionally, we compute the Gas Price ∆ as the gas price of $T_A$ minus the gas price of $T_V$. This value indicates how much the attacker $A$ is willing to pay the miners so that they execute $T_A$ before $T_V$. Table 1 shows that most of the attacks contain a very small gas price difference in GWei (which cannot be represented with only two digits of precision), but there are very extreme cases with a difference close to 50 GWei. Furthermore, we compute the Block ∆ to indicate how many blocks are between the execution of $T_A$ and $T_V$. Again, we can see in Table 1 that for most of the attacks, both transactions were executed in the same block, but there are some extreme cases with a long block distance of 19 blocks.

        Cost (USD)   Profit (USD)   Gas Price ∆ (GWei)   Block ∆
mean         14.28       1,537.99                 0.43      0.78
std          18.25       7,162.80                 2.65      2.37
min           0.01           0.00                 0.00      0.00
25%           4.36           1.14                 0.00      0.00
50%           9.48         158.53                 0.00      0.00
75%          16.64         851.04                 0.00      0.00
max         311.69     223,150.01                52.90     19.00

Table 1: Distributions for displacement attacks.
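The per-attack variables for displacement can be written down as a short sketch. It assumes that the gain (ether received by the attacker's account through internal transactions) and the USD conversion rate are already known; the `Tx` fields, the function name, and the sign convention for Block ∆ (victim block minus attacker block) are illustrative assumptions.

```python
from dataclasses import dataclass

WEI_PER_ETHER = 10**18
WEI_PER_GWEI = 10**9


@dataclass
class Tx:
    gas_used: int
    gas_price: int      # in wei per gas unit
    block_number: int


def displacement_metrics(t_a, t_v, gain_wei, usd_per_ether):
    """Cost, profit, Gas Price delta and Block delta for one displacement attack."""
    cost_wei = t_a.gas_used * t_a.gas_price   # cost is defined solely by the fees of T_A
    profit_wei = gain_wei - cost_wei          # profit = gain - cost
    return {
        "cost_usd": cost_wei / WEI_PER_ETHER * usd_per_ether,
        "profit_usd": profit_wei / WEI_PER_ETHER * usd_per_ether,
        "gas_price_delta_gwei": (t_a.gas_price - t_v.gas_price) / WEI_PER_GWEI,
        "block_delta": t_v.block_number - t_a.block_number,
    }
```

For example, an attacker transaction that uses 100,000 gas at 50 GWei costs 0.005 ether in fees; with a gain of 1 ether, the profit is 0.995 ether before conversion to USD.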
Attacker Clusters.
Every cluster contains bot accounts with different bytecode, with the exception of one cluster that contains three bot accounts with the exact same bytecode. Table 2 presents the distribution of each attacker cluster variable. The first variable describes profit, where we can see that a single attacker mounted 2,249 attacks, making an accumulated profit of more than 4.1M USD while spending over 40K USD in transaction fees. We can also see that the attacker used 16 different accounts and 3 different bots to mount its attacks. The minimum amount of profit that an attacker made with displacement is 0.01 USD. Overall, the average number of attacks per attacker cluster is 175.47, using 2.88 accounts and 1.47 bots. However, we also observe from the distribution that at least half of the attackers only use one account and one bot contract.
        Cost (USD)   Profit (USD)    Attacks   Attacker Accounts   Bot Contracts
mean      2,505.09     269,872.45     175.47                2.88            1.47
std       9,776.51   1,005,283.40     555.03                3.89            0.80
min           0.05           0.01       1.00                1.00            1.00
25%           0.14           3.53       1.00                1.00            1.00
50%           3.98         726.70       5.00                1.00            1.00
75%          65.78       4,670.94       8.00                3.00            2.00
max      40,420.63   4,152,270.01   2,249.00               16.00            3.00

Table 2: Distributions for displacement attacker clusters.
Overall Results.
We identified a total of 196,691 insertion attacks from 1,504 unique attacker accounts and 471 unique bot contracts. Using the graph analysis defined in Section 4.1, we identified 98 unique attacker clusters.
Profitability.
We compute the cost for each attack as the sum of the amount of ether an attacker spent in $T_{A_1}$ and the fees imposed by transactions $T_{A_1}$ and $T_{A_2}$. Additionally, we compute the profitability of an attack as the amount of ether an attacker gained in $T_{A_2}$ minus the cost. Finally, for each attack we convert the ether cost and profit into USD by taking the conversion rate valid at the time of the attack.
Attacks.
We can see in Table 3 the distribution of each variable we collected per insertion attack. The cost and the profit do not appear to be very high for most of the attacks, but the distributions of both variables present very long tails to the right. Note that the profit also presents very large negative values to the left, meaning that there are extreme cases of attackers losing money. Additionally, we compute Gas Price ∆1 and Gas Price ∆2 as the gas price of $T_{A_1}$ minus the gas price of $T_V$, and the gas price of $T_V$ minus the gas price of $T_{A_2}$, respectively. These values indicate how much the attacker $A$ is willing to pay the miners so that they execute $T_{A_1}$ before $T_V$, and whether $T_{A_2}$ can be executed after $T_V$. Table 3 shows that 25% of the attacks contain a very small Gas Price ∆1 in GWei (which cannot be represented with only two digits of precision), but that half or more paid a significant difference, reaching some extreme cases of more than 76K GWei. For Gas Price ∆2, most of the attacks have a very small value, but there are extreme cases, which means that some attacks are targeting transactions with very high gas prices.

[Figure: series Bancor, Uniswap V1, Uniswap V2, SushiSwap; annotations mark the release of each exchange; y-axis: Insertion Attacks [log]]
Figure 5: Weekly average of daily insertion attacks per decentralized exchange.
        Cost (USD)   Profit (USD)   Gas Price ∆1 (GWei)   Gas Price ∆2 (GWei)
mean         19.41          65.05                407.63                  3.88
std          51.15         233.44              1,897.47                137.12
min           0.01     -10,620.61                  0.00                  0.00
25%           4.09           7.86                  0.00                  0.00
50%           7.74          24.07                  5.25                  0.00
75%          15.23          62.92                 74.10                  0.00
max       1,822.22      20,084.01             76,236.09             27,396.63

Table 3: Distributions for insertion attacks.
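The insertion variables differ from the displacement ones in that two attacker transactions contribute to the cost. The sketch below assumes that the ether spent in the first attacker transaction and the ether gained in the second one have already been extracted; the tuple fields, the parameter names, and the function name are hypothetical.

```python
from collections import namedtuple

WEI_PER_ETHER = 10**18
WEI_PER_GWEI = 10**9

# gas_price is expressed in wei per gas unit.
Tx = namedtuple("Tx", ["gas_used", "gas_price"])


def insertion_metrics(t_a1, t_v, t_a2, spent_wei, gained_wei, usd_per_ether):
    """Cost, profit and both gas price deltas for one insertion attack."""
    fees_wei = t_a1.gas_used * t_a1.gas_price + t_a2.gas_used * t_a2.gas_price
    cost_wei = spent_wei + fees_wei        # ether spent in the first tx plus both fees
    profit_wei = gained_wei - cost_wei     # ether gained in the second tx minus the cost
    return {
        "cost_usd": cost_wei / WEI_PER_ETHER * usd_per_ether,
        "profit_usd": profit_wei / WEI_PER_ETHER * usd_per_ether,
        "gas_price_delta_1_gwei": (t_a1.gas_price - t_v.gas_price) / WEI_PER_GWEI,
        "gas_price_delta_2_gwei": (t_v.gas_price - t_a2.gas_price) / WEI_PER_GWEI,
    }
```

A negative `profit_usd` corresponds to the cases in Table 3 where attackers lose money, for instance when the ether gained in the second transaction does not cover the purchase plus both fees.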
Gas Tokens.
We analyzed how many attacks were mounted using gas tokens. Gas tokens allow attackers to reduce their gas costs. We found that 63,274 (32.17%) of the insertion attacks we measured were performed using gas tokens. 48,281 (76.3%) attacks were mounted using gas tokens only for the first transaction $T_{A_1}$, 1,404 (2.22%) attacks were mounted by employing gas tokens only for the second transaction $T_{A_2}$, and 13,589 (21.48%) attacks were mounted by employing gas tokens for both transactions $T_{A_1}$ and $T_{A_2}$. We also found that 24,042 (38%) of the attacks used GST2, 14,932 (23.6%) used ChiToken, and 24,300 (38.4%) used their own implementation or copy of GST2 and ChiToken.
Exchanges and Tokens.
We identified insertion attacks across 3,200 different tokens on four exchanges: Bancor, Uniswap V1, Uniswap V2, and SushiSwap. Figure 5 depicts the weekly average of daily insertion attacks per exchange. The first AMM-based DEX to be released on Ethereum was Bancor in November 2017. We observe from Figure 5 that the first insertion attacks started in February 2018, targeting the Bancor exchange. We also see that the number of insertion attacks increased tremendously with the rise of other DEXes, such as Uniswap V1 and Uniswap V2. While it took 3 months for attackers to launch their first insertion attacks on Uniswap V1, it only took 2 weeks to launch attacks on Uniswap V2 and 5 days to launch attacks on SushiSwap. This is probably because the core functionality of Uniswap V1 and Uniswap V2 is the same and SushiSwap is a direct fork of Uniswap V2. Thus, for attackers it was probably straightforward to take their existing code for Uniswap V1 and adapt it to attack Uniswap V2 as well as SushiSwap. The peak of insertion attacks was on October 5, 2020, with 2,749 daily attacks. We measured in total 3,004 attacks on Bancor, 13,051 attacks on Uniswap V1, 180,185 attacks on Uniswap V2, and 451 attacks on SushiSwap. Table 4 shows the different combinations of exchanges that attackers try to front-run. We see that most of the attackers focus on attacking Uniswap V2, with 72 attacker clusters (73.47%). We also see that 92.86% of the attackers only focus on attacking one exchange. Moreover, we observed one attacker that attacked all 4 exchanges, 2 attackers that attacked Uniswap V1 and Uniswap V2, and 4 attackers that attacked Uniswap V2 and SushiSwap. The latter is expected since SushiSwap is a direct fork of Uniswap V2. Hence, the attackers can reuse their code from Uniswap V2 to attack SushiSwap. What is interesting, though, is that no attacker attacks only SushiSwap: attackers always attack SushiSwap in conjunction with another exchange.

Exchange Combination                          Attacker Clusters
Uniswap V2                                                   72
Uniswap V1                                                   16
SushiSwap, Uniswap V2                                         4
Bancor                                                        3
Uniswap V1, Uniswap V2                                        2
Bancor, SushiSwap, Uniswap V1, Uniswap V2                     1

Table 4: Exchange combination count by attacker cluster.
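A count such as the one in Table 4 can be derived by collecting, for every attacker cluster, the set of exchanges it attacked. The following sketch assumes that each detected attack is available as a hypothetical `(cluster_id, exchange)` pair; the function names are illustrative.

```python
from collections import Counter


def exchange_combinations(attacks):
    """Count attacker clusters per combination of attacked exchanges.

    `attacks` is an iterable of (cluster_id, exchange) pairs, one per attack.
    """
    exchanges_by_cluster = {}
    for cluster_id, exchange in attacks:
        exchanges_by_cluster.setdefault(cluster_id, set()).add(exchange)
    # Sort the names so that the same set always yields the same label.
    return Counter(", ".join(sorted(ex)) for ex in exchanges_by_cluster.values())


def single_exchange_share(counts):
    """Fraction of clusters whose combination consists of exactly one exchange."""
    total = sum(counts.values())
    single = sum(n for combo, n in counts.items() if "," not in combo)
    return single / total
```

Applied to the values reported in Table 4, `single_exchange_share` gives (72 + 16 + 3) / 98, which matches the 92.86% of attackers that focus on one exchange.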
Attack Strategies.
In 186,960 cases (95.05%) the attackers sold the exact same amount of tokens that they purchased. Thus, an easy way to spot insertion attacks on decentralized