Smart Contract (in)Security – Bad Arithmetic

This blog post is the second in a series that will describe some simple real-world smart contract security bugs, how they were exploited, the resulting impacts and the corresponding code fixes. So far, we have reached $30M USD of our series target of more than $250M USD in losses directly attributable to smart contract security bugs. This time, we will make two deposits of exactly 57,896,044,618,658,097,711,785,492,504,343,953,926,634,992,332,820,282,019,728,792,003,956,564,819,968 tokens each, for a grand total of 0 tokens!

What are smart contracts? To summarize the full background found in the first blog post, smart contracts are relatively short, publicly-accessible programs capable of owning large amounts of money and that execute complex transaction logic in an unchangeable and unstoppable fashion on a platform that is rapidly evolving alongside its tools. Solidity is a new language created specifically for Ethereum-based smart contracts that compiles to bytecode and is ultimately executed in the Ethereum Virtual Machine. Smart contracts are here today with well over $1B USD of Ethereum changing hands daily.

Bugs are bad! One of the primary challenges limiting the progress of smart contract adoption is security. It is well known that “every program always has one more bug” and some bugs can be catastrophic. Smart contracts can provide extremely visible, immediate, and impactful examples of code gone wrong.

The old is new again. This particular blog post will specifically cover bad arithmetic (such as integer overflow). While smart contracts provide a new context with unique novelties, this is most certainly not a new problem. Bad arithmetic occupies the attention of almost an entire section in the SEI CERT Oracle Coding Standard for Java. Other languages suffer from the exact same scary scenarios, and these vulnerabilities map to several Common Weakness Enumerations, including #190. Smart contracts offer some new and cool challenges related to their public execution environment, unchangeable nature, variety of stakeholder categories and high logical complexity.

Your very own crypto-currency. The most popular and visible type of contract on Ethereum involves issuing a custom crypto-currency (also called tokens) - you can do it yourself in less than 20 minutes. This type of contract essentially implements a data structure that keeps track of balances, along with a few functions that perform transfers by correctly adjusting account balances in response to transactions. The most popular token standard is ERC-20 with hundreds and hundreds of token varieties implementing the standard wich allows them to be tracked and used by exchanges, tools and other smart contracts. Some of these tokens will likely arrive on mainstream trading exchanges in the near future.

Show me the code. Let’s have a look at the actual code implementing BeautyChainToken, which can be easily found on Etherscan. Scrolling through the code reveals the implementation of a library named SafeMath near the very top, followed by a series of contracts that build on top of each other. Ignore all the confusing stuff, but note that there are frequent references to variables called to, from, amount and balances as well as functions called transfer() and approve(). This is the actual recipe for a fully functional token in less than 300 lines of code.

Let’s inspect the batchTransfer() function starting on line 255 which is replicated below. It takes two parameters – the first parameter (_receivers) is a list of destination account addresses, and the second parameter (_value) is the number of tokens to transfer to each destination account address. Line 256 calculates the number of receiver addresses in the list and then line 257 can calculate the total amount that will be withdrawn from the sender’s account balance. The right half of line 259 makes sure the sender has sufficient balance before line 261 adjusts it downwards by the total amount calculated earlier on line 257. Finally, line 262 starts a loop that performs the individual transfers by adjusting each receiver’s balance upward by _value.

 

255 function batchTransfer(address[] _receivers, uint256 _value) public whenNotPaused returns (bool) {
256   uint cnt = _receivers.length;
257   uint256 amount = uint256(cnt) * _value;
258   require(cnt > 0 && cnt <= 20);
259   require(_value > 0 && balances[msg.sender] >= amount);
260
261   balances[msg.sender] = balances[msg.sender].sub(amount);
262   for (uint i = 0; i < cnt; i++) {
263     balances[_receivers[i]] = balances[_receivers[i]].add(_value);
264     Transfer(msg.sender, _receivers[i], _value);
265   }
266   return true;
267 }

 

One of the most common data types in Solidity is the unsigned 256-bit integer (uint256 or uint) which can represent any value between 0 and 2256-1 inclusive. If contract code performs an ordinary arithmetic operation yielding a result outside this range, the result silently wraps around and contract execution continues with bad data – sometimes catastrophically. For this reason, the contract shown above specifically uses the specialized .sub() and .add() functions (on lines 261 and 263 respectively) from the SafeMath library we saw earlier. These common functions are written to abort if an underflow or overflow is encountered, which will prevent execution from continuing with bad data. You might be asking why line 257 uses the ordinary '*' rather than .mul() from SafeMath…

Bug! Regarding line 257, if _value were set to 2255 and then doubled because of a cnt of two calculated on the prior line, the result would be stored into amount as 0 (rather than 2256) due to the wrap-around effect. This is the exact bug we are aiming to unpack – execution would now continue with bad data in play. The _value requested was huge but the total calculated amount is 0! The (bad) amount of 0 would then pass the balance check on line 259 and continue through the balance adjustment on line 261 just fine. However, the loop that performs the individual transfers utilizes the massive _value variable (with the amount variable no longer referenced). This means the loop would increase each of the two destination balances by 2255. Jackpot!

What happened? An attacker called the batchTransfer() function shown above and provided it with a _receivers list containing two addresses and a _value parameter set to 2255. This can be clearly seen in the actual Etherscan transaction here (click on ‘Decode Input Data’). As described, the cnt of addresses is calculated to be 2, so multiplying that with the value of 2255 causes an overflow to an amount of 0. Execution continued onward with this bad data. The various checks pass, the sender’s balance is decreased by an amount of 0, and finally two massive deposits of _value are made into the _receivers balance. By the way, if you haven’t guessed it already, the large number referenced in the very first paragraph of this post happens to be 2255. For a real-money spend of approximately $0.02 USD to fund the attack transaction, two massive lots of tokens were created from thin air and deposited into the _receiver addresses. This resulted in the loss of trust in this token contract and thus completely devastated its value.

Let’s fix it. All we have to do is replace the ordinary '*' on line 257 with .mul() as shown below. Lines prefixed with ‘–’ are to be removed from the contract code while those prefixed with ‘+’ are to be added.

 

-    uint256 amount = uint256(cnt) * _value;
+    uint256 amount = uint256(cnt).mul(_value);

 

Simple in retrospect? While almost all security bugs become obvious in retrospect, developers have been making these same mistakes for well over 20 years and continue to do so every day. Note that the above fix is moot for the BeautyChainToken, because there are no software updates for smart contracts. Once smart contracts are deployed they are essentially unchangeable and unstoppable beyond their original code. So our incredibly simple fix consisting of just a handful of characters can only be a lesson for the future, underscoring the theme of “do not allow arithmetic to silently go wrong and then continue execution with bad data”.

Not so simple. Tokens were one of the first use cases for smart contracts and have been very carefully developed in the open by top experts. The presence of the SafeMath library demonstrates that this is a known problem area and the code we see above indeed makes use of it. In fact, the ERC-20 coin standard itself has several known problems that remain only partially addressed (at best). Part of the challenge is that the development tool ecosystem is still emerging. Formal methods and tools are beginning to appear that will automatically find these issues. Unfortunately, it is still early days and smart contracts remain the Wild West for now.

The implosion of trust resulting from 2255 tokens appearing out of thin air (twice) is hard to measure. We’ll run with $1M USD as a super conservative estimate that is roughly 1/20th the average funds raised in an Initial Coin Offering. This brings our running total to $31M USD. Stay tuned, as there is always one more bug!

Smart Contract (in)Security -- Broken Access Control

This blog post is the first of a series that will describe some simple real-world smart contract security bugs, how they were exploited, the resulting impacts and the corresponding code fixes. The series will cover more than $250M USD in losses directly attributable to smart contract security bugs.

What are smart contracts? Imagine relatively short, publicly-accessible programs capable of owning large amounts of money and that execute complex transaction logic in an unchangeable and unstoppable fashion on a platform that is rapidly evolving alongside with its tools. Think blockchain 2.0 where transactions invoke code execution. These autonomous programs issue and manage private currencies, implement novel funding systems, tokenize capital assets, enable new business models and even breed digital cats called CryptoKitties! There is no imagination required - the fascinating and flourishing world of smart contracts is already here with well over $1B USD of Ethereum changing hands daily.

Bugs are bad! One of the primary challenges limiting the progress of smart contract adoption is security. It is well known that "every program always has one more bug", and some bugs can be catastrophic. In the case of smart contracts, security bugs may exist wherever code behavior diverges from developer intent. This applies to both local low-level behavior through to the global system or social context (like front-running). Further, sometimes intent is stated and sometimes intent is unstated. Clearly, 'security' is a big, broad umbrella term in this application space.

Solidity is a new language created specifically for Ethereum-based smart contracts and has many similarities to JavaScript, Java and C. Contracts are compiled to bytecode, embedded on the blockchain via a special transaction, and ultimately executed in the Ethereum Virtual Machine when stimulated by other transactions. As most security bugs are shockingly simple in retrospect, this blog series will not require deep expertise.

The old is new again. This particular blog post will specifically cover broken access control. While smart contracts provide a new context with unique novelties, this is most certainly not a new problem. This topic can be found as item 5 in the 2017 Open Web Application Security Project (OWASP) Top 10 list of application security risks. Further, access control topics pervade the SEI CERT Oracle Coding Standard for Java and vulnerabilities map to several Common Weakness Enumerations, including #284. Smart contracts offer unique challenges related to their public execution environment, unchangeable nature, different stakeholder categories and high logical complexity.

Show me the code. Consider the Solidity code below excerpted from the actual Parity Multisig Wallet contract intended to manage crypto-currency user accounts. The actual deployed code, warts and all, can be found via the public Etherscan block explorer. The first three function declarations (on lines 1, 6 and 16) shown below are key, with a glitch lurking in each. The initWallet() function is part of the initialization process intended to be executed once at contract creation. That function then calls initMultiowned() and initDaylimit() which deal with recording ownership details and transaction limits respectively. The fourth and last function declaration (line 21) shown below is perfectly fine, but might provide you with a small hint for what could go wrong...

 

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
function initWallet(address[] _owners, uint _required, uint _daylimit) {
    initMultiowned(_owners, _required);
    initDaylimit(_daylimit) ;
}

function initMultiowned(address[] _owners, uint _required) {
    m_numOwners = _owners.length ;
    m_required = _required;
    for (uint i = 0; i < _owners.length; ++i)
    {
        m_owners[1 + i] = uint(_owners[i]);
        m_ownerIndex[uint(_owners[i])] = 1 + i;
    }
}

function initDaylimit(uint _limit) {
    m_dailyLimit = _limit;
    m_lastDay = today();
}

function execute(address _to, uint _value, bytes _data) onlyOwner returns(bool _callValue) {
    ... // execute transactions!
}

Bug 1. The fourth and last function declaration (line 21) shown above includes the onlyOwner modifier which acts as a gatekeeper, restricting the ability to invoke the execute() function to the contract owner. Here we see one of the primary purposes behind modifiers - they support access control on a granular per-function basis. Auditors (and hackers) will inspect each and every function for access control weaknesses via missing modifiers. Voila - the initWallet() function should similarly include a gatekeeper modifier that restricts the function execution to only when the contract is uninitialized (e.g. just once). There is no modifier here so the initWallet() function can be called repeatedly, with parameters from later invocations overwriting parameters from prior invocations. These parameters include the (new) address of the contract owner. Yes, ownership can be repeatedly changed! Hmmm...

Bug 2. Furthermore, contract functions can also be constrained by having different visibility modifiers such as internal, external, public, etc. The middle two function declarations (lines 6 and 16) above are part of an initialization scheme internal to the contract itself and so should be marked as internal which limits external accessibility. However, the visibility modifier is not specified here so it defaults to public. Yes, these functions can be called by anyone! Hmmm...

What happened? An attacker called the initWallet() function shown above, transferred ownership to themselves and set a high day (transaction) limit. Etherscan clearly shows this transaction here. Next, the attacker called the execute() function, now as the contract owner, to transfer Ethereum out of the contract and into their own account. Etherscan clearly shows this series of transactions here, where individual transactions are at the bottom of the page and an account balance of $30M USD is shown near the top. Soon after this, two white-hat teams quickly stepped in to preemptively drain other accounts containing over $200M USD for subsequent return to the rightful owners.

Let's fix it. The key issue boils down to an initialization function able to set ownership and that can unfortunately be called repeatedly by anyone. The visibility issue on the other two functions is related but ancillary - remember "there is always one more bug". The corresponding fixes are not overly complicated and the actual GitHub edits are shown below. Lines prefixed with '--' have been removed from the contract code while those prefixed with '+' have been added. There are 4 parts to this fix.

1
2
3
4
5
6
7
8
9
10
11
+  // throw unless the contract is not yet initialized.
+  modifier only_uninitialized { if (m_numOwners > 0) throw; _; }

-  function initWallet(address[] _owners, uint _required, uint _daylimit) {
+  function initWallet(address[] _owners, uint _required, uint _daylimit) only_uninitialized {

-  function initMultiowned(address[] _owners, uint _required) {
+  function initMultiowned(address[] _owners, uint _required) internal {

-  function initDaylimit(uint _limit) {
+  function initDaylimit(uint _limit) internal {

Simple in retrospect? First, a modifier named only_uninitialized is created on line 2 for subsequent use as a gatekeeper - it prevents execution by throwing an exception when the contract has already been initialized. Second, this modifier is applied to the initWallet() function on line 5 so that initialization can only be performed once. Finally, the initMultiowned() and initDaylimit() functions are marked with internal visibility on lines 8 and 11 so they cannot be called from outside the contract.

Not so simple. The Parity Multisig Wallet was very carefully and methodically developed in the open with many, many expert eyeballs on it. A large part of the original motivation for developing it was to mitigate some of the risks associated with holding crypto-currencies. While the bug may look simple in retrospect, it still fell through the cracks and it wasn't the last bug. In this instance, it all boils down to broken access control in some novel circumstances. The lesson is to very carefully inspect function modifiers and function visibility for lurking weaknesses in access control.

Our running total now stands at $30M USD. Stay tuned, as there is always one more bug!