
National Chengchi University

instructions in our program's algorithm, the final simulation results show that each execution of the cycle increases the stack size by one and consumes 1272 gas.

Table 1 shows the result of the symbolic simulation on the EVM stack. There are two path constraints for the loop path. If all the constraints are satisfiable, the loop path can be executed during a transaction. For the first unknown parameter, if it matches the function hash of function removeMember, the execution path jumps to the cycle path. Simulating along the cycle path, we encounter the second unknown parameter, which is the parameter len of function removeMember. The parameter len determines the number of loop iterations. After simulating the cycle path 1024 times, at which point the EVM stack overflows, the path constraint on len is ((1023 < Parameter2) == False) == False. That is, if we set len to a value over 1023, the EVM stack overflows and execution stops.

3.3 Results

According to Ethstat.net [31], at the time of writing the gas limit of each block is approximately 8,000,000 gas. Therefore, if we set the function parameter of the cycle we found to a number over 6290, meaning that more than 6290 members are using this source code, an exception for exceeding the gas limit will occur. From another perspective, in a private blockchain environment the creator of the blockchain can set the block gas limit far above 8 million to avoid hitting the gas limit. In that case, according to our analysis results, setting the parameter len of function removeMember in the example SimpleMember to a number over 1024 causes the program to execute the cycle more than 1024 times, which triggers a stack overflow exception. This example demonstrates how a few lines of code inside a smart contract can affect the execution and its result.
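The gas threshold above can be checked with a short calculation. The following is a minimal sketch, assuming that the loop is the only gas consumer (the transaction base cost and the code outside the cycle are ignored). The function names are illustrative and are not part of our tool.

```python
GAS_PER_CYCLE = 1272         # gas consumed by one pass through the cycle (simulation result)
BLOCK_GAS_LIMIT = 8_000_000  # approximate public block gas limit cited in the text


def loop_gas(iterations: int) -> int:
    """Gas consumed by the cycle alone for the given number of iterations."""
    return iterations * GAS_PER_CYCLE


def min_iterations_exceeding_gas(gas_limit: int) -> int:
    """Smallest iteration count whose loop gas alone exceeds gas_limit."""
    return gas_limit // GAS_PER_CYCLE + 1


def overflows_stack(len_param: int) -> bool:
    """Stack overflow condition taken from the path constraint on len."""
    return len_param > 1023


print(min_iterations_exceeding_gas(BLOCK_GAS_LIMIT))  # 6290
```

Under these assumptions, 6289 iterations consume 7,999,608 gas and still fit within the limit, while 6290 iterations consume 8,000,880 gas and do not.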

Table 1: Symbolic Simulation on Stack

| Opcode | Constraints | Stack |
| --- | --- | --- |
| PUSH 60 | X | [60] |
| MSTORE | X | [] |
| PUSH 0 | X | [0] |
| CALLDATALOAD | X | ['Parameter'] |
| PUSH 10000000... | X | [1000...; 'Parameter'] |
| SWAP1 | X | ['Parameter'; 1000...] |
| PUSH [tag] 16 | X | [22; '((0<Parameter2)==False)';...] |
| JUMPI | ((0<Parameter2) | [0; 0;...] |

EVM opcode runs on a stack-based virtual machine. Unlike ARMv7 instructions for iOS mobile applications, the EVM uses a stack with a fixed maximum size of 1,024 items rather than registers.

We need to keep track of the values in the stack to model the flow precisely. In the first phase, we gather smart contracts from the Ethereum blockchain and compile them into EVM opcode. Once we have the EVM opcode, we investigate static flow analysis techniques and develop our own analysis tool to generate the control flow graph directly from it. The EVM has around 70 kinds of instructions. Using the control flow graph, we traverse all the nodes in the graph to find potential positive cycles.
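As an illustration of how a control flow graph can be derived from a flat opcode listing, the sketch below splits the listing into basic blocks, which become the nodes of the graph. It uses the standard basic-block technique and is an assumption about one possible approach, not a description of our tool. The function name and the choice of block-ending opcodes are illustrative.

```python
from typing import List

# Opcodes after which control leaves the block or execution halts.
TERMINATORS = {"JUMP", "JUMPI", "STOP", "RETURN", "REVERT", "INVALID", "SELFDESTRUCT"}


def split_basic_blocks(opcodes: List[str]) -> List[List[str]]:
    """Split a flat list of opcode strings (e.g. "PUSH1 0x60") into basic blocks."""
    blocks: List[List[str]] = []
    current: List[str] = []
    for op in opcodes:
        name = op.split()[0]
        # A JUMPDEST is a possible jump target, so it always starts a new block.
        if name == "JUMPDEST" and current:
            blocks.append(current)
            current = []
        current.append(op)
        # A terminator closes the block it belongs to.
        if name in TERMINATORS:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks
```

Edges between blocks would then be added from each JUMP or JUMPI to its target JUMPDEST, plus the fall-through edge after a JUMPI. Resolving the jump targets requires tracking the values on the stack, which is why the stack must be modelled precisely.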

If the program contains a positive cycle, the corresponding EVM opcode and control flow graph fragment are extracted for further analysis, such as gas consumption calculation and symbolic simulation on the stack. Note that the graph may not be acyclic, since jumps may produce cycles due to recursive calls or loops.
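The search for positive cycles can be sketched as a depth-first enumeration of simple cycles. This is a minimal sketch, assuming that a positive cycle is a cycle whose blocks have a net stack-size change greater than zero, and that the graph is given as two hypothetical dictionaries: the successors of each block and the net stack change of each block. It is not the exact procedure of our tool.

```python
from typing import Dict, List


def find_positive_cycles(
    successors: Dict[str, List[str]],
    stack_delta: Dict[str, int],
) -> List[List[str]]:
    """Return the simple cycles whose summed stack delta is positive."""
    cycles: List[List[str]] = []

    def dfs(start: str, node: str, path: List[str]) -> None:
        for nxt in successors.get(node, []):
            if nxt == start:
                # The path closes back on its starting block: a simple cycle.
                if sum(stack_delta[b] for b in path) > 0:
                    cycles.append(list(path))
            elif nxt not in path and nxt > start:
                # Only visit blocks with a larger id than the start, so each
                # cycle is reported once, from its smallest block id.
                dfs(start, nxt, path + [nxt])

    for block in sorted(successors):
        dfs(block, block, [block])
    return cycles


# Hypothetical graph: a loop whose body leaves one extra item on the stack.
successors = {
    "entry": ["loop_head"],
    "loop_head": ["loop_body", "exit"],
    "loop_body": ["loop_head"],
    "exit": [],
}
stack_delta = {"entry": 2, "loop_head": 0, "loop_body": 1, "exit": -2}
print(find_positive_cycles(successors, stack_delta))
```

Enumerating all simple cycles can be expensive on large graphs. The sketch favours clarity over efficiency.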

Figure 2 shows our analysis process. In the following sections, we explain the idea behind each analysis procedure.

4.1 Contract opcode extraction

Most information about the Ethereum blockchain can be found on Etherscan [32], a blockchain explorer and platform that contains a great deal of valuable information about Ethereum, for example recent market capitalization, transaction status, block details, and smart contract source code. It is the data source for our experiments. With the smart contract source code, our first step is to compile it into EVM opcode.

There are two major approaches to converting source code into bytecode or EVM opcode. The first is to use the Solidity compiler (solc) [33]: download the package from the Internet and install it on your computer. Once it is installed, use the solc command to compile the smart contract into EVM opcode. With the argument --asm or --asm-json, the compiler generates the EVM opcode from the smart contract source code. For more details about the compiler, use the --help argument.

Figure 2: Smart Contract Analysis Process

However, we prefer another option in our analysis process. With nothing installed except a web browser, the Solidity Remix IDE [34] lets you compile the source code right after writing it and deploy it to a testing, public, or private blockchain. The output from the Solidity Remix IDE contains a part that is not useful for our analysis, called the pre-loader of the smart contract. These opcodes are triggered during the deployment of the smart contract. Because we do not deploy the smart contract during our analysis, the pre-loader opcodes are ignored. We therefore preprocess the compiled output to extract the most important fragment of the EVM opcode.

EVM opcode is a set of operations that tells us how the logic of the source code works. In addition, the EVM is a stack machine: every instruction uses the same stack, pushing and popping items to compute a value. After an instruction executes, the stack state changes in a way that depends on the instruction. Table 2 shows how each instruction changes the stack size. From the instruction set, we can find out how many items are placed on the stack (α) or removed from it (δ) when the instruction is executed. For example, the ADD instruction pops two values from the stack, computes their sum, and pushes the result back onto the stack, so after the EVM executes ADD the stack size decreases by one. As another example, DUP* pops * items from the stack, where * is the number in the instruction name. If * equals three, it pops the top three values from the stack, duplicates the third item, and pushes all the values back onto the stack along with the duplicated item. Therefore, DUP increases the stack size by one.

For further analysis, we want to transform the program logic into a control flow graph, so that by traversing it we can reconstruct the flow of the source code and tell whether anything malicious is hidden inside it.

Table 2: Stack items removed (δ) and added (α) by each instruction

| Instruction | δ | α | Instruction | δ | α |
| --- | --- | --- | --- | --- | --- |
| STOP | 0 | 0 | EXTCODESIZE | 1 | 1 |
| ADD | 2 | 1 | EXTCODECOPY | 4 | 0 |
| MUL | 2 | 1 | BLOCKHASH | 1 | 1 |
| SIGNEXTEND | 2 | 1 | MSTORE8 | 2 | 0 |
| LT | 2 | 1 | SLOAD | 1 | 1 |
| CALLVALUE | 0 | 1 | CREATE | 3 | 1 |
| CALLDATALOAD | 1 | 1 | CALL | 7 | 1 |
| CALLDATASIZE | 0 | 1 | CALLCODE | 7 | 1 |
| CALLDATACOPY | 3 | 0 | RETURN | 2 | 0 |
| CODESIZE | 0 | 1 | DELEGATECALL | 6 | 1 |
| CODECOPY | 3 | 0 | INVALID | NA | NA |
| GASPRICE | 0 | 1 | SELFDESTRUCT | 1 | 0 |
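The (δ, α) pairs of Table 2 can be applied mechanically to track the stack size along a sequence of instructions. The sketch below is a minimal illustration that covers only a few instructions. The fixed entries are taken from Table 2, while the rules for the PUSH, DUP and SWAP families follow the Ethereum Yellow Paper. The function names are illustrative and are not part of our tool.

```python
from typing import Dict, Iterable, Tuple

STACK_LIMIT = 1024  # maximum number of items on the EVM stack

# (delta, alpha): items removed from / placed on the stack, as in Table 2.
STACK_EFFECT: Dict[str, Tuple[int, int]] = {
    "STOP": (0, 0), "ADD": (2, 1), "MUL": (2, 1), "LT": (2, 1),
    "CALLVALUE": (0, 1), "CALLDATALOAD": (1, 1), "SLOAD": (1, 1),
    "MSTORE8": (2, 0), "RETURN": (2, 0),
}


def stack_effect(opcode: str) -> Tuple[int, int]:
    """Return (delta, alpha) for an opcode string such as "ADD" or "DUP3"."""
    name = opcode.split()[0]
    if name.startswith("PUSH"):
        return (0, 1)
    if name.startswith("DUP"):
        n = int(name[3:])       # DUPn reads n items and writes n + 1
        return (n, n + 1)
    if name.startswith("SWAP"):
        n = int(name[4:])       # SWAPn touches n + 1 items, size unchanged
        return (n + 1, n + 1)
    return STACK_EFFECT[name]


def simulate_stack_size(opcodes: Iterable[str], initial_size: int = 0) -> int:
    """Apply each instruction's stack effect and return the final stack size."""
    size = initial_size
    for op in opcodes:
        delta, alpha = stack_effect(op)
        if size < delta:
            raise ValueError(f"stack underflow at {op}")
        size = size - delta + alpha
        if size > STACK_LIMIT:
            raise OverflowError(f"stack overflow at {op}")
    return size


print(simulate_stack_size(["PUSH 1", "PUSH 2", "ADD"]))  # 1
```

Summing α − δ over the instructions of a cycle gives its net stack change per iteration, which is the quantity used to decide whether the cycle is positive.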
