Proceedings of SPIE: SVChecker: A Deep Learning-Based System for Smart Contract Vulnerability Detection
ABSTRACT
Detecting vulnerabilities in smart contracts is a valuable research problem because smart contracts hold a huge amount
of cryptocurrency. Popular detection tools have mainly been based on traditional techniques such as fuzzing
and symbolic execution, which rely on fixed expert-defined features or patterns and often miss vulnerabilities. Recent
machine learning approaches alleviate this issue but do not exploit the semantic information in the source code. In this
paper, we develop a system called SVChecker that classifies smart contract source code written in Solidity. To evaluate
our system, we conduct experiments on more than 40,000 smart contracts collected from Ethereum.
Our experimental results show that our system outperforms the popular detection tools we compared against.
Keywords: Solidity, vulnerability detection, deep learning
1. INTRODUCTION
Since Satoshi Nakamoto first proposed the concept of Bitcoin in 2008 [1], decentralized cryptocurrencies have
flourished and attracted growing attention. A cryptocurrency is a digital currency that
is not controlled by a central bank but is decentralized through blockchain technology. Users maintain
shared data through a specific consensus protocol in the cryptocurrency network, thereby achieving secure transactions.
In recent years, the use of blockchain technology has expanded beyond peer-to-peer transactions, largely owing to the
application of smart contracts.
Smart contracts are programs that run on blockchains and can perform trusted transactions without a third party [2].
Ethereum is the most popular public blockchain platform for running smart contracts [3]. Because blockchains are
immutable, a smart contract cannot be modified once it is deployed. More and
more transactions on Ethereum are executed automatically through smart contracts, and smart
contracts also make it possible to develop various types of decentralized apps on Ethereum.
However, the enormous economic value held by smart contracts has also attracted the attention of
attackers. Solidity is the most common high-level language used to write smart contracts on Ethereum [4].
Smart contracts are written by programmers, who cannot guarantee that their code is free of
vulnerabilities. Combined with the immutability of deployed contracts, this creates a tremendous economic threat once
a smart contract is attacked. Therefore, more and more researchers are working on detecting vulnerabilities in smart
contracts. In this paper, we propose a deep learning-based system for detecting vulnerabilities in smart contracts written
in Solidity.
Contributions. Our contributions are:
• We propose a method to extract specific code snippets from Solidity source code and label them as malicious or benign.
Each code snippet focuses on the data flow of a particular variable. Our approach also helps generate a
dataset that is better suited to training deep learning models. We release that dataset at the link below.
• We present the design and implementation of a deep learning-based smart contract vulnerability detection system,
called Solidity Vulnerability Checker, for source code written in Solidity. This system can extract the code
#yuanyemse@gmail.com; *28922111@qq.com
2. RELATED WORK
Existing work on smart contract vulnerability detection can be divided into two categories: conventional detection
methods and machine learning-based methods. Among conventional methods, ContractFuzzer [10] identifies
vulnerabilities by fuzzing and monitoring runtime behavior during execution, and Oyente is a representative
symbolic execution tool. A common weakness of such tools is that they detect new types of
vulnerabilities poorly. Among machine learning-based methods, Zhuang et al. [11] introduced a temporal message propagation
network (TMP) and a degree-free GCN (DR-GCN) to automatically detect smart contract vulnerabilities. Eth2Vec [12] is a
machine learning-based static analysis tool aimed particularly at detecting code rewriting attacks; it takes Ethereum
Virtual Machine bytecode as input rather than Solidity source code.
3. DESIGN OF SVCHECKER
In this section, we present the Solidity Vulnerability Checker (SVChecker). Our objective is to design a vulnerability
detection system for smart contracts written in Solidity. The system takes smart contract source code as input
and reports whether it is vulnerable. An overview of the proposed system is shown in Figure 1; it
consists of two phases: (a) a training phase and (b) a detection phase. SVChecker can be divided into three
core modules: (1) code snippet extraction; (2) a deep learning model; and (3) a detector for unknown source code. The
following subsections explain the functions of these three core modules in detail.
3.1 Code snippets extraction
We represent the source code as vectors that capture contextual semantic relations. However, directly using the
entire source code is not a good choice because it contains much irrelevant information. To improve performance, we
first transform programs into a representation based on code snippets. We observe that most
vulnerabilities in smart contracts have one of two causes: (1) incorrect operations on variables, such as integer
overflow; (2) improper use of API function calls, which leads to vulnerabilities such as timestamp dependency and
reentrancy. For example, an incorrect add operation on a uint variable
may cause integer overflow, and improper use of call.value (a Solidity API) may cause reentrancy. In addition, we
assume that Library and Event statements in Solidity source code do not cause vulnerabilities, so we ignore
them.
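The filtering idea described above can be sketched in a few lines of Python. This is only an illustrative approximation, not the extraction module of SVChecker: the regular expressions, the function name extract_snippet_lines, and the sample contract are all assumptions made for this example, and the sketch selects individual statements rather than tracing the data flow of a variable.

```python
import re
from typing import List

# Assumed indicators, chosen for illustration; the paper does not list its exact rules.
API_CALLS = re.compile(r"\.call\.value\s*\(|\bblock\.timestamp\b")
ARITHMETIC = re.compile(r"\+=|-=|\*=|\w\s*[+\-*]\s*\w")
LIBRARY_DECL = re.compile(r"^library\b")
EVENT_DECL = re.compile(r"^event\b")


def extract_snippet_lines(source: str) -> List[str]:
    """Keep statements with arithmetic or risky API calls.

    Library bodies and event declarations are skipped entirely.
    """
    kept = []
    in_library = False
    depth = 0  # brace depth, tracked only while inside a library block
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue
        if not in_library and LIBRARY_DECL.match(line):
            in_library = True
            depth = 0
        if in_library:
            depth += line.count("{") - line.count("}")
            if depth <= 0 and "}" in line:
                in_library = False  # closing brace of the library reached
            continue
        if EVENT_DECL.match(line):
            continue
        if API_CALLS.search(line) or ARITHMETIC.search(line):
            kept.append(line)
    return kept


# Hypothetical contract used only to exercise the sketch.
SAMPLE = """
pragma solidity ^0.4.24;

library SafeMath {
    function add(uint a, uint b) internal pure returns (uint) {
        uint c = a + b;
        require(c >= a);
        return c;
    }
}

contract Bank {
    mapping(address => uint) balances;
    event Withdrawn(address who, uint amount);

    function withdraw(uint amount) public {
        require(balances[msg.sender] >= amount);
        msg.sender.call.value(amount)();
        balances[msg.sender] -= amount;
    }
}
"""

if __name__ == "__main__":
    for statement in extract_snippet_lines(SAMPLE):
        print(statement)
```

On the sample contract, the sketch keeps the call.value statement and the balance update, and drops everything inside the library as well as the event declaration. A real extractor would work on a parsed syntax tree rather than on regular expressions.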
Finally, we observe that the Oyente tool incorrectly judges the add function in the SafeMath library as vulnerable
(Figure 3). Our code snippet extraction module, however, does not extract this code at all, which shows from another
angle that the module is useful.
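To illustrate why a checked add of this kind is not an overflow risk, the following Python sketch mimics wrapping uint256 addition (the behavior of Solidity versions before 0.8) and a SafeMath-style guard. It assumes the library follows the widely used pattern of computing c = a + b and requiring c >= a; the function names unchecked_add and safe_add are hypothetical and do not come from the paper.

```python
UINT256_MAX = 2**256 - 1


def unchecked_add(a: int, b: int) -> int:
    """Mimic wrapping uint256 addition: the result is reduced modulo 2**256."""
    return (a + b) & UINT256_MAX


def safe_add(a: int, b: int) -> int:
    """Mimic a SafeMath-style add: reject the result if the sum wrapped around."""
    c = unchecked_add(a, b)
    if c < a:
        # In Solidity this would be a failed require(), which reverts the transaction.
        raise OverflowError("addition overflow")
    return c


if __name__ == "__main__":
    print(unchecked_add(UINT256_MAX, 1))  # wraps around to 0
    print(safe_add(1, 2))
    try:
        safe_add(UINT256_MAX, 1)
    except OverflowError as exc:
        print("rejected:", exc)
```

The unchecked version silently wraps to zero, whereas the guarded version rejects the operation, so flagging the guarded add as an overflow is a false positive.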
As we can see, SVChecker achieves the best experimental results in detecting various types of vulnerabilities, and its
detection rate is far ahead of the other detection tools. We make the following observations. First, the deep
learning-based system SVChecker outperforms the pattern-based static analysis detection systems. Second, for the types of
vulnerabilities that Oyente can detect, the detection rate of SVChecker exceeds 80%. Even for types of
vulnerabilities that do not appear in the training set, SVChecker achieves a certain detection rate. We attribute this
to the powerful learning ability of the Transformer encoder. In contrast, the other detection tools
perform poorly on SmartBugs.
5. CONCLUSION
In this paper, we present the SVChecker system for smart contract vulnerability detection. We propose a practical
method to extract specific code snippets from the source code, which helps the neural network perform
better. We demonstrate that SVChecker achieves higher detection accuracy than existing
popular detection tools and can handle different types of vulnerabilities.
ACKNOWLEDGEMENTS
This work is supported by the National Natural Science Foundation of China (No. 61962005) and the Guangxi University
Young and Middle-aged Teachers’ Basic Scientific Research Ability Improvement Project (No. 2021KY1934).
REFERENCES
[1] Nakamoto, S., “Bitcoin: A peer-to-peer electronic cash system [EB/OL],” (2009).
https://bitcoin.org/bitcoin.pdf
[2] Zou, W., Lo, D., Kochhar, P. S., Le, X. B. D. and Xu, B., “Smart contract development: Challenges and
opportunities,” IEEE Transactions on Software Engineering, 47(10), 2084-2106 (2019).
[3] Wood, G., “Ethereum: A secure decentralised generalised transaction ledger,” Ethereum Project Yellow
Paper, 151, 1-32 (2014).
[4] Dannen, C., [Introducing Ethereum and Solidity], CA: Apress, Berkeley, (2017).
[5] Durieux, T., Ferreira, J. F., Abreu, R. and Cruz, P., “Empirical review of automated analysis tools on
47,587 Ethereum smart contracts,” 530-541 (2019). arXiv:1910.10601
[6] Luu, L., Chu, D. H., Olickel, H., et al., “Making smart contracts smarter,” 2016 ACM SIGSAC Conf.,
254-269 (2016).