Cointime

Download App
iOS & Android

A Powerful Formal Verification Engine for Solidity Smart Contracts

Validated Project

Smart Contracts are computer programs enabling decentralized applications on blockchain. Smart contracts are usually written in Turing-complete programming languages such as Solidity. Solidity is a JavaScript-like programming language where “contract’’ is the first-class object. In Solidity, a contract can extend its capability by inheriting other contracts or delegating jobs to them. Such composability complexity makes smart contracts vulnerable to many security threats.

Figure 1 shows the basic workflows of smart contracts from their development to their deployment. In general, developers write smart contract code in files with the suffix “.sol” and then invoke the Solidity compiler to translate the source code into bytecode. Then, the bytecode can be deployed to blockchain platforms such as Ethereum. Finding security vulnerabilities in smart contracts has always attracted much attention from security experts or attackers since the notorious DAO attack in 2016. Many approaches have been proposed to detect contract vulnerabilities and have achieved enormous success. Yet, we investigated the existing approaches and found that we do lack a powerful formal verification engine for Solidity smart contracts. The existing approaches on smart contract security can be classified according to their input, namely, either bytecode or source code. Most approaches target bytecode-level analysis via fuzzing and symbolic execution. Some approaches target source code-level analysis via formal verification and AST-based static analysis. However, the aforementioned approaches are prone to either poor accuracy or low efficiency. Briefly, the bytecode-level analyses usually produce more accurate results compared to the source code-level analyses, but the latter could find more vulnerabilities than the former.

Figure 1. The basic workflow of smart contracts.

Formal verification is the process of using formal methods of mathematics to “inspect” a program to prove or disprove its correctness. A powerful smart contract formal verification engine should be supported by a source-level symbolic execution engine to achieve accuracy and scalability. In this post, we introduce SolSEE, the first source-level symbolic execution engine for Solidity smart contracts. Currently, SolSEE is capable of executing smart contracts written in various Solidity versions, supporting harness customization, detecting common contract vulnerabilities, and performing bounded formal verification.

Introduction to SolSEE

Figure 2. The architecture of SolSEE.

The symbolic execution module of SolSEE supports a majority of the Solidity language features, including the intra- and inter-contract function calls, multiple inheritances, library support via the “using ... for” construct, low-level function calls such as “.call.value()()” with the associated fallback mechanism, modifiers, and many others. Like the Solidity compiler, SolSEE automatically generates getter functions for public smart contract variables of elementary types. Like other source-level analyzers for smart contracts, SolSEE does not support inline assembly. SolSEE also introduces a supplementary assume() statement that can be used to specify assumption conditions when verification is performed, as shown in Listing 2, line 5. In SolSEE, require() and assert() have different semantics, although in Solidity, both functions could lead to a transaction with all its effects on the state being reverted, if the required/asserted condition is not satisfied. In Solidity, require() is used to check a condition that is expected to fail occasionally, e.g., a guard condition on function input arguments. Thus, should the expression enclosed in require() evaluate to be false, SolSEE rollbacks all effects of the transaction on the smart contract state. We consider each statement in the harness function as one transaction. The semantics of assert() corresponds to its purpose in Solidity: it is used to check conditions that should never evaluate to be false. SolSEE stops execution and reports a violation if the asserted condition is violated.

In addition, SolSEE also reports any detected vulnerabilities, e.g., integer under- and overflows—a common issue in Solidity smart contracts, which heavily utilize unsigned integers to store important information such as token balances. SolSEE takes a modular arithmetic approach to handle unsigned integers of various sizes (from uint8 to uint256): it models them using Z3 integers constrained by range assertions to follow the semantics of unsigned integer arithmetic operations in Solidity. Although using bit vectors is another popular approach to unsigned model integers, it has been shown to have scalability issues. To model one- and multi-dimensional arrays and mappings, SolSEE relies on the array theory.

In SolSEE, although the symbolic execution process can proceed with the randomly generated interaction sequences for the analyzed smart contract(s), we also support the symbolic execution guided by the harness function provided by the user to orchestrate more meaningful interactions. During path exploration and assertion/property checking, SolSEE relies on Z3, an SMT-solver, to resolve constraints. Using a harness function makes symbolic analysis performed by SolSEE highly configurable, which is necessary to analyze complex smart contract code effectively and efficiently in a realistic setting, which is demonstrated by our evaluation. Additionally, a harness can also be used to encode properties about the execution trace or smart contract invariants in a form of assertions. To optimize tool performance, smart contract variables in a harness or analyzed smart contracts are assumed to have concrete (default) values unless they are declared as symbolic. Ethereum balance of a harness ( __MAIN__ ) smart contract is assumed to be symbolic too.

Demonstration by Harness Example

Listing 1 shows a sample Solidity implementation of a token smart contract. In Ethereum, users interact with a smart contract by initiating transactions that invoke its functions with public or external visibility. Token’s deposit() function (lines 18–20) allows the agreement to accept ETH, a native cryptocurrency of Ethereum, and records the deposited amount (msg. value) in a balances mapping entry associated with a transaction sender (msg. sender). The recover() function (lines 21–23) sends all ETH balance of the Token to its owner, the address that performed contract deployment (i.e., creation) which involved the execution of a constructor function (lines 14–17). Access control on recover() is implemented using the functionality of the Ownable smart contract (lines 1–10) that Token inherits from (line 11). Thus, the signature of recover() (line 21) contains an invocation of the only owner modifier (lines 7–8). The modifier adds additional behavior, such as a prerequisite (line 7), to a function body that replaces “_;” in a modifier code (line 8). The transfer of ETH (line 22) is performed via Ethereum’s built-in .call.value()() function call. Given a smart contract written in Solidity, SolSEE symbolically represents the (storage/memory) configuration of the smart contract and executes each statement based on the operational semantics of Solidity. Our developed operational semantics for Solidity supports many vital features including inheritance, modifiers, ETH transfer, and others. During symbolic execution, this representation is directly used to determine the satisfiability of the generated path constraints using a Z3 SMT-solver. To facilitate the efficient exploration of interesting innovative contract behaviors in a realistic setting, SolSEE supports a user-defined harness function that specifies the sequence of function calls to be analyzed symbolically. The harness definition follows exactly the syntax and semantics of Solidity, which is intuitive and easy to use for developers.

Listing 2 illustrates how to define a harness in SolSEE. The __MAIN__ contract serves as the entry point (similar to the primary () function in C), and the harness is defined as a constructor of it. The harness contains the declaration of a symbolic variable which should come with the prefix “$” (line 4) and the creation of the Token smart contract which involves transferring a symbolic amount of ETH to this contract (line 6). The harness also includes a call to the Token’s recover() function, executed successfully, because the call is invoked by the MAIN contract, satisfying the only owner modifier. Lines 10–12 define a fallback function—another Solidity-specific feature supported by SolSEE. Here, the fallback function is invoked when ETH is transferred to a smart contract, i.e., upon the execution of the “.call.value()()” in Listing 1, line 22. Since the execution of recover() succeeds, the assertion in Listing 2, line 8, should always be satisfied. The argument in a harness function can also be used to declare a property, which can be constructed using Solidity operators and variables in the __MAIN__ contract. In addition, SolSEE allows assumptions to be specified in terms of smart contract variables using the assume() statement. For example, line 5 in Listing 2 indicates that the analysis will only be concerned with the paths where the ETH balance of the __MAIN__ contract is not less than the amount of ETH being sent to the Token upon its creation.

Comparison with the SOTAs

We demonstrate the capabilities of SolSEE and compare them with other source-level tools for Solidity smart contracts. The tools we compare with include sold-verify, VeriSol v0.1.5 and its modified version used in SmartPulse (denoted as VeriSol-SP in Table 1), VeriSmart, and SMTChecker had in sold v.0.5.11. All these tools claim that they can identify assertion failures, which is the capability we evaluate. We evaluated SolSEE and these tools on our running example (Listing 1) and a dataset of nine different features in Solidity. Table 1 summarizes how these features are supported by different tools. Each of the nine features has one corresponding smart contract, as shown in the first two columns of Table 1.

Our results demonstrate that all tools used for comparison lack support for specific Solidity features, except SolSEE. For example, VeriSol does not support modifiers with multiple “_;” statements (witness: MultipleModfiers). VeriSmart cannot handle fallback functions (witness: FallbackFunction) or correctly analyze arithmetic operations on variables of unsigned integer (uint) types (witness: UintOverflow). sold-verify and VeriSol follow the semantics of arithmetic operations for uint in Solidity, only if a non-default modular arithmetic mode is enabled. sold-verify fails to process a bytes32 array in NewBytesArray, while variables of type struct are not supported by SMTChecker (witness: Structs). In addition, our results show that the analyzed tools report a potential violation of numerous assertions, which are, in fact, false positives (e.g., sold-verify supports most Solidity features used in experimental intelligent contracts, but in many of them it reports all assertions as potentially failing). We attribute this fact to the lack of harness function support and missing or incorrect handling of Solidity language features and their semantics. For example, none of these tools, except SolSEE and SmartPulse, correctly implement C3 Linearization that Solidity uses to decide the order in which methods are inherited in the presence of multiple inheritances (witness: multiple inheritances). Besides, while VeriSmart and SMTChecker process some of our intelligent contracts correctly, they can only do so if all the functionality is stored in a single agreement, i.e., no external function calls are allowed. Regarding the speed of analysis, our tool is as efficient as other tools used for comparison.

Related work

Most of the existing symbolic execution tools for smart contracts operate on the bytecode (rather than source-code) level, which retains limited semantic information about the smart contract and, hence, complicates reasoning about high-level properties of a smart contract. Most of these tools focus on detecting well-known vulnerabilities based on a certain pattern appearing in smart contract bytecode, e.g., Oyente, Mythril, Maian, etc. Manticore is a bytecode-level symbolic execution engine which supports property-based symbolic execution and provides users with some control over the state exploration process. While Manticore offers a GUI plugin, it visualizes low-level bytecode instructions, which are difficult, for a developer, to match with their original Solidity source code statements.

Meanwhile, existing source-level tools for Solidity smart contracts offer limited support for Solidity features and/or do not allow customization of the function call sequence and the environment to be analyzed. For example, VeriSmart is a smart contract verifier that is also used in SmarTest to perform the symbolic execution of smart contracts. These tools do not precisely handle the execution of fallback functions and inter-contract function calls, which constitute essential functionality of smart contracts. Inter-contract function calls are also not analyzed precisely by SMTChecker—a built-in verifier within the Solidity compiler. Two other source-level tools, sold-verify, and VeriSol translate Solidity code into Boogie intermediate language, which can introduce discrepancies between the analyzed code translation and original Solidity semantics. In addition, they also lack the support for certain Solidity functionality and do not allow customization of the harness function and analyzed environment, which leads to multiple false positives.

Certora is a close-sourced formal verification tool for smart contracts that focuses on bounded formal verification for smart contracts. To use Certora, developers must learn their proposed specification language, CVL, to write the specification and the harness function. However, this new domain-specific language may have a steep learning curve for general smart contract developers.  In SolSEE, currently, all the specifications, or say properties, and harness functions can be coded in Solidity language, which is user-friendly to developers. We are planning to support SolSEE with different kinds of program specifications comparable to Certora.

SolSEE was previously published as research work in a top-tier software engineering conference (The ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering) in 2022 and now is currently in its beta phase developed by MetaTrust, a Singaporean start-up business focusing on Web3 security. SolSEE will be integrated into MetaTrust’s product lines as a proprietary tool to assist smart contract developers and auditors.

Company Introduction

Metatrust is a leading security solutions provider focused on helping developers enhance security protections throughout various stages of the software development lifecycle to safeguard their digital assets and privacy. Metatrust's solutions cover multiple areas, including security risk assessments, security auditing, and smart contract security protection.

Reference work

[1] Shang-Wei Lin, Palina Tolmach, Ye Liu, and Yi Li. "SolSEE: a source-level symbolic execution engine for solidity." In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 1687-1691. 2022.

Twitter: https://twitter.com/MetatrustLabs

Comments

All Comments

Recommended for you