LLM4Fuzz: Guided Fuzzing of Smart Contracts with Large Language Models

20 Jan 2024 | Chaofan Shou, Jing Liu, Doudou Lu, Koushik Sen
LLM4Fuzz is a novel approach that guides and prioritizes fuzzing of smart contracts using large language models (LLMs). Traditional fuzzing explores the vast state space of smart contracts inefficiently; LLM4Fuzz instead uses LLMs to direct fuzzers toward high-value code regions and toward input sequences that are more likely to trigger vulnerabilities. It also leverages LLMs to guide fuzzers based on user-defined invariants, reducing the overhead of blind exploration. In evaluations on real-world DeFi projects, LLM4Fuzz significantly improved efficiency, coverage, and vulnerability detection over baseline fuzzing, uncovering five critical vulnerabilities that could lead to a loss of more than $247k.

LLM4Fuzz uses LLMs to generate metrics for code regions, which are then used to prioritize exploration during fuzzing campaigns. The LLMs analyze invariants in the code and identify invariant-related code regions, and the fuzzer uses these metrics to allocate more effort to the most fruitful regions. LLM4Fuzz also identifies potentially interesting sequences of function calls, which can be prioritized so that the fuzzer efficiently finds and reaches interesting, vulnerability-leading states.

To harness the potential of LLMs, LLM4Fuzz extracts a hierarchical representation of the smart contract, including source code, control-flow graphs, data dependencies, and metrics produced by static analysis. These elements allow the LLMs to perform semantic analysis, compare against historical vulnerability patterns, and pinpoint potentially interesting transaction sequences. From this information, LLM4Fuzz generates metrics for basic blocks and functions, which are encoded into the fuzzer's scheduler so that test cases are explored according to this refined prioritization. LLM4Fuzz was evaluated on real-world, complex smart contract projects with and without known vulnerabilities.
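To make the scheduling idea concrete, here is a minimal sketch of how LLM-produced metrics could bias a fuzzer's choice of target function. The function names and score values are illustrative assumptions, not the paper's actual prompts or outputs; the weighting scheme stands in for the encoding of LLM metrics into the scheduler.

```python
import random

# Hypothetical LLM-assigned scores per contract function (illustrative
# values, not from LLM4Fuzz itself): higher means the LLM judged the
# region more likely to harbor vulnerabilities or relate to invariants.
llm_scores = {
    "deposit": 0.9,
    "withdraw": 0.8,
    "transfer": 0.4,
    "name": 0.1,
}

def pick_function(scores):
    """Sample the next function to fuzz, biased toward LLM-favored regions."""
    functions = list(scores)
    weights = [scores[f] for f in functions]
    return random.choices(functions, weights=weights, k=1)[0]

# Over many iterations, high-scoring functions such as "deposit" receive
# far more fuzzing effort than low-scoring ones such as "name".
counts = {f: 0 for f in llm_scores}
for _ in range(10_000):
    counts[pick_function(llm_scores)] += 1
```

A real fuzzer would refresh these scores as coverage feedback arrives; this sketch only shows the prioritization step in isolation.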
LLM4Fuzz achieves significantly higher test coverage on these contracts and finds more vulnerabilities in less time than the previous state of the art. While scanning 600 live smart contracts deployed on-chain, LLM4Fuzz identified critical vulnerabilities in five smart contract projects that could lead to significant financial loss.

LLM4Fuzz builds on feedback-driven fuzzing, in which test cases are generated and mutated based on real-time feedback from program execution. The algorithm uses power scheduling to allocate more time to exploring favored mutants of test cases: each test case is assigned an energy that quantifies how favored it is, and higher-energy test cases receive a larger mutation budget. The algorithm applies to fuzzing both traditional software and smart contracts. For smart contracts, a test case is a sequence of inputs (i.e., transaction calls), and a mutation can insert a new input into the sequence, remove an input from the sequence, or mutate an individual input.
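The energy-based scheduling and sequence mutations described above can be sketched as follows. The energy formula and the call representation are assumptions for illustration; the paper does not fix these exact definitions here.

```python
import random

def energy(testcase, llm_scores):
    """Assign energy (mutation budget) to a test case. As an illustrative
    assumption, sequences touching LLM-favored functions get more energy."""
    base = 1.0
    return base + sum(llm_scores.get(fn, 0.0) for fn, _ in testcase)

def mutate(testcase, all_functions):
    """One mutation step on a transaction sequence: insert a new call,
    remove an existing call, or mutate an individual call's arguments."""
    tc = list(testcase)
    op = random.choice(["insert", "remove", "mutate"])
    if op == "insert" or not tc:
        fn = random.choice(all_functions)
        tc.insert(random.randrange(len(tc) + 1), (fn, (random.randrange(100),)))
    elif op == "remove":
        tc.pop(random.randrange(len(tc)))
    else:
        i = random.randrange(len(tc))
        fn, _ = tc[i]
        tc[i] = (fn, (random.randrange(100),))
    return tc

# A test case is a sequence of (function_name, args) transaction calls.
scores = {"withdraw": 0.8, "transfer": 0.4}
seed = [("transfer", (5,)), ("withdraw", (10,))]

# Power scheduling: the mutation budget scales with the seed's energy.
budget = round(energy(seed, scores) * 10)
mutants = [mutate(seed, list(scores)) for _ in range(budget)]
```

In a full fuzzer, each mutant would be executed and kept only if the execution feedback (e.g., new coverage) makes it interesting; that loop is omitted here.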