17 Aug 2024 | Wei Ma¹, Daoyuan Wu²*, Yuqiang Sun¹, Tianwen Wang³, Shangqing Liu¹, Jian Zhang¹, Yue Xue⁴, Yang Liu¹
This paper proposes iAudit, a framework that combines fine-tuning and LLM-based agents for intuitive smart contract auditing with justifications. Smart contracts, built on blockchains like Ethereum, are crucial for decentralized finance (DeFi), but their vulnerabilities can lead to significant financial losses. Traditional methods for detecting these vulnerabilities are often ineffective because many of the flaws are logic errors rather than simple pattern-matchable bugs. Recent research shows that large language models (LLMs) have potential in auditing smart contracts, but even advanced models like GPT-4 achieve only about 30% precision when both the decision and the justification are required to be correct. This shortfall stems from the lack of domain-specific fine-tuning in off-the-shelf LLMs.
iAudit employs a two-stage fine-tuning approach: first, a Detector model is fine-tuned to make vulnerability decisions, and then a Reasoner model is fine-tuned to generate the causes of vulnerabilities. To improve the quality of the Reasoner's output, two LLM-based agents, the Ranker and the Critic, are introduced to iteratively select and debate the most suitable vulnerability cause. A balanced dataset of 1,734 positive and 1,810 negative samples is used for training and evaluation. iAudit achieves an F1 score of 91.21% and an accuracy of 91.11% on a dataset of 263 real smart contract vulnerabilities, and its generated explanations are consistent with the ground-truth causes 38% of the time.
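The two-stage pipeline described above can be sketched as a simple orchestration loop. The sketch below is illustrative only: the `detector`, `reasoner`, `ranker`, and `critic` functions are hypothetical stubs standing in for the fine-tuned models and prompted agents, and their signatures are assumptions, not the paper's actual API.

```python
# Illustrative sketch of the iAudit pipeline (all model calls stubbed).
# Detector: binary decision; Reasoner: candidate causes;
# Ranker selects a cause; Critic accepts or rejects it.

def detector(code):
    """Stub Detector: returns a vulnerability decision for the code."""
    return "vulnerable" if "delegatecall" in code else "safe"

def reasoner(code, decision, k=3):
    """Stub Reasoner: proposes k candidate causes for the decision."""
    return [f"candidate cause {i} for {decision}" for i in range(k)]

def ranker(causes):
    """Stub Ranker: picks the most plausible candidate cause."""
    return causes[0]

def critic(code, cause):
    """Stub Critic: debates the selected cause; True means accepted."""
    return True

def iaudit(code, max_rounds=3):
    """Run Detector -> Reasoner -> iterative Ranker/Critic debate."""
    decision = detector(code)
    if decision == "safe":
        return decision, None          # no cause needed for safe code
    causes = reasoner(code, decision)
    for _ in range(max_rounds):
        best = ranker(causes)
        if critic(code, best):
            return decision, best      # Critic accepted the cause
        causes.remove(best)            # rejected; try the next candidate
    return decision, causes[0] if causes else None
```

In the real system each stub would be a call to a fine-tuned LLM (Detector, Reasoner) or a prompted agent (Ranker, Critic); the loop structure, not the stubs, is the point of the sketch.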
The paper also includes three ablation studies that evaluate the effectiveness of iAudit's two-stage fine-tuning and majority-voting strategies, as well as the impact of adding call-graph information to the model's input. The results show that iAudit outperforms the compared models both in detection performance and in alignment with ground-truth explanations. The study highlights the role of intuition in vulnerability auditing and demonstrates the effectiveness of combining fine-tuning with LLM-based agents for intuitive smart contract auditing with justifications.
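The majority-voting strategy mentioned above can be illustrated with a minimal sketch: the Detector is queried several times (for example, with varied prompt phrasings), and the final decision is the most frequent label. The function below is a generic illustration, not the paper's implementation.

```python
from collections import Counter

def majority_vote(decisions):
    """Return the label that the most Detector runs agree on."""
    if not decisions:
        raise ValueError("need at least one decision")
    label, _ = Counter(decisions).most_common(1)[0]
    return label

# Example: three independent Detector runs on the same function.
final = majority_vote(["vulnerable", "safe", "vulnerable"])  # "vulnerable"
```

Aggregating several independent runs smooths over the variance of a single LLM query, which is why voting can lift detection accuracy beyond any single prompt.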