Chain-of-Thought Prompting of Large Language Models for Discovering and Fixing Software Vulnerabilities


27 Feb 2024 | Yu Nong, Mohammed Aldeen, Long Cheng, Hongxin Hu, Feng Chen, Haipeng Cai
This paper explores the use of large language models (LLMs) and chain-of-thought (CoT) prompting to address three key software vulnerability analysis tasks: identifying specific types of vulnerabilities, discovering vulnerabilities of any type, and patching detected vulnerabilities. The authors propose a unified vulnerability-semantics-guided prompting (VSP) approach, which maps vulnerability semantics to chains of thought, and conduct extensive experiments to evaluate its effectiveness against five baselines. The results show that VSP significantly outperforms the baselines, yielding substantial accuracy gains on vulnerability identification, discovery, and patching. In-depth case studies reveal common failure causes, including insufficient code context, missing control-flow facts, and incomplete data-flow analysis, and the authors propose improvements to address these issues. The study advances the understanding of how LLMs can be effectively leveraged for software vulnerability analysis and offers insights for future improvements.
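The core idea of VSP is to structure the model's chain of thought around the semantics of each vulnerability type: where the vulnerable operation can occur, and which control-flow and data-flow facts determine whether it is exploitable. The sketch below is a hypothetical illustration of how such a prompt might be assembled; the step wording, CWE example, and helper names are assumptions for illustration, not the paper's exact templates.

```python
# Minimal sketch (not the paper's exact prompt templates) of assembling a
# vulnerability-semantics-guided (VSP) chain-of-thought prompt for an LLM.
# The reasoning steps, CWE description, and exemplar text are illustrative assumptions.

VSP_STEPS = (
    "1. Locate the statements where the potentially vulnerable operation occurs.\n"
    "2. Trace the data flow reaching those statements (sources of untrusted input).\n"
    "3. Check the control flow for guards or validation on the traced values.\n"
    "4. Conclude whether the vulnerability is present and explain why."
)

def build_vsp_prompt(cwe_id: str, cwe_summary: str, exemplar: str, target_code: str) -> str:
    """Compose a CoT prompt that walks the model through vulnerability semantics."""
    return (
        f"You are analyzing C code for {cwe_id} ({cwe_summary}).\n"
        f"Reason step by step, focusing on the statements and value flows "
        f"relevant to this vulnerability:\n{VSP_STEPS}\n\n"
        f"Example analysis:\n{exemplar}\n\n"
        f"Now analyze the following function and answer YES or NO, with your reasoning:\n"
        f"```c\n{target_code}\n```"
    )

if __name__ == "__main__":
    prompt = build_vsp_prompt(
        cwe_id="CWE-787",
        cwe_summary="out-of-bounds write",
        exemplar="(a worked, step-by-step analysis of a known example would go here)",
        target_code="void copy(char *src) { char buf[8]; strcpy(buf, src); }",
    )
    print(prompt)  # send `prompt` to the chosen LLM via its API
```

In this sketch, the few-shot exemplar supplies the step-by-step reasoning pattern, while the numbered steps anchor the model's chain of thought to the vulnerability's semantics rather than to generic "think step by step" instructions.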