April 14-20, 2024 | Xin Zhou, Ting Zhang, and David Lo
This paper explores the effectiveness of large language models (LLMs), specifically GPT-3.5 and GPT-4, in detecting software vulnerabilities. Previous methods relied on medium-sized pre-trained models or smaller neural networks, and although recent LLMs have shown strong few-shot learning capabilities, their performance on vulnerability detection remains largely unexplored. The study investigates how GPT-3.5 and GPT-4 perform under a variety of prompts.

Experimental results show that GPT-3.5 achieves performance competitive with CodeBERT, while GPT-4 outperforms CodeBERT by 34.8% in terms of accuracy. The study also explores prompt designs to improve LLM performance on vulnerability detection, including role descriptions, project information, knowledge from external sources, and samples drawn from the training set; combining these prompt components significantly improves performance. The results further highlight the strength of GPT-3.5 in precision and of CodeBERT in recall.

The paper discusses future directions, including the development of local and specialized LLMs for vulnerability detection, improving precision and robustness, addressing the long-tailed distribution of vulnerability data, and fostering trust and collaboration between developers and AI-powered solutions. The authors conclude that LLMs show promise for vulnerability detection, but further research is needed to address their limitations and improve their effectiveness.
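The summary does not reproduce the paper's exact prompt templates, so the following is a minimal sketch, assuming a chat-style message format, of how a prompt combining the four components studied (role description, project information, external knowledge, and training-set examples) might be assembled. The function and variable names (`build_prompt`, `project_info`, `external_knowledge`, `few_shot_examples`) and the wording of the prompts are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch: assembling a vulnerability-detection prompt from the
# four prompt components described in the summary. Wording and structure are
# assumptions for illustration, not the paper's exact templates.

def build_prompt(code_snippet: str,
                 project_info: str,
                 external_knowledge: str,
                 few_shot_examples: list[tuple[str, str]]) -> list[dict]:
    """Return a chat-style message list for a binary vulnerability check."""
    # Role description: frame the model as a security reviewer.
    messages = [{
        "role": "system",
        "content": "You are an experienced security engineer who reviews "
                   "C/C++ functions and answers only 'vulnerable' or 'not vulnerable'."
    }]

    # Few-shot examples drawn from the training set, as (code, label) pairs.
    for example_code, label in few_shot_examples:
        messages.append({"role": "user", "content": f"Code:\n{example_code}"})
        messages.append({"role": "assistant", "content": label})

    # Target query: project context, external knowledge (e.g. a CWE note),
    # and the function under analysis.
    messages.append({
        "role": "user",
        "content": (
            f"Project information: {project_info}\n"
            f"Relevant background: {external_knowledge}\n"
            f"Code:\n{code_snippet}\n"
            "Is this function vulnerable?"
        )
    })
    return messages


if __name__ == "__main__":
    msgs = build_prompt(
        code_snippet="void copy(char *dst, char *src) { strcpy(dst, src); }",
        project_info="Utility function from a network-facing C service.",
        external_knowledge="CWE-120: buffer copy without checking size of input.",
        few_shot_examples=[("int add(int a, int b) { return a + b; }", "not vulnerable")],
    )
    for m in msgs:
        print(m["role"], ":", m["content"][:60])
```

The resulting message list could then be sent to any chat-completion endpoint (e.g. GPT-3.5 or GPT-4) and the reply parsed into a binary vulnerable/not-vulnerable label.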