Generalization-Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tuning

6 Jun 2024 | Xiaohu Du, Ming Wen, Jiahao Zhu, Zifan Xie, Bin Ji, Huijun Liu, Xuanhua Shi, Hai Jin
VulLLM is a novel framework for code vulnerability detection that integrates multi-task learning with large language models (LLMs) to deepen vulnerability understanding. Alongside the primary detection task, it introduces two auxiliary tasks: vulnerability localization, which identifies vulnerable code elements from patches, and vulnerability interpretation, which uses GPT-4 to generate textual explanations of vulnerabilities. Combining these tasks pushes the model to learn the root causes of vulnerabilities rather than overfit to superficial, dataset-specific features. To improve the reliability of the GPT-4-generated data, the framework applies Chain-of-Thought with Self-Verification (CoT-SV), which reduces error accumulation and hallucinations.

Evaluated on six widely used C/C++ vulnerability detection datasets, VulLLM outperforms seven state-of-the-art models in effectiveness, generalization, and robustness. It improves F1 score by 8% over the best baseline, UniXcoder, generalizes better to out-of-distribution (OOD) data, and shows enhanced robustness against adversarial attacks, with an average improvement of 68.08% over UniXcoder. An ablation study confirms that both multi-task learning and data augmentation contribute to these gains. The study also notes the framework's limitations, including resource constraints and the potential for bias in the generated vulnerability explanations.
Overall, VulLLM provides a more effective and robust approach to code vulnerability detection by leveraging multi-task learning and self-verification techniques.
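To make the multi-task setup concrete, here is a minimal sketch of how one labeled vulnerability record could be expanded into instruction-tuning samples for the three tasks (detection, localization, interpretation). The field names, prompt wording, and the `VulnRecord`/`build_samples` helpers are illustrative assumptions, not the paper's exact data format.

```python
# Sketch: expanding one labeled record into multi-task instruction samples,
# in the spirit of VulLLM's detection + localization + interpretation tasks.
from dataclasses import dataclass, field


@dataclass
class VulnRecord:
    code: str                              # the function under analysis
    label: int                             # 1 = vulnerable, 0 = benign
    vuln_lines: list = field(default_factory=list)  # elements mined from the patch
    explanation: str = ""                  # GPT-4-generated, CoT-SV-verified text


def build_samples(rec: VulnRecord) -> list:
    """Turn one record into instruction/response pairs for the three tasks."""
    samples = [{
        "instruction": "Detect whether the following C/C++ function is vulnerable.",
        "input": rec.code,
        "output": "vulnerable" if rec.label else "benign",
    }]
    if rec.label:
        # Auxiliary tasks only apply to vulnerable samples: benign code has
        # no vulnerable lines to locate and no root cause to explain.
        samples.append({
            "instruction": "Locate the vulnerable code elements in this function.",
            "input": rec.code,
            "output": "\n".join(rec.vuln_lines),
        })
        samples.append({
            "instruction": "Explain the root cause of the vulnerability in this function.",
            "input": rec.code,
            "output": rec.explanation,
        })
    return samples


rec = VulnRecord(
    code="void copy(char *dst, char *src) { strcpy(dst, src); }",
    label=1,
    vuln_lines=["strcpy(dst, src);"],
    explanation="Unbounded strcpy allows an out-of-bounds write (buffer overflow).",
)
print(len(build_samples(rec)))  # 3 samples: detection plus two auxiliary tasks
```

The key design point the sketch illustrates is that the auxiliary localization and interpretation targets supervise *why* and *where* the code is vulnerable, not just the binary label, which is what discourages the model from latching onto superficial features.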