2024 | Aleksei Shestov, Rodion Levichev, Ravil Mussabayev, Evgeny Maslov, Anton Cheshkov & Pavel Zadorozhny
This paper presents the results of fine-tuning large language models (LLMs) for vulnerability detection in Java source code. The study leverages WizardCoder, an improved version of the state-of-the-art LLM StarCoder, and adapts it for vulnerability detection through further fine-tuning. To accelerate training, the training procedure of WizardCoder is modified, and optimal training regimes are investigated. For the imbalanced dataset, which contains far more negative examples than positive ones, various techniques are explored to improve classification performance. The fine-tuned WizardCoder model improves ROC AUC and F1 scores on both balanced and imbalanced vulnerability datasets compared to CodeBERT-like models, demonstrating the effectiveness of adapting pretrained LLMs for vulnerability detection in source code. Key contributions include fine-tuning the state-of-the-art code LLM WizardCoder, increasing its training speed without harming performance, optimizing the training procedure and regimes, handling class imbalance, and improving performance on difficult vulnerability detection datasets. This demonstrates the potential of transfer learning through fine-tuning large pretrained language models for specialized source code analysis tasks.
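As a rough, illustrative sketch of how a code LLM can be adapted for binary vulnerability classification with a class-weighted loss, the snippet below uses PyTorch and Hugging Face Transformers. This is not the authors' training code: the checkpoint name, dataset counts, and weighting scheme are assumptions made for the example.

```python
import torch
import torch.nn as nn
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumption: a WizardCoder-family checkpoint on the Hugging Face Hub; any
# causal code LLM that supports a sequence-classification head would do.
MODEL_NAME = "WizardLM/WizardCoder-1B-V1.0"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Decoder-only code models often ship without a pad token; reuse EOS for padding.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = tokenizer.pad_token_id

# Hypothetical dataset counts: weight the rare positive (vulnerable) class
# by inverse frequency so its errors contribute more to the loss.
num_neg, num_pos = 95_000, 5_000
class_weights = torch.tensor([1.0, num_neg / num_pos])
loss_fn = nn.CrossEntropyLoss(weight=class_weights)

def training_step(batch_code, batch_labels):
    """One forward pass over a batch of Java functions, returning the weighted loss."""
    enc = tokenizer(batch_code, truncation=True, padding=True,
                    max_length=2048, return_tensors="pt")
    logits = model(**enc).logits  # shape: (batch_size, 2)
    return loss_fn(logits, batch_labels)
```

The paper reports exploring several loss functions and imbalance-handling techniques; inverse-frequency weighting of the cross-entropy loss shown here is only one common option, not necessarily the one that performed best.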
The study investigates the effectiveness of fine-tuning LLMs for vulnerability detection in Java source code. It addresses the challenges of class imbalance and explores techniques to improve classification performance. The results show that the fine-tuned WizardCoder model outperforms CodeBERT-like models in terms of ROC AUC and F1 scores on both balanced and imbalanced datasets. The study also investigates the impact of different training regimes, batch packing, and loss functions on model performance. The findings suggest that the WizardCoder model is effective for vulnerability detection in Java code, and that further research is needed to improve performance on imbalanced datasets. The study contributes to the field of vulnerability detection by demonstrating the potential of fine-tuning large pretrained language models for specialized source code analysis tasks.
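The summary mentions batch packing among the training-speed techniques investigated. As a purely illustrative sketch (the paper's exact packing strategy, separator handling, and label treatment may differ), the core idea of packing several short tokenized functions into one fixed-length sequence to cut padding waste can be written as:

```python
from typing import List

def pack_sequences(tokenized_functions: List[List[int]],
                   max_length: int,
                   sep_token_id: int) -> List[List[int]]:
    """Greedily pack tokenized functions into sequences of at most max_length tokens."""
    packed: List[List[int]] = []
    current: List[int] = []
    for tokens in tokenized_functions:
        tokens = tokens[:max_length]  # truncate overly long functions
        # +1 accounts for the separator inserted between packed functions
        if current and len(current) + 1 + len(tokens) > max_length:
            packed.append(current)
            current = []
        if current:
            current.append(sep_token_id)
        current.extend(tokens)
    if current:
        packed.append(current)
    return packed

# Example: pack toy "functions" of 2-5 tokens into windows of 10 tokens.
demo = [[1, 2, 3], [4, 5, 6, 7], [8, 9], [10, 11, 12, 13, 14]]
print(pack_sequences(demo, max_length=10, sep_token_id=0))
```

For a classification task, packing also requires tracking where each function starts and ends so that per-function labels can still be applied; the sketch above shows only the token-level concatenation.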