An extensive study of the effects of different deep learning models on code vulnerability detection in Python code

31 January 2024 | Rongcun Wang¹ · Senlei Xu¹ · Xingyu Ji¹ · Yuan Tian² · Lina Gong³ · Ke Wang¹
This study investigates how the choice of deep learning model affects code vulnerability detection in Python. The authors evaluate eighteen architectures, formed by pairing each of three representation learning models (Word2Vec, fastText, and CodeBERT) with each of six classification models (random forest, XGBoost, Multi-Layer Perceptron, Convolutional Neural Network, Long Short-Term Memory, and Gated Recurrent Unit). They also compare two machine learning strategies: attention and bi-directional mechanisms. The results show that Word2Vec outperforms CodeBERT and fastText in precision, recall, and F-score, and that LSTM and GRU are superior to the other classification models. The bi-directional LSTM and GRU with attention, using Word2Vec, are identified as the optimal models for code vulnerability detection in Python. Both the representation learning model and the classification model significantly influence vulnerability detection, and the bi-directional and attention mechanisms can enhance performance.
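To make the experimental grid concrete, the following is a minimal sketch, not the authors' code. It enumerates the eighteen representation/classifier pairings named above and computes the three reported metrics for binary labels. The function names, the short classifier labels, and the sample labels are hypothetical, and the F-score is assumed here to be the balanced F1, which the summary does not state.

```python
from itertools import product

# Model names taken from the summary; the short labels are our own.
REPRESENTATIONS = ["Word2Vec", "fastText", "CodeBERT"]
CLASSIFIERS = ["random forest", "XGBoost", "MLP", "CNN", "LSTM", "GRU"]


def model_grid():
    """Return every (representation, classifier) pairing (hypothetical helper)."""
    return list(product(REPRESENTATIONS, CLASSIFIERS))


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels (1 = vulnerable).

    Assumption: the F-score is the balanced F1 (harmonic mean of
    precision and recall).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    grid = model_grid()
    print(len(grid), "combinations, e.g.", grid[0])

    # Made-up labels for illustration only; not data from the study.
    y_true = [1, 0, 1, 1, 0, 0]
    y_pred = [1, 0, 0, 1, 1, 0]
    print(precision_recall_f1(y_true, y_pred))
```

In a real comparison, each pairing in the grid would be trained and scored on the same held-out split, so that differences in these metrics reflect the models rather than the data.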