Toward Improved Deep Learning-based Vulnerability Detection

April 14–20, 2024, Lisbon, Portugal | Adriana Sejfia, Satyaki Das, Saad Shafiq, Nenad Medvidović
This paper explores the limitations of deep learning (DL) techniques in detecting vulnerabilities that span multiple base units (MBU vulnerabilities). The authors hypothesize that existing DL-based detectors, which focus on individual base units (IBUs), may struggle with MBU vulnerabilities, leading to reduced accuracy. They evaluate three prominent DL-based detectors (ReVeal, DeepWukong, and LineVul) through a systematic analysis of their datasets and training processes. Key findings include:

1. **Presence of MBU Vulnerabilities**: MBU vulnerabilities make up a substantial share of the datasets, comprising 22% to 61% of all vulnerabilities across the three detectors' datasets.
2. **Usage in Training and Evaluation**: The detectors do not properly account for MBU vulnerabilities in their training, validation, and testing sets, so their reported accuracy is misleading.
3. **Accuracy on MBU Vulnerabilities**: When evaluated on complete vulnerabilities, the detectors show significant drops in accuracy, particularly in precision and Matthews correlation coefficient (MCC).
4. **Impact of Realistic Training**: Retraining the detectors with a focus on complete vulnerabilities improves performance, especially for ReVeal, while LineVul's precision decreases.

The authors propose an automated framework to help DL-based detectors better handle MBU vulnerabilities. It includes a Patch Collector component and a Patch Cleaner component to identify and clean compound patches. The framework aims to make these detectors more effective in real-world scenarios.
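The difference between scoring individual base units and scoring complete vulnerabilities can be illustrated with a minimal sketch. It assumes, purely for illustration, that a vulnerability counts as detected only when every one of its base units is flagged; the summary does not state the aggregation rule the authors use. The function names and the toy data are hypothetical and are not taken from the paper or from any of the three detectors.

```python
import math
from collections import defaultdict


def confusion(pairs):
    """Count (tp, tn, fp, fn) from an iterable of (truth, prediction) pairs."""
    tp = tn = fp = fn = 0
    for truth, pred in pairs:
        if truth and pred:
            tp += 1
        elif truth:
            fn += 1
        elif pred:
            fp += 1
        else:
            tn += 1
    return tp, tn, fp, fn


def precision(tp, tn, fp, fn):
    """Fraction of flagged items that are truly vulnerable."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def per_vulnerability(units):
    """Collapse per-unit records into one (truth, prediction) pair per group.

    Assumption (not from the paper): all units in a group share a truth
    label, and the group is flagged only if every unit in it is flagged.
    """
    groups = defaultdict(list)
    for group_id, truth, pred in units:
        groups[group_id].append((truth, pred))
    return [
        (any(t for t, _ in members), all(p for _, p in members))
        for members in groups.values()
    ]


# Hypothetical toy data: (group id, true label, predicted label) per base unit.
units = [
    ("v1", 1, 1), ("v1", 1, 0),  # MBU vulnerability, only partly flagged
    ("v2", 1, 1),                # IBU vulnerability, flagged
    ("v3", 1, 1), ("v3", 1, 1),  # MBU vulnerability, fully flagged
    ("n1", 0, 0), ("n2", 0, 1), ("n3", 0, 0),  # non-vulnerable units
]

unit_counts = confusion((t, p) for _, t, p in units)
vuln_counts = confusion(per_vulnerability(units))

print("per base unit:     precision=%.3f mcc=%.3f"
      % (precision(*unit_counts), mcc(*unit_counts)))
print("per vulnerability: precision=%.3f mcc=%.3f"
      % (precision(*vuln_counts), mcc(*vuln_counts)))
```

On this toy input, the partly flagged MBU vulnerability contributes one true positive at the base-unit level but becomes a false negative once predictions are grouped, so both precision and MCC come out lower at the vulnerability level. The numbers are not meant to reflect the paper's results.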