22 Feb 2024 | Nafis Tanveer Islam, Joseph Khoury, Andrew Seong, Mohammad Bahrami Karkevandi, Gonzalo De La Torre Parra, Elias Bou-Harb, Peyman Najafirad
The paper introduces SecRepair, an AI-assisted system for identifying, repairing, and describing code vulnerabilities, built on a large language model (LLM) called CodeGen2. The system addresses the growing concern over security issues in software development, particularly those arising from the use of AI-driven tools like GitHub Copilot. SecRepair uses a reinforcement learning paradigm with a semantic reward mechanism to generate code comments and improve the effectiveness of code vulnerability repair. The authors also present InstructVul, a comprehensive instruction-based dataset tailored for C/C++ programming, which is used to train the model. The system's performance is evaluated with several metrics, including BLEU, ROUGE-L, and human evaluation scores, and is reported to exceed that of existing models. The paper also includes ablation studies and case studies to validate the system's effectiveness in identifying, repairing, and describing vulnerabilities in six open-source IoT operating systems on GitHub.