Understanding LLM-Powered Code Vulnerability Repair with Reinforcement Learning and Semantic Reward

This paper introduces SecRepair, an AI-assisted system that identifies, repairs, and describes code vulnerabilities using a large language model (LLM) and reinforcement learning with a semantic reward. It is designed to help developers find and fix vulnerable code while supplying a comprehensive description of the vulnerability and a code comment. SecRepair builds on CodeGen2, a large language model fine-tuned for code security analysis. To support this analysis, the authors curate an extensive instruction-based dataset, InstructVul, tailored to security vulnerabilities in C/C++.

The approach has two main stages: (i) code repair and vulnerability description generation, and (ii) code comment generation. For the first stage, the model is trained on a dataset of vulnerable code and its repaired versions, so that it learns a code representation suited to vulnerability localization and repair; training uses cross-entropy loss, and outputs are decoded with beam search. For the second stage, a reinforcement learning technique with a semantically aware reward function fine-tunes the model for code comment generation. The reward is computed with BERTScore, which measures semantic similarity between generated and reference text using cosine similarity between their token embeddings.

The system is evaluated with BLEU, ROUGE-L, and human evaluation scores, which measure how effectively it identifies, repairs, and describes vulnerabilities. The results show that SecRepair outperforms other models on vulnerability identification and repair tasks. It accurately identifies and repairs code vulnerabilities, describes each vulnerability comprehensively, and generates concise code comments.
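
As a rough illustration of the semantic reward described above, the sketch below scores a generated comment against a reference by greedy cosine matching of token vectors, in the style of BERTScore's precision, recall, and F1. It is not the paper's implementation: the `embed` function is a hypothetical stand-in that uses character-trigram counts, whereas BERTScore relies on contextual embeddings from a pretrained model, and the name `semantic_reward` is an assumption made for this example.

```python
import math
from collections import Counter


def embed(token):
    """Toy stand-in for a contextual embedding: character-trigram counts.

    Assumption for illustration only; BERTScore uses embeddings from a
    pretrained transformer instead.
    """
    padded = f"##{token.lower()}##"
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[gram] for gram, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def semantic_reward(candidate, reference):
    """BERTScore-style F1 over whitespace tokens, usable as a scalar reward."""
    cand = [embed(t) for t in candidate.split()]
    ref = [embed(t) for t in reference.split()]
    if not cand or not ref:
        return 0.0
    # Precision: each candidate token is matched to its most similar reference token.
    precision = sum(max(cosine(c, r) for r in ref) for c in cand) / len(cand)
    # Recall: each reference token is matched to its most similar candidate token.
    recall = sum(max(cosine(r, c) for c in cand) for r in ref) / len(ref)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "check buffer bounds before copying"
    print(semantic_reward("checks the buffer bound before copy", reference))
    print(semantic_reward("initialize the logger", reference))
```

In a reinforcement learning loop, a score like this would be computed for each sampled comment and fed back as the reward signal; a comment that is closer to the reference should generally receive a higher value than an unrelated one.
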
The paper also includes ablation studies that evaluate the effect of temperature and beam size on model performance. The findings demonstrate that reinforcement learning with semantic reward improves the model's code comment generation and the overall effectiveness of the system in addressing code vulnerabilities.
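
To make the two ablation knobs concrete, here is a minimal sketch of beam search over a toy next-token table, with temperature applied to the logits before normalization. Everything in it is an assumption for illustration: `TOY_LOGITS` is a hypothetical hand-written table rather than output from CodeGen2, and the paper's actual decoding setup may differ.

```python
import math

# Hypothetical toy "model": next-token logits conditioned only on the previous token.
TOY_LOGITS = {
    "<s>": {"check": 2.0, "free": 1.5, "copy": 0.5},
    "check": {"bounds": 2.5, "null": 1.0, "</s>": 0.2},
    "free": {"buffer": 1.8, "</s>": 1.2},
    "copy": {"buffer": 2.0, "</s>": 0.1},
    "bounds": {"</s>": 3.0},
    "null": {"</s>": 3.0},
    "buffer": {"</s>": 3.0},
}


def log_softmax(logits, temperature):
    """Log-probabilities after dividing logits by a positive temperature."""
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in scaled.values()))
    return {tok: v - log_norm for tok, v in scaled.items()}


def beam_search(beam_size, temperature=1.0, max_len=5):
    """Return up to `beam_size` (tokens, log-probability) hypotheses, best first."""
    beams = [(["<s>"], 0.0)]
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            if tokens[-1] == "</s>":
                candidates.append((tokens, score))  # finished hypothesis carries over
                continue
            step = log_softmax(TOY_LOGITS[tokens[-1]], temperature)
            for tok, logp in step.items():
                candidates.append((tokens + [tok], score + logp))
        # Keep only the highest-scoring hypotheses.
        beams = sorted(candidates, key=lambda item: item[1], reverse=True)[:beam_size]
        if all(tokens[-1] == "</s>" for tokens, _ in beams):
            break
    return beams


if __name__ == "__main__":
    for size in (1, 3):
        for tokens, score in beam_search(beam_size=size, temperature=0.7):
            print(size, " ".join(tokens), round(score, 3))
```

A beam size of 1 reduces to greedy decoding, while larger beams keep more alternative hypotheses alive at each step. Lowering the temperature sharpens each next-token distribution toward its highest-logit token, and raising it flattens the distribution.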

LLM-Powered Code Vulnerability Repair with Reinforcement Learning and Semantic Reward

22 Feb 2024 | Nafis Tanveer Islam, Joseph Khoury, Andrew Seong, Mohammad Bahrami Karkevandi, Gonzalo De La Torre Parra, Elias Bou-Harb, Peyman Najafirad