Enhancing LLM-based Test Generation for Hard-to-Cover Branches via Program Analysis


2024-04-07 | Chen Yang, Junjie Chen, Bin Lin, Jianyi Zhou, Ziqi Wang
The paper introduces TELPA (Test generation via LLMs and Program Analysis), a novel technique for covering hard-to-cover branches in software testing. TELPA tackles two key obstacles, constructing complex objects and resolving intricate inter-procedural dependencies, by combining program analysis with feedback-based test generation. Its key contributions are:

1. **Program analysis**: TELPA performs backward method-invocation analysis to extract real usage scenarios of the method under test, and forward method-invocation analysis to capture its inter-procedural dependencies (see the caller-harvesting sketch below).
2. **Feedback-based test generation**: TELPA feeds counter-examples, i.e., previously generated tests that failed to reach the target branch, back into the prompt, steering the LLM toward more diverse tests and improving both effectiveness and efficiency (see the loop sketch below).
3. **Evaluation**: Extensive experiments on 27 open-source Python projects show that TELPA significantly outperforms state-of-the-art SBST and LLM-based techniques, with average branch coverage improvements of 31.39% and 22.22%, respectively.

The paper also discusses the limitations of existing techniques and the contribution of each of TELPA's components, highlighting the importance of task-specific prompting and the balance between effectiveness and efficiency. Future work includes addressing hallucination issues and exploring TELPA's generalizability to other programming languages.
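To make the backward method-invocation analysis concrete, here is a minimal sketch of how real usage scenarios could be harvested from a project's own source using Python's `ast` module. This is an illustration under stated assumptions, not TELPA's actual implementation; the function `find_callers` and its interface are hypothetical.

```python
import ast
from pathlib import Path

def find_callers(project_dir: str, focal_method: str) -> list[str]:
    """Collect the source of every function that calls `focal_method`.

    A rough stand-in for backward method-invocation analysis: the
    harvested caller snippets show how the method under test is really
    used (e.g., how its complex argument objects are constructed) and
    can be included in the LLM prompt as usage examples.
    """
    callers: list[str] = []
    for path in Path(project_dir).rglob("*.py"):
        source = path.read_text(encoding="utf-8")
        try:
            tree = ast.parse(source)
        except SyntaxError:
            continue  # skip files that do not parse
        for node in ast.walk(tree):
            if not isinstance(node, ast.FunctionDef):
                continue
            for inner in ast.walk(node):
                if not isinstance(inner, ast.Call):
                    continue
                func = inner.func
                name = (func.id if isinstance(func, ast.Name)
                        else func.attr if isinstance(func, ast.Attribute)
                        else None)
                if name == focal_method:
                    snippet = ast.get_source_segment(source, node)
                    if snippet:
                        callers.append(snippet)
                    break  # one snippet per calling function is enough
    return callers

# Example: gather usage scenarios of a (hypothetical) `parse_config` method.
# usage_snippets = find_callers("my_project/", "parse_config")
```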
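The counter-example feedback loop can likewise be sketched in a few lines. This is a hedged approximation of the idea rather than TELPA's interface: `llm_generate` and `covers_target_branch` are placeholder hooks you would wire to an LLM API and a branch-coverage tool (e.g., coverage.py), respectively.

```python
def generate_until_covered(focal_src: str, usage_snippets: list[str],
                           target_branch: str, max_rounds: int = 5) -> str | None:
    """Counter-example-guided generation: each test that fails to reach
    the target branch is added to the prompt so the next attempt differs."""
    counter_examples: list[str] = []
    for _ in range(max_rounds):
        prompt = build_prompt(focal_src, usage_snippets,
                              target_branch, counter_examples)
        test_code = llm_generate(prompt)
        if covers_target_branch(test_code, target_branch):
            return test_code
        counter_examples.append(test_code)  # steer away from this attempt
    return None  # branch remains uncovered after the budget is spent

def build_prompt(focal_src: str, usages: list[str], branch: str,
                 counter_examples: list[str]) -> str:
    parts = [
        f"Write a pytest test that covers this branch: {branch}",
        "Method under test:\n" + focal_src,
        "Real usage examples from the project:\n" + "\n\n".join(usages),
    ]
    if counter_examples:
        parts.append("These tests did NOT cover the branch; write one that "
                     "takes a meaningfully different path:\n"
                     + "\n\n".join(counter_examples))
    return "\n\n".join(parts)

def llm_generate(prompt: str) -> str:
    """Placeholder: call your LLM of choice here."""
    raise NotImplementedError

def covers_target_branch(test_code: str, target_branch: str) -> bool:
    """Placeholder: run `test_code` under a branch-coverage tool and check."""
    raise NotImplementedError
```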