SpecGen: Automated Generation of Formal Program Specifications via Large Language Models

SpecGen: Automated Generation of Formal Program Specifications via Large Language Models

24 Mar 2024 | Lezhi Ma*, Shangqing Liu†, Yi Li‡, Xiaofei Xie§, Lei Bu¶
SpecGen is an automated method for generating formal program specifications using Large Language Models (LLMs). It addresses the challenge of manually crafting accurate and comprehensive specifications for complex programs, which is time-consuming and error-prone. Existing methods rely on predefined templates or grammars, which often fail to capture the semantics of real-world programs. SpecGen introduces two phases: a conversation-driven approach to generate specifications and a mutation-based approach to refine them when initial results are insufficient. The conversation phase involves iterative dialogue with the LLM to improve specification accuracy, while the mutation phase applies four types of mutation operators to generate variants and selects the most verifiable ones using a heuristic strategy. SpecGen was evaluated on two datasets: the SV-COMP Java benchmark and a manually constructed dataset of 120 programs. It successfully generated verifiable specifications for 279 out of 385 programs, outperforming existing tools like Houdini and Daikon. The results show that SpecGen can comprehensively describe program behaviors, with high-quality specifications that pass verification. The approach leverages LLMs' code comprehension abilities and introduces mutation-based refinement to enhance accuracy. A user study confirmed the semantic quality of the generated specifications, with SpecGen achieving high ratings. The main contributions include a novel LLM-based specification generation method, a mutation-based approach to improve verification efficiency, a new dataset (SpecGenBench), and comprehensive evaluations across different aspects. SpecGen demonstrates significant improvements in generating accurate and verifiable specifications for complex programs.SpecGen is an automated method for generating formal program specifications using Large Language Models (LLMs). It addresses the challenge of manually crafting accurate and comprehensive specifications for complex programs, which is time-consuming and error-prone. Existing methods rely on predefined templates or grammars, which often fail to capture the semantics of real-world programs. SpecGen introduces two phases: a conversation-driven approach to generate specifications and a mutation-based approach to refine them when initial results are insufficient. The conversation phase involves iterative dialogue with the LLM to improve specification accuracy, while the mutation phase applies four types of mutation operators to generate variants and selects the most verifiable ones using a heuristic strategy. SpecGen was evaluated on two datasets: the SV-COMP Java benchmark and a manually constructed dataset of 120 programs. It successfully generated verifiable specifications for 279 out of 385 programs, outperforming existing tools like Houdini and Daikon. The results show that SpecGen can comprehensively describe program behaviors, with high-quality specifications that pass verification. The approach leverages LLMs' code comprehension abilities and introduces mutation-based refinement to enhance accuracy. A user study confirmed the semantic quality of the generated specifications, with SpecGen achieving high ratings. The main contributions include a novel LLM-based specification generation method, a mutation-based approach to improve verification efficiency, a new dataset (SpecGenBench), and comprehensive evaluations across different aspects. SpecGen demonstrates significant improvements in generating accurate and verifiable specifications for complex programs.
Reach us at info@study.space
[slides and audio] SpecGen%3A Automated Generation of Formal Program Specifications via Large Language Models