June 2024 | SHAOHUA LI, ETH Zurich, Switzerland; THEODOROS THEODORIDIS, ETH Zurich, Switzerland; ZHENDONG SU, ETH Zurich, Switzerland
The paper introduces a novel approach to testing optimizing compilers using real-world code snippets. The core idea is to construct well-formed programs by fusing multiple code snippets from various real-world projects, leveraging the rich syntactic and semantic features of real-world code. The approach extracts real-world code at the granularity of functions, injects calls to those functions into seed programs, and uses dynamic execution information to preserve the seed's semantics while creating complex data dependencies. This method complements existing generators by boosting their expressiveness.
The authors implemented this idea in a tool called Creal and used it to test the C compilers GCC and LLVM. Over a nine-month period, Creal reported 132 bugs to GCC and LLVM, of which 121 were confirmed as previously unknown and 101 have been fixed. Most of these bugs were miscompilations, which are the most harmful and hardest-to-detect kind of compiler bug.
The evaluation demonstrates the significant advantage of using real-world code to stress-test compilers, highlighting the effectiveness of Creal in discovering new and latent bugs. The paper also discusses the limitations of current generation- and mutation-based approaches and provides a detailed algorithmic sketch of the Creal approach, including expression matching, program profiling, and function call synthesis.
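The three steps named above (expression matching, program profiling, function call synthesis) can be illustrated with a toy model. The sketch below is not Creal's implementation: the class names, the `matches` and `synthesize_call` helpers, the `clamp_add` function, and the offsetting scheme are all hypothetical, chosen only to show how recorded run-time values might let an injected call depend on seed variables without changing the seed's result. It assumes integer expressions only and ignores overflow and side effects.

```python
from dataclasses import dataclass


@dataclass
class ProfiledFunction:
    """An extracted function plus one input/output pair recorded by running it."""
    name: str
    ret_type: str
    inputs: tuple   # argument values observed during profiling
    output: int     # return value observed for those arguments


@dataclass
class ProfiledExpr:
    """A seed-program expression plus values recorded by running the seed."""
    text: str        # the expression as it appears in the seed
    ctype: str       # its static type
    value: int       # value observed during profiling
    live_vars: dict  # visible integer variables -> observed values


def matches(fn: ProfiledFunction, expr: ProfiledExpr) -> bool:
    # Expression matching (simplified): only the types have to agree.
    return fn.ret_type == expr.ctype


def synthesize_call(fn: ProfiledFunction, expr: ProfiledExpr) -> str:
    """Build an expression that calls fn yet still evaluates to expr.value."""
    names = sorted(expr.live_vars)
    args = []
    for k, wanted in enumerate(fn.inputs):
        if names:
            var = names[k % len(names)]
            # var minus its profiled value is 0 at run time, so the
            # argument equals the profiled input `wanted`.
            args.append(f"({var} - ({expr.live_vars[var]}) + ({wanted}))")
        else:
            args.append(f"({wanted})")
    call = f"{fn.name}({', '.join(args)})"
    # Subtract the profiled output so the call contributes 0 overall.
    return f"(({call}) - ({fn.output}) + ({expr.text}))"


def clamp_add(x, y):
    """Python stand-in for a hypothetical extracted C function."""
    return max(-100, min(100, x + y))


fn = ProfiledFunction("clamp_add", "int", (60, 70), clamp_add(60, 70))
expr = ProfiledExpr("a * 2 + b", "int", 12, {"a": 7, "b": -2})

if matches(fn, expr):
    rewritten = synthesize_call(fn, expr)
    print(rewritten)
```

The rewritten expression has the same value as the original on the profiled execution, but the compiler cannot know that statically, so it has to analyze and optimize the injected function together with the seed code.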
The contributions of the paper include:
- Proposing the injection of real-world code into seed programs to create diverse, well-formed programs for compiler testing.
- Developing Creal to implement this approach by injecting real-world functions into seed programs.
- Conducting a nine-month extensive evaluation of Creal, demonstrating its effectiveness in discovering new and latent bugs in widely used production C compilers.
The paper concludes by emphasizing the potential of this approach to enhance compiler testing and its applicability to other compilers.