[slides and audio] Points-to analysis in almost linear time

This paper presents an interprocedural, flow-insensitive points-to analysis algorithm with almost linear time complexity. The algorithm is based on a non-standard type system that allows for efficient inference of types representing storage shapes. The type system is designed to capture the storage usage of a program at runtime, enabling the algorithm to infer a storage shape graph that is valid at all program points. The algorithm's efficiency is achieved by using a type inference approach that merges types as necessary to ensure well-typedness of the program. The algorithm's three main contributions are: (1) a type system for describing a universally valid storage shape graph in linear space, (2) a constraint system that often leads to better results than the "obvious" constraint system for the given type system, and (3) an almost linear time algorithm for points-to analysis by solving a constraint system. The algorithm is inspired by Henglein's binding time analysis by type inference and uses a non-standard type system to describe the store usage at runtime. The types are used to construct a storage shape graph, which models the possible runtime contents of locations. The algorithm processes each statement exactly once, merging types as necessary to ensure well-typedness. The algorithm has linear space complexity and almost linear time complexity, making it suitable for large programs. The algorithm has been implemented and tested on a variety of C programs, including those with up to 75,000 lines of code. The results show that the algorithm is very efficient in practice and produces results that are comparable to those of flow-sensitive analyses. The algorithm is also able to handle large programs with many hundreds of thousands of lines of code. The algorithm's type system allows for the description of both location and function pointers, and it uses a pair of types to represent the type of a value. The algorithm's type inference process is based on a set of typing rules that ensure the correctness of the inferred types. The algorithm's efficiency is achieved by using fast union-find data structures to manage the merging of types. The algorithm has been shown to be effective in practice, and its results are comparable to those of other interprocedural points-to analyses. The algorithm's ability to handle large programs with many hundreds of thousands of lines of code makes it a valuable tool for program analysis and optimization. The algorithm's efficiency and accuracy make it a promising approach for future research in program analysis.This paper presents an interprocedural, flow-insensitive points-to analysis algorithm with almost linear time complexity. The algorithm is based on a non-standard type system that allows for efficient inference of types representing storage shapes. The type system is designed to capture the storage usage of a program at runtime, enabling the algorithm to infer a storage shape graph that is valid at all program points. The algorithm's efficiency is achieved by using a type inference approach that merges types as necessary to ensure well-typedness of the program. The algorithm's three main contributions are: (1) a type system for describing a universally valid storage shape graph in linear space, (2) a constraint system that often leads to better results than the "obvious" constraint system for the given type system, and (3) an almost linear time algorithm for points-to analysis by solving a constraint system. The algorithm is inspired by Henglein's binding time analysis by type inference and uses a non-standard type system to describe the store usage at runtime. The types are used to construct a storage shape graph, which models the possible runtime contents of locations. The algorithm processes each statement exactly once, merging types as necessary to ensure well-typedness. The algorithm has linear space complexity and almost linear time complexity, making it suitable for large programs. The algorithm has been implemented and tested on a variety of C programs, including those with up to 75,000 lines of code. The results show that the algorithm is very efficient in practice and produces results that are comparable to those of flow-sensitive analyses. The algorithm is also able to handle large programs with many hundreds of thousands of lines of code. The algorithm's type system allows for the description of both location and function pointers, and it uses a pair of types to represent the type of a value. The algorithm's type inference process is based on a set of typing rules that ensure the correctness of the inferred types. The algorithm's efficiency is achieved by using fast union-find data structures to manage the merging of types. The algorithm has been shown to be effective in practice, and its results are comparable to those of other interprocedural points-to analyses. The algorithm's ability to handle large programs with many hundreds of thousands of lines of code makes it a valuable tool for program analysis and optimization. The algorithm's efficiency and accuracy make it a promising approach for future research in program analysis.

Points-to Analysis in Almost Linear Time

1996 | Bjarne Steensgaard