[slides] Query evaluation techniques for large databases

This paper surveys efficient algorithms and software architectures for database query execution engines, focusing on complex queries over large databases. It discusses a wide range of query evaluation techniques for both relational and postrelational database systems, including iterative execution of complex query evaluation plans, the duality of sort- and hash-based set-matching algorithms, types of parallel query execution and their implementation, and special operators for emerging database application domains. The paper provides a foundation for the design and implementation of query execution facilities in new database management systems. The paper discusses the architecture of query execution engines, sorting and hashing, disk access, aggregation and duplicate removal, binary matching operations, universal quantification, the duality of sort- and hash-based query processing algorithms, execution of complex query plans, mechanisms for parallel query execution, and nonstandard query processing algorithms. It also covers additional techniques for performance improvement, such as precomputation and derived data, data compression, surrogate processing, bit vector filtering, and specialized hardware. The paper emphasizes the importance of efficient algorithms for accessing and manipulating large sets and sequences in database management systems. It discusses the challenges of managing large data volumes and the need for efficient query processing algorithms. The paper also addresses the differences between logical and physical algebras, the implementation of iterators, and the synchronization and data transfer between operators. The paper provides an overview of sorting and hashing techniques, including their use in query processing, the performance effects of algorithmic tricks and variants of external sorting, and the duality between main-memory mergesort and quicksort. It also discusses the merging of initial runs into larger runs, the factors affecting merge efficiency, and the optimal cluster sizes and memory allocations for sorting. The paper concludes with a summary and outlook on query processing research and its future. It highlights the importance of efficient query execution techniques for large databases and the need for further research in this area.This paper surveys efficient algorithms and software architectures for database query execution engines, focusing on complex queries over large databases. It discusses a wide range of query evaluation techniques for both relational and postrelational database systems, including iterative execution of complex query evaluation plans, the duality of sort- and hash-based set-matching algorithms, types of parallel query execution and their implementation, and special operators for emerging database application domains. The paper provides a foundation for the design and implementation of query execution facilities in new database management systems. The paper discusses the architecture of query execution engines, sorting and hashing, disk access, aggregation and duplicate removal, binary matching operations, universal quantification, the duality of sort- and hash-based query processing algorithms, execution of complex query plans, mechanisms for parallel query execution, and nonstandard query processing algorithms. It also covers additional techniques for performance improvement, such as precomputation and derived data, data compression, surrogate processing, bit vector filtering, and specialized hardware. The paper emphasizes the importance of efficient algorithms for accessing and manipulating large sets and sequences in database management systems. It discusses the challenges of managing large data volumes and the need for efficient query processing algorithms. The paper also addresses the differences between logical and physical algebras, the implementation of iterators, and the synchronization and data transfer between operators. The paper provides an overview of sorting and hashing techniques, including their use in query processing, the performance effects of algorithmic tricks and variants of external sorting, and the duality between main-memory mergesort and quicksort. It also discusses the merging of initial runs into larger runs, the factors affecting merge efficiency, and the optimal cluster sizes and memory allocations for sorting. The paper concludes with a summary and outlook on query processing research and its future. It highlights the importance of efficient query execution techniques for large databases and the need for further research in this area.

Query Evaluation Techniques for Large Databases

June 1993 | GOETZ GRAEFE