The Case for Learned Index Structures

30 Apr 2018 | Tim Kraska, Alex Beutel, Ed H. Chi, Jeffrey Dean, Neoklis Polyzotis
This paper introduces learned indexes, which replace traditional index structures such as B-Trees, hash maps, and Bloom filters with machine learning models. The key insight is that an index is itself a kind of model: a B-Tree maps a key to the position of a record in a sorted array, a hash map maps a key to a slot, and a Bloom filter predicts whether a key exists. A learned index instead learns the data distribution (for range indexes, effectively the cumulative distribution function of the keys) and uses it to predict a record's position or existence, potentially outperforming traditional indexes in both speed and memory usage.

The authors analyze under what conditions learned indexes beat traditional structures and discuss the main challenges in their design. Initial results show that neural-network-based indexes can be up to 70% faster than cache-optimized B-Trees while saving an order of magnitude in memory. The paper explores learned indexes for range queries, point (hash) lookups, and existence tests, showing that they can match or exceed the performance of the structures they replace. The authors also argue that learned indexes stand to benefit from future hardware such as GPUs and TPUs, and note that the same idea extends to other components of data systems.

The evaluation, on both real-world and synthetic datasets, shows significant improvements in lookup speed and memory footprint. The authors discuss open challenges, most notably write-heavy workloads, where inserts can invalidate a learned model, and suggest directions for future research. The paper concludes that embedding machine learning models into core algorithms and data structures has the potential to fundamentally change how data systems are built.
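To make the range-index idea concrete, here is a minimal sketch, not the paper's implementation, of a learned index over a sorted array: a single least-squares linear model approximates the key-to-position mapping (the CDF), and the model's recorded worst-case error turns each prediction into a small search window that is guaranteed to contain the key. The paper's actual design, the Recursive Model Index, uses a hierarchy of models; the single-model fit and the class name `LearnedIndex` below are illustrative assumptions.

```python
import bisect

class LearnedIndex:
    """Sketch of a learned range index: a linear model approximates the
    CDF of the sorted keys, and a recorded worst-case prediction error
    bounds the local search around each prediction."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        # Fit position ~ slope * key + intercept by least squares
        # (assumed model choice; the paper uses a hierarchy of models).
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var_k = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (p - mean_p)
                  for p, k in enumerate(self.keys))
        self.slope = cov / var_k if var_k else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Record the maximum prediction error so lookups stay correct.
        self.max_err = max(abs(self._predict(k) - p)
                           for p, k in enumerate(self.keys))

    def _predict(self, key):
        return int(self.slope * key + self.intercept)

    def lookup(self, key):
        n = len(self.keys)
        guess = min(max(self._predict(key), 0), n - 1)
        lo = max(guess - self.max_err, 0)
        hi = min(guess + self.max_err + 1, n)
        # Binary search only inside the error window around the prediction.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        return i if i < n and self.keys[i] == key else None


idx = LearnedIndex(range(0, 1000, 3))
print(idx.lookup(42))  # position of key 42, or None if absent
```

Because the error bound is computed over the training keys, a present key always falls inside the search window, so the model can be arbitrarily inaccurate without sacrificing correctness; accuracy only affects how much local search remains.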
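The existence-test case works analogously: the paper pairs a classifier with a small backup Bloom filter covering exactly the keys the model misses, so the combined structure still has no false negatives. The sketch below follows that structure under assumptions of our own: `model` is any hypothetical scoring function, the threshold is caller-chosen, and the backup filter's sizing is arbitrary.

```python
import hashlib

class BackupBloom:
    """Tiny conventional Bloom filter (one byte per bit, for simplicity)."""
    def __init__(self, m_bits, k_hashes):
        self.m, self.k = m_bits, k_hashes
        self.bits = bytearray(m_bits)

    def _hashes(self, key):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, key):
        for h in self._hashes(key):
            self.bits[h] = 1

    def __contains__(self, key):
        return all(self.bits[h] for h in self._hashes(key))


class LearnedBloom:
    """Learned Bloom filter sketch: a model screens keys; keys the model
    rejects are inserted into a backup filter, so members are never missed."""
    def __init__(self, keys, model, threshold):
        self.model, self.threshold = model, threshold
        misses = [k for k in keys if model(k) < threshold]
        self.backup = BackupBloom(max(8 * len(misses), 8), 3)
        for k in misses:
            self.backup.add(k)

    def __contains__(self, key):
        return self.model(key) >= self.threshold or key in self.backup


keys = {"alpha", "bravo", "charlie"}
model = lambda k: 1.0 if k.startswith(("a", "b")) else 0.0  # toy stand-in
f = LearnedBloom(keys, model, threshold=0.5)
assert all(k in f for k in keys)  # no false negatives by construction
```

The memory savings come from the backup filter only needing to cover the model's false negatives; the false-positive rate is the sum of the model's and the backup filter's, which is why the threshold must be tuned.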