28 Feb 2017 | Jeff Johnson, Matthijs Douze, Hervé Jégou
This paper presents a GPU-based approach for efficient similarity search, particularly for large-scale data. The authors propose a k-selection algorithm that achieves up to 55% of the theoretical peak performance, enabling a nearest neighbor implementation that is 8.5× faster than prior GPU state of the art. The method is applied in various similarity search scenarios, including brute-force, approximate, and compressed-domain search based on product quantization. The implementation enables the construction of a high accuracy k-NN graph on 95 million images from the YFCC100M dataset in 35 minutes, and of a graph connecting 1 billion vectors in less than 12 hours on 4 Maxwell Titan X GPUs. The approach is open-sourced for comparison and reproducibility.
The paper discusses the challenges of similarity search on GPUs, including the limitations of existing algorithms and the need for efficient memory usage. It introduces a GPU k-selection algorithm that operates in fast register memory and is flexible enough to be fusable with other kernels. The algorithm is designed for exact and approximate k-nearest neighbor search on GPU, and is shown to outperform previous art by large margins on mid- to large-scale nearest-neighbor search tasks.
The paper also discusses the use of product quantization (PQ) codes, which are more effective than binary codes for similarity search. The authors propose an optimized product quantization (OPQ) method that improves the accuracy of product quantization and can be applied as a pre-processing step. The paper also discusses the use of inverted file structures for compressed-domain search, which allows for efficient memory usage and high compression rates.
The authors evaluate their approach on various datasets, including the MNIST8m dataset and the SIFT1M dataset. They compare their GPU-based approach with existing libraries and show that their method is significantly faster, particularly for large-scale data. The paper also discusses the use of multi-GPU parallelism to further improve performance, and shows that their approach can be applied to billion-scale datasets with quantization codes. The results demonstrate that their approach is effective for large-scale similarity search and can be used for a variety of applications, including image and video indexing.This paper presents a GPU-based approach for efficient similarity search, particularly for large-scale data. The authors propose a k-selection algorithm that achieves up to 55% of the theoretical peak performance, enabling a nearest neighbor implementation that is 8.5× faster than prior GPU state of the art. The method is applied in various similarity search scenarios, including brute-force, approximate, and compressed-domain search based on product quantization. The implementation enables the construction of a high accuracy k-NN graph on 95 million images from the YFCC100M dataset in 35 minutes, and of a graph connecting 1 billion vectors in less than 12 hours on 4 Maxwell Titan X GPUs. The approach is open-sourced for comparison and reproducibility.
The paper discusses the challenges of similarity search on GPUs, including the limitations of existing algorithms and the need for efficient memory usage. It introduces a GPU k-selection algorithm that operates in fast register memory and is flexible enough to be fusable with other kernels. The algorithm is designed for exact and approximate k-nearest neighbor search on GPU, and is shown to outperform previous art by large margins on mid- to large-scale nearest-neighbor search tasks.
The paper also discusses the use of product quantization (PQ) codes, which are more effective than binary codes for similarity search. The authors propose an optimized product quantization (OPQ) method that improves the accuracy of product quantization and can be applied as a pre-processing step. The paper also discusses the use of inverted file structures for compressed-domain search, which allows for efficient memory usage and high compression rates.
The authors evaluate their approach on various datasets, including the MNIST8m dataset and the SIFT1M dataset. They compare their GPU-based approach with existing libraries and show that their method is significantly faster, particularly for large-scale data. The paper also discusses the use of multi-GPU parallelism to further improve performance, and shows that their approach can be applied to billion-scale datasets with quantization codes. The results demonstrate that their approach is effective for large-scale similarity search and can be used for a variety of applications, including image and video indexing.