Multi-Probe LSH: Efficient Indexing for High-Dimensional Similarity Search

Multi-Probe LSH: Efficient Indexing for High-Dimensional Similarity Search

September 23-28, 2007 | Qin Lv, William Josephson, Zhe Wang, Moses Charikar, Kai Li
This paper introduces a new indexing scheme called *multi-probe* Locality Sensitive Hashing (LSH) to improve the efficiency of similarity search in high-dimensional spaces. The basic LSH method, while effective, requires a large number of hash tables to achieve good search quality, leading to high space requirements. Multi-probe LSH addresses this issue by intelligently probing multiple buckets that are likely to contain the query's nearest neighbors, reducing the number of hash tables needed. The method is inspired by and improves upon recent theoretical work on entropy-based LSH, which aims to reduce space requirements. Experimental results on two high-dimensional datasets (1.3 million web images and 2.6 million audio words) show that multi-probe LSH significantly reduces the number of hash tables and query time compared to both the basic LSH and entropy-based LSH methods, while maintaining similar search quality. The multi-probe LSH method achieves a reduction in the number of hash tables by a factor of 14 to 18 and query time by half, making it more space and time efficient.This paper introduces a new indexing scheme called *multi-probe* Locality Sensitive Hashing (LSH) to improve the efficiency of similarity search in high-dimensional spaces. The basic LSH method, while effective, requires a large number of hash tables to achieve good search quality, leading to high space requirements. Multi-probe LSH addresses this issue by intelligently probing multiple buckets that are likely to contain the query's nearest neighbors, reducing the number of hash tables needed. The method is inspired by and improves upon recent theoretical work on entropy-based LSH, which aims to reduce space requirements. Experimental results on two high-dimensional datasets (1.3 million web images and 2.6 million audio words) show that multi-probe LSH significantly reduces the number of hash tables and query time compared to both the basic LSH and entropy-based LSH methods, while maintaining similar search quality. The multi-probe LSH method achieves a reduction in the number of hash tables by a factor of 14 to 18 and query time by half, making it more space and time efficient.
Reach us at info@study.space