November 1998 | SUNIL ARYA, DAVID M. MOUNT, NATHAN S. NETANYAHU, RUTH SILVERMAN, ANGELA Y. WU
The paper presents an optimal algorithm for approximate nearest neighbor searching in fixed dimensions. The authors introduce a balanced box-decomposition (BBD) tree, which is a hierarchical decomposition of space into axis-aligned hyperrectangles with bounded aspect ratios. This data structure allows for preprocessing a set of \( n \) points in \( R^d \) in \( O(dn \log n) \) time and \( O(dn) \) space, enabling the computation of a \((1 + \epsilon)\)-approximate nearest neighbor of any query point \( q \) in \( O(c_{d, \epsilon} \log n) \) time, where \( c_{d, \epsilon} \leq d \lceil 1 + 6d/\epsilon \rceil^d \). The algorithm can also compute a sequence of \( k \) \((1 + \epsilon)\)-approximate nearest neighbors in additional \( O(kd \log n) \) time. The BBD-tree combines the benefits of kd-trees and quadtree-based structures, achieving both exponential cardinality and geometric size reduction. The construction and query algorithms are deterministic and easy to implement, making the approach practical for high-dimensional data sets. The paper also discusses the theoretical and empirical performance of the algorithm, showing that it outperforms brute-force search and other known methods in terms of space and query time.The paper presents an optimal algorithm for approximate nearest neighbor searching in fixed dimensions. The authors introduce a balanced box-decomposition (BBD) tree, which is a hierarchical decomposition of space into axis-aligned hyperrectangles with bounded aspect ratios. This data structure allows for preprocessing a set of \( n \) points in \( R^d \) in \( O(dn \log n) \) time and \( O(dn) \) space, enabling the computation of a \((1 + \epsilon)\)-approximate nearest neighbor of any query point \( q \) in \( O(c_{d, \epsilon} \log n) \) time, where \( c_{d, \epsilon} \leq d \lceil 1 + 6d/\epsilon \rceil^d \). The algorithm can also compute a sequence of \( k \) \((1 + \epsilon)\)-approximate nearest neighbors in additional \( O(kd \log n) \) time. The BBD-tree combines the benefits of kd-trees and quadtree-based structures, achieving both exponential cardinality and geometric size reduction. The construction and query algorithms are deterministic and easy to implement, making the approach practical for high-dimensional data sets. The paper also discusses the theoretical and empirical performance of the algorithm, showing that it outperforms brute-force search and other known methods in terms of space and query time.