Alina Beygelzimer, Sham Kakade, and John Langford
The paper introduces a data structure called a cover tree for efficient exact nearest neighbor search in general metric spaces. The cover tree always requires $O(n)$ space, regardless of the metric's structure. If the point set has expansion constant $c$, the tree can be constructed in $O(c^6 n \log n)$ time and a nearest neighbor query is answered in $O(c^{12} \log n)$ time; answering a query for each of the $n$ points (batch query) takes $O(c^{16} n)$ total time. Experiments on real-world datasets show speedups over brute-force search ranging from a factor of 1 to 2000.
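For concreteness, below is a minimal sketch in Python of what a node in the explicit representation could look like; the class and field names are illustrative assumptions rather than the authors' implementation, but they reflect the invariants described next.

```python
class CoverTreeNode:
    """Illustrative node of an explicit cover tree (not the paper's code).

    Each node stores one data point together with the level i at which it
    appears. Its children lie within distance 2**i of it, and every
    descendant lies within distance 2**(i + 1). Since each point is stored
    by exactly one node in the explicit representation, the tree occupies
    O(n) space.
    """

    def __init__(self, point, level):
        self.point = point      # the data point represented by this node
        self.level = level      # the (integer) level i of this node
        self.children = []      # nodes at lower levels, within 2**level of point
```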
The cover tree is a leveled tree in which each level is a "cover" of the level beneath it, and it maintains three invariants: nesting, covering tree, and separation. The algorithms are simple because they only manipulate a tree, and the explicit representation of a cover tree is a subgraph of a navigating net. Crucially, the $O(n)$ space bound holds independently of the dataset's expansion constant, whereas earlier structures with comparable query guarantees have space requirements that grow with the expansion constant or otherwise assume a bounded abstract dimensionality.
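Concretely, writing $C_i$ for the set of points appearing at level $i$ (a paraphrase of the paper's definitions, with scale $2^i$ at level $i$): nesting means $C_i \subset C_{i-1}$, so a point present at level $i$ reappears at every lower level; the covering tree invariant means every $p \in C_{i-1}$ has exactly one parent $q \in C_i$ with $d(p, q) \leq 2^i$; and separation means any two distinct points $p, q \in C_i$ satisfy $d(p, q) > 2^i$.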
The paper also presents algorithms for inserting, removing, and querying points in the cover tree; insertion and removal each take $O(c^6 \log n)$ time. A batch-construction algorithm amortizes the construction cost over the inserted points, and a batch-query algorithm reduces the total cost of answering many queries by amortizing the search work across them.
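For intuition, here is a sketch of how a single query could be answered with the illustrative node class above. This is not the paper's level-by-level Find-Nearest routine but a simpler depth-first branch and bound that exploits the same covering invariant; the paper's algorithm instead descends the levels in lockstep, pruning whole cover sets at once, which is what yields the $O(c^{12} \log n)$ bound.

```python
def nearest_neighbor(root, query, dist):
    """Branch-and-bound nearest-neighbor search over a cover tree (sketch).

    Relies on the covering invariant: every descendant of a node at level i
    lies within 2**(i + 1) of that node's point, so a whole subtree can be
    skipped once even that slack cannot beat the best distance found so far.
    """
    best_point, best_dist = root.point, dist(query, root.point)
    stack = [root]
    while stack:
        node = stack.pop()
        for child in node.children:
            d = dist(query, child.point)
            if d < best_dist:
                best_point, best_dist = child.point, d
            # Descendants of `child` are within 2**(child.level + 1) of it,
            # so only descend if that subtree could still hold a closer point.
            if d - 2 ** (child.level + 1) < best_dist:
                stack.append(child)
    return best_point, best_dist
```

In either formulation the pruning rule needs only the triangle inequality together with the covering invariant, which is why the structure applies to arbitrary metric spaces.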
The paper evaluates the cover tree on several datasets, including ones drawn from the UCI repository, KDD Cup data, MNIST, and Isomap data. The cover tree substantially outperforms brute-force search and compares well with the $sb(S)$ data structure, with speedup factors relative to $sb(S)$ ranging from 0.8 to 30. Performance is governed by the expansion constant, and the 80th-percentile expansion constant turns out to be a better predictor of query time than the worst-case expansion constant. The structure also handles approximate nearest neighbor queries effectively, and its linear space requirement makes it practical for large, high-dimensional datasets.