Hashing with Graphs

Hashing with Graphs

2011 | Wei Liu, Jun Wang, Sanjiv Kumar, Shih-Fu Chang
This paper proposes a novel graph-based hashing method called Anchor Graph Hashing (AGH) for efficient nearest neighbor search in large-scale databases. AGH automatically discovers the neighborhood structure in data to learn compact binary codes, enabling fast and accurate search. The method uses Anchor Graphs to approximate the data's low-dimensional manifold structure, allowing for efficient computation of graph Laplacian eigenvectors. These eigenvectors are extended to unseen data points in constant time, enabling fast code generation. AGH also introduces a hierarchical threshold learning procedure, where each eigenfunction generates multiple bits, improving search accuracy. Experimental results on two large datasets, MNIST and NUS-WIDE, show that AGH outperforms state-of-the-art hashing methods in terms of both precision and recall. The method is efficient, with linear training time and constant query time, making it suitable for large-scale applications. AGH's ability to capture semantic neighborhoods and its use of low-rank approximation of the adjacency matrix make it effective in learning compact codes that reflect the underlying data structure. The paper also discusses the theoretical properties of AGH, including its ability to generalize eigenvectors to eigenfunctions for out-of-sample extension. Overall, AGH provides a scalable and effective solution for unsupervised hashing in high-dimensional data.This paper proposes a novel graph-based hashing method called Anchor Graph Hashing (AGH) for efficient nearest neighbor search in large-scale databases. AGH automatically discovers the neighborhood structure in data to learn compact binary codes, enabling fast and accurate search. The method uses Anchor Graphs to approximate the data's low-dimensional manifold structure, allowing for efficient computation of graph Laplacian eigenvectors. These eigenvectors are extended to unseen data points in constant time, enabling fast code generation. AGH also introduces a hierarchical threshold learning procedure, where each eigenfunction generates multiple bits, improving search accuracy. Experimental results on two large datasets, MNIST and NUS-WIDE, show that AGH outperforms state-of-the-art hashing methods in terms of both precision and recall. The method is efficient, with linear training time and constant query time, making it suitable for large-scale applications. AGH's ability to capture semantic neighborhoods and its use of low-rank approximation of the adjacency matrix make it effective in learning compact codes that reflect the underlying data structure. The paper also discusses the theoretical properties of AGH, including its ability to generalize eigenvectors to eigenfunctions for out-of-sample extension. Overall, AGH provides a scalable and effective solution for unsupervised hashing in high-dimensional data.
Reach us at info@study.space