OmniGlue: Generalizable Feature Matching with Foundation Model Guidance

21 May 2024 | Hanwen Jiang, Arjun Karpur, Bingyi Cao, Qixing Huang, André Araujo
OmniGlue is a novel learnable image matching method designed with generalization as a core principle. It leverages a vision foundation model to guide the feature matching process, improving its ability to generalize to domains unseen at training time. The method also introduces a keypoint position-guided attention mechanism that disentangles spatial and appearance information, leading to stronger matching descriptors.

Comprehensive experiments on seven datasets spanning diverse image domains, including scene-level, object-centric, and aerial images, show that OmniGlue achieves significant improvements over existing methods: relative gains of 20.9% on unseen domains compared to a directly comparable reference model, and 9.5% over the recent LightGlue method. The model is trained using a combination of domain-agnostic local features and foundation model guidance, and it also adapts effectively to target domains with limited training data, achieving up to 8.1% improvement. Compared against SuperGlue and LightGlue, OmniGlue's contributions yield stronger results in both in-domain and zero-shot generalization settings, making it a versatile and generalizable image matching solution.
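The foundation-model guidance can be pictured as follows: coarse, broadly pre-trained dense features (DINOv2-style) provide region-level similarity that indicates which keypoint pairs across the two images are worth attending to. The sketch below is a minimal illustration of that idea under stated assumptions, not the paper's implementation; the function name, the patch-grid sampling, and the raw similarity output are all illustrative.

```python
import torch
import torch.nn.functional as F

def foundation_guidance(dense_feats_a, dense_feats_b, kpts_a, kpts_b, patch_size=14):
    """Sketch: derive a cross-image attention guide from foundation-model features.

    dense_feats_*: (H, W, C) dense patch features (e.g., from a DINOv2-style model).
    kpts_*: (N, 2) keypoint pixel coordinates, ordered as (x, y).
    Returns an (Na, Nb) similarity matrix; higher values suggest keypoint pairs
    lying in semantically related regions that deserve cross-image attention.
    """
    def sample(feats, kpts):
        # Look up the patch feature under each keypoint.
        idx = (kpts / patch_size).long()
        rows = idx[:, 1].clamp(0, feats.shape[0] - 1)
        cols = idx[:, 0].clamp(0, feats.shape[1] - 1)
        return feats[rows, cols]  # (N, C)

    fa = F.normalize(sample(dense_feats_a, kpts_a), dim=-1)
    fb = F.normalize(sample(dense_feats_b, kpts_b), dim=-1)
    return fa @ fb.t()  # cosine similarity between keypoint regions
```

In practice, such a guide could be thresholded or top-k-sparsified so that cross-image attention is computed only between plausibly related keypoints rather than all pairs.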
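The keypoint position-guided attention can be sketched as follows. The defining property is that positional encodings influence where attention looks (the queries and keys) but never enter the aggregated values, so the propagated descriptors remain purely appearance-based. This is a minimal single-head sketch under that reading of the abstract; the module and layer names are illustrative assumptions, not the paper's code.

```python
import torch
import torch.nn as nn

class PositionGuidedAttention(nn.Module):
    """Attention in which keypoint positions guide the attention pattern
    while the updated descriptors carry appearance information only."""

    def __init__(self, dim: int):
        super().__init__()
        self.pos_embed = nn.Linear(2, dim)   # embed (x, y) keypoint positions
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, desc: torch.Tensor, pos: torch.Tensor) -> torch.Tensor:
        # desc: (N, dim) appearance descriptors; pos: (N, 2) keypoint coordinates.
        p = self.pos_embed(pos)
        q = self.q_proj(desc + p)  # positions shape the attention scores...
        k = self.k_proj(desc + p)
        v = self.v_proj(desc)      # ...but values stay position-free, keeping
                                   # spatial and appearance cues disentangled.
        attn = torch.softmax(q @ k.t() / desc.shape[-1] ** 0.5, dim=-1)
        return desc + self.out_proj(attn @ v)
```

Per the abstract, the intent of this disentanglement is that the matched descriptors are not contaminated by absolute-position cues, which helps them transfer across image domains.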