Efficient LoFTR: Semi-Dense Local Feature Matching with Sparse-Like Speed

The paper presents an efficient method for producing semi-dense matches across images, addressing the limitations of the earlier detector-free matcher LoFTR. LoFTR handles large viewpoint changes and texture-poor scenes well, but suffers from low efficiency. The authors revisit LoFTR's design choices and propose several improvements to both efficiency and accuracy. Key contributions include:

1. **Aggregated Attention Mechanism**: A new aggregated attention mechanism performs transformer operations on adaptively selected tokens, reducing computational overhead.
2. **Two-Stage Correlation Layer**: A novel two-stage correlation layer refines coarse matches into accurate subpixel correspondences.
3. **Efficiency and Accuracy**: The optimized model is 2.5 times faster than LoFTR and can surpass state-of-the-art efficient sparse matching pipelines like SuperPoint + LightGlue.

The method is evaluated on homography estimation, relative pose recovery, and visual localization, where it demonstrates superior accuracy and efficiency. The paper also includes detailed ablation studies that validate the proposed improvements.
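To make the aggregated attention idea more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: tokens form a 1D sequence rather than a 2D feature map, max pooling over fixed windows stands in for the token selection step, and the helper names (`max_pool_tokens`, `aggregated_attention`) and the window size `s` are hypothetical. The point is only that attention runs on the reduced token set and the result is spread back to full resolution.

```python
import math

def max_pool_tokens(tokens, s):
    """Reduce a token sequence by element-wise max over windows of size s.
    (Assumption: max pooling stands in for the paper's token selection.)"""
    pooled = []
    for start in range(0, len(tokens), s):
        window = tokens[start:start + s]
        pooled.append([max(column) for column in zip(*window)])
    return pooled

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(x - peak) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Plain scaled dot-product attention over lists of vectors."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(len(values[0]))])
    return out

def aggregated_attention(tokens_a, tokens_b, s):
    """Attend from tokens_a to tokens_b on pooled tokens, then upsample.
    Each full-resolution token receives the message computed for its
    window (nearest-neighbour upsampling) as a residual update."""
    q = max_pool_tokens(tokens_a, s)
    kv = max_pool_tokens(tokens_b, s)
    messages = attention(q, kv, kv)
    return [[t + m for t, m in zip(tok, messages[i // s])]
            for i, tok in enumerate(tokens_a)]

# Toy usage: 8 tokens of dimension 2 per image, window size 2.
tokens_a = [[0.1 * i, 1.0 - 0.1 * i] for i in range(8)]
tokens_b = [[0.2 * i, 0.5] for i in range(8)]
updated = aggregated_attention(tokens_a, tokens_b, s=2)
print(len(updated), len(updated[0]))
```

In this 1D sketch, pooling both sequences with window size `s` cuts the number of query-key pairs by a factor of `s * s`, which is where the saving comes from. The output keeps the original token count, so later stages can still work at full resolution.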

11 Mar 2024 | Yifan Wang, Xingyi He, Sida Peng, Dongli Tan, Xiaowei Zhou†
