6 Jun 2024 | Fangjinhua Wang, Xudong Jiang, Silvano Galliani, Christoph Vogel, Marc Pollefeys
GLACE: Global Local Accelerated Coordinate Encoding
**Authors:** Fangjinhua Wang, Xudong Jiang, Silvano Galliani, Christoph Vogel, Marc Pollefeys
**Affiliations:** Department of Computer Science, ETH Zurich; Microsoft Mixed Reality & AI Zurich Lab
**Abstract:**
Scene coordinate regression (SCR) methods are effective for small-scale scenes but struggle in large-scale scenes when no ground-truth 3D point clouds are available. This paper introduces GLACE, a method that integrates pre-trained global and local encodings so that SCR can scale to large scenes with a single small network. GLACE addresses the challenge of implicit triangulation in large scenes with a feature diffusion technique that implicitly groups reprojection constraints by co-visibility, which avoids overfitting to trivial solutions. Additionally, a positional decoder parameterizes output positions more effectively for large-scale scenes. GLACE achieves state-of-the-art results on large-scale datasets without using 3D models or depth maps for supervision.
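To make the "reprojection constraints" and "trivial solutions" above concrete, here is a minimal sketch of a reprojection error under a plain pinhole camera model. The function names, the world-to-camera pose convention, and the intrinsics (`focal`, `cx`, `cy`) are illustrative assumptions and are not taken from the paper. The example shows that two different 3D points on the same viewing ray both give zero error in a single view, and only a second view with a different pose tells them apart.

```python
import math

def project(point_world, rotation, translation, focal, cx, cy):
    """Project a 3D world point to pixel coordinates with a pinhole camera.

    Assumed convention: X_cam = R @ X_world + t (world-to-camera pose).
    """
    x_cam = [
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    if x_cam[2] <= 0:
        raise ValueError("point is behind the camera")
    u = focal * x_cam[0] / x_cam[2] + cx
    v = focal * x_cam[1] / x_cam[2] + cy
    return (u, v)

def reprojection_error(pred_point, pixel, rotation, translation, focal, cx, cy):
    """Pixel distance between the observed pixel and the projected prediction."""
    u, v = project(pred_point, rotation, translation, focal, cx, cy)
    return math.hypot(u - pixel[0], v - pixel[1])

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
FOCAL, CX, CY = 500.0, 320.0, 240.0  # hypothetical intrinsics

true_point = [0.5, 0.0, 2.0]   # the actual scene point
wrong_point = [1.0, 0.0, 4.0]  # a different point on the same ray of view A

# View A: camera at the origin. Both points project to the same pixel.
pixel_a = (445.0, 240.0)
err_a_true = reprojection_error(true_point, pixel_a, IDENTITY, [0.0, 0.0, 0.0], FOCAL, CX, CY)
err_a_wrong = reprojection_error(wrong_point, pixel_a, IDENTITY, [0.0, 0.0, 0.0], FOCAL, CX, CY)

# View B: camera shifted 1 m along x. Only the true point matches the observation.
pixel_b = (195.0, 240.0)
err_b_true = reprojection_error(true_point, pixel_b, IDENTITY, [-1.0, 0.0, 0.0], FOCAL, CX, CY)
err_b_wrong = reprojection_error(wrong_point, pixel_b, IDENTITY, [-1.0, 0.0, 0.0], FOCAL, CX, CY)
```

A single view cannot fix the depth of a predicted point, so a network supervised only by reprojection error needs constraints from several co-visible views to land on the correct 3D position.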
**Contributions:**
1. GLACE is the first SCR method to achieve state-of-the-art performance on large-scale scenes without ensemble networks or 3D model supervision.
2. A novel feature diffusion technique integrates global and local encodings, effectively grouping reprojection constraints with co-visibility.
3. An improved positional decoder parameterizes output positions more effectively for large-scale scenes.
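The summary names the feature diffusion technique without describing its mechanics. The sketch below shows one plausible reading, offered only as an assumption: perturb a unit-length global image feature with Gaussian noise, re-normalize it, and concatenate it with a per-patch local feature. All names and the noise scale `sigma` are hypothetical.

```python
import math
import random

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def diffuse_global_feature(global_feat, sigma, rng):
    """Add isotropic Gaussian noise to a global feature, then re-normalize.

    Assumption: this noise-and-normalize step stands in for "feature diffusion".
    """
    noisy = [x + rng.gauss(0.0, sigma) for x in global_feat]
    return l2_normalize(noisy)

def build_input(local_feat, global_feat, sigma, rng):
    """Concatenate a local patch feature with the perturbed global feature."""
    return list(local_feat) + diffuse_global_feature(global_feat, sigma, rng)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
global_feat = l2_normalize([0.2, -0.4, 0.1, 0.9])  # toy 4-D global encoding
local_feat = [0.3, 0.7, -0.1]                      # toy 3-D local encoding

sample = build_input(local_feat, global_feat, sigma=0.1, rng=rng)
clean = build_input(local_feat, global_feat, sigma=0.0, rng=rng)
```

Under this reading, images whose global features lie close together (a rough proxy for co-visibility) produce overlapping noisy inputs, which would encourage the network to treat their reprojection constraints as one group rather than fitting each image in isolation.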
**Related Work:**
- **Pose Regression:** These methods encode the scene in a neural network and regress the camera pose directly from a query image.
- **Feature Matching Based Localization:** These methods represent the scene with explicit 3D geometry and match pixels in a query image to points in that 3D model.
**Experiments:**
- **Datasets:** 7 Scenes, 12 Scenes, Cambridge Landmarks, Aachen Day-Night.
- **Implementation:** PyTorch-based architecture, similar to ACE, with adjustments for large outdoor scenes.
- **Evaluation:** GLACE outperforms state-of-the-art SCR methods on large-scale scenes and performs comparably to feature matching methods while using a smaller model.
**Ablation Study:**
- **Feature Diffusion:** Enhances performance by effectively grouping reprojection constraints.
- **Decoder:** Improves performance by allowing the model to parameterize a multimodal distribution.
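The decoder item above says the model can parameterize a multimodal distribution, but the summary does not say how. As a hedged illustration only, the sketch below assumes a decoder head that outputs softmax weights over a fixed set of 3D cluster centers plus a residual offset. The cluster centers, function names, and values are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode_position(logits, offset, centers):
    """Blend cluster centers with softmax weights, then add a residual offset.

    Assumption: the network predicts `logits` (one per center) and `offset`.
    """
    weights = softmax(logits)
    blended = [sum(w * c[d] for w, c in zip(weights, centers)) for d in range(3)]
    return [b + o for b, o in zip(blended, offset)]

# Hypothetical cluster centers spread over a large scene (in meters).
centers = [[0.0, 0.0, 0.0], [100.0, 0.0, 0.0], [0.0, 200.0, 0.0]]

# A confident prediction selects the second center and refines it locally.
position = decode_position([0.0, 50.0, 0.0], [1.0, -2.0, 0.5], centers)

# Equal logits fall back to the mean of the centers.
uniform = decode_position([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], centers)
```

With a head like this, the network only has to choose among a few distant modes and predict a small local correction, instead of regressing large absolute coordinates directly.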
**Conclusion:**
GLACE is a novel SCR method that scales to large scenes using a single network, leveraging co-visibility information and an improved position decoder.