17 Mar 2024 | Peng Jiang¹, Gaurav Pandey² and Srikanth Saripalli¹
This paper introduces 3DGS-ReLoc, a novel system for 3D mapping and visual relocalization using 3D Gaussian Splatting (3DGS). The system combines LiDAR and camera data to create detailed, geometrically accurate representations of the environment. Initializing the training of the 3DGS map from LiDAR data lets the system build detailed and precise maps. To reduce GPU memory usage, the system employs a 2D voxel map and a KD-tree for efficient spatial queries.
This setup enables efficient identification of correspondences between the query image and the rendered image from the 3DGS map using normalized cross-correlation (NCC). The camera pose of the query image is refined using feature-based matching and the Perspective-n-Point (PnP) technique. Extensive evaluation on the KITTI360 dataset demonstrates the effectiveness, adaptability, and precision of the system. The paper also discusses related work in map representation and visual relocalization, highlighting the advantages and limitations of different techniques. The method's performance in initial relocalization, refinement, and live relocalization is evaluated, showing high accuracy and robustness. The paper concludes by discussing limitations and future directions, including balancing visual quality with memory and geometric fidelity, and exploring fully differentiable localization pipelines.
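To make the NCC step concrete, here is a minimal sketch of zero-mean normalized cross-correlation used to score a query image against several rendered candidates. It is an illustration under stated assumptions, not the authors' implementation: the function names `ncc` and `best_candidate`, the whole-image (rather than patch-wise) comparison, and the synthetic test images are all hypothetical, and the summary above does not say exactly how the paper applies NCC.

```python
import numpy as np


def ncc(a, b, eps=1e-12):
    """Zero-mean normalized cross-correlation of two equal-shape images.

    Returns a value in [-1, 1]; 1 means the images match up to a positive
    gain and an offset in intensity.
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    if denom < eps:
        # At least one image is constant, so the correlation is undefined;
        # returning 0.0 is a convention chosen for this sketch.
        return 0.0
    return float((a * b).sum() / denom)


def best_candidate(query, renders):
    """Return (index, score) of the render most correlated with the query."""
    scores = [ncc(query, r) for r in renders]
    best = int(np.argmax(scores))
    return best, scores[best]


# Hypothetical usage: random arrays stand in for grayscale images, and the
# second "render" is the query under a brightness and contrast change.
rng = np.random.default_rng(0)
query = rng.random((32, 48))
renders = [rng.random((32, 48)), 0.5 * query + 0.1, rng.random((32, 48))]
idx, score = best_candidate(query, renders)
```

Because the mean is subtracted and the result is normalized, the score is unaffected by a positive gain or an offset in intensity, which is one reason NCC is a common choice when comparing a rendered view against a real camera image.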