MoD-SLAM: Monocular Dense Mapping for Unbounded 3D Scene Reconstruction

8 Mar 2024 | Heng Zhou, Zhetao Guo, Shuhong Liu, Lechen Zhang, Qihao Wang, Yuxiang Ren, and Mingrui Li
MoD-SLAM is a monocular NeRF-based dense mapping method that enables real-time 3D reconstruction of unbounded scenes. The method introduces a Gaussian-based unbounded scene representation to address the challenge of mapping scenes without boundaries, and a depth estimation module that extracts accurate prior depth values to supervise both the mapping and tracking processes. A robust depth loss term is introduced into the tracking process to improve pose estimation in large-scale scenes. Experiments on two standard datasets show that MoD-SLAM achieves competitive performance, improving the accuracy of 3D reconstruction and localization by up to 30% and 15% respectively compared with existing state-of-the-art monocular SLAM systems.

MoD-SLAM incorporates a depth-supervised camera tracking method to improve camera pose estimation in monocular SLAM, and by combining loop closure detection with global optimization it demonstrates state-of-the-art performance in both mapping and tracking metrics. A spherical contraction function handles unbounded large scenes, while Gaussian encoding captures more precise geometric structure and visual appearance information. A monocular depth estimation module and a depth distillation module extract accurate depth values and constrain the scale of the currently observed scene. A NeRF-based volume rendering method uses both color and depth values for network training.

The system is evaluated on synthetic and real-world datasets, including Replica and ScanNet, showing enhanced geometric structure and texture features in reconstructed scenes. The results demonstrate that MoD-SLAM achieves superior performance in both localization and reconstruction with low time and GPU memory consumption compared to state-of-the-art SLAM systems.
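To illustrate how a spherical contraction can bound an unbounded scene, the following is a minimal sketch assuming a mip-NeRF 360-style contraction; the summary does not give MoD-SLAM's exact formula, so this particular form (identity inside the unit sphere, far points squeezed into a radius-2 shell) is an assumption.

```python
import numpy as np

def spherical_contraction(x, eps=1e-8):
    """Contract unbounded 3D points into a bounded ball.

    Assumed mip-NeRF 360-style contraction (not necessarily MoD-SLAM's exact
    function): points inside the unit sphere are unchanged, points outside are
    mapped onto the shell between radius 1 and 2, so the whole scene fits in a
    ball of radius 2 that a voxel/feature grid can cover.
    """
    x = np.asarray(x, dtype=np.float64)
    norm = np.linalg.norm(x, axis=-1, keepdims=True)                 # ||x|| per point
    safe_norm = np.maximum(norm, eps)
    contracted = (2.0 - 1.0 / safe_norm) * (x / safe_norm)           # squeeze far points
    return np.where(norm <= 1.0, x, contracted)

# Example: a nearby point is kept as-is, a far-away point lands near radius 2.
print(spherical_contraction([[0.3, 0.0, 0.0], [100.0, 0.0, 0.0]]))
```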
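The volume rendering step that trains the network on both color and depth can be sketched as below. This is a generic NeRF quadrature and a hypothetical combined loss with an assumed weight lambda_d; the summary does not specify MoD-SLAM's exact weighting or robust kernel.

```python
import torch

def render_color_and_depth(sigmas, rgbs, z_vals):
    """Render per-ray color and expected depth from sampled densities.

    Standard NeRF volume rendering, assuming N samples per ray:
    sigmas (R, N), rgbs (R, N, 3), z_vals (R, N) are the network outputs and
    sample depths along each ray. MoD-SLAM's exact formulation may differ.
    """
    deltas = torch.diff(z_vals, dim=-1)
    deltas = torch.cat([deltas, torch.full_like(deltas[..., :1], 1e10)], dim=-1)
    alphas = 1.0 - torch.exp(-sigmas * deltas)                      # per-sample opacity
    trans = torch.cumprod(torch.cat(
        [torch.ones_like(alphas[..., :1]), 1.0 - alphas + 1e-10], dim=-1), dim=-1)[..., :-1]
    weights = alphas * trans                                        # contribution of each sample
    color = (weights[..., None] * rgbs).sum(dim=-2)                 # (R, 3) rendered color
    depth = (weights * z_vals).sum(dim=-1)                          # (R,) expected depth
    return color, depth

def mapping_loss(color, depth, gt_color, prior_depth, lambda_d=0.1):
    """Hypothetical photometric + depth loss; lambda_d and the L1 form are
    illustrative assumptions, with prior_depth coming from the depth
    estimation module described above."""
    color_loss = torch.abs(color - gt_color).mean()
    depth_loss = torch.abs(depth - prior_depth).mean()
    return color_loss + lambda_d * depth_loss
```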