MVIP-NeRF: Multi-view 3D Inpainting on NeRF Scenes via Diffusion Prior

5 May 2024 | Honghua Chen, Chen Change Loy, Xingang Pan
MVIP-NeRF is an approach to multi-view 3D inpainting on NeRF scenes that uses diffusion priors. Existing NeRF inpainting techniques rely on explicit RGB and depth inpainting results, which often leads to inconsistencies and inaccuracies. MVIP-NeRF instead leverages diffusion priors implicitly, yielding more faithful and consistent results in both appearance and geometry.

The approach inpaints multiple views jointly through an iterative optimization process based on Score Distillation Sampling (SDS). In addition to recovering rendered RGB images, it extracts normal maps as a geometric representation and defines a normal SDS loss to encourage accurate geometry inpainting. A multi-view SDS score function distills generative priors from images of different views simultaneously, which keeps the visual completion consistent under large view variations.

The method is evaluated on two real-world datasets, Real-S and Real-L, and compared with previous NeRF inpainting methods, including Remove-NeRF and SPiN-NeRF. According to the reported experiments, it achieves better appearance and geometry recovery in both quantitative metrics and visual quality, with significant improvements in handling large view variations. Ablation studies and additional experiments support the effectiveness of the diffusion prior for consistent and realistic 3D inpainting. The work highlights the potential of diffusion models in 3D inpainting and offers a new paradigm for multi-view 3D inpainting on NeRF scenes.
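The SDS-based optimization described above can be illustrated with a small toy sketch. It is not the authors' implementation, and the paper's exact multi-view score function is not reproduced here. The sketch makes several assumptions: plain pixel arrays stand in for NeRF renderings (in a real pipeline the gradient would be backpropagated to the shared NeRF parameters), `toy_noise_predictor` is a hypothetical stand-in for a pretrained diffusion model, the weighting w(t) = 1 - alpha_bar is one common choice rather than the paper's, and "multi-view" is reduced to sharing one noise level across views and updating them in the same step. All function names are illustrative.

```python
import numpy as np


def add_noise(x, eps, alpha_bar):
    """Forward diffusion: x_t = sqrt(alpha_bar) * x + sqrt(1 - alpha_bar) * eps."""
    return np.sqrt(alpha_bar) * x + np.sqrt(1.0 - alpha_bar) * eps


def toy_noise_predictor(x_t, alpha_bar, target):
    """Hypothetical stand-in for a pretrained diffusion model (assumption).

    If the data distribution were concentrated on one image `target`, the
    ideal noise estimate would be the residual below. A real model would
    take the noisy image, the mask, and a text prompt instead.
    """
    return (x_t - np.sqrt(alpha_bar) * target) / np.sqrt(1.0 - alpha_bar)


def sds_grad(x, mask, alpha_bar, target, rng):
    """SDS gradient w(t) * (eps_hat - eps), restricted to the masked region."""
    eps = rng.standard_normal(x.shape)
    x_t = add_noise(x, eps, alpha_bar)
    eps_hat = toy_noise_predictor(x_t, alpha_bar, target)
    weight = 1.0 - alpha_bar  # assumed weighting w(t) = sigma_t^2
    return weight * (eps_hat - eps) * mask


def multi_view_sds_step(views, masks, targets, lr, rng):
    """One joint update: all views share a noise level and step together."""
    alpha_bar = rng.uniform(0.02, 0.98)
    grads = [
        sds_grad(x, m, alpha_bar, t, rng)
        for x, m, t in zip(views, masks, targets)
    ]
    return [x - lr * g for x, g in zip(views, grads)]


def run_demo(steps=200, lr=0.5, seed=0):
    """Inpaint the masked 2x2 centre of two toy 4x4 'views'."""
    rng = np.random.default_rng(seed)
    mask = np.zeros((4, 4))
    mask[1:3, 1:3] = 1.0  # region to inpaint
    masks = [mask.copy(), mask.copy()]
    targets = [np.full((4, 4), 0.8), np.full((4, 4), 0.2)]
    views = [np.zeros((4, 4)), np.zeros((4, 4))]
    for _ in range(steps):
        views = multi_view_sds_step(views, masks, targets, lr, rng)
    return views, masks, targets


if __name__ == "__main__":
    views, _, _ = run_demo()
    print(np.round(views[0], 3))
```

In this toy setting the sampled noise cancels out of the gradient, so each step pulls the masked pixels toward the image the stand-in denoiser considers likely, while pixels outside the mask are left untouched. The same update could in principle be applied to rendered normal maps rather than RGB images, which is the role the normal SDS loss plays in the description above.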