SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior


29 Mar 2024 | Zhongrui Yu, Haoran Wang, Jinze Yang, Hanzhang Wang, Zeke Xie, Yunfeng Cai, Jiale Cao, Zhong Ji, and Mingming Sun
This paper proposes a method for novel view synthesis (NVS) in street scenes that combines 3D Gaussian Splatting (3DGS) with a diffusion model to improve rendering quality. A fine-tuned diffusion model supplies prior information during training, regularizing the 3DGS optimization and enabling high-quality images to be rendered from unobserved viewpoints. The diffusion model is fine-tuned on adjacent frame images and LiDAR depth data, which lets it learn the spatial relationships needed to guide rendering from views far from the training cameras.

Evaluated on the KITTI and KITTI-360 datasets, the method outperforms existing state-of-the-art approaches in both image quality and rendering efficiency, particularly in sparse-view settings. It maintains photo-realistic rendering quality even for viewpoints distant from the training views, and it does so without compromising the real-time inference capability of 3DGS, making it well suited to autonomous driving simulation.
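The summary describes the diffusion prior as a regularizer on 3DGS training at pseudo (unobserved) viewpoints. The sketch below is a minimal, hypothetical illustration of that idea in PyTorch: `render_novel_view` and `diffusion_prior` are stand-ins I have assumed for the 3DGS rasterizer and the fine-tuned diffusion model, and the L1 term stands in for whatever guidance loss the paper actually uses; none of these names or details come from the paper itself.

```python
# Hypothetical sketch: using a diffusion-derived prior image to regularize a
# splatting render at an unobserved viewpoint. Placeholder functions only.
import torch
import torch.nn.functional as F

def render_novel_view(gaussian_params: torch.Tensor, camera_pose: torch.Tensor) -> torch.Tensor:
    """Stand-in for a 3DGS rasterizer; returns a toy H x W x 3 image."""
    return torch.sigmoid(gaussian_params.mean() + camera_pose.sum()) * torch.rand(128, 128, 3)

def diffusion_prior(rendered: torch.Tensor) -> torch.Tensor:
    """Stand-in for the fine-tuned diffusion model producing a cleaned-up target image."""
    return rendered.detach() + 0.01 * torch.randn_like(rendered)

# Toy Gaussian attributes (position, scale, rotation, opacity, color, ...).
gaussian_params = torch.randn(10_000, 14, requires_grad=True)
optimizer = torch.optim.Adam([gaussian_params], lr=1e-3)

for step in range(100):
    pose = torch.randn(4, 4)                      # sample a pseudo (unobserved) viewpoint
    rendered = render_novel_view(gaussian_params, pose)
    target = diffusion_prior(rendered)            # prior image acts as a soft supervision signal
    loss = F.l1_loss(rendered, target)            # regularization term alongside the usual photometric loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Because the prior image is detached, gradients flow only into the Gaussian parameters; the diffusion model is used purely as a training-time supervision signal, which is consistent with the summary's claim that the real-time inference of 3DGS is unaffected.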