SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior

2024-03-29 | Zhongrui Yu, Haoran Wang, Jinze Yang, Hanzhang Wang, Zeke Xie, Yunfeng Cai, Jiale Cao, Zhong Ji, and Mingming Sun
The paper "SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior" addresses the challenge of novel view synthesis (NVS) in autonomous driving simulations, particularly focusing on street scenes. The authors propose a method that enhances the capacity of 3D Gaussian Splatting (3DGS) by leveraging prior information from a fine-tuned Diffusion Model and complementary multi-modal data. Specifically, they fine-tune a Diffusion Model using images from adjacent frames and depth data from LiDAR point clouds to provide additional spatial information. This fine-tuned model then guides the 3DGS training at unseen views during the training process. The method demonstrates competitive performance compared to state-of-the-art models on the KITTI and KITTI-360 datasets, maintaining high rendering qualities even for viewpoints distant from the training views. The approach is efficient for real-time inference and facilitates versatile viewpoint control within autonomous driving simulation systems. The paper also includes detailed experimental results and ablation studies to validate the effectiveness of the proposed method.The paper "SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior" addresses the challenge of novel view synthesis (NVS) in autonomous driving simulations, particularly focusing on street scenes. The authors propose a method that enhances the capacity of 3D Gaussian Splatting (3DGS) by leveraging prior information from a fine-tuned Diffusion Model and complementary multi-modal data. Specifically, they fine-tune a Diffusion Model using images from adjacent frames and depth data from LiDAR point clouds to provide additional spatial information. This fine-tuned model then guides the 3DGS training at unseen views during the training process. The method demonstrates competitive performance compared to state-of-the-art models on the KITTI and KITTI-360 datasets, maintaining high rendering qualities even for viewpoints distant from the training views. The approach is efficient for real-time inference and facilitates versatile viewpoint control within autonomous driving simulation systems. The paper also includes detailed experimental results and ablation studies to validate the effectiveness of the proposed method.