Understanding Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting

This paper introduces Street Gaussians, an explicit scene representation for reconstructing dynamic urban street scenes and rendering them in real time with high fidelity. The scene is represented as a set of point clouds, each corresponding to either the static background or a moving foreground vehicle. Each point carries a 3D Gaussian, parameterized by position, opacity, and covariance, to represent geometry. Appearance is modeled with spherical harmonics; for foreground vehicles, a dynamic (4D) spherical harmonics model captures appearance that varies over time. Vehicle motion is modeled with tracked vehicle poses. Because the representation is explicit and point-based, the separately modeled background and vehicles can be composed easily, which enables real-time rendering and scene editing operations such as rotation, translation, and swapping of vehicles. The method also supports semantic segmentation and decomposition of foreground objects.

The method is evaluated on the Waymo Open and KITTI datasets, where it is reported to achieve state-of-the-art rendering quality and speed, outperforming existing methods on both datasets with rendering up to 100 times faster. Training completes within half an hour. These properties make the method well suited to autonomous driving simulation.
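
The composition of separately modeled point clouds can be illustrated with a minimal NumPy sketch. It assumes each vehicle's Gaussians are stored in an object-local frame and placed into the world frame by a rigid tracked pose (R, t) for the current frame; the function names and the dictionary layout are hypothetical and chosen for illustration, not taken from the authors' code. Only the standard rigid-transform rules for a Gaussian are used: the mean maps to R mu + t and the covariance to R Sigma R^T.

```python
import numpy as np


def object_to_world(means, covs, R, t):
    """Map object-frame Gaussians into the world frame with a rigid pose (R, t).

    means: (N, 3) Gaussian centers, covs: (N, 3, 3) covariances,
    R: (3, 3) rotation, t: (3,) translation. All names are illustrative.
    """
    means_w = means @ R.T + t   # mu_w = R mu_o + t, applied row-wise
    covs_w = R @ covs @ R.T     # Sigma_w = R Sigma_o R^T, broadcast over N
    return means_w, covs_w


def compose_scene(background, objects, poses):
    """Concatenate background Gaussians with posed vehicle Gaussians for one frame."""
    means = [background["means"]]
    covs = [background["covs"]]
    for obj, (R, t) in zip(objects, poses):
        m, c = object_to_world(obj["means"], obj["covs"], R, t)
        means.append(m)
        covs.append(c)
    return np.concatenate(means), np.concatenate(covs)
```

Under these assumptions, an edit such as rotating, translating, or swapping a vehicle amounts to changing the pose or the object entry passed to `compose_scene` before rendering; the background Gaussians are left untouched.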

Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting

2024 | Yunzhi Yan, Haotong Lin, Chenxu Zhou, Weijie Wang, Haiyang Sun, Kun Zhan, Xianpeng Lang, Xiaowei Zhou, Sida Peng