[slides] Generative Gaussian Splatting for Unbounded 3D City Generation

**GaussianCity: Generative Gaussian Splatting for Unbounded 3D City Generation**

**Authors:** Haozhe Xie, Zhaoxi Chen, Fangzhou Hong, Ziwei Liu

**Institution:** S-Lab, Nanyang Technological University

**Abstract:** This paper introduces GaussianCity, a novel generative framework for creating unbounded 3D cities using 3D Gaussian splatting (3D-GS). Traditional methods like NeRF-based approaches are computationally inefficient for large-scale city generation, while 3D-GS offers a more efficient alternative. However, adapting 3D-GS for infinite-scale 3D cities is challenging due to significant storage overhead. GaussianCity addresses this issue by introducing two key innovations: 1) BEV-Point, a highly compact intermediate representation that ensures constant VRAM usage for unbounded scenes; 2) a spatial-aware BEV-Point decoder that produces 3D Gaussian attributes, leveraging Point Serializer to integrate structural and contextual characteristics of BEV points. Extensive experiments on the GoogleEarth and KITTI-360 datasets demonstrate that GaussianCity achieves state-of-the-art results in both drone-view and street-view 3D city generation, with a speedup of 60 times compared to CityDreamer.

**Contributions:**

1. The first generative 3D-GS model for unbounded 3D city generation.
2. Introduction of BEV-Point, a compact representation ensuring constant VRAM usage.
3. BEV-Point Decoder that captures structural and contextual characteristics of unstructured BEV points.

**Methods:**

- **BEV-Point Initialization:** Generates visible BEV points from a local patch of BEV maps, ensuring constant VRAM usage.
- **BEV-Point Feature Generation:** Divides features into instance attributes, BEV-Point attributes, and style look-up table.
- **BEV-Point Decoding:** Comprises positional encoder, point serializer, point transformer, modulated MLP, and Gaussian rasterizer to generate Gaussian attributes.

**Evaluation:**

- **Datasets:** GoogleEarth and KITTI-360.
- **Metrics:** FID, KID, Depth Error, Camera Error, and Runtime.
- **Results:** GaussianCity outperforms existing methods in visual quality and efficiency, with a 60-fold speedup compared to CityDreamer.

**Limitations:**

- Assumption of Manhattan geometry for building height.
- Limited attribute prediction beyond RGB in 3D-GS.

**Conclusion:** GaussianCity is a significant advancement in 3D city generation, offering both high realism and efficiency.
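
The constant-VRAM claim for BEV-Point Initialization can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the BEV maps are a 2D height map and a 2D semantic map, that label `0` means "empty", and that one surface point is emitted per non-empty cell; the paper's visibility test is not reproduced. The names `init_bev_points` and `flat_city` are hypothetical. The point count depends only on the patch size, not on the extent of the city.

```python
def init_bev_points(height_map, semantic_map, center, patch=8):
    """Emit one (x, y, z, label) point per non-empty cell in a fixed-size window.

    Hypothetical helper: the window is patch x patch cells around `center`
    (row, col), so at most patch * patch points are ever produced.
    """
    cy, cx = center
    half = patch // 2
    rows, cols = len(height_map), len(height_map[0])
    points = []
    for y in range(cy - half, cy - half + patch):
        for x in range(cx - half, cx - half + patch):
            # Cells that fall outside the map are treated as empty.
            if not (0 <= y < rows and 0 <= x < cols):
                continue
            label = semantic_map[y][x]
            if label == 0:  # assumption: label 0 marks an empty cell
                continue
            points.append((x, y, height_map[y][x], label))
    return points


def flat_city(size, height=3, label=1):
    """Toy BEV maps: a size x size city with uniform height and label."""
    heights = [[height] * size for _ in range(size)]
    labels = [[label] * size for _ in range(size)]
    return heights, labels


small = init_bev_points(*flat_city(32), center=(16, 16), patch=8)
large = init_bev_points(*flat_city(512), center=(256, 256), patch=8)
print(len(small), len(large))  # same count for both city sizes
```

Because the loop visits only the local window, memory for the emitted points stays bounded by `patch * patch` however large the underlying maps grow.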
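
The summary names a point serializer in the decoder but does not say how points are ordered. One plausible reading, sketched below, is that unstructured points are sorted along a space-filling curve so that spatial neighbors end up close together in the sequence passed to the point transformer. The Z-order (Morton) curve used here is an assumption, as are the names `morton3d` and `serialize` and the requirement that coordinates are non-negative.

```python
def morton3d(x, y, z, bits=10):
    """Interleave the low `bits` bits of x, y, z into one Z-order code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


def serialize(points, grid_size=1.0, bits=10):
    """Return point indices sorted along the Z-order curve.

    Assumes non-negative coordinates; each point is snapped to a grid
    cell of side `grid_size` before its code is computed.
    """
    def key(i):
        x, y, z = (int(c // grid_size) for c in points[i])
        return morton3d(x, y, z, bits)

    return sorted(range(len(points)), key=key)


points = [(2, 0, 0), (0, 0, 1), (1, 0, 0), (0, 1, 0), (0, 0, 0)]
order = serialize(points)
print(order)  # [4, 2, 3, 1, 0]
```

Returning indices rather than reordered points keeps per-point features aligned: the same permutation can be applied to any attribute array.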

GaussianCity: Generative Gaussian Splatting for Unbounded 3D City Generation

10 Jun 2024 | Haozhe Xie, Zhaoxi Chen, Fangzhou Hong, Ziwei Liu