10 Jun 2024 | Haozhe Xie, Zhaoxi Chen, Fangzhou Hong, Ziwei Liu
**GaussianCity: Generative Gaussian Splatting for Unbounded 3D City Generation**
**Authors:** Haozhe Xie, Zhaoxi Chen, Fangzhou Hong, Ziwei Liu
**Institution:** S-Lab, Nanyang Technological University
**Abstract:**
This paper introduces GaussianCity, a generative framework for creating unbounded 3D cities with 3D Gaussian splatting (3D-GS). NeRF-based methods are computationally inefficient for large-scale city generation, and 3D-GS offers a more efficient alternative. However, scaling 3D-GS to unbounded 3D cities is challenging because of its significant storage overhead. GaussianCity addresses this with two key innovations: 1) BEV-Point, a highly compact intermediate representation that keeps VRAM usage constant for unbounded scenes; 2) a spatial-aware BEV-Point decoder that produces 3D Gaussian attributes, using a point serializer to integrate the structural and contextual characteristics of BEV points. Extensive experiments on the GoogleEarth and KITTI-360 datasets show that GaussianCity achieves state-of-the-art results in both drone-view and street-view 3D city generation, with a 60-fold speedup over CityDreamer.
**Contributions:**
1. The first generative 3D-GS model for unbounded 3D city generation.
2. Introduction of BEV-Point, a compact representation ensuring constant VRAM usage.
3. BEV-Point Decoder that captures structural and contextual characteristics of unstructured BEV points.
**Methods:**
- **BEV-Point Initialization:** Generates visible BEV points from a local patch of BEV maps, ensuring constant VRAM usage.
- **BEV-Point Feature Generation:** Divides features into instance attributes, BEV-Point attributes, and a style look-up table.
- **BEV-Point Decoding:** Combines a positional encoder, point serializer, point transformer, and modulated MLP to generate Gaussian attributes, which a Gaussian rasterizer then renders.
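
The initialization and serialization steps above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`bev_points_in_patch`, `morton_key`, `serialize`), the integer height map, and the use of a Z-order (Morton) curve as the serialization order are all assumptions made for illustration, and the visibility filtering mentioned in the summary is omitted. The sketch only shows why working on a fixed-size local patch bounds the number of points regardless of how large the full BEV map is.

```python
from typing import List, Tuple

Point = Tuple[int, int, int]


def bev_points_in_patch(height_map: List[List[int]], cx: int, cy: int, half: int) -> List[Point]:
    """Extrude the BEV cells of a square patch centred on (cx, cy) into 3D points.

    Hypothetical helper: the point count depends only on the patch size and
    the heights inside it, not on the size of the full map.
    """
    rows, cols = len(height_map), len(height_map[0])
    points: List[Point] = []
    for y in range(max(0, cy - half), min(rows, cy + half + 1)):
        for x in range(max(0, cx - half), min(cols, cx + half + 1)):
            # One point per unit of height above this BEV cell.
            for z in range(height_map[y][x]):
                points.append((x, y, z))
    return points


def morton_key(p: Point, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y, z into a single Z-order key."""
    x, y, z = p
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i + 2)
    return key


def serialize(points: List[Point]) -> List[Point]:
    """Order unstructured points along the Z-order curve (assumed ordering)."""
    return sorted(points, key=morton_key)
```

Calling `bev_points_in_patch` with the same patch size on a small map and on a much larger one returns the same number of points, which is the property the constant-VRAM claim relies on. Sorting by `morton_key` places spatially nearby points close together in the resulting sequence, which is the kind of structure a sequence-based point transformer can exploit.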
**Evaluation:**
- **Datasets:** GoogleEarth and KITTI-360.
- **Metrics:** FID, KID, Depth Error, Camera Error, and Runtime.
- **Results:** GaussianCity outperforms existing methods in visual quality and efficiency, with a 60-fold speedup compared to CityDreamer.
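
For reference, FID is conventionally defined as the Fréchet distance between two Gaussians fitted to Inception features of real and generated images. The symbols below are the standard ones and are not taken from the summary: $\mu_r, \Sigma_r$ are the feature mean and covariance for real images, and $\mu_g, \Sigma_g$ are those for generated images. Lower values indicate closer feature distributions.

$$
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
$$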
**Limitations:**
- Assumption of Manhattan geometry for building height.
- Limited prediction of 3D-GS attributes beyond RGB.
**Conclusion:**
GaussianCity is a significant advancement in 3D city generation, offering both high realism and efficiency.