[slides] Gamba: Marry Gaussian Splatting with Mamba for Single-View 3D Reconstruction

Gamba is an end-to-end model that reconstructs a 3D asset from a single image. It combines 3D Gaussian Splatting (3DGS) with Mamba, a linear-time sequence modeling architecture, to achieve fast, high-quality reconstruction. Its key innovations are twofold. First, an efficient backbone, the Mamba-based GambaFormer network, scales linearly with token length and can therefore accommodate a large number of Gaussians. Second, robust Gaussian constraints derived from multi-view masks eliminate the need for warmup supervision from 3D point clouds during training. Gamba is trained on the Objaverse dataset and evaluated on the GSO dataset against existing optimization-based and feed-forward 3D reconstruction methods.
Experimental results show that Gamba generates high-quality 3D assets in 0.05 seconds on a single NVIDIA A100 GPU, about 1,000 times faster than optimization-based methods. Gamba outperforms other state-of-the-art methods in both reconstruction quality and speed, demonstrating its effectiveness for single-view 3D reconstruction.
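To make the "linear-time" property concrete, here is a minimal sketch of a diagonal linear state-space recurrence, the general idea that Mamba builds on. It is not Gamba's GambaFormer or Mamba's actual selective scan: the function name `linear_scan` and the fixed parameters `a`, `b`, `c` are assumptions for illustration, and real Mamba makes these parameters input-dependent and uses a hardware-aware parallel scan.

```python
def linear_scan(x, a, b, c):
    """Illustrative diagonal state-space recurrence (hypothetical helper).

    For each token x_t:
        h_t = a * h_{t-1} + b * x_t   (elementwise over the state)
        y_t = sum(c * h_t)
    The sequence is visited once, so the cost grows linearly with its length.
    """
    h = [0.0] * len(a)  # hidden state, one entry per state dimension
    y = []
    for x_t in x:
        # Update every state dimension from the previous state and the new token.
        h = [a_i * h_i + b_i * x_t for a_i, h_i, b_i in zip(a, h, b)]
        # Read out a scalar output for this token.
        y.append(sum(c_i * h_i for c_i, h_i in zip(c, h)))
    return y


# A unit impulse decays geometrically through a one-dimensional state.
outputs = linear_scan([1.0, 0.0, 0.0], a=[0.5], b=[1.0], c=[1.0])
```

Because each token touches a fixed-size state exactly once, doubling the number of tokens doubles the work, whereas standard self-attention compares every token pair and grows quadratically.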

Gamba: Marry Gaussian Splatting with Mamba for Single-View 3D Reconstruction

24 May 2024 | Qiuhong Shen, Zike Wu, Xuanyu Yi, Pan Zhou, Hanwang Zhang, Shuicheng Yan, Xinchao Wang