GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation

21 Mar 2024 | Yinghao Xu1*, Zifan Shi1,2*, Wang Yifan1, Hansheng Chen1, Ceyuan Yang3, Sida Peng4, Yujun Shen5, and Gordon Wetzstein1
The paper introduces the Gaussian Reconstruction Model (GRM), a feed-forward transformer-based model designed for efficient 3D reconstruction and generation. GRM leverages sparse-view images to recover 3D scenes by translating input pixels into pixel-aligned Gaussians, which are then unprojected to create a dense set of 3D Gaussians representing the scene. This approach, combined with a transformer architecture, enables a scalable and efficient reconstruction framework. Experimental results demonstrate that GRM outperforms existing methods in both reconstruction quality and speed, achieving state-of-the-art performance in sparse-view reconstruction, single-image-to-3D generation, and text-to-3D generation. The model's ability to handle sparse yet well-distributed views allows for high-fidelity reconstruction, making it suitable for various applications such as robotics, gaming, and architecture. The project website is available at <https://justimyhxu.github.io/projects/grm/>.
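The unprojection step mentioned above can be illustrated with a small sketch. The summary does not state how GRM parameterizes its Gaussians, so everything here is an assumption made for illustration: a pinhole camera, a per-pixel depth value standing in for the network's output, and a known camera-to-world pose. The function name `unproject_pixels` and its arguments are hypothetical and are not taken from the paper or its code. The sketch computes only the Gaussian centers, one per input pixel, and merges the centers from several views into one dense set.

```python
import numpy as np


def unproject_pixels(depth, K, cam_to_world):
    """Lift an (H, W) depth map to (H*W, 3) world-space points.

    Hypothetical helper, not GRM's actual interface.
    depth:        per-pixel depth along the camera z-axis (assumed model output)
    K:            3x3 pinhole intrinsics
    cam_to_world: 4x4 camera-to-world transform
    """
    h, w = depth.shape
    # Pixel-centre coordinates: u runs along the width, v along the height.
    u, v = np.meshgrid(np.arange(w) + 0.5, np.arange(h) + 0.5)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    # Back-project to camera-space rays with z = 1, then scale by depth.
    rays = pix @ np.linalg.inv(K).T
    pts_cam = rays * depth.reshape(-1, 1)
    # Apply the rigid camera-to-world transform.
    return pts_cam @ cam_to_world[:3, :3].T + cam_to_world[:3, 3]


def merge_views(depths, Ks, poses):
    """Concatenate the per-view centers into one dense point set."""
    return np.concatenate(
        [unproject_pixels(d, K, T) for d, K, T in zip(depths, Ks, poses)],
        axis=0,
    )


if __name__ == "__main__":
    # Four toy 4x4 views sharing one intrinsics matrix (illustrative values).
    K = np.array([[2.0, 0.0, 1.5], [0.0, 2.0, 1.5], [0.0, 0.0, 1.0]])
    depths = [np.full((4, 4), 2.0) for _ in range(4)]
    poses = [np.eye(4) for _ in range(4)]
    centers = merge_views(depths, [K] * 4, poses)
    print(centers.shape)  # one center per input pixel across all views
```

Because each pixel yields one Gaussian, the number of Gaussians grows with the input resolution and the number of views, which is what makes the resulting set dense. A full model would also predict the remaining Gaussian attributes per pixel; those are left out of this sketch.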