LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation

7 Feb 2024 | Jiaxiang Tang1*, Zhaoxi Chen2, Xiaokang Chen1, Tengfei Wang3, Gang Zeng1, and Ziwei Liu2
The paper introduces the Large Multi-View Gaussian Model (LGM), a novel framework for generating high-resolution 3D models from text prompts or single-view images. The key contributions of LGM are two-fold: 1) using multi-view Gaussian features as an efficient and powerful representation for 3D models, and 2) employing an asymmetric U-Net as a high-throughput backbone for multi-view image processing. LGM addresses the challenges of high-resolution training and efficient 3D representation by leveraging multi-view diffusion models to generate multi-view images from text or images, which are then used to train the U-Net. The U-Net predicts and fuses 3D Gaussians from these multi-view images, achieving high-resolution 3D content generation within 5 seconds. Extensive experiments demonstrate the superior quality, resolution, and efficiency of LGM in both text-to-3D and image-to-3D tasks. The method also includes data augmentation techniques and a mesh extraction algorithm to enhance robustness and convert the generated 3D Gaussians into smooth textured meshes.
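The step where the U-Net's per-pixel outputs become a single 3D Gaussian set can be sketched as follows. This is a minimal illustration in NumPy rather than the paper's PyTorch implementation; the 14-channel layout (xyz, opacity, scale, rotation quaternion, RGB), the activation functions, and fusion by simple concatenation across views are all assumptions made for clarity, not the paper's exact design.

```python
import numpy as np

def fuse_view_gaussians(view_maps):
    """Turn per-view U-Net feature maps into one flat set of 3D Gaussians.

    Each map has shape (H, W, 14) with an assumed channel layout:
    xyz (3), opacity (1), scale (3), rotation quaternion (4), rgb (3).
    Every pixel of every view contributes one Gaussian, so fusion is
    concatenation of all views' pixels into a single point set.
    """
    raw = np.concatenate([m.reshape(-1, 14) for m in view_maps], axis=0)

    xyz = raw[:, 0:3]                                  # position, left unbounded
    opacity = 1.0 / (1.0 + np.exp(-raw[:, 3:4]))       # sigmoid -> (0, 1)
    scale = np.exp(raw[:, 4:7])                        # exp -> positive scales
    quat = raw[:, 7:11]
    quat = quat / np.linalg.norm(quat, axis=1, keepdims=True)  # unit quaternion
    rgb = 1.0 / (1.0 + np.exp(-raw[:, 11:14]))         # sigmoid -> (0, 1) color

    return np.concatenate([xyz, opacity, scale, quat, rgb], axis=1)
```

With four 256x256 input views, this yields 4 x 256 x 256 = 262,144 Gaussians, which are then rendered differentiably for supervision against the ground-truth views.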