27 Apr 2022 | Eric R. Chan, Connor Z. Lin, Matthew A. Chan, Koki Nagano, Boxiao Pan, Shalini De Mello, Orazio Gallo, Leonidas Guibas, Jonathan Tremblay, Sameh Khamis, Tero Karras, and Gordon Wetzstein
This paper introduces an efficient geometry-aware 3D generative adversarial network (3D GAN) that improves computational efficiency and image quality without relying heavily on approximations. The framework uses a hybrid explicit-implicit tri-plane representation that combines the strengths of explicit and implicit 3D representations, enabling fast rendering and high-quality 3D geometry synthesis. By decoupling feature generation from neural rendering, it can leverage state-of-the-art 2D CNN generators such as StyleGAN2. Two further components complete the design: dual discrimination, which keeps the neural rendering consistent with the final super-resolved output, and pose-based conditioning of the generator, which decouples pose-correlated attributes (e.g., facial expression) so they can be controlled during inference.

Evaluated on the FFHQ and AFHQ Cats datasets, the method achieves state-of-the-art 3D-aware image synthesis, producing high-resolution, multi-view-consistent images and high-quality 3D shapes, and it outperforms prior work in image quality, view consistency, and geometry accuracy. Rendering runs in real time at 512x512 resolution, and the approach supports applications such as single-view 3D reconstruction and style mixing. The paper highlights the importance of 3D-grounded inductive biases for high-quality 3D-aware image synthesis and closes with a discussion of limitations and future directions.
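The tri-plane representation mentioned above stores features on three axis-aligned 2D planes; a 3D point is projected onto each plane, the three bilinearly interpolated feature vectors are aggregated by summation, and a small MLP decodes them into color and density for volume rendering. The snippet below is a minimal sketch of that sampling step, assuming PyTorch; the tensor shapes, coordinate conventions, and function names are illustrative choices, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def sample_triplane(planes: torch.Tensor, points: torch.Tensor) -> torch.Tensor:
    """Sample aggregated tri-plane features for a batch of 3D points.

    planes: (3, C, H, W) feature maps for the XY, XZ, and YZ planes (assumed layout).
    points: (N, 3) query coordinates, assumed normalized to [-1, 1]^3.
    Returns: (N, C) summed features, ready for a small MLP decoder.
    """
    # Project each 3D point onto the three axis-aligned planes.
    xy = points[:, [0, 1]]
    xz = points[:, [0, 2]]
    yz = points[:, [1, 2]]

    feats = []
    for plane, coords in zip(planes, (xy, xz, yz)):
        # Treat the N points as a 1 x N sampling grid and bilinearly
        # interpolate the plane's features at those locations.
        grid = coords.view(1, 1, -1, 2)                          # (1, 1, N, 2)
        f = F.grid_sample(plane.unsqueeze(0), grid,
                          mode='bilinear', align_corners=False)  # (1, C, 1, N)
        feats.append(f.view(plane.shape[0], -1).t())             # (N, C)

    # Aggregate by summation; a lightweight MLP (omitted here) would map the
    # result to color and density for volume rendering.
    return feats[0] + feats[1] + feats[2]
```

Because the expensive computation lives in the 2D planes (produced by a StyleGAN2-style backbone) while the per-point decoder stays tiny, querying the many samples needed per rendered frame remains far cheaper than evaluating a large fully implicit MLP.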
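Dual discrimination ties the low-resolution neural rendering to the final super-resolved image: the discriminator never sees the final image alone but rather the final image concatenated with the raw rendering upsampled to the same resolution (real images are paired with an appropriately blurred copy of themselves). Below is a minimal sketch of how that six-channel discriminator input might be assembled, again assuming PyTorch and with illustrative resolutions.

```python
import torch
import torch.nn.functional as F

def dual_discriminator_input(final_img: torch.Tensor, raw_img: torch.Tensor) -> torch.Tensor:
    """Build the concatenated input for dual discrimination.

    final_img: (B, 3, 512, 512) super-resolved generator output.
    raw_img:   (B, 3, 128, 128) image rendered directly by the neural volume
               renderer (resolutions here are illustrative).
    Returns:   (B, 6, 512, 512) tensor fed to the discriminator.
    """
    # Upsample the raw neural rendering to the final resolution and stack it
    # with the final image along the channel dimension.
    raw_up = F.interpolate(raw_img, size=final_img.shape[-2:],
                           mode='bilinear', align_corners=False)
    return torch.cat([final_img, raw_up], dim=1)
```

Penalizing any mismatch between the two stacked images discourages the super-resolution module from inventing detail that the underlying 3D representation cannot explain, which is what preserves multi-view consistency at the final 512x512 output resolution.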