30 Mar 2021 | Katja Schwarz*, Yiyi Liao*, Michael Niemeyer, Andreas Geiger
The paper "GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis" addresses the challenge of generating high-resolution images with precise control over camera viewpoint and object pose, which is a significant gap in current 2D generative adversarial networks (GANs). The authors propose a generative model based on radiance fields, which are continuous representations that map 3D locations and viewing directions to color values and volume densities. This approach allows for disentangling camera and scene properties, providing better multi-view consistency, and handling reconstruction ambiguity gracefully. The model is trained using a multi-scale patch-based discriminator, enabling the synthesis of high-resolution images from unposed 2D images alone. The method is evaluated on both synthetic and real-world datasets, demonstrating superior performance in terms of visual fidelity and 3D consistency compared to state-of-the-art methods. The authors also discuss the importance of avoiding learned projections and the role of the multi-scale discriminator in achieving high-quality results. The paper concludes by highlighting the broader impact of 3D-aware image synthesis on applications such as virtual reality, data augmentation, and robotics, while also acknowledging the potential risks of generating misleading content.The paper "GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis" addresses the challenge of generating high-resolution images with precise control over camera viewpoint and object pose, which is a significant gap in current 2D generative adversarial networks (GANs). The authors propose a generative model based on radiance fields, which are continuous representations that map 3D locations and viewing directions to color values and volume densities. This approach allows for disentangling camera and scene properties, providing better multi-view consistency, and handling reconstruction ambiguity gracefully. The model is trained using a multi-scale patch-based discriminator, enabling the synthesis of high-resolution images from unposed 2D images alone. The method is evaluated on both synthetic and real-world datasets, demonstrating superior performance in terms of visual fidelity and 3D consistency compared to state-of-the-art methods. The authors also discuss the importance of avoiding learned projections and the role of the multi-scale discriminator in achieving high-quality results. The paper concludes by highlighting the broader impact of 3D-aware image synthesis on applications such as virtual reality, data augmentation, and robotics, while also acknowledging the potential risks of generating misleading content.