25 Feb 2019 | Andrew Brock*,† Heriot-Watt University, ajb5@hw.ac.uk Jeff Donahue† DeepMind, jeffdonahue@google.com Karen Simonyan† DeepMind, simonyan@google.com
This paper presents a large-scale Generative Adversarial Network (GAN) training approach for high-fidelity natural image synthesis. The authors train GANs at an unprecedented scale, achieving state-of-the-art results on the ImageNet dataset. They introduce several key improvements, including orthogonal regularization, which enables a "truncation trick" for fine-grained control over the trade-off between sample fidelity and variety. Their models, called BigGANs, achieve an Inception Score (IS) of 166.5 and Fréchet Inception Distance (FID) of 7.4 when trained on ImageNet at 128x128 resolution, significantly outperforming previous results. The models are also trained at higher resolutions (256x256 and 512x512), achieving even better performance metrics. The authors analyze the instabilities that arise in large-scale GAN training and propose techniques to mitigate them. They find that while large-scale training can improve performance, it also introduces challenges that require careful management. The paper also evaluates the models on a larger dataset, JFT-300M, demonstrating that their design choices transfer well to different datasets. The results show that their approach significantly improves the quality and diversity of generated images, setting a new benchmark for GAN-based image synthesis.
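The two techniques named above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: the function names are invented, `beta` is a small illustrative coefficient, and the real models apply the regularizer to convolutional weight matrices during training.

```python
import numpy as np

def ortho_reg(W, beta=1e-4):
    """Orthogonal regularization (illustrative sketch).

    Penalizes the off-diagonal entries of W^T W, pushing the columns
    (filters) of W toward mutual orthogonality without constraining
    their norms:  R(W) = beta * || (W^T W) * (1 - I) ||_F^2
    """
    gram = W.T @ W
    off_diag = gram * (1.0 - np.eye(gram.shape[0]))
    return beta * np.sum(off_diag ** 2)

def truncated_z(n, dim, threshold, seed=None):
    """Truncation trick (illustrative sketch): draw latent vectors from
    a standard normal and resample every component whose magnitude
    exceeds `threshold` until all components fall within range."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, dim))
    mask = np.abs(z) > threshold
    while mask.any():
        z[mask] = rng.standard_normal(mask.sum())
        mask = np.abs(z) > threshold
    return z
```

A smaller threshold narrows the latent distribution, trading sample variety for fidelity, while a very large threshold recovers the untruncated prior; the paper's point is that orthogonal regularization during training makes the generator behave well under this truncated sampling.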