25 Feb 2019 | Andrew Brock (Heriot-Watt University, ajb5@hw.ac.uk), Jeff Donahue (DeepMind, jeffdonahue@google.com), Karen Simonyan (DeepMind, simonyan@google.com)
This paper addresses the challenge of generating high-fidelity, diverse samples from complex datasets like ImageNet using Generative Adversarial Networks (GANs). The authors train GANs at unprecedented scales, achieving state-of-the-art performance in class-conditional image synthesis. Key contributions include:
1. **Scaling Up GANs**: The authors demonstrate that GANs benefit significantly from scaling, training models with up to four times more parameters and eight times larger batch sizes than prior work. They introduce architectural changes and regularization techniques that improve scalability and conditioning (one such regularizer is sketched after this list).
2. **Truncation Trick**: The models become amenable to the "truncation trick," a sampling technique that gives fine control over the trade-off between sample variety and fidelity by reducing the variance of the generator's input (see the sampling sketch after this list).
3. **Stability Analysis**: The authors identify and characterize instabilities specific to large-scale GANs, finding that they arise from the interaction between the generator and the discriminator. They show that a combination of novel and existing techniques can mitigate these instabilities, but that complete training stability comes only at a substantial cost to performance.
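For context on item 1, one regularizer the paper applies to the generator's weights is a modified Orthogonal Regularization that penalizes only the off-diagonal entries of the weight Gram matrix, pushing filters toward orthogonality without constraining their norms. Below is a minimal PyTorch-style sketch under that reading; the function name and the default `beta` are illustrative, not taken from the paper's code.

```python
import torch

def modified_ortho_reg(weight, beta=1e-4):
    """Modified Orthogonal Regularization: penalize off-diagonal entries of the
    Gram matrix of the flattened filters, leaving their norms unconstrained."""
    w = weight.view(weight.size(0), -1)                # flatten e.g. conv kernels to 2-D
    gram = w @ w.t()                                   # pairwise inner products of filters
    eye = torch.eye(gram.size(0), device=gram.device, dtype=gram.dtype)
    return beta * ((gram * (1.0 - eye)) ** 2).sum()    # squared Frobenius norm of the off-diagonal part
```

In practice this term would be added to the generator loss for every weight tensor, with the coefficient chosen by a small sweep.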
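The truncation trick itself amounts to drawing the latent vector z from a truncated normal: any component whose magnitude exceeds a chosen threshold is resampled, and smaller thresholds trade variety for fidelity. A minimal sketch (the function name and default threshold are illustrative):

```python
import numpy as np

def sample_truncated_z(batch_size, dim_z, threshold=0.5, rng=None):
    """Draw z ~ N(0, I) and resample every component whose magnitude exceeds `threshold`."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal((batch_size, dim_z))
    mask = np.abs(z) > threshold
    while mask.any():                                  # rejection-resample out-of-range components
        z[mask] = rng.standard_normal(int(mask.sum()))
        mask = np.abs(z) > threshold
    return z
```

Feeding the generator `sample_truncated_z(..., threshold=t)` with a small `t` yields higher-fidelity, lower-variety samples; letting `t` grow recovers ordinary Gaussian sampling.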
The models, named BigGANs, achieve an Inception Score (IS) of 166.5 and Fréchet Inception Distance (FID) of 7.4 on ImageNet at 128×128 resolution, surpassing previous best scores. They also successfully train on higher resolutions (256×256 and 512×512) and on the larger JFT-300M dataset, demonstrating the effectiveness of their design choices.
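For reference, both reported metrics are computed from a pretrained Inception network; higher IS and lower FID indicate better samples. A minimal sketch of their definitions, assuming pre-computed Inception activations and class probabilities as NumPy arrays (function names are illustrative, and the standard IS protocol additionally averages the score over several splits of the samples):

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_inception_distance(act_real, act_fake):
    """FID = ||mu_r - mu_f||^2 + Tr(S_r + S_f - 2 (S_r S_f)^{1/2}) over Inception activations (N x D)."""
    mu_r, mu_f = act_real.mean(axis=0), act_fake.mean(axis=0)
    sigma_r = np.cov(act_real, rowvar=False)
    sigma_f = np.cov(act_fake, rowvar=False)
    covmean = sqrtm(sigma_r @ sigma_f)
    if np.iscomplexobj(covmean):                       # numerical noise can introduce tiny imaginary parts
        covmean = covmean.real
    return float(np.sum((mu_r - mu_f) ** 2) + np.trace(sigma_r + sigma_f - 2.0 * covmean))

def inception_score(probs, eps=1e-12):
    """IS = exp( E_x[ KL( p(y|x) || p(y) ) ] ) over Inception class probabilities (N x C)."""
    p_y = probs.mean(axis=0, keepdims=True)            # marginal label distribution p(y)
    kl = probs * (np.log(probs + eps) - np.log(p_y + eps))
    return float(np.exp(kl.sum(axis=1).mean()))
```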