31 Jul 2018 | Lars Mescheder, Andreas Geiger, Sebastian Nowozin
This paper investigates the convergence properties of Generative Adversarial Network (GAN) training methods. We show that the assumption of absolutely continuous data and generator distributions is necessary for local convergence of GAN training: we present a simple counterexample demonstrating that, without it, unregularized GAN training does not always converge. We then analyze the convergence properties of various regularization techniques, including instance noise and zero-centered gradient penalties, and show that they lead to local convergence. In contrast, Wasserstein GANs (WGANs) and WGAN-GP with a finite number of discriminator updates per generator update do not always converge to the equilibrium point. Our analysis yields a new explanation for the instability problems observed in GAN training. We extend our convergence results to more general GANs and prove local convergence for simplified gradient penalties even if the generator and data distributions lie on lower-dimensional manifolds. We find these penalties to work well in practice and use them to learn high-resolution generative image models for a variety of datasets with little hyperparameter tuning.
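The simplified zero-centered gradient penalty mentioned above penalizes the squared norm of the discriminator's gradient on real data alone, i.e. a term of the form (gamma / 2) * E[ ||grad_x D(x)||^2 ] added to the discriminator loss. The following is a minimal sketch of such a penalty, assuming a PyTorch setup; `discriminator`, `x_real`, and `gamma` are illustrative placeholders rather than names from the paper's released code.

```python
import torch

def r1_penalty(discriminator, x_real, gamma=10.0):
    """Zero-centered gradient penalty on real data: (gamma / 2) * E[ ||grad_x D(x)||^2 ]."""
    x_real = x_real.detach().requires_grad_(True)
    d_real = discriminator(x_real)
    # Gradient of the (summed) discriminator outputs w.r.t. the real inputs;
    # create_graph=True so the penalty itself can be backpropagated through.
    (grad_real,) = torch.autograd.grad(
        outputs=d_real.sum(), inputs=x_real, create_graph=True
    )
    grad_norm_sq = grad_real.pow(2).reshape(grad_real.size(0), -1).sum(dim=1)
    return 0.5 * gamma * grad_norm_sq.mean()
```

In training, a term like this would typically be added to the discriminator loss on real batches only; the companion variant discussed in the paper applies the same penalty to generated samples instead.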