5 Dec 2019 | Ilya Tolstikhin, Olivier Bousquet, Sylvain Gelly, and Bernhard Schölkopf
The paper introduces the Wasserstein Auto-Encoder (WAE), a new algorithm for building a generative model of the data distribution. WAE minimizes a penalized form of the Wasserstein distance between the model distribution and the target distribution, which leads to a regularizer different from the one used by Variational Auto-Encoders (VAEs): it encourages the encoded training distribution as a whole to match the prior, rather than constraining each encoded point individually. The authors compare WAE with other techniques and show that it generalizes Adversarial Auto-Encoders (AAEs). Experiments on the MNIST and CelebA datasets demonstrate that WAE retains the good properties of VAEs (stable training, an encoder-decoder architecture, and a nice latent manifold structure) while generating samples of better quality, as measured by the Fréchet Inception Distance (FID) score. The paper proposes two different regularizers for WAE: one based on GANs (WAE-GAN) and one based on the maximum mean discrepancy (WAE-MMD). The theoretical analysis shows that the primal form of the Wasserstein distance is equivalent to an optimization over probabilistic encoders whose aggregated code distribution matches the prior, which motivates the penalized objective.
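To make the objective concrete, here is a minimal PyTorch sketch of the WAE-MMD variant: a reconstruction cost plus a penalty that pushes the distribution of encoded codes toward the prior. The RBF kernel, the bandwidth `sigma2`, the penalty weight `lam`, and the standard-normal prior are illustrative assumptions for this sketch, not the paper's exact settings (the paper also discusses inverse multiquadratic kernels).

```python
import torch

def mmd_penalty(z_encoded, z_prior, sigma2=1.0):
    """Unbiased MMD^2 estimate between encoded codes and prior samples.

    RBF kernel with hypothetical bandwidth `sigma2`; any characteristic
    kernel would do in principle.
    """
    n = z_encoded.size(0)

    def kernel(a, b):
        # Pairwise squared Euclidean distances, then RBF kernel values.
        d2 = torch.cdist(a, b) ** 2
        return torch.exp(-d2 / (2.0 * sigma2))

    k_qq = kernel(z_encoded, z_encoded)
    k_pp = kernel(z_prior, z_prior)
    k_qp = kernel(z_encoded, z_prior)

    # Drop diagonal terms for the unbiased within-sample estimates.
    off_diag = 1.0 - torch.eye(n, device=z_encoded.device)
    return ((k_qq * off_diag).sum() + (k_pp * off_diag).sum()) / (n * (n - 1)) \
        - 2.0 * k_qp.mean()

def wae_mmd_loss(x, encoder, decoder, lam=10.0):
    """Penalized objective: c(x, G(Q(x))) + lam * MMD(Q_Z, P_Z),
    with a squared-error cost c. `lam` and the N(0, I) prior are
    illustrative choices."""
    z = encoder(x)                 # deterministic encoder Q
    x_rec = decoder(z)             # decoder G
    rec = ((x - x_rec) ** 2).flatten(1).sum(dim=1).mean()
    z_prior = torch.randn_like(z)  # samples from the prior P_Z
    return rec + lam * mmd_penalty(z, z_prior)
```

Note how the penalty compares the whole batch of codes against a batch of prior samples; this is what lets WAE use deterministic encoders, in contrast to the per-point KL term of a VAE.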