23 Jul 2018 | Ming-Yu Liu, Thomas Breuel, Jan Kautz
This paper presents an unsupervised image-to-image translation framework called UNIT, which uses a shared-latent space assumption and Coupled GANs to learn a joint distribution of images in different domains. The framework is designed to translate images between domains without requiring corresponding images in the training data. UNIT is evaluated on various tasks, including street scene, animal, and face image translation, and is applied to domain adaptation, achieving state-of-the-art performance on benchmark datasets.
The framework consists of two domain-specific image encoders, two domain-specific image generators, and two adversarial discriminators, one per domain. The encoders map images to a latent space, while the generators map latent codes back to images. The shared-latent space assumption posits that a pair of corresponding images in the two domains can be mapped to the same latent code. This assumption is enforced through weight-sharing constraints between the encoders and between the generators.
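The weight-sharing idea can be sketched in a few lines of PyTorch. This is an illustrative toy, not the authors' implementation: the module names (`Encoder`, `Generator`), layer sizes, and the choice to share a single 1×1 convolution near the latent space are all assumptions made for brevity. The key point is that the shared blocks are instantiated once and handed to both domains, so the corresponding weights are literally the same tensors.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of UNIT's weight-sharing constraint (not the authors'
# code): each domain keeps its own low-level layers, while the layers
# closest to the latent space are shared between the two domains.

class Encoder(nn.Module):
    def __init__(self, shared):
        super().__init__()
        self.private = nn.Sequential(            # domain-specific layers
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.shared = shared                     # layers shared across domains

    def forward(self, x):
        return self.shared(self.private(x))      # mean of q(z|x)

class Generator(nn.Module):
    def __init__(self, shared):
        super().__init__()
        self.shared = shared                     # layers shared across domains
        self.private = nn.Sequential(            # domain-specific layers
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, z):
        return self.private(self.shared(z))

# Shared blocks are created once and passed to both domain networks.
shared_enc = nn.Conv2d(128, 256, 1)
shared_gen = nn.ConvTranspose2d(256, 128, 1)
E1, E2 = Encoder(shared_enc), Encoder(shared_enc)
G1, G2 = Generator(shared_gen), Generator(shared_gen)

x1 = torch.randn(1, 3, 32, 32)                   # image from domain 1
z = E1(x1)                                       # shared latent code
x12 = G2(z)                                      # translation into domain 2
```

Because `E1.shared` and `E2.shared` refer to the same module, any gradient update through one domain's path also moves the other's, which is what ties the two latent spaces together.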
The framework also incorporates cycle-consistency constraints, which ensure that translating an image from one domain to another and back results in the original image. This helps regularize the learning process and improve translation quality. The framework is trained using a combination of VAE and GAN objectives, with the VAE objective minimizing the KL divergence between the latent distribution and a prior, and the GAN objective ensuring that generated images are realistic.
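Two of the loss terms above have simple closed forms and can be sketched directly. The notation here is illustrative rather than the paper's exact formulation: with an encoder posterior fixed to unit covariance, q(z|x) = N(mu, I), the KL divergence against the prior N(0, I) reduces to 0.5·||mu||², and the cycle-consistency term is just a reconstruction penalty on the round-tripped image.

```python
import torch
import torch.nn.functional as F

# Hedged sketch of two loss terms from the combined objective (illustrative
# notation, not the paper's exact weighting or formulation).

def kl_to_standard_normal(mu):
    # KL( N(mu, I) || N(0, I) ) = 0.5 * sum(mu^2) when covariance is fixed to I:
    # the trace and log-determinant terms cancel, leaving only the mean term.
    return 0.5 * (mu ** 2).sum()

def cycle_loss(x, x_cycled):
    # Penalize deviation after translating an image to the other domain
    # and translating it back (L1 chosen here for illustration).
    return F.l1_loss(x_cycled, x)

mu = torch.zeros(8)
print(kl_to_standard_normal(mu).item())          # 0.0: prior matched exactly
```

The adversarial term has no such closed form; it is the usual GAN discriminator/generator objective applied to each domain's translated outputs.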
The framework is evaluated on several tasks, including translating between satellite images and maps, between synthetic and real street scenes, between dog breeds, between cat species, and between face images with different attributes. The results show that the framework achieves high-quality translations and performs well in domain adaptation tasks.
The paper also discusses related work, including other unsupervised image-to-image translation methods and domain adaptation approaches. It highlights the importance of the shared-latent space assumption and cycle-consistency constraints in achieving effective image translation. The framework is implemented in code and additional results are available on GitHub.