Learning to Discover Cross-Domain Relations with Generative Adversarial Networks
15 May 2017 | Taeksoo Kim, Moonsu Cha, Hyunsoo Kim, Jung Kwon Lee, Jiwon Kim
This paper introduces DiscoGAN, a generative adversarial network (GAN) that learns to discover cross-domain relations without requiring explicit pair labels. The key idea is to train two GANs that map between two domains, ensuring that the generated images can be reconstructed from the original domain and are realistic in the target domain. This approach allows the model to learn a bijective mapping between the two domains, enabling effective image translation while preserving key attributes such as orientation and face identity. The model is trained on unpaired data, making it suitable for scenarios where pairing is impractical or costly.

DiscoGAN uses a reconstruction loss to ensure that generated images can be mapped back to their original domain, and a GAN loss to ensure realism in the target domain. This dual loss function encourages a one-to-one correspondence between the domains, which is crucial for successful cross-domain translation.

Experiments on both toy and real-world datasets demonstrate that DiscoGAN outperforms standard GANs and GANs with reconstruction loss in terms of avoiding mode collapse and maintaining realistic translations. In the toy experiment, DiscoGAN successfully maps between two domains with distinct modes, whereas other models suffer from mode collapse. In real-world experiments, DiscoGAN successfully translates images between domains such as cars, faces, chairs, edges, and photos, preserving key attributes and achieving high-quality results. The model is also tested on tasks such as face attribute conversion, where it successfully changes features like gender and hair color while maintaining other attributes. Additionally, DiscoGAN is shown to handle more complex tasks like translating between visually different domains (e.g., edges to photos) and between unrelated domains (e.g., handbags to shoes), demonstrating its versatility and effectiveness in discovering cross-domain relations.
The results indicate that DiscoGAN is a robust and effective method for learning cross-domain relations using GANs.
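The reconstruction idea at the core of DiscoGAN can be illustrated with a toy sketch: two mappings G_AB and G_BA are trained so that translating A to B and back recovers the original input. The snippet below is a minimal NumPy illustration, not the paper's implementation; it stands in for the neural generators with simple linear maps (names `g_ab`, `g_ba`, and `reconstruction_loss` are ours) and omits the adversarial (GAN) loss term entirely.

```python
import numpy as np

# Toy stand-ins for the two generators: linear maps in place of the
# convolutional networks G_AB (domain A -> B) and G_BA (domain B -> A).
def g_ab(x, w_ab):
    return x @ w_ab

def g_ba(x, w_ba):
    return x @ w_ba

def reconstruction_loss(x_a, w_ab, w_ba):
    """Mean squared distance between x_A and G_BA(G_AB(x_A)).

    This is the reconstruction term that pushes the pair of mappings
    toward a one-to-one (bijective) correspondence; the full DiscoGAN
    objective adds GAN losses in both domains on top of it.
    """
    x_ab = g_ab(x_a, w_ab)    # translate A -> B
    x_aba = g_ba(x_ab, w_ba)  # translate back B -> A
    return float(np.mean((x_a - x_aba) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_a = rng.normal(size=(4, 3))
    w_ab = rng.normal(size=(3, 3))
    w_ba = np.linalg.inv(w_ab)  # exact inverse -> near-zero reconstruction loss
    print(round(reconstruction_loss(x_a, w_ab, w_ba), 6))
```

When `w_ba` exactly inverts `w_ab` the loss is (numerically) zero, mirroring how the reconstruction term rewards mappings that are mutual inverses; with an arbitrary second map the loss stays large, which is the signal the networks minimize during training.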