Learning from Simulated and Unsupervised Images through Adversarial Training


November 15, 2016 | Ashish Shrivastava, Tomas Pfister, Oncel Tuzel, Josh Susskind, Wenda Wang, Russ Webb
This paper proposes Simulated+Unsupervised (S+U) learning, a method that improves the realism of synthetic images using unlabeled real data while preserving their annotation information. The approach uses an adversarial network similar to a Generative Adversarial Network (GAN), but with synthetic images as inputs instead of random vectors. Key modifications to the standard GAN algorithm are a 'self-regularization' term that keeps each refined image close to its synthetic source, a local adversarial loss defined over image patches rather than the whole image, and discriminator updates drawn from a history of refined images rather than only the current mini-batch. These changes preserve annotations, avoid artifacts, and stabilize training.

The method is evaluated on gaze estimation with the MPIIGaze dataset and on hand pose estimation with the NYU hand pose dataset, achieving state-of-the-art results without any labeled real data and significant improvements over training on synthetic images alone. The results indicate that the refined images effectively bridge the gap between the synthetic and real image distributions, so large training sets can be produced without data collection or human annotation.
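The three modifications are easiest to see in code. Below is a minimal PyTorch sketch, not the paper's implementation: the network architectures, the loss weight `lambda_reg`, and the buffer size are illustrative assumptions. It shows a refiner loss that combines a patch-level adversarial term with an L1 self-regularization term, plus an image-history buffer for discriminator updates.

```python
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

class Refiner(nn.Module):
    """Maps a synthetic image to a refined image of the same size (toy architecture)."""
    def __init__(self, channels=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, channels, 1),
        )

    def forward(self, x):
        return self.net(x)

class PatchDiscriminator(nn.Module):
    """Outputs a map of per-patch real/fake logits, giving a local adversarial loss."""
    def __init__(self, channels=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1, 1),  # one logit per spatial patch
        )

    def forward(self, x):
        return self.net(x)

def refiner_loss(D, synthetic, refined, lambda_reg=0.1):
    # Local adversarial term: every patch of the refined image should
    # look "real" to the discriminator.
    patch_logits = D(refined)
    adv = F.binary_cross_entropy_with_logits(
        patch_logits, torch.ones_like(patch_logits))
    # Self-regularization term: keep the refined image close to the
    # synthetic input so its annotation (e.g. gaze direction) stays valid.
    reg = F.l1_loss(refined, synthetic)
    return adv + lambda_reg * reg

class ImageHistory:
    """Buffer of past refined images used when updating the discriminator,
    so it does not forget earlier refiner outputs."""
    def __init__(self, capacity=512):
        self.capacity, self.images = capacity, []

    def sample_and_store(self, batch):
        out = []
        for img in batch:
            img = img.detach()  # discriminator updates need no refiner gradients
            if len(self.images) < self.capacity:
                self.images.append(img)
                out.append(img)
            elif random.random() < 0.5:
                # Swap in a buffered image; store the new one in its place.
                idx = random.randrange(self.capacity)
                out.append(self.images[idx])
                self.images[idx] = img
            else:
                out.append(img)
        return torch.stack(out)
```

A training step under these assumptions would compute `refined = R(synthetic)`, update the refiner with `refiner_loss`, then train the discriminator on real images versus `history.sample_and_store(refined)`, so roughly half of each fake mini-batch comes from past refiner outputs.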