Learning from Simulated and Unsupervised Images through Adversarial Training


November 15, 2016 | Ashish Shrivastava, Tomas Pfister, Oncel Tuzel, Josh Susskind, Wenda Wang, Russ Webb
This paper proposes Simulated+Unsupervised (S+U) learning, a method that improves the realism of synthetic images using unlabeled real data while preserving their annotation information. The approach uses an adversarial network similar to a Generative Adversarial Network (GAN), but with synthetic images as inputs instead of random vectors. Key modifications to the standard GAN algorithm are a 'self-regularization' term that keeps each refined image close to its synthetic source, a local adversarial loss defined over image patches rather than the whole image, and discriminator updates drawn from a history of refined images rather than only the current mini-batch. These changes preserve annotations, avoid artifacts, and stabilize training.

The method is evaluated on gaze estimation with the MPIIGaze dataset and on hand pose estimation with the NYU hand pose dataset, achieving state-of-the-art results without any labeled real data and significant improvements over training on synthetic images alone. The results indicate that the refined images effectively bridge the gap between the synthetic and real image distributions, so large training sets can be produced without data collection or human annotation.
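The three modifications are easiest to see in code. Below is a minimal PyTorch sketch, not the paper's implementation: the network architectures, the loss weight `lambda_reg`, and the buffer size are illustrative assumptions. It shows a refiner loss that combines a patch-level adversarial term with an L1 self-regularization term, plus an image-history buffer for discriminator updates.

```python
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

class Refiner(nn.Module):
    """Maps a synthetic image to a refined image of the same size (toy architecture)."""
    def __init__(self, channels=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, channels, 1),
        )

    def forward(self, x):
        return self.net(x)

class PatchDiscriminator(nn.Module):
    """Outputs a map of per-patch real/fake logits, giving a local adversarial loss."""
    def __init__(self, channels=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1, 1),  # one logit per spatial patch
        )

    def forward(self, x):
        return self.net(x)

def refiner_loss(D, synthetic, refined, lambda_reg=0.1):
    # Local adversarial term: every patch of the refined image should
    # look "real" to the discriminator.
    patch_logits = D(refined)
    adv = F.binary_cross_entropy_with_logits(
        patch_logits, torch.ones_like(patch_logits))
    # Self-regularization term: keep the refined image close to the
    # synthetic input so its annotation (e.g. gaze direction) stays valid.
    reg = F.l1_loss(refined, synthetic)
    return adv + lambda_reg * reg

class ImageHistory:
    """Buffer of past refined images used when updating the discriminator,
    so it does not forget earlier refiner outputs."""
    def __init__(self, capacity=512):
        self.capacity, self.images = capacity, []

    def sample_and_store(self, batch):
        out = []
        for img in batch:
            img = img.detach()  # discriminator updates need no refiner gradients
            if len(self.images) < self.capacity:
                self.images.append(img)
                out.append(img)
            elif random.random() < 0.5:
                # Swap in a buffered image; store the new one in its place.
                idx = random.randrange(self.capacity)
                out.append(self.images[idx])
                self.images[idx] = img
            else:
                out.append(img)
        return torch.stack(out)
```

A training step under these assumptions would compute `refined = R(synthetic)`, update the refiner with `refiner_loss`, then train the discriminator on real images versus `history.sample_and_store(refined)`, so roughly half of each fake mini-batch comes from past refiner outputs.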