28 Jun 2018 | Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Senior Member, IEEE, Xiaogang Wang, Member, IEEE, Xiaolei Huang, Member, IEEE, Dimitris N. Metaxas*, Fellow, IEEE
StackGAN++ is a method for generating high-resolution, photo-realistic images with stacked generative adversarial networks (GANs). The paper proposes two versions of the method. StackGAN-v1 is a two-stage GAN that first generates a low-resolution image from a text description and then refines it into a high-resolution one. StackGAN-v2 is a multi-stage GAN arranged in a tree-like structure, in which multiple generators and discriminators jointly model image distributions at different scales. A conditioning augmentation technique is also used to stabilize training and increase the diversity of the generated images. The method is evaluated on several datasets, including CUB, Oxford-102, and COCO, and shows significant improvements over existing methods in generating realistic images.
The paper also discusses the advantages of the proposed method in terms of training stability, multi-scale image generation, and conditional/unconditional image synthesis. The results demonstrate that StackGAN++ can generate high-resolution images with photo-realistic details from text descriptions, outperforming other state-of-the-art methods.
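The tree-like structure of StackGAN-v2 can be pictured as a single chain of hidden features that is progressively upsampled, with a branch generator emitting an image at every scale and each scale judged by its own discriminator. The sketch below is purely structural, assuming illustrative feature shapes and a hypothetical 1x1 projection in place of the paper's learned convolutional blocks:

```python
import numpy as np

rng = np.random.default_rng(0)

def upsample2x(x):
    """Nearest-neighbour upsampling; a stand-in for a learned upsampling block."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def tree_generator(z, n_stages=3):
    """Structural sketch of the tree: shared hidden features grow stage by stage,
    and a branch at each stage projects them to an image at that scale.
    All shapes here are illustrative, not the paper's actual architecture."""
    h = z.reshape(8, 4, 4)             # initial hidden features from the noise vector
    w = rng.standard_normal(8) * 0.1   # hypothetical 1x1 "to image" projection
    images = []
    for _ in range(n_stages):
        h = upsample2x(h)                      # grow the shared feature map
        img = np.tensordot(w, h, axes=(0, 0))  # branch: project features to an image
        images.append(img)                     # each scale feeds its own discriminator
    return images

imgs = tree_generator(rng.standard_normal(128))
# spatial size doubles at each stage (here 8x8, 16x16, 32x32;
# the paper's stages produce 64x64, 128x128, and 256x256 images)
```

In the real model, earlier-scale images constrain the later ones because all branches share the same hidden-feature chain, which is what lets the multi-scale discriminators stabilize joint training.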
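Conditioning augmentation replaces the fixed text embedding with a sample from a Gaussian whose mean and diagonal variance are computed from the embedding, regularized toward the standard normal by a KL term. A minimal NumPy sketch, assuming hypothetical linear projections `w_mu` and `w_logvar` in place of the paper's learned layers:

```python
import numpy as np

rng = np.random.default_rng(0)

def conditioning_augmentation(text_embedding, w_mu, w_logvar):
    """Sample a conditioning vector c ~ N(mu(e), diag(sigma(e)^2)) from a
    text embedding e, and return the KL(N(mu, sigma^2) || N(0, I)) regularizer.
    w_mu / w_logvar are hypothetical projection matrices, not the paper's layers."""
    mu = text_embedding @ w_mu
    logvar = text_embedding @ w_logvar
    eps = rng.standard_normal(mu.shape)        # reparameterization trick
    c = mu + np.exp(0.5 * logvar) * eps        # sampled conditioning vector
    # closed-form KL divergence to N(0, I), averaged over the batch
    kl = -0.5 * np.mean(np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=1))
    return c, kl

e = rng.standard_normal((4, 128))              # batch of 4 text embeddings
w_mu = rng.standard_normal((128, 16)) * 0.02
w_logvar = rng.standard_normal((128, 16)) * 0.02
c, kl = conditioning_augmentation(e, w_mu, w_logvar)
```

Because the generator sees a slightly different `c` for the same sentence on every draw, this smooths the conditioning manifold, which is the source of the training stability and sample diversity the summary mentions.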