Understanding Photographic Image Synthesis with Cascaded Refinement Networks

This paper presents a method for generating photographic images based on semantic layouts. The approach uses a convolutional network trained end-to-end with a regression loss, avoiding adversarial training. The model, called the Cascaded Refinement Network (CRN), is a series of refinement modules that progressively refine images from low to high resolution. The CRN is trained on semantic segmentation datasets, where each semantic layout is used as input and the corresponding color image as output. The model is capable of generating high-resolution images, up to 2 megapixels, and produces more realistic images than alternative approaches. The method is evaluated on both outdoor and indoor scenes, with results showing that images generated by the CRN are significantly more realistic than those generated by other methods. The paper also discusses related work, including generative adversarial networks (GANs) and other image synthesis techniques, and compares the CRN to these approaches. The results demonstrate that the CRN can generate diverse images and that the model's capacity is essential for producing high-resolution, photorealistic images. The method is shown to be effective in generating images that conform to semantic layouts and are visually realistic.This paper presents a method for generating photographic images based on semantic layouts. The approach uses a convolutional network trained end-to-end with a regression loss, avoiding adversarial training. The model, called the Cascaded Refinement Network (CRN), is a series of refinement modules that progressively refine images from low to high resolution. The CRN is trained on semantic segmentation datasets, where each semantic layout is used as input and the corresponding color image as output. The model is capable of generating high-resolution images, up to 2 megapixels, and produces more realistic images than alternative approaches. The method is evaluated on both outdoor and indoor scenes, with results showing that images generated by the CRN are significantly more realistic than those generated by other methods. The paper also discusses related work, including generative adversarial networks (GANs) and other image synthesis techniques, and compares the CRN to these approaches. The results demonstrate that the CRN can generate diverse images and that the model's capacity is essential for producing high-resolution, photorealistic images. The method is shown to be effective in generating images that conform to semantic layouts and are visually realistic.

Photographic Image Synthesis with Cascaded Refinement Networks

28 Jul 2017 | Qifeng Chen† ‡, Vladlen Koltun†