Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models

This paper introduces DiMR, a multi-resolution diffusion model that improves image generation quality by combining a multi-resolution network with time-dependent layer normalization (TD-LN). The multi-resolution network refines features progressively from low to high resolution through a feature cascade, in which lower-resolution features are progressively upsampled to higher resolutions, reducing image distortion. TD-LN encodes temporal information into the diffusion model efficiently, enhancing performance with fewer parameters. On the class-conditional ImageNet generation benchmark, DiMR outperforms existing diffusion models, achieving state-of-the-art FID scores of 1.70 on 256×256 images and 2.89 on 512×512 images.

Extensive experiments show significant improvements in image fidelity and reduced distortion compared to prior methods, and the gains hold across the evaluated image sizes. The results indicate that DiMR is a promising advance in diffusion models for high-fidelity image generation.
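The two ideas in the summary can be illustrated with a minimal NumPy sketch. The summary does not give formulas, so everything here is an assumption made for illustration: the gating form in `td_layer_norm` (a sigmoid of the timestep that blends two learned scale/shift sets), the choice of nearest-neighbour upsampling, the channel-wise concatenation in `feature_cascade`, and all function and parameter names. It should not be read as the authors' implementation.

```python
import numpy as np


def td_layer_norm(x, t, gamma1, gamma2, beta1, beta2, w, b, eps=1e-6):
    """Layer norm whose scale and shift depend on the timestep t.

    x: (tokens, channels) features; t: scalar timestep.
    The gating form and all parameter names are assumptions for illustration.
    """
    # Standard layer normalization over the channel axis.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)

    # Scalar gate in (0, 1) computed from the timestep; w and b are
    # the only timestep-specific parameters in this sketch.
    s = 1.0 / (1.0 + np.exp(-(w * t + b)))

    # Blend two learned affine parameter sets according to the gate.
    gamma = s * gamma1 + (1.0 - s) * gamma2
    beta = s * beta1 + (1.0 - s) * beta2
    return gamma * x_hat + beta


def upsample_nearest(feat, factor=2):
    """Nearest-neighbour upsampling of an (H, W, C) feature map."""
    return feat.repeat(factor, axis=0).repeat(factor, axis=1)


def feature_cascade(low_res_feat, high_res_feat):
    """Upsample lower-resolution features and concatenate them, channel-wise,
    with the features of the next (higher-resolution) stage."""
    factor = high_res_feat.shape[0] // low_res_feat.shape[0]
    up = upsample_nearest(low_res_feat, factor=factor)
    return np.concatenate([high_res_feat, up], axis=-1)
```

In this sketch the timestep adds only two scalars (`w`, `b`) and one extra scale/shift pair per normalization layer, which is one way conditioning could stay cheap in parameters. The cascade step simply hands coarse features to the finer stage so that it starts from an already-formed global layout.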

13 Jun 2024 | Qihao Liu, Zhanpeng Zeng, Ju He, Qihang Yu, Xiaohui Shen, Liang-Chieh Chen