30 Jun 2021 | Chitwan Saharia, Jonathan Ho, William Chan, Tim Salimans, David J. Fleet, Mohammad Norouzi
SR3 is an approach to image super-resolution via iterative refinement, inspired by denoising diffusion probabilistic models (DDPMs) and denoising score matching. It adapts DDPMs to conditional image generation: a U-Net is trained to denoise at various noise levels, conditioned on the low-resolution input. Inference starts from pure Gaussian noise and refines the output through a series of denoising steps until a high-resolution image emerges. SR3 achieves strong performance on super-resolution tasks at different magnification factors, on both faces and natural images.
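The refinement loop described above can be sketched in a few lines. This is a toy illustration on 1-D signals using only the Python standard library, not the authors' implementation. The linear noise schedule, the nearest-neighbour upsampling, and every function name here are assumptions made for the example. In particular, `placeholder_noise_model` stands in for the trained U-Net: it simply assumes the clean target equals the upsampled input, so the loop's output collapses to that input.

```python
import math
import random


def linear_alphas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step alphas (1 - beta) for a linear noise schedule (an assumed choice)."""
    span = max(num_steps - 1, 1)
    return [1.0 - (beta_start + (beta_end - beta_start) * t / span)
            for t in range(num_steps)]


def upsample_nearest(signal, factor):
    """Repeat each sample `factor` times (stand-in for image upsampling)."""
    return [v for v in signal for _ in range(factor)]


def placeholder_noise_model(cond, y_t, gamma_t):
    """Hypothetical stand-in for the trained U-Net.

    It assumes the clean target equals `cond` and returns the noise that
    would explain `y_t` under that assumption.
    """
    s, n = math.sqrt(gamma_t), math.sqrt(1.0 - gamma_t)
    return [(y - s * c) / n for y, c in zip(y_t, cond)]


def refine(low_res, factor, num_steps=100,
           noise_model=placeholder_noise_model, rng=None):
    """Start from Gaussian noise and denoise step by step, conditioned on low_res."""
    rng = rng or random.Random(0)
    cond = upsample_nearest(low_res, factor)
    alphas = linear_alphas(num_steps)

    # gamma_t is the cumulative product of the alphas up to step t.
    gammas, running = [], 1.0
    for a in alphas:
        running *= a
        gammas.append(running)

    y = [rng.gauss(0.0, 1.0) for _ in cond]  # pure Gaussian noise
    for t in reversed(range(num_steps)):
        a, g = alphas[t], gammas[t]
        eps = noise_model(cond, y, g)
        coef = (1.0 - a) / math.sqrt(1.0 - g)
        y = [(yi - coef * ei) / math.sqrt(a) for yi, ei in zip(y, eps)]
        if t > 0:  # no fresh noise is added on the final step
            sigma = math.sqrt(1.0 - a)
            y = [yi + sigma * rng.gauss(0.0, 1.0) for yi in y]
    return y
```

Calling `refine([0.0, 1.0, -0.5], factor=2)` returns a six-element signal. With a real trained denoiser in place of the placeholder, the same loop would synthesize detail that is absent from the low-resolution input.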
In human evaluation, SR3 outperforms state-of-the-art GAN methods, achieving a fool rate close to 50%, which suggests photo-realistic outputs. It is also effective in cascaded image generation, where generative models are chained with super-resolution models, yielding competitive FID scores on ImageNet. The model is trained with a denoising objective and evaluated with both automated metrics and human raters. SR3 generates high-resolution images with a constant number of inference steps, which keeps it practical for high-resolution tasks. It is also effective for unconditional and class-conditional generation, demonstrating its versatility in image synthesis.
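As a rough illustration of the denoising objective mentioned above, the sketch below corrupts a clean target with Gaussian noise at a randomly drawn noise level and scores a model on how well it predicts that noise. It is a minimal sketch under stated assumptions, not the paper's training code: the uniform draw of `gamma`, the squared-error penalty, and the names `corrupt` and `denoising_loss` are all illustrative choices, and `noise_model` is a hypothetical callable standing in for the U-Net.

```python
import math
import random


def corrupt(target, gamma, rng):
    """Mix the clean target with Gaussian noise at signal level gamma.

    Returns (noisy, noise) where noisy = sqrt(gamma) * target + sqrt(1 - gamma) * noise.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in target]
    s, n = math.sqrt(gamma), math.sqrt(1.0 - gamma)
    noisy = [s * y + n * e for y, e in zip(target, noise)]
    return noisy, noise


def denoising_loss(noise_model, cond, target, rng,
                   gamma_min=1e-4, gamma_max=0.9999):
    """Mean squared error between predicted and true noise for one sample.

    `cond` plays the role of the (upsampled) low-resolution input and
    `target` the high-resolution image. The gamma range is an assumption.
    """
    gamma = rng.uniform(gamma_min, gamma_max)
    noisy, noise = corrupt(target, gamma, rng)
    predicted = noise_model(cond, noisy, gamma)
    return sum((p - e) ** 2 for p, e in zip(predicted, noise)) / len(noise)
```

A model that always predicts zero noise gets a loss near the noise variance, while a model that recovers the injected noise exactly drives the loss to zero. Training would minimize this quantity over many (input, target, noise level) draws.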