Towards Realistic Data Generation for Real-World Super-Resolution

12 Jun 2024 | Long Peng, Wenbo Li, Renjing Pei, Jingjing Ren, Yang Wang, Xueyang Fu, Yang Cao, Zheng-Jun Zha
This paper addresses the challenge of generating realistic and diverse training data for real-world super-resolution (SR), which is crucial for improving the performance of SR models across applications. Existing methods often generalize poorly because their training data diverges significantly from the degradations found in practical scenarios. To tackle this, the authors propose an unsupervised learning framework called the Realistic Decoupled Data Generator (RealDGen). RealDGen is designed to create large-scale, high-quality paired data that mirrors real-world degradations, thereby improving the generalization ability of SR models.

The key contributions of RealDGen are:

1. **Content and degradation extraction strategies**: well-designed content and degradation extractors capture robust representations from high-resolution (HR) and low-resolution (LR) images.
2. **Content-degradation decoupled diffusion model**: a novel diffusion model generates realistic LR images by integrating the content and degradation representations.
3. **Two-phase training process**: the extractors are first pre-trained with contrastive and reconstruction learning; the decoupled diffusion model is then trained while the extractors are fine-tuned.

Experiments demonstrate that RealDGen outperforms existing methods at generating realistic paired data and improves the performance of SR models on various real-world benchmarks. Evaluation with PSNR, SSIM, LPIPS, FID, DISTS, and CLIP-Score shows superior quantitative and qualitative results, and a user study confirms the visual quality and realism of the generated LR images.
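The paper does not include code, but the contrastive pre-training of the degradation extractor can be illustrated with a standard contrastive objective. The sketch below is a hypothetical, minimal numpy implementation of an InfoNCE-style loss (the function name, temperature value, and toy setup are assumptions, not the authors' actual training code): each anchor embedding should be most similar to its own positive among all candidates in the batch.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.07):
    """Toy InfoNCE loss: anchors[i] should match positives[i].

    anchors, positives: (N, D) arrays of embeddings.
    Returns the mean negative log-probability of the correct match.
    """
    # L2-normalize so the dot product is a cosine similarity
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    exp = np.exp(logits)
    probs = exp / exp.sum(axis=1, keepdims=True)
    # the matching pairs sit on the diagonal of the similarity matrix
    return float(-np.log(np.diag(probs) + 1e-12).mean())
```

As a sanity check, perfectly matched, well-separated embeddings yield a near-zero loss, while mismatched pairs yield a large one; in the paper's setting, the embeddings would come from the degradation extractor applied to differently degraded crops.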
The paper concludes by discussing the limitations of the method, such as the difficulty of preserving fine textures due to the stochasticity of the diffusion model, and suggests future work directions, including incorporating a perceptual loss and proposing an auto-selection mechanism for the denoising step.
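Among the fidelity metrics reported (PSNR, SSIM, LPIPS, FID, DISTS, CLIP-Score), PSNR has a simple closed form. The snippet below is a standard reference implementation of PSNR in numpy, not the paper's evaluation code:

```python
import numpy as np

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 20.0 * np.log10(max_val / np.sqrt(mse))
```

For example, two 8-bit images that differ by a constant offset of 10 everywhere have an MSE of 100 and a PSNR of 20·log10(255/10) ≈ 28.13 dB.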