This paper introduces a novel training framework called "Low-Res Leads the Way" (LWay) for image super-resolution (SR), combining supervised pre-training with self-supervised learning to enhance the adaptability of SR models to real-world images. The framework uses a low-resolution (LR) reconstruction network to extract degradation embeddings from LR images, which are then merged with super-resolved outputs for LR reconstruction. Leveraging unseen LR images for self-supervised learning helps the model adapt its modeling space to the target domain, enabling fine-tuning of SR models without requiring paired high-resolution (HR) images. The integration of Discrete Wavelet Transform (DWT) further refines the focus on high-frequency details. Extensive evaluations show that the method significantly improves the generalization and detail restoration capabilities of SR models on unseen real-world datasets, outperforming existing methods. The training regime is universally compatible, requiring no network architecture modifications, making it a practical solution for real-world SR applications.
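The role the Discrete Wavelet Transform plays here can be illustrated with a minimal, self-contained sketch: one level of a 2-D Haar DWT splits an image into a low-frequency approximation and three high-frequency subbands, and it is the high-frequency subbands that carry edge and texture detail. This is only an illustrative stand-in (the paper does not specify the wavelet or implementation; `haar_dwt2` is a hypothetical helper):

```python
import numpy as np

def haar_dwt2(img):
    """One level of a 2-D Haar DWT.

    Returns the low-frequency approximation (LL) and the three
    high-frequency subbands (LH, HL, HH). Expects an array whose
    height and width are even.
    """
    # Pairwise averages/differences along rows, then along columns.
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # row low-pass
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # row high-pass
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

# A flat image has no high-frequency content ...
flat = np.ones((8, 8))
_, lh, hl, hh = haar_dwt2(flat)
print(np.allclose(lh, 0) and np.allclose(hl, 0) and np.allclose(hh, 0))  # True

# ... while an edge shows up in the high-frequency subbands.
edge = np.zeros((8, 8))
edge[:, 3:] = 1.0
_, lh, hl, hh = haar_dwt2(edge)
print(np.abs(lh).sum() > 0)  # True
```

Weighting a loss toward these subbands is one plausible way the DWT can "shift the model's focus" to high-frequency recovery, as described above.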
The paper discusses the challenges of image super-resolution, particularly the gap between synthetic datasets and real-world degradation scenarios. It categorizes training approaches into three main paradigms: unsupervised learning with unpaired data, supervised learning with paired synthetic data, and self-supervised learning with a single image. The proposed LWay framework merges supervised learning (SL) pre-training with self-supervised learning (SSL) to bridge the gap between synthetic training data and real test images. The framework includes an LR reconstruction branch that extracts degradation embeddings from LR images, which are then used to regenerate LR content. For test images, a pre-trained SR model generates SR outputs, which are then degraded by the fixed LR reconstruction network. A self-supervised loss is computed by comparing this degraded counterpart to the original LR image, thereby updating specific parameters within the SR model. The use of DWT isolates high-frequency elements from the LR image, shifting the model's focus to the recovery of high-frequency nuances.
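The self-supervised loop above (SR output, re-degraded by the fixed LR reconstruction network, compared against the original LR input) can be sketched with toy stand-ins. Note the heavy assumptions: the paper's reconstruction network is learned and conditioned on a degradation embedding, whereas `degrade` below is just a box-filter downsample, and `high_freq` is a one-step Haar residual rather than a full DWT; `w_hf` is an assumed weighting knob, not a parameter from the paper:

```python
import numpy as np

def degrade(sr, scale=2):
    """Stand-in for the fixed LR reconstruction network: a simple
    box-filter downsample (hypothetical; the real network is learned
    and driven by a degradation embedding extracted from the LR image)."""
    h, w = sr.shape
    return sr.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))

def high_freq(img):
    """High-frequency residual via one Haar step (vertical pair
    differences); a crude proxy for the paper's DWT-based weighting."""
    return (img[0::2, :] - img[1::2, :]) / 2.0

def self_supervised_loss(sr, lr, w_hf=2.0):
    """Compare the re-degraded SR output with the original LR input.
    High-frequency content is weighted more heavily, mirroring the
    DWT-guided focus on fine detail (w_hf is an assumed knob)."""
    recon = degrade(sr)
    base = np.mean((recon - lr) ** 2)
    hf = np.mean((high_freq(recon) - high_freq(lr)) ** 2)
    return base + w_hf * hf

# Toy check: an SR output whose degradation matches the LR input
# exactly incurs zero loss, so no gradient signal flows.
rng = np.random.default_rng(0)
lr = rng.random((8, 8))
sr_good = np.kron(lr, np.ones((2, 2)))  # each LR pixel replicated 2x2
print(self_supervised_loss(sr_good, lr))  # 0.0
```

In the actual framework this loss would backpropagate into selected parameters of the pre-trained SR model, fine-tuning it on each unseen test image without any HR reference.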
The method is evaluated on various real-world datasets, demonstrating significant improvements in SR quality and generalization performance. The results show that the proposed method outperforms existing approaches in terms of PSNR, SSIM, and perceptual quality. The method is also effective in handling real-world scenarios with complex and variable degradations. The framework is robust and requires no modifications to the network architecture, making it compatible with all SR models. The paper concludes that the proposed LWay training strategy effectively bridges the gap between synthetic data supervised training and real-world test image self-supervision, demonstrating impressive performance and robustness across various SR frameworks and real-world benchmarks.