2024 | Yi Xiao, Qiangqiang Yuan, Qiang Zhang, and Liangpei Zhang
This paper proposes a practical blind super-resolution algorithm for satellite video (BSVSR), addressing the severe, unknown degradations of real-world imaging. Traditional SVSR methods assume a fixed degradation, such as bicubic downsampling, which limits their effectiveness under complex, unknown degradation conditions. BSVSR instead estimates pixel-wise blur levels in a coarse-to-fine manner to exploit sharper cues: multi-scale deformable convolution coarsely aggregates temporal redundancy, while multi-scale deformable attention measures pixel-wise blur levels and assigns larger weights to informative (sharper) pixels. A pyramid spatial transformation module further adjusts the solution space, enabling flexible feature adaptation across multiple domains. Quantitative and qualitative evaluations on both simulated and real-world satellite videos show that BSVSR outperforms state-of-the-art non-blind and blind SR models, restoring sharp, clean details in severely degraded videos and handling unknown blur and downsampling. BSVSR achieves the best PSNR and SSIM across a range of blur kernel widths, and its lower NIQE values on real-world data indicate better perceptual quality. Ablation studies confirm the contribution of the key components: blur kernel estimation, progressive temporal compensation, and pyramid spatial transformation. The model is also efficient, offering a favorable trade-off between computational complexity and performance, making it suitable for practical satellite video super-resolution.
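The core idea of weighting temporally aligned pixels by their estimated blur level can be illustrated with a minimal sketch. This is not the paper's learned network: the Laplacian-based sharpness proxy, the softmax weighting, and the `temperature` parameter are all illustrative assumptions standing in for BSVSR's multi-scale deformable attention over already-aligned frames.

```python
import numpy as np

def laplacian_sharpness(frame):
    """Per-pixel sharpness proxy: absolute Laplacian response.

    Hypothetical stand-in for the paper's learned pixel-wise blur-level
    measure; sharper regions produce larger Laplacian magnitudes.
    """
    lap = (np.roll(frame, 1, axis=0) + np.roll(frame, -1, axis=0)
           + np.roll(frame, 1, axis=1) + np.roll(frame, -1, axis=1)
           - 4.0 * frame)
    return np.abs(lap)

def blur_aware_fusion(frames, temperature=1.0):
    """Fuse a stack of aligned frames, favoring sharper pixels.

    frames: (T, H, W) grayscale stack of temporally aligned neighbors.
    A softmax over the temporal axis gives pixels with higher estimated
    sharpness a larger weight, so the fused frame draws on the most
    informative observation at each location.
    """
    sharp = np.stack([laplacian_sharpness(f) for f in frames])  # (T, H, W)
    w = np.exp(sharp / temperature)
    w /= w.sum(axis=0, keepdims=True)  # per-pixel weights sum to 1
    return (w * frames).sum(axis=0)    # convex combination per pixel
```

Because the weights form a convex combination at every pixel, the fused result stays within the value range of the inputs; BSVSR realizes the same intuition with learned, multi-scale attention rather than a hand-crafted sharpness measure.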