Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution

8 May 2024 | Yi Xiao, Qiangqiang Yuan, Member, IEEE, Kui Jiang, Member, IEEE, Yuzeng Chen, Qiang Zhang, Member, IEEE, and Chia-Wen Lin, Fellow, IEEE.
The paper introduces Frequency-Assisted Mamba (FMSR), a framework for remote sensing image (RSI) super-resolution (SR). FMSR builds on the Vision State Space Model (Mamba) to capture long-range dependencies in large-scale RSI efficiently, addressing two limitations of existing SR methods: limited receptive fields and quadratic computational overhead. The framework combines a Frequency Selection Module (FSM), a Vision State Space Module (VSSM), and a Hybrid Gate Module (HGM) to strengthen spatial and frequency correlations. The FSM adaptively selects informative frequency cues, the VSSM captures spatial dependencies, and the HGM introduces a local inductive bias by selectively amplifying or attenuating local features.
Extensive experiments on the AID, DOTA, and DIOR benchmarks show that FMSR outperforms state-of-the-art Transformer-based methods (e.g., HAT-L) by 0.11 dB in PSNR on average, while requiring only 28.05% of the memory consumption and 19.08% of the computational complexity. The paper also reviews related work and examines the contribution of each component through ablation studies.
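For readers unfamiliar with the metric behind the reported 0.11 dB gain, PSNR can be computed as in the minimal sketch below. It is not taken from the paper: the function name `psnr`, the nested-list image representation, the sample pixel values, and the 8-bit peak value of 255 are all assumptions for illustration, and the paper's exact evaluation protocol (color channel, border cropping) is not stated in this summary.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D images.

    Images are nested lists of pixel intensities (an assumed representation).
    """
    if len(reference) != len(estimate):
        raise ValueError("images must have the same height")
    squared_error = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        if len(ref_row) != len(est_row):
            raise ValueError("images must have the same width")
        for r, e in zip(ref_row, est_row):
            squared_error += (r - e) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    mse = squared_error / count  # mean squared error over all pixels
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 3x3 patches: a ground-truth crop and a super-resolved estimate.
hr = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
sr = [[50, 55, 60], [59, 80, 63], [75, 62, 58]]
print(f"PSNR: {psnr(hr, sr):.2f} dB")
```

Because PSNR is logarithmic in the mean squared error, a 0.11 dB improvement on a single image at a fixed peak value corresponds to roughly a 2.5% reduction in mean squared error.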