MambaIR: A Simple Baseline for Image Restoration with State-Space Model

25 Mar 2024 | Hang Guo, Jinmin Li, Tao Dai, Zhihao Ouyang, Xudong Ren, Shu-Tao Xia
**Authors:** Hang Guo, Jinmin Li, Tao Dai, Zhihao Ouyang, Xudong Ren, Shu-Tao Xia

**Institution:** Tsinghua Shenzhen International Graduate School, Tsinghua University; College of Computer Science and Software Engineering, Shenzhen University; Bytedance Inc.; Peng Cheng Laboratory

**Abstract:** Recent advances in image restoration have been driven by modern deep neural networks such as CNNs and Transformers. However, existing restoration backbones often face a trade-off between global receptive fields and efficient computation. The Selective Structured State Space Model (Mamba) has shown promise in modeling long-range dependencies with linear complexity, which offers a way around this trade-off. Standard Mamba, however, suffers from local pixel forgetting and channel redundancy in low-level vision tasks. This work introduces MambaIR, a simple yet effective baseline that adds local enhancement and channel attention to the vanilla Mamba. MambaIR exploits local pixel similarity and reduces channel redundancy. On image super-resolution it outperforms SwinIR by up to 0.45 dB at similar computational cost while retaining a global receptive field.

**Keywords:** Image Restoration, State Space Model, Mamba

**Introduction:** Image restoration aims to reconstruct high-quality images from low-quality inputs and covers tasks such as super-resolution and denoising. Recent progress has been driven by deep learning models such as CNNs and Transformers. Although these models have improved performance, they struggle to combine a global receptive field with efficient computation. Mamba, a structured state-space sequence model, offers a promising solution by modeling long-range dependencies with linear complexity. Applied directly to image restoration, however, standard Mamba suffers from local pixel forgetting and channel redundancy.
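
The linear-complexity claim can be made concrete with a small sketch. The NumPy function below runs a single-channel, diagonal state-space recurrence over a 1D sequence in one pass, so its cost grows linearly with sequence length. It is an illustrative sketch, not the authors' implementation: the function name `ssm_scan`, the parameter names, and the zero-order-hold discretization are assumptions chosen for clarity, and the per-step `delta` only hints at Mamba's input-dependent (selective) parameters.

```python
import numpy as np

def ssm_scan(x, A, B, C, delta):
    """Run a single-channel diagonal state-space recurrence over a 1D sequence.

    x:     (L,) input sequence
    A:     (N,) diagonal of the continuous-time state matrix (nonzero, negative for stability)
    B, C:  (N,) input and output projections
    delta: (L,) per-step step sizes (hypothetical stand-in for selective parameters)
    """
    x = np.asarray(x, dtype=float)
    A = np.asarray(A, dtype=float)
    h = np.zeros_like(A)                    # hidden state, one entry per state dimension
    y = np.empty(len(x), dtype=float)
    for t in range(len(x)):
        A_bar = np.exp(delta[t] * A)        # zero-order-hold discretization of A
        B_bar = (A_bar - 1.0) / A * B       # matching discretization of B (diagonal case)
        h = A_bar * h + B_bar * x[t]        # state update: O(N) work per step
        y[t] = C @ h                        # linear readout of the state
    return y
```

Each step touches only the N-dimensional state, so a sequence of length L costs O(L·N) work, in contrast to the pairwise token interactions of self-attention.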

**Methodology:** MambaIR addresses these challenges by incorporating local enhancement and channel attention. It consists of three stages: shallow feature extraction, deep feature extraction, and high-quality image reconstruction. The deep feature extraction stage uses Residual State Space Blocks (RSSBs) to mitigate local pixel forgetting and reduce channel redundancy. The Vision State-Space Module (VSSM) captures long-range dependencies, and the 2D Selective Scan Module (2D-SSM) allows for efficient processing of 2D images.

**Experiments:** Extensive experiments on image restoration tasks, including super-resolution and denoising, demonstrate the effectiveness of MambaIR. It is reported to outperform state-of-the-art methods in PSNR and SSIM, with better structure preservation and more natural texture reconstruction. Its computational cost also scales linearly with input resolution, which makes it efficient for real-world applications.

**Conclusion:** MambaIR is a simple yet effective baseline for image restoration that leverages the advantages of Mamba while addressing its limitations. It provides a promising alternative for image restoration backbones.
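
One way to picture how a 1D sequence model can process a 2D feature map is to unfold the map into several scan orders, run the sequence model on each, and merge the results. The sketch below assumes four scan orders (row-major, column-major, and their reverses) merged by summation, a scheme common in vision Mamba variants. The summary above does not specify the scan orders, and the helper names `unfold_four_directions` and `merge_four_directions` are hypothetical.

```python
import numpy as np

def unfold_four_directions(feat):
    """Flatten an (H, W) feature map into four 1D scan sequences of length H*W."""
    row_major = feat.reshape(-1)      # left-to-right within each row, rows top-to-bottom
    col_major = feat.T.reshape(-1)    # top-to-bottom within each column, columns left-to-right
    # The two reversed sequences scan the same paths from the opposite end.
    return [row_major, row_major[::-1], col_major, col_major[::-1]]

def merge_four_directions(seqs, shape):
    """Map four scan sequences back to (H, W) and sum them (assumed merge rule)."""
    H, W = shape
    rm, rm_rev, cm, cm_rev = seqs
    out = rm.reshape(H, W) + rm_rev[::-1].reshape(H, W)
    out = out + cm.reshape(W, H).T + cm_rev[::-1].reshape(W, H).T
    return out
```

In a full model each sequence would pass through a state-space scan before merging. With the sequences left unchanged, the merge returns four times the input, which is a quick check that every reordering is inverted correctly.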