9 Mar 2024 | Xuanhua He¹,², Ke Cao¹,², Keyu Yan¹,², Rui Li¹, Chengjun Xie¹, Jie Zhang¹, and Man Zhou²
Pan-Mamba is a novel pan-sharpening network that leverages the Mamba model for efficient global information modeling. The method introduces two core components: channel swapping Mamba and cross-modal Mamba, designed for efficient cross-modal information exchange and fusion. Channel swapping Mamba initiates lightweight cross-modal interaction by exchanging partial PAN and multi-spectral channels, while cross-modal Mamba facilitates information representation by exploiting inherent cross-modal relationships. Through extensive experiments across diverse datasets, Pan-Mamba surpasses state-of-the-art methods, demonstrating superior fusion results in pan-sharpening. This work is the first to explore the potential of the Mamba model in pan-sharpening and establishes a new frontier in the field. The source code is available at https://github.com/alexhe101/Pan-Mamba.
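To make the channel-swapping idea concrete, here is a minimal PyTorch-style sketch. It assumes PAN and multi-spectral feature maps that have already been projected to the same channel count, and uses a hypothetical half-and-half split; the paper's actual split ratio and channel ordering may differ.

```python
import torch

def channel_swap(pan_feat: torch.Tensor, ms_feat: torch.Tensor, ratio: float = 0.5):
    """Exchange the leading fraction of channels between PAN and MS features.

    pan_feat, ms_feat: (B, C, H, W) tensors with the same channel count C.
    ratio: fraction of channels to exchange (0.5 is an assumption, not the
    paper's confirmed setting).
    """
    c = pan_feat.shape[1]
    k = int(c * ratio)  # number of channels handed to the other branch
    # Each branch keeps its remaining channels and receives the other's first k.
    pan_out = torch.cat([ms_feat[:, :k], pan_feat[:, k:]], dim=1)
    ms_out = torch.cat([pan_feat[:, :k], ms_feat[:, k:]], dim=1)
    return pan_out, ms_out
```

Because the operation is a pure rearrangement, it adds no parameters and negligible compute, which is consistent with the paper's description of it as a lightweight interaction step preceding the heavier cross-modal Mamba fusion.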
Pan-sharpening fuses a low-resolution multi-spectral image with a high-resolution panchromatic image to produce a high-resolution multi-spectral image. Classical methods rely on hand-crafted rules and have limited representational capacity. Deep learning methods, beginning with PNN, brought significant improvements, but existing networks still struggle to capture global information and to fuse the two modalities efficiently: CNNs have limited receptive fields, while attention-based models incur quadratic cost in sequence length. The Mamba model offers an alternative, providing input-adaptive, global information modeling at linear complexity with reduced computational overhead. Pan-Mamba builds its feature extraction and fusion around Mamba, using it for efficient long-range modeling and cross-modal interaction.
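The linear-complexity claim traces to Mamba's selective state-space formulation, which processes a sequence through a recurrence rather than pairwise attention. The discretized recurrence below is the standard form from the Mamba literature, not an equation reproduced from this paper:

```latex
h_t = \bar{A}\, h_{t-1} + \bar{B}\, x_t, \qquad y_t = C\, h_t
```

Because each step depends only on the previous hidden state, a length-$L$ sequence costs $O(L)$, versus the $O(L^2)$ pairwise interactions of self-attention. The "selective" (input-adaptive) aspect is that $\bar{B}$, $C$, and the discretization step are functions of the current input $x_t$.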
Pan-Mamba achieves superior results in both qualitative and quantitative assessments across multiple datasets, outperforming state-of-the-art methods on the PSNR, SSIM, SAM, and ERGAS metrics. On real-world full-resolution data, non-reference metrics indicate robust generalizability. Ablation studies validate each component, and comparisons with benchmark methods show competitive computational complexity, GPU memory usage, and inference time. Overall, the method delivers state-of-the-art pan-sharpening with robust spectral accuracy and effective preservation of texture information.
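The reference metrics above have standard definitions; the NumPy sketch below shows how three of them (PSNR, SAM, ERGAS) are typically computed against a ground-truth high-resolution multi-spectral image. SSIM is omitted for brevity since it requires windowed local statistics, and the scale=4 default for ERGAS is an assumption matching the common 4x PAN/MS resolution ratio, not a confirmed setting from the paper.

```python
import numpy as np

def psnr(ref: np.ndarray, fused: np.ndarray, data_range: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; higher is better."""
    mse = np.mean((ref - fused) ** 2)
    return float(10.0 * np.log10(data_range ** 2 / mse))

def sam(ref: np.ndarray, fused: np.ndarray, eps: float = 1e-8) -> float:
    """Spectral Angle Mapper in degrees; inputs are (H, W, C). Lower is better."""
    dot = np.sum(ref * fused, axis=-1)
    norms = np.linalg.norm(ref, axis=-1) * np.linalg.norm(fused, axis=-1)
    angles = np.arccos(np.clip(dot / (norms + eps), -1.0, 1.0))
    return float(np.degrees(np.mean(angles)))

def ergas(ref: np.ndarray, fused: np.ndarray, scale: int = 4) -> float:
    """Relative dimensionless global error (ERGAS); lower is better.

    scale: PAN/MS spatial resolution ratio (4 assumed here).
    """
    rmse_per_band = np.sqrt(np.mean((ref - fused) ** 2, axis=(0, 1)))
    mean_per_band = np.mean(ref, axis=(0, 1))
    return float(100.0 / scale * np.sqrt(np.mean((rmse_per_band / mean_per_band) ** 2)))
```

These reference metrics apply only at reduced resolution, where a ground truth exists; the real-world evaluation mentioned above instead relies on non-reference indices computed from the PAN and MS inputs alone.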