10 Apr 2024 | Sijie Zhao, Hao Chen*, Xueliang Zhang*, Pengfeng Xiao, Lei Bai, and Wanli Ouyang
The paper introduces Remote Sensing Mamba (RSM), a novel approach for dense prediction tasks in very-high-resolution (VHR) remote sensing images. RSM addresses the challenge of context modeling in large VHR images by combining global modeling with linear computational complexity. The proposed method incorporates an omnidirectional selective scan module (OSSM) that captures spatial features from multiple directions, making context modeling more effective. Extensive experiments on semantic segmentation and change detection datasets demonstrate that RSM achieves state-of-the-art performance without the need for complex training strategies. The code for RSM is available at <https://github.com/walking-shadow/Official_Remote_Sensing_Mamba>.
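To make the idea of scanning "from multiple directions" concrete, here is a minimal sketch in plain Python. It is not taken from the RSM repository: the function names `scan_orders` and `flatten`, and the specific choice of eight orders (row-wise, column-wise, diagonal, anti-diagonal, and the reverse of each), are assumptions made for illustration. Each order turns a 2D feature map into a 1D sequence that a sequence model could then process.

```python
def scan_orders(height, width):
    """Return a dict mapping a direction name to a list of (row, col) cells.

    The eight directions here are an illustrative assumption, not the
    authors' implementation.
    """
    cells = [(r, c) for r in range(height) for c in range(width)]
    base = {
        # Left-to-right along each row, top row first.
        "row": list(cells),
        # Top-to-bottom along each column, left column first.
        "col": [(r, c) for c in range(width) for r in range(height)],
        # Group cells by r + c, visiting each group from top to bottom.
        "diag": sorted(cells, key=lambda rc: (rc[0] + rc[1], rc[0])),
        # Group cells by r - c, visiting each group from top to bottom.
        "anti_diag": sorted(cells, key=lambda rc: (rc[0] - rc[1], rc[0])),
    }
    orders = dict(base)
    # Add the reverse traversal of every base order.
    for name, order in base.items():
        orders[name + "_rev"] = order[::-1]
    return orders


def flatten(feature_map, order):
    """Read a 2D list of values in the given (row, col) order."""
    return [feature_map[r][c] for r, c in order]


if __name__ == "__main__":
    grid = [[0, 1, 2],
            [3, 4, 5]]
    for name, order in scan_orders(2, 3).items():
        print(name, flatten(grid, order))
```

Every order visits each cell exactly once, so the sequences differ only in which neighbours end up adjacent. That is the property a multi-directional scan relies on.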
- **Context Modeling**: Critical for dense prediction tasks in VHR remote sensing images.
- **Challenges**: Quadratic complexity of transformer-based models and loss of contextual information when cropping large images.
- **Proposed Solution**: RSM, designed to capture global context with linear complexity.
- **Key Contributions**:
  - Introduction of state space model (SSM) for dense prediction tasks in VHR remote sensing.
  - Development of OSSM for extracting large spatial features from multiple directions.
  - State-of-the-art performance on semantic segmentation and change detection tasks.
- **Methodology**:
  - Overview of SSM and its application in RSM.
  - Architecture of RSM-SS and RSM-CD for semantic segmentation and change detection, respectively.
  - Detailed description of the OSSM block.
- **Experimental Settings and Results**:
  - Evaluation on semantic segmentation and change detection datasets.
  - Comparison with benchmark methods.
  - Ablation study to validate the effectiveness of OSSM.
  - Impact of image size and spatial resolution on performance.
  - Comparison with CNN-based and transformer-based models in handling large remote sensing images.
- **Conclusion**: RSM demonstrates superior efficiency and accuracy in dense prediction tasks for VHR remote sensing images, leveraging linear complexity and global modeling capabilities.
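
The contrast between linear and quadratic complexity mentioned above can be illustrated with a toy example. The sketch below is an assumption-laden simplification, not the RSM or Mamba implementation: it uses a scalar state and fixed parameters `a`, `b`, `c` (hypothetical names), whereas a selective SSM makes its parameters depend on the input. It shows only that a state space recurrence visits each sequence element once, while pairwise attention compares every element with every other.

```python
def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Run a scalar linear recurrence h_t = a*h_{t-1} + b*x_t, y_t = c*h_t.

    One state update per input element, so the cost grows linearly with
    len(xs). Parameters are fixed here for simplicity (an assumption).
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x   # the state carries context from all earlier inputs
        ys.append(c * h)
    return ys


def recurrence_steps(length):
    """Number of state updates needed for a sequence of this length."""
    return length


def pairwise_interactions(length):
    """Number of token pairs scored by full self-attention."""
    return length * length


if __name__ == "__main__":
    print(ssm_scan([1.0, 0.0, 0.0], a=0.5))
    for side in (256, 1024):
        n = side * side  # one token per pixel, purely for illustration
        print(side, recurrence_steps(n), pairwise_interactions(n))
```

Because the state `h` is updated from every earlier input, the last output still depends on the first element, which is the sense in which the recurrence models global context at linear cost.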