RS3Mamba: Visual State Space Model for Remote Sensing Images Semantic Segmentation

2024 | Xianping Ma, Xiaokang Zhang, Member, IEEE, and Man-On Pun, Senior Member, IEEE
The paper introduces RS3Mamba, a novel dual-branch network for semantic segmentation of remote sensing images. RS3Mamba incorporates a visual state space (VSS) model, which can capture long-range relationships with linear computational complexity. The network pairs an auxiliary encoder built from VSS blocks with a main encoder based on ResNet18, and a collaborative completion module (CCM) fuses features from the two encoders to enhance overall performance. Experimental results on the ISPRS Vaihingen and LoveDA Urban datasets show that RS3Mamba outperforms existing CNN- and Transformer-based methods, particularly in accuracy and computational efficiency. The paper also includes ablation studies and a complexity analysis to validate the effectiveness and practicality of the proposed method.
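To make the dual-branch layout concrete, here is a minimal, hypothetical Python/PyTorch sketch (not the authors' implementation). It pairs a ResNet18 main encoder with a placeholder auxiliary branch standing in for the VSS blocks, and a simple concatenate-and-project module standing in for the CCM. All class names (`AuxBranchStub`, `FusionStub`, `DualBranchEncoder`) and the per-stage fusion scheme are assumptions for illustration only, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18


class AuxBranchStub(nn.Module):
    """Placeholder for one VSS-based auxiliary stage. Real VSS blocks are
    state-space layers; a strided conv is used here only to produce features
    of matching channel width and reduced resolution."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.proj(x)


class FusionStub(nn.Module):
    """Hypothetical stand-in for the CCM: concatenate main and auxiliary
    features, then project back to the main branch's channel width."""

    def __init__(self, ch: int):
        super().__init__()
        self.fuse = nn.Conv2d(2 * ch, ch, kernel_size=1)

    def forward(self, main_feat, aux_feat):
        return self.fuse(torch.cat([main_feat, aux_feat], dim=1))


class DualBranchEncoder(nn.Module):
    """Dual-branch encoder: ResNet18 main branch plus a stub auxiliary branch,
    fused at every stage. Returns multi-scale features for a decoder."""

    def __init__(self):
        super().__init__()
        r = resnet18(weights=None)
        self.stem = nn.Sequential(r.conv1, r.bn1, r.relu, r.maxpool)
        self.stages = nn.ModuleList([r.layer1, r.layer2, r.layer3, r.layer4])
        chans = [64, 128, 256, 512]
        self.aux_stages = nn.ModuleList(
            AuxBranchStub(3 if i == 0 else chans[i - 1], c)
            for i, c in enumerate(chans)
        )
        self.fusions = nn.ModuleList(FusionStub(c) for c in chans)

    def forward(self, x):
        feats = []
        main = self.stem(x)  # 1/4 resolution
        aux = x
        for stage, aux_stage, fuse in zip(self.stages, self.aux_stages, self.fusions):
            main = stage(main)
            aux = aux_stage(aux)
            # Match spatial sizes before fusing the two branches.
            aux = F.interpolate(aux, size=main.shape[-2:], mode="bilinear",
                                align_corners=False)
            main = fuse(main, aux)
            feats.append(main)
        return feats


if __name__ == "__main__":
    enc = DualBranchEncoder()
    outs = enc(torch.randn(1, 3, 256, 256))
    print([tuple(f.shape) for f in outs])  # four feature maps, 1/4 to 1/32 scale
```

The multi-scale feature list returned here would feed a segmentation decoder; in the paper, the VSS-based auxiliary branch and the CCM take the place of the stub modules above.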