28 Mar 2024 | Keyan Chen, Bowen Chen, Chenyang Liu, Wenyuan Li, Zhengxia Zou, Zhenwei Shi
RSMamba is a novel architecture for remote sensing image classification, designed to leverage the strengths of both Convolutional Neural Networks (CNNs) and Transformers. It is based on the State Space Model (SSM) and incorporates an efficient, hardware-aware design known as Mamba. RSMamba combines the advantages of a global receptive field and linear modeling complexity, while addressing the limitations of vanilla Mamba, which can only model causal sequences and does not adapt to two-dimensional image data. To overcome these limitations, RSMamba introduces a dynamic multi-path activation mechanism that allows it to model non-causal data. This mechanism processes the token sequence along forward, reverse, and random-shuffle paths, whose outputs are then fused through a linear mapping.
Experimental results on three remote sensing image classification datasets (UC Merced, AID, and RESISC45) demonstrate that RSMamba outperforms other state-of-the-art methods, showing significant potential as a backbone network for future visual foundation models. The code for RSMamba is available at https://github.com/KyanChen/RSMamba.
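The multi-path idea above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: a toy exponential-moving-average scan stands in for the real Mamba selective scan, the gate weights are fixed rather than learned, and all function names are illustrative.

```python
import numpy as np

def causal_scan(x, decay=0.9):
    """Toy causal SSM: exponential moving average along the sequence axis.
    Stands in for the real Mamba selective scan (illustrative only)."""
    h = np.zeros_like(x[0])
    out = np.empty_like(x)
    for t in range(len(x)):
        h = decay * h + (1 - decay) * x[t]
        out[t] = h
    return out

def multi_path_scan(tokens, rng):
    """Dynamic multi-path activation sketch: scan the token sequence along
    forward, reverse, and random-shuffle paths, restore the original token
    order in each path, then fuse the three outputs with a softmax-normalized
    linear gate (fixed here; learned in the actual model)."""
    L = len(tokens)
    perm = rng.permutation(L)
    inv = np.argsort(perm)                     # inverse permutation

    fwd = causal_scan(tokens)                  # forward path
    rev = causal_scan(tokens[::-1])[::-1]      # reverse path, re-reversed
    shuf = causal_scan(tokens[perm])[inv]      # shuffled path, un-shuffled

    paths = np.stack([fwd, rev, shuf])         # (3, L, D)
    logits = np.array([0.5, 0.3, 0.2])         # hypothetical gate logits
    w = np.exp(logits) / np.exp(logits).sum()  # softmax over the three paths
    return np.tensordot(w, paths, axes=1)      # weighted fusion -> (L, D)

rng = np.random.default_rng(0)
tokens = rng.normal(size=(16, 8))              # 16 image tokens, 8-dim each
out = multi_path_scan(tokens, rng)
assert out.shape == tokens.shape
```

Because the reverse and shuffled paths let every token's output depend on tokens both before and after it in raster order, the fused result is non-causal, which is what makes the sequence model usable on 2-D image patches.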