Mamba-in-Mamba: Centralized Mamba-Cross-Scan in Tokenized Mamba Model for Hyperspectral Image Classification

July 16, 2024 | Weilian Zhou, Sei-ichiro Kamata, Haipeng Wang, Man Sing Wong, Huiying (Cynthia) Hou
The paper introduces Mamba-in-Mamba (MiM), a novel architecture for hyperspectral image (HSI) classification. The MiM model leverages the Mamba architecture, known for its lightweight and parallel scanning capabilities, to improve both the accuracy and the efficiency of HSI classification. The key contributions of the study are:

1. **Centralized Mamba-Cross-Scan (MCS)**: This mechanism transforms HSI patches into diverse half-directional sequences, a more efficient and lightweight alternative to traditional scanning schemes. Four types of MCS scan the image patch comprehensively, ensuring continuous, multi-directional processing (a sketch of a centre-anchored scan follows this summary).
2. **Tokenized Mamba (T-Mamba) Encoder**: The T-Mamba encoder integrates a Gaussian Decay Mask (GDM), a Semantic Token Learner (STL), and a Semantic Token Fuser (STF) to enhance feature generation and concentration. The GDM adjusts the influence of the features at each step of the sequence, while the STL and STF process the features to extract representative semantic tokens (sketches of both appear below).
3. **Weighted MCS Fusion (WMF) Module**: This module dynamically assigns weights to the outputs of the four MCS branches, ensuring a balanced integration of their features (see the fusion sketch below).
4. **Multi-Scale Loss Design**: This design improves training efficiency by supervising the model with multi-scale features, enhancing overall performance (see the loss sketch below).

The MiM model is evaluated on four public HSI datasets (Indian Pines, Pavia University, Houston 2013, and WHU-Hi-HongHu) with fixed and disjoint training/testing samples. The results show that MiM outperforms existing baselines and is competitive with state-of-the-art approaches, demonstrating its feasibility and efficiency for HSI classification.
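To make the Centralized MCS concrete, here is a minimal sketch of one plausible centre-anchored scan: the patch is flattened in four orientations (one starting from each corner) and each raster sequence is truncated at the centre pixel, so the state-space model always ends its scan at the pixel to be classified. The function name and the exact ordering are assumptions; the paper's four scan types may be constructed differently.

```python
import torch

def centralized_cross_scan(patch: torch.Tensor) -> list[torch.Tensor]:
    """patch: (C, H, W) HSI patch with the labelled pixel at the centre.

    Returns four half-directional sequences, one per corner, each in
    row-major raster order and terminating at the centre pixel.
    """
    C, H, W = patch.shape
    views = [
        patch,                    # raster scan starts at the top-left corner
        patch.flip(-1),           # starts at the top-right corner
        patch.flip(-2),           # starts at the bottom-left corner
        patch.flip(-1).flip(-2),  # starts at the bottom-right corner
    ]
    centre = (H // 2) * W + (W // 2)   # row-major index of the centre pixel
    seqs = []
    for v in views:
        flat = v.reshape(C, H * W).T   # (H*W, C): one spectral vector per step
        seqs.append(flat[: centre + 1])  # keep only the half ending at centre
    return seqs
```

For a 5x5 patch this yields four sequences of 13 steps each, so the sequences are roughly half the length of a full raster scan, which is where the efficiency claim comes from.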
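The Gaussian Decay Mask can be read as a step-wise reweighting of the sequence features. A minimal sketch, assuming a fixed sigma and a centre-terminated sequence; the paper may instead learn the decay or apply the mask elsewhere in the encoder:

```python
import torch

def gaussian_decay_mask(seq_len: int, sigma: float = 4.0) -> torch.Tensor:
    """Weights that decay with distance from the final (centre-most) step.

    In a centre-terminated sequence the last step is the pixel being
    classified, so spatially distant earlier steps are down-weighted.
    """
    steps = torch.arange(seq_len, dtype=torch.float32)
    dist = (seq_len - 1) - steps                        # distance to centre step
    return torch.exp(-dist.pow(2) / (2 * sigma ** 2))   # (seq_len,)

# usage: features (L, D) scaled step-wise around the Mamba block
# weighted = features * gaussian_decay_mask(features.size(0)).unsqueeze(-1)
```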
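The summary does not spell out the STL/STF internals, so the sketch below assumes a TokenLearner-style design: a learned attention map pools the sequence into K semantic tokens (STL), and a second attention broadcasts the tokens back onto the sequence with a residual connection (STF). All module names and shapes are illustrative.

```python
import torch
import torch.nn as nn

class SemanticTokenLearner(nn.Module):
    """Distils a length-L sequence into K semantic tokens via a learned
    attention map over sequence positions (assumed design)."""
    def __init__(self, dim: int, num_tokens: int = 4):
        super().__init__()
        self.attn = nn.Linear(dim, num_tokens)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, L, D); normalise attention over the sequence axis
        a = self.attn(x).softmax(dim=1)      # (B, L, K)
        return a.transpose(1, 2) @ x         # (B, K, D) semantic tokens

class SemanticTokenFuser(nn.Module):
    """Broadcasts the K tokens back onto the sequence and fuses them
    with the original features via a residual connection."""
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
        # x: (B, L, D), tokens: (B, K, D)
        w = (self.q(x) @ tokens.transpose(1, 2)).softmax(dim=-1)  # (B, L, K)
        return x + self.proj(w @ tokens)                          # (B, L, D)
```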
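A minimal sketch of the Weighted MCS Fusion, assuming one learnable softmax-normalised weight per scan direction; the actual module may condition the weights on the input features rather than learning them globally:

```python
import torch
import torch.nn as nn

class WeightedMCSFusion(nn.Module):
    """Learns a softmax weight per scan direction so the four MCS branch
    outputs are balanced adaptively rather than simply averaged."""
    def __init__(self, num_scans: int = 4):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_scans))

    def forward(self, branch_feats: list[torch.Tensor]) -> torch.Tensor:
        # branch_feats: list of (B, D) features, one per scan direction
        w = self.logits.softmax(dim=0)                  # (num_scans,)
        stacked = torch.stack(branch_feats, dim=0)      # (num_scans, B, D)
        return (w.view(-1, 1, 1) * stacked).sum(dim=0)  # (B, D)
```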
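Finally, the Multi-Scale Loss can be sketched as a weighted cross-entropy over classifier heads attached at several scales, so shallower stages receive direct supervision. The scale weights below are illustrative placeholders, not values from the paper:

```python
import torch
import torch.nn.functional as F

def multi_scale_loss(logits_per_scale: list[torch.Tensor],
                     target: torch.Tensor,
                     weights: tuple[float, ...] = (0.5, 0.3, 0.2)) -> torch.Tensor:
    """Cross-entropy applied to intermediate heads as well as the final one.

    logits_per_scale: list of (B, num_classes) logits, one per scale.
    target: (B,) integer class labels.
    """
    assert len(logits_per_scale) == len(weights)
    return sum(w * F.cross_entropy(logits, target)
               for w, logits in zip(weights, logits_per_scale))
```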