SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation

25 Feb 2024 | Zhaohu Xing, Tian Ye, Yijun Yang, Guang Liu, and Lei Zhu
SegMamba is a novel 3D medical image segmentation model based on the Mamba architecture, designed to efficiently capture long-range dependencies in whole-volume features. Unlike traditional CNN- and Transformer-based methods, SegMamba models whole-volume features while maintaining high processing speed, even for volumes as large as 64×64×64. It introduces a tri-orientated Mamba (ToM) module, which enhances sequential modeling of 3D features from three directions, and a gated spatial convolution (GSC) module, which improves spatial feature representation. The paper also presents CRC-500, a new large-scale dataset of 500 3D CT scans of colorectal cancer with expert annotations, to facilitate research in 3D colorectal cancer segmentation.

The SegMamba model consists of three main components: a 3D feature encoder with multiple tri-orientated spatial Mamba (TSMamba) blocks for multi-scale global feature modeling, a 3D decoder based on convolution layers for segmentation prediction, and skip connections that pass the global multi-scale features to the decoder for feature reuse. The encoder comprises a stem layer followed by the TSMamba blocks, each of which processes 3D input features through a GSC module and a ToM module to extract spatial and global features.

Extensive experiments on three datasets (BraTS2023, AIIB2023, and CRC-500) show that SegMamba outperforms state-of-the-art methods on Dice score, HD95, and other metrics. The authors also report that the model is efficient in both training and inference while improving segmentation accuracy and robustness. The code for SegMamba and information about the CRC-500 dataset are available at https://github.com/gexing/SegMamba.
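To make "sequential modeling of 3D features from three directions" more concrete, the sketch below flattens a small volume into three token sequences. The summary does not say which three orderings the ToM module uses, so the forward, reverse, and inter-slice scans shown here are an assumption, and the function names are hypothetical. No Mamba layer is involved; the code only illustrates how one volume can yield three different sequences, and how long such a sequence is for a 64×64×64 volume.

```python
from itertools import product


def tri_orientated_orders(depth, height, width):
    """Return three orderings of voxel coordinates (d, h, w).

    The three scan directions are an illustrative assumption,
    not taken from the summary above.
    """
    # Forward: row-major scan, the width index varies fastest.
    forward = list(product(range(depth), range(height), range(width)))
    # Reverse: the same scan read back to front.
    reverse = forward[::-1]
    # Inter-slice: the depth index varies fastest, so neighbouring tokens
    # come from the same (h, w) position in adjacent slices.
    inter_slice = [
        (d, h, w)
        for h, w, d in product(range(height), range(width), range(depth))
    ]
    return forward, reverse, inter_slice


def flatten(volume, order):
    """Read a nested-list volume[d][h][w] in the given coordinate order."""
    return [volume[d][h][w] for d, h, w in order]


# Tiny 2x2x2 volume whose voxel values are 0..7 in row-major order.
volume = [[[d * 4 + h * 2 + w for w in range(2)] for h in range(2)]
          for d in range(2)]
fwd, rev, sl = tri_orientated_orders(2, 2, 2)

print(flatten(volume, fwd))  # [0, 1, 2, 3, 4, 5, 6, 7]
print(flatten(volume, rev))  # [7, 6, 5, 4, 3, 2, 1, 0]
print(flatten(volume, sl))   # [0, 4, 1, 5, 2, 6, 3, 7]

# Sequence length if every voxel of a 64x64x64 volume becomes one token.
print(64 * 64 * 64)  # 262144
```

A 64×64×64 volume flattened voxel by voxel gives a sequence of 262,144 tokens, which is why efficient long-sequence modeling matters for whole-volume features.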