30 Mar 2024 | Ziyang Wang, Jian-Qing Zheng, Yichi Zhang, Ge Cui, Lei Li
Mamba-UNet is an architecture for medical image segmentation that combines the U-Net structure with the Mamba model. It uses a pure Visual Mamba (VMamba) encoder-decoder with skip connections to preserve spatial information across network scales. The VMamba blocks are designed to model long-range dependencies in medical images efficiently, which is crucial for accurate segmentation. An integration mechanism links the encoder and decoder paths, which the authors report improves segmentation performance.

The architecture consists of an encoder, a bottleneck, a decoder, and skip connections, all based on Visual Mamba blocks. The encoder passes the input image through multiple Visual State Space (VSS) blocks and patch merging layers to build hierarchical features, while the decoder reconstructs features using VSS blocks and patch expanding layers. The bottleneck and skip connections merge multi-scale features, which helps retain spatial detail.

The model was tested on the ACDC cardiac MRI segmentation dataset and the Synapse abdominal CT segmentation dataset, where it outperformed several U-Net variants under the same hyper-parameter settings. Evaluation used Dice, Intersection over Union (IoU), Accuracy, Precision, Sensitivity, and Specificity, as well as the boundary-based Hausdorff Distance (HD) and Average Surface Distance (ASD). The authors expect to extend the model to 3D medical images and to semi-supervised and weakly-supervised learning.
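To make the overlap-based metrics named above concrete, here is a minimal sketch that computes Dice, IoU, Accuracy, Precision, Sensitivity, and Specificity for one binary mask from its confusion counts. It is not the authors' evaluation code: the function name `binary_segmentation_metrics`, the flattened-list input format, and the convention of returning 0.0 when a denominator is zero are all assumptions made for illustration. The boundary-based metrics (HD and ASD) are left out because they need surface extraction and distance computation.

```python
from typing import Dict, Sequence


def binary_segmentation_metrics(
    pred: Sequence[int], target: Sequence[int]
) -> Dict[str, float]:
    """Overlap metrics for two flattened binary masks (1 = foreground).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must contain the same number of pixels")

    # Confusion counts over all pixels.
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)

    def ratio(num: int, den: int) -> float:
        # Assumed convention: report 0.0 when the denominator is empty.
        return num / den if den else 0.0

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


if __name__ == "__main__":
    # Toy 8-pixel example: 2 true positives, 1 false positive,
    # 1 false negative, 4 true negatives.
    prediction = [1, 1, 0, 0, 1, 0, 0, 0]
    ground_truth = [1, 0, 0, 0, 1, 1, 0, 0]
    for name, value in binary_segmentation_metrics(prediction, ground_truth).items():
        print(f"{name}: {value:.3f}")
```

For multi-class datasets such as ACDC or Synapse, one common approach is to apply a function like this to each class as a one-vs-rest mask and average the results. Whether the paper aggregates its scores this way is not stated in this summary.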