Understanding H-vmunet: High-order Vision Mamba UNet for Medical Image Segmentation

The paper introduces High-order Vision Mamba UNet (H-vmunet), a model for medical image segmentation. It builds on state-space models (SSMs), particularly the 2D-selective-scan (SS2D), to improve the extraction of both global and local features. The key contributions are:

1. **High-order 2D-selective-scan (H-SS2D)**: This module progressively reduces redundant information during SS2D operations through higher-order interactions, keeping a strong global receptive field while minimizing redundancy.
2. **Local-SS2D module**: Enhances the learning of local features at each order of interaction.
3. **H-vmunet architecture**: Combines the H-SS2D module with the U-Net framework, giving a 6-layer U-shaped structure made up of encoder, decoder, and skip-connection parts.
4. **Ablation experiments**: Verify the effectiveness of the proposed modules and operations, showing that H-SS2D significantly improves performance and reduces the parameter count.

The model was evaluated on three public medical image datasets (ISIC2017, Spleen, and CVC-ClinicDB) and proved strongly competitive: it reduces the parameter count by 67.28% compared to the traditional Vision Mamba UNet (VM-UNet) while improving segmentation performance. The code for H-vmunet is available on GitHub.
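To make the idea of a "higher-order interaction" more concrete, here is a toy sketch in plain Python. It is not the authors' implementation. It assumes a recursive gating scheme in which a running feature is passed through a global mixing step and then multiplied elementwise by the next feature group, once per order. The `global_mix` function is a hypothetical stand-in for SS2D (a simple running mean over a flattened scan), and all names and shapes are illustrative.

```python
from itertools import accumulate


def global_mix(channel):
    """Hypothetical stand-in for SS2D: running mean along a flattened scan.

    Each output position depends on all earlier positions, which loosely
    mimics the long-range context that a scan-based operator provides.
    """
    return [total / (i + 1) for i, total in enumerate(accumulate(channel))]


def high_order_interaction(groups):
    """Combine feature groups with repeated mix-then-gate steps.

    `groups` is a list of equal-length feature vectors. The interaction
    order is len(groups) - 1: each extra group adds one more
    multiplicative interaction on top of the running feature.
    """
    if not groups:
        raise ValueError("at least one feature group is required")
    running = list(groups[0])
    for gate in groups[1:]:
        if len(gate) != len(running):
            raise ValueError("all feature groups must have the same length")
        mixed = global_mix(running)  # gather global context (assumed step)
        running = [m * g for m, g in zip(mixed, gate)]  # elementwise gating
    return running


if __name__ == "__main__":
    # First-order interaction between two tiny feature groups.
    print(high_order_interaction([[2.0, 4.0], [1.0, 0.5]]))
```

Under this assumption, a gate value close to zero suppresses a position before the next mixing step sees it, which is one way to picture how redundant information could shrink as the order increases.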

H-vmunet: High-order Vision Mamba UNet for Medical Image Segmentation

20 Mar 2024 | Renkai Wu, Yinghao Liu, Pengchen Liang, Qing Chang