4 Jan 2022 | Ali Hatamizadeh, Vishwesh Nath, Yucheng Tang, Dong Yang, Holger R. Roth, and Daguang Xu
Swin UNETR is a novel architecture for semantic segmentation of brain tumors from multi-modal MRI. It combines the strengths of Swin Transformers with a U-shaped network design, pairing a Swin Transformer encoder with a CNN-based decoder connected via skip connections at different resolutions. The encoder extracts features at five resolutions and computes self-attention within shifted windows, enabling efficient modeling of long-range dependencies. The model was validated in the BraTS 2021 segmentation challenge, where it ranked among the top-performing approaches in the validation phase and achieved competitive performance in the testing phase.
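For concreteness, the sketch below instantiates such an encoder-decoder with MONAI's SwinUNETR class. The four input channels and three output channels follow the BraTS setup described here; the 128³ input ROI, feature size of 48, and use of gradient checkpointing are assumptions based on common configurations rather than details stated in this summary, and the img_size argument may be deprecated in newer MONAI releases.

```python
import torch
from monai.networks.nets import SwinUNETR

# Swin Transformer encoder + CNN-based decoder with skip connections.
# in_channels=4: the four MRI modalities; out_channels=3: ET, WT, TC.
# The 128^3 ROI and feature_size=48 are assumed, not stated above.
model = SwinUNETR(
    img_size=(128, 128, 128),
    in_channels=4,
    out_channels=3,
    feature_size=48,
    use_checkpoint=True,  # gradient checkpointing to reduce GPU memory
)

x = torch.randn(1, 4, 128, 128, 128)   # one multi-modal MRI crop
with torch.no_grad():
    logits = model(x)                  # shape: (1, 3, 128, 128, 128)
```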
The model is trained with a soft Dice loss and implemented in PyTorch and MONAI. It was trained on the BraTS 2021 dataset, which comprises 1251 subjects, each with four 3D MRI modalities. Evaluated with five-fold cross-validation, Swin UNETR achieved higher mean Dice scores than competing methods such as SegResNet, nnU-Net, and TransBTS for all three tumor sub-regions (Enhancing Tumor, Whole Tumor, and Tumor Core).
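A minimal training-step sketch under these assumptions is shown below, using MONAI's DiceLoss as the soft Dice objective. Treating the three sub-regions as independent sigmoid channels, and the AdamW optimizer with its hyperparameters, are assumptions made for illustration, not details reported in this summary.

```python
import torch
from monai.losses import DiceLoss

# Soft Dice loss; sigmoid=True treats ET, WT, and TC as independent,
# possibly overlapping channels -- an assumed label encoding.
loss_fn = DiceLoss(sigmoid=True, smooth_nr=0.0, smooth_dr=1e-5)

def train_step(model, optimizer, images, labels):
    """One optimization step on a batch of multi-modal MRI crops.

    images: (B, 4, D, H, W) float tensor; labels: (B, 3, D, H, W) binary masks.
    """
    model.train()
    optimizer.zero_grad()
    logits = model(images)
    loss = loss_fn(logits, labels)
    loss.backward()
    optimizer.step()
    return loss.item()

# Example usage with the model sketched earlier (optimizer settings assumed):
# optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-5)
# loss_value = train_step(model, optimizer, batch_images, batch_labels)
```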
On the BraTS 2021 validation set, Swin UNETR ranked among the top-performing methodologies and achieved competitive results in the testing phase. The segmentation outputs were well delineated for all three sub-regions, demonstrating the model's effectiveness in accurately segmenting brain tumors. Its hierarchical encoder with self-attention modules learns multi-scale contextual information, which contributes to the improved segmentation performance. Swin UNETR is a promising foundation for future transformer-based models in medical image analysis.
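At inference time, a full MRI volume is typically segmented by tiling it into crops; the sketch below uses MONAI's sliding_window_inference for this purpose. The sliding-window scheme, ROI size, and 0.5 overlap are assumptions about the inference pipeline rather than details given in this summary, and the 240×240×155 shape simply reflects the standard BraTS volume dimensions.

```python
import torch
from monai.inferers import sliding_window_inference

# Hypothetical inference sketch, reusing `model` from the first example.
# The 128^3 ROI and 0.5 overlap are assumed, not stated in this summary.
model.eval()
with torch.no_grad():
    volume = torch.randn(1, 4, 240, 240, 155)   # standard BraTS volume size
    logits = sliding_window_inference(
        inputs=volume,
        roi_size=(128, 128, 128),
        sw_batch_size=2,
        predictor=model,
        overlap=0.5,
    )
    masks = torch.sigmoid(logits) > 0.5          # binary ET / WT / TC masks
```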