8 Feb 2021 | Jieneng Chen, Yongyi Lu, Qihang Yu, Xiangde Luo, Ehsan Adeli, Yan Wang, Le Lu, Alan L. Yuille, and Yuyin Zhou
TransUNet is a framework that combines Transformers and U-Net for medical image segmentation. It addresses two complementary weaknesses: U-Net struggles to model long-range dependencies, while pure Transformers have limited localization ability because they lack low-level detail. TransUNet therefore uses a hybrid CNN-Transformer architecture in a u-shaped design, where CNNs supply low-level detail and Transformer self-attention captures global context. In the encoder, CNNs generate high-resolution features, which the Transformer then processes to capture global context. The decoder upsamples the encoded features and combines them with the high-resolution CNN features to enable precise localization and accurate segmentation.
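The data flow described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the feature maps are random stand-ins for CNN outputs, the attention is a single head with random weights, and the sizes and helper names (`self_attention`, `upsample2x`) are assumptions chosen for illustration. It only shows how a CNN feature map becomes a token sequence, how self-attention lets every token attend to every other token, and how the decoder merges the upsampled result with a higher-resolution feature map.

```python
import numpy as np

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over (N, C) tokens."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ v, weights

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of an (H, W, C) feature map."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

rng = np.random.default_rng(0)
C = 8  # hypothetical channel count

# Random stand-ins for CNN encoder outputs (assumed sizes, not from the paper).
skip = rng.standard_normal((8, 8, C))  # higher-resolution CNN features
deep = rng.standard_normal((4, 4, C))  # lowest-resolution CNN features

# Tokenize: each spatial location of the deep feature map becomes one token.
tokens = deep.reshape(-1, C)  # shape (16, C)
w_q, w_k, w_v = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))
encoded, attn = self_attention(tokens, w_q, w_k, w_v)

# Decoder step: reshape tokens to a map, upsample, concatenate with the skip features.
decoded = np.concatenate([upsample2x(encoded.reshape(4, 4, C)), skip], axis=-1)
print(attn.shape, decoded.shape)  # (16, 16) (8, 8, 16)
```

The 16 x 16 attention matrix is what gives each location a global view of the image, while the concatenation with `skip` reintroduces the spatial detail that the token sequence alone does not carry.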
TransUNet is evaluated on the Synapse multi-organ segmentation dataset and the ACDC cardiac challenge, where it outperforms existing methods in terms of Dice score and Hausdorff distance. These results indicate that TransUNet effectively combines the strengths of CNNs and Transformers, achieving state-of-the-art performance in medical image segmentation.
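For readers unfamiliar with the two metrics named above, the following sketch shows one common way to compute them for a pair of binary 2D masks. It rests on several assumptions: the function names are hypothetical, distances are in pixel units, the Hausdorff distance is taken over all foreground pixels rather than surface points, and it is the plain maximum form, whereas segmentation benchmarks often report a 95th-percentile variant. It is not the evaluation code used in the paper.

```python
import numpy as np

def dice_score(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return 2.0 * np.logical_and(pred, target).sum() / denom

def hausdorff_distance(pred, target):
    """Symmetric Hausdorff distance between foreground pixel sets (pixel units)."""
    a = np.argwhere(pred)    # (Na, 2) coordinates of predicted foreground
    b = np.argwhere(target)  # (Nb, 2) coordinates of reference foreground
    if len(a) == 0 or len(b) == 0:
        return float("inf")  # undefined when one mask is empty; assumed convention
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # pairwise distances
    # Largest nearest-neighbour distance, taken in both directions.
    return max(d.min(axis=1).max(), d.min(axis=0).max())

# Two 3x3 squares offset by one pixel diagonally.
pred = np.zeros((6, 6), dtype=int)
target = np.zeros((6, 6), dtype=int)
pred[1:4, 1:4] = 1
target[2:5, 2:5] = 1

print(round(dice_score(pred, target), 4))          # 0.4444
print(round(hausdorff_distance(pred, target), 4))  # 1.4142
```

Higher Dice means more overlap between prediction and reference, while lower Hausdorff distance means the worst-case boundary mismatch is smaller, so the two metrics capture different kinds of error.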