TransFuse is an architecture that combines Transformers and CNNs for medical image segmentation. It addresses two limitations: the difficulty CNNs have in capturing long-range dependencies, and the inefficiency of deep networks at preserving local details. TransFuse uses a parallel-in-branch design in which a CNN branch and a Transformer branch process the input independently, and their features are fused by a BiFusion module. This allows both global context and low-level spatial details to be captured efficiently in a much shallower network. The BiFusion module selectively fuses multi-level features from the two branches using self-attention and a bilinear Hadamard product. Extensive experiments show that TransFuse achieves state-of-the-art results on 2D and 3D medical image datasets, including polyp, skin lesion, hip, and prostate segmentation, with a significant reduction in parameters and an improvement in inference speed. It is described as the first parallel-in-branch model that synthesizes CNN and Transformer, and the architecture is flexible enough to be applied to various medical image segmentation tasks. Ablation studies show that both the parallel-in-branch design and the BiFusion module contribute to the model's effectiveness. Overall, the results indicate that TransFuse is a promising approach for medical image segmentation.
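
As a rough illustration of the fusion step, the NumPy sketch below combines a Transformer-branch feature map and a CNN-branch feature map of the same spatial size. The summary only states that BiFusion uses self-attention and a bilinear Hadamard product. Everything more specific here is an assumption made for illustration, not the authors' implementation: the channel gate on the Transformer features, the spatial gate on the CNN features, the 1x1 projections, the random weights, and all function and parameter names.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def pointwise(x, w):
    """1x1 convolution: mix channels at every pixel. x: (C_in, H, W), w: (C_out, C_in)."""
    return np.tensordot(w, x, axes=1)  # -> (C_out, H, W)


def channel_attention(t, w1, w2):
    """Assumed channel gate (squeeze-and-excitation style). t: (C, H, W)."""
    squeezed = t.mean(axis=(1, 2))           # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)  # ReLU bottleneck
    gate = sigmoid(w2 @ hidden)              # one weight in (0, 1) per channel
    return t * gate[:, None, None]


def spatial_attention(g, w):
    """Assumed spatial gate built from per-pixel channel mean and max. g: (C, H, W), w: (2,)."""
    pooled = np.stack([g.mean(axis=0), g.max(axis=0)])  # (2, H, W)
    gate = sigmoid(np.tensordot(w, pooled, axes=1))     # one weight in (0, 1) per pixel
    return g * gate[None, :, :]


def bifusion_sketch(t, g, params):
    """Fuse Transformer features t and CNN features g (hypothetical simplification)."""
    t_hat = channel_attention(t, params["se_w1"], params["se_w2"])
    g_hat = spatial_attention(g, params["sp_w"])
    # Bilinear Hadamard product: project both branches to a shared width,
    # then multiply element-wise so each output models a cross-branch interaction.
    b = pointwise(t, params["proj_t"]) * pointwise(g, params["proj_g"])
    fused = np.concatenate([t_hat, g_hat, b], axis=0)
    return pointwise(fused, params["out_w"])


# Toy shapes: 8 Transformer channels, 6 CNN channels, a 4x4 feature map.
rng = np.random.default_rng(0)
C_T, C_G, C_B, C_OUT, H, W = 8, 6, 4, 5, 4, 4
t = rng.standard_normal((C_T, H, W))
g = rng.standard_normal((C_G, H, W))
params = {
    "se_w1": rng.standard_normal((C_T // 2, C_T)),
    "se_w2": rng.standard_normal((C_T, C_T // 2)),
    "sp_w": rng.standard_normal(2),
    "proj_t": rng.standard_normal((C_B, C_T)),
    "proj_g": rng.standard_normal((C_B, C_G)),
    "out_w": rng.standard_normal((C_OUT, C_T + C_G + C_B)),
}
out = bifusion_sketch(t, g, params)
print(out.shape)
```

In this sketch the two gates only rescale each branch's own features, while the Hadamard term is the only place where the branches interact directly. A multi-level variant would apply the same fusion at several resolutions.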