LiViT-Net: A U-Net-like, lightweight Transformer network for retinal vessel segmentation

2024 | Le Tong, Tianjiu Li, Qian Zhang, Qin Zhang, Renchaoli Zhu, Wei Du, Pengwei Hu
LiViT-Net is a lightweight, U-Net-like Transformer network for retinal vessel segmentation. It integrates a MobileViT+ block with a novel local representation to capture intricate image structures efficiently while keeping the model lightweight. A joint loss function combining weighted cross-entropy and Dice loss is designed to address foreground-background imbalance and the complexity of vascular structures.

The architecture combines CNNs and Transformers, pairing a lightweight convolutional module with the MobileViT+ block for efficient feature extraction. The joint loss balances pixel-wise accuracy against class equilibrium; a mixing ratio of α/β = 0.8/0.2 gives the best trade-off between precision and class balance.

The model is evaluated on three widely used retinal image databases (DRIVE, CHASEDB1, HRF), demonstrating robustness and generalizability. LiViT-Net outperforms competing methods in complex scenarios, particularly around fine vessels and vessel edges, and it achieves this with fewer parameters and lower computational demands, making it well suited to edge devices with constrained computational power. A freely accessible website demonstrates its real-time performance. Ablation studies confirm the contributions of the MobileViT+ block and the joint loss, each improving segmentation accuracy.

Future work includes exploring advanced optimization techniques and adapting the model to other medical imaging tasks. The study is supported by the National Natural Science Foundation of China.
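To make the joint loss concrete, below is a minimal PyTorch sketch of a weighted cross-entropy plus Dice loss mixed at the reported ratio α/β = 0.8/0.2. Only the combination and the ratio come from the summary; the specific foreground weight (pos_weight), the smoothing constant, and the exact Dice formulation are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class JointLoss(nn.Module):
    """Weighted cross-entropy + Dice loss, mixed at alpha/beta = 0.8/0.2."""

    def __init__(self, alpha=0.8, beta=0.2, pos_weight=5.0, smooth=1.0):
        super().__init__()
        self.alpha = alpha
        self.beta = beta
        # pos_weight up-weights the sparse vessel (foreground) class to counter
        # foreground-background imbalance; the value 5.0 is an assumption.
        self.register_buffer("pos_weight", torch.tensor(pos_weight))
        self.smooth = smooth

    def forward(self, logits, target):
        # logits, target: (N, 1, H, W), target values in {0, 1}
        wce = F.binary_cross_entropy_with_logits(
            logits, target, pos_weight=self.pos_weight)

        prob = torch.sigmoid(logits)
        inter = (prob * target).sum(dim=(1, 2, 3))
        denom = prob.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
        dice = 1.0 - (2.0 * inter + self.smooth) / (denom + self.smooth)

        return self.alpha * wce + self.beta * dice.mean()
```

In practice this would be applied to the network's raw segmentation logits against binary vessel masks; the cross-entropy term drives per-pixel precision while the Dice term keeps the thin-vessel foreground from being dominated by background.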