25 Apr 2024 | Xiang He, Weiyi Song, Yiming Wang, Fabio Poiesi, Ji Yi, Manishi Desai, Quanqing Xu, Kongzheng Yang, and Yi Wan
This paper proposes LightReSeg, a lightweight network for retinal layer segmentation in optical coherence tomography (OCT) images. The method addresses the challenge of accurate segmentation under low contrast and blood-flow noise in OCT images while keeping the model compact enough for clinical deployment. LightReSeg uses an encoder-decoder structure with a multi-scale feature extractor and a Transformer block to strengthen global reasoning; the decoder incorporates a multi-scale asymmetric attention (MAA) module to preserve semantic information at each encoder scale. With only 3.3M parameters, the model achieves state-of-the-art segmentation performance, outperforming existing methods such as TransUnet on multiple datasets.

The approach is evaluated on three datasets: Vis-105H (healthy eyes), Glaucoma, and DME. On quantitative metrics, namely Dice similarity coefficient (DSC), mean Intersection over Union (mIoU), pixel accuracy (PA), and mean pixel accuracy (mPA), LightReSeg achieves the best results across the board, and qualitative analysis confirms its superior segmentation accuracy and reduced false positives. Ablation studies demonstrate the contribution of both the MAA module and the Transformer block. The lightweight design also yields efficient inference, with per-image inference times of 0.11 s, 0.27 s, and 0.07 s on the three datasets. The method handles noise and uncertainty in OCT images well, and its performance is robust across datasets. The paper concludes that LightReSeg offers a practical, accurate, and efficient solution for retinal layer segmentation.
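For readers unfamiliar with the reported metrics, here is a minimal pure-Python sketch of how DSC, mIoU, and PA are typically computed from flattened label maps. The function names and the tiny example masks are ours for illustration, not from the paper, which does not specify its evaluation code.

```python
def confusion_counts(pred, gt, num_classes):
    """Per-class intersection, predicted-pixel count, and ground-truth pixel count."""
    inter = [0] * num_classes    # pixels where pred == gt == c
    pred_sz = [0] * num_classes  # pixels predicted as class c
    gt_sz = [0] * num_classes    # pixels labeled as class c
    for p, g in zip(pred, gt):
        pred_sz[p] += 1
        gt_sz[g] += 1
        if p == g:
            inter[p] += 1
    return inter, pred_sz, gt_sz

def dice_miou_pa(pred, gt, num_classes):
    """Mean Dice (DSC), mean IoU, and pixel accuracy over flattened label maps."""
    inter, ps, gs = confusion_counts(pred, gt, num_classes)
    # Dice per class: 2|A ∩ B| / (|A| + |B|); skip classes absent from both maps.
    dice = [2 * i / (p + g) for i, p, g in zip(inter, ps, gs) if p + g > 0]
    # IoU per class: |A ∩ B| / |A ∪ B|, with |A ∪ B| = |A| + |B| - |A ∩ B|.
    iou = [i / (p + g - i) for i, p, g in zip(inter, ps, gs) if p + g - i > 0]
    pa = sum(inter) / len(pred)  # fraction of correctly labeled pixels
    return sum(dice) / len(dice), sum(iou) / len(iou), pa

# Toy 6-pixel example with 3 layer classes (hypothetical values).
pred = [0, 0, 1, 1, 2, 2]
gt   = [0, 0, 1, 2, 2, 2]
dsc, miou, pa = dice_miou_pa(pred, gt, 3)
```

Averaging the per-class scores (rather than pooling pixels globally) is what distinguishes mIoU and mPA from plain PA: it keeps thin retinal layers from being drowned out by large background regions.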