28 Jul 2024 | Zeyu Zhang, Xuyin Qi, Mingxi Chen, Guangxi Li, Ryan Pham, Ayub Qassim, Ella Berry, Zhibin Liao, Owen Siggs, Robert Mclaughlin, Jamie Craig, Minh-Son To
The paper "JointViT: Modeling Oxygen Saturation Levels with Joint Supervision on Long-Tailed OCTA" introduces a novel model called JointViT, which leverages the Vision Transformer architecture to predict oxygen saturation levels (SaO2) from Optical Coherence Tomography Angiography (OCTA) images. The key contributions of the paper are:
1. **JointViT Model**: A Vision Transformer-based model that incorporates a joint loss function to supervise both SaO2 categories and exact SaO2 values.
2. **Balancing Augmentation**: A technique used during data preprocessing to address the long-tailed distribution in the OCTA dataset, enhancing the model's performance.
3. **Performance Improvement**: Comprehensive experiments on the OCTA dataset demonstrate that JointViT significantly outperforms other state-of-the-art methods, achieving up to 12.28% improvement in overall accuracy.
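The balancing augmentation in item 2 can be pictured with a small sketch. This summary does not describe the paper's exact procedure, so the code below is only an illustration under stated assumptions: minority classes are oversampled with augmented copies until every class matches the largest one, and a horizontal flip stands in for whatever augmentations the authors actually use. The names `hflip` and `balance_by_augmentation` are hypothetical and do not come from the paper.

```python
import random
from collections import defaultdict


def hflip(image):
    """Placeholder augmentation (assumption): mirror a 2D image left to right."""
    return [row[::-1] for row in image]


def balance_by_augmentation(samples, augment=hflip, seed=0):
    """Oversample minority classes with augmented copies.

    samples: list of (image, label) pairs, where image is a 2D list.
    Returns a new list in which every label occurs as often as the
    most frequent label in the input.
    """
    rng = random.Random(seed)

    # Group images by label so each class can be topped up separately.
    by_class = defaultdict(list)
    for image, label in samples:
        by_class[label].append(image)

    target = max(len(images) for images in by_class.values())

    balanced = list(samples)
    for label, images in by_class.items():
        # Add augmented copies of randomly chosen originals until the
        # class reaches the size of the largest class.
        for _ in range(target - len(images)):
            balanced.append((augment(rng.choice(images)), label))
    return balanced
```

Balancing only the training split, after the train/test split has been made, keeps augmented copies of a test image from leaking into training.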
The paper highlights the importance of accurate SaO2 prediction in diagnosing sleep-related breathing disorders and microvascular dysfunction. The OCTA dataset is long-tailed: a few classes have many instances and many classes have few, which makes model training difficult. JointViT addresses this with its balancing augmentation technique and a joint loss function that combines binary cross-entropy loss for classification with mean squared error loss for regression.
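As a rough illustration of the joint loss described above, the following sketch combines a binary cross-entropy term over SaO2 category probabilities with a mean squared error term on the SaO2 value. It is not the authors' implementation. The mixing parameter `weight`, the function names, and the assumption that SaO2 values are scaled to [0, 1] are all hypothetical; this summary does not say how the paper weights the two terms.

```python
import math


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy between per-class probabilities and 0/1 targets."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(probs)


def mean_squared_error(preds, targets):
    """Mean squared error between predicted and true values."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


def joint_loss(class_probs, class_onehot, value_pred, value_true, weight=0.5):
    """Weighted sum of a classification term and a regression term.

    weight is a hypothetical mixing coefficient (assumption):
    1.0 uses only the classification term, 0.0 only the regression term.
    """
    cls_term = binary_cross_entropy(class_probs, class_onehot)
    reg_term = mean_squared_error([value_pred], [value_true])
    return weight * cls_term + (1.0 - weight) * reg_term
```

With `weight=1.0` the function reduces to plain classification supervision and with `weight=0.0` to plain regression, which makes it easy to check each term in isolation.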
The paper also discusses related works in medical imaging recognition, long-tailed image recognition, and the use of OCTA in AI for health. It evaluates JointViT using various datasets, including Prog-OCTA and Kermany v3, and conducts ablation studies to validate the effectiveness of the proposed techniques. The results show that JointViT not only achieves higher accuracy but also better handles long-tailed classes, making it a promising tool for future applications in sleep-related disorder diagnosis.