Generalizable Whole Slide Image Classification with Fine-Grained Visual-Semantic Interaction

5 Apr 2024 | Hao Li, Ying Chen, Yifei Chen, Wenxian Yang, Bowen Ding, Yuchen Han, Liansheng Wang, Rongshan Yu
This paper proposes FiVE, a framework for whole slide image (WSI) classification that improves model generalizability through fine-grained visual-semantic interaction. The method uses fine-grained pathological descriptions extracted from non-standardized pathology reports to help the model capture complex visual features. It combines task-specific fine-grained semantics (TFS) with a patch sampling strategy: the TFS module directs the model toward the most relevant visual information, while patch sampling reduces computational cost without significantly compromising accuracy.
Evaluated on the TCGA Lung Cancer dataset, FiVE outperforms existing methods in few-shot experiments, achieving 9.19% higher accuracy. It also performs strongly in zero-shot histological subtype classification, which indicates that it captures fine-grained pathological features. Together, these results suggest robust generalization and strong transferability, making the approach promising for WSI classification tasks.
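
To make the cost argument behind patch sampling concrete, here is a minimal sketch of one way a slide's patches could be subsampled before being passed to a model. The summary does not say how FiVE selects patches, so uniform random sampling without replacement is an assumption made purely for illustration, and the function name `sample_patches`, the `max_patches` parameter, and the patch identifiers are all hypothetical.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_patches(patches: Sequence[T], max_patches: int, seed: int = 0) -> List[T]:
    """Return at most `max_patches` items drawn uniformly without replacement.

    Uniform sampling is an assumption for illustration only; the summary
    does not specify the sampling rule used by FiVE.
    """
    if max_patches <= 0:
        raise ValueError("max_patches must be positive")
    if len(patches) <= max_patches:
        return list(patches)  # small slides are kept whole
    rng = random.Random(seed)  # seeded so the subset is reproducible
    # Sample indices, then sort them so the original patch order is preserved.
    indices = sorted(rng.sample(range(len(patches)), max_patches))
    return [patches[i] for i in indices]


# A hypothetical slide with 10,000 patch identifiers, reduced to 512.
slide = [f"patch_{i}" for i in range(10_000)]
subset = sample_patches(slide, max_patches=512, seed=42)
print(len(subset))
```

Under this assumption, downstream computation scales with `max_patches` rather than with the full patch count of the slide, which is the kind of saving the summary attributes to the sampling strategy.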