18 Jun 2024 | Zhengrui Guo, Jiabo Ma, Yingxue Xu, Yihui Wang, Liansheng Wang, and Hao Chen
**HistGen: Histopathology Report Generation via Local-Global Feature Encoding and Cross-modal Context Interaction**
**Authors:** Zhengrui Guo, Jiabo Ma, Yingxue Xu, Yihui Wang, Liansheng Wang, and Hao Chen
**Abstract:**
Histopathology is crucial in cancer diagnosis, and clinical reports play a vital role in interpreting results and guiding treatment. Automating histopathology report generation with deep learning can significantly enhance clinical efficiency and reduce the burden on pathologists. This work presents HistGen, a multiple instance learning (MIL)-empowered framework for histopathology report generation, together with the first benchmark dataset for evaluating this task. Inspired by diagnostic and report-writing workflows, HistGen features two modules: a local-global hierarchical encoder and a cross-modal context module. The local-global encoder efficiently aggregates visual features from a region-to-slide perspective, while the cross-modal context module facilitates alignment and interaction between visual sequences and corresponding reports. Experimental results show that HistGen outperforms state-of-the-art (SOTA) models in WSI report generation and demonstrates superior transfer learning capabilities in cancer subtyping and survival analysis tasks.
**Keywords:** Histopathology Report Generation, Multiple Instance Learning, Cross-Modal Alignment
**Introduction:**
Histopathology tissue analysis is essential for cancer diagnosis and prognosis. Computational pathology (CPath) has advanced this field, but the labor-intensive and time-consuming task of writing reports remains a challenge. HistGen addresses this by leveraging MIL and cross-modal interactions to generate whole slide image (WSI) reports. The framework comprises a local-global hierarchical encoder and a cross-modal context module, built on a feature extractor pre-trained on a large collection of WSIs. Extensive experiments validate the model's superior performance in WSI report generation and its strong transfer learning capabilities on downstream tasks.
**Method:**
- **WSI-Report Dataset Curation:** A benchmark dataset of 7,800 WSI-report pairs curated from the TCGA platform.
- **Local-Global Hierarchical Encoder (LGH):** Efficiently encodes and aggregates the extensive patch features of a WSI from a region-to-slide perspective.
- **Cross-Modal Context Module (CMC):** Enables alignment and interaction between visual encoding and textual decoding.
- **Pre-trained Feature Extractor:** A general-purpose MIL feature extractor pre-trained on over 55,000 WSIs.
- **Loss Function:** Trains the model to maximize the conditional probability of a report given its WSI, i.e., to minimize the negative log-likelihood of the report tokens.
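The two core ideas above can be sketched in simplified form: pool patch features within each region, pool the resulting region embeddings into a slide embedding, and train by minimizing the negative log-likelihood of the report tokens. This is a minimal illustration only; the function names, the plain dot-product attention scoring, and the fixed region partitioning are assumptions for clarity, not the paper's actual architecture.

```python
import math

def attention_pool(feats, w):
    """Softmax-weighted pooling: score each feature vector by its dot product with w."""
    scores = [sum(fi * wi for fi, wi in zip(f, w)) for f in feats]
    m = max(scores)                                  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    alphas = [e / z for e in exps]                   # attention weights sum to 1
    dim = len(feats[0])
    return [sum(a * f[d] for a, f in zip(alphas, feats)) for d in range(dim)]

def local_global_encode(patch_feats, region_size, w_local, w_global):
    """Region-to-slide encoding: pool patches per region, then pool region embeddings."""
    regions = [patch_feats[i:i + region_size]
               for i in range(0, len(patch_feats), region_size)]
    region_embs = [attention_pool(r, w_local) for r in regions]   # local step
    return attention_pool(region_embs, w_global)                  # global step

def report_nll(step_probs, target_tokens):
    """Negative log-likelihood of the report: minimizing this maximizes P(report | WSI).

    step_probs[t] is the decoder's probability distribution at step t (conditioned on
    the WSI encoding and previous tokens); target_tokens[t] is the ground-truth token id.
    """
    return -sum(math.log(p[t]) for p, t in zip(step_probs, target_tokens))
```

With zero weight vectors, `attention_pool` reduces to mean pooling, so the hierarchy degenerates to a mean over region means; in the real model, learned attention lets informative regions dominate the slide embedding.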
**Experiments:**
- **WSI Report Generation:** HistGen outperforms SOTA models in NLG metrics.
- **Ablation Studies:** Confirm the effectiveness of each proposed module.
- **Transfer Learning for Cancer Diagnosis and Prognosis:** HistGen achieves superior performance on cancer subtyping and survival analysis tasks.
**Conclusion:**
HistGen is a MIL-empowered framework for automated histopathology report generation, demonstrating strong performance and transfer learning capabilities. Future work will expand to other fields like radiology and ophthalmology.