A whole-slide foundation model for digital pathology from real-world data

6 June 2024 | Hanwen Xu, Naoto Usuyama, Jaspreet Bagga, Sheng Zhang, Rajesh Rao, Tristan Naumann, Cliff Wong, Zelalem Gero, Javier González, Yu Gu, Yanbo Xu, Mu Wei, Wenhui Wang, Shuming Ma, Furui Wei, Jianwei Yang, Chunyuan Li, Jianfeng Gao, Jaylen Rosemond, Tucker Bower, Soohee Lee, Rosanthi Weerasinghe, Bill J. Wright, Ari Robicsek, Brian Piening, Carlo Bifulco, Sheng Wang & Hoifung Poon
Prov-GigaPath is an open-weight whole-slide pathology foundation model pretrained on 1.3 billion 256×256 pathology image tiles from 171,189 whole slides, covering more than 30,000 patients and 31 major tissue types. It was developed to address two challenges in digital pathology: the need for slide-level context, and the limitations of existing models that subsample tiles from each slide instead of modelling the whole slide. The pretraining dataset, Prov-Path, is a large real-world dataset that is more than five times larger than TCGA in terms of image tiles and more than two times larger in terms of patients.

Prov-GigaPath uses a novel vision transformer architecture, GigaPath, with two components: a tile encoder that captures local features and a slide encoder that captures global features. The slide encoder adapts the LongNet method for ultra-large-context modelling, because a whole slide is represented as a very long sequence of tiles. The pretraining strategy, which uses DINOv2 and masked autoencoders, was shown to be effective.

The model was fine-tuned and evaluated on 26 downstream tasks, comprising 9 cancer subtyping tasks and 17 pathomics tasks such as mutation prediction. It achieved state-of-the-art performance on 25 of the 26 tasks, outperforming existing models across multiple cancer types, and its performance was further validated on new data, demonstrating its generalizability. By incorporating pathology reports, Prov-GigaPath also demonstrated potential in vision-language pretraining, with significant improvements in zero-shot subtyping and mutation prediction that point to its potential for multimodal integrative data analysis.

Overall, Prov-GigaPath represents a significant advance in digital pathology. It highlights the importance of real-world data and whole-slide modelling, and it has the potential to assist clinical diagnostics and decision support.
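To make the scale behind "ultra-large-context modelling" concrete, the following is a minimal Python sketch, not the authors' code. It derives the average number of tiles per slide from the figures quoted above, lays out a grid of 256×256 tiles over a slide, and compares the number of query-key pairs in full self-attention with a single dilated-attention setting of the kind LongNet is built on. The function names, the `segment` and `dilation` values, and the absence of any background filtering are all assumptions made for illustration. LongNet itself combines several segment and dilation settings, whereas this sketch uses only one.

```python
import math

TILE_SIZE = 256  # tile edge in pixels, as stated in the summary


def tile_grid(width_px: int, height_px: int, tile: int = TILE_SIZE) -> list[tuple[int, int]]:
    """Top-left (x, y) corners of non-overlapping tiles covering a slide.

    Hypothetical helper: a real pipeline would also discard background tiles.
    """
    cols = math.ceil(width_px / tile)
    rows = math.ceil(height_px / tile)
    return [(c * tile, r * tile) for r in range(rows) for c in range(cols)]


def full_attention_pairs(n: int) -> int:
    """Query-key pairs when every token attends to every token."""
    return n * n


def dilated_attention_pairs(n: int, segment: int, dilation: int) -> int:
    """Query-key pairs for one dilated-attention setting (illustrative).

    The sequence is split into segments of length `segment`; inside each
    segment only every `dilation`-th token is kept, and the kept tokens
    attend to each other.
    """
    pairs = 0
    for start in range(0, n, segment):
        seg_len = min(segment, n - start)
        kept = math.ceil(seg_len / dilation)
        pairs += kept * kept
    return pairs


# Average tiles per slide, from the figures quoted in the summary.
avg_tiles = 1_300_000_000 / 171_189

if __name__ == "__main__":
    n = 8192  # assumed sequence length, chosen close to the average above
    print(f"average tiles per slide: {avg_tiles:.0f}")
    print(f"full attention pairs:    {full_attention_pairs(n)}")
    print(f"dilated attention pairs: {dilated_attention_pairs(n, 1024, 4)}")
```

On average a slide in the pretraining set contributes several thousand tiles, so a slide encoder that attends over all of them at once faces a cost that grows quadratically with sequence length. Restricting attention to sparsified segments, as in the sketch, reduces that cost substantially for the same sequence.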