28 May 2024 | Ming Y. Lu, Bowen Chen, Drew F. K. Williamson, Richard J. Chen, Melissa Zhao, Aaron K. Chow, Kenji Ikemura, Ahrong Kim, Dimitra Pouli, Ankush Patel, Amr Soliman, Chengkuan Chen, Tong Ding, Judy J. Wang, Georg Gerber, Ivy Liang, Long Phi Le, Anil V. Parwani, Luca L. Weishaupt & Faisal Mahmood
This article presents PathChat, a multimodal generative AI copilot designed for human pathology. PathChat is built by adapting a foundational vision encoder for pathology, combining it with a pre-trained large language model, and fine-tuning the whole system on over 456,000 diverse visual language instructions. The authors compare PathChat against several other multimodal vision-language AI assistants and against GPT-4V, the model that powers ChatGPT-4. PathChat achieves state-of-the-art performance on multiple-choice diagnostic questions from diverse tissue origins and disease models. In addition, on open-ended questions assessed by human experts, PathChat produces more accurate and pathologist-preferable responses to a range of pathology-related queries. The authors highlight potential applications of PathChat in pathology education, research, and human-in-the-loop clinical decision-making. The article also discusses limitations and future directions for improving PathChat, including further alignment with human intent and support for larger image inputs.