A multimodal generative AI copilot for human pathology

This article presents PathChat, a multimodal generative AI copilot designed for human pathology. PathChat is built by adapting a foundational vision encoder for pathology, combining it with a pre-trained large language model, and fine-tuning the whole system on over 456,000 diverse visual-language instructions. The authors compare PathChat against several other multimodal vision-language AI assistants and GPT-4V, the AI assistant powering ChatGPT-4. PathChat achieves state-of-the-art performance on multiple-choice diagnostic questions drawn from cases with diverse tissue origins and disease models. In addition, on open-ended questions evaluated by human experts, PathChat produces more accurate and pathologist-preferable responses to a range of pathology-related queries. The authors highlight potential applications of PathChat in pathology education, research, and human-in-the-loop clinical decision-making. The article also discusses limitations and future directions for improving PathChat, including further alignment with human intent and support for larger image inputs.
