CheXagent: Towards a Foundation Model for Chest X-Ray Interpretation

22 Jan 2024 | Zhihong Chen, Maya Varma, Jean-Benoit Delbrouck, Magdalini Paschali, Louis Blankemeier, Dave Van Veen, Jeya Maria Jose Valanarasu, Alaa Youssef, Joseph Paul Cohen, Eduardo Pontes Reis, Emily B. Tsai, Andrew Johnston, Cameron Olsen, Tanishq Mathew Abraham, Sergios Gatidis, Akshay S. Chaudhari, Curtis Langlotz
CheXagent is a foundation model designed for chest X-ray (CXR) interpretation, addressing challenges in developing vision-language foundation models (FMs) for medical imaging. The model is built using CheXinstruct, a large-scale instruction-tuning dataset curated from 28 publicly available datasets, and evaluated with CheXbench, a benchmark covering eight clinically relevant CXR interpretation tasks. CheXagent combines a clinical large language model (LLM) trained to parse radiology reports, a vision encoder for CXR images, and a bridging network that connects the vision and language modalities. It outperforms existing general- and medical-domain FMs on CheXbench, achieving significant improvements in both image perception and text generation. The model is also evaluated for fairness across demographic factors such as sex, race, and age, highlighting potential performance disparities. Human evaluations show that CheXagent is comparable to physicians in report summarization but lags behind in report generation. The project publicly releases CheXinstruct, CheXagent, and CheXbench, providing a comprehensive framework for advancing medical AI and contributing to more transparent and equitable models in healthcare.
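To make the three-component design concrete, here is a minimal PyTorch sketch of how a vision encoder, a bridging network, and an LLM might be wired together. All class names, dimensions, and the stand-in modules are hypothetical illustrations, not taken from the CheXagent codebase; in the actual model, each component is a trained network rather than the toy layers used here.

```python
import torch
import torch.nn as nn

class CXRVisionLanguageModel(nn.Module):
    """Hypothetical sketch of the three-component design:
    vision encoder -> bridging projection -> language model.
    Names and dimensions are illustrative only."""

    def __init__(self, vision_encoder: nn.Module, llm: nn.Module,
                 vision_dim: int, llm_dim: int):
        super().__init__()
        self.vision_encoder = vision_encoder
        # Bridging network: maps image features into the LLM's token space.
        self.bridge = nn.Linear(vision_dim, llm_dim)
        self.llm = llm

    def forward(self, images: torch.Tensor, text_embeds: torch.Tensor) -> torch.Tensor:
        img_feats = self.vision_encoder(images)      # (B, N, vision_dim)
        img_tokens = self.bridge(img_feats)          # (B, N, llm_dim)
        # Visual tokens are prepended to the text embeddings before decoding.
        return self.llm(torch.cat([img_tokens, text_embeds], dim=1))

# Toy stand-ins so the sketch runs end to end (not real CXR components).
vision = nn.Sequential(nn.Flatten(2), nn.Linear(16 * 16, 256))  # fake patch encoder
llm = nn.Linear(512, 512)                                        # stand-in decoder
model = CXRVisionLanguageModel(vision, llm, vision_dim=256, llm_dim=512)

images = torch.randn(2, 3, 16, 16)       # tiny fake "CXR" batch
text_embeds = torch.randn(2, 5, 512)      # fake instruction-token embeddings
out = model(images, text_embeds)          # (2, 8, 512)
```

The key idea this sketch captures is that the bridging network lets a language model consume image features as if they were ordinary token embeddings, so instruction tuning on paired image-text data can train all three pieces jointly.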