ChatGPT Vision for Radiological Interpretation: An Investigation Using Medical School Radiology Examinations

2024 | Hyungjin Kim, Paul Kim, Ijin Joo, Jung Hoon Kim, Chang Min Park, Soon Ho Yoon
This study evaluates ChatGPT's ability to interpret radiological images using medical school radiology examinations. ChatGPT, a foundation model based on the transformer architecture, has shown potential in tasks such as data mining from free-text radiology reports, structured reporting, and answering clinical questions, but its performance in radiological image analysis had remained unexplored.

The study used GPT-4-1106-vision-preview to take the radiology examinations given to third-year medical students at Seoul National University College of Medicine in 2018-2020. The examinations, written in Korean, consisted of multiple-choice questions with text and image-based options. Because the questions were not publicly available, they were unlikely to have been included in GPT-4's training data. ChatGPT was tested three times per academic year, with each session covering the same set of questions.

ChatGPT scored lower than the medical students in all three years, with scores in the bottom percentiles. It performed worse on image-based questions than on text-only questions. A radiologist rated ChatGPT's image interpretations on a 5-point scale and judged 42% of them to be poor or very poor. Consistency across the three sessions was moderate, with 69% of cases receiving consistent answers.

The study concludes that the tested version of ChatGPT with vision capabilities shows potential but underperforms in radiological interpretation, and that further improvement is needed before it can be relied on in clinical radiology. The study's limitations include reliance on a single institution's examinations and possible language barriers arising from the use of Korean questions.
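
The two session-level measures mentioned above, the score per session and the consistency of answers across three sessions, can be illustrated with a minimal sketch. It is not the authors' analysis code. It assumes that a "consistent" case means the same option was chosen in all three sessions, which the summary does not specify. The function names, answer key, and responses are hypothetical toy data.

```python
def session_accuracy(answers, key):
    """Fraction of questions answered correctly in a single session."""
    if len(answers) != len(key):
        raise ValueError("answers and key must have the same length")
    correct = sum(a == k for a, k in zip(answers, key))
    return correct / len(key)


def consistency_rate(sessions):
    """Fraction of questions that received the same answer in every session.

    Assumption: "consistent" means identical answers across all sessions.
    """
    per_question = list(zip(*sessions))  # regroup answers by question
    consistent = sum(len(set(answers)) == 1 for answers in per_question)
    return consistent / len(per_question)


# Hypothetical toy data: 5 multiple-choice questions, 3 repeated sessions.
answer_key = ["A", "C", "B", "D", "A"]
sessions = [
    ["A", "C", "D", "D", "B"],
    ["A", "B", "D", "D", "B"],
    ["A", "C", "D", "D", "A"],
]

if __name__ == "__main__":
    for i, answers in enumerate(sessions, start=1):
        print(f"session {i} accuracy: {session_accuracy(answers, answer_key):.2f}")
    print(f"consistency across sessions: {consistency_rate(sessions):.2f}")
```

Note that consistency is independent of correctness: in the toy data, question 3 is answered identically in all three sessions yet is wrong each time.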