CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis

18 Jul 2024 | Junying Chen, Chi Gui, Anningzhe Gao, Ke Ji, Xidong Wang, Xiang Wan, Benyou Wang
The paper introduces Chain-of-Diagnosis (CoD) to enhance the interpretability of large language models (LLMs) in medical diagnostics. CoD transforms the diagnostic process into a transparent chain of reasoning, mirroring a physician's thought process, and outputs disease confidence distributions to ensure transparency and controllability in decision-making. The authors developed DiagnosisGPT, an LLM capable of diagnosing 9,604 diseases, which outperforms other LLMs on diagnostic benchmarks. CoD addresses the challenge of symptom inquiry by selecting crucial symptoms to reduce diagnostic uncertainty, and it uses synthetic patient cases generated from disease encyclopedias to avoid privacy concerns. Experiments demonstrate that DiagnosisGPT achieves high accuracy and reduces the number of inquiries, making it a promising tool for automated medical diagnosis. The paper also presents DxBench, a new diagnostic benchmark with 1,148 real cases, and discusses the limitations and future directions of the approach.
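
To make the idea of "selecting crucial symptoms to reduce diagnostic uncertainty" more concrete, here is a minimal Python sketch. It is not the authors' implementation: in the paper the confidence distribution comes from the LLM, whereas this toy version assumes a hypothetical table of symptom likelihoods per disease and a simple Bayes update. The function names, the `threshold` parameter, and the example numbers are all illustrative assumptions. The sketch either commits to a diagnosis once the top confidence clears the threshold, or asks about the symptom expected to lower the entropy of the distribution the most.

```python
import math


def entropy(confidences):
    """Shannon entropy (bits) of a disease confidence distribution."""
    return -sum(p * math.log2(p) for p in confidences.values() if p > 0)


def posterior(confidences, likelihoods, symptom, present):
    """Bayes update of the confidences after a yes/no answer about one symptom.

    `likelihoods[disease][symptom]` is an assumed P(symptom | disease);
    the paper obtains confidences from the LLM instead of such a table.
    """
    weighted = {}
    for disease, prior in confidences.items():
        p_yes = likelihoods[disease].get(symptom, 0.0)
        weighted[disease] = prior * (p_yes if present else 1.0 - p_yes)
    total = sum(weighted.values())
    if total == 0:
        # The answer is impossible under every disease; keep the prior.
        return dict(confidences)
    return {d: w / total for d, w in weighted.items()}


def expected_entropy(confidences, likelihoods, symptom):
    """Expected entropy of the distribution after asking about `symptom`."""
    p_yes = sum(
        confidences[d] * likelihoods[d].get(symptom, 0.0) for d in confidences
    )
    h_yes = entropy(posterior(confidences, likelihoods, symptom, True))
    h_no = entropy(posterior(confidences, likelihoods, symptom, False))
    return p_yes * h_yes + (1.0 - p_yes) * h_no


def next_step(confidences, likelihoods, candidate_symptoms, threshold=0.6):
    """Diagnose if the top confidence reaches `threshold`, else pick a symptom to ask."""
    top_disease = max(confidences, key=confidences.get)
    if confidences[top_disease] >= threshold or not candidate_symptoms:
        return ("diagnose", top_disease)
    best = min(
        candidate_symptoms,
        key=lambda s: expected_entropy(confidences, likelihoods, s),
    )
    return ("ask", best)


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the paper.
    confidences = {"flu": 0.5, "cold": 0.5}
    likelihoods = {
        "flu": {"fever": 0.9, "cough": 0.8},
        "cold": {"fever": 0.1, "cough": 0.8},
    }
    print(next_step(confidences, likelihoods, ["fever", "cough"]))
    updated = posterior(confidences, likelihoods, "fever", True)
    print(next_step(updated, likelihoods, ["cough"]))
```

In this toy case, asking about cough cannot separate the two diseases because both produce it equally often, so the sketch prefers fever. Raising `threshold` makes the agent ask more questions before committing, which is the kind of accuracy versus inquiry-count trade-off the summary describes as controllability.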