Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents

May 2024 | Junkai Li, Siyu Wang, Meng Zhang, Weitao Li, Yunghwei Lai, Xinhui Kang, Weizhi Ma, and Yang Liu
Agent Hospital is a simulated hospital in which patients, nurses, and doctors are autonomous agents powered by large language models (LLMs). The simulation covers the entire cycle of treating a patient's illness: disease onset, triage, registration, consultation, medical examination, diagnosis, medicine dispensary, convalescence, and post-hospital follow-up. The environment includes 16 distinct areas, such as triage stations, consultation rooms, and examination rooms. The agents fall into two groups: medical professional agents (doctors and nurses) and resident agents (potential patients).

The paper introduces MedAgent-Zero, a method that trains doctor agents through simulated doctor-patient interactions inside this environment. Doctor agents accumulate experience from both successful and unsuccessful cases and reflect on it, so they keep improving without manually labeled data. The methodology defines medical tasks such as examination decision, diagnosis, and treatment plan, and it uses two datasets: a simulated medical dataset and a medical document dataset.

In the simulation experiments, the treatment performance of doctor agents consistently improves across tasks, including diagnosis and treatment recommendation. The knowledge acquired in Agent Hospital also carries over to real-world medical benchmarks: after treating around ten thousand patients, the evolved doctor agent achieves a state-of-the-art accuracy of 93.06% on a subset of the MedQA dataset that covers major respiratory diseases, again without manually labeled data. These results indicate that the simulation environment can effectively support the evolution of LLM agents on specific tasks.

The main contributions of this work are: (1) the first hospital simulacrum that comprehensively reflects the entire medical process with excellent scalability, making it a valuable platform for the study of medical LLMs/agents; (2) the MedAgent-Zero strategy, which enables the self-evolution of medical agents without manually labeled data; and (3) the demonstration that MedAgent-Zero can handle tens of thousands of cases within several days, a caseload that would take a real-world doctor several years to accumulate.
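
The idea of learning from both successful and unsuccessful cases can be illustrated with a toy sketch. This is not the paper's implementation: there is no LLM here, and every name (Case, DoctorAgent, case_library, lessons, run_round) is hypothetical. It assumes that the simulator knows the true disease for each generated patient, so feedback needs no human labeling. Correctly handled cases are stored for later retrieval, and failed cases are stored as corrected lessons.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    symptoms: frozenset  # presentation of a simulated patient
    disease: str         # ground truth known to the simulator (assumption)


def jaccard(a: frozenset, b: frozenset) -> float:
    """Overlap between two symptom sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


@dataclass
class DoctorAgent:
    case_library: list = field(default_factory=list)  # successful cases
    lessons: dict = field(default_factory=dict)       # corrections from failures

    def diagnose(self, symptoms: frozenset) -> str:
        # Prefer a lesson recorded after an earlier mistake on this presentation.
        if symptoms in self.lessons:
            return self.lessons[symptoms]
        # Otherwise reuse the most similar successful case, if any overlaps.
        best = max(self.case_library,
                   key=lambda c: jaccard(c.symptoms, symptoms),
                   default=None)
        if best is not None and jaccard(best.symptoms, symptoms) > 0:
            return best.disease
        return "unknown"

    def treat(self, case: Case) -> bool:
        correct = self.diagnose(case.symptoms) == case.disease
        if correct:
            self.case_library.append(case)             # keep the success
        else:
            self.lessons[case.symptoms] = case.disease  # reflect on the failure
        return correct


def run_round(doctor: DoctorAgent, cases: list) -> float:
    """Treat every case once and return the accuracy for this round."""
    return sum(doctor.treat(c) for c in cases) / len(cases)


if __name__ == "__main__":
    cases = [
        Case(frozenset({"cough", "fever"}), "influenza"),
        Case(frozenset({"wheeze", "dyspnea"}), "asthma"),
    ]
    doctor = DoctorAgent()
    for round_no in (1, 2):
        print(round_no, run_round(doctor, cases))
```

In this toy setting the agent starts with no knowledge, so accuracy rises only because experience accumulates between rounds. In the system the summary describes, diagnosis and reflection are carried out by LLM-powered agents, not by set overlap.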