28 Jun 2024 | Junda Wang, Zhichao Yang, Zonghai Yao, Hong Yu
This paper introduces JMLR, a joint training method for Large Language Models (LLMs) and information retrieval, aimed at enhancing reasoning and professional medical question-answering capabilities. JMLR trains the LLM and the retriever simultaneously during the fine-tuning phase, which improves the model's ability to retrieve relevant clinical guidelines and leverage medical knowledge for accurate reasoning and answering. This synchronized training mechanism also reduces the computational resources required.

The method was evaluated on standard medical question-answering benchmarks, where JMLR-13B outperformed previous state-of-the-art models, including Meditron-70B and Llama2-13B with RAG, reaching 70.5% accuracy on a medical question-answering dataset. Comprehensive evaluations showed that JMLR-13B produces higher-quality reasoning and fewer hallucinations than comparison models, including Claude3-Opus. JMLR-13B is also far cheaper to train than Meditron-70B, requiring only 148 GPU hours versus 42,630 GPU hours, roughly a 288-fold reduction. In addition, JMLR improves domain-specific retrieval without requiring additional human annotation.

The results indicate that joint training of LLMs and retrievers improves performance on medical question-answering tasks, particularly in scenarios that require nuanced understanding and retrieval of specific information. The paper also discusses the limitations and ethical considerations of the approach, including domain specificity, annotator expertise, privacy implications, and potential biases.
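To make the joint-training idea concrete, below is a minimal PyTorch-style sketch of one training step. It assumes an Atlas/REPLUG-style objective, in which the LLM is trained on a retrieval-weighted marginal likelihood while a KL term pulls the retriever toward the documents that most improve the LLM's answer likelihood. The method names `encode_query` and `encode_docs` and the exact loss composition are illustrative assumptions, not the paper's published implementation.

```python
# Hypothetical sketch of one joint LLM + retriever training step.
# Assumes `lm` is a HuggingFace-style causal LM (returns .loss when
# given labels) and `retriever` is a dual encoder with the assumed
# methods encode_query / encode_docs. The optimizer holds the
# parameters of BOTH models, so one step updates them together.

import torch
import torch.nn.functional as F

def joint_step(lm, retriever, question_ids, answer_ids, doc_ids_list, optimizer):
    # Retriever scores each candidate document against the question.
    q_emb = retriever.encode_query(question_ids)      # shape (d,)
    d_embs = retriever.encode_docs(doc_ids_list)      # shape (k, d)
    retr_logprobs = F.log_softmax(d_embs @ q_emb, dim=0)  # (k,)

    # Per-document answer log-likelihood under the LM (mean over
    # answer tokens, used here as a proxy score for each document).
    lm_scores = []
    for doc_ids in doc_ids_list:
        input_ids = torch.cat([doc_ids, question_ids, answer_ids])
        labels = input_ids.clone()
        labels[: len(doc_ids) + len(question_ids)] = -100  # loss on answer only
        out = lm(input_ids.unsqueeze(0), labels=labels.unsqueeze(0))
        lm_scores.append(-out.loss)                        # higher = better doc
    lm_scores = torch.stack(lm_scores)

    # LM loss: negative log of the retrieval-weighted marginal likelihood.
    lm_loss = -torch.logsumexp(retr_logprobs + lm_scores, dim=0)

    # Retriever loss: KL from the LM-induced document distribution
    # (detached, so it acts as a fixed target) to the retriever's.
    target = F.softmax(lm_scores.detach(), dim=0)
    retr_loss = F.kl_div(retr_logprobs, target, reduction="sum")

    loss = lm_loss + retr_loss
    optimizer.zero_grad()
    loss.backward()   # gradients flow to both the LM and the retriever
    optimizer.step()
    return loss.item()
```

The property this sketch shares with JMLR is that gradients reach both components in the same step, so the retriever learns which documents the LLM actually finds useful instead of depending on separately annotated relevance labels.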
Overall, the study highlights the potential of integrating retrieval and LLM training for medical question-answering systems, with the goal of improving accuracy, reliability, and efficiency in healthcare applications.