3 May 2024 | Ashwin Kumar Gururajan, Enrique Lopez-Cuenca, Jordi Bayarri-Planas, Adrian Tormos, Daniel Hinjos, Pablo Bernabeu-Perez, Anna Arias-Duart, Pablo Agustin Martin-Torres, Lucia Urcelay-Ganzabal, Marta Gonzalez-Mallo, Sergio Alvarez-Napagao, Eduard Ayguadé-Parra, Ulises Cortés, Dario Garcia-Gasulla
The paper introduces the Aloe family, a set of open-source medical Large Language Models (LLMs) designed to improve healthcare and medicine. Aloe models are trained on state-of-the-art base models (Mistral, LLaMA 3) using a custom dataset that combines public data sources with synthetic Chain of Thought (CoT) data. The models undergo an alignment phase via Direct Preference Optimization (DPO), making Aloe one of the first policy-aligned open healthcare LLMs and setting a new standard for ethical performance. The evaluation covers several bias and toxicity datasets, red-teaming efforts, and a risk assessment. Advanced prompt engineering strategies are also explored to boost performance across benchmarks, achieving state-of-the-art results for 7B LLMs in the healthcare domain. The best Aloe model is openly released under a CC-BY-NC 4.0 license, together with full training details, model merging configurations, and all training data. The paper also discusses the ethical considerations and risks of using LLMs in healthcare, emphasizing the importance of responsible and safe deployment.
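For context, the DPO alignment phase mentioned above refers to the preference-optimization objective of Rafailov et al. (2023), in which a policy $\pi_\theta$ is trained against a frozen reference model $\pi_{\mathrm{ref}}$ on pairs of preferred and rejected responses. The following is the standard DPO loss, not the paper's specific configuration (Aloe's datasets and hyperparameters are detailed in the paper itself):

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\,\pi_{\mathrm{ref}}) =
  -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
  \left[\log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
    - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
  \right)\right]
```

Here $x$ is a prompt, $y_w$ and $y_l$ are the preferred and rejected responses from the preference dataset $\mathcal{D}$, $\sigma$ is the logistic function, and $\beta$ controls how far the aligned policy may drift from the reference model.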