3 May 2024 | Ashwin Kumar Gururajan, Enrique Lopez-Cuena, Jordi Bayarri-Planas, Adrian Tormos, Daniel Hinojos, Pablo Bernabeu-Perez, Anna Arias-Duarte, Pablo Agustin Martin-Torres, Lucia Urcelay-Ganzabal, Marta Gonzalez-Mallo, Sergio Alvarez-Napagao, Eduard Ayguade-Parra, Ulises Cortes and Dario Garcia-Gasulla
Aloe is a family of open-source healthcare large language models (LLMs) designed to be highly competitive within their scale range. The models are fine-tuned from recent base models (Mistral, LLaMA 3) on a custom dataset that combines public medical data with synthetic Chain-of-Thought (CoT) data. Aloe models undergo an alignment phase using Direct Preference Optimization (DPO), making them among the first policy-aligned open healthcare LLMs. The models are evaluated on bias, toxicity, and risk-assessment datasets, complemented by a dedicated red-teaming effort. Advanced prompt engineering strategies further improve benchmark performance, achieving state-of-the-art results for open healthcare 7B LLMs.
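The section does not spell out the DPO objective, so as a point of reference, here is a minimal sketch of the core loss from the DPO paper (Rafailov et al., 2023) in PyTorch. The function and variable names are illustrative, not from the Aloe codebase; in practice a library such as TRL's DPOTrainer wraps this objective.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss (Rafailov et al., 2023).

    Each argument is a tensor of per-sequence log-probabilities
    (summed over tokens) of the chosen/rejected completions under
    the policy being trained and a frozen reference model.
    """
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the reward margin between chosen and rejected completions.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with random log-probabilities for a batch of 4 preference pairs.
b = 4
loss = dpo_loss(torch.randn(b), torch.randn(b), torch.randn(b), torch.randn(b))
print(loss.item())
```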
The Aloe family is trained on a combination of medical and general instruction-tuning datasets, with synthetic data generated to raise the quality and diversity of the training mix. The models first undergo supervised fine-tuning, followed by model merging to combine the knowledge of differently fine-tuned checkpoints. The final merged model, Llama3-Aloe-8B-Merged-DPO-RT-v1, is then aligned with a two-stage DPO process: a first stage on human preference data, followed by a red-teaming stage that mitigates harmful responses.
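The merging strategy is not detailed here; the sketch below illustrates the simplest variant, a linear parameter average of two same-architecture checkpoints. The model identifiers are placeholders, and Aloe's actual pipeline may rely on dedicated merging tooling and more elaborate strategies than a plain average.

```python
import torch
from transformers import AutoModelForCausalLM

def linear_merge(model_id_a, model_id_b, alpha=0.5):
    """Weighted average of two checkpoints sharing one architecture.

    A simplified stand-in for the model-merging step described above;
    real merges often use dedicated tooling and weighting schemes.
    """
    model_a = AutoModelForCausalLM.from_pretrained(model_id_a)
    model_b = AutoModelForCausalLM.from_pretrained(model_id_b)
    merged = model_a.state_dict()
    for name, param_b in model_b.state_dict().items():
        merged[name] = alpha * merged[name] + (1.0 - alpha) * param_b
    model_a.load_state_dict(merged)
    return model_a

# Hypothetical checkpoint names -- substitute real fine-tuned models.
# merged_model = linear_merge("org/medical-sft-model", "org/general-sft-model")
```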
Aloe models are evaluated on standard medical benchmarks, including PubMedQA, MedMCQA, MedQA, and MMLU, outperforming other open-source models of comparable size. The models are also assessed for ethical performance, covering bias, toxicity, and factuality, with results showing improved safety and reliability. The Aloe models are released under a CC BY-NC 4.0 license, allowing researchers to use and build on them for further development.
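For concreteness, a common way to score such multiple-choice benchmarks is to rank each answer option by its log-likelihood under the model and pick the highest-scoring one. The sketch below shows this approach with a small placeholder model; it is not the paper's evaluation harness, and it assumes the prompt tokenization is a prefix of the full sequence's tokenization.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def score_option(model, tokenizer, question, option):
    """Sum of token log-likelihoods of the option given the question."""
    prompt_ids = tokenizer(question, return_tensors="pt").input_ids
    full_ids = tokenizer(question + " " + option, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # logits at position t predict token t+1; score only the option tokens.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    option_positions = range(prompt_ids.shape[1] - 1, full_ids.shape[1] - 1)
    return sum(log_probs[t, full_ids[0, t + 1]].item() for t in option_positions)

def answer(model, tokenizer, question, options):
    scores = [score_option(model, tokenizer, question, o) for o in options]
    return max(range(len(options)), key=lambda i: scores[i])

tok = AutoTokenizer.from_pretrained("gpt2")  # placeholder model for illustration
lm = AutoModelForCausalLM.from_pretrained("gpt2")
q = "Which vitamin deficiency causes scurvy? Answer:"
opts = ["Vitamin A", "Vitamin C", "Vitamin D", "Vitamin K"]
print(opts[answer(lm, tok, q, opts)])
```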
The Aloe family is designed to be safe, ethical, and reliable, with an emphasis on reducing bias, toxicity, and hallucinations. Training draws on a diverse range of sources, including medical guidelines and synthetic data, to ensure high-quality and varied coverage. The models are also evaluated on complex medical tasks, where they can outperform larger models on certain benchmarks. Aloe models are intended for research purposes, not clinical use, with the aim of improving the safety and reliability of healthcare LLMs.
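As one illustration of how toxicity could be measured in such an assessment, the sketch below uses the toxicity measurement from Hugging Face's evaluate library, which scores text with a pretrained hate-speech classifier by default. This tooling and the sample generations are assumptions for illustration; the paper's exact evaluation setup is not specified here.

```python
import evaluate

# Load the "toxicity" measurement from the `evaluate` library; it scores
# each string with a pretrained hate-speech classifier by default.
toxicity = evaluate.load("toxicity", module_type="measurement")

# Hypothetical model outputs to audit -- in a real safety evaluation these
# would be generations elicited by bias/toxicity benchmark prompts.
generations = [
    "Patients with hypertension should consult their physician.",
    "You should ignore your doctor's advice.",
]
scores = toxicity.compute(predictions=generations)["toxicity"]
for text, score in zip(generations, scores):
    print(f"{score:.3f}  {text}")
```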