OpenELM: An Efficient Language Model Family with Open Training and Inference Framework

2 May 2024 | Sachin Mehta, Mohammad Hossein Sekhavat, Qingqing Cao, Maxwell Horton, Yanzi Jin, Chenfan Sun, Iman Mirzadeh, Mahyar Najibi, Peter Zatloukal, Mohammad Rastegari
OpenELM is an open-source language model family that outperforms comparably sized open LLMs, such as OLMo, by 2.36% in average accuracy while requiring 2× fewer pre-training tokens. It uses a layer-wise scaling strategy to allocate parameters non-uniformly across the layers of the transformer, leading to improved accuracy. OpenELM is trained on publicly available datasets, including RefinedWeb, deduplicated PILE, a subset of RedPajama, and a subset of Dolma v1.6, totaling approximately 1.8 trillion tokens.

The model uses a decoder-only transformer architecture, incorporating techniques such as pre-normalization, rotary positional embeddings, and SwiGLU feed-forward networks. OpenELM is evaluated on standard zero-shot tasks as well as OpenLLM leaderboard and LLM360 leaderboard tasks, demonstrating strong performance across multiple metrics. The models are also fine-tuned with instruction tuning and parameter-efficient fine-tuning methods (e.g., LoRA and DoRA), further improving accuracy.

OpenELM can be run on Apple devices through the MLX library. Benchmarks on different hardware show that while OpenELM achieves higher accuracy, its inference throughput is lower than OLMo's. The release of OpenELM aims to empower the open research community by providing access to a state-of-the-art language model together with open training and inference frameworks. The source code and pre-trained weights are available on GitHub and HuggingFace.
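To make the layer-wise scaling idea concrete, below is a minimal Python sketch of how per-layer attention-head counts and FFN widths could be interpolated from the first to the last transformer layer. The function name, the parameters (alpha_min, alpha_max, beta_min, beta_max), and the example values are illustrative assumptions, not the exact configuration used by OpenELM; consult the paper and released configs for the real settings.

```python
# Sketch of layer-wise scaling: shallower layers get fewer attention heads and
# narrower FFNs, deeper layers get more, under a simple linear interpolation.
# All parameter names and default values here are hypothetical.
def layerwise_scaling(num_layers, d_model, head_dim,
                      alpha_min=0.5, alpha_max=1.0,
                      beta_min=0.5, beta_max=4.0):
    configs = []
    for i in range(num_layers):
        t = i / (num_layers - 1) if num_layers > 1 else 0.0
        alpha = alpha_min + (alpha_max - alpha_min) * t   # attention width scale
        beta = beta_min + (beta_max - beta_min) * t       # FFN width multiplier
        num_heads = max(1, round(alpha * d_model / head_dim))
        ffn_dim = int(round(beta * d_model))
        configs.append({"layer": i, "num_heads": num_heads, "ffn_dim": ffn_dim})
    return configs

# Example: per-layer budgets for a small 4-layer model.
for cfg in layerwise_scaling(num_layers=4, d_model=1280, head_dim=64):
    print(cfg)
```

The point of such a scheme is that a fixed total parameter budget is spent where it helps most, instead of giving every layer an identical width.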
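The SwiGLU feed-forward network mentioned above replaces the conventional ReLU/GELU FFN with a SiLU-gated variant. Below is a minimal PyTorch sketch of such a block; the bias-free projections and the 4× expansion in the usage example are assumptions for illustration, not a statement of OpenELM's exact hyperparameters.

```python
import torch
import torch.nn as nn

class SwiGLUFFN(nn.Module):
    """Minimal SwiGLU feed-forward block: a SiLU-gated linear unit followed by
    a down-projection. Bias-free projections are an assumption of this sketch."""
    def __init__(self, d_model: int, ffn_dim: int):
        super().__init__()
        self.gate_proj = nn.Linear(d_model, ffn_dim, bias=False)
        self.up_proj = nn.Linear(d_model, ffn_dim, bias=False)
        self.down_proj = nn.Linear(ffn_dim, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SwiGLU(x) = (SiLU(x W_gate) * (x W_up)) W_down
        return self.down_proj(nn.functional.silu(self.gate_proj(x)) * self.up_proj(x))

x = torch.randn(2, 16, 1280)               # (batch, sequence, d_model)
print(SwiGLUFFN(1280, 4 * 1280)(x).shape)  # torch.Size([2, 16, 1280])
```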
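Since the pre-trained weights are published on HuggingFace, a checkpoint can be loaded with the transformers library. The sketch below is a hypothetical usage example: the model id (apple/OpenELM-270M), the trust_remote_code flag, and the use of the Llama-2 tokenizer are assumptions; consult the official model cards and the GitHub repository for the exact loading instructions.

```python
# Hypothetical usage sketch for loading an OpenELM checkpoint from HuggingFace.
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "apple/OpenELM-270M",      # assumed model id; other sizes are also released
    trust_remote_code=True,    # assumed: custom modeling code on the Hub
)
# Assumed tokenizer; the Llama-2 repository is gated and may require access approval.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

inputs = tokenizer("Once upon a time there was", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```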