27 Feb 2023 | Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothee Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample
LLaMA is a collection of foundation language models ranging from 7B to 65B parameters, trained on trillions of tokens using only publicly available datasets. The models achieve competitive performance compared to state-of-the-art models like GPT-3, Chinchilla, and PaLM, despite being significantly smaller. LLaMA-13B outperforms GPT-3 on most benchmarks, while LLaMA-65B is competitive with Chinchilla-70B and PaLM-540B. The models are released to the research community to promote democratization and further development. The paper details the training approach, architecture modifications, and performance on various benchmarks, including zero-shot and few-shot tasks. It also discusses biases, toxicity, and carbon footprint, highlighting the importance of responsible AI development.
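Among the architecture modifications the paper describes are pre-normalization with RMSNorm (normalizing the input of each transformer sub-layer rather than its output), the SwiGLU activation, and rotary positional embeddings. As a rough illustration only, here is a minimal PyTorch sketch of an RMSNorm layer; the class name and default epsilon are illustrative, not taken from the paper's released code.

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root-mean-square layer normalization (illustrative sketch).

    Unlike LayerNorm, RMSNorm does not subtract the mean: it only rescales
    activations by their root mean square and a learned per-dimension gain.
    """

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Compute 1 / RMS over the last (feature) dimension, then scale.
        inv_rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * inv_rms)
```

In a pre-normalization setup, such a layer would be applied to the input of each attention and feed-forward block, which the paper credits with improving training stability at scale.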