22 Jul 2020 | Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei
This paper explores the capabilities of large language models in few-shot learning, a setting in which a model must perform a new task given only a handful of examples. The authors train GPT-3, an autoregressive language model with 175 billion parameters, and evaluate its performance on a wide range of NLP tasks, including translation, question answering, and cloze tasks. GPT-3 demonstrates strong performance in the few-shot setting, sometimes even surpassing state-of-the-art fine-tuned models. The study also identifies limitations, such as weak performance on certain datasets and methodological issues related to training on large web corpora, including possible contamination of test data. Additionally, GPT-3 can generate synthetic news articles that human evaluators find difficult to distinguish from real articles, raising broader societal concerns. The paper discusses the broader impacts of these findings and the potential for misuse of large language models.
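Few-shot learning here means in-context learning: the task description and a few demonstrations are placed directly in the model's text prompt, and the model is asked to complete the next example with no gradient updates. The sketch below, which is illustrative rather than taken from the paper's code, builds such a prompt using the English-to-French translation examples shown in the paper; the `model.generate` call at the end is a hypothetical stand-in for whatever language-model API is used.

```python
# Minimal sketch of few-shot (in-context) prompting: K demonstrations plus an
# unanswered query are concatenated into a single text prompt. The model is
# expected to complete the final line; its weights are never updated.

demonstrations = [
    ("sea otter", "loutre de mer"),
    ("cheese", "fromage"),
    ("peppermint", "menthe poivrée"),
]

def build_few_shot_prompt(examples, query, task="Translate English to French:"):
    """Concatenate a task description, K worked examples, and the new query."""
    lines = [task]
    for source, target in examples:
        lines.append(f"{source} => {target}")
    lines.append(f"{query} =>")  # the model is expected to continue this line
    return "\n".join(lines)

prompt = build_few_shot_prompt(demonstrations, "plush giraffe")
print(prompt)
# completion = model.generate(prompt)  # hypothetical call to a language model
```

Zero-shot and one-shot evaluation are the K=0 and K=1 variants of the same setup: only the natural-language task description, or the description plus a single demonstration, precedes the query.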