Large Language Models are Zero-Shot Reasoners

29 Jan 2023 | Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
The paper "Large Language Models are Zero-Shot Reasoners" by Takeshi Kojima explores the zero-shot reasoning capabilities of large language models (LLMs). While LLMs are known for their effectiveness in few-shot learning, the authors demonstrate that they can also perform well on tasks requiring zero-shot reasoning with a simple prompt. The proposed method, Zero-shot-CoT (Zero-shot Chain of Thought), involves adding the prompt "Let’s think step by step" before each answer, which significantly improves performance on various reasoning tasks, including arithmetic, symbolic reasoning, and logical reasoning. Experimental results show that Zero-shot-CoT outperforms zero-shot LLMs on multiple benchmarks, such as MultiArith and GSM8K, with large-scale models like InstructGPT and PaLM. The versatility of this single prompt across diverse tasks suggests that LLMs have untapped zero-shot capabilities, highlighting the importance of further exploration and analysis of these capabilities. The paper also discusses the limitations and social impact of prompting methods and encourages the community to discover more multi-task prompts that elicit broad cognitive abilities from LLMs.The paper "Large Language Models are Zero-Shot Reasoners" by Takeshi Kojima explores the zero-shot reasoning capabilities of large language models (LLMs). While LLMs are known for their effectiveness in few-shot learning, the authors demonstrate that they can also perform well on tasks requiring zero-shot reasoning with a simple prompt. The proposed method, Zero-shot-CoT (Zero-shot Chain of Thought), involves adding the prompt "Let’s think step by step" before each answer, which significantly improves performance on various reasoning tasks, including arithmetic, symbolic reasoning, and logical reasoning. Experimental results show that Zero-shot-CoT outperforms zero-shot LLMs on multiple benchmarks, such as MultiArith and GSM8K, with large-scale models like InstructGPT and PaLM. 
The versatility of this single prompt across diverse tasks suggests that LLMs have untapped zero-shot capabilities, highlighting the importance of further exploration and analysis of these capabilities. The paper also discusses the limitations and social impact of prompting methods and encourages the community to discover more multi-task prompts that elicit broad cognitive abilities from LLMs.
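To make the method concrete: the paper applies the trigger phrase in a two-stage pipeline, first eliciting a reasoning chain, then appending an answer-extraction cue and querying the model again. The sketch below illustrates this flow; the `call_llm` callable is a hypothetical stand-in for any text-completion model, and the exact extraction cue varies by task in the paper.

```python
def zero_shot_cot(question: str, call_llm) -> tuple[str, str]:
    """Two-stage Zero-shot-CoT prompting sketch.

    Stage 1 elicits a step-by-step reasoning chain; stage 2 appends
    an answer-extraction cue and queries the model a second time.
    `call_llm` is any function mapping a prompt string to a completion.
    """
    # Stage 1: reasoning extraction with the trigger phrase.
    stage1_prompt = f"Q: {question}\nA: Let's think step by step."
    reasoning = call_llm(stage1_prompt)

    # Stage 2: answer extraction from the generated reasoning.
    # (Cue shown here is the arithmetic-task variant.)
    stage2_prompt = (
        f"{stage1_prompt} {reasoning}\n"
        "Therefore, the answer (arabic numerals) is"
    )
    answer = call_llm(stage2_prompt)
    return reasoning, answer
```

In practice, `call_llm` would wrap whichever LLM API is available; the key point is that no task-specific few-shot examples are needed, only the single trigger phrase.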