Large language models (LLMs) are shown to be effective zero-shot reasoners when prompted with the simple phrase "Let's think step by step," without any task-specific examples. This approach, called Zero-shot-CoT, outperforms standard zero-shot prompting on a range of reasoning tasks, including arithmetic (MultiArith, GSM8K, AQUA-RAT, SVAMP), symbolic reasoning (Last Letter, Coin Flip), and logical reasoning (Date Understanding, Tracking Shuffled Objects). Using a single, fixed prompt template across all of these tasks, the method achieves large accuracy gains, for example raising MultiArith accuracy from 17.7% to 78.7% and GSM8K from 10.4% to 40.7% with the InstructGPT model (text-davinci-002).

Unlike most prior prompting methods, which are hand-crafted per task, Zero-shot-CoT is versatile and task-agnostic: it elicits plausible reasoning paths and correct answers across diverse tasks without a single example, demonstrating the potential of LLMs as multi-task reasoners. These results suggest that untapped, high-level cognitive abilities can be extracted from LLMs through simple prompting, and they underscore the value of carefully probing zero-shot capabilities before constructing fine-tuning datasets or few-shot exemplars. Evaluated across various model sizes and tasks, Zero-shot-CoT serves as a strong baseline for reasoning benchmarks, challenges previous assumptions about what LLMs can do without examples, and motivates further research into their broader cognitive abilities.
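To make the pipeline concrete, the sketch below illustrates the two-stage prompting procedure the paper describes: a first call appends the trigger phrase to elicit a step-by-step rationale, and a second call feeds that rationale back with an answer-extraction prompt. This is a minimal illustration, not the paper's reference implementation: the `complete` function is a hypothetical placeholder for whatever LLM completion API is in use, and the extraction phrasing shown is the variant used for arithmetic tasks (it differs for other answer formats).

```python
# Minimal sketch of two-stage Zero-shot-CoT prompting.
# `complete` is a hypothetical stand-in for any text-completion LLM call.

def complete(prompt: str) -> str:
    """Placeholder: send `prompt` to an LLM and return its completion."""
    raise NotImplementedError("wire this to your model of choice")

def zero_shot_cot(question: str) -> str:
    # Stage 1: reasoning extraction. Append the trigger phrase so the
    # model generates a step-by-step rationale instead of a bare answer.
    reasoning_prompt = f"Q: {question}\nA: Let's think step by step."
    rationale = complete(reasoning_prompt)

    # Stage 2: answer extraction. Feed the rationale back and prompt
    # for the final answer in the expected format (here, a number).
    answer_prompt = (
        f"{reasoning_prompt} {rationale}\n"
        "Therefore, the answer (arabic numerals) is"
    )
    return complete(answer_prompt).strip()
```

The two-call design matters: the first completion is free-form reasoning, so a separate extraction step is needed to reduce it to a parseable answer that can be scored against the benchmark.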