Chain-of-Thought Reasoning without Prompting

23 May 2024 | Xuezhi Wang and Denny Zhou
This paper presents a novel approach to enhancing the reasoning capabilities of large language models (LLMs) by modifying the decoding process rather than relying on prompting techniques. The study demonstrates that pre-trained LLMs can naturally generate chain-of-thought (CoT) reasoning paths when decoding considers the alternative top-k tokens at the first decoding step, rather than following only the standard greedy path. This method bypasses the need for prompting and allows a more accurate assessment of the model's intrinsic reasoning abilities. The research shows that the presence of a CoT path in the decoding process correlates with higher confidence in the model's final answer, as measured by the probability gap between the top and secondary token candidates over the answer tokens.

The study evaluates this approach on various reasoning benchmarks, including mathematical and commonsense reasoning tasks. Results indicate that CoT-decoding significantly improves the model's reasoning performance compared to traditional greedy decoding. The method is effective across different model scales and enhances the reasoning abilities of both pre-trained and instruction-tuned models. Additionally, the study highlights that CoT-decoding can be combined with CoT-prompting to further improve reasoning performance.

The findings challenge the prevailing notion that LLMs are inherently incapable of effective reasoning without prompting; instead, they suggest that LLMs possess intrinsic reasoning capabilities that can be revealed through alternative decoding strategies. The research also shows that the effectiveness of CoT-decoding varies with task difficulty, with simpler tasks yielding more consistent results. The study further demonstrates that CoT-decoding can uncover the model's intrinsic vulnerabilities in reasoning, such as difficulties in maintaining state tracking and following the correct order of operations. Overall, the paper contributes to the understanding of LLMs' reasoning capabilities and provides a new method for enhancing their performance without prompting or extensive fine-tuning, suggesting that CoT-decoding is a promising approach for improving reasoning across a wide range of tasks.
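To make the confidence measure concrete: for a given decoding path, the idea is to average the gap between the probabilities of the top two token candidates over the tokens that form the answer. A sketch of this quantity, with notation of our own choosing rather than the paper's exact formulation, is:

```latex
\Delta_{k,\mathrm{answer}} \;=\; \frac{1}{|\mathrm{answer}|}
  \sum_{x_t \in \mathrm{answer}} \bigl( p(x_t^{1} \mid x_{<t}) - p(x_t^{2} \mid x_{<t}) \bigr)
```

where x_t^1 and x_t^2 are the most and second-most probable tokens at step t of path k. A larger gap indicates that the model decodes its answer with higher confidence, which the paper finds tends to coincide with the presence of a CoT path.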
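Below is a minimal sketch of how this decoding strategy could be implemented with the Hugging Face transformers library. The model name, the choice of k, and the simplification of averaging the confidence gap over all generated tokens (rather than only an identified answer span) are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch of CoT-decoding: branch on the top-k first tokens, greedily decode
# each branch, and keep the path with the largest average top-1 vs. top-2
# probability gap. Assumptions: "gpt2" as a stand-in model, k = 10, and the
# gap averaged over all generated tokens instead of only the answer tokens.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # any causal LM; chosen here only for illustration
K = 10               # number of alternative first tokens to explore

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()


@torch.no_grad()
def cot_decode(prompt: str, max_new_tokens: int = 64):
    """Return (confidence, text) for the most confidently decoded path."""
    inputs = tokenizer(prompt, return_tensors="pt")
    first_logits = model(**inputs).logits[0, -1]     # logits for the first new token
    top_k_ids = torch.topk(first_logits, K).indices  # k alternative starting tokens

    paths = []
    for token_id in top_k_ids:
        # Start a branch with this alternative first token.
        ids = torch.cat([inputs["input_ids"][0], token_id.view(1)]).unsqueeze(0)
        gaps = []
        for _ in range(max_new_tokens):
            # Full forward pass each step (no KV cache), kept simple for clarity.
            logits = model(ids).logits[0, -1]
            probs = torch.softmax(logits, dim=-1)
            top2 = torch.topk(probs, 2).values
            gaps.append((top2[0] - top2[1]).item())   # confidence gap at this step
            next_id = probs.argmax().view(1, 1)
            ids = torch.cat([ids, next_id], dim=1)
            if next_id.item() == tokenizer.eos_token_id:
                break
        text = tokenizer.decode(ids[0, inputs["input_ids"].shape[1]:])
        paths.append((sum(gaps) / len(gaps), text))

    return max(paths, key=lambda p: p[0])


if __name__ == "__main__":
    score, answer = cot_decode(
        "Q: I have 3 apples, my dad has 2 more apples than me, "
        "how many apples do we have in total?\nA:"
    )
    print(f"confidence={score:.3f}\n{answer}")
```

Greedy decoding corresponds to keeping only the single most probable first token; the sketch simply explores the other top-k starting tokens as well and returns the continuation whose answer is decoded most confidently.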