This paper investigates how well large language models (LLMs) with chain-of-thought (CoT) reasoning mimic human reasoning. Using causal analysis, the study compares the reasoning processes of LLMs and humans, examining the relationships among problem instructions, reasoning steps, and final answers. The analysis reveals that LLMs often deviate from the ideal causal chain, producing spurious correlations and consistency errors between reasoning steps and answers; as a result, LLMs may generate unfaithful explanations whose stated reasoning does not reflect the true basis for their answers. The study also examines factors that influence the causal structure: in-context learning strengthens it, while post-training techniques such as supervised fine-tuning and reinforcement learning from human feedback weaken it, and increasing model size alone does not necessarily improve it. Together, these findings underscore the importance of understanding the causal mechanisms underlying LLM reasoning and point to the need for new techniques that strengthen causal structure and move LLMs toward human-level reasoning.
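To make the kind of causal probe described above concrete, the sketch below illustrates one simple intervention-style check: perturb the generated reasoning steps and see whether the final answer changes. This is a minimal illustration under stated assumptions, not the paper's actual protocol; `query_model` is a hypothetical stand-in for whatever LLM client is available, and the answer-extraction format is assumed.

```python
# Minimal sketch of a CoT intervention probe (illustrative only; the paper's
# actual causal-analysis procedure may differ). `query_model` is a hypothetical
# stand-in for an LLM client and must be supplied by the reader.

def query_model(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM client here")

def split_cot_and_answer(output: str) -> tuple[str, str]:
    """Assumes the model ends its response with a line 'Answer: <x>'."""
    reasoning, _, answer = output.rpartition("Answer:")
    return reasoning.strip(), answer.strip()

def answer_depends_on_cot(instruction: str, perturb) -> bool:
    """Intervene on the reasoning steps and check whether the answer changes.

    If the answer is unchanged under a corrupted chain of thought, the stated
    reasoning is likely not the true cause of the answer (an unfaithful CoT).
    """
    # Baseline: let the model reason freely, then extract its CoT and answer.
    baseline = query_model(
        f"{instruction}\nThink step by step, then end with 'Answer: <x>'."
    )
    cot, answer = split_cot_and_answer(baseline)

    # Intervention: feed back a perturbed CoT (e.g. with a key step corrupted)
    # and ask for the final answer conditioned on that reasoning.
    perturbed = query_model(
        f"{instruction}\nReasoning: {perturb(cot)}\nEnd with 'Answer: <x>'."
    )
    _, new_answer = split_cot_and_answer(perturbed)

    return new_answer != answer
```

As a usage example, a perturbation that corrupts an intermediate arithmetic step should flip the answer if the answer truly follows from the stated reasoning; an unchanged answer would be evidence of the spurious instruction-to-answer shortcut the paper describes.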