6 Feb 2024 | Pei Zhou, Jay Pujara, Xiang Ren, Xinyun Chen, Heng-Tze Cheng, Quoc V. Le, Ed H. Chi, Denny Zhou, Swaroop Mishra, Huaixiu Steven Zheng
SELF-DISCOVER is a framework that enables large language models (LLMs) to self-discover task-specific reasoning structures for solving complex reasoning problems. The framework has LLMs select and compose multiple atomic reasoning modules, such as critical thinking and step-by-step thinking, into an explicit reasoning structure, which is then followed during decoding to arrive at the final answer. SELF-DISCOVER substantially improves performance on challenging reasoning benchmarks such as BigBench-Hard, grounded agent reasoning, and MATH, outperforming methods like Chain of Thought (CoT) by up to 32%. It also beats inference-intensive methods such as CoT plus Self-Consistency by over 20%, while requiring 10-40x less inference compute. The self-discovered reasoning structures are universally applicable across model families and share commonalities with human reasoning patterns.
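To make the idea concrete, a discovered reasoning structure is essentially a key-value plan that the model fills in at answer time. The example below is a hypothetical structure of that kind, written as a Python dict; it is illustrative only, not a structure taken from the paper:

```python
# Hypothetical self-discovered reasoning structure (illustrative;
# not an actual structure from the paper). At solve time the LLM
# fills in each value in order, then reads off the final answer.
reasoning_structure = {
    "Identify the key elements of the problem": "",
    "Break the problem into ordered sub-steps": "",
    "Apply critical thinking to check each sub-step": "",
    "Combine the sub-results": "",
    "Final answer": "",
}
```

Because the structure names the intermediate steps explicitly, the filled-in version is easier to inspect than a free-form chain of thought.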
The framework operates in two stages: Stage 1 selects, adapts, and implements reasoning modules to compose a task-specific reasoning structure; Stage 2 then follows this structure to solve each instance of the task. SELF-DISCOVER is efficient, requiring only 3 additional inference steps per task, and outperforms inference-heavy ensemble approaches. The discovered structures are intrinsic to the task and offer more interpretable insights than optimized prompts.
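The two stages can be sketched as a short pipeline. The sketch below is a minimal illustration under stated assumptions: the `llm` callable, function names, and prompt wordings are placeholders, not the paper's exact meta-prompts or API, and the module list is a tiny sample of the paper's larger seed pool:

```python
# Minimal sketch of SELF-DISCOVER's two-stage pipeline.
# Assumptions: `llm` is any callable that maps a prompt string to a
# response string; prompt texts are illustrative paraphrases.

REASONING_MODULES = [
    "Use critical thinking to analyze the problem.",
    "Break the problem into step-by-step sub-problems.",
    "Use creative, out-of-the-box thinking.",
]

def self_discover_structure(llm, task_examples):
    """Stage 1: three meta-prompts produce one task-level structure."""
    # SELECT: pick the atomic modules relevant to this task.
    selected = llm(
        "Select reasoning modules useful for this task:\n"
        + task_examples + "\nModules:\n" + "\n".join(REASONING_MODULES)
    )
    # ADAPT: rephrase the selected modules to be task-specific.
    adapted = llm(
        "Adapt these modules to the task:\n" + selected
        + "\nTask:\n" + task_examples
    )
    # IMPLEMENT: turn the adapted modules into an explicit key-value
    # reasoning structure to be filled in at solve time.
    return llm(
        "Implement the adapted modules as a step-by-step key-value "
        "reasoning structure:\n" + adapted
    )

def solve(llm, structure, task_instance):
    """Stage 2: follow the discovered structure on one task instance."""
    return llm(
        "Follow this reasoning structure to solve the task.\n"
        "Structure:\n" + structure + "\nTask:\n" + task_instance
    )
```

Note the cost profile this implies: Stage 1 runs once per task (3 extra calls), while Stage 2 costs a single call per instance, which is where the savings over self-consistency-style ensembling come from.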
SELF-DISCOVER was tested on 25 challenging reasoning tasks, including Big Bench-Hard (BBH), Thinking for Doing (T4D), and MATH. It outperformed CoT on 21/25 tasks, with performance gains of up to 42%. It also outperformed inference-heavy methods such as CoT-Self-Consistency and per-module majority voting, while requiring significantly less inference compute. Compared to prompt-optimization methods like OPRO, SELF-DISCOVER performed on par or better while yielding more interpretable reasoning structures.
Across these evaluations, SELF-DISCOVER performed best on tasks requiring world knowledge and showed moderate gains on algorithmic tasks. It was also efficient, requiring 10-40x less inference compute than the ensemble baselines. Qualitative examples showed that the discovered structures were uniquely adapted to each task and integrated multiple reasoning modules, providing insight into how to solve the task.
Further analysis showed that all three actions (SELECT, ADAPT, IMPLEMENT) contribute to the framework's effectiveness. The self-discovered structures also transferred across models: structures discovered by one LLM improved the performance of others, pointing to structured reasoning as a promising direction for future work on complex problem-solving.