3 Dec 2023 | Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan
The Tree of Thoughts (ToT) framework enables large language models (LLMs) to perform deliberate problem solving by exploring multiple reasoning paths and self-evaluating choices. It generalizes "Chain of Thought" (CoT) prompting: the model still generates coherent intermediate steps ("thoughts") toward a solution, but instead of committing to a single chain, it maintains a tree of candidate thoughts. ToT pairs thought generation with search algorithms such as breadth-first search (BFS) and depth-first search (DFS) to explore this tree systematically, letting the model look ahead, backtrack, and make more global decisions.

The framework is evaluated on three challenging tasks: Game of 24, Creative Writing, and Mini Crosswords. On Game of 24, ToT achieves a 74% success rate, far above CoT (4%) and standard input-output (IO) prompting (7.3%). On Creative Writing, ToT produces passages judged more coherent than those from CoT and IO prompting. On Mini Crosswords, ToT reaches a 60% word-level success rate and solves 4 of 20 games.

Because each thought is a readable, high-level language step, ToT also improves interpretability and human alignment by exposing the model's reasoning for inspection. The framework is modular and adaptable: the thought format, generator, evaluator, and search algorithm can each be customized for a given problem. The study highlights ToT's potential to enhance LLMs' capabilities on complex, multi-step problem-solving tasks.
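To make the generate-evaluate-search loop concrete, here is a minimal Python sketch of the BFS variant. The `llm()` call, the prompt wording, and the `k` / `beam_width` settings are illustrative assumptions for this sketch, not the paper's exact prompts or configuration.

```python
def llm(prompt: str) -> str:
    """Placeholder for an LLM completion call (hypothetical; plug in any client)."""
    raise NotImplementedError

def propose_thoughts(state: str, k: int) -> list[str]:
    """Ask the model for up to k candidate next thoughts extending a partial solution."""
    completion = llm(
        f"Partial solution:\n{state}\n"
        f"Propose {k} possible next steps, one per line."
    )
    lines = [ln.strip() for ln in completion.splitlines() if ln.strip()]
    return [f"{state}\n{thought}" for thought in lines[:k]]

def value_state(state: str) -> float:
    """Have the model self-evaluate how promising a partial solution is (0 to 1)."""
    completion = llm(
        f"Rate from 0 to 1 how likely this partial solution leads to success:\n"
        f"{state}\nScore:"
    )
    try:
        return float(completion.strip().split()[0])
    except (ValueError, IndexError):
        return 0.0  # unparseable score: treat as unpromising

def tot_bfs(problem: str, steps: int, k: int = 5, beam_width: int = 3) -> str:
    """Breadth-first Tree of Thoughts: expand each state, score candidates, keep the best beam."""
    frontier = [problem]
    for _ in range(steps):
        # Expand every state on the frontier into k candidate thoughts.
        candidates = [c for s in frontier for c in propose_thoughts(s, k)]
        # Self-evaluate candidates and keep only the top beam_width of them.
        frontier = sorted(candidates, key=value_state, reverse=True)[:beam_width]
    return frontier[0]  # highest-valued solution found
```

Swapping the loop for a recursive descent with a value threshold and backtracking yields the DFS variant the paper uses for Mini Crosswords; the thought generator and evaluator stay the same.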