6 Jun 2024 | Ling Yang, Zhaochen Yu, Tianjun Zhang, Shiyi Cao, Minkai Xu, Wentao Zhang, Joseph E. Gonzalez, Bin Cui
Buffer of Thoughts (BoT) is a novel thought-augmented reasoning framework designed to enhance the accuracy, efficiency, and robustness of large language models (LLMs) across a range of reasoning tasks. The framework introduces a meta-buffer that stores high-level thought-templates derived from diverse problem-solving processes, enabling efficient and adaptive reasoning. Because these templates are distilled from different tasks, they can be reused across problems, reducing the need for manual prompt design and improving generalization. A buffer-manager dynamically updates the meta-buffer, expanding its capacity as more tasks are solved.

BoT achieves significant performance improvements on 10 challenging reasoning tasks, outperforming previous state-of-the-art methods by 11% on Game of 24, 20% on Geometric Shapes, and 51% on Checkmate-in-One, while requiring only 12% of the cost of multi-query prompting methods. The framework also demonstrates superior robustness and efficiency, with BoT+Llama3-8B showing potential to surpass Llama3-70B. Experiments show that BoT improves reasoning accuracy, efficiency, and robustness across various benchmarks, with a balanced time distribution and an effective trade-off between model size and performance. Ablation studies confirm the importance of the problem-distiller, meta-buffer, and buffer-manager in enhancing reasoning capabilities. The method addresses limitations of existing prompting approaches, offering a scalable and efficient solution for complex reasoning tasks.
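The meta-buffer/buffer-manager interplay described above can be sketched as a small retrieval-and-update loop. This is a hypothetical illustration, not the paper's implementation: the class and method names (`MetaBuffer`, `retrieve`, `update`) are invented here, and simple keyword overlap stands in for the embedding-based template retrieval a real BoT system would use.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class ThoughtTemplate:
    """A high-level, reusable problem-solving pattern distilled from a solved task."""
    name: str
    description: str
    keywords: set[str]


class MetaBuffer:
    """Stores thought-templates and retrieves the best match for a new problem.

    Hypothetical sketch: Jaccard overlap on keyword sets approximates the
    similarity search that retrieval over distilled templates would require.
    """

    def __init__(self) -> None:
        self.templates: list[ThoughtTemplate] = []

    @staticmethod
    def _similarity(a: set[str], b: set[str]) -> float:
        return len(a & b) / len(a | b) if (a | b) else 0.0

    def retrieve(self, problem_keywords: set[str]) -> ThoughtTemplate | None:
        """Return the stored template most similar to the distilled problem, if any."""
        best, best_score = None, 0.0
        for t in self.templates:
            score = self._similarity(t.keywords, problem_keywords)
            if score > best_score:
                best, best_score = t, score
        return best

    def update(self, candidate: ThoughtTemplate, novelty_threshold: float = 0.5) -> bool:
        """Buffer-manager role: admit a newly distilled template only if it is novel.

        Near-duplicates of an existing template are rejected to keep the
        buffer compact while its coverage grows with each solved task.
        """
        existing = self.retrieve(candidate.keywords)
        if existing is not None:
            if self._similarity(existing.keywords, candidate.keywords) >= novelty_threshold:
                return False  # too close to a stored template; skip
        self.templates.append(candidate)
        return True


buffer = MetaBuffer()
buffer.update(ThoughtTemplate("arithmetic-search", "enumerate operator orderings",
                              {"arithmetic", "search", "numbers"}))
buffer.update(ThoughtTemplate("board-lookahead", "enumerate legal moves, then prune",
                              {"chess", "moves", "lookahead"}))
match = buffer.retrieve({"numbers", "arithmetic", "target"})
print(match.name)  # arithmetic-search
```

The novelty check in `update` mirrors the summary's point that the buffer-manager grows the meta-buffer's capacity as tasks are solved without letting it fill with redundant templates.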