6 Jun 2024 | Ling Yang, Zhaochen Yu, Tianjun Zhang, Shiyi Cao, Minkai Xu, Wentao Zhang, Joseph E. Gonzalez, Bin Cui
Buffer of Thoughts (BoT) is a novel thought-augmented reasoning framework designed to enhance the accuracy, efficiency, and robustness of large language models (LLMs) across a range of reasoning tasks. The framework introduces a meta-buffer that stores high-level thought-templates derived from diverse problem-solving processes, enabling efficient and adaptive reasoning. Because these templates are distilled from different tasks, they can be reused across problems, reducing the need for manual prompt design and improving generalization. A buffer-manager dynamically updates the meta-buffer, expanding its capacity as more tasks are solved.

BoT achieves significant performance improvements on 10 challenging reasoning tasks, outperforming previous state-of-the-art methods by 11% on Game of 24, 20% on Geometric Shapes, and 51% on Checkmate-in-One, while requiring only 12% of the cost of multi-query prompting methods. The framework also demonstrates superior robustness and efficiency, with BoT+Llama3-8B showing potential to surpass Llama3-70B. Experiments show that BoT improves reasoning accuracy, efficiency, and robustness across various benchmarks, with a balanced time distribution and an effective trade-off between model size and performance. Ablation studies confirm the importance of the problem-distiller, meta-buffer, and buffer-manager in enhancing reasoning capabilities. The method addresses limitations of existing prompting approaches, offering a scalable and efficient solution for complex reasoning tasks.
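The meta-buffer/buffer-manager interplay described above can be sketched as a small retrieval-and-update loop. This is a hypothetical illustration, not the paper's implementation: the class and method names (`MetaBuffer`, `retrieve`, `update`) are invented here, and simple keyword overlap stands in for the embedding-based template retrieval a real BoT system would use.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class ThoughtTemplate:
    """A high-level, reusable problem-solving pattern distilled from a solved task."""
    name: str
    description: str
    keywords: set[str]


class MetaBuffer:
    """Stores thought-templates and retrieves the best match for a new problem.

    Hypothetical sketch: Jaccard overlap on keyword sets approximates the
    similarity search that retrieval over distilled templates would require.
    """

    def __init__(self) -> None:
        self.templates: list[ThoughtTemplate] = []

    @staticmethod
    def _similarity(a: set[str], b: set[str]) -> float:
        return len(a & b) / len(a | b) if (a | b) else 0.0

    def retrieve(self, problem_keywords: set[str]) -> ThoughtTemplate | None:
        """Return the stored template most similar to the distilled problem, if any."""
        best, best_score = None, 0.0
        for t in self.templates:
            score = self._similarity(t.keywords, problem_keywords)
            if score > best_score:
                best, best_score = t, score
        return best

    def update(self, candidate: ThoughtTemplate, novelty_threshold: float = 0.5) -> bool:
        """Buffer-manager role: admit a newly distilled template only if it is novel.

        Near-duplicates of an existing template are rejected to keep the
        buffer compact while its coverage grows with each solved task.
        """
        existing = self.retrieve(candidate.keywords)
        if existing is not None:
            if self._similarity(existing.keywords, candidate.keywords) >= novelty_threshold:
                return False  # too close to a stored template; skip
        self.templates.append(candidate)
        return True


buffer = MetaBuffer()
buffer.update(ThoughtTemplate("arithmetic-search", "enumerate operator orderings",
                              {"arithmetic", "search", "numbers"}))
buffer.update(ThoughtTemplate("board-lookahead", "enumerate legal moves, then prune",
                              {"chess", "moves", "lookahead"}))
match = buffer.retrieve({"numbers", "arithmetic", "target"})
print(match.name)  # arithmetic-search
```

The novelty check in `update` mirrors the summary's point that the buffer-manager grows the meta-buffer's capacity as tasks are solved without letting it fill with redundant templates.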