Efficient Tool Use with Chain-of-Abstraction Reasoning

26 Feb 2024 | Silin Gao, Jane Dwivedi-Yu, Ping Yu, Xiaoqing Ellen Tan, Ramakanth Pasunuru, Olga Golovneva, Koustuv Sinha, Asli Celikyilmaz, Antoine Bosselut, Tianlu Wang
This paper introduces Chain-of-Abstraction (CoA) reasoning, a method for large language models (LLMs) to perform multi-step reasoning with tools. CoA trains LLMs to first generate abstract reasoning chains whose intermediate results are placeholders, then call domain-specific tools to fill in those placeholders with concrete knowledge. Because the chain is planned without waiting on tool outputs, the LLM can decode subsequent reasoning steps in parallel with tool calls, reducing inference delays while improving accuracy.

CoA outperforms previous chain-of-thought and tool-augmented baselines in both the mathematical reasoning and Wiki QA domains, achieving an average 6% improvement in QA accuracy. LLMs trained with CoA also use tools more efficiently, with inference approximately 1.4 times faster than baseline tool-augmented LLMs. Human evaluations confirm that CoA guides LLMs toward more accurate reasoning, yielding an 8% reduction in reasoning errors.

The method is implemented in two representative domains: mathematical reasoning, evaluated on datasets such as GSM8K and ASDiv, and Wiki QA, tested on HotpotQA and other open-domain QA datasets.
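To make the placeholder mechanism concrete, here is a minimal sketch of the tool-filling ("reification") step in the math domain, assuming the planner LLM emits spans like [20 + 35 = y1], following the paper's math-domain format; the solve and reify helpers are illustrative names, not the authors' code.

```python
import re

# Matches abstract spans of the form "[<expression> = y<k>]", where
# y<k> is a placeholder for an intermediate result.
SPAN = re.compile(r"\[([^=\[\]]+)=\s*(y\d+)\s*\]")

def solve(expression: str, bindings: dict[str, float]) -> float:
    """Domain tool (here: a toy equation solver). Substitutes previously
    solved placeholders, then evaluates the arithmetic expression."""
    for name, value in bindings.items():
        expression = expression.replace(name, str(value))
    return float(eval(expression))  # acceptable in a trusted numeric sketch

def reify(abstract_chain: str) -> tuple[str, dict[str, float]]:
    """Fill placeholders left to right; later spans may reference earlier
    placeholders, which is what makes the chain "abstract"."""
    bindings: dict[str, float] = {}

    def fill(match: re.Match) -> str:
        expression, name = match.group(1).strip(), match.group(2)
        bindings[name] = solve(expression, bindings)
        return f"[{expression} = {bindings[name]}]"

    return SPAN.sub(fill, abstract_chain), bindings

chain = ("The toys cost [20 + 35 = y1] dollars in total, "
         "so the change from 100 dollars is [100 - y1 = y2] dollars.")
filled, values = reify(chain)
print(filled)        # ... [20 + 35 = 55.0] ... [100 - 55.0 = 45.0] ...
print(values["y2"])  # 45.0
```

In the actual method, the chain comes from a fine-tuned planner, and in the Wiki QA domain the tool is a retriever rather than an equation solver; the key property is that reify never needs to call the LLM again.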
The results show that CoA consistently outperforms baselines on both in-distribution and out-of-distribution test sets, especially on questions requiring complex chains of reasoning. CoA also answers questions in less time than baseline tool-augmented methods, though it is slightly less efficient than CoT-FT, which performs no tool calls. The key design choice is decoupling the generation of abstract reasoning chains from the retrieval of knowledge, which allows reasoning steps and tool calls to be processed in parallel.

The paper also discusses limitations, including the computational cost of full LLM fine-tuning and the open question of integrating CoA with advanced decoding strategies such as self-consistency. Overall, the method shows strong potential for adapting to new reasoning scenarios and improving LLM performance on multi-step reasoning tasks.
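That decoupling suggests a simple pipelined serving loop: because chain generation never blocks on tool outputs, the model can decode the next question's chain while tools fill the previous one. The sketch below illustrates this with threads; generate_chain and fill_with_tools are hypothetical stand-ins for the fine-tuned planner and the tool pass (such as the reify sketch above), not the paper's implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def generate_chain(question: str) -> str:
    """Stand-in for the fine-tuned planner LLM: returns an abstract
    chain with unresolved placeholders (GPU-bound in practice)."""
    return f"[2 + 3 = y1] relevant items for: {question}"

def fill_with_tools(chain: str) -> str:
    """Stand-in for the tool pass (e.g. the reify() sketch above),
    which may hit a calculator or a Wikipedia search API."""
    return chain.replace("[2 + 3 = y1]", "[2 + 3 = 5]")

def answer_batch(questions: list[str]) -> list[str]:
    # Stage 1 runs in worker threads: all chains are queued for decoding
    # immediately. Stage 2 runs in this thread: tool calls for question i
    # overlap with the decoding of questions i+1, i+2, ... instead of
    # stalling it, which is the source of CoA's speedup over baselines
    # that interleave decoding with tool calls.
    with ThreadPoolExecutor(max_workers=2) as pool:
        return [fill_with_tools(chain)
                for chain in pool.map(generate_chain, questions)]

print(answer_batch(["How many apples?", "How many pears?"]))
```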