The paper introduces Retrieval Augmented Thoughts (RAT), a method that enhances large language models' (LLMs) reasoning and generation capabilities on long-horizon tasks by iteratively revising the chain of thoughts (CoT) with retrieved information. RAT combines the strengths of Retrieval-Augmented Generation (RAG) and CoT prompting to mitigate hallucinations and improve the accuracy of intermediate reasoning steps. The method first generates an initial zero-shot CoT, then uses RAG to revise each thought step in turn, conditioning on the task prompt, the current and previously revised thought steps, and the retrieved information.
This approach is evaluated on a range of tasks, including code generation, mathematical reasoning, embodied task planning, and creative writing, using models such as GPT-3.5, GPT-4, and CodeLLaMA-7b. The results show significant performance gains, with average relative increases of 13.63% in code generation, 16.96% in mathematical reasoning, 19.2% in creative writing, and 42.78% in embodied task planning. The paper also discusses RAT's limitations, such as its reliance on the base LLM's capabilities and on the quality of retrieved knowledge, and suggests future directions for improving the method.
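The revision loop described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `llm` and `retrieve` are assumed placeholder callables (an LLM completion function and a document retriever), and the prompt wording is invented for the example.

```python
def rat(task_prompt, llm, retrieve):
    """Sketch of Retrieval Augmented Thoughts (RAT).

    llm:      callable(str) -> str, a text-completion model (placeholder).
    retrieve: callable(str) -> list[str], a document retriever (placeholder).
    """
    # Step 1: draft an initial zero-shot chain of thought.
    draft = llm(f"Task: {task_prompt}\nLet's think step by step.")
    draft_steps = [s for s in draft.split("\n") if s.strip()]

    revised = []
    for step in draft_steps:
        # Step 2: build a retrieval query from the task prompt,
        # the already-revised steps, and the current draft step.
        query = " ".join([task_prompt] + revised + [step])
        docs = retrieve(query)

        # Step 3: revise the current step given the retrieved evidence
        # and the previously revised steps.
        revised.append(llm(
            f"Task: {task_prompt}\n"
            f"Evidence: {docs}\n"
            f"Previous steps: {revised}\n"
            f"Revise this step: {step}"
        ))
    return "\n".join(revised)


if __name__ == "__main__":
    # Stub model and retriever, just to show the control flow end to end.
    def fake_llm(prompt):
        if "Let's think step by step" in prompt:
            return "outline the solution\nwrite the code"
        return "revised: " + prompt.rsplit("Revise this step: ", 1)[-1]

    def fake_retrieve(query):
        return ["relevant passage"]

    print(rat("implement quicksort", fake_llm, fake_retrieve))
```

The key design point, per the paper's description, is that each thought step is revised with retrieval conditioned on the steps revised so far, so later steps benefit from earlier corrections rather than from the original (possibly hallucinated) draft.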