10 Mar 2023 | Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao
The paper "ReAct: Synergizing Reasoning and Acting in Language Models" by Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao explores the integration of reasoning and acting capabilities in large language models (LLMs). The authors propose ReAct, a method that prompts LLMs to generate both reasoning traces and task-specific actions in an interleaved manner. This approach allows the model to dynamically reason and update action plans while interacting with external sources like knowledge bases or environments.
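The interleaved loop can be sketched as follows. This is a minimal illustration, not the paper's implementation: `fake_llm` and `fake_env` are hypothetical stand-ins for a few-shot-prompted frozen LLM and an external tool, and the Thought/Action/Observation format follows the trajectory style described in the paper.

```python
def fake_llm(prompt: str) -> str:
    # Stand-in for an LLM call; returns one Thought + Action step.
    if "Observation: Paris" in prompt:
        return "Thought: The observation gives the answer.\nAction: finish[Paris]"
    return "Thought: I should search for France.\nAction: search[France]"

def fake_env(action: str) -> str:
    # Stand-in for an external source such as a Wikipedia API.
    return "Paris" if action == "search[France]" else ""

def react(question: str, max_steps: int = 5) -> str:
    """Alternate reasoning (Thought) and acting (Action) until finish[...]."""
    prompt = f"Question: {question}\n"
    for _ in range(max_steps):
        step = fake_llm(prompt)                 # interleaved reasoning + action
        prompt += step + "\n"
        action = step.split("Action: ")[-1].strip()
        if action.startswith("finish["):
            return action[len("finish["):-1]    # extract the final answer
        # Feed the environment's observation back into the context.
        prompt += f"Observation: {fake_env(action)}\n"
    return ""

print(react("What is the capital of France?"))  # → Paris
```

The key point the sketch captures is that each observation is appended to the prompt, so the next reasoning trace can condition on what the action retrieved.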
ReAct is evaluated on various benchmarks, including question answering (HotpotQA), fact verification (Fever), text-based games (ALFWorld), and webpage navigation (WebShop). The results show that ReAct outperforms state-of-the-art baselines in terms of performance, human interpretability, and trustworthiness. Specifically, on HotpotQA and Fever, ReAct overcomes issues of hallucination and error propagation by interacting with a Wikipedia API, generating more interpretable task-solving trajectories. On ALFWorld and WebShop, ReAct outperforms imitation and reinforcement learning methods by absolute success rates of 34% and 10%, respectively, with only one or two in-context examples.
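For the HotpotQA and Fever setups, the paper's Wikipedia action space consists of three actions: `search[entity]`, `lookup[string]`, and `finish[answer]`. A toy dispatcher over an in-memory page store (hypothetical data, not the real Wikipedia API) might look like:

```python
# Toy "Wikipedia": maps a page title to its text (illustrative content only).
TOY_WIKI = {
    "Apple Remote": "The Apple Remote is a remote control. "
                    "It was designed to control the Front Row media program.",
}

def dispatch(action: str, state: dict) -> str:
    """Execute one ReAct-style action and return the observation string."""
    name, arg = action.split("[", 1)
    arg = arg.rstrip("]")
    if name == "search":
        # Load the page and remember its sentences for later lookups.
        page = TOY_WIKI.get(arg, "")
        state["sentences"] = page.split(". ") if page else []
        return state["sentences"][0] if state["sentences"] else f"Could not find {arg}."
    if name == "lookup":
        # Return the next sentence on the current page containing the string.
        hits = [s for s in state.get("sentences", []) if arg in s]
        return hits[0] if hits else f"No results for {arg}."
    if name == "finish":
        state["answer"] = arg
        return f"Episode finished with answer: {arg}"
    return "Invalid action."
```

Restricting the model to such a small, retrieval-only action space is what lets ReAct ground its reasoning in retrieved text rather than hallucinated internal knowledge.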
The key contributions of the paper include the introduction of ReAct, extensive experiments demonstrating its effectiveness, systematic ablations, and analysis of its limitations. The authors also discuss the importance of combining internal and external knowledge, the benefits of sparse reasoning, and the potential for further improvements through fine-tuning and multi-task training.