TREE SEARCH FOR LANGUAGE MODEL AGENTS

1 Jul 2024 | Jing Yu Koh, Stephen McAleer, Daniel Fried, Ruslan Salakhutdinov
The paper introduces an inference-time search algorithm designed to improve the performance of language model (LM) agents on realistic web tasks. The approach combines best-first tree search with an LM agent, so that the agent can explore and evaluate multiple action trajectories instead of committing to a single one. According to the authors, this is the first time search has been shown to significantly improve the success rates of LM agents in realistic web environments, as demonstrated on the VisualWebArena (VWA) and WebArena (WA) benchmarks. The search procedure is general and can be applied to other domains in future work. The authors argue that inference-time search will be a key component for building capable agents that can plan, reason, and act autonomously to perform computer tasks. The paper also discusses limitations and future directions, including the need for more efficient search algorithms and for handling destructive actions.
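
To make the idea of best-first tree search over action trajectories concrete, here is a minimal sketch in Python. It is not the authors' implementation: this summary does not describe how the agent proposes actions or scores states, so the callables propose_actions, step, and value, the parameters budget, branching, and max_depth, and the toy number-reaching environment are all hypothetical stand-ins chosen for illustration.

```python
import heapq
from itertools import count


def best_first_search(start, propose_actions, step, value,
                      budget=20, branching=3, max_depth=5):
    """Generic best-first search over action sequences (illustrative only).

    propose_actions(state) -> candidate actions (stand-in for the LM agent)
    step(state, action)    -> next state (stand-in for the environment)
    value(state)           -> score in [0, 1] (stand-in for a state evaluator)
    """
    tie = count()  # tie-breaker so the heap never compares states directly
    # heapq is a min-heap, so scores are negated to pop the best state first.
    frontier = [(-value(start), next(tie), start, [])]
    best_score, best_path = value(start), []
    expanded = 0

    while frontier and expanded < budget:
        neg_score, _, state, path = heapq.heappop(frontier)
        expanded += 1
        if -neg_score > best_score:
            best_score, best_path = -neg_score, path
        if best_score >= 1.0:        # evaluator considers the task solved
            break
        if len(path) >= max_depth:   # do not expand beyond the depth limit
            continue
        for action in list(propose_actions(state))[:branching]:
            nxt = step(state, action)
            heapq.heappush(
                frontier, (-value(nxt), next(tie), nxt, path + [action])
            )
    return best_path, best_score


# Toy environment (hypothetical): reach the integer TARGET starting from 1.
TARGET = 10
ACTIONS = {"+1": lambda s: s + 1, "*2": lambda s: s * 2, "+3": lambda s: s + 3}


def toy_value(s):
    # Closer to TARGET scores higher; exactly 1.0 when the target is reached.
    return max(0.0, 1.0 - abs(TARGET - s) / TARGET)


path, score = best_first_search(
    start=1,
    propose_actions=lambda s: ACTIONS.keys(),
    step=lambda s, a: ACTIONS[a](s),
    value=toy_value,
)
print(path, score)
```

In a web-agent setting, the state would be a page observation, the proposals would be sampled from the LM, and the evaluator would estimate task progress. Note that the sketch assumes any state can be revisited for free, which a live website does not guarantee; this connects to the destructive-action limitation mentioned above.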