BAGEL: Bootstrapping Agents by Guiding Exploration with Language


2024 | Shikhar Murty, Christopher D. Manning, Peter Shaw, Mandar Joshi, Kenton Lee
BAGEL is a method for bootstrapping language model (LM) agents without human supervision. It generates synthetic demonstrations by iteratively relabeling trajectories obtained from unconditioned exploration, using two LM components: an LM labeler that converts a trajectory into a synthetic instruction, and a zero-shot LM agent that maps that instruction back into a refined trajectory. Repeated round-trips between these components shift the distribution of trajectories toward ones that are well described by natural language. The resulting synthetic demonstrations serve as a drop-in replacement for expert demonstrations and can be used for in-context learning or fine-tuning.

Evaluated on ToolQA and MiniWoB++, BAGEL improves over a zero-shot ReAct baseline by 2-13% while reducing execution failures by up to 13×. Its synthetic demonstrations help by supplying relevant in-context examples and by improving the correctness and diversity of the agent's generated actions, with the strongest gains on complex tasks that require understanding environment dynamics. By using LM priors to shape otherwise random exploration, BAGEL automatically discovers plausible use cases in complex environments, highlighting the potential of synthetic demonstrations for training LM agents without human supervision.
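To make the round-trip relabeling loop concrete, here is a minimal Python sketch. The helpers explore, label_trajectory, and act are hypothetical stand-ins (not the paper's API) for the unconditioned exploration policy, the LM labeler, and the zero-shot LM agent, respectively; the number of rounds is an illustrative assumption.

```python
def bagel_round_trips(env, explore, label_trajectory, act, num_rounds=3):
    """Collect one synthetic demonstration via iterative relabeling (sketch)."""
    # Seed with a trajectory from unconditioned (instruction-free) exploration.
    trajectory = explore(env)
    instruction = None
    for _ in range(num_rounds):
        # LM labeler: describe the trajectory as a natural-language instruction.
        instruction = label_trajectory(trajectory)
        # Zero-shot LM agent: re-execute the instruction, yielding a refined
        # trajectory that is (hopefully) better aligned with the instruction.
        trajectory = act(env, instruction)
    # The final (instruction, trajectory) pair is one synthetic demonstration,
    # usable for in-context learning or fine-tuning.
    return instruction, trajectory
```

Each round-trip nudges the trajectory toward behavior that natural language describes well, which is the distribution shift the method relies on.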