2024 | Shikhar Murty, Christopher D. Manning, Peter Shaw, Mandar Joshi, Kenton Lee
BAGEL (Bootstrapping Agents by Guiding Exploration with Language) is a method for bootstrapping language model (LM) agents to follow natural language instructions in digital environments without human supervision. It converts a seed set of randomly explored trajectories or synthetic instructions into demonstrations through round-trips between two noisy LM components: an LM labeler that converts a trajectory into a synthetic instruction, and a zero-shot LM agent that maps that instruction back into a refined trajectory. Repeating these round-trips quickly shifts the initial distribution of trajectories toward ones that are well described by natural language. The resulting synthetic demonstrations are then used for in-context learning or fine-tuning at test time, which improves performance on benchmarks such as MiniWoB++ and ToolQA and reduces execution failures. BAGEL shows that LM priors can effectively shape random exploration, making it a useful tool for automatically discovering use cases in complex environments.
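The round-trip procedure described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: `round_trip`, `toy_labeler`, `toy_agent`, the list-of-strings trajectory format, the `max_rounds` cap, and the stop-when-unchanged rule are all assumptions made for illustration. In BAGEL both components are LMs and the agent acts in a real environment; here they are replaced by deterministic stand-ins so the loop can be run as is.

```python
from typing import Callable, List, Tuple

# Assumed representation: a trajectory is a list of action strings.
Trajectory = List[str]
Labeler = Callable[[Trajectory], str]   # trajectory -> synthetic instruction
Agent = Callable[[str], Trajectory]     # instruction -> refined trajectory

PREFIX = "perform "
SEP = " then "


def round_trip(seed: Trajectory, labeler: Labeler, agent: Agent,
               max_rounds: int = 3) -> Tuple[str, Trajectory]:
    """Alternate labeler and agent, starting from a seed trajectory.

    The stopping rule (no change after a full round-trip, or max_rounds
    reached) is an assumption of this sketch, not taken from the source.
    """
    trajectory = list(seed)
    instruction = labeler(trajectory)
    for _ in range(max_rounds):
        refined = agent(instruction)        # instruction -> trajectory
        relabeled = labeler(refined)        # trajectory -> instruction
        unchanged = refined == trajectory and relabeled == instruction
        trajectory, instruction = refined, relabeled
        if unchanged:
            break
    return instruction, trajectory


def toy_labeler(trajectory: Trajectory) -> str:
    """Hypothetical stand-in for the LM labeler: drop no-ops and repeats."""
    kept: List[str] = []
    for action in trajectory:
        if action != "noop" and action not in kept:
            kept.append(action)
    return PREFIX + SEP.join(kept)


def toy_agent(instruction: str) -> Trajectory:
    """Hypothetical stand-in for the zero-shot LM agent: parse the steps."""
    body = instruction[len(PREFIX):]
    return body.split(SEP) if body else []


# A noisy, randomly explored seed trajectory.
seed = ["click search", "noop", "click search", "type cats", "noop"]
instruction, trajectory = round_trip(seed, toy_labeler, toy_agent)
print(instruction)
print(trajectory)
```

In this toy setting the noisy seed collapses to a two-step trajectory whose description matches it exactly, which mirrors the idea that round-trips move trajectories toward ones that language describes well. The resulting (instruction, trajectory) pair plays the role of a synthetic demonstration.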