July, 2024 | Tamer Abuelsaad, Deepak Akkil, Prasenjit Dey, Ashish Jagmohan, Aditya Vempaty, Ravi Kokku
This paper introduces Agent-E, a novel web agent designed to perform complex web-based tasks. Agent-E introduces several architectural improvements over prior state-of-the-art web agents, including a hierarchical architecture, flexible DOM distillation and denoising methods, and the concept of *change observation* to enhance performance. The evaluation of Agent-E on the WebVoyager benchmark dataset shows that it achieves a success rate of 73.2%, outperforming other text-only and multi-modal web agents by 10-30%. The paper also synthesizes learnings from the development of Agent-E into general design principles for agentic systems, which include the use of domain-specific primitive skills, the importance of distilling and denoising environmental observations, the advantages of a hierarchical architecture, and the role of self-improvement to enhance agent efficiency and efficacy as the agent gathers experience. These principles can be applied beyond the realm of web automation.
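
To make the idea of *change observation* concrete, the following is a minimal sketch and not Agent-E's actual implementation. It assumes a page can be represented as a plain dictionary mapping element identifiers to visible text. The names `Snapshot`, `describe_changes`, and `act_and_observe` are hypothetical and introduced only for illustration. The sketch records the page state before an action, runs the action, and returns a short verbal description of what changed, which an agent could then use as feedback for its next step.

```python
from typing import Callable, Dict

# Hypothetical page model: element id -> visible text (an assumption for this sketch).
Snapshot = Dict[str, str]


def describe_changes(before: Snapshot, after: Snapshot) -> str:
    """Summarize in words what changed between two page snapshots."""
    added = sorted(k for k in after if k not in before)
    removed = sorted(k for k in before if k not in after)
    modified = sorted(k for k in after if k in before and before[k] != after[k])

    if not (added or removed or modified):
        return "No visible change was observed after the action."

    parts = []
    if added:
        parts.append("new elements appeared: " + ", ".join(added))
    if removed:
        parts.append("elements disappeared: " + ", ".join(removed))
    if modified:
        parts.append("elements changed: " + ", ".join(modified))
    return "After the action, " + "; ".join(parts) + "."


def act_and_observe(page: Snapshot, action: Callable[[Snapshot], None]) -> str:
    """Run an action against the page and report the observed change."""
    before = dict(page)  # shallow copy is enough because values are strings
    action(page)         # the action mutates the page in place
    return describe_changes(before, page)


# Toy usage: typing a query fills the search box and opens a suggestion list.
page = {"search-box": "", "results": "0 items"}


def type_query(p: Snapshot) -> None:
    p["search-box"] = "laptops"
    p["suggestions"] = "laptops under $500"


feedback = act_and_observe(page, type_query)
print(feedback)
```

Returning the observation as a sentence rather than a raw diff is a design choice in this sketch: it keeps the feedback in a form a language-model planner can read directly.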