The paper introduces AGENTGYM, a framework designed to evolve large language model (LLM)-based agents across diverse environments and tasks. The framework aims to address the limitations of current approaches, which either require human supervision for step-by-step imitation or produce specialist agents with limited generalization. AGENTGYM comprises a platform with 14 environments and 89 tasks, an expanded instruction set, a set of high-quality trajectories, and an evolution method. The authors also propose AGENTEVAL, a benchmark suite for evaluating the potential of agent self-evolution, and AGENTEVOL, an algorithm for exploring self-evolution that they report to be effective at evolving agents across multiple environments and tasks. Experimental results show that the evolved agents achieve performance comparable to or better than state-of-the-art (SOTA) models. The paper also discusses limitations and future directions, emphasizing the importance of safety and ethical considerations in the development of self-evolving agents.