TIMEARENA: Shaping Efficient Multitasking Language Agents in a Time-Aware Simulation

8 Feb 2024 | Yikai Zhang, Siyu Yuan, Caiyu Hu, Kyle Richardson, Yanghua Xiao, Jiangjie Chen
TIMEARENA is a novel textual simulated environment designed to evaluate the multitasking capabilities of language agents in a time-aware setting. The environment incorporates complex temporal dynamics and constraints that reflect real-life planning scenarios. Agents must complete multiple tasks as quickly as possible and may perform actions in parallel to save time. TIMEARENA includes 30 real-world tasks drawn from cooking, household activities, and laboratory work. The environment models the dependencies between actions, the time duration of each action, and the occupancy of agents and objects. Experiments with various state-of-the-art LLMs (e.g., GPT-4) reveal that even the most powerful models still lag behind humans in effective multitasking, highlighting the need for enhanced temporal awareness in language agents. The study contributes to the field by exploring the integration of time in textual simulations and by providing a comprehensive evaluation framework for language agents' multitasking efficiency.
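To make the interplay of action dependencies, durations, and agent occupancy more concrete, the following is a minimal sketch of a time-aware scheduler. It is not TIMEARENA's implementation: the Action class, the occupies_agent flag, the greedy one-minute-tick policy, and the tea-making actions with their durations are all hypothetical, chosen only for illustration. Object occupancy is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    # Hypothetical action model, not taken from TIMEARENA itself.
    name: str
    duration: int             # minutes the action takes (assumed positive)
    occupies_agent: bool      # True if the agent is busy for the whole duration
    depends_on: tuple = ()    # names of actions that must finish first


def sequential_time(actions):
    """Total time when actions run strictly one after another."""
    return sum(a.duration for a in actions)


def multitask_time(actions):
    """Greedy one-minute-tick simulation with a single agent.

    Background actions (occupies_agent=False) start as soon as their
    dependencies are done; agent-occupying actions also wait for the agent.
    """
    horizon = sequential_time(actions)  # a valid greedy schedule never exceeds this
    finish_at = {}                      # action name -> finish time
    pending = list(actions)
    agent_free_at = 0
    now = 0
    while pending:
        if now > horizon:
            raise ValueError("unsatisfiable or cyclic dependencies")
        for a in list(pending):
            deps_done = all(
                finish_at.get(d, float("inf")) <= now for d in a.depends_on
            )
            if not deps_done:
                continue
            if a.occupies_agent:
                if agent_free_at > now:
                    continue            # agent is still busy with another action
                agent_free_at = now + a.duration
            finish_at[a.name] = now + a.duration
            pending.remove(a)
        now += 1
    return max(finish_at.values(), default=0)


# Illustrative actions and durations (assumed, not from the paper).
ACTIONS = (
    Action("fill kettle", 1, occupies_agent=True),
    Action("boil water", 5, occupies_agent=False, depends_on=("fill kettle",)),
    Action("wash cup", 3, occupies_agent=True),
    Action("brew tea", 2, occupies_agent=True,
           depends_on=("boil water", "wash cup")),
)

print(sequential_time(ACTIONS))  # 11
print(multitask_time(ACTIONS))   # 8
```

In this toy example the agent washes the cup while the water boils, so the overlapping schedule finishes in 8 minutes instead of 11. This kind of gap between sequential and parallel completion time is what a time-aware evaluation can measure.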