AMEX: Android Multi-annotation Expo Dataset for Mobile GUI Agents

29 May 2025 | Yuxiang Chai, Siyuan Huang, Yazhe Niu, Han Xiao, Liang Liu, Guozhi Wang, Dingyu Zhang, Shuai Ren, Hongsheng Li
The Android Multi-annotation EXpo (AMEX) dataset is introduced to advance research on AI agents for mobile GUI control. AMEX is a comprehensive, large-scale dataset designed for generalist mobile GUI-control agents, which complete tasks by interacting with the graphical user interface (GUI) on mobile devices. The dataset includes over 104K high-resolution screenshots from popular mobile applications, annotated at three levels: GUI interactive element grounding, GUI screen and element functionality descriptions, and complex natural language instructions with stepwise GUI-action chains. Unlike existing GUI-related datasets, AMEX provides more detailed and instructive annotations, complementing the general settings of existing datasets.
The dataset is collected through human instruction-following GUI manipulations and autonomous GUI controls, ensuring high-quality annotations. The effectiveness of AMEX is demonstrated by fine-tuning a baseline model, SPHINX Agent, on three benchmark datasets: ScreenSpot, ANDROIDCONTROL, and AitW. The results show that AMEX significantly improves the performance of GUI element grounding and GUI-control tasks, highlighting its value for advancing the development of mobile GUI agents.
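To make the three annotation levels concrete, here is a minimal Python sketch of how one episode could be represented in memory. Every class and field name below (`ElementAnnotation`, `ScreenAnnotation`, `ActionStep`, `Episode`, `bbox`, `functionality`, and so on) is a hypothetical illustration and not the actual AMEX file format, which the summary above does not specify.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ElementAnnotation:
    """Hypothetical record for one interactive GUI element."""
    # Level 1 (grounding): bounding box as (left, top, right, bottom) pixels.
    bbox: Tuple[int, int, int, int]
    # Level 2 (functionality): what the element does, if annotated.
    functionality: Optional[str] = None

    def contains(self, x: int, y: int) -> bool:
        left, top, right, bottom = self.bbox
        return left <= x <= right and top <= y <= bottom


@dataclass
class ScreenAnnotation:
    """Hypothetical record for one screenshot."""
    screenshot_path: str
    elements: List[ElementAnnotation] = field(default_factory=list)
    # Level 2 (functionality): description of the whole screen.
    description: Optional[str] = None


@dataclass
class ActionStep:
    """One step of a level-3 GUI-action chain."""
    screen: ScreenAnnotation
    action_type: str  # assumed vocabulary, e.g. "tap" or "type"
    point: Optional[Tuple[int, int]] = None

    def grounded_element(self) -> Optional[ElementAnnotation]:
        # Return the first annotated element containing the action point.
        if self.point is None:
            return None
        x, y = self.point
        for element in self.screen.elements:
            if element.contains(x, y):
                return element
        return None


@dataclass
class Episode:
    """Level 3: a natural language instruction plus its stepwise actions."""
    instruction: str
    steps: List[ActionStep] = field(default_factory=list)


# Illustrative data only; not taken from the dataset.
search_box = ElementAnnotation(bbox=(40, 100, 680, 160),
                               functionality="Open the search field")
home = ScreenAnnotation("home.png", [search_box], "App home screen")
episode = Episode("Search for a nearby cafe",
                  [ActionStep(home, "tap", point=(300, 130))])
```

Linking each action step back to an annotated element, as `grounded_element` does here, is one way the grounding and functionality levels could support the instruction level; whether AMEX stores such a link explicitly is an assumption of this sketch.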