15 Feb 2024 | Zhiyong Wu, Chengcheng Han, Zichen Ding, Zhenmin Weng, Zhoumianze Liu, Shunyu Yao, Tao Yu, Lingpeng Kong
OS-Copilot is a framework designed to build generalist computer agents capable of interacting with comprehensive elements in an operating system (OS), including the web, code terminals, files, multimedia, and various third-party applications. The framework introduces FRIDAY, a self-improving embodied agent that automates general computer tasks. FRIDAY demonstrates strong generalization and performance on the GAIA benchmark, outperforming previous methods by 35%. It also shows the ability to learn and control unfamiliar applications through self-directed learning, as evidenced by its success in spreadsheet manipulation tasks. The OS-Copilot framework and FRIDAY's empirical findings provide valuable insights and infrastructure for future research in building more capable and general-purpose computer agents.