The Arcade Learning Environment: An Evaluation Platform for General Agents

2013 | Marc G. Bellemare, Yavar Naddaf, Joel Veness, Michael Bowling
The article introduces the Arcade Learning Environment (ALE), a platform and methodology for evaluating the development of general, domain-independent AI technology. ALE provides an interface to hundreds of Atari 2600 game environments, each originally designed to be interesting and challenging for human players. The platform poses significant research challenges for reinforcement learning, model learning, planning, imitation learning, transfer learning, and intrinsic motivation. The authors illustrate the promise of ALE by developing and benchmarking domain-independent agents built on well-established AI techniques for both reinforcement learning and planning.

They propose an evaluation methodology and report empirical results on over 55 different games, demonstrating ALE's potential as a rigorous testbed for evaluating and comparing approaches to these problems. The software, including the benchmark agents, is publicly available. The article also discusses the difficulty of comparing general agents across diverse domains and introduces performance metrics such as normalized scores, median scores, and score distributions. Finally, it highlights the unique advantages of the Atari 2600 platform for AI research and the potential for extending the challenge to more recent video game platforms.
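The "interface to hundreds of Atari 2600 game environments" maps onto a simple agent-environment loop in code. The sketch below shows a random, domain-independent agent playing one episode; it assumes the modern `ale-py` Python bindings rather than the original C++ release described in the paper, and `breakout.bin` is a placeholder path to a locally provided ROM.

```python
# A minimal sketch of the ALE agent-environment loop, assuming the
# ale-py Python bindings; "breakout.bin" is a placeholder ROM path.
import random
from ale_py import ALEInterface

ale = ALEInterface()
ale.setInt("random_seed", 123)       # seed the emulator for reproducibility
ale.loadROM("breakout.bin")          # load a locally provided game ROM

actions = ale.getMinimalActionSet()  # the legal actions for this game

episode_return = 0.0
while not ale.game_over():
    a = random.choice(actions)       # a random, domain-independent policy
    episode_return += ale.act(a)     # act() returns the reward for one step

print(f"Episode return: {episode_return}")
ale.reset_game()
```

Because the same loop works unchanged for any supported game, an agent can be benchmarked across the full suite without game-specific code, which is exactly the generality the platform is meant to test.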
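Raw Atari scores are not comparable across games (a good Breakout score is tens of points, a good Seaquest score is thousands), which is why the paper aggregates with normalized scores, medians, and score distributions. Below is a minimal sketch of one plausible normalization of this kind, mapping each game's score onto a common scale via per-game baseline and reference points; the numbers are illustrative, not results from the paper.

```python
# Sketch of per-game score normalization and cross-game aggregation.
# Baseline/reference values here are made up for illustration.
from statistics import median

def normalized_score(score: float, baseline: float, reference: float) -> float:
    """Map a raw score so that baseline -> 0 and reference -> 1."""
    return (score - baseline) / (reference - baseline)

# Hypothetical raw data: (agent score, baseline score, reference score)
raw = {
    "asterix":  (862.0, 210.0, 1500.0),
    "breakout": (5.2,   1.2,   32.0),
    "seaquest": (288.0, 110.0, 1200.0),
}

scores = {game: normalized_score(s, b, r) for game, (s, b, r) in raw.items()}
print(scores)                               # per-game normalized scores
print("median:", median(scores.values()))   # a single cross-game summary
```

The median (rather than the mean) keeps a single outlier game from dominating the aggregate, and the full distribution of normalized scores gives a richer picture than any single summary number.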