16 Aug 2017 | Oriol Vinyals, Timo Ewalds, Sergey Bartunov, Petko Georgiev, Alexander Sasha Vezhnevets, Michelle Yeo, Alireza Makhzani, Heinrich Küttler, John Agapiou, Julian Schrittwieser, John Quan, Stephen Gaffney, Stig Petersen, Karen Simonyan, Tom Schaul, Hado van Hasselt, David Silver, Timothy Lillicrap, Kevin Calderone, Paul Keet, Anthony Brunasso, David Lawrence, Anders Ekermo, Jacob Repp, Rodney Tsing
This paper introduces SC2LE (StarCraft II Learning Environment), a reinforcement learning environment based on the game StarCraft II. SC2LE presents a new and challenging domain for reinforcement learning: it features multiple interacting agents, imperfect information, a large action space, a complex state space, and delayed credit assignment over long horizons. The release includes an open-source Python interface (PySC2) for communicating with the game engine, a set of mini-games that isolate individual gameplay elements, and a dataset of human expert game replays. Initial baselines show that neural networks trained on the replay data can predict game outcomes and player actions, while deep reinforcement learning agents make little progress on the full game, underscoring the environment's difficulty. The paper also discusses related work and gives a detailed specification of the environment's observations, actions, and rewards.
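The Python interface exposes a standard agent-environment loop: each step returns feature-layer observations and a reward, and the agent replies with one of the available action functions. Below is a minimal sketch of that loop on the MoveToBeacon mini-game using the pysc2 package. The constructor arguments shown (players, agent_interface_format, feature dimensions) reflect a later pysc2 release and have changed across versions, so treat them as illustrative rather than canonical; the agent here only issues no-ops.

```python
from pysc2.env import sc2_env
from pysc2.lib import actions, features


def run_episode():
    """Run one episode on the MoveToBeacon mini-game, acting with no-ops."""
    with sc2_env.SC2Env(
            map_name="MoveToBeacon",
            players=[sc2_env.Agent(sc2_env.Race.terran)],
            agent_interface_format=features.AgentInterfaceFormat(
                feature_dimensions=features.Dimensions(screen=84, minimap=64)),
            step_mul=8,        # act once every 8 game frames
            visualize=False) as env:
        timesteps = env.reset()
        total_reward = 0
        while not timesteps[0].last():
            # A real agent would choose from
            # timesteps[0].observation.available_actions and supply the
            # spatial/flat arguments that the chosen function requires.
            timesteps = env.step([actions.FUNCTIONS.no_op()])
            total_reward += timesteps[0].reward
        return total_reward


if __name__ == "__main__":
    print("Episode reward:", run_episode())
```

The `step_mul` setting controls how many game frames elapse per agent action; the mini-games score the agent through the per-step reward, whereas the full game only exposes the win/tie/loss outcome at the end of an episode.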