Grandmaster-Level Chess Without Search

2024 | Anian Ruoss, Grégoire Delétang, Sourabh Medapati, Jordi Grau-Moya, Li Kevin Wenliang, Elliot Catt, John Reid and Tim Genewein
This paper presents a grandmaster-level chess policy trained purely with supervised learning, without explicit search or domain-specific heuristics. The model is trained on a dataset of 10 million chess games annotated with action-values from the Stockfish 16 engine, yielding approximately 15 billion data points. The resulting 270-million-parameter transformer achieves a Lichess blitz Elo of 2895 against human opponents and solves challenging chess puzzles without any explicit search algorithm. It outperforms AlphaZero's policy and value networks (used without Monte Carlo tree search) as well as GPT-3.5-turbo-instruct.

The results show that strong chess performance emerges only at sufficient scale: larger models and larger datasets consistently improve play. The network is trained to predict action-values (and, in ablations, state-values) for chess boards; at test time the policy simply selects the legal move with the highest predicted value. The paper also discusses limitations of the approach, including the model's inability to detect threefold repetition, since it sees only the current board and no game history, and its susceptibility to occasional tactical mistakes.

The study demonstrates that a complex algorithm like Stockfish can be approximated by a feed-forward neural network through supervised learning, suggesting a shift in how large transformers are viewed: not merely as statistical pattern recognizers, but as powerful general-purpose algorithm approximators.
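To make the inference procedure concrete, here is a minimal sketch of the search-free policy: the model assigns a win probability to every legal move from the current position, and the engine simply plays the highest-scoring one. The `predict_win_prob` stub below is a hypothetical placeholder for the trained transformer (its name and signature are assumptions, not the authors' API); board handling uses the python-chess library.

```python
# Minimal sketch of the search-free policy: score every legal move with a
# learned action-value predictor and play the argmax. No tree of future
# positions is ever expanded.

import random

import chess  # third-party: pip install python-chess


def predict_win_prob(fen: str, move_uci: str) -> float:
    """Hypothetical stand-in for the trained 270M-parameter transformer.

    In the paper, the model maps a (board state, move) pair to a
    discretized win-probability estimate distilled from Stockfish 16.
    Here we return a random value purely so the sketch executes.
    """
    return random.random()


def select_move(board: chess.Board) -> chess.Move:
    # One forward pass per legal move, then argmax; no search.
    return max(
        board.legal_moves,
        key=lambda move: predict_win_prob(board.fen(), move.uci()),
    )


board = chess.Board()
print(select_move(board))  # e.g. "g1h3" with the random stub
```

Everything the policy knows about chess is implicit in the learned value estimates, which is also why history-dependent rules such as threefold repetition are invisible to it: the input is the current board alone.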