Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search

Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search

May 2006 | Rémi Coulom
This paper presents a new framework for combining Monte-Carlo evaluation with tree search in the game of Go. The traditional approach, which combines alpha-beta pruning with a heuristic evaluator, has been successful for games like chess but has struggled with Go due to its dynamic nature. The proposed method does not separate the min-max phase from the Monte-Carlo phase, instead defining a more general backup operator that gradually transitions from averaging to min-max as the number of simulations increases. This approach allows for fine-grained control over tree growth and efficient selectivity methods. The algorithm was implemented in the Crazy Stone program, which won the 10th KGS computer-Go tournament. The paper discusses the structure of the algorithm, the selectivity and backup operators, and game results, highlighting the program's performance against other advanced Go-playing programs. Future research directions include improving the selectivity algorithm, incorporating game-specific knowledge, and scaling the approach to larger boards.This paper presents a new framework for combining Monte-Carlo evaluation with tree search in the game of Go. The traditional approach, which combines alpha-beta pruning with a heuristic evaluator, has been successful for games like chess but has struggled with Go due to its dynamic nature. The proposed method does not separate the min-max phase from the Monte-Carlo phase, instead defining a more general backup operator that gradually transitions from averaging to min-max as the number of simulations increases. This approach allows for fine-grained control over tree growth and efficient selectivity methods. The algorithm was implemented in the Crazy Stone program, which won the 10th KGS computer-Go tournament. The paper discusses the structure of the algorithm, the selectivity and backup operators, and game results, highlighting the program's performance against other advanced Go-playing programs. Future research directions include improving the selectivity algorithm, incorporating game-specific knowledge, and scaling the approach to larger boards.
Reach us at info@study.space
[slides and audio] Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search