[ Question ]

Is AlphaZero any good without the tree search?

by steve2152 1 min read30th Jun 20199 comments


One component of AlphaZero is a neural net which takes a board position as input, and outputs a guess about how good the position is and what a good next move would be. It combines this neural net with Monte Carlo Tree Search (MCTS) that plays out different ways the game could go, before choosing the move. The MCTS is used both during self-play to train the neural net, and during competitive test-time. I'm mainly curious about whether the latter is necessary.

So my question is: Once you have the fully-trained AlphaZero system, if you then turn off the MCTS and just choose moves directly with the neural net policy head, is it any good? Is it professional-level, amateur-level, child-level?

(I think this would be a fun little data-point related to discussions of how powerful an AI can be with and without mesa-optimization / search-processes using a generative environmental model.)

New Answer
Ask Related Question
New Comment

1 Answers

The paper includes the ELO for just the NN. I believe it's professional level but not superhuman, but you should check if you really need to know. However, note that Alphazero's actual play doesn't use MCTS at all, it uses a simple tree search which only descends a few ply.