It sounds crazy indeed. But your typical MCTS (without node priors) has
no knowledge of positions at all; it just learns which of the available
actions seem to work best. The classifier actually has more knowledge
than a typical MCTS tree, since it generalizes between positions, as you
said. And it generalizes based on the intuition that good replies to the
same moves are often useful in many branches of the search tree; see our
"Power of Forgetting" paper.
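To make the "no knowledge of positions" point concrete, here is a minimal UCT sketch on a toy game (one-heap Nim, not anything from the discussion above; the game, function names, and constants are all my own illustration). Note that each node stores only visit counts and reward sums for the actions tried from it; no positional features enter anywhere.

```python
import math
import random

random.seed(1)
LEGAL = (1, 2)  # toy Nim: remove 1 or 2 stones; taking the last stone wins

class Node:
    def __init__(self):
        self.visits = 0
        self.total = 0.0    # summed rewards, from the view of the player to move here
        self.children = {}  # action -> Node

def rollout(stones):
    """Random playout; +1 if the player to move from here wins, else -1."""
    turn = 0
    while stones > 0:
        stones -= random.choice([a for a in LEGAL if a <= stones])
        turn ^= 1
    return 1 if turn == 1 else -1  # last mover took the final stone and won

def simulate(node, stones):
    """One MCTS iteration; returns the reward for the player to move at node."""
    node.visits += 1
    if stones == 0:
        node.total += -1   # previous player took the last stone: we lost
        return -1
    actions = [a for a in LEGAL if a <= stones]
    untried = [a for a in actions if a not in node.children]
    if untried:
        a = random.choice(untried)
        child = node.children[a] = Node()
        r = -rollout(stones - a)     # child value is from the opponent's view
        child.visits += 1
        child.total += -r
    else:
        # UCB1 over the stored statistics -- pure action statistics, no priors
        def ucb(a):
            c = node.children[a]
            return -c.total / c.visits + math.sqrt(2 * math.log(node.visits) / c.visits)
        a = max(actions, key=ucb)
        r = -simulate(node.children[a], stones - a)
    node.total += r
    return r

def best_move(stones, iters=2000):
    root = Node()
    for _ in range(iters):
        simulate(root, stones)
    return max(root.children, key=lambda a: root.children[a].visits)
```

With 4 stones left, taking 1 leaves the opponent a lost position (a multiple of 3), so `best_move(4)` should settle on 1 -- learned purely from which action's statistics look best, with no representation of the position itself.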
I'm interested in integrating this neural network approach into MCTS
such that the convergence properties of MCTS are not lost...
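One common way to get that (a sketch of the general idea, not of your system): use the network output only as an exploration bias that decays with visit counts, e.g. a PUCT-style selection rule. Since the prior term vanishes as a child's visit count grows, the sampled value estimates dominate in the limit, so selection asymptotically behaves like plain bandit-based MCTS. The function and its constant below are my own illustration:

```python
import math

def puct_score(q, prior, parent_visits, child_visits, c=1.5):
    """Selection score for one child: value estimate q plus a prior-weighted
    exploration bonus that shrinks as child_visits grows, so the prior
    cannot permanently suppress any move and q dominates in the limit."""
    return q + c * prior * math.sqrt(parent_visits) / (1 + child_visits)
```

For example, a child with prior 1.0 gets a large bonus at 0 visits but almost none at 50 visits, and for very large visit counts the score is essentially just `q`.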
By the way, as far as I understand, the network is not built from
scratch before each move, but before each game. Rebuilding it before
each move could be an interesting approach as well (maybe for longer
time settings?).
_______________________________________________
Computer-go mailing list
Compu...@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
> It sounds crazy to me that it works at all as it has no real knowledge of the position.