After carefully training a combined policy-value net, I finally did some runs on CGOS:
http://www.yss-aya.com/cgos/19x19/bayes.html
The accounts are
NG-04 (and NG-04b with a ladder bug fixed, probably showing up two days from now)
This account does 10000 playouts/move with
param uct_expand_after 20
which means about 500 node expansions per move (10000 / 20), each with a policy-value net call.
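
The 500 figure is just a back-of-the-envelope division. A minimal sketch, assuming every expansion costs exactly uct_expand_after playouts (in a real search some nodes never reach the threshold, so this is a rough upper estimate; the function name is made up for illustration):

```python
def approx_expansions(playouts_per_move, expand_after):
    """Rough estimate of node expansions (= net calls) per move.

    Assumes each expanded node consumed exactly `expand_after` playouts
    before expansion; this is a simplification, not the engine's logic.
    """
    return playouts_per_move // expand_after


print(approx_expansions(10000, 20))  # 500
```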
NG-04 has an Elo of about 3100, which puts it between these two entries:

CGOS name            CGOS Elo  KGS rank  KGS botname  KGS time
Aya788d_p1v0_6c12t   3235      6d        AyaMC        1min + 15sec x10
Aya786m_4c           2965      4d        AyaMC4       1min + 15sec x10

(from http://computer-go.org/pipermail/computer-go/2016-June/009444.html)
So one might think this would be around KGS 5d?!
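
The 5d guess can be checked by interpolating between the two Aya reference points. A minimal sketch, assuming KGS rank scales linearly with CGOS Elo in this range (that is only an assumption, and the function name is made up for illustration):

```python
def interpolate_kgs_dan(elo, lo, hi):
    """Linearly interpolate a KGS dan rank from a CGOS Elo.

    `lo` and `hi` are (cgos_elo, kgs_dan) reference pairs.
    Linear scaling between them is an assumption, not a known fact.
    """
    (elo_lo, dan_lo), (elo_hi, dan_hi) = lo, hi
    frac = (elo - elo_lo) / (elo_hi - elo_lo)
    return dan_lo + frac * (dan_hi - dan_lo)


# Reference points: Aya786m_4c (2965, 4d) and Aya788d_p1v0_6c12t (3235, 6d)
print(interpolate_kgs_dan(3100, (2965, 4), (3235, 6)))  # 5.0
```

Elo 3100 sits exactly halfway between 2965 and 3235, so the estimate lands on 5d.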
The other account of interest is the pure policy network account, which only plays the network's best move (thus it is deterministic, but I checked that the opponents did not play deterministically):
NG-cnn_proe_1b with Elo 2464, which should be about 2d on KGS.
Now that we win some games against Zen, it appears that the ladder reader has to be improved :) Zen does it perfectly, and we only read simple ladder ends. If there are two different possibilities before the end which become important later, we cannot read this :(
Let me know if someone wants to work on this :)