I plotted the default and some alternative cpuct settings. Each setting consists of three values: the initial cpuct, the base, and the factor. The Y-axis shows the final_cpuct value that is actually applied in MCTS, after it is computed from these 3 cpuct settings.
option name CPuct type string default 3.000000
option name CPuctBase type string default 19652.000000
option name CPuctFactor type string default 2.000000
As the node count increases, especially at long TC, the final_cpuct value also increases.
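A minimal sketch of how the plotted curves could be reproduced. It assumes final_cpuct = CPuct + CPuctFactor * ln((nodes + CPuctBase) / CPuctBase), which is my understanding of the Lc0 formula and is not stated in this post. The function name final_cpuct and the node counts in the loop are chosen only for illustration; the second setting is the base 49652 one tested in the matches below.

```python
import math

def final_cpuct(nodes, cpuct=3.0, cpuct_base=19652.0, cpuct_factor=2.0):
    """Assumed formula: cpuct + cpuct_factor * ln((nodes + cpuct_base) / cpuct_base)."""
    return cpuct + cpuct_factor * math.log((nodes + cpuct_base) / cpuct_base)

# Default setting vs. the alternative [3.0, 49652, 2.0] setting.
default = dict(cpuct=3.0, cpuct_base=19652.0, cpuct_factor=2.0)
alt = dict(cpuct=3.0, cpuct_base=49652.0, cpuct_factor=2.0)

# Illustrative node counts, not measured values.
for n in (0, 800, 10_000, 100_000, 1_000_000):
    print(f"{n:>9}  default={final_cpuct(n, **default):.3f}  alt={final_cpuct(n, **alt):.3f}")
```

Under this assumed formula both settings start at the initial cpuct at 0 nodes, and a larger base makes the value grow more slowly as nodes increase.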
I ran 2 matches using net id 32106. No TB, no win adjudication, but with draw adjudication, using cutechess-cli.
1. TC 15s + 0.1s
Default in blue vs new setting in orange
Orange:
CPuct = 3.0
CPuctBase = 49652
CPuctFactor = 2.0
Result
# PLAYER                                :  RATING  ERROR  POINTS  PLAYED  (%)
1 Lc0 v0.19.1 32106 cpuct_3.0_49652_2.0 :    22.4   34.4    18.0      32   56
2 Lc0 v0.19.1 32106 cpuct_def           :   -22.4   34.4    14.0      32   44
2. TC 300s + 2s or 5m + 2s
Default in blue vs new setting in orange
This match took a lot of time because Lc0 was trolling. To save time, win adjudication should be enabled.
Result
# PLAYER                                :  RATING  ERROR  POINTS  PLAYED  (%)
1 Lc0 v0.19.1 32106 cpuct_3.0_49652_2.0 :    13.6   26.4    14.0      26   54
2 Lc0 v0.19.1 32106 cpuct_def           :   -13.6   26.4    12.0      26   46
From these small samples at both short and longer TC, the new setting looks slightly better, although both rating differences are within the reported error margins. The difference vs the default is that it has a lower final_cpuct value within the node boundaries (green and violet) at the different TCs.
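As a rough cross-check of the two result tables, here is a sketch that converts points and games into an Elo difference with the standard logistic formula. This is an assumption about how to read the numbers, not necessarily how the rating tool computed them; the function name elo_diff is hypothetical.

```python
import math

def elo_diff(points, games):
    """Elo difference implied by a score fraction (standard logistic model)."""
    score = points / games
    return 400.0 * math.log10(score / (1.0 - score))

# Points and games of the new setting, taken from the two tables above.
print(f"15s + 0.1s: {elo_diff(18.0, 32):+.1f} Elo")
print(f"300s + 2s:  {elo_diff(14.0, 26):+.1f} Elo")
```

This gives roughly 44 and 27 Elo, close to the gaps between the two players in the tables (44.8 and 27.2). With 32 and 26 games the error bars remain larger than the measured differences.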
Match games conditions:
Start pgn: 2moves_LT_1000.pgn
Each opening is played twice, with sides reversed
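A sketch of how a match under these conditions might be launched, written as a small Python helper that builds a cutechess-cli argument list. The engine command "lc0", the draw adjudication thresholds, and the helper name build_match_cmd are placeholders I am assuming for illustration; the post does not give the actual values, and the net file option is omitted for the same reason.

```python
def build_match_cmd(tc, rounds, lc0_cmd="lc0"):
    """Build a cutechess-cli argument list (placeholder values are marked)."""
    return [
        "cutechess-cli",
        # Default engine: no cpuct options overridden.
        "-engine", "name=cpuct_def", f"cmd={lc0_cmd}",  # lc0_cmd is a placeholder path
        # New setting from the post: [3.0, 49652, 2.0].
        "-engine", "name=cpuct_3.0_49652_2.0", f"cmd={lc0_cmd}",
        "option.CPuct=3.0", "option.CPuctBase=49652", "option.CPuctFactor=2.0",
        "-each", "proto=uci", f"tc={tc}",
        "-openings", "file=2moves_LT_1000.pgn", "format=pgn",
        # Two games per opening with colors swapped.
        "-games", "2", "-rounds", str(rounds), "-repeat",
        # Draw adjudication only; these thresholds are assumed, not from the post.
        "-draw", "movenumber=40", "movecount=8", "score=10",
    ]

# 32 games in the short match corresponds to 16 openings played twice.
print(" ".join(build_match_cmd("15+0.1", rounds=16)))
```

No -resign and no -tb options are passed, matching the "no win adjudication, no TB" conditions. For the longer match the time control would be "300+2".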
It would be interesting to test that yellow line at [3.0, 79652, 1.5]
This is the frequency distribution of node counts at the two TCs used, to show which node counts dominated the search given the GPU used.
In all my testing of 800-node games, A0's 2.5 cpuct base setting has always seemed to perform best at that node count.