I have been following this project since May. I started out enjoying Leela's play with the old main nets. Then test10 started growing in strength, and there was TCEC, CCCC, etc.
I play chess and can follow the games, but I have no background whatsoever in AI, so I could not really understand what was happening with the project.
When test20 started, I decided to try to understand it better and to enjoy watching how it evolves. Most of the decisions and technical discussions happen in the Discord channel, so
I started reading it carefully, trying to extract clues about what was going on. I realized many people have the same problem: for many of us the only parameter to follow was self-Elo,
but it can be totally misleading.
In a recent thread I wrote some messages about what I have learned about test20...
This is the thread:
I will copy those messages here and keep reporting news about test20, trying to explain what is going on with it.
I must stress from the very beginning that I am not an expert, so take these explanations and news with caution (maybe someone with more knowledge can correct my mistakes).
Below I copy my previous messages. I apologize for repeating them, but I want this thread to be self-contained.
First message:
" The main "experiment" in test20 is the change to cpuct. Test10 used a value close to 1 and test20 now uses 5. That is the important parameter.
The point, according to other studies, is that a low cpuct gives fast gains early on but then progress slows down. Test10 gained about 80-85% of its final strength very soon, as you mention.
On the other hand, a cpuct of 5 should start slower but then grow much more steadily and reach a higher final ceiling (by the way, test10 is about 3500 Elo; we should not expect a 4000 Elo ceiling either).
So, although test20 seems to be going too slowly, the expectation is that at some point in the future it will catch up with test10. The experiment is set to last for about 40-44M games, so it would not be very wise to stop right now, or we would learn nothing.
There are other aspects, such as accuracy, tactical sharpness, etc., that are supposed to improve as well (again, I am not an expert).
Another parameter that is being changed is the resign threshold. That causes spikes whenever it changes...
Finally, it was decided not to make many learning rate (LR) changes. In previous tests, every time self-Elo moved, people clamored for an LR change. That proved not to be very useful. Now it has been decided to make only 3 LR changes. The first is at about 11M games (very soon). That will probably make Elo grow faster (now, with a high LR, exploration is maximized but the net also "forgets" easily).
Again, it would not be very wise to stop the experiment before even reaching the first LR change.
So maybe do not look at the self-Elo graph (totally misleading). Look at the Mgotostark ccrl sheet instead. Be patient.
Anyway, nobody guarantees the experiment will be successful (as usual with experiments).
Another parameter to be tested in the future is the number of nodes in training games (800 at present), to see the effect of increasing it (again, a long run). Patience!!
I wish somebody who really knows would write something in the blog to explain these things. People looking at the self-Elo graph naturally get worried.
I hope this helps. "
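To make the role of cpuct in the first message a bit more concrete, here is a minimal sketch of the AlphaZero-style PUCT selection rule, in which cpuct multiplies the exploration bonus added to each move's value estimate. This is only an illustration: the function names and all the numbers (Q values, priors, visit counts) are made up, and it is not claimed to be Leela's actual code.

```python
import math

def puct_score(q, prior, parent_visits, child_visits, cpuct):
    """AlphaZero-style score: value estimate plus an exploration bonus."""
    exploration = cpuct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + exploration

def select_child(children, parent_visits, cpuct):
    """Return the name of the child with the highest PUCT score."""
    return max(
        children,
        key=lambda name: puct_score(
            children[name]["q"], children[name]["prior"],
            parent_visits, children[name]["visits"], cpuct,
        ),
    )

# Hypothetical position: "a" is the well-explored favourite, "b" is a
# rarely visited alternative with a lower value estimate so far.
children = {
    "a": {"q": 0.30, "prior": 0.6, "visits": 80},
    "b": {"q": 0.10, "prior": 0.1, "visits": 5},
}

for cpuct in (1.0, 5.0):
    print(cpuct, select_child(children, parent_visits=100, cpuct=cpuct))
```

With these made-up numbers, a cpuct of 1 keeps searching the favourite "a", while a cpuct of 5 sends the next visit to the less explored "b". That is the sense in which a higher cpuct means more exploration during search.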
Second message:
" Going into Discord really is like going fishing!! (you sometimes catch something)...
Somebody sent this graph... I find it very illustrative.
Test10 was something like the blue line (cpuct about 1) and test20 is like the orange one (I think this is from other work, another publication). In our case it should show Elo strength, going up instead of down, but you get the idea.
Many people ask for things to be proved. However, every experiment can take several weeks, since you need to change only one parameter at a time: one experiment is the control (test10), you change one parameter (test20), and you see how it affects the results (Elo strength). It is slow, sometimes boring, but it is the way science moves forward.
Again, I think it would be nice to have a (not very deep) explanation of how these experiments (the project) proceed. I feel the devs think explanations will produce a lot of technical questions that in turn will generate more questions, explanations, etc. And they work part time... Understandable.
A final remark: people who understand (not me) are all saying test20 is going well and is promising. Since they were able to produce a 3500 Elo engine, they deserve some time to develop things... "
Third message:
"Again fishing in Discord waters (I apologize for using the word "fish" in this forum)...
The LR is at present 0.2; it will probably be lowered to 0.02, 0.002 and 0.0002,
not in a symmetrical way (I mean the changes will not be equidistant). It seems the first LR lowering is going to be delayed a bit because of spikes in self-Elo. The devs are waiting for some technical parameters (well beyond me) to stabilize. I would guess it will happen at around 16-18M games...
I will report back if I get more news... "
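The LR plan described in the third message amounts to a step schedule: the rate stays constant and drops by a factor of 10 at a few chosen points. A minimal sketch of that idea follows. The four LR values are the ones mentioned above, but the game counts at which the drops happen are pure assumptions for illustration (only the first drop has been roughly estimated, at 16-18M games), and the names are hypothetical.

```python
from bisect import bisect_right

# LR values mentioned above: start at 0.2, then three drops.
RATES = [0.2, 0.02, 0.002, 0.0002]

# Hypothetical game counts for the three drops (NOT announced values).
BOUNDARIES = [17_000_000, 30_000_000, 40_000_000]

def learning_rate(games_played, boundaries=BOUNDARIES, rates=RATES):
    """Return the LR in effect after `games_played` training games."""
    # bisect_right counts how many boundaries have already been reached.
    return rates[bisect_right(boundaries, games_played)]

for games in (5_000_000, 17_000_000, 35_000_000, 42_000_000):
    print(games, learning_rate(games))
```

The boundaries do not need to be evenly spaced, which matches the remark that the changes will not be equidistant.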