As has been said repeatedly on this list, the 'aim was NOT to create the best chess engine possible' but to achieve similar success against AB search engines using a self-learning NN and MCTS, as Alpha0 did. If the best chess engine happens to result, all the better.
That said, TB positions are determined outcomes. It makes no sense to have lc0 train to discover outcomes that are already determined.
--
You received this message because you are subscribed to the Google Groups "LCZero" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lczero+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lczero/bbc037ae-6b82-4695-85de-a00f24a35613%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
In short, the goal of the project is first to replicate AlphaZero, and then to make it as strong as possible.

The main reason for the "replicate AlphaZero" part is actually not to test reproducibility in the scientific sense; that's more like a nice side effect. The main reason is that there were many attempts to create an NN engine before, and they failed, while AlphaZero succeeded.

There are many possible improvements which really look beneficial if you think about them, but the problem with implementing them is that if we get stuck, we can no longer just compare what we are doing differently and fix that. We do get stuck a lot, and in fact there are lots of surprising subtleties which matter (a recent example is the sampling rate: we used ~0.95, I guess, and it was too much).

So, while we can follow the guide, we do.

After we reach the AlphaZero state (or if we get stuck before that and can't find any explanation), the goal will surely be to "create as strong a chess engine as possible".

I expect that there will be some zero vs non-zero debate, which may result in two different forks, but I expect non-zero to get much more attention in the end because it will be stronger.
I, for one, don’t believe including TBs violates the “zero” principle, since the outcome of a TB position is predetermined. That said, lc0’s training is influenced by success and failure. Reaching a favourable TB position should be rewarded the same as winning, by the principle of transitivity.
Lc0 should be able to play for a win, or play for a favourable TB position, without distinction, since both amount to the same thing.
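
To make the idea concrete, here is a minimal sketch of how a self-play game could be scored when a tablebase hit counts the same as the final result. It is not how lc0 actually does it: the names `training_targets` and `probe_tb`, the `(key, white_to_move)` position format, and the +1 / 0 / -1 convention are all assumptions made up for illustration.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical position record: (some position key, True if White is to move).
Position = Tuple[str, bool]


def training_targets(
    positions: List[Position],
    final_result_white: int,
    probe_tb: Callable[[Position], Optional[int]],
) -> List[int]:
    """Return one value target per position, from the side to move's view.

    probe_tb is an assumed callback: it returns +1 / 0 / -1 (win / draw / loss
    for the side to move) when the position is in the tablebase, else None.
    On the first tablebase hit the game is treated as decided: that result
    replaces final_result_white and later positions are dropped.
    """
    result_white = final_result_white
    cut = len(positions)
    for ply, pos in enumerate(positions):
        wdl = probe_tb(pos)
        if wdl is not None:
            # Convert the side-to-move result to White's point of view.
            result_white = wdl if pos[1] else -wdl
            cut = ply + 1
            break
    # Flip the sign for positions where Black is to move.
    return [
        result_white if white_to_move else -result_white
        for _, white_to_move in positions[:cut]
    ]


# Toy usage with a fake tablebase that only knows position "c".
game = [("a", True), ("b", False), ("c", True), ("d", False)]
fake_tb = {"c": -1}  # side to move (White) is lost in "c"
targets = training_targets(game, 1, lambda p: fake_tb.get(p[0]))
```

With this scoring, every position leading up to the tablebase hit is labelled with the tablebase outcome, so reaching a won TB position and actually delivering mate produce the same training signal.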
Tabula Rasa: Self-learning is started with a “clean slate” (randomised weights and no pre-labelled data for training)
First Principle: Only the minimal information required (rules, constraints etc.) is provided
Zero Algorithm: A “general” algorithm that, constrained by the above two points, learns a game (or some other task)

The term “Zero Principle” is neither defined nor discussed in the A0 paper, so it is quite natural that different people have different views on what it means. But I have seen two “distinctions” emerge from the forum discussions, which I share below (basically the same as above):

How learning is performed: Unsupervised from scratch (tabula rasa), as in A0 and LC0, or supervised, as in DeusX
What “knowledge” is provided to the NN: Only basic rules (first principle), +EGTB, +chess domain specific tree search tuning, +additional input planes etc. The extreme would be tainting the NN with human heuristics of piece values, etc. as in “classic” engines

My personal interpretation of the LC0 “zero principle” discussions is that the tabula rasa part is not questioned (no supervised learning wanted), but the first principle constraint is up for debate as long as it does not impose any human bias / tainting. And I agree with that (if my interpretation is correct, of course). As long as Leela is not influenced by human preconceptions, let her learn as much as possible, as easily as possible. And later we can all enjoy her victories (and have KC analyse them for us :-)
On 10 Sep 2018 at 05:48, A Thule <thul...@gmail.com> wrote:
... and there are many people who reference the zero principle who clearly don’t understand it at all. That it means different things to different people means only that not everyone who bandies the principle about is correct.