Hi everyone,
I just joined the group and have been wondering about the philosophy behind Lc0 development. It is crystal clear that the "0" means the engine must learn from zero, but I was not sure about the algorithm itself. Can the algorithm be changed as new algorithms are published by researchers, or must the project stick to AZ's guidelines?
I am not saying that there are strictly better established algorithms than AZ, but I do have a particular paper in mind that came out earlier this year. It presents an interesting analysis of AZ-like algorithms (and in particular of MuZero).
The article is:
Monte-Carlo tree search as regularized policy optimization, Grill et al., 2020
Its main point is that the search objective pursued by AZ-like algorithms is very close to a particular regularized policy optimization objective, and it improves on MuZero by removing some unnecessary approximations (if I remember correctly). Another interesting point is that it eliminates some hyperparameters.
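To make that more concrete, here is a rough Python sketch, written from memory, of what I understand the paper's main result to be: the search implicitly tracks the solution pi_bar of a KL-regularized objective, which has a closed form computable by bisection. The exact scaling of the multiplier lambda_N and the bisection bounds below are my recollection and may well be off, so please treat this as an illustration rather than the paper's exact formulation.

import numpy as np

def regularized_policy(prior, q, n_total, c_puct=1.25):
    """Solve  max_y  q.y - lambda_N * KL(prior || y)  over the simplex.

    Closed form: pi_bar(a) = lambda_N * prior(a) / (alpha - q(a)), with alpha
    a normalising constant found by bisection. Constants are from memory.
    """
    prior = np.asarray(prior, dtype=float)
    q = np.asarray(q, dtype=float)

    # regularization weight shrinks as the simulation budget n_total grows
    lam = c_puct * np.sqrt(n_total) / (n_total + len(prior))

    # alpha lies between these bounds, both of which keep alpha > max(q)
    lo = np.max(q + lam * prior)
    hi = np.max(q) + lam
    for _ in range(64):                   # bisection on the normaliser alpha
        alpha = 0.5 * (lo + hi)
        if np.sum(lam * prior / (alpha - q)) > 1.0:
            lo = alpha                    # weights sum above 1 -> raise alpha
        else:
            hi = alpha
    return lam * prior / (alpha - q)

# toy example: 3 moves, 100 simulations
print(regularized_policy(prior=[0.5, 0.3, 0.2], q=[0.1, 0.4, 0.0], n_total=100))

If I got that right, pi_bar is what the empirical visit-count distribution is implicitly approximating, and the paper suggests using it directly instead.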
My question can then be rephrased: would it be in line with the Lc0 project's philosophy to try this kind of new algorithm as it comes out, or should that be handled in other projects?
With best regards,
Pierre-Louis