When will Leela stop playing "stronger against strong opponents and not as strong against weak opponents"?

Fahim Saharaiar

Apr 17, 2019, 11:05:29 PM
to LCZero
Ideally we want "stronger against strong opponents and just as strong against weak opponents."
Even though SF and Leela are close in strength, SF has significantly higher conversion against weaker opponents
than expected.

Any thoughts would be interesting. 

Stephen Timothy McHenry

Apr 17, 2019, 11:32:38 PM
to LCZero
As long as Leela does not have a losing record against those other opponents, what difference does it make? SF is the only one to worry about; beat SF consistently and who cares about the rest of the inferior engines.

John D

Apr 17, 2019, 11:55:01 PM
to LCZero
As a "pure" zero NN, possibly never. Weak AB engines remain much stronger tactically. I agree with the above poster: it's inherent, and irrelevant if on the whole Leela is nevertheless stronger than the strongest AB engine. I don't think anyone would have conceived that such an entity was possible pre-A0, and its existence will change chess.

Fahim Saharaiar

Apr 18, 2019, 12:18:53 AM
to LCZero
You have to admit that one of the options is better, i.e. better conversion against weaker opponents.

Hoang Hiep Vu

Apr 18, 2019, 12:57:01 AM
to LCZero
Stockfish has contempt, so it can play a sub-optimal move that has a higher chance of a win (or loss) against other engines. Because the other AB engines are weaker, it is difficult for them to find the counterplay. If the opponent is as strong as Leela, this contempt may backfire.
Leela, on the other hand, plays its optimal but perhaps drawish move, so it is less dangerous against weak engines but safe against strong ones.
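
For readers unfamiliar with the term: contempt is an evaluation offset that makes the engine score draws as slightly losing for itself, so the search avoids drawish lines against opponents it expects to beat. A minimal sketch of the mechanism (the constant is illustrative, and Stockfish's real contempt logic is considerably more involved):

```python
# Minimal sketch of a contempt term in an alpha-beta engine's evaluation.
# The constant and the draw handling here are illustrative; Stockfish's
# real implementation is considerably more involved.

CONTEMPT_CP = 24  # centipawns; positive means "avoid draws, play for a win"

def score(static_eval_cp: int, is_forced_draw: bool) -> int:
    """Score a position from the engine's own point of view."""
    if is_forced_draw:
        # A draw counts as slightly bad for us, so the search steers away
        # from drawish lines against opponents we expect to outplay.
        return -CONTEMPT_CP
    return static_eval_cp

print(score(0, is_forced_draw=True))     # -24: the draw looks losing to us
print(score(-10, is_forced_draw=False))  # -10: normal evaluation
```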

Markus Kohler

Apr 18, 2019, 7:32:05 AM
to LCZero
FYI, on the SF mailing list some people asked to remove the contempt because it doesn't make sense any more (the NN engines being too strong).

Lukas S

Apr 18, 2019, 9:53:05 AM
to LCZero
That makes so much sense. Thanks for your thoughts!

garrykli...@gmail.com

Apr 18, 2019, 9:58:44 AM
to LCZero
Please do not troll; that was NOT his question.

dilyan...@gmail.com

Apr 18, 2019, 12:07:48 PM
to LCZero
I think the problem comes from the fact that Lc0 does not try to find the best move as long as its position seems winning. But those weak engines are still very strong in the endgame, with tablebases and a knack for finding amazing fortresses, and that's why Lc0 makes lots of draws. I think some nets play more positionally and others more aggressively, so it's luck whether the correct net gets sent against a specific opponent...

Trevor G

Apr 18, 2019, 1:00:51 PM
to garrykli...@gmail.com, LCZero
There's another way to look at this...

Maybe the current crop of strongish to strong AB engines are actually all much closer in Elo than normally thought, as observed by Leela. However, because they all share the same basic algorithm and are all very similar in evaluation function, the Elo differences are "stretched" such that an only slightly stronger AB engine will have a higher than expected winrate vs a slightly weaker AB engine.

Let’s say we take a very good boxer/fighter, and clone him, but make the clone just a tiny bit more skilled, a tiny bit stronger, a tiny bit faster, a tiny bit bigger, etc. Then objectively, this clone (Stockfish) will only be a little bit better than the original (all other AB engines) in the world at large, but put the clone up against the original, and likely it will be no contest.
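
To put hypothetical numbers on that "stretching": under the standard Elo model a 30-point edge predicts only about a 54% score, but if a shared algorithm and evaluation let the stronger clone dominate head to head, the observed score implies a far larger gap. A small sketch (all ratings and winrates invented):

```python
import math

# Hypothetical illustration of Elo "stretching" between similar AB engines.

def expected_score(r_a, r_b):
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def implied_gap(score):
    """Rating difference implied by an observed head-to-head score."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# A 30-point "objective" edge predicts only a ~54% score...
print(expected_score(3530, 3500))  # ~0.543

# ...but if stylistic similarity lets the clone score 65% head to head,
# that result looks like a gap of over 100 Elo.
print(implied_gap(0.65))  # ~107.5
```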




Fahim Saharaiar

Apr 18, 2019, 6:24:00 PM
to LCZero
That doesn't explain one of the observations. I think we can all assume that Leela is at or beyond SF's strength.
Given that, SF should not have a much greater conversion rate against weaker opponents than Leela; yet it does.



Trevor G

Apr 18, 2019, 8:37:45 PM
to Fahim Saharaiar, LCZero
The point of what I wrote was to explain why Stockfish would have a greater winrate against weaker AB engines, given the relative Elo comparison observed by playing Lc0 against these engines...

Anyway, if the goal is to create the strongest chess engine anywhere, I think the "zero" trajectory is still correct. If the goal is to beat AB engines as much as possible, then something like Antifish would be better, though I think that project was a little flawed from the get-go, as it doesn't distinguish network evaluations based on whether it's Lc0/Antifish's or Stockfish's turn. Naturally, I think Antifish should have been able to learn positions that were bad *only* for Stockfish (but not for Antifish). However, that kind of asymmetry wasn't included in the learning, nor are asymmetrical evaluations like that part of the Lc0 engine or the underlying network design (though this could certainly be added with some code modifications).
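
A toy sketch of the kind of asymmetric evaluation meant here, assuming a hypothetical turn flag fed to the value network (Lc0's actual network has no such input, and every name below is invented for illustration):

```python
# Toy sketch of an asymmetric evaluation: the value of a position depends on
# whether it is our side's or the targeted opponent's turn. Lc0's actual
# value head has no such input; all names here are invented.
from typing import Callable, Sequence

def asymmetric_value(
    features: Sequence[float],
    opponent_to_move: bool,
    value_net: Callable[[list], float],
) -> float:
    # Append a flag telling the net whose turn it is, so a position that is
    # objectively equal can still be scored as promising when it is known to
    # be hard for this particular opponent to handle.
    return value_net(list(features) + [1.0 if opponent_to_move else 0.0])

toy_net = lambda f: sum(0.1 * x for x in f)  # stand-in for a trained network
print(asymmetric_value([0.2, -0.3], opponent_to_move=True, value_net=toy_net))
```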





Deep Blender

Apr 19, 2019, 9:31:18 AM
to LCZero
That's a point which is raised continuously. What is usually not considered is that there is very likely huge potential to improve many aspects of the zero approach. As a simple example, when Lc0 encounters a possible weakness during self-play, it is not guaranteed at all that the weakness gets explored in depth. That's where a human would sit down and systematically investigate the situation, while Lc0 may or may not stumble across similar situations again during self-play. If Lc0 systematically explored the weaknesses it encounters, it would without doubt become way stronger on the tactical front.
I am very confident that the zero approach still has plenty of potential; what's needed is more research.

hawk

Apr 19, 2019, 10:45:51 AM
to LCZero
Going less zero, in the form of adding chess-specific input planes or additional loss functions, is very likely to be beneficial though.

Deep Blender

Apr 19, 2019, 11:40:58 AM
to LCZero
I agree with you. I am quite confident that the right kind of addition could even help significantly.

At the same time, I believe the reason for that is mainly the weaknesses that currently exist in the zero approach. Even though it has been tremendously improved, it is still highly inefficient. When it comes to exploration, we have absolutely no clue what a somewhat general, maybe even learned, strategy might look like.
With better exploration, we would get better game positions for training, which would in turn make those additional input planes obsolete. This is exactly what happened in many areas of machine learning: once a certain point was reached, it became even better to remove the domain knowledge. With the zero approach, we haven't reached that milestone yet.

Jesse Jordache

Apr 21, 2019, 4:34:59 AM
to LCZero

In addition to Stockfish being a better "fish fryer", everyone noticed at the last SuFi the disparity in wins as Black. I think the reasons are similar: lack of contempt, lack of experience in situations where it's just not realistic to play for a win as Black (at least from the outset), and the net mostly filtering out openings where White doesn't have at least a slight advantage.

It may be a weakness in the general approach so far, but it also depends on what you want: do you want the strongest engine possible in terms of beating other strong engines?  I think the current path is probably best.  Do you want a better engine for analysis, one that's more agnostic about positions and opponents?   A different model might be more appropriate.

Either way, the data this project is producing is extraordinary.

Lukas S

Apr 21, 2019, 12:26:52 PM
to LCZero
Great point.

Conceptually when humans learn chess they have a second neural net in place (their life experience) that knows what to explore in order to improve their chess knowledge (=NN) fast.

So ideally we would train a net that learns what to train on in order to improve faster. That net would need to be very general and could be applied to, or might even come from, a domain outside of chess. The big current topic in NN research is trying to reduce the amount of input needed to improve a NN, and this is a viable option here. I'm very excited to see where that goes!

That isn't the zero approach, but it's what humans do and ultimately everyone wants to train their NN faster.

Shuo Xiang

Apr 21, 2019, 12:53:30 PM
to Trevor G, Fahim Saharaiar, LCZero
It should be noted that at CCC 7, Antifish is the one with the most wins against Stockfish. So I think there is something going for Antifish here.

Deep Blender

Apr 21, 2019, 4:38:50 PM
to LCZero
Intuitively, having a secondary neural network which decides what is worth studying is likely still pretty far from being practically viable. Nevertheless, a comparably simple approach would be to record in a database those positions where Lc0's initial judgement is very far off from its judgement after playing out several variations with Monte-Carlo tree search. That gap inevitably means the position is not yet well understood.
Some games (<5%) could pick a recorded position (or a position a few moves before it) and explore it, such that Lc0 learns more about it. When its judgement of a position improves, the position can be removed from the database again.
This could potentially be a relatively easy way to close some knowledge holes, but it certainly introduces more hyperparameters too.
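
A minimal sketch of that bookkeeping, with invented names and thresholds (a real training pipeline would be considerably more involved):

```python
import random

# Hypothetical "hard position" database as sketched above: store positions
# where the raw net value and the MCTS-backed-up value disagree strongly,
# start a small fraction of self-play games from them, and retire a position
# once the net's judgement has caught up. All thresholds are invented.

STORE_GAP = 0.30        # |net value - MCTS value| above this -> record it
RESOLVED_GAP = 0.10     # below this -> the knowledge has stuck, drop it
REPLAY_FRACTION = 0.05  # "<5%" of games start from a recorded position

hard_positions = {}  # FEN string -> last observed value gap

def observe(fen, net_value, mcts_value):
    gap = abs(net_value - mcts_value)
    if gap > STORE_GAP:
        hard_positions[fen] = gap
    elif fen in hard_positions and gap < RESOLVED_GAP:
        del hard_positions[fen]  # judgement improved: retire the position

def pick_start_position(standard_start_fen):
    if hard_positions and random.random() < REPLAY_FRACTION:
        return random.choice(list(hard_positions))
    return standard_start_fen
```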

Why do you consider this sort of idea not to be a zero approach? It is clearly in the spirit of the zero approach. The current zero approach also includes exploration that needs to be tweaked, and Monte-Carlo tree search wasn't invented by the system on its own either.

Alma Udoh

Apr 23, 2019, 3:22:49 PM
to LCZero
What you describe is what KLD Gain thresholding does, which was tested on T50 and T51 and is currently being tested on T52.
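
For reference, KLD Gain thresholding keeps searching a position while the root visit distribution is still changing quickly, so positions the net misjudges automatically receive more search nodes. A rough sketch of the criterion (the threshold and snapshot interval are illustrative, not Lc0's actual defaults):

```python
import math

# Rough sketch of a KLD Gain stopping criterion: keep searching while the
# root visit distribution still changes quickly per node searched.

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence between two visit distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def keep_searching(prev_visits, curr_visits, nodes_between, min_gain=3e-5):
    # Normalize root visit counts into distributions, compare the snapshots,
    # and stop once the distribution barely moves per additional node.
    p = [v / sum(prev_visits) for v in prev_visits]
    q = [v / sum(curr_visits) for v in curr_visits]
    return kl_divergence(q, p) / nodes_between > min_gain

# Visits still shifting toward move 2 -> keep searching (prints True).
print(keep_searching([600, 300, 100], [650, 400, 150], nodes_between=200))
```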

Deep Blender

Apr 23, 2019, 6:10:28 PM
to LCZero
Thanks a lot for pointing that out! I wasn't aware that this idea was already partly implemented.

The difference is that the current approach has no database for the tricky positions. It is well known that Lc0 has tactical weaknesses. I assume that at least some of the tactically correct moves are found, but the learning effect might not be strong enough for that knowledge to be preserved. If those positions are stored in a database and the generation of games based on them is enforced until the knowledge sticks, there is a chance of avoiding some knowledge gaps.
It is an obvious extension of the currently existing implementation, and I am pretty sure that some of the developers had this sort of idea long before I did.