Leela slowly but surely learning the bishop & knight checkmate.

4,748 views

Jesse Jordache

unread,
Apr 29, 2018, 11:34:50 PM4/29/18
to LCZero
https://docs.google.com/spreadsheets/d/1uY7fplZzeXi8H52LK0L6Do2oYgF1v5VW3K_AjNS8l0M/edit#gid=0

Been watching it since about the time it got its first 2% (that's 1 out of 50 tries).

Oddly there's no correlation I can detect between her ability to checkmate herself and her ability to checkmate stockfish - which to me is the one that counts, since stockfish knows how to defend, and by the numbers it's the rarest.

One annoying thing is that her evaluation of a double-bishop ending is a pretty steady 80%, despite the fact that she can do that one every time, against any opponent.

evalon32

unread,
Apr 30, 2018, 12:53:39 AM4/30/18
to LCZero
The evaluation discrepancy is a result of temperature. No guarantees that the explanation will make it less annoying, though :)

Leela can do KBBvK every time with t=0. But NN value is trained on games that are played with t=1.
80% winrate really means that in "training conditions" (--noise --randomize --visits=800), Leela can checkmate itself 60% of the time. I just tried those settings and got 70 out of 100 (close enough).
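The arithmetic checks out if the value head is read as an expected score, score = wins + 0.5*draws (a standard chess scoring convention, assumed here rather than stated above): an 80% value is consistent with winning 60% and drawing the other 40%. A minimal sketch:

```python
# Sketch: reading the NN value as an expected score under the usual
# 1 / 0.5 / 0 scoring of chess results (an assumption, not an Lc0 API).

def expected_score(wins, draws, losses):
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games

# 60% wins + 40% draws gives an expected score of 0.8
assert expected_score(60, 40, 0) == 0.8
```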

kirill57

unread,
Apr 30, 2018, 7:42:15 AM4/30/18
to LCZero
That's a great point. So it means that all NN evaluations are "skewed" in the same way (noise + randomized). Does that mean that at the very end of training the noise should be reduced to 0 to make it more precise? 

evalon32

unread,
Apr 30, 2018, 10:51:19 PM4/30/18
to LCZero


On Monday, April 30, 2018 at 7:42:15 AM UTC-4, kirill57 wrote:
That's a great point. So it means that all NN evaluations are "skewed" in the same way (noise +randomised). Does it mean that at the very end of training noise should be reduced to 0 to make it more precise? 

"At the end of training" -- do you mean after Leela has beaten Stockfish, the network has stalled, and we don't want to grow it anymore? Removing both --randomize and --noise wouldn't work, because then all training games would be the same. Removing just --randomize (that's what sets t=1) is certainly something that can be tried, but the jury is still out on whether it's a good idea. There are concerns that Leela would then forget how to respond to suboptimal moves. On the other hand, both AGZ and LZGo used t=0 after the first 30 moves and it worked out fine.

al...@gate.sinica.edu.tw

unread,
May 1, 2018, 3:11:31 AM5/1/18
to LCZero
Fascinating though it is to see a machine teach itself, is it even desirable in this case? This endgame, and many others, could be played perfectly if syzygy TBs were being used. For Lc0, I think TBs make even more sense than for conventional engines because a single TB access would be much faster than a neural-network call--as opposed to slower for handcrafted eval functions. Just as importantly (?) these classical endgames then wouldn't be accessed during training, and you could use the NN capacity to store something nontrivial which hasn't yet been solved exactly. To end up with the strongest, efficient Leela, I think you want your network lean and mean instead of bloating it with knowledge that will prematurely necessitate a transition to bigger, hence slower, networks.

Jesse Jordache

unread,
May 1, 2018, 11:34:46 AM5/1/18
to LCZero
Bishop and knight checkmates don't come up very often; it's just something that is very difficult for chess engines that Leela seems to be grasping.  It's a benchmark.

Most positions covered by tablebases are kind of... marginal. The most common and difficult endgames are single rook endgames, which leave only three pieces between White and Black before they get too big for TBs. Anything she learns with 7 pieces or fewer is going to matter in her ability to evaluate more complicated positions.

There are like a dozen "have leela train with tablebases!!1" threads on here anyway.  Pitch that noise elsewhere.

Kirill Oseledets

unread,
May 1, 2018, 11:46:57 AM5/1/18
to Jesse Jordache, LCZero
By the "end of the training" I mean for the last "reasonable size" of the network (it would be interesting to know the formula for {size of the network over speed}; definitely, even on high-end hardware, there is an optimal ratio). So for given hardware, to get the optimal strength and optimal evaluation function (for example, for purposes of analysis), you need to fine-tune the parameters of the saturated NN to optimal values by gradually reducing the noise. My hypothesis is that it will produce a slightly higher rating as well.

--
You received this message because you are subscribed to a topic in the Google Groups "LCZero" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/lczero/AIYEyAzZV0A/unsubscribe.
To unsubscribe from this group and all its topics, send an email to lczero+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lczero/75ddbfb8-d76a-4f05-83b1-74bced417941%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Kevin Kirkpatrick

unread,
May 1, 2018, 3:38:59 PM5/1/18
to LCZero

...these classical endgames then wouldn't be accessed during training, and you could use the NN capacity to store something nontrivial which hasn't yet been solved exactly.

NNs are not modular.   In no way, shape, or form is Leela's NN "storing" (or memorizing) these classical-endgame techniques as she reaches 100% success-rate in solving them.  The reality is far more impressive.  In order to find KNB checkmates 100% of the time, Leela's NN must encode a sufficiently-deep understanding of knight-bishop-king piece coordination (and value of opposite-king movement restriction) that it can dynamically work out and implement the multi-phase strategy:

1) Coordinating pieces to force the opposing king to an edge of the board.
2) Understanding that the light-square/dark-square bishop means there's a "wrong" corner and "right" corner for checkmate.
3) Using all pieces to force opposing king to correct corner.
4) Finding the final checkmate.

It seems absurd to think that Leela's training would be best-served by not having to achieve such mastery.

Dorus Peelen

unread,
May 1, 2018, 6:59:35 PM5/1/18
to LCZero
> Oddly there's no correlation I can detect between her ability to checkmate herself and her ability to checkmate stockfish - which to me is the one that counts, since stockfish knows how to defend, and by the numbers it's the rarest.

Funny enough, I see a correlation here that makes sense to me. Early on in training, LC0 slowly moved towards a random win rate. Once it learned how to mate with BNK, it began to learn how to defend. So after the initial gain against itself towards 50%, it's now moving back down towards the Stockfish win rate. It has just reached that point, and my expectation is that both lines will now slowly move up towards 100%.


> On the other hand, both AGZ and LZGo used t=0 after the first 30 moves and it worked out fine.

And there has been an ongoing discussion on LZ to turn on t=1 all game because it has blindspots that might only be fixed with t=1.

On Tuesday, May 1, 2018 at 04:51:19 UTC+2, evalon32 wrote:

Will Page

unread,
May 1, 2018, 7:43:32 PM5/1/18
to LCZero
The reason that there is so much traffic regarding training with TBs is that not doing it is such an obvious Elo error.

Kevin Kirkpatrick

unread,
May 1, 2018, 11:55:14 PM5/1/18
to LCZero


On Sunday, April 29, 2018 at 10:34:50 PM UTC-5, Jesse Jordache wrote:

Trevor

unread,
May 2, 2018, 12:09:54 AM5/2/18
to LCZero
Can you please provide proof that it’s an elo error? While I agree that it is an obviously intuitive thing to do, I disagree that it is an obviously correct thing to do... Do humans learn better chess if they start off learning with table-bases? If not, why do you suppose an artificial neural network would?

Will Page

unread,
May 2, 2018, 9:12:57 AM5/2/18
to LCZero
No, I can't provide such proof; the only way to do that is a computational test with Lc0 training.

On the other hand, how can providing the net with additional, objectively correct data hurt its performance?

On Tue, May 1, 2018 at 10:06 PM, Trevor Graffa <tlgr...@gmail.com> wrote:
Can you please provide proof that it’s an elo error? While I agree that it is an obviously intuitive thing to do, I disagree that it is an obviously correct thing to do... Do humans learn better chess if they start off learning with table-bases? If not, why do you suppose an artificial neural network would?



evalon32

unread,
May 2, 2018, 9:28:51 AM5/2/18
to LCZero


On Tuesday, May 1, 2018 at 3:38:59 PM UTC-4, Kevin Kirkpatrick wrote:

...these classical endgames then wouldn't be accessed during training, and you could use the NN capacity to store something nontrivial which hasn't yet been solved exactly.

NNs are not modular.   In no way, shape, or form is Leela's NN "storing" (or memorizing) these classical-endgame techniques as she reaches 100% success-rate in solving them.  The reality is far more impressive.  In order to find KNB checkmates 100% of the time, Leela's NN must encode a sufficiently-deep understanding of knight-bishop-king piece coordination (and value of opposite-king movement restriction) that it can dynamically work out and implement the multi-phase strategy:

1) Coordinating pieces to force the opposing king to an edge of the board.
2) Understanding that the light-square/dark-square bishop means there's a "wrong" corner and "right" corner for checkmate.
3) Using all pieces to force opposing king to correct corner.
4) Finding the final checkmate.

It seems absurd to think that Leela's training would be best-served by not having to achieve such mastery.

Doesn't seem absurd to me. This is one instance of a more general question: "Will the NN be better at task A if it's also trained for a related task B?"
Will it be better at non-TB endgames if it's also trained on TB endgames?
Will it be better at standard chess if it's also trained on the other 959 starting positions in Chess960?
Maybe, maybe not. I think they are worthwhile questions, but I don't see them settled analytically, only empirically.

Jesse Jordache

unread,
May 2, 2018, 11:20:20 AM5/2/18
to LCZero
Yeah I didn't see that coming.

Cezary Wagner

unread,
May 2, 2018, 11:32:32 AM5/2/18
to LCZero
It's true that humans learn chess better if they start by learning endgames and mates - the best trainers know that :)

Try some very good Russian books - for example, by the excellent trainer "Anatol Łokasto" - or Quality Chess's full course Build Up Your Chess with Artur Yusupov: the first chapter of the first book is mates, and many chapters are endgames.

It really works and brings fast improvement - I trained some young chess players very quickly with mates and endgames, and they became much better than other players.

Meanwhile, most people think it's better to start from full games, which is the slowest method of learning.

Cezary Wagner

unread,
May 2, 2018, 11:38:33 AM5/2/18
to LCZero
Sorry for my broken English and omitting words should be:
That is true that humans learn better chess if they start learning from endgames and mates - the best chess trainers are doing that :)

Trevor G

unread,
May 2, 2018, 12:13:12 PM5/2/18
to Cezary Wagner, LCZero
It could be good to have Leela see more middlegames and endgames than
it does openings. I have said this before.

Using table-bases is not this, though.

If we use a 5-man tablebase, Leela will see drastically different
evaluations for 6-piece positions than 5-piece positions, even if the
additional piece is inconsequential. Leela could learn it's good to
sacrifice valuable pieces simply to get to a 5-piece tablebase
position (even if a quicker mate is possible with 6-pieces on the
board). Would this be good for Leela as a whole? How would that affect
Leela's generalization of positions?

If we use tablebases, Leela will not see many late-game positions in
training that it might otherwise see. Is this good for Leela?

Warren D Smith

unread,
May 4, 2018, 11:11:24 PM5/4/18
to LCZero
 
KNNkp is also a "basic checkmate" and the hardest one.  But it is not listed on
the "basic checkmates" progress page. 

Jesse Jordache

unread,
May 5, 2018, 5:14:17 AM5/5/18
to LCZero
Oh, there are MUCH harder ones: two bishops v. knight is so difficult that it was believed to be a draw until the eighties.  So is KQbbk, and in both (as well as the two knights v pawn ending) the winning technique may take more than 50 moves, depending on the starting position.

Also, the KNNkp ending has never occurred in practice - at least according to Megabase 2014. The only reason it's taught is the novelty of a mate being possible only by giving the weaker side material.

Michel VAN DEN BERGH

unread,
May 5, 2018, 8:57:32 AM5/5/18
to LCZero


On Saturday, May 5, 2018 at 11:14:17 AM UTC+2, Jesse Jordache wrote:
Oh, there are MUCH harder ones: two bishops v. knight is so difficult that it was believed to be a draw until the eighties.  So is KQbbk, and in both (as well as the two knights v pawn ending) the winning technique may take more than 50 moves, depending on the starting position.

Also, the KNNkp ending has never occurred in practice - at least according Megabase 2014. 

This is not true according to Wikipedia, which lists several examples.

Warren D Smith

unread,
May 5, 2018, 12:35:52 PM5/5/18
to Jesse Jordache, lcz...@googlegroups.com
On 5/5/18, Jesse Jordache <youtwis...@gmail.com> wrote:
> Oh, there are MUCH harder ones: two bishops v. knight is so difficult that
> it was believed to be a draw until the eighties. So is KQbbk, and in both
> (as well as the two knights v pawn ending) the winning technique may take
> more than 50 moves, depending on the starting position.

--those are not considered to be "basic checkmates" covered in textbooks.
KNNkp is. I own two endgame textbooks that treat it, one published
in the 1920s.

> Also, the KNNkp ending has never occurred in practice

--you are incorrect: it has occurred many times in practice, certainly at
least 10 times, and in about 20% of the cases the human knew how to mate;
the other times he embarrassed himself by not being able to.

Although the winning technique for KNNkp is conceptually fairly straightforward,
actually executing it is quite difficult for a human. I would think leela would
have great difficulty learning it for these reasons:

1. almost-random play will tend to cause the kp side to win (queens),
exactly the wrong conclusion.
2. with slightly more intelligent play the KNN side will capture the pawn,
yielding a sure draw. Also exactly the wrong conclusion.
3. even if you do know what to do, it is quite difficult to pull it off,
you have to corral the king in the correct corner which is not so easy,
there are a lot of ways it can try to wriggle out.

So leela may have a very hard time getting any clue what to do.
If however, it were to learn from endgame tablebases, it would
probably have no trouble learning how to win this and KBNk.
The ultimate source of all knowledge leela has is game-end positions
(checkmates, stalemates, and the 50-move & repetition rules).
With tablebases, it has a far larger source of knowledge, millions
of times larger. And coming far faster.
Therefore leela would learn faster, as opposed to being a total
idiot who does not even know basic checkmates and basic endgames like KRPkr.

Or, of course, we could just listen to uninformed people who think KNNkp has
never occurred in practice, think tablebases are cheating or evil somehow,
and so on.

--
Warren D. Smith
http://RangeVoting.org <-- add your endorsement (by clicking
"endorse" as 1st step)

Jesse Jordache

unread,
May 5, 2018, 3:43:13 PM5/5/18
to LCZero
Alright, I was wrong about the KNNkp endgame never appearing in practice, but I made an honest attempt to look it up (how did Megabase miss Smyslov - Lilienthal???).  It's still ridiculously marginal, and is nowhere near as common as KQkr.  People who actually play chess would never dispute this.
As to the rest of your screed, I have never, anywhere, said tablebases are evil or cheating. I would find that opinion absurd; in fact, I find the tablebase holy wars really amusing, which must infuriate you and the other mujahideen. It's doubly funny since Leela has been tablebase-compatible as of the last version, so the whole thing is moot.

I think watching Leela make progress with a bishop and knight checkmate is fascinating.  I'd say I'm sorry if it upsets you, but I'm not.

Warren D Smith

unread,
May 5, 2018, 5:39:47 PM5/5/18
to Jesse Jordache, LCZero
On 5/5/18, Jesse Jordache <youtwis...@gmail.com> wrote:
> Alright, I was wrong about the KNNkp endgame never appearing in practice,
> but I made an honest attempt to look it up (how did Megabase miss Smyslov -
>
> Lilienthal???). It's still ridiculously marginal, and is nowhere near as
> common as KQkr. People who actually play chess would never dispute this.

--and, if you were to also include KNNkp, then we would
have a gauge of how slowly leela is learning
this "ridiculously marginal" basic checkmate.
As opposed to: not including it, so that we have no such gauge,
because keeping us in ignorance about that is somehow helpful?

And I would be happy to also see KQkr included.
I think KQkr should be comparatively easy for leela to learn since
the naive guess (Q wins) is correct, so it does not have to unlearn
the naive guess as step 1. Nevertheless KQkr is quite hard for humans
and the vast majority cannot win it against perfect defense.

> As to the rest of your screed, I have never, anywhere said tablebases are
> evil or cheating.

--somebody has.
It simply boggles my mind that a million times larger source
of knowledge is claimed to be evil and leela must not learn from it
because that'd hurt leela.

Stephen Frost

unread,
May 5, 2018, 5:57:00 PM5/5/18
to LCZero
I would be disappointed if, at least in these early days, Leela was trained up on tablebases.  If I understand the intent of the project correctly, the idea is to see how far Leela can go with chess by herself, without interference.  Tablebases are compiled by humans to make engines better.  We know that tablebase support has just been included if you want to use it in match play.  Surely that is enough for now?

Stephen Frost

unread,
May 5, 2018, 5:58:41 PM5/5/18
to LCZero
... and before I forget, Leela doesn't yet know how to win a KQvsKR endgame, as this came up in a recent tournament game I was running and she failed at the 50-move draw rule.  The winning idea, now that I have read up on it myself, will not be easy for her to find.

Warren D Smith

unread,
May 5, 2018, 6:09:07 PM5/5/18
to Stephen Frost, LCZero
On 5/5/18, Stephen Frost <0499f...@gmail.com> wrote:
> I would be disappointed if, at least in these early days, Leela was trained
>
> up on tablebases. If I understand the intent of the project correctly, the
>
> idea is to see how far Leela can go with chess by herself, without
> interference. Tablebases are compiled by humans to make engines better.

--WRONG.
Tablebases are compiled by computers, working solely from the rules of
chess with
no human input.

This response is a typical example of the sort of idiocy I am encountering
where a large segment of the leela community intentionally wants to be idiots
and intentionally wants to learn tremendously more slowly than necessary.
So slowly that they still have not learned to mate with N&B, and so slowly that
they probably will never learn to mate in the KNNkp endgame even after
1000s of years of self play.

> We know that tablebase support has just been included if you want to use it
> in match play. Surely that is enough for now?

--Sigh. Let me say it again. Try to get it thru your thick heads.

The object of leela is to learn, i.e. train a neural net.
It currently learns only from game-end positions such as
checkmates. It has no other source of knowledge.
Checkmate positions are rare. Therefore learning is slow.

If it learned from tablebases, then it would have a million times
larger source of
knowledge, hence might be expected to learn tremendously faster.
Possibly a million times faster, but this is probably an overestimate.
But probably it is safe to say twice as fast.

----

Once upon a time, people trained neural nets to recognize handwritten digits.
If a million times larger source of training digit-image data were available,
nobody would have been stupid enough
to say "I refuse to look at it. That would be cheating."

Only people like you are that stupid, and they abound in the leela
community, for reasons I do not understand. They continuallly
say false things, just like you and Jesse just said.

Warren D Smith

unread,
May 5, 2018, 6:25:28 PM5/5/18
to Stephen Frost, LCZero
And while I am at it, leela also is being stupid by missing the opportunity
to LEARN FROM STOCKFISH.

Here is what I mean.
In about the same amount of time T it takes Leela to
do a single neural-net-learning step, stockfish
can probably do a search to depth 4 ply.
Anyhow, do this.
1. reach a training position.
2. run stockfish for time 0.1*T.
3. if stockfish proves the position is a forced win, loss, or draw (and, in the
win case, finds a move M that accomplishes that), then use that as training data.
4. otherwise use normal training data.
5. do neural net learn step using that data.
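The five steps above might be sketched roughly as follows; `solver_probe`, `self_play_target`, and the budget parameter are hypothetical stand-ins for illustration, not real Lc0 or Stockfish APIs:

```python
# Hedged sketch of the proposed training-target loop. None of these
# functions are real Lc0 or Stockfish calls; they are stubs that show
# the control flow only.

def solver_probe(position, budget):
    """Stand-in for a short proving search (steps 2-3): returns
    ('win' | 'draw' | 'loss', best_move) if proven within the time
    budget, else None."""
    return None  # stub: nothing proven

def self_play_target(position):
    """Stand-in for the normal self-play training target (step 4)."""
    return 0.0

def make_training_target(position, step_time=1.0):
    proof = solver_probe(position, budget=0.1 * step_time)  # step 2
    if proof is not None:                                   # step 3
        outcome, _move = proof
        return {"win": 1.0, "draw": 0.0, "loss": -1.0}[outcome]
    return self_play_target(position)                       # step 4
```

Step 5 (the actual learn step) would then consume whichever target this returns.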


Again, if this is done, then leela will learn from positions <=4 ply
from checkmate, which is a much larger knowledge source than checkmate
positions alone. It will cost 10% slower learning, but the much larger
knowledge source should pay for that 10% many times over. (If not,
change 10% to 5%.)

And now please don't send a reply back about how this would be cheating and
human-taught or some such utterly false claim.

Remember. The object of leela is to learn. The more data it
has to learn from and the faster it comes, the more it will learn.
This is simple.

Stephen Frost

unread,
May 5, 2018, 6:48:13 PM5/5/18
to LCZero
Perhaps you'd better go start your own project then?
You seem to know what you want ...

Jesse Jordache

unread,
May 5, 2018, 7:01:22 PM5/5/18
to LCZero
I'd put up money for that just to see the FAQ.

"I'm only going to say this once.  Try to get through your thick skulls."

Stephen Frost

unread,
May 5, 2018, 7:13:31 PM5/5/18
to LCZero
On Sunday, May 6, 2018 at 9:01:22 AM UTC+10, Jesse Jordache wrote:
I'd put up money for that just to see the FAQ.

"I'm only going to say this once.  Try to get through your thick skulls...

lol 

Warren D Smith

unread,
May 6, 2018, 12:00:12 AM5/6/18
to Stephen Frost, LCZero
yah, you've continued to follow the Pattern.

step 1, emit falsehoods and continue to not do it right
based on those falsehoods. Step 2, laugh about it. You think
doing it wrong is all a huge huge joke and everybody complaining about
it and trying to educate you is so, so funny.

Step 3 (never comes) actually do the job right.

Stephen Frost

unread,
May 6, 2018, 12:27:35 AM5/6/18
to LCZero
It isn't me you need to convince.  Not my project.  I'm just a bystander.  The more you rave on, the more you look like a goose.

Florian Schmitt

unread,
May 6, 2018, 4:17:07 AM5/6/18
to LCZero
It's really easy: the goal of this project (as I understand it) is to train a neural net to play chess with zero external input (it even has "zero" in its name). So the main goal seems not to be the creation of the strongest engine ever (we already have Stockfish for that), or doing that in the fastest manner; it's about learning how such a self-trained engine evolves, what kinds of moves it makes, where its strengths and weaknesses are, and so on. In the process it might get stronger than everything else now available, or it might not.
Actually, it would be quite interesting if you did your own project with your own goals, so that the results could be compared and analyzed. But please don't call people stupid just because they have different goals than you.

Thomas Dybdahl Ahle

unread,
May 6, 2018, 4:30:54 AM5/6/18
to LCZero
It would be really cool to extend your tables with more endgames, such as QvR or ppvR etc.

Jesse Jordache

unread,
May 6, 2018, 5:00:59 PM5/6/18
to LCZero
Yeah, there's some I'd like to see too.  But I watch their spreadsheets like it was my favorite soap opera; they're falling further and further behind updating their benchmarks.

But since we're on the subject, which is the one I wanted to see... oh yeah, Queen vs Bishop's pawn with the defending (pawn side) king out of play.  (also knight vs rook's pawn with the defending (knight side) king far away, but I think Leela would find that simple).

Trevor

unread,
May 6, 2018, 5:52:44 PM5/6/18
to LCZero
Warren - have you called the Deepmind team a bunch of idiots too?

From my perspective.. The “zero” tabula rasa thing is not just some philosophical ideal or aesthetic, nor is it a simple curiosity. It’s a scientific result in optimization.

A few remarks...

It’s long been known that “perfect” data - noise free, 100% correctly labeled, etc - is often *not* the best training data for neural networks. Neural networks do well with noise and imperfections as it forces regularization that improves their filters, and enhances their capacity to learn and generalize better. What’s important is that the data is mostly right, such that statistics done on the data result in the correct minima - not that each example is perfectly labeled. I think this stuff is even more pertinent for reinforcement learning due to its dynamical nature (ie moving targets, exploration vs exploitation concerns, etc).

If you followed the work done on MCTS Go playing programs (back before anyone showed effectiveness with neural networks), you’d know that it was found a long time ago that a good policy for MCTS rollouts is not necessarily a good policy for playing the game. On the contrary, when playout policies were optimized to play Go well, performance of the MCTS engine suffered.

Only a few years ago most experts seemed to agree that current neural network architectures were absolutely incapable of playing games like Go and Chess as well as traditional engines do. Many many people tried - both supervised and tabula rasa. I did too with some games (over 10 years ago, I wrote a small neural network library, mostly temporal-difference learning approaches). One thing I found with my experimentation is that when learning policies based on neural networks and shallow search, it is very very easy to more or less get stuck in local-minima: portions of the game-playing-space that are difficult to get out of because of the trappy nature of games.

While training Leela on tablebases and perfect minimax search up to some depth *might* help the network find global optima, there is very good reason to believe it might hurt instead. I believe Leela’s team of developers is wise in staying the course trying to reproduce Alpha Zero’s results.

Warren D Smith

unread,
May 6, 2018, 6:41:19 PM5/6/18
to Trevor, LCZero
On 5/6/18, Trevor <trevor...@gmail.com> wrote:
> Warren - have you called the Deepmind team a bunch of idiots too?

--no, but I did ask them what A0 had learned/not about TB endgames.
I suggested that would be another way to gauge
how well it had learned chess, and readers wanted to know it.
(No answer. )

> From my perspective.. The “zero” tabula rasa thing is not just some
> philosophical ideal or aesthetic, nor is it a simple curiosity. It’s a
> scientific result in optimization.
>
> A few remarks...
>
> It’s long been known that “perfect” data - noise free, 100% correctly
> labeled, etc - is often *not* the best training data for neural networks.

--Sigh.
Right now, the sole way LC0 learns, is from perfect data
(e.g. checkmate positions).
There is no other source of knowledge besides the rules of chess, which
are perfect data.

Would you suggest adding noise to create fake-rules-of-chess
and try to have LC0 learn from those?

This is all utter bull. You just spew garbage, following the Pattern
of all my respondents so far, in a comical effort to avoid doing the
right thing.

> Neural networks do well with noise and imperfections as it forces
> regularization that improves their filters, and enhances their capacity to
> learn and generalize better. What’s important is that the data is mostly
> right, such that statistics done on the data result in the correct minima -
> not that each example is perfectly labeled. I think this stuff is even more
> pertinent for reinforcement learning due to its dynamical nature (ie moving
> targets, exploration vs exploitation concerns, etc).

--you simply spew mythology. The fact is, the TBs contain far far
more data than
the number of weights in a neural net. So the NN is incapable of
overfitting it. There simply is no concern about this. At all. It is
total bull.

And all the stuff about chess endgames the NN cannot learn, IS noise
and imperfections as
far as it is concerned.

> If you followed the work done on MCTS Go playing programs (back before
> anyone showed effectiveness with neural networks), you’d know that it was
> found a long time ago that a good policy for MCTS rollouts is not
> necessarily a good policy for playing the game. On the contrary, when
> playout policies were optimized to play Go well, performance of the MCTS
> engine suffered.

--irrelevant. I am not suggesting trying to input or in any way force
human ideas of "playing go well."
(Maybe you are, but I do not care.)


> Only a few years ago most experts seemed to agree that current neural
> network architectures were absolutely incapable of playing games like Go and
> Chess as well as traditional engines do. Many many people tried - both
> supervised and tabula rasa. I did too with some games (over 10 years ago, I
> wrote a small neural network library, mostly temporal-difference learning
> approaches).

--I did so too. But unlike you and the "many",
I actually succeeded. Namely, I trained an othello
player from nothing up to human national-champ strength in 1 week on a machine
at about 60 MHz (this about 25 years back). Not neural nets,
it was a table-based eval with several million entries in table to learn
(initially random). Sole source of knowledge was rules of othello and game-end
results.

The main reason those guys all failed, was simple: not enough data.
That was the main reason almost all NN research did not get very far
in the early days.

But in my othello case my data rates were tremendous.

> One thing I found with my experimentation is that when learning
> policies based on neural networks and shallow search, it is very very easy
> to more or less get stuck in local-minima: portions of the
> game-playing-space that are difficult to get out of because of the trappy
> nature of games.

--at last, you say something sensible.
Unfortunately, you fail to recognize this totally supports me.

In the KNNkp endgame, as I already explained, LC0 is likely to get completely
the wrong idea thru self play without tablebase aid. That is
just one example. That is why tablebases will help tremendously.
Probably enabling it to learn this endgame even though without it it
would take 1000s of years or more. Thank you for making my point.

> While training Leela on tablebases and perfect minimax search up to some
> depth *might* help the network find global optima, there is very good reason
> to believe it might hurt instead.

--no, there is no reason. You just spew myths.

> I believe Leela’s team of developers is
> wise in staying the course trying to reproduce Alpha Zero’s results.

--no, this is a complete and obvious mistake.

It is very simple. More data, coming faster ==> faster learning.

Nobody in the NN community has ever disputed this. Vast
experience supports it.
Except for the Leela forum trolls - they dispute it, right here, right now.

Hanamuke

unread,
May 7, 2018, 5:39:27 AM5/7/18
to LCZero
I was thinking, if LCZ learns from random moves she makes during training, it could be interesting to occasionally play the TB move at random in endgames.

evalon32

unread,
May 12, 2018, 10:23:04 AM5/12/18
to LCZero
By popular request, the soap opera now includes KQvKR :) But the graph is pretty boring so far. I only benchmarked every 10th net.

Jesse Jordache

unread,
May 12, 2018, 12:41:38 PM5/12/18
to LCZero
Neat.  Thanks.

Cezary Wagner

unread,
May 12, 2018, 1:17:31 PM5/12/18
to LCZero
I think that learning endings and mates from random moves is a waste of time.

Try to learn a mate on the 8th rank, where Leela is very weak - how many legal moves are there in such a position? Around 30?

So the chance of playing the right move is 1/30 - so it is hard to learn.

And how often might such a position recur in a later game? Maybe 1/100.

So the chance to learn from it is 1/3000 or less - that is why it is a waste of time.
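
The estimate above, written out as arithmetic (the 1/30 and 1/100 figures are the poster's own rough assumptions, not measured values):

```python
# Back-of-envelope version of the argument above.  The 1/30 (chance a
# random policy picks the one good move) and 1/100 (chance such a
# position occurs in a given game) are the post's assumed figures.

p_right_move = 1 / 30       # random policy hits the right move
p_position_seen = 1 / 100   # the position appears in a given game

p_learn_per_game = p_right_move * p_position_seen

# Expected number of games before the first useful training signal:
expected_games = 1 / p_learn_per_game
assert round(expected_games) == 3000   # "1/3000 or less"
```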

On Monday, April 30, 2018 at 05:34:50 UTC+2, Jesse Jordache wrote:
https://docs.google.com/spreadsheets/d/1uY7fplZzeXi8H52LK0L6Do2oYgF1v5VW3K_AjNS8l0M/edit#gid=0

Been watching it since about the time it got its first (2% - that's 1 out of 50 tries).

Oddly there's no correlation I can detect between her ability to checkmate herself and her ability to checkmate stockfish - which to me is the one that counts, since stockfish knows how to defend, and by the numbers it's the rarest.

One annoying thing is that her evaluation of a double-bishop ending is a pretty steady 80%, despite the fact that she can do that one every time, against any opponent.

Kevin Kirkpatrick

unread,
May 12, 2018, 11:50:30 PM5/12/18
to LCZero
Cezary, 

If, hypothetically, the Deepmind team were to confirm that Alpha Chess Zero was able to solve mate-in-8 puzzles, would your assessment change?

Warren D Smith

unread,
May 13, 2018, 12:22:58 PM5/13/18
to LCZero


On Saturday, May 12, 2018 at 11:50:30 PM UTC-4, Kevin Kirkpatrick wrote:
Cezary, 

If, hypothetically, the Deepmind team were to confirm that Alpha Chess Zero was able to solve mate-in-8 puzzles, would your assessment change?

On Saturday, May 12, 2018 at 12:17:31 PM UTC-5, Cezary Wagner wrote:
I think that learning endings and mates from random moves is a waste of time.

--sigh.  It is incredible how you guys cannot see the simplest logic related to "Leela should learn from tablebases."

If alphazero can solve mate-in-8 often, SO WHAT??
The only way it does so is by doing a lot of searching, taking a lot of
computer cycles: let us say 10^11 cycles, taking, say, 5 minutes.
Meanwhile, in perhaps a millisecond,
you learn from a tablebase that a position is a mate-in-27, or whatever.

So the neural net, by learning from a tablebase, will extract ultimate truth from the chess world maybe 300,000
times faster.   All learning is driven by ultimate truth, i.e. game-end positions.  So you learn faster
with more of it.
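
The arithmetic behind the "300,000 times faster" claim, using the illustrative figures from the post (5 minutes of search versus roughly a millisecond per tablebase probe; both numbers are the post's own rough estimates):

```python
# The speedup claim above, spelled out.  The 5-minute search time and the
# 1 ms tablebase probe are illustrative figures from the post, not
# measurements of any real engine or tablebase implementation.

search_time_ms = 5 * 60 * 1000   # "say 5 minutes" of search, in milliseconds
tb_probe_time_ms = 1             # "perhaps a millisecond" per TB lookup

speedup = search_time_ms // tb_probe_time_ms
assert speedup == 300_000        # the "maybe 300,000 times faster" figure
```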

Furthermore, the way alphazero was able to solve mate-in-8 problems was that, in the past, it had learned good heuristics
enabling it to focus its search on promising lines. Mate-in-8 problems that happen not to conform to those heuristics may
be unsolvable; they might be solvable with better heuristics that alphazero never learned. So, roughly speaking,
it is only able to learn what it already knows, and it does so very, very slowly.

Why is this so freaking hard for you all to comprehend??

 

Mark Bennet

unread,
May 13, 2018, 1:13:29 PM5/13/18
to LCZero
Human beings learn endgames in part to learn something about the power of the pieces. If you stop playing out endgames during training you get perfect evaluations from Tablebases, but are not learning from playing out the positions. Since Leela is most interested in learning common positions and patterns, and most specific endgames are relatively rare, using TBs during training may deprive Leela of useful learning which would be applicable to more common non-endgame positions. The way to test this would be to run the two methods in parallel.

You should note that people who have experimented with neural nets have found that feeding them human knowledge to try to improve the rate at which they learn has often proved counterproductive for subtle reasons - which render the best known strategies for learning rather counterintuitive from a human point of view. Here the net has a finite size - using some of that capacity to get an accurate evaluation of relatively rare endgame positions may compromise the capacity to learn to evaluate the kinds of positions which arise rather more often in actual play. In fact the accuracy of endgame play has been increasing through learning. Your proposed method might learn endgames faster, but chess slower.

Trevor G

unread,
May 13, 2018, 2:21:21 PM5/13/18
to Warren D Smith, LCZero
 Cezary, Warren - if using TBs during training, how do you deal with the evaluation discontinuities in Leela’s training data and the complex and erroneous bias that would result? 
https://chessprogramming.wikispaces.com/Evaluation%20Discontinuity

Suppose Leela is trained with a 5-man TB, do you think it’s good that the network would be trained to make completely unnecessary sacrifices in 6-piece positions simply to achieve a TB win? Or sacrifices/bad trades in 7-piece positions, etc.

Do you think it’s good that the network would be trained to avoid taking pieces or making good material exchanges simply to avoid TB positions, when the position is already a theoretical loss but the capture/trade would be better than not?




--
You received this message because you are subscribed to a topic in the Google Groups "LCZero" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/lczero/AIYEyAzZV0A/unsubscribe.
To unsubscribe from this group and all its topics, send an email to lczero+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lczero/2ff981c3-6a05-4c0e-8caf-6485d89c68d5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Trevor G

unread,
May 13, 2018, 4:36:54 PM5/13/18
to Warren D Smith, LCZero
No, that’s ridiculous.
Consider a position where we can do one of the following:
A: force a sacrifice of our Queen such that we reach a tablebases position after our opponent’s turn, giving us a difficult-to-see mate in 26.
B. Use our queen effectively to instead force a mate in 3.

What would grandmasters choose?

The way the MCTS algorithm is set up, I see no good way to simultaneously use tablebases and train our network to do the right thing (B) and not (A).

It may not seem very relevant - we get a mate either way. But that completely ignores the fact that learning a deep neural network is all about being strong at generalizing.

Also, using tablebases as the ground truth is very different from using only real final game positions. The particular example above is just not a problem at all in the latter case.

If you or anybody else here can provide a solution to the above issue that is both convincing in its ability to actually train the correct thing (the mate in 3 that any good chess player would choose, and not the mate in 26), and is not overly complex to implement in the MCTS/UCT algorithm (and solves the corresponding issues from the losing player’s side as well) — then maybe I’ll get behind the idea of using TBs in training.
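
One way to make the concern above concrete: with win/draw/loss (WDL) tablebase values and no distance-to-mate information, both paths back up the same value, so the search has nothing to prefer. The toy functions below are my own illustration, not Leela's code, and the distance-discounting remedy at the end is a standard idea that is not proposed anywhere in this thread:

```python
# Toy illustration (not Leela's code): with WDL-only tablebase values,
# the mate in 3 and the queen-sac mate in 26 back up identical values,
# so the search has no reason to prefer the short, natural mate.

def wdl_value(outcome, plies_to_mate):
    """WDL tablebases score a win as +1 regardless of distance to mate."""
    return 1.0 if outcome == "win" else (0.0 if outcome == "draw" else -1.0)

mate_in_3  = wdl_value("win", plies_to_mate=6)   # option B in the post
mate_in_26 = wdl_value("win", plies_to_mate=52)  # option A in the post
assert mate_in_3 == mate_in_26   # indistinguishable to the search

# A standard remedy (an assumption here, not something from the thread):
# discount the win value by distance to mate, so shorter mates score higher.
def dtm_value(outcome, plies_to_mate, discount=0.99):
    return wdl_value(outcome, plies_to_mate) * discount ** plies_to_mate

assert dtm_value("win", 6) > dtm_value("win", 52)   # mate in 3 now preferred
```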




On Sun, May 13, 2018 at 4:03 PM Warren D Smith <warre...@gmail.com> wrote:
On 5/13/18, Trevor G <trevor...@gmail.com> wrote:
>  Cezary, Warren - if using TBs during training, how do you deal with the
> evaluation discontinuities in Leela’s training data and the complex and
> erroneous bias that would result?
> https://chessprogramming.wikispaces.com/Evaluation%20Discontinuity

--sigh.
Right now, all knowledge Leela ever learns comes from game-end
positions, such as checkmate & stalemate. Are you whining about the
horrible "discontinuity" between checkmate/stalemate positions and
(say) their immediate predecessors? No.
You are not. You are happily accepting the Source Of Truth that comes from
those game-end positions, and wishing
you had more of it.

But when it is a tablebase position, also serving as a far
larger source of truth,
you suddenly whine and quiver in fear of "discontinuities." Why do
you like A but not B?
They are the same thing. When A is a source of truth and
discontinuities, you like the truth and ignore the discontinuities.
When B is a source of truth and discontinuities, you ignore the truth
and fear the discontinuities.

You are logically contradicting yourself.

For some unknown reason, you cannot get it through your skull that the
two are exactly analogous.


> Suppose Leela is trained with a 5-man TB, do you think it’s good that the
> network would be trained to make completely unnecessary sacrifices in
> 6-piece positions simply to achieve a TB win?

--YES.  It converts an unsure win into a sure win.

(It is just the same as, I sac my queen to mate you.  Do you then object, saying
"you could also have won without the sac"?)


> Or sacrifices/bad trades in
> 7-piece positions, etc.
>
> Do you think it’s good that the network would be trained to avoid taking
> pieces or making good material exchanges simply to avoid TB positions, when
> the position is already a theoretical loss but the capture/trade would be
> better than not?

--YES.

And grandmasters think just the same as I do.  For example, in the only
game Tang won versus Leela, he indeed sacrificed material he could
have held on to,
because he knew he was getting into a simpler sure-win endgame
position. His extra material would merely have complicated the
situation.  Tang did not care about his
extra material, so he just gave it away.  He wanted the win, not the illusion.

You apparently care about illusions very intensely.
And by "you" I mean not just you but rather all the people making stupid
self-contradictory objections to LC0 learning from tablebases, over
and over again.

Trevor G

unread,
May 13, 2018, 4:41:44 PM5/13/18
to Warren D Smith, LCZero
(Not that my opinion even matters about this. I’m just a bystander. But I will think any training mechanism that incorporates TBs without addressing the concerns I mention is a horrible idea, and the Leela Chess project as a distributed effort will lose my support.)

Thanar

unread,
May 13, 2018, 5:16:01 PM5/13/18
to LCZero
Don't worry. The devs have not even considered using egtb for games used in training, so it will not happen.

Scott Turner

unread,
May 14, 2018, 9:35:08 AM5/14/18
to LCZero
On Sunday, May 13, 2018 at 4:36:54 PM UTC-4, Trevor wrote:
The way the MCTS algorithm is set up, I see no good way to simultaneously use tablebases and train our network to do the right thing.

There is a way, although it's a bit of a kludge.  During training games, if Leela detects a TB position, it augments its evaluation using the TB, e.g., it marks the TB moves as having a much higher value.  You can tweak this added valuation so it isn't overwhelming and Leela still tries some alternate moves at times.  But it would focus Leela on the TB solutions in those situations.
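
A minimal sketch of that kludge, assuming a hypothetical value-blending hook in the training search; the blend weight and the function interface are invented for illustration, not part of Leela's actual code:

```python
# Minimal sketch of the kludge described above: during training search,
# pull a node's value toward the tablebase result without letting the TB
# completely dominate, so alternate moves still get explored sometimes.
# TB_BLEND and this interface are assumptions, not Leela's real code.

TB_BLEND = 0.5   # tunable: how strongly the TB result pulls the evaluation

def augmented_value(nn_value, tb_result):
    """Blend the network's value estimate with a TB result, if available.

    nn_value:  network evaluation in [-1, 1] from the side to move's view
    tb_result: +1 (TB win), 0 (TB draw), -1 (TB loss), or None if no TB hit
    """
    if tb_result is None:
        return nn_value
    # Weighted blend: keep some of the NN's estimate so the added
    # valuation "isn't overwhelming".
    return (1 - TB_BLEND) * nn_value + TB_BLEND * tb_result
```

Setting `TB_BLEND` near 1 would make TB hits behave almost like game-end results; setting it near 0 recovers the current behavior.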

At the moment I'd say that endgames are the least of Leela's problems :-)

-- Scott

Ian Osgood

unread,
May 21, 2018, 4:55:56 PM5/21/18
to LCZero
I'd just like to note the milestone: around network 320, Leela is getting 100% win rates on the bishop and knight endgame!

Trevor G

unread,
May 21, 2018, 5:17:21 PM5/21/18
to ia...@quirkster.com, LCZero
Very nice. It's interesting to see that recently Leela has had more trouble defeating itself than either SF or TB.


Kevin Kirkpatrick

unread,
May 21, 2018, 7:06:12 PM5/21/18
to LCZero
Wonder if it's a situation like
Path1 = easy-mate-in-8, 
Path2 = tricky mate-in-6.  

TB & SF will choose Path1 every time (surviving 8 moves is better than surviving 6).  But Leela may take Path2 and struggle mightily to find tricky checkmate.

evalon32

unread,
May 21, 2018, 9:29:56 PM5/21/18
to LCZero
I assume it's something like that, with the (possibly obvious) caveat that "tricky" is subjective -- it must have been tricky for Leela's then-current NN. It can also be that Path1 and Path2 are both mate-in-8, but Leela knows its own weaknesses and is more likely to take Path2. For what it's worth, here's the last game that Leela failed to win (ID321 vs Self): https://lichess.org/n0ELbP2D

Shawn S.

unread,
May 22, 2018, 7:41:08 AM5/22/18
to LCZero
Why should what a human GM would do be what a chess engine should do?  That is a clear human bias.  Objectively, winning is winning, and wanting the win to be "pretty" or "clean" by our own standards is a human bias.  In my opinion, the only considerations should be: does it win positions that are won; does it hold a draw in drawn positions; and in drawn or losing positions, does it do a good job of creating opportunities for the opponent to make mistakes?  All of this is measured by competition results (W/L/D) and Elo estimates.  If TB training does not help, it will be evident in the results, and that will settle it.  I think the best possibility, if this experiment is one day done, is to combine TB training with self-play training.  Self-play training would filter out the non-generalizable patterns the network learned from TB training, but TB training may reveal deep and generalizable patterns that self-play will not find.  So: a combined effort.  If the result can beat Leela Zero or score better against other competition, then it does not matter to me how "ugly" it seems to play (I happen to think it would play beautifully).

nezhi

unread,
May 29, 2018, 2:01:21 AM5/29/18
to LCZero
what else endgames/positions can we test to see leela's progress?

Sean Francis

unread,
May 29, 2018, 10:13:26 AM5/29/18
to LCZero
Jesse, hey, I’m curious about something. If the devs decreased the weights for about a week, I know that the rating would drop, but if they afterwards increased the weights once again, would the rating: 1. return to its initial level, 2. become stronger than its initial level, or 3. suffer some other negative effect?

Jesse Jordache

unread,
May 30, 2018, 2:21:53 AM5/30/18
to LCZero
As far as I know, decreasing the weights would just make for a smaller gradient.  Assuming you mean the training weights.

If not I hope you're not coming to me for a technical question about Leela, because I'm really not that guy.  I'm partly interested in this project to learn about Neural Nets, but it's still really abstract.

Stephen Frost

unread,
May 30, 2018, 3:34:03 AM5/30/18
to LCZero
Hey Jesse,

Another endgame I saw just yesterday, in a match I was observing in Arena (between NN350 gauntlet against NNs 300, 251, 226, 200 and 100), was KQ vs kr.  Time control was 1m+1s/move, so fast.

NN350 had the KQ and was able to checkmate NN300 in about 30 moves, so plenty of time before the 50-move limit.  Just a month ago, maybe around NN250 (?), Leela was unable to make much progress with this endgame.  So it seems she may have learned it now.  Hard to say without more evidence.

Cheers,
Steve

evalon32

unread,
May 30, 2018, 9:26:58 AM5/30/18
to LCZero
Steve, check out the KQvKR tab in the spreadsheet linked at the top of the thread. Leela can usually win this endgame against itself (ever since it's known how to win KQvK), but still hasn't learned to win against perfect defense.

Jesse Jordache

unread,
May 30, 2018, 9:40:43 AM5/30/18
to LCZero
That's actually pretty exciting.



Stephen Frost

unread,
Jun 5, 2018, 10:56:49 AM6/5/18
to LCZero
From the latest round of match games, here is NN379 executing a KBNvK mate.

Jesse Jordache

unread,
Jun 5, 2018, 9:40:30 PM6/5/18
to LCZero
That made my day twice over - I thought Leela had abandoned that opening/move order in favor of that bizarro exchange variation where she takes off both bishops.  I haven't seen her play a (semi) proper QGD exchange variation in a long time.

Stephen Frost

unread,
Jun 5, 2018, 11:31:23 PM6/5/18
to LCZero
I've gotten the impression last few days that there are more "normal" openings appearing again.  e.g. I saw a Sicilian with 3.d4 a couple of times.

Jesse Jordache

unread,
Jun 6, 2018, 10:54:53 AM6/6/18
to LCZero
Yeah - an English Attack vs. a Najdorf, or a Scheveningen with a Najdorf move order, I forget.  Except white is castling kingside.  But she had played so many Alapins that I was afraid she had dug herself into a rut from which she'd never leave.