TCEC Season 18

Warren D Smith

Jun 14, 2020, 4:35:47 PM
to LCZero
Approaching the end of the Premier division round robin tourney.
LcZero and StockFish both undefeated (unlike everything else).
And leading.

SF has 9 wins and Lc0 has 7 after 41 games played by every contestant.

Meanwhile Ethereal & Fire8 are the opposite; they have zero wins.
It definitely looks like the NN programs are on the rise with the top 4 consisting of 3 NN programs (others
are AllieStein & Stoofvlees) plus StockFish (leading), while the bottom 4 all are old-style programs.



Pawel SalsaDura

Jun 14, 2020, 8:37:40 PM
to LCZero
I wonder why Leela has not been updated after so many new nets. That is rather weird: there are a few hundred new nets now, so why do they still stick to the old one?

glbchess64

Jun 15, 2020, 4:20:09 AM
to LCZero
At the beginning of divP the best tested net was still SV-3010. But during divP @Jio released some better nets based on Sergio Vieri nets and Sergio also released two new promising nets. So there will be a new net in SuFi, likely a net by @Jio.

By the way, at the beginning of divP, Stein 15, which is better than Stein 14.3, had not yet been released.

ronnie millsap

Jun 15, 2020, 12:16:23 PM
to LCZero
I just love how Andrew Grant (Ethereal) had nothing to say when Lc0 was beating his engine (and was far weaker than it is now) other than 'oh so boring snore how original'. Now it is just getting stomped, nothing else ;) Ah, karma!


Michael Elkin

Jun 15, 2020, 4:15:24 PM
to LCZero
Will Leela be updated for the final? Leela missed at least 2 wins at the end that would have been found with the latest nets with MLH.

Will the 4082 net be submitted or maybe something else entirely?

Lee Sailer

Jun 15, 2020, 7:29:54 PM
to LCZero
You seem to enjoy the suffering of others...

Pawel SalsaDura

Jun 15, 2020, 9:01:06 PM
to LCZero
That is ridiculous. People are contributing their GPUs, time and money for Leela development, but instead of picking the latest original Leela net to play at TCEC, the organizers pick Leela's clone instead!!! Some ridiculous Sergio Vieri clone! Wtf???? Who the f....ck cares about Sergio nets? Nobody!! Because everybody cares about the latest Leela net! And you don't have to tell me that this Sergio nothing is based on Leela... because nobody cares!!

glbchess64

Jun 16, 2020, 12:46:44 AM
to LCZero
The best nets are trained by supervised learning (SL), but they can't be trained without the T60 or T40 games generated by reinforcement learning (RL). We missed the TCEC S16 SuFi because we did not send our best known net (terminator at that time). And note that the S16 net was a @jhorthos net, T40B, trained by SL from T40 games and 20 Elo stronger than T40.

We learn from our errors and will not do the same this time. So the net that will be sent to the SuFi will be a 384x30 net trained with SL by Sergio Vieri and likely @Jio. These nets are simply the best at TCEC time control and hardware. You can have a look at @Nav's stream (https://www.twitch.tv/navratil25; it is currently working on finding the best big net and parameters): T60 is likely not good enough to win the SuFi, but our big nets can.

Warren D Smith

Jun 16, 2020, 12:48:34 AM
to Pawel SalsaDura, LCZero
On 6/15/20, Pawel SalsaDura <pawel....@gmail.com> wrote:
> That is ridiculous. People are contributing their GPUs, time and money for
> Leela development, but instead of picking the latest original Leela net to
> play at TCEC, the organizers pick Leela's clone instead!!! Some ridiculous
> Sergio Vieri clone! Wtf???? Who the f....ck cares about Sergio nets?
> Nobody!! Because everybody cares about the latest Leela net! And you don't
> have to tell me that this Sergio nothing is based on Leela... because nobody
> cares!!

--so how do you explain why this happened?

--
Warren D. Smith
http://RangeVoting.org <-- add your endorsement (by clicking
"endorse" as 1st step)

Alexander Lyashuk

Jun 16, 2020, 4:22:43 AM
to LCZero
SV nets are "contrib runs", re-training the network from existing training data using different methods.
It's expected for contrib runs to be stronger (here explained why), but contrib runs are considered part of the LCZero project, it's not a "clone", it's just a contribution to the project.

(unless the person doing the training objects to consider that a part of the LCZero project. E.g. "Enstein" network trained from Lc0 data is indeed probably a clone).

--
You received this message because you are subscribed to the Google Groups "LCZero" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lczero+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lczero/CAAJP7Y3oHhLDwDWYnW5qeN4WR_mv3ebWb0-FUZUYqmfRJc-wzA%40mail.gmail.com.

Warren D Smith

Jun 16, 2020, 2:31:47 PM
to Alexander Lyashuk, LCZero
TCEC is now (before the superfinal begins)
running a 30-game exhibition match between Lc0 (on CPU, not GPU) versus
all the 5 AB engines in the premier division. Result so far is:
Lc0(CPU) 5.5 out of 12
the AB-engine team: 6.5 out of 12.
So even on CPU, Lc0 remains pretty formidable -- strong enough to get into
the premier division -- probably near the bottom end of that division,
but it is there.

Shuo Xiang

Jun 16, 2020, 2:38:02 PM
to Warren D Smith, Alexander Lyashuk, LCZero
I keep having this question whenever I see these exhibition matches: what's the point of running LcZero on CPU? It's like jabbing a neural net into Stockfish: makes no sense at all!

Robert Pope

Jun 16, 2020, 2:53:48 PM
to LCZero
For comparison.  And who says jabbing a neural net into Stockfish makes no sense at all?


glbchess64

Jun 17, 2020, 2:12:32 AM
to LCZero
Leela runs T70, a 10-block net. This gives her about 5M nodes per move (with 88 cores, 1 thread per core), which is very good: it provides enough tactical play to resist the AB engines running 176 threads. It is possible that a 20-block net would be even better in such conditions.

Nasir Ghaznavi

Jun 19, 2020, 4:27:29 PM
to LCZero
Tend to agree with Pawel here: sending these nets leaves a bitter aftertaste and is kind of a slippery slope. If we are going this way, then let's start sending something like Leelenstein instead, which is possibly the best on all kinds of hardware right now, and let's also dump the Zero part.

Warren D Smith

Jun 19, 2020, 4:55:35 PM
to LCZero
As far as I understand, SF got about 25 elo stronger vs last superfinal,
plus in this one is admitting Lc0 is tough by going to zero "contempt" factor
rather than SF's default value 24 (centipawns I think).

There was a guy claiming that in the previous sufi, Stockfish had lost about 4 or 5
games due to not grabbing draws because of contempt, as opposed to
gaining only about 1 win. It is difficult to be confident of any such
claim, but if that really is so, then this change ought to cause
SF to be about even with Lc0. But the sufi may have a lot of draws.

Shuo Xiang

Jun 19, 2020, 5:16:32 PM
to Warren D Smith, LCZero
So is Stockfish setting its contempt to 0 an indication that it's learning from the lessons of the previous SuFi and clawing back its contempt of LcZero in order to score more draws?

Warren D Smith

Jun 19, 2020, 5:24:17 PM
to Shuo Xiang, LCZero
On 6/19/20, Shuo Xiang <shuo....@gmail.com> wrote:
> So is Stockfish setting its contempt to 0 an indication that it's
> learning from the lessons of the previous SuFi and clawing back its
> contempt of LcZero in order to score more draws?

--that's basically right. "Contempt=24" causes SF to avoid draws even
when 23 centipawns behind. If SF were sufficiently stronger than its
opponent, that decision would, on average, pay off.
Now with contempt=0, SF will try to get a draw when it thinks it is 1
centipawn behind.
This will make it harder for Lc0 to defeat SF.
As far as I know Lc0 also uses zero contempt.
So expect a lot of "grandmaster draws."
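
To make the contempt rule above concrete, here is a minimal sketch. It assumes a simplified model in which the engine values a drawn line at minus the contempt (in centipawns, from its own side) and takes the draw only when the best non-drawing line scores lower. The names `draw_score` and `prefers_draw` are hypothetical, and this is not Stockfish's actual code.

```python
def draw_score(contempt_cp: int) -> int:
    """Value (centipawns, engine's point of view) assigned to a drawn line."""
    return -contempt_cp


def prefers_draw(best_playing_on_cp: int, contempt_cp: int) -> bool:
    """True if a forced draw looks better than the best non-drawing line."""
    return draw_score(contempt_cp) > best_playing_on_cp


# Compare the two settings discussed in the thread on a few sample evals.
for contempt in (24, 0):
    for eval_cp in (-1, -23, -25):
        choice = "takes draw" if prefers_draw(eval_cp, contempt) else "plays on"
        print(f"contempt={contempt:2d} eval={eval_cp:4d} -> {choice}")
```

Under this model, contempt=24 keeps playing at -23 and only accepts the draw once the position is worse than -24, while contempt=0 accepts the draw as soon as the position is 1 centipawn worse than equal.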

Shuo Xiang

Jun 20, 2020, 2:26:53 PM
to Warren D Smith, LCZero
After drawing first blood, things have taken such a sharp downward turn for Leela that I dare not even look at tcec-chess.com now.

Warren D Smith

Jun 20, 2020, 2:32:59 PM
to Shuo Xiang, LCZero
Wow, Stockfish actually managed to win the French-Winawer game pair over Leela.
I was watching the start of that and it did not look like SF knew what
it was doing, the peanut gallery thought things were not going to end
well for SF :)

Warren D Smith

Jun 20, 2020, 2:55:53 PM
to LCZero
SF is about to win the black side of "KGA, insane sac-gambit line"
to get 4 wins to leela's 1.  Leela probably will come back to 4-2 on the
return game, though(?).

It may be that the tcec book has overdone it for season 18 on the gamestart biases.
E.g. the KGA gamestart looked highly biased against white, the French highly biased against black.

And to continue speculating, it is possible this will hurt leela, if we believe that leela is weakest when
playing highly biased positions, and strongest when the positions are pretty even.

PrinceZappa

Jun 20, 2020, 3:14:16 PM
to LCZero
The peanut gallery will become very one-sided once leela is trained on opening book. 

Warren D Smith

Jun 20, 2020, 3:21:16 PM
to LCZero
Another factor favoring SF over leela more than in prior TCECs is that
TCEC now makes endgame tablebases available in huge RAM. SF then can
make a lot of tablebase accesses with its large node rate, while leela
may not benefit as much with its 4000x slower node rate and shallower
search not seeing far enough to reach tablebases.

PrinceZappa

Jun 20, 2020, 3:41:54 PM
to LCZero
Even without tablebases SF would beat leela overall in endgames. SF's deep search plus specific evals for certain positions guarantees it. But it's also obvious leela is doing worse with certain openings and losing them well before endgame. 

Train her on openings and she'll improve TCEC performance by probably at least 50 ELO

Shah

Jun 20, 2020, 3:53:53 PM
to LCZero
Game 9 is now about to end in another defeat for Leela.
But what worries me is that for the last 20 moves (starting from move ~40) all engines including LS agree it is over. (<-10, SF black)
But Leela still hovers at around -3 very slowly going down with its eval.
What's going on here?

Warren D Smith

Jun 20, 2020, 3:59:08 PM
to LCZero
Indeed in at least 2 TCEC18sufi games so far, SF has found a tablebase sure win
while Lc0 is saying "duh, I can still try for a perpetual check" for a
long long time.
In other words: SF is playing perfectly with 100% understanding while
Lc0 is not understanding the situation. OK, in those cases SF had it
won long before, so you might say the EGTB did not matter, but the
*reason* SF had it won long before might have been, in part, the EGTB
enabling SF to avoid draw lines.

It looked to my naive eye (based on the games I saw) that Lc0 was
outplaying SF in openings, that part of the game was not Lc0's
problem.

Warren D Smith

Jun 20, 2020, 4:07:50 PM
to LCZero
The lesson of this might be:
Lc0 should use two nets, a "fast and dumb" one for endgames, and the
big smart slow one otherwise. If 10X faster that would, for one thing,
enable it to get >10X more tablebase hits (I presume). Also the fact
you could train the small net much faster+more might mean it
would not actually be all that dumb.

Warren D Smith

Jun 20, 2020, 5:47:42 PM
to LCZero
Wow, SF looks like it will not lose the KGA-insane-sac-line return game, so the
match is going to be 4 wins to 1 in SF's favor after only 10 games!
Looking like mega-kill! SF stomping those soft squishy leela neurons!

At least, so far.

ronnie millsap

Jun 20, 2020, 10:40:03 PM
to LCZero
lol you turned full on troll to both sides in the last few months warren. I actually enjoy your posts now compared to before.

Warren D Smith

Jun 21, 2020, 2:59:09 AM
to ronnie millsap, LCZero
Game 13.1 (Robatsch) is fascinating, looks like it needs somebody smart
like "Kingscrusher" to explain it to dumb people like me. Seriously
weird stuff happening. Leela (white) might be able to win it, but
we'll see tomorrow, I have to sleep now, 3AM here :)

Dietrich Kappe

Jun 21, 2020, 3:16:00 AM
to LCZero
In the same way that lc0 is a clone of a0.

Dietrich Kappe

Jun 21, 2020, 3:21:41 AM
to LCZero
Scorpio uses a smaller specialist endgame net at 14 pieces(?) and ab search at 9 pieces, last i checked.

ronnie millsap

Jun 21, 2020, 8:54:20 AM
to LCZero
I'm sure the only reason I find it funny is because most people are partial to a particular engine, while I realize it's just a chess thing, and even if you contribute, it has none of 'you' in it. People take chess way too seriously lol.

I really just like to see both engines lose as much as possible so it's not 200 boring draws every chess.com final etc.

Warren D Smith

Jun 21, 2020, 1:22:30 PM
to LCZero
In particular I think queen & pawn endings are a case where "fast & dumb"
evaluation ought to be the correct way to go; I find it hard to believe leela is
getting much out of having some huge neural net. SF seems to be outplaying
and out-understanding Lc0 in Q&P endings.

Consider a combination of a "decision tree" and a "neural net"
as a self-learned "zero" eval for leela. The decision tree is very fast and
then once we reach a leaf of that tree, it tells us which neural net to use.
These neural nets could vary greatly in size and speed. I would recommend
learning NNs of all different sizes and choosing which to use based on some
combination of how much "understanding" it has and how fast it is.

The question is how such a beast could be self-learned.
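
For illustration only, the dispatch half of this idea could be sketched as below. Everything here is an assumption: the `Position` summary, the tiny hand-written tree, the 14-piece threshold and the stand-in "nets" (plain functions) are hypothetical, and the sketch says nothing about how such a structure could be self-learned, which is the open question above.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Position:
    """Hypothetical, heavily simplified summary of a chess position."""
    piece_count: int        # all men on the board, kings and pawns included
    queens: int             # queens of both colours
    material_balance: int   # centipawns, from the side to move


Evaluator = Callable[[Position], float]


def small_net(pos: Position) -> float:
    """Stand-in for a fast, small network (here just a material count)."""
    return pos.material_balance / 100.0


def big_net(pos: Position) -> float:
    """Stand-in for a slow, large network (same toy formula for the demo)."""
    return pos.material_balance / 100.0


def choose_evaluator(pos: Position) -> Tuple[str, Evaluator]:
    """A cheap decision tree whose leaves name the evaluator to run."""
    if pos.piece_count <= 14:          # arbitrary illustrative threshold
        if pos.queens > 0:
            return "small_queen_endgame", small_net
        return "small_endgame", small_net
    return "big", big_net


leaf, evaluate = choose_evaluator(Position(piece_count=9, queens=2, material_balance=100))
print(leaf, evaluate(Position(piece_count=9, queens=2, material_balance=100)))
```

The point of the structure is that the tree costs almost nothing per node, so the expensive part (the network) can be sized per leaf.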

Warren D Smith

Jun 21, 2020, 2:11:56 PM
to LCZero
TCEC adjudicated game 16.1 (Budapest) as a "draw" but I felt there was still
a lot of life left in the position.

The TCEC draw-adjudication rule did not work well in this situation.
And some other "drawn" TCEC superfinal games also were suspicious,
especially game #1 (Sicilian Polugaevsky). Why the hell is that game
suddenly ending in a draw? Suspicious. If human grandmasters did that,
they'd be subject to some ridicule for "grandmaster draws" and
"Russian plots."

glbchess64

Jun 21, 2020, 5:22:56 PM
to LCZero
Leela's play is worse in endgames with queens than in other endgames, but in fact the difference is not so big: less than 20 Elo. This means that she plays good moves in almost all positions and sometimes plays a bad move. Nothing that needs big discussion again and again. The problem with the queen is more general than endgames: she plays better without queens. The reason is that positions with queens are more tactical, and Leela's positional play is much stronger than her tactical play.

The idea that SF is outplaying Leela in endgames, and in particular in endgames with a queen, is simply wrong. There are just some positions where the Fish plays better. In most endgames Leela plays very well. Remember that Leela's wins always come in endgames, whereas SF's wins sometimes come in the middlegame. She also has a lot of very good draws where SF has a queen and Leela a rook, or RN or RB, with some extra pawns, where SF thinks it is winning until it reaches the 50-move limit and where Leela gives the "good" eval all along (I put "good" in quotes because Leela seldom displays 0.00 but rather something like 0.20, which means a marginal advantage not enough to hope for a win and is equivalent to SF's 0.00). You can find all of this in previous SuFis, for example, but also in CCC games (there are lots of them, enough to do statistics).

Comments based on statistics are better; without statistics the human mind usually concentrates only on special cases.

Jim Glass

Jun 21, 2020, 7:38:06 PM
to LCZero
Game 18 move 46.

Stockfish calls "M71"!

Yeah, 71. Geeze.

Warren D Smith

Jun 21, 2020, 7:59:51 PM
to Jim Glass, LCZero
Well, that presumably was heavily tablebase aided. E.g. stockfish maybe looked
20 ply ahead then the other 51 ply were precomputed inside the tablebase?

Warren D Smith

Jun 21, 2020, 8:06:11 PM
to Jim Glass, LCZero
eval = M71
Depth/SD = 41/54
speed = 314.3 Mnps
nodes = 34.6B
TBhits = 1.3B

The data above indeed indicates the search was heavily tablebase aided.
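
As a rough check on those figures, the arithmetic below derives the implied search time and the share of nodes that were tablebase hits. It assumes the reported speed is an average over the whole search, so nodes divided by speed approximates the time spent; that reading of the TCEC display is an assumption.

```python
# Figures copied from the stats quoted above.
nodes = 34.6e9        # nodes searched
speed_nps = 314.3e6   # nodes per second
tb_hits = 1.3e9       # tablebase probes that hit

seconds = nodes / speed_nps            # implied time on this move
tb_share = tb_hits / nodes             # fraction of nodes resolved by a TB hit
tb_hits_per_second = tb_hits / seconds

print(f"time on move  : about {seconds:.0f} s")
print(f"TB hit share  : {tb_share:.1%}")
print(f"TB hits / sec : {tb_hits_per_second / 1e6:.1f} million")
```

So roughly one node in 27 was answered directly from the tablebases, at a rate of several million hits per second.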

Robert Clark

Jun 22, 2020, 1:59:23 AM
to Warren D Smith, Jim Glass, LCZero
Why are table bases even legal in tournament play? They don't use the evaluation function of the chess engine that is playing, and they aren't the same for both parties. It just seems wrong to me. It's one thing if an engine is being used for analysis. In that case better performance is better performance. But that isn't really the core ethic in a tournament, is it?

Warren D Smith

Jun 22, 2020, 3:02:59 AM
to Robert Clark, LCZero
On 6/22/20, Robert Clark <rlcl...@gmail.com> wrote:
> Why are table bases even legal in tournament play? They don't use the
> evaluation function of the chess engine that is playing, and they aren't the
> same for both parties.

--in the case of TCEC, they *are* the same for both players, they both
access the same tablebases in RAM. If they so choose.

> It just seems wrong to me. It's one thing if an
> engine is being used for analysis. In that case better performance is
> better performance. But that isn't really the core ethic in a tournament,
> is it?

--well, computers are different than humans. Anyhow, if TCEC said "we
will not provide tablebases" then the programs playing could create
their own. In fact, stockfish already has a builtin KPk tablebase
creator in its code. Would you then try to forbid that by saying
"Hey programs, you are only allowed to do THIS kind of 'thinking' and
not this other kind, and I will decide somehow which is which?"

That, it seems to *me*, would be "wrong."

ronnie millsap

Jun 22, 2020, 8:59:58 AM
to LCZero
ROOF!  (afps reference lol)...
Chat is awesome to laugh at. I used to ridicule Andrew Grant (Ethereal) there all the time.

ronnie millsap

Jun 22, 2020, 9:00:55 AM
to LCZero
OUCH, mate in 71 would stick it to your heart cold, if that Lc0 beeotch had one!

Álvaro Begué

Jun 22, 2020, 12:29:39 PM
to LCZero
There is no clear line separating data and code. Opening books and endgame tablebases are fine mechanisms for a chess-playing program to play chess.

Pawel SalsaDura

Jun 22, 2020, 12:37:43 PM
to LCZero
Oh, arguing with you is such a waste of time; whatever anyone says, you always have something to argue against. The tablebase idea is wrong because it is simply a pre-calculated sequence of moves that doesn't come from the engines. So if you bring the argument that SF has a tablebase built in, I will bring the argument of a 15-piece tablebase: would it make any sense to play such games with 15 pieces precalculated? What about a 20-piece tablebase? Or why not tablebase the whole game from move 1? C'mon, this is a competition; we want to find the best, most intelligent chess engine, and not the one that can access the tablebase first because of computing speed. That is totally wrong; the idea of computer chess has been warped. We don't need tablebases at all, because chess endings are part of the game!!!

glbchess64

Jun 22, 2020, 1:00:01 PM
to LCZero
The goal of TCEC is not to find the best chess engine (otherwise they would have a better tournament system that ensures the two best engines qualify for the SuFi). The goal of TCEC is to organise tournaments for fun; it is entertainment. This is the reason why they use books with long weird variations, and this is the reason why the TC is long enough for spectators to have time to analyse the moves. And it works very well, since about 1000 viewers are watching the SuFi at any moment.

Tablebases are really not important, nor are opening books: they just add or subtract a few Elo points (for Leela, TBs add only about 20 Elo and an opening book likely subtracts some Elo points; for SF, TBs may add a bit more, and it is not certain that a book helps it).

I think that this discussion of books and TBs is really outdated. In the 90s such things made a big difference; nowadays the engines are so strong that they are cosmetic features.

Shuo Xiang

Jun 22, 2020, 1:00:55 PM
to Pawel SalsaDura, LCZero
Well now that you've mentioned " why not tablebase the whole game from move 1" : that's called "solving Chess" and has been a topic of study for computer science. The consensus so far is that solving Chess is out of the realm of possibilities for non-quantum computers.

Warren D Smith

Jun 22, 2020, 1:40:41 PM
to LCZero
Tablebases are not part of the game, in the sense that TCEC terminates games
as soon as they reach a tablebase position. So yes, TCEC already is finding
"the best engine" based purely on their thinking ability without openings
or tablebases.

However, obviously an engine can get advantage by looking ahead into tablebases.
Which muddies that claim. But both Lc0 and SF are using the same tablebases and
they both do consult them as much as they want.

Álvaro Begué

Jun 22, 2020, 1:53:35 PM
to LCZero
I'm not sure if Pawel is talking to me, and I don't like the personal tone -whoever it's directed at-, but I'll bite^H^H^H^Htry to explain myself better.

I'm just pointing out that an engine is a bunch of code, and that code might say something like "if we have a white king on c3, a black king on a3, a white queen on d3, and no other pieces, and it's white's turn, this position is a mate in 2". That might seem like an unlikely piece of code, but my Spanish checkers engine in the mid 90s had that type of code for a specific position in 3-kings-vs-1-king, which is very common (a position known as "el triángulo de la forzosa", which is taught to learners of the game and appears in checkers books from centuries ago), and then the search would know to look for that position to win. Even in chess, I've written code to detect a draw in KBP-vs-K where the pawn is on a or h, the lonely king controls the promotion square and the bishop is "bad".

Of course, having a long list of conditional statements to detect specific positions quickly becomes inefficient, but there are ways to implement that kind of logic by looking the position up in a hash table. You can make this more systematic and have an entry for every position with a particular material configuration. These hash tables quickly become too large to be practical, but you can use clever enumeration schemes and replace the hash table with a plain array, and then you can use compression techniques. This part of the engine is easy to separate into a library call and a bunch of data files, which is what tablebase authors have done.
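
The progression described above (special-case rule, then lookup table, then plain array addressed by an enumeration scheme) can be sketched for KQ-vs-K as follows. This is a toy under stated assumptions: the index is the naive 64 x 64 x 64 x 2 enumeration with no symmetry reduction or compression, the names `sq`, `kqk_index` and `probe` are hypothetical, and the single stored entry is a placeholder rather than the output of a real retrograde-analysis generator.

```python
FILES = "abcdefgh"


def sq(name: str) -> int:
    """Map a square name such as 'c3' to a number in 0..63."""
    return (int(name[1]) - 1) * 8 + FILES.index(name[0])


def kqk_index(wk: int, wq: int, bk: int, white_to_move: bool) -> int:
    """Dense index: one array slot per (white K, white Q, black K, side to move)."""
    return ((wk * 64 + wq) * 64 + bk) * 2 + (0 if white_to_move else 1)


TABLE_SIZE = 64 * 64 * 64 * 2          # 524288 slots, many of them illegal positions
UNKNOWN, WHITE_WINS, DRAW = 0, 1, 2
table = bytearray(TABLE_SIZE)          # every slot starts as UNKNOWN

# The hard-coded rule from the text becomes a single array entry.
# Placeholder value: a real generator would fill the table by retrograde analysis.
table[kqk_index(sq("c3"), sq("d3"), sq("a3"), True)] = WHITE_WINS


def probe(wk: str, wq: str, bk: str, white_to_move: bool) -> int:
    """Look a position up; this is the 'library call' side of a tablebase."""
    return table[kqk_index(sq(wk), sq(wq), sq(bk), white_to_move)]


print(probe("c3", "d3", "a3", True))   # the stored entry
print(probe("c3", "d3", "a3", False))  # never filled in, so still UNKNOWN
```

Half a megabyte for one three-man ending shows why the cleverer enumeration schemes and compression mentioned above matter once more pieces are added.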

It's hard for me to draw a line somewhere in this process and say "Ah, no! At this point you have introduced tablebases and it's no longer your engine doing the thinking!". The only parsimonious way to handle this in the rules of a tournament is to consider that all of these incarnations are part of the engine, and therefore allow endgame tablebases.

Jim Glass

Jun 22, 2020, 4:01:20 PM
to LCZero

Fish looks about to go up +4.

Like the more potent table bases or not, they sure seem to be having an impact!

Or something is.

Gee willikers, Batman!

Warren D Smith

Jun 22, 2020, 4:04:31 PM
to LCZero
Game 24 (queen's indian) -- SF says it is ahead by 9.69 pawns,
while Lc0 says "what, me worry?" thinking SF only 1.16 pawns ahead.

If SF indeed wins this game, then there will be
a grand total of 4 game-pairs won by SF, remaining 20 game-pairs drawn,
none lost.

If Lc0 really were superior to (or same as) SF, then the chance of
that would be <=1/16, i.e. tossing 4 coins and getting 4 heads, so I
think there is already nearly 94% confidence Stockfish actually is
superior to Lc0 under TCEC conditions.
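
The coin-toss arithmetic can be checked in a few lines. This is only the calculation from the post, assuming each decisive game-pair is an independent fair coin flip when the engines are equal; reading 1 - p as "confidence" follows the post's informal usage rather than a formal statistical definition.

```python
from fractions import Fraction

decisive_pairs = 4                       # game-pairs with a winner, all won by SF
p_all_to_one_side = Fraction(1, 2) ** decisive_pairs

print(p_all_to_one_side)                 # 1/16
print(float(1 - p_all_to_one_side))      # 0.9375
```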

Not many people on either the SF or Lc0 side expected that to happen!
So it might be that the SF approach of humans trying to learn from
SF's defeats, coding it better to fix them, and validating by big testing,
really does work better than the brute force machine learning approach of Lc0.

Warren D Smith

Jun 22, 2020, 4:39:54 PM
to LCZero
On 6/22/20, Warren D Smith <warre...@gmail.com> wrote:
> Game 24 (queen's indian) -- SF says it is ahead by 9.69 pawns,
> while Lc0 says "what, me worry?" thinking SF only 1.16 pawns ahead.
>
> If SF indeed wins this game, then will be
> a grand total of 4 game-pairs won by SF, remaining 20 game-pairs drawn,
> none lost.

--SF is indeed winning it, again seeing tablebase mate while Lc0 again
remains pretty clueless of the devastation raining down upon it. Looks
like a lot of SF's strength advantage might be related to it using
tablebases in RAM much more effectively, although it is hard to be sure
of that without controlled experiments.

PrinceZappa

Jun 22, 2020, 4:42:30 PM
to LCZero
Some number of patches have come from observing SF losses against Leela. So it's probably fair to say with no Leela, SF wouldn't be as strong right now.

glbchess64

Jun 22, 2020, 5:17:41 PM
to LCZero
Warren D Smith :

  •  If machine learning is brute force for you, then SF development is super brute force: each patch for SF (and there are several patches each day) must be validated by thousands of games. This is why SF is so good compared to all other AB engines: it uses the FishTest network, a distributed effort to validate the patches. And the FishTest network uses far more computers than Leela's net training.

    And during the game, AB is clearly brute force, whereas PUCT + NN is an intelligent and selective search.

  • It seems that this time Leela has technical issues. I will not go into detail on this point, as it remains partially unclear (there are hundreds of messages about it in the Leela Discord; the main messages are pinned and easy to find). In short, the speed is lower than expected (equivalent to 2 GPUs instead of 4), and the parameters of the new MLH feature seem to be too aggressive and cause the net to play bad moves (this is the first time this feature has been used with 4 GPUs). The result is that the net playing the SuFi plays about 50 Elo below the net that played divP.

Warren D Smith

Jun 22, 2020, 9:55:26 PM
to glbchess64, LCZero
--well, it sure is lame if your superfinal version is 50 elo weaker
than your (earlier) divP version! If this is true, it sure looks like
leela could have used some fishtest-like testing to prevent that kind of
embarrassment! (Let's put in some last-second modifications for the
superfinal... what could possibly go wrong? :)

--But to stop trolling for a sec, I partly agree with glbchess64 about
"brute force." SF's improvement is partly due to Lc0 helpfully pointing
out to the SF developers what is wrong with SF that they need to work on.
Then they try out a lot of ideas, most bad, and test heavily with brute
force to find the ideas that work. Their fishtest is indeed rather brutal.

But what is "brute force" about Lc0 is, they encode their knowledge in
a huge number of megabytes of incomprehensible data, and which data is
used in a very inefficient manner --large amount of computing per node
(in TCEC apparently about 4000X slower than stockfish). Stockfish's
knowledge is encoded in a much smaller amount of data, in a way
that is both comprehensible, and much faster to use.
OK? So that is what is brute force about both Lc0, and neural nets generally.

ronnie millsap

Jun 23, 2020, 12:10:17 PM
to LCZero
I don't care for any engine over the other, but that's not correct data given the context. Can't compare SF to be better with tablebases etc.

Felix Zaslavskiy

unread,
Jun 23, 2020, 12:27:19 PM6/23/20
to glbchess64, LCZero
Thanks.
I don't follow Discord very closely. I am interested in the news on this. 
Is there a confirmation that something is wrong with settings in TCEC18 superfinal?
Is there any way TCEC will fix the settings in the middle of the match if it is found to be a settings issue?

--
You received this message because you are subscribed to the Google Groups "LCZero" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lczero+un...@googlegroups.com.

Dietrich Kappe

unread,
Jun 23, 2020, 4:05:12 PM6/23/20
to LCZero
From the discord:

**Update on apparent nps problems with Lc0 SuFi submission:**
Aloril thankfully ran a few test positions (book exit after games 50,56 from DivP and games 9,10 from SuFi) both with Lc0 DivP and SuFi submissions, see https://discordapp.com/channels/425419482568196106/539960268982059008/724413973193293824
Results suggest that DivP and SuFi submissions aren't different significantly, and show a 15-20% decrease in nps and nodes searched compared to the actual DivP logs of games 50 and 56.
The nps is clearly position dependent, with significantly lower nps in apparent "deep/tactical" positions, pointing towards the known CPU bottleneck with low single core speed CPU with powerful GPU hardware and bad Lc0 search parallelization.
Most likely culprit right now seems to be the (missing) Turbo of the server CPU; at least the numbers could be explained by Turbo 2.5 GHz --> 3.1 GHz being on in DivP while being off in SuFi. However, these are inner workings of the hardware and out of simple settings control.
At the same time, the selected net seems to have a few blind spots here and there, and the relatively high evaluations of some draws could potentially interact badly with the relatively aggressive MLH settings.

Warren D Smith

unread,
Jun 23, 2020, 5:12:09 PM6/23/20
to Dietrich Kappe, LCZero
What is "MLH"?

Dave Whipp

unread,
Jun 23, 2020, 5:39:46 PM6/23/20
to Warren D Smith, Dietrich Kappe, LCZero
MLH = moves left head -- predicts number of moves remaining, to help avoid aimless shuffling



glbchess64

unread,
Jun 24, 2020, 8:40:13 AM6/24/20
to LCZero
Leela's speed was measured in the SuFi and compared with her speed in divP.

Here is a much clearer picture of the slowdown:

Slowdown compared to the 42 games at Div P

SuFi
Games  1 - 26             20.47%

REBOOT (of Leela server)
Games 27 - 38              9.84%

TOTAL
Games  1 - 38             15.16%


Warren D Smith

unread,
Jun 24, 2020, 9:51:45 AM6/24/20
to glbchess64, LCZero
Wow, getting 20% of the DivP node rate in the superfinal was incredibly
pathetic, and rebooting apparently actually worsened the problem?!

This seems like the worst screw-up in high-division TCEC history!
A previous big screw up was stockfish once managed to introduce a bug
which kept causing it to crash, but they just barely managed to survive the
divP without being disqualified, then fixed the bug. But that was not as bad
as this.

I'm surprised at glbchess64's claim this was costing leela 50 elo.
Normally a factor 5 slowdown would cost a chess program more like 160 elo.

So yes, that explains it. Leela is strong, but not strong enough to
give away 160 elo to stockfish...
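
A rough way to sanity-check figures like these is the common rule of thumb that each doubling (or halving) of search speed is worth a roughly fixed number of Elo. The sketch below assumes 70 Elo per doubling; that value is an illustrative placeholder, not something measured for Lc0 or Stockfish on TCEC hardware, and the function name is made up for this example.

```python
import math

# Assumption: Elo gained per doubling of speed. 70 is a rule-of-thumb
# placeholder, not a measured value for any engine in this thread.
ELO_PER_DOUBLING = 70.0


def elo_cost_of_slowdown(slowdown_factor, elo_per_doubling=ELO_PER_DOUBLING):
    """Estimated Elo lost when an engine runs `slowdown_factor` times slower."""
    if slowdown_factor <= 0:
        raise ValueError("slowdown_factor must be positive")
    return elo_per_doubling * math.log2(slowdown_factor)


print(round(elo_cost_of_slowdown(5.0)))   # factor-5 slowdown
print(round(elo_cost_of_slowdown(1.25)))  # running at 80% of the original speed
```

Under that assumption a factor-5 slowdown costs a bit over 160 Elo, while running at 80% speed (a factor of 1.25) costs only a little over 20.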

Brian

unread,
Jun 24, 2020, 10:27:08 AM6/24/20
to LCZero
They are using the wrong settings, period.   For multiple cards Threads=2 is most generally wrong. 
Not only will you produce lower nodes/sec, but your scores on test suites will be slightly lower.

While it is true that on *some* systems with certain setups Threads=2 is ideal (single GPU), for multiple GPUs, and especially 3, I do NOT believe Threads=2 is ideal.

I have repeatedly asked them to run a nodes/sec test with (Threads=2,3,4,5,6) and then run a challenging test suite with (Threads=2,3,4,5,6) and post the results for this monster machine they have set up for TCEC. They simply do not respond.

During the prelim matches I said "why Threads=2, you are gimping it", to which many members said, "They know best, leave it alone". Finally one of the admins replied that they wanted to give the other engines a chance in the early stages of the matches, so they set it to 2. I have no idea what logic that is but, whatever, it's not my machine.

Surely some of you out there have multiple GPUs. Don't trust me, run the test yourself. Leave the rest of the settings alone and try moving the number of threads slightly higher and see what happens to the nodes/sec! Now please keep in mind that nodes/sec are NOT everything; they are simply a good starting point. After that you must run hundreds of hours of adjustments to determine which settings are best for your machine. I recommend running multiple test suites and complex positions, recording how soon Lc0 finds the solutions.

Here is a configuration run I set up with 4x 2080Tis.
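
For anyone who wants to repeat the Threads sweep described above, here is a minimal sketch that talks to an engine over the standard UCI protocol. It assumes the engine exposes a UCI option named Threads and reports "nps" in its "info" lines; the helper names, the engine path and the node budget are all hypothetical, and this is not the script used to produce any results in this thread.

```python
import subprocess


def parse_nps(info_line):
    """Return the nps value from a UCI 'info' line, or None if it has none."""
    tokens = info_line.split()
    if not tokens or tokens[0] != "info" or "nps" not in tokens:
        return None
    idx = tokens.index("nps")
    if idx + 1 >= len(tokens):
        return None
    return int(tokens[idx + 1])


def uci_commands(threads, nodes):
    """UCI commands for one fixed-node search from the start position."""
    return [
        "uci",
        f"setoption name Threads value {threads}",
        "isready",
        "position startpos",
        f"go nodes {nodes}",
    ]


def measure_nps(engine_path, threads, nodes=200000):
    """Run one search and return the last nps value the engine reported."""
    proc = subprocess.Popen(
        [engine_path], stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )
    proc.stdin.write("\n".join(uci_commands(threads, nodes)) + "\n")
    proc.stdin.flush()
    last_nps = None
    for line in proc.stdout:
        nps = parse_nps(line)
        if nps is not None:
            last_nps = nps
        if line.startswith("bestmove"):
            break
    proc.stdin.write("quit\n")
    proc.stdin.flush()
    proc.wait()
    return last_nps


# Hypothetical usage (the path is a placeholder):
# for t in (2, 3, 4, 5, 6):
#     print(t, measure_nps("/path/to/lc0", t))
```

A single start-position search is only a starting point; as noted above, nps is position dependent, so a real comparison would average over many positions.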

glbchess64

unread,
Jun 24, 2020, 12:03:13 PM6/24/20
to LCZero
Warren D Smith: the speed was not the only problem. And rebooting partially solved the speed problem (you did not interpret the post correctly).

You can also notice that Leela is +2 against SF since the reboot and has not lost any game. But remember that all this is SSS (small sample size), so it is not possible to draw any serious conclusion. On Discord there were many interpretations of what happened (the 50 elo comes from one of the messages, from someone serious). It seems that a lot of people in this forum are not on Discord, so I just gave some information that was missing from this thread.

The only sure facts are:
  • the speed is lower than expected and the reboot partially solved the problem (there is also an issue with turbo, which the processor does not want to use in the SuFi but used in DivP: the effect is that the graphics cards are not used at their maximum, someone said 50% load),
  • rebooting partially solved this issue,
  • when using SV-3010, people are not able to reproduce some of the mistakes; in particular, SV-3010 sees immediately that one of the blunders (Ne4 in the opening) is a bad move and prefers a move that leads to an equal position.
I did not give my opinion, but I suppose it is well known; I have given it so many times. My opinion is that all this is SSS: 100 games is simply not enough to know whether SF or Leela is stronger than the other. The levels are too close and, worse, the TCEC openings are heavily biased. They are chosen for entertainment and not for engine testing.

So Leela can win by luck, and so can SF. It is simply not possible to do statistics on 100 games; it is nonsense. If not a quantitative approach, you can take a qualitative one. Most of Leela's losses (but not all) are due to blunders in games where her play was not bad apart from the blunders. I have just one example of a game where SF really outplayed Leela positionally and tactically and, curiously, it is a French where Leela had Black. Most of the games look like previous SuFi games. Maybe Leela's positional domination is not as strong as in previous SuFis, but I will be more sure of that at the end.

People seem to think that if you lead by 4 points after 26 games you must win a 100-game match. But that is totally false; it is false even if you are the best! Recall the Karpov - Kasparov match in 1984 (https://fr.wikipedia.org/wiki/Championnat_du_monde_d%27%C3%A9checs_1984): 5-0 after 27 games and 5-3 after 48 games, when the match was stopped! Yet it is not really a good example, since I have the feeling that Kasparov learned how to play against Karpov during the match. But, statistically, such things happen. I also recall a match on @mattblachess's stream where the first wins were all for SF (more than 10, almost all the games at the beginning, with very few draws) and SF finally won (because over 100 games it is difficult to reverse such a trend), but with just a "normal" margin (like +3). I just notice that on Discord, in the TCEC chat and in this forum, people are not aware enough of statistics and the central limit theorem.
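
To make the small-sample point concrete, here is a minimal sketch that turns a match result into an Elo estimate with an approximate 95% interval, using the usual logistic Elo formula and a normal approximation. The 100-game result fed into it (20 wins, 64 draws, 16 losses, i.e. +4) is purely hypothetical and is not a result from this SuFi; the function names are also just for illustration.

```python
import math


def elo_from_score(score):
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return 400.0 * math.log10(score / (1.0 - score))


def match_elo_interval(wins, draws, losses, z=1.96):
    """Return (elo, low, high): point estimate and approximate 95% bounds."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(var / n)
    low, high = score - z * stderr, score + z * stderr
    return elo_from_score(score), elo_from_score(low), elo_from_score(high)


# Hypothetical 100-game match ending +4 for one side.
elo, low, high = match_elo_interval(wins=20, draws=64, losses=16)
print(f"Elo {elo:+.1f}, approx. 95% interval [{low:+.1f}, {high:+.1f}]")
```

With these made-up numbers the point estimate is about +14 Elo, but the interval runs from roughly -27 to +55, so it comfortably includes zero.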

John Upper

unread,
Jun 24, 2020, 12:15:25 PM6/24/20
to LCZero
Since you are arguing that for multiple GPUs Threads=2 is slower than other settings, shouldn't your table of results include a few where Threads=2?

Brian

unread,
Jun 24, 2020, 12:43:20 PM6/24/20
to LCZero
Of course. I have the spreadsheet results populate and sort into NPS->Descending order; that way I can observe groups of patterns associated with several setting changes and how they affect NPS.

Threads=2 start showing up around line 50 of the spreadsheet...


Warren D Smith

unread,
Jun 24, 2020, 3:29:52 PM6/24/20
to glbchess64, LCZero
> I did not gave my opinion, but I suppose it is well known, I gave it so
> many times. My opinion is that all this is SSS : 100 games is simply not
> enough to know if SF and Leela are stronger than the other.

--well, I agree 100 games is too small for most statistical purposes.
Still, it is entirely possible to get enormous confidence, even with
only 100 games, that A is superior to B... *if* the match result is
good enough. And indeed, at the rate SF was piling up won game-pairs
vs Lc0 during the first 20 games, if that rate continued for all 100,
then we would indeed have enormous confidence, about 99.9999%, that
SF was superior.

> The levels are
> too close and ,worse, the TCEC openings are heavily biased. They are chosen
> for entertainment and not for engines testing.

--The bias does not matter since both sides enjoy it equally. And the
bias actually helps
get better statistics faster.

>Most of Leela loss (but not all) are due to blunders
> in games where she had not a bad play except the blunders.

--yeah, well, if I never blundered I'd probably be rated 2300 or so...

glbchess64

unread,
Jun 24, 2020, 4:42:43 PM6/24/20
to LCZero

> --well, I agree 100 games is too small for most statistical purposes.
> Still, it is entirely possible to get enormous confidence, even with
> only 100 games, than A superior to B... *if* good enough match result.
> And indeed, at the rate SF was piling up won game-pairs vs Lc0 during
> the first 20 games, if that rate continued for all 100, then we would
> indeed have enormous confidence, about 99.9999%, that SF was superior.

Yes, but this is not possible since the levels are very close. I suspect TCEC of having increased the number of cores for AB engines last season to keep the balance (TCEC does not hide that its goal is entertainment; for this they need an open competition).


> --The bias does not matter since both sides enjoy it equally. And the
> bias actually helps get better statistics faster.

Wrong: the bias, especially the TCEC bias, produces worse statistics. If you want good statistics you need all kinds of positions. If you want to reduce the sample, you need a representative sample and a high draw rate (to reduce the error bar).
TCEC doesn't want a high draw rate since spectators don't like that. And a representative sample supposes that you know a lot about the possible results (this is what polling institutes do: they have a model of reality, and the results are good if reality conforms to the model; if a disruptive event arrives they can't predict anything).

TCEC chooses to have a lot of unbalanced positions: sub-variations, gambits, ... But unbalanced closed positions favour Leela and unbalanced open positions favour SF. Positions with queens favour SF, positions without queens favour Leela (there are no positions without queens in the sample), positions with black 0-0-0 favour SF, positions without castling rights or without castling favour Leela (because she knows better when to castle and when not to), weird positions favour SF, common positions favour Leela, ...

How can you choose the sample if you choose unbalanced games? In that case you can only choose the sample if you know the result. Or you can choose the sample to please the spectators. That is what TCEC does. Last season TCEC chose the Alekhine and the Trompowsky, which are open and tactical, but SF fans complained that there was no KGA. So this season there is the KGA, which is equivalent (not totally equivalent, because there is really nothing positional in the KGA whereas some lines of the Alekhine and Trompowsky are a bit positional).

Another very interesting example: if you let Leela play from the initial position she is a lot stronger. The experiment has been done without a book for SF and with a balanced book from GM play. In both cases Leela easily outplayed SF. That means that even if SF does not make positional mistakes in the opening, it makes enough positional mistakes later to lose the game without risk for Leela. She is very good at winning without risk (she can miss some wins but she cannot lose).

To conclude, TCEC openings are only representative of TCEC tournaments; they are not representative of chess from the initial position, nor of chess from any position.

ronnie millsap

unread,
Jun 25, 2020, 8:13:53 AM6/25/20
to LCZero
'So yes, that explains it.  Leela is strong, but not strong enough to
give away 160 elo
to stockfish... ' burrnnnn. Now time to sf to implement a +320 elo patch to counter this Bs on an inversed level!

Warren D Smith

unread,
Jun 27, 2020, 2:59:34 PM6/27/20
to LCZero

1. It was pointed out to me that glbchess64, when giving a 20% figure for leela's speed versus
divP, actually meant 80% speed (a 20% slowdown).

2. now with 62 of the 100 superfinal games played, the score is 15 wins for SF, to 10 wins for Lc0.
More revealingly, SF has 7 game-pair wins, versus Lc0 with only 2.
The net effect is 91% confidence SF is superior to Lc0 in the form Lc0 was during the SuFi.

In a lot of SF's wins, I have not counted exactly how many, SF recognizes the win, 
probably with the aid of tablebases, while Lc0 is still fantasizing it might be ok sometimes for quite 
a long time.
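
The 91% figure in point 2 is consistent with a one-sided sign test on the decided game pairs (7 for SF, 2 for Lc0, tied pairs ignored). That this is the exact method behind the number is an assumption; the sketch below simply shows that it reproduces it. The function name is made up for this example.

```python
from math import comb


def sign_test_confidence(pair_wins_a, pair_wins_b):
    """One-sided sign test: 1 - P(A wins at least this many of the decided
    pairs if each decided pair were a fair coin flip)."""
    n = pair_wins_a + pair_wins_b
    tail = sum(comb(n, k) for k in range(pair_wins_a, n + 1))
    p_value = tail / 2 ** n
    return 1.0 - p_value


print(f"{sign_test_confidence(7, 2):.1%}")  # 7 pair wins vs 2
```

For 7 vs 2 the tail probability is 46/512, which gives a confidence of about 91%.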


Warren D Smith

unread,
Jun 28, 2020, 12:45:54 AM6/28/20
to LCZero
In SuFi game 65 (bogo-indian) Leela as white sacced a pawn, then a queen for 2 pieces, then a piece, and they all
were "positional sacrifices."  It then... won.

Just three words for this game: W.T.F.

Warren D Smith

unread,
Jun 28, 2020, 1:20:48 AM6/28/20
to LCZero
In SuFi game 63 (french), Stockfish lost the way it often idiotically does in the French, 
by playing ...c4 as black, which usually is a disaster for black, 
as is well known in every book on the topic, and for all the usual reasons.

In the return game Leela equalized by playing a line which apparently had only been tried
once (?) before, so pretty original.  Draw.

glbchess64

unread,
Jun 29, 2020, 9:06:15 AM6/29/20
to LCZero
For information, @Navs post on discord :

TCEC Stats

Div P  (Total 42 games each)
Average exit move NPS
Lc0 49933
SF 133116667
Lc0 nps / SF nps 0.00037511 x 2666 = 1.00

Score  lc0 3  -  SF 3  (individual games)
Difference  0


SuFi (Games 1 – 26)
Average exit moves NPS
Lc0 41265
SF 134353846
Lc0 nps / SF nps 0.00030714 x 3256 = 1.00  (Sufi games 1 – 26 Lc0 was 22.13% slower than Div P)

Score after 26 games
Lc0 11  -  SF  15
Difference (for Lc0)  -4


SuFi REBOOT
Games 27 – 72
Average exit moves NPS
Lc0 44893
SF 131795745
Lc0 nps / SF nps 0.00034063 x 2935 = 1.00 (After the reboot Lc0 was 10.09% slower than Div P)

Score after reboot games
Lc0 23  -  SF  23
Difference 0
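
The percentages in these stats appear to come from the SF/Lc0 node-rate ratio (2666, 3256, 2935) rather than from Lc0's raw nps; that reading is an inference from the numbers, not something stated in the post. A minimal sketch that recomputes them from the averages quoted above (function names are made up for this example):

```python
def sf_nodes_per_lc0_node(lc0_nps, sf_nps):
    """How many Stockfish nodes are searched per Lc0 node."""
    return sf_nps / lc0_nps


def relative_increase(baseline, value):
    """Fractional increase of `value` over `baseline`."""
    return value / baseline - 1.0


# Average exit-move NPS figures quoted in the post above.
div_p = sf_nodes_per_lc0_node(49933, 133116667)
sufi_before = sf_nodes_per_lc0_node(41265, 134353846)  # SuFi games 1-26
sufi_after = sf_nodes_per_lc0_node(44893, 131795745)   # SuFi games 27-72

print(f"ratio vs DivP, games 1-26:  {relative_increase(div_p, sufi_before):+.2%}")
print(f"ratio vs DivP, games 27-72: {relative_increase(div_p, sufi_after):+.2%}")

# Raw Lc0 nps drop, for comparison (a different, smaller number).
print(f"raw Lc0 nps drop, games 1-26: {1 - 41265 / 49933:.2%}")
```

This gives roughly +22.1% and +10.1% for the ratio, close to the quoted 22.13% and 10.09%, whereas the raw Lc0 nps in games 1-26 is about 17% below its DivP average.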

Warren D Smith

unread,
Jul 1, 2020, 10:49:44 PM7/1/20
to LCZero

Stockfish just clinched TCEC 18 superfinal victory; it also won divP; and will probably end up winning
the superfinal by +8 (unclear what the margin will be, that is a guess, currently +7) which if so
will be a larger margin than Lc0's +5 margin of victory in TCEC 17 superfinal.

Shuo Xiang

unread,
Jul 2, 2020, 11:16:43 AM7/2/20
to Warren D Smith, LCZero
Here's what I don't get: didn't Deepmind show that AlphaZero conclusively beat Stockfish something like 70 to 30? If so, why has it taken Leela so long, and why is she still see-sawing back and forth with Stockfish, winning or losing by slim margins?


Warren D Smith

unread,
Jul 2, 2020, 11:26:46 AM7/2/20
to Shuo Xiang, LCZero
There might be something flaky about Lc0's latest net, or perhaps
merely the fact it is so large makes it weaker against stockfish in
tactics.

But also, I think SF has improved, and a lot of that improvement has
been driven by Lc0 pointing out flaws in SF in ways that sometimes are
clear and cannot be denied. And I believe that has happened again
during this superfinal, and SF probably will be taught to overcome
those flaws too. It makes me suspect the whole Lc0 approach is actually
not as strong as the stockfish approach - provided something is
helpfully pointing out flaws in stockfish so its developers can know
what they need to work on.


Shuo Xiang

unread,
Jul 2, 2020, 11:30:11 AM7/2/20
to Warren D Smith, LCZero
Agreed! 

So maybe the whole monte carlo/deep learning convolutional NN/zero approach is better suited to Go than Chess.

Edward Panek

unread,
Jul 2, 2020, 11:36:20 AM7/2/20
to LCZero
Ultimately it could go back and forth, especially if Leela is trained vs SF

Warren D Smith

unread,
Jul 2, 2020, 11:39:48 AM7/2/20
to Shuo Xiang, LCZero
But one thing Lc0 seems to have more of than SF (even if it ultimately
is weaker) is originality of its play and sometimes startling
positional ideas. When SF and other conventional engines play, it can
sometimes be very good, but sort of machine like and nothing really
fundamentally new, it is just big search. At best, you are awed by
just how much the search saw.

But Leela's play can sometimes make you feel like this is a genius
producing fundamentally new ideas about how to play chess. I get
that feeling with leela's and alpha0's play more than from anything
else I ever saw. E.g. even in this superfinal where leela went down
in flames, it still got one win with a positional queen sac, which is
something virtually never seen, they've occurred probably <10 times in
chess history.

Michael Elkin

unread,
Jul 2, 2020, 2:24:03 PM7/2/20
to LCZero
The alt final by nav https://www.twitch.tv/navratil25 is going better than tcec. I think lc0 is doing relatively better there, in the sense that whether she wins or loses, her eval is not as far behind SF's as in the main tcec. This tells me that there might be something wrong with the lc0 config at tcec, or some other issue. And further tests show 3972 is actually one of the weakest of the MLH nets, so maybe a poor choice in hindsight.

Something far more interesting, I think, is SFNNUE, which seems to be a NN using SF search. Could lc0 be adapted in a similar fashion, using SF or some other AB search instead of MCTS to train and use the network?

I don't know the details, but I am impressed with SFNNUE and wonder if parts can be applied to lc0?

Warren D Smith

unread,
Jul 2, 2020, 3:38:48 PM7/2/20
to Michael Elkin, LCZero
Stockfish's search would seem to disobey leela's "zero" philosophy. It
contains a lot
of human-made heuristics which have been tuned (with machine aid) to
optimize performance.

Shuo Xiang

unread,
Jul 2, 2020, 3:44:43 PM7/2/20
to Warren D Smith, Michael Elkin, LCZero
I think Michael is talking about a hybrid of Leela and Fish. Combining the best of both worlds. The final engine (or engine hybrid) does not necessarily obey the Zero principle all the time.


Shah

unread,
Jul 2, 2020, 4:50:39 PM7/2/20
to LCZero
Before asking why Leela lost and by such a margin you should ask how come (almost) no progress has been made since ~N62500 to now ~N64000 representing training by appx. 50M games.
(I hope I read the charts correctly)

glbchess64

unread,
Jul 2, 2020, 5:06:32 PM7/2/20
to LCZero
I suppose the only conclusion of this SuFi is that it is likely the worst in TCEC history (there was another really bad one before S10, but I do not remember which one).

The main issue is the opening choice, which heavily favours one of the engines: SF. A secondary issue is the problem with Leela's speed, partially solved after the server reboot (26 games before the reboot: SF +4; 74 games after the reboot: SF +3).

The opening choices heavily favour SF because a lot of them were won positions (11 of the 50 openings were won by both engines, more than 20%) and some other ones were also won positions where one of the engines missed the win, generally Leela. It is difficult to give a number for this because not all won pairs necessarily come from an initially won position.

This opening choice is scandalous because it favours SF too much: Leela is well known to be good at building winning positions and then not so good at concluding, and SF is well known to be not so good at building won positions but very efficient at concluding.

A good repertoire may contain won positions, but not so many, to be fair.

This is not only the comment of a Leela fan: the TCEC team also realized the problem and posted about it in the chat.

Warren D Smith

unread,
Jul 2, 2020, 6:30:30 PM7/2/20
to glbchess64, LCZero
SF 12 versus Leela 5 in terms of game-pair wins. (Unless somebody
wins the 100th game, still in progress - unlikely.) Based on that,
confidence SF is superior: 93%.

I'm really not impressed by glbchess64 complaining "openings favored SF."
And the slowdown for leela is entirely its own fault as far as I can tell.
I think TCEC is well justified in designing the tournament the way they do:
https://tcec-chess.com/articles/TCEC_Openings_FAQ.html

Shuo Xiang

unread,
Jul 2, 2020, 7:23:40 PM7/2/20
to Warren D Smith, glbchess64, LCZero
Can't help but agree with Warren.


Warren D Smith

unread,
Jul 2, 2020, 8:12:29 PM7/2/20
to Shuo Xiang, glbchess64, LCZero
Also, if TCEC did switch to using only main line openings, then the
criticism of Lc0 that it effectively had a "built in opening book" and
thus was "cheating" would become stronger, calling the results into
question and causing still more complaining.
[Even aside from all the other arguments for why TCEC should be the way it is.]

On the contrary: I think Lc0 needs to think about HOW it can become
better at playing in near-won and near-lost positions, and (similarly)
in playing "handicap chess." There is something wrong with Lc0's
current approach that causes those deficiencies, probably related to
monte-carlo-type search; and there probably are ways to overcome that
- I've suggested several myself, which as far as I can tell were
ignored, but maybe you all should think again, now your minds are a
bit more concentrated :)

PrinceZappa

unread,
Jul 2, 2020, 9:40:58 PM7/2/20
to LCZero
There's an assumption you can't go unreasonably high with endgame temp and T70 data supports that. But here tells a different story? 
https://docs.google.com/spreadsheets/d/e/2PACX-1vRBSLaHgDRlqiGTvPqi8FSdELXa1-78J2lFMptmdAlV3Ij0HrFicf2VlcmSrG7rutDQGnKcf9yX8dG9/pubhtml?gid=646620829&single=true 

So higher than usual endgame temp seems to advantage T60. I wonder if the larger net size with this current run, and especially so with SV nets, is more forgiving of higher temp and you could even get away with endgame temp 0.8 - 0.9. Like I said, what works for 128x10 may not prove correct for much bigger net size?! 
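
For readers unfamiliar with the term: in AlphaZero-style self-play, "temperature" controls how randomly the move to play is sampled from the root visit counts, with each move weighted by its visit count raised to the power 1/T. The sketch below shows only that basic idea; the visit counts are made up, and the details of how Lc0 applies and decays temperature during training games are not reproduced here.

```python
def move_probabilities(visit_counts, temperature):
    """Sampling probabilities proportional to visits ** (1 / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    weights = [n ** (1.0 / temperature) for n in visit_counts]
    total = sum(weights)
    return [w / total for w in weights]


visits = [100, 50, 10]  # hypothetical root visit counts for three moves
for temp in (1.0, 0.8, 0.5):
    probs = move_probabilities(visits, temp)
    print(temp, [round(p, 3) for p in probs])
```

Lower temperatures concentrate the probability on the most-visited move, while higher ones spread it out, which produces more varied (and on average weaker) training games.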

glbchess64

unread,
Jul 8, 2020, 11:09:50 PM7/8/20
to LCZero
AFAIK nobody asked TCEC to use mainline openings. The goal of TCEC is to organize tournaments for fun, and they are right to use weird openings. The S17 openings are likely the best that TCEC ever used, and they are far from mainline. The issue with the S18 openings is that more than a quarter of them are busted openings. The TCEC team realized during the competition that this was a big issue; they thought they had made a mistake. Yet the alt-SuFi on @Navs's stream has now finished with the same openings, and the result is 50-50, which shows that the main issue for Leela was the speed. Now we understand what happened: the CPU was not powerful enough to feed the 4 V100 cards, so they were underused. This issue is solved in lc0 v0.26, which was released just after the SuFi. It also appeared that MLH, which worked fine at low nodes, is not good at high nodes with the settings used in the SuFi, so the net in the SuFi played worse than the net in divP, which does not have MLH. The scaling tests that show this were finished just a few days ago.

In fact this S18 championship happened just a bit too soon and Leela was not ready: the new Leela version was not released and the new nets were not tested enough (building such big nets takes a lot of time; they are built with T60 data, and the second T60 sub-run is not even close to finishing: it still lacks an LR drop, and new parameters from T70 have only just been implemented).

The new bonus running right now at TCEC is also very interesting: the openings are just one move long. The engines play good openings, even SF, which lead to good balanced middlegames where SF cannot exploit its tactical ability, and SF lost one game against LS and one game against Leela at the beginning of the endgame (LS used the power of the fawn pawn and Leela a brilliant double pawn sac just to get better pieces; these are positional wins). This shows once more that the strength of SF is in the middlegame, when there is a medium number of pieces on the board and where it can, better than Tal, find brilliant scam ideas in unbalanced positions; but if the NNs resist that, they play the endgame better and can win the game despite the huge number of SF TB hits and SF's better play when there are a lot of pieces on the board. In the two games lost by SF, the transition from many pieces to few pieces was very quick and at the NN's initiative, so the tactical phase of the game was almost nonexistent. The TBs help a bit, but the most important thing is the understanding of the position. That is the reason why Crystal is generally better than SF in the endgame, even with fewer cores.

glbchess64

unread,
Jul 18, 2020, 11:16:48 AM7/18/20
to LCZero
Some details about the CPU bottleneck: without it, v0.26.0 reaches about 21% more nodes per second (nps) and, as a consequence, about 100% more nodes per move (npm):

Version   Net                    Depth      NPS         NPM
--------------------------------------------------------------
v0.25.1   SVjio-t60-3972 mlh       17      21634      4924731
v0.26.0   Lc0-SV-4300 mlh          22      26128     10346333


The nets have the same size, so about the same speed (maybe 1 or 2% difference, not more). v0.25.1 has a speed issue: the CPU does not manage to feed the 4xV100 graphics cards. v0.26 does not have the speed issue; it was released just after the SuFi and used for the bonuses. The reason why the effect is larger on npm is that not all nodes need the same amount of computation: the average time per node decreases as the size of the search tree increases.
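
As a quick check on the figures in the table above, the relative gains of v0.26.0 over v0.25.1 can be recomputed directly; this is just arithmetic on the quoted numbers, with a made-up helper name.

```python
def relative_gain(old, new):
    """Fractional gain of `new` over `old` (0.25 means 25% more)."""
    return new / old - 1.0


nps_gain = relative_gain(21634, 26128)        # nodes per second
npm_gain = relative_gain(4924731, 10346333)   # nodes per move

print(f"nps: {nps_gain:+.1%}")
print(f"npm: {npm_gain:+.1%}")
```

That works out to roughly 21% more nodes per second and about 110% more nodes per move, i.e. a little over double.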

The alternative SuFi by @Navs was a draw: same Leela/SF node ratio as the TCEC benchmark (which does not have the CPU bottleneck issue), 1/2231, same book, but fewer nodes (because of fewer cores for SF and fewer graphics cards).

So Leela can hold her own under similar conditions even with that heavily biased book. It is difficult to know the relative effect of the lower TC (in fact this was a higher TC, but with weaker hardware it corresponds to a lower TC on TCEC hardware) and of the better speed ratio (with only 2 RTX 2070 there was no speed issue with the CPU).

Jeff Wads

unread,
Jul 30, 2020, 12:05:13 AM7/30/20
to LCZero
I find it highly amusing that the Lc0 devs chose a net that is magnitudes slower than the net used in TCEC 17.  We are talking nearly 10 times slower with a GPU.  Makes one wonder.  Using a 2080ti, the TCEC 17 winner gets around 80K nps whereas the TCEC 18 loser gets a pathetic 8K nps on average.  Zoiks!





