How strong is Stockfish NNUE compared to Leela?

OmenhoteppIV

Jul 13, 2020, 5:14:27 AM

to LCZero

I just heard about this Stockfish NN thing. How strong is it, and does it also learn from scratch? Many thanks.

Dietrich Kappe

Jul 13, 2020, 8:33:51 AM

to LCZero

First, it should be noted that the NNUE technique comes from Shogi, where it has been on top for some time. The name is Efficiently Updatable Neural Network (EUNN) flipped around, for some reason (some language thing in Japanese?). It's a broad but shallow neural net that runs fast on CPU (~50% of SFDev's nps).
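
To make the "efficiently updatable" part concrete, here is a minimal sketch of the idea, not Stockfish's actual code. It assumes the first layer is a sum of weight columns for the currently active input features, so a move only needs the columns of the features that changed. The sizes, weights and function names are all made up for illustration.

```python
# Toy first layer: each active input feature contributes one weight column.
# Sizes and weights are invented for illustration; real nets are far larger.
NUM_FEATURES = 8   # hypothetical count of (piece, square)-style features
HIDDEN = 4         # hypothetical hidden-layer width

# Deterministic integer weights, W[f][j] for feature f and hidden unit j.
W = [[(3 * f + 5 * j) % 7 - 3 for j in range(HIDDEN)] for f in range(NUM_FEATURES)]

def full_refresh(active):
    """Recompute the accumulator from scratch: cost grows with len(active)."""
    acc = [0] * HIDDEN
    for f in active:
        for j in range(HIDDEN):
            acc[j] += W[f][j]
    return acc

def incremental_update(acc, removed, added):
    """Update the accumulator for a move: cost grows only with the changed features."""
    new_acc = list(acc)
    for f in removed:
        for j in range(HIDDEN):
            new_acc[j] -= W[f][j]
    for f in added:
        for j in range(HIDDEN):
            new_acc[j] += W[f][j]
    return new_acc

before = {0, 2, 5}
acc = full_refresh(before)
# A quiet move turns one feature off (piece leaves a square) and one on.
after = (before - {2}) | {6}
acc_after = incremental_update(acc, removed=[2], added=[6])
```

Both paths give the same accumulator, but the incremental one touches two weight columns instead of all active ones, which is where the CPU speed would come from under this assumption.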

It's trained on data made up of positions, the eval at depth N, and the game outcome (there's no reason that data couldn't come from other engines, like Komodo, or from a version of Stockfish NNUE itself). So far it looks like it gets stronger relative to SFDev at longer TC.
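
As a rough illustration of how a training target could be built from those two signals, here is a sketch that blends the depth-N eval with the game outcome. The blending weight `lam`, the `scale` constant and both function names are assumptions for the example, not values taken from the actual trainer.

```python
import math

def eval_to_win_prob(cp, scale=600.0):
    """Map a centipawn score to an expected score in (0, 1); `scale` is an assumed constant."""
    return 1.0 / (1.0 + math.exp(-cp / scale))

def training_target(search_cp, game_result, lam=0.5):
    """Blend the depth-N search eval with the game outcome.

    game_result is 1, 0.5 or 0 from the side to move's point of view.
    lam=1 trusts only the search eval, lam=0 trusts only the result.
    """
    return lam * eval_to_win_prob(search_cp) + (1.0 - lam) * game_result

# A position the search calls dead equal, from a game that was drawn.
equal_drawn = training_target(0, 0.5)
# A position the search likes (+200 cp), from a game that was lost anyway.
good_but_lost = training_target(200, 0.0)
```

The second case shows why the outcome is useful: it pulls the target down for positions the shallow search liked but that did not hold up in the game.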

Jesse Jordache

Jul 16, 2020, 3:17:37 PM

to LCZero

It's flipped because Germanic languages do everything backwards. To an English speaker (English being a Germanic language) the adjective comes first; virtually everywhere else, the noun comes first (and adjectives come before the adverbs that modify them, so in most languages the words are in descending order of importance). So it's a Network Neural Updateable Efficiently. Since most technical papers are written in English, though, EUNN is standard.

Anyway, it's probably stronger than Leela, but I feel like there's no way a powerful eval with a minimax-a/b search is the optimal strategy for chess engines: you're throwing so much search time after moves that should be pruned at the root.

The problem with PUCT-based searches, though, is that getting the search to run in parallel is NO JOKE. Unlike minimax/AB, where you can basically divide up the search space among the available hardware, MCTS/UCT works by consulting the policy and the statistics left by previous rollouts, picking a leaf to expand, then backpropagating the result so the values and visit counts along the path reflect it, and repeating until an arbitrary point where you cut the search off. It's a process where each step is informed by the previous steps, so multithreading is either not done or it's wildly inefficient. So while an EUNN engine is wasting search time, it can afford to do so because it uses the hardware so much more efficiently. SF NNUE reminds me of the proverbial millionaire lighting his cigars with hundred-dollar bills.
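
The sequential dependence is easy to see in the selection rule itself. Below is a minimal sketch of the AlphaZero-style PUCT score; the constant `c_puct=1.5`, the dict layout and the example numbers are assumptions for illustration, not Lc0's actual settings.

```python
import math

def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    """Value estimate plus an exploration bonus that shrinks as the child is visited."""
    exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + exploration

def select_child(children, c_puct=1.5):
    """children: list of dicts with keys 'move', 'q', 'prior', 'visits'."""
    parent_visits = sum(ch["visits"] for ch in children)
    return max(
        children,
        key=lambda ch: puct_score(ch["q"], ch["prior"], parent_visits, ch["visits"], c_puct),
    )

# Hypothetical statistics after 12 visits at the root.
children = [
    {"move": "e4", "q": 0.10, "prior": 0.50, "visits": 10},
    {"move": "d4", "q": 0.08, "prior": 0.30, "visits": 2},
    {"move": "a3", "q": 0.00, "prior": 0.01, "visits": 0},
]
chosen = select_child(children)
```

Every score depends on `q` and `visits`, which are only correct once earlier simulations have been backed up. A second thread that selects before that happens sees stale numbers and tends to walk down the same path.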

There's an alternative to PUCT called WU-UCT (https://github.com/liuanji/WU-UCT) which I think looks promising. I don't have the skills to throw one together, and even if I did I'd have no way to test it since I only have one graphics card. But if you could get something like this to work, with 4 V100s I feel like it would stomp the competition. I wonder if the main devs know about it. Do they? Maybe they don't. I'll check on Discord.
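
For readers who have not looked at it, the core idea of WU-UCT, as far as I understand it, is to count simulations that have been started but not yet finished, so parallel workers avoid piling onto the same node. The sketch below is my reading of that idea rather than code from the linked repository; the function name and `beta` default are assumptions.

```python
import math

def wu_uct_score(value, n_child, o_child, n_parent, o_parent, beta=1.0):
    """UCT-style score where in-flight simulations (o_*) count as visits.

    n_* are completed visit counts, o_* are simulations dispatched to workers
    whose results have not been backed up yet.
    """
    total_child = n_child + o_child
    if total_child == 0:
        return math.inf  # never tried and nothing in flight: explore it first
    return value + beta * math.sqrt(2.0 * math.log(n_parent + o_parent) / total_child)

# Same node, same completed statistics, with and without workers already on it.
idle = wu_uct_score(0.5, n_child=4, o_child=0, n_parent=10, o_parent=0)
busy = wu_uct_score(0.5, n_child=4, o_child=3, n_parent=10, o_parent=3)
```

With three workers already exploring the node, its score drops, so the next worker is steered elsewhere without waiting for those results.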

Warren D Smith

Jul 16, 2020, 3:31:39 PM

to Jesse Jordache, LCZero

TCEC 18 bonus ran a test of SF-NNUE versus the top 4 TCEC finishers.
SF-NNUE got 100% draws versus plain-SF, LcZero, Komodo;
beat StoofvleesII 3 wins to none, and lost to AllieStein 1 win to none.

--
Warren D. Smith
http://RangeVoting.org <-- add your endorsement (by clicking
"endorse" as 1st step)

glbchess64

Jul 16, 2020, 9:55:37 PM

to LCZero

SF NNUE looks like SF with a brain (to quote @Navs).

Now that the TCEC bonus has ended and @Navs's tests are also finished, it is time to answer this question. Positionally, SF NNUE is simply far beyond SF, and tactically it looks like SF.

SF NNUE is now strong enough to withstand the precise positional play of the NN engines. Clearly its moves do not have the positional accuracy of the NN moves, but unlike SF's moves they are still good moves when they are not the best. It is not as fast as SF, but it is fast enough to withstand SF's tactical power. In fact it seems that SF is dead and that SF NNUE is the future of SF.

The long TC tests (@Navs and TCEC) show that SF NNUE is at the same level as AS, LS or Leela and better than Stoof. The CCC tests may be disregarded: CCC used a buggy binary.

SF has a very good search algorithm, but the issue is that this algorithm has nothing to search for except combinations, because its eval function is simply ridiculous: it lacks basic chess concepts. SF NNUE simply solves this issue: it gives the SF algorithm something interesting to search for (cf. the @Navs quote at the beginning of the post).

SF NNUE is still in its infancy and will progress as its NN gets stronger. If it plays in the next TCEC season, I would not be surprised if it wins the SuFi. Long live SF NNUE: it is the first real hybrid between NN engines and AB engines.

The first time I heard someone speak of an eval function built with NN learning was at the very end of the 80s (a young French researcher, Jean-Christophe WEILL, who built a chess program for his thesis; it was a conclusion of his work). And 30 years later, someone did it, and it works! I don't know if anybody tried it in the meantime and rejected the idea for some reason.

Robert Pope

Jul 17, 2020, 9:19:47 AM

to LCZero

Weren't Giraffe and KnightCap both alpha beta NN engines? SF NNUE may be the strongest, but I'm pretty sure it isn't the first. They just weren't strong enough, probably in large part due to the lack of computation speed available at the time.

Brian Richardson

Jul 17, 2020, 11:14:45 AM

to LCZero

Giraffe is A/B. Interestingly, in addition to an NN for evaluation, it also tried to use an NN for move selection (although with less success).

KnightCap originally used temporal difference learning to set the evaluation parameter weights, but all of the terms were hand-crafted (so not an NN engine).

There may have been subsequent versions with some element of NN, but I just don't recall (not that my memory is what it used to be).

Robert Pope

Jul 17, 2020, 11:41:47 AM

to LCZero

You could be right about KnightCap, it's been quite a while since I read their paper.

Warren D Smith

Jul 17, 2020, 2:08:12 PM

to Robert Pope, LCZero

On 7/17/20, Robert Pope <esch...@gmail.com> wrote:
> Weren't Giraffe and KnightCap both alpha beta NN engines? SF NNUE may be
> the strongest, but I'm pretty sure it isn't the first. They just weren't
> strong enough, probably in large part due to the lack of computation speed
> available at the time.

--I believe you are correct. However, given that SF-NNUE runs on an ordinary CPU, no GPU needed, and SF-NNUE gets about 50% of plain SF's node rate, as opposed to 1% or less... that is not quite right. The NNUE algorithm and net are designed to be fast, and that worked. The others were by comparison stupid about failing to get speed.

There is a large spectrum of how fast versus how smart an eval can be, and if it is correct that SF is still stronger than SF-NNUE, then NNUE is not smart enough -- but it is still probably smarter in the right circumstances. LcZero is more like 1000x slower, but much smarter, than SF, which was good enough for it to win the TCEC championship twice, but it currently looks like SF has regained an edge over LcZero, while the LcZero guys resort to their usual whining about how they deserved better speed but the mean mean world robbed them of it. Still, if that whining is valid, LcZero may be able to come back next time with at least 20 extra elo.

glbchess64

Jul 17, 2020, 5:52:28 PM

to LCZero

It is only partly true to say that SF is stronger than SF NNUE: SF NNUE is weaker at short TC (FishTest) but stronger at long TC, as the TCEC and @Navs tests show. And when you look at the games at long TC against the top 3 NN engines, it is really obvious, especially in the games from balanced and sound positions: SF NNUE knows how to play chess, while SF only knows how to find combinations. When there is no combination to find, SF's position becomes worse, even against Stoof, but SF NNUE's position remains sound because it searches not only for combinations but also for a good pawn structure given the pieces on the board. Unlike SF, SF NNUE knows, in some way, which pieces suit which pawn structure.

Warren D Smith

Jul 17, 2020, 8:37:12 PM

to glbchess64, LCZero

An interesting possibility is a "triangle" SF > SFNNUE > LcZero > SF
where ">" means "will defeat in long match."

To be clear: I have no idea whether that is true... I'm just saying it is a possibility that could happen either now or later. E.g. it might be that SFNNUE gains enough positional smarts from its neural net that LcZero will not be able to defeat it as often, but will retain enough speed to out-tactic LcZero and hence win most of the games SF would have won.

If such a triangle happens, it will be rather hard to contend that any of the 3 programs is "the best"...

glbchess64

Jul 18, 2020, 3:05:20 AM

to LCZero

SF > SFNNUE > LcZero > SF is possible, but the differences are so small that the book used for the test is a major factor, and the TC also has an impact. You can see that on @Navs's stream: sometimes he uses the TCEC SuFi books and sometimes his own book, which is more balanced. And the results are a bit different: unbalanced openings are better for SF. He used the same ratio for the TC as in the TCEC benchmark (no node CPU bottleneck in the benchmark), which gives 2,231 more nodes for SF than for Leela. To do this he adjusts the frequency of his two 2070 cards. He used a longer TC but cannot reach the impressive TCEC number of nodes.

Results:

Leela draws the alternative SuFi (which shows the effect of TC, since the node ratio is the same)

https://discordapp.com/channels/425419482568196106/530486338236055583/730702939005648946

Leela and LS are better than SF NNUE, which is better than SF (Navs's book is more balanced than the SuFi book)

https://discordapp.com/channels/425419482568196106/430695662108278784/733126038867017838

These engines are very close in level, and it is easy to change the order by changing the TC and the book.

And if you change the node ratio between Leela and SF you will also get different results. The TCEC ratio heavily favours SF compared with a personal computer: I have one where the motherboard + CPU cost about the same as the graphics card, and SF has absolutely no chance against Leela in those conditions (11 or 12 threads/6 cores for SF and an RTX 2060 for Leela).

The net is also important: we have several nets at about the same level, and it seems that for the S18 SuFi we chose a net that is worse (more blunders) than the TCEC S17 SuFi net. (There were not enough long TC tests when the choice was made, and too many candidates dividing the testing resources.)

My feeling is that SF NNUE is just at the beginning and will soon be the major opponent to Leela. Of all these engines, it is the one with the most room to improve.

Dietrich Kappe

Jul 18, 2020, 8:16:12 PM

to LCZero

SF nnue is a fast moving target. A change in compile adds 50-100 elo, new nets add strength. It’s almost like leela. ;-)

Felix Zaslavskiy

Jul 18, 2020, 8:33:10 PM

to Dietrich Kappe, LCZero

Are there any discussions about incorporating AVX-512 instructions for the NNUE neural net? As far as I can tell from the source, they are using AVX2.

The nps could very well double...

TCEC & CCC server hardware can execute AVX-512. Maybe the developer doesn't own the latest Intel CPU yet, but it should not be a problem to rent one in the cloud for testing.

https://github.com/nodchip/Stockfish/tree/master/src/eval/nnue

Warren D Smith

Jul 18, 2020, 9:49:13 PM

to Felix Zaslavskiy, LCZero

On 7/18/20, Felix Zaslavskiy <felix.za...@gmail.com> wrote:
> Are there any discussions about incorporating AVX-512 instructions for the
> NNUE neural net? As far as I can from the source they are using AVX2.
> The nps could very well double...

--I would certainly recommend that for machines supporting those instructions.

Felix Zaslavskiy

Jul 19, 2020, 1:11:18 AM

to Warren D Smith, LCZero

I see they committed AVX512 just 2 hours ago.

It will be interesting to see how that speeds things up.

stj...@gmail.com

Jul 19, 2020, 10:19:53 AM

to LCZero

For the record, SF NNUE sees the famous Bg5 move in the AZ-SF8 game of 2017 fairly quickly. Something the normal SF11 was unable to find.

glbchess64

Jul 19, 2020, 3:20:47 PM

to LCZero

I let the 3 engines run for about the same time (a little bit more for SF and SF NNUE since they failed to find the move).

Lc0 v0.26 + SV-4300, default settings: finds 21.Bg5!! at depth 22, ~3M nodes, eval +1.21

Stockfish12 dev 20071720, default settings: does not find the move at depth 48, 1.6T nodes.

SF NNUE 090720, default settings: does not find the move at depth 42, 0.77T nodes.

So I think more details are needed.

stj...@gmail.com

Jul 19, 2020, 4:05:58 PM

to LCZero

I ran the same test as you did on my hardware (RTX 2080 and 8-core intel i9).

SF11 - N/A after 30 minutes of thinking (~20,000,000 nps)

SFNN - 0 minutes 12 seconds (~13,000,000 nps)

Lc0-T60 - 7 minutes 2 seconds (~8500 nps)

Interestingly SFNN thinks it's winning even before it sees Bg5!! It thinks b4! is +3 before finding Bg5!! at +4.5
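
To compare the three runs by work done rather than by wall-clock time, the timings above can be turned into approximate node counts. This is only arithmetic on the rounded nps figures quoted in this post, so treat the results as ballpark numbers.

```python
def nodes_searched(nps, seconds):
    """Approximate nodes visited, assuming the quoted nps held for the whole search."""
    return nps * seconds

sfnn_nodes = nodes_searched(13_000_000, 12)        # found Bg5 after 0 min 12 s
lc0_nodes = nodes_searched(8_500, 7 * 60 + 2)      # found Bg5 after 7 min 2 s
sf11_nodes = nodes_searched(20_000_000, 30 * 60)   # no Bg5 after 30 min

ratio = sfnn_nodes / lc0_nodes
print(sfnn_nodes, lc0_nodes, sf11_nodes, round(ratio))
```

So SFNN needed roughly 43 times as many nodes as Lc0 to find the move, and the Lc0 figure is in the same range as the ~3M nodes reported earlier in the thread, while SF11 looked at tens of billions of nodes without finding it.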

Dietrich Kappe

Jul 19, 2020, 6:32:00 PM

to LCZero

If anyone is interested, here is LizardFish 0.1, an NNUE net trained on only 6 million positions evaluated by Komodo 14 at depth 8. It's somewhere between the last Texel and Komodo 13 in strength and plays with its own style. https://www.patreon.com/posts/39496596

Alexander Tanseco

Aug 11, 2020, 2:27:16 PM

to LCZero

Would 10 min + 10 secs be long enough for SF Dev NNUE enabled to overcome old SF Dev without NNUE? My experience is that @ 3 min + 3 secs, old SF Dev without NNUE clobbers SF Dev NNUE enabled.

cgarc...@gmail.com

Aug 29, 2020, 6:07:21 PM

to LCZero

It seems there are no papers on EUNNs? I only found one in Japanese.
From what I understand, NNUE's evaluation function is only trying to predict what Stockfish's evaluation function will say N moves into the future. It's a cheap way to get data, and as a bootstrap mechanism it might have benefits similar to those of the Q function.
However, it seems pretty clear that Stockfish NNUE's performance is bounded by how good Stockfish's evaluation function is.

What's more, you can even do this NNUE trick with Leela itself, that is, learn a smaller and more efficient neural network that produces outputs similar to those of a bigger network. This is called distillation. Given that Leela's weakness is speed, a distilled version makes a lot of sense.
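
A toy version of distillation, to show the mechanics: a small "student" is fitted to the outputs of a fixed "teacher" instead of to game results. Everything here is a stand-in chosen for illustration (the teacher is just a tanh curve, the student a single linear unit, the inputs a grid of numbers), so it says nothing about how well this would work for Leela.

```python
import math

def teacher(x):
    """Stand-in for a large, slow network: any fixed function of the input works here."""
    return math.tanh(1.5 * x)

def student(params, x):
    """Tiny, fast model: a single linear unit a*x + b."""
    a, b = params
    return a * x + b

def distill_loss(params, xs):
    """Mean squared error between student and teacher outputs (no game results involved)."""
    return sum((student(params, x) - teacher(x)) ** 2 for x in xs) / len(xs)

def distill(xs, steps=200, lr=0.1):
    """Plain gradient descent on the distillation loss."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        grad_a = sum(2 * (a * x + b - teacher(x)) * x for x in xs) / len(xs)
        grad_b = sum(2 * (a * x + b - teacher(x)) for x in xs) / len(xs)
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b

xs = [i / 10 for i in range(-10, 11)]   # 21 sample "positions" in [-1, 1]
initial = distill_loss((0.0, 0.0), xs)
trained = distill(xs)
final = distill_loss(trained, xs)
```

The student ends up much closer to the teacher than where it started, but it cannot match it exactly, which is the usual trade: speed is bought with some loss of accuracy.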

ronnie millsap

Aug 30, 2020, 9:50:22 AM

to LCZero

Interesting: take the same approach SF has, just filling in the gaps of the NN instead of the AB.

Damas Clásicas

Aug 31, 2020, 2:14:13 AM

to LCZero

>My experience is that @ 3 min + 3 secs, old SF Dev without NNUE clobbers SF Dev NNUE enabled.

Sorry, but what kind of pathetic and mediocre test are you doing?

SF NNUE is MUCH stronger than classical SF; the last RT test is giving +131 Elo on 1 core over SF11!
And with more cores it also surpasses the 100 Elo barrier!

ronnie millsap

Aug 31, 2020, 11:51:30 AM

to LCZero

shut up, the importance ir rather or not its silly is up the othe person doing it. fuck your mother

Robert Clark

Aug 31, 2020, 11:59:01 AM

to ronnie millsap, LCZero

Ronnie, I think your keyboard has been hacked.

glbchess64

Aug 31, 2020, 4:53:31 PM

to LCZero

While it is clear that SF NNUE surpasses SF by a lot at all TCs when facing AB engines, there are just a few Elo points between them at long TC and with a lot of threads when both face pure NN engines. The +100 Elo gain claims are based on fishtest and give a very partial view of reality.

In fact I have never seen a test at long TC and with a lot of threads (TCEC, CCC, @Navs's stream...) that showed a significant difference between SF and SF NNUE when facing Leela, AS or LS. The reason is that at long TC the difference is far less than 100 Elo, and there are not enough games, so sampling fluctuations linked to the opening choice lead to big uncertainty.
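
To put numbers on the uncertainty, here is a small sketch using the standard logistic Elo formula and a normal approximation for the error margin. The 100-game result below (20 wins, 70 draws, 10 losses) is hypothetical, chosen only to resemble a draw-heavy long TC match; it is not taken from any of the tests mentioned.

```python
import math

def expected_score(elo_diff):
    """Expected score of the stronger side for a given Elo difference."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def elo_from_score(score):
    """Inverse of expected_score; score must be strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(wins, draws, losses, z=1.96):
    """Elo estimate with an approximate 95% interval from a W/D/L record."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1 - score) ** 2 + draws * (0.5 - score) ** 2 + losses * score ** 2) / n
    se = math.sqrt(var / n)
    return elo_from_score(score), elo_from_score(score - z * se), elo_from_score(score + z * se)

estimate, low, high = elo_interval(wins=20, draws=70, losses=10)
print(round(expected_score(100), 3), round(estimate), round(low), round(high))
```

A +100 Elo edge corresponds to scoring about 64%. The hypothetical 55% score works out to about +35 Elo, and even after 100 games the interval still includes zero, so a match of that size cannot separate two engines that are this close.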

Damas Clásicas

Aug 31, 2020, 5:21:18 PM

to LCZero

>If it is clear that SF NNUE surpass a lot SF at all TC when facing AB engines, there is just a few elo points between them at long TC and with a lot of threads when both face pure NN
> engines. The +100 Elo gain claims are based on fishtest and give a very partial view of the reality.

I said: +131 Elo points stronger than Stockfish 11. I didn't talk about other engines.

>In fact I never saw a test at long TC and with lot of threads (TCEC, CCC, @Navs stream...) that showed a significant difference between SF and SF NNUE when facing Leela, AS or >LS. The reason is that at long TC, the difference is far less 100 elo and there is not enough games so that sampling fluctuations linked to opening choice lead to big uncertainty.

Of course you NEVER saw a long TC test, because NNUE is still a very new design and we haven't had enough time to test it at longer time controls. But at least at 3'+2'' time controls, Stockfish NNUE crushed Leela in Mark Young's tests.

About CCC, it really does NOT use longer time controls (only bullet), and the TCEC results cannot be considered very exact, because the engines only play a few games against each other in "DivP", while the Superfinal has only about 100 games, with overly skewed openings.

Warren D Smith

Aug 31, 2020, 5:35:25 PM

to Damas Clásicas, LCZero

I prefer it if INDEPENDENT testers (e.g. not associated with stockfish team)
prove the superiority of SF-NNUE or whatever. Hence, TCEC and chess.com.

TCEC is now running premier division for season 19: https://tcec-chess.com/
hopefully enabling us to see for ourselves how superior the latest
stockfish is, versus everything else. If it really is over 100 elo
above everything else, that is so big that it ought to be
statistically clear despite the fact TCEC tourneys do not involve a
huge number of games.

glbchess64

Aug 31, 2020, 10:08:58 PM

to LCZero

I consider that TCEC and CCC use extravagant hardware which, combined with the TC, gives a lot of nodes per move. I have never seen any test with more nodes per move.

Warren: no doubt SF NNUE will have a very exceptional result in DivP, since there are "weak" engines to beat, and for this SF NNUE is the best engine ever seen. But I doubt that it will crush AS or LS. I think it will beat them, but by the usual margin.

glbchess64

Aug 31, 2020, 10:10:13 PM

to LCZero

Oops: LS = Leela.

Warren D Smith

Aug 31, 2020, 11:15:43 PM

to glbchess64, LCZero

Presently Ethereal and LcZero are leading TCEC divP, but it has barely started.

Allegedly SF-NNUE is over 100 elo stronger than SF without NNUE, although
not everybody outside of the stockfish testing community has seen that
huge strength boost. And SF-NNUE ought to be way stronger than LcZero
and Allie based on the theory it has a huge speed advantage over them
without much brains disadvantage anymore.

But that is all theory. What I want from TCEC is reality... we'll see :)

ronnie millsap

Sep 1, 2020, 9:21:18 AM

to LCZero

meh you get what you ask for. dont insinuate rudeness, i wont fuck your family and hope that your children die from cancer :)