Anti-Stockfish Leela Project

Cscuile

unread,
Sep 22, 2018, 11:54:04 PM9/22/18
to LCZero
TLDR: The basic idea of this project is to train a neural network with the sole purpose of beating Stockfish Dev. The games it produces can then be used to improve Stockfish's evaluation function.

This idea has been mentioned a few times on the Leela Discord, but it never gained any traction. I doubt this post will change anything, but I hope it brings a few possible ideas to attention.

The Anti-Fish Leela project specifically trains Leela against Stockfish Dev. At first a non-zero, already-trained network will likely be needed; something in the range of ID 100-150 could work. Generate games by playing Anti-Fish Leela against SFDev at a low node count, then increase Stockfish's nodes as needed up to whatever the available resources allow.
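
As a rough sketch of the game-generation step (using python-chess; the engine paths, node counts, and game count are placeholders, not a finished pipeline):

import chess
import chess.engine
import chess.pgn

LC0_PATH = "lc0"        # assumed lc0 binary with the bootstrap network configured
SF_PATH = "stockfish"   # assumed Stockfish dev binary
LC0_NODES = 800         # shallow visits on Leela's side, as in normal training games
SF_NODES = 10_000       # start low, raise as Anti-Fish improves

def play_one_game(lc0, sf, leela_is_white):
    board = chess.Board()
    game = chess.pgn.Game()
    node = game
    while not board.is_game_over(claim_draw=True):
        leela_to_move = (board.turn == chess.WHITE) == leela_is_white
        engine, nodes = (lc0, LC0_NODES) if leela_to_move else (sf, SF_NODES)
        result = engine.play(board, chess.engine.Limit(nodes=nodes))
        board.push(result.move)
        node = node.add_variation(result.move)
    game.headers["Result"] = board.result(claim_draw=True)
    return game

lc0 = chess.engine.SimpleEngine.popen_uci(LC0_PATH)
sf = chess.engine.SimpleEngine.popen_uci(SF_PATH)
with open("antifish_games.pgn", "w") as out:
    for i in range(100):   # alternate colours each game
        print(play_one_game(lc0, sf, leela_is_white=(i % 2 == 0)), file=out, end="\n\n")
lc0.quit()
sf.quit()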

In theory, it should be possible to produce a neural network that beats Stockfish by a considerable margin, since the network will specifically target holes and weaknesses in Stockfish's evaluation. We have already seen a variety of SF vs Leela games where Stockfish misevaluates heavily. With an Anti-SF-specific engine, these evaluation holes will be far clearer, easier to ascertain, and ultimately easier to patch.

As of right now these are just ideas for possible ways to train an Anti-Stockfish Leela. If you are interested in this project and want to help out, please join the following Discords.

Stockfish Discord: https://discord.gg/nv8gDtt
Leela Discord: https://discord.gg/EeuXbYW

A Thule

unread,
Sep 23, 2018, 12:35:48 AM9/23/18
to LCZero
If you simply train lc0 on Stockfish dev games, you get a NN that is a poor imitation of Stockfish dev in style and play. That isn't going to produce either a stronger Stockfish dev or a stronger NN that can beat Stockfish. If you want an engine that plays like Stockfish, use Stockfish. If you want to develop a NN that beats Stockfish, you train it on human games first, then on strong engine games. Then, eventually, you stop restricting the games it learns from and let it train on all possible games, allowing it to make random Monte Carlo moves against an equal-strength partner, in the hope that it picks up a style neither humans nor Stockfish dev can anticipate, improving on its own and forging an optimal style of play not narrowed by human or AB-search bias.

Both in logic and as validated by tests, restricting lc0 to training only on Stockfish dev games results in a poor imitation of AB-engine play, not some uber-Stockfish.
I'm not sure why people refuse to see that when lc0 strengthens itself from the set of all possible moves, its potential is far greater than when it learns from a set determined only by previous Stockfish play. If you want Leela to beat Stockfish, you have to let her see what Stockfish does not, not just what Stockfish already does ....

There’s a reason there isn’t much uptake here ..

Cscuile

unread,
Sep 23, 2018, 1:03:33 AM9/23/18
to LCZero
A Thule,

If Leela were trained purely on Stockfish games, then yes, Leela would be a poor imitation of Stockfish. But as I said, training happens through Leela's play against Stockfish, not from Stockfish games alone. The games generated from the Leela vs Stockfish Dev matches are collected into a batch. When that batch is used in training, backpropagation minimizes the error on Leela's moves with respect to the outcomes against Stockfish. This rewards the specific moves Stockfish is particularly sensitive to; in other words, the moves that lead to a higher probability of winning against Stockfish. And of course the process is repeated with an updated network each iteration, just as with Lc0. So I believe this method should work.
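
To make that concrete, here is a minimal sketch of the update step I have in mind (plain PyTorch, not the actual lc0 training code; "net" is any policy-plus-value network, z is the game result from Leela's point of view, and pi is the move-probability target built from the Anti-Fish vs SF games):

import torch
import torch.nn.functional as F

def antifish_update(net, optimizer, batch):
    planes, pi_target, z_target = batch              # positions, policy targets, results in {-1, 0, +1}
    logits, value = net(planes)                      # move logits and a scalar value head
    policy_loss = -(pi_target * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
    value_loss = F.mse_loss(value.squeeze(-1), z_target)
    loss = policy_loss + value_loss                  # regularization left to the optimizer
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

Moves that actually won against Stockfish arrive with z = +1, so the value head, and through the policy target the policy head, get pushed toward exactly the lines Stockfish handles badly.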

With that said, this neural network will be very weak against engines other than Stockfish. I understand this. The purpose is not to create a generalized engine that beats all other engines; the purpose is to create an engine that solely targets Stockfish's weaknesses, so that holes can be found in Stockfish's evaluation. Once these holes are found, they can be patched to improve Stockfish's overall playing strength.

Cscuile

unread,
Sep 23, 2018, 1:06:34 AM9/23/18
to LCZero
If there is a flaw in my explanation above, please do tell me. In my mind it should work, but there may be a small detail I am overlooking that nullifies the feasibility of this idea.

Thanks. 

Aditya

unread,
Sep 23, 2018, 1:58:50 AM9/23/18
to LCZero
It will make Leela draw more against Stockfish, not win against Stockfish.

Moritz Buchty

unread,
Sep 23, 2018, 3:18:51 AM9/23/18
to LCZero

You would have to adapt the training routine a bit, but it should work.

If you use the standard training routine (as with all those self-play games), Leela would learn to be a Stockfish killer against one specific version, but she would overfit to whatever flaws are in it.
To explain that thought, imagine her being trained against random moves: she would very soon learn the Scholar's Mate and then always bring her queen out too early; a killer against that opponent but totally useless elsewhere.
So this would be useful for finding specific faults (they get highlighted when a version of Leela overfits to them), but the cost would be very high: several weeks of training games until the NN is useful, all just to bug-fix a single version, and then restarting from scratch with the next version.

You need to adapt the current training in a way that doesn't overfit and in a way that can constantly be reused.

Remove from the training batch all games against other programs where Leela won (those wins could be against a weak opponent or caused by the opponent's errors, so you never know whether you really want to burn those move orders into her NN as the way to go).
But if you use only the games she lost, you can always be sure you want to learn from them (because the opponent just proved it was stronger or made fewer mistakes).

With this technique you could basically use all the games Leela currently plays against any program or human.
Negative feedback on wrong moves is, in the long term, as good as reinforcing the good moves (but without the risk of overfitting to weak opponents). Training only on that would take much longer, since she does not lose that often.
But it could be very valuable as an enhancement to the standard training, since preferences for faulty moves get kicked out more quickly and she becomes a bit more bulletproof against the tricks of the engines she is now trained against.

Generate as many games against Stockfish as she currently generates in self-play and let the devs include those games in the current training, and you will have your Stockfish killer without having to train a separate project.
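
As a rough sketch of the loss-only filter (python-chess; the player-name matching via PGN headers is just an assumption about how the games are tagged):

import chess.pgn

def losses_only(pgn_path, leela_name="Lc0"):
    kept = []
    with open(pgn_path) as f:
        while True:
            game = chess.pgn.read_game(f)
            if game is None:
                break
            result = game.headers.get("Result", "*")
            leela_white = leela_name in game.headers.get("White", "")
            leela_lost = (result == "0-1") if leela_white else (result == "1-0")
            if leela_lost:
                kept.append(game)      # keep only games where the opponent proved stronger
    return kept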

James Thomas

unread,
Oct 5, 2018, 12:45:20 PM10/5/18
to LCZero
I think your explanation is good and the idea is sound. I don't see any obvious problems, though I am not familiar enough with the training pipeline to know how to implement it. It is also possible that the anti-SF engine would be more general than you suggest, assuming the various engines have adopted closely related evaluation and tree-pruning methods, which I think is true. No doubt it would perform worse against them than against SF, but maybe not by a lot.

It seems as if this is an idea that the Stockfish dev folks should be very interested in - have you suggested it to them? Or a collaboration?

Until Lc0 passes the "reproduce AlphaZero" mark, and perhaps wins the championship, it is unlikely there will be much traction with the Lc0 devs, because the training effort there is already stretched thin.

Selçuk Soner Akgül

unread,
Oct 5, 2018, 6:09:07 PM10/5/18
to LCZero
I agree with you. She only imitates SF.

Pawel Powet

unread,
Oct 5, 2018, 8:21:21 PM10/5/18
to LCZero
You guys make me laugh. There is no such thing as an anti-Stockfish style of play, or an imitation of SF's style of play, etc. Chess for computers is just 0's and 1's, binary code, and as such it can't imitate anything, because chess is not about imitating anything; chess is about finding the best possible moves in a given position. Stockfish is simply the best because it calculates and applies the best possible variations in a given time better than any other engine. It plays almost perfect chess, and that means if you want to beat it, you have to be even more perfect. So what is the purpose of imitating SF rather than simply being better? Why do you guys, instead of focusing on playing better chess, want to pretend to be Stockfish and not Leela? This makes no sense. Stockfish is not a wizard; it is just a piece of code that is certainly not perfect, but it is the best one at this time, and you guys, instead of focusing on creating your own unique piece of code that is better than SF, are trying to imitate it!

Trevor G

unread,
Oct 5, 2018, 8:42:00 PM10/5/18
to Pawel Powet, LCZero
I, for one, think that creating an engine targeted at exploiting and defeating Stockfish via machine learning would be an interesting research project in its own right. Perhaps it could yield improved ideas for self-play "zero" learning, and it could also help the devs of Stockfish and other traditional engines better understand particular weaknesses in their software.

Anyway, the goal would not be to “imitate” stockfish, but rather use machine learning methods to discover holes and weaknesses that can be exploited.



Matt Blakely

unread,
Oct 5, 2018, 8:59:53 PM10/5/18
to LCZero
This only helps if Leela's full potential is only a little better than SF's.

If, however, Leela becomes much stronger than SF (without being trained specifically against SF), then you don't need an anti-SF version for SF to learn from. E.g., if she reaches 4000 Elo, SF can learn from her regardless.

I guess the consideration is how soon she can get there.


Cscuile

unread,
Oct 5, 2018, 9:07:19 PM10/5/18
to LCZero
Matt,

That brings up an interesting topic. From all the data we have so far, it is safe to conclude that at this level of chess, and perhaps beyond, there are different kinds of Elo. Even if Leela reaches 4000 "universal" Elo, a version of Leela that targets SF's specific weaknesses would still be more effective against SF than the 4000-Elo Leela, since Anti-Fish Leela would deliberately push SF into positions it misevaluates. That is something a "universal" Elo Leela cannot do.

Cscuile

unread,
Oct 5, 2018, 9:09:08 PM10/5/18
to LCZero
+Selçuk Soner Akgül

I will repeat what I said before. 
TLDR: Leela is not trained on SF games alone; she is trained against Stockfish, with increasing node counts.

If Leela were trained purely on Stockfish games, then yes, Leela would be a poor imitation of Stockfish. But as I said, training happens through Leela's play against Stockfish, not from Stockfish games alone. The games generated from the Leela vs Stockfish Dev matches are collected into a batch. When that batch is used in training, backpropagation minimizes the error on Leela's moves with respect to the outcomes against Stockfish. This rewards the specific moves Stockfish is particularly sensitive to; in other words, the moves that lead to a higher probability of winning against Stockfish. And of course the process is repeated with an updated network each iteration, just as with Lc0. So I believe this method should work.

Pawel Powet

unread,
Oct 5, 2018, 9:25:27 PM10/5/18
to LCZero
As I understand it, this is about training Leela against SF instead of against herself. That sounds reasonable; playing against the strongest engine could accelerate development. What I don't understand is all this gibberish about an anti-SF-specific engine or an imitation of SF. There is no such term in computer chess, and it is misleading. If you want to beat SF you can't play anti-SF variations; you have to play better chess, that is all. In other words, the terms you guys are using are misleading because they suggest there is a magical way to play some anti-SF style or variations, when in fact it means you have to calculate better than SF does. You could also train Leela against herself to achieve a similar effect, but that doesn't mean she becomes some anti-SF engine; it only means her understanding of chess gets better. In the same way we could call SF an anti-Houdini-specific engine because it wins most of the time, when in fact it is better simply because it evaluates chess positions better.

James Thomas

unread,
Oct 5, 2018, 10:12:06 PM10/5/18
to LCZero
Apparently even this discussion attracts trolls. :-)

Cscuile

unread,
Oct 5, 2018, 10:20:16 PM10/5/18
to LCZero
James, Trolls are universal it seems :-)

Margus Riimaa

unread,
Oct 6, 2018, 1:14:10 AM10/6/18
to LCZero
I personally think the idea is good, and I completely don't understand what the opposition means by "it just makes a poor imitation of SF".

Of course, if you train it against Stockfish and add temperature to the games, it will eventually be a Stockfish killer, because it searches for and finds ways to defeat SF. Obvious.

And remember, guys: SF would NOT be learning. I think you are missing this point.

Gyathaar

unread,
Oct 6, 2018, 5:36:22 AM10/6/18
to LCZero
So basically what you want to do is take the general idea of generative adversarial networks (GANs), with modifications, but instead of a discriminator neural network you use Stockfish.

Leela plays the role of the generator network that tries to find and exploit weaknesses in the discriminator (Stockfish), i.e. badly evaluated positions and moves.

Normally you want to train the discriminator and generator together so they stay at roughly even strength, but since Stockfish is far ahead in training (and not self-learning), you need to start with a handicapped version and gradually strengthen it (more nodes, newer versions) as the anti-Stockfish network gets better. You could probably kick-start the anti-Stockfish net from an already trained working lc0 network, or one trained by supervised learning, to get the project moving faster.
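
A sketch of that schedule (play_match is assumed to wrap a match runner, e.g. the one sketched earlier in the thread, and return the anti-Stockfish net's score fraction at a given SF node budget; all thresholds are placeholders):

def node_curriculum(play_match, start_nodes=1_000, max_nodes=10_000_000,
                    promote_at=0.55, games_per_round=200, max_rounds=1_000):
    sf_nodes = start_nodes
    for _ in range(max_rounds):
        score = play_match(sf_nodes, games_per_round)   # 1.0 = all wins, 0.5 = all draws
        print(f"SF nodes={sf_nodes}: anti-SF score {score:.3f}")
        if score >= promote_at:
            sf_nodes *= 2        # the "discriminator" only gets stronger once it has been beaten
        if sf_nodes > max_nodes:
            break
        # retraining on the freshly generated games happens between rounds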


Sebastian Reinke

unread,
Oct 6, 2018, 7:22:35 AM10/6/18
to LCZero
Pawel, you are incorrect. This is not about the current instance of Leela, at all. It is the idea of using neural networks to discover misevaluations in SF and similar engines by training a net against them. Stockfish is elaborately coded based on human knowledge of chess positions. Until fully computing all variations of a game until the end becomes feasible, there is no such thing as "perfect" chess, but only evaluations of different quality as to what is likely to further the cause of winning a game. Stockfish's evaluation is imperfect. If you train a neural network trying all kinds of moves against Stockfish, it will eventually adapt to the weaknesses in the code of its opponent and exploit them. Because it targets engine-specific misevaluations, it will be less strong against engines playing different moves and evaluating differently. This is completely different from playing "better" chess, or "calculating better", if such a term even makes sense here.

Daniel Rocha

unread,
Oct 6, 2018, 11:58:13 AM10/6/18
to gyat...@gmail.com, lcz...@googlegroups.com
 "handicapped version and gradually improve it (with more nodes and newer versions) as the anti-stockfish network gets better... you can probably kickstart the anti-stockfish with an already trained working ... lc0 network or supervised trained so get the project moving faster"

You don't need to handicap Stockfish. I think you can cap the number of nodes SF uses.



--
Daniel Rocha - RJ

Pawel Powet

unread,
Oct 6, 2018, 12:40:54 PM10/6/18
to LCZero
I don't agree with you, because what you describe is in fact a general improvement of Leela's chess knowledge. You call it anti-SF, adapting to the weaknesses of SF's code and style of chess; I say it is just a general improvement of Leela's evaluation from training against a stronger opponent. I say that because chess knowledge is the heart of any chess engine; therefore, if you want to exploit an opponent's weaknesses, your chess knowledge has to be greater. Moreover, accepting your thesis, we would have to conclude that every Leela game against a given engine would fail, because she was trained against herself and not specifically against that engine. That is not true. If you want to beat SF, train Leela until her chess understanding is good enough to win. But anyway, it would be nice to run a tournament with Leela in different training setups, for example trained exclusively against SF, or only against herself, or against many engines including SF, so we could find out which method is the most effective!

evalon32

unread,
Oct 6, 2018, 4:58:19 PM10/6/18
to LCZero
If you accept the notion that different engines, including SF, have different strengths and weaknesses, then it trivially follows that some moves, in some positions, are going to be better against SF without being "better chess" objectively. They may even be objectively worse. For example, if (for the sake of the argument) SF tends to play worse in closed positions, an anti-SF player may prefer an equal closed position to an open position with a slight advantage.

It's a valid point that whatever the preference, Leela would have to learn to play those positions better than SF. But the counterpoint is that it may be easier to get better than SF at some positions than others (in terms of NN capacity and the amount of training required), and that's what an anti-SF version would presumably be discovering.

Pawel Powet

unread,
Oct 6, 2018, 5:40:53 PM10/6/18
to LCZero
So OK, let's assume you guys are right: that if you play Leela exclusively against SF she will finally become an anti-SF chess player, whatever that means, that easily beats the Big Fish but in the same manner scores worse against other engines. So now a question arises. What is the purpose of sacrificing time, resources and money to teach Leela how to beat the Big Fish while also teaching her to lose against weaker engines, because they play different ideas that Leela doesn't grasp? Wouldn't it be better to teach her chess in a way that lets her win against everybody, not only the Big Fish? To do that you would sacrifice the same resources, time and money, with better effect! And although I don't think Leela, even trained exclusively against SF, would finally become an anti-SF player, whatever that means, I do believe your assumptions are wrong.

apospa...@gmail.com

unread,
Oct 6, 2018, 5:50:23 PM10/6/18
to LCZero
It is very simple: if you train a NN against one SF version, it will eventually discover all the ways to maximize its score against that exact version. No other NN could ever score better against that SF, because it would be unable to make the kind of "mistakes" that happen to win against SF.

In chess the only objectively better play is a forced win over a forced draw over a forced loss; the strength of every other choice is relative to the opponent.

If you play against an SF version that has a bug causing it to resign when you blunder mate in 1 during the first 10 moves of the game, the NN will learn to always blunder mate in 1 for a 100% winrate against this SF. Pair it with everything else and you have a 100% loss rate! Of course this is an extreme example; the NN will simply discover all the situations that SF misplays and aim for them.

The NN will solve any problem you set it to solve. And yes, this anti-SF NN would be very helpful to us for making SF better. I hope someone does it.

apospa...@gmail.com

unread,
Oct 6, 2018, 5:54:59 PM10/6/18
to LCZero
Every single flaw of SF will get exposed; the aim of the project is to help make SF better. We SF devs mostly take shots in the dark: change some stuff, test, write some code, test. This tool would enable us to identify the recurring weaknesses and inspire us tremendously to write smarter evaluation code for SF.

apospa...@gmail.com

unread,
Oct 6, 2018, 6:13:08 PM10/6/18
to LCZero
In fact it would be even better if the NN could itself write the SF code, or at least set the parameters so as to lower the NN's winrate. The NN would then readjust to beating the new SF version, and again readjust the SF values. Repeating this procedure would lead to a point where the NN's winrate flatlines and it stops altering the SF parameters. In my opinion the end SF product would be a lot stronger than the initial one in general.

Maybe the next generation of engines will be AB engines trained by NNs. I believe having the NN write the entire code structure is too futuristic, but humans could build a structure that offers a lot of adjustable parameters for the NN to tune (how, how much and when to prune, customized eval tables, etc., everything optimized per position!)
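
A very rough sketch of that outer loop, using plain random search instead of a NN doing the tuning. It assumes a Stockfish build that exposes evaluation terms as UCI options (the option names below are hypothetical) and a run_match() that configures SF with engine.configure(params) and returns the anti-SF net's score against that configuration:

import random

PARAM_RANGES = {"KnightValueMg": (700, 900), "BishopPairBonus": (0, 100)}   # hypothetical options

def tune_against_antifish(run_match, rounds=50):
    best = {name: (lo + hi) // 2 for name, (lo, hi) in PARAM_RANGES.items()}
    best_net_score = run_match(best)            # lower = the net wins less = better SF
    for _ in range(rounds):
        trial = {name: random.randint(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}
        net_score = run_match(trial)
        if net_score < best_net_score:          # keep parameter sets the anti-SF net struggles against
            best, best_net_score = trial, net_score
    return best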

garrykli...@gmail.com

unread,
Oct 6, 2018, 6:43:49 PM10/6/18
to LCZero
So SF should currently be training an anti-Stockfish NN, so that they can find all of their flaws :)

Cscuile

unread,
Oct 6, 2018, 6:53:56 PM10/6/18
to LCZero
+Apospa

Very good point! You could in fact have a NN train SF.

This may run into problems, since as SF advances the previously learned material becomes irrelevant to the NN. However, there may be a way around this with different network structures.

apospa...@gmail.com

unread,
Oct 6, 2018, 7:53:52 PM10/6/18
to LCZero
A tiny NN could be part of SF or another AB engine, controlling a vast set of parameters based on the position. Evals, pruning methods, etc. would no longer need obscure and complex hand-written formulas to perform in all conditions and game phases; instead the NN would be trained to choose the optimal value set for the situation. On what factors, we wouldn't know in advance! Maybe pawn chains, etc., whatever works.

Eric Silverman

unread,
Oct 6, 2018, 10:24:56 PM10/6/18
to LCZero
A/B search and NN evaluation are already being combined to good effect in shogi.  The top-ranked shogi engines are generally either YaneuraOu (https://github.com/yaneurao/YaneuraOu) or derivatives of it, and YaneuraOu itself is heavily inspired by Stockfish.  In shogi the search and eval parts of the engines are separated, so you can use YaneuraOu, for example, with NN evals or other machine-learning-derived evals (which tend to be quite large files, 500MB or more).

On this current ranking list of shogi engines the top 14 in a row are all A/B + NNs: https://www.qhapaq.org/shogi/

So people keep talking about A/B + NN as if it were some kind of next-wave future stuff, but it's already well established in shogi, and it clearly does well: all these engines can wipe the floor with human pros, despite shogi being vastly more complex than chess (larger board, more pieces, and piece drops).

Cscuile

unread,
Oct 9, 2018, 11:24:51 AM10/9/18
to LCZero
Is there any way we can spearhead a project like this?

garrykli...@gmail.com

unread,
Oct 9, 2018, 1:49:59 PM10/9/18
to LCZero
What will Arb chess guy say if we do that, and sf can never win a game lol.

Eric Silverman

unread,
Oct 9, 2018, 2:57:52 PM10/9/18
to LCZero
Sure, basing it on the good work of these shogi engine programmers would be an ideal place to start.  We'd need someone well-versed in the inner workings of Stockfish, and some Japanese-literate people to sift through the code comments and work out exactly how these neural nets are implemented.  Then we'd have to port the engine back to chess mechanics (or Crazyhouse, since drops are already in there!) and train new neural nets.  As far as I'm aware the shogi nets are generally trained with reinforcement learning, not A0-style.  They also seem to be quite small nets, so trainable by a small team rather than needing a massive Lc0-style distributed effort.

I'm not a good C programmer by any stretch, but if a project like this were to get off the ground I'd be happy to help.  If nothing else, I can speak some Japanese :)

Eric Silverman

unread,
Oct 9, 2018, 3:18:55 PM10/9/18
to LCZero
Sorry, as an addendum to my last post -- here's a paper on the NNUE neural net evaluation functions currently dominating shogi: https://www.apply.computer-shogi.org/wcsc28/appeal/the_end_of_genesis_T.N.K.evolution_turbo_type_D/nnue.pdf (Japanese only except for the abstract)

Here's a two-part blog post (again in Japanese) explaining NNUE in more plain-ish language: 

The NNUE codebase (now merged into the YaneuraOu engine code) is here: https://github.com/ynasu87/nnue

And some discussion of the hyperparameters used in the NN evaluation function that won the most recent World Computer Shogi Championship (again Japanese): http://www.mafujyouseki.com/article/459893973.html

Anyway, if you're interested these are good places to start.

Enqwert

unread,
Oct 9, 2018, 5:55:42 PM10/9/18
to LCZero
I think the easiest way to do it is to create SF vs SF game batches and use them for supervised learning. In terms of style this will NOT be an imitation of SF, not even close. It will be somewhat positional in nature, like Leela, but it will try to steer the game toward the positions that Stockfish handles badly. If you examine Deus X's games, you will see there are similarities with Leela in spite of quite different training batches. A NN cannot imitate an AB engine, as the form of the value function is totally different. I am sure it can be very strong.

Misha Golub

unread,
Oct 10, 2018, 5:06:22 AM10/10/18
to LCZero
Using PGNs for supervised learning is tricky. The value head is simple, but how do you fill the policy target for training? In normal self-play the policy target is filled from the actual number of visits during Leela's analysis.
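
One workaround I can imagine (just a sketch, not necessarily what the lc0 pipeline does; move_index stands for lc0's move-to-policy-slot mapping, which I am assuming exists somewhere in the training code) is to fall back to a one-hot target on the move that was actually played:

import numpy as np

def one_hot_policy(played_move_uci, move_index, n_moves=1858):
    # n_moves = size of lc0's policy head; move_index maps a UCI string to its slot
    target = np.zeros(n_moves, dtype=np.float32)
    target[move_index(played_move_uci)] = 1.0
    return target

But that throws away the visit-count information that self-play normally provides.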

Cscuile

unread,
Oct 10, 2018, 7:39:11 AM10/10/18
to LCZero
It is known that Stockfish struggles in closed positions, so we should teach Leela to aim for closed positions as much as possible. But the question is, how do we teach Leela to do that? Perhaps an opening book specifically designed for Anti-SF Leela could be used.

Enqwert

unread,
Oct 10, 2018, 8:24:45 AM10/10/18
to LCZero
I think there is a misunderstanding here: we won't teach anything, it will learn by itself. There is a supervised-learning algorithm that the devs built; I think scs-ben and silver used it to develop their supervised engines. By taking positions from the games it is fed, together with the game results, the algorithm adjusts the weights to estimate the winning probabilities of positions and moves as well as possible. The games are fed in batches to improve the weights iteratively. Since SF is very accurate, the winning probabilities of moves taken from its games will be very accurate. This will reduce the noise and make the resulting NN stronger. Since the winning probabilities will be calculated from SF games, this NN will select the moves that maximize its winning probability against SF and so become an anti-SF NN.
To achieve this, a large number of SF vs SF games (probably millions) and an interested dev :) are required. It could take weeks, but the resulting NN should be stronger than the best SF.


Eric Silverman

unread,
Oct 11, 2018, 10:13:17 AM10/11/18
to LCZero
There's a leap in logic here from the NN learning to approximate Stockfish's probability of winning a given position to suddenly being *better* than Stockfish; that would not result from training on those inputs.  An NN is only as good as the data we give it.

If you want an NN to beat Stockfish, you need to either A) feed it data from something stronger than Stockfish, or B) generate new knowledge that allows the network to exceed Stockfish's performance.  The former is not possible at the moment, since Stockfish is the strongest engine available, and the latter is only possible by generating new moves and strategies via reinforcement learning.  AlphaGo did exactly this: it initially learned from human games, then used reinforcement learning to find moves humans don't play and reach superhuman strength; AlphaGo Zero went further, of course, by learning completely from scratch and exceeding human play through self-play alone.  Human games alone are not enough to reach superhuman strength; likewise, Stockfish games alone aren't enough for Leela to reach super-Stockfish levels.

Eric Silverman

unread,
Oct 11, 2018, 10:17:38 AM10/11/18
to LCZero


Enqwert wrote: "A NN cannot imitate an AB engine, as the form of the value function is totally different. I am sure it can be very strong."

NNs are universal approximators, so they definitely can.  As I mentioned before, shogi engines already place neural nets in exactly the same role as traditional evaluation functions, combined with alpha-beta search, and they meet or exceed the performance of those functions under the same conditions.  As for Stockfish in particular, my understanding is that its evaluation function is a linear combination of weighted terms, as in most alpha-beta engines, meaning NNs would be able to approximate it quite well given enough training data.
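
As a toy illustration of the approximation point (nothing to do with lc0's actual architecture; X is assumed to be a matrix of hand-extracted position features and y the corresponding Stockfish centipawn evals, both already dumped to tensors):

import torch
import torch.nn as nn
import torch.nn.functional as F

def fit_eval_net(X, y, epochs=100, lr=1e-3):
    net = nn.Sequential(nn.Linear(X.shape[1], 256), nn.ReLU(),
                        nn.Linear(256, 256), nn.ReLU(),
                        nn.Linear(256, 1))
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(epochs):
        loss = F.mse_loss(net(X).squeeze(-1), y)   # regress SF's own evaluation directly
        opt.zero_grad()
        loss.backward()
        opt.step()
    return net

How well it generalizes is a separate question, but there is nothing in principle stopping a net from approximating a linear, hand-tuned eval.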

Enqwert

unread,
Oct 11, 2018, 2:52:48 PM10/11/18
to LCZero
"An NN is only as good as the data we give it."
This statement is not true; there are lots of counterexamples. Deus X is around 3200 Elo and was created from human games averaging 2500-2600. If you examine Leela's training games, you will see they are of poor quality; although individually poor, they are created in huge numbers, so together they form valuable statistics that guide the NN. Moreover, the training games are created with only 800 nodes, which is an incredibly shallow search for producing a good-quality game. All this shows that you can create engines far more powerful than the level of the data they are fed.

Phil Wakson

unread,
Oct 11, 2018, 4:39:49 PM10/11/18
to LCZero
Since Stockfish is a static engine that doesn't learn, what will happen is that Leela will find a path to win one game (after a really long time) and just play that game over and over. Give it another opening and it will just repeat the same process of finding one path to victory and use it every time.

You will have a chess engine that has no idea how to play chess, only how to win one game against Stockfish from one opening.

Cscuile

unread,
Oct 11, 2018, 5:23:30 PM10/11/18
to LCZero
+Phil, 

This issue is bypassed with forced openings from a specially designed book. Given enough time, Anti-Fish Leela should be able to generalize toward targeting Stockfish's weaknesses in most tournament opening settings.

Cscuile

unread,
Oct 11, 2018, 5:31:37 PM10/11/18
to LCZero
This is in regard to taking Project Anti-Fish Leela to the next step. We can use the training code provided by the main Leela project for supervised training. In the beginning I would recommend a 6x64 network, since resources are in short supply.

As a proof of concept, we should train a 6x64 network (or smaller) to target an older SF version's weaknesses. If it is shown that Anti-Fish Leela can beat SF X while main Leela cannot beat that specific version, then it proves the neural network is generalizing toward SF X's holes in evaluation.

For now, the main central "Hub" for this project will be on the Stockfish Discord which you can find here:
Stockfish Discord: https://discord.gg/nv8gDtt

Eric Silverman

unread,
Oct 11, 2018, 5:36:38 PM10/11/18
to LCZero
Please read my post again, since I specifically cited reinforcement learning as a way to exceed the initial quality of supervised training data.  

It's absolutely common knowledge that supervised learning is not a free lunch, and you cannot get better predictions out of a network than what you feed into it.  Case in point, this quote from p 183 of 'Deep Learning and the Game of Go' (https://www.manning.com/books/deep-learning-and-the-game-of-go):  "You might ask yourself how strong a bot you can potentially build with the methods we presented in this chapter [on supervised learning]. A theoretical upper bound is this: the network can never get better at playing Go than the data we feed it. In particular, using just supervised deep learning techniques as we did in the last three chapters, won’t be able to surpass human gameplay. In practice, with enough compute power and time, it’s definitely possible to reach results up to about 2 dan level. To reach super-human performance of gameplay, we need to work with reinforcement learning techniques..."

Note in particular that the supervised Go-playing networks top out at 2-dan level, *despite being fed data at 7- to 9-dan level*.  A theoretical network trained only on Stockfish data is not only not going to exceed Stockfish's performance, it is going to be worse than Stockfish.

Pawel Powet

unread,
Oct 11, 2018, 5:37:11 PM10/11/18
to LCZero
Leela won't find any such path, because chess is a game of limited moves. It doesn't matter that there are 10^80 different combinations; what matters is that only a few of them lead to a better position or an advantage in a given position. Moreover, in most chess positions only one continuation leads to an advantage, so what can Leela find that SF can't in such a limited range? And since SF is being developed on a daily basis, the speculation that Leela can outplay it sounds ridiculous. What Leela can really do, I believe, is win statistically: for every 1000 games played against SF, she might win a few more!



Jon Mike

unread,
Oct 11, 2018, 7:23:29 PM10/11/18
to LCZero
If this project does take off, I would support it as well.  I don't think it would take away from the lc0 project, since the whole of our understanding would increase, helping both lc0 and SF toward their goals.  I also strongly recommend the 6x64 network, for resource reasons, which are of course tied to speed of progress and understanding.  I would also be fully interested in testing an even smaller network than 6x64.  Much of what we learn from the smallest networks WILL scale to the larger ones.  We can then test and apply those principles to the larger 6x64 network and progress much faster and more efficiently.

Uri Blass

unread,
Oct 11, 2018, 10:25:30 PM10/11/18
to LCZero
When you talk about Stockfish's playing strength, it depends on the time control.
Feeding lc0 data from something stronger than Stockfish (at the relevant time control) is clearly possible: you only need to feed it games of Stockfish played at a longer time control.

Enqwert

unread,
Oct 12, 2018, 3:01:56 AM10/12/18
to LCZero
"Please read my post again, since I specifically cited reinforcement learning as a way to exceed the initial quality of supervised training data.  


It's absolutely common knowledge that supervised learning is not a free lunch, and you cannot get better predictions out of a network than what you feed into it.  Case in point, this quote from p 183 of 'Deep Learning and the Game of Go' (https://www.manning.com/books/deep-learning-and-the-game-of-go):  "You might ask yourself how strong a bot you can potentially build with the methods we presented in this chapter [on supervised learning]. A theoretical upper bound is this: the network can never get better at playing Go than the data we feed it. In particular, using just supervised deep learning techniques as we did in the last three chapters, won’t be able to surpass human gameplay. In practice, with enough compute power and time, it’s definitely possible to reach results up to about 2 dan level. To reach super-human performance of gameplay, we need to work with reinforcement learning techniques..."


Note in particular that the supervised Go-playing networks top out at 2-dan level, *despite being fed data at 7- to 9-dan level*.  A theoretical network trained only on Stockfish data is not only not going to exceed Stockfish's performance, it is going to be worse than Stockfish."
The book says the network can never get better at playing Go than the data we feed it.
What I am saying is that this is not true, at least for chess, as we already have practical examples, namely Leela and Deus X, that are much stronger than the material they were fed. The theory in this area is still developing; practical examples are more enlightening.

Cscuile

unread,
Oct 12, 2018, 7:46:27 AM10/12/18
to LCZero
+Enqwert

An interesting thing to note is that perhaps the "Speed advantage" over humans is what is causing this inflated Elo. What this implies is that if a human could search 3000 Nodes per Second, their Elo would be 3200+. 

Eric Silverman

unread,
Oct 12, 2018, 12:06:29 PM10/12/18
to LCZero
I literally just gave you a clear statement derived from *an entire book* of practical examples!  This is not some random supposition by a single person; these are established limitations based on years of rigorous academic research in machine learning.

Also, you still clearly don't understand the difference between supervised learning and reinforcement learning, as you keep citing Leela's training games in this context as if they prove your point, which they do not.

There's clearly no point engaging in this discussion, since you're not taking in anything I'm saying, so go ahead and have fun I guess.

Jon Mike

unread,
Oct 12, 2018, 2:09:45 PM10/12/18
to LCZero
There are so many different levels of understanding here that it often creates a lot of confusion and frustration.  How can we get past this?  Those who are more educated than others must invest in others by sharing known resources of knowledge.  We must help smooth the path for those who are behind us.  I just posted some threads which will hopefully become resources for everyone (from beginners to advanced) to learn, share and grow together.



Enqwert

unread,
Oct 12, 2018, 4:10:25 PM10/12/18
to LCZero
Why do you take it personally? Should I read the entire book to understand your examples? I gave you clear, practical examples and you say nothing about how they were possible. I agree that there is no point in continuing this discussion.

Enqwert

unread,
Oct 13, 2018, 4:55:34 AM10/13/18
to LCZero
" Csuile wrote

An interesting thing to note is that perhaps the "Speed advantage" over humans is what is causing this inflated Elo. What this implies is that if a human could search 3000 Nodes per Second, their Elo would be 3200+"

The thing is, all human weaknesses and strengths are already reflected in the quality of those games, and the level of the games is around 2500-2550 Elo. You can think of the games as played by 2500-level engines; there would not be much difference.

In fact, a huge collection of crappy amateur games can contain enough information to train a NN to great strength. The winning probabilities of moves are averaged and converge toward the probabilities of perfectly played moves. This is what makes it possible to create NNs much stronger than the training material. Even completely random moves would converge toward the winning probabilities of perfectly played moves, if the data size were large enough.

On the other hand, using low-quality games can cause so much statistical noise that it takes too long to train a NN, making the whole training infeasible.

Cscuile

unread,
Oct 13, 2018, 10:55:17 AM10/13/18
to LCZero
+Enqwert Ah, I see. You believe the neural network selectively picks up and learns the stronger moves humans play. That is possible; there are plenty of positions in which humans can come up with a winning continuation while engines cannot (i.e., they take hours to find it). Most of these positions are puzzle-like, for example the Plaskett puzzle.

Do you know how an engine would selectively pick these stronger moves to learn from? I'm not too familiar with how Deus X was trained.

Cscuile

unread,
Jan 19, 2019, 7:40:53 PM1/19/19
to LCZero
Looking back, this project is a success with AntiFish Mark 125! AntiFish Mark 125 successfully targets Stockfish 11 Dev (1/17/19) and managed to beat it +8 -0 =25. Yet when it was played against SF9, it seems AF didn't do as well. Interesting.
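
For a rough sense of scale, under the usual logistic Elo model that score works out to roughly +86 Elo for this match (ignoring colour effects and error bars):

import math

wins, draws, losses = 8, 25, 0
score = (wins + 0.5 * draws) / (wins + draws + losses)   # about 0.621
elo_diff = -400 * math.log10(1 / score - 1)              # about +86
print(f"score {score:.3f}, performance = {elo_diff:+.0f} Elo")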

garrykli...@gmail.com

unread,
Jan 19, 2019, 7:46:04 PM1/19/19
to LCZero
Nice; it looks like Google could train an anti-version against any program they want in a day. Interesting.

Misha Golub

unread,
Jan 20, 2019, 5:16:45 AM1/20/19
to LCZero
So if we used correspondence games for training, would we get a champion? The number of correspondence games of sufficient quality is not even close to what's needed for training, but imagine if we had millions of correspondence games.

valergrad valergrad

unread,
Jan 20, 2019, 7:14:15 AM1/20/19
to LCZero
I've looked at this match online. This +8 -0 =25 can't really be impressive, because that computer had too strong a GPU and too weak a CPU.
The SF/Leela nodes ratio was about 600:1 IIRC, while DeepMind's was about 875:1 and TCEC's about 1200:1.

Cscuile

unread,
Jan 20, 2019, 7:38:06 AM1/20/19
to LCZero
The CPU used had 63 threads, and the GPUs were a 2080 and a 1080 Ti.

Cscuile

unread,
Jan 20, 2019, 7:38:57 AM1/20/19
to LCZero
If we had millions of correspondence games, we might be able to beat SF by quite an unforeseen margin. 