On 5/6/18, Trevor <trevor...@gmail.com> wrote:
> Warren - have you called the Deepmind team a bunch of idiots too?
--no, but I did ask them what A0 had (and had not) learned about TB
endgames. I suggested that would be another way to gauge how well it
had learned chess, and that readers wanted to know. (No answer.)
> From my perspective.. The “zero” tabula rasa thing is not just some
> philosophical ideal or aesthetic, nor is it a simple curiosity. It’s a
> scientific result in optimization.
>
> A few remarks...
>
> It’s long been known that “perfect” data - noise free, 100% correctly
> labeled, etc - is often *not* the best training data for neural networks.
--Sigh.
Right now, the sole way LC0 learns is from perfect data
(e.g. checkmate positions). There is no other source of knowledge
besides the rules of chess, which are perfect data. Would you
suggest adding noise to create fake rules of chess, and try to have
LC0 learn from those?
This is all utter bull. You just spew garbage, following the pattern
of all my respondents so far, in a comical effort to avoid doing the
right thing.
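To make concrete what "perfect data" means here, a minimal sketch
(assuming the python-chess library; label_game is my own hypothetical
helper) of how the rules alone hand out exact value labels at game end:

    import chess

    def label_game(moves):
        """Replay a finished game; return (fen, exact_value) pairs, the
        value being the final result seen from the side to move in that
        position. These labels come straight from the rules."""
        board = chess.Board()
        seen = []
        for move in moves:
            seen.append((board.fen(), board.turn))
            board.push(move)
        outcome = board.outcome()   # None unless the game actually ended
        if outcome is None:
            return []               # unfinished game: no exact label
        def value(turn):
            if outcome.winner is None:
                return 0.0          # draw is exactly 0 for both sides
            return 1.0 if turn == outcome.winner else -1.0
        return [(fen, value(turn)) for fen, turn in seen]

Every label this produces is exact, by the rules. There is nothing to
denoise.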
> Neural networks do well with noise and imperfections as it forces
> regularization that improves their filters, and enhances their capacity to
> learn and generalize better. What’s important is that the data is mostly
> right, such that statistics done on the data result in the correct minima -
> not that each example is perfectly labeled. I think this stuff is even more
> pertinent for reinforcement learning due to its dynamical nature (ie moving
> targets, exploration vs exploitation concerns, etc).
--you simply spew mythology. The fact is, the TBs contain far, far
more data than the number of weights in a neural net, so the NN is
incapable of overfitting them. There simply is no concern about this.
At all. It is total bull. And all the stuff about chess endgames that
the NN cannot learn IS noise and imperfections as far as it is
concerned.
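Some back-of-envelope arithmetic (both figures are orders of magnitude
I am assuming for illustration, not exact counts):

    # Rough, assumed orders of magnitude -- not exact counts.
    tb_positions = 4e12   # ~distinct positions in 6-man tablebases
    nn_weights   = 5e7    # ~parameters in an AlphaZero-sized network

    print(f"positions per weight: {tb_positions / nn_weights:.0f}")
    # -> tens of thousands of labeled positions per free parameter;
    #    memorizing (overfitting) the TBs is simply not on the table.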
> If you followed the work done on MCTS Go playing programs (back before
> anyone showed effectiveness with neural networks), you’d know that it was
> found a long time ago that a good policy for MCTS rollouts is not
> necessarily a good policy for playing the game. On the contrary, when
> playout policies were optimized to play Go well, performance of the MCTS
> engine suffered.
--irrelevant. I am not suggesting trying to input, or in any way
force, human ideas of "playing Go well."
(Maybe you are, but I do not care.)
> Only a few years ago most experts seemed to agree that current neural
> network architectures were absolutely incapable of playing games like Go and
> Chess as well as traditional engines do. Many many people tried - both
> supervised and tabula rasa. I did too with some games (over 10 years ago, I
> wrote a small neural network library, mostly temporal-difference learning
> approaches).
--I did so too. But unlike you and the "many," I actually succeeded.
Namely, I trained an Othello player from nothing up to human
national-champ strength in one week on a machine running at about
60 MHz (this was about 25 years back). It was not a neural net; it
was a table-based eval with several million table entries to learn
(initially random). The sole source of knowledge was the rules of
Othello and the game-end results.
The main reason those guys all failed was simple: not enough data.
That was the main reason almost all NN research did not get very far
in the early days. But in my Othello case the data rates were
tremendous.
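In rough outline, the learner looked like this (a simplified sketch,
not the original code; the names and table size are illustrative):

    import random

    # Eval = sum of table entries indexed by board features; each entry
    # is nudged toward known game-end results. The only knowledge used
    # is the rules (which generate the games) and the final results.
    TABLE_SIZE = 1_000_000   # several million in the actual program
    table = [random.uniform(-0.01, 0.01) for _ in range(TABLE_SIZE)]

    def evaluate(features):
        """Sum of the table entries this position's features index."""
        return sum(table[f] for f in features)

    def learn_from_game(feature_sets, result, lr=0.01):
        """After a finished game, move each visited position's entries
        toward the exact final result (+1 / 0 / -1)."""
        for features in feature_sets:
            err = result - evaluate(features)
            step = lr * err / max(len(features), 1)
            for f in features:
                table[f] += step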
> One thing I found with my experimentation is that when learning
> policies based on neural networks and shallow search, it is very very easy
> to more or less get stuck in local-minima: portions of the
> game-playing-space that are difficult to get out of because of the trappy
> nature of games.
--at last, you say something sensible.
Unfortunately, you fail to recognize this totally supports me. In the
KNNkp endgame, as I already explained, LC0 is likely to get completely
the wrong idea through self-play without tablebase aid. That is just
one example. That is why tablebases will help tremendously, probably
enabling it to learn this endgame when, without them, it would take
thousands of years or more. Thank you for making my point.
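Wiring that aid in is not exotic. A minimal sketch, assuming
python-chess with its syzygy module and a local directory of Syzygy
files (the path, the sample position, and the +1/0/-1 target
convention are my illustrative choices):

    import chess
    import chess.syzygy

    # Replace a noisy self-play outcome with the exact tablebase verdict
    # whenever a training position falls inside TB coverage.
    with chess.syzygy.open_tablebase("./syzygy") as tb:
        board = chess.Board("1k6/p7/8/2NKN3/8/8/8/8 w - - 0 1")  # KNN vs kp
        wdl = tb.probe_wdl(board)        # 2 = win, 0 = draw, -2 = loss
        value_target = (wdl > 0) - (wdl < 0)   # exact +1/0/-1 label
        print(value_target)

(probe_wdl raises if the needed table file is absent, so in practice
one falls back to the self-play result outside TB coverage.)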
> While training Leela on tablebases and perfect minimax search up to some
> depth *might* help the network find global optima, there is very good reason
> to believe it might hurt instead.
--no, there is no reason. You just spew myths.
> I believe Leela’s team of developers is
> wise in staying the course trying to reproduce Alpha Zero’s results.
--no, this is a complete and obvious mistake.
It is very simple. More data, coming faster ==> faster learning.
Nobody in the NN community has ever disputed this. Vast experience
supports it. Except for the Leela forum trolls: they dispute it,
right here, right now.