Promoting up to -50 ELO matches now


Gary Linscott

Mar 29, 2018, 1:56:58 PM
to LCZero
This is basically "always-promote", except with a little bit of a safeguard against large regressions. It's still not statistically sound, but there is some great work happening on SPRT to make it more robust.

Just didn't want people to be surprised :).

Huragan

Mar 29, 2018, 2:19:45 PM
to LCZero
Good decision, in my opinion.

On Thursday, March 29, 2018 at 19:56:58 UTC+2, Gary Linscott wrote:

asterix aster

Mar 29, 2018, 11:11:16 PM
to LCZero
What is the reason for this change from promoting only > 55% to "always-promote"? Just curious.

graci...@gmail.com

Mar 29, 2018, 11:24:01 PM
to LCZero
A network with -18 elo just passed. Care to explain why this slightly weaker network is better than just continuing with the stronger one?

jkiliani

Mar 30, 2018, 12:11:22 AM
to LCZero
We don't actually know that this net is weaker; the result is within statistical noise. And it generally benefits reinforcement learning to make the feedback loop more direct, i.e. to let the network play, as much as reasonably possible, on the moves it just trained on, so it gets out of local optima faster. So we picked "always-promote with safeguards". A bugged network would still fail.
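For concreteness, the rule is just a threshold on the measured match Elo. A minimal sketch (the function names are mine, not the actual server code):

import math

def match_elo(wins, draws, losses):
    # Elo delta implied by the match score, standard logistic model
    score = (wins + 0.5 * draws) / (wins + draws + losses)
    score = min(max(score, 1e-6), 1 - 1e-6)  # avoid infinities at 0%/100%
    return -400 * math.log10(1 / score - 1)

def should_promote(wins, draws, losses, safeguard=-50.0):
    # "always-promote" with a safeguard: accept even a mildly negative
    # measured delta, reject only large regressions
    return match_elo(wins, draws, losses) > safeguard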

luis....@gmail.com

Mar 30, 2018, 6:48:48 AM
to LCZero
Isn't -50 Elo too conservative? You can see on zero.sjeng.org that many nets fail by much larger margins than that, without it being an indication of bugs. If you're following AlphaZero, shouldn't the safeguard be more permissive?

jkiliani

Mar 30, 2018, 6:58:58 AM
to LCZero
The training algorithms are different. Leela Zero uses a constant learning rate for each training window, but samples the window at different times to control for overfitting. Since gating (at a 55% win rate) is used, an overfit net simply wouldn't pass.

Leela Chess, on the other hand, uses a learning rate annealing schedule, with a single net produced at the end of each run before new data is parsed. This usually allows faster improvement, but it can also produce overfit networks, like (most likely) ID 58. Gating at -50 Elo seems like a good compromise to catch those nets where the learning parameters were not optimal.
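To make the contrast concrete, the two schedules look roughly like this (the numbers are invented, not the projects' actual hyperparameters):

def leela_zero_lr(step):
    # Leela Zero: a constant rate while training on one window
    return 0.005

def leela_chess_lr(step, total_steps):
    # Leela Chess: anneal the rate over a single run; too aggressive
    # a final phase is the kind of thing that can overfit a net
    if step < 0.5 * total_steps:
        return 0.02
    if step < 0.75 * total_steps:
        return 0.002
    return 0.0005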

lilmafya

Mar 30, 2018, 7:38:29 AM
to LCZero
What about network ID 59?

jkiliani

Mar 30, 2018, 8:29:17 AM
to LCZero
The learning rate schedule was changed again after ID 59, since that net still appeared to overfit a bit. The data produced by the network should help to avoid overfitting for a while, though.

GK

Mar 31, 2018, 12:52:50 PM
to LCZero
Just an idea, not sure how reasonable this is but:

if we consider match tests 55 and 57, which had Elo deltas of -100 or worse, would it be worth terminating sets of matches like these in advance, once they seem to have a very low probability of meeting the desired gating Elo? (I'm not sure if this is already being done; if it is, my mistake :) )

For example, if a promotion is currently only given to networks with an Elo delta greater than -50, then at some point, given how many games are played in each set of matches (say 600 or 700), there must be a point of no return, at which no matter how many more games the new network wins, its score will not be large enough to reach an Elo delta of -50. Would it not be more prudent to terminate the set of matches right then and there?
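Concretely, that cutoff would be easy to check after every game. A rough sketch, assuming the usual logistic Elo model (the names are mine):

import math

def elo_from_score(score):
    return -400 * math.log10(1 / score - 1)

def past_point_of_no_return(wins, draws, losses, total_games, threshold=-50.0):
    # True once the candidate cannot reach `threshold` Elo even by
    # winning every remaining game of the match
    remaining = total_games - (wins + draws + losses)
    best_score = (wins + remaining + 0.5 * draws) / total_games
    if best_score >= 1.0:
        return False  # a perfect finish is still possible
    return elo_from_score(best_score) < threshold

For instance, with 50 wins, 50 draws and 550 losses after 650 of 700 games, even winning all 50 remaining games cannot get back above -50, so the match could stop early.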

Of course, there are also runs that only barely miss the Elo gating, such as ID 61 at an Elo delta of about -65; in cases like that we would not save as many matches, but over time we would probably see at least some benefit from this.

evalon32

Mar 31, 2018, 1:04:29 PM
to LCZero
Indeed, that's what SPRT is. It's being worked on in PR #174.

GK

Mar 31, 2018, 1:07:15 PM
to LCZero
Right, that totally makes more sense. Thanks!

Ivan Ivec

Apr 1, 2018, 3:20:07 PM
to LCZero
I'm a complete novice here, but I think that 500 games are enough to distinguish -20 Elo.
So the -50 threshold is not clear to me.


On Saturday, March 31, 2018 at 19:07:15 UTC+2, GK wrote:

Lyudmil Antonov

Apr 1, 2018, 3:27:38 PM
to LCZero
The variability here is high because the draw rate is very low.

Ivan Ivec

Apr 1, 2018, 3:36:20 PM
to LCZero
Regardless of the draw rate, accepting a negative Elo delta (especially one that large) means that you are very confident in this training. Well, good luck...

Ivan Ivec

Apr 1, 2018, 3:45:28 PM
to LCZero
BTW, I see that ID 73 is at 4340 Elo???

I know that Elo is not transitive/additive, but still...

Lyudmil Antonov

Apr 1, 2018, 3:53:14 PM
to LCZero
With high variability (a low draw rate), 500 games are not enough to distinguish -20 Elo, because the standard error is about +/- 50.
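For the curious, here is the back-of-the-envelope version, a sketch using the normal approximation (the site's own error bars may be computed differently):

import math

def elo(score):
    return -400 * math.log10(1 / score - 1)

def elo_error_95(wins, draws, losses):
    # 95% confidence half-width of the measured Elo delta
    n = wins + draws + losses
    s = (wins + 0.5 * draws) / n
    var = (wins * (1 - s) ** 2 + draws * (0.5 - s) ** 2 + losses * s ** 2) / n
    stderr = math.sqrt(var / n)
    return elo(s + 1.96 * stderr) - elo(s)

print(elo_error_95(250, 0, 250))    # ~30 Elo: 500 even games, no draws
print(elo_error_95(150, 200, 150))  # ~24 Elo: draws tighten the estimate

At a near-zero draw rate, +/- 50 is then roughly the three-sigma band, which is the ballpark above.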

Lyudmil Antonov

Apr 1, 2018, 3:54:22 PM
to LCZero
Though I am worried about the declining Elo, too...

Lyudmil Antonov

Apr 1, 2018, 4:32:45 PM
to LCZero
I still think that keeping the temperature at 1 for the whole game may be one of the problems, despite what is written (not very clearly) in the AlphaZero paper. Better to try it as in the AlphaGo Zero paper: t=1 only in the opening (the first 15 moves or so), then keep t close to zero. Randomness is not needed, especially in the endgame, for example when there is only one shortest way to mate, or when you have corresponding squares with single best moves.
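The suggested schedule is easy to express. A sketch of the proposal, not LCZero's actual move-selection code:

import numpy as np

def select_move(visit_counts, move_number, cutoff=15):
    # t = 1 for the first `cutoff` moves: sample in proportion to the
    # MCTS visit counts; afterwards t -> 0: always play the most
    # visited move, removing endgame randomness
    counts = np.asarray(visit_counts, dtype=float)
    if move_number < cutoff:
        return int(np.random.choice(len(counts), p=counts / counts.sum()))
    return int(counts.argmax())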

graci...@gmail.com

Apr 1, 2018, 6:14:57 PM
to LCZero
Why not just play more games to make the Elo estimation more accurate? It's not like the framework can't handle it; maybe go until we get a 15 Elo error bar. Then, as we reach higher Elo, the error bars of course need to be lowered further, to 10 or 5, etc. 50 Elo seems huge for an engine reaching 2000 Elo, and the current setup seems too permissive: it looks like it is regressing by passing these -40 Elo networks... Is the network really getting better by doing this? I'm no expert here anyway, just my 2 cents...

On Thursday, March 29, 2018 at 9:56:58 AM UTC-8, Gary Linscott wrote:

GK

Apr 1, 2018, 7:31:16 PM
to LCZero
Indeed. Also consider that the last four networks after candidate ID 69 have seen Elo drops; the three that passed add up to (-46 - 28 - 18), about -92.

What is the purpose, then, of rejecting a single -56 Elo delta network yet accepting a cumulative sequence of networks that ends up subtracting more Elo than that?

More visual evidence of this can be seen on the LCZ main page, where the Elo has started to trend downwards.

I understand that some other learning projects promote all the time, and that we are currently promoting with safeguards because some of the fluctuation in Elo could be due to noise, but how much of a drop are we willing to accept?

I added some of this up, through candidate ID 73: from the very beginning, the total positive Elo delta has been 4511 and the total negative -2921, summing to 1589.

Now, regardless of whether that number is accurate, would we not be better off simply accepting only networks with a positive Elo delta?

Wouldn't we be guaranteed to see improvements that way? Of course, as was mentioned before, we might be missing out on some beneficial changes hidden by noise, but I'm not sure that's really a strong enough argument to continually accept negative Elo deltas; and if it is, then the -50 Elo delta cutoff also seems rather arbitrary, unless I'm missing something.

Of course, there is also the Leela Zero project at http://zero.sjeng.org, which accepts all network changes, negative or positive, stating "Not each trained network will be a strength improvement over the prior one. Patience please. :)", so there's always that to consider.

I suppose what I am asking is, what is the downside of rejecting all non-positive elo changes?

GK

Apr 1, 2018, 8:04:41 PM
to LCZero
One could probably make the argument that rejecting all non-positive Elo changes would tend to select for networks that are not as good at middlegame or endgame play.

However, given enough time, wouldn't this issue simply iron itself out? 

After all, say LCZ improves to point A, defined as not being good at the middlegame or endgame. At that point, when networks that are both at about point A face each other, and neither is very good at the middlegame or endgame, won't a network that is just slightly better, enough to produce a positive Elo delta, end up passing anyway? And over enough time and enough games, wouldn't the network grow to become good at the middlegame and endgame regardless, just by merit of how this system works?

I'm not entirely sure about this but I can't really see the issue here.

GK

Apr 1, 2018, 8:06:14 PM
to LCZero
Edit: this assumes that the way things are currently going leads to networks being promoted that tend to get quicker wins, which would indicate they are stronger at opening play rather than necessarily at middlegame/endgame play.

Michel VAN DEN BERGH

Apr 2, 2018, 2:48:26 AM
to LCZero

would we not be better off simply accepting only networks with a positive Elo delta?
Wouldn't we be guaranteed to see improvements that way?

You have to be careful to distinguish what you observe and what is true.

Assume that every network has in fact the same strength. Then 50% of the tests would show a positive Elo gain. By accepting only the winners, it would seem as if you were making steady progress, whereas in reality you are stagnating.
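A toy simulation of exactly this scenario (all numbers invented) shows how large the illusion can get:

import math, random

random.seed(0)
cumulative = 0.0
for _ in range(100):            # 100 candidates, all of equal true strength
    games = 400
    wins = sum(random.random() < 0.5 for _ in range(games))
    score = wins / games
    delta = -400 * math.log10(1 / score - 1)
    if delta > 0:               # promote only on a positive measured delta
        cumulative += delta
print(round(cumulative))        # several hundred phantom Elo, from noise alone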

The optimal cutoff depends on the exact distribution of improvements versus regressions (the prior). The problem is that the prior cannot be known with any accuracy, and it is likely to change quickly over time. So, traditionally, in statistics you prefer to be safe rather than sorry, and you use a very strict cutoff. This principle works extremely well for traditional chess engines like Stockfish.

However, in this case there is the added complication that a regression can still become an improvement after more training. So accepting a regression is maybe not so harmful. It might even be beneficial, as it may prevent the network from getting stuck in a local optimum. Time will tell.

kostuek

Apr 2, 2018, 3:06:26 AM
to LCZero
On Monday, April 2, 2018 at 01:31:16 UTC+2, GK wrote:
I suppose what I am asking is, what is the downside of rejecting all non-positive elo changes?

Imagine a learning network on a 3D landscape, with hills and valleys representing its Elo strength. The network learns by "climbing the hills", and it does so in a simple manner: it looks around itself and takes the direction that seems to go uphill. But it does not look very far, only a short distance. So it may arrive at a point in our imaginary landscape where every direction seems to be downhill, simply because the network is not able to see very far, while it is still not on the highest possible hill. If you do not allow non-positive Elo changes, the network will get trapped in this so-called "local optimum". In fact, -50 Elo is an arbitrary assumption about our landscape, namely that it is enough to overcome any local optimum in our way. Which is probably not sound, since we do not really know much about this landscape.
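The picture can even be run as a toy one-dimensional experiment (everything here is invented for illustration):

import math, random

def landscape(x):
    # one big hill covered in small bumps: local optima everywhere
    return -0.1 * (x - 30) ** 2 + 10 * math.sin(x)

def climb(allow_drop, steps=10000, seed=42):
    random.seed(seed)
    x = 0.0
    for _ in range(steps):
        nxt = x + random.uniform(-1, 1)
        if landscape(nxt) > landscape(x) - allow_drop:
            x = nxt
    return round(landscape(x))

print(climb(allow_drop=0))   # stalls on the first small bump it meets
print(climb(allow_drop=5))   # tolerating regressions reaches higher ground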

Lyudmil Antonov

Apr 2, 2018, 3:19:37 AM
to LCZero
If you look at the graph of Leela Zero (http://zero.sjeng.org/), it illustrates this principle very well. It is step-like, with the flat parts in local optima (valleys) with no hill in sight. It walks for some time, finds the far hill (a bigger optimum) and starts to climb (the steeper parts).

Lyudmil Antonov

Apr 2, 2018, 3:27:56 AM
to LCZero
However, one shouldn't allow the network to fall into deep holes along the way, otherwise it will waste time and energy getting out of the hole.

Michel VAN DEN BERGH

Apr 2, 2018, 3:57:47 AM
to LCZero


On Monday, April 2, 2018 at 9:19:37 AM UTC+2, Lyudmil Antonov wrote:
If you look at the graph of Leela Zero (http://zero.sjeng.org/), it illustrates this principle very well. It is step-like, with the flat parts in local optima (valleys) with no hill in sight. It walks for some time, finds the far hill (a bigger optimum) and starts to climb (the steeper parts).

Maybe I am misunderstanding things, but Leela Zero seems to apply very strict gating. Rather the opposite of the policy currently being followed by LC0.


Or do they perhaps continue to train a failed network, in the hope that it eventually climbs out of the valley?

Lyudmil Antonov

Apr 2, 2018, 4:49:03 AM
to LCZero
From their FAQ, it appears that they use SPRT to promote networks with a 55% win rate over the previous network.

Aleks

Apr 2, 2018, 5:51:53 AM
to LCZero
My gut feeling is that it's too early to worry about local optima.

We are still in the middle of a huge mountain climb. There are plenty of paths with a clear line of sight to the peaks, and local hills are not preventing us from finding them. It's the learning mistakes (like overfitting) that make us take downhill paths more often than we should.

-50 feels too harsh.

GK

Apr 2, 2018, 1:17:47 PM
to LCZero
Thanks for clarifying, guys. I guess I just thought for some reason that Elo would keep rising, as opposed to rising and then stalling, or even dropping for a bit before rising again, and this despite seeing the progress of the other Leela Zero. As with most things, it seems it is the long-term trend that counts most.

Thanks again for the clarification :)

Ivan Ivec

Apr 2, 2018, 1:37:37 PM
to LCZero
Now even I can understand this:

a negative Elo delta is OK as long as we are far from stagnation or decline.

One day, when we reach stagnation, we'll need to raise the cutoff, and when we reach the limits of the network, we'll need a positive Elo delta, just as for developing classical strong engines, where the limits of the network (our brains) have been reached.


On Monday, April 2, 2018 at 19:17:47 UTC+2, GK wrote:

Dorus Peelen

Apr 2, 2018, 1:57:50 PM
to LCZero
> One day, when we reach stagnation, we'll need to raise the cutoff, and when we reach the limits of the network, we'll need a positive Elo delta, just as for developing classical strong engines, where the limits of the network (our brains) have been reached.

Absolutely not. Especially once you are near the limit, overly strict gating will prevent the network from improving at all.

You can still use stricter requirements to select the *best* network; however, for self-play and training it's beneficial to keep the feedback loop going and allow (almost) all new networks to generate games, so the network can continue to learn new strategies.

On Monday, April 2, 2018 at 19:37:37 UTC+2, Ivan Ivec wrote:

Ivan Ivec

Apr 2, 2018, 2:43:42 PM
to LCZero
Well, Stockfish is improving just as I said, and there is no other way.

If absolutely not, then Leela's network is far below the human brain (which is OK, I guess).

Lyudmil Antonov

Apr 5, 2018, 5:02:06 PM
to LCZero
With this allowance of -50 Elo there is a regression which is hard to reverse. Better to keep to AlphaZero and Leela Zero, where only networks with a 55% winning chance are selected.

Dorus Peelen

Apr 5, 2018, 5:46:19 PM
to LCZero
What regression are you talking about? The current network (92) was matched against the previous highest self-play Elo network (83), and it was actually stronger.

Also, AlphaZero used always-promote, not just -50 Elo, and that did not slow AZ down at all. AlphaGo Zero and Leela Zero used promotion at 220/400 wins. Leela Zero was already set up with gating when the AZ paper came out, and they never switched to the AZ method. (Neither did they use t=1 for the whole game, another big distinction between AGZ and AZ.) However, the AZ paper came later, and that alone is reason to believe its techniques are more refined and faster.

But again, I ask you: do you have any data that backs up the wild claim that "there is a regression which is hard to reverse"?

On Thursday, April 5, 2018 at 23:02:06 UTC+2, Lyudmil Antonov wrote:

checkersp...@gmail.com

Apr 5, 2018, 5:47:13 PM
to LCZero
AlphaZero used the always-promote strategy (no gating: just take the next net when it was trained). This worked perfectly fine for AlphaZero, and there is no reason why it shouldn't work for Leela Zero (Chess). MiniGo (another deep-learning Go project) uses the always-promote strategy as well and, to no surprise, has great results.

The only reason I like gating is to discover bugs. If there was a bug and the net's Elo delta is bad (say -300), then it won't be promoted.

Lyudmil Antonov

Apr 6, 2018, 2:51:18 AM
to LCZero
The data is right in the "Progress" graph on the main page. Compare the progress before and after roughly game 1,000,000. After game 1,200,000, when the -50 allowance was introduced, we have troughs (regressions), consisting of sequences of declining networks. From the low level reached, many more networks are needed to regain the previous peak. We have 3 such major troughs, which together take about 30 networks just to stay level. This is easy to understand, as the -50 rule removes or weakens the principal selection pressure toward better networks. Now compare this graph with the graph of Leela Zero (http://zero.sjeng.org) and the graph in the AlphaZero paper. We do not see these troughs there, because failed networks are not used.
I am writing this out of genuine concern about this slowing of progress, with no intent to criticize anybody. The -50 rule may be one of the reasons for the slowdown. Perhaps it is not the only reason, because the slowdown started some time before the introduction of the -50 rule. Still, it is too pronounced and too early to be natural.

Lyudmil Antonov

Apr 6, 2018, 3:00:04 AM
to LCZero
Network 83 is 4694 Elo and network 92 is 4564 Elo. So how is 92 stronger than 83?

Ignacio Santos

Apr 6, 2018, 3:36:46 AM
to LCZero
Match 90: +20.13 Elo

Michel VAN DEN BERGH

Apr 6, 2018, 3:47:24 AM
to LCZero


On Friday, April 6, 2018 at 9:36:46 AM UTC+2, Ignacio Santos wrote:
Match 90: +20.13 Elo

The progress graph is completely wrong statistically, for two reasons:

(1) The matches are used both for gating and for measuring progress. This creates unavoidable bias and hence is fundamentally wrong.

(2) More seriously: apparently only a few openings are used in each match. So what is really measured is the change in Elo for those specific openings. Since the preferred openings change over time, this destroys the additivity (elo(A,C) = elo(A,B) + elo(B,C)) of the Elo measurements, which is the basic assumption behind the progress graph. Additivity can only reasonably be expected when measurements are taken under the same conditions.

So whereas using no opening book is correct for gating (at least under the "zero" philosophy), it is not correct for Elo measurements.

 

Michel VAN DEN BERGH

Apr 6, 2018, 4:01:40 AM
to LCZero
There is one other aspect I forgot to mention. It is fascinating to see that lczero is learning opening theory all by itself... This entertaining aspect of the matches would be absent if a book were used. But we do not actually need 600 games to see lczero play the same few openings over and over again. A few demonstration games would be enough.

Lyudmil Antonov

Apr 6, 2018, 4:07:31 AM
to LCZero
I agree on both points below. Still, these do not address the main concern: by allowing inferior networks, wrong patterns become harder to overcome.

Michel VAN DEN BERGH

Apr 6, 2018, 4:18:13 AM
to LCZero


On Friday, April 6, 2018 at 10:07:31 AM UTC+2, Lyudmil Antonov wrote:
I agree on both points below. Still, these do not address the main concern: by allowing inferior networks, wrong patterns become harder to overcome.

I was only making _formal_ comments on the flaws of the progress graph, not on what is the best way to move lczero forward.

I do not know enough about "deep learning" to say anything about that. But the progress graph is certainly too flawed to provide any guidance on this issue.

 

Lyudmil Antonov

Apr 6, 2018, 4:21:54 AM
to LCZero
It's the same as with natural selection: if organisms with harmful mutations are allowed to live longer, evolution will be slower.

Jens Hartmann

Apr 6, 2018, 4:39:20 AM
to LCZero
Rigorous protest from a biologist! That's a wrong interpretation of evolutionary theory. Darwin's "survival of the fittest" does NOT mean that only the fittest survive. It means that individuals have to be able to at least barely survive in order to breed and multiply; "fittest" in the sense of "able to survive". Almost every creature on this planet is suboptimally adapted to its environment. Almighty, alien-like creatures will never evolve in gradually changing environments such as Earth's.

kostuek explained the reasons for the "steps back" very clearly here:

kostuek

Apr 6, 2018, 5:55:06 AM
to LCZero
I am running an ID 94 vs ID 83 match of 100 games with an opening book right now. 64 games in, ID 94 shows a regression of about -40 Elo. On the positive side, ID 94 seems to be learning about material imbalances, like preferring three minor pieces over a queen (and eventually losing), which I was not observing in earlier versions.

Lyudmil Antonov

Apr 6, 2018, 6:02:08 AM
to LCZero
As a statistician, I studied the math of the selection process in school. The selection pressure β is defined as the death rate of individuals with deleterious (harmful) mutations. Because harmful mutations are several orders of magnitude more common than beneficial ones, β is the only mechanism for weeding out harmful mutations. Allowing all individuals with harmful mutations to survive and breed (β = 0) means that after several generations harmful mutations will predominate over beneficial ones, and we will have devolution instead of evolution.

Jens Hartmann

Apr 6, 2018, 6:41:43 AM
to LCZero
"Math of selection process in school": what a contradiction. Sorry to say that it doesnt make sense to discuss evolution here. It starts with the definition of "harmful" or "harmless". Evolution is everything else than black and white.

Lyudmil Antonov

Apr 6, 2018, 6:42:43 AM
to LCZero
In regard to network selection, I would like to put the following proposal up for discussion:
After the SPRT patch is implemented (it's still in pull requests), make selection continuous, exploiting the sequential nature of SPRT: take every 10th (or so) game, at random with respect to testers, and feed it into an SPRT against the most recent network. We expect continuous improvement, so the LLR should reach the upper limit; in case of failure (the LLR reaching the lower limit), start the SPRT again.
This approach would be feasible if the network weights change after each game, or after a batch of games. I don't know whether that is the case here.
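For reference, the core of such a test is small. A sketch of the trinomial GSPRT approximation (the elo0/elo1 bounds and other parameters here are illustrative, not the values in the pending patch):

import math

def llr(wins, draws, losses, elo0, elo1):
    # approximate log-likelihood ratio of H1 (delta = elo1)
    # versus H0 (delta = elo0) for a trinomial match result
    if wins == 0 or losses == 0:
        return 0.0
    n = wins + draws + losses
    s = (wins + 0.5 * draws) / n             # mean score
    var = (wins + 0.25 * draws) / n - s * s  # per-game score variance
    s0 = 1 / (1 + 10 ** (-elo0 / 400))       # expected score under H0
    s1 = 1 / (1 + 10 ** (-elo1 / 400))       # expected score under H1
    return n * (s1 - s0) * (2 * s - s0 - s1) / (2 * var)

def sprt(wins, draws, losses, elo0=-60.0, elo1=-40.0, alpha=0.05, beta=0.05):
    # 'accept' means promote, 'reject' means fail, else keep playing
    lo, hi = math.log(beta / (1 - alpha)), math.log((1 - beta) / alpha)
    ratio = llr(wins, draws, losses, elo0, elo1)
    return "accept" if ratio > hi else "reject" if ratio < lo else "continue"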

Lyudmil Antonov

Apr 6, 2018, 6:46:00 AM
to LCZero
Well, then better not to discuss it, if you don't know that there are books and papers with mathematical descriptions of the selection process.

Jens Hartmann

Apr 6, 2018, 7:35:44 AM
to LCZero
Of course I know. I simply stated that you obviously misunderstood evolution. You might be a genius statistician, I don't know. But you statisticians should stop thinking that the whole world can be calculated, modelled and described by numbers. The papers deal with aspects of evolutionary theory, but not with the evolutionary process per se, which would be just arrogance. That's why my discussion with you ends here.

Lyudmil Antonov

Apr 6, 2018, 7:54:54 AM
to LCZero
This is the exact attitude I have seen from many of the biology and medical students in my classes when they confront math that is hard for them to digest. Yes, I know: biology is something extraordinary, not describable by math and accessible only to a select few who have the "intuition", unlike lowly physics and chemistry, which were mathematically contaminated centuries ago. Fortunately, there are biologists and medics who make serious efforts to understand, and some of them become better than the statisticians in these theoretical fields.

Alexander Lyashuk

Apr 6, 2018, 8:12:02 AM
to lanton...@gmail.com, LCZero
It seems that this discussion has drifted a bit from the original topic.
I'd suggest avoiding getting personal and keeping the discussion constructive.


Lyudmil Antonov

Apr 6, 2018, 8:15:15 AM
to LCZero
Agreed

Юрий Павлович

Apr 6, 2018, 8:48:23 AM
to LCZero
They're not inferior. The error landscape in which the parameters move during training is not convex. It's full of pits and mountains, and if you want to find the deepest pit (lowest error) but measure your success only by elevation, it might wrongly seem that you are experiencing setbacks during your travels, when in reality only closeness to the goal matters. The Elo graph doesn't represent the latter, so these networks are not really inferior. They're still accumulating knowledge (by moving parameters closer to the goal), which is important for gradual improvement. If we filter out all those "inferior" networks, then instead of small steps we would require the predecessors to make huge leaps, which is still hard for modern optimization techniques.

On Friday, April 6, 2018 at 13:07:31 UTC+5, Lyudmil Antonov wrote:

Dorus Peelen

Apr 6, 2018, 5:21:15 PM
to LCZero
Network 83 is 4694 Elo and network 92 is 4564 Elo. So how is 92 stronger than 83?

The Elo in the graph and on the networks page is a bit misleading, because it is based on the cumulative Elo of 83->88->89->90->91->92. However, in a straight-up match between 83 and 92 (match ID 90), 92 ended up +20 Elo.
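The size of that gap is about what accumulated match noise predicts. A toy illustration (numbers invented, draws ignored):

import math, random

def measured_delta(true_delta=0.0, games=400):
    # simulate one gating match and return the *measured* Elo delta
    p = 1 / (1 + 10 ** (-true_delta / 400))
    score = sum(random.random() < p for _ in range(games)) / games
    return -400 * math.log10(1 / score - 1)

random.seed(1)
chain = sum(measured_delta() for _ in range(5))  # 83 -> 88 -> ... -> 92
direct = measured_delta()                        # 83 vs 92 head to head
print(round(chain), round(direct))  # chain noise grows ~sqrt(#links) larger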

Also, I'm not sure why evolutionary theory is involved here. Networks are not "bred" but "trained". While it is true that a single training step can harm the network as much as it can improve it, over the course of the entire training process the network will generally move in the right direction, because it's guided by the MCTS. During training the network converges towards the MCTS output of the previous network, and in order to improve its quality it might sometimes need to break down some internal structure to build up another, more efficient or more important one.

On Friday, April 6, 2018 at 09:00:04 UTC+2, Lyudmil Antonov wrote:

Joona Kiiski

Apr 6, 2018, 10:20:15 PM
to LCZero

The Elo in the graph and on the networks page is a bit misleading, because it is based on the cumulative Elo of 83->88->89->90->91->92. However, in a straight-up match between 83 and 92 (match ID 90), 92 ended up +20 Elo.

The error bars being?

nogpu

Apr 7, 2018, 1:00:20 PM
to LCZero
Network ID 102 is over 100 Elo weaker than 101, so why is it being used?
Shouldn't it have been ignored?

Gary Linscott

Apr 7, 2018, 2:26:24 PM
to LCZero
Changed it to -150. The match games usually end up playing only one opening, so they are not representative right now. We need to introduce some variety there, which is being worked on.
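For what it's worth, the standard way to get such variety in the AlphaZero paper is Dirichlet noise mixed into the root priors; whether our match games will use exactly this is an assumption on my part:

import numpy as np

def noisy_root_priors(priors, epsilon=0.25, alpha=0.3):
    # blend the network's root move priors with Dirichlet noise so
    # self-play and match games don't repeat one opening forever
    priors = np.asarray(priors, dtype=float)
    noise = np.random.dirichlet([alpha] * len(priors))
    return (1 - epsilon) * priors + epsilon * noise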

nogpu

Apr 7, 2018, 3:04:55 PM
to LCZero
Ah, OK. A small amount of noise or <can't_remember_the_other_parameter> is required, I guess. Makes sense.

I assumed it was all under control, but good to hear that.
