Did 32585 outperform Alpha Zero at TCEC?


Jon Mike

Jan 20, 2019, 12:46:30 AM
to LCZero
Final score of TCEC Alpha Zero simulation tournament:
 32585 vs SF8 W/D/L (24 / 71 / 5). Plus SF8 had more hash and opening book. 
Alpha0 vs SF8 W/D/L (25 / 72 / 3) 
 I'd say that's A0 level or better!
What do you think?

Alexey Eromenko

Jan 20, 2019, 1:25:10 AM
to LCZero
Yes, AlphaZero is still stronger.
Leela needs more training or more fundamental research to match or surpass Alpha.

Alexey Eromenko

Jan 20, 2019, 1:25:50 AM
to LCZero
Alpha had 28 wins + 72 draws and ZERO losses against SF8.

Jhor Vi

Jan 20, 2019, 1:45:33 AM
to LCZero
Another advantage for Stockfish 8 in the TCEC match is better time management, because there is no fixed time per move as there was in the A0 match.

Dietrich Kappe

Jan 20, 2019, 1:54:17 AM
to LCZero
The first 6 games had lc0 at half strength. Also, sf8 had openings, bigger hash and tb, which it didn’t have against a0.

When a0 played sf8 with an opening book, it did lose a few games (see the Science article) and drew a whole bunch.

With DeepMind pulling out its a0 only for Christmas and world championships, we won’t be able to answer this question the way it ought to be: with a match.

Jon Mike

Jan 20, 2019, 1:54:28 AM
to LCZero
Alexey

This is taken directly from the A0 paper

                              Win  Draw  Loss
White  AlphaZero  Stockfish    25    25     0
Black  Stockfish  AlphaZero     3    47     0

and shows that stockfish had 3 wins as black...

Jon Mike

Jan 20, 2019, 1:55:15 AM
to LCZero
(sorry I meant "and shows that stockfish had 3 wins as white")

Dietrich Kappe

Jan 20, 2019, 2:02:13 AM
to LCZero
So, lc0 was 56.5/94 or 60.1%

A0 was +155 -6 =839 for 57.45% in its 100 game match with book.

Even if you leave in the first 6 games, lc0 outperformed a0.

Dietrich Kappe

Jan 20, 2019, 2:04:30 AM
to LCZero
Sorry, 1000 game match.

Alexey Eromenko

Jan 20, 2019, 2:09:57 AM
to LCZero
Stockfish never won against Alpha.

Alpha won 25 games with white and 3 games with black. 28 games won by Alpha.
Stockfish 8 won ZERO games.

Dietrich Kappe

Jan 20, 2019, 2:33:13 AM
to LCZero
You need to read the Science article, not the preprint (see the open access version referenced in this DeepMind blog post https://deepmind.com/blog/alphazero-shedding-new-light-grand-games-chess-shogi-and-go/). Additional matches were played, this time with time management and an opening book (TCEC, I think). The score was +155 -6 =839. One of a0’s losses, for your enjoyment: https://lichess.org/P5s8PC2M

Vassilis

Jan 20, 2019, 4:27:46 AM
to LCZero
The most accurate way to compare lc0 with a0 is a direct match between the two.
Something that is unlikely in my opinion. I'm sure DeepMind would object!

Mustafa Ünal

Jan 20, 2019, 5:51:27 AM
to LCZero
I think Leela has surpassed A0.

Mirza Hadzic

Jan 20, 2019, 6:08:42 AM
to LCZero
Hiding your engine behind closed doors is OK, unless you start cherry-picking games and results to publish under cherry-picked rules. Then it becomes cheating. Maybe there are all sorts of engines around the world hidden behind closed doors, but among engines ready to compete, SF and Leela are currently the best.

dilyan...@gmail.com

Jan 20, 2019, 6:09:14 AM
to LCZero
AlphaZero did not have an opening book, did it? But Leela had an opening book.

dilyan...@gmail.com

Jan 20, 2019, 6:12:52 AM
to LCZero
AlphaZero has 6 losses in 1000 games; Leela has 5 losses in 100 games.

LuckyDay

Jan 20, 2019, 7:55:49 AM
to LCZero
Leela does not play with an opening book. If you mean that the TCEC matches used an opening suite to start games from set opening positions, then yes, that was done, and it is standard in computer chess competitions to improve variation in games. This has the effect of increasing win and loss rates through openings that are occasionally slightly unbalanced and outside of an engine's comfort zone.

Based on the results, Leela overall had a better score against SF8 than A0 did in the revised paper. This must be taken with a grain of salt, however, as these were much shorter time control matches than what A0 played vs SF8. I think, though, it is safe to say T30 Leela is at least comparable in strength to the old A0, maybe even slightly superior. DeepMind, however, have claimed they have a more advanced A0 in the works which is apparently much stronger than the old A0, so there will always be a push to get even stronger.

Sean Francis

Jan 20, 2019, 8:19:25 AM
to LCZero
There were actually two tournaments of AlphaZero against Stockfish 8. In the one where A0 won 27-73-0, it was using TPUs and so had a lot more computing power. The recent lczero - Stockfish match was set up to match the conditions of the second tournament, where A0 won 25-72-3.

Alexey Eromenko

Jan 20, 2019, 10:09:03 AM
to LCZero
The original match was won by Alpha vs SF8 +28=72-0. Alpha had zero losses. Not a single loss vs Stockfish! Until Leela can show that it can survive 100 games against SF8 without a single defeat, Alpha is stronger.

This match between Leela and SF8 showed a much worse result of +24 =71 -5.

Jon Mike

Jan 20, 2019, 12:04:50 PM
to LCZero

From pages 4-5:

"We evaluated the fully trained instances of AlphaZero against Stockfish (8)...
playing 100 game matches...
at tournament time controls of one minute per move
AlphaZero ...used a single machine with 4 TPUs
-----------------------------
TABLE:
                              Win  Draw  Loss
White  AlphaZero  Stockfish    25    25     0
Black  Stockfish  AlphaZero     3    47     0   (3 losses for A0 as black / 3 wins for SF8 as white)
-----------------------------

Stockfish and Elmo played at their strongest skill level using 64 threads and a hash size of 1GB. AlphaZero convincingly defeated all opponents, losing zero games to Stockfish (...but what about the table above, which shows 3 losses as black to SF8?...) and eight games to Elmo... AlphaZero searches just 80 thousand positions (nodes) per second ... and ... 70 million (nodes per second) for Stockfish."

Why does the paper state "losing zero games to Stockfish" and then reference the table above showing 3 losses?



Alexandre

Jan 20, 2019, 1:06:30 PM
to LCZero
It is 25 wins as White, 3 wins as Black, 25 draws as White, 47 draws as Black, and 0 losses.

Dietrich Kappe

Jan 20, 2019, 1:40:17 PM
to LCZero
So, responding to the criticism that these were not realistic conditions, a0 played a 1000-game match WITH OPENINGS AND TC and achieved a record of +155 -6 =839. Let me repeat: when openings and game time controls were introduced, a0 had a lot of draws and some losses. This is detailed in the Science paper and supplementary materials.

Leela played sf8 under these conditions, not the original no opening conditions.

So, looking at the correct 1000 game alpha zero vs sf8 match, a0 scored 57.45% vs sf8 (those 839 draws add up). In the 94 games with lc0 playing full strength, it had a record of +23 -4 =67, so scored 60.1%, better than a0.
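
The percentages above follow from counting a win as one point and a draw as half a point. A minimal Python sketch of that arithmetic, using only the W/D/L figures quoted in this thread (the function name `score_fraction` is made up for illustration):

```python
from fractions import Fraction


def score_fraction(wins, draws, losses):
    """Return points scored / games played, counting a draw as half a point."""
    games = wins + draws + losses
    # (wins + draws/2) / games, kept exact by doubling numerator and denominator
    return Fraction(2 * wins + draws, 2 * games)


# W/D/L records as quoted in the thread (not independently verified here)
results = {
    "lc0 32585, all 100 games": (24, 71, 5),
    "lc0 32585, 94 full-strength games": (23, 67, 4),
    "a0, 1000-game match": (155, 839, 6),
}

for label, wdl in results.items():
    print(f"{label}: {float(score_fraction(*wdl)):.2%}")
```

This prints 59.50%, 60.11% and 57.45% respectively, so on score percentage alone lc0 comes out ahead with or without the first 6 games. It says nothing about the differences in time control or hardware between the matches.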

Before you trot out the same 100 game match again, played under different conditions, please read the full Science article and supplemental materials. You, sir, are entitled to your own opinions, but not your own facts. Please read.

Sam Jukes

Jan 20, 2019, 2:36:09 PM
to LCZero
How do you know DeepMind has a stronger A0 in the works? Where did you get that information? That's pretty serious news!

Jon Mike

Jan 20, 2019, 2:55:47 PM
to LCZero
I see now, Alexandre: I was misinterpreting the table in the original paper.

Also, thank you Dietrich for the clarifications.

Congratulations Lc0 (32585) you have officially surpassed Alpha Zero!    

How far will she go? Do you think Lc0 could move beyond +200 elo vs current SFdev?
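
For a sense of scale on the +200 Elo question: under the standard logistic Elo model, a rating gap maps to an expected score per game. A rough sketch (the helper name `expected_score` is illustrative, and real engine rating lists also depend on time control, hardware and openings, which this ignores):

```python
def expected_score(elo_diff):
    """Expected score per game for the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


# Expected score for a hypothetical +200 Elo advantage
print(f"+200 Elo -> {expected_score(200):.1%} expected score")
```

Under that model, a +200 Elo edge corresponds to an expected score of roughly 76% per game.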

LuckyDay

Jan 20, 2019, 9:07:59 PM
to LCZero
Demis Hassabis was interviewed a few times at the recent world chess championship between Caruana and Carlsen (you can find the interviews on YouTube). He stated that there have been ongoing improvements to AlphaZero, and that the latest AlphaZero was around 3600 Elo and supposedly 100 Elo stronger than the current strongest chess engine (presumably SF 10). It is not clear what Elo scale he was using, but since DeepMind have gone off the CCRL Elo list before, presumably he was referring to the CCRL 40/40 list (on the CCRL 40/4 list, 3600 Elo would most certainly not be 100 Elo stronger than SF). This is all conjecture, but it does indicate that DeepMind have still been doing some work on AlphaZero. It was sort of confirmed by a DeepMind programmer, matthewlai, posting on talkchess that DeepMind had more things to show in regard to AlphaZero and that they have indeed been doing further work on it beyond what was shown in the recently released paper. He was giving Leela devs some pretty helpful info and clarification on the AlphaZero paper when it was released; you can look up the talkchess thread for more details.

Jhor Vi

Jan 20, 2019, 9:54:25 PM
to LCZero
The first match between A0 and SF8 didn't have an opening book, so SF kept repeating the same losing openings, like a certain variation of the Queen's Indian.



Alexey Eromenko

Jan 20, 2019, 9:55:08 PM
to LCZero
As long as Leela is still losing games in 100-game matches, it is nowhere close to Alpha, sorry to burst your bubble.

Leela must train a lot more to survive a 100-game match against Stockfish 8 with ZERO losses.

Patrick Hill

Jan 20, 2019, 10:33:36 PM
to LCZero
Please name one chess competition ever (except for the one in your head) where the stated goal was to produce the least number of losses.

Your noble attempt to tamp down overenthusiasm is causing you to miss the point.

Dietrich Kappe

Jan 20, 2019, 10:46:54 PM
to LCZero
You came again with your 100 game match under different conditions. I therefore can only conclude that you cannot read.

Daniel Smith

Jan 21, 2019, 10:47:10 AM
to LCZero
In the original a0 vs SF match, Stockfish did not have endgame tablebases.
In the updated 2018 paper, did SF have endgame tablebases?