Ender 62 Released


Dietrich Kappe

Oct 23, 2018, 2:16:26 PM
to LCZero

Jon Mike

Oct 23, 2018, 4:03:44 PM
to LCZero
Thanks Dietrich, I am testing it now.  6 block nets are tailored for cpu users!  

Of course ender62 is weak in the opening for many reasons besides its lack of knowledge of castling.  I am interested in setting up test matches from 14 man positions. 

Is there an opening book which ends at 14 man positions?  Or even an opening book that starts at 7 man positions?  If someone shared a link to such an opening book, we could easily match lc0+ender62 vs lc0+other weights.

In my quick tests from 16-man positions, I paired 9149 vs Ender at a 1:4 time ratio (Ender got 4x the time) and 9149 seemed to dominate.  Looking forward to others sharing their tests and hopefully an opening book to get the engines to a 14-man position for matches.  :)

Dietrich, again thanks for the hard work!  I will be testing Ender62 this week!

It is SO fast on cpu!  Why is this?  Even in the opening position I am getting about 1 Kilonode a second (compared to the 20 block net, 11248 @ 11 seconds per KN on cpu and the 6 block net, 9149) 


Dietrich Kappe

Oct 23, 2018, 4:08:58 PM
to LCZero
It isn't trained on 16-man positions. The consequences are pretty extreme, as it gets confused with too many pieces, so I'm not surprised. It is ideal for 12 men and fewer.

Dietrich Kappe

Oct 23, 2018, 4:20:56 PM
to LCZero
Here is a set of 12-man positions on which I'm running Ender62 vs 9149 (no TB), each position played twice with colors reversed. Ender62 is clearly better (as expected).


12x200.epd

Jon Mike

Oct 23, 2018, 6:24:55 PM
to LCZero
Hi Dietrich, thanks for sharing, but I'm afraid the link is broken.  When you have the time, could you fix it?  :)

Dietrich Kappe

Oct 23, 2018, 7:06:43 PM
to LCZero
Here is the result with tc 0.25 sec per move.

Score of ID9149 vs Ender62: 94 - 144 - 162 [0.438] 400
Elo difference: -43.66 +/- 26.34
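
For readers who want to check figures like these, here is a minimal sketch of how an Elo difference can be derived from a win/loss/draw record under the usual logistic model. The function names are made up for illustration, and the error-margin calculation is an assumption (a normal approximation at 95%); the thread does not say which formula the match tool used, so small differences in the margin are to be expected.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a mean score in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def match_elo(wins, losses, draws, z=1.959964):
    """Return (score, elo, margin) from the first player's point of view.

    z is the normal quantile for a 95% interval (an assumption, see above).
    """
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    half_width = z * math.sqrt(var / n)
    lo = elo_from_score(score - half_width)
    hi = elo_from_score(score + half_width)
    return score, elo_from_score(score), (hi - lo) / 2.0

# Record posted above: ID9149 vs Ender62, 94 wins, 144 losses, 162 draws.
score, elo, margin = match_elo(wins=94, losses=144, draws=162)
print(f"score={score:.3f} elo={elo:.2f} +/- {margin:.2f}")
```

With the posted record this gives the same -43.66 Elo difference and a margin close to the posted 26.34.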

Dietrich Kappe

Oct 23, 2018, 7:07:22 PM
to LCZero
Both links work for me.

Jon Mike

Oct 24, 2018, 1:44:08 AM
to LCZero
Working great and great work!

On Tuesday, October 23, 2018 at 6:07:22 PM UTC-5, Dietrich Kappe wrote:
Both links work for me.

kirill57

Oct 24, 2018, 10:25:27 PM
to LCZero
Dietrich,
Do you think that Ender could be trained as a distributed project for a larger net? Combining a strong opening net with a strong Ender net looks very promising.


Jupiter

Oct 25, 2018, 4:16:12 AM
to LCZero
Good result for Ender62.

Conditions:
GPU: Tesla V100 in gcloud (average nps around 20K)
Threads: 2
TC: 12s+0.1s
Start opening: 12men_start.pgn (positions used were randomly selected by cutechess-cli)
Total Unique positions: 50
Total games: 100 (side reversed)

   # PLAYER                 :  RATING  ERROR  POINTS  PLAYED   (%)  D(%)    W    D    L
   1 Lc0 v0.18.1 ender62    :    42.3   26.0    56.0     100    56    84   14   84    2
   2 Lc0 v0.18.1 9149       :     0.0   ----    44.0     100    44    84    2   84   14

Get the test games.
Get the 12men_start.pgn, 1800 unique positions with move history. Each end position was analyzed by SF to be within +/-75cp after 3s of analysis. Game sources are from cccc1, twic, and tcec.

Matt Blakely

Oct 25, 2018, 12:09:18 PM
to LCZero
You should test against SF9 w/ TB.  When Ender consistently outperforms it, which it occasionally can do, that's when we need to start serious integration work

Although training a full-sized 20x256 Ender net may be required if we want to beat SF badly.

Jupiter

Oct 25, 2018, 12:54:45 PM
to LCZero
Why should I? You test it :-) But of course that is the wrong approach. To measure small gains, don't use an opponent that is too strong. Even SF without EGT is enough, but you have to control the conditions so that SF is not slaughtering Lc0 in the endgame. First get close to it from below, then equalize, and finally beat it. Playing against a net of the same size is the most logical test to do. The next step is probably to test against ID 11258, a bigger net.

Jon Mike

Oct 26, 2018, 6:47:59 PM
to LCZero
Here are the results of a recent round-robin tournament using 12men_start.pgn.
100 games were played in total at a 1 minute + 0s TC, run on my PC with no GPU and 2 CPU threads.
(Because of the short time control, these games were played with "first guesses" by the engines.)

#   Engine                    vs 1          vs 2          vs 3          vs 4          Total
1   Stockfish 9 64 POPCNT     ----          17.5 - 7.5    17.5 - 7.5    19.0 - 6.0    54.0/75
2   Lc0 v0.18.0_Ender62       7.5 - 17.5    ----          15.5 - 9.5    18.5 - 6.5    41.5/75
3   Lc0 v0.18.0_9149          7.5 - 17.5    9.5 - 15.5    ----          18.5 - 6.5    35.5/75
4   Lc0 v0.18.0-dev_11250     6.0 - 19.0    6.5 - 18.5    6.5 - 18.5    ----          19.0/75

It's amazing that lc0 plays so strongly with such a huge gap in endgame strength (11250 vs SF9).

Walid Doknichot

Oct 26, 2018, 10:08:42 PM
to LCZero
How do you use 12men_start.pgn in lc0 with Ender 62?

evalon32

Oct 26, 2018, 11:30:58 PM
to LCZero
I've added tabs for Ender 38 and 62 to the endgame spreadsheet with results from the tbeval benchmark (plotted as a function of MCTS nodes on a log scale).
Some unexpected patterns emerged (optimal move probability going down before going up as nodes increase). They don't appear to be a fluke, nor are they unique to Ender nets (reproduced with test10 and test20). If anyone has an explanation, I'd love to hear it.

Dietrich Kappe

Oct 27, 2018, 12:38:55 AM
to LCZero
The chart is not intuitive to me. How is one to interpret it?

Jupiter

Oct 27, 2018, 12:41:08 AM
to LCZero
I would like to understand what this spreadsheet is all about. Let me start with the Ender62 sheet. There is a chart titled KQK. The y-axis is numbered 0 to 1. What is the y-axis? The x-axis is numbered 1 to 100,000. What is the x-axis?

1. There is the series RNG_Opt, which is the probability that a random move is optimal (listed in 'bm'), according to https://github.com/evalon32/lc0/blob/endgame-benchmarks/endgame-benchmarks/tbeval.py
2. There is RNG_OK, which is the probability that a random move is winning (NOT listed in 'am')
3. MCTS_T0_Opt
And others

So you have a collection of KQK positions generated by tbgen.py. Then you let Lc0, using Ender62 as its weights, evaluate those positions. Let's start with RNG_Opt, the probability that a random move is optimal (listed in 'bm'). Could you explain what that is? Why is there a random move? Where does it come from?

Jupiter

Oct 27, 2018, 1:03:31 AM
to LCZero
You can use that file as an opening file for engine matches and tournaments.

In cutechess-cli, you can use the -openings option, for example:
-openings file=c:\chess\start\12men_start.pgn order=random

The latest cutechess GUI and CLI for Windows can be found at

Usage:

  cutechess-cli -engine [eng_options] -engine [eng_options]... [options]

Options:

  -help Display this information
  -version Display the version number
  -engines Display a list of configured engines and exit
  -engine OPTIONS Add an engine defined by OPTIONS to the tournament
  -each OPTIONS Apply OPTIONS to each engine in the tournament
  -variant VARIANT Set the chess variant to VARIANT, which can be one of:
'3check': Three-check Chess
'5check': Five-check Chess
'aiwok': Ai-Wok (Makruk variant)
'almost': Almost Chess
'amazon': Amazon Chess
'andernach': Andernach Chess
'antiandernach': Anti-Andernach Chess
'antichess': Antichess (Losing Chess)
'asean': ASEAN-Chess
'atomic': Atomic Chess
'berolina': Berolina Chess
'cambodian': Ouk Chatrang (Cambodian Chess)
'capablanca': Capablanca Chess
'caparandom': Capablanca Random Chess
'chancellor': Chancellor Chess (9x9)
'changeover': Change-Over Chess
'checkless': Checkless Chess
'courier': Courier Chess
'chessgi': Chessgi (Drop Chess)
'chigorin': Chigorin Chess
'circulargryphon': Circular Gryphon Chess
'coregal': Co-regal Chess
'crazyhouse': Crazyhouse (Drop Chess)
'discplacedgrid': Displaced Grid Chess
'embassy': Embassy Chess
'extinction': Extinction Chess
'fischerandom': Fischer Random Chess/Chess 960
'giveaway': Giveaway Chess (Losing Chess)
'gothic': Gothic Chess
'grand': Grand Chess
'grid': Grid Chess
'gridolina': Berolina Grid Chess
'gryphon': Gryphon Chess
'horde': Horde Chess (v2)
'janus': Janus Chess
'karouk': Kar Ouk (One-check Ouk)
'kinglet': Kinglet Chess
'kingofthehill': King of the Hill Chess
'knightmate': Knightmate
'loop': Loop Chess (Drop Chess)
'losalamos': Los Alamos Chess
'losers': Loser's Chess
'makruk': Makruk (Thai Chess)
'modern': Modern Chess (9x9)
'pocketknight': Pocket Knight Chess
'racingkings': Racing Kings Chess
'seirawan': S-Chess (Seirawan Chess)
'shatranj': Shatranj
'slippedgrid': Slipped Grid Chess
'simplifiedgryphon': Simplified Gryphon Chess
'sittuyin': Sittuyin (Myanmar Chess)
'suicide': Suicide Chess (Losing Chess)
'superandernach': Super-Andernach Chess
'threekings': Three Kings Chess
'twokings': Two Kings Each Chess (Wild 9)
'twokingssymmetric': Symmetrical Two Kings Each Chess
'standard': Standard Chess (default).
  -concurrency N Set the maximum number of concurrent games to N
  -draw movenumber=NUMBER movecount=COUNT score=SCORE
Adjudicate the game as a draw if the score of both
engines is within SCORE centipawns from zero for at
least COUNT consecutive moves, and at least NUMBER full
moves have been played.
  -resign movecount=COUNT score=SCORE
Adjudicate the game as a loss if an engine's score is
at least SCORE centipawns below zero for at least COUNT
consecutive moves.
  -maxmoves N Adjudicate the game as a draw if the game is still
ongoing after N or more full moves have been played.
This limit is not in action if set to zero.
  -tb PATHS Adjudicate games using Syzygy tablebases. PATHS should
be semicolon-delimited list of paths to the compressed
tablebase files. Only the DTZ tablebase files are
required.
  -tbpieces N Only use tablebase adjudication for positions with
N pieces or less.
  -tbignore50 Disable the fifty move rule for tablebase adjudication.
  -tournament TYPE Set the tournament type to TYPE, which can be one of:
'round-robin': Round-robin tournament (default)
'gauntlet': First engine plays against the rest
'knockout': Single-elimination tournament.
'pyramid': Every engine plays against all predecessors
  -event EVENT Set the event/tournament name to EVENT
  -games N Play N games per encounter. This value should be set to
an even number in tournaments with more than two players
to make sure that each player plays an equal number of
games with white and black pieces.
  -rounds N Multiply the number of rounds to play by N.
For two-player tournaments this option should be used
to set the total number of games to play.
  -sprt elo0=ELO0 elo1=ELO1 alpha=ALPHA beta=BETA
Use a Sequential Probability Ratio Test as a termination
criterion for the match. This option should only be used
in matches between two players to test if engine A is
stronger than engine B. Hypothesis H1 is that A is
stronger than B by at least ELO0 ELO points, and H0
(the null hypothesis) is that A is not stronger than B
by at least ELO1 ELO points. The maximum probabilities
for type I and type II errors outside the interval
[ELO0, ELO1] are ALPHA and BETA. The match is stopped if
either H0 or H1 is accepted or if the maximum number of
games set by '-rounds' and/or '-games' is reached.
  -ratinginterval N Set the interval for printing the ratings to N games
  -debug Display all engine input and output
  -openings file=FILE format=FORMAT order=ORDER plies=PLIES start=START
Pick game openings from FILE. The file's format is
FORMAT, which can be either 'epd' or 'pgn' (default).
Openings will be picked in the order specified by ORDER,
which can be either 'random' or 'sequential' (default).
The opening depth is limited to PLIES plies. If PLIES is
not set the opening depth is unlimited. In sequential
mode START is the number of the first opening that will
be played. The minimum value for START is 1 (default).
  -bookmode MODE Set Polyglot book mode to MODE, which can be one of:
'ram': The whole book is loaded into RAM (default)
'disk': The book is accessed directly on disk.
  -pgnout FILE [min][fi]
Save the games to FILE in PGN format. Use the 'min'
argument to save in a minimal/compact PGN format. Only
finished games are saved for argument 'fi'.
  -epdout FILE Save the end position of the games to FILE in FEN format.
  -recover Restart crashed engines instead of stopping the match
  -repeat [N] Play each opening twice (or N times). Unless the -noswap
option is used, the players swap sides after each game.
So they get to play the opening on both sides. Please
note that a new encounter will use a new opening.
  -noswap Do not swap sides of paired engines
  -seeds N Set the first N engines as seeds in the tournament
  -site SITE Set the site/location to SITE
  -srand N Set the seed for the random number generator to N
  -wait N Wait N milliseconds between games. The default is 0.

Engine options:

  conf=NAME Use an engine with the name NAME from Cute Chess'
engines.json configuration file.
  name=NAME Set the name to NAME
  cmd=COMMAND Set the command to COMMAND
  dir=DIR Set the working directory to DIR
  arg=ARG Pass ARG to the engine as a command line argument
  initstr=TEXT Send TEXT to the engine's standard input at startup.
TEXT may contain multiple lines seprated by '\n'.
  stderr=FILE Redirect standard error output to FILE
  restart=MODE Set the restart mode to MODE which can be:
'auto': the engine decides whether to restart (default)
'on': the engine is always restarted between games
'off': the engine is never restarted between games
Setting this option does not prevent engines from being
restarted between rounds in a tournament featuring more
than two engines.
  trust Trust result claims from the engine without validation.
By default all claims are validated.
  proto=PROTOCOL Set the chess protocol to PROTOCOL, which can be one of:
'xboard': The Xboard/Winboard/CECP protocol
'uci': The Universal Chess Interface
  tc=TIMECONTROL Set the time control to TIMECONTROL. The format is
moves/time+increment, where 'moves' is the number of
moves per tc, 'time' is time per tc (either seconds or
minutes:seconds), and 'increment' is time increment
per move in seconds.
Infinite time control can be set with 'tc=inf'.
  st=N Set the time limit for each move to N seconds.
This option can't be used in combination with "tc".
  timemargin=N Let engines go N milliseconds over the time limit.
  book=FILE Use FILE (Polyglot book file) as the opening book
  bookdepth=N Set the maximum book depth (in fullmoves) to N
  whitepov Invert the engine's scores when it plays black. This
option should be used with engines that always report
scores from white's perspective.
  depth=N Set the search depth limit to N plies
  nodes=N Set the node count limit to N nodes
  ponder Enable pondering if the engine supports it. By default
pondering is disabled.
  option.OPTION=VALUE Set custom option OPTION to value VALUE
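
As a rough illustration of how the options in the help text above fit together for a two-engine match from the 12-man opening file, here is a small sketch that assembles a cutechess-cli command line. Only flags listed in the help text are used. The engine names, the `lc0` command, the weight file names and the `--weights` argument are assumptions made for the example, and how `-rounds` and `-games` combine may differ between cutechess-cli versions, so check the result against your own installation.

```python
import shlex

def engine_spec(name, cmd, extra_args=()):
    """Build the tokens that follow one -engine flag."""
    spec = [f"name={name}", f"cmd={cmd}"]
    # Each extra command-line argument for the engine is passed via arg=...
    spec += [f"arg={a}" for a in extra_args]
    return spec

def build_match_command(engines, openings_pgn, tc, rounds, pgn_out):
    """Assemble a cutechess-cli argument list for a match between engines."""
    args = ["cutechess-cli"]
    for name, cmd, extra_args in engines:
        args += ["-engine", *engine_spec(name, cmd, extra_args)]
    args += [
        "-each", "proto=uci", f"tc={tc}",
        "-openings", f"file={openings_pgn}", "format=pgn", "order=random",
        "-repeat",                 # play each opening twice, sides swapped
        "-rounds", str(rounds),    # total games for a two-player match
        "-pgnout", pgn_out,
    ]
    return args

# Hypothetical engine setup: paths and weight file names are placeholders.
engines = [
    ("ender62", "lc0", ["--weights=ender62.pb.gz"]),
    ("9149", "lc0", ["--weights=9149.pb.gz"]),
]
cmd = build_match_command(engines, "12men_start.pgn",
                          tc="12+0.1", rounds=100, pgn_out="games.pgn")
print(" ".join(shlex.quote(a) for a in cmd))
```

Building the command as a list keeps each option a separate token, so the same list can be handed to `subprocess.run` without shell quoting problems.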

evalon32

Oct 27, 2018, 1:17:20 AM
to LCZero


On Saturday, October 27, 2018 at 12:41:08 AM UTC-4, Jupiter wrote:
> I would like to understand what this spreadsheet is all about. Let me start with the Ender62 sheet. There is a chart titled KQK. The y-axis is numbered 0 to 1. What is the y-axis?
Y is the metric value, averaged over the 1000 positions. For most metrics (xxx_Opt and xxx_OK), it's a probability (of making an optimal move or an "OK" move). For xxx_Q it's the Q value.

> The x-axis is numbered 1 to 100,000. What is the x-axis?
Number of nodes searched. (But this is different from the test10/test20/test30 tabs: those use the network number as X.)

> 1. There is the series RNG_Opt, which is the probability that a random move is optimal (listed in 'bm'), according to https://github.com/evalon32/lc0/blob/endgame-benchmarks/endgame-benchmarks/tbeval.py
> 2. There is RNG_OK, which is the probability that a random move is winning (NOT listed in 'am')
> 3. MCTS_T0_Opt
> And others
>
> So you have a collection of KQK positions generated by tbgen.py. Then you let Lc0, using Ender62 as its weights, evaluate those positions.
Correct.

> Let's start with RNG_Opt, the probability that a random move is optimal (listed in 'bm'). Could you explain what that is? Why is there a random move? Where does it come from?
The random move is hypothetical. Let's say a position has 30 legal moves and only one is optimal. Then RNG_Opt=1/30. I thought it might be useful as a baseline (how does Leela compare to a random mover?)

Jupiter

Oct 27, 2018, 2:28:28 AM
to LCZero
> The random move is hypothetical. Let's say a position has 30 legal moves and only one is optimal. Then RNG_Opt=1/30. I thought it might be useful as a baseline (how does Leela compare to a random mover?)
This is why it is horizontal (no change in y as x increases): the engine has nothing to do with it. It is a property of the position. You have 1000 positions, so this is the average. Perhaps it can be removed from the chart; it just adds complexity that does not relate to the engine. A simple metric that should be in the chart is the number of correct moves chosen by the engine divided by the total positions tried. Then plot it from node 1 and up, so we can see how the success rate progresses as the number of nodes searched increases.
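
The simple metric described here (correct moves divided by positions tried, per node budget) could be sketched roughly as follows. The data layout, function names and sample moves are all hypothetical; this is not how tbeval.py is written, just an illustration of the ratio being proposed.

```python
def success_rate(chosen_moves, optimal_moves):
    """Fraction of positions where the chosen move is in the optimal ('bm') set."""
    hits = sum(1 for move, best in zip(chosen_moves, optimal_moves) if move in best)
    return hits / len(optimal_moves)

def success_by_nodes(results, optimal_moves):
    """Map each node budget to its success rate.

    results: {node_budget: [move chosen for each position, in order]}
    """
    return {nodes: success_rate(moves, optimal_moves)
            for nodes, moves in sorted(results.items())}

# Made-up data: three positions, moves in UCI notation.
optimal = [{"d1d7"}, {"e2e7", "e2e8"}, {"a1a8"}]
results = {
    1:    ["d1d2", "e2e7", "a1a2"],
    100:  ["d1d7", "e2e7", "a1a2"],
    1000: ["d1d7", "e2e8", "a1a8"],
}
curve = success_by_nodes(results, optimal)
for nodes, rate in curve.items():
    print(f"{nodes:>5} nodes: {rate:.2f}")
```

Plotting `curve` with the node budget on a log-scale x-axis would give the kind of progress chart asked for.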

Have you tried generating KQK positions with move history and then letting Lc0 analyze from there? It would be interesting to see the difference in the number of positions solved with and without move history.

So what is RNG_OK? 

evalon32

Oct 27, 2018, 11:26:46 AM
to LCZero


On Saturday, October 27, 2018 at 2:28:28 AM UTC-4, Jupiter wrote:
>> The random move is hypothetical. Let's say a position has 30 legal moves and only one is optimal. Then RNG_Opt=1/30. I thought it might be useful as a baseline (how does Leela compare to a random mover?)
> This is why it is horizontal (no change in y as x increases): the engine has nothing to do with it. It is a property of the position.
Exactly, that's why it's listed under "Position properties (engine-independent)" :)

> You have 1000 positions, so this is the average. Perhaps it can be removed from the chart; it just adds complexity that does not relate to the engine. A simple metric that should be in the chart is the number of correct moves chosen by the engine divided by the total positions tried. Then plot it from node 1 and up, so we can see how the success rate progresses as the number of nodes searched increases.
That's what most of the metrics are, for different definitions of "correct" and "chosen." I emphasized the three that I think are the most important (MCTS_T0_xxx), and now I've de-emphasized the others even further.

> Have you tried generating KQK positions with move history and then letting Lc0 analyze from there? It would be interesting to see the difference in the number of positions solved with and without move history.
I am using the no history patch (PR305). I did try analyzing with and without the patch, and there was indeed a significant difference; see the original post announcing the benchmark.

> So what is RNG_OK?
Where "Opt" refers to moves that lead to the shortest mate, "OK" refers to all moves that don't squander the win. If out of 30 legal moves in a "mate in 5" position, one leads to a mate in 5, two allow black to draw, and the other 27 still allow white to win, but more slowly, then RNG_Opt=1/30 and RNG_OK=28/30.
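
To make the two baselines concrete, here is a small sketch of the arithmetic for the example just given, plus the averaging over a set of positions. The function names and the tuple layout are illustrative assumptions, not taken from tbeval.py.

```python
from fractions import Fraction

def rng_baselines(n_legal, n_optimal, n_squandering):
    """Return (RNG_Opt, RNG_OK) for one position.

    n_optimal:     moves leading to the shortest mate (the 'bm' list)
    n_squandering: moves that throw away the win (the 'am' list)
    """
    rng_opt = Fraction(n_optimal, n_legal)
    rng_ok = Fraction(n_legal - n_squandering, n_legal)
    return rng_opt, rng_ok

def average_baselines(positions):
    """Average RNG_Opt and RNG_OK over (n_legal, n_optimal, n_squandering) tuples."""
    pairs = [rng_baselines(*p) for p in positions]
    n = len(pairs)
    return sum(p[0] for p in pairs) / n, sum(p[1] for p in pairs) / n

# The example above: 30 legal moves, 1 optimal, 2 that allow a draw.
opt, ok = rng_baselines(30, 1, 2)
print(opt, ok)  # Fraction reduces 28/30 to its lowest terms

# Averaging over two made-up positions.
avg_opt, avg_ok = average_baselines([(30, 1, 2), (20, 2, 0)])
print(float(avg_opt), float(avg_ok))
```

Because neither value depends on the engine, both plot as flat lines when the x-axis is the number of nodes searched.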