Kiudee parameters are very strong

Stefan Pohl

Jan 20, 2020, 8:36:53 AM
to LCZero
I played 2 gauntlets of 300 games each (30''+300ms bullet, with the 150 SALC Armageddon openings from my long-term test runs (https://www.sp-cc.de/nn-longtime-testing.htm)):
Lc0 default Leelenstein 13 net vs. Lc0 Kiudee Leelenstein 13 net and
Lc0 default Leelenstein 13 net vs. Lc0 LSbinary Leelenstein 13 net (the binary from Josh's Patreon site, post from 2019/12/26)

1 Lc0 0.23.2kiudee LS13 vs. Lc0 default LS13: 300 (+180,=  0,-120), 60.0 % (!!!)

2 Lc0 LSbinary LS13  vs. Lc0 default LS13: 300 (+103,=  0,-197), 34.3 %

Conclusions:
The Josh binary is very bad - do not use it!
Lc0 Kiudee is really impressive. 60%-40% means +70 Elo. But note that Armageddon (no draws, because all draws are counted as a win for Black) and bullet speed spread the results, so +40 or +50 Elo seems more realistic. Also, on Discord, some tests with other net sizes (10x128 and T60; the Leelenstein size is 20x256) show a measurable Elo gain with the Kiudee settings, too. So it seems that the Kiudee settings should be the new default for Lc0. From now on I will use them as the default for my Lc0 tests.
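
For reference, the step from "60%-40%" to "+70 Elo" follows from the usual logistic Elo model. The sketch below assumes that model; the function names are made up for illustration, and the interval uses a plain normal approximation that treats the 300 games as independent, which is only roughly true for games played in opening pairs.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction (0 < score < 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval_no_draws(wins, games, z=1.96):
    """Rough 95% Elo interval for a match without draws (normal approximation)."""
    p = wins / games
    se = math.sqrt(p * (1.0 - p) / games)
    return elo_from_score(p - z * se), elo_from_score(p + z * se)

# Gauntlet 1 from the post: +180 -120 out of 300 games, i.e. 60%.
print(round(elo_from_score(180 / 300), 1))  # 70.4

# Gauntlet 2 from the post: +103 -197 out of 300 games.
print(round(elo_from_score(103 / 300), 1))

# How wide the uncertainty is for 300 decisive games.
low, high = elo_interval_no_draws(180, 300)
print(round(low), round(high))
```

The interval is wide for 300 games, which fits the caution above about not taking the +70 at face value.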

Here are the Kiudee settings:

CPuct=2.147
Fpu=0.443
PolicyTemperature=1.607
CPuctBase=18368
CPuctFactor=2.815
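
If you want to send these values to the engine by hand, they can be turned into UCI `setoption` commands. This is only a sketch: the option names are copied exactly as written in this post and may be spelled differently in your Lc0 build, so check the names the engine lists in reply to `uci`. The helper name `uci_setoption_lines` is hypothetical.

```python
# Values from the post; the option names are assumed to match the engine's UCI option names.
KIUDEE_SETTINGS = {
    "CPuct": 2.147,
    "Fpu": 0.443,
    "PolicyTemperature": 1.607,
    "CPuctBase": 18368,
    "CPuctFactor": 2.815,
}

def uci_setoption_lines(settings):
    """Format each setting as a standard UCI 'setoption name <id> value <x>' command."""
    return [f"setoption name {name} value {value}" for name, value in settings.items()]

for line in uci_setoption_lines(KIUDEE_SETTINGS):
    print(line)
```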

Benedetto Romano

Jan 20, 2020, 9:49:16 AM
to LCZero

Kiudee's work is extremely important; it needs to be done for other time controls too.
Btw, for now these settings should surely be the default.

Stefan Pohl

Jan 20, 2020, 10:17:37 AM
to LCZero
Armageddon means that all draws are counted as a win for Black. And the opening positions give White an advantage (SALC: White can only castle short, Black can only castle long). That's it. Because there are no draws, the results are more spread out (further away from 50%-50%).
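
The scoring rule itself is simple enough to sketch in a few lines. This is an illustration only, not the tooling used for the test runs; the function names and the sample results are made up.

```python
def armageddon_result(result):
    """Map a normal game result to its Armageddon result: a draw counts as a win for Black."""
    return "0-1" if result == "1/2-1/2" else result

def white_score_percent(results):
    """White's score in percent after applying Armageddon scoring (no half points remain)."""
    converted = [armageddon_result(r) for r in results]
    return 100.0 * converted.count("1-0") / len(converted)

# Hypothetical sample: two White wins, one draw, one Black win.
games = ["1-0", "1/2-1/2", "0-1", "1-0"]
print([armageddon_result(r) for r in games])  # ['1-0', '0-1', '0-1', '1-0']
print(white_score_percent(games))             # 50.0
```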

Here are some tests from Discord, with "normal" openings:

tc=1s+0.1s, RTX 2070
"bonus settings" cpuct=2.147, fpu=0.443, pst=1.607, cpuct-base=18368, cpuct-factor=2.815

   # PLAYER                        :  RATING  ERROR  POINTS  PLAYED   (%)  CFS(%)    W    D    L
   1 lc0.net.58613.kiudee_bonus    :      86     10   836.5    1422  58.8     100  459  755  208
   2 lc0.net.58613.default         :      33      9   699.0    1422  49.2      55  320  758  344
   3 lc0.net.LD2.kiudee_bonus      :      32     10   696.5    1422  49.0     100  308  777  337
   4 lc0.net.LD2.default           :       0      9   612.0    1422  43.0     ---  240  744  438
(Both nets are small (10x128). +32 and +53 Elo gain = about +42 Elo average gain.)
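
The per-net gains can be read straight off the RATING column of the table above. A small sketch of that arithmetic, with the ratings typed in by hand and a hypothetical helper name:

```python
# RATING column from the table above, keyed by (net, settings).
ratings = {
    ("58613", "kiudee_bonus"): 86,
    ("58613", "default"): 33,
    ("LD2", "kiudee_bonus"): 32,
    ("LD2", "default"): 0,
}

def tuning_gains(ratings):
    """Elo gain of the tuned settings over the default settings, per net."""
    nets = sorted({net for net, _ in ratings})
    return {net: ratings[(net, "kiudee_bonus")] - ratings[(net, "default")] for net in nets}

gains = tuning_gains(ratings)
print(gains)                             # {'58613': 53, 'LD2': 32}
print(sum(gains.values()) / len(gains))  # 42.5
```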


Gauntlet: J13B.2-188 vs lc0.net.62013 (default) / lc0.net.62013-tuned @kiudee
LC0-version: lc0-v0.23.2, Backend=cudnn-fp16
Hardware: RTX 2060
Software: Cutechess-CLI
Time control: 1k nodes/move
Book: openings-10ply-100k.pgn, 10 plies, sequential, color reversed

# PLAYER                 :  RATING  ERROR  POINTS  PLAYED    W    L    D  D(%)  CFS(%)
1 lc0.net.62013-tuned    :    28.4   23.2   108.0     200   51   35  114    57      97
2 J13B.2-188             :     0.0   14.6   192.5     400   74   89  237    59      54
3 lc0.net.62013          :    -1.8   24.4    99.5     200   38   39  123    62     ---
(Big T60 Net. +27 Elo gain)

glbchess64

Jan 20, 2020, 1:54:49 PM
to LCZero
It is most likely that these parameters are not optimal for LS. The Kiudee parameters were designed for T60 after the PST changes (and maybe T59). Also, don't forget that when playing a net against a net, as with self-play Elo, the Elo difference is overestimated by a factor close to 2. Moreover, SALC and Armageddon introduce a bias into the tests that may amplify some apparent qualities that do not generalize.

When you introduce such a strong bias into a test, I would highly recommend setting up matches between a net and an AB engine to verify that the qualities you select for through the bias do not come with some drawback.