Michael Larabel benches lc0 on various GPUs


Warren D Smith

Jan 14, 2019, 12:39:17 PM
to LCZero
https://www.phoronix.com/scan.php?page=news_item&px=LCZero-NVIDIA-Benchmarks

He writes that he added lc0 to his "phoronix" benchmark suite, and then
ran it on a lot of GPUs with various CUDA and OpenCL compilers.

The fastest combination he tried was
CUDA + cuDNN for GeForce TITAN RTX: 25695 nodes/sec.

The slowest was OpenCL on GeForce GTX 970: 1171 nodes/sec.

He says he will test more hardware/software combos over the next few days;
keep looking at his phoronix.com news pages.
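
To put the two extremes quoted above side by side, here is a minimal Python sketch that computes their ratio. The two nodes/sec figures are the ones from the post; the dictionary and the function name `speedup` are just illustrative. Note that the two results come from different GPUs as well as different backends, so the ratio is not a pure CUDA-vs-OpenCL comparison.

```python
# Nodes/sec figures as quoted in the post above (Phoronix results).
results_nps = {
    "CUDA + cuDNN, GeForce TITAN RTX": 25695,
    "OpenCL, GeForce GTX 970": 1171,
}


def speedup(fast_nps: float, slow_nps: float) -> float:
    """Return how many times larger fast_nps is than slow_nps."""
    return fast_nps / slow_nps


fastest = max(results_nps.values())
slowest = min(results_nps.values())
ratio = speedup(fastest, slowest)

# Different cards AND different backends, so this is only a rough span.
print(f"fastest / slowest = {ratio:.1f}x")  # prints: fastest / slowest = 21.9x
```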

David Bigler

Jan 14, 2019, 1:18:41 PM
to LCZero
Thanks for sharing, very interesting!

brian.p.r...@gmail.com

Jan 14, 2019, 3:08:15 PM
to LCZero
I looked but did not see which net ID was used.

Warren D Smith

Jan 16, 2019, 1:35:06 PM
to LCZero
Some more phoronix bench testing:

https://www.phoronix.com/forums/forum/phoronix/latest-phoronix-articles/1073250-lczero-neural-network-chess-benchmarks-with-opencl-radeon-vs-nvidia

I suggest also reading the comments on both of those posts;
they have some things to say, some of it rather critical.
It might not be hard to improve Leela's GPU performance...

Alexey Eromenko

Jan 16, 2019, 4:07:48 PM
to LCZero
Yes, great benchmarks by Phoronix.

XRig: AMD Radeon R9 290 "Hawaii" (via OpenCL; Windows 7 x64) + Core i7 2600K
Nodes/second: 1220.84

Leela 0.20.1 - ID 32400

Alexey Eromenko

Jan 16, 2019, 4:10:42 PM
to LCZero
According to him, the most cost-efficient option is to buy one or two of the newest RTX 2060 series cards.

These are $350 GPUs. A home-friendly budget.

Leela on the cheap :) Leela Home Edition.

But I don't know if 6 GB of VRAM will affect Leela play or training in the future, with future IDs. (vs 11 GB for RTX 2080 Ti)

Markus Kohler

Jan 17, 2019, 4:26:29 AM
to LCZero
Do we know why the AMD cards perform so poorly?
Is it because the drivers for AMD are less optimized, or could the lczero code be improved?

Alexey Eromenko

Jan 17, 2019, 4:51:39 AM
to LCZero
AMD doesn't perform poorly at all.

If you look carefully, the AMD Radeon R9 290's performance in OpenCL is similar to the NVIDIA GTX 970's, and the two cards also perform very similarly in real-world 3D graphics (gaming).
Both cards are from two generations ago and sold at similar prices. Both do around 1200 nps via OpenCL.

The big question is: why is OpenCL so much slower than CUDA? This I don't know.

Misha Golub

Jan 17, 2019, 7:38:05 AM
to LCZero
I am surprised the difference between cuDNN and OpenCL is so small. With cuDNN being optimized specifically for neural networks on specific hardware, and OpenCL being general purpose and designed to work with any hardware, I expected order-of-magnitude differences.

lamarr...@gmail.com

Jan 17, 2019, 3:12:35 PM
to LCZero
The performance is really abysmal on AMD GPUs.

590 outperforming Vega.

Even on amdgpu-pro or Windows the performance is very poor.

My Vega FE on Windows does 2555 nps. That is still very low.