Tuning matrix multiplication (GEMM) for Intel GPUs


Evgeny Demidov

Jun 28, 2019, 12:54:18 AM
to WebGL Dev List
WebGL2-compute GEMM shaders need tuning for different GPUs (e.g. Intel and AMD). A new shader based on a highly optimised Intel OpenCL kernel speeds up GEMM about 2x on Intel GPUs (~210 GFLOPS on an i3-8100, which is ~50% of peak float performance and very close to the OpenCL kernel's performance).
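
For readers who want to check figures like these, here is a minimal sketch of how GEMM throughput and the fraction of peak are usually computed. The function names are hypothetical, and the hardware numbers (23 execution units, 16 FP32 FLOPs per unit per clock, 1.1 GHz) are assumptions about the i3-8100's integrated GPU, not values taken from this post; only the ~210 GFLOPS figure comes from the message above.

```typescript
// FLOP count for C = A * B with A of size (m x k) and B of size (k x n):
// each of the m*n outputs needs k multiplies and k adds.
function gemmFlops(m: number, n: number, k: number): number {
  return 2 * m * n * k;
}

// Achieved throughput in GFLOPS for one multiply that took `seconds`.
function gemmGflops(m: number, n: number, k: number, seconds: number): number {
  if (seconds <= 0) {
    throw new Error("seconds must be positive");
  }
  return gemmFlops(m, n, k) / seconds / 1e9;
}

// Theoretical FP32 peak in GFLOPS. All three inputs are hardware
// assumptions that the caller has to supply.
function peakGflops(
  executionUnits: number,
  flopsPerUnitPerClock: number,
  clockGhz: number
): number {
  return executionUnits * flopsPerUnitPerClock * clockGhz;
}

// ASSUMED hardware numbers (not from the post): 23 EUs, 16 FLOPs/clock, 1.1 GHz.
const assumedPeak = peakGflops(23, 16, 1.1);

// 210 GFLOPS is the measured figure quoted in the post.
const fractionOfPeak = 210 / assumedPeak;

console.log(`assumed peak: ${assumedPeak.toFixed(1)} GFLOPS`);
console.log(`fraction of peak: ${(fractionOfPeak * 100).toFixed(0)}%`);
```

Under those assumed hardware numbers the ratio comes out at roughly half of peak, which agrees with the ~50% quoted above. To measure your own shader, pass the matrix sizes and the timed duration of one dispatch to `gemmGflops`.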

The fastest OpenCL kernels use the subgroups extension (a 30-50% speedup), and the Intel team is planning to add this extension to the WebGL2-compute D3D backend. More details at

Ken Russell

Jun 28, 2019, 9:09:21 PM
to WebGL Dev List
That's very cool, Evgeny. Great work approaching peak performance for these workloads!

I hope that somehow it'll be possible to bridge the gaps between different GPUs' behavior so that more performance-portable compute shaders can be written for the web.

Please keep us all posted on your progress!

-Ken

