AMGCL with OpenCL - AMD vs. NVIDIA Consumer GPUs F64 performance

C B

May 22, 2021, 10:11:56 PM

to amgcl

Hello Denis and Everyone,

What is your opinion on f64 performance with low-end GPUs, or, let's say, the best value for the cost?

I looked for f64 performance figures and found that many NVIDIA GPUs with great f32 performance have very low f64 performance, whereas AMD's GPUs seem to have better relative f64 performance on average.

From https://arrayfire.com/explaining-fp64-performance-on-gpus/: "AMD GPUs perform fairly well for FP64 compared to FP32. Most AMD cards (including consumer/gaming series) will give between 1:3 and 1:8 FP32 performance for FP64."

https://www.techpowerup.com/gpu-specs/geforce-rtx-3090.c3622

A very expensive RTX 3090 has a 1:64 F64:F32 ratio, with FP64 = 560 GFlops, whereas a very inexpensive Radeon RX 5600M (laptop) has a 1:16 ratio, with FP64 = 360 GFlops!

But I suppose it is not just FP64; I guess OpenCL vs. CUDA is also an important factor. What is your advice in this regard?

Last month NVIDIA released a new OpenCL 3.0 driver. Has anyone tried it to compare OpenCL vs. CUDA performance on the same GPU?

I would like to try this Radeon GPU with AMGCL, but first I need OpenCL, and on my Windows computer amgcl-master/cmake/opencl/FindOpenCL.cmake is not finding the OpenCL installations. I have at least two, one from Intel and the other from NVIDIA. I browsed FindOpenCL.cmake, and it seems to have been customized for AMD's SDK, which is now discontinued. Is this the case?

Thanks for your help and recommendations!

Cheers,

Denis Demidov

May 23, 2021, 1:49:08 AM

to amgcl

On Sunday, May 23, 2021 at 5:11:56 AM UTC+3 cebau...@gmail.com wrote:

> Hello Denis and Everyone,
>
> What is your opinion on f64 performance with low end GPUs, or let's say best value for the cost ?

I would say that for amgcl the memory bandwidth is more important than double-precision arithmetic, since the algorithms are memory-bound, not compute-bound.
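To make the memory-bound argument concrete, here is a rough roofline-style estimate in Python. It is only a sketch under stated assumptions: the 2 flops and 12 bytes per nonzero are a common back-of-the-envelope model for a double-precision CSR sparse matrix-vector product (vector and row-pointer traffic are ignored), the 936 GB/s bandwidth is an assumed spec-sheet figure for the RTX 3090 that does not come from this thread, and the 560 GFlops FP64 peak is the number quoted in the question.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Roofline model: throughput is capped either by the compute peak
    or by how fast memory can deliver operands, whichever is lower."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


# Assumed cost model for one nonzero of a double-precision CSR SpMV:
# one multiply and one add, reading an 8-byte value and a 4-byte column index.
FLOPS_PER_NNZ = 2
BYTES_PER_NNZ = 8 + 4
intensity = FLOPS_PER_NNZ / BYTES_PER_NNZ  # flops per byte moved

peak_fp64 = 560.0  # GFlops, RTX 3090 figure quoted in the question
bandwidth = 936.0  # GB/s, assumed spec-sheet value (not from this thread)

limit = attainable_gflops(peak_fp64, bandwidth, intensity)
regime = "memory-bound" if limit < peak_fp64 else "compute-bound"
print(f"attainable: {limit:.0f} GFlops of {peak_fp64:.0f} GFlops FP64 peak ({regime})")
```

Under these assumptions the bandwidth ceiling sits well below even the restricted FP64 peak, which is the sense in which bandwidth matters more than the F64:F32 ratio for this kind of solver.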

> I tried to find f64 performance and I found that many NVIDIA GPUs that have great f32 performance have very low f64 performance,
> whereas AMD's GPUs seem to have better relative f64 performance on average.
> https://arrayfire.com/explaining-fp64-performance-on-gpus/ AMD GPUs perform fairly well for FP64 compared to FP32. Most AMD cards (including consumer/gaming series) will give between 1:3 and 1:8 FP32 performance for FP64.
> https://www.techpowerup.com/gpu-specs/geforce-rtx-3090.c3622
> A very expensive RTX 3090 has a 1:64 F64:F32 ratio, with FP64 = 560 GFlops, whereas
> a very inexpensive Radeon RX 5600M (laptop) has a 1:16 ratio with FP 64 = 360 GFlops !
> But I suppose it is not just FP64, I guess OpenCL vs. CUDA also is an important factor, what is your advice in this regard ?
> Last month NVIDIA released a new OpenCL 3.0 driver, has anyone tried it to compare the OpenCL vs. CUDA performance on the same GPU ?

On my GPU, OpenCL performs on par with CUDA, and the OpenCL backend in amgcl is even slightly faster than the CUDA one on the same NVIDIA GPU.

> I would like to try this Radeon GPU with AMGCL, but first I need OpenCL, and on my Windows computer amgcl-master/cmake/opencl/FindOpenCL.cmake is not finding the OpenCL installations, I have at least 2, one from Intel and the other from Nvidia. I browsed FindOpenCL.cmake and it seems to indicate that it was sort of customized for AMD's sdk, which is now discontinued. Is this the case ?

The copy of FindOpenCL.cmake in amgcl is only used with ancient versions of CMake:

https://github.com/ddemidov/amgcl/blob/61d219699005743338a41768c41bca0a8678d24e/CMakeLists.txt#L18-L21

In later versions, FindOpenCL is part of the CMake distribution.

I don't have access to a Windows machine, but you should be able to set the CMake variables manually (they should be in the "advanced" section): OpenCL_INCLUDE_DIR (the path which has CL/opencl.h) and OpenCL_LIBRARY (the path to OpenCL.dll).
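The same two variables can also be passed on the command line instead of through the GUI. The Python sketch below only assembles such a `cmake` invocation; the helper name `cmake_opencl_args` and the `C:/path/to/opencl-sdk` paths are hypothetical placeholders, not real install locations. One assumption to check on your machine: with the MSVC toolchain the linker usually expects the import library (`OpenCL.lib`) rather than the DLL itself, so that is what the example points to.

```python
def cmake_opencl_args(include_dir, library, source_dir=".."):
    """Build a cmake command line that presets the FindOpenCL cache variables.

    include_dir: directory that contains the CL/ header folder
    library:     full path to the OpenCL library file to link against
    source_dir:  path to the amgcl source tree, relative to the build directory
    """
    return [
        "cmake",
        f"-DOpenCL_INCLUDE_DIR={include_dir}",
        f"-DOpenCL_LIBRARY={library}",
        source_dir,
    ]


# Hypothetical paths; replace them with wherever your vendor SDK is installed.
args = cmake_opencl_args(
    "C:/path/to/opencl-sdk/include",
    "C:/path/to/opencl-sdk/lib/x64/OpenCL.lib",
)
print(" ".join(args))
```

To actually configure the build, the list can be handed to `subprocess.run(args, check=True)` from inside the build directory.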

C B

May 23, 2021, 2:04:02 PM

to Denis Demidov, amgcl

Denis,

Thank you so much for your insights!

If we are dealing mainly with memory-bound applications, are there any specific metrics that we need to look at when selecting a GPU?

If we want to buy a GPU, what are the most important parameters for HPC?

I guess this may also depend on the memory bandwidth when communicating between CPU and GPU ...

Thanks again for your insights :)

Cheers,


Denis Demidov

May 23, 2021, 2:20:46 PM

to amgcl

On Sunday, May 23, 2021 at 9:04:02 PM UTC+3 cebau...@gmail.com wrote:

> Denis,
> Thank you so much for your insights !
> Then if we are dealing mainly with memory-bound applications, are there any specific metrics that we need to look at when selecting a GPU ?
> If we want to buy a GPU, what are the most important parameters for HPC ?
> I guess this may also depend on the memory bandwidth when communicating between CPU and GPU ...

If the problems you are going to solve are mostly memory-bound, then I would look at the GPUs with the highest memory bandwidth.

Double-precision performance should also help, but professional GPUs with unrestricted double-precision arithmetic are going to be an order of magnitude more expensive, so that is really up to your budget. I cannot help you with specific models, as I haven't been following the market for some time.
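One way to compare cards on this metric is to compute the theoretical peak memory bandwidth from two spec-sheet numbers: the memory bus width and the effective per-pin data rate. The Python sketch below does that. The two example cards and their figures (384-bit at 19.5 Gbps for the RTX 3090, 256-bit at 10 Gbps for the GTX 1080) are assumptions taken from public spec listings rather than from this thread, and the sustained bandwidth measured in practice will be lower than these peaks.

```python
def peak_bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Theoretical peak memory bandwidth in GB/s.

    bus_width_bits: width of the memory interface in bits
    data_rate_gbps: effective data rate per pin in Gbit/s
    """
    return bus_width_bits / 8 * data_rate_gbps


# Assumed spec-sheet values: (bus width in bits, data rate in Gbps per pin).
cards = {
    "GeForce RTX 3090": (384, 19.5),
    "GeForce GTX 1080": (256, 10.0),
}

for name, (width, rate) in cards.items():
    print(f"{name}: {peak_bandwidth_gb_s(width, rate):.0f} GB/s theoretical peak")
```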

C B

May 23, 2021, 3:29:43 PM

to Denis Demidov, amgcl

Denis,

Thank you very much for your comments.

I found this link showing how to measure memory bandwidth with the utilities of the standard CUDA toolkit:

https://forums.developer.nvidia.com/t/grim-memory-bandwidth-gtx-1080/43177

Thanks again,

Cheers

