On Thu, Apr 27, 2023 at 06:11:43AM -0700, Arun T wrote:
> Dear All,
>
> I'm trying to best evaluate Polybench benchmarks using PPCG for CPU, CUDA,
> and openCL.
> I'm using the flag options --tile, --target=c or cuda or opencl, and --openmp
> from PPCG

--tile and --openmp are only valid for the "c" target.
Note that you are unlikely to get good performance on the "c" target
since for CPUs, PPCG really only has a basic framework.
It hasn't been optimized in any way and doesn't even have
any specific support for vectorization.
For GPUs, there is a bit of intelligence, but it was based
on GPUs from more than 10 years ago. I have no idea
if the heuristics from back then are any good for more recent GPUs.
To get good performance, you also have to specify appropriate
(program-dependent) kernel, grid and block sizes.
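
As a rough illustration of what specifying sizes can look like, here is
a minimal sketch that builds a PPCG command line. It assumes the sizes
are passed through a --sizes option taking a union map of the form
"{ kernel[id] -> tile[...]; kernel[id] -> block[...]; kernel[id] -> grid[...] }";
please check "ppcg --help" for the exact syntax in your version. The
helper names, the file name gemm.c and the numeric sizes are purely
hypothetical placeholders, not recommended values.

```python
import shlex


def ppcg_sizes(kernel_id, tile=None, block=None, grid=None):
    """Build a value for the (assumed) --sizes option for one kernel.

    Each of tile/block/grid is an optional sequence of integers;
    entries left as None are simply omitted from the map.
    """
    parts = []
    for name, sizes in (("tile", tile), ("block", block), ("grid", grid)):
        if sizes:
            dims = ",".join(str(s) for s in sizes)
            parts.append(f"kernel[{kernel_id}] -> {name}[{dims}]")
    return "{ " + "; ".join(parts) + " }"


def ppcg_command(source, target, sizes=None):
    """Return the argument list for a ppcg run (hypothetical helper)."""
    cmd = ["ppcg", f"--target={target}"]
    if sizes:
        cmd.append(f"--sizes={sizes}")
    cmd.append(source)
    return cmd


if __name__ == "__main__":
    # Placeholder sizes for kernel 0; good values are program dependent
    # and have to be found by experiment.
    sizes = ppcg_sizes(0, tile=(32, 32), block=(16, 16), grid=(64, 64))
    # shlex.join quotes the map so it can be pasted into a shell.
    print(shlex.join(ppcg_command("gemm.c", "cuda", sizes)))
```

Since the right sizes depend on both the program and the GPU, a helper
like this is mostly useful for sweeping over several candidates and
timing each generated kernel.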
skimo