Currently, all NVPTX kernels generated by Polly-ACC are named "kernel_#" (# = 0, 1, 2, 3, ...). While profiling, we have to find the host function that called a kernel by searching the source code for the point where that kernel is launched. This task becomes even more arduous as the number of kernels increases.
Furthermore, it might become impossible to recover the kernel-to-host-function association when Polly is used through a JIT: different functions with GPU-friendly SCoPs end up calling kernels of the same name, albeit with different definitions. For example, nvprof on Julia running PolyBench.jl reports:
==4776== Profiling application: usr/bin/julia --check-bounds=no -g0
==4776== Profiling result:
Time(%)      Time     Calls       Avg       Min       Max  Name
 76.16%  24.1500s     25701  939.65us  115.55us  46.950ms  kernel_0
 19.28%  6.11307s     25054  244.00us  115.46us  22.412ms  kernel_1
[...]