Thread and block size between different GPUs

Chris Uchytil

Nov 23, 2016, 11:25:05 PM
to Numba Public Discussion - Public
I can't post the code as it is far too long and involved, so if this can't be answered it's no biggie. I have been running Numba code on my GTX 770 for the last few months. The kernels I was running had (32,32) set as threads per block and something in the neighborhood of (19,19) blocks per grid. I recently upgraded to a GTX 1080, and with the same code I am getting this error:
numba.cuda.cudadrv.driver.CudaAPIError: [701] Call to cuLaunchKernel results in CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES.
If I lower the threads per block to (8,8) or (16,16) and increase the blocks per grid accordingly, I can run the code just fine. Is this perhaps a hardware thing? I would think that a 1080 could handle a larger set of threads per block than a 770.
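
For reference, the "increase the blocks per grid accordingly" step can be sketched as a ceiling division of the problem size by the block size. This is a minimal sketch, not the original code: the helper name blocks_per_grid and the 608 x 608 problem size are assumptions, chosen only because 19 blocks of 32 threads cover 608 elements per axis.

```python
def blocks_per_grid(shape, threads_per_block):
    """Return the smallest grid that covers `shape` with the given block size.

    Uses integer ceiling division per axis, so a partial block is added
    whenever the problem size is not a multiple of the block size.
    """
    return tuple((n + t - 1) // t for n, t in zip(shape, threads_per_block))


# Hypothetical problem size (not from the original post).
shape = (608, 608)

for tpb in [(32, 32), (16, 16), (8, 8)]:
    print(tpb, "->", blocks_per_grid(shape, tpb))
```

The resulting tuples would be passed as kernel[blocks_per_grid, threads_per_block](...) in Numba. The kernel itself still needs a bounds check, since the grid can overshoot the array when the size is not an exact multiple of the block size.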

Stanley Seibert

Nov 28, 2016, 9:33:43 AM
to Numba Public Discussion - Public
Hmm... The only limitations I can imagine you running into (assuming the amount of shared memory you allocate is the same) are:
  • Too many threads per block
  • Too many registers per thread
  • Too many registers per block
The GTX 770 is a compute capability 3.0 device, and the GTX 1080 is a compute capability 6.1 device. The two architectures share the same per-block limits but differ in the per-thread register limit (https://docs.nvidia.com/cuda/cuda-c-programming-guide/#compute-capabilities):
  • Threads per block: 1024
  • Registers per thread: 63 on compute capability 3.0, 255 on 6.1
  • Registers per block: 64k
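
To see how these limits interact, here is a rough sketch that checks a block shape against the threads-per-block and registers-per-block limits listed above. The function name fits_register_budget and the register counts in the example are hypothetical, since the real per-thread register usage depends on how the kernel is compiled for each device. The check is also simplified: real hardware allocates registers in per-warp granules, so the true threshold can be somewhat stricter.

```python
def fits_register_budget(threads_per_block, regs_per_thread,
                         max_threads_per_block=1024,
                         max_regs_per_block=64 * 1024):
    """Return True if the block shape stays within both per-block limits.

    Simplified model: total registers = threads * registers per thread.
    Allocation granularity on real hardware is ignored.
    """
    threads = 1
    for dim in threads_per_block:
        threads *= dim
    if threads > max_threads_per_block:
        return False
    return threads * regs_per_thread <= max_regs_per_block


# Hypothetical register counts, for illustration only.
for regs in (63, 72):
    for tpb in [(32, 32), (16, 16), (8, 8)]:
        print(f"{regs} regs/thread, block {tpb}:",
              fits_register_budget(tpb, regs))
```

Under this model a (32,32) block fits only while the kernel uses at most 64 registers per thread. If the same kernel compiles to more registers per thread on the newer device, a launch with 1024 threads per block could fail while (16,16) and (8,8) still succeed, which would match the reported behavior. That is one possible explanation, not a confirmed diagnosis.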

I'm not sure if this is some strange limitation with Numba using CUDA toolkit 7.5 with Pascal.  We are planning to move to CUDA toolkit 8.0 with the next Numba release (in mid Dec), but I'll check with Siu to see if it can be installed with the current Numba release.


Stanley Seibert

Nov 28, 2016, 5:48:04 PM
to Numba Public Discussion - Public
If you would like to test whether the CUDA toolkit version makes a difference (and you are using conda), you can upgrade with this command:

conda install -c numba cudatoolkit

And downgrade back to the 7.5 release with this:

conda install cudatoolkit

Note that this assumes you've updated your CUDA system drivers to something that supports CUDA 8 (which I assume was required for your GTX 1080 to work).
