Multiprocessing - to_device and to_host launched by different processes


Brunno Goldstein

Aug 7, 2016, 3:53:24 PM
to Numba Public Discussion - Public

Hi!


I'm working with Numba and multiprocessing, where one process starts the device computation and sends the dA pointer (the result of a cuda.to_device call) to another process. The second process will then call copy_to_host() and work with the resulting data.


The problem I am facing is that gpu_data holds some ctypes pointers, and they cannot be pickled by the queue that I am using. Do you have any idea how I can handle that?
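
Roughly, the pattern I'm attempting looks like this (a minimal sketch; the worker/main names are just illustrative):

import numpy as np
from multiprocessing import Process, Queue
from numba import cuda

def worker(q):
    dA = q.get()                     # intended: receive the device array
    print(dA.copy_to_host())         # and copy the result back to the host

def main():
    q = Queue()
    p = Process(target=worker, args=(q,))
    p.start()
    arr = np.arange(10, dtype=np.float32)
    dA = cuda.to_device(arr)         # DeviceNDArray wrapping ctypes device pointers
    q.put(dA)                        # fails: the Queue pickles its items, and the
                                     # ctypes pointers inside the DeviceNDArray
                                     # cannot be pickled
    p.join()

if __name__ == '__main__':
    main()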


Thanks!


Best Regards,


Brunno Goldstein

Stanley Seibert

Aug 8, 2016, 10:15:25 AM
to Numba Public Discussion - Public
You're bumping into a bigger issue, which is that CUDA device allocations are not portable between processes.  (In fact, pointers generally are not portable between processes, which is why ctypes pointers can't be serialized.)

There is a mechanism in CUDA (but not exposed in Numba) for interprocess communication of device allocations (cudaIpc*):

http://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__DEVICE.html#group__CUDART__DEVICE_1g8a37f7dfafaca652391d0758b3667539

We've been talking with the Dask developers about how to handle this more broadly, but don't have an ETA at the moment.  Siu might be able to propose a workaround in the meantime.  (I'll ping him to take a look at this thread.)


Brunno Goldstein

Aug 8, 2016, 5:17:48 PM
to Numba Public Discussion - Public
Hi Stanley,
thank you for your quick response.

I'll take a look at the cudaIpc API and wait to see if Siu has a workaround.

Best Regards,
Brunno

Siu Kwan Lam

Aug 9, 2016, 3:02:27 PM
to Numba Public Discussion - Public

--
Siu Kwan Lam
Software Engineer
Continuum Analytics
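
A minimal sketch of the IPC-handle flow, assuming Numba's DeviceNDArray.get_ipc_handle() API; the parent/worker structure and names here are illustrative rather than a reproduction of the original cuda_ipc.py:

import numpy as np
from multiprocessing import Process, Queue
from numba import cuda

def worker(q):
    ipch = q.get()                   # the IPC handle itself is picklable
    with ipch as darr:               # opens the parent's allocation in this process
        print(darr.copy_to_host())

def parent():
    q = Queue()
    p = Process(target=worker, args=(q,))
    p.start()                        # fork before any CUDA call in the parent
    arr = np.arange(10, dtype=np.float32)
    darr = cuda.to_device(arr)       # device allocation owned by the parent
    ipch = darr.get_ipc_handle()     # export it as a CUDA IPC handle
    q.put(ipch)
    p.join()                         # keep darr alive while the child uses it

if __name__ == '__main__':
    parent()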

Brunno Goldstein

Aug 9, 2016, 11:10:42 PM
to Numba Public Discussion - Public
Siu,

thanks for the code!

Unfortunately, I got the following error while running the cuda_ipc.py example:

Traceback (most recent call last):
  File "cuda_ipc.py", line 60, in <module>
    main()
  File "cuda_ipc.py", line 56, in main
    parent()
  File "cuda_ipc.py", line 12, in parent
    darr = cuda.to_device(arr)
  File "/home/goldstein/anaconda2/lib/python2.7/site-packages/numba-0.28.0.dev0+81.g0a9d560-py2.7-linux-x86_64.egg/numba/cuda/cudadrv/devices.py", line 257, in _require_cuda_context
    get_context()
  File "/home/goldstein/anaconda2/lib/python2.7/site-packages/numba-0.28.0.dev0+81.g0a9d560-py2.7-linux-x86_64.egg/numba/cuda/cudadrv/devices.py", line 240, in get_context
    return _runtime.get_or_create_context(devnum)
  File "/home/goldstein/anaconda2/lib/python2.7/site-packages/numba-0.28.0.dev0+81.g0a9d560-py2.7-linux-x86_64.egg/numba/cuda/cudadrv/devices.py", line 202, in get_or_create_context
    return self.push_context(self.gpus[devnum])
  File "/home/goldstein/anaconda2/lib/python2.7/site-packages/numba-0.28.0.dev0+81.g0a9d560-py2.7-linux-x86_64.egg/numba/cuda/cudadrv/devices.py", line 40, in __getitem__
    return self.lst[devnum]
  File "/home/goldstein/anaconda2/lib/python2.7/site-packages/numba-0.28.0.dev0+81.g0a9d560-py2.7-linux-x86_64.egg/numba/cuda/cudadrv/devices.py", line 26, in __getattr__
    numdev = driver.get_device_count()
  File "/home/goldstein/anaconda2/lib/python2.7/site-packages/numba-0.28.0.dev0+81.g0a9d560-py2.7-linux-x86_64.egg/numba/cuda/cudadrv/driver.py", line 292, in get_device_count
    self.cuDeviceGetCount(byref(count))
  File "/home/goldstein/anaconda2/lib/python2.7/site-packages/numba-0.28.0.dev0+81.g0a9d560-py2.7-linux-x86_64.egg/numba/cuda/cudadrv/driver.py", line 234, in __getattr__
    self.initialize()
  File "/home/goldstein/anaconda2/lib/python2.7/site-packages/numba-0.28.0.dev0+81.g0a9d560-py2.7-linux-x86_64.egg/numba/cuda/cudadrv/driver.py", line 199, in initialize
    self._initialize_extras()
  File "/home/goldstein/anaconda2/lib/python2.7/site-packages/numba-0.28.0.dev0+81.g0a9d560-py2.7-linux-x86_64.egg/numba/cuda/cudadrv/driver.py", line 213, in _initialize_extras
    call_cuIpcOpenMemHandle)
  File "/home/goldstein/anaconda2/lib/python2.7/site-packages/numba-0.28.0.dev0+81.g0a9d560-py2.7-linux-x86_64.egg/numba/cuda/cudadrv/driver.py", line 250, in _wrap_api_call
    @functools.wraps(libfn)
  File "/home/goldstein/anaconda2/lib/python2.7/functools.py", line 33, in update_wrapper
    setattr(wrapper, attr, getattr(wrapped, attr))
AttributeError: 'CFunctionType' object has no attribute '__name__'

I don't know if I missed something when compiling the source code or if it's something else.

Best Regards,
Brunno
 

Siu Kwan Lam

Aug 10, 2016, 9:55:58 AM
to Numba Public Discussion - Public
Oh no, that's a Python 2.7-specific bug.
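
A minimal illustration of the failure, assuming a bare ctypes CFunctionType like the one the driver wraps: Python 2.7's functools.update_wrapper copies __name__ unconditionally, while Python 3.2+ skips attributes the wrapped object does not have.

import ctypes
import functools

PROTO = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)
cfunc = PROTO(lambda x: x + 1)       # a CFunctionType instance; it has no __name__

@functools.wraps(cfunc)              # AttributeError on Python 2.7, works on Python 3.2+
def checked_call(*args):
    return cfunc(*args)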


Siu Kwan Lam

Aug 10, 2016, 11:28:23 AM
to Numba Public Discussion - Public
Should be fixed now.

Brunno Goldstein

Aug 14, 2016, 12:04:03 AM
to Numba Public Discussion - Public
Siu,
sorry for the delay and thanks for the commit!


Since I'm using Python 2.7, I've made some changes to your cuda_ipc example code. Now the problem is that I'm getting a CudaDriverError exception when the child process tries to work with the darr pointer.

Here is the traceback:


Traceback (most recent call last):
  File "/home/goldstein/anaconda2/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/goldstein/anaconda2/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "tst.py", line 14, in worker
    with ipch as darr:
  File "/home/goldstein/anaconda2/lib/python2.7/site-packages/numba-0.28.0.dev0+82.g0d6fe0c-py2.7-linux-x86_64.egg/numba/cuda/cudadrv/devicearray.py", line 402, in __enter__
    return self.open()
  File "/home/goldstein/anaconda2/lib/python2.7/site-packages/numba-0.28.0.dev0+82.g0d6fe0c-py2.7-linux-x86_64.egg/numba/cuda/cudadrv/devicearray.py", line 392, in open
    dptr = self._ipc_handle.open(devices.get_context())
  File "/home/goldstein/anaconda2/lib/python2.7/site-packages/numba-0.28.0.dev0+82.g0d6fe0c-py2.7-linux-x86_64.egg/numba/cuda/cudadrv/devices.py", line 240, in get_context
    return _runtime.get_or_create_context(devnum)
  File "/home/goldstein/anaconda2/lib/python2.7/site-packages/numba-0.28.0.dev0+82.g0d6fe0c-py2.7-linux-x86_64.egg/numba/cuda/cudadrv/devices.py", line 199, in get_or_create_context
    return self.current_context
  File "/home/goldstein/anaconda2/lib/python2.7/site-packages/numba-0.28.0.dev0+82.g0d6fe0c-py2.7-linux-x86_64.egg/numba/cuda/cudadrv/devices.py", line 125, in current_context
    assert driver.get_context().value == top.handle.value, (
  File "/home/goldstein/anaconda2/lib/python2.7/site-packages/numba-0.28.0.dev0+82.g0d6fe0c-py2.7-linux-x86_64.egg/numba/cuda/cudadrv/driver.py", line 313, in get_context
    raise CudaDriverError("CUDA initialized before forking")
CudaDriverError: CUDA initialized before forking

Best Regards,

Brunno

Siu Kwan Lam

Aug 15, 2016, 11:08:48 AM
to Numba Public Discussion - Public
CUDA does not support forking after it has been initialized. You will need to delay CUDA initialization (by avoiding any CUDA features) until after you have forked the process. Alternatively, spawn a new process instead of using fork().
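
A minimal sketch of the spawn approach, assuming Python 3.4+'s multiprocessing.get_context('spawn'); on Python 2.7 the equivalent is to start the child processes before touching any CUDA API. Names here are illustrative:

import multiprocessing as mp
import numpy as np
from numba import cuda

def worker(q):
    ipch = q.get()
    with ipch as darr:               # open the parent's allocation via CUDA IPC
        print(darr.copy_to_host())

def main():
    ctx = mp.get_context('spawn')    # child starts a fresh interpreter, no inherited CUDA state
    q = ctx.Queue()
    p = ctx.Process(target=worker, args=(q,))
    p.start()
    darr = cuda.to_device(np.arange(10, dtype=np.float32))
    q.put(darr.get_ipc_handle())     # the handle is picklable; the device array is not
    p.join()

if __name__ == '__main__':
    main()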
