Using reikna in different processes

wagner...@gmx.at

Jun 29, 2018, 7:08:44 AM

to reikna

Hi @all!

I have a computation framework which uses multiprocessing.Process. Each of the spawned processes has to perform FFTs, which I want to offload to a Tesla GPU using reikna.
However, I get "pycuda._driver.LogicError: cuDeviceGetCount failed: initialization error" when using reikna within multiple processes.
See the example below.

Is there a way to use reikna within multiple processes?

best regards
Thomas

def init_reikna():
    from reikna import cluda

    # init reikna in the child process
    api = cluda.cuda_api()
    dev = api.get_platforms()[0].get_devices()[0]
    thr = api.Thread(dev)

    # do work here

if __name__ == '__main__':
    import numpy as np
    from reikna import cluda
    from reikna.fft import FFT

    from multiprocessing import Process

    # init reikna
    api = cluda.cuda_api()
    dev = api.get_platforms()[0].get_devices()[0]
    thr = api.Thread(dev)

    # init data
    wind = np.random.rand(256, 256, 8) + 1j * np.random.rand(256, 256, 8)
    data = np.random.rand(256, 256, 8) + 1j * np.random.rand(256, 256, 8)

    # precompile (last axis zero-padded from 8 to 32)
    d = np.pad(data, ((0, 0), (0, 0), (0, 32 - data.shape[2])), 'constant')
    fft = FFT(d, axes=(0, 1, 2))
    fftc = fft.compile(thr, fast_math=True)

    # child process is forked after CUDA was initialized in the parent
    p = Process(target=init_reikna)
    p.start()

    # do some dummy work in main
    for i in range(10):
        d = wind * data
        # pad the windowed data (the original padded `data`, discarding the product)
        d = np.pad(d, ((0, 0), (0, 0), (0, 32 - d.shape[2])), 'constant')
        data_dev = thr.to_device(d)
        fftc(data_dev, data_dev)
        fwd = data_dev.get()
        print(fwd.shape)

    p.join()

Bogdan Opanchuk

Jun 29, 2018, 9:26:54 AM

to reikna

Hi Thomas,

This looks like a PyCUDA issue rather than a reikna one. What operating system are you using? There seems to be a similar problem on OSX (see https://stackoverflow.com/questions/14719065/pycuda-multiprocessing-issue-on-os-x-10-8 ), where PyCUDA can't be used in a forked process. Perhaps you could switch to something like mpi4py?

wagner...@gmx.at

Jun 29, 2018, 11:02:40 AM

to reikna

Hi!

Thanks for the tip. I'm using Linux, where the multiprocessing module forks by default. Googling around shows that forking is incompatible with PyCUDA.
Using spawned processes instead of forked ones works.
However, startup is much slower with spawned processes than with forked ones.
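
For anyone finding this later, here is a minimal sketch of the spawn-based variant. It is not my actual framework: the names make_input and fft_worker, the array shape, and the number of workers are made up for illustration, and the reikna calls are the same ones as in my first post. Each worker creates its own reikna Thread and compiles the FFT once, then loops over several inputs, so the slower startup is paid only once per process.

```python
import multiprocessing as mp

import numpy as np

SHAPE = (256, 256, 32)  # illustrative shape, not taken from a real workload


def make_input(seed, shape=SHAPE):
    """Build a random complex test array (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    return rng.random(shape) + 1j * rng.random(shape)


def fft_worker(worker_id, n_iter):
    """Runs in a spawned child: set up reikna once, then do several FFTs."""
    # Import inside the child so CUDA is only ever initialized in this process.
    from reikna import cluda
    from reikna.fft import FFT

    api = cluda.cuda_api()
    dev = api.get_platforms()[0].get_devices()[0]
    thr = api.Thread(dev)

    # Compile once per process and reuse the compiled FFT for every input.
    fftc = FFT(make_input(0), axes=(0, 1, 2)).compile(thr)

    for i in range(n_iter):
        data_dev = thr.to_device(make_input(worker_id * n_iter + i))
        fftc(data_dev, data_dev)  # in-place forward transform
        print(worker_id, i, data_dev.get().shape)


if __name__ == "__main__":
    # "spawn" starts a fresh interpreter instead of forking the parent,
    # so the child does not inherit any CUDA state.
    ctx = mp.get_context("spawn")
    procs = [ctx.Process(target=fft_worker, args=(k, 5)) for k in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print([p.exitcode for p in procs])
```

The `if __name__ == "__main__":` guard is required with spawn, because each child re-imports the main module. For the same reason, no reikna or PyCUDA initialization should sit at module level.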