Use GPU and numba in a simple function


Mehdi Ravanbakhsh

Jun 24, 2016, 2:04:46 AM
to conda - Public
Hi All,

I am trying to implement this function on the GPU, but I get different types of errors, ranging from sqrt being unknown to type-casting issues. I would appreciate it if somebody could tell me what is wrong with this code:


import numpy as np
from numba import vectorize

@vectorize(['float32(float32, float32)'], target='gpu')
def thin_pts(pts, thr):
    # pts is an n*2 array of points; thr is a single distance threshold
    out_x = np.array([])
    out_y = np.array([])
    while 1:
        n = pts.shape[0]
        if n > 1:
            # distance from the first point to all remaining points
            dist = sqrt((pts[0, 0] - pts[1:n, 0])**2 + (pts[0, 1] - pts[1:n, 1])**2)
            # keep the first point only if it is farther than thr from every other point
            if np.all(dist > thr):
                out_x = np.append(out_x, pts[0, 0])
                out_y = np.append(out_y, pts[0, 1])
            pts = np.delete(pts, 0, axis=0)
        else:
            break
    return (out_x, out_y)

Stan Seibert

Jun 27, 2016, 10:20:29 AM
to conda - Public
Hi Mehdi,

I suspect the immediate problem (though without the full code I can't be sure) is that you need to import the sqrt function, like this:

from math import sqrt

However, there is a larger problem: @vectorize is designed to create ufuncs, and what you have written below is not a ufunc.  A ufunc takes scalar arguments and relies on NumPy's broadcasting rules to extend the operation across arrays.
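
For example, a per-element distance computation is a natural ufunc.  Something like this (just an illustrative sketch, assuming float32 data, not your full algorithm):

from math import sqrt
from numba import vectorize

@vectorize(['float32(float32, float32, float32, float32)'])
def pt_dist(x0, y0, x1, y1):
    # every argument here is a scalar; NumPy broadcasting applies the
    # function elementwise when you pass arrays for x1 and y1
    return sqrt((x0 - x1)**2 + (y0 - y1)**2)

# e.g. pt_dist(pts[0, 0], pts[0, 1], pts[1:, 0], pts[1:, 1]) gives the
# distances from the first point to all the others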

I'm not totally sure what the intended algorithm here is, but you might be able to write this as a generalized ufunc:


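As a rough, untested sketch of the idea (assuming the goal is distances from a reference point; the names are just illustrative):

from math import sqrt
from numba import guvectorize, float32

@guvectorize([(float32[:, :], float32[:], float32[:])], '(n,m),(m)->(n)')
def dists_from(pts, ref, out):
    # for each row of pts, compute its distance to the reference point
    for i in range(pts.shape[0]):
        out[i] = sqrt((pts[i, 0] - ref[0])**2 + (pts[i, 1] - ref[1])**2)

# the thinning loop itself (np.all, np.append, np.delete) would then stay
# in plain NumPy on the host, calling dists_from for each candidate point
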
But keep in mind that memory allocation on the GPU, of the sort that np.append, np.delete, and np.array do, is not supported by Numba's GPU target.  That is partly on purpose, because that kind of code runs extremely inefficiently on the GPU.

Your final option would be to write a CUDA kernel using the cuda.jit decorator, but this requires learning the CUDA execution model:


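As a rough illustration of what that looks like (again only a sketch of the same distance computation; it assumes a CUDA-capable GPU):

import math
from numba import cuda

@cuda.jit
def dist_kernel(pts, ref_x, ref_y, out):
    # one thread per point
    i = cuda.grid(1)
    if i < pts.shape[0]:
        dx = pts[i, 0] - ref_x
        dy = pts[i, 1] - ref_y
        out[i] = math.sqrt(dx * dx + dy * dy)

# launched as e.g. dist_kernel[num_blocks, threads_per_block](pts, x0, y0, out)
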
I hope that helps point you in the right direction. 