In the code below, I have a simple for-loop that I'd like to replace with a faster, vectorized, Numba-parallelized implementation as well as a CUDA implementation.
import numpy as np

b = np.array([9, 8100, -60, 7], dtype=np.float64)
a = np.array([584, -11, 23, 79, 1001, 0, -19], dtype=np.float64)
m = 3
n = b.shape[0]
l = n - m + 1
k = a.shape[0] - m + 1
QT = np.array([-85224., 181461., 580047., 8108811., 10149.])
QT_first = QT.copy()

# Placeholder statistics; in the real code these are pre-computed from a and b
b_mean = np.ones(l)
b_stddev = np.ones(l)
a_stddev = np.ones(k)

out = [None] * l
for i in range(1, l):
    # Shift the sliding dot product: drop the oldest term, add the newest
    QT[1:] = QT[:k-1] - b[i-1] * a[:k-1] + b[i-1+m] * a[-(k-1):]
    QT[0] = QT_first[i]
    # Update: this is not the REAL calculation below but a proxy.
    # Use QT above to do something with the ith element of out.
    # As i updates in each iteration, QT changes.
    out[i] = np.argmin((QT + b_mean[i] * m) / (b_stddev[i] * m * a_stddev))
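For reference, the QT[1:] line is the standard constant-time shift of a sliding dot product. A quick self-contained NumPy check (using the sample data above) that the recurrence reproduces direct window dot products:

```python
import numpy as np

b = np.array([9, 8100, -60, 7], dtype=np.float64)
a = np.array([584, -11, 23, 79, 1001, 0, -19], dtype=np.float64)
m = 3
k = a.shape[0] - m + 1

def sliding_dots(i):
    # Direct dot products of window b[i:i+m] against every window of a
    return np.array([np.dot(b[i:i + m], a[j:j + m]) for j in range(k)])

QT = sliding_dots(0)  # equals the hard-coded initial QT in the question
i = 1
QT_new = np.empty(k)
# Recurrence: reuse QT from the previous row, shifted by one
QT_new[1:] = QT[:k - 1] - b[i - 1] * a[:k - 1] + b[i - 1 + m] * a[-(k - 1):]
# QT_new[0] is not set here; in the question's code it comes from QT_first
```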
In my real function, the input arrays a and b can be variable in length and very long. Note that QT depends on m and on the length of b, both of which will always be provided. Also, one might be tempted to recommend some sort of traditional convolution, but convolution does not solve my problem: convolving only gives me the final QT, whereas I actually need each intermediate QT for another calculation (see the argmin line, which depends on pre-computed quantities derived from the input arrays) before updating it for the next iteration of the for-loop.
What is the best way to replace the for-loop with Numba so that it is faster on CPU?
Is it possible to use multiple threads with nogil and prange in this instance?
What is the best way to replace the for-loop with Numba so that I can port this to GPU as well with CUDA Jit?
I would greatly appreciate any help in porting this code over to Numba so that I can leverage parallel CPU computation as well as GPU CUDA computation.