Is it possible to write Python code inside a source module rather than C code?


anush2...@gmail.com

Sep 10, 2017, 3:00:45 AM

to reikna

Hi, I want to parallelize a for loop in PyCUDA. The code snippet is:

    for u, Wu in enumerate(W):
        # solve(A, B) solves A X = B for X
        X[u] = np.linalg.solve(
            np.dot(Y, np.dot(np.diag(Wu), Y.T)) + lambda_ * np.eye(n_factors),
            np.dot(Y, np.dot(np.diag(Wu), Q_train[u].T)),
        ).T

But I'm wondering how to write C code for such a complicated matrix operation. If Python code could be written inside the source module, I feel the above code could easily be parallelized.
Please give your opinion on this and, if possible, some hints on how to parallelize the above code in PyCUDA.
Thanks in advance.

Bogdan Opanchuk

Sep 12, 2017, 10:15:29 PM

to reikna

Source modules are compiled and executed on a GPU, so Python is not available there. You will have to split your expressions into computation steps, for instance:
- t1 = diag(Wu)
- t2 = t1 @ Y.T
- t3 = Y @ t2
...
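As a CPU-side illustration of this splitting, here is a minimal NumPy sketch that writes out each intermediate step for one user and checks it against the original one-line expression. The array sizes, the random data and the helper name solve_user_stepwise are assumptions made up for the example; on a GPU each step would become its own kernel or library call.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, n_factors = 4, 6, 3      # hypothetical sizes
lambda_ = 0.1
W = rng.random((n_users, n_items))         # per-user weights
Q_train = rng.random((n_users, n_items))   # per-user training values
Y = rng.random((n_factors, n_items))       # item factors


def solve_user_stepwise(Wu, Qu):
    """Same computation as the one-liner, one operation per step."""
    t1 = np.diag(Wu)                       # (n_items, n_items)
    t2 = t1 @ Y.T                          # (n_items, n_factors)
    t3 = Y @ t2                            # (n_factors, n_factors)
    A = t3 + lambda_ * np.eye(n_factors)   # left-hand side
    t4 = t1 @ Qu                           # (n_items,)
    B = Y @ t4                             # right-hand side, (n_factors,)
    return np.linalg.solve(A, B)


X_steps = np.array(
    [solve_user_stepwise(W[u], Q_train[u]) for u in range(n_users)]
)

# Reference: the original expression from the question.
X_ref = np.empty((n_users, n_factors))
for u, Wu in enumerate(W):
    X_ref[u] = np.linalg.solve(
        np.dot(Y, np.dot(np.diag(Wu), Y.T)) + lambda_ * np.eye(n_factors),
        np.dot(Y, np.dot(np.diag(Wu), Q_train[u].T)),
    ).T
```

Both arrays should agree up to floating-point rounding, which makes the stepwise version a convenient reference when porting individual steps to the GPU.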

Some steps can be joined together for better performance. For example, diag(Wu) can be created on the spot using a transformation, so that you don't have to keep it in memory.
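The same idea can be seen in plain NumPy, where multiplying by diag(Wu) is just a column scaling. This is only a sketch of the arithmetic with made-up sizes and data, not reikna's transformation API:

```python
import numpy as np

rng = np.random.default_rng(1)
n_items, n_factors = 6, 3                  # hypothetical sizes
Y = rng.random((n_factors, n_items))
Wu = rng.random(n_items)

# Explicit version: stores an (n_items, n_items) matrix that is mostly zeros.
explicit = Y @ np.diag(Wu) @ Y.T

# Fused version: broadcasting scales column j of Y by Wu[j],
# so the diagonal matrix is never materialized.
fused = (Y * Wu) @ Y.T
```

The fused form needs O(n_items * n_factors) temporary storage instead of O(n_items^2), which matters when n_items is large.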

The main problem here is that reikna does not have a computation corresponding to NumPy's solve(). You will have to either implement it yourself, or limit yourself to one of CUDA or OpenCL and use an existing function from the corresponding linear algebra library.
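If the library you pick offers a batched solver (an assumption to check against its documentation), the loop over users can be recast as one stack of small linear systems. Here is a hedged NumPy sketch of that batched formulation, with made-up sizes and data, which can serve as a CPU reference for a GPU version:

```python
import numpy as np

rng = np.random.default_rng(2)
n_users, n_items, n_factors = 5, 7, 3      # hypothetical sizes
lambda_ = 0.1
W = rng.random((n_users, n_items))
Q_train = rng.random((n_users, n_items))
Y = rng.random((n_factors, n_items))

# A[u] = Y diag(W[u]) Y^T + lambda_ * I for all users at once,
# shape (n_users, n_factors, n_factors).
A = np.einsum("fi,ui,gi->ufg", Y, W, Y) + lambda_ * np.eye(n_factors)

# B[u] = Y diag(W[u]) Q_train[u], shape (n_users, n_factors).
B = (W * Q_train) @ Y.T

# Solve every system in one call; the trailing axis turns each
# right-hand side into an explicit column vector.
X_batched = np.linalg.solve(A, B[..., None])[..., 0]

# Reference: the original per-user loop.
X_loop = np.empty((n_users, n_factors))
for u, Wu in enumerate(W):
    X_loop[u] = np.linalg.solve(
        np.dot(Y, np.dot(np.diag(Wu), Y.T)) + lambda_ * np.eye(n_factors),
        np.dot(Y, np.dot(np.diag(Wu), Q_train[u])),
    )
```

Each system here is only n_factors by n_factors, so the parallelism comes from solving many small systems side by side rather than from one large solve.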
