Is it possible to write Python code inside a source module rather than C code?

anush2...@gmail.com

Sep 10, 2017, 3:00:45 AM
to reikna
Hi, I want to parallelize a for loop in PyCUDA, and the code snippet is:

for u, Wu in enumerate(W):
    X[u] = np.linalg.solve(np.dot(Y, np.dot(np.diag(Wu), Y.T)) + lambda_ * np.eye(n_factors),
                           np.dot(Y, np.dot(np.diag(Wu), Q_train[u].T))).T  # solve(A, B) solves AX = B

But I'm wondering how to write C code for such a complicated matrix operation. If Python code could be written inside the source module, I feel the above code could easily be parallelized.
Please give your opinion on this and, if possible, some hints on how to parallelize the above code in PyCUDA.
Thanks in advance.

Bogdan Opanchuk

Sep 12, 2017, 10:15:29 PM
to reikna
Source modules are compiled and executed on a GPU, so Python is not available there. You will have to split your expression into computation steps, for instance (see the NumPy sketch after this list):
- t1 = diag(Wu)
- t2 = t1 @ Y.T
- t3 = Y @ t2
...
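
For concreteness, here is the loop body from the original snippet with those intermediate steps spelled out. This is a plain NumPy sketch, not reikna code, and the toy sizes and random data at the top are only there to make it runnable; the real W, Y, Q_train and lambda_ come from your program.

    import numpy as np

    # Toy sizes and data for illustration only.
    n_users, n_items, n_factors = 4, 6, 3
    Y = np.random.randn(n_factors, n_items)
    W = np.random.rand(n_users, n_items)
    Q_train = np.random.rand(n_users, n_items)
    lambda_ = 0.1
    X = np.zeros((n_users, n_factors))

    for u, Wu in enumerate(W):
        t1 = np.diag(Wu)                      # t1 = diag(Wu)
        t2 = t1 @ Y.T                         # t2 = t1 @ Y.T
        t3 = Y @ t2                           # t3 = Y @ diag(Wu) @ Y.T
        A = t3 + lambda_ * np.eye(n_factors)  # regularized system matrix
        b = Y @ (t1 @ Q_train[u].T)           # right-hand side
        X[u] = np.linalg.solve(A, b).T        # the solve() step discussed below

Each of t1, t2, t3 and the additions maps to a matrix multiplication or elementwise operation that can be expressed on the GPU; only the final solve() is problematic, as noted below.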

Some steps can be joined together for better performance, for example diag(Wu) can be created on the spot using a transformation, so that you don't have to keep it in memory.
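
A quick NumPy illustration of why diag(Wu) never has to exist in memory: multiplying Y by diag(Wu) on the right is the same as scaling the columns of Y elementwise, which is the kind of on-the-fly rewrite a transformation can express (the sizes below are made up):

    import numpy as np

    n_factors, n_items = 3, 6
    Y = np.random.randn(n_factors, n_items)
    Wu = np.random.rand(n_items)

    explicit = Y @ np.diag(Wu) @ Y.T  # materializes an n_items x n_items matrix
    fused = (Y * Wu) @ Y.T            # just scales the columns of Y instead
    assert np.allclose(explicit, fused)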

The main problem here is that reikna does not have a computation corresponding to numpy's solve(). You will have to either implement it or limit yourself to one of CUDA or OpenCL and use an existing function from the corresponding linear algebra library.
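
If it helps, here is a hedged CPU-only NumPy sketch of what the fully batched version of the loop looks like: all users are stacked along a leading axis, and np.linalg.solve broadcasts over that axis. A GPU port would need an equivalent batched solve from whichever CUDA or OpenCL linear algebra library you choose; the shapes and toy data below are assumptions for illustration.

    import numpy as np

    # Assumed shapes: Y is (n_factors, n_items), W and Q_train are (n_users, n_items).
    n_users, n_items, n_factors = 4, 6, 3
    Y = np.random.randn(n_factors, n_items)
    W = np.random.rand(n_users, n_items)
    Q_train = np.random.rand(n_users, n_items)
    lambda_ = 0.1

    YW = Y[None, :, :] * W[:, None, :]          # Y @ diag(Wu) for every user at once
    A = YW @ Y.T + lambda_ * np.eye(n_factors)  # (n_users, n_factors, n_factors)
    B = YW @ Q_train[:, :, None]                # (n_users, n_factors, 1)

    # np.linalg.solve broadcasts over the leading axis, solving all systems in one call.
    X = np.linalg.solve(A, B)[:, :, 0]          # (n_users, n_factors)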