import numpy
from numba import jit

def dot(X):
    return numpy.dot(X.T, X)

@jit('double[:,:](double[:,:])', nopython=True, nogil=True)
def jit_dot(X):
    return numpy.dot(X.T, X)
--
You received this message because you are subscribed to the Google Groups "Numba Public Discussion - Public" group.
To unsubscribe from this group and stop receiving emails from it, send an email to numba-users+unsubscribe@continuum.io.
To post to this group, send email to numba...@continuum.io.
To view this discussion on the web visit https://groups.google.com/a/continuum.io/d/msgid/numba-users/48e8a1d2-e2b3-40e9-a61f-16ccf2572f40%40continuum.io.
For more options, visit https://groups.google.com/a/continuum.io/d/optout.
This difference doesn't make sense to me, unless the versions of NumPy and SciPy you are using were compiled against different BLAS implementations. For obscure technical reasons, Numba uses SciPy's BLAS when JIT-compiling functions that call numpy.dot. In every case we've encountered, both were compiled with the same implementation, but in principle they could differ.

Where did your numpy and scipy packages come from?
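One way to check which BLAS each package was built against is to print the build configuration of both. This is only a minimal sketch: the output format differs between versions and installers, so look for the BLAS/LAPACK library names (for example MKL or OpenBLAS) in each report and compare them by eye.

```python
import numpy

# Print the libraries NumPy was built against (look for the BLAS/LAPACK entries).
print("NumPy build configuration:")
numpy.show_config()

# SciPy may not be installed in every environment, so import it defensively.
try:
    import scipy
except ImportError:
    print("SciPy is not installed in this environment.")
else:
    print("SciPy build configuration:")
    scipy.show_config()
```

If the two reports name different BLAS libraries, that would be consistent with the explanation above, though it would not prove it on its own.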
On Sat, Nov 25, 2017 at 2:42 AM, Jacob Schreiber <jmschr...@gmail.com> wrote:
Howdy,

I'd like to access BLAS, specifically dgemm, in Numba-wrapped functions while also releasing the GIL. When I try a simple function like...

def dot(X):
    return numpy.dot(X.T, X)

@jit('double[:,:](double[:,:])', nopython=True, nogil=True)
def jit_dot(X):
    return numpy.dot(X.T, X)

I find that the `dot` function is significantly faster than `jit_dot`, and that `jit_dot` is slowed down by multithreading, not sped up.

Is there a way to use Numba to release the GIL while also accessing BLAS?

Thanks
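import numpy as np
from joblib import Parallel, delayed
import datetime as dt

N = 4  # number of repeated products, and number of threads below

X = np.random.randn(50000, 1000)
Y = np.random.randn(1000, 1000)
Z = np.empty((50000, 1000))
Zs = [np.empty((50000, 1000)) for _ in range(N)]  # one output buffer per thread

# Serial baseline: N products, one after another.
tic = dt.datetime.now()
for i in range(N):
    np.dot(X, Y, out=Z)
toc = dt.datetime.now() - tic
print(toc.total_seconds())

# The same N products dispatched to N threads.
# check_pickle=False is kept from the original post; drop it if your joblib version rejects the argument.
tic = dt.datetime.now()
with Parallel(n_jobs=N, backend='threading') as P:
    P(delayed(np.dot, check_pickle=False)(X, Y, Zs[i]) for i in range(N))
toc = dt.datetime.now() - tic
print(toc.total_seconds())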
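The script above times N serial np.dot calls against N threaded ones. The same comparison can be sketched with only the standard library and applied to the `dot` function from the original question. The helper names `time_serial` and `time_threaded`, the array sizes, and the thread count are illustrative assumptions, not part of the thread.

```python
import time
from concurrent.futures import ThreadPoolExecutor

import numpy


def dot(X):
    # Same plain-NumPy function as in the original question.
    return numpy.dot(X.T, X)


def time_serial(func, arrays):
    """Call func on each array one after another; return elapsed seconds."""
    start = time.perf_counter()
    for a in arrays:
        func(a)
    return time.perf_counter() - start


def time_threaded(func, arrays, n_threads):
    """Call func on each array from a thread pool; return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # list() forces every submitted call to finish before the timer stops.
        list(pool.map(func, arrays))
    return time.perf_counter() - start


# Small illustrative inputs; increase the sizes for a meaningful measurement.
rng = numpy.random.default_rng(0)
arrays = [rng.standard_normal((1000, 300)) for _ in range(4)]

dot(arrays[0])  # warm-up call so one-time setup costs are not timed

print("serial:  ", time_serial(dot, arrays))
print("threaded:", time_threaded(dot, arrays, n_threads=4))
```

Passing `jit_dot` in place of `dot` runs the same measurement for the compiled version. The warm-up call matters more there if the function is compiled lazily on first use.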