Boosting numpy with jit nopython mode seems to have no effect

Will Tsing

May 20, 2017, 9:27:46 PM

to Numba Public Discussion - Public

Hi there,

I'm new here and am trying to speed up my numpy calculation with this great library.

Here's a function that I'll call thousands of times:

def updatePsi(n, theta, distance, pl):
    """n > 50
    distance is an (n, n, len(theta)) ndarray
    theta is a 1-d ndarray
    """

    Psi = np.zeros((n, n))
    one = np.ones(n)
    psi = np.zeros((n, 1))
    newPsi = np.exp(-np.sum(theta * np.power(distance, pl), axis=2))

I have found that this

newPsi = np.exp(-np.sum(theta * np.power(distance, pl), axis=2))

line takes most of the runtime, so I'm trying to use a jit decorator to speed up this function. To make it run in nopython mode, I've changed the code to the following:

test_1 = theta * np.power(distance, pl)
total = np.zeros_like(test_1[:, :, 0])
for i in range(len(theta)):
    total += test_1[:, :, i]
total = -total
newPsi = np.exp(total)
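As a sanity check, here is a minimal sketch comparing the loop-based rewrite with the original one-liner. The small random inputs are made up for illustration, and it assumes a NumPy recent enough to have np.random.default_rng.

```python
import numpy as np

# Small, made-up inputs just for the comparison
n, k = 4, 3
rng = np.random.default_rng(0)
theta = rng.random(k)
distance = rng.random((n, n, k))
pl = np.full(k, 2.0)

# Original one-liner
one_liner = np.exp(-np.sum(theta * np.power(distance, pl), axis=2))

# Loop-based rewrite: accumulate the last axis slice by slice
terms = theta * np.power(distance, pl)
total = np.zeros_like(terms[:, :, 0])
for i in range(len(theta)):
    total += terms[:, :, i]
looped = np.exp(-total)

print(np.allclose(one_liner, looped))
```

Both forms do the same array-level work, so the loop over the last axis only replaces np.sum and leaves the np.power and np.exp calls untouched.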

However, the new code with jit(nopython=True) shows no effect on speed. I'm just using a for loop to time this code:

t0 = time.time()
n = 100
k = 10
for i in range(2000):
    nopython_updatePsi(n, np.ones(k), np.ones((n, n, k)) * 5, np.ones(k) * 2)
t1 = time.time() - t0

The original code without numba jit runs in about 15.59 seconds, while the version with @jit(nopython=True) takes 16.12 seconds.

Is this the right way to use numba? Or have I missed some instructions that would make this code quicker?
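One thing that may be worth separating out in a benchmark like this is the cost of building the inputs and, for a jitted function, the first call. The sketch below is only an illustration: update_psi and time_calls are hypothetical names, and it times the plain numpy expression with the arrays built once outside the loop.

```python
import time

import numpy as np


def update_psi(theta, distance, pl):
    # Same expression as the hot line in the function above
    return np.exp(-np.sum(theta * np.power(distance, pl), axis=2))


def time_calls(func, args, repeats=200):
    # Hypothetical helper: returns the average seconds per call
    func(*args)  # warm-up call; a jitted function would compile here
    start = time.perf_counter()
    for _ in range(repeats):
        func(*args)
    return (time.perf_counter() - start) / repeats


n, k = 100, 10
theta = np.ones(k)
distance = np.ones((n, n, k)) * 5  # built once, not on every iteration
pl = np.ones(k) * 2

per_call = time_calls(update_psi, (theta, distance, pl))
print("seconds per call:", per_call)
```

The same helper could be pointed at a jitted version of the function to compare the two under identical conditions.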

Thanks in advance.

Will

Carl Sanders

May 20, 2017, 9:52:42 PM

to Numba Public Discussion - Public

Hi Will,

The numba code you have written is basically what numpy already does; if you take a look at

https://docs.scipy.org/doc/numpy/user/whatisnumpy.html

it discusses how numpy gets near-C (and thereby near-numba) speeds when operating on arrays.

Numba is best when working on problems that are difficult/impossible to implement with numpy directly, e.g. if there were some complicated control flow issues while filling your newPsi matrix. As it is now, your function seems to be running about as fast as it ever will, at least in a single process.
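To make that concrete, here is a rough sketch of the kind of case meant above. The early-exit rule (stop accumulating once the running sum passes a cutoff) is purely hypothetical and is not part of the original function; psi_with_cutoff is likewise a made-up name. The guarded import is only a convenience so the sketch still runs as plain Python when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback: run as ordinary (slow) Python if Numba is unavailable
    def njit(func):
        return func


@njit
def psi_with_cutoff(theta, distance, p, cutoff):
    # Hypothetical rule: stop summing along the last axis once the
    # running sum exceeds `cutoff`. This per-element early exit is
    # awkward to express with whole-array numpy operations.
    n0, n1, nk = distance.shape
    out = np.empty((n0, n1))
    for i in range(n0):
        for j in range(n1):
            s = 0.0
            for k in range(nk):
                s += theta[k] * distance[i, j, k] ** p[k]
                if s > cutoff:
                    break
            out[i, j] = np.exp(-s)
    return out


theta = np.ones(3)
distance = np.full((2, 2, 3), 2.0)
p = np.full(3, 2.0)
result = psi_with_cutoff(theta, distance, p, 5.0)
print(result)
```

Without a data-dependent branch like this, the loops would just repeat what numpy already does internally.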

Best,

carl

Leopold Haimberger

May 21, 2017, 11:00:58 AM

to Numba Public Discussion - Public

Hi Will, I agree with Carl.

Nevertheless I found two possibilities to speed up your code:

The first one: it helps if you do not construct a new array every time you call the function. In the code below I replaced np.ones((n, n, k)) with an array defined beforehand.

The second, more interesting one: in the Numba 0.34 development version you can use the keyword "parallel" in the (n)jit decorator. If you have a multicore machine, that can give nice speedups. Please check the documentation for how to install a development version.

Both measures together give a nice 7x speedup on my machine (28 cores). The code follows

Leo

import time

import numpy as np
from numba import njit


@njit(parallel=True)
def njit_updatePsi(n, theta, distance, pl):
    Psi = np.zeros((n, n))
    one = np.ones(n)
    psi = np.zeros((n, 1))
    total = np.zeros((n, n))
    newPsi = np.power(distance, pl)
    # Explicit loops replace -np.sum(theta * newPsi, axis=2)
    for i in range(newPsi.shape[0]):
        for j in range(newPsi.shape[1]):
            for k in range(newPsi.shape[2]):
                total[i, j] -= theta[k] * newPsi[i, j, k]
    newPsi = np.exp(total)
    return


def updatePsi(n, theta, distance, pl):
    Psi = np.zeros((n, n))
    one = np.ones(n)
    psi = np.zeros((n, 1))
    newPsi = np.exp(-np.sum(theta * np.power(distance, pl), axis=2))
    return


n = 100
k = 10

# Build the large input array once, outside the timed loops
ini = np.ones((n, n, k)) * 5

# Call the jitted function once so compilation is not timed
njit_updatePsi(n, np.ones(k), ini, np.ones(k) * 2)

# Time the plain numpy version
t0 = time.time()
for i in range(2000):
    updatePsi(n, np.ones(k), ini, np.ones(k) * 2)
t1 = time.time() - t0
print(t1)

# Time the jitted version
t0 = time.time()
for i in range(2000):
    njit_updatePsi(n, np.ones(k), ini, np.ones(k) * 2)
t1 = time.time() - t0
print(t1)
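A possible variation, offered only as a sketch: the rows can be marked as independent explicitly with numba.prange, the intermediate (n, n, k) array can be skipped by accumulating each sum in a scalar, and the result can be returned so it can be checked against numpy. psi_prange is a hypothetical name, the random inputs are made up, and the fallback in the except branch exists only so the sketch runs without Numba. No speedup is claimed here; that would need to be measured on the machine in question.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Fallback so the sketch still runs (serially) without Numba
    prange = range

    def njit(*args, **kwargs):
        if args and callable(args[0]):
            return args[0]
        return lambda func: func


@njit(parallel=True)
def psi_prange(theta, distance, pl):
    n0, n1, nk = distance.shape
    out = np.empty((n0, n1))
    for i in prange(n0):  # each row is written by exactly one iteration
        for j in range(n1):
            s = 0.0
            for k in range(nk):
                s += theta[k] * distance[i, j, k] ** pl[k]
            out[i, j] = np.exp(-s)
    return out


# Made-up inputs for the comparison
n, k = 50, 10
rng = np.random.default_rng(0)
theta = rng.random(k)
distance = rng.random((n, n, k))
pl = np.full(k, 2.0)

expected = np.exp(-np.sum(theta * np.power(distance, pl), axis=2))
actual = psi_prange(theta, distance, pl)
print(np.allclose(actual, expected))
```

Returning the array also guards against the compiler discarding work whose result is never used, which can distort timings.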
