Boosting numpy with jit nopython mode, but it seems to have no effect


Will Tsing

May 20, 2017, 9:27:46 PM5/20/17
to Numba Public Discussion - Public
Hi there,
    I'm new here and trying to boost my numpy calculations with this great package.
    Here's a function that I'll call thousands of times:
import numpy as np

def updatePsi(n, theta, distance, pl):
    """n > 50
    distance is an (n, n, len(theta)) ndarray
    theta is a 1-d ndarray"""
    Psi = np.zeros((n, n))
    one = np.ones(n)
    psi = np.zeros((n, 1))
    newPsi = np.exp(-np.sum(theta * np.power(distance, pl), axis=2))
I've found that this line
newPsi = np.exp(-np.sum(theta * np.power(distance, pl), axis=2))
takes most of the runtime, so I'm trying to use a jit decorator to speed the function up. To make it run in nopython mode, I've changed that line to the following:
test_1 = theta * np.power(distance, pl)
total = np.zeros_like(test_1[:, :, 0])
for i in range(len(theta)):
    total += test_1[:, :, i]
total = -total
newPsi = np.exp(total)
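For completeness, this is roughly what the jitted function I am timing looks like once that fragment is put back into updatePsi; the setup lines are unchanged, and the name matches the benchmark below:

from numba import jit
import numpy as np

@jit(nopython=True)
def nopython_updatePsi(n, theta, distance, pl):
    Psi = np.zeros((n, n))
    one = np.ones(n)
    psi = np.zeros((n, 1))
    # Explicit sum over the last axis instead of np.sum(..., axis=2)
    test_1 = theta * np.power(distance, pl)
    total = np.zeros_like(test_1[:, :, 0])
    for i in range(len(theta)):
        total += test_1[:, :, i]
    total = -total
    newPsi = np.exp(total)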
However, the new code with jit(nopython=True) shows no speed improvement. I'm just using a simple loop to benchmark it:
t0 = time.time()
n = 100
k = 10
for i in range(2000):
    nopython_updatePsi(n, np.ones(k), np.ones((n, n, k)) * 5, np.ones(k) * 2)
t1 = time.time() - t0

The original code without the numba jit runs in about 15.59 seconds, while the version with @jit(nopython=True) takes 16.12 seconds.

Is this the right way to use numba, or have I missed something that would make this code quicker?
Thanks in advance.

Will


   


Carl Sanders

May 20, 2017, 9:52:42 PM5/20/17
to Numba Public Discussion - Public
Hi Will,
The numba code you have written is basically what numpy already does; numpy itself already gets near-C (and thereby near-numba) speeds when operating on whole arrays, so compiling an equivalent loop doesn't buy you much here.
Numba is best when working on problems that are difficult/impossible to implement with numpy directly, e.g. if there were some complicated control flow issues while filling your newPsi matrix. As it is now, your function seems to be running about as fast as it ever will, at least in a single process. 
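To make that concrete, here is a toy illustration (nothing to do with your Psi computation, just a made-up example) of the kind of data-dependent loop that has no clean single numpy expression but compiles nicely in nopython mode:

import numpy as np
from numba import jit

@jit(nopython=True)
def clipped_cumsum(x, limit):
    # Running sum that resets whenever it exceeds `limit`.  Each step depends
    # on the previous one, so numpy cannot vectorize it directly, but the
    # compiled loop runs at roughly C speed.
    out = np.empty_like(x)
    acc = 0.0
    for i in range(x.shape[0]):
        acc += x[i]
        if acc > limit:
            acc = 0.0
        out[i] = acc
    return out

For loops like that, numba gives a large speedup over pure Python; for element-wise array math like yours, numpy has already done the heavy lifting in C.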
Best,
carl

Leopold Haimberger

May 21, 2017, 11:00:58 AM5/21/17
to Numba Public Discussion - Public


Hi Will, I agree with Carl.

Nevertheless I found two possibilities to speed up your code:

The first one: avoid constructing the big array every time you call the function; that helps. In the code below I replaced the np.ones((n, n, k)) argument with an array defined once beforehand.
The second, more interesting one: in the Numba 0.34 development version you can use the keyword "parallel" in the (n)jit decorator. On a multicore machine that can give nice speedups. Please check the documentation for how to install a development version.

Both measures together give a nice 7x speedup on my machine (28 cores). The code follows:

Leo

import numpy as np
import time
from numba import njit


@njit(parallel=True)
def njit_updatePsi(n, theta, distance, pl):
    # Unused allocations kept from the original function so the timings stay comparable.
    Psi = np.zeros((n, n))
    one = np.ones(n)
    psi = np.zeros((n, 1))
    total = np.zeros((n, n))
    # With parallel=True the whole-array operations (np.power, np.exp) are
    # candidates for automatic parallelization across cores.
    newPsi = np.power(distance, pl)
    # Explicit summation over the last axis instead of np.sum(..., axis=2).
    for i in range(newPsi.shape[0]):
        for j in range(newPsi.shape[1]):
            for k in range(newPsi.shape[2]):
                total[i, j] -= theta[k] * newPsi[i, j, k]
    newPsi = np.exp(total)
    return

def updatePsi(n, theta, distance, pl):

    Psi = np.zeros((n, n))
    one = np.ones(n)
    psi = np.zeros((n, 1))
    newPsi = np.exp(-np.sum(theta * np.power(distance, pl), axis=2))
    return

n = 100
k = 10
ini = np.ones((n, n, k)) * 5
# First call triggers JIT compilation, so it is not included in the timings.
njit_updatePsi(n, np.ones(k), ini, np.ones(k) * 2)

t0 = time.time()
for i in range(2000):
    updatePsi(n, np.ones(k), ini, np.ones(k) * 2)
t1 = time.time() - t0
print(t1)

t0 = time.time()
for i in range(2000):
    njit_updatePsi(n, np.ones(k), ini, np.ones(k) * 2)
t1 = time.time() - t0
print(t1)
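One more idea that might be worth trying: with parallel=True you can also mark the outer loop itself with numba.prange so that its iterations are distributed across threads, instead of relying only on the parallelized array expressions. A rough sketch, assuming the same development version (I have not timed this variant):

from numba import njit, prange
import numpy as np

@njit(parallel=True)
def njit_updatePsi_prange(n, theta, distance, pl):
    # Same computation as above, but the outer loop uses prange so Numba
    # splits its iterations across threads; each thread writes its own rows
    # of `total`, so there is no race.
    total = np.zeros((n, n))
    newPsi = np.power(distance, pl)
    for i in prange(newPsi.shape[0]):
        for j in range(newPsi.shape[1]):
            for k in range(newPsi.shape[2]):
                total[i, j] -= theta[k] * newPsi[i, j, k]
    return np.exp(total)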