Cython inline function with numpy array as parameter


Maxim

Jan 10, 2011, 8:03:55 AM
to cython-users
I have a question about cython inlining functions with numpy arrays as
parameters:
http://stackoverflow.com/questions/4641200/cython-inline-function-with-numpy-array-as-parameter

Basically the code is like this:
# cython: infer_types=True
# cython: boundscheck=False
# cython: wraparound=False

import numpy as np
cimport numpy as np

cdef inline inc(np.ndarray[np.int32_t, ndim=2] arr, int i, int j):
    arr[i, j] += 1

def test1(np.ndarray[np.int32_t, ndim=2] arr):
    cdef int i, j
    for i in xrange(arr.shape[0]):
        for j in xrange(arr.shape[1]):
            inc(arr, i, j)


def test2(np.ndarray[np.int32_t, ndim=2] arr):
    cdef int i, j
    for i in xrange(arr.shape[0]):
        for j in xrange(arr.shape[1]):
            arr[i, j] += 1

test2 is about 300 times faster than test1. In the real code, though,
inc is quite a large function, and inlining it by hand looks bad. Is
there a way around this? Is it considered a bug in Cython, and are
there any plans to fix it?

Stefan Behnel

Jan 10, 2011, 9:19:09 AM
to cython...@googlegroups.com

Maxim

Jan 10, 2011, 9:38:58 AM
to cython-users
On Jan 10, 17:19, Stefan Behnel <stefan...@behnel.de> wrote:
> It's a known bug:
>
> http://trac.cython.org/cython_trac/ticket/61
> http://trac.cython.org/cython_trac/ticket/177
> http://trac.cython.org/cython_trac/ticket/340
>
> There is no time frame for a fix.
>
> Stefan
I see, thanks.

btw:
http://trac.cython.org/cython_trac/ticket/61
This works fine now as you can see in the example above, probably you
could close that bug.

Stefan Behnel

Jan 10, 2011, 9:51:12 AM
to cython...@googlegroups.com
Maxim, 10.01.2011 15:38:
> I see, thanks.
>
> btw:
> http://trac.cython.org/cython_trac/ticket/61
> This works fine now as you can see in the example above, probably you
> could close that bug.

Right, closed now.

There's also this ticket which seems related:

http://trac.cython.org/cython_trac/ticket/180

Stefan

Maxim

Jan 10, 2011, 2:09:51 PM
to cython-users
I've created a wrapper class to hold a pointer to the numpy data:
http://pastebin.com/3Lx4M48G
Timings:
In [4]: timeit ttt.test1(arr)
1 loops, best of 3: 623 ms per loop

In [5]: timeit ttt.test2(arr)
100 loops, best of 3: 2.29 ms per loop

In [6]: timeit ttt.test3(arr)
1 loops, best of 3: 201 ms per loop

So the wrapper is 3x faster than the naive code, but still roughly 90x
slower than manual inlining. Strangely, Cython doesn't allow
__getitem__ to be even a cpdef method, so that it could be called
faster. Am I missing any optimization opportunities?
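
One possible way around the slow call, sketched under assumptions rather than taken from the pastebin above: pass a raw pointer and the row length to the helper instead of a buffer-typed ndarray, so the call stays entirely at the C level. The names inc_ptr and test_ptr are hypothetical, and the flat index arithmetic assumes a C-contiguous int32 array.

```cython
# cython: boundscheck=False
# cython: wraparound=False

import numpy as np
cimport numpy as np

# Hypothetical helper: receives a raw pointer and the row length rather
# than a buffer-typed ndarray, so no Python-level call is involved.
# On Cython 3 a `noexcept` clause may be needed to avoid an error check
# after every call.
cdef inline void inc_ptr(np.int32_t* data, Py_ssize_t ncols,
                         Py_ssize_t i, Py_ssize_t j):
    # Row-major (C-order) flat index; valid only for contiguous data.
    data[i * ncols + j] += 1

def test_ptr(np.ndarray[np.int32_t, ndim=2, mode="c"] arr):
    # mode="c" makes Cython reject non-contiguous input, which the
    # index arithmetic in inc_ptr relies on.
    cdef Py_ssize_t i, j
    cdef Py_ssize_t nrows = arr.shape[0]
    cdef Py_ssize_t ncols = arr.shape[1]
    cdef np.int32_t* data = <np.int32_t*> arr.data
    for i in range(nrows):
        for j in range(ncols):
            inc_ptr(data, ncols, i, j)
```

The trade-off is that the helper loses the bounds and stride handling of the buffer syntax, so the caller has to guarantee the layout. Later Cython releases also offer typed memoryviews (for example np.int32_t[:, :]), which can be passed to cdef functions and may be a cleaner option where available.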

Robert Bradshaw

Jan 10, 2011, 2:38:28 PM
to cython...@googlegroups.com

__getitem__ is neither a def nor a cdef method; it's a special method,
so it already has faster (and fixed) calling semantics. One optimization
that I'd like to do is use the PySequence_* API if the second argument
is an integer type (perhaps this would need to be declared in the
.pxd).
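
To illustrate the distinction, here is a minimal sketch (not the wrapper from the pastebin; the class name Int32Grid and the method get are hypothetical). The special method __getitem__ stays available for Python callers, while typed Cython code calls a separate cdef method with C integer arguments. It assumes a C-contiguous int32 array.

```cython
import numpy as np
cimport numpy as np

cdef class Int32Grid:
    # Hypothetical wrapper; names and layout are illustrative only.
    cdef np.int32_t* data
    cdef Py_ssize_t ncols
    cdef object owner   # keeps the array alive while we hold its pointer

    def __cinit__(self, np.ndarray arr):
        # The typed local enforces dtype, ndim and C-contiguity.
        cdef np.ndarray[np.int32_t, ndim=2, mode="c"] buf = arr
        self.owner = buf
        self.data = <np.int32_t*> buf.data
        self.ncols = buf.shape[1]

    # Special method: convenient from Python, but the index arrives as
    # a Python object and has to be unpacked on every call.
    def __getitem__(self, index):
        i, j = index
        return self.get(i, j)

    # C-level accessor for callers that hold a typed Int32Grid reference.
    cdef np.int32_t get(self, Py_ssize_t i, Py_ssize_t j):
        return self.data[i * self.ncols + j]

def total(Int32Grid grid, Py_ssize_t nrows):
    # Typed access goes through get() and never touches __getitem__.
    cdef Py_ssize_t i, j
    cdef long s = 0
    for i in range(nrows):
        for j in range(grid.ncols):
            s += grid.get(i, j)
    return s
```

The C-level call only happens when the variable is declared as Int32Grid; through an untyped reference, grid[i, j] still goes via __getitem__.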

- Robert

Stefan Behnel

Jan 11, 2011, 3:28:42 AM
to cython...@googlegroups.com
Robert Bradshaw, 10.01.2011 20:38:

> One optimization
> that I'd like to do is use the PySequence_* API if the second argument
> is an integer type

... with the obvious drawback of slowing down item access, especially in
Python 3.

http://trac.cython.org/cython_trac/ticket/636

Stefan

Robert Bradshaw

Jan 12, 2011, 1:26:37 AM
to cython...@googlegroups.com

True, it would only work for classes that don't support slicing
(though it's not that hard to imagine checking whether the index is
a slice and converting it to an integer otherwise; I'm not sure of the
best way to handle that).

- Robert
