Creating a numpy array with dtype intp

310 views
Skip to first unread message

Keith Goodman

unread,
Jun 9, 2011, 1:05:29 PM6/9/11
to cython...@googlegroups.com
I'd like to create an empty numpy array with dtype np.intp.

The following doesn't work since since I can't cimport NPY_INTP:

import numpy as np
cimport numpy as np
from numpy cimport NPY_INTP
from numpy cimport PyArray_EMPTY, PyArray_DIMS, import_array
import_array()

def intparray(np.ndarray[np.float64_t, ndim=1] a):
dim = PyArray_DIMS(a)
cdef int n0 = dim[0]
cdef np.npy_intp *dims = [n0]
cdef np.ndarray[np.float64_t, ndim=1] aintp = PyArray_EMPTY(1,
dims, NPY_INTP, 0)
return aintp

When I try to compile I get: 'NPY_INTP.pxd' not found

Anyone know a workaround?

Robert Kern

unread,
Jun 9, 2011, 7:11:59 PM6/9/11
to cython...@googlegroups.com

You will have to modify Cython/Include/numpy.pxd to include NPY_INTP. It should
just be a matter of adding it to the NPY_TYPES enum, if you don't need any
support for it in the utility functions in numpy.pxd.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

Keith Goodman

unread,
Jun 9, 2011, 9:27:40 PM6/9/11
to cython...@googlegroups.com
On Thu, Jun 9, 2011 at 4:11 PM, Robert Kern <rober...@gmail.com> wrote:
> On 6/9/11 12:05 PM, Keith Goodman wrote:
>>
>> I'd like to create an empty numpy array with dtype np.intp.
>>
>> The following doesn't work since since I can't cimport NPY_INTP:
>>
>> import numpy as np
>> cimport numpy as np
>> from numpy cimport NPY_INTP
>> from numpy cimport PyArray_EMPTY, PyArray_DIMS, import_array
>> import_array()
>>
>> def intparray(np.ndarray[np.float64_t, ndim=1] a):
>>     dim = PyArray_DIMS(a)
>>     cdef int n0 = dim[0]
>>     cdef np.npy_intp *dims = [n0]
>>     cdef np.ndarray[np.float64_t, ndim=1] aintp = PyArray_EMPTY(1,
>> dims, NPY_INTP, 0)
>>     return aintp
>>
>> When I try to compile I get: 'NPY_INTP.pxd' not found
>>
>> Anyone know a workaround?
>
> You will have to modify Cython/Include/numpy.pxd to include NPY_INTP. It
> should just be a matter of adding it to the NPY_TYPES enum, if you don't
> need any support for it in the utility functions in numpy.pxd.

That was too easy. You must have cheated.

Since I don't want to ask users to modify numpy.pxd, I'll add this to
my package:

cdef extern from "numpy/arrayobject.h":
cdef enum NPY_TYPES:
NPY_INTP

Is there some way that I can refer to NPY_INTP as NPY_intp? So the equivalent of

from numpy cimport NPY_INTP as NPY_intp

Robert Bradshaw

unread,
Jun 10, 2011, 12:45:25 AM6/10/11
to cython...@googlegroups.com

We're accepting patches :). It'd make it in before the next release.

> Is there some way that I can refer to NPY_INTP as NPY_intp? So the equivalent of
>
> from numpy cimport NPY_INTP as NPY_intp

   cdef extern from "numpy/arrayobject.h":
       cdef enum NPY_TYPES:
           "NPY_intp" NPY_INTP

- Robert

Keith Goodman

unread,
Jun 10, 2011, 10:59:27 AM6/10/11
to cython...@googlegroups.com

To prove that I am the wrong person I have a question: are unit tests needed?

>> Is there some way that I can refer to NPY_INTP as NPY_intp? So the equivalent of
>>
>> from numpy cimport NPY_INTP as NPY_intp
>
>     cdef extern from "numpy/arrayobject.h":
>         cdef enum NPY_TYPES:
>             "NPY_intp" NPY_INTP

I get an error:

cdef extern from "numpy/arrayobject.h":
cdef enum NPY_TYPES:
"NPY_intp" NPY_INTP

^
------------------------------------------------------------
Expected an identifier

Dag Sverre Seljebotn

unread,
Jun 10, 2011, 11:55:21 AM6/10/11
to cython...@googlegroups.com

It's the other way around,

cdef enum NPY_TYPES:
NPY_intp "NPY_INTP"

(although I must say I think it's a bad idea to do this...)

Dag Sverre

Keith Goodman

unread,
Jun 10, 2011, 12:01:05 PM6/10/11
to cython...@googlegroups.com

That works! Thanks Robert, Robert, and Dag. That was my release blocker.

> (although I must say I think it's a bad idea to do this...)

Everything is a slave to my templating system.

Robert Bradshaw

unread,
Jun 10, 2011, 8:25:14 PM6/10/11
to cython...@googlegroups.com

Yeah, I'd rather it not be renamed either.

> Everything is a slave to my templating system.

On this note, we talked a bit about templating at the Cython days. I'd
be interested in hearing your usecase, setup, and ideal hypothetical
setup.

- Robert

Keith Goodman

unread,
Jun 11, 2011, 3:11:32 PM6/11/11
to cython...@googlegroups.com

[I hope this doesn't exceed you inbox quota.]

I use Cython in two different ways.

In the first way I have a slow section of NumPy code that I rewrite in
Cython. The dtype, ndim, and the axis I am working along are all known
so no templating is needed---I just hard code them.

In the second way I am making a general function where I do not know
the dtype, ndim, and axis. That's when I need templating. An example
is the Bottleneck package.

In Bottleneck a high-level function like bn.nanmean looks at the input
(dtype, ndim, axis) and then selects the appropriate low-level
function (using a dict) and uses the function to do the calculation.

Here's an example of a low level-function:

@cython.boundscheck(False)
@cython.wraparound(False)
def nanmean_2d_float64_axis0(np.ndarray[np.float64_t, ndim=2] a):
"Mean of 2d array with dtype=float64 along axis=0 ignoring NaNs."
cdef int count = 0
cdef np.float64_t asum = 0, ai
cdef Py_ssize_t i0, i1
cdef np.npy_intp *dim


dim = PyArray_DIMS(a)
cdef int n0 = dim[0]

cdef int n1 = dim[1]
cdef np.npy_intp *dims = [n1]
cdef np.ndarray[np.float64_t, ndim=1] y = PyArray_EMPTY(1, dims,
NPY_float64, 0)
for i1 in range(n1):
asum = 0
count = 0
for i0 in range(n0):
ai = a[i0, i1]
if ai == ai:
asum += ai
count += 1
if count > 0:
y[i1] = asum / count
else:
y[i1] = NAN
return y

There are similar low-level functions for the various combinations of
dtype, ndim, axis:

nanmean_2d_float64_axis1()
nanmean_3d_int32_axis2()
nanmean_1d_float32_axis0()
etc.

I currently only accelerate 1d, 2d, 3d, and int32, int64, float32,
float64. Everything else falls back to slower non-cython functions.
Since I have templating the only thing holding me back from supporting
more dtypes and ndim is the size of the C files and resulting binaries
(and compile time).

The low-level functions are created with a homemade templating system.
Here's how I template nanmean for float input (different for ints
since ints cannot be NaN) when axis is not None:

floats = {}
floats['dtypes'] = FLOAT_DTYPES
floats['axisNone'] = False
floats['force_output_dtype'] = False
floats['reuse_non_nan_func'] = False

floats['top'] = """
@cython.boundscheck(False)
@cython.wraparound(False)
def NAME_NDIMd_DTYPE_axisAXIS(np.ndarray[np.DTYPE_t, ndim=NDIM] a):
"Mean of NDIMd array with dtype=DTYPE along axis=AXIS ignoring NaNs."
cdef int count = 0
cdef np.DTYPE_t asum = 0, ai
"""

loop = {}
loop[2] = """\
for iINDEX0 in range(nINDEX0):
asum = 0
count = 0
for iINDEX1 in range(nINDEX1):
ai = a[INDEXALL]
if ai == ai:
asum += ai
count += 1
if count > 0:
y[INDEXPOP] = asum / count
else:
y[INDEXPOP] = NAN
return y
"""
loop[3] = """\
for iINDEX0 in range(nINDEX0):
for iINDEX1 in range(nINDEX1):
asum = 0
count = 0
for iINDEX2 in range(nINDEX2):
ai = a[INDEXALL]
if ai == ai:
asum += ai
count += 1
if count > 0:
y[INDEXPOP] = asum / count
else:
y[INDEXPOP] = NAN
return y
"""
floats['loop'] = loop

To get the cdef's of the for loop indices and range and initialize the
output, I do (example from docstring):

Define parameters:

>>> ndim = 3
>>> dtype = 'float64'
>>> axis = 1
>>> is_reducing_function = True

Import loop_cdef:

>>> from bottleneck.src.template.template import loop_cdef

Make loop initialization code:

>>> print loop_cdef(ndim, dtype, axis, is_reducing_function)
cdef Py_ssize_t i0, i1, i2
cdef np.npy_intp *dim


dim = PyArray_DIMS(a)
cdef int n0 = dim[0]

cdef int n1 = dim[1]
cdef int n2 = dim[2]
cdef np.npy_intp *dims = [n0, n2]
cdef np.ndarray[np.float64_t, ndim=2] y = PyArray_EMPTY(2, dims,
NPY_float64, 0)

Repeat, but this time make the output non-reducing:

>>> is_reducing_function = False
>>> print loop_cdef(ndim, dtype, axis, is_reducing_function)
cdef Py_ssize_t i0, i1, i2
cdef np.npy_intp *dim


dim = PyArray_DIMS(a)
cdef int n0 = dim[0]

cdef int n1 = dim[1]
cdef int n2 = dim[2]
cdef np.npy_intp *dims = [n0, n1, n2]
cdef np.ndarray[np.float64_t, ndim=3] y = PyArray_EMPTY(3, dims,
NPY_float64, 0)

To generate the for loops for different axis input, I do (example from
docstring):

Make a 3d loop template:

>>> loop = '''
.... for iINDEX0 in range(nINDEX0):
.... for iINDEX1 in range(nINDEX1):
.... amin = MAXDTYPE
.... for iINDEX2 in range(nINDEX2):
.... ai = a[INDEXALL]
.... if ai <= amin:
.... amin = ai
.... y[INDEXPOP] = amin
.... '''

Import the looper function:

>>> from bottleneck.src.template.template import looper

Make a loop over axis=0:

>>> print looper(loop, ndim=3, axis=0)
for i1 in range(n1):
for i2 in range(n2):
amin = MAXDTYPE
for i0 in range(n0):
ai = a[i0, i1, i2]
if ai <= amin:
amin = ai
y[i1, i2] = amin

Make a loop over axis=1:

>>> print looper(loop, ndim=3, axis=1)
for i0 in range(n0):
for i2 in range(n2):
amin = MAXDTYPE
for i1 in range(n1):
ai = a[i0, i1, i2]
if ai <= amin:
amin = ai
y[i0, i2] = amin

Make a loop over axis=2:

>>> print looper(loop, ndim=3, axis=2)
for i0 in range(n0):
for i1 in range(n1):
amin = MAXDTYPE
for i2 in range(n2):
ai = a[i0, i1, i2]
if ai <= amin:
amin = ai
y[i0, i1] = amin

To make templating easy you need to have a rigid structure that all
code must fit into. To make templating flexible, well, cython already
has that since the user can already do whatever they want. So it is a
matter of finding useful support functions for those who need
templating. I don't yet have enough experience to know what that is so
I'd be interested in hearing the experiences of others.

Robert Bradshaw

unread,
Jul 3, 2011, 12:55:29 AM7/3/11
to cython...@googlegroups.com
Thanks for the long example.
Reply all
Reply to author
Forward
0 new messages