The following doesn't work since since I can't cimport NPY_INTP:
import numpy as np
cimport numpy as np
from numpy cimport NPY_INTP
from numpy cimport PyArray_EMPTY, PyArray_DIMS, import_array
import_array()
def intparray(np.ndarray[np.float64_t, ndim=1] a):
dim = PyArray_DIMS(a)
cdef int n0 = dim[0]
cdef np.npy_intp *dims = [n0]
cdef np.ndarray[np.float64_t, ndim=1] aintp = PyArray_EMPTY(1,
dims, NPY_INTP, 0)
return aintp
When I try to compile I get: 'NPY_INTP.pxd' not found
Anyone know a workaround?
You will have to modify Cython/Include/numpy.pxd to include NPY_INTP. It should
just be a matter of adding it to the NPY_TYPES enum, if you don't need any
support for it in the utility functions in numpy.pxd.
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
That was too easy. You must have cheated.
Since I don't want to ask users to modify numpy.pxd, I'll add this to
my package:
cdef extern from "numpy/arrayobject.h":
cdef enum NPY_TYPES:
NPY_INTP
Is there some way that I can refer to NPY_INTP as NPY_intp? So the equivalent of
from numpy cimport NPY_INTP as NPY_intp
We're accepting patches :). It'd make it in before the next release.
> Is there some way that I can refer to NPY_INTP as NPY_intp? So the equivalent of
>
> from numpy cimport NPY_INTP as NPY_intp
cdef extern from "numpy/arrayobject.h":
cdef enum NPY_TYPES:
"NPY_intp" NPY_INTP
- Robert
To prove that I am the wrong person I have a question: are unit tests needed?
>> Is there some way that I can refer to NPY_INTP as NPY_intp? So the equivalent of
>>
>> from numpy cimport NPY_INTP as NPY_intp
>
> cdef extern from "numpy/arrayobject.h":
> cdef enum NPY_TYPES:
> "NPY_intp" NPY_INTP
I get an error:
cdef extern from "numpy/arrayobject.h":
cdef enum NPY_TYPES:
"NPY_intp" NPY_INTP
^
------------------------------------------------------------
Expected an identifier
It's the other way around,
cdef enum NPY_TYPES:
NPY_intp "NPY_INTP"
(although I must say I think it's a bad idea to do this...)
Dag Sverre
That works! Thanks Robert, Robert, and Dag. That was my release blocker.
> (although I must say I think it's a bad idea to do this...)
Everything is a slave to my templating system.
Yeah, I'd rather it not be renamed either.
> Everything is a slave to my templating system.
On this note, we talked a bit about templating at the Cython days. I'd
be interested in hearing your usecase, setup, and ideal hypothetical
setup.
- Robert
[I hope this doesn't exceed you inbox quota.]
I use Cython in two different ways.
In the first way I have a slow section of NumPy code that I rewrite in
Cython. The dtype, ndim, and the axis I am working along are all known
so no templating is needed---I just hard code them.
In the second way I am making a general function where I do not know
the dtype, ndim, and axis. That's when I need templating. An example
is the Bottleneck package.
In Bottleneck a high-level function like bn.nanmean looks at the input
(dtype, ndim, axis) and then selects the appropriate low-level
function (using a dict) and uses the function to do the calculation.
Here's an example of a low level-function:
@cython.boundscheck(False)
@cython.wraparound(False)
def nanmean_2d_float64_axis0(np.ndarray[np.float64_t, ndim=2] a):
"Mean of 2d array with dtype=float64 along axis=0 ignoring NaNs."
cdef int count = 0
cdef np.float64_t asum = 0, ai
cdef Py_ssize_t i0, i1
cdef np.npy_intp *dim
dim = PyArray_DIMS(a)
cdef int n0 = dim[0]
cdef int n1 = dim[1]
cdef np.npy_intp *dims = [n1]
cdef np.ndarray[np.float64_t, ndim=1] y = PyArray_EMPTY(1, dims,
NPY_float64, 0)
for i1 in range(n1):
asum = 0
count = 0
for i0 in range(n0):
ai = a[i0, i1]
if ai == ai:
asum += ai
count += 1
if count > 0:
y[i1] = asum / count
else:
y[i1] = NAN
return y
There are similar low-level functions for the various combinations of
dtype, ndim, axis:
nanmean_2d_float64_axis1()
nanmean_3d_int32_axis2()
nanmean_1d_float32_axis0()
etc.
I currently only accelerate 1d, 2d, 3d, and int32, int64, float32,
float64. Everything else falls back to slower non-cython functions.
Since I have templating the only thing holding me back from supporting
more dtypes and ndim is the size of the C files and resulting binaries
(and compile time).
The low-level functions are created with a homemade templating system.
Here's how I template nanmean for float input (different for ints
since ints cannot be NaN) when axis is not None:
floats = {}
floats['dtypes'] = FLOAT_DTYPES
floats['axisNone'] = False
floats['force_output_dtype'] = False
floats['reuse_non_nan_func'] = False
floats['top'] = """
@cython.boundscheck(False)
@cython.wraparound(False)
def NAME_NDIMd_DTYPE_axisAXIS(np.ndarray[np.DTYPE_t, ndim=NDIM] a):
"Mean of NDIMd array with dtype=DTYPE along axis=AXIS ignoring NaNs."
cdef int count = 0
cdef np.DTYPE_t asum = 0, ai
"""
loop = {}
loop[2] = """\
for iINDEX0 in range(nINDEX0):
asum = 0
count = 0
for iINDEX1 in range(nINDEX1):
ai = a[INDEXALL]
if ai == ai:
asum += ai
count += 1
if count > 0:
y[INDEXPOP] = asum / count
else:
y[INDEXPOP] = NAN
return y
"""
loop[3] = """\
for iINDEX0 in range(nINDEX0):
for iINDEX1 in range(nINDEX1):
asum = 0
count = 0
for iINDEX2 in range(nINDEX2):
ai = a[INDEXALL]
if ai == ai:
asum += ai
count += 1
if count > 0:
y[INDEXPOP] = asum / count
else:
y[INDEXPOP] = NAN
return y
"""
floats['loop'] = loop
To get the cdef's of the for loop indices and range and initialize the
output, I do (example from docstring):
Define parameters:
>>> ndim = 3
>>> dtype = 'float64'
>>> axis = 1
>>> is_reducing_function = True
Import loop_cdef:
>>> from bottleneck.src.template.template import loop_cdef
Make loop initialization code:
>>> print loop_cdef(ndim, dtype, axis, is_reducing_function)
cdef Py_ssize_t i0, i1, i2
cdef np.npy_intp *dim
dim = PyArray_DIMS(a)
cdef int n0 = dim[0]
cdef int n1 = dim[1]
cdef int n2 = dim[2]
cdef np.npy_intp *dims = [n0, n2]
cdef np.ndarray[np.float64_t, ndim=2] y = PyArray_EMPTY(2, dims,
NPY_float64, 0)
Repeat, but this time make the output non-reducing:
>>> is_reducing_function = False
>>> print loop_cdef(ndim, dtype, axis, is_reducing_function)
cdef Py_ssize_t i0, i1, i2
cdef np.npy_intp *dim
dim = PyArray_DIMS(a)
cdef int n0 = dim[0]
cdef int n1 = dim[1]
cdef int n2 = dim[2]
cdef np.npy_intp *dims = [n0, n1, n2]
cdef np.ndarray[np.float64_t, ndim=3] y = PyArray_EMPTY(3, dims,
NPY_float64, 0)
To generate the for loops for different axis input, I do (example from
docstring):
Make a 3d loop template:
>>> loop = '''
.... for iINDEX0 in range(nINDEX0):
.... for iINDEX1 in range(nINDEX1):
.... amin = MAXDTYPE
.... for iINDEX2 in range(nINDEX2):
.... ai = a[INDEXALL]
.... if ai <= amin:
.... amin = ai
.... y[INDEXPOP] = amin
.... '''
Import the looper function:
>>> from bottleneck.src.template.template import looper
Make a loop over axis=0:
>>> print looper(loop, ndim=3, axis=0)
for i1 in range(n1):
for i2 in range(n2):
amin = MAXDTYPE
for i0 in range(n0):
ai = a[i0, i1, i2]
if ai <= amin:
amin = ai
y[i1, i2] = amin
Make a loop over axis=1:
>>> print looper(loop, ndim=3, axis=1)
for i0 in range(n0):
for i2 in range(n2):
amin = MAXDTYPE
for i1 in range(n1):
ai = a[i0, i1, i2]
if ai <= amin:
amin = ai
y[i0, i2] = amin
Make a loop over axis=2:
>>> print looper(loop, ndim=3, axis=2)
for i0 in range(n0):
for i1 in range(n1):
amin = MAXDTYPE
for i2 in range(n2):
ai = a[i0, i1, i2]
if ai <= amin:
amin = ai
y[i0, i1] = amin
To make templating easy you need to have a rigid structure that all
code must fit into. To make templating flexible, well, cython already
has that since the user can already do whatever they want. So it is a
matter of finding useful support functions for those who need
templating. I don't yet have enough experience to know what that is so
I'd be interested in hearing the experiences of others.