how to get a numpy bool array?

Chris Barker - NOAA Federal

Jun 5, 2013, 12:03:15 PM
to cython-users
Folks,

I'm banging my head against this one -- I need to create a numpy bool
array in Cython (typed as such...)

I've tried variations of:

import numpy as np
cimport numpy as cnp

cdef cnp.ndarray[cnp.bool8, ndim=1, mode="c"] result = np.zeros((points.shape[0],), dtype=np.bool8)

cdef cnp.ndarray[cnp.bool_, ndim=1, mode="c"] result = np.zeros((points.shape[0],), dtype=np.bool_)

cdef cnp.ndarray[char, ndim=1, mode="c"] result = np.zeros((points.shape[0],), dtype=np.bool_)

cdef cnp.ndarray[NPY_BOOL, ndim=1, mode="c"] result = np.zeros((points.shape[0],), dtype=np.bool)

cdef cnp.ndarray[npy_bool, ndim=1, mode="c"] result = np.zeros((points.shape[0],), dtype=np.bool)

None of these work. Some fail with: Invalid type.

Others fail with a run-time error:

> cdef cnp.ndarray[char, ndim=1, mode="c" ] result = np.zeros((points.shape[0],), dtype=np.bool_)
E ValueError: Does not understand character buffer dtype format string ('?')

How should I be spelling this?

-thanks,
-Chris



--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris....@noaa.gov

Robert Bradshaw

Jun 5, 2013, 1:04:50 PM
to cython...@googlegroups.com
IIRC, Numpy boolean arrays are stored in a packed format, one bit per
element (requiring shifting to access). This doesn't work well with
the buffer interface, where one is expected to be able to compute a
pointer to any individual element.

Nathaniel Smith

Jun 5, 2013, 1:31:12 PM
to cython...@googlegroups.com

On 5 Jun 2013 18:05, "Robert Bradshaw" <robe...@gmail.com> wrote:
>
> IIRC, Numpy boolean arrays are stored in a packed format, one bit per
> element (requiring shifting to access). This doesn't work well with
> the buffer interface, where one is expected to be able to compute a
> pointer to any individual element.

No, numpy dtypes have to have sizes expressible in bytes, because strides are counted in bytes. Numpy bools are 1 byte apiece. Worst case you can cast them to uint8 and use 0 = False, 1 = True.

-n
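
The one-byte layout can be checked from plain Python. The following is a minimal sketch, assuming only that NumPy is installed; the array name `flags` is made up for illustration. It also shows that the buffer protocol reports the format code '?' for a bool array, which is the code named in the ValueError quoted earlier in the thread.

```python
import numpy as np

flags = np.zeros(5, dtype=np.bool_)

# Each element occupies exactly one byte, and strides are counted in bytes.
print(flags.dtype.itemsize)   # 1
print(flags.strides)          # (1,)

# The buffer protocol reports the struct-style format code '?' for bool.
print(memoryview(flags).format)   # ?

# Reinterpreting the same memory as uint8 copies nothing: a write through
# the uint8 view shows up in the bool array.
as_bytes = flags.view(np.uint8)
as_bytes[2] = 1
print(flags.tolist())                      # [False, False, True, False, False]
print(np.shares_memory(flags, as_bytes))   # True
```

Because the two views share memory, the cast suggested above costs no copy; it only changes how the bytes are interpreted.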

Chris Barker - NOAA Federal

Jun 5, 2013, 1:37:31 PM
to cython...@googlegroups.com
On Wed, Jun 5, 2013 at 10:31 AM, Nathaniel Smith <n...@pobox.com> wrote:

> No, numpy dtypes have to have sizes expressible in bytes, because strides
> are counted in bytes. Numpy bools are 1 byte apiece. Worst case you can cast
> them to uint8 and use 0 = False, 1 = True.

which is what I'm doing now:

cdef cnp.ndarray[char, ndim=1, mode="c"] result = np.zeros((a_points.shape[0],), dtype=np.uint8)

then:

return result.view(dtype=np.bool) # make it a np.bool array

which feels pretty ugly and fragile...

in numpy.pxd, I see:
$ grep "bool" /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Cython/Includes/numpy.pxd
ctypedef signed char npy_bool
npy_bool PyArray_CanCastTo (dtype, dtype)
object PyArray_Byteswap (ndarray, npy_bool)
npy_bool PyArray_CheckStrides (int, int, npy_intp, npy_intp, npy_intp *, npy_intp *)
npy_bool PyArray_CanCastScalar (type, type)
int PyArray_BoolConverter (object, npy_bool *)

which sure makes it look supported, but I can't seem to find the right
incantation...

Thanks,
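
For readers who want to see the uint8-then-view workaround end to end, here is a rough pure-Python rendering of the pattern. It is only a sketch: the function `points_above_zero` and its test on the y coordinate are hypothetical stand-ins for the real routine, which the thread does not show, and the loop would be the typed Cython loop in practice.

```python
import numpy as np

def points_above_zero(points):
    """Hypothetical stand-in for the real routine: flag rows whose y > 0."""
    points = np.asarray(points, dtype=np.float64)
    # Allocate the working buffer as uint8 rather than bool.
    result = np.zeros((points.shape[0],), dtype=np.uint8)
    for i in range(points.shape[0]):
        # Store only 0 or 1 so the bytes are valid bool values.
        result[i] = 1 if points[i, 1] > 0.0 else 0
    # Reinterpret the same bytes as bool; no copy is made.
    return result.view(np.bool_)

mask = points_above_zero([[0.0, 1.0], [2.0, -3.0], [4.0, 0.5]])
print(mask.dtype)     # bool
print(mask.tolist())  # [True, False, True]
```

The view is only safe as long as the buffer holds nothing but 0 and 1, which is presumably part of why the approach feels fragile.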

Robert Bradshaw

Jun 6, 2013, 1:09:55 AM
to cython...@googlegroups.com
On Wed, Jun 5, 2013 at 10:37 AM, Chris Barker - NOAA Federal
<chris....@noaa.gov> wrote:
> On Wed, Jun 5, 2013 at 10:31 AM, Nathaniel Smith <n...@pobox.com> wrote:
>
>> No, numpy dtypes have to have sizes expressible in bytes, because strides
>> are counted in bytes. Numpy bools are 1 byte apiece.

Ah, good to know. I know that vector<bool> stores things in a compact
way, with odd behaviors.

>> Worst case you can cast
>> them to uint8 and use 0 = False, 1 = True.
>
> which is what I'm doing now:
>
> cdef cnp.ndarray[char, ndim=1, mode="c"] result = np.zeros((a_points.shape[0],), dtype=np.uint8)
>
> then:
>
> return result.view(dtype=np.bool) # make it a np.bool array
>
> which feels pretty ugly and fragile...
>
> in numpy.pxd, I see:
> $ grep "bool" /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Cython/Includes/numpy.pxd
> ctypedef signed char npy_bool
> npy_bool PyArray_CanCastTo (dtype, dtype)
> object PyArray_Byteswap (ndarray, npy_bool)
> npy_bool PyArray_CheckStrides (int, int, npy_intp, npy_intp, npy_intp *, npy_intp *)
> npy_bool PyArray_CanCastScalar (type, type)
> int PyArray_BoolConverter (object, npy_bool *)
>
> which sure makes it look supported, but I can't seem to find the right
> incantation...

These are the API functions, which is different than buffer support.
One additional step compared to any other type would be inserting the
right hooks on assignment to coerce to 0/1. Probably not hard, simply
not (as far as I know) implemented.

- Robert
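
The coercion to 0/1 mentioned above can be illustrated on the NumPy side. This is a small sketch, not what Cython does internally: it just shows two standard ways of turning arbitrary uint8 values into well-formed bool bytes before any reinterpretation. The array name `raw` is made up.

```python
import numpy as np

raw = np.array([0, 1, 2, 255], dtype=np.uint8)

# Comparing against zero yields a genuine bool array whose bytes are 0 or 1.
normalised = raw != 0
print(normalised.dtype)                     # bool
print(normalised.view(np.uint8).tolist())   # [0, 1, 1, 1]

# astype performs the same value conversion (and makes a copy).
converted = raw.astype(np.bool_)
print(converted.view(np.uint8).tolist())    # [0, 1, 1, 1]
```

Both forms convert by value, unlike `view`, which reinterprets the stored bytes unchanged.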

Chris Barker - NOAA Federal

Jun 6, 2013, 12:52:06 PM
to cython...@googlegroups.com
On Wed, Jun 5, 2013 at 10:09 PM, Robert Bradshaw <robe...@gmail.com> wrote:

> These are the API functions, which is different than buffer support.
> One additional step compared to any other type would be inserting the
> right hooks on assignment to coerce to 0/1. Probably not hard, simply
> not (as far as I know) implemented.

Fair enough. And you're right, there isn't much difference between an
array of uint8 and a bool array -- on the numpy side, it's mostly
about display (you see [True, False, False]) and a few other niceties.
In my case, a key one was:

In [6]: bool_arr = np.ones((3,), dtype=np.bool)

In [7]: bool_arr
Out[7]: array([ True, True, True], dtype=bool)

In [8]: ~bool_arr
Out[8]: array([False, False, False], dtype=bool)

In [9]: int_arr = np.ones((3,), dtype=np.uint8)

In [10]: int_arr
Out[10]: array([1, 1, 1], dtype=uint8)

In [11]: ~int_arr
Out[11]: array([254, 254, 254], dtype=uint8)

technically, "~" is bitwise, so it all makes sense, but not the result
I wanted! Of course:

In [12]: np.logical_not(int_arr)
Out[12]: array([False, False, False], dtype=bool)

is probably the way to go anyway..

NOTE: I have seen some discussion about "bool" here -- and there is a
bool in the C99 standard. Could that simply be used? (It may not be
guaranteed to be a one-byte value, though.)

Thanks,
-- Chris
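
To tie the "~" observation back to the view workaround: once a 0/1 uint8 buffer is viewed as bool, "~" behaves logically again. A short sketch, assuming the buffer only ever holds 0 and 1:

```python
import numpy as np

int_arr = np.ones((3,), dtype=np.uint8)

# Bitwise NOT on uint8 flips all eight bits: 0b00000001 -> 0b11111110.
print((~int_arr).tolist())                   # [254, 254, 254]

# On the bool view of the same memory, ~ acts as logical NOT.
print((~int_arr.view(np.bool_)).tolist())    # [False, False, False]

# np.logical_not gives the logical result regardless of the input dtype.
print(np.logical_not(int_arr).tolist())      # [False, False, False]
```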