memoryview from ndarray with char field


Bill Noon

Nov 28, 2011, 11:42:22 AM
to cython...@googlegroups.com
I am having a problem converting a np.ndarray with a character field to a memoryview (running from trunk).

>>> tx.dtype
dtype([('d', '<f4'), ('f', 'S1')])

and in the .pyx file:
cdef packed struct datum_df :
    float d
    char  f

and in my method:
cpdef calc(np.ndarray tx) :
    cdef datum_df[:] d_tx=tx
...

Calling this will generate an exception:
ValueError: Does not understand character buffer dtype format string ('s')

If I change the field 'f' type to 'i1', it will work fine.

Any way to coerce the memoryview to use a character?

--Bill

mark florisson

Nov 28, 2011, 2:11:43 PM
to cython...@googlegroups.com

Buffers (note that you'd get the same behaviour using
np.ndarray[datum_df]) don't support every character format or dtype
yet, unfortunately. It seems 'c' and 's' are among the unsupported ones,
so you're stuck with 'b', 'B' or 'i1' for now, which I agree is frustrating.
Note that Cython's 'char' converts to an integer in any case, so you'd
have to cast it in Cython space anyway, using <bytes> c. You could
instead declare the field as char[1], which would convert to bytes, but
that currently breaks in the compiler. It's really something that should
be fixed, probably before the next release.
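
A minimal plain-Python sketch of the char-versus-integer distinction described above, using only the standard-library struct module rather than Cython. The packed layout "<fc" is an assumption chosen to mirror the datum_df struct (float32 followed by one byte); the 'c' format code yields a length-1 bytes object, while 'b' yields the same byte as a signed integer.

```python
import struct

# One packed record: little-endian float32 followed by a single byte.
# "<" disables padding, so the record is 5 bytes, like the packed struct.
record = struct.pack("<fc", 1.5, b"a")

as_char = struct.unpack("<fc", record)  # 'c' gives a length-1 bytes object
as_int8 = struct.unpack("<fb", record)  # 'b' gives the same byte as an int

print(len(record))  # 5
print(as_char)      # (1.5, b'a')
print(as_int8)      # (1.5, 97)

# Turning the integer back into a one-byte string in plain Python.
flag = bytes([as_int8[1]])
```

The last line is the plain-Python counterpart of the <bytes> cast: the stored byte is the same either way, only its interpretation changes.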

Bill Noon

Nov 28, 2011, 2:25:30 PM
to cython...@googlegroups.com
Is there any way I can cast from the numpy 'c' dtype to an <bytes> on the cython side?

Can I cast a pointer to the ndarray data as a memoryview and just have it ignore the numpy dtype?

--Bill

mark florisson

Nov 28, 2011, 2:47:51 PM
to cython...@googlegroups.com
On 28 November 2011 19:25, Bill Noon <wn...@cornell.edu> wrote:
> Is there any way I can cast from the numpy 'c' dtype to an <bytes> on the
> cython side?

Yes, you use char in your struct, as you're doing now, and you can use
'i1' or 'B' or whatever, and cast like you just showed. For example:

print <bytes> myslice[myidx].f

Is that what you're asking for?

> Can I cast a pointer to the ndarray data as a memoryview and just have it
> ignore the numpy dtype?
> --Bill
>

Yes, you can. You can use numpy.PyArray_DATA() to get the data
pointer, and use <datum_df[:shape0, :shape1, ..., :shapeN]> to convert
the pointer to a cython.array (from which you can obtain a memoryview
slice).

Bill Noon

Nov 28, 2011, 4:09:03 PM
to cython...@googlegroups.com
I am having a hard time getting the syntax correct.

This is all given the structure:
cdef packed struct datum_df :
    float d
    char  f

This will compile and run:
cpdef int calc_i() :
    cdef np.ndarray tx = np.ndarray((3,),dtype=[('d','<f4'),('f','i1')])
    cdef datum_df[:] m_tx = tx
    
But what I want is this:
cpdef int calc() :
    cdef np.ndarray tx = np.ndarray((3,),dtype=[('d','<f4'),('f','c')])
    cdef datum_df[:] m_tx = tx

That compiles, but gives me the following error when run:
ValueError: "Does not understand character buffer dtype format string ('s')"

This variant doesn't even compile:
cpdef int calc2() :
    cdef np.ndarray tx = np.ndarray((3,),dtype=[('d','<f4'),('f','c')])
    cdef datum_df[:] m_tx
    m_tx = <datum_df[:]>tx
    
and gives me the error:
tstReducers.pyx:62:26: Can only create cython.array from pointer or array

What I want is to cast the dtype [('d','<f4'),('f','c')] to my datum_df structure and then use .f as a byte.

What am I missing?

--Bill

mark florisson

Nov 28, 2011, 4:29:32 PM
to cython...@googlegroups.com

Basically, casting memoryview slices doesn't work. What would it do?
Would it create a copy and convert all elements to a new type? Instead,
the casting syntax is used to obtain views on C data that you already
have, i.e. a pointer or a C array.

So what you want is this (I agree it's somewhat convoluted):

cimport numpy as np

cdef datum_df *p = <datum_df *> np.PyArray_DATA(my_numpy_ndarray)
cdef Py_ssize_t shape0 = my_numpy_ndarray.shape[0]
cdef datum_df[:] myslice = <datum_df[:shape0]> p

The idea in the cast to a memoryview slice is that you provide shape
information that tells Cython "my pointer points to memory that can be
viewed like this, and this is how big it is".

Hope that helps,

Mark
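
As a rough plain-Python analogue of the pointer-plus-shape idea (not Cython, and not a true view, since it copies values out), the sketch below interprets a raw byte buffer as packed records once the caller states how many there are. The names DATUM and view_records are hypothetical, and the "<fb" layout is an assumed stand-in for datum_df.

```python
import struct

DATUM = struct.Struct("<fb")  # assumed stand-in for the packed datum_df struct

# Raw bytes for three records; this plays the role of the data pointer.
raw = b"".join(DATUM.pack(float(i), ord("a") + i) for i in range(3))

def view_records(buf, shape0):
    """Interpret the first shape0 records of buf as (d, f) pairs."""
    # The caller supplies the extent, just as the cast supplies :shape0.
    if shape0 * DATUM.size > len(buf):
        raise ValueError("shape0 exceeds the buffer")
    return [DATUM.unpack_from(buf, i * DATUM.size) for i in range(shape0)]

records = view_records(raw, 3)
print(records)  # [(0.0, 97), (1.0, 98), (2.0, 99)]
```

A raw pointer carries no length, which is why the shape has to come from the caller; the bounds check here is something the Python sketch can do only because a bytes object knows its own size.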

Bill Noon

Nov 28, 2011, 8:41:50 PM
to cython...@googlegroups.com
Mark -- Thanks for all the help.  Just to summarize, here is what is working for me:

in my reducer.pyx file:

cdef packed struct datum_df :
    float d
    char  f

cpdef int calc(np.ndarray tx) :
    cdef datum_df *p_tx = <datum_df *> np.PyArray_DATA(tx)
    cdef Py_ssize_t i, shape0 = tx.shape[0]
    cdef datum_df[:] d_tx = <datum_df[:shape0]> p_tx

    print "strides:",d_tx.strides[0], tx.strides[0]
    print "itemsize:",d_tx.itemsize
    d_tx.strides[0]= tx.strides[0]
    
    for i in range(shape0) :
        print i,d_tx[i].d,d_tx[i].f

and from python:
>>> import numpy as np
>>> from reducer import calc
>>> a = np.ndarray((3,),dtype=[
...   ('v0',[('d','<f4'),('f','c')]),
...   ('v1',[('d','<f4'),('f','c')])])
>>> 
>>> a['v0'][:] = (0.,'a')
>>> a['v1'][:] = (1.,'z')
>>> calc(a['v0'])
strides: 5 10
itemsize: 5
0 0.0 97
1 0.0 97
2 0.0 97
0
>>> calc(a['v1'])
strides: 5 10
itemsize: 5
0 1.0 122
1 1.0 122
2 1.0 122
0

Because my array is a bit more complicated, I had to adjust the .strides array on the memoryview.

Thanks --Bill
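
To illustrate why the stride reads 10 while the itemsize is 5 in the output above, here is a small plain-Python sketch of the same memory layout using the standard-library struct module. It assumes each row holds v0 followed by v1, both packed as float32 plus one byte; DATUM, PAIR and read_field are hypothetical names for this example only.

```python
import struct

DATUM = struct.Struct("<fb")   # 5-byte packed (d, f) element
PAIR = struct.Struct("<fbfb")  # one 10-byte row holding v0 then v1

# Three rows, each with v0 = (0.0, 'a') and v1 = (1.0, 'z').
raw = b"".join(PAIR.pack(0.0, ord("a"), 1.0, ord("z")) for _ in range(3))

def read_field(buf, offset, stride, count):
    """Read count elements starting at offset, stepping stride bytes."""
    return [DATUM.unpack_from(buf, offset + i * stride) for i in range(count)]

v0 = read_field(raw, offset=0, stride=PAIR.size, count=3)
v1 = read_field(raw, offset=DATUM.size, stride=PAIR.size, count=3)
wrong = read_field(raw, offset=0, stride=DATUM.size, count=3)

print(v0)     # [(0.0, 97), (0.0, 97), (0.0, 97)]
print(v1)     # [(1.0, 122), (1.0, 122), (1.0, 122)]
print(wrong)  # [(0.0, 97), (1.0, 122), (0.0, 97)]
```

Stepping by the element size (5) instead of the row size (10) lands on the interleaved v1 values, which is the mismatch the stride adjustment corrects.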

Dag Sverre Seljebotn

Nov 29, 2011, 3:59:14 AM
to cython...@googlegroups.com

Something slightly more Pythonic, perhaps, than what Mark suggested is

m_tx = arr.view(np.dtype([('d','<f4'),('f','i1')]))

That is, even if Cython memoryviews can't be cast, you can use the
view method of NumPy to do the cast.

Dag Sverre
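
A short sketch of the view-based approach on the Python side, assuming NumPy is installed. The array contents are made up for illustration; the dtype with 'S1' matches the one printed at the top of the thread, and the target dtype is the one from the line above.

```python
import numpy as np

# Source array with a one-character field, as in the original question.
arr = np.zeros(3, dtype=[("d", "<f4"), ("f", "S1")])
arr["d"] = [0.0, 1.0, 2.0]
arr["f"] = b"a"

# Same 5-byte record layout, but the second field is declared as int8.
as_int = arr.view(np.dtype([("d", "<f4"), ("f", "i1")]))

print(as_int["f"].tolist())           # [97, 97, 97]
print(np.shares_memory(arr, as_int))  # True
```

Because both dtypes have the same item size, the view reinterprets the existing bytes without copying, so the result can be handed to a function expecting the 'i1' layout.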

Bill Noon

Nov 29, 2011, 9:19:23 AM
to cython...@googlegroups.com
On Tuesday, November 29, 2011 3:59:14 AM UTC-5, dagss wrote:

> Something slightly more Pythonic, perhaps, than what Mark suggested is
>
> m_tx = arr.view(np.dtype([('d','<f4'),('f','i1')]))
>
> That is, even if Cython memoryviews can't be cast, you can use the
> view method of NumPy to do the cast.
>
> Dag Sverre

That works as well.  Thanks --Bill

mark florisson

Dec 16, 2011, 10:19:40 AM
to cython...@googlegroups.com
On 28 November 2011 16:42, Bill Noon <wn...@cornell.edu> wrote:

You should be able to use strings with the latest Cython now:
https://github.com/cython/cython/commit/e52e87714ab15f5993362d68d1449231819fb016

This means you will have to declare 'char f' as 'char f[1]'.
