Accepting a sequence of bytes

Mark Lodato

Feb 2, 2012, 10:54:19 PM
to cython-users
I am using Cython to wrap a C function that accepts an arbitrary-
length sequence of bytes. The C function takes a uint8_t pointer and
a size_t length, but I would like the Python interface to accept
anything that has an obvious byte stream. At the very least, I would
like it to work on 'str'/'bytes' (2.x/3.x), 'bytearray' (2.6+), and
NumPy arrays of the correct dtype (if NumPy is available). I would
like my Cython source and resulting C file to work even on older
Pythons (2.4) and even if NumPy is not available. It seems like the
buffer interface is what I want, but this requires 3.x (or at least
2.6). Is there any way to do what I want? If not, can you think of
something that will come close?

Thanks!
Mark

mark florisson

Feb 6, 2012, 3:57:25 PM
to cython...@googlegroups.com

You could look into memoryviews, new in the upcoming Cython release:
https://sage.math.washington.edu:8091/hudson/job/cython-docs/doclinks/1/src/userguide/memoryviews.html

Mark Lodato

Feb 6, 2012, 7:32:11 PM
to cython-users
On Feb 6, 3:57 pm, mark florisson <markflorisso...@gmail.com> wrote:
> On 3 February 2012 03:54, Mark Lodato <loda...@gmail.com> wrote:
> > I am using Cython to wrap a C function that accepts an arbitrary-
> > length sequence of bytes.  The C function takes a uint8_t pointer and
> > a size_t length, but I would like the Python interface to accept
> > anything that has an obvious byte stream.
>
> You could look into memoryviews, new in the upcoming Cython release

Ah, this is on the right track! It works perfectly for Python 2.6+
with writable arrays (say, bytearray). Sadly, it does not work in
Python 2.4 or 2.5, nor does it work for non-writable arrays (say,
str). Are there any plans to add these features?

Here's my test case:

--- 8< ---
cdef extern from "foo.h":
    int sum(unsigned char *p, int len)

def add_bytes(o):
    cdef unsigned char[::1] p = o
    return sum(&p[0], len(p))
--- >8 ----

Also, in order to avoid having to write "&p[0]", it would be nice if
type "unsigned char[::1]" could be assigned to "unsigned char *".

Anyway, looks like great work. I'm looking forward to the next
version of Cython!


In the meantime, I might try to write a custom C function that creates
a read-only memoryview if the version is >= 2.6 (using the Cython-
generated code as a starting point!) and the old buffer protocol (and
maybe special-casing NumPy) otherwise.
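
The "accept anything with an obvious byte stream" idea can be sketched in pure Python with the built-in memoryview, which accepts read-only exporters such as bytes. This is only an illustration under stated assumptions: it needs Python 3.3 or later (so it does not address the 2.4/2.5 requirement), and as_byte_view is a hypothetical helper name, not part of Cython or of the code discussed in this thread.

```python
def as_byte_view(obj):
    """Hypothetical helper: return a flat view of obj's bytes.

    Accepts any object exporting the (new) buffer protocol, read-only
    or writable, as long as its items are one byte wide.
    """
    view = memoryview(obj)  # raises TypeError if obj exports no buffer
    if not view.c_contiguous:
        raise ValueError("buffer must be C-contiguous")
    if view.itemsize != 1:
        raise ValueError(
            "expected 1-byte items, got itemsize=%d" % view.itemsize
        )
    # Reinterpret as a one-dimensional sequence of unsigned bytes.
    return view.cast("B")


def add_bytes(obj):
    """Pure-Python stand-in for the wrapped C function: sum the bytes."""
    return sum(as_byte_view(obj))
```

With this sketch, add_bytes(b"\x01\x02\x03") and add_bytes(bytearray(b"\x01\x02\x03")) take the same path, while a buffer of wider items (for example 4-byte integers) is rejected with ValueError.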

mark florisson

Feb 7, 2012, 3:47:51 PM
to cython...@googlegroups.com
On 7 February 2012 00:32, Mark Lodato <lod...@gmail.com> wrote:
> On Feb 6, 3:57 pm, mark florisson <markflorisso...@gmail.com> wrote:
>> On 3 February 2012 03:54, Mark Lodato <loda...@gmail.com> wrote:
>> > I am using Cython to wrap a C function that accepts an arbitrary-
>> > length sequence of bytes.  The C function takes a uint8_t pointer and
>> > a size_t length, but I would like the Python interface to accept
>> > anything that has an obvious byte stream.
>>
>> You could look into memoryviews, new in the upcoming Cython release
>
> Ah, this is on the right track!  It works perfectly for Python 2.6+
> with writable arrays (say, bytearray).  Sadly, it does not work in
> Python 2.4 or 2.5, nor does it work for non-writable arrays (say,
> str).  Are there any plans to add these features?

Hm, what made you jump to that conclusion? Memoryviews work just fine
in Python 2.4 and 2.5. They do, however, always request writable buffers,
as they can't know about all the contexts they will be used in. Maybe
in the future, if we ever get support for const, we could allow a
declaration like 'cdef const unsigned char[:] myarray'.
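
The writable/read-only distinction can be observed from plain Python using the built-in memoryview (not Cython's typed memoryviews). The sketch below assumes Python 3.3 or later; describe_buffer and try_write are hypothetical helper names introduced only for this illustration. It shows that bytes exports a read-only buffer while bytearray exports a writable one, which is why a consumer that insists on a writable buffer rejects str/bytes.

```python
def describe_buffer(obj):
    """Return (readonly, itemsize, nbytes) for a buffer-exporting object."""
    view = memoryview(obj)
    try:
        return view.readonly, view.itemsize, view.nbytes
    finally:
        view.release()


def try_write(obj):
    """Return True if the first byte of obj's buffer can be overwritten."""
    view = memoryview(obj)
    try:
        view[0] = 0  # raises TypeError on a read-only buffer
        return True
    except TypeError:
        return False
    finally:
        view.release()
```

Here describe_buffer(b"abc") reports a read-only buffer and try_write(b"abc") returns False, whereas the same calls on bytearray(b"abc") report a writable buffer and succeed.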

> Here's my test case:
>
> --- 8< ---
> cdef extern from "foo.h":
>    int sum(unsigned char *p, int len)
>
> def add_bytes(o):
>    cdef unsigned char[::1] p = o
>    return sum(&p[0], len(p))
> --- >8 ----
>
> Also, in order to avoid having to write "&p[0]", it would be nice if
> type "unsigned char[::1]" could be assigned to "unsigned char *".

Hm, no I don't think that would be nice, as they are really different
types. They should not coerce implicitly.

> Anyway, looks like great work.  I'm looking forward to the next
> version of Cython!
>
>
> In the meantime, I might try to write a custom C function that creates
> a read-only memoryview if the version is >= 2.6 (using the Cython-
> generated code as a starting point!) and the old buffer protocol (and
> maybe special-casing NumPy) otherwise.

I really wouldn't bother with the old buffer protocol. You can flag
buffers as read-only just as well with the new buffer interface; it's
just that memoryviews will only accept writable buffers. The new buffer
interface is also much easier to implement: define
__getbuffer__/__releasebuffer__ on your extension type.
