Why can't I coerce a Python string to char*?


Simo Salminen

Dec 20, 2009, 4:06:54 PM
to cython-users
I have the following code:

cdef class Test:
    def get_stuff(self):
        cdef str id = "test"
        cdef char* id2 = id  # <- error

This produces the error:
'str' objects do not support coercion to C types (use 'bytes'?)

How should this work? From the document http://docs.cython.org/src/tutorial/strings.html
I thought this should work.

Damien Churchill

Dec 21, 2009, 3:19:15 AM
to cython...@googlegroups.com
2009/12/20 Simo Salminen <ssal...@gmail.com>:

bytes is just a byte string. I think Cython is following the Py3 naming now,
so str means what Py2 calls unicode, and bytes means what Py2 calls str.

Stefan Behnel

Dec 22, 2009, 3:51:06 AM
to cython...@googlegroups.com

Simo Salminen, 20.12.2009 22:06:

> I have the following code:
>
> cdef class Test:
>     def get_stuff(self):
>         cdef str id = "test"
>         cdef char* id2 = id  # <- error
>
> This produces the error:
> 'str' objects do not support coercion to C types (use 'bytes'?)

'str' is a special type that is bytes in Py2 and unicode in Py3. It is
therefore not possible to assign it to a char* without prior treatment.

If you want a byte string, use bytes. If you use unicode, encode it
appropriately.
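
The same distinction can be tried out in plain Python 3 with the standard ctypes module, where c_char_p stands in for char*. This is only an analogue of the Cython behaviour, not Cython itself, and the helper name to_c_string is made up for illustration:

```python
import ctypes


def to_c_string(value, encoding="utf-8"):
    """Return a ctypes.c_char_p for a bytes or text value.

    Hypothetical helper: text is encoded explicitly, bytes pass through.
    """
    if isinstance(value, str):
        value = value.encode(encoding)
    return ctypes.c_char_p(value)


# A byte string maps directly onto a C char pointer.
assert to_c_string(b"test").value == b"test"

# Text has to be encoded first; handing it over unencoded is rejected.
try:
    ctypes.c_char_p("test")
except TypeError:
    rejected = True
else:
    rejected = False
```

Here the unencoded text is refused with a TypeError, much as Cython refuses the 'str' coercion at compile time.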

Note that there is some discussion going on amongst the core developers
regarding better ways to deal with this. As of now, nothing has been decided.


> How should this work? From the document http://docs.cython.org/src/tutorial/strings.html
> I thought this should work.

I can't find any reference to "cdef str" in that document.

Stefan

Simo Salminen

Dec 22, 2009, 2:07:56 PM
to cython-users
Ok thanks.

> > How should this work? From the document http://docs.cython.org/src/tutorial/strings.html
> > I thought this should work.
>
> I can't find any reference to "cdef str" in that document.
>

I mistakenly assumed that the type str is the same thing as a Python string
(the document says this: cdef char* other_c_string = py_string).

Cass

Jul 20, 2010, 10:15:44 AM
to Simo Salminen, cython...@googlegroups.com
So was there no easy fix to take a Python 2.6 str object, and pass it
into an external C function as a char *?

How do you check what version of Cython you are using, btw?

Cass

Jul 20, 2010, 4:28:42 PM
to cython-users
To be more specific, I am trying to take a Python 2.6 Unicode str, and
trying to convert it into a char * so that I can pass it into an
existing C function.
I can modify the Python wrapper layer (i.e. I don't have to use a
Python str) but cannot modify the C code. The input just consists of
alphabetical words of varying lengths.
Any suggestions? :)

Thanks!

Cass

Stefan Behnel

Jul 20, 2010, 4:34:51 PM
to cython...@googlegroups.com
Cass, 20.07.2010 22:28:

> To be more specific, I am trying to take a Python 2.6 Unicode str, and
> trying to convert it into a char * so that I can pass it into an
> existing C function.
> I can modify the Python wrapper layer (i.e. I don't have to use a
> Python str) but cannot modify the C code. The input just consists of
> alphabetical words of varying lengths.
> Any suggestions? :)

Well, yes, read the extensive docs on string handling.

Admittedly, the most up-to-date tutorial for Cython 0.13 is not on the web
site yet, but here it is in text form:

http://hg.cython.org/cython-docs/raw-file/tip/src/tutorial/strings.rst

In short: switch to using bytes objects internally and convert unicode
string input to bytes when receiving it from Python space by encoding it in
a suitable encoding.
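
A rough sketch of that pattern in plain Python, assuming UTF-8 as the "suitable encoding"; the names to_bytes and word_length_in_bytes are hypothetical, and the second one only stands in for a wrapper around a C function taking char*:

```python
def to_bytes(value, encoding="utf-8"):
    """Encode text at the boundary; leave byte strings untouched."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


def word_length_in_bytes(word):
    """Stand-in for a wrapper around a C function that takes char*."""
    data = to_bytes(word)  # conversion happens once, on entry
    return len(data)       # a real wrapper would hand `data` to the C code


# Plain alphabetical words take one byte per character in UTF-8.
assert word_length_in_bytes(u"hello") == 5
assert word_length_in_bytes(b"hello") == 5
```

Everything behind the entry point then deals with bytes only, so the encoding decision is made in exactly one place.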

Stefan

Cass

Jul 20, 2010, 5:01:03 PM
to cython-users
Thanks!! It worked. Just used:

py_byte_string = py_unicode_string.encode('UTF-8')
cdef char* c_string = py_byte_string

However, now I'm confused. After my last post, I read at
(http://farmdev.com/talks/unicode/) that the default encoding for Python 2 is
ASCII. Was that information wrong?

Stefan Behnel

Jul 20, 2010, 5:05:10 PM
to cython...@googlegroups.com
Hi, please avoid top-posting.

Cass, 20.07.2010 23:01:


> Thanks!! It worked. Just used:
>
> py_byte_string = py_unicode_string.encode('UTF-8')
> cdef char* c_string = py_byte_string

Looks good.


> However, now I'm confused. After my last post, I read at
> (http://farmdev.com/talks/unicode/) that the default encoding for Python 2 is
> ASCII. Was that information wrong?

Yes. The default encoding in Python 2 is configurable and
platform-specific. Don't rely on it having any particular value.
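
To illustrate why naming the encoding is the safer habit, here is a small sketch in plain Python; the sample words are invented for the example. For purely alphabetical ASCII input the choice makes no difference, but it does as soon as other characters show up:

```python
import sys

word = u"cython"
# For ASCII-only input, ASCII and UTF-8 produce identical bytes.
assert word.encode("ascii") == word.encode("utf-8") == b"cython"

# Outside ASCII the choice matters: UTF-8 can represent the accented
# letter, ASCII cannot.
accented = u"caf\u00e9"
utf8_bytes = accented.encode("utf-8")
try:
    accented.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# The interpreter's default can be inspected, but an explicit argument
# to encode() does not depend on it.
default_encoding = sys.getdefaultencoding()
```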

Stefan
