Type declarations and 2.x/3.x types

Nikolaus Rath

Apr 29, 2013, 8:28:42 PM
to cython...@googlegroups.com
Hello,

I noticed that when compiling with -3 semantics,

def foo(str s):
    pass

requires a unicode argument when compiled into a Python 2.x extension,
and a str argument when compiled into a Python 3.x extension.

Is there a way to declare that I want a str argument in both cases, i.e.
a bytes rather than unicode argument under 2.x?


Thanks,

-Nikolaus

--
»Time flies like an arrow, fruit flies like a Banana.«

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C

Chris Barker - NOAA Federal

Apr 30, 2013, 12:20:07 AM
to cython-users
On Mon, Apr 29, 2013 at 5:28 PM, Nikolaus Rath <Niko...@rath.org> wrote:
> I noticed that when compiling with -3 semantics,
>
> def foo(str s):
>     pass
>
> requires a unicode argument when compiled into a Python 2.x extension,
> and a str argument when compiled into a Python 3.x extension.
>
> Is there a way to declare that I want a str argument in both cases, i.e.
> a bytes rather than unicode argument under 2.x?

That really wouldn't be a good idea. There are only two kinds of string objects in both Python 2 and 3, bytes objects and unicode objects; they just have different names under 2 and 3 (i.e. str means something different in each).

But any code that needs to deal with one of these objects had better know which one it is dealing with, so having "str" mean unicode under both 2 and 3 makes sense, particularly under "py3 semantics".

If you really want to take either a py2 string (what encoding?) or a unicode object, then you can take any python object, then convert and type it in the function:

def foo(s):
    cdef str myunicode
    myunicode = unicode(s)
    ....

or something like that.
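
A minimal sketch of that convert-inside-the-function idea, written as plain Python (which Cython also compiles). The helper name as_text and the UTF-8 default are assumptions for illustration, not something from this thread; use whatever encoding your byte strings actually carry.

```python
def as_text(s, encoding="utf-8"):
    """Return s as a text (unicode) string, decoding bytes explicitly."""
    if isinstance(s, bytes):
        # bytes is the str type on Python 2 and the bytes type on Python 3
        return s.decode(encoding)
    if isinstance(s, type(u"")):
        # type(u"") is unicode on Python 2 and str on Python 3
        return s
    raise TypeError("expected bytes or text, got %s" % type(s).__name__)


def foo(s):
    # The argument is left untyped; it is normalized to text here.
    text = as_text(s)
    return len(text)
```

Naming the encoding explicitly avoids depending on Python 2's default codec, which is ASCII unless someone has changed it.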

You _may_ be able to use py2 mode, and do:

def foo(unicode s):
    pass

I'd have to test to make sure, but Cython may do the str (bytes) -> unicode conversion for you (using the default encoding...)

-Chris


--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris....@noaa.gov

Robert Bradshaw

Apr 30, 2013, 12:49:32 AM
to cython...@googlegroups.com
On Mon, Apr 29, 2013 at 5:28 PM, Nikolaus Rath <Niko...@rath.org> wrote:
> Hello,
>
> I noticed that when compiling with -3 semantics,
>
> def foo(str s):
>     pass
>
> requires a unicode argument when compiled into a Python 2.x extension,
> and a str argument when compiled into a Python 3.x extension.
>
> Is there a way to declare that I want a str argument in both cases, i.e.
> a bytes rather than unicode argument under 2.x?

Yes, that's the default, but -3 forces Python 3 semantics (in
particular str == unicode).
http://docs.cython.org/src/tutorial/strings.html has lots of
information on how to be careful about bytes vs unicode.

- Robert

Stefan Behnel

Apr 30, 2013, 2:15:54 AM
to cython...@googlegroups.com
Nikolaus Rath, 30.04.2013 02:28:
> I noticed that when compiling with -3 semantics,
>
> def foo(str s):
>     pass
>
> requires a unicode argument when compiled into a Python 2.x extension,
> and a str argument when compiled into a Python 3.x extension.
>
> Is there a way to declare that I want a str argument in both cases, i.e.
> a bytes rather than unicode argument under 2.x?

Seriously, you don't want that. In Python 2, it's very difficult to write
code that never uses unicode strings anywhere, because whenever you mix
bytes and unicode for whatever reason, even by accident or because some
library hands it to you or whatnot, the result will be unicode, and you
wouldn't easily notice it. So explicitly excluding unicode input in Python
2 will not make your users happy.

IMHO, using str in a function signature is always a bad idea. Unless you're
completely sure that you want either exactly bytes input (e.g. a data chunk
from the network) or exactly unicode (i.e. text), don't type the input
argument and apply an appropriate conversion yourself.
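
As a rough illustration of leaving the argument untyped and converting it yourself, here is a plain-Python sketch for the case where the function ultimately needs bytes (for example, before handing data to C code). The names as_bytes and send_chunk and the UTF-8 default are hypothetical, chosen only for this example.

```python
def as_bytes(data, encoding="utf-8"):
    """Return data as a byte string, encoding text explicitly."""
    if isinstance(data, bytes):
        # Already bytes (Python 2 str or Python 3 bytes): pass through.
        return data
    if isinstance(data, type(u"")):
        # Text (Python 2 unicode or Python 3 str): encode explicitly.
        return data.encode(encoding)
    raise TypeError("expected bytes or text, got %s" % type(data).__name__)


def send_chunk(data):
    # Untyped argument: text input is accepted and encoded rather than rejected.
    payload = as_bytes(data)
    return len(payload)
```

With this shape, a caller who accidentally ends up with a unicode string under Python 2 still gets a working call instead of a type error at the function boundary.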

Stefan
