Type Hinting and Pure Python

Thomas Draper

Nov 24, 2015, 6:48:50 PM
to cython-users
With the acceptance of PEP484, type hinting is now part of Python 3.5+. I was wondering if a new style of Pure Python was being considered.

For example:

@cython.cfunc
@cython.returns(cython.bint)
@cython.locals(a=cython.int, b=cython.int)
def c_compare(a,b):
    return a == b

Could now be:

def c_compare(a: int64, b: int64) -> bool:
    return a == b

with new cython specific types (e.g. int32, uint32, int64, uint64, double, float). These additional types could act the same as their C counterparts. That is, an int32 type adds and multiplies like a 32-bit C integer.
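
For a concrete picture of the semantics I have in mind, numpy's fixed-width scalars already behave this way (numpy here is just an illustration, not part of the proposal):

import numpy as np

x = np.int32(2**31 - 1)      # largest 32-bit signed value
x + np.int32(1)              # wraps to -2147483648 (numpy may warn about the
                             # overflow), just like 32-bit C arithmetic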

I have always liked the *idea* of Pure Python. I haven't found myself using it, and this looks like a cleaner way to insert the information Cython needs.

Robert Bradshaw

Nov 24, 2015, 6:58:38 PM
to cython...@googlegroups.com
On Tue, Nov 24, 2015 at 1:53 PM, Thomas Draper <tgd...@gmail.com> wrote:
> With the acceptance of PEP484, type hinting is now part of Python 3.5+. I
> was wondering if a new style of Pure Python was being considered.

Yes, this has been considered since the addition of type annotations.

> For example:
>
> @cython.cfunc
> @cython.returns(cython.bint)
> @cython.locals(a=cython.int, b=cython.int)
> def c_compare(a,b):
>     return a == b
>
> Could now be:
>
> def c_compare(a: int64, b: int64) -> bool:
>     return a == b

Likely the definition of types does not mean we want a cdef function
(though I've long wanted def functions to be cpdef by default whenever
possible; methods are trickier as they impose constraints along class
hierarchies so they'd probably have to continue to be opt-in by
default).

> with new cython specific types (e.g. int32, uint32, int64, uint64, double,
> float). These additional types could act the same as their C counterparts.
> That is, an int32 type adds and multiplies like a 32-bit C integer.

One can already get the standard fixed-width int types, e.g.

https://github.com/cython/cython/blob/master/Cython/Includes/libc/stdint.pxd
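
For example, a minimal .pyx sketch using these declarations (bint is Cython's built-in C boolean type):

from libc.stdint cimport int64_t

cdef bint c_compare(int64_t a, int64_t b):
    return a == b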

The question is what to do about int, long, and float that have Python
meanings as well. (I agree that system-defined sizes come with a whole
host of problems, but when wrapping C libraries one doesn't always
have a choice.)

> I have always liked the *idea* of Pure Python. I haven't found myself using
> it, and this looks like a cleaner way to insert the information Cython
> needs.

Yes, it's very nice for function signatures. One difficulty is that it
doesn't specify a nice way to type (non-argument) local variables.
Cython has other constructs (cimports, extern from, structs, ...) that
go beyond simply declaring types as well.
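
For illustration, here's roughly what typing a non-argument local looks like today in pure-Python mode, using the existing decorators (a rough sketch; the names are just examples):

import cython

@cython.cfunc
@cython.returns(cython.long)
@cython.locals(n=cython.int, i=cython.int, total=cython.long)
def sum_squares(n):
    # total and i get C types from the decorator; annotations alone
    # have no equivalent for these locals
    total = 0
    for i in range(n):
        total += i * i
    return total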

- Robert

Chris Barker - NOAA Federal

Nov 25, 2015, 12:33:13 AM
to cython...@googlegroups.com
> (though I've long wanted def functions to be cpdef by default whenever
> possible;

+1

> The question is what to do about int, long, and float that have Python
> meanings as well. (I agree that system-defined sizes come with a whole
> host of problems, but when wrapping C libraries one doesn't always
> have a choice.)

I think Numpy has cint and friends:

Maybe c_int, c_long, etc. float32 and float64 will work for float and double.

Though encouraging the sized int types would be a good thing!
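
For reference, some spellings that already exist elsewhere (just as possible inspiration -- whether Cython would borrow any of them is an open question):

import numpy as np
import ctypes

np.int32, np.int64, np.intc      # sized ints, plus numpy's "C int"
np.float32, np.float64           # C float / double
ctypes.c_int, ctypes.c_long      # platform-sized C types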

>> I have always liked the *idea* of Pure Python. I haven't found myself using
>> it, and this looks like a cleaner way to insert the information Cython
>> needs.
>
> Yes, it's very nice for function signatures. One difficulty is that it
> doesn't specify a nice way to type (non-argument) local variables.
> Cython has other constructs (cimports, extern from, structs, ...) that
> go beyond simply declaring types as well.

I don't think it's all or nothing -- if we can get cleaner, more
Python-compatible type declarations for function parameters, that's a
good thing, even for non-pure-Python mode, and even when mixed with
the decorator syntax for pure Python.

The other trick is that type hinting in Python is still dynamic -- it
is generally not specifying a particular type at the binary level, but
rather a protocol or ABC - maybe something like a sequence of
strings. There could be many ways to provide that in Python.

But that doesn't mean Cython couldn't share the syntax, anyway.

-CHB

Jeroen Demeyer

Nov 25, 2015, 7:40:03 AM
to cython...@googlegroups.com
On 2015-11-25 06:33, Chris Barker - NOAA Federal wrote:
> Though encouraging the sized int types would be a good thing!

I don't completely agree with this. I would recommend to use sized or
non-sized types depending on the application. Sometimes you really want
32 bits on a 32-bit system and 64 bits on a 64-bit system.

Stefan Behnel

Nov 25, 2015, 12:12:43 PM
to cython...@googlegroups.com
Thomas Draper wrote on 24.11.2015 at 22:53:
> With the acceptance of PEP484, type hinting is now part of Python 3.5+. I
> was wondering if a new style of Pure Python was being considered.
>
> For example:
>
> @cython.cfunc
> @cython.returns(cython.bint)
> @cython.locals(a=cython.int, b=cython.int)
> def c_compare(a,b):
>     return a == b
>
> Could now be:
>
> def c_compare(a: int64, b: int64) -> bool:
>     return a == b

No, it can't. PEP484 type hinting is for pre-defined Python types only and
not currently extensible.

I discussed this with Guido and it was rejected in order to keep Python
type hinting manageable for them.

Stefan

Chris Barker - NOAA Federal

Nov 29, 2015, 9:35:04 PM
to cython...@googlegroups.com
> I don't completely agree with this. I would recommend to use sized or non-sized types depending on the application. Sometimes you really want 32 bits on a 32-bit system and 64 bits on a 64-bit system.

Talk to the compiler / OS authors -- if you use a C long, you'll get
that on *nix ( with gcc), but you won't get it on Windows with MSVC.

So you need all sorts of platform specific defines anyway.

And -- why in the world would you want a particular size integer based
on the pointer size of your OS? Unless you are using integers to store
pointers, which is another bad idea.

But all very OT for this list....

-CHB

Robert Bradshaw

Nov 30, 2015, 3:23:57 PM
to cython...@googlegroups.com
On Sun, Nov 29, 2015 at 6:35 PM, Chris Barker - NOAA Federal
<chris....@noaa.gov> wrote:
>> I don't completely agree with this. I would recommend to use sized or non-sized types depending on the application. Sometimes you really want 32 bits on a 32-bit system and 64 bits on a 64-bit system.
>
> Talk to the compiler / OS authors -- if you use a C long, you'll get
> that on *nix ( with gcc), but you won't get it on Windows with MSVC.

Yeah, plain vanilla C on Windows doesn't have a good way to
(generally) get 32-bit ints on 32-bit hardware and 64-bit ints on
64-bit hardware.

> So you need all sorts of platform specific defines anyway.
>
> And -- why in the world would you want a particular size integer based
> on the pointer size of your OS? Unless you are using integers to store
> pointers, which is another bad idea.

The pointer size for your OS is a good proxy for the word size of the hardware.

Python itself went with the "long" type for its single-precision ints.

> But all very OT for this list....
>
> -CHB

Sturla Molden

Dec 1, 2015, 3:35:12 AM
to cython...@googlegroups.com
On 30/11/15 21:23, Robert Bradshaw wrote:

> Yeah, plain vanilla C on Windows doesn't have a good way to
> (generally) get 32-bit ints on 32-bit hardware and 64-bit ints on
> 64-bit hardware.

intptr_t is defined in stdint.h
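
e.g., in a .pyx file, using the stdint declarations that ship with Cython:

from libc.stdint cimport intptr_t

cdef intptr_t word_sized = 0    # an integer the size of a pointer on every platform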





Sturla Molden

Dec 1, 2015, 4:03:45 AM
to cython...@googlegroups.com
On 30/11/15 03:35, Chris Barker - NOAA Federal wrote:

> Talk to the compiler / OS authors -- if you use a C long, you'll get
> that on *nix ( with gcc), but you won't get it on Windows with MSVC.

A C long is by definition the fastest integer type of at least 32 bits.
The AMD64 system uses natively a 64 bit pointer with a 32 bit offset.
The fastest and native integer is 32 bits.

On Windows 64 MSVC or MinGW-w64 a C long is 32 bits. Thus, the C
standard is implemented correctly because this is the fastest integer
with at least 32 bits.

64 bit long is breaking the C standard on AMD64 because it is not the
fastest integer type of at least 32 bits. C compilers on OSX, Linux and
Cygwin are not following the C standard.

Fortran compilers do it correctly on all platforms, however, as they all
default to 32 bit integer (integer*4) unless you specify otherwise.
Fortran requires that an integer without kind number should default to
the fastest on the platform, which is 32 bits on AMD64.

If you want an integer the size of a void* you should use intptr_t. If
you want an integer that can hold the maximum segment size you should
use size_t. If you want the fastest offset to a pointer you should use
int (at least 16 bits), long (at least 32 bits) or long long (at least
64 bits). Normally you want to index with long.

Do not assume that a long is 64 bits or that an int is 32 bits. Both of
these assumptions are wrong and very common programming mistakes.
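
A quick way to see what a given platform actually gives you (plain Python, just for illustration):

import ctypes

print("sizeof(int)  =", ctypes.sizeof(ctypes.c_int))    # typically 4
print("sizeof(long) =", ctypes.sizeof(ctypes.c_long))   # 4 on 64-bit Windows, 8 on 64-bit Linux/OSX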

Sturla



Robert Bradshaw

Dec 1, 2015, 5:01:39 AM
to cython...@googlegroups.com
On Tue, Dec 1, 2015 at 1:02 AM, Sturla Molden <sturla...@gmail.com> wrote:
> On 30/11/15 03:35, Chris Barker - NOAA Federal wrote:
>
>> Talk to the compiler / OS authors -- if you use a C long, you'll get
>> that on *nix ( with gcc), but you won't get it on Windows with MSVC.
>
> A C long is by definition the fastest integer type of at least 32 bits.

No, that would be int_fast32_t. The C standard doesn't say anything
about the speed of long, only its minimum size and ranking between int
and long long.

> The
> AMD64 system uses natively a 64 bit pointer with a 32 bit offset. The
> fastest and native integer is 32 bits.

> On Windows 64 MSVC or MinGW-w64 a C long is 32 bits. Thus, the C standard is
> implemented correctly because this is the fastest integer with at least 32
> bits.
>
> 64 bit long is breaking the C standard on AMD64 because it is not the
> fastest integer type of at least 32 bits. C compilers on OSX, Linux and
> Cygwin are not following the C standard.
>
> Fortran compilers do it correctly on all platforms, however, as they all
> default to 32 bit integer (integer*4) unless you specify otherwise. Fortran
> requires that an integer without kind number should default to the fastest
> on the platform, which is 32 bits on AMD64.
>
> If you want an integer the size of a void* you should use intptr_t. If you
> want an integer that can hold the maximum segment size you should use
> size_t. If you want the fastest offset to a pointer you should use int (at
> least 16 bits), long (at least 32 bits) or long long (at least 64 bits).
> Normally you want to index with long.

If you want an integer that may be 64 bits, when 64 bits isn't too
expensive, use a long. Unfortunately Windows is the odd one out
here, as it doesn't have any type that fits this (intptr_t might
have the right size, but feels wrong).

> Do not assume that a long is 64 bits or that an int is 32 bits. Both of
> these assumptions are wrong and very common programming mistakes.

For sure.

And of course if you want to interoperate with code that was written
before C99 (or at least before C99 became widespread) and uses longs,
you have to use a long.

Chris Barker

Dec 1, 2015, 9:44:20 PM
to cython-users
On Tue, Dec 1, 2015 at 4:02 AM, Sturla Molden <sturla...@gmail.com> wrote:
> A C long is by definition the fastest integer type of at least 32 bits.

Can you find a reference for that? All I see is "at least 32 bits", more or less, but I haven't found an authoritative reference. But what the heck is "fastest" anyway? These days we often find we're waiting for memory to get pushed around -- and a smaller size would be better for that regardless of other constraints.

Anyway, way OT -- but while it sure seems like a nice idea to use generic types like "int" and "long" and automagically get larger types as you compile your code on more capable hardware, in reality it's just a nightmare -- while you write your code, you'd better know what size you have unless you are carefully checking sizes at run-time -- and who the heck actually does that?

sized types are just so much more robust and clear. Fortran has had that right for a long time.....
 
> C compilers on OSX, Linux and Cygwin are not following the C standard.

regardless of who is or isn't following the standard - the reality is that different vendors do it differently -- so cross-platform code needs to deal with that.

-CHB

> If you want an integer the size of a void* you should use intptr_t. If you want an integer that can hold the maximum segment size you should use size_t.

exactly.
 
> If you want the fastest offset to a pointer you should use int (at least 16 bits), long (at least 32 bits) or long long (at least 64 bits). Normally you want to index with long.

Ouch -- seems risky to me to use an int on a 64-bit platform!

-CHB


--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris....@noaa.gov

Robert Bradshaw

Dec 2, 2015, 12:19:29 AM
to cython...@googlegroups.com
On Tue, Dec 1, 2015 at 6:43 PM, Chris Barker <chris....@noaa.gov> wrote:
> On Tue, Dec 1, 2015 at 4:02 AM, Sturla Molden <sturla...@gmail.com>
> wrote:
>>
>> A C long is by definition the fastest integer type of at least 32 bits.
>
>
> Can you find a reference for that? All I see is "at least 32 bits", more or
> less, but I haven't found an authoritative reference. But what the heck is
> "fastest" anyway? These days we often find we're waiting for memory to get
> pushed around -- and a smaller size would be better for that regardless of
> other constraints.
>
> Anyway, way OT -- but while it sure seems like a nice idea to use generic
> types like "int" and "long" and automagically get larger types as you
> compile your code on more capable hardware, in reality it's just a
> nightmare -- while you write your code, you'd better know what size you have
> unless you are carefully checking sizes at run-time -- and who the heck
> actually does that?

Cython does :-P
https://github.com/cython/cython/blob/581c8e7f264f256a6e58c1a69785aed3fac01503/Cython/Utility/TypeConversion.c#L544

> sized types are just so much more robust and clear. Fortran has had that
> right for a long time.....
>
>>
>> C compilers on OSX, Linux and Cygwin are not following the C standard.
>
>
> regardless of who is or isn't following the standard - the reality is that
> different vendors do it differently -- so cross-platform code needs to deal
> with that.
>
> -CHB
>
>> If you want an integer the size of a void* you should use intptr_t. If you
>> want an integer that can hold the maximum segment size you should use
>> size_t.
>
>
> exactly.
>
>>
>> If you want the fastest offset to a pointer you should use int (at least
>> 16 bits), long (at least 32 bits) or long long (at least 64 bits). Normally
>> you want to index with long.
>
>
> Ouch -- seems risky to me to use an int on a 64-bit platform!

Or a long long, a long on Windows...