I'm trying to create a "max-heap" that works with longs and floats.
    ctypedef fused element_t:
        long
        double

    cdef struct heap:
        ulong capacity
        element_t *data

    cdef heap_init( ulong capacity, heap_t heap ) nogil:
        heap.capacity = capacity
        heap.data = <element_t>malloc( capacity *
                                       sizeof( heap.data[ 0 ] ) )
        for i in range( capacity ):
            if heap.data[ 0 ] is long:
                heap.data[ i ] = LONG_MIN
            else:
                # DBL_MIN is the smallest positive double; -DBL_MAX is
                # the most negative finite value
                heap.data[ i ] = -DBL_MAX
However, I get "Type is not specialized" on the cast from malloc. I
have gotten this to compile by changing type of "data" to void *, and
declaring a dummy element_t:
    cdef void heap_init( ulong capacity, heap_t heap, element_t element ) nogil:
        heap.capacity = capacity
        heap.data = malloc( capacity * sizeof( element ) )
        if element_t is long:
            for i in range( capacity ):
                (<element_t *>heap.data)[ i ] = LONG_MIN
        elif element_t is double:
            for i in range( capacity ):
                # DBL_MIN is the smallest positive double; -DBL_MAX is
                # the most negative finite value
                (<element_t *>heap.data)[ i ] = -DBL_MAX
        else:
            with gil:
                raise TypeError( "Unsupported heap type" )
However, this is (IMHO) an ugly hack... is there any way to refer to
the type of the specialized element in "heap_t"?
Compiler crash traceback from this point on:
  File "/Library/Python/2.7/site-packages/Cython/Compiler/Nodes.py", line 2305, in copy_def
    permutations = PyrexTypes.get_all_specialized_permutations(fused_compound_types)
  File "/Library/Python/2.7/site-packages/Cython/Compiler/PyrexTypes.py", line 2644, in get_all_specialized_permutations
    return _get_all_specialized_permutations(unique(fused_types))
  File "/Library/Python/2.7/site-packages/Cython/Compiler/PyrexTypes.py", line 2647, in _get_all_specialized_permutations
    fused_type, = fused_types[0].get_fused_types()
IndexError: list index out of range
You can combine it with the fused structs as another fused type, but
that would give you some specializations that are invalid (you could
match for that and raise an exception though...).
> You can combine it with the fused structs as another fused type, but
> that would give you some specializations that are invalid (you could
> match for that and raise an exception though...).
Sounds like this should go into the NumPy documentation page for now.
> Compiler crash traceback from this point on:
> File "/Library/Python/2.7/site-packages/Cython/Compiler/Nodes.py", line
> 2305, in copy_def
> permutations =
> PyrexTypes.get_all_specialized_permutations(fused_compound_types)
> File "/Library/Python/2.7/site-packages/Cython/Compiler/PyrexTypes.py",
> line 2644, in get_all_specialized_permutations
> return _get_all_specialized_permutations(unique(fused_types))
> File "/Library/Python/2.7/site-packages/Cython/Compiler/PyrexTypes.py",
> line 2647, in _get_all_specialized_permutations
> fused_type, = fused_types[0].get_fused_types()
> IndexError: list index out of range
Ah, you made two mistakes. The first is that you forgot to declare
'self' in your method, which leads to the compiler crash. The second
is that you use a fused type as a class attribute, which is not
supported.
> On 30 April 2012 09:25, Stefan Behnel <stefan...@behnel.de> wrote:
>> mark florisson, 30.04.2012 10:22:
>>> On 29 April 2012 15:50, shaunc wrote:
>>>> update:
>>> You can combine it with the fused structs as another fused type, but
>>> that would give you some specializations that are invalid (you could
>>> match for that and raise an exception though...).
>> Sounds like this should go into the NumPy documentation page for now.
At some point we discussed allowing fused types as part of fused
types, e.g. struct attributes, dtypes, etc. Only in the case of
structs and unions would this be really useful, as you can't specify
the fused part in an argument declaration itself. It's really not
different from e.g.
except it happens up-front. The semantics here are:
1) If a compound type has one elementary type as a fused type,
   specialize on the elementary type.
   -> example: cython.floating[:] -> specialize on float, double, ...
2) If a compound type has multiple elementary fused types,
   specialize on the entire type.
   -> example: the function as above, a struct with multiple fused
      attributes -> specialize on the entire type (this may mean the
      user has to define those entire types if they want to explicitly
      specialize)
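
As a rough illustration of case 1 (not code from this thread), a function taking cython.floating[:] is compiled once per element type in the fused set. The function name `total` is made up for the example.

```cython
cimport cython

def total(cython.floating[:] values):
    # One specialization is generated per member of cython.floating;
    # within each one, `acc` has the same C type as the buffer elements.
    cdef cython.floating acc = 0
    cdef Py_ssize_t i
    for i in range(values.shape[0]):
        acc += values[i]
    return acc
```

Called from Python with a buffer of C doubles, the double specialization should be selected at run time; called from Cython code with a typed memoryview, the choice is made at compile time.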
> On 29 April 2012 15:56, shaunc wrote:
>> File "/Library/Python/2.7/site-packages/Cython/Compiler/PyrexTypes.py",
>> line 2647, in _get_all_specialized_permutations
>> fused_type, = fused_types[0].get_fused_types()
>> IndexError: list index out of range
> Ah, you made two mistakes. The first is that you forgot to declare
> 'self' in your method, which leads to the compiler crash.
It's still a bug that it crashes, though. Should raise an error instead
when the type declared for the first argument of an extension type method
is not compatible with "self". And the above traceback looks like the fused
types code should check if there really is a matching permutation.
> The second
> is that you use a fused type as a class attribute, which is not
> supported.
That's unfortunate, though understandable. It would mean that the class
itself gets multiplied, I guess, so this could end up being quite involved.
And I don't think it could be done without user visible quirks, e.g. when
typing variables as that class or type checking against it. Sounds like it
would take some major effort.
> At some point we discussed allowing fused types as part of fused
> types, e.g. struct attributes, dtypes, etc. Only in the case of
> structs and unions would this be really useful, as you can't specify
> the fused part in an argument declaration itself. It's really not
> different from e.g.
> except it happens up-front. The semantics here are:
> 1) if a compound type has one elementary type as a fused type,
> specialize on the elementary type
> -> example: cython.floating[:] -> specialize on float, double, ...
> 2) if a compound type has multiple elementary fused types,
> specialize on the entire type
> -> example: the function as above, a struct with multiple
> fused attributes -> specialize on the entire type (this may mean the
> user has to define those entire types if they want to explicitly
> specialize)
> What do you think?
Is this really all that useful? Wouldn't you just use a union in C instead
if you want to store multiple types in the same field of a struct?
I think we might run into the same quirks as for fused extension type
attributes at some point.
The use case at hand (as far as I understand it) was building up an entire
heap implementation based on a struct with a fused item type. That would
mean that the entire implementation gets specialised for each item type.
That's potentially a huge amount of struct data types and associated code
that we generate there, without a serious benefit.
I think, if fused types are to be used here, it would be much better to
make the heap items array a void* and store the heap item type (maybe as an
enum) as part of the heap struct. The rest can be done with fused types
already, simply by casting to the proper item array type before passing it
into the fused heap implementation. And as an additional bonus, it would be
trivial to hide all of it behind an extension type façade.
Personally, I wouldn't invest too much time into fused structs for now and
just make them an error with a clearly understandable message, until we
have a valid use case that we can build this feature around.
> mark florisson, 30.04.2012 10:29:
>> On 29 April 2012 15:56, shaunc wrote:
>>> File "/Library/Python/2.7/site-packages/Cython/Compiler/PyrexTypes.py",
>>> line 2647, in _get_all_specialized_permutations
>>> fused_type, = fused_types[0].get_fused_types()
>>> IndexError: list index out of range
>> Ah, you made two mistakes. The first is that you forgot to declare
>> 'self' in your method, which leads to the compiler crash.
> It's still a bug that it crashes, though. Should raise an error instead
> when the type declared for the first argument of an extension type method
> is not compatible with "self". And the above traceback looks like the fused
> types code should check if there really is a matching permutation.
>> The second
>> is that you use a fused type as a class attribute, which is not
>> supported.
> That's unfortunate, though understandable. It would mean that the class
> itself gets multiplied, I guess, so this could end up being quite involved.
> And I don't think it could be done without user visible quirks, e.g. when
> typing variables as that class or type checking against it. Sounds like it
> would take some major effort.
> Another error case for now, I guess.
> Stefan
Yeah, they should give proper errors. I guess there are so many
ways to use them incorrectly that I haven't handled them all yet.
> mark florisson, 30.04.2012 10:44:
>> At some point we discussed allowing fused types as part of fused
>> types, e.g. struct attributes, dtypes, etc. Only in the case of
>> structs and unions would this be really useful, as you can't specify
>> the fused part in an argument declaration itself. It's really not
>> different from e.g.
>> except it happens up-front. The semantics here are:
>> 1) if a compound type has one elementary type as a fused type,
>> specialize on the elementary type
>> -> example: cython.floating[:] -> specialize on float, double, ...
>> 2) if a compound type has multiple elementary fused types,
>> specialize on the entire type
>> -> example: the function as above, a struct with multiple
>> fused attributes -> specialize on the entire type (this may mean the
>> user has to define those entire types if they want to explicitly
>> specialize)
>> What do you think?
> Is this really all that useful? Wouldn't you just use a union in C instead
> if you want to store multiple types in the same field of a struct?
> I think we might run into the same quirks as for fused extension type
> attributes at some point.
Indeed. So maybe it would be better for structs to also list the
specializations as part of the struct type. For things like function
pointers you simply list the entire thing.
So specializing structs/ext classes: StructOrClass[float, int]
Function pointers: int (*)(float, int)
> The use case at hand (as far as I understand it) was building up an entire
> heap implementation based on a struct with a fused item type. That would
> mean that the entire implementation gets specialised for each item type.
> That's potentially a huge amount of struct data types and associated code
> that we generate there, without a serious benefit.
And that's exactly what you want, as you want to keep a list of items
of a specific type. If you want to support many types, you're going to
have to duplicate the code, or do a lot of checks on the dtype. In
the case of a union, you're going to have to special-case which attribute
you access, which you want to be automatic. I don't think
this use case is very common, but not supporting it is also a
limitation. That said, it's also not very high on my priority list,
but I think it'd be nice to support if and when extension classes are
also supported.
> On 30 April 2012 10:21, Stefan Behnel wrote:
>> The use case at hand (as far as I understand it) was building up an entire
>> heap implementation based on a struct with a fused item type. That would
>> mean that the entire implementation gets specialised for each item type.
>> That's potentially a huge amount of struct data types and associated code
>> that we generate there, without a serious benefit.
> And that's exactly what you want, as you want to keep a list of items
> of a specific type. If you want to support many types, you're going to
> have to duplicate the code, or do a lot of checks on the dtype. In
> case of a union, you're going to have to specialcase which attribute
> you're going to access, which you want to be automatic. I don't think
> this use case is very common, but not supporting it is also a
> limitation.
I do not consider it a limitation that we do not support an uncommon use
case in the syntactically easiest(?) possible way. See below for a
straightforward way to deal with the specific use case at hand without
having to duplicate any code on the user side.
>> I think, if fused types are to be used here, it would be much better to
>> make the heap items array a void* and store the heap item type (maybe as an
>> enum) as part of the heap struct. The rest can be done with fused types
>> already, simply by casting to the proper item array type before passing it
>> into the fused heap implementation. And as an additional bonus, it would be
>> trivial to hide all of it behind an extension type façade.
>> Personally, I wouldn't invest too much time into fused structs for now and
>> just make them an error with a clearly understandable message, until we
>> have a valid use case that we can build this feature around.
I consider fused types a way to optimise code by trading code complexity
and size for speed. If users have to deal with non-performance critical
code (such as what I describe above) in a more generic way without help by
fused types, I'm totally ok with that. After all, we wanted something
that's simpler than C++ templates and we should keep it that way.
> On 30 April 2012 09:45, Stefan Behnel wrote:
>> mark florisson, 30.04.2012 10:29:
>>> On 29 April 2012 15:56, shaunc wrote:
>>>> File "/Library/Python/2.7/site-packages/Cython/Compiler/PyrexTypes.py",
>>>> line 2647, in _get_all_specialized_permutations
>>>> fused_type, = fused_types[0].get_fused_types()
>>>> IndexError: list index out of range
>>> Ah, you made two mistakes. The first is that you forgot to declare
>>> 'self' in your method, which leads to the compiler crash.
>> It's still a bug that it crashes, though. Should raise an error instead
>> when the type declared for the first argument of an extension type method
>> is not compatible with "self". And the above traceback looks like the fused
>> types code should check if there really is a matching permutation.
>>> The second
>>> is that you use a fused type as a class attribute, which is not
>>> supported.
>> That's unfortunate, though understandable. It would mean that the class
>> itself gets multiplied, I guess, so this could end up being quite involved.
>> And I don't think it could be done without user visible quirks, e.g. when
>> typing variables as that class or type checking against it. Sounds like it
>> would take some major effort.
>> Another error case for now, I guess.
> Yeah, they should give proper errors. I guess there's so many
> different possible cases to not use them correctly, that I haven't had
> them all handled.
Understandable again. This was a rather large addition to the language. It
is not unexpected that it takes some time for the compiler to mature.
> mark florisson, 30.04.2012 10:44:
>> At some point we discussed allowing fused types as part of fused
>> types, e.g. struct attributes, dtypes, etc. Only in the case of
>> structs and unions would this be really useful, as you can't specify
>> the fused part in an argument declaration itself. It's really not
>> different from e.g.
>> except it happens up-front. The semantics here are:
>> 1) if a compound type has one elementary type as a fused type,
>> specialize on the elementary type
>> -> example: cython.floating[:] -> specialize on float, double, ...
>> 2) if a compound type has multiple elementary fused types,
>> specialize on the entire type
>> -> example: the function as above, a struct with multiple
>> fused attributes -> specialize on the entire type (this may mean the
>> user has to define those entire types if they want to explicitly
>> specialize)
>> What do you think?
> Is this really all that useful? Wouldn't you just use a union in C instead
> if you want to store multiple types in the same field of a struct?
> I think we might run into the same quirks as for fused extension type
> attributes at some point.
> The use case at hand (as far as I understand it) was building up an entire
> heap implementation based on a struct with a fused item type. That would
> mean that the entire implementation gets specialised for each item type.
> That's potentially a huge amount of struct data types and associated code
> that we generate there, without a serious benefit.
> I think, if fused types are to be used here, it would be much better to
> make the heap items array a void* and store the heap item type (maybe as an
> enum) as part of the heap struct. The rest can be done with fused types
> already, simply by casting to the proper item array type before passing it
> into the fused heap implementation. And as an additional bonus, it would be
> trivial to hide all of it behind an extension type façade.
This only works if you pass in the element type as an extra or dummy
argument (e.g. when inserting into the heap, or searching for some
value in the heap). Otherwise, you'll need a dummy scalar argument,
and what you lose is type safety. If I pass in a float to my heap of
doubles, it will suddenly cast a double pointer stored as a void
pointer to a float pointer and clobber my data. I can only avoid this
by checking the kind and the sizeof of the data type before casting.
This also means that conversion to objects and returning primitive
values get harder, e.g. "get the maximum element". You now have to do
a memcpy for the primitive case copying to a passed in pointer from
the user, or manually hard-code all the different conversions. This is
exactly what fused types are there to avoid.
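
The type-safety concern can be sketched roughly as follows. This is an illustration only, assuming the void* plus type-tag struct layout discussed in this thread; `heap_push` and the error messages are hypothetical names, and the check on the tag is the manual step that would otherwise be easy to forget.

```cython
cdef enum:
    LONG_TYPE
    DOUBLE_TYPE

ctypedef fused element_t:
    long
    double

cdef struct heap:
    Py_ssize_t mem_length
    Py_ssize_t length
    void* items
    int type

cdef int heap_push(heap* h, element_t value) except -1:
    cdef element_t* items = <element_t*>h.items
    cdef element_t tmp
    cdef Py_ssize_t i, parent
    # `element_t is long` is resolved per specialization at compile time;
    # the comparison with the stored tag happens at run time.
    if element_t is long:
        if h.type != LONG_TYPE:
            raise TypeError("heap does not store long items")
    else:
        if h.type != DOUBLE_TYPE:
            raise TypeError("heap does not store double items")
    if h.length >= h.mem_length:
        raise IndexError("heap is full")
    # Append at the end, then sift up to restore the max-heap property.
    i = h.length
    items[i] = value
    h.length += 1
    while i > 0:
        parent = (i - 1) // 2
        if items[parent] >= items[i]:
            break
        tmp = items[parent]
        items[parent] = items[i]
        items[i] = tmp
        i = parent
    return 0
```

Without the tag check, a call that selects the wrong specialization would reinterpret the stored array through a pointer of the wrong element type, which is the data corruption described above.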
> On 30 April 2012 10:21, Stefan Behnel wrote:
>> mark florisson, 30.04.2012 10:44:
> This only works if you pass in the element type as an extra or dummy
> argument
What I meant was this:
    cdef enum:
        INT_TYPE
        LONG_TYPE
        DOUBLE_TYPE

    cdef struct heap:
        Py_ssize_t mem_length
        Py_ssize_t length
        void* items
        int type
I find that acceptable and it keeps the generated C code way below what a
fused struct and cdef class would give you. I would even use separate API
classes for different low-level types, might make things more accessible
for users on the Python side.
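
The rest of that approach might look something like the sketch below: a small non-fused function switches on the stored tag and casts the void* before handing it to a fused implementation. This is a guess at the shape of the idea, not code from the original message; `sift_down` and `heapify` are hypothetical names, and only the long and double cases are shown.

```cython
cdef enum:
    LONG_TYPE
    DOUBLE_TYPE

ctypedef fused element_t:
    long
    double

cdef struct heap:
    Py_ssize_t mem_length
    Py_ssize_t length
    void* items
    int type

# The algorithmic part is written once and specialized per element type.
cdef void sift_down(element_t* items, Py_ssize_t length, Py_ssize_t i) nogil:
    cdef Py_ssize_t child
    cdef element_t tmp
    while True:
        child = 2 * i + 1
        if child >= length:
            break
        # Pick the larger of the two children.
        if child + 1 < length and items[child + 1] > items[child]:
            child += 1
        if items[i] >= items[child]:
            break
        tmp = items[i]
        items[i] = items[child]
        items[child] = tmp
        i = child

# The "boring" part: dispatch on the stored tag and cast before the call.
cdef int heapify(heap* h) except -1:
    cdef Py_ssize_t i
    for i in range(h.length // 2 - 1, -1, -1):
        if h.type == LONG_TYPE:
            sift_down(<long*>h.items, h.length, i)
        elif h.type == DOUBLE_TYPE:
            sift_down(<double*>h.items, h.length, i)
        else:
            raise TypeError("unsupported heap item type")
    return 0
```

Only `sift_down` is duplicated in the generated C code; the struct and the dispatching function exist once, which is the code-size argument made above.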
> mark florisson, 30.04.2012 15:32:
>> On 30 April 2012 10:21, Stefan Behnel wrote:
>>> mark florisson, 30.04.2012 10:44:
>> This only works if you pass in the element type as an extra or dummy
>> argument
> What I meant was this:
> cdef enum:
> INT_TYPE
> LONG_TYPE
> DOUBLE_TYPE
> struct heap:
> Py_ssize_t mem_length
> Py_ssize_t length
> void* items
> int type
> I find that acceptable and it keeps the generated C code way below what a
> fused struct and cdef class would give you. I would even use separate API
> classes for different low-level types, might make things more accessible
> for users on the Python side.
> Stefan
Maybe C++ templates aren't that bad; they are certainly much more
powerful. It is sometimes useful to allow code to operate on any type
as you go, rather than have the API designer decide what is best for
you (and export all possible specializations).
In this case I would personally prefer to use fused types, as that makes
the compiler deal with all the boring things like dispatching and
also guarantees type safety in an easy way (this code becomes
unmanageable for a more complicated algorithm or more data types).
>> I find that acceptable and it keeps the generated C code way below what a
>> fused struct and cdef class would give you. I would even use separate API
>> classes for different low-level types, might make things more accessible
>> for users on the Python side.
> Maybe C++ templates aren't that bad; they are certainly much more
> powerful. It is sometimes useful to allow code to operate on any type
> as you go, rather than have the API designer decide what is best for
> you (and export all possible specializations).
You have to know all possible specialisations statically at compile time,
though.
> In this case I would personally prefer to use fused types, as that makes
> the compiler deal with all the boring things like dispatching and
> also guarantees type safety in an easy way (this code becomes
> unmanageable for a more complicated algorithm or more data types).
That's why I leave the (algorithmically) complicated stuff to fused types
in the code above and keep the boring stuff simple.
  File "Visitor.py", line 176, in Cython.Compiler.Visitor.TreeVisitor._visitchild (/Users/x/src/entropy-git/src/entropy/analyze/monitor/build/cython/Cython/Compiler/Visitor.c:3766)
  File "/Library/Python/2.7/site-packages/Cython/Compiler/ParseTreeTransforms.py", line 2580, in visit_PrimaryCmpNode
    is_same = type1.same_as(type2)
  File "/Library/Python/2.7/site-packages/Cython/Compiler/PyrexTypes.py", line 214, in same_as