Cython limitations (and potential bugs)


Sour Ce

Jun 2, 2021, 5:22:25 PM
to cython-users
First of all I just want to commend the developers of this language and for the free support offered here!
It's an excellent language and a dream to be working with.

Now to the topic...
So I was inspired last week to start writing down limitations I've found in Cython after reading a Quora question asking about the limitations of Cython.
Some of the limitations are trivial or unimportant, some can be worked around, and others puzzle me and sometimes have me asking myself whether they are bugs.
Cython version: 0.29.21. I tried upgrading to 0.29.23 today in case any of these are fixed, but the pip3 upgrade didn't replace the cython binary for some reason (and pip3 uninstall didn't remove the binary either; it's located in ~/.local/bin/cython).

Here's my list ordered from least to most important:
  • Can only use const for parameters in functions, not for vars/consts (can be worked around)
  • Cannot do type overloading in a cppclass without extern from (can be worked around)
  • Cannot assign through a pointer to a native C type (e.g. unsigned int*) because Python already has a meaning for the syntax *var (couldn't using dereference(var) on the lvalue work for this?)
  • Cannot use for loops and delete entries at the same time
  • Doesn't allow you to define variables in any indented block (even with "if True: cdef ..." when you only use the var at the same indentation level or not at all)
  • Trying to use the with statement in a cppclass method crashes the compiler (why? non-extern cppclasses allow the use of everything else pythonic as far as I can see)
  • Can't use prange in a cppclass method even with nogil
  • Can't use a switch-case, i.e. "if c in [1, 2, 3]", in a nogil cppclass method (works both in nogil cdef class methods and nogil cdef functions)

The issue giving me the most trouble right now is the last entry in the list.
One of the things I love about cython and would love to see more of is the convenience of doing things like this, it's easier to read, easier and quicker to type, and just as fast as C afaik.

My dream of rewriting my large cdef class as a cppclass is slowly crumbling.
I have already had to move two functions out of the class to be able to use prange, with, and something else, which is a little ugly and disorderly, but not being able to use the switch/case feature in any non-extern cppclass method is kind of a deal breaker :<

That said, still love the language, would be awesome though if some of these are possible to fix in the future, especially the last ones :)

Golden Rockefeller

Jun 3, 2021, 4:33:30 PM
to cython-users
" Cannot change the value of a native C type pointer (e.g unsigned int) because Python already has a meaning for the syntax *var (couldn't using dereference(var) on the lvalue work for this?)"

Workaround: use array indexing (var[0]). I remember seeing something like this in the documentation.

Also, can you elaborate on " Cannot use for loops and delete entries at the same time"?

" Doesn't allow you to define variables in any indented territory (even with "if True: cdef ..." and you only use the var in the same intendation or not at all)"
If you use Python's type annotation syntax instead of cdef, you can define variables in indented territory sometimes. But I would also like to see cdef in indented territory.

Robert Bradshaw

Jun 3, 2021, 8:54:31 PM
to cython...@googlegroups.com
On Wed, Jun 2, 2021 at 2:22 PM 'Sour Ce' via cython-users <cython...@googlegroups.com> wrote:
> First of all I just want to commend the developers of this language and for the free support offered here!
> It's an excellent language and a dream to be working with.

Glad you're finding it useful!

> Now to the topic...
> So I was inspired last week to start writing down limitations I've found in cython by reading a quora question asking about the limitations of cython.
> Some of the limitations are trivial or unimportant, some can be worked around, others are puzzling me and sometimes has me asking myself if these are bugs.

Thanks for bringing them back here.

> Cython version: 0.29.21. Tried upgrading to subversion 23 today in case any of these are fixed, but the pip3 upgrade didn't replace the cython binary for some reason (and pip3 uninstall didn't remove the binary either that's located in ~/.local/bin/cython).
>
> Here's my list ordered from least to most important:
>   • Can only use const for parameters in functions, not for vars/consts (can be worked around)
Const-ness is a very odd concept for Python; mostly we only provide it in places where it's required for type checking (e.g. so that functions can have the required type).
>   • Cannot do type overloading in a cppclass without extern from (can be worked around)
I assume you mean that you can't declare a method foo(int) and foo(double)?
>   • Cannot change the value of a native C type pointer (e.g unsigned int) because Python already has a meaning for the syntax *var (couldn't using dereference(var) on the lvalue work for this?)
As mentioned, you can use var[0] = value for this. You could also define a macro #define deref_assign(target, value) ((*(target)) = (value)) if you're working with some C++ class that needs this rather than a pointer.
>   • Cannot use for loops and delete entries at the same time
Could you clarify what you mean by this?
>   • Doesn't allow you to define variables in any indented territory (even with "if True: cdef ..." and you only use the var in the same intendation or not at all)
This gets into difficult issues with respect to scoping, as Python functions only have a single locals() whereas C/C++ have per-block scopes. Note that with type inference, you can usually just make an assignment (possibly with a cast) rather than having to declare the variable.
>   • Trying to use the with statement in a cppclass method crashes the compiler (why? non-extern cppclasses allows the use of everything else pythonic as far as I can see)
>   • Can't use prange in a cppclass method even with nogil
Works for me, maybe this was fixed at head?
>   • Can't use a switch-case i.e. "if c in [1, 2, 3]" in a nogil cppclass method (works both in nogil cdef class methods and nogil cdef functions)
That's quite strange. I created https://github.com/cython/cython/issues/4212 for it. In the meantime you can use "if c == 1 or c == 2 or ..." which compiles to the same thing.

> The issue giving me the most trouble right now is the last entry in the list.
> One of the things I love about cython and would love to see more of is the convenience of doing things like this, it's easier to read, easier and quicker to type, and just as fast as C afaik.
>
> My dream of rewriting my large cdef class as a cppclass is slowly crumbling.

Writing cppclasses in .pyx files was mostly to support APIs that required extending classes for their use, not necessarily as a replacement for (or faster than) cdef classes.

> I have already had to move two functions out of the class to be able to use prange and with and something else which is a little ugly and disorderly, but not being able to use the switch/case feature for any non-extern cppclass method is kind of a deal breaker :<
>
> That said, still love the language, would be awesome though if some of these are possible to fix in the future, especially the last ones :)


Sour Ce

Jun 5, 2021, 1:01:09 AM
to cython-users
Thank you for taking the time to respond in detail and I'm happy to hear the feedback's useful :)

> Const-ness is a very odd concept for Python, mostly we only provide it in places where it's required for type checking (e.g. so that functions can have the required type).
I see.
> I assume you mean that you can't declare a method foo(int) and foo(double)? 
Yep.
> Could you clarify what you mean by this?
Sure, what I meant is that, as with Python, it doesn't seem like you can delete an entry from even a C++ container while it's being iterated with the for syntax without things breaking (or at least not the current entry being iterated; I don't think I tried deleting other entries). It only seems to work when using C++ iterators manually.
What seems to happen when you delete the current entry is that the next iteration skips 2 entries ahead instead of 1.
I've always felt that the lack of this feature is one of Python's main weaknesses compared to other scripting languages, and I was super hyped to see for iteration working at all for C and C++ containers. I think it even used to not work for C arrays without specifying the size, but now it seems to work flawlessly. It's just too bad that deleting from the container at the same time messes up the iteration, which I think is kinda key.
> Note that with type inference, you can usually just make an assignment (possibly with a cast) rather than having to declare the variable.
Ah cool, good to know!
Yep.
> Works for me, maybe this was fixed at head?
I guess so. I was able to upgrade to 0.29.23 but still no luck.
> That's quite strange. I created https://github.com/cython/cython/issues/4212 In the meantime you can use "if c == 1 or c == 2 or ..." which compiles to the same thing.
Awesome :) I forgot to mention that if you don't set nogil then it runs but you get bogus results.
I.e. cdef unsigned char c = b"#"; print(c in [b"#"]) => False.
Since I currently only use this feature for unsigned chars, I've created a temporary function that does the same for unsigned chars, using a std::string for reference and returning a bool on match.

D Woods

Jun 5, 2021, 12:10:53 PM
to cython-users
> > > Cannot use for loops and delete entries at the same time
> > Could you clarify what you mean by this?
> Sure, what I meant is that as with Python it doesn't seem like you can delete an entry from even a C++ container while being iterated using the for syntax without "thing breaking" (or at least deleting the current entry being iterated, I don't think I tried deleting other entries), it only seem to work when using C++ iterators manually.
> What seems to happen when you delete the current entry is that the next iteration skips 2 entries ahead instead of 1.
> I've always felt like the lack of this feature is one of Python's main weaknesses compared to other script languages and I was super hyped about seeing for  iteration working at all for C and C++ containers, I think it even used to not work for C arrays without specifying its size, but now it seems to works flawlessly. Only too bad deleting from the container at the same time will mess up the iteration which I think is kinda key.

For regular for-loops (i.e. with Python types), Cython deliberately makes sure to follow Python behaviour here. This is actually specifically tested (because Python compatibility is one of the main aims). That is unlikely to change.

For C++ for-loops, Cython implements them in terms of C++ iterators. C++ iterators are generally invalidated if you start inserting and erasing elements in a container that you're iterating over (this varies a bit depending on the container type and where the iterator is relative to the element you change). I don't think there's all that much that Cython can do to avoid this really. If you're doing something odd then you're much better off writing it in terms of C++ iterators yourself; then you at least have complete control.

Any future changes to the behaviour of C++ for-loops are likely to make them match Python for-loops as closely as possible (i.e. you aren't likely to like the changes!)
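
For reference, here is roughly what the Python behaviour mentioned above looks like in plain Python. This is only a minimal sketch: the function names and sample data are made up for illustration, and it says nothing about what a given C++ container does when you erase during iteration.

```python
def remove_even_in_place(values):
    """Delete even numbers while iterating over the same list (buggy on purpose)."""
    for v in values:
        if v % 2 == 0:
            # Removing the current item shifts later items left by one,
            # so the loop's internal index steps over the next element.
            values.remove(v)
    return values


def remove_even_safely(values):
    """Build a new list instead of mutating the one being iterated."""
    return [v for v in values if v % 2 != 0]


buggy = remove_even_in_place([2, 4, 6, 7])
safe = remove_even_safely([2, 4, 6, 7])
print(buggy)  # [4, 7]: the 4 was never visited
print(safe)   # [7]
```

Building a new list (or iterating over a copy) sidesteps the problem in Python; for C++ containers, driving the iterators by hand remains the option with full control.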

Stefan Behnel

Jun 8, 2021, 5:09:44 PM
to cython...@googlegroups.com
'Sour Ce' via cython-users wrote on 04.06.21 at 22:38:
> I.e. cdef unsigned char c = b"#"; print(c in [b"#"]) => False.

This is because "char" is actually an integer type in C, so the value in
"c" turns into the Python int 35 (ord('#')), which is really not in the
list [b'#'].

You can compare this to the behaviour of byte strings in Python 3, where
indexing returns an integer byte value and not a substring.

>>> b'#' in [b'#']
True
>>> b'#'[0] in [b'#']
False

If we always converted C char values to single character bytes objects,
then there would equally be cases where this gets in the way. And mixing
the conversions would just lead to unpredictable behaviour. Following C
(and Python 3) is arguably the most straightforward behaviour here.
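
A small pure-Python sketch of what that means for membership tests, assuming the value arrives as an integer byte value. The variable names are just for illustration, and this only shows Python semantics, not what Cython generates for a typed unsigned char.

```python
# Indexing a bytes object yields an int, much like a C char converted to Python.
c = b"#"[0]

in_list_of_bytes = c in [b"#"]      # int compared against a bytes object
in_list_of_ints = c in [ord("#")]   # int compared against an int
in_bytes_object = c in b"#+-"       # an int in a bytes object tests byte values

print(c)                 # 35
print(in_list_of_bytes)  # False
print(in_list_of_ints)   # True
print(in_bytes_object)   # True
```

So comparing against integer values (or against a single bytes object rather than a list of them) keeps both sides of the comparison in the same "integer byte" world.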

Stefan

Sour Ce

Jun 13, 2021, 3:40:37 PM
to cython-users
Thanks for the thorough answers.


> Writing cppclasses in .pyx files was mostly to support APIs that required extending classes for their use, not necessarily as a replacement for (or faster than) cdef classes.

Sure. I opted for a cppclass because I kind of need the convenience of overloaded methods to support multiple ways of passing "game position" arguments, which come in different types and numbers of arguments.
Additionally, a long-term goal has been to replace as much of the code as possible with C++ for code reuse and optimization purposes as I go.
I'm also not sure, maybe other factors played in or maybe I'm misremembering the number, but I believe the cython generation time for my project was reduced from 3s to 2.5s after I partially replaced this cdef class with a C++ class. Could be misremembering though.

Either way I'm bummed out again after realizing today that non-extern cppclasses inheriting from an extern cppclass actually lose all overloaded methods without any notification.

Cython did allow me to create an extern cppclass that inherits from a non-extern cppclass, though unfortunately this kind of serves no purpose to me because I need my non-extern cppclass to inherit from C++ (so the non-extern class can operate on C++ members), not the other way around.

Not sure what more I can do at this point :/



Also I've found some more caveats/strange behaviors during the past week:
* Cppclass methods convert varname into this.varname if varname is a class attribute or method.
I.e. the method "call(): a()" will call this.a() even if there is a function a(), as long as global a is not declared. This differs from cdef classes, and I believe Python classes as well, which default to local > global unless the self or this prefix is used explicitly, while cppclass methods seem to default to local > this.
* Extern from is the only clause I'm aware of that silently drops duplicate definitions. Perhaps understandable, but a little unexpected and confusing when you're not aware of it (it led me to spend days reworking my class into a cppclass only to realize today that my overloaded methods had been silently dropped).
* Cppclass construction with anything pythonic causes a segfault.
I.e. cppclass test: test(): ['a', 'b', 'c', 1, 2, 3] # segfault
* Dicts can be converted to maps... but kind of sporadically.
E.g. cdef void test(map[unsigned int, unsigned int] m): pass
test({1: 1}) # Cannot interpret dict as type 'map[unsigned int, unsigned int]'
d = {1: 1}
test(d) # Works!
On the other hand I've never had any issues with converting a list to a vector or vice versa.

Stefan Behnel

Jun 13, 2021, 4:36:56 PM
to cython...@googlegroups.com
'Sour Ce' via cython-users wrote on 13.06.21 at 17:31:
> Either way I'm bummed out again after realizing today that non-extern
> cppclasses inheriting from an extern cppclass actually loses all overloaded
> methods without any notification.

Not sure what exactly you describe here, but it sounds like something that
could be easy to fix.

Could you please open a ticket and add some example code to it? Or, even
better, implement a failing test in a PR?


> Cython did allow me to create an extern cppclass that inherits from a
> non-extern cppclass class,

Doesn't sound entirely unreasonable to me, but also not like a major use case.


> Also I've found some more caveats/strange behaviors during the past week:
> * Cppclass methods converts varname into this.varname if varname is a class
> attribute or method.
> I.e. the method "call(): a()" will call this.a() even if there is a
> function a() as long as global a is not declared. This differs from cdef
> classes and I believe python classes as well which defaults to local >
> global unless the self or this prefix is used explicitly, while cppclass
> methods seem to default to local > this.

Up for debate, I guess. C++ methods do not have an explicit self/this, so
they already are different from Python methods.

I think the question is mostly: what is a C++ method more likely to do? Use
other C++ attributes and methods? Or use surrounding code? If the answer is
strongly one of the two, then that's a reason to follow C++ or Python
semantics.

That doesn't mean that I don't see the argument of consistency – it's an
important one. I'm rather trying to find out if the behaviour is worth
changing at this point.
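
For comparison, the Python side of this can be sketched in plain Python as follows. The names a, Caller, call_bare and call_via_self are made up for illustration; this shows ordinary Python class semantics only, not what a Cython cppclass does.

```python
def a():
    return "module-level a"


class Caller:
    def a(self):
        return "method a"

    def call_bare(self):
        # A bare name is looked up as local, then global (then builtin);
        # the class namespace is not searched from inside a method body.
        return a()

    def call_via_self(self):
        # The method is only reachable through an explicit self.
        return self.a()


obj = Caller()
bare_result = obj.call_bare()
self_result = obj.call_via_self()
print(bare_result)  # module-level a
print(self_result)  # method a
```

In C++ the unqualified call inside a member function would find the member first, which is the lookup order the reported cppclass behaviour mirrors.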


> * Extern from is the only clause I'm aware of that silently drops duplicate
> definitions. Perhaps understandable, but a little unexpected and confusing
> when you're not aware (lead me to spend days reworking my class into a
> cppclass only to realize today that my overloaded methods had been silently
> dropped).

Sounds like a bug to me. Declarations should be either clear and correct,
or get rejected. Also worth a ticket.


> * Cppclass construction with anything pythonic causes segfault.
> I.e. cppclass test: test(): ['a', 'b', 'c', 1, 2, 3] # segfault
> * Dict's can be converted to maps... but kind of sporadically.
> E.g. cdef void test(map[unsigned int, unsigned int] m): pass
> test({1: 1}) # Cannot interpret dict as type 'map[unsigned int, unsigned
> int]'
> d = {1: 1}
> test(d) # Works!

Same, please create bug tickets for these.

Thanks for fighting your way through this. As mentioned before, C++ class
support wasn't meant to replace C++ as such, rather to close a gap when
users need to define them. There is definitely room for improvements.

Stefan

D Woods

Jun 14, 2021, 1:31:19 PM
to cython-users
>I think the question is mostly: what is a C++ method more likely to do? Use other C++ attributes and methods? Or use surrounding code? If the answer is strongly one of the two, then that's a reason to follow C++ or Python semantics.

> That doesn't mean that I don't see the argument of consistency – it's an important one. I'm rather trying to find out if the behaviour is worth
changing at this point.

The other question is possibly "is there a way to force Cython to use a global method of the same name?". In C++ you'd do `::method_name()` (I think) to force this, but I suspect it isn't easy to force from Cython, which could be inconvenient.

More generally, I think Cython-declared cppclasses are probably still mostly undocumented, and you may be finding out the reason why! But that's no reason why they shouldn't be improved.

David

Sour Ce

Jun 14, 2021, 7:19:58 PM
to cython-users
> Not sure what exactly you describe here, but it sounds like something that
could be easy to fix.

> Could you please open a ticket and add some example code to it? Or, even
better, implement a failing test in a PR?
Nice. Sure, not sure where to put them in the repo though. Tests/run/ ?

>> Also I've found some more caveats/strange behaviors during the past week:
>> * Cppclass methods converts varname into this.varname if varname is a class
>> attribute or method.
>> I.e. the method "call(): a()" will call this.a() even if there is a
>> function a() as long as global a is not declared. This differs from cdef
>> classes and I believe python classes as well which defaults to local >
>> global unless the self or this prefix is used explicitly, while cppclass
>> methods seem to default to local > this.

> Up to debate, I guess. C++ methods do not have an explicit self/this, so
they already are different from Python methods.

Ah, I see. I've always been using this in C++ so I didn't know it had this mechanic on its own. That makes sense then.

> Thanks for fighting your way through this. As mentioned before, C++ class
support wasn't meant to replace C++ as such, rather to close a gap when
users need to define them. There is definitely room for improvements.

No worries, it's an incredible tool, thanks again for making and working on it.

Source

Aug 19, 2021, 6:58:27 PM
to cython-users
A couple of questions:

About the bug/missing feature with a non-extern cppclass inheriting from an extern cppclass (or overloading in a non-extern cppclass): I fully understand you guys are busy, but is there any way I can incentivize prioritizing this, assuming the complexity is not way out of hand atm?
I'm an indie game developer without an income or a team, so I can't afford much, but I'd be happy to donate some to get it fixed.
I'm not in any kind of rush, but it would probably be very handy for me in approximately 6 months' time. It's a convenience thing and not critical, but it's definitely handy and satisfies the perfectionist in me, plus I'm happy to help the project any way I can.
I could maybe also try to work on it on my own and send a PR your way if I figure it out. Some simple pointers to where the related code lives would be greatly appreciated, though I assume it's probably too complex for me to handle anyway.

Secondly, I noticed with newer Python(/Cython?) installations on Debian/Ubuntu that the python3-config command no longer provides a -lpythonx flag. I've been relying on it so far to compile my Cython programs more or less manually, to avoid distutils, whose lack of flexibility and bloat (a separate distutils file for every program, the same distutils code written for every program, etc.) bother me.
Should I avoid relying on the python3-config command in the future? I know a couple of workarounds on my part but they're kinda dirty; just curious what everyone thinks.

Also, it was really cool to see the dict to map bug get fixed, and to get feedback on the issue and a workaround for the cppclass crash bug.

D Woods

Aug 20, 2021, 9:17:23 AM
to cython-users
> About the non-extern cppclass inheriting from an extern cppclass (or overloading with non-extern cppclass) bug/missing feature


It's probably relatively easy to fix (and I can probably give some guidance if you want to try it). As I said, I think the issue was in "declare_inherited_cpp_attributes" in Symtab.py. That iterates over attributes in the base class and declares them in the new class. However, a C++ function/method Entry also has an attribute "overloaded_alternatives" which describes alternative functions with the same name and a different signature, and it's this that doesn't get copied. If I were doing it I'd put a breakpoint in that function and then run through your simple example to see which Entries are copied, and which ones have "overloaded_alternatives" that aren't copied.

I think the issue is entirely within that function so it's fairly easy to fix without having to know too many details about the rest of Cython.

There's probably a decent chance that the bug would get fixed on a 6-month timescale, but obviously the way to guarantee it would be to fix it yourself ;). I wouldn't like to comment on whether anyone else is interested in prioritising it in exchange for a donation.

> python3-config

Don't know about this I'm afraid.

David