I'm filing this because we discovered that C files generated with older versions of Cython can access internal CPython APIs which were changed or removed in Python 3.10. This leads me to conclude that distributing generated Cython code in source distributions is unsafe.
The documentation currently recommends distributing the generated source code so that Cython is not required at build/install time:
> It is strongly recommended that you distribute the generated .c files as well as your Cython sources, so that users can install your module without needing to have Cython available.
This should be amended and reversed, and it would be really great if the examples clearly expressed how to ensure that possibly incompatible C files don't mistakenly end up in source distributions.
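One way to check that no generated C files slipped into a source distribution is to scan the tarball before uploading it. The following is only a minimal sketch: the function name `find_generated_c_files` is hypothetical, and it assumes that Cython-generated files start with a `/* Generated by Cython <version> */` comment.

```python
import re
import tarfile

# Assumption: Cython starts each generated file with a comment of this form.
_GENERATED_RE = re.compile(rb"/\* Generated by Cython ([^ ]+) \*/")


def find_generated_c_files(sdist_path):
    """Return {member_name: cython_version} for generated C/C++ files in an sdist."""
    found = {}
    with tarfile.open(sdist_path) as archive:
        for member in archive.getmembers():
            if not member.isfile() or not member.name.endswith((".c", ".cpp")):
                continue
            # The header comment sits at the very start, so a short read suffices.
            head = archive.extractfile(member).read(200)
            match = _GENERATED_RE.match(head)
            if match:
                found[member.name] = match.group(1).decode("ascii")
    return found
```

A release script could call this on each tarball it produces and abort when the result is non-empty. How the files are then excluded depends on the build backend in use.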
—
Investigating further, it appears that I am mistaken.
The issue we ran into in particular was with _PyGen_Send missing in 3.10, which is handled by Cython in: 782a873
If I'm reading this correctly, then the generated code should support CPython <= 3.10 after this fix.
Had we followed the upstream intentions (as I think I understand them now), we would have bumped the minimum required Cython version when adding support for Python 3.10, so that the generated code would work with CPython <= 3.10.
Do I understand correctly that the generated code should still support all versions of CPython up to the newest version that a given version of Cython supports?
—
> If I'm reading this correctly, then the generated code should support CPython <= 3.10 after this fix.
Yes (within limits... Cython supports back to CPython 2.7 and ~3.4... I'm not sure of the exact numbers off the top of my head, but it isn't completely unlimited)
W.r.t. the advice in the documentation: there are pros and cons. The main benefit of shipping the .c files is that you don't force your users to depend on Cython (however, I think it's now possible to use Cython only as an install-time dependency, so that's less onerous than it was).
The main problem with shipping the .c files is that they aren't guaranteed to be forward compatible, and CPython is modifying the C-API fairly frequently these days. It may still be a good idea to ship them for projects that are actively maintained and have someone who can regenerate the .c files when a new Python release requires it.
Basically, it's less clear what the best advice is anymore, so an update to the documentation is welcome.
—
> The main problem with shipping the .c files is that they aren't guaranteed to be forward compatible, and CPython is modifying the C-API fairly frequently these days.
I presume this is a problem because Cython-generated code accesses internal APIs, and upstream CPython can freely change these internals, even within a stable release cycle (e.g. from 3.10.x to 3.10.y), because there is no promise of stability for these APIs.
Makes sense...
—
Yes, although most of the breakages tend to happen at major releases (e.g. 3.10 to 3.11); it's fairly rare to need to update Cython for a change within a stable release.
—
> W.r.t. the advice in the documentation: there are pros and cons. The main benefit of shipping the .c files is that you don't force your users to depend on Cython (however, I think it's now possible to use Cython only as an install-time dependency, so that's less onerous than it was).
Just to update, I've encoded Cython>=0.29.25 as a build-system dependency in pyproject.toml and have observed that:
- pip install . will automatically install these dependencies into a /tmp directory
- pip install <source tarball> will also do this
- pip install --no-index <source tarball> will bail out if the dependency on Cython is not met on the host
The only way I can envision the distributed generated files being used is if Cython is not available on the host and setup.py is invoked directly.
—
At this point I think it makes more sense to clearly enumerate the pros and cons than make a strong recommendation.
—
This has become an issue in many conda-forge feedstocks; see this open thread among conda-forge core devs (who are the most knowledgeable people in the packaging community) 🙂
https://gitter.im/conda-forge/conda-forge.github.io?at=636ae677ff5546644b371ae8
I would vote to not recommend distributing any artifacts. Those days have passed.
—
@isuruf - can you comment?
—
> I'm filing this because we discovered that C files generated with older versions of Cython can access internal CPython APIs which were changed or removed in Python 3.10
ditto, but for Python 3.11, see e.g. 3-manifolds/SnapPy#88
IMHO it's also not impossible to imagine a case where Cython-generated C files don't build on another type of hardware, e.g. increasingly popular arm64.
—
> clearly enumerate the pros and cons
Can you name one, just one argument in favour of distributing generated C files?
Compiling an extension from pre-generated C files takes much longer than running Cython to generate them, and installing Cython is very quick (and only needs to be done infrequently).
—
> main benefit of shipping the .c files is that you don't force your users to depend on Cython (however, I think it's now possible to use Cython only as an install-time dependency, so that's less onerous than it was).
Indeed, and that's precisely the point - Cython is a small (compared to the C toolchain) build-time dependency.
@mkoeppe
—
Python packages these days are built in isolated build environments that do not have access to the system packages. Depending on how Cython is integrated into the Python package build system, different constraints can apply.
There are two ways in which Cython can be integrated into the build system: the "standard" way via the extensions to setuptools, which IIRC import Cython as a Python library, or via other build systems like meson-python (used for example by scipy) that invoke the Cython compiler via the cython executable.
For setuptools integration, Cython needs to be available as an importable module in the build. Therefore it needs to be listed in the Python package build requirements. This means installing Cython into the throwaway build environment for each package build. For Cython compilation via the cython executable, the executable can come from the system (like the C compiler) or from a package installed in the throwaway build environment.
Anyhow, I don't think that distributing pre-cythonized source files makes much sense and I don't do it for any of my projects. And, AFAIK, scipy and other large projects don't do it either.
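For the setuptools route described above, the build script could look roughly like the sketch below. It assumes that pyproject.toml lists Cython under the build-system requirements (for instance `Cython>=0.29.25`, as mentioned earlier in the thread). The `src` layout and the helper names `find_pyx_sources` and `cythonized_extensions` are hypothetical.

```python
from pathlib import Path


def find_pyx_sources(root="src"):
    """Collect all .pyx files below root (the "src" layout is an assumption)."""
    return sorted(path.as_posix() for path in Path(root).rglob("*.pyx"))


def cythonized_extensions(root="src"):
    """Translate every .pyx file at build time; raises ImportError without Cython."""
    # Imported lazily and without a try/except fallback, so a missing Cython
    # fails loudly instead of silently compiling stale, shipped .c files.
    from Cython.Build import cythonize

    return cythonize(find_pyx_sources(root))
```

A setup.py would then pass `ext_modules=cythonized_extensions()` to `setuptools.setup()`. Because the isolated build environment installs Cython from the build requirements, the import only fails when the script is invoked directly on a host without Cython.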
—
> IMHO it's also not impossible to imagine a case where Cython-generated C files don't build on another type of hardware, e.g. increasingly popular arm64.
Building on different hardware was tested until fairly recently without much problem, so I wouldn't expect this to be a significant issue. (I think it turned out to be too slow on GitHub Actions so we turned off the tests, but there's no reason why it shouldn't work.)
—
> it makes more sense to clearly enumerate the pros and cons than make a strong recommendation
@robertwb After the above discussions, what else do you need to help you change your mind? 🙂 Note that this issue is not even asking for any change in Cython API or behavior, just the documentation for best practice (which has clearly changed over the past years). I think we should not block it.
—
No matter who is doing the building, unless you end up in a circular build dependencies loop -- highly unlikely, cython's dependencies won't be building themselves with cython, that's exclusively setuptools anyways and even then optional -- there is no reason those users cannot just build cython first. I see little compelling need to offer workarounds so that people don't need to install cython.
It really is an incredibly small advantage, with big consequences unless cython can somehow commit to not requiring any functionality which is exclusive to python's internal API.
—
> unless cython can somehow commit to not requiring any functionality which is exclusive to python's internal API
Based on the discussion in this issue (#4382) it seems unlikely 🙂
Moreover, let me quote @scoder's reply there as further evidence (from a core Cython developer!) in support of this documentation change (highlight is mine):
> Now, previously, it used to be the case that CPython's C-API was more stable than Cython. This seems to have shifted. Over the years, I was advocating for generating the C code from Cython modules locally and shipping it as part of the source distribution. As long as the C-API (or the exposed internals) were stable enough, that was a good idea. If they no longer are, then it's less of a good idea. The C code can get outdated and it's then really hard for users to regenerate it when all they want to do is pip-install a package into a new CPython version for which there's no wheel (and possibly never will be, for an unmaintained package). This was a problem with the line_profiler package, for example, which, for a very long while, didn't receive an update with freshly generated C sources to support CPython 3.7, and the C sources failed to compile on that newly released CPython version at the time.
—
@eli-schwartz I'm thinking of working on a PR to update the Cython documentation. It would be nice to have meson / meson-python examples alongside the setuptools examples in the Cython documentation (these are the only two build backends that support Cython as a first-class language AFAIK). What's the status of the Cython depends file support for meson?
—
I think we should change the recommendation to "use Cython at build time". Arguments:
Another topic then is: pinning a specific Cython version or a .x release series in the dependencies. But that's the usual "safe and known vs. unsafe and more future-proof" debate that applies to all dependencies. I don't think there is a general path to recommend here, apart from stating a few pros and cons.
PR(s) welcome.
—
Given that we're now in a place where tons of Python package releases were published with C sources generated by an old or buggy Cython, causing forward-compatibility issues... Is there an easy and generic way to force the C sources to be regenerated? In one package I noticed I could run python setup.py clean, much like make clean, but I was told this was a "custom" target.
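One generic workaround, sketched here under assumptions rather than offered as an official mechanism, is to delete the shipped generated files before building so that the build has to run Cython again. The helper name `remove_generated_sources` is hypothetical. It assumes that the generated files sit next to their .pyx sources and start with a `/* Generated by Cython` comment, and that the package's build script actually invokes Cython when the .c file is missing.

```python
from pathlib import Path

# Assumption: Cython-generated files begin with this comment prefix.
GENERATED_MARKER = b"/* Generated by Cython"


def remove_generated_sources(root):
    """Delete generated .c/.cpp files next to a .pyx file; return removed paths."""
    removed = []
    # Materialise the listing first, since files are deleted while looping.
    for pyx in sorted(Path(root).rglob("*.pyx")):
        for suffix in (".c", ".cpp"):
            candidate = pyx.with_suffix(suffix)
            if not candidate.is_file():
                continue
            with candidate.open("rb") as handle:
                head = handle.read(len(GENERATED_MARKER))
            # Hand-written C files lack the marker and are left untouched.
            if head == GENERATED_MARKER:
                candidate.unlink()
                removed.append(candidate)
    return sorted(removed)
```

For maintainers who control the build script, `cythonize()` also accepts `force=True`, which regenerates the C files regardless of timestamps.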
—
Closed #5089 as completed.
—
Documentation was updated in #6201