Idea: splitting out compiled code from the astropy core package

Thomas Robitaille

Jun 26, 2025, 8:40:12 AM
to astropy-dev mailing list
Hi everyone,

Following a discussion yesterday at the coordination meeting about compiled extensions (C, Cython, potentially Rust in the future) in the core astropy package, Clément and I independently had the same idea, which I thought would be good to discuss here, and which could lead to an APE if there is consensus.

In short: what if we moved all compiled extensions out of the core astropy package into a separate package or separate packages?

Let's not worry for now about whether this would be one package or several, or what it would be called. Let's also assume for now that this package would be explicitly advertised as not having any API that users should use directly.

Why would we want to do this? There are several reasons:

Faster builds/testing

Every single time someone or CI builds astropy, we compile again and again and again the same extensions which don't change much. This is a huge waste of computational resources, as well as a waste of human time waiting for astropy to build.

If astropy had no compiled extensions, then installing a developer version of astropy would be extremely fast, and it would be easier to quickly run tests directly with e.g.

    pytest astropy/wcs/wcsapi

during development without having to make sure the extensions are built.

In addition, running tox would also be faster since we wouldn't need to worry about the build time.

Faster release process

If we did this, astropy would become a pure Python package. This means only having to build a single wheel and source distribution.

Easier for contributors

The average contributor, who does not need to touch any of the compiled extensions, would not even need a compiler installed to contribute to astropy. Running tests and building the documentation would be faster as it would not require building all the extensions first.

Possibility of adopting Rust for some extensions

There has been a desire by some to include Rust extensions in astropy. However, it would be very difficult to do this with the core package as it is at the moment because we would then require any astronomer who wants to contribute to astropy to install Rust. I think we can agree the bar to contributing is already high enough that we don't need this.

With compiled extensions in a separate package or packages, we could more easily adopt Rust. Far fewer developers would be dealing with the compiled extensions, and it would be easier to require just them to install Rust.

Prior experience

We've already done this partially before: pyerfa was split out and most people don't need to worry about compiling erfa again and again. I think we can now say in hindsight the transition to a separate pyerfa went very smoothly, and most users haven't noticed. This proposal would just be a generalization of what we did with pyerfa.

Anyway I'm curious to hear what people think! If there are no strong objections, then I think the next step would be an APE so we can actually start discussing the details.

Cheers,
Tom





Aldcroft, Tom

Jun 26, 2025, 2:11:36 PM
to astro...@googlegroups.com
I agree this sounds like a good idea.

- Tom


--
You received this message because you are subscribed to the Google Groups "astropy-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to astropy-dev...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/astropy-dev/CAGMHX_3%2BBeCQofcn-aMkhF%2Bp9%2B%3DGp5v_8OfQBnwaWKkB3XVuNA%40mail.gmail.com.

hamogu

Jun 27, 2025, 8:03:39 AM
to astropy-dev
I like this idea. For me (an "average contributor" in the sense that I've never touched the C/Cython extensions) the time it needs to build/test astropy locally is a major pain point and I support making contributions easier by reducing build-time dependencies (compilers).

The only downside I see is added complexity for the people doing the releases, since now there are more packages that need to be released (even if not frequently) and need to be kept in sync. I'm not doing the releases, so I trust Tom that he's considered that and it's not a major issue.

Moritz

Thomas Robitaille

Jun 27, 2025, 9:46:52 AM
to astro...@googlegroups.com
On Fri, 27 Jun 2025 at 13:03, 'hamogu' via astropy-dev <astro...@googlegroups.com> wrote:
> I like this idea. For me (an "average contributor" in the sense that I've never touched the C/Cython extensions) the time it needs to build/test astropy locally is a major pain point and I support making contributions easier by reducing build-time dependencies (compilers).
>
> The only downside I see is added complexity for the people doing the releases, since now there are more packages that need to be released (even if not frequently) and need to be kept in sync. I'm not doing the releases, so I trust Tom that he's considered that and it's not a major issue.

I think we should carefully assess this impact as part of an APE, but in short most of the hard work with releasing the core package is related to e.g. the what's new, the list of contributors, and the time it takes to run the CI and tests. This separate package would not need much manual work for a release and could in principle be much more automated similar to packages like reproject where one can just make a new release via the GitHub interface.

In addition, as you alluded to, the releases could be very infrequent - for fun I checked the astropy repo, and there have been almost 40,000 commits, but only 782 of them have changed any of the .c, .h, or .pyx files.
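For anyone who wants to reproduce a count like this, one possible approach (not necessarily how the numbers above were obtained) is `git rev-list --count` with pathspecs. The following is a minimal Python sketch; the helper names are hypothetical, 40,000 is used as a round stand-in for "almost 40,000", and the exact totals will depend on the branch and on how git simplifies history for the given paths.

```python
import subprocess


def rev_list_count_command(pathspecs=()):
    """Build a `git rev-list --count HEAD` command, optionally limited to pathspecs."""
    cmd = ["git", "rev-list", "--count", "HEAD"]
    if pathspecs:
        # Everything after "--" is interpreted by git as a pathspec, not a revision.
        cmd += ["--", *pathspecs]
    return cmd


def count_commits(repo_dir, pathspecs=()):
    """Run the command inside repo_dir and return the commit count as an int."""
    result = subprocess.run(
        rev_list_count_command(pathspecs),
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout.strip())


def percent_touching(touching, total):
    """Share of commits (in percent) that touched the selected files."""
    return 100.0 * touching / total


# With the figures quoted above (40,000 is an approximation):
print(f"{percent_touching(782, 40_000):.1f}%")  # 2.0%
```

In a local clone, `count_commits(".", ["*.c", "*.h", "*.pyx"])` and `count_commits(".")` would give the two numbers to compare.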

In any case, I think a lot of details would need to be fleshed out in an APE, so I will wait a little more to see if anyone has strong objections to this concept. To be clear, we might find while writing the APE that it is not a feasible idea, but at least for now it seems worthwhile exploring.

Cheers,
Tom


Pey Lian Lim

Jun 30, 2025, 9:27:34 AM
to astropy-dev
FWIW

* I think it only makes sense if these split-out packages have calver like https://github.com/astropy/astropy-iers-data/

* But unlike astropy-iers-data, these are not just data, they are API fundamentals. So if they change, how do we make sure the astropy core lib releases on time and has a sane version bump? For example, what if this C API package has a breaking change or an urgent security patch? Not only do we have to know how to version both this package and the core lib with a properly updated pin, the release manager now also has to coordinate multiple releases at the same time.

* Are we putting the cart before the horse? Is this only worth considering if someone actually bothers to reimplement the C API into something else first?

Aldcroft, Tom

Jun 30, 2025, 12:47:18 PM
to astro...@googlegroups.com
On Mon, Jun 30, 2025 at 9:27 AM Pey Lian Lim <p3y...@gmail.com> wrote:
> * Are we putting cart in front of horse? Is this only worth considering if someone actually bothers to reimplement the C API into something else first?

My understanding is that this just moves all the existing C interfaces into a separate package with minimal changes. This is independent of any efforts to update/unify the C API. These efforts are not really a high priority and may not happen any time soon.

- Tom
 

Pey Lian Lim

Jul 2, 2025, 8:52:52 AM
to astropy-dev
What if the effort to refactor the C API does happen? Is this new model sustainable? How do we test against different versions of the C API as separate packages? Do things like "use system wcslib" or "use system cfitsio" still apply then?

Aldcroft, Tom

Jul 2, 2025, 9:06:02 AM
to astro...@googlegroups.com
On Wed, Jul 2, 2025 at 8:52 AM Pey Lian Lim <p3y...@gmail.com> wrote:
> What if the effort to refactor C API does happen? Is this new model sustainable? How do we test against different versions of C API as separate packages? Do things like "use system wcslib" or "use system cfitsio" still apply then?

Maybe we are having a language problem, and I may be out of my depth when it comes to the details. But I believe the intent of the C-refactor would not be changing the external Python API to use the C code, but only the C code implementation and build details. For instance, changing a Cython implementation to pure C with the CPython / numpy C API.

About using system packages, that's an interesting question.

- Tom
 


Clément Robert

Jul 7, 2025, 9:54:31 AM
to astropy-dev
> * I think it only make sense if these split out packages have calver

I actually have reasons to think semver is preferable: it would let astropy pin all these new dependencies in a way that easily allows for pre-release testing (maximal stability for end users, and maximal flexibility for devs).
For instance:

 "astropy-core.table>=1.0.0,<1.1.0",

trivially allows for 1.0.1, as well as 1.1.0rc1, which users will never see, but reduces internal testing friction. It's not obvious to me how to achieve this with calver.
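As a rough illustration of what such a version window means for final releases, here is a minimal Python sketch that checks whether a version falls inside `>=1.0.0,<1.1.0`. The helper names are hypothetical, and the sketch deliberately handles only plain X.Y.Z versions: how pre-releases such as 1.1.0rc1 are treated is decided by the installer's version-specifier rules and is not modelled here.

```python
def parse_release(version: str) -> tuple:
    """Turn a plain 'X.Y.Z' string into a tuple of ints (final releases only)."""
    parts = version.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected a plain X.Y.Z version, got {version!r}")
    return tuple(int(p) for p in parts)


def in_window(version: str, lower: str, upper: str) -> bool:
    """Return True if lower <= version < upper, comparing components numerically."""
    return parse_release(lower) <= parse_release(version) < parse_release(upper)


# The window from the pin above: patch releases are in, the next minor is out.
print(in_window("1.0.1", "1.0.0", "1.1.0"))  # True
print(in_window("1.1.0", "1.0.0", "1.1.0"))  # False
```

Comparing tuples of integers rather than strings matters here: as strings, "1.0.10" would sort before "1.0.9".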

>  For instance, changing a Cython implementation to pure C with the CPython / numpy C API.

Tangent, but as I emphasized in my talk about free-threading Python: writing native extensions directly in C is a lot harder to get right than using binding generators (like Cython), and there are other reasons to consider going the opposite direction (C -> Cython, C -> Rust, or even C -> Python):
- Building stable-ABI compliant binaries is now trivial with most binding generators (at least Cython and PyO3 for Rust), while it is much more work for C sources (see PRs linked to https://github.com/astropy/astropy/issues/18163)
- Supporting free-threading Python requires that we can write (at least partially) thread-safe extensions, which again should be much easier if we leave the heavy lifting to generators.
- CPython now has an (experimental, and opt-in) JIT. While it does not yet reliably accelerate execution, future progress might completely remove the *need* for native extensions in some instances.
Not to mention the ongoing experiments with compiling well-typed pure Python modules with mypyc...

Anyway, these are all points we'll consider in our APE. And since I'm not seeing any strong objection yet, I'll start discussing details with Tom R. In any case, thanks everyone!

Jim Bosch

Jul 7, 2025, 10:17:30 AM
to astro...@googlegroups.com
I think there is a lot to be gained from splitting out the compiled
code, but the effort to do it should not be taken lightly, and I think
it's best done by defining clear interfaces and tests (and, yes,
versions) for the compiled packages.

The scenario you really want to avoid is one in which the compiled
code lives in a package upstream of the Python code, but its test
coverage effectively lives downstream of the Python code, so you can't
safely make a change to the compiled-code package without building the
Python-code package and running its tests. That's going to push the
limits of what the Python package/build tooling is good at in CI, and
I think you'll find yourself writing a lot of custom tooling to
compensate.

At least from the outside, Pydantic (with compiled code in
pydantic_core) *looks* like a case of doing this in what I'd consider
the right way, but I'm personally speaking much more from the negative
experience of separating high-level from low-level code without making
the latter's interfaces independently tested enough (and regretting
it).

Jim

Thomas Robitaille

Jul 8, 2025, 6:18:38 AM
to astro...@googlegroups.com
Hi Jim,

I agree with you that if we did want to do this, the 'compiled' package would need to have its own tests with good coverage. I think it would also need to have a well-defined, clean API that is documented and follows semantic versioning, as Clément mentioned.
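To make the idea of an independently tested interface a bit more concrete, here is a minimal Python sketch. `wrap_angle_deg` is a hypothetical stand-in (written in pure Python so the sketch runs anywhere) for a function the compiled package might expose; nothing like it has been proposed in this thread. The point is only that the test pins down the documented behaviour inside the low-level package itself, without importing astropy.

```python
def wrap_angle_deg(angle: float) -> float:
    """Hypothetical stand-in for a compiled kernel: wrap an angle into [0, 360)."""
    return angle % 360.0


def test_wrap_angle_contract() -> None:
    """Contract test that would live in the compiled package's own test suite."""
    cases = [(0.0, 0.0), (360.0, 0.0), (-90.0, 270.0), (725.0, 5.0)]
    for value, expected in cases:
        result = wrap_angle_deg(value)
        # Documented values for known inputs.
        assert result == expected
        # Documented range of the output.
        assert 0.0 <= result < 360.0


test_wrap_angle_contract()
```

If the implementation were later swapped (say, from C to Cython or Rust), a test like this would catch a behaviour change before any downstream package is built.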

Cheers,
Tom

