[Proposal] allow standard packages to be pip packages, reduce source tarball size


Dima Pasechnik

Feb 11, 2024, 2:23:42 PM
to sage-...@googlegroups.com
Currently, standard packages cannot be pip packages, i.e. we must, in effect, vendor them. This entails extra effort that is often not needed, particularly as we patch only very few Python packages.
Pip packages are on the other hand installed straight from PyPI.

Good examples of standard packages which can become pip ones are tox, pytest (not yet standard).


The other difference is that by default these packages are not included in the Sage releases source tarball.

Rather than adding them there I propose to split the upstream/* part of the tarball into something optional - which is represented by a list of files to download, and which is just not needed if you build while connected to the internet.

This is a huge saving on the tarball size: with upstream/* included, the Sage 10.2 tarball is 1.3 GB, and without it, it is smaller than 0.25 GB.

Note that as William writes, the desire to have Sage buildable without an internet connection was a requirement by a past Sage funder, gone about 10 years ago. Thus there's no longer an obligation to have this option.
I am not aware of a project similar to Sage that provides tarballs allowing for an offline build.

Thus, I would like to call a vote on these two topics:

1) allow standard packages to be pip packages

2) drop the contents of upstream/ from the Sage source tarballs.


---
Dima

Matthias Koeppe

Feb 11, 2024, 2:50:17 PM
to sage-devel
I think it's a bit too quick to already call a vote. I would suggest that you take the time to collect and link previous discussions on this topic, so that participants can review the known arguments, viewpoints, and requirements.

It may also be relevant to consider whether the "Source code (tar.gz)" tarballs that are automatically provided by GitHub on releases (and tags) would be sufficient. (They do not contain upstream; but they also do not contain the helpful .git directory that our tarball release script painstakingly adds.)

Dima Pasechnik

Feb 11, 2024, 3:26:41 PM
to sage-...@googlegroups.com


On 11 February 2024 19:50:17 GMT, Matthias Koeppe <matthia...@gmail.com> wrote:
>I think it's a bit too quick to already call a vote. I would suggest that
>you take the time to collect and link previous discussions on this topic,
>so that participants can review the known arguments, viewpoints, and
>requirements.
>
>Example (from my previous
>post): https://groups.google.com/g/sage-devel/c/C7-ho1zvEYU/m/S2n8d5rOAgAJ
>(2016)


I don't think arguments from 2016 are very relevant today, given how much Python packaging has evolved since then.

I don't think there is a good reason to delay this vote, especially given that there is a pending vote on more
pip packages to be made standard, potentially leading to totally unneeded effort to vendor them.

Matthias Koeppe

Feb 11, 2024, 3:34:51 PM
to sage-devel
On Sunday, February 11, 2024 at 12:26:41 PM UTC-8 Dima Pasechnik wrote:

> On 11 February 2024 19:50:17 GMT, Matthias Koeppe <matthia...@gmail.com> wrote:
> >I think it's a bit too quick to already call a vote. I would suggest that
> >you take the time to collect and link previous discussions on this topic,
> >so that participants can review the known arguments, viewpoints, and
> >requirements.
> >
> >Example (from my previous
> >post): https://groups.google.com/g/sage-devel/c/C7-ho1zvEYU/m/S2n8d5rOAgAJ
> >(2016)
>
> I don't think arguments from 2016 are very relevant today, given how much Python packaging has evolved since then.

In case it was not clear, I did not suggest looking only for discussions from 2016 or earlier.

And the state of Python packaging is only one aspect that is relevant.

mmarco

Feb 11, 2024, 4:19:15 PM
to sage-devel
As I mentioned in the thread that motivated this one, it would be relevant to establish whether it is possible to move those packages from standard to pip while still having a way to install Sage without an internet connection.

If the effort is not too much, I think it would make sense to provide that alternative.

Matthias Koeppe

Feb 11, 2024, 4:46:40 PM
to sage-devel
I'll provide some context and pointers that readers may find helpful to participate in the discussion and vote.
- tooling: https://deploy-livedoc--sagemath.netlify.app/html/en/developer/packaging#utility-script-to-create-and-maintain-packages

A "normal" or "wheel" package is always pinned to a specific version in the Sage distribution (package-version.txt, checksums.ini), and the Sage distribution needs to have a package for each of its dependencies.

"Pip" packages can either be pinned to a specific version, or set acceptable version ranges, or be entirely unconstrained. This is set in the file requirements.txt in the package directory. 

Pinning a version has the potential benefit of stability (avoiding retroactive breakage by new, incompatible versions). The cost is that updating the version requires work by two Sage developers: one who prepares a PR and one who reviews it. (I'll make an attempt to quantify this cost in a separate post.) And when the package does not get the attention of developers who upgrade it, there's the potential risk of missing out on bugfixes made in newer versions, or missing out on features in major new versions.
Not pinning the version has the obvious potential benefit of always being up to date. But there is a risk of instability, either by the package itself being affected by bugs in a new version, or by breaking compatibility with Sage.
What policy is best for a package obviously depends on lots of factors, including the development velocity and quality control of the upstream project, interest by Sage developers in the package, the depth of integration in Sage, etc. I suggest subjecting "one-size-fits-all" approaches to a healthy dose of critical thinking.
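
To make the three cases for "pip" packages concrete (pinned, version range, unconstrained), here is a minimal Python sketch that classifies a single requirements.txt line. The helper `classify_requirement` is hypothetical and not part of Sage's tooling, and it only handles simple `name<specifiers>` lines (no extras, environment markers, or URLs). The example lines are illustrative, not quoted from Sage's package directories.

```python
import re

# Hypothetical helper for illustration only; not part of Sage's tooling.
# Matches a plain project name followed by an optional version specifier.
_SPEC = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(.*)$")


def classify_requirement(line):
    """Return (name, kind) where kind is 'pinned', 'range' or 'unconstrained'."""
    line = line.split("#", 1)[0].strip()  # drop trailing comments
    match = _SPEC.match(line)
    if not match:
        raise ValueError(f"not a simple requirement line: {line!r}")
    name, spec = match.group(1), match.group(2).strip()
    if not spec:
        return name, "unconstrained"
    clauses = [clause.strip() for clause in spec.split(",")]
    # A single '==' clause without a wildcard fixes exactly one version.
    if len(clauses) == 1 and clauses[0].startswith("==") and "*" not in clauses[0]:
        return name, "pinned"
    return name, "range"


for example in ("build==1.0.3", "phitigra>=0.2.6", "sphinx >=7,<8", "pytest"):
    print(example, "->", classify_requirement(example))
```

A real tool would more likely rely on the `packaging` library's requirement parser than on a regular expression; the regex here only keeps the sketch free of third-party dependencies.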

Dependencies of a "pip" package do not need to be available as packages in the Sage distribution. However, if a dependency is also a package of the Sage distribution, then we must declare this dependency. If we don't, surprising things can happen when building or upgrading. When new versions of "pip" packages add dependencies that happen to be Sage packages, there is a separate source of instability.


Matthias Koeppe

Feb 11, 2024, 5:47:24 PM
to sage-devel
On Sunday, February 11, 2024 at 1:46:40 PM UTC-8 Matthias Koeppe wrote:
I'll make an attempt to quantify this cost

Here's an illustration of the workflow for making python_build a standard "wheel" package, as proposed in https://groups.google.com/g/sage-devel/c/MIU-xo9b7pc:

$ git checkout -b python_build_standard upstream/develop
branch 'python_build_standard' set up to track 'upstream/develop'.
Switched to a new branch 'python_build_standard'
$ ls build/pkgs/python_build
SPKG.rst         dependencies     distros          requirements.txt type

The package already exists as a "pip" package (requirements.txt). Let's re-create it as a standard "wheel" package.

$ mv build/pkgs/python_build build/pkgs/build
$ ./sage -package create build --pypi --type standard
Downloading tarball from https://pypi.io/packages/py3/b/build/build-1.0.3-py3-none-any.whl to .../upstream/build-1.0.3-py3-none-any.whl
[......................................................................]
$ mv build/pkgs/build build/pkgs/python_build
$ ls build/pkgs/python_build
SPKG.rst             checksums.ini        dependencies         distros              install-requires.txt package-version.txt  requirements.txt     type
$ git rm -f build/pkgs/python_build/requirements.txt
rm 'build/pkgs/python_build/requirements.txt'

Now, after removing requirements.txt, it's a wheel package. Let's review the changes that "sage -package create" made.

$ git --no-pager diff build/pkgs/python_build/dependencies
diff --git a/build/pkgs/python_build/dependencies b/build/pkgs/python_build/dependencies
index b72a6d1c776..47296a7bace 100644
--- a/build/pkgs/python_build/dependencies
+++ b/build/pkgs/python_build/dependencies
@@ -1,4 +1,4 @@
- pyparsing tomli packaging | $(PYTHON_TOOLCHAIN) $(PYTHON)
+ | $(PYTHON_TOOLCHAIN) $(PYTHON)
 
 ----------
 All lines of this file are ignored except the first.

Our old version was better, go back to it. (The script "sage -package create" does not know how to find the dependencies; https://github.com/sagemath/sage/pull/36740 prepares an improvement, needs review.)

$ git checkout -- build/pkgs/python_build/dependencies
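
As an aside on what the diff above changed: only the first line of a `dependencies` file is used, and (to my understanding of the make-style convention) names before the `|` are ordinary dependencies while names after it are order-only. The following Python sketch, with a hypothetical helper `parse_dependencies` that is not part of Sage, parses the old and regenerated first lines from the diff and lists what the regenerated file dropped.

```python
def parse_dependencies(text):
    """Split the first line of a Sage-style `dependencies` file.

    Returns (normal, order_only): the names before and after the '|'.
    All lines after the first are ignored, as the file's own footer says.
    """
    lines = text.splitlines()
    first = lines[0] if lines else ""
    normal, _, order_only = first.partition("|")
    return normal.split(), order_only.split()


FOOTER = "\n\n----------\nAll lines of this file are ignored except the first.\n"
# First lines taken from the diff shown in the walkthrough.
old_text = " pyparsing tomli packaging | $(PYTHON_TOOLCHAIN) $(PYTHON)" + FOOTER
new_text = " | $(PYTHON_TOOLCHAIN) $(PYTHON)" + FOOTER

old_normal, old_order_only = parse_dependencies(old_text)
new_normal, new_order_only = parse_dependencies(new_text)
dropped = [name for name in old_normal if name not in new_normal]
print("dropped by the regenerated file:", dropped)
```

This is the information lost when "sage -package create" overwrote the file, which is why the walkthrough restores the old version.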

Commit the changes:

$ git add build/pkgs/python_build
$ git commit -m "build/pkgs/python_build: Change to a normal standard package"
[python_build_standard 43f6b2b8ef9] build/pkgs/python_build: Change to a normal standard package
 4 files changed, 7 insertions(+), 1 deletion(-)
 create mode 100644 build/pkgs/python_build/checksums.ini
 rename build/pkgs/python_build/{requirements.txt => install-requires.txt} (100%)
 create mode 100644 build/pkgs/python_build/package-version.txt

Test it:

$ make python_build
make -j16 build/make/Makefile --stop
./bootstrap -d
[...]
rm -rf config/install-sh config/compile config/config.guess config/config.sub config/missing configure build/make/Makefile-auto.in
make --no-print-directory python_build-SAGE_VENV-no-deps
[python_build-1.0.3] Using cached file .../upstream/build-1.0.3-py3-none-any.whl
[python_build-1.0.3] python_build-1.0.3
[python_build-1.0.3] ====================================================
[python_build-1.0.3] Setting up build directory for python_build-1.0.3
[...]
[python_build-1.0.3] Using pip 23.3.1 from .../local/var/lib/sage/venv-python3.11/lib/python3.11/site-packages/pip (python 3.11)
[python_build-1.0.3] Looking in links: .../local/var/lib/sage/venv-python3.11/var/lib/sage/wheels
[python_build-1.0.3] Processing .../local/var/lib/sage/venv-python3.11/var/lib/sage/wheels/build-1.0.3-py3-none-any.whl (from -r .../local/var/lib/sage/venv-python3.11/var/lib/sage/scripts/python_build/spkg-requirements.txt (line 1))
[python_build-1.0.3] Requirement already satisfied: packaging>=19.0 in .../local/var/lib/sage/venv-python3.11/lib/python3.11/site-packages (from build@ file://.../local/var/lib/sage/venv-python3.11/var/lib/sage/wheels/build-1.0.3-py3-none-any.whl->-r .../local/var/lib/sage/venv-python3.11/var/lib/sage/scripts/python_build/spkg-requirements.txt (line 1)) (23.2)
[python_build-1.0.3] Requirement already satisfied: pyproject_hooks in .../local/var/lib/sage/venv-python3.11/lib/python3.11/site-packages (from build@ file://.../local/var/lib/sage/venv-python3.11/var/lib/sage/wheels/build-1.0.3-py3-none-any.whl->-r .../local/var/lib/sage/venv-python3.11/var/lib/sage/scripts/python_build/spkg-requirements.txt (line 1)) (1.0.0)
[python_build-1.0.3] Installing collected packages: build
[python_build-1.0.3]   changing mode of .../local/var/lib/sage/venv-python3.11/bin/pyproject-build to 755
[python_build-1.0.3] Successfully installed build-1.0.3
[python_build-1.0.3] Successfully installed python_build-1.0.3
[...]
Sage build/upgrade complete!

It did not complain about dependencies, so we seem to be good. But the "pyproject_hooks" that it was happy to find comes from the previous installation; we don't have it as a package. Let's create it as a standard package.

$ ./sage -package create pyproject_hooks --pypi --type standard
Downloading tarball from https://pypi.io/packages/py3/p/pyproject_hooks/pyproject_hooks-1.0.0-py3-none-any.whl to .../upstream/pyproject_hooks-1.0.0-py3-none-any.whl
[......................................................................]
$ make pyproject_hooks
make -j16 build/make/Makefile --stop
./bootstrap -d
[...]
make --no-print-directory pyproject_hooks-SAGE_VENV-no-deps
[pyproject_hooks-1.0.0] Using cached file .../upstream/pyproject_hooks-1.0.0-py3-none-any.whl
[pyproject_hooks-1.0.0] pyproject_hooks-1.0.0
[...]
[pyproject_hooks-1.0.0] Found existing installation: pyproject_hooks 1.0.0
[pyproject_hooks-1.0.0] Uninstalling pyproject_hooks-1.0.0:
[pyproject_hooks-1.0.0]   Successfully uninstalled pyproject_hooks-1.0.0
[pyproject_hooks-1.0.0] Using pip 23.3.1 from .../local/var/lib/sage/venv-python3.11/lib/python3.11/site-packages/pip (python 3.11)
[pyproject_hooks-1.0.0] Looking in links: .../local/var/lib/sage/venv-python3.11/var/lib/sage/wheels
[pyproject_hooks-1.0.0] Processing .../local/var/lib/sage/venv-python3.11/var/lib/sage/wheels/pyproject_hooks-1.0.0-py3-none-any.whl (from -r .../local/var/lib/sage/venv-python3.11/var/lib/sage/scripts/pyproject_hooks/spkg-requirements.txt (line 1))
[pyproject_hooks-1.0.0] Installing collected packages: pyproject_hooks
[pyproject_hooks-1.0.0] Successfully installed pyproject_hooks-1.0.0
[...]
Sage build/upgrade complete!

No more dependencies to take care of, we are done.

$ git add build/pkgs/pyproject_hooks
$ git commit -m "build/pkgs/pyproject_hooks: New, python_build dependency"
[python_build_standard 58ab4c838e3] build/pkgs/pyproject_hooks: New, python_build dependency
 6 files changed, 28 insertions(+)
 create mode 100644 build/pkgs/pyproject_hooks/SPKG.rst
 create mode 100644 build/pkgs/pyproject_hooks/checksums.ini
 create mode 100644 build/pkgs/pyproject_hooks/dependencies
 create mode 100644 build/pkgs/pyproject_hooks/install-requires.txt
 create mode 100644 build/pkgs/pyproject_hooks/package-version.txt
 create mode 100644 build/pkgs/pyproject_hooks/type
$ git push -u origin HEAD
Enumerating objects: 22, done.
Counting objects: 100% (22/22), done.
Delta compression using up to 12 threads
Compressing objects: 100% (14/14), done.
Writing objects: 100% (18/18), 1.92 KiB | 163.00 KiB/s, done.
Total 18 (delta 6), reused 7 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (6/6), completed with 3 local objects.
remote:
remote: Create a pull request for 'python_build_standard' on GitHub by visiting:
remote:      https://github.com/mkoeppe/sage/pull/new/python_build_standard
remote:
To https://github.com/mkoeppe/sage.git
 * [new branch]              HEAD -> python_build_standard
branch 'python_build_standard' set up to track 'origin/python_build_standard'.
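
The walkthrough above relies on the package kind being determined by which files are present in build/pkgs/NAME: requirements.txt marks a "pip" package, and once it is removed, checksums.ini plus package-version.txt make it a normal/wheel package. Here is a small Python sketch of that rule, inferred only from the walkthrough; the function `source_kind` and its return labels are hypothetical and do not reproduce Sage's actual build logic.

```python
import tempfile
from pathlib import Path


def source_kind(pkg_dir):
    """Guess the package kind from the files in a package directory.

    Assumption taken from the walkthrough: requirements.txt wins while it
    is present; otherwise checksums.ini and package-version.txt together
    indicate a pinned normal/wheel package.
    """
    names = {path.name for path in Path(pkg_dir).iterdir()}
    if "requirements.txt" in names:
        return "pip"
    if {"checksums.ini", "package-version.txt"} <= names:
        return "normal-or-wheel"
    return "other"


def demo():
    """Replay the file changes from the python_build conversion."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp)
        for name in ("SPKG.rst", "dependencies", "distros", "requirements.txt", "type"):
            (pkg / name).touch()
        before = source_kind(pkg)
        for name in ("checksums.ini", "package-version.txt", "install-requires.txt"):
            (pkg / name).touch()
        (pkg / "requirements.txt").unlink()  # mirrors `git rm -f .../requirements.txt`
        after = source_kind(pkg)
    return before, after


print(demo())
```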




 

Dima Pasechnik

Feb 11, 2024, 6:29:11 PM
to sage-...@googlegroups.com
Sage has shot itself in the foot by adopting an overly rigid approach to Python dependencies which are not tightly integrated into the core of Sage (sagelib): Jupyter, Tox, and Sphinx (and their zillion dependencies).

A way out of it is to declare as many dependencies as possible to be pip packages, and simply remove from our list the many packages that are dependencies of Sphinx and Jupyter only (pip finds and installs them just fine when you install Jupyter and Sphinx; there is no need for Sage to micromanage them).
The potential issues with dependencies of pip packages interfering with Sage packages (you mention these below) are precisely the result of this package micromanagement.



>What policy is best for a package obviously depends on lots of factors,
>including the development velocity and quality control that the upstream
>project, interest by Sage developers in the package, the depth of
>integration in Sage etc. I suggest to subject "one-size-fits-all"
>approaches to a healthy dose of critical thinking.

Yes, indeed, the current "standard packages cannot be pip packages" rule is exactly the "one-size-fits-all" approach you are arguing against, and the issue we would like to resolve here.

>
>Dependencies of a "pip" package do not need to be available as packages in
>the Sage distribution. However, if a dependency is also a package of the
>Sage distribution, then we must declare this dependency. If we don't,
>surprising things can happen when building or upgrading. When new versions
>of "pip" packages add dependencies that happen to be Sage packages, there
>is a separate source of instability.

OTOH a package like pytest or tox is basically an external tool, and using an appropriate version of it is all that's needed.

Dima Pasechnik

Feb 11, 2024, 6:34:46 PM
to sage-...@googlegroups.com


On 11 February 2024 22:47:24 GMT, Matthias Koeppe <matthia...@gmail.com> wrote:
>On Sunday, February 11, 2024 at 1:46:40 PM UTC-8 Matthias Koeppe wrote:
>
>I'll make an attempt to quantify this cost
>
>
>Here's an illustration of the workflow for making python_build a standard
>"wheel" package, as proposed in
>https://groups.google.com/g/sage-devel/c/MIU-xo9b7pc:


What you outlined is the initial one-time cost. There is also a cost of maintenance, which eventually gets bigger than the initial cost: the thing gets outdated, its dependencies get outdated, this all requires updates, tests, conflict resolutions ---something that you get largely for free if you let go of the package dependency micromanagement, relying instead on the Python universe out there to do the job.





Matthias Koeppe

Feb 11, 2024, 7:57:34 PM
to sage-devel
On Sunday, February 11, 2024 at 3:34:46 PM UTC-8 Dima Pasechnik wrote:
> On 11 February 2024 22:47:24 GMT, Matthias Koeppe <matthia...@gmail.com> wrote:
> >On Sunday, February 11, 2024 at 1:46:40 PM UTC-8 Matthias Koeppe wrote:
> >
> >I'll make an attempt to quantify this cost
> >
> >Here's an illustration of the workflow for making python_build a standard
> >"wheel" package, as proposed in
> >https://groups.google.com/g/sage-devel/c/MIU-xo9b7pc:
>
> What you outlined is the initial one-time cost.

That's correct, that's what I did in that post.
 
> There is also a cost of maintenance, which eventually gets bigger than the initial cost: the thing gets outdated, its dependencies get outdated, this all requires updates, tests, conflict resolutions ---something that you get largely for free if you let go of the package dependency micromanagement, relying instead on the Python universe out there to do the job.

That's where a possible sleight of hand happens.
Let's please have this discussion at normal speed, giving the audience a chance to observe the facts and form their opinion.

Pinning packages to a set of tested working versions is a standard practice, and as a matter of fact part of best practices to achieve stability in various deployment situations, reproducibility, etc.

In the Python world, such pinning is done using requirements.txt, Pipfile.lock, and environment.yml files.
In the Sage distribution, we pin using package-version.txt and tiny requirements.txt files.

When updating the pins, testing is always necessary; it does not come for free. Yes, we have our automatic tests, but in two of the examples that you mentioned, Sphinx and Jupyter, some manual inspection is necessary.

A question to ask is what tooling is available to update the version pins, and what the cost of using the tools is. For a typical upgrade, by improving our tooling, we have reduced the work to just typing "./sage -package update-latest sphinx --commit". In the Sphinx upgrade, https://github.com/sagemath/sage/pull/37129/files (needs review), I ended up updating 25 packages, so I had to use a command like this 25 times. It's repetitive, maybe it takes 20 minutes total, but it's not remotely something that I would use the phrase "Sage has shot itself in the foot" for. 

(Our tooling for "pip" packages is actually worse than that: "./sage -package update-latest" does not support them, an easy-to-implement wishlist item. Being able to run "sage -pip install -U sphinx", then test, then update the pinned versions according to "./sage -pip freeze" is another easy-to-implement wishlist item.)
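
The second wishlist item (update the pins from the output of a freeze after testing) could look roughly like the following Python sketch. It is not existing Sage tooling: `parse_freeze` and `changed_pins` are hypothetical names, the name normalization is simplified, and the package names and versions in the example are made up for illustration.

```python
def parse_freeze(text):
    """Parse `pip freeze`-style output into a {name: version} mapping.

    Only plain `name==version` lines are used; blank lines, comments and
    other entry forms (editable or URL installs) are skipped.
    """
    pins = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        # Simplified normalization (lower-case, '_' -> '-'), an assumption.
        pins[name.strip().lower().replace("_", "-")] = version.strip()
    return pins


def changed_pins(current, frozen):
    """Return {name: (old, new)} for pinned packages whose version moved."""
    return {
        name: (current[name], frozen[name])
        for name in sorted(current)
        if name in frozen and frozen[name] != current[name]
    }


# Made-up data: the currently pinned versions and a freeze after upgrading.
current = {"pkg-a": "1.0.0", "pkg-b": "2.3.1", "pkg-c": "0.9"}
frozen = parse_freeze("Pkg_A==1.1.0\npkg-b==2.3.1\n# a comment\nunrelated==5.0\n")
for name, (old, new) in changed_pins(current, frozen).items():
    print(f"{name}: {old} -> {new}")
```

Packages that appear in the freeze but are not among the current pins are deliberately left alone here; deciding whether they should become packages of the distribution is the policy question under discussion.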

Dima Pasechnik

Feb 12, 2024, 6:18:05 AM
to sage-...@googlegroups.com
On Mon, Feb 12, 2024 at 12:57 AM Matthias Koeppe
<matthia...@gmail.com> wrote:
>
> On Sunday, February 11, 2024 at 3:34:46 PM UTC-8 Dima Pasechnik wrote:
>
> On 11 February 2024 22:47:24 GMT, Matthias Koeppe <matthia...@gmail.com> wrote:
> >On Sunday, February 11, 2024 at 1:46:40 PM UTC-8 Matthias Koeppe wrote:
> >
> >I'll make an attempt to quantify this cost
> >
> >Here's an illustration of the workflow for making python_build a standard
> >"wheel" package, as proposed in
> >https://groups.google.com/g/sage-devel/c/MIU-xo9b7pc:
>
> What you outlined is the initial one-time cost.
>
>
> That's correct, that's what I did in that post.
>
>
> There is also a cost of maintenance, which eventually gets bigger than the initial cost: the thing gets outdated, its dependencies get outdated, this all requires updates, tests, conflict resolutions ---something that you get largely for free if you let go of the package dependency micromanagement, relying instead on the Python universe out there to do the job.
>
>
> That's where a possible sleight of hand happens.
> Let's please do this discussion at normal speed, giving the audience a chance to observe the facts and form their opinion.
>
> Pinning packages to a set of tested working versions is a standard practice, and as a matter of fact part of best practices to achieve stability in various deployment situations, reproducibility, etc.
>
> In the Python world, such pinning is done using requirements.txt, Pipfile.lock, and environment.yml files.
> In the Sage distribution, we pin using package-version.txt and tiny requirements.txt files.

as well as install-requires.txt and spkg-configure.m4 - they also in
some cases pin versions, strictly or not.
Now you can lament about the lack of more developers joining the
project... (they come, they see the insanity of controlling versions
in 5 different somewhat incompatible ways, they leave).

>
> When updating the pins, testing is always necessary; it does not come for free. Yes, we have our automatic tests, but in two of the examples that you mentioned, Sphinx and Jupyter, some manual inspection is necessary.

Now, at last, tell us what makes Sage so special that we must vendor
sphinx and jupyter (and pytest (proposed), and tox, and...), unlike,
say, sympy, or scipy?
I imagine they spend developers' time on something more productive
than repeating the work done elsewhere, no?

>
> A question to ask is what tooling is available to update the version pins, and what the cost of using the tools is. For a typical upgrade, by improving our tooling, we have reduced the work to just typing "./sage -package update-latest sphinx --commit". In the Sphinx upgrade, https://github.com/sagemath/sage/pull/37129/files (needs review), I ended up updating 25 packages, so I had to use a command like this 25 times. It's repetitive, maybe it takes 20 minutes total, but it's not remotely something that I would use the phrase "Sage has shot itself in the foot" for.

The whole thing of a zillion vendored packages makes Sage uniquely
hard to package, and use outside of its own venv. These 25 packages
just don't need our version micromanagement, it's already done outside
of the project.
Can we please start to let go of this "vendor everything" mentality? Please?


>
> (Our tooling for "pip" packages is actually worse than that; "./sage -package update-latest" does not support them, an easy to implement wishlist item. Being able to run "sage -pip install -U sphinx", then test, then updating the pinned versions according to "./sage -pip freeze" -- also that's an easy to implement wishlist item.)
>
> --
> You received this message because you are subscribed to the Google Groups "sage-devel" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sage-devel/7f062b45-a5b3-49de-83e1-4f2f47eb96c2n%40googlegroups.com.

kcrisman

Feb 12, 2024, 7:34:11 AM
to sage-devel
As part of this thread, I'd again ask for a discussion of the following situation, which I raised in the other thread. Dima had some interesting points about a less-vendored approach saving disk space etc., but it would be helpful to have input from people who have had to install Sage in these kinds of situations en masse. Separately, I'm also wondering about the Windows situation since much of the world, for better or worse, is not on Linux.

"At least in the not too distant past there have been situations where the non-requirement of internet connectivity alleviated issues of limited internet accessibility in a given locale, limited download speeds, limited grid electricity, etc.   This policy just as much affects those situations, and perhaps some people who have installed Sage in such environments (including Sage Days and other events) might want to weigh in on that, and whether such situations still obtain (as I personally assume they must certainly do).  I figure three-letter agencies have people with the skills to get around not using pip install, but if your downloads are over a mobile network (or, for that matter, Project Kuiper or Starlink or whatever), you might still want to download Sage - especially now that we don't have binary installs "provided"."

Dima Pasechnik

Feb 12, 2024, 7:41:19 AM
to sage-...@googlegroups.com
On Mon, Feb 12, 2024 at 12:34 PM kcrisman <kcri...@gmail.com> wrote:
>
> As part of this thread, I'd again ask for a discussion of the following situation I asked in the other thread. Dima had some interesting points about a less-vendored approach saving disk space etc., but it would be helpful to have input from people who have had to install Sage in these kinds of situations en masse. Separately, I'm also wondering about the Windows situation since much of the world, for better or worse, is not on Linux.

On Windows, once you have WSL 2 up and running in the default way
(which is very common, and how to set it up in detail is beyond the
scope of Sage), you are basically on a recent Ubuntu (accessed via a
weird interface, but OK).


>
> "At least in the not too distant past there have been situations where the non-requirement of internet connectivity alleviated issues of limited internet accessibility in a given locale, limited download speeds, limited grid electricity, etc. This policy just as much affects those situations, and perhaps some people who have installed Sage in such environments (including Sage Days and other events) might want to weigh in on that, and whether such situations still obtain (as I personally assume they must certainly do). I figure three-letter agencies have people with the skills to get around not using pip install, but if your downloads are over a mobile network (or, for that matter, Project Kuiper or Starlink or whatever), you might still want to download Sage - especially now that we don't have binary installs "provided"."

Matthias Koeppe

Feb 12, 2024, 1:02:21 PM
to sage-devel
On Monday, February 12, 2024 at 3:18:05 AM UTC-8 Dima Pasechnik wrote:
> > Pinning packages to a set of tested working versions is a standard practice, and as a matter of fact part of best practices to achieve stability in various deployment situations, reproducibility, etc.
> >
> > In the Python world, such pinning is done using requirements.txt, Pipfile.lock, and environment.yml files.
> > In the Sage distribution, we pin using package-version.txt and tiny requirements.txt files.
>
> as well as install-requires.txt and spkg-configure.m4 - they also in
> some cases pin versions, strictly or not.

These files serve a different purpose. They declare acceptable version ranges.
In pure Python packages, this exists as well, as you know.
It is done in pyproject.toml "dependencies" (previously setup.cfg/py "install_requires").

Talking about these here is a distraction that does not serve the discussion of this topic.

> Now, at last, tell us what makes Sage so special that we must vendor
> sphinx and jupyter [...]

Note that I have not expressed much of an opinion yet on your proposal. 
We'll get there.

But as I have pointed out several times previously, you are using the word "vendoring" in a polemic and idiosyncratic way, which does not serve the discussion. More below.

> > A question to ask is what tooling is available to update the version pins, and what the cost of using the tools is. For a typical upgrade, by improving our tooling, we have reduced the work to just typing "./sage -package update-latest sphinx --commit". In the Sphinx upgrade, https://github.com/sagemath/sage/pull/37129/files (needs review), I ended up updating 25 packages, so I had to use a command like this 25 times. It's repetitive, maybe it takes 20 minutes total, but it's not remotely something that I would use the phrase "Sage has shot itself in the foot" for.
>
> The whole thing of a zillion vendored packages [...]

1. Sage does not "vendor". What is in build/pkgs is _metadata_. It's just text. Sage _pins_ versions of packages, so there is information on the version.

2. Also the large Sage source tarball does not "vendor". It is a shipment of a distribution. Distributions don't "vendor". It's the job of a distribution to ship its components.

Dima Pasechnik

Feb 12, 2024, 1:49:04 PM
to sage-...@googlegroups.com


On Mon, Feb 12, 2024 at 6:02 PM Matthias Koeppe <matthia...@gmail.com> wrote:
>
> On Monday, February 12, 2024 at 3:18:05 AM UTC-8 Dima Pasechnik wrote:
>
> > Pinning packages to a set of tested working versions is a standard practice, and as a matter of fact part of best practices to achieve stability in various deployment situations, reproducibility, etc.
> >
> > In the Python world, such pinning is done using requirements.txt, Pipfile.lock, and environment.yml files.
> > In the Sage distribution, we pin using package-version.txt and tiny requirements.txt files.
>
> as well as install-requires.txt and spkg-configure.m4 - they also in
> some cases pin versions, strictly or not.
>
>
> These files serve a different purpose. They declare acceptable version ranges.

requirements.txt may also specify a range, and this is used too, e.g.

build/pkgs/phitigra/requirements.txt has

phitigra>=0.2.6

So this is all blurred and confusing

> In pure Python packages, this exists as well, as you know.
> It is done in pyproject.toml "dependencies" (previously setup.cfg/py "install-requires").
>
> Talking about these here is a distraction that does not serve the discussion of this topic.
>
> Now, at last, tell us what makes Sage so special that we must vendor
> sphinx and jupyter [...]
>
>
> Note that I have not expressed much of an opinion yet on your proposal.
> We'll get there.
>
> But as I have pointed out several times previously, you are using the word "vendoring" in a polemic and idiosyncratic way, which does not serve the discussion. More below.
>
> > A question to ask is what tooling is available to update the version pins, and what the cost of using the tools is. For a typical upgrade, by improving our tooling, we have reduced the work to just typing "./sage -package update-latest sphinx --commit". In the Sphinx upgrade, https://github.com/sagemath/sage/pull/37129/files (needs review), I ended up updating 25 packages, so I had to use a command like this 25 times. It's repetitive, maybe it takes 20 minutes total, but it's not remotely something that I would use the phrase "Sage has shot itself in the foot" for.
>
> The whole thing of a zillion vendored packages [...]
>
>
> 1. Sage does not "vendor". What is in build/pkgs is _metadata_. It's just text. Sage _pins_ versions of packages, so there is information on the version.

Of course. I never said that metadata is vendoring; it certainly is not, and this is a digression from the topic.

>
> 2. Also the large Sage source tarball does not "vendor". It is a shipment of a distribution. Distributions don't "vendor". It's the job of a distribution to ship its components.
This is not correct. Sage is not a distribution, and I am using the verb as described here: https://en.wiktionary.org/wiki/vendor#Verb

vendor (third-person singular simple present vendors, present participle vendoring, simple past and past participle vendored)

  1. (transitive, software engineering) To bundle third-party dependencies with the source code for one's own program.
       I distributed my application with a vendored copy of Perl so that it wouldn't use the system copies of Perl where it is installed.
  2. (transitive, software engineering) As the software vendor, to bundle one's own, possibly modified version of dependencies with a standard program.
       Strawberry Perl contains vendored copies of some CPAN modules, designed to allow them to run on Windows.

According to this definition, everything in upstream/ is vendored (except our own packages, like configure.)


 
>

Matthias Koeppe

Feb 12, 2024, 5:01:36 PM
to sage-devel
On Monday, February 12, 2024 at 10:49:04 AM UTC-8 Dima Pasechnik wrote:
requirements.txt may also specify a range, and this is used too, e.g.

build/pkgs/phitigra/requirements.txt has
phitigra>=0.2.6

Yes, as I said in https://groups.google.com/g/sage-devel/c/5kmxaw105lg/m/9rF77fvFAAAJ, ""Pip" packages can either be pinned to a specific version, or set acceptable version ranges, or be entirely unconstrained. This is set in the file requirements.txt in the package directory."

So this is all [...] confusing

That's why I'm taking the time to explain it clearly for the benefit of everyone.
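
To make the three cases concrete (pinned, acceptable range, unconstrained), here is a minimal sketch in Python that classifies a single simple requirements.txt line. The helper name `classify_requirement` is hypothetical and not part of Sage or pip; it ignores extras, environment markers, and URLs, and real tooling would more likely use a proper requirement parser.

```python
import re

# Hypothetical helper for illustration only; not part of Sage or pip.
# Matches a project name followed by an optional version specifier.
_REQ = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(.*)$")


def classify_requirement(line: str) -> str:
    """Return 'pinned', 'range', or 'unconstrained' for a simple requirement line."""
    line = line.split("#", 1)[0].strip()  # drop a trailing comment, if any
    match = _REQ.match(line)
    if not match:
        raise ValueError(f"not a simple requirement line: {line!r}")
    spec = match.group(2).strip()
    if not spec:
        return "unconstrained"  # e.g. "phitigra"
    clauses = [c.strip() for c in spec.split(",")]
    # A single '==' clause without a wildcard selects exactly one version.
    if len(clauses) == 1 and clauses[0].startswith("==") and "*" not in clauses[0]:
        return "pinned"  # e.g. "phitigra==0.2.6"
    return "range"  # e.g. "phitigra>=0.2.6"
```

With the line quoted above, `classify_requirement("phitigra>=0.2.6")` falls into the "range" case.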

Matthias Koeppe

Feb 12, 2024, 5:11:16 PM
to sage-devel
On Monday, February 12, 2024 at 10:49:04 AM UTC-8 Dima Pasechnik wrote:
> 2. Also the large Sage source tarball does not "vendor". It is a shipment of a distribution. Distributions don't "vendor". It's the job of a distribution to ship its components.
This is not correct. Sage is not a distribution

Let's not do "Sage-the-distribution is not a distribution" again. https://groups.google.com/g/sage-devel/c/3Zoq0CNE1hE/m/tPgFOpHWBwAJ (2023).

Dima Pasechnik

Feb 12, 2024, 6:00:59 PM
to sage-...@googlegroups.com
I never agreed with William on this one (Sage is too narrow in scope
and too incomplete to be a distribution).
Anaconda calls itself a "distribution", and Sage is quite far from
Anaconda's functionality.

Anyway, William concludes with "I hope soon Sage isn't a distribution,
but right now it still is."
Do you also hope for the latter?

Anyhow, it's just fuzzy terminology, as is what exactly "to
vendor" means.
With the definition of "to vendor" that I provided, you have to agree
that we vendor a lot of things.

Dima Pasechnik

Feb 12, 2024, 6:07:38 PM
to sage-...@googlegroups.com
I am sorry: I claimed that Sage has about 5 different ways to
specify/restrict versions of its packages,
and this makes it hugely confusing.
You disagreed, but now you say that it needs an explanation.

What really needs an explanation is how we ever went this far down the
garden path. :-)

John H Palmieri

Feb 12, 2024, 6:44:00 PM
to sage-devel
What does this (a discussion of how Sage specifies version restrictions) have to do with the proposal? If it's relevant, that was not clear in the original proposal, so please clarify. It sounds like you might be proposing removing version checks on many of the packages Sage uses, or at least that's a conclusion I might draw from your critique of the amount of maintenance for Sage packages. Or maybe you are proposing redesigning the version specification system? In any case, it wasn't stated as part of the original proposal, so I don't know what was intended. If it is not relevant to the proposal, let's drop this part of the discussion.

I would also suggest dropping the question of whether we're "vendoring." The proposal clearly says that we should stop distributing the tarballs in the upstream directory, so whatever we call it, that part is clear.

(Maybe by "vendoring" you meant the combination of including the tarballs and the maintenance on the allowed versions, or maybe just including the tarballs, or maybe something else. The word "vendoring" does not seem to be helpful, so instead spelling out exactly what's meant for Sage could be helpful, at least if you meant more than just removing "upstream".)

Matthias Koeppe

Feb 12, 2024, 6:52:29 PM
to sage-devel
I'll now offer:

Opinion 1. Nobody needs to care in the slightest what the size of that release tarball is. 

In any use case with internet connectivity, people will be better off just cloning the git repo rather than using the release tarball.

If there are relevant use cases without internet connectivity (I have no opinion to offer on this), then the release tarball has exactly the right contents.

Proposed action items: 
A. Change https://github.com/sagemath/sage/blob/develop/README.md so that "git clone" is described as the primary way to obtain the Sage sources. That the big release tarball is available can be a footnote in the Installation Guide (https://deploy-livedoc--sagemath.netlify.app/html/en/installation/source#installation-steps) for the limited no-internet connectivity use case.

B. Likewise, get rid of all of these "Download Sage source code" pages (https://www.sagemath.org/download-source.html, https://www.sagemath.org/download-latest.html), mirror selection, etc. from the Sage website. 

Dima Pasechnik

Feb 12, 2024, 7:05:05 PM
to sage-...@googlegroups.com
On Mon, Feb 12, 2024 at 11:52 PM Matthias Koeppe
<matthia...@gmail.com> wrote:
>
> I'll now offer:
>
> Opinion 1. Nobody needs to care in the slightest what the size of that release tarball is.

Not quite true. The mirrors are not of infinite size; e.g. some
projects on PyPI (symengine is an example, IIRC) get constrained that
way.

>
> In any use case with internet connectivity, people will be better off just cloning the git repo rather than using the release tarball.
>
> If there are relevant use cases without internet connectivity (I have no opinion to offer on this), then the release tarball has exactly the right contents.

This won't be true any more if we allow standard packages to be pip packages.

>
> Proposed action items:
> A. Change https://github.com/sagemath/sage/blob/develop/README.md so that "git clone" is described as the primary way to obtain the Sage sources. That the big release tarball is available can be a footnote in the Installation Guide (https://deploy-livedoc--sagemath.netlify.app/html/en/installation/source#installation-steps) for the limited no-internet connectivity use case.
>
> B. Likewise, get rid of all of these "Download Sage source code" pages (https://www.sagemath.org/download-source.html, https://www.sagemath.org/download-latest.html), mirror selection, etc. from the Sage website.
>
>
> On Sunday, February 11, 2024 at 11:23:42 AM UTC-8 Dima Pasechnik wrote:
>>
>> Currently the standard packages cannot be pip packages, i.e. we must, in effect, vendor them. This entails an extra effort which is often not needed, in particular as we patch only very few Python packages.
>> Pip packages are on the other hand installed straight from PyPI.
>>
>> Good examples of standard packages which can become pip ones are tox, pytest (not yet standard).
>>
>>
>> The other difference is that by default these packages are not included in the Sage releases source tarball.
>>
>> Rather than adding them there I propose to split the upstream/* part of the tarball into something optional - which is represented by a list of files to download, and which is just not needed if you build while connected to the internet.
>>
>> This is a huge saving on the tarball size: with upstream/* in, Sage 10.2 tarball is 1.3Gb, and without it is smaller than 0.25Gb.
>>
>> Note that as William writes, the desire to have Sage buildable without an internet connection was a requirement by a past Sage funder, gone about 10 years ago. Thus there's no longer an obligation to have this option.
>> I am not aware of a similar to Sage which provides tarballs allowing for an offline build.
>>
>> Thus, I would like to call a vote on these two topics:
>>
>> 1) allow standard packages to be pip packages
>>
>> 2) drop the contents of upstream/ from the Sage source tarballs.
>>
>>
>> ---
>> Dima
>

Matthias Koeppe

Feb 12, 2024, 9:42:07 PM
to sage-devel
On Monday, February 12, 2024 at 3:52:29 PM UTC-8 Matthias Koeppe wrote:
In any use case with internet connectivity, people will be better off just cloning the git repo rather than using the release tarball.

If there are relevant use cases without internet connectivity (I have no opinion to offer on this), then the release tarball has exactly the right contents.

Proposed action items: 
A. Change https://github.com/sagemath/sage/blob/develop/README.md so that "git clone" is described as the primary way to obtain the Sage sources. That the big release tarball is available can be a footnote in the Installation Guide (https://deploy-livedoc--sagemath.netlify.app/html/en/installation/source#installation-steps) for the limited no-internet connectivity use case.

 
B. Likewise, get rid of all of these "Download Sage source code" pages (https://www.sagemath.org/download-source.html, https://www.sagemath.org/download-latest.html), mirror selection, etc. from the Sage website. 

Tobia...@gmx.de

Feb 12, 2024, 10:44:19 PM
to sage-devel
+1 for both proposals.

Via "pip download" (https://pip.pypa.io/en/stable/cli/pip_download/) it is easy to resolve and download all pip packages on a system with an internet connection, and then later install them on the target system without the need for internet.
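
As a rough sketch of this two-step workflow, the following Python helpers only build the two pip command lines; they do not run them. The function names, the requirements file name, and the directory name are assumptions for illustration. The pip options used (`download --dest`, `install --no-index --find-links`, `-r`) are documented pip options.

```python
import sys


def pip_download_cmd(requirements: str, dest: str) -> list[str]:
    """Command to run on the machine WITH internet: resolve and fetch into dest."""
    return [sys.executable, "-m", "pip", "download",
            "--dest", dest, "-r", requirements]


def pip_offline_install_cmd(requirements: str, wheel_dir: str) -> list[str]:
    """Command to run on the target machine: install only from wheel_dir."""
    return [sys.executable, "-m", "pip", "install",
            "--no-index", "--find-links", wheel_dir, "-r", requirements]


# Usage sketch (file and directory names are placeholders):
#   import subprocess
#   subprocess.run(pip_download_cmd("requirements.txt", "wheelhouse"), check=True)
#   ... copy "wheelhouse" to the target machine ...
#   subprocess.run(pip_offline_install_cmd("requirements.txt", "wheelhouse"), check=True)
```

One caveat: by default "pip download" selects files for the Python version and platform it runs on, so this works most smoothly when both systems match.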

Matthias Koeppe

Feb 16, 2024, 6:28:47 PM
to sage-devel
On Monday, February 12, 2024 at 4:05:05 PM UTC-8 Dima Pasechnik wrote:
On Mon, Feb 12, 2024 at 11:52 PM Matthias Koeppe
<matthia...@gmail.com> wrote:
> If there are relevant use cases without internet connectivity (I have no opinion to offer on this), then the release tarball has exactly the right contents.

This won't be true any more if we allow standard packages to be pip packages.

That's correct.

Nils Bruin

Feb 16, 2024, 6:57:06 PM
to sage-devel
As far as I understand, the proposal is to allow sage "packages" to be closer to more standard python prerequisites by letting them be resolved by pip packages. By default the package content would be fetched, as pip does, and that would mean the default configuration for sage would require internet at install time.

I also understand that there are people who are concerned that this may not reflect all scenarios where people want to install sagemath and they would prefer if there is a clear method to install sagemath from a well-defined set of archives (one big one?) that need to be transferred to the target machine, after which the install can proceed without internet access.

Searching for "pip without internet" gives various hits. One that at least superficially looks like a reasonable starting point:


but there are also stackoverflow answers that look relevant.

It looks like, with a bit of work, pip can be convinced to look at local files to satisfy prerequisites and packages. Hence, if we keep that in mind, it seems to me that having an archive of "pip packages" would be doable if we ensure pip gets used in a way that makes it easy to reconfigure the place to look for prereqs. Then it may be fairly easy to make an offline installable version of sagemath, either by packing a big tarball that includes the pip content or by making that available in a separate ball, with an easy switch (or perhaps we can configure pip to first look locally and then try the internet? Or the other way around? So that it could transition gracefully between different ways of satisfying requirements).

So, perhaps we can have our pips and networkless installs too?

Matthias Koeppe

Feb 16, 2024, 9:06:14 PM
to sage-devel
On Friday, February 16, 2024 at 3:57:06 PM UTC-8 Nils Bruin wrote:
As far as I understand, the proposal is to allow sage "packages" to be closer to more standard python prerequisites by letting them be resolved by pip packages.

No, we already have such Sage packages: This is just one of the 4 existing package "source types" (https://deploy-livedoc--sagemath.netlify.app/html/en/developer/packaging#package-types) - "normal", "wheel", "pip", "script".

Most of our Python packages come from PyPI already. The difference is really (1) when we determine the version to be installed, and (2) if and how we distribute the tarball.

- "normal" packages are built from an sdist (tarball) retrieved from PyPI.
- The version is set in the file package-version.txt, and the PyPI download URL ("upstream_url") and checksums are recorded in checksums.ini; see https://github.com/sagemath/sage/tree/develop/build/pkgs/numpy for an example.
- The release manager's scripts download the package from the upstream_url and put it on the Sage mirrors.
- If the package is standard, it is also included in the big release tarball.
- If the package is standard and a stable release is being made, a GH Actions workflow also uploads the tarball as a Release Asset to GitHub (see https://github.com/sagemath/sage/releases/tag/10.2).
- When users install Sage from git, Sage first attempts to retrieve any normal package from the GitHub Release Assets, then from the Sage mirrors, then from the upstream_url.
- When users install Sage from the big release tarball, standard normal packages have their sdists already in upstream, and only optional/experimental normal packages need to be retrieved.
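
The retrieval order in the git-install case (GitHub Release Assets, then Sage mirrors, then upstream_url) amounts to a simple ordered fallback. The sketch below illustrates that idea only; it is not Sage's actual download code, and the function name and the fetcher callables are hypothetical.

```python
from typing import Callable, Iterable, Optional, Tuple

# A fetcher takes a tarball file name and returns its bytes, or None if missing.
Fetcher = Callable[[str], Optional[bytes]]


def fetch_with_fallback(filename: str,
                        sources: Iterable[Tuple[str, Fetcher]]) -> Tuple[str, bytes]:
    """Try each (label, fetcher) pair in order; return the first successful one."""
    tried = []
    for label, fetch in sources:
        tried.append(label)
        try:
            data = fetch(filename)
        except OSError:
            data = None  # treat an unreachable source like a miss and move on
        if data is not None:
            return label, data
    raise FileNotFoundError(f"{filename} not found in: {', '.join(tried)}")
```

A caller would pass the sources in the order listed above, for example `[("github-release-assets", f1), ("sage-mirrors", f2), ("upstream_url", f3)]`, where `f1`, `f2`, `f3` stand for whatever download functions are in use.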

What Dima proposes here is to allow _standard_ Sage packages to be of "source type" "pip". 
 
By default the package content would be fetched, as pip does,

Not just as pip does, but by actually calling "pip" to contact PyPI.
 
and that would mean the default configuration for sage would require internet at install time.

That's right.
 

Kwankyu Lee

Feb 16, 2024, 9:26:32 PM
to sage-devel
 
By default the package content would be fetched, as pip does,

Not just as pip does, but by actually calling "pip" to contact PyPI.
 
and that would mean the default configuration for sage would require internet at install time.

That's right.

Then Dima's proposal implies assuming internet at install time. Right? 

I asked the same question before, but Dima denied it. Hence my confusion...

Matthias Koeppe

Feb 16, 2024, 10:13:05 PM
to sage-devel
On Friday, February 16, 2024 at 6:26:32 PM UTC-8 Kwankyu Lee wrote:
 
By default the package content would be fetched, as pip does,

Not just as pip does, but by actually calling "pip" to contact PyPI.
 
and that would mean the default configuration for sage would require internet at install time.

That's right.

Then Dima's proposal implies assuming internet at install time. Right? 

Yes.

But one can make "pip" work with some local directory for the packages it considers instead of using PyPI over the Internet:
We can use "pip install --no-index --find-links=/SOME/LOCAL/DIRECTORY ...". See https://pip.pypa.io/en/stable/cli/pip_install/#finding-packages

As all pip options can also be provided systematically via environment variables, we can also set "PIP_NO_INDEX=true" and "PIP_FIND_LINKS=/SOME/LOCAL/DIRECTORY" for the same effect. Then one does not need to change the invocations of pip.
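
A minimal sketch of the environment-variable variant, in Python: the helper name `offline_pip_env` is hypothetical, while `PIP_NO_INDEX` and `PIP_FIND_LINKS` are the variables named above. The helper returns a copy of the environment with the two variables set, so that an unchanged pip invocation run with it uses only the local directory.

```python
import os
from typing import Mapping, Optional


def offline_pip_env(wheel_dir: str,
                    base_env: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment that makes pip use only wheel_dir."""
    env = dict(os.environ if base_env is None else base_env)
    env["PIP_NO_INDEX"] = "true"       # same effect as --no-index
    env["PIP_FIND_LINKS"] = wheel_dir  # same effect as --find-links=wheel_dir
    return env


# Usage sketch (package name and directory are placeholders):
#   import subprocess, sys
#   subprocess.run([sys.executable, "-m", "pip", "install", "some-package"],
#                  env=offline_pip_env("/SOME/LOCAL/DIRECTORY"), check=True)
```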

In fact, we already do exactly this in the Sage distribution for a slightly different purpose, namely when we build "normal" Python packages and "script" Python packages (= packages whose source trees are part of the repository, such as https://github.com/sagemath/sage/tree/develop/pkgs/sagemath-bliss), see https://github.com/sagemath/sage/blob/develop/build/pkgs/sagemath_objects/spkg-install.in#L3

We do this because, following modern Python build practices, we build most packages with "build isolation". The build-time prerequisites are not accessed from the normal Sage venv but are specifically installed in a temporary environment just for the build of the specific package. The prerequisites are installed from wheel files in venv/var/lib/sage/wheels/; this directory is referred to by the variable $SAGE_SPKG_WHEELS.

(Where do the wheel files in venv/var/lib/sage/wheels/ come from? Either (1) we have built them ahead of time and stored them there; or (2) they are platform-independent wheels and we have found them in the directory upstream/, downloaded them from GH Release assets, downloaded them from Sage mirrors, or the upstream_url (= PyPI).)

Nathan Dunfield

Feb 16, 2024, 11:44:06 PM
to sage-devel
Dima mentioned "tox" [1] as an example of a "standard" package that would benefit from being switched to a "pip" package.  The "tox" package is pure Python, so it could also be made a "wheel" package, which is already allowed for standard packages, for example [2].  I'm having difficulty understanding the practical differences between a "wheel" package and a "pip" package in this setting.  With "wheel", the wheel is downloaded from PyPI and put in upstream/ by various GH actions and put in the sage tarball and copied over to the sage mirrors, whereas with "pip" it is only downloaded by pip itself when an end-user builds Sage.  But in terms of developer effort, the only difference I see between "wheel" and "pip" is that the former has a few extra checksums, compare [2] and [3].  What distinctions am I missing?  Is it that a "wheel" must be pinned to a specific release on PyPI whereas "pip" can specify a range?

Best,

Matthias Koeppe

Feb 17, 2024, 12:17:37 AM
to sage-devel
On Friday, February 16, 2024 at 8:44:06 PM UTC-8 Nathan Dunfield wrote:
Dima mentioned "tox" [1] as an example of a "standard" package that would benefit from being switched to a "pip" package.  The "tox" package is pure Python, so it could also be made a "wheel" package, which is already allowed for standard packages, for example [2].

Yes, in fact, tox and its dependencies are already "wheel" packages, see https://github.com/sagemath/sage/blob/develop/build/pkgs/tox/checksums.ini
 
I have been switching many packages from "normal" to "wheel", which has reduced the complexity of the Sage distribution, as wheel packages have no installation scripts -- and also no build dependencies. The latter was crucial -- as we decided to install JupyterLab components from the pre-built wheels, which eliminated the complexity of Javascript build infrastructure from the Sage distribution, https://github.com/sagemath/sage/pull/36129

For wheel packages, it's all just metadata and the copied-over package README (which we need for building our reference manual).

I'm having difficulty understanding the practical differences between a "wheel" package and a "pip" package in this setting.

With "wheel", the wheel is downloaded from PyPI and put in upstream/ by various GH actions and put in the sage tarball and copied over to the sage mirrors, whereas with "pip" it is only downloaded by pip itself when an end-user builds Sage.  But in terms of developer effort, the only difference I see between "wheel" and "pip" is that the former has a few extra checksums, compare [2] and [3].
 
  What distinctions am I missing?  Is it that a "wheel" must be pinned to a specific release on PyPI whereas "pip" can specify a range?

If one does not care about the use case without internet access, then it's just the following:
- Pinning, as you mentioned (see also https://groups.google.com/g/sage-devel/c/5kmxaw105lg/m/9rF77fvFAAAJ above, where I discussed some details of this, including risks of leaving packages unpinned)
- Dependencies: "pip" packages can pull some of their build-time and run-time dependencies directly from PyPI, without us mirroring these dependencies in SageMath metadata. That's a mild convenience for developers, of importance if one wants to leave the version range wide open; but also has risks of instability.

Obviously, what is costly or inconvenient for developers depends a lot on the tooling that is available. I can elaborate on this if there's interest.

Dima Pasechnik

Feb 17, 2024, 6:09:25 AM
to sage-...@googlegroups.com

Dima Pasechnik

Feb 17, 2024, 6:15:44 AM
to sage-...@googlegroups.com


On 17 February 2024 02:26:32 GMT, Kwankyu Lee <ekwa...@gmail.com> wrote:
>
>
>
>
There are ways to use pip without internet, with the necessary wheels pre-fetched.
That's what Sage does with wheel packages. The difference between wheel packages and pip packages is that the latter don't require pre-fetched wheels, and they remove the need for package (micro)management.

>

Kwankyu Lee

Feb 17, 2024, 10:01:15 AM