[Proposal] allow standard packages to be pip packages, reduce source tarball size


Dima Pasechnik

Feb 11, 2024, 2:23:42 PM
to sage-...@googlegroups.com
Currently the standard packages cannot be pip packages, i.e. we must, in effect, vendor them. This entails an extra effort which is often not needed, in particular as we patch only very few Python packages.
Pip packages are on the other hand installed straight from PyPI.

Good examples of standard packages which can become pip ones are tox, pytest (not yet standard).


The other difference is that by default these packages are not included in the source tarball of a Sage release.

Rather than adding them there, I propose to split the upstream/* part of the tarball off into an optional component, represented by a list of files to download, which is simply not needed if you build while connected to the internet.

This is a huge saving on the tarball size: with upstream/* included, the Sage 10.2 tarball is 1.3 GB; without it, it is smaller than 0.25 GB.
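The "list of files to download" idea can be pictured as a small manifest that a build consults before fetching anything. The following is only a sketch under assumptions: the manifest layout (pairs of file name and SHA-256 digest) and the function names `sha256_of` and `missing_upstream_files` are hypothetical and are not Sage's actual mechanism.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def missing_upstream_files(manifest, upstream_dir):
    """Return manifest entries that are absent from upstream_dir or fail their checksum.

    manifest: iterable of (filename, sha256_hex) pairs (hypothetical layout).
    """
    upstream_dir = Path(upstream_dir)
    todo = []
    for filename, expected in manifest:
        candidate = upstream_dir / filename
        if not candidate.is_file() or sha256_of(candidate) != expected:
            todo.append(filename)
    return todo
```

With such a check, an offline build would need every listed file to be present in upstream/ already, while an online build would only fetch the names returned here.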

Note that as William writes, the desire to have Sage buildable without an internet connection was a requirement by a past Sage funder, gone about 10 years ago. Thus there's no longer an obligation to have this option.
I am not aware of a project similar to Sage that provides tarballs allowing for an offline build.

Thus, I would like to call a vote on these two topics:

1) allow standard packages to be pip packages

2) drop the contents of upstream/ from the Sage source tarballs.


---
Dima

Matthias Koeppe

Feb 11, 2024, 2:50:17 PM
to sage-devel
I think it's a bit too quick to already call a vote. I would suggest that you take the time to collect and link previous discussions on this topic, so that participants can review the known arguments, viewpoints, and requirements.

It may also be relevant to consider whether the "Source code (tar.gz)" tarballs that are automatically provided by GitHub on releases (and tags) would be sufficient. (They do not contain upstream; but they also do not contain the helpful .git directory that our tarball release script painstakingly adds.)

Dima Pasechnik

Feb 11, 2024, 3:26:41 PM
to sage-...@googlegroups.com


On 11 February 2024 19:50:17 GMT, Matthias Koeppe <matthia...@gmail.com> wrote:
>I think it's a bit too quick to already call a vote. I would suggest that
>you take the time to collect and link previous discussions on this topic,
>so that participants can review the known arguments, viewpoints, and
>requirements.
>
>Example (from my previous
>post): https://groups.google.com/g/sage-devel/c/C7-ho1zvEYU/m/S2n8d5rOAgAJ
>(2016)


I don't think arguments from 2016 are very relevant today, given how much Python packaging has evolved since then.

I don't think there is a good reason to delay this vote, especially given that there is a pending vote on more
pip packages to be made standard, potentially leading to totally unneeded effort to vendor them.

Matthias Koeppe

Feb 11, 2024, 3:34:51 PM
to sage-devel
On Sunday, February 11, 2024 at 12:26:41 PM UTC-8 Dima Pasechnik wrote:

On 11 February 2024 19:50:17 GMT, Matthias Koeppe <matthia...@gmail.com> wrote:
>I think it's a bit too quick to already call a vote. I would suggest that
>you take the time to collect and link previous discussions on this topic,
>so that participants can review the known arguments, viewpoints, and
>requirements.
>
>Example (from my previous
>post): https://groups.google.com/g/sage-devel/c/C7-ho1zvEYU/m/S2n8d5rOAgAJ
>(2016)

I don't think arguments from 2016 are very relevant today, given how much Python packaging has evolved since then.

In case it was not clear, I did not suggest to only look for discussions from 2016 or earlier.

And the state of Python packaging is only one aspect that is relevant.

mmarco

Feb 11, 2024, 4:19:15 PM
to sage-devel
As I mentioned in the thread that motivated this one, it would be relevant to establish whether it is possible to move those packages from standard to pip while still having a way to install Sage without an internet connection.

If the effort is not too much, I think it would make sense to provide that alternative.

Matthias Koeppe

Feb 11, 2024, 4:46:40 PM
to sage-devel
I'll provide some context and pointers that readers may find helpful to participate in the discussion and vote.
- tooling: https://deploy-livedoc--sagemath.netlify.app/html/en/developer/packaging#utility-script-to-create-and-maintain-packages

A "normal" or "wheel" package is always pinned to a specific version in the Sage distribution (package-version.txt, checksums.ini), and the Sage distribution needs to have a package for each of its dependencies.

"Pip" packages can either be pinned to a specific version, or set acceptable version ranges, or be entirely unconstrained. This is set in the file requirements.txt in the package directory. 
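The three cases (pinned, version range, unconstrained) can be told apart mechanically from a requirements.txt line. The sketch below is a simplified illustration, not part of Sage's tooling: the function name `classify_requirement` is hypothetical, and it deliberately ignores URLs, options such as `-r`, and the finer points of version specifier syntax.

```python
import re

# A bare "name==version" pin; wildcards or additional clauses count as a range.
_PIN = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*\s*==\s*[^\s,;*]+$")
_NAME_ONLY = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*$")


def classify_requirement(line: str) -> str:
    """Classify one requirements.txt line as 'pinned', 'range', or 'unconstrained'."""
    spec = line.split("#", 1)[0].strip()      # drop trailing comments
    spec = spec.split(";", 1)[0].strip()      # drop environment markers
    spec = re.sub(r"\[[^\]]*\]", "", spec)    # drop extras such as [test]
    if not spec:
        raise ValueError("empty requirement line")
    if _NAME_ONLY.match(spec):
        return "unconstrained"
    if _PIN.match(spec):
        return "pinned"
    return "range"
```

For instance, a line with only a package name is classified as unconstrained, while a line using `>=` falls into the range case.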

Pinning a version has the potential benefit of stability (avoiding retroactive breakage by new, incompatible versions). The cost is that updating the version requires work by two Sage developers: one who prepares a PR and one who reviews it. (I'll make an attempt to quantify this cost in a separate post.) And when the package does not get the attention of developers who upgrade it, there's the potential risk of missing out on bugfixes made in newer versions, or missing out on features in major new versions.
Not pinning the version has the obvious potential benefit of always being up to date. But there is a risk of instability, either by the package itself being affected by bugs in a new version, or by breaking compatibility with Sage.
What policy is best for a package obviously depends on lots of factors, including the development velocity and quality control of the upstream project, the interest of Sage developers in the package, the depth of integration in Sage, etc. I suggest subjecting "one-size-fits-all" approaches to a healthy dose of critical thinking.

Dependencies of a "pip" package do not need to be available as packages in the Sage distribution. However, if a dependency is also a package of the Sage distribution, then we must declare this dependency. If we don't, surprising things can happen when building or upgrading. When new versions of "pip" packages add dependencies that happen to be Sage packages, there is a separate source of instability.
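A check for this situation could be sketched as follows. This is an illustration under assumptions, not existing Sage tooling: the function names are hypothetical, and the sketch assumes that a Sage package name and the corresponding PyPI distribution name agree after PEP 503 normalization. That assumption does not always hold (the Sage package python_build corresponds to the PyPI distribution build), so a real tool would need an explicit name mapping.

```python
import re


def normalize(name: str) -> str:
    """Normalize a distribution name as PEP 503 does (lowercase; '-', '_', '.' equivalent)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def undeclared_sage_dependencies(upstream_requires, declared, sage_packages):
    """Return upstream requirements that are Sage packages but are not declared.

    upstream_requires: names that the pip package requires upstream
    declared: names listed in the package's Sage `dependencies` file
    sage_packages: names of all packages in the Sage distribution
    """
    declared_norm = {normalize(n) for n in declared}
    sage_norm = {normalize(n) for n in sage_packages}
    return sorted(
        n
        for n in {normalize(r) for r in upstream_requires}
        if n in sage_norm and n not in declared_norm
    )
```

Running such a check whenever a "pip" package is upgraded would flag newly added upstream dependencies that happen to be Sage packages.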


Matthias Koeppe

Feb 11, 2024, 5:47:24 PM
to sage-devel
On Sunday, February 11, 2024 at 1:46:40 PM UTC-8 Matthias Koeppe wrote:
I'll make an attempt to quantify this cost

Here's an illustration of the workflow for making python_build a standard "wheel" package, as proposed in https://groups.google.com/g/sage-devel/c/MIU-xo9b7pc:

$ git checkout -b python_build_standard upstream/develop
branch 'python_build_standard' set up to track 'upstream/develop'.
Switched to a new branch 'python_build_standard'
$ ls build/pkgs/python_build
SPKG.rst         dependencies     distros          requirements.txt type

The package already exists as a "pip" package (requirements.txt). Let's re-create it as a standard "wheel" package.

$ mv build/pkgs/python_build build/pkgs/build
$ ./sage -package create build --pypi --type standard
Downloading tarball from https://pypi.io/packages/py3/b/build/build-1.0.3-py3-none-any.whl to .../upstream/build-1.0.3-py3-none-any.whl
[......................................................................]
$ mv build/pkgs/build build/pkgs/python_build
$ ls build/pkgs/python_build
SPKG.rst             checksums.ini        dependencies         distros              install-requires.txt package-version.txt  requirements.txt     type
$ git rm -f build/pkgs/python_build/requirements.txt
rm 'build/pkgs/python_build/requirements.txt'

Now, after removing requirements.txt, it's a wheel package. Let's review the changes that "sage -package create" made.

$ git --no-pager diff build/pkgs/python_build/dependencies
diff --git a/build/pkgs/python_build/dependencies b/build/pkgs/python_build/dependencies
index b72a6d1c776..47296a7bace 100644
--- a/build/pkgs/python_build/dependencies
+++ b/build/pkgs/python_build/dependencies
@@ -1,4 +1,4 @@
- pyparsing tomli packaging | $(PYTHON_TOOLCHAIN) $(PYTHON)
+ | $(PYTHON_TOOLCHAIN) $(PYTHON)
 
 ----------
 All lines of this file are ignored except the first.

Our old version was better, go back to it. (The script "sage -package create" does not know how to find the dependencies; https://github.com/sagemath/sage/pull/36740 prepares an improvement, needs review.)

$ git checkout -- build/pkgs/python_build/dependencies
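Since only the first line of the dependencies file matters, its structure can be illustrated with a few lines of Python. This is a sketch with a hypothetical function name (`parse_dependencies`), not code from Sage. If the line is passed through to make unchanged, the names after `|` would be order-only prerequisites in GNU make terms; that reading is an assumption here.

```python
def parse_dependencies(text: str):
    """Split the first line of a Sage `dependencies` file at '|'.

    Returns (before_bar, after_bar) as lists of whitespace-separated tokens.
    Only the first line is considered; the rest of the file is ignored.
    """
    first_line = text.splitlines()[0] if text else ""
    before, _, after = first_line.partition("|")
    return before.split(), after.split()
```

Applied to the two versions in the diff above, the old file yields three names before the bar and the regenerated one yields none, which is exactly the information that "sage -package create" dropped.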

Commit the changes:

$ git add build/pkgs/python_build
$ git commit -m "build/pkgs/python_build: Change to a normal standard package"
[python_build_standard 43f6b2b8ef9] build/pkgs/python_build: Change to a normal standard package
 4 files changed, 7 insertions(+), 1 deletion(-)
 create mode 100644 build/pkgs/python_build/checksums.ini
 rename build/pkgs/python_build/{requirements.txt => install-requires.txt} (100%)
 create mode 100644 build/pkgs/python_build/package-version.txt

Test it:

$ make python_build
make -j16 build/make/Makefile --stop
./bootstrap -d
[...]
rm -rf config/install-sh config/compile config/config.guess config/config.sub config/missing configure build/make/Makefile-auto.in
make --no-print-directory python_build-SAGE_VENV-no-deps
[python_build-1.0.3] Using cached file .../upstream/build-1.0.3-py3-none-any.whl
[python_build-1.0.3] python_build-1.0.3
[python_build-1.0.3] ====================================================
[python_build-1.0.3] Setting up build directory for python_build-1.0.3
[...]
[python_build-1.0.3] Using pip 23.3.1 from .../local/var/lib/sage/venv-python3.11/lib/python3.11/site-packages/pip (python 3.11)
[python_build-1.0.3] Looking in links: .../local/var/lib/sage/venv-python3.11/var/lib/sage/wheels
[python_build-1.0.3] Processing .../local/var/lib/sage/venv-python3.11/var/lib/sage/wheels/build-1.0.3-py3-none-any.whl (from -r .../local/var/lib/sage/venv-python3.11/var/lib/sage/scripts/python_build/spkg-requirements.txt (line 1))
[python_build-1.0.3] Requirement already satisfied: packaging>=19.0 in .../local/var/lib/sage/venv-python3.11/lib/python3.11/site-packages (from build@ file://.../local/var/lib/sage/venv-python3.11/var/lib/sage/wheels/build-1.0.3-py3-none-any.whl->-r .../local/var/lib/sage/venv-python3.11/var/lib/sage/scripts/python_build/spkg-requirements.txt (line 1)) (23.2)
[python_build-1.0.3] Requirement already satisfied: pyproject_hooks in .../local/var/lib/sage/venv-python3.11/lib/python3.11/site-packages (from build@ file://.../local/var/lib/sage/venv-python3.11/var/lib/sage/wheels/build-1.0.3-py3-none-any.whl->-r .../local/var/lib/sage/venv-python3.11/var/lib/sage/scripts/python_build/spkg-requirements.txt (line 1)) (1.0.0)
[python_build-1.0.3] Installing collected packages: build
[python_build-1.0.3]   changing mode of .../local/var/lib/sage/venv-python3.11/bin/pyproject-build to 755
[python_build-1.0.3] Successfully installed build-1.0.3
[python_build-1.0.3] Successfully installed python_build-1.0.3
[...]
Sage build/upgrade complete!

It did not complain about dependencies, so we seem to be good. But the "pyproject_hooks" that it was happy to find comes from the previous installation; we don't have it as a package. Let's create it as a standard package.

$ ./sage -package create pyproject_hooks --pypi --type standard
Downloading tarball from https://pypi.io/packages/py3/p/pyproject_hooks/pyproject_hooks-1.0.0-py3-none-any.whl to .../upstream/pyproject_hooks-1.0.0-py3-none-any.whl
[......................................................................]
$ make pyproject_hooks
make -j16 build/make/Makefile --stop
./bootstrap -d
[...]
make --no-print-directory pyproject_hooks-SAGE_VENV-no-deps
[pyproject_hooks-1.0.0] Using cached file .../upstream/pyproject_hooks-1.0.0-py3-none-any.whl
[pyproject_hooks-1.0.0] pyproject_hooks-1.0.0
[...]
[pyproject_hooks-1.0.0] Found existing installation: pyproject_hooks 1.0.0
[pyproject_hooks-1.0.0] Uninstalling pyproject_hooks-1.0.0:
[pyproject_hooks-1.0.0]   Successfully uninstalled pyproject_hooks-1.0.0
[pyproject_hooks-1.0.0] Using pip 23.3.1 from .../local/var/lib/sage/venv-python3.11/lib/python3.11/site-packages/pip (python 3.11)
[pyproject_hooks-1.0.0] Looking in links: .../local/var/lib/sage/venv-python3.11/var/lib/sage/wheels
[pyproject_hooks-1.0.0] Processing .../local/var/lib/sage/venv-python3.11/var/lib/sage/wheels/pyproject_hooks-1.0.0-py3-none-any.whl (from -r .../local/var/lib/sage/venv-python3.11/var/lib/sage/scripts/pyproject_hooks/spkg-requirements.txt (line 1))
[pyproject_hooks-1.0.0] Installing collected packages: pyproject_hooks
[pyproject_hooks-1.0.0] Successfully installed pyproject_hooks-1.0.0
[...]
Sage build/upgrade complete!

No more dependencies to take care of, we are done.

$ git add build/pkgs/pyproject_hooks
$ git commit -m "build/pkgs/pyproject_hooks: New, python_build dependency"
[python_build_standard 58ab4c838e3] build/pkgs/pyproject_hooks: New, python_build dependency
 6 files changed, 28 insertions(+)
 create mode 100644 build/pkgs/pyproject_hooks/SPKG.rst
 create mode 100644 build/pkgs/pyproject_hooks/checksums.ini
 create mode 100644 build/pkgs/pyproject_hooks/dependencies
 create mode 100644 build/pkgs/pyproject_hooks/install-requires.txt
 create mode 100644 build/pkgs/pyproject_hooks/package-version.txt
 create mode 100644 build/pkgs/pyproject_hooks/type
$ git push -u origin HEAD
Enumerating objects: 22, done.
Counting objects: 100% (22/22), done.
Delta compression using up to 12 threads
Compressing objects: 100% (14/14), done.
Writing objects: 100% (18/18), 1.92 KiB | 163.00 KiB/s, done.
Total 18 (delta 6), reused 7 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (6/6), completed with 3 local objects.
remote:
remote: Create a pull request for 'python_build_standard' on GitHub by visiting:
remote:      https://github.com/mkoeppe/sage/pull/new/python_build_standard
remote:
To https://github.com/mkoeppe/sage.git
 * [new branch]              HEAD -> python_build_standard
branch 'python_build_standard' set up to track 'origin/python_build_standard'.




 

Dima Pasechnik

Feb 11, 2024, 6:29:11 PM
to sage-...@googlegroups.com
Sage has shot itself in the foot by adopting an overly rigid approach to Python dependencies which are not tightly integrated into the core of Sage (sagelib): Jupyter, Tox, and Sphinx (and their zillion dependencies).

A way out of it is to declare as many deps as possible as pip packages, and just remove from our list the many packages which are dependencies of Sphinx and Jupyter only (they are found and installed by pip just fine when you install Jupyter and Sphinx; there is no need for Sage to micromanage them).
The potential issues with dependencies of pip packages interfering with Sage packages (you mention these below) are precisely the result of this package micromanagement.



>What policy is best for a package obviously depends on lots of factors,
>including the development velocity and quality control that the upstream
>project, interest by Sage developers in the package, the depth of
>integration in Sage etc. I suggest to subject "one-size-fits-all"
>approaches to a healthy dose of critical thinking.

Yes, indeed, the current "standard packages cannot be pip packages" rule is exactly the "one-size-fits-all" approach you are arguing against, and the issue we would like to resolve here.

>
>Dependencies of a "pip" package do not need to be available as packages in
>the Sage distribution. However, if a dependency is also a package of the
>Sage distribution, then we must declare this dependency. If we don't,
>surprising things can happen when building or upgrading. When new versions
>of "pip" packages add dependencies that happen to be Sage packages, there
>is a separate source of instability.

OTOH a package like pytest or tox is basically an external tool, and using an appropriate version of it is all that's needed.

Dima Pasechnik

Feb 11, 2024, 6:34:46 PM
to sage-...@googlegroups.com


On 11 February 2024 22:47:24 GMT, Matthias Koeppe <matthia...@gmail.com> wrote:
>On Sunday, February 11, 2024 at 1:46:40 PM UTC-8 Matthias Koeppe wrote:
>
>I'll make an attempt to quantify this cost
>
>
>Here's an illustration of the workflow for making python_build a standard
>"wheel" package, as proposed in
>https://groups.google.com/g/sage-devel/c/MIU-xo9b7pc:


What you outlined is the initial one-time cost. There is also a cost of maintenance, which eventually gets bigger than the initial cost: the thing gets outdated, its dependencies get outdated, this all requires updates, tests, conflict resolutions ---something that you get largely for free if you let go of the package dependency micromanagement, relying instead on the Python universe out there to do the job.





Matthias Koeppe

Feb 11, 2024, 7:57:34 PM
to sage-devel
On Sunday, February 11, 2024 at 3:34:46 PM UTC-8 Dima Pasechnik wrote:
On 11 February 2024 22:47:24 GMT, Matthias Koeppe <matthia...@gmail.com> wrote:
>On Sunday, February 11, 2024 at 1:46:40 PM UTC-8 Matthias Koeppe wrote:
>
>I'll make an attempt to quantify this cost
>
>Here's an illustration of the workflow for making python_build a standard
>"wheel" package, as proposed in
>https://groups.google.com/g/sage-devel/c/MIU-xo9b7pc:

What you outlined is the initial one-time cost.

That's correct, that's what I did in that post.
 
There is also a cost of maintenance, which eventually gets bigger than the initial cost: the thing gets outdated, its dependencies get outdated, this all requires updates, tests, conflict resolutions ---something that you get largely for free if you let go of the package dependency micromanagement, relying instead on the Python universe out there to do the job.

That's where a possible sleight of hand happens. 
Let's please do this discussion at normal speed, giving the audience a chance to observe the facts and form their opinion.

Pinning packages to a set of tested working versions is a standard practice, and as a matter of fact part of best practices to achieve stability in various deployment situations, reproducibility, etc.

In the Python world, such pinning is done using requirements.txt, Pipfile.lock, and environment.yml files.
In the Sage distribution, we pin using package-version.txt and tiny requirements.txt files.

When updating the pins, testing is always necessary; it does not come for free. Yes, we have our automatic tests, but in two of the examples that you mentioned, Sphinx and Jupyter, some manual inspection is necessary.

A question to ask is what tooling is available to update the version pins, and what the cost of using the tools is. For a typical upgrade, by improving our tooling, we have reduced the work to just typing "./sage -package update-latest sphinx --commit". In the Sphinx upgrade, https://github.com/sagemath/sage/pull/37129/files (needs review), I ended up updating 25 packages, so I had to use a command like this 25 times. It's repetitive, maybe it takes 20 minutes total, but it's not remotely something that I would use the phrase "Sage has shot itself in the foot" for. 

(Our tooling for "pip" packages is actually worse than that; "./sage -package update-latest" does not support them, which is an easy-to-implement wishlist item. Being able to run "sage -pip install -U sphinx", then test, then update the pinned versions according to "./sage -pip freeze" is another easy-to-implement wishlist item.)
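The second wishlist item could look roughly like the sketch below. It does not exist in Sage; `pins_from_freeze` is a hypothetical helper that only turns "pip freeze" style output into a mapping of package name to version, and it leaves open how the result would be written back to the package directories.

```python
def pins_from_freeze(freeze_output: str, wanted):
    """Map each wanted package name to the version reported by `pip freeze`.

    Only plain `name==version` lines are used; editable installs and
    `name @ url` lines are skipped. Names are compared case-insensitively.
    """
    installed = {}
    for line in freeze_output.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "-e ")) or "==" not in line:
            continue
        name, _, version = line.partition("==")
        installed[name.strip().lower()] = version.strip()
    return {
        name: installed[name.lower()]
        for name in wanted
        if name.lower() in installed
    }
```

The design choice here is to update pins only after testing: the versions are read from an environment that has already passed the tests, rather than taken from whatever is newest on PyPI.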

Dima Pasechnik

Feb 12, 2024, 6:18:05 AM
to sage-...@googlegroups.com
On Mon, Feb 12, 2024 at 12:57 AM Matthias Koeppe
<matthia...@gmail.com> wrote:
>
> On Sunday, February 11, 2024 at 3:34:46 PM UTC-8 Dima Pasechnik wrote:
>
> On 11 February 2024 22:47:24 GMT, Matthias Koeppe <matthia...@gmail.com> wrote:
> >On Sunday, February 11, 2024 at 1:46:40 PM UTC-8 Matthias Koeppe wrote:
> >
> >I'll make an attempt to quantify this cost
> >
> >Here's an illustration of the workflow for making python_build a standard
> >"wheel" package, as proposed in
> >https://groups.google.com/g/sage-devel/c/MIU-xo9b7pc:
>
> What you outlined is the initial one-time cost.
>
>
> That's correct, that's what I did in that post.
>
>
> There is also a cost of maintenance, which eventually gets bigger than the initial cost: the thing gets outdated, its dependencies get outdated, this all requires updates, tests, conflict resolutions ---something that you get largely for free if you let go of the package dependency micromanagement, relying instead on the Python universe out there to do the job.
>
>
> That's where a possible sleight of hand happens.
> Let's please do this discussion at normal speed, giving the audience a chance to observe the facts and form their opinion.
>
> Pinning packages to a set of tested working versions is a standard practice, and as a matter of fact part of best practices to achieve stability in various deployment situations, reproducibility, etc.
>
> In the Python world, such pinning is done using requirements.txt, Pipfile.lock, and environment.yml files.
> In the Sage distribution, we pin using package-version.txt and tiny requirements.txt files.

as well as install-requires.txt and spkg-configure.m4 - in some cases
they also pin versions, strictly or not.
Now you can lament about the lack of more developers joining the
project... (they come, they see the insanity of controlling versions
in 5 different somewhat incompatible ways, they leave).

>
> When updating the pins, testing is always necessary; it does not come for free. Yes, we have our automatic tests, but in two of the examples that you mentioned, Sphinx and Jupyter, some manual inspection is necessary.

Now, at last, tell us what makes Sage so special that we must vendor
sphinx and jupyter (and pytest (proposed), and tox, and...), unlike,
say, sympy, or scipy?
I imagine they spend developers' time on something more productive
than repeating the work done elsewhere, no?

>
> A question to ask is what tooling is available to update the version pins, and what the cost of using the tools is. For a typical upgrade, by improving our tooling, we have reduced the work to just typing "./sage -package update-latest sphinx --commit". In the Sphinx upgrade, https://github.com/sagemath/sage/pull/37129/files (needs review), I ended up updating 25 packages, so I had to use a command like this 25 times. It's repetitive, maybe it takes 20 minutes total, but it's not remotely something that I would use the phrase "Sage has shot itself in the foot" for.

The whole thing of a zillion vendored packages makes Sage uniquely
hard to package, and use outside of its own venv. These 25 packages
just don't need our version micromanagement, it's already done outside
of the project.
Can we please start to let go of this "vendor everything" mentality? Please?


>
> (Our tooling for "pip" packages is actually worse than that; "./sage -package update-latest" does not support them, an easy to implement wishlist item. Being able to run "sage -pip install -U sphinx", then test, then updating the pinned versions according to "./sage -pip freeze" -- also that's an easy to implement wishlist item.)
>
> --
> You received this message because you are subscribed to the Google Groups "sage-devel" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sage-devel/7f062b45-a5b3-49de-83e1-4f2f47eb96c2n%40googlegroups.com.

kcrisman

Feb 12, 2024, 7:34:11 AM
to sage-devel
As part of this thread, I'd again ask for a discussion of the following situation I asked in the other thread.  Dima had some interesting points about a less-vendored approach saving disk space etc., but it would be helpful to have input from people who have had to install Sage in these kinds of situations en masse.  Separately, I'm also wondering about the Windows situation since much of the world, for better or worse, is not on Linux.

"At least in the not too distant past there have been situations where the non-requirement of internet connectivity alleviated issues of limited internet accessibility in a given locale, limited download speeds, limited grid electricity, etc.   This policy just as much affects those situations, and perhaps some people who have installed Sage in such environments (including Sage Days and other events) might want to weigh in on that, and whether such situations still obtain (as I personally assume they must certainly do).  I figure three-letter agencies have people with the skills to get around not using pip install, but if your downloads are over a mobile network (or, for that matter, Project Kuiper or Starlink or whatever), you might still want to download Sage - especially now that we don't have binary installs "provided"."

Dima Pasechnik

Feb 12, 2024, 7:41:19 AM
to sage-...@googlegroups.com
On Mon, Feb 12, 2024 at 12:34 PM kcrisman <kcri...@gmail.com> wrote:
>
> As part of this thread, I'd again ask for a discussion of the following situation I asked in the other thread. Dima had some interesting points about a less-vendored approach saving disk space etc., but it would be helpful to have input from people who have had to install Sage in these kinds of situations en masse. Separately, I'm also wondering about the Windows situation since much of the world, for better or worse, is not on Linux.

On Windows, once you have WSL 2 up and running in the default way
(something that is very common to have; how to set it up in detail is
beyond the scope of Sage),
you are basically on a recent Ubuntu (accessed via a weird interface, but OK).


>
> "At least in the not too distant past there have been situations where the non-requirement of internet connectivity alleviated issues of limited internet accessibility in a given locale, limited download speeds, limited grid electricity, etc. This policy just as much affects those situations, and perhaps some people who have installed Sage in such environments (including Sage Days and other events) might want to weigh in on that, and whether such situations still obtain (as I personally assume they must certainly do). I figure three-letter agencies have people with the skills to get around not using pip install, but if your downloads are over a mobile network (or, for that matter, Project Kuiper or Starlink or whatever), you might still want to download Sage - especially now that we don't have binary installs "provided"."

Matthias Koeppe

Feb 12, 2024, 1:02:21 PM
to sage-devel
On Monday, February 12, 2024 at 3:18:05 AM UTC-8 Dima Pasechnik wrote:
> Pinning packages to a set of tested working versions is a standard practice, and as a matter of fact part of best practices to achieve stability in various deployment situations, reproducibility, etc.
>
> In the Python world, such pinning is done using requirements.txt, Pipfile.lock, and environment.yml files.
> In the Sage distribution, we pin using package-version.txt and tiny requirements.txt files.

as well as install-requires.txt and spkg-configure.m4 - they also in
some cases pin versions, strictly,or not.

These files serve a different purpose. They declare acceptable version ranges.
In pure Python packages, this exists as well, as you know.
It is done in pyproject.toml "dependencies" (previously "install_requires" in setup.cfg/setup.py).

Talking about these here is a distraction that does not serve the discussion of this topic.

Now, at last, tell us what makes Sage so special that we must vendor
sphinx and jupyter [...]

Note that I have not expressed much of an opinion yet on your proposal. 
We'll get there.

But as I have pointed out several times previously, you are using the word "vendoring" in a polemic and idiosyncratic way, which does not serve the discussion. More below.

> A question to ask is what tooling is available to update the version pins, and what the cost of using the tools is. For a typical upgrade, by improving our tooling, we have reduced the work to just typing "./sage -package update-latest sphinx --commit". In the Sphinx upgrade, https://github.com/sagemath/sage/pull/37129/files (needs review), I ended up updating 25 packages, so I had to use a command like this 25 times. It's repetitive, maybe it takes 20 minutes total, but it's not remotely something that I would use the phrase "Sage has shot itself in the foot" for.

The whole thing of a zillion vendored packages [...]

1. Sage does not "vendor". What is in build/pkgs is _metadata_. It's just text. Sage _pins_ versions of packages, so there is information on the version.

2. Also the large Sage source tarball does not "vendor". It is a shipment of a distribution. Distributions don't "vendor". It's the job of a distribution to ship its components.

Dima Pasechnik

Feb 12, 2024, 1:49:04 PM
to sage-...@googlegroups.com


On Mon, Feb 12, 2024 at 6:02 PM Matthias Koeppe <matthia...@gmail.com> wrote:
>
> On Monday, February 12, 2024 at 3:18:05 AM UTC-8 Dima Pasechnik wrote:
>
> > Pinning packages to a set of tested working versions is a standard practice, and as a matter of fact part of best practices to achieve stability in various deployment situations, reproducibility, etc.
> >
> > In the Python world, such pinning is done using requirements.txt, Pipfile.lock, and environment.yml files.
> > In the Sage distribution, we pin using package-version.txt and tiny requirements.txt files.
>
> as well as install-requires.txt and spkg-configure.m4 - they also, in
> some cases, pin versions, strictly or not.
>
>
> These files serve a different purpose. They declare acceptable version ranges.

requirements.txt may just as well specify a range, and this is used too, e.g.

build/pkgs/phitigra/requirements.txt has

phitigra>=0.2.6

So this is all blurred and confusing

> In pure Python packages, this exists as well, as you know.
> It is done in pyproject.toml "dependencies" (previously setup.cfg/py "install-requires").
>
> Talking about these here is a distraction that does not serve the discussion of this topic.
>
> Now, at last, tell us what makes Sage so special that we must vendor
> sphinx and jupyter [...]
>
>
> Note that I have not expressed much of an opinion yet on your proposal.
> We'll get there.
>
> But as I have pointed out several times previously, you are using the word "vendoring" in a polemic and idiosyncratic way, which does not serve the discussion. More below.
>
> > A question to ask is what tooling is available to update the version pins, and what the cost of using the tools is. For a typical upgrade, by improving our tooling, we have reduced the work to just typing "./sage -package update-latest sphinx --commit". In the Sphinx upgrade, https://github.com/sagemath/sage/pull/37129/files (needs review), I ended up updating 25 packages, so I had to use a command like this 25 times. It's repetitive, maybe it takes 20 minutes total, but it's not remotely something that I would use the phrase "Sage has shot itself in the foot" for.
>
> The whole thing of a zillion vendored packages [...]
>
>
> 1. Sage does not "vendor". What is in build/pkgs is _metadata_. It's just text. Sage _pins_ versions of packages, so there is information on the version.

Of course. I never said that metadata is vendoring; it's certainly not, and this is a digression from the topic.

>
> 2. Also the large Sage source tarball does not "vendor". It is a shipment of a distribution. Distributions don't "vendor". It's the job of a distribution to ship its components.
This is not correct. Sage is not a distribution, and  I am using the verb as described here:  https://en.wiktionary.org/wiki/vendor#Verb

vendor (third-person singular simple present vendors, present participle vendoring, simple past and past participle vendored)

  1. (transitive, software engineering) To bundle third-party dependencies with the source code for one's own program.
       I distributed my application with a vendored copy of Perl so that it wouldn't use the system copies of Perl where it is installed.
  2. (transitive, software engineering) As the software vendor, to bundle one's own, possibly modified version of dependencies with a standard program.
       Strawberry Perl contains vendored copies of some CPAN modules, designed to allow them to run on Windows.

According to this definition, everything in upstream/ is vendored (except our own packages, like configure.)


 

Matthias Koeppe

Feb 12, 2024, 5:01:36 PM
to sage-devel
On Monday, February 12, 2024 at 10:49:04 AM UTC-8 Dima Pasechnik wrote:
requirements.txt may just as well specify a range, and this is used too, e.g.

build/pkgs/phitigra/requirements.txt has
phitigra>=0.2.6

Yes, as I said in https://groups.google.com/g/sage-devel/c/5kmxaw105lg/m/9rF77fvFAAAJ, ""Pip" packages can either be pinned to a specific version, or set acceptable version ranges, or be entirely unconstrained. This is set in the file requirements.txt in the package directory."

So this is all [...] confusing

That's why I'm taking the time to explain it clearly for the benefit of everyone.
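
To make the three cases concrete (pinned, version range, unconstrained), here is a minimal sketch of how a single requirements.txt line could be classified. It is not Sage or pip code: the function name is hypothetical and the parsing is deliberately simplified (no extras, environment markers, URLs, or pip options).

```python
import re

# Hypothetical helper, not part of Sage or pip. Longer operators are listed
# first so that "===" and "<=" are not mistaken for "==" and "<".
_SPEC = re.compile(r"(===|==|~=|!=|<=|>=|<|>)\s*([^,;\s]+)")

def classify_requirement(line):
    """Return 'pinned', 'range', or 'unconstrained' for one requirement line."""
    line = line.split("#", 1)[0].strip()   # drop trailing comments
    specs = _SPEC.findall(line)
    if not specs:
        return "unconstrained"             # bare name, any version accepted
    # A single "==" clause without a wildcard is treated as a strict pin.
    if len(specs) == 1 and specs[0][0] in ("==", "===") and "*" not in specs[0][1]:
        return "pinned"
    return "range"
```

With this sketch, the line "phitigra>=0.2.6" quoted above is classified as a range.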

Matthias Koeppe

Feb 12, 2024, 5:11:16 PM
to sage-devel
On Monday, February 12, 2024 at 10:49:04 AM UTC-8 Dima Pasechnik wrote:
> 2. Also the large Sage source tarball does not "vendor". It is a shipment of a distribution. Distributions don't "vendor". It's the job of a distribution to ship its components.
This is not correct. Sage is not a distribution

Let's not do "Sage-the-distribution is not a distribution" again. https://groups.google.com/g/sage-devel/c/3Zoq0CNE1hE/m/tPgFOpHWBwAJ (2023).

Dima Pasechnik

Feb 12, 2024, 6:00:59 PM
to sage-...@googlegroups.com
I never agreed with William on this one: Sage is too narrow in scope
and too incomplete to be a distribution.
Anaconda calls itself a "distribution", and Sage is quite far from
Anaconda's functionality.

Anyway, William concludes with "I hope soon Sage isn't a distribution,
but right now it still is. "
Do you also hope for the latter?

Anyhow, it's just fuzzy terminology, as well as just what exactly "to
vendor" means.
With the definition of "to vendor" that I provided, you have to agree
that we vendor a lot of things.

Dima Pasechnik

Feb 12, 2024, 6:07:38 PM
to sage-...@googlegroups.com
I am sorry: I claimed that Sage has about 5 different ways to
specify/restrict versions of its packages,
and this makes it hugely confusing.
You disagreed, but now you say that it needs an explanation.

What really needs an explanation is how we ever went this far on a
garden path. :-)

John H Palmieri

Feb 12, 2024, 6:44:00 PM
to sage-devel
What does this (a discussion of how Sage specifies version restrictions) have to do with the proposal? If it's relevant, that was not clear in the original proposal, so please clarify. It sounds like you might be proposing removing version checks on many of the packages Sage uses, or at least that's a conclusion I might draw from your critique of the amount of maintenance for Sage packages. Or maybe you are proposing redesigning the version specification system? In any case, it wasn't stated as part of the original proposal, so I don't know what was intended. If it is not relevant to the proposal, let's drop this part of the discussion.

I would also suggest dropping the question of whether we're "vendoring." The proposal clearly says that we should stop distributing the tarballs in the upstream directory, so whatever we call it, that part is clear.

(Maybe by "vendoring" you meant the combination of including the tarballs and the maintenance on the allowed versions, or maybe just including the tarballs, or maybe something else. The word "vendoring" does not seem to be helpful, so instead spelling out exactly what's meant for Sage could be helpful, at least if you meant more than just removing "upstream".)

Matthias Koeppe

Feb 12, 2024, 6:52:29 PM
to sage-devel
I'll now offer:

Opinion 1. Nobody needs to care in the slightest what the size of that release tarball is. 

In any use case with internet connectivity, people will be better off just cloning the git repo rather than using the release tarball.

If there are relevant use cases without internet connectivity (I have no opinion to offer on this), then the release tarball has exactly the right contents.

Proposed action items: 
A. Change https://github.com/sagemath/sage/blob/develop/README.md so that "git clone" is described as the primary way to obtain the Sage sources. That the big release tarball is available can be a footnote in the Installation Guide (https://deploy-livedoc--sagemath.netlify.app/html/en/installation/source#installation-steps) for the limited no-internet connectivity use case.

B. Likewise, get rid of all of these "Download Sage source code" pages (https://www.sagemath.org/download-source.html, https://www.sagemath.org/download-latest.html), mirror selection, etc. from the Sage website. 

Dima Pasechnik

Feb 12, 2024, 7:05:05 PM
to sage-...@googlegroups.com
On Mon, Feb 12, 2024 at 11:52 PM Matthias Koeppe
<matthia...@gmail.com> wrote:
>
> I'll now offer:
>
> Opinion 1. Nobody needs to care in the slightest what the size of that release tarball is.

Not quite true. The mirrors are not of infinite size; e.g. some
projects on PyPI (symengine is an example, IIRC) get constrained that
way.

>
> In any use case with internet connectivity, people will be better off just cloning the git repo rather than using the release tarball.
>
> If there are relevant use cases without internet connectivity (I have no opinion to offer on this), then the release tarball has exactly the right contents.

This won't be true any more if we allow standard packages to be pip packages.

>
> Proposed action items:
> A. Change https://github.com/sagemath/sage/blob/develop/README.md so that "git clone" is described as the primary way to obtain the Sage sources. That the big release tarball is available can be a footnote in the Installation Guide (https://deploy-livedoc--sagemath.netlify.app/html/en/installation/source#installation-steps) for the limited no-internet connectivity use case.
>
> B. Likewise, get rid of all of these "Download Sage source code" pages (https://www.sagemath.org/download-source.html, https://www.sagemath.org/download-latest.html), mirror selection, etc. from the Sage website.
>
>
> On Sunday, February 11, 2024 at 11:23:42 AM UTC-8 Dima Pasechnik wrote:
>>
>> Currently the standard packages cannot be pip packages, i.e. we must, in effect, vendor them. This entails an extra effort which is often not needed, in particular as we patch only very few Python packages.
>> Pip packages are on the other hand installed straight from PyPI.
>>
>> Good examples of standard packages which can become pip ones are tox, pytest (not yet standard).
>>
>>
>> The other difference is that by default these packages are not included in the Sage releases source tarball.
>>
>> Rather than adding them there I propose to split the upstream/* part of the tarball into something optional - which is represented by a list of files to download, and which is just not needed if you build while connected to the internet.
>>
>> This is a huge saving on the tarball size: with upstream/* in, Sage 10.2 tarball is 1.3Gb, and without it is smaller than 0.25Gb.
>>
>> Note that as William writes, the desire to have Sage buildable without an internet connection was a requirement by a past Sage funder, gone about 10 years ago. Thus there's no longer an obligation to have this option.
>> I am not aware of a similar to Sage which provides tarballs allowing for an offline build.
>>
>> Thus, I would like to call a vote on these two topics:
>>
>> 1) allow standard packages to be pip packages
>>
>> 2) drop the contents of upstream/ from the Sage source tarballs.
>>
>>
>> ---
>> Dima

Matthias Koeppe

Feb 12, 2024, 9:42:07 PM
to sage-devel
On Monday, February 12, 2024 at 3:52:29 PM UTC-8 Matthias Koeppe wrote:
In any use case with internet connectivity, people will be better off just cloning the git repo rather than using the release tarball.

If there are relevant use cases without internet connectivity (I have no opinion to offer on this), then the release tarball has exactly the right contents.

Proposed action items: 
A. Change https://github.com/sagemath/sage/blob/develop/README.md so that "git clone" is described as the primary way to obtain the Sage sources. That the big release tarball is available can be a footnote in the Installation Guide (https://deploy-livedoc--sagemath.netlify.app/html/en/installation/source#installation-steps) for the limited no-internet connectivity use case.

 
B. Likewise, get rid of all of these "Download Sage source code" pages (https://www.sagemath.org/download-source.html, https://www.sagemath.org/download-latest.html), mirror selection, etc. from the Sage website. 

Tobia...@gmx.de

Feb 12, 2024, 10:44:19 PM
to sage-devel
+1 for both proposals.

Via "pip download" (https://pip.pypa.io/en/stable/cli/pip_download/) it is easy to resolve and download all pip packages on a system with an internet connection, and then later install them on the target system without internet access.
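
A minimal sketch of that two-step workflow, written as a helper that only builds the two pip command lines (it does not run them). The function name, the requirements file, and the directory are assumptions for illustration; the pip options used ("download --dest", "install --no-index --find-links") are the documented ones.

```python
import sys

def offline_install_commands(requirements, wheel_dir):
    """Return (download_cmd, install_cmd) as argument lists for subprocess.run.

    Hypothetical helper: step 1 runs on a machine with internet access,
    step 2 on the target machine after wheel_dir has been copied over.
    """
    pip = [sys.executable, "-m", "pip"]
    # Step 1: resolve dependencies and store all archives in wheel_dir.
    download = pip + ["download", "--dest", wheel_dir, "-r", requirements]
    # Step 2: install using only the local directory, never contacting PyPI.
    install = pip + ["install", "--no-index", "--find-links", wheel_dir,
                     "-r", requirements]
    return download, install
```

One caveat: by default "pip download" selects files for the platform and Python version of the machine it runs on, so the two machines should match, or the download step needs pip's platform-selection options.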

Matthias Koeppe

Feb 16, 2024, 6:28:47 PM
to sage-devel
On Monday, February 12, 2024 at 4:05:05 PM UTC-8 Dima Pasechnik wrote:
On Mon, Feb 12, 2024 at 11:52 PM Matthias Koeppe
<matthia...@gmail.com> wrote:
> If there are relevant use cases without internet connectivity (I have no opinion to offer on this), then the release tarball has exactly the right contents.

This won't be true any more if we allow standard packages to be pip packages.

That's correct.

Nils Bruin

Feb 16, 2024, 6:57:06 PM
to sage-devel
As far as I understand, the proposal is to allow sage "packages" to be closer to more standard python prerequisites by letting them be resolved by pip packages. By default the package content would be fetched, as pip does, and that would mean the default configuration for sage would require internet at install time.

I also understand that there are people who are concerned that this may not reflect all scenarios where people want to install sagemath and they would prefer if there is a clear method to install sagemath from a well-defined set of archives (one big one?) that need to be transferred to the target machine, after which the install can proceed without internet access.

Searching for "pip without internet" gives various hits. One that at least superficially looks like a reasonable starting point:


but there are also stackoverflow answers that look relevant.

It looks like, with a bit of work, pip can be convinced to look at local files to satisfy prerequisites and packages. Hence, if we keep that in mind it seems to me that having an archive of "pip packages" would be doable if we ensure pip gets used in a way that makes it easy to reconfigure the place to look for prereqs. Then it may be fairly easy to make an offline installable version of sagemath, either by packing a big tarball that includes the pip content or by making that available in a separate ball, with an easy switch (or perhaps we can configure pip to first look locally and then try the internet? Or the other way around? that it could transition gracefully between different ways of satisfying requirements).

So, perhaps we can have our pips and networkless installs too?

Matthias Koeppe

Feb 16, 2024, 9:06:14 PM
to sage-devel
On Friday, February 16, 2024 at 3:57:06 PM UTC-8 Nils Bruin wrote:
As far as I understand, the proposal is to allow sage "packages" to be closer to more standard python prerequisites by letting them be resolved by pip packages.

No, we already have such Sage packages: This is just one of the 4 existing package "source types" (https://deploy-livedoc--sagemath.netlify.app/html/en/developer/packaging#package-types) - "normal", "wheel", "pip", "script".

Most of our Python packages come from PyPI already. The difference is really (1) when we determine the version to be installed, and (2) if and how we distribute the tarball.

- "normal" packages are built from an sdist (tarball) retrieved from PyPI. 
- The version is set in the file package-version.txt, and the PyPI download URL ("upstream_url") and checksums are recorded in checksums.ini; see https://github.com/sagemath/sage/tree/develop/build/pkgs/numpy for an example. 
- The release manager's scripts download the package from the upstream_url and put it on the Sage mirrors. 
- If the package is standard, it is also included in the big release tarball. 
- If the package is standard and a stable release is being made, a GH Actions workflow also uploads the tarball as a Release Asset to GitHub (see https://github.com/sagemath/sage/releases/tag/10.2). 
- When users install Sage from git, Sage first tries to retrieve any normal package from the GitHub Release Assets, then from the Sage mirrors, then from the upstream_url. 
- When users install Sage from the big release tarball, standard normal packages have their sdists already in upstream, and only optional/experimental normal packages need to be retrieved.
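
The retrieval order in the list above can be sketched as a simple fallback loop. This is an illustration only, not Sage's actual download code; the function name and the fetcher callables are hypothetical.

```python
def fetch_first_available(name, sources):
    """Try each (label, fetch) pair in order and return (label, data).

    `sources` is an ordered list such as
    [("release assets", f1), ("mirrors", f2), ("upstream_url", f3)],
    where each hypothetical fetcher takes the package name and either
    returns the tarball bytes or raises OSError.
    """
    errors = []
    for label, fetch in sources:
        try:
            return label, fetch(name)
        except OSError as exc:
            # Remember why this source failed and fall through to the next.
            errors.append(f"{label}: {exc}")
    raise OSError("all sources failed: " + "; ".join(errors))
```

For a build from the release tarball, the same loop would simply have a "local upstream/ directory" source in first position.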

What Dima proposes here is to allow _standard_ Sage packages to be of "source type" "pip". 
 
By default the package content would be fetched, as pip does,

Not just as pip does, but by actually calling "pip" to contact PyPI.
 
and that would mean the default configuration for sage would require internet at install time.

That's right.
 

Kwankyu Lee

Feb 16, 2024, 9:26:32 PM
to sage-devel
 
By default the package content would be fetched, as pip does,

Not just as pip does, but by actually calling "pip" to contact PyPI.
 
and that would mean the default configuration for sage would require internet at install time.

That's right.

Then Dima's proposal implies assuming internet at install time. Right? 

I asked the same question before. But Dima denied it. Whence I got confused...

Matthias Koeppe

Feb 16, 2024, 10:13:05 PM
to sage-devel
On Friday, February 16, 2024 at 6:26:32 PM UTC-8 Kwankyu Lee wrote:
 
By default the package content would be fetched, as pip does,

Not just as pip does, but by actually calling "pip" to contact PyPI.
 
and that would mean the default configuration for sage would require internet at install time.

That's right.

Then Dima's proposal implies assuming internet at install time. Right? 

Yes.

But one can make "pip" work with some local directory for the packages it considers instead of using PyPI over the Internet:
We can use "pip install --no-index --find-links=/SOME/LOCAL/DIRECTORY ...". See https://pip.pypa.io/en/stable/cli/pip_install/#finding-packages

As all pip options can also be provided systematically via environment variables, we can also set "PIP_NO_INDEX=true" and "PIP_FIND_LINKS=/SOME/LOCAL/DIRECTORY" for the same effect. Then one does not need to change the invocations of pip.
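
A minimal sketch of the environment-variable variant: build an environment in which any pip invocation, unchanged, only looks at a local directory. The helper name is hypothetical; the two variable names are the ones given above.

```python
import os

def offline_pip_environment(wheel_dir, base_env=None):
    """Return a copy of the environment that makes pip use only wheel_dir.

    Hypothetical helper: pass the result as `env=` to subprocess.run when
    calling pip, so that the pip command line itself needs no changes.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PIP_NO_INDEX"] = "true"        # same effect as --no-index
    env["PIP_FIND_LINKS"] = wheel_dir   # same effect as --find-links=wheel_dir
    return env
```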

In fact, we already do exactly this in the Sage distribution for a slightly different purpose, namely when we build "normal" Python packages and "script" Python packages (= packages whose source trees are part of the repository, such as https://github.com/sagemath/sage/tree/develop/pkgs/sagemath-bliss), see https://github.com/sagemath/sage/blob/develop/build/pkgs/sagemath_objects/spkg-install.in#L3

We do this because, following modern Python build practices, we build most packages with "build isolation". The build-time prerequisites are not accessed from the normal Sage venv but are specifically installed in a temporary environment just for the build of the specific package. The prerequisites are installed from wheel files in venv/var/lib/sage/wheels/; this directory is referred to by the variable $SAGE_SPKG_WHEELS.

(Where do the wheel files in venv/var/lib/sage/wheels/ come from? Either (1) we have built them ahead of time and stored them there; or (2) they are platform-independent wheels and we have found them in the directory upstream/, downloaded them from GH Release assets, downloaded them from Sage mirrors, or the upstream_url (= PyPI).)

Nathan Dunfield

Feb 16, 2024, 11:44:06 PM
to sage-devel
Dima mentioned "tox" [1] as an example of a "standard" package that would benefit from being switched to a "pip" package.  The "tox" package is pure Python, so it could also be made a "wheel" package, which is already allowed for standard packages, for example [2].  I'm having difficulty understanding the practical differences between a "wheel" package and a "pip" package in this setting.  With "wheel", the wheel is downloaded from PyPI and put in upstream/ by various GH actions and put in the sage tarball and copied over to the sage mirrors, whereas with "pip" it is only downloaded by pip itself when an end-user builds Sage.  But in terms of developer effort, the only difference I see between "wheel" and "pip" is that the former has a few extra checksums, compare [2] and [3].  What distinctions am I missing?  Is it that a "wheel" must be pinned to a specific release on PyPI whereas "pip" can specify a range?

Best,

Matthias Koeppe

Feb 17, 2024, 12:17:37 AM
to sage-devel
On Friday, February 16, 2024 at 8:44:06 PM UTC-8 Nathan Dunfield wrote:
Dima mentioned "tox" [1] as an example of a "standard" package that would benefit from being switched to a "pip" package.  The "tox" package is pure Python, so it could also be made a "wheel" package, which is already allowed for standard packages, for example [2]. 

Yes, in fact, tox and its dependencies have already been "wheel" packages, see https://github.com/sagemath/sage/blob/develop/build/pkgs/tox/checksums.ini
 
I have been switching many packages from "normal" to "wheel", which has reduced the complexity of the Sage distribution, as wheel packages have no installation scripts -- and also no build dependencies. The latter was crucial -- as we decided to install JupyterLab components from the pre-built wheels, which eliminated the complexity of Javascript build infrastructure from the Sage distribution, https://github.com/sagemath/sage/pull/36129

For wheel packages, it's all just metadata and the copied-over package README (which we need for building our reference manual).

I'm having difficulty understanding the practical differences between a "wheel" package and a "pip" package in this setting. 

With "wheel", the wheel is downloaded from PyPI and put in upstream/ by various GH actions and put in the sage tarball and copied over to the sage mirrors, whereas with "pip" it is only downloaded by pip itself when an end-user builds Sage.  But in terms of developer effort, the only difference I see between "wheel" and "pip" is that the former has a few extra checksums, compare [2] and [3].
 
  What distinctions am I missing?  Is it that a "wheel" must be pinned to a specific release on PyPI whereas "pip" can specify a range?

If one does not care about the use case without internet access, then it's just the following:
- Pinning, as you mentioned (see also https://groups.google.com/g/sage-devel/c/5kmxaw105lg/m/9rF77fvFAAAJ above, where I discussed some details of this, including risks of leaving packages unpinned)
- Dependencies: "pip" packages can pull some of their build-time and run-time dependencies directly from PyPI, without us mirroring these dependencies in SageMath metadata. That's a mild convenience for developers, of importance if one wants to leave the version range wide open; but also has risks of instability.

Obviously, what is costly or inconvenient for developers depends a lot on the tooling that is available. I can elaborate on this if there's interest.

Dima Pasechnik

Feb 17, 2024, 6:09:25 AM
to sage-...@googlegroups.com

Dima Pasechnik

Feb 17, 2024, 6:15:44 AM
to sage-...@googlegroups.com


On 17 February 2024 02:26:32 GMT, Kwankyu Lee <ekwa...@gmail.com> wrote:
>
>
>
>
there are ways to use pip without internet, with the necessary wheels pre-fetched.
That's what Sage does with wheel packages. The difference between wheel packages and pip packages is that the latter don't require pre-fetched wheels and remove the need for package (micro)management.

>

Kwankyu Lee

Feb 17, 2024, 10:01:15 AM
to sage-devel
there are ways to use pip without internet, with the necessary wheels pre-fetched.
That's what Sage does with wheel packages.

Yes. This is a sage package of source type "wheel", as Matthias explained.
 
The difference between wheel packages and pip packages is that the latter don't require pre-fetched wheels and remove the need for package (micro)management.

So your proposal is to let a standard package be installed by pip via the internet. Hence your proposal suggests breaking the rule that sage-the-distribution can be installed without an internet connection. If so, breaking the rule (that is, assuming an internet connection at install time) is the more substantial content of your proposal.


Nathan Dunfield

Feb 17, 2024, 10:06:27 AM
to sage-devel
On Friday, February 16, 2024 at 11:17:37 PM UTC-6 Matthias Koeppe wrote:
If one does not care about the use case without internet access, then it's just the following:
- Pinning, as you mentioned (see also https://groups.google.com/g/sage-devel/c/5kmxaw105lg/m/9rF77fvFAAAJ above, where I discussed some details of this, including risks of leaving packages unpinned)
- Dependencies: "pip" packages can pull some of their build-time and run-time dependencies directly from PyPI, without us mirroring these dependencies in SageMath metadata. That's a mild convenience for developers, of importance if one wants to leave the version range wide open; but also has risks of instability.

Matthias, thanks for the clarification.  I think pinning the version of a "standard" package, including all its dependencies and down to the minor release, is likely the best approach.  Based on my experience with snappy [1], not pinning things will result in CI runs failing "out of the blue" because one of the dependencies got updated.  With a small project like snappy, this is pretty occasional and serves as a way to flag issues with new upstream releases, but with Sage my guess is that such failures would be frequent.   Suppose that each time the CI runs on a new PR, there's a 10% chance of it failing because some completely unrelated dependency shifted; that would be a major annoyance to seasoned Sage developers and very discouraging to newcomers.  Now for a smaller package without (many?) dependencies, one can probably get away with pinning just down to the major release, but the benefit of doing that is pretty marginal.

So I am against allowing standard packages to be "pip" packages --- the "wheel" approach seems like the right tradeoff here as (a) the final package is sourced straight off PyPI so we don't have to build it (b) the result is completely stable.

Best,

Nathan
  

Matthias Koeppe

Feb 17, 2024, 12:16:07 PM
to sage-devel
On Saturday, February 17, 2024 at 7:06:27 AM UTC-8 Nathan Dunfield wrote:
On Friday, February 16, 2024 at 11:17:37 PM UTC-6 Matthias Koeppe wrote:
If one does not care about the use case without internet access, then it's just the following:
- Pinning, as you mentioned (see also https://groups.google.com/g/sage-devel/c/5kmxaw105lg/m/9rF77fvFAAAJ above, where I discussed some details of this, including risks of leaving packages unpinned)
- Dependencies: "pip" packages can pull some of their build-time and run-time dependencies directly from PyPI, without us mirroring these dependencies in SageMath metadata. That's a mild convenience for developers, of importance if one wants to leave the version range wide open; but also has risks of instability.

Matthias, thanks for the clarification.  I think pinning the version of a "standard" package, including all its dependencies and down to the minor release, is likely the best approach.  Based on my experience with snappy [1], not pinning things will result in CI runs failing "out of the blue" because one of the dependencies got updated.  With a small project like snappy, this is pretty occasional and serves as a way to flag issues with new upstream releases, but with Sage my guess is that such failures would be frequent.   Suppose that each time the CI runs on a new PR, there's a 10% chance of it failing because some completely unrelated dependency shifted; that would be a major annoyance to seasoned Sage developers and very discouraging to newcomers.

Thanks a lot, Nathan, for this point.
I share the same concern based on the amplification of the failure probability, due to the large number of dependencies in Sage.
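
The amplification can be made concrete with a back-of-the-envelope calculation. Under the simplifying assumption (mine, not measured) that each of n unpinned dependencies independently ships a breaking release between two CI runs with the same small probability p, the chance that at least one of them breaks the run is 1 - (1 - p)^n.

```python
def prob_any_breaks(p, n):
    """Probability that at least one of n dependencies breaks.

    Assumes independent dependencies with a common per-dependency
    break probability p; both are illustrative simplifications.
    """
    return 1.0 - (1.0 - p) ** n
```

With purely hypothetical numbers, p = 0.001 and n = 300 give a failure chance of about 0.26, even though each individual dependency is very unlikely to break.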

I'll note that recently we have merged PR https://github.com/sagemath/sage/pull/35986 by Tobias Diez, which implements full version pinning for our conda-forge installation methods. See https://github.com/sagemath/sage/blob/develop/src/environment-dev-3.11-macos-arm64.yml#L329 -- pytest is fully pinned down to the build hash. This change was motivated specifically by the fast changes of conda-forge (as a rolling distribution), which was one major cause for the instability of the CI Conda workflow.

The proposed policy change to allow standard packages to be "pip" packages would run in the opposite direction.

Dima Pasechnik

Feb 17, 2024, 1:14:54 PM
to sage-...@googlegroups.com
Once again: my proposal does not address the question of how one can pack everything for an offline install.

We suspect the demand for this feature does not exist (while the requirements for it are gone long ago), but there are options to create a fully offline installer readily available.

Dima
>
>

Dima Pasechnik

Feb 17, 2024, 2:13:33 PM
to sage-...@googlegroups.com
My proposal is in fact aimed at reducing the number of pinned Sage dependencies, drastically.

Because most of them are either dependencies of Jupyterlab, or of Sphinx, or of the Python build system, and none of them should be Sage's concern to package, with all their dependencies.

If you itch to pack the said dependencies, please do it in a separate repo/PyPI package, which can be consumed by sagelib to get the desired pinned dependencies (and test all this in the existing CI, why not?)
But stop tying them up with sagelib - which in effect forces people interested in sagelib to slave away on packaging 300 dependencies, most of which aren't even tested by CI in any way, besides building.

Please liberate sagelib from the cabal of the frontend, etc.

Sagemath is not a distro - no sane distro puts everything in one flat directory structure.
Sagemath is an insane pile of needlessly vendored packages.

Matthias Koeppe

Feb 17, 2024, 2:46:08 PM
to sage-devel
On Saturday, February 17, 2024 at 11:13:33 AM UTC-8 Dima Pasechnik wrote:
On 17 February 2024 17:16:07 GMT, Matthias Koeppe <matthia...@gmail.com> wrote:
>I share the same concern based on the amplification of the failure
>probability, due to the large number of dependencies in Sage.

My proposal is in fact aimed at reducing the number of pinned Sage dependencies, drastically.

As you seem to be responding to what I wrote there, I have to point out that it is the *unpinning*, not the *pinning*, that is the concern for instability that Nathan and I share.

Because most of them are either dependencies of Jupyterlab, or of Sphinx, or of the Python build system, and none of them should be Sage's concern to package, with all their dependencies.

If you itch to pack the said dependencies, please do it in a separate repo/PyPI package, which can be consumed by sagelib to get the desired pinned dependencies (and test all this in the existing CI, why not?)
But stop tying them up with sagelib - which in effect forces people interested in sagelib to slave away on packaging 300 dependencies, most of which aren't even tested by CI in any way, besides building.

Please liberate sagelib from the cabal of the frontend, etc.

Sagemath is not a distro - no sane distro puts everything in one flat directory structure.
Sagemath is an insane pile of needlessly vendored packages.

Wow, there's so much to unpack here... 

Nathan Dunfield

Feb 17, 2024, 6:04:49 PM
to sage-devel
On Saturday, February 17, 2024 at 1:13:33 PM UTC-6 Dima Pasechnik wrote:
My proposal is in fact aimed at reducing the number of pinned Sage dependencies, drastically.

Because most of them are either dependencies of Jupyterlab, or of Sphinx, or of the Python build system, and none of them should be Sage's concern to package, with all their dependencies.

In my experience, it's particularly important to pin build dependencies.  Most of the "out of the blue" CI failures we've seen with "snappy" have been caused by new versions of build dependencies, especially Cython.  (I see that Cython was also one of the motivations for the "conda-lock" scheme of https://github.com/sagemath/sage/pull/35986 )

If you itch to pack the said dependencies, please do it in a separate repo/PyPI package, which can be consumed by sagelib to get the desired pinned dependencies (and test all this in the existing CI, why not?)
But stop tying them up with sagelib - which in effect forces people interested in sagelib to slave away on packaging 300 dependencies, most of which aren't even tested by CI in any way, besides building.

It seems to me that the "wheel" type Sage packages, each of which is primarily just the version number of a file on PyPI and its hash, is like a "requirements.txt" file (or "conda-lock" file, for that matter) spread over multiple directories.  Personally, I don't view that as packaging a dependency, but rather saving some metadata to aid reliability/reproducibility.

Best,

Nathan

Matthias Koeppe

Feb 17, 2024, 6:31:43 PM
to sage-devel
On Saturday, February 17, 2024 at 3:04:49 PM UTC-8 Nathan Dunfield wrote:
It seems to me that the "wheel" type Sage packages, each of which is primarily just the version number of a file on PyPI and its hash, is like a "requirements.txt" file (or "conda-lock" file, for that matter) spread over multiple directories.  Personally, I don't view that as packaging a dependency, but rather saving some metadata to aid reliability/reproducibility.

I agree with this viewpoint.
I'll note that in addition to aiding reliability/reproducibility, the metadata in build/pkgs is also important for discoverability and attribution.

Of course one could point out that our format is relatively verbose, as it is spread over several plain-text files -- for simplicity of access with shell scripts. Our format dates back to at least 2015, see the sage-devel thread "Is there an online index of the standard packages shipped in Sage?" (https://groups.google.com/g/sage-devel/c/aEmUmFOwJYQ/m/4pmmYrt3nXQJ) Other distributions such as Pyodide (https://github.com/pyodide/pyodide/tree/main/packages) use contemporary structured datafile formats such as yaml instead. But it is probably safe to say that most Sage users and even most Sage developers do not look at the contents of the build/pkgs/ directory. 

Dima Pasechnik

Feb 18, 2024, 7:26:28 AM
to sage-...@googlegroups.com


On 17 February 2024 23:04:49 GMT, Nathan Dunfield <nat...@dunfield.info> wrote:
>On Saturday, February 17, 2024 at 1:13:33 PM UTC-6 Dima Pasechnik wrote:
>
>My proposal is in fact aimed at reducing the number of pinned Sage
>dependencies, drastically.
>
>Because most of them are either dependencies of Jupyterlab, or of Sphinx,
>or of the Python build system, and none of them should be Sage's concern to
>package, with all their dependencies.
>
>
>In my experience, it's particularly important to pin build dependencies.
>Most of the "out of the blue" CI failures we've seen with "snappy" have
>been caused by new versions of build dependencies, especially Cython.


My proposal does not say that we must convert
all the standard python packages into pip packages.


I cannot imagine CI breaking down by, say, pytest.
Surely once every few years one might get a serious incompatibility, but not in 10% of CI runs, or in even 1%.

Besides, you can pin down or limit the version of a pip package, just as well. E.g. pin down the version of Cython. But leave its dependencies out of Sage, as much as possible.

Dima Pasechnik

Feb 18, 2024, 7:40:35 AM
to sage-...@googlegroups.com


On 17 February 2024 23:31:43 GMT, Matthias Koeppe <matthia...@gmail.com> wrote:
>On Saturday, February 17, 2024 at 3:04:49 PM UTC-8 Nathan Dunfield wrote:
>
>It seems to me that the "wheel" type Sage packages, each of which is
>primarily just the version number of a file on PyPI and its hash, is like a
>"requirements.txt" file (or "conda-lock" file, for that matter) spread over
>multiple directories. Personally, I don't view that as packaging a
>dependency, but rather saving some metadata to aid
>reliability/reproducibility.
>
>
>I agree with this viewpoint.
>I'll note that in addition to aiding reliability/reproducibility, the
>metadata in build/pkgs is also important for discoverability and
>attribution.

Merely pinning down the versions doesn't magically bring you reproducibility, unless you are also willing to pin down the OS version and the hardware. It often just doesn't make sense.
E.g. try to build Sage 8 or 9 on a modern machine with a recent OS. Chances are it won't work, despite you knowing all the versions.

Besides, to create a pinning of all the versions of Python packages, just run the appropriate pip command, it will produce a full list of all the versions, ready to be used to reproduce the environment. No need to maintain these pinnings by hand.
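
[Editorial illustration] The command alluded to here is presumably "pip freeze". As a rough sketch of what such a listing contains, the following uses only the standard library's importlib.metadata to print one "name==version" line per installed distribution. The function name freeze_lines is made up for this note, and the real "pip freeze" handles more cases (editable installs, direct URLs), so treat this as an approximation rather than a replacement.

```python
from importlib.metadata import distributions


def freeze_lines() -> list[str]:
    """Return 'name==version' lines for the distributions this interpreter can see."""
    pins = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip broken installs with incomplete metadata
            pins[name.lower()] = f"{name}=={version}"
    # Sort case-insensitively by project name, similar to pip's output.
    return [pins[key] for key in sorted(pins)]


if __name__ == "__main__":
    print("\n".join(freeze_lines()))
```

Redirecting that output to a file gives something usable as a fully pinned requirements file for the same platform and Python version.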

Yes, metadata is important, it's just make-work to maintain it manually. We don't need to carry out this make-work.

Nathan Dunfield

Feb 18, 2024, 10:51:27 AM
to sage-devel
On Sunday, February 18, 2024 at 6:26:28 AM UTC-6 Dima Pasechnik wrote:
I cannot imagine CI breaking down by, say, pytest.

I can definitely see that happening, and indeed it seems to have done so for other projects:

https://github.com/pytest-dev/pytest/issues/9765
https://github.com/pytest-dev/pytest/issues/11983

Besides, you can pin down or limit the version of a pip package, just as well. E.g. pin down the version of Cython. But leave its dependencies out of Sage, as much as possible.

Leaving out dependencies doesn't eliminate complexity, it just hides it.  That doesn't seem like an improvement to me.  Plus, some packages are dependencies of multiple other packages, and pip's ability to find a version of said dependency that all other packages will tolerate is not the greatest in my experience.

Best,

Nathan



Dima Pasechnik

Feb 18, 2024, 12:07:04 PM
to sage-...@googlegroups.com


On 18 February 2024 15:51:27 GMT, Nathan Dunfield <nat...@dunfield.info> wrote:
>On Sunday, February 18, 2024 at 6:26:28 AM UTC-6 Dima Pasechnik wrote:
>
>I cannot imagine CI breaking down by, say, pytest.
>
>
>I can definitely see that happening, and indeed it seems to have done so
>for other projects:
>
>https://github.com/pytest-dev/pytest/issues/9765
>https://github.com/pytest-dev/pytest/issues/11983
>

Well, yes, a major pytest version jump might be an issue. We know this well, so what?

>
>Besides, you can pin down or limit the version of a pip package, just as
>well. E.g. pin down the version of Cython. But leave its dependencies out
>of Sage, as much as possible.
>
>
>Leaving out dependencies doesn't eliminate complexity, it just hides it.

Why would two dependencies of pytest*, and of pytest* only, need to enter the Sage codebase? No, there is no point in it; on the contrary, it means more make-work, more resources, longer build times, etc.

>That doesn't seem like an improvement to me.

well:

1) you can even just get a binary wheel of pytest installed - it is very fast, and robust. Just like you do with pytest being optional. There is no need for any extra deps getting in.

2) The major improvement is that sagelib will be easier to install into an existing venv, and that's a wish of quite a number of users. Much more Pythonic, too.


> Plus, some packages are
>dependencies of multiple other packages, and pip's ability to find a
>version of said dependency that all other packages will tolerate is not the
>greatest in my experience.

With Sage unnecessarily pinning down a lot of package versions (and simply vendoring them), you cannot expect pip to do too well. We just have to let go of many dependencies which are, needlessly, part of Sage; then pip will be able to do the job well.

Dima


>
>Best,
>
>Nathan
>
>
>

Matthias Koeppe

Feb 18, 2024, 12:15:49 PM
to sage-devel
On Sunday, February 18, 2024 at 4:40:35 AM UTC-8 Dima Pasechnik wrote:
On 17 February 2024 23:31:43 GMT, Matthias Koeppe <matthia...@gmail.com> wrote:
>On Saturday, February 17, 2024 at 3:04:49 PM UTC-8 Nathan Dunfield wrote:
>"wheel" type Sage packages, each of which is
>primarily just the version number of a file on PyPI and its hash, is like a
>"requirements.txt" file (or "conda-lock" file, for that matter) spread over
>multiple directories. Personally, I don't view that as packaging a
>dependency, but rather saving some metadata to aid
>reliability/reproducibility.
>
>I'll note that in addition to aiding reliability/reproducibility, the
>metadata in build/pkgs is also important for discoverability and
>attribution.

Merely pinning down the versions doesn't magically bring you reproducibility, unless you are also willing to pin down the OS version, and the hardware. [...]

This is an instance of the "all or nothing" fallacy, and simultaneously a "straw man" fallacy (note that both Nathan and I said it "aids" reproducibility etc.)

Besides, to create a pinning of all the versions of Python packages, just run the appropriate pip command, it will produce a full list of all the versions, ready to be used to reproduce the environment. No need to maintain these pinnings by hand.

Yes, metadata is important, it's just make-work to maintain it manually. We don't need to carry out this make-work.

Exactly, it matters what tooling is available. And every single little improvement of our tooling will likely have more value than this entire thread.

Yes, we don't want to maintain the metadata "manually". 

Yes, "pip freeze" will output the current versions of installed Python packages, to be saved as a requirements.txt file. 
What's missing is the tooling that would feed this version information back to our version files. That's wishlist item https://github.com/sagemath/sage/issues/37314, estimated effort: 15 minutes of work. Any takers?
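
[Editorial illustration] A minimal sketch of the kind of tooling described here, not the implementation requested in the linked issue. It assumes that each package directory under build/pkgs holds its version in a file named package-version.txt, and that directory names are the lowercased PyPI names with "-" replaced by "_". Both are assumptions made for this note, as are the function names. Only plain "name==version" lines are handled.

```python
from pathlib import Path


def parse_pins(requirements_text: str) -> dict[str, str]:
    """Extract {directory_name: version} from 'name==version' lines; skip the rest."""
    pins = {}
    for raw in requirements_text.splitlines():
        # Drop comments and environment markers before looking at the pin.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line or "===" in line or "@" in line or line.startswith("-"):
            continue
        name, version = line.split("==", 1)
        # Assumed naming convention for package directories (see lead-in).
        pins[name.strip().lower().replace("-", "_")] = version.strip()
    return pins


def update_version_files(pins: dict[str, str], pkgs_dir: Path) -> list[str]:
    """Rewrite the version file of every already existing package directory."""
    updated = []
    for name, version in sorted(pins.items()):
        version_file = pkgs_dir / name / "package-version.txt"
        if version_file.is_file() and version_file.read_text().strip() != version:
            version_file.write_text(version + "\n")
            updated.append(name)
    return updated
```

Packages without an existing directory are deliberately left alone, so the sketch never creates new package entries. Checksums would still have to be refreshed separately.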

Matthias Koeppe

Feb 18, 2024, 12:24:44 PM
to sage-devel
On Sunday, February 18, 2024 at 9:07:04 AM UTC-8 Dima Pasechnik wrote:
1) you can even just get a binary wheel of pytest installed - it is very fast, and robust.

Yes, that's what my PR https://github.com/sagemath/sage/pull/37301 does. It installs pytest as a "wheel" package.

Whether you install a package from an sdist or from a wheel, you still have the same runtime dependencies ("install-requires"). 
What goes away when you use a wheel is only the build-time dependencies ("build-system requires").
  
2) The major improvement is that sagelib will be easier to install into an existing venv, and that's a wish of quite a number of users. Much more Pythonic, too.

The pip-installability of sagelib has absolutely nothing to do with this discussion.

Dima Pasechnik

Feb 18, 2024, 1:05:21 PM
to sage-...@googlegroups.com
On Sun, Feb 18, 2024 at 5:24 PM Matthias Koeppe <matthia...@gmail.com> wrote:
On Sunday, February 18, 2024 at 9:07:04 AM UTC-8 Dima Pasechnik wrote:
1) you can even just get a binary wheel of pytest installed - it is very fast, and robust.

Yes, that's what my PR https://github.com/sagemath/sage/pull/37301 does. It installs pytest as a "wheel" package.
There are wheels and wheels.
Binary wheels don't need any building; Sage's wheel packages still do building from source - in case the package has C extensions, possibly cythonizing or running a similar build process involving compilation/linking.
The wheel you talk about is just another packaging of a source package, isn't it?

Just the other day we saw how one can very well install pyscipopt, an (optional) Sage package,
from a binary wheel, without building (because Sage's Cython is too new for pyscipopt, one can't at present build it from source).
Doing the same with e.g. scipy will shave off quite a bit of build time (but for this scipy needs to become a pip package).

 
Whether you install a package from an sdist or from a wheel, you still have the same runtime dependencies ("install-requires"). 
What goes away when you use a wheel is only the build-time dependencies ("build-system requires").
  
2) The major improvement is that sagelib will be easier to install into an existing venv, and that's a wish of quite a number of users. Much more Pythonic, too.

The pip-installability of sagelib has absolutely nothing to do with this discussion.

Of course it does, a lot. Having fewer pinned deps will model installability and usability of sagelib in a "foreign" venv. At the moment using sagelib in a foreign venv is complicated and error-prone, and untested.

And I should also say that fewer deps will make it easier for downstreams to package sage/sagelib.

 

--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/sage-devel/3299fe86-fa67-4830-9f3c-1386235c37c8n%40googlegroups.com.

Matthias Koeppe

Feb 18, 2024, 1:09:39 PM
to sage-devel
On Sunday, February 18, 2024 at 10:05:21 AM UTC-8 Dima Pasechnik wrote:
On Sun, Feb 18, 2024 at 5:24 PM Matthias Koeppe <matthia...@gmail.com> wrote:
On Sunday, February 18, 2024 at 9:07:04 AM UTC-8 Dima Pasechnik wrote:
1) you can even just get a binary wheel of pytest installed - it is very fast, and robust.

Yes, that's what my PR https://github.com/sagemath/sage/pull/37301 does. It installs pytest as a "wheel" package.
There are wheels and wheels.
Binary wheels don't need any building; Sage's wheel packages still do building from source - in case the package has C extensions, possibly cythonizing or running a similar build process involving compilation/linking.

 
The wheel you talk about is just another packaging of a source package, isn't it?

No.

Dima Pasechnik

Feb 18, 2024, 1:12:01 PM
to sage-...@googlegroups.com
On Sun, Feb 18, 2024 at 5:15 PM Matthias Koeppe <matthia...@gmail.com> wrote:
On Sunday, February 18, 2024 at 4:40:35 AM UTC-8 Dima Pasechnik wrote:
On 17 February 2024 23:31:43 GMT, Matthias Koeppe <matthia...@gmail.com> wrote:
>On Saturday, February 17, 2024 at 3:04:49 PM UTC-8 Nathan Dunfield wrote:
>"wheel" type Sage packages, each of which is
>primarily just the version number of a file on PyPI and its hash, is like a
>"requirements.txt" file (or "conda-lock" file, for that matter) spread over
>multiple directories. Personally, I don't view that as packaging a
>dependency, but rather saving some metadata to aid
>reliability/reproducibility.
>
>I'll note that in addition to aiding reliability/reproducibility, the
>metadata in build/pkgs is also important for discoverability and
>attribution.

Merely pinning down the versions doesn't magically bring you reproducibility, unless you are also willing to pin down the OS version, and the hardware. [...]

This is an instance of the "all or nothing" fallacy, and simultaneously a "straw man" fallacy (note that both Nathan and I said it "aids" reproducibility etc.)

I don't know where you see fallacies. You are just not willing to view the existing Sage tooling through a critical lens, IMHO.

Besides, to create a pinning of all the versions of Python packages, just run the appropriate pip command, it will produce a full list of all the versions, ready to be used to reproduce the environment. No need to maintain these pinnings by hand.

Yes, metadata is important, it's just make-work to maintain it manually. We don't need to carry out this make-work.

Exactly, it matters what tooling is available. And every single little improvement of our tooling will likely have more value than this entire thread.

The value is in the eye of beholder, if you like.
The existing tooling is too rigid, my proposal aims to improve it by making it more flexible.
 

Yes, we don't want to maintain the metadata "manually". 

Yes, "pip freeze" will output the current versions of installed Python packages, to be saved as a requirements.txt file. 
What's missing is the tooling that would feed this version information back to our version files.

Why do we need "our version files"? We need input to pip, and "pip freeze" can generate such input.
 
 
That's wishlist item https://github.com/sagemath/sage/issues/37314, estimated effort: 15 minutes of work. Any takers?


Dima Pasechnik

Feb 18, 2024, 1:14:58 PM
to sage-...@googlegroups.com
Well, I might have used incorrect terminology, but our wheels are always "build from source" wheels.
 


Nathan Dunfield

Feb 18, 2024, 1:57:18 PM
to sage-devel
On Sunday, February 18, 2024 at 12:14:58 PM UTC-6 Dima Pasechnik wrote:
 
The wheel you talk about is just another packaging of a source package, isn't it?

No.

Well, I might have used incorrect terminology, but our wheels are always "build from source" wheels.

No, many pure Python wheels are simply downloaded in their final form from PyPI, for example:


Best,

Nathan
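
[Editorial illustration] The difference between a pure Python wheel and a platform-specific one is visible in the wheel's file name, whose last three dash-separated fields are the Python tag, the ABI tag and the platform tag. The sketch below reads those tags and treats "none" plus "any" as pure Python. The two file names are invented for this note and do not refer to real packages or to the links missing from the messages.

```python
def wheel_tags(filename: str) -> tuple[str, str, str]:
    """Return (python_tag, abi_tag, platform_tag) from a wheel file name."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # name-version[-build]-python-abi-platform gives five or six fields.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag, abi_tag, platform_tag


def is_pure_python(filename: str) -> bool:
    """A wheel with ABI 'none' and platform 'any' contains no compiled code."""
    _, abi_tag, platform_tag = wheel_tags(filename)
    return abi_tag == "none" and platform_tag == "any"


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for name in ("purepkg-1.0-py3-none-any.whl",
                 "extpkg-1.0-cp311-cp311-manylinux_2_17_x86_64.whl"):
        print(name, "->", "pure Python" if is_pure_python(name) else "platform-specific")
```

A pure Python wheel can be installed as downloaded on any platform, while a package with compiled extensions needs one wheel per supported platform and Python ABI.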

Dima Pasechnik

Feb 18, 2024, 2:59:41 PM
to sage-...@googlegroups.com
OK, I am sorry, I've hallucinated a non-existing format for distributing Python packages.
(or perhaps I saw somewhere a package which has both a pure-Python wheel, and platform-specific wheels
with optional  binary extensions)


Wheels are platform-specific, and thus non-pure-Python packages need to have platform-specific wheels,
see  e.g.




Matthias Koeppe

Feb 18, 2024, 6:47:26 PM
to sage-devel
On Sunday, February 18, 2024 at 10:05:21 AM UTC-8 Dima Pasechnik wrote:
On Sun, Feb 18, 2024 at 5:24 PM Matthias Koeppe <matthia...@gmail.com> wrote:
On Sunday, February 18, 2024 at 9:07:04 AM UTC-8 Dima Pasechnik wrote:
  
2) The major improvement is that sagelib will be easier to install into an existing venv, and that's a wish of quite a number of users. Much more Pythonic, too.

The pip-installability of sagelib has absolutely nothing to do with this discussion.

Of course it does, a lot. Having fewer pinned deps will model installability and usability of sagelib in a "foreign" venv. At the moment using sagelib in a foreign venv is complicated and error-prone, and untested.

The dependencies of sagelib are declared in the template https://github.com/sagemath/sage/blob/develop/src/setup.cfg.m4#L13, which is filled with the data from the "install-requires.txt" files (such as https://github.com/sagemath/sage/blob/develop/build/pkgs/cysignals/install-requires.txt).
These files provide version ranges. They do NOT use the specific versions pinned in the Sage distribution.
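
[Editorial illustration] The point that a version range admits the pinned version, along with others, can be sketched with a deliberately simplified checker. It understands only dotted numeric versions and comma-separated clauses such as ">=1.10.2,<2". Real tooling would use the third-party "packaging" library (packaging.specifiers.SpecifierSet), which implements the full rules. The version numbers in the usage lines are hypothetical.

```python
import operator
import re

OPERATORS = {
    ">=": operator.ge, "<=": operator.le, "==": operator.eq,
    "!=": operator.ne, ">": operator.gt, "<": operator.lt,
}
CLAUSE = re.compile(r"\s*(>=|<=|==|!=|>|<)\s*(\d+(?:\.\d+)*)\s*")


def version_key(version: str) -> tuple[int, ...]:
    """Turn '1.10.0' into (1, 10); trailing zeros are dropped so 1.10 == 1.10.0."""
    parts = [int(part) for part in version.split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def satisfies(version: str, spec: str) -> bool:
    """Check a plain release version against every clause of a range."""
    candidate = version_key(version)
    for clause in spec.split(","):
        match = CLAUSE.fullmatch(clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, bound = match.groups()
        if not OPERATORS[op](candidate, version_key(bound)):
            return False
    return True


if __name__ == "__main__":
    declared_range = ">=1.10.2"   # hypothetical install-requires entry
    pinned_version = "1.11.4"     # hypothetical version pinned in the distribution
    print(satisfies(pinned_version, declared_range))
    print(satisfies("1.12.0", declared_range))  # a range admits other versions too
```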

I don't think it's necessary to comment much on the idea that we should change our production environment (unpinning the pinned versions of the Sage distribution) to make it "model installability and usability of sagelib" in a less controlled environment.

Testing sagelib (the sagemath-standard distribution package) in different environments is not "complicated" -- because I've built the convenient tooling for that. One does not have to make changes to the Sage distribution for that. See https://github.com/sagemath/sage/blob/develop/pkgs/sagemath-standard/tox.ini#L18 for the different options. See also the developer's guide https://deploy-livedoc--sagemath.netlify.app/html/en/developer/packaging_sage_library#testing-the-distribution-in-virtual-environments-with-tox, where all of this is documented.


Tobia...@gmx.de

Feb 19, 2024, 8:42:08 AM
to sage-devel
This discussion about the need to fix the version of pytest and its runtime dependencies is almost comical. We have been installing and running pytest successfully via pip in CI for 3 years, without any version requirement, and have experienced zero issues. We are also not alone in that. For example, scipy also doesn't pin any pytest version (https://github.com/search?q=repo%3Ascipy%2Fscipy%20pytest%20lang%3Ayml&type=code). Can you name a single project that needed to pin one of the runtime dependencies of pytest because of a bug they experienced?

Matthias Koeppe

Feb 19, 2024, 12:16:39 PM
to sage-devel
On Monday, February 19, 2024 at 5:42:08 AM UTC-8 Tobia...@gmx.de wrote:
This discussion about the need to fix the version of pytest and its runtime dependencies is almost comical.

No, you are in the wrong thread. 

This thread is about the general policy for standard packages, not about pytest. 

Dima Pasechnik

Feb 19, 2024, 1:37:32 PM
to sage-...@googlegroups.com
Come on! Tobias is in the right thread.
pytest is a good example of a standard package which can well be made a pip package, that's why it's mentioned here.
 


John H Palmieri

Feb 19, 2024, 2:42:01 PM
to sage-devel
This (A and B below) has the advantage of being quite explicit. The original proposal

1) allow standard packages to be pip packages

2) drop the contents of upstream/ from the Sage source tarballs.

sounds explicit, but the more the discussion goes on, the more I feel that there are hidden pieces. Does the proposal also mean removing version restrictions and all of the other claimed maintenance burden for various components of Sage?

Regarding item (2): if I clone the github repository, there is no upstream directory at the start, but after building Sage, it ends up being almost as large as in the current tarballs. (This is on OS X with a lot of homebrew packages installed.) So how much savings are we actually talking about? (Maybe it's not savings for the end user that are important, so *what* are we saving? Disk space on the mirrors?) Dima, can you please provide data? If we convert (according to (1)) to pip packages, those still need to be downloaded, and while they may not end up in "upstream" — I don't actually know how they work — don't they still take up disk space? So again, how much savings are we talking about? Please provide data.

By the way, the git clone yields a package that is 616M on my machine. A good chunk of that is the .git directory. Are you proposing that we do not distribute this? (A recent beta tarball is 1.4G, unpacked 1.6G.)

Regarding item (1): can you provide a list of packages that would become pip packages? Or describe how you would come up with a list?


On Monday, February 12, 2024 at 3:52:29 PM UTC-8 Matthias Koeppe wrote:
I'll now offer:

Opinion 1. Nobody needs to care in the slightest what the size of that release tarball is. 

In any use cases with internet connectivity, people will be better off by just cloning the git repo, not use the release tarball.

If there are relevant use cases without internet connectivity (I have no opinion to offer on this), then the release tarball has exactly the right contents.

Proposed action items: 
A. Change https://github.com/sagemath/sage/blob/develop/README.md so that "git clone" is described as the primary way to obtain the Sage sources. That the big release tarball is available can be a footnote in the Installation Guide (https://deploy-livedoc--sagemath.netlify.app/html/en/installation/source#installation-steps) for the limited no-internet connectivity use case.

B. Likewise, get rid of all of these "Download Sage source code" pages (https://www.sagemath.org/download-source.html, https://www.sagemath.org/download-latest.html), mirror selection, etc. from the Sage website. 


On Sunday, February 11, 2024 at 11:23:42 AM UTC-8 Dima Pasechnik wrote:
Currently the standard packages cannot be pip packages, i.e. we must, in effect, vendor them. This entails an extra effort which is often not needed, in particular as we patch only very few Python packages.
Pip packages are on the other hand installed straight from PyPI.

Good examples of standard packages which can become pip ones are tox, pytest (not yet standard).


The other difference is that by default these packages are not included in the Sage releases source tarball.

Rather than adding them there I propose to split the upstream/* part of the tarball into something optional - which is represented by a list of files to download, and which is just not needed if you build while connected to the internet.

This is a huge saving on the tarball size: with upstream/* in, Sage 10.2 tarball is 1.3Gb, and without it is smaller than 0.25Gb.

Note that as William writes, the desire to have Sage buildable without an internet connection was a requirement by a past Sage funder, gone about 10 years ago. Thus there's no longer an obligation to have this option.
I am not aware of a similar to Sage which provides tarballs allowing for an offline build.

Thus, I would like to call a vote on these two topics:

1) allow standard packages to be pip packages

2) drop the contents of upstream/ from the Sage source tarballs.


---
Dima

Matthias Koeppe

Feb 19, 2024, 3:09:54 PM
to sage-devel
Prompted by the discussion of space use on the local machines of users and developers, I propose another item in addition to A and B:

C. Advertise use of "git worktree" and recommend symlinking the "upstream" directory. For testing a new release when you have an existing clone of the repository, using "git clone" another time is overkill as it creates another copy of the .git directory. And there is no point in having multiple copies of the "upstream" directory, as the filenames of the tarballs change whenever the contents change.
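
[Editorial illustration] A rough sketch of item C, assuming the main checkout and the new working tree sit side by side on a system that supports symlinks. The helper names are invented for this note. The git command is only assembled, not executed, so nothing here touches a repository.

```python
from pathlib import Path


def worktree_command(repo: Path, new_tree: Path, ref: str) -> list[str]:
    """Build (but do not run) the git command that adds a second working tree."""
    return ["git", "-C", str(repo), "worktree", "add", str(new_tree), ref]


def share_upstream(main_tree: Path, new_tree: Path) -> Path:
    """Make new_tree/upstream a symlink to main_tree/upstream and return the link."""
    shared = main_tree / "upstream"
    shared.mkdir(exist_ok=True)
    link = new_tree / "upstream"
    if link.is_symlink() or link.exists():
        raise FileExistsError(f"{link} already exists")
    link.symlink_to(shared, target_is_directory=True)
    return link


if __name__ == "__main__":
    # Hypothetical paths and tag name, for illustration only.
    print(" ".join(worktree_command(Path("sage"), Path("../sage-test"), "some-release-tag")))
```

After running the printed command (for instance via subprocess.run), calling share_upstream on the two directories lets both working trees reuse the same downloaded tarballs.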

Dima Pasechnik

Feb 19, 2024, 3:10:49 PM
to sage-...@googlegroups.com
On Mon, Feb 19, 2024 at 7:42 PM John H Palmieri <jhpalm...@gmail.com> wrote:
This (A and B below) has the advantage of being quite explicit. The original proposal

1) allow standard packages to be pip packages

2) drop the contents of upstream/ from the Sage source tarballs.

sounds explicit, but the more the discussion goes on, the more I feel that there are hidden pieces. Does the proposal also mean removing version restrictions and all of the other claimed maintenance burden for various components of Sage?

No, this is simply FUD spread by certain parties here. "Allow" does not mean "Make all of the", it should be
obvious. 

pytest is a good example of a package which can be elevated to standard but kept as a pip package. Nowhere does my
proposal 1) demand that anything be done for all packages.



Regarding item (2): if I clone the github repository, there is no upstream directory at the start, but after building Sage, it ends up being almost as large as in the current tarballs. (This is on OS X with a lot of homebrew packages installed.) So how much savings are we actually talking about? (Maybe it's not savings for the end user that are important, so *what* are we saving? Disk space on the mirrors?) Dima, can you please provide data? If we convert (according to (1)) to pip packages, those still need to be downloaded, and while they may not end up in "upstream" — I don't actually know how they work — don't they still take up disk space? So again, how much savings are we talking about? Please provide data.

I am talking about saving space on mirrors, and on bandwidth, by not packaging "upstream/".
(as I wrote: This is a huge saving on the tarball size: with upstream/* in, Sage 10.2 tarball is 1.3Gb, and without it is smaller than 0.25Gb.)
Local disk space nowadays is cheap, but space on mirrors, and bandwidth, don't come free.
(not everyone is on an unlimited internet)
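
[Editorial illustration] Taking the two figures quoted above at face value (1.3 for the full tarball and 0.25 as an upper bound for the slim one, both read as gigabytes), the saving per download works out as follows. Since the slim tarball is stated to be smaller than 0.25, the result is a lower bound on the saving. The function name is made up for this note.

```python
def tarball_saving(full_gb: float, slim_gb: float) -> tuple[float, float]:
    """Return (absolute saving in GB, saving as a fraction of the full size)."""
    saved = full_gb - slim_gb
    return saved, saved / full_gb


if __name__ == "__main__":
    saved, fraction = tarball_saving(1.3, 0.25)
    print(f"saved per download: at least {saved:.2f} GB ({fraction:.0%})")
    # prints: saved per download: at least 1.05 GB (81%)
```

This covers only the mirror and download side. It says nothing about local disk use after a build, which is the other half of the question being asked.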

As well, if you don't wipe your local upstream/, its contents can be reused.
(typically not so many packages are updated with each release after all)

However, as far as I can tell, by default pip package wheels are not stored in upstream/.
Perhaps there is an easy way to change this, I don't know.
 
 

By the way, the git clone yields a package that is 616M on my machine. A good chunk of that is the .git directory. Are you proposing that we do not distribute this? (A recent beta tarball is 1.4G, unpacked 1.6G.)

Regarding item (1): can you provide a list of packages that would become pip packages? Or describe how you would come up with a list?

packages not involved in sagelib directly are good candidates, e.g. pytest, tox.
sphinx and jupyterlab are good candidates too, in my limited testing experiments.

Dima



--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+...@googlegroups.com.

John H Palmieri

Feb 19, 2024, 4:08:54 PM
to sage-devel


On Monday, February 19, 2024 at 12:10:49 PM UTC-8 Dima Pasechnik wrote:
> On Mon, Feb 19, 2024 at 7:42 PM John H Palmieri wrote:
>> This (A and B below) has the advantage of being quite explicit. The original proposal
>>
>> 1) allow standard packages to be pip packages
>>
>> 2) drop the contents of upstream/ from the Sage source tarballs.
>>
>> sounds explicit, but the more the discussion goes on, the more I feel that there are hidden pieces. Does the proposal also mean removing version restrictions and all of the other claimed maintenance burden for various components of Sage?
>
> No, this is simply FUD spread by certain parties here.

You said: "The difference between wheel packages vs pip packages is that the latter don't require pre-fetched wheels, and absence of the need for package (micro)management." The implication is that changing the package management system is, maybe not part of this proposal, but a next step. In other words, I'm getting this impression from your words, not by other "certain parties."

You said: "My proposal is in fact aimed at reducing the number of pinned Sage dependencies, drastically." (You have made similar comments elsewhere in this thread.) How does (1) accomplish this? Either I'm missing something or you have not spelled everything out in your proposal.

You said '"Allow" does not mean "Make all of the", it should be obvious.'

"Allow" does not cause any changes to happen drastically. So what exactly are you proposing to accomplish these drastic changes? If you have a roadmap in mind, it would be helpful if you described it.

Matthias Koeppe

Feb 19, 2024, 4:32:36 PM
to sage-devel
On Monday, February 19, 2024 at 12:10:49 PM UTC-8 Dima Pasechnik wrote:
> On Mon, Feb 19, 2024 at 7:42 PM John H Palmieri <jhpalm...@gmail.com> wrote:
>> If we convert (according to (1)) to pip packages, those still need to be downloaded, and while they may not end up in "upstream" (I don't actually know how they work), don't they still take up disk space? So again, how much savings are we talking about? Please provide data.
>
> as far as I can tell, by default pip package wheels are not stored in upstream/.
> Perhaps there is an easy way to change this, I don't know.

That's correct. pip packages are installed by directly calling "pip install".
If pip downloads a wheel file, it is not stored anywhere.
If pip downloads an sdist and builds a wheel file, then the built wheel is stored in pip's cache; see output from "./sage -pip cache info" and "./sage -pip cache list".


Dima Pasechnik

Feb 19, 2024, 4:36:27 PM
to sage-...@googlegroups.com
I think we can drastically, by hundreds, reduce the number of Python packages we catalogue/vendor in build/pkgs/, namely those which are only used deep inside Jupyter, Sphinx, and the Python build system, if we convert more standard Python packages into pip packages.

We can also hope to convert e.g. matplotlib, numpy, scipy, etc. into pip packages, but this is not as obvious.

This is based on very limited experiments, though.
Anyhow, we have obvious, and safe, candidates for pip package conversion, such as pytest.

John H Palmieri

Feb 19, 2024, 5:24:40 PM
to sage-devel
Regarding symlinking the upstream directory: instead or in addition, what about an option to `./configure` for the location of that directory?

Matthias Koeppe

Feb 19, 2024, 5:29:54 PM
to sage-devel
An option to "./configure" could work too, except that the "bootstrap" phase already downloads the "configure" tarball into that directory.

Another possible direction: I've been thinking about creating a "./sage worktree" command, see https://github.com/sagemath/sage/issues/34744

Dima Pasechnik

Feb 19, 2024, 5:41:11 PM
to sage-...@googlegroups.com
On Mon, Feb 19, 2024 at 10:29 PM Matthias Koeppe <matthia...@gmail.com> wrote:
> An option to "./configure" could work too, except that the "bootstrap" phase already downloads the "configure" tarball into that directory.

An option to ./bootstrap would then be logical.
 

John H Palmieri

Feb 19, 2024, 6:09:55 PM
to sage-devel
If we keep a "configure" tarball in each separate Sage installation but they share the rest of "upstream", then we save a lot of space and a lot of downloading. A workflow like

% make configure
% ./configure --with-lots-of-options
% make

would be familiar and unchanged from the status quo when working with the git clone. Currently users don't have to run `./bootstrap` manually just to build Sage, and it would be nice to keep it this way.

By the way, I just cloned the Sage repo and ran "make configure", which runs `./bootstrap`. The upstream directory is empty after that. If it's getting used for temporary storage for this tarball, that can be done elsewhere (.upstream.d? config? some temporary directory?).

Another option for the location of a shared "upstream" would be Yet Another Environment Variable.

Matthias Koeppe

Feb 19, 2024, 6:17:11 PM
to sage-devel
On Monday, February 19, 2024 at 3:09:55 PM UTC-8 John H Palmieri wrote:
> By the way, I just cloned the Sage repo and ran "make configure", which runs `./bootstrap`. The upstream directory is empty after that.

You probably have autoconf/automake/... installed. In this case, it just uses them to build the configure script. Downloading is a fallback that is used when these "bootstrapping prerequisites" are not installed.


John H Palmieri

Feb 19, 2024, 6:48:43 PM
to sage-devel
You're right, plus I was confusing "./bootstrap -d" (which is run by "make configure") with "./bootstrap -D" which forces the download.

Matthias Koeppe

Feb 19, 2024, 7:40:42 PM
to sage-devel
On Monday, February 19, 2024 at 2:41:11 PM UTC-8 Dima Pasechnik wrote:
> On Mon, Feb 19, 2024 at 10:29 PM Matthias Koeppe <matthia...@gmail.com> wrote:
>> An option to "./configure" could work too, except that the "bootstrap" phase already downloads the "configure" tarball into that directory.
>
> an option to ./bootstrap then would be logical

... except that "bootstrap" is invoked by "make configure", so it would have to be a "make" variable.

Nathan Dunfield

Feb 19, 2024, 8:57:41 PM
to sage-devel
On Monday, February 19, 2024 at 3:08:54 PM UTC-6 John H Palmieri wrote, responding to Dima:
> You said: "The difference between wheel packages vs pip packages is that the latter don't require pre-fetched wheels, and absence of the need for package (micro)management." The implication is that changing the package management system is, maybe not part of this proposal, but a next step. In other words, I'm getting this impression from your words, not by other "certain parties."
>
> You said: "My proposal is in fact aimed at reducing the number of pinned Sage dependencies, drastically." (You have made similar comments elsewhere in this thread.) How does (1) accomplish this? Either I'm missing something or you have not spelled everything out in your proposal.
>
> You said '"Allow" does not mean "Make all of the", it should be obvious.'
>
> "Allow" does not cause any changes to happen drastically. So what exactly are you proposing to accomplish these drastic changes? If you have a roadmap in mind, it would be helpful if you described it.

My understanding is that allowing standard packages to be pip packages could greatly reduce the number of pinned Sage dependencies for two reasons:

1) a build-from-source or wheel package must explicitly pin its version, but, more importantly,

2) a pip package is allowed to install additional dependencies from PyPI that are not recorded anywhere in the Sage repo.

A simple example is pytest.  Here it is as an optional pip package:


To be upgraded to a standard package under the current policy, it would need to be turned into a "wheel package", which requires adding its dependencies like so:


Here, pytest has just a few dependencies, but jupyterlab has more like 50 when you include dependencies of dependencies. 

--------

Personally, I think the current system of having everything pinned and explicitly recorded is the right choice, being more stable in my experience with other projects. In any event, switching to a pip package for e.g. jupyterlab doesn't affect the final size or complexity of Sage as installed, just how many moving pieces there appear to be if you look in "sage/build/pkgs".

Best,

Nathan

Dima Pasechnik

Feb 20, 2024, 4:43:27 AM
to sage-...@googlegroups.com
The number of dependencies has grown to the point it has gotten too hard to maintain, especially if one aims to support as many Python versions as we do.
These dependencies force one to have fragile, and temporary, version-dependent workarounds in the configuration.
We don't have full-time (or even part-time) software engineers who can be tasked with such tedious stuff. Meanwhile maths bugs pile up, but one has to do this ever-growing maintenance...

That's why it's better to let go of as many explicit dependencies as possible now.

Matthias Koeppe

Feb 20, 2024, 12:28:31 PM
to sage-devel
On Tuesday, February 20, 2024 at 1:43:27 AM UTC-8 Dima Pasechnik wrote:
> The number of dependencies has grown to the point it has gotten too hard to maintain,

No. It's easier than it has ever been in the past because of our improved tooling.

> especially if one aims to support as many Python versions as we do.
> These dependencies force one to have fragile, and temporary, version-dependent workarounds in the configuration.

What do you mean by that?

Dima Pasechnik

Feb 20, 2024, 2:43:07 PM
to sage-...@googlegroups.com


On 20 February 2024 17:28:31 GMT, Matthias Koeppe <matthia...@gmail.com> wrote:
>On Tuesday, February 20, 2024 at 1:43:27 AM UTC-8 Dima Pasechnik wrote:
>
>The number of dependencies has grown to the point it has gotten too hard to
>maintain,
>
>
>No. It's easier than it has ever been in the past because of our improved
>tooling.

You keep adding more dependencies to justify ever growing tooling, it seems. Only you know what it does and how to use it.

What if we just don't want it? I certainly don't care about the guts of Jupyterlab and Sphinx, I just want to use them. You for some reason want to maintain a vendored version of them.
Has it ever crossed your mind to ask whether we want to toil on this vendored version? IMO it is a waste of time and effort.

We might just do a fork of Sage if you keep preventing a reduction in the number of dependencies, you know. Good luck then finding people willing to toil with these tools of yours, or without.


>
>
>especially if one aims to support as many Python versions as we do.
>These dependencies force one to have fragile, and temporary,
>version-dependent workarounds in the configuration.
>
>
>What do you mean by that?

You know perfectly well what I mean. There are ongoing debates on Sage's GitHub repo coming from various Python packages being obsoleted as projects and distros drop Python 3.9 and 3.10.
And you are actively preventing people from doing work there, e.g. on #36753.


>

Matthias Koeppe

Feb 20, 2024, 3:13:13 PM
to sage-devel
On Tuesday, February 20, 2024 at 11:43:07 AM UTC-8 Dima Pasechnik wrote:
> On 20 February 2024 17:28:31 GMT, Matthias Koeppe <matthia...@gmail.com> wrote:
>> On Tuesday, February 20, 2024 at 1:43:27 AM UTC-8 Dima Pasechnik wrote:
>>> The number of dependencies has grown to the point it has gotten too hard to maintain,
>>
>> No. It's easier than it has ever been in the past because of our improved tooling.
>
> You keep adding more dependencies to justify ever growing tooling, it seems.

No, Dima, that's absurd.

I have been doing the vast majority of this maintenance work in the past 4 years, and I have been improving the tooling to reduce the workload associated with it -- and to make it more accessible to other contributors.

Source:
echo "  commit | date | author | subject"; echo "  -- | -- | -- | --"; git --no-pager log --since 2020 --no-merges --format="  %h | %as | %<(24)%aN | %s" build/pkgs/*/{package-version.txt,spkg-*.in,patches}

> Only you know what it does and how to use it.

It's well-documented in our Developer Guide (https://deploy-livedoc--sagemath.netlify.app/html/en/developer/packaging). 

I strongly encourage others to read it -- and welcome requests for improvements and PRs --  but I certainly can't force anyone to read it.

> What if we just don't want it?

Who is "we"?

> I certainly don't care about the guts of Jupyterlab and Sphinx, I just want to use them.

That's OK. That's what most users and developers do.

Dima Pasechnik

Feb 20, 2024, 3:45:25 PM
to sage-...@googlegroups.com
On Tue, Feb 20, 2024 at 8:13 PM Matthias Koeppe <matthia...@gmail.com> wrote:
> On Tuesday, February 20, 2024 at 11:43:07 AM UTC-8 Dima Pasechnik wrote:
>> On 20 February 2024 17:28:31 GMT, Matthias Koeppe <matthia...@gmail.com> wrote:
>>> On Tuesday, February 20, 2024 at 1:43:27 AM UTC-8 Dima Pasechnik wrote:
>>>> The number of dependencies has grown to the point it has gotten too hard to maintain,
>>>
>>> No. It's easier than it has ever been in the past because of our improved tooling.
>>
>> You keep adding more dependencies to justify ever growing tooling, it seems.
>
> No, Dima, that's absurd.
>
> I have been doing the vast majority of this maintenance work in the past 4 years, and I have been improving the tooling to reduce the workload associated with it -- and to make it more accessible to other contributors.

testing and reviewing this endless stream of updates and new packages, etc. is also work, a lot.
And you are not shy about outsourcing your "vast majority" either.
E.g. I got totally fed up with this at this point:
- where you kindly invited me to add a few hundred text files by hand, copy-pasting from the repology repo - all thanks to your truly wonderful labour-reducing tooling... Wonderful tooling - homo sapiens with vi and mouse - very flexible, reliable.
I repeat:
we don't need the guts of Jupyterlab etc. in the Sage repo, full stop - in fact, in that case the just-mentioned #36777 would not be needed.



> Source:
> echo "  commit | date | author | subject"; echo "  -- | -- | -- | --"; git --no-pager log --since 2020 --no-merges --format="  %h | %as | %<(24)%aN | %s" build/pkgs/*/{package-version.txt,spkg-*.in,patches}
>
>> Only you know what it does and how to use it.
>
> It's well-documented in our Developer Guide (https://deploy-livedoc--sagemath.netlify.app/html/en/developer/packaging).
>
> I strongly encourage others to read it -- and welcome requests for improvements and PRs -- but I certainly can't force anyone to read it.
>
>> What if we just don't want it?
>
> Who is "we"?

You'll know when we announce a fork. :-)
 

>> I certainly don't care about the guts of Jupyterlab and Sphinx, I just want to use them.
>
> That's OK. That's what most users and developers do.


Matthias Koeppe

Feb 20, 2024, 5:44:03 PM
to sage-devel
On Tuesday, February 20, 2024 at 12:45:25 PM UTC-8 Dima Pasechnik wrote:
> On Tue, Feb 20, 2024 at 8:13 PM Matthias Koeppe <matthia...@gmail.com> wrote:
>> I have been doing the vast majority of this maintenance work in the past 4 years, and I have been improving the tooling to reduce the workload associated with it -- and to make it more accessible to other contributors.
>
> testing and reviewing this endless stream of updates and new packages, etc. is also work, a lot.

I'll note that testing of most package upgrades has been automated by the CI, specifically by my CI Linux Incremental workflow. Manual testing of most of these upgrades is not necessary.
 
> And you are not shy about outsourcing your "vast majority" either.
> E.g. I got totally fed up with this at this point:
> - where you kindly invited me to add a few hundred text files by hand, copy-pasting from the repology repo - all thanks to your truly wonderful labour-reducing tooling... Wonderful tooling - homo sapiens with vi and mouse - very flexible, reliable.

I can confirm that you seemed upset. But everything else is false.

The project of supporting use of system site-packages via "./configure --enable-system-site-packages" (https://github.com/sagemath/sage/wiki/Sage-10.2-Release-Tour#configure---enable-system-site-packages-experimental) is not mine. I have expressed my skepticism about it early on, but have nevertheless supported its merge (https://github.com/sagemath/sage/pull/36141) for experiments by the interested parties, and in the interest of synergy/collaboration on maintaining the "install-requires.txt" files, which have another use for the metadata of our packages on PyPI. 

This feature remains in experimental status, as it fails on numerous platforms (see https://github.com/sagemath/sage/actions/runs/7894201915, section "standard-sitepackages").

When I saw what you were trying to do in your PR https://github.com/sagemath/sage/pull/36777, trying to support use of system Jupyter by means of this system-site-packages mechanism (and using the very "micro-management of dependencies" that you are lamenting about here), I suggested a better approach in https://github.com/sagemath/sage/pull/36777#issuecomment-1831172595 and https://github.com/sagemath/sage/pull/36777#issuecomment-1832426588. Unfortunately you dismissed it without giving it much consideration.

I then invited you to work on an Issue that would improve the tooling, https://github.com/sagemath/sage/issues/36356, exactly to eliminate the copy-pasting from the repology repo that you were doing, but you declined.

Dima Pasechnik

Feb 20, 2024, 6:51:15 PM
to sage-...@googlegroups.com
I basically don't want to collaborate with you, Matthias, because of your conduct. Sometimes I have to, but too often interactions with you piss me off too much, and then I just don't feel like keeping on.


Matthias Koeppe

Feb 20, 2024, 9:44:32 PM
to sage-devel
On Monday, February 19, 2024 at 12:09:54 PM UTC-8 Matthias Koeppe wrote:
> Prompted by the discussion of space use on the local machines of users and developers, I propose another item in addition to A and B:
>
> C. Advertise use of "git worktree" and recommend symlinking the "upstream" directory. For testing a new release when you have an existing clone of the repository, using "git clone" another time is overkill as it creates another copy of the .git directory. And there is no point in having multiple copies of the "upstream" directory, as the filenames of the tarballs change whenever the contents change.


(I've written it without introducing a new Sage command for now.)
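
A minimal sketch of what item C could look like on the command line. This is not taken from the PR: it uses a throwaway toy repository in place of a real Sage clone, the directory names "sage" and "sage-test" are hypothetical, and it assumes that "upstream" is an untracked directory in the main clone.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch location; in practice "main" would be an existing Sage clone.
work=$(mktemp -d)
main="$work/sage"        # stands in for the existing clone
new="$work/sage-test"    # hypothetical name for the second working tree

# Create a toy repository with one commit, standing in for the Sage clone.
git init -q "$main"
git -C "$main" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "initial commit"
mkdir -p "$main/upstream"   # untracked tarball cache in the main clone

# Add a second working tree; it shares the .git directory of the main clone.
git -C "$main" worktree add --detach "$new" HEAD

# Let the new working tree reuse the tarballs of the main clone.
ln -s "$main/upstream" "$new/upstream"

# Show both working trees.
git -C "$main" worktree list
```

With a real clone, the "git init" and "commit" steps would be dropped and the release tag or branch to test would take the place of HEAD. The scratch directory in $work can be deleted afterwards.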

Gareth Ma

Feb 22, 2024, 8:21:17 PM
to sage-devel
Can someone from this list confirm that the PR Matthias linked (#37411) is okay, i.e. not the topic of the debate in this thread? The added documentation seems harmless to me, advertises using git worktree, and can be updated when the discussion in this thread reaches a conclusion, e.g. if upstream is moved or something.

Also, I suggest everyone take a short break from Sage infra issues and go for a hike when possible.

Matthias Koeppe

Feb 24, 2024, 6:40:02 PM
to sage-devel
On Monday, February 12, 2024 at 6:42:07 PM UTC-8 Matthias Koeppe wrote:
> On Monday, February 12, 2024 at 3:52:29 PM UTC-8 Matthias Koeppe wrote:
>> Proposed action items:
>> A. Change https://github.com/sagemath/sage/blob/develop/README.md so that "git clone" is described as the primary way to obtain the Sage sources. That the big release tarball is available can be a footnote in the Installation Guide (https://deploy-livedoc--sagemath.netlify.app/html/en/installation/source#installation-steps) for the limited no-internet connectivity use case.
>
> That's now https://github.com/sagemath/sage/pull/37309 (needs review)

Thanks for all comments on the PR. Ready for final review.

>> B. Likewise, get rid of all of these "Download Sage source code" pages (https://www.sagemath.org/download-source.html, https://www.sagemath.org/download-latest.html), mirror selection, etc. from the Sage website.
>
> That's now https://github.com/sagemath/website/pull/466

This one has already been merged, thanks, Harald!
The "Download" menu no longer sends people to the tarball download pages. 
The pages are, of course, still there, and links from the installation guide point to them.


This one has been positively reviewed. Thanks, Lorenz! 

John H Palmieri

Feb 27, 2024, 1:50:55 PM
to sage-devel
Regarding the proposal to allow standard packages to be pip packages, no one really knows how much people rely on the all-in-one tarball that we currently distribute. No one really knows how often the "make download" option is used for people who just clone the git repo and want to do all of their downloads at once. Since we don't know, the absolute safest course of action is to not change anything, but maybe that's too conservative. A pretty safe second choice would be to have "make download" also download the relevant files for pip installation and tell pip where to find them. If we implemented this second choice, I would support this aspect of Dima's proposal.

My impression from Dima's posts is that he would like us to frequently not provide version information for pip packages. Here are some of my thoughts:

As Nathan points out, this will likely lead to instability. Someone will upgrade some component, and most of the time that will be fine, but occasionally it will break something on some platform, and it could be annoying to track down the cause. If this leads to Sage failing to build, that's not great, but it would be far worse if Sage built and ran but produced some mathematically incorrect answers. Being able to control all of the versions means that our doctests are pretty robust. If we really want to go down the road of unpinning version requirements, I propose that we always pin version requirements for the mathematical components of Sage. If Jupyter or Sphinx doesn't work right, it doesn't affect the mathematics, but if linbox or pari don't work right (or ore_algebra, if you want a pip package), people could be getting different answers on different platforms and we might not know about it for a while. To maintain the mathematical integrity of the project, we should keep very careful control of the mathematical components of Sage.
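
The "second choice" above (have "make download" also fetch the files for pip packages and tell pip where to find them) could be sketched roughly as follows. This is only an illustration, not existing Sage tooling: the directory "upstream/pip" and the file "requirements.txt" are hypothetical names, and the script prints the two pip commands rather than running them.

```shell
#!/bin/sh
set -eu

# Hypothetical locations; neither is an existing Sage convention.
WHEEL_DIR="upstream/pip"
REQS="requirements.txt"

# Step 1 (with internet): fetch the distributions without installing them.
download_cmd() {
    echo python3 -m pip download --dest "$WHEEL_DIR" --requirement "$REQS"
}

# Step 2 (offline): install only from the pre-fetched directory,
# never contacting PyPI.
install_cmd() {
    echo python3 -m pip install --no-index --find-links "$WHEEL_DIR" --requirement "$REQS"
}

# Dry run: print the commands; pipe the output to sh to execute them.
download_cmd
install_cmd
```

If the requirements file pins versions (for example "pytest==X.Y"), the offline step installs exactly what was downloaded; without pins, the result depends on what PyPI served at download time.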

Matthias Koeppe

Feb 27, 2024, 2:37:32 PM
to sage-devel
On Tuesday, February 27, 2024 at 10:50:55 AM UTC-8 John H Palmieri wrote:
> A pretty safe second choice would be to have "make download" also download the relevant files for pip installation and tell pip where to find them. If we implemented this second choice [...]

The problem is that such tooling, even if "trivial", would need to be implemented, tested, and maintained as well. And typically this just does not work when there is no developer who is actually interested in using it.

We are better off with improving the tooling that we already know *will* continue to be used: Namely the tooling for creating and maintaining our metadata in build/pkgs. Issues such as:

Dima Pasechnik

Feb 27, 2024, 3:18:28 PM
to sage-...@googlegroups.com


On 27 February 2024 18:50:55 GMT, John H Palmieri <jhpalm...@gmail.com> wrote:
>Regarding the proposal to allow standard packages to be pip packages, no
>one really knows how much people rely on the all-in-one tarball that we
>currently distribute. No one really knows how often the "make download"
>option is used for people who just clone the git repo and want to do all of
>their downloads at once. Since we don't know, the absolute safest course of
>action is to not change anything, but maybe that's too conservative. A
>pretty safe second choice would be to have "make download" also download
>the relevant files for pip installation and tell pip where to find them. If
>we implemented this second choice, I would support this aspect of Dima's
>proposal.

At the moment we are talking about using this option to have pytest* and build as standard pip packages, whereas they are now optional pip packages.

This is all for now. We have lived for years with them as optional packages and nobody had problems with them; why would they all of a sudden start causing problems?

>
>My impression from Dima's posts is that he would like us to frequently not
>provide version information for pip packages. Here are some of my thoughts:
>
>As Nathan points out, this will likely lead to instability.


No project known to me vendors pytest on the basis that doing otherwise would "lead to instability".

Let us be practical - there is no reason whatsoever to vendor pytest*.
Therefore standard pip packages should be allowed.

No project known to me vendors jupyterlab, either.

It is just FUD that a well-maintained package like pytest must be vendored.
It's as believable as saying that tar must be vendored because an untested version might lead to maths errors.


> Someone will
>upgrade some component, and most of the time that will be fine, but
>occasionally it will break something on some platform, and it could be
>annoying to track down the cause. If this leads to Sage failing to build,
>that's not great, but it would be *far worse* if Sage built and ran but
>produced some mathematically incorrect answers.
> Being able to control all
>of the versions means that our doctests are pretty robust. If we really
>want to go down the road of unpinning version requirements, I propose that
>we *always* pin version requirements for the mathematical components of
>Sage.

We can still pin version requirements on pip packages. And indeed, for the handful of these which are maths packages, we can always do this; why not?

> If Jupyter or Sphinx doesn't work right, it doesn't affect the
>mathematics, but if linbox or pari don't work right (or ore_algebra, if you
>want a pip package), people could be getting different answers on different
>platforms and we might not know about it for a while. To maintain the
>mathematical integrity of the project, we should keep very careful control
>of the mathematical components of Sage.

Sure, I am fine with this. All my proposal is meant to do is to let us get rid of lots of vendored packages which have nothing to do with maths; they are just the guts of pytest, Jupyter, and other commonplace packages.


>
>
>On Saturday, February 24, 2024 at 3:40:02 PM UTC-8 Matthias Koeppe wrote:
>
>> On Monday, February 12, 2024 at 6:42:07 PM UTC-8 Matthias Koeppe wrote:
>>
>> On Monday, February 12, 2024 at 3:52:29 PM UTC-8 Matthias Koeppe wrote:
>>
>>
>> *Proposed action items: *
>> *A.* Change https://github.com/sagemath/sage/blob/develop/README.md so
>> that "git clone" is described as the primary way to obtain the Sage
>> sources. That the big release tarball is available can be a footnote in the
>> Installation Guide (
>> https://deploy-livedoc--sagemath.netlify.app/html/en/installation/source#installation-steps)
>> for the limited no-internet connectivity use case.
>>
>>
>> That's now https://github.com/sagemath/sage/pull/37309 (needs review)
>>
>>
>> Thanks for all comments on the PR. Ready for final review.
>>
>>
>> *B. *Likewise, get rid of all of these "Download Sage source code" pages
>> (https://www.sagemath.org/download-source.html,
>> https://www.sagemath.org/download-latest.html), mirror selection, etc.
>> from the Sage website.
>>
>> That's now https://github.com/sagemath/website/pull/466
>>
>>
>> This one has already been merged, thanks, Harald!
>> The "Download" menu no longer sends people to the tarball download pages.
>> The pages are, of course, still there, and links from the installation
>> guide point to them.
>>
>> [image: Screenshot 2024-02-24 at 3.34.47 PM.png]
>>
>> On Tuesday, February 20, 2024 at 6:44:32 PM UTC-8 Matthias Koeppe wrote:
>>
>> On Monday, February 19, 2024 at 12:09:54 PM UTC-8 Matthias Koeppe wrote:
>>
>> Prompted by the discussion of space use on the *local machines* of users
>> and developers, I propose another item in addition to A and B:
>>
>> *C. Advertise use of "git worktree" and recommend symlinking the
>> "upstream" directory.* For testing a new release when you have an

Nils Bruin

Feb 27, 2024, 3:21:27 PM
to sage-devel
On Tuesday 27 February 2024 at 10:50:55 UTC-8 John H Palmieri wrote:
> As Nathan points out, this will likely lead to instability. Someone will upgrade some component, and most of the time that will be fine, but occasionally it will break something on some platform, and it could be annoying to track down the cause. If this leads to Sage failing to build, that's not great, but it would be far worse if Sage built and ran but produced some mathematically incorrect answers. Being able to control all of the versions means that our doctests are pretty robust. If we really want to go down the road of unpinning version requirements, I propose that we always pin version requirements for the mathematical components of Sage. If Jupyter or Sphinx doesn't work right, it doesn't affect the mathematics, but if linbox or pari don't work right (or ore_algebra, if you want a pip package), people could be getting different answers on different platforms and we might not know about it for a while. To maintain the mathematical integrity of the project, we should keep very careful control of the mathematical components of Sage.

+1 to this. There is another difference: the jupyter notebook server / notebook kernel interface is not a binary one; it's a protocol. As such, it's hopefully narrower in its definition and a bit more stable. So having sagemath provide a notebook kernel that a separately distributed notebook server can connect to is a different dependency structure from depending on libraries being provided. There are some super stable libraries for which external dependencies are OK, but in general libraries (especially components under active development) are a lot more sensitive.

For full functionality we depend on more than vanilla jupyter notebook, though: at some point, jupyter notebook server shipped with a pared-down mathjax and I have not been able to get proper mathjax working in my system-provided jupyter notebook. We have other components, such as documentation and various plug-ins, that we need in jupyter notebook too. This has made the shipped-with-sage jupyter notebook server more reliable for me, even though I'd prefer the system notebook server if it could be made to work reliably.

Sphinx is another part of such tooling: it should be able to just read and process markup. We tend to not ship a latex installation with our preprints. However, I don't have experience with how stable sphinx on its own is to judge how much trouble one would get from relying on an external sphinx. Personally, I don't bother building the sage documentation and rely on what is available online instead, so that one wouldn't affect me. 
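
To make the pinning suggestion above concrete, here is a minimal sketch of how one could check that the mathematical Python components carry exact pins while tooling is allowed to float. Everything in it is an assumption for illustration: the function name, the requirement lines and the version numbers are hypothetical, and it only covers pip-style requirement lines (linbox and pari are not pip packages, so they would need a separate check).

```python
import re

# "name==version" only: no ranges, no wildcards such as ==1.2.*, no extra specifiers.
_EXACT_PIN = re.compile(r"^[A-Za-z0-9._-]+\s*==\s*[^\s*;,]+$")
_NAME = re.compile(r"^[A-Za-z0-9._-]+")


def _normalize(name):
    """Compare names case-insensitively, treating '-', '_' and '.' alike."""
    return re.sub(r"[-_.]+", "-", name).lower()


def unpinned(requirement_lines, must_pin):
    """Return the normalized names from `must_pin` that lack an exact '==' pin."""
    wanted = {_normalize(n) for n in must_pin}
    offenders = []
    for line in requirement_lines:
        spec = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not spec:
            continue
        match = _NAME.match(spec)
        if match is None:
            continue
        name = _normalize(match.group(0))
        if name in wanted and not _EXACT_PIN.match(spec):
            offenders.append(name)
    return offenders


if __name__ == "__main__":
    # Hypothetical requirement lines; the versions are made up.
    lines = ["sphinx>=7", "ore_algebra>=0.5  # mathematical component"]
    print(unpinned(lines, must_pin=["ore_algebra"]))
```

A check like this could run in CI so that a loosened pin on a mathematical package is noticed at review time rather than through differing answers on different platforms.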

Dima Pasechnik

Feb 27, 2024, 3:36:44 PM
to sage-...@googlegroups.com


On 27 February 2024 19:37:31 GMT, Matthias Koeppe <matthia...@gmail.com> wrote:
>On Tuesday, February 27, 2024 at 10:50:55 AM UTC-8 John H Palmieri wrote:
>
>A pretty safe second choice would be to have "make download" also download
>the relevant files for pip installation and tell pip where to find them. If
>we implemented this second choice [...]
>
>
>The problem is that such tooling, even if "trivial", would need to be
>implemented, tested, and maintained as well. And typically this just does
>not work when there is no developer who is actually interested in using it.

This is largely a problem artificially invented by you to shoot my proposal down. Besides, these tools are trivial to build and maintain, much easier than your ever-growing and breaking maze of packages we don't even need to vendor.

At the moment you are actively breaking down the precious project fabric, all in the name of you having your way:

you blocked me (without bothering to even tell me) and perhaps other developers on GitHub, meaning that you effectively want to shut me up.
(Well, I had no other choice but to block you too, as a countermeasure; I installed this block about 12:00 GMT, today.)

Of course it's easy to skip this message. But tomorrow it might be you who Matthias might block, for disagreements with him.
Why is this tolerated? This is a naked attempt to shut down the opposition.


>
>We are better off with improving the tooling that we already know *will*
>continue to be used: Namely the tooling for creating and maintaining our
>metadata in build/pkgs.

Or it will be discarded as useless, because doing things the way projects like scipy do is better.

Your package tooling is makework, repeating with worse tools what already is done by Conda, Homebrew, Linux distros, etc. We are mainly a maths project, not a distro project. Let us stay this way, and actually do more maths and less distro-like stuff.

Dima
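
For reference, the "make download" second choice quoted above (download the files needed for pip installation and tell pip where to find them) can be sketched with two stock pip invocations. This is a minimal sketch, not Sage's tooling: the helper names, the requirements file name and the directory are hypothetical; only the pip options used (`pip download --dest`, `pip install --no-index --find-links`, `-r`) are existing pip options.

```python
import sys
from pathlib import Path


def pip_download_cmd(requirements: Path, wheel_dir: Path) -> list[str]:
    """Command that fetches the listed distributions into `wheel_dir` (needs network)."""
    return [
        sys.executable, "-m", "pip", "download",
        "--dest", str(wheel_dir),
        "-r", str(requirements),
    ]


def pip_offline_install_cmd(requirements: Path, wheel_dir: Path) -> list[str]:
    """Command that installs from `wheel_dir` only, without contacting an index."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index",                    # do not consult PyPI or any other index
        "--find-links", str(wheel_dir),  # resolve requirements from the local directory
        "-r", str(requirements),
    ]


if __name__ == "__main__":
    reqs = Path("requirements.txt")  # hypothetical file name
    cache = Path("upstream/pip")     # hypothetical directory
    print(" ".join(pip_download_cmd(reqs, cache)))
    print(" ".join(pip_offline_install_cmd(reqs, cache)))
```

The sketch only builds the command lines; running them (e.g. via `subprocess.run(cmd, check=True)`) is left out on purpose. Note that `pip download` fetches files suitable for the platform and Python it runs on, so a cache built this way is not automatically portable, and source distributions may still need their build requirements to be available offline.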

John H Palmieri

Feb 27, 2024, 3:44:50 PM
to sage-devel
Sentences like "At the moment you are actively breaking down the precious project fabric, all in the name of you having your way" are personal attacks. Please stop.

Dima Pasechnik

Feb 27, 2024, 3:59:34 PM
to sage-...@googlegroups.com


On 27 February 2024 20:21:26 GMT, Nils Bruin <nbr...@sfu.ca> wrote:
>On Tuesday 27 February 2024 at 10:50:55 UTC-8 John H Palmieri wrote:
>
>
>As Nathan points out, this will likely lead to instability. Someone will
>upgrade some component, and most of the time that will be fine, but
>occasionally it will break something on some platform, and it could be
>annoying to track down the cause. If this leads to Sage failing to build,
>that's not great, but it would be *far worse* if Sage built and ran but
>produced some mathematically incorrect answers. Being able to control all
>of the versions means that our doctests are pretty robust. If we really
>want to go down the road of unpinning version requirements, I propose that
>we *always* pin version requirements for the mathematical components of
>Sage. If Jupyter or Sphinx doesn't work right, it doesn't affect the
>mathematics, but if linbox or pari don't work right (or ore_algebra, if you
>want a pip package), people could be getting different answers on different
>platforms and we might not know about it for a while. To maintain the
>mathematical integrity of the project, we should keep very careful control
>of the mathematical components of Sage.
>
>+1 to this. There is another difference: The jupyter notebook server -
>notebook kernel interface is not a binary one: it's a protocol. As such,
>it's hopefully narrower in its definition and a bit more stable. So
>having sagemath offer a notebook kernel that a separately
>distributed notebook server can connect to is a different dependency
>structure than depending on libraries being provided. There are some super
>stable libraries for which external dependencies are OK, but generally
>(especially components under active development) they are a lot more
>sensitive.
>
>For full functionality we depend on more than vanilla jupyter notebook,
>though: at some point, jupyter notebook server shipped with a pared-down
>mathjax and I have not been able to get proper mathjax working in my
>system-provided jupyter notebook.

The reason for having a bad integration with upstream Jupyter is simple - we haven't really tried to do it properly, that's all.

And I am not able to get Sage's Jupyter properly working. Also people here complain that they can't use Sage's Jupyter offline (it's because we keep on using MathJax 2, for the sole reason that long formulas don't wrap around; instead there is a slider to view them). So if one wants offline MathJax, e.g. for a presentation, Sage's Jupyter is useless.

With the ability to have standard pip packages, this would be something to look at, because without this ability it's always some ad hoc non-official stuff that may work or not, and no one cares much.

> We have other components, such as
>documentation and various plug-ins that we need in jupyter notebook too.
>This has made the shipped-with-sage jupyter notebook server more reliable
>to me, even if I'd prefer the system notebook server if that would be
>reliable to get working.
>
>Sphinx is another part of such tooling: it should be able to just read and
>process markup. We tend to not ship a latex installation with our
>preprints.
> However, I don't have experience with how stable sphinx on its
>own is to judge how much trouble one would get from relying on an external
>sphinx.

If you pin a version, or even if you don't, there should be no difference between what we install and what comes from pip install. Gentoo provides a very modern upstream Sphinx, and it has worked with Sage for quite some time.

Dima Pasechnik

Feb 27, 2024, 4:01:25 PM
to sage-...@googlegroups.com


On 27 February 2024 20:44:50 GMT, John H Palmieri <jhpalm...@gmail.com> wrote:
>Sentences like "At the moment you are actively breaking down the precious
>project fabric, all in the name of you having your way" are personal
>attacks. Please stop.

Blocking on GitHub members of the project is not a personal attack?
Of course it is.

John H Palmieri

Feb 27, 2024, 5:17:50 PM
to sage-devel
That's called "whataboutism". Invoking what you consider inappropriate behavior by others is not relevant. Please stay on topic, and please follow Sage's code of conduct in your posts.

Dima Pasechnik

Feb 27, 2024, 6:00:22 PM
to sage-...@googlegroups.com
Blocking on GitHub, I presume, is due to disagreements on a number of topics, including the topic discussed in this thread. So it's related to the topic, and personal only as it was done by a person, not by an AI.

Dima Pasechnik

Mar 6, 2024, 12:35:58 PM
to sage-...@googlegroups.com
The experience of the Sage macOS app is that it's better to leave Sage's jupyterlab (with its ~193 packages) totally aside, and use the undiluted upstream jupyterlab instead, see 


 
In any event, switching to a pip package for e.g. jupyterlab doesn't affect the final size or complexity of Sage as installed, just how many moving pieces there appear to be if you look in "sage/build/pkgs".

Best,

Nathan
