(Re-)building an inclusive SageMath community. III: Our relations to the projects that Sage depends on


Matthias Koeppe

May 1, 2024, 4:35:08 PM
to sage-devel
Previous posts in the series:
https://groups.google.com/g/sage-devel/c/OeN8o14s6Jc/m/ChnpijP3AgAJ
https://groups.google.com/g/sage-devel/c/xBzaINHWwUQ/m/Tq17YRqOAAAJ

As we all know, SageMath makes use of hundreds of "upstream" projects: third-party, separately maintained packages written either in Python/Cython or in other languages (C, C++, Common Lisp, Fortran, and the domain-specific languages of systems such as GAP, Singular, Maxima, ...).

The role of SageMath, although it does serve as a software distribution, is in clear contrast to that of general software distributions such as Ubuntu or conda-forge: It's probably rare for users to say "I computed this Gröbner basis using Ubuntu Linux" or "a strong generating set for this matrix group was computed using conda-forge". But many users say such things all the time about using SageMath.

Of course this is because of the added value of SageMath over the collection of its dependencies:
1. Abstraction and unification of the interface to multiple upstream dependencies, and integration.
2. The algorithms, structures, and applications implemented in the Sage library itself.

I posit that there is an intrinsic conflict between abstraction/unification/integration and attribution for the upstream projects: Regardless of intent and purpose, a real side effect of abstraction/unification/integration is that the use of the upstream project is obscured to some degree.

It is natural if individuals who contribute to the upstream projects (or have contributed to them in the past) are concerned or unhappy about such effects. And it is understandable if they perceive that the SageMath project is using their work, consuming attention/visibility/attribution, but not "giving back" sufficiently. Contributors are entitled to take pride in their workpersonship, in the success of the project that they have been contributing to, the brand that they have created, etc. Even if some of us may be wary of possible toxic gradations such as tribalism, it is clear that attention and attribution are important, positive, and legitimate motivating factors for open source contributors in general. Moreover, attribution via academic citations may indirectly translate into individuals' careers and success in obtaining funding.

The 2018 sage-devel thread "Suggestion for the SageMath website" (https://groups.google.com/g/sage-devel/c/H8FcZD90O0Y/m/VRIRzj1sBAAJ) focused on getting upstream projects credited on our website. The suggestions made there (randomly rotating the names of external dependencies listed on the main page so that they all get equal exposure, or using scrolling lists) were not implemented, but we have come a long way since then regarding better attribution for upstream projects.
William's message in that thread, https://groups.google.com/g/sage-devel/c/H8FcZD90O0Y/m/4ke5ekyVAgAJ, suggested that "linking to dependencies should be done much more, but in a way that provides clear value to users:
- being able to better know what is in Sage,
- being able to read the original upstreams docs and source code more easily,
- knowing which upstreams devs to contact for *support*, to ask for features, to contribute work, and to thank,
- being able to properly acknowledge what they are using."

Regarding William's first point, being able to better know what is in Sage: The main page of http://sagemath.org now links to our reference manual with a list of dependencies that is always up to date because it is automatically generated from the source code. And in the most current version, this long list is broken into categories such as "Mathematics" for better navigability: https://deploy-livedoc--sagemath.netlify.app/html/en/reference/spkg/
For each dependency, we have a page with various information, including a short description, installation instructions and a link to the upstream project. 

Regarding the second point, Simon King's suggestion in the thread, to "[...] on that list have a link to the doc for each package that provides docs" (https://groups.google.com/g/sage-devel/c/H8FcZD90O0Y/m/YEp6aO8cAQAJ), has not been implemented and would still be a valuable improvement. But we have just made progress in a similar way by facilitating Sphinx hyperlinks to specific pages of package documentation, see https://github.com/sagemath/sage/wiki/Sage-10.4-Release-Tour#linking-to-external-package-documentation
Also TB's suggestion to use "the Sphinx extensions viewcode and linkcode [...] add links next to functions and classes with the corresponding source code" (https://groups.google.com/g/sage-devel/c/H8FcZD90O0Y/m/RisE0AajAgAJ) was implemented in this development cycle, see https://github.com/sagemath/sage/wiki/Sage-10.4-Release-Tour#links-to-source-code

Giving attribution to the projects that supported a particular computation is a hard problem that cannot be fully automated. I don't know how widely SageMath's profiling-based citation system (https://doc.sagemath.org/html/en/reference/misc/sage/misc/citation.html) is used by the community; but in any case, it's still a long way from the terse output of this citation system to actual citations that people can use, and it may be valuable to provide some convenient shortcuts.
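
As a rough illustration of what such a shortcut could look like, here is a minimal sketch in plain Python that turns a list of system names into a sentence for an acknowledgements section. The function name acknowledgement_sentence, the wording of the sentence, and the sample input are all assumptions made for this example; none of them is part of Sage, and the sample list is not actual output of the citation system.

```python
def acknowledgement_sentence(systems):
    """Turn a list of system names into one acknowledgement sentence.

    Hypothetical helper, not part of Sage: `systems` is assumed to be a
    plain list of names such as a citation tool might report.
    """
    # Drop duplicates while keeping the order of first appearance.
    names = list(dict.fromkeys(systems))
    if not names:
        return "This computation used SageMath."
    if len(names) == 1:
        listed = names[0]
    elif len(names) == 2:
        listed = " and ".join(names)
    else:
        listed = ", ".join(names[:-1]) + ", and " + names[-1]
    return f"This computation used SageMath, which relied on {listed}."


# Made-up input for illustration only.
print(acknowledgement_sentence(["Maxima", "ginac", "Maxima"]))
```

This only covers the naming step. Mapping each name to a proper bibliographic entry would still need data that the upstream projects themselves would have to supply.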

Next I'll note that the modularization project provides an opportunity to refresh our relations to the upstream projects in very significant ways.

The new pip-installable packages from the modularization project will be a new way for our project to give back something of value to the upstream projects that Sage depends on; and thus are a possible new expression of interest to collaborate with upstream projects: In particular those projects that do not maintain Python interfaces themselves or those that might be interested in higher-level interfaces than what they provide. Examples of packages corresponding to actively maintained upstream projects:

Viewing one of these packages as the Python interface to the upstream library may be much more plausible than considering the monolithic SageMath system as the Python interface to the library. This may facilitate a shared investment in its development and may also avoid duplicate developments. (Disclosure: I have not contacted any upstream projects about this yet because there's little that I can offer before the work that makes the modularized distributions available is merged into Sage.)

I'll note that these new distributions differ very significantly from the products of earlier, "bottom-up" modularization efforts of the Sage library: packages such as cypari2 (just discussed in the concurrent thread on modularization, https://groups.google.com/g/sage-devel/c/mqgtkLr2gXY/m/kSiZktwpAAAJ) and pplpy (mentioned in the same thread in https://groups.google.com/g/sage-devel/c/mqgtkLr2gXY/m/65UjwaMaBQAJ, along with some other packages). These packages, designed to be reusable in the Python ecosystem without dependencies on anything in Sage, are not exposed directly to Sage users; they are merely glue between a C/C++ library and the higher-level Sage code that uses it. As such, these packages do not provide users with a slice of the part of Sage where most of the effort and polish in Sage development is spent, namely the high-level public interface of Sage. (I cannot say whether or to what degree it is related to this observation, but I unfortunately have to say that these modularization efforts have not been a clear success: With the exception of cypari (which Marc and Nathan created specifically for SnapPy) and the package cysignals, there is little evidence that such packages have attracted a community of users other than indirectly as dependencies of the Sage library, and certainly no viable community of developers has formed for these packages. Just a few weeks ago I took over as the de facto maintainer of cypari2 and pplpy; you may have seen the announcements.)

Finally, recall that to make the modularized distributions testable separately, we have annotated doctests in the source code with tags like "# needs sage.libs.flint" (at the file, block, or doctest level), supported by a convenient maintenance tool, the powered-up "sage --fixdoctests" command (see https://github.com/sagemath/sage/wiki/Sage-10.1-Release-Tour#new-developer-tools-modularization-deprecations; this was an outcome of the June 2023 sage-devel discussion "Modularized doctests" https://groups.google.com/g/sage-devel/c/utA0N1it0Eo/m/ep_G5dFOAAAJ and the June/July 2023 vote https://groups.google.com/g/sage-devel/c/MtS2u3VbJEo/m/wBhhdN3aAAAJ).

Such annotations -- even if as Sage developers we may find them annoying -- give an important secondary benefit, namely specific attribution for the libraries that Sage uses for particular types of computations. This alleviates the conflict between abstraction and attribution that I mentioned above.
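
To make the attribution angle concrete, here is a minimal sketch in plain Python (it does not need Sage) that tallies how often each feature is named in "# needs" tags. It assumes that a tag lists one or more space-separated feature names after "# needs". The function name count_needed_features and the sample doctest lines are made up for this example, and the script is not part of "sage --fixdoctests".

```python
import re
from collections import Counter

# Matches a "# needs feature1 feature2 ..." tag anywhere in a line.
NEEDS_TAG = re.compile(r"#\s*needs\s+(?P<features>[^#\n]+)")


def count_needed_features(source):
    """Count how many lines of `source` name each feature in a "# needs" tag.

    Hypothetical helper for illustration; assumes space-separated features.
    """
    counts = Counter()
    for line in source.splitlines():
        match = NEEDS_TAG.search(line)
        if match:
            counts.update(match.group("features").split())
    return counts


# Made-up doctest lines; the function names are placeholders.
SAMPLE = """
sage: a = first_example()      # needs sage.libs.flint
sage: b = second_example()     # needs sage.libs.flint sage.libs.pari
sage: 1 + 1
2
"""

for feature, count in count_needed_features(SAMPLE).most_common():
    print(f"{feature}: {count}")
```

Run over a source file, a tally like this gives a first, coarse picture of which libraries a module's documented examples rely on.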

Matthias Koeppe

May 20, 2024, 4:09:24 PM
to sage-devel
On Wednesday, May 1, 2024 at 1:35:08 PM UTC-7 Matthias Koeppe wrote:
[...] SageMath makes use of hundreds of "upstream" projects: third-party, separately maintained packages [...]
https://groups.google.com/g/sage-devel/c/H8FcZD90O0Y/m/VRIRzj1sBAAJ) [...] suggested that "linking to dependencies should be done much more, but in a way that provides clear value to users:
- being able to better know what is in Sage,
- being able to read the original upstreams docs and source code more easily,
- knowing which upstreams devs to contact for *support*, to ask for features, to contribute work, and to thank,
- being able to properly acknowledge what they are using." [...]

Any takers for these related tasks?
- Add badges for GitHub stars of upstream projects (https://github.com/sagemath/sage/issues/37585)
- build/pkgs/*/SPKG.rst: Explain how SageMath makes use of the package (https://github.com/sagemath/sage/issues/37586)
- Documentation and scripts to direct bug reports to upstream or downstream projects (https://github.com/sagemath/sage/issues/37382)
- Replace ad-hoc package installation instructions by links to SPKG page (https://github.com/sagemath/sage/issues/37532)
