How to modularize for fun and profit, II: MONOREPO vs. MULTIREPO


Matthias Koeppe

Oct 10, 2021, 8:17:33 PM
to sage-devel
As we are making progress on preparing the Sage library for modularization (https://trac.sagemath.org/ticket/29705), here's a discussion of three things that need to be decided whenever a portion of the Sage library is modularized:

(1) Namespace: Will it be imported as (+) "from sage.hermeneutics.quantum import QuantumGravity" or (-) "from transformative_hermeneutics.quantum import QuantumGravity"?

(2) Distribution name (= project name): Will it be installed using "pip install transformative-hermeneutics" or "pip install sagemath-hermeneutics"?

(3) Source repository: Is the source maintained as part of the Sage repository using Trac tickets (MONOREPO); or in a separate repository on GitHub, GitLab, ... (MULTIREPO)?

This post is dedicated to the MONOREPO vs. MULTIREPO question, and how it relates to (1) and (2). 

Some details about MONOREPO:
The Sage repository already contains several separate distribution source trees. Since Sage 9.4 they are located in the subdirectory SAGE_ROOT/pkgs (see https://wiki.sagemath.org/ReleaseTours/sage-9.4#New_location_for_distribution_package_sources:_SAGE_ROOT.2Fpkgs).
For example, SAGE_ROOT/pkgs/sage-sws2rst is a self-contained source tree of this distribution package: It has standard Python packaging files such as setup.py.
SAGE_ROOT/pkgs/sagemath-standard is also a source tree of this distribution package; but it is created in part using symbolic links into SAGE_ROOT/src. This is the trick that has allowed the modularization effort to keep the SAGE_ROOT/src tree monolithic: Modularization has been happening behind the scenes and will not change where Sage developers find the source files.
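For illustration, the symlink arrangement described above can be sketched in Python. This is a toy layout under stated assumptions, not the actual Sage tree: the names demo_lib and demo-dist are made up, and the real distribution directories may be wired differently.

```python
import os
import tempfile
from pathlib import Path


def build_demo_layout(root: Path) -> Path:
    """Create a toy monorepo: a shared src/ tree plus a pkgs/<dist>/ directory
    whose package directory is a symbolic link into src/."""
    src_pkg = root / "src" / "demo_lib"          # hypothetical package name
    src_pkg.mkdir(parents=True)
    (src_pkg / "__init__.py").write_text("VALUE = 42\n")

    dist_dir = root / "pkgs" / "demo-dist"       # hypothetical distribution name
    dist_dir.mkdir(parents=True)
    # Packaging metadata lives in the distribution directory itself.
    (dist_dir / "setup.py").write_text("# placeholder packaging file\n")
    # The code is not copied; a relative symlink points back into src/.
    os.symlink(os.path.join("..", "..", "src", "demo_lib"), dist_dir / "demo_lib")
    return dist_dir


with tempfile.TemporaryDirectory() as tmp:
    dist = build_demo_layout(Path(tmp))
    linked = dist / "demo_lib"
    is_link = linked.is_symlink()
    # Both paths resolve to the single copy of the file under src/.
    same_file = (linked / "__init__.py").resolve() == (
        Path(tmp) / "src" / "demo_lib" / "__init__.py"
    ).resolve()
    content = (linked / "__init__.py").read_text()
```

The point of the arrangement is that there is only one copy of each source file: developers keep editing under src/, while the distribution directory merely adds packaging files around a view of that tree.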

In favor of MONOREPO:
+ The Sage developer community is used to using Trac
+ The process is well defined: No code "ownership" (all Sage developers can change any code, subject to peer review); changes are synchronized
+ There is implicit integration testing with all of Sage

Drawbacks of MONOREPO:
- Extra hurdle for attracting new developers from outside the Sage developer community (everyone who is not already a Sage developer uses GitHub or similar)
- Testing infrastructure needs to be set up and maintained separately; in particular, integration testing with Sage.

Evidence:
- The pynac situation -- now finally resolved by merging it into the Sage library (https://wiki.sagemath.org/ReleaseTours/sage-9.5#Modularization_and_packaging_changes)
- Even for separate git repositories hosted in the SageMath GitHub organization (https://github.com/sagemath/), code ownership and review workflow are unclear.
  - For example, https://github.com/sagemath/sage-numerical-backends-coin provides no information on how to contribute (commonly expected in a file CONTRIBUTING.md)
  - Neither does https://github.com/sagemath/cysignals, which moreover looks abandoned: Issues and PRs are not tended to. Is it subject to peer review like other parts of Sage? Who is in charge of merging PRs?
  - Similar issues with https://github.com/sagemath/cypari2


Recommendation: Keep MONOREPO for all distributions that fill the sage.PAC.KAGE.MODULE namespace (= distribution packages named sagemath-... -- according to my recommendation in part I). 

Recommendation: For distributions that do not use the sage.PAC.KAGE.MODULE namespace (= distribution packages named something other than sagemath-... -- according to my recommendation in part I), weigh the benefits vs. drawbacks of MONOREPO vs. MULTIREPO. It is also simple enough for a distribution to start out being developed inside of the Sage MONOREPO and to be split out to a separate repository later.

Recommendation: Discuss the development workflow for projects in the https://github.com/sagemath organization, and document it in files CONTRIBUTING.md in the individual repositories.
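As a rough illustration of what it means for several distributions to "fill" one namespace, here is a minimal Python sketch using native (PEP 420) namespace packages. It is an assumption that this mechanism is representative; the thread does not say which namespace mechanism Sage uses, and demo_ns, algebra and graphs are made-up names.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_portion(site_dir: Path, subpackage: str, value: str) -> None:
    """Create demo_ns/<subpackage>/__init__.py under site_dir.

    demo_ns itself gets no __init__.py, which makes it a native (PEP 420)
    namespace package that several install locations can contribute to.
    """
    pkg = site_dir / "demo_ns" / subpackage
    pkg.mkdir(parents=True)
    (pkg / "__init__.py").write_text(f"NAME = {value!r}\n")


with tempfile.TemporaryDirectory() as tmp:
    # Two directories stand in for two separately installed distributions.
    dist_a = Path(tmp) / "dist_a"
    dist_b = Path(tmp) / "dist_b"
    make_portion(dist_a, "algebra", "from distribution A")
    make_portion(dist_b, "graphs", "from distribution B")

    sys.path[:0] = [str(dist_a), str(dist_b)]
    importlib.invalidate_caches()
    try:
        algebra = importlib.import_module("demo_ns.algebra")
        graphs = importlib.import_module("demo_ns.graphs")
        names = (algebra.NAME, graphs.NAME)
        # The namespace package's __path__ lists one entry per contributor.
        portions = len(list(importlib.import_module("demo_ns").__path__))
    finally:
        sys.path.remove(str(dist_a))
        sys.path.remove(str(dist_b))
```

Both subpackages are imported under the same top-level name even though they come from different locations, which is why the import name (question 1) and the distribution name (question 2) can be chosen independently.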




François Bissey

Oct 10, 2021, 8:49:05 PM
to sage-...@googlegroups.com
One annoying thing about monorepo from a downstream perspective - but only for people crazy enough to package a development branch (OK that would be me and not many other people :) ).
The split packages have their setup.py or equivalent in SAGE_ROOT/pkgs/pkg_name and some links to the single sage source tree from there. Examples: sagemath-standard, sage_setup and sage_docbuild. To package these I pull the full tree :( and then I don’t go to SAGE_ROOT/pkgs/pkg_name and build from there.
If I want to patch it doesn’t work from there because patch doesn’t apply across symbolic links. So, instead I go to SAGE_ROOT/src, copy the appropriate setup.py (and any other files needed) and patch and build from there. To patch while setting the top package source tree at SAGE_ROOT/pkgs/pkg_name I would have to write my own new patching machinery for Gentoo ebuilds - not gonna happen.
That wouldn’t happen with split repos.
For proper releases, I am hoping for separate tarballs (eventually) which means that there won’t be any issues with symbolic links.
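One generic way around the symlink problem is to copy the distribution directory while dereferencing the links, and to patch the copy. The Python sketch below shows the idea on a toy tree; the directory names are made up, and whether such a step fits into a Gentoo ebuild is not something this sketch can answer.

```python
import os
import shutil
import tempfile
from pathlib import Path


def materialize(dist_dir: Path, out_dir: Path) -> Path:
    """Copy dist_dir to out_dir, replacing symlinks by the files they point to."""
    # symlinks=False (the default) copies the contents of the link targets.
    return Path(shutil.copytree(dist_dir, out_dir, symlinks=False))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Toy layout: shared source tree plus a distribution dir that links into it.
    (root / "src" / "demo_lib").mkdir(parents=True)
    (root / "src" / "demo_lib" / "core.py").write_text("x = 1\n")
    dist = root / "pkgs" / "demo-dist"
    dist.mkdir(parents=True)
    os.symlink(root / "src" / "demo_lib", dist / "demo_lib")

    copy = materialize(dist, root / "work" / "demo-dist")
    copied_file = copy / "demo_lib" / "core.py"
    has_links = any(p.is_symlink() for p in copy.rglob("*"))
    # Editing the copy (as a patch would) leaves the shared source untouched.
    copied_file.write_text("x = 2\n")
    original_text = (root / "src" / "demo_lib" / "core.py").read_text()
    patched_text = copied_file.read_text()
```

The copy contains only regular files and directories, so a patch tool that refuses to follow symbolic links has nothing to object to.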

François
> --
> You received this message because you are subscribed to the Google Groups "sage-devel" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sage-devel/1572d163-abba-4fde-b348-5f31166d4b87n%40googlegroups.com.

William Stein

Oct 10, 2021, 9:00:50 PM
to sage-devel
On Sun, Oct 10, 2021 at 5:49 PM François Bissey <frp.b...@gmail.com> wrote:
>
> One annoying thing about monorepo from a downstream perspective - but only for people crazy enough to package a development branch (OK that would be me and not many other people :) ).
> The split packages have their setup.py or equivalent in SAGE_ROOT/pkgs/pkg_name and some links to the single sage source tree from there. Examples: sagemath-standard, sage_setup and sage_docbuild. To package these I pull the full tree :( and then I don’t go to SAGE_ROOT/pkgs/pkg_name and build from there.

I don't understand what you're doing. However, I just want to point
out that it is possible to use Git to only pull a selected directory
(or directories) from a Git repo with no history. See, e.g., the
discussion of "sparse checkout" here.

https://stackoverflow.com/questions/600079/how-do-i-clone-a-subdirectory-only-of-a-git-repository



--
William (http://wstein.org)

François Bissey

Oct 10, 2021, 9:06:33 PM
to sage-...@googlegroups.com


> On 11/10/2021, at 14:00, William Stein <wst...@gmail.com> wrote:
>
> On Sun, Oct 10, 2021 at 5:49 PM François Bissey <frp.b...@gmail.com> wrote:
>>
>> One annoying thing about monorepo from a downstream perspective - but only for people crazy enough to package a development branch (OK that would be me and not many other people :) ).
>> The split packages have their setup.py or equivalent in SAGE_ROOT/pkgs/pkg_name and some links to the single sage source tree from there. Examples: sagemath-standard, sage_setup and sage_docbuild. To package these I pull the full tree :( and then I don’t go to SAGE_ROOT/pkgs/pkg_name and build from there.
>
> I don't understand what you're doing. However, I just want to point
> out that it is possible to use Git to only pull a selected directory
> (or directories) from a Git repo with no history. See, e.g., the
> discussion of "sparse checkout" here.
>
> https://stackoverflow.com/questions/600079/how-do-i-clone-a-subdirectory-only-of-a-git-repository

I don’t know if it can be used inside Gentoo ebuilds but this is an interesting thing to be able to do. I will certainly explore that possibility but if it doesn’t replace symlinks in a particular subdirectory it may all be in vain.

François

Matthias Koeppe

Oct 10, 2021, 10:02:09 PM
to sage-devel
On Sunday, October 10, 2021 at 5:49:05 PM UTC-7 François Bissey wrote:
> For proper releases, I am hoping for separate tarballs (eventually) which means that there won’t be any issues with symbolic links.

Anyone can create the package tarballs by just using "setup.py sdist" in the distribution package directories in the checked out and bootstrapped repository.
This is also what I do when I upload the distributions to PyPI.
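A minimal Python sketch of that loop, assuming only what is said above: each distribution directory under SAGE_ROOT/pkgs has a setup.py, and "setup.py sdist" is run from inside it. The helper name sdist_commands and the toy directory names are made up; the sketch only plans the commands and does not run them.

```python
import sys
import tempfile
from pathlib import Path
from typing import List, Tuple


def sdist_commands(sage_root: Path) -> List[Tuple[Path, List[str]]]:
    """Return (working directory, command) pairs, one per directory under
    sage_root/pkgs that contains a setup.py."""
    commands = []
    for dist_dir in sorted((sage_root / "pkgs").iterdir()):
        if (dist_dir / "setup.py").is_file():
            commands.append((dist_dir, [sys.executable, "setup.py", "sdist"]))
    return commands


with tempfile.TemporaryDirectory() as tmp:
    # Toy stand-in for a checked-out repository; not the real Sage tree.
    root = Path(tmp)
    for name in ("dist-one", "dist-two"):
        (root / "pkgs" / name).mkdir(parents=True)
        (root / "pkgs" / name / "setup.py").write_text("# placeholder\n")
    (root / "pkgs" / "not-a-dist").mkdir()   # no setup.py, so it is skipped
    planned = [(cwd.name, cmd[1:]) for cwd, cmd in sdist_commands(root)]
```

In a real, bootstrapped checkout each pair could be executed with subprocess.run(cmd, cwd=cwd, check=True).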

https://trac.sagemath.org/ticket/29039 (needs review) adds the command "make pypi-sdists", which does this for all the distributions.

https://trac.sagemath.org/ticket/32062 is for adding this (and publishing to PyPI) as an automatic step on GH Actions. 

 

François Bissey

Oct 10, 2021, 10:04:22 PM
to sage-...@googlegroups.com
I am presuming the upload will be automatic for releases, and maybe also for betas and release candidates?

François

Matthias Koeppe

Oct 10, 2021, 10:05:41 PM
to sage-devel
Yes, for all of these.



 

Matthias Koeppe

Oct 10, 2021, 10:07:40 PM
to sage-devel
On Sunday, October 10, 2021 at 5:49:05 PM UTC-7 François Bissey wrote:
> To package these I pull the full tree :( and then I don’t go to SAGE_ROOT/pkgs/pkg_name and build from there.
> If I want to patch it doesn’t work from there because patch doesn’t apply across symbolic links. So, instead I go to SAGE_ROOT/src, copy the appropriate setup.py (and any other files needed) and patch and build from there.

This sounds terrible. If you want to maintain a patch-based workflow against the sdists, why not generate the sdists and patch them?


 

Matthias Koeppe

Oct 10, 2021, 10:13:25 PM
to sage-devel
On Sunday, October 10, 2021 at 6:00:50 PM UTC-7 wst...@gmail.com wrote:
> it is possible to use Git to only pull a selected directory
> (or directories) from a Git repo with no history. See, e.g., the
> discussion of "sparse checkout" here.
>
> https://stackoverflow.com/questions/600079/how-do-i-clone-a-subdirectory-only-of-a-git-repository

I wouldn't recommend this for the Sage repository. The source trees in SAGE_ROOT/pkgs/DISTRIBUTION are only complete after running SAGE_ROOT/bootstrap. (This is how we keep metadata such as version constraints in sync -- by generating them from information in SAGE_ROOT/build/pkgs.)

François Bissey

Oct 10, 2021, 10:16:23 PM
to sage-...@googlegroups.com
I need to implement some automation for that. To make sure I am not surprised by new things, I usually follow Volker’s merging tree (this happens on a branch of the sage-on-gentoo repo that isn’t publicised). I’d need to get new sdists each time he pushes; this is feasible but not within my current expertise.
For releases, falling back to PyPI sdists should be trivial. For betas and release candidates, I would need to go back to doing release stuff separately, which I stopped doing almost 10 years ago because it was easier to just have a “live ebuild” following the develop branch on GitHub for monolithic builds.

François

François Bissey

Oct 10, 2021, 10:32:38 PM
to sage-...@googlegroups.com
Another fact that annoys me about monorepo is the versioning of the components. There is no right or wrong here, just preferences.

In the monorepo scenario you’ll have subcomponents released and they (usually) will all have matching version numbers.
That makes it easy to figure out what you should use :) that’s good.
Sometimes there will be a new release of some component where the only thing changing is the version number :( this is a bit of a waste.

Going monorepo, you may maintain version number coherence and over-release some components.
Going multirepo may mean decoupling those version numbers, as you only cut a release when there is something new to release in a subcomponent.

I am personally on the side of multiple repos and decoupled version numbers. But I appreciate the coherence of having a single version number to refer to.

Dima Pasechnik

Oct 11, 2021, 4:31:14 AM
to sage-devel
On Mon, Oct 11, 2021 at 1:17 AM Matthias Koeppe <matthia...@gmail.com> wrote:
> In favor of MONOREPO:
> + The Sage developer community is used to using Trac
> + The process is well defined: No code "ownership" (all Sage developers can change any code, subject to peer review); changes are synchronized
> + There is implicit integration testing with all of Sage
>
> Drawbacks of MONOREPO:
> - Extra hurdle for attracting new developers from outside the Sage developer community (everyone who is not already a Sage developer uses GitHub or similar)
> - Testing infrastructure needs to be set up and maintained separately; in particular, integration testing with Sage.

you put aside the very fact that dependence on Trac is a problem in itself: it's a crumbling monster that eats into our money and time. Just yesterday I spent a nontrivial amount of time sorting out why certain notification emails are not arriving. It turns out they are not even sent out by Trac, for reasons unknown and probably never to be known. This is very far from ideal, and getting worse: in past years we were helped by Erik, who knows a lot about Trac, but this is no longer the case.



> Evidence:
> - The pynac situation -- now finally resolved by merging it into the Sage library (https://wiki.sagemath.org/ReleaseTours/sage-9.5#Modularization_and_packaging_changes)

merging pynac into the Sage tree has not resolved the bigger problem: pynac is buggy, we don't have developers who know the pynac code, no docs exist explaining the differences, the pynac code is too far from ginac to be re-merged, and we should be getting rid of it.

> - Even for separate git repositories hosted in the SageMath GitHub organization (https://github.com/sagemath/), code ownership and review workflow are unclear.
>   - For example, https://github.com/sagemath/sage-numerical-backends-coin provides no information on how to contribute (commonly expected in a file CONTRIBUTING.md)
>   - Neither does https://github.com/sagemath/cysignals, which moreover looks abandoned: Issues and PRs are not tended to. Is it subject to peer review like other parts of Sage? Who is in charge of merging PRs?
>   - Similar issues with https://github.com/sagemath/cypari2

these issues are primarily due to the main development venue being Trac, which is disconnected from Sage's GitHub; this disconnect bottlenecks participation of Sage developers in these GitHub-based sub-projects.



> Recommendation: Keep MONOREPO for all distributions that fill the sage.PAC.KAGE.MODULE namespace (= distribution packages named sagemath-... -- according to my recommendation in part I).

I believe that this hinges a lot on reliance on Trac, but we should not be bound by it.
With Trac gone, most benefits of monorepo will go, as well.


> Recommendation: For distributions that do not use the sage.PAC.KAGE.MODULE namespace (= distribution packages named something other than sagemath-... -- according to my recommendation in part I), weigh the benefits vs. drawbacks of MONOREPO vs. MULTIREPO. It is also simple enough for a distribution to start out being developed inside of the Sage MONOREPO and to be split out to a separate repository later.
>
> Recommendation: Discuss the development workflow for projects in the https://github.com/sagemath organization, and document it in files CONTRIBUTING.md in the individual repositories.

My recommendation is to start switching to GitHub (or GitLab) now. This will diminish our over-reliance on Trac, lower the entry barrier into the project, naturally facilitate modularization, improve testing, etc.

Another recommendation is to look at yet another way of splitting: Git submodules.
E.g. SciPy has half a dozen Git submodules.

To this end, I am stopping, as of now, all non-emergency work on maintenance and updates of Trac (and of its git server, which is a separate pain in the neck). I am willing to work on exporting Trac to GitHub, etc.


Dima

Matthias Koeppe

Oct 11, 2021, 12:28:03 PM
to sage-devel
On Monday, October 11, 2021 at 1:31:14 AM UTC-7 Dima Pasechnik wrote:
> you put aside the very fact that dependence on Trac is a problem in itself

Yes, I agree that Trac itself is a problem. But I have deliberately designed the modularization project in a way that it does NOT require changing people's development workflows.

It's... modularized. 


Matthias Koeppe

Oct 11, 2021, 12:36:54 PM
to sage-devel
On Monday, October 11, 2021 at 1:31:14 AM UTC-7 Dima Pasechnik wrote:
> On Mon, Oct 11, 2021 at 1:17 AM Matthias Koeppe <matthia...@gmail.com> wrote:
>> Recommendation: Keep MONOREPO for all distributions that fill the sage.PAC.KAGE.MODULE namespace (= distribution packages named sagemath-... -- according to my recommendation in part I).
>
> I believe that this hinges a lot on reliance on Trac, but we should not be bound by it.
> With Trac gone, most benefits of monorepo will go, as well.

Not at all. Still the same even with a switch to using GitHub.

Dima Pasechnik

Oct 11, 2021, 1:27:31 PM
to sage-devel
Well, I'd have said, problems of multi-repos will disappear.



Nils Bruin

Oct 11, 2021, 2:20:41 PM
to sage-devel
I *really* appreciate that I can install sage from source and have it ready for development by issuing a single git clone (well, a second one to get git-trac. That's a bit annoying) and then a combination of make and configure that I can never remember, but the system helps me along. I find it a little annoying that it recommends a "yum install..."  of a whole bunch of stuff, where that really should be a "dnf install ...", but that's again an easy edit to make.

I also really appreciate that fixing something and/or stepping between different branches is all done through a single git repo, so that "git branch ..." gets my entire relevant tree in a sensible state (and I don't have to figure out where to submit fixes in different parts of the tree). I absolutely dread having to navigate multiple git repositories for making a fix and/or having to figure out WHICH repository a fix should go into.

Currently, having code spun off in "spkgs" is already a barrier to fixing things. Recently there was a ticket where some maxima/ecl patch needed work. The only reason I was able to contribute there was that the whole situation could be recreated by monkey-patching maxima on the spot. Figuring out how to pull/patch/build an spkg would have been a prohibitive barrier. I understand the rationale behind the spkgs, but I would think splitting up sagemath in more repositories would pose a higher barrier of entry to contributions. So PLEASE make it possible to pretend sagemath is still in one big repo, where the same build tools all apply.

Another worry that I have is that modularizing sage will lead to destabilization. The original reason for putting all components of sagemath into one big environment was to reduce the number of variables in getting things to work. I expect that if modularization takes off and sage ends up split in more components, then 10 years down the road there will be a group of developers that throw their hands up in desperation because it's just so hard to get all different components together with versions that actually work well together, and trying to keep different components working across version ranges is just so tricky (plus it's too hard to get exactly the right versions from all places, because the newest version of some components will always just have made some incompatible change that the others still have to catch up to). So PLEASE make it easy to have a reference of known-to-work components together, like sage-the-distribution does. I think we'll need it in the future again.
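To make the idea of a "reference of known-to-work components" concrete, here is a small Python sketch that compares installed versions against a pinned reference set. The component names and version strings are hypothetical, and the fake lookup exists only so that the example runs without installing anything; by default the function uses importlib.metadata.version.

```python
from importlib.metadata import PackageNotFoundError, version


def check_pins(pins, lookup=version):
    """Compare installed versions against a pinned reference set.

    Returns a dict mapping each mismatching distribution name to a
    (pinned, installed) pair; installed is None when it is not installed.
    """
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


# Hypothetical reference set and a fake environment, for illustration only.
reference = {"component-a": "1.2", "component-b": "3.0", "component-c": "0.9"}
fake_env = {"component-a": "1.2", "component-b": "3.1"}


def fake_lookup(name):
    """Stand-in for importlib.metadata.version backed by fake_env."""
    try:
        return fake_env[name]
    except KeyError:
        raise PackageNotFoundError(name)


report = check_pins(reference, lookup=fake_lookup)
```

Here report flags component-b (pinned 3.0, found 3.1) and component-c (not installed). A real reference set would also have to cover non-Python components, which this sketch does not attempt.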

Dima Pasechnik

Oct 11, 2021, 3:08:29 PM
to sage-devel
On Mon, Oct 11, 2021 at 7:20 PM Nils Bruin <nbr...@sfu.ca> wrote:
> On Monday, 11 October 2021 at 09:36:54 UTC-7 Matthias Koeppe wrote:
>> On Monday, October 11, 2021 at 1:31:14 AM UTC-7 Dima Pasechnik wrote:
>>> I believe that this hinges a lot on reliance on Trac, but we should not be bound by it.
>>> With Trac gone, most benefits of monorepo will go, as well.
>>
>> Not at all. Still the same even with a switch to using GitHub.
>
> I *really* appreciate that I can install sage from source and have it ready for development by issuing a single git clone (well, a second one to get git-trac. That's a bit annoying

you don't need git trac :-)

> ) and then a combination of make and configure that I can never remember, but the system helps me along. I find it a little annoying that it recommends a "yum install..."  of a whole bunch of stuff, where that really should be a "dnf install ...", but that's again an easy edit to make.

> I also really appreciate that fixing something and/or stepping between different branches is all done through a single git repo, so that "git branch ..." gets my entire relevant tree in a sensible state (and I don't have to figure out where to submit fixes in different parts of the tree). I absolutely dread having to navigate multiple git repositories for making a fix and/or having to figure out WHICH repository a fix should go into.
>
> Currently, having code spun off in "spkgs" is already a barrier to fixing things. Recently there was a ticket where some maxima/ecl patch needed work. The only reason I was able to contribute there was that the whole situation could be recreated by monkey-patching maxima on the spot. Figuring out how to pull/patch/build an spkg would have been a prohibitive barrier. I understand the rationale behind the spkgs, but I would think splitting up sagemath in more repositories would pose a higher barrier of entry to contributions. So PLEASE make it possible to pretend sagemath is still in one big repo, where the same build tools all apply.

most of our spkgs are never changed by us; they are just Python packages, which should never have been converted into spkgs in the first place. And what really is touched would, in the normal world, be set up as a git submodule, or something like that, which does not require a ritual dance to put a patch in and do an update.


> Another worry that I have is that modularizing sage will lead to destabilization. The original reason for putting all components of sagemath into one big environment was to reduce the number of variables in getting things to work. I expect that if modularization takes off and sage ends up split in more components, then 10 years down the road there will be a group of developers that throw their hands up in desperation because it's just so hard to get all different components together with versions that actually work well together, and trying to keep different components working across version ranges is just so tricky (plus it's too hard to get exactly the right versions from all places, because the newest version of some components will always just have made some incompatible change that the others still have to catch up to). So PLEASE make it easy to have a reference of known-to-work components together, like sage-the-distribution does. I think we'll need it in the future again.

other Python projects somehow manage without pulling in development toolchains for a number of languages and vendoring a hundred Python packages, and they work just fine.

Let's stop this SageMath exceptionalism; our development model is quite unique, but that's nothing to be happy about, rather the opposite.

Sage-the-distribution should die in its present form: it's too bulky, and without paid labour to carry on maintaining these endless dependencies (something there is no real need for), it's destined to fall behind and crumble.

Dima

 


Matthias Koeppe

Oct 11, 2021, 3:14:40 PM
to sage-devel
On Monday, October 11, 2021 at 11:20:41 AM UTC-7 Nils Bruin wrote:
> Another worry that I have is that modularizing sage will lead to destabilization. The original reason for putting all components of sagemath into one big environment was to reduce the number of variables in getting things to work. I expect that if modularization takes off and sage ends up split in more components, then 10 years down the road there will be a group of developers that throw their hands up in desperation because it's just so hard to get all different components together with versions that actually work well together, and trying to keep different components working across version ranges is just so tricky (plus it's too hard to get exactly the right versions from all places, because the newest version of some components will always just have made some incompatible change that the others still have to catch up to). So PLEASE make it easy to have a reference of known-to-work components together, like sage-the-distribution does. I think we'll need it in the future again.

I fully agree with all of this, and it matches my plan and recommendations. 

I find it helpful to think about sage-the-distribution (which is more or less the same as the contents of the Sage monorepo) as the reference distribution for Sage developers, in which, as Nils says, all the components fit together, and developers are able to make changes without having to buy into the specifics of packaging systems such as nix, conda, ... To me it is secondary whether we advertise Sage-the-distribution to end users -- in fact, there may be better distribution options (such as conda) for end users. 

Matthias Koeppe

Oct 11, 2021, 3:26:18 PM
to sage-devel
On Monday, October 11, 2021 at 12:08:29 PM UTC-7 Dima Pasechnik wrote:
> other Python projects somehow manage

Dima, the dependencies of the Sage library are real -- they cannot be eliminated by wishful thinking or by eliminating the mechanism that makes it easy for developers to install them.

The way to address the dependencies is ... modularization: Define distributions of parts of the Sage library, each of which has a small number of dependencies only, and fix the problems of the Sage library in which "everything depends on everything else".

 

Dima Pasechnik

Oct 11, 2021, 3:54:24 PM
to sage-devel
On Mon, Oct 11, 2021 at 8:26 PM Matthias Koeppe <matthia...@gmail.com> wrote:
> On Monday, October 11, 2021 at 12:08:29 PM UTC-7 Dima Pasechnik wrote:
>> other Python projects somehow manage
>
> Dima, the dependencies of the Sage library are real -- they cannot be eliminated by wishful thinking or by eliminating the mechanism that makes it easy for developers to install them.

the proper mechanism is already there, there is no need to duplicate it.
The biggest part of modularisation should be unvendoring, e.g.
unvendoring Jupyter is removing about ~50 Sage spkgs and replacing their installation by one command,

    > python3.10 -m pip install jupyter --user

That's about 15% of all the spkgs. No need to care about individual versions of these Jupyter components,
Jupyter is doing this job for us already. Same with numpy, scipy, sympy...




Matthias Koeppe

Oct 11, 2021, 4:01:28 PM
to sage-devel
On Monday, October 11, 2021 at 12:54:24 PM UTC-7 Dima Pasechnik wrote:
> unvendoring Jupyter is removing about ~50 Sage spkgs and replacing their installation by one command,
>
>     > python3.10 -m pip install jupyter --user
>
> That's about 15% of all the spkgs. No need to care about individual versions of these Jupyter components

Jupyter has really nothing to do with anything in this thread.

But yes, it also needs work, see https://trac.sagemath.org/ticket/30306 "Meta-ticket: Use system Jupyter notebook / JupyterLab".


 

Dima Pasechnik

Oct 11, 2021, 4:05:19 PM
to sage-devel
On Mon, Oct 11, 2021 at 9:01 PM Matthias Koeppe <matthia...@gmail.com> wrote:
On Monday, October 11, 2021 at 12:54:24 PM UTC-7 Dima Pasechnik wrote:
unvendoring Jupyter is removing about ~50 Sage spkgs and replacing their installation by one command,

    > python3.10 -m pip install jupyter --user

That's about 15% of all the spkgs. No need to care about individual versions of these Jupyter components

Jupyter has really nothing to do with anything in this thread.

at least I got an impression from your proposal that among other modules you want to create a sage-jupyter package, as a module?



But yes, it also needs work, see https://trac.sagemath.org/ticket/30306 "Meta-ticket: Use system Jupyter notebook / JupyterLab".


 


Matthias Koeppe

Oct 11, 2021, 4:08:12 PM
to sage-devel
On Monday, October 11, 2021 at 1:05:19 PM UTC-7 Dima Pasechnik wrote:
On Mon, Oct 11, 2021 at 9:01 PM Matthias Koeppe <matthia...@gmail.com> wrote:
Jupyter has really nothing to do with anything in this thread.

at least I got an impression from your proposal that among other modules you want to create a sage-jupyter package, as a module?

No. See https://trac.sagemath.org/ticket/29705 for a list of distribution packages that I have in mind so far. They are in boldface on the ticket description.




Nathan Dunfield

Oct 12, 2021, 10:43:38 AM
to sage-devel
On Sunday, October 10, 2021 at 7:17:33 PM UTC-5 Matthias Koeppe wrote:
Recommendation: Keep MONOREPO for all distributions that fill the sage.PAC.KAGE.MODULE namespace (= distribution packages named sagemath-... -- according to my recommendation in part I). 

Recommendation: For distributions that do not use the sage.PAC.KAGE.MODULE namespace (= distribution packages named something other than sagemath-... -- according to my recommendation in part I), weigh the benefits vs. drawbacks of MONOREPO vs. MULTIREPO. It is also simple enough for a distribution to start out being developed inside of the Sage MONOREPO and to be split out to a separate repository later.

I think these are the right choices, at least for now.  Once modularization is complete, perhaps MULTIREPO becomes viable, as it is for scipy, but until the work is done I don't think it is clear how one would want to subdivide it.

Best,

Nathan

 

Tobia...@gmx.de

Oct 14, 2021, 5:54:05 AM
to sage-devel
I think the discussion and the initial post mix a few things that are not really related to the question of mono- vs. multi-repo. In particular, the question of how to continue with Trac is somewhat orthogonal (you can easily have a monorepo on GitHub or multiple Trac repos).

In the end, it is a question of how to structure and organize the code.
Monorepo: everything in one git repo, each package is a subfolder (roughly how it is currently organized under src/sage).
Multirepo: each package has its own git repo, with maybe one central repo that references the other repos via git submodules.

A monorepo provides a centralized place to manage dependencies, allows for easier code sharing, provides a single point of entry for new contributions, and uses a central CI. Admittedly, with good infrastructure similar benefits can be achieved for multirepos as well (e.g. write a central GitHub action that is used in each package repo). In my opinion, the biggest advantage of multirepos is that they define a clear boundary for each module, which prevents developers from relying on internal code of another package. Another benefit of multirepos is that build and test times for the CI are much lower, as only the code of that package has to be tested (but again, this can also be achieved with more intelligent CI configs). Other advantages of multirepos, such as clear code ownership (responsibility for code review), can also be achieved in a monorepo.
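
To illustrate what a "more intelligent CI config" could mean in a monorepo, here is a minimal sketch in Python that maps a list of changed files to the package subfolders that would need to be rebuilt and tested. The function name affected_packages, the directory name "pkgs", and the package names in the example are assumptions made purely for illustration; this is not an existing Sage or CI tool.

```python
from pathlib import PurePosixPath


def affected_packages(changed_files, packages_root="pkgs"):
    """Return the sorted names of package subfolders touched by changed_files.

    Hypothetical layout: every package lives in <packages_root>/<package>/...
    """
    root_parts = PurePosixPath(packages_root).parts
    affected = set()
    for name in changed_files:
        parts = PurePosixPath(name).parts
        # A file counts only if it lies strictly inside <packages_root>/<package>/
        inside_root = parts[:len(root_parts)] == root_parts
        if inside_root and len(parts) > len(root_parts) + 1:
            affected.add(parts[len(root_parts)])
    return sorted(affected)


# Example with made-up paths, e.g. taken from the output of "git diff --name-only"
changed = [
    "pkgs/alpha/setup.py",
    "pkgs/beta/src/module.py",
    "README.md",
]
print(affected_packages(changed))
```

A CI job could then run the tests of only the packages in that list. Note that such a sketch ignores dependencies between packages, which a real setup would have to take into account.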

In summary, I think a monorepo makes the most sense at the moment given the current structure and project philosophy. But at the same time the multirepo approach keeps its advantages only if modern build and project management tooling is used. Thus I also strongly agree with Dima that Trac is horrendous and outdated, and that one should try to switch to existing tools for dependency management rather than the home-built sage-the-distribution (at least for Python packages).

Matthias Koeppe

Nov 21, 2021, 11:47:16 PM
to sage-devel
On Sunday, October 10, 2021 at 5:17:33 PM UTC-7 Matthias Koeppe wrote:
- Even for separate git repositories hosted in the SageMath GitHub organization (https://github.com/sagemath/), code ownership and review workflow are unclear.
  - For example, https://github.com/sagemath/sage-numerical-backends-coin provides no information on how to contribute (commonly expected in a file CONTRIBUTING.md)
  - Neither does https://github.com/sagemath/cysignals, which moreover looks abandoned: Issues and PRs are not tended to. Is it subject to peer review like other parts of Sage? Who is in charge of merging PRs?
  - Similar issues with https://github.com/sagemath/cypari2

These issues are still unresolved.
In the meantime I have added @kliem to the list of cysignals and cypari2 admins so that we can make progress on the pressing portability issues.



 