a serious bug uncovered in Sage 9.7-10.6 (fixed in 10.7)

Dima Pasechnik

Dec 18, 2025, 3:52:48 PM
to sage-devel, sage-support, sage-release
Maxim Kontsevich reported patently wrong answers from modular forms
code in https://github.com/sagemath/sage/issues/41267.
We were able to pin them down to setting Parallelism().set(nproc=k),
for any k>1. The error does not depend on the platform (observed
originally in Linux Conda, but since found to occur in "normal"
builds too, on Linux x86_64 and on Intel macOS); arm64 etc. still
need to be checked.

It would be great to understand what fixed it - any ideas?

For reasons unclear to me, the git history between tags 10.6 and 10.7
is not clean (somehow, 10.7 is not "based" upon 10.6 in Git sense),
breaking a straightforward git bisect.
Help with the latter would be appreciated, too.
(otherwise one would need to do a manual git rebase of 10.7 over 10.6,
which isn't instant)

Dima

PS. Computations done in Sage 9.7-10.6 under Parallelism().set(nproc=k)
with k>1 (e.g. one might have set "Parallelism().set(nproc=42)" in
~/.sage/init.sage) thus might be incorrect :-(
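
To check whether a startup file contains such a setting, something like the following plain-Python scan could be used. This is only a sketch: the helper name risky_nproc_lines is made up for illustration, the path ~/.sage/init.sage is assumed to be the default location, and the regular expression only recognises literal calls of the form Parallelism().set(..., nproc=k). It would not catch calls that leave nproc unspecified or that build the value dynamically.

```python
import re
from pathlib import Path

# Matches e.g. Parallelism().set(nproc=42) or Parallelism().set('linbox', nproc=2)
PATTERN = re.compile(r"Parallelism\(\)\.set\([^)]*\bnproc\s*=\s*(\d+)")


def risky_nproc_lines(text):
    """Return (line_number, nproc) for each uncommented set(nproc=k) with k > 1."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # drop anything after a comment marker
        match = PATTERN.search(code)
        if match and int(match.group(1)) > 1:
            hits.append((lineno, int(match.group(1))))
    return hits


if __name__ == "__main__":
    init_file = Path.home() / ".sage" / "init.sage"  # assumed default location
    if init_file.is_file():
        for lineno, k in risky_nproc_lines(init_file.read_text()):
            print(f"{init_file}:{lineno}: nproc={k}")
```

Any line it reports would be a reason to re-run the affected computations with nproc=1 or in Sage 10.7.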

William Stein

Dec 18, 2025, 4:26:20 PM
to sage-r...@googlegroups.com, sage-devel, sage-support
On Thu, Dec 18, 2025 at 12:52 PM Dima Pasechnik <dim...@gmail.com> wrote:
>
> Maxim Kontsevich reported patently wrong answers from modular forms
> code in https://github.com/sagemath/sage/issues/41267.
> We were able to pin them down to setting Parallelism().set(nproc=k),
> for any k>1. The error is not dependent upon the platform (observed
> in Linux Conda originally, but meanwhile found to occur in "normal"
> builds, too, on Linux x86_64 and on Intel macOS) - arm64 etc still
> needs to be checked.
>
> It would be great to understand what fixed it - any ideas?

(Pure speculation below, but I did write CuspForms so maybe it's helpful.)

Just looking at your log:

sage: Parallelism()
Number of processes for parallelization:
- linbox computations: 1
- tensor computations: 1
sage: CuspForms(Gamma1(2), 10).hecke_matrix(5)
[870]
sage: Parallelism().set(nproc=2)
sage: Parallelism()
Number of processes for parallelization:
- linbox computations: 2
- tensor computations: 2
sage: CuspForms(Gamma1(2), 10).hecke_matrix(5)
[534154/3]

and seeing "linbox" makes me think "maybe there is a bug in linbox"?
I don't even know what "Parallelism()" is, but my guess is it sets a
parameter that impacts how linbox runs. Modular forms computations
use linbox for fast exact linear algebra, and maybe there's a bug in
linbox. Linbox is very actively developed so maybe a routine
upgrade fixed that bug.




>
> For reasons unclear to me, the git history between tags 10.6 and 10.7
> is not clean (somehow, 10.7 is not "based" upon 10.6 in Git sense),
> breaking a straightforward git bisect.
> Help with the latter would be appreciated, too.
> (otherwise one would need to do a manual git rebase of 10.7 over 10.6,
> which isn't instant)
>
> Dima
>
> PS. Computations done in Sage 9.7-10.6 under Parallelism().set(nproc=k)
> (e.g. one might have set "Parallelism().set(nproc=42)" in ~/.sage/init.sage/)
> thus might be incorrect :-(
>
> --
> You received this message because you are subscribed to the Google Groups "sage-release" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sage-release...@googlegroups.com.
> To view this discussion visit https://groups.google.com/d/msgid/sage-release/CAAWYfq0NoeR%2BxKRsKgvk-0P5PHyk1Tnc35gXhxXs-rSwc6gM-g%40mail.gmail.com.



--
William (http://wstein.org)

Dima Pasechnik

Dec 18, 2025, 7:25:33 PM
to sage-r...@googlegroups.com, sage-support, sage-devel
On Thu, Dec 18, 2025 at 3:29 PM Volker Braun <vbrau...@gmail.com> wrote:
>
> In April i accidentally rewrote 10.6.rc1 version commit from 8a8453f35f3 to 10741006a47, changing only metadata. That was the only mistake that I'm aware of. But it only means that 8a8453f35f3 isn't part of the "release tree".
>
> If you keep going to the first parent commit starting at 10.7 (85c8f1e8a26) then you end up at 10.6 (b8f98e7c7c3). So 10.7 is most certainly based on 10.6 in the git sense.
>
> You are probably tripping over messy merges in-between. To bisect you need --first-parent to only bisect at the release merges.
>
> $ git checkout 10.7
> $ git bisect start --first-parent
> $ git bisect new HEAD
> $ git bisect old 10.6
> Bisecting: 229 revisions left to test after this (roughly 8 steps)
> [581aae7712a34b2a143d4e8decc03344ff862aa3] gh-40164: ⬆️ Bump astral-sh/setup-uv from 6.0.1 to 6.1.0
>

Thanks for the tip. The bugfix happened at commit
3531a873beb5df16d1172525013ba9159f3f84d0,
that is, when https://github.com/sagemath/sage/pull/39733 was merged.

Basically, it switches the default linear algebra echelonize() method in
src/sage/matrix/matrix_rational_dense.pyx to a different algorithm, avoiding
the use of "multimodular", i.e. _echelonize_multimodular(), which calls
matrix_rational_echelon_form_multimodular() - which apparently does not
work correctly with `Parallelism().set(nproc=2)` (or bigger than 2).

So a bug is still there, it's just hidden, in a way.
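
For readers unfamiliar with the term, "multimodular" methods in general compute images of the answer modulo several primes, combine them with the Chinese Remainder Theorem, and then recover the rational result by rational reconstruction. The following is a generic plain-Python sketch of that idea for a single rational number. It is not Sage's matrix_rational_echelon_form_multimodular() code and does not model the parallel bug; the function names, the primes, and the value -7/3 are all made up for illustration.

```python
from fractions import Fraction
from math import gcd, isqrt


def crt(residues, moduli):
    """Combine residues modulo pairwise coprime moduli into one residue."""
    x, m = 0, 1
    for r, p in zip(residues, moduli):
        # Add a multiple of m so that x also matches r modulo p.
        t = ((r - x) * pow(m, -1, p)) % p
        x += m * t
        m *= p
    return x % m, m


def rational_reconstruct(a, m):
    """Find n/d with n = a*d (mod m) and |n|, |d| <= sqrt(m/2), else raise."""
    bound = isqrt(m // 2)
    r0, r1 = m, a % m
    t0, t1 = 0, 1
    # Extended Euclid, stopped at the first remainder within the bound.
    while r1 > bound:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        t0, t1 = t1, t0 - q * t1
    if t1 == 0 or abs(t1) > bound or gcd(t1, m) != 1:
        raise ValueError("no reconstruction within bounds")
    return Fraction(r1, t1)


def recover(value, primes):
    """Stand-in for per-prime computations: reduce, combine, reconstruct."""
    residues = [(value.numerator * pow(value.denominator, -1, p)) % p
                for p in primes]
    a, m = crt(residues, primes)
    return rational_reconstruct(a, m)


if __name__ == "__main__":
    print(recover(Fraction(-7, 3), [101, 103, 107]))
```

The reconstruction step is only reliable when the product of the primes is large enough relative to the numerator and denominator, and when every modular image fed into it is itself correct.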

Dima