Possibly insecure verification of sage source downloaded from a mirror

Georgi Guninski

Feb 12, 2025, 2:15:10 AM
to sage-...@googlegroups.com
Hi,

Many thanks for listing me on your developer map.

On sagemath.org I go to Download source [1], then I choose a mirror [2].

On the mirror I see:

sage-10.5.tar.gz torrent 1535.20 MB 2024-12-04 00:28
MD5: 83dab794f87e989a30e248f3b39c40db

There are several potential issues with this:

1. If the mirror is compromised or MITM'ed, it could serve any checksum it likes.
There is an unencrypted "http://" mirror, University of Washington,
Seattle, WA, USA [3].
Is this a real concern?
2. The MD5 hash function has been deprecated for years and is considered broken.

One possible approach is to sign the tarballs, or only the hashes, with gpg.

[1] https://www.sagemath.org/download-source.html
[2] https://sage.mirror.garr.it/mirrors/sage/src/index.html
[3] http://files.sagemath.org/src/index.html

Nils Bruin

Feb 12, 2025, 1:22:14 PM
to sage-devel
On Tuesday, 11 February 2025 at 23:15:10 UTC-8 Georgi Guninski wrote:
> On the mirror I see:
>
> sage-10.5.tar.gz torrent 1535.20 MB 2024-12-04 00:28
> MD5: 83dab794f87e989a30e248f3b39c40db
>
> There are several potential issues with this:
>
> 1. If the mirror is compromised or MITM'ed, it could serve any checksum it likes.
> There is an unencrypted "http://" mirror, University of Washington,
> Seattle, WA, USA [3]. Is this a real concern?

The web has been moving to https for pretty much everything. That makes setting up a site more painful, because you need a certificate signed by an authority that most browsers recognize by default. Malicious players can still do that, though: it's not *that* hard to get such a certificate. We could just remove the non-https mirrors (or see whether those mirrors can be upgraded to https).

Posting a checksum right next to a download file is never going to protect against spoofing: the file and the hash are under the same control, so a malicious player can keep them in sync just as easily as a maintainer does under normal circumstances.
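
To make the point about co-located checksums concrete, here is a minimal sketch in Python. The `mirror` dictionary, the byte strings, and the "out of band" digest are all hypothetical stand-ins, not anything served by a real Sage mirror. It only illustrates that a check against a hash from the same source passes after tampering, while a check against an independently obtained hash does not.

```python
import hashlib


def checksum_ok(data: bytes, published_hex: str) -> bool:
    """Return True if the SHA-256 of data equals the published hex digest."""
    return hashlib.sha256(data).hexdigest() == published_hex.strip().lower()


# Hypothetical stand-in for the genuine tarball contents.
genuine = b"genuine sage source"

# Digest obtained through some independent channel (assumed for illustration).
trusted_digest = hashlib.sha256(genuine).hexdigest()

# An attacker who controls the mirror replaces the file AND the checksum
# that is displayed next to it.
tampered = b"tampered sage source"
mirror = {
    "tarball": tampered,
    "checksum": hashlib.sha256(tampered).hexdigest(),
}

# The check against the mirror's own checksum passes...
print(checksum_ok(mirror["tarball"], mirror["checksum"]))  # True

# ...while the check against the independently obtained digest fails.
print(checksum_ok(mirror["tarball"], trusted_digest))  # False
```

The strength of the hash function does not matter in this scenario; what matters is where the reference digest comes from.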
 
> 2. The MD5 hash function has been deprecated for years and is considered broken.

It's broken for cryptographic use. It's still perfectly fine for detecting random corruption (which https would probably catch anyway), and a posted hash like that is really meant for that purpose. Because of the chain-of-trust issues (see below), I'm not sure it's worth switching to a cryptographic hash here: the posted hash doesn't seem to solve any problem that requires one.
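
For the corruption-check use case, a download can be compared against the posted digest roughly as follows. This is a sketch using only Python's standard `hashlib`; the helper names `stream_digest` and `file_matches` are made up for this example, and the in-memory payload is a placeholder rather than real tarball data.

```python
import hashlib
import io


def stream_digest(fileobj, algorithm="sha256", chunk_size=1 << 20):
    """Hash a binary file object in chunks so a 1.5 GB tarball never sits in RAM."""
    hasher = hashlib.new(algorithm)
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        hasher.update(chunk)
    return hasher.hexdigest()


def file_matches(path, published_hex, algorithm="sha256"):
    """Compare the digest of the file at path with a published hex digest."""
    with open(path, "rb") as handle:
        return stream_digest(handle, algorithm) == published_hex.strip().lower()


# Demonstration on in-memory bytes (a placeholder for a downloaded tarball).
payload = b"stand-in for tarball bytes"
print(stream_digest(io.BytesIO(payload), "md5"))
print(stream_digest(io.BytesIO(payload), "sha256"))
```

With the values quoted above, the call would look like `file_matches("sage-10.5.tar.gz", "83dab794f87e989a30e248f3b39c40db", "md5")`. A match only tells you the file is the one the mirror intended to serve, not that the mirror is honest.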
 
> One possible approach is to sign the tarballs, or only the hashes, with gpg.

But how do downloaders establish the chain of trust for the signing key? This is one of those things where we cannot do anything other than what is common practice, because anything else puts a burden on downloaders and damages the accessibility of sagemath.

In my opinion, this problem is commonly solved nowadays by curated software distributions (through stores, trusted package repositories, etc.) with keys that are predistributed with the operating system used. The integrity control is then offloaded from the end-user to the distribution maintainers. How do package maintainers protect against getting fed compromised sources?

I know that when I download a Fedora live image to install on a new machine, I basically have no choice but to trust that I recognize the site I download it from and that the site is not compromised. I don't really have an independent means of getting a hash of the live image to check its integrity.

Michael Orlitzky

Feb 12, 2025, 7:31:18 PM
to sage-...@googlegroups.com
On 2025-02-12 10:22:14, Nils Bruin wrote:
> In my opinion, this problem is commonly solved nowadays by curated software
> distributions (through stores, trusted package repositories, etc.) with
> keys that are predistributed with the operating system used. The integrity
> control is then offloaded from the end-user to the distribution
> maintainers. How do package maintainers protect against getting fed
> compromised sources?

As the joke goes, very carefully. There are two different aspects of
this problem. First, developers need to verify the sources they get
from upstream, and second, users need to verify the sources they get
from developers by way of untrusted mirrors.

New distribution developers are taught several ways to verify the
sources that they download. HTTPS isn't great, but is preferable to
plain HTTP in this regard. Many upstreams advertise their public keys
across multiple channels. If you can confirm that the key posted to
the project's public HTTPS page is the same as the key in the
maintainer's email signature for the past five years, and the same one
posted on GitHub (or whatever), then that's a pretty good
indicator. You can ask your fellow developers to confirm this from
another country. Even better if you can meet them at FOSDEM or some
equivalent conference.
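
The cross-channel comparison described above amounts to checking that every channel shows the same key fingerprint. A minimal Python sketch follows, assuming 40-hex-digit (OpenPGP v4) fingerprints; the channel names and the fingerprint value are placeholders invented for this example, not a real project key.

```python
def normalize_fingerprint(fingerprint):
    """Strip whitespace, colons and an optional 0x prefix, then upper-case."""
    cleaned = "".join(fingerprint.split()).replace(":", "").upper()
    if cleaned.startswith("0X"):
        cleaned = cleaned[2:]
    # Assumption: v4 fingerprints, i.e. exactly 40 hexadecimal digits.
    if len(cleaned) != 40 or any(c not in "0123456789ABCDEF" for c in cleaned):
        raise ValueError("not a 40-hex-digit fingerprint: %r" % fingerprint)
    return cleaned


def channels_agree(fingerprints_by_channel):
    """True only if at least two channels were checked and all show one key."""
    normalized = {
        normalize_fingerprint(fp) for fp in fingerprints_by_channel.values()
    }
    return len(fingerprints_by_channel) >= 2 and len(normalized) == 1


# Placeholder observations, as copied by hand from each channel.
seen = {
    "project https page": "0123 4567 89AB CDEF 0123 4567 89AB CDEF 0123 4567",
    "mailing list signature": "0123456789abcdef0123456789abcdef01234567",
}
print(channels_agree(seen))  # True
```

Agreement across channels is evidence, not proof: the more independent the channels (and the people checking them), the harder it is for one compromise to fake all of them.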

To strengthen this chain of trust and make it easier to verify, we
have packages for many of these keys:

https://packages.gentoo.org/categories/sec-keys

These are shipped to each developer and user via the usual means. When
a developer is adding a new version of a package, they can set a flag
to have the package manager verify the signature on the tarball
against the right sec-keys package. The developer then signs the hash
of the tarball with his own key, which you implicitly trust by way of
turtles upon turtles going back to the installation of the OS.

That's just for developers. To get these packages to end users, we
need to get the signed hashes of both the upstream tarballs and our
build scripts to the users and then verify them. Again we do this via
packages, all signed with gentoo.org keys that you implicitly
trust. The rsync mirrors can change whatever they want, but the
package manager verifies everything that it receives from the rsync
mirrors before those changes "go live." The package manager is also
implicitly trusted (it comes with the OS), so there is a chain of
trust going back to the installation of the OS, at which point you
were supposed to say a prayer.
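
As a rough illustration of the last verification step, the sketch below checks a downloaded file against one entry of a manifest that is assumed to have already passed signature verification. The line format (`DIST name size ALGO hex ALGO hex ...`) is loosely modeled on Gentoo's Manifest files and should be treated as an assumption here; the function names and sample bytes are hypothetical, and the signature check itself is deliberately out of scope.

```python
import hashlib


def parse_dist_line(line):
    """Split 'DIST name size ALGO hex [ALGO hex ...]' into its parts."""
    fields = line.split()
    if len(fields) < 5 or fields[0] != "DIST" or len(fields[3:]) % 2 != 0:
        raise ValueError("unexpected manifest line: %r" % line)
    name, size = fields[1], int(fields[2])
    digests = dict(zip(fields[3::2], fields[4::2]))
    return name, size, digests


def verify_distfile(data, line):
    """Check the size and every listed digest of data against a trusted entry."""
    _, size, digests = parse_dist_line(line)
    if len(data) != size:
        return False
    for algorithm, expected in digests.items():
        # hashlib.new raises ValueError for algorithm names it does not know.
        actual = hashlib.new(algorithm.lower(), data).hexdigest()
        if actual != expected.lower():
            return False
    return True


# Placeholder data; a real entry would come from a signature-checked manifest.
data = b"stand-in tarball bytes"
entry = "DIST example-1.0.tar.gz %d BLAKE2B %s SHA512 %s" % (
    len(data),
    hashlib.blake2b(data).hexdigest(),
    hashlib.sha512(data).hexdigest(),
)
print(verify_distfile(data, entry))            # True
print(verify_distfile(data + b"!", entry))     # False
```

The mirror never needs to be trusted in this arrangement: it can serve anything, but only bytes that match the signed entry are accepted.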

Most distributions have a similar system. It can't be replicated
within sage (or on PyPI et al.), and everyone who wants to address
these sorts of problems is already a developer for some distribution,
and we all suggest that you get your sage packages from the
distribution in the first place :)