Hi,
> Bup did say it borrowed a lot of ideas from rsync, including its
> rolling checksum technique.
Yeah, but that's not related to md5.
> Rsync uses MD5 as a stronger algorithm when the weaker rolling
> checksum algorithm claims it's found a match.
I'm not sure this is quite right, but I haven't looked at rsync
specifically.
In bup, the rolling hash is just used to split the data into chunks, and
then the chunks are checksummed independently.
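To make that concrete, here's a rough sketch of the idea in Python. It is
not bup's actual code: the window size, the split mask and the plain
byte-sum "rolling hash" are toy values I made up for illustration, and
the names (split_chunks, chunk_ids) are hypothetical. The point is only
that the rolling value picks the chunk boundaries, and a separate strong
hash then names each chunk.

```python
import hashlib

WINDOW = 16            # toy window size (assumption, not bup's value)
MASK = (1 << 6) - 1    # toy rule: split when the low 6 bits are all ones


def split_chunks(data: bytes):
    """Yield chunks whose boundaries depend only on the last WINDOW bytes."""
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte                  # newest byte enters the window
        if i >= WINDOW:
            rolling -= data[i - WINDOW]  # oldest byte leaves the window
        if (rolling & MASK) == MASK:
            yield data[start:i + 1]
            start = i + 1
    if start < len(data):
        yield data[start:]               # trailing partial chunk


def chunk_ids(data: bytes):
    """Checksum every chunk independently with the strong hash."""
    return [hashlib.sha1(chunk).hexdigest() for chunk in split_chunks(data)]
```

Since a boundary only depends on the bytes just before it, inserting a
byte near the start of a file shifts the early chunks, but later chunks
come out the same and keep the same ids.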
> If Bup does not use MD5 as a stronger algorithm then what does it use?
It's compatible with git, so it uses git's blob/checksum construction
that uses sha1.
This is being changed in git (it's moving to sha256), and I expect bup
might follow eventually. Note that blake3 isn't a contender in git at
this point, afaict.
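For reference, git's blob id is the sha1 of a short header ("blob", the
content length, a NUL byte) followed by the content. A minimal sketch,
where git_blob_id is just a name I picked for illustration:

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the id git assigns to a blob with this content."""
    header = b"blob %d\0" % len(content)  # "blob <length>" plus a NUL byte
    return hashlib.sha1(header + content).hexdigest()
```

This should agree with what `git hash-object` prints for the same bytes,
e.g. the empty blob comes out as
e69de29bb2d1d6434b8b29ae775ad8c2e48c5391.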
> Anyways, may you please explain to me why BLAKE3 cannot be used as a
> rolling hash cheaply?
Because by design you cannot remove leading input bytes from a
cryptographic hash once they've been hashed, and being able to drop the
oldest byte is the key property you need for a rolling hash. Otherwise
you have to recalculate the checksum over every window again and again.
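A small sketch of the difference, using a plain byte sum as a stand-in
for a real rolling checksum (toy code, with function names I made up for
illustration). The sum moves to the next window with one add and one
subtract, while the cryptographic hash has to be recomputed from scratch
for every window position:

```python
import hashlib


def rolling_sums(data: bytes, window: int):
    """Sum of every window, updated in O(1) per step."""
    if len(data) < window:
        return []
    s = sum(data[:window])
    sums = [s]
    for i in range(window, len(data)):
        s += data[i] - data[i - window]  # add newest byte, drop oldest
        sums.append(s)
    return sums


def windowed_sha1(data: bytes, window: int):
    """sha1 of every window: no way to 'drop' a byte, so rehash each time."""
    return [hashlib.sha1(data[i:i + window]).digest()
            for i in range(len(data) - window + 1)]
```

For n bytes and window size w, the first does about n steps of constant
work, the second hashes about n * w bytes in total.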
> I am the one here that is learning about bup, git, and rsync and how
> they compete against each other for performance.
But they don't.
johannes