On Sat, May 31, 2014 at 6:02 AM, Daniel Theophanes <kard...@gmail.com> wrote:
> I'm aware of how the rsync algo works [1], when I tested on my package it
> made enough difference that I kept it as md5 by default (though you can
> replace it with whatever Hasher you want). But at the time I was using it
> for rdiff for a file over 2 GiB, so the hash time added up.
Even for large files, the difference should be roughly proportional to
the difference between running each algorithm over the file once.
> 1. https://bitbucket.org/kardianos/rsync/src/
That doesn't look right. The signature is being created based on fixed
block sizes. If you prepend 1 byte to the whole file, it'll resend the
whole file as the delta, because all of the blocks have been shifted.
You need to apply the same rolling checksum logic when creating the
signature to split the blocks at appropriate places and allow for
reuse.
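
To make the idea concrete, here is a rough sketch of splitting blocks
on content rather than on fixed offsets. It is not code from the
package above, nor rsync's actual implementation: splitChunks,
countShared, sampleData, the plain byte-sum rolling checksum, and the
window and divisor values are all made up for illustration.

```go
package main

import (
	"crypto/md5"
	"fmt"
	"math/rand"
)

const (
	window  = 16 // bytes covered by the rolling sum (illustrative value)
	divisor = 64 // a cut happens on roughly 1 in 64 positions (illustrative value)
)

// splitChunks cuts data after every position where the sum of the last
// `window` bytes matches a fixed pattern. The cut points depend only on
// nearby content, so they move along with the data when bytes are inserted.
func splitChunks(data []byte) [][]byte {
	var chunks [][]byte
	var sum uint32
	start := 0
	for i, b := range data {
		sum += uint32(b)
		if i >= window {
			sum -= uint32(data[i-window]) // drop the byte leaving the window
		}
		// Only consider a cut once the window is full.
		if i >= window-1 && sum%divisor == divisor-1 {
			chunks = append(chunks, data[start:i+1])
			start = i + 1
		}
	}
	if start < len(data) {
		chunks = append(chunks, data[start:]) // trailing partial chunk
	}
	return chunks
}

// countShared reports how many chunks of b have the same MD5 as some chunk of a.
func countShared(a, b [][]byte) int {
	seen := make(map[[md5.Size]byte]bool)
	for _, c := range a {
		seen[md5.Sum(c)] = true
	}
	n := 0
	for _, c := range b {
		if seen[md5.Sum(c)] {
			n++
		}
	}
	return n
}

// sampleData returns n deterministic pseudo-random bytes for the demo.
func sampleData(n int) []byte {
	rng := rand.New(rand.NewSource(1))
	data := make([]byte, n)
	for i := range data {
		data[i] = byte(rng.Intn(256))
	}
	return data
}

func main() {
	orig := sampleData(4096)
	shifted := append([]byte{0xFF}, orig...) // prepend a single byte
	a, b := splitChunks(orig), splitChunks(shifted)
	fmt.Printf("%d of %d chunks reusable after prepending one byte\n",
		countShared(a, b), len(b))
}
```

With this scheme only the chunk(s) at the very start of the shifted
file differ; every later cut point lands on the same content as
before, so those chunks can be reused. A real implementation would
want a stronger rolling checksum and minimum/maximum chunk sizes.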
gustavo @ http://niemeyer.net