Ideas

guli...@gmail.com

Jul 4, 2014, 3:59:26 AM
to libr...@googlegroups.com
Hello,

I recently discovered librsync and am wondering whether it's suitable for a problem I'm trying to solve. It looks like I'd need to modify it a bit to suit my needs, and since it seems to be under development again, maybe the two efforts could be combined :)

I'm trying to back up large files (10GB+) as quickly and efficiently as possible, while keeping some form of point-in-time restore by storing reverse deltas. At first I thought of using rsync to transfer the changes to the server and then running xdelta against the previous backup to create a back-delta. But rsync is too slow: it reads the whole file on the receiver, sends the checksums, waits for the sender to send the deltas, and then does another full read to verify the file. That's 2+ full-file I/O passes on the server and 1 on the sender (and not at the same time as the server's). With rdiff, that could be reduced to 1 full-file I/O pass on the sender, provided the signature is already available.

My questions:
- Does librsync support (or could it support) computing a delta and a signature at the same time? That way, every backup operation would produce a delta that is saved on the server and also a signature to be used for the next backup, all in one operation. (A back-delta can later be calculated offline on the server from the new signature and the old file.)

- How about storing an externally verifiable checksum in the signature (MD5, or even better SHA-256)? rdiff info could display the checksum from the signature, and rdiff verify could check a file against its signature. (Or is this already possible with current signatures?)
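
To make the first question concrete, here is a rough sketch in Python of what I mean by "one operation". It is not librsync's API or file format: the function name, the delta ops and the dict-based signature are all made up for illustration, and it only matches whole blocks at fixed offsets, so it leaves out the rolling checksum that lets rsync find matches at arbitrary byte positions. The point is just that a single read of the new file can feed the delta, the next signature and a whole-file checksum together.

```python
import hashlib
import io


def delta_and_signature(new_stream, old_signature, block_size=4):
    """Read new_stream once; return (delta, new_signature, whole_file_sha256).

    Hypothetical formats, for illustration only:
      old_signature / new_signature: dict of block sha256 hex -> block index
      delta: list of ("copy", old_block_index) or ("literal", bytes) ops
    """
    delta = []
    new_signature = {}
    whole = hashlib.sha256()  # externally verifiable whole-file checksum
    index = 0
    while True:
        block = new_stream.read(block_size)
        if not block:
            break
        whole.update(block)
        digest = hashlib.sha256(block).hexdigest()
        # Record the block for the signature used by the NEXT backup.
        new_signature.setdefault(digest, index)
        # Emit a delta op against the signature from the PREVIOUS backup.
        if digest in old_signature:
            delta.append(("copy", old_signature[digest]))
        else:
            delta.append(("literal", block))
        index += 1
    return delta, new_signature, whole.hexdigest()


# Tiny example: the middle block changes between two backups.
old_data = b"aaaabbbbcccc"
new_data = b"aaaaXXXXcccc"
old_sig = delta_and_signature(io.BytesIO(old_data), {})[1]
delta, new_sig, checksum = delta_and_signature(io.BytesIO(new_data), old_sig)
print(delta)
print(checksum)
```

In this toy run the first and last blocks become copy ops and only the changed middle block is sent as a literal, while the new signature and checksum come out of the same read.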

I'm assuming that the improvements to rsync's rolling checksum algorithm would need to be ported to librsync again? Are there any improvements worth working on?

Am I missing something? Where could I start working on this? :)

Regards,
gulikoza

Martin Pool

Jul 7, 2014, 2:27:42 PM
to guli...@gmail.com, libr...@googlegroups.com
You can compute a delta and new signature at the same time.

The signature does include a whole file hash, which is checked when the delta is applied. If rdiff info doesn't show it, it would be good to add.
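
As a minimal sketch of that kind of check (an illustration only, not librsync's actual code or delta format; the op tuples, the `apply_delta` name and the fixed block size are assumptions), applying a delta while hashing the output could look like this in Python:

```python
import hashlib
import io


def apply_delta(old_stream, delta, out_stream, expected_sha256, block_size=4):
    """Rebuild the new file from old_stream plus delta and verify its hash.

    Hypothetical delta ops: ("copy", old_block_index) or ("literal", bytes).
    Raises ValueError if the rebuilt data does not match expected_sha256.
    """
    whole = hashlib.sha256()
    for op, arg in delta:
        if op == "copy":
            # Reuse a block that already exists in the old file.
            old_stream.seek(arg * block_size)
            chunk = old_stream.read(block_size)
        elif op == "literal":
            # New data carried inside the delta itself.
            chunk = arg
        else:
            raise ValueError(f"unknown delta op: {op!r}")
        whole.update(chunk)
        out_stream.write(chunk)
    if whole.hexdigest() != expected_sha256:
        raise ValueError("whole-file hash mismatch after applying delta")


old = io.BytesIO(b"aaaabbbbcccc")
out = io.BytesIO()
delta = [("copy", 0), ("literal", b"XXXX"), ("copy", 2)]
expected = hashlib.sha256(b"aaaaXXXXcccc").hexdigest()
apply_delta(old, delta, out, expected)  # raises if the result is corrupt
print(out.getvalue())
```

Hashing while writing means the check costs no extra read of the output file.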

I'm not sure what has changed in rsync, but yes, it would be great to port any improvements.

-m