Announcing rzbackup


James Pharaoh

Sep 16, 2016, 7:42:14 AM
to zba...@googlegroups.com
Hi all,

I'd like to announce the release of my partial zbackup clone, rzbackup.
This is open source software freely available under the Apache 2.0 license.

https://github.com/wellbehavedsoftware/wbs-backup/tree/master/rzbackup

https://gitlab.wellbehavedsoftware.com/well-behaved-software/wbs-backup/tree/master/rzbackup

https://crates.io/crates/rzbackup

This is written in Rust and designed to address some of my specific use
cases where zbackup doesn't perform well.

For example, it can operate in a client/server mode allowing you to
restore multiple backups which share deduplicated content without
repeatedly loading the indexes and decompressing the chunks.

Its main features are:

* Rust library for access to ZBackup repositories
* Supports encrypted and non-encrypted formats
* RandomAccess implements Read and Seek to provide efficient random access
* Client/server utilities to efficiently restore multiple backups,
  sharing a chunk cache
* Command line decrypt utility, mostly useful for debugging
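
To illustrate why implementing Read and Seek matters for random access,
here is a minimal sketch. It does not use rzbackup's actual API: the
`ChunkedReader` type, its fields, and the in-memory chunk storage are all
hypothetical stand-ins for a reader that would fetch chunks from a
ZBackup repository. The point is only that a seek changes a position and
a read touches just the chunk containing that position.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical stand-in for a restored backup: fixed-size chunks in memory.
/// A real implementation would look chunks up in a repository instead.
pub struct ChunkedReader {
    chunks: Vec<Vec<u8>>,
    chunk_size: usize,
    len: u64,
    pos: u64,
}

impl ChunkedReader {
    pub fn new(data: &[u8], chunk_size: usize) -> Self {
        assert!(chunk_size > 0, "chunk_size must be non-zero");
        ChunkedReader {
            chunks: data.chunks(chunk_size).map(|c| c.to_vec()).collect(),
            chunk_size,
            len: data.len() as u64,
            pos: 0,
        }
    }
}

impl Read for ChunkedReader {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if self.pos >= self.len || buf.is_empty() {
            return Ok(0);
        }
        // Find the chunk holding the current position; no other chunk is touched.
        let index = (self.pos / self.chunk_size as u64) as usize;
        let offset = (self.pos % self.chunk_size as u64) as usize;
        let available = &self.chunks[index][offset..];
        let n = available.len().min(buf.len());
        buf[..n].copy_from_slice(&available[..n]);
        self.pos += n as u64;
        Ok(n)
    }
}

impl Seek for ChunkedReader {
    fn seek(&mut self, from: SeekFrom) -> io::Result<u64> {
        // Compute the target in a wider type so offsets cannot overflow.
        let target = match from {
            SeekFrom::Start(n) => n as i128,
            SeekFrom::End(d) => self.len as i128 + d as i128,
            SeekFrom::Current(d) => self.pos as i128 + d as i128,
        };
        if target < 0 || target > u64::MAX as i128 {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "seek out of range",
            ));
        }
        self.pos = target as u64;
        Ok(self.pos)
    }
}
```

Because the type implements the standard traits, a caller can hand it to
anything that accepts `Read + Seek`, for example seeking to an offset and
then calling `read_exact` to pull out a small range without restoring the
whole backup.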

Thanks to Konstantin Isakov for relicensing the protobuf definitions so
I could include them with the Apache licence, and of course to him and
all the other contributors for the prior work this project builds on.

James

Tracy Reed

Sep 16, 2016, 1:16:55 PM
to James Pharaoh, zba...@googlegroups.com
Very interesting. Does it parallelize both the compression and the dedupe? IIRC zbackup only parallelizes the compression but not the dedupe which leaves CPUs going to waste.




--
You received this message because you are subscribed to the Google Groups "ZBackup general discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to zbackup+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
--
Tracy Reed

James Pharaoh

Sep 16, 2016, 1:57:44 PM
to Tracy Reed, zba...@googlegroups.com
I am working on parallel decompression and have plans for plenty more
features soon. Just trying to get my head around Rust's multithreading
capabilities at the moment.

As for the dedupe, I currently only support restore operations, although
I would of course be interested in making it support the compression as
well.

In my use cases zbackup performs reasonably well in compression, and I
feel like it will be a bit trickier to do the dedupe/compression, so
this is a fairly low priority for me at the moment.

The main driver for fast decompression at the moment is that I am
using zbackup as storage for LXC images, which are churned out by a CI
system and contain a large amount of redundant data. Decompressing them
with zbackup itself takes a very long time, as you would imagine for
maybe five to ten images of roughly a gigabyte each uncompressed.

The version I've released today will definitely speed this up, due to
the large cache. I have plenty of memory and can add swap space if
necessary, so I should never have to decompress a bundle twice in a
single deployment of my containers.
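
The idea of never decompressing a bundle twice can be sketched as a
simple memoizing cache. This is only an illustration under stated
assumptions, not rzbackup's implementation: `BundleCache`, `BundleId`
and the `load` callback (standing in for "read and decompress one
bundle") are hypothetical names, and the cache is unbounded, relying on
plenty of memory being available as described above.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical bundle identifier; real bundles are named differently.
pub type BundleId = u64;

/// Unbounded cache that runs `load` at most once per bundle id.
pub struct BundleCache<F> {
    load: F,
    entries: Mutex<HashMap<BundleId, Arc<Vec<u8>>>>,
}

impl<F: Fn(BundleId) -> Vec<u8>> BundleCache<F> {
    /// `load` stands in for the expensive read-and-decompress step.
    pub fn new(load: F) -> Self {
        BundleCache {
            load,
            entries: Mutex::new(HashMap::new()),
        }
    }

    /// Return the decompressed bundle, loading it only on the first request.
    pub fn get(&self, id: BundleId) -> Arc<Vec<u8>> {
        let mut entries = self.entries.lock().unwrap();
        let entry = entries
            .entry(id)
            .or_insert_with(|| Arc::new((self.load)(id)));
        // Cloning the Arc shares the data instead of copying it.
        Arc::clone(entry)
    }
}
```

Several restores that reference the same bundle would then share one
decompressed copy. Note that this sketch holds the lock while loading,
so concurrent requests are serialized; a server handling many clients
would probably want something finer grained.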

I feel like zbackup will be a very powerful tool for this kind of use
case, and if I can make it work it will be a big improvement on the
system docker uses, which I think is a flawed design due to the linear
history. I'm hoping to come up with a solution which works with docker
as well at some point.

James


James Pharaoh

Sep 18, 2016, 11:25:28 AM
to Tracy Reed, zba...@googlegroups.com
Ok, I have got a parallel restore working. This is committed to the
master branch of my repo here:

https://github.com/wellbehavedsoftware/wbs-backup/tree/master/rzbackup

Here's a comparison of the timings between zbackup and rzbackup:

https://gist.github.com/jamespharaoh/572c92716866a6a0727ceb9cb8fc8ee9

This is on a dual core Intel i7 with two threads per core (i7-6500U CPU
@ 2.50GHz). It's significantly faster, as you would expect of course.
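
For readers wondering what a parallel restore might look like in Rust,
here is a rough sketch. It is not the code in the repository:
`restore_parallel` and its `decode` parameter (standing in for
decompressing one bundle) are hypothetical, and it uses
`std::thread::scope`, which only became stable in Rust 1.63, well after
this thread was written. Bundles are split into batches, each batch is
decoded on its own thread, and the results are joined in the original
order.

```rust
use std::thread;

/// Decode `bundles` on up to `workers` threads and concatenate the
/// results in their original order. `decode` is a placeholder for the
/// real per-bundle decompression step.
pub fn restore_parallel<F>(bundles: &[Vec<u8>], workers: usize, decode: F) -> Vec<u8>
where
    F: Fn(&[u8]) -> Vec<u8> + Sync,
{
    let workers = workers.max(1);
    // Ceiling division so every bundle lands in exactly one batch.
    let batch = ((bundles.len() + workers - 1) / workers).max(1);
    let decode = &decode;

    let decoded: Vec<Vec<u8>> = thread::scope(|s| {
        // Spawn every worker before joining any, so they run concurrently.
        let handles: Vec<_> = bundles
            .chunks(batch)
            .map(|group| {
                s.spawn(move || {
                    group
                        .iter()
                        .map(|b| decode(b.as_slice()))
                        .collect::<Vec<_>>()
                })
            })
            .collect();

        // Joining in spawn order keeps the output in bundle order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker panicked"))
            .collect()
    });

    decoded.concat()
}
```

On a machine like the one described above, passing `workers = 4` would
give one batch per hardware thread. This version holds all decoded data
in memory before concatenating, which a real restore of large images
would likely want to avoid by streaming output as batches complete.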

This is not quite ready for release yet and I have some other
commitments I need to address now, but I'll do some more cleanup, add a
nicer interface, and get something released soon.

This should also work with the client/server version, although I've not
tested that. This will provide a further speed boost when doing multiple
restores because of the data held in cache, assuming there's redundant
data, of course.

James