Compression of backups possible?

David Hicks

Nov 8, 2012, 4:57:10 PM
to xtrabac...@googlegroups.com
This looks like a very cool manager for XtraBackup.  I'd love to put it into use.  What I don't see in the documentation is any reference to the possibility of compressing the backup files.  I currently do that with XtraBackup because our databases are so large that storing uncompressed backup sets would take far too much space.  Is there such an option that I'm just not seeing?  If not, would it be terribly difficult to add?

Thanks,
David

Lachlan Mulcahy

Nov 9, 2012, 12:34:14 PM
to xtrabac...@googlegroups.com
Hi David,

On Nov 8, 2012, at 1:57 PM, David Hicks <dh...@i-hicks.org> wrote:

> This looks like a very cool manager for XtraBackup. I'd love to put it into use. What I don't see in the documentation is any reference to the possibility of compressing the backup files. I currently do that with XtraBackup because our databases are so large that storing uncompressed backup sets would take far too much space. Is there such an option that I'm just not seeing? If not, would it be terribly difficult to add?

Currently XBM does not have support for compressed backups on disk. This was done as a somewhat lazy choice on my part during the initial build, because the company I was working for was using ZFS (filesystem) with compression enabled for storage. This made additional compression redundant, so I didn't spend my time on it.

Compression is somewhat difficult to enable with many of the features that XBM provides.

For example, if you are using a backup strategy where you collapse your oldest delta onto the seed as you continue to roll forward your backups, you will be unable to apply the deltas if the seed is compressed.

This would require uncompressing the seed (full backup) for XtraBackup to be able to see the uncompressed files, and then merging the oldest deltas, then re-compressing. It is a lot of thrashing work to do :(
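The decompress, merge, recompress cycle described above can be illustrated with a toy model. This is only a sketch: it does not reflect how XtraBackup or XBM actually store data, the seed is assumed to be a single compressed blob of "pages", and the function names are hypothetical.

```python
import json
import zlib


def compress_seed(pages: dict) -> bytes:
    """Serialize and compress the whole seed as a single blob."""
    return zlib.compress(json.dumps(pages, sort_keys=True).encode("utf-8"))


def decompress_seed(blob: bytes) -> dict:
    """Inflate the entire seed so its pages can be read again."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


def collapse_delta(seed_blob: bytes, delta: dict) -> bytes:
    """Merge one delta into a compressed seed (hypothetical helper).

    Even a one-page delta forces a full decompress and a full recompress.
    """
    pages = decompress_seed(seed_blob)  # 1. uncompress the whole seed
    pages.update(delta)                 # 2. apply the changed pages
    return compress_seed(pages)         # 3. recompress the whole seed


# Toy seed of 1000 "pages"; the oldest delta changes only one of them.
seed = compress_seed({str(n): "original" for n in range(1000)})
seed = collapse_delta(seed, {"42": "changed"})
merged = decompress_seed(seed)
```

The cost of steps 1 and 3 scales with the size of the seed, not the size of the delta, which is why collapsing onto a compressed seed is so much work.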

If, however, you only make use of full backups each time, it would be fairly simple to implement compression.

So the answer is that it depends on the backup strategy that you plan to employ -- in some cases it would be easy (full backup only) and in others it would be a mix of infeasible and somewhat involved.

I have been kind of slack on any further development on XBM, but I wouldn't mind getting back to it to update it for some of the newer XtraBackup features and also consider improving it to be able to support compression better.

Any interest in helping develop? :)

Best,
Lachlan

David Hicks

Nov 9, 2012, 12:41:16 PM
to xtrabac...@googlegroups.com
Thanks for the feedback, Lachlan.

I can appreciate that adding compression for incremental backups that
are applied to the seed would certainly be involved - probably not even
worthwhile.

I'm going to investigate whether I could get away with doing just a
once-a-week full backup and then utilizing incrementals. Perhaps I can
accomplish this using no more space than my current full backup every day.
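A rough way to check the space question is a back-of-the-envelope comparison. The sketch below assumes one week of retention, that each daily incremental is roughly the fraction of the data changed that day, and that only the daily fulls are compressed. The function name and all of the numbers are hypothetical, not measurements.

```python
def weekly_storage_gb(full_gb, daily_change, compression_ratio):
    """Return (daily compressed fulls, weekly full + 6 incrementals) in GB.

    full_gb:           size of one uncompressed full backup (assumed)
    daily_change:      fraction of the data that changes per day (assumed)
    compression_ratio: compressed size / uncompressed size (assumed)
    """
    # Current approach: seven compressed full backups per week.
    daily_fulls = 7 * full_gb * compression_ratio
    # Alternative: one uncompressed full plus six uncompressed incrementals.
    weekly_plus_incrementals = full_gb + 6 * full_gb * daily_change
    return daily_fulls, weekly_plus_incrementals


# Hypothetical inputs: 500 GB full, 5% daily change, 4:1 compression.
daily_fulls, weekly_plus_incrementals = weekly_storage_gb(500, 0.05, 0.25)
```

With these made-up inputs the weekly scheme comes out smaller, but the result flips quickly if the data compresses very well or changes a lot each day, so it is worth plugging in real figures.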

What language are you using to implement the system? I used PHP for my
own automation, mainly because it's already installed on all of our
servers. I might be interested in contributing some effort if the need
arises.

Thanks again!
Dave

David Hicks

Nov 9, 2012, 12:53:19 PM
to xtrabac...@googlegroups.com
I should have looked at the source code in the download before I asked
about the language. :-)
I see that it is also PHP. Cool! Nice and easy.


Lachlan Mulcahy

Nov 9, 2012, 1:10:18 PM
to xtrabac...@googlegroups.com
Hey,

Yeah, it is all PHP.

It is probably not my finest work as I learnt a lot about coding while writing it, but it works fairly well.

Depending on how often your data changes, the incremental approach could be worthwhile. The problem with incrementals is that each one is based upon the previous one, so if for some reason one incremental has a problem applying, all subsequent incrementals are rendered useless.

I would like to implement differentials, which are always a set of deltas based on the seed/full backup rather than on the last set of deltas. This would mean that if one set of deltas had a problem applying, it would not affect the usefulness of any other set of deltas for the given seed/full backup.
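The difference in failure behaviour between the two schemes can be shown with a small model. This is only an illustration of the dependency structure, not XBM or XtraBackup code, and the function names are hypothetical.

```python
def usable_incrementals(applies_ok):
    """Incremental chain: each delta depends on every delta before it."""
    usable = []
    chain_intact = True
    for ok in applies_ok:
        # One failure breaks the chain for this delta and all later ones.
        chain_intact = chain_intact and ok
        usable.append(chain_intact)
    return usable


def usable_differentials(applies_ok):
    """Differentials: each delta depends only on the seed/full backup."""
    return list(applies_ok)


# Six delta sets; the third one fails to apply.
applies_ok = [True, True, False, True, True, True]
incr = usable_incrementals(applies_ok)
diff = usable_differentials(applies_ok)
```

In this model the incremental chain loses the failed delta and everything after it, while the differential scheme loses only the failed one. The trade-off is that each differential holds all changes since the seed, so they tend to grow until the next full backup.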

If you have interest in contributing and want to discuss possible changes, I'm open to patches and whatnot :)

cheers,
L