Size of xtrabackup incremental backups

288 views
Skip to first unread message

simon...@googlemail.com

unread,
Sep 8, 2016, 11:49:57 AM9/8/16
to Percona Discussion
Using xtrabackup to backup a MySQL database with both MyISAM and InnoDB tables, I did a full backup which ended up being 2,3 GB in size. Afterwards I only dropped a single table and did an incremental backup, which was 971 MB, then changed nothing on the DB and did another incremental backup (against the previous one) which then again was 971 MB.

Why are the incremental backups so large when (almost) changed and just a single table got removed? Is there a way to avoid this behaviour?

Jervin R

unread,
Sep 8, 2016, 6:11:40 PM9/8/16
to Percona Discussion, simon...@googlemail.com
I suspect it may be the MyISAM tables  - remember they are not transactional and cannot be backed up incrementally. Check the total size of your MyISAM tables, use the last query from this blog post - https://www.percona.com/blog/2008/03/17/researching-your-mysql-table-sizes/

Roel Van de Paar

unread,
Sep 9, 2016, 5:30:30 AM9/9/16
to Percona Discussion, simon...@googlemail.com
So Simon, MyISAM tables are always backed up fully, irrespective of whether they changed or not.

simon...@googlemail.com

unread,
Sep 9, 2016, 7:28:23 AM9/9/16
to Percona Discussion, simon...@googlemail.com
Thank you! I indeed have 0,88 GB of MyISAM tables according to that query.

Since I am not too familiar with both database internals and percona backup strategies: What would be the preferred way of getting a concise incremental backup of a MySQL database containing both MyISAM and InnoDB tables? Each backup gets passed over a network so I am interested in keeping things small in size.

Jervin R

unread,
Sep 9, 2016, 10:26:02 PM9/9/16
to Percona Discussion, simon...@googlemail.com
Simon,
 
Thank you! I indeed have 0,88 GB of MyISAM tables according to that query.

Since I am not too familiar with both database internals and percona backup strategies: What would be the preferred way of getting a concise incremental backup of a MySQL database containing both MyISAM and InnoDB tables? Each backup gets passed over a network so I am interested in keeping things small in size.

MyISAM is not a transactional storage engine so there is no interface to get a consistent incremental snapshot. You should consider converting your remaining MyISAM tables to InnoDB. Copying the MyISAM tables during backup takes a global read lock on the server which blocks any traffic while that happens. So aside from bigger incremental backup size, you also incur longer backup times.

If you must keep the MyISAM tables - compressing on stream of storing locally first, compressing then copying over the network should help reduce overall size to some degree.

Jaime Crespo

unread,
Sep 11, 2016, 9:43:35 AM9/11/16
to percona-d...@googlegroups.com
simonfromme, it is difficult to give a good answer without knowing what are your main constraints of your project. 1GB of tables for me would be so small that I would not use try to optimize and use other thing that mysqldump in a single thread because it would take seconds to backup/restore in full.

However, that doesn't mean you could have large issues because: a) you are trying to use internet and/or those networks could be unreliable (e.g. mobile network). b) you are trying to minimize storage used because you have very little or it is very expensive for your budget c) you intend to do very frequent backups.

Also, it is difficult to provide a strategy without knowing what things you could or couldn't do; or have the budget to do/do fast/cheaply/have the human resources.

Things that I can suggest in a very general way:

== Migrate all(*) your tables to InnoDB ==

(*)Not all mysql system tables can be converted yet.
InnoDB is the present and future of MySQL, and if you use a recent MySQL/Percona version there is almost nothing you can do with MyISAM that you cannot do with InnoDB, but a huge number of extra features and scalability. Other engines have a lot of nice features, but you are sometimes losing flexibility when you mix them (specially when MySQL/InnoDB has a the best 1st party and 3rd support). Have that in mind when chosing to use other engines.

== Use the binary logs as "incremental backups" ==

This can be very controversial, but remember I do not know much about your environment. Binary logs (a.k.a replication) is the best way to have data availability in case of hardware failure; and can be used for point in time recovery if the size of transactions to apply doesn't get too large (e.g. if your level of writes is not too high or you are not too much time behind). In most cases, binary logs will not substitute incremental backups, but they can indeed complement them (e.g. weekly full backups + daily incrementals + binary logs for PITR)

== Use "unix tools" ==

rsync has been used for a long time as the standard to transmit only changed bits on the filesystem. Obviouly, you cannot just use rsync directly on a running mysql instance(*), but you can either FLUSH TABLES WITH READ LOCK; and rsync the MyISAM tables + .frm or do a similar process with locally exported .csv files. What or if this will work will depend on the number of tables, its size, the rsync algorithm used, and how much they change each time you do a backup.

== Use compression ==

Between potential row fragmentation and storage inefficiencies plus text compressibility, it is not unusual to be able to compress 5-10 times the on-disk size of some tables <http://dbahire.com/which-compression-tool-should-i-use-for-my-database-backups/>. I have a lot of work pending to do on optimizing my own database, but right now I am able to reduce my 1TB MySQL databases down to ~200GB while doing cold clones, while maximizing network bandwidth thanks to "pigz".

What *you* need depends on your particular setup and how much engineering time vs money for hw you have.

--
Jaime Crespo
Reply all
Reply to author
Forward
0 new messages