incremental backup question (plus possible incremental backup bug)

Frank O'Dwyer

Aug 9, 2009, 6:33:35 AM
to ec2-on-rails-discuss
How is the incremental daily backup restored in practice? I've done a
trial run of using the archive and restore tasks, which works fine -
however I'm not sure how to restore the incremental daily backup. Will
the 'restore' cap task take account of the incremental binary logs?

By the way, one thing I noticed is that when deploying a new instance,
and then restoring the DB from an archive of a previous instance, the
incremental backup sql logs don't seem to take any account of the
archive restore. This is just based on looking at the size of the
files in S3 - it still has the original 'empty' sql dump, and the logs
are all too small to contain the restored data - even though the live
database does indeed have that data. This was when moving to a medium
instance from a small one.

So it looks to me like the incremental backup will not provide
coverage until the next daily backup rolls around and another full DB
dump is done - i.e. after a restore there is a worst case data loss
window up to 24 hrs rather than 5 mins?

Cheers,
Frank

Paul Dowman

Aug 11, 2009, 1:15:25 PM
to ec2-on-rai...@googlegroups.com
The restore should also restore the binary logs (i.e. the incremental backup). One caveat is the database needs to have the same name when you're restoring it. 

It does work, let me know if you're having a problem and please give more detail. I didn't quite understand if you've noticed that the restored data actually isn't there or if you're just confused about how the binary logs work.

Frank O'Dwyer

Aug 16, 2009, 5:33:11 PM
to ec2-on-rails-discuss

On Aug 11, 6:15 pm, Paul Dowman <li...@pauldowman.com> wrote:
> The restore should also restore the binary logs (i.e. the incremental
> backup). One caveat is
> the database needs to have the same name when you're restoring it.
> It does work, let me know if you're having a problem and please give more
> detail.

What I was confused about is mainly the steps to restore the daily
backup, versus restoring a backup I'd done myself using the 'archive'
task.

I have been able to do a manual database archive using the cap
'archive' task on one box, and then the cap 'restore' task on another
box. This worked fine for me. Before running the restore, I just
edited the deploy.rb and pointed it to the archive bucket I had
previously archived to on the old box.

So, for restoring the daily backup, what cap task do I use? Still the
'restore' task? I just point it at the daily backup bucket, and it
will process the binary logs too?

> I didn't quite understand if you've noticed that the restored data
> actually isn't there or if you're just confused about how the binary logs
> work.

Possibly a bit of both :-) The problem I'm referring to is the status
of the daily backup on the *target* box, after a restore is done - in
other words the capability of the target, newly restored box to
respond to a failure on that box. These are the steps I did and the
problem I saw:

1) cap archive task on old box
2) bring up new box and run cap setup etc (now the app is running, but
with an empty database - no user activity so nothing being added,
modified or deleted) - all fine so far
3) repopulate the DB using cap restore, from the archive I'd manually
created (*not* the daily backup) on the old box - this works too
4) now, looking at the new box the DB is restored OK, the app works
etc, *but* the current daily backup data on the new box looked like it
couldn't possibly include the data I'd just restored. This is because
adding up the sizes of the daily dump (done prior to the archive
restore), plus the sizes of the binary logs copied to S3 since the
archive restore, they were way too small to represent a full data set.
It looked to me like the daily backup would only catch up on the next
full daily dump.

Another backup-related question: what happens if the incremental
backup fails due to a transient error on S3, which happens sometimes?
Does it catch up the next time the incremental runs successfully, or
is there a possibility of missing a binary log?

Cheers,
Frank

Paul Dowman

Sep 25, 2009, 3:59:02 PM
to ec2-on-rai...@googlegroups.com
Since this is a long message with several points in it I'll include my comments inline. See below...

On Sun, Aug 16, 2009 at 5:33 PM, Frank O'Dwyer <batsi...@gmail.com> wrote:


> On Aug 11, 6:15 pm, Paul Dowman <li...@pauldowman.com> wrote:
> > The restore should also restore the binary logs (i.e. the incremental
> > backup). One caveat is
> > the database needs to have the same name when you're restoring it.
> > It does work, let me know if you're having a problem and please give more
> > detail.
>
> What I was confused about is mainly the steps to restore the daily
> backup, versus restoring a backup I'd done myself using the 'archive'
> task.
>
> I have been able to do a manual database archive using the cap
> 'archive' task on one box, and then the cap 'restore' task on another
> box. This worked fine for me. Before running the restore, I just
> edited the deploy.rb and pointed it to the archive bucket I had
> previously archived to on the old box.
>
> So, for restoring the daily backup, what cap task do I use? Still the
> 'restore' task? I just point it at the daily backup bucket, and it
> will process the binary logs too?

Yes, in your deploy.rb see :ec2onrails_config[:restore_from_bucket] and :ec2onrails_config[:restore_from_bucket_subdir]. I assume you got that working based on what you wrote later in the message.
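
For anyone following along, here is a minimal sketch of what those two settings might look like. The bucket and subdirectory names are made up for illustration, and the hash is built as a plain Ruby value so the snippet runs on its own; in a real deploy.rb it would be handed to Capistrano with `set :ec2onrails_config, ...`.

```ruby
# Hypothetical values: substitute the bucket your daily backup writes to.
ec2onrails_config = {
  # Bucket the restore task should read from (assumed name).
  :restore_from_bucket => "myapp-daily-backup",
  # "Subdirectory" (key prefix) inside that bucket (assumed name).
  :restore_from_bucket_subdir => "database"
}

# In deploy.rb this would be: set :ec2onrails_config, ec2onrails_config
puts ec2onrails_config[:restore_from_bucket]
puts ec2onrails_config[:restore_from_bucket_subdir]
```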
 
> > I didn't quite understand if you've noticed that the restored data
> > actually isn't there or if you're just confused about how the binary logs
> > work.
>
> Possibly a bit of both :-) The problem I'm referring to is the status
> of the daily backup on the *target* box, after a restore is done - in
> other words the capability of the target, newly restored box to
> respond to a failure on that box. These are the steps I did and the
> problem I saw:
>
> 1) cap archive task on old box
> 2) bring up new box and run cap setup etc (now the app is running, but
> with an empty database - no user activity so nothing being added,
> modified or deleted) - all fine so far
> 3) repopulate the DB using cap restore, from the archive I'd manually
> created (*not* the daily backup) on the old box - this works too
> 4) now, looking at the new box the DB is restored OK, the app works
> etc, *but* the current daily backup data on the new box looked like it
> couldn't possibly include the data I'd just restored. This is because
> adding up the sizes of the daily dump (done prior to the archive
> restore), plus the sizes of the binary logs copied to S3 since the
> archive restore, they were way too small to represent a full data set.
> It looked to me like the daily backup would only catch up on the next
> full daily dump.

OK, I think I see what you mean now. I think the issue is that the restore executes the MySQL command "reset master". In fact, there's already a comment about this in the source:

I think you're right, this is a bug, and the solution is simply to not do the "reset master". I've done that in this commit:


Thanks.
 
> Another backup-related question: what happens if the incremental
> backup fails due to a transient error on S3, which happens sometimes?
> Does it catch up the next time the incremental runs successfully, or
> is there a possibility of missing a binary log?

Yes, it catches up. It uploads all binary log files that exist except the last (which is currently being written to), and deletes the files up to the last one. The source is here:
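
This is not the actual ec2onrails script, but the selection rule described above can be sketched in Ruby roughly as follows. The method name `completed_binlogs` and the `mysql-bin.NNNNNN` file names are assumptions for illustration only.

```ruby
# Sketch only, not the ec2onrails script: choose which binary logs to upload.
# MySQL is still writing to the newest log, so it is left for the next run.
def completed_binlogs(files)
  # Sort by the numeric suffix (mysql-bin.000001, mysql-bin.000002, ...).
  sorted = files.sort_by { |f| File.extname(f).delete(".").to_i }
  sorted[0...-1] # everything except the newest file
end

logs = ["mysql-bin.000003", "mysql-bin.000001", "mysql-bin.000002"]
puts completed_binlogs(logs).inspect
```

Assuming a log is only deleted after it has been uploaded successfully, a run that fails on a transient S3 error leaves the completed logs on disk, so they are still in the list on the next run and are only delayed, not lost.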
 
Paul