Fault tolerance?
From: Korkman <goo...@pierre-beck.de>
Date: Mon, 4 Apr 2011 11:39:54 -0700 (PDT)
Subject: Fault tolerance?
Hi,

it might be a bit early to ask this, but how fault tolerant is bup? We
had a power outage yesterday during backups (one that lasted much too
long, battery wouldn't last) and for a thrill I tried verifying one of
the affected backups. Instantly errors were reported about a zero-
sized .idx file. Removing it stopped the errors, but I fear the whole
backup set is now destroyed. I'm running a restore right now and it
didn't abort yet, so I'll go on with the verification. Is it in vain?

It should be noted that unexpected power outages bring up the worst in
filesystems and databases, thus potentially leading to the need to
restore from backup in the first place. I will work around the problem
by rotating two sets of backups for now. Bup reduces backup volume so
dramatically (really, the rolling checksum thing pays off) that keeping
multiple versions is cheap enough.

But maybe you could consider atomic file operations or a write-ahead /
rollback journal in future development to make it rock-solid?

Greetings,

Pierre Beck


 
From: Brandon Smith <free...@reardencode.com>
Date: Mon, 4 Apr 2011 11:44:01 -0700
Subject: Re: Fault tolerance?
By bup's design, any new backup is additive only.  Therefore all of your
previous snapshots are completely valid even though the in-progress one
is not.  For that matter, if you're doing a large incremental backup
(more than one .pack worth), that backup's progress up to the end of the
last complete pack is completely valid even if there is a fault.

--Brandon



 
From: Avery Pennarun <apenw...@gmail.com>
Date: Mon, 4 Apr 2011 15:02:07 -0400
Subject: Re: Fault tolerance?

On Mon, Apr 4, 2011 at 2:44 PM, Brandon Smith <free...@reardencode.com> wrote:
> By bup's design, any new backup is additive only.  Therefore all of your
> previous snapshots are completely valid even though the in-progress one
> is not.  For that matter, if you're doing a large incremental backup
> (more than one .pack worth), that backup's progress up to the end of the
> last complete pack is completely valid even if there is a fault.

Exactly.

There are some situations I can think of that might cause minor
(recoverable) problems.  For example, if the system crashes before all
the files are synced, we might end up with pack #7 being slightly
corrupt but pack #8 being valid.  Then deleting pack #7 would result
in missing objects that you might not immediately discover.

Also, if a .pack file is invalid, bup might not notice it and might
keep using the .idx anyway, which would result in certain missing
objects not being backed up as they should be.  I haven't tested that
very carefully.

Someday, we should improve 'bup fsck' to be able to check stuff like
that and throw away bad or mismatched packs, just to be safe.
However, this situation is pretty contrived and should be very rare,
even in the case of a crash.

Maybe we should fsync() more frequently, like after writing a .pack or
.idx file.  I really hate fsync() though, since its performance is so
terrible on ext3.  (fsync() on any file ends up being basically the
same as a full sync() of the entire filesystem.  Barf.)
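
To make the fsync-after-writing idea concrete, here is a minimal Python sketch of the usual "write to a temporary name, fsync, then rename" pattern. It is not bup's actual code: the function name write_file_durably and the ".tmp" suffix are made up for illustration, and the directory fsync assumes a POSIX system.

```python
import os


def write_file_durably(path, data):
    """Write data to path via a temp file, fsync it, then rename into place."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = path + ".tmp"  # hypothetical temp-name convention
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        # Force the file contents to disk before the file gets its final name.
        os.fsync(f.fileno())
    # Atomic on POSIX: readers see either the old file or the complete new one.
    os.replace(tmp_path, path)
    # fsync the directory so the rename itself survives a crash (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The ordering is the point: if the rename can reach the disk before the data does, a crash can leave a correctly named but empty file. The cost is the fsync() latency complained about above.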

If you're worried about another form of corruption - i.e. silent loss
of data in .pack files due to hard drive sector errors - you should
consider using 'bup fsck -g' (it generates par2 recovery blocks).

Have fun,

Avery


 
From: Korkman <goo...@pierre-beck.de>
Date: Tue, 5 Apr 2011 04:37:05 -0700 (PDT)
Subject: Re: Fault tolerance?
So I can simply delete the last files created during a crashed backup
(.pack, .idx and .midx) and be good?

 
From: Brandon Smith <free...@reardencode.com>
Date: Tue, 5 Apr 2011 07:51:34 -0700
Subject: Re: Fault tolerance?
No need -- during writing, bup writes to temporary filenames.  You can
remove any hidden '.' files that were left behind, though.



 
From: Korkman <goo...@pierre-beck.de>
Date: Tue, 5 Apr 2011 08:51:45 -0700 (PDT)
Subject: Re: Fault tolerance?
From what I understand, a temporary .idx file was moved to objects/pack
but not written to disk yet, because there's no fsync in between.
Therefore, 0-byte size. That's an easy case, though. I'm really
getting paranoid about my .pack files not being written completely in
crash situations. bup fsck found an invalid pack file from a crash a
week earlier ("git verify failed"), so that backup set is most likely
invalid from that point on, isn't it?
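
For the easy zero-byte case described above, a small Python sketch like the following could list the suspicious files. The function name find_empty_pack_files is made up for illustration, and the check only catches completely empty files; a truncated but non-empty pack still needs 'bup fsck' to be detected.

```python
import os


def find_empty_pack_files(pack_dir):
    """Return sorted names of zero-byte .idx/.pack/.midx files in pack_dir."""
    suffixes = (".idx", ".pack", ".midx")
    empty = []
    for name in os.listdir(pack_dir):
        full = os.path.join(pack_dir, name)
        if not name.endswith(suffixes) or not os.path.isfile(full):
            continue
        # A zero-length file was most likely renamed into place before
        # its contents reached the disk.
        if os.path.getsize(full) == 0:
            empty.append(name)
    return sorted(empty)
```

Assuming the default repository location, it would be called as find_empty_pack_files(os.path.expanduser("~/.bup/objects/pack")). It only reports; deciding what to delete stays a manual step.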

I think a very simple and safe procedure is deleting all .idx, .pack
and .midx files created at and after that day?



 
From: Avery Pennarun <apenw...@gmail.com>
Date: Tue, 5 Apr 2011 12:03:44 -0400
Subject: Re: Fault tolerance?

On Tue, Apr 5, 2011 at 11:51 AM, Korkman <goo...@pierre-beck.de> wrote:
> From what I understand, a temporary .idx file was moved to objects/pack
> but not written to disk yet, because there's no fsync in between.
> Therefore, 0-byte size. That's an easy case, though. I'm really
> getting paranoid about my .pack files not being written completely in
> crash situations. bup fsck found an invalid pack file from a crash a
> week earlier ("git verify failed"), so that backup set is most likely
> invalid from that point on, isn't it?

> I think a very simple and safe procedure is deleting all .idx, .pack
> and .midx files created at and after that day?

Just watch out for one thing: the git commit that your branch points
at.  It may be that the final commit of a particular backup is valid
right now, but if you delete old (valid) packs, it might become
invalid and you might need to re-point the branch at a commit that
still exists.

So basically:

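cd ~/.bup
# save the branch's commit list (newest first) before touching any packs
git rev-list name-of-my-branch >my-commits
...delete some packs...
git log name-of-my-branch
# if it fails, try each of the commits in my-commits in turn.
# let's call the first one that works $TOPCOMMIT
# 'branch -f' re-points the branch in one step; a plain 'branch -d'
# can refuse to delete a branch it considers unmerged
git branch -f name-of-my-branch $TOPCOMMIT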

Then you should be okay.
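
The "try each commit in turn" step can be sketched in Python as below. This is an illustration under assumptions, not part of bup: the function names are made up, and the check uses 'git rev-list --objects' rather than the 'git log' shown above, because it walks trees and blobs as well as commits. It verifies that objects are present, not that their contents are uncorrupted.

```python
import subprocess


def commit_is_intact(commit, git_dir):
    """True if git can walk every object reachable from commit."""
    result = subprocess.run(
        ["git", f"--git-dir={git_dir}", "rev-list", "--objects", commit],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # git exits non-zero when it hits a missing object during the walk.
    return result.returncode == 0


def first_intact_commit(commits, is_intact):
    """Return the first commit (list is newest first) passing is_intact, else None."""
    for commit in commits:
        if is_intact(commit):
            return commit
    return None
```

With the my-commits file from above, usage would look roughly like first_intact_commit(open("my-commits").read().split(), lambda c: commit_is_intact(c, ".")), and the result plays the role of $TOPCOMMIT. On a large repository each check walks a lot of history, so this can be slow.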

So basically, deleting the packs is fine, but you want to be careful
not to lose track of the part of history you *didn't* delete.

We could probably put more work into bup fsck, autorecovery, and so
on.  But since it's generally *possible* (although a bit of work like
the above) to recover from problems, it hasn't been a big deal.

Have fun,

Avery


 