recovery strategy if you lose an entire shard to hardware or software failures/accidents?

128 views
Skip to first unread message

Erik

unread,
Mar 2, 2011, 2:45:04 PM3/2/11
to mongodb-user
In case you lose an entire shard somehow, do you just recover that
shard, or do you roll back all of the shards to the last full backup
of all the shards?

I'm sure it's very much application specific (e.g., do your shards
have interdependencies that will be violated if you reset just 1 shard
back a day), just wondering what people's thoughts are on this
potential problem.

Bernie Hackett

unread,
Mar 2, 2011, 3:26:11 PM3/2/11
to mongodb-user
You can just recover the one shard but you will have to do manual
reconciliation of chunks that have moved since the last backup. Each
shard keeps a copy of the chunks that have been moved to another shard
in a moveChunk directory. Assuming you haven't deleted these copies
you can use them to reconcile the changes.

From your last post it sounds like you are planning a large and
complicated setup. You might want to get in touch with 10gen about a
support contract.

http://www.10gen.com/support

Andrew Armstrong

unread,
Mar 2, 2011, 8:21:09 PM3/2/11
to mongodb-user
Manual work sounds terrible :(

It sounds like a backup from each shard is not very useful; as the
data you backed up may have either moved to another shard when you go
to restore it (what happens then? Do you accidently insert
duplicates?) or you missed data that was previously available on a
different shard, so your backup does not hold it.

Is there a plan to make this process painless?

- Andrew

Bernie Hackett

unread,
Mar 2, 2011, 8:34:23 PM3/2/11
to mongodb-user
There is a jira ticket about this issue (although it looks like you
have already seen it) here:

http://jira.mongodb.org/browse/SERVER-1998

Bernie Hackett

unread,
Mar 2, 2011, 9:07:47 PM3/2/11
to mongodb-user
Here is our current documentation on backing up and restoring a
sharded cluster:

http://www.mongodb.org/display/DOCS/Backing+Up+Sharded+Cluster

On Mar 2, 7:45 pm, Erik <eri...@gmail.com> wrote:

Andrew Armstrong

unread,
Mar 2, 2011, 9:16:01 PM3/2/11
to mongodb-user
Thanks your right.

I've added a comment at http://jira.mongodb.org/browse/SERVER-1998
outlining a possible solution to cluster backups. Thoughts?

---
The problem as I understand it now is that taking a backup of Shard #1
and restoring its backup an hour later (if it were to crash) would not
be good enough:
a) Chunks originally on this shard (and in the backup) may not exist
on the shard anymore (balancer moved them to another shard)
b) Chunks allocated to this shard were never in the previous backup,
but were since moved to this shard. Their data is however available in
a different shards backup (but that shard no longer owns these
chunks)

Thinking out loud about this; what about a backup system where:
a) You ask the mongo cluster to perform a backup system wide. Its an
operation that happens for the entire cluster.
b) Each shard writes its local backup to a specific directory (or
uploads to S3, whatever) independently in parallel

Assuming then after this backup is taken; Shard #1 catches on fire and
you need to restore Shard #1 after buying new servers:
a) You ask the mongo cluster to restore Shard #1
b) Mongod Shard #1 primary (for example) asks the config servers where
Shard #1's backup data should be, based on the latest cluster backup
available
c) Config servers tell Shard #1 primary:
* You need to download /backup/shard001/ data files (majority of your
shard's data is here)
* You need to also then grab a few chunks from /backup/shard002/
because between the time the last backup was taken, and when Shard #1
caught fire, you had some more chunks allocated to you that aren't in
your original Shard #1 backup (but are available in shard002's backup
files - the original chunk owner)
* You need to ignore restoring chunks XXX from your backup files
because they were since given a new owner on a different shard, and so
you don't need to try restoring them as you don't own them anymore

Basically; let the mongo cluster be backup-aware and know how to
restore data even if chunks have since moved around.

You just need to make sure enough backup space is available.
---

Regards
- Andrew

Erik

unread,
Mar 2, 2011, 10:19:34 PM3/2/11
to mongodb-user
Thanks for your reply Bernie.

On Mar 2, 12:26 pm, Bernie Hackett <ber...@10gen.com> wrote:
> You can just recover the one shard

The reason I'm thinking we wouldn't be able to do this in general is
that there could be newer data in the other shards that is dependent
(at the application level) on data in this shard that is being reset
back to a previous state (when the backup was taken).

> but you will have to do manual
> reconciliation of chunks that have moved since the last backup. Each
> shard keeps a copy of the chunks that have been moved to another shard
> in a moveChunk directory. Assuming you haven't deleted these copies
> you can use them to reconcile the changes.

So this need to manually reconcile chunk movements (sounds very hard
to me) arise because the other shards can continue having chunks moved
between them by the balancer whilst this one shard is down. Is there
any way to configure the balancer to not do balancing if any shards
are dead? That would prevent the need for any manual reconciliation.

> From your last post it sounds like you are planning a large and
> complicated setup. You might want to get in touch with 10gen about a
> support contract.
>
> http://www.10gen.com/support

Honestly, the per-server support contract is way too expensive for
us. If there was some sort of flat rate we would consider it. We may
take advantage of the 1-off consulting services offered at some point.

- Erik

Erik

unread,
Mar 9, 2011, 9:39:40 PM3/9/11
to mongodb-user
Asking the previous question again in case it got lost in the noise or
just missed:

So this need to manually reconcile chunk movements (sounds very hard
to me) arises because the other shards can continue having chunks
moved
between them by the balancer whilst this one shard is down. Is there
any way to configure the balancer to not do balancing if any shards
are dead? That would prevent the need for any manual reconciliation.

- Erik

Gaetan Voyer-Perrault

unread,
Mar 10, 2011, 2:59:51 PM3/10/11
to mongod...@googlegroups.com
@Erik:

Let's revisit this from the top to clarify.

> In case you lose an entire shard somehow, do you just recover that
shard, or do you roll back all of the shards to the last full backup
of all the shards?

First off, you really, really want to avoid an entire shard going down. Really. That's why the standard advice is to use replica sets. You want to avoid the single point of failure.

The answer to your question really dependent on how you feel about your data. Let's break them out:

Roll back all of the shards
---------
Yes, you can always go back to the "last known good state". Please note that you *must* include backups of the config DBs or this doesn't work. So if you have to restore from back-ups, you're talking about restoring each shard + all of the config DBs.

Assuming that all of your backups were done correctly. This *will* work, but will also result in losing all data since the last backup. For most people, this represents several hours of data?

Roll back the one shard
---------
Here's the problem with rolling back _just_ one shard: data has probably moved since you made your backup. Given this, your "restored" data could contain old information currently on other shards, it could also be missing data that was migrated.

This is the manual reconciliation part.

There is a "movechunks" folder that contains copies of chunk data that was migrated. It is possible (but time-consuming) to use this data to re-create the appropriate moves and recover data.

Down-Time
---------
The problem with both of these is that they *will* incur lots of down-time. If you have a sharded, write-heavy configuration, you probably have lots of data. If you have lots of data, you're going to have to wait for gigabytes of data to be written to each of the shards.

Again, the best way to avoid this all together is replicate to prevent the complete loss of a shard.

- Gates

- Erik

--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To post to this group, send email to mongod...@googlegroups.com.
To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.


Reply all
Reply to author
Forward
0 new messages