AWS Multi-Region Sharding Setup


Brian Carpio

Mar 27, 2013, 1:53:07 PM
to mongod...@googlegroups.com
So we want to set up a multi-region HA deployment using MongoDB, but we don't want to use region-aware sharding. Let me explain.

We have an app that can't do multi-region, and we don't need users in region_A to be local to region_A and users in region_B to be local to region_B.

We want to use the three US Amazon regions: us-west-1, us-west-2, and us-east-1. We plan to set up 5 MongoDB instances in us-east-1 and 5 in us-west-1 (the us-west-1 members will have a lower priority than the ones in us-east-1) and use us-west-2 for an arbiter instance ONLY. This is PER shard, so initially we need two shards: 10 instances in us-east-1, 10 in us-west-1, and two arbiters in us-west-2.

This way, if us-east-1 goes down, one of the members in us-west-1 will become primary; we will spin up app nodes in us-west-1 using our automation strategy and switch DNS to point to that region. Once us-east-1 comes back online, I assume that due to priority one of the instances in us-east-1 will become primary again, and we will switch DNS back to us-east-1.
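To make the priority idea concrete, here is roughly the replica set config I have in mind for one shard, trimmed to five members for brevity (hostnames are made up):

    // run in the mongo shell against one of the us-east-1 members
    rs.initiate({
      _id: "shard0",
      members: [
        // us-east-1 members: preferred primaries
        { _id: 0, host: "use1-node1:27017", priority: 2 },
        { _id: 1, host: "use1-node2:27017", priority: 2 },
        // us-west-1 members: lower priority, eligible only if us-east-1 is down
        { _id: 2, host: "usw1-node1:27017", priority: 1 },
        { _id: 3, host: "usw1-node2:27017", priority: 1 },
        // us-west-2: arbiter only, votes in elections but holds no data
        { _id: 4, host: "usw2-arbiter:27017", arbiterOnly: true }
      ]
    })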

We would also deploy one config server in each region. That way, if we have to live in us-west-1 for a while, we could spin up a replacement config server there using the documentation here: http://docs.mongodb.org/manual/tutorial/manage-sharded-cluster-config-server/
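In case it helps to be concrete, this is roughly how I'd expect each config server and mongos to be started (hostnames and paths are made up; three mirrored config servers, one per region):

    # on each config server host (one per region)
    mongod --configsvr --dbpath /data/configdb --port 27019 \
        --fork --logpath /var/log/mongodb/configsvr.log

    # every mongos lists all three config servers, in the same order
    mongos --configdb cfg-use1:27019,cfg-usw1:27019,cfg-usw2:27019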

Is this a reasonable architecture? We don't want or need region-aware sharding, but we still want to be able to fail over to a second region.

How are others dealing with this situation?

Brian Carpio

Mar 28, 2013, 7:49:59 PM
to mongod...@googlegroups.com
Any ideas here?

Kelly Stirman

Mar 29, 2013, 8:28:30 AM
to mongod...@googlegroups.com
Have a look at http://docs.mongodb.org/manual/administration/replica-set-architectures/#geographically-distributed-sets and http://docs.mongodb.org/manual/tutorial/deploy-geographically-distributed-replica-set/

I can't tell what you mean when you say 5 MongoDB instances in each of the regions. Do you mean a replica set with 5 members?

It sounds like your application will not fail over automatically to another data center. Perhaps your design is for a standby data center? It might be good to state your goals and then work from there.

Kelly

Brian Carpio

Mar 29, 2013, 10:40:42 AM
to mongod...@googlegroups.com
Yes, sorry if I wasn't clear. I am looking for active/standby datacenters, and although my app isn't able to fail over to datacenter B automatically yet, it will only be a matter of time until it does, which is why it's important to have MongoDB fail over.

When I say 5 MongoDB instances in each region PER replica set, I mean a 10-node replica set (5 nodes per region) plus an arbiter in the third region. I think the limit is 7 voting members; is that correct? The only reason I say 5 per region is that in a single region we always deploy a 5-node replica set: 2 nodes in each of our app AZs (a & b) and a delayed secondary in the third AZ (c). I'd like to duplicate that layout in the other region simply for the sake of failover; if we have to fail over to the other region, I want to cut down on the manual steps involved.
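Assuming the 7-vote limit is right, I picture one shard's config looking something like this, with votes: 0 on four of the ten data nodes so that exactly 7 members vote (hostnames are made up):

    rs.initiate({
      _id: "shard0",
      members: [
        // us-east-1 -- preferred primaries; b2 and the delayed node don't vote
        { _id: 0,  host: "use1-a1:27017", priority: 2 },
        { _id: 1,  host: "use1-a2:27017", priority: 2 },
        { _id: 2,  host: "use1-b1:27017", priority: 2 },
        { _id: 3,  host: "use1-b2:27017", priority: 2, votes: 0 },
        // delayed secondary in AZ c (delayed members must have priority 0)
        { _id: 4,  host: "use1-c1:27017", priority: 0, hidden: true, slaveDelay: 3600, votes: 0 },
        // us-west-1 -- standby region, lower priority, same layout
        { _id: 5,  host: "usw1-a1:27017", priority: 1 },
        { _id: 6,  host: "usw1-a2:27017", priority: 1 },
        { _id: 7,  host: "usw1-b1:27017", priority: 1 },
        { _id: 8,  host: "usw1-b2:27017", priority: 1, votes: 0 },
        { _id: 9,  host: "usw1-c1:27017", priority: 0, hidden: true, slaveDelay: 3600, votes: 0 },
        // us-west-2 -- tie-breaking arbiter
        { _id: 10, host: "usw2-arb:27017", arbiterOnly: true }
      ]
    })

If us-east-1 disappears, the remaining voters (3 in us-west-1 plus the arbiter) are 4 of 7, a majority, so us-west-1 can still elect a primary.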

Kelly Stirman

Apr 1, 2013, 6:08:25 PM
to mongod...@googlegroups.com
To me active/standby implies that the failover process is *not* automatic.

You could easily have replicas in a second datacenter that are configured as secondary only: http://docs.mongodb.org/manual/administration/replica-sets/#secondary-only-members

This would allow you to maintain a copy of the data in a secondary datacenter (B). If the primary datacenter (A) were to fail, you could script the creation of a new cluster from the remaining replicas (B). If the replicas in the original datacenter (A) came back online, the new system in datacenter (B) would be unaware of them, and you would effectively have an active system (B) and an idle system (A). You could then script the system to take a variety of next steps.
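As a rough, untested sketch (member indexes and hostnames invented), the two pieces might look like:

    // make the datacenter-B members secondary-only (priority 0)
    cfg = rs.conf()
    cfg.members[3].priority = 0   // suppose members 3 and 4 live in datacenter B
    cfg.members[4].priority = 0
    rs.reconfig(cfg)

    // if datacenter A is lost, run this on a surviving B member:
    // keep only the reachable members and force the reconfig
    cfg = rs.conf()
    cfg.members = [cfg.members[3], cfg.members[4]]
    rs.reconfig(cfg, { force: true })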

Another option is to use an active/active datacenter configuration, which requires three datacenters. If there is a network partition between datacenters (A) and (B) then a third datacenter (C) with arbiters would need to exist to determine which of the two datacenters was actually available. It might be helpful to read up on election internals: http://docs.mongodb.org/manual/core/replication-internals/#replica-set-election-internals

In the active/active setup you would maintain multiple replicas in datacenter (B) and you would configure the priorities such that replicas in datacenter (B) would only become primary if the replicas in datacenter (A) were unavailable.

Does that help?

Kelly 