MongoDB load balancing across multiple AWS instances


Indranil Mondal

Jul 10, 2014, 2:25:45 AM
to mongod...@googlegroups.com
We're using Amazon Web Services for a business application that uses a Node.js server and MongoDB as the database. Currently the Node.js server is running on an EC2 medium instance, and we're keeping our MongoDB database in a separate micro instance. Now we want to deploy a replica set for our MongoDB database, so that if MongoDB gets locked or becomes unavailable, we can still run the database and get data from it.

So we're trying to keep each member of the replica set in a separate instance, so that we can get data from the database even if the instance hosting the primary member shuts down.

Now I want to add load balancing to the database, so that it keeps working even under a huge traffic load. I can balance reads across the replica set by enabling the slaveOk option, but that won't load-balance the database under a huge write load, since all writes go to the primary.
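For the read side, a minimal sketch of what this looks like in the 2.6-era mongo shell (the `orders` collection name is just a placeholder):

```javascript
// Minimal sketch: allow reads from a secondary in the mongo shell.
// Assumes the shell is connected to a secondary member of the replica set.
rs.slaveOk()                  // opt in to reads on this secondary
db.orders.find().limit(5)     // "orders" is a placeholder collection name

// slaveOk only affects reads: writes are always routed to the primary.
```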

To solve this problem I have come up with two options so far.

Option 1: I shard the database and keep each shard in a separate instance, with each shard's replica set in the same instance. But there is a problem: since sharding divides the database into multiple parts, each shard holds different data. So if one instance shuts down, we won't be able to access the data of the shard on that instance.

To solve this, I'm trying to divide the database into shards where each shard has a replica set spread across separate instances. Then even if one instance shuts down, we won't face any problem. But if we have 2 shards and each shard's replica set has 3 members, I need 6 AWS instances. So I don't think it's the optimal solution.


Option 2: We create a master-master configuration in MongoDB, with all the masters in separate instances; writes are distributed among the masters by sharding and synchronized between them. But I don't know whether MongoDB supports this structure or not.

So, please suggest what the best solution for this problem would be.

kaushick gope

Jul 10, 2014, 2:42:24 AM
to mongod...@googlegroups.com
I'm also facing the same problem. Thanks for the post.

Will Berkeley

Jul 11, 2014, 11:16:47 AM
to mongod...@googlegroups.com
Hi Indranil. If you are concerned about write scaling, the best option for you right now is probably to upgrade your EC2 instance rather than sharding. Normal failover in a replica set will ensure ongoing write operations.

Relevant to your idea about read balancing using secondaries, Asya has written a helpful blog post about whether replica sets can help with scaling.

As for option #2, as of MongoDB 2.6 only one primary is allowed per replica set. There is no master-master replication.

-Will

Asya Kamsky

Jul 14, 2014, 4:50:01 AM
to mongodb-user
Just to emphasize something in addition to what Will already said...

You wrote:
And we're keeping our mongodb database in a separate micro instance.

MICRO?  No real database should ever be run in a micro instance.  

You need to use real instances - and then see whether you need more than one shard to handle your proposed load.

Asya




s.molinari

Jul 14, 2014, 11:21:39 AM
to mongod...@googlegroups.com
Running a professional database for any application will never be cheap, no matter what you use. And you shouldn't spare expenses on it either, when possible.

If you can currently support your application's data needs with a micro instance (which I also find hard to believe), when do you think the "huge traffic load" will happen? I would suggest, as Asya did too, upgrading your database instance to something bigger and creating a replica set with that instance (minimum 2 more AWS EC2 instances). When your application's data needs start to grow (and well before your replica set is maxed out), then you should start to worry about sharding.

Scott

Indranil Mondal

Jul 15, 2014, 2:52:33 AM
to mongod...@googlegroups.com
OK, I have no issue with using a larger instance. But I was wondering whether there is any load-balancing service for MongoDB on Amazon that would automatically upgrade my instance as the database grows, because using a huge instance for a very small database is not a good option.

Anyway, are you suggesting I use option 1, i.e. dividing the database into shards where each shard has a replica set spread across separate instances?

Indranil Mondal

Jul 15, 2014, 3:02:33 AM
to mongod...@googlegroups.com
As you suggested, I tried to divide the database into shards where each shard has a replica set. I followed http://docs.mongodb.org/manual/tutorial/convert-replica-set-to-replicated-shard-cluster/ to deploy this structure, and it worked perfectly when all shards and replica set members were on the same machine.

Later I tried to keep each member of the replica set on a separate, locally connected machine before deploying to Amazon EC2 instances.

I have two replica sets, firstset and secondset.

The firstset has three members: 192.168.1.10:27018 (primary), 192.168.1.11:27018 (secondary) and 192.168.1.9:27018 (arbiter).

And the secondset has three members: 192.168.1.6:27019 (primary), 192.168.1.9:27019 (secondary) and 192.168.1.11:27019 (arbiter).

To create the firstset replica set I ran:

mongod --dbpath /replica-data --port 27018 --replSet firstset --oplogSize 10

on 192.168.1.10, 192.168.1.11 and 192.168.1.9. Then, to initiate the replica set, I ran:

mongo --port 27018

on 192.168.1.10, and then:

    use admin

    db.runCommand({
        "replSetInitiate": {
            "_id": "firstset",
            "members": [
                { "_id": 1, "host": "192.168.1.10:27018", "priority": 10 },
                { "_id": 2, "host": "192.168.1.11:27018", "priority": 2 },
                { "_id": 3, "host": "192.168.1.9:27018", "priority": 1, "arbiterOnly": true }
            ]
        }
    });
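After initiating, a quick sanity check looks roughly like this (a sketch, run from the same mongo shell session):

```javascript
// Sketch: verify the replica set came up as expected
rs.status()   // per-member state: PRIMARY / SECONDARY / ARBITER
rs.conf()     // the configuration document that replSetInitiate stored
```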

I created the secondset replica set in a similar way and it worked.

Then I tried to create three config servers on three separate machines, so that even if one machine shuts down the other config servers keep running. The config servers run on 192.168.1.10:27020, 192.168.1.11:27020 and 192.168.1.6:27020.

So I ran:

mongod --configsvr --dbpath /shard-data --port 27020

on 192.168.1.10, 192.168.1.11 and 192.168.1.6.

Then I tried to run mongos on 192.168.1.10 on port 27030 against the three running config servers, so I ran:

mongos --configdb 192.168.1.10:27020,192.168.1.11:27020,192.168.1.6:27020 --port 27030 --chunkSize 1

Then I got this in the mongos log:

    2014-07-11T13:52:34.502+0530 [mongosMain] MongoS version 2.6.3 starting:     pid=4195 port=27030 64-bit host=innofied-lappy (--help for usage)
    2014-07-11T13:52:34.502+0530 [mongosMain] db version v2.6.3
    2014-07-11T13:52:34.502+0530 [mongosMain] git version: 255f67a66f9603c59380b2a389e386910bbb52cb
    2014-07-11T13:52:34.502+0530 [mongosMain] build info: Linux build12.nj1.10gen.cc 2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Jan 3 21:39:27 UTC 2014 x86_64 BOOST_LIB_VERSION=1_49
    2014-07-11T13:52:34.502+0530 [mongosMain] allocator: tcmalloc
    2014-07-11T13:52:34.502+0530 [mongosMain] options: { net: { port: 27030 }, sharding: { chunkSize: 1, configDB: "192.168.1.10:27020,192.168.1.7:27020,192.168.1.8:27020" } }
    2014-07-11T13:52:34.518+0530 [mongosMain] SyncClusterConnection connecting to [192.168.1.10:27020]
    2014-07-11T13:52:34.524+0530 [mongosMain] SyncClusterConnection connecting to [192.168.1.7:27020]
    2014-07-11T13:52:34.527+0530 [mongosMain] SyncClusterConnection connecting to [192.168.1.8:27020]
    2014-07-11T13:52:34.618+0530 [mongosMain] scoped connection to 192.168.1.10:27020,192.168.1.7:27020,192.168.1.8:27020 not being returned to the pool
    2014-07-11T13:52:45.956+0530 [mongosMain] waited 11s for distributed lock configUpgrade for upgrading config database to new format v5
    2014-07-11T13:52:57.613+0530 [mongosMain] waited 22s for distributed lock configUpgrade for upgrading config database to new format v5

Also, I could not find any mongos process listening on that port.
So, when I run

    mongo --port 27030

it's showing me

    errno:111 connection refused

But previously it worked perfectly when I was testing on a single machine.

I don't know why I'm getting this error or how to solve it.
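One way to narrow down a "connection refused" like this is to check whether mongos is actually up and bound to the port. A hedged sketch (exact flags vary by distribution, and the log path is a placeholder):

```shell
# Sketch: check whether mongos is actually running and bound to the port
ps aux | grep '[m]ongos'                 # is the mongos process still alive?
netstat -tlnp | grep 27030               # is anything listening on 27030?
tail -n 50 /var/log/mongodb/mongos.log   # placeholder path; check your real mongos log
```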

s.molinari

Jul 15, 2014, 5:35:53 AM
to mongod...@googlegroups.com
On Tuesday, July 15, 2014 8:52:33 AM UTC+2, Indranil Mondal wrote:
OK, I have no issue with using a larger instance. But I was wondering whether there is any load-balancing service for MongoDB on Amazon that would automatically upgrade my instance as the database grows, because using a huge instance for a very small database is not a good option.

Anyway, are you suggesting I use option 1, i.e. dividing the database into shards where each shard has a replica set spread across separate instances?

AFAIK, you don't need a load balancer for a sharded MongoDB deployment. It handles load balancing on its own through the config servers and the mongos routers.

http://docs.mongodb.org/manual/core/sharding-introduction/#sharding-in-mongodb
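For reference, the balancing the docs describe kicks in once sharding is enabled per database and collection. A minimal sketch from a mongos shell (the database `mydb`, collection `orders` and shard key `userId` are all placeholders):

```javascript
// Sketch: turn on sharding for a database and a collection (run against mongos)
sh.enableSharding("mydb")                         // "mydb" is a placeholder database
sh.shardCollection("mydb.orders", { userId: 1 })  // shard key choice is illustrative only
sh.status()                                       // shows chunk distribution across shards
```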

Scott 

Indranil Mondal

Jul 15, 2014, 6:42:45 AM
to mongod...@googlegroups.com
As far as I understand, in the case of sharding I'll have two or more shards running on two or more different instances (or on the same instance on different ports). I'll have to configure all the shards up front, which means I'll have to buy the instances and configure them, so that if there is a huge amount of data it gets distributed among the shards.

I'm using Amazon EC2 instances, and for load-balancing our Node.js server we're using Elastic Beanstalk; if we're running the server on a micro instance and there is a huge load, it is automatically upgraded to a medium instance and configured automatically. Does MongoDB have any kind of service like that?

If there is such a service, I may not have to use sharding; replicas running in separate instances will fulfil my requirement. But if there is no such service, then I have to go for dividing the database into shards where each shard has a replica set in separate instances.

s.molinari

Jul 15, 2014, 7:22:30 AM
to mongod...@googlegroups.com
What you should do, as a minimum, is have 3 EC2 instances (middle-sized?), each running a mongod, clustered together as a replica set. If you are running a micro instance now and it runs OK, then you will have a good bit of room to grow with this replica set.

If you want the system to scale automatically, then you would need to use Amazon's Auto Scaling service, and that only scales out horizontally, which in Mongo terms means sharding. AFAIK this decision should be a "human" one, so I highly doubt you would use Amazon's Auto Scaling for the initial sharding scenario. For later sharding, it might be useful.

I'd venture to say you'd probably only need to vertically scale the 3 replica set nodes a couple of times, until you are on bigger EC2 instances. This, again, must be determined by your own growth acceleration. If you are growing fast, you might need to shard earlier.

In other words, sharding should always be in your mind for the future, and thus in all of your application design decisions, but it should be done only when absolutely necessary (though also not too late!). Basically, you are following the general golden rule of "the less complication you have in your system, the better off you are" and the Mongo golden rule of "if you have to shard, have enough resources available to do so".

Scott 

Indranil Mondal

Jul 15, 2014, 7:50:09 AM
to mongod...@googlegroups.com
Yeah, I'll go with replication across multiple instances for now, but I have to be prepared for sharding too. I'm following:

http://docs.mongodb.org/manual/tutorial/convert-replica-set-to-replicated-shard-cluster/ for sharding on a single machine, and it worked fine.

Later I tried to run the shards and replica sets on separate, locally connected machines. The replica sets worked fine, but I'm getting an error when I try to connect mongos to the three config servers running on three machines. I've already posted what I've done so far in my previous post. Can you please look into it?



s.molinari

Jul 15, 2014, 11:05:35 AM
to mongod...@googlegroups.com
Where were the config servers running? AFAIK, they must be on different machines than the shard/replica nodes. I am not certain about that, though.

Scott

Indranil Mondal

Jul 16, 2014, 3:21:58 AM
to mongod...@googlegroups.com
Currently I have four machines: 192.168.1.10, 192.168.1.11, 192.168.1.9 & 192.168.1.6.

The replica sets and config servers are running on these four machines on separate ports.


The firstset has three members: 192.168.1.10:27018 (primary), 192.168.1.11:27018 (secondary) and 192.168.1.9:27018 (arbiter).

And the secondset has three members: 192.168.1.6:27019 (primary), 192.168.1.9:27019 (secondary) and 192.168.1.11:27019 (arbiter).

The config servers run on 192.168.1.10:27020, 192.168.1.11:27020 & 192.168.1.6:27020.

And mongos runs on 192.168.1.10 on port 27030.