Use CPU from SECONDARY in a replica set

Jean-Baptiste Reich

May 13, 2013, 4:50:54 AM
to mongodb-user
Hello,

I am currently running a MongoDB sharded cluster with replica sets in a
write-intensive application. My problem is that the primary in the
replica set receives the whole load and reaches 100% CPU (reads on
secondaries are not possible). At the same time, I am far from using
the available hard drive space, and adding new replica sets to share
the CPU load is expensive (each one needs 3 machines, of which I use
only 1 CPU and a small part of the available hard drive space).

So, my question is whether there is a way to use the CPUs of the
secondary servers in a replica set.

Is it possible in a replica set to have a PRIMARY for only a subset of
the whole data? I know data is partitioned into clusters, so is it
possible to say that for cluster 0 the PRIMARY is machine 1, for
cluster 1 the PRIMARY is machine 2, and for cluster 2 the PRIMARY is
machine 3?

In order to achieve something like this, I am experimenting with an
architecture where I have as many replica sets as machines, in my case
3. Normally I would need 9 machines, but here I install all the replica
sets on those 3 machines. Then I change the priorities so that the
PRIMARY for rs0 is on machine 1, the PRIMARY for rs1 is on machine 2,
and the PRIMARY for rs2 is on machine 3. With that, I am able to
distribute the workload across all machines and to use my hard drives
more efficiently.
Is this a good way to solve my problem? Are there any drawbacks?
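
The layout described above can be sketched as follows. This is only a minimal illustration, assuming hypothetical host names (machine1 to machine3) and ports (27017 to 27019). It builds one replica set configuration document per set, using the standard `_id`, `members`, `host`, and `priority` fields, and gives each set its highest-priority member on a different machine.

```python
# Sketch: one replica set per machine, each preferring a different PRIMARY.
# Host names and ports are hypothetical placeholders, not from the thread.
HOSTS = ["machine1", "machine2", "machine3"]
BASE_PORT = 27017


def build_replica_set_configs(hosts, base_port=BASE_PORT):
    """Return one replica set config document per host.

    Set i (named rs<i>) has one member on every host, all on port
    base_port + i. The member on hosts[i] gets priority 2 so that host
    is preferred as PRIMARY; the other members keep priority 1.
    """
    configs = []
    for set_index, preferred_host in enumerate(hosts):
        port = base_port + set_index
        members = [
            {
                "_id": member_id,
                "host": f"{host}:{port}",
                "priority": 2 if host == preferred_host else 1,
            }
            for member_id, host in enumerate(hosts)
        ]
        configs.append({"_id": f"rs{set_index}", "members": members})
    return configs


def preferred_primaries(configs):
    """Map each replica set name to the host of its highest-priority member."""
    return {
        cfg["_id"]: max(cfg["members"], key=lambda m: m["priority"])["host"]
        for cfg in configs
    }


if __name__ == "__main__":
    configs = build_replica_set_configs(HOSTS)
    for name, host in preferred_primaries(configs).items():
        print(name, "->", host)
```

Each document has the shape accepted when initiating a replica set. Note that priority only expresses a preference: after a failover, two PRIMARY members can still end up on the same machine until the failed member returns.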

Thanks

Rob Moore

May 13, 2013, 10:32:28 AM
to mongod...@googlegroups.com


This is a very workable solution, provided that you have the resources (CPU, disk I/O, and memory) on the machines to support the workload, you test with realistic load and data, you actively monitor the resource utilization of the mongod processes, and you have a plan for dealing with failures and spikes in load.

We do it in production today with 4 mongod processes on each host. Others have done the same; you can find their comments in this group. The common thread is usually running on physical machines (even low-end ones can stack a couple of mongods, IMHO) and not using VMs.

10gen will advise against this, as it requires more planning and thought, but if you have the resources, test, and monitor (don't skip any of those), you will be fine.
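
As one way to make "a plan for dealing with failures" concrete, here is a rough back-of-the-envelope sketch. It assumes a deliberately simple model that is not from this thread: every host runs one PRIMARY plus one SECONDARY for each other set, a failed host's PRIMARY role moves entirely to one surviving host, and load is measured in CPU cores. All numbers are made-up placeholders.

```python
# Sketch: worst-case CPU utilization on a surviving host after one host fails.
# The load model and every figure below are illustrative assumptions.

def worst_case_utilization(primary_cores, secondary_cores, cores_per_host):
    """Return the highest CPU utilization (0.0 to 1.0+) after a single failure.

    primary_cores:   dict of host -> cores used by the PRIMARY it normally runs
    secondary_cores: cores used by each SECONDARY member
    cores_per_host:  cores available on every host
    """
    hosts = list(primary_cores)
    if len(hosts) < 2:
        raise ValueError("need at least two hosts to model a failover")
    worst = 0.0
    for failed in hosts:
        for takeover in hosts:
            if takeover == failed:
                continue
            # The takeover host now runs two PRIMARY members (its own and the
            # promoted one) plus a SECONDARY for each remaining set.
            load = (
                primary_cores[takeover]
                + primary_cores[failed]
                + secondary_cores * (len(hosts) - 2)
            )
            worst = max(worst, load / cores_per_host)
    return worst


if __name__ == "__main__":
    usage = {"machine1": 1.5, "machine2": 1.0, "machine3": 2.0}
    ratio = worst_case_utilization(usage, secondary_cores=0.5, cores_per_host=8)
    print(f"worst-case utilization after one failure: {ratio:.0%}")
```

The same arithmetic would need repeating for disk I/O and memory, and real measurements should replace the placeholder figures before drawing any conclusion.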

Rob.

Jean-Baptiste Reich

May 13, 2013, 10:55:40 AM
to mongod...@googlegroups.com
OK

I am happy to learn that others are doing this. And yes, I have the resources, and I test and monitor. I don't know yet whether it is a good solution; I still need to run all my load and failover tests on it, but at least I know that it could be a solution.

Thank you



Asya Kamsky

May 13, 2013, 3:33:21 PM
to mongod...@googlegroups.com
There is one big problem with what you are seeing.  CPU load is almost
never the bottleneck/limiting resource for MongoDB (or databases in general).

Unless you are running a large number of MapReduce jobs and/or aggregation 
framework queries, high CPU utilization tends to indicate that you have poorly
tuned queries possibly with in-memory sorts (as opposed to indexes supporting
sorting by reading documents in correct order).

Before jumping into the complex configuration you describe, I would try to
find out WHY the CPU load is too high; some simpler solutions may well
emerge.
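
A first diagnostic pass along these lines could look like the sketch below. It assumes the explain() output format of the MongoDB 2.x releases current at the time of this thread (fields `cursor`, `scanAndOrder`, `nscanned`, `n`); later versions changed that format. The sample documents, the index name, and the 10x scan-ratio threshold are all hypothetical.

```python
# Sketch: flag common causes of high CPU in a legacy-format explain() result.
# Assumes MongoDB 2.x-style explain fields; the threshold is arbitrary.

def diagnose_explain(explain, scan_ratio_limit=10):
    """Return a list of human-readable findings for one explain document."""
    findings = []
    if explain.get("cursor", "").startswith("BasicCursor"):
        findings.append("collection scan: no index was used")
    if explain.get("scanAndOrder"):
        findings.append("in-memory sort: no index supports the sort order")
    scanned = explain.get("nscanned", 0)
    returned = explain.get("n", 0)
    if scanned > scan_ratio_limit * max(returned, 1):
        findings.append(f"scanned {scanned} entries to return {returned}")
    return findings


# Hypothetical explain results, for illustration only.
SLOW_QUERY = {"cursor": "BasicCursor", "scanAndOrder": True,
              "nscanned": 50000, "n": 20}
INDEXED_QUERY = {"cursor": "BtreeCursor user_id_1", "scanAndOrder": False,
                 "nscanned": 20, "n": 20}

if __name__ == "__main__":
    for label, doc in (("slow", SLOW_QUERY), ("indexed", INDEXED_QUERY)):
        print(label, diagnose_explain(doc) or "no obvious problem")
```

Running the check over the most frequent queries is a cheap way to see whether indexing explains the CPU load before changing the deployment.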

Asya
P.S. write heavy load means that rather than worrying about using
available disk drive space you should be concerned about available
disk IO bandwidth.

Rob Moore

May 13, 2013, 5:14:46 PM
to mongod...@googlegroups.com
On Monday, May 13, 2013 3:33:21 PM UTC-4, Asya Kamsky wrote:
There is one big problem with what you are seeing.  CPU load is almost
never the bottleneck/limiting resource for MongoDB (or databases in general).

Unless you are running a large number of MapReduce jobs and/or aggregation 
framework queries, high CPU utilization tends to indicate that you have poorly
tuned queries possibly with in-memory sorts (as opposed to indexes supporting
sorting by reading documents in correct order).

For the record, we run well over a single core of CPU for each mongod. I can assure you we are using indexes for all queries (the vast majority use _id). It would become a glorious disaster if we did not. Trust me, I've seen it happen in testing.

I freely admit we are probably _not_ a normal case, and we did not get there without some work, but it can certainly be done.

P.S. write heavy load means that rather than worrying about using
available disk drive space you should be concerned about available
disk IO bandwidth.

Yes, by disk I meant disk I/O; space is rarely a concern when talking about performance. I should have been clearer. Sorry.

Rob.