I'm setting up a MongoDB installation and wondered if there are any
general guidelines on
hardware selection.
For instance:
In terms of scalability I'm sure there are some trade-offs between
number of CPU cores
and disk spindles. I would like a setup that I can scale with shards
over time so should
I go with nodes of 8 cores and RAID 10 storage or more nodes of 4
cores with just RAID 1?
I've been trying to weigh buying four quad-core machines against two
8-core machines.
Thanks, I could really use some general advice.
Steve
PS: I'm impressed with MongoDB from my testing on very low-end
hardware (Atom 330)
My advice would be the same if this server were intended to run Oracle, MySQL, Sybase, etc. because the problem domains are so similar.
I/O
If your application is very write heavy, but the writes are small and potentially random, then go with RAID10.
If your application is very write heavy, but the writes are few, large and streaming, then RAID5 or RAID6 is a reasonable candidate.
If your application is read heavy, any RAID will do.
I almost always choose RAID10 because it is vastly more resilient to hardware failures and its rebuilds are "cheaper" than for RAID5 or 6.
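The rules of thumb above can be written down as a small function. This is only a sketch in Python; the function name and the two boolean flags are made up for illustration and are not part of any MongoDB or RAID tooling.

```python
def suggest_raid(write_heavy: bool, small_random_writes: bool = False) -> str:
    """Return a RAID suggestion following the rules of thumb in this post.

    Both parameters are hypothetical workload flags chosen for this sketch.
    """
    if not write_heavy:
        # Read-heavy workload: any RAID level will do.
        return "any RAID level"
    if small_random_writes:
        # Write-heavy with small, potentially random writes.
        return "RAID10"
    # Write-heavy with few, large, streaming writes.
    return "RAID5 or RAID6"


if __name__ == "__main__":
    print(suggest_raid(write_heavy=True, small_random_writes=True))
    print(suggest_raid(write_heavy=True, small_random_writes=False))
    print(suggest_raid(write_heavy=False))
```

Resilience and rebuild cost are not modelled here; they are the reason RAID10 may still be preferred where the function says any level will do.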
I'll leave you to decide if you want to do FC, AoE, direct attached, iSCSI or some other storage connectivity option. Since MongoDB doesn't currently allow you to force indexes to be built on a separate storage path, there isn't a lot of reason to have multiple I/O channels. Perhaps you could future-proof your setup a little by allowing for that possibility down the road.
RAM
Make sure you install as much RAM as your budget allows. If you are going above 16GB of RAM, I highly recommend ECC RAM so stray errors can be detected and corrected without crashing your box.
CPU
This is the toughest part of the discussion. Right now MongoDB is not very concurrent, in the sense that it does not make use of many cores. My understanding is that it has a write lock (per collection), so all writes are serialized and could potentially max out 1 CPU. Reads, however, can be more concurrent, so you might be able to take advantage of additional cores if there is extremely heavy read activity.
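One way to put rough numbers on this is Amdahl's law, treating the lock-serialized writes as the part of the workload that cannot be spread across cores. The sketch below is in Python; the 70% serialized share is an invented figure, not a measurement of MongoDB, and the model ignores I/O waits entirely.

```python
def max_speedup(serial_fraction: float, cores: int) -> float:
    """Amdahl's law: upper bound on speedup over a single core.

    serial_fraction is the share of the work that cannot run in parallel
    (here, an assumed stand-in for serialized writes).
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


if __name__ == "__main__":
    # Assumed workload: 70% of the time is spent in serialized writes.
    for cores in (1, 2, 4, 8):
        print(cores, "cores ->", round(max_speedup(0.7, cores), 2), "x")
```

Under that assumption, going from 4 to 8 cores barely moves the bound, which is the argument for not paying for more cores on a write-heavy box. A read-heavy workload (small serial fraction) would look quite different.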
Using db.eval() to execute code server-side blocks the entire process, so no more than 1 core can be used. Point for a dual or single quad core system.
Map/reduce allows for some concurrency, but unless you are running several of them simultaneously, more cores are a waste. Point again for a dual or single quad core system.
If you intend to run more software than just MongoDB on this box, that will likely be your deciding factor for the number of cores. I just don't see the point in going beyond a *single* quad core for MongoDB now or in the immediate future.
Corrections, suggestions, criticisms are all welcome...
cr
Based on your advice I will probably go for nodes with 4 cores and
RAID 10 sets.
The machines will be dedicated to MongoDB.