ok help me understand the hardware requirements here.

Carl Christianson

Oct 23, 2018, 3:08:33 PM
to mongodb-user
First off, I know there is a lot I don't know yet, and I'm just trying to wrap my head around a few things.
With an RDBMS, once you have scaled up as far as you can go, scaling horizontally instead is a pain. So something like Mongo, which runs on commodity hardware and can be sharded across many machines, seems on the surface like a possible solution. But the devil is always in the details...

My questions are about just how much cheaper or more efficient it really is. With an RDBMS, let's say you need to scale because you are operating a SaaS/multi-tenancy model and you have too many customers (a good problem). You can shard tenants across different DB servers. The downside is that your application now has to manage where each tenant's data resides, typically with a master database. This is a design pattern I have implemented more than once for a large number of tenants, and it has worked well. I have never found a situation where a single tenant would even remotely approach the capacity of a single machine.
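The master-database pattern described above can be sketched roughly as follows. This is a minimal in-memory illustration, not any particular product's API: `TenantDirectory`, `TenantLocation`, and the host and database names are all hypothetical, and a real deployment would keep this mapping in the master database rather than in a Python dict.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantLocation:
    host: str      # database server that holds this tenant's data (hypothetical name)
    database: str  # database or schema name on that server (hypothetical name)


class TenantDirectory:
    """In-memory stand-in for the master database that maps tenants to servers."""

    def __init__(self):
        self._locations = {}

    def assign(self, tenant_id, host, database):
        # Record (or move) a tenant's home server.
        self._locations[tenant_id] = TenantLocation(host, database)

    def locate(self, tenant_id):
        # The application calls this before opening a connection for a tenant.
        try:
            return self._locations[tenant_id]
        except KeyError:
            raise LookupError(f"unknown tenant: {tenant_id}") from None


directory = TenantDirectory()
directory.assign("acme", "db-server-01", "tenant_acme")
directory.assign("globex", "db-server-02", "tenant_globex")

location = directory.locate("acme")
print(f"acme lives on {location.host} in database {location.database}")
```

The point of the sketch is the cost mentioned above: every data access goes through an application-level lookup that the application itself has to maintain.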


So for this hypothetical Spanish bank, Case #1, you need (in production) 18TB of space and 4TB of RAM across the cluster. They specify a max of 128GB of RAM per shard, so to meet a 4TB working set's worth of RAM you'll need 36 shards; with replicas you are talking 108 physical machines (plus a config replica set, and more if you run the mongos instances on separate hardware instead of on the app servers).
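As a rough check of that arithmetic, here is a small sketch. The function names are made up for illustration, and it assumes 1TB = 1024GB and three members per replica set. Note that the bare division gives 32 shards; the 36 quoted above is the case study's own figure, and the thread does not say where the difference comes from.

```python
import math


def shards_for_working_set(working_set_gb, ram_per_shard_gb):
    """Minimum number of shards whose combined RAM covers the working set."""
    return math.ceil(working_set_gb / ram_per_shard_gb)


def data_bearing_machines(shards, replica_members=3):
    """Each shard is a replica set, so every shard costs several machines."""
    return shards * replica_members


minimum_shards = shards_for_working_set(4 * 1024, 128)  # bare division
case_study_shards = 36                                  # figure quoted in the post
machines = data_bearing_machines(case_study_shards)

print(f"minimum shards from RAM alone: {minimum_shards}")
print(f"machines for {case_study_shards} shards: {machines}")
```

Config servers and any dedicated mongos hosts come on top of the data-bearing machines counted here.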

Now my reading/understanding/hearing is that data typically takes a fair amount more space in a NoSQL store than it does in an RDBMS. That 18TB in Mongo could easily be 12TB in an RDBMS. The devil is in the details there; maybe it wouldn't pare down that nicely, but it very well could.

Just as an example: I have in play right now a 12TB SQL Server and a 6TB Oracle solution in production. The 12TB box is extremely busy; that server has 96 cores, 1TB of RAM, and a really expensive storage solution, and of course licensing an enterprise RDBMS on 96 cores is never cheap. The 6TB Oracle solution is just humming along surprisingly well, and it's sitting on a dual-socket server with 64GB of RAM; really it would be a good comparison for Case #2.

So with the SQL Server solution, scaling up is really not an option, as it's already on about the biggest thing you can get. But when I think of what it would take on Mongo, I'm looking at 108 machines! While each one costs a lot less, each of the machines in the Mongo example provided would cost you over 10K, probably more like 12-14K.
So just in hardware you are talking 1.1-1.2M dollars. Now if you license it, I do not know the model here, but one place I read said something like 10K per node, so another 1.1M in licensing? That is NOT cheaper. It's also something the operations staff will laugh at when you tell them that the ONE big honkin' machine that is keeping up with the load needs 108 machines to replace it. That is a nightmare to manage in comparison. And I didn't even touch on the network, heat generation, and power consumption of 108 nodes versus what's in place now.
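For reference, the totals are simple multiplication and can be tabulated as below. The per-machine prices and the 10K-per-node licensing figure are the rough guesses from this post, not vendor quotes, and `fleet_cost` is just an illustrative helper.

```python
MACHINES = 108  # 36 shards x 3 replica set members, as estimated in this post


def fleet_cost(machines, unit_cost_usd):
    """Total cost of buying (or licensing) every machine at one unit price."""
    return machines * unit_cost_usd


# Hardware totals at the guessed unit prices of 10K, 12K, and 14K.
hardware_by_unit_price = {
    price: fleet_cost(MACHINES, price) for price in (10_000, 12_000, 14_000)
}
# Licensing at the rumored 10K per node.
licensing = fleet_cost(MACHINES, 10_000)

for price, total in hardware_by_unit_price.items():
    print(f"hardware at ${price:,} per machine: ${total:,}")
print(f"licensing at $10,000 per node: ${licensing:,}")
```

The 1.1M figure lines up with the 10K unit price; at 12-14K per machine the hardware total lands closer to 1.3-1.5M.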

I for one am really digging Mongo, as I am noodling on a POC with it now and taking some online courses. I really am looking at spinning my next project up in Mongo. However, if I have to go to, say, my boss and explain exactly why we should use this technology, it's a tough sell. In my world an 18TB RDBMS database is a healthy but not crazy large database. There are deployments on Oracle in the petabyte range. I am NOT trying to turn this into a holy war of RDBMS versus NoSQL. I honestly would like to hear that I have some misconceptions and that my cost numbers for Mongo are way off. It's just a hard sell for me to go to the powers that be that control the $$$ and tell them that we need manpower, heat, network, and physical space for 108 2U servers rather than a 4U DB server with a NetApp disk cabinet. One solution fills a large part of a data center; one takes most of 2 racks, not to mention the heat/power/networking costs.
As much as I'd like to NOT be on Oracle / SQL Server, how can I really recommend 108 bits of hardware versus what we use now?

I am pretty taken aback by multiplying by 3 for replica sets. Also, the RAM requirements for Mongo seem an order of magnitude higher. Oracle/SQL Server do a damn good job of keeping just the blocks actually being accessed in memory and swapping blocks of data/index in and out as needed.

Keep in mind I'm using Mongo's own case study for these numbers... Also, I really want to do my next project in Mongo because it's cool technology, but I also would like to keep my job :D

Wan Bachtiar

Oct 31, 2018, 12:49:17 AM
to mongodb-user

Hi Carl,

First, it's worth noting that the blog post is based on the MongoDB World talk "Hardware Provisioning for MongoDB" from June 2014 (MongoDB v2.6). There have been many changes to MongoDB since then (the current release is v4.0). One of the notable changes is the new WiredTiger storage engine, which has been the default storage engine since v3.2.

In scenario Case #1, one of the bank's requirements is to actually have a working set size of 4TB. The case is intended as an example of scaling horizontally. Again, this was calculated at the time with the MMAPv1 storage engine in mind; please also see Memory Diagnostics for WiredTiger for consideration.
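To give a feel for how WiredTiger's memory use differs, here is a small sketch of its documented default internal cache size (MongoDB 3.4 and later): the larger of 50% of (RAM - 1GB) or 256MB. The function name is made up for illustration, and the sketch ignores the filesystem cache, which WiredTiger also relies on, so it is not a sizing recommendation.

```python
def default_wiredtiger_cache_gb(ram_gb):
    """Default WiredTiger internal cache size in GB for a host with ram_gb of RAM.

    Assumes the documented default for MongoDB 3.4+:
    the larger of 50% of (RAM - 1 GB) or 256 MB.
    """
    return max(0.5 * (ram_gb - 1), 0.25)


# 128 GB is the per-shard RAM cap from Case #1; the others are for comparison.
for ram_gb in (1, 64, 128):
    cache_gb = default_wiredtiger_cache_gb(ram_gb)
    print(f"{ram_gb} GB RAM -> {cache_gb} GB default WiredTiger cache")
```

The default can be overridden with the `storage.wiredTiger.engineConfig.cacheSizeGB` setting.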

Although some of the blog's content may be out of date, the four tips for capacity planning from the blog are still relevant:

  • Document the performance requirements up front.
  • Stage a Proof of Concept.
  • Always test with a real workload.
  • Constantly monitor and adjust based on changing requirements.

> I am pretty taken back about multiplying by 3 for replica sets.

Replica sets provide redundancy and high availability, and are the basis for all production deployments. Please review replication in MongoDB for more information.
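One way to see why three members is the usual minimum: a replica set needs a majority of its voting members available to elect a primary. The sketch below just works through that arithmetic; the function names are illustrative and are not part of any MongoDB API.

```python
def majority(voting_members):
    """Votes needed to elect a primary."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members):
    """Members that can be lost while a primary can still be elected."""
    return voting_members - majority(voting_members)


for members in (1, 2, 3, 5):
    print(
        f"{members} members: majority {majority(members)}, "
        f"tolerates {fault_tolerance(members)} failure(s)"
    )
```

With one or two members, losing a single node leaves no majority, so three is the smallest set that survives a node failure.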

Regards,
Wan.
