Amount of data MongoDB can handle


Abhi

Apr 17, 2014, 5:01:02 AM
to mongod...@googlegroups.com
Hi,
I am looking at document-based NoSQL databases for my requirements, and MongoDB seemed promising.
Before proceeding, I want to know how much data a single MongoDB server can handle before breaking.

Has anyone done this kind of breaking-point test? If so, what limit did you reach? Can you provide details like document size, number of documents, etc.?
If not, could someone from MongoDB shed some light on this aspect?

Thanks,
Abhi

s.molinari

Apr 17, 2014, 8:28:05 AM
to mongod...@googlegroups.com
I would imagine it depends totally on the server in question. Without knowing what kind of server MongoDB is running on, it is impossible to say how much data it can handle. It also depends on what you want to do with that data: more reads than writes, or more writes than reads? Thousands of collections or just a few? It also depends on the network connectivity. But I am a Mongo newb too, so TIWAGOS.

Scott

Asya Kamsky

Apr 21, 2014, 6:33:46 PM
to mongodb-user
The absolute limit for data in a single node depends on your operating system.

On Linux it's 64 Terabytes and on Windows it's 4 Terabytes.   This is assuming you're running with journaling turned on (which is the default).

Asya
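
A rough way to keep an eye on where a deployment stands relative to limits like these is the dbStats command, which reports data, storage, and index sizes for a database. Below is a minimal pymongo sketch; the connection string and the "mydb" database name are placeholders, not anything from this thread.

    # Sketch: print current data/storage/index sizes for one database,
    # assuming a mongod reachable at the default localhost address.
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    stats = client["mydb"].command("dbStats")   # sizes are reported in bytes

    # dataSize: logical size of the documents, storageSize: size on disk,
    # indexSize: total size of all indexes in this database.
    for key in ("dataSize", "storageSize", "indexSize"):
        print(key, round(stats[key] / 1024 ** 3, 2), "GB")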




s.molinari

Apr 22, 2014, 3:55:28 AM
to mongod...@googlegroups.com
That is interesting.

Could it be feasible and/or practical to have 64 TB on one node?

Scott

Abhi

Apr 23, 2014, 2:23:30 AM
to mongod...@googlegroups.com
Thanks, Asya, for the reply. The data I will be storing will cross 1 TB soon, so my follow-up question is: given the limits you mentioned, is it recommended to have 1-2 TB of data on a single node running Linux? At what point should I start sharding the data?

Thanks,
Abhi

Abhi

Apr 24, 2014, 10:25:01 AM
to mongod...@googlegroups.com
Hi Asya, 

My problem is that the data I am storing will be around 1 TB and may increase. So, given the limits you mentioned, is it recommended to have 1-2 TB of data on a single node running Linux? At what point should I start sharding the data?

Thanks,
Abhi

Asya Kamsky

Apr 24, 2014, 5:15:10 PM
to mongodb-user
The limits are frequently practical rather than theoretical.

For example, how much fun would it be to create a backup of a 20TB single node?   Or to resync a single secondary?

Having said that, the raw size of data is usually only the third reason that people shard.
The first one is usually to access more RAM, the second is to have more disk IO bandwidth.

And the relationship of the above to total data size is tenuous at best. It all depends on how much of your data is active, how much churn there is, etc.
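
For reference, once you do decide to shard, the mechanics are just a couple of admin commands against a mongos router; the hard part is choosing a shard key that matches your access pattern. A rough pymongo sketch, assuming a sharded cluster already exists; the database "mydb", the collection "events", and the hashed "user_id" shard key are placeholder choices, not recommendations:

    # Sketch: enable sharding for a database and shard one collection,
    # assuming the client is connected to the mongos of an existing cluster.
    from pymongo import MongoClient

    mongos = MongoClient("mongodb://localhost:27017")  # assumed mongos address

    mongos.admin.command("enableSharding", "mydb")
    mongos.admin.command("shardCollection", "mydb.events",
                         key={"user_id": "hashed"})  # hashed key spreads writes evenly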

Asya
P.S. Of course this all assumes indexed queries - if you don't tune your queries to use indexes and constantly have collection scans, then obviously the size of the data will affect performance very directly.
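
To make that P.S. concrete, here is a small pymongo sketch with placeholder database, collection, and field names, showing an index being created and explain() being used to check whether a query uses it or still scans the whole collection:

    # Sketch: index a queried field and inspect the query plan.
    from pymongo import MongoClient

    coll = MongoClient()["mydb"]["events"]   # assumed database/collection names
    coll.create_index([("user_id", 1)])      # index the field you filter on

    plan = coll.find({"user_id": 42}).explain()
    # Look for an index scan (IXSCAN / BtreeCursor, depending on server version)
    # rather than a collection scan (COLLSCAN / BasicCursor) in the output.
    print(plan)

If the plan still shows a collection scan, the total size of the collection will dominate query time, which is exactly the situation described above.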



s.molinari

Apr 25, 2014, 2:07:53 AM
to mongod...@googlegroups.com
Hehehe.....I just had to point it out. Asya, you said, "It just depends" once again and that made me chuckle. As the newb I am, I am still not sure if the "flexibility" causing the "it depends" statements is something absolutely beautiful about Mongo or sometimes a total curse. LOL!

Still, your insights are like gold to me. Thank you!

Scott 


Glenn Maynard

Apr 28, 2014, 12:56:12 PM
to mongod...@googlegroups.com
If you're running an archival system, where you store huge amounts of data but don't access it very often, you could run more than 64 TB on a single server by running multiple nodes on it.  Usually you hit other limitations long before that makes sense, but if you hit the address-space limit before you hit I/O-related limits, it's an option.  Once you have that much data, the cost of running a second server is probably relatively small, though.

I assume these limits apply to the whole size of the database (including indexes, journal, and data fragmentation), and not just data.

(4 TB is very low for a per-shard limit--there are 6 TB drives on the market.  Don't run MongoDB on Windows.)











--
Glenn Maynard
