How WiredTiger handles collections, and multi-collection limitations

Azwok Kateal

Jul 26, 2016, 3:56:17 PM
to mongodb-user
Hi,

I'm essentially looking for the details in response to Samanth Atkin's question: https://groups.google.com/d/msg/mongodb-user/vHKZODyh9cE/gkT_miNwEQAJ, and a couple of other queries.

My use case is the storage and rapid querying of large volumes of processed satellite data (design requirements: http://dba.stackexchange.com/questions/143342/large-22-trillion-items-geospatial-dataset-with-rapid-1s-read-query-perfor). I am testing a multi-collection design, aiming for better performance from smaller indexes and faster inserts (which I expect because the geospatial indexes are smaller). My initial testing shows good performance, but when I scale up the number of collections in this design (the current target is >50,000 for my sample, but full production scale will end up at >1 million, at which point I will look at a mix of multi-collection/server and sharding), I run into "too many open files" errors (http://dba.stackexchange.com/questions/144911/make-mongodb-close-open-files). With further reading and testing, I see that when MongoDB (WiredTiger) starts, it opens every collection straight away (seen using lsof). Is this normal behavior? Is this just how the engine works, or is it possible to stop it opening every file, so that the engine opens a collection file only when it first needs it and closes it when it is finished?

The documentation (https://docs.mongodb.com/manual/reference/limits/, "Number of Collections in a Database") states that WiredTiger is not subject to the same limit on the maximum number of collections as MMAPv1, but it does not say whether WiredTiger is subject to any other limit. If there is no way to prevent the server opening every collection on startup, surely the maximum number of collections is limited by the system's limit on the maximum number of open files, and ultimately by the hardware (is it RAM?) needed to keep that many files open?

If there is no way to prevent the server opening every collection file when it starts up, and increasing the ulimit is my only option for having many collections, then what are the limitations (including hardware) and possible problems I need to consider when increasing the ulimit to very large numbers?
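To put rough numbers on the ulimit question, here is a minimal Python sketch, assuming WiredTiger keeps one file per collection plus one file per index and that all of them are held open at once (which is what lsof suggests above). The function names, the default of two indexes per collection (`_id` plus one geospatial index), and the `overhead` allowance for sockets, journal, and log files are illustrative assumptions, not measured values.

```python
import resource


def estimate_open_files(num_collections, indexes_per_collection=2, overhead=1000):
    """Rough count of file descriptors mongod might need.

    Assumes one data file per collection and one file per index,
    plus a fixed allowance (overhead) for everything else.
    """
    files_per_collection = 1 + indexes_per_collection
    return num_collections * files_per_collection + overhead


def check_against_ulimit(num_collections, indexes_per_collection=2, overhead=1000):
    """Compare the estimate with this process's open-file limits.

    Note: this reads the limits of the current process, which only
    match mongod's limits if both are started the same way.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    needed = estimate_open_files(num_collections, indexes_per_collection, overhead)
    unlimited = soft == resource.RLIM_INFINITY
    return {
        "needed": needed,
        "soft_limit": soft,
        "hard_limit": hard,
        "fits": unlimited or needed <= soft,
    }


if __name__ == "__main__":
    # Sample target and production target from the post.
    for n in (50_000, 1_000_000):
        print(n, check_against_ulimit(n))
```

Under these assumptions, the 50,000-collection sample would already need on the order of 150,000 descriptors, and the 1 million target roughly 3 million, so the actual per-collection file count is worth confirming with lsof before choosing a limit.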

Mongo is still very new to me, but I'm learning a lot quickly and I really like what I have learned so far! So any advice is welcome, as I'd like to learn as much as possible about this database engine!

Cheers,
Azwok
