WiredTiger limits


Francisco A. Lozano

Apr 25, 2015, 8:26:19 AM
to mongod...@googlegroups.com
Hi. I am considering using the WiredTiger storage engine and have some questions about it.

My use case involves a huge number of collections and indexes, but it's not clear to me what the limits are in this regard. I can see in the documentation that there are no theoretical limits... but I'm pretty sure that there are practical ones.

E.g., is it realistic to have 10 million collections with 5-6 indexes each?

MARK CALLAGHAN

Apr 25, 2015, 2:55:59 PM
to mongod...@googlegroups.com
From my usage, the number of files per collection is 2 + #secondary indexes:
* 2 for the collection itself and its _id index
* 1 for each secondary index

All of the files are in the --dbpath directory. Are you OK with 70M files in one directory? 
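
(Rough math, assuming your 5-6 indexes per collection include _id: 10,000,000 collections × (2 files + ~5 secondary index files) each comes to roughly 70,000,000 files.)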

This option lets you split that across two directories:
storage.wiredTiger.engineConfig.directoryForIndexes
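
For example (just a sketch - dbPath here is a placeholder), the YAML config would look something like:

  storage:
    dbPath: /data/db
    engine: wiredTiger
    wiredTiger:
      engineConfig:
        directoryForIndexes: true

With that enabled, if I remember right, collection files go under a collection/ subdirectory and index files under an index/ subdirectory.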




--
Mark Callaghan
mdca...@gmail.com

Asya Kamsky

Apr 25, 2015, 8:18:50 PM
to mongodb-user

You can have different databases store their files in different subdirectories too, but I don't think the primary consideration is how many files there are - what about how long it takes to open all the files and read them, or to flush changes to all of them? I guess it depends on how many of these collections are being used in parallel...

I'm going to guess that 70M collections + indexes is not going to perform well on a single server. Are you thinking that a single replica set would be handling this data?

If you were thinking of sharding it, then you could split the collections across your shards, so that with N shards each one would only have 70M/N collections + indexes.

Asya







MARK CALLAGHAN

Apr 25, 2015, 8:58:09 PM
to mongod...@googlegroups.com
Good to learn that "storage.directoryPerDB: true" helps.
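
Combining the two options in the YAML config would look something like this, I believe (dbPath and the database name below are placeholders):

  storage:
    dbPath: /data/db
    engine: wiredTiger
    directoryPerDB: true
    wiredTiger:
      engineConfig:
        directoryForIndexes: true

so a database "mydb" would keep its files under /data/db/mydb/collection/ and /data/db/mydb/index/ rather than everything sitting in one flat directory.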





--
Mark Callaghan
mdca...@gmail.com

Francisco A. Lozano

Apr 27, 2015, 5:43:57 AM
to mongod...@googlegroups.com
I would of course shard per collection (I assume I have to do that at the app level, since Mongo's sharding shards within collections - am I right?).

I didn't know about storage.directoryPerDB either; that sounds like it would help.

Thanks a lot!

Asya Kamsky

Apr 27, 2015, 3:20:09 PM
to mongodb-user
You can use tag-aware sharding to distribute collections across the cluster (then your application won't have to be aware of it). You can also create more databases and distribute collections across them - each DB can be on a different shard. That's less flexible than the first option. I discuss it somewhat here: http://askasya.com/post/taggedcollectionbalancing
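
A rough sketch of what that looks like in the mongo shell (the shard, database, collection and tag names below are made up):

  // tag the shards with the group each one should hold
  sh.addShardTag("shard0000", "groupA")
  sh.addShardTag("shard0001", "groupB")

  // shard a collection and pin its whole range to one group
  sh.enableSharding("mydb")
  sh.shardCollection("mydb.coll1", { _id: 1 })
  sh.addTagRange("mydb.coll1", { _id: MinKey }, { _id: MaxKey }, "groupA")

The balancer then keeps all of mydb.coll1's chunks on the shards tagged "groupA", and the application just uses the collection normally.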

Asya


