Data Locality in WiredTiger in Different Collections

Ian Hansen

Mar 27, 2015, 10:51:20 AM
to mongod...@googlegroups.com
Some quick background: I'm playing with a time-series metrics database where every datapoint is a new document.

I'm intrigued by a new feature: no collection limits.

I started playing with creating a collection per metric. Also, as a hack, I'm relying on the automatic _id ObjectId for its built-in index and embedded timestamp: I can craft ObjectIds for use in queries to get the documents before or after a timestamp I define.

e.g. db.someMetricName.insert({v: 42.0})
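
To make the ObjectId trick concrete, here is a rough sketch of how such a boundary id could be built. It relies on the first 4 bytes of an ObjectId being the creation time in seconds since the Unix epoch. The helper name objectIdHexForTime and the example dates are made up for illustration; the function is plain JavaScript, and only the commented query at the end needs the mongo shell.

```javascript
// Sketch: build the smallest possible ObjectId hex string for a given time.
// objectIdHexForTime is a hypothetical helper, not part of the mongo shell.
function objectIdHexForTime(date) {
  // The first 4 bytes of an ObjectId hold seconds since the Unix epoch (big-endian).
  const seconds = Math.floor(date.getTime() / 1000);
  const prefix = seconds.toString(16).padStart(8, "0");
  // Zero the remaining 8 bytes so this id sorts before any real id from that second.
  return prefix + "0000000000000000";
}

// Example boundaries (assumed dates): one full UTC day.
const from = objectIdHexForTime(new Date("2015-03-27T00:00:00Z"));
const to = objectIdHexForTime(new Date("2015-03-28T00:00:00Z"));

// In the mongo shell, the range query on the automatic _id index would then be:
// db.someMetricName.find({_id: {$gte: ObjectId(from), $lt: ObjectId(to)}})
```

Since the timestamp has one-second resolution, documents inserted within the same second can only be ordered by the rest of the id, not by exact insert time.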

So, back to the question. As I understand it, in WiredTiger all the data is in one file on disk, but are collections written in such a way that all the documents in that collection are near each other?

If that is the case, I should expect faster reads for a single metric when each metric gets its own collection.

Are there any terrible side effects of creating thousands of different collections (one per metric)?

Asya Kamsky

Mar 28, 2015, 8:07:55 PM
to mongodb-user
Are your collections "insert-only"? Or do you also update documents?

If you *only* insert documents, then you should expect them to have
some loose proximity to others inserted around the same time. There is
no real guarantee, though, that two things in the same file are
physically near each other, because pages may be split into multiple
pages and then written out.

But if you tend to query them in larger chunks than what goes into a
single document, then you might want to consider storing them in
larger "pre-aggregated" time slices...
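
One way to read "pre-aggregated time slices" is a single document per metric per hour, with each datapoint written into a nested slot by minute and second. The sketch below is only an illustration of that idea under assumed names: the helper hourlyBucketUpsert, the field names (metric, hour, values, count, sum) and the collection name metrics are all hypothetical, not something from this thread.

```javascript
// Sketch: map one datapoint to an upsert against an hourly bucket document.
// hourlyBucketUpsert and all field names here are hypothetical.
function hourlyBucketUpsert(metric, date, value) {
  // Truncate the timestamp to the start of its UTC hour; this identifies the bucket.
  const hourStart = new Date(date.getTime());
  hourStart.setUTCMinutes(0, 0, 0);

  const minute = date.getUTCMinutes();
  const second = date.getUTCSeconds();

  return {
    filter: { metric: metric, hour: hourStart },
    update: {
      // Store the raw value at values.<minute>.<second> inside the bucket.
      $set: { ["values." + minute + "." + second]: value },
      // Keep running totals so hourly averages need no scan of the values.
      $inc: { count: 1, sum: value }
    }
  };
}

const u = hourlyBucketUpsert("someMetricName", new Date("2015-03-27T10:51:20Z"), 42.0);

// In the mongo shell, the write would then be:
// db.metrics.update(u.filter, u.update, {upsert: true})
```

Reading an hour of data then fetches one document instead of up to 3600, at the cost of updating documents in place rather than only inserting.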

How many metrics are we talking about here? Theoretically there may
not be collection limits but in practice, you are not going to like
having millions of files...

Asya