Super slow sequential read on an indexed timestamp?

391 views
Skip to first unread message

Keith Irwin

unread,
Dec 29, 2010, 12:58:40 PM12/29/10
to mongod...@googlegroups.com
Folks--

I'm having trouble with queries based on the sorted order of an indexed timestamp field. I have a collection with several million objects. When I try to find the object with the earliest timestamp, the query takes a long time (> 20 minutes), or, for all intents and purposes, never completes.

I have something like:

  db.data.ensureIndex({"timestamp" : 1})

which was in place before the data was even written, then

  db.data.find({"key" : "value"}, {"timestamp" : 1}).sort({"timestamp" : 1}).limit(10)

this never returns (or I loose my patience waiting for it). In other words, 20 minutes is way too long for a query of this type.

However, the following returns right away:

  db.data.find({"type" : "whatever"}, {"timestamp" : 1}).sort({"timestamp" : -1}).limit(10)

My wild guess is that the second case works quickly because the latest data is in memory. What I'm actually trying to do is find the earliest object and then running some aggregations based on that. My actual code has some "where" clauses in it as well, but the above illustrates the problem much more simply.

Am I doing something wrong?

Keith

Scott Hernandez

unread,
Dec 29, 2010, 1:11:58 PM12/29/10
to mongod...@googlegroups.com
I think you actually want to index like this:
db.data.ensureIndex({"key":1, "timestamp" : 1})

The index will include the value you are looking for and is ordered
the way you want. Having an index just on the sorted field doesn't
help as it can't be used (effectively) if you are searching for other
values.

Any $where cannot use an index so it will have to inspect every
document for those criteria, and all javascript runs in a single
thread; not good for concurrency for large number of docs or queries.

> --
> You received this message because you are subscribed to the Google Groups
> "mongodb-user" group.
> To post to this group, send email to mongod...@googlegroups.com.
> To unsubscribe from this group, send email to
> mongodb-user...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/mongodb-user?hl=en.
>

Ethan Gunderson

unread,
Dec 29, 2010, 1:14:40 PM12/29/10
to mongod...@googlegroups.com
Can you do an explain on your query? I'm guessing it's not actually using an index.

--

Luke Ehresman

unread,
Dec 29, 2010, 1:15:21 PM12/29/10
to mongod...@googlegroups.com
Keith,

What other indexes do you have on this collection?

Since you're doing a query against the collection along with a sort, it's needing to do a scan over the whole collection to match the condition {"key":"value"}.  Thus, your index on timestamp isn't doing much good.  Try doing a compound index of {key:1, timestamp:1}.

Luke


--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To post to this group, send email to mongod...@googlegroups.com.
To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.



--
Luke Ehresman
lu...@ehresman.org
Tebros Systems, LLC - http://tebros.com

Keith Irwin

unread,
Dec 29, 2010, 1:45:23 PM12/29/10
to mongod...@googlegroups.com
Yes, the "key" part of the query is also indexed. I have an index on everything I might want to search for, but still no good.

The "where" stuff is just GTE and LTE types of things. Once I actually get the timestamp I want, I calculate some boundaries, then get data limited by those boundaries, and that's acceptably fast. In other words, getting the data between two timestamps works fine.

Here's the explain:

db.slices.find({"mac" : "001EC001EB62"}, {"timestamp" : 1}).explain()
{
"cursor" : "BtreeCursor mac_1",
"nscanned" : 2608170,
"nscannedObjects" : 2608170,
"n" : 2608170,
"millis" : 4438,
"nYields" : 28,
"nChunkSkips" : 0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds" : {
"mac" : [
[
"001EC001EB62",
"001EC001EB62"
]
]
}
}

I add a sort and limit to the end of this.

Keith

Scott Hernandez

unread,
Dec 29, 2010, 1:47:30 PM12/29/10
to mongod...@googlegroups.com
A query can only use a single index. If you want to filter and sort
with a single index you should use a compound index as has been
suggested.

Keith Irwin

unread,
Dec 29, 2010, 1:48:34 PM12/29/10
to mongod...@googlegroups.com
On Wed, Dec 29, 2010 at 10:11 AM, Scott Hernandez <scotthe...@gmail.com> wrote:
I think you actually want to index like this:
db.data.ensureIndex({"key":1, "timestamp" : 1})

Oh, hm. Are you taking about a compound key, here? I've created the indexes via the Java drivers, with one "ensureIndex" per field.

Keith

Keith Irwin

unread,
Dec 29, 2010, 1:50:10 PM12/29/10
to mongod...@googlegroups.com
On Wed, Dec 29, 2010 at 10:47 AM, Scott Hernandez <scotthe...@gmail.com> wrote:
A query can only use a single index. If you want to filter and sort
with a single index you should use a compound index as has been
suggested.

Ah, okay. I didn't realize that's what you suggested. The scripting is not always clear to me. I'll give that a try. That certainly explains the problems!

Scott Hernandez

unread,
Dec 29, 2010, 1:55:24 PM12/29/10
to mongod...@googlegroups.com
One index per field might not be that useful (since a query can only
use a single index) if you are always using more than one query
criteria. You might want to read up here:
http://www.mongodb.org/display/DOCS/Indexes

Keith Irwin

unread,
Dec 29, 2010, 2:01:11 PM12/29/10
to mongod...@googlegroups.com
On Wed, Dec 29, 2010 at 10:55 AM, Scott Hernandez <scotthe...@gmail.com> wrote:
One index per field might not be that useful (since a query can only
use a single index) if you are always using more than one query
criteria. You might want to read up here:
http://www.mongodb.org/display/DOCS/Indexes

Yes, I've scanned that several times. For some reason, I never gleaned from it that a given query can only use one index at a time. ;) I'm building a new index (which is taking a long time, unsurprisingly), so we'll see how it goes. I'm glad I wrote in to ask the question! ;)

Thanks for all the pointers.

Keith
Reply all
Reply to author
Forward
0 new messages