having trouble indexing an array

39 views
Skip to first unread message

Travis Laborde

unread,
Apr 13, 2012, 11:14:10 AM4/13/12
to mongod...@googlegroups.com
I have a collection of documents that look like this:

{
  "_id" : "bb08a318-7a07-4552-a0e7-81fa1f1e1677",
  "Exceptions" : [],
  "AppID" : 94,
  "TimeCompleted" : new Date("Fri, 13 Apr 2012 10:11:46 GMT -04:00")
}

(other fields omitted for brevity).  The Exceptions array is just a List<string> in .NET terms.  Just a list of strings.

In a sample database we have a few million documents.

I'm trying to query it using the following conditions:
{
AppID : 94
Exceptions : {$ne:[]}
}

with the following sort order:
{
TimeCompleted : -1
}

Basically, find the documents that the Exceptions collection is not empty, and show them to me from newest to oldest.

The query takes 3 minutes, and causes a 25% CPU hit on the server during that time.  I'm sure that indexing would help, as we have much larger databases with much more complex queries which all run just fine on the same server.  When I use "explain()" I see that it isn't using a very good index.  Probably this means I have not created the right index.  Which is why I'm here :)  This is my first attempt to create in index on an "array" section of a document.

Here is my current attempt at an index which is ineffective:
{
AppID : 1,
Exceptions : 1,
TimeCompleted : 1
}

Please help if you can,
Travis

Marc

unread,
Apr 13, 2012, 12:17:03 PM4/13/12
to mongodb-user
I believe the issue is with the $ne part of the query.
In general it is difficult to use an index to find "not matches".
There is actually a note on this on our FAQ page:
http://www.mongodb.org/display/DOCS/Indexing+Advice+and+FAQ#IndexingAdviceandFAQ-I%27musing%24neor%24nininaquery%2Candwhileitusestheindex%2Cit%27sstillslow.What%27shappening%3F

Similar questions regarding "not equals" queries have been asked in
the past "How to index queries with $nin and $ne filters?" -
http://groups.google.com/group/mongodb-user/browse_thread/thread/e8da8bfcd9605d30

If possible, it would be best to rearrange your query to perform a
"positive match".

One possible solution is to add a field to each document similar to
"HasExceptions". This will be true if the "Exceptions" array is
populated, and false if it is not. You can then include this field in
your index, and change your query to something like the following:

> db.collection.ensureIndex({"AppID" : 1, "HasExceptions" : 1, "TimeCompleted" : 1})

> db.collection.find({"AppID" : 94, "HasExceptions":true}).sort({"TimeCompleted" : -1})

I realize this involves a little extra overhead on inserts and updates
(determining if the "Exceptions" array is empty and changing
"HasExceptions" accordingly, but you should see a noticeable
improvement in query time.

Travis Laborde

unread,
Apr 19, 2012, 7:00:30 AM4/19/12
to mongod...@googlegroups.com
Marc, thanks.  Is there a "server side" way to add this property without having to loop through every document in client side code?

I have gone in and "added a property" to all documents in a collection, I believe using $set, but that was just adding a property with a default static value.  Can you give me an example if possible of how I'd add that property and have its value set to the count of items in an array as described?

Thanks,
Travis

Sam Millman

unread,
Apr 19, 2012, 7:17:19 AM4/19/12
to mongod...@googlegroups.com
You ncan do this nicely with the $inc.

So when you add a new element $inc the field that summerises the count of that array like so:

update({//query}, { $inc: {'fieldtoinc': 1} })

And wqhen you remove from the array:

update({//query}, { $inc: {'fieldtoinc': -1} })

Personally though I prefer to pull the field out and manipulate it on client side and then resave since its safer when de-$inc-ing.

--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To view this discussion on the web visit https://groups.google.com/d/msg/mongodb-user/-/nEsOj1rlhWEJ.

To post to this group, send email to mongod...@googlegroups.com.
To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.

Travis Laborde

unread,
Apr 19, 2012, 7:23:46 AM4/19/12
to mongod...@googlegroups.com
Sammaye, thanks, but that isn't really what I was asking about.  The array size never changes after the document is saved.  So when I insert a document, I can easily set that property.

What I'm asking about is "how to create this new field and set it to the right value" for millions of existing documents.  Without loading them all client side, setting the field accordingly, and saving them again.

Thanks!
Travis


On Thursday, April 19, 2012 7:17:19 AM UTC-4, Sammaye wrote:
You ncan do this nicely with the $inc.

So when you add a new element $inc the field that summerises the count of that array like so:

update({//query}, { $inc: {'fieldtoinc': 1} })

And wqhen you remove from the array:

update({//query}, { $inc: {'fieldtoinc': -1} })

Personally though I prefer to pull the field out and manipulate it on client side and then resave since its safer when de-$inc-ing.

To unsubscribe from this group, send email to mongodb-user+unsubscribe@googlegroups.com.

Sam Millman

unread,
Apr 19, 2012, 7:35:24 AM4/19/12
to mongod...@googlegroups.com
Oops sorry, that's what happens when some one barges in halfway :).

Hmm I don't think you can, you could use a sizeof but then you can't see the current documents array without pulling it out either in client side or through the js engine.

I would run a js function inside of shell to do this, it should be fast enough there.

To view this discussion on the web visit https://groups.google.com/d/msg/mongodb-user/-/nrN8ufQenVQJ.

To post to this group, send email to mongod...@googlegroups.com.
To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages