Re: [mongoose] Performance of find() with moderately large result sets


Aaron Heckmann

Jul 12, 2012, 2:16:48 PM
to mongoo...@googlegroups.com
This is expected. The driver does no magic. Mongoose adds magical helpers around each document.

Tips: 

  1) use 3.x if you are ok with the bleeding edge
  2) select only the fields you need (either 2.x or 3.x)
  3) use the stream() method for large result sets: http://mongoosejs.com/docs/querystream.html (either 2.x or 3.x)
  4) if using 3.x and this is a simple query where you do not need to save any changes back to the db, use the `lean` option, which bypasses the mongoose magic and returns the plain JS object from the driver

ideal example (v3):

  // no streaming
  my_model.find({ user_id: user_id }).select('your fields').setOptions({ lean: true }).exec(callback);

  // with streaming
  var stream = my_model.find({ user_id: user_id }).select('your fields').setOptions({ lean: true }).stream();

I'm very curious about your results.

ps Mongoose casts your ObjectIds for you so no need to do that manually in your query conditions.

On Thu, Jul 12, 2012 at 10:09 AM, dbrand666 <Da...@rewind.me> wrote:
A member of our team happened to notice that a simple find() operation via mongoose was much slower than the same query via mongodb-native.  His test case retrieved about 8000 records via a single indexed key (an ObjectID).  The operation took about 600ms with mongodb-native but over 3000ms with mongoose. I repeated the test with mongoose 3.0 (alpha) and the time is down to about 2000ms, which is a significant improvement but still much slower than native.

I haven't seen much discussion of mongoose's real world performance. I was surprised by this result. This is on a query so we shouldn't be seeing validation overhead. What else is going on?

In trying to isolate the overhead, I tried retrieving only the _id. The performance did improve (in both cases) but mongoose still took 3x as long.

The query looks like this:

    my_model.find({
      user_id: mongoose.Types.ObjectId(user_id),
    }, function (err, events) { ...

The mongodb equivalent is:

    my_connection.collection('event', function (err, eventCollection) {
      eventCollection.find({
        user_id : mongodb.ObjectID(user_id),
      }).toArray(function (err, events) { ...

Is this expected? Any tips on avoiding this extra overhead?

--
http://mongoosejs.com
http://github.com/learnboost/mongoose
You received this message because you are subscribed to the Google
Groups "Mongoose Node.JS ORM" group.
To post to this group, send email to mongoo...@googlegroups.com
To unsubscribe from this group, send email to
mongoose-orm...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/mongoose-orm?hl=en



--
Aaron



dbrand666

Jul 12, 2012, 5:39:12 PM
to mongoo...@googlegroups.com
I tried adding the lean option and the resulting times are essentially identical to the times we were getting with mongodb-native.  Very nice!

Is there a writeup yet of what functionality this option turns off? Is it recommended that we use lean for all read-only queries?

Thanks for the tips and the speedy response.  You just saved us a lot of trouble!

(About casting ObjectIds... old habits die hard.)

Aaron Heckmann

Jul 12, 2012, 5:56:14 PM
to mongoo...@googlegroups.com
Ah nice! There are no write-ups on lean yet. New docs are on the way.

All lean does is skip the mongoose wrapper stuff (no defaults, methods, getters/setters, nothing at all) and pass along the original doc from the driver.

Yes, if you are running read only queries `lean` makes a lot of sense.
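
To illustrate the difference, here is a minimal sketch in plain JavaScript. It is not Mongoose's implementation: `wrapDoc`, `leanDoc`, and `demoSchema` are hypothetical names invented for this example, and the real wrapper does considerably more work per document.

```javascript
// Stand-in for one document exactly as the driver returns it.
var driverDoc = { _id: 1, name: 'ada' };

// Hypothetical schema description: one default value and one getter.
var demoSchema = {
  defaults: { active: true },
  getters: { name: function (v) { return v.toUpperCase(); } }
};

// Hypothetical wrapper, run once per returned document: fills in defaults
// and routes every read through an accessor that applies a getter if any.
function wrapDoc(doc, schema) {
  var values = Object.assign({}, schema.defaults, doc);
  var wrapped = {};
  Object.keys(values).forEach(function (key) {
    var getter = schema.getters[key];
    Object.defineProperty(wrapped, key, {
      enumerable: true,
      get: function () { return getter ? getter(values[key]) : values[key]; }
    });
  });
  return wrapped;
}

// The "lean" path: hand back the driver's object untouched.
function leanDoc(doc) { return doc; }

var wrappedResult = wrapDoc(driverDoc, demoSchema);
var leanResult = leanDoc(driverDoc);

console.log(wrappedResult.name, wrappedResult.active); // ADA true
console.log(leanResult.name, leanResult.active);       // ada undefined
```

The lean result is the very same object the driver produced, so any code that relies on defaults or getters has to handle the raw values itself.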


dbrand666

Jul 12, 2012, 7:40:34 PM
to mongoo...@googlegroups.com
That sounds like it might be a little too lean. For reading, I'd still want getters, at least. Maybe methods and defaults too. That wouldn't introduce much overhead, at least until the getters actually get used.
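
The idea of deferring getter cost until a field is actually read can be sketched in plain JavaScript. This only illustrates the suggestion and is not something Mongoose is known to do; `buildProto`, `lazyWrap`, and `eventGetters` are hypothetical names.

```javascript
// Hypothetical getters, keyed by field name.
var eventGetters = {
  title: function (v) { return v.trim(); }
};
var getterCalls = 0;

// Build ONE prototype per schema; accessors read from this._raw on demand.
function buildProto(getters) {
  var proto = {};
  Object.keys(getters).forEach(function (field) {
    Object.defineProperty(proto, field, {
      get: function () {
        getterCalls++;
        return getters[field](this._raw[field]);
      }
    });
  });
  return proto;
}

var eventProto = buildProto(eventGetters);

// Per-document cost is one Object.create plus one property assignment.
function lazyWrap(rawDoc) {
  var doc = Object.create(eventProto);
  doc._raw = rawDoc;
  return doc;
}

var lazyDocs = [{ title: '  a ' }, { title: ' b' }].map(lazyWrap);
console.log(getterCalls);        // 0: nothing has been read yet
console.log(lazyDocs[0].title);  // a
console.log(getterCalls);        // 1
```

Because the accessors live on a shared prototype, wrapping a document does no per-field work; the getter runs only when a field is read.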

Robert Mayo

Jul 12, 2012, 7:54:24 PM
to mongoo...@googlegroups.com
Great discussion!

Is there a way to turn "lean" always-on for one specific model, rather than per-find?

If I only use basic functionality in a particular Schema, will the overhead be proportional to the functionality I use, or is there a fixed overhead that is incurred for wrapping?  In other words, would a find with a schema with only basic functionality be as fast as a "lean" version?

I assume it is impossible to have 2 schemas for the same collection, and switch between them depending upon what functionality I want...

I'm assuming an application with lots of configuration and UI objects for which Mongoose is very helpful, and then maybe one or two high-bandwidth underlying objects, which I'm willing to carefully craft for performance.

--Bob



Aaron Heckmann

Jul 13, 2012, 1:42:19 PM
to mongoo...@googlegroups.com
On Thu, Jul 12, 2012 at 4:54 PM, Robert Mayo <bob-...@bobmayo.com> wrote:
Is there a way to turn "lean" always-on for one specific model, rather than per-find?

Not presently. Should be easy to add that option to the schema. Please open a pull request or ticket.
 

If I only use basic functionality in a particular Schema, will the overhead be proportional to the functionality I use, or is there a fixed overhead that is incurred for wrapping?  In other words, would a find with a schema with only basic functionality be as fast as a "lean" version?

No, it will not be as fast as lean. Each mongoose document instantiation involves applying default values (each array has a default subclassed array created) and adding hooks. The fewer fields you have in your schema or selected in your query, the faster this will be.
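
A rough way to picture why field count matters is the sketch below, in plain JavaScript. It is only a model under stated assumptions, not Mongoose's code: `hydrate`, `widePaths`, and `narrowPaths` are hypothetical, and the counter stands in for the real per-field work (defaults, array subclassing, hooks).

```javascript
// Counts the per-field work a hypothetical wrapper performs.
var fieldOps = 0;

// Hypothetical hydration: for every schema path, apply a default if the
// field is missing; array values always get a fresh copy.
function hydrate(rawDoc, schemaPaths) {
  var out = {};
  schemaPaths.forEach(function (p) {
    fieldOps++;
    var value = rawDoc[p.name];
    if (value === undefined) value = p.isArray ? [] : p.default;
    out[p.name] = Array.isArray(value) ? value.slice() : value;
  });
  return out;
}

var widePaths = [
  { name: 'user_id' }, { name: 'type', default: 'click' },
  { name: 'tags', isArray: true }, { name: 'notes', default: '' }
];
var narrowPaths = [{ name: 'user_id' }]; // as if only one field were selected

var rows = [];
for (var i = 0; i < 8000; i++) rows.push({ user_id: i });

fieldOps = 0;
rows.forEach(function (r) { hydrate(r, widePaths); });
var wideOps = fieldOps;    // 8000 docs * 4 paths

fieldOps = 0;
rows.forEach(function (r) { hydrate(r, narrowPaths); });
var narrowOps = fieldOps;  // 8000 docs * 1 path

console.log(wideOps, narrowOps); // 32000 8000
```

The work grows with documents times fields, which is why trimming the schema or the selected fields helps but never reaches the zero per-document cost of lean.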
 

I assume it is impossible to have 2 schemas for the same collection, and switch between them depending upon what functionality I want...

Incorrect. You can force your schema to use a specific collection like so:

var options = { collection: 'likes' }
new Schema({ .. }, options);

It's probably worth playing around with this to see if it's really beneficial for your application.


I'm assuming an application with lots of configuration and UI objects for which Mongoose is very helpful, and then maybe one or two high-bandwidth underlying objects, which I'm willing to carefully craft for performance.

I'm curious how it works out for you.

Vitaly Puzrin

Jul 17, 2012, 5:15:15 AM
to mongoo...@googlegroups.com
{lean: true} helped in our case. ~3.5x difference.

That is a VERY big difference on high-traffic fetches. In practice, casting eats more than fetch + page render.
So, with big data queries, casting becomes a CPU killer. IMHO, that should be specially noted in the docs
as an important point for high-load projects.

Though, if you understand what you are doing, mongoose is very nice :) . There are cases where
you don't need high speed but have to write tons of different db requests (and need fast development),
for example in a control panel.

On Friday, July 13, 2012, 21:42:19 UTC+4, Aaron Heckmann wrote:

Aaron Heckmann

Aug 29, 2012, 1:18:24 PM
to mongoo...@googlegroups.com
the docs were rewritten:
http://mongoosejs.com/docs/api.html#querystream_QueryStream

On Wed, Aug 29, 2012 at 8:28 AM, daslicht <ans...@gmail.com> wrote:
> http://mongoosejs.com/docs/querystream.html ->404
> --
> --
> http://mongoosejs.com - docs
> http://plugins.mongoosejs.com - plugins search
> http://github.com/learnboost/mongoose - source code