Performance of find() with moderately large result sets

Performance of find() with moderately large result sets dbrand666 7/12/12 10:09 AM
A member of our team noticed that a simple find() via mongoose ran much more slowly than the same query via mongodb-native.  His test case retrieved about 8000 records via a single indexed key (an ObjectID).  The operation took about 600ms with mongodb-native but over 3000ms with mongoose. I repeated the test with mongoose 3.0 (alpha) and the time is down to about 2000ms, which is a significant improvement but still much slower than native.

I haven't seen much discussion of mongoose's real world performance. I was surprised by this result. This is on a query so we shouldn't be seeing validation overhead. What else is going on?

In trying to isolate the overhead, I tried retrieving only the _id. The performance did improve (in both cases) but mongoose still took 3x as long.

The query looks like this:

    my_model.find({
      user_id: mongoose.Types.ObjectId(user_id),
    }, function (err, events) { ...

The mongodb equivalent is:

    my_connection.collection('event', function (err, eventCollection) {
      eventCollection.find({
        user_id : mongodb.ObjectID(user_id),
      }).toArray(function (err, events) { ...

Is this expected? Any tips on avoiding this extra overhead?
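
For anyone who wants to reproduce this kind of comparison, here is a minimal timing sketch. It is not the harness used for the numbers above; `timeFetch` and `fakeFetch` are hypothetical names, and `fakeFetch` only stands in for a real query so that the snippet runs without a database. In a real test you would pass a function that wraps the mongoose or mongodb-native call.

```javascript
// Time any callback-style fetch. `fetch` must accept a Node-style
// callback (err, results). All names here are hypothetical.
function timeFetch(label, fetch, done) {
  var start = process.hrtime();
  fetch(function (err, results) {
    if (err) return done(err);
    var diff = process.hrtime(start);            // [seconds, nanoseconds]
    var ms = diff[0] * 1e3 + diff[1] / 1e6;
    done(null, { label: label, ms: ms, count: results.length });
  });
}

// Stand-in for a real query: builds 8000 small objects asynchronously.
function fakeFetch(callback) {
  var docs = [];
  for (var i = 0; i < 8000; i++) docs.push({ _id: i });
  setImmediate(function () { callback(null, docs); });
}

timeFetch('fake', fakeFetch, function (err, result) {
  if (err) throw err;
  console.log(result.label + ': ' + result.count + ' docs in ' +
              result.ms.toFixed(1) + 'ms');
});
```

Running each variant several times and discarding the first run helps keep connection setup and JIT warm-up out of the numbers.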
Re: [mongoose] Performance of find() with moderately large result sets Aaron Heckmann 7/12/12 11:16 AM
This is expected. The driver does no magic. Mongoose adds magical helpers around each document.

Tips: 

  1) use 3.x if you are ok with the bleeding edge
  2) select only the fields you need (either 2.x or 3.x)
  3) utilize the stream() method for large result sets http://mongoosejs.com/docs/querystream.html (either 2.x or 3.x)
  4) if using 3.x and this is a simple query where you do not need to save any changes back to the db, use the `lean` option which bypasses the mongoose magic and returns the plain js object from the driver

ideal example (v3):

  // no streaming
  my_model.find({ user_id: user_id }).select('your fields').setOptions({ lean: true }).exec(callback);

  // with streaming
  var stream = my_model.find({ user_id: user_id }).select('your fields').setOptions({ lean: true }).stream();

I'm very curious about your results.

ps Mongoose casts your ObjectIds for you so no need to do that manually in your query conditions.
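
To illustrate tip 3, here is a rough sketch of why streaming helps with large result sets: each document is handled as it arrives instead of being collected into one big array first. This uses Node's built-in `stream.Readable` as a stand-in, not Mongoose's query stream; `fakeDocStream` and `sumValues` are hypothetical names, and the event names on Mongoose's own stream may differ, so check the linked docs.

```javascript
var Readable = require('stream').Readable;

// Hypothetical stand-in for a query stream: emits `total` small
// documents one at a time in object mode.
function fakeDocStream(total) {
  var i = 0;
  return new Readable({
    objectMode: true,
    read: function () {
      if (i < total) {
        this.push({ _id: i, value: i * 2 });
        i++;
      } else {
        this.push(null);                         // signal end of stream
      }
    }
  });
}

// Aggregate over the stream without holding all documents in memory.
function sumValues(docStream, done) {
  var sum = 0;
  docStream.on('data', function (doc) { sum += doc.value; });
  docStream.on('error', done);
  docStream.on('end', function () { done(null, sum); });
}

sumValues(fakeDocStream(8000), function (err, sum) {
  if (err) throw err;
  console.log('sum of values:', sum);
});
```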


--
http://mongoosejs.com
http://github.com/learnboost/mongoose
You received this message because you are subscribed to the Google
Groups "Mongoose Node.JS ORM" group.
To post to this group, send email to mongoo...@googlegroups.com
To unsubscribe from this group, send email to
mongoose-orm...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/mongoose-orm?hl=en



--
Aaron



Re: [mongoose] Performance of find() with moderately large result sets dbrand666 7/12/12 2:39 PM
I tried adding the lean option and the resulting times are essentially identical to the times we were getting with mongodb-native.  Very nice!

Is there a writeup yet of what functionality this option turns off? Is it recommended that we use lean for all read-only queries?

Thanks for the tips and the speedy response.  You just saved us a lot of trouble!

(About casting ObjectId's... old habits die hard.)
Re: [mongoose] Performance of find() with moderately large result sets Aaron Heckmann 7/12/12 2:56 PM
Ah nice! There are no write-ups on lean yet. New docs are on the way.

All lean does is skip the mongoose wrapper stuff (no defaults, methods, getters/setters, nothing at all) and pass along the original doc from the driver.

Yes, if you are running read only queries `lean` makes a lot of sense.


Re: [mongoose] Performance of find() with moderately large result sets dbrand666 7/12/12 4:40 PM
That sounds like it might be a little too lean. For reading, I'd still want getters, at least. Maybe methods and defaults too. That wouldn't introduce much overhead, at least until the getters actually get used.
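
A rough sketch of the idea, purely as an illustration and not a Mongoose feature: getters can live once on a shared prototype, so wrapping a plain document costs a single object allocation and no getter runs until a property is actually read. `eventView`, `wrap`, and the `created_ms` field are hypothetical.

```javascript
// Getters are defined once, on a shared prototype.
var eventView = {
  get userId() { return String(this._raw.user_id); },
  get createdAt() { return new Date(this._raw.created_ms); }
};

// Wrap a plain document; the raw object itself is left untouched.
function wrap(raw) {
  var view = Object.create(eventView);
  view._raw = raw;
  return view;
}

var plain = [{ user_id: 42, created_ms: 0 }];    // e.g. results of a lean query
var views = plain.map(wrap);
console.log(views[0].userId);                    // the getter runs only here
```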
Re: [mongoose] Performance of find() with moderately large result sets Robert Mayo 7/12/12 4:54 PM
Great discussion!

Is there a way to make "lean" always-on for one specific model, rather than per-find?

If I only use basic functionality in a particular Schema, will the overhead be proportional to the functionality I use, or is there a fixed overhead that is incurred for wrapping?  In other words, would a find with a schema with only basic functionality be as fast as a "lean" version?

I assume it is impossible to have 2 schemas for the same collection, and switch between them depending upon what functionality I want...

I'm assuming an application with lots of configuration and UI objects for which Mongoose is very helpful, and then maybe one or two high-bandwidth underlying objects, which I'm willing to carefully craft for performance.

--Bob



Re: [mongoose] Performance of find() with moderately large result sets Aaron Heckmann 7/13/12 10:42 AM


On Thu, Jul 12, 2012 at 4:54 PM, Robert Mayo <bob-...@bobmayo.com> wrote:
> Is there a way to make "lean" always-on for one specific model, rather than per-find?

Not presently. Should be easy to add that option to the schema. Please open a pull request or ticket.
 

> If I only use basic functionality in a particular Schema, will the overhead be proportional to the functionality I use, or is there a fixed overhead that is incurred for wrapping?  In other words, would a find with a schema with only basic functionality be as fast as a "lean" version?

No, it will not be as fast as lean. Each mongoose document instantiation involves applying default values (each array has a default subclassed array created) and adding hooks. The fewer fields you have in your schema or selected in your query, the faster this will be.
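
As a toy illustration of that per-document cost (this is NOT Mongoose's implementation, only a sketch of the kind of work described: defaults, a fresh array per array path, and hook bookkeeping), the loop below does work proportional to the number of schema paths for every document. `ToyDocument`, `hydrateAll`, and the schema shape are hypothetical.

```javascript
// Toy wrapper: visits every schema path once per document.
function ToyDocument(raw, schema) {
  this._hooks = { save: [] };                    // per-document hook registry
  this._doc = {};
  for (var path in schema) {
    var spec = schema[path];
    var value = raw[path];
    if (value === undefined && 'default' in spec) {
      value = typeof spec.default === 'function' ? spec.default() : spec.default;
    }
    if (spec.type === Array) {
      value = (value || []).slice();             // fresh array per document
    }
    this._doc[path] = value;
  }
}

function hydrateAll(rawDocs, schema) {
  return rawDocs.map(function (raw) { return new ToyDocument(raw, schema); });
}

var schema = {
  user_id: { type: String },
  tags: { type: Array },
  created: { type: Date, default: function () { return new Date(0); } }
};

var raw = [];
for (var i = 0; i < 8000; i++) raw.push({ user_id: 'u' + i });

console.time('plain objects passed through');
var lean = raw.slice();                          // lean-style: no per-field work
console.timeEnd('plain objects passed through');

console.time('toy wrapper');
var docs = hydrateAll(raw, schema);
console.timeEnd('toy wrapper');
```

Dropping a path from `schema` removes one iteration per document, which mirrors the advice about keeping schemas and selections small.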
 

> I assume it is impossible to have 2 schemas for the same collection, and switch between them depending upon what functionality I want...

Incorrect. You can force your schema to use a specific collection like so:

var options = { collection: 'likes' }
new Schema({ .. }, options);

It's probably worth playing around with this to see if it's really beneficial for your application.


> I'm assuming an application with lots of configuration and UI objects for which Mongoose is very helpful, and then maybe one or two high-bandwidth underlying objects, which I'm willing to carefully craft for performance.

I'm curious how it works out for you.
Re: [mongoose] Performance of find() with moderately large result sets Vitaly Puzrin 7/17/12 2:15 AM
{lean: true} helped in our case: about a 3.5x difference.

That is a LOT on high-traffic fetches. In practice, casting eats more time than the fetch plus page render.
So with big data queries, casting becomes a CPU killer. IMHO that should be specifically noted in the docs
as an important point for high-load projects.

That said, if you understand what you are doing, mongoose is very nice :) . There are cases when
you don't need high speed but have to write tons of different db requests (and need fast development),
for example in a control panel.

On Friday, July 13, 2012 at 21:42:19 UTC+4, Aaron Heckmann wrote:
Re: [mongoose] Performance of find() with moderately large result sets daslicht 8/29/12 8:28 AM
http://mongoosejs.com/docs/querystream.html ->404

On Thursday, July 12, 2012 at 20:16:48 UTC+2, Aaron Heckmann wrote:
Re: [mongoose] Performance of find() with moderately large result sets Aaron Heckmann 8/29/12 10:18 AM
the docs were rewritten:
http://mongoosejs.com/docs/api.html#querystream_QueryStream
> --
> --
> http://mongoosejs.com - docs
> http://plugins.mongoosejs.com - plugins search
> http://github.com/learnboost/mongoose - source code