> I'm trying MongoDB for an application that MySQL didn't seem capable
[..]
> I have looked at MongoDB's sharding, and I'd prefer to stay away from
> setting up a cluster or any of that complexity. I could definitely
> try the same manual partition idea that would work on MySQL, but if
> I'm going to do that, I am not sure there remains a reason to use
> MongoDB for this case.
My own experience with a collection of about 300 million rows is the
same as yours... plain Mongo without sharding simply doesn't cut it.
You can search the archives for my threads about that... I'd be happy
to hear what others think, but as far as I can tell the only options
are sharding (even though I didn't try it myself) or some sort of
map-reduce before storing.
BTW, I ended up with PostgreSQL, which handles the data load nicely,
even though I still have the prototype in MongoDB.
Cheers
--
Massimo
http://meridio.blogspot.com
> Are you maxing your client? Try multi-threading [be careful, too many
> threads will reduce performance].
I don't think multi-threading would help here at all...
> Are you maxing your storage - Iometer etc.
>
> How's your memory? Mongo IIRC always indexes your ID
> [barring capped collections] - 700Mn docs could create a big index -
> if this started being swapped...
Yep, that's the problem... the way indexes are built/designed, they
need to stay in memory for fast lookups, otherwise the penalty is too
high, so use shards or do map-reduce before storing...
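(A rough sketch of one way to check whether the indexes still fit in
RAM, using pymongo; the database and collection names below are just
placeholders, not from this thread.)

# Compare total index size against what mongod can keep resident in RAM.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["mydb"]                               # placeholder db name

stats = db.command("collStats", "hashes")         # placeholder collection
index_bytes = stats["totalIndexSize"]             # bytes used by all indexes

mem = db.command("serverStatus")["mem"]           # sizes reported in MB
print("total index size: %.1f GB" % (index_bytes / 1e9))
print("mongod resident memory: %d MB" % mem["resident"])
# If the index size is far above what the box keeps resident, lookups
# start faulting to disk and throughput collapses.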
> If you want to make the initial import much faster you can try:
> - turn off journaling, gain about 30% (or put journaling on a fast
> drive). Note that once it's inserted you should really turn journaling
> back on.
> - use WriteConcern.NONE so that the db doesn't have to acknowledge
> writes.
> - use bulk insert. For example, insert 100 records at a time; this will
> reduce processing time.
>
> Besides that, you should try to find out where the bottleneck is.
> I would look at cpu, load, iostat on both server and client.
> Sometimes the bottleneck is not where you may think... if disk is the
> bottleneck then the db won't make much difference.
You're right, but the fact is that, due to how Mongo handles indexes,
with 700 million entries in the index it tries to keep them all in
memory (at least if you use the index for queries), and that can lead
to the OS swapping memory pages to disk.
That kills performance for sure.
I need to keep 300 million (and counting) SHA-256 hashes in an index
to check for uniqueness, and I cannot make reliable (performance-wise)
use of this in Mongo.
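For what it's worth, the bulk-insert / relaxed write concern tips
quoted above would look roughly like this in pymongo. Just a sketch:
the collection and field names are made up, and the unique index is
only there to mirror the hash-uniqueness check.

from pymongo import ASCENDING, MongoClient, WriteConcern

client = MongoClient("mongodb://localhost:27017")
coll = client["mydb"]["hashes"]                  # placeholder names

# Unique index so duplicate SHA-256 hex digests are rejected server-side.
coll.create_index([("sha256", ASCENDING)], unique=True)

# w=0: the driver doesn't wait for write acknowledgements, trading safety
# for raw insert throughput (in the spirit of the WriteConcern.NONE tip).
fast = coll.with_options(write_concern=WriteConcern(w=0))

def bulk_load(docs, batch_size=100):
    """Insert documents in batches instead of one round trip per record."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) >= batch_size:
            fast.insert_many(batch, ordered=False)
            batch = []
    if batch:
        fast.insert_many(batch, ordered=False)

# Example with fake data: 64-char hex strings standing in for SHA-256 hashes.
bulk_load({"sha256": "%064x" % i} for i in range(1000))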