Large Scale Mongo Questions

Verrier

Sep 5, 2010, 7:40:02 PM
to mongodb-user
Hello,

I've been messing with MongoDB for several months now and absolutely
love it, but I've recently run into a problem.

I have a bunch of writes that rotate through the collections in my
database pretty much nonstop. However, as I've scaled up the number
of writes, I've seen the read time increase. I assumed that the write
and read locks were conflicting, so I decided to change my writes to
a batchInsert instead of a regular insert.

By doing this, my writes take about 20 seconds to complete... and
during these 20 seconds, I am not able to read from any collection in
the database (single machine, single mongod). Is this normal behavior
that I cannot hold both a read lock and a write lock at the same time
for *different* collections? E.g., I can't read from db.collection1
while I write to db.collection2.
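
[Editor's sketch] One thing that may be worth trying before adding machines is splitting the large batch into smaller ones, so that reads get a chance to run between batches. Whether this actually helps depends on how long the server holds its lock per batch, which the thread does not establish, so treat it as an assumption to measure. The names `chunked`, `insert_in_chunks`, and `insert_batch` below are hypothetical; `insert_batch` stands in for whatever batch insert call your driver provides.

```python
from typing import Callable, Iterable, Iterator, List


def chunked(docs: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield successive lists of at most `size` documents."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly short, batch
        yield batch


def insert_in_chunks(
    docs: Iterable[dict],
    insert_batch: Callable[[List[dict]], None],
    size: int = 1000,
) -> int:
    """Pass docs to insert_batch in small chunks; return the number of calls."""
    calls = 0
    for batch in chunked(docs, size):
        insert_batch(batch)  # hypothetical stand-in for the driver's batch insert
        calls += 1
    return calls


# Stand-in sink so the sketch runs without a database.
received: List[List[dict]] = []
calls = insert_in_chunks(({"n": i} for i in range(2500)), received.append, size=1000)
print(calls, [len(b) for b in received])
```

The chunk size of 1000 is arbitrary; the useful value, if any, would have to come from timing reads while the writes run.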

Is the only way to solve this problem setting up a master / slave
relationship where I write to one machine and read from another? Is it
possible to efficiently run two mongod on the same machine (perhaps
pointing to two different physical data disks to prevent
bottlenecks?).

My issue with clustering is that I have a good number of 32 bit
machines. However, because of the 2.5 gig limit, I don't believe they
would do very much in a MongoDB cluster, and I'm not able to upgrade
these to 64 bit at this time.

If anyone has any recommendations or advice it would be greatly
appreciated.

Thanks,
Stephen

Eliot Horowitz

Sep 5, 2010, 10:57:01 PM
to mongod...@googlegroups.com
First, just the simple questions:
 - Do the reads have all the right indexes?
 - Do you know what the bottleneck is? (index fitting in RAM, CPU, disk, etc.)

As for choices:
 - Master/slave is one, where you read from slaves.
 - Sharding is an option, so the load is distributed. This lets you maintain full consistency and will let you scale up as needed.

Sharding could work well if you have a number of 32-bit machines, as you can put 500 MB on each of 10 machines, for example.
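
[Editor's sketch] The arithmetic behind that example can be written out as below. This is only a back-of-the-envelope calculation and assumes the data spreads evenly across shards, which a real cluster will not do exactly. The function name `shards_needed` and the 5000 MB total are illustrative, not taken from the thread.

```python
import math


def shards_needed(total_mb: float, per_shard_mb: float) -> int:
    """Smallest number of equally loaded shards that keeps each at or under per_shard_mb."""
    if per_shard_mb <= 0:
        raise ValueError("per_shard_mb must be positive")
    return math.ceil(total_mb / per_shard_mb)


# 10 machines holding 500 MB each cover a 5000 MB data set.
print(shards_needed(5000, 500))
```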


--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To post to this group, send email to mongod...@googlegroups.com.
To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.


Verrier

Sep 6, 2010, 12:10:47 AM
to mongodb-user
Thanks for the reply

- Yes, the reads have the correct indexes and I've verified that my
queries are indeed using them
- I have only 2 gig of RAM on this particular 64 bit machine. I'll be
upgrading it shortly, as I feel like I'm starting to get pretty
close: some of the indexes for the collections are approaching 1 gig.
Does MongoDB attempt to keep all the indexes for all the collections
in memory? I.e., if I have a 900 MB, 600 MB, and 900 MB index, and only
2 gig of RAM, I would obviously be over the quota? I have noticed
that the first query to a collection is slower (~2 seconds) while the
others are speedy (~100 ms). Is this because it's loading the index
into memory? I've also noticed that during writes the disk is the
bottleneck, but I do expect that.
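
[Editor's sketch] For reference, the numbers in that question work out as follows. The sketch assumes 1 gig means 1024 MB and ignores everything else that needs memory (the data itself, the OS, other processes), so the real shortfall would be larger.

```python
# Index sizes quoted in the question, in MB.
index_sizes_mb = [900, 600, 900]
ram_mb = 2 * 1024  # "2 gig" of RAM, taking 1 gig as 1024 MB

total_index_mb = sum(index_sizes_mb)
shortfall_mb = total_index_mb - ram_mb

print(f"indexes: {total_index_mb} MB, RAM: {ram_mb} MB, shortfall: {shortfall_mb} MB")
```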

I'll look into sharding / master-slave and give one of them a try. Can
you shard across 64 bit and 32 bit machines? Will MongoDB
intelligently fill up the machines, so that 32 bit machines can
store ~1-2 gig of data, while the 64 bit ones will store much more?

Thanks again,
Stephen

Eliot Horowitz

Sep 6, 2010, 12:14:22 AM
to mongod...@googlegroups.com
The most commonly accessed indexes will stick in RAM.
So if you have an index that isn't used for a little while, and then you query on it, it's very likely the query will be slow.


GVP

Sep 6, 2010, 2:11:57 PM
to mongodb-user
@Verrier:

What's probably happening is that your dataset has gotten too big for
your RAM.

When you start, everything fits in RAM (data and indexes) and so all
of the queries are fast.

As your data size grows, the system runs out of RAM. Now when you
write to collection1, the system basically has to load the index for
collection1, update the index, load the data and update the data. If
all of those are in RAM, then the whole system is going to be very
fast.

But you're trying to read from collection2 at the same time. So you're
interrupting the flow above. When you try to read from collection2,
you're trying to load the indexes of collection2 into RAM at the same
time as you're trying to load the indexes for collection1. But you're
on a system with only 2GB of RAM, so you don't have nearly enough
RAM to handle all of this at once.

If you have access to more computers, then the easiest thing to do is
to set up a master/slave. Do the reads from the slaves and the batch
inserts to the Master. This way, the slave can prioritize the reads
and handle the updates "in the background".

Markus Gattol

Sep 7, 2010, 3:47:43 PM
to mongod...@googlegroups.com
Verrier> Does MongoDB attempt to keep all the indexes for all the
Verrier> collections in memory?

yes, and not just the indexes but also the data set itself


Verrier> Ie if I have a 900 mb, 600 mb, and 900 mb index.. And only 2
Verrier> gig of ram, I would obviously be over the quota?

check out this link
http://www.markus-gattol.name/ws/mongodb.html#how_much_ram_does_mongodb_need


Scott Hernandez

Sep 7, 2010, 3:58:50 PM
to mongod...@googlegroups.com
On Tue, Sep 7, 2010 at 12:47 PM, Markus Gattol <markus...@sunoano.org> wrote:
> Verrier> Does MongoDB attempt to keep all the indexes for all the
> Verrier> collections in memory?
>
> yes, and not just the indexes but also the data set itself

This is a little misleading: MongoDB itself makes no effort to keep the files in memory. All the files are memory-mapped, and the OS keeps whatever parts it can in memory, based on system load, available memory, and file usage.
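
[Editor's sketch] To make the memory-mapping point concrete, here is a small illustration using Python's standard `mmap` module. It is not MongoDB code; the temporary file is just a stand-in for a data file. The program only reads and writes through the mapping, and which pages stay resident in RAM is left entirely to the OS.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE

# Create a small stand-in "data file" of 64 zero-filled pages.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"\0" * (PAGE * 64))
    path = f.name

with open(path, "r+b") as f:
    mm = mmap.mmap(f.fileno(), 0)  # length 0 maps the whole file
    mm[0:5] = b"hello"  # write through the mapping, not via f.write()
    first = mm[0:5]  # touches the first page
    far = mm[PAGE * 63 : PAGE * 63 + 1]  # touches the last page
    mapped_len = len(mm)
    mm.flush()  # ask the OS to write dirty pages back to the file
    mm.close()

os.remove(path)
print(first, far, mapped_len == PAGE * 64)
```

Nothing in the program pins pages in memory. If the file were larger than RAM, the same code would still run, with the OS paging parts in and out as they are touched, which is consistent with the slow first query on a cold index described earlier in the thread.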
