Re: Performance issue while iterating over a large collection and updating it.

Dwight Merriman

Nov 15, 2012, 3:30:17 PM
to mongod...@googlegroups.com
What version of mongo are you using?

There are a lot of possible answers.

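It could be that the batch job is changing what is cached a lot: scanning every collection pulls those documents into memory and can push out the data that other queries rely on. If that is the case, you would see a lot of page faults on queries by others.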

It could be that it's writing, and the flushing of those writes is taking too long. If that is the case, you would see significant time in the background flush avg graph in MMS.

If you run iostat -xm 2, what do you get as output?


You might try "pacing" your batch job by limiting it to a certain number of operations per second (via sleeps or something) and see if it is then happy.
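
For illustration, here is a minimal sketch of the pacing idea in Python (the thread uses Morphia on the Java driver, but the loop has the same shape in any language). `Pacer`, `run_paced`, and the `update` callback are hypothetical names, not part of any driver; `update` stands in for whatever per-document write the batch job performs, and the rate you pick is something to tune by experiment.

```python
import time


class Pacer:
    """Caps a loop at roughly `ops_per_second` by sleeping between operations."""

    def __init__(self, ops_per_second, clock=time.monotonic, sleep=time.sleep):
        if ops_per_second <= 0:
            raise ValueError("ops_per_second must be positive")
        self.interval = 1.0 / ops_per_second  # minimum spacing between operations
        self.clock = clock                    # injectable so the pacing can be tested
        self.sleep = sleep
        self.next_slot = None                 # earliest start time of the next operation

    def wait(self):
        """Block until the next operation may start; return the seconds slept."""
        now = self.clock()
        if self.next_slot is None or self.next_slot < now:
            # First call, or the job fell behind: restart from "now" so a slow
            # stretch is not followed by a catch-up burst of writes.
            self.next_slot = now
        delay = self.next_slot - now
        if delay > 0:
            self.sleep(delay)
        self.next_slot += self.interval
        return delay


def run_paced(items, update, ops_per_second):
    """Apply `update` (hypothetical per-document write) to each item at a capped rate."""
    pacer = Pacer(ops_per_second)
    count = 0
    for item in items:
        pacer.wait()
        update(item)
        count += 1
    return count
```

In a real batch job, `items` would be the cursor over a collection and `update` the save call. Starting with a low rate and raising it while watching page faults and the background flush time is one way to find a pace the other clients can live with.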


On Thursday, November 15, 2012 2:20:52 PM UTC-5, cbk2012 wrote:
Hi all,

We are having trouble with mongodb while iterating over many collections and updating each record in them.
Our system is composed of several collections that share the same document object model. We have around 400 collections, and each collection has around 100,000 documents (which may grow to millions of records).
Basically, we iterate over each document in each collection and update it after a fast computation on the document itself. The problem is that while this batch work is running, other processes are inserting documents into these collections or updating existing ones. When we run the batch operation, mongos performance drops sharply and our queries get slow responses.
What is the recommended solution for such an architecture? We are iterating over the documents with a single thread right now; iterating over collections in parallel would increase the performance. The main thing we don't understand is why mongo's response time to queries increases dramatically when we iterate over collections. Why does iterating over collection A slow down queries against collection B? Btw, we use Morphia with the Java driver.

Suggestions and feedback will be highly appreciated.
Cheers.

cbk2012

Dec 4, 2012, 7:30:29 AM
to mongod...@googlegroups.com
We were using 2.04; we have now upgraded to a newer version.
We also changed our configuration. Before, we were using a sharded environment, but we didn't get what we were looking for, so now we work with a single mongo with a replica set.
The batch job was changing only a single document in each iteration, and that collection doesn't have much connection to the rest of the database.

Right now, with a single machine, things seem to work OK. Thanks for the feedback.
Cheers