Out of memory on mongo shell


Marc

Jan 21, 2011, 10:16:41 AM
to mongodb-user
Hi everyone,

I have to update around 15 million rows in a collection (in fact, all
of the records) to add a new field. The data comes from an Oracle
DB.
I tried to do it in the simplest way possible: I exported all the data
from Oracle as an "update" script to be applied to the MongoDB
collection.
So basically I have a script with 15 million lines like this:

db.ad.update({_id:53978483},{ $set: {DATA:1937.6}});

I pipe the script to a mongo shell and everything appears to run
fine (although slowly, at about 100 updates/sec).
I looked at "ps" while the import was running, and I was
surprised by the size of the mongo process (around 150M at that point),
but considered it more or less normal.
After about half a million updates, the mongo shell fails with an out of
memory error. I don't know how much memory it was using at the end.
I was running the import on one of the servers, so memory was
already scarce, but I didn't imagine that the mongo process could use
that much memory.
The server logs don't show anything unusual.
The collection is sharded on "_id", and I'm using 1.6.5.

I'm now trying from another machine with the 1.7 client, but so far
memory also seems to keep growing (50M already).

Is this a known mongo shell limitation? Do I have to use another driver
(Java/Groovy)?


Nat

Jan 21, 2011, 10:37:49 AM
to mongodb-user
The mongo shell uses SpiderMonkey underneath, and I think it has a small
memory leak there. If you know another language such as Python, Ruby, or
Java, you should try its driver instead. It will be significantly faster too.
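
A minimal sketch of what the first half of that could look like in Python, assuming the generated script keeps exactly the one-update-per-line shape shown in the original post. It only streams the file and groups the parsed (_id, value) pairs into batches. `parse_update_line` and `iter_batches` are hypothetical helper names, and the actual driver call is left out because it depends on the driver and version in use.

```python
import re
from itertools import islice

# Matches lines shaped like:
#   db.ad.update({_id:53978483},{ $set: {DATA:1937.6}});
# Assumption: every line follows this pattern with a numeric _id and value.
UPDATE_RE = re.compile(
    r"\{\s*_id\s*:\s*(\d+)\s*\}\s*,\s*"
    r"\{\s*\$set\s*:\s*\{\s*DATA\s*:\s*(-?\d+(?:\.\d+)?)\s*\}\s*\}"
)


def parse_update_line(line):
    """Return (_id, value) for one update line, or None if it does not match."""
    m = UPDATE_RE.search(line)
    if m is None:
        return None
    return int(m.group(1)), float(m.group(2))


def iter_batches(lines, batch_size=1000):
    """Yield lists of (_id, value) pairs, at most batch_size pairs per list."""
    # The generator is lazy, so only one batch is held in memory at a time.
    pairs = (p for p in map(parse_update_line, lines) if p is not None)
    while True:
        batch = list(islice(pairs, batch_size))
        if not batch:
            return
        yield batch
```

Passing an open file object as `lines` keeps memory use bounded by `batch_size` rather than by the length of the script. Each batch can then be handed to whichever driver you pick, issuing one `$set` per pair.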

Marc Gracia

Jan 21, 2011, 10:41:28 AM
to mongod...@googlegroups.com
I suspected as much.

Thanks

D Boyd

Jan 21, 2011, 1:11:18 PM
to mongodb-user
I have been having a similar error in a data loading script
that I posted last night.

Based on my analysis, something involving object or array references is
not being garbage collected properly, which causes a memory
leak.

Your script is so simple that I am not sure why it would leak.
The only thing I can think of is that all the objects
defined in your update commands stay marked as referenced.

You could try putting periodic calls to the gc in the script,
but I don't think that will work.

GVP

Jan 23, 2011, 4:38:52 AM
to mongodb-user
You have a couple of quick options:

1. If this is a one-time import, just split the file into multiple
parts and run them in sequence. The JS shell may indeed have a memory
leak (or it may just have a different max process size).
2. If you plan to run this import regularly, you can:
a. Use a driver for a different language (Ruby, Python, PHP, etc.)
b. Try the "mongoimport" utility. It's usually pretty easy to export
to CSV or TSV from any RDBMS, and mongoimport will bring the file in for
you. Just remember to rename the "ID" column in your file to "_id" if
you want to keep your old IDs.
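
For option 1, here is a rough sketch of the splitting step in Python. The function name `split_script`, the `.partNNNN` suffix, and the default of 250,000 lines per part are all assumptions chosen for illustration (the default sits below the roughly half a million updates at which the shell failed in the original report).

```python
from itertools import islice


def split_script(path, lines_per_part=250_000):
    """Split the file at `path` into numbered parts; return the part paths."""
    part_paths = []
    with open(path, "r", encoding="utf-8") as src:
        while True:
            # Read at most lines_per_part lines; an empty list means EOF.
            chunk = list(islice(src, lines_per_part))
            if not chunk:
                break
            part_path = f"{path}.part{len(part_paths):04d}"
            with open(part_path, "w", encoding="utf-8") as dst:
                dst.writelines(chunk)
            part_paths.append(part_path)
    return part_paths
```

Each part can then be piped to a fresh mongo shell in turn, so whatever memory one shell process accumulated is released when it exits.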

mongoimport documentation:
http://www.mongodb.org/display/DOCS/Import+Export+Tools#ImportExportTools-mongoimport
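
The column rename mentioned above could be done with a small script like the following sketch. It assumes a comma-separated export whose first row is the header; `rename_id_column` is a hypothetical helper name, and the column names are just the ones used in this thread.

```python
import csv


def rename_id_column(src_path, dst_path, old_name="ID", new_name="_id"):
    """Copy a CSV file, renaming one header column; data rows pass through."""
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)
        if header is None:
            return  # empty input: nothing to write
        writer.writerow([new_name if col == old_name else col for col in header])
        # Rows are streamed one at a time, so file size does not matter.
        writer.writerows(reader)
```

How mongoimport combines these rows with documents that already exist depends on the version and options used, so it is worth checking the result on a copy of the collection first.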