RAM usage of Java Driver cursor iteration in default GC options


alp

Dec 27, 2011, 9:55:38 AM
to mongod...@googlegroups.com
Hello guys, my problem is probably quite trivial; pardon me if I've misunderstood a concept somewhere.

I'm using the Java driver and measuring my program's RAM usage before and after iterating a cursor over ~60k documents, and it looks like something is wrong with garbage collection. Here's my main application logic; it is quite straightforward:

[...]
Thread.sleep(10 * 1000); // to observe initial RAM usage
DBCursor cur = collection.find();

while (cur.hasNext()) {
    DBObject o = cur.next(); // processed, then dropped; no reference kept
}
cur.close();

Thread.sleep(200 * 1000); // to observe RAM usage after iteration

Before the database gets involved, RAM usage is ~30 MB; after the iteration it is around 160 MB and does not change until Thread.sleep(200*1000) finishes and the program terminates.

I'm using default JVM settings and am not doing any kind of GC tuning. I need to keep my RAM usage around ~50 MB, and I observed that it grows as the number of documents increases. Unfortunately I don't have much RAM to spare, and I think usage shouldn't grow that much since I'm not keeping any references to the documents I'm scanning with the cursor.

What do you think may cause this?

Scott Hernandez

Dec 27, 2011, 10:35:19 AM
Unfortunately you have built a test which does not do what you think
it does. The garbage collector will generally not run on those
instances until all references are lost (nullified) and the thread
becomes inactive.

See here for more background and some random notes I found by
searching on google for "java garbage collection when":
http://javarevisited.blogspot.com/2011/04/garbage-collection-in-java.html

I would suggest setting "cur = null", then hinting to the JVM to do garbage collection [System.gc()] and then sleeping, or better yet, running the test on a separate thread entirely.
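The suggestion above can be sketched like this. It is a self-contained illustration, not driver code: `GcHintDemo` and `usedHeapAfterDrop` are hypothetical names, and the `byte[][]` array simply stands in for the decoded documents.

```java
public class GcHintDemo {
    /** Allocates ~10 MB of objects, drops the only reference, hints the GC,
     *  and returns the used-heap figure measured afterwards. */
    static long usedHeapAfterDrop() {
        Runtime rt = Runtime.getRuntime();
        byte[][] data = new byte[1000][];
        for (int i = 0; i < data.length; i++) {
            data[i] = new byte[10_000]; // stand-in for fetched documents
        }
        data = null;   // drop the only reference so the objects become unreachable
        System.gc();   // a hint only; the JVM is free to ignore it
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.out.println("used heap after drop: " + usedHeapAfterDrop() + " bytes");
    }
}
```

Note that even after `System.gc()`, `totalMemory()` may stay high: the JVM does not necessarily return freed heap to the operating system, which is one reason the process RSS looks flat from outside.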

> --
> You received this message because you are subscribed to the Google Groups
> "mongodb-user" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/mongodb-user/-/rWW58EzBlKAJ.
> To post to this group, send email to mongod...@googlegroups.com.
> To unsubscribe from this group, send email to
> mongodb-user...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/mongodb-user?hl=en.

Nat

Dec 27, 2011, 10:41:53 AM
Can you try to limit the maximum heap using -Xmx64m option to see whether it runs out of memory or not?
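The effective cap can also be checked from inside the program. A minimal sketch (`HeapCap` is just an illustrative class name):

```java
public class HeapCap {
    public static void main(String[] args) {
        // Launch with e.g.: java -Xmx64m HeapCap
        // maxMemory() reports the ceiling the JVM will try to stay under.
        long maxMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("max heap ~" + maxMb + " MB");
    }
}
```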

alp

Dec 27, 2011, 10:58:28 AM
Tried this and it didn't run out of memory, but it ran a few times slower. So I think limiting the heap space will work for me. Thanks!

I also observed that if I set the max heap size to 64 MB, the process takes 64 + (~30) MB of real memory on OS X, but it works for me anyway. I also tried hinting the GC by setting variables to null, but it didn't help much.

What do you think may cause this? As the number of documents increases, does memory usage stabilize at some point, or does it just keep consuming all the available JVM heap?

Sincerely.

Nat

Dec 27, 2011, 11:18:02 AM
Then it's not really a problem. If memory is constrained, you can try reducing the batch size. It should help a little bit.

Ahmet Alp Balkan

Dec 27, 2011, 11:55:26 AM
I have just encountered an OutOfMemoryError due to limited Java heap space when I added extra functionality to my program (actually indexing with Lucene, however that's irrelevant in this discussion).

The problem is: why can't I just iterate through the cursor within a limited amount of memory? Calling System.gc() after processing every document in the loop causes a huge time overhead, while calling it after the loop (after closing the cursor) doesn't help since it is too late; the OutOfMemoryError has already been thrown by then.

Calling System.gc() every 1000 documents in the loop helped a little, but it still throws an OutOfMemoryError after a while, and we can't rely on System.gc() since it is not guaranteed to do anything. Do you recommend any aggressive, prioritized garbage collection parameters for this case?

Thanks.

Eliot Horowitz

Dec 28, 2011, 1:31:28 AM
I would try lowering the batch size.
That will keep memory usage down at the expense of more round trips to the db.
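To see why a smaller batch size bounds memory, note that the client only holds one batch of decoded documents at a time; once a batch is consumed, it becomes garbage before the next round trip. With the legacy driver the knob is `collection.find().batchSize(n)`. A self-contained sketch of the peak-residency argument (`peakResident` is a hypothetical helper, not driver code):

```java
public class BatchSketch {
    /** Models a cursor fetching `batchSize` documents per round trip and
     *  returns the peak number of documents resident at once. */
    static int peakResident(int totalDocs, int batchSize) {
        int peak = 0;
        for (int fetched = 0; fetched < totalDocs; fetched += batchSize) {
            // documents decoded for this batch
            int resident = Math.min(batchSize, totalDocs - fetched);
            peak = Math.max(peak, resident);
            // the previous batch is unreferenced before the next fetch
        }
        return peak;
    }

    public static void main(String[] args) {
        // 60k documents fetched 1000 at a time: peak is one batch, not 60k
        System.out.println(BatchSketch.peakResident(60_000, 1_000)); // prints 1000
    }
}
```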

Ahmet Alp Balkan

Dec 28, 2011, 9:46:23 AM
What I'm wondering is whether this behavior of the Java driver is perfectly normal, or whether something is wrong with either mongo.jar or my JVM (and its GC options). Lowering the batch size is not very manageable in my case, and from time to time I may need to pull all the data from a MongoDB collection into my Java program.

Nat

Dec 28, 2011, 9:49:05 AM
It is totally normal. Even if you lower your batch size, there shouldn't be any problem pulling all the data into your program. If you need to keep that data in memory, that's another story: you would need enough memory to hold all of it.

Ahmet Alp Balkan

Dec 29, 2011, 5:32:14 AM
I thought that documents pulled from the database would be wiped from memory by the GC once references to them are removed. I'm not saving documents to a list or any data structure that keeps them all in memory; I process each document immediately, and it goes out of scope on every iteration. I suspect the Mongo driver's cursor implementation may somehow keep references to the fetched documents, perhaps for caching reasons.

--
Ahmet Alp Balkan <http://ahmetalpbalkan.com>


Nat

Dec 29, 2011, 6:04:23 AM
It wouldn't, but it may hold a reference to the buffer it uses to parse the documents. For that part, reducing the batch size will reduce the memory used for the buffer.