Java Driver Performance


David Brooks

Apr 12, 2011, 10:57:07 AM
to mongodb-user
Hi all,

I've been using the Java driver since version 2.4, and just recently
upgraded to the latest, 2.5.3. I'm knee-deep in profiling a heavily
used query routine to get the requests/second up.

I'm no expert at this (yet), but it seems the driver
continually fights the JVM's GC (especially around the BSON library,
which creates a ton of objects) and I can't get beyond 20 requests/
second. I've tried various JVM tweaks (parallel GC, min/max permgen
sizes, having GC run anywhere from almost continually to once every
couple of minutes), and various memory and CPU configs (8+ GB RAM,
4-core CPUs, etc.).

Does anyone have some pearls of wisdom they can throw my way? In
terms of specs for the query, I'm hitting the _id index, and returning
6 fields with each query.

Anything you need to see to help, I'll throw your way.

Thanks much,

David

Brendan W. McAdams

Apr 12, 2011, 11:09:07 AM
to mongod...@googlegroups.com
David,

Based on your description, it sounds an awful lot like you're running out of heap space if you're continually GCing. What does your memory allocation look like?

Apart from that, any sample code you can send for the places where you're seeing bottlenecks would help debug this.

Additionally, I'd suggest verifying each of your queries with the profiler or with explain().



Eliot Horowitz

Apr 12, 2011, 11:11:46 AM
to mongod...@googlegroups.com
What's your max jvm size? Have you tried increasing that?

David Brooks

Apr 12, 2011, 11:14:39 AM
to mongodb-user
Sorry - I forgot to mention that on the query I'm using "$in" to get
multiple documents at a time. The fewer documents I pull at a time,
the more requests I can handle. It seems odd, though, that with 20
documents performance is sub-par like this. Am I forcing the driver
to do too much decoding?
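One way to act on this, sketched below: cap how many ids go into each "$in". This is a minimal sketch with made-up names (InBatcher, partition are mine, not driver API); the actual driver call that would consume each batch is elided.

```java
import java.util.ArrayList;
import java.util.List;

public class InBatcher {
    // Split a large id list into fixed-size chunks, so each $in query
    // returns a bounded number of documents for the client to decode.
    public static <T> List<List<T>> partition(List<T> ids, int batchSize) {
        List<List<T>> batches = new ArrayList<List<T>>();
        for (int i = 0; i < ids.size(); i += batchSize) {
            batches.add(ids.subList(i, Math.min(i + batchSize, ids.size())));
        }
        return batches;
    }
}
```

Each batch would then feed something like new BasicDBObject("_id", new BasicDBObject("$in", batch)) in the 2.x driver.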

David Brooks

Apr 12, 2011, 11:24:19 AM
to mongodb-user
On my 8GB box, I've tried as low as 25MB, then 100MB, 1GB, and the
full 8GB. The lower amounts are an attempt to keep the GC working
constantly so requests keep flowing in without long GC pauses. When
set at 8GB, the gen space fills up within a couple of minutes (mostly
from the BSON objects, from what I can tell) and there's a 10+ second
pause while it clears out. I need reasonably short GC pauses (10-20ms
at most) to keep requests flowing in.

On Apr 12, 9:11 am, Eliot Horowitz <eliothorow...@gmail.com> wrote:
> What's your max jvm size?  Have you tried increasing that?
>

Eliot Horowitz

Apr 12, 2011, 11:43:15 AM
to mongod...@googlegroups.com
How big/complex are the objects you're loading?

Keith Branton

Apr 12, 2011, 11:37:20 AM
to mongod...@googlegroups.com
Oh - I meant to add: if you suspect this is being caused by the 2.5.3 driver, why not switch back to the 2.4 driver to confirm or rule out the 2.5.3 driver as the culprit?

Shi Shei

Apr 12, 2011, 12:06:42 PM
to mongodb-user
The $in query works fine for me. No performance or GC issues. My
document size is about 1KB. The $in operator holds 1,000 ids in each
query and returns partial documents (only a couple of attributes).

Using only one thread, mongo is able to select 32,000 to 40,000
documents per second.
Using 50 threads it increases to 190,000 to 225,000 docs/sec.
- using 3 shards, each running on a 24-core, 48GB machine
- mongodb-linux-x86_64-v1.8-2011-04-12
- Java driver v2.5.3


On Apr 12, 5:43 pm, Eliot Horowitz <eliothorow...@gmail.com> wrote:
> How big/complex are the objects you're loading?
>

David Brooks

Apr 12, 2011, 12:17:23 PM
to mongodb-user
Each document contains 6 string fields, plus 2 array fields with
10 string elements in each array.
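As a back-of-envelope estimate (my own arithmetic, ignoring the char[] backing each String and the field-name strings the decoder also allocates), a document of that shape materializes roughly this many objects:

```java
public class DecodeCost {
    // Rough object count per decoded document for the shape described:
    // 6 top-level string fields plus 2 arrays of 10 strings each.
    static int objectsPerDocument() {
        int plainStrings = 6;      // top-level string values
        int arrayStrings = 2 * 10; // strings inside the two arrays
        int arrayObjects = 2;      // the list objects themselves
        int docContainer = 1;      // the map/DBObject holding the fields
        return plainStrings + arrayStrings + arrayObjects + docContainer; // 29
    }

    static int objectsPerRequest(int docsPerIn) {
        return docsPerIn * objectsPerDocument();
    }
}
```

With a 20-id $in that's only about 580 short-lived objects per request - enough to make the BSON decoder dominate an allocation profile, but modest enough that it shouldn't by itself cap throughput at 20 requests/second.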

On Apr 12, 9:43 am, Eliot Horowitz <eliothorow...@gmail.com> wrote:
> How big/complex are the objects you're loading?
>

Keith Branton

Apr 12, 2011, 11:34:08 AM
to mongod...@googlegroups.com
While it is possible that some change has caused more objects to be created by the Java driver, it sounds more likely that you have reached a level of activity where your JVM needs some GC tuning.

Most of what I'm going to say next is based on the Sun JVM - that's where most of my experience lies. Also I may not fully understand this area accurately - and it has been a while since I did any GC tuning.

Modern generational garbage collectors split the heap into regions: (young/survivor/old), also called (eden/survivor/tenured).

New objects are created in eden. Eden gets GC'd very quickly - no pauses. If the new objects are still needed after a GC cycle or so, they are promoted to the survivor region. That also gets GC'd from time to time, and objects that are still needed get promoted to tenured.

In a web app you really want to avoid request-scope objects becoming tenured. The main reason is that GCs of tenured tend to be slow and usually "stop the world". 

There are a few things that can cause problems here...

1. If you generate too many short lived objects then the young gen GC runs more frequently, and so objects that shouldn't be tenured end up so.

2. If the code using short lived objects is too slow then the objects may be held onto for a little too long, causing them to become tenured.

3. A memory leak - all memory leaks end up being tenured :).

I'd first try changing NewRatio to 1 to see if that helps. This makes the young generation and tenured the same size. Unfortunately I don't think you can make the young generation bigger than tenured.

I absolutely recommend hooking jConsole up to the jvm so you can monitor the graphs of all three regions.

If tenured continually grows, and never drops as low after each GC that can indicate a memory leak.
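As a concrete starting point for the advice above, here is an example flag set for a Sun/Oracle HotSpot JVM of that era; the sizes and ratios are placeholders to be tuned against what jConsole shows, not recommendations:

```
-server
-Xms2g -Xmx2g                # fixed heap; a larger heap means a larger eden, so fewer young GCs
-XX:NewRatio=1               # young generation as large as tenured
-XX:SurvivorRatio=8          # eden vs. survivor sizing within the young gen
-XX:+UseConcMarkSweepGC      # low-pause collector for tenured
-XX:+UseParNewGC             # parallel young-gen collector that pairs with CMS
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log
```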

David Brooks

Apr 12, 2011, 12:33:08 PM
to mongodb-user
Thanks for your specs - this was helpful.

Does the JVM that's doing these queries have 48GB on it? For my
setup, the mongo server itself is able to handle queries with ease,
but it's the web server's JVM (4 cores with 8GB RAM) decoding the
mongo results that's causing my headaches, as the heap fills up too
quickly. I'm sure I'm missing something though.

David Brooks

Apr 12, 2011, 12:34:03 PM
to mongodb-user
I've seen this issue with all versions of the driver (I started with
2.4).

Brendan W. McAdams

Apr 12, 2011, 12:39:00 PM
to mongod...@googlegroups.com
Keep in mind that the JVM does NOT use all of the server's memory.

The default heap size is relatively low.  IIRC the maximum heap size is something like 64MB by default.

What you're describing sounds like there just isn't enough RAM allocated to the JVM.

What are your heap settings for your JVM?

Eliot Horowitz

Apr 12, 2011, 12:44:14 PM
to mongod...@googlegroups.com
Can you send a sample document?
I'd be curious whether it performs the same for everyone.

Keith Branton

Apr 12, 2011, 12:54:14 PM
to mongod...@googlegroups.com
> When set at 8GB, the gen space fills up within a couple minutes
> (mostly from the BSON objects from what I can tell)

How did you determine that most of your heap is full of BSON objects? Did you use jhat?

David Brooks

Apr 12, 2011, 1:00:36 PM
to mongodb-user
Here are my JVM settings for a new box I'm testing and throwing a ton
of requests at:

-server -XX:NewRatio=1 -Xms1500m -Xmx1500m -XX:+DisableExplicitGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -verbose:gc

With the NewRatio setting, GC activity is running better. I'm at the
early stages of testing though.

David Brooks

Apr 12, 2011, 1:06:03 PM
to mongodb-user
I used the YourKit profiler, took an allocation snapshot, and saw:

org.bson.BSONDecoder.decode(InputStream, BSONCallback) had 55,065
GC'ed objects, accounting for 84% of the total GC'ed size (1,939,160)

Keith Branton

Apr 12, 2011, 1:33:21 PM
to mongod...@googlegroups.com
I've never used YourKit, but your stats would tend to suggest that you don't have a memory leak at least. 

It sounds like you are not letting go of the references quickly enough for them to be discarded before they get tenured. 

Are you adding results of your $in query into a list or something rather than just iterating the cursor? 

If rearranging code can't help, then increasing your overall heap size may help (as that will increase eden size, reducing the frequency of eden GCs and so helping keep these objects from being tenured)

I don't tend to use heaps bigger than 2GB with production Java web servers (your 1.5GB heap seems reasonable), though I have used heaps as big as 20GB on one very high-traffic web site with decent results. (I would normally favor load-balancing several 2GB JVMs on a box in preference to a single large one)

There are many GC options you can experiment with that can affect GC pauses in tenured, but the first step is to stop request-scoped objects from getting tenured as much as possible.
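The iterate-don't-accumulate point above can be sketched with a plain Iterator standing in for the driver's DBCursor (the names here are hypothetical):

```java
import java.util.Iterator;

public class StreamVsCollect {
    // Process each document as the cursor yields it and let the
    // reference go, instead of collecting every document into a list
    // that pins them all until the end of the request.
    static int totalLength(Iterator<String> cursor) {
        int total = 0;
        while (cursor.hasNext()) {
            String doc = cursor.next(); // eligible for young-gen GC after this loop body
            total += doc.length();      // "render" the doc, then drop the reference
        }
        return total;
    }
}
```

The anti-pattern is the equivalent of DBCursor.toArray(), or copying results into an ArrayList before rendering: every decoded document then stays reachable for the whole request, long enough to be promoted out of eden.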

Martin Grigorov

Apr 12, 2011, 2:26:10 PM
to mongod...@googlegroups.com, David Brooks
On Tue, Apr 12, 2011 at 7:00 PM, David Brooks <d.b.b...@gmail.com> wrote:
Here's my JVM settings for a new box I'm testing and throwing a ton of
requests to:

-server -XX:NewRatio=1 -Xms1500m -Xmx1500m -XX:+DisableExplicitGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -verbose:gc

are you sure that you use -Xms (instead of -Xmn) ?
Xms is for stack size, while you need min heap size

Keith Branton

Apr 12, 2011, 2:41:58 PM
to mongod...@googlegroups.com
> are you sure that you use -Xms (instead of -Xmn) ?
> Xms is for stack size, while you need min heap size

Not on the Sun JVM: -Xss is for stack size, -Xms is initial heap size.

David Brooks

Apr 12, 2011, 2:29:31 PM
to mongodb-user
I'm iterating the cursor and outputting to the view. I'll try a
bigger heap size next.

On Apr 12, 11:33 am, Keith Branton <ke...@branton.co.uk> wrote:
> I've never used YourKit, but your stats would tend to suggest that you don't
> have a memory leak at least.
>
> It sounds like you are not letting go of the references quickly enough for
> them to be discarded before they get tenured.
>
> Are you adding results of your $in query into a list or something rather
> than just iterating the cursor?
>
> If rearranging code can't help then then increasing your overall heap size
> may help (as that will increase eden size, reducing the frequency of eden
> GCs and so help keep these objects from being tenured)
>
> I don't tend to use heaps bigger than 2GB with production Java web servers
> (Your 1.5GB heaps seem reasonable), though I have used heaps as big as 20GB
> on one very high traffic web site with decent results. (I would normally
> favor load balancing several 2GB jvms on a box in preference to a single
> large one)
>
> There are many GC options you can experiment with that can affect GC pauses
> in tenured, but the first step is to stop request-scoped objects from
> getting tenured as much as possible.
>

Shi Shei

Apr 13, 2011, 3:58:28 AM
to mongodb-user
> Does your JVM that's doing these queries have 48GB on it?

Yes, but only very little RAM is used by the JVM; it's barely visible
in Cacti (a monitoring tool). The client just iterates through all
documents to make sure it fetches them all from mongo, but they are
not kept client-side, so they don't consume memory.

Performance is equal when I run my stress test with the java options
you've posted:
-server -XX:NewRatio=1 -Xms1500m -Xmx1500m -XX:+DisableExplicitGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -verbose:gc

I even reduced the min and max heap size to only 64MB. Throughput is
the same when running 1 thread. Running 50 threads I get an
OutOfMemoryError (Java heap space). Increasing it to 512MB, 50
threads run well too but produce less throughput than with more heap
space (117,000 instead of 220,000 docs/sec).

It would be helpful to see what your document looks like and also how
your client queries mongo.