mongodb java driver version 2.3 concurrency issue

g.org

Nov 8, 2010, 5:55:17 AM

to mongod...@googlegroups.com

Hi,

I'm experiencing concurrency issues, in the form of a massive increase in
query time, when using the MongoDB Java driver from multiple threads.

see: http://jira.mongodb.org/browse/JAVA-207

with 1 thread (400 runs of the same query):
average query time: 52ms
total runtime: 21063ms

with 4 concurrent threads (100 queries per thread):
average query time: 152ms
total runtime: 16871ms

with 10 concurrent threads (40 queries per thread):
average query time: 544ms
total runtime: 22081ms

So I would expect the total runtime to drop by around 3/4 when running
4 threads on a quad-core machine, but it dropped by less than 1/4, with
CPU usage of no more than 50-60%.
Of course 10 concurrent threads will take longer than 4 concurrent
threads on a quad-core box, because not all of them can run
concurrently, but it should still be faster than 1 thread ...
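
For readers who want to reproduce this kind of measurement, here is a minimal sketch of a harness that reports the same two figures (average query time and total runtime) for a given thread count. It is not the original test code: the class `ThreadedQueryBenchmark`, its `Result` type and the sleeping stand-in query are all hypothetical, and a real run would replace the stand-in with an actual driver call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical harness; none of these names come from the original test code.
final class ThreadedQueryBenchmark {

    /** Outcome of one run: mean per-query latency and wall-clock time, in milliseconds. */
    static final class Result {
        final double averageQueryMillis;
        final long totalRuntimeMillis;
        final int queriesExecuted;

        Result(double averageQueryMillis, long totalRuntimeMillis, int queriesExecuted) {
            this.averageQueryMillis = averageQueryMillis;
            this.totalRuntimeMillis = totalRuntimeMillis;
            this.queriesExecuted = queriesExecuted;
        }
    }

    /** Runs `query` queriesPerThread times on each of `threads` worker threads. */
    static Result run(int threads, int queriesPerThread, Runnable query) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> workers = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                workers.add(() -> {
                    long busyNanos = 0;
                    for (int i = 0; i < queriesPerThread; i++) {
                        long start = System.nanoTime();
                        query.run();
                        busyNanos += System.nanoTime() - start; // latency of this one query
                    }
                    return busyNanos;
                });
            }
            long wallStart = System.nanoTime();
            long summedNanos = 0;
            for (Future<Long> f : pool.invokeAll(workers)) { // blocks until all workers finish
                summedNanos += f.get();
            }
            long wallNanos = System.nanoTime() - wallStart;
            int total = threads * queriesPerThread;
            return new Result(summedNanos / 1e6 / total, wallNanos / 1_000_000, total);
        } catch (Exception e) {
            throw new IllegalStateException("benchmark run failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Stand-in for the real query (assumption): sleep 1 ms instead of calling MongoDB.
        Runnable fakeQuery = () -> {
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        for (int threads : new int[] {1, 4, 10}) {
            Result r = run(threads, 400 / threads, fakeQuery); // 400 queries in total, as in the post
            System.out.printf("%d thread(s): average query time %.1f ms, total runtime %d ms%n",
                    threads, r.averageQueryMillis, r.totalRuntimeMillis);
        }
    }
}
```

With a query that scales well, the average query time stays roughly constant and the total runtime shrinks as threads are added. A rising average with a flat total runtime, as in the numbers above, suggests the threads are waiting on something shared.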

average query time, 4 threads relative to 1 thread: 292%
total runtime, 4 threads relative to 1 thread: 80%

average query time, 10 threads relative to 1 thread: 1046%
total runtime, 10 threads relative to 1 thread: 104%

average query time, 10 threads relative to 4 threads: 357%
total runtime, 10 threads relative to 4 threads: 130%

values taken from here:
http://bit.ly/8XvTIN

To verify that this is an issue with the Java driver, I ran the test from
4 separate JVM instances. I also increased the loops to 4000 each, to
make sure there are enough concurrent queries:
http://bit.ly/cPhKah

1 JVM with 1 thread:
average query time 52ms, total runtime: 205613ms

4 concurrent JVMs, each with 1 thread:
average query time 73ms, total runtime: 295533ms

average query time, 4 JVMs relative to 1 JVM: 140%
total runtime, 4 JVMs relative to 1 JVM: 143%
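
The percentages in this message are ratios of the new value to the baseline (so 80% means the run got shorter, not longer). A small sketch that recomputes them from the quoted measurements follows; the class and method names are illustrative only, and truncating to whole percent is an assumption that happens to reproduce the quoted figures.

```java
// Recomputes the percentages from the measurements quoted in this message.
final class RuntimeRatios {

    /** Returns value as a whole-number percentage of base, truncated toward zero. */
    static long percentOf(long base, long value) {
        return value * 100 / base;
    }

    /** Returns how many times faster the new runtime is than the base runtime. */
    static double speedup(long baseMillis, long newMillis) {
        return (double) baseMillis / newMillis;
    }

    public static void main(String[] args) {
        // Threads within one JVM: {average query time in ms, total runtime in ms}
        long[] t1 = {52, 21063};
        long[] t4 = {152, 16871};
        long[] t10 = {544, 22081};

        System.out.println("4 vs 1 thread:   query " + percentOf(t1[0], t4[0])
                + "%, runtime " + percentOf(t1[1], t4[1]) + "%");
        System.out.println("10 vs 1 thread:  query " + percentOf(t1[0], t10[0])
                + "%, runtime " + percentOf(t1[1], t10[1]) + "%");
        System.out.println("10 vs 4 threads: query " + percentOf(t4[0], t10[0])
                + "%, runtime " + percentOf(t4[1], t10[1]) + "%");

        // Separate JVMs with one thread each
        System.out.println("4 vs 1 JVM:      query " + percentOf(52, 73)
                + "%, runtime " + percentOf(205613, 295533) + "%");

        // Same 400 queries spread over 4 threads; the ideal on 4 cores would be 4.00x
        System.out.printf("speedup 1 -> 4 threads: %.2fx%n", speedup(t1[1], t4[1]));
    }
}
```

Read this way, four threads in one JVM finish the same 400 queries only about 1.25 times faster than one thread. If each of the four JVMs ran its own 4000 queries, as the description suggests, the multi-JVM setup did four times the work in 143% of the time, which is roughly 2.8 times the single-JVM throughput.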

Sadly my issue (JAVA-207) was closed with "works as designed". I hope
this decision is revised because, if it stays this way, the driver is
barely usable for our web project as soon as there is more than one
concurrent user ...

tnx,

Georg

Brendan W. McAdams

Nov 8, 2010, 8:28:12 AM

to mongod...@googlegroups.com

In your sample code you are calling toArray on your result set. DBCursors are cursors, and converting them to arrays is typically slow.

It looks like this is your performance bottleneck, and that is certainly expected behavior. You would see similar behavior from any other database if you forcibly converted a cursor to an array.

On top of that, after you convert to an array you iterate the array again, which means you iterate each result set twice. That is certainly contributing to your slow benchmark.

DBCursors are optimized to be used as cursors: their best performance comes from fetching result by result, only as needed. Converting to an array is slow and costly and requires a series of round trips to grab more batches.
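
To make the round-trip argument concrete, here is a small simulation. `BatchedCursor` is a hypothetical stand-in that only counts simulated batch fetches; it is not the driver's `DBCursor`, and the batch size of 100 is an arbitrary assumption.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a server-side cursor; this is NOT the driver's DBCursor.
final class BatchedCursor implements Iterator<Integer> {
    private final int totalResults;
    private final int batchSize;
    private int delivered = 0; // results handed to the caller so far
    private int fetched = 0;   // results pulled from the simulated server so far
    int roundTrips = 0;        // simulated network round trips

    BatchedCursor(int totalResults, int batchSize) {
        this.totalResults = totalResults;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
        return delivered < totalResults;
    }

    @Override
    public Integer next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        if (delivered == fetched) { // local batch exhausted: fetch the next one
            fetched = Math.min(fetched + batchSize, totalResults);
            roundTrips++;
        }
        return delivered++;
    }

    /** Drains the whole cursor into a list, as a toArray-style call would. */
    List<Integer> drainToList() {
        List<Integer> all = new ArrayList<>();
        while (hasNext()) {
            all.add(next());
        }
        return all;
    }

    public static void main(String[] args) {
        // Streaming: consume only the first 10 of 1000 results.
        BatchedCursor streaming = new BatchedCursor(1000, 100);
        for (int i = 0; i < 10 && streaming.hasNext(); i++) {
            streaming.next();
        }
        System.out.println("streaming round trips: " + streaming.roundTrips);

        // Draining: fetch every batch first, then walk the list a second time.
        BatchedCursor drained = new BatchedCursor(1000, 100);
        List<Integer> all = drained.drainToList();
        long sum = 0;
        for (int value : all) {
            sum += value; // second pass over data that was already iterated once
        }
        System.out.println("drained round trips: " + drained.roundTrips + ", sum: " + sum);
    }
}
```

The difference only matters when the caller can stop early or process results as they arrive. If every result is needed, the same number of batches has to be fetched either way, and the extra cost of draining first is the second pass plus the memory for the list.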

> --
> You received this message because you are subscribed to the Google Groups "mongodb-user" group.
> To post to this group, send email to mongod...@googlegroups.com.
> To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.
>

g.org

Nov 8, 2010, 8:35:40 AM

to mongod...@googlegroups.com

Well, in the end we need a list with counts, so I can probably speed it
up that way, but:

1) We need the full list of results, otherwise I wouldn't ask for it (in
our business case the query is already filtered, but as I figured out,
the effect in the test case is the same).

2) This doesn't explain the gap between the threaded access times
within one JVM and the non-threaded access times across multiple JVMs.
The non-threaded access always wins and is limited only by the number
of available CPU cycles, whereas the threaded access uses only around
half of my available CPU cycles.

Georg

joseph

Nov 29, 2010, 6:26:52 AM

to mongodb-user

hi

btw, some update on the issue: http://jira.mongodb.org/browse/JAVA-207
"in fact, this seems like a finalization/GC issue. if you use the 2.3
release driver and remove the finalize() methods from DBApiLayer, the
cores avail are used to 100% and the numbers created are significantly
better.

i think this is due to implicit sync on finalization. Could we somehow
get away without finalization there?"

interesting.

Looks like the TCP stack wasn't the only issue. Btw, I asked about it in
several places, and nobody had heard of this kind of issue with the JVM.
That doesn't mean it can't happen, but more details on it would be
welcome.
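
On the question in the quote of getting away without finalization: one common alternative is explicit cleanup, where the caller closes the cursor and no `finalize()` method is involved. The sketch below assumes the finalizer's job is to release a server-side cursor. `ExplicitCursor` and `DEAD_CURSOR_IDS` are hypothetical names and this is not how `DBApiLayer` is written.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; class and field names are illustrative, not taken from DBApiLayer.
final class ExplicitCursor implements AutoCloseable {
    // Cursor ids waiting to be released on the server. The queue is lock-free,
    // so closing a cursor never blocks other threads.
    static final Queue<Long> DEAD_CURSOR_IDS = new ConcurrentLinkedQueue<>();

    private final long cursorId;
    private final AtomicBoolean closed = new AtomicBoolean(false);

    ExplicitCursor(long cursorId) {
        this.cursorId = cursorId;
    }

    @Override
    public void close() {
        // Idempotent: only the first close() registers the id for cleanup.
        if (closed.compareAndSet(false, true)) {
            DEAD_CURSOR_IDS.add(cursorId);
        }
    }

    public static void main(String[] args) {
        try (ExplicitCursor cursor = new ExplicitCursor(42L)) {
            // ... read results here ...
        }
        System.out.println("pending cleanup: " + DEAD_CURSOR_IDS);
    }
}
```

The trade-off is that callers who forget to close leak the server-side cursor until it times out, which is the case a finalizer is usually meant to catch. In exchange, the objects carry no finalizer, so the garbage collector can reclaim them without handing them to the JVM's finalizer thread first.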

++
joseph
