Performance impact of jOOQ

udam.d...@gmail.com

unread,
Apr 6, 2017, 2:20:53 PM4/6/17
to jOOQ User Group
Wanted to get an idea of what the performance impact of using jOOQ is, and whether there are ways to improve it.

Currently we're seeing pretty slow performance when using jOOQ for even very simple tasks. See the StopWatch output below:

2017-04-06 10:45:25:428 TRACE [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ] Initialising             : Total: 0.003ms
2017-04-06 10:45:25:428 TRACE [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ] Rendering query          : Total: 0.231ms, +0.228ms
2017-04-06 10:45:25:428 DEBUG [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ] Executing query          : SELECT `session`.`id`, `session`.`created_date` FROM `session` WHERE (`session`.`id` = ? AND `session`.`created_date` = ?)
2017-04-06 10:45:25:429 DEBUG [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ] -> with bind values      : SELECT `session`.`id`, `session`.`created_date` FROM `session` WHERE (`session`.`id` = 'abcd' AND `session`.`created_date` = {TS '2017-04-06 10:45:25.417'})
2017-04-06 10:45:25:429 TRACE [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ] Query rendered           : Total: 1.166ms, +0.935ms
2017-04-06 10:45:25:429 TRACE [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ] Preparing statement      : Total: 1.391ms, +0.225ms
2017-04-06 10:45:25:429 TRACE [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ] Statement prepared       : Total: 1.645ms, +0.253ms
2017-04-06 10:45:25:430 TRACE [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ] Binding variables        : Total: 1.814ms, +0.168ms
2017-04-06 10:45:25:430 TRACE [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ] Variables bound          : Total: 2.113ms, +0.299ms
2017-04-06 10:45:25:430 TRACE [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ] Executing query          : Total: 2.331ms, +0.217ms
2017-04-06 10:45:25:431 DEBUG [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ] Query executed           : Total: 3.486ms, +1.154ms
2017-04-06 10:45:25:432 TRACE [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ] Fetching results         : Total: 4.052ms, +0.566ms
2017-04-06 10:45:25:432 TRACE [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ] Fetching result          : Total: 4.304ms, +0.252ms
2017-04-06 10:45:25:432 TRACE [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ] Fetching record          : Total: 4.576ms, +0.271ms
2017-04-06 10:45:25:432 TRACE [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ] Record fetched           : Total: 4.796ms, +0.22ms
2017-04-06 10:45:25:433 DEBUG [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ] Fetched result           : +------------------------------------+-----------------------+
2017-04-06 10:45:25:433 DEBUG [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ]                          : |id                                  |created_date           |
2017-04-06 10:45:25:433 DEBUG [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ]                          : +------------------------------------+-----------------------+
2017-04-06 10:45:25:434 DEBUG [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ]                          : | abcd |2017-04-06 10:45:25.417|
2017-04-06 10:45:25:434 DEBUG [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ]                          : +------------------------------------+-----------------------+
2017-04-06 10:45:25:434 TRACE [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ] Result fetched           : Total: 6.146ms, +1.349ms
2017-04-06 10:45:25:434 TRACE [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ] Results fetched          : Total: 6.374ms, +0.228ms
2017-04-06 10:45:25:434 DEBUG [dw-28 - POST /service/nonuilogin] [JooqLogger] [SdDYuJEQ] Finishing                : Total: 6.523ms, +0.148ms

This was run on a quad-core MacBook Pro with 16 GB of RAM, against a MySQL database running on the same machine. The same query run via MySQL Workbench takes roughly 0.00074 sec duration / 0.000036 sec fetch time.

It seems like executing the query through jOOQ takes several orders of magnitude more time. This causes DB connection pool starvation for us (timeouts waiting for an open connection) and terrible latencies when running with any concurrent traffic (even as low as 20 concurrent requests).

I'd like to think that we're doing something wrong in the way we're using jOOQ (we use Dropwizard/jOOQ for our service). The steps taking the most time all look CPU-bound, and I would have expected them to run orders of magnitude faster. If anyone can provide hints on how to get these latencies back down to decent levels, I would very much appreciate it.

Thank you!
Udam

Lukas Eder

unread,
Apr 6, 2017, 4:55:40 PM4/6/17
to jooq...@googlegroups.com
Hi Udam,

Thank you very much for your message.

First off: be careful with what you measure. Measuring a single query execution in any tool is not going to give you significant results. You'd have to benchmark at least 100, or better 10,000, consecutive executions.

Secondly, you seem to have turned on TRACE logging in jOOQ. This by itself causes significant overhead and is probably not what you want to do in production.

I'm more than happy to help you improve your performance with jOOQ (there's always some overhead compared to JDBC, but it's minimal if done correctly). But do you have any actual production issues, or did you just notice these numbers in your TRACE log?

Cheers,
Lukas

--
You received this message because you are subscribed to the Google Groups "jOOQ User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jooq-user+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

udam.d...@gmail.com

unread,
Apr 6, 2017, 5:29:53 PM4/6/17
to jOOQ User Group
Hi Lukas,

First off: awesome product! You do an amazing job answering questions and helping. We definitely expect some minor overhead, as you mentioned.

We're looking into performance degradation as we increase the load on our service. The load testing is far from scientific: just a thread pool sending traffic concurrently to one fairly simple API. We're running with 5 DB connections in the pool (min=max=5), and the pool waits 1000 ms for a connection before timing out. The APIs basically do some reads or writes to a database; some of the write APIs execute multiple writes within a single transaction.

What we're seeing is that under low load the API performance is OK. As the number of concurrent requests grows, API latency gets exponentially worse. It also seems to get worse within a single run: e.g. with 300 concurrent requests, some still finish quickly, but then there's a long tail of requests that finish later and later (even taking minutes, where the earlier ones finished in tens of milliseconds). I understand there are many reasons why things could slow down. After roughly 150 concurrent connections, we see timeouts trying to get a free DB connection from the pool.

We've done some work to make sure transactions use the same configuration/connection. I was expecting querying the DB to be fast enough that no API call would wait 1000 ms for an open connection in the pool. While we keep digging on this, I turned on the stop watch TRACE logging and the execute logging to see how long the queries were taking, and ran a single API call (no concurrent calls). The numbers I pasted are what I saw with execute logging turned on, and they alarmed me. I admit I didn't run the query thousands of times, but the numbers after the very first API call following a restart remain pretty stable.

The reason I started looking in to this was because I wanted to figure out why connections in the connection pool would be tied up for long periods of time. I'll keep digging on that further. 

Is there a way to find out how well things are performing in jOOQ without turning on execute logging? How much of the slowdown can we attribute to the logging itself?

One thing we noticed recently: we keep a cache of "Record" objects for some entities that rarely change, and we make a copy of each record when handing it out of the cache. We noticed that the copy uses reflection and is very slow. I wonder whether there are other ways we may be using jOOQ that cause excessive slowdowns.

Thanks again!

Udam

Roger Thomas

unread,
Apr 7, 2017, 7:20:41 AM4/7/17
to jOOQ User Group
Hi,

Which DB connection pool manager are you using, and why have you set the number of connections so low? If you are expecting/testing hundreds of concurrent requests per second, such a small pool can become exhausted unless each request can be handled in a millisecond or two. Even if not exhausted, you are likely to see a high level of contention, and therefore delays at the pool level. Your description suggests contention followed by exhaustion: things start off quick, then slow down (contention), and then time out (exhaustion). Unless you have other restrictions, do a test run with the pool set to 50 and see what happens.

If you have the time to read up on yet another Java library, you may find Dropwizard Metrics useful:

        http://metrics.dropwizard.io/3.2.2/

It allows you to reduce the logging level and instead insert code that records statistics, with detail reported via histograms. It can make behaviour like DB connection usage and query processing over time far easier to understand.
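For instance, here's a minimal sketch of wrapping a query in a Metrics Timer; the registry setup and the metric name "db.session.byId" are illustrative, not taken from your code:

```java
import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Timer;

public class QueryMetrics {
    static final MetricRegistry REGISTRY = new MetricRegistry();
    // One timer per query site; the name "db.session.byId" is made up here
    static final Timer QUERY_TIMER = REGISTRY.timer("db.session.byId");

    static void timedQuery(Runnable query) {
        // Timer.Context records the elapsed time into a histogram on close()
        try (Timer.Context ignored = QUERY_TIMER.time()) {
            query.run();
        }
    }

    public static void main(String[] args) {
        timedQuery(() -> { /* execute the jOOQ query here */ });
        // Reports min/max/mean/percentiles without TRACE-logging overhead
        ConsoleReporter.forRegistry(REGISTRY).build().report();
    }
}
```

In Dropwizard you'd normally use the application's own MetricRegistry rather than creating one, and attach a reporter once at startup.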

Another nice tool you may want to look at is http://www.profilesql.com/, a free in-line profiler for MySQL. By deploying it you can confirm how connections are being made/dropped and how long the database takes to process each request passed to it; the values you see there should be far closer to the values you see within MySQL Workbench, but it's worth checking. In past projects I have found this very helpful, as it has allowed me to discover configuration issues with the DB connection pool, such as lack of reuse (lots of opens/closes) or lots of clean-up between reuses (re-issuing of default settings).

I hope my comments are helpful.

Roger

udam.d...@gmail.com

unread,
Apr 7, 2017, 1:53:31 PM4/7/17
to jOOQ User Group
Roger,

Thanks for taking the time to respond. Pool sizing was based on this article: https://github.com/brettwooldridge/HikariCP/wiki/About-Pool-Sizing. We may be off base on that decision. We're using the standard Tomcat database connection pool that ships with Dropwizard. I've increased the DB connection pool size, re-run the same tests, and now see fewer pool-exhausted errors.

However, my concerns remain about how the performance degrades. Given that we're running against a DB on the same machine (dev setup), I would not expect database connections to be held up for milliseconds. Even then, at the rate we're sending requests, the service should be able to work through the queue of requests waiting for a pool connection, given they're allowed to wait 1000 ms. That's why I turned on jOOQ's execute logging to see how long the queries were taking. I was surprised by the numbers I saw (against the local DB), so I wanted to get an idea from others of what sort of performance impact we can expect from using jOOQ and how it degrades under load.

There may be other things at play here. It's possible it's an issue with Dropwizard or my setup. I'll keep digging on this and try to isolate jOOQ further.

I'll give the Metrics API a go as well. I did not know about ProfileSQL; that's an interesting tool I'll use in the future too.

Udam

Roger Thomas

unread,
Apr 7, 2017, 4:30:57 PM4/7/17
to jOOQ User Group
This is where the Java world starts to get complicated: no two people use the same tools in the same way, so all our viewpoints differ.

HikariCP is considered by many to be one of, if not the, best of the available connection pools. That can cause problems if you use articles focussed on HikariCP as the foundation for configuring other solutions. That said, while I do not know much about the default Apache option, you can see that it does not perform the same way as HikariCP if you read the benchmarks on the main HikariCP page, https://github.com/brettwooldridge/HikariCP. You will see that each solution handles a connection cycle very differently; this is something that may be visible via the profiler.

Are you able to post the segment of Java code you use to create, populate and execute the statement shown in your log example? jOOQ has a range of ways to do things, which Lukas can comment on with ease. As with any library, there can also be a need to 'pin-hole' optimise critical code to get the best performance. Any SQL query executed in a way that can be described as "300 concurrent" is a good target for optimisation, or at least for a strong understanding of how it operates. I say this with my DBA hat on: I have often seen the DB server handle such requests with ease, but committing all its resources to do so.

As Lukas has already noted, enabling logging brings its own overhead, greatly impacting performance and, in this case, perhaps causing connections to be held open far longer than non-logged code would, so the pool size needs to be grown. The log itself shows the problem: it records the single request, with a single returned record, as taking 6.523 ms, but the step from "Fetched result" to "Result fetched" took 1.349 ms, and most of that was likely spent writing the returned record to the log. This also raises the question of which logging sub-system you have deployed, as some are faster than others. This difficulty of getting log visibility into high-performance areas is the reason I went off and learnt about the Metrics library.

Roger

Lukas Eder

unread,
Apr 8, 2017, 3:40:43 PM4/8/17
to jooq...@googlegroups.com
Hi Udam,

I see you're already on the right track with your links and the feedback Roger was so kind to give as well. Essentially, you're entering the wonderful world of


Lukas Eder

unread,
Apr 8, 2017, 4:00:17 PM4/8/17
to jooq...@googlegroups.com
(Gaaah, who ever thought it was a good idea to have keyboard shortcuts in web email clients!?)

Hi Udam,

I see you're already on the right track with your links and the feedback Roger was so kind to give as well. Essentially, you're entering the wonderful world of Amdahl's Law, Queuing Theory, and the Universal Scalability Law.

I've had the pleasure of hosting some very interesting interviews on the jOOQ Tuesdays series on the subject matter.


Vlad also published a very interesting guest post recently.


As far as scaling is concerned, rest assured: it's extremely difficult. You said you read AND write; that's already where trouble starts. Scaling read-only access is much easier, but as soon as you write as well, there's concurrency and contention to take care of, and eventually having too many connections in the pool might prove to become your bottleneck. This has nothing to do with jOOQ and will trouble you regardless of the technology stack you're using. Details about this in the above links.

How to profile jOOQ?

jOOQ's built-in debug and trace loggers weren't intended for measuring time. I do realise that the existing stop watch is a bit misleading; we'll remove it from future jOOQ versions in order not to confuse users.

Ideally, you will run jOOQ just as you run it in production, and then you can:

- Benchmark thousands of executions using System.nanoTime(), which will already give you a good idea
- Benchmark the same again using JMH, which will be much more precise (and makes it easy to set up complex benchmarks and to get rid of benchmarking artefacts and side effects)
- Use a profiler like JMC / Flight Recorder, YourKit, JProfiler, etc. This will be less precise than benchmarking, but if you have a concrete bottleneck, you'll find it this way
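The nanoTime approach can be sketched like this; the harness below is illustrative (the workload stand-in would be replaced by your actual query execution), and for anything serious JMH is still the better tool:

```java
public class MicroBench {
    // Returns the average nanoseconds per execution of `task`, after a
    // warmup phase so the JIT has compiled the hot path before timing starts.
    static double averageNanos(Runnable task, int warmup, int iterations) {
        for (int i = 0; i < warmup; i++)
            task.run();
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++)
            task.run();
        return (System.nanoTime() - start) / (double) iterations;
    }

    public static void main(String[] args) {
        // Stand-in workload; replace with the actual jOOQ query execution
        double avg = averageNanos(() -> String.valueOf(Math.random()), 1_000, 10_000);
        System.out.println(String.format("avg: %.1f ns/op over 10000 runs", avg));
    }
}
```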

Bottleneck: Reflection

You've mentioned caching and reflection. One of the most common bottlenecks in jOOQ is when you use DSL.using() all the time to implicitly create new Configuration objects, instead of caching the Configuration object (which contains a reflection cache for the DefaultRecordMapper).

This should usually be done even in environments that do not have any low latency requirements.
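A minimal sketch of that pattern might look as follows (the Database wrapper class and the MySQL dialect are assumptions for illustration, not prescribed by jOOQ):

```java
import javax.sql.DataSource;
import org.jooq.Configuration;
import org.jooq.DSLContext;
import org.jooq.SQLDialect;
import org.jooq.impl.DSL;
import org.jooq.impl.DefaultConfiguration;

public class Database {
    // Built once; the Configuration carries the reflection cache used by
    // DefaultRecordMapper, so recreating it discards that cache each time.
    private final Configuration configuration;

    public Database(DataSource dataSource) {
        this.configuration = new DefaultConfiguration()
            .set(dataSource)
            .set(SQLDialect.MYSQL);
    }

    public DSLContext ctx() {
        // Cheap: wraps the cached Configuration, creating no new caches
        return DSL.using(configuration);
    }
}
```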

Bottleneck: Schema mapping

The runtime schema and table mapping feature incurs some overhead in a twofold manner. First off, the mapping configuration is "compiled" from the Settings into a more appropriate format, and again cached in your Configuration, which you should cache as well (see above).

Secondly, when applying the mapping, there is a bit of overhead as well, of course

Bottleneck: SQL generation

If you have very high performance requirements in some areas of your application, then jOOQ's usual API usage might not be appropriate. Be aware that jOOQ lets you build a SQL expression tree at runtime, and then transform it to a SQL string again.

If you're really writing static SQL, then it might be better to pre-generate the SQL string, store it in a constant, and execute it with JDBC directly. There's a short section on this in the manual.
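A sketch of the idea, using the session query from the log above (rendering once at class-load time with MySQL assumed as the dialect; the helper name is made up):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import org.jooq.SQLDialect;
import org.jooq.impl.DSL;

public class StaticSql {
    // Rendered once with an indexed bind placeholder; after this, no jOOQ
    // expression tree is built or rendered per execution.
    static final String SESSION_BY_ID = DSL.using(SQLDialect.MYSQL)
        .select(DSL.field("id", String.class), DSL.field("created_date"))
        .from(DSL.table("session"))
        .where(DSL.field("id", String.class).eq(DSL.param("id", String.class)))
        .getSQL();

    static String fetchCreatedDate(Connection connection, String id) throws SQLException {
        // Plain JDBC execution of the pre-generated string
        try (PreparedStatement stmt = connection.prepareStatement(SESSION_BY_ID)) {
            stmt.setString(1, id);
            try (ResultSet rs = stmt.executeQuery()) {
                return rs.next() ? rs.getString("created_date") : null;
            }
        }
    }
}
```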

Bottleneck: Fetching

jOOQ inverts all of JDBC's "lazy" defaults for convenience. E.g. jOOQ Results are fetched eagerly, and resources are closed eagerly, including ResultSet and PreparedStatement.

If you have large result sets, you should use jOOQ's ResultQuery.fetchLazy() or fetchStream() methods instead.

Don't forget to also specify the ResultQuery.fetchSize() (which translates to JDBC's Statement.setFetchSize()).

If you repeatedly run the exact same query, you could specify Query.keepStatement(), which will keep the PreparedStatement of the Query around, which might also be beneficial.
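Putting those three together might look like this sketch (plain SQL is used for brevity; the fetch size of 100 is arbitrary):

```java
import org.jooq.Cursor;
import org.jooq.DSLContext;
import org.jooq.Record;
import org.jooq.ResultQuery;

public class LazyFetch {
    // Streams a large result instead of materialising it eagerly.
    // The Cursor keeps the underlying ResultSet open until close(),
    // hence the try-with-resources block.
    static void processLargeResult(DSLContext ctx) {
        ResultQuery<Record> query = ctx
            .resultQuery("SELECT id, created_date FROM session")
            .fetchSize(100)        // forwarded to Statement.setFetchSize()
            .keepStatement(true);  // reuse the PreparedStatement on re-runs

        try (Cursor<Record> cursor = query.fetchLazy()) {
            for (Record record : cursor) {
                // process one record at a time
            }
        }
    }
}
```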

Final word: Benchmark wisely

jOOQ has some overhead compared to JDBC, sure. Some of it can and should be avoided very easily, i.e. you should have a single instance of a Configuration, or at least a pool of configurations, instead of creating it afresh all the time.

The other optimisation tools should be applied only where really needed. Not all queries need to be run millions of times per second. Again, finding the places where you can optimise most efficiently is best done using a profiler (if possible: in production), and then fixing the top 10 bottlenecks.

I'll be very happy to assist you with more details, if needed.
Cheers,
Lukas

udam.d...@gmail.com

unread,
Apr 10, 2017, 3:42:13 PM4/10/17
to jOOQ User Group
Thanks, Lukas. There's quite a bit here for me to digest. I'll go through these suggestions and see where we can improve our performance. We definitely have the read/write scenario, so we'll need to be extra careful. As I mentioned, we made sure we use a transaction-scoped Configuration object; we may choose to take it a step further and use a request-scoped Configuration. We did see some oddities with child transactions and propagation when we tried to share the Configuration, but we'll spend some time in the coming weeks to revisit all of it.

Appreciate you taking the time to provide all this insight.

Udam

Hilco Wijbenga

unread,
Apr 10, 2017, 5:12:02 PM4/10/17
to jOOQ User Group
Hi Lukas,

Thank you for this very interesting information.

I would like to understand better how DSLContext and Configuration
work together.

On 8 April 2017 at 13:00, Lukas Eder <lukas...@gmail.com> wrote:
> You've mentioned caching and reflection. One of the most common bottlenecks
> in jOOQ is when you use DSL.using() all the time to implicitly create new
> Configuration objects, instead of caching the Configuration object (which
> contains a reflection cache for the DefaultRecordMapper).

Does this paragraph contradict [1]? Let me ask a few specific
questions (about [1]) to make it, hopefully, easier to respond.

1) Why not cache the result of "DSL.using(configuration)" and use that
cached version in the second INSERT? Is there a practical difference?
2) In fact, why not re-use "create" and avoid
"DSL.using(configuration)" altogether? Is there a practical
difference?

On a side note, when using injection, the idea is to inject DSLContext, correct?

Cheers,
Hilco

[1] https://www.jooq.org/doc/3.9/manual/sql-execution/transaction-management/

Lukas Eder

unread,
Apr 11, 2017, 3:38:34 AM4/11/17
to jooq...@googlegroups.com
Hi Udam,

Thanks for your feedback. One question: Are you using jOOQ's transaction API for those child transactions? Perhaps there's some caching issue hidden in there, which we might need to benchmark...

Cheers,
Lukas

Lukas Eder

unread,
Apr 11, 2017, 3:42:33 AM4/11/17
to jooq...@googlegroups.com
2017-04-10 23:11 GMT+02:00 Hilco Wijbenga <hilco.w...@gmail.com>:
Hi Lukas,

Thank you for this very interesting information.

I would like to understand better how DSLContext and Configuration
work together.

DSLContext is a DSL entry point that operates in the context of a Configuration, unlike DSL, which is static and context (Configuration) free.
 
On 8 April 2017 at 13:00, Lukas Eder <lukas...@gmail.com> wrote:
> You've mentioned caching and reflection. One of the most common bottlenecks
> in jOOQ is when you use DSL.using() all the time to implicitly create new
> Configuration objects, instead of caching the Configuration object (which
> contains a reflection cache for the DefaultRecordMapper).

Does this paragraph contradict [1]? Let me ask a few specific
questions (about [1]) to make it, hopefully, easier to respond.

1) Why not cache the result of "DSL.using(configuration)" and use that
cached version in the second INSERT? Is there a practical difference?

In fact, the paragraph is not entirely accurate. DSL.using(Configuration) has no significant overhead because DSLContext doesn't really do anything and this constructor takes a pre-existing Configuration into account.

I've seen quite a few people use the DSL.using(Connection, ...) or DSL.using(DataSource, ...) convenience constructors all the time, in which case a new Configuration is created every time.
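The contrast can be sketched like this (the method names are only for illustration; the second variant assumes a long-lived Configuration is held somewhere, as discussed above):

```java
import java.sql.Connection;
import org.jooq.Configuration;
import org.jooq.SQLDialect;
import org.jooq.impl.DSL;

public class ContextUsage {
    // Anti-pattern: builds a fresh Configuration (and fresh caches) per call
    static void perCall(Connection connection) {
        DSL.using(connection, SQLDialect.MYSQL)
           .fetch("SELECT 1");
    }

    // Preferred: wraps a long-lived Configuration, reusing its caches
    static void cached(Configuration configuration) {
        DSL.using(configuration)
           .fetch("SELECT 1");
    }
}
```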
 
2) In fact, why not re-use "create" and avoid
"DSL.using(configuration)" altogether? Is there a practical
difference?

No. Except that in the documentation, when we did this (and we used to), we were flooded with tons of questions about what this "create" is.

On a side note, when using injection, the idea is to inject DSLContext, correct?

That's up to you. You can inject the Configuration, too. Injecting DSLContext is just a bit more convenient.