(Gaaah, who ever thought it was a good idea to have keyboard shortcuts in web email clients!?)
Hi Udam,
I see you're already on the right track with your links and the feedback Roger was so kind to give as well. Essentially, you're entering the wonderful world of Amdahl's Law, Queuing Theory, and the Universal Scalability Law.
I've had the pleasure of hosting some very interesting interviews on the jOOQ Tuesdays series on the subject matter:
Vlad also published a very interesting guest post, recently:
As far as scaling is concerned: rest assured, it's extremely difficult. You said you'd read AND write. That's already where the trouble starts. Scaling read-only access is much easier, but as soon as you write as well, there's concurrency and contention to take care of, and eventually, having too many connections in the pool might prove to be your bottleneck. This has nothing to do with jOOQ and will trouble you regardless of the technology stack you're using. Details about this are in the above links.
How to profile jOOQ?
jOOQ's built-in debug and trace loggers weren't intended for measuring execution time. I do realise that the existing stop watch is a bit misleading. We'll remove it from future jOOQ versions in order not to confuse users:
Ideally, you will run jOOQ just as you run it in production, and then you can:
- Benchmark thousands of executions using System.nanoTime(), which will already give you a good idea
- Benchmark the same again using JMH, which will be much more precise (and makes it easier to set up complex benchmarks and to get rid of benchmarking artefacts and side effects)
- Use a profiler like JMC / Flight Recorder, or YourKit, or JProfiler, etc. This will be less precise than benchmarking, but if you have a concrete bottleneck, you'll find it this way
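To illustrate the first option, a naive System.nanoTime() harness might look like the sketch below. The workload here is just a placeholder (string formatting); you'd substitute your actual jOOQ query execution. Note the warm-up loop, without which the JIT compiler distorts the numbers:

```java
// A minimal System.nanoTime() micro-benchmark sketch. The task() method is a
// placeholder workload, standing in for e.g. ctx.selectFrom(...).fetch().
public class NaiveBenchmark {

    static String task(int i) {
        // Placeholder: replace with your actual query execution
        return String.format("row-%d", i);
    }

    public static void main(String[] args) {
        int warmup = 10_000, measured = 100_000;

        // Warm up so the JIT has compiled the hot path before we measure
        for (int i = 0; i < warmup; i++)
            task(i);

        long start = System.nanoTime();
        for (int i = 0; i < measured; i++)
            task(i);
        long elapsed = System.nanoTime() - start;

        System.out.println("avg ns/op: " + (elapsed / measured));
    }
}
```

Even with the warm-up, this approach suffers from dead-code elimination and other artefacts, which is exactly what JMH is designed to avoid.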
Bottleneck: Reflection
You've mentioned caching and reflection. One of the most common bottlenecks in jOOQ arises when you call DSL.using() all the time, implicitly creating new Configuration objects, instead of caching the Configuration object (which contains a reflection cache for the DefaultRecordMapper).
This should usually be done even in environments that do not have any low latency requirements.
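To make the effect concrete, here's a self-contained sketch of the pattern (not jOOQ's actual internals, and the class names are made up): reflection metadata is looked up once per class and then reused. If you discard the cache's owner (as you do when you discard the Configuration after every DSL.using() call), the expensive lookup happens once per row instead of once per class:

```java
import java.lang.reflect.Field;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration of a reflection cache, analogous in spirit to
// the one jOOQ keeps inside a cached Configuration for DefaultRecordMapper.
public class ReflectionCacheDemo {

    // Expensive reflection results, computed once per class
    private static final Map<Class<?>, Field[]> CACHE = new ConcurrentHashMap<>();

    static Field[] fieldsOf(Class<?> type) {
        // computeIfAbsent: reflection runs only on the first call per type
        return CACHE.computeIfAbsent(type, Class::getDeclaredFields);
    }

    static class Author {
        int id;
        String name;
    }

    public static void main(String[] args) {
        Field[] first = fieldsOf(Author.class);
        Field[] second = fieldsOf(Author.class);

        // Same array instance: the second lookup was a cheap cache hit
        System.out.println(first == second); // prints "true"
    }
}
```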
Bottleneck: Schema mapping
The runtime schema and table mapping feature incurs overhead in two ways. First, the mapping configuration is "compiled" from the Settings into a more appropriate format and cached in your Configuration, which you should cache as well (see above).
Secondly, a bit of overhead is incurred each time the mapping is applied, of course.
Bottleneck: SQL generation
If you have very high performance requirements in some areas of your application, then jOOQ's usual API usage might not be appropriate. Be aware that jOOQ lets you build a SQL expression tree at runtime, and then transform it to a SQL string again.
If you're really writing static SQL, then it might be faster to pre-generate the SQL string, store it in a constant, and execute it with JDBC directly. There's a short section on this in the manual:
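As a sketch of that pattern: the SQL string is produced once (by hand here, or e.g. via getSQL() on a jOOQ query at startup), stored in a constant, and then executed with plain JDBC. Table and column names below are made up for the example:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Sketch: pre-generated static SQL executed directly with JDBC, skipping
// runtime expression-tree construction and SQL rendering entirely.
public class StaticSqlDemo {

    // Generated once (e.g. at build time or startup), not per execution
    static final String SELECT_AUTHOR =
        "select id, name from author where id = ?";

    static String authorName(Connection connection, int id) throws SQLException {
        try (PreparedStatement stmt = connection.prepareStatement(SELECT_AUTHOR)) {
            stmt.setInt(1, id);

            try (ResultSet rs = stmt.executeQuery()) {
                return rs.next() ? rs.getString("name") : null;
            }
        }
    }
}
```

The trade-off is that you lose jOOQ's dynamic SQL capabilities and type safety for that query, so this is worth doing only on genuinely hot, genuinely static paths.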
Bottleneck: Fetching
jOOQ inverts all of JDBC's "lazy" defaults for convenience. E.g. jOOQ Results are fetched eagerly, and resources are closed eagerly, including ResultSet and PreparedStatement.
If you have large result sets, you should use jOOQ's ResultQuery.fetchLazy() or fetchStream() method instead:
Don't forget to also specify ResultQuery.fetchSize() (which translates to JDBC's Statement.setFetchSize()).
If you repeatedly run the exact same query, you could specify Query.keepStatement(), which keeps the Query's underlying PreparedStatement open between executions; that might also be beneficial.
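At the JDBC level, lazy fetching roughly amounts to the following sketch: keep the ResultSet open, hint a fetch size so the driver streams rows in chunks, and process one row at a time instead of materialising the whole result in memory. Table and column names are again made up for the example:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.function.Consumer;

// What fetchLazy() / fetchStream() boil down to in plain JDBC: rows are
// consumed one at a time while the ResultSet stays open, so memory usage
// stays flat regardless of result set size.
public class LazyFetchDemo {

    // Rows are transferred from the server in chunks of this size
    static final int FETCH_SIZE = 50;

    static void forEachTitle(Connection connection, Consumer<String> consumer)
    throws SQLException {
        try (PreparedStatement stmt =
                 connection.prepareStatement("select title from book")) {

            // The equivalent of ResultQuery.fetchSize()
            stmt.setFetchSize(FETCH_SIZE);

            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next())
                    consumer.accept(rs.getString("title"));
            }
        }
    }
}
```

Be aware that some drivers need extra conditions for the fetch size to take effect (e.g. PostgreSQL's JDBC driver requires auto-commit to be off), so check your driver's documentation.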
Final word: Benchmark wisely
jOOQ has some overhead compared to JDBC, sure. Some of it can and should be avoided very easily: you should have a single Configuration instance, or at least a pool of configurations, instead of creating one afresh all the time.
The other optimisation tools should be applied only where really needed. Not all queries need to be run millions of times per second. Again, finding the places where you can optimise most efficiently is best done using a profiler (if possible: in production), and then fixing the top 10 bottlenecks.
I'll be very happy to assist you with more details, if needed.
Cheers,
Lukas