On "The Difficulty of Performance Evaluation of HikariCP in Dropwizard"


nbabc...@gmail.com

Apr 25, 2017, 7:46:21 PM
to HikariCP
Backstory: I wrote a post/benchmark using HikariCP inside a Dropwizard application and Brett wrote a thoughtful response.

Hey Brett,

Thank you for taking the time to read and craft a well-thought-out response. I've opened this thread to continue the discussion, as a mailing list is a better platform for conversation than somebody's blog. I'll link the post to this thread so others may find this too.

what version of HikariCP was used in the test

HikariCP v2.6.0

Validation

The Tomcat pool was using Connection.isValid() (via the validator class),
while HikariCP was configured to use a SQL query
(config.setConnectionTestQuery()). If the query is left un-set in HikariCP,
it will also use Connection.isValid(). If a validator is set on Tomcat, as
it is here, any similar test query that was also configured on Tomcat would
be ignored, with preference given to the validator.

Yes, I missed this; I did not realize they were mutually exclusive. From Tomcat's PoolProperties:

setValidatorClassName: Set the name for an optional validator class which will be used in place of test queries

So forcing HikariCP to use test queries while allowing Tomcat to use a validator, as you've alluded to, makes these benchmarks invalid.
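For a re-run, validation can be put on equal footing by leaving both pools on Connection.isValid(). A minimal sketch, assuming the HikariCP and Tomcat JDBC pool APIs discussed above (the class name and JDBC URL are illustrative placeholders, and Tomcat's fallback to isValid() when no validation query or validator is set is worth verifying against its docs):

```java
import com.zaxxer.hikari.HikariConfig;
import org.apache.tomcat.jdbc.pool.PoolProperties;

public class ValidationParity {
    static void configure() {
        HikariConfig hikari = new HikariConfig();
        hikari.setJdbcUrl("jdbc:postgresql://localhost/bench"); // placeholder URL
        // Deliberately no hikari.setConnectionTestQuery(...) -- with no test
        // query configured, HikariCP validates via Connection.isValid().

        PoolProperties tomcat = new PoolProperties();
        tomcat.setUrl("jdbc:postgresql://localhost/bench"); // placeholder URL
        // Deliberately no tomcat.setValidatorClassName(...) and no
        // tomcat.setValidationQuery(...), so Tomcat also validates via
        // Connection.isValid() rather than a SQL round trip.
    }
}
```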


JDBC Compliance/Unsafe


Ah, I realize how lucky I've been to not need to write applications that change transaction isolation levels, modify autocommit, etc.! Even though the benchmark does not change these properties, they really should be enabled by default. The fact that Tomcat "relies on the application to remember how and when these settings have been applied" (source) is unfortunate. When I redo the benchmark I'll be sure to add that interceptor.

The JDBC specification also requires that when Connections are closed, any
open Statements will also be closed. And the specification states that when
a Statement is closed, any associated ResultSets will be closed. So, there
is a kind of cascade cleanup that occurs. Failing to close Statements and
ResultSets can leave cursors open and locks held, causing potential
unexplained deadlocks in the DB.

"Unsafe Case #2" is that Tomcat also does not do this by default. This is
solved by configuring the org.apache.tomcat...StatementFinalizer
interceptor.

Huh, I did not know this, and the Tomcat docs agree with you. In the HikariCP wiki I see the feature chart entry "Track/Close Open Statements", and Tomcat is listed as "Not Supported". Shouldn't the feature be listed as supported, but not enabled by default, since StatementFinalizer will close open statements "created using createStatement, prepareStatement or prepareCall"?


Anyway, calling out the StatementFinalizer interceptor somewhere in the wiki may be beneficial, as people may erroneously believe that any kind of validator set on Tomcat would close open statements.

By default, Tomcat does not use “disposable facades”, unless setUseDisposableConnectionFacade() is configured

I would find this to be a code smell in any code base, but you're right that this should be enabled by default (and in the benchmark).
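Pulling the Tomcat pieces from this thread together, a compliant/safe configuration might look roughly like the following sketch (the interceptor class names are the Tomcat JDBC pool ones named above; treat the exact combination as an assumption to verify against the Tomcat documentation):

```java
import org.apache.tomcat.jdbc.pool.PoolProperties;

public class SafeTomcatConfig {
    static PoolProperties build() {
        PoolProperties p = new PoolProperties();
        // ConnectionState: resets autocommit/isolation/readOnly changes when
        // a connection is returned to the pool.
        // StatementFinalizer: tracks and closes Statements the application
        // leaked (best effort; see the WeakReference caveat later in thread).
        p.setJdbcInterceptors(
            "org.apache.tomcat.jdbc.pool.interceptor.ConnectionState;"
          + "org.apache.tomcat.jdbc.pool.interceptor.StatementFinalizer");
        // Hand out a fresh facade per borrow so a stale reference to a
        // "closed" connection cannot reach the underlying connection.
        p.setUseDisposableConnectionFacade(true);
        return p;
    }
}
```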


Remaining Questions


There still appears to be some sort of contention issue in HikariCP, as the benchmarks with the highest 99th percentile response times were the ones where Jetty had the highest number of threads serving requests (so these requests are blocked waiting for a db connection to become available). If you'd like I can try to provide more information, such as jvisualvm output.


Conclusion

it is extremely difficult for the average user to configure Tomcat for compliant/safe behavior

I'm in full agreement. I was on a mission to compare apples to apples, and I've overlooked a significant amount!

I'll have the benchmarks re-run by Friday, and shortly after I'll post an update and a follow-up in this thread and on the blog.


Brett Wooldridge

Apr 26, 2017, 4:11:12 AM
to hika...@googlegroups.com
Hi Nick,

Yeah, HikariCP v2.6.0 had a rather unfortunate regression in the contention code path.  You can see the impact here, tweeted to me by a user after upgrading from v2.6.0 to v2.6.1:


From the change log in the v2.6.1 release announcement:


Changes in 2.6.1 
 
 * issue 835 fix increased CPU consumption under heavy load caused by excessive
   spinning in the ConcurrentBag.requite() method.



Re: The "disposable facade", I agree it is basically covering up for a code smell.  Unfortunately, it is fairly common to encounter code that obtains a Connection and then passes it down through multiple layers of methods, usually in environments without transaction managers, because the user wants a bunch of SQL to run in a single transaction.  In those environments, it's pretty easy to make that mistake, which is why Tomcat ended up adding support.  Most other pools, in fact all of those of which I am aware, use a facade by default (and leaving it off is not an option).
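The protection a disposable facade provides can be sketched in a few lines of plain Java. The names here are illustrative only, with a bare Object standing in for the real physical Connection; the point is that once the facade is closed, a stale reference to it can no longer reach the recycled underlying resource:

```java
public class DisposableFacadeDemo {
    static class Facade {
        private Object delegate; // stands in for the physical Connection
        Facade(Object delegate) { this.delegate = delegate; }
        void close() { delegate = null; }  // facade is permanently dead
        boolean isUsable() { return delegate != null; }
    }

    public static void main(String[] args) {
        Object physicalConnection = new Object();
        Facade borrowed = new Facade(physicalConnection);
        borrowed.close(); // "connection" returned to the pool
        // Even if the pool re-lends physicalConnection to another thread,
        // this stale facade reference cannot touch it.
        System.out.println("usable after close: " + borrowed.isUsable());
        // prints "usable after close: false"
    }
}
```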

Re: The wiki not mentioning the StatementFinalizer, I clearly need to do some housekeeping. :)  This wiki page does mention it.

Brett Wooldridge

Apr 26, 2017, 5:26:47 AM
to HikariCP
Hi Nick,

Until re-reading my own page re: Tomcat's StatementFinalizer, I totally forgot that it ends up being only partially effective/reliable.  Because Tomcat uses WeakReferences in its implementation of the finalizer, if the VM comes under GC pressure then Tomcat may simply lose track of unclosed Statements.  Under GC pressure, abandoned unclosed Statements may be garbage collected, at which point the WeakReference held by Tomcat no longer references anything, and therefore it is unable to close them.  When that occurs, any locks held or resources allocated in the driver or database will not be cleaned up until the Connection itself is fully retired from the pool.
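The failure mode described above can be reproduced with a few lines of plain Java. This sketch uses a bare Object as a stand-in for an unclosed Statement; once the only strong reference is dropped, a GC cycle can clear the WeakReference, leaving nothing for a tracker to close (System.gc() is only a hint, hence the retry loop):

```java
import java.lang.ref.WeakReference;

public class WeakStatementDemo {
    // Returns true if the weak reference was cleared by GC, i.e. the
    // tracker would have lost its handle on the unclosed "Statement".
    static boolean statementLostAfterGc() {
        Object statement = new Object(); // stand-in for an unclosed Statement
        WeakReference<Object> tracked = new WeakReference<>(statement);

        statement = null; // application abandons it without calling close()

        // Simulate GC pressure; retry because System.gc() is only a hint.
        for (int i = 0; i < 50 && tracked.get() != null; i++) {
            System.gc();
        }
        return tracked.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("lost track of statement: " + statementLostAfterGc());
    }
}
```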

Still, even with that possibility, running without it is certainly more destabilizing than employing it; i.e., it is better than nothing.

-Brett

nbabc...@gmail.com

Apr 26, 2017, 3:39:20 PM
to HikariCP
Yeah v2.6.1 could be just the fix I'm looking for, excellent.

And it looks like I forgot to click on the "wiki" tab when I did the GitHub search for StatementFinalizer. :blush:

I think the HikariCP wiki is a treasure trove of information. IMO the lowest-hanging fruit would be to improve the home page, as by default "Pool Analysis" is only displayed after clicking "show more". Potentially the only thing needed is to have the articles categorized on the home page.

Nick Babcock

Apr 27, 2017, 10:32:02 PM
to hika...@googlegroups.com
Re-ran the benchmark with the following modifications:

- HikariCP updated to v2.6.1
- Removed setConnectionTestQuery() from HikariCP so it defaults to Connection.isValid()
- Added the ConnectionState and StatementFinalizer JDBC interceptors to Tomcat
- Enabled the disposable connection facade in Tomcat

The benchmarks have significantly swung in HikariCP's favor. I will publish a post with a detailed breakdown in the next few days and do all the necessary linking and updating so people aren't confused by the previous blog post.


Brett Wooldridge

Apr 28, 2017, 1:59:42 AM
to HikariCP
Wow!  Thanks, Nick!

Excited to see how HikariCP fared.

-Brett

Nick Babcock

Apr 29, 2017, 8:04:04 PM
to HikariCP
I think I was premature in concluding that the benchmarks significantly swung in HikariCP's favor. HikariCP definitely improved, but from the following graphs it doesn't look like a landslide by any means.

The first two graphs show the change in response times between the benchmark runs (mean and 99th percentile). A negative value means that the response times decreased (improved).


Notice that both HikariCP and Tomcat showed improvements, but this may be because Jetty (the web server) and JDBI (the SQL abstraction over the connection) were upgraded as well. On average, though, it does appear that HikariCP improved more than Tomcat.


Below are the absolute mean and 99th percentile response times.



HikariCP seems to edge out Tomcat, but the difference is oftentimes minimal.

The most peculiar graph may be seen where the number of threads the server uses to service requests is varied:



HikariCP tends to have a much larger 99th percentile response time when contention is present (i.e., many request threads contending for a smaller number of db pool connections).


Not exactly sure what to make of this, so I figured I'd let you comment first before I start jumping to any more conclusions.

Brett Wooldridge

May 1, 2017, 11:56:41 AM
to hika...@googlegroups.com
Hi Nick,

Thanks for the update.  It took me a while to digest, and ponder the dynamics.  I apologise for the possibly rambling response.  I'll say it again, benchmarking is hard -- more of an art than a science.  Ok, diving in...

Don't take this as a criticism, but pool sizes of 1 or 2 are possibly of academic interest but probably not so much in practical applications.  Looking at the original results' "Slowest 5 average time", which I suspect didn't change much on the revised run, HikariCP held 4 of the 5 slots, and 4 of the 5 slots included pools of size 1 (Tomcat holding one of them).  I'm not too surprised, and at the same time, I'm also not particularly interested.  I would be more interested in that ranking with pool sizes of 4 and above.  Still probably below the recommended size for the machine's capabilities, but getting into the realm of sizes that users might consider deploying.

Tabling that for now, let's step back and look at the larger dynamics at play.  Citing my own article (itself just summarizing some basic CompSci):

It is a basic Law of Computing that given a single CPU resource, executing A and B sequentially will always be faster than executing A and B "simultaneously" through time-slicing. Once the number of threads exceeds the number of CPU cores, you're going slower by adding more threads, not faster.
...
It is not quite as simple as stated above...because threads become blocked on I/O, we can actually get more work done by having a number of connections/threads that is greater than the number of physical computing cores.

This applies not only to databases/connections, but any software.  A related law of computing is that the performance of a processing pipeline is dictated by the slowest component in the chain.

In this case, we can envision three components in the stack:

Jetty <--> HikariCP <--> PostgreSQL

I'm going to go out on a limb and say that the formula specifying that PostgreSQL will achieve nearly optimal performance at ~(cores * 2) is likely correct.  On your 4-core host, this means that PostgreSQL will be near its peak performance when handling ~8 simultaneous requests -- closer to 4 if the query set fits in memory.  A pool is just a proxy for the database, and per the same formula, peak performance is likely to be achieved (regardless of pool) at somewhere between a pool size of 4-8.  I assume those numbers are also affected by the co-resident Jetty, for which PostgreSQL does not account in the formula they provided.
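The sizing heuristic referenced here is, roughly, connections ≈ (core count × 2) + effective spindle count; the spindle term is part of the published formula, not something measured in this thread. A tiny sketch of the arithmetic:

```java
public class PoolSizing {
    // Heuristic: pool_size ~= (core_count * 2) + effective_spindles
    static int suggestedPoolSize(int coreCount, int effectiveSpindles) {
        return coreCount * 2 + effectiveSpindles;
    }

    public static void main(String[] args) {
        // The 4-core benchmark host; with the dataset fitting in memory the
        // effective spindle count is ~0, landing near the ~8 mentioned above.
        System.out.println(suggestedPoolSize(4, 0)); // prints 8
    }
}
```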

The next question naturally is: at what concurrency does Jetty reach its peak performance?  Taking the database out of the equation, I'll make an educated guess based on the above.  Given that Jetty is largely non-blocking and therefore bounded primarily by CPU, my educated guess would be 4 threads.  Reinjecting the database into the mix, and noting that the database is clearly the slowest component in the pipeline, whenever a Jetty thread is blocked on the database, Jetty can likely make forward progress parsing an incoming request or an outgoing response if another thread is available.  But that thread pool too will have a sweet spot, below and above which total throughput drops off.  My educated guess for Jetty, in this case, with 4 cores and little blocking except for the database, would be between 8-16 threads (6-12 if I had money riding on it).  Buuut, given that Jetty is splitting 4 cores with PostgreSQL, that number again may be closer to 4 threads.

On this hardware, 4 cores, I also consider 32 and 64 Jetty threads academic.  Interesting, but I wouldn't judge either pool by it (or Jetty for that matter), because I would consider such a deployment naive.

Returning to the tabled "Slowest 5 average time" and looping in the "Slowest 5 by 99th percentile in response" from the original article: HikariCP is designed for high performance under contention, but that contention is not unbounded.  Specifically, the contention design target is N+1 to 2*N, where N is the maximum pool size -- and the maximum pool size is set roughly according to the PostgreSQL-provided formula outlined in the pool sizing article.  I am not terribly surprised that 4 of the 5 slots in the "Slowest 5 by 99th percentile in response" involved 64 Jetty threads.  It is outside of the design envelope of HikariCP, and in general, the further you get from that envelope, the larger the negative impact.  I would also expect HikariCP to be more negatively affected than Tomcat at, for example, 128 Jetty threads (on a 4-core CPU).

Having thought about the test over the past few days, I'm frankly surprised that HikariCP shows much advantage at all in the places that it does.  Why?  In my article "Down the Rabbit Hole", I start by saying:

If you think of performance, and of connection pools, you might be tempted into thinking that the pool is the most important part of the performance equation. Not so clearly so. The number of getConnection() operations in comparison to other JDBC operations is small. A large amount of performance gains come in the optimization of the "delegates" that wrap Connection, Statement, etc.

The benchmark harness comes close to a simple getConnection()/close(), with the addition of a single query.  Tomcat runs a getConnection() in approximately 400 nanoseconds, and HikariCP in roughly 22.  Seemingly a large difference, but minuscule in comparison to a several-millisecond query.  The more [JDBC] interactions that occur between getConnection() and close(), the more advantage HikariCP will show.  It is an anecdotal user report, but I'll still point to it as supporting that claim:

We're testing HikariCP at the client and have had great initial success - an application loading 1 million records over multiple HTTP threads and putting them in the DB had its run time cut by 70% after moving from Tomcat CP to Hikari CP!
 
Now we are having an issue with a new application. The application is a batch process that launches N threads. Each thread starts a transaction and ETLs some data from a few tables to some other tables. This application slowed down markedly after changing the connection pool to Hikari from Tomcat.
...
This was a bug in our side, using some unrelated non-threadsafe code. No issue. After fixing the bug, the code runs about 2x faster using HikariCP than Tomcat CP.
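The back-of-the-envelope arithmetic behind "minuscule in comparison to a several-millisecond query" can be made explicit. A sketch, assuming a 5 ms query purely for illustration (the 400 ns and 22 ns getConnection() figures are the ones quoted above):

```java
public class PoolOverhead {
    public static void main(String[] args) {
        double queryNs  = 5_000_000.0; // assumed 5 ms query, for illustration
        double tomcatNs = 400.0;       // Tomcat getConnection(), per the thread
        double hikariNs = 22.0;        // HikariCP getConnection(), per the thread

        // Both overheads are a vanishing fraction of a single query.
        System.out.printf("Tomcat:   %.5f%% of the query%n", 100 * tomcatNs / queryNs);
        System.out.printf("HikariCP: %.5f%% of the query%n", 100 * hikariNs / queryNs);
    }
}
```

With many JDBC calls per borrowed connection, the wrapped-delegate overhead accumulates, which is where the gap between pools starts to matter.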


I don't have any issue with the accuracy of the results, certainly not the new ones, and not particularly even the original ones taken as a whole.  They are what they are.  I do think the workload, i.e. the one-request/one-query workload, is a bit overly simplistic to serve as guidance to users trying to select a pool for production applications.  Given that the same query is run on each request, the page cache contains the entirety of the result, removing database I/O from the equation and resulting in the optimal PostgreSQL/pool size being essentially equal to the number of cores.

The conclusion that "Neither HikariCP or Tomcat were the clear winner. While HikariCP had the best performance, it also had the worst performance depending on configuration" also includes the aforementioned pool size of 1, and Jetty thread-pool sizes of 32 and 64, which on 4-core hardware under a load with no blocking I/O is likely four to eight times optimal.

The question that I, and future readers of your results, would likely find most interesting if answered, is:

What are the Top 5 Pool/Jetty size combinations for each pool (HikariCP and Tomcat) that provide:
  • The highest throughput (req/min)
  • Lowest mean response time
  • Lowest 99% response time
And for each of the above, what is the delta between the two (HikariCP vs. Tomcat)?

I believe from that, hopefully, a more definitive conclusion can be drawn.

-Brett

p.s. It would be truly awesome if you could create a GitHub repo with your test harness and associated scripts.  I would love to run permutations on our 64-core server, with the additional axis of varying the available cores over the test.

Nick Babcock

May 1, 2017, 2:55:15 PM
to hika...@googlegroups.com
Thanks for the update.  It took me a while to digest, and ponder the dynamics.  I apologise for the possibly rambling response.  I'll say it again, benchmarking is hard -- more of an art than a science

Yeah you're telling me! I didn't choose the title "The Difficulty of Performance Evaluation of HikariCP in Dropwizard" for nothing ;)

A lot of the numbers for connection pools and server request pools don't make sense for the hardware (e.g. 1 and 2 connections / 64 request threads). This is done on purpose so that people can get the whole picture. What's the sweet spot for some may not be the same for others. I think it's especially important when working with a framework like Dropwizard that picks defaults for you. The defaults chosen can be asinine depending on the machine and workload. I have a sneaking suspicion that most users don't change these defaults either. What one may consider naive, another might consider convenient and plenty performant. I was planning on using the reported numbers to advocate for documentation that asks users to override the defaults if they wish for maximum performance.

I don't have any issue with the accuracy of the results, certainly not the new ones, and not particularly even the original ones taken as a whole.  They are what they are.  I do think the workload, i.e. the one-request/one-query workload, is a bit overly simplistic to serve as guidance to users trying to select a pool for production applications.  Given that the same query is run on each request, the page cache contains the entirety of the result, removing database I/O from the equation and resulting in the optimal PostgreSQL/pool size being essentially equal to the number of cores.
 
You're right that the one-request/one-query workload is simplistic, but it's better than nothing, and it serves as a great conversation starter. :)

There will be requests where many queries are performed: short, long, DELETE statements, UPDATE statements, etc. Even network variables can be introduced (latency from server to db and from server to client). Given infinite time, all of those variables could be accounted for, but my initial benchmark grabbed the lowest-hanging fruit.

The conclusion that "Neither HikariCP or Tomcat were the clear winner. While HikariCP had the best performance, it also had the worst performance depending on configuration" also includes the aforementioned pool size of 1, and Jetty thread-pool sizes of 32 and 64, which on 4-core hardware under a load with no blocking I/O is likely four to eight times optimal.

You're right that I should throw out pool sizes of 1 in any serious consideration; however, I may end up fighting you tooth and nail on Jetty thread-pool sizes, because the default max for Jetty is 200 and Dropwizard defaults to 1024 (though they are different thread-pool implementations).

The question that I, and future readers of your results, would likely find most interesting if answered, is:
 
What are the Top 5 Pool/Jetty size combinations for each pool (HikariCP and Tomcat) that provide:
  • The highest throughput (req/min)
  • Lowest mean response time
  • Lowest 99% response time
And for each of the above, what is the delta between the two (HikariCP vs. Tomcat)?

Yes, but I will also continue to include comparisons of HikariCP vs. Tomcat in non-optimal pool configurations, if only to stress the importance of pool configuration.

p.s. It would be truly awesome if you could create a github repo with your test harness and associated scripts.  I would love to run permutations on our 64-core server, with the additional axis of varying available cores over the test.

Yes, you're right. I'll create the project and link it here when it's in a good state.

Nick Babcock

May 1, 2017, 9:26:04 PM
to hika...@googlegroups.com
The benchmarking repo is here: https://github.com/nickbabcock/dropwizard-hikaricp-benchmark

Feel free to create issues/PRs