Idle Connection Testing

1,697 views
Skip to first unread message

Nick Palmer

unread,
Apr 7, 2014, 9:52:49 AM4/7/14
to hika...@googlegroups.com
Hey Brett, we've been using Hikari more as our application is getting more complete and we've started talking and wondering about recovery from connection failures and looking at timings.

We're connecting to an Oracle 11g database using ojdbc6-11.2.0.4.jar which supports JDBC4 connection testing, as far as I know. The project I work on is a web application framework, so it can go for long periods (i.e. overnight) with no connection use and during a web request connections will be borrowed and returned multiple times.

As far as timings are concerned we have seen that with jdbc4ConnectionTest=true getting a connection that has been idle in the pool for more than a second can take ~20ms just doing the connection test. If I set jdbc4ConnectionTest=false and add a connectionTestQuery="SELECT * FROM DUAL" the connection tests come down to ~8ms on average. I thought the jdbc4ConnectionTest was meant to be quicker? Are there other options I should perhaps have been considering to use along side jdbc4ConnectionTest?

Secondly, the 1000ms (https://github.com/brettwooldridge/HikariCP/blob/master/src/main/java/com/zaxxer/hikari/HikariPool.java#L157) check on last access time, could this perhaps be configurable? As my project is a web framework connections are fairly likely to sit for more than a second between requests at low use times increasing this a little could lead to better overall performance for us.

Thirdly, and related to that connection test time limit, other connection pools I have used in the past have an option to test connections while idle so that I can have a reasonable assurance that when I borrow them from the pool they will still be connected and in a usable state.
http://commons.apache.org/proper/commons-dbcp/configuration.html - DBCP for example has a "testWhileIdle" option.
If this something Hikari might get at some point or is there reasoning for it not to be included as an option?

Brett Wooldridge

unread,
Apr 7, 2014, 10:38:02 PM4/7/14
to hika...@googlegroups.com
Hi Nick,

How disappointing that Oracle's implementation of isValid() is actually less efficient than a query.  The whole idea of the isValid() method was so that vendors could, for example, implement an "is valid" check by simple bouncing a packet between the client and server rather than executing a SQL statement that goes all the way through the database stack (parser, optimizer, execution plan, etc.).  I have no idea what Oracle is doing under the covers, but if "SELECT * FROM DUAL" is empirically faster, I would suggest using that rather then the isValid() method, though I would propose "SELECT 1 FROM DUAL" instead.  Having said that, 8ms is an extremely long time for such a query.  Well tuned Oracle configurations and networking stacks should run queries like this in the 1-2ms range.

Re: background testing and the 1000ms "exception" to the alive check.  HikariCP is designed for high-performance, but not at the cost of reliability.  We will gladly perform slower in some environments in order to ensure fewer application-level errors.  Having said that, one of HikariCP's main design targets are highly active applications -- applications that sustain hundreds or thousands of transactions per minute.  HikariCP treats connections that have been used reliably within the past second as "proven" -- this is the <1000ms exception.

Once you start pushing this out to several seconds, or in the case of pools like tomcat, bonecp, dbcp, pushing it out to 30 seconds, you start to trade off reliability.  Having a thread (or threads) that tests connections every 30 seconds is pretty worthless in our opinion.  For example, this means is that if the database is rebooted, even if it comes back up in 5 seconds the pool can continue returning bad connections for the next 30 seconds.

Anyway, try "SELECT 1 FROM DUAL", and poke around how to tune your Oracle config and equally important your TCP stacks.  You should be able to get "SELECT 1 FROM DUAL" down to about 2ms.

Nick Palmer

unread,
Apr 11, 2014, 9:21:25 AM4/11/14
to hika...@googlegroups.com
Hi Brett,

Thanks for the response.
I completely forgot to mention that the timings I gave were rather slow as I was testing with app server and database server on different networks, the network lag mostly amplifying the difference between the JDBC test and a SELECT. On the same network both fall around the 1-2ms range, though the select usually gets more towards 1ms and the JDBC test towards 2ms.

I do agree with the argument for keeping the 1000ms limit small, I'd rather wait a couple of extra ms on borrowing a connection to check it's still valid than have a broken connection given out.
And a connection-test background job as well would indeed be fairly worthless, in terms of making sure connections remain valid. There was some concern here about oracle killing idle connections over night and Hikari timing out first thing in the morning when it gets attempts to grab a connection for the first time in the morning. Though checking the source it seems making sure the idleTimeout and maxLifetime parameters are sensible would cause connections to be destroyed/created via the housekeeping timer task so probably not a concern.

Brett Wooldridge

unread,
Apr 12, 2014, 2:39:30 AM4/12/14
to hika...@googlegroups.com
Yes, that is the purpose of maxLifetime.  Databases like MySQL are aggressive to kill long lived connections, and load-balancers or stateful firewalls that sometimes sit between applications and databases also tend to terminate connections after a fixed period of time.  maxLifetime should be set about 2 minutes shorter than these configured timeouts in order to reduce application delays.

Reply all
Reply to author
Forward
0 new messages