Hello Brett!
Firstly, congratulations on releasing this great library! I've seen great results reported by many users and was excited to have a fantastic experience myself. I downloaded HikariCP 2.4.6 from Maven Central and started using it to replace the ages-old Apache DBCP 1.4 that we were using.
Let me first give you the context: we have a nightly batch process in our business that runs 250 batch jobs from 7:00 PM to 7:00 AM, and one of those jobs reads, processes, and writes data across roughly 38-40 tables. Since this batch consistently took 4 hours 35 minutes with Apache DBCP, we decided to try HikariCP as an alternative, as it is fast, has great features, and is very well reviewed by users! We use Spring Batch's partitioning strategy to kick off 30 parallel instances of the same job: there is only one Java process, but 30 internal threads, each processing a subset of the records. We are on IBM JDK 8.5.5 with Java 7, and the batch is launched from a shell script.
Problem: We are trying to process 10,000 records with a commit count of 200, meaning that after every 200 records are read and processed, they are committed to the DB in a single transaction. The batch always processes around ~9,500 records and then hangs. At that point it does nothing: no JVM activity, no database processing, and HikariCP throws this error:
Caused by: java.sql.SQLException: Connection is closed
at com.zaxxer.hikari.pool.ProxyConnection$ClosedConnection$1.invoke(ProxyConnection.java:468)
We reduced the commit interval to 100 and the batch ran perfectly fine; we even gained 15 minutes of batch processing time with HikariCP! We then changed the commit interval back to 200 and re-ran the batch. Our database sits behind a firewall and currently has SQLNET.EXPIRE_TIME set to 10 minutes.
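For context, the chunk/commit arithmetic we rely on boils down to the sketch below. This is plain Java, not the actual Spring Batch API; the class and method names are illustrative only. It just shows why a commit count of 200 over 10,000 records means 50 transactions, versus 100 transactions at a commit count of 100:

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkCommitDemo {

    // Split the records into chunks of `commitCount`; each chunk would be
    // written and committed in one transaction. (Illustrative names, not
    // the real Spring Batch chunking implementation.)
    public static List<List<Integer>> chunk(List<Integer> records, int commitCount) {
        List<List<Integer>> chunks = new ArrayList<>();
        for (int i = 0; i < records.size(); i += commitCount) {
            chunks.add(records.subList(i, Math.min(i + commitCount, records.size())));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            records.add(i);
        }
        // 10,000 records with a commit count of 200 -> 50 commits
        System.out.println(chunk(records, 200).size()); // prints 50
        // 10,000 records with a commit count of 100 -> 100 commits
        System.out.println(chunk(records, 100).size()); // prints 100
    }
}
```

The larger commit count means each transaction (and therefore each connection) is held roughly twice as long, which is why I suspect the interaction with connection timeouts.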
We are using Spring Batch with the JSR 352 approach, and below is our Spring XML configuration:
<bean id="hikariConfig" class="com.zaxxer.hikari.HikariConfig">
    <property name="poolName" value="springHikariPool" />
    <property name="autoCommit" value="false" />
    <property name="maximumPoolSize" value="30" />
    <property name="jdbcUrl" value="${batch.jdbc.url}" />
    <property name="maxLifetime" value="600000" /> <!-- 10 mins -->
    <property name="idleTimeout" value="300000" /> <!-- 5 mins -->
    <property name="minimumIdle" value="10" />
    <property name="dataSourceProperties">
        <props>
            <prop key="driverClassName">${batch.jdbc.driver}</prop>
            <prop key="user">${batch.jdbc.user}</prop>
            <prop key="password">${batch.jdbc.password}</prop>
        </props>
    </property>
</bean>
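In case it matters, I also considered the alternative form below, with the driver class and credentials as top-level HikariConfig properties rather than inside dataSourceProperties. This is only my understanding of the property names from the HikariCP documentation, so please correct me if it is wrong:

```xml
<!-- Alternative form I considered (hypothetical; same placeholders as above) -->
<bean id="hikariConfig" class="com.zaxxer.hikari.HikariConfig">
    <property name="poolName" value="springHikariPool" />
    <property name="jdbcUrl" value="${batch.jdbc.url}" />
    <property name="driverClassName" value="${batch.jdbc.driver}" />
    <property name="username" value="${batch.jdbc.user}" />
    <property name="password" value="${batch.jdbc.password}" />
</bean>
```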
Doubt 1: We use log4j for logging, and to satisfy Hikari's dependency we have slf4j-log4j12-1.7.7.jar and slf4j-api-1.7.7.jar on the classpath. In our log4j.properties we have log4j.logger.com.zaxxer.hikari=DEBUG set, but I never see ANY logs from HikariCP in my logging. Below is the excerpt from log4j.properties:
log4j.rootLogger = ALL, stdout, debug, exception
layoutPattern=%d{dd MMM yyyy HH:mm:ss,SSS} %-5p [%l] - %m %n
otherLayoutPattern=%d{dd MMM yyyy HH:mm:ss SSS}, %m %n
log4j.logger.com.zaxxer.hikari=DEBUG
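In case it is relevant, a minimal console appender setup like the one below is what I would expect to route the Hikari DEBUG output to stdout. The appender name and layout here are illustrative, not our exact production config:

```properties
# Hypothetical minimal setup: a console appender plus the Hikari logger
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{dd MMM yyyy HH:mm:ss,SSS} %-5p [%l] - %m%n
log4j.logger.com.zaxxer.hikari=DEBUG
```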
Doubt 2: To understand the pool usage (idle, active, in use), I thought of enabling HikariCP's registerMbeans property and viewing the pool in JConsole, since it is exposed as a JMX MBean. However, the moment I set the property to "true", I got an error saying "java.lang.IllegalArgumentException: Pool with the same name already exists". I only have one batch job running at the moment, so I don't know why it would give this error. Is it because we are running 30 internal threads?
Doubt 3: We think the issue could be that the firewall is closing connections after 30 minutes (the current firewall timeout). We set maximumPoolSize to 30, maxLifetime to 10 minutes, and idleTimeout to 5 minutes. Is there any other setting you could suggest that would let the batch run to completion without hanging?
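For reference, these are the timeout-related properties we are experimenting with to keep every connection well under the 30-minute firewall window. I am not sure this combination is right, so it is a sketch of our current thinking rather than a confirmed fix:

```xml
<!-- Hypothetical tuning sketch: retire and validate connections well
     before the 30-minute firewall timeout can kill them. -->
<property name="maxLifetime" value="600000" />      <!-- 10 min, below firewall's 30 min -->
<property name="idleTimeout" value="300000" />      <!-- 5 min for idle connections -->
<property name="connectionTimeout" value="30000" /> <!-- fail fast after 30 s waiting for a connection -->
<property name="validationTimeout" value="5000" />  <!-- 5 s to validate a connection before handing it out -->
```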
Doubt 4: Whenever I set maximumPoolSize to 30, it somehow creates 60 connections; I verified this with GV$SESSION. It doesn't make any sense to me, since it should never go beyond 30. To be doubly sure, I changed the setting to 60 and GV$SESSION showed 120! Can you explain this behavior a bit, please?
Doubt 5: When we set leakDetectionThreshold to 30000 (30 seconds), I assumed it would give me a stack trace for each thread that is a leak suspect. What I found instead is that it always prints a stack trace pointing into Spring's transaction package (the doBegin method, to be precise), where our application begins the transaction. Is there a way to print a more detailed trace to identify the real source of the leak?
Hope this information is sufficient for you to help us out. I am really, really excited about making this work and adopting HikariCP as a permanent library in our application!
Thanks,
Yogendra