HikariDataSource.getConnection is slow under load

5,966 views
Skip to first unread message

Krishna P

unread,
Jul 9, 2015, 1:10:18 PM7/9/15
to hika...@googlegroups.com
Hi,

We have a java web application using MySQL db with Hikari connection pool. Under a load of about 40 requests per sec, the getConnection calls take almost 50% of the response time.

This is with hikari 2.3.8. The max connections is set to 80. Any suggestions for bringing the delays down are appreciated.

Please let me know if you need more information.




Brett Wooldridge

unread,
Jul 9, 2015, 10:51:36 PM7/9/15
to hika...@googlegroups.com, kpadm...@gmail.com
Hi Krishna,

The time spent in getConnection() is almost certainly high because threads are blocked waiting for other threads to return connections to the pool.

This indicates one or all of the following:
  • The database is overloaded with too many simultaneous queries, which is slowing all queries down.  Unless you have a 16-core server, 80 connections almost certainly too many for the database to handle.
  • There are one or more "expensive" queries that are creating "back pressure" ... like a slow car on a highway causing traffic to queue up behind it.
  • The thread acquiring a connection is performing expensive or time-consuming work before returning the connection to the pool.  Just like the previous point, this creates a "traffic jam" of threads waiting for connections.
Typical well-tuned queries for a website with a target total response time of 250ms should generally have single-digit, or at most double-digit, execution times (ms).  For example, if a request against your database could execute in 10ms, then a single thread with a single connection could execute 40 requests in 400ms.  Or put simply, a 10ms average query execution time means ~100 req/sec. from a single thread.  Most databases should be able to maintain a relatively flat average query response time, even as the number of connections increase, up to a certain peak.  Beyond that "performance peak" number of connections, average query response time will start to increase as connections increase.

Your job as a DBA (database administrator) is to find that peak number of connections, and through query profiling/optimization and database parameter tuning try to push that number higher without degrading performance.

Looking at your graph above, it is equally interesting to me that your ServiceHandlerLogin.getLoginToken() is taking ~250ms by itself.  Additionally, it appears that your code flow is:

  1. HikariDataSource.getConnection()
  2. ServiceLoginHandler.getLoginToken()
  3. connection.commit()
  4. ...
  5. connection.close()
If this is the case, and ServiceLoginHandler.getLoginToken() is taking ~250ms, that means that each connection will be held out of the pool for a minimum of 250ms.  If all 80 connections are "in-use", this is going to cause each additional thread calling HikariDataSource.getConnection() to wait an average of 250ms.

The presence of JDBCTransaction.commit() implies to me that you are using Hibernate.  If your application server is running on a session-per-request model, it is absolutely critical that end-to-end request times are on the order of tens of milliseconds, not hundreds.  This is because the container is going to obtain a connection at the start of a request and not return it until after the request is complete.

If I were you, I would start first by analyzing your queries.  Configure MySQL to log slow queries (assuming MySQL >5.1.23):

SET GLOBAL log_queries_not_using_indexes=1
SET GLOBAL log_slow_queries=slow-queries.log
SET GLOBAL long_query_time=0.05

This will log all queries taking longer than 50ms.  If it is too noisy, set the long_query_time down to 100ms (0.1 sec).  Work down from the slowest query, analyzing the query plan, adding indexes if needed, until all of your queries are running under 50ms (maximum).  After that, I would start dialing the maximum connections in the pool back, with a goal of it down to 20 or less.

Lastly, if ServiceLoginHandler.getLoginToken() is some kind of call to an external system, you're going to have to figure out how to get that out of the middle of the HikariDataSource.getConnection() / connection.close().  That may involve disabling container managed transactions/sessions and taking control of those yourself.

Krishna P

unread,
Jul 10, 2015, 6:07:24 AM7/10/15
to hika...@googlegroups.com, kpadm...@gmail.com
Hi Brett,

Thanks for the quick response.

ServiceLoginHandler.getLoginToken() is the entry point. This procedure calls several other sub procedures, some requiring db connections, others not. For every sub procedure that needs db connection, it calls createTransaction() (via hibernate), which internally does a HikariDataSource.getConnection(). Once the sub procedure completes, the transaction is committed or rolled back. 

I am attaching the breakdown image below which shows that HikariDataSource.getConnection() is called about 7 times during one call to ServiceLoginHandler.getLoginToken(), so this probably means each getConnection() takes ~ 448/7 = 64ms. The db queries take less than 10ms each. There are no external calls in this flow. We are not using container based transactions, we are managing them inside the application. Based on this data, do you see what we need to do to improve the getConnection() latency?

The flow is as follows (logically):
  • ServiceHandlerLogin.getLoginToken()
  • doStuff1()
  • checkPropertyInDb()
  • doStuff1()
  • checkServicesInDb()
  • doStuff3()
  • checkSessionsInDb()
  • doStuff4()
  • checkUsersInDb()
  • doStuff5()
  • checkUserProfileInDb()
  • doStuff6()
  • checkNamespaceInDb()
  • doStuff7()

thanks,

Krishna

Brett Wooldridge

unread,
Jul 10, 2015, 10:34:40 AM7/10/15
to hika...@googlegroups.com, kpadm...@gmail.com

I have a couple of thoughts.


HikariCP has a "rollback-on-return" policy, which cannot be disabled because it ensures correctness.  By which I mean, imagine that a thread runs some SQL on a connection, but subsequently neither commits nor rolls back, and then returns the connection to the pool.  If the pool simply gives that connection to another thread, "bad things" can happen.  Therefore HikariCP performs a rollback when the connection is returned.  Which could be bad performance-wise, but HikariCP is smart about it -- it tracks the transaction state and skips the rollback if no SQL has been run since the last commit or rollback by the application.


Looking at the table you attached, I see the average calls/txn for getConnection() is 7.04.  However, the average calls/txn for JDBCTransaction.commit() is only 4.04.  This leads me to believe that maybe your application, or Hibernate itself is eliding a commit() in some cases -- maybe in the case of read only transactions?  If this is so, HikariCP will be led to believe that it needs to rollback -- which is a big hit even when there is nothing to actually rollback.


A simple getConnection()/close() cycle (no SQL) runs at >25000 ops/ms in HikariCP against MySQL.  But as soon as you introduce a rollback against MySQL --getConnection()/rollback()/close() -- the throughput drops to ~20 ops/ms.  Yes, that's from 25000 to 20 (like the number of fingers and toes you have).


Arguing against this theory is that I would expect Connection.close() to show up as an equal heavy-hitter in that table.


The second thought I have is "metrics".  I can see you're already using New Relic, so you know the value of metrics.  If you can (or already have) enabled Dropwizard metrics, the pool stats collected might provide some insights.  Failing that, even the basic stats collected by the default JMX MBean and charted, would be helpful.


You'll need v2.3.9, because v2.3.8 had a regression preventing the MBeans from registering properly.


Lastly, there is not much in the getConnection() code path to delay obtaining a connection except, 1) a shortage of connections, or 2) the cost of the connection validation.  As an experiment, can you set the system property com.zaxxer.hikari.aliveBypassWindow to 5000 (milliseconds)?  This would need to be set before the pool is constructed.


Thanks.


Brett Wooldridge

unread,
Jul 10, 2015, 9:19:14 PM7/10/15
to hika...@googlegroups.com, brett.wo...@gmail.com, kpadm...@gmail.com
Krishna,

I have added some additional debug level logging that is available in 2.3.10-SNAPSHOT:

<dependency>
  <groupId>com.zaxxer</groupId>
  <artifactId>HikariCP</artifactId>
  <version>2.3.10-SNAPSHOT</version>
</dependency>

Additional logging was added in paths that are most likely to cause a delay, anywhere the pool touches the connection like rollback(), setTransactionIsolation(), connection validation, etc.  Can you make a run in a test environment with debug logging turned on -- just simulating the typical sequence you outlined above?

We should probably take this over to the issue tracker, if you don't mind opening an issue, its easier to attach and format output.

Thanks,
Brett

Brett Wooldridge

unread,
Jul 20, 2015, 8:48:06 PM7/20/15
to hika...@googlegroups.com, kpadm...@gmail.com
Hi Krishna,

Any update on this issue?

Krishna P

unread,
Jul 21, 2015, 6:34:39 AM7/21/15
to hika...@googlegroups.com, kpadm...@gmail.com
Hi Brett,

Sorry for my delayed response. I did not yet get a chance to work on this completely as i needed to optimize the apis, but this is on my todo list. 

Since our existing db queries were mostly readonly, I added redis caching for the data referenced by the apis, which circumvented the issue for now.
I did try upgrading to 2.3.10-SNAPSHOT but maven could not find it, do you know what I am missing? I settled for 2.3.9 instead, and added coda hale metrics. 

I'll try to get you the data you asked for, but it may take some time.

regards,
Krishna

Krishna P

unread,
Jul 22, 2015, 3:31:26 AM7/22/15
to HikariCP
Hi Brett,

I cloned the repo and tried building 2.3.10-SNAPSHOT, but got this error during compilation, do you know what I am doing wrong?

[INFO] Building HikariCP-java6 2.3.10-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ HikariCP-java6 ---
[INFO]
[INFO] --- maven-enforcer-plugin:1.0:enforce (enforce-maven) @ HikariCP-java6 ---
[INFO]
[INFO] --- build-helper-maven-plugin:1.9.1:add-source (src) @ HikariCP-java6 ---
[INFO] Source directory: D:\github\hikari\HikariCP\hikaricp-common\src\main\java added.
[INFO]
[INFO] --- maven-resources-plugin:2.7:resources (default-resources) @ HikariCP-java6 ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO]
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ HikariCP-java6 ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 39 source files to D:\github\hikari\HikariCP\hikaricp-java6\target\classes
[INFO] -------------------------------------------------------------
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] /D:/github/hikari/HikariCP/hikaricp-common/src/main/java/com/zaxxer/hikari/proxy/StatementProxy.java:[164,22] cannot find symbol
  symbol:   method executeLargeBatch()
  location: variable delegate of type java.sql.Statement
[ERROR] /D:/github/hikari/HikariCP/hikaricp-common/src/main/java/com/zaxxer/hikari/proxy/StatementProxy.java:[160,4] method does not override or implement a method from a supertype
[ERROR] /D:/github/hikari/HikariCP/hikaricp-common/src/main/java/com/zaxxer/hikari/proxy/StatementProxy.java:[172,22] cannot find symbol
  symbol:   method executeLargeUpdate(java.lang.String)
  location: variable delegate of type java.sql.Statement
[ERROR] /D:/github/hikari/HikariCP/hikaricp-common/src/main/java/com/zaxxer/hikari/proxy/StatementProxy.java:[168,4] method does not override or implement a method from a supertype
[ERROR] /D:/github/hikari/HikariCP/hikaricp-common/src/main/java/com/zaxxer/hikari/proxy/StatementProxy.java:[180,22] cannot find symbol
  symbol:   method executeLargeUpdate(java.lang.String,int)
  location: variable delegate of type java.sql.Statement
[ERROR] /D:/github/hikari/HikariCP/hikaricp-common/src/main/java/com/zaxxer/hikari/proxy/StatementProxy.java:[176,4] method does not override or implement a method from a supertype
[ERROR] /D:/github/hikari/HikariCP/hikaricp-common/src/main/java/com/zaxxer/hikari/proxy/StatementProxy.java:[188,22] cannot find symbol
  symbol:   method executeLargeUpdate(java.lang.String,int[])
  location: variable delegate of type java.sql.Statement
[ERROR] /D:/github/hikari/HikariCP/hikaricp-common/src/main/java/com/zaxxer/hikari/proxy/StatementProxy.java:[184,4] method does not override or implement a method from a supertype
[ERROR] /D:/github/hikari/HikariCP/hikaricp-common/src/main/java/com/zaxxer/hikari/proxy/StatementProxy.java:[196,22] cannot find symbol
  symbol:   method executeLargeUpdate(java.lang.String,java.lang.String[])
  location: variable delegate of type java.sql.Statement
[ERROR] /D:/github/hikari/HikariCP/hikaricp-common/src/main/java/com/zaxxer/hikari/proxy/StatementProxy.java:[192,4] method does not override or implement a method from a supertype
[ERROR] /D:/github/hikari/HikariCP/hikaricp-common/src/main/java/com/zaxxer/hikari/proxy/PreparedStatementProxy.java:[69,44] cannot find symbol
  symbol:   method executeLargeUpdate()
  location: interface java.sql.PreparedStatement
[ERROR] /D:/github/hikari/HikariCP/hikaricp-common/src/main/java/com/zaxxer/hikari/proxy/PreparedStatementProxy.java:[65,4] method does not override or implement a method from a supertype
[INFO] 12 errors
[INFO] -------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] HikariCP-parent ................................... SUCCESS [3.569s]
[INFO] HikariCP-java6 .................................... FAILURE [7.352s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------

Brett Wooldridge

unread,
Jul 22, 2015, 3:56:11 AM7/22/15
to HikariCP, kpadm...@gmail.com
Hi Krishna,

Even for the Java 6/7 jars you must use Java 8 to build.  You can use any version at runtime, but the build itself requires Java 8.

You can also download the 2.3.10-SNAPSHOT JAR file directly from here.


Krishna P

unread,
Jul 22, 2015, 12:32:54 PM7/22/15
to HikariCP
Hi Brett,


thanks,
Krishna
Reply all
Reply to author
Forward
0 new messages