I am trying to configure the connection creation timeouts to improve response time when our database fails over.
We are using Spring Boot (2.2.1), Hikari (3.4.1), and PostgreSQL in an OpenShift 3.11 cluster. Hikari's JDBC URL points at an OpenShift Service, which itself provides connections to the master database container.
The scenario is:
1. Our master database container is evacuated from the cluster.
2. A replica database container takes over duties as master. This happens in under a second.
3. Immediately afterwards, the application requests a connection from Hikari.
4. Hikari finds that the pooled connections are invalid.
5. Hikari attempts to get a new connection from the Service, prompting the Service to fail over to the new master database container.
6. After 10s, Hikari's connection attempt times out. (This may be down to poor Service performance; the Service should fail over quickly and connect us.)
7. Hikari tries again and immediately succeeds.
I'm OK with the Service weirdness, and with the first connection attempt failing and timing out, but I don't want to wait 10s before giving up and trying again. It isn't validationTimeout, since the validity check fails immediately. I also don't want a connectionTimeout failure to send an exception straight back to the application. If I set the JDBC driver's socketTimeout to, say, 1s, then it sometimes retries every 3s (why?) but usually still retries only after 10s. Is there another timeout, possibly a 3s one, that I'm missing?
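To make the behavior I'm after concrete, here is a minimal sketch in plain Java. It is not how HikariCP works internally and it uses no HikariCP or JDBC driver API; the class name RetryWithinBudget, the method acquire, and the parameters perAttemptMs and totalBudgetMs are all hypothetical names chosen for illustration. The idea is a short per-attempt timeout inside a longer overall budget, so a hung first attempt is abandoned quickly and the caller only sees an exception if the whole budget runs out.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

final class RetryWithinBudget {

    /**
     * Hypothetical helper: runs the attempt repeatedly, giving each try at most
     * perAttemptMs, until one succeeds or totalBudgetMs has elapsed.
     */
    static <T> T acquire(Callable<T> attempt, long perAttemptMs, long totalBudgetMs)
            throws Exception {
        ExecutorService executor = Executors.newCachedThreadPool();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(totalBudgetMs);
        Exception last = new TimeoutException("no attempt was made");
        try {
            while (true) {
                long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remainingMs <= 0) {
                    break; // overall budget exhausted
                }
                Future<T> future = executor.submit(attempt);
                try {
                    // Wait for this attempt, but never longer than the remaining budget.
                    return future.get(Math.min(perAttemptMs, remainingMs), TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    future.cancel(true); // interrupt the hung attempt and try again
                    last = e;
                } catch (ExecutionException e) {
                    last = e;
                    Thread.sleep(50); // brief pause so fast failures do not spin
                }
            }
            throw last; // only now does the caller see a failure
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger calls = new AtomicInteger();
        // Simulated connection attempt: the first call hangs, later calls succeed at once.
        Callable<String> attempt = () -> {
            if (calls.incrementAndGet() == 1) {
                Thread.sleep(10_000);
            }
            return "connected";
        };
        String result = acquire(attempt, 1_000, 5_000);
        System.out.println(result + " after " + calls.get() + " attempts");
    }
}
```

In this simulation the first attempt is abandoned after about 1s rather than 10s, and the second attempt succeeds within the 5s budget. That is the shape of behavior I am hoping to get from the pool and driver timeouts.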
Any advice would be greatly appreciated.
P.S. We're loving HikariCP - such nice docs too!