Hi ProxySQL team,
We are adopting ProxySQL as a Kubernetes sidecar to mediate between our Java application (using MySQL Connector/J) and AWS Aurora for read-write separation. During testing, we observed that the application behaves differently when it fails to establish new connections, depending on whether ProxySQL is in use.
Without ProxySQL: The application encounters a "connection timed out" error once the "connectTimeout" duration elapses when Aurora is slow to accept new connections, and it promptly receives a "connection refused" error when the Aurora instance is down, such as during a failover.
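For context, this is roughly how the application connects directly to Aurora (the endpoint, credentials, and database below are placeholders, not our real settings):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class DirectAuroraConnect {
    public static void main(String[] args) {
        // connectTimeout is Connector/J's connect timeout in milliseconds.
        String url = "jdbc:mysql://aurora-cluster.example.com:3306/appdb"
                + "?connectTimeout=5000";
        try (Connection c = DriverManager.getConnection(url, "app_user", "app_pass")) {
            System.out.println("connected");
        } catch (SQLException e) {
            // Slow instance -> "connection timed out" once connectTimeout elapses.
            // Down instance -> "connection refused" almost immediately.
            System.out.println("connect failed: " + e.getMessage());
        }
    }
}
```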
With ProxySQL: The application gets a "Max connect timeout reached while reaching hostgroup xxx" error in both scenarios, after the duration specified by "connect_timeout_server_max".
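We confirmed this with a small timing harness along these lines (the sidecar address, credentials, and database are placeholders for our setup):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class ProxySQLConnectTiming {
    public static void main(String[] args) {
        // The ProxySQL sidecar listens on the pod-local MySQL port.
        String url = "jdbc:mysql://127.0.0.1:6033/appdb?connectTimeout=5000";
        long start = System.nanoTime();
        try (Connection c = DriverManager.getConnection(url, "app_user", "app_pass")) {
            System.out.println("connected");
        } catch (SQLException e) {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            // Whether the Aurora instance is slow or down, the failure arrives as
            // "Max connect timeout reached while reaching hostgroup ..." and the
            // elapsed time tracks connect_timeout_server_max, not connectTimeout.
            System.out.printf("failed after %d ms: %s%n", elapsedMs, e.getMessage());
        }
    }
}
```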
This discrepancy creates a contradiction between two of our requirements:
1. The application should fail fast during an Aurora failover to prevent a surge of new connection requests while the traffic volume remains constant. Therefore, the "connect_timeout_server_max" parameter needs to be minimized.
2. The application should be able to tolerate a performance degradation in Aurora up to a certain threshold, which pushes the timeout in the other direction. Our current JDBC "connectTimeout" setting is 5 seconds, which is already too long for the failover scenario.
I understand that ProxySQL manages two sets of connections, and that establishing frontend connections is typically stable, so it makes sense that ProxySQL doesn't return a "connection refused" error. However, I'm curious why it still waits out the full "connect_timeout_server_max" duration when an Aurora instance is down, even though I have set "connect_retries_on_failure" to 0. Are there any other parameters that can adjust this behavior?
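For reference, this is how we apply the relevant variables through the admin interface (the 127.0.0.1:6032 address, default admin credentials, and the 1000 ms value are assumptions for illustration; we normally run the same statements from the mysql client):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ApplyProxySQLTimeouts {
    public static void main(String[] args) throws Exception {
        // ProxySQL admin interface; default port and credentials assumed.
        String url = "jdbc:mysql://127.0.0.1:6032/main";
        try (Connection c = DriverManager.getConnection(url, "admin", "admin");
             Statement s = c.createStatement()) {
            // Disable retries so a dead backend should, in theory, fail fast.
            s.executeUpdate("UPDATE global_variables SET variable_value = '0' "
                    + "WHERE variable_name = 'mysql-connect_retries_on_failure'");
            // Hypothetical 1000 ms cap on total backend connect time.
            s.executeUpdate("UPDATE global_variables SET variable_value = '1000' "
                    + "WHERE variable_name = 'mysql-connect_timeout_server_max'");
            // Activate and persist the new values.
            s.execute("LOAD MYSQL VARIABLES TO RUNTIME");
            s.execute("SAVE MYSQL VARIABLES TO DISK");
        }
    }
}
```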
Many thanks,
Yuankai