Sporadic connection errors (tcp i/o timeout; cannot fetch oauth token; connection refused)


David Goodwin

Sep 28, 2017, 9:12:15 AM
to Google Cloud SQL discuss
Hi,

We're using the Cloud SQL Proxy (gcr.io/cloudsql-docker/gce-proxy:1.10) to connect a PHP application running in Google Container Engine to a Cloud SQL MySQL instance.

For the most part it's working fine, but every so often we see errors like:

SQLSTATE[HY000] [2002] Connection refused

from the PHP application.

We're also seeing:

2017/09/28 08:04:51 couldn't connect to "<project-name>:europe-west2:<mysql instance name>": Post https://www.googleapis.com/sql/v1beta4/projects/<project-name>/instances/<mysql instance name>/createEphemeral?alt=json: oauth2: cannot fetch token: Post https://accounts.google.com/o/oauth2/token: dial tcp: i/o timeout

in the sqlproxy log.

Any suggestions?

thanks,

David.

Carlos (Cloud Platform Support)

Sep 28, 2017, 3:00:03 PM
to Google Cloud SQL discuss
Hi David,

The fact that the issue is intermittent will make it difficult to troubleshoot. Your issue is similar to the ones discussed here [1][2]. In one of them the issue was transient; in the other it was solved by upgrading the cluster. So make sure you are using the latest node images and Kubernetes releases.

One debugging step could be to create another cluster and see if the timeouts happen at the same time in both. To isolate networking issues, you could spin up the new cluster in the region closest to your SQL instance.
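To see whether failures line up in time across clusters, each cluster needs a timestamped record of when connections fail. Below is a minimal sketch of such a probe in Python. It assumes the proxy listens on 127.0.0.1:3306 (the thread does not state the address), and the names `probe`, `format_result` and `run` are hypothetical. The timestamp layout mirrors the one seen in the sqlproxy log; using UTC is an assumption.

```python
import socket
import time
from datetime import datetime, timezone


def probe(host, port, timeout=2.0):
    """Try one TCP connect; return None on success or the error text on failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return None
    except OSError as exc:  # covers refused connections and timeouts
        return f"{type(exc).__name__}: {exc}"


def format_result(when, error):
    """Render one probe result with the same timestamp layout as the proxy log."""
    stamp = when.strftime("%Y/%m/%d %H:%M:%S")
    status = "ok" if error is None else "FAIL " + error
    return f"{stamp} {status}"


def run(host="127.0.0.1", port=3306, interval=5.0, count=None):
    """Probe repeatedly; host and port are assumed values, adjust to your setup."""
    done = 0
    while count is None or done < count:
        print(format_result(datetime.now(timezone.utc), probe(host, port)), flush=True)
        done += 1
        if count is None or done < count:
            time.sleep(interval)
```

Calling `run()` in a container next to the proxy in each cluster would give two logs whose FAIL lines can be compared by timestamp. Note that this only checks that the proxy port accepts TCP connections; it does not exercise the proxy's token fetch or the MySQL handshake.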



David Goodwin

Sep 29, 2017, 7:10:49 AM
to Google Cloud SQL discuss
Hi Carlos,

The cluster and SQL instance are in the same region (europe-west2), although the containers are mostly in zone europe-west2-a and the SQL instance is in europe-west2-b.

The cluster is v1.6.10 already. I'll investigate upgrading it to 1.7 on a test environment first.

thanks
David. 

Carlos (Cloud Platform Support)

Sep 29, 2017, 5:30:32 PM
to Google Cloud SQL discuss

Hi David,


That is a good approach. It will let you rule out the node images and the cluster itself as the cause.

If the connection timeouts still happen in the new cluster, try to correlate them with the ones in the current cluster. If they happen at the same time, the issue could be in the Cloud SQL server or the network.
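One way to do that correlation is to pull the error timestamps out of each cluster's sqlproxy log and look for pairs that fall close together. The following is a rough Python sketch, assuming each log line starts with a `YYYY/MM/DD HH:MM:SS` timestamp as in the error quoted earlier in this thread, and that both logs use the same time zone. The function names and the 30-second window are arbitrary choices for illustration.

```python
import re
from datetime import datetime, timedelta

# Leading timestamp as it appears in the sqlproxy log, e.g. "2017/09/28 08:04:51 ..."
STAMP = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) ")


def error_times(lines, needle="couldn't connect"):
    """Return the timestamps of log lines that contain the error text."""
    times = []
    for line in lines:
        match = STAMP.match(line)
        if match and needle in line:
            times.append(datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S"))
    return times


def coincident(times_a, times_b, window=timedelta(seconds=30)):
    """Return (a, b) pairs of error times that lie within `window` of each other."""
    return [(a, b) for a in times_a for b in times_b if abs(a - b) <= window]
```

If `coincident` returns many pairs for the two clusters, that points toward the Cloud SQL server or the network; if the errors never overlap, the cause is more likely local to one cluster.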

