Lots of ssl.SSLError: ('The read operation timed out',) on Google Compute API


Matthew Will

Jul 17, 2015, 3:29:23 PM
to gce-dis...@googlegroups.com
Hi there,

Starting 2 days ago (July 15th, 2015) we have been seeing a very high number of SSL timeout errors while using the Google Compute API via libcloud.

This is causing almost all of our automated deployments to fail. We are seeing this from Linux and OS X, both when running the deployment from our internal networks and from Travis CI build agents (I believe these are hosted on AWS).

We have confirmed that this is not related to a code change we've made: code from last week that used to succeed 100% of the time is now failing 100% of the time with ssl.SSLError: ('The read operation timed out',).

We're seeing this error on lots of different API calls (create node, get static ip, create forwarding rules, create network, ...). We are also seeing it in the us-central1 region and in the europe-west1 region.


We are at a loss as to what could be causing this; the Google Cloud status page doesn't report any issue.

Has anyone else seen this? How do you work around it?

--
Matthew Will
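
A common stopgap for transient read timeouts like the one described above is to wrap each API call in a retry with exponential backoff. The sketch below is only an illustration, not something posted in this thread: `call_with_retry` and its parameters are hypothetical names, and it assumes the failing call is safe to repeat. Retrying a create call that actually succeeded server-side may report a duplicate-resource error on the second attempt.

```python
import random
import socket
import ssl
import time


def call_with_retry(func, *args, attempts=5, base_delay=1.0, max_delay=30.0,
                    sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying on SSL/socket timeouts.

    Hypothetical helper: waits base_delay * 2**attempt seconds (capped at
    max_delay, with random jitter) between attempts, and re-raises the
    last error once all attempts are used up.
    """
    for attempt in range(attempts):
        try:
            return func(*args, **kwargs)
        # Python 2 reports the read timeout as ssl.SSLError; Python 3
        # usually reports it as socket.timeout, so both are caught here.
        except (ssl.SSLError, socket.timeout):
            if attempt == attempts - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter keeps parallel build agents from retrying in lockstep.
            sleep(delay * random.uniform(0.5, 1.0))
```

With an existing libcloud driver object (assumed here to be called `driver`), a read-only call could then be written as `nodes = call_with_retry(driver.list_nodes)`.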

Hai Wu

Jul 21, 2015, 9:02:27 AM
to gce-dis...@googlegroups.com
Sometimes I am getting the same error as this one. It happens randomly. 

Have you been able to find out the root cause for this? 

Matthew Will

Jul 21, 2015, 9:08:07 AM
to gce-dis...@googlegroups.com
No. It has increased in frequency to the point that automated builds on Travis CI interacting with Google's Cloud API fail constantly (along with the normal backendError and internalError responses we've become accustomed to).

Hai Wu

Jul 22, 2015, 7:37:54 PM
to gce-discussion, matthe...@points.com
I am getting almost the same error from time to time. This really sucks:

ssl.SSLError: The read operation timed out

Matthew Will

Jul 23, 2015, 8:30:44 AM
to gce-discussion, haiw...@gmail.com
Are you noticing these on any particular calls? Yesterday was quite frustrating. Pretty much any orchestration we attempted timed out at some point or another, possibly in connection with GCE firewall API calls.

--
Matthew Will

Faizan (Google Cloud Support)

Jul 24, 2015, 2:05:27 PM
to gce-discussion, matthe...@points.com, haiw...@gmail.com, matthe...@points.com
Hello Matthew,

Can you send me your project number and a traceroute to the Google API via private message? You can add the following to your Travis build config:
 
script:
- traceroute [google apis domain]

Thanks

Faizan
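
For anyone collecting the trace outside of a Travis build, the same capture can be scripted. This is a rough sketch under stated assumptions: the function names are hypothetical, the hostname is left as a parameter because the thread does not spell it out, and it relies on the `traceroute` (Linux, OS X) or `tracert` (Windows) binary being installed.

```python
import platform
import subprocess


def traceroute_command(host, system=None):
    """Return the argv for a route trace on the given OS.

    `system` defaults to platform.system(); Windows ships `tracert`,
    while Linux and OS X use `traceroute`.
    """
    system = system or platform.system()
    if system == "Windows":
        return ["tracert", host]
    return ["traceroute", host]


def run_traceroute(host, timeout=120):
    """Run the trace and return its combined stdout/stderr as text."""
    result = subprocess.run(
        traceroute_command(host),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        timeout=timeout,
    )
    return result.stdout
```

Calling `run_traceroute(host)` with the API hostname returns text that can be pasted into a private reply together with the project number.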

hai wu

Jul 27, 2015, 10:57:01 AM
to Matthew Will, gce-discussion
For me, it always fails when trying to create a new instance from
some image via Ansible. I tried creating it in zones us-central1-a
and us-central1-b, and it just randomly fails with that error in
either case. But other times it works, so it is very confusing.

Jean Mertz

Jul 29, 2015, 7:06:27 AM
to gce-discussion, matthe...@points.com, haiw...@gmail.com
I'm having the same issue. SSH connections keep timing out randomly. Sometimes a machine works, then I exit and want back in, and I get timeouts. Pinging the IP also times out.

This is in the europe-west1-d zone. I tried several zones in Europe, and am about to start testing in a US region.

The developer console shows the machine as up, and using the browser-based SSH panel works as expected.

P.S. I just tried again, and the server is available again (i.e., I can SSH from my local machine). This has been happening for two days now, and I tried SSH'ing from multiple locations to rule out local connectivity issues. All locations are within the Netherlands though, so perhaps there is a central network node causing this issue?

Alex Smith

Jul 29, 2015, 4:01:12 PM
to gce-discussion, matthe...@points.com, haiw...@gmail.com, je...@mertz.fm
I've been having the same problem targeting us-central1-a.  Sometimes it times out gathering inventory information, other times it fails creating new instances.  As of today I can't create a single instance without hitting an SSL timeout.

Joe Toscano

Jul 29, 2015, 8:43:27 PM
to gce-discussion, matthe...@points.com, haiw...@gmail.com, je...@mertz.fm, al...@telnyx.com
Just adding my voice here: I'm also experiencing timeouts which render my automated processes essentially useless. Tried from various machines, same result. I'm seeing the timeouts with different calls, such as listing all images, or creating a new node.

Using apache-libcloud 0.17.0

Joe

Scott Van Woudenberg

Jul 29, 2015, 11:18:13 PM
to Joe Toscano, gce-discussion, matthe...@points.com, haiw...@gmail.com, je...@mertz.fm, al...@telnyx.com
Hi folks,

Thanks for continuing to report in on this issue. We're working to track down the specific ingress point that is creating this issue. As Faizan requested, if you are encountering the issue, please send Faizan and me a traceroute of the connection to the Google API endpoint (please send it off-list so you can include your project ID).

Thanks in advance,

-ScottVW

----
Scott Van Woudenberg
Product Manager
Google Compute Engine


--
© 2014 Google Inc. 1600 Amphitheatre Parkway, Mountain View, CA 94043
 
Email preferences: You received this email because you signed up for the Google Compute Engine Discussion Google Group (gce-dis...@googlegroups.com) to participate in discussions with other members of the Google Compute Engine community and the Google Compute Engine Team.
---
You received this message because you are subscribed to the Google Groups "gce-discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gce-discussio...@googlegroups.com.
To post to this group, send email to gce-dis...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gce-discussion/c291de5c-babd-4721-b730-a898ea8089de%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Faizan (Google Cloud Support)

Aug 7, 2015, 7:00:15 PM
to gce-dis...@googlegroups.com, j...@replicated.com, matthe...@points.com, haiw...@gmail.com, je...@mertz.fm, al...@telnyx.com
We identified an issue with our frontend infrastructure that routes requests to GCE APIs. This problem was a side effect of a recent change intended for overall optimization. Because the error rates were sparse, the problem was not obvious initially. We started testing and rolling out a fix on 08/04 and it was complete by 08/06.

I hope that helps.

Faizan

hai wu

Aug 7, 2015, 10:38:52 PM
to Faizan (Google Cloud Support), gce-discussion, j...@replicated.com, Matthew Will, je...@mertz.fm, al...@telnyx.com
Awesome news! Thank you!!
