Google App Engine Issues on Sunday, June 1st 2014


Google App Engine Downtime Notify

Jun 1, 2014, 9:50:53 AM
to google-appengine...@googlegroups.com
We're investigating an issue with the Google App Engine Admin Console and deployment that began at approximately 05:00 on Sunday, 2014-06-01 (all times are in US/Pacific). We will provide more information shortly.

Google App Engine Downtime Notify

Jun 1, 2014, 10:40:35 AM
to google-appengine...@googlegroups.com
We are currently experiencing an issue with the App Engine Admin Console, and some users are seeing high latency and an elevated error rate. For everyone who is affected, we apologize for any inconvenience you may be experiencing.

We will provide an update by 8:30 AM Pacific with current details.

Google App Engine Downtime Notify

Jun 1, 2014, 11:09:33 AM
to google-appengine...@googlegroups.com
The problem with the Google App Engine admin console should be resolved as of 7:50 AM Pacific. We apologize for any issues this may have caused you or your users and thank you for your patience and continued support. Please rest assured that system reliability is a top priority at Google, and we are making continuous improvements to make our systems better.

Google App Engine Downtime Notify

Jun 5, 2014, 6:37:12 PM
to google-appengine...@googlegroups.com, google-appengine...@googlegroups.com

SUMMARY:

On Sunday 1 June 2014, some administrators of Google App Engine applications were unable to access the Admin Console or deploy new versions of their applications for 156 minutes. If your service or application was affected, we apologize. This is not the level of quality and reliability we strive to offer you, and we have taken and are taking immediate steps to improve the platform's performance and availability.


DETAILED DESCRIPTION OF IMPACT:

On Sunday 1 June 2014 from 05:00 AM to 07:36 AM US/Pacific, the App Engine Admin Console had an elevated error rate, preventing some administrators from making configuration changes or deploying new versions of their applications. During this period, 27% of requests to the Admin Console returned errors and 64% of deployments failed. Serving of requests to end users of App Engine applications was not affected by this incident.
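
As a quick consistency check on the figures above, the 156-minute duration given in the summary can be recomputed from the stated impact window. This is only an illustrative sketch in Python; the variable names are chosen here and are not part of the original report.

```python
from datetime import datetime

# Start and end of the impact window as stated in the report (US/Pacific).
# Naive datetimes are sufficient: both times fall on the same day and no
# daylight-saving transition occurs on 1 June.
start = datetime(2014, 6, 1, 5, 0)
end = datetime(2014, 6, 1, 7, 36)

# Length of the window in whole minutes.
duration_minutes = int((end - start).total_seconds() // 60)
print(duration_minutes)
```

The result agrees with the 156 minutes quoted in the summary.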


ROOT CAUSE:

The incident was caused by a software update that led to degraded performance of the storage system used by the Admin Console.


REMEDIATION AND PREVENTION:

Our engineers reverted the software update as soon as the root cause was understood. To prevent recurrence, we have fixed the issue that caused the degraded performance. We are working on an improvement to the Admin Console that will make it more resilient to issues with the storage layer. We are also adding monitoring to allow the operations team to respond more quickly to Admin Console issues.

