Google App Engine Issues on Fri September 26th 2014

Google App Engine Downtime Notify

Sep 26, 2014, 1:33:12 PM
to google-appengine...@googlegroups.com
We're investigating an issue with Google App Engine Admin Console and Deployments beginning at Friday, 2014-09-26 08:00 US/Pacific. We will provide more information shortly.

Google App Engine Downtime Notify

Sep 26, 2014, 2:36:18 PM
to google-appengine...@googlegroups.com, google-appengine...@googlegroups.com
We are still investigating the issue with Google App Engine Admin Console and deployments. The issue may also affect some users of the Files API and Blobstore API. We will provide another status update by Friday, 2014-09-26 12:30 PDT.

Google App Engine Downtime Notify

Sep 26, 2014, 3:03:10 PM
to google-appengine...@googlegroups.com, google-appengine...@googlegroups.com
The problem with Google App Engine Admin Console, deployments, Files API and Blobstore API was resolved as of Friday, 2014-09-26 12:00 PDT. We apologize for the inconvenience and thank you for your patience and continued support. Please rest assured that system reliability is a top priority at Google, and we are making continuous improvements to make our systems better.

Google App Engine Downtime Notify

Sep 29, 2014, 7:37:15 PM
to google-appengine...@googlegroups.com, google-appengine...@googlegroups.com

SUMMARY:

On Friday 26 September 2014, for a period of 150 minutes, some administrators of Google App Engine applications experienced elevated errors and latency when deploying new versions. We apologize for the impact that this may have had on your service or application. We are taking steps to ensure that this problem will not recur.


DETAILED DESCRIPTION OF IMPACT:

On Friday 26 September 2014 from 08:15 to 10:45 PDT, App Engine administrators experienced a 75% error rate when deploying new versions.


During this period, users of the App Engine Admin Console at https://appengine.google.com/ also experienced elevated latency. Median latency was 38% higher than usual, and at the 75th percentile most Admin Console requests timed out. The Google Developers Console at https://console.developers.google.com/ was not affected by this incident.


Some users of the App Engine Files API experienced errors when creating new Blobstore blobs between 09:15 and 10:45. This issue affected 22% of applications hosted in US HRD datacenters. The average error rate for affected applications was 32%.


ROOT CAUSE:

This incident was triggered by a failure in the underlying storage layer in a single datacenter that occurred at 07:46. This led to the elevated errors when opening Blobstore blobs using the Files API and also to slightly higher latency for deployments starting from 08:04.


App Engine replicates its data to multiple datacenters. Following our normal procedures, we redirected traffic to other datacenters to work around the storage failure in a single datacenter. However, the App Engine Admin Console continued to point to the original datacenter for deployments. This caused elevated deployment errors until the storage issue was resolved at 10:44.


Users of the App Engine Admin Console experienced elevated latency during the incident because the storage layer issues caused high latency for loading requests of new instances.



REMEDIATION AND PREVENTION:

Our engineers were automatically alerted to the failure in the storage system at 08:03. We redirected traffic to other datacenters at 09:13. The underlying failure in the storage system was resolved at 10:44.
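The intervals implied by this timeline can be worked out with a short calculation. The following is only an illustrative sketch: the timestamps are the ones stated in this report (all PDT on 2014-09-26), while the dictionary keys and the `minutes_between` helper are names chosen here for illustration and are not part of any Google tooling.

```python
from datetime import datetime

# Timestamps (PDT, 2014-09-26) exactly as stated in the incident report.
# The key names are labels invented for this sketch.
EVENTS = {
    "storage_failure": "07:46",
    "engineers_alerted": "08:03",
    "traffic_redirected": "09:13",
    "storage_resolved": "10:44",
}


def minutes_between(start, end):
    """Return the whole minutes elapsed between two same-day HH:MM times."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


# Failure to automatic alert.
time_to_alert = minutes_between(
    EVENTS["storage_failure"], EVENTS["engineers_alerted"])
# Alert to traffic redirection.
time_to_redirect = minutes_between(
    EVENTS["engineers_alerted"], EVENTS["traffic_redirected"])
# Failure to resolution of the underlying storage issue.
time_to_resolve = minutes_between(
    EVENTS["storage_failure"], EVENTS["storage_resolved"])
# Deployment impact window stated in the report (08:15 to 10:45).
deployment_impact = minutes_between("08:15", "10:45")

print(time_to_alert, time_to_redirect, time_to_resolve, deployment_impact)
```

The last value matches the 150 minutes of deployment impact given in the summary.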


In order to prevent this or similar incidents from recurring, our engineers are eliminating all dependencies within the Admin Console and the App Engine deployment pipeline on data stored in a single datacenter. In the event of a failure in the storage layer in a single datacenter, the Admin Console and deployment pipelines will be able to use replicated data in other datacenters.
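As a rough illustration of the idea of falling back to replicated data, the sketch below tries each replica in turn and returns the first successful read. It is not Google's implementation: `ReplicaUnavailable`, `read_with_failover`, the reader functions, and the datacenter names `dc-a` and `dc-b` are all hypothetical names made up for this example.

```python
class ReplicaUnavailable(Exception):
    """Hypothetical error: a datacenter's storage layer cannot serve a read."""


def read_with_failover(key, replicas):
    """Try each (name, reader) pair in order.

    Returns (name, value) from the first replica that answers, and raises
    ReplicaUnavailable only if every replica fails.
    """
    errors = []
    for name, reader in replicas:
        try:
            return name, reader(key)
        except ReplicaUnavailable as exc:
            # Remember the failure and move on to the next replica.
            errors.append((name, str(exc)))
    raise ReplicaUnavailable("all replicas failed: %r" % (errors,))


def failing_reader(key):
    # Stands in for the datacenter whose storage layer has failed.
    raise ReplicaUnavailable("storage layer down")


def healthy_reader(key):
    # Stands in for a datacenter holding a replicated copy of the data.
    return {"app-version": "v2"}[key]


replicas = [("dc-a", failing_reader), ("dc-b", healthy_reader)]
served_by, value = read_with_failover("app-version", replicas)
print(served_by, value)
```

In this sketch the caller never depends on one specific datacenter, which is the property the remediation above aims to give the Admin Console and the deployment pipeline.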


For customers using the Files API, which is now deprecated, we recommend that you migrate your code to use the Cloud Storage client library instead:


https://cloud.google.com/appengine/docs/java/googlecloudstorageclient/

https://cloud.google.com/appengine/docs/python/googlecloudstorageclient/