Google Cloud Platform Status
Jan 22, 2015, 12:10:59 AM
to google-appengine...@googlegroups.com
SUMMARY:
On Tuesday 20 January 2015, some Google App Engine applications experienced
elevated rates of HTTP 500 errors for a duration of 11 minutes. We
apologize if you were affected by this incident. We are working hard to
prevent incidents like this from recurring in the future.
DETAILED DESCRIPTION OF IMPACT:
On Tuesday 20 January 2015, some Google App Engine apps experienced
elevated rates of HTTP 500 errors during the following time intervals:
18:24 - 18:27, 18:36 - 18:41, and 19:06 - 19:08 (all times in PST). The
issue affected 13% of applications. During the 11 minutes of the incident,
3% of requests to App Engine received HTTP 500 errors.
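
As a rough cross-check of the intervals listed above, the sketch below adds
up the three windows at the minute granularity at which they were
published. The names WINDOWS and minutes are illustrative and not part of
the report. Summed this way the windows come to 10 minutes; the reported
11 minutes presumably reflects start and end times that were rounded to the
minute, which is an assumption rather than something the report states.

```python
from datetime import datetime

# Outage windows as published (PST, minute granularity).
WINDOWS = [("18:24", "18:27"), ("18:36", "18:41"), ("19:06", "19:08")]


def minutes(start: str, end: str) -> int:
    """Length of a same-day window in whole minutes."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


# Per-window durations and their sum, in minutes.
durations = [minutes(start, end) for start, end in WINDOWS]
total = sum(durations)
print(durations, total)
```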
ROOT CAUSE:
The issue was caused by an error in the software-defined networking control
system responsible for network traffic between Google datacenters. The
system incorrectly determined that there had been a drop in network
capacity available to App Engine applications in one datacenter.
REMEDIATION AND PREVENTION:
Our engineers received an automated alert for the issue at 18:42. At 18:55,
we redirected some traffic away from the affected datacenter. The system
returned to stability at 19:08.
To prevent a recurrence of this issue, we will disable the subsystem that
malfunctioned until both a fix for the immediate malfunction and
defense-in-depth measures have been deployed.