SUMMARY:
On Wednesday 9 April 2014, Google App Engine (GAE) experienced elevated error rates for some applications over a period of 7 minutes. If your service or application was affected, we apologize. This is not the level of reliability and performance we strive to offer you, and we have taken and are taking immediate steps to improve the platform’s performance and availability.
DETAILED DESCRIPTION OF IMPACT:
On Wednesday 9 April 2014, some applications running in US datacenters with the High Replication Datastore experienced elevated error rates from 13:57 to 14:04 US/Pacific. Of the applications that received more than 100 non-task-queue requests during this period, 13.2% experienced an error rate greater than 10%.
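To make the impact figure concrete, the following is a minimal sketch of how such a percentage could be derived from per-application request counts. It is an illustration only: the `AppStats` record, the `share_heavily_affected` function, and the sample numbers are hypothetical and do not come from GAE's monitoring systems.

```python
from dataclasses import dataclass


@dataclass
class AppStats:
    """Hypothetical per-application counters for the incident window."""
    app_id: str
    requests: int  # non-task-queue requests received in the window
    errors: int    # how many of those requests failed


def share_heavily_affected(stats, min_requests=100, error_threshold=0.10):
    """Return the fraction of eligible apps whose error rate exceeds the threshold.

    An app is eligible only if it received more than `min_requests` requests,
    which keeps very low-traffic apps from skewing the result.
    """
    eligible = [s for s in stats if s.requests > min_requests]
    if not eligible:
        return 0.0
    affected = [s for s in eligible if s.errors / s.requests > error_threshold]
    return len(affected) / len(eligible)


# Made-up sample data, for illustration only.
sample = [
    AppStats("app-a", requests=500, errors=100),  # 20% errors: counted
    AppStats("app-b", requests=200, errors=20),   # exactly 10%: not counted
    AppStats("app-c", requests=300, errors=3),    # 1% errors: not counted
    AppStats("app-d", requests=150, errors=0),    # no errors: not counted
    AppStats("app-e", requests=100, errors=90),   # too few requests: excluded
]
share = share_heavily_affected(sample)
```

In this made-up sample, one of the four eligible applications exceeds the threshold, so `share` is 0.25.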
ROOT CAUSE:
The outage was triggered by a software update that caused too many servers to restart simultaneously.
REMEDIATION AND PREVENTION:
Our System Reliability Engineering team has improved the system update process to ensure that we do not restart too many servers simultaneously.
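This report does not describe how the update process was changed. As a general illustration of the idea, the sketch below caps the number of servers restarted at once by splitting the fleet into small batches and stopping if a batch does not come back healthy. The function names, the `max_fraction` parameter, and the `restart` and `is_healthy` callbacks are all hypothetical and are not part of any GAE tooling.

```python
import math


def plan_restart_batches(servers, max_fraction=0.1):
    """Split servers into batches no larger than max_fraction of the fleet.

    At least one server is restarted per batch so that small fleets
    still make progress.
    """
    if not 0 < max_fraction <= 1:
        raise ValueError("max_fraction must be in the range (0, 1]")
    batch_size = max(1, math.floor(len(servers) * max_fraction))
    return [servers[i:i + batch_size] for i in range(0, len(servers), batch_size)]


def rolling_restart(servers, restart, is_healthy, max_fraction=0.1):
    """Restart servers one batch at a time, halting on the first unhealthy batch.

    `restart` and `is_healthy` are caller-supplied callbacks (assumptions of
    this sketch); each takes a single server identifier.
    """
    for batch in plan_restart_batches(servers, max_fraction):
        for server in batch:
            restart(server)
        unhealthy = [server for server in batch if not is_healthy(server)]
        if unhealthy:
            # Stop here so the remaining servers keep serving traffic.
            raise RuntimeError(f"halting rollout, unhealthy servers: {unhealthy}")
```

With eight servers and `max_fraction=0.25`, no more than two servers are out of service at any moment, and a failed health check leaves the untouched servers running.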