SUMMARY:
Between Monday 22 September 2013 and Tuesday 23 September 2014, some requests to Google Cloud Endpoints from clients using the Google APIs Client Library for Javascript failed for a duration of 12 hours and 33 minutes. We apologize for any impact this incident may have had on your Cloud Endpoints applications. We hold ourselves to a high standard, and we failed to meet that standard. We are taking immediate action to ensure incidents like this do not happen in the future.
DETAILED DESCRIPTION OF IMPACT:
From Monday 22 September 2014 19:27 PDT until Tuesday 23 September 2014 08:00 PDT, 3% of requests to Google Cloud Endpoints made using the Google APIs Client Library for JavaScript failed with an HTTP 400 or HTTP 500. API calls that passed data as part of the request body were affected by this issue.
ROOT CAUSE:
Google engineers published a new configuration for the Google APIs Client Library for Javascript. This configuration enabled an experimental feature for communicating with Cloud Endpoints, and was intended only for a subset of clients but erroneously was applied to all clients. The experimental feature contained a bug that caused API call requests structured a specific way to fail under certain conditions.
REMEDIATION AND PREVENTION:
Google engineers were alerted to the issue via an external report at 01:05 PDT. The engineers immediately began investigating and identified the configuration change as the cause of the issue at 03:50 PDT. To resolve this issue, the engineers began a rollback of the configuration change at 04:00 PDT which completed at 08:00 PDT.
To prevent this issue from occurring in the future we will improve testing of configuration changes. We will also improve internal documentation to make identifying and rectifying incorrect configuration pushes easier.