Task queues not able to execute

Google Cloud Platform Status

Jun 17, 2015, 12:51:13 AM
to google-appengine...@googlegroups.com
We're investigating an issue with Google App Engine task queues that began
on Tuesday, 2015-06-16 at 20:00 (all times are in US/Pacific). Users may also
experience issues with application deployment. We will provide more
information within 30 minutes.

Google Cloud Platform Status

Jun 17, 2015, 1:20:44 AM
to google-appengine...@googlegroups.com
The problem with Google App Engine Task Queue was resolved as of Tuesday,
2015-06-16 21:35 (all times are in US/Pacific); however, some users may
continue to experience difficulties with application deployment. We are
continuing to investigate this and will provide a further update by
Tuesday, 2015-06-16 22:50 with current details. Currently, this service
disruption is affecting fewer than 8% of users.

We apologize for the inconvenience and thank you for your patience and
continued support.

Google Cloud Platform Status

Jun 17, 2015, 1:52:46 AM
to google-appengine...@googlegroups.com
We are continuing to investigate the issue with application deployment and
will provide a further update by Tuesday, 2015-06-16 23:20.

Google Cloud Platform Status

Jun 17, 2015, 2:24:04 AM
to google-appengine...@googlegroups.com
The issue with application deployments is ongoing; symptoms of a deployment
failure are posted below. We are continuing to investigate this and will
post a further update in 30 minutes.

--

Error posting to URL:
https://appengine.google.com/api/appversion/precompile?module=default&app_id=[APPID]&version=[VERSION_ID]

500 Internal Server Error

<html><head><meta http-equiv="content-type"
content="text/html;charset=utf-8"><title>500 Server
Error</title></head><body text=#000000 bgcolor=#ffffff><h1>Error: Server
Error</h1><h2>The server encountered an error and could not complete your
request.<p>Please try again in 30 seconds.</h2><h2></h2></body></html>

This is try #0

[TIMESTAMP] com.google.appengine.tools.admin.AbstractServerConnection send1

Error posting to URL:
https://appengine.google.com/api/appversion/precompile?module=default&app_id=[APPID]&version=[VERSION_ID]

503 Service Unavailable

Try Again (503)An unexpected failure has occurred. Please try again.
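
Both responses above ask the client to try again, and the log line
"This is try #0" suggests the deployment tool counts its attempts. As a
minimal sketch of that retry pattern, and not the deployment tool's actual
logic, the Python below retries an operation with exponential backoff. The
names TransientServerError and retry_with_backoff, the 30-second starting
delay, and the jitter factor are all assumptions made for illustration.

```python
import random
import time


class TransientServerError(Exception):
    """Hypothetical error standing in for an HTTP 500 or 503 response."""

    def __init__(self, status):
        super().__init__(f"server returned {status}")
        self.status = status


def retry_with_backoff(operation, max_tries=5, base_delay=30.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_tries attempts have failed."""
    for attempt in range(max_tries):
        try:
            return operation()
        except TransientServerError:
            if attempt == max_tries - 1:
                raise  # out of attempts: surface the last error to the caller
            # Double the wait after each failure and add up to 10% jitter
            # so that many clients do not retry at the same instant.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

Passing the sleep function as a parameter keeps the sketch easy to exercise
without real waiting. During an outage that lasts hours, a bounded number of
tries like this will still fail, which is the expected outcome.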

Google Cloud Platform Status

Jun 17, 2015, 2:51:36 AM
to google-appengine...@googlegroups.com
We are continuing to investigate the issue with application deployment and
will provide a further update by Wednesday, 2015-06-17 00:20.

Google Cloud Platform Status

Jun 17, 2015, 3:20:16 AM
to google-appengine...@googlegroups.com
The issue with application deployment was resolved as of Wednesday,
2015-06-17 00:00. Again, we apologize for the inconvenience and thank you
for your patience and continued support. Please rest assured that system
reliability is a top priority at Google, and we are making continuous
improvements to make our systems better. We will provide a more detailed
analysis of this incident once we have completed our internal investigation.

Google Cloud Platform Status

Jun 18, 2015, 10:57:54 PM
to google-appengine...@googlegroups.com
SUMMARY:

On Tuesday, 16 June 2015, the Google App Engine Task Queue service and App
Engine application deployment experienced increased error rates for a
duration of 3 hours and 25 minutes. If your service or application was
affected, we apologize. We have taken action to fix the issue and are in
the process of making the system more reliable.

DETAILED DESCRIPTION OF IMPACT:

On Tuesday, 16 June 2015 from 20:10 to 23:35 PDT, some developers of Google
App Engine applications in the US region were unable to deploy their
applications. The overall error rate of deployments during this period was
approximately 60%. Affected developers saw attempted deployments exit and
report an internal server error message after HTTP requests to
appengine.google.com timed out. The App Engine Admin Console was unable to
load data for affected applications. Additionally, between 20:58 and 21:33,
applications in the US region experienced an increase in error rate of up
to 0.25% as well as slower execution of Task Queue tasks.

ROOT CAUSE:

Google engineers performed maintenance on a storage system in one of the
datacenters that App Engine uses. During this maintenance, components of
App Engine that depend on this storage system had to rely on a replica in a
different datacenter. For both deployments and Task Queues, this switch did
not function properly.

REMEDIATION AND PREVENTION:

At 21:33, Google engineers took the necessary measures to prevent the Task
Queue service from accessing the storage system under maintenance. In
addition, engineers began redirecting all traffic for the affected
applications to alternate datacenters at 23:26. This was completed by 23:35
and applications were again able to deploy successfully.

To prevent the issue from recurring, we are working to make deployments and
Task Queue more resilient to movements in the underlying storage system, in
a similar fashion to other App Engine components.
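
As an illustration only, the kind of resilience described above can be
sketched as a read that falls back to a replica when the preferred storage
location is unavailable. This is a generic failover pattern written in
Python, not a description of App Engine's internals; StorageUnavailable,
read_with_failover, and the replica names used with it are hypothetical.

```python
class StorageUnavailable(Exception):
    """Hypothetical error raised when a storage location cannot serve a read."""


def read_with_failover(key, replicas):
    """Try each (name, read_function) pair in order.

    Returns (name, value) from the first replica that answers, or raises
    StorageUnavailable listing every failure if none of them can.
    """
    errors = []
    for name, read in replicas:
        try:
            return name, read(key)
        except StorageUnavailable as exc:
            # Record the failure and move on to the next replica.
            errors.append(f"{name}: {exc}")
    raise StorageUnavailable("all replicas failed: " + "; ".join(errors))
```

A caller would pass the replicas in order of preference, for example the
local datacenter first and a remote one second, so that maintenance on the
first location degrades latency rather than causing errors.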