Latency Issues Stabilized

App Engine Team

Mar 5, 2009, 7:42:51 PM

to Google App Engine Downtime Notify

We are happy to report that the serving and datastore latency issues
affecting some App Engine applications during the past few days have
now stabilized. Performance has greatly improved. Latency does remain
slightly higher for some apps, but all spikes and errors have been
contained at this time. System performance should remain stable
throughout the weekend and improve further at the beginning of next
week.

Your feedback, input, and support were invaluable as we worked to
resolve this issue. We greatly appreciate the involvement of our
community and we thank you for your contributions. If you observe any
further issues, please do not hesitate to inform us via the Group.

The App Engine team is committed to providing our developers with as
much information as possible, via such channels as the System Status
site. Due to recent improvements in our system, we are working to
refine the monitoring we use to report performance and to add new
test metrics that more fully capture our developers' scenarios. Please expect
improvements in the upcoming weeks. Also, during the past few days,
there were times when we did not communicate frequently enough about
our status. We will make improvements to our process for keeping
developers informed as we resolve important issues.

We apologize for the inconvenience that this issue has caused our
developers. Now that it is resolved, we will use the data and
experiences we have gained to inform our ongoing efforts to improve
App Engine. Your feedback is crucial to us and we'll be monitoring the
discussion groups for your input and ideas.

Mike Repass, App Engine Team

App Engine Team

Mar 16, 2009, 6:46:31 PM

to Google App Engine Downtime Notify

Some additional information to close this:

During the week of March 2nd, due to scheduled maintenance on the
machines in one of our datacenters, we had to move the App Engine
serving fabric around to a new set of hardware. Although we do this
regularly and it has not been a problem in the past, in this case some
of the new features that we have introduced added complexities that we
weren't aware of, and as a result the new serving configuration was
not set up properly. In essence, App Engine virtualizes the entire
web serving infrastructure stack. Maintaining a consistent virtual
"target" on different configurations of hardware is a serious
challenge for us.

In the case of last week's problems, our new serving
configuration resulted in elevated latencies. What made this problem
so difficult for us to find and fix was that many of our applications,
including our biggest customers, continued running just fine, and did
not see elevated latencies. The latency problems only surfaced
noticeably in apps with a specific computation and API call workload.
Thus it took us longer than it should have to understand the scope of the
problem and get the proper fixes in place. We have since learned from
this, and are now tracking and testing new hardware configurations
like this much more carefully, so we hope this will not happen again.

Pete Koomen, App Engine Team

On Mar 5, 5:42 pm, App Engine Team <appengine.nore...@gmail.com>
wrote:
