Some additional information to close this out:
During the week of March 2nd, due to scheduled maintenance on the
machines in one of our datacenters, we had to move the App Engine
serving fabric around to a new set of hardware. Although we do this
regularly and it has not been a problem in the past, in this case some
of the new features that we have introduced added complexities that we
weren't aware of, and as a result the new serving configuration was
not set up properly. In essence, App Engine virtualizes the entire
web serving infrastructure stack. Maintaining a consistent virtual
"target" on different configurations of hardware is a serious
challenge for us.
In the case of last week's problems, our new serving
configuration resulted in elevated latencies. What made this problem
so difficult for us to find and fix was that many of our applications,
including our biggest customers, continued running just fine, and did
not see elevated latencies. The latency problems only surfaced
noticeably in apps with a specific computation and API call workload.
Thus it took us longer than it should have to understand the scope of the
problem and get the proper fixes in place. We have since learned from
this, and are now tracking and testing new hardware configurations
like this much more carefully, so we hope this will not happen again.
Pete Koomen, App Engine Team
On Mar 5, 5:42 pm, App Engine Team <appengine.nore...@gmail.com> wrote: