Hi,
In the discussion below I am only referring to requests that did
*not* start a new server instance; I'm comfortable with the fact that
timings on those are all over the place. It's the standard request
processing I am looking at. I'm using Python and initiating requests
from the UK.
It was recently announced that App Engine log timings now include
latency, and I think I can see that in my app's logs. As a general
rule, I try to keep the real-time activity of a single request below
100ms. I judge whether I am achieving that based on the time between
my app's 'Request Received' log message and its 'Sending Response'
log message.
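For reference, the measurement is nothing clever - just two log calls
bracketing the handler's real work. A minimal sketch (the handler name
and message text here are illustrative, not my actual code):

```python
import logging
import time

logging.basicConfig(level=logging.INFO)

def handle_request():
    """Hypothetical handler: the two log lines bracket the real work."""
    start = time.time()
    logging.info("Handling Request...")
    # ... actual request processing happens here ...
    elapsed_ms = (time.time() - start) * 1000
    logging.info("Sending Response...")
    return elapsed_ms  # the figure I try to keep under 100ms
```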
One problem with this approach is that quite often I am seeing the
total time on these requests up around 700-800ms or more. The
first thing to note is that sometimes the difference between the
timestamp on Google's log header and my 'Request Received' log
message is over 500ms, e.g.:
07-28 02:08AM 26.148 /MyURL 200 793ms 127cpu_ms 30api_cpu_ms 3kb <user agent & headers & stuff>
a.b.c.d - - [28/Jul/2010:02:08:26 -0700] "GET /MyURL HTTP/1.1" 200 3646 "http://my-app.appspot.com/" <user agent & headers & stuff> "my-app.appspot.com" ms=793 cpu_ms=128 api_cpu_ms=31 cpm_usd=0.004105 pending_ms=428
07-28 02:08AM 26.579 Handling Request...
...
07-28 02:08AM 26.662 Sending Response...
So it looks like the 'pending_ms' in the header measures this,
although this attribute is not always present in the log headers. Is
the occurrence of this something that we should expect fairly often
going forward? Is the size of this number something that Google is
trying to reduce? It's a huge number for a web server.
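In case it helps anyone else watching this, pending_ms is easy enough
to pull out of downloaded logs to see how often it turns up - a quick
sketch (the helper name and regex are mine, not part of any App Engine
API):

```python
import re

def pending_ms(log_line):
    """Return the pending_ms value from a log header line, or 0 if absent."""
    m = re.search(r"pending_ms=(\d+)", log_line)
    return int(m.group(1)) if m else 0

line = '"my-app.appspot.com" ms=793 cpu_ms=128 api_cpu_ms=31 pending_ms=428'
print(pending_ms(line))  # prints 428
```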
The second question is the difference between the total (793ms) and
the log numbers: 26.148 -> 26.662 = 514ms, so there's 279ms whose
cause I'm trying to understand. It's worth knowing too, because it
represents about 300% of my actual request handling time. Sometimes
this difference is as low as 20ms (for a similarly sized response), so
it would appear to be something in Google's infrastructure rather than
raw network latency. What sort of things are we looking at here?
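To spell out the arithmetic from the log excerpt above (just the
timestamps and the ms= figure, nothing else assumed):

```python
from datetime import datetime

# Timestamps from the two log lines above (same day, HH:MM:SS.mmm).
header_time = datetime.strptime("02:08:26.148", "%H:%M:%S.%f")
response_time = datetime.strptime("02:08:26.662", "%H:%M:%S.%f")

logged_span_ms = (response_time - header_time).total_seconds() * 1000
total_ms = 793  # the ms= figure from the request header

unexplained_ms = total_ms - logged_span_ms
print(unexplained_ms)  # prints 279.0 - the time I can't account for
```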
I guess I'm asking all this because I've made a fair effort to
reduce the request processing times in my app, but apart from billing
advantages (user experience is far more important), that effort is
largely irrelevant in the face of such huge overhead. Is the size of
this overhead something that Google considers important to reduce? Is
it something you think can be reduced in the future, or should we
expect requests to often incur 700%+ overhead on a warm server?
I recognise that Google would appear to be doing well here -
http://www.infoq.com/news/2010/07/Benchmarking-5-Cloud-Platforms -
although the tests seem to be for static files, which I agree are
served very quickly. It would be interesting to see the numbers for
actual app requests on these platforms. I can see people consuming
that graph and thinking 'Wow, my app's gonna fly on App Engine'.
Anyway, if you are able to shed any light on the current situation,
and your hopes/expectations for the future, that would be much
appreciated - cheers,
Colin