What's the red and blue bars in AppStats?


Fredrik Bonander

unread,
Dec 6, 2010, 8:43:31 AM12/6/10
to google-a...@googlegroups.com
In my application I saw something that worries me a bit.

In the logs, one specific request took 3275ms 9140cpu_ms 7296api_cpu_ms to complete. The request started a new process, and Appstats recorded it.

A typical request to the same URL takes, for example, 384ms 2306cpu_ms 1630api_cpu_ms.

Inspecting this a bit further, I realized that in Appstats the red bar is about 5-15 times as long as the blue one.

So my question is: what are the red and blue bars? And in the log, what's the difference between the first, second and third ms numbers?

Which numbers are deducted from my CPU quota?

..fredrik




--
Fredrik Bonander

the infinite power of the creative mind - 

mscwd01

unread,
Dec 6, 2010, 9:52:49 AM12/6/10
to Google App Engine
I always assumed the blue was cpu time and the red api time. Please
feel free to correct me if I am wrong though.

While we're on the topic of Appstats, does anyone else get zero values
reported for cpu and api times? I only ever get readings such as this:

real=774ms cpu=0ms api=0ms overhead=0ms

Not helpful at all :P



Stephen Johnson

unread,
Dec 6, 2010, 12:39:54 PM12/6/10
to google-a...@googlegroups.com
My understanding (please correct me if I'm wrong) of the three log time numbers is:

1.) Latency time - the amount of actual real-world time taken to process the request. The App Engine infrastructure uses this time to determine whether you are (since it's Christmas time) being naughty or nice. Different numbers have been thrown out as to what it means to be nice, but definitely average latency needs to be under 1 second; it's been noted that sub-800ms is preferred and even lower is better. What happens if you're naughty? Well, then supposedly Santa won't spin up additional instances for you. It has been noted on this forum by Santa's Elves (the Googlers) that this average-latency requirement for spinning up new instances doesn't include the latency of background tasks.

2.) CPU Time - the amount of CPU time your code has used, normalized to a 1.2 GHz 64-bit x86 CPU executing sequentially, so 1200 megacycles equal one second of physical time. My assumption is then that if your code executed on a 2.4 GHz machine, real-world time could be 1 second while CPU time could be 2 seconds (assuming no API time).
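To make that normalization concrete, here is a tiny sketch of the conversion described above (the 1200-megacycles-per-second rate is from the docs of the era; the function name is mine):

```python
BILLED_MHZ = 1200  # App Engine bills as if running on a 1.2 GHz CPU

def megacycles_to_cpu_ms(megacycles):
    """Convert raw megacycles to billed cpu_ms at the 1.2 GHz reference rate."""
    return megacycles * 1000.0 / BILLED_MHZ

# A request that burns 2400 megacycles is billed 2000 cpu_ms, even if a
# 2.4 GHz machine finished it in roughly 1000 ms of wall-clock time.
print(megacycles_to_cpu_ms(2400))  # 2000.0
```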

3.) API Time - the amount of time spent executing API calls. This number can be very large even if latency time is small. For example, a batch get of 200 keys could be executed in parallel by the API, resulting in a large API time but a small latency.
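A toy back-of-the-envelope illustration of that point (not App Engine code; the per-key cost is an assumed number): when the RPCs fan out in parallel, the charged API time is the sum over all calls, while wall-clock latency is roughly one call's worth.

```python
# Assumed cost to fetch one entity from the datastore, in milliseconds.
per_key_api_ms = 5
num_keys = 200

# API time is charged for every call, even when they overlap in time.
api_time_ms = per_key_api_ms * num_keys

# With a fully parallel fan-out, wall-clock latency is about one call.
latency_ms = per_key_api_ms

print(api_time_ms, latency_ms)  # 1000 5
```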

Now, according to the QuotaService docs, the CPU Time and API Time are completely separate numbers and would need to be added together to get your total billed CPU time. Now, do the logs follow this convention? I'm not totally sure, since I can't find the logs documented anywhere, and I can't find an example in my logs of API time ever being greater than CPU time. I would assume that if I did a batch get like I mentioned above, with very little else in my code, the API time would greatly exceed my CPU time, but I can't find that. It would be great if anyone can add to this; if anyone has an example log entry showing this, that would be awesome. Any of Santa's elves wish to chime in? Also, if this is documented someplace nicely and I just couldn't find it, a link would be nice.
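For what it's worth, a sketch of how the QuotaService API of that era reported the two numbers separately (hedged: check the SDK docs for exact availability in your runtime version):

```python
# The quota module reports code CPU and API CPU as separate megacycle
# counts; total billed CPU is their sum.
from google.appengine.api import quota

cpu_mc = quota.get_request_cpu_usage()      # megacycles of your code's CPU
api_mc = quota.get_request_api_cpu_usage()  # megacycles charged to API calls
total_seconds = quota.megacycles_to_cpu_seconds(cpu_mc + api_mc)
```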
Stephen


--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to google-a...@googlegroups.com.
To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.


Fredrik Bonander

unread,
Dec 7, 2010, 5:49:20 AM12/7/10
to google-a...@googlegroups.com
Thanks to both of you for the info.

In my mission to improve my application I'm on a constant hunt for milliseconds. But while improving every request, I find it kind of strange that the difference from my development server is so huge.

In Appstats the troublesome request that shows "3275ms 9140cpu_ms 7296api_cpu_ms" is 1 memcache call, 4 db.RunQuery and 5 db.Get, and on my development server it's about 400ms. Thinking about it, I would guess that the red bar (the longest one) is time spent waiting for the request to return, especially since the red bar is not present in Appstats on my development server.

Either that would make sense, or for some odd reason my code is so bad that it runs that much slower on the production servers.

Another thing that bugs me a bit is that this is my third time asking how CPU/API time is calculated in different contexts, but never any answer from Santa's Elves (in the spirit of keeping to the metaphor).

If I'm right (about the red bars), why the long waiting times for some requests?

It would be great to get some clarification on how this works so we as developers know how to proceed. I don't want to spend x hours optimizing my code if I can't do anything to improve the CPU/API times.

..fredrik

--
Fredrik Bonander

Stephen Johnson

unread,
Dec 9, 2010, 11:37:37 AM12/9/10
to google-a...@googlegroups.com
As a follow-up, in case anyone was wondering: the second number, CPU Time, does NOT follow the QuotaService convention of keeping the cpu and api times separate; it is actually the total CPU time (CPU time + API CPU time). IMHO it really should be renamed to Total CPU so as not to conflict with the QuotaService convention.

Stephen Johnson

unread,
Dec 9, 2010, 11:42:13 AM12/9/10
to google-a...@googlegroups.com
Fredrik, one of the reasons you're going to see discrepancies in the numbers on production is that if your request starts up a new instance, then that request incurs the CPU cost of loading the libraries it uses. Subsequent requests to that instance won't have that CPU cost and will show much smaller CPU time, but API time should remain fairly constant, except for some possible caching of data on Google's end. If you implement the warmup request handlers, you can minimize some of that time by having your warmup handler do a datastore/memcache call to get those libraries loaded into that instance's memory.
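A minimal warmup-handler sketch, assuming the Python webapp framework of the era (it also requires "inbound_services: - warmup" in app.yaml; handler details are illustrative):

```python
# Handles /_ah/warmup, which App Engine calls before routing user
# traffic to a freshly started instance.
from google.appengine.api import memcache
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

class WarmupHandler(webapp.RequestHandler):
    def get(self):
        # Touching an API here forces its libraries to load now,
        # so the first user-facing request doesn't pay that CPU cost.
        memcache.get('warmup-probe')
        self.response.out.write('ok')

application = webapp.WSGIApplication([('/_ah/warmup', WarmupHandler)])

def main():
    run_wsgi_app(application)
```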

djidjadji

unread,
Dec 12, 2010, 7:03:18 AM12/12/10
to google-a...@googlegroups.com
The point of Appstats in the development environment is to determine
whether a request is performing too many RPC calls.
Doing transactions in a loop is a nice example of something that might be optimized.
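For reference, enabling Appstats recording in the Python runtime of that era was done with an appengine_config.py middleware hook (this follows the documented recipe; adapt the hook name to your framework):

```python
# appengine_config.py -- enables Appstats recording for all requests.
from google.appengine.ext.appstats import recording

def webapp_add_wsgi_middleware(app):
    # Wrap the WSGI app so every RPC is recorded and shown as a bar.
    app = recording.appstats_wsgi_middleware(app)
    return app
```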

Don't compare execution times between development and production.
They have very different implementations of the Datastore, and
sometimes production has to start a new app instance.

For some RPC calls on the production server the API time (red bar) is
larger than the wall-clock time (blue bar), because some
query/put/get/fetch calls can be parallelized. The red bar is the
sum of the processing cycles of those calls combined.

The gaps between the blue bars are the time spent in the code you
have written that is not API calls, for example constructing a list
of db.Keys you want to get.
Those milliseconds, added together and converted to the 1.2 GHz
reference processor, are the cpu_ms in the result.
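A sketch of that last point with the old db API (the kind and ids are made up): the list construction is pure Python and shows up as a gap between bars, while the batch get is a single RPC bar.

```python
from google.appengine.ext import db

article_ids = [1001, 1002, 1003]  # hypothetical entity ids

# Pure-Python work: appears as a gap between the blue bars.
keys = [db.Key.from_path('Article', article_id)
        for article_id in article_ids]

# One batched RPC: appears as a single bar in Appstats.
articles = db.get(keys)
```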

