Understanding "runtime mcycles" and "cpu_ms" accounting on AppEngine's Go runtime

Joel Webber

May 1, 2014, 4:52:38 PM5/1/14
to google-ap...@googlegroups.com
I posted this question on StackOverflow as well (not entirely sure what's considered the best place to ask):
  http://stackoverflow.com/questions/23416328/understanding-runtime-mcycles-and-cpu-ms-accounting-on-appengines-go-runtim

In a nutshell, I'm wondering to what extent I'm understanding, and/or can trust, the cpu accounting going on in the AppEngine Go runtime's logs and dashboard. Some of the numbers seem impossible (e.g., as noted in the SO post, "cpu_ms > ms") if I understand their semantics correctly.

More subtly, I would love any information that would help me form a mental model of how this accounting is performed. I presume the AppEngine layer needs to get fairly deep into the Go runtime to be able to attribute cycles to particular requests (and presumably any goroutines spawned by those requests). But I have to admit I'm mostly flying blind here, so any help is greatly appreciated.

Cheers,
joel.

David Symonds

May 2, 2014, 8:41:59 AM5/2/14
to Joel Webber, google-appengine-go
The cpu_ms and related accounting measures are legacy holdovers from
the old billing structure, which was based at least partly on CPU
consumption. Nowadays it is meaningless from that perspective, and I
wouldn't be surprised if those numbers are somewhat nonsensical.

Nothing in the Go runtime attributes CPU time to individual
requests, nor is it really tractable to do so in a concurrent
runtime. The attribution is statistical in nature, which may account
for the weirdness you are seeing.

Joel Webber

May 2, 2014, 9:07:26 AM5/2/14
to google-ap...@googlegroups.com, Joel Webber
Thanks for taking the time to clarify.

So it sounds like we shouldn't be relying on cpu_ms for much of anything. With regard to the average mcycles attributed to each request in the dashboard, it sounds like this is just a best guess made by the appengine container (or whichever system is responsible for gathering statistics). Do you think it's reasonable to assume that it just looks at load (computed by the VM, perhaps?) between the beginning and end of each request to compute this average?

If that model's correct, I presume that there's essentially no way to get reliable numbers when there's any concurrency -- e.g., if the original request goes into IO wait, the runtime would likely switch to another goroutine, some of whose load would be attributed to the original (unrelated) request.

Assuming *all* of that's correct, do you think it's safe to say that we should just stick with using pprof locally to profile?

Cheers,
joel.

David Symonds

May 2, 2014, 9:10:56 AM5/2/14
to Joel Webber, google-appengine-go
Yes, I would stick with pprof and general load testing. I'm not sure
cpu_ms is even used by the request scheduler any more.