No log warning for heavy datastore operations, dashboard marks are yellow and red


djidjadji

Dec 14, 2008, 7:12:15 PM12/14/08
to google-a...@googlegroups.com
I have read the posts about log warnings and the FAQ at
http://knol.google.com/k/marce/app-engine-community-faqs/vkzeph4si12v/1#

----------------- From the FAQ
A: The mcycles number included in log entries includes only the CPU
consumed in the Python runtime and the datastore. The number in the
dashboard includes the mcycles used in both your code and API code.
This is why the dashboard number is usually higher.
------------------

If I understand the FAQ correctly, can I write a Python request handler
that does a lot of calculations and won't get a log warning?
The calculation has no datastore get() or put() operations and makes only
a few Python runtime calls, like zip(), map(), and list comprehensions.

I tested the other side extensively: heavy datastore request handlers,
noting the average CPU cost per request.
The numbers are taken from the dashboard, and about 100 to 400 requests
per value of N were used to determine each average.

My test Model contained an IntegerProperty and a StringProperty. The
number was a random integer in [0..10000] and the strings had 10 random
lowercase characters. I started with a datastore that contained more than
1000 of these objects.
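As a sketch of what this test setup might look like (the model and property names below are my guesses; the original post doesn't show them), here is a pure-Python generator for the random values described, with the legacy google.appengine.ext.db model shown only in a comment:

```python
import random
import string

# Hypothetical model definition (legacy App Engine db API, shown for context):
#
#   from google.appengine.ext import db
#
#   class TestModel(db.Model):
#       num  = db.IntegerProperty()
#       name = db.StringProperty()

def random_values():
    """Generate (num, name) as described in the post: a random integer
    in [0..10000] and a string of 10 random lowercase characters."""
    num = random.randint(0, 10000)  # inclusive on both ends
    name = ''.join(random.choice(string.ascii_lowercase) for _ in range(10))
    return num, name
```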

===========
The first experiment was: How much CPU does it cost to fetch N objects?

Every request reads N objects starting at a certain random number:
query.filter('num >=', startNum).fetch(N)
There is a wait of 1 sec between requests.

Here are the measurements:
N=0 CPU=7 Mcycle
N=1 CPU=108
N=5 CPU=264
N=10 CPU=464
N=15 CPU=663
N=20 CPU=863
N=25 CPU=1059
N=30 CPU=1260

I have made a graph of these figures: http://tinyurl.com/6hkc6q (a
very long google chart URL)
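Those fetch numbers look close to linear in N. As a back-of-the-envelope check (my own fit, not from the thread), a least-squares line through the N>=1 points gives roughly 40 Mcycles per additional entity plus a fixed per-query overhead of about 65-70 Mcycles:

```python
# Fetch-cost measurements from the post: N -> dashboard CPU in Mcycles.
fetch_cpu = {1: 108, 5: 264, 10: 464, 15: 663, 20: 863, 25: 1059, 30: 1260}

def linear_fit(points):
    """Ordinary least-squares fit y = a + b*x over (x, y) pairs."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    b = (n * sxy - sx * sy) / float(n * sxx - sx * sx)
    a = (sy - b * sx) / float(n)
    return a, b

a, b = linear_fit(sorted(fetch_cpu.items()))
print("fetch cost ~ %.0f + %.1f * N Mcycles" % (a, b))
```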

I also ran the experiment of printing the number and string of these N
objects, both with a Django template and with a Python for-loop. The
Python for-loop costs 1 Mcycle extra for every value of N; the Django
template costs 6 Mcycles extra for N=30 and 3 Mcycles extra for N=5.

===========
The second experiment was: How much CPU does it cost to create or
delete N objects?

A create request creates N objects starting at a certain random number (n, n+1, n+2, ...),
puts them in a list, and stores them with db.put(objlist).
The delete request reads N objects starting at a certain random number with
query.filter('num >=', startNum).fetch(N)
and then calls db.delete(objlist).
There is a wait of 1 sec between requests.

Here are the measurements:
N=1 createCPU=799 deleteCPU=699 Mcycle
N=2 createCPU=1589 deleteCPU=1306
N=3 createCPU=2377 deleteCPU=1957
N=4 createCPU=3150 deleteCPU=2585
N=5 createCPU=3956 deleteCPU=3216
N=10 createCPU=7900 deleteCPU=6317

A graph of these figures: http://tinyurl.com/6s3hq7
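These numbers are also close to linear. Estimating the marginal cost per entity from the thread's own figures (my arithmetic, not the poster's) gives roughly 790 Mcycles per created entity and roughly 625 per deleted entity:

```python
# Create/delete measurements from the post: N -> dashboard CPU in Mcycles.
create_cpu = {1: 799, 2: 1589, 3: 2377, 4: 3150, 5: 3956, 10: 7900}
delete_cpu = {1: 699, 2: 1306, 3: 1957, 4: 2585, 5: 3216, 10: 6317}

def marginal_cost(points):
    """Average extra Mcycles per additional entity, from first to last N."""
    xs = sorted(points)
    return (points[xs[-1]] - points[xs[0]]) / float(xs[-1] - xs[0])

create_per_entity = marginal_cost(create_cpu)  # ~789 Mcycles/entity
delete_per_entity = marginal_cost(delete_cpu)  # ~624 Mcycles/entity
```

The much higher per-entity cost compared with the fetch experiment (~40 Mcycles/entity) is consistent with writes being far more expensive than reads in the datastore.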


Only 2 of all these requests, a couple of thousand in total, resulted in
a log warning that I use a lot of CPU:
one for a create(N=3) [4239 Mcycles] and one for a create(N=4) [5046 Mcycles].
Why not the 100 create requests for N=10? And they were 1 sec apart
(1 sec wait after receipt of a result).
The log warnings have a threshold of 1000 Mcycles, and if I remember
correctly I have read that you are allowed 2 per minute before the app is
blocked temporarily.
During the testing of this experiment I had a dashboard line telling me a
series of requests took 11000 Mcycles on average, with NO log warning.

Why don't these (>2000 Mcycle) datastore requests give a lot of log warnings?
The time spent in my code is around 7 Mcycles (N=0).

Marzia Niccolai

Dec 15, 2008, 5:00:30 PM12/15/08
to google-a...@googlegroups.com
Hi,

Answers inline

On Sun, Dec 14, 2008 at 4:12 PM, djidjadji <djid...@gmail.com> wrote:

I have read the posts about log warnings and the FAQ at
http://knol.google.com/k/marce/app-engine-community-faqs/vkzeph4si12v/1#

----------------- From the FAQ
A: The mcycles number included in log entries includes only the CPU
consumed in the Python runtime and the datastore. The number in the
dashboard includes the mcycles used in both your code and API code.
This is why the dashboard number is usually higher.
------------------

I've updated this FAQ.  This information isn't correct (even though I think I may have said it, *sigh*).  The most up-to-date information on high CPU warnings can be found at:
http://code.google.com/appengine/kb/general.html#highcpu

Please consider this the canonical source of information on high CPU warnings.
 

If I understand the FAQ correctly, can I write a Python request handler
that does a lot of calculations and won't get a log warning?

This is maybe not true, depending on what you mean.  The logs contain two different 'warning' indicators.  The first is the (possibly) colored CPU consumption numbers on the first log line.  The second is a message that says:
"This request used a high amount of CPU, and was roughly X times over the average CPU request limit..."

These two indicators are calculated differently.  The CPU listed on the first line is the _total_ CPU used: runtime, datastore, and other APIs.  Since the runtime and datastore CPU are generally much higher than that of any other API, these numbers weigh the most in this measurement.

The CPU that actually generates a logged warning message is _runtime only_.  If you do a lot of calculations in your runtime (sorting in memory, complex math calculations, large module loads), you will get the log warning in your dashboard.
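As an illustration of the kind of runtime-only work Marzia describes (my own example, not one from the thread): building and sorting a large list in memory consumes Python-runtime CPU without touching the datastore, so it is exactly the sort of thing that counted toward this warning.

```python
import random

def cpu_heavy(n=200000):
    """Pure-runtime work: build and sort a large random list in memory.
    No datastore or other API calls are involved, so all of this CPU
    counts as runtime CPU -- the kind that triggered the log warning."""
    data = [random.random() for _ in range(n)]
    data.sort()
    return data

result = cpu_heavy()
```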
 

The CPU you are seeing is most likely datastore CPU.  This CPU won't result in warning lines in your logs.  Again, only runtime CPU usage will cause these warnings to appear, and the 2/minute limit applies only to runtime CPU and not to datastore CPU.
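Applying this to the figures above (my own arithmetic, using the poster's numbers): the N=0 request cost about 7 Mcycles, which approximates the runtime-only share, so for a create(N=10) request nearly all of the ~7900 Mcycles was datastore CPU and therefore never counted toward the warning:

```python
total_cpu = 7900    # create(N=10) dashboard figure from the post
runtime_cpu = 7     # approximated by the poster's N=0 request
datastore_cpu = total_cpu - runtime_cpu
print(datastore_cpu)  # 7893: far above the 1000 Mcycle threshold,
                      # yet no warning, since only runtime CPU counts
```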

Please know that we are working on the high CPU issues, both to make it more transparent how this is calculated and to generally make the warnings less of an impact on developing App Engine applications.

-Marzia
 



