frequent spinups of instances, very high frontend instance usage

nicois

Nov 18, 2011, 3:01:46 AM

to Google App Engine

I'm developing another app at the moment, using the latest Python SDK
(Python 2.7 with threading enabled), and I'm seeing a few strange
things:

The dashboard reports 9 requests, but the event log shows 61 in the
last 24 hours.
The maximum CPU time used on a request is under 10000ms.
The number of instance hours used in the same period is 1.36, out of
28 available (i.e. ~5%).

These statistics are my main concern. By my calculations, even
assuming each request took the full 10 seconds, I should have used
about 10 CPU minutes, rather than the 100 I'm seeing.
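
As a rough check of the arithmetic above, here is a minimal sketch in Python. It uses only the figures quoted in this post (61 logged requests, a 10-second ceiling per request, 1.36 instance hours); the variable and function names are made up for illustration. It converts the instance-hours figure to minutes (about 82), which is not necessarily the same measure as the "100" quoted above.

```python
# Figures copied from the post; names are illustrative only.
REQUESTS_LOGGED = 61            # requests shown in the event log
MAX_SECONDS_PER_REQUEST = 10    # upper bound on time per request
INSTANCE_HOURS_BILLED = 1.36    # instance hours reported for the period


def worst_case_minutes(requests, seconds_per_request):
    """Upper bound on busy time if every request took the maximum."""
    return requests * seconds_per_request / 60.0


def billed_minutes(instance_hours):
    """Convert reported instance hours to minutes."""
    return instance_hours * 60.0


expected = worst_case_minutes(REQUESTS_LOGGED, MAX_SECONDS_PER_REQUEST)
billed = billed_minutes(INSTANCE_HOURS_BILLED)

print("worst-case request time: %.1f min" % expected)
print("reported instance time:  %.1f min" % billed)
print("ratio: %.1fx" % (billed / expected))
```

Even with the most pessimistic per-request time, the reported instance time comes out several times larger than the time spent serving requests.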

The second thing which I am experiencing today, and which may or may
not be related to the above: if I do not make a request for 10-20
seconds, the GAE instance is shut down, requiring a complete reload of
the instance at the next request. This means that if someone loads a
page, they have to wait ~10 seconds, and if more than a few seconds
pass while they digest that page, they face another 10-second wait
for the next page.

I can understand GAE spinning down instances when not required, but
this seems to be occurring very aggressively, and is probably resulting
in an increase in the CPU load being put on the GAE infrastructure.

Has anyone else experienced anything like this with their apps? I
couldn't see any issues in the GAE issue tracker, and wanted to make
sure I wasn't doing anything silly before logging a ticket there.

thanks.

Tapir

Nov 21, 2011, 2:51:48 AM

to Google App Engine

Yes, this problem is very annoying.
I think it is a defect, or even a bug, in the instance scheduler.
Below is the mail I sent to GAE support, which has had no reply.
The problem is that the scheduler often will not use an instance that
is sitting idle.

*****************************************

This problem is serious.
More than 30% of requests to my app are slow to load.
I hope you can fix this problem soon.

Here is a detailed example of the problem:

-------------------------------------------------
1. Instance snapshot at 8:16

QPS*   Latency*    Requests  Errors  Age      Memory       Availability
0.050  14589.0 ms  1         0       0:00:26  80.8 MBytes  Dynamic
0.000  0.0 ms      2         0       0:31:51  80.0 MBytes  Dynamic

2. Instance snapshot at 8:25

QPS*   Latency*    Requests  Errors  Age      Memory       Availability
0.050  11925.0 ms  1         0       0:00:27  75.7 MBytes  Dynamic
0.000  0.0 ms      2         0       0:41:08  80.0 MBytes  Dynamic

3. Instance snapshot at 8:27

QPS*   Latency*    Requests  Errors  Age      Memory       Availability
0.000  0.0 ms      2         0       0:42:38  80.0 MBytes  Dynamic
--------------------------------------------------

At 8:25 a new instance was created, took about 12 seconds to load,
and handled a new request.
But the long-running instance had still handled only 2 requests, the
same count as 9 minutes earlier.
This shows that while the new instance was being created, the
long-running instance was idle and available.
Creating the new instance was unnecessary. Even if it had to be
created for some other reason, the scheduler should still have routed
the incoming request to the idle long-running instance, so that the
visitor would not have to wait 12 seconds for the page to load.
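
The comparison above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not anything the scheduler or the dashboard exposes: the data is copied by hand from the 8:16 and 8:25 snapshots, the helper names are made up, and it assumes the oldest instance in both snapshots is the same one (its age grows by roughly the 9 minutes between them).

```python
from datetime import timedelta


def parse_age(text):
    """Turn a dashboard age such as '0:41:08' into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


# (age, requests handled) per instance, copied from the snapshots.
SNAPSHOTS = {
    "8:16": [("0:00:26", 1), ("0:31:51", 2)],
    "8:25": [("0:00:27", 1), ("0:41:08", 2)],
}


def oldest(rows):
    """The long-running instance: the row with the largest age."""
    return max(rows, key=lambda row: parse_age(row[0]))


def newest(rows):
    """The freshly started instance: the row with the smallest age."""
    return min(rows, key=lambda row: parse_age(row[0]))


old_before = oldest(SNAPSHOTS["8:16"])
old_after = oldest(SNAPSHOTS["8:25"])
new_after = newest(SNAPSHOTS["8:25"])

# The long-running instance served nothing between the two snapshots...
idle_served_nothing = old_before[1] == old_after[1]
# ...yet an instance less than a minute old already handled a request.
cold_start_served = (parse_age(new_after[0]) < timedelta(minutes=1)
                     and new_after[1] >= 1)

bypassed = idle_served_nothing and cold_start_served
print("idle instance bypassed:", bypassed)
```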

Marcel Manz

Nov 21, 2011, 8:03:16 AM

to google-a...@googlegroups.com

Same problem here. See my earlier post, which includes a screenshot: https://groups.google.com/forum/?hl=en#!topic/google-appengine/dQQ2y01Mbgs

Marcel
