Very high instance count

Clément

Dec 8, 2011, 11:33:42 AM
to google-a...@googlegroups.com
I'm experiencing a very high and unusual instance count on one of my apps since this morning.

I have about 10 instances usually, but today it's stuck between 40 and 50.

There has been no code change since yesterday, and the load is about the same as every day.

The average QPS is extremely low (below 0.1), and the latency is also quite low (below 500ms), so it doesn't make any sense.

I've almost exhausted my quota for today because of this issue. 

Has something changed in the instance scheduler since yesterday?

Rishi Arora

Dec 8, 2011, 2:13:54 PM
to google-a...@googlegroups.com
I'm seeing this too, but for me this has turned out to be a good thing. I have my Max-Idle-Instances fixed at one, though I thought my increase was mostly because I lowered my Max-Pending-Latency from 5 seconds down to 1 second, and then further down to 50ms. In terms of impact on resource usage, I have only seen my overall instance-hours used per day go from ~27 (under the free amount of 28) to about 30. To me, a cost increase of $0.16 is a small price to pay for a really low max-pending-latency of 50ms. The user experience has been a lot better since then.
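
The cost figure above can be checked with a little arithmetic. This is a minimal sketch: the 28 free instance-hours come from the post, while the hourly rate of $0.08 is an assumption inferred from the quoted $0.16 for roughly 2 extra hours, and the function name is hypothetical.

```python
FREE_INSTANCE_HOURS = 28   # free daily amount quoted in the post
RATE_PER_HOUR = 0.08       # assumption: inferred from $0.16 for ~2 extra hours


def daily_instance_cost(hours_used, free_hours=FREE_INSTANCE_HOURS, rate=RATE_PER_HOUR):
    """Return the billed amount for one day; only hours above the free amount are charged."""
    billable_hours = max(0.0, hours_used - free_hours)
    return billable_hours * rate


print(round(daily_instance_cost(27), 2))  # under the free amount, nothing billed
print(round(daily_instance_cost(30), 2))  # 2 billable hours
```

With these assumed numbers, 27 hours costs nothing and 30 hours costs about $0.16, which matches the figure in the post.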

For your case, keep in mind that QPS is only averaged over the last minute, so it may not be a good indication of the overall load on your instances. I do feel that there was a change in GAE's instance scheduler, but I would recommend tweaking the scheduler settings. More importantly, you may need to lower your Max-Idle-Instances, so that an increase in active instance count (in response to a spike in incoming requests) doesn't have a huge impact on your cost. Correspondingly, if you're reducing max-idle-instances, you may also need to reduce your max-pending-latency so that instances are spawned sooner to serve your incoming load. Do you have a sense of how long your instance start-up time is?
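
To illustrate why a one-minute average can hide real load, here is a small sketch. The arrival times and the 0.5 s request duration are made-up illustration values, and both function names are hypothetical; nothing here reflects how the App Engine dashboard computes its figures.

```python
def average_qps(arrivals, window_s=60.0):
    """Average requests per second over a fixed window."""
    return len(arrivals) / window_s


def peak_concurrency(arrivals, duration_s):
    """Largest number of requests in flight at the same time."""
    events = []
    for t in arrivals:
        events.append((t, 1))                 # request starts
        events.append((t + duration_s, -1))   # request ends
    # Tuples sort by time first; at equal times, ends (-1) sort before starts (+1).
    events.sort()
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


# Six requests arriving within half a second, each taking 0.5 s (illustrative values).
burst = [10.0, 10.1, 10.2, 10.3, 10.4, 10.45]
print(average_qps(burst))            # 0.1
print(peak_concurrency(burst, 0.5))  # 6
```

The same six requests spread evenly over the minute would give the same average QPS of 0.1 but a peak concurrency of 1, so the average alone says little about how many instances are needed at once.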



Rishi Arora

Dec 8, 2011, 3:49:40 PM
to google-a...@googlegroups.com
Wow, an increase from 10 to 400?  Can I ask what your scheduler settings are, if only to learn how to prevent my app from running into this scenario?

On Thu, Dec 8, 2011 at 2:43 PM, smwatch <show...@gmail.com> wrote:
We are seeing this problem too; our regular instance count is 6-10,
and it was at 300-400.

Our code is now disabled; it was using memcache. We have no fix
for it so far, and there was no change in code or traffic.

See this thread.
http://groups.google.com/group/google-appengine/browse_thread/thread/29515fdc51f80b32#

sebastián serrano

Dec 8, 2011, 8:00:11 PM
to google-a...@googlegroups.com
Hi,

I had this problem with one of my toy apps a few days ago and went over quota. It's not that bad today, but it's not good.

Just now: 

Avg. QPS: 0.003 (very little traffic - less than 1 request per second)
Average Latency: 13ms (I can't make it smaller, no way)
Instances: 6 <--- why? 1 should be enough


-Sebastian

Clément Denis

Dec 9, 2011, 4:35:12 AM
to google-a...@googlegroups.com
Everything is back to normal today, without changing a line of code or any instance parameter.

Must have been a temporary glitch...

Clément Denis

Dec 9, 2011, 9:54:50 AM
to google-a...@googlegroups.com
I spoke too soon, the instance count has gone crazy again. I was wondering if this has something to do with the warmup requests.

I'm on GAE/Java with threadsafe enabled. The initial request of my app triggers a ServletContextListener, which takes more than 30 seconds to complete.
This is indeed much higher than the configured max latency (I use the highest value, i.e. 15s). But the first request is a WARMUP request, so it shouldn't be used to evaluate the latency of an instance!
As the latency of every new instance is above the max value, additional instances are created as requests come in.
=> After a deployment, I get 40 fresh instances in just a few minutes, with only 1 request on each!

And of course the instance count never decreases, even when the latency is stable around 500ms, with an average QPS below 0.1.
If I stop all traffic on the application, the instance count never goes back under 20 instances, which is already way too much for the needs of my app.

Maybe I'm completely wrong about this, but I think there is a real problem here, and it's costing us money.
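
The hypothesis above can be sketched as a toy model. This is not App Engine's actual scheduler: it simply assumes that a request goes to a ready, idle instance if one exists and otherwise starts a new instance. The function name, the arrival pattern (one request every 5 s for a minute), and the 0.5 s service time are illustration values, not measurements from the app.

```python
def instances_spawned(arrivals, startup_s, service_s):
    """Toy model: count instances started when every request that finds
    no ready, idle instance triggers a new one (an assumed rule)."""
    free_at = []  # time at which each instance can next accept a request
    for t in sorted(arrivals):
        idle = [i for i, free_time in enumerate(free_at) if free_time <= t]
        if idle:
            # Reuse the first idle instance.
            free_at[idle[0]] = t + service_s
        else:
            # No instance is ready: start a new one; it is busy for the
            # startup time plus the time to serve this request.
            free_at.append(t + startup_s + service_s)
    return len(free_at)


arrivals = list(range(0, 60, 5))  # one request every 5 s for a minute
print(instances_spawned(arrivals, startup_s=30, service_s=0.5))  # 7
print(instances_spawned(arrivals, startup_s=1, service_s=0.5))   # 1
```

Under this assumed rule, a 30 s startup turns a trickle of traffic into 7 instances, while a 1 s startup needs only one. The model does not explain why the count would stay high afterwards.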

Sami Lehtinen

Dec 9, 2011, 1:13:51 PM
to Google App Engine
I'm pretty sure it has something to do with the system being super slow.
Because each instance is slow, more instances are started.

My tasks usually take less than 1000 ms, but now I'm seeing response
times like 15959ms all the time.

Something is seriously messed up now.

Amy Unruh

Dec 9, 2011, 1:43:08 PM
to google-a...@googlegroups.com
For those who are still seeing such issues, could you file a production ticket (or share your app id)?
http://code.google.com/p/googleappengine/issues/entry?template=Production%20issue

Sami Lehtinen

Dec 9, 2011, 3:55:15 PM
to Google App Engine
Fresh from logs, appid: 9oxnet

The instance was already running; it received a request and served it
straight from memcache without database access. (Less than 200 bytes)

2011-12-09 23:50:39.593 /2t 302 6108ms 0kb Mozilla/5.0 (Ubuntu;
X11; Linux x86_64; rv:8.0) Gecko/20100101 Firefox/8.0

So it's a generic system issue, because I can see from my logs that
the memcache lookup took exactly 2ms.

On Dec 9, 8:43 pm, Amy Unruh <amyu+gro...@google.com> wrote:
> For those who are still seeing such issues, could you file a production
> ticket (or share your app id)? http://code.google.com/p/googleappengine/issues/entry?template=Produc...

ZeroCool

Dec 12, 2011, 9:31:31 AM
to Google App Engine
I'm having the same problem for appid: pe-server1
I don't even know where the massive datastore reads came from.
In 6 hours it has cost me an amount that would normally support the app
for 2 days at the same server load.