instance scheduling mystery

Peter Warren

Dec 6, 2012, 4:19:03 PM

to google-a...@googlegroups.com

Despite very low request volume (about 1 request every 8 seconds), low latency (<1 sec avg), and 3 dedicated instances, our application still suffers from frequent warmup requests.

See attached charts for requests, instances, and latency.

All our app does is serve static resources and items from memcache, and I don't see any OOM exceptions in our logs, so I don't think we're churning instances because of memory limits.

I figured I could try brute-forcing things and upped our number of dedicated instances to 5. That didn't help at all and only costs us more money. Look at the attached instances chart: Google is now using an average of about 8 instances to serve exactly the same volume as 5 instances handled previously. It does look like our latency decreased a bit, but we still get tons of warmup requests.

I currently have the "pending latency" slider set to 14.9 secs min and 15.0 secs max. I have also set it to "auto"/"auto" per tips in other threads but can't tell a difference.

So 2 questions:
1) Why does Google need more than 3 instances to serve 1 relatively low-load request every 8 seconds? I would think 3 dedicated instances should be far more than sufficient for 1 request / 8 seconds.

2) Why does the number of instances seem unrelated to load? I would think that more dedicated instances would mean fewer dynamic instances get created (assuming, of course, that the dedicated instances are under a manageable load). That doesn't seem to be the case.
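
As a back-of-the-envelope check on question 1, here is a minimal sketch of the arithmetic using Little's law (average requests in flight = arrival rate x average latency). The request rate and latency are the figures quoted in this post; the function name `offered_load` is just illustrative, and the calculation assumes steady traffic and that each instance handles one request at a time. It says nothing about how the scheduler actually decides to start instances.

```python
# Rough capacity estimate via Little's law: L = arrival_rate * service_time.
# Figures are taken from the post; 1.0 s is used as a ceiling for "<1 sec avg",
# so the result is a pessimistic estimate.
# Assumption: steady arrivals, one request at a time per instance.

def offered_load(requests_per_second: float, avg_latency_seconds: float) -> float:
    """Average number of requests in flight at any moment (Little's law)."""
    return requests_per_second * avg_latency_seconds


arrival_rate = 1 / 8      # about 1 request every 8 seconds
avg_latency = 1.0         # upper bound on the reported average latency
dedicated_instances = 3   # resident instances configured in the post

load = offered_load(arrival_rate, avg_latency)

print(f"requests in flight on average: {load:.3f}")
print(f"share of one instance's time: {load:.1%}")
print(f"share of {dedicated_instances} instances' time: {load / dedicated_instances:.1%}")
```

Under those assumptions the app keeps roughly one eighth of a single instance busy, which is why the extra instances look surprising. Bursty traffic or slow outlier requests could raise the short-term load well above this average.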

Thanks for any insights!

Peter

Attachments: instances.png, latency.png, requests.png

Per

Dec 6, 2012, 7:47:21 PM

to google-a...@googlegroups.com

Hi Peter,

> I currently have the "pending latency" slider set to 14.9 secs min and 15.0 secs max. I have also set it to "auto"/"auto" per tips in other threads but can't tell a difference.

That's most likely your problem. Check out this video: http://www.youtube.com/watch?feature=player_embedded&v=zQ5_47zy4bY#! from 25:45 onwards. You don't want to micro-manage the scheduler too much, or it will get confused. The scheduler may not be great for low-usage apps, but it can do a lot better than what you're seeing. I'd recommend having only one resident instance. We're using "1 - automatic" for instances and "auto - auto" for pending latency. We have about 1-2 requests per second during the week, but maybe only 0.2 on the weekend, so it's close enough to what you have.

Cheers,
Per

Peter Warren

Dec 7, 2012, 2:22:30 PM

to google-a...@googlegroups.com

Thanks for the tip! Upping the max idle instances makes a huge difference (see attached chart). Previously I had it capped at 3 max idle instances and then 5 for testing.

Our latency has gone up but our instance graph is way, way more stable.

Attachment: instances.png
