Peter Warren
Dec 6, 2012, 4:19:03 PM
to google-a...@googlegroups.com
Despite very low request volume (about 1 request every 8 seconds), low latency (<1 sec average), and 3 dedicated instances, our application still gets frequent warmup requests.
See attached charts for requests, instances, and latency.
All our app does is serve static resources and items from memcache, and I don't see any OOM exceptions in our logs, so I don't think we're churning instances because of memory limits.
I figured I could try brute-forcing things and raised our number of dedicated instances to 5. That didn't help at all and only cost us more money. Look at the attached instances chart: Google is now using an average of about 8 instances to serve exactly the same volume that 5 instances handled previously. It does look like our latency decreased a bit, but we still get tons of warmup requests.
I currently have the "pending latency" slider set to 14.9 secs min and 15.0 secs max. I have also tried setting it to "auto"/"auto", per tips in other threads, but can't tell a difference.
So 2 questions:
1) Why does Google need more than 3 instances to serve 1 relatively low-load request every 8 seconds? I would think 3 dedicated instances should be far more than sufficient for that volume.
2) Why does the number of instances seem unrelated to load? I would think that more dedicated instances would mean fewer dynamic instances get created (assuming, of course, that the dedicated instances are under a manageable load). That doesn't seem to be the case.
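To put rough numbers on question 1, here is a back-of-the-envelope sketch using Little's law (average requests in flight = arrival rate x latency). The function and variable names are made up for illustration, the 1.0 s latency is a pessimistic reading of "<1 sec avg", and the per-instance figure assumes each instance handles one request at a time, which may not match the real configuration.

```python
# Rough load estimate via Little's law: in_flight = arrival_rate * latency.
# All names here are hypothetical; the inputs are the figures quoted in this post.

def average_concurrency(requests_per_second: float, latency_seconds: float) -> float:
    """Average number of requests in flight at any given moment."""
    return requests_per_second * latency_seconds

arrival_rate = 1 / 8        # about 1 request every 8 seconds
latency = 1.0               # "<1 sec avg", rounded up to a pessimistic 1 s
dedicated_instances = 3     # assumption: one request at a time per instance

in_flight = average_concurrency(arrival_rate, latency)
per_instance_utilization = in_flight / dedicated_instances

print(f"average requests in flight: {in_flight:.3f}")
print(f"utilization per dedicated instance: {per_instance_utilization:.1%}")
```

Under those assumptions the average works out to 0.125 requests in flight, or roughly 4% utilization per dedicated instance, which is why the extra dynamic instances look so surprising.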
Thanks for any insights!
Peter