Hi,
I've been investigating reports that our app is sometimes slow, which led me to discover that many times a day we have multiple instances being shut down, creating latency spikes while new instances spin up. Based on the Instances view in the GCP App Engine console, the shutdowns appeared to be the result of exceeding the memory limit of our instance class, so I recently moved to a larger instance class and decreased max-concurrent-requests for our app. This change does appear to distribute traffic across multiple instances, rather than a single instance, most of the time.
Unfortunately, we still have multiple instances being shut down now and then, even though their memory usage (according to the Instances view) is far below what the new instance class allows. I also noticed that, after a shutdown, traffic is often sent to newly started instances rather than to any idle Dynamic instances that may be present. (I have later seen these dormant Dynamic instances start receiving traffic again.)
The only solution I see is to increase the minimum number of idle (Resident) instances, so that we can absorb multiple instances suddenly being shut down. But is there a way to prevent the multiple-instance shutdowns from occurring in the first place? Adding Resident instances would increase costs to work around something that seems like it should be avoidable.
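For reference, the settings I'm talking about sit in app.yaml roughly as sketched below. This is a fragment with placeholder values, not our exact configuration, and it assumes the standard environment with automatic scaling:

```yaml
# app.yaml fragment (App Engine standard environment); all values are placeholders
instance_class: F4              # the larger instance class (placeholder value)
automatic_scaling:
  max_concurrent_requests: 10   # lowered so load spreads across more instances (placeholder value)
  min_idle_instances: 1         # idle (Resident) instances kept warm; raising this raises cost
```

Raising min_idle_instances is the change I would rather avoid if the shutdowns themselves can be prevented.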
That's the main issue. The other problem is that I don't trust the GCP App Engine console Instances view itself. While conducting my investigation this morning, I've seen:
1) Active instances with Start Times around 6:30 AM that suddenly appeared in the list when I checked at 12:00 PM, even though they had not been in the list during the prior few hours that I had been checking it.
2) An active instance (Dynamic) that was apparently using 0 B of Memory.
Is the Instances view having trouble lately, or am I relying on it too much as a debugging tool?
Thank you