Hi
I second Adam's suggestion to use manual scaling in order to avoid latency issues. We have been running on manual scaling for several years now, and in our experience it has proved to be the only scaling method that guarantees no latency issues.
For example, when we tried basic-scaling, we discovered that:
1. There is no min-instances parameter. In the worst case App Engine will scale down to 1 instance, and if you then encounter a sudden traffic spike there won't be any prewarmed, unused instances to absorb it. We would like to see a config option that allows min-instances to be set for basic-scaling.
2. Even though we had warmup requests configured, some user requests still went to newly created instances, so those requests paid the extra latency of being the first request served by a new instance.
As our application has to deliver consistent performance, we chose manual-scaling instead.
@Adam, regarding the setNumInstances() method: is there a way to programmatically retrieve the number of active instances, i.e. the value shown by the green 'active' line in the dashboard when 'Instances' is selected? Or, similarly, the current latency value shown when 'Latency' is selected?
For our application, manual-scaling seems the better fit, combined with a custom load-balancing control function that calls setNumInstances() to scale the number of instances according to load.
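To make the idea concrete, here is a rough sketch of what such a control function could look like. It is only an illustration: the class name LoadBasedScaler, the requests-per-second input, and the per-instance target are all hypothetical, and where the load figure comes from is exactly the open question above. The callback is assumed to wrap the setNumInstances() call; it is kept abstract here so the sizing logic stays independent of the App Engine SDK.

```java
import java.util.function.LongConsumer;

/**
 * Hypothetical load-based controller for a manually scaled module.
 * Nothing here is an App Engine API; the callback passed in is assumed
 * to wrap setNumInstances() in a real deployment.
 */
final class LoadBasedScaler {
    private final long minInstances;          // floor, keeps spare capacity warm
    private final long maxInstances;          // ceiling, caps cost
    private final double targetPerInstance;   // assumed load one instance can handle (req/s)
    private final LongConsumer applyInstanceCount;
    private long current;

    LoadBasedScaler(long minInstances, long maxInstances, double targetPerInstance,
                    long initialInstances, LongConsumer applyInstanceCount) {
        if (minInstances < 1 || maxInstances < minInstances || targetPerInstance <= 0) {
            throw new IllegalArgumentException("invalid scaler bounds");
        }
        this.minInstances = minInstances;
        this.maxInstances = maxInstances;
        this.targetPerInstance = targetPerInstance;
        this.applyInstanceCount = applyInstanceCount;
        this.current = clamp(initialInstances);
    }

    private long clamp(long n) {
        return Math.max(minInstances, Math.min(maxInstances, n));
    }

    /** Pure calculation: instances needed for the observed load, clamped to the bounds. */
    long desiredInstances(double requestsPerSecond) {
        long needed = (long) Math.ceil(requestsPerSecond / targetPerInstance);
        return clamp(needed);
    }

    /**
     * Feeds one load sample into the controller. The callback is invoked
     * only when the desired count differs from the count currently in effect.
     * Returns the instance count in effect after the sample.
     */
    long onLoadSample(double requestsPerSecond) {
        long desired = desiredInstances(requestsPerSecond);
        if (desired != current) {
            applyInstanceCount.accept(desired);
            current = desired;
        }
        return current;
    }
}
```

The controller would be fed periodically (e.g. from a cron-triggered handler) with whatever load metric turns out to be retrievable. The lower bound plays the role of the min-instances setting we are missing in basic-scaling. A real version would probably also want some damping so that it does not resize on every small fluctuation.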
Thanks & Regards
Marcel