Autoscaling with Flex


Patrick Jackson

Oct 12, 2017, 8:31:47 AM
to Google App Engine
Hi Folks,

With regard to autoscaling in the flex environment, what sort of spin-up time can be expected? In other words, once the scheduler decides to create a new instance, how long before that instance is ready for traffic? I am working with Java 8 on a small-to-medium-sized application, but insights from other runtimes are welcome.

We've tried GAE Standard with Java 8, but have been completely dissatisfied with autoscaling in the Java environment. Startup of a new instance takes 9+ seconds, and there does not appear to be a way to keep user-facing requests from hitting a cold start. I've tested resident instances and cron jobs for "always on", but it just does not appear to be possible. There are plenty of other anecdotes in this group to assure me that preventing a user-facing request from hitting a cold start (or heavy latency because startup has not completed) is simply not possible on GAE Standard. If I'd known this up front, I would have chosen Go or Python. If Google were more up front about this situation, it would help many devs and GAE as a whole.

So I am now investigating whether the flex environment can serve traffic without user-facing high latency caused by an instance starting up. I'm also curious about the time it takes to start instances. One speaker at Cloud Next said a couple of minutes, which is much higher than GAE Standard.

Patrick Jackson

Yannick (Cloud Platform Support)

Oct 12, 2017, 12:03:55 PM
to Google App Engine
Hello Patrick,

As can be seen on the high-level feature comparison chart, instances on App Engine Flexible are indeed expected to start more slowly than on Standard. That is because you are booting up an entire VM as opposed to a simple container.

You can certainly keep a number of instances up and ready to take requests using cron jobs, but there will always come a time when your application needs to scale up because it is receiving too much new traffic, and cold starts will occur then. The scheduler attempts to reduce the incidence of cold starts by scaling up the number of instances quickly to handle traffic spikes. And yes, in the end a Go application would start faster than a Java application.

Tom Stuart

Jul 2, 2018, 4:36:46 PM
to Google App Engine
Hi Yannick,

Do you have any examples of bumping up the number of instances using cron jobs?

Thanks
Tom

Jordan (Cloud Platform Support)

Jul 4, 2018, 9:45:15 AM
to Google App Engine
Yannick was referring to the old method in App Engine Standard where you would use a cron job to ensure App Engine never scaled to 0, since scaling to 0 causes the first incoming requests to see delays due to cold starts. This is the same method Patrick mentioned in his original message, which he attempted to use in order to mitigate cold starts. More information about this can be found in the Always-Cron instances section of the blog 'App Engine Resident instances and the startup time problem'.

This is no longer required for App Engine Standard, as the new Clone Scheduler has been released, removing this issue. Now the scheduler treats resident idle instances and dynamic instances as equals, meaning that the 'min-idle-instances' and the new 'min-instances' configuration settings work as expected to preemptively ensure instances are warmed up to avoid cold starts (where min-instances is the absolute minimum number of instances, and min-idle is the number always kept up and running in excess of current load to preemptively handle traffic spikes).
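To make the difference between the two settings concrete, here is a toy model in Python. It is only an illustration of the description in this post, not the scheduler's actual algorithm; the function name and the idea of counting "busy" instances are assumptions made for the example.

```python
def instances_kept_running(busy, min_instances, min_idle_instances):
    """Toy model: how many instances stay up when `busy` instances are serving load.

    Assumption for illustration only: min_instances is a hard floor, and
    min_idle_instances is spare capacity held on top of the busy instances.
    """
    if min(busy, min_instances, min_idle_instances) < 0:
        raise ValueError("counts must be non-negative")
    # Whichever is larger wins: the absolute floor, or current load plus idle headroom.
    return max(min_instances, busy + min_idle_instances)
```

With no traffic the floor applies; once load exceeds the floor, the idle headroom is what keeps warm instances ahead of demand.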

In App Engine Flexible you would use the analogous 'min_num_instances' setting to preemptively start more instances than are currently required, in order to prevent initial cold starts when traffic does arrive. If you are instead looking for a way to change the scaling configuration from code, you would use the Admin API to patch the scaling settings of the specific version you wish to modify.
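As a rough sketch of what such an Admin API patch could look like, the Python below only builds the request (method, URL, body) and does not send it. The app, service, and version IDs are hypothetical placeholders, and the field names (`automatic_scaling.min_total_instances` in the update mask, `minTotalInstances` in the body) reflect my reading of the Admin API reference for `apps.services.versions.patch`, so please verify them against the current docs. Sending the request also needs an OAuth 2.0 access token, which is not covered here.

```python
import json
from urllib.parse import urlencode

ADMIN_API = "https://appengine.googleapis.com/v1"


def build_min_instances_patch(app_id, service, version, min_instances):
    """Return (method, url, body) for a patch that changes only the minimum
    instance count of one Flexible version."""
    # Assumption: a Flexible version always keeps at least one instance.
    if min_instances < 1:
        raise ValueError("min_instances must be at least 1")
    # The update mask restricts the patch to a single field, so the rest of
    # the version's scaling configuration is left untouched.
    query = urlencode({"updateMask": "automatic_scaling.min_total_instances"})
    url = f"{ADMIN_API}/apps/{app_id}/services/{service}/versions/{version}?{query}"
    body = json.dumps({"automaticScaling": {"minTotalInstances": min_instances}})
    return "PATCH", url, body
```

For example, `build_min_instances_patch("my-app", "default", "v1", 4)` (all three IDs made up) produces a PATCH against `.../apps/my-app/services/default/versions/v1` whose body asks for a minimum of 4 instances.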


Parth Mishra

Jul 5, 2018, 9:42:26 AM
to Google App Engine
Using this admin API patch method, could I set/alter the `min_num_instances` to be a different number at different times of day? In other words, could I possibly automate an Admin API call to preemptively set a new minimum number of instances at known high traffic periods? 

Jeff Schnitzer

Jul 6, 2018, 11:12:42 AM
to Google App Engine
I can offer one observation about GAE/Standard using Java with 20+s app startup times: User-facing cold starts can be problematic when you have low/intermittent traffic, but it smooths out when you have some traffic. I don't typically see user-facing cold starts; my guess is that GAE spins up enough "extra" instances naturally to handle the load, and the consequence is just a bit higher bill.

I've also never found a setting that improved this situation beyond the default. Everything I tried made latency worse. So I've left it pretty much alone and been reasonably happy with the results... even though I used to bitch heavily about the issue here.

Jeff

Jordan (Cloud Platform Support)

Jul 9, 2018, 3:30:18 PM
to Google App Engine
As Jeff mentioned, when you are already receiving traffic, App Engine will perform automatic scaling to meet the requirements of your traffic to reduce user-facing cold starts. So setting up an automated system to change your instance count would be competing with automatic scaling, and it is better to just let App Engine take over instead.

Setting the min instance count is most useful before traffic comes in, or ahead of a major traffic spike: spikes are very hard to predict, and if App Engine doesn't have enough instances it can only scale up reactively after the spike occurs. Therefore you can indeed create an automated script that preemptively increases your min instance count before known traffic spikes occur, but you should then let App Engine take over once traffic is already flowing.
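A minimal sketch of the scheduling half of such a script, in Python. The schedule, the baseline value, and the function name are all made up for illustration; each window is meant to open a little before a known spike and close shortly after it starts, so that automatic scaling takes over from there. Applying the returned value (for example through the Admin API patch mentioned earlier in the thread) is left out.

```python
from datetime import time

# Hypothetical schedule: (window start, window end, min instances in the window).
SCHEDULE = [
    (time(8, 30), time(10, 0), 6),   # raise the floor ahead of a morning peak
    (time(17, 0), time(19, 0), 8),   # raise the floor ahead of an evening peak
]
BASELINE_MIN = 2  # hypothetical floor used outside every window


def target_min_instances(now, schedule=SCHEDULE, baseline=BASELINE_MIN):
    """Return the min instance count that should be in effect at time `now`."""
    for start, end, count in schedule:
        # Windows are half-open: the floor drops back exactly at `end`.
        if start <= now < end:
            return count
    return baseline
```

A job that runs every few minutes could call `target_min_instances(datetime.now().time())` and only issue an update when the value differs from what is currently configured.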