avoiding cold starts


Steven Simon

May 3, 2022, 5:44:25 PM
to Google Cloud Developers
Hi,

We have a Python HTTP Cloud Function with long cold starts that we want to avoid. The long cold starts are due to loading a TensorFlow module used for natural language processing.

We have tried setting the "Minimum number of instances" in the Autoscaling section to 1. This seems to make cold starts less frequent, but they still occur. This is a problem for us because the cold starts take 45-60 seconds, which is too long for our user-facing application.

Is there any way to host our code in Google Cloud, as a Cloud Function or using some other hosting option that will completely eliminate cold starts?

Any help on this would be appreciated. Thank you.

Steven Simon

May 4, 2022, 3:16:55 AM
to Google Cloud Developers
Update
Based on this post, it looks like I need to "deploy a new function as second generation." I did that with this command line:

gcloud beta functions deploy getsimilarity \
--gen2 \
--runtime=python38 \
--trigger-http \
--entry-point get_similarity \
--source . \
--allow-unauthenticated \
--memory=4096MB \
--min-instances 1

The attachment "gen2.png" shows the function (getsimilarity) in the Cloud Functions console running as a gen2 function.

The attachment "autoscaling.png" shows that the minimum number of instances has been set to 1 (so that at least one instance will always be running).

However, the number of active instances is frequently 0, according to the "Active instances" graph in the Cloud Functions console (as shown in "activeinstances.png").

What am I doing wrong?

Any help would be greatly appreciated.
autoscaling.png
activeinstances.png
gen2.png

Steven Simon

May 4, 2022, 5:47:12 PM
to Google Cloud Developers
Update
I have determined that there is no possible way to do what we need with Google Cloud Functions -- setting the minimum number of instances to 1 may make cold starts less frequent or reduce how long they take, but it doesn't eliminate long start-up times, and eliminating them is the only acceptable option for our user-facing app. (See, for instance, https://issuetracker.google.com/issues/158014637.) Therefore, I am now researching Cloud Run, Compute Engine, and AWS Lambda functions.

Tracy Hall

May 6, 2022, 3:05:57 AM
to Google Cloud Developers
I'll mention here that minimum=1 will definitely reduce cold starts if your calls are intermittent/non-overlapping. If, however, you are experiencing calls to the function *while a previous call is still running*, you will run into at *least* queuing, and possibly a cold start of a 2nd *instance*.
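
To make the overlapping-calls point concrete, here is a minimal sketch in Python. It is a toy model, not how Cloud Functions actually schedules work: it assumes each instance serves one request at a time, that `min_instances` instances are already warm at time 0, and that extra instances are started cold on demand and never shut down. The function name `count_cold_starts` and the request tuples are made up for illustration.

```python
def count_cold_starts(requests, min_instances=1):
    """Count requests that need a brand-new (cold) instance.

    requests: iterable of (start_time, duration) tuples.
    Toy model (assumptions): one request per instance at a time,
    `min_instances` warm instances exist at time 0, and instances
    added later are never shut down. Cold start duration is ignored.
    """
    # Time at which each existing instance becomes idle again.
    busy_until = [0.0] * min_instances
    cold_starts = 0
    for start, duration in sorted(requests):
        for i, free_at in enumerate(busy_until):
            if free_at <= start:
                # An idle, already-started instance takes the request.
                busy_until[i] = start + duration
                break
        else:
            # Every instance is busy: a new instance must cold start.
            cold_starts += 1
            busy_until.append(start + duration)
    return cold_starts


# Intermittent, non-overlapping calls: the single warm instance is enough.
spaced_out = [(0, 1), (2, 1), (4, 1)]
# Overlapping calls: the 2nd and 3rd arrive while the 1st is still running.
overlapping = [(0, 5), (1, 5), (2, 5)]

print(count_cold_starts(spaced_out, min_instances=1))
print(count_cold_starts(overlapping, min_instances=1))
print(count_cold_starts(overlapping, min_instances=3))
```

In this model the spaced-out calls never trigger a cold start, while the overlapping calls do unless enough instances are already warm to cover the peak.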

Tracy Hall
LeadDreamer

Rogelio Monter

May 6, 2022, 7:00:46 PM
to Google Cloud Developers

I agree that setting min instances to 1 should reduce cold starts, as shown in the blog entry "New Cloud Functions min instances reduces serverless cold starts", but they cannot be eliminated, as explained in this Stack Overflow answer:

    “With all "serverless" compute providers, there is always going to be some form of cold start cost that you can't eliminate. Even if you are able to keep a single instance alive by pinging it, the system may spin up any number of other instances to handle current load. Those new instances will have a cold start cost. Then, when load decreases, the unnecessary instances will be shut down.”

Tracy Hall

May 6, 2022, 7:03:23 PM
to Google Cloud Developers
Yeah, saw that.  Which is why I followed up over here:

https://groups.google.com/g/firebase-talk/c/JkqQLSc6hvs

 with a feature request:

When discussing cold vs warm starts, and minimum_instance on another Google group, we realized that having one instance already running when another invocation comes in doesn't help the *second* invocation - *that* invocation *still* has to cold-start.

It seems what we *really* want is NOT a minimum # of instances - but a minimum # of EXTRA instances MORE THAN ARE CURRENTLY RUNNING. Let's call this "minimumExtra"

So with minimumExtra=1
If no function is currently running - one "extra" instance is kept warm for a total of 1 instance
If one function is currently running - one "extra" instance is kept warm for a total of 2 instances
If 10 functions are currently running - one "extra" instance is kept warm for a total of 11 instances

The goal is not "at least one is warm" - the goal is "at least one is warm and NOT BEING USED" to be ready to pick up the NEXT invocation.

Should be settable from 1 to "n" depending on your systems burst levels, etc.
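
The difference between the two policies can be sketched in a few lines of Python. This is only an illustration of the request above: "minimumExtra" is the proposed setting, not an existing Cloud Functions option, and the function names `warm_target` and `idle_with_min_instances` are hypothetical. The second function assumes a simplified reading of the existing setting, where the minimum is a floor on the total number of instances.

```python
def warm_target(busy_instances, minimum_extra=1):
    """Total instances to keep alive under the proposed minimumExtra policy.

    Hypothetical: always keep `minimum_extra` idle instances on top of
    however many are currently busy.
    """
    return busy_instances + minimum_extra


def idle_with_min_instances(busy_instances, min_instances=1):
    """Idle warm instances guaranteed by a plain minimum-instances floor.

    Simplified assumption: the floor only guarantees a total count, so
    once `busy_instances` reaches it, no idle instance is guaranteed.
    """
    return max(min_instances - busy_instances, 0)


for busy in (0, 1, 10):
    print(
        f"busy={busy:2d}  "
        f"minimumExtra=1 total={warm_target(busy):2d}  "
        f"min_instances=1 guaranteed idle={idle_with_min_instances(busy)}"
    )
```

With a plain floor of 1, the guaranteed idle capacity drops to zero as soon as one call is in flight, which is exactly the case the request is meant to cover.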

Tracy Hall
LeadDreamer

Alejandro Lorea Sanchez

May 10, 2022, 7:00:12 PM
to Google Cloud Developers

You need to take these cold starts into account, since the documentation itself warns about them. This doesn't mean the service is bad, only that it takes a different approach. In this case, Cloud Functions are a good fit for a service that is constantly running rather than one that sits idle between calls.

It is possible to reduce the cold starts and it has been a widely discussed topic. 

There are:

- This great Medium article: Improving Cloud Functions cold start time

- An official video with Doug Stevenson: Minimizing cold start time (Firecasts)

- The Tips & Tricks page, which has a few more details on how to improve your project to reduce them.

- And what has already been discussed about min instances.


If you want a service that is always up and running with a quick response, then, as you previously stated, I would go for Cloud Run or App Engine.

Steven Simon

May 13, 2022, 7:45:54 PM
to Google Cloud Developers
I previously went through all the information mentioned in the preceding post, did what was suggested, and found my use case was still not addressed.

With all due respect to the people at Google, saying that Google has addressed cold starts in Cloud Functions is at best misleading and at worst disingenuous. Why else do I keep seeing the same comments on the web, over and over, saying that minimum instances is not addressing problems with cold starts? 

Google would do everyone a favor by being very clear in the Cloud Functions documentation that there is no way to avoid "startup latency" with Cloud Functions, rather than talking about "minimizing cold starts" or "reducing their impact." Using that kind of language is a red herring.

Just say something like: if startup latency is an issue, don't even think about using Google Cloud Functions.

It shouldn't be so hard to find this information.

It appears that Google still hasn't figured out how to do platforms 11 years after Stevey's rant. That's why I'm looking into AWS.

