Old versions still running (flexible environment)


Alan deLespinasse

Mar 16, 2017, 3:10:07 PM
to Google App Engine
I'm not sure if this is a bug or expected behavior. It's not what I expected, but I'm still somewhat confused by the Cloud Console.

I have several services for which I noticed there were still some old versions running, although the old versions were serving 0% of traffic. Example:


The two selected versions were made obsolete by the top version. Normally when I deploy a new version, the previously running version is shut down. In these cases, I think a new version never succeeded in starting up, because I had a bug that would simply crash every time. So here's what apparently happens:
  1. Version A is running
  2. I try to deploy version B; it repeatedly fails to start up. My "gcloud app deploy" command eventually returns an error, telling me that deployment failed. Version A is still serving all traffic. Version B is apparently still running, though I didn't know that until just now.
  3. I fix the bug and deploy version C. It starts up, and version A is correctly stopped. Version B is apparently still running, though.
So, is this a bug or expected? Is it costing us money? I'm finding it a bit hard to tell what our recent charges are for, although they are a bit higher than expected (still looking into that). I'd submit a billing ticket if I was sure.

Adam (Cloud Platform Support)

Mar 18, 2017, 12:39:42 PM
to Google App Engine
Are you using manual scaling or automatic scaling in your app.yaml? If you're using manual scaling, App Engine will try to keep instances alive indefinitely until you manually shut them down.

Alan deLespinasse

Mar 18, 2017, 1:17:26 PM
to Google App Engine
We have some services that are manual and some that are automatic. I'm pretty sure both have had this problem.

I wouldn't have interpreted that documentation as saying that manual scaling would keep old versions running. Our manual-scaling services are configured to have one instance. We've ended up having multiple versions running, each with one instance, although all traffic was routed to a single version.

And like I said, normally when we deploy a new version, the old version is shut down automatically. It's just when a second version has failed to start up, and we deploy a third version, only the first version is shut down, and the second version still appears to be running, with 0% traffic. The behavior of the gcloud app deploy command in this case is misleading.

Zachary Fewtrell

Mar 20, 2017, 12:28:04 PM
to Google App Engine
Hi Alan,
Generally we would expect a failed deployment to roll back (i.e. delete the new version and leave the old one running). However, your example is from January, and some rollback-related issues have been addressed since then. If you are still getting the issue, I recommend you file an issue in our tracker.

Regards,
 Zach Fewtrell

Alan deLespinasse

Mar 20, 2017, 12:35:08 PM
to Google App Engine
Thanks! I'll wait and see if it happens again.

Matthew Kime

Jul 14, 2017, 6:57:00 PM
to Google App Engine
Have your problems been fixed? I'm still seeing them.

Alan deLespinasse

Jul 14, 2017, 7:27:53 PM
to Google App Engine
I did have one case a couple weeks ago where an old version was unexpectedly still running. So I guess it's not 100% fixed on Google's side, although that was the only time I've seen it since January.

Matthew Kime

Jul 16, 2017, 4:04:17 AM
to Google App Engine
Which version of App Engine are you using? We're using the flex environment with a custom runtime.

thanks,
matt

Alan deLespinasse

Jul 17, 2017, 9:23:34 AM
to Google App Engine
Flexible, Node.js 6.9.3 or something.

Samuel Richardson

Feb 25, 2018, 5:48:42 PM
to Google App Engine
We're also seeing this

Hendrik Kleinwächter

Feb 26, 2018, 9:49:58 AM
to Google App Engine
We are seeing this as well, using Ruby and the flexible environment. Any idea what this could be? Are the stale instances still producing costs?

Linus Larsen

Mar 2, 2018, 6:58:06 PM
to Google App Engine
Yep, this is really easy to reproduce in the standard env as well. Just deploy a new version and migrate traffic to it. The old version still keeps instances running (in my case with both Python and Java).

Jeremy D

Mar 9, 2018, 3:31:30 PM
to Google App Engine
Seeing this in standard env on Python as well. Traffic allocation is set to 100%, but the last 2 versions (which are several days old) are still running 3 or 4 instances.

Daniel Iñigo

Jul 19, 2018, 8:56:13 AM
to Google App Engine
I found myself in the same situation.

A couple of months ago we were going through heavy development / deploying / testing and ended up with a huge bill. Can you confirm that using automatic scaling and setting it to 1 max-instances will fix it?

Thanks in advance.

Ani Hatzis

Jul 19, 2018, 9:37:26 AM
to google-a...@googlegroups.com
Hi Daniel,

I use automatic scaling with 1 max-instances in my dev project. This is one thing that helps to limit the costs of the dev environment.

I also debug and test the app locally first, before I deploy anything to GAE. In your case with Node.js flex:
npm start
You can do this in Cloud Shell, too.

Another measure was to use a different app.yaml and cron.yaml in the dev project than in the production project. In app.yaml I define auto-scaling with max-instances 1 and F1 (because 28 hours of F1 are free in the standard environment). Although there is no free tier for flex instances, you can still define cheaper virtual machine types for dev. Cron jobs are scheduled far less frequently than in production, because I typically debug/test them by triggering them manually, and I don't need them to actually run in the dev project, in contrast to production.
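To make that concrete, here is a minimal sketch of what such a dev-only app.yaml could look like in the standard environment. The runtime value is just a placeholder I picked for illustration, not something from this thread; adjust it to your app.

```yaml
# Dev-project app.yaml (standard environment). Values are illustrative.
runtime: python39        # placeholder runtime, an assumption for this sketch
instance_class: F1       # smallest frontend instance class
automatic_scaling:
  max_instances: 1       # do not scale beyond one instance in dev
```

For the flexible environment the keys differ. As far as I know, the instance cap there is max_num_instances under automatic_scaling, and the machine size is set in a resources section, so check the app.yaml reference for your environment before copying this.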

You probably also want to make sure that in the dev (!) project every previous version is stopped with each new deploy and the new version is promoted to the default version, e.g.:

gcloud app deploy app.yaml --promote --stop-previous-version

See the gcloud app deploy reference for details. Alternatively, you could overwrite an existing version by specifying the version to replace:

gcloud app deploy app.yaml --version=v-ani-dev

This would also work with multiple devs deploying to the same dev project, although I usually would go with one dev project per dev. Then also have a QA project where you can deploy and test release candidates before they are deployed into staging or production. (Basically, dev -> qa -> staging -> production)
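If versions have already been left running with no traffic, as described at the top of this thread, something like the following rough sketch could help find them. It assumes the default table output of gcloud app versions list has the columns SERVICE, VERSION.ID, TRAFFIC_SPLIT, LAST_DEPLOYED and SERVING_STATUS in that order; the function name stale_versions is made up for this example.

```shell
#!/bin/sh
# stale_versions: hypothetical helper, not part of gcloud.
# Reads the table printed by `gcloud app versions list` on stdin and prints
# "SERVICE VERSION" for every version that has a traffic split of 0 but is
# still marked SERVING. Assumes the five-column layout described above.
stale_versions() {
  awk 'NR > 1 && $3 + 0 == 0 && $5 == "SERVING" { print $1, $2 }'
}

# Possible usage (commented out so that sourcing this file changes nothing):
#
#   gcloud app versions list | stale_versions
#
# and, after checking the list by hand, stopping each one:
#
#   gcloud app versions list | stale_versions | while read -r svc ver; do
#     gcloud app versions stop "$ver" --service="$svc" --quiet
#   done
```

I would review the list before stopping anything. As far as I know, stopping is not available for every scaling mode (automatic scaling in the standard environment, for example), in which case deleting the old version with gcloud app versions delete is the alternative.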

One last thing: I do not use production data in dev or QA (for cost reasons, but also for security reasons). Instead I use only a small subset of fictitious sample data that reflects the nature of the production data. You can always use the import/export features of Cloud Datastore, Cloud SQL etc. to restore sample data, for example if you debug a new feature that changes the schema. Only provide production data to the staging project (using export/import again) if you need to test that a migration will run successfully through your entire production data. That can be quite expensive though.

Hope that gave you some ideas.



--
Ani Hatzis
Consultant and developer for Google Cloud solutions

George (Cloud Platform Support)

Jul 19, 2018, 3:34:07 PM
to Google App Engine
The proper way to get due attention for your issue is to register it in the Public Issue Tracker. All useful information should be there, such as project ID, version number, programming language, and output of the gcloud info command. To keep details such as project ID private, one needs to set the category of the issue to "Private Issues". 

This discussion group is oriented more towards general opinions, trends and issues of a general nature touching App Engine. Deployment or coding issues should get much more attention in the Public Issue Tracker.