Do background threads prevent instances created with basic scaling from getting shut down?

Erik Kuefler

Dec 9, 2015, 4:17:16 AM

to Google App Engine

The description of basic scaling is a little vague about the exact conditions that cause an instance created with basic scaling to be shut down. It says an instance is evicted when it "has not received a request for more than `idle-timeout`", but it's not clear to me whether this includes background threads and the "/_ah/background" request that they generate.

My problem is that my machines are staying alive longer than I expect them to and I suspect a background thread might be the culprit. I'm using a library (Firebase) that creates background threads but provides no way of shutting them down. Will this cause my machines to never be evicted? If so, is there a way for me to either forcibly terminate all background threads, or to shut down the instance from my own background thread (so I can implement my own idle timeout)?

Christian F. Howes

Dec 9, 2015, 3:09:14 PM

to Google App Engine

interesting question - where i have used background threads i have done so on manually scaling instances.

on our manually scaling instances we have successfully used the modules API (https://cloud.google.com/appengine/docs/python/refdocs/google.appengine.api.modules.modules) to control startup and shutdown of instances. I think this can be used to forcibly shut down an autoscaling instance as well.

Hope that helps!

cfh

Nick (Cloud Platform Support)

Dec 9, 2015, 7:23:46 PM

to Google App Engine

Hey Erik,

Yes, background threads that have yet to terminate will cause the instance to remain standing, regardless of the idle timeout. I've just submitted docs feedback which will hopefully see this fixed shortly in the "Modules in Python" docs page. You can, as Christian suggests, use the modules controls to manually turn down an instance. I'm curious which Firebase library you're using, by the way, and how it uses background threads?

Nick (Cloud Platform Support)

Dec 9, 2015, 7:51:00 PM

to Google App Engine

... and to clarify, by "fixed" in the docs I mean that it should be mentioned there. The behaviour makes sense: each background thread actually enqueues a long-running request that is not governed by the same execution timeline as a normal request. When it finally terminates, it leaves a log line timestamped at its termination, while the milliseconds recorded for the request show its actual duration; subtract that duration from the log line's timestamp to determine when the request started. Since this request (to /_ah/background) is "being handled" and "has not returned yet", the instance stays up.
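
As a rough illustration of the arithmetic described above, the start time of a background request can be recovered by subtracting the logged duration from the log line's timestamp. This is only a sketch: the class name, method name, and sample values are hypothetical and do not come from any App Engine API.

```java
import java.time.Duration;
import java.time.Instant;

final class BackgroundRequestTiming {
    /** Start time = timestamp of the log line (termination) minus the recorded duration. */
    static Instant startedAt(Instant loggedAt, long durationMillis) {
        return loggedAt.minus(Duration.ofMillis(durationMillis));
    }

    public static void main(String[] args) {
        // Hypothetical values: a log line stamped at termination, with a 60 s duration.
        Instant loggedAt = Instant.parse("2015-12-09T19:51:00Z");
        System.out.println(startedAt(loggedAt, 60_000L));
    }
}
```

With these sample values, the request would have started one minute before the timestamp on its log line.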

Erik Kuefler

Dec 10, 2015, 3:42:53 AM

to Google App Engine

Cool, that all makes sense, I think I understand what /_ah/background means now. So to clarify, every time I create a background thread, what actually happens is that App Engine sends a request to /_ah/background on the same instance to execute that thread, and I should expect to see log output from that thread in the /_ah/background request rather than in the log from the original request? This seems to match what I see with Firebase: the trace shows that it connects to a socket for exactly 60s, then the request completes. I assume this is Firebase choosing to connect for 60s at a time rather than App Engine killing it.

The modules API looks like it's on the right track, but it doesn't seem to have a "shut down the current instance" method. It only has methods to start and stop all instances for a version or to change the number of instances for a manually scaled app, which doesn't quite do what I need. Am I reading it correctly?

I'm using Firebase's newish JVM client (described a bit here), which has explicit support for App Engine. It looks like it's pretty much the same as the Android client with some reflection magic to figure out if it's running on GAE and adjust its thread creation strategy accordingly. I'm still trying to figure out exactly how it manages its threads (talking to them here), but I believe they're used to listen on web sockets for changes to remote data and invoke user callbacks. I think I'll try running some tests on a basic scaling module capped at one instance to try and deduce exactly what it's doing.

As long as you're submitting docs feedback, one other thing that I couldn't find described anywhere is how basic scaling instances handle concurrent requests. I assume it caps the number of concurrent requests in the same way autoscaling does, but the cap doesn't seem to be configurable like it is for autoscaling and it's not documented anywhere. I was getting what looked like a lot of CPU contention on my instances that I was only able to resolve by setting threadsafe to false, which was a bit more extreme than I would have preferred.

Nick (Cloud Platform Support)

Dec 11, 2015, 1:00:36 PM

to Google App Engine

Hey Erik,

It seems that /_ah/stop is only appearing as a signal in your logs once a shutdown event has occurred, either due to normal events, such as an idle timeout or a call to the Modules service, or due to instance failure.

In the other thread you linked, Michael from Firebase indicated that you can use one of their methods to attempt to shut down all the Firebase threads. This should prevent your instances from staying up due to idle background threads. Have you managed to attempt this, and did it work for you?

You were, though, more generally, looking for a way to start and stop instances in manual / basic scaling. Looking at the Modules service documentation (see Java's javadoc; note that Python's methods are parallel), there are methods to set and get the number of instances in a version/module, and methods to start and stop all instances in a version/module, but no methods for individual control.

You should feel free to file a Feature Request in the Public Issue Tracker for App Engine. You could also engineer various work-arounds, such as using Managed VMs, which can start and stop in a more controlled manner, also allowing process control, listing of threads, etc.

Let me know how this goes for you.

Erik Kuefler

Dec 14, 2015, 4:30:51 AM

to Google App Engine

Thanks, I filed https://code.google.com/p/googleappengine/issues/detail?id=12606 and https://code.google.com/p/googleappengine/issues/detail?id=12607 for features that would help me.

I'll be experimenting with the method mentioned on the other thread soon, but it's tricky since it basically kills all of the instance's threads instead of just the ones for the current request, so I have to be careful with it. I might end up having to effectively roll my own scaling by keeping track of the last request the instance gets and running a background thread that periodically checks the last request time, killing all of my background threads if it sees no request has come in for the past 10 minutes or so.
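
A minimal sketch of the "roll my own idle timeout" idea described above, assuming a hypothetical `IdleWatchdog` class: each incoming request records its arrival time, and a periodic check runs a shutdown hook once no request has arrived for the configured timeout. The class, its method names, and the `onIdle` hook are illustrative assumptions; what the hook actually does to stop the Firebase threads is left open. The clock is injected so the logic can be exercised without waiting ten minutes.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Tracks the last request time so a watchdog can decide when the instance is idle. */
final class IdleWatchdog {
    private final long idleTimeoutMillis;
    private final LongSupplier clock;   // injectable time source, in milliseconds
    private final Runnable onIdle;      // hypothetical hook: stop background threads here
    private final AtomicLong lastRequestMillis;

    IdleWatchdog(long idleTimeoutMillis, LongSupplier clock, Runnable onIdle) {
        this.idleTimeoutMillis = idleTimeoutMillis;
        this.clock = clock;
        this.onIdle = onIdle;
        this.lastRequestMillis = new AtomicLong(clock.getAsLong());
    }

    /** Call at the start of every incoming request, for example from a servlet filter. */
    void recordRequest() {
        lastRequestMillis.set(clock.getAsLong());
    }

    /** Call periodically from the watchdog thread; returns true if the idle hook ran. */
    boolean checkOnce() {
        long idleFor = clock.getAsLong() - lastRequestMillis.get();
        if (idleFor < idleTimeoutMillis) {
            return false;
        }
        onIdle.run();
        return true;
    }

    public static void main(String[] args) {
        long[] now = {0L};  // fake clock so the example runs instantly
        IdleWatchdog watchdog = new IdleWatchdog(
                600_000L,   // 10 minutes
                () -> now[0],
                () -> System.out.println("idle: stopping background threads"));
        now[0] = 300_000L;
        System.out.println(watchdog.checkOnce());  // still within the timeout
        now[0] = 900_000L;
        System.out.println(watchdog.checkOnce());  // timeout exceeded, hook runs
    }
}
```

In a real deployment the clock would be `System::currentTimeMillis`, and the thread that calls `checkOnce()` would itself have to exit after the hook runs, since per the explanation earlier in this thread any background thread still running keeps the instance up.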

Nick (Cloud Platform Support)

Dec 14, 2015, 6:25:06 PM

to Google App Engine

Hey Erik,

It does seem as if this is an unintended side-effect of the Firebase driver authors leaving their background threads running indefinitely, or for longer than the idle timeout you've specified anyways. We monitor the Public Issue Trackers quite regularly, so you should see responses on those threads shortly.

Best of luck!

Alexey

May 24, 2017, 7:04:00 AM

to Google App Engine

Curious to see if the issue Erik brought up has been addressed. The issues in the public tracker seem to have not had any movement beyond being delegated to appropriate teams. I'm struggling with backend instances not shutting down with the Firebase Admin SDK.

Alexey

May 25, 2017, 12:14:08 PM

to Google App Engine

Minor update. I have experimented a little bit with alternate ways to scale instances running the Firebase Admin SDK. I have tried the following approaches:
- Calling goOffline in the Firebase API upon completion of a single Firebase persistence call: no error, instances don't shut down, /_ah/background continues running.
- Calling setNumInstances in the GAE modules API to set the instance count to 0 under basic scaling of the service: error at runtime, setNumInstances is only allowed to control manually scaled instances.
- Calling setNumInstances in the GAE modules API to set the instance count to 0 under manual scaling, depending on the app logic: error at deploy, cannot set the number of instances to 0 in appengine-web.xml under manual scaling.

At this point, I know of no way to avoid having at least 1 backend instance running all the time if one wants to use firebase. My needs for firebase stemmed from Google's recommendation to switch to it, due to impending deprecation of channel service. Please advise on a way to proceed, if only to wait for Erik's "daemon threads" request to be addressed. As things stand, firebase is not a workable alternative to a working service due to be sunset. Any and all help appreciated!

Alexey

Torbjørn Smørgrav

May 26, 2017, 6:42:38 AM

to Google App Engine

Hi,

My experience:

I had the same issue, but for me goOffline seemed to work. It might take longer before an instance is shut down, but my prime concern was that I eventually reached the threshold for background threads. After going offline this is no longer the case.

Cheers,

-Toby

Torbjørn Smørgrav

May 26, 2017, 6:49:09 AM

to Google App Engine

Hmm... looked at the number-of-instances graph now and it seems the instance is pretty much up all the time (where it should have some hiatus periods).

Alexey

May 30, 2017, 11:52:11 AM

to Google App Engine

Toby, thank you!

Could anyone from Google Cloud team chime in on the status of this? The latest on issue https://issuetracker.google.com/issues/35900105 filed by Erik is that there appears to be some blocker (https://issuetracker.google.com/issues/26334734), but it doesn't seem to be publicly visible.

Alexey
