Behavior of activity manager when processes are OOM killed

Mike Hearn

Jan 17, 2009, 2:03:41 PM
to android-platform
Hello,

I am logcatting my device as I write my own app, and notice a strange
thing - as the device runs out of memory some of the background
services that are running are OOM killed by the kernel then
immediately scheduled for restart 5 seconds later. This doesn't seem
right - if the system needs the memory, surely making it available for
only 5 seconds is a bad idea.

Examples of these services are the gtalk service (I don't use GTalk),
the music player service and StreamFurious. It seems that these
services have been started in a mode where the system places a high
priority on keeping them running.

The code flow seems to be:

appDiedLocked
handleAppDiedLocked
cleanUpApplicationRecordLocked
killServicesLocked
scheduleServiceRestartLocked

From reading the docs, it sounded like services would be killed
permanently when memory got low if they weren't in use. None of these
background services should be doing anything right now - I'm not
playing music and haven't used GTalk for weeks. Are these apps not
telling the OS it's ok to shut them down? Or is this by design?

Mike Hearn

Jan 17, 2009, 5:37:49 PM
to android-platform
Actually, I guess the AM's behavior is reasonable in this case - it
can't tell the difference between an OOM kill and an actual crash of
the process. Even if it could, suppressing the restart until there's
more memory sounds tricky. This wouldn't matter, except that the
MediaPlaybackService doesn't shut itself down after 60 seconds like
it's supposed to. I read the code and can't figure it out.

When the MediaPlaybackService is OOM killed it'll be restarted in the
normal manner 5 seconds later, which means onStart will be called.
That registers a delayed message for 60 seconds later. That message
handler calls stopSelf() which RPCs back to the AM into stopService(),
which then calls bringServiceDownLocked. The first return statement
can't be taken, and "adb bugreport" shows no connections to the MPS so
the second codepath that could return can't be taken either. But then
it logs into the EventManager am_destroy_service .... and yet that
event is never logged. I can clearly see other services getting
destroyed like this but not the MPS, so it must bail out during one of
the steps I outlined (assuming messages don't get lost in the binder).

The only thing I can think of is that findService(Intent...) returns
null, in which case stopService silently returns. I haven't explored
what this function does yet but it looks pretty complicated.

I uploaded a bugreport file here: http://plan99.net/~mike/log2.txt

Dianne Hackborn

Jan 18, 2009, 1:00:37 AM
to android-...@googlegroups.com
Correct, when a process hosting a running service is killed, the activity manager will schedule it to be re-created at some point in the future, starting at 5 seconds, and increasing if it continues to get killed.  There is no way to know why the process went away (if it crashed or was killed due to memory pressure), nor whether it can now run (the act of killing and restarting a process may, for example, free up a bunch of memory it leaked), nor when it will be able to run in the future (due to other processes going away).
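
As a rough illustration of that schedule, here is a minimal Java sketch of a restart delay that starts at 5 seconds and grows while the service keeps getting killed. Only the 5-second starting point comes from this thread. The class name, the growth factor and the cap are assumptions made for the example, and this is not the actual activity manager code.

```java
/** Toy model of the restart policy described above; not the real activity manager code. */
final class RestartBackoff {
    // From the thread: the first restart is scheduled 5 seconds after the process dies.
    static final long INITIAL_DELAY_MS = 5_000L;
    // Assumption: the thread only says the delay increases; this factor is illustrative.
    static final long BACKOFF_FACTOR = 4L;
    // Assumption: a cap so the toy delay cannot grow without bound.
    static final long MAX_DELAY_MS = 60L * 60L * 1_000L;

    /** Delay before the next restart, given how many earlier restarts were killed in a row. */
    static long delayMs(int consecutiveKills) {
        long delay = INITIAL_DELAY_MS;
        for (int i = 0; i < consecutiveKills; i++) {
            // Grow the delay after each repeated kill, but never past the cap.
            delay = Math.min(delay * BACKOFF_FACTOR, MAX_DELAY_MS);
        }
        return delay;
    }

    public static void main(String[] args) {
        for (int kills = 0; kills < 6; kills++) {
            System.out.println("after " + kills + " earlier kill(s): restart in "
                    + delayMs(kills) + " ms");
        }
    }
}
```

With these assumed constants the delays run 5 s, 20 s, 80 s and so on until they reach the cap, so a service that keeps dying is retried less and less often.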

Ideally we want to keep all processes running that have requested to have a service running in them, but of course you can easily run out of RAM to do so.  So we let the viking killer start killing off those processes when needed, and try restarting them as we can.  Also explicit requests to start the services -- such as through the alarm manager, or calls by others to startService or bindService -- will result in the system bringing them back up, since this is a good indication they are really needed at that point.

This is also one of the reasons why we strongly encourage people to use the alarm manager or other approach for doing things in the background, instead of just leaving a service running all of the time.  You have much better guarantees about when you will actually be able to run that way.

Also note that when we restart a service after it has been killed, only onCreate() is called, and not onStart().  The naming here is a bit unfortunate -- the semantics of Service.onStart() are very different than Activity.onStart(), in that you will receive one onStart() call for each call to startService() with the parameters being given.  That is, it doesn't indicate a state, but tells you about the call.  So when the service is re-created, only onCreate() is called until another startService() call happens with a new Intent to deliver to the service.
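
To make the difference concrete, here is a small plain-Java model of those semantics. ToyService and ToyManager are hypothetical stand-ins, not the android.app.Service or activity manager classes, and armIdleTimer() only represents posting the delayed message that eventually calls stopSelf(). Arming the timer in onCreate() as well as in onStart() is one possible way for a service to still shut itself down after a restart. It is shown as an assumption, not as the fix any particular service uses.

```java
/** Toy model of the callbacks discussed above; these are NOT the Android framework classes. */
final class ServiceLifecycleDemo {

    /** Hypothetical service that wants to stop itself when idle. */
    static class ToyService {
        boolean idleTimerArmed = false;
        int startCalls = 0;

        // Runs whenever the instance is created, including a restart after the process was killed.
        void onCreate() {
            armIdleTimer();
        }

        // Runs once per startService() request; it is not called for a bare restart.
        void onStart(String intent) {
            startCalls++;
            armIdleTimer();
        }

        // Stand-in for posting the delayed message that ends in stopSelf().
        void armIdleTimer() {
            idleTimerArmed = true;
        }
    }

    /** Hypothetical stand-in for the activity manager side. */
    static class ToyManager {
        ToyService running;

        // startService(): create the instance if needed, then deliver the request.
        ToyService startService(String intent) {
            if (running == null) {
                running = new ToyService();
                running.onCreate();
            }
            running.onStart(intent);
            return running;
        }

        // The process died: the old instance is gone, and the new one only gets onCreate().
        ToyService restartAfterKill() {
            running = new ToyService();
            running.onCreate();
            return running;
        }
    }

    public static void main(String[] args) {
        ToyManager manager = new ToyManager();
        ToyService first = manager.startService("play");
        System.out.println("explicit start, onStart calls: " + first.startCalls);

        ToyService restarted = manager.restartAfterKill();
        System.out.println("after restart, onStart calls: " + restarted.startCalls);
        System.out.println("after restart, idle timer armed: " + restarted.idleTimerArmed);
    }
}
```

In this model a service that armed its idle timer only in onStart() would come back from a kill with no timer pending, so it would never stop itself.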
--
Dianne Hackborn
Android framework engineer
hac...@android.com

Note: please don't send private questions to me, as I don't have time to provide private support.  All such questions should be posted on public forums, where I and others can see and answer them.

Mike Hearn

Jan 19, 2009, 4:26:20 AM
to android-platform
> Also note that when we restart a service after it has been killed, only
> onCreate() is called, and not onStart().

That must be the reason then. I thought I'd eliminated that
possibility but obviously not.

http://code.google.com/p/android/issues/detail?id=1807

All your points make perfect sense, but it still feels wrong that the
system tries to free memory by killing processes then immediately
starts them back up again. If that memory is now really in use by
something else, either the restarted process should fail to start or
(more likely?) something else should randomly get OOM killed to make
space for the newly started and thus not oom_adjusted process. That
doesn't seem to happen in practice though.

Dianne Hackborn

Feb 10, 2009, 1:13:35 PM
to android-...@googlegroups.com
On Mon, Jan 19, 2009 at 1:26 AM, Mike Hearn <mh.in....@gmail.com> wrote:
> All your points make perfect sense, but it still feels wrong that the
> system tries to free memory by killing processes then immediately
> starts them back up again. If that memory is now really in use by
> something else, either the restarted process should fail to start or
> (more likely?) something else should randomly get OOM killed to make
> space for the newly started and thus not oom_adjusted process. That
> doesn't seem to happen in practice though.

Sorry I missed this and am slow getting back.  Scheduling a quick first start actually tends to be a good thing to do, because the new process that comes up can take significantly less memory than the previous one.  For example, if this is some application with UI that also has a service, the user may have been using its UI and thus have a lot of images and such loaded into it.  After killing the process, only the service is restarted in it, so it may take significantly less space.  Alternatively, there may be some leak in the process that caused it to balloon up.

Also the overall memory state of the system is very dynamic, so there just may be more memory free a little later for some other reason and if that is the case it is nice to keep the service going.

So yeah it's pretty brutal, but in practice the strategy of a quick restart and then aggressively backing off the time to the next restart seems to work pretty well.