I've been researching a couple of problems related to shutdown:
- Chrome frequently starts up indicating that it hadn't shut down cleanly
- We have a frequent crash that only happens if the X server goes away
I've found a few things that I believe are the cause:
- A bug in the session_manager that can cause it to send a SIGABRT to chrome (it's only supposed to do that if chrome hasn't exited within 3 seconds of the SIGTERM it sends at shutdown).
- chrome attempts to handle the SIGTERM by shutting down cleanly, but it resets its handler to the default once it has received one. A subsequent SIGTERM therefore terminates it immediately, even if the clean shutdown is still in progress.
- ui.conf sends up to 10 SIGTERMs to chrome, spaced 0.1 seconds apart, while the session_manager is trying to terminate chrome cleanly. This is because its post-stop script is executed in parallel with the SIGTERM that is sent to the session_manager. I believe this is intended by upstart, but unexpected by us.
- That same bug will cause the X server to be killed after those 10 SIGTERMs. If chrome hasn't gone away by then, it will crash.
- We can also send up to 10 SIGTERMs to chrome from chromeos_shutdown. I'm not sure yet exactly when these happen wrt ui.conf.
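To make the handler-reset and repeated-SIGTERM interaction above concrete, here is a minimal Python sketch. It is not chrome's code: the child script, the "reset"/"keep" modes, and the function name are all made up for illustration, and the 1-second sleep only stands in for clean shutdown work. It assumes a POSIX system.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for chrome: it handles SIGTERM, then takes a moment to
# shut down cleanly. In "reset" mode it restores the default handler after the
# first SIGTERM, which is the behavior described above.
CHILD = r"""
import signal, sys, time
reset = sys.argv[1] == "reset"
seen = []
def on_term(signum, frame):
    seen.append(signum)
    if reset:
        signal.signal(signal.SIGTERM, signal.SIG_DFL)
signal.signal(signal.SIGTERM, on_term)
print("ready", flush=True)
while not seen:
    time.sleep(0.01)
print("shutting down", flush=True)
time.sleep(1.0)  # stand-in for clean shutdown work
sys.exit(0)
"""

def exit_status_after_two_sigterms(mode):
    """Send two SIGTERMs to the stand-in child and return its exit status."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD, mode],
                            stdout=subprocess.PIPE, text=True)
    with proc.stdout:
        # Wait until the child has installed its handler.
        assert proc.stdout.readline().strip() == "ready"
        proc.send_signal(signal.SIGTERM)  # first SIGTERM starts clean shutdown
        # Wait until the child has processed the first SIGTERM.
        assert proc.stdout.readline().strip() == "shutting down"
        proc.send_signal(signal.SIGTERM)  # second SIGTERM arrives mid-shutdown
        return proc.wait(timeout=10)
```

With "reset", the second SIGTERM kills the child in the middle of its shutdown (exit status -SIGTERM). With "keep", the handler stays installed, the extra SIGTERM is absorbed, and the child exits with status 0.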
With everyone gunning for chrome, it's no surprise that it rarely starts cleanly. I have some thoughts about what to do, but I'd like some advice.
- I'll fix the bug in session_manager to only send the SIGABRT if 3 seconds have gone by without chrome terminating.
- I could change the post-stop script in ui.conf to wait until the session_manager is gone or some timeout is reached before doing the rest of its processing. I was surprised upstart didn't provide some general mechanism for doing this, as cleanup of a service should probably come after it dies. Am I misunderstanding how this is supposed to work? Can anyone suggest alternatives?
- The reason for all the SIGTERMs from ui.conf and chromeos_shutdown is that we're trying to kill all processes with files open on cryptohome before unmounting it. As this is happening in parallel with other processes being shut down, it seems overly aggressive. ui.conf does it to handle logout, which happens when chrome exits cleanly or in any way during screenlock. Maybe we can differentiate that from shutdown and leave it to chromeos_shutdown in that case. But even then I think processes are going to be killed while they're in the middle of shutdown. Does anyone know for sure exactly when chromeos_shutdown gets executed wrt the other jobs being stopped? I believe there are several other open bugs related to this.
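For the first two proposals in the list above, the intended behavior could be sketched roughly as follows in Python. This is an illustration only, not the session_manager or ui.conf code: the function names and the polling interval are assumptions, and the real fixes would live in the session_manager source and in the post-stop shell script. It assumes a POSIX system.

```python
import os
import signal
import subprocess
import time

def terminate_with_grace(proc, grace_seconds=3.0):
    """SIGTERM the child; send SIGABRT only if it outlives the grace period."""
    proc.send_signal(signal.SIGTERM)
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        # The child is still running after the grace period: escalate.
        proc.send_signal(signal.SIGABRT)
        return proc.wait()

def wait_until_gone(pid, timeout_seconds, poll_seconds=0.05):
    """Return True once pid no longer exists, False if the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        try:
            os.kill(pid, 0)  # signal 0 only checks that the process exists
        except ProcessLookupError:
            return True
        except PermissionError:
            pass  # the process exists but belongs to another user
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
```

The first function sends SIGABRT only after the grace period has really elapsed. The second is the "wait until it is gone or a timeout is reached" step that the post-stop script would run before doing the rest of its cleanup.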
Dave
--
Chromium OS Developers mailing list: chromiu...@chromium.org
I've tried that, and it works (in that chrome doesn't terminate early via the subsequent SIGTERMs). But I need to research exactly why it was set up this way in the first place. Also, it feels wrong to have up to 21 SIGTERMs being fired at chrome. And it's possible we're killing other things that we shouldn't be, in a random order.
Dave