--Thanks,Colin
--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev
---
You received this message because you are subscribed to the Google Groups "Chromium-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chromium-dev+unsubscribe@chromium.org.
To unsubscribe from this group and stop receiving emails from it, send an email to chromium-dev...@chromium.org.
Ah, right. I think this discussion is a generalization of that one though, right? From scanning that thread, I interpret the conclusions there as:- Make singletons leaky be default- It's a bug if unclean shutdown results in inconsistent persistent state or inconsistent state on restartDoes that match your understanding?
To unsubscribe from this group and stop receiving emails from it, send an email to chromium-dev+unsubscribe@chromium.org.
Also, relying on clean shutdown to save user data is fundamentally vulnerable to unexpected events like Chrome crash, OS crash, and device power loss.
On Fri, Sep 2, 2016 at 1:09 AM, Ryo Hashimoto <hash...@chromium.org> wrote:Also, relying on clean shutdown to save user data is fundamentally vulnerable to unexpected events like Chrome crash, OS crash, and device power loss.Not inherently true.For example, parts of the omnibox code need to track stats condensed from the user's (much larger) complete history. If those stats are out of date, they can be recomputed on startup.So the omnibox tries to write out updated stats periodically, especially at shutdown. If it fails (e.g. crash before then), the cost is not correctness, it's startup perf on the next run. Startup perf is extremely valuable and we don't want to regress that. But it's also not clear we want to be writing to disk constantly to ensure our stats are updated on every history event.
I am in general very skeptical of the idea that shutdown can or should be entirely eliminated. The useful thing, which we'd need to do anyway before doing that, would be to understand what currently happens at shutdown and try to eliminate as much of it as possible. Then a decision about what is left can be made in an informed fashion.So perhaps it's just phrasing, but I would have phrased things less provocatively about "how can we find and eliminate unnecessary work during shutdown?"
PK
gab also pointed out on Windows we go through the unclean shutdown
path when the user logs out/powers-down. In this case we can't go
through the clean shutdown path because of the state windows is in
(our clean shutdown path on Windows relies on getting certain messages
that are not delivered in this state).
To me the argument comes down to we are forced to support the unclean
shutdown path, so why bother also maintaining the clean shutdown path?
-Scott
I don't believe my comment about iOS going through clean shutdown was an argument against the suggestion. It was just a statement that we shouldn't assume that no mobile platform handles clean shutdown (and to please test that shutdown CL on iOS!).
PK
I've worked on systems where shutdown semantics were not guaranteed, and trying to write tests in that environment was a nightmare. Even if you could get your tests to stop crashing long enough to check them in, they would be flaky or break the next time somebody changed the fragile contract you had managed to make with the subsystems underneath you.
On Fri, Sep 2, 2016 at 6:25 PM, Peter Kasting <pkas...@chromium.org> wrote:On Fri, Sep 2, 2016 at 1:09 AM, Ryo Hashimoto <hash...@chromium.org> wrote:Also, relying on clean shutdown to save user data is fundamentally vulnerable to unexpected events like Chrome crash, OS crash, and device power loss.Not inherently true.For example, parts of the omnibox code need to track stats condensed from the user's (much larger) complete history. If those stats are out of date, they can be recomputed on startup.So the omnibox tries to write out updated stats periodically, especially at shutdown. If it fails (e.g. crash before then), the cost is not correctness, it's startup perf on the next run. Startup perf is extremely valuable and we don't want to regress that. But it's also not clear we want to be writing to disk constantly to ensure our stats are updated on every history event.Isn't the said situation already happening on Android? Clean shutdown never happens on that platform as gab@ said.If it's causing a serious startup perf issue, IMO what should be fixed is the startup code.
I am in general very skeptical of the idea that shutdown can or should be entirely eliminated. The useful thing, which we'd need to do anyway before doing that, would be to understand what currently happens at shutdown and try to eliminate as much of it as possible. Then a decision about what is left can be made in an informed fashion.So perhaps it's just phrasing, but I would have phrased things less provocatively about "how can we find and eliminate unnecessary work during shutdown?"If we are to choose what to save during shutdown, reconstructable data (incl. the said omnibox stats) would be excluded as losing it is not fatal.
On Fri, Sep 2, 2016 at 4:00 AM, Ryo Hashimoto <hash...@chromium.org> wrote:On Fri, Sep 2, 2016 at 6:25 PM, Peter Kasting <pkas...@chromium.org> wrote:On Fri, Sep 2, 2016 at 1:09 AM, Ryo Hashimoto <hash...@chromium.org> wrote:Also, relying on clean shutdown to save user data is fundamentally vulnerable to unexpected events like Chrome crash, OS crash, and device power loss.Not inherently true.For example, parts of the omnibox code need to track stats condensed from the user's (much larger) complete history. If those stats are out of date, they can be recomputed on startup.So the omnibox tries to write out updated stats periodically, especially at shutdown. If it fails (e.g. crash before then), the cost is not correctness, it's startup perf on the next run. Startup perf is extremely valuable and we don't want to regress that. But it's also not clear we want to be writing to disk constantly to ensure our stats are updated on every history event.Isn't the said situation already happening on Android? Clean shutdown never happens on that platform as gab@ said.If it's causing a serious startup perf issue, IMO what should be fixed is the startup code.How do you proposing "fixing" the startup code to avoid this perf regression? The point is that you need to read in a large amount of data and then process it to create the smaller amount of data. There's no way to do that on startup and still be fast. So you need to have the smaller data around before startup.
It is probably broken on Android. I doubt anyone working on this has tried to optimize startup perf on Android.But that doesn't mean the answer is to also break desktop.I am in general very skeptical of the idea that shutdown can or should be entirely eliminated. The useful thing, which we'd need to do anyway before doing that, would be to understand what currently happens at shutdown and try to eliminate as much of it as possible. Then a decision about what is left can be made in an informed fashion.So perhaps it's just phrasing, but I would have phrased things less provocatively about "how can we find and eliminate unnecessary work during shutdown?"If we are to choose what to save during shutdown, reconstructable data (incl. the said omnibox stats) would be excluded as losing it is not fatal.You're missing my point. The whole reason I wrote the email was that it's incorrect to claim there are only two buckets, "reconstructable" and "non-reconstructable", for which the correct descriptions are always "no consequence to not handling" and "dataloss if not handled". There are shutdown actions which fall in between: consequences, but less severe than dataloss.And let's not get hung up on arguing about this specific case. I'm writing an example to support a larger thesis: it is better to attack the problem by trying to eliminate things from the clean shutdown path, and then later decide the question of whether that path can disappear entirely, than it is to assume from the outset that the path can disappear entirely, because such assumptions lead to things like the mistaken dichotomy above.PK
--
On Fri, Sep 2, 2016 at 3:31 PM, Peter Kasting <pkas...@chromium.org> wrote:On Fri, Sep 2, 2016 at 4:00 AM, Ryo Hashimoto <hash...@chromium.org> wrote:On Fri, Sep 2, 2016 at 6:25 PM, Peter Kasting <pkas...@chromium.org> wrote:On Fri, Sep 2, 2016 at 1:09 AM, Ryo Hashimoto <hash...@chromium.org> wrote:Also, relying on clean shutdown to save user data is fundamentally vulnerable to unexpected events like Chrome crash, OS crash, and device power loss.Not inherently true.For example, parts of the omnibox code need to track stats condensed from the user's (much larger) complete history. If those stats are out of date, they can be recomputed on startup.So the omnibox tries to write out updated stats periodically, especially at shutdown. If it fails (e.g. crash before then), the cost is not correctness, it's startup perf on the next run. Startup perf is extremely valuable and we don't want to regress that. But it's also not clear we want to be writing to disk constantly to ensure our stats are updated on every history event.Isn't the said situation already happening on Android? Clean shutdown never happens on that platform as gab@ said.If it's causing a serious startup perf issue, IMO what should be fixed is the startup code.How do you proposing "fixing" the startup code to avoid this perf regression? The point is that you need to read in a large amount of data and then process it to create the smaller amount of data. There's no way to do that on startup and still be fast. So you need to have the smaller data around before startup.How necessary is it to do this on startup, blocking the paint of the first page? I would imagine this could be deferred to shortly after startup, to avoid regressing actual start up "time to first paint".
I think the contention is around the teardown phase (when services are told about imminent shutdown and given one last chance to flush). As a first step, we should probably keep that phase (and aim for it to be a no-op for most services -- I agree with Peter's use case of doing some work on shutdown instead of next startup for performance reasons). But we can probably get rid of the shutdown phase (when things are deleted for realz and all). As for tests, browser tests use a new process per test as highlighted by Scott so not an issue; some unit_tests might need existing Shutdown() calls to become ShutdownForTesting() but that's about it (FWIW, the upcoming base/task_scheduler uses this paradigm -- never deleted in production, only ensures that TaskShutdownBehavior semantics are respected and then lets everything loose).As for Vincent's mention of LSAN tests, I remember being told that LSAN ignores shutdown leaks? But then I guess I'm not sure how it qualifies a "leak"? Perhaps it ignores everything hanging off of known top-level leaked objects?
PK
Since the default starting position of this email thread was effectively "Let's kill clean shutdown", let me summarize the arguments I've seen on the thread for maintaining clean shutdown:1. When the user explicitly closes Chrome (as they can on Windows), they expect no data loss; data consistency with data loss is not sufficient to meet their expectations.2. Without clean shutdown, writing tests can be cumbersome or even infeasible, or alternatively, we might end up effectively having to maintain a clean shutdown path just for tests.3. Some form of clean shutdown is required for closing a Profile in multiprofile.4. Some actions taken on clean shutdown (e.g., writing out up-to-date omnibox stats) otherwise have to be taken on the next startup, regressing startup performance.Of these arguments, (1) strikes me as a reason that we do indeed need to maintain clean shutdown in production: effectively, there is an expectation by users of the browser on Windows that simply doesn't exist on the other platforms in question, and we need to satisfy that expectation. I would be curious to hear the opinions of others who have been supportive of eliminating clean shutdown on this thread.
Unless there's clear data about how much it affects the performance, I think Chrome shouldn't spend time for the said data during shutdown. Users can do many things with Chrome and Omnibox even without autocomplete.
On Android, FlushPersistentData may get called before Chrome is killed by the OS, but it just saves the local state, preferences, and a number of selected data.
On Windows, SessionEnding gets called in an OS shutdown and it just saves the local state and preferences.
Apparently the said perf regression and the said data loss have been already there for years.
For completeness, my thoughts on the others:
(2) does not seem like a blocker to me: we already have lots of code that's only for tests, and moving complexity in production to complexity in testing seems like an overall win to me.
(3) It seems like the requirements here could possibly be fulfilled without having the entire clean browser shutdown path that we have today.
(4) seems dangerous to me given that clean shutdown is not available on ChromeOS, Android, or iOS. Given that reality, I would regard solutions that rely on clean shutdown to avoid regressing startup performance as an anti-pattern.
"Clean shutdown exists solely to prevent data loss when the user explicitly closes the app (on platforms where such prevention is technically possible). In particular, if using clean shutdown to optimize the performance of subsequent startup, consider strongly whether there is another way of achieving your goal that would avoid regressing startup on Android, ChromeOS, and iOS (platforms where clean shutdown is not possible)."
"Clean shutdown exists solely to prevent data loss when the user explicitly closes the app (on platforms where such prevention is technically possible). In particular, if using clean shutdown to optimize the performance of subsequent startup, consider strongly whether there is another way of achieving your goal that would avoid regressing startup on Android, ChromeOS, and iOS (platforms where clean shutdown is not possible)."To me this is a statement of hope, not a statement of fact. If you preface the whole thing with "I want to move towards a world where", then great. But I think qualifiers like that are important because otherwise we're inclined to discard the realities of today with a hand-wavey "well, that shouldn't be that way", as if that just makes things not that way, and we need not spend time and effort worrying about, understanding, and fixing said things.
PK
On Mon, Sep 5, 2016 at 2:56 AM, Ryo Hashimoto <hash...@chromium.org> wrote:Unless there's clear data about how much it affects the performance, I think Chrome shouldn't spend time for the said data during shutdown. Users can do many things with Chrome and Omnibox even without autocomplete.I'm going to ask you to trust me that this matters and we can't regress it. So please stop assuming we can safely regress it.On Android, FlushPersistentData may get called before Chrome is killed by the OS, but it just saves the local state, preferences, and a number of selected data.The Android omnibox is a different implementation than Windows and I have literally never tested to see if it even works correctly, let alone tried to optimize its performance, so do not assume that things are or should be fine on Android. However, I would think FlushPersistentData should be sufficient to write this stuff.On Windows, SessionEnding gets called in an OS shutdown and it just saves the local state and preferences.I believe most Chrome sessions on Windows do not end with an OS shutdown (or have an OS shutdown while Chrome is still shutting down). Do you have data showing otherwise? You seem to assume that either this is common, or (in your other email, if it ever happens we "can't rely on" clean shutdown, which is missing the point.
Apparently the said perf regression and the said data loss have been already there for years.There is no data loss.And this whole email ignores the point I've been trying to make, and clarify, repeatedly: I am NOT trying to debate a specific thing happening during shutdown. I am simply trying to suggest that walking into the situation without even understanding what other parts of the browser are doing and saying "clearly we can kill clean shutdown!" is cavalier and unnecessarily aggressive. Want to improve shutdown perf? Fine! Great goal. Let's do it in a controlled, planned manner that understands and avoids regressions along the way.
So can we please stop talking about the omnibox, and stop throwing around the simplistic idea that we can just discard shutdown entirely in the near future? Everybody thinks doing as little as possible during shutdown is is a great idea, so let's move on to finding some people who want to actually work on removing unnecessary shutdown work in said controlled and planned manner.
PK
On Tue, Sep 6, 2016 at 12:03 AM, Peter Kasting <pkas...@chromium.org> wrote:On Mon, Sep 5, 2016 at 2:56 AM, Ryo Hashimoto <hash...@chromium.org> wrote:Unless there's clear data about how much it affects the performance, I think Chrome shouldn't spend time for the said data during shutdown. Users can do many things with Chrome and Omnibox even without autocomplete.I'm going to ask you to trust me that this matters and we can't regress it. So please stop assuming we can safely regress it.On Android, FlushPersistentData may get called before Chrome is killed by the OS, but it just saves the local state, preferences, and a number of selected data.The Android omnibox is a different implementation than Windows and I have literally never tested to see if it even works correctly, let alone tried to optimize its performance, so do not assume that things are or should be fine on Android. However, I would think FlushPersistentData should be sufficient to write this stuff.On Windows, SessionEnding gets called in an OS shutdown and it just saves the local state and preferences.I believe most Chrome sessions on Windows do not end with an OS shutdown (or have an OS shutdown while Chrome is still shutting down). Do you have data showing otherwise? You seem to assume that either this is common, or (in your other email, if it ever happens we "can't rely on" clean shutdown, which is missing the point.Why do you think it's not common? Do you think most WIndows users close all Chrome windows and wait for a number of seconds before turning off their computers?
Personally, when I use my Windows machine, I launch Chrome just after logging in and keep the window open until OS shutdown.I've never bothered to close Chrome windows before turning off my machine, so this means, for me, almost all Chrome shutdown is unclean.If this legitimate use case results in serious performance regression or anything unacceptable, it's not the user's fault, but Chrome's.IIUC this means we shouldn't rely on clean shutdown even on Windows.Do you think it's OK to make the UX painful when the user forgets to close all Chrome windows and wait for Chrome to complete shutdown before turning off their computer?Apparently the said perf regression and the said data loss have been already there for years.There is no data loss.And this whole email ignores the point I've been trying to make, and clarify, repeatedly: I am NOT trying to debate a specific thing happening during shutdown. I am simply trying to suggest that walking into the situation without even understanding what other parts of the browser are doing and saying "clearly we can kill clean shutdown!" is cavalier and unnecessarily aggressive. Want to improve shutdown perf? Fine! Great goal. Let's do it in a controlled, planned manner that understands and avoids regressions along the way.I don't care much about omnibox neither.What I want to clarify is that we're already performing unclean shutdown on Windows, and we shouldn't make it result in bad UX.So can we please stop talking about the omnibox, and stop throwing around the simplistic idea that we can just discard shutdown entirely in the near future? Everybody thinks doing as little as possible during shutdown is is a great idea, so let's move on to finding some people who want to actually work on removing unnecessary shutdown work in said controlled and planned manner.Is "doing as little as possible during shutdown" going to work?For example, crbug.com/470501 was a bug I fixed where Perferences and Local State had been written to the disk multiple times during shutdown.Writing Preferences and Local State is very costly because it involves fsync() which can take forever depending on how busy the disk is, so ideally it should be done only once during shutdown.However, developers tend to add redundant code to make sure the data is saved because it's much easier to add new code than investigating the existing code.Also, almost all storage classes start saving their buffered data in the destructors, and it's easy to write new code which does the same.If we are to perform clean shutdown, every time a new class which performs IO in its dtor is added, the shutdown performance inevitably gets regressed.PK
A couple clarifications (replying here instead of doc to make sure everyone sees) :A) I'm not sure how profile shutdown came to the discussion but AFAIK we never shutdown profiles (we would like to but naked Profile* are all over the place and if you close all windows of a profile, it's not actually teared down until browser shutdown).
B) The term "unclean shutdown" typically refers to Chrome failing to write the "clean shutdown" bit in Preferences (e.g., crash or hanged process killed). This is when you see the "Chrome didn't shutdown properly : Restore" bubble.Here I think we should refer to "ungraceful shutdown" where by "graceful" I mean: unregistering things, releasing handles, tearing down threads, etc.For example: observers of WillDestroyCurrentMessageLoop() doing things like removing file watcher registrations just so it doesn't post task to a null MessageLoop is clearly an instance of cleaning the carpets before burning down the building.
I do think that Peter's use case is valid. Saving state on shutdown *when possible* to save time on the next startup should always be done. This is why I propose a "teardown phase" where notifications are propagated to everything to pronounce their last wish before the process is terminated.
It sounds like we may already have that in FlushPersistentData? Should we make this cross-platform, invoke it, and terminate process after?
Hi gab@,Apologies in advance if I misunderstood something :). Questions inline.On Wed, Sep 7, 2016 at 11:17 PM Gabriel Charette <g...@chromium.org> wrote:A couple clarifications (replying here instead of doc to make sure everyone sees) :A) I'm not sure how profile shutdown came to the discussion but AFAIK we never shutdown profiles (we would like to but naked Profile* are all over the place and if you close all windows of a profile, it's not actually teared down until browser shutdown).+Mattias Nissler raised this point: "We need to be able shut down a profile cleanly when the user closes the last window, so that part of the clean shutdown path must be kept functional (at least as long as we support multi-profile, IIRC it was a bit of work to get it functional when multi-profile was first introduced)." Mattias, any comments here?
On Thu, Sep 8, 2016 at 2:08 PM, Colin Blundell <blun...@chromium.org> wrote:Hi gab@,Apologies in advance if I misunderstood something :). Questions inline.On Wed, Sep 7, 2016 at 11:17 PM Gabriel Charette <g...@chromium.org> wrote:A couple clarifications (replying here instead of doc to make sure everyone sees) :A) I'm not sure how profile shutdown came to the discussion but AFAIK we never shutdown profiles (we would like to but naked Profile* are all over the place and if you close all windows of a profile, it's not actually teared down until browser shutdown).+Mattias Nissler raised this point: "We need to be able shut down a profile cleanly when the user closes the last window, so that part of the clean shutdown path must be kept functional (at least as long as we support multi-profile, IIRC it was a bit of work to get it functional when multi-profile was first introduced)." Mattias, any comments here?I remember that there was quite a bit of cleanup work on Profile stuff (also related to shutdown) when multi-profile was first introduced, hence my reply. That was a long while ago though, and I'm not at all familiar with the current state of things. I had assumed that the cleanup work had proceeded to the point where we can get rid of Profile instances at run time (which IIRC was the goal back then), but Gab is totally right in that this is not true (I've just checked ProfileManager, it never removes a Profile once the Profile is fully initialized). That still leaves the question whether the current status quo is the state we should be in or whether our goal is to eventually be able to tear down Profiles correctly after the user closes them.
B) The term "unclean shutdown" typically refers to Chrome failing to write the "clean shutdown" bit in Preferences (e.g., crash or hanged process killed). This is when you see the "Chrome didn't shutdown properly : Restore" bubble.Here I think we should refer to "ungraceful shutdown" where by "graceful" I mean: unregistering things, releasing handles, tearing down threads, etc.For example: observers of WillDestroyCurrentMessageLoop() doing things like removing file watcher registrations just so it doesn't post task to a null MessageLoop is clearly an instance of cleaning the carpets before burning down the building.It's unclear to me what you're referring to when you write "here". Are you suggesting that we should move Chrome's clean shutdown process to be ungraceful in general, or are you saying that Android already goes through ungraceful shutdown, or are you referring to something else?
I do think that Peter's use case is valid. Saving state on shutdown *when possible* to save time on the next startup should always be done. This is why I propose a "teardown phase" where notifications are propagated to everything to pronounce their last wish before the process is terminated.It sounds like we may already have that in FlushPersistentData? Should we make this cross-platform, invoke it, and terminate process after?(Related to the above questions) Are you suggesting that we should have a unified cross-platform clean "ungraceful shutdown" process that basically just issues fire-and-forget requests to save whatever state it can? The complication I see there is Peter and Antoine's point that they don't want to either limit performance optimizations or admit data loss on Windows just because of constraints on (e.g.) Android. It's unclear to me whether it's possible to maintain those properties without waiting for acks on various subsystems' shutdown processes. WDYT?
Another clarification on something I previously said:C) I incorrectly swapped my own definition of shutdown vs teardown above... The "shutdown phase" is the synchronous one-pass pronounce-your-last-wish (but don't cleanup anything, beyond perhaps invalidating weak ptrs) phase. The "teardown phase" is the one where things are gracefully unwound (and is the one I think we can get rid of on all platforms).On Thu, Sep 8, 2016 at 8:26 AM Mattias Nissler <mnis...@chromium.org> wrote:On Thu, Sep 8, 2016 at 2:08 PM, Colin Blundell <blun...@chromium.org> wrote:Hi gab@,Apologies in advance if I misunderstood something :). Questions inline.On Wed, Sep 7, 2016 at 11:17 PM Gabriel Charette <g...@chromium.org> wrote:A couple clarifications (replying here instead of doc to make sure everyone sees) :A) I'm not sure how profile shutdown came to the discussion but AFAIK we never shutdown profiles (we would like to but naked Profile* are all over the place and if you close all windows of a profile, it's not actually teared down until browser shutdown).+Mattias Nissler raised this point: "We need to be able shut down a profile cleanly when the user closes the last window, so that part of the clean shutdown path must be kept functional (at least as long as we support multi-profile, IIRC it was a bit of work to get it functional when multi-profile was first introduced)." Mattias, any comments here?I remember that there was quite a bit of cleanup work on Profile stuff (also related to shutdown) when multi-profile was first introduced, hence my reply. That was a long while ago though, and I'm not at all familiar with the current state of things. I had assumed that the cleanup work had proceeded to the point where we can get rid of Profile instances at run time (which IIRC was the goal back then), but Gab is totally right in that this is not true (I've just checked ProfileManager, it never removes a Profile once the Profile is fully initialized). That still leaves the question whether the current status quo is the state we should be in or whether our goal is to eventually be able to tear down Profiles correctly after the user closes them.I think the team tried it initially but it was crashing too much, so they shipped without support for tearing down Profiles. I agree that it would be preferable if it was possible to tear down an unused Profile at run-time, but I think there were a couple of attempts at this and no one ever got it working. I thus wouldn't want to block better shutdown (which is a way more impactful problem) on this.
Another clarification on something I previously said:C) I incorrectly swapped my own definition of shutdown vs teardown above... The "shutdown phase" is the synchronous one-pass pronounce-your-last-wish (but don't cleanup anything, beyond perhaps invalidating weak ptrs) phase. The "teardown phase" is the one where things are gracefully unwound (and is the one I think we can get rid of on all platforms).
On Thu, Sep 8, 2016 at 8:26 AM Mattias Nissler <mnis...@chromium.org> wrote:On Thu, Sep 8, 2016 at 2:08 PM, Colin Blundell <blun...@chromium.org> wrote:Hi gab@,Apologies in advance if I misunderstood something :). Questions inline.On Wed, Sep 7, 2016 at 11:17 PM Gabriel Charette <g...@chromium.org> wrote:A couple clarifications (replying here instead of doc to make sure everyone sees) :A) I'm not sure how profile shutdown came to the discussion but AFAIK we never shutdown profiles (we would like to but naked Profile* are all over the place and if you close all windows of a profile, it's not actually teared down until browser shutdown).+Mattias Nissler raised this point: "We need to be able shut down a profile cleanly when the user closes the last window, so that part of the clean shutdown path must be kept functional (at least as long as we support multi-profile, IIRC it was a bit of work to get it functional when multi-profile was first introduced)." Mattias, any comments here?I remember that there was quite a bit of cleanup work on Profile stuff (also related to shutdown) when multi-profile was first introduced, hence my reply. That was a long while ago though, and I'm not at all familiar with the current state of things. I had assumed that the cleanup work had proceeded to the point where we can get rid of Profile instances at run time (which IIRC was the goal back then), but Gab is totally right in that this is not true (I've just checked ProfileManager, it never removes a Profile once the Profile is fully initialized). That still leaves the question whether the current status quo is the state we should be in or whether our goal is to eventually be able to tear down Profiles correctly after the user closes them.I think the team tried it initially but it was crashing too much, so they shipped without support for tearing down Profiles. I agree that it would be preferable if it was possible to tear down an unused Profile at run-time, but I think there were a couple of attempts at this and no one ever got it working. I thus wouldn't want to block better shutdown (which is a way more impactful problem) on this.
B) The term "unclean shutdown" typically refers to Chrome failing to write the "clean shutdown" bit in Preferences (e.g., crash or hanged process killed). This is when you see the "Chrome didn't shutdown properly : Restore" bubble.Here I think we should refer to "ungraceful shutdown" where by "graceful" I mean: unregistering things, releasing handles, tearing down threads, etc.For example: observers of WillDestroyCurrentMessageLoop() doing things like removing file watcher registrations just so it doesn't post task to a null MessageLoop is clearly an instance of cleaning the carpets before burning down the building.It's unclear to me what you're referring to when you write "here". Are you suggesting that we should move Chrome's clean shutdown process to be ungraceful in general, or are you saying that Android already goes through ungraceful shutdown, or are you referring to something else?By "here" I mean : in this conversation (i.e., it should be titled "Should we stop supporting graceful shutdown in Chrome" instead of "clean shutdown").I do think that Peter's use case is valid. Saving state on shutdown *when possible* to save time on the next startup should always be done. This is why I propose a "teardown phase" where notifications are propagated to everything to pronounce their last wish before the process is terminated.It sounds like we may already have that in FlushPersistentData? Should we make this cross-platform, invoke it, and terminate process after?(Related to the above questions) Are you suggesting that we should have a unified cross-platform clean "ungraceful shutdown" process that basically just issues fire-and-forget requests to save whatever state it can? The complication I see there is Peter and Antoine's point that they don't want to either limit performance optimizations or admit data loss on Windows just because of constraints on (e.g.) Android. It's unclear to me whether it's possible to maintain those properties without waiting for acks on various subsystems' shutdown processes. WDYT?Yes, except that I don't think it should be fire-and-forget. On platforms that can afford it, it should be a synchronous call (just like the synchronous call to shutdown KeyedServices). That addresses concerns raised IMO.
Tried to process this whole thread. Sorry if I missed things.Re: unload handlers.These are deeply entrenched in the desktop web and somewhat entrenched on mobile. AFAIK, all desktop browsers are pretty consistent in their support for unload events. Unload handlers do fire on mobile, they're just even less reliable than on desktop.Removing these is going to be take a long time. In particular, we don't yet have a reasonable replacement. We're working on it. But the bug fix needed there is complicated and relatively low priority. So it's making slow progress. The advantage of the replacement (visibilitychange) is that it only fires when you're navigating away from the foreground tab or when the foreground tab is backgrounded. So, we'd only have the same performance problem if the foreground tab is somehow paged out of memory when you try to close it, which seems exceedingly rare to me.If anyone has interest in helping this project proceed faster, ping me offline and hopefully we can make it happen.
- As best I can tell, the only argument people buy for the need for optimistic teardown is Peter's use case. That use case does not seem sufficiently valuable to me to justify this. We should find a way to write that information to disk during idle time or something like that.
Also, I don't think "please trust me that this large architectural burden is necessary for performance" is acceptable to stall progress without pointing to an UMA or other quantifiable justification.
The next steps should be: start removing code from the shutdown codepath with the ultimate goal of removing that codepath entirely. Each incremental step you take there makes the product a little bit better, so it has value independent of whether we succeed in the end with killing shutdown. It also allows examining each case independently, e.g. we can fix the omnibox code to not use shutdown and thus prove/disprove the performance concern.
On Fri, Sep 9, 2016 at 12:37 PM, Ojan Vafai <oj...@chromium.org> wrote:Tried to process this whole thread. Sorry if I missed things.Re: unload handlers.These are deeply entrenched in the desktop web and somewhat entrenched on mobile. AFAIK, all desktop browsers are pretty consistent in their support for unload events. Unload handlers do fire on mobile, they're just even less reliable than on desktop.Removing these is going to be take a long time. In particular, we don't yet have a reasonable replacement. We're working on it. But the bug fix needed there is complicated and relatively low priority. So it's making slow progress. The advantage of the replacement (visibilitychange) is that it only fires when you're navigating away from the foreground tab or when the foreground tab is backgrounded. So, we'd only have the same performance problem if the foreground tab is somehow paged out of memory when you try to close it, which seems exceedingly rare to me.If anyone has interest in helping this project proceed faster, ping me offline and hopefully we can make it happen.Perhaps fixing this is higher-priority than trying to address shutdown, since this hits tab closure and not just all-browser-closure? And tons of people have been complaining about unload handlers resulting in not-fast-killable tabs for years, whereas slow shutdown, while a real problem, has not risen to nearly the same level of noise.
- As best I can tell, the only argument people buy for the need for optimistic teardown is Peter's use case. That use case does not seem sufficiently valuable to me to justify this. We should find a way to write that information to disk during idle time or something like that.
We're doing that already.Also, I don't think "please trust me that this large architectural burden is necessary for performance" is acceptable to stall progress without pointing to an UMA or other quantifiable justification.That's not what I was saying. I don't want a large architectural burden imposed and I'm not interested in stalling progress. I'm interested in not ratholing about a specific example case (instead of discussing the underlying point) and not making statements like "clearly all shutdown is stupid and we can just kill it all" when I don't believe anyone on this thread knows everything shutdown does. The IMO correct next step of "let's start ripping out what we can one thing at a time" avoid both those problems.
On Fri, Sep 9, 2016 at 1:57 PM Peter Kasting <pkas...@chromium.org> wrote:On Fri, Sep 9, 2016 at 12:37 PM, Ojan Vafai <oj...@chromium.org> wrote:Tried to process this whole thread. Sorry if I missed things.Re: unload handlers.These are deeply entrenched in the desktop web and somewhat entrenched on mobile. AFAIK, all desktop browsers are pretty consistent in their support for unload events. Unload handlers do fire on mobile, they're just even less reliable than on desktop.Removing these is going to be take a long time. In particular, we don't yet have a reasonable replacement. We're working on it. But the bug fix needed there is complicated and relatively low priority. So it's making slow progress. The advantage of the replacement (visibilitychange) is that it only fires when you're navigating away from the foreground tab or when the foreground tab is backgrounded. So, we'd only have the same performance problem if the foreground tab is somehow paged out of memory when you try to close it, which seems exceedingly rare to me.If anyone has interest in helping this project proceed faster, ping me offline and hopefully we can make it happen.Perhaps fixing this is higher-priority than trying to address shutdown, since this hits tab closure and not just all-browser-closure? And tons of people have been complaining about unload handlers resulting in not-fast-killable tabs for years, whereas slow shutdown, while a real problem, has not risen to nearly the same level of noise.Even once we fix it, it's going to be a slow, long process to get sites to switch over to it. It won't have the immediate benefit of getting rid of shutdown code. In either case, it's not an either or problem. It's different people that would take on each of these things.
- As best I can tell, the only argument people buy for the need for optimistic teardown is Peter's use case. That use case does not seem sufficiently valuable to me to justify this. We should find a way to write that information to disk during idle time or something like that.
We're doing that already.Also, I don't think "please trust me that this large architectural burden is necessary for performance" is acceptable to stall progress without pointing to an UMA or other quantifiable justification.That's not what I was saying. I don't want a large architectural burden imposed and I'm not interested in stalling progress. I'm interested in not ratholing about a specific example case (instead of discussing the underlying point) and not making statements like "clearly all shutdown is stupid and we can just kill it all" when I don't believe anyone on this thread knows everything shutdown does. The IMO correct next step of "let's start ripping out what we can one thing at a time" avoid both those problems.I see. That's not what I took away from this thread. It's also not what I took away from the conclusion in Collin's doc. "Audit" does not imply action. We could audit, decide we need some of the shutdown code and not actually make any changes.But that's fine. Communicating over email is hard. It sounds like we agree.Ripping out what we can one thing at a time makes sense to me. I didn't imagine a world in which we blindly just deleted all the code. :) Removing individual things from the shutdown codepath is valuable even if we can't remove everything.I think I'm just optimistic that whatever is left at the end won't provide enough value to justify keeping the shutdown codepath around and that we'll be able to find other solutions that are good enough.