--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev
I'm not sure if there is consensus that we need a proper architecture for this.
Some time ago I proposed a proof-of-concept framework that, amongst other things, was intended to facilitate the implementation of "correct" startup behaviour. But the message I received was that it would be better to just solve problems as they occur, and to design individual components/features correctly, than to introduce a new framework, with all the cost that that implies.
The message was not exactly that. I assume you are talking about the eng review. I get that you are compressing it to make your point more pointy, but let's get the full context:
The issue is that our start-up is very complicated, beyond what any developer in the team can hope to fit in their brain, which Wikipedia tells me has been in its current form since 200,000 years ago. A sophisticated dependency resolver/injection graph-based framework was proposed that would allow Chrome to transparently resolve the right order of component creation and initialization so developers would not have to care about this. There are costs associated, which I am not going to go into in detail here, but the eng review take was that we would like a) to work towards a simpler start-up sequence that a qualified and properly trained homo sapiens can understand and b) a simpler form of the framework that is easier to diagnose/debug/grok etc.
OK, the canonical example from the policy code (and pardon me if this is polluting this thread - I'm happy to split this into a separate thread if you like):
DeviceManagementService is owned by BrowserPolicyConnector and is created during browser initialization (BrowserPolicyConnector is owned by the BrowserProcess and is effectively a singleton).
This class maintains a queue of pending DeviceManagement requests (typically, requests for CloudPolicy). We want to refresh our policy on startup, but we want to delay this long enough to avoid conflicting with initial startup.
DeviceManagementService does not expose a public Initialize() API - instead, it exposes a ScheduleInitialization(int msecs) API which is used to schedule a delayed initialization via a task. Typically, we schedule initialization after 5 or 10 seconds.
However, there are two cases where we want to speed up initialization:
1) When doing enterprise enrollment on startup, we want to trigger policy loading immediately so the user doesn't have to wait for the normal initialization delay to complete, so the enrollment code calls ScheduleInitialization(0) to force an immediate enrollment.
2) The ChromeOS login screen wants to force a refresh so any new policies can take effect on the login screen, but doesn't want to conflict with the rendering of this initial screen, so it triggers a refresh after 100 milliseconds.
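A minimal sketch of the scheduling pattern described above, written in Python rather than Chromium's C++ so it can be run standalone. It is not the actual DeviceManagementService code: FakeTaskRunner, DelayedService and all method names are hypothetical, and the sketch assumes that each scheduling call simply posts its own delayed task and that the earliest task to fire performs the initialization.

```python
import heapq


class FakeTaskRunner:
    """Deterministic stand-in for a message loop that supports delayed tasks."""

    def __init__(self):
        self.now_ms = 0
        self._tasks = []  # heap of (run_at_ms, sequence, callback)
        self._sequence = 0

    def post_delayed_task(self, callback, delay_ms):
        heapq.heappush(self._tasks, (self.now_ms + delay_ms, self._sequence, callback))
        self._sequence += 1

    def advance(self, ms):
        """Move the fake clock forward, running every task that becomes due."""
        deadline = self.now_ms + ms
        while self._tasks and self._tasks[0][0] <= deadline:
            run_at_ms, _, callback = heapq.heappop(self._tasks)
            self.now_ms = run_at_ms
            callback()
        self.now_ms = deadline


class DelayedService:
    """Hypothetical service with no public initialize(), only a scheduled one."""

    def __init__(self, task_runner):
        self._task_runner = task_runner
        self._pending_requests = []
        self.started_requests = []
        self.initialized_at_ms = None

    def queue_request(self, name):
        # Requests wait in a queue until the service has been initialized.
        if self.initialized_at_ms is None:
            self._pending_requests.append(name)
        else:
            self.started_requests.append(name)

    def schedule_initialization(self, delay_ms):
        # Every caller posts its own task; the earliest one to fire wins.
        if self.initialized_at_ms is None:
            self._task_runner.post_delayed_task(self._initialize, delay_ms)

    def _initialize(self):
        if self.initialized_at_ms is not None:
            return  # a shorter delay already triggered initialization
        self.initialized_at_ms = self._task_runner.now_ms
        self.started_requests.extend(self._pending_requests)
        self._pending_requests.clear()
```

With this shape, a default call with a 5000 ms delay and a later login-screen call with a 100 ms delay can coexist: the 100 ms task initializes the service and the 5000 ms task becomes a no-op.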
On Thu, Nov 21, 2013 at 7:34 AM, Erik Wright <erikw...@chromium.org> wrote:
On Thu, Nov 21, 2013 at 10:23 AM, Drew Wilson <atwi...@chromium.org> wrote:
OK, the canonical example from the policy code [...]
Is it network, disk, or CPU that is being contended for?
Possibly all 3 - it initializes a request engine (CPU, but probably not much), which then sends a policy fetch request to the server (network) and then processes the response (CPU + disk: it parses a protobuf, stores the result to disk, turns it into pref values, and notifies pref listeners of changes to the prefs, which can result in pretty much any arbitrary amount of work depending on what new policy values happen to come down).
I have certainly seen CPU availability and performance make a difference to startup on Android. https://x20web.corp.google.com/~aberent/no_crawl/traces/trace_N4.html (Googlers only, sorry) shows a trace of normal startup on a Nexus 4 (without setting the CPU governors into performance mode). As you will see, there are periods when all 4 CPUs are busy and have ratcheted up their clock speeds to maximum.
When I was doing experiments for https://docs.google.com/a/google.com/document/d/1L6RFGGRY0wixju4f5ZXoKNjbY0KEDnATxkmQ2AeMCa0/edit?usp=sharing (Googlers only, sorry) I found that a svelte N4 (an N4 with some of its memory and 2 CPUs disabled) took roughly half a second longer to start Chrome than a normal N4. The availability of memory should not have made a difference to this, and the I/O system is, I believe, the same, so it seems to me that this can only have been caused by CPU contention.
In terms of doing the work, I am already looking at pulling out some of the startup work we have done to improve the performance of Chrome on Android into a new component, and generalizing it. The most relevant piece of this to this discussion is StartupTaskRunner. This was created to allow the Android UI to be brought up early and to remain responsive during startup. It supports creating a queue of UI thread startup tasks that may then either all be run in one go, or be run one at a time with UI events being handled after each task is run. Currently the order of the queue is defined by the programmer in the calling code, but in the long term one possibility would be to define it using a dependency graph (preferably one from which we can generate a graphical graph!).
My feeling is that it will take some careful design to get this right, but I would be happy to work on this with others.
At the moment StartupTaskRunner is only used for managing tasks that happen during the initialization of BrowserMainLoop (from PreCreateThreads to PreMainMessageLoopRun), whereas I think the delayed startup cases talked about on this thread are happening rather later; however, I don't see why this should not be generalized. Also, at the moment StartupTaskRunner only runs tasks asynchronously on Android, since, I believe, the desktop UI isn't currently created until after the initialization of BrowserMainLoop has completed (in addition, this allowed me to minimize the changes to desktop Chrome). On Android we also have a (quite limited) structure for later delayed initialization of certain other components, but unfortunately this is in the proprietary Java code of Chrome for Android, so cannot be directly reused on other platforms.
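The two run modes described for StartupTaskRunner (everything in one go, or one task at a time with UI events handled in between) can be illustrated with a small Python sketch. This is not the Chromium class: EventLoop, StartupTaskQueue and their methods are hypothetical names, and the event loop is a plain FIFO queue assumed only for the purpose of the illustration.

```python
from collections import deque


class EventLoop:
    """Tiny FIFO event loop standing in for the UI thread's message loop."""

    def __init__(self):
        self._queue = deque()

    def post(self, callback):
        self._queue.append(callback)

    def run_until_idle(self):
        while self._queue:
            self._queue.popleft()()


class StartupTaskQueue:
    """Ordered startup tasks that can run all at once or interleaved."""

    def __init__(self, loop):
        self._loop = loop
        self._tasks = deque()

    def add_task(self, task):
        # The caller defines the order simply by the order of add_task calls.
        self._tasks.append(task)

    def run_all_immediately(self):
        # Synchronous mode: nothing else gets a turn until startup is done.
        while self._tasks:
            self._tasks.popleft()()

    def start_running_incrementally(self):
        # Asynchronous mode: each task is a separate turn of the event loop.
        self._loop.post(self._run_next)

    def _run_next(self):
        if not self._tasks:
            return
        self._tasks.popleft()()
        if self._tasks:
            # Re-post instead of looping so queued UI events run in between.
            self._loop.post(self._run_next)
```

In the incremental mode a UI event posted while startup is in progress is handled between two startup tasks rather than after all of them, which is the responsiveness property described above.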
On Thu, Nov 21, 2013 at 12:17 PM, Anthony Berent <abe...@chromium.org> wrote:
I have certainly seen CPU availability and performance make a difference to startup on Android. [...]
StartupTaskRunner addresses a different but related problem.
In summary, there is some long-running work that occurs on the main thread before the main loop starts. On desktop that's not so bad, not least because we haven't even displayed a browser window by that time (it's also just, on average, shorter on desktop).
On Android the browser window has already been painted by Java before the browser process starts going, and the browser process startup is effectively a task running in the message loop. So for the duration of the startup the browser is visible but inactive.
So the startup was broken up into finer grained tasks between which UI events could be handled (for example, omnibox clicks/typing).
This solution only exists because (due to Chrome architecture) that work (substantially disk IO) all happens on the main thread. IIRC the biggest factors are loading resources, loading preferences, and paging in code. These are things that, according to our current architecture at least, absolutely have to happen before the browser starts chugging along.
The types of things being considered in this thread, AFAICT, are other things like history, data used for predictions and auto-completion, cookies not needed for restoring initial tab state, etc. Unlike the StartupTaskRunner tasks, most of these happen on separate threads anyway, so we're not worried about yielding the message loop as much as avoiding contention for CPU/disk/network.
On Thu, Nov 21, 2013 at 9:17 AM, Anthony Berent <abe...@chromium.org> wrote:
I have certainly seen CPU availability and performance make a difference to startup on Android. [...]
StartupTaskRunner is a highly specialized class for a specific purpose (running a few tasks inside content asynchronously, for Android Chrome). It's readable/understandable and easy to reason about correctness because it's very specific. It has little impact on other code (i.e. startup in src\chrome for desktop chrome) because that code always runs after these tasks have run. Now if we were to generalize this to code in desktop chrome, many assumptions that are baked in would cause a lot of failures (tons of code assumes that it runs after the profile has been initialized, to give one example). I remain extremely skeptical that we should expand this model to other code. Instead I think we should be doing more lazy initialization so that data is fetched/read from disk only when it's needed for the first time.
On Thu, Nov 21, 2013 at 1:41 PM, John Abd-El-Malek <j...@chromium.org> wrote:
On Thu, Nov 21, 2013 at 9:17 AM, Anthony Berent <abe...@chromium.org> wrote:
I have certainly seen CPU availability and performance make a difference to startup on Android. [...]
StartupTaskRunner is a highly specialized class for a specific purpose (running a few tasks inside content asynchronously, for Android Chrome). [...]
I agree with you about StartupTaskRunner being for a completely different use-case (see my simultaneous message).
I think the tricky thing is defining the word "needed". For example, when the initial tabs are restored I "need" history in order to record the visits. But I don't really care if it happens right away, so it would be OK and good if the history service deferred the visit recording until all the tabs had been restored.
The CloudPolicy example seems like it presents another counterpoint. From what I understand, this is a remotely managed policy that could have an impact on disparate parts of the user interface. So whenever it's updated it might cause the UI or browser behaviour to change. In that case, we want it to be reasonably up to date all of the time, but we don't mind waiting 10 seconds for tabs to restore before doing so.
This would be used for deferring work until things like:
1) Browser window is visible
2) First tab is loaded
3) All tabs are restored
4) Idle state achieved
PK
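A minimal sketch of deferring work until milestones like the four listed above, in Python for brevity. Nothing here is an existing Chromium API: the Milestone enum, MilestoneDeferrer and its methods are hypothetical, and the sketch assumes milestones are signalled explicitly by whoever observes them and that no ordering between milestones is enforced.

```python
from enum import Enum


class Milestone(Enum):
    BROWSER_WINDOW_VISIBLE = "browser_window_visible"
    FIRST_TAB_LOADED = "first_tab_loaded"
    ALL_TABS_RESTORED = "all_tabs_restored"
    IDLE = "idle"


class MilestoneDeferrer:
    """Holds tasks back until the milestone they asked for has been reached."""

    def __init__(self):
        self._reached = set()
        self._waiting = {milestone: [] for milestone in Milestone}

    def run_after(self, milestone, task):
        if milestone in self._reached:
            task()  # milestone already passed, so there is nothing to wait for
        else:
            self._waiting[milestone].append(task)

    def notify_reached(self, milestone):
        if milestone in self._reached:
            return  # ignore duplicate notifications
        self._reached.add(milestone)
        tasks, self._waiting[milestone] = self._waiting[milestone], []
        for task in tasks:
            task()  # run in registration order, one after another
```

A component would then replace a fixed timer with something like `deferrer.run_after(Milestone.ALL_TABS_RESTORED, start_history_recording)`, where `start_history_recording` is again a made-up name.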
Some thoughts on this. It seems to me that there are possibly three different cases for when we want to start components:
1) As soon as possible after startup. For example, anything that is needed to get the UI started or to load the first visible tab should be done as soon as possible.
   - We often care about the order in which these tasks are run, even if all of them should be run as soon as possible. For example, on Android, we have chosen to prioritize getting the Omnibox visible and responsive over loading the first tab.
2) Lazily. This is appropriate for components that may or may not be used, and are reasonably quick to start (or the delays in starting aren't visible to the user), but tie up resources (e.g. memory). There is no point in starting these until they are needed.
3) When resources are available. This is the main case that has been discussed above. This makes sense for components that are likely or certain to be needed, but aren't needed immediately, and take significant time to start.
For some components we may need to combine cases 2 and 3, i.e. in normal circumstances they should wait for resources to be available, but if some other component requires them they must start when that component requires them.
The interesting case is case 3, components that should start when resources are available. At the moment these are handled in various ad-hoc ways. Some start on timers, guessing that Chrome will be using fewer resources 10 or 60 seconds after startup than during startup; on Android at least, some are started by an idleHandler; and some are started when other parts of startup reach a particular state. Naively it would seem that this could be solved by initializing these components on low priority background threads; however, this only works if the limiting resource is CPU time, which it isn't in most of the cases we care about. It seems to me that maybe what we need is some way of prioritizing access to other resources, so that, for example, a low priority task reading from disk would never read more than a few blocks at a time without giving way to higher priority tasks doing disk access.
Clearly adding any sort of priority system to disk access (or network access) would be a significant piece of work, and would have quite widespread effects on the code. So before implementing this we should assess further whether disk and network contention are real problems (either during startup or at other times), and if so whether we can sensibly classify the contending tasks into high and low priority tasks. But it seems to me that this could solve many of the problems discussed in this email thread.
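The idea of a low priority reader giving way after every few blocks could be sketched roughly as follows. This is a single-threaded Python simulation, not real disk I/O and not anything that exists in Chromium: ChunkedIoScheduler, read_in_chunks and the HIGH/LOW constants are hypothetical, and the sketch assumes each job is a generator that yields after every chunk it "reads".

```python
import heapq
from itertools import count

HIGH, LOW = 0, 1  # smaller number means more urgent


def read_in_chunks(name, total_blocks, chunk_blocks, log):
    """Simulated read that records each chunk and then gives way."""
    for start_block in range(0, total_blocks, chunk_blocks):
        log.append((name, start_block))
        yield  # hand control back to the scheduler after every chunk


class ChunkedIoScheduler:
    """Always resumes the most urgent job, one chunk at a time."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker keeps equal priorities FIFO

    def submit(self, priority, job):
        heapq.heappush(self._heap, (priority, next(self._order), job))

    def step(self):
        """Run one chunk of the most urgent job; return False when idle."""
        if not self._heap:
            return False
        priority, _, job = heapq.heappop(self._heap)
        try:
            next(job)
        except StopIteration:
            return True  # job finished, do not requeue it
        heapq.heappush(self._heap, (priority, next(self._order), job))
        return True

    def run(self):
        while self.step():
            pass
```

Because the low priority job yields after each chunk, a high priority read submitted part-way through is served before the low priority job continues. Whether the real limiting resource can be partitioned this way is exactly the open question raised above.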
On Tue, Nov 26, 2013 at 12:21 AM, John Abd-El-Malek <j...@chromium.org> wrote:
The previous comments summarize my intuition about this topic. I'm really curious about the stuff that happens after initial startup, i.e. once we dig into details could we find alternate ways to simplify the code as well as to reduce "magic" delays to start doing work.
- for the cookie example given: do we have numbers on how much faster it is to load some cookies first vs all of the cookies database? My cookies file is 300KB. I imagine there's a few seeks to get to the necessary cookies for my opened tabs. At that point, is it actually slower to load all the cookies?
- one other example off the top of my head is metrics service, which starts a task after 60 seconds to collect data. Part of that is that it doesn't want to force loading the plugins from disk. Practically speaking, most page loads would force that to be loaded anyways. Could we be smarter by having the metrics service watch for when plugins are loaded, and only update its plugin data then?
On Fri, Nov 29, 2013 at 11:47 AM, Anthony Berent <abe...@chromium.org> wrote:
Some thoughts on this. It seems to me that there are possibly three different cases for when we want to start components: [...]
Case (1) typically maps to things that are initialized synchronously on the main thread during startup.
Case (3), as you say, is the most interesting. And I agree it would be nice to react to resource availability in real time. But it would seem a good start to simply wait until some key high priority milestones are reached before triggering the low priority tasks. Bonus points if we start the low priority tasks in serial rather than parallel.
As for classifying the tasks, I think that just starting with the things that are currently deferred through other methods and/or visual inspection of startup traces to find offending tasks would be a good start, too. I am under the impression that new tasks of this type are being added more often than every few months, so once a strategy exists I suspect we will quickly accrete clients.
On Tue, Nov 26, 2013 at 12:21 AM, John Abd-El-Malek <j...@chromium.org> wrote:
The previous comments summarize my intuition about this topic. I'm really curious about the stuff that happens after initial startup, i.e. once we dig into details could we find alternate ways to simplify the code as well as to reduce "magic" delays to start doing work.-for the cookie example given: do we have numbers on how much faster it is to load some cookies first vs all of the cookies database? My cookies file is 300KB. I imagine there's a few seeks to get to the necessary cookies for my opened tabs. At that point, is it actually slower to load all the cookies?We used to load all cookies in one shot, and it did take longer than just loading the required cookies. Because of the series of changes that led to the current implementation it's not easy to point to a precise gain, but I don't think it would be hard to introduce an experiment that loaded the required cookies first and then immediately loaded all others. The resulting data should be good for discussion.
-one other example off the top of my head is metrics service, which starts a task after 60 seconds to collect data. Part of that is that it doesn't want to force loading the plugins from disk. Practically speaking, most page loads would force that to be loaded anyways. Could we be smarter by having the metrics service watch for when plugins are loaded, and only update its plugin data then?That's a good example. Tasks that aren't resource intensive but depend on some other "expensive" task completing would not be scheduled in any special way. They should, I think, just ask for notification when the expensive task completes.That hinges, though, on the assumption that the expensive task will absolutely run. I don't know enough about plugins to say if that's the case here, but there are probably other examples where it's not guaranteed. In that case, you still want a way to schedule the task to run at the appropriate time.I agree with you that an appropriate first step is to identify a handful of potential use-cases and consider whether there are other appropriate solutions.
On Tue, Dec 3, 2013 at 7:15 AM, Erik Wright <erikw...@chromium.org> wrote:
On Fri, Nov 29, 2013 at 11:47 AM, Anthony Berent <abe...@chromium.org> wrote:
Some thoughts on this. It seems to me that there are possibly three different cases for when we want to start components:
1. As soon as possible after startup. For example, anything that is needed to get the UI started or to load the first visible tab should be done as soon as possible.
   - We often care about the order in which these tasks are run, even if all of them should be run as soon as possible. For example, on Android, we have chosen to prioritize getting the Omnibox visible and responsive over loading the first tab.
2. Lazily. This is appropriate for components that may or may not be used, and are reasonably quick to start (or the delays in starting aren't visible to the user), but tie up resources (e.g. memory). There is no point in starting these until they are needed.
3. When resources are available. This is the main case that has been discussed above. This makes sense for components that are likely or certain to be needed, but aren't needed immediately, and take significant time to start.
For some components we may need to combine cases 2 and 3, i.e. in normal circumstances they should wait for resources to be available, but if some other component requires them they must start when that component requires them.
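To make the combination of cases 2 and 3 concrete, here is a minimal sketch in Python (chosen only for brevity; nothing like this exists in Chrome as far as this thread says). The class `DeferredComponent` and its methods `get` and `on_resources_available` are hypothetical names invented for illustration. The component starts at most once, either because something demands it or because some scheduler reports spare capacity.

```python
from typing import Callable, Optional


class DeferredComponent:
    """Hypothetical wrapper: starts once, on first demand or when resources free up."""

    def __init__(self, name: str, start_fn: Callable[[], object]) -> None:
        self.name = name
        self._start_fn = start_fn
        self._instance: Optional[object] = None
        # Records which path triggered the start ("demand" or "idle").
        self.started_by: Optional[str] = None

    def _start(self, reason: str) -> None:
        # Only the first trigger does any work; later triggers are no-ops.
        if self.started_by is None:
            self._instance = self._start_fn()
            self.started_by = reason

    def get(self) -> object:
        """Case 2: another component needs this now, so start immediately."""
        self._start("demand")
        return self._instance

    def on_resources_available(self) -> None:
        """Case 3: an assumed scheduler signals that resources are available."""
        self._start("idle")


history = DeferredComponent("history", lambda: {"loaded": True})
history.on_resources_available()   # nobody asked for it yet, so it starts when idle
history.get()                      # a later demand reuses the existing instance
```

The open question from the thread remains who calls `on_resources_available` and on what evidence; the sketch only shows that the two triggers can share one start path.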
The interesting case is case 3, components that should start when resources are available. At the moment these are handled in various ad-hoc ways. Some start on timers, guessing that Chrome will be using fewer resources 10 or 60 seconds after startup than during startup; on Android at least, some are started by an idleHandler; and some are started when other parts of startup reach a particular state. Naively it would seem that this could be solved by initializing these components on low priority background threads; however, this only works if the limiting resource is CPU time, which it isn't in most of the cases we care about. It seems to me that maybe what we need is some way of prioritizing access to other resources, so that, for example, a low priority task reading from disk would never read more than a few blocks at a time without giving way to higher priority tasks doing disk access.
Clearly adding any sort of priority system to disk access (or network access) would be a significant piece of work, and would have quite widespread effects on the code. Before implementing this, we should assess further whether disk and network contention are real problems (either during startup or at other times), and if so whether we can sensibly classify the contending tasks into high and low priority tasks. Still, it seems to me that this could solve many of the problems discussed in this email thread.
Case (1) typically maps to things that are initialized synchronously on the main thread during startup.
Case (3), as you say, is the most interesting. And I agree it would be nice to react to resource availability in real time. But it would seem a good start to simply wait until some key high priority milestones are reached before triggering the low priority tasks. Bonus points if we start the low priority tasks in serial rather than parallel.
As for classifying the tasks, I think that just starting with the things that are currently deferred through other methods, and/or visually inspecting startup traces to find offending tasks, would be a good start, too. I am under the impression that new tasks of this type are being added more often than once every few months, so once a strategy exists I suspect we will quickly accrete clients.
On Tue, Nov 26, 2013 at 12:21 AM, John Abd-El-Malek <j...@chromium.org> wrote:
The previous comments summarize my intuition about this topic. I'm really curious about the stuff that happens after initial startup, i.e. once we dig into details, could we find alternate ways to simplify the code as well as to reduce "magic" delays before starting work?
- For the cookie example given: do we have numbers on how much faster it is to load some cookies first vs. the whole cookies database? My cookies file is 300KB. I imagine there are a few seeks to get to the necessary cookies for my opened tabs. At that point, is it actually slower to load all the cookies?
We used to load all cookies in one shot, and it did take longer than just loading the required cookies. Because of the series of changes that led to the current implementation it's not easy to point to a precise gain, but I don't think it would be hard to introduce an experiment that loaded the required cookies first and then immediately loaded all others. The resulting data should be good for discussion.
Yep, getting that data would help us figure out if this is a use case for 3 (when resources are available).
- One other example off the top of my head is the metrics service, which starts a task after 60 seconds to collect data. Part of the reason is that it doesn't want to force loading the plugins from disk. Practically speaking, most page loads would force them to be loaded anyway. Could we be smarter by having the metrics service watch for when plugins are loaded, and only update its plugin data then?
That's a good example. Tasks that aren't resource intensive but depend on some other "expensive" task completing would not be scheduled in any special way. They should, I think, just ask for notification when the expensive task completes.
That hinges, though, on the assumption that the expensive task will absolutely run. I don't know enough about plugins to say if that's the case here, but there are probably other examples where it's not guaranteed. In that case, you still want a way to schedule the task to run at the appropriate time.
I agree with you that an appropriate first step is to identify a handful of potential use-cases and consider whether there are other appropriate solutions.
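As a rough illustration of the "ask for notification, but keep a fallback" idea discussed for the metrics service, here is a small Python sketch. `ExpensiveTask`, `DependentTask` and `on_fallback_deadline` are hypothetical names, not Chrome APIs, and the choice to force the expensive work at the fallback deadline is an assumption made for the example.

```python
from typing import Callable, List


class ExpensiveTask:
    """Hypothetical stand-in for something like plugin loading."""

    def __init__(self) -> None:
        self.done = False
        self._observers: List[Callable[[], None]] = []

    def add_observer(self, callback: Callable[[], None]) -> None:
        # Late subscribers are notified immediately if the work already happened.
        if self.done:
            callback()
        else:
            self._observers.append(callback)

    def run(self) -> None:
        if self.done:
            return
        self.done = True
        observers, self._observers = self._observers, []
        for callback in observers:
            callback()


class DependentTask:
    """Runs its work once: when the dependency completes, or at a fallback deadline."""

    def __init__(self, dependency: ExpensiveTask, work: Callable[[], None]) -> None:
        self._dependency = dependency
        self._work = work
        self.ran = False
        dependency.add_observer(self._run_once)

    def _run_once(self) -> None:
        if not self.ran:
            self.ran = True
            self._work()

    def on_fallback_deadline(self) -> None:
        """Assumed to be called by a timer or idle signal if nothing else triggered us."""
        if not self._dependency.done:
            self._dependency.run()  # assumption: force the expensive work ourselves
        self._run_once()


log: List[str] = []
plugins = ExpensiveTask()
metrics = DependentTask(plugins, lambda: log.append("collect plugin data"))
plugins.run()                   # e.g. a page load forces plugin loading
metrics.on_fallback_deadline()  # the deadline later finds nothing left to do
```

In the common path the dependent task costs nothing until the expensive task finishes; the deadline only matters when the expensive task is never triggered by anything else.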
This is key. For the majority of items, 1) and 2) are enough. We should push more things from 1 to 2. My hypothesis is that 3 is a one-off, and there are very few items that need to be loaded this way, as almost all of them should just be lazily loaded. The only exceptions could be things like history, where if we lazily load it, then the first omnibox search might not have results. But again, this specific case may be handled by simply watching for when the first browser window is shown, and we might not need something more sophisticated.
Some other component would like to have them, but isn't in a hurry. There is no way, right now, to express that.
But again, until there is data I'm happy to let the different theories stand.
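For reference, the simpler alternative mentioned in this thread (wait for a milestone such as the first browser window being shown, then start low priority tasks one at a time) could be sketched roughly as follows. This is Python for brevity, and `MilestoneGatedQueue`, `post`, `on_milestone_reached` and `run_next` are hypothetical names. Who drives `run_next` (an idle callback, a timer, a message loop) is left open and is an assumption of the sketch.

```python
from collections import deque
from typing import Callable, Deque, List


class MilestoneGatedQueue:
    """Holds low priority tasks until a milestone fires, then runs them one at a time."""

    def __init__(self) -> None:
        self._pending: Deque[Callable[[], None]] = deque()
        self._milestone_reached = False

    def post(self, task: Callable[[], None]) -> None:
        self._pending.append(task)

    def on_milestone_reached(self) -> None:
        # E.g. called when the first browser window is shown.
        self._milestone_reached = True

    def run_next(self) -> bool:
        """Run at most one pending task; return True if a task ran."""
        if not self._milestone_reached or not self._pending:
            return False
        task = self._pending.popleft()
        task()
        return True


order: List[str] = []
queue = MilestoneGatedQueue()
queue.post(lambda: order.append("load history"))
queue.post(lambda: order.append("collect metrics"))
queue.run_next()              # does nothing: the milestone has not fired yet
queue.on_milestone_reached()
while queue.run_next():       # tasks run serially, in posting order
    pass
```

This expresses "not in a hurry" only as a single gate; it does not react to actual disk or network contention, which is the part of case 3 that still needs data.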