PSA: Beware of introducing new startup IO

Jeremy Moskovich

Nov 18, 2013, 3:33:42 AM
to Chromium-dev
tl;dr: If there’s any chance your code will be called during startup, please do everything possible to avoid, or at least minimize, IO operations.

When Chrome launches we want to display the browser window, start the event loop and render the first page as quickly as possible.  IO operations (even seemingly small ones) can be very costly, e.g. on a cold start, or when Chrome is autostarted and many processes are contending for IO.

Even if the IO operation you’re performing is non-blocking, it adds to IO contention and causes other blocking operations to take longer.

A couple of examples of measurable wins we've had in startup performance:
  • Storing the safe browsing database in a more compact format led to large wins on end user machines.
  • Delaying enumeration of attached storage devices until after first page load rather than performing the operation during early startup.
The effects of this stuff are pronounced on desktop systems, but in a mobile world badly timed IO has a critical impact on performance.

On the testing front we now have stable startup tests on the bots which measure things such as cold and warm startup time, session restore time, etc.  These should help catch regressions but it would be a mistake to blindly rely on them.

Drew Wilson

Nov 18, 2013, 4:26:55 AM
to Jeremy Moskovich, Chromium-dev
Have we settled on best practices here? Agreed that it's preferable not to fire off everything as soon as Chrome launches, but we don't really have any mechanism in place to notify modules when it's safe to initialize, or to throttle module initialization to ensure that Chrome doesn't come to a screeching halt right after first page load when every module suddenly does its delayed initialization.

I've seen different code do all kinds of different stuff, ranging from initialization-on-first-use, to "delay initialization by N~=10 secs", to "wait until first page loads" (which breaks when starting in background mode). At least by not having any agreed-upon best practice we end up reducing contention :)

Last time we talked about this (https://groups.google.com/a/chromium.org/d/msg/chromium-dev/zcLsAHooHps/SHWxetmI1D4J) we logged http://crbug.com/179779 - if anyone has some spare time to build a delayed-start framework, seems like it's a great opportunity to really improve our startup behavior.

-atw


--
--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev

Jeremy Moskovich

Nov 20, 2013, 2:03:14 AM
to Drew Wilson, Chromium-dev
No news on the best practices front, sadly.  We really do need a proper architecture for handling this stuff :/

Erik Wright

Nov 20, 2013, 9:37:16 AM
to Jeremy Moskovich, Drew Wilson, Chromium-dev
I'm not sure if there is consensus that we need a proper architecture for this.

Some time ago I proposed a proof-of-concept framework that, amongst other things, was intended to facilitate the implementation of "correct" startup behaviour. But the message I received was that it would be better to just solve problems as they occur, and correctly design individual components/features, than to introduce a new framework, with all the cost that that implies.

Peter Kasting

Nov 20, 2013, 2:59:17 PM
to Erik Wright, Jeremy Moskovich, Drew Wilson, Chromium-dev
I don't agree with that message, and a lot of people have been wanting a proper way to consistently "do lots of work early on but not hurt browser performance" for years.  Since before Chrome's public launch, in fact.

PK

Carlos Pizano

Nov 20, 2013, 8:35:53 PM
to chromi...@chromium.org, Erik Wright, Jeremy Moskovich, Drew Wilson
The message was not exactly that. I assume you are talking about the eng review. I get that you are compressing it to make your point more pointy, but let's get the full context:

The issue is that our start-up is very complicated, beyond what any developer on the team can hope to fit in their brain, which Wikipedia tells me has been in its current form for 200,000 years. A sophisticated dependency resolver/injection graph-based framework was proposed that would allow Chrome to transparently resolve the right order of component creation and initialization so developers would not have to care about this. There are costs associated, which I am not going to go into here, but the eng review take was that we would like a) to work towards a simpler start-up sequence that a qualified and properly trained homo sapiens can understand and b) a simpler form of the framework that is easier to diagnose/debug/grok etc.

I'll end by saying that Erik's framework is pretty cool, but I side with the recommendation as it was.

Peter Kasting

Nov 20, 2013, 8:44:23 PM
to Carlos Pizano, Chromium-dev, Erik Wright, Jeremy Moskovich, Drew Wilson
What I would like to see is simply a system where you register that you have some nontrivial work to be done, blocking-pool-style, and the system attempts to process all registered work with the maximum throughput that does not compromise user-visible aspects of Chrome's performance.  In other words, somehow it knows that during startup we're doing lots of IO, so it doesn't do more, and then when that IO drops off, it figures out how much it can schedule without introducing thrashing.

Dependency graphs and determining overall component init order and relationships sounds explicitly out-of-scope for a feasible solution to me.

PK
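
As a rough illustration of the registration model described above, here is a minimal sketch in Python (not Chromium code). `DeferredWorkQueue`, `register`, `pump`, and the `io_is_busy` probe are all hypothetical names; how to detect IO pressure reliably is the hard part and is simply assumed here.

```python
from collections import deque
from typing import Callable, Deque


class DeferredWorkQueue:
    """Hypothetical registry: callers hand over work, the queue decides when it runs."""

    def __init__(self, io_is_busy: Callable[[], bool], max_tasks_per_pump: int = 1) -> None:
        # io_is_busy is an assumed probe reporting whether IO is currently contended.
        self._io_is_busy = io_is_busy
        self._max_tasks_per_pump = max_tasks_per_pump
        self._pending: Deque[Callable[[], None]] = deque()

    def register(self, work: Callable[[], None]) -> None:
        """Queue non-urgent work without running it."""
        self._pending.append(work)

    def pump(self) -> int:
        """Run queued work while IO is quiet, up to the per-pump cap.

        Returns the number of tasks that ran.
        """
        ran = 0
        while self._pending and ran < self._max_tasks_per_pump and not self._io_is_busy():
            self._pending.popleft()()
            ran += 1
        return ran
```

Something would need to call `pump()` periodically (for example from a timer); raising `max_tasks_per_pump` trades throughput against the risk of thrashing.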

Jeremy Moskovich

Nov 21, 2013, 5:19:41 AM
to Peter Kasting, Carlos Pizano, Chromium-dev, Erik Wright, Drew Wilson
While I'd like to see an ultimate generalized startup architecture, I'd be happy with just runlevels*, notifications at opportune times and the ability to assert that we've reached a given runlevel.
All of these solve concrete problems we have in the code today and are easily replaced with a better architecture should one be introduced.

Does that sound controversial?

FWIW, just wanted to add that while I understand the reasoning for not implementing it, I really liked Erik's framework.
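
To make the runlevel idea concrete, here is a minimal sketch in Python (not Chromium code) covering the three pieces mentioned above: levels, notifications on transitions, and an assertion that a level has been reached. The level names and the `StartupState` API are assumptions made for illustration; the thread does not define a concrete set.

```python
import enum
from collections import defaultdict
from typing import Callable, DefaultDict, List


class RunLevel(enum.IntEnum):
    # Hypothetical levels, ordered from earliest to latest.
    EARLY_STARTUP = 0
    FIRST_PAGE_LOADED = 1
    IDLE = 2


class StartupState:
    def __init__(self) -> None:
        self._level = RunLevel.EARLY_STARTUP
        self._observers: DefaultDict[RunLevel, List[Callable[[], None]]] = defaultdict(list)

    @property
    def level(self) -> RunLevel:
        return self._level

    def notify_at(self, level: RunLevel, callback: Callable[[], None]) -> None:
        """Run callback once `level` is reached (immediately if it already was)."""
        if self._level >= level:
            callback()
        else:
            self._observers[level].append(callback)

    def advance_to(self, level: RunLevel) -> None:
        """Move forward, firing observers for every level passed on the way."""
        if level <= self._level:
            raise ValueError("run levels only move forward")
        for passed in RunLevel:
            if self._level < passed <= level:
                self._level = passed
                for callback in self._observers.pop(passed, []):
                    callback()

    def assert_reached(self, level: RunLevel) -> None:
        """Fail loudly if code runs before the run level it depends on."""
        if self._level < level:
            raise AssertionError(f"requires {level.name}, currently {self._level.name}")
```

A component that wants to defer disk work would call `notify_at(RunLevel.FIRST_PAGE_LOADED, ...)`, and code that must not run during early startup could begin with `assert_reached(...)`.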

Drew Wilson

Nov 21, 2013, 7:03:44 AM
to Jeremy Moskovich, Peter Kasting, Carlos Pizano, Chromium-dev, Erik Wright
+1 IMO runlevels + transition notifications/callbacks probably gets us 90% of the way there and I'd be very happy to see something (ANYTHING) like this happen.

Jói Sigurðsson

Nov 21, 2013, 7:04:45 AM
to atwi...@chromium.org, Jeremy Moskovich, Peter Kasting, Carlos Pizano, Chromium-dev, Erik Wright
Who wants to take this on? It seems there would be plenty of willing
reviewers on this thread (count me in as one).

Cheers,
Jói

Erik Wright

Nov 21, 2013, 8:40:50 AM
to Carlos Pizano, Chromium-dev, Jeremy Moskovich, Drew Wilson
Yes, that is a slightly more context-rich summary. But this thread is discussing doing (b) first. It seems like there is enough momentum and enthusiasm here to do that in a limited scope.

Erik Wright

Nov 21, 2013, 10:10:17 AM
to Jói Sigurðsson, Drew Wilson, Jeremy Moskovich, Peter Kasting, Carlos Pizano, Chromium-dev
I'd be happy to work with Peter, Drew, Jeremy, and anyone else who might be interested to identify a specific representative example of some execution that we would like to defer. Based on that I think we should be able to come up with something as simple as possible to meet that specific need.

Drew Wilson

Nov 21, 2013, 10:23:49 AM
to Erik Wright, Jói Sigurðsson, Jeremy Moskovich, Peter Kasting, Carlos Pizano, Chromium-dev
OK, the canonical example from the policy code (and pardon me if this is polluting this thread - I'm happy to split this into a separate thread if you like):

DeviceManagementService is owned by BrowserPolicyConnector and is created during browser initialization (BrowserPolicyConnector is owned by the BrowserProcess and is effectively a singleton).

This class maintains a queue of pending DeviceManagement requests (typically, requests for CloudPolicy). We want to refresh our policy on startup, but we want to delay this long enough to avoid conflicting with initial startup.

DeviceManagementService does not expose a public Initialize() API - instead, it exposes a ScheduleInitialization(int msecs) API which is used to schedule a delayed initialization via a task. Typically, we schedule initialization after 5 or 10 seconds.

However, there are two cases where we want to speed up initialization:

1) When doing enterprise enrollment on startup, we want to trigger policy loading immediately so the user doesn't have to wait for the normal initialization delay to complete, so the enrollment code calls ScheduleInitialization(0) to force an immediate enrollment.
2) The ChromeOS login screen wants to force a refresh so any new policies can take effect on the login screen, but doesn't want to conflict with the rendering of this initial screen, so it triggers a refresh after 100 milliseconds.

I suspect that we could remove #2 and nobody would notice, but case #1 is definitely important. The functionality we'd want out of this framework is:

1) A notification when it's OK to do our delayed initialization (triggered at some time that avoids contention)
2) Some way to force immediate initialization (possibly this could be handled outside of the framework, just by exposing an InitNow() method on our object that does immediate initialization and stops observing the notification from #1).

-atw
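
The two requirements listed above (a "safe to initialize" notification plus a way to force immediate initialization) might look roughly like this. This is a Python sketch, not the real DeviceManagementService; `StartupNotifier`, `DeferredService`, and `init_now` are hypothetical names standing in for the framework and the InitNow() idea.

```python
from typing import Callable, List


class StartupNotifier:
    """Stand-in for the hypothetical framework that signals 'safe to initialize'."""

    def __init__(self) -> None:
        self._observers: List[Callable[[], None]] = []

    def add_observer(self, callback: Callable[[], None]) -> None:
        self._observers.append(callback)

    def remove_observer(self, callback: Callable[[], None]) -> None:
        self._observers.remove(callback)

    def notify(self) -> None:
        # Iterate over a copy because observers unregister themselves when called.
        for callback in list(self._observers):
            callback()


class DeferredService:
    """Initializes on the notification, or earlier if init_now() is called."""

    def __init__(self, notifier: StartupNotifier) -> None:
        self._notifier = notifier
        self._observing = True
        self.initialized = False
        self.init_count = 0
        notifier.add_observer(self._on_safe_to_initialize)

    def init_now(self) -> None:
        """Force immediate initialization and stop waiting for the notification."""
        self._stop_observing()
        self._initialize()

    def _on_safe_to_initialize(self) -> None:
        self._stop_observing()
        self._initialize()

    def _stop_observing(self) -> None:
        if self._observing:
            self._notifier.remove_observer(self._on_safe_to_initialize)
            self._observing = False

    def _initialize(self) -> None:
        if not self.initialized:
            self.initialized = True
            self.init_count += 1  # the real (expensive) work would go here
```

With this shape the enrollment case calls `init_now()` directly, while everything else simply waits for `notify()`; either way initialization runs exactly once.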

Erik Wright

Nov 21, 2013, 10:34:38 AM
to Drew Wilson, Jói Sigurðsson, Jeremy Moskovich, Peter Kasting, Carlos Pizano, Chromium-dev
Is it network, disk, or CPU that is being contended for?

Drew Wilson

Nov 21, 2013, 10:45:31 AM
to Erik Wright, Jói Sigurðsson, Jeremy Moskovich, Peter Kasting, Carlos Pizano, Chromium-dev
Possibly all 3 - it initializes a request engine (CPU, but probably not much), which then sends a policy fetch request to the server (network) and then processes the response (CPU + disk: it parses a protobuf, stores the result to disk, turns it into pref values, notifies pref listeners of changes to the prefs, which can result in pretty much any arbitrary amount of work depending on what new policy values happen to come down).

Erik Wright

Nov 21, 2013, 10:48:34 AM
to Drew Wilson, Jói Sigurðsson, Jeremy Moskovich, Peter Kasting, Carlos Pizano, Chromium-dev
Jeremy can probably add his two cents here, but I doubt CPU contention is worth worrying about. I've seen lots of Android startup traces (I assume ChromeOS will be similar or better) and I never saw that CPU was a factor in startup. It was pretty much always paging in data (and code) that was the biggest delay.

Anthony Berent

Nov 21, 2013, 12:17:24 PM
to erikw...@chromium.org, Drew Wilson, Jói Sigurðsson, Jeremy Moskovich, Peter Kasting, Carlos Pizano, Chromium-dev
I have certainly seen CPU availability and performance make a difference to startup on Android. https://x20web.corp.google.com/~aberent/no_crawl/traces/trace_N4.html (Googlers only, sorry) shows a trace of normal startup on a Nexus 4 (without setting the CPU governors into performance mode). As you will see there are periods when all 4 CPUs are busy, and have ratcheted up their clock speeds to maximum.

When I was doing experiments for https://docs.google.com/a/google.com/document/d/1L6RFGGRY0wixju4f5ZXoKNjbY0KEDnATxkmQ2AeMCa0/edit?usp=sharing (Googlers only, sorry) I found that a svelte N4 (an N4 with some of its memory and 2 CPUs disabled) took roughly half a second longer to start Chrome than a normal N4. The availability of memory should not have made a difference to this, and the I/O system is, I believe, the same, so it seems to me that this can only have been caused by CPU contention.

In terms of doing the work, I am already looking at doing some work on pulling out some of the startup stuff we have done to improve the performance of Chrome on Android into a new component, and generalizing it.  The most relevant piece of this to this discussion is StartupTaskRunner. This was created to allow the Android UI to be brought up early and to remain responsive during startup. It supports creating a queue of UI thread startup tasks that may then either all be run at one go, or be run one at a time with UI events being handled after each task is run. Currently the order of the queue is defined by the programmer in the calling code, but in the long term one possibility would be to define it using a dependency graph (preferably one from which we can generate a graphical graph!). My feeling is that it will take some careful design to get this right, but I would be happy to work on this with others.

At the moment StartupTaskRunner is only used for managing tasks that happen during the initialization of BrowserMainLoop (from PreCreateThreads to PreMainMessageLoopRun), whereas I think the delayed startup cases talked about on this thread happen rather later. However, I don't see why this should not be generalized. Also, at the moment StartupTaskRunner only runs tasks asynchronously on Android since, I believe, the desktop UI isn't currently created until after the initialization of BrowserMainLoop has completed (in addition, this allowed me to minimize the changes to desktop Chrome). On Android we also have a (quite limited) structure for later delayed initialization of certain other components, but unfortunately this is in the proprietary Java code of Chrome for Android, so it cannot be directly reused on other platforms.
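
For readers unfamiliar with the pattern, the two modes described above (run everything in one go, or run one task at a time with UI events handled in between) can be sketched as follows. This is a Python illustration of the idea only; `StartupTaskQueue` and its methods are hypothetical and are not the actual StartupTaskRunner interface.

```python
from collections import deque
from typing import Callable, Deque


class StartupTaskQueue:
    """Ordered queue of startup tasks; the caller defines the order by insertion."""

    def __init__(self, handle_ui_events: Callable[[], None]) -> None:
        # handle_ui_events is an assumed hook that services pending UI events.
        self._handle_ui_events = handle_ui_events
        self._tasks: Deque[Callable[[], None]] = deque()

    def add_task(self, task: Callable[[], None]) -> None:
        self._tasks.append(task)

    def run_all_tasks_now(self) -> None:
        """Synchronous mode: drain the queue without yielding to the UI."""
        while self._tasks:
            self._tasks.popleft()()

    def run_next_task(self) -> bool:
        """Incremental mode: run one task, then let UI events be handled.

        Returns True while tasks remain in the queue.
        """
        if self._tasks:
            self._tasks.popleft()()
            self._handle_ui_events()
        return bool(self._tasks)
```

In a real message loop the incremental mode would repost itself as a task rather than be driven by a caller's loop; the explicit `run_next_task()` call here just keeps the sketch self-contained.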


Jói Sigurðsson

Nov 21, 2013, 12:53:11 PM
to Anthony Berent, Erik Wright, Drew Wilson, Jeremy Moskovich, Peter Kasting, Carlos Pizano, Chromium-dev
Here's a strawman idea for how this might be done without trying to
wade through the start-up code to determine what stuff is "critical"
and when all of that stuff has completed. This strawman assumes we
have access to performance counters on all systems, which I'm not
sure we do:

Very early in Chrome's startup, before any threads are started,
create an object that starts sampling performance counters to
give some kind of reasonable per-second average for both CPU and I/O
performance. Store a starting value.

Use heuristics where we expect both CPU and I/O to increase after
startup, keep track of the maximum for CPU and I/O, and then have
events that are triggered when either or both have been stable at 70%
or less of the difference between the starting value and the maximum
for a couple of seconds, and again when either or both have been
stable at 30% of the difference. To avoid bad things happening due to
failure of the heuristic, you could have a minimum/maximum amount of
time that must/can pass before each of these event types. We could use
UMA metrics and Finch experiments to figure out the best heuristic
parameters that cause failure least often and seem to spread the load
out the best.

In terms of the interface to this object, code could register for
notification on any of the predetermined events (e.g. a CPU-bound task
might register for the "CPU below 30% of difference from start to
peak" event) but could potentially also provide TimeDelta values that
would be the earliest/latest it would want to be triggered.

An additional nicety in the interface might be to have each task that
gets a notification indicate when it's done, and trigger them in
sequence rather than all tasks at once that are waiting for the same
event. For this to work we might need them to register with a priority
value.

Cheers,
Jói
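
One way to read the threshold heuristic in this strawman, sketched in Python for a single counter: track the starting value and the peak, and fire once the counter has stayed at or below `start + fraction * (peak - start)` for a number of consecutive samples. `LoadSettleDetector` and its parameters are hypothetical, sample counts stand in for "a couple of seconds", and the minimum/maximum time guards and priority ordering are left out.

```python
class LoadSettleDetector:
    """Fires once when a load counter settles below a fraction of its startup peak."""

    def __init__(self, start: float, fraction: float, stable_samples: int) -> None:
        self._start = start            # counter value sampled before startup work began
        self._peak = start             # highest value seen so far
        self._fraction = fraction      # e.g. 0.7 or 0.3 of the start-to-peak difference
        self._needed = stable_samples  # consecutive quiet samples required
        self._streak = 0
        self.fired = False

    def add_sample(self, value: float) -> bool:
        """Feed one per-second average; returns True on the sample that fires the event."""
        if self.fired:
            return False
        if value > self._peak:
            self._peak = value
        threshold = self._start + self._fraction * (self._peak - self._start)
        # Only count quiet samples after the load has actually risen above the start.
        if self._peak > self._start and value <= threshold:
            self._streak += 1
        else:
            self._streak = 0
        if self._streak >= self._needed:
            self.fired = True
            return True
        return False
```

Two detectors (one with `fraction=0.7`, one with `fraction=0.3`) fed from the same sampler would give the two event types described above.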

John Abd-El-Malek

Nov 21, 2013, 1:41:38 PM
to abe...@chromium.org, Erik Wright, Drew Wilson, Jói Sigurðsson, Jeremy Moskovich, Peter Kasting, Carlos Pizano, Chromium-dev
StartupTaskRunner is a highly specialized class for a specific purpose (running a few tasks inside content asynchronously, for Android Chrome). It's readable/understandable and easy to reason about correctness because it's very specific. It has little impact on other code (e.g. startup in src\chrome for desktop Chrome) because that code always runs after these tasks have run. Now if we were to generalize this to code in desktop Chrome, many assumptions that are baked in would cause a lot of failures (tons of code assumes that it runs after the profile has been initialized, to give one example). I remain extremely skeptical that we should expand this model to other code. Instead I think we should be doing more lazy initialization so that data is fetched/read from disk only when it's needed for the first time.
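
The lazy-initialization alternative mentioned above amounts to something like the following Python sketch. `LazyValue` is a hypothetical helper, the loader stands in for whatever disk read is being deferred, and the sketch is not thread-safe.

```python
from typing import Any, Callable


class LazyValue:
    """Defers an expensive load (e.g. a disk read) until the value is first needed."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader
        self._loaded = False
        self._value: Any = None

    def get(self) -> Any:
        # The loader runs at most once, on first access; nothing happens at construction.
        if not self._loaded:
            self._value = self._loader()
            self._loaded = True
        return self._value
```

Constructing the object during startup costs nothing; the IO happens on the first `get()`, which is ideally after the first page is up.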

Erik Wright

Nov 21, 2013, 1:42:20 PM
to Anthony Berent, Drew Wilson, Jói Sigurðsson, Jeremy Moskovich, Peter Kasting, Carlos Pizano, Chromium-dev
StartupTaskRunner addresses a different but related problem.

In summary, there is some long-running work that occurs on the main thread before the main loop starts. On desktop that's not so bad, not least because we haven't even displayed a browser window by that time (it's also just, on average, shorter on desktop).

On Android the browser window has already been painted by Java before the browser process starts going, and the browser process startup is effectively a task running in the message loop. So for the duration of the startup the browser is visible but inactive.

So the startup was broken up into finer grained tasks between which UI events could be handled (for example, omnibox clicks/typing).

This solution only exists because (due to Chrome architecture) that work (substantially disk IO) all happens on the main thread. IIRC the biggest factors are loading resources, loading preferences, and paging in code. These are things that, according to our current architecture at least, absolutely have to happen before the browser starts chugging along.

The types of things being considered in this thread, AFAICT, are other things like history, data used for predictions and auto-completion, cookies not needed for restoring initial tab state, etc. Contrary to the StartupTaskRunner tasks, most of these happen on separate threads anyways, so we're not worried about yielding the message loop as much as avoiding contention for CPU/disk/network.

John Abd-El-Malek

Nov 21, 2013, 1:48:11 PM
to Erik Wright, Anthony Berent, Drew Wilson, Jói Sigurðsson, Jeremy Moskovich, Peter Kasting, Carlos Pizano, Chromium-dev
On Thu, Nov 21, 2013 at 10:42 AM, Erik Wright <erikw...@chromium.org> wrote:



On Thu, Nov 21, 2013 at 12:17 PM, Anthony Berent <abe...@chromium.org> wrote:
I have certainly seen CPU availability and performance make a difference to startup on Android. https://x20web.corp.google.com/~aberent/no_crawl/traces/trace_N4.html (Googlers only, sorry) shows a trace of normal startup on a Nexus 4 (without setting the CPU governors into performance mode). As you will see there are periods when all 4 CPUs are busy, and have ratcheted up their clock speeds to maximum.

When I was doing experiments for https://docs.google.com/a/google.com/document/d/1L6RFGGRY0wixju4f5ZXoKNjbY0KEDnATxkmQ2AeMCa0/edit?usp=sharing (Googlers only, sorry) I found that a svelte N4 (an N4 with some of its memory and 2 CPUs disabled) took roughly half a second longer to start Chrome than a normal N4. The availability of memory should not have made a difference to this, and the I/O system is, I believe, the same, so it seems to me that this can only have been caused by CPU contention.

In terms of doing the work, I am already looking at doing some work on pulling out some of the startup stuff we have done to improve the performance of Chrome on Android into a new component, and generalizing it.  The most relevant piece of this to this discussion is StartupTaskRunner. This was created to allow the Android UI to be brought up early and to remain responsive during startup. It supports creating a queue of UI thread startup tasks that may then either all be run at one go, or be run one at a time with UI events being handled after each task is run. Currently the order of the queue is defined by the programmer in the calling code, but in the long term one possibility would be to define it using a dependency graph (preferably one from which we can generate a graphical graph!). My feeling is that it will take some careful design to get this right, but I would be happy to work on this with others.

At the moment StartupTaskRunner is only used for managing tasks that happen during the initialization of BrowserMainLoop (from PreCreateThreads to PreMainMessageLoopRun), whereas I think the delayed startup cases talked about on this thread happen rather later. However, I don't see why this should not be generalized. Also, at the moment StartupTaskRunner only runs tasks asynchronously on Android since, I believe, the desktop UI isn't currently created until after the initialization of BrowserMainLoop has completed (this also allowed me to minimize the changes to desktop Chrome). On Android we also have a (quite limited) structure for later delayed initialization of certain other components, but unfortunately this is in the proprietary Java code of Chrome for Android, so it cannot be directly reused on other platforms.

StartupTaskRunner addresses a different but related problem.

Exactly. For desktop chrome, we can't paint the UI until we load the profile because that impacts how the UI is drawn (i.e. is bookmark bar enabled, extension page actions, the NTP...).

Erik Wright

unread,
Nov 21, 2013, 1:48:49 PM11/21/13
to John Abd-El-Malek, Anthony Berent, Drew Wilson, Jói Sigurðsson, Jeremy Moskovich, Peter Kasting, Carlos Pizano, Chromium-dev
On Thu, Nov 21, 2013 at 1:41 PM, John Abd-El-Malek <j...@chromium.org> wrote:



StartupTaskRunner is a highly specialized class for a specific purpose (running a few tasks inside content asynchronously, for Android Chrome). It's readable/understandable and easy to reason about correctness because it's very specific. It has little impact on other code (e.g. startup in src\chrome for desktop chrome) because that code always runs after these tasks have run. Now if we were to generalize this to code in desktop chrome, many assumptions that are baked in would cause a lot of failures (tons of code assumes that it runs after the profile has been initialized, to give one example). I remain extremely skeptical that we should expand this model to other code. Instead I think we should be doing more lazy initialization so that data is fetched/read from disk only when it's needed for the first time.

I agree with you about StartupTaskRunner being for a completely different use-case (see my simultaneous message).

I think the tricky thing is defining the word "needed". For example, when the initial tabs are restored I "need" history in order to record the visits. But I don't really care if it happens right away, so it would be OK and good if the history service deferred the visit recording until all the tabs had been restored.

The CloudPolicy example seems like it presents another counterpoint. From what I understand, this is a remotely managed policy that could have an impact on disparate parts of the user interface. So whenever it's updated it might cause the UI or browser behaviour to change. In that case, we want it to be reasonably up to date all of the time, but we don't mind waiting 10 seconds for tabs to restore before doing so.

Drew Wilson

unread,
Nov 22, 2013, 3:20:00 AM11/22/13
to Erik Wright, John Abd-El-Malek, Anthony Berent, Jói Sigurðsson, Jeremy Moskovich, Peter Kasting, Carlos Pizano, Chromium-dev
Exactly. Thinking about it more, the CloudPolicy stuff is interesting, because there are really two parts: fetching and applying the already-cached policy, and checking for updates. The "apply current policy from cache" has to happen during startup, immediately (for example, if sync is disabled by policy, or if policy specifies a startup page, you really need to know this before doing profile initialization and rendering tabs). But as you say, fetching an update can wait. 
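A rough sketch of that two-part split, assuming nothing about the real CloudPolicy code: the cached policy is applied synchronously during startup, and the update check is only queued for later. `DeferredWork`, `PolicyStartup` and the policy strings are hypothetical and exist purely for illustration.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical holder for work that should not run during startup.
class DeferredWork {
 public:
  void Post(std::function<void()> task) { pending_.push_back(std::move(task)); }

  // Called once startup is considered finished (however that is decided).
  void RunAll() {
    std::vector<std::function<void()>> ready;
    ready.swap(pending_);
    for (auto& task : ready) task();
  }

 private:
  std::vector<std::function<void()>> pending_;
};

// Hypothetical policy holder: `cached` stands for what is already on disk,
// `latest` for what an update check would return.
class PolicyStartup {
 public:
  PolicyStartup(std::string cached, std::string latest, DeferredWork* deferred)
      : cached_(std::move(cached)),
        latest_(std::move(latest)),
        deferred_(deferred) {}

  void Initialize() {
    // Part 1: must happen immediately, before profile init and tab rendering.
    active_ = cached_;
    // Part 2: checking for updates can wait until after startup.
    deferred_->Post([this] { active_ = latest_; });
  }

  const std::string& active() const { return active_; }

 private:
  std::string cached_;
  std::string latest_;
  std::string active_;
  DeferredWork* deferred_;
};
```

How `RunAll` gets triggered (timer, milestone, idle detection) is exactly the open question in this thread, so the sketch leaves it to the caller.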

Jeremy Moskovich

unread,
Nov 24, 2013, 7:42:25 AM11/24/13
to Drew Wilson, Erik Wright, John Abd-El-Malek, Anthony Berent, Jói Sigurðsson, Peter Kasting, Carlos Pizano, Chromium-dev
It's great we're having this discussion!

Again, personally I'd be happy with something simplistic like the concept of a runlevel, the ability to assert against that and opportune notifications.

I may be mistaken, but that seems noncontroversial and introducing it might inform a more advanced system.

Erik: WDYT?

Best regards,
Jeremy

John Abd-El-Malek

unread,
Nov 25, 2013, 2:24:51 AM11/25/13
to Jeremy Moskovich, Drew Wilson, Erik Wright, Anthony Berent, Jói Sigurðsson, Peter Kasting, Carlos Pizano, Chromium-dev
A few questions:
-what would run levels correspond to, can you give concrete examples?
-if I'm understanding this proposal, presumably classes which are already created would wait for a particular run level before doing IO?
  -could you give some classes as examples? if they're already created and waiting for some notification to start their IO, can we measure through rough experimentation the benefit of that vs waiting for lazy initialization instead (i.e. first time data is needed, instead of a particular run level)
-how does this compare to ChromeBrowserMainExtraParts?

Erik Wright

unread,
Nov 25, 2013, 11:01:17 AM11/25/13
to John Abd-El-Malek, Jeremy Moskovich, Drew Wilson, Anthony Berent, Jói Sigurðsson, Peter Kasting, Carlos Pizano, Chromium-dev
This would be used for deferring work until things like:
1) Browser window is visible
2) First tab is loaded
3) All tabs are restored
4) Idle state achieved

Examples are any code that currently posts a task to an arbitrary point in the future to complete its startup. I did a quick codesearch for "PostDelayedTask FromSeconds". I found the following three instances in a few minutes of scanning but there are many more:


Other classes currently wait for notifications from notification service. They wait until an event indicates first page load and then they start their work. But given our goal of eliminating NotificationService an easier and more specific API for achieving a similar effect would be useful.

Examples of services waiting for lazy initialization include History and friends, prediction databases, web data, etc. I have seen Android traces where History was loading too early and delaying startup. I know Brett has said he periodically spends a morning diagnosing what code is erroneously "first needing" history too early and fixing it. My interpretation is that waiting for "first need" is brittle.

Another example is the cookie store. We load cookies for active web requests first, followed by the rest of the cookie store. We definitely want to have the whole store in memory in order to reduce latency for future web requests, but it would be perfectly reasonable to _only_ load the active domains until some idle state is reached. Currently we don't make an effort to do so.

ChromeBrowserMainExtraParts does not provide any hooks that are related to tab restoration, idle state, etc. It could be extended to do so. That's just one way of implementing run levels. On the other hand, some means of abstraction would be required to allow browser components to access this system.
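One possible shape for such an abstraction, as a sketch only: components register a callback against a startup milestone and get run once that milestone (or a later one) has been reached. `StartupMilestone`, `StartupMilestones`, `RunWhenReached`, `Advance` and `HasReached` are all hypothetical names, not an existing Chromium API; the four levels simply mirror the list above.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical run levels, ordered from earliest to latest.
enum class StartupMilestone {
  kWindowVisible = 1,
  kFirstTabLoaded,
  kAllTabsRestored,
  kIdle,
};

class StartupMilestones {
 public:
  // Runs `callback` once `milestone` has been reached. If it has already
  // been reached, the callback runs immediately.
  void RunWhenReached(StartupMilestone milestone,
                      std::function<void()> callback) {
    if (HasReached(milestone)) {
      callback();
      return;
    }
    waiting_.emplace(milestone, std::move(callback));
  }

  // Records that `milestone` was reached and runs everything that was
  // waiting for it or for an earlier milestone, earliest first.
  void Advance(StartupMilestone milestone) {
    if (static_cast<int>(milestone) > reached_)
      reached_ = static_cast<int>(milestone);
    auto end = waiting_.upper_bound(milestone);
    // Move callbacks out first so they may safely register more work.
    std::vector<std::function<void()>> ready;
    for (auto it = waiting_.begin(); it != end; ++it)
      ready.push_back(std::move(it->second));
    waiting_.erase(waiting_.begin(), end);
    for (auto& callback : ready) callback();
  }

  // Lets code assert that it is not running too early.
  bool HasReached(StartupMilestone milestone) const {
    return reached_ >= static_cast<int>(milestone);
  }

 private:
  int reached_ = 0;  // 0 means no milestone has been reached yet.
  std::multimap<StartupMilestone, std::function<void()>> waiting_;
};
```

A client that today posts a delayed task with a guessed timeout would instead call something like `RunWhenReached(StartupMilestone::kAllTabsRestored, ...)`, and code that must not run during early startup could assert on `HasReached`.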

Peter Kasting

unread,
Nov 25, 2013, 2:24:39 PM11/25/13
to Erik Wright, John Abd-El-Malek, Jeremy Moskovich, Drew Wilson, Anthony Berent, Jói Sigurðsson, Carlos Pizano, Chromium-dev
On Mon, Nov 25, 2013 at 8:01 AM, Erik Wright <erikw...@chromium.org> wrote:
This would be used for deferring work until things like:
1) Browser window is visible
2) First tab is loaded
3) All tabs are restored
4) Idle state achieved

I don't think the granularity here is useful.  All code I've worked with either wants to do something during initial startup, or after initial startup.  That's the only interesting distinction.  Code in the first category doesn't need any additional hooks.  Code in the second category is all a huge blob of stuff that needs to happen "soon after startup, but not blocking startup, and not slowing down general code usage".  Runlevels don't solve that problem at all.

PK

Elliot Glaysher (Chromium)

unread,
Nov 25, 2013, 2:31:19 PM11/25/13
to Peter Kasting, Erik Wright, John Abd-El-Malek, Jeremy Moskovich, Drew Wilson, Anthony Berent, Jói Sigurðsson, Carlos Pizano, Chromium-dev
+1 to this; if something is going to be done here, it should probably look more like Joi's proposal where we actually look at the system performance and only run tasks when they wouldn't interfere with the normal throughput.

Erik Wright

unread,
Nov 25, 2013, 2:32:17 PM11/25/13
to Peter Kasting, John Abd-El-Malek, Jeremy Moskovich, Drew Wilson, Anthony Berent, Jói Sigurðsson, Carlos Pizano, Chromium-dev
That sounds correct to me. I think the analogy to runlevels (made by someone else, so I might be corrected here) was because some prioritization is assumed. I.E., amongst the things that want to happen after initial startup, some might be more desirable than others.

For example, it would (hypothetically) have a greater user impact to have good omnibox predictions vs to have the entire cookie store in memory.

I don't see a reason to need that in an initial implementation.

PK

Erik Wright

unread,
Nov 25, 2013, 2:34:24 PM11/25/13
to Elliot Glaysher (Chromium), Peter Kasting, John Abd-El-Malek, Jeremy Moskovich, Drew Wilson, Anthony Berent, Jói Sigurðsson, Carlos Pizano, Chromium-dev
That seems desirable, but also seems compatible with a first pass that would just defer until the initial crunch passed. That would still be much better than the current state.

John Abd-El-Malek

unread,
Nov 25, 2013, 7:21:57 PM11/25/13
to Erik Wright, Elliot Glaysher (Chromium), Peter Kasting, Jeremy Moskovich, Drew Wilson, Anthony Berent, Jói Sigurðsson, Carlos Pizano, Chromium-dev
The previous comments summarize my intuition about this topic. I'm really curious about the stuff that happens after initial startup, i.e. once we dig into details could we find alternate ways to simplify the code as well as to reduce "magic" delays to start doing work.
-for the cookie example given: do we have numbers on how much faster it is to load some cookies first vs all of the cookies database? My cookies file is 300KB. I imagine there's a few seeks to get to the necessary cookies for my opened tabs. At that point, is it actually slower to load all the cookies?
-one other example off the top of my head is metrics service, which starts a task after 60 seconds to collect data. Part of that is that it doesn't want to force loading the plugins from disk. Practically speaking, most page loads would force that to be loaded anyways. Could we be smarter by having the metrics service watch for when plugins are loaded, and only update its plugin data then?

Anthony Berent

unread,
Nov 29, 2013, 11:47:53 AM11/29/13
to John Abd-El-Malek, Erik Wright, Elliot Glaysher (Chromium), Peter Kasting, Jeremy Moskovich, Drew Wilson, Jói Sigurðsson, Carlos Pizano, Chromium-dev
Some thoughts on this. It seems to me that there are possibly three different cases for when we want to start components:
  1. As soon as possible after startup. For example anything that is needed to get the UI started or to load the first tab visible should be done as soon as possible.
    • We often care about the order in which these tasks are run, even if all of them should be run as soon as possible. For example, on Android, we have chosen to prioritize getting the Omnibox visible and responsive over loading the first tab.
  2. Lazily. This is appropriate for components that may or may not be used, and are reasonably quick to start (or the delays in starting aren't visible to the user), but tie up resources (e.g. memory). There is no point in starting these until they are needed.
  3. When resources are available. This is the main case that has been discussed above. This makes sense for components that are likely or certain to be needed, but aren't needed immediately, and take significant time to start.
For some components we may need to combine cases 2 and 3, i.e. in normal circumstances they should wait for resources to be available, but if some other component requires them they must start when that component requires them.
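The combination of cases 2 and 3 could look roughly like the sketch below: a component that initializes exactly once, either when it is told resources are available or when something first uses it, whichever comes first. `DeferredComponent`, `OnResourcesAvailable` and `Use` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical component wrapper combining lazy and deferred initialization.
class DeferredComponent {
 public:
  explicit DeferredComponent(std::function<void()> init)
      : init_(std::move(init)) {}

  // Case 3: the scheduler signals that startup contention has passed.
  void OnResourcesAvailable() { EnsureInitialized(); }

  // Case 2: another component needs this one right now, ready or not.
  void Use() {
    EnsureInitialized();
    ++use_count_;
  }

  bool initialized() const { return initialized_; }
  int use_count() const { return use_count_; }

 private:
  // Runs the expensive initialization at most once.
  void EnsureInitialized() {
    if (initialized_) return;
    initialized_ = true;
    init_();
  }

  std::function<void()> init_;
  bool initialized_ = false;
  int use_count_ = 0;
};
```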

The interesting case is case 3, components that should start when resources are available. At the moment these are handled in various ad-hoc ways. Some start on timers, guessing that Chrome will be using fewer resources 10 or 60 seconds after startup than during startup; on Android at least, some are started by an idleHandler; and some are started when other parts of startup reach a particular state. Naively it would seem that this could be solved by initializing these components on low priority background threads; however this only works if the limiting resource is CPU time, which it isn't in most of the cases we care about. It seems to me that maybe what we need is some way of prioritizing access to other resources, so that, for example, a low priority task reading from disk would never read more than a few blocks at a time without giving way to higher priority tasks doing disk access.

Clearly adding any sort of priority system to disk access (or network access) would be a significant piece of work, and would have quite widespread effects on the code, so we should assess further whether disk and network contention are real problems (either during startup or at other times), and if so whether we can sensibly classify the contending tasks into high and low priority tasks, before implementing this, but it seems to me that this could solve many of the problems discussed in this email thread. 
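As a toy model of what prioritized disk access might mean, the sketch below serves high priority reads in full and hands low priority reads only a few blocks per step, so a newly arrived high priority read never waits behind more than one small chunk. Nothing here performs real IO, and `PriorityReadScheduler`, `Submit`, `Step` and the chunk size are assumptions made up for the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical read request: a name plus the number of blocks still to read.
struct ReadRequest {
  std::string name;
  std::size_t blocks_remaining;
};

class PriorityReadScheduler {
 public:
  // `low_priority_chunk` is the most blocks a low priority read gets per step.
  explicit PriorityReadScheduler(std::size_t low_priority_chunk)
      : chunk_(low_priority_chunk) {}

  void Submit(std::string name, std::size_t blocks, bool high_priority) {
    auto& queue = high_priority ? high_ : low_;
    queue.push_back(ReadRequest{std::move(name), blocks});
  }

  // Performs one scheduling step and appends "name:blocks" to `log`.
  // Returns false when there is nothing left to read.
  bool Step(std::vector<std::string>* log) {
    if (!high_.empty()) {
      // High priority reads are served in full, ahead of any low priority work.
      ReadRequest request = std::move(high_.front());
      high_.pop_front();
      log->push_back(request.name + ":" +
                     std::to_string(request.blocks_remaining));
      return true;
    }
    if (low_.empty()) return false;
    // Low priority reads advance by at most one chunk, then yield.
    ReadRequest& request = low_.front();
    std::size_t blocks = std::min(chunk_, request.blocks_remaining);
    request.blocks_remaining -= blocks;
    log->push_back(request.name + ":" + std::to_string(blocks));
    if (request.blocks_remaining == 0) low_.pop_front();
    return true;
  }

 private:
  std::size_t chunk_;
  std::deque<ReadRequest> high_;
  std::deque<ReadRequest> low_;
};
```

Whether something like this is worth building depends on the measurements called for above; the sketch only shows that the scheduling rule itself is small.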

 

Erik Wright

unread,
Dec 3, 2013, 10:15:39 AM12/3/13
to Anthony Berent, John Abd-El-Malek, Elliot Glaysher (Chromium), Peter Kasting, Jeremy Moskovich, Drew Wilson, Jói Sigurðsson, Carlos Pizano, Chromium-dev
Case (1) typically maps to things that are initialized synchronously on the main thread during startup.

Case (3), as you say, is the most interesting. And I agree it would be nice to react to resource availability in real time. But it would seem a good start to simply wait until some key high priority milestones are reached before triggering the low priority tasks. Bonus points if we start the low priority tasks in serial rather than parallel.

As for classifying the tasks, I think that just starting with the things that are currently deferred through other methods and/or visual inspection of startup traces to find offending tasks would be a good start, too. I am under the impression that new tasks of this type are being added more often than once every few months, so once a strategy exists I suspect we will quickly accrete clients.

On Tue, Nov 26, 2013 at 12:21 AM, John Abd-El-Malek <j...@chromium.org> wrote:
The previous comments summarize my intuition about this topic. I'm really curious about the stuff that happens after initial startup, i.e. once we dig into details could we find alternate ways to simplify the code as well as to reduce "magic" delays to start doing work.
-for the cookie example given: do we have numbers on how much faster it is to load some cookies first vs all of the cookies database? My cookies file is 300KB. I imagine there's a few seeks to get to the necessary cookies for my opened tabs. At that point, is it actually slower to load all the cookies?

We used to load all cookies in one shot, and it did take longer than just loading the required cookies. Because of the series of changes that led to the current implementation it's not easy to point to a precise gain, but I don't think it would be hard to introduce an experiment that loaded the required cookies first and then immediately loaded all others. The resulting data should be good for discussion.
 
-one other example off the top of my head is metrics service, which starts a task after 60 seconds to collect data. Part of that is that it doesn't want to force loading the plugins from disk. Practically speaking, most page loads would force that to be loaded anyways. Could we be smarter by having the metrics service watch for when plugins are loaded, and only update its plugin data then?

That's a good example. Tasks that aren't resource intensive but depend on some other "expensive" task completing would not be scheduled in any special way. They should, I think, just ask for notification when the expensive task completes.

That hinges, though, on the assumption that the expensive task will absolutely run. I don't know enough about plugins to say if that's the case here, but there are probably other examples where it's not guaranteed. In that case, you still want a way to schedule the task to run at the appropriate time.

I agree with you that an appropriate first step is to identify a handful of potential use-cases and consider whether there are other appropriate solutions.

John Abd-El-Malek

unread,
Dec 5, 2013, 2:39:22 PM12/5/13
to Erik Wright, Anthony Berent, Elliot Glaysher (Chromium), Peter Kasting, Jeremy Moskovich, Drew Wilson, Jói Sigurðsson, Carlos Pizano, Chromium-dev
On Tue, Dec 3, 2013 at 7:15 AM, Erik Wright <erikw...@chromium.org> wrote:



On Fri, Nov 29, 2013 at 11:47 AM, Anthony Berent <abe...@chromium.org> wrote:
Some thoughts on this. It seems to me that there are possibly three different cases for when we want to start components:
  1. As soon as possible after startup. For example anything that is needed to get the UI started or to load the first tab visible should be done as soon as possible.
    • We often care about the order in which these tasks are run, even if all of them should be run as soon as possible. For example, on Android, we have chosen to prioritize getting the Omnibox visible and responsive over loading the first tab.
  2. Lazily. This is appropriate for components that may or may not be used, and are reasonably quick to start (or the delays in starting aren't visible to the user), but tie up resources (e.g. memory). There is no point in starting these until they are needed.
  3. When resources are available. This is the main case that has been discussed above. This makes sense for components that are likely or certain to be needed, but aren't needed immediately, and take significant time to start.
For some components we may need to combine cases 2 and 3, i.e. in normal circumstances they should wait for resources to be available, but if some other component requires them they must start when that component requires them.

The interesting case is case 3, components that should start when resources are available. At the moment these are handled in various ad-hoc ways. Some start on timers, guessing that Chrome will using less resources 10 or 60 seconds after startup that during startup; on Android at least, some are started by an idleHandler; and some are started when other parts of startup reaches a particular state. Naively it would seem that this could be solved by initializing these components on low priority background threads; however this only works if the limiting resource is CPU time, which it isn't in most of the cases we care about. It seems to me that maybe what we need is some way of prioritizing access to other resources, so that, for example, a low priority task reading from disk would never read more than a few blocks at a time without giving way to higher priority tasks doing disk access.

Clearly, adding any sort of priority system to disk access (or network access) would be a significant piece of work and would have quite widespread effects on the code. Before implementing this, we should assess further whether disk and network contention are real problems (either during startup or at other times), and if so whether we can sensibly classify the contending tasks into high and low priority tasks. Still, it seems to me that this could solve many of the problems discussed in this email thread.

Case (1) typically maps to things that are initialized synchronously on the main thread during startup.

Case (3), as you say, is the most interesting. And I agree it would be nice to react to resource availability in real time. But it would seem a good start to simply wait until some key high priority milestones are reached before triggering the low priority tasks. Bonus points if we start the low priority tasks in serial rather than parallel.
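
The "wait for a milestone, then run serially" suggestion could look roughly like the following C++ sketch. DeferredStartupQueue and its methods are hypothetical names, and the sketch assumes a single-threaded caller (for example an event loop) that calls RunNext() between other pieces of work:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical queue of low priority startup tasks.
class DeferredStartupQueue {
 public:
  using Task = std::function<void()>;

  // Queues a task; nothing runs until the milestone has been reached.
  void Post(Task task) {
    assert(task);  // Empty tasks are a caller bug.
    pending_.push(std::move(task));
  }

  // Called once a key high priority milestone (e.g. first page load,
  // chosen here only as an example) has been reached.
  void OnMilestoneReached() { milestone_reached_ = true; }

  // Runs at most one queued task, so tasks execute in serial and the
  // caller can interleave other work. Returns true if a task ran.
  bool RunNext() {
    if (!milestone_reached_ || pending_.empty()) return false;
    Task task = std::move(pending_.front());
    pending_.pop();
    task();
    return true;
  }

  std::size_t pending() const { return pending_.size(); }

 private:
  bool milestone_reached_ = false;
  std::queue<Task> pending_;
};
```

Running one task per call is a deliberate choice in this sketch: it keeps the deferred work serial and gives the caller a natural point to stop if higher priority work shows up.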

As for classifying the tasks, I think that just starting with the things that are currently deferred through other methods, and/or visually inspecting startup traces to find offending tasks, would be a good start, too. I am under the impression that new tasks of this type are being added more often than once every few months, so once a strategy exists I suspect we will quickly accrete clients.

On Tue, Nov 26, 2013 at 12:21 AM, John Abd-El-Malek <j...@chromium.org> wrote:
The previous comments summarize my intuition about this topic. I'm really curious about the stuff that happens after initial startup, i.e. once we dig into details could we find alternate ways to simplify the code as well as to reduce "magic" delays to start doing work.
- For the cookie example given: do we have numbers on how much faster it is to load some cookies first vs. all of the cookies database? My cookies file is 300KB. I imagine there are a few seeks to get to the necessary cookies for my opened tabs. At that point, is it actually slower to load all the cookies?

We used to load all cookies in one shot, and it did take longer than just loading the required cookies. Because of the series of changes that led to the current implementation it's not easy to point to a precise gain, but I don't think it would be hard to introduce an experiment that loaded the required cookies first and then immediately loaded all others. The resulting data should be good for discussion.

Yep, getting that data would help us figure out if this is a use case for 3 (when resources are available).
 
 
- One other example off the top of my head is the metrics service, which starts a task after 60 seconds to collect data. Part of the reason is that it doesn't want to force loading the plugins from disk. Practically speaking, most page loads would force them to be loaded anyway. Could we be smarter by having the metrics service watch for when plugins are loaded, and only update its plugin data then?

That's a good example. Tasks that aren't resource intensive but depend on some other "expensive" task completing would not be scheduled in any special way. They should, I think, just ask for notification when the expensive task completes.

That hinges, though, on the assumption that the expensive task will absolutely run. I don't know enough about plugins to say if that's the case here, but there are probably other examples where it's not guaranteed. In that case, you still want a way to schedule the task to run at the appropriate time.
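
As a rough illustration of "ask for notification when the expensive task completes", here is a small C++ sketch. ExpensiveTask, WhenDone and Run are hypothetical names, not the real plugin or metrics code, and the sketch is single-threaded:

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for an expensive operation such as loading the
// plugin list from disk.
class ExpensiveTask {
 public:
  using Callback = std::function<void()>;

  // Runs `callback` once the task has completed, or immediately if it
  // already has. Registering a callback never triggers the work itself.
  void WhenDone(Callback callback) {
    assert(callback);
    if (done_) {
      callback();
      return;
    }
    waiters_.push_back(std::move(callback));
  }

  // Performs the expensive work (elided here) and notifies each waiter
  // exactly once. Later calls are no-ops.
  void Run() {
    if (done_) return;
    done_ = true;
    std::vector<Callback> waiters = std::move(waiters_);
    waiters_.clear();
    for (auto& waiter : waiters) waiter();
  }

  bool done() const { return done_; }

 private:
  bool done_ = false;
  std::vector<Callback> waiters_;
};
```

A dependent such as the metrics service would call WhenDone() and do nothing else. For the case where the expensive task is not guaranteed to run, some fallback (a timer or an idle signal, both assumptions here) would still have to call Run() eventually, which is the scheduling problem discussed above.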

I agree with you that an appropriate first step is to identify a handful of potential use-cases and consider whether there are other appropriate solutions.

This is key. For the majority of items, 1) and 2) are enough. We should push more things from 1 to 2. My hypothesis is that 3 is a one-off, and there are very few items that need to be loaded this way, as almost all of them should just be lazily loaded. The only exceptions could be things like history, where if we lazily load it, then the first omnibox search might not have results. But again, this specific case may be handled by simply watching for when the first browser window is shown, and we might not need something more sophisticated.

Erik Wright

Dec 5, 2013, 2:48:27 PM12/5/13
to John Abd-El-Malek, Anthony Berent, Elliot Glaysher (Chromium), Peter Kasting, Jeremy Moskovich, Drew Wilson, Jói Sigurðsson, Carlos Pizano, Chromium-dev
On Thu, Dec 5, 2013 at 2:39 PM, John Abd-El-Malek <j...@chromium.org> wrote:
This is key. For the majority of items, 1) and 2) are enough. We should push more things from 1 to 2. My hypothesis is that 3 is a one-off, and there are very few items that need to be loaded this way, as almost all of them should just be lazily loaded. The only exceptions could be things like history, where if we lazily load it, then the first omnibox search might not have results. But again, this specific case may be handled by simply watching for when the first browser window is shown, and we might not need something more sophisticated.

I actually disagree here, but we can let the data speak. My theory is that lazily initialized things are not being lazy enough. Some other component would like to have them, but isn't in a hurry. There is no way, right now, to express that.

But again, until there is data I'm happy to let the different theories stand.

John Abd-El-Malek

Dec 5, 2013, 3:41:28 PM12/5/13
to Erik Wright, Anthony Berent, Elliot Glaysher (Chromium), Peter Kasting, Jeremy Moskovich, Drew Wilson, Jói Sigurðsson, Carlos Pizano, Chromium-dev
Can you expand on what that means? Do you mean that components look like they're loading their data lazily, but they're triggering off the wrong events so it's happening earlier than needed? If so, isn't that a bug in each implementation, and no system could solve that?
 
Some other component would like to have them, but isn't in a hurry. There is no way, right now, to express that.

But again, until there is data I'm happy to let the different theories stand.

I wholeheartedly agree. I think this is a great project for someone: classify as many startup tasks as possible and figure out whether they can be made lazy.