Proposal: Stage-based app bootstrap process

Zbigniew Braniecki

Apr 14, 2014, 6:07:20 PM

to mozilla-...@lists.mozilla.org

=== Problem ===

The bootstrap code of Firefox OS apps is quite messy and incoherent.

The lack of an API that would let an App control the System's behavior (like firstPaint) results in a lot of dirty hacks intended to provide a decent app startup user experience.

Preventing FOUCs, flickering and on-screen layout composition, both for apps and for the System itself (think: runtime localization), requires a lot of trickery and suboptimal heuristics, and still depends on unreliable factors like I/O or CPU speed.

On top of that, the lack of a standardized way for the App to report its bootstrapping progress makes it impossible to reliably measure such basic characteristics as startup performance or memory consumption.
We just don't know what it means for the App to be loaded, so we're guessing.

The only markers we can use are firstPaint for performance and an arbitrarily chosen 10-second delay for measuring memory consumption. That's all guesswork.

This also incentivizes developers to postpone more and more of the bootstrapping until after firstPaint, which improves the firstPaint numbers while degrading the user experience through post-firstPaint screen composition and an unresponsive UI.

=== Proposed solution ===

I'd like to suggest implementing a stage-based bootstrapping process.

The stages would be asynchronous, flexible and optional.

Asynchronous means that the App code fired for a given stage may launch asynchronous work (like data loading or worker operations), and the App ends the stage by firing an event that notifies the System that the stage is complete.

Flexible means that code related to stages may be fired out of order, and the App's bootstrapping behavior may be adjusted to the particular app's needs (think: a developer who knows that async data loading for stage 4 does not block stage 3 may start the loading for stage 4 before stage 3).

Optional means that the App may skip any or all of the stages simply by not defining the function that the System fires at the beginning of the stage.
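
To make the three properties concrete, here is a minimal sketch of what the App side could look like. It is only an illustration: `runStages`, `notifySystem` and the shape of the report object are hypothetical names, not an existing Gecko or Gaia API, and the "data load" is simulated with an already resolved promise.

```javascript
// Hypothetical sketch: none of these names come from an existing Gecko or Gaia API.
// `stages` maps a stage number to an optional handler that may return a promise.
// `notifySystem` stands in for whatever event the App would fire at the System.
async function runStages(stages, notifySystem) {
  for (let n = 0; n <= 5; n++) {
    const handler = stages[n];
    const defined = typeof handler === 'function';
    if (defined) {
      // Asynchronous: the stage ends when the handler's promise settles.
      await handler();
    }
    // Optional: a stage with no handler completes immediately.
    notifySystem({ stage: n, skipped: !defined, time: Date.now() });
  }
}

// Flexible: stage 3 kicks off the data load that stage 4 needs,
// because the load does not block attaching the UI handlers.
let messagesLoading = null;
const app = {
  3: () => {
    messagesLoading = Promise.resolve(['msg1', 'msg2']); // simulated early load
    return Promise.resolve(); // UI event handlers would be attached here
  },
  4: () => messagesLoading, // stage 4 only waits for the load to finish
};

const report = [];
const done = runStages(app, (e) => report.push(e));
```

In this sketch stages 0, 1, 2 and 5 are reported as skipped because the App defines no handler for them, while stages 3 and 4 complete only after their promises settle.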

The result is a lightweight model that optionally lets App authors take control of certain System operations and gives the System solid information about the status of the App's bootstrapping.

The focus on perceived user experience and the universality of the stage division make the contract between the App author and the System independent of the particular System UX, so it should not require changes as the long-term design of the System UI evolves.

This form of feeding information from the App to the System may also potentially enable us to optimize B2G better for performance, memory usage and caching.

=== Flow ===

Stage 0 - this stage is initialized as soon as possible after the user initiates the App launch. The goal is to instantly give the user feedback that their action has been registered by the System.

System: At this point the System initializes JS bootstrap code if it's registered in the manifest.webapp.

Stage 1 - this is an optional stage which allows the App to modify any initialization parameters at runtime.

System: At this point the System initializes loading of the document from manifest.webapp@launch_app

Stage 2 - this stage is fired when the document's DOM is interactive. It enables all operations required to prepare the document to be displayed and may block firstPaint if needed.

System: At this point the System draws the App screen and takes initial firstPaint performance measurement.

Stage 3 - during this stage the App initializes its UI (event handlers on buttons etc.), making the App's UI usable for the user.

System: At this point the System knows that the App's UI is usable and may take another measurement.

Stage 4 - this stage allows the App to load any data that should not block UI but is displayed on the main screen. (examples: first 10 text messages, wifi/sim card status in Settings, first bunch of emails)

System: At this point the System knows that the screen is fully visible and usable and should not require reflows unless an event occurs. Another performance metric and, optionally, a screenshot can be taken, or reflow optimizations applied.

Stage 5 - during this stage the App loads additional data that is not displayed on the screen, but is needed without user interactions. Remaining messages/emails, Browser's bookmarks/history or awesomebar data.

System: At this point the System considers the App fully loaded and may take memory measurements.

Here's a demo of what it may look like: http://labs.braniecki.net/gaia/bootstrap/

=== Challenges ===

* Webby-ness

The Stage Bootstrap proposal focuses on the UX of App loading and as such differs from how HTML document loading works. It introduces firstPaint blocking and separates layout from UI and data.
The question is how "webby" it can be and how we can make this model fit into the architecture of Web engines and standards.

* Error recovery

If the app decides to block firstPaint and fails to emit the stage2complete event, we should be able to recognize that and react. Potential solutions include a time limit and taking over firstPaint control in the event of an uncaught Error.
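
Both recovery paths can be sketched in a few lines. This is an assumption-laden illustration, not a proposed API: `guardFirstPaint` is a hypothetical helper, `stage2` is a promise standing in for the App's "stage 2 complete" signal, and `limitMs` is an arbitrary time limit.

```javascript
// Hypothetical sketch: `stage2` is the App's promise for "stage 2 complete";
// `limitMs` is an assumed time limit, not a value from the proposal.
function guardFirstPaint(stage2, limitMs) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve('timeout'), limitMs);
  });
  // A failure in the App's stage code also releases the paint.
  const app = stage2.then(() => 'complete', () => 'error');
  return Promise.race([app, timeout]).then((reason) => {
    clearTimeout(timer);
    // The System would unblock firstPaint here in every case;
    // `reason` only records why it was unblocked.
    return reason;
  });
}
```

Whichever of the three outcomes happens first ('complete', 'error' or 'timeout') wins the race, so a misbehaving app can delay the paint by at most `limitMs`.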

* Interoperability with the Web

How can we make this architecture work when the System is not Firefox OS but a Web Browser that, as a result, does not recognize the stage bootstrapping system?
One solution is to make the stage code fire on its own in case the System does not take over. Can we make it work?
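
A rough sketch of that fallback, under the assumption that a stage-aware System would expose some object to the App. Both `bootstrap` and `system.runStages` are hypothetical names; in a plain Web Browser `system` would simply be undefined.

```javascript
// Hypothetical sketch. `system` is whatever object a stage-aware System would
// expose to the App; in a plain Web Browser it would be undefined.
function bootstrap(stages, system) {
  if (system && typeof system.runStages === 'function') {
    return system.runStages(stages); // the System takes over
  }
  // No takeover: fire each defined stage in numeric order ourselves,
  // waiting for one stage to finish before starting the next.
  return Object.keys(stages)
    .map(Number)
    .sort((a, b) => a - b)
    .reduce((prev, n) => prev.then(() => stages[n]()), Promise.resolve());
}
```

The App code stays the same in both environments; only the party that drives the stages changes.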

=== RFC ===

Let's start the discussion on the app bootstrapping and use this proposal as a starting point.
- Do you think it may work?
- Does it solve the problems described at the top?
- Are there other approaches that may work better?

Thanks,
g.

Alive

Apr 14, 2014, 10:38:34 PM

to Zbigniew Braniecki, mozilla-...@lists.mozilla.org

Hi,

This is an interesting post.

AFAIK we have:

'mozbrowserloadstart'/'mozbrowserloadend' to know loading state.

'mozbrowserfirstpaint' to know first paint state.

'addNextPaintListener' for next paint event.

But we don't know 'when is the UI usable' for now. The app should be responsible for lazily loading any JS/data from the main page.

BTW, could you cross post this to dev-b2g as well? Most of your proposal needs gecko work in advance of system work.

-Alive

_______________________________________________
dev-gaia mailing list
dev-...@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-gaia

--

Alive C. Kuo, Senior Front-end/Software Engineer, FirefoxOS, MoCo. Taiwan, Taipei office.

al...@mozilla.com

Dave Huseby

Apr 15, 2014, 12:46:51 AM

to dev-...@lists.mozilla.org, fxos...@mozilla.com, Harald Kirschner

I'm cross-posting this to the fxos-perf list since this is a very
interesting proposal that could also dovetail nicely with the ongoing
performance work. Harald's work on the perf hud is also tangentially
related.

Doing a staged load like this gets us several key improvements:

1. it stabilizes the vocabulary we use when talking about perf issues
(e.g. the stage 1 code is doing I/O from IndexedDB when that should be
done in stage 3).

2. we can have clear time budgets at each stage and enforce them with
regression testing (e.g. b2gperf, make perf-test, etc).

3. there are clearly stages that the UX team would be very interested
in. the metrics around perceived progress and response time can be
evaluated for each stage when determining if an app meets our shipping
criteria.
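
As a rough illustration of point 2, a regression check against per-stage budgets could be as small as the sketch below. The budget numbers are placeholders I made up for the example, not figures from the perf or UX teams, and `overBudget` is a hypothetical helper rather than part of b2gperf.

```javascript
// Hypothetical budgets in milliseconds since launch; the numbers are
// placeholders, not figures from the perf or UX teams.
const BUDGETS_MS = { 2: 1000, 3: 1500, 4: 2500 };

// `timings` maps a stage number to the measured time of its completion.
// Returns the stages that finished later than their budget allows.
function overBudget(timings, budgets) {
  return Object.keys(budgets)
    .filter((stage) => stage in timings && timings[stage] > budgets[stage])
    .map(Number);
}

const regressions = overBudget({ 2: 800, 3: 1900, 4: 2400 }, BUDGETS_MS);
// regressions is [3]: only stage 3 missed its budget in this made-up run
```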

Alive's work on the state machine in the window manager could be a good
first start on getting something like this formalized. But as he
pointed out, we'd also need some gecko work so that we could have
support for things like memory stable and UI responsive.

-dave


Zbigniew Braniecki

Apr 15, 2014, 1:39:00 AM

to mozilla-...@lists.mozilla.org

Hi Alive,

Thanks for your feedback.

On Monday, April 14, 2014 7:38:34 PM UTC-7, Alive wrote:
> Hi,

> AFAIK we have:
> 'mozbrowserloadstart'/'mozbrowserloadend' to know loading state.

Well, the problem here is that it has little to do with how we understand app loading. It works for document loading, where the engine does all the guesswork.
In particular, mozbrowserloadend is fired (if I understand correctly) synchronously after the onload event handlers run. The issue is that in an application the synchronous load is usually used just to set up asynchronous handlers that fire later and continue initializing the app (see mozL10n.ready / window.addEventListener('localized')).

That gives us nothing, and it incentivizes moving more of the app to asynchronous loading because that does not affect the mozbrowserloadend marker.
But it does affect UI responsiveness and layout composition in a negative way.
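
A tiny simulation of the effect, with no Gecko involved: `onload` stands in for the App's synchronous load handler, and the timer stands in for an asynchronous continuation such as a 'localized' listener. The ordering, not the API, is the point.

```javascript
// Simulation only: no Gecko events are used. `onload` stands in for the App's
// synchronous load handler, after which the load-end marker would be reported.
const log = [];

function onload() {
  // The handler only schedules the real initialization and returns at once.
  setTimeout(() => log.push('app initialized'), 20);
  log.push('onload handler returned');
}

onload();
log.push('load end reported');
// At this point the marker is already recorded, but 'app initialized'
// is not in the log yet; it arrives only when the timer fires.
```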

> 'mozbrowserfirstpaint' to know first paint state.

Yes, we have firstPaint. But once again, as with mozbrowserloadend, it is easy to game by adding asynchronous event handlers and initializing more of the app after firstPaint. More and more of our code is loaded from modules, and that messes with firstPaint timers.

The heuristics that Gecko uses to decide when to start painting work great for documents loaded over slow connections: it can start painting parts of the app while the rest of the HTML document is still being loaded.

It does not work well in the local app scenario where we should be actually able to help Gecko because we know when it should paint.

> BTW, could you cross post this to dev-b2g as well? Most of your proposal needs gecko work in advance of system work.

Actually not that much.

I believe that the only Gecko piece that we need here is the ability to block firstPaint. But sure, I'll cross-post! :)

Cheers,
zb.

Zbigniew Braniecki

Apr 15, 2014, 1:47:10 AM

to mozilla-...@lists.mozilla.org

On Monday, April 14, 2014 9:46:51 PM UTC-7, Dave Huseby wrote:
> I'm cross-posting this to the fxos-perf list since this is a very
> interesting proposal that could also dovetail nicely with the ongoing
> performance work. Harald's work on the perf hud is also tangentially
> related.

Great! I'm still kind of new to the FxOS codebase, and I'll be working on a lot of early code initialization as part of my work on l10n startup (which currently is the major bootstrap API, bug 993188).

> 2. we can have clear time budgets at each stage and enforce them with
> regression testing (e.g. b2gperf, make perf-test, etc).

Love this. I didn't think about this use case, but we definitely could define a budget for when the UI is visible and how many milliseconds we allow before it's responsive etc.

> Alive's work on the state machine in the window manager could be a good
> first start on getting something like this formalized. But as he
> pointed out, we'd also need some gecko work so that we could have
> support for things like memory stable and UI responsive.

Hmm, as I stated in my response to Alive, I kind of don't see why we need that much Gecko involvement.
When I was prototyping the idea, the only thing I really needed from Gecko was the ability to let the App block firstPaint. Everything else is just an event listener in the System app that reacts by marking the stage timestamp, plus app initialization code that fires the stages.
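
To show how little the System side needs, here is a minimal sketch of such a listener. The names (`createStageRecorder`, `onStageComplete`) and the event shape (`{ stage }`) are assumptions made for the example and are not taken from the prototype.

```javascript
// Hypothetical sketch of the System side: one listener that timestamps each
// "stage complete" notification. The event shape ({ stage }) is an assumption.
function createStageRecorder(launchTime, now = Date.now) {
  const marks = {};
  return {
    onStageComplete(event) {
      // Keep the first report only, so a repeated event cannot move the mark.
      if (!(event.stage in marks)) {
        marks[event.stage] = now() - launchTime;
      }
    },
    marks, // stage number -> milliseconds since launch
  };
}
```

The System app would create one recorder per launched App and feed it whatever notification the App fires at the end of each stage; the `now` parameter exists only so the clock can be replaced in tests.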

In my thinking, I see potential future improvements like not firing layout until Stage2 injects all the code into HTML, deprioritizing I/O in Stage 1 or conservative reflow after Stage 4, but that was rather in the "future optimizations" category.

I'll cross-post to dev-b2g, but I'll need your help to make a point on what we want to get from Gecko.

Thanks for help and feedback!
zb.

Eli Perelman

Apr 15, 2014, 11:54:48 AM

to Zbigniew Braniecki, mozilla-...@lists.mozilla.org

zb,

This is definitely a great start towards this idea and something we are currently discussing on the Perf team. On the Responsiveness part of our wiki [1], we have begun to outline what some of these interaction states look like, and we currently have it detailed to 4 states. These were relevant for us to be able to effectively measure application cold-launch time. Below that, you can see we have guidelines for when application states should strive for timing of responsiveness within a certain window. This is based on research done by our UX team.

I agree that some of the events need to be async in order to effectively load a dynamic application, but some of the states will rely on synchronous events to realistically determine actual loading times. Things like mozbrowserloadend and its counterpart for DOMContentLoaded can be used to set some of these benchmarks, provided we still write the application bootstrapping in a manner that smartly loads necessary items without trying to subvert the numbers.

[1] https://wiki.mozilla.org/B2G/Performance/Responsiveness

Thanks,

Eli Perelman

Software Engineer, Firefox OS - Performance


Zibi Braniecki

Apr 15, 2014, 2:36:02 PM

to mozilla-...@lists.mozilla.org

Hi Eli,

On Tuesday, April 15, 2014 8:54:48 AM UTC-7, Eli Perelman wrote:
> This is definitely a great start towards this idea and something we are currently discussing on the Perf team. On the Responsiveness part of our wiki [1], we have begun to outline what some of these interaction states look like, and we currently have it detailed to 4 states. These were relevant for us to be able to effectively measure application cold-launch time. Below that, you can see we have guidelines for when application states should strive for timing of responsiveness within a certain window. This is based on research done by our UX team.

Awesome! I see a lot of overlap and it feels like your stages fit perfectly into what I suggested:

"Stage 2 Complete" becomes "App Chrome Visible"
"Stage 3 Complete" becomes "App Interaction Ready"
"Stage 4 Complete" becomes "App Content Visible"
"Stage 5 Complete" becomes "App Content Ready"

The differences that I think are worth discussing before we them out from my version:

1) It seems that you prioritize "App Content Visible" over "App Interaction Ready". In my experience with UX the interactive UI should be prioritized over visible content loading.

For example: loading the visible portion of messages in SMS app should happen after the "New Message" button becomes interactive.

Do you have a strong preference to prioritize it the way you do?

2) The "App Content Ready" stage indicates that it's completed when the remaining content is loaded. It may be a vocabulary issue, but since the screen is "stable" after "App Content Visible", I'm worried about the term "content" being used by app authors to indicate just loading content (sms messages, emails) and not all data required for app operations (filter lists in email, awesomebar data for browser etc.)

What is the goal of this stage in your schema? What do you want to measure?
My goal is to explicitly measure the memory usage of a fully loaded and operational app, thus I prefer using the word "data".

Does that description match your goals?

===

The remaining differences are mostly important for keeping the App bootstrapping logic free of race conditions (the ability to block firstPaint at the "App Chrome Visible" stage is crucial for DOM operations like runtime l10n), and Stage 1 may have limited usage right now.

Thanks!
zb.

Zibi Braniecki

Apr 15, 2014, 3:08:39 PM

to mozilla-...@lists.mozilla.org

As a follow up, it seems that many people were already thinking about parts of the equation here. Let's collect some older ideas that may fit into the stage bootstrap proposal:

1) http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2013-April/208783.html - API to delay onload. It's not the same as blocking firstPaint, but may be useful. Sounds similar to Stage 3 Complete.

2) https://bugzilla.mozilla.org/show_bug.cgi?id=863499 - Signal for "app load" - that feels closest to Stage 2 Complete since it's supposed to delay displaying the app until the custom event is fired.

3) https://groups.google.com/forum/#!msg/mozilla.dev.gaia/oPqmUS9xvJQ/GrOLMEAXUmwJ - "App rendered" event for better screenshots/ux & perf - this is a more reactive approach to the Stage 4 Complete event.

If you see other threads, add them. Also, seems like a lot of people are interested in this, so if you know lists or people who may want to chime in, let them know! :)

zb.

Gordon Brander

Apr 15, 2014, 4:00:28 PM

to Zbigniew Braniecki, mozilla-...@lists.mozilla.org

On 14 Apr 2014, at 16:07, Zbigniew Braniecki wrote:
> Preventing FOUC's, flickering, on-screen layout composition, both for
> apps and for the System itself (think - runtime localization) requires
> a lot of trickery and suboptimal heuristics and still depends on
> unreliable behavior like I/O or CPU speed.

This is awesome Gandalf! We take a huge (and unfair) hit in
perception-of-quality because of on-screen layouting and jank. Finding a
solution to this problem would be tremendous.

In tandem with whatever APIs we might introduce, we could also tackle
this problem from the platform side, fading-in or tweening elements as
we render them (see https://mozilla.box.com/s/8odusqia8p0jgrrcjkdq @
00:00:20 for an example). This could improve the experience for web
content across the board.

---
Gordon Brander
Sr UX Engineer
Firefox OS, Mozilla