Summary of e10s performance (Talos + Telemetry + crash-stats)

Vladan Djeric

Jul 10, 2015, 4:00:13 PM
to dev-platform
A few of us on the perf team (+ Joel Maher) looked at e10s performance &
stability using Talos, Telemetry, and crash-stats. I wrote up the
conclusions below.

Notable improvements in Talos tests [1]:

* Hot startup time in Talos improved by about 50% across all platforms
(ts_paint [2]). This test measures the time from Firefox launch until a
Firefox window is first painted; I/O read costs are not accounted for,
as data is already cached in the OS disk buffer before the test.
* The tsvgr_opacity test improved 50-80% across all platforms. This is a
sign of reduced page-loading overhead rather than an improvement in
actual SVG performance.
* Linux scrolling performance improved 5-15%
* The long-standing e10s WebGL performance regression has been fixed
* SVG rendering performance (tsvgx) is ~25% better on Windows 7 & 8, but it
is 10% worse on Windows XP and 25% worse on Linux

Notable regressions in Talos tests [1]:

* There are several large regressions unique to Windows XP: scrolling
smoothness regressed significantly (5-6 times worse on tp5o_scroll and
tscrollx [2]), resizing of Firefox windows is 150% worse (tresize), and SVG
rendering performance is 25% worse (tsvgx)
* Page loading time regressed across all platforms (tp5o). Linux regressed
~30%, OS X 10.10 regressed 20%, WinXP/Win8/Win7 all regressed ~10%.
Page-loading with accessibility enabled (a11yr) saw similar regressions.
* Time to open a new Firefox window (tpaint) regressed 30% on Linux and by
less than 10% across the different versions of Windows
* Resizing of Firefox windows (tresize) is ~15% worse on Linux
* Note: not all tests are compatible with e10s yet (e.g. session-restore
performance test) so this list isn't complete

Notable improvements from Telemetry data [3]:

* Overall tab animation smoothness improved significantly: 50% vs 30% of
tab animation frames are hitting the target 16ms inter-frame interval. See
FX_TAB_ANIM_* graphs in [3] to see the distribution of frame intervals.
Note that not all tab animations benefited equally.
* e10s significantly decreased jank caused by GC & CC, both in parent &
content processes (GC_MAX_PAUSE_MS, GC_SCC_SWEEP_MAX_PAUSE_MS,
CYCLE_COLLECTOR_MAX_PAUSE, etc [3])
* Unlike Talos, Telemetry suggests that the time to open a new Firefox
window improved with e10s (FX_NEW_WINDOW_MS)
* Median time to restore a saved session improved by 40ms or 20%
("simpleMeasurements/sessionRestored")
* Median shutdown duration improved by 120ms or 10%
("simpleMeasurements/shutdownDuration")
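
As a quick sanity check on the two medians above, the absolute and relative deltas together imply a baseline. This is a minimal sketch that assumes each percentage is relative to the non-e10s median, which is not stated explicitly; the helper name is made up for illustration.

```python
def implied_baseline_ms(delta_ms, percent):
    """Baseline median implied by an absolute delta and a relative delta.

    Assumes `percent` is expressed relative to the baseline (non-e10s) median.
    """
    if percent <= 0:
        raise ValueError("percent must be positive")
    return delta_ms * 100 / percent


# Deltas quoted in the list above.
session_restore_baseline = implied_baseline_ms(40, 20)   # sessionRestored
shutdown_baseline = implied_baseline_ms(120, 10)         # shutdownDuration

print(session_restore_baseline, shutdown_baseline)
```

Under that assumption the implied medians are roughly 200ms for session restore and roughly 1.2s for shutdown.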

Notable regressions from Telemetry data [3]:

* Unlike Talos, Telemetry numbers imply that the median real-world startup
time, measured as time to first-paint, regressed by 550ms or 20% with e10s
("simpleMeasurements/firstPaint")
* The frequency of jank events lasting more than 100ms increased from ~19
events/min to ~21 events/minute with e10s. This was derived from the
main-thread's event processing times and session uptime
("gecko_hangs_per_minute")
* Similarly, the frequency of the slow-script dialog appearing seems to have
roughly doubled with e10s ("histograms/SLOW_SCRIPT_NOTICE_COUNT")
* A side-note: interpreting Telemetry data is trickier than Talos data,
because Telemetry measurements aren't gathered from a controlled
environment, there are confounding variables, opt-in bias, etc. An
additional challenge with e10s Telemetry is that many measurements haven't
yet been re-validated to confirm that they measure the same things in e10s
and non-e10s.
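
To make the jank metric above concrete, here is a rough sketch of how a hangs-per-minute rate could be computed from main-thread event processing times and session uptime. It is not the code from the analysis notebook [3]; the function name, the raw list of durations, and the sample numbers are all assumptions made for illustration.

```python
def hangs_per_minute(event_durations_ms, uptime_minutes, threshold_ms=100):
    """Count events that took longer than threshold_ms, per minute of uptime.

    event_durations_ms: hypothetical list of main-thread event processing
    times for one session, in milliseconds.
    """
    if uptime_minutes <= 0:
        raise ValueError("uptime_minutes must be positive")
    hangs = sum(1 for d in event_durations_ms if d > threshold_ms)
    return hangs / uptime_minutes


# Made-up session: three events exceed 100ms over two minutes of uptime.
durations = [12, 250, 101, 100, 16, 480]
rate = hangs_per_minute(durations, uptime_minutes=2)
print(rate)
```

An event of exactly 100ms is not counted here, matching the "more than 100ms" wording above.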

Notable stability improvements [4]:

* E10S Firefox can survive crashes in content-process code, so it's no
surprise that the E10S parent process crash rate is a quarter of the
single-process crash rate (based on crash-stats from Nightly 42 on Windows
[4])
* The total number of E10S crashes of any type (content crash or parent
crash) is roughly the same as without E10S
* Oddly enough, E10S seems to win on plugin crash rates as well! [4]
* There seem to be no regressions in crash rate compared to single-process

References:

1. Joel Maher used compare-talos to compare the Talos scores of an m-c
revision (Nightly 42) in e10s & non-e10s configurations:
https://bugzilla.mozilla.org/show_bug.cgi?id=1144120#c5
Data in friendlier chart form:
https://drive.google.com/open?id=1qfkcoE5_25GtZDa-pIlqFMw6pLueplOsP1YhQb8UcC8

Talos data was gathered from a Firefox 42 m-c build aad95360a002 from June
29th
2. Talos test descriptions https://wiki.mozilla.org/Buildbot/Talos/Tests
3. Roberto Vitillo compared Telemetry from 150,000 Nightly sessions
submitted on June 15th with buildIDs in the range [20150601, 20150616]:
http://nbviewer.ipython.org/urls/gist.githubusercontent.com/vitillo/cb6f1304316c1c1a2cbc/raw/e10s%20analysis.ipynb
~90% of the sessions were from e10s clients, so interpret the e10s &
non-e10s populations with a grain of salt. The numbers were not broken down
by OS. I did not comment on any findings where the delta had more than 0.10
probability of being caused by chance.
4. Crash-stats comparisons from Firefox 42 crashes collected from July 2 to
July 9:
https://docs.google.com/document/d/1xw6pLbkzeh0jxaa2LVn_wLGMUc6CK09i7NXQLg5pp50/edit
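
For readers unfamiliar with the 0.10 cutoff mentioned in [3], one common way to estimate the probability that a difference in medians arose by chance is a permutation test. The sketch below is only an illustration of that idea under assumed inputs; the test actually used in the notebook is not described in this summary, and the function name and sample data are hypothetical.

```python
import random
from statistics import median


def permutation_p_value(a, b, n_resamples=2000, seed=0):
    """Estimate how often a random relabeling of the pooled samples gives a
    median difference at least as large as the observed one."""
    rng = random.Random(seed)
    observed = abs(median(a) - median(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        delta = abs(median(pooled[:len(a)]) - median(pooled[len(a):]))
        if delta >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Made-up measurements in ms for two populations.
e10s = [200 + i for i in range(30)]
non_e10s = [100 + i for i in range(30)]
p = permutation_p_value(e10s, non_e10s)
print(p < 0.10)
```

With a rule like the one in [3], a finding would be reported only when the estimated probability is at or below 0.10.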

Mike Hommey

Jul 10, 2015, 5:34:14 PM
to Vladan Djeric, dev-platform
Wait. What? Median shutdown duration is 1.2s ?!?

Vladan Djeric

Jul 10, 2015, 5:45:17 PM
to Mike Hommey, dev-platform
Yup, the median shutdown duration for Release 39 users on Windows with
Telemetry is 2.3 seconds for example: http://mzl.la/1HSHiD8
Those are also the kinds of shutdown times I see on my Windows machines
when I have 3-5 windows open with 5-10 tabs each.

What is your experience?
Btw, you can go to about:telemetry and look through your archived Telemetry
pings to see a history of your own shutdownDurations. Open about:telemetry,
select "Archived ping data", open the "Simple Measurements" section, and
use the next-previous arrows to look through your Telemetry submissions.
Focus on the "saved-session" pings.

Benoit Girard

Jul 15, 2015, 6:57:38 PM
to Vladan Djeric, Mike Hommey, dev-platform
For the e10s talos regressions see
https://bugzilla.mozilla.org/show_bug.cgi?id=1174776 and
https://bugzilla.mozilla.org/show_bug.cgi?id=1184277. We've already
diagnosed one source of the regression to be a difference in GC/CC
behavior when running e10s talos.