PSA: potential new source of intermittent test failures - and how to work around it

Jonathan Kew

Jan 5, 2021, 2:32:49 PM
to dev-pl...@lists.mozilla.org
Do you write Gecko/Firefox patches or testcases, or monitor the Mozilla
trees?

If so, you may run into new intermittent test failures due to a recent
(intentional) behavior change.

On 2020-12-31, a patch landed in
https://bugzilla.mozilla.org/show_bug.cgi?id=1676966 that changed how
font fallback is handled. Previously, if the fonts specified in a page's
CSS or through the browser prefs did not support a character present in
the text, Gecko would potentially search all installed fonts to try to
find one that could render the character. The first time this happens,
it can be quite expensive, as it involves loading data from every
installed font file, of which there may be thousands. Result: unpleasant
jank.

To avoid this performance issue, we no longer block layout on the
exhaustive search of all the fonts; instead, we start a background task
to load the required character mappings from all the fonts, but proceed
with layout using whatever fallbacks we may find, or just missing-glyph
boxes. Once the font data is all loaded, we trigger a reflow everywhere
so that content will be refreshed using the proper fonts.
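
To make the sequence above concrete, here is a toy model in Python. It
is only a sketch and not Gecko's actual code: the class ToyFallback, its
methods, and the font name "Toy Symbols" are all made up for
illustration. Layout never blocks; the first miss starts a one-time
background load, and a reflow callback fires when that load finishes.

```python
import threading


class ToyFallback:
    """Toy model of non-blocking font fallback (hypothetical, not Gecko code)."""

    def __init__(self, installed_fonts, on_reflow):
        self._installed = dict(installed_fonts)  # font name -> supported chars
        self._loaded = {}                        # character maps loaded so far
        self._on_reflow = on_reflow              # called once loading is done
        self._lock = threading.Lock()
        self.loader = None                       # background task, at most one

    def layout(self, text):
        """Pick a font per character without blocking; None = missing glyph."""
        start_loader = False
        with self._lock:
            choice = [self._lookup(ch) for ch in text]
            # First miss in this "session": schedule the global load once.
            if None in choice and self.loader is None:
                self.loader = threading.Thread(target=self._load_all)
                start_loader = True
        if start_loader:
            self.loader.start()
        return choice

    def _lookup(self, ch):
        for name, chars in self._loaded.items():
            if ch in chars:
                return name
        return None

    def _load_all(self):
        # Stand-in for reading character mappings from every font file.
        for name, chars in self._installed.items():
            with self._lock:
                self._loaded[name] = chars
        self._on_reflow()  # the "extra" reflow that tests may observe


reflows = []
fb = ToyFallback({"Toy Symbols": {"\u2603"}}, lambda: reflows.append("reflow"))
first = fb.layout("\u2603")    # laid out before any font data is loaded
fb.loader.join()               # a real test cannot wait like this
second = fb.layout("\u2603")   # laid out again after the load completed
```

In this model `first` holds a missing glyph and `second` holds the
fallback font, which is the difference a reftest snapshot can trip over
if it is taken between the two.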

Why does this matter for tests? It may result in two main types of
failure in tests that are otherwise fine:

(1) If the test includes content -- such as text in a lesser-used
Unicode script or unusual symbols -- that depends on font fallback, it
may render with a different fallback font or not render at all during
the initial pageload/reflow, if all the necessary font data has not yet
been loaded. The rendering will be automatically corrected once the
async font loading completes, but if the reftest harness has already
taken a snapshot by that time, it may be too late, and the test fails.

(2) If async font data loading was triggered by the testcase, or by one
shortly preceding it, an "unexpected" extra reflow event will happen
when the loading completes. This can interfere with tests that are
specifically concerned with event handling and expect a
precisely-defined pattern of behavior, or are watching things like frame
dimensions for changes.

Because the font fallback behavior is asynchronous (and the actual work
happens in the parent process, while your testcase is usually running
independently in a content process), the timing of all this cannot be
accurately predicted, and failures may be intermittent.

(Note also that this async behavior only happens once per browser
session, the first time content triggers a global font search. This
means that which testcases are affected may depend on the chunking of
test suites, and could change over time.)

If you have tests that are impacted by this, you can disable the async
behavior for them by setting the pref
'gfx.font_rendering.fallback.async' to false via a test manifest
annotation or similar metadata. This reverts to the previous behavior,
where global font fallback, if needed, blocks layout.
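
As a rough illustration, a reftest manifest entry with such an
annotation might look like the fragment below. The file names are
placeholders, and this assumes the usual pref(name,value) annotation in
reftest.list files; other harnesses (mochitest manifests, web-platform
test metadata) have their own syntax for setting prefs, so check the
documentation for the suite in question.

```
# Hypothetical reftest.list entry; test file names are placeholders.
pref(gfx.font_rendering.fallback.async,false) == fallback-test.html fallback-test-ref.html
```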

We could simply run all tests with the pref set to false, to avoid these
issues, but I'd prefer not to do that as we then wouldn't be testing the
configuration we ship to users. So let's try to handle this by
selectively disabling the new behavior only in cases where we see it
causing actual problems. Thanks!

JK
