Attempting to access missing chrome: or resource: files will turn tests orange

Gijs Kruitbosch

unread,

Aug 2, 2021, 3:33:40 PM8/2/21

to dev-platform, firef...@mozilla.org

Hello folks,

Somewhat recently we had to beta-uplift fixes for missing images in the browser UI. The image references were constructed dynamically, and the underlying images had been removed, unaware of the consumer that was doing this (and, because the images' file names were not literally mentioned, searchfox and review didn't catch this).

To avoid having to catch issues like this manually in future, I'm in the process of landing a patch that will MOZ_CRASH in automated tests when chrome:// or resource:// URIs are requested that don't resolve (ie where the underlying jar/file channels return NS_ERROR_FILE_NOT_FOUND or similar). On debug builds, it'll also print a JS stack before crashing. When not running automated tests, the patch will print a warning to stderr. [1]

Writing this patch found a few other pre-existing dead references already, so I'm pretty happy that this is a worthwhile exercise.

There are exceptions in the patch for mochitest and reftest files, as well as some pre-existing issues where I didn't want to block landing on a resolution.

There is also an exception for `fetch()` calls, as there are a number of consumers who use them and expect them to fail under some circumstances - though it might be better to update those as well, longer-term.

If you run afoul of these checks with requests where you expect missing files in some circumstances, consider if you can avoid making the requests at all in such cases.

Finally, the overwhelming majority of issues I ran into were restricted to android/geckoview. It may be worth reiterating that:

- Gecko and toolkit/ code should not generally assume it runs only on desktop Firefox

- as a corollary, you should not ChromeUtils.import or otherwise rely on scripts in either `chrome://browser/` or `resource://app/` from toolkit code. The same thing goes for images and CSS, of course.

Happy hacking,

Gijs

[1] Being pedantic: because this code relies on channel inputstreams being created, it will not catch or fail on requests to chrome:// packages that aren't even registered (e.g. chrome://horse-battery-staple/content/), as we give up well before trying to actually open the channel in such cases. If someone were interested, the code could be extended to deal with those cases.

--
You received this message because you are subscribed to the Google Groups "firef...@mozilla.org" group.
To unsubscribe from this group and stop receiving emails from it, send an email to firefox-dev...@mozilla.org.
To view this discussion on the web visit https://groups.google.com/a/mozilla.org/d/msgid/firefox-dev/CACWru5YHu38vsQUuvAkpyP7y5WdC66wfTcm6fDcRz61G--Q-ag%40mail.gmail.com.

Marco Castelluccio

unread,

Aug 5, 2021, 5:46:01 AM8/5/21

to dev-pl...@mozilla.org, Gijs Kruitbosch, firef...@mozilla.org

Hello Gijs,
I think these kinds of dynamic checks are super useful, thanks for doing this!

I wonder if we can do something similar to uncover other bad behaviors that can be detected at runtime. It would be nice to build some kind of framework to make it easy to build these kind of checks, a bit like we have mozlint for linters.

For example, detecting fishy behaviors like addition of an undefined with a string. https://github.com/Berkeley-Correctness-Group/DLint could be a starting point for a list of potential analyses.

Thanks!
- Marco.

To view this discussion on the web visit https://groups.google.com/a/mozilla.org/d/msgid/firefox-dev/97981d51-b99f-4511-8e34-af6e43401a58n%40mozilla.org.

Tom Ritter

unread,

Aug 5, 2021, 11:18:46 AM8/5/21

to Marco Castelluccio, dev-pl...@mozilla.org, Gijs Kruitbosch, firef...@mozilla.org

Over in hardening land we have been making (slow, incremental) progress on some of these types of things

A) Gijs prevented docshells from loading remote pages in the parent process a while ago. The mechanism for landing that was simply test it, land it, fix a couple regressions.

B) Christoph and I were eventually able to disable eval() in the System Principal Context and Parent Process. This was a much slower rollout - we collected Event Telemetry on all channels, carved out exceptions as needed, and refactored a few instances. Then we slowly rolled it out in enforcement mode.

C & D) Freddy and I have been working on disabling loads of files by the System Principal and loads of javascript files in the parent that aren't chrome/resource:// respectively. This is slower work, and has been using Event Telemetry also.

Event Telemetry has worked well for these efforts; we deploy it in reporting mode, and roll out reporting mode incrementally on the channels, then roll out enforcement mode. We're able to collect some detailed information (on Windows, thanks to reused platform code) that safely redacts information we don't want to collect while providing as much context as we can. That code could be refactored if others wanted to use it.

However we're also reaching the point where we haven't been able to find the last odd occurrences. There was a proposal to add a new type of Telemetry Probe that would allows us to submit stacktraces without a crash (like BHR) but that hasn't gotten a lot of traction. I'm planning on experimenting with a 'crash the browser at most once per user' to collect a stack trace on the JS File Loads. I don't love that idea, but because Event Telemetry shows relatively few people being affected by what I'm trying to find it's a trade-off. That could _also_ be used by others, but if enough people really wanted it, it would be better to invest in making the non-crashing stacktrace-reporting ping.

-tom

To view this discussion on the web visit https://groups.google.com/a/mozilla.org/d/msgid/firefox-dev/CADua4_vritYiqjvP6xkW2UqpayWHrggCn%2B8K%3D%3DTVDEBbAuiegg%40mail.gmail.com.

Marco Castelluccio

unread,

Aug 5, 2021, 11:53:48 AM8/5/21

to Tom Ritter, dev-pl...@mozilla.org, Gijs Kruitbosch, firef...@mozilla.org, Christian Holler, jst...@mozilla.com

Il 05/08/21 17:18, Tom Ritter ha scritto:

Over in hardening land we have been making (slow, incremental) progress on some of these types of things

A) Gijs prevented docshells from loading remote pages in the parent process a while ago. The mechanism for landing that was simply test it, land it, fix a couple regressions.

B) Christoph and I were eventually able to disable eval() in the System Principal Context and Parent Process. This was a much slower rollout - we collected Event Telemetry on all channels, carved out exceptions as needed, and refactored a few instances. Then we slowly rolled it out in enforcement mode.

C & D) Freddy and I have been working on disabling loads of files by the System Principal and loads of javascript files in the parent that aren't chrome/resource:// respectively. This is slower work, and has been using Event Telemetry also.

Event Telemetry has worked well for these efforts; we deploy it in reporting mode, and roll out reporting mode incrementally on the channels, then roll out enforcement mode. We're able to collect some detailed information (on Windows, thanks to reused platform code) that safely redacts information we don't want to collect while providing as much context as we can. That code could be refactored if others wanted to use it.

However we're also reaching the point where we haven't been able to find the last odd occurrences. There was a proposal to add a new type of Telemetry Probe that would allows us to submit stacktraces without a crash (like BHR) but that hasn't gotten a lot of traction. I'm planning on experimenting with a 'crash the browser at most once per user' to collect a stack trace on the JS File Loads. I don't love that idea, but because Event Telemetry shows relatively few people being affected by what I'm trying to find it's a trade-off. That could _also_ be used by others, but if enough people really wanted it, it would be better to invest in making the non-crashing stacktrace-reporting ping.

It looks like this kind of need comes up regularly lately!

Maybe you're already familiar with these, but here's some recent effort from the DOM team in this area, https://bugzilla.mozilla.org/show_bug.cgi?id=1700915, with a summary of their findings at https://docs.google.com/document/d/1KPNr-NHa_1yQYO4WFgbpNLNR4ZXEYxNKDsjMe0KZrEU/edit.

And here's a relevant proposal about "Diagnostic Reports", https://docs.google.com/document/d/1MNGRD0l6wpcM41aObScx--9gNq2X7FF4K9OH6arLoRc/edit.

- Marco.

To view this discussion on the web visit https://groups.google.com/a/mozilla.org/d/msgid/firefox-dev/9f0c431d-9f53-befe-5956-9ab2c3ada14d%40mozilla.com.

Reply all

Reply to author

Forward