Web-facing PSA: Fetching external resource only if type is supported

167 views
Skip to first unread message

Suzy Howlett

unread,
Dec 16, 2015, 10:46:17 PM12/16/15
to blink-dev
tl;dr: I am proposing to change Blink to only fetch external resources if the type attribute of the link element is unspecified or is text/css, in order to match the spec and behaviour of other browsers.

Background and motivation
The spec http://www.w3.org/TR/html5/document-metadata.html#the-link-element specifies that "If the UA does not support the given MIME type for the given link relationship, then the UA should not obtain the resource". Blink does not comply with this; see http://crbug.com/286682.

- Blink and WebKit load the resource, so Chrome and Safari show the text as red
- Firefox, IE and Edge all show the text as green, indicating they are not fetching the resource

The change
In https://codereview.chromium.org/1501393003/ I am proposing to change this so that Blink only fetches the resource if the type is unspecified or is "text/css".

This is expected to be a minor change that improves interop, but please let me know if you have any concerns.

Suzy

Dimitri Glazkov

unread,
Dec 16, 2015, 11:22:19 PM12/16/15
to Suzy Howlett, blink-dev
This should really be an intent to implement and ship, right?

:DG<

Suzy Howlett

unread,
Dec 17, 2015, 9:34:50 PM12/17/15
to Dimitri Glazkov, blink-dev
Hi Dimitri,

There was some discussion of this point on the review for my patch. I sent this email on Rick's advice:

On Thu, Dec 10, 2015 at 8:26 AM, <rby...@chromium.org> wrote:
Big picture as long as we have reason to believe the risk is low (and we're
prepared to revert if we see breakage that suggests otherwise), I'd personally
be fine considering this a "trivial" change (almost every bug-fix we do has the
potential to be a breaking web exposed change).  A "web-facing PSA" to blink-dev
wouldn't hurt though, might flesh out reasons to be more concerned about the
compat risk.

Dimitri Glazkov

unread,
Dec 18, 2015, 12:12:42 AM12/18/15
to Suzy Howlett, blink-dev
Alrighty.

:DG<

Suzy Howlett

unread,
Jan 13, 2016, 10:36:55 PM1/13/16
to blink-dev, yo...@yoav.ws, rby...@chromium.org, igri...@chromium.org
To follow up on this PSA, I have managed to pull some stats from HTTPArchive to quantify the impact of this change. According to these (approximate) calculations, less than 0.04% of <link rel=stylesheet>s are not of type text/css. This suggests to me that the proposed change will be low-impact. Further details are below.

Suzy

-----

The BigQuery query
SELECT
  stylesheetType,
  count(stylesheetType) AS typeCount
FROM (
  SELECT
    REPLACE(LOWER(REGEXP_EXTRACT(stylesheetLink, r'\s(type=[^\s>]+)')), '"', "'") AS stylesheetType
  FROM (
    SELECT
      REGEXP_EXTRACT(content, r'^(link[^>]+rel=[^\s>]*stylesheet[^\s>]*[^>]*>)') AS stylesheetLink
    FROM (
      SELECT
        JSON_EXTRACT(payload, '$.request.url') AS url,
        JSON_EXTRACT(payload, '$._contentType') AS contentType,
        SPLIT(JSON_EXTRACT_SCALAR(payload, '$.response.content.text'), '<') AS content,
        INTEGER(JSON_EXTRACT(payload, '$.response.content.size')) as contentSize,
        JSON_EXTRACT(payload, '$._body') AS body,
      FROM [httparchive:har.desktop_oct_1_2015_requests]
    )
    WHERE
      contentType = '"text/html"'
      AND body = 'true'
  )
  WHERE
    stylesheetLink IS NOT NULL
)
GROUP BY stylesheetType
ORDER BY typeCount DESC

The results
stylesheetTypetypeCount
type='text/css'2606908
type='text/css'/52328
type=text/css2067
type=\'text/css\'818
type=\'text\/css\'361
type='text/less'73
type='text/kss'70
type='text\/css'55
type='text/css''35
type=&#039;text/css&#039;35
type='text/css'--24
type='text/javascript'17
type='text/chrome'17
type='text/nonsense'14
type='text/menu-css'14
type='text/safari'11
and a long tail of others, a total of 285 rows, with sum(typeCount) == 2663347. Those top 5 rows account for over 99.96% of that total.

I think my favourite is "type='text/nonsense'".

Rick Byers

unread,
Jan 14, 2016, 3:21:44 PM1/14/16
to Suzy Howlett, blink-dev, Yoav Weiss, igri...@chromium.org
0.04% of page loads is still a LOT of page loads.  Can you pick a random sampling (eg. 6) of the sites you get here and see how exactly they're impacted by this change?  My intuition is that most of these 0.04% cases won't actually be noticeably affected, but that might be wrong.  If there is noticeable UI impact in a large fraction, it might be helpful to understand if / how a couple of those sites are working in IE and Firefox.  Worst case and even half of the cases you check have noticable UI regressions with this change (but somehow work fine in IE/Firefox) then I think we'd want to reconsider - that's a pretty big compat impact.

Rick

Suzy Howlett

unread,
Jan 14, 2016, 11:40:14 PM1/14/16
to Rick Byers, blink-dev, Yoav Weiss, igri...@chromium.org
Hi Rick,

I've looked into it a little more. The 0.04% figure was 0.04% of <link rel=stylesheet>s, which is not quite the same thing as 0.04% of page loads. So, how many pages is that? Of the 2.6 million <link rel=stylesheet>s above, there were ~850 that didn't contain the string "text/css". (These are the ones that would not be fetched after my patch.) About half of those contained "text\/css". It was mostly one per page, so we're looking at 400-450 pages out of the (I believe) 1 million in HTTPArchive that are potentially affected. So ... I guess that's about 0.04% of pages too. But that's still not page loads ^.^

I had a look at maybe 10 of the URLs that this brought up. For some of them, the page source no longer contained one of these funny <link>s. The rest I opened in Chrome stable and in my patched build, and didn't spot any differences.

Simon Pieters

unread,
Jan 15, 2016, 6:53:22 AM1/15/16
to Rick Byers, 'Suzy Howlett' via blink-dev, Suzy Howlett, Yoav Weiss, igri...@chromium.org
On Fri, 15 Jan 2016 05:40:11 +0100, 'Suzy Howlett' via blink-dev
<blin...@chromium.org> wrote:

> Hi Rick,
>
> I've looked into it a little more. The 0.04% figure was 0.04% of <link
> rel=stylesheet>s, which is not quite the same thing as 0.04% of page
> loads.
> So, how many pages is that? Of the 2.6 million <link rel=stylesheet>s
> above, there were ~850 that didn't contain the string "text/css". (These
> are the ones that would not be fetched after my patch.) About half of
> those
> contained "text\/css". It was mostly one per page, so we're looking at
> 400-450 pages out of the (I believe) 1 million in HTTPArchive that are
> potentially affected. So ... I guess that's about 0.04% of pages too. But
> that's still not page loads ^.^
>
> I had a look at maybe 10 of the URLs that this brought up. For some of
> them, the page source no longer contained one of these funny <link>s. The
> rest I opened in Chrome stable and in my patched build, and didn't spot
> any
> differences.

Maybe the interesting ones are those that say text/chrome (17) or
text/safari (11), since those may be deliberate WebKit-only stylesheets.

As an aside, httparchive has rank information in its *_pages tables, so
you could sort your results by page rank. ("One JS framework is not
enough!" in
https://www.igvita.com/2013/06/20/http-archive-bigquery-web-performance-answers/
has an example that does this.)

--
Simon Pieters
Opera Software

Rick Byers

unread,
Jan 15, 2016, 2:30:34 PM1/15/16
to Simon Pieters, 'Suzy Howlett' via blink-dev, Suzy Howlett, Yoav Weiss, igri...@chromium.org
Thanks for the additional details Suzy.  I agree that looking at a couple of the text/chrome or text/safari examples is probably a good idea.

Since there's probably at least a couple sites out there depending on this as a hack to get a WebKit-specific stylesheet, how would you feel about adding a deprecation warning (call to UseCounter::countDeprecation) for one release?  If you landed that next week, I suspect you could get it merged into M49, then immediately remove the support entirely for M50.

By the way I realized I never replied publicly to the discussion of whether this should be an "intent to ship".  My position was that this case falls in the grey area between "interop-affecting bug" and web-exposed API change, and that we don't want to imply that all observable bug fixes need an 'intent'.  The rule of thumb I proposed, and Dimitri agreed with, is that if a change is something that you'd expect we'd update MDN docs for (and so reasonably expect developers to need to think about the behavior), then it needs an intent.  When it's just fixing a bug (even one with potential small compat impact), then it doesn't necessarily (although we of course always take compat and interop very seriously in any such change).

Rick

Yoav Weiss

unread,
Jan 16, 2016, 4:49:24 PM1/16/16
to Rick Byers, Simon Pieters, 'Suzy Howlett' via blink-dev, Suzy Howlett, Ilya Grigorik
'text/less' would also be interesting to look at. There's a good chance that these would be preprocessed from less to css in JS, which means that not downloading these styles is likely to break the sites in question. It'd also be interesting to see what happens with those site in Firefox.

Suzy Howlett

unread,
Jan 18, 2016, 12:16:52 AM1/18/16
to Yoav Weiss, Rick Byers, Simon Pieters, 'Suzy Howlett' via blink-dev, Ilya Grigorik
I've had a look at several more pages with text/chrome and text/safari stylesheets (and one or two text/less ones) and have seen very little changing. Typically it's very small changes in px or other length values (and it's not clear why). As a user, I don't even notice the difference.

The biggest change I've managed to find is http://www.betterchinese.com/. I compared this page on Chrome 47 and Firefox 43 on my Linux machine, as well as the browsers at https://dev.windows.com/en-us/microsoft-edge/tools/screenshots/#http://www.betterchinese.com/. From the latter, in particular see IE 11Chrome 42 and Firefox 37 all on Win 8.1.

Looking at the top menu (preschool, elementary, ...) and the text below "Customer Favorites" ("My First Chinese Words..."), I found:
- Chrome 47 on my machine and IE 11 on the comparison site: serif font
- Chrome 42 on the comparison site: sans serif font, similar font size to the above
- Firefox 43 on my machine and 37 on the comparison site, as well as my patched content_shell: sans serif font, larger font size in the "Customer Favorites" section.

The difference in font face between Chrome 42 on Win and Chrome 47 on Linux seems to be because the site is looking for Tahoma, a Windows-specific font that's not on my machine. Since there's so much variability between browsers and platforms, it's not clear what the intended result is.

I think we should go with Rick's suggestion of getting a deprecation warning into M49 and then get my patch (after I've added the extra requested tests) into M50. How does that sound?

Yoav Weiss

unread,
Jan 18, 2016, 3:27:05 AM1/18/16
to Suzy Howlett, Rick Byers, Simon Pieters, 'Suzy Howlett' via blink-dev, Ilya Grigorik
sgtm. I'm also trying to get the word out there and see if people scream bloody murder.

Rick Byers

unread,
Jan 18, 2016, 5:34:04 PM1/18/16
to Yoav Weiss, Suzy Howlett, Simon Pieters, 'Suzy Howlett' via blink-dev, Ilya Grigorik
Thanks again for the additional research Suzy!  I'm convinced this is low risk enough that we should try shipping it, and the deprecation message will definitely really help for the tiny edge cases.

Rick

Suzy Howlett

unread,
Jan 27, 2016, 5:24:57 PM1/27/16
to Rick Byers, Yoav Weiss, Simon Pieters, 'Suzy Howlett' via blink-dev, Ilya Grigorik
For the curious, here is the UseCounter info from the deprecation warning for this feature: https://www.chromestatus.com/metrics/feature/timeline/popularity/1122 ... 0.0001%!

The deprecation warning was merged into M49, and the removal patch has now landed.

Rick Byers

unread,
Jan 27, 2016, 5:52:27 PM1/27/16
to Suzy Howlett, Ilya Grigorik, Yoav Weiss, 'Suzy Howlett' via blink-dev, Simon Pieters

Excellent, thanks for the extra diligence here Suzy!  In retrospect it's easy to see it as overkill (now that we have the data), but we've made some mistakes recently in the other direction so erring on the side of caution is probably a good idea.

Keep your ears open for any developer complaints - it'll be interesting to hear of any real-world impact.  Some small impact will not be surprising, but I'm confident it will be manageable.

Rick

Reply all
Reply to author
Forward
0 new messages