Intent to Remove: Support for legacy protocols (`ftp:`) in subresource requests.

Mike West

Jan 23, 2017, 5:41:28 AM
to blink-dev, Chris Bentzel

Primary eng (and PM) emails

mk...@chromium.org


Link to “Intent to Deprecate” thread

https://groups.google.com/a/chromium.org/d/msg/blink-dev/bAMVnc1Zyvs/M3VnxbYPBC0J (which I neglected to follow up on for over 2 years... *sigh*)


Summary

I'd like to block requests from HTTP/HTTPS documents that target "legacy" schemes (e.g. "ftp://my-awesome-ftp-server.com/yay.tiff"). That is, the `ftp://` image referenced in https://jsbin.com/petonig/edit?html,output would not load, as the document itself is not served from `ftp://`.


Motivation

Removing FTP subresources from webby pages would enable us to extract FTP support out of Chrome, handling top-level FTP requests via some other mechanism (Chrome app, protocol handler, etc). Given the low usage, and the code we could remove, this seems like a sizable win.


Compatibility And Interoperability Risk

Other browsers generally support `ftp://` subresources. Given the usage, however, I expect we'd be able to agree on changing that.


Alternative implementation suggestion for web developers

Developers can embed resources from HTTP servers, rather than FTP servers.


Usage information from UseCounter

The deprecation warnings we added in 2014 worked (or: the internet doesn't use much FTP anymore)! Whatever the cause, legacy protocol usage in HTTP/HTTPS documents has dropped down to the point where it no longer registers on the usage graph: https://www.chromestatus.com/metrics/feature/timeline/popularity/531.


Looking at the raw numbers shows 0.0003% of page views over the last 28 days.


OWP launch tracking bug

https://crbug.com/435547


Entry on the feature dashboard

https://www.chromestatus.com/feature/5709390967472128


-mike

Jochen Eisinger

Jan 23, 2017, 5:44:26 AM
to Mike West, blink-dev, Chris Bentzel
Any idea what sites use this?

Anyway, with usage this low, LGTM1

Anne van Kesteren

Jan 23, 2017, 6:02:37 AM
to Mike West, blink-dev, Chris Bentzel
On Mon, Jan 23, 2017 at 11:41 AM, Mike West <mk...@chromium.org> wrote:
> Compatibility And Interoperability Risk
>
> Other browsers generally support `ftp://` subresources. Given the usage,
> however, I expect we'd be able to agree on changing that.

I guess that means tests and Fetch proposals are forthcoming?


--
https://annevankesteren.nl/

Mike West

Jan 23, 2017, 6:05:14 AM
to Anne van Kesteren, blink-dev, Chris Bentzel
Absolutely. I plan to file suggestions against Fetch for this and the next thread I'm about to start, and will upstream any/all tests I write for Blink's implementation to WPT.

-mike

Anne van Kesteren

Jan 23, 2017, 6:09:50 AM
to Mike West, blink-dev, Chris Bentzel
On Mon, Jan 23, 2017 at 12:04 PM, Mike West <mk...@google.com> wrote:
> Absolutely. I plan to file suggestions against Fetch for this and the next
> thread I'm about to start, and will upstream any/all tests I write for
> Blink's implementation to WPT.

Great, as a heads up, these days any normative Fetch changes require
WPT tests to have already landed (plus pointer in the Fetch PR). And
then once both are landed bugs need to be filed against all browsers.
This will be documented more clearly soonish.


--
https://annevankesteren.nl/

Chris Bentzel

Jan 23, 2017, 8:56:38 AM
to Anne van Kesteren, Mike West, net...@chromium.org, blink-dev
+net...@chromium.org to get more awareness.

I'm not an owner, but I'm fine with this. A while back we looked into whether it was possible to drop the FTP implementation from Chrome altogether, and this was one of the steps.

Can you get data about what fraction of users are impacted by this over the course of a week? When we looked at the HTTP/0.9 deprecation, a reasonably small percentage of requests was impacted, but the percentage of users impacted over the course of a week was above our comfort threshold.

Rick Byers

Jan 23, 2017, 11:49:12 AM
to Chris Bentzel, Anne van Kesteren, Mike West, net...@chromium.org, blink-dev
This data is different from the HTTP/0.9 deprecation in that here we're looking at % of pageviews which trigger such a load at least once.  We didn't have the data in that form for the HTTP/0.9 deprecation and so were relying on the harder-to-interpret %-of-requests.  Right?  0.0003% of page views sounds likely to be pretty low risk to me, but nothing is risk-free for sure...
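
To make the distinction between the two metrics concrete, here is a minimal sketch. The sample data and both helper names are invented for illustration and are not Chrome's UseCounter implementation; the point is only that one page view issuing several `ftp:` requests counts once under the page-view metric but several times under the request metric.

```python
# Hypothetical sample: each inner list holds the URL schemes of the
# subresource requests made during one page view. Not real Chrome data.
page_views = [
    ["https", "https", "https", "https"],
    ["http", "ftp", "ftp", "ftp"],
    ["https", "https"],
    ["https", "https", "https", "https", "https", "https"],
]


def fraction_of_page_views(views, scheme):
    """Share of page views that load `scheme` at least once."""
    hits = sum(1 for requests in views if scheme in requests)
    return hits / len(views)


def fraction_of_requests(views, scheme):
    """Share of all individual requests that use `scheme`."""
    total = sum(len(requests) for requests in views)
    hits = sum(requests.count(scheme) for requests in views)
    return hits / total


print(fraction_of_page_views(page_views, "ftp"))  # 1 of 4 page views
print(fraction_of_requests(page_views, "ftp"))    # 3 of 16 requests
```

Neither number says how many distinct users are affected over a week, which is the separate question Chris raised.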

What's the right relative ordering here? Should we wait to remove until the Fetch spec has changed, or at least has a PR against it?




Simon Pieters

Jan 23, 2017, 12:39:36 PM
to Chris Bentzel, Rick Byers, Anne van Kesteren, Mike West, net...@chromium.org, blink-dev
On Mon, 23 Jan 2017 17:48:45 +0100, Rick Byers <rby...@chromium.org> wrote:

> This data is different from the HTTP/0.9 deprecation in that here we're
> looking at % of pageviews which trigger such a load at least once. We
> didn't have the data in that form for the HTTP/0.9 deprecation and so were
> relying on the harder-to-interpret %-of-requests. Right? 0.0003% of page
> views sounds likely to be pretty low risk to me, but nothing is risk-free
> for sure...
>
> What's the right relative ordering is here. Should we wait to remove until
> the Fetch spec has changed, or at least has a PR against it?

I tried checking httparchive (494,956 pages), any elements with src or
href starting with ftp: or ftps: but ignoring <a href="ftp:"> (I assume
those will continue to work?):

```
SELECT * FROM (
SELECT page, REGEXP_EXTRACT(LOWER(body),
r'(<(?:[^a][a-z]+)(?:\s[^>]+)?\s(?:src|href)\s*=\s*["\']?ftps?:[^>]+>)')
AS match
FROM [httparchive:har.2017_01_01_chrome_requests_bodies]
)
WHERE match != "null"
```

3 results, as csv below:

page,match
http://www.taz.de/,"<node id=""4495"" ts=""1357240141"" name=""ftp"" href=""ftp://ftp.taz.de"" traverse=""true"">"
http://www.microline.hr/,<img src='ftp://ftp.microline.hr/mol/images/banner.png'/>
http://www.webbkameror.se/,"<img src=""ftp://194.117.166.250/fh/nygatan-kopmannagatan.jpg"" alt=""kalix"" width=""155"" height=""113"" border=""0"" class=""bild-border"">"
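
For readers who want to see what the pattern in the query above does and does not match, here is a rough sketch using Python's `re` module. BigQuery's regex engine is not the same as Python's, so this is only an approximation of the query's behavior; the helper name `first_ftp_element` and the `example.org` inputs are made up for illustration.

```python
import re

# Same pattern as in the BigQuery query above, as a Python raw string.
FTP_ATTR = re.compile(
    r'(<(?:[^a][a-z]+)(?:\s[^>]+)?\s(?:src|href)\s*=\s*["\']?ftps?:[^>]+>)'
)


def first_ftp_element(html):
    """Return the first matching element in the lowercased HTML, or None."""
    m = FTP_ATTR.search(html.lower())
    return m.group(1) if m else None


# An <img> with an ftp: src matches.
print(first_ftp_element("<img src='ftp://ftp.microline.hr/mol/images/banner.png'/>"))
# A plain <a href="ftp:..."> link is skipped, as intended.
print(first_ftp_element('<a href="ftp://ftp.example.org/">'))
# So is any other tag whose name starts with "a", such as <area>.
print(first_ftp_element('<area href="ftp://ftp.example.org/">'))
```

Because the tag-name filter only looks at the first letter, the pattern excludes every element whose name begins with "a", not just links, so the count of 3 may slightly undercount.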

---

For comparison, in httparchive, 293 matches for <a href=ftp:>.

---

In GitHub, img src=ftp (trying to exclude some test cases):

https://github.com/search?utf8=✓&q="img+src%3Dftp"+NOT+test+NOT+"invalid+src"&type=Code&ref=searchresults

"We’ve found 592 code results"

---

For comparison, href=ftp (HTML only):

https://github.com/search?l=HTML&q="href%3Dftp"+NOT+test&ref=searchresults&type=Code&utf8=✓

"We’ve found 451,683 code results"

--
Simon Pieters
Opera Software

Matt Menke

Jan 23, 2017, 12:43:34 PM
to Simon Pieters, Chris Bentzel, Rick Byers, Anne van Kesteren, Mike West, net...@chromium.org, blink-dev
On Mon, Jan 23, 2017 at 12:39 PM, Simon Pieters <sim...@opera.com> wrote:
On Mon, 23 Jan 2017 17:48:45 +0100, Rick Byers <rby...@chromium.org> wrote:

This data is different from the HTTP/0.9 deprecation in that here we're
looking at % of pageviews which trigger such a load at least once.  We
didn't have the data in that form for the HTTP/0.9 deprecation and so were
relying on the harder-to-interpret %-of-requests.  Right?  0.0003% of page
views sounds likely to be pretty low risk to me, but nothing is risk-free
for sure...

What's the right relative ordering is here. Should we wait to remove until
the Fetch spec has changed, or at least has a PR against it?

I tried checking httparchive (494,956 pages), any elements with src or href starting with ftp: or ftps: but ignoring <a href="ftp:"> (I assume those will continue to work?):

Not sure that's right.  Would we allow ftp URLs in iframes or not?  If not, then that wouldn't work when present in an iframe, unless opening the URL in a new tab.
 


Rick Byers

Jan 23, 2017, 1:47:53 PM
to Matt Menke, Simon Pieters, Chris Bentzel, Anne van Kesteren, Mike West, net...@chromium.org, blink-dev
On Mon, Jan 23, 2017 at 12:43 PM, Matt Menke <mme...@chromium.org> wrote:
On Mon, Jan 23, 2017 at 12:39 PM, Simon Pieters <sim...@opera.com> wrote:
On Mon, 23 Jan 2017 17:48:45 +0100, Rick Byers <rby...@chromium.org> wrote:

This data is different from the HTTP/0.9 deprecation in that here we're
looking at % of pageviews which trigger such a load at least once.  We
didn't have the data in that form for the HTTP/0.9 deprecation and so were
relying on the harder-to-interpret %-of-requests.  Right?  0.0003% of page
views sounds likely to be pretty low risk to me, but nothing is risk-free
for sure...

What's the right relative ordering is here. Should we wait to remove until
the Fetch spec has changed, or at least has a PR against it?

I tried checking httparchive (494,956 pages), any elements with src or href starting with ftp: or ftps: but ignoring <a href="ftp:"> (I assume those will continue to work?):

Not sure that's right.  Would we allow ftp URLs in iframes or not?  If not, then that wouldn't work when present in an iframe, unless opening the URL in a new tab.
 

```
SELECT * FROM (
SELECT page, REGEXP_EXTRACT(LOWER(body), r'(<(?:[^a][a-z]+)(?:\s[^>]+)?\s(?:src|href)\s*=\s*["\']?ftps?:[^>]+>)') AS match
FROM [httparchive:har.2017_01_01_chrome_requests_bodies]
)
WHERE match != "null"
```

3 results, as csv below:

Thanks for doing this analysis Simon!  I looked briefly into these 3 cases:
 
page,match
http://www.taz.de/,"<node id=""4495"" ts=""1357240141"" name=""ftp"" href=""ftp://ftp.taz.de"" traverse=""true"">"

Looks like a false positive - doesn't actually use FTP today as far as I can see.

http://www.microline.hr/,<img src='ftp://ftp.microline.hr/mol/images/banner.png'/>

Legitimate breakage - looks like a banner ad on the site.  Generates a mixed-content warning today (since the site redirects to https).

http://www.webbkameror.se/,"<img src=""ftp://194.117.166.250/fh/nygatan-kopmannagatan.jpg"" alt=""kalix"" width=""155"" height=""113"" border=""0"" class=""bild-border"">"

Legitimate breakage.  A site with a bunch of webcam links where one (of many) images is served via FTP.  I can imagine this as a pattern of possible breakage - old-school webcam setups where the software is designed only to upload images periodically to some FTP server.  There's no console warning here at all.  Mike, did your deprecation warning get removed at some point?  We should probably add it back for at least a milestone before removal.

Mike West

Jan 24, 2017, 6:27:36 AM
to Rick Byers, Matt Menke, Simon Pieters, Chris Bentzel, Anne van Kesteren, net...@chromium.org, blink-dev
On Mon, Jan 23, 2017 at 7:47 PM, Rick Byers <rby...@chromium.org> wrote:
On Mon, Jan 23, 2017 at 12:43 PM, Matt Menke <mme...@chromium.org> wrote:
On Mon, Jan 23, 2017 at 12:39 PM, Simon Pieters <sim...@opera.com> wrote:
On Mon, 23 Jan 2017 17:48:45 +0100, Rick Byers <rby...@chromium.org> wrote:

This data is different from the HTTP/0.9 deprecation in that here we're
looking at % of pageviews which trigger such a load at least once.  We
didn't have the data in that form for the HTTP/0.9 deprecation and so were
relying on the harder-to-interpret %-of-requests.  Right?  0.0003% of page
views sounds likely to be pretty low risk to me, but nothing is risk-free
for sure...

What's the right relative ordering is here. Should we wait to remove until
the Fetch spec has changed, or at least has a PR against it?

I tried checking httparchive (494,956 pages), any elements with src or href starting with ftp: or ftps: but ignoring <a href="ftp:"> (I assume those will continue to work?):

Thank you, Simon!

Not sure that's right.  Would we allow ftp URLs in iframes or not?  If not, then that wouldn't work when present in an iframe, unless opening the URL in a new tab.

We would not allow opening `ftp:` in an iframe, and the metrics presented include that case. We'd allow top-level navigation (in the hopes that such navigation could eventually be dealt with via some built-in protocol handler), but that's it.
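
As a rough illustration of the rule described above (this is not Blink's actual code), the decision can be sketched as a small predicate. The function name `should_block`, its parameters, and the two scheme sets are all hypothetical. The sketch assumes that only `http:`/`https:` documents are restricted, as in the summary, and it leaves other document schemes untouched.

```python
from urllib.parse import urlsplit

LEGACY_SCHEMES = {"ftp"}          # the legacy scheme named in this intent
WEB_SCHEMES = {"http", "https"}   # documents the restriction applies to


def should_block(document_url, request_url, is_top_level_navigation):
    """Hypothetical helper: True if the request would be blocked."""
    if is_top_level_navigation:
        # Top-level ftp: navigations stay allowed.
        return False
    doc_scheme = urlsplit(document_url).scheme
    req_scheme = urlsplit(request_url).scheme
    # Subresources and iframes: block ftp: when the embedder is HTTP(S).
    return doc_scheme in WEB_SCHEMES and req_scheme in LEGACY_SCHEMES


# An ftp: image or iframe inside an HTTPS page is blocked.
print(should_block("https://jsbin.com/petonig", "ftp://my-awesome-ftp-server.com/yay.tiff", False))
# Typing or clicking through to an ftp: URL at the top level is not.
print(should_block("https://jsbin.com/petonig", "ftp://my-awesome-ftp-server.com/", True))
```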

(Incidentally, those top-level navigations account for only 0.0025% of navigations over the last ~28 days (assuming that `Navigation.MainFrameSchemeDifferentPage` is the right metric), which makes me wonder whether it's worth going further than this intent.)

http://www.webbkameror.se/,"<img src=""ftp://194.117.166.250/fh/nygatan-kopmannagatan.jpg"" alt=""kalix"" width=""155"" height=""113"" border=""0"" class=""bild-border"">"

Legitimate breakage.  A site with a bunch of webcam links where one (of many) images is served via FTP.  I can imagine this as a pattern of possible breakage - old-school webcam setups where the software is designed only to upload images periodically to some FTP server.  There's no console warning here at all.
 
Mike, did your deprecation warning get removed at some point?  We should probably add it back for at least a milestone before removal.

Hrm. I don't know what happened here; I'm pretty sure I added the warning, but I agree that it's not in the code today. :( I can add it back in for a milestone if that makes y'all more comfortable, but given the usage, I'd be just as comfortable adding a console error along with the removal and letting it ride through dev and beta. I don't think the usage justifies a significant deprecation period, but I'll defer to y'all on that decision.

Also: I sent a PR against Fetch: https://github.com/whatwg/fetch/pull/464. However, while writing tests for that patch, I noticed that we don't actually even have an FTP server for layout tests. All the tests that contain `ftp` are verifying general URL construction/parsing, or that we block FTP resources in certain circumstances. Moreover, a proof-of-concept patch[1] triggers only two failures: one extension test that accidentally tests FTP while verifying permissions, and one test that dumps internal network-stack state (showing that we attempt a connection to an invalid host and fail). The fact that we have basically zero coverage (and no clear path to creating such coverage) is an interesting data point when considering removing the functionality sooner rather than later.

-mike

Philip Jägenstedt

Feb 7, 2017, 12:46:08 AM
to Mike West, Rick Byers, Matt Menke, Simon Pieters, Chris Bentzel, Anne van Kesteren, net...@chromium.org, blink-dev
LGTM2

The breakage from httparchive didn't look too worrying. Given the very low usage here, a deprecation message for just one milestone wouldn't be very impactful. If a console message will be logged regardless after the removal, then just doing it seems like the better choice.

I don't think making the change must block on https://github.com/whatwg/fetch/pull/464 as long as that does happen. 

Rick Byers

Feb 16, 2017, 11:53:48 AM
to Philip Jägenstedt, Mike West, Matt Menke, Simon Pieters, Chris Bentzel, Anne van Kesteren, net...@chromium.org, blink-dev
Shoot, looks like I lost track of this one - sorry! (Feel free to ping me on hangouts/IRC in the future if you're waiting for my reply).
+1 to Philip's comments.

LGTM3
