Intent to Ship: Use Non-Transitional IDNA Processing in URLs


Mustafa Emre Acer

Nov 28, 2022, 4:16:04 PM
to blink-dev

Contact emails

mea...@chromium.org

Specification

https://unicode.org/reports/tr46

Summary

Enable IDNA 2008 Non-Transitional processing for URLs, aligning Chrome's behavior with Firefox and Safari. Chrome currently processes URLs using IDNA 2008 in Transitional mode.

The main difference between the two modes is the handling of four characters known as deviation characters: ß (LATIN SMALL LETTER SHARP S), ς (GREEK SMALL LETTER FINAL SIGMA), ZWJ (ZERO WIDTH JOINER) and ZWNJ (ZERO WIDTH NON-JOINER). In Transitional mode, deviation characters are handled the same way as in IDNA2003: ß is mapped to ss, ς is mapped to σ, and ZWJ and ZWNJ are deleted. In Non-Transitional mode, these characters are kept in domain names without mapping, so the same typed name can resolve to a different domain and therefore a different IP address. For example, typing "faß.de" in Chrome and Firefox opens different sites today.

Enabling Non-Transitional IDNA in Chrome will allow deviation characters in domain names. Firefox and Safari made this change in 2016 and continue to use Non-Transitional processing for URLs.
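As a rough illustration of the two modes, the handling of the four deviation characters can be sketched in Python. This is a simplified sketch and not Chrome's implementation or a full UTS 46 mapping: the helper name `to_ascii` and its `transitional` parameter are hypothetical, and everything other than lowercasing and the deviation mappings (normalization, validity checks, other mapped characters) is left out.

```python
# Simplified sketch of how the four deviation characters are treated.
# Not a full UTS 46 implementation: only lowercasing and the deviation
# mappings are applied; normalization and validity checks are omitted.

# Transitional (IDNA2003-compatible) handling of the deviation characters.
TRANSITIONAL_MAP = {
    "\u00df": "ss",      # ß LATIN SMALL LETTER SHARP S -> "ss"
    "\u03c2": "\u03c3",  # ς GREEK SMALL LETTER FINAL SIGMA -> σ
    "\u200d": "",        # ZWJ is deleted
    "\u200c": "",        # ZWNJ is deleted
}


def to_ascii(hostname: str, transitional: bool) -> str:
    """Hypothetical helper: convert a hostname to its ASCII (Punycode) form."""
    labels = []
    for label in hostname.lower().split("."):
        if transitional:
            # Apply the IDNA2003-style mapping to deviation characters.
            label = "".join(TRANSITIONAL_MAP.get(ch, ch) for ch in label)
        if label.isascii():
            labels.append(label)
        else:
            # Non-ASCII labels are Punycode-encoded with the "xn--" prefix.
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)


print(to_ascii("faß.de", transitional=True))   # fass.de
print(to_ascii("faß.de", transitional=False))  # xn--fa-hia.de
```

The two printed hostnames are different domains, which is why the same typed URL can reach different sites depending on the mode.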



Blink component

UI>Security>UrlFormatting

Search tags

idna

TAG review

This feature addresses conformance to an existing spec and other browsers already do it.

TAG review status

Not applicable

Risks



Interoperability and Compatibility



Gecko: Shipped/Shipping (https://bugzilla.mozilla.org/show_bug.cgi?id=1218179)

WebKit: Shipped/Shipping (https://trac.webkit.org/changeset/208902/webkit)

Web developers: No signals

Other signals:

Security

This change introduces a potential security issue where a domain pointing to one IP address may start pointing to another. As an example, IDNA2003 and Transitional IDNA2008 map faß.de to fass.de (ß is a deviation character). Non-Transitional IDNA2008 maps it to xn--fa-hia.de, which is the punycode representation of faß.de. Typing "faß.de" in Chrome and Firefox currently opens different sites.

The main mitigations discussed were domain bundling and blocking, where registrars either bundle domain names (e.g. registering faß.de along with fass.de) or block the alternative domain name (e.g. disallowing faß.de if fass.de is registered).

According to data from Chrome 106 and 107:
- Less than 0.001% of user-typed or pasted main frame navigations had a deviation character in the hostname. This excludes link clicks and renderer initiated navigations, so the percentage of affected domains among all navigations is even lower.
- Only one hostname with a deviation character had more than 50 impressions over a 28 day period (fußball.de). Both fußball.de and fussball.de have the same owner, so this change doesn't affect them.

Thus, typing domain names with deviation characters is very rare. Domain bundling and blocking aren't blockers, as this change won't have a significant impact on navigations. Finally, Firefox and Safari have been using Non-Transitional IDNA 2008 since 2016 without issues.



WebView application risks

Does this intent deprecate or change behavior of existing APIs, such that it has potentially high risk for Android WebView-based applications?



Debuggability



Will this feature be supported on all six Blink platforms (Windows, Mac, Linux, Chrome OS, Android, and Android WebView)?

Yes

Is this feature fully tested by web-platform-tests?

No

DevTrial instructions

https://bugs.chromium.org/p/chromium/issues/detail?id=694157#c70

Flag name

use-idna2008-non-transitional

Requires code in //chrome?

False

Tracking bug

https://bugs.chromium.org/p/chromium/issues/detail?id=694157

Launch bug

https://launch.corp.google.com/launch/4224656

Estimated milestones

DevTrial on desktop: 110
DevTrial on Android: 110


Anticipated spec changes

Open questions about a feature may be a source of future web compat or interop issues. Please list open issues (e.g. links to known github issues in the project for the feature specification) whose resolution may introduce web compat/interop risk (e.g., changing to naming or structure of the API in a non-backward-compatible way).



Link to entry on the Chrome Platform Status

https://chromestatus.com/feature/5105856067141632

This intent message was generated by Chrome Platform Status.

Harald Alvestrand

Nov 29, 2022, 1:30:06 AM
to Mustafa Emre Acer, blink-dev
This IDNA 2008 author applauds your decision.


--
You received this message because you are subscribed to the Google Groups "blink-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to blink-dev+...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAHafXh3rh2Hh35Pv1wNg8vBzUMy13NY%2Bh1y8HmHQrH2aD1i_Lg%40mail.gmail.com.

Yoav Weiss

Nov 30, 2022, 12:37:57 AM
to Harald Alvestrand, Mustafa Emre Acer, blink-dev
Thanks for working on alignment here!!

> Is this feature fully tested by web-platform-tests?
> No

Why not?

Yifan Luo

Nov 30, 2022, 8:35:38 AM
to blink-dev, yoav...@chromium.org, Mustafa Emre Acer, blink-dev, Harald Alvestrand
There seem to be some tests written by Apple: https://github.com/web-platform-tests/wpt/pull/4794. However, same question here: why not?

Rick Byers

Nov 30, 2022, 10:48:49 AM
to Yifan Luo, blink-dev, yoav...@chromium.org, Mustafa Emre Acer, Harald Alvestrand
Thanks for investing in this alignment! Having a URL that goes one place in Chrome and somewhere different in Safari/Firefox seems like a very bad thing in principle to me :-)

Your metrics and comments are around user-typed/pasted URLs. Does this change somehow impact only those, and not URLs parsed from HTML and CSS? If so, then I can understand why there are no WPTs for this. But if not, then we'd definitely need confidence in the WPT tests and probably some more compat analysis.

Philip Jägenstedt

Nov 30, 2022, 12:00:35 PM
to Rick Byers, Yifan Luo, blink-dev, yoav...@chromium.org, Mustafa Emre Acer, Harald Alvestrand
Hi Mustafa,

Thanks so much for working on this. The initial email says this isn't tested by WPT, but I think this is the change that will make this test (part of Interop 2022) pass:
https://wpt.fyi/results/url/toascii.window.html?label=experimental&label=master&product=chrome&product=firefox&product=safari&aligned&view=interop&q=label%3Ainterop-2022-webcompat

Is that right?

Best regards,
Philip

Mustafa Emre Acer

Nov 30, 2022, 12:39:45 PM
to Philip Jägenstedt, Rick Byers, Yifan Luo, blink-dev, yoav...@chromium.org, Harald Alvestrand
There are actually tests, but as a virtual test suite since the implementation is currently behind a flag:

Chrome Status form asked for a link to wpt.fyi and I couldn't figure out how to link to a virtual test suite so I said no. Updated the CS entry.

Philip Jägenstedt

Dec 1, 2022, 6:07:54 AM
to Mustafa Emre Acer, Rick Byers, Yifan Luo, blink-dev, yoav...@chromium.org, Harald Alvestrand
I see, so if we compare the expectations of the default setup to those of the virtual test suite, we see an improvement from 154 failures to 73. Yay!

Are those remaining failures for reasons unrelated to IDNA processing? There are still tests with "ß" in the name that fail, but I'm not sure if it's expected or not.

Mustafa Emre Acer

Dec 1, 2022, 6:40:26 PM
to Philip Jägenstedt, Rick Byers, Yifan Luo, blink-dev, yoav...@chromium.org, Harald Alvestrand
Hi Philip,

Pretty sure the remaining failures with URLs with "ß" are due to crbug.com/724018. In fact a quick hack reduced the failures down to 28: https://chromium-review.googlesource.com/c/chromium/src/+/4072454

While related to IDNA, it's a different issue and isn't affected by this change. 


Yoav Weiss

Dec 1, 2022, 11:10:37 PM
to Mustafa Emre Acer, Philip Jägenstedt, Rick Byers, Yifan Luo, blink-dev, Harald Alvestrand
Thanks for clarifying the test situation, Mustafa! :) Can you also answer Rick's question regarding the impact of this change on parsed URLs (vs. typed or pasted URLs, which you already described)?

Mustafa Emre Acer

Dec 2, 2022, 3:28:03 PM
to Yoav Weiss, Philip Jägenstedt, Rick Byers, Yifan Luo, blink-dev, Harald Alvestrand
> Rick's question regarding the impact of this change on parsed URLs? (vs. typed or pasted URL, that you already described)

Yes, this affects parsed URLs as well. So, subresources with affected URLs may start pointing to different IP addresses after this change. Unfortunately I don't have metrics about how prevalent this is, but I'm happy to dig into it if we feel it's necessary.

Also, a small correction about the remaining failures in the virtual test suite: There are two more failures containing ß (lines 124 and 127) I missed. These seem to be related to the handling of extended ASCII characters in hostnames: ß is an extended ASCII character, so the URL string is treated as 8 bit and parsed accordingly. I'll file a separate bug for this.
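To illustrate the point about ß, here is a small Python check, assuming "8 bit" here means that every code point in the string is at most U+00FF. The function name `fits_in_8_bits` is hypothetical and this does not reproduce the actual string handling in Chrome; it only shows why ß (U+00DF) differs from a character like ς (U+03C2) in this respect.

```python
def fits_in_8_bits(text: str) -> bool:
    """Hypothetical check: True if every code point is at most U+00FF."""
    return all(ord(ch) <= 0xFF for ch in text)


print(fits_in_8_bits("http://faß.de/"))  # True: ß is U+00DF
print(fits_in_8_bits("ς"))               # False: ς is U+03C2
```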

Rick Byers

Dec 2, 2022, 4:04:02 PM
to Mustafa Emre Acer, Yoav Weiss, Philip Jägenstedt, Yifan Luo, blink-dev, Harald Alvestrand
Thanks Mustafa, that makes sense. I'm struggling a bit to evaluate the compat risk. Changing URL parsing at all feels risky, but your data indicates this should be a very rare scenario, and the fact that we're just matching changes Firefox and Safari made years ago means it's even less risky. There are still Android WebView and Chromium-only enterprise scenarios to consider. But I don't want to ask that you go through a whole other round of adding metrics and waiting for stable just to address what is effectively an interop bug (with a non-trivial impact on our WPT pass rates), especially given those metrics are not going to be 100% conclusive either (they may identify only non-breaking cases). Finding only one origin with any real usage, and seeing that that origin works fine either way, also further reduces the risk for me.

I think I'm convinced that the risk here is similar to that of other bug-fixes we make without any formal compat analysis. LGTM1 to ship. But if you get reports of any breakage whatsoever prior to hitting stable, please revert and come back to us for discussion of next steps. 

Thanks,
   Rick

Rick Byers

Dec 2, 2022, 4:05:59 PM
to Mustafa Emre Acer, Alex Russell, Yoav Weiss, Philip Jägenstedt, Yifan Luo, blink-dev, Harald Alvestrand
Oh and +Alex Russell mentioned in the API owners meeting that he's fine with this change, and he has already approved it in Chromestatus. So mine is actually LGTM2.

Chris Harrelson

Dec 2, 2022, 4:19:46 PM
to Rick Byers, Mustafa Emre Acer, Alex Russell, Yoav Weiss, Philip Jägenstedt, Yifan Luo, blink-dev, Harald Alvestrand

Mustafa Emre Acer

Dec 2, 2022, 5:12:37 PM
to Chris Harrelson, Rick Byers, Alex Russell, Yoav Weiss, Philip Jägenstedt, Yifan Luo, blink-dev, Harald Alvestrand
Thanks Rick and Chris. Will get back here if I become aware of any breakage. 

In the meantime, I'm thinking of adding a console message for affected subresources. It'll read something like:
"The hostname for http://faß.de/script.js (faß.de) may point to a different IP address after https://chromestatus.com/feature/5105856067141632. Make sure you are using the correct host name."
Let me know if this is useful or just noise.
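
For illustration only, the check behind such a message could look roughly like the Python sketch below. It is not the code that landed in Chrome: the helper name `deviation_warning` and the constant names are hypothetical, and it only inspects the Unicode form of the hostname, so a URL already written in punycode (e.g. xn--fa-hia.de) would not be flagged.

```python
from typing import Optional
from urllib.parse import urlsplit

# The four deviation characters: ß, ς, ZWJ, ZWNJ.
DEVIATION_CHARACTERS = {"\u00df", "\u03c2", "\u200d", "\u200c"}
FEATURE_URL = "https://chromestatus.com/feature/5105856067141632"


def deviation_warning(url: str) -> Optional[str]:
    """Hypothetical helper: return a warning string if the URL's hostname
    contains a deviation character, otherwise None."""
    hostname = urlsplit(url).hostname or ""
    if not any(ch in DEVIATION_CHARACTERS for ch in hostname):
        return None
    return (
        f"The hostname for {url} ({hostname}) may point to a different IP "
        f"address after {FEATURE_URL}. Make sure you are using the correct "
        "host name."
    )


print(deviation_warning("http://faß.de/script.js"))
print(deviation_warning("http://example.com/script.js"))  # None
```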

Rick Byers

Dec 2, 2022, 5:18:58 PM
to Mustafa Emre Acer, Chris Harrelson, Alex Russell, Yoav Weiss, Philip Jägenstedt, Yifan Luo, blink-dev, Harald Alvestrand
Given that you expect it to be very rare, that sounds like a good idea to me - at least for a release or two. I should have suggested this because I was reviewing our compat principles trying to reason through this one, but skipped right over Debuggability :-).

Can you also add a UseCounter at that point so we get a signal in beta of how often it's triggered? 

Mustafa Emre Acer

Dec 15, 2022, 6:16:35 PM
to Rick Byers, Chris Harrelson, Alex Russell, Yoav Weiss, Philip Jägenstedt, Yifan Luo, blink-dev, Harald Alvestrand
Log message and use counters landed today. I enabled the feature flag by default on tip of tree earlier this week. 

Thank you all! 