
Intent to prototype: Honoring bogo-XML declaration for character encoding in text/html

Henri Sivonen
Mar 10, 2021, 10:57:15 AM
to dev-platform
# Summary

For compatibility with WebKit and Blink, honor the character encoding
declared using the XML declaration syntax in text/html.
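
To make the syntax in question concrete, here is a minimal Python sketch of pulling an encoding label out of an XML declaration at the very start of a byte stream. It is an illustration only: the function name `sniff_xml_decl_encoding` and the regular expression are assumptions made for this example, not the algorithm from the HTML spec pull request or the Gecko patch, and a real implementation would also have to map the label to a supported encoding.

```python
import re
from typing import Optional

# Simplified, hypothetical pattern: an XML declaration at byte 0 that carries
# a quoted encoding label before the first '>'.
_XML_DECL = re.compile(
    rb"""^<\?xml[^>]*?\bencoding\s*=\s*(["'])([A-Za-z0-9._:-]+)\1"""
)


def sniff_xml_decl_encoding(prefix: bytes) -> Optional[str]:
    """Return the lowercased encoding label from a leading XML declaration.

    Returns None when the bytes do not start with a declaration that has
    an encoding pseudo-attribute.
    """
    match = _XML_DECL.match(prefix)
    if match is None:
        return None
    return match.group(2).decode("ascii").lower()


if __name__ == "__main__":
    page = b'<?xml version="1.0" encoding="UTF-8"?><html><body>...</body></html>'
    print(sniff_xml_decl_encoding(page))  # utf-8
```

The sketch only looks at the start of the stream, so a declaration that appears later in the document is ignored.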

For reasons explained in https://hsivonen.fi/utf-8-detection/ , UTF-8,
unlike other encodings, isn't detected from content. Trident and
EdgeHTML don't honor the XML declaration syntax in text/html either,
so with their demise, pages that declare their encoding only as
<?xml version="1.0" encoding="UTF-8"?> have become a more notable Web
compat problem for us. With non-Latin scripts, the failure mode is
particularly bad for a Web compat problem: the text is completely
unreadable.
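
As a rough illustration of that failure mode, the Python sketch below encodes a Cyrillic string as UTF-8 and then decodes the bytes with a legacy single-byte encoding. The choice of windows-1252 (`cp1252`) as the fallback and the helper name `misdecode` are assumptions for the example; which legacy encoding a browser actually ends up using depends on its own detection and locale logic.

```python
def misdecode(text: str, assumed_fallback: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode it as a legacy encoding.

    assumed_fallback is a hypothetical stand-in for whatever encoding a
    browser would pick when it does not honor the UTF-8 declaration.
    """
    raw = text.encode("utf-8")
    # errors="replace" keeps the demo from raising on bytes that are
    # undefined in the legacy encoding.
    return raw.decode(assumed_fallback, errors="replace")


sample = "Привет"
garbled = misdecode(sample)

if __name__ == "__main__":
    print(sample)
    print(garbled)
```

Each Cyrillic letter occupies two bytes in UTF-8, so the six-letter sample turns into twelve unrelated Latin-range characters and symbols, none of which is Cyrillic.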

That is, this isn't a feature for Web authors to use. Its purpose is to
remove a push factor for users when authors use this syntax anyway.

# Bug

https://bugzilla.mozilla.org/show_bug.cgi?id=673087

# Standard

https://github.com/whatwg/html/pull/1752

# Platform coverage

All

# Preference

To be enabled unconditionally.

# DevTools bug

No integration needed.

# Other browsers

WebKit has had this behavior for a very long time and didn't remove it
when HTML parsing was standardized.

Blink inherited this from WebKit upon forking.

Trident and EdgeHTML don't have this; their demise changed the balance
for this feature.

# web-platform-tests

https://hsivonen.com/test/moz/xml-decl/ contains tests which are
wrapped for WPT as part of the Gecko patch.

--
Henri Sivonen
hsiv...@mozilla.com