On Mon, Aug 3, 2015 at 4:48 PM, Philip Jägenstedt <phi...@opera.com> wrote:
> Assuming that the new set of forbidden code points is a strict subset
> of the existing set of forbidden code points, then it still seems
> pretty likely to be web compatible.
It seems like that is not the case for elements. The HTML parser
requires that the first code point is an ASCII letter (upper- or
lowercase), which is completely incompatible with XML:
http://www.w3.org/TR/xml/#NT-NameStartChar
Any remaining code point for a tag name in HTML must not be U+0009,
U+000A, U+000C, U+0020, U+002F,
U+003E, or U+0000. (Though we could allow U+0000 and treat it as
U+FFFD as the HTML parser does.) That does seem far more liberal than
XML allows, and its forbidden set is a full subset of XML's.
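To make the element rule concrete, here is a rough sketch of such a check in
Python. The function and constant names are made up for illustration and do
not come from any specification or library. It rejects U+0000 outright
rather than mapping it to U+FFFD, which is only one of the two options
mentioned above.

```python
# Illustrative sketch only; names are hypothetical, not from any spec.
# Code points that may not appear after the first one in a tag name:
# U+0009, U+000A, U+000C, U+0020, U+002F, U+003E, U+0000.
FORBIDDEN_IN_TAG_NAME = frozenset("\t\n\f />\0")


def is_parser_compatible_tag_name(name: str) -> bool:
    """Return True if the HTML parser could produce `name` as a tag name."""
    if not name:
        return False
    first = name[0]
    # The first code point must be an ASCII letter, upper- or lowercase.
    if not (first.isascii() and first.isalpha()):
        return False
    # Every remaining code point only has to avoid the forbidden set.
    return not any(ch in FORBIDDEN_IN_TAG_NAME for ch in name[1:])
```

Note that "_foo" and ":foo" are rejected here although both start with a
valid XML NameStartChar, while something like "a=:" is accepted although
XML would not allow it.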
For attributes, they must not start with U+0009, U+000A, U+000C,
U+0020, U+002F, U+003E, or U+0000. (Again, U+0000 could be treated as
U+FFFD.) Any subsequent code point must not be any of those, and
also not U+003D. That again seems far more liberal than XML, and the
forbidden set is a full subset of XML's.
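The attribute rule could be sketched the same way. Again the names are
hypothetical and this is just an illustration of the restrictions listed
above, with U+0000 rejected instead of being replaced by U+FFFD.

```python
# Illustrative sketch only; names are hypothetical, not from any spec.
# Code points an attribute name may not start with:
# U+0009, U+000A, U+000C, U+0020, U+002F, U+003E, U+0000.
FORBIDDEN_AT_START = frozenset("\t\n\f />\0")
# Subsequent code points additionally may not be U+003D (=).
FORBIDDEN_AFTER_START = FORBIDDEN_AT_START | {"="}


def is_parser_compatible_attribute_name(name: str) -> bool:
    """Return True if the HTML parser could produce `name` as an attribute name."""
    if not name or name[0] in FORBIDDEN_AT_START:
        return False
    return not any(ch in FORBIDDEN_AFTER_START for ch in name[1:])
```

Under these rules "=x" and "1abc" pass, since U+003D is only excluded after
the first code point and there is no letter requirement, while "a=b" fails.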
--
https://annevankesteren.nl/