Updates:
Status: Invalid
Comment #2 on issue 8 by mikes...@gmail.com: child elements are moved out
Entering the first example
<p>123<p>abcdefg</p>456</p>
into http://html5.validator.nu/ gives
Error: No p element in scope but a p end tag seen.
From line 1, column 24; to line 1, column 27
efg</p>456</p>↩
because the final </p> has no open <p> element left to close. Per the HTML5
parsing rules, the second <p> start tag implicitly closes the first <p>.
http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#parsing-main-inbody
says
"""
A start tag whose tag name is one of: "address", "article", "aside",
"blockquote", "center", "details", "dialog", "dir", "div", "dl", "fieldset",
"figcaption", "figure", "footer", "header", "hgroup", "main", "menu", "nav",
"ol", "p", "section", "summary", "ul"
If the stack of open elements has a p element in button scope, then act as
if an end tag with the tag name "p" had been seen.
Insert an HTML element for the token.
"""
which means that when a <p> start tag is seen while another <p> is still
open, the parser first acts as if a </p> had been seen, so
<p>123<p>abcdefg</p>456</p>
is equivalent to
<p>123</p><p>abcdefg</p>456
which is what the HTML sanitizer produces.
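
To make the rule concrete, here is a minimal sketch of the implied-end-tag
behavior. It is not the sanitizer's actual code: the class name
ImplicitPClose and the method normalize are made up for illustration, it only
recognizes the literal tags <p> and </p>, and a single boolean stands in for
"p element in button scope". Dropping a stray </p> is a simplification chosen
to match the sanitizer output quoted above.

```java
// Toy illustration only (hypothetical names); handles just <p> and </p>.
final class ImplicitPClose {

    /** Rewrites the input so that a <p> start tag first closes any open <p>. */
    static String normalize(String html) {
        StringBuilder out = new StringBuilder();
        boolean pOpen = false; // stands in for "p element in button scope"
        int i = 0;
        while (i < html.length()) {
            if (html.startsWith("<p>", i)) {
                if (pOpen) {
                    out.append("</p>"); // act as if an end tag "p" had been seen
                }
                out.append("<p>");
                pOpen = true;
                i += 3;
            } else if (html.startsWith("</p>", i)) {
                if (pOpen) {
                    out.append("</p>");
                    pOpen = false;
                }
                // Otherwise this is a stray end tag; the sketch drops it.
                i += 4;
            } else {
                out.append(html.charAt(i)); // plain text is copied through
                i++;
            }
        }
        if (pOpen) {
            out.append("</p>"); // close a <p> left open at end of input
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("<p>123<p>abcdefg</p>456</p>"));
    }
}
```

Running main on the example from this issue prints
<p>123</p><p>abcdefg</p>456, the same shape as the sanitizer output above.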
By applying the same tag nesting rules that browsers use, the sanitizer
avoids a lot of ambiguity in HTML and can produce output that a variety of
browsers will interpret consistently and safely.
----
Sanitizers.BLOCKS.sanitize("<div><meta/><p>abcdefg</p></div>")
should not produce
"<div><meta/><p>abcdefg</p></div>"
since <meta> is not a block tag, and is not even allowed in the body.
----
Marking this bug invalid. Please reopen if you feel this was in error.