Re: [owasp-sanitizer] enable nesting of list elements...

211 views
Skip to first unread message

Mike Samuel

unread,
Oct 22, 2012, 7:39:21 PM10/22/12
to owasp-java-html-...@googlegroups.com, latc...@gmail.com
2012/10/22 Jon Stevens <latc...@gmail.com>:
> first... thanks for an excellent project.
>
> second, it seems perfectly valid html like this:
>
> <ul>
> <li>asdf</li>
> <ul>
> <li>adfasdf</li>
> </ul>
> </ul>
>
> is getting sanitized into this:
>
> <ul>
> <li>asdf</li>
> </ul>
> <ul>
> <li>adfasdf</li>
> </ul>
>
> I've looked through the docs and the source (v117), but I can't seem to
> figure it out. Is there some way to keep the nesting?

This smells like a bug in

http://code.google.com/p/owasp-java-html-sanitizer/source/browse/trunk/src/main/org/owasp/html/TagBalancingHtmlStreamEventReceiver.java

which introduces end tags to prevent HTML from breaking out of tables
and lists used for formatting and from containers that establish a
viewport or clip region.

I'll look into it.

James Manico

unread,
Oct 22, 2012, 7:53:35 PM10/22/12
to owasp-java-html-...@googlegroups.com
I don't think UL where a LI should be is valid HTML. This is NOT valid
HTML I conject.

That embedded UL needs to be IN a LI tag like:

<ul>
<li> item 1 </li>
<li><UL> ... start new list here.

I could be wrong and often am. :)

--
Jim Manico
(808) 652-3805

On Oct 22, 2012, at 6:43 PM, Jon Stevens <latc...@gmail.com> wrote:

> first... thanks for an excellent project.
>
> second, it seems perfectly valid html like this:
>
> <ul>
> <li>asdf</li>
> <ul>
> <li>adfasdf</li>
> </ul>
> </ul>
>
> is getting sanitized into this:
>
> <ul>
> <li>asdf</li>
> </ul>
> <ul>
> <li>adfasdf</li>
> </ul>
>
> I've looked through the docs and the source (v117), but I can't seem to figure it out. Is there some way to keep the nesting?
>
> thanks,
>
> jon

Jon Stevens

unread,
Oct 22, 2012, 9:52:58 PM10/22/12
to owasp-java-html-...@googlegroups.com, j...@manico.net
Hi James,

Sorry to be the bearer of bad news, but as far as I'm reading per the specs, you're incorrect. 

James Manico

unread,
Oct 22, 2012, 10:02:30 PM10/22/12
to Jon Stevens, owasp-java-html-...@googlegroups.com
According to the WC3 html validator, this is not legal HTML.

I get the error:

  1. Error Line 5, Column 6document type does not allow element "UL" here; assuming missing "LI" start-tag
      <ul> 



I still could be wrong :) try it yourself here:


--
Jim Manico
VP, Security Architecture
WhiteHat Security

Jon Stevens

unread,
Oct 22, 2012, 10:08:37 PM10/22/12
to James Manico, owasp-java-html-...@googlegroups.com
Hi Jim,

That error is because you didn't create a fully formed document. It doesn't like <ul> in whatever you gave it.

Try this:

<!doctype html>
<html>
<head>
<title>foo</title>
</head>
<body>
<ul><li>laskdfj<ul><li>laksdjf</ul></ul>
</body>
</html>

Note: I left off the </li> because technically, you don't need it. Adding it has no effect on the validation...

<!doctype html>
<html>
<head>
<title>foo</title>
</head>
<body>
<ul><li>laskdfj<ul><li>laksdjf</li></ul></li></ul>
</body>
</html>

best,

jon

James Manico

unread,
Oct 22, 2012, 11:27:48 PM10/22/12
to Jon Stevens, owasp-java-html-...@googlegroups.com
  1. Error Line 9, Column 6Element ul not allowed as child of element ul in this context. (Suppressing further errors from this subtree.)
      <ul>
    Contexts in which element ul may be used:
    Where flow content is expected.
    Content model for element ul:
    Zero or more li elements.


The </li> matter. Add then back in and try the w3c validator.


--
Jim Manico
VP, Security Architecture
WhiteHat Security

James Manico

unread,
Oct 22, 2012, 11:29:22 PM10/22/12
to Jon Stevens, owasp-java-html-...@googlegroups.com
Try this against the w3c validator. You cannot put a UL where a LI is
expected. The missing </li> 's matter.

<!doctype html>
<html>
<head>
<title>foo</title>
</head>
<body>
<ul>
<li>laskdfj</li>
<ul>
<li>laksdjf</li>
</ul>
</ul>
</body>
</html>


--
Jim Manico
VP, Security Architecture
WhiteHat Security
(808) 652-3805

Jon Stevens

unread,
Oct 23, 2012, 2:08:36 AM10/23/12
to James Manico, owasp-java-html-...@googlegroups.com
Ah, yes... that is indeed invalid, but oddly the browser parses it correctly.

What is generating it is http://imperavi.com/redactor/

jon

Mike Samuel

unread,
Oct 23, 2012, 11:36:20 AM10/23/12
to owasp-java-html-...@googlegroups.com
2012/10/23 Jon Stevens <latc...@gmail.com>
>
> Ah, yes... that is indeed invalid, but oddly the browser parses it
> correctly.
>
> What is generating it is http://imperavi.com/redactor/
>
> jon

Odd. My eyes completely glazed over when I saw that example before,
and I assumed it was a nested list.
<ul><li>...</li><li><ul>...</ul></li></ul>
instead of
<ul><li>...</li><ul>...</ul></ul>
Either way, the tag balancer is not following the HTML5 specified
error recovery steps: "assuming missing "LI" start-tag"

I've filed http://code.google.com/p/owasp-java-html-sanitizer/issues/detail?id=7
to track this issue.

Jon Stevens

unread,
Oct 23, 2012, 12:37:42 PM10/23/12
to owasp-java-html-...@googlegroups.com
I've also emailed the redactor folks to tell them they need to fix
their generator.

Thanks Mike,

jon

James Manico

unread,
Oct 23, 2012, 12:44:49 PM10/23/12
to owasp-java-html-...@googlegroups.com
Awesome John. This is still a good bug report and mandates some fixin'

Thank you good sir! :)

--
Jim Manico
(808) 652-3805

Mike Samuel

unread,
Oct 23, 2012, 12:52:06 PM10/23/12
to owasp-java-html-...@googlegroups.com
2012/10/23 James Manico <j...@manico.net>:
> Awesome John. This is still a good bug report and mandates some fixin'
>
> Thank you good sir! :)

Mandated fix is available at
http://code.google.com/p/owasp-java-html-sanitizer/source/detail?r=121
If trunk addresses your issue I'll cut a release.

Jon Stevens

unread,
Oct 23, 2012, 12:55:23 PM10/23/12
to owasp-java-html-...@googlegroups.com
wow, this is the first project i've seen in about 15 years that uses a
makefile to build a java project.

mind emailing me a .jar file and i'll test with it?

jon

Jim Manico

unread,
Oct 28, 2012, 4:04:41 PM10/28/12
to owasp-java-html-...@googlegroups.com
John,

I think you had a typo. I think you mean to say, thank you Mike Samuel for all your hard work in supporting this high performance professional-grade OWASP project?

;) Aloha,
Jim

Jim Manico

unread,
Oct 28, 2012, 4:06:16 PM10/28/12
to owasp-java-html-...@googlegroups.com
> Awesome John. This is still a good bug report and mandates some fixin'

That was a typo Mike.

I mean to say, "I think this is a good bug report and merits fixing if the project leader agrees"

Mandate--

;) Aloha,
Jim

Jon Stevens

unread,
Oct 28, 2012, 4:23:14 PM10/28/12
to owasp-java-html-...@googlegroups.com
I think you have a typo, my name is Jon, not John.

=)

jon

Jim Manico

unread,
Oct 28, 2012, 4:24:50 PM10/28/12
to owasp-java-html-...@googlegroups.com
I'm having a lot of typo problems today, Jon. =)

Aloha,
Jim
Reply all
Reply to author
Forward
0 new messages