Remove not just disallowed elements, but text nested within the tags?

Henry Reed

Nov 26, 2014, 2:07:24 AM
to owasp-java-html-...@googlegroups.com


Hi there,


Looks like a good product. I have a question, though. I created a custom policy via HtmlPolicyBuilder as follows:


        private static final PolicyFactory INSTANCE = new HtmlPolicyBuilder()
                .allowElements(
                        "a",
                        "b",
                        "blockquote",
                        "br",
                        // ... many more allowed elements here...
                        "thead",
                        "tr",
                        "u",
                        "ul")
                .allowAttributes("href", "rel").onElements("a")
                .allowAttributes("cite").onElements("blockquote")
                .allowAttributes("src", "alt", "width", "height").onElements("img")
                .allowAttributes("cite").onElements("q")
                .allowAttributes("colspan", "rowspan").onElements("td")
                .allowAttributes("colspan", "rowspan").onElements("thead")
                .toFactory();



After calling the factory to sanitize input like this:

<hr />Hello<h1>OK</h1><blink>evil</blink><script>alert('Evil');</script>


I get the following output:

<hr />Hello<h1>OK</h1>evil


But what the client wants to see is this:

<hr />Hello<h1>OK</h1>


That is, the text between <blink> and </blink> should be removed along with the disallowed tags themselves.
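
To pin down the requirement, here is a minimal stand-alone sketch of the transformation the client expects. It does not use the sanitizer at all: `DesiredOutput` and `dropWithContent` are hypothetical names made up for this illustration, and the regex only copes with simple, non-nested cases like the sample above. It is a statement of the desired result, not a safe way to sanitize HTML.

```java
import java.util.regex.Pattern;

public class DesiredOutput {

    // Hypothetical helper (not part of the sanitizer): removes every
    // <element ...>...</element> pair together with the text inside it.
    // Assumes the element is not nested inside itself and is well-formed.
    static String dropWithContent(String html, String element) {
        String name = Pattern.quote(element);
        Pattern pair = Pattern.compile(
                "(?is)<" + name + "\\b[^>]*>.*?</" + name + "\\s*>");
        return pair.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String input =
                "<hr />Hello<h1>OK</h1><blink>evil</blink><script>alert('Evil');</script>";

        // Drop the two disallowed elements from the sample, contents included.
        String output = dropWithContent(dropWithContent(input, "blink"), "script");

        System.out.println(output); // prints: <hr />Hello<h1>OK</h1>
    }
}
```

The point of the sketch is only the input/output pair: allowed elements such as <hr /> and <h1> pass through untouched, while a disallowed element disappears together with its text.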


Is there a way to accomplish this?


Many thanks,


Henry.
