Problem allowing iframe in EbayPolicyExample.java

300 views
Skip to first unread message

doa...@gmail.com

unread,
Aug 3, 2013, 7:25:20 PM8/3/13
to owasp-java-html-...@googlegroups.com
I am trying to modify the EbayPolicyExample.java to allow an iframe with src, width and height attributes to be used for a forum that needs to be able to embed youtube videos and such. It looks like it should be as simple as:

onElements("iframe")
.allowAttributes("src")
.onElements("iframe")
.allowAttributes("width" , "height").matching(NUMBER_OR_PERCENT)


and then adding "iframe" to the allowElements list. In my database the <iframe></iframe> tag is entered but all the attributes are missing. This is the debug for before and after sanitization:

BEFORE: html [<p><iframe src="http://www.youtube.com/embed/GNZBSZD16cY" width="425" height="350"></iframe></p>]
AFTER: safeHTML [<p><iframe></iframe></p>]

Do I have something wrong in the code? Thank you for any pointers.

Mike Samuel

unread,
Aug 4, 2013, 10:36:28 AM8/4/13
to owasp-java-html-...@googlegroups.com
2013/8/3 <doa...@gmail.com>:
Can you supply your full code snippet? The one that includes the
allowElements and any other white-listing you may do?

doa...@googlemail.com

unread,
Aug 6, 2013, 11:13:10 PM8/6/13
to owasp-java-html-...@googlegroups.com, mikes...@gmail.com
Thank you for replying.

I have this partly working for the youtube URL but I had to use .matching(MY_REG_EXP) to get it to work but I am finding that the height and width are being removed from iframe (frameborder and allowfullscreen don't seem to be required):

<iframe width="420" height="315" src="//www.youtube.com/embed/NQc4V_wTT_I" frameborder="0" allowfullscreen></iframe>

and the sanitizer is an exact copy of the EbayPolicyExample.java with my additional bits in bold:

private static final Pattern ANYTHING = Pattern.compile(".*");
    // The 16 colors defined by the HTML Spec (also used by the CSS Spec)
    private static final Pattern COLOR_NAME = Pattern.compile(
            "(?:aqua|black|blue|fuchsia|gray|grey|green|lime|maroon|navy|olive|purple"
            + "|red|silver|teal|white|yellow)");
    // HTML/CSS Spec allows 3 or 6 digit hex to specify color
    private static final Pattern COLOR_CODE = Pattern.compile(
            "(?:#(?:[0-9a-fA-F]{3}(?:[0-9a-fA-F]{3})?))");
    private static final Pattern NUMBER_OR_PERCENT = Pattern.compile(
            "[0-9]+%?");
    private static final Pattern PARAGRAPH = Pattern.compile(
            "(?:[\\p{L}\\p{N},'\\.\\s\\-_\\(\\)]|&[0-9]{2};)*");
    private static final Pattern HTML_ID = Pattern.compile(
            "[a-zA-Z0-9\\:\\-_\\.]+");
    // force non-empty with a '+' at the end instead of '*'
    private static final Pattern HTML_TITLE = Pattern.compile(
            "[\\p{L}\\p{N}\\s\\-_',:\\[\\]!\\./\\\\\\(\\)&]*");
    private static final Pattern HTML_CLASS = Pattern.compile(
            "[a-zA-Z0-9\\s,\\-_]+");
    private static final Pattern ONSITE_URL = Pattern.compile(
            "(?:[\\p{L}\\p{N}\\\\\\.\\#@\\$%\\+&;\\-_~,\\?=/!]+|\\#(\\w)+)");
    private static final Pattern OFFSITE_URL = Pattern.compile(
            "\\s*(?:(?:ht|f)tps?://|mailto:)[\\p{L}\\p{N}]"
            + "[\\p{L}\\p{N}\\p{Zs}\\.\\#@\\$%\\+&;:\\-_~,\\?=/!\\(\\)]*\\s*");
    private static final Pattern NUMBER = Pattern.compile(
            "[+-]?(?:(?:[0-9]+(?:\\.[0-9]*)?)|\\.[0-9]+)");
    private static final Pattern NAME = Pattern.compile("[a-zA-Z0-9\\-_\\$]+");
    private static final Pattern ALIGN = Pattern.compile(
            "(?i)center|left|right|justify|char");
    private static final Pattern VALIGN = Pattern.compile(
            "(?i)baseline|bottom|middle|top");
    private static final Pattern YOUTUBE = Pattern.compile(
            "^(?:https?:)?(//)?(?:www\\.)?(?:youtube\\.com|youtu\\.be).*");

    private static final Predicate<String> COLOR_NAME_OR_COLOR_CODE = new Predicate<String>() {
        public boolean apply(String s) {
            return COLOR_NAME.matcher(s).matches()
                    || COLOR_CODE.matcher(s).matches();
        }
    };
    private static final Predicate<String> ONSITE_OR_OFFSITE_URL = new Predicate<String>() {
        public boolean apply(String s) {
            return ONSITE_URL.matcher(s).matches()
                    || OFFSITE_URL.matcher(s).matches();
        }
    };
    private static final Pattern HISTORY_BACK = Pattern.compile(
            "(?:javascript:)?\\Qhistory.go(-1)\\E");
    private static final Pattern ONE_CHAR = Pattern.compile(
            ".?", Pattern.DOTALL);
    public static final PolicyFactory POLICY_DEFINITION = new HtmlPolicyBuilder()
            .allowAttributes("id").matching(HTML_ID).globally()
            .allowAttributes("class").matching(HTML_CLASS).globally()
            .allowAttributes("lang").matching(Pattern.compile("[a-zA-Z]{2,20}"))
            .globally()
            .allowAttributes("title").matching(HTML_TITLE).globally()
            .allowStyling()
            .allowAttributes("align").matching(ALIGN).onElements("p")
            .allowAttributes("for").matching(HTML_ID).onElements("label")
            .allowAttributes("color").matching(COLOR_NAME_OR_COLOR_CODE)
            .onElements("font")
            .allowAttributes("face")
            .matching(Pattern.compile("[\\w;, \\-]+"))
            .onElements("font")
            .allowAttributes("size").matching(NUMBER).onElements("font")
            .allowAttributes("href").matching(ONSITE_OR_OFFSITE_URL)
            .onElements("a")
            .allowStandardUrlProtocols()
            .allowAttributes("nohref").onElements("a")
            .allowAttributes("name").matching(NAME).onElements("a")
            .allowAttributes(
            "onfocus", "onblur", "onclick", "onmousedown", "onmouseup")
            .matching(HISTORY_BACK).onElements("a")
            .requireRelNofollowOnLinks()
            .allowAttributes("src").matching(ONSITE_OR_OFFSITE_URL)
            .onElements("img")
            .allowAttributes("name").matching(NAME)
            .onElements("img")
            .allowAttributes("alt").matching(PARAGRAPH)
            .onElements("img")
            .allowAttributes("border", "hspace", "vspace").matching(NUMBER)
            .onElements("img")
            .allowAttributes("border", "cellpadding", "cellspacing")
            .matching(NUMBER).onElements("table")
            .allowAttributes("bgcolor").matching(COLOR_NAME_OR_COLOR_CODE)
            .onElements("table")
            .allowAttributes("background").matching(ONSITE_URL)
            .onElements("table")
            .allowAttributes("align").matching(ALIGN)
            .onElements("table")
            .allowAttributes("noresize").matching(Pattern.compile("(?i)noresize"))
            .onElements("table")
            .allowAttributes("background").matching(ONSITE_URL)
            .onElements("td", "th", "tr")
            .allowAttributes("bgcolor").matching(COLOR_NAME_OR_COLOR_CODE)
            .onElements("td", "th")
            .allowAttributes("abbr").matching(PARAGRAPH)
            .onElements("td", "th")
            .allowAttributes("axis", "headers").matching(NAME)
            .onElements("td", "th")
            .allowAttributes("scope")
            .matching(Pattern.compile("(?i)(?:row|col)(?:group)?"))
            .onElements("td", "th")
            .allowAttributes("nowrap")
            .onElements("td", "th")
            .allowAttributes("height", "width").matching(NUMBER_OR_PERCENT)
            .onElements("table", "td", "th", "tr", "img")
            .allowAttributes("align").matching(ALIGN)
            .onElements("thead", "tbody", "tfoot", "img",
            "td", "th", "tr", "colgroup", "col")
            .allowAttributes("valign").matching(VALIGN)
            .onElements("thead", "tbody", "tfoot",
            "td", "th", "tr", "colgroup", "col")
            .allowAttributes("charoff").matching(NUMBER_OR_PERCENT)
            .onElements("td", "th", "tr", "colgroup", "col",
            "thead", "tbody", "tfoot")
            .allowAttributes("char").matching(ONE_CHAR)
            .onElements("td", "th", "tr", "colgroup", "col",
            "thead", "tbody", "tfoot")
            .allowAttributes("colspan", "rowspan").matching(NUMBER)
            .onElements("iframe")
            .allowAttributes("src").matching(YOUTUBE)
            .onElements("iframe")
            .allowAttributes("height" , "width").matching(NUMBER_OR_PERCENT)

            .onElements("td", "th")
            .allowAttributes("span", "width").matching(NUMBER_OR_PERCENT)
            .onElements("colgroup", "col")
            .allowElements(
            "a", "label", "noscript", "h1", "h2", "h3", "h4", "h5", "h6",
            "p", "i", "b", "u", "strong", "em", "small", "big", "pre", "code",
            "cite", "samp", "sub", "sup", "strike", "center", "blockquote",
            "hr", "br", "col", "font", "map", "span", "div", "img",
            "ul", "ol", "li", "dd", "dt", "dl", "tbody", "thead", "tfoot",
            "table", "td", "th", "tr", "colgroup", "fieldset", "legend" , "iframe")
            .toFactory();

Mike Samuel

unread,
Aug 7, 2013, 11:21:21 AM8/7/13
to doa...@googlemail.com, owasp-java-html-...@googlegroups.com
2013/8/6 <doa...@googlemail.com>:
> private static final Pattern ANYTHING = Pattern.compile(".*");

This probably needs Pattern.DOT_ALL, or you could use
Predicates.<String>alwaysTrue()

> .allowAttributes("colspan", "rowspan").matching(NUMBER)
> .onElements("iframe")

This is saying "allow attributes colspan and rowspan that match the
pattern number on elements iframe".

> .allowAttributes("src").matching(YOUTUBE)
> .onElements("iframe")

This is allowing src on iframe which is working for you.

> .allowAttributes("height" , "width").matching(NUMBER_OR_PERCENT)
> .onElements("td", "th")

And this is allowing height and width on elements td and th which is
duplicative of an earlier allow.

You can solve this by putting calls to allowAttributes first, and
follow them with onElements thus:

.allowAttributes("src").matching(YOUTUBE).onElements("iframe")
.allowAttributes("height", "width")
.matching(NUMBER_OR_PERCENT).onElements("iframe")
.allowAttributes("allowfullscreen")
.matching(Pattern.compile("allowfullscreen")).onElements("iframe")
.allowAttributes("frameborder").matching(NUMBER).onElements("iframe")

doa...@googlemail.com

unread,
Aug 7, 2013, 6:57:43 PM8/7/13
to owasp-java-html-...@googlegroups.com, doa...@googlemail.com, mikes...@gmail.com
Yep, that got it. Thank you very much Mike. In case anyone else is having a similar problem I think that a part of my issue was getting the syntax backwards! I had:

onElements("iframe").allowAttributes("src").matching(YOUTUBE)

rather than:


.allowAttributes("src").matching(YOUTUBE).onElements("iframe")

This worked as I inserted the code into the wrong line in the EbayPolicyExample.java but it would also have broken another element. Other bits that mike provided such as:

Pattern.compile("allowfullscreen")

I didn't even have at all.

Thanks again.

Mike Samuel

unread,
Aug 7, 2013, 7:06:11 PM8/7/13
to doa...@googlemail.com, owasp-java-html-...@googlegroups.com
2013/8/7 <doa...@googlemail.com>:
> Yep, that got it. Thank you very much Mike. In case anyone else is having a
> similar problem I think that a part of my issue was getting the syntax
> backwards! I had:
>
> onElements("iframe").allowAttributes("src").matching(YOUTUBE)
>
> rather than:
>
>
> .allowAttributes("src").matching(YOUTUBE).onElements("iframe")

Exactly.

It's a reasonable mistake.
I attempted to come up with an Englishy API but failed to notice how
fluid English is around the ordering of prepositional phrases.

I attempted to make it clear via indentation in the example code, but
that may be too subtle.

I'll rewrite the examples to not use chaining so extensively.

doa...@googlemail.com

unread,
Aug 11, 2013, 8:40:34 AM8/11/13
to owasp-java-html-...@googlegroups.com, doa...@googlemail.com, mikes...@gmail.com
I have reformatted my sanitizer so it makes it clearer to me:


.allowAttributes("src").matching(YOUTUBE).onElements("iframe")
.allowAttributes("height", "width").matching(NUMBER_OR_PERCENT).onElements("iframe")
.allowAttributes("allowfullscreen").matching(Pattern.compile("allowfullscreen")).onElements("iframe")

Thanks for the API - It will become a standard part of my toolbox from now on. I am using it in a couple of places now and it is very useful to me.

James Manico

unread,
Aug 11, 2013, 11:52:36 AM8/11/13
to owasp-java-html-...@googlegroups.com, doa...@googlemail.com, mikes...@gmail.com
Great news! Is there any place you might suggest where we should augment the documentation with what you have learned? 

Aloha,
--
Jim Manico
@Manicode
--
You received this message because you are subscribed to the Google Groups "OWASP Java HTML Sanitizer Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email to owasp-java-html-saniti...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

doa...@googlemail.com

unread,
Aug 11, 2013, 3:51:01 PM8/11/13
to owasp-java-html-...@googlegroups.com, doa...@googlemail.com, mikes...@gmail.com, j...@manico.net
To be honest I think I was just a little confused by looking at a code base that I hadn't worked with before. The way you indented the formatting is a very standard way of showing your intention of usage, and I can see that the JavaDocs use the format:

.allowAttributes("href").onElements("a")

which is the way that I eventually formatted my class anyway. I hadn't looked at the JavaDoc as I had just used the example given here:

http://code.google.com/p/owasp-java-html-sanitizer/source/browse/trunk/src/main/org/owasp/html/examples/EbayPolicyExample.java

I think that having it on one line is easier to read but then you will run into problems over the 80, 120 etc character limit; it is also just my personal preference! All in all I think I just made a silly error and not anything wrong with the documentation.
To unsubscribe from this group and stop receiving emails from it, send an email to owasp-java-html-sanitizer-support+unsubscribe@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages