Binary encoded images?

gsilver0

Jun 24, 2016, 4:38:31 PM
to OWASP Java HTML Sanitizer Support
I'm working on a web application that includes an editor which binary encodes images. I'd like to run the generated HTML through the OWASP HTML sanitizer and prevent img tags from specifying a URL, while retaining the ability to use a href tags to link to other pages.

I have the policy mostly in place, but I can't figure out which options to use for binary encoded images (or whether they're supported at all), or how to block images specified by URL.

Any ideas?

Mike Samuel

Jun 24, 2016, 4:41:19 PM
to OWASP Java HTML Sanitizer Support

What do you mean by "binary encoded images"?  Like <img src="data:image/png;...">?

gsilver0

Jun 28, 2016, 1:20:33 PM
to OWASP Java HTML Sanitizer Support, mikes...@gmail.com
Yeah. That's what I meant. 

Are there any options to support this?

Mike Samuel

Jun 28, 2016, 4:06:58 PM
to gsilver0, OWASP Java HTML Sanitizer Support
allowUrlProtocols
(http://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20160526.1/org/owasp/html/HtmlPolicyBuilder.html#allowUrlProtocols)
should let you whitelist the "data" protocol.

You can then allow an attribute with an extra check, thus:

.allowAttributes("src")
.matching(...)
.onElements("img")

There are a number of things you can do in the matching part like allowing

data:image/...

instead of just allowing data:...
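
As a rough sketch of what could go in the matching part for img src, here is one possible pattern that accepts only base64 data URLs for a few image types. The class name, the helper method, and the choice of png/jpeg/gif are assumptions made for illustration; nothing in this thread prescribes them. The example uses only java.util.regex so it can be tried without the sanitizer on the classpath.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the OWASP sanitizer.
final class InlineImageSrc {
    // Assumption: only base64-encoded png, jpeg, and gif data URLs are wanted.
    static final Pattern DATA_IMAGE = Pattern.compile(
        "^data:image/(?:png|jpeg|gif);base64,[A-Za-z0-9+/]+={0,2}$");

    // True only when the whole attribute value matches the pattern.
    static boolean isAllowed(String src) {
        return DATA_IMAGE.matcher(src).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("data:image/png;base64,iVBORw0KGgo="));  // true
        System.out.println(isAllowed("http://example.com/a.png"));            // false
        System.out.println(isAllowed("data:text/html;base64,PHNjcmlwdD4="));  // false
    }
}
```

Assuming the matching overload that takes a java.util.regex.Pattern, DATA_IMAGE could be passed as .matching(InlineImageSrc.DATA_IMAGE) between .allowAttributes("src") and .onElements("img"). Because the pattern accepts nothing but data:image/... values, it also blocks images specified by URL.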

Since allowUrlProtocols("data") allows data URLs anywhere data URLs
are allowed, you might want to also add a matcher to any other URL
attributes you whitelist like

.allowAttributes("href")
.matching(...)
.onElements("a")

that rejects anything with a colon that does not start with http: or
https: or mailto:.
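
The href rule described above could be sketched as a single pattern along these lines. The class and method names are made up for the example, and the case-insensitive scheme match is an assumption rather than something stated in this thread.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the OWASP sanitizer.
final class SafeHref {
    // Either the value starts with http:, https:, or mailto:,
    // or it contains no colon at all (for example a relative link).
    static final Pattern SAFE_HREF = Pattern.compile(
        "(?:(?:https?|mailto):.*|[^:]*)",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // True only when the whole attribute value matches the pattern.
    static boolean isAllowed(String href) {
        return SAFE_HREF.matcher(href).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("https://example.com/page"));  // true
        System.out.println(isAllowed("/relative/path"));            // true
        System.out.println(isAllowed("data:text/html,hi"));         // false
        System.out.println(isAllowed("javascript:alert(1)"));       // false
    }
}
```

Note that this follows the rule literally, so a relative link that happens to contain a colon, such as /a/b:c, is rejected too. Whether that is acceptable depends on the links the editor generates.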

Jim Manico

Jun 29, 2016, 3:06:27 AM
to owasp-java-html-...@googlegroups.com
This is good stuff, would you like these examples put up on the wiki, Mike?

Aloha, Jim
--
Jim Manico
Manicode Security
https://www.manicode.com

Mike Samuel

Jun 29, 2016, 7:41:15 AM
to OWASP Java HTML Sanitizer Support

Sure.  I wouldn't use the term "binary encoded" though.  Some quick searching seems to show that "inline" and "embedded" are used roughly equally to refer to data:image/... in CSS and HTML.

Jim Manico

Jul 6, 2016, 5:49:01 PM
to owasp-java-html-...@googlegroups.com

Added here

https://www.owasp.org/index.php/OWASP_Java_HTML_Sanitizer_Project#tab=Inline_2FEmbedded_Images

Happy to move this (and other examples) to the GitHub wiki as well.

- Jim
