img src is being html encoded after a custom policy sets its value


Alexandre Russel

Apr 3, 2017, 12:57:48 PM
to OWASP Java HTML Sanitizer Support
Hi,

I've created an attribute policy that modifies the src of the img tag. It works as expected, but the sanitizer, instead of writing the value as-is into the src attribute in the HTML, encodes it. This makes the value unusable. Why would we want the attribute to be HTML-encoded? How can I avoid this? I really hope there is another way than pre-decoding the value so that the encoded value comes out as expected.

Alex

Jim Manico

Apr 3, 2017, 1:08:59 PM
to owasp-java-html-...@googlegroups.com

Any value that goes in a src attribute is in the attribute context and needs to be HTML entity encoded to ensure safety.

This library is 100% about safety, which requires some careful choices that may affect your functionality.

- Jim


Mike Samuel

Apr 3, 2017, 1:09:16 PM
to OWASP Java HTML Sanitizer Support
On Mon, Apr 3, 2017 at 8:31 AM, Alexandre Russel <alex...@russel.fr> wrote:
> Hi,
>
> I've created an Attribute Policy that modifies the src of the image tag. It
> is working as expected, but the sanitize code, instead of just putting the
> value as src in the html, is encoding it. This makes the value unusable. Why

How is HTML-encoding a URL a problem?

https://jsbin.com/hudekejuyo/edit?html,output shows that HTML encoding
of URLs in HTML attributes leads to equivalent URLs as seen by the
browser.

> would we want attribute to be html encoded ?

We would want attributes to be HTML encoded for correctness.

If you have the URL

http://example.com?a=b&copy=1

then you should HTML escape it thus

http://example.com?a=b&amp;copy=1

Most browsers are smart enough to realize that &copy in a URL
attribute is not a copyright symbol the way it is everywhere else, but
don't rely on that for program correctness.
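To make the round trip concrete, here is a minimal sketch in plain Java. It is not the sanitizer's own encoder: `escapeAttr` and `unescapeAttr` are hypothetical helpers written only for this illustration, and they handle just the five characters that matter inside a quoted attribute value. The point is that decoding the escaped attribute value yields the original URL again, which is what an HTML parser does before the URL is ever fetched.

```java
public class AttrEscapeDemo {
    // Hypothetical helper: escape the characters that are significant
    // inside a quoted HTML attribute value.
    public static String escapeAttr(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&#39;");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Hypothetical helper: reverse escapeAttr, roughly what an HTML parser
    // does when it reads the attribute value back out of the markup.
    public static String unescapeAttr(String s) {
        return s.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&#39;", "'")
                .replace("&amp;", "&"); // last, so "&amp;lt;" becomes "&lt;", not "<"
    }

    public static void main(String[] args) {
        String url = "http://example.com?a=b&copy=1";
        String escaped = escapeAttr(url);
        System.out.println(escaped);                           // http://example.com?a=b&amp;copy=1
        System.out.println(unescapeAttr(escaped).equals(url)); // true
    }
}
```

The escaped form is what belongs in the HTML source; the unescaped form is what the browser uses as the URL.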

> How can I avoid this ? I really
> hope that there is another way than pre-decoding the value so that the
> encoded value would be as expected.

What do you mean pre-decoding? A concrete example would be helpful.

Alexandre Russel

Apr 3, 2017, 1:47:29 PM
to OWASP Java HTML Sanitizer Support, mikes...@gmail.com
This is the value I return from my policy:
https://test-owasp.s3-us-west-1.amazonaws.com/_nIu8KHX.jpeg?AWSAccessKeyId=AKIAJUULYQRWERGSH2IQ&Expires=1491327804&Signature=zRlyPBSWJEdv09SuxLlOFZrfAEc%3D
You can click it; it is valid for 24 hours, and you should see a kitten.

This is the value after the encode:
You'll get an access denied. 

I am using Chrome, and it is not smart enough to show the second link properly.

Alex

Mike Samuel

Apr 3, 2017, 1:51:39 PM
to Alexandre Russel, OWASP Java HTML Sanitizer Support
https://jsbin.com/gequfuzore/edit?html,output has both URLs in <img
src> as per your description.
I see two kittens using Chrome 56.0.2924.87.

Alexandre Russel

Apr 3, 2017, 2:00:58 PM
to mikes...@gmail.com, OWASP Java HTML Sanitizer Support
Thanks a lot for your help, you are completely right. Instead of including the link in a page to test it, I was copy/pasting the link in the URL bar of the browser and there, the browser doesn't make the effort to decode. Pasting the link to colleagues to see which browser would work, I've realised that, same as for you, when you click the link, or show it in an html page, it shows properly. My bad.
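For anyone who hits the same confusion, the difference can be sketched in Java. This is only an illustration under assumptions: the URL and parameter values below are placeholders, `queryParamNames` is a hypothetical helper, and splitting the query on `&` is a simplification of what a server might do. It shows that when the HTML-escaped text is used verbatim as a URL (as when pasted into the address bar), the parameter names change, which would plausibly explain the access-denied response for a signed URL.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public class PastedUrlDemo {
    // Hypothetical helper: list the query parameter names of a URL that is
    // taken verbatim, with no HTML decoding applied first.
    public static List<String> queryParamNames(String rawUrl) {
        String query = URI.create(rawUrl).getRawQuery();
        List<String> names = new ArrayList<>();
        if (query == null) {
            return names;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            names.add(eq < 0 ? pair : pair.substring(0, eq));
        }
        return names;
    }

    public static void main(String[] args) {
        // Placeholder URL; the host and values are made up for this sketch.
        String url = "https://example.com/kitten.jpeg?AWSAccessKeyId=KEY&Expires=1&Signature=SIG";
        // The text as it appears in the HTML source after attribute encoding.
        String htmlEscaped = url.replace("&", "&amp;");

        System.out.println(queryParamNames(url));         // [AWSAccessKeyId, Expires, Signature]
        System.out.println(queryParamNames(htmlEscaped)); // [AWSAccessKeyId, amp;Expires, amp;Signature]
    }
}
```

Inside an HTML page the browser decodes `&amp;` back to `&` before requesting the image, so the first set of names is what the server receives.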

Alex

Mike Samuel

Apr 3, 2017, 2:21:17 PM
to Alexandre Russel, OWASP Java HTML Sanitizer Support
On Mon, Apr 3, 2017 at 2:00 PM, Alexandre Russel <alex...@russel.fr> wrote:
> Thanks a lot for your help, you are completely right. Instead of including
> the link in a page to test it, I was copy/pasting the link in the URL bar of
> the browser and there, the browser doesn't make the effort to decode.
> Pasting the link to colleagues to see which browser would work, I've
> realised that, same as for you, when you click the link, or show it in an
> html page, it shows properly. My bad.

No worries. I'm glad you're closer to solving your problem.