How to avoid encoding of special chars?


sk...@clearslide.com

Dec 8, 2014, 12:25:09 PM
to owasp-java-html-...@googlegroups.com
When sanitizing the inputs with HTML Sanitizer, it is encoding the following special chars:

", &, \, +, <, >, =, @, '

This is undesirable in some cases.

new NoHTMLPolicyBuilder().toFactory().sanitize("\", &, \\, +, <, >, =, @, '");
becomes
&#34;, &amp;, \, &#43;, &lt;, &gt;, &#61;, &#64;, &#39;

How can I avoid encoding of those special chars?

Thanks,

sk...@clearslide.com

Dec 9, 2014, 11:36:23 AM
to owasp-java-html-...@googlegroups.com, sk...@clearslide.com
The correct code should be as below:
new HtmlPolicyBuilder().toFactory().sanitize("\", &, \\, +, <, >, =, @, '");

This will encode special chars.
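
For readers without the library at hand, the mapping reported in this thread can be reproduced with a small stand-alone sketch. The class and method names below (TextEncodingSketch, encode) are hypothetical and this is not the sanitizer's implementation; it only mirrors the character-to-reference mapping shown in the output above, and it leaves the backslash untouched, as that output does.

```java
// Hypothetical illustration of the mapping reported in this thread.
// This is NOT the OWASP sanitizer's code, only a stand-in that produces
// the same character references for the characters listed above.
final class TextEncodingSketch {

    static String encode(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (char c : text.toCharArray()) {
            switch (c) {
                case '"':  out.append("&#34;"); break;
                case '&':  out.append("&amp;"); break;
                case '+':  out.append("&#43;"); break;
                case '<':  out.append("&lt;");  break;
                case '>':  out.append("&gt;");  break;
                case '=':  out.append("&#61;"); break;
                case '@':  out.append("&#64;"); break;
                case '\'': out.append("&#39;"); break;
                default:   out.append(c);       // backslash and others pass through
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Note the doubled backslash: a lone "\," is not a valid Java escape.
        System.out.println(encode("\", &, \\, +, <, >, =, @, '"));
    }
}
```

Each of these references decodes back to the original character when a browser parses the result as HTML.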

kme...@exoplatform.com

Aug 18, 2016, 7:31:32 AM
to OWASP Java HTML Sanitizer Support, sk...@clearslide.com
I have the same problem and I need to avoid encoding special chars.

Mike Samuel

Aug 18, 2016, 7:32:29 AM
to OWASP Java HTML Sanitizer Support

How is this causing problems?


--
You received this message because you are subscribed to the Google Groups "OWASP Java HTML Sanitizer Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email to owasp-java-html-sanitizer-support+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Mike Samuel

Aug 18, 2016, 7:33:59 AM
to OWASP Java HTML Sanitizer Support

Same question for you, kmenzli, how is this causing problems?



Khemais Menzli

Aug 18, 2016, 11:06:41 AM
to owasp-java-html-...@googlegroups.com
When I enter text in my input field, for example "hello everyone + have a nice day", the output I get is "hello everyone &#43; have a nice day".
I want to avoid escaping special chars.

eXo Platform

khemais menzli / PreSales Director 
kme...@exoplatform.com / (216) 28 71 47 24

eXo Platform 
Tunisia 
http://www.exoplatform.com

Twitter Google Plus github

This e-mail message may contain confidential or legally privileged information and is intended only for the use of the intended recipient(s). Any unauthorized disclosure, dissemination, distribution, copying or the taking of any action in reliance on the information herein is prohibited. E-mails are not secure and cannot be guaranteed to be error free as they can be intercepted, amended, or contain viruses. Anyone who communicates with us by e-mail is deemed to have accepted these risks. eXoPlatform is not responsible for errors or omissions in this message and denies any responsibility for any damage arising from the use of e-mail. Any opinion and other statement contained in this message and any attachment are solely those of the author and do not necessarily represent those of the company.




Mike Samuel

Aug 18, 2016, 11:14:33 AM
to OWASP Java HTML Sanitizer Support
In my browser,

<form>
<input name=inp value="&#43;">
<textarea name=ta>&#43;</textarea>
<button type=submit>submit</button>
</form>

both show text fields that contain '+' characters, and when the submit button is clicked, I get the query string

    ?inp=%2B&ta=%2B

which properly encodes pluses per

$ python -c 'print(chr(0x2B))'
+
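
The same check can be sketched in Java, the language of the sanitizer. This is only an illustration using the standard URLEncoder class; the class name PlusQuerySketch and the helper formEncode are made up for this example, and the field names inp and ta come from the form above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Illustrative sketch: form-encode the value the browser holds after it has
// decoded "&#43;" to "+". Names here are hypothetical.
final class PlusQuerySketch {

    static String formEncode(String value) {
        try {
            // application/x-www-form-urlencoded encoding, as used for query strings
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required of every Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The field value is a literal plus sign, not the text "&#43;"
        System.out.println("?inp=" + formEncode("+") + "&ta=" + formEncode("+"));
        // prints ?inp=%2B&ta=%2B
    }
}
```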

So, as far as I can tell, the problem is not in the sanitizer. The text node it produces, `&#43;`, is semantically equivalent to the input text node `+`, except that the former is robust against UTF-7 attacks ( https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet#UTF-7_encoding ).
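
To make "semantically equivalent" concrete, here is a minimal sketch of what an HTML parser does with such a text node. It handles only decimal numeric character references (not named or hexadecimal ones), and the names CharRefSketch and decodeDecimalRefs are hypothetical, not part of any library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Simplified stand-in for the character-reference decoding an HTML parser
// performs on text nodes. Decimal references only; illustrative, not complete.
final class CharRefSketch {

    // Up to six digits keeps the value below the Unicode maximum (1114111).
    private static final Pattern DECIMAL_REF = Pattern.compile("&#(\\d{1,6});");

    static String decodeDecimalRefs(String html) {
        Matcher m = DECIMAL_REF.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = Integer.parseInt(m.group(1));
            String decoded = new String(Character.toChars(codePoint));
            // quoteReplacement guards against '$' and '\' in the decoded text
            m.appendReplacement(out, Matcher.quoteReplacement(decoded));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decodeDecimalRefs("hello everyone &#43; have a nice day"));
        // prints hello everyone + have a nice day
    }
}
```

If the page shows the literal text `&#43;` instead, something between the sanitizer and the browser is most likely treating the sanitized output as plain text rather than as HTML.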


If you think this is a legit bug, please file an issue at https://github.com/OWASP/java-html-sanitizer/issues and include

1) The HTML that you're sanitizing.
2) The HTML of the page that the sanitized HTML is embedded in.

If you can include details about how you're generating the page containing the sanitized HTML, that'd be great too.

Jim Manico

Aug 19, 2016, 1:36:01 PM
to owasp-java-html-...@googlegroups.com, sk...@clearslide.com

This is not a problem from what Mike and I see. If you are using the library properly (for sanitizing and rendering untrusted HTML in a browser), then the HTML-encoded special characters will display properly.

I am still concerned about your issue. I don't understand why an encoded special character is a problem for you. What browser are you using? What does the *rendered* HTML look like?

Removing the special character encoding will weaken the security properties of this library, so it's almost certainly not going to happen... :(

Respectfully, Jim

Khemais Menzli

Aug 22, 2016, 4:33:58 AM
to owasp-java-html-...@googlegroups.com, sk...@clearslide.com
I'm seeing this behavior in both Chrome and Firefox.
Again, if I understood you correctly, escaping special chars is normal behavior, BUT from a rendering standpoint there is an issue: escaped chars should be displayed correctly in browsers (&#43; should render as +).
In my database the special chars are escaped, BUT they are also still escaped in my HTML view (which should never happen). Maybe this is because I'm using a Groovy template in my view to generate the HTML?


Jim Manico

Aug 22, 2016, 5:14:47 AM
to owasp-java-html-...@googlegroups.com, sk...@clearslide.com

Yes, you have to disable escaping in Groovy for the variable that contains the HTML you want to render. Here is a discussion of the details:

http://justthesam.com/2010/06/grails-gsp-html-escaping-confusion/
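
The symptom described in this thread is consistent with double escaping. The sketch below assumes, hypothetically, that the view layer HTML-escapes every interpolated value; templateEscape is a made-up stand-in for that step and is not Groovy's or Grails' actual code.

```java
// Sketch of double escaping, under the assumption that the template layer
// escapes interpolated values. All names here are hypothetical.
final class DoubleEscapeSketch {

    // Minimal HTML escaper standing in for a template engine's auto-escaping.
    static String templateEscape(String s) {
        return s.replace("&", "&amp;")   // must run first so later entities are not re-escaped
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        // Output of the sanitizer, as reported earlier in the thread
        String sanitized = "hello everyone &#43; have a nice day";

        // Escaped a second time by the (assumed) template layer
        System.out.println(templateEscape(sanitized));
        // prints hello everyone &amp;#43; have a nice day
    }
}
```

A browser decodes `&amp;#43;` to the literal text `&#43;`, which matches what the page shows. Emitting the sanitized string without the second escaping step leaves `&#43;` in the markup, and the browser then displays it as `+`.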


-- 
Jim Manico
Manicode Security
https://www.manicode.com

Khemais Menzli

Aug 22, 2016, 5:19:08 AM
to owasp-java-html-...@googlegroups.com, sk...@clearslide.com
Again, thank you for your time and help.
I'll give it a try and then get back to you.


Khemais Menzli

Aug 22, 2016, 8:06:04 AM
to owasp-java-html-...@googlegroups.com, Steve Kim
I get the same behavior even after replacing $ with <%=%> in my Groovy file.



Jim Manico

Aug 22, 2016, 1:02:24 PM
to owasp-java-html-...@googlegroups.com, Steve Kim

This is a Groovy issue. In order to display HTML in Groovy you need to disable the encoding for that variable.

- Jim
