Library not able to support all Internationalized characters.

kevinp...@gmail.com

Oct 19, 2019, 10:28:04 AM
to OWASP Java HTML Sanitizer Support
Hi, 

We have some REST APIs that accept a JSON body. We are using this library to prevent any HTML content from being passed as strings in the JSON body.
The problem we are facing is that we also support internationalized characters. The library seems to work for most multi-byte characters but fails beyond a certain range.

For example:

1) "𐌸" ==> "𐌸"
2) "中文" ==> "中文"
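The "certain range" may line up with the boundary of the Basic Multilingual Plane: the character in example 1 is U+10338, which Java stores as a surrogate pair, while both characters in example 2 fit in a single `char`. That is an interpretation of the examples, not something stated in this thread. A minimal sketch using only the JDK (the class and method names are hypothetical) for checking whether an input contains such characters:

```java
public class CodePointCheck {
    /** Returns true if the string contains any code point above U+FFFF. */
    public static boolean hasSupplementary(String s) {
        return s.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        // U+10338, the character in example 1, written as its UTF-16 surrogate pair.
        String gothic = "\uD800\uDF38";
        // U+4E2D U+6587, the two characters in example 2.
        String chinese = "\u4E2D\u6587";

        System.out.println(hasSupplementary(gothic));   // true
        System.out.println(hasSupplementary(chinese));  // false
        // One code point, but two UTF-16 code units.
        System.out.println(gothic.codePointCount(0, gothic.length())); // 1
        System.out.println(gothic.length());                           // 2
    }
}
```

A check like this can help narrow down which inputs are affected before looking at the escaping and unescaping steps.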

Because of this, we are having issues supporting internationalized characters. Is there a way to fix this so that we receive the same multi-byte character back after sanitization?

Regards,
Kevin

Mike Samuel

Oct 21, 2019, 10:32:50 AM
to OWASP Java HTML Sanitizer Support
Is the problem that the BOM is being encoded?

--
You received this message because you are subscribed to the Google Groups "OWASP Java HTML Sanitizer Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email to owasp-java-html-saniti...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/owasp-java-html-sanitizer-support/11351892-99a4-4e33-85c9-d5219b03431f%40googlegroups.com.

kevinp...@gmail.com

Oct 22, 2019, 9:30:56 AM
to OWASP Java HTML Sanitizer Support
Hi Mike, 

I figured out the issue. The problem was in the Spring library we used to unescape HTML content, not in the HTML sanitizer.
Replacing Spring's HtmlUtils with Apache Commons Text's StringEscapeUtils solved the problem.
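The thread does not say why the unescape step failed. One plausible explanation, offered here as an assumption, is that a decoder which converts a numeric character reference with a cast to `char` keeps only the low 16 bits, so anything above U+FFFF comes back as a different character. The sketch below uses only the JDK and does not reproduce any library's actual code; the class and method names are hypothetical.

```java
public class NumericRefDecode {
    /** Decodes by casting to char: only the low 16 bits of the value survive. */
    public static String decodeWithCharCast(int codePoint) {
        return new StringBuilder().append((char) codePoint).toString();
    }

    /** Decodes as a full code point, emitting a surrogate pair when needed. */
    public static String decodeAsCodePoint(int codePoint) {
        return new StringBuilder().appendCodePoint(codePoint).toString();
    }

    public static void main(String[] args) {
        int thiuth = 0x10338;            // numeric value of the character in example 1
        String expected = "\uD800\uDF38"; // its UTF-16 surrogate pair

        // The cast truncates 0x10338 to 0x0338, a different character.
        System.out.println(decodeWithCharCast(thiuth).equals(expected)); // false
        System.out.println(decodeAsCodePoint(thiuth).equals(expected));  // true

        // Below U+10000 both approaches agree, matching the report that
        // most characters round-trip correctly.
        System.out.println(
            decodeAsCodePoint(0x4E2D).equals(decodeWithCharCast(0x4E2D))); // true
    }
}
```

If that is what happened, it would fit the symptom of most characters surviving the round trip while those beyond a certain range do not.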

Thank you for your prompt reply. 

Regards, 
Kevin  

Mike Samuel

Oct 22, 2019, 9:31:45 AM
to OWASP Java HTML Sanitizer Support
Np.  I'm glad you got it sorted.
