How to enter less-common HTML entities in Text plugin?

Brian Rutledge

Mar 3, 2017, 3:13:44 PM
to django CMS developers
I'm trying to insert a "✓" (check mark) in a djangocms-text-ckeditor plugin. I've tried pasting it, and entering it as an HTML entity (in two different forms) in the "Source" view. No matter what method I use, CKEditor renders that character, but when I save the plugin, the character gets changed to "?". I've played around with various settings suggested by https://github.com/divio/djangocms-text-ckeditor#configurable-sanitizer, to no avail.

Any thoughts?

Thanks,
Brian

Sacha Müller Philipps Sohn

Mar 6, 2017, 3:15:16 AM
to django CMS developers
You are referring to UTF-8 symbols. It could be that the font used in CKEditor supports this character, so it gets displayed properly there. The font used on your website might differ from CKEditor's font and might not support this character. As soon as a character isn't in a font's character set, it gets rendered as a question mark.

Brian Rutledge

Mar 6, 2017, 9:39:05 AM
to django CMS developers
I don't think that's what's happening. First, I'm using the same font on my website and in CKEditor. In CKEditor's "Source" view, when I enter the entity for ✓, switching back to the WYSIWYG view renders the check mark. However, it looks like the text that gets stored in the database is "?", so subsequent edits of the Text plugin show that character.

I see similar behavior from entities like ← and ′. However, for more common entities like ©, it looks like there's an encoding step somewhere, because the value in the database is \xa9.

I know CKEditor has some settings related to handling entities, and djangocms-text-ckeditor does its own sanitization using html5lib, but I don't know if/how those things are resulting in this behavior.
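
One way to check whether a single-byte character set could explain this pattern is a small Python sketch like the one below. It is only a simulation under stated assumptions: it does not call CKEditor, html5lib, or the database. The helper name `store_as_latin1` is hypothetical, and the entity spellings (`&larr;`, `&prime;`, `&#10003;`) are assumed forms of the characters mentioned above.

```python
import html


def store_as_latin1(markup: str) -> bytes:
    """Decode HTML entities to characters, then encode the result as latin-1.

    Characters that latin-1 cannot represent are replaced with b"?".
    This only imitates a lossy single-byte storage layer; it is an
    assumption, not the actual behavior of any particular database.
    """
    text = html.unescape(markup)
    return text.encode("latin-1", errors="replace")


# Assumed entity spellings for the characters discussed in this thread.
for entity in ("&copy;", "&larr;", "&prime;", "&#10003;"):
    print(entity, "->", store_as_latin1(entity))
```

In this sketch only © survives, as the single byte \xa9, while the other three characters come out as "?", which matches the values described above.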

Brian Rutledge

Mar 8, 2017, 4:20:59 PM
to django CMS developers
Update: This looks like an issue with MySQL character sets. I haven't fully wrapped my head around it, but it seems my default character set is latin1. It also looks like djangocms_text_ckeditor.html.clean_html (called from Text.save()) converts HTML entities to the characters they represent. That produces Unicode characters, some of which latin1 cannot represent, and that resulted in an error from MySQL. I found some guidance on changing the character set to utf8 for a single column, which resolved this issue. Fixing my entire database looks more involved.
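
For anyone hitting the same problem, the single-column change might look roughly like the statement built below. This is a sketch, not the exact command used here: the helper `build_alter_column_sql` is hypothetical, and the table name `djangocms_text_ckeditor_text`, the column name `body`, and the `longtext NOT NULL` column definition are assumptions based on Django's usual defaults for a TextField on MySQL. Check `SHOW CREATE TABLE` on your own database, and take a backup, before running anything like it.

```python
import re

# Plain identifiers only, so the generated statement cannot be
# broken by stray quotes or whitespace.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def build_alter_column_sql(table: str, column: str, charset: str = "utf8") -> str:
    """Build a MySQL statement that changes one text column's character set.

    The column type (longtext NOT NULL) is an assumption and has to match
    the existing column definition in the real table.
    """
    for name in (table, column, charset):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        f"ALTER TABLE `{table}` MODIFY `{column}` longtext "
        f"CHARACTER SET {charset} NOT NULL"
    )


# Assumed table and column names for the Text plugin.
print(build_alter_column_sql("djangocms_text_ckeditor_text", "body"))
```

The statement is only built and printed, never executed, so it can be reviewed before being run in a MySQL client.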