Getting a lot of &#65279;


Hunter Sherman

Jul 23, 2013, 11:57:59 AM
to snape...@googlegroups.com
I'm getting a lot of &#65279; characters at the beginning of lines when working with SnapEditor. My HTML purifier is turning them into question marks.

Are these characters expected? If so, what is their purpose?

Thanks,
Hunter

Wesley Wong

Jul 23, 2013, 7:34:11 PM
to snape...@googlegroups.com
Hi Hunter,

This HTML entity is known as the zero-width no-break space. We use this to reserve space in empty blocks. Unfortunately, depending on which browser you're in, if a block is empty and you move the cursor away, you can never get into it again, but it's still there. We use this special HTML entity to preserve space so that you can move the cursor back into the block.

The zero-width no-break space should not cause any visual differences. The question mark appears because the HTML entity is being converted into the raw Unicode character. A thorough discussion can be found in this other thread.

Basically, what's happening is that the data coming from SnapEditor contains &#65279; and your HTML purifier is changing each one to \ufeff. If you're using a database that cannot (or is not configured to) store Unicode, it barfs when it sees this character and changes it to a question mark. Note that if the database can store Unicode, you won't see the question marks.
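To make the chain of events concrete, here is a minimal Python sketch of the same effect. It assumes the entity is written as &#65279;, uses html.unescape as a stand-in for the purifier's entity decoding, and uses a Latin-1 round trip as a stand-in for a storage layer that cannot hold Unicode. The function names are hypothetical, not part of SnapEditor or any purifier.

```python
import html


def decode_entities(markup: str) -> str:
    """Stand-in for a purifier that turns character references into raw characters."""
    return html.unescape(markup)


def store_as_latin1(text: str) -> str:
    """Stand-in for a non-Unicode store: unencodable characters become '?'."""
    return text.encode("latin-1", errors="replace").decode("latin-1")


editor_output = "<p>&#65279;</p>"          # what the editor is assumed to emit
purified = decode_entities(editor_output)   # entity becomes the raw U+FEFF character
stored = store_as_latin1(purified)          # U+FEFF cannot be encoded, so it is replaced

print(repr(purified))  # '<p>\ufeff</p>'
print(stored)          # <p>?</p>
```

The entity itself is plain ASCII, which is why it survives storage untouched until something decodes it.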

To fix this, before storing in the database, you can replace all instances of \ufeff with &#65279; again, or fix the database so it will properly store Unicode.
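The first option might look something like the following Python sketch. It assumes the entity form &#65279; and that the replacement runs on the purifier's output just before the database write; reencode_zwnbsp is a hypothetical helper name.

```python
ZWNBSP = "\ufeff"            # raw zero-width no-break space
ZWNBSP_ENTITY = "&#65279;"   # assumed entity form of the same character


def reencode_zwnbsp(markup: str) -> str:
    """Turn every raw U+FEFF back into its ASCII-only character reference."""
    return markup.replace(ZWNBSP, ZWNBSP_ENTITY)


purified = "<p>\ufeffHello</p><p>\ufeff</p>"
safe = reencode_zwnbsp(purified)

print(safe)            # <p>&#65279;Hello</p><p>&#65279;</p>
safe.encode("ascii")   # succeeds here: no non-ASCII characters are left to mangle
```

Because the result of the replacement contains no raw U+FEFF, the stored markup no longer depends on how the database handles that character.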

Hope that helps!
Wesley

Hunter Sherman

Jul 30, 2013, 9:22:46 AM
to snape...@googlegroups.com
Ah I see, that makes sense.

Our DB is storing Unicode, but our HTML purifier is doing what you described. I've found a nice workaround for it, thanks!

Wesley Wong

Jul 30, 2013, 12:41:30 PM
to snape...@googlegroups.com
Hi Hunter,

Glad to hear it worked out.

Cheers,
Wesley