Handling mojibake

jayri...@gmail.com

Dec 14, 2025, 9:33:31 PM (5 days ago) Dec 14
to XMPie Interest Group
Hi guys - please excuse the stupid question I have :)

Is there an elegant way of handling characters from different encodings that get interpreted as mojibake in the final output?

Stuff like typographer's quotation marks and em dashes typically ends up as mojibake.

J

Stephen Couch

Dec 14, 2025, 10:10:43 PM (5 days ago) Dec 14
to xmpie...@googlegroups.com
The best solution is to identify what the character is and select a font that has the relevant glyph.

This video explains more about it from the point of view of emojis:  XMPie e-Learning - uCreate Print Training - Working with Emojis

If you are running into problems with the PDF compliance error preventing production (and you don't care about the missing characters), you can change the advanced output settings to use a lower compliance level. (I think that is in the video too?)


jayri...@gmail.com

Dec 14, 2025, 10:21:39 PM (5 days ago) Dec 14
to XMPie Interest Group
Ah.. not emoji issue (unless I'm missing something).

It's when, say, a typographer's quotation mark (a curly quotation mark as opposed to a straight one) gets interpreted as something like:

That’s nice → Thatâ€™s nice

To avoid that, we'd have the following instead (note the different apostrophe):
That's nice
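
For anyone who wants to script that workaround, here is a minimal Python sketch, assuming the data can be pre-processed before it reaches the XMPie data source. The `ASCII_FALLBACKS` table and the `to_ascii_punctuation` helper are hypothetical names made up for illustration, and the list of characters is only a starting point.

```python
# Hypothetical mapping: extend it to cover whatever characters the data contains.
ASCII_FALLBACKS = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark (curly apostrophe)
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2013": "-",   # en dash
    "\u2014": "--",  # em dash
})


def to_ascii_punctuation(text: str) -> str:
    """Replace typographic punctuation with plain ASCII equivalents."""
    return text.translate(ASCII_FALLBACKS)


print(to_ascii_punctuation("That\u2019s nice"))
```

This sidesteps the encoding mismatch rather than fixing it: the typographic characters are lost, and any other non-ASCII character (accented letters, for example) would still be at risk.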

west-digital.fr

Dec 15, 2025, 1:53:55 AM (4 days ago) Dec 15
to XMPie Interest Group
Hello,

Thanks for providing a sample.
Do I understand correctly that the behavior you described can be seen both in Adobe InDesign (on your screen) and in the PDF output files, in text frames that are tagged with text Content Objects?
If so, I would guess that the issue is caused by the encoding of your data source, particularly (but not only) if your data source is a CSV.
Your data source may be encoded in UTF-8, whereas your software may consider that it's encoded in a strictly single-byte encoding such as Windows-1252.
If you use uDirect 26.0, there is a new feature that indeed lets you specify the default encoding for non-Unicode files.
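
To illustrate the mismatch described above, here is a small Python sketch, independent of XMPie, that reproduces the effect of reading UTF-8 bytes as Windows-1252 and then reverses it. The `repair` helper is a hypothetical name, and it assumes the text went through exactly one wrong decode.

```python
# Sketch: reproduce and reverse UTF-8 text that was misread as Windows-1252.
original = "That\u2019s nice"  # U+2019 RIGHT SINGLE QUOTATION MARK

# The curly apostrophe is three bytes in UTF-8 (E2 80 99). Decoding those
# bytes with a single-byte codec yields three separate characters instead.
garbled = original.encode("utf-8").decode("cp1252")


def repair(text: str) -> str:
    """Undo one round of UTF-8-read-as-cp1252; return the input unchanged
    if the round trip does not apply."""
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


print(garbled)
print(repair(garbled))
```

One known gap: Python's `cp1252` codec leaves byte 0x9D undefined, so a closing curly double quote (UTF-8 bytes E2 80 9D) does not round-trip this way. Fixing the encoding at the source, or declaring it correctly when the file is read, is more reliable than repairing text afterwards.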

jayri...@gmail.com

Dec 17, 2025, 8:10:39 PM (2 days ago) Dec 17
to XMPie Interest Group
Yes, I saw that part in uCreate 26. However, I'm fairly sure this is being introduced by the mailing validation software we are using from Datatools, so maybe a different output format from it would handle these issues better. I was hoping XMPie might have a more elegant way to handle this (read: allow me to be lazy :D)

J
