On 10 Aug 2022, at 10:15, Raphael Krupinski wrote:
> Hi,
>
> I'm writing a pandas DataFrame in which a value is a string with an emoji
> (restroom or toothbrush) to an xlsx file.
>
> When I open this file with a proprietary software all the emojis are
> garbled - is shown instead.
The box glyph usually means that the chosen font has no symbol for the code point. Most typefaces don't include emojis, so rendering depends on the OS recognising this and falling back to a font that does.
> So I opened the file with LibreOffice Calc and it all shows fine, and after
> saving the file in LO it also displays properly in the other software.
>
> I've opened the xml contents of both versions and I found that the openpyxl
> turns the emoji to XML entities (e.g. &#129701;) but the LO version has the
> emojis written straight as UTF-8.
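There's no difference: XML's default encoding is UTF-8, so a conforming parser gets the same code point from the literal character and from the entity. The entities are "safer" for the reason above (they are plain ASCII, so they stay readable whatever the font) and mean you don't have to look up the code point separately.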
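
As a rough illustration of that equivalence, here is a minimal sketch using only the standard library's xml.etree.ElementTree. The <t> element name and the variable names are made up for the example and are not taken from any real xlsx part; 129701 is the decimal form of U+1FAA5 (toothbrush).

```python
import xml.etree.ElementTree as ET

# U+1FAA5 TOOTHBRUSH; 0x1FAA5 == 129701 in decimal.
TOOTHBRUSH = "\U0001FAA5"

# Hypothetical fragments: same content, two spellings.
as_reference = b"<t>&#129701;</t>"               # numeric character reference
as_utf8 = "<t>\U0001FAA5</t>".encode("utf-8")    # literal UTF-8 bytes

# With no XML declaration the parser assumes UTF-8.
text_from_reference = ET.fromstring(as_reference).text
text_from_utf8 = ET.fromstring(as_utf8).text

# Both spellings decode to the same one-character string.
print(text_from_reference == text_from_utf8 == TOOTHBRUSH)
```

If a reader shows the emoji for one spelling and a box for the other, that points at the reader's XML handling rather than at the file content.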
> Is that a bug in OpenPyxl (writing emoji as entity), the proprietary
It's not openpyxl, which just uses etree or lxml to write XML.
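
To show where such entities can come from, here is a small sketch with the standard library's ElementTree on its own. It is not openpyxl's actual write path, and the <t> element is only a stand-in: the point is that tostring() defaults to the "us-ascii" encoding and writes anything outside ASCII as a numeric reference, while asking for "utf-8" writes the character itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical element holding a single toothbrush emoji (U+1FAA5).
cell = ET.Element("t")
cell.text = "\U0001FAA5"

# Default encoding is "us-ascii": non-ASCII text becomes a character reference.
default_bytes = ET.tostring(cell)

# Requesting UTF-8 keeps the character as literal bytes.
utf8_bytes = ET.tostring(cell, encoding="utf-8")

print(default_bytes)
print(utf8_bytes)

# Either output parses back to the same text.
round_trip_equal = (
    ET.fromstring(default_bytes).text == ET.fromstring(utf8_bytes).text
)
```

Which form ends up in a given xlsx depends on the serialiser settings used by the writing library, so treat this only as an explanation of the mechanism.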
> software (not reading entity as emoji) or both?
It's most likely to be a combination of the software and the machine it's running on. Basically, avoid this kind of thing if you can.
Charlie
--
Charlie Clark
Managing Director
Clark Consulting & Research
German Office
Sengelsweg 34
Düsseldorf
D- 40489
Tel: +49-203-3925-0390
Mobile: +49-178-782-6226