[reportlab-users] Error using encoding="utf-8" to output RML to PDF

Robert Sullivan

Jun 8, 2017, 6:37:55 AM

to reportl...@lists2.reportlab.com

Hello,

I'd like to use some entities such as ÷, × and ≈ in my RML template that outputs to a PDF but I get an error:

rml2pdf.go(rml, outputFileName=buffer)
File "rlextra-3.3.29/src/rlextra/rml2pdf/rml2pdf.py", line 6018, in go
pyRXPU.error: Declared encoding UTF-8 is incompatible with UTF-16 which was used to read it

I've read in this previous message https://pairlist2.pair.net/pipermail/reportlab-users/2017-March/011607.html that PDF may not allow UTF-8 so I'm a bit stuck as to what to do.

Many thanks,

Robert

Robin Becker

Jun 8, 2017, 7:20:03 AM

to reportlab-users, Robert Sullivan

Hi Robert,

can you email me (attached as a zip) the smallest possible example where this
happens? Also what version of python/platform are you using?

I have tested by just inserting the ÷ into a simple test rml

<!DOCTYPE document SYSTEM "../rml.dtd">
<document filename="tdivide.pdf" invariant="1">

  <stylesheet>
    <paraStyle
      name="intro"
      fontName="Helvetica"
      fontSize="12"
      leading="12"
    />
  </stylesheet>

  <pageDrawing>
    <drawCentredString x="4.1in" y="5.8in">
      Hello World. First Page Drawing ÷
    </drawCentredString>
  </pageDrawing>

</document>

that works for me with python 2.7 3.3...

I suspect you have been more formal and have put in some encodings at the top
somehow. The encoding declared appears to be utf8 and the text of the rml appears
to be something other than utf8.
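
One way to check for this kind of mismatch is to compare the encoding named in the XML declaration with what the leading bytes suggest. The following is only a rough sketch using the Python standard library; the helper names (sniff_codec, declared_encoding, family, encodings_agree) are hypothetical and are not part of rlextra or pyRXP, and the check is only meaningful for telling UTF-8 apart from UTF-16.

```python
import codecs
import re
from typing import Optional


def sniff_codec(data: bytes) -> str:
    """Guess a codec for XML bytes from a BOM or the first two bytes."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if data.startswith(b"<\x00"):   # BOM-less little-endian UTF-16
        return "utf-16-le"
    if data.startswith(b"\x00<"):   # BOM-less big-endian UTF-16
        return "utf-16-be"
    return "utf-8"                  # assumption: anything else is UTF-8


def declared_encoding(data: bytes) -> Optional[str]:
    """Return the encoding named in the <?xml ...?> declaration, if any."""
    head = data[:200].decode(sniff_codec(data), errors="replace")
    match = re.search(r"<\?xml[^>]*encoding=[\"']([A-Za-z0-9._-]+)[\"']", head)
    return match.group(1) if match else None


def family(name: str) -> str:
    """Collapse codec variants (utf-16-le, utf-8-sig, ...) to a family name."""
    normalized = codecs.lookup(name).name
    for prefix in ("utf-16", "utf-8"):
        if normalized.startswith(prefix):
            return prefix
    return normalized


def encodings_agree(data: bytes) -> bool:
    """True when the declaration (if present) matches the sniffed bytes."""
    declared = declared_encoding(data)
    if declared is None:
        return True  # nothing declared, so nothing to contradict
    return family(declared) == family(sniff_codec(data))
```

Calling encodings_agree on the exact bytes about to be parsed should return False in the situation described by the error message, where the header says one thing and the bytes are another.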

--
Robin Becker
_______________________________________________
reportlab-users mailing list
reportl...@lists2.reportlab.com
https://pairlist2.pair.net/mailman/listinfo/reportlab-users

Robert Sullivan

Jun 9, 2017, 11:06:43 AM

to Robin Becker, reportlab-users

Hi Robin,

In preparing the smallest possible example I discovered I don't get the error when using Helvetica, only with Gotham. So the entities are working fine.

Here is the docinit snippet and we are using utf-16 encoding.

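<docinit>
{% if font == "Gotham" %}
<registerTTFont faceName="{{ font }}" fileName="{{ asset_root }}/{{ font }}-Book.ttf" />
<registerTTFont faceName="{{ font }}-Bold" fileName="{{ asset_root }}/{{ font }}-Bold.ttf" />
<registerTTFont faceName="{{ font }}-Italic" fileName="{{ asset_root }}/{{ font }}-BookItalic.ttf" />
<registerTTFont faceName="{{ font }}-Black" fileName="{{ asset_root }}/{{ font }}-Black.ttf" />
<registerFontFamily
normal="Gotham"
bold="Gotham-Bold"
italic="Gotham-Italic"
boldItalic="Gotham-Black" />
{% endif %}
</docinit>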

Also using Python3.6 and Django 1.10.5, fyi, so the font is declared in the context variable that gets passed to this template.

I'll look more into the Gotham font but any pointers would be appreciated.

Many thanks,

Robert

Robin Becker

Jun 9, 2017, 12:01:36 PM

to Robert Sullivan, reportlab-users

On 09/06/2017 16:06, Robert Sullivan wrote:
> Hi Robin,
>
> In preparing the smallest possible example I discovered I don't get the
> error when using Helvetica just with Gotham. So the entities are working
> fine.
>
> Here is the docinit snippet and we are using utf-16 encoding.
>
> <docinit>
> {% if font == "Gotham" %}
> <registerTTFont faceName="{{ font }}" fileName="{{ asset_root
> }}/{{ font }}-Book.ttf" />
> <registerTTFont faceName="{{ font }}-Bold" fileName="{{
> asset_root }}/{{ font }}-Bold.ttf" />
> <registerTTFont faceName="{{ font }}-Italic" fileName="{{
> asset_root }}/{{ font }}-BookItalic.ttf" />
> <registerTTFont faceName="{{ font }}-Black" fileName="{{
> asset_root }}/{{ font }}-Black.ttf" />
> <registerFontFamily
> normal="Gotham"
> bold="Gotham-Bold"
> italic="Gotham-Italic"
> boldItalic="Gotham-Black" />
> {% endif %}
> </docinit>
>

Hi, could this just be the use of the font or asset_root interpolations? I don't
know exactly what django templates are supposed to output in this case. I always
just use preppy to prepare rml.

Have you actually saved the output that is produced from the template and
checked it is all in utf-16? I think that means we need unicode as the django
template output.

There's a bug in the current go method when saveRml='path' is passed in as it
will try to save the rml with a 'w' mode argument and you'll just get an error
there.

So you should probably save the produced RML yourself with a 'wb' mode argument.
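
A minimal sketch of saving the rendered RML yourself in binary mode, so the bytes on disk can be inspected. It assumes the template output is a Python 3 str; the function names, the file name debug.rml and the sample text are made up for illustration, and nothing here calls rml2pdf.

```python
import os
import tempfile


def save_rml(rml_text: str, path: str, encoding: str = "utf-8") -> bytes:
    """Encode the rendered RML explicitly and write it with 'wb'."""
    data = rml_text.encode(encoding)
    with open(path, "wb") as fh:   # binary mode: no implicit re-encoding
        fh.write(data)
    return data


def read_back(path: str) -> bytes:
    """Read the saved file as raw bytes for inspection."""
    with open(path, "rb") as fh:
        return fh.read()


def demo() -> bool:
    """Round-trip a small utf-16 document through a temporary file."""
    text = '<?xml version="1.0" encoding="utf-16"?><document>\u00f7</document>'
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "debug.rml")   # hypothetical file name
        written = save_rml(text, path, "utf-16")
        return read_back(path) == written
```

Looking at the first few bytes of the saved file (a BOM, or NUL bytes between characters) is usually enough to tell whether it really is utf-16.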

I think this is almost certainly a problem in the content of the RML rather than
an issue in the font itself. Missing glyphs should just appear as black squares
etc etc.


Robin Becker

Jun 9, 2017, 12:12:11 PM

to Robert Sullivan, reportlab-users

In fact I'm being stupid: if django is producing mixed stuff somehow, then
the templating software should blow up as bytes (utf8) are added to unicode or
vice versa.

If on the other hand you declare utf16 in the rml header, but django is
producing utf8 then you'll likely get this error.

I tried just changing the encoding to utf-16 in the header of one of my test
files (ie it is a bytes file) and I get exactly this error

C:\tmp>rml2pdf bad.rml
Traceback (most recent call last):
  File "c:\code\rlextra\rml2pdf\rml2pdf.py", line 6511, in <module>
    main()
  File "c:\code\rlextra\rml2pdf\rml2pdf.py", line 6503, in main
    dynamicRml=1
  File "c:\code\rlextra\rml2pdf\rml2pdf.py", line 6147, in go
    parsed = parsexml(xmlInputText,eoCB=_eoDTD)
pyRXPU.error: Declared encoding UTF-16 is incompatible with UTF-8 which was used to read it
Internal error, ParserPush failed!

clearly a bytes file cannot be utf16.

This seems to indicate you are actually producing utf8 under some circumstances.
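
If that is what is happening, one possible way to keep the header and the bytes in step is to encode the rendered template with whatever encoding its own XML declaration names. This is a sketch only: encode_as_declared is a hypothetical helper, it assumes the template renders to a Python 3 str, and whether rml2pdf.go is happier with str or bytes input is not something this thread establishes.

```python
import codecs
import re

# Matches the encoding name inside an <?xml ... encoding="..."?> declaration.
_DECL = re.compile(r"<\?xml[^>]*encoding=[\"']([A-Za-z0-9._-]+)[\"']")


def encode_as_declared(rml_text: str, default: str = "utf-8") -> bytes:
    """Encode rendered RML using the encoding its own header declares."""
    match = _DECL.search(rml_text[:200])
    encoding = match.group(1) if match else default
    codecs.lookup(encoding)        # raises LookupError for an unknown name
    return rml_text.encode(encoding)
```

With a header that says utf-16 this produces utf-16 bytes (including a BOM), and with no declaration it falls back to utf-8.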
