streamed pyfpdf is stored with wrong fileencoding when using python3

Armin Würtenberger

Apr 21, 2020, 1:28:19 AM

to web2py-users

Hi,

I am developing apps that depend heavily on generating PDF files. This works great using fpdf and Python 2.7. I am now trying to move to Python 3.7, which has been no problem so far. The real challenge for me is generating correct PDF files:

Here's how I did it in Python 2.7:


@service.run
def generatePDF(**kwargs):
    from gluon.contrib.fpdf import fpdf
    from gluon.contrib.fpdf.php import UTF8ToUTF16BE, UTF8StringToArray
    from gluon.contenttype import contenttype

    pdf = fpdf.FPDF("P","mm","a4")
    pdf.l_margin, pdf.r_margin = 20, 10
    pdf.add_font('Sans', '', 'DejaVuSansCondensed.ttf', uni=True)
    pdf.add_font('Sans', 'B', 'DejaVuSansCondensed-Bold.ttf', uni=True)
    pdf.add_font('Sans', 'I', 'DejaVuSansCondensed-Oblique.ttf', uni=True)
    pdf.add_font('Sans', 'BI', 'DejaVuSansCondensed-BoldOblique.ttf', uni=True)
    pdf.add_font('Serif', '', 'DejaVuSerifCondensed.ttf', uni=True)
    pdf.add_font('Serif', 'B', 'DejaVuSerifCondensed-Bold.ttf', uni=True)
    pdf.add_font('Serif', 'I', 'DejaVuSerifCondensed-Italic.ttf', uni=True)
    pdf.add_font('Serif', 'BI', 'DejaVuSerifCondensed-BoldItalic.ttf', uni=True)
    
    pdf.add_page()
    pdf.set_font('Sans','B',20)
    pdf.set_y( 25 )
    pdf.cell( w=pdf.w-(pdf.l_margin+pdf.r_margin), h=25,
              align="C",
              border=1,
              txt="Hello World", ln=2 )

    
    response.headers["Content-Type"]=contenttype(".pdf")
    response.headers['Content-disposition'] = 'attachment; filename=mypdf.pdf'
    
    return pdf.output(dest='S')

The downloaded file is a valid PDF; the file itself is latin1-encoded (checked in vim with :set fileencoding).

If I do the very same using web2py and Python 3.7, I get the PDF file as a byte stream, literally starting with b':


b'%PDF-1.3\n3 0 obj\n<</Type /Page\n/Parent 1 0 R ...

The fileencoding is utf-8, which I (somewhat) understand, since fpdf returns a byte stream.
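The leading b' and the literal \n sequences in that download look like the text form of a Python 3 bytes object. As a minimal sketch (plain Python, no web2py involved; where such a conversion would happen inside the framework is an assumption and not established in this thread), passing bytes through str() produces exactly that shape:

```python
# Stand-in for the first bytes returned by pdf.output(dest='S') on Python 3
payload = b"%PDF-1.3\n3 0 obj\n"

# str() on bytes yields the repr, including the b'...' wrapper and
# escaped newlines, not the decoded content
as_text = str(payload)
print(as_text)

# The result is ordinary ASCII text, which any editor reports as utf-8
as_text.encode("ascii")
```

If this is what happens, the downloaded file is a text representation of the PDF rather than the PDF itself, which would explain why no viewer accepts it.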

To overcome this, I tried to change the last line to:


return pdf.output(dest='S').decode('latin1')

and I get a file that is not a valid PDF; its fileencoding is utf-8.

Once I manually change the fileencoding of the downloaded file to latin1, it becomes a valid and readable pdf-file.
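The behaviour of the decode('latin1') attempt can be reproduced outside web2py. The sketch below assumes the returned str is serialised as UTF-8 somewhere on the way to the browser (consistent with the fileencoding reported above, but not confirmed in this thread); the sample bytes are an illustrative stand-in, not real fpdf output:

```python
# Illustrative stand-in: a PDF header followed by a line of high (non-ASCII) bytes
raw = b"%PDF-1.3\n%\xe2\xe3\xcf\xd3\n"

# latin1 maps every byte 0x00-0xFF to one code point, so decoding never fails
text = raw.decode("latin1")

# Re-encoding as UTF-8 turns each byte >= 0x80 into two bytes,
# which shifts every offset a PDF reader relies on
as_utf8 = text.encode("utf-8")

# Re-encoding as latin1 restores the original bytes exactly
as_latin1 = text.encode("latin1")

print(len(raw), len(as_utf8))
print(as_latin1 == raw, as_utf8 == raw)
```

This matches the observation that converting the downloaded file back to latin1 makes it readable again.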

Using

pdf.output("myfile.pdf",dest='F')

does, however, produce a valid, latin1-encoded file on disk.

I tried different variants as well, but I got stuck and was not able to produce a valid pdf.

Can someone suggest a way to produce a valid download?

At first I suspected some missing response header information, but I did not find one that solved the problem.

Any help would be appreciated!

I am working on macOS, using Safari, but tried Firefox and Chrome as well - same result.

Cheers, Armin

Massimo Di Pierro

Jul 1, 2020, 1:13:09 AM

to web2py-users

Try a few things:

1) set content type to "application/pdf"

2) set content type to "application/pdf; charset=latin1"

3)

from io import BytesIO
return BytesIO(pdf.output(dest='S'))

Armin Würtenberger

Jul 7, 2020, 3:32:22 PM

to web2py-users

Thank you for your response, I tried as suggested, but did not succeed.

[...]
    from io import BytesIO
[...]
    response.headers["Content-Type"]=contenttype(".pdf") # version 1
    #response.headers["Content-Type"]=contenttype("application/pdf") # version 2
    #response.headers["Content-Type"]=contenttype("application/pdf; charset=latin1") # version 3

    response.headers['Content-disposition'] = 'attachment; filename=mypdf.pdf'

    return BytesIO(pdf.output(dest='S')).getvalue()

I tried all 3 versions of the Content-Type header, each combined with the BytesIO response.
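One detail about the snippet above, shown as a small standalone sketch (the payload is a placeholder, not real fpdf output): calling .getvalue() on a BytesIO returns the same bytes that went in, so this variant hands back a plain bytes object just like the original pdf.output(dest='S') did, rather than the stream object from suggestion 3.

```python
from io import BytesIO

# Placeholder for the bytes returned by pdf.output(dest='S')
payload = b"%PDF-1.3\n%\xe2\xe3\xcf\xd3\n"

wrapped = BytesIO(payload)       # a file-like, readable binary stream
unwrapped = wrapped.getvalue()   # back to a plain bytes object

print(type(wrapped).__name__, type(unwrapped).__name__)
print(unwrapped == payload)
```

Whether returning the stream itself behaves differently under @service.run is not something this thread settles.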

Meanwhile I found a solution: storing an intermediate PDF file locally on the server and serving that.
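The exact code was not posted, so the following is only a rough sketch of the intermediate-file idea in plain Python. The helper name write_pdf_to_tempfile and the sample bytes are hypothetical; in a web2py controller the file opened in binary mode could presumably be passed to response.stream, but that step is an assumption and is left out here:

```python
import os
import tempfile


def write_pdf_to_tempfile(pdf_bytes):
    """Hypothetical helper: write raw PDF bytes to a temp file, return its path."""
    fd, path = tempfile.mkstemp(suffix=".pdf")
    # Binary mode: the bytes are written as-is, no text encoding is applied
    with os.fdopen(fd, "wb") as handle:
        handle.write(pdf_bytes)
    return path


# Placeholder for the bytes produced by pdf.output(dest='S')
pdf_bytes = b"%PDF-1.3\n%\xe2\xe3\xcf\xd3\n"

path = write_pdf_to_tempfile(pdf_bytes)

# Reading back in binary mode yields the identical bytes that would be served
with open(path, "rb") as handle:
    served = handle.read()

os.remove(path)  # clean up the intermediate file
print(served == pdf_bytes)
```

Because the file is written and read in binary mode, the content never passes through a str, which avoids the encoding step that corrupted the direct return.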