streamed pyfpdf is stored with wrong fileencoding when using python3

Armin Würtenberger

Apr 21, 2020, 1:28:19 AM
to web2py-users
Hi,

I am developing apps that depend heavily on the creation of PDF files. This works great using fpdf and Python 2.7. I am now trying to move to Python 3.7, which has been no problem so far. The real challenge for me is to generate correct PDF files:

Here's how I did it in Python 2.7:


@service.run
def generatePDF(**kwargs):
    from gluon.contrib.fpdf import fpdf
    from gluon.contrib.fpdf.php import UTF8ToUTF16BE, UTF8StringToArray
    from gluon.contenttype import contenttype

    pdf = fpdf.FPDF("P","mm","a4")
    pdf.l_margin, pdf.r_margin = 20, 10
    pdf.add_font('Sans', '', 'DejaVuSansCondensed.ttf', uni=True)
    pdf.add_font('Sans', 'B', 'DejaVuSansCondensed-Bold.ttf', uni=True)
    pdf.add_font('Sans', 'I', 'DejaVuSansCondensed-Oblique.ttf', uni=True)
    pdf.add_font('Sans', 'BI', 'DejaVuSansCondensed-BoldOblique.ttf', uni=True)
    pdf.add_font('Serif', '', 'DejaVuSerifCondensed.ttf', uni=True)
    pdf.add_font('Serif', 'B', 'DejaVuSerifCondensed-Bold.ttf', uni=True)
    pdf.add_font('Serif', 'I', 'DejaVuSerifCondensed-Italic.ttf', uni=True)
    pdf.add_font('Serif', 'BI', 'DejaVuSerifCondensed-BoldItalic.ttf', uni=True)

    pdf.add_page()
    pdf.set_font('Sans','B',20)
    pdf.set_y( 25 )
    pdf.cell( w=pdf.w-(pdf.l_margin+pdf.r_margin), h=25,
              align="C",
              border=1,
              txt="Hello World", ln=2 )

    response.headers["Content-Type"]=contenttype(".pdf")
    response.headers['Content-disposition'] = 'attachment; filename=mypdf.pdf'

    return pdf.output(dest='S')


The downloaded file is a valid PDF, and the file itself is latin1-encoded (checked in vim with :set fileencoding).

If I do the very same with web2py and Python 3.7, I get the PDF as a byte string; the downloaded file literally starts with b':

b'%PDF-1.3\n3 0 obj\n<</Type /Page\n/Parent 1 0 R ...


The fileencoding is utf-8, which I (somewhat) understand, since fpdf returns a byte stream.
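
One possible explanation for the leading b' (an assumption; nothing in this thread confirms where in the response path it happens): under Python 3, turning a bytes object into text with str() yields its printed representation, not its contents. A minimal sketch with a made-up payload:

```python
# Made-up stand-in for the start of a PDF byte stream.
payload = b"%PDF-1.3\n3 0 obj\n"

# str() on bytes does not decode; it returns the repr, including the b'...' wrapper.
as_text = str(payload)

# Decoding is what turns the bytes into their textual content.
decoded = payload.decode("latin-1")

print(as_text)
print(decoded)
```

If something between the controller and the browser stringifies the return value this way, the download would start with b' exactly as shown above.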

To overcome this, I tried to change the last line to:


return pdf.output(dest='S').decode('latin1')


and I get a file that is not a valid PDF; its fileencoding is utf-8.
Once I manually change the fileencoding of the downloaded file to latin1, it becomes a valid and readable PDF file.
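
The behaviour described here is consistent with the decoded string being re-encoded as UTF-8 on its way out, which is an assumption about the response path rather than something confirmed in this thread. The sketch below uses made-up bytes in place of real PDF content to show why that would break the file: latin-1 round-trips every byte unchanged, while UTF-8 turns each byte at or above 0x80 into two bytes.

```python
# Made-up bytes standing in for binary PDF content (includes bytes >= 0x80).
raw = b"%PDF-1.3\n\xe2\xe3\xcf\xd3\nstream\x80\xff"

text = raw.decode("latin-1")        # lossless: each byte maps to one code point
as_latin1 = text.encode("latin-1")  # restores the original bytes exactly
as_utf8 = text.encode("utf-8")      # every byte >= 0x80 becomes two bytes

print(as_latin1 == raw)
print(as_utf8 == raw)
print(len(raw), len(as_utf8))
```

Offsets inside a PDF are byte positions, so the extra bytes alone would be enough to make a viewer reject the file.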

Using 

pdf.output("myfile.pdf",dest='F')

does, however, produce a valid, latin1-encoded file on disk.

I tried different variants as well, but I got stuck and was not able to produce a valid pdf.

Can someone suggest a way to produce a valid download?
At first I suspected a missing response header, but I did not find one that solved the problem.

Any help would be appreciated!

I am working on macOS, using Safari, but tried Firefox and Chrome as well - same result.

Cheers, Armin

Massimo Di Pierro

Jul 1, 2020, 1:13:09 AM
to web2py-users
Try a few things:


1) set content type to "application/pdf"
2) set content type to "application/pdf; charset=latin1"
3)
from io import BytesIO
return BytesIO(pdf.output(dest='S'))
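
A note on option 3: BytesIO only accepts bytes, so it can help to normalise the generator's output first. The helper below is a hypothetical sketch (to_pdf_bytes is not part of fpdf or web2py), and it assumes that a str result holds one character per byte, i.e. latin-1.

```python
from io import BytesIO


def to_pdf_bytes(data):
    """Return PDF data as bytes, whether the generator produced str or bytes."""
    if isinstance(data, (bytes, bytearray)):
        return bytes(data)
    # Assumption: a str result carries one character per byte (latin-1).
    return data.encode("latin-1")


# Passing a str straight to BytesIO would raise TypeError; the helper avoids that.
stream = BytesIO(to_pdf_bytes("%PDF-1.3\n"))
print(stream.getvalue())
```

In the controller this would be used as BytesIO(to_pdf_bytes(pdf.output(dest='S'))).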

Armin Würtenberger

Jul 7, 2020, 3:32:22 PM
to web2py-users
Thank you for your response, I tried as suggested, but did not succeed.

[...]
    from io import BytesIO
[...]
    response.headers["Content-Type"]=contenttype(".pdf") # version 1
    #response.headers["Content-Type"]=contenttype("application/pdf") # version 2
    #response.headers["Content-Type"]=contenttype("application/pdf; charset=latin1") # version 3
    response.headers['Content-disposition'] = 'attachment; filename=mypdf.pdf'
    
    return BytesIO(pdf.output(dest='S')).getvalue()

I tried all three versions of the Content-Type header, each combined with the BytesIO response.
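
One detail that may explain why the last line made no difference: calling .getvalue() on a BytesIO immediately after constructing it hands back the same bytes that went in, so the wrapper has no effect on what is returned. A small check with a made-up payload:

```python
from io import BytesIO

data = b"%PDF-1.3\n\xff"           # made-up payload
same = BytesIO(data).getvalue()     # wrap, then immediately unwrap

print(same == data)
print(type(same).__name__)
```

Returning the BytesIO object itself, without .getvalue(), would be a different experiment; whether that is handled as a stream depends on web2py and is not tested here.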

Meanwhile, I found a workaround: storing an intermediate PDF file locally on the server and serving that file.
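
For illustration only, the file-writing half of such a workaround might look like the sketch below. It uses just the standard library; write_pdf_to_temp is a hypothetical name, and how the file is then served by web2py is not shown. The relevant point is that the file is opened in binary mode, so no text encoding is applied to the bytes.

```python
import os
import tempfile


def write_pdf_to_temp(pdf_bytes):
    """Write PDF bytes to a temporary file in binary mode and return its path."""
    fd, path = tempfile.mkstemp(suffix=".pdf")
    with os.fdopen(fd, "wb") as fh:   # "wb": bytes are written unchanged
        fh.write(pdf_bytes)
    return path


path = write_pdf_to_temp(b"%PDF-1.3\n\xe2\xe3\xcf\xd3\n")  # made-up payload
with open(path, "rb") as fh:
    restored = fh.read()
os.remove(path)  # clean up the temporary file
print(restored)
```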

Anyway - it would be interesting to get a 'direct' solution.

Thanks for your help!