HTMLDOC as a python view

MikeKJ

Oct 12, 2015, 12:21:24 PM
to Django users
Please excuse me if this is the wrong forum. If a mod wants to move it, please feel free; just let me know where it went.

I'm attempting to convert HTML to PDF on the fly with HTMLDOC.

The fly in the ointment is that the generated PDF is not viewable: Adobe Reader opens and warns "This PDF document might not be displayed correctly."

The model builds the filename from the slug with a .pdf extension, creates a dict of the variables, and calls:

    def generate_or_serve_pdf( self, saving = False ):
        """
        Return a link to the PDF of this CV, generating one only if necessary.
        """
        self.filename = "%s.pdf" % self.slug
        pdfurl = os.path.join( settings.MEDIA_URL, "pdf", self.filename )
        pdfpath = os.path.join( settings.MEDIA_ROOT, "pdf", self.filename )
        # Regenerate when saving, or when no PDF exists on disk yet.
        if saving or not os.path.exists( pdfpath ):
            super( CVMask, self ).save()
            url = "%s/%s/" % ( settings.CVBASE, self.slug )
            pdfgen( url, pdfpath )
        # Return the link in both cases, as the docstring promises.
        return pdfurl


pdfgen is

def pdfgen( url, outputfile = None, testing = False ):
    """
    Generate a PDF file from the supplied URL and write
    it to outputfile when complete:
    >>> pdfgen("http://www.google.com", "testpdfgen.pdf", testing = True )
    """
    if os.path.exists( outputfile ):
        os.unlink( outputfile )
    worker = subprocess.Popen( """htmldoc --webpage "%s" --linkstyle plain --footer ... --no-compression -t pdf14 > %s_tmp""" % ( url, outputfile ), shell = True, stdout = subprocess.PIPE, stderr = subprocess.STDOUT )
    worker.wait()
    os.rename( outputfile+"_tmp", outputfile )

It then saves the result to a PDF file in /media/pdf, but somewhere the HTML appears to be lost, so nothing gets converted.

With XHTML2PDF it all works, but with some horrible inconsistencies in the rendering. I just want to try HTMLDOC to see if it is better, but if the PDF isn't generated in the first place I can't do that.


James Schneider

Oct 12, 2015, 4:54:33 PM
to django...@googlegroups.com

> Please excuse me if this is the wrong forum. If a mod wants to move it, please feel free; just let me know where it went.
>
> I'm attempting to convert HTML to PDF on the fly with HTMLDOC.
>
> The fly in the ointment is that the generated PDF is not viewable: Adobe Reader opens and warns "This PDF document might not be displayed correctly."
>

Sounds like an issue with the shell application you are using to generate it. Have you tried manually grabbing the generated file from the server (not through the web) and opening it? Do you get the same result?

>     worker = subprocess.Popen( """htmldoc --webpage "%s" --linkstyle plain --footer ... --no-compression -t pdf14 > %s_tmp""" % ( url, outputfile ), shell = True, stdout = subprocess.PIPE, stderr = subprocess.STDOUT )

Have you tried running the command manually in the shell and then trying to open the PDF? That would eliminate Django as a culprit.
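The same check can be made from Python by capturing what the command prints on stderr instead of discarding it. The following is only a sketch under assumptions: `run_to_file` and `htmldoc_command` are hypothetical helper names, the htmldoc flags are copied from the command in the original post, htmldoc is assumed to write the PDF to stdout when no output file is named (as the original shell redirect implies), and a failure is assumed to show up as a non-zero exit code or an empty output file.

```python
import os
import subprocess


def htmldoc_command(url):
    """Build the htmldoc call from the original post as an argument list."""
    return ["htmldoc", "--webpage", url, "--linkstyle", "plain",
            "--footer", "...", "--no-compression", "-t", "pdf14"]


def run_to_file(cmd, outputfile):
    """Run cmd, writing its stdout to outputfile.

    Raises RuntimeError if the command exits non-zero or writes nothing,
    so an empty file is never left behind silently.
    """
    tmpfile = outputfile + "_tmp"
    with open(tmpfile, "wb") as out:
        # No shell involved: stdout goes to the file, stderr is captured.
        worker = subprocess.Popen(cmd, stdout=out, stderr=subprocess.PIPE)
        _, errors = worker.communicate()
    if worker.returncode != 0 or os.path.getsize(tmpfile) == 0:
        os.unlink(tmpfile)
        raise RuntimeError("command failed (exit %s): %s" % (
            worker.returncode, errors.decode("utf-8", "replace")))
    # Only move the file into place once it is known to be non-empty.
    os.replace(tmpfile, outputfile)
    return outputfile


# Hypothetical usage, mirroring pdfgen( url, pdfpath ):
# run_to_file(htmldoc_command(url), pdfpath)
```

If htmldoc cannot fetch the URL or rejects an option, the message it printed should then surface in the exception rather than being lost in the pipe.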

-James

James Schneider

Oct 12, 2015, 5:29:20 PM
to django...@googlegroups.com


> It then saves the result to a PDF file in /media/pdf, but somewhere the HTML appears to be lost, so nothing gets converted.
>

I forgot to address this. When you say that the HTML is missing, do you mean that the CSS styling is missing? HTML-to-PDF converters generally have pretty limited support (if any) for CSS styling, which may be why your documents render incorrectly. That shouldn't cause an error with Adobe Reader, though.

Have you seen django-easy-pdf? It wraps a lot of the functionality that you are implementing manually. It might be a good alternative.

http://django-easy-pdf.readthedocs.org/en/stable/

-James

MikeKJ

Oct 13, 2015, 4:08:19 AM
to Django users
The shell command worker may be at fault, as the files in media/pdf are 0 bytes. Having said that, if the variables being sent to it are erroneous, that would have the same effect... can't see the wood for the trees.

I'll have a look at easy_pdf. I have the context dict available, so it should be relatively quick to slot in and test. Thanks.
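
One side effect of the 0-byte files: generate_or_serve_pdf only checks os.path.exists, which is also true for an empty file, so a failed run would keep being served rather than regenerated. A minimal sketch of a stricter check follows; `needs_regeneration` is a hypothetical helper name, not something from the code above.

```python
import os


def needs_regeneration(pdfpath):
    """Return True if no usable PDF exists at pdfpath.

    A missing file and a 0-byte file are treated the same way, so a failed
    earlier run does not block regeneration.
    """
    try:
        return os.path.getsize(pdfpath) == 0
    except OSError:
        # Raised when the file does not exist or cannot be read.
        return True
```

In the model method this would stand in for the `not os.path.exists( pdfpath )` test.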