For the moment I'd prefer to follow this project passively until my own
project Lino <http://code.google.com/p/lino/> has started to use Pisa.
I'd also suggest discussing a related topic: Does it make sense to
keep Pisa alive? What alternatives to Pisa exist?
Luc
I recently switched to wkhtmltopdf
http://code.google.com/p/wkhtmltopdf/ for performance reasons.
It is based on WebKit and Qt and allows you to use JavaScript and
all the other web technologies available in a modern rendering engine.
For me it has worked quite well so far. There are even plans to implement
a Python interface for it.
Pascal
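Independently of any dedicated binding, wkhtmltopdf can be driven from Python by calling its command-line program. The following is a minimal sketch, assuming the wkhtmltopdf binary is on PATH and is invoked with an input (URL or file) followed by an output path. build_command and convert are illustrative names, not part of any library.

```python
import shutil
import subprocess


def build_command(source, output_pdf, executable="wkhtmltopdf"):
    """Return the argument list for one conversion: <executable> <input> <output>."""
    return [executable, source, output_pdf]


def convert(source, output_pdf):
    """Run wkhtmltopdf on one input; raise if the binary is missing or fails."""
    if shutil.which("wkhtmltopdf") is None:
        raise RuntimeError("wkhtmltopdf was not found on PATH")
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(build_command(source, output_pdf), check=True)


# Example call (needs the binary installed):
# convert("http://example.com/", "example.pdf")
```

Passing the arguments as a list avoids shell quoting problems when the input is a URL with query parameters.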
> --
> You received this message because you are subscribed to the Google Groups "Pisa XHTML2PDF Support" group.
> To post to this group, send email to xhtm...@googlegroups.com.
> To unsubscribe from this group, send email to xhtml2pdf+...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/xhtml2pdf?hl=en.
>
>
As per the bottom of http://code.google.com/p/wkhtmltopdf/, a Python interface already exists:
http://github.com/mreiferson/py-wkhtmltox
Cheers,
Danny
As long as our search for alternatives has not turned up a replacement
that is clearly better, let's assume that it *does* make sense to
continue working on Pisa.
So another important question is: how many users does Pisa have? Who
wants Pisa to live? Please post your statements here in order to
encourage Greg's idea!
Luc
However, Flying Saucer did not support Chinese, Japanese, Arabic, and
Hebrew. I use Pisa for producing the PDF handbooks for MuseScore in
Chinese, Japanese, and 17 other languages. I am also looking for a
solution that supports right-to-left languages such as Arabic and
Hebrew.
I don't remember if I looked at wkhtmltopdf when I was reviewing
available options. I will have to investigate.
David
I make use of PISA/xhtml2pdf on both a professional and a personal
level. A number of the (web based) applications at work use PISA, so
we'd really like to see its continued support (if not development).
Unfortunately I don't have the time to contribute to it, but I do have
an interest in making sure it stays around.
wkhtmltopdf won't work for me or the company I work for, because we're
writing web-based software, and there's no way the sysadmins will
install WebKit and X11 on the servers just for a PDF writer.
--
Raoul Snyman
B.Tech Information Technology (Software Engineering)
E-Mail: raoul....@gmail.com
Web: http://www.saturnlaboratories.co.za/
Blog: http://blog.saturnlaboratories.co.za/
Mobile: 082 550 3754
Registered Linux User #333298 (http://counter.li.org)
It is very nice to see so much interest in keeping Pisa alive. Thanks for this and the positive feedback.
As the author of the software I think I should summarize the pros and cons I have in mind.
THE PROS:
- Easy to learn: The main idea behind Pisa is that a person with HTML and CSS skills (so most of us) is able to produce a PDF.
- Optimized for PDF: Pisa enhances HTML and CSS with some print specific features like headers and footers. It still tries to be compatible with all standards.
- Integration into processes: Pisa is great for dynamically generated content. It can be used directly via Python modules or via command line tools. It also integrates well with web frameworks. And it (somehow) works on Google App Engine ;)
THE CONS:
- Pisa is not very fast (caching may help here a lot)
- Pisa currently depends on ReportLab
- Pisa is not fully compatible with HTML and CSS specifications
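One way to read the caching remark above: if the same HTML is rendered repeatedly, keep the finished PDF and reuse it. This is a minimal sketch, assuming a caller-supplied render function (HTML string in, PDF bytes out) that stands in for the actual Pisa call. PdfCache is a hypothetical name and is not part of Pisa.

```python
import hashlib


class PdfCache:
    """Memoise rendered PDFs by the SHA-256 digest of their HTML source."""

    def __init__(self, render):
        self._render = render  # callable: html (str) -> pdf (bytes)
        self._store = {}       # digest -> pdf bytes, kept in memory

    def get(self, html):
        key = hashlib.sha256(html.encode("utf-8")).hexdigest()
        if key not in self._store:
            # Only render when this exact HTML has not been seen before.
            self._store[key] = self._render(html)
        return self._store[key]
```

This only helps when identical documents are requested more than once. Output that changes on every request (timestamps, per-user data) would miss the cache every time.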
THE ALTERNATIVES:
- wkhtmltopdf <http://code.google.com/p/wkhtmltopdf/> This is a great tool. If I were to write Pisa again I would probably also start with the WebKit rendering engine. It is fast, reliable and portable. It also seems to support some print-specific features: http://madalgo.au.dk/~jakobt/wkhtmltopdf-0.9.9-doc.html
- FOP <http://xmlgraphics.apache.org/fop/> This may be the best choice for production environments though the XSL-FO format is not easy to understand. This project <http://html2fo.sourceforge.net/> might be helpful to get around this.
- In the PHP world there are some nice projects doing the job of Pisa too. Here are some of them in random order: http://mpdf.bpm1.com/, http://code.google.com/p/dompdf/, http://www.tecnick.com/public/code/cp_dpage.php?aiocp_dp=tcpdf
- PrinceXML <http://www.princexml.com/> If you are ready to invest a lot of money this may also be a working solution.
THE FUTURE:
With these strong competitors around, it does not make sense for me personally to put a lot of work into the Pisa project any more. To make Pisa more stable and faster it would need a complete rewrite, and I would avoid using ReportLab and all the other dependencies. HTML and CSS are evolving technologies and it is almost impossible to keep up with all features in a one-person project.
Anyway, for those who have already integrated Pisa into their projects, or who are looking for a simple Python-only solution, Pisa is still a good choice.
My proposal for ad hoc renovation of the project would be that someone does the following steps:
Step 1:
- Clean up the module hierarchy: Eliminate the 'sx' and 'ho' namespaces and introduce 'xhtml2pdf' or something similar
- Integrate the HTML parser: The parser is evolving. To make it work nicely with the current Pisa version it would help to integrate http://code.google.com/p/html5lib/ directly into Pisa. This would also make installation easier. To make things faster, lxml or similar C-based tools might be integrated. The same goes for the CSS parsing.
- Integrate PDFRW: http://code.google.com/p/pdfrw/ seems to be a great PDF toolset. It is also under the MIT license and could be integrated for PDF background images and other features. Otherwise pyPDF might be used instead.
- Replace ReportLab's Paragraph implementation: I did start this work already. You'll find the code here: http://code.google.com/p/xhtml2pdf-base/source/list
- Create a stable testing environment: The most important thing to keep in mind is that testing is crucial. Pisa worked on Windows, Mac and Linux, and testing was always the hardest part. I started to write a test suite that also did visual comparison on Windows.
Step 2:
- Become independent from ReportLab
As I mentioned already, the source code is available at http://github.com/holtwick/xhtml2pdf and ready to be forked. My latest research on the future of Pisa is available here: http://code.google.com/p/xhtml2pdf-base/source/list
I'm sorry that I cannot help here, because I estimate that doing all these things would be a full-time job for several months for one person, and then it would still need a lot of maintenance. Personally I cannot spend that time on it, and for myself I don't see a good reason to, given that good alternatives exist.
But I will not discourage people who have an interest in keeping the project alive. It is a very, very fascinating project and you can learn a lot about the basic technologies that make the web and print work. I think the basic technical approach that Pisa uses is also smart and extensible. The only drawback is the dependency on 3rd party software that often changes and often doesn't do what you expect it to do (yes, I'm ranting about ReportLab ;) ). I spent several months fixing bugs of this kind and this is discouraging.
If you have any concrete questions where I may help, please let me know. I would be very happy to hand over the project and everything I know about it to people who would like to continue the work. I may also hand over the domains xhtml2pdf.com and htmltopdf.org to others if I see a big engagement in the open source project.
Thanks again for all the interest in Pisa, I really appreciate it!
Cheers
Dirk
(+1 on supporting a fork of pisa)
I previously developed an invoice system that generated invoices in HTML.
When I wanted PDF-based invoices, pisa was the exact tool to fill that gap,
especially because I'm using Django, so feeding rendered XHTML to pisa is
very doable.
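A minimal sketch of that hand-off, assuming the package is importable as xhtml2pdf (older releases lived under the 'ho' namespace instead) and that pisa.CreatePDF accepts an HTML string plus a dest file object and returns a status object with an err count. html_to_pdf and its create_pdf parameter are illustrative names.

```python
import io


def html_to_pdf(html, create_pdf=None):
    """Render an HTML string to PDF bytes.

    create_pdf defaults to pisa.CreatePDF; it can be swapped out, e.g. in tests.
    """
    if create_pdf is None:
        # Assumption: the package is installed under its current name.
        from xhtml2pdf import pisa
        create_pdf = pisa.CreatePDF
    buffer = io.BytesIO()
    status = create_pdf(html, dest=buffer)
    if status.err:
        raise RuntimeError("PDF generation reported %d error(s)" % status.err)
    return buffer.getvalue()
```

In a Django view the HTML would typically come from render_to_string(), and the returned bytes would go into an HttpResponse with content type application/pdf.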
If I can help with a fork of the project I'd gladly do that. I am, however,
not a hardcore coder by origin. But I do try to comply with the PEPs and
other conventions in this world... Oh, and I get the job done! :)
So please inform me (on this list or directly) about a possible group that
will arise working on a fork. Although my time is limited I'd definitely
like to help.
Regards,
Gerard.
On 06-11-10 09:05, Aziz Bookwala wrote:
> Hello All
> I too have used pisa for a couple of projects, and it's the simplest HTML-to-PDF
> app to plug into an existing application. Anyway, count me in
> too. I'm not too confident in what I can accomplish, but like the earlier
> poster said, assign me a simple task, and we can take it from there.
>
> On Sat, Nov 6, 2010 at 3:26 AM, Jared <jaredt...@gmail.com> wrote:
>
> I'm a django programmer, and my software currently depends on pisa. I
> agree that although there are alternatives this is the best I can
> find. I won't be able to do a lot, but count me in and assign me
> simple tasks and I'll see if I can do them. Thanks Dirk and Greg.
>
> On Oct 25, 9:47 am, Greg Corey <gregco...@gmail.com> wrote:
> > Thanks Dirk! That is a great basis to begin thinking about how to actually
> > work on this project. Personally, it will take a good deal of research to
> > understand everything that is going on here, but if I can get past that
> > first hurdle, I am more than happy to start forking this and keep it going.
> > Whether there are good alternatives or not, I love the simplicity of this
> > package (to use) so I am going to do what I can to make it work. It may take
> > a year, but I am going to try.
> >
> > I am going to start walking through the code and try to figure out exactly
> > how a pdf is generated. Once I get a grasp on that, I will make some changes
> > and if I successfully *improve* the software, I will start a fork and go
> > from there. That is my current plan of attack. I may have to ask some
> > specific questions once I start getting elbow deep into it Dirk, and I
> > appreciate your willingness to help out.
> >
> > Any others that want to start working on this software with me, feel free to
> > email me directly and we can start coordinating.
> >
> > Greg
> >
> > On Mon, Oct 25, 2010 at 3:05 AM, Dirk Holtwick <dirk.holtw...@gmail.com> wrote:
> > > xhtml2pdf.com and htmltopdf.org to others if I see a big engagement in the
> > > open source project.
> >
> > > Thanks again for all the interest in Pisa, I really appreciate it!
> >
> > > Cheers
> > > Dirk
> >
>
> --
> - Aziz M. Bookwala
--
self.url = www.gerardjp.com
I would also be happy to see a continuation of pisa. The alternatives
listed are all unsuitable for my purposes.
> - wkhtmltopdf <http://code.google.com/p/wkhtmltopdf/> This is a great tool. If I were to write Pisa again I would probably also start with the WebKit rendering engine. It is fast, reliable and portable. It also seems to support some print-specific features: http://madalgo.au.dk/~jakobt/wkhtmltopdf-0.9.9-doc.html
WebKit+Qt is too heavy of a requirement for me. Especially Qt, since I
am trying to print from a GTK application :P
> - FOP <http://xmlgraphics.apache.org/fop/> This may be the best choice for production environments though the XSL-FO format is not easy to understand. This project <http://html2fo.sourceforge.net/> might be helpful to get around this.
FOP is Java-based, and adding a Java requirement to my Python
application turns distribution from mostly-easy to frighteningly
complex.
> - In the PHP world there are some nice projects doing the job of Pisa too. Here are some of them in random order: http://mpdf.bpm1.com/, http://code.google.com/p/dompdf/, http://www.tecnick.com/public/code/cp_dpage.php?aiocp_dp=tcpdf
I have used some PHP-based HTML-to-PDF converters before, and have
been happy with the results, but using a PHP tool in a non-web
application can sometimes be problematic.
> - PrinceXML <http://www.princexml.com/> If you are ready to invest a lot of money this may also be a working solution.
$495 single user, $3800 server license. Need I say more? ;)
Right now, if pisa becomes unmaintained to the point where I cannot
use it anymore (hopefully that won't happen anytime soon) I would
probably have to resort to printing to html and sending it to the
user's installed web browser, requiring them to print from the browser
(to a PDF printer) as a separate step which is kludgy, and almost as
bad as the alternatives listed above ;)
---
James Paige
> Ok. Work has been extremely busy these last months, but I am getting closer to having time to fork Pisa and get things going. Here is what I would like to get a feel for.
> • Name. Dirk, what name would be good to fork the project to? should I just use xhtml2pdf under my own git user?
I think xhtml2pdf is well established, you should use that name. Best way would be to fork from my GitHub repository, so people can easily find the repository etc. Others participating in development may also fork from that repository, that should make collaboration easy.
> • Current Knowledge. Dirk (again), sometime it would be nice to have a complete rundown of what you already know and the problems already present beyond what you listed in your previous posting. Also, the flow of the code and what depends on what and what is outdated etc.
Ok, I'll see if I find time for doing that. I'll keep you updated.
> • Assigning work. How to assign work? Just wait until a fork is complete and go from there?
Forking and merging should be the best approaches. For new features you might want to add branches. If most contributors prefer another versioning system you might want to move to another platform. Mercurial is also a good choice. SVN is deprecated in my opinion for agile development.
> Dirk, if you are game, we can talk via Skype or similar if that would help in the exchange of knowledge. I leave it up to you.
I tried to contact you via Skype. My Skype name is my last name.
> Whew. I hope this works :)
It will, I'm optimistic :)
Dirk
Thank you very much for your effort and your great looking agenda. As the original author of XHTML2PDF, also known as 'Pisa', I would like to encourage everyone interested in the project to support Chris.
I personally cannot continue to support the project as much as I did before, therefore I'm happy to see that Chris is willing to take the lead. I agree that documentation, testing and modernization are the primary tasks and I think that after having accomplished that, the maintenance, development and use of the project should become easier for everyone.
Thanks to Chris and all of you guys for supporting XHTML2PDF!
Cheers,
Dirk
I'm not really familiar with github so I'll just post my patches here.
Feel free to use them in any way you see fit. Here is a quick
description of what each of them does:
- hr patch: allow a "width" attribute for hr tags. My only use case
for this was with "XX%" values so I'm not sure it works for absolute
pixels/points values.
- hash image: reverse a workaround for a (supposedly) fixed bug in
reportlab. Including the same image many times resulted in N copies
being added to the pdf. With the patch each individual image will be
included once, regardless of the number of uses.
- img px spec: allow img tags to have "XXpx" width/height attributes.
In my tests only numbers worked (px unit was implicit). This was a big
deal to me because ckeditor uses the "XXpx" attribute format.
- nb pages: add a page counter to the pisa doc instance. Use it like this:
    doc = pisa.CreatePDF(xhtml, buffer, blah blah)
    print doc._pisa_page_counter
Page counting and adding a "Page X of Y" label to each page with
reportlab/pisa has been an issue for some time. There are many ways to
skin that particular cat. My first approach was to run pisa to
generate the pdf first just to count the pages, and then add the
("Page <pdf:pagenumber/> of %d" % doc._pisa_page_counter) frame to the
html body and run pisa again. This was by far the simplest and worked
quite well, but performance was very poor for large documents. My
strategy is now to generate a new pdf with only the background and the
page counter frame and to merge it with the previously generated
content (pypdf). Before this patch, I had to count pages with pypdf
which is relatively slow because it needed to parse everything.
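The two-pass approach described above can be sketched as follows. This assumes a caller-supplied render function that returns the PDF bytes together with the page count (for example a wrapper around CreatePDF that reads the patched _pisa_page_counter). FOOTER_SLOT, footer_markup and render_with_total are hypothetical names used only for illustration.

```python
FOOTER_SLOT = "<!--PAGE-FOOTER-->"  # hypothetical placeholder inside the template


def footer_markup(total_pages):
    """Build the footer text; <pdf:pagenumber/> is filled in per page by pisa."""
    return "Page <pdf:pagenumber/> of %d" % total_pages


def render_with_total(template, render):
    """Render twice: once to count pages, once with the total in the footer.

    render is a caller-supplied callable: html -> (pdf_bytes, page_count).
    """
    # Pass 1: render without a footer just to learn the page count.
    _, total = render(template.replace(FOOTER_SLOT, ""))
    # Pass 2: render again with the footer text filled in.
    pdf, _ = render(template.replace(FOOTER_SLOT, footer_markup(total)))
    return pdf
```

As noted, this renders the whole document twice, so it is only practical for small documents. The placeholder should sit in its own footer frame so that filling it in does not change the page count.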
Also, I did a bit of benchmarking back when I was working with
500-page PDFs with images, and it seems the parsing does take quite a
bit of time (and memory!). I'd advise against using the current parser
in pisa. lxml can do the job faster with less memory.
Also the biggest performance gain I got was from adding xhtml = 1 to
the CreatePDF arguments. Without it the parser considered each
<pdf:nextpage/> as an opening tag and put everything following inside.
I hope some of you will find this to be helpful. Pisa is deployed on
my projects and will continue to be used for the foreseeable future.
I'd also like to take this opportunity to thank Dirk for his work.
Best regards,
Philippe
Cheers,
Dirk