automated webpage to PDF creation


Tim McEwan

Nov 7, 2012, 1:14:49 AM
to roro list
Hi all,

I want to automate the creation of around 100 PDFs with user data and graphics generated by JS (charts mostly, but maybe some Raphael).  Each PDF is an energy bill.  The whole environment can be controlled, so it doesn't need to be client-side cross-platform or anything.  However, we'd like to be able to package it up and have others use it, so the simpler the better.

I've looked at Prawn, but I'd rather not have to draw everything from first principles.  It seems wkhtmltopdf is a better bet so I tested it on highcharts.com.  It does a good job, but not as good as straight Safari.  So I thought perhaps I'd automate a browser instead, but that may not be so simple.

Has anyone got any tips on the best way to go?

Thanks,
Tim


Simon Russell

Nov 7, 2012, 1:21:01 AM
to rails-...@googlegroups.com
I had some success using d3.js with wkhtmltopdf to do pretty much the same thing (energy usage reports, as it happens).  It works, most of the time, but ironing out bugs gets pretty annoying.  Sometimes things wouldn't render (colors or something) in wkhtmltopdf, and it was basically just "fiddle with random things" until it worked; doing things with non-pixel dimensions is also pretty much impossible.  I also tried out PhantomJS, but that was even worse.

Automating a real browser might work better; I didn't get to that level of desperation though -- I got the task done "enough" and moved on.




--
You received this message because you are subscribed to the Google Groups "Ruby or Rails Oceania" group.
To post to this group, send email to rails-...@googlegroups.com.
To unsubscribe from this group, send email to rails-oceani...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/rails-oceania?hl=en.

Ben Taylor

Nov 7, 2012, 1:25:26 AM
to rails-...@googlegroups.com
I've done something similar with wkhtmltopdf, although I never needed to do charts or JS. I found it was quick to get something 80% there, but needed lots of fiddling to get it *right*.

Automating Safari is an interesting path, but sounds like it could become a massive pain very quickly.

 - Ben

Ben Bruscella

Nov 7, 2012, 1:20:26 AM
to rails-...@googlegroups.com

Simon Russell

Nov 7, 2012, 1:33:31 AM
to rails-...@googlegroups.com
I think pdfkit is just a wrapper for wkhtmltopdf, so I would assume that's not going to solve this particular problem.  (Any more than wkhtmltopdf would.)

How complex are the charts you're producing?

Tim McEwan

Nov 7, 2012, 1:43:44 AM
to rails-...@googlegroups.com

On Wednesday, 7 November 2012 at 17:33, Simon Russell wrote:

> How complex are the charts you're producing?

@Simon No more complex than the ones on the front page of highcharts.com.  I want them to look nice without having to manually draw shadows in Prawn.

As it's only going to be done 4 times per year (& maybe only 4 times total ;) ), I'll drop back to using a single webpage and CSS page breaks if I have to.  That's not going to be scalable at the utility level though.  We've put 450 charts in one page using Highcharts before, so I know it's doable. ;)
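
The single-page fallback could look roughly like the sketch below: every bill goes into one HTML document, and a CSS rule forces a printed page break after each one. The `Bill` struct, its fields, and `render_bills` are hypothetical names for illustration, not anything from the thread; the charts themselves would still be drawn by the page's own JavaScript.

```ruby
require "erb"

# Hypothetical bill record; real data would come from your own source.
Bill = Struct.new(:customer, :kwh)

# One HTML document holding every bill. The CSS rule asks the print
# renderer to start a new page after each .bill block.
TEMPLATE = ERB.new(<<~HTML)
  <!DOCTYPE html>
  <html>
  <head>
    <style>
      .bill { page-break-after: always; }
    </style>
  </head>
  <body>
  <% bills.each do |bill| %>
    <div class="bill">
      <h1><%= ERB::Util.html_escape(bill.customer) %></h1>
      <p>Usage: <%= bill.kwh %> kWh</p>
    </div>
  <% end %>
  </body>
  </html>
HTML

# Returns the combined HTML as a string.
def render_bills(bills)
  TEMPLATE.result_with_hash(bills: bills)
end
```

The resulting string can be saved to a file and printed to PDF from a browser, or handed to whichever converter ends up working best.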

Thanks for the help and comments guys; much appreciated.

Tim

Simon Russell

Nov 7, 2012, 1:48:10 AM
to rails-...@googlegroups.com
Good luck, sounds like you've got it reasonably under control.  It's a pity that making PDFs from SVGs (that are made from JS) is still such a pain; I can only assume it's not that common.

(I did toy with using node + fake dom + d3 + svg to pdf converter, but that's when I realised it was going to be a never-ending tangent...)




Robert Gravina

Nov 7, 2012, 3:14:51 AM
to rails-...@googlegroups.com
If you can use spreadsheets [1] instead of PDFs, the axlsx gem is quite good. You can generate various charts with it, either by passing in the values or by referring to a cell range.

Although I suppose you might need the layout etc. that a PDF provides for an energy bill. Just FYI, as I've used it at work with much success.

Robert

[1] Office Open XML (should work in Excel, LibreOffice, Numbers). See: https://github.com/randym/axlsx

Steven Ringo

Nov 7, 2012, 6:12:06 AM
to rails-...@googlegroups.com
Tim,

Have a look at http://docraptor.com/

Uses PrinceXML under the hood, which is the gold standard, but costs an arm and a leg.

DocRaptor's plans seem reasonable and I have heard good comments about them.

They have a decent API with Ruby/Rails examples.

Good luck!

Steve

Daniel

Nov 7, 2012, 5:52:52 PM
to rails-...@googlegroups.com
I think chromedriver (selenium-webdriver / chrome) can do it (although it's pretty hacky).

You have to
- List the current window handles
- call window.print
- re-list the current window handles, remove the ones that were there before (to get the handle for the print window)
- tell selenium to switch to the print window
- use javascript to switch to pdf output (change... -> save as pdf)
- use javascript to click the 'print' button
- hit save (not 100% sure this part works nicely as it opens a system dialog but at least one person has reported it working.)
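
The handle-diffing part of those steps (list, trigger, re-list, subtract) can be sketched as below. The helper name `new_window_handle` is made up for illustration; it only relies on the driver responding to `window_handles`, which a selenium-webdriver driver does. Whether the print preview actually shows up as a separate window handle depends on the Chrome and chromedriver versions, so treat this as an untested outline of the approach rather than a working recipe.

```ruby
# Works with any object that responds to #window_handles, such as a
# Selenium::WebDriver driver. The driver itself is not created here.
def new_window_handle(driver)
  before = driver.window_handles            # handles that already exist
  yield                                     # caller triggers the new window
  (driver.window_handles - before).first    # nil if nothing new appeared
end

# Usage sketch (assumes the selenium-webdriver gem and chromedriver are
# installed; the URL is hypothetical):
#
#   require "selenium-webdriver"
#   driver = Selenium::WebDriver.for :chrome
#   driver.get "http://localhost:3000/bills/1"
#   handle = new_window_handle(driver) do
#     # Deferred so the script call can return before the dialog opens.
#     driver.execute_script("setTimeout(function () { window.print(); }, 0);")
#   end
#   driver.switch_to.window(handle) if handle
```

The remaining steps (switching the destination to "save as pdf" and clicking print) depend on the markup of the print preview page, so they are left out here.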

https://code.google.com/p/chromedriver/issues/detail?id=19 is the issue to star if you want to know when this gets better support.

Finally, if you're more comfortable with C++ than I am (I'm not), the chromium source is really very readable. 
It shouldn't be hard to hook up the built-in pdf generation to the chromedriver bindings.

Clifford Heath

Nov 7, 2012, 6:07:22 PM
to rails-...@googlegroups.com
On 08/11/2012, at 9:52 AM, Daniel <daniel....@gmail.com> wrote:
> Finally, if you're more comfortable with C++ than I am (I'm not), the chromium source is really very readable.

I can confirm that (as someone who is comfortable in C++).

> It shouldn't be hard to hook up the built-in pdf generation to the chromedriver bindings.

Isn't that what wkhtmltopdf does? Or is there some extra Chromium magic that's not pure WK?

Clifford Heath.

Bruce Wang

Nov 7, 2012, 7:25:57 AM
to rails-...@googlegroups.com
Hi Tim,

Have you tried phantom.js? Since your source is JS, it seems to be a perfect fit.

It's actually a headless WebKit with a JavaScript API, much better and faster than Selenium.

HTH.

Cheers,
Bruce




Rufus Post

Nov 7, 2012, 6:36:14 PM
to rails-...@googlegroups.com
I have had great success with wkhtmltopdf (except that it ate ALL the memory with a gradient on body).

For some reason it's a little sketchy with certain CSS: word-spacing breaks everything, @font-face needs to be in an exact format, etc.

So if it does not look like Safari you will need to target it with media queries to fix:

  @media print {
    .actions {
      display: none;
    }
  }

  @media (min-width: 768px) and (max-width: 979px) {
  }

This is cool:

config.middleware.use PDFKit::Middleware, print_media_type: true

/reports/1.pdf
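
For a batch run outside the Rails request cycle, a similar result could be reached by calling the wkhtmltopdf binary directly, once per bill. The sketch below is only an outline: the helper names, the URL layout, and the one-second default delay are assumptions. `--print-media-type` and `--javascript-delay` are wkhtmltopdf options; the delay is there to give chart scripts time to finish drawing.

```ruby
# Builds the argument list for one wkhtmltopdf run.
# --print-media-type applies @media print rules; --javascript-delay waits
# (in milliseconds) before rendering so chart scripts can finish.
def wkhtmltopdf_command(url, output_path, js_delay_ms: 1000)
  ["wkhtmltopdf", "--print-media-type",
   "--javascript-delay", js_delay_ms.to_s,
   url, output_path]
end

# Renders one PDF per id and returns the ids whose render failed.
# `runner` defaults to Kernel#system and can be swapped out for a dry run.
def render_bills_to_pdf(base_url, ids, out_dir, runner: method(:system))
  ids.reject do |id|
    target = File.join(out_dir, "#{id}.pdf")
    runner.call(*wkhtmltopdf_command("#{base_url}/#{id}", target))
  end
end

# Usage sketch (hypothetical URL and directory):
#   failed = render_bills_to_pdf("http://localhost:3000/reports", 1..100, "pdfs")
```

Passing the arguments as an array rather than one shell string avoids quoting problems with URLs and file names.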

Also someone has put the binary in a gem:

https://rubygems.org/gems/wkhtmltopdf-binary

As far as packaging it up for others goes, no dice, as the views need CSS tweaks and media queries.

phantom.js is awesome, but I don't know how to generate PDFs with it...
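
One possible route is the rasterize.js example script that ships with PhantomJS, which writes a PDF when the output file name ends in .pdf. A minimal Ruby wrapper might look like the sketch below; the helper name and the default script path are assumptions about a local install.

```ruby
# rasterize.js takes: URL, output file, and an optional paper format.
# The default script path is an assumption about where PhantomJS was unpacked.
def rasterize_command(url, output_path, paper: "A4", script: "examples/rasterize.js")
  ["phantomjs", script, url, output_path, paper]
end

# Usage sketch (hypothetical URL):
#   system(*rasterize_command("http://localhost:3000/reports/1", "1.pdf"))
```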

Tim McEwan

Nov 7, 2012, 10:01:03 PM
to rails-...@googlegroups.com
Woah, thanks for all the awesome info. These options will keep me going for a while. :)


On Thursday, 8 November 2012 at 10:36, Rufus Post wrote:

> I have had great success with wkhtmltopdf (except that it at ALL the memory with a gradient on body).
>
> For some reasons it's a little sketchy with certain css, word-spacing breaks everything @font-face needs to be an exact format etc etc.
>
> So if it does not look like Safari you will need to target it with media queries to fix:
>
> @media print {
> .actions {
> display: none;
> }
> }
>
> @media (min-width: 768px) and (max-width: 979px) {
>
> }
>
> This is cool:
>
> config.middleware.use PDFKit::Middleware, print_media_type: true
>
> /reports/1.pdf
>
> Also someone has put the binary in a gem:
>
> https://rubygems.org/gems/wkhtmltopdf-binary
>
> As far as packaging up for others, no dice. As views need css tweaks and media queries.
>
> phantom.js is awesome, but I don't know how to generate PDFs with it …….
>
>
> On 07/11/2012, at 11:25 PM, Bruce Wang <br...@brucewang.net (mailto:br...@brucewang.net)> wrote:
> > Hi Tim,
> >
> > Have you try phantom.js (http://phantomjs.org/)? Since your source is js, it seems to be a perfect fit.
> > https://github.com/ariya/phantomjs/wiki/Screen-Capture (search for rasterize.js)
> >
> > It's actually a headless WebKit with JavaScript API, much better and faster than Selenium.
> >
> > HTH.
> >
> > Cheers,
> > Bruce
> >
> >
> >
> > On Wed, Nov 7, 2012 at 5:14 PM, Tim McEwan <t...@mcewan.it (mailto:t...@mcewan.it)> wrote:
> > > Hi all,
> > >
> > > I want to automate the creation of around 100 PDFs with user data and graphics generated by JS (charts mostly, but maybe some Raphael). Each PDF is an energy bill. The whole environment can be controlled, so it doesn't need to be client-side cross-platform or anything. However, we'd like to be able to package it up and have others use it, so the simpler the better.
> > >
> > > I've looked at Prawn, but I'd rather not have to draw everything from first principles. It seems wkhtmltopdf is a better bet so I tested it on highcharts.com (http://highcharts.com/). It does a good job (http://cl.ly/2l1c3C083u3N), but not as good as straight Safari (http://cl.ly/2Q0F0J0C022a). So I thought perhaps I'd automate a browser instead, but that may not be so simple (http://stackoverflow.com/questions/11537103/how-to-handle-print-dialog-in-selenium).
> > >
> > > Has anyone got any tips on the best way to go?
> > >
> > > Thanks,
> > > Tim
> >
> > --
> > simple is good
> > http://brucewang.net (http://brucewang.net/)
> > http://twitter.com/number5