Html2pdf Vs Jspdf


Cecile Lilien

Aug 5, 2024, 4:38:16 AM

to downjogderib

This worker has methods that can be chained sequentially, as each Promise resolves, and allows insertion of your own intermediate functions between steps. A prerequisite system allows you to skip over mandatory steps (like canvas creation) without any trouble.
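A minimal sketch of such a chain, assuming html2pdf.js is loaded and handed in as the `html2pdf` argument. The function name `exportWithHook`, the callback `onPdfReady`, and the option values are illustrative and do not come from the original text:

```javascript
// Sketch: chain html2pdf.js Worker steps and insert a custom step between them.
// `html2pdf` is passed in so the function does not assume a global.
function exportWithHook(html2pdf, element, onPdfReady) {
  const options = { margin: 10, filename: 'report.pdf' }; // illustrative values

  return html2pdf()
    .set(options)      // configure the worker
    .from(element)     // source element (or HTML string)
    .toPdf()           // prerequisite steps such as canvas creation run automatically
    .get('pdf')        // pass the underlying jsPDF object to the next step
    .then(onPdfReady)  // your own intermediate function
    .save();           // trigger the download
}
```

Calling `.toPdf()` directly after `.from()` is what the prerequisite system allows: the container, canvas, and image steps are filled in without being written out.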

html2pdf.js has the ability to automatically add page-breaks to clean up your document. Page-breaks can be added by CSS styles, set on individual elements using selectors, or avoided from breaking inside all elements (avoid-all mode).
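As a rough illustration, the page-break behavior is configured through the `pagebreak` option. The helper name `buildPagebreakOptions` and the selectors below are hypothetical; only the option keys (`mode`, `before`, `avoid`) and the mode names are taken from html2pdf.js:

```javascript
// Sketch: build an html2pdf.js options object that controls page breaks.
function buildPagebreakOptions(avoidAll) {
  return {
    pagebreak: {
      // 'css' honors CSS break-before/after/inside rules,
      // 'legacy' honors the html2pdf__page-break class,
      // 'avoid-all' tries not to break inside any element.
      mode: avoidAll ? ['avoid-all', 'css', 'legacy'] : ['css', 'legacy'],
      before: '.page-break-before', // hypothetical selector: break before these elements
      avoid: ['tr', 'img'],         // hypothetical selectors: do not break inside these
    },
  };
}
```

The result would be passed to the worker with `.set(buildPagebreakOptions(true))`.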

The Worker object returned by html2pdf() has a built-in progress-tracking mechanism. It will be updated to accept a progress callback that is called with each update, but this is currently a work in progress.

If you want to create a new feature or bugfix, please feel free to fork and submit a pull request! Create a fork, branch off of main, and make changes to the /src/ files (rather than directly to /dist/). You can test your changes by rebuilding with npm run build.

As we sometimes have to produce a lot of these in a given week, I didn't want to rely on a 3rd-party API with its charges, and I wanted more control over the PDF, especially around page breaking and such. So, a quick trip back to the drawing board, and now we have an ultra-simple custom component that can be fed a string of HTML and then print a nice version of it (depending on your CSS skills, of course!)

The key here is that the opening of the PDF in a new window is triggered as a result of you hitting the button (to get around sandbox security issues), and you don't rely on the native print/save function from html2pdf, but instead you take the blob as a datauri, and pass it to window.open().
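The component's code is not shown in the post, so the following is only a sketch of the idea. It assumes html2pdf.js is passed in as `html2pdf` and a window-like object as `win`; the function name is made up. The post passes a data URI, whereas this sketch requests a blob URL instead, because some browsers refuse to open data: URIs in a top-level window:

```javascript
// Sketch: render an HTML string to a PDF and open it in a new window.
// Intended to be called from the button's click handler.
async function openPdfInNewWindow(html2pdf, htmlString, win) {
  // outputPdf() forwards its argument to jsPDF's output();
  // 'bloburl' resolves to an object URL for the generated PDF.
  const url = await html2pdf().from(htmlString).outputPdf('bloburl');
  win.open(url, '_blank');
  return url;
}
```

In a browser this would be wired up as `button.addEventListener('click', () => openPdfInNewWindow(html2pdf, html, window))`.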

As you now have full control over the CSS (within the custom component), you can go all out, and make your PDFs look great without relying on any 3rd-party services. You will, of course, need to flex your CSS muscles as printing to a PDF has its own little gotchas.

I may play with this to replace Carbone.io for some things, but I am also creating xlsx files with Carbone, so I can't get rid of it entirely (sigh). Carbone is still a viable option for those less skilled in HTML/CSS and more so in MS Word, or who need to create more than PDFs.

There is a known issue with the size of PDFs generated via html2pdf. html2pdf uses the canvas object, and there are limitations on its size. I ended up with a couple dozen blank pages. There are workarounds that involve breaking your PDF down into multiple pieces and combining them back together, and this can be complex. If you have short documents, then there are no issues with this.

As @church mentioned, you need a good foundation in HTML and CSS. I recently tried some of these drag-and-drop HTML editors, and they still create Frankenstein code that you have to edit after the fact. If you want to make changes, do you just change the code directly, or use the HTML editor and do your post-processing again?

And you need to edit/build the HTML dynamically. If you are just replacing fields then liberal use of within an iFrame or Custom component will suffice. If you need to create tables that loop through query results and such, this is more work, but doable.

I've created a generic module (linked below) with a modified version of this custom component. You can insert the module into any project, pass in an HTML string, and the component will provide a base64 binary output of the PDF, which you can then render inside of a standard Retool PDF component or output using the utils.downloadFile() function. The module contains hidden elements to make it easy for you to preview the generated PDF, but you can hide the module itself in your app.
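The module's internals are not shown here. One way such a base64 output could be derived, assuming the PDF is first obtained as a data URI (for example from jsPDF's 'datauristring' output), is sketched below; the helper name `dataUriToBase64` is hypothetical:

```javascript
// Sketch: extract the raw base64 payload from a base64 data URI.
function dataUriToBase64(dataUri) {
  const marker = ';base64,';
  const index = dataUri.indexOf(marker);
  if (index === -1) {
    throw new Error('Not a base64 data URI');
  }
  // Everything after the marker is the base64-encoded file content.
  return dataUri.slice(index + marker.length);
}
```

Searching for the marker rather than splitting on the first comma keeps the helper working when the URI carries extra parameters, such as a filename, before the payload.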

@jmikem it's working great!

I have just one problem: if I set hidden to true for the htmlPdfExport1 component in the app, the PDF is empty. It seems that pdf_base64 is created only if the component is visible (if hidden, it remains undefined).

I'm using a button to generate the pdf and I don't want to leave the preview visible.

Yep, the hidden-not-generating issue is definitely a challenge due to the way that custom components currently work as iframes (I know the Retool team is aware of it and might give us some other options for custom components in the future). I've encountered this with a couple of other "data-only" modules and I usually end up just making the custom component a very small one-line / minimal width component with no visual content. In the screenshot below, I have a custom component called "twilioDevice" that I use for Twilio call handling (similar to the PDF conversion, it requires no visual display and just acts as a data handler) and I usually place this in some innocuous place in my page layout where it doesn't interfere with any other components. Note, though, that if the parent page causes a layout adjustment that changes the position of where the custom component is located (such as another component having a dynamic "hidden" condition), the custom component will completely reload itself, so keep that in mind when placing it.

@jmikem I did some other tests, and it was working in edit mode but not in the app preview version (I mean the PDF generation itself).

From what I understood, the reason was that the components were set to hidden in the module itself.

I solved it by leaving only pdf_preview set to hidden. Now I see the generated PDF in edit mode, and in the preview version it's hidden (no matter the size of the component), but the button works properly.

The module @jmikem provided worked beautifully. I just modified the HTML template/handlebars data to my liking and voila - perfect invoicing system complete! Customer/Project management with billing - took all of a day or two.

@ofedrigo I have seen that before, I believe it occurs when there is any sort of malformed HTML - the PDF library just keeps reattempting to generate the PDF binary and it fails on a loop. I'd suggest having a look through your HTML structure and see if there's anything that might be causing it to become malformed so that the rendering library isn't processing correctly.

I didn't realize this until I got it all set up and working. Unfortunately, this is a dealbreaker for my project. I didn't see it mentioned anywhere else here, so I'm just sharing it to potentially save others time in the future.

You may add html2pdf-specific page-breaks to your document by adding the CSS class html2pdf__page-break to any element (normally an empty div). For React elements, use className=html2pdf__page-break. During PDF creation, these elements will be given a height calculated to fill the remainder of the PDF page that they are on.
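A small sketch of how the class might be used when the HTML is assembled in JavaScript. The helper name `joinWithPageBreaks` is hypothetical; only the class name html2pdf__page-break comes from the text above:

```javascript
// Sketch: join HTML fragments so that each one starts on a new PDF page.
function joinWithPageBreaks(sections) {
  // An empty div with this class is stretched to fill the rest of the current page.
  const pageBreak = '<div class="html2pdf__page-break"></div>';
  return sections.join(pageBreak);
}

const html = joinWithPageBreaks([
  "<span>I'm on page 1!</span>",
  "<span>I'm on page 2!</span>",
]);
```

The resulting string can then be given to html2pdf as the source.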

The image type and quality options are limited to the available settings for HTMLCanvasElement.toDataURL(), which ignores quality settings for 'png' images. To enable PNG image compression, try using the canvas-png-compression shim, which should be an in-place solution to enable PNG compression via the quality option.

If using the unbundled dist/html2pdf.min.js (or its un-minified version), you must also include each dependency. Order is important: load jsPDF first, then html2canvas, then html2pdf, otherwise html2canvas will be overridden by jsPDF's own internal implementation.

When submitting an issue, please provide reproducible code that highlights the issue, preferably by creating a fork of this template jsFiddle (which has html2pdf already loaded). Remember that html2pdf uses html2canvas and jsPDF as dependencies, so it's a good idea to check each of those repositories' issue trackers to see if your problem has already been addressed.

If you want to create a new feature or bugfix, please feel free to fork and submit a pull request! Use the develop branch, which features the latest development, and make changes to /src/ rather than directly to /dist/. You can test your changes by rebuilding with npm run build.

I need, from within the JavaScript code of a custom widget, to create a PDF file that contains Hebrew text. I found the jsPDF library and included it via its jspdf.umd.min.js file. It creates the PDF all right, but I have not managed to display Hebrew text.
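One commonly suggested direction, offered here only as a sketch: jsPDF's built-in fonts do not contain Hebrew glyphs, so a TrueType font that does must be embedded first. The function name, the font file name, and the `rightToLeft` flag below are assumptions, and whether `setR2L(true)` is needed may depend on the jsPDF version, so verify the output before relying on it:

```javascript
// Sketch: embed a Hebrew-capable font into a jsPDF document and write text with it.
// `doc` is a jsPDF instance; `fontBase64` is the .ttf file encoded as base64.
function writeHebrewText(doc, fontBase64, text, rightToLeft) {
  doc.addFileToVFS('HebrewFont.ttf', fontBase64);         // placeholder file name
  doc.addFont('HebrewFont.ttf', 'HebrewFont', 'normal');  // register it under a family name
  doc.setFont('HebrewFont');
  if (rightToLeft) {
    doc.setR2L(true); // assumption: some versions need this to render right-to-left
  }
  doc.text(text, 190, 20, { align: 'right' });            // anchor near the right margin
  return doc;
}
```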

Most real-world applications encounter the requirement of generating PDFs based on some content. This includes generating PDFs from custom HTML content or even generating PDFs directly from a website URL.

Puppeteer is a Node library that provides a high-level API to control headless browsers via the DevTools Protocol. It is commonly used for web scraping, automated testing, and generating PDFs or screenshots. It can perform actions such as clicking buttons, filling out forms, and capturing screenshots or PDFs.
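A minimal sketch of URL-to-PDF generation with Puppeteer. The function name `urlToPdf` is illustrative, and the library is passed in as an argument (in a real script it would come from `require('puppeteer')`):

```javascript
// Sketch: load a URL in headless Chrome via Puppeteer and print it to a PDF file.
async function urlToPdf(puppeteer, url, outputPath) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Wait until the network is idle so late-loading content is included.
    await page.goto(url, { waitUntil: 'networkidle0' });
    return await page.pdf({ path: outputPath, format: 'A4', printBackground: true });
  } finally {
    await browser.close(); // always release the browser, even on failure
  }
}
```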

jsPDF also does not support generating PDFs directly from website URLs out of the box. We need to use a library like axios to fetch the content of the webpage first and then pass that content to the library function to generate the PDF. jsPDF also lets you pass coordinates for the elements in the PDF, and the elements are then rendered at those locations.
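To illustrate the coordinate-based placement, here is a small sketch. The function name `placeLines` and the spacing values are made up, and the `jsPDF` constructor is passed in as an argument:

```javascript
// Sketch: place lines of text at explicit coordinates with jsPDF.
function placeLines(jsPDF, lines) {
  const doc = new jsPDF({ unit: 'mm', format: 'a4' });
  lines.forEach((line, i) => {
    // x stays at 20 mm; y starts at 20 mm and moves down 10 mm per line.
    doc.text(line, 20, 20 + i * 10);
  });
  return doc;
}
```

In a browser or Node script, the returned document could then be written out with `doc.save('lines.pdf')`.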

Playwright offers capabilities to perform actions on a browser programmatically, and it is also useful for tasks such as web scraping, testing, and generating PDFs from web content. In this section, we will be using Playwright to generate PDF documents from a website URL.

By using page.setContent(), you can load any HTML content directly into the browser context and then generate a PDF, making it a versatile tool for creating PDF documents from dynamically generated HTML content.
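A sketch covering both cases with Playwright, a website URL and raw HTML. The function name `renderPdf` and the shape of the `source` argument are assumptions for illustration; `chromium` would normally come from `require('playwright')`:

```javascript
// Sketch: render either a URL or an HTML string to PDF with Playwright's Chromium.
async function renderPdf(chromium, source, outputPath) {
  const browser = await chromium.launch(); // PDF output requires headless Chromium
  try {
    const page = await browser.newPage();
    if (source.html !== undefined) {
      await page.setContent(source.html);  // load HTML directly into the page
    } else {
      await page.goto(source.url);         // or navigate to a live URL
    }
    return await page.pdf({ path: outputPath, format: 'A4' });
  } finally {
    await browser.close();
  }
}
```

A call might look like `renderPdf(chromium, { html: '<h1>Invoice</h1>' }, 'invoice.pdf')`.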

html-pdf does not support generating PDFs from website URLs out of the box, which is why we first need to extract the website content using axios. Once we have the content from the webpage, we can then use the html-pdf library to generate a PDF from the website content.
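The two steps described above can be sketched roughly as follows. The function name is illustrative, and both libraries are passed in as arguments (in a real script they would come from `require('axios')` and `require('html-pdf')`):

```javascript
// Sketch: fetch a page's HTML with axios, then render it to a PDF file with html-pdf.
async function urlToPdfWithHtmlPdf(axios, htmlPdf, url, outputPath) {
  const response = await axios.get(url); // response.data holds the page HTML
  // html-pdf uses a callback API, so wrap it in a Promise.
  return new Promise((resolve, reject) => {
    htmlPdf.create(response.data, { format: 'A4' }).toFile(outputPath, (err, result) => {
      if (err) {
        reject(err);
      } else {
        resolve(result);
      }
    });
  });
}
```

Because only the HTML is fetched, stylesheets and images referenced by relative paths may not resolve in the generated PDF.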