What changed in my PDF after PDF/A conversion?


Apr 30, 2014, 7:45:19 PM
to pdfne...@googlegroups.com

According to the manual for PDF/A Manager and 'pdftron.PDF.PDFA.PDFACompliance', there will be only minimal information loss when converting to PDF/A. To help check the information loss, the conversion report lists a number of Obj Refs, but I can't see them marked in the document. Is it possible to see which objects the tool has replaced with PDF/A-compliant equivalents?


The following pdfa command line converts the PDF/A Manager user manual to PDF/A:

   pdfa -c "PDFTron PDFA Manager User Manual.pdf"


The resulting report looks as follows:



To view it, simply double-click 'report.xml' to open it in a browser. The column on the right side of the table lists all objects that were modified. You can obtain exactly the same information with the 'pdftron.PDF.PDFA.PDFACompliance' class from PDFNet SDK (as shown in the PDF/A sample project: http://www.pdftron.com/pdfnet/samplecode.html#PDFA).

If you are looking for some type of visual indicator, you could do a visual diff between the source document and the PDF/A document.

Please keep in mind that most PDF/A errors and the resulting changes have no visual manifestation, so in most cases there would not be much to see.

If you would like to determine whether the converted PDF/A document is visually different from the original, you could use 'pdftron.PDF.PDFDraw' in PDFNet or pdf2image (http://www.pdftron.com/pdf2image/downloads.html) to render both documents, and then the 'compare' utility from ImageMagick (http://www.imagemagick.org/Usage/compare/) to diff the rendered images. For your convenience you can fetch the compare tool from http://pdftron.com/pub/compare.zip … if this does not work, download the full ImageMagick from their site.

To find visual differences, run:

   pdfa -c my.pdf
   pdf2image my.pdf
   pdf2image my_pdfa.pdf
   compare my1.png my_pdfa1.png diff.png

Attached is the visual diff that highlights the information loss caused by removal of the soft mask (e_PDFA42). The technique could be used for batch and unattended conversion and validation. By using PDFNet SDK rather than PDF/A Manager, you could also implement something more interactive (i.e. without the need for a command line).
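For unattended use, the same four commands could be wrapped in a small script. The following is only a rough Python sketch, not part of PDF/A Manager or PDFNet. It assumes that pdfa, pdf2image and compare are on the PATH, that output files follow the naming seen in the example above (my.pdf becomes my_pdfa.pdf, and page 1 renders to my1.png), and that pdfa and pdf2image exit with a nonzero status on failure. None of these assumptions is confirmed beyond that one example. The names build_commands and run_visual_diff and the "_diff" file suffix are made up for illustration.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(pdf_name, page=1):
    """Return the four command lines for one PDF, in the order they run.

    Naming is an assumption based on the single example in this thread:
    my.pdf -> my_pdfa.pdf, and page N of my.pdf renders to my<N>.png.
    """
    stem = Path(pdf_name).stem
    converted = f"{stem}_pdfa.pdf"
    diff_png = f"{stem}_diff{page}.png"  # hypothetical per-file diff name
    return [
        ["pdfa", "-c", pdf_name],
        ["pdf2image", pdf_name],
        ["pdf2image", converted],
        ["compare", f"{stem}{page}.png", f"{stem}_pdfa{page}.png", diff_png],
    ]


def run_visual_diff(pdf_path, page=1):
    """Convert, render and diff one PDF; return the path of the diff image."""
    src = Path(pdf_path)
    *prepare, compare_cmd = build_commands(src.name, page)
    for cmd in prepare:
        # Assumption: these tools signal failure with a nonzero exit status.
        subprocess.run(cmd, cwd=src.parent, check=True)
    # ImageMagick documents exit status 1 for dissimilar images, so only
    # statuses above 1 are treated as failures here.
    result = subprocess.run(compare_cmd, cwd=src.parent)
    if result.returncode > 1:
        raise RuntimeError(f"compare failed for {src.name}")
    return src.parent / compare_cmd[-1]


if __name__ == "__main__":
    # Usage: python visual_diff.py first.pdf second.pdf ...
    for name in sys.argv[1:]:
        print(run_visual_diff(name))
```

Only one page is compared per call. For multi-page documents you would call run_visual_diff once per page number, again assuming the page-numbered file naming holds.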
