Bug#839188: pdftk: NullPointerException for combining forms

Ross Boylan

Sep 29, 2016, 6:40:03 PM
Package: pdftk
Version: 2.02-2
Severity: normal

Dear Maintainer,

* What led up to the situation?
I created some pdf's with OpenTax, which fills in pdf's for US
taxes. Those pdf's have forms in them so they can be filled out.
It's possible that OpenTax added additional annotations to the
forms so that they could be filled in programmatically.

I then took the output pdf's from OpenTax and filled in additional
information.

I also downloaded some forms directly from the IRS and filled them
out.

I believe I used okular for all forms I filled in manually.

Then I attempted to combine all the files using
pdftk KR_Fed_1040.pdf KR_Fed_1040_s?.pdf KR_Fed_8863.pdf cat output KR_Fed_all.pdf

* What exactly did you do (or not do) that was effective (or
ineffective)?
I tried with fewer files and no wildcards.
I tried the version against which I'm filing this report. My first
try was with 1.44 on an older system.
I was running in a graphical environment on the older system;
non-graphical on the newer system.
None of these changes affected the result, which was ...

* What was the outcome of this action?
Error: Unexpected Exception in open_reader()
Unhandled Java Exception in main():
java.lang.NullPointerException
at gnu.gcj.runtime.NameFinder.lookup(libgcj.so.15)
at java.lang.Throwable.getStackTrace(libgcj.so.15)
at java.lang.Throwable.stackTraceString(libgcj.so.15)
at java.lang.Throwable.printStackTrace(libgcj.so.15)
at java.lang.Throwable.printStackTrace(libgcj.so.15)

I gather from other bug reports this is just a generic indicator
that something went wrong.

* What outcome did you expect instead?
That the requested output file, a concatenation of all the inputs,
would be produced.

It seems likely this is related to 703377. That bug was thought to be
an evince bug, but this report involves okular and a program that
fills in the form, so it seems more likely that pdftk itself mishandles
forms. There are other form-related bugs, e.g. 792168.

As a further test I downloaded a blank form 1040 with
wget https://www.irs.gov/pub/irs-pdf/f1040.pdf?_ga=1.196643116.1249814110.1473627121
(you can probably skip the part after the ?).
ross@ross-node1:~/Finance/tax2015$ pdftk f1040.pdf cat output test.pdf
# seems OK
# edit the file in okular by filling in the name fields, and save it as f1040-test.pdf:
ross@ross-node1:~/Finance/tax2015$ pdftk f1040-test.pdf cat output test.pdf
Error: Unexpected Exception in open_reader()
java.lang.ClassCastException: pdftk.com.lowagie.text.pdf.PdfNull cannot be cast to pdftk.com.lowagie.text.pdf.PdfDictionary
at pdftk.com.lowagie.text.pdf.PdfReader$PageRefs.iteratePages(pdftk)
at pdftk.com.lowagie.text.pdf.PdfReader$PageRefs.readPages(pdftk)
at pdftk.com.lowagie.text.pdf.PdfReader$PageRefs.<init>(pdftk)
at pdftk.com.lowagie.text.pdf.PdfReader.readPages(pdftk)
at pdftk.com.lowagie.text.pdf.PdfReader.readPdf(pdftk)
at pdftk.com.lowagie.text.pdf.PdfReader.<init>(pdftk)
at pdftk.com.lowagie.text.pdf.PdfReader.<init>(pdftk)
Error: Failed to open PDF file:
f1040-test.pdf
Errors encountered. No output created.
Done. Input errors, so no output created.

I have attached f1040-test.pdf.

I ran okular 0.14.3 on KDE 4.8.4 on the older system.

FWIW pdfshuffler simply removed all the information in the forms when
I tried it.


-- System Information:
Debian Release: 8.6
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 3.16.0-4-amd64 (SMP w/40 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages pdftk depends on:
ii libc6 2.19-18+deb8u6
ii libgcc1 1:4.9.2-10
ii libgcj15 4.9.2-10
ii libstdc++6 4.9.2-10

pdftk recommends no packages.

Versions of packages pdftk suggests:
ii poppler-utils [xpdf-utils] 0.26.5-2+deb8u1

-- no debconf information

Boylan, Ross

Sep 29, 2016, 7:20:02 PM
Another clue, this time from pdfmod rather than pdftk. Note that the file is exactly as downloaded from the IRS, with nothing filled in.
$ pdfmod f1040.pdf
[1 Debug 16:12:13.206] Starting PdfMod 0.9.1
[1 Debug 16:12:13.221] Initializing i18n catalog from /usr/share/locale/
[1 Debug 16:12:13.242] Loaded custom AccelMap from /home/ross/.config/pdfmod/gtk_accel_map
[1 Debug 16:12:13.249] Cache directory set to /home/ross/.cache/pdfmod
[1 Debug 16:12:13.401] Loaded f1040.pdf
[3 Warn 16:12:13.442] Caught an exception - PdfSharp.Pdf.IO.PdfReaderException: Cannot handle iref streams. The current implementation of PDFsharp cannot handle this PDF feature introduced with Acrobat 6. (in `PdfSharp')
at PdfSharp.Pdf.IO.Parser.ReadXRefTableAndTrailer (PdfSharp.Pdf.PdfReferenceTable xrefTable) [0x00000] in <filename unknown>:0
at PdfSharp.Pdf.IO.Parser.ReadTrailer () [0x00000] in <filename unknown>:0
at PdfSharp.Pdf.IO.PdfReader.Open (System.IO.Stream stream, System.String password, PdfDocumentOpenMode openmode, PdfSharp.Pdf.IO.PdfPasswordProvider passwordProvider) [0x00000] in <filename unknown>:0
at PdfSharp.Pdf.IO.PdfReader.Open (System.String path, System.String password, PdfDocumentOpenMode openmode, PdfSharp.Pdf.IO.PdfPasswordProvider provider) [0x00000] in <filename unknown>:0
at PdfSharp.Pdf.IO.PdfReader.Open (System.String path, PdfDocumentOpenMode openmode, PdfSharp.Pdf.IO.PdfPasswordProvider provider) [0x00000] in <filename unknown>:0
at PdfMod.Pdf.Document.Load (System.String uri, PdfSharp.Pdf.IO.PdfPasswordProvider passwordProvider, Boolean isAlreadyTmp) [0x00000] in <filename unknown>:0
at PdfMod.Gui.Client+<LoadPath>c__AnonStorey10.<>m__1C () [0x00000] in <filename unknown>:0
[3 Error 16:12:13.443] Error Loading Document - There was an error loading /home/ross/Finance/tax2015/f1040.pdf
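
The PDFsharp message above refers to cross-reference streams, which PDF 1.5 (the Acrobat 6 generation) allows in place of the classic `xref` table. Below is a minimal sketch, assuming Python 3 is available, for checking which form a file ends with. `xref_kind` is a hypothetical helper name, not part of pdftk or pdfmod, and it only looks at the last cross-reference section of the file.

```python
import re


def xref_kind(pdf_bytes: bytes) -> str:
    """Return 'table', 'stream', or 'unknown' for the last cross-reference section."""
    # The final 'startxref' keyword near the end of the file is followed by
    # the byte offset of the most recent cross-reference section.
    matches = re.findall(rb"startxref\s+(\d+)", pdf_bytes[-2048:])
    if not matches:
        return "unknown"
    offset = int(matches[-1])

    # Look at the first few bytes found at that offset.
    section = pdf_bytes[offset:offset + 32].lstrip()
    if section.startswith(b"xref"):
        # Classic table: the section starts with the keyword 'xref'.
        return "table"
    if re.match(rb"\d+\s+\d+\s+obj", section):
        # Cross-reference stream: an indirect object, "<num> <gen> obj".
        return "stream"
    return "unknown"
```

Reading f1040.pdf in binary mode and passing its bytes to `xref_kind` would show whether the IRS download falls in the category PDFsharp rejects. Hybrid files that keep a classic table and merely point to an additional stream are reported as 'table' by this sketch.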

Boylan, Ross

Oct 1, 2016, 12:10:02 AM
Adobe Acrobat Pro XI on MS Windows, as well as Adobe Reader on Windows, also refused to open the files.

I suspect this may ultimately be a poppler bug; both evince and okular use poppler. In particular, there's a bug report saying that poppler can't handle XFA, and I'm pretty sure that OpenTax uses XFA. OTOH, OpenTax uses pdftk to fill in the forms.

My work-around was to open the pdf's in evince and then print to pdf. Oddly, the resulting files were much shorter than the originals; they also seemed to be better formed. Acrobat Pro was able to put them together. I haven't tried any of the Linux tools on the resulting files.
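
A command-line analogue of this print-to-pdf work-around might look like the sketch below. It is untested on the files in this report and rests on an assumption: that `pdftocairo -pdf` from poppler-utils, which re-renders a PDF through poppler much as printing from evince does, yields files pdftk accepts. The function names and the `reprinted` directory are hypothetical, and the rewritten files will most likely no longer have fillable fields.

```python
import subprocess
from pathlib import Path
from typing import List


def reprint_command(src: str, dst: str) -> List[str]:
    # pdftocairo (poppler-utils) renders src and writes a fresh PDF to dst.
    return ["pdftocairo", "-pdf", src, dst]


def concat_command(inputs: List[str], output: str) -> List[str]:
    # Same pdftk invocation as in the original report: <inputs> cat output <file>.
    return ["pdftk", *inputs, "cat", "output", output]


def reprint_and_concat(inputs: List[str], output: str, workdir: str = "reprinted") -> None:
    # Assumption: both pdftocairo and pdftk are installed and on PATH.
    Path(workdir).mkdir(exist_ok=True)
    cleaned = []
    for src in inputs:
        dst = str(Path(workdir) / Path(src).name)
        subprocess.run(reprint_command(src, dst), check=True)
        cleaned.append(dst)
    subprocess.run(concat_command(cleaned, output), check=True)
```

For example, `reprint_and_concat(["KR_Fed_1040.pdf", "KR_Fed_8863.pdf"], "KR_Fed_all.pdf")` would rewrite each input into `reprinted/` and then concatenate the rewritten copies.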

Ross
