PDF::Reader::MalformedPDFError


Marcus Pavani

Jan 14, 2019, 5:16:51 PM
to PDF::Reader
Hello guys,

I'm getting this error while trying to open a PDF file to read:

C:/Ruby25-x64/lib/ruby/gems/2.5.0/gems/pdf-reader-2.2.0/lib/pdf/reader/buffer.rb:137:in `find_first_xref_offset': PDF does not contain EOF marker (PDF::Reader::MalformedPDFError)

I have 12 PDFs that are basically the same (payment receipts downloaded from a company's website). My code opens and reads 11 of them, but fails on one.
Visually, there is nothing different or wrong with it.

The code I'm using to open it is the following:

require 'pdf-reader'

# txt_file is assumed to be an already-open, writable file.
PDF::Reader.open(dir_path + "#{pdf_file}.pdf") do |reader|
  reader.pages.each do |page|
    txt_file.puts(page.text)
  end
end

The PDF has a single page, and opens without issue in Adobe Reader, Chrome and IE. Unfortunately, I cannot attach the PDF file for security and confidentiality reasons. :(

I've searched for this error on the internet, and found people having the same problem with empty PDF files, which is not my case.
Can anyone help?

Thanks in advance!

Jon Kern

Jan 22, 2019, 8:49:18 AM
to pdf-r...@googlegroups.com
Maybe try to edit the PDF and save it back out… Might “fix” the missing EOF marker error?

--
You received this message because you are subscribed to the Google Groups "PDF::Reader" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pdf-reader+...@googlegroups.com.
To post to this group, send email to pdf-r...@googlegroups.com.
Visit this group at https://groups.google.com/group/pdf-reader.
For more options, visit https://groups.google.com/d/optout.

Marcus Pavani

Jan 22, 2019, 9:46:33 AM
to pdf-r...@googlegroups.com
Hello Jon,
Thanks for your answer.

Indeed, opening the PDF on Acrobat Reader and then saving it again "fixes" the problem, and adds the missing EOF marker.
However, I am developing an automated tool that receives these PDFs via email and reads and extracts the data automatically.
I haven't figured out how to solve this missing EOF marker error without the intervention of a human yet. Any ideas?
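
One way to catch such files without a human in the loop might be to look for the marker before parsing. The sketch below assumes the error simply means the text `%%EOF` is absent from the end of the file. `eof_marker_present?` and the 1024-byte window are hypothetical, chosen for illustration, and are not part of pdf-reader.

```ruby
# Hypothetical helper: report whether the tail of a file contains "%%EOF".
# TAIL_BYTES is an assumed window size, not a value taken from pdf-reader.
TAIL_BYTES = 1024

def eof_marker_present?(path, tail_bytes = TAIL_BYTES)
  File.open(path, "rb") do |io|
    # Jump to the last tail_bytes bytes, or to the start for short files.
    io.seek([io.size - tail_bytes, 0].max)
    io.read.to_s.include?("%%EOF")
  end
end
```

Files that fail this check could be set aside for a repair step instead of being passed straight to `PDF::Reader.open`.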

Best regards,

VISAGIO   MARCUS PAVANI
   marcus...@visagio.com
   +55 (14) 98108 2303
   Rio de Janeiro | São Paulo | London | Perth | Moscow

   visagio.com


LEGAL NOTICE: This e-mail message and its attachments are for the sole use of the designated recipient(s). They may contain confidential information, legally privileged information or other information subject to legal restrictions and may not necessarily represent the opinion of Visagio Group. If you are not a designated recipient of this message, or an agent responsible for delivering it to a designated recipient, please do not read, copy, use or disclose this message or its attachments, and notify the sender by replying to this message and delete or destroy all copies of this message and attachments. Thank you.





Wayne Brissette

Jan 22, 2019, 9:54:06 AM
to pdf-r...@googlegroups.com
I don’t know what platform you’re on, but if macOS is involved, you can use AppleScript to open each PDF and save out a new copy, which would ensure all the PDFs have the proper EOF marker. If you’re not on macOS, I’m not sure.

-Wayne

Marcus Pavani

Jan 22, 2019, 11:30:46 AM
to pdf-r...@googlegroups.com
I'm using Ubuntu 10.04!


Marcus Pavani

Jan 22, 2019, 11:31:05 AM
to pdf-r...@googlegroups.com
Sorry, I meant 18.04!
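
On Ubuntu, the manual "open and re-save" step might be automated by shelling out to a command-line PDF tool. This is a minimal sketch, assuming the `qpdf` program is installed and that it can recover this particular file; neither is confirmed in this thread. `qpdf_rewrite_command` and `rewrite_pdf` are hypothetical helper names.

```ruby
require "open3"

# Build the argument list for `qpdf INPUT OUTPUT`, which rewrites the file.
# Assumes a `qpdf` binary is on the PATH.
def qpdf_rewrite_command(input, output)
  ["qpdf", input, output]
end

# Run qpdf and return true if a rewritten file was produced.
def rewrite_pdf(input, output)
  _stdout, stderr, status = Open3.capture3(*qpdf_rewrite_command(input, output))
  warn(stderr) unless stderr.empty?
  # qpdf's manual lists exit status 3 as "warnings, output still written";
  # verify this against the installed version.
  (status.success? || status.exitstatus == 3) && File.exist?(output)
rescue Errno::ENOENT
  # qpdf is not installed or not on the PATH.
  false
end
```

The rewritten copy, not the original, would then be handed to `PDF::Reader.open`, so the untouched download stays available if the repair goes wrong.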



Jon Kern

Jan 22, 2019, 8:47:40 PM
to pdf-r...@googlegroups.com
Marcus, 

Try googling this: “how can I add an EOF marker to a malformed PDF programmatically?”

I saw some hits that might be useful for you.
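
The most literal reading of that idea is to append the marker to a copy of the file. The sketch below assumes only the final `%%EOF` line is missing and that the rest of the trailer (the `startxref` line and its offset) is intact; if it is not, the parser will likely fail with a different error. `copy_with_eof_marker` and the 1024-byte window are hypothetical.

```ruby
require "fileutils"

# Hypothetical helper: copy src to dest and append "%%EOF" if the tail of
# the copy does not already contain it. The original file is never modified.
def copy_with_eof_marker(src, dest)
  FileUtils.cp(src, dest)
  data = File.binread(dest)
  # Look only at the last 1024 bytes (assumed window); short files are
  # checked in full because the slice returns nil for them.
  tail = data[-1024, 1024] || data
  return dest if tail.include?("%%EOF")

  File.open(dest, "ab") do |io|
    # Put the marker on its own line.
    io.write("\n") unless data.end_with?("\n", "\r")
    io.write("%%EOF\n")
  end
  dest
end
```

Working on a copy keeps the received attachment unchanged, which matters for receipts that may need to be archived as delivered.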