I suspect this file doesn't fully comply with the PDF spec, and Adobe
Reader is more lenient and compensates for it.
The trailer of the file suggests there's an additional xref table
stored in an xref stream at byte offset 248405 (try opening the file in
a text editor and searching for 248405).
However, there's no xref stream at byte offset 248405.
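If you'd rather check from Ruby than a text editor, something like this rough sketch should do. The helper name `xref_stream_at?` and the 1024-byte window are my own inventions (neither comes from pdf-reader), and the check is only a heuristic: it looks for an indirect object header followed by a `/Type /XRef` entry near the offset.

```ruby
# Hypothetical helper, not part of pdf-reader.
# Returns true if the bytes at `offset` look like the start of an xref
# stream: an indirect object header ("12 0 obj") whose dictionary
# contains /Type /XRef within the next `window` bytes (an assumed limit).
def xref_stream_at?(data, offset, window = 1024)
  chunk = data.byteslice(offset, window)
  return false if chunk.nil? || chunk.empty?

  chunk = chunk.b # treat as raw bytes so the regexps never hit encoding errors
  chunk.match?(/\A\s*\d+\s+\d+\s+obj\b/) && chunk.match?(%r{/Type\s*/XRef\b})
end
```

Usage would be along the lines of `xref_stream_at?(File.binread("broken.pdf"), 248405)`, where `broken.pdf` is a placeholder for the file in question.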
If I comment out the line that reads xref streams in pdf-reader, then
the file can be processed:
diff --git a/lib/pdf/reader/xref.rb b/lib/pdf/reader/xref.rb
index 9e6a56c..5e91ce3 100644
--- a/lib/pdf/reader/xref.rb
+++ b/lib/pdf/reader/xref.rb
@@ -145,7 +145,7 @@ class PDF::Reader
raise MalformedPDFError, "PDF malformed, trailer should be a dictionary"
end
- load_offsets(trailer[:XRefStm]) if trailer.has_key?(:XRefStm)
+ #load_offsets(trailer[:XRefStm]) if trailer.has_key?(:XRefStm)
load_offsets(trailer[:Prev].to_i) if trailer.has_key?(:Prev)
trailer
I'd be happy to merge a PR that attempts to continue reading the PDF if no
xref stream is found at the offset. It'd need a spec based on a real
PDF, but I can add that to the PR if that's helpful.
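To give a rough idea of the behaviour I mean, here's a standalone sketch. It is not the real pdf-reader code: `load_trailer_offsets` is a made-up name, the `MalformedPDFError` class below is a local stand-in, and I'm assuming the offset loader raises that error when it finds no xref data at the given offset.

```ruby
# Stand-in for the library's error class, defined here only so the
# sketch runs on its own.
class MalformedPDFError < RuntimeError; end

# Hypothetical helper: `load_offsets` is any callable that loads xref
# data from a byte offset and raises MalformedPDFError if none is found.
def load_trailer_offsets(trailer, &load_offsets)
  if trailer.key?(:XRefStm)
    begin
      load_offsets.call(trailer[:XRefStm])
    rescue MalformedPDFError
      # Nothing usable at the XRefStm offset. Ignore it and carry on
      # with whatever the classic xref tables provide.
    end
  end

  # A bad :Prev offset is still treated as fatal in this sketch.
  load_offsets.call(trailer[:Prev].to_i) if trailer.key?(:Prev)
  trailer
end
```

Only the `:XRefStm` lookup is made forgiving here, since that's the entry this file gets wrong.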
James