
PDF file Page Count


Ennis Gong

Jun 18, 1996

Given a PDF file on a Unix system, is there an easy way (via a script
or non-interactive process) to determine the total page count for a
given PDF file? I wish to do this through a batch job (not by
interactively running Acrobat Reader/Exchange to open up the PDF and
count the pages).

John Main

Jun 20, 1996

First, define _easy_. I think the answer is a resounding NO.
(I'm assuming you know a little about how to read the PDF format.)

Step one. Get the PDF Spec from Adobe (http://www.adobe.com).

Step two. Read from the end of the file back until you get to the
startxref entry. The next line contains the byte offset (offsets are
from the beginning of the file) of the xref table.

Step three. Build an xref table from the information in THIS xref
object. See the spec for the information you get out of the xref table.

Step four. Read the trailer object. You are looking for a /Prev entry
to tell you the offset of a previous xref object. If there is a /Prev
entry, go back to step three with that offset; otherwise continue.

Step five. Now the xref table is built. Continue reading the LAST
trailer object in the file (the first one you encountered). Now you are
looking for a /Root entry. Get the object number of the root object and
look it up in the xref table.

Step six. Find the /Pages entry in the root object. Get the object
number of the pages object and look it up in the xref table.

Step seven. Find the /Count entry in the pages object. BINGO! There's
the number of pages!
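
The steps above can be sketched in Python roughly as follows. This is a minimal sketch, not a full PDF reader: it assumes an unencrypted file with classic text xref tables (no cross-reference streams or object streams, which came in later PDF versions), and it assumes /Root, /Pages and /Count are written in the plain forms shown in the regular expressions. The function names are made up for illustration.

```python
import re


def find_startxref(data: bytes) -> int:
    """Step two: the number after the last 'startxref' is the xref offset."""
    pos = data.rfind(b"startxref")
    if pos < 0:
        raise ValueError("no startxref keyword found")
    m = re.match(rb"\s*(\d+)", data[pos + len(b"startxref"):])
    if m is None:
        raise ValueError("no offset after startxref")
    return int(m.group(1))


def read_xref_section(data: bytes, offset: int):
    """Steps three and four: parse one xref section and grab its trailer."""
    tpos = data.index(b"trailer", offset)
    tokens = data[offset:tpos].split()
    if not tokens or tokens[0] != b"xref":
        raise ValueError("expected a classic xref table at this offset")
    section = {}
    i = 1
    while i < len(tokens):
        # Subsection header: first object number, then number of entries.
        first, count = int(tokens[i]), int(tokens[i + 1])
        i += 2
        for num in range(first, first + count):
            off, _gen, kind = tokens[i:i + 3]
            # 'n' entries are in use; 'f' entries are free (stored as None).
            section[num] = int(off) if kind == b"n" else None
            i += 3
    trailer = data[tpos:data.index(b"startxref", tpos)]
    return section, trailer


def load_xref(data: bytes):
    """Follow /Prev links; entries from newer sections take precedence."""
    table, newest_trailer = {}, None
    offset = find_startxref(data)
    while offset is not None:
        section, trailer = read_xref_section(data, offset)
        for num, off in section.items():
            table.setdefault(num, off)
        if newest_trailer is None:
            newest_trailer = trailer
        prev = re.search(rb"/Prev\s+(\d+)", trailer)
        offset = int(prev.group(1)) if prev else None
    return table, newest_trailer


def object_body(data: bytes, table: dict, num: int) -> bytes:
    """Return the raw bytes of object `num`, up to its 'endobj'."""
    start = table[num]
    return data[start:data.index(b"endobj", start)]


def ref(body: bytes, key: bytes) -> int:
    """Object number of an indirect reference such as '/Root 1 0 R'."""
    m = re.search(rb"/" + key + rb"\s+(\d+)\s+\d+\s+R", body)
    if m is None:
        raise ValueError("missing /" + key.decode() + " reference")
    return int(m.group(1))


def count_pages(data: bytes) -> int:
    """Steps five to seven: trailer -> /Root -> /Pages -> /Count."""
    table, trailer = load_xref(data)
    root = object_body(data, table, ref(trailer, b"Root"))
    pages = object_body(data, table, ref(root, b"Pages"))
    m = re.search(rb"/Count\s+(\d+)", pages)
    if m is None:
        raise ValueError("no direct /Count in the pages object")
    return int(m.group(1))
```

In a batch job this would be called as count_pages(open("file.pdf", "rb").read()). Reading the whole file into memory keeps the sketch short; a large file could be handled with seeks instead.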


Print out the PDF file in text format and follow through this. It
should make things much more clear.

I am certainly no PDF expert, but this has worked for all of the PDF
files I have seen. Let me know if there are any mistakes.


John C. Main
EDD Project Lead
Northern Telecom, Inc.
jcm...@nortel.com

John Main

Jun 20, 1996

Hello,

Ok, I lied about step 7.

The /Count is the number of pages that are direct Kids of that object.
There can be more pages under those Kids!
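
If you do not want to rely on /Count at all, one option is to walk the /Kids arrays and count the leaf page objects directly. A minimal Python sketch follows. It assumes you have already loaded each object's raw bytes into a dict keyed by object number (the `objects` mapping and the function name are hypothetical), and that /Kids is written as a direct array. For what it is worth, the published PDF Reference describes /Count as the number of leaf page objects that are descendants of the node, so on a well-formed file the root value should normally match what this walk returns.

```python
import re
from typing import Dict


def count_leaf_pages(objects: Dict[int, bytes], node: int) -> int:
    """Count page objects at or below `node` by following /Kids arrays."""
    body = objects[node]
    kids = re.search(rb"/Kids\s*\[(.*?)\]", body, re.S)
    if kids is None:
        # No /Kids array: treat this node as a single leaf page.
        return 1
    total = 0
    # Each kid is an indirect reference of the form 'N G R'.
    for num in re.findall(rb"(\d+)\s+\d+\s+R", kids.group(1)):
        total += count_leaf_pages(objects, int(num))
    return total


# Example: a root node with one nested /Pages node and one direct page.
tree = {
    1: b"<< /Type /Pages /Kids [2 0 R 3 0 R] /Count 3 >>",
    2: b"<< /Type /Pages /Parent 1 0 R /Kids [4 0 R 5 0 R] /Count 2 >>",
    3: b"<< /Type /Page /Parent 1 0 R >>",
    4: b"<< /Type /Page /Parent 2 0 R >>",
    5: b"<< /Type /Page /Parent 2 0 R >>",
}
print(count_leaf_pages(tree, 1))
```

The recursion has no guard against a damaged file whose /Kids arrays form a loop; a set of visited object numbers would be needed for that.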

Sorry for the misinformation,


John C. Main
EDD Project Lead
Northern Telecom, Inc.
jcm...@nortel.com

Mark A Harrison

Jun 21, 1996

Ennis Gong (go...@llnl.gov) wrote:
: Given a PDF file on a Unix system, is
: there an easy way (via a script or
: non-interactive process) to determine
: the total page count for given PDF
: file ? I wish to do this through a
: batch job (not through interactively
: running a Acrobat Reader/Exchange to
: open up the PDF and count the pages)

Try:

acroread -toPostScript <file.pdf |grep "%%Pages:"

For example, the Reader help file is 19 pages long:

$ acroread -toPostScript <Help-Reader.pdf |grep "%%Pages:"
%%Pages: 19
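
If the batch job is driven from a script rather than a shell pipeline, the same check can be done on the PostScript text directly. The sketch below is in Python and only parses text that has already been produced (for example by the acroread command above); the function name is made up. It takes the last numeric %%Pages: comment, on the assumption that some PostScript producers write "%%Pages: (atend)" in the header and put the real number in the trailer.

```python
import re
from typing import Optional


def pages_from_dsc(ps_text: str) -> Optional[int]:
    """Return the page count from '%%Pages:' comments, or None if absent."""
    count = None
    for line in ps_text.splitlines():
        # Matches '%%Pages: 19'; skips '%%Pages: (atend)', which has no digits.
        m = re.match(r"%%Pages:\s*(\d+)", line)
        if m:
            count = int(m.group(1))
    return count


sample = "%!PS-Adobe-3.0\n%%Pages: (atend)\n%%Trailer\n%%Pages: 19\n%%EOF\n"
print(pages_from_dsc(sample))
```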

Hope this helps,
Mark.

--
Mark Harrison http://jasper.ora.com/mh/
