split pdf file

wexfordpress

Apr 24, 2015, 5:04:34 PM
My current project requires that I send my client a fairly large PDF file at frequent intervals. The file is currently about 36 MB. ConTeXt from TeX Live 2014 produces the file, a book that I am typesetting and will index.

I can use pdftk to split the pdf file into single pages, but what I really
need is an open source program that will divide the file into two or three large chunks. My web connection won't handle files bigger than 25 MB.

Suggestions?

John Culleton

Julian Bradfield

Apr 24, 2015, 5:30:04 PM
On 2015-04-24, wexfordpress <jo...@wexfordpress.com> wrote:
> I can use pdftk to split the pdf file into single pages, but what I really
> need is an open source program that will divide the file into two or three large chunks. My web connection won't handle files bigger than 25 MB.

If you were using Unix, it would have split installed.

Since you're using Windows, google

windows divide files

The top hit is freeware, though not apparently open source, but I'm
sure you can find an open-source windows program by looking a little
harder.
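
A minimal sketch of the split route on a Unix-like system, assuming the recipient can run cat (or an equivalent) to rejoin the pieces. Note that split cuts at byte offsets and knows nothing about PDF, so the individual pieces will not open in a viewer; only the rejoined file will. The function names and the ".part-" suffix are made up for illustration.

```shell
# Cut FILE into pieces of at most SIZE bytes: FILE.part-aa, FILE.part-ab, ...
# The pieces are raw byte ranges, not valid PDFs on their own.
split_file() {
    split -b "$2" "$1" "$1.part-"
}

# Glue the pieces back together in name order and write the result to OUT.
join_parts() {
    cat "$1".part-* > "$2"
}

# Example (file names are placeholders):
#   split_file book.pdf 20m                # sender
#   join_parts book.pdf book-rejoined.pdf  # recipient
#   cmp book.pdf book-rejoined.pdf         # silent if the files are identical
```

The sender would transmit the .part- files one at a time, and the recipient rebuilds the book before opening it.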

Jinsong Zhao

Apr 24, 2015, 8:47:01 PM
you can use ghostscript to split the pdf file into several small parts.

HTH,
Jinsong
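
A sketch of how the Ghostscript route might look, assuming Ghostscript is installed as gs. The -dFirstPage and -dLastPage switches pick the page range and the pdfwrite device writes a new PDF. The wrapper name gs_extract and the GS override variable are invented here. Since pdfwrite regenerates the file rather than copying pages, the output may differ from the original in details.

```shell
# Copy pages FIRST..LAST of IN into OUT using Ghostscript.
# $GS may name a different binary; it defaults to "gs" (assumption).
# Usage: gs_extract IN OUT FIRST LAST
gs_extract() {
    "${GS:-gs}" -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
        -dFirstPage="$3" -dLastPage="$4" -sOutputFile="$2" "$1"
}

# Example (file names are placeholders):
#   gs_extract book.pdf part1.pdf 1 100
```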

Scott Pakin

Apr 24, 2015, 8:55:54 PM
Dropbox?

You can send a link and not have to bother with splitting and recombining files.

-- Scott

Michael Shell

Apr 25, 2015, 12:13:54 AM
On Fri, 24 Apr 2015 21:25:07 +0000 (UTC)
Julian Bradfield <j...@inf.ed.ac.uk> wrote:

> If you were using Unix, it would have split installed.


Yes, also qpdf is available in Windows binaries:

http://sourceforge.net/projects/qpdf/files/qpdf/5.1.2/

e.g.,


qpdf --empty --pages in.pdf 1-10 -- out_p1-10.pdf


Among other features, qpdf can decrypt PDF files.


Cheers,

Mike Shell

Dr Engelbert Buxbaum

Apr 25, 2015, 5:24:56 AM
In article <7052bcc7-63b9-4a26...@googlegroups.com>,
jo...@wexfordpress.com says...

> I can use pdftk to split the pdf file into single pages, but what I
> really need is an open source program that will divide the file into
> two or three large chunks. My web connection won't handle files bigger
> than 25 MB.

I'd simply zip the file into a multi-volume archive. PDF doesn't compress
much, but every little bit helps, especially if it comes as a by-product...

wexfordpress

Apr 25, 2015, 10:30:41 AM
I have used Slackware since about 1996. I haven't used Windows much since then.


I will investigate split to see if it handles PDF files properly.
John C.

Manuel Collado

Apr 25, 2015, 11:18:11 AM
What is the problem with pdftk? It can extract any range of pages as a
single file.

You can use
pdftk input-file cat page-range
instead of just
pdftk input-file burst


Peter Flynn

Apr 25, 2015, 4:22:26 PM
pdftk, no question about it.

///Peter


wexfordpress

Apr 25, 2015, 5:42:11 PM
Tried
pdftk book.pdf cat 1-100
and got
Done. Input errors, so no output created.

Then I tried
pdftk book.pdf 1-100 >booka.pdf
and got the same message.

I did find an online site that will do it for free. But I prefer to do it
locally if possible.
John C.

Peter Flynn

Apr 25, 2015, 7:13:46 PM
On 04/25/2015 10:42 PM, wexfordpress wrote:

> Tried
> pdftk book.pdf cat 1-100
> and got
> Done. Input errors, so no output created.

What errors were they?

///Peter

Y. Lamprou

Apr 26, 2015, 3:51:19 AM
I see you hit the problem I had when I tried pdftk. Since I have a full
TeX installation I find that using pdfpages works. Just create a file like
the following where infile.pdf is the doc you want to extract pages from,
pdflatex it and you have what you want.

\documentclass{article}
\usepackage{pdfpages}
\begin{document}
\includepdf[pages={23-30}]{infile.pdf}
\end{document}


You can combine pages from other pdfs or include them in your own documents,
rotate, resize .... Just read the manual.

Ulrich D i e z

Apr 26, 2015, 5:59:12 AM
I renamed one of my pdf files to book.pdf .

Then I successfully extracted the first 55 pages of book.pdf into
booka.pdf with pdftk, using a file handle named "B".
I used the command line below -- note that there is no space between
the "B" and the page range "1-55" that should come from
file "B":

pdftk B=book.pdf cat B1-55 output booka.pdf


Then I extracted the pages 56-105 into bookb.pdf:

pdftk B=book.pdf cat B56-105 output bookb.pdf


Then I extracted everything from page 106 to the
last page into bookc.pdf:

pdftk B=book.pdf cat B106-end output bookc.pdf



In order to do this successfully, you/pdftk must have permission
both to modify and to concatenate the file book.pdf.

If you/pdftk do not have these permissions, an error message
will appear and booka.pdf will not be created. To change or set these
permissions, the owner password of book.pdf is needed.

I used
pdftk 1.12 a Handy Tool for Manipulating PDF Documents
Copyright (C) 2003-04, Sid Steward
on a virtual machine that runs Microsoft Windows 95.

I obtained the info about the usage of pdftk from
<http://www.lagotzki.de/pdftk/>
which is a German-language-page about pdftk.

---------------------------------------------------------------

When using pdftk, I had to specify "by hand" the range
of pages/number of pages that should go into each PDF file.

I did not find a way to have pdftk split a PDF file into
chunks of, e.g., 50 pages automatically.

But I stumbled over the
"Coherent PDF Command Line Tools Community Release"
which can be found at
<http://community.coherentpdf.com/>
and which brings along the program cpdf for many different
computer-platforms.

Using the Windows-binary cpdf.exe under a virtual machine running
Windows 7, I could successfully split the file book.pdf, which consists
of 163 pages, into "chunks" of, e.g., at most 50 pages.

In order to do this, I started the Windows command-line-shell and
entered the command

cpdf book.pdf -split -chunk 50 -o book%%%.pdf

Here "50" is the number of pages per PDF file.
"%%%" stands for the numeric part of the names of the
PDF files that are created in the process: a three-digit
sequence number.

The file book.pdf contained 163 pages.

I obtained a file book001.pdf containing the pages 1-50.
I obtained a file book002.pdf containing the pages 51-100.
I obtained a file book003.pdf containing the pages 101-150.
I obtained a file book004.pdf containing the pages 151-163.
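
The chunk boundaries above follow from simple arithmetic, which can be sketched in plain shell. This does not call cpdf; the function name chunk_ranges is hypothetical and only reproduces the page ranges for a given page count and chunk size.

```shell
# Print the page range of each chunk, one "first-last" pair per line.
# Usage: chunk_ranges TOTAL_PAGES PAGES_PER_CHUNK
chunk_ranges() {
    total=$1
    per=$2
    first=1
    while [ "$first" -le "$total" ]; do
        last=$((first + per - 1))
        # The final chunk may be shorter than the others.
        if [ "$last" -gt "$total" ]; then
            last=$total
        fi
        echo "$first-$last"
        first=$((last + 1))
    done
}
```

Running chunk_ranges 163 50 prints the four ranges listed above, and the same ranges could be fed to pdftk or qpdf if cpdf is not available.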


---------------------------------------------------------------

In case you prefer to split according to file size instead of
number of pages, the following link might be of interest to
you as well:
<http://unix.stackexchange.com/questions/75496/splitting-large-pdf-into-small-files>
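
Equal page counts do not guarantee equal file sizes, so after splitting by pages it may be worth checking each piece against the 25 MB limit. A rough sketch; fits_limit and check_pieces are made-up names, and a megabyte is taken as 1024*1024 bytes, which is an assumption about how the limit is measured.

```shell
# Succeed if FILE is no larger than LIMIT_MB megabytes (1 MB = 1024*1024 bytes here).
# Usage: fits_limit FILE LIMIT_MB
fits_limit() {
    # The arithmetic expansion strips any padding some wc versions print.
    size=$(( $(wc -c < "$1") ))
    [ "$size" -le $(( $2 * 1024 * 1024 )) ]
}

# Report every file that is still too big.
# Usage: check_pieces LIMIT_MB FILE...
check_pieces() {
    limit=$1
    shift
    for f in "$@"; do
        fits_limit "$f" "$limit" || echo "$f is over $limit MB"
    done
}

# Example (file names are placeholders):
#   check_pieces 25 book001.pdf book002.pdf book003.pdf book004.pdf
```

If a piece is reported, a smaller chunk size for that part of the book is the obvious next step.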


---------------------------------------------------------------

I don't know what will happen when pdftk or cpdf is
used to split PDF files that contain, e.g.,
- hyperlinks (from table of contents to headings in the text; from
  list of tables/figures to captions; from index to text) created via
  the hyperref package,
- file attachments added via the attachfile/attachfile2 packages,
- or the like.


Sincerely

Ulrich

Bob Tennent

Apr 26, 2015, 6:48:25 AM
On Sat, 25 Apr 2015 14:42:08 -0700 (PDT), wexfordpress wrote:
>> You can use
>> pdftk input-file cat page-range
>> instead of just
>> pdftk input-file burst
>
> Tried
> pdftk book.pdf cat 1-100
> and got
> Done. Input errors, so no output created.

Use

pdftk book.pdf cat 1-100 output book1.pdf


An alternative is to use the \includeonly mechanism in LaTeX
to generate PDFs for each of several parts.

Bob T.

wexfordpress

Apr 26, 2015, 8:53:06 AM
...
>
> Sincerely
>
> Ulrich

I copied the above lines exactly and ran them. They worked perfectly. I will put them in a script.

Many, many thanks.
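
Such a script might look roughly like this. It is only a sketch built from Ulrich's three pdftk commands; the function name split_three, the PDFTK override variable, and the a/b/c output names derived from the input name are assumptions.

```shell
# Split a PDF into three parts at two page boundaries, using a pdftk file handle.
# $PDFTK may name a different binary; it defaults to "pdftk" (assumption).
# Usage: split_three FILE.pdf LAST_PAGE_OF_PART_A LAST_PAGE_OF_PART_B
split_three() {
    src=$1
    a=$2
    b=$3
    base=${src%.pdf}   # book.pdf -> book
    "${PDFTK:-pdftk}" B="$src" cat "B1-$a" output "${base}a.pdf"
    "${PDFTK:-pdftk}" B="$src" cat "B$((a + 1))-$b" output "${base}b.pdf"
    "${PDFTK:-pdftk}" B="$src" cat "B$((b + 1))-end" output "${base}c.pdf"
}

# Example, matching the page ranges used earlier in the thread:
#   split_three book.pdf 55 105
```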

wexfordpress

Apr 26, 2015, 9:00:50 AM
The error message was all I was given. But the problem was solved, elegantly, by Ulrich. He read some German language documentation. I only understand English (and a few computer languages.)

Thanks to all who answered.

Axel Berger

Apr 26, 2015, 1:15:06 PM
Y. Lamprou wrote on Sun, 15-04-26 09:51:
>I find that using pdfpages works.

As long as you use neither bookmarks nor internal hyperlinks. I have
cheaply bought an older version of the full Acrobat to do things like
split, join, renumber or clip PDFs.

Peter Flynn

Apr 26, 2015, 4:50:17 PM
On 04/26/2015 02:00 PM, wexfordpress wrote:
> On Saturday, April 25, 2015 at 7:13:46 PM UTC-4, peter wrote:
>> On 04/25/2015 10:42 PM, wexfordpress wrote:
>>
>>> Tried
>>> pdftk book.pdf cat 1-100
>>> and got
>>> Done. Input errors, so no output created.
>>
>> What errors were they?
>>
>> ///Peter
>
> The error message was all I was given.

Sorry, I misunderstood. I thought "Input errors, so no output created."
was *your* paraphrasing of the error, not the actual error message :-)

> But the problem was solved, elegantly, by Ulrich.

I have used pdftk hundreds of times and never had to use a handle for
processing a single file. I wonder why it was needed. Very strange indeed.

///Peter

Dox

Apr 27, 2015, 7:36:18 AM
I'd use just the "Print to file" option and select the pages say from 1-30, then 31-

giacomo boffi

Apr 27, 2015, 12:38:10 PM
- split doesn't know about the PDF format,

- pdftk can produce large chunks, if you can decode the man page, that
  is...

- pdfjam (ask google) is a very handy set of scripts (that use the
  underlying capabilities of the LaTeX package pdfpages) to ease the
  kind of tasks you mention and others too (i use it to print 6-up
  printouts of my slides)

- pdfpages is a latex package that does what you want, but I find
  it easier to access it using the interface provided by pdfjam...

--
la panna resta più pannosa.
-- Ruggine, in IHC

Manuel Collado

Apr 28, 2015, 1:33:14 PM
El 25/04/2015 23:42, wexfordpress escribió:
>>
>> What is the problem with pdftk? It can extract any range of pages as a
>> single file.
>>
>> You can use
>> pdftk input-file cat page-range
>> instead of just
>> pdftk input-file burst
>
> Tried
> pdftk book.pdf cat 1-100
> and got
> Done. Input errors, so no output created.
>
> Then I tried
> pdftk book.pdf 1-100 >booka.pdf
> and got the same message.

Sorry, the given suggestion was only part of the command. The whole
command for 'cat' must include an output file specification. For instance:

pdftk book.pdf cat 1-100 output book1-100.pdf

Please use 'pdftk --help' to see all the command line options.

Hope this helps.
--
Manuel Collado - http://lml.ls.fi.upm.es/~mcollado

jon

Apr 28, 2015, 2:51:50 PM
and of course the web page is pretty useful:

https://www.pdflabs.com/docs/pdftk-cli-examples/

cheers,
jon.

Fritz Wuehler

May 18, 2015, 8:42:19 AM
> Y. Lamprou wrote on Sun, 15-04-26 09:51:
> >I find that using pdfpages works.
>
> As long as you use neither bookmarks nor internal hyperlinks.

The /hyperref/ package works for bookmarks and hyperlinks.

Axel Berger

May 18, 2015, 9:10:10 PM
Fritz Wuehler wrote on Mon, 15-05-18 14:42:
>The /hyperref/ package works for bookmarks and hyperlinks.

In what way? When I found that KOMA-Script's scrlttr2 class does not
support the thebibliography environment, I made that page separately
and included it through pdfpages. Are you telling me that \cite{} in the
main letter would have worked? I never thought to try.

Alex Romihsita

May 22, 2015, 11:17:00 PM
I'd suggest using a makefile with something like

---
default: book.pdf.zip

book.pdf: book.tex $(SOURCES)
	pdflatex book
	bibtex book
	pdflatex book
	# ... whatever else you need here ...

book1.pdf: book.pdf
	pdftk book.pdf cat 1-100 output book1.pdf

book2.pdf: book.pdf
	pdftk book.pdf cat 101-200 output book2.pdf

book.pdf.zip: book.pdf
	zip -9 book.pdf.zip book.pdf
---

Your action would be to run make.


I am doing it differently for a similar problem, however. I'm using
hand-written book.tex for the whole thing, and hand-written book1.tex
and book2.tex for the two parts. All of them just \input chapter_i.tex
for different ranges of i (i=1,2,3,4,5,6) or (i=1,2,3) or (i=4,5,6).
Inside a file chapter_i.tex, when a reference from part x to 3-x has to
be produced, I'm doing a case split on the \jobname and producing
different output depending on whether pdflatex is run on book.tex ("see
\ref{...}") or on bookx.tex ("see part ..., section ...") (i = 1,...,6,
x = 1,2).

Best,

Alex.
