Joining A5 landscape to make A4 portrait pages

Andrew Gabb

Jul 17, 2009, 5:27:39 AM

My client has a large number of pdf files resulting from scanning their
paper files. For a large bunch of these the scanner setting was wrong,
and the A4 pages are scanned as A5 landscape (oops!).

I've found that I can fix an individual file by using Acrobat Reader to
print to PDF995 with multiple pages per page, plus some other tweaking
in the printer options.

What I need to do now is process hundreds of files
semi-automatically, e.g. by batch file.

I've looked at PDFedit995, but it has trouble reading these pdf files
apparently. In any case, I can't get any output.

Any ideas?

I don't mind buying a tool, as long as I know it will do the job (and is
not *too* expensive).

I also don't mind doing some programming if there's a utility with a
reasonable API. This is less attractive, however, because we can
probably convert all the files manually in 20 hours or so.

Andrew
--
Andrew Gabb
email: ag...@tpgi.com.au Adelaide, South Australia
phone: +61 8 8342-1021
-----

Peter Flynn

Jul 17, 2009, 6:22:56 PM

Andrew Gabb wrote:
> My client has a large number of pdf files resulting from scanning their
> paper files. For a large bunch of these the scanner setting was wrong,
> and the A4 pages are scanned as A5 landscape (oops!).

You mean two paper pages per "page" on-screen?
Or just 1-for-1 but at the wrong size and orientation?

> I've found that I can fix an individual file by using Acrobat Reader to
> print to PDF995 with multiple pages per page, plus some other tweaking
> in the printer options.
>
> What I need to do now is process hundreds of files
> semi-automatically, e.g. by batch file.
>
> I've looked at PDFedit995, but it has trouble reading these pdf files
> apparently. In any case, I can't get any output.
>
> Any ideas?
>
> I don't mind buying a tool, as long as I know it will do the job (and is
> not *too* expensive).
>
> I also don't mind doing some programming if there's a utility with a
> reasonable API. This is less attractive, however, because we can
> probably convert all the files manually in 20 hours or so.

Use LaTeX and the pdfpages package. It can include selected pages in any
order from PDFs, and rotate and scale and crop them to spec, and output
a new PDF.
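
As a rough illustration of the pdfpages route, the sketch below generates a small LaTeX wrapper per scanned file; running pdflatex on each wrapper should then produce an A4 PDF with two source pages stacked per sheet. It assumes the halves are stored in reading order (top, bottom, top, bottom, ...) and that file names contain nothing LaTeX objects to. The function names, the `-a4.tex` suffix and the `@@PDF@@` placeholder are made up for this example; `pages=-` and `nup=1x2` are standard pdfpages options.

```python
from pathlib import Path

# LaTeX wrapper: one A4 sheet per two source pages, stacked vertically.
# @@PDF@@ is a placeholder chosen for this sketch, replaced per file.
WRAPPER_TEMPLATE = r"""\documentclass[a4paper]{article}
\usepackage{pdfpages}
\begin{document}
% pages=- takes every page; nup=1x2 means 1 column by 2 rows per sheet
\includepdf[pages=-,nup=1x2]{@@PDF@@}
\end{document}
"""


def make_wrapper_tex(pdf_name: str) -> str:
    """Return LaTeX source that includes pdf_name two-up on A4 portrait."""
    return WRAPPER_TEMPLATE.replace("@@PDF@@", pdf_name)


def write_wrappers(src_dir: Path, out_dir: Path) -> list:
    """Write one wrapper .tex file per PDF found in src_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf in sorted(src_dir.glob("*.pdf")):
        tex = out_dir / (pdf.stem + "-a4.tex")
        # Absolute POSIX-style path so pdflatex can be run from out_dir.
        tex.write_text(make_wrapper_tex(pdf.resolve().as_posix()),
                       encoding="utf-8")
        written.append(tex)
    return written
```

Each generated file can then be compiled with pdflatex from a batch file or shell loop. Whether the two halves meet without a visible gap would need checking on a real scan.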

///Peter

Andrew Gabb

Jul 17, 2009, 9:00:15 PM

> You mean two paper pages per "page" on-screen?
> Or just 1-for-1 but at the wrong size and orientation?

Yes, the problem is that each paper page (A4-P) has been scanned as
two A5-L pdf pages and I need to join them up again in a new pdf file.

--

Lutrin

Jul 18, 2009, 8:44:34 AM

On Sat, 18 Jul 2009 10:30:15 +0930, Andrew Gabb wrote:

> the problem is that each paper page (A4-P) has been scanned as two
> A5-L pdf pages and I need to join them up again in a new pdf file.

[...]
you can use

*scantailor*
- http://sourceforge.net/projects/scantailor/
--
Puppy Linux wiki: http://puppylover.netsons.org/dokupuppy
Puppy Linux Forum: http://puppylinux.ilbello.com
Windows me genuit, Ubuntu rapuere / tenet nunc Puppy Linux...

Peter Flynn

Jul 18, 2009, 10:46:04 AM

Andrew Gabb wrote:
> > You mean two paper pages per "page" on-screen?
> > Or just 1-for-1 but at the wrong size and orientation?
>
> Yes, the problem is that each paper page (A4-P) has been scanned as
> two A5-L pdf pages and I need to join them up again in a new pdf file.

Like top half and bottom half? That's messy.

///Peter

Govert J. Knopper

Jul 19, 2009, 1:44:23 AM

> Yes, the problem is that each paper page (A4-P) has been scanned as
> two A5-L pdf pages and I need to join them up again in a new pdf file.

Like this?:

file#1 top half of 1st scan
file#2 bottom half of 1st scan
file#3 top half of 2nd scan
file#4 bottom half of 2nd scan
etc.

If so, then here are a couple of suggestions (using free tools):

Option 1.
a. Merge (join/combine/concatenate) all PDFs into one multipage PDF (arranged
in the correct order!); this can be done with, among others, PDFTools
(http://www.sheelapps.com) or ConcatPDF
(http://www.ujihara.jp/ConcatPDF/en/).
b. Rotate all pages in the PDF to portrait with PDFTK
(http://www.accesspdf.com/pdftk/)
c. Do 2-up with my simple PDF imposition tool (http://www.noliturbare.com)

Option 2.
a. Join (merge/combine/concatenate) all PDFs into one multipage PDF (correct
order!); see 1.a.
b. Convert this multipage PDF document to PostScript with pdftops
(http://www.foolabs.com/xpdf/)
c. Use psnup (ftp://ftp.knackered.org/pub/psutils/psutils-p17-a4-nt.zip) to
layout pairs of scans on a page (2-up)
d. Convert the PostScript file back to PDF with GhostScript

Option 3.
Create a program with iText (or iTextSharp) that does it all in one step
(couple of hours work).
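
A one-step program along the lines of Option 3 might look like the sketch below. Note the substitution: it uses the Python library pypdf rather than iText, purely as an illustration. It assumes each source PDF holds the half pages in order (top, bottom, top, bottom, ...), that both halves have the same size with the media box starting at the origin, and that the pages carry no rotation flag. The function names are invented for this example.

```python
from typing import List, Optional, Tuple


def half_pairs(n_pages: int) -> List[Tuple[int, Optional[int]]]:
    """Pair 0-based page indices as (top, bottom).

    bottom is None when an odd page count leaves a trailing top half.
    """
    pairs = []
    for top in range(0, n_pages, 2):
        bottom = top + 1 if top + 1 < n_pages else None
        pairs.append((top, bottom))
    return pairs


def join_halves(src: str, dst: str) -> None:
    """Stack each pair of half pages onto one double-height page."""
    # Imported here so the pairing helper works without pypdf installed.
    from pypdf import PageObject, PdfReader, PdfWriter, Transformation

    reader = PdfReader(src)
    writer = PdfWriter()
    for top_i, bottom_i in half_pairs(len(reader.pages)):
        top = reader.pages[top_i]
        width = float(top.mediabox.width)
        height = float(top.mediabox.height)
        # New sheet: same width, twice the height (A5-L pairs give A4-P).
        sheet = PageObject.create_blank_page(width=width, height=2 * height)
        # Top half is shifted up by one half-page height.
        sheet.merge_transformed_page(
            top, Transformation().translate(tx=0, ty=height))
        # Bottom half stays at the origin.
        if bottom_i is not None:
            sheet.merge_page(reader.pages[bottom_i])
        writer.add_page(sheet)
    with open(dst, "wb") as fh:
        writer.write(fh)
```

Because the halves are placed edge to edge on the new page, this approach should not introduce the gap that multi-page printing produces, though that would need confirming on the actual scans.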

Govert

http://www.noliturbare.com

Peter Flynn

Jul 19, 2009, 7:57:43 AM

Govert J. Knopper wrote:
>> Yes, the problem is that each paper page (A4-P) has been scanned as
>> two A5-L pdf pages and I need to join them up again in a new pdf file.
>
> Like this?:
>
> file#1 top half of 1st scan
> file#2 bottom half of 1st scan
> file#3 top half of 2nd scan
> file#4 bottom half of 2nd scan
> etc.

If that is the case, is the division between top half and bottom half
accurate? That is, can you rely on positioning the two files
bottom-edge to top-edge and having them match perfectly? Or was this
scanning done by hand, and did the paper move between scans? (In which
case all bets are off).

///Peter

Govert J. Knopper

Jul 19, 2009, 1:48:46 PM

> .... is the division between top half and bottom half
> accurate? That is, can you rely on positioning the two files
> bottom-edge to top-edge and having them match perfectly? Or was this
> scanning done by hand, and did the paper move between scans? (In which
> case all bets are off).

Looking at the OP's preliminary fix, I got the impression that it is
accurate enough.

The interesting question remains how they managed to get this scanning
result.

Govert

Peter Flynn

Jul 19, 2009, 4:13:13 PM

"Never attribute to malice that which can be adequately explained by
stupidity."

:-)

///Peter

Andrew Gabb

Jul 19, 2009, 9:04:38 PM

Thanks for suggestions so far. Some clarifications.

The scanning was semi-automatic. As so often happens, the job was given
to a supervised junior. Someone else did some copying in a break using
one of the 'page split' copier options which split the A4 pages into 2
A5L pages. The junior then just carried on not realising what was happening.

As a result, this batch of copying got screwed up and wasn't checked
till much later, long after the originals had been destroyed.

I can get reasonable results for a single file by opening the file in
Acrobat Reader, then printing to PDF995 with multi-page options. Each
pair of A5L pages is printed on the same A4P page in the right order
with a thin (4-5 mm) gap between them. Each 'page' in the source file is
a half-page image without margins, so the gap is a product of the
multi-page 'printing'.

This is acceptable to my client, but they'd prefer not to have the gap.
Also, given that there are 100s of files it would take a couple of days
to do this manually.

The problem is automating this. I tried using PDF995's batch mode but it
choked with an 'assertion' error (which of course means nothing except
that it failed). I'd also like to avoid the gap if possible.

I was hoping that there's a utility out there which will do this
reasonably easily (possibly without the gap).

Unless someone can come up with a substantially easier way of doing
this, I'll write a short app which drives Acrobat Reader with keystrokes
- messy but not impossible. Any solution that takes more than (say) 4
hours to develop and/or set up will lose out to doing it manually.

Govert J. Knopper

Jul 20, 2009, 3:55:39 AM

> The scanning was semi-automatic. As so often happens, the job was given
> to a supervised junior. Someone else did some copying in a break using
> one of the 'page split' copier options which split the A4 pages into 2
> A5L pages.

> I was hoping that there's a utility out there which will do this
> reasonably easily (possibly without the gap).
>
> Unless someone can come up with a substantially easier way of doing
> this, I'll write a short app which drives Acrobat Reader with keystrokes
Did you actually try the suggested options? I forgot my own PDFrotatescale
tool in Option#1 step b, which - if you are used to GUI apps - makes it even
easier (than with PDFTK) to rotate the pages. If it all works as expected
(only you can test that), you'll save yourself the time to write an
application. And the output will probably have no gap. Installing the tools
and trying the 3-steps solution is a matter of minutes, not hours.

Govert

Andrew Gabb

Jul 24, 2009, 6:15:07 AM

I eventually achieved this by using Grovert's rotate and impose tools,
driving the GUI via keystrokes. Not a simple or nice solution but it was
much faster than doing it manually for 400-odd files. I *hate* doing
things this way, but it was the best I could manage without a few hours
more research.

BTW, because the impose tool only joins horizontally, I had to rotate
the files before and after doing the impose. I actually lost some time
trying to use pdftk for the rotation, finding that the impose then
worked incorrectly for some reason (but the rotation viewed OK).
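
For anyone repeating the rotate, impose, rotate sequence, the direction of the two rotations matters. The small model below is only a sketch: it assumes the impose tool puts the first page on the left and the second on the right (the thread only says it joins horizontally), and it tracks where each half and the content's "up" direction end up.

```python
def rotate_ccw(p):
    """Rotate a point or direction 90 degrees counter-clockwise."""
    x, y = p
    return (-y, x)


def rotate_cw(p):
    """Rotate a point or direction 90 degrees clockwise."""
    x, y = p
    return (y, -x)


def simulate(first, last):
    """Model: rotate each half page, impose side by side, rotate the sheet.

    first: rotation applied to every half page before imposing
    last:  rotation applied to the imposed sheet afterwards
    """
    up = first((0, 1))              # content "up" after the first rotation
    page1, page2 = (-1, 0), (1, 0)  # assumed: page 1 left, page 2 right
    return {"page1": last(page1), "page2": last(page2), "up": last(up)}


good = simulate(rotate_ccw, rotate_cw)
bad = simulate(rotate_cw, rotate_ccw)
print(good)  # page 1 at (0, 1): top half lands on top
print(bad)   # page 1 at (0, -1): halves come out swapped
```

Under that assumption, rotating the halves counter-clockwise first and the imposed sheet clockwise afterwards keeps the top half on top. The opposite order leaves the text upright but swaps the two halves.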

Thanks all for their suggestions.

Govert J. Knopper

Jul 24, 2009, 1:00:56 PM

Thanks for the feedback, Andrew

Govert

Andrew Gabb

Jul 27, 2009, 12:12:19 AM

Govert J. Knopper wrote:
> Thanks for the feedback, Andrew

Thank you for your interest and assistance, Govert (and sorry for
spelling your name wrong:=).