Sep 2, 2011, 3:32:37 AM

[I can set about learning how to do the following by myself, but if

anyone has an off-the-cuff answer or pointers to the starting point,

I'll be appreciative.]

Given a large number of single-page PDF image files, all in a single

folder (on a Mac), write a notebook that will build a single multipage

document containing some or all of these files, in accordance with

(and with page order determined by) a list of some or all of these

image file names.

For extra credit: Have each page in the final document bookmarked by

the name of the corresponding file.

Thanks for any assistance.

Sep 3, 2011, 8:05:12 AM

On 9/2/11 at 3:29 AM, sie...@stanford.edu (AES) wrote:

>Given a large number of single-page PDF image files, all in a single

>folder (on a Mac), write a notebook that will build a single

>multipage document containing some or all of these files, in

>accordance with (and with page order determined by) a list of some

>or all of these image file names.

On a Mac, probably the simplest way to do this would be to save

each portion of the notebook representing a single PDF page to a

separate PDF file, open the first page in Preview, display the

thumbnails (cmd 2) then drag the remaining files to the

thumbnail bar.

Drag to re-order however you like and save the result from Preview.

>For extra credit: Have each page in the final document bookmarked

>by the name of the corresponding file.

You can create bookmarks within Preview.

As for the first step, saving the files from Mathematica to PDF,

you could either manually select what is to go on a single page

and use the Save Selection As item in the File menu to save the

selection to PDF format. Alternatively, this step could easily

be automated using Export and the various Notebook*

functions to select what you want and save them as PDF.

Or you could use the various Notebook* functions (do ?Notebook*

to get a list) to create a notebook with just what you wanted to

save then save the whole thing from Mathematica as PDF file.

But unless you are going to do this often or have a large number

of files you want collected into the final PDF file, using a

manual method with Preview is likely to be considerably less effort.

Sep 3, 2011, 8:07:15 AM

raw text or whole pages in the form of images. It is not possible to

extract formatted text - say in the form of a notebook. Maybe that will

change in future versions of Mathematica, but I understand that the PDF

format is rather opaque.

David Bailey

http://www.dbaileyconsultancy.co.uk

Sep 3, 2011, 8:09:18 AM

Sep 3, 2011, 8:11:20 AM

Step 1: be grateful you own a Mac. PDF's are so easy to manipulate on a Mac.

This is one of those rare circumstances where I don't think Mathematica is your best solution. Preview exports actions to Automator for compiling pages into a PDF and for watermarking.

Since this is a Mathematica forum, I have to ask the group: Can Mathematica be used to script Automator or Applescript actions?

Daniel

Sep 3, 2011, 5:57:25 PM

Original query:

> Given a large number of single-page PDF image files, all in a single

> folder (on a Mac), write a notebook that will build a single multipage

> document containing some or all of these files, in accordance with

> (and with page order determined by) a list of some or all of these

> image file names.

>

> For extra credit: Have each page in the final document bookmarked by

> the name of the corresponding file.

In article <j3t5h8$58o$1...@smc.vnet.net>,

dr DanW <dmaxw...@gmail.com> wrote:

> This is one of those rare circumstances where I don't think Mathematica is

> your best solution. Preview exports actions to Automator for compiling pages

> into a PDF and for watermarking.

I guess I'm surprised at the answers I've gotten to this query -- thus

far, anyway -- which basically say, "You gotta do this by hand". All

I want to do is, in essence, import a bunch of files; concatenate 'em

(without in any way opening, "reading" or in any way processing them);

and re-export the concatenated file.

What I failed to understand, I suppose, is that one doesn't just

"concatenate" PDF files in this fashion. Text files, yes; image

files, probably; but PDF files, no. (Or can one, in fact, do this with

PDF files, without converting each PDF file to a .jpg or .png file? --

maybe at the cost of a bulkier final document?)

The Automator suggestion is interesting. I've played with it a bit;

found it powerful but quirky; and not a particularly fun language to

program in -- partly because it's hard to follow just what it's doing,

step by step, partly because it's not well documented.

But suppose your workflow involves generating and saving a large

number of one-page PDF files -- each file a spec sheet or catalog

page for one of the products or commercial items that you sell, for

example.

Every so often you edit a master list of the PDF files for those

products that you currently sell (a list of the file names, that is),

removing obsolete items, adding new ones. Then you hand this list to

a Mathematica notebook, which builds an updated multipage catalog of

all your current products. This can't be done . . . ???

As I'm typing, I'm thinking: Hey, I'm quite sure I can do this in TeX,

and quite easily in fact. I'll give that a try.

Sep 3, 2011, 5:56:24 PM

Hi AES,

This solution cheats because it uses latex to do the real job. However, it does use mathematica to assemble the latex code. To use it:

1. place your pdf's in a subdirectory of the directory that contains the notebook (avoid file names that confuse latex)

2. supply the name of the subdirectory in the notebook in the variable pdfsubdirectory (in this example the directory is called "pdf_files")

3. Execute the notebook.

This creates a latex file in the current directory that uses the package pdfpages. Typeset it and you will get the assembled pdf. I suppose one should be able to send the typesetting command directly from Mathematica to the shell to fully automate the process, but this job is for another volunteer :-)

Themis

(*specify name of subdirectory that contains the pdf files*)

pdfsubdirectory = "pdf_files";

thisdirectory = SetDirectory@NotebookDirectory[];

(*read in list of pdfs*)

mypdffiles = FileNames["*.pdf", pdfsubdirectory];

numberoffiles = Length[mypdffiles];

(*build latex code*)

latexcode =
  StringJoin @@
   Join[{"\\documentclass[11pt]{article}\n\\usepackage{pdfpages}\n\\begin{document}\n"},
    Table[StringJoin["\\includepdf[pages=-]{",
      ToString[mypdffiles[[i]]], "}\n"], {i, 1, numberoffiles}],
    {"\\end{document}"}];

Export["assemblepdf.tex", latexcode, "Text"]
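The typesetting step left open above could perhaps be handled with Run. This is only a sketch: it assumes pdflatex is installed and visible on the PATH that Mathematica's shell sees (on a Mac the front end may not inherit the TeX bin directory, in which case a full path to pdflatex would be needed), and the -interaction=nonstopmode flag is my choice, not part of the original notebook.

```mathematica
(* Sketch only: typeset the file written by Export["assemblepdf.tex", ...] above. *)
(* Assumption: pdflatex is on the PATH seen by Run; otherwise give its full path. *)
SetDirectory[NotebookDirectory[]];

exitCode = Run["pdflatex -interaction=nonstopmode assemblepdf.tex"];

(* A zero exit code plus an existing output file is taken as success. *)
If[exitCode == 0 && FileExistsQ["assemblepdf.pdf"],
 Print["Wrote ", FileNameJoin[{Directory[], "assemblepdf.pdf"}]],
 Print["pdflatex did not finish cleanly; check assemblepdf.log"]]
```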

Sep 3, 2011, 5:58:27 PM

I think the only reason this may be worth doing in Mathematica is that

the sorting and selection of the input files perhaps requires some

nontrivial processing. Assuming you have already solved that problem,

here is a simple way to proceed.

Calling this a Mathematica solution is perhaps cheating, but here it

is anyway:

coalescePDF[inputFiles_?ListQ, outputFile_?StringQ] :=
 If[FileExistsQ[outputFile],
  Print[outputFile <> " already exists and has not been overwritten"],
  Run["/System/Library/Automator/Combine\\ PDF\\ Pages.action/Contents/Resources/join.py --output " <>
    outputFile <> " " <> StringJoin[Riffle[inputFiles, " "]]]
 ]

The input filenames are passed to this function as strings in the list

inputFiles, and the desired output file name is the second argument.

So you would call it like this:

coalescePDF[{"picture1.pdf","picture2.pdf"}, "output.pdf"]

Just in case the line breaks are ambiguous when this gets posted, make

sure the string in the path name reads as one line containing

"Combine\\ PDF\\ Pages.action"

That's an Automator action that comes with OS X, so it's perhaps

better than using LaTeX or ghostscript etc., although I'm sure you

have those installed too.
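If any of the file names contain spaces or other shell-special characters, the command string above would be split in the wrong places. A possible variant that quotes every argument is sketched here; shellQuote and coalescePDFQuoted are hypothetical names introduced for illustration, and the script path is the one given above.

```mathematica
(* Wrap a string in single quotes for the shell, escaping embedded single quotes. *)
shellQuote[s_String] := "'" <> StringReplace[s, "'" -> "'\\''"] <> "'";

(* Path to the Automator action's script, as given in the post above. *)
joinScript =
  "/System/Library/Automator/Combine PDF Pages.action/Contents/Resources/join.py";

(* Same behaviour as coalescePDF, but every argument is shell-quoted. *)
coalescePDFQuoted[inputFiles : {__String}, outputFile_String] :=
 If[FileExistsQ[outputFile],
  Print[outputFile <> " already exists and has not been overwritten"],
  Run[StringJoin[Riffle[
     Join[{shellQuote[joinScript], "--output", shellQuote[outputFile]},
      shellQuote /@ inputFiles], " "]]]]
```

It would be called the same way, e.g. coalescePDFQuoted[{"spec sheet 1.pdf", "spec sheet 2.pdf"}, "catalog.pdf"].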

Regards,

Jens

Sep 3, 2011, 5:59:58 PM

On 9/3/11 at 8:05 AM, dmaxw...@gmail.com (dr DanW) wrote:

>Since this is a Mathematica forum, I have to ask the group: Can

>Mathematica be used to script Automator or Applescript actions?

Yes. You could actually run an Applescript from Mathematica.

There are unix utilities that allow you to call Applescript from

them. And with Mathematica's Run function, you can call up any

unix script from Mathematica.
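As a minimal illustration of that route, osascript is the command-line AppleScript runner that ships with Mac OS X. The one-line script below is just a harmless example of my own choosing, not something from this thread.

```mathematica
(* Sketch: run a one-line AppleScript through the osascript command-line tool. *)
(* The script text is an arbitrary example; it merely brings Preview to the front. *)
script = "tell application \"Preview\" to activate";

(* Run returns the exit code of the shell command. *)
Run["osascript -e '" <> script <> "'"]
```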

Additionally, with the current Mac OS, you can run Python, Perl,

Ruby etc., pretty much any scripting language you like. With the

command line interface to Mathematica, you can do just about

anything you like with respect to either scripting Mathematica

or running scripts from Mathematica.

But it is good to keep in mind even though Mathematica is a very

powerful tool, Mathematica is not the most efficient tool for

everything you might want to do with a computer.

Sep 4, 2011, 4:14:14 AM

In article <j3u7q8$a7p$1...@smc.vnet.net>,

Themis Matsoukas <tmats...@me.com> wrote:

> Hi AES,

>

> This solution cheats because it uses latex to do the real job. However, it

> does use mathematica to assemble the latex code. To use it:

Thanks very much -- but here's an even simpler way, using just Plain

TeX and TeXShop, without needing to bring Mathematica into the picture

at all.

% To insert a centered PDF image in TeXShop

\pageinsert

\null \vfill

\centerline{

\pdfximage

width xx in {my_pdf_file_name.pdf}

\pdfrefximage

\pdflastximage }

\vfill

\endinsert

where xx is the width in inches you want the PDF image to occupy on

the page.

Just write a TeX preamble that sets the various pdf page size,shape

and margin parameters, then insert a bunch of these \pageinserts, one

per file.

Better yet, macro-ize the above coding, then call the macro repeatedly

on the list of file names.
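For anyone who does want Mathematica to write out the macro calls, a rough sketch follows. It wraps the \pageinsert fragment above in a macro and emits one call per file. The macro name \pdfpage, the output name catalog.tex, the folder name "pdf_files" and the 6in width are all placeholders of mine, and the result has to be typeset with pdfTeX since it relies on \pdfximage.

```mathematica
(* Sketch: generate a plain TeX file that calls one macro per PDF file. *)
(* Assumptions: the PDFs live in "pdf_files" next to the notebook; \pdfpage, *)
(* catalog.tex and the 6in width are illustrative names and values only.     *)
SetDirectory[NotebookDirectory[]];
files = FileNames["*.pdf", "pdf_files"];  (* or any hand-edited, ordered list *)
width = "6in";

(* The \pageinsert fragment from the post, wrapped in a one-argument macro. *)
macro = "\\def\\pdfpage#1{%\n\\pageinsert\n\\null\\vfill\n" <>
   "\\centerline{\\pdfximage width " <> width <>
   " {#1}\\pdfrefximage\\pdflastximage}%\n\\vfill\n\\endinsert}\n";

(* One macro call per file, in list order. *)
calls = ("\\pdfpage{" <> # <> "}\n") & /@ files;

Export["catalog.tex", StringJoin[macro, calls, "\\bye\n"], "Text"]
```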

I've just checked this on a simple three-PDF example. Each PDF file

in the three-page output document seems to have been captured with

full vector coding of the image preserved; each page can be

individually opened and edited in Illustrator if one wants to.

TeXShop and complete installation of TeX Live of course available as

MacTeX from TUG; be sure to join TUG to support this.

Sep 4, 2011, 4:15:15 AM

On 9/3/11 at 5:55 PM, sie...@stanford.edu (AES) wrote:

>I guess I'm surprised at the answers I've gotten to this query --

>thus far, anyway -- which basically say, "You gotta do this by

>hand".

You have mis-interpreted the responses you have received. It

isn't "you gotta do this by hand". Instead, people are telling

you unless you are doing this often or have a very large number

of files, it is easier/more efficient to do this by hand.

>All I want to do is, in essence, import a bunch of files; concatenate

>'em (without in any way opening, "reading" or in any way processing

>them); and re-export the concatenated file.

There are a variety of third party apps available for the Mac

that will do just this. Many are free or minimal cost.

Sep 4, 2011, 6:08:13 PM

In article <j3vc2j$ek0$1...@smc.vnet.net>,

Bill Rowe <read...@sbcglobal.net> wrote:

All right -- I guess what's surprised me is that no one so far has

come up with a few lines of simple and straightforward Mathematica

coding that can do this job simply and quickly.

After all, one is supposed to be able to carry out one's _entire_ work

flow of analysis, calculation, _and publication_ (including the

inclusion of externally generated or provided content), entirely in

Mathematica -- is that not the mantra?

I just want to make a Mathematica-generated publication, to be

exported in PDF format, that will actually have almost no Mathematica

generated content -- maybe a title page or ToC -- but include a lot of

externally generated content, in the form of PDF files.

As I eventually realized, Plain TeX can do this easily.

Sep 5, 2011, 7:07:54 AM

> After all, one is supposed to be able to carry out one's _entire_ work

> flow of analysis, calculation, _and publication_ (including the

> inclusion of externally generated or provided content), entirely in

> Mathematica -- is that not the mantra?

No mantras here, my friend. None.

Bobby

On Sun, 04 Sep 2011 17:05:48 -0500, AES <sie...@stanford.edu> wrote:

> In article <j3vc2j$ek0$1...@smc.vnet.net>,

> Bill Rowe <read...@sbcglobal.net> wrote:

>

> All right -- I guess what's surprised me is that no one so far has

> come up with a few lines of simple and straightforward Mathematica

> coding that can do this job simply and quickly.

>

> After all, one is supposed to be able to carry out one's _entire_ work

> flow of analysis, calculation, _and publication_ (including the

> inclusion of externally generated or provided content), entirely in

> Mathematica -- is that not the mantra?

>

> I just want to make a Mathematica-generated publication, to be

> exported in PDF format, that will actually have almost no Mathematica

> generated content -- maybe a title page or ToC -- but include a lot of

> externally generated content, in the form of PDF files.

>

> As I eventually realized, Plain TeX can do this easily.

>

Sep 5, 2011, 7:17:28 AM

On Sep 3, 10:11 pm, dr DanW <dmaxwar...@gmail.com> wrote:

> Step 1: be grateful you own a Mac. PDF's are so easy to manipulate on a Mac.

>

> This is one of those rare circumstances where I don't think Mathematica is your best solution. Preview exports actions to Automator for compiling pages into a PDF and for watermarking.

>

> Since this is a Mathematica forum, I have to ask the group: Can Mathematica be used to script Automator or Applescript actions?

>

> Daniel

There is this:

http://library.wolfram.com/infocenter/MathSource/5688/

which may help. There is also a services package as well. I haven't

used either of them for many years and recall that an update was

needed for the graphics rendering in services (for better resolution).

Mike

Sep 5, 2011, 7:23:45 AM

On 9/4/11 at 6:05 PM, sie...@stanford.edu (AES) wrote:

>In article <j3vc2j$ek0$1...@smc.vnet.net>,

>Bill Rowe <read...@sbcglobal.net> wrote:

>>>All I want to do is, in essence, import a bunch of files;

>>>concatenate 'em (without in any way opening, "reading" or in any

>>>way processing them); and re-export the concatenated file.

>>There are a variety of third party apps available for the Mac that

>>will do just this. Many are free or minimal cost.

>All right -- I guess what's surprised me is that no one so far has

>come up with a few lines of simple and straightforward Mathematica

>coding that can do this job simply and quickly.

Why should this be surprising? While Mathematica is a very

powerful tool, it is not the best tool for all purposes.

>After all, one is supposed to be able to carry out one's _entire_

>work flow of analysis, calculation, _and publication_ (including the

>inclusion of externally generated or provided content), entirely in

>Mathematica -- is that not the mantra?

That may be the mantra of some. But it certainly isn't mine.

>I just want to make a Mathematica-generated publication, to be

>exported in PDF format, that will actually have almost no

>Mathematica generated content -- maybe a title page or ToC -- but

>include a lot of externally generated content, in the form of PDF

>files.

>As I eventually realized, Plain TeX can do this easily.

Recall, Mathematica's Run function. That is you can invoke

command line TeX tools from Mathematica. So, anything you can do

with TeX could be done from within Mathematica if you like. In

fact, since Mathematica can be seen as a general purpose

programming language, you could actually create command line TeX

tools from Mathematica if you wanted. In principle, anything you

want to do on a computer could be done within Mathematica. But

this is hardly an efficient or easy way to accomplish many tasks

you are likely to want to do with a computer.

Sep 6, 2011, 4:09:24 AM

> Thanks very much -- but here's an even simpler way,

> using just Plain

> TeX and TeXShop, without needing to bring Mathematica

> into the picture

> at all.

I thought that the whole point of the challenge was to use Mathematica! :-)

>

> % To insert a centered PDF image in TeXShop

>

> \pageinsert

> \null \vfill

> \centerline{

> \pdfximage

> width xx in {my_pdf_file_name.pdf}

> \pdfrefximage

> \pdflastximage }

> \vfill

> \endinsert

>

> where xx is the width in inches you want the PDF

> image to occupy on

> the page.

>

> Just write a TeX preamble that sets the various pdf

> page size,shape

> and margin parameters, then insert a bunch of these

> \pageinserts, one

> per file.

>

But don't you have to code the list of files manually? The notebook I posted produces a LaTeX file in which the list of files, however long, is processed by Mathematica (see example below). You can probably adapt it to the plain TeX version.

\documentclass[11pt]{article}

\usepackage{pdfpages}

\begin{document}

\includepdf[pages=-]{pdf_files/fig2_ex_BWR.pdf}

\includepdf[pages=-]{pdf_files/fig2_interpolation.pdf}

\includepdf[pages=-]{pdf_files/mathematica.pdf}

\includepdf[pages=-]{pdf_files/solution.pdf}

\end{document}

As a side comment, pdfpages allows you to include pdf files with multiple pages and even to choose which pages to include. My example includes all pages.
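To get closer to the original request (pages chosen and ordered by a list of file names, each page bookmarked by its file name), the notebook could be adapted roughly as below. This is a sketch under assumptions: it relies on the addtotoc option of pdfpages together with hyperref to produce the bookmarks, the names in wanted are just the example files listed above, and LaTeX will likely need to be run twice before the bookmarks appear.

```mathematica
(* Sketch: build the LaTeX file from an explicit, ordered list of file names. *)
SetDirectory[NotebookDirectory[]];
pdfsubdirectory = "pdf_files";

(* The master list; its order determines the page order. *)
wanted = {"solution.pdf", "fig2_ex_BWR.pdf", "mathematica.pdf"};

(* Keep only names that actually exist in the subdirectory. *)
present = Select[wanted, FileExistsQ[FileNameJoin[{pdfsubdirectory, #}]] &];

(* Escape underscores so the file name can be used as a LaTeX heading. *)
texTitle[name_String] := StringReplace[FileBaseName[name], "_" -> "\\_"];

(* One \includepdf line per file; addtotoc = {page, section, level, heading, label}. *)
includeLine[name_String, {k_Integer}] :=
  "\\includepdf[pages=-,addtotoc={1,section,1,{" <> texTitle[name] <>
   "},{page" <> ToString[k] <> "}}]{" <> pdfsubdirectory <> "/" <> name <> "}\n";

latexcode = StringJoin[
   "\\documentclass[11pt]{article}\n\\usepackage{pdfpages}\n" <>
    "\\usepackage{hyperref}\n\\begin{document}\n",
   MapIndexed[includeLine, present],
   "\\end{document}\n"];

Export["assemblepdf.tex", latexcode, "Text"]
```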

> TeXShop and complete installation of TeX Live of

> course available as

> MacTeX from TUG; be sure to join TUG to support this.

>

There is also a support group dedicated to TeX on Mac: http://email.esm.psu.edu/mailman/listinfo/macosx-tex

Themis

Sep 7, 2011, 5:43:37 AM

Hi,

This might work if the PDFs are always smaller than a page.

1. generate some sample PDF files - sin plots

dir = NotebookDirectory[];

t = 20;

d = Table[Plot[Sin[a x], {x, 1, 20}], {a, 1, t}];

Export[FileNameJoin[{dir,

"Sin[" <> ToString[NumberForm[#, 2, NumberPadding -> "0"]] <>

"t].pdf"}], d[[#]]] & /@ Range[t]

2. Get file name list

files = FileNames[dir <> "Sin*.pdf"];

3. Generate Notebook - (maybe brute force but seems to work)

Flatten is used to get a final list of Cell objects.

It creates a title page and one section per file. The file name is used for the section name, and a page break is added at the end of each section.

l = Flatten[{
    {TextCell["My Title for this Doc", "Title"],
     Cell["", "PageBreak", PageBreakBelow -> True]},
    {TextCell[FileNameTake[files[[#]]], "Section"],
       ExpressionCell[Import[files[[#]]][[1]], "Input"],
       Cell["", "PageBreak", PageBreakBelow -> True]} & /@
     Range@Length[files]}];

nb = CreateDocument[l, Visible -> False, NotebookFileName -> "test"]

4. Save Notebook as PDF

Export[dir <> "test.pdf", nb]

5. Clean up test files

DeleteFile[dir <> "test.nb"]

DeleteFile[files]

Maybe not 100% what you're looking for, but it would avoid other tools.

Ulrich
