Best way to get a .pbm file from latex?

JohnF

Dec 5, 2016, 7:56:58 PM

I'm trying to get a P1-.pbm from latex, and have solved
that problem as follows...
Sample input .tex file
\documentclass[10pt]{article}
\pagestyle{empty}
\begin{document}
\setlength{\parindent}{0pt}
\noindent $\displaystyle xyz$
\end{document}
I then ran that through both latex, to get file.dvi,
and pdflatex, to get file.pdf.
Next I ran ImageMagick's convert on each,
convert -trim -compress none file.dvi filedvi.pbm
convert -trim -compress none file.pdf filepdf.pbm
and diff filedvi.pbm filepdf.pbm reports that they're identical.
But if I increase resolution, e.g.,
convert -trim -compress none -density 300 file.dvi filedvi.pbm
convert -trim -compress none -density 300 file.pdf filepdf.pbm
then there's a very slight diff, but the images are visually
indistinguishable.

So is there a faster,better,whatever "optimal" way to
generate a .pbm? I've got lots of this to do, and want to
find the best (whatever that means) procedure.
But I'm not even sure what else to try, or how to
evaluate the results.
--
John Forkosh ( mailto: j...@f.com where j=john and f=forkosh )
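
On the question of how to evaluate the results: one option is to count differing pixels between two P1 files instead of relying on a textual diff. The following is a minimal sketch in Python, not part of the original post; the function names parse_p1 and count_differences are hypothetical, and it assumes both files are plain (P1) PBM.

```python
def parse_p1(text):
    """Parse plain (P1) PBM text into (width, height, rows of 0/1 ints)."""
    # Strip '#' comments, which run to the end of the line.
    lines = [line.split("#", 1)[0] for line in text.splitlines()]
    tokens = " ".join(lines).split()
    if not tokens or tokens[0] != "P1":
        raise ValueError("not a plain PBM (P1) file")
    width, height = int(tokens[1]), int(tokens[2])
    # Pixels may be written without separating whitespace, so split into digits.
    bits = [int(ch) for ch in "".join(tokens[3:])]
    if len(bits) != width * height:
        raise ValueError("pixel count does not match header")
    rows = [bits[r * width:(r + 1) * width] for r in range(height)]
    return width, height, rows


def count_differences(text_a, text_b):
    """Return the number of differing pixels, or None if the sizes differ."""
    wa, ha, rows_a = parse_p1(text_a)
    wb, hb, rows_b = parse_p1(text_b)
    if (wa, ha) != (wb, hb):
        return None
    return sum(a != b for ra, rb in zip(rows_a, rows_b) for a, b in zip(ra, rb))
```

Reading filedvi.pbm and filepdf.pbm as text and passing both to count_differences would give a number to compare across resolutions, which is more informative than knowing only that diff found some difference.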

Peter Flynn

Dec 6, 2016, 2:14:10 PM

On 12/06/2016 12:56 AM, JohnF wrote:
> I'm trying to get a P1-.pbm from latex,

If you don't mind using a graphical interface instead of the command
line, I would import the .pdf into GIMP at 300dpi and then export as
pbm, but I don't know how the results compare.

> So is there a faster,better,whatever "optimal" way to
> generate a .pbm? I've got lots of this to do, and want to
> find the best (whatever that means) procedure.

Ah, OK, so you do need a commandline solution to script. I think what
you're doing sounds fine, in that case, although you could try the
netpbm utilities instead of ImageMagick.

///Peter

JohnF

Dec 6, 2016, 10:43:51 PM

Thanks, Peter. Yeah, pretty much do need commandline. Actually,
it's a C program, acting more or less as a script, that will
run the commands using the system() call after constructing
appropriate commandstrings. So I'm just doing it by hand for
the time being, to figure out exactly what needs to be programmed.
I'd been vaguely aware of the netpbm package (which is in fact
default installed on my slackware linux box), but it hadn't occurred
to me to use it until you mentioned it. So I gave it a tryout...
latex file.tex to get file.dvi
dvips file.dvi -o file.ps to get file.ps
pstopnm -forceplain -pbm file.ps to get P1 file.pbm
The problem is convert's -trim option, for which I couldn't find
a netpbm counterpart via man or google. So pstopnm gives a 556x786 pbm file
with 444151 bytes -- the whole page with 99.999% blank space (zeroes).
Whereas convert -trim -compress none file.ps file.pbm (no -density 300)
gives 16x7 with 239 bytes -- just the rectangle containing the image.
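
What -trim does here amounts to cropping the bitmap to the bounding box of its black pixels. The sketch below illustrates that idea in Python on an in-memory bitmap; it is not ImageMagick's actual implementation, and trim_bitmap is a hypothetical name. Netpbm also ships a pnmcrop tool that may be the counterpart to look at; that is a suggestion from outside the thread and untested here.

```python
def trim_bitmap(rows):
    """Crop a bitmap (list of rows of 0/1) to the bounding box of its 1-pixels."""
    # Rows that contain at least one black pixel.
    ink_rows = [r for r, row in enumerate(rows) if any(row)]
    if not ink_rows:
        return []  # blank page: nothing to keep
    # Columns that contain at least one black pixel.
    ink_cols = [c for c in range(len(rows[0])) if any(row[c] for row in rows)]
    top, bottom = ink_rows[0], ink_rows[-1]
    left, right = ink_cols[0], ink_cols[-1]
    return [row[left:right + 1] for row in rows[top:bottom + 1]]
```

Applied to the full-page 556x786 output, this would keep only the small rectangle around the formula.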

Peter Flynn

Dec 7, 2016, 4:47:58 PM

On 12/07/2016 03:43 AM, JohnF wrote:
[...]

> Thanks, Peter. Yeah, pretty much do need commandline. Actually,
> it's a C program, acting more or less as a script, that will
> run the commands using the system() call after constructing
> appropriate commandstrings. So I'm just doing it by hand for
> the time being, to figure out exactly what needs to be programmed.
> I'd been vaguely aware of the netpbm package (which is in fact
> default installed on my slackware linux box), but it hadn't occurred
> to me to use it until you mentioned it. So I gave it a tryout...
> latex file.tex to get file.dvi
> dvips file.dvi -o file.ps to get file.ps
> pstopnm -forceplain -pbm file.ps to get P1 file.pbm
> Problem is that convert -trim option, for which I couldn't
> man or google a netpbm counterpart. So it gives a 556x786 pbm file
> with 444151 bytes -- the whole page with 99.999% blank space (zeroes).
> Whereas convert -trim -compress none file.ps file.pbm (no -density 300)
> gives 16x7 with 239 bytes -- just the rectangle containing the image.

Sorry, didn't realise there was a lot of empty space. I'd just pass it
through pdfcrop.

pdflatex test;pdfcrop test;pdftoppm test-crop.pdf >test.ppm

///Peter
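
For scripting, the one-liner above can be expressed as a list of commands that stops at the first failure, which the `;` chaining does not do. This is a rough sketch in Python; build_pipeline and run_pipeline are hypothetical names, and the optional -mono flag (1-bit output from pdftoppm) is an addition not present in the command above.

```python
import subprocess


def build_pipeline(stem, mono=False):
    """Return (argv, stdout_path) pairs for pdflatex -> pdfcrop -> pdftoppm."""
    pdftoppm = ["pdftoppm"]
    if mono:
        pdftoppm.append("-mono")  # assumption: 1-bit PBM output is wanted
    pdftoppm.append(f"{stem}-crop.pdf")
    out_ext = "pbm" if mono else "ppm"
    return [
        (["pdflatex", stem], None),
        (["pdfcrop", stem], None),  # as in the one-liner, yields <stem>-crop.pdf
        (pdftoppm, f"{stem}.{out_ext}"),  # no output root, so the image goes to stdout
    ]


def run_pipeline(steps):
    """Run each step in order; check=True raises on the first failing command."""
    for argv, stdout_path in steps:
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "wb") as out:
                subprocess.run(argv, check=True, stdout=out)
```

Calling run_pipeline(build_pipeline("test")) would mirror the command line above, provided all three tools are installed and on the PATH.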

JohnF

Dec 8, 2016, 12:21:01 AM

Thanks again, Peter. Hadn't been aware of pdfcrop, but it doesn't
seem to work for this purpose. As far as I can tell, seems to crop
from one papersize to another, but doesn't eliminate whitespace
within the output papersize. So, using the demo testfile in first
post, after pdflatex'ing,
pdfcrop test.pdf cropped.pdf
produces a cropped.pdf that diff reports as identical to test.pdf.
The output is also identical when I specify a papersize,
pdfcrop test.pdf letter cropped.pdf
and even identical with a4. What I'd say it needs is an "eps" kind
of option to trim all possible whitespace, leaving only the smallest
rectangle that contains all non-whitespace.

That's what convert -trim does, though interestingly not for pdf-->pdf.
That is,
convert -trim test.pdf trimmed.png
produces a png with **all** unnecessary whitespace removed,
which is exactly what I want. But,
convert -trim test.pdf trimmed.pdf
is still a **full lettersize** page (but weirdly, with the "xyz" text
moved from near-the-top to near-the-bottom of the page -- go figure:).

Peter Flynn

Dec 8, 2016, 4:41:24 PM

On 12/08/2016 05:20 AM, JohnF wrote:
> Peter Flynn <pe...@silmaril.ie> wrote:
[...]

>> Sorry, didn't realise there was a lot of empty space. I'd just pass it
>> through pdfcrop.
>> pdflatex test;pdfcrop test;pdftoppm test-crop.pdf >test.ppm
>> ///Peter
>
> Thanks again, Peter. Hadn't been aware of pdfcrop, but it doesn't
> seem to work for this purpose. As far as I can tell, seems to crop
> from one papersize to another, but doesn't eliminate whitespace
> within the output papersize.

That's weird. It crops right to the edge of the type here (v1.38,
2012/11/02), see http://latex.silmaril.ie/test/croptest.tex and
croptest.aux
croptest.log
croptest.pdf
croptest-crop.pdf
croptest.ppm
all in the same web folder.

> So, using the demo testfile in first
> post, after pdflatex'ing,
> pdfcrop test.pdf cropped.pdf
> produces diff test.pdf cropped.pdf that are identical.
> And they're also identical with a papersize
> pdfcrop test.pdf letter cropped.pdf
> and even identical with a4.

I've never seen it do that.

> What I'd say it needs is an "eps" kind
> of option to trim all possible whitespace, leaving only the smallest
> rectangle that contains all non-whitespace.

That's exactly what it does, AFAIK.

///Peter

JohnF

Dec 8, 2016, 10:57:23 PM

Okay, to reiterate your opening "that's weird" remark,
let me add "and this is weirder":), as follows...

(1) So I first downloaded all your files, one-by-one using wget,
and ran pdfcrop croptest.pdf jftest.pdf against your original
pdf, just as a check. But this time, voila, it worked as
you advertised. So maybe you think it has something to do with
my original pdf versus yours??? Nope, read on...
(2) As I'm writing this, I'm working on a different box, this one
running slackware 14.2 linux, whereas the tests I ran yesterday,
while posting preceding followup, were on a box running 14.1.
That earlier box didn't have pdfcrop at all, so I got it from
sourceforge,
https://sourceforge.net/projects/pdfcrop/files/
clicking the "latest version Download pdfcrop_v0.4b.tar.gz (18.8 kB)"
link. And **that's** what produced (and still does) the problem.
This current box, running slackware 14.2, has PDFCROP 1.5, 2004/06/24
according to its --help. The downloaded one doesn't show version/date
with its --help, but its source file perl script says
PDFCrop version 0.4b Copyright 2011. So a later date by seven years,
but earlier version. And the 0.4b version continues to produce
the original bad result I reported, even when run on this box.
(3) Try it...I dare you:). Download that sourceforge version, which
they're advertising as "latest", and run that against your original
croptest.pdf. I've got $20 USD that says you'll now reproduce my
originally-reported bad result. So do I win??? I think maybe
sourceforge needs to be updated. Or something.

Anyway, now using pdftoppm -mono, rather than convert, seems to work
okay, except it produces a compressed P4 file, and I have to run that
through convert anyway to get a P1 file. Not seeing a pdftoppm -switch
for that. Only other diff is that (both run against exactly the same pdf)
convert produces a default 16x7 file, whereas pdftoppm's is 34x15.
Not sure where they're getting their respective default resolutions from,
though they're both -switch-settable. In any case, it seems to be
pretty much "six of one, half-a-dozen of the other", with respect to
the "which is better?" question. I'll certainly program it both ways
(and likely any third way that's suggested), with one or the other
as default, with an optional user override at runtime. It's trivial to
set up the corresponding commandstrings and system() calls,
and I certainly want to have all different possibilities documented
in the code. Thanks again,
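
The P4-to-P1 step does not strictly need a second external tool, since the two formats differ only in how the same bits are stored: P4 packs eight pixels per byte, most significant bit first, with each row padded to a whole byte. Below is a minimal sketch of that conversion in Python; p4_to_p1 is a hypothetical name and the code assumes a single image per file.

```python
def p4_to_p1(data):
    """Convert raw PBM (P4) bytes to plain PBM (P1) text."""
    tokens, pos = [], 0
    # Read three header tokens (magic, width, height), skipping '#' comments.
    while len(tokens) < 3:
        while data[pos:pos + 1].isspace():
            pos += 1
        if data[pos:pos + 1] == b"#":
            while data[pos:pos + 1] not in (b"\n", b""):
                pos += 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        if start == pos:
            raise ValueError("truncated header")
        tokens.append(data[start:pos])
    if tokens[0] != b"P4":
        raise ValueError("not a raw PBM (P4) file")
    width, height = int(tokens[1]), int(tokens[2])
    pos += 1  # exactly one whitespace byte separates the header from the raster
    row_bytes = (width + 7) // 8  # each row is padded to a whole number of bytes
    raster = data[pos:pos + row_bytes * height]
    if len(raster) != row_bytes * height:
        raise ValueError("truncated raster")
    lines = ["P1", f"{width} {height}"]
    for r in range(height):
        row = raster[r * row_bytes:(r + 1) * row_bytes]
        # The most significant bit of each byte is the leftmost pixel.
        bits = "".join(f"{byte:08b}" for byte in row)[:width]
        # Plain PBM lines are kept within 70 characters.
        lines.extend(bits[i:i + 70] for i in range(0, width, 70))
    return "\n".join(lines) + "\n"
```

On the differing default sizes: ImageMagick rasterizes at 72 dpi by default, while pdftoppm defaults to 150 dpi. Since 16 x 150/72 is about 33.3 and 7 x 150/72 is about 14.6, the 34x15 result is in line with the 16x7 one.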

Joost Kremers

Dec 9, 2016, 4:45:44 AM

They seem to be two completely different scripts. The Sourceforge one probably shouldn't even be called pdfcrop, because its documentation says:

,----
| Function:
| calculates the page metrics and scale factor of a PDF file,
| then crops and scales the PDF, so that it neatly fits on a
| standard size sheet of paper
`----

The "standard size sheet of paper" is the giveaway. This pdfcrop tries to fit your pdf onto a letter, a4 or legal paper size. So it's not about cropping at all. pdffit would be a better name.

--
Joost Kremers joostk...@fastmail.fm
Selbst in die Unterwelt dringt durch Spalten Licht

JohnF

Dec 9, 2016, 10:04:45 PM

Thanks, Joost. I absent-mindedly missed that.
Just googled pdfcrop, and sourceforge was the first hit.
They're usually reliable, so I just took what it gave me
without further investigation. But the second google hit
was https://www.ctan.org/pkg/pdfcrop which is what I
should have downloaded. The name collision never even
crossed my mind. Not even after it (sourceforge's) failed
to work as advertised (by Peter), when it should have
dawned on me to take a closer look.