
Making thumbnails efficiently with ghostscript


bugbear

May 22, 2012, 9:05:50 AM
(I've asked this before, but the thread fizzled)

http://groups.google.com/group/comp.lang.postscript/browse_thread/thread/ca973ffaf885834c/56f107d7b68ee435?hl=en#56f107d7b68ee435

I would like to use ghostscript to make a thumbnail of the first page of a PDF file,
or an EPSF file, to the following spec.

* The thumbnail must have the SAME aspect ratio as the original
* the size of the longest side of the thumbnail must be 128 pixels

The correct pixels can be easily made, by simply rendering at a "too-high"
resolution, and then using ImageMagick:

convert <raster> -resize 128x128 <thumbnail>

But this can be resource hungry for large media sources. Setting the
resolution high enough to get good thumbs for tiny images
(e.g. a business card image) leaves the resolution FAR
too high for large images (like a newspaper page).

Using info from the previous thread, the closest approach is to use these flags:

-r72 -dDEVICEWIDTHPOINTS=128 -dDEVICEHEIGHTPOINTS=128 -dPDFFitPage -dEPSFitPage

Which gives me a 128x128 raster every time. But if the aspect ratio
of the original is NOT 1:1, I either get white padding
on the right, or white padding on the top.

Can anyone suggest a way to get my "dream" thumbnail,
i.e. the same pixels as I'm currently getting, but with
no padding?

The closest I can get at the moment is a second pass with ImageMagick, using
options to trim any exterior whitespace:

-bordercolor white -border 1x1 -trim +repage

But this will remove any peripheral white
*content* as well as the white padding from the rendering.
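
An alternative to trimming afterwards would be to request a canvas that already has the page's aspect ratio, so that no padding is produced in the first place. The sketch below only does the arithmetic: given a page size in points, it prints the same flags as above with the 128x128 square replaced by a size whose longest side is 128. The function name thumb_flags is made up for illustration, the page size has to be obtained separately, and it is an assumption (not tested here) that the fit-page flags behave the same with a non-square device size. Rounding the shorter side to whole pixels can shift the aspect ratio by up to one pixel.

```shell
#!/bin/sh
# thumb_flags WIDTH_PT HEIGHT_PT [LONGEST_PX]
# Hypothetical helper: prints Ghostscript flags for a canvas whose
# longest side is LONGEST_PX (default 128) and whose aspect ratio
# matches the page. At -r72 one point equals one pixel, so the
# DEVICE*POINTS values are also the pixel dimensions.
thumb_flags() {
    awk -v w="$1" -v h="$2" -v max="${3:-128}" 'BEGIN {
        if (w <= 0 || h <= 0) exit 1          # reject empty boxes
        scale = max / (w > h ? w : h)         # scale by the longest side
        tw = int(w * scale + 0.5); if (tw < 1) tw = 1
        th = int(h * scale + 0.5); if (th < 1) th = 1
        printf "-r72 -dDEVICEWIDTHPOINTS=%d -dDEVICEHEIGHTPOINTS=%d -dPDFFitPage -dEPSFitPage\n", tw, th
    }'
}
```

For a page twice as wide as it is tall, thumb_flags 200 100 yields a 128x64 canvas instead of 128x128.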

All help welcomed (and I bet I'm not the only one wanting this)

BugBear

Thomas Kaiser

May 22, 2012, 10:11:25 AM
["Followup-To:" comp.text.pdf]
bugbear wrote in <news:PKKdnck_eruzEibS...@brightview.co.uk>
> I would like to use ghostscript to make a thumbnail of the first page of a PDF file,
> or an EPSF file, to the following spec.
>
> * The thumbnail must have the SAME aspect ratio as the original
> * the size of the longest side of the thumbnail must be 128 pixels
>
> The correct pixels can be easily made, by simply rendering at a
> "too-high" resolution, and then using ImageMagick:

Use the dimensions of the trimbox (or mediabox if missing), do some math
to get the appropriate resolution for an image with at least 256x256
pixels and then use this as $MyRenderResolution. This

gs -dUseTrimBox -q -dBATCH -dNOPAUSE -r${MyRenderResolution} \
-sDEVICE=... -dDOINTERPOLATE -dTextAlphaBits=4 -dGraphicsAlphaBits=4 \
-dMaxBitmap=100000000 -dMaxPatternBitmap=2000000 -sOutputFile=... \
-c 100000000 setvmthreshold -f /path/to/pdf

will produce an image with correct aspect ratio (-dTextAlphaBits=2 and
-dGraphicsAlphaBits=2 should work a bit faster without decreasing
rendering quality since you get an image with twice the pixel dimensions
as necessary).

When you scale down using ImageMagick in a second step you can then apply
some slight sharpening or unsharp masking, which will produce superior
results.

Regards,

Thomas

bugbear

May 22, 2012, 10:19:22 AM
Thomas Kaiser wrote:
> ["Followup-To:" comp.text.pdf]
> bugbear wrote in<news:PKKdnck_eruzEibS...@brightview.co.uk>
>> I would like to use ghostscript to make a thumbnail of the first page of a PDF file,
>> or an EPSF file, to the following spec.
>>
>> * The thumbnail must have the SAME aspect ratio as the original
>> * the size of the longest side of the thumbnail must be 128 pixels
>>
>> The correct pixels can be easily made, by simply rendering at a
>> "too-high" resolution, and then using ImageMagick:
>
> Use the dimensions of the trimbox (or mediabox if missing), do some math
> to get the appropriate resolution for an image with at least 256x256
> pixels and then use this as $MyRenderResolution. This

OK - any recommendation for "getting the trimbox/mediabox/cropbox"
efficiently, in a perl or shell script? You're quite
right that this would work, of course.

Thomas Kaiser

May 22, 2012, 11:34:32 AM
bugbear wrote in <news:bsSdncge-572PSbS...@brightview.co.uk>
> Thomas Kaiser wrote:
>> ["Followup-To:" comp.text.pdf]
>> bugbear wrote in<news:PKKdnck_eruzEibS...@brightview.co.uk>
>>> I would like to use ghostscript to make a thumbnail of the first
>>> page of a PDF file, or an EPSF file, to the following spec.
>>>
>>> * The thumbnail must have the SAME aspect ratio as the original
>>> * the size of the longest side of the thumbnail must be 128 pixels
>>>
>>> The correct pixels can be easily made, by simply rendering at a
>>> "too-high" resolution, and then using ImageMagick:
>>
>> Use the dimensions of the trimbox (or mediabox if missing), do some
>> math to get the appropriate resolution for an image with at least
>> 256x256 pixels and then use this as $MyRenderResolution. This
>
> OK - any recommendation for "getting the trimbox/mediabox/cropbox"
> efficiently, in a perl or shell script?

In our workflows we use Helios' pdfinfo [1]. But you could also use the
one from XPDF [2] with the "-box" parameter.

The values you get are in PostScript points (1/72 inch) and in this
order: x/y of the lower left corner and x/y of the upper right.

'pdfinfo -box' might produce this:

Title: 27491-0_Agassi_Open_komplett.indd
Author: kbauer
Creator: Adobe InDesign CS3 (5.0.4)
Producer: Acrobat Distiller 9.0.0 (Macintosh)
CreationDate: Fri Sep 25 10:45:07 2009
ModDate: Thu Nov 5 15:04:54 2009
Tagged: no
Pages: 1
Encrypted: no
Page size: 403.139 x 618.095 pts
MediaBox: 920.82 29.03 1323.96 647.12
CropBox: 920.82 29.03 1323.96 647.12
BleedBox: 920.82 29.03 1323.96 647.12
TrimBox: 920.82 29.03 1323.96 647.12
ArtBox: 920.82 29.03 1323.96 647.12
File size: 17634656 bytes
Optimized: yes
PDF version: 1.6

To calculate the dimensions of the TrimBox in pts, subtract value 1 from
value 3 (x axis) and value 2 from value 4 (y axis). To convert to inches,
simply divide by 72. In a shell script this could be done in two steps:

MyXAxis=$(pdfinfo -box "${MyPDF}" | grep "^TrimBox" | awk -F" " '{print "("$4" - "$2") / 72"}' | bc -l)
MyYAxis=$(pdfinfo -box "${MyPDF}" | grep "^TrimBox" | awk -F" " '{print "("$5" - "$3") / 72"}' | bc -l)
echo "TrimBox dimensions in inch: ${MyXAxis} x ${MyYAxis}"

The result would be:

TrimBox dimensions in inch: 5.59916666666666666666 x 8.58458333333333333333

Take the larger of the two values and divide 256 by it. Since bc truncates
the result to an integer here, don't forget to increment it by 1 before
using it with gs's "-r" switch:

MyRenderResolution=$(( 1 + $(echo 256 / ${MyYAxis} | bc) ))

This will ensure that the longest side has at least twice the pixels
needed for downscaling to 128 pixels (applying some sharpening [3]).
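
The steps above can also be folded into a single awk call that picks the longer side by itself. This is only a sketch: the function name render_resolution is invented for illustration, and it assumes the input is one box line in the layout shown in the pdfinfo output above (label, then lower-left x/y, then upper-right x/y).

```shell
#!/bin/sh
# render_resolution "BOX LINE" [TARGET_PX]
# Hypothetical helper: takes one line such as
#   "TrimBox: 920.82 29.03 1323.96 647.12"
# and prints an integer resolution for gs's -r switch so that the
# longest side is rendered with at least TARGET_PX pixels (default 256).
render_resolution() {
    printf '%s\n' "$1" | awk -v target="${2:-256}" '{
        w = $4 - $2                            # box width in points
        h = $5 - $3                            # box height in points
        if (w <= 0 || h <= 0) exit 1           # reject empty boxes
        longest = (w > h) ? w : h
        # pixels = points / 72 * resolution; truncate, then add 1
        print 1 + int(target * 72 / longest)
    }'
}
```

It could then be called along the lines of

MyRenderResolution=$(render_resolution "$(pdfinfo -box "${MyPDF}" | grep "^TrimBox")")

which for the example file above gives 30, the same value as the two-step version.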

Regards,

Thomas

[1] http://www.helios.de/web/EN/products/HELIOS-PDF-HandShake.html?lang_id=en
[2] http://www.foolabs.com/xpdf/
[3] http://redskiesatnight.com/2005/04/06/sharpening-using-image-magick/