
style of PostScript generated by pdf2ps?


bugbear

Jul 16, 2012, 5:29:07 AM
This is (in fact) just the output of the pswrite driver in GS.

This has changed:

The older driver generated stuff like this

%!PS-Adobe-3.0
%%Pages: (atend)
%%BoundingBox: 0 0 90 85
%%HiResBoundingBox: 0.000000 0.000000 90.000000 85.000000
%.............................................
%%Creator: GPL Ghostscript 900 (pswrite)
%%CreationDate: 2012/07/16 10:18:24
%%DocumentData: Clean7Bit
%%LanguageLevel: 2
%%EndComments

The newer driver:

%!PS-Adobe-3.0
%%BoundingBox: 0 0 612 792
%%Creator: GPL Ghostscript 904 (ps2write)
%%LanguageLevel: 2
%%CreationDate: D:20120713171249+01'00'
%%Pages: 1
%%EndComments

The %%BoundingBox comment in the new output is incorrect,
or at least deferred in a very unhelpful way; *this*
occurs later on...

%%Page: 1 1
%%PageBoundingBox: 0 0 198 198

Which is the "true" size. Indeed, every box (including media
and crop) in the driving PDF is 198x198, so the 612x792
numbers in the BoundingBox comment are just defaults.

This has caused me trouble, since I had a script that was
using the BoundingBox comments as the size
of the image represented by the PostScript
(which was always a single page in my context).

BugBear

Helge Blischke

Jul 16, 2012, 5:55:48 AM
Ghostscript's ps2write device in essence outputs a linearized version of PDF
prepended by a procset that permits an ordinary level2 interpreter to
successfully render the stream. That implies reducing the input PDF or PS to
level2 compatible objects.
One (weird?) feature of this driver is the attempt to tailor the output to
specific printer features such as paper size. Therefore the global
BoundingBox reflects Ghostscript's default page size or, if specified, the
value from the "-sPAPERSIZE=..." commandline switch.
Specifying "-dSetPageSize=true" for the ps2write device forces the media or
crop (if present) box for every page to be reflected both in a setpagedevice
dictionary and in a PageBoundingBox comment.
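
For reference, a minimal sketch of a command line carrying that switch. The function name and the file names are placeholders, and the other switches (-q, -dNOPAUSE, -dBATCH, -sOutputFile) are just the usual batch-mode options rather than anything specific to this problem. The function only builds the argument list; it does not run Ghostscript.

```python
import shlex


def ps2write_command(pdf_path, ps_path, set_page_size=True):
    """Build (but do not run) a gs command line for the ps2write device."""
    cmd = ["gs", "-q", "-dNOPAUSE", "-dBATCH", "-sDEVICE=ps2write"]
    if set_page_size:
        # Ask ps2write to reflect each page's media/crop box in the output.
        cmd.append("-dSetPageSize=true")
    cmd.append("-sOutputFile=" + ps_path)
    cmd.append(pdf_path)
    return cmd


if __name__ == "__main__":
    # Print a shell-quoted version that can be pasted into a terminal.
    print(" ".join(shlex.quote(arg) for arg in ps2write_command("in.pdf", "out.ps")))
```

The list form can be handed to subprocess.run() as-is if the script should invoke Ghostscript itself.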

I'd recommend modifying your script to grep through the generated PS stream
for the PageBoundingBox comments and use those values if specified.
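
A rough sketch of that approach for the single-page case described above. The function name is made up for illustration, and it assumes integer box values as in the headers quoted in this thread: take the first %%PageBoundingBox if there is one, otherwise fall back to the document-level %%BoundingBox.

```python
import re

_NUMS = r"\s*(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)"
_PAGE_BOX = re.compile(r"^%%PageBoundingBox:" + _NUMS, re.MULTILINE)
_DOC_BOX = re.compile(r"^%%BoundingBox:" + _NUMS, re.MULTILINE)


def image_size(ps_text):
    """Return (width, height) in points, or None if no usable comment exists."""
    # Prefer the per-page comment; use the document-level one only as a fallback.
    match = _PAGE_BOX.search(ps_text) or _DOC_BOX.search(ps_text)
    if match is None:
        return None  # e.g. "%%BoundingBox: (atend)" and no page-level comment
    llx, lly, urx, ury = (int(v) for v in match.groups())
    return urx - llx, ury - lly
```

With the ps2write header quoted earlier this gives 198x198 rather than 612x792, and with the old pswrite header it still gives 90x85.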

Helge

ken

Jul 16, 2012, 6:15:58 AM
In article <CL-dne2qwa15Q57N...@brightview.co.uk>,
bugbear@trim_papermule.co.uk_trim says...

> This is (in fact) just the output of the pswrite driver in GS.

That's because pdf2ps is just a script that calls Ghostscript.

> The older driver generated stuff like this
>
> %!PS-Adobe-3.0
> %%Pages: (atend)
> %%BoundingBox: 0 0 90 85
> %%HiResBoundingBox: 0.000000 0.000000 90.000000 85.000000
> %.............................................
> %%Creator: GPL Ghostscript 900 (pswrite)
> %%CreationDate: 2012/07/16 10:18:24
> %%DocumentData: Clean7Bit
> %%LanguageLevel: 2
> %%EndComments
>
> The newer driver:
>
> %!PS-Adobe-3.0
> %%BoundingBox: 0 0 612 792
> %%Creator: GPL Ghostscript 904 (ps2write)
> %%LanguageLevel: 2
> %%CreationDate: D:20120713171249+01'00'
> %%Pages: 1
> %%EndComments

We now use ps2write instead of pswrite because it produces better output
(smaller, less bitmapped content etc). In the long term pswrite will be
removed, but for now you can still use it if you really want to.


> The %%BoundingBox comment in the new output is incorrect,
> or at least deferred in a very unhelpful way; *this*
> occurs later on...

Well %%BoundingBox is a high water mark for *all* the pages, and since
that value is probably the media size, it's certainly legitimate, if
perhaps inaccurate. I'm not certain there is anything we can do about
this, as I think the header may be generated before we know all the page
sizes.
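
To make "high water mark" concrete, here is a small sketch, assuming it means the smallest box that encloses every page's box. The function name and the (llx, lly, urx, ury) tuple layout are illustrative only; this is not the Ghostscript code.

```python
def high_water_mark(page_boxes):
    """Return the smallest (llx, lly, urx, ury) box enclosing all page boxes."""
    boxes = list(page_boxes)
    if not boxes:
        raise ValueError("need at least one page box")
    # Lower-left corner: smallest coordinates seen on any page.
    llx = min(box[0] for box in boxes)
    lly = min(box[1] for box in boxes)
    # Upper-right corner: largest coordinates seen on any page.
    urx = max(box[2] for box in boxes)
    ury = max(box[3] for box in boxes)
    return (llx, lly, urx, ury)
```

For a single 198x198 page this is just (0, 0, 198, 198), which is why a 612x792 value for that file stands out.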

That said I'm willing to look at the problem if you raise a bug report
at http://bugs.ghostscript.com


Ken

ken

Jul 16, 2012, 6:24:29 AM
In article <a6i6p7...@mid.individual.net>, h.bli...@acm.org says...

> Ghostscript's ps2write device in essence outputs a linearized version of PDF
> prepended by a procset that permits an ordinary level2 interpreter to
> successfully render the stream.

That's only partially true these days, though it is still true up to a
point.


> That implies reducing the input PDF or PS to
> level2 compatible objects.

The conversion to level 2 PostScript is what requires us to use level 2
compatible objects, not the way the file is written.

If we ever do a ps3write then it will be able (for example) to do
shading dictionaries and CIDFonts. Currently these are converted to
images and type 3 fonts respectively.


> One (weird?) feature of this driver is the attempt to tailor the output to
> specific printer features such as paper size.

Err, the paper size isn't printer-specific. We take the MediaBox from
the PDF file and emit a PageSize media request in the PostScript. That's
not specific to the printer, it's specific to the PDF file. What the
printer does with the request is up to the printer: it may select media
from different trays, scale the file etc.
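
As an illustration of what such a request looks like, here is a sketch that turns a MediaBox into a PageSize request. The helper name is invented, and the exact text ps2write writes may well differ; this only shows the general shape of a setpagedevice call carrying /PageSize.

```python
def page_size_request(media_box):
    """Format a PostScript PageSize request for a (llx, lly, urx, ury) MediaBox."""
    llx, lly, urx, ury = media_box
    # PageSize is a width and a height in points, not a pair of corners.
    width, height = urx - llx, ury - lly
    return "<< /PageSize [%g %g] >> setpagedevice" % (width, height)
```

For the 198x198 file discussed here that yields "<< /PageSize [198 198] >> setpagedevice", with nothing in it that names a particular printer.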

I'm not sure what is weird about this, it makes sense to me.


> Therefore the global
> BoundingBox reflects Ghostscript's default page size or, if specified, the
> value from the "-sPAPERSIZE=..." commandline switch.

The document BoundingBox ought to be the MediaBox from the PDF file or,
as you correctly say, any overriding value such as the CropBox or
PAPERSIZE, if specified. That's because these override what's in the PDF
file.

You really shouldn't ever be seeing the GS default media size.


> I'd recommend modifying your script to grep through the generated PS stream
> for the PageBoundingBox comments and use those values if specified.

Yes I agree with this, but I'm willing to look into the document level
BoundingBox. I only ask for a bug to be entered so I have something to
track.



Ken

ken

Jul 16, 2012, 6:43:54 AM
In article <MPG.2a6dfc3bc...@usenet.plus.net>, k...@spamcop.net
says...

> Yes I agree with this, but I'm willing to look into the document level
> BoundingBox. I only ask for a bug to be entered so I have something to
> track.

In fact it looks like we probably *can* emit the intersection of all the
page bounding boxes as the document BoundingBox. It doesn't look too
hard, though it would be nice if someone would open a bug report,
otherwise I'll have to do it myself ;-)


Ken

Helge Blischke

Jul 16, 2012, 6:50:56 AM
Well, if I convert a PDF which contains no document level media box (which
is OK) and all of the pages specify a media box of A4 size, where does the
letter-size document level bounding box come from (the default paper
size of GS is letter)?
The gs version used is 9.05

Helge

ken

Jul 16, 2012, 6:58:16 AM
In article <a6ia0i...@mid.individual.net>, h.bli...@acm.org says...

> Well, if I convert a PDF which contains no document level media box (which
> is OK) and all of the pages specify a media box of A4 size, where does the
> letter-size document level bounding box come from (the default paper
> size of GS is letter)?

I don't have such a file, but I wouldn't have expected that. Anyway,
I've implemented the intersection code now. I should note, however, that
this is *not* a true BoundingBox, it's the media sizes for each page.

In order to find the true BoundingBox for the document (and for each
page) we would need to run the bbox device to determine the real
bounding box (and intersect it with the media request anyway, presumably
in case there are off-page objects)
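
For anyone who needs the real extent of the marks today, a sketch of driving the bbox device from a script. The function names are invented, and since the device is documented as reporting on stderr, the code scans both output streams to be safe. It returns one integer box per %%BoundingBox line found.

```python
import re
import subprocess

_BBOX = re.compile(
    r"^%%BoundingBox:\s*(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)", re.MULTILINE
)


def parse_bbox_output(text):
    """Extract (llx, lly, urx, ury) tuples from bbox device output."""
    # %%HiResBoundingBox lines do not start with "%%BoundingBox:" and are skipped.
    return [tuple(int(v) for v in m.groups()) for m in _BBOX.finditer(text)]


def measure(path, gs="gs"):
    """Run Ghostscript's bbox device on a file and return the boxes it reports."""
    result = subprocess.run(
        [gs, "-q", "-dNOPAUSE", "-dBATCH", "-sDEVICE=bbox", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_bbox_output(result.stdout + result.stderr)
```

Intersecting each result with the page's media box, as suggested above, is left out of this sketch.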



Ken

ken

Jul 16, 2012, 8:05:28 AM
In article <MPG.2a6e041eb...@usenet.plus.net>, k...@spamcop.net
says...


> I don't have such a file, but I wouldn't have expected that. Anyway,
> I've implemented the intersection code now. I should note, however, that
> this is *not* a true BoundingBox, it's the media sizes for each page.
>
> In order to find the true BoundingBox for the document (and for each
> page) we would need to run the bbox device to determine the real
> bounding box (and intersect it with the media request anyway, presumably
> in case there are off-page objects)

The fix (to use the media from all pages, not the device) was raised as
bug #693181:

http://bugs.ghostscript.com/show_bug.cgi?id=693181

and fixed with Git commit b49d3c75a70cbdcdb2214f22ad1a1f62f1bb90fc

http://git.ghostscript.com/?p=ghostpdl.git;a=commit;h=b49d3c75a70cbdcdb2214f22ad1a1f62f1bb90fc

The point about this not being a true BoundingBox still applies.

Ken