Thanks,
Alan Isaac
On 10/8/2009 11:44 PM, Jeffrey H. Coffield wrote:
> gs -sOutputFile=x.eps -sDEVICE=epswrite x.ps ?
I tried that before asking ...
Are you sure epswrite is being used?
I cannot get comparable output.
When I use GSview via menus, I get
my PS file unchanged except for the
encapsulation (e.g., header incl
bounding box). epswrite gives me
entirely different output.
Thanks,
Alan Isaac
AFAIK, GSView converts the pages to EPS files by textual analysis
guided by the document structuring comments - it does not call
ghostscript for this.
Helge
I don't use GSview as almost all of the PostScript programming I do is
for "batch" environments and I usually don't start with a PS file. The
few times I tried it, epswrite seemed to work for me.
Jeff Coffield
Helge> AFAIK, GSView converts the pages to EPS files by textual
Helge> analysis guided by the document structuring comments - it
Helge> does not call ghostscript for this.
How does it determine the bounding box, then?
--
Lee Sau Dan 李守敦
E-mail: dan...@informatik.uni-freiburg.de
Home page: http://www.informatik.uni-freiburg.de/~danlee
>>>>>> "Helge" == Helge Blischke <h.bli...@acm.org> writes:
>
> Helge> AFAIK, GSView converts the pages to EPS files by textual
> Helge> analysis guided by the document structuring comments - it
> Helge> does not call ghostscript for this.
>
> How does it determine the bounding box, then?
>
>
The bounding box is also specified by a DSC comment:
%%BoundingBox: 0 0 595 842
for example.
Helge
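A minimal sketch of reading that comment back out of a file, written in Python purely for illustration. The helper name find_bounding_box is hypothetical, and this is not a claim about how GSview itself parses the DSC comments. A "%%BoundingBox: (atend)" line is skipped, so a value deferred to the trailer is still picked up.

```python
import re

# Hypothetical helper: scan the text of a PostScript file for a numeric
# %%BoundingBox DSC comment. "(atend)" lines do not match and are skipped.
_BBOX = re.compile(r"^%%BoundingBox:\s*(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s*$")

def find_bounding_box(ps_text):
    """Return (llx, lly, urx, ury) from the first numeric %%BoundingBox, or None."""
    for line in ps_text.splitlines():
        m = _BBOX.match(line)
        if m:
            return tuple(int(g) for g in m.groups())
    return None

sample = "%!PS-Adobe-3.0\n%%BoundingBox: 0 0 595 842\n%%EndComments\nshowpage\n"
print(find_bounding_box(sample))  # (0, 0, 595, 842)
```

If the file carries no such comment, the function returns None, and the box has to come from somewhere else (for example by interpreting the file).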
Helge> AFAIK, GSView converts the pages to EPS files by textual
Helge> analysis guided by the document structuring comments - it
Helge> does not call ghostscript for this.
>>
>> How does it determine the bounding box, then?
>>
>>
Helge> The bounding box is also specified by a DSC comment:
Helge> %%BoundingBox: 0 0 595 842
Helge> for example.
What happens when the source PS doesn't have this in the first place?
Also, the 595x842 size is standard A4 size. I guess your pages don't
really occupy the whole A4, right? You may want REAL bounding boxes for
your drawings, i.e. the minimum bounding boxes without excessive
margins.
So what? What makes you think GSView is supposed to be a perfect program?
In REAL life, drawings are saved as EPS from drawing applications and
cropped in publishing / composition tools. GSView and similar are just
helper applications. As long as you can get a drawing to EPS format and
into an application (say, Word), you can use its cropping tools to define
the visible part of the drawing.
>> Also, the 595x842 size is standard A4 size. I guess your pages
>> don't really occupy the whole A4, right? You may want REAL
>> bounding boxes for your drawings, i.e. the minimum bounding
>> boxes without excessive margins.
Matti> So what? What makes you think GSView is supposed to be a
Matti> perfect program?
Matti> In REAL life, drawings are saved as EPS from drawing
Matti> applications and cropped in publishing / composition
Matti> tools. GSView and similar are just helper applications. As
Matti> long as you can get a drawing to EPS format and into an
Matti> application (say, Word), you can use its cropping tools to
Matti> define the visible part of the drawing.
Manual cropping?
Why not delegate such tedious and BORING tasks to computers? What are
computers invented for?
Try out my 'psrealbb' tool:
http://www.informatik.uni-freiburg.de/~danlee/fun/#postscript
Is it working with new Ghostscript? Old Ghostscripts couldn't
calculate the bbox correctly. See the discussion here about a year ago (and
*my* solution ;-).
Yours,
Ilya
>> Try out my 'psrealbb' tool:
>>
>> http://www.informatik.uni-freiburg.de/~danlee/fun/#postscript
Ilya> Is it working with new Ghostscript? Old Ghostscripts couldn't
Ilya> calculate bbox correctly. See the discussion here about a year
Ilya> ago (and *my* solution ;-).
Oh! I haven't checked. :P
I did notice that Ghostscript's bbox device got the bounding box wrong
when the operators drew outside the first quadrant (in device space).
The end result was as if the drawing was clipped to this quadrant. I
know some people used to use tricks such as injecting "999999 dup
translate" to work around this "feature". Are you talking about this
bug? Has it been fixed?
No. When calculating bbox, GS considers (considered?) any image as
having no white parts. Thus to calculate bbox of scanned
documents one must use other tools.
Yours,
Ilya
>> I did notice that Ghostscript's bbox drive got the bounding box
>> wrong when the operators drawed outside the first quadrant (in
>> device space). The end result was as if the drawing was clipped
>> to this quadrant. I know some people used to use tricks such as
>> injecting "999999 dup translate" to work around this "feature".
>> Are you talking about this bug?
Ilya> No. When calculating bbox, GS considers (considered?) any
Ilya> image as having no white parts. Thus to calculate bbox of
Ilya> scanned documents one must use other tools.
Why should the white margin of a raster image not be considered an integral
part of that image? That's a part of the "contents" of that image!
(And don't forget that they aren't always completely, purely white. It
could be something like ".95 .9 .98 setrgbcolor".)
BTW, does PS's graphics model ever specify that the paper (the
background of the drawing) be always pure white?
Who cares? IIRC, BBox is DEFINED as non-white part... See the old
discussion...
> (And don't forget that they aren't always completely, purely white. It
> could be something like ".95 .9 .98 setrgbcolor".)
Who cares? I consider the case when they are...
> BTW, does PS's graphics model ever specify that the paper (the
> background of the drawing) be always pure white?
This is an interesting question...
Yours,
Ilya
>>>>>> "Ilya" == Ilya Zakharevich <nospam...@ilyaz.org> writes:
>
> >> I did notice that Ghostscript's bbox drive got the bounding box
> >> wrong when the operators drawed outside the first quadrant (in
> >> device space). The end result was as if the drawing was clipped
> >> to this quadrant. I know some people used to use tricks such as
> >> injecting "999999 dup translate" to work around this "feature".
> >> Are you talking about this bug?
>
> Ilya> No. When calculating bbox, GS considers (considered?) any
> Ilya> image as having no white parts. Thus to calculate bbox of
> Ilya> scanned documents one must use other tools.
>
> Why should the white margin of a raster image not considered an integral
> part of that image? That's a part of the "contents" of that image!
> (And don't forget that they aren't always completely, purely white. It
> could be something like ".95 .9 .98 setrgbcolor".)
>
> BTW, does PS's graphics model ever specify that the paper (the
> background of the drawing) be always pure white?
>
>
IIRC, Adobe's DSC spec defines the bounding box as the smallest rectangular
area that contains all marks the respective PostScript program has created
(irrespective of the object's type).
Helge
Ilya> No. When calculating bbox, GS considers (considered?) any
Ilya> image as having no white parts. Thus to calculate bbox of
Ilya> scanned documents one must use other tools.
>>
>> Why should the white margin of a raster image not considered an
>> integral part of that image?
Ilya> Who cares? IIRC, BBox is DEFINED as non-white part... See
Ilya> the old discussion...
>> (And don't forget that they aren't always completely, purely
>> white. It could be something like ".95 .9 .98 setrgbcolor".)
Ilya> Who cares? I consider the case when they are...
when they are what?
You've never stacked a raster image with white edge on another graphics
that is non-white? You've never used background colours other than
white?
>> BTW, does PS's graphics model ever specify that the paper (the
>> background of the drawing) be always pure white?
Ilya> This is an interesting question...
If the paper is not pure white, then the white rim in the raster image
should be inside the bounding box. See?
> >> (And don't forget that they aren't always completely, purely
> >> white. It could be something like ".95 .9 .98 setrgbcolor".)
>
> Ilya> Who cares? I consider the case when they are...
>
> when they are what?
"completely, purely white"
> You've never stacked a raster image with white edge on another graphics
> that is non-white?
Why are you so interested in what I did and what I did not?
> >> BTW, does PS's graphics model ever specify that the paper (the
> >> background of the drawing) be always pure white?
>
> Ilya> This is an interesting question...
>
> If it paper is not pure white, then the white rim in the raster image
> should be inside the bounding box. See?
As I said: "This is an interesting question...".
However, I may repeat AGAIN that the last time this topic was
discussed, people claimed that GS's bbox semantics are "non-white".
Yours,
Ilya
Ilya> No. When calculating bbox, GS considers (considered?) any
Ilya> image as having no white parts. Thus to calculate bbox of
Ilya> scanned documents one must use other tools.
>>
>> Why should the white margin of a raster image not considered an
>> integral part of that image? That's a part of the "contents" of
>> that image! (And don't forget that they aren't always
>> completely, purely white. It could be something like ".95 .9 .98
>> setrgbcolor".)
>>
>> BTW, does PS's graphics model ever specify that the paper (the
>> background of the drawing) be always pure white?
Helge> IIRC, Adobe's DSC spec defines the bounding box as the
Helge> smallest rectangular area that contains all marks the
Helge> respective PostScript program has created (irrespective of
Helge> the object's type).
So, are white pixels of a raster image regarded as marks?
>>>>>> "Helge" == Helge Blischke <h.bli...@acm.org> writes:
>
> Ilya> No. When calculating bbox, GS considers (considered?) any
> Ilya> image as having no white parts. Thus to calculate bbox of
> Ilya> scanned documents one must use other tools.
> >>
> >> Why should the white margin of a raster image not considered an
> >> integral part of that image? That's a part of the "contents" of
> >> that image! (And don't forget that they aren't always
> >> completely, purely white. It could be something like ".95 .9 .98
> >> setrgbcolor".)
> >>
> >> BTW, does PS's graphics model ever specify that the paper (the
> >> background of the drawing) be always pure white?
>
> Helge> IIRC, Adobe's DSC spec defines the bounding box as the
> Helge> smallest rectangular area that contains all marks the
> Helge> respective PostScript program has created (irrespective of
> Helge> the object's type).
>
> So, are white pixels of a raster image regarded as marks?
>
>
That's an interesting question. The DSC 3.0 spec states for the
%%BoundingBox: comment:
This comment specifies the bounding box that encloses all marks painted
on all pages of a document. That is, it must be a “high water mark” in all
directions for marks made on any page. The four arguments correspond to
the lower left (llx, lly) and upper right corners (urx, ury) of the bounding
box in the default user coordinate system (PostScript units).
It makes no assumptions on the color.
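The "high water mark" wording amounts to taking the union of the per-page boxes. A minimal Python sketch, assuming each page's box is already known as (llx, lly, urx, ury); the function name union_bbox and the page values are made up for illustration:

```python
def union_bbox(page_boxes):
    """Smallest box enclosing every per-page box (llx, lly, urx, ury)."""
    boxes = list(page_boxes)
    if not boxes:
        return None
    return (
        min(b[0] for b in boxes),  # leftmost mark on any page
        min(b[1] for b in boxes),  # lowest mark on any page
        max(b[2] for b in boxes),  # rightmost mark on any page
        max(b[3] for b in boxes),  # highest mark on any page
    )

pages = [(72, 100, 500, 700), (50, 120, 480, 760)]
print(union_bbox(pages))  # (50, 100, 500, 760)
```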
Helge
> This comment specifies the bounding box that encloses all marks painted
> on all pages of a document. That is, it must be a “high water mark” in all
> directions for marks made on any page. The four arguments correspond to
> the lower left (llx, lly) and upper right corners (urx, ury) of the bounding
> box in the default user coordinate system (PostScript units).
>
> It makes no assumptions on the color.
There can be good reasons for painting white as a colour, especially
when printing T-shirts or other clothes, and a number of other unusual
items.
As has been pointed out in other posts, an image sample which is white
and overlays another coloured object will overwrite the coloured pixels
with white, so the white samples have definitely made a mark.
The opaque imaging model of PostScript means that interpreters do not
(in general) have any idea whether a destination pixel has been altered
in any way prior to any current operation.
From my point of view every pixel in an image makes a mark. The mark
might end up the same colour as the medium, but it still makes a mark.
Ken
This all is very interesting; but irrelevant.
What I was discussing was the bbox device of GS. It is documented to
work in terms of WHITE. And it still does not (at least with 8.70).
For details, see
http://groups.google.com/group/comp.lang.postscript/browse_thread/thread/a1c8a6277c4642ce/4b52f2e615835561#4b52f2e615835561
Hope this helps,
Ilya
ken> From my point of view every pixel in an image makes a mark. The
ken> mark might end up the same colour as the medium, but it still
ken> makes a mark.
I can't agree more.
colour==white is not equivalent to alpha=0.0. In PS's model, we always
have alpha==1.0. So, the white pixels are opaque marks (that do not
contrast at all against a white background).
Indeed, I've come across PS (and sometimes PDF) files where the creators
intentionally put some "invisible" marks under the main text by drawing
those marks with "1 setgray". But those marks are revealed when I read
them on the screen, because I set the background colours of my PDF/PS
views to 70% gray (so as not to irritate the eyes). ;)
Where is it documented 'to work in terms of white'? It says that 'white
objects' don't contribute, but images are not white objects; they
contain coloured samples as well as white ones. The fact that part of
the object is white does not make it a white object.
It could perhaps be better stated that white *vector* objects don't
contribute, but this is reasonably clear from the documentation:
"By default, white objects don't contribute to the bounding box because
many files fill the whole page with white before drawing other objects."
It's very easy to detect a white fill. Examining every sample of an
image, then applying the CTM to determine whether it lies inside/outside
the current recorded high water marks, is expensive (in terms of time).
The old EPS conversion script used to do something akin to what you seem
to want. It rendered the job at 72 dpi (to get a preview), then counted
white/non-white pixels in the raster. Because it's rendered at 72 dpi, one
pixel is one PostScript unit, so the conversion is trivial.
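A rough Python sketch of that render-and-count idea, not the old script itself. It assumes the page has already been rendered to an 8-bit grayscale raster whose first row is the top scanline, and that 255 means white; the name raster_bbox is hypothetical.

```python
def raster_bbox(rows, white=255):
    """Bounding box of non-white pixels in a grayscale raster.

    rows[0] is assumed to be the TOP scanline. The result uses PostScript
    orientation (origin at bottom left); at 72 dpi one pixel is one unit.
    """
    height = len(rows)
    xs, ys = [], []
    for r, row in enumerate(rows):
        for c, v in enumerate(row):
            if v != white:
                xs.append(c)
                ys.append(height - 1 - r)  # flip so that y grows upward
    if not xs:
        return None  # nothing but white on the page
    # +1 on the upper edges: a pixel covers [x, x+1] x [y, y+1]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)

page = [
    [255, 255, 255, 255],
    [255,   0,   0, 255],
    [255, 255,   0, 255],
    [255, 255, 255, 255],
]
print(raster_bbox(page))  # (1, 1, 3, 3)
```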
Of course, this has other problems. Text (especially glyphs like 'P')
is rendered differently at low resolution: the outer edge of the curve
is flattened. If you rely on these to generate a bounding box at low
resolution and then scale up, the curve can end up being cropped at
higher resolution, where it extends further.
This case is in fact the reason that the current ps2eps script doesn't
do it that way any more, but instead uses the bbox device.
> For details, see
> http://groups.google.com/group/comp.lang.postscript/browse_thread/thread/a1c8a6277c4642ce/4b52f2e615835561#4b52f2e615835561
I read the posts at the time. Your argument is the same as the
discussion here: white pixels *do* make a mark, an image is an object,
and the fact that part of it is white does not make it a white object.
Not clear on why you think this is different to your case. It may not
work the way you want it to, but as far as I can see it works the way
it's supposed to, so it's not wrong.
Ken
However, it does determine a bounding box somehow. (How?)
And to return to the original question, I'd like to use
the GSView conversion as a batch program. Is it possible?
Thanks,
Alan Isaac
>> For details, see
>> http://groups.google.com/group/comp.lang.postscript/browse_thread/thread/a1c8a6277c4642ce/4b52f2e615835561#4b52f2e615835561
> I read the posts at the time [...]
> Not clear on why you think this is different to your case. It may not
> work the way you want it to, but as far as I can seee it works the way
> its supposed to, so its not wrong.
It would have sped things up a lot if you had voiced your opinion at
that time, instead of just "reading"... As you saw, at a minimum, a
lot of people thought that it does not work as it's supposed to...
>> What I was discussing was the bbox device of GS. It is documented to
>> work in terms of WHITE. And it still does not (at least with 8.70).
> Where is it documented 'to work in terms of white' ?
Devices.html: white objects don't contribute to the bounding box
> It says that 'white objects' don't contribute, but images are not
> white objects,
Are white images "white objects"?
> they contain coloured samples as well as white ones.
Does `100 100 scale 100 100 8 [100 0 0 -100 0 100] <FF> image' contain
coloured samples?
> It could perhaps be better stated that white *vector* objects don't
> contribute, but this is reasonably clear from the documentation:
> "By default, white objects don't contribute to the bounding box because
> many files fill the whole page with white before drawing other objects."
It is not clear how "mentioning what many PS files do" would make
anything clear about "how GS works".
> Its very easy to detect a white fill, examining every sample of an
> image, then applying the CTM to determine whether it lies inside/outside
> the current recorded high water marks is expensive (it terms of time).
Why would one need to apply inside/outside logic to white objects?!
Moreover, I do not see how checking which lines and columns of images
are white can be expensive... After you know this, the price is the
same as for 4 vertical/horizontal lines of width 0.
[Of course, in my application - and all other applications of bbox
device I have seen - I would need GS to work as documented
w.r.t. partially white images too. This needs scanning for white
rows and columns...]
Thanks for giving your voice on this,
Ilya
> It would have speed things a lot if you would voice your opinion at
> that time, instead of just "reading"... As you saw, as a minimum, a
> lot of people thinked that it does not work as it's supposed to to...
I felt the replies at the time made it reasonably plain that it was
working as expected.
> > Where is it documented 'to work in terms of white' ?
>
> Devices.html: white objects don't contribute to the bounding box
Yes, that does not say that it 'works in terms of white'.
> > It says that 'white objects' don't contribute, but images are not
> > white objects,
>
> Are white images "white objects"?
No.
> > they contain coloured samples as well as white ones.
>
> Does `100 100 scale 100 100 8 [100 0 0 -100 0 100] <FF> image' contain
> coloured samples?
Yes: it contains the image sample FF, which does *NOT* directly denote a
colour. Depending on the transfer function applied this may be pure
white, pure black or, in the case of an Indexed colour space, any colour
at all.
So first you see you must apply the colour space and transfer function,
only then can you even decide whether an image sample is white or not.
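To illustrate why the raw byte alone settles nothing, here is a small Python sketch. The function names are hypothetical. It assumes the usual linear mapping of a sample onto a Decode range (as used by the dictionary form of image, not the five-operand form quoted above), a transfer function applied to the gray value, and an Indexed palette given as a plain lookup table.

```python
def decode_sample(sample, bits=8, decode=(0.0, 1.0)):
    """Map a raw sample linearly onto the Decode range (gray: 1.0 = white)."""
    dmin, dmax = decode
    return dmin + sample * (dmax - dmin) / (2 ** bits - 1)

def is_white_gray(sample, decode=(0.0, 1.0), transfer=lambda g: g):
    """True if the sample ends up as gray level 1.0 after decode + transfer."""
    return transfer(decode_sample(sample, decode=decode)) == 1.0

def is_white_indexed(sample, palette):
    """True if the palette entry the sample selects is pure RGB white."""
    return palette[sample] == (1.0, 1.0, 1.0)

print(is_white_gray(0xFF))                               # True
print(is_white_gray(0xFF, decode=(1.0, 0.0)))            # False
print(is_white_gray(0xFF, transfer=lambda g: 1 - g))     # False
print(is_white_indexed(0xFF, {0xFF: (1.0, 0.0, 0.0)}))   # False
```

The same byte FF is white in the first call and not white in the other three.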
Then we need to consider the clipping path, which may be a rectangle or
may be much more complex, eg a piece of text. Obviously if all the white
areas of the image lie outside the clip path then they don't count
towards removing marks and shrinking the bounding box.
> > "By default, white objects don't contribute to the bounding box because
> > many files fill the whole page with white before drawing other objects."
>
> It is not clear how "mentioning what many PS files do" would make
> anything clear about "how GS works".
Why not ? It says why GS does not consider white objects to contribute,
that's how Ghostscript works, the explanation is why GS works that way.
> > Its very easy to detect a white fill, examining every sample of an
> > image, then applying the CTM to determine whether it lies inside/outside
> > the current recorded high water marks is expensive (it terms of time).
>
> Why would one need to apply inside/outside logic to white objects?!
This is not path insideness, (but see above regarding clip paths) it
determines whether an image sample lies outside a rectangular path
bounding the current highest/lowest marks made on the page.
By the way, if a page made black marks at the extremities, then overwrote
them with a white image, what would you like to see happen ? If you want
the bounding box reduced, then we would need to know where the new
top/bottom left/right marks are on the page, and that depends on exactly
where the image lies.
What if the white image is PDF optional content, so that when it's
present it obscures some of the output?
> Moreover, I do not see how checking which lines and columns of images
> are white can be expensive...
I don't see how you are simply checking lines or columns. Each image
sample must be converted to device space. Bear in mind that images can
be skewed and/or rotated.
For example, if the image is rotated 45 degrees and the bottom left
sample is white, but neither of its neighbours is, then it must not
contribute, no matter what the other samples in its row and column are.
So we must consider the colour space, the transfer function, the image
matrix and the CTM (and thus every image sample must be converted to
device space) and the clip path. Quite a bit more than 'rows and
columns'....
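A simplified Python sketch of the per-sample work described here, under loud assumptions: the single matrix to_device stands in for the combined image matrix and CTM, white is the raw value 255, and colour space, transfer function and clip path are ignored entirely. The names apply and nonwhite_bbox are made up for illustration; this is not Ghostscript's code.

```python
import math

def apply(m, x, y):
    """Apply a PostScript-style matrix [a b c d tx ty] to the point (x, y)."""
    a, b, c, d, tx, ty = m
    return (a * x + c * y + tx, b * x + d * y + ty)

def nonwhite_bbox(samples, to_device, white=255):
    """Device-space bbox of the non-white samples.

    samples[r][c] covers the unit square [c, c+1] x [r, r+1] in image space;
    to_device is assumed to map image space straight to device space.
    """
    pts = []
    for r, row in enumerate(samples):
        for c, v in enumerate(row):
            if v != white:
                # every corner of the sample's square must be transformed
                for dx, dy in ((0, 0), (1, 0), (0, 1), (1, 1)):
                    pts.append(apply(to_device, c + dx, r + dy))
    if not pts:
        return None
    xs, ys = zip(*pts)
    return (min(xs), min(ys), max(xs), max(ys))

k = math.cos(math.radians(45))
rot45 = [k, k, -k, k, 0.0, 0.0]   # rotate 45 degrees counter-clockwise
img = [[255, 0],
       [0, 255]]                  # white corner sample at row 0, column 0
print(nonwhite_bbox(img, rot45))
```

After the rotation the white corner sample would reach down to y = 0, but the box starts at y = k (about 0.707) because only its two non-white neighbours contribute. No whole row or column of this image is white, so a rows-and-columns scan would not have trimmed anything.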
> After you know this, the price is the
> same as for 4 vertical/horizontal lines of width 0.
>
> [Of course, in my application - and all other applications of bbox
> device I have seen - I would need that GS worksa as documented
> w.r.t. partially white images too. This needs scanning for white
> rows and columns...]
I believe it means much more than scanning for rows and columns of white
in images. If you absolutely require this then you must render the job
at reasonably high resolution and then apply your rows and columns rule
to the rendered output.
Ken
> In PS's model, we always have alpha==1.0.
Almost the contrary I would say :) : We have alpha==1 only for marks
(including white marks) and alpha==0 for rest (unpainted areas, which
are vastly -infinitely- bigger than the painted one). From that
perspective, we almost always have alpha==0, except when it's 1.0...
So one plausible definition of the bounding box (and I believe the one
taken in DSC) is the smallest rectangle that contains points with
alpha != 0 (note that I did not write 'with alpha==1', so it could
provide a reasonable bbox definition applicable to PDF with
transparency)
For completeness, I will note that a plain L1 or L2 raster ('image'
operator) leaves an opaque parallelogram mark (possibly clipped) in
device space. L3 masked images (or L1 imagemask) make things somewhat
harder to compute because each pixel carries its own alpha (1.0 or 0.0).
I will also note that things become yet more confused when one
considers color separation (in-RIP or otherwise): Depending on the
colorspace in effect and/or overprint flag, opaqueness may not affect
all 'inks' equally.
> So, the white pixels are opaque marks (that do not
> contrast at all against a white background).
Indeed.