PDF to PDF (gs?): rich RGB black to plain K (CMYK) black?

sdaau

Jun 6, 2011, 1:49:08 AM

Hi all,

I basically have the problem with print of some slides from
OpenOffice. The problem is that OpenOffice exports the PDF of the
slides as an RGB PDF, where the text color is R:0, G:0, B:0 - and
usually when I send that to the printer, they complain that what
should be plain black extends into all four (CMYK) channels, and so I
have to pay more for the ink.

So the problem is - how would I convert a RGB PDF with R:0, G:0, B:0
into a CMYK pdf where the same color is plain black (C:0, M:0, Y:0, K:
100)? I posted a similar question on

http://stackoverflow.com/questions/6241282/converting-pdf-to-cmyk-with-identify-recognizing-cmyk

... although that question is more LaTeX oriented. So here, I'll try
to provide my OpenOffice test case:

* Open OpenOffice Impress, use Empty Presentation, click Create
* Add some text for 'title' and 'text'
* Click File/Export as PDF; call this PDF blah-slide.pdf

At this point, close and reopen OpenOffice, for yet another slide
pdf:
* Open OpenOffice Impress, use Empty Presentation, click Create
* Add some text for 'title' and 'text'
* Click Insert/Picture/From File... and insert whatever PNG image
** I used `convert -size 10x10 xc:red img.png` to generate a PNG image
to insert
* Click File/Export as PDF; call this PDF blah-slideP.pdf

At this point, we can run ImageMagick's `identify` on both PDFs, and
we'll get:

$ identify -verbose blah-slide.pdf | grep -i 'type\|color'
Type: Grayscale
Base type: Grayscale
Colorspace: RGB
Background color: white
Border color: rgb(223,223,223)
Matte color: grey74
Transparent color: black

$ identify -verbose blah-slideP.pdf | grep -i 'type\|color'
Type: TrueColor
Colorspace: RGB
Background color: white
Border color: rgb(223,223,223)
Matte color: grey74
Transparent color: black

Now, I'm aware that `identify` in principle works on raster images,
but I cannot find any other application that will provide similar
color information for PDFs (any other suggestions?)

Furthermore, the only check I have for CMYK separations for now (any
other suggestions?), is to use the `tiffsep` device of GhostScript:

$ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%08d.tif blah-slide.pdf && eog p00000001.tif

(or)

$ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%08d.tif blah-slideP.pdf && eog p00000001.tif

Of course, both of these show that the black color of the text is
'rich' black - on all four CMYK plates - instead of a plain 'black',
just in the K channel...
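
Since this separation check gets repeated many times in the thread, it can help to wrap it. What follows is only a minimal sketch in Python, assuming `gs` is on the PATH; the helper names `tiffsep_command` and `render_separations` are made up for illustration, and only flags already used above appear in it. Passing the arguments as a list also sidesteps the line-wrapping problems that plague pasted command lines.

```python
import subprocess


def tiffsep_command(source, page=1, pattern="p%08d.tif"):
    """Build the gs argument list that renders one page through tiffsep."""
    return [
        "gs",
        "-sDEVICE=tiffsep",
        "-dNOPAUSE", "-dBATCH", "-dSAFER",
        f"-dFirstPage={page}",
        f"-dLastPage={page}",
        f"-sOutputFile={pattern}",
        source,
    ]


def render_separations(source, page=1):
    """Run gs on one page; raises CalledProcessError if gs reports failure."""
    subprocess.run(tiffsep_command(source, page), check=True)
```

Calling `render_separations("blah-slide.pdf")` would be the equivalent of the first command above, minus the `eog` viewer step.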

//////

So, now I finally try the command line I found in
http://www.productionmonkeys.net/guides/ghostscript/examples for
converting, as it says, "Color PDF to CMYK" - for both of these PDFs
(without and with an embedded image):

$ gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=pdfwrite \
  -sColorConversionStrategy=CMYK -dProcessColorModel=/DeviceCMYK \
  -sOutputFile=blah-slide-out.pdf blah-slide.pdf

$ gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=pdfwrite \
  -sColorConversionStrategy=CMYK -dProcessColorModel=/DeviceCMYK \
  -sOutputFile=blah-slideP-out.pdf blah-slideP.pdf

.. And here is now the interesting thing - if I try to run `identify`
again - *only* the pdf containing an image is the one recognized as
CMYK:

$ identify -verbose blah-slide-out.pdf | grep -i 'type\|color'
Type: Palette
Colorspace: RGB
Background color: white
Border color: rgb(223,223,223)
Matte color: grey74
Transparent color: black

$ identify -verbose blah-slideP-out.pdf | grep -i 'type\|color'
Type: ColorSeparation
Base type: ColorSeparation
Colorspace: CMYK
Background color: white
Border color: cmyk(223,223,223,0)
Matte color: grey74
Transparent color: black

However, regardless of how they are reported, if I try to view their
separations:

$ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%08d.tif blah-slide-out.pdf && eog p00000001.tif

$ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%08d.tif blah-slideP-out.pdf && eog p00000001.tif

... I can still see that both of these PDFs still feature the text in
rich black, in all four color separations.

So, I guess my questions can be summed up as:

* How can I convert a rich black text color in an RGB pdf - into a
plain black text color in a CMYK pdf?
* Why do I need an image in the slide, so that `identify` recognizes
the "converted" CMYK pdf as being really CMYK?

(* Are there any other alternative free tools for: conversion of RGB
to CMYK pdf; and: checking the print separations of any PDF?)

As a final note: I guess this kind of thing may have something to do
(and be achievable) with ICC profiles, which unfortunately I don't
understand very much - and I've had a lot of problems finding example
command lines; so if there is such a solution, an example command line
will be much appreciated.

Thanks in advance for any responses,
Cheers!

Matti Vuori

Jun 6, 2011, 4:41:21 AM

sdaau <s...@imi.aau.dk> wrote in
news:208ecf3d-3629-4672...@p13g2000yqh.googlegroups.com:

> I basically have the problem with print of some slides from
> OpenOffice. The problem is that OpenOffice exports the PDF of the
> slides as an RGB PDF, where the text color is R:0, G:0, B:0 - and
> usually when I send that to the printer, they complain that what
> should be plain black extends into all four (CMYK) channels, and so I
> have to pay more for the ink.
>
> So the problem is - how would I convert a RGB PDF with R:0, G:0, B:0
> into a CMYK pdf where the same color is plain black (C:0, M:0, Y:0, K:
> 100)?

I don't know, but the way I see it, the real problem here is your
incompetent printer, who should be able to do it as a matter of routine.

Helge Blischke

Jun 6, 2011, 5:26:43 AM

sdaau wrote:

I did the following with PDFs generated by both LibreOffice and LaTeX:
pdftops source.pdf - | grep -i cs | grep Device
and the result was in both cases like

/DeviceGray {} cs
/DeviceGray {} CS
/DeviceGray {} cs
/DeviceGray {} CS
/DeviceGray {} cs
/DeviceGray {} CS
/DeviceGray {} cs
/DeviceGray {} CS

As the pdftops utility (from the xpdf suite) preserves the PDF color spaces,
this means that - at least the text - is *not* "rich black".

I rather suspect that your print provider uses some unusual color conversion
in his workflow.

Helge

sdaau

Jun 6, 2011, 7:29:45 AM

Hi all,

Thanks a lot for the prompt answers!

On Jun 6, 10:41 am, Matti Vuori <xmvu...@kolumbus.fi> wrote:
>
> [snip]

>
> > So the problem is - how would I convert a RGB PDF with R:0, G:0, B:0
> > into a CMYK pdf where the same color is plain black (C:0, M:0, Y:0, K:
> > 100)?
>

> I don't know, but the way I see it, the real problem here is your
> incompetent printer, who should be able to do it as a matter of routine.

Hehe :) It could well be - then again, most of these guys I worked
with (and I work with print shops on and off) simply invest a lot of
money in equipment; and when something like this comes up, their usual
response is: "just drop your file through the distiller once more",
and it gets very difficult to explain that I don't use "the
distiller" :) So I'd rather know how to give them files they won't
complain about :)

On Jun 6, 11:26 am, Helge Blischke <h.blisc...@acm.org> wrote:
> > [snip]

>
> I did the following with PDFs generated by both LibreOffice and LaTeX:
> pdftops source.pdf - | grep -i cs | grep Device
> and the result was in both cases like
>
> /DeviceGray {} cs
> /DeviceGray {} CS
> /DeviceGray {} cs
> /DeviceGray {} CS
> /DeviceGray {} cs
> /DeviceGray {} CS
> /DeviceGray {} cs
> /DeviceGray {} CS
>
> As the pdftops utility (from the xpdf suite) preserves the PDF color spaces,
> this means that - at least the text - is *not* "rich black".
>
> I rather suspect that your print provider uses some unusual color conversion
> in his workflow.
>

It could be - but then, I'm still having the same problem, even with
pdftops:

$ pdftops blah-slide.pdf blah-slide.ps
$ grep -A 1 Device blah-slide.ps
/DeviceGray {} cs
[0] sc
/DeviceGray {} CS
[0] SC
--
/DeviceRGB {} cs
[1 1 1] sc
--
/DeviceRGB {} cs
[0.2353 0.2353 0.2353] sc
--
/DeviceRGB {} cs
[0.2353 0.2353 0.2353] sc
--
/DeviceRGB {} cs
[0.2353 0.2353 0.2353] sc
--
/DeviceRGB {} cs
[0.2353 0.2353 0.2353] sc

# final check for separations

$ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%08d.tif blah-slide.ps && eog p00000001.tif

All the tiff separations show again text on all (CMYK) channels; and
seemingly, at least the background white color seems to be treated as
RGB.
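
For a quick overview of which colour spaces a pdftops file selects, the grep above can be turned into a tally. This is just a sketch in Python that assumes the `/DeviceXXX {} cs` line shape shown in the grep output; the function name `tally_color_spaces` is mine, and the sample text is copied from the output above.

```python
import re
from collections import Counter

# Matches lines such as "/DeviceRGB {} cs" or "/DeviceGray {} CS".
SPACE_RE = re.compile(r"^/(Device\w+) \{\} (cs|CS)$")


def tally_color_spaces(ps_text):
    """Count how often each device colour space is selected."""
    counts = Counter()
    for line in ps_text.splitlines():
        match = SPACE_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


sample = """/DeviceGray {} cs
[0] sc
/DeviceGray {} CS
[0] SC
/DeviceRGB {} cs
[1 1 1] sc
/DeviceRGB {} cs
[0.2353 0.2353 0.2353] sc
"""
print(tally_color_spaces(sample))
```

Any DeviceRGB entry in the tally means some colour is still specified in RGB, whatever the gray text looked like at first glance.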

I also tried converting the PDF to grayscale first:

* as per: http://handyfloss.net/2008.09/making-a-pdf-grayscale-with-ghostscript/

$ gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=pdfwrite \
  -sProcessColorModel=DeviceGray -sColorConversionStrategy=Gray \
  -dCompatibilityLevel=1.4 -sOutputFile=blah-slide-gray.pdf blah-slide.pdf

* as per: color PDF -> Grayscale PDF - Ubuntu Forums -
http://ubuntuforums.org/showthread.php?t=379013

$ pdf2ps -sDEVICE=psgray blah-slide.pdf blah-slide-gray.ps

..., and then back to CMYK:

$ gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=pdfwrite \
  -sProcessColorModel=DeviceCMYK -sColorConversionStrategy=CMYK \
  -dCompatibilityLevel=1.4 -sOutputFile=blah-slide-gray-out.pdf blah-slide-gray.pdf

$ gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=pdfwrite \
  -sProcessColorModel=DeviceCMYK -sColorConversionStrategy=CMYK \
  -dCompatibilityLevel=1.4 -sOutputFile=blah-slide-gray-ps-out.pdf blah-slide-gray.ps

... and if I check tiff separations again:

$ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%08d.tif blah-slide-gray-out.pdf && eog p00000001.tif

$ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%08d.tif blah-slide-gray-ps-out.pdf && eog p00000001.tif

... again the text black shows on all four separation tiffs :(

At this point, I'm wondering if the gs `tiffsep` is an appropriate
method for previewing separations at all (though, if there are images
present, it seems to parse their CMYK separations OK)... But, it
seems, there is still no reliable method to get (originally RGB) black
color to show only in K channel?

Well, any further pointers on this will be much appreciated :)

Thanks,
Cheers!

ken

Jun 6, 2011, 9:05:34 AM

In article <208ecf3d-3629-4672-9a3e-
135e44...@p13g2000yqh.googlegroups.com>, s...@imi.aau.dk says...

RGB->CMYK conversion often results in a mixture of CMY as well as black.
OpenOffice being a display-oriented application (like Microsoft Office)
probably only sets colours in RGB.

One thing you could try is printing to a PostScript file and converting
that into PDF as a separate step. You haven't said which OS you are
using, though I'm assuming some flavour of Linux. However, it's often the
case that PostScript printer drivers understand about CMYK and will
convert RGB into sensible colours.

> However, regardless of how they are reported, if I try to view their
> separations:
>
> $ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
>   -dLastPage=1 -sOutputFile=p%08d.tif blah-slide-out.pdf && eog p00000001.tif
>
> $ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
>   -dLastPage=1 -sOutputFile=p%08d.tif blah-slideP-out.pdf && eog p00000001.tif
>
> ... I can still see that both of these PDFs still feature the text in
> rich black, in all four color separations.

You are still converting the RGB into CMYK, if the
undercolorremoval/blackgeneration doesn't convert equal values of RGB
into CMYK, then you get a CMY output. It doesn't really matter which PDF
interpreter does this.

> So, I guess my questions can be summed up as:
>
> * How can I convert a rich black text color in an RGB pdf - into a
> plain black text color in a CMYK pdf?

There are a number of things you could try, but I would suggest either
printing to PostScript, and then using GS or sending the PDF file to
Ghostscript. Because GS is a PostScript interpreter, there are things
which can be done to colours.

It is possible to redefine the setcolor and setrgbcolor operators so
that they convert equal amounts of RGB into a colour specification in
DeviceGray instead (DeviceGray will convert to pure black in a CMYK
workflow).

It's also possible to set up an under colour removal function which
significantly affects how RGB is converted to CMYK (this is covered in
the PostScript Language Reference Manual).

If you can post a (small!) example file, preferably a single page, to
some publicly accessible URL I could take a look.

Ken

sdaau

Jun 6, 2011, 11:34:12 AM

Hi Ken,

Thanks for the response!

>
> > I basically have the problem with print of some slides from
> > OpenOffice. The problem is that OpenOffice exports the PDF of the
> > slides as an RGB PDF, where the text color is R:0, G:0, B:0 - and
> > usually when I send that to the printer, they complain that what
> > should be plain black extends into all four (CMYK) channels, and so I
> > have to pay more for the ink.
>
> RGB->CMYK conversion often results in a mixture of CMY as well as black.
> OpenOffice being a display-oriented application (like Microsoft Office)
> probably only sets colours in RGB.
>

Yeah, that was my suspicion too - thanks for confirming!

> One thing you could try is printing to a PostScript file and converting
> that into PDF as a separate step. You haven't said which OS you are
> using, though I'm assuming some flavour of Linux. However, it's often the
> case that PostScript printer drivers understand about CMYK and will
> convert RGB into sensible colours.
>

Yup, it's Ubuntu Linux I'm using - and yes, I'm seeing the advice
about PostScript as an intermediate step more and more, as in:

How to convert pdf to monochrome?... - http://www.groupsrv.com/computers/about669835.html
>> Print the original .pdf to PostScript in a file, edit the PostScript,
>> distill the edited PostScript back to a new .pdf. For the editing:
>> save [get then put/def] the existing PostScript definition (probably
>> builtin) of any operator that sets color (setcolor, setcmycolor,
>> setrgbcolor, setgray, ...), then install a new definition that does
>> whatever you want based on the actual arguments and the saved
>> original definitions.

... unfortunately, I do not understand the postscript language enough
to understand this advice :)

However, I've made some progress by manually hacking a postscript
file, which I'm hoping to post about next...

> > However, [snip]

> > ... I can still see that both of these PDFs still feature the text in
> > rich black, in all four color separations.
>
> You are still converting the RGB into CMYK, if the
> undercolorremoval/blackgeneration doesn't convert equal values of RGB
> into CMYK, then you get a CMY output. It doesn't really matter which PDF
> interpreter does this.
>

Yes - but I was hoping, that if I 'properly' use color profiles
(whatever 'properly' is), I could sort of have this conversion go "the
right way" in this case: i.e. if it encounters R = G = B (grayscale);
then treat it as K:100 CMY:0...

I found:
http://git.ghostscript.com/?p=ghostpdl.git;a=blob_plain;f=gs/doc/GS9_Color_Management.pdf;hb=acdf790792b31d1581a4ae6eb8926128f4876214

and there it talks of DefaultRGB/CMYK/GrayProfile - and additionally -
sOutputICCProfile (and others); so I came to 'monster' command lines
like these:

gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=pdfwrite \
-sICCProfilesDir=/usr/share/ghostscript/9.02/iccprofiles/ -dUseCIEColor \
-sDefaultGrayProfile=default_gray.icc \
-sDefaultRGBProfile=default_rgb.icc -sProcessColorModel=DeviceGray \
-sColorConversionStrategy=Gray -sOutputICCProfile=default_cmyk.icc \
-dCompatibilityLevel=1.4 -sOutputFile=blah-test.pdf blah-slide.pdf &&
gs \
-sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 \
-sOutputFile=p%08d.tif blah-test.pdf && eog p00000001.tif

... just to see if some combo would work, but unfortunately not :)

> > So, I guess my questions can be summed up as:
>
> > * How can I convert a rich black text color in an RGB pdf - into a
> > plain black text color in a CMYK pdf?
>
> There are a number of things you could try, but I would suggest either
> printing to PostScript,

Ah - you actually meant something like choosing a .ps output (or
"printing" to a .ps file) directly from OpenOffice? Yeah, that sounds
like it should save a processing step...

> and then using GS or sending the PDF file to
> Ghostscript. Because GS is a PostScript interpreter, there are things
> which can be done to colours.
>
> It is possible to redefine the setcolor and setrgbcolor operators so
> that they convert equal amounts of RGB into a colour specification in
> DeviceGray instead (DeviceGray will convert to pure black in a CMYK
> workflow).
>
> It's also possible to set up an under colour removal function which
> significantly affects how RGB is converted to CMYK (this is covered in
> the PostScript Language Reference Manual).
>

Thanks for noting this - I was somewhat aware that postscript language
can also "process", but I am completely ignorant about its scope. Re:
the DeviceGray CMYK workflow, I was intuitively trying to follow that
in the above "monster" cmdline too (as in: force all colors to
grayscale during conversion [in conversion colorspace], and write out
CMYK values based on these grayscale ones, which hopefully end up only
on the K plate) -- but I couldn't get `tiffsep` to confirm that.

> If you can post a (small!) example file, preferably a single page, to
> some publicly accessible URL I could take a look.
>

Sure, here are the pdf's of the slides mentioned in the OP:

http://sdaaubckp.sf.net/post/img/blah-slide.pdf
http://sdaaubckp.sf.net/post/img/blah-slideP.pdf

Many thanks for looking into this, :)
Cheers!

sdaau

Jun 6, 2011, 12:17:53 PM

Yes - well, thanks to pointers I got here, I tried several ways of
converting the same pdf into a .ps file, and they all give different
sort of files (different dialect of PostScript, I suppose ?!) :)

I was interested in getting a human readable output, and then trying
to change it by 'hand', and here's what I tried:

gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=pswrite \
  -sOutputFile=01-gs.ps blah-slide.pdf

gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=ps2write \
  -dASCII85EncodePages=false -sOutputFile=02-gs.ps blah-slide.pdf

pdf2ps -dASCII85EncodePages=false -sProcessColorModel=DeviceCMYK \
  blah-slide.pdf 03-2ps.ps

pdftops blah-slide.pdf 04-tops.ps

From the above, 02-gs.ps and 03-2ps.ps will turn out with compressed
data inside, hence not human editable. 01-gs.ps, generated by device
`pswrite` is 'human readable'; however has code like this:

$ grep -A 1 -i 'rgb\|device' 01-gs.ps
{ pop/setpagedevice where
{ pop 1 dict dup /PageSize PageSize put setpagedevice}
{ /setpage where{ pop PageSize aload pop pageparams 3 {exch pop}
repeat
--
/rG{3{3 -1 roll 255 div}repeat setrgbcolor}!/G{255 div setgray}!/K{0
G}!
/r6{dup 3 -1 roll rG}!/r5{dup 3 1 roll rG}!/r3{dup rG}!

... and I don't see anything resembling RGB coordinates here :)

Turns out, the 04-tops.ps (generated by pdftops - as I learned, from
xpdf) has the right output:

$ grep -A 1 -i 'cmyk\|rgb\|device' 04-tops.ps
/setpagedevice where {
pop 3 dict begin
--
currentdict end setpagedevice
} {
--

/DeviceGray {} cs
[0] sc
/DeviceGray {} CS
[0] SC
--
/DeviceRGB {} cs
[1 1 1] sc
--
/DeviceRGB {} cs
[0.2353 0.2353 0.2353] sc
--
/DeviceRGB {} cs
[0.2353 0.2353 0.2353] sc
--
/DeviceRGB {} cs
[0.2353 0.2353 0.2353] sc
--
/DeviceRGB {} cs
[0.2353 0.2353 0.2353] sc

So what I did was open this 04-tops.ps manually in nano, and change
all instances of DeviceRGB and DeviceGray to DeviceCMYK - remapping
the values accordingly, as in:

$ grep -A 1 -i 'cmyk\|rgb\|device' 04-tops.ps
/setpagedevice where {
pop 3 dict begin
--
currentdict end setpagedevice
} {
--
/DeviceCMYK {} cs
[0 0 0 1] sc
/DeviceCMYK {} CS
[0 0 0 1] SC
--
/DeviceCMYK {} cs
[0 0 0 0] sc
--
/DeviceCMYK {} cs
[0 0 0 0.2353] sc
--
/DeviceCMYK {} cs
[0 0 0 0.2353] sc
--
/DeviceCMYK {} cs
[0 0 0 1] sc
--
/DeviceCMYK {} cs
[0 0 0 0.2353] sc

And now - one could see that say, evince took much longer to open the
file; it seems that the final two RGB colors [0.2353 0.2353 0.2353]
are used to provide the black color for the two pieces of text in the
pdf (tested through the before-last [0 0 0 1] CMYK), and just copying
this value (0.2353) to K is not correct (gives a very weak gray).

However, the best part is that now, FINALLY, `tiffsep` shows colors
only in the K channel:

gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%08d.tif 04-tops.ps && eog p00000001.tif

and now even if I convert this ps to pdf:

ps2pdf 04-tops.ps

I again get correct tiffseps that show black only in K:

gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%08d.tif 04-tops.pdf && eog p00000001.tif

Funny thing - `identify` will recognize *both* 04-tops.ps and 04-
tops.pdf as RGB :)

Well, I guess I could cook myself up a python script iterating through
all of these colors, and calculate and replace CMYK values, however:

* Is there a guarantee that pdftops will always generate this kind of
syntax with /DeviceRGB, regardless of what PDF I throw at it?
* What happens if /DeviceRGB is not attributed to something like color
coordinates (say, if it is attributed to an image) - if that's
possible at all?
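
Such a script could look roughly like the sketch below. It is not a tested tool: it assumes the exact two-line shape seen in 04-tops.ps above (a `/DeviceRGB {} cs` or `/DeviceGray {} cs` line followed by a `[...] sc` line), and the function name `to_k_only` is made up. Rather than copying the gray value into K (which gave the weak gray above), it writes K = 1 - value, since a gray or RGB value of 0 is black while a K of 1 is black. Anything that does not match, including non-gray RGB colours, passes through untouched, so it does not answer the two questions above; images are not handled at all.

```python
import re

# A colour-space selection as pdftops wrote it in this thread.
SPACE_RE = re.compile(r"^/Device(RGB|Gray) \{\} (cs|CS)$")
# The operand line that follows it, e.g. "[0.2353 0.2353 0.2353] sc".
VALUE_RE = re.compile(r"^\[(\d*\.?\d+(?: \d*\.?\d+)*)\] (sc|SC)$")


def to_k_only(ps_text):
    """Rewrite gray-valued RGB/Gray colours as CMYK with ink on K only."""
    lines = ps_text.splitlines()
    out = []
    i = 0
    while i < len(lines):
        space = SPACE_RE.match(lines[i])
        value = VALUE_RE.match(lines[i + 1]) if i + 1 < len(lines) else None
        if space and value:
            parts = [float(p) for p in value.group(1).split()]
            if len(set(parts)) == 1:  # single gray value, or R == G == B
                k = round(1.0 - parts[0], 4)
                out.append(f"/DeviceCMYK {{}} {space.group(2)}")
                out.append(f"[0 0 0 {k:g}] {value.group(2)}")
                i += 2
                continue
        out.append(lines[i])
        i += 1
    return "\n".join(out) + "\n"
```

With this mapping, `[0.2353 0.2353 0.2353] sc` becomes `[0 0 0 0.7647] sc`, and the white `[1 1 1] sc` becomes `[0 0 0 0] sc`.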

In the end, I'm guessing that with a proper command line, Ghostscript
should be able to do this - however, the only time I got some success
(report of CMYK by `identify`) there had to be an image present; plus
seemingly it doesn't handle the 000->0001 mapping (which maybe color
profiles would address?)

Anyways, I'd love to hear some comments on this,
Thanks,
Cheers!

PS: One more (maybe) relevant note:

HOWTO Convert a ps file to CMYK - http://www.met.rdg.ac.uk/~dan/work/H2ConvertToCMYK.html
> As far as I know gs (ghostscript) doesn't support CMYK postscript.

So, doing:

gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=pswrite \
  -sProcessColorModel=DeviceCMYK -sOutputFile=01b-gs.ps blah-slide.pdf

... results in: "Unrecoverable error: rangecheck
in .putdeviceprops". However, the same error also appears for
-sProcessColorModel=DeviceGray; yet it doesn't appear for
-sColorConversionStrategy=CMYK (replacing
-sProcessColorModel=DeviceCMYK), but there seems to be no significant
difference...

ken

Jun 6, 2011, 12:28:51 PM

In article <6d0f8a6c-3430-4b2f-8852-
ddb0e7...@v8g2000yqb.googlegroups.com>, s...@imi.aau.dk says...

> How to convert pdf to monochrome?... - http://www.groupsrv.com/computers/about669835.html
> >> Print the original .pdf to PostScript in a file, edit the PostScript,
> >> distill the edited PostScript back to a new .pdf. For the editing:
> >> save [get then put/def] the existing PostScript definition (probably
> >> builtin) of any operator that sets color (setcolor, setcmycolor,
> >> setrgbcolor, setgray, ...), then install a new definition that does
> >> whatever you want based on the actual arguments and the saved
> >> original definitions.
>
> ... unfortunately, I do not understand the postscript language enough
> to understand this advice :)

I can do that for you, as can others here, but it would be helpful to
see an example. Ideally a PDF and PostScript file of a single page file,
passed through your workflow.

That way we can be more certain about what to do, and give you better
advice on how to achieve what you want.

> > You are still converting the RGB into CMYK, if the
> > undercolorremoval/blackgeneration doesn't convert equal values of RGB
> > into CMYK, then you get a CMY output. It doesn't really matter which PDF
> > interpreter does this.
> >
>
> Yes - but I was hoping, that if I 'properly' use color profiles
> (whatever 'properly' is), I could sort of have this conversion go "the
> right way" in this case: i.e. if it encounters R = G = B (grayscale);
> then treat it as K:100 CMY:0...

You only get CMYK output if you have an interpreter which applies the
ICC profile (via a Colour Management System) to create CMYK. In general
you won't get this.

What usually happens is that you get a PDF which contains colours in an
ICCBased colour space. Which your print shop probably won't like either.

Or possibly the colours still specified in RGB, but an OutputProfile
attached, which simply describes the RGB space for which these were
intended. A fully ICC compliant workflow (i.e. including your printer)
would be able to create a link from the ICC profile in the PDF to the
ICC profile used for the printing device, and everything would magically
work out. This is rare, it usually only works on closed workflows (that
is, not accepting submissions from the outside world)

> > > So, I guess my questions can be summed up as:
> >
> > > * How can I convert a rich black text color in an RGB pdf - into a
> > > plain black text color in a CMYK pdf?
> >
> > There are a number of things you could try, but I would suggest either
> > printing to PostScript,
>
> Ah - you actually meant something like choosing a .ps output (or
> "printing" to a .ps file) directly from OpenOffice? Yeah, that sounds
> like it should save a processing step...

Well, I meant you could do that to get a PostScript file, which it might
be easier to massage into the form you want before converting it into
PDF (assuming that's what your print shop wants as a submission).

> > It's also possible to set up an under colour removal function which
> > significantly affects how RGB is converted to CMYK (this is covered in
> > the PostScript Language Reference Manual).
> >
>
> Thanks for noting this - I was somewhat aware that postscript language
> can also "process", but I am completely ignorant about its scope.

PostScript is a Turing-complete programming language. While there are
things that are hard to do, very little is impossible.

> > If you can post a (small!) example file, preferably a single page, to
> > some publicly accessible URL I could take a look.
> >
>
> Sure, here are the pdf's of the slides mentioned in the OP:
>
> http://sdaaubckp.sf.net/post/img/blah-slide.pdf
> http://sdaaubckp.sf.net/post/img/blah-slideP.pdf
>
> Many thanks for looking into this, :)

I'll go pull the files down now.

Ken

Helge Blischke

Jun 6, 2011, 1:30:45 PM

sdaau wrote:

[...]

If you convert your OOo generated PDFs to PostScript using pdftops (from the
xpdf suite) and then prepend the attached forceblack.ps to the resulting
PostScript file, RGB colors where R==G==B will be printed as pure black
(replaced with the appropriate gray value).

Note that this trick won't work with PDF input or PostScript produced using
Ghostscript's ps2write device.

Helge

forceblack.ps

ken

Jun 7, 2011, 5:21:06 AM

In article <6d0f8a6c-3430-4b2f-8852-
ddb0e7...@v8g2000yqb.googlegroups.com>, s...@imi.aau.dk says...

> > It's also possible to set up an under colour removal function which
> > significantly affects how RGB is converted to CMYK (this is covered in
> > the PostScript Language Reference Manual).
> >
>
> Thanks for noting this - I was somewhat aware that postscript language
> can also "process", but I am completely ignorant about its scope. Re:
> the DeviceGray CMYK workflow, I was intuitively trying to follow that
> in the above "monster" cmdline too (as in: force all colors to
> grayscale during conversion [in conversion colorspace], and write out
> CMYK values based on these grayscale ones, which hopefully end up only
> on the K plate) -- but I couldn't get `tiffsep` to confirm that.
>
>
> > If you can post a (small!) example file, preferably a single page, to
> > some publicly accessible URL I could take a look.
> >
>
> Sure, here are the pdf's of the slides mentioned in the OP:
>
> http://sdaaubckp.sf.net/post/img/blah-slide.pdf
> http://sdaaubckp.sf.net/post/img/blah-slideP.pdf

I wasn't able to do this conversion in a single pass using the
Ghostscript PDF interpreter, because it uses the setrgbcolor directly
from systemdict, so it doesn't allow for replacement.

Instead I first converted your files to PostScript using the ps2write
device:

gs -sDEVICE=ps2write -sOutputFile=./out.ps ./blah-slide.pdf

Then I created a simple replacement routine, and stored it in a file
called HackRGB.ps:

%!
/oldsetrgbcolor /setrgbcolor load def
/setrgbcolor {
(in replacement setrgbcolor\n) print
%% R G B
1 index 1 index %% R G B G B
eq { %%
2 index 1 index %% R G B R B
eq {
%% Here if R = G = B
pop pop %% remove two values
setgray
} {
oldsetrgbcolor %% set the RGB values
} ifelse
}{
oldsetrgbcolor %% Set the RGB values
}ifelse
} bind def

This replaces the setrgbcolor operator with a routine which tests the
RGB value and if all components are equal it replaces it with a call to
setgray using just one of the components. (BTW you can remove the line
ending in 'print', it's just there so that you can see something is
happening ;-)

I then converted the PostScript file back to PDF, but using this code:

gs -sDEVICE=pdfwrite -sOutputFile=./out.pdf ./HackRGB.ps ./out.ps

This results in a PDF file where the text is in a shade of gray. This
*ought* to be acceptable to your print shop, because gray should map
straight to the K channel of CMYK.

If for some reason that isn't acceptable, you could replace the
'setgray' with '1 exch sub 0 0 0 4 -1 roll setcmykcolor' which uses CMYK
directly (K is 1 minus the gray level, because gray 0 is black while
K 1 is black).

NOTE! This only affects linework (text, vectors), only affects linework
using RGB and will only convert that to gray if the R, G and B values
are identical. Images, shadings and potentially other object types will
not be affected.
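
For readers who don't speak PostScript, the stack juggling in HackRGB.ps can be followed step by step in a small Python model. This is only an illustrative sketch: the helper names (`index`, `eq`, `replacement_setrgbcolor`) are mine, the stack is a plain list with its top at the end, and the function merely reports which operator the routine would end up calling.

```python
def index(stack, n):
    """PostScript 'n index': push a copy of the n-th element from the top."""
    stack.append(stack[-1 - n])


def eq(stack):
    """PostScript 'eq': pop two operands, push whether they are equal."""
    b = stack.pop()
    a = stack.pop()
    stack.append(a == b)


def replacement_setrgbcolor(r, g, b):
    """Return (operator, operands) that the HackRGB.ps routine would run."""
    stack = [r, g, b]          # R G B
    index(stack, 1)            # R G B G
    index(stack, 1)            # R G B G B
    eq(stack)                  # R G B bool   (is G == B?)
    if stack.pop():
        index(stack, 2)        # R G B R
        index(stack, 1)        # R G B R B
        eq(stack)              # R G B bool   (is R == B?)
        if stack.pop():
            stack.pop()        # R G
            stack.pop()        # R
            return ("setgray", tuple(stack))
    return ("setrgbcolor", tuple(stack))
```

So `replacement_setrgbcolor(0.2353, 0.2353, 0.2353)` reports a `setgray` call with the single operand 0.2353, while any colour with unequal components falls through to the original `setrgbcolor` with all three operands intact.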

I should also mention that going from PDF to PostScript and back to PDF
is a potentially lossy process which can introduce errors and odd
artefacts, you should check files carefully after this conversion. I
haven't tested this code particularly.

If you print directly to PostScript then you can eliminate one
conversion step, which is probably worthwhile.

Ken

sdaau

Jun 9, 2011, 8:28:58 AM

Hi all,

Many, many thanks for the assistance with this problem! I believe it
is more or less solved now - somewhat of a mammoth post follows, but
first a summary:

* forceblack.ps uses `pdftops` PS file, manipulates /DeviceRGB, /setcolorspace
* HackRGB.ps uses `gs` ps2write PS file, manipulates /setrgbcolor, /setgray

Thanks for that, Helge Blischke; here is a command line log of what I
tried:

pdftops blah-slide.pdf blah-slide-tops.ps

cat forceblack.ps blah-slide-tops.ps > blah-slide-forceblack.ps
# blah-slide-forceblack.ps has wrong page size in evince!

# check if /pdfEndPage occurs only once:
sed -n '/\/pdfE/{p}' blah-slide.ps

# insert forceblack.ps after line where /pdfEndPage occurs:
## sed without -n will output entire input file
## the 'r' command reads in forceblack.ps, and
## adds/inserts it after the matching line
## http://www.grymoire.com/Unix/Sed.html#uh-0

sed '/\/pdfE/r forceblack.ps' blah-slide-tops.ps > blah-slide-forceblack.ps

# now blah-slide-forceblack.ps is the correct page size!
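
For anyone not comfortable with sed, the same insertion can be sketched in Python. This is a rough equivalent under my own naming (`insert_after_match` is made up), and like the sed `r` command it adds the extra lines after every line that contains the marker, which is why the single-occurrence check above matters. The sample lines are stand-ins, not real pdftops or forceblack.ps content.

```python
def insert_after_match(lines, marker, extra):
    """Copy lines, adding `extra` after every line that contains `marker`."""
    out = []
    for line in lines:
        out.append(line)
        if marker in line:
            out.extend(extra)
    return out


# Stand-in content for illustration only.
page = ["% prolog line\n", "/pdfEndPage {\n", "% rest of file\n"]
patch = ["% forceblack.ps contents would go here\n"]
merged = insert_after_match(page, "/pdfE", patch)
```

On real files, `page` and `patch` would come from `open(...).readlines()` on blah-slide-tops.ps and forceblack.ps, and `merged` would be written out with `writelines`.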

# check separations

gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 \
  -dLastPage=1 -sOutputFile=p%02d.tif blah-slide-forceblack.ps && eog p01.tif 2>/dev/null

... and again, `gs` with `tiffsep` shows black text on all four CMYK
plates.

I guess this is what they call "preflight", as in checking whether
the separations are coming out right - and again, I'm not sure how
reliable `gs` with `tiffsep` is; but I don't know of any other tool in
Linux that could open a PDF/PS and show expected CMYK separations; so
if anyone has any alternatives to `gs` with `tiffsep` on Linux, please
write back. Then again, there is the problem that the printer guy may
not necessarily obtain the same CMYK separations as I do (regardless
of the software I use to render these) - but for now, `gs` with
`tiffsep` offers at least a starting point...

Possibly, the problem may end up boiling down to `gs` with `tiffsep`,
as a "preflight" software - *and* my printer's actual setup - may
choose to send (gray) RGB values (or even values declared as
Grayscale) to all four CMYK plates; while that may not be the case
with other print setups or shops (for the same PDF or PS file). Which
is why in that case, the best for me would be to explicitly try to
convert gray R=G=B values into CMY:0+K values, instead of into
Grayscale?

(anecdote that may confirm this: these days I had a more or less textual
content document from `pdflatex`, split into ranges with `pdftk`,
printed on an office laser printer [don't know the brand] - apparently
something in the laser printer was misaligned, and I could see blueish
[though not 'actual' cyan] outline leaking less than 1 mm 'northwest'
of each and every letter; don't know if that would be a proof of RGB
text black being interpreted in that chain [I guess they used Windows
or Mac to print] as 'rich' black, i.e. C:1 M:1 Y:1 K:1).

On Jun 7, 11:21 am, ken <k...@spamcop.net> wrote:
> I wasn't able to do this conversion in a single pass using the
> Ghostscript PDF interpreter, because it uses the setrgbcolor
> directly from systemdict, so it doesn't allow for replacement.
>
>
> Instead I first converted your files to PostScript using the
> ps2write device:
>
> gs -sDEVICE=ps2write -sOutputFile=./out.ps ./blah-slide.pdf
>

Thanks for that, Ken - good to know some constructs don't allow
replacements.

> Then I created a simple replacement routine, and stored it in a
> file called HackRGB.ps:
>
> ... [snip] ...

>
> This replaces the setrgbcolor operator with a routine which
> tests the RGB value and if all components are equal it replaces
> it with a call to setgray using just one of the components. (BTW
> you can remove the line ending in 'print', its just there so
> that you can see something is happening ;-)
>
> I then converted the PostScript file back to PDF, but using this
> code:
>
> gs -sDEVICE=pdfwrite -sOutputFile=./out.pdf ./HackRGB.ps ./out.ps
>

Many thanks for the commented code (and the tip for `print` for
debugging postscript - and the example of how to use an 'external'
postscript routine with `ghostscript` :) ); this is what I tried:

# I had to add -dNOPAUSE -dBATCH to avoid having
# '>>showpage, press <return> to continue<<' and
# the prompt 'GS>' shown...

gs -dNOPAUSE -dBATCH -sDEVICE=ps2write -sOutputFile=./blah-slide-gsps2w.ps ./blah-slide.pdf
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=./blah-slide-hackRGB.pdf ./HackRGB.ps ./blah-slide-gsps2w.ps
# check separations
gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 -sOutputFile=p%02d.tif blah-slide-hackRGB.pdf && eog p01.tif 2>/dev/null

Sadly, similarly to the use of the previous forceblack.ps, I again get
all four separations here showing letters...

> This results in a PDF file where the text is in a shade of gray.
> This *ought* to be acceptable to your print shop, because gray
> should map straight to the K channel of CMYK.
>

Yes - but, as I commented previously: if the process that they have at
the printer's shop behaves the same as `gs` with `tiffsep`, then
they'll still see what I see - that is, the black for text letters
showing on all four CMYK plates.

> If for some reason that isn't acceptable, you could replace the
> 'setgray' with '0 0 0 4 -1 roll setcmykcolor' which uses CMYK
> directly.
>

Ahh - thanks for that; now that looks very promising to me :) !

I did the replacement, called that version HackRGB-cmyk.ps, and tried
this:

gs -dNOPAUSE -dBATCH -sDEVICE=ps2write -sOutputFile=./blah-slide-gsps2w.ps ./blah-slide.pdf
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=./blah-slide-hackRGB-cmyk.pdf ./HackRGB-cmyk.ps ./blah-slide-gsps2w.ps
# check separations
gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 -sOutputFile=p%02d.tif blah-slide-hackRGB-cmyk.pdf && eog p01.tif 2>/dev/null

... and - partial success here: CMY plates are *finally* blank white -
but the K plate is inverted (what should be white background, is shown
in black; and the letters are grayer than that) :) Interestingly, the
same effect is shown if I open blah-slide-hackRGB-cmyk.pdf in
`evince`, too. Also interestingly, if I use the `pdftops` output
(blah-slide-tops.ps), then the final pdf is not inverted - but the
separations show again black text on all four CMYK plates:

gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=./blah-slide-hackRGB-cmyk.pdf ./HackRGB-cmyk.ps ./blah-slide-tops.ps
# check separations
gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 -sOutputFile=p%02d.tif blah-slide-hackRGB-cmyk.pdf && eog p01.tif 2>/dev/null

... and since the debug text "in replacement setrgbcolor" never
appears on stdout, this means the procedure is not even triggered!

Then, since the whole problem seems to be a simple matter of
calculating K=1-R (instead of using K=R), I tried to modify the script
myself, and after finding the following webpages:

> PostScript, The Forgotten Art of Programming | Linux Journal - http://www.linuxjournal.com/article/2386
> http://homepage.mac.com/andykopra/pdm/tutorials/an_introduction_to_postscript.html

... it seems to have worked :) First, I tried use ghostscript in
command line mode, reconstructing a simple stack and pasting
modifications of the HackRGB-cmyk.ps so I could see what I was writing
- as a noob note, those commands are here:

http://sdaaubckp.sourceforge.net/post/ps/debug-paste-cmds.ps

... and here is what ghostscript writes on output:

http://sdaaubckp.sourceforge.net/post/ps/debug-paste-cmds.ps.log

Finally, this is what the modified HackRGB-cmyk.ps looks like:

http://sdaaubckp.sourceforge.net/post/ps/HackRGB-cmyk-inv.ps

... the difference from HackRGB-cmyk.ps being:

- 0 0 0 4 -1 roll setcmykcolor
+ 0 0 0 4 -1 roll -1 mul 1 add setcmykcolor

... along with an added piece of code that will do the same for
setgray values.

To finally confirm all is OK, I run:

gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=./blah-slide-hackRGB-cmyk-inv.pdf ./HackRGB-cmyk-inv.ps ./blah-slide-gsps2w.ps
# check separations
gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 -sOutputFile=p%02d.tif blah-slide-hackRGB-cmyk-inv.pdf && eog p01.tif 2>/dev/null

... and now, I do get black text on white background only in the K
plate, while CMY plates are blank white :) Although, I'm not sure how
accurate of a formula K=1-R is; if someone can suggest a more accurate
formula, please write back!

> NOTE! This only affects linework (text, vectors), only affects
> linework using RGB and will only convert that to gray if the R,
> G and B values are identical. Images, shadings and potentially
> other object types will not be affected.
>

Thanks for that - for color images, I anyway have to pay all four
inks, so for them, maybe 'rich' black is preferable; for shadings -
yeah, will have to look into that once I get a problem with it :)

> I should also mention that going from PDF to PostScript and back
> to PDF is a potentially lossy process which can introduce errors
> and odd artefacts, you should check files carefully after this
> conversion. I haven't tested this code particularly.
>

Right - and a note to myself: after pdf to ps (and thus in the final
round trip from ps to pdf) text information is gone - all the font
glyphs apparently become treated as curves (since I cannot select or
copy the text in `evince` anymore); I guess hyperlinks would be gone
too - but it doesn't matter really; as this is a document specifically
intended for a print shop :)

> If you print directly to PostScript then you can eliminate one
> conversion step, which is probably worthwhile.
>

I haven't really tried this, but as far as I can see, there are
several PostScript dialects - and I'm not sure, if I, say, export from
OpenOffice to PS directly, whether it is guaranteed that it will
feature /setrgbcolor type syntax (instead of the /DeviceRGB,
/setcolorspace type syntax). Which is why it's good to know that `gs` +
`ps2write` would help in such a case regardless :)

On Jun 6, 6:28 pm, ken <k...@spamcop.net> wrote:
> You only get CMYK output if you have an interpreter which
> applies the ICC (via a Colour Management System) profile to
> create CMYK. In general you won't get this.
>
> What usually happens is that you get a PDF which contains
> colours in an ICCBased colour space. Which your print shop
> probably won't like either.
>
> Or possibly the colours still specified in RGB, but an
> OutputProfile attached, which simply describes the RGB space for
> which these were intended. A fully ICC compliant workflow (ie
> including your printer) would be able to create a link from the
> ICC profile in the PDF to the ICC profile used for the printing
> device, and everything would magically work out. This is rare,
> it usually only works on closed workflows (that is, not
> accepting submissions from the outside world)
>

Thanks for this - I can see that I still fail to understand properly
how ICC profiles really work; however, I hope with the solution above
I won't need to :) Especially thanks for the 'closed workflow' comment
- I was suspecting that may be the case, but I'm not that involved
with the industry to have actual experience of the kind...

> PostScript is a Turing-complete programming language. While
> there are things that are hard to do, very little is impossible.

Heh - as soon as I saw this, I started reading up on it a bit - and
'oh dear' - there is a LOT of history involved in this, and reverse
Polish Notation doesn't make it any easier :) But I was glad I could
at least recognize:

> /sys_setcolorspace /setcolorspace load def
> /setcolorspace {
> ...

> /oldsetrgbcolor /setrgbcolor load def
> /setrgbcolor {

... hey, I guess this is much like the following:

> \let\Oldincludegraphics\includegraphics
> \renewcommand{\includegraphics}[1]{\Oldincludegraphics[width=\maxwidth]{#1}}

... in Latex, no? :)

Anyways - thanks again everyone for the help with this problem,
Cheers!

ken

Jun 9, 2011, 9:00:39 AM

In article <da825ba1-f2f2-4ffe-8ea2-5cd3b4518e73
@p13g2000yqh.googlegroups.com>, s...@imi.aau.dk says...

> > If for some reason that isn't acceptable, you could replace the
> > 'setgray' with '0 0 0 4 -1 roll setcmykcolor' which uses CMYK
> > directly.
> >
>
> Ahh - thanks for that; now that looks very promising to me :) !
>
> I did the replacement, called that version HackRGB-cmyk.ps, and tried
> this:
>
>
> gs -dNOPAUSE -dBATCH -sDEVICE=ps2write -sOutputFile=./blah-slide-gsps2w.ps ./blah-slide.pdf
> gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=./blah-slide-hackRGB-cmyk.pdf ./HackRGB-cmyk.ps ./blah-slide-gsps2w.ps
> # check separations
> gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 -sOutputFile=p%02d.tif blah-slide-hackRGB-cmyk.pdf && eog p01.tif 2>/dev/null
>
>
> ... and - partial success here: CMY plates are *finally* blank white -
> but the K plate is inverted (what should be white background, is shown
> in black; and the letters are grayer than that) :)

Oops, my fault, try '0 0 0 4 -1 roll 1 exch sub setcmykcolor' instead.
Gray is inverse polarity: 1 setgray produces white while 0 setgray
produces black, whereas K=1 is full black. Subtracting from 1 yields the
reverse result, which should work.
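
For readers not used to the operand stack, here is a rough Python trace of what '0 0 0 4 -1 roll 1 exch sub' leaves behind for setcmykcolor. It is only a model of the stack, not PostScript itself; the helper names `roll` and `gray_to_k` are invented for this sketch, and the list end plays the role of the top of the stack.

```python
def roll(stack, n, j):
    """Model of PostScript `n j roll`: rotate the top n items by j places.

    The end of the list is the top of the stack. A negative j moves the
    deepest of the n items to the top, as in `4 -1 roll`.
    """
    k = j % n
    if k:
        top = stack[-n:]
        stack[-n:] = top[-k:] + top[:-k]


def gray_to_k(gray):
    """Trace `0 0 0 4 -1 roll 1 exch sub` for one gray value; returns [C, M, Y, K]."""
    stack = [gray]
    stack += [0, 0, 0]                            # 0 0 0      -> gray 0 0 0
    roll(stack, 4, -1)                            # 4 -1 roll  -> 0 0 0 gray
    stack.append(1)                               # 1          -> 0 0 0 gray 1
    stack[-1], stack[-2] = stack[-2], stack[-1]   # exch       -> 0 0 0 1 gray
    b = stack.pop()
    a = stack.pop()
    stack.append(a - b)                           # sub        -> 0 0 0 (1 - gray)
    return stack


print(gray_to_k(0))   # black text: [0, 0, 0, 1]
print(gray_to_k(1))   # white:      [0, 0, 0, 0]
```

So a gray of 0 (black) ends up as K=1 with C, M and Y all zero, which is the "plain K" black the print shop wants.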

> Interestingly, the
> same effect is shown if I open blah-slide-hackRGB-cmyk.pdf in
> `evince`, too. Also interestingly, if I use the `pdftops` output (blah-
> slide-tops.ps), then the final pdf is not inverted - but the
> separations show again black text on all four CMYK plates:
>
> gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=./blah-slide-hackRGB-cmyk.pdf ./HackRGB-cmyk.ps ./blah-slide-tops.ps
> # check separations
> gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 -sOutputFile=p%02d.tif blah-slide-hackRGB-cmyk.pdf && eog p01.tif 2>/dev/null
>
> ... and since the debug text "in replacement setrgbcolor" never
> appears on stdout, this means the procedure is not even triggered!

Presumably because there are no colours specified using setrgbcolor....

> ... the difference from HackRGB-cmyk.ps being:
>
> - 0 0 0 4 -1 roll setcmykcolor
> + 0 0 0 4 -1 roll -1 mul 1 add setcmykcolor
>
> ... along with an added piece of code that will do the same for
> setgray values.

Mea culpa, I forgot to invert the gray values. Still, well done on
sorting out the setgray yourself !

> ... and now, I do get black text on white background only in the K
> plate, while CMY plates are blank white :) Although, I'm not sure how
> accurate of a formula K=1-R is; if someone can suggest a more accurate
> formula, please write back!

It only occurs (in the code I wrote at least) when R=G=B, so it's a shade
of gray. In that case it doesn't matter what component you choose, they
are all the same :-)
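
To spell out the decision the replacement routine makes, a small Python model may help. It is a sketch of the logic only (the function name `replace_setrgbcolor` and the returned tuples are made up here, and it uses the K=1-R variant discussed above rather than the original setgray one); the real work is of course done by the PostScript routine.

```python
def replace_setrgbcolor(r, g, b):
    """Model of the replacement: gray RGB goes to K only, anything else stays RGB.

    Returns the name of the operator that would be called and its operands.
    """
    if r == g == b:
        # All three components are equal, so any one of them can stand in
        # for the gray level; K is its complement.
        return ("setcmykcolor", (0, 0, 0, 1 - r))
    # Real colours are passed through to the original operator untouched.
    return ("setrgbcolor", (r, g, b))


print(replace_setrgbcolor(0, 0, 0))   # ('setcmykcolor', (0, 0, 0, 1))
print(replace_setrgbcolor(1, 0, 1))   # ('setrgbcolor', (1, 0, 1))
```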

> > I should also mention that going from PDF to PostScript and back
> > to PDF is a potentially lossy process which can introduce errors
> > and odd artefacts, you should check files carefully after this
> > conversion. I haven't tested this code particularly.
> >
>
> Right - and a note to myself: after pdf to ps (and thus in the final
> roundrip from ps to pdf) text information is gone - all the font
> glyphs apparently become treated as curves (since I cannot select or
> copy the text in `evince` anymore); I guess hyperlinks would be gone
> too - but it doesn't matter really; as this is a document specifically
> intended for a print shop :)

pswrite is *really* basic, ps2write does a much better job. All text is
converted to outlines by pswrite, which is one reason the output tends
to be huge. There are many other compromises too.

> > If you print directly to PostScript then you can eliminate one
> > conversion step, which is probably worthwhile.
> >
>
> I haven't really tried this, but as far as I can see, there are
> several PostScript dialects

Not really. There are three basic levels, 1 to 3 where 1 is ancient, 2
is quite old and 3 is (comparatively) new. Most producers only create
level 2 PostScript anyway, which will run on any printer you can buy, or
could have bought in the last 10 years probably.

Only specialist DTP applications like QuarkXPress, Adobe Illustrator,
InDesign etc generally create level 3 PostScript. However, if you start
from a PDF file and convert to PostScript using something like Acrobat,
it may offer to save as level 3. Generally that should be fine, but you
might want to check that one with your printer if you ever find yourself
doing it.

OTOH If they can accept PDF I'd be amazed if their RIP couldn't also
handle level 3 PostScript.

Ken

sdaau

Jun 9, 2011, 9:48:15 AM

On Jun 9, 3:00 pm, ken <k...@spamcop.net> wrote:
> > ... and - partial success here: CMY plates are *finally* blank white -
> > but the K plate is inverted (what should be white background, is shown
> > in black; and the letters are grayer than that) :)
>
> Oops, my fault, try '0 0 0 4 -1 roll 1 exch sub' instead. Gray is
> inverse polarity and so 1 setgray produces white while 0 setgray
> produces black. Subtracting from 1 will yield the reverse result, which
> should work.
>

Awesome - thanks for the '1 exch sub' construct - read up on
http://www.tailrecursive.org/postscript/operators.html#exch and I
think I see how it works :)

>
> Still, well done on
> sorting out the setgray yourself !

Cheers - wouldn't have done it if I wasn't encouraged by your code
comments :)

> > ... Although, I'm not sure how
> > accurate of a formula K=1-R is; if someone can suggest a more accurate
> > formula, please write back!
>
> It only occurs (in the code I wrote at least) when R=G=B, so it's a shade
> of gray. In that case it doesn't matter what component you choose, they
> are all the same :-)
>

Yes, but I meant more from a perceptual perspective: for instance, I
am pretty certain that black as (CMY)K:(0,0,0),1 on paper should
correspond to RGB:0,0,0 on screen; and that white as (CMY)K:(0,0,0),0
should correspond to RGB:1,1,1. However, would RGB:0.8,0.8,0.8 map
linearly to K:0.2 - or are there some 'transformations' involved, when
mapping perception of grayscale from screen to paper (e.g. instead of
K=1-R, maybe something like K=1-0.2*(5^R) would be more
appropriate) ?

> > > I should also mention that going from PDF to PostScript and back
> > > to PDF is a potentially lossy process which can introduce errors
> > > and odd artefacts, you should check files carefully after this
> > > conversion. I haven't tested this code particularly.
>
> > Right - and a note to myself: after pdf to ps (and thus in the final
> > round trip from ps to pdf) text information is gone - all the font
> > glyphs apparently become treated as curves (since I cannot select or
> > copy the text in `evince` anymore); ...
>
> pswrite is *really* basic, ps2write does a much better job. All text is
> converted to outlines by pswrite, which is one reason the output tends
> to be huge. There are many other compromises too.
>

Just to make sure - I was using the ps2write (not pswrite) in the
example above, and that also seems to 'flatten' the text (although, as
I noted, I don't mind that, and the other compromises - as long as the
print comes out nice :) )..

> > > If you print directly to PostScript then you can eliminate one
> > > conversion step, which is probably worthwhile.
>
> > I haven't really tried this, but as far as I can see, there are
> > several PostScript dialects
>
> Not really. There are three basic levels, 1 to 3 where 1 is ancient, 2
> is quite old and 3 is (comparatively) new. Most producers only create
> level 2 PostScript anyway, which will run on any printer you can buy, or
> could have bought in the last 10 years probably.
>
> Only specialist DTP applications like QuarkXPress, Adobe Illustrator,
> InDesign etc generally create level 3 PostScript. However, if you start
> from a PDF file and convert to PostScript using something like Acrobat,
> it may offer to save as level 3. Generally that should be fine, but you
> might want to check that one with your printer if you ever find yourself
> doing it.
>
> OTOH If they can accept PDF I'd be amazed if their Rip couldn't also
> handle level 3 PostScript.
>

Thanks for noting that; good to have the notion, that level 2 should
still be generally safe to use.

I guess 'dialects' was the wrong word to use; when I wrote that, I was
referring more to this type of problem:

> > ... and since the debug text "in replacement setrgbcolor" never
> > appears on stdout, this means the procedure is not even triggered!
>
> Presumably because there are no colours specified using setrgbcolor....
>

... if not a different dialect, then surely there seem to be different
ways of specifying color: for instance, depending on how a conversion
from PDF to PS is performed, the PS file may or may not specify colors
using setrgbcolor. And I guess, that is what would limit the usability
of a script like HackRGB.ps?

Thanks again for a great discussion,
Cheers!

ken

Jun 9, 2011, 11:06:50 AM

In article <fd01f57e-e9b3-4839-9ba8-5c732bcfc506
@m4g2000yqk.googlegroups.com>, s...@imi.aau.dk says...

> > It only occurs (in the code I wrote at least) when R=G=B, so it's a shade
> > of gray. In that case it doesn't matter what component you choose, they
> > are all the same :-)
> >
>
> Yes, but I meant more from a perceptual perspective: for instance, I
> am pretty certain that black as (CMY)K:(0,0,0),1 on paper should
> correspond to RGB:0,0,0 on screen; and that white as (CMY)K:(0,0,0),0
> should correspond to RGB:1,1,1. However, would RGB:0.8,0.8,0.8 map
> linearly to K:0.2 - or are there some 'transformations' involved, when
> mapping perception of grayscale from screen to paper (e.g. instead of
> K=1-R, may be something like K=1-0.2*(5^R) would be more
> appropriate) ?

If you're worried about colour fidelity, you shouldn't be using an
application which produces RGB to produce documents for print ;-)

This has long been a criticism of Microsoft Office and Publisher; people
who care about colour want (at the very least!) to be able to specify
CMYK colours, not RGB. The whole colour model is different (reflective
vs transmissive).

Seriously, I wouldn't worry about it too much, I expect it'll be close
enough for you.

> > > Right - and a note to myself: after pdf to ps (and thus in the final
> > > round trip from ps to pdf) text information is gone - all the font
> > > glyphs apparently become treated as curves (since I cannot select or
> > > copy the text in `evince` anymore); ...
> >
> > pswrite is *really* basic, ps2write does a much better job. All text is
> > converted to outlines by pswrite, which is one reason the output tends
> > to be huge. There are many other compromises too.
> >
>
> Just to make sure - I was using the ps2write (not pswrite) in the
> example above, and that also seems to 'flatten' the text (although, as
> I noted, I don't mind that, and the other compromises - as long as the
> print comes out nice :) )..

ps2write really should never convert text to outlines, worst case it
might produce bitmaps instead of scalable fonts. I don't think there's
any way it can convert to outlines (I had a recent request for that, so
I'm reasonably sure ;-)

> Thanks for noting that; good to have the notion, that level 2 should
> still be generally safe to use.
>
> I guess 'dialects' was the wrong word to use; when I wrote that, I was
> referring more to this type of problem:
>
> > > ... and since the debug text "in replacement setrgbcolor" never
> > > appears on stdout, this means the procedure is not even triggered!
> >
> > Presumably because there are no colours specified using setrgbcolor....
> >
>
> ... if not a different dialect, then sure there seems to be different
> ways of specifying color: for instance, depending on how a conversion
> from PDF to PS is performed, the PS file may or may not specify colors
> using setrgbcolor.

Well PostScript is a programming language; there are usually multiple
ways of achieving the same end in programming languages. Some may be
preferable.

One reason to use "/DeviceRGB setcolorspace R G B setcolor" instead of
"R G B setrgbcolor" would be if you were going to specify lots of
colours. Saving the 3 bytes per time of setcolor vs setrgbcolor can
mount up if you do lots of them, leading to smaller files. These days
nobody really cares much about that ;-)

Also there are a number of operators which use the current color space,
so if you are going to be working in RGB its often more efficient to set
the colour space to RGB, and then just go. Same for other spaces of
course.

Microsoft Office used to (probably still does) create patterns by
drawing lots of teeny tiny images in an Indexed (ie palette) colour
space. It was hideously inefficient because it would set the current
colour space to RGB then save the graphics state, set the colour space
to the paletted space and draw the image, then restore back to
DeviceRGB, rinse and repeat.

The RIP I was working on at the time did a certain amount of work
whenever the colour space changed. By switching inanely back and forth
like that the files took a long time to process. We eventually added a
cache to cater for the situation.

Setting the colour space to the paletted colour, drawing all the images
and then restoring back would have been *much* more efficient....

But concerns like those went away when people stopped sending PostScript
files to RIPs using a 9,600 bits/sec serial interface :-)

> And I guess, that is what would limit the usability
> of a script like HackRGB.ps?

You could still do it. You would need to monitor calls to setcolor
instead of setrgbcolor, check the current colour space and if it's
/DeviceRGB check the three components. If they are the same then you
could set the colour space to Gray, and call setcolor with one
component. You would need to remember what the last colour space was, so
that on the next call to setcolor you could restore the original space
first. Obviously you would also monitor setcolorspace calls in case the
space changed after the last setcolor.

Clearly the complexity of the challenge goes up, but it's still possible.
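
The bookkeeping described above can be modelled in a few lines of Python, just to show the state that has to be tracked. This is a hedged sketch, not PostScript and not tested against any real file: the class `ColorMonitor`, its attributes and the `emitted` list are all invented for illustration, and the starting space is assumed to be DeviceGray, which is PostScript's initial colour space.

```python
class ColorMonitor:
    """Toy model of wrapping setcolorspace/setcolor as described above."""

    def __init__(self):
        self.requested_space = "DeviceGray"  # space the document last asked for
        self.active_space = "DeviceGray"     # space in force after our substitutions
        self.emitted = []                    # operations passed on to the real interpreter

    def _switch(self, space):
        # Only emit a setcolorspace when the space really changes.
        if self.active_space != space:
            self.emitted.append(("setcolorspace", space))
            self.active_space = space

    def setcolorspace(self, space):
        # Monitor setcolorspace so we always know what the document wants.
        self.requested_space = space
        self._switch(space)

    def setcolor(self, *components):
        is_gray_rgb = (
            self.requested_space == "DeviceRGB"
            and len(components) == 3
            and components[0] == components[1] == components[2]
        )
        if is_gray_rgb:
            # R = G = B: switch to gray and pass a single component.
            self._switch("DeviceGray")
            self.emitted.append(("setcolor", (components[0],)))
        else:
            # Restore the space the document asked for before setting the colour.
            self._switch(self.requested_space)
            self.emitted.append(("setcolor", components))


m = ColorMonitor()
m.setcolorspace("DeviceRGB")
m.setcolor(0, 0, 0)   # black text, becomes a gray setcolor
m.setcolor(1, 0, 0)   # red, space is restored to DeviceRGB first
```

To get K-only output rather than gray, the gray branch would switch to a CMYK space and pass 0 0 0 (1-R) instead, in the same way as the setrgbcolor hack.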

Ken

tlvp

Sep 16, 2011, 6:33:00 PM

At the risk of wasting good keystrokes on a necro-thread, let me
nonetheless remark on my understanding of the encoding of black
in postscript (corrections welcomed if called for):

In the RGB colorspace, black is specified as 0 0 0 .
In the grayscale "colorspace", black is specified as 0 .
In the CMYK colorspace, black is specified either as 1 1 1 1 (rich)
or as 0 0 0 1 (pure).

In any event, it is clear that your grayscale blackness parameter and
your matching K value are complementary real numbers (whose sum is 1).

This may help explain why your correction (R to 1-R) was necessary.

Hoping this doesn't come too too late, I offer you cheers, -- tlvp
--
Avant de repondre, jeter la poubelle, SVP.

ole....@gmail.com

Jan 25, 2012, 9:40:19 AM

Hi Helge

Your "recipe" works very nicely - thanks a lot for that - except for one refinement that would probably be nice:

When black text is printed on a background color other than white, in my case a light cyan, that background color is carved out to white; and as far as I remember from my prepress time, it is difficult for the printer to avoid white margins around the glyphs if the background is white under the glyphs instead of being the same color as around them.

Can this be taken into consideration e.g. in forceblack.ps? (I unfortunately don't feel competent in Postscript to quickly figure that out myself.)

Thanks a lot for your time!
Ole

Helge Blischke

Jan 25, 2012, 10:09:13 AM

The knockout of your background color should only occur when your printer
(what make and model?) is generating color separations.
In any case, you could try to insert the statement
true setoverprint
just before each of the "}bind def" lines in the forceblack.ps script.
But there is no guarantee that this works.

Helge

ole....@gmail.com

Jan 26, 2012, 9:41:50 AM

Thanks a lot for quick reply!

Tried the "true setoverprint" as you mentioned, but it didn't help. I'm using the tiffsep ghostscript device for checking like sdaau did.

I'm wondering if it would make a difference to first convert the rgb PDF to cmyk and then apply your procedure? It appears PostScript defines overprinting only for subtractive color models like cmyk, not for additive ones like rgb. And forceblack.ps is assuming an rgb color model, isn't it? Again, I don't feel in a position to quickly modify forceblack.ps for the cmyk colorspace. Would that be a big thing for you?-)

Thanks again!
Ole

ole....@gmail.com

Jan 26, 2012, 3:58:52 PM

Hi Helge

Found the solution in the answer sdaau posted on stackoverflow (see http://stackoverflow.com/questions/6248563/converting-any-pdf-to-black-k-only-cmyk/).

After converting original rgb pdf to ps with pdftops, convert ps to cmyk pdf with ghostscript pdfwrite first prepending sdaau's HackRGB-cmyk-inv.ps (http://sdaaubckp.sourceforge.net/post/ps/HackRGB-cmyk-inv.ps), then prepending your forceblack.ps with the "true setoverprint" extension. This will overprint black text on existing background color:

pdftops -level2 test.pdf test.ps
gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER -sOutputFile=test-cmyk-k-overprint.pdf HackRGB-cmyk-inv.ps forceblack.ps test.ps

Thanks again,
Ole

ricard...@gmail.com

Feb 27, 2013, 10:10:27 AM

Hello all!

This last script works very well for my project!
Thanks!

But I have a problem.
How can I convert the rgb magenta to 100% magenta?

ricard...@gmail.com

Feb 27, 2013, 10:36:54 AM

my question on stackoverflow:
http://stackoverflow.com/questions/15115990/convert-rgb-pdf-to-cmyk-keep-100-k-black-and-100-mmagenta-on-linux

regards

ricard...@gmail.com

Feb 28, 2013, 1:06:20 PM

Thanks to the guys on stackoverflow, here is the answer:

update HackRGB-cmyk to this:

%!

/oldsetrgbcolor /setrgbcolor load def
/setrgbcolor {
  (in replacement setrgbcolor\n) print
                            %% R G B
  1 index 1 index           %% R G B G B
  eq {                      %% here if G = B
    2 index 1 index         %% R G B R B
    eq {
      %% Here if R = G = B
      pop pop               %% remove two values, leaving R

      % setgray % "replace the 'setgray' with":
      0 0 0 4 -1 roll       %% 0 0 0 R
      -1 mul                %% obtain -R on top of stack
      1 add                 %% obtain 1-R on top of stack
      setcmykcolor          %% now set(cmykcolor) K (as 1-R)
    } {
      oldsetrgbcolor        %% set the RGB values
    } ifelse
  } {
    oldsetrgbcolor          %% set the RGB values

    currentcmykcolor        % puts 4 numbers on the stack: C M Y K
    (cmyk-) print pstack    % display the colors (remove when things work correctly)
    3 -1 roll               % put magenta on top of stack: C Y K M
    dup                     % make copy of magenta value
    .5                      % put magenta test value on stack (it may not be exactly .5, see pstack)
    eq                      % see if magenta is equal to test value (.5)
    {pop 1} if              % if it is equal, pop off the magenta value and put a 1 onto the stack
    3 1 roll                % put magenta back where it belongs in the stack: C M Y K
    setcmykcolor            % reset the cmyk to have new magenta value
  } ifelse
} bind def

/oldsetgray /setgray load def
/setgray {
  (in replacement setgray\n) print
  % == % debug: pop last element and print it
  % here we're at a gray value;
  % http://www.tailrecursive.org/postscript/operators.html#setcymkcolor
  % setgray: "gray-value must be a number from 0 (black) to 1 (white)."
  % setcmykcolor: "The components must be between 0 (none) to 1 (full)."
  % so convert here again:
  0 0 0 4 -1 roll           % push CMY:000 after Gray and roll down,
                            % so top of stack becomes
                            % ...:C:M:Y:Gray
  -1 mul                    %% obtain -Gray on top of stack
  1 add                     %% obtain 1-Gray on top of stack
  setcmykcolor              %% now set(cmykcolor) K (as 1-Gray)
} bind def

%~ # test: rgb2gray
%~ gs -dNOPAUSE -dBATCH -sDEVICE=ps2write -sOutputFile=./blah-slide-hackRGB-gray.ps ./HackRGB.ps ./blah-slide-gsps2w.ps
%~ # gray2cmyk
%~ gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=./blah-slide-hackRGB-gray-ci.pdf ./HackRGB-cmyk-inv.ps ./blah-slide-hackRGB-gray.ps
%~ # check separations - looks OK
%~ gs -sDEVICE=tiffsep -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=1 -sOutputFile=p%02d.tif blah-slide-hackRGB-gray-ci.pdf && eog p01.tif 2>/dev/null
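
One caveat on the magenta test in the script above: it compares against exactly .5 with `eq`, and as the comment in the script itself notes, the converted value may not be exactly .5. A tolerance-based comparison is one possible way around that. The Python below only models the idea; the function name `promote_magenta` and the default tolerance of 0.01 are assumptions made for this sketch, not something taken from the thread or from stackoverflow.

```python
def promote_magenta(c, m, y, k, target=0.5, tol=0.01):
    """Return CMYK with magenta forced to 1.0 when it is within `tol` of `target`.

    `target` is the magenta value the converted colour is expected to have
    (the script above tests for .5); `tol` is a made-up tolerance.
    """
    if abs(m - target) <= tol:
        m = 1.0
    return (c, m, y, k)


print(promote_magenta(0, 0.498, 0, 0))   # close enough to .5: (0, 1.0, 0, 0)
print(promote_magenta(0, 0.3, 0, 0))     # left alone:         (0, 0.3, 0, 0)
```

In PostScript the same check could be written with `sub abs` and `le` in place of `eq`, but that variant is untested here.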
