Adding support for new PDF objects


Piotr Kopszak

Aug 3, 2009, 6:35:37 AM
to prawn...@googlegroups.com
Hello list,

I'm trying to learn how to extend Prawn to support a new PDF object
(actually I need a little object tree, but let's stick to a single
object for now). As I understand it, I have to call ref() on the
Prawn::Document object, but is that enough? I can't get any objects
generated this way in the resulting PDF.

Best

Piotr


--
http://okle.pl

James Healy

Aug 3, 2009, 9:21:44 AM
to prawn...@googlegroups.com
Piotr Kopszak wrote:
> I'm trying to learn how to extend Prawn to support a new PDF object
> (actually I need a little object tree, but let's stick to a single
> object for now). As I understand it, I have to call ref() on the
> Prawn::Document object, but is that enough? I can't get any objects
> generated this way in the resulting PDF.

Calling ref() is all you should need to do. Make sure you pass it a ruby
object that can be translated into a corresponding PDF object.

What type of object are you adding and what makes you think it's not
ending up in the generated PDF? For it to have any effect, make sure it
is referenced from another object that's part of the document. Adding an
'orphan' object will have no impact.

-- James Healy <jimmy-at-deefa-dot-com> Mon, 03 Aug 2009 23:18:08 +1000
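
The point about orphan objects can be tried out without Prawn. The following is only a toy model of an object graph, not Prawn's internals: `Ref`, `TinyStore` and `reachable_ids` are hypothetical names made up for the sketch. It shows that an object created with a ref-like call is not reachable from the document root until some other object points at it.

```ruby
# Toy model of a PDF object graph; none of these names come from Prawn.
Ref = Struct.new(:id, :data)

class TinyStore
  attr_reader :objects

  def initialize
    @objects = []
  end

  # Create a new indirect object and return a reference to it.
  def ref(data)
    r = Ref.new(@objects.size + 1, data)
    @objects << r
    r
  end

  # Ids of all objects reachable from +root+ by following references.
  def reachable_ids(root)
    seen = {}
    walk(root, seen)
    seen.keys.sort
  end

  private

  def walk(value, seen)
    case value
    when Ref
      return if seen[value.id]
      seen[value.id] = true
      walk(value.data, seen)
    when Hash
      value.each_value { |v| walk(v, seen) }
    when Array
      value.each { |v| walk(v, seen) }
    end
  end
end

store  = TinyStore.new
page   = store.ref(:Type => :Page)
pages  = store.ref(:Type => :Pages, :Kids => [page])
root   = store.ref(:Type => :Catalog, :Pages => pages)
orphan = store.ref(:MyObject => :Something) # created but never linked

linked_before = store.reachable_ids(root)
page.data[:MyObject] = orphan                # now the page points at it
linked_after = store.reachable_ids(root)
```

Before the last assignment the orphan's id is missing from the reachable set; afterwards it is included.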

Piotr Kopszak

Aug 3, 2009, 9:44:37 AM
to prawn...@googlegroups.com
Yes, I did exactly that (tried to add an "orphaned" object). Finally
succeeded but it's still "orphaned" with


def my_function()
  obj = ref(:MyObject => :Something)
  return obj
end

which I called like this


pdf = Prawn::Document.new
pdf.my_function

How could I modify my_function so that it is referenced by a page object?

Piotr


2009/8/3 James Healy <ji...@deefa.com>:
--
http://okle.pl

James Healy

Aug 3, 2009, 10:19:22 AM
to prawn...@googlegroups.com
Piotr Kopszak wrote:
> How could I modify my_function so that it is referenced by a page object?

How your new object fits into the PDF depends on what kind of object it
is.

Assuming my_function() is within the scope of a Prawn::Document object, it
can access the current page object via the @current_page instance
variable (see lib/prawn/document.rb).

Also check out the @pages, @root and @info instance variables,
initialised in lib/prawn/document.rb. These are all objects near the top
of the PDF object tree and *may* be suitable places to link your object.

If you want to add content to a page, skip creating an object and use
the add_content() method, defined in lib/prawn/document/internals.rb.

-- James Healy <jimmy-at-deefa-dot-com> Tue, 04 Aug 2009 00:14:15 +1000

Piotr Kopszak

Aug 4, 2009, 4:02:18 AM
to prawn...@googlegroups.com
Great! I'm beginning to get the hang of it. I looked at a couple of
Java PDF libraries before but playing with Ruby is so much more fun.

All Best

Piotr

--
http://okle.pl

Piotr Kopszak

Aug 4, 2009, 8:45:04 AM
to prawn...@googlegroups.com
Still, it's not quite straightforward :)

When I'm doing

def my_function()
  @obj = ref(:Name => :MyObject)
  @current_page = ref(:Contents => @obj)
  @pages.data[:Kids] << @current_page
end

I'm getting two pages and MyObject ends up as a child of the second.
How can I add the object to the first page? Also, I had to change obj to @obj
to get MyObject in the output but don't quite understand why. Sorry
for such a trivial post but I promise I'll get better.

Piotr

P.S.
I'd gladly take a look at
http://prawn.lighthouseapp.com/projects/9398/tickets/122-hacking-guide
however it does not seem publicly available.

2009/8/4 Piotr Kopszak <kop...@gmail.com>:
--
http://okle.pl
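
A plausible reading of the two-page result, offered as an assumption rather than a confirmed diagnosis: the page created by Prawn::Document.new is already listed in the page tree's Kids, so appending a freshly created object adds a second entry instead of changing the first. The sketch below uses plain hashes as stand-ins for PDF dictionaries (it is not Prawn's object model, and the :MyObject key on the page is purely illustrative) to contrast appending with modifying the existing entry.

```ruby
# Plain hashes standing in for PDF dictionaries; this is not Prawn's object model.
my_object = { :Name => :MyObject }

# Case 1: mirror the snippet above by appending a new dictionary to Kids.
pages_appended = { :Type => :Pages, :Kids => [{ :Type => :Page }] }
pages_appended[:Kids] << { :Contents => my_object }

# Case 2: change the dictionary that is already in Kids.
pages_modified = { :Type => :Pages, :Kids => [{ :Type => :Page }] }
pages_modified[:Kids].first[:MyObject] = my_object

appended_count = pages_appended[:Kids].size # two entries, so two pages
modified_count = pages_modified[:Kids].size # still a single page
```

Which key on the page dictionary is appropriate depends on what kind of object is being attached.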

Gregory Brown

Aug 4, 2009, 8:49:37 AM
to prawn...@googlegroups.com
On Tue, Aug 4, 2009 at 8:45 AM, Piotr Kopszak<kop...@gmail.com> wrote:

> P.S.
> I'd gladly take a look at
> http://prawn.lighthouseapp.com/projects/9398/tickets/122-hacking-guide
> however it does not seem publicly available.

You're doing interesting low level things, so no worries about
discussing them here.
The HACKING file only describes how to do basic things (get code from
git, etc). I need to write a better one that describes low level
features.

-greg

James Healy

Aug 4, 2009, 8:52:44 AM
to prawn...@googlegroups.com
Piotr Kopszak wrote:
> I'm getting two pages and MyObject ends up as a child of the second.
> How can I add the object to the first page? Also, I had to change obj to @obj
> to get MyObject in the output but don't quite understand why. Sorry
> for such a trivial post but I promise I'll get better.

No worries, it's nice to see someone willing to put in the work to get
up to speed on PDFs.

It might make things easier if you outline what it is you're trying to add
to the first page. Text? An image? Graphics?

-- James Healy <jimmy-at-deefa-dot-com> Tue, 04 Aug 2009 22:50:48 +1000

Piotr Kopszak

Aug 4, 2009, 9:26:36 AM
to prawn...@googlegroups.com
Well, right now as you can see I'm playing with entirely useless
objects, however maybe we should get down to more useful work. On
my way towards DeviceN PDF images I guess it might be a good idea to
start with DeviceN for graphics. Perhaps it could be done in two
stages. First a user should declare a DeviceN color and provide its
CMYK representation and then could use it in fill_color and
stroke_color. Perhaps it would be good to get a structure similar to
the attached file.
Here is my initial idea for parsing the fill_color, but I guess it's
far from a perfect solution. Have you thought about how adding other colorspaces
to fill_color and stroke_color could be handled? Testing only for String
and Array is great but somewhat limits the choice. Maybe all the rest
should use a Hash?




def process_color(*color)
  case color.size
  when 1
    color[0]
  when 2, 4
    color
  else
    raise ArgumentError, 'wrong number of arguments supplied'
  end
end

def set_fill_color
  case @fill_color
  when String
    r, g, b = hex2rgb(@fill_color)
    add_content "%.3f %.3f %.3f rg" %
      [r / 255.0, g / 255.0, b / 255.0]
  when Array
    case @fill_color.size
    when 4
      c, m, y, k = *@fill_color
      # lowercase 'k' sets the fill colour; uppercase 'K' sets the stroke colour
      add_content "%.3f %.3f %.3f %.3f k" %
        [c / 100.0, m / 100.0, y / 100.0, k / 100.0]
    when 2
      # draft: operator syntax for DeviceN still to be worked out
      device_n, tint = *@fill_color
      add_content "#{device_n}, #{tint} SC"
    end
  end
end

Best

Piotr



2009/8/4 James Healy <ji...@deefa.com>:
--
http://okle.pl
test-multitone.pdf

James Healy

Aug 4, 2009, 7:38:15 PM
to prawn...@googlegroups.com
Piotr Kopszak wrote:
> Well, right now as you can see I'm playing with entirely useless
> objects, however maybe we should get down to do more useful work. On
> my way towards DeviceN pdf images I guess it might be a good idea to
> start with DeviceN for graphics. Perhaps it could be done in two
> stages. First a user should declare a DeviceN color and provide its
> CMYK representation and then could use it in fill_color and
> stroke_color. Perhaps it would be good to get the structure similar to
> the attached file.

Assuming DeviceN graphics are possible, they're probably a good first
step. Adding graphics to a page is generally much simpler than embedding
images and other objects.

Can you explain DeviceN to me? I grok RGB and CMYK colour spaces, but
that's about the limit of my colour knowledge. A better understanding of
DeviceN will help me interpret the relevant parts of the PDF spec and
see what's possible.

To add content to a page using the add_content() method you pass a series
of arguments, followed by an operator. Using set_fill_color as an
example, when setting an RGB colour we pass "arg1 arg2 arg3 rg".

arg1 arg2 and arg3 are the RGB components of the colour and 'rg' is the
operator that uses the arguments to set the active *RGB* colour. The 'k'
operator performs a similar function for CMYK colours and takes 4
arguments.

The trick will be to work out if there are operators for alternative
colour spaces.

-- James Healy <jimmy-at-deefa-dot-com> Wed, 05 Aug 2009 09:16:23 +1000
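
The "arguments, then operator" pattern described above can be sketched as two small string builders. These are hypothetical helpers, not part of Prawn: `rgb_fill_operator` and `cmyk_fill_operator` only produce the text that would be handed to add_content. They assume a six-digit hex string for RGB and 0..100 percentages for CMYK, as in the set_fill_color code earlier in the thread.

```ruby
# Hypothetical helpers that only build content-stream strings.

# "rrggbb" hex string -> "r g b rg" with components scaled to 0.0..1.0
def rgb_fill_operator(hex)
  r, g, b = hex.scan(/../).map { |pair| pair.to_i(16) }
  format("%.3f %.3f %.3f rg", r / 255.0, g / 255.0, b / 255.0)
end

# CMYK percentages (0..100) -> "c m y k k" with components scaled to 0.0..1.0
def cmyk_fill_operator(c, m, y, k)
  format("%.3f %.3f %.3f %.3f k", c / 100.0, m / 100.0, y / 100.0, k / 100.0)
end

red_rgb  = rgb_fill_operator("ff0000")        # "1.000 0.000 0.000 rg"
red_cmyk = cmyk_fill_operator(0, 100, 100, 0) # "0.000 1.000 1.000 0.000 k"
```

The uppercase forms RG and K take the same arguments but set the stroke colour instead of the fill colour.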

Piotr Kopszak

Aug 5, 2009, 5:50:43 AM
to prawn...@googlegroups.com
Actually I found the section on DeviceN in the spec a bit unclear but
fortunately it is not that complicated. In the simplest scenario the
reference chain goes like this:
Page->Resources->ColorSpace->Color, and /Color is then used in the graphics
context with the DeviceN operators SCN and scn. The trick is that it also has
a definition of an alternate ColorSpace, e.g.
---
27 0 obj [/Separation /PANTONE#201375#20C 23 0 R 28 0 R]
endobj
---
23 0 obj [/CalRGB
<<
/Matrix [0.576675 0.297348 0.0270386 0.185562 0.627365 0.0706787
0.188232 0.0753021 0.991333]
/Gamma [2.19922 2.19922 2.19922]
/BlackPoint [0.0 0.0 0.0]
/WhitePoint [0.950455 1.0 1.08905]
>>]
endobj
---
28 0 obj
<<
/FunctionType 4
/Length 102
/Range [0.0 1.0 0.0 1.0 0.0 1.0]
/Domain [0.0 1.0]
>>
stream
{dup dup -0.894118 mul 1.0 add 3 1 roll -0.403922 mul 1.0 add 3 1 roll
-0.078431 mul 1.0 add 3 1 roll}
endstream
endobj
---
So here it has representations in two colorspaces, /CalRGB and
/DeviceCMYK, so you can see color approximations on screen and print it
on a CMYK printer, and you will get results as close as possible
to the spot color which you defined this way. We can forget about
CalRGB. I think it would already be useful if it contained just a CMYK
approximation. It doesn't have to be a FunctionType 4, it could be
FunctionType 2, but in any case before Prawn becomes another Photoshop
it would already be useful with a very simple approximation of the sort

<<
/FunctionType 4
/Range [0.0 1.0 0.0 1.0 0.0 1.0 0.0 1.0]
/Length 25
/Domain [0.0 1.0 0.0 1.0]
>>
stream
{pop pop CYAN MAGENTA YELLOW BLACK}
endstream

where CYAN MAGENTA YELLOW BLACK are CMYK values provided by the user.
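
A minimal sketch of how such a constant function could be assembled from user-supplied values follows. `constant_cmyk_function` is a hypothetical helper, not an existing Prawn method; it returns a plain hash holding the dictionary entries and the stream body, and it assumes CMYK values already scaled to 0.0..1.0.

```ruby
# Hypothetical helper: builds the constant tint transform described above.
# +inputs+ is the number of input values to discard (one "pop" each).
def constant_cmyk_function(inputs, c, m, y, k)
  body = "{" + ("pop " * inputs) + [c, m, y, k].join(" ") + "}"
  {
    :dict => {
      :FunctionType => 4,
      :Domain => [0.0, 1.0] * inputs, # one 0..1 pair per input
      :Range  => [0.0, 1.0] * 4,      # one 0..1 pair per CMYK output
      :Length => body.bytesize
    },
    :stream => body
  }
end

fn = constant_cmyk_function(2, 0.0, 0.3, 1.0, 0.0)
fn[:stream]        # "{pop pop 0.0 0.3 1.0 0.0}"
fn[:dict][:Length] # 25
```

Because the function discards its inputs, every tint maps to the same CMYK value, which fits the "very simple approximation" described above.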

I'll try to prepare a complete minimal example later. To sum up, the
whole point is not really to provide exact approximations but to give
a user the possibility of using colors in his or her PDF document which can
be used in, say, a two-color printing process without any conversion.

Re: color operators, I just took a look at table 74 in the most
recent PDF 32000-1:2008 spec and there are fewer operators than
colorspaces, but they accept a different number of operands for each
colorspace, so in effect there is an operator for each colorspace. So the
string for a DeviceN color can look like

/PantoneSixZeroSeven cs /PantoneSixZeroSeven CS 1 SCN 0.4 scn

Here SCN and scn take only tint as argument (which is reasonable I
think as that's the only parameter you can change when printing with a
single color).

Piotr
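
The operator string above can be produced by two small builders, one for fill and one for stroke. These are hypothetical helpers, not Prawn methods; `name` is assumed to be the key under which the colour space is registered in the page's ColorSpace resources, and `tint` a value between 0 and 1.

```ruby
# Hypothetical helpers that only build content-stream strings.

# Select the named colour space for filling and set its tint.
def separation_fill(name, tint)
  format("/%s cs %.3f scn", name, tint)
end

# Select the named colour space for stroking and set its tint.
def separation_stroke(name, tint)
  format("/%s CS %.3f SCN", name, tint)
end

fill   = separation_fill("PantoneSixZeroSeven", 0.4) # "/PantoneSixZeroSeven cs 0.400 scn"
stroke = separation_stroke("PantoneSixZeroSeven", 1) # "/PantoneSixZeroSeven CS 1.000 SCN"
```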

2009/8/5 James Healy <ji...@deefa.com>:
--
http://okle.pl

Piotr Kopszak

Aug 5, 2009, 7:14:31 AM
to prawn...@googlegroups.com
I just noticed I skipped two crucial objects.

2009/8/5 Piotr Kopszak <kop...@gmail.com>:
> Actually I found the section on DeviceN in the spec a bit unclear but
> fortunately it is not that complicated. In the simplest scenario the
> reference chain goes like this:
> Page->Resources->ColorSpace->Color, and /Color is then used in the graphics
> context with the DeviceN operators SCN and scn. The trick is that it also has
> a definition of an alternate ColorSpace, e.g.

21 0 obj [/DeviceN [/PANTONE#201375#20C /PANTONE#202767#20C /None
/None /None] 23 0 R 24 0 R
<<
/Colorants 25 0 R
>>]
endobj


25 0 obj
<<
/PANTONE#202767#20C 26 0 R
/PANTONE#201375#20C 27 0 R
>>
endobj
--
http://okle.pl
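
The colorant names in these objects, such as /PANTONE#201375#20C, are ordinary strings with the spaces written as #20. A minimal sketch of that escaping is given below; `pdf_name` is a hypothetical helper, not Prawn's own serializer, and it escapes bytes outside printable ASCII, the '#' character and the PDF delimiter characters.

```ruby
# Hypothetical helper: write a Ruby string as a PDF name, using #XX escapes
# for bytes outside printable ASCII, for '#', and for PDF delimiters.
def pdf_name(str)
  escaped = str.each_byte.map do |byte|
    char = byte.chr
    if byte < 0x21 || byte > 0x7e || "#()<>[]{}/%".include?(char)
      format("#%02X", byte)
    else
      char
    end
  end
  "/" + escaped.join
end

pdf_name("PANTONE 1375 C") # "/PANTONE#201375#20C"
pdf_name("None")           # "/None"
```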

James Healy

Aug 5, 2009, 8:21:47 AM
to prawn...@googlegroups.com
Ahh.

So, DeviceN is a technique that allows a colour to be specified in
multiple colour spaces?

Assuming a colour is defined in RGB and CMYK: does that mean that when an RGB
device opens the PDF it grabs the RGB colour, and when a CMYK device
opens it, it grabs the CMYK colour? This would avoid the need for an
approximate conversion?

-- James Healy <jimmy-at-deefa-dot-com> Wed, 05 Aug 2009 22:12:11 +1000

Piotr Kopszak

Aug 5, 2009, 9:19:01 AM
to prawn...@googlegroups.com
Yes, that's how I understand that. But I do not pretend to be any
expert in PDF, I only started to learn about these things. This is the
optimal situation, but obviously it requires knowing both CMYK and RGB
values of a color. PDF spec also has an example how to use ICCBased
colorspace. Well, plenty of cool options there. But I don't know which
PDF readers support ICCBased colorspace already.
--
http://okle.pl

James Healy

Aug 5, 2009, 10:25:15 AM
to prawn...@googlegroups.com
Piotr Kopszak wrote:
> Yes, that's how I understand that. But I do not pretend to be any
> expert in PDF, I only started to learn about these things. This is the
> optimal situation, but obviously it requires knowing both CMYK and RGB
> values of a color. PDF spec also has an example how to use ICCBased
> colorspace. Well, plenty of cool options there. But I don't know which
> PDF readers support ICCBased colorspace already.

It *sounds* like a feature that would make prawn useful to a new segment
of people (people in printing, etc), so thanks for investigating it.

It sounds like you're close to a working example.

I've covered how to add args and operators to the content stream for
the current page.

How are you going with adding objects to the page resources? Something
like the following is probably what you want:

page_resources[:ColorSpace] = {:MyCS => ref(...)}

page_resources() is a helper method defined in
lib/prawn/document/internals.rb

You could also emulate the page_fonts() or page_xobjects() helpers like
so:

def page_colorspaces
  page_resources[:ColorSpace] ||= {}
end

Then add your new colour space with something like:

page_colorspaces[:MyCS] = ref(...)

-- James Healy <jimmy-at-deefa-dot-com> Thu, 06 Aug 2009 00:06:33 +1000
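
The `||=` idiom in page_colorspaces can be tried without Prawn. In the sketch below `FakeDocument` is a stand-in class invented for the example, with a plain hash in place of the real page resources, and the stored values are placeholder symbols rather than real references.

```ruby
# Stand-in class so the ||= idiom can be run without Prawn.
class FakeDocument
  def initialize
    @resources = {}
  end

  # In Prawn this helper already exists; here it just returns a hash.
  def page_resources
    @resources
  end

  # Create the :ColorSpace sub-hash on first use, reuse it afterwards.
  def page_colorspaces
    page_resources[:ColorSpace] ||= {}
  end
end

doc = FakeDocument.new
doc.page_colorspaces[:MyCS]    = :first_placeholder_ref
doc.page_colorspaces[:OtherCS] = :second_placeholder_ref

registered = doc.page_resources[:ColorSpace].keys # [:MyCS, :OtherCS]
```

The second call does not replace the hash created by the first, so both entries end up under the same :ColorSpace key.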

Piotr Kopszak

Aug 6, 2009, 4:52:54 PM
to prawn...@googlegroups.com
One correction, I knew I would mess up something. I talked about
DeviceN but the example of color operators use I gave could be applied
only in a special case of DeviceN colorspace when only one colorant is
used which is exactly the same situation as with Separation
colorspace. I think I finally got it right but please shout loud if
I'm talking rubbish again.
In the Separation colorspace there is only one colorant, as it contains
exact information which can be used without any further
transformations to obtain a plate for one-color printing. The only
parameter you can change in such a case is the tint, as you can only have
more or less ink of the same color on the plate and then on paper.
That is why the SCN color operator takes only one parameter in such a case.
A CMYK or any other approximation (but not a special colorspace like
DeviceN, Pattern, Indexed or Separation) can be used with Separation just
as with DeviceN.
The point of DeviceN is that you can use more absolutely arbitrary
colorants, actually any number "limited only by implementation", but
realistically if you think about printing probably also by the number
of print runs the paper can stand. And here also the approximations
work, rather more sophisticated ones, as you give to SCN operator as
many parameters as there are colorants in the colorspace.
I would be very grateful if someone could elucidate how such
approximations are implemented in PostScript as this still escapes me.

Piotr
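
The rule that scn takes as many values as there are colorants can be sketched as a builder with an arity check. `device_n_fill` is a hypothetical helper, not a Prawn method; it assumes the tints are given in the same order as the colorants are listed in the DeviceN array.

```ruby
# Hypothetical helper: one tint per colorant, in colorant order.
def device_n_fill(name, colorants, tints)
  unless tints.size == colorants.size
    raise ArgumentError,
          "expected #{colorants.size} tints, got #{tints.size}"
  end
  components = tints.map { |t| format("%.3f", t) }.join(" ")
  "/#{name} cs #{components} scn"
end

duotone = device_n_fill("Duotone",
                        ["PANTONE 1375 C", "PANTONE 2767 C"],
                        [0.8, 0.2])
# duotone is "/Duotone cs 0.800 0.200 scn"
```

With a single colorant the result has the same shape as the Separation case: one tint followed by scn.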


2009/8/5 Piotr Kopszak <kop...@gmail.com>:
--
http://okle.pl

James Healy

Aug 7, 2009, 12:09:20 AM
to prawn...@googlegroups.com
Wow, that all went *way* over my head. I'd love to read up on this
colour stuff some more, but I'm swamped with work at the moment.

I'm happy to keep answering any questions you have on prawn or PDF file
structure though!

-- James Healy <jimmy-at-deefa-dot-com> Fri, 07 Aug 2009 14:07:40 +1000

Piotr Kopszak

Aug 7, 2009, 3:40:55 AM
to prawn...@googlegroups.com
No worries. I am off for holidays. Will be back in September. I hope
the colour magic will be dispelled sooner or later as so many
high-quality colour devices are already available now including six
colour inkjets you buy for home use. Yes, that's something that
DeviceN could make use of! The problem with colours is that they are
one of these things people react to but rarely think about. Even my
fellow Art Historians often prefer to stay away from colour theory and
indulge in all sorts of crazy ideas.

Piotr


2009/8/7 James Healy <ji...@deefa.com>:
--
http://okle.pl