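#!/usr/bin/env ruby
require 'rubygems'
require 'pdf/writer'
# Start from WinAnsiEncoding and remap byte 219 (0xDB) to the Euro glyph
# through a differences table.
encoding = {
  :encoding    => "WinAnsiEncoding",
  :differences => {
    219 => "Euro"
  }
}
pdf = PDF::Writer.new(:paper => "A4")
pdf.select_font("Helvetica", encoding)
# "\xDB" is byte 219, which the table above now maps to the Euro glyph.
pdf.text("\xDB Euro!")
File.open("euro.pdf", "wb") { |f| f.write pdf.render }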
--
The PDF spec defines a few encodings that all readers should support;
WinAnsiEncoding is one of them. As far as I know, none of these
encodings supports the Euro symbol by default. However, there is a
mechanism to override any byte in the default encoding with a custom
glyph, using a 'differences table'.
In this case, I've overwritten byte 219 (0xDB) with the Euro symbol.
It's pretty hacky, but it works. Note that I chose 219 arbitrarily, and
I'm not sure what I just replaced - you might want to do a bit more
research on the best byte to use.
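For anyone wondering what byte 219 held before the override, here is a
quick sketch. It assumes a Ruby with String#encode (1.9 or later, so
newer than what this thread was written against) and assumes that
WinAnsiEncoding agrees with Windows-1252 at this position, which as far
as I can tell it does. Nothing here touches PDF::Writer; it only decodes
the byte.

```ruby
# Assumption: WinAnsiEncoding matches Windows-1252 for this byte.
byte = 219

# Integer#chr gives a one-byte binary string for values above 127;
# relabel it as Windows-1252 and transcode to UTF-8 to see the character.
original = byte.chr.force_encoding("Windows-1252").encode("UTF-8")

puts format("byte %d (0x%02X) normally maps to %s", byte, byte, original)
```

On that assumption the displaced character is U+00DB (capital U with a
circumflex), so the override only matters if the document needs that
letter in the same font.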
For more info, see "Fonts, Special Characters, and Character Maps in
PDF::Writer" in the PDF::Writer manual[1], and section 5.5.6 of the PDF
spec [2].
-- James Healy <jimmy-at-deefa-dot-com> Tue, 05 Feb 2008 21:47:50 +1100
[1] http://ruby-pdf.rubyforge.org/pdf-writer/manual/manual.pdf
[2] http://www.adobe.com/devnet/acrobat/pdfs/pdf_reference_1-7.pdf
Whoops, my bad. The standard encodings do have support for the Euro
glyph; you just have to know where to look (Appendix D of the PDF spec,
specifically table D.1).
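Table D.1 gives character codes in octal, and its WinAnsiEncoding
column lists the Euro at 200. A small sketch of what that number means
in Ruby terms follows. It again assumes Ruby 1.9+ and leans on
Windows-1252 as a stand-in for WinAnsiEncoding, which is an
approximation rather than something the spec promises byte for byte.

```ruby
# Octal 200 from table D.1, written as a Ruby octal literal.
code = 0200

# "\200" is the same value as an octal string escape: a single byte.
raw = "\200".dup.force_encoding("Windows-1252")

# Decode that byte via Windows-1252 (assumed to match WinAnsiEncoding here).
euro = raw.encode("UTF-8")

puts format("octal %o = decimal %d = hex 0x%02X", code, code, code)
puts euro
```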
The takeaway fact is that internal PDF string encodings are *not* the
common ones we use in the real world (UTF-8, ISO-8859-1, etc), so
passing strings in these encodings into a PDF::Writer object will have
undefined results. Ideally the translation would happen transparently
inside PDF::Writer, but it doesn't at this stage.
Of course, in the vast majority of encodings (including the internal PDF
ones) the first 128 code points (0-127) correspond to ASCII, which
explains why those characters work just fine.
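Until that translation happens inside the library, one workaround is to
transcode strings yourself before handing them over. The sketch below
is only that: to_winansi is a hypothetical helper, not part of
PDF::Writer; it assumes Ruby 1.9+ and treats Windows-1252 as close
enough to WinAnsiEncoding. Characters with no Windows-1252 equivalent
are replaced with "?" rather than raising.

```ruby
# Hypothetical helper (not a PDF::Writer API): convert UTF-8 text into
# Windows-1252 bytes, used here as an approximation of WinAnsiEncoding.
def to_winansi(utf8_text)
  utf8_text.encode("Windows-1252",
                   :invalid => :replace,
                   :undef   => :replace,
                   :replace => "?")
end

converted = to_winansi("\u20AC Euro!")

# The Euro sign becomes one byte; the ASCII characters keep their values.
puts converted.bytes.inspect
```

The result would then go to pdf.text after selecting a font with
"WinAnsiEncoding", as in the examples in this thread. I haven't run
that end to end against PDF::Writer.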
-- James Healy <jimmy-at-deefa-dot-com> Tue, 05 Feb 2008 22:29:58 +1100
--
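#!/usr/bin/env ruby
require 'rubygems'
require 'pdf/writer'
pdf = PDF::Writer.new(:paper => "A4")
pdf.select_font("Helvetica", "WinAnsiEncoding")
# Octal \200 is byte 128 (0x80), the Euro sign in WinAnsiEncoding.
pdf.text("\200 Euro!")
# Write the PDF to disk, as in the earlier example.
File.open("euro.pdf", "wb") { |f| f.write pdf.render }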
Thanks for investigating this, James.
> The takeaway fact is that internal PDF string encodings are *not* the
> common ones we use in the real world (UTF-8, ISO-8859-1, etc), so
> passing strings in these encodings into a PDF::Writer object will have
> undefined results. Ideally the translation would happen transparently
> inside PDF::Writer, but it doesn't at this stage.
Volunteers most welcome :)
-greg