Patch for Unicode/TTF support


Grant McLean

Jun 11, 2008, 5:49:07 AM
to lars...@cpan.org, PDF-...@googlegroups.com
Hi Lars

You may recall that back in September I was asking about Unicode
support in PDF::Reuse. Well it turns out I really needed it so I've
done some research and put together the attached patch which works
sufficiently well at least for my needs.

It seems that the only way to access characters outside the
MacRoman/WinAnsi encodings supported by PDF's built-in fonts is to embed
a TrueType font. (I'm happy to be corrected if this conclusion is
wrong).
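
To make the limitation concrete, here is a minimal sketch that tests whether a string fits in a single-byte Latin encoding. It assumes Perl's cp1252 encoding is a close enough stand-in for PDF's WinAnsiEncoding; the helper name fits_winansi is made up for illustration and is not part of PDF::Reuse.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Returns 1 if every character of $text can be encoded as cp1252
# (used here as an approximation of WinAnsiEncoding), 0 otherwise.
sub fits_winansi {
    my ($text) = @_;
    my $copy = $text;    # encode() with a CHECK value may modify its argument
    return eval { encode('cp1252', $copy, Encode::FB_CROAK()); 1 } ? 1 : 0;
}

binmode(STDOUT, ':encoding(UTF-8)');
for my $sample ("caf\x{e9}", "T\x{113}n\x{101} koutou") {
    printf "%s: %s\n", $sample,
        fits_winansi($sample) ? 'built-in font is enough' : 'needs an embedded font';
}
```

The macron vowels in the second string have no cp1252 code point, which is the case that forces an embedded TrueType font.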

I've implemented font embedding by grafting Font::TTF and
Text::PDF::TTFont0 onto the PDF::Reuse API. I'm not suggesting that
these extra modules be added as hard dependencies for PDF::Reuse - just
saying that PDF::Reuse can have the added TrueType font support if those
modules are installed.
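
One common way to treat modules as optional is to try loading them at run time and fall back if that fails. The sketch below shows the idea only; it is not necessarily how the patch does it, and have_modules is a hypothetical helper name.

```perl
use strict;
use warnings;

# Returns 1 if every named module can be loaded, 0 otherwise.
sub have_modules {
    my @modules = @_;
    for my $module (@modules) {
        # Turn Some::Module into Some/Module.pm so require accepts it
        (my $file = "$module.pm") =~ s{::}{/}g;
        return 0 unless eval { require $file; 1 };
    }
    return 1;
}

my $ttf_available = have_modules('Font::TTF::Font', 'Text::PDF::TTFont0');
print $ttf_available
    ? "TrueType support available\n"
    : "TrueType support not available; built-in fonts only\n";
```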

With the patch applied you can embed a TrueType font using the new
prTTFont function and supplying it with the full path of the TTF file.
E.g.:

prTTFont('/usr/share/fonts/truetype/msttcorefonts/Comic_Sans_MS.ttf');

And then you would simply pass a UTF8 string to prText. E.g.:

prText("T\x{113}n\x{101} koutou"); # Tēnā Koutou
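
Putting those two calls into a complete script might look like the sketch below. It assumes a PDF::Reuse with this patch applied, that the font file exists at the path shown, and the usual prText($x, $y, $string) argument order; the output file name and coordinates are arbitrary, and write_greeting_pdf is a hypothetical wrapper.

```perl
use strict;
use warnings;

my $font_path = '/usr/share/fonts/truetype/msttcorefonts/Comic_Sans_MS.ttf';
my $greeting  = "T\x{113}n\x{101} koutou";

# Writes $text into a one-page PDF using the TrueType font at $ttf_file.
# Returns 0 without doing anything if the font file or PDF::Reuse is missing.
sub write_greeting_pdf {
    my ($pdf_file, $ttf_file, $text) = @_;
    return 0 unless -e $ttf_file;
    return 0 unless eval { require PDF::Reuse; PDF::Reuse->import; 1 };

    prFile($pdf_file);        # start the output document
    prTTFont($ttf_file);      # embed the font and make it current (patched PDF::Reuse)
    prText(50, 750, $text);   # x, y in points, then the string
    prEnd();                  # finish and close the file
    return 1;
}

print write_greeting_pdf('greeting.pdf', $font_path, $greeting)
    ? "wrote greeting.pdf\n"
    : "skipped: font file or PDF::Reuse not available\n";
```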

As an added bonus, the Font::TTF module provides access to font metrics
tables in the .ttf files so the prStrWidth function is able to
accurately compute widths for text strings rendered with TrueType
fonts.
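
A typical use for an accurate width is centring or right-aligning a line. The helper below is a small sketch of the arithmetic only; centred_x is a made-up name, the page width of 595 points assumes A4, and the commented lines assume the documented prStrWidth($string, $font, $fontSize) signature.

```perl
use strict;
use warnings;

# Left edge (in points) at which text of the given width sits centred
# on a page of the given width.
sub centred_x {
    my ($page_width, $text_width) = @_;
    return ($page_width - $text_width) / 2;
}

# With PDF::Reuse loaded and a font selected, the width would come from
# prStrWidth, roughly like this:
#   my $width = prStrWidth($string, $font, $font_size);
#   prText(centred_x(595, $width), 700, $string);

printf "x = %.1f\n", centred_x(595, 95);
```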

The patch also moves prStrWidth out of the autoloaded section (the code
after __END__). That's not an essential change - it just made things
easier for my debugging.

prTTFont called in a list context will return the same values as prFont
- including the undocumented 5th value (a reference to
%PDF::Reuse::Font). I've found this extra value to be useful in my
scripts when I need to determine whether the 'current' font is a
TrueType font (i.e. whether UTF8 text is OK). Perhaps that implies the
extra return value should be documented or perhaps it implies I need to
add a function for answering that question.
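
As a rough idea of what such a function could look like without relying on the undocumented return value, a script can simply record which kind of font it selected last. Everything here (select_font, current_font_is_ttf, the 'core'/'ttf' labels) is hypothetical and not part of PDF::Reuse.

```perl
use strict;
use warnings;

my %font_kind;       # font name or path => 'core' or 'ttf'
my $current_font;    # whatever was selected most recently

# Record the selection and, if a setter is supplied (e.g. \&prFont or
# \&prTTFont), call it to actually change the font.
sub select_font {
    my ($name, $kind, $setter) = @_;
    $setter->($name) if $setter;
    $font_kind{$name} = $kind;
    $current_font = $name;
    return;
}

# True if the most recently selected font was recorded as TrueType,
# i.e. UTF8 text is OK.
sub current_font_is_ttf {
    return (defined($current_font) && $font_kind{$current_font} eq 'ttf') ? 1 : 0;
}

select_font('Helvetica', 'core');
print "UTF8 OK after Helvetica? ", current_font_is_ttf() ? "yes" : "no", "\n";
```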

I've attached both the patch file and a tar.gz of the whole distribution
with the patch applied.

Hopefully you'll find the patch suitable for inclusion in a future
release. Let me know if you have any concerns or suggestions for
improvement.

Regards
Grant

ttf-unicode.patch
PDF-Reuse-ttf-unicode.tar.gz

Nic Gibson

Jun 11, 2008, 6:06:51 AM
to PDF-...@googlegroups.com, lars...@cpan.org
On 11 Jun 2008, at 10:49, Grant McLean wrote:

> Hi Lars
>
> You may recall that back in September I was asking about Unicode
> support in PDF::Reuse. Well it turns out I really needed it so I've
> done some research and put together the attached patch which works
> sufficiently well at least for my needs.
>
> It seems that the only way to access characters outside the
> MacRoman/WinAnsi encodings supported by PDF's built-in fonts is to
> embed a TrueType font. (I'm happy to be corrected if this conclusion
> is wrong).

That would be right - I've been struggling with the same issue recently
in a different context (ebooks for Adobe Digital Editions - it uses the
same base font set). Someone from Adobe referred me to this PDF which
lists characters available in the built-in fonts:

cheers

nic

Nic Gibson

Jun 11, 2008, 6:17:50 AM
to PDF-...@googlegroups.com, lars...@cpan.org
Um. Actually giving the URL would be good. Sorry. The important text would be in Appendix D, Table D.1



nic

Daniele Lippi

Jun 17, 2008, 12:57:33 PM
to PDF-Reuse
Hi Grant, I'm using your prTTFont update. It seems to work very well!
Thank you!

Daniele

Lars Lundberg

Jun 26, 2008, 12:38:25 PM
to PDF-...@googlegroups.com
Hello,

It looks very good, excellent!
Next week I will upload PDF::Reuse with the patch and perhaps also some
other minor additions.

Regards
Lars Lundberg


Grant McLean wrote:
