Demo for unicode text?


Craig

Apr 29, 2011, 2:59:01 PM
to libHaru
Hi, I am new to libHaru. I came across it recently, downloaded it,
and built several of the demos. Now I see that there is support for
UNICODE - Bravo!! I have downloaded the kdeforche-libharu-1a705bc
build - is this the correct one for unicode?

Since I am getting my feet wet, I wonder if there is a sample program
which demonstrates the unicode support? Or, do I need to simply
modify something like the ttfont_demo_jp demo and use
HPDF_UseUTFEncodings() in place of HPDF_UseJPEncodings() and then pass
in UTF-8 encoded text via HPDF_Page_ShowText()?

Thanks for any help. And again, thanks for providing this!

Craig

May 3, 2011, 2:10:23 PM
to libHaru
More...

Using the unicode version of libHaru (2.3.0-dev) and the
ttfont_demo_jp.c source, I generated a pdf file. Here are the
pertinent modifications:
HPDF_UseUTFEncodings( pdf );
detail_font_name = HPDF_LoadTTFontFromFile2 (pdf, "C:\\Windows\\Fonts\\msgothic.ttc", 2, HPDF_TRUE);
detail_font = HPDF_GetFont (pdf, detail_font_name, "UTF-8");

/* draw Japanese (Katakana) text: カテゴリ (0x30AB, 0x30C6, 0x30B4, 0x30EA) */
HPDF_Page_MoveTextPos (page, 0, -48);
HPDF_Page_SetFontAndSize (page, detail_font, 48);
/* UTF-8 byte stream: E382AB E38386 E382B4 E383AA */
HPDF_Page_ShowText (page, "カテゴリ");


The resulting PDF uses two fonts: /F1 (Helvetica) to display the font
name and encoding, "MS-UIGothic (UTF-8)", and /F2 (MS UI Gothic) to
display the text. The text in the content stream seems to be the
correct UTF-8 byte stream. However, the PDF displays boxes for the text.

/F1 10 Tf
BT
10 190 Td
(MS-UIGothic) Tj
( \050) Tj
(UTF-8) Tj
(\051) Tj
0 -48 Td

/F2 48 Tf
<E382ABE38386E382B4E383AA32> Tj

The byte stream has 3 bytes per Japanese character. Does this mean
that libHaru does not support this encoding? How could it be modified
to do so? Or am I missing something else that I need to do?

Thanks for any help.

Koen Deforche

May 3, 2011, 3:14:15 PM
to lib...@googlegroups.com
Hey Craig,

It seems your code is right, but you are using the 2.3.0 branch, which
only supports 2-byte UTF-8 sequences. This is what prompted me to
look at how to add more complete Unicode support (up to 3-byte UTF-8
sequences, that is, the 0-0xFFFF Unicode range).

See this thread:
http://groups.google.com/group/libharu/browse_thread/thread/66c4e645352c243f?pli=1

Your code example should work unmodified against that libharu version
(git head) and produce a correct PDF.

Regards,
koen
