UTF-8 support

Clayman

Sep 11, 2010, 5:46:58 PM
to libHaru
Hi!

If you need UTF-8 support in libHaru, you can take
`hpdf_encoder_utf.c' from the Files section.
Add it to the makefiles, add the prototype

HPDF_EXPORT(HPDF_STATUS)
HPDF_UseUTFEncodings (HPDF_Doc pdf);

to `hpdf.h' and rebuild the library.
After that, this code should work:

HPDF_UseUTFEncodings(pdf);
HPDF_SetCurrentEncoder(pdf, "UTF-8");
...

My implementation supports only 1- and 2-byte UTF-8 sequences (code
points up to U+07FF), which should be enough in most cases.
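
To make the 1- and 2-byte limit concrete, here is a minimal sketch of a check that a string stays within that range before it is handed to the encoder. The helper name `utf8_fits_two_bytes` is hypothetical and not part of libHaru; it only inspects the bytes and assumes nothing about how `hpdf_encoder_utf.c` is implemented internally.

```c
#include <stddef.h>

/* Hypothetical helper, not part of libHaru.
 * Returns 1 if every character in the NUL-terminated string s is a
 * well-formed 1- or 2-byte UTF-8 sequence (U+0000..U+07FF), else 0. */
int utf8_fits_two_bytes(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p != 0) {
        if (*p < 0x80) {
            p += 1;                       /* 0xxxxxxx: plain ASCII */
        } else if (*p >= 0xC2 && *p <= 0xDF) {
            if ((p[1] & 0xC0) != 0x80)    /* next byte must be 10xxxxxx */
                return 0;
            p += 2;                       /* 110xxxxx 10xxxxxx */
        } else {
            /* 3- or 4-byte lead byte, stray continuation byte,
             * or an overlong 2-byte lead (0xC0, 0xC1) */
            return 0;
        }
    }
    return 1;
}
```

Latin letters with diacritics such as å ä ö and Cyrillic letters are 2-byte sequences and pass this check; a character such as the euro sign (U+20AC, three bytes) does not.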

johsve339

Oct 12, 2010, 9:51:35 AM
to libHaru
Hi

I have added hpdf_encoder_utf.c to my Xcode project and modified the
hpdf.h file to include the prototype.
It all compiles fine, but I still can't get å ä ö to display, which
should work in UTF-8.

I'm using version 2.0.8 of libharu.

Clayman

Oct 12, 2010, 4:41:18 PM
to libHaru
That's strange.
I've succeeded in displaying Cyrillic symbols such as "й ц ш ж ф ы"
with this method.
I use the TrueType font "FreeSans" and embed it in the PDF document.
Microsoft's "Comic Sans" also works fine.

This is how I load the font:

const char *fontname;
HPDF_Font font;

HPDF_UseUTFEncodings(pdf);
fontname = HPDF_LoadTTFontFromFile(pdf, "FreeSans.ttf", HPDF_TRUE);
font = HPDF_GetFont(pdf, fontname, "UTF-8");

After that, the functions "HPDF_Page_TextOut" and "HPDF_Page_TextWidth"
work just fine.
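
Put together, the calls above might look like the following sketch. It assumes a libHaru build that already contains HPDF_UseUTFEncodings, a FreeSans.ttf in the working directory, and an output file name (`utf8-sketch.pdf`) chosen purely for illustration. The error handler and the BeginText/EndText pair are ordinary libHaru usage rather than something shown in this thread.

```c
#include <stdio.h>
#include <hpdf.h>

/* libHaru calls this on any error; this sketch only reports it. */
static void on_error(HPDF_STATUS error_no, HPDF_STATUS detail_no, void *user_data)
{
    (void)user_data;
    fprintf(stderr, "libHaru error 0x%04lX, detail %lu\n",
            (unsigned long)error_no, (unsigned long)detail_no);
}

int main(void)
{
    HPDF_Doc pdf = HPDF_New(on_error, NULL);
    if (pdf == NULL)
        return 1;

    /* Register the UTF-8 encoder, then load and embed the TrueType font. */
    HPDF_UseUTFEncodings(pdf);
    const char *fontname = HPDF_LoadTTFontFromFile(pdf, "FreeSans.ttf", HPDF_TRUE);
    HPDF_Font font = fontname ? HPDF_GetFont(pdf, fontname, "UTF-8") : NULL;
    if (font == NULL) {
        HPDF_Free(pdf);
        return 1;
    }

    HPDF_Page page = HPDF_AddPage(pdf);
    HPDF_Page_SetFontAndSize(page, font, 32.0f);

    /* Text operators must sit between BeginText and EndText. */
    HPDF_Page_BeginText(page);
    /* "åäö" written as explicit UTF-8 bytes, so the result does not
     * depend on the encoding of this source file. */
    HPDF_Page_TextOut(page, 50, 50, "\xC3\xA5\xC3\xA4\xC3\xB6");
    HPDF_Page_EndText(page);

    HPDF_STATUS status = HPDF_SaveToFile(pdf, "utf8-sketch.pdf");
    HPDF_Free(pdf);
    return status == HPDF_OK ? 0 : 1;
}
```

On Linux this would typically be built with something like `cc utf8-sketch.c -lhpdf`, assuming the library is installed under its usual name.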

johsve339

Oct 26, 2010, 9:23:17 AM
to libHaru
Hi

I'm on the iPhone, in case that has something to do with it.

Here is a sample of my code:

NSString *path = [[NSBundle mainBundle] pathForResource:@"Comic Sans MS" ofType:@"ttf"];
const char *fontname = HPDF_LoadTTFontFromFile(pdf, [path UTF8String], HPDF_TRUE);
HPDF_Font font = HPDF_GetFont(pdf, fontname, "UTF-8");
HPDF_Page_SetFontAndSize(pageAsset, font, 32.0);
HPDF_Page_TextOut(pageAsset, 50, 50, [@"åäö" UTF8String]);

I get the font right, but it doesn't seem to use it.

Francesco

Oct 26, 2010, 12:03:08 PM
to libHaru
Will this ever be integrated into libharu?

johsve339

Oct 27, 2010, 4:13:13 PM
to libHaru
It works now; there was a problem with the font on the iPhone.

Antony Dovgal

Oct 28, 2010, 7:55:39 AM
to lib...@googlegroups.com
On 10/26/2010 08:03 PM, Francesco wrote:
> Will this ever be integrated into libharu?

Done.

Committed the patch along with the 3dMeasures patch by Robert Würfel.

--
Wbr,
Antony Dovgal
---
http://pinba.org - realtime statistics for PHP

Antony Dovgal

Oct 28, 2010, 7:56:17 AM
to lib...@googlegroups.com

Thanks a lot for the code, I've committed the patch several minutes ago.

Ramon Ribó

Oct 28, 2010, 12:53:20 PM
to libHaru

Hello,

I am trying to compile a test program that works with the new UTF-8
encoding, but I have failed miserably.

Could you kindly post a complete C example, one that can be compiled on
Linux, that prints non-ASCII letters with the UTF-8 encoding?

Thanks in advance,

Li Guang

Nov 9, 2010, 3:37:09 AM
to lib...@googlegroups.com
Hi,
 
Do you know if it is possible to support 3-byte UTF-8 characters?
 
Thanks,
Robert

2010/10/29 Ramon Ribó <rams...@gmail.com>
--
---
libHaru.org development mailing list
To unsubscribe, send email to libharu-u...@googlegroups.com

Antony Dovgal

Jan 19, 2011, 9:20:26 AM
to lib...@googlegroups.com
Sergey, can you take a look at the Unicode patch in this message plz?
http://groups.google.com/group/libharu/msg/58d8fc366e94a29d?hl=en_US

On 09/12/2010 01:46 AM, Clayman wrote:

Guy Deleeuw

Jan 29, 2011, 6:28:49 AM
to lib...@googlegroups.com
Hello
Is this patch available in the git repository?

Regards
Guy

Antony Dovgal

Jan 29, 2011, 8:19:24 AM
to lib...@googlegroups.com
On 01/29/2011 02:28 PM, Guy Deleeuw wrote:
> Hello
> Is this patch available in the git repository ?
>
> Regards
> Guy
> On Wednesday, 19 January 2011 at 17:20 +0300, Antony Dovgal wrote:

>> Sergey, can you take a look at the Unicode patch in this message plz?
>> http://groups.google.com/group/libharu/msg/58d8fc366e94a29d?hl=en_US

Not yet, just click on the link.

Koen Deforche

Feb 23, 2011, 4:14:43 AM
to lib...@googlegroups.com
Hey all,

Please find attached an updated patch for the Unicode rendering.
With help from Antony, this patch fixes the following problems
identified with the first patch:

- renders correctly on libpoppler-based PDF viewers, and xpdf. In
fact, I have not yet seen any PDF viewer fail on the PDFs it now
generates.
- fixes the TextOut() API

Two problems, perhaps related, remain:

- libpoppler etc. give an error: Error: Unknown character collection
'Adobe-Identity-H'. It seems to be harmless as far as rendering is
concerned.
- perhaps because of that, text copy/paste from the PDF seems to be broken.

The patch also contains some changes to the TrueType font handling, to
allow more TrueType fonts to be correctly used and embedded. One of
the changes there is to no longer raise an error when the "cvt " table
is missing (some fonts don't have it, as it may not apply to e.g.
Chinese fonts). Still, the TrueType font handling of libharu remains a
bit flaky: some Mac OS X system fonts do not work, for example.
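
To see whether a given font is affected by the "cvt " change, one can look at the table directory at the start of the font file. The following is a minimal sketch, assuming a single-font sfnt file (a plain .ttf, not a .ttc collection) already read into memory. The function name `ttf_has_table` is hypothetical; it is not taken from libharu or from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libharu.
 * Returns 1 if the sfnt table directory in buf lists a table with the
 * given 4-byte tag, 0 if it does not, and -1 if buf is too short. */
int ttf_has_table(const unsigned char *buf, size_t len, const char *tag)
{
    size_t num_tables, i;

    if (len < 12)
        return -1;                        /* the offset table is 12 bytes */

    /* numTables is a big-endian uint16 at byte offset 4. */
    num_tables = ((size_t)buf[4] << 8) | buf[5];
    if (len < 12 + num_tables * 16)
        return -1;                        /* each table record is 16 bytes */

    for (i = 0; i < num_tables; i++) {
        /* A record starts with its 4-byte tag, e.g. "cvt " or "glyf". */
        if (memcmp(buf + 12 + i * 16, tag, 4) == 0)
            return 1;
    }
    return 0;
}
```

Calling `ttf_has_table(data, size, "cvt ")` on the first bytes of a font file returns 0 for fonts that lack the table. Note the trailing space: table tags are always four bytes long.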

To see what it can do, see:

http://www.emweb.be/public/haru/utf8.html

rendered as:

http://www.emweb.be/public/haru/utf8.pdf

A small test case (which I got from Antony) shows the basic usage; I
have attached it as well.

Antony, please do not forget the patch for the arc rendering. It is as
important to us as the Unicode support.

Regards,
koen

FreeSans.ttf
utf8test.c
truetype_utf8.patch

Clayman

Feb 24, 2011, 1:16:32 AM
to libHaru
Hi!

In my case, Koen's patch works fine. Great job, Koen!

Antony, maybe it's time to release it for wider testing?

Guy Deleeuw

Feb 24, 2011, 4:58:22 AM
to lib...@googlegroups.com
Hello all,
I tested it intensively in our applications and it works very well.
Great job, Koen!
I hope it gets integrated into Haru ASAP.
Best regards
Guy

Pradeep CK

Jul 24, 2013, 7:13:00 AM
to lib...@googlegroups.com
Hi,

I am using the libharu package for a small project and it has been excellent so far. Thanks for making such a nice and usable package.

I am trying out the UTF-8 feature and I have been able to produce a PDF document. However, when I copy text from the PDF document and paste it into Word, it comes out incorrectly. I also tried downloading the PDF file that emweb.be has posted here, and there too I could not paste the contents properly.

Am I missing something or is there something I should do to make this work?

Thanks!
Pradeep