(Update #4)
I found a bug in hpdf_encoder_utf8.c. I have fixed it and uploaded the
corrected file to this discussion group.
I now understand what the PDF output should look like: <FEFF...> is
the correct hex string.
I adjusted InternalWriteText once again.
The input is now UTF-8 and the output is UTF-16BE, but the glyphs are
wrong and the spacing between characters is off as well.
My guess is that something also has to be done with the map table of
the font. This is where my knowledge ends.
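For reference, here is a minimal standalone sketch of the conversion described above: UTF-8 in, a <FEFF...> hex string in UTF-16BE out. It does not use libharu at all; the names utf8_decode and utf8_to_pdf_hex are made up for this illustration and are not part of the library or of my patch. One caveat, as far as I understand the PDF specification: the FEFF marker identifies "text strings" such as document info and outline titles, while strings drawn on a page are character codes interpreted through the font's encoding or CMap. That may be related to the wrong glyphs, but I have not verified it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Decode one UTF-8 sequence starting at s. Stores the code point in *cp
 * and returns the number of bytes consumed, or 0 on malformed input.
 * Simplified sketch: overlong forms are not rejected here. */
static size_t utf8_decode(const unsigned char *s, uint32_t *cp)
{
    size_t n, i;

    if (s[0] < 0x80)                { *cp = s[0];        return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { *cp = s[0] & 0x1F; n = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { *cp = s[0] & 0x0F; n = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { *cp = s[0] & 0x07; n = 4; }
    else return 0;

    for (i = 1; i < n; i++) {
        /* Every following byte must look like 10xxxxxx; this also
         * catches a string that ends in the middle of a sequence. */
        if ((s[i] & 0xC0) != 0x80) return 0;
        *cp = (*cp << 6) | (s[i] & 0x3F);
    }
    return n;
}

/* Hypothetical helper: writes text as "<FEFF....>" (UTF-16BE, hex) into
 * out. Returns 0 on success, -1 on malformed input or a too-small buffer. */
int utf8_to_pdf_hex(const char *text, char *out, size_t out_size)
{
    const unsigned char *s = (const unsigned char *)text;
    size_t pos;
    int w;

    w = snprintf(out, out_size, "<FEFF");
    if (w < 0 || (size_t)w >= out_size) return -1;
    pos = (size_t)w;

    while (*s) {
        uint32_t cp;
        size_t n = utf8_decode(s, &cp);

        if (n == 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return -1;

        if (cp >= 0x10000) {
            /* Outside the BMP: emit a surrogate pair. */
            cp -= 0x10000;
            w = snprintf(out + pos, out_size - pos, "%04X%04X",
                         (unsigned)(0xD800 + (cp >> 10)),
                         (unsigned)(0xDC00 + (cp & 0x3FF)));
        } else {
            w = snprintf(out + pos, out_size - pos, "%04X", (unsigned)cp);
        }
        if (w < 0 || (size_t)w >= out_size - pos) return -1;
        pos += (size_t)w;
        s += n;
    }

    if (out_size - pos < 2) return -1;   /* room for '>' and the NUL */
    out[pos++] = '>';
    out[pos] = '\0';
    return 0;
}
```

Calling utf8_to_pdf_hex("\xC3\xA9", buf, sizeof buf) with a 64-byte buf (the UTF-8 bytes of "é") leaves <FEFF00E9> in buf.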
Regards,
Mirco
********
Summary of my changes to support UTF-8 input strings:
1) Implemented hpdf_encoding_utf8.c
2) In hpdf_pages.h, added 2 fields to the struct _HPDF_PageAttr_Rec:
HPDF_MMgr mmgr;
HPDF_Encoder encoder;
3) Adjusted the HPDF_Page_New function (hpdf_pages.h and hpdf_pages.c):
HPDF_Page
HPDF_Page_New (HPDF_MMgr mmgr,
HPDF_Xref xref,
HPDF_Encoder encoder)
{
    /* ... rest of the existing function unchanged ... */

    /* ADDED */
    attr->encoder = encoder;
    attr->mmgr = mmgr;
}
4) In hpdf_doc.c, adjusted the calls to HPDF_Page_New in the functions
HPDF_AddPage() and HPDF_InsertPage():
page = HPDF_Page_New (pdf->mmgr, pdf->xref, pdf->cur_encoder);
5) Adjusted the InternalWriteText() function in hpdf_page_operator.c:
static HPDF_STATUS
InternalWriteText (HPDF_PageAttr  attr,
                   const char    *text)
{
    HPDF_FontAttr font_attr = (HPDF_FontAttr)attr->gstate->font->attr;
    HPDF_STATUS ret;
    HPDF_String str;

    HPDF_PTRACE ((" InternalWriteText\n"));

    if ((attr->encoder) &&
        (attr->encoder->type == HPDF_ENCODER_TYPE_DOUBLE_BYTE))
    {
        str = HPDF_String_New(attr->mmgr, text, attr->encoder);
        if (!str)
            return HPDF_FAILD_TO_ALLOC_MEM;
        ret = HPDF_String_Write(str, attr->stream, NULL);
        HPDF_String_Free(str);
        return ret;
    }
    else
    {
        if (font_attr->type == HPDF_FONT_TYPE0_TT ||
            font_attr->type == HPDF_FONT_TYPE0_CID) {
            if ((ret = HPDF_Stream_WriteStr (attr->stream, "<")) != HPDF_OK)
                return ret;
            if ((ret = HPDF_Stream_WriteBinary (attr->stream,
                        (HPDF_BYTE *)text,
                        HPDF_StrLen (text, HPDF_LIMIT_MAX_STRING_LEN),
                        NULL)) != HPDF_OK)
                return ret;
            return HPDF_Stream_WriteStr (attr->stream, ">");
        }
        return HPDF_Stream_WriteEscapeText (attr->stream, text);
    }
}
6) Testing:
pdf = HPDF_New(error_handler, NULL);
HPDF_UseUTF8Encoding(pdf);
HPDF_SetCurrentEncoder(pdf, "UTF-8");
...
Add pages etc.
Strings passed to the text output functions (HPDF_Page_ShowText, etc.)
must be UTF-8 encoded.
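Since the text functions now expect UTF-8, it may help while testing to check a string before handing it to HPDF_Page_ShowText, so that Latin-1 input does not silently produce garbage. Below is a small standalone sketch of such a check. The name is_valid_utf8 is made up for this example and is not a libharu function.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the NUL-terminated string s is
 * well-formed UTF-8, otherwise 0. Rejects overlong forms, surrogates,
 * and code points above U+10FFFF. */
int is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        unsigned long cp, min;
        size_t n, i;

        if (*p < 0x80)                { p++; continue; }   /* ASCII */
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; n = 2; min = 0x80; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; n = 3; min = 0x800; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; n = 4; min = 0x10000; }
        else return 0;                /* stray continuation or invalid lead */

        for (i = 1; i < n; i++) {
            /* Continuation bytes must be 10xxxxxx; a NUL here means the
             * string ended in the middle of a sequence. */
            if ((p[i] & 0xC0) != 0x80) return 0;
            cp = (cp << 6) | (p[i] & 0x3F);
        }

        /* min is the smallest code point that needs n bytes, so anything
         * below it is an overlong encoding. */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;

        p += n;
    }
    return 1;
}
```

A caller could test is_valid_utf8(text) first and report an error instead of calling HPDF_Page_ShowText when it returns 0.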