How to get the glyph for a Unicode character?


Ryan

Nov 2, 2015, 8:53:39 PM
to PDFTron PDFNet SDK
Q:

How do I get the glyph for a Unicode character?

--------------------------------------------------------------------------------------------------------------------------

A:

The proper way to get a glyph is to use the character code (charcode) that the PDF uses to display it, rather than starting from the Unicode character. A font can have multiple glyphs representing any particular Unicode character.

So, starting from the char code, you can get both the glyph outline and its Unicode text:

PathData path_data = font.GetGlyphPath(char_code);
UString unicode = font.MapToUnicode(char_code);

Each char_code maps to exactly one glyph, but the mapping of char_code to Unicode can be 1:many (for instance ligatures, such as "ffi").

In the other direction, a glyph can have multiple charcodes mapping to it, and any Unicode character can have multiple charcodes mapping to it.

If you really need to get a glyph for a Unicode character, the following will work.

unsigned int max_char_code = font.IsSimple() ? 0xFF : 0xFFFF;
UString uni;
for(unsigned int cc = 0; cc <= max_char_code; ++cc) // <= so the last charcode is checked too
{
    uni = font.MapToUnicode(cc);
    if(!uni.Empty() && uni.GetLength() == 1) // uni could contain multiple unicode characters
    {
        if(uni.GetAt(0) == unicode_target)
        {
            // you now have a matching charcode for that unicode character.
            // you could either break now, or continue searching for additional charcodes.
            // There may be more than one glyph for this unicode, in which case there
            // would be at least that many charcodes mapping to this unicode.
        }
    }
}
// now using matched charcode(s), you can call font.GetGlyphPath()
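
If you need to look up many Unicode characters in the same font, it may be cheaper to scan the charcodes once and keep a reverse index, rather than rerun the loop for each target. Below is a minimal sketch of that idea. It does not use PDFNet: `CharCodeMapper` is a hypothetical stand-in for `font.MapToUnicode()`, `BuildReverseIndex` and `ToyMapper` are names made up for illustration, and the Unicode text is assumed to be held as a `std::u32string` rather than a `UString`.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for font.MapToUnicode(): returns the Unicode text
// for a charcode, or an empty string when the charcode is unmapped.
using CharCodeMapper = std::function<std::u32string(std::uint32_t)>;

// Reverse index: Unicode code point -> every charcode that maps to exactly it.
using ReverseIndex = std::map<char32_t, std::vector<std::uint32_t>>;

ReverseIndex BuildReverseIndex(const CharCodeMapper& map_to_unicode,
                               std::uint32_t max_char_code)
{
    // Assumed range: 0xFF for simple fonts, 0xFFFF otherwise.
    assert(max_char_code <= 0xFFFF);

    ReverseIndex index;
    for (std::uint32_t cc = 0; cc <= max_char_code; ++cc)
    {
        const std::u32string uni = map_to_unicode(cc);
        if (uni.size() == 1) // skip unmapped charcodes and ligatures
        {
            index[uni[0]].push_back(cc);
        }
    }
    return index;
}

// Toy mapper used only for illustration; this is not real font data.
std::u32string ToyMapper(std::uint32_t cc)
{
    switch (cc)
    {
        case 0x41: return U"A";
        case 0x80: return U"A";   // a second glyph for the same character
        case 0x90: return U"ffi"; // ligature: one charcode, three characters
        default:   return U"";
    }
}
```

With the real SDK you would wrap `font.MapToUnicode(cc)` in the mapper and convert its result; each charcode stored in the index can then be passed to `font.GetGlyphPath()`. Ligature charcodes are skipped here, as in the loop above, so a lookup for "f" will not return the "ffi" glyph.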

