Using sfntly to create a database of mappings {character→best font for that character}


Nicolas Raoul

Mar 12, 2012, 12:26:58 PM
to sfntly...@googlegroups.com
Hello,

I maintain an Android application that displays strings of text.
Text can be Tibetan, Arabic, IPA, Thai, Ancient Greek, etc...
A lot of users complain about some characters being displayed as squares sometimes.

So here is what I want to do:
1) Gather a collection of various open source fonts.
2) Use sfntly to create a database of mappings {character→best font for that character}
3) In the Android app, use this database to dynamically use the best font.
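
A minimal sketch of what step 2 could produce, assuming the fonts are listed in priority order and that each one exposes a "has a glyph for this code point" test. The class name `FontDatabaseSketch`, the `build` method and the `IntPredicate` stand-in for a real cmap lookup are all hypothetical and not part of sfntly:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.IntPredicate;

final class FontDatabaseSketch {
    /**
     * Maps every code point in [first, last] to the first font, in insertion
     * order, whose coverage test accepts it. Code points that no font covers
     * are simply left out of the result.
     */
    static Map<Integer, String> build(
            LinkedHashMap<String, IntPredicate> fontsByPriority, int first, int last) {
        Map<Integer, String> database = new TreeMap<>();
        for (int codePoint = first; codePoint <= last; codePoint++) {
            for (Map.Entry<String, IntPredicate> font : fontsByPriority.entrySet()) {
                // Hypothetical coverage test; a real one would consult the font's cmap.
                if (font.getValue().test(codePoint)) {
                    database.put(codePoint, font.getKey());
                    break; // the first matching font wins
                }
            }
        }
        return database;
    }
}
```

For step 3, the app would then look up each code point of the string in this map and fall back to the default font when there is no entry.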

Questions:
A) Is it a crazy idea?
B) Am I re-inventing the wheel, has someone already done this and created such a database that I could reuse?
C) Is step 2 implementable with sfntly? Or is there a better-suited tool?

Thanks a lot!
Nicolas Raoul

Bill Schwanitz

Mar 12, 2012, 1:02:54 PM
to sfntly-users
Hi Nicolas,

Yes, you could definitely use sfntly to look at the cmap table to
figure out what "characters" are in a font. For sfntly this will only
work for TTF fonts - OTFs are not supported to the same level in
sfntly yet.

I might also suggest that you look at fontaine
(http://www.unifont.org/fontaine/) or TTX
(http://sourceforge.net/projects/fonttools/), depending on what language
you feel most comfortable using (sfntly=Java & C++, fontaine=Ruby (I think),
TTX=Python).

Cheers,
Bill

Stuart Gill

Mar 12, 2012, 1:59:13 PM
to sfntly...@googlegroups.com
To expand on Bill's response, I think that it doesn't matter if the font is an OTF font. What you seem to be interested in is whether there is an entry in the cmap for the given character within the font. You will want to be sure to pick the correct cmap if there is more than one.
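
One way to read "pick the correct cmap" is to prefer subtables that cover all of Unicode over those limited to the Basic Multilingual Plane. The sketch below is only an illustration: `CmapChoiceSketch`, `SubtableId` and `pick` are hypothetical names, not sfntly classes, and the preference order is an assumption. The numeric IDs are the platform and encoding IDs defined by the OpenType specification.

```java
import java.util.List;

final class CmapChoiceSketch {
    /** Hypothetical (platformId, encodingId) pair identifying one cmap subtable. */
    static final class SubtableId {
        final int platformId;
        final int encodingId;

        SubtableId(int platformId, int encodingId) {
            this.platformId = platformId;
            this.encodingId = encodingId;
        }
    }

    // Assumed preference order, widest Unicode coverage first.
    private static final int[][] PREFERENCE = {
        {3, 10}, // Windows platform, Unicode full repertoire (UCS-4)
        {0, 4},  // Unicode platform, full repertoire
        {3, 1},  // Windows platform, Unicode BMP only (UCS-2)
        {0, 3},  // Unicode platform, BMP only
    };

    /** Returns the most preferred subtable present in the font, or null if none matches. */
    static SubtableId pick(List<SubtableId> available) {
        for (int[] wanted : PREFERENCE) {
            for (SubtableId id : available) {
                if (id.platformId == wanted[0] && id.encodingId == wanted[1]) {
                    return id;
                }
            }
        }
        return null;
    }
}
```

A BMP-only subtable cannot describe characters above U+FFFF, so a font that has both kinds would under-report its coverage if the smaller one were picked.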

If your definition of "best" is more than just "does it have a glyph", then you may need to use the advanced layout tables (GSUB, GPOS, GDEF) to determine more details. Bill is correct that sfntly doesn't yet have support for these.

Stuart

Brian Stell

Mar 12, 2012, 2:07:42 PM
to sfntly...@googlegroups.com
Hi Nicolas,

Here is a snippet to get the characters in a font:

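    import com.google.typography.font.sfntly.Font;
    import com.google.typography.font.sfntly.FontFactory;
    import com.google.typography.font.sfntly.Tag;
    import com.google.typography.font.sfntly.table.core.CMap;
    import com.google.typography.font.sfntly.table.core.CMapTable;

    Font[] srcFontarray = FontFactory.getInstance().loadFonts(<your-font-data-or-stream>);
    Font font = srcFontarray[0];
    CMapTable cmapTable = font.getTable(Tag.cmap);
    // use the bigger cmap table if available
    CMap cmap = cmapTable.cmap(Font.PlatformId.Windows.value(), Font.WindowsEncodingId.UnicodeUCS4.value());
    if (cmap == null)
      cmap = cmapTable.cmap(Font.PlatformId.Windows.value(), Font.WindowsEncodingId.UnicodeUCS2.value());

    // glyph ID 0 is the "missing glyph", so a non-zero ID means the font covers charId
    if (cmap.glyphId(charId) != 0)
      return true;
    return false;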

To keep it short/simple, I've left out the error handling, which you will probably want to add.

Best,

Brian

Brian Stell

Mar 12, 2012, 2:09:12 PM
to sfntly...@googlegroups.com
And sfntly does TTF/OTF fonts but does not yet support CFF/OTF fonts.

Stuart Gill

Mar 12, 2012, 2:22:18 PM
to sfntly...@googlegroups.com
For the cmap table it shouldn't matter. The cmap table will exist in a CFF outline OTF font and if it has an entry for the character then the CFF table will too, unless, of course, the font is broken.

Stuart



Nicolas Raoul

Mar 12, 2012, 8:39:53 PM
to sfntly...@googlegroups.com
Hello Bill, Stuart, Brian,

Thanks a lot for the great and fast answers!
I will try the code snippet.
Notes:

1) I am surprised no such database already exists. I will make everything Open Source, and hopefully some other applications will find it useful too. I will concentrate on fonts that can be freely embedded into apps.

2) My title said "best font for that character" but I should have said "suitable font for that character". I only want to avoid squares, so I guess "having an entry in the cmap" is enough and I don't need Fontaine/TTX for now.

3) I don't really need much error handling as I plan to prepare the database in advance, not dynamically.

Thanks again, keep up the good work!
Nicolas Raoul

Nicolas Raoul

Mar 13, 2012, 5:17:31 AM
to sfntly...@googlegroups.com
Hello,

I quickly created a new Open Source project around the code snippet:

As you can see, it reads all font files from a directory, and generates a mapping for each UTF-16 character.


I will think of an efficient (size+speed) data structure, probably arrays of arrays generated as static Java code, after grouping adjacent characters that have the same suitable fonts.
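
A rough sketch of the grouping idea, under stated assumptions: the per-character mapping is already available as a `Map<Integer, String>` with a single font per character, and adjacent code points with the same font are merged into runs that are searched with a binary search. The class `FontRangeTable` and its methods are hypothetical. It builds parallel arrays at runtime rather than the generated static Java code described above, which could emit the same three arrays as literals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class FontRangeTable {
    private final int[] starts;   // first code point of each run
    private final int[] ends;     // last code point of each run, inclusive
    private final String[] fonts; // font assigned to each run

    private FontRangeTable(int[] starts, int[] ends, String[] fonts) {
        this.starts = starts;
        this.ends = ends;
        this.fonts = fonts;
    }

    /** Merges adjacent code points that share a font into sorted, non-overlapping runs. */
    static FontRangeTable fromMap(Map<Integer, String> perCodePoint) {
        List<int[]> runs = new ArrayList<>();
        List<String> names = new ArrayList<>();
        for (Map.Entry<Integer, String> e : new TreeMap<>(perCodePoint).entrySet()) {
            int cp = e.getKey();
            int last = runs.size() - 1;
            if (last >= 0 && runs.get(last)[1] == cp - 1 && names.get(last).equals(e.getValue())) {
                runs.get(last)[1] = cp; // extend the current run
            } else {
                runs.add(new int[] {cp, cp}); // start a new run
                names.add(e.getValue());
            }
        }
        int[] starts = new int[runs.size()];
        int[] ends = new int[runs.size()];
        for (int i = 0; i < runs.size(); i++) {
            starts[i] = runs.get(i)[0];
            ends[i] = runs.get(i)[1];
        }
        return new FontRangeTable(starts, ends, names.toArray(new String[0]));
    }

    /** Binary search over the runs; returns null when no font covers the code point. */
    String fontFor(int codePoint) {
        int lo = 0;
        int hi = starts.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (codePoint < starts[mid]) {
                hi = mid - 1;
            } else if (codePoint > ends[mid]) {
                lo = mid + 1;
            } else {
                return fonts[mid];
            }
        }
        return null;
    }

    int runCount() {
        return starts.length;
    }
}
```

Because scripts tend to occupy contiguous blocks, the number of runs should be far smaller than the number of characters, and each lookup costs a logarithmic number of comparisons.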

Cheers!
Nicolas Raoul