Revised idea for utf8show

David Newall

Jan 30, 2022, 1:03:20 PM

Hi All,

Yes, requiring a modified font is ugly, but a map from UNICODE values to
glyph names is absolutely required.

Fonts don't have /uniXXXX aliases for all glyphs, nor do they reliably
use the names in AdobeGlyphList (which, anyway, does not even cover all
written languages).

I think I've come up with an elegant solution. It allows you to specify
the actual glyph names from fonts that you use, or use AdobeGlyphList if
you think that will be sufficient (it probably won't be.)

I would greatly value constructive criticism.

Meet unicodefont:

font array|dict unicodefont dict

Prepare a dictionary derived from font, adding a UNICODE map based on
array or dict. The result, after registering with definefont, will be
suitable for use with the utf8show family of operators.

If an array is passed, it must contain one element for each UNICODE
value to be installed in the map. Each element is an array containing
a UNICODE value followed by one or more glyph names. The first name
found in the font is associated with the UNICODE value.

If a dict is passed, its keys must be glyph names and its values must
be UNICODE values.
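
To make the two accepted shapes concrete, here is a small illustration. It is not from the original post: the names arraymap and dictmap are made up, and the code points and glyph names are only examples. A real map should use the names actually present in your font. The dict form is presumably the shape AdobeGlyphList has, since the example passes it directly.

```postscript
% Array form: each element is [ codepoint /name1 /name2 ... ].
% The first name that exists in the font would be the one used.
/arraymap [
  [ 16#0041 /A ]
  [ 16#00E9 /eacute /uni00E9 ]
  [ 16#20AC /Euro /euro /uni20AC ]
] def

% Dict form: keys are glyph names, values are UNICODE values.
/dictmap <<
  /A      16#0041
  /eacute 16#00E9
  /Euro   16#20AC
>> def
```

Note that the two forms run in opposite directions: the array is keyed by UNICODE value and can list fallback names, while the dict is keyed by glyph name.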

Although the standard AdobeGlyphList could be used, this is not
recommended. Fonts often use names that are different to those in
AdobeGlyphList, and also often include many glyphs which are not
listed in it. Using AdobeGlyphList is likely to result in characters
not being painted even though they are present in the font.

It is strongly recommended that your map associates every glyph name
actually used by the font with its UNICODE value. FontForge
(https://fontforge.org) can generate a map of a font's glyphs when
saving in .otf or .ttf format with the "output glyph map" option
enabled. This produces a .g2n file which can be processed by awk:

BEGIN{print "<<"}
/GLYPHID .*PSNAME .*UNICODE .*/{print "/"$4, "16#"$6}
END{print ">>"}

The original font is not modified. A new font is created when the
returned dictionary is registered under a key with definefont.
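
A rough sketch of how the operator described above might be written in plain PostScript follows. It is not the author's implementation. It assumes a font with a /CharStrings dictionary that can be read (Type 1 or Type 42, for instance), and it stores the result under an invented key, /UnicodeMap, as a dictionary from UNICODE value to glyph name. The name unicodefont-sketch is likewise made up.

```postscript
% Sketch only:  font array|dict  unicodefont-sketch  dict
/unicodefont-sketch {
  6 dict begin
  /src  exch def                    % the array or dict describing the map
  /font exch def                    % the font to derive from
  /cs   font /CharStrings get def   % glyph names the font really has
  /umap src length dict def         % will hold: code point -> glyph name

  src type /dicttype eq {
    % dict form (glyph name -> code point): keep names the font has
    src {
      exch                          % cp name
      cs 1 index known
        { umap 3 1 roll put }       % umap[cp] = name
        { pop pop }
      ifelse
    } forall
  } {
    % array form [ cp /name1 /name2 ... ]: first name found wins
    src {
      dup 0 get exch                        % cp entry
      dup length 1 sub 1 exch getinterval   % cp [names]
      {
        cs 1 index known
          { umap 2 index 3 -1 roll put exit }
          { pop }
        ifelse
      } forall
      pop                                   % drop cp
    } forall
  } ifelse

  % Copy the font without its FID and attach the map.
  font length 1 add dict
  font {
    1 index /FID eq
      { pop pop }                   % definefont supplies a fresh FID
      { 2 index 3 1 roll put }
    ifelse
  } forall
  dup /UnicodeMap umap put
  end
} def

% Usage: derive, register, then scale, as in the usual re-encoding idiom.
/Helvetica-Unicode
  /Helvetica findfont [ [ 16#0041 /A ] [ 16#00E9 /eacute ] ]
  unicodefont-sketch definefont
20 scalefont setfont
```

With the dict form, several glyph names may share one UNICODE value; in this sketch whichever is enumerated last simply overwrites the others, which a real implementation might want to handle more deliberately.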

Here's an example:

%!
/Helvetica-Unicode /Helvetica findfont 20 scalefont AdobeGlyphList
unicodefont definefont setfont
100 300 moveto <~=(Q2XDf'&.FDi;]J=KV=7P-UZJ=Q~> utf8show showpage
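
The post does not show how utf8show itself would work, so the following is only a guess at the idea: decode the UTF-8 bytes into UNICODE values, look each one up in the font's map, and paint the glyph by name. It assumes LanguageLevel 2 (for glyphshow) and a current font carrying a hypothetical /UnicodeMap dictionary from UNICODE value to glyph name. The names utf8decode and utf8show-sketch are invented, and malformed input is not checked.

```postscript
% string  utf8decode  array-of-code-points     (no validation of the input)
/utf8decode {
  3 dict begin
  /cp 0 def  /need 0 def
  [ exch {
    dup 16#C0 and 16#80 eq {
      % continuation byte 10xxxxxx: shift in six more bits
      16#3F and cp 6 bitshift or /cp exch def
      /need need 1 sub def
      need 0 eq { cp } if              % sequence complete: emit code point
    } {
      dup 16#80 lt { }                 % ASCII: the byte is the code point
      { dup 16#E0 lt { 16#1F and /cp exch def /need 1 def }   % 2-byte lead
        { dup 16#F0 lt { 16#0F and /cp exch def /need 2 def } % 3-byte lead
                       { 16#07 and /cp exch def /need 3 def } % 4-byte lead
          ifelse
        } ifelse
      } ifelse
    } ifelse
  } forall ]
  end
} def

% string  utf8show-sketch  -
% Paints each character whose code point is in the current font's map;
% unmapped code points are silently skipped.
/utf8show-sketch {
  utf8decode {
    currentfont /UnicodeMap get exch   % map cp
    2 copy known { get glyphshow } { pop pop } ifelse
  } forall
} def
```

Silently skipping unmapped code points matches the behaviour the post warns about (characters not being painted); a fuller version might fall back to /.notdef instead.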

Regards,

David

David Newall

Jan 30, 2022, 11:54:28 PM

On 31/1/22 5:03 am, David Newall wrote:
> Yes, requiring a modified font is ugly, but a map from UNICODE values to
> glyph names is absolutely required.

Following up on my own post, I should mention alternative ideas that I
have.

First is to include the UnicodeEncoding map with the string when
invoking utf8show:

/MyUnicode AdobeGlyphList makeunicode def
MyUnicode (UTF8 String) utf8show

Second is to have a global map, akin to currentpoint:

AdobeGlyphList makeunicode setunicode
(UTF8 String) utf8show
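
For comparison, a minimal sketch of what a global map could look like. The companion name currentunicode and the key /CurrentUnicodeMap are invented here, and makeunicode is left out. Stored in userdict like this, the map is an ordinary global: it follows save/restore but not gsave/grestore, unlike real graphics-state parameters such as the current point.

```postscript
% map  setunicode  -        install a code point -> glyph name map
/setunicode     { userdict /CurrentUnicodeMap 3 -1 roll put } def
% -  currentunicode  map    fetch the installed map
/currentunicode { userdict /CurrentUnicodeMap get } def

<< 16#0041 /A >> setunicode
currentunicode 16#0041 get ==     % prints /A
```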

Are these ideas better or worse than what I previously discussed?

Thanks,

David

luser droog

Feb 1, 2022, 1:03:50 AM

I think all of these would be fine, such as they are. But I see more
potential with the first idea of wrapping up all the information in
the font. It may be possible to wrap the functionality of the glyph
lookup in the /BuildChar procedure and get the whole thing to
work with show or cshow.