Support for Type1 hints

rrob...@adobe.com

Apr 25, 2013, 5:40:39 PM
to ufo-...@googlegroups.com
Over the next few months, the Adobe Type group is going to experiment with moving to UFO fonts as a primary font data source. One problem with this is that there is no hint data in the UFO spec. This means we need to autohint every time we build a font (for final release, we check out the sources and build from those), which can take several minutes per face. I think we can live with this in the short term, as we can do this only when the developer builds in release mode or explicitly requests it of makeotf. However, in the long run, we'd want to put the hint data in the UFO font. First, is there any general consensus about whether the UFO font should eventually provide a place for hint data?

If so, I have had a couple of ideas about how to do so. If not, I can continue the discussion about a private T1 hint format in another thread.

- the hinting data should be separate from the glyph files. This is partly because of design logic: hints are derived from the path data, but do not alter it, so it makes sense to me to not modify the path data files when hints are added/changed. It is also partly from issues around workflow speed and convenience. You can derive hints a number of ways, and with different options. If the hints are kept in the glyph data file, then you have to write new glyph files and save previous versions with each autohint run, even though the outline data hasn't changed. Source code control is important in our workflow.

- Another issue is that hints for a glyph need to be invalidated when the outline data is edited. The way I imagine dealing with this is that the hint data for a glyph should contain a fingerprint of the outline data, probably a hash based on the marking path operators. A program which reads the hints would then know to throw away any hint data for a glyph where this hash does not match the hash for the current glyph. I implemented something like this in the AFDKO autohint tool.

If I were to add hint data as a private data source, I would create a new directory in the UFO font app directory, 't1Hints'. This would contain an XML file containing the global T1 hint data, such as alignment zones, stemSnap[H|V] and BlueScale. It would also contain a set of hint files, one for each glyph, with names derived from the glyph file names, such as <glyphName>.hint.xml. There may or may not be a hint file for any glyph. The hint files would be XML files which mirror the structure of T2 hints: a list of all the h stem hints, followed by all the v stem hints. These would then be followed by a list of hint substitution operators, each of which is linked to a point in the glyph file by point index or point tag and identifies the active stem hints by index. The glyph hint file would also contain a hash of the glyph path marking operators at the time of hinting, to be used to determine if the hint file is still applicable.

- Read Roberts

Dave Crossland

Apr 25, 2013, 6:33:59 PM
to rrob...@adobe.com, ufo-...@googlegroups.com
On 25 April 2013 23:40, <rrob...@adobe.com> wrote:
> Over the next few months, the Adobe Type group is going to experiment with
> moving to UFO fonts as a primary font data source.

This is awesome news; I'm moving all the Google Fonts to UFO sources
on GitHub this year, and would love to see hints stored in a
standardised way :-)

Tal Leming

Apr 26, 2013, 4:20:49 PM
to ufo-...@googlegroups.com

On Apr 25, 2013, at 5:40 PM, rrob...@adobe.com wrote:

> Over the next few months, the Adobe Type group is going to experiment with moving to UFO fonts as a primary font data source. One problem with this is that there is no hint data in the UFO spec. This means we need to autohint every time we build a font (for final release, we check out the sources and build from those), which can take several minutes per face. I think we can live with this in the short term, as we can do this only when the developer builds in release mode or explicitly requests it of makeotf. However, in the long run, we'd want to put the hint data in the UFO font. First, is there any general consensus about whether the UFO font should eventually provide a place for hint data?

There isn't consensus one way or the other. We (Erik, Just and myself) have been interested in storing hints for a few years, we're just waiting for the right data structures. For many years we were hopeful that we would discover a magic abstraction that would represent both PS and TT hints. Erik and I have made attempts at this in the past but I think they have led us, or at least me, to believe that such an abstraction isn't possible. A more format specific approach is probably going to be a better solution.

> - the hinting data should be separate from the glyph files. This is partly because of design logic: hints are derived from the path data, but do not alter it, so it makes sense to me to not modify the path data files when hints are added/changed. It is also partly from issues around workflow speed and convenience. You can derive hints a number of ways, and with different options. If the hints are kept in the glyph data file, then you have to write new glyph files and save previous versions with each autohint run, even though the outline data hasn't changed. Source code control is important in our workflow.

I'm a little confused by this. I understand that the hinting process wouldn't modify the path data. But, GLIF contains more than path data. I could see GLIF gaining a <pshints> element that follows the <outline> element. The structure of GLIF would then look something like this:

• glyph
  • advance
  • unicode
  • note
  • image
  • guideline
  • anchor
  • outline
    • contour
      • point
    • component
  • pshints
  • lib

When the hinting is created or modified the only part of the GLIF that would change would be the <pshints> element. The version control system would then (hopefully) only pay attention to the changed lines.

Am I missing something that makes this not a viable option?

> - Another issue is that hints for a glyph need to be invalidated when the outline data is edited. The way I imagine dealing with this is that the hint data for a glyph should contain a fingerprint of the outline data, probably a hash based on the marking path operators. A program which reads the hints would then know to throw away any hint data for a glyph where this hash does not match the hash for the current glyph. I implemented something like this in the AFDKO autohint tool.

I've done similar things in the past. The fingerprint could be a digest of a string representation of the path (something like Erik's digest pen structure or SVG's path strings would work). I'll have to think about how this could work in the spec. If it becomes part of the spec then we have to define how to make the structure used to make the digest. That could be complicated.

> If I were to add hint data as a private data source,

For the following, do you mean that if this doesn't become part of the spec this is how you would implement it privately? If so, I have some suggestions…

> I would create a new directory in the UFO font app directory, 't1Hints'.

Directories that aren't part of the spec are not safe. There is no guarantee that a save or save as operation will retain undefined directories. A better location would be the /*.ufo/lib.plist with a com.adobe.private.pshints key or something like that. That could be a structure like this:

{
  glyphs : {
    name : {
      fingerprint : hash
      hints : representation
    }
  }
}

It's one file rather than one file per glyph. You could also store these on a glyph by glyph basis in the lib element of each GLIF.
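
As a rough illustration of the lib.plist approach, here is a minimal Python sketch using the standard plistlib module. The key name is the one floated above, and the helper function and the placeholder values are hypothetical; the actual hint representation is still undecided at this point in the thread.

```python
import plistlib

# Key name as suggested above; it is a private key, not part of any UFO spec.
PSHINTS_KEY = "com.adobe.private.pshints"


def add_glyph_hints(lib, glyph_name, fingerprint, hints):
    """Store hint data for one glyph under the private key of a font lib dict.

    `hints` is whatever representation is eventually chosen; here it is opaque.
    """
    glyphs = lib.setdefault(PSHINTS_KEY, {}).setdefault("glyphs", {})
    glyphs[glyph_name] = {"fingerprint": fingerprint, "hints": hints}
    return lib


# Placeholder values, for illustration only.
lib = add_glyph_hints({}, "B", "abc123", "placeholder representation")

# Serialize to the XML property list format used by lib.plist, then read it back.
plist_bytes = plistlib.dumps(lib)
round_tripped = plistlib.loads(plist_bytes)
```

The same dictionary could instead be written per glyph into the lib element of each GLIF, which trades one large file for many small changes.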

> This would contain an XML file containing the global T1 hint data, such as alignment zones, stemSnap[H|V] and BlueScale.

These are already in fontinfo.plist so you don't need to make a private place for them.

> It would also contain a set of hint files, one for each glyph, with names derived from the glyph file names, such as <glyphName>.hint.xml. There may or may not be a hint file for any glyph. The hint files would be XML files which mirror the structure of T2 hints: a list of all the h stem hints, followed by all the v stem hints. These would then be followed by a list of hint substitution operators, each of which is linked to a point in the glyph file by point index or point tag and identifies the active stem hints by index.

In UFO 3 points can have persistent identifiers that would be perfect for this. Unfortunately, nothing supports those yet.


Speaking only for myself, I'd like to explore storing hints in a public way. There are a couple of ways that this could be done:

1. Officially in the spec. The earliest that this could happen would be UFO 4.
2. Through font.lib with a publicly defined key. There could be something like com.adobe.hints that follows a specification that you develop. Authoring tool developers could then follow that specification for reading and writing the data.
3. Through font.lib for <= UFO 3 in a way similar to #2. Then that could form the basis for a longer simmering official part of the spec in UFO 4.

I'm curious about the XML representation of the hint data. Do you have a sketch of that yet?

Tal

rrob...@adobe.com

May 1, 2013, 3:10:19 PM
to ufo-...@googlegroups.com
Hello Tal;

I appreciate the detailed reply.

I am glad to hear that there is interest in storing hints in the UFO format, at least in the long term. I will probably go ahead and implement a private solution, since the Adobe Type team will need support for this in the next month or so. By "private", I mean only "not in the UFO spec". I do not expect that the UFO developers will necessarily follow the example of anything that I implement. I will commit to supporting in the AFDKO whatever hint format does eventually get added to the UFO spec. I do hope to continue to get guidance from the UFO team as I develop a first-pass solution for my team's workflow. Your comments have already been very helpful.

For TT and T1 hints, I agree that you cannot convert all hint data between the two formats. I do think that there is a subset of hint data that could be shared. I know very little about TT hinting, so I will go get informed before I do any data format design work on this.

My rationale for keeping the hint data for a glyph in a file separate from the 'glif' file was that there is some usefulness in being able to tell when the glyph outlines were last touched just by the file date. However, I take your point that the glif file already contains a fair amount of data other than the paths. My logic would lead to keeping all of this in separate files, which would make working with the data a lot more complicated. In terms of data format design, keeping all the glyph-specific elements together makes more sense. What do you think of adding the glyph T1 hints as a new <pshints> element in the GLIF file, but placed in the glyph's <lib> element under some domain name until there is an official solution?

I do not yet have a sketch of the hint data. I will start working on that late next week.

This week I will be working on supporting read/write UFO in the 'tx' tool. One problem I have to deal with is that the conversion from UFO to PS (T1 or CFF) loses a lot of data that is in the UFO font. This is fine when converting UFO->PS. However, if tx is then used to write a UFO font in order to transfer any changes in the PS font back to the source UFO data, it matters that the data is lost. My current idea for dealing with this is that, to keep the 'tx' tool simple, it will always write out a basic UFO font with just the info from the source PS font. When the user wants to transfer data from a PS font to an existing UFO font, the AFDKO will provide a Python script that would run tx to make a temporary UFO font, and then merge user-specified parts of the temporary UFO font into the target UFO font.

- Read

Tal Leming

May 2, 2013, 4:19:45 PM
to ufo-...@googlegroups.com

On May 1, 2013, at 3:10 PM, rrob...@adobe.com wrote:

> What do you think of adding the glyph T1 hints in a new <pshints> element in the GLIF file, although in the glyph->lib element, under some domain name, until there is an official solution?

That sounds like a good plan. This would also allow backwards compatibility with < UFO 4 in at least an unofficial capacity.

> This week I will be working on supporting read/write UFO in the 'tx' tool. One problem I have to deal with is that the conversion from UFO to PS ( T1 or CFF) loses a lot of data that is in the UFO font. This is fine when converting UFO->PS. However, if tx is then used to write a UFO font in order to transfer any changes in the PS font back to the source UFO data, it matters that the data is lost. My current idea about dealing with this is that to keep the 'tx' tool simple, it will always write out a basic UFO font with just the info from the source PS font. When the user wants to transfer data from a PS font to an existing UFO font, the AFDKO will provide a Python script that would run tx to make a temporary UFO font, and then merge user-specified parts of the temporary UFO font into the target UFO font.

I think merging is going to be the only way to do that.

You may want to look at the extractor package:

http://svn.typesupply.com/packages/extractor/

Frederik and I have built that out to pull data from various font formats into a UFO. It's not a binary to UFO converter per se since it requires an in-memory object that has a specific API, but it shows where the bits of data in the UFO can come from. tx obviously does lots of this already...

Tal

rrob...@adobe.com

May 15, 2013, 4:48:43 PM
to ufo-...@googlegroups.com

The following is my first pass at supporting Type1 hints in a GLIF file. Let me know what you think.

The logical structure is modeled after the hint structure in the Type 2 specification.

All stem hints used anywhere in the glyph are specified in the <stemList> element. The stem hint position value is specified as an absolute real decimal value in the design space. The width value is a real decimal value, and may be negative. Negative width values must be either -20 or -21, which identify edge hints. Stem hints with other negative width values will be ignored. There is no requirement that the stem hint elements be in any order.

There may or may not be a <hintSetList> element. If this element is not present, then all the stem hints are applied over the entire glyph. If the glyph uses hint substitution, there must be a <hintSetList> element containing at least one <hintset> element. A <hintset> element specifies the set of hints that is applied starting with the specified point index. There is no requirement that the <hintset> elements be in any particular order.

Note that the <hintset> element references stem definitions by child element index within the <stemList> element, and references points by point index within the list of points for the entire glyph. This means that changing point order or contour order will invalidate the hint data. The <hintset> point index is determined by building a list of all on-curve points in the order they are encountered when parsing the GLIF file, after expanding components to outlines.
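
To make the point indexing concrete, here is a minimal Python sketch of building the on-curve point list from a GLIF. It assumes that a point with no type attribute (or with type "offcurve") is off-curve, and it does not expand components, which a real implementation would have to do first. The function name and the sample glyph are made up for illustration.

```python
import xml.etree.ElementTree as ET


def on_curve_points(glif_text):
    """Return (x, y, type) for each on-curve point, in document order.

    Components are NOT expanded here; that step is assumed to have been
    done already (or the glyph is assumed to have none).
    """
    root = ET.fromstring(glif_text)
    points = []
    for point in root.iter("point"):
        point_type = point.get("type")
        # Assumption: off-curve points carry no type, or type="offcurve".
        if point_type is None or point_type == "offcurve":
            continue
        points.append((float(point.get("x")), float(point.get("y")), point_type))
    return points


# Hypothetical sample glyph: one contour with a line point and a curve point.
SAMPLE = """<glyph name="demo" format="2">
  <outline>
    <contour>
      <point x="0" y="0" type="line"/>
      <point x="50" y="0"/>
      <point x="100" y="50"/>
      <point x="100" y="100" type="curve"/>
    </contour>
  </outline>
</glyph>"""

points = on_curve_points(SAMPLE)
# A <hintset pointIndex="1"> would then refer to points[1], the curve point.
```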

The <stemhints> "id" property is a hash of the glyph point coordinates and types. It is used to determine whether a <stemhints> element is valid. A consumer of the <stemhints> element should build a hash of the glyph point coordinates and types; if this differs from the value of the "id" property, the <stemhints> element should be ignored. The hash is built by first expanding all components to contours, and then building a list of all the points in the glyph. A source string is built by concatenating the point x value, y value, and first letter of the type value to the string, without whitespace, for each point in turn. If the final string is less than 128 characters, that is used as the hash value. If the length is 128 characters or greater, then a hash value is calculated using a SHA512 algorithm, producing a 512-bit hash, written as a 128-character hexadecimal string.

When expanding a component element to an outline element, the scale and offset values must be applied to all the point values from the component element.
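
The hash construction described above could be sketched in Python as follows. This is only an illustration under several assumptions: points are passed in as (x, y, type) tuples with components already expanded, values are concatenated with no separator at all, numbers are written with str(), and off-curve points (which have no type) contribute no letter. None of those details is pinned down by the description.

```python
import hashlib


def fingerprint(points):
    """Build the "id" value from a list of (x, y, type) tuples.

    Assumptions (not fixed by the description): no separators, numbers
    formatted with str(), and points without a type add no type letter.
    """
    parts = []
    for x, y, point_type in points:
        type_letter = (point_type or "")[:1]
        parts.append(f"{x}{y}{type_letter}")
    source = "".join(parts)
    # Short source strings are used directly as the hash value.
    if len(source) < 128:
        return source
    # Otherwise use SHA-512, written as 128 hexadecimal characters.
    return hashlib.sha512(source.encode("ascii")).hexdigest()


short_id = fingerprint([(100, 200, "line")])
long_id = fingerprint([(100, 200, "line")] * 40)
```

A consumer would recompute this value from the current outline and discard the stored hints when the two values differ.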

<glyph>
  ...
  <lib>
    <dict>
      <key>com.adobe.type.autohint</key>
      <data>
        <stemhints id="hexadecimal value">
          <stemList>
            (<hstem pos="decimal value" width="decimal value" />)*
            (<vstem pos="decimal value" width="decimal value" />)*
          </stemList>
          (<hintSetList>
            (<hintset pointIndex="non-negative integer">
              (<stemindex>non-negative integer</stemindex>)+
            </hintset>)+
          </hintSetList>)?
        </stemhints>
      </data>
    </dict>
  </lib>
</glyph>

Example from "B" in SourceCode-Regular:
<key>com.adobe.type.autohint</key>
<data>
  <stemhints>
    <stemList>
      <hstem pos="0" width="66" />
      <hstem pos="314" width="62" />
      <hstem pos="590" width="66" />
      <vstem pos="84" width="87" />
      <vstem pos="426" width="82" />
      <vstem pos="463" width="82" />
    </stemList>
    <hintSetList>
      <hintset contourindex="0" pointIndex="0">
        <stemindex>0</stemindex>
        <stemindex>1</stemindex>
        <stemindex>2</stemindex>
        <stemindex>3</stemindex>
        <stemindex>4</stemindex>
      </hintset>
      <hintset contourindex="0" pointIndex="5">
        <stemindex>0</stemindex>
        <stemindex>1</stemindex>
        <stemindex>2</stemindex>
        <stemindex>3</stemindex>
        <stemindex>5</stemindex>
      </hintset>
    </hintSetList>
  </stemhints>
</data>

- Read Roberts

Tal Leming

May 21, 2013, 1:13:36 PM
to ufo-...@googlegroups.com

On May 15, 2013, at 4:48 PM, rrob...@adobe.com wrote:

> The following is my first pass at supporting Type1 hints in a GLIF file. Let me know what you think.

I read over the Type 1 and Type 2 spec hinting sections so that I could better understand what you are proposing. I think I get it now, though I'm not 100% clear on what is needed to make a hint mask. (More on that below.)

Overall this seems good. The structure needs to be adjusted so that it can be stored in a property list. I sketched that out quickly (this includes some of the feedback from below):

com.adobe.type.autohint : {
  formatVersion : 1,
  fingerprint : "string",
  stems : [
    {
      type : "hstem | vstem",
      position : number,
      width : number (-20 and -21 are reserved values as defined in the spec)
    },
    ...
  ],
  hintmasks : [
    {
      stems : [
        {
          stem : number,
          point : number
        }
      ]
    }
  ]
}

> There may or may not be a <hintSetList> element. If this element is not present, then all the stem hints are applied over the entire glyph. If the glyph uses hint substitution, there must be a <hintSetList> element containing at least one <hintset> element. A <hintset> element specifies the set of hints that is applied starting with the specified point index. There is no requirement that the <hintset> elements be in any particular order.

Is a hintset the same thing as a hintmask? If so, I'd recommend the name hintmask since that is the latest name in the spec. We try to stick to the spec terminology for clarity as much as we can.

> Note that the <hintset> element references stem definitions by child element index within the <stemList> element, and references points by point index within the list of points for the entire glyph. This means that changing point order or contour order will invalidate the hint data. The <hintset> point index is determined by building a list of all on-curve points in the order they are encountered when parsing the GLIF file, after expanding components to outlines.

Is the point index needed for the charstring? I'm trying to understand how this data can go to and come from the charstrings.

> The <stemhints> "id" property is a hash of the glyph point coordinates and types. It is used to determine whether a <stemhints> element is valid. A consumer of the <stemhints> element should build a hash of the glyph point coordinates and types; if this differs from the value of the "id" property, the <stemhints> element should be ignored. The hash is built by first expanding all components to contours, and then building a list of all the points in the glyph. A source string is built by concatenating the point x value, y value, and first letter of the type value to the string, without whitespace, for each point in turn. If the final string is less than 128 characters, that is used as the hash value. If the length is 128 characters or greater, then a hash value is calculated using a SHA512 algorithm, producing a 512-bit hash, written as a 128-character hexadecimal string.

I would call it "fingerprint" instead of "id". We use "identifier" elsewhere in the spec for a completely different thing.

Just to make sure I'm understanding this correctly, would a lineto at (100, 200) be expressed as "100,200l"? What should the precision be for floats?

I have been thinking about this and the fingerprint makes things fragile when changing left-margin values since changing that will move points. You could make the points relative to the minimum x and y values of all points (including off curves). You would need to do the same for the positions in the stem definitions as well. This wouldn't make the fingerprint robust against anything other than moving points, so it may not be all that useful.
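
A minimal Python sketch of that normalization, assuming points are held as (x, y, type) tuples with off-curve points included (type None). The function name and sample coordinates are only for illustration.

```python
def normalize(points):
    """Shift points so the minimum x and the minimum y both become 0.

    `points` is a list of (x, y, type) tuples, off-curve points included.
    """
    if not points:
        return []
    min_x = min(x for x, _, _ in points)
    min_y = min(y for _, y, _ in points)
    return [(x - min_x, y - min_y, point_type) for x, y, point_type in points]


# The same outline before and after a left-margin change of 25 units.
original = [(103, 0, "line"), (298, 0, "line"), (444, 0, None), (545, 192, "curve")]
shifted = [(x + 25, y, point_type) for x, y, point_type in original]

same_after_shift = normalize(original) == normalize(shifted)
```

A fingerprint built from the normalized list is unchanged by the margin edit, but it still changes as soon as any single point moves relative to the others.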


> <glyph>
> ...
> <lib>
> <dict>
> <key>com.adobe.type.autohint</key>

If the data is specific to the autohinter, com.adobe.type.autohint is the best key. However, if hints defined in FontLab or elsewhere will be stored here, I'd suggest com.adobe.type.postscripthints or something more generic like that.


I would suggest adding a formatVersion so that if you change the structure of the data later you can quickly discern what you are dealing with when reading data.



I have been thinking about how this could be stored officially in GLIF. My main concern is that changing the contour data in any way invalidates the hinting data. For example, if I change the left margin then the hinting data must be discarded. That's fine if the data was produced by an algorithm that can be rerun. If the data was produced manually, it's not so nice. We try to not have data that behaves that way. I did some sketching and came up with this:

<postscripthints>
  <stems>
    <hstem point1="point identifier" point2="point identifier" identifier="unique identifier (optional)" />
    <vstem point1="point identifier" point2="point identifier" identifier="unique identifier (optional)" />
    <hedge point="point identifier" direction="up | down" identifier="unique identifier (optional)" />
    <vedge point="point identifier" direction="left | right" identifier="unique identifier (optional)" />
  </stems>
  <hintmasks>
    <hintmask>
      <stem>stem identifier</stem>
    </hintmask>
  </hintmasks>
</postscripthints>

Rather than using static coordinates in glyph units, the stems are linked to specific points that can in turn be used to calculate the values needed to create the charstring. The points are referenced by identifiers rather than index. Identifiers are sticky so they should survive moderate outline operations (i.e. metrics changes, contour reordering, direction changes) and may survive extensive outline operations (i.e. overlap removal, contour editing). I added new hedge and vedge values since there is no width value to indicate that a stem is an edge. The hint mask is based on my somewhat shaky understanding of the hintmask spec.

I translated the B from Source Code Pro-Regular into this syntax as a test case. The result is below.

There are some operators in the Type 2 spec that we haven't discussed. Specifically, cntrmask and flex. Should those be supported? Any thoughts from anyone on this would be appreciated.

Thanks,
Tal



<?xml version="1.0" encoding="UTF-8"?>
<glyph name="B" format="3">
<advance width="600"/>
<unicode hex="0042"/>
<outline>
<contour>
<point x="103" y="0" type="line" identifier="HbggNB8jlW"/>
<point x="298" y="0" type="line" smooth="yes"/>
<point x="444" y="0"/>
<point x="545" y="63"/>
<point x="545" y="192" type="curve" smooth="yes" identifier="8pFbrLXQhr"/>
<point x="545" y="282"/>
<point x="489" y="334"/>
<point x="393" y="349" type="curve"/>
<point x="393" y="353" type="line"/>
<point x="470" y="373"/>
<point x="508" y="431"/>
<point x="508" y="496" type="curve" smooth="yes" identifier="LRPWCKY4qj"/>
<point x="508" y="611"/>
<point x="417" y="656"/>
<point x="283" y="656" type="curve" smooth="yes"/>
<point x="103" y="656" type="line" identifier="cxnHEdxNQa"/>
</contour>
<contour>
<point x="187" y="376" type="line" identifier="Uq2xdW6RZc"/>
<point x="187" y="590" type="line" identifier="0jz1HOl0UG"/>
<point x="273" y="590" type="line" smooth="yes"/>
<point x="374" y="590"/>
<point x="426" y="560"/>
<point x="426" y="489" type="curve" smooth="yes" identifier="IagrAeCRhl"/>
<point x="426" y="416"/>
<point x="381" y="376"/>
<point x="269" y="376" type="curve" smooth="yes"/>
</contour>
<contour>
<point x="187" y="66" type="line" identifier="JEYqZEsWDV"/>
<point x="187" y="314" type="line" identifier="F4OYZGPovQ"/>
<point x="286" y="314" type="line" smooth="yes"/>
<point x="401" y="314"/>
<point x="463" y="277"/>
<point x="463" y="196" type="curve" smooth="yes" identifier="YfHt4H8LqQ"/>
<point x="463" y="107"/>
<point x="398" y="66"/>
<point x="286" y="66" type="curve" smooth="yes"/>
</contour>
</outline>
<postscripthints>
<stems>
<vstem point1="HbggNB8jlW" point2="JEYqZEsWDV" identifier="FNP9rQzr9D" />
<vstem point1="IagrAeCRhl" point2="LRPWCKY4qj" identifier="HVG8gkDTbx" />
<vstem point1="YfHt4H8LqQ" point2="8pFbrLXQhr" identifier="lj06rIo05z" />
<hstem point1="HbggNB8jlW" point2="JEYqZEsWDV" identifier="tR3lzQDuFw" />
<hstem point1="F4OYZGPovQ" point2="Uq2xdW6RZc" identifier="tCPsNPT5Il" />
<hstem point1="0jz1HOl0UG" point2="cxnHEdxNQa" identifier="97yeZPAcer" />
</stems>
<hintmasks>
<hintmask>
<stem>FNP9rQzr9D</stem>
<stem>lj06rIo05z</stem>
<stem>tR3lzQDuFw</stem>
<stem>tCPsNPT5Il</stem>
<stem>97yeZPAcer</stem>
</hintmask>
<hintmask>
<stem>FNP9rQzr9D</stem>
<stem>HVG8gkDTbx</stem>
<stem>tR3lzQDuFw</stem>
<stem>tCPsNPT5Il</stem>
<stem>97yeZPAcer</stem>
</hintmask>
</hintmasks>
</postscripthints>
</glyph>
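
As a rough check of how the point-linked stems in the example above could be turned back into charstring values, here is a minimal Python sketch. It assumes a vstem is measured along x and an hstem along y, that the position is the smaller of the two coordinates, and that the width is their absolute difference; hedge and vedge elements are skipped. The reduced GLIF keeps only two of the identified points from the B and places them in a single contour for brevity, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Reduced from the B above: two identified points and the two stems that use them.
GLIF = """<glyph name="B" format="3">
  <outline>
    <contour>
      <point x="103" y="0" type="line" identifier="HbggNB8jlW"/>
      <point x="187" y="66" type="line" identifier="JEYqZEsWDV"/>
    </contour>
  </outline>
  <postscripthints>
    <stems>
      <vstem point1="HbggNB8jlW" point2="JEYqZEsWDV" identifier="FNP9rQzr9D"/>
      <hstem point1="HbggNB8jlW" point2="JEYqZEsWDV" identifier="tR3lzQDuFw"/>
    </stems>
  </postscripthints>
</glyph>"""


def resolve_stems(glif_text):
    """Map stem identifier -> (tag, position, width) using the linked points."""
    root = ET.fromstring(glif_text)
    coords = {
        point.get("identifier"): (float(point.get("x")), float(point.get("y")))
        for point in root.iter("point")
        if point.get("identifier")
    }
    resolved = {}
    for stem in root.find("postscripthints/stems"):
        if stem.tag not in ("hstem", "vstem"):
            continue  # hedge/vedge are not handled in this sketch
        axis = 0 if stem.tag == "vstem" else 1  # vstem uses x, hstem uses y
        a = coords[stem.get("point1")][axis]
        b = coords[stem.get("point2")][axis]
        resolved[stem.get("identifier")] = (stem.tag, min(a, b), abs(b - a))
    return resolved


stems = resolve_stems(GLIF)
```

Because the values are derived from the points at build time, a left-margin change moves the resolved positions along with the outline and nothing stored has to be thrown away.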