UFO Normalization


Tal Leming

Jun 5, 2013, 4:42:37 PM
to ufo-...@googlegroups.com
Hi Everyone,

On several occasions I have been asked to develop a small, non-normative portion of the UFO specification that would recommend formatting guidelines for the XML files in the UFO—whitespace, element ordering, etc. The most common reason for this request is version control. Different authoring tools are writing UFOs with different formatting (but the same content) and that leads to unneeded diffs. I've given this some thought and I don't think a new spec is the ideal way to deal with this. For one thing, I don't know of any other spec that does this. If no other specs do this, there is probably a good reason. The UFO is XML for better and worse and XML was not designed with these problems in mind. Secondly, if the specification is non-normative it will be ignored, partially implemented in different ways or misinterpreted. Finally, tool-chains are complex. When these requests have come up, there has been talk of documenting "what ufoLib does" since it is the most common reader/writer. The thing is, ufoLib is built on parsers and writers that are out of our control. If the writer that ufoLib uses decides to switch the character(s) representing indentation or how it orders attributes or anything else, there isn't much ufoLib can do about that.

I do have an idea for a solution to the problem, though. Rather than a spec, I could write a tool that takes in a UFO with any formatting and outputs a UFO with normalized formatting. This way, FontForge, FontLab, Glyphs, RoboFont, ufoLib, etc. could use the tool internally. The tool could also be used independently of any authoring tool. makeotf has shown that some commonly used code can be very beneficial to everyone involved, so we have a precedent for this kind of approach. From an implementation standpoint, this is what I'm thinking:

- The tool will be written in Python as a single, standalone script. The tool must not use anything outside of the standard library (unless the standard library can be used as a fallback if an outside package can't be imported).
- The tool must be usable in two ways: as a command-line tool and as a module directly importable into Python. In both cases, it must be able to work on an entire UFO on disk, on a specific file in a UFO on disk, and on a string that is not on disk but comes with a relative path inside a UFO.
- The tool will use the ElementTree API for reading (because it is easy to work with) and will most likely use a custom XML writer so that it has control over all aspects of writing.
- The tool needs to be self-documenting. I understand that some developers may not be interested in using a tool that someone else wrote for whatever reason. For this situation, the tool needs to have documentation strings that detail what it does to what. I want this to be in the code rather than as a separate document.
- The tool will be MIT licensed.
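As an illustration of the design outlined above (ElementTree for reading, a small custom writer for output, usable on an in-memory string), here is a minimal sketch. The function names `write_element` and `normalize_text` are hypothetical, and the formatting choices (tab indentation, alphabetical attributes, self-closing empty elements) are placeholders, since the thread had not settled them at this point.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def write_element(element, level=0, indent="\t"):
    """Serialize one element and its children with explicit formatting.

    Returns a list of lines. Mixed content (text next to child
    elements) is not handled in this sketch.
    """
    pad = indent * level
    # Placeholder rule: attributes in alphabetical order.
    attrs = "".join(
        " %s=%s" % (name, quoteattr(value))
        for name, value in sorted(element.attrib.items())
    )
    text = (element.text or "").strip()
    children = list(element)
    if not children and not text:
        return ["%s<%s%s/>" % (pad, element.tag, attrs)]
    if not children:
        return ["%s<%s%s>%s</%s>" % (pad, element.tag, attrs, escape(text), element.tag)]
    lines = ["%s<%s%s>" % (pad, element.tag, attrs)]
    for child in children:
        lines.extend(write_element(child, level + 1, indent))
    lines.append("%s</%s>" % (pad, element.tag))
    return lines


def normalize_text(text):
    """Normalize the formatting of one XML document held in a string."""
    root = ET.fromstring(text)
    lines = ['<?xml version="1.0" encoding="UTF-8"?>']
    lines.extend(write_element(root))
    return "\n".join(lines) + "\n"
```

Two inputs that differ only in whitespace and attribute order come out identical, which is the property version control needs.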

My only concern about this is speed. Normalizing a large UFO could be an expensive operation. Allowing the tool to work on in-memory strings and only on specific files will help some. The tool could store file modification times in the font lib and reference them when working on an entire UFO. The tool could also use lxml (a Python wrapper around libxml2) and fall back to standard ElementTree if lxml isn't available. lxml is reportedly much faster than ElementTree, though I'm not sure how much of a speed boost will be apparent in a UFO since the main expense is typically I/O. In any case, speed is going to need to be considered carefully.
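The modification-time idea can be sketched as follows. This is only an illustration: the function name `files_needing_normalization` and the shape of `stored_mtimes` (a dict mapping relative paths to timestamps) are assumptions, and where that dict would live (for example in the font lib) is left open.

```python
import os


def files_needing_normalization(ufo_path, stored_mtimes):
    """Return relative paths whose modification time differs from the stored one.

    `stored_mtimes` is a hypothetical dict of {relative path: mtime}
    saved after the previous normalization run.
    """
    changed = []
    for directory, _, file_names in os.walk(ufo_path):
        for file_name in file_names:
            full_path = os.path.join(directory, file_name)
            relative_path = os.path.relpath(full_path, ufo_path)
            # A missing entry or a different timestamp both count as changed.
            if stored_mtimes.get(relative_path) != os.path.getmtime(full_path):
                changed.append(relative_path)
    return sorted(changed)
```

Only the files returned here would need to be read, parsed, and rewritten on the next pass.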

As for the actual formatting that should be used... I think we could start with what ufoLib does in the current version of Python. That won't be enough though as we need to figure out things like global float precision (and exceptions to the global rule) and stuff like that. I think our goal should be readable files that are as compact as possible.

If this is something that you are interested in, please let me know. Apart from some specific workflows, I don't know how widely version control is being used with UFOs. Before things move along too far it would be good to know if this is actually solving a common problem or not.

Finally, while I recognize the need for this tool, personally I don't have a need for it. I have a policy of not writing complex code that solves a problem that I don't have. So, I'm not working on this in any urgent way. If anyone wants to sponsor the development of this to speed things along, please get in touch. Otherwise, I'll work on this when my time allows.

Tal

Antonio Cavedoni

Jun 5, 2013, 5:05:19 PM
to ufo-...@googlegroups.com
On Wed, Jun 5, 2013 at 1:42 PM, Tal Leming <t...@typesupply.com> wrote:
> If this is something that you are interested in, please let me know. Apart
> from some specific workflows, I don't know how widely version control is
> being used with UFOs. Before things move along too far it would be good
> to know if this is actually solving a common problem or not.

Personally, as the author of a (hackish prototype, but still)
versioning tool for UFOs, I’m not sure this is needed. My tool used to
rely on the formatting of XML and that proved to be so brittle in my
testing it wasn’t even funny.

Furthermore, the performance concerns in Tal’s proposal get compounded
by the fact that normalizing a UFO means normalizing all the
individual glyph files in the font. Depending on the number of glyphs
that can be a lot of I/O – even with SSDs – and it needs to be
performed on both the base revision and the modified one since we
can’t assume they are going to be normalized to begin with.

Because if they are guaranteed to be, to the point where other tools
are supposed to interoperate with it, this normalization might as well
be part of the spec itself, no?

But we’re talking about XML here: the fact that whitespace around XML
elements is not significant is supposed to be a feature. If we think
that is a hindrance, maybe we should consider moving UFO4 to a format
where that assumption doesn’t hold?

Ah, opinions.
--
Antonio

Miguel Sousa

Jun 5, 2013, 10:22:39 PM
to ufo-...@googlegroups.com
At Adobe we're moving to a UFO-based workflow and keeping the source files under version control. We work with various external designers and can't force them to use the same tools as we do. So we're very much interested in a normalization tool. I don't know what's possible in terms of sponsorship, but I'll ask. In the meantime, if there's anything I can help with code-wise (perhaps writing unit tests) let me know.

Regarding the potential I/O bottleneck, let me share something with you. Read Roberts has been adding UFO support to a few FDK tools, namely makeotf and tx (plus checkoutlines and autohint, which rely on the latter). One of the things checkoutlines is used for in our production workflow is removing path overlaps. In the new workflow, Read implemented a glyph hash mechanism for testing whether a glyph has been processed by checkoutlines and still matches the source glyph. In a UFO font with 1889 glyphs the first pass took 52 seconds. The second pass, in which checkoutlines still needed to read all the source GLIF files, compute the hash, and compare it with the saved hash, took less than 2 seconds. All done via Python. I think this example shows that 1) storing file modification times will speed the process quite a lot, and 2) opening, lightly processing, and closing a relatively large number of GLIF files doesn't take very much time.
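The general shape of such a hash check might look like the sketch below. It is not the FDK's actual mechanism: hashing the raw file bytes with SHA-256 is an assumption made for illustration, and the names `glyph_hash` and `needs_processing` are hypothetical.

```python
import hashlib


def glyph_hash(glif_bytes):
    """Hash the raw bytes of a GLIF file (illustrative choice of SHA-256)."""
    return hashlib.sha256(glif_bytes).hexdigest()


def needs_processing(glif_bytes, saved_hash):
    """True when the glyph has no saved hash or has changed since it was saved."""
    return saved_hash != glyph_hash(glif_bytes)
```

A second pass then reads and hashes every file but skips the expensive step for glyphs whose hash still matches.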

Miguel

Tal Leming

May 20, 2015, 2:04:31 PM
to ufo-...@googlegroups.com
Hi Everyone,

This issue has come up again, so I took some time today and started a draft of a normalization specification. It's here:


This is a very quick and incomplete sketch, so contributions and thoughts would be greatly appreciated.


Once this document is worked out I'll start work on a Python tool that will act as an example implementation.

Thanks,
Tal

David Raymond

May 21, 2015, 3:51:01 AM
to ufo-...@googlegroups.com
Thanks for looking at this again; it will be very useful to have an agreed normalization spec!

A few quick comments:

XML indents - Spaces seem to be used more commonly than tabs and seem to be preferable.  When I last looked I _think_ RoboFab, RoboFont, FontForge and Glyphs all used 2 spaces for indents in glif files.  Also the majority of those don't indent the <dict>, so I don't know if that is a useful convention.

For elements with no contents you say "A space not preceed the />" but then have a space in the examples!  No space seems the normal use.

Attribute ordering - Specifying an order for at least some seems preferable to alphabetic in terms of human-readability.  For example in glif files, tools currently keep, say, xOffset next to yOffset.  I've so far been simply using the order attributes are defined in the spec which, for glif files, leads to an order of 'pos', 'width', 'height', 'fileName', 'base', 'xScale', 'xyScale', 'yxScale', 'yScale', 'xOffset', 'yOffset', 'x', 'y', 'angle', 'type', 'smooth', 'name', 'format', 'color', 'identifier'.
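Ordering attributes by a predetermined list rather than alphabetically could be sketched like this. The list is the one quoted above; the function name `order_attributes` and the rule for attributes missing from the list (placed last, alphabetically) are assumptions for illustration.

```python
GLIF_ATTRIBUTE_ORDER = [
    'pos', 'width', 'height', 'fileName', 'base',
    'xScale', 'xyScale', 'yxScale', 'yScale', 'xOffset', 'yOffset',
    'x', 'y', 'angle', 'type', 'smooth', 'name', 'format', 'color',
    'identifier',
]


def order_attributes(attributes, order=GLIF_ATTRIBUTE_ORDER):
    """Return (name, value) pairs sorted by their position in `order`.

    Attributes not in `order` go last, alphabetically (an assumption).
    """
    rank = {name: index for index, name in enumerate(order)}
    return sorted(
        attributes.items(),
        key=lambda item: (rank.get(item[0], len(order)), item[0]),
    )
```

With this rule, `xOffset` and `yOffset` stay adjacent, which a purely alphabetical sort does not guarantee once other attributes are present.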

Currently on holiday, so may comment more next week!

David

Tal Leming

May 21, 2015, 9:59:06 AM
to ufo-...@googlegroups.com
On May 21, 2015, at 3:51 AM, David Raymond <david_...@sil.org> wrote:

XML indents - Spaces seem to be used more commonly than tabs and seem to be preferable.  When I last looked I _think_ RoboFab, RoboFont, FontForge and Glyphs all used 2 spaces for indents in glif files.  Also the majority of those don't indent the <dict>, so I don't know if that is a useful convention.

I think ufoLib uses four spaces. I'm leaning towards tabs to reduce file size. 1 "\t" vs 4 * " " is a big difference across a full UFO. Maybe I'm thinking about this too much.

For elements with no contents you say "A space not preceed the />" but then have a space in the examples!  No space seems the normal use.

Oops. Quick draft.

Attribute ordering - Specifying an order for at least some seems preferable to alphabetic in terms of human-readability.

I prefer a logical ordering over alphabetical too, but I'm not sure how possible that will be. I need to do some experiments to see if this can be done without writing a custom XML writer.

Thanks!

Tal

Jack Jennings

May 21, 2015, 10:31:40 AM
to ufo-...@googlegroups.com
Perhaps it would be helpful to know what a library should do if it encounters a UFO that doesn't conform to the normalization rules, even if that behavior is just to reformat it correctly?

Bob Hallissy

May 21, 2015, 10:49:38 AM
to ufo-...@googlegroups.com
On 5/21/2015 9:31 AM, Jack Jennings wrote:
Perhaps it would be helpful to know what a library should do if it encounters a UFO that doesn't conform to the normalization rules, even if that behavior is just to reformat it correctly?

I don't think the logical meaning of the UFO changes if it isn't in the normal form being discussed.  So I would hope libraries reading UFO would be lenient with regard to normalization.

Rather, the concern (from my perspective) is for all UFO writers to generate a consistent form so that change detection (e.g., used by version control systems) works well.

Bob

Jack Jennings

May 21, 2015, 10:57:05 AM
to ufo-...@googlegroups.com
Yes, sorry, I was referring to writing specifically. Reading should never be an action that alters data.

Dave Crossland

May 21, 2015, 11:00:13 AM
to ufo-...@googlegroups.com

On 21 May 2015 at 17:57, Jack Jennings <j...@ckjennin.gs> wrote:
Yes, sorry, I was referring to writing specifically. Reading should never be an action that alters data.

Are you suggesting writers should also be able to retain the existing order?

Jack Jennings

May 21, 2015, 11:07:06 AM
to ufo-...@googlegroups.com, da...@lab6.com
I assume that writers should "clean up" syntax to conform to the normalized spec when they write some part of the UFO. Mostly I was suggesting that this behavior be explicitly outlined and defined as correct.

Lasse Fister

May 21, 2015, 11:15:51 AM
to ufo-...@googlegroups.com
On 05/21/2015 04:49 PM, Bob Hallissy wrote:

Rather, the concern (from my perspective) is for all UFO writers to generate a consistent form so that change detection (e.g., used by version control systems) works well.

The purpose of the tool as proposed is to take the output of any UFO writer and normalize it, then you can send the normalized UFO to version control. UFO writers don't have to generate a consistent form.

Lasse

Tal Leming

May 21, 2015, 11:24:49 AM
to ufo-...@googlegroups.com
Right. Requiring a specific output of all writers would be onerous. What I'm proposing is an independent post-processor (and a spec defining what it should output) that could be used to process a UFO between the time that it was written to disk and when it is handed over to version control. This way the version control system only sees UFOs that are formatted in one way regardless of the tool that was used to create them.

Tal

David Raymond

May 21, 2015, 5:07:55 PM
to ufo-...@googlegroups.com
I've been working on such a tool (UFOconvert) for a while which can do much of what is in the current normalization spec.  It uses its own xml output code and so can do the attribute sorting - the code is in https://github.com/silnrsi/pysilfont but is still under development.  Designed to be able to do some conversion between UFO versions as well as normalizing.

Should the normalization spec cover glif file naming?  I was planning to use the suggested algorithm from the UFO 3 spec and consider that part of normalization, though with the option of turning that off.

David

Jack Jennings

May 22, 2015, 12:40:25 AM
to ufo-...@googlegroups.com
Ahh, my mistake. Sorry for derailing this topic.

Behdad Esfahbod

May 22, 2015, 1:33:51 AM
to ufo-...@googlegroups.com
On 15-05-21 06:59 AM, Tal Leming wrote:
>
>> Attribute ordering - Specifying a order for at least some seems preferable
>> to alphabetic in terms of human-readability.
>
> I prefer a logical ordering over alphabetical too, but I'm not sure how
> possible that will be. I need to do some experiments to see if this can be
> done without writing a custom XML writer.

The fonttools xmlWriter supports this. You can give it a list of tuples,
instead of named arguments, if you want to preserve attribute order. Eg:

writer.begintag('TTGlyph', [
    ("name", glyphName),
    ("xMin", glyph.xMin),
    ("yMin", glyph.yMin),
    ("xMax", glyph.xMax),
    ("yMax", glyph.yMax),
])

b

David Raymond

May 28, 2015, 11:00:32 AM
to ufo-...@googlegroups.com
Here are some fuller comments on the proposed normalization spec.

In general I think there's a need to separate discussion of what should be in the spec from the tools that are needed to support it. BTW we are committed to producing a normalizer, and I'd expect we would adjust those tools to follow an agreed normalization spec by default - though many parts could then be changed by parameters if a particular project preferred alternative approaches.

Going through the current proposal:

XML Declaration

 Agreed

White-space and indentation

1) Would prefer two spaces to tab.  When tabs are used, if displayed by an editor or browser that uses 8-spaces for tabs, the XML becomes hard to read

2) Currently most tools don't indent <dict> in plists, eg:

<plist version="1.0">
<dict>
  <key>ascender</key>
  <integer>1650</integer>
  ...

No strong views on this, but others who write the tools might have!

Elements, tags

Agreed (though currently examples with self-closing tags show extra spaces!)

Elements, attributes

These should be ordered in a logical, predetermined order rather than alphabetic.  Simplest order seems to be the order they are defined within the UFO spec

Elements, comments

Do any current tools use comments? (I don't know!)

Elements, decimal precision

Ten digits past the decimal seems a bit high, and more likely to lead to rounding errors when round-tripped through different tools. However, I've no relevant experience to say what level of precision is needed.

Also note there's been recent discussion elsewhere about whether <real> values that are exact integers should be shown as just 1 or 1.0 - no view on this myself (as long as it's clear one way or the other, which it currently is).

GLIF Element Ordering

Agreed that there needs to be a fixed order.  Currently I think tools tend to follow the order they are defined in the UFO spec, ie [advance, unicode, note, image, guideline, anchor, outline, lib] so might that be a better order?

Plist Element Ordering

Currently nothing specified in the spec, but alphabetic seems the most common order used by tools.

Default values

Not thought about this one before, so no view on this yet!

Item not covered - .glif file naming

Whilst this is perhaps off-spec for an "XML Normalization" spec, it does need to be covered by any normalization tool, since some font tools do change .glif file names.  Naming these using the suggested algorithm from the UFO 3.0 spec seems the best approach.

Other items

From my experience testing round-tripping 'normalized' fonts through different tools, I expect other items will be found that need to be covered by normalisation!

David

David Raymond

Jun 1, 2015, 5:35:35 AM
to ufo-...@googlegroups.com

On Thursday, 28 May 2015 16:00:32 UTC+1, David Raymond wrote:

Plist Element Ordering

Currently nothing specified in the spec, but alphabetic seems the most common order used by tools.


Alphabetic sorting only applies to dictionary plists.  For array plists like layercontents.plist either an order needs to be defined, or existing order preserved.

The spec also needs to cover the formatting of the DOCTYPE line for plists, eg

<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">

David
 

David Raymond

Jun 5, 2015, 12:21:10 PM
to ufo-...@googlegroups.com

Here are a few additional comments on the proposed spec - either clarification of earlier comments or new suggestions.


White-space and indentation


Note that examples in the UFO spec follow my 2 suggestions!


Elements, comments


Having thought about it, I now agree with the proposal


Elements, attributes


Where attributes contain numbers, positive numbers should not have a leading + sign.


Where attributes contain real numbers, there should be a maximum of 6 digits following the decimal point.


Elements, decimal precision


Since numbers in ttf are not more than 16 bit, there is no need to store more than 6 digits.  Storing in higher precision within XML will increase the likelihood of rounding errors coming in simply from round-tripping through font tools.  Clearly font tools will want to use higher precision internally - again to reduce rounding errors.


In terms of formatting when the value is an integer, since the UFO spec specifies all values that might have decimals as “Integer or float”, the proposed format of <integer>n</integer> within XML does seem the best standard.


GLIF Element Ordering


There is one exception to the principle of not reordering contours and components - anchors within UFO 2 (which, by convention, are single point contours with type “move”).  Some tools put these before other contours/components and others put them after.  Suggest normalizers should put them first - but otherwise maintain their order.


Plist Element Ordering


Currently nothing specified in the spec, but ascending key order is suggested in the UFO spec conventions.


See also note on <dict> order in general below.


Default values


Having thought about it, I now agree with the proposal.  Note that I would not expect to implement this within my normalizer unless the need arose, since I think existing tools work this way anyhow.


Item not covered - <dict> sorting


Although the order of the main dict for .plist files has been raised, problems also occur in dictionaries within individual elements.  Since, with ElementTree for example, maintaining any original order is hard, sorting all dicts in ascending key order should be part of the spec.  If accepted, no need to mention the order for Plists specifically.


Tal Leming

Jun 5, 2015, 2:06:20 PM
to ufo-...@googlegroups.com
On May 28, 2015, at 11:00 AM, David Raymond <david_...@sil.org> wrote:

White-space and indentation

1) Would prefer two spaces to tab.  When tabs are used, if displayed by an editor or browser that uses 8-spaces for tabs, the XML becomes hard to read

Most modern editors allow for the adjustment of the width used to display tabs so I don't think this should be a huge consideration.

2) Currently most tools don't indent <dict> in plists, eg:

<plist version="1.0">
<dict>
  <key>ascender</key>
  <integer>1650</integer>
  ...

I don't want to get too deep into element specific formatting that deviates from general element formatting.

Elements, attributes

These should be ordered in a logical, predetermined order rather than alphabetic.  Simplest order seems to be the order they are defined within the UFO spec

I agree that there should be a set order but I don't necessarily agree that it should follow the UFO order.

Elements, comments

Do any current tools use comments? (I don't know!)

It doesn't matter. Preserving comments is not behavior that should be expected.

Elements, decimal precision

Ten digits past the decimal seems a bit high, and more likely to lead to rounding errors when round-tripped through different tools. However, I've no relevant experience to say what level of precision is needed.

It's not "always 10", it's a maximum of 10 to allow for high precision. A number should be written with the fewest digits that represent it without loss, up to 10 digits past the decimal point. If a number can be losslessly represented with 2, then 2 should be used.
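One way to read this rule is sketched below. The name `format_number` is hypothetical, and writing integer-valued floats without a decimal point is an assumption here, since the thread leaves that question open for Property Lists. Infinite and NaN values are not handled.

```python
def format_number(value, max_digits=10):
    """Format a number with as few decimal digits as possible, up to `max_digits`."""
    # Assumption: integer-valued numbers are written without ".0".
    if float(value) == int(value):
        return str(int(value))
    # Render at the maximum precision, then drop trailing zeros.
    text = "%.*f" % (max_digits, value)
    text = text.rstrip("0").rstrip(".")
    # Values that round to zero at this precision can leave "-0".
    if text in ("-0", ""):
        return "0"
    return text
```

For example, `format_number(1.25)` gives `"1.25"`, while a value such as one third is cut off at ten digits past the decimal point.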

Also note there's been recent discussion elsewhere about whether <real> values that are exact integers should be shown as just 1 or 1.0 - no view on this myself (as long as it's clear one way or the other, which it currently is).

My draft addresses this for everything except Property Lists. I need to think through what happens in PLIST if something is converted from <real> to <integer> because the spec may declare that integers are not allowed.

GLIF Element Ordering

Agreed that there needs to be a fixed order.  Currently I think tools tend to follow the order they are defined in the UFO spec, ie [advance, unicode, note, image, guideline, anchor, outline, lib] so might that be a better order?

I roughed out an order based on the things that I've needed to extract from GLIFs in the past, so they are grouped in a logical order rather than alphabetic or anything else. Arbitrary. :)

Plist Element Ordering

Currently nothing specified in the spec, but alphabetic seems the most common order used by tools.

Dicts will be alphabetic. Lists will retain their order.

Item not covered - .glif file naming

Whilst this is perhaps off-spec for an "XML Normalization" spec, it does need to be covered by any normalization tool, since some font tools do change .glif file names.  Naming these using the suggested algorithm from the UFO 3.0 spec seems the best approach.

This is out of scope in normalization. I don't want normalizers renaming files. The UFO 3 spec has a robust naming algorithm.



On Jun 1, 2015, at 5:35 AM, David Raymond <david_...@sil.org> wrote:


On Thursday, 28 May 2015 16:00:32 UTC+1, David Raymond wrote:

Plist Element Ordering

Currently nothing specified in the spec, but alphabetic seems the most common order used by tools.


Alphabetic sorting only applies to dictionary plists.  For array plists like layercontents.plist either an order needs to be defined, or existing order preserved.

Preserved. The order in those is critical.

The spec also needs to cover the formatting of the DOCTYPE line for plists, eg

<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">

Yes. I haven't had time to write the PLIST information yet.



On Jun 5, 2015, at 12:21 PM, David Raymond <david_...@sil.org> wrote:

White-space and indentation

Note that examples in the UFO spec follow my 2 suggestions!

Spaces vs. tabs! Fight!

Elements, attributes

Where attributes contain numbers, positive numbers should not have a leading + sign.

Right. Does that actually need to be stated? I think the + would cause all sorts of interpretation errors with existing tools.

Where attributes contain real numbers, there should be a maximum of 6 digits following the decimal point.

Hm. I'm not so sure about that.

Elements, decimal precision

Since numbers in ttf are not more than 16 bit, there is no need to store more than 6 digits.

UFO doesn't just represent TTF.

 Storing in higher precision within XML will increase the likelihood of rounding errors coming in simply from round-tripping through font tools.

Unless something is doing non-standard rounding, how will rounding errors be introduced?

In terms of formatting when the value is an integer, since the UFO spec specifies all values that might have decimals as “Integer or float”, the proposed format of <integer>n</integer> within XML does seem the best standard.

I need to review the spec to make sure this is the case. (see above)

GLIF Element Ordering

There is one exception to the principle of not reordering contours and components - anchors within UFO 2 (which, by convention, are single point contours with type “move”).  Some tools put these before other contours/components and others put them after.  Suggest normalizers should put them first - but otherwise maintain their order.

Hm. That's going to be tricky but I guess the spec should say something about it. Anchors were unofficially (and therefore inconsistently) represented in < UFO 3 as contours that contain a single, named moveto point. I suppose that we could say that normalization of UFO 1/2 should catch this and move them to the top or bottom of the stack. I'll think about it.
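Detecting these implied anchors and moving them could be sketched as follows. The thread leaves open whether they go to the top or the bottom; this sketch assumes the top, as David suggested. The function names are hypothetical, and the detection rule is the one described above: a contour with a single named point of type "move".

```python
import xml.etree.ElementTree as ET


def is_implied_anchor(element):
    """True for a contour holding exactly one named point of type "move"."""
    if element.tag != "contour":
        return False
    points = element.findall("point")
    return (
        len(points) == 1
        and points[0].get("type") == "move"
        and points[0].get("name") is not None
    )


def move_anchors_first(outline):
    """Reorder the children of an <outline> element in place.

    Implied anchors go first; relative order within each group is kept.
    """
    anchors, others = [], []
    for child in list(outline):
        (anchors if is_implied_anchor(child) else others).append(child)
    outline[:] = anchors + others
    return outline
```

Because the partition is stable, contours and components that are not anchors keep their original order, which matters for the rest of the outline.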

Plist Element Ordering

Currently nothing specified in the spec, but ascending key order is suggested in the UFO spec conventions.

See also note on <dict> order in general below.

I'll get to it when I have some time!

Default values

Having thought about it, I now agree with the proposal.  Note that I would not expect to implement this within my normalizer unless the need arose, since I think existing tools work this way anyhow.

I don't think ufoLib filters out defaults.


Item not covered - <dict> sorting

Although the order of the main dict for .plist files has been raised, problems also occur in dictionaries within individual elements.  Since, with ElementTree for example, maintaining any original order is hard, sorting all dicts in ascending key order should be part of the spec.  If accepted, no need to mention the order for Plists specifically.

Maintaining an element order is not hard. Dicts are by definition unsorted in memory so we must specify the order when writing them to a file. It will be alphabetic, ascending.
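The rule "dicts alphabetic, lists preserved" applied to nested plist data might look like this minimal sketch. The name `sort_dicts` is hypothetical, and the code relies on Python 3.7+ dicts keeping insertion order.

```python
def sort_dicts(value):
    """Recursively sort dict keys in ascending order; leave list order alone."""
    if isinstance(value, dict):
        return {key: sort_dicts(value[key]) for key in sorted(value)}
    if isinstance(value, (list, tuple)):
        # Order in arrays (e.g. layercontents.plist) is meaningful, so keep it.
        return [sort_dicts(item) for item in value]
    return value
```

In Python 3, `plistlib.dumps` sorts dictionary keys by default (`sort_keys=True`), so a writer built on it may get this behavior without extra work.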


Thanks,
Tal

Miguel Sousa

Jun 5, 2015, 3:13:08 PM
to ufo-...@googlegroups.com

On 05/06/2015, at 11:06, Tal Leming <t...@typesupply.com> wrote:

Item not covered - .glif file naming

Whilst this is perhaps off-spec for an "XML Normalization" spec, it does need to be covered by any normalization tool, since some font tools do change .glif file names.  Naming these using the suggested algorithm from the UFO 3.0 spec seems the best approach.

This is out of scope in normalization. I don't want normalizers renaming files. The UFO 3 spec has a robust naming algorithm.

I think that an exception should be made for this case.
.glif file name differences are something that we (Adobe) run into a lot.
I believe the goal of the Normalizer is to get the same output, no matter where or how the input UFO was produced.
If .glif file names are not normalized, there’s no guarantee that they will remain the same, if for example a normalized UFO goes thru a series of tools (Glyphs, Robofab, RoboFont, defcon, etc.) and is normalized afterwards. I think that’s bad.

M.

David Raymond

Jun 8, 2015, 9:34:47 AM
to ufo-...@googlegroups.com


On Friday, 5 June 2015 19:06:20 UTC+1, Tal Leming wrote:
...

Thanks for your various responses.  With most I'll just wait now until you've time to produce the next version of the spec and comment further then, if needed.  Many are just style issues,  so not worth worrying too much about - and in my normaliser they are controlled by parameters so individual projects could choose their preferred style if needed.

On decimal precision, I'm only concerned since I have seen rounding errors introduced by different font tools in the past, so was looking to avoid them by not storing excess precision - but tools might have improved since my tests a year ago.  When I do another round of tests, I can do them with high precision and see if problems still occur.

I do think it is important to separate the spec for the normalised form from the spec for the normalizer, with the former covering everything where the UFO spec allows variations - eg .glif file names following UFO 3 algorithm, Unicode being upper case, not having a + sign before positive numbers.

Then, if font tool designers want to follow the normalized form, they know what to do in all cases.

Once the normalized form is agreed, the normalizer spec can be more pragmatic in terms of not coding for things that don't currently happen in font tools.

David


Tal Leming

Jun 23, 2015, 1:41:54 PM
to ufo-...@googlegroups.com
Good point. It will make normalization more complex, but this is something that needs to happen.

Thanks,
Tal

David Raymond

Jul 9, 2015, 8:42:14 AM
to ufo-...@googlegroups.com
A couple more thoughts for the spec:

1) If a plist is empty (eg groups.plist) then no file should be written to disk - some tools currently (or used to) produce empty groups and kerning files
2) Images should only be written out if referenced by a glif

 Like .glif file naming, (2) is more than "XML normalisation", but I think should be in the scope of UFO normalizing
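Both suggestions can be sketched briefly. The function names are hypothetical, and the sketch assumes images are referenced through the `fileName` attribute of an `<image>` element directly under `<glyph>`, as in UFO 3 GLIF.

```python
import xml.etree.ElementTree as ET


def should_write_plist(data):
    """Suggestion 1: skip writing a plist whose top-level dict or list is empty."""
    return len(data) > 0


def referenced_images(glif_texts):
    """Collect the image file names referenced by a set of GLIF documents."""
    names = set()
    for text in glif_texts:
        root = ET.fromstring(text)
        for image in root.findall("image"):
            file_name = image.get("fileName")
            if file_name:
                names.add(file_name)
    return names


def unreferenced_images(image_file_names, glif_texts):
    """Suggestion 2: image files that no glif refers to."""
    return sorted(set(image_file_names) - referenced_images(glif_texts))
```

A normalizer could skip the files returned by `unreferenced_images` when writing the images directory.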

David

David Raymond

Jul 9, 2015, 10:46:12 AM
to ufo-...@googlegroups.com
The first version of SIL's normalizer has now been released in https://github.com/silnrsi/pysilfont.  It covers most of the items in Tal's draft spec (and more), though currently the default options for some are different.  Many defaults can be overridden by command-line parameters.

Feedback welcome!

David

The UFOlib behind this has been designed with the intention of building further scripts for modifying UFOs and will be extended as needs arise.

Tal Leming

Jul 9, 2015, 3:04:05 PM
to ufo-...@googlegroups.com
Nice! I've been working on an example implementation to test the feasibility of my draft:


It's not complete, but a lot of it is working. (Still, don't use it for production use!)

Miguel has a lot of pull requests that I need to review. Hopefully tomorrow...

Thanks,
Tal



Tal Leming

Jul 13, 2015, 10:03:12 AM
to ufo-...@googlegroups.com

> On Jul 9, 2015, at 8:42 AM, David Raymond <david_...@sil.org> wrote:
>
> A couple more thoughts for the spec:
>
> 1) If a plist is empty (eg groups.plist) then no file should be written to disk - some tools currently (or used to) produce empty groups and kerning files

Good point. I'll add it to the notes and the in-progress implementation.

> 2) Images should only be written out if referenced by a glif
>
> Like .glif file naming, (2) is more than "XML normalisation", but I think should be in the scope of UFO normalizing

This seems fine, but it brings up a question: what should happen to image elements in GLIF that reference a non-existent image? If we remove those, should we also remove components that reference non-existent glyphs?

Thanks,
Tal

David Raymond

Jul 13, 2015, 10:35:27 AM
to ufo-...@googlegroups.com


> 2) Images should only be written out if referenced by a glif
>
>  Like .glif file naming, (2) is more than "XML normalisation", but I think should be in the scope of UFO normalizing

This seems fine, but it brings up a question: what should happen to image elements in GLIF that reference a non-existent image? If we remove those, should we also remove components that reference non-existent glyphs?

Thanks,
Tal

My inclination would be just to flag a warning for things like this - it's an error in the UFO but not a normalization issue as such.

David

David Raymond

Aug 14, 2015, 12:16:01 PM
to Unified Font Object Specification

On Thursday, 9 July 2015 20:04:05 UTC+1, Tal Leming wrote:
Nice! I've been working on an example implementation to test the feasibility of my draft:



Hi Tal,


Generally works well. Here's what I've noticed so far:

- Elements with no content are not written with a self-closing tag but in the <string></string> form

- String elements containing &amp; have that output just as &

- If a glif exists in the glyphs directory but is not in contents.plist it still gets output

- If a string value includes the copyright symbol, the code fails with "UnicodeEncodeError: 'ascii' codec can't encode character u'\xa9' in position 365: ordinal not in range(128)"

- In the DOCTYPE statement for plists, it uses just Apple rather than "Apple Computer".  I realise there is no standard for this - examples in both formats even on the Apple website. "Apple Computer" seems more common, including in sample ufo fonts from various sources and in the examples on unifiedfontobject.org.

- When a glif lib dict contains data, the data is lost, eg

  <lib>
  <dict>
  <key>com.adobe.type.autohint</key>
  <data>
  <hintSetList>
    <hintset pointTag="hr00">
      <hstem pos="862" width="600" />
      <vstem pos="88" width="174" />
    </hintset>
  </hintSetList>
  </data>
  </dict>
  </lib>

changed to

<lib>
<dict>
<key>com.adobe.type.autohint</key>
<data>
</data>
</dict>
</lib>
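Two of the issues reported above, the unescaped ampersand and the `UnicodeEncodeError`, are the kind a custom writer avoids by escaping text explicitly and encoding as UTF-8 rather than ASCII. A minimal sketch, with the hypothetical helper name `serialize_string`:

```python
from xml.sax.saxutils import escape


def serialize_string(value):
    """Return a plist <string> element as UTF-8 bytes.

    escape() turns &, < and > into entities; encoding explicitly as
    UTF-8 avoids the implicit ASCII codec that fails on characters
    such as the copyright symbol.
    """
    return ("<string>%s</string>" % escape(value)).encode("utf-8")
```

This does not address the lost `<data>` content, which is a separate problem.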
 
Hope this is helpful.

David

David Raymond

Aug 14, 2015, 12:18:15 PM
to Unified Font Object Specification


On Thursday, 9 July 2015 20:04:05 UTC+1, Tal Leming wrote:
Nice! I've been working on an example implementation to test the feasibility of my draft

Is there an updated version of the spec available?

David 

Miguel Sousa

Aug 15, 2015, 1:36:04 AM
to ufo-...@googlegroups.com
This is something that Read Roberts needs to fix on the FDK side. It’s on his to do list. I’ll see if I can nudge it up.

M.