I had a chat with Jonathan Kew and Karl Tomlinson this morning about Harfbuzz. Here are some notes. Further feedback appreciated.
* Using Harfbuzz for shaping and platform APIs for rasterization should work OK. Shapers generally don't take hinting into account. They may want glyph bounds sometimes, but we can get those from the platform. There shouldn't be much duplicate caching between the platform and Harfbuzz.
* Pros for using Harfbuzz on Mac/Windows:
** Core Text is weak at OpenType
** We can implement whatever OpenType features we want
** Consistent shaping across platforms
** We get full control over which OpenType features are enabled (useful since we want to expose that control to authors via CSS)
** Currently we get performance wins by making certain assumptions (e.g., fonts don't shape across ASCII spaces, shaping doesn't depend on line break placement) which are untrue in general; we need APIs to tell us when it's OK to use those assumptions, but platforms don't have them
** We get control over performance-critical code so we can optimize it, parallelize it, etc.
** Enables shaping on Windows Mobile, where there's no Uniscribe
* Cons for using Harfbuzz on Mac/Windows:
** Lacks AAT support
** Script support lags behind Uniscribe
** Inconsistent shaping between apps on the same platform (but Windows apps are already wildly inconsistent, and Safari isn't consistent with other Mac apps)
* Download size impact should be measured, but it is not expected to be prohibitive
* Our feeling: support Harfbuzz alongside Core Text on Mac and Uniscribe on Windows, and decide which system to use based on the font/script/language of each text run. Prefer Harfbuzz: use it on Mac unless the font has AAT, and on Windows unless the script is one Harfbuzz doesn't support.
* Required work steps:
** Improve Harfbuzz so it's comparable to Pango in shaping functionality
** Switch from Pango to Harfbuzz for GTK
** Integrate Harfbuzz for Mac and Windows
** Keep improving Harfbuzz script/shaper coverage
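To make the "decide per text run" idea in the notes above concrete, here is a rough sketch of the selection rule. It is only an illustration: the `Font` record, its `has_aat` flag, the platform strings, and the `HARFBUZZ_SCRIPTS` set are all hypothetical placeholders, not anything that exists in Gecko or Harfbuzz.

```python
from dataclasses import dataclass
from enum import Enum


class Shaper(Enum):
    HARFBUZZ = "harfbuzz"
    CORE_TEXT = "coretext"
    UNISCRIBE = "uniscribe"


@dataclass(frozen=True)
class Font:
    name: str
    has_aat: bool = False  # hypothetical flag: font carries AAT layout tables


# Hypothetical placeholder: scripts we assume Harfbuzz can already shape.
HARFBUZZ_SCRIPTS = frozenset({"Latn", "Arab", "Hebr"})


def choose_shaper(platform: str, font: Font, script: str) -> Shaper:
    """Pick a shaper for one text run, preferring Harfbuzz."""
    # On Mac, only AAT fonts need the platform shaper.
    if platform == "mac" and font.has_aat:
        return Shaper.CORE_TEXT
    # On Windows, fall back only for scripts Harfbuzz does not cover yet.
    if platform == "windows" and script not in HARFBUZZ_SCRIPTS:
        return Shaper.UNISCRIBE
    return Shaper.HARFBUZZ
```

The point of keeping the rule this small is that the fallback cases shrink as Harfbuzz script coverage improves, without touching the callers.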
Some additional things that came up:
* Need two-phase clustering
** initial clustering phase that's just based on Unicode tables, can be cross-platform
** font selection should avoid font changes inside clusters (e.g. a base character and a combining mark should be given a font that supports both characters)
** once we've selected the font, we need a second clustering phase that can fuse clusters if the font needs it (e.g. the font may combine two characters into a 'stacked glyph', where selecting half of it, i.e. splitting the ligature, makes no sense)
* Need to measure perf impact of using shaping for all font sizes and for measuring glyph bounds at all font sizes (should be easy)
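A minimal sketch of the two clustering phases described above, under some loud assumptions: phase 1 treats a cluster as a base character plus any following combining marks (real Unicode cluster rules are richer), and phase 2 takes a hypothetical `indivisible` set of strings as a stand-in for what the shaper would report about the selected font. The function names are made up for illustration.

```python
import unicodedata


def initial_clusters(text):
    """Phase 1: font-independent clustering from Unicode data alone.

    Simplification: a cluster is a base character plus any following
    combining marks (general category M*).
    Returns a list of half-open (start, end) index ranges.
    """
    clusters = []
    for i, ch in enumerate(text):
        if clusters and unicodedata.category(ch).startswith("M"):
            start, _ = clusters[-1]
            clusters[-1] = (start, i + 1)  # attach the mark to its base
        else:
            clusters.append((i, i + 1))
    return clusters


def fuse_clusters(text, clusters, indivisible):
    """Phase 2: merge adjacent clusters that the chosen font renders as one unit.

    `indivisible` is a hypothetical set of strings standing in for
    font-specific information that would come from the shaper.
    """
    fused = []
    i = 0
    while i < len(clusters):
        start = clusters[i][0]
        # Look for the longest run of clusters starting at i that is indivisible.
        j = len(clusters)
        while j > i + 1 and text[start:clusters[j - 1][1]] not in indivisible:
            j -= 1
        fused.append((start, clusters[j - 1][1]))
        i = j
    return fused
```

For example, with "fi" used purely as a stand-in for a glyph the font cannot split, `fuse_clusters(text, initial_clusters(text), {"fi"})` keeps a base+accent pair as one cluster and merges the `f` and `i` clusters into one.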
I think the lack of AAT support is going to make things a bit tricky on Mac: many of the fonts that ship even with 10.5 only support complex scripts via AAT, not OpenType. Maybe that will change with 10.6, dunno. So I think we'll need to fall back to ATSUI layout for cases where the AAT fonts are matched, unless CoreText supports AAT-based shaping.
> * Need two-phase clustering
> ** initial clustering phase that's just based on Unicode tables, can be cross-platform
> ** font selection should avoid font changes inside clusters (e.g. a base character and a combining mark should be given a font that supports both characters)
I think we'll need fast-path font matching code for the "no clusters" case and a separate version for the "has clusters" case. And the "avoid font changes within clusters" rule should get pushed back into the CSS3 Fonts spec.
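Roughly what I have in mind, as a sketch only: fonts are modelled here as hypothetical `(name, coverage_set)` pairs in preference order, and "has clusters" is approximated as "contains a combining mark". None of these names correspond to real Gecko code.

```python
import unicodedata


def has_clusters(text):
    """True if any character is a combining mark (simplified cluster test)."""
    return any(unicodedata.category(ch).startswith("M") for ch in text)


def first_supporting(fonts, chars):
    """Name of the first font covering every character in chars, or None."""
    for name, coverage in fonts:
        if all(ch in coverage for ch in chars):
            return name
    return None


def match_fonts(text, fonts):
    """Return a list of (substring, font name or None) in text order."""
    if not has_clusters(text):
        # Fast path: every character is its own cluster, match one by one.
        return [(ch, first_supporting(fonts, ch)) for ch in text]
    # Slow path: group each base character with its combining marks and
    # require a single font to cover the whole group.
    groups = []
    for ch in text:
        if groups and unicodedata.category(ch).startswith("M"):
            groups[-1] += ch
        else:
            groups.append(ch)
    return [(g, first_supporting(fonts, g)) for g in groups]
```

With a first font that covers `e` but not U+0301 and a second font that covers both, the slow path gives the whole `e` + U+0301 group to the second font instead of splitting the base and the mark across two fonts.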
One other suggestion would be to also review the ICU Layout code; there may be features or techniques worth incorporating.
> I think the lack of AAT support is going to make things a bit tricky on Mac: many of the fonts that ship even with 10.5 only support complex scripts via AAT, not OpenType. Maybe that will change with 10.6, dunno. So I think we'll need to fall back to ATSUI layout for cases where the AAT fonts are matched, unless CoreText supports AAT-based shaping.
What stops Harfbuzz from gaining AAT support? Too much work? Patents? Or?
On Nov 25, 2:23 pm, moz.li...@windingwisteria.com wrote:
> I think the lack of AAT support is going to make things a bit tricky on Mac: many of the fonts that ship even with 10.5 only support complex scripts via AAT, not OpenType. Maybe that will change with 10.6, dunno. So I think we'll need to fall back to ATSUI layout for cases where the AAT fonts are matched, unless CoreText supports AAT-based shaping.
Yeah, we definitely have to keep supporting AAT. The best way to do that seems to be to retain the Core Text path for AAT fonts (it does support AAT shaping). I'm assuming we won't ship Harfbuzz on Mac until we drop support for 10.4, so we won't have to deal with Harfbuzz and ATSUI at the same time.
> I had a chat with Jonathan Kew and Karl Tomlinson this morning about Harfbuzz. Here are some notes. Further feedback appreciated.
> * Using Harfbuzz for shaping and platform APIs for rasterization should work OK.
Great initiative. I briefly looked at using Harfbuzz for shaping on OS/2 (where we currently don't shape at all), but the lack of documentation kept me from pursuing this further. When I looked at it, it seemed to me that in addition to using the Harfbuzz code one would still need to generate shaping tables (per language). Has that changed, did I misunderstand, or do we already have those tables somewhere in a form that allows them to be used under the tri-license (or with the MPL)?
On Nov 25, 2:38 pm, "rob...@ocallahan.org" <rocalla...@gmail.com> wrote:
> On Nov 25, 2:31 pm, Zack Weinberg <zweinb...@mozilla.com> wrote:
> > What stops Harfbuzz from gaining AAT support? Too much work? Patents? Or?
> Adding AAT support to Harfbuzz is an option, but it seems like a lot of work without much benefit.
I should have explained this further. AAT is basically an Apple-only technology. It makes sense to support AAT on Mac, where a lot of the system fonts are AAT, but it doesn't seem useful for non-Mac platforms where the system fonts aren't AAT. I haven't heard any interest in downloadable fonts in AAT format, and it's hard to see why there would be any.
On Nov 26, 12:44 am, Peter Weilbacher <newss...@weilbacher.org> wrote:
> Great initiative. I briefly looked at using Harfbuzz for shaping on OS/2 (where we currently don't shape at all), but the lack of documentation kept me from pursuing this further. When I looked at it, it seemed to me that in addition to using the Harfbuzz code one would still need to generate shaping tables (per language). Has that changed, did I misunderstand, or do we already have those tables somewhere in a form that allows them to be used under the tri-license (or with the MPL)?
I'm not sure what you mean by "shaping tables". There's shaping code in Harfbuzz, and there are fonts containing OpenType tables used by that code, but I'm not aware of any additional data required.
On 11/30/2008 09:37 PM, rob...@ocallahan.org wrote:
> On Nov 26, 12:44 am, Peter Weilbacher <newss...@weilbacher.org> wrote:
>> Great initiative. I briefly looked at using Harfbuzz for shaping on OS/2 (where we currently don't shape at all), but the lack of documentation kept me from pursuing this further. When I looked at it, it seemed to me that in addition to using the Harfbuzz code one would still need to generate shaping tables (per language). Has that changed, did I misunderstand, or do we already have those tables somewhere in a form that allows them to be used under the tri-license (or with the MPL)?
> I'm not sure what you mean by "shaping tables". There's shaping code in Harfbuzz, and there are fonts containing OpenType tables used by that code, but I'm not aware of any additional data required.
Because there is no documentation, I was pointed to http://cgit.freedesktop.org/harfbuzz/tree/tests/shaping/main.cpp as an example of how to do shaping with Harfbuzz. Looking at e.g. the devanagari() function in there, I found a long table of glyphs that make up the ShapeTable. So at first glance I thought that one would still have to specify those things by hand for every language. I never had time for a deeper look. Looking at it again now, I see what I missed before: this is some kind of unit test, and these explicit tables only seem to be used to check the result.