Accessing OpenType MATH information through racket/draw

Alexis King

Jan 8, 2022, 5:11:05 PM
to Racket Developers, Matthew Flatt

Hi all,

I have recently been tinkering with an implementation of an OpenType math renderer in Racket. Doing this properly requires consulting font metrics stored in the OpenType MATH table. There is no direct interface to access these metrics in Pango, but they can be accessed by dropping down to the relevant HarfBuzz APIs.
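As a rough illustration of what "dropping down to HarfBuzz" can look like, here is a minimal sketch that reads one MATH constant straight from a font file through the FFI. It is not the approach described in this message: it bypasses racket/draw and Pango entirely. It assumes the shared library is installed as libharfbuzz and is recent enough to provide hb_blob_create_from_file. The names define-hb, AXIS-HEIGHT, and math-axis-height are hypothetical.

```racket
#lang racket/base
;; Sketch only: talks to HarfBuzz directly, bypassing racket/draw and Pango.
(require ffi/unsafe ffi/unsafe/define)

;; Assumption: the shared library is installed as libharfbuzz (version "0").
(define-ffi-definer define-hb (ffi-lib "libharfbuzz" '("0" #f)))

(define-hb hb_blob_create_from_file (_fun _path -> _pointer))
(define-hb hb_face_create (_fun _pointer _uint -> _pointer))
(define-hb hb_font_create (_fun _pointer -> _pointer))
(define-hb hb_ot_math_has_data (_fun _pointer -> _bool))
(define-hb hb_ot_math_get_constant (_fun _pointer _int -> _int32))
(define-hb hb_font_destroy (_fun _pointer -> _void))
(define-hb hb_face_destroy (_fun _pointer -> _void))
(define-hb hb_blob_destroy (_fun _pointer -> _void))

;; Numeric value of HB_OT_MATH_CONSTANT_AXIS_HEIGHT in hb-ot-math.h.
(define AXIS-HEIGHT 5)

;; math-axis-height : path-string? -> (or/c exact-integer? #f)
;; Returns the axis height in font design units, or #f when the file
;; cannot be read or the font has no MATH table.
(define (math-axis-height font-file)
  (define path (if (path? font-file) font-file (string->path font-file)))
  (define blob (hb_blob_create_from_file path))
  (define face (hb_face_create blob 0)) ; 0 = first face in the file
  (define font (hb_font_create face))
  (define result
    (and (hb_ot_math_has_data face)
         (hb_ot_math_get_constant font AXIS-HEIGHT)))
  ;; Release the HarfBuzz objects in reverse order of creation.
  (hb_font_destroy font)
  (hb_face_destroy face)
  (hb_blob_destroy blob)
  result)
```

Calling (math-axis-height "some-math-font.otf") with the path to a font that carries a MATH table should yield an integer. Doing the same for the font that racket/draw actually selected is the hard part, because it requires getting at the underlying Pango font.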

In my experiments so far, I have been getting access to these metrics by importing racket/draw/unsafe/pango and racket/draw/private/local and using private APIs to make the necessary FFI calls myself. However, this is obviously not a great long-term solution, especially since some of the information I’d like to get my hands on isn’t even directly accessible through the private APIs. Therefore, I would like to add public APIs for somehow accessing this information using racket/draw.

This raises some API design questions, as there is not much precedent in racket/draw for exposing these nitty-gritty details of fonts. The font% class is really closer to a font description than a concrete font face, which is why even the most rudimentary methods for getting information about fonts, such as glyph-exists?, are actually methods of dc<%>, not font% itself. What’s more, such methods do not query font information directly: even glyph-exists? may perform font substitution.
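To make the font%/dc<%> split concrete, here is a small sketch using only the public racket/draw API. The face name is an assumption (any installed face will do), and has-glyph? is a hypothetical helper.

```racket
#lang racket/base
(require racket/class racket/draw)

;; A font% object is only a description; creating it consults no font file.
;; Assumption: the face name is illustrative and may not be installed.
(define math-font (make-font #:face "Latin Modern Math" #:size 12))

;; Glyph queries go through a drawing context, here an offscreen bitmap.
(define dc (new bitmap-dc% [bitmap (make-bitmap 1 1)]))
(send dc set-font math-font)

;; glyph-exists? is a method of dc<%>, not font%. Because of font
;; substitution, a #t result does not prove the glyph comes from math-font.
(define (has-glyph? c)
  (send dc glyph-exists? c))

(has-glyph? #\u2211) ; N-ARY SUMMATION
```

The result tells you whether some font can supply the glyph, which is exactly why this interface is too coarse for reading per-face data such as the MATH table.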

Much of this complexity is inherent to the problem of text layout and rendering, and racket/draw generally tries to hide it as much as possible. Unfortunately, those abstractions are at least somewhat at odds with my goal of implementing a math renderer, since I need to get fairly low-level access to font information. This leaves me considering two possible ways forward:

  1. Expose public but explicitly unsafe direct access to Cairo and Pango contexts, and allow third-party libraries to make the necessary FFI calls themselves.

  2. Implement a safe API in racket/draw that provides the necessary low-level access to fonts and glyphs.

The second option sounds compelling, since safety is obviously preferable to unsafety, but the required API surface area would be pretty large: it would essentially amount to exposing a significant portion of HarfBuzz. That would create a significant backwards compatibility burden for racket/draw, and probably for relatively little gain, since most users have no need to get at any of this information. I’m therefore leaning towards the former, but I’d like to know if this sounds like a reasonable conclusion to others before doing this work.

Thanks,
Alexis

Jens Axel Søgaard

Jan 9, 2022, 4:48:14 PM
to Alexis King, Racket Developers, Matthew Flatt
It seems to me that the ideal solution would be hybrid.

    1. Expose public but explicitly unsafe direct access to Cairo and Pango contexts.
    2. Implement a safe API in racket/draw that provides the necessary low-level access to fonts and glyphs.

Where 2. is built on top of 1. 

Users are of course encouraged to use 2., but if some library needs low-level access it's available.

With respect to Cairo, I have bindings for most C-level functions here:
    https://github.com/soegaard/cairo/blob/main/cairo-lib/cairo/bindings.rkt

On top of these I have implemented a shallow object oriented layer following the recommendation
on the Cairo web-site. It's shallow in the sense that it is still lower-level than racket/draw.

    https://github.com/soegaard/cairo/blob/main/cairo-lib/cairo/main.rkt

Note: These bindings haven't been tested as thoroughly as I usually test my code.
          I still haven't had time to use them in a larger project.

/Jens Axel

Philip McGrath

Jan 12, 2022, 7:16:21 PM
to Alexis King, Racket Developers, Matthew Flatt

Hi Alexis,

(Apologies if anyone is getting this twice.)

On 1/8/22 17:10, Alexis King wrote:

In my experiments so far, I have been getting access to these metrics by importing racket/draw/unsafe/pango and racket/draw/private/local and using private APIs to make the necessary FFI calls myself. However, this is obviously not a great long-term solution, especially since some of the information I’d like to get my hands on isn’t even directly accessible through the private APIs. Therefore, I would like to add public APIs for somehow accessing this information using racket/draw.

This raises some API design questions, as there is not much precedent in racket/draw for exposing these nitty-gritty details of fonts. The font% class is really closer to a font description than a concrete font face, which is why even the most rudimentary methods for getting information about fonts, such as glyph-exists?, are actually methods of dc<%>, not font% itself. What’s more, such methods do not query font information directly: even glyph-exists? may perform font substitution.

Much of this complexity is inherent to the problem of text layout and rendering, and racket/draw generally tries to hide it as much as possible. Unfortunately, those abstractions are at least somewhat at odds with my goal of implementing a math renderer, since I need to get fairly low-level access to font information. This leaves me considering two possible ways forward:

  1. Expose public but explicitly unsafe direct access to Cairo and Pango contexts, and allow third-party libraries to make the necessary FFI calls themselves.

  2. Implement a safe API in racket/draw that provides the necessary low-level access to fonts and glyphs.

The second option sounds compelling, since safety is obviously preferable to unsafety, but the required API surface area would be pretty large: it would essentially amount to exposing a significant portion of HarfBuzz. That would create a significant backwards compatibility burden for racket/draw, and probably for relatively little gain, since most users have no need to get at any of this information. I’m therefore leaning towards the former, but I’d like to know if this sounds like a reasonable conclusion to others before doing this work.

I definitely see the tension between these two options, and especially the potential downsides of having compatibility deeply intertwined with an external dependency as large and complex as HarfBuzz. I don't have an answer, but here are some stray thoughts, anyway.

I have definitely wanted more low-level control over fonts, ideally (of course) in a safe way, and I remember other discussions when people have wanted similar things. I think one of the more common requests has been more control over font loading and resolution, e.g. to load a font from a file rather than relying exclusively on the system's installed fonts, or perhaps more control over fallback behavior when a given font doesn't have some glyph. I'm not deeply familiar with the boundaries between Pango, Fontconfig, HarfBuzz, etc.: I think much of that would be higher level than what you need, but maybe there's some overlap.

I'm not sure exactly how low-level my own desires go. At maximum, if I have the time some day: the current state-of-the-art system for typesetting medieval plainchant is essentially a DSL that compiles to LuaTeX, and I would love to make it into a #lang with better means of abstraction.

More concretely, I've noticed from time to time that racket/draw/unsafe/glib, mred/private/wx/gtk/utils, and maybe a few other places have Glib FFI utilities that aren't specifically tied to drawing or GUI contexts. I think it might be useful to move those into a new public ffi/unsafe/glib module, analogous to ffi/unsafe/objc, either in a new package or in "draw-lib", especially since racket/draw/unsafe/glib sets up logging with Racket's private glib-log-message primitive. (At a glance, it looks like there aren't breaking changes in Glib associated with Gtk4, but it would probably be worth confirming that any new public functionality is not deprecated.)

The Unsafe Libraries chapter of the racket/draw docs leaves open the possibility that the representation of handles may "change if the racket/draw library is implemented differently in the future." It seems reasonable to me to provide weaker compatibility guarantees for low-level functionality, whether safe or unsafe, than for high-level functionality. If programmers want to implement safe APIs, it seems unsatisfying as a general principle to push them towards unsafe functionality, instead, when the underlying issue is not a matter of safety but that the external world may not share Racket's usual commitment to long-term compatibility. (I feel like there's some kind of analogy to the Separate Compilation Guarantee to be made here.) Maybe there's a way of organizing module and/or package boundaries so that racket/draw can keep its current compatibility guarantees: something like (bad name idea) racket/cairo+pango+harfbuzz could provide safe or somewhat-safe implementation-dependent functionality, and racket/draw can simply continue to leave open the possibility that it may in the future adopt an implementation incompatible with racket/cairo+pango+harfbuzz.

I'm not sure how useful any of that is, but there are some thoughts, at least.

-Philip

Alexis King

Jan 17, 2022, 11:51:44 AM
to Philip McGrath, Jens Axel Søgaard, Racket Developers, Matthew Flatt
On Sun, Jan 9, 2022 at 3:48 PM Jens Axel Søgaard <jens...@soegaard.net> wrote:
It seems to me that the ideal solution would be hybrid.

    1. Expose public but explicitly unsafe direct access to Cairo and Pango contexts.
    2. Implement a safe API in racket/draw that provides the necessary low-level access to fonts and glyphs.

Where 2. is built on top of 1.

I agree that, for users, this would indeed perhaps be ideal. But from the perspective of the maintainers of racket/draw, I think it’s really the worst of all possible worlds:
  • It requires doing all the same work to design and build the safe API in the first place, and it commits to upholding the associated maintenance burden.

  • It leaves open the possibility that users will use the unsafe API, which means racket/draw cannot rely on all interactions going through a blessed code path. Moreover, it must document enough of the internals of its safe API to allow unsafe code to cooperate with it, so changing racket/draw’s implementation would likely be backwards-incompatible (and switching to a different set of libraries altogether definitely would be).

On the whole, I think there probably isn’t much benefit to exposing the unsafe API if the safe API exists, and the maintenance burden is enough to discourage doing so.



On Wed, Jan 12, 2022 at 6:05 PM Philip McGrath <philip....@gmail.com> wrote:
I think one of the more common requests has been more control over font loading and resolution, e.g. to load a font from a file rather than relying exclusively on the system's installed fonts
 
I agree with this, but I think this is mostly an orthogonal concern. If implemented, this would make sense to expose through the existing (safe) API.

The Unsafe Libraries chapter of the racket/draw docs leaves open the possibility that the representation of handles may "change if the racket/draw library is implemented differently in the future."

Yes, this is a great point that I had overlooked, as it means there’s already precedent for exposing some unsafe guts with reduced compatibility guarantees. With that in mind, I think I’ll try going down the unsafe API route and feeling out how large the exposed surface area would actually have to be. If it’s small enough, I suspect racket/draw/unsafe/cairo-lib is enough precedent to justify the additions.

Alexis