Rendering primes: <msup><mi>x</mi><mo>′</mo></msup>

Justu...@piater.name

Jun 26, 2008, 1:11:35 PM

to dev-tec...@lists.mozilla.org, www-...@w3.org

Hi -

I'm slightly confused about proper markup and rendering of primes.

The W3C appears to recommend that primes be marked up as superscripts,
as shown in the subject line. This gave the expected results in
Mozilla back when I still used the LaTeX XFT fonts. That's normal: The
glyph used for the prime, codepoint U+0030 in cmsy10.ttf, is an
oversized, vertically roughly centered slab that is scaled and shifted
into just the right position and size by virtue of the superscript.

However, Unicode fonts these days provide a glyph (at U+2032) that is
quite evidently designed to be used without superscripting; its
position and size correspond to apostrophes and quotation marks. Using
the above markup, it is thus rendered too small and too high above the
baseline. (To see what I mean, point your out-of-the-box,
STIXBeta-powered Firefox3 to
http://www.w3.org/Math/testsuite/mml2-testsuite/Topics/Primes/primes1.xml
and compare the first x-prime to its sample rendering.)

So there is, I think, a contradiction between the recommended MathML
markup and the reality of glyphs.

As far as solutions are concerned, I think that W. Hammond's
suggestion
(http://www.albany.edu/~hammond/gellmu/primeaccents2.xhtml#mi) has its
merits. But if we stand by the <msup/> convention, then MathML
renderers will have to treat superscripted primes differently than
other superscripted characters.

Justus

Robert Miner

Jun 27, 2008, 10:59:34 AM

to Justu...@piater.name, dev-tec...@lists.mozilla.org, www-...@w3.org

Hi.

This is a perennial problem area. It is a general problem affecting a
class of characters that I personally call "pseudoscripts". Prime is the
main one, but it also affects

asterisk (x2a)
degree (xb0)
prime (x2032)
double prime (x2033)
back prime (x2035)

In all of these cases as you note, the standard glyphs in most fonts are
"pre-shrunk and pre-raised", i.e. the glyph is visually congruent with
script-sized text, and the metrics for the characters place the natural
baseline of the glyph at the script baseline for the surrounding text.

The rationale is obvious -- the majority of use of fonts is in
unstructured text where characters are simply placed next to one
another, so pre-shrinking and pre-raising these glyphs gives the desired
typesetting in simple text editing situations.

However, there are a couple of issues. First, these characters are
frequently used in situations where the typesetting effect of "stacked"
scripts is needed, i.e. where both a superscript and subscript are
attached to a single base expression. In TeX, x^\prime_0, or in MathML
<msubsup> <mi>x</mi> <mn>0</mn> <mo>′</mo> </msubsup>. This is a
fairly strong requirement for professional publishing, and is not
easily supported using the Unicode-only model. That is, neither of

<msub><mi>x&prime;</mi><mn>0</mn></msub>
<msub><mi>x</mi><mn>0</mn></msub><mo>′</mo>

gives the desired result. A second issue is the (admittedly weaker)
argument that marking the prime with a script better reflects its role
as an operator applied to an expression. This argument is most
compelling with expressions like (f+g)', where you would end up with
<mo>)&prime;</mo> in the juxtaposition model.

Finally, this question has been considered repeatedly over the lifetime
of MathML, and at least the Math WG has always come down on continuing
the practice of marking primes and the other pseudoscripts with script
markup. So there is backward compatibility to consider, since that is
what most authoring tools and pre-existing content now do.

At the same time, clearly the data model for all MathML token elements
is Unicode CDATA, so one cannot rule out the possibility of valid MathML
markup containing constructions such as <mi>x′</mi> or even
<mi>x</mi><mo>′</mo>. So that has to work as expected too.

Consequently, I have concluded that there is no alternative for
high-quality MathML renderers but to special-case the handling of
pseudo-script characters.

There are a couple of standard approaches I know of in various
implementations. All renderers that take on the problem have to special
case their layout algorithms, and look for isolated pseudoscripts in the
superscript position (or presuperscript position). By isolated, I mean
that there are no other characters in the same token element with the
pseudoscript, so that <msup> <mi>x</mi> <mo>′!</mo> </msup> or
whatever is not special-cased. Once a special case has been identified,
then there is a choice. If the renderer can compute the true bounding
box of the glyph, then it can select an appropriate size and position
the standard glyphs for these characters from whatever font is
available. However, I know of other renderers that make use of
non-standard private glyphs for "on baseline" versions of these
characters. The most notable is the full-size, on-the-baseline version
of the prime character in the TeX fonts you allude to. This tends to
be a more popular approach in large publishing operations that use
non-MathML-based typesetting engines such as XyEnterprise's XPP engine,
where MathML support is provided by pre-processing markup into a
proprietary math typesetting language. In this case, special casing the
layout algorithm is difficult or impossible, and thus swapping out
different glyphs is more practical.
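
The detection step described above could be sketched roughly as follows,
in Python with the standard-library xml.etree.ElementTree. This is only an
illustration: the function and constant names are hypothetical and not
taken from any renderer, the code points are the pseudoscripts named in
this thread (with U+2032 for prime), and presuperscripts inside
<mmultiscripts> are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Pseudoscript code points named in this thread (assumed list, not exhaustive).
PSEUDOSCRIPTS = {"\u002a", "\u00b0", "\u2032", "\u2033", "\u2035"}

# Zero-based index of the superscript child for each script element.
SUPERSCRIPT_INDEX = {"msup": 1, "msubsup": 2}


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}msup'."""
    return tag.rsplit("}", 1)[-1]


def isolated_pseudoscripts(root):
    """Return token elements holding a lone pseudoscript in superscript position."""
    found = []
    for elem in root.iter():
        index = SUPERSCRIPT_INDEX.get(local_name(elem.tag))
        children = list(elem)
        if index is None or len(children) <= index:
            continue
        script = children[index]
        text = (script.text or "").strip()
        # "Isolated" means no child elements and no other characters in the token.
        if len(script) == 0 and text in PSEUDOSCRIPTS:
            found.append(script)
    return found


sample = ET.fromstring(
    "<math>"
    "<msup><mi>x</mi><mo>\u2032</mo></msup>"
    "<msup><mi>x</mi><mo>\u2032!</mo></msup>"
    "<msubsup><mi>x</mi><mn>0</mn><mo>\u2032</mo></msubsup>"
    "</math>"
)
print(len(isolated_pseudoscripts(sample)))  # 2: the token with the extra "!" is skipped
```

What happens after a match (rescaling the standard glyph or swapping in an
on-baseline glyph) is the renderer-specific choice discussed above and is
not modelled here.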

I hope this is helpful.

--Robert

Justu...@piater.name

Jun 28, 2008, 5:52:05 AM

to Robert Miner, www-...@w3.org, dev-tec...@lists.mozilla.org

Robert -

Thanks, your comments did shed additional light on this issue, at
least for me. The arguments in favor of treating primes (and similar)
as superscripts are compelling.

It seems to me that a clean and simple solution would be for a
renderer to create private glyphs from pre-scripted glyphs by applying
the inverse scripting transform (which should not require any
knowledge beyond that already needed for the forward transform), and
then blindly substitute these for the originals.

This should yield the expected results for all typical use cases
conformant with W3C markup conventions, including multiple characters
within the scripted token element, but would break non-standard markup
such as this:

"Robert Miner" <rob...@dessci.com> wrote on Fri, 27 Jun 2008 07:59:34
-0700:

> At the same time, clearly the data model for all MathML token
> elements is Unicode CDATA, so one cannot rule out the possibility of
> valid MathML markup containing constructions such as
> <mi>x′</mi> or even <mi>x</mi><mo>′</mo>. So that has
> to work as expected too.

But what is "expected" here? If such markup is expected to yield
superscript appearance, then one might use the original glyph at zero
script level, but this heuristic breaks down with nested/chained
superscripts.

W3C folks: It would be nice to include a clarification on markup
conventions and rendering expectations in the MathML specs or
accompanying docs, since these are clearly in conflict here. A
suggestion for MathML 3?

Mozilla folks: what do you think? Shall I open a bug with the above
suggestion for a fix?

Thanks,
Justus

Karl Tomlinson

Jun 29, 2008, 7:32:50 PM

On Thu, 26 Jun 2008 19:11:35 +0200, Justu...@Piater.name wrote:

> Using
> the above markup, it is thus rendered too small and too high above the
> baseline. (To see what I mean, point your out-of-the-box,
> STIXBeta-powered Firefox3 to
> http://www.w3.org/Math/testsuite/mml2-testsuite/Topics/Primes/primes1.xml
> and compare the first x-prime to its sample rendering.)

Justu...@Piater.name writes:

> It seems to me that a clean and simple solution would be for a
> renderer to create private glyphs from pre-scripted glyphs by applying
> the inverse scripting transform (which should not require any
> knowledge beyond that already needed for the forward transform), and
> then blindly substitute these for the originals.

> Mozilla folks: what do you think? Shall I open a bug with the above
> suggestion for a fix?

Yes, please do file a bug.
I'm not yet clear on the best solution though.

If this situation can be detected/distinguished, the inverse of
the scale component of the script transformation is moderately
easy to perform, but complicated due to the fact that
scriptsizemultiplier can vary from element to element. (Maybe
this corner case is not so important.)
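
To make the scriptsizemultiplier complication concrete, here is a small
Python sketch. It assumes the MathML 2 default multiplier of 0.71 and a
single level of scripting; the function names are made up for
illustration, and scriptminsize is ignored.

```python
DEFAULT_MULTIPLIER = 0.71  # default scriptsizemultiplier in MathML 2


def inverse_scale(multiplier):
    """Enlargement that cancels one level of script shrinking."""
    if multiplier <= 0:
        raise ValueError("scriptsizemultiplier must be positive")
    return 1.0 / multiplier


def rendered_ratio(glyph_enlargement, multiplier):
    """Size of the substituted glyph relative to the parent font size."""
    return glyph_enlargement * multiplier


# A private glyph prepared once for the default multiplier...
fixed = inverse_scale(DEFAULT_MULTIPLIER)

# ...comes out at the parent size when the default is in effect,
print(round(rendered_ratio(fixed, 0.71), 3))  # 1.0
# but is too small where an element sets scriptsizemultiplier="0.5".
print(round(rendered_ratio(fixed, 0.5), 3))   # 0.704
```

So a glyph substituted "blindly" is only right for one multiplier; the
enlargement would have to be computed per element to cover this case.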

The translation component needs to be considered also. This could
be done either by

* making the pseudoscripts centered operators

  This would not give expected results in the
  <mi>x</mi><mo>′</mo> situation.

* positioning the superscript based on its bounds (or the
  intersection of its bounds and the typographic ascent and
  descent of the font)

  I wonder whether there are any non-pseudoscript superscripts
  that would suffer from being positioned according to their
  bounds?
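
The second option might look something like the following Python sketch.
Everything here is an assumption made for illustration: the function
names, the idea of a fixed target height for the bottom of the
superscript, and the glyph measurements (in em units, measured upward
from the glyph's own baseline) are invented and not taken from any font
or renderer.

```python
def effective_bounds(ink_bottom, ink_top, ascent, descent):
    """Intersect a glyph's ink extent with the font's typographic extent.

    Heights are measured upward from the baseline; `descent` is the
    positive distance the font extends below the baseline.
    """
    bottom = max(ink_bottom, -descent)
    top = min(ink_top, ascent)
    if bottom > top:
        raise ValueError("ink extent lies outside the typographic extent")
    return bottom, top


def superscript_shift(ink_bottom, ink_top, ascent, descent, target_bottom):
    """Vertical shift that puts the bottom of the effective bounds at target_bottom."""
    bottom, _ = effective_bounds(ink_bottom, ink_top, ascent, descent)
    return target_bottom - bottom


# Illustrative numbers only: a pre-raised prime whose ink starts 0.45 em up,
# and an ordinary digit that sits on the baseline.
prime_shift = superscript_shift(0.45, 0.75, 0.8, 0.2, target_bottom=0.40)
digit_shift = superscript_shift(0.0, 0.65, 0.8, 0.2, target_bottom=0.40)
print(prime_shift < digit_shift)  # True: the pre-raised glyph is raised less
```

With these made-up numbers the prime needs almost no extra raising while
the digit is lifted by the full amount, which is the behaviour the
bounds-based approach is after. Whether some ordinary superscript would
be hurt by this is the open question above.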

Justu...@piater.name

Jun 30, 2008, 4:01:13 AM

to Karl Tomlinson, dev-tec...@lists.mozilla.org

Karl Tomlinson <moz...@karlt.net> wrote on Mon, 30 Jun 2008 11:32:50
+1200:

> Yes, please do file a bug.

Done: https://bugzilla.mozilla.org/show_bug.cgi?id=442637

How about putting up a Wiki page where we collect a comprehensive list
of use cases with their recommended markup and expected rendering?

Justus

William F Hammond

Jun 30, 2008, 2:41:48 PM

to dev-tec...@lists.mozilla.org, www-...@w3.org

Karl Tomlinson <moz...@karlt.net> writes in dev-tech-mathml chez mozilla:

>> http://www.w3.org/Math/testsuite/mml2-testsuite/Topics/Primes/primes1.xml
>> and compare the first x-prime to its sample rendering.)

Yes, with FF2 there were issues in the test suite. Of course, when I say
"with FF2" I really mean "with FF2 and the fonts I have had". The font
installation may be part of the problem.

> I'm not yet clear on the best solution though.

I don't see how you could be clear on it. I think the MathML spec
needs to give it more explicit attention.

Issues include:

0. legacy MathML content including, in particular, the past history
of the handling of U+2032 compared to the handling of U+2033 ---
see bugzilla 140439

1. the need for providing at the level of markup (perhaps with a new
attribute), rather than cdata, for symbols analogous to
(ams)LaTeX's \prime and \backprime, which are needed for complex
scripting situations that require explicit script placement

2. specification for handling of all the various prime-like
characters in mtext, mi, and mo

If I've stated this as I intend to, then it should follow that there
will, in any given situation, be a way for an author to override
defaults on the question of whether a prime-like character is to be
scripted automatically or to be placed explicitly.

-- Bill

Karl Tomlinson

Jul 1, 2008, 12:12:33 AM

Justu...@Piater.name writes:

> How about putting up a Wiki page where we collect a comprehensive list
> of use cases with their recommended markup and expected rendering?

That sounds good.

I wonder whether there's a way to embed mathml in wiki content at
developer.mozilla.org. It may be necessary to use an object
element:

<html>
<object type="application/xhtml+xml" data="mathml.xhtml">
<p>Not supported</p>
</object>
</html>

type="text/xml" can also be used, but I've had no success with
"application/mathml+xml".

...or we can wait for HTML5...
