We should drop MathML

Benoit Jacob

May 5, 2013, 11:38:39 AM
to dev-platform
Hi,

Summary: MathML is a vestigial remnant of the XML-everything era, and we
should drop it.

***

1. Reasons why I believe that MathML never was a good idea. Summary:
over-specialized and uniformly inferior to the pre-existing,
well-established standard, TeX.

1.1. MathML is too specialized: we should be reluctant to have a
separate spec for every kind of specialized typography. What if musicians
wanted their own MusicML too?

1.2. MathML reinvents the wheel, poorly. A suitable subset of TeX (not
the entirety of TeX, as that is a huge, single-implementation technology
that reputedly only Knuth ever fully understood) was the right choice all
along, because:

1.2.1. TeX is already the universally adopted standard --- and
already was long before MathML was invented. Check for yourself on
http://arxiv.org/ , where most new math papers are uploaded --- pick any
article, then "other" formats, then "Source": you can then download TeX
sources for almost every article.

1.2.2. TeX is very friendly to manual writing, being concise and
close to natural notation, with limited overhead (some backslashes and
curly braces), while MathML is as tedious to handwrite as any other
XML-based format. An example is worked out at
http://en.wikipedia.org/wiki/MathML#Example_and_comparison_to_other_formats,
where the solution to the quadratic equation is one line of TeX versus
30 lines of MathML!
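
For reference, the one-line TeX form of the quadratic formula would look roughly like the following. This is a sketch using standard LaTeX math-mode commands; the exact markup on the Wikipedia page may differ.

```latex
x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
```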

1.2.3. An important corollary of being very close to natural notation
is that TeX can be nearly trivially "read aloud". That means that it offers
a particularly easy accessibility story. No matter what mechanism is used
to graphically display equations, providing the TeX source (similarly to
images alt text) would allow anyone to quickly read it themselves without
any kind of software support; and screen reading software could properly
read equations with minimal TeX-specific support code. For example, TeX
code such as "\int_0^1 x^2 dx" can be readily understood by any human with
basic TeX exposure (which is nearly 100% of mathematicians) and can be
easily handled by any screen reader that knows that \int should be read as
"integral" and that immediately after it, _ and ^ should be read as "from"
and "to" respectively.
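
A minimal sketch of that reading rule in Python, assuming a tiny hypothetical vocabulary (the SPOKEN table and the read_aloud function are illustrative names, not part of any existing screen reader):

```python
import re

# Hypothetical spoken names for a handful of TeX commands; a real screen
# reader would need a much larger table.
SPOKEN = {r"\int": "integral", r"\sum": "sum", r"\pi": "pi"}

# Control words, script markers, simple brace groups, words, or single symbols.
TOKEN = re.compile(r"\\[A-Za-z]+|[_^]|\{[^{}]*\}|[A-Za-z]+|\S")


def read_aloud(tex: str) -> str:
    """Turn a tiny subset of TeX into a spoken-style string."""
    tokens = TOKEN.findall(tex)
    words = []
    after_big_op = False  # True while limits of \int or \sum may still follow
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in SPOKEN:
            words.append(SPOKEN[tok])
            after_big_op = tok in (r"\int", r"\sum")
        elif tok in ("_", "^") and i + 1 < len(tokens):
            arg = tokens[i + 1].strip("{}")
            if after_big_op:
                # Right after an integral or sum, scripts are limits.
                words.append(("from " if tok == "_" else "to ") + arg)
            else:
                words.append(("sub " if tok == "_" else "to the power ") + arg)
            i += 1  # the script argument has been consumed
        else:
            after_big_op = False
            words.append(tok.strip("{}"))
        i += 1
    return " ".join(words)


print(read_aloud(r"\int_0^1 x^2 dx"))
```

The only TeX-specific knowledge here is the command table and the rule that _ and ^ mean "from" and "to" directly after an integral or sum; nested groups and most other constructs are deliberately not handled.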

***

2. Reasons why even if MathML had ever been a decent idea, now would be the
right time to drop it. Summary: never really got traction, and the same
rendering can now be achieved without MathML support.

2.1. MathML never saw much traction outside of Mozilla, despite having
been around for a decade. WebKit only got a very limited partial
implementation recently, and Google removed it from Blink. The fact that it
was just dropped from Blink says much about how little it's used: Google
wouldn't have disabled a feature that's needed to render web pages in the
real world. Opera got an implementation too, but Opera's engine has been
phased out.

2.2. High-quality mathematical typography in browsers is now possible,
without using MathML. Examples include MathJax ( http://www.mathjax.org/ ),
which happily takes either TeX or MathML input and renders it without
specific browser support, and of course PDF.js which is theoretically able
to render all PDFs including those generated by pdftex. Both approaches
give far higher quality output than what any current MathML browser
implementation offers.

***

3. Proposals

Assuming that there will be agreement to drop MathML, I can see us doing
either of two things:

3.1. Either just drop MathML support; the assumption would be that
current solutions not requiring specific browser support, such as MathJax
or PDF.js, are sufficient;

3.2. Or drop MathML support and create a new specification, that would
be based on a suitable subset of TeX.

In both approaches, distributing TeX source code alongside a page is
highly desirable because it is the preferred source form of most math
content and because it enables good accessibility as discussed above. In
the 3.1 approach, that would be like alt text on images: something that
many authors would omit in practice. In the 3.2 approach, that would be the
document itself, which means that it couldn't be neglected.

The big problem with 3.2 is the same issue as we described in 1.1: any
math-specific system may well be over-specialized. Then again, TeX is not
exclusively restricted to math typography, and it has been used for e.g.
music typography before. So to some extent that I haven't precisely figured
out yet, the 1.1 overspecialization argument against MathML may not fully
apply against a TeX-based solution.

Benoit

Justin Lebar

May 5, 2013, 12:10:10 PM
to Benoit Jacob, Jonathan Kew, dev-platform
Four points here.

1. We're assuming that MathJax is as good with MathML as it is without
it, but perhaps we could ask the MathJax folks to comment on whether
this is true. I'd certainly be a lot more comfortable dropping MathML
if the MathJax folks said there was no point.

2.

> A suitable subset of TeX (not
> the entirety of TeX, as that is a huge, single-implementation technology
> that reputedly only Knuth ever fully understood) was the right choice all
> along

Jonathan Kew is a much better person to comment on this, but in my
relatively limited experience typesetting documents in TeX, I've had
to use various LaTeX packages (particularly amsmath and amssymb) in
order to get all of the symbols and so on that I needed. I suspect
that "heavy" users of TeX frequently need more than these two
packages.

The point being, "a subset of TeX" isn't necessarily sufficient.

3. It's not clear to me why we should go through all the work of
rewriting MathML into this TeX thing unless we thought that the new
thing would see more enthusiastic adoption. It sounds like you would
probably agree on this point.

4.

> 2.2. High-quality mathematical typography in browsers is now possible,
> without using MathML. Examples include MathJax ( http://www.mathjax.org/ ),
> which happily takes either TeX or MathML input and renders it without
> specific browser support, and of course PDF.js which is theoretically able
> to render all PDFs including those generated by pdftex. Both approaches
> give far higher quality output than what any current MathML browser
> implementation offers.

Could you elaborate on how MathML is inferior to MathJax's HTML+CSS
rendering? MathJax has a page where you can switch between different
rendering modes, and to my eyes, the two modes are almost identical.
The only difference I see is that the HTML+CSS mode is better at
correctly sizing large parentheses and radicals, but I wouldn't call
this "far higher quality."

http://www.mathjax.org/demos/mathml-samples/

Benoit Jacob

May 5, 2013, 12:51:45 PM
to Justin Lebar, dev-platform, Jonathan Kew
2013/5/5 Justin Lebar <justin...@gmail.com>

> Four points here.
>
> 1. We're assuming that MathJax is as good with MathML as it is without
> it, but perhaps we could ask the MathJax folks to comment on whether
> this is true. I'd certainly be a lot more comfortable dropping MathML
> if the MathJax folks said there was no point.
>
>
Absolutely, feedback from the MathJax team would be very valuable here.


>
> 2.
>
> > A suitable subset of TeX (not
> > the entirety of TeX, as that is a huge, single-implementation technology
> > that reputedly only Knuth ever fully understood) was the right choice all
> > along
>
> Jonathan Kew is a much better person to comment on this, but in my
> relatively limited experience typesetting documents in TeX, I've had
> to use various LaTeX packages (particularly amsmath and amssymb) in
> order to get all of the symbols and so on that I needed. I suspect
> that "heavy" users of TeX frequently need more than these two
> packages.
>
> The point being, "a subset of TeX" isn't necessarily sufficient.
>

It is absolutely true that nearly all TeX documents include various
packages. So by "TeX" I implicitly meant "TeX with a selection of stuff
from various usual packages".


>
> 3. It's not clear to me why we should go through all the work of
> rewriting MathML into this TeX thing unless we thought that the new
> thing would see more enthusiastic adoption. It sounds like you would
> probably agree on this point.
>

I absolutely agree. That is basically what I meant when I wrote that the
argument that MathML was over-specialized may well apply to a TeX-based
solution too (see the discussion at the end, comparing proposals 3.1 and
3.2).


>
> 4.
>
> > 2.2. High-quality mathematical typography in browsers is now possible,
> > without using MathML. Examples include MathJax ( http://www.mathjax.org/),
> > which happily takes either TeX or MathML input and renders it without
> > specific browser support, and of course PDF.js which is theoretically able
> > to render all PDFs including those generated by pdftex. Both approaches
> > give far higher quality output than what any current MathML browser
> > implementation offers.
>
> Could you elaborate on how MathML is inferior to MathJax's HTML+CSS
> rendering? MathJax has a page where you can switch between different
> rendering modes, and to my eyes, the two modes are almost identical.
> The only difference I see is that the HTML+CSS mode is better at
> correctly sizing large parentheses and radicals, but I wouldn't call
> this "far higher quality."
>
> http://www.mathjax.org/demos/mathml-samples/
>

Sure, I loaded this page in two tabs and switched one to MathML to compare.
I used Firefox 23.0a1 linux 64bit (ubuntu 12.04 if that matters).

From a quick look, here is what stands out:
1. the spacing looks weird in MathML mode in a few places, especially in
"Definition of Christoffel Symbols": the dx^i at the denominator is
strangely far down, and the exponent (k) and subscript (im) in Gamma^k_{im}
also are placed surprisingly far away from the Gamma. (The subscript in
fact looks like there was no kerning there, as it is not placed any further
left than the exponent in the MathML version, while it is placed left of
the exponent in the HTML+CSS version, which looks better).
2. in "Gauss' Divergence Theorem", in the dS, the S is placed strangely
far away from the d, in the MathML version. That could again be explained
by absence of kerning.
3. the square root in "The Quadratic Formula" does not extend any further
down than the text under it, in the MathML version. square root signs
usually extend a bit further down, as in the HTML+CSS version.
4. the greek letters in MathML mode look weird, for example the pi in
"Cauchy's Integral Formula" looks like an uppercase pi (in smallcaps), the
mu in "Standard Deviation" looks strange too.

That may all sound very picky, but if you're going to try to convince the
scientific community to switch away from TeX, which gets all of this just
right, better make sure that the replacement looks as good!

Benoit

fred...@mathjax.org

May 5, 2013, 2:10:25 PM
I'm not sure if that's a joke or complete misinformation about the topic. But obviously the answer is that the MathML support must be preserved. The MathJax team is strongly in favor of native MathML implementation.

Benoit Jacob

May 5, 2013, 2:47:18 PM
to fred...@mathjax.org, dev-platform
It's not a joke.

Could you elaborate on this? In particular, as I wrote to the MathJax list,
I would be very interested in knowing what regressions the removal of
MathML would incur as far as MathJax is concerned.

Benoit


2013/5/5 <fred...@mathjax.org>

> I'm not sure if that's a joke or complete misinformation about the topic.
> But obviously the answer is that the MathML support must be preserved. The
> MathJax team is strongly in favor of native MathML implementation.

Robert O'Callahan

May 5, 2013, 6:28:52 PM
to Benoit Jacob, dev-platform
On Mon, May 6, 2013 at 3:38 AM, Benoit Jacob <jacob.b...@gmail.com> wrote:

> 2.1. MathML never saw much traction outside of Mozilla, despite having
> been around for a decade. WebKit only got a very limited partial
> implementation recently, and Google removed it from Blink. The fact that it
> was just dropped from Blink says much about how little it's used: Google
> wouldn't have disabled a feature that's needed to render web pages in the
> real world.


The Blink implementation was never good enough to render MathML pages well
in the real world, whether there were any or not. It also had some pretty
major brokenness in the way it was integrated into Blink, which made it
difficult to enable safely.

I would also say that one big difference between MathML and a hypothetical
TeX-based format is that MathML has a DOM and it's not clear how to fit TeX
into a DOM. That may not matter much for rendering, but it does if you want
to support editing.

One other thing: EPUB publishers are screaming for good math support for
textbooks (and currently that means they want MathML). They're mostly
Webkit-based, and maybe we don't care about them, but there you are.

Rob
--
q“qIqfq qyqoquq qlqoqvqeq qtqhqoqsqeq qwqhqoq qlqoqvqeq qyqoquq,q qwqhqaqtq
qcqrqeqdqiqtq qiqsq qtqhqaqtq qtqoq qyqoquq?q qEqvqeqnq qsqiqnqnqeqrqsq
qlqoqvqeq qtqhqoqsqeq qwqhqoq qlqoqvqeq qtqhqeqmq.q qAqnqdq qiqfq qyqoquq
qdqoq qgqoqoqdq qtqoq qtqhqoqsqeq qwqhqoq qaqrqeq qgqoqoqdq qtqoq qyqoquq,q
qwqhqaqtq qcqrqeqdqiqtq qiqsq qtqhqaqtq qtqoq qyqoquq?q qEqvqeqnq
qsqiqnqnqeqrqsq qdqoq qtqhqaqtq.q"

Benoit Jacob

May 5, 2013, 6:52:07 PM
to Robert O'Callahan, dev-platform
2013/5/5 Robert O'Callahan <rob...@ocallahan.org>

> On Mon, May 6, 2013 at 3:38 AM, Benoit Jacob <jacob.b...@gmail.com> wrote:
>
>> 2.1. MathML never saw much traction outside of Mozilla, despite having
>> been around for a decade. WebKit only got a very limited partial
>> implementation recently, and Google removed it from Blink. The fact that it
>> was just dropped from Blink says much about how little it's used: Google
>> wouldn't have disabled a feature that's needed to render web pages in the
>> real world.
>
>
> The Blink implementation was never good enough to render MathML pages well
> in the real world, whether there were any or not. It also had some pretty
> major brokenness in the way it was integrated into Blink, which made it
> difficult to enable safely.
>
> I would also say that one big difference between MathML and a hypothetical
> TeX-based format is that MathML has a DOM and it's not clear how to fit TeX
> into a DOM. That may not matter much for rendering, but it does if you want
> to support editing.
>

That sounds interesting; could you please expand a little more?
- Do you mean that in order to support editing well, the format must be
naturally parseable into a tree representation? (Why so?)
- If that is what you mean: I think that TeX equations parse fairly
naturally into tree representations; in fact, I suppose that every
TeX-to-MathML conversion tool, or TeX-to-DOM-elements conversion tool in
existence, must already have solved this problem somehow (in particular,
MathJax).
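
To illustrate the claim, here is a minimal Python sketch of parsing a small TeX math subset into a tree. It assumes only \frac and \sqrt take brace arguments, does no error handling for unbalanced braces, and all names (Node, ARITY, parse) are hypothetical; it is not how MathJax or any existing converter works.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str  # "group", "command", "sub", "sup" or "symbol"
    value: str = ""
    children: list = field(default_factory=list)


def tokenize(tex):
    """Split TeX into control words and single characters, skipping spaces."""
    tokens, i = [], 0
    while i < len(tex):
        c = tex[i]
        if c.isspace():
            i += 1
        elif c == "\\":
            j = i + 1
            while j < len(tex) and tex[j].isalpha():
                j += 1
            tokens.append(tex[i:j])
            i = j
        else:
            tokens.append(c)
            i += 1
    return tokens


# Number of arguments for the few commands this sketch knows.
ARITY = {r"\frac": 2, r"\sqrt": 1}


def parse_atom(tokens, pos):
    """Parse one brace group, command (with its arguments) or symbol."""
    tok = tokens[pos]
    if tok == "{":
        group = Node("group")
        pos += 1
        while tokens[pos] != "}":
            child, pos = parse_item(tokens, pos)
            group.children.append(child)
        return group, pos + 1  # skip the closing brace
    if tok.startswith("\\"):
        node = Node("command", tok)
        pos += 1
        for _ in range(ARITY.get(tok, 0)):
            arg, pos = parse_atom(tokens, pos)
            node.children.append(arg)
        return node, pos
    return Node("symbol", tok), pos + 1


def parse_item(tokens, pos):
    """Parse an atom followed by any number of _ or ^ scripts."""
    base, pos = parse_atom(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("_", "^"):
        kind = "sub" if tokens[pos] == "_" else "sup"
        script, pos = parse_atom(tokens, pos + 1)
        base = Node(kind, children=[base, script])
    return base, pos


def parse(tex):
    """Return a root "group" node holding the parsed top-level items."""
    tokens = tokenize(tex)
    root, pos = Node("group"), 0
    while pos < len(tokens):
        child, pos = parse_item(tokens, pos)
        root.children.append(child)
    return root


tree = parse(r"\frac{a}{b^2}")
```

The resulting tree has one \frac command node with two group children, the second of which contains a "sup" node, which is the kind of structure a DOM would need to expose.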



>
> One other thing: EPUB publishers are screaming for good math support for
> textbooks (and currently that means they want MathML). They're mostly
> Webkit-based, and maybe we don't care about them, but there you are.
>

Given that TeX is already the standard in scientific publishing, I would
find it very surprising if they complained about a TeX-based or TeX-like
format !

Benoit

Wesley Johnston

May 5, 2013, 7:16:05 PM
to Benoit Jacob, dev-platform
> 1.2.2. TeX is very friendly to manual writing, being concise and
> close to natural notation, with limited overhead (some backslashes and
> curly braces), while MathML is as tedious to handwrite as any other
> XML-based format. An example is worked out at
> http://en.wikipedia.org/wiki/MathML#Example_and_comparison_to_other_formats,
> where the solution to the quadratic equation is one line of TeX versus
> 30 lines of MathML!

This isn't exactly a fair comparison. I mean, it's fair, but for equations of any complexity (i.e. things you wouldn't find in a high school textbook) TeX can quickly become incredibly difficult (maybe more difficult than MathML) to manage. Most people I know who use TeX regularly have developed fairly thick sets of macros to try and manage things.

> Given that TeX is already the standard in scientific publishing, I would
> find it very surprising if they complained about a TeX-based or TeX-like
> format !

I'm not sure this is true either. At least in the fields I was involved in (solid state physics), MS Word had established itself as a broader standard. That was primarily based on general ease of use and (more importantly?) ease of collaboration (i.e. we could easily share a real document back and forth that tracked changes/comments inside it). Using a version tracking system would have been interesting... but I wasn't aware of anyone doing it.

I always wanted to see MathML succeed. There are plenty of things to complain about in the format, but I think most of its problems stemmed from a lack of implementations. It feels to me like another one of those technologies (like flexbox or web components) that people need to reinvent (with a few of the sharp edges rounded off) and try to sell as "new". Until we have buy-in from some other browser vendors on a new format though, I don't think I understand why we'd kill off something that 1.) works and 2.) AFAIK requires almost zero upkeep. Are teams spending a lot of time upkeeping MathML code?

- Wes

Benoit Jacob

May 5, 2013, 7:40:30 PM
to Wesley Johnston, dev-platform
2013/5/5 Wesley Johnston <wjoh...@mozilla.com>

> > 1.2.2. TeX is very friendly to manual writing, being concise and
> > close to natural notation, with limited overhead (some backslashes and
> > curly braces), while MathML is as tedious to handwrite as any other
> > XML-based format. An example is worked out at
> > http://en.wikipedia.org/wiki/MathML#Example_and_comparison_to_other_formats,
> > where the solution to the quadratic equation is one line of TeX versus
> > 30 lines of MathML!
>
> This isn't exactly a fair comparison. I mean, it's fair, but for equations
> of any complexity (i.e. things you wouldn't find in a high school text
> book) TeX can quickly become incredibly difficult (maybe more difficult
> than MathML) to manage. Most people I know who use TeX regularly have
> developed fairly thick sets of macros to try and manage things.
>

Well, I have written hundreds of pages of TeX; for sure, some large
equations would expand over more than one line of TeX, but I can't remember
going over more than 5 lines of TeX source (without custom helper macros)
per actual line of output, and that would be a really unusual case ---
while the MathML example above has a ratio of 30 source lines to 1 output
line.

The fact that TeX furthermore allows macros shouldn't be considered proof
that it's particularly hairy --- it's just something that people do for
convenience/abstraction.

There _are_ very hairy things with TeX, but they are not so much with math
typography per se; instead, I'd say that TeX becomes hairy when one tries
to use it beyond its primary domain of application. For example, one can
draw diagrams, e.g. with the xypic package, and that can get really
cumbersome and inexpressive. But that's not part of what I was suggesting
could become part of the subset-of-TeX used to replace MathML.



> > Given that TeX is already the standard in scientific publishing, I would
> > find it very surprising if they complained about a TeX-based or TeX-like
> > format !
>
> I'm not sure this is true either. At least in the fields I was involved in
> (solid state physics), MS Word had established itself as a broader
> standard. That was primarily based on general ease of use and (more
> importantly?) ease of collaboration (i.e. we could easily share a real
> document back and forth that tracked changes/comments inside it). Using a
> version tracking system would have been interesting... but I wasn't aware
> of anyone doing it.
>

Ouch. I am glad I didn't work in a field where MS Word was in use for
writing long and/or scientific documents.

At least for the more mathematical sciences (math, mathematical physics,
large parts of CS) I can say with confidence that TeX is ubiquitous.



>
> I always wanted to see MathML succeed. There are plenty of things to
> complain about in the format, but I think most of its problems stemmed from
> a lack of implementations. It feels to me like another one of those
> technologies (like flexbox or web components) that people need to reinvent
> (with a few of the sharp edges rounded off) and try to sell as "new". Until
> we have buy in from some other browser vendors on a new format though, I
> don't think I understand why we'd kill off something that 1.) works and 2.)
> AFAIK requires almost zero upkeep. Are teams spending a lot of time
> upkeeping MathML code?
>

We agree: it does sound fair to wait for either a replacement, or agreement
that no such technology is needed in browsers, or evidence that the
maintenance cost is significant, before taking any decision to drop MathML.

Benoit


>
> - Wes
>

p.kraut...@gmail.com

May 5, 2013, 8:38:53 PM
Here are a couple of reasons why dropping MathML would be a bad idea. (While I wrote this others made some of the points as well.)

* MathML is part of HTML5 and epub3.
* Gecko has the very best native implementation out there, only a few constructs short of complete.
* Killing it off means Mozilla gives up a competitive edge against all other browser engines.
* MathML is widely used. Almost all publishers use XML workflows and in those MathML for math. Similarly, XML+MathML dominates technical writing.
* In particular, the entire digital textbook market and thus the entire educational sector comes out of XML/MathML workflows right now.
* MathML is the only format supported by math-capable accessibility tools right now.
* MathML is just as powerful for typesetting math as TeX is. Publishers have been converting TeX to XML for over a decade (e.g., Wiley, Springer, Elsevier). Fun fact: the Math WG and the LaTeX3 group overlap.
* Limitations of browser support do not mean that the standard is limited.

From a MathJax point of view

* MathJax uses MathML as its internal format.
* MathJax output is ~5 times slower than native support. This is after 9 years of development of jsmath and MathJax (and javascript engines).
* The performance issues lie solely with rendering MathML using HTML constructs.
* Performance is the only reason why Wikipedia continues to use images.
* JavaScript cannot access font metrics, so MathJax can only use fonts we're able to teach it to use.
* While TeX and the basic LaTeX packages are stable, most macro packages are unreliable. Speaking as a mathematician, it's often hard to compile my own TeX documents from a few years ago. You can also ask the arXiv folks how painful it is to do what they do.

Other points

* MathML has never seen paid browser development. All work (save code review) has been done solely by unpaid volunteers. If Mozilla was paying even a part time developer, Firefox would have had complete support years ago.

* The same holds for Apple, Google and Microsoft. Yes, when you don't put any developers on the job, MathML implementations do not get better.

* Google is even silly enough to kick out a hugely improved (albeit partial) implementation instead of landing the patches that fix that one remaining security issue -- while Apple doesn't have any problems with the same code.

* Firefox has shown how productive the feedback loop from a partial implementation can be, attracting a number of volunteers over the years, pushing it forward a little bit each time.

* MathML syntax is not as bad as people think but it takes getting used to (just like HTML). It's a bit like saying HTML is bad since markdown is much more human readable. Check out Dave Barton's jqmath, a serialization of MathML; with very little effort I find it as human readable as TeX.

* TeX is *not* the de-facto standard for math. It is the standard for researchers in mathematics and very few related fields. Most mathematical content is not created by researchers but by technical writers and in the educational sector. And again: most TeX gets converted to MathML in publishing workflows.

* MS Word and Libre Office produce MathML out of the box.


Personal remarks

MathML still feels a lot like HTML 1 to me. It's only entered the web natively in 2012. We're lacking a lot of tools, in particular open source tools (authoring environments, cross-conversion, a11y tools etc).

But that's a bit like complaining in 1994 that HTML sucks and that there's TeX which is so much more natural with \chapter and \section and has higher typesetting quality anyway.

I'm totally for MusicML! More generally, there are things like CellML, CML and other scientific standards. I'd encourage them to work towards becoming web standards, to prove that the web is truly the native place for all human communication.

A statistical plot has no more reason to be an image than an equation -- it should be markup/data in the page and the browser should render it. Browsers may be the new printing press, but we are looking at Gutenberg's model here, not 20th century digital offset printing.


Anyway, the MathWG has fought extremely hard for 15 years to make mathematics a first class citizen on the web. Certainly, MathML is only the beginning for math on the web. But abandoning it now will throw scientific content back 20 years.

Personally, I don't want to wait for another Knuth to show up and fix the problem.


Peter.

Joshua Cranmer 🐧

May 5, 2013, 9:10:29 PM
On 5/5/2013 6:40 PM, Benoit Jacob wrote:
> Well, I have written hundreds of pages of TeX; for sure, some large
> equations would expand over more than one line of TeX, but I can't remember
> going over more than 5 lines of TeX source (without custom helper macros)
> per actual line of output, and that would be a really unusual case ---
> while the MathML example above has a ratio of 30 source lines to 1 output
> line.

For what it's worth, to compare the TeX to that MathML properly, you'd have to count, e.g., \frac{a}{b} as three lines:
\frac
  {a}
  {b}

An example I have of several lines of TeX-per-output-line is the following (some Big-Step semantics rules):
\frac{\langle b,\sigma\rangle \Downarrow \mathtt{true}\quad
      \langle \text{$s$ while ($b$) $s$},\sigma\rangle\Downarrow
      \langle t, \sigma_1\rangle}
     {\langle \text{while ($b$) $s$},\sigma\rangle \Downarrow
      \langle t, \sigma_1\rangle}

The rendered output of this would be (hopefully the MathML makes it through):
  ⟨b, σ⟩ ⇓ true     ⟨s while (b) s, σ⟩ ⇓ ⟨t, σ1⟩
  ------------------------------------------------
            ⟨while (b) s, σ⟩ ⇓ ⟨t, σ1⟩
Entering one of the lines in MathML in a more compact representation comes out to:
<mo>〈</mo><mtext>while (<mi>b</mi>) <mi>s</mi></mtext><mo>,</mo><mi>σ</mi><mo>〉</mo><mo></mo

So it's not a factor of 30-to-1 in verbosity, more like a factor of 2-to-1 or 3-to-1. Certainly the same order of magnitude. You might argue that I'm cheating by using Unicode characters instead of entities, but the LaTeX-to-MathML conversion tools I've seen all output UTF-8, and UTF-8 is generally much better supported by browsers than by TeX processors, so it's not an unrealistic assumption for how the text looks.
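
A rough way to check this kind of ratio is to serialize a small expression compactly and compare character counts. The Python sketch below does that for \frac{a}{b}; the helper name frac_to_mathml is hypothetical, and a single fraction is of course not representative of every equation.

```python
import xml.etree.ElementTree as ET


def frac_to_mathml(numerator: str, denominator: str) -> str:
    """Serialize numerator/denominator as a compact MathML <mfrac>."""
    mfrac = ET.Element("mfrac")
    for name in (numerator, denominator):
        # Each identifier becomes an <mi> child of <mfrac>.
        ET.SubElement(mfrac, "mi").text = name
    return ET.tostring(mfrac, encoding="unicode")


tex = r"\frac{a}{b}"
mathml = frac_to_mathml("a", "b")
ratio = len(mathml) / len(tex)  # characters of MathML per character of TeX

print(mathml)
print(round(ratio, 1))
```

Written on one line, the MathML for this fraction is about three times the length of the TeX, which is in line with the 2-to-1 or 3-to-1 estimate rather than 30-to-1.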

-- 
Joshua Cranmer
Thunderbird and DXR developer
Source code archæologist

Benoit Jacob

May 5, 2013, 10:46:01 PM
to p.kraut...@gmail.com, dev-platform
Let me just reply to a few points to keep this conversation manageable:

2013/5/5 <p.kraut...@gmail.com>

> Here are a couple of reasons why dropping MathML would be a bad idea.
> (While I wrote this others made some of the points as well.)
>
> * MathML is part of HTML5 and epub3.
>

That MathML is part of epub3, is useful information. It doesn't mean that
MathML is good but it means that it's more entrenched than I knew.

We don't care about "this is part of HTML5" arguments (or else we would
support all the crazy stuff that flies on public-fx@w3...)


> * Gecko has the very best native implementation out there, only a few
> constructs short of complete.
> * Killing it off means Mozilla gives up a competitive edge against all
> other browser engines.
> * MathML is widely used. Almost all publishers use XML workflows and in
> those MathML for math. Similarly, XML+MathML dominates technical writing.
> * In particular, the entire digital textbook market and thus the entire
> educational sector comes out of XML/MathML workflows right now.
> * MathML is the only format supported by math-capable accessibility tools
> right now.
> * MathML is just as powerful for typesetting math as TeX is. Publishers
> have been converting TeX to XML for over a decade (e.g., Wiley, Springer,
> Elsevier). Fun fact: the Math WG and the LaTeX3 group overlap.
> * Limitations of browser support do not mean that the standard is
> limited.
>

> From a MathJax point of view
>
> * MathJax uses MathML as its internal format.
> * MathJax output is ~5 times slower than native support. This is after 9
> years of development of jsmath and MathJax (and javascript engines).
>

JavaScript performance hasn't stopped improving and is already far better
than 5x slower than native on use cases (like the Unreal Engine 3 demo)
that were a priori much harder for JavaScript.



> * The performance issues lie solely with rendering MathML using HTML
> constructs.
> * Performance is the only reason why Wikipedia continues to use images.
>

Then fix performance? With recent JavaScript improvements, if you really
can't get faster than within 5x of native, then you must be running into a
browser bug. The good thing with rendering with general HTML constructs is
that improving performance for such use cases benefits the entire browser.
If you pit browsers against each other on such a benchmark, you should be
able to generate enough competitive pressure between browser vendors to
force them to pay attention.


> * JavaScript cannot access font metrics, so MathJax can only use fonts
> we're able to teach it to use.
>

Has that issue been brought up in the right places before (like, on this
very mailing list?) Accessing font metrics sounds like something reasonable
that would benefit multiple applications (like PDF.js).


> * While TeX and the basic LaTeX packages are stable, most macro packages
> are unreliable. Speaking as a mathematician, it's often hard to compile my
> own TeX documents from a few years ago. You can also ask the arXiv folks
> how painful it is to do what they do.
>

I'm also speaking as a (former) mathematician, and I've never had to rely
on TeX packages that aren't found in every sane TeX distribution (when I
stopped using TeX on a daily basis, TexLive was what everybody seemed to be
using).

But that's not relevant to my proposal (of considering a suitable subset of
TeX-plus-some-packages) because we could write this specification in a way
that mandates support for a fixed set of functionality, much like other Web
specifications do.
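
As a sketch of what "a fixed set of functionality" could mean in practice, a conforming implementation might reject any control word outside an enumerated list. The Python below assumes a purely hypothetical ALLOWED_COMMANDS set; an actual specification would define its own.

```python
import re

# Hypothetical fixed command set, for illustration only.
ALLOWED_COMMANDS = frozenset(
    {r"\frac", r"\sqrt", r"\int", r"\sum", r"\pm", r"\pi"}
)


def unsupported_commands(tex: str) -> set:
    """Return the control words in `tex` that fall outside the fixed subset."""
    used = set(re.findall(r"\\[A-Za-z]+", tex))
    return used - ALLOWED_COMMANDS


print(unsupported_commands(r"\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}"))
print(unsupported_commands(r"\xymatrix{A \ar[r] & B}"))
```

Under this assumption the quadratic formula passes, while package-specific commands such as those from xypic are reported as unsupported.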


>
>
> Personal remarks
>
> MathML still feels a lot like HTML 1 to me. It's only entered the web
> natively in 2012. We're lacking a lot of tools, in particular open source
> tools (authoring environments, cross-conversion, a11y tools etc).
>

I'm concerned every time I hear "native" as an inherent quality. As I tried
to explain above, if something can be done in browsers without "native"
support, that's much better. The job of browser vendors is to be picky
gatekeepers to limit the number of different specialized things that
require "native" support. Whence my specific interest in MathJax here.


>
> But that's a bit like complaining in 1994 that HTML sucks and that there's
> TeX which is so much more natural with \chapter and \section and has higher
> typesetting quality anyway.
>

It would have been extremely easy to rebut such arguments as irrelevant and
counter them by much stronger arguments why TeX couldn't do the job that
HTML does.

I am still waiting for the rebuttal of my arguments, in the original email
in this thread, about how TeX is strictly better than MathML for the
particular task of representing equations. As far as I can see, MathML's
only inherent claim to existence is "it's XML", and being XML stopped being
a relevant selling point for a Web spec many years ago (or else we'd be
stuck with XHTML).



>
> I'm totally for MusicML! More generally, there are things like CellML, CML
> and other scientific standards. I'd encourage them to work towards becoming
> web standards, to prove that the web is truly the native place for all
> human communication.
>

That's our perspective mismatch right there.

Application developers want features: nothing could be more natural.

The job of browser developers is to say "no" unless the feature is really
necessary for doing something important on the Web.

That's why the MathJax part of our conversation, above, is the most useful
one.

Cheers,
Benoit

Joshua Cranmer 🐧

May 5, 2013, 11:23:56 PM5/5/13
to
On 5/5/2013 9:46 PM, Benoit Jacob wrote:
> I am still waiting for the rebuttal of my arguments, in the original
> email in this thread, about how TeX is strictly better than MathML for
> the particular task of representing equations. As far as I can see,
> MathML's only inherent claim to existence is "it's XML", and being XML
> stopped being a relevant selling point for a Web spec many years ago
> (or else we'd be stuck with XHTML)

Don't be quick to dismiss the utility of XML. The problem of XHTML, as I
understand it, was that the XHTML2 spec ignored the needs of its
would-be users and designed stuff that was untenable. XHTML as in "a
representation of the HTML DOM in XML syntax" isn't a bad idea to me.
Note that I'm really defining XML here as "the basic representation
format of HTML."

In this case, I think the XML nature of MathML actually works to its
benefit: it uses the same basic framework and "look and feel" as HTML,
so you can very easily insert arbitrary HTML into your equation. A
TeX-like language would have to invent awkward wrappers for this same
functionality, like \html{<b>I can insert arbitrary HTML!</b>}. It also
creates its own implicit DOM structure for manipulation, and provides
very natural launchpads for extra styling or scripting.

p.kraut...@gmail.com

May 6, 2013, 1:27:41 AM5/6/13
to
Benoit, you said you need proof that MathML is better than TeX. I think it's the reverse at this point (from a web perspective -- you'll never get me to use Word instead of TeX privately ;) ).

Anyway, let me try to repeat how I had addressed your original points in my first post.

1.1. You make a point against adding unnecessary typography. Mathematics is text, but it adds new requirements. It's comparable to the introduction of RTL or tables much more than musical notation. It's also something that all school children will encounter for 9-12 years. IMHO, this makes it necessary to implement mathematical typesetting functionality.

1.2 you claimed MathML is inferior to TeX. I've tried to point out that that's not the case as most scientific and educational publishers use it extensively.

1.2.1 You claimed TeX is the universal standard. I've tried to point out that only research mathematicians use it as a standard. Most mathematics happens outside that group.

1.2.2 You pointed out that MathML isn't friendly to manual input. That's true but HTML isn't very friendly either, nor is SVG.

1.2.3 You argued TeX is superior for accessibility. I've pointed out that that's not the case given the current technology landscape.

2 You wrote now is the time to drop MathML. I've tried to point out that now -- as web and ebook standard -- is the time to support it, especially when your implementation is almost complete and you're looking to carve a niche out of the mobile and mobile OS market, ebooks etc.

2.1 you claim MathML never saw traction outside of Firefox. I tried to point out that MathML has huge traction in publishing and the educational sector, even if it wasn't visible on the web until MathJax came along. Google wants MathML support (they just don't trust the current code) while Apple has happily advertised with the MathML they got for free. Microsoft indeed remains a mystery.

2.2 you claim MathJax does a great job -- ok, I'm not going to argue ;) -- while browsers don't. But we've used native output on Firefox before MathJax 2.0 and plan to do it again soon -- it is well implemented and can provide the same quality of typesetting.

3. Well, I'm not sure what to say to those. If math is a basic typographical need, then the syntax doesn't matter: we need to see it implemented, and its bottom-up layout process clashes with CSS's top-down process. No change in syntax will resolve that.

Since MathML development involved a large number of TeX and computer algebra experts, I doubt a TeX-like syntax will end up being extremely different from MathML the second time around.

Instead of fighting over syntax, I would prefer to focus on improving the situation of mathematics on the web -- so thank you for your offer to support us in fixing bugs and improving HTML layout.

Peter.

Robert O'Callahan

May 6, 2013, 1:58:21 AM5/6/13
to Benoit Jacob, dev-platform
On Mon, May 6, 2013 at 10:52 AM, Benoit Jacob <jacob.b...@gmail.com> wrote:

> 2013/5/5 Robert O'Callahan <rob...@ocallahan.org>
>
>> I would also say that one big difference between MathML and a
>> hypothetical TeX-based format is that MathML has a DOM and it's not clear
>> how to fit TeX into a DOM. That may not matter much for rendering, but it
>> does if you want to support editing.
>>
>
> That sounds interesting; could you please expand a little more?
> - Do you mean that in order to support editing well, the format must be
> naturally parseable into a tree representation? (Why so?)
>

We expose HTML and SVG content to Web applications by structuring that
content as a tree and then exposing it using standard DOM APIs. These APIs
let you examine, manipulate, parse and serialize content subtrees. They
also let you handle events on that content. CSS also depends on content
having a DOM tree structure for selectors and inheritance to work. You
definitely need to be able to handle events and apply CSS to elements of your
math markup.

I mentioned editing because I thought you'd want to reuse DOM text node
editing in a MathML editor.

Introducing a new kind of document markup that can't be manipulated via DOM
APIs is a non-starter. And of course, once you've figured out how to expose
math content in a DOM API, people are going to expect the source language
to use HTML-like angle-bracket syntax like everything else that parses to a
DOM.
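
To make the DOM point concrete, here is a minimal sketch (not from the thread) of examining, modifying, and serializing a math subtree through a generic DOM API. It uses Python's standard xml.dom.minidom as a stand-in for the browser DOM; the sample markup and the tag_square_roots helper are assumptions chosen purely for illustration.

```python
from xml.dom import minidom

# Hypothetical sample markup: the square root of a.
MATHML = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<msqrt><mi>a</mi></msqrt>"
    "</math>"
)


def tag_square_roots(source: str, class_name: str = "sqrt") -> str:
    """Parse MathML, add a class to every <msqrt>, and serialize it back."""
    doc = minidom.parseString(source)
    # Examine the tree: find every square-root element.
    for node in doc.getElementsByTagName("msqrt"):
        # Manipulate the tree: attach a class that a stylesheet could target.
        node.setAttribute("class", class_name)
    # Serialize the modified subtree back to markup.
    return doc.documentElement.toxml()


print(tag_square_roots(MATHML))
```

Nothing in this sketch is specific to math: the same three calls would work on HTML or SVG content, which is the reuse being argued for above.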

Robert O'Callahan

May 6, 2013, 2:14:18 AM5/6/13
to Benoit Jacob, dev-platform, p.kraut...@gmail.com
Let me go on a bit of a rampage about TeX for a bit.

TeX is not a markup format. It is an executable code format. It is a
programming language by design! (It's a very poor programming language, but
let's ignore that for the moment.) You run a TeX program to generate the
rendered output. This has some major implications:
-- It's very hard to write a universal WYSIWYG editor. While I was still in
research I tried various WYSIWYG TeX editors. They all sucked because it's
an intractable problem. That's not a problem for programmers,
mathematicians and scientists who are used to writing everything in
plaintext with emacs. It is for everyone else.
-- You have an edit/compile/debug cycle and your TeX can fail with
compilation/runtime errors. "Catastrophic fail" document content models
(like XML) really suck for Web content. (Yes, MathML is XML, but people can
and should use the HTML embedding which avoids this problem.)

(Because I like WYSIWYG and I don't like edit/compile/debug cycles and TeX
is atrocious as a language, I tried to avoid it in my research. I published
a POPL paper full of type theory written in Microsoft Word (which is
totally unheard of), and wrote my thesis which also includes a lot of
semantics and type theory in FrameMaker, which was actually pretty good but
is very dead. (I had an officemate who wrote his thesis in Scribe, which
was very dead even in the mid-90s!))

You could try to fix TeX's problems in a new math language, but computer
scientists have been talking about that for decades and nobody has. Of
course, computer scientists and mathematicians would probably continue to
prefer a Turing-complete language, which is fine for them but again, not
suitable for normal users for the above reasons. And of course, to the
extent you change TeX, you break compatibility with TeX, which is much of
its appeal in the first place.

Robert O'Callahan

May 6, 2013, 2:21:09 AM5/6/13
to Benoit Jacob, dev-platform, p.kraut...@gmail.com
On Mon, May 6, 2013 at 6:14 PM, Robert O'Callahan <rob...@ocallahan.org> wrote:

> wrote my thesis which also includes a lot of semantics and type theory in
> FrameMaker, which was actually pretty good but is very dead.
>

Correction: it's alive! Amazing.

papa...@gmail.com

May 6, 2013, 4:20:38 AM5/6/13
to
On Monday, 6 May 2013 07:27:41 UTC+2, p.kraut...@gmail.com wrote:
>
> Microsoft indeed remains a mystery.
>

Not so much when it comes to Microsoft Office:
http://blogs.msdn.com/b/murrays/

smaug

May 6, 2013, 5:01:22 AM5/6/13
to Benoit Jacob, p.kraut...@gmail.com
On 05/06/2013 05:46 AM, Benoit Jacob wrote:
> Let me just reply to a few points to keep this conversation manageable:
>
> 2013/5/5 <p.kraut...@gmail.com>
>
>> Here are a couple of reasons why dropping MathML would be a bad idea.
>> (While I wrote this others made some of the points as well.)
>>
>> * MathML is part of HTML5 and epub3.
>>
>
> That MathML is part of epub3 is useful information. It doesn't mean that
> MathML is good, but it means that it's more entrenched than I knew.
>
> We don't care about "this is part of HTML5" arguments (or else we would
> support all the crazy stuff that flies on public-fx@w3...)

We do care about the stuff that is in the HTML spec.
http://www.whatwg.org/specs/web-apps/current-work/#mathml
(and if there is something we don't care about, it should be removed from the spec)

Benoit Jacob

May 6, 2013, 7:22:55 AM5/6/13
to Peter Krautzberger, dev-platform
Thanks Peter: that point-for-point format makes it easier for me to
understand your perspective on the issues that I raised.

2013/5/6 <p.kraut...@gmail.com>

> Benoit, you said you need proof that MathML is better than TeX. I think
> it's the reverse at this point (from a web perspective -- you'll never get
> me to use Word instead of TeX privately ;) ).
>
> Anyway, let me try to repeat how I had addressed your original points in
> my first post.
>
> 1.1. You make a point against adding unnecessary typography. Mathematics
> is text, but it adds new requirements. It's comparable to the introduction
> of RTL or tables much more than musical notation. It's also something that
> all school children will encounter for 9-12 years. IMHO, this makes it
> necessary to implement mathematical typesetting functionality.
>

School children are only on the reading end of math typesetting, so for
them, AFAICS, it doesn't matter that math is rendered with MathML or with
MathJax's HTML+CSS renderer.


> 1.2 you claimed MathML is inferior to TeX. I've tried to point out that
> that's not the case as most scientific and educational publishers use it
> extensively.
>
> 1.2.1 You claimed TeX is the universal standard. I've tried to point out
> that only research mathematicians use it as a standard. Most mathematics
> happens outside that group.
>

I suppose that I can only accept your data as better documented than mine;
most of the TeX users I know are or have been math researchers.


> 1.2.2 You pointed out that MathML isn't friendly to manual input. That's
> true but HTML isn't very friendly either, nor is SVG.
>

It's not comparable at all.

If you're writing plain text, HTML's overhead is limited to some <br> or
<p> tags, with maybe the usual <b>, <i>, heading... so the overhead is
small compared to the size of your text.

If you add many anchors and links, and some style, the overhead can grow
significantly, but is hardly going to be more than 2 input lines per output
line.

With MathML, we're talking about easily over 10 input lines per output line
--- in wikipedia's example, MathML has 30 where TeX has 1.

So contrary to HTML, nobody's going to actually write MathML code by hand
for anything more than a few isolated equations.

Thanks also for your other points below, to which I'm not individually
replying; we have a perspective mismatch here, so it's interesting for me
to understand your perspective, but I'm not going to win a fight against
the entire publishing industry which you say is already behind MathML.

Benoit

Benoit Jacob

May 6, 2013, 7:27:08 AM5/6/13
to Robert O'Callahan, dev-platform
2013/5/6 Robert O'Callahan <rob...@ocallahan.org>

> We expose HTML and SVG content to Web applications by structuring that
> content as a tree and then exposing it using standard DOM APIs. These APIs
> let you examine, manipulate, parse and serialize content subtrees. They
> also let you handle events on that content. CSS also depends on content
> having a DOM tree structure for selectors and inheritance to work. You
> definitely need to be able to handle events and apply CSS to elements of your
> math markup.
>

I guess I don't see the usefulness of allowing to apply style to individual
parts of an equation --- applying a single style to an entire equation
would be plenty enough as far as I can see.

Regarding editing, if I understand correctly, you have WYSIWYG or other
kinds of fancy editing in mind, where understanding of the syntax tree
inside of the equation is needed; I haven't seen a need for WYSIWYG editing
of math, but I don't want to try to fight the war "for or against WYSIWYG".

Benoit


>
> I mentioned editing because I thought you'd want to reuse DOM text node
> editing in a MathML editor.
>
> Introducing a new kind of document markup that can't be manipulated via
> DOM APIs is a non-starter. And of course, once you've figured out how to
> expose math content in a DOM API, people are going to expect the source
> language to use HTML-like angle-bracket syntax like everything else that
> parses to a DOM.
>

Benoit Jacob

May 6, 2013, 7:31:36 AM5/6/13
to Robert O'Callahan, dev-platform, Peter Krautzberger
2013/5/6 Robert O'Callahan <rob...@ocallahan.org>

> Let me go on a bit of a rampage about TeX for a bit.
>
> TeX is not a markup format. It is an executable code format. It is a
> programming language by design!
>

Yes, but a small subset of TeX could be purely a markup format, not a
programming language. Just support a finite list of common TeX math
operations, and no custom macros (or very restricted ones).

Benoit
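
For readers wondering what such a restricted subset could look like in practice, here is a minimal sketch (not from the thread). It assumes a hypothetical whitelist of four commands and no macro definitions at all, and it parses the input into a plain tree of tuples rather than executing anything. The command list, the tuple shapes, and the Parser and parse names are all assumptions made for illustration.

```python
import re

# Hypothetical whitelist: command name -> number of arguments it takes.
ALLOWED = {"frac": 2, "sqrt": 1, "int": 0, "pm": 0}

# A token is a backslash followed by letters, or any single non-space character.
TOKEN = re.compile(r"\\[A-Za-z]*|\S")


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        tok = self.peek()
        if tok is None:
            raise ValueError("unexpected end of input")
        self.pos += 1
        return tok

    def parse_row(self, closing=None):
        """Parse a sequence of items up to `closing` (or the end of input)."""
        items = []
        while self.peek() != closing:
            if self.peek() is None:
                raise ValueError("missing closing brace")
            items.append(self.parse_scripted())
        return ("row", items)

    def parse_scripted(self):
        """Parse an atom followed by any number of ^ and _ scripts."""
        base = self.parse_atom()
        while self.peek() in ("^", "_"):
            kind = "sup" if self.next() == "^" else "sub"
            base = (kind, base, self.parse_atom())
        return base

    def parse_atom(self):
        tok = self.next()
        if tok == "{":
            row = self.parse_row(closing="}")
            self.next()  # consume the closing brace
            return row
        if tok in ("}", "^", "_"):
            raise ValueError(f"unexpected {tok!r}")
        if tok.startswith("\\"):
            name = tok[1:]
            if name not in ALLOWED:
                # No custom macros: anything off the whitelist is rejected.
                raise ValueError(f"unsupported command {tok!r}")
            args = [self.parse_atom() for _ in range(ALLOWED[name])]
            return ("cmd", name, args)
        return ("sym", tok)


def parse(source):
    return Parser(TOKEN.findall(source)).parse_row()


print(parse(r"\frac{a}{b}"))
print(parse(r"\int_0^1 x^2 dx"))
```

Because \def and every other command off the whitelist is rejected, the input is pure markup and always yields a tree or an error. The sketch is deliberately crude: each digit is its own symbol, for example, so a real specification would need a proper lexical grammar.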

Mike Hommey

May 6, 2013, 7:45:55 AM5/6/13
to Benoit Jacob, dev-platform
On Mon, May 06, 2013 at 07:27:08AM -0400, Benoit Jacob wrote:
> 2013/5/6 Robert O'Callahan <rob...@ocallahan.org>
>
> > We expose HTML and SVG content to Web applications by structuring that
> > content as a tree and then exposing it using standard DOM APIs. These APIs
> > let you examine, manipulate, parse and serialize content subtrees. They
> > also let you handle events on that content. CSS also depends on content
> > having a DOM tree structure for selectors and inheritance to work. You
> > definitely need to be able to handle events and apply CSS to elements of your
> > math markup.
> >
>
> I guess I don't see the usefulness of allowing to apply style to individual
> parts of an equation --- applying a single style to an entire equation
> would be plenty enough as far as I can see.

Stupid example that can be useful:
<style>
.sqrt { color: red }
.sqrt * { font-style: italic; color: black }
</style>
<p>A square root is denoted by <math><msqrt
class="sqrt"><mi>a</mi></msqrt></math>, where the radical sign, or
radix, is the symbol in red.</p>

> Regarding editing, if I understand correctly, you have WYSIWYG or other
> kinds of fancy editing in mind, where understanding of the syntax tree
> inside of the equation is needed; I haven't seen a need for WYSIWYG editing
> of math

Seriously? I was a very happy user of the MS Word Equation Editor when I
was in high school.

Mike

Boris Zbarsky

May 6, 2013, 8:19:31 AM5/6/13
to
On 5/6/13 7:27 AM, Benoit Jacob wrote:
> I guess I don't see the usefulness of allowing to apply style to individual
> parts of an equation

Styling parts of an equation with different colors can be _extremely_
useful for readability. It's rarely done in print, of course, and I
assume there are various reasons ranging from "it's more expensive" to
"no one does that" for why. But on the web it seems like a no-brainer.

Styling parts of an equation with different font styles is of course all
over the place; there are lots of TeX packages that will let you do
things like \mathfrak, for example. Of course fraktur in particular got
stuck into Unicode...

There are some interesting use cases I can think of for scripted
visibility styling in educational materials.

> Regarding editing, if I understand correctly, you have WYSIWYG or other
> kinds of fancy editing in mind, where understanding of the syntax tree
> inside of the equation is needed; I haven't seen a need for WYSIWYG editing
> of math

I think this goes back to roc's point about current TeX workflows being
ok for specialists (maybe; I have in fact wished for a good wysiwyg
editor for TeX on many an occasion, but was always stymied by the need
for custom macros for my documents), but most people _do_ in fact want
wysiwyg editing. It's not "fancy" for most people but a baseline
requirement. So any system for math on the web needs to have support
for that requirement...

-Boris

Boris Zbarsky

May 6, 2013, 8:24:07 AM5/6/13
to
On 5/5/13 10:46 PM, Benoit Jacob wrote:
>> * MathJax output is ~5 times slower than native support. This is after 9
>> years of development of jsmath and MathJax (and javascript engines).
>
> JavaScript performance hasn't stopped improving and is already far better
> than 5x slower than native on use cases (like the Unreal Engine 3 demo)
> that were a priori much harder for JavaScript.

This is a layout/css issue, not a js engine issue, I suspect. MathJax
HTML output just ends up having to produce lots of stuff, do lots of
layout calculations (which means doing layout!) then redo all the layout
again based on the results of those calculations.

It's really hard to make 2+ layout passes as fast as one layout pass.

> I'm also speaking as a (former) mathematician, and I've never had to rely
> on TeX packages that aren't found in every sane TeX distribution

It depends on your timeframe.

The packages I used in the mid-to-late '90s for embedding images in
documents no longer exist; their current replacements (with different
syntax) did not exist then.

> I am still waiting for the rebuttal of my arguments, in the original email
> in this thread, about how TeX is strictly better than MathML for the
> particular task of representing equations.

How easy is it to build an accessibility application on top of TeX, or
even a restricted subset of it? Note that these exist for MathML, but
not so much for TeX.

I guess this comes down to how easy it is to construct exactly the same
parse/syntax tree out of TeX, right?

-Boris

Joshua Cranmer 🐧

May 6, 2013, 8:36:20 AM5/6/13
to
On 5/6/2013 6:27 AM, Benoit Jacob wrote:
> I guess I don't see the usefulness of allowing to apply style to individual
> parts of an equation --- applying a single style to an entire equation
> would be plenty enough as far as I can see.

Suppose you were writing an introductory course in which you explain
the derivation of a complex formula step by step. You could illustrate
the changes in each step with a different color. You could also use
strike-through text formatting to clearly indicate terms that cancel.
>
> Regarding editing, if I understand correctly, you have WYSIWYG or other
> kinds of fancy editing in mind, where understanding of the syntax tree
> inside of the equation is needed; I haven't seen a need for WYSIWYG editing
> of math, but I don't want to try to fight the war "for or against WYSIWYG".
>
I would wager that the majority of HTML content in the wild is not
written by people who write HTML in a text editor but by people who use
some sort of WYSIWYG tool or document format conversion--I'm including
subsets like email and E-PUB here. Also, this strikes me as very biased
towards the frame of mind that "real mathematicians use TeX"--I was
introduced to the Equation Editor in Microsoft Office more or less as
part of the regular course of study, long before I was introduced to TeX
in any form.

Trevor Saunders

May 6, 2013, 9:12:48 AM5/6/13
to dev-pl...@lists.mozilla.org
On Mon, May 06, 2013 at 08:24:07AM -0400, Boris Zbarsky wrote:
> >I am still waiting for the rebuttal of my arguments, in the original email
> >in this thread, about how TeX is strictly better than MathML for the
> >particular task of representing equations.
>
> How easy is it to build an accessibility application on top of TeX,
> or even a restricted subset of it? Note that these exist for
> MathML, but not so much for TeX.

I actually think it would be easier to map TeX math into the
accessibility APIs we support than MathML.

Currently we don't expose MathML at all other than as an object that
we say is an equation, and it's not really clear how to fix that with
MathML.

Trev

>
> I guess this comes down to how easy it is to construct exactly the
> same parse/syntax tree out of TeX, right?
>
> -Boris

fred...@mathjax.org

May 6, 2013, 10:13:04 AM5/6/13
to
I don't have time to respond right now, but regarding accessibility, mathematics is more complex in that case too. Basically, the two use cases I'm aware of are:

- For blind people or people with other visual disabilities, a speech synthesizer must follow the MathSpeak rules. Simply reading the text "normally", e.g. from a LaTeX or ASCII source, is ambiguous.

- For people with reading disabilities (dyslexia etc.), you need to synchronize the highlighting of equation parts with the reading of those parts.

In both cases, you must know a bit more about the mathematical structure, e.g. by having a DOM. It's not clear how to do that with plain text. It's just absurd to believe that putting TeX source inside the alt text of an <img> makes the formula accessible. It might work for very simple equations like x+2, but in general you'll have to do some parsing into an abstract representation if you want to read/highlight it correctly. With MathML you already have a standard representation, and there already exist tools to work with that language.
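
As a rough illustration of why the tree matters (a sketch, not from the thread), the following walks a parsed MathML fragment with Python's standard xml.etree.ElementTree and produces an unambiguous spoken string. The wording rules, the OPERATORS table, and the speak function are invented for this example and are not the actual MathSpeak rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical spoken names for a few operators.
OPERATORS = {"+": "plus", "-": "minus", "=": "equals"}


def speak(node):
    """Turn a MathML element into words, using its structure to mark grouping."""
    tag = node.tag.split("}")[-1]  # drop an XML namespace prefix if present
    if tag in ("mi", "mn"):
        return (node.text or "").strip()
    if tag == "mo":
        symbol = (node.text or "").strip()
        return OPERATORS.get(symbol, symbol)
    children = [speak(child) for child in node]
    if tag == "mfrac" and len(children) == 2:
        return f"start fraction {children[0]} over {children[1]} end fraction"
    if tag == "msup" and len(children) == 2:
        return f"{children[0]} to the power {children[1]}"
    if tag == "msqrt":
        return f"square root of {' '.join(children)} end root"
    # <math>, <mrow> and anything unrecognised: read the children in order.
    return " ".join(children)


# (x + 1) / 2 as presentation MathML.
SOURCE = (
    "<math><mfrac>"
    "<mrow><mi>x</mi><mo>+</mo><mn>1</mn></mrow>"
    "<mn>2</mn>"
    "</mfrac></math>"
)

print(speak(ET.fromstring(SOURCE)))
```

A flat reading such as "x plus 1 over 2" cannot distinguish (x+1)/2 from x+(1/2); the explicit start and end markers come directly from the element boundaries in the tree. The same traversal could also record which element is currently being read, which is what synchronized highlighting would need.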

pa...@dessci.com

May 6, 2013, 12:25:39 PM5/6/13
to
I'm coming late to this thread but I have to say that the misunderstanding present in the original post is huge. The author can take refuge in that he's made a common category mistake. MathML is a computer representation for math, TeX is a human input language.

MathML was never intended to be typed by humans so it is no wonder that you find it a bad experience. TeX is a poor computer representation which is one reason why MathML was invented.

It is reasonable to have a discussion of the relative merits of entering math by typing TeX vs point-and-click editing of math (ie, direct manipulation editing). I am biased toward the latter but I can understand the feelings of those whose hands know TeX really well.

In short, both MathML and TeX have good reasons to exist and don't compete with each other in their primary categories.

msc...@googlemail.com

May 6, 2013, 2:30:51 PM5/6/13
to
On Monday, 6 May 2013 14:12:48 UTC+1, Trevor Saunders wrote:
> On Mon, May 06, 2013 at 08:24:07AM -0400, Boris Zbarsky wrote:
>
> > >I am still waiting for the rebuttal of my arguments, in the original email
> > >in this thread, about how TeX is strictly better than MathML for the
> > >particular task of representing equations.
> >
> > How easy is it to build an accessibility application on top of TeX,
> > or even a restricted subset of it? Note that these exist for
> > MathML, but not so much for TeX.
>
> I actually think it would be easier to map TeX math into the
> accessibility APIs we support than MathML.

There are several problems/issues here:

# Context

How do you differentiate/identify math powers (e.g. "a^2"), footnotes (e.g. "some text^1") and code ("int c = a^b;")?

With MathML markup, you have clearly identified what the content of the document/sub-tree is.

# Parsing

With a TeX-like format, a speech synthesiser/screen reader/web browser would need to write a parser for that format.

With MathML, the parsing is already handled by the SGML/XML/HTML5 parser so the application can process it via DOM/SAX/a reader API.

> Currently we don't expose MathML at all other than as an object that
> we say is an equation, and it's not really clear how to fix that with
> MathML.

This is enough information for the screen reader/speech synthesiser to know that it has MathML content, and thus walk the MathML DOM to read the math out loud. It should also be enough to query associated CSS styles to handle any Aural CSS or CSS Speech styles associated with the MathML.

Another important consideration is existing web content. If you are going to start rendering text that has e.g. "a^2" as math, then all documents that already contain such text will break, e.g. "<p>You can use a^b in TeX to denote 'a raised to the b<sup>th</sup> power'.</p>"

- Reece
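
The context problem can be shown in a few lines. This is a minimal sketch, not from the thread: CARET is a deliberately naive pattern and the sample strings are adapted from the examples above. The point is only that unmarked text gives a consumer nothing to tell the three cases apart.

```python
import re

# Naive assumption: "word ^ word" always means a mathematical power.
CARET = re.compile(r"\w+\^\w+")

SAMPLES = {
    "math": "The square of a is a^2.",
    "footnote": "See some text^1 for details.",
    "code": "int c = a^b;",
}


def find_carets(text):
    """Return every substring that the naive pattern would treat as a power."""
    return CARET.findall(text)


for kind, text in SAMPLES.items():
    # Every sample matches, although only the first one is actually math.
    print(kind, find_carets(text))
```

With explicit markup, the element name carries this information before the content is ever inspected.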

Trevor Saunders

May 6, 2013, 2:57:16 PM5/6/13
to dev-pl...@lists.mozilla.org
On Mon, May 06, 2013 at 11:30:51AM -0700, msc...@googlemail.com wrote:
> On Monday, 6 May 2013 14:12:48 UTC+1, Trevor Saunders wrote:
> > On Mon, May 06, 2013 at 08:24:07AM -0400, Boris Zbarsky wrote:
> >
> > > >I am still waiting for the rebuttal of my arguments, in the original email
> > > >in this thread, about how TeX is strictly better than MathML for the
> > > >particular task of representing equations.
> > >
> > > How easy is it to build an accessibility application on top of TeX,
> > > or even a restricted subset of it? Note that these exist for
> > > MathML, but not so much for TeX.
> >
> > I actually think it would be easier to map TeX math into the
> > accessibility APIs we support than MathML.
>
> There are several problems/issues here:
>
> # Context
>
> How do you differentiate/identify math powers (e.g. "a^2"), footnotes (e.g. "some text^1") and code ("int c = a^b;")?

The same way the TeX parser does, though that would be a problem for the
API consumer to deal with, not us.

> With MathML markup, you have clearly identified what the content of the document/sub-tree is.
>
> # Parsing
>
> With a TeX-like format, a speech synthesiser/screen reader/web browser would need to write a parser for that format.
>
> With MathML, the parsing is already handled by the SGML/XML/HTML5 parser so the application can process it via DOM/SAX/a reader API.

which has just changed the problem from parsing text to parsing a tree
of objects.

> > currently we don't expose mathml at all other than as an object that
> > we say is an equation, and it's not really clear how to fix that with
> > mathml.
>
> This is enough information for the screen reader/speech synthesiser to know that it has MathML content, and thus walk the MathML DOM to read the math out loud. It should also be enough to query associated CSS styles to handle any Aural CSS or CSS Speech styles associated with the MathML.

No, it is not. Ignoring various evil things we'd really rather they
didn't do, they can't touch the DOM itself.

> Another important consideration is existing web content. If you are going to start rendering text that has e.g. "a^2" as math, then all documents that use that, e.g. "<p>You can use a^b in TeX to denote 'a raised to the b<sup>th</sup> power'.</p>"

I don't think anyone is suggesting that because it obviously would
break existing pages, instead we'd have to do something like <p>this is
some text with an equation <tx>x = 2y</tx></p>
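
As a rough illustration of why an explicit wrapper avoids the "a^b" ambiguity, here is a minimal sketch using Python's standard html.parser. The <tx> element is only the hypothetical tag from the example above, and the `TxExtractor` class is invented for this sketch:

```python
from html.parser import HTMLParser


class TxExtractor(HTMLParser):
    """Collect the text inside <tx> elements and ignore all other text."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # > 0 while inside a <tx> element
        self.buffer = []      # text pieces of the equation being read
        self.equations = []   # finished equation sources

    def handle_starttag(self, tag, attrs):
        if tag == "tx":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "tx" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.equations.append("".join(self.buffer).strip())
                self.buffer = []

    def handle_data(self, data):
        if self.depth > 0:
            self.buffer.append(data)


PAGE = (
    "<p>You can use a^b in TeX to denote a power.</p>"
    "<p>this is some text with an equation <tx>x = 2y</tx></p>"
)

parser = TxExtractor()
parser.feed(PAGE)
parser.close()
print(parser.equations)
```

Only the text inside the wrapper is treated as math; the literal "a^b" in the first paragraph is left alone, which is the property existing pages depend on.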

Trev

>
> - Reece

Benoit Jacob

May 6, 2013, 3:12:12 PM
to dev-platform
We're getting distracted by the comparison with TeX and the discussion of
MathML's relative merits. My bad: I obscured my message by starting two
conversations at once (1.1 and 1.2 in my initial email).

I happily concede this round, given that most people disagree with me about
TeX in this thread.

Can we focus on the other conversation now: should the Web have a
math-specific markup format at all? I claim it shouldn't; I mostly
mentioned TeX as an "if we really wanted one" side note and let it get out of
hand.

How many specific domains will want to have their own domain-specific
markup language next? Chemistry? Biology? Electronics? Music? Flow charts?
Calligraphy?

I suspect that when people start asking for that, we'll quickly have to
start saying "no", and at that point, the exception made for math will seem
unjustified.

I understand, from Boris' email, that there are nontrivial performance
issues associated with relying on generic HTML layout to render math. And
API issues associated with querying font metrics from JavaScript. But
surely it must be possible to overcome these issues, and that would benefit
entire classes of content, not just math.

If tomorrow a competing browser solves these problems, and renders
MathJax's HTML output fast, we will obviously have to follow. That can
easily happen, especially as neither of our two main competitors is
supporting MathML.

Benoit


2013/5/5 Benoit Jacob <jacob.b...@gmail.com>
> without using MathML. Examples include MathJax ( http://www.mathjax.org/), which happily takes either TeX or MathML input and renders it without

Joshua Cranmer 🐧

May 6, 2013, 3:46:04 PM
to
On 5/6/2013 2:12 PM, Benoit Jacob wrote:
> How many specific domains will want to have their own domain-specific
> markup language next? Chemistry? Biology? Electronics? Music? Flow charts?
> Calligraphy?

MathML specifies mathematical formulae, which is not domain-specific,
and is itself a building block for other fields as well. Looking at the
other fields:
Chemical formulas of course can use MathML, and drawing chemical
structures is best built on SVG. Note that even practitioners in the
field are used to basically building these structures with tools like
ChemDraw, which can be thought of as a specialized SVG tool.
I don't know what biology can specify, but I don't think there's much
that they couldn't solve with basic 2D and 3D graphics.
Electronics' circuit diagrams are easily just a set of macros on top of
SVG, as are music and flow charts.

I haven't read the source code of MathJax, but the fact that it isn't a
straight TeX-to-HTML one-pass converter is to me a good sign that MathML
expresses stuff that is not reliably expressible in HTML.

> I suspect that when people start asking for that, we'll quickly have to
> start saying "no", and at that point, the exception made for math will seem
> unjustified.

If no one's asking for the other things, then it's not an issue.

Benoit Jacob

May 6, 2013, 4:45:49 PM
to Joshua Cranmer 🐧, dev-platform
2013/5/6 Joshua Cranmer 🐧 <Pidg...@gmail.com>

> On 5/6/2013 2:12 PM, Benoit Jacob wrote:
>
>> How many specific domains will want to have their own domain-specific
>> markup language next? Chemistry? Biology? Electronics? Music? Flow charts?
>> Calligraphy?
>>
>
> MathML specifies mathematical formulae, which is not domain-specific, and
> is itself a building block for other fields as well. Looking at the other
> fields:
> Chemical formulas of course can use MathML, and drawing chemical
> structures is best built on SVG. Note that even practitioners in the field
> are used to basically building these structures with tools like ChemDraw,
> which can be thought of as a specialized SVG tool.
> I don't know what biology can specify, but I don't think there's much that
> they couldn't solve without basic 2D and 3D graphics.
> Electronics' circuit diagrams are easily just a set of macros on top of
> SVG, as are music and flow charts.
>

Of course not; just like math, music will want a higher level of
abstraction that's not directly tied to graphical rendering, like a set of
SVG macros would be.

In fact, http://en.wikipedia.org/wiki/MusicXML

And in fact... http://en.wikipedia.org/wiki/List_of_XML_markup_languages

Benoit


>
> I haven't read the source code of MathJAX, but the fact that it isn't a
> straight TeX-to-HTML one-pass converter is to me a good sign that MathML
> expresses stuff that is not reliably expressible in HTML.
>
>
>> I suspect that when people start asking for that, we'll quickly have to
>> start saying "no", and at that point, the exception made for math will
>> seem unjustified.
>>
>
> If no one's asking for the other things, then it's not an issue.
>
>
> --
> Joshua Cranmer
> Thunderbird and DXR developer
> Source code archæologist
>
> _______________________________________________
> dev-platform mailing list
> dev-pl...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>

Robert O'Callahan

May 6, 2013, 7:15:56 PM
to Benoit Jacob, dev-platform
On Tue, May 7, 2013 at 7:12 AM, Benoit Jacob <jacob.b...@gmail.com> wrote:

> How many specific domains will want to have their own domain-specific
> markup language next? Chemistry? Biology? Electronics? Music? Flow charts?
> Calligraphy?
>

This is a good question to ask, but I think it would help if there are
specific vocabularies we can use as examples.

I think we can safely say that mathematics is a more compelling domain for
Web content than all those other domains. For years we've had a MathML WG in
the W3C and as far as I know, none of those other domains have ever wanted
a WG at the W3C --- at least, they haven't had one. Likewise we've had and
still have a lot of people pushing for browser support for math, and I
haven't ever noticed anyone pushing for browser support for those other
domains.

I think you can also look at Wikipedia and see a lot of use of math, but
relatively little use of content in those other domains. Probably because
math is a much more general tool than those other domains.

Another thing to consider is how amenable to automatic layout/presentation
a particular XML vocabulary is. I know good automatic music layout is very
difficult. For flow-charts, and I suspect chemistry, it is too. For biology
I don't even know what the browser would do. If there's no known good
automatic layout algorithm then obviously browser rendering of content
doesn't make much sense.

One domain you didn't list where I *have* seen pressure for built-in
browser support is maps. Some people want to extend SVG with features to
support maps, so a browser can just render a map without specialized Web
app support. I don't think that is a good idea because good map layout
algorithms are really difficult (e.g. placing labels). Also, mapping
applications invariably have a lot of functionality that wouldn't make
sense to add to the browser --- direction finding, for example --- so it's
hard to imagine users wanting to use maps outside of the context of some
Web application. There's almost no benefit to anyone supporting maps
outside the context of a mapping Web application.

Robert O'Callahan

May 6, 2013, 7:20:44 PM
to Benoit Jacob, Joshua Cranmer 🐧, dev-platform
Hopefully Web Components will provide a good solution to let authors extend
the browser with support for vocabularies that can be rendered via a
straightforward decomposition to HTML or MathML or SVG.

I think the layout requirements of MathML are too onerous for MathML to be
reduced to HTML or SVG that way.

While diagrams such as chemical formulae, flowcharts or electronics
schematics can be compiled to SVG, the layout step is very much nontrivial
and I don't think Web Components is enough for that. Web Components plus
some JS to do the layout is probably satisfactory.

Brian Smith

May 6, 2013, 9:06:58 PM
to Benoit Jacob, dev-platform
Benoit Jacob wrote:
> Can we focus on the other conversation now: should the Web have a
> math-specific markup format at all? I claim it shouldn't; I mostly
> mentioned TeX as a "if we really wanted one" side note and let it go
> out of hand.
>
> How many specific domains will want to have their own domain-specific
> markup language next? Chemistry? Biology? Electronics? Music? Flow
> charts? Calligraphy?

I hope that all those subjects develop their own domain-specific markup languages. In fact, many of them have: there's MusicXML for music, and OpenType for calligraphy, for example. Things that can help people convey the true meaning of information to each other and that can give machines the necessary assistance to understand that information are generally good.

I think the more important issue is whether browsers should have built-in support for all these things. I think we should make the platform flexible enough and powerful enough that web pages can render, edit, and manipulate the information without any built-in knowledge of the markup from the browser. However, unless/until we ship that, I don't think there should be a rush to remove MathML.

I mean no disrespect to the people who worked on pdf.js, but I have to admit that many frustrating experiences with pdf.js have convinced me that it is even more important than I originally thought to get people publishing scientific and technical writing *natively* in HTML as soon as possible. Simply, we are not "there" yet as far as "render and edit it with your own JS code" goes. Until we are "there," IMO we have to get the web publishing content natively in HTML. That means we should be aiming for high-fidelity (perfect) and high-performance dvi-to-html (and even docx-to-html and xlsx-to-html) conversion at a minimum. (For all the good things about pdf.js, "high fidelity" and "high performance" do not describe it, in my experience.)

> start saying "no", and at that point, the exception made for math
> will seem unjustified.

I think eventually we could say the same thing about SVG (why not just have JS code render Adobe Illustrator drawings using canvas or even WebGL?) and quite a few other things we've built into the platform. We definitely should do what you suggest and improve the core parts of the platform to make such specialized built-in interpreters unnecessary. But, that seems quite far off; we want the web platform to be competitive with various native apps sooner than we can demonstrate success with that strategy.

> If tomorrow a competing browser solves these problems, and renders
> MathJax's HTML output fast, we will obviously have to follow. That
> can easily happen, especially as neither of our two main competitors
> is supporting MathML.

Sure. Nobody's arguing that we shouldn't make MathJax fast. I would argue, though, that we shouldn't remove MathML until there's a viable (equally-usable, equally-round-trippable, equally-performing) replacement.

> School children are only on the reading end of math typesetting, so
> for them, AFAICS, it doesn't matter that math is rendered with MathML
> or with MathJax's HTML+CSS renderer.

School children traditionally have been on the reading end of math typesetting because they get punished for writing in their math books. However, I fully expect that scribbling in online books will be highly encouraged going forward. School children are not going to write MathML or TeX markup. Instead they will use graphical WYSIWYG math editors. The importance of MathML vs. alternatives, then, will have to be judged by what those WYSIWYG editors end up using. WYSIWYG editing of even basic wiki pages is still almost completely unusable right now, so I don't think we're even close to knowing what's optimal as far as editing non-trivial mathematics goes.

Cheers,
Brian

fred...@mathjax.org

May 7, 2013, 7:07:24 AM
to
* About the "XML is evil, MathML is XML so MathML is evil" syllogism.

I don't think it makes sense in general to say that something is good or bad without mentioning for what purpose. I actually agree with Joshua that XML is a good format to work with for a computer engineer. There are very good libraries and tools to handle it, and things like XML namespaces that are painful on the Web become very important for these tools. roc is right that the "catastrophic fail" is certainly not good for a Web model. Note, however, that most Web sites are automatically generated by server-side programs, and I often see MySQL, PHP, CGI etc. failures without hearing anyone say they should go back to static pages. The HTML5 parsing rules give you concision and error tolerance when you want to write pages quickly, but this "tag-soup" approach also brings confusion in general and is just useless for programming. The inclusion of MathML inside HTML5 removed the XML burden that prevented people from using it on the Web or in emails, where the default content is text/html. Actually, the syntactic difference has never been a problem for MathJax or WYSIWYG tools (working on a DOM-like tree) or for authoring tools like LaTeXML (which generates XML from LaTeX and then only uses XSLT stylesheets at the final step to convert to EPUB, XHTML or HTML5). Perhaps one of the best arguments against the XML-haters is that Henri Sivonen's HTML5 validator is itself heavily based on XML tools like RELAX NG schemas or Java XML-related libraries.

* About the "MathML is too specialized".

Obviously I agree with what has been said before about math being in a particular position. I personally see mathematical writing as a language in itself, so not having it in browsers is just like not supporting Arabic or Asian scripts (BTW, MathML was implemented in Gecko a long time before HTML ruby). Just to add one point: mathematical expressions are also very often mixed with other content like text or diagrams, and it makes sense to have HTML+SVG+MathML+CSS well integrated together.

* About the "TeX is already the universally adopted standard" and "TeX is very friendly to manual writing".

Again, people have already said that this is not true, at least not outside academia (I personally use TeX too as an input method, but I don't want to impose that on other people and I'm open to using other methods like handwriting recognition in the future). One of the most popular questions on the MathJax list is of course "is there a WYSIWYG math editor?" Many people also like the ASCII-like syntax: (x^2 + y_1)/2. For example, MathJax supports that syntax, Daniel Glazman has plugins for BlueGriffon & Thunderbird, and it is commonly used in computer algebra systems. Some people say (cf. jqMath or MathEL) that with tools to replace the traditional keyboard, entering Unicode characters becomes easy, and so they generalize to a simple Unicode-based syntax where you write the actual symbol rather than commands like \Leftrightarrow. Even modern LaTeX environments support Unicode. Finally, people are also interested in handwriting recognition (see e.g. https://www.youtube.com/watch?v=26opB8DRf3c or http://webdemo.visionobjects.com/portal.html).

* About the "TeX can be nearly trivially read aloud".

This is an assumption, but the reality is that math accessibility tools would need to parse the source into an abstract representation at some point anyway. Just reading the plain text source naively is not enough for the two use cases I mentioned. There are already MathML-based tools showing that it is possible to use MathML.
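
To illustrate what "parsing into an abstract representation" involves even for a tiny TeX-like subset, here is a minimal sketch. It assumes a hypothetical grammar with only single-character symbols, braces, ^, _ and \frac; real TeX (macros, catcodes, environments) is far beyond this, and every function name below is invented for the example:

```python
import re

# Control words (\frac, \alpha, ...), structural characters, or one symbol.
TOKEN = re.compile(r"\\[A-Za-z]+|[{}^_]|[^\s\\{}^_]")


def tokenize(src):
    return TOKEN.findall(src)


def parse(tokens):
    """Parse a whole token list into a ('row', [...]) node."""
    node, pos = parse_row(tokens, 0)
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return node


def parse_row(tokens, pos):
    """Parse items until the end of input or a closing brace."""
    items = []
    while pos < len(tokens) and tokens[pos] != "}":
        item, pos = parse_scripted(tokens, pos)
        items.append(item)
    return ("row", items), pos


def parse_scripted(tokens, pos):
    """Parse an atom followed by any number of ^ or _ scripts."""
    base, pos = parse_atom(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("^", "_"):
        kind = "sup" if tokens[pos] == "^" else "sub"
        script, pos = parse_atom(tokens, pos + 1)
        base = (kind, base, script)
    return base, pos


def parse_atom(tokens, pos):
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[pos]
    if tok == "{":
        node, pos = parse_row(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != "}":
            raise ValueError("missing closing brace")
        return node, pos + 1
    if tok == "\\frac":
        num, pos = parse_atom(tokens, pos + 1)
        den, pos = parse_atom(tokens, pos)
        return ("frac", num, den), pos
    if tok in ("}", "^", "_"):
        raise ValueError(f"unexpected token {tok!r}")
    return ("sym", tok), pos + 1


tree = parse(tokenize(r"\frac{x^2 + y_1}{2}"))
print(tree)
```

The resulting nested tuples play the same role as a MathML tree: a reader or renderer works from the structure (fraction, superscript, subscript), not from the raw character stream.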

* About "MathML never saw much traction outside of Mozilla"

If you are only talking about browser vendors then that's true. But Web users have requested MathML support for a long time (remember that the Web was created at CERN for research purposes), and it has been implemented in Gecko and WebKit by volunteers. MathJax is yet another community effort to bring math to the Web and was initially presented to me by Robert Miner as a "transition technology" towards MathML in browsers. At the last W3C workshop on ebooks, everybody complained about the lack of MathML support in layout engines (Gecko being excluded de facto for now), and this has led to serious discussions inside the MathJax consortium about how we could help implement MathML in browsers (hopefully the MathJax team will be able to say more about that later). Other people have also pointed out other domains where MathML is now used. BTW, you probably don't care about that argument either, but MathML is part of the OpenDocument OASIS/ISO/IEC standard used in OpenOffice and other office software suites.

[digression: Mozilla people keep saying that competition is good. That was certainly true when Mozilla was fighting against Internet Explorer's predominance and stagnation in Web innovation. But to be honest, it seems that what's happening now is that Google is leading the development and other actors are following. WebKit development was mostly done by Google and it's not clear what Apple will do after the Blink fork. Opera just gave up its rendering engine and joined the Blink effort. Mozilla claims to propose something different, but there have been many pragmatic decisions recently that are against its own community and manifesto. Many excellent Mozilla projects are now abandoned by MoCo and left to volunteers. I don't want the MathML story to be another defeat against Google's monoculture]

* About the "cost to support MathML"

I think most of the changes done by Mozilla staff are normal code maintenance (removing PR_TRUE macros, changing C++ interfaces, moving files etc.) as well as a few security fixes. The MathML code is relatively small and isolated, so it's not too serious an effort. There was some large code refactoring a couple of years ago, like roc moving some font code to the style system or Karl's remarkable work to make MathML support live again. All the new features and other bug fixes not in the previous categories have been made by volunteers. I've actually been mentoring dozens of new volunteers in the last few months and some are still active in Gecko or other Mozilla projects. The synergy with other MathML projects outside Mozilla was also very fruitful. I'm certainly biased, but I would say it has been more a benefit for Mozilla than a burden to have MathML.

* High-quality mathematical typography in browsers is now possible, without using MathML. Examples include MathJax or PDF.js

I only tried PDF.js once at the very beginning. It was really slow and the output was really bad (disclaimer: I tried with TeX-generated papers; I guess for simple PDF documents it is fine). I don't believe the future of Web content is PDF, and I only see the PDF.js effort as a workaround for the Adobe plugin rather than a real wish to bring PDF documents to the Web. I agree that we should encourage scientific and technical content on the Web, and that's one of the goals of the MathJax project.

Regarding MathJax output, it is certainly far better than Safari's native MathML support at the moment, but I would say that Gecko's rendering is close, depending on the fonts available on your system (https://developer.mozilla.org/en-US/docs/Mozilla/MathML_Project/Fonts). It's rather impressive that a JavaScript-based approach with all the approximations (like rounding errors in measurements, placements and hardcoded font metrics) is able to do better than native support, but I would see that as an encouragement for browser vendors to do better rather than to just give up. Microsoft Office uses MathML (or at least a very similar XML format) and developed the OpenType MATH table extension, proving that you can get a good rendering from that too. Jonathan Kew's implementation of the OpenType MATH table in XeTeX and the equivalent in LuaTeX proved that the TeX layout algorithm can be replaced by this method without losing quality and can do better than PDF.js or MathJax. Currently Gecko's MathML tries to emulate TeX's heuristics with very little knowledge of the current font, but I expect we could use the OpenType MATH table in the future (Karl reported a related bug a long time ago). Also, the MathJax TeX fonts are generated from the classical Computer Modern fonts. This means that we need an autotracer to convert from Knuth's METAFONT to Web fonts, and this results in not-so-good quality depending on font size (at least, some of MathJax's partners from the publishing industry complain). I think the future for the Web is to directly use math fonts that have been designed as OpenType fonts, many of which have Microsoft's MATH table (STIX, Asana Math, Neo Euler, Gyre, Latin Modern, Cambria Math etc.).

Without further details, the main issues with MathJax are:

1) Performance
2) Dynamic Update (Javascript, reflow, repaint etc)
3) Integration with HTML/CSS/SVG
4) Font support

Some improvements could be made by providing JavaScript APIs to give MathJax more information, especially to solve 4). However, I don't think the fundamental issues can be solved without proper MathML support. We have other ideas, like using localStorage to cache the HTML-CSS output or trying to tune how many equations we insert at once, but I doubt we will ever get the same performance as native MathML.

BTW, one of the most important sources of complaints from MathJax users and partners at the moment seems to be incompatibility and conflicts with CSS and, in general, 3). It turns out that math on the Web is very different from math on paper. With TeX, you're happy if you get a final black & white PDF, with a fixed layout inside page areas and user-defined size & line-breaking (you know the \\ and \Bigl commands). Web people not only want colors but also fonts, Unicode support, links, DOM & JavaScript, inclusion in SVG diagrams and a rendering that is not "optimized for IE6 with screen resolution 1024x800". Actually they want compatibility with all the CSS effects like text-shadow, ::selection, CSS animations or max-width (just to mention a few examples of recent feedback from MathJax users). So as roc said, in one way or the other what you want is a Web language with a DOM.

In my opinion the main improvement to make to MathML would be to bring it even closer to CSS rather than trying to go back to the old TeX paradigm, which is not appropriate at all for the Web. Some examples: <mstyle> (which just duplicates CSS in an incompatible and less powerful way), <mpadded width="2height"> (inspired by LaTeX but incompatible with the CSS box model) or <mphantom> (inspired by \phantom, but not very useful when you have "visibility: hidden"). I believe things like mfrac@linethickness, math@displaystyle, math@dir, the MathML font properties that have been reimplemented by roc, or even parameters from the OpenType MATH table could become CSS properties. This would make the MathML layout more configurable and less random.

As a conclusion, I understand Benoit's concerns from a Web author's point of view. But really, from my experience, a solution like MathJax for a subset of LaTeX or alternative output like ASCIIMath + tools like LaTeXML for advanced and complicated TeX macros and processing is really satisfactory. WYSIWYG tools or handwriting recognition are maybe not widely available yet, but I expect the solution will be at a high abstraction level rather than just plain text.

Mihai Sucan

May 7, 2013, 8:29:14 AM
to Benoit Jacob, dev-platform
Hello everyone!

This thread has caught my attention and I would like to share my
opinions, maybe as a "school child" who used mathematical software for
WYSIWYG editing (not only reading!), as the primary way of editing any
math, as a primary/fundamental tool for computer-aided learning. I was
(un)lucky enough to be forced by my situation to learn using *only*
computers in the late 1990s and early 2000s. That experience has taught
me the importance of WYSIWYG editing for HTML and maths.

I feel it's not easy for me to reply to this thread - seeing other people
who are technical experts that I admire have already replied, providing
proper arguments for their reasoning. Please excuse my, perhaps, less
formal, less backed-by-arguments reply.

This thread shows that there's some misunderstanding on the performance,
styling and editing requirements for math. I can say that I spent months
trying software to find the best one fitting my requirements. It wasn't
easy.

I haven't seen good (La)TeX WYSIWYG editors, but lately I haven't tried
any such software - now I write LaTeX manually. Still, in the early
2000s I did see and use one WYSIWYG editor that was really good:
Wolfram's Mathematica. It had fast rendering, good set of keyboard
editing shortcuts allowing fast input in WYSIWYG mode. Really good math
WYSIWYG editing is very much possible.

Performance matters not only for the initial document rendering. When
you do WYSIWYG editing performance characteristics matter in a lot more
subtle ways. When you are editing big equations, or some really big
document, updates need to happen as close to instantly as possible. I have
tested software like MathCAD and Maple that did not seem slow at all
when loading documents. Editing math, however, proved to be quite slow.
Very good editing is *not* about "click and point" - this was one of the
biggest failures of MathCAD's UI: it encouraged the click-and-point
editing which meant you had to switch between the keyboard and the mouse
all the time. Word 97 (before Word 2007) forced you to manually switch
between the equation editor and the normal editor, which was a huge
problem, and so on.

Styling is really important when you collaborate with others and you
need to highlight relevant parts of the math output. I am surprised this
is even put up for discussion.

Similarly I am surprised that the need for WYSIWYG editing for math is
being discussed. I am being subjective here: I believe that mathematics
should be first-class citizen on the web. Mathematics is a fundamental
domain of study in all schools, in all forms of education throughout the
world. Mathematics is the basis for many other fields, see physics,
computer science and others.

Back in those days when I was writing math homework with Mathematica I
was very glad and I appreciated a lot that people write software that
can benefit my niche needs, it was invaluable for me. It made possible
things that were not possible. Microsoft's Word was not even close to
being as usable as Wolfram's software. Word 2007 has, indeed, improved
math editing a *lot*, today it's certainly usable.

Microsoft's work on improving math editing in Word shows there's a real
demand for math in documents. I don't see why we would believe otherwise
about the web. We should not need to include half-baked* JS libs to
render math in a document.

* I'm not claiming that MathJax is half-baked - I am simply pointing out
that once people have the choice of which JS lib to use for math
rendering they may (and will) fail to pick the best one.

I do not care about the technology here - MathML or TeX. What I care
about is for the web browsers to meet the technical demands for
producing really good math rendering and editors. I want this not for
the academics, not for professors who can write TeX documents. I want
this for school children who cannot write math on paper, who are blind,
or who have other physical disabilities. Manually writing LaTeX does not
"cut it" at early stages, when children learn maths. Such tools are
invaluable for them.

At the moment, removing MathML support from Gecko would make it harder
for web app developers to create (really) good software for math
editing. It may certainly have its problems, but its benefits are
greater. Before MathML is removed people should look into defining the
requirements, the APIs needed to be implemented in the browser such that
JS-based math rendering can be equally fast and versatile (eg. styling).
Font metrics stuff is, I believe, only a part of the problem that makes
JS-based math rendering slower than native. After requirements are
defined, those things should be implemented. After that, yes, remove MathML.

Back in the days when I was testing math software, I was also testing
MathML rendering in Gecko - it was slower than in specialized software.
I don't know how it is today, but keep in mind that native software like
Maple and MathCAD was not usable due to performance issues, during fast
editing of small to medium sized documents. It may take some time before
web apps can become as fast as Mathematica at rendering math, and as
good at editing -- even with MathML rendered natively.

Editors are really hard and it is unfortunate to note here that browsers
do not even do well enough at HTML editing. If we can do something to
improve the situation we should do that - not the opposite. The removal
of MathML would most-likely make things worse/harder for web-based math
editors.

Probably there is not much "value" from maintaining MathML - browser
competition happens in other areas, other APIs and technologies.
However, please let the volunteers do their work, maintain their work
and so on. From reading this thread I understand MathML support in Gecko
was implemented mostly by volunteers. It would be a big disappointment
to volunteer efforts to see that work goes away, especially without
anything better replacing it.

I doubt that if we keep MathML some day some people would like their own
niche markup language - eg. for domains like chemistry, biology, music,
etc. Did you see anyone doing that?


I find it surprising that HTML5 caters to advertisers/trackers by
introducing the ping attribute for anchors, yet here we question the
use/need for a standard way to write mathematics on the web - the
initial email in this thread questions the need for anything to replace
MathML, as writing maths is over-specialized.


Thank you for reading. Feel free to take these thoughts with a grain of
salt: I am biased, I was a user of native math software and I would like
the web platform to provide equally good software.


Best regards,
Mihai

Benjamin Smedberg

May 7, 2013, 9:46:08 AM
to rob...@ocallahan.org, Benoit Jacob, Joshua Cranmer 🐧, dev-platform
On 5/6/2013 7:20 PM, Robert O'Callahan wrote:
> Hopefully Web Components will provide a good solution to let authors extend
> the browser with support for vocabularies that can be rendered via a
> straightforward decomposition to HTML or MathML or SVG.
>
> I think the layout requirements of MathML are too onerous for MathML to be
> reduced to HTML or SVG that way.
I'd like to understand more about this. I have been hoping that one of the
best use cases for web components is to implement these kinds of
domain-specific languages. I greatly fear that we're accidentally
pushing the web from declarative markup to a model where everything is
controlled with script: in the process, we are going to lose some of the
core benefits of the web: pervasive hyperlinking, save-as and
view-source, and . I tend to think that web components are a great way
to abstract away the presentation of new declarative languages.

Without knowing a lot about it, it seems that SVG and HTML contain all
of the primitives necessary for a web components script to implement the
visual MathML presentation. Perhaps I'm not completely aware of the
problems, though. Does MathML need to participate in inline reflow in a
way that requires direct support from the layout engine?

>
> While diagrams such as chemical formulae, flowcharts or electronics
> schematics can be compiled to SVG, the layout step is very much nontrivial
> and I don't think Web Components is enough for that. Web Components plus
> some JS to do the layout is probably satisfactory.
>
Unless I misunderstand web components, they are primarily blobs of
script, and I'm really hoping they will be able to implement arbitrary
new markup languages. I think there's a lot more that we can add to
components, especially for nonvisual presentations (better
implementation of accessibility and audio presentations seem like
near-term goals).

That still leaves us with future decisions about whether we should build
any of these markup languages into the browser. It seems clear from epub
that there is a demand for declarative math markup, and whether we build
that using web components or directly into our layout engine, it should
be a core markup language of the web.

--BDS

fred...@mathjax.org

May 7, 2013, 10:11:22 AM5/7/13
to
> Does MathML need to participate in inline reflow in a way that requires direct support from the layout engine?

I don't know if this answers your question, but one important thing currently lacking in Gecko's MathML implementation is line breaking. This matters for Web pages in general, and I suspect it will become even more important on mobile devices. I use many long inline formulas in my blog and this is handled as I would like by Gecko. A very important use case is large tables enumerating mathematical properties and thus containing formulas (you can find some on Wikipedia, e.g. on the pages about Fourier transforms, integrals/derivatives, probability laws and properties of common functions; I also used such tables in the appendices of my two master's theses in CS and math). Currently, line breaking is disabled by default in MathJax, as it slows down the layout algorithm even more. And of course it does not work for formulas inside table cells, since MathJax has no knowledge of CSS intrinsic widths.

fred...@mathjax.org

May 7, 2013, 10:12:32 AM5/7/13
to
On Tuesday, May 7, 2013 4:11:22 PM UTC+2, fred...@mathjax.org wrote:
> I use many long inline formulas in my blog and this is handled as I would like by Gecko.

sorry I meant this is *not* handled

Robert O'Callahan

May 7, 2013, 1:42:51 PM5/7/13
to Benjamin Smedberg, Benoit Jacob, Joshua Cranmer 🐧, dev-platform
On Tue, May 7, 2013 at 6:46 AM, Benjamin Smedberg <benj...@smedbergs.us> wrote:

> On 5/6/2013 7:20 PM, Robert O'Callahan wrote:
>
>> Hopefully Web Components will provide a good solution to let authors
>> extend
>> the browser with support for vocabularies that can be rendered via a
>> straightforward decomposition to HTML or MathML or SVG.
>>
>> I think the layout requirements of MathML are too onerous for MathML to be
>> reduced to HTML or SVG that way.
>>
> I'd like understand more about this. I have been hoping that one of the
> best use cases for web components is to implement these kinds of
> domain-specific languages. I greatly fear that we're accidentally pushing
> the web from declarative markup to a model where everything is controlled
> with script: in the process, we are going to lose some of the core benefits
> of the web: pervasive hyperlinking, save-as and view-source, and . I tend
> to think that web components are a great way to abstract away the
> presentation of new declarative languages.
>
> Without knowing a lot about it, it seems that SVG and HTML contain all of
> the primitives necessary for a web components script to implement the
> visual MathML presentation. Perhaps I'm not completely aware of the
> problems, though. Does MathML need to participate in inline reflow in a way
> that requires direct support from the layout engine?


Ideally, yes, although it currently doesn't in Gecko.

Good math layout requires specialized layout primitives that we don't have
in regular CSS. I'm thinking of features like stretchy characters (e.g.
integrals that grow based on the size of the enclosed formula), stretchy
overbars and underbars of various kinds, and careful placement of the
degree next to a radical symbol.
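
To make the "stretchy character" requirement concrete, here is a minimal sketch of the sizing step, assuming a simple box model. The `Box` type, the `VARIANT_HEIGHTS` table and both function names are hypothetical and are not taken from Gecko or any font; the idea shown is only that the glyph's size depends on the measured extents of the enclosed formula, which the layout engine has to supply.

```python
from dataclasses import dataclass

# Hypothetical size variants (total height in em) that a math font might
# offer for one stretchy glyph, smallest first. Real fonts differ.
VARIANT_HEIGHTS = [1.0, 1.5, 2.2, 3.0]


@dataclass
class Box:
    ascent: float   # extent above the baseline, in em
    descent: float  # extent below the baseline, in em


def stretch_target(children: list[Box]) -> float:
    """Height a stretchy glyph must cover to span all enclosed boxes."""
    return max(b.ascent for b in children) + max(b.descent for b in children)


def pick_variant(target: float) -> tuple[str, float]:
    """Pick the smallest prebuilt variant that covers the target; otherwise
    fall back to assembling the glyph from pieces at exactly the target."""
    for height in VARIANT_HEIGHTS:
        if height >= target:
            return ("variant", height)
    return ("assembly", target)
```

For example, `pick_variant(stretch_target([Box(0.75, 0.25), Box(1.5, 0.5)]))` selects the 2.2 em variant under these assumed numbers. The measurement of the children is the part that plain CSS does not expose to a declarative decomposition.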

Keep in mind that without script, the kind of transformations you can apply
with Web Components are similar to XBL, and that's pretty limited.

I'll repeat what I said before: going from a domain-specific data model,
such as XML describing an electronic circuit, to a good rendering, is an
incredibly complex process. It's unclear it can even be automated at all,
let alone automated without script.

d.p.ca...@gmail.com

May 7, 2013, 2:02:47 PM5/7/13
to
On Sunday, 5 May 2013 16:38:39 UTC+1, Benoit Jacob wrote:
> Hi,
>
>
>
> Summary: MathML is a vestigial remnant of the XML-everything era, and we
>
> should drop it.
>

As can be seen from its integration into HTML(5), nothing in MathML requires an XML surface syntax.

>
>
> ***
>
>
>
> 1. Reasons why I believe that MathML never was a good idea. Summary:
>
> over-specialized and uniformly inferior to the pre-existing,
>
> well-established standard, TeX.

TeX is designed for a situation in which the author has full control over fonts and page size. It is not designed at all for a web-like scenario; in particular, TeX has no support for linebreaking of displayed mathematics, which is particularly important with small screens etc. Also, of course, classic TeX has no support for Unicode fonts. There are Unicode variants (LuaTeX and XeTeX) and experimental support for Unicode math fonts (unicode-math fonts), but all these are relatively unstable development software and hardly well-established standards.

>
>
>
> 1.1. MathML is too specialized: we should be reluctant to have a
>
> separate spec for every kind of specialized typography. What if musicians
>
> wanted their own MusicML too?
>
>

Mathematics plays a central role as part of the _textual_ vocabulary of scientific and educational literature. Music is of course important but musical notation does not have to interact with textual paragraphs in the same way.
>
> 1.2. MathML reinvents the wheel, poorly. A suitable subset of TeX (not
>
> the entirety of TeX, as that is a huge, single-implementation technology
>
> that reputedly only Knuth ever fully understood) was the right choice all
>
> along, because:

It does not re-invent TeX: it is different.
>
>
>
> 1.2.1. TeX is already the universally adopted standard --- and
>
> already was long before MathML was invented. Check for yourself on
>
> http://arxiv.org/ , where most new math papers are uploaded --- pick any
>
> article, then "other" formats, then "Source": you can then download TeX
>
> sources for almost every article.


TeX is the standard in some fields for _author submission_, although even in scientific fields other formats, notably Word, are (perhaps surprisingly) common. Also, apart from self-publishing mechanisms such as arXiv, many scientific journals do _not_ use TeX for publishing, and even if they accept TeX input they convert in house to XML/MathML workflows (thus MathML is a more natural publishing format).
>
>
>
> 1.2.2. TeX is very friendly to manual writing, being concise and
>
> close to natural notation, with limited overhead (some backslashes and
>
> curly braces), while MathML is as tedious to handwrite as any other
>
> XML-based format. An example is worked out at
>
> http://en.wikipedia.org/wiki/MathML#Example_and_comparison_to_other_formats,
>
> where the solution to the quadratic equation is one line of TeX versus
>
> 30
>
> lines of MathML!
>
>

Probably true but not very relevant. HTML is more verbose than wiki syntax for the same reasons.

>
> 1.2.3. An important corollary of being very close to natural notation
>
> is that TeX can be nearly trivially "read aloud". That means that it offers
>
> a particularly easy accessibility story. No matter what mechanism is used
>
> to graphically display equations, providing the TeX source (similarly to
>
> images alt text) would allow anyone to quickly read it themselves without
>
> any kind of software support; and screen reading software could properly
>
> read equations with minimal TeX-specific support code. For example, TeX
>
> code such as "\int_0^1 x^2 dx" can be readily understood by any human with
>
> basic TeX exposure (which is nearly 100% of mathematicians) and can be
>
> easily handled by any screen reader that knows that \int should be read as
>
> "integral" and that immediately after it, _ and ^ should be read as "from"
>
> and "to" respectively.
>
>

The accessibility implementations of MathML (notably MathPlayer) rather refute the suggestion that it is easier to get accessible renderings from TeX; if anything, the reverse is true.
>
> ***
>
>
>
> 2. Reasons why even if MathML had ever been a decent idea, now would be the
>
> right time to drop it. Summary: never really got traction, and the same
>
> rendering can now be achieved without MathML support.
>
>

MathML is a standard part of ODF (OpenOffice etc.), and it is supported on the clipboard in Word and the rest of the MS Office suite. It is a standard part of EPUB 3. It is supported in many typesetting systems used by journals. It is supported on import/export in Maple and Mathematica, just to name a few. In contrast, TeX is notoriously difficult to process by anything other than TeX: latex2html/tex4ht/latexml do a remarkable job but are inherently fragile and incomplete.
>
> 2.1. MathML never saw much traction outside of Mozilla, despite having
>
> been around for a decade. WebKit only got a very limited partial
>
> implementation recently, and Google removed it from Blink. The fact that it
>
> was just dropped from Blink says much about how little it's used: Google
>
> wouldn't have disabled a feature that's needed to render web pages in the
>
> real world. Opera got an implementation too, but Opera's engine has been
>
> phased out.
>

If you mean "in web browsers", it is true: Firefox and IE+MathPlayer are the two main implementations, with less complete support in Safari and Opera.
See the comment above for MathML's traction in other areas.
>
> 2.2. High-quality mathematical typography in browsers is now possible,
>
> without using MathML. Examples include MathJax ( http://www.mathjax.org/ ),
>
> which happily takes either TeX or MathML input and renders it without
>
> specific browser support, and of course PDF.js which is theoretically able
>
> to render all PDFs including those generated by pdftex. Both approaches
>
> give far higher quality output than what any current MathML browser
>
> implementation offers.
>

MathJax (or any JavaScript) rendering is clearly slower and harder to interact with than a native implementation that maps directly to the DOM.
>
>
> ***
>
>
>
> 3. Proposals
>
>
>
> Assuming that there will be agreement to drop MathML, I can see us doing
>
> either of two things:
>
>

Hopefully there will be no such agreement; it would be a massively detrimental step for the scientific and educational use of the web.

>
> 3.1. Either just drop MathML support; the assumption would be that
>
> current solutions not requiring specific browser support, such as MathJax
>
> or PDF.js, are sufficient;
>

They are not. PDF in particular is not an ideal format on the web for so many obvious reasons.
>
>
> 3.2. Or drop MathML support and create a new specification, that would
>
> be based on a suitable subset of TeX.
>
>
It is hard to think of any advantages that would have.

>
> In both approaches, distributing TeX source code alongside with a page is
>
> highly desirable because it is the preferred source form of most math
>
> content and because it enables good accessibility as discussed above. In
>
> the 3.1 approach, that would be like alt text on images: something that
>
> many authors would omit in practice. In the 3.2 approach, that would be the
>
> document itself, which means that it couldn't be neglected.
>
>
>
> The big problem with 3.2. is the same issue as we described in 1.1: any
>
> math-specific system may well be over-specialized. Then again, TeX is not
>
> exclusively restricted to math typography, and it has been used for e.g.
>
> music typography before. So to some extent that I haven't precisely figured
>
> yet, the 1.1 overspecialization against MathML may not fully apply against
>
> a TeX-based solution.
>
>
>

> Benoit

David

Benjamin Smedberg

May 7, 2013, 2:19:14 PM5/7/13
to rob...@ocallahan.org, Benoit Jacob, Joshua Cranmer 🐧, dev-platform
On 5/7/2013 1:42 PM, Robert O'Callahan wrote:
> Keep in mind that without script, the kind of transformations you can apply
> with Web Components are similar to XBL, and that's pretty limited.
Yes, I'm definitely not talking about a non-script implementation of any
of these. I'm presuming a fully scripted webcomponents impl (i.e.
MathJax but hidden behind webcomponents isolation).

--BDS

Marcio Galli

May 7, 2013, 4:31:11 PM5/7/13
to pa...@dessci.com, dev-pl...@lists.mozilla.org
> I'm coming late to this thread but I have to say that the misunderstanding present in the original post is huge. The author can take refuge in that he's made a common category mistake. MathML is a computer representation for math, TeX is a human input language.
>
> MathML was never intended to be typed by humans so it is no wonder that you find it a bad experience. TeX is a poor computer representation which is one reason why MathML was invented.
>
> It is reasonable to have a discussion of the relative merits of entering math by typing TeX vs point-and-click editing of math (ie, direct manipulation editing). I am biased toward the latter but I can understand the feelings of those whose hands know TeX really well.
> In short, both MathML and TeX have good reasons to exist and don't compete with each other in their primary categories.

I am also late to this thread. I lean toward this view, and I see
the original author's frustration as worth discussing, though I am not
sure this thread is the place for it.

But I wanted to present another problem which is, in a way, of the same
nature (it also involves a degree of separation): the development of
tableless grids on top of HTML.

Consider the case, table A, which a developer can think of:

4 columns:
abcd
ebfd

To do this in HTML, the developer has to make a "container DIV" with 4
main column cells in it, say c1, c2, c3, c4, where c1 = rows a, e;
c2 = row b; c3 = rows c, f; c4 = row d.

It's easier to type something like the following:

4,abcdebfd

But this won't change the reality, the end product of this is:

<div>
  <div class='inline'><a /><e /></div>
  <div class='inline'><b /></div>
  <div class='inline'><c /><f /></div>
  <div class='inline'><d /></div>
</div>

Which can be, as many have pointed out, styled, channeled to accessibility
observers, manipulated, annotated, all of that with a greater
level of compatibility, as developers understand.
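
The expansion from the shorthand to the nested markup above can be sketched in a few lines. This is only an illustration of the idea, assuming the shorthand is "column count, comma, one letter per cell, row by row" and that a letter repeated down a column means one cell spanning those rows; the function name `expand_grid` is hypothetical and is not how the toolkit or the spec mentioned in this message works.

```python
def expand_grid(shorthand: str) -> str:
    """Expand e.g. '4,abcdebfd' into a container <div> of column <div>s."""
    count, cells = shorthand.split(",", 1)
    columns = int(count)
    if columns <= 0 or len(cells) % columns != 0:
        raise ValueError("cell count must be a multiple of the column count")
    # Split the flat cell string into rows of `columns` letters each.
    rows = [cells[i:i + columns] for i in range(0, len(cells), columns)]
    parts = []
    for col in range(columns):
        names = []
        for row in rows:
            # A name repeated down a column is one cell spanning rows.
            if not names or names[-1] != row[col]:
                names.append(row[col])
        inner = "".join(f"<{name} />" for name in names)
        parts.append(f"<div class='inline'>{inner}</div>")
    return "<div>" + "".join(parts) + "</div>"
```

Calling `expand_grid("4,abcdebfd")` yields the same element structure as the markup shown above, without the line breaks. The shorthand is convenient to type, but the result is still ordinary elements that can be styled and inspected.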

But then, right now, what we have are:

a) Toolkits using JS to do things like 4,abcdebfd [1]

a.1) For example, http://labs.telasocial.com/grid-layout/

b) Specs to do similar things:

b.1) http://dev.w3.org/csswg/css-template/#grid-shorthand

The above example, which refers to grid rearrangement, is a different
thing, I know. But I think it has similar points to this discussion:
shorthands that apply to HTML elements (or other elements) are good
things for developers.

m

pa...@dessci.com

May 7, 2013, 4:40:48 PM5/7/13
to
A bit more on the TeX part of this argument. Over a decade ago my company polled publishers that accept submissions from authors of content containing math. Although not a scientific poll, the results were overwhelming. Approximately 85% of all submissions were in MS Word format with equations written using its Equation Editor (my company's product at the time, licensed to Microsoft).

It has been pointed out already here, but let me emphasize that most math is not done by mathematicians. All K-12 students must learn math, so there's a huge industry to serve them. Half the departments at a typical university use math in their teaching (not just science and engineering, but anything with statistics, such as business and economics). Mathematicians are a tiny fraction of the whole. I guarantee you that virtually all K-12 teachers have never heard of TeX.

Ok, with that dead horse beaten let me turn to MathML.

It was mentioned that MathML lacks open source tools to work with it. I believe one of the main reasons for this is the lack of browser support for MathML. It is ironic that most STEM publishers' internal workflows are based on XML and MathML for math but they can't deliver it to browsers. MathJax changes that of course which is why we now see the overwhelming interest in MathJax and its ability to render MathML.

It would be truly ironic if MathJax's success killed MathML. MathJax does a truly heroic job of formatting mathematics. It is amazing to me that a JavaScript library can do so well. That said, it is a stand-in for true MathML support. It lacks access to fonts and character metrics as well as layout information. The fact that it works so well is due to the brilliance of Davide Cervone and the rest of the MathJax team and not because it is the right way to render MathML in a browser. That it is not truly integrated into the browser results in all sorts of struggles. As evidence, see all the postings in its forums from people who run into trouble creating dynamic web pages containing math. They get into all sorts of tangles involving DOM changes, rendering, event processing order, etc.

Those of us who work in MathML and equation editing constantly run into the misconception that an equation rendering is more like an image than text. I think this is due to the fact that many document processing systems through history have had to handle math as an image because they don't support math directly. Mathematics is really just a fancy text format. Think subscripts and superscripts on steroids. Some chemistry notations fit into this mold as well, but music and some of the other things mentioned in this thread do not. There is one easy way to tell if a notation is fancy text: do books include the notation inline in paragraphs or not? Even a block (or display) equation flows with the text.

It was stated that Mozilla's two main competitors don't support MathML. While that is literally true, Internet Explorer has had the benefit of MathML display via my company's MathPlayer plug-in for years. When we introduced it, Microsoft added some APIs specifically to allow us to do a better job integrating its rendering into the surrounding text. Most screen readers interface with MathPlayer to make math accessible to people with various disabilities.

Please don't turn your backs on MathML just as it is coming into its own. MathJax is a good stand-in for missing MathML support, but it does not eliminate the need for native support. Instead, it makes its absence all the more glaring and the need to fix the problem all the more urgent.

Trevor Saunders

May 7, 2013, 4:59:17 PM5/7/13
to dev-pl...@lists.mozilla.org
On Mon, May 06, 2013 at 07:13:04AM -0700, fred...@mathjax.org wrote:
> - For blind people or other visual disabilities, speech synthesizer must follow the MathSpeak rules. Simply reading the text "normally", e.g. of a LaTeX or ASCII source, is ambiguous.

I'd argue that any machine parsable format can't be ambiguous by virtue
of the fact machines parse it. However in any case AtkText /
IAccessibleText / the mac accessible protocol thing all expect the text
for an object to be a string so whatever format the web uses screen
readers will be handling a serialized format.

> - For people with reading disabilities (dyslexia etc), you need to synchronize highlight of equation parts / reading of equation parts.

this should actually work quite naturally in my proposal since we can
tell API consumers that the bounds of characters in the serialized text
are those for the formatted text shown on screen. That doesn't quite
work for { / } / ^ etc, but we can just give them a size of zero or
something, and that should probably be fine.

> In both cases, you must know a bit more about the mathematical structure e.g. to have a DOM. It's not clear how to do that with plain text. It's just absurd to believe that putting TeX source inside the alt text of an <img> makes the formula accessible. It might works for very simple equations like x+2 but in general you'll have to do some parsing into an abstract representation if you want to read/highlight it correctly. With MathML you already have a standard representation and there already exist tools to work with that language.

nobody is suggesting images with alt text.

In the cases above, which you discussed, and Braille, which you didn't, the
medium is fundamentally serial, so I'm not sure I buy that you need a
tree.

Trev

pa...@dessci.com

May 7, 2013, 5:12:17 PM5/7/13
to
I think Fred's point here was that the literal text in the MathML or LaTeX is not what a blind person wants to hear. The whole point of math as a 2-D notation is that the relative positions of the parts of the equation carry meaning. This is unlike normal text, which almost always carries its whole message in its words and punctuation.

On Tuesday, May 7, 2013 1:59:17 PM UTC-7, Trevor Saunders wrote:

fred...@mathjax.org

May 7, 2013, 6:10:54 PM5/7/13
to
> I'd argue that any machine parsable format can't be ambiguous by virtue
> of the fact machines parse it. However in any case AtkText /
> IAccessibleText / the mac accessible protocol thing all expect the text
> for an object to be a string so whatever format the web uses screen
> readers will be handling a serialized format.
>

Sorry, I was not clear here. Concretely, if "{x+1}/2" is read "x plus one over 2", it's ambiguous, since that could also mean "x + 1/2". Blind people have standard rules to clarify the reading of mathematical expressions, and screen readers must follow these rules. That's not necessarily the order in which people usually read a formula. And reading "opening brace x plus one closing brace slash two" is not what they want either. So at some point, someone will have to parse and understand the mathematical structure. That's essentially what MathPlayer and a MathJax extension prototype do from the MathML tree. I guess in the end they will send text strings to the reader anyway.
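
The ambiguity can be reproduced with a minimal sketch, assuming a tiny expression tree with only symbols, sums and fractions. The class names and the spoken wording ("start fraction ... end fraction") are illustrative assumptions; they are not the exact output of MathPlayer, MathJax or any published speech rule set.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Sym:
    text: str


@dataclass
class Add:
    left: "Node"
    right: "Node"


@dataclass
class Frac:
    num: "Node"
    den: "Node"


Node = Union[Sym, Add, Frac]


def speak_naive(node: Node) -> str:
    """Read the tree left to right with no grouping cues."""
    if isinstance(node, Sym):
        return node.text
    if isinstance(node, Add):
        return f"{speak_naive(node.left)} plus {speak_naive(node.right)}"
    return f"{speak_naive(node.num)} over {speak_naive(node.den)}"


def speak_bracketed(node: Node) -> str:
    """Read the tree with explicit spoken markers around each fraction."""
    if isinstance(node, Sym):
        return node.text
    if isinstance(node, Add):
        return f"{speak_bracketed(node.left)} plus {speak_bracketed(node.right)}"
    num, den = speak_bracketed(node.num), speak_bracketed(node.den)
    return f"start fraction {num} over {den} end fraction"
```

With `a = Frac(Add(Sym("x"), Sym("1")), Sym("2"))` and `b = Add(Sym("x"), Frac(Sym("1"), Sym("2")))`, `speak_naive` gives the same string for both, while `speak_bracketed` gives two different readings. Either way the reader works from a tree, not from the raw source characters.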

> > - For people with reading disabilities (dyslexia etc), you need to synchronize highlight of equation parts / reading of equation parts.
> this should actually work quiet naturally in my proposal since we can
> tell API consumers that the bounds of characters in the serialized text
> are those for the formatted text shown on screen. That doesn't quiet
> work for { / } / ^ etc, but we can just give them a size of zero or
> something, and that should probably be fine.

It seems that you are oversimplifying the issue here. First, people proposing an alternative syntax will still need to define a mapping from the text source to the visual 2D representation if you want to know which part to highlight. For MathML it's the normal mapping from the DOM to the rendering tree in Gecko, but it's not clear what Benoit wants to use instead, since without further info it seems that the DOM will just have a text node. Next, what you suggest seems to work only for variable names, and it's not clear how you'll read e.g. operators or square roots. IIRC, to read a fraction the tools I mention highlight the numerator when they read it, then the whole fraction's bounding box when they read the "over", then the denominator when they read it. Or perhaps they highlight the whole fraction before; I don't remember exactly, but the point is that it's not limited to text. I admit that I don't know the details, but there are other similar sync rules between highlight and reading. Also, I hope you realize that mathematical layout is much more complex than just fractions and scripts, so I don't think one can just say "I guess that will work" for such a nontrivial problem.
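
The fraction case described above can be sketched as a list of (spoken word, box to highlight) pairs. This is a hypothetical illustration of one possible rule, following the recollection in this message; the `box` identifiers and the `reading_plan` function are assumptions and do not describe how MathPlayer or the MathJax prototype is actually implemented.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Sym:
    text: str
    box: str  # hypothetical identifier of the on-screen box


@dataclass
class Frac:
    num: "Node"
    den: "Node"
    box: str  # hypothetical identifier of the whole fraction's bounding box


Node = Union[Sym, Frac]


def reading_plan(node: Node) -> list[tuple[str, str]]:
    """Return (spoken word, box to highlight) pairs in reading order."""
    if isinstance(node, Sym):
        return [(node.text, node.box)]
    # "over" has no glyph of its own, so it highlights the whole fraction.
    return reading_plan(node.num) + [("over", node.box)] + reading_plan(node.den)
```

For `Frac(Sym("a", "num"), Sym("b", "den"), "frac")` the plan is `[("a", "num"), ("over", "frac"), ("b", "den")]`. The middle entry is the point: the highlighted region belongs to a structural node, not to any character of a serialized source string.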

pa...@dessci.com

May 7, 2013, 7:34:25 PM5/7/13
to
Math accessibility is a surprisingly complex subject. How math should be read is dependent on the mathematical or scientific context in which the math is embedded, the educational level of the user, and their familiarity with the accessibility technology itself. In our grant work with the Educational Testing Service (ETS) we found out that a literal reading of a mathematical expression in a test question can give away the answer even when the graphical rendering doesn't.

BTW, all this work is done with math expressed in MathML. It could use MathML structures obtained from MathJax but this means that the screen reader can't use MSAA (or equivalent) to get an IAccessible interface from a DOM node. As far as I know, there is no mechanism that allows JavaScript code to implement IAccessible.

Even with MathML implemented natively in browsers, it seems like accessibility mechanisms still need some work. While the HTML5 effort is busy adding access to device features (phone, camera, GPS, touch) for use in web apps, there has been no effort to do something similar for screen readers and for accessibility support in general. Screen reader vendors are currently being cut out of the mobile market as device makers are playing the old proprietary "that functionality is part of the OS" game.

I guess I am going a bit far afield here. My hope was to show that there is a lot happening with MathML technology. It is not time to pull the plug, but time to support it properly.

kyv...@gmail.com

May 8, 2013, 1:08:23 PM5/8/13
to
On Monday, 6 May 2013 22:19:31 UTC+10, Boris Zbarsky wrote:
> On 5/6/13 7:27 AM, Benoit Jacob wrote:
>
> > I guess I don't see the usefulness of allowing to apply style to individual
>
> > parts of an equation
>
>
>
> Styling parts of an equation with different colors can be _extremely_
>
> useful for readability.
> ...
> -Boris
Hi, I was looking at estimations of Pi and decided to try/learn MathML to quickly compare them.
I found it extremely quick to throw together and colouring each example made it easier to differentiate them:
http://htmlpad.org/Pi
I also find it's great for others to play around with just by adding /edit to the URL.
I do hope this marvel I just discovered won't disappear just because non-gecko browsers don't render it properly yet.

-Ky

kyv...@gmail.com

May 8, 2013, 1:45:23 PM5/8/13
to
On Monday, 6 May 2013 01:38:39 UTC+10, Benoit Jacob wrote:

I find it extremely disappointing this was posted the *day* after the Slashdot article was published at http://news.slashdot.org/story/13/05/04/0015241/firefox-is-the-first-browser-to-pass-the-mathml-acid2-test

Was the real motivation for this thread in any way related to the timing of that story?

Neil

May 9, 2013, 7:16:52 PM5/9/13
to

Henri Sivonen

May 24, 2013, 8:15:28 AM5/24/13