May 5, 2013, 11:38:39 AM

Hi,

Summary: MathML is a vestigial remnant of the XML-everything era, and we

should drop it.

***

1. Reasons why I believe that MathML never was a good idea. Summary:

over-specialized and uniformly inferior to the pre-existing,

well-established standard, TeX.

1.1. MathML is too specialized: we should be reluctant to have a

separate spec for every kind of specialized typography. What if musicians

wanted their own MusicML too?

1.2. MathML reinvents the wheel, poorly. A suitable subset of TeX (not

the entirety of TeX, as that is a huge, single-implementation technology

that reputedly only Knuth ever fully understood) was the right choice all

along, because:

1.2.1. TeX is already the universally adopted standard --- and

already was long before MathML was invented. Check for yourself on

http://arxiv.org/ , where most new math papers are uploaded --- pick any

article, then "other" formats, then "Source": you can then download TeX

sources for almost every article.

1.2.2. TeX is very friendly to manual writing, being concise and

close to natural notation, with limited overhead (some backslashes and

curly braces), while MathML is as tedious to handwrite as any other

XML-based format. An example is worked out at

http://en.wikipedia.org/wiki/MathML#Example_and_comparison_to_other_formats,

where the solution to the quadratic equation is one line of TeX versus

30 lines of MathML!
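
For reference, the one-line TeX form of the quadratic formula is commonly written as follows (a sketch of the usual notation; the exact markup on the Wikipedia page may differ):

```latex
x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
```

The equivalent presentation MathML needs a separate element for every identifier, operator, fraction, and radical, which is where the line count comes from.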

1.2.3. An important corollary of being very close to natural notation

is that TeX can be nearly trivially "read aloud". That means that it offers

a particularly easy accessibility story. No matter what mechanism is used

to graphically display equations, providing the TeX source (similarly to

images alt text) would allow anyone to quickly read it themselves without

any kind of software support; and screen reading software could properly

read equations with minimal TeX-specific support code. For example, TeX

code such as "\int_0^1 x^2 dx" can be readily understood by any human with

basic TeX exposure (which is nearly 100% of mathematicians) and can be

easily handled by any screen reader that knows that \int should be read as

"integral" and that immediately after it, _ and ^ should be read as "from"

and "to" respectively.
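
As a rough illustration of the read-aloud idea, here is a minimal sketch in Python. It handles only the single pattern used in the example above; the function name `read_aloud` and the spoken wording are assumptions made for this illustration, not the behavior of any existing screen reader.

```python
import re


def read_aloud(tex: str) -> str:
    """Turn a definite integral like \\int_0^1 x^2 dx into spoken words.

    Hypothetical helper: it only understands the shape
    \\int_<lower>^<upper> <body> d<var>, nothing else.
    """
    match = re.fullmatch(r"\\int_(\S+?)\^(\S+)\s+(.*)\s+d(\w)", tex.strip())
    if match is None:
        raise ValueError(f"unsupported TeX fragment: {tex!r}")
    lower, upper, body, var = match.groups()
    # Read "_" as "from" and "^" as "to"; a real tool would need a much
    # larger vocabulary than this single "squared" rule.
    body = body.replace("^2", " squared")
    return f"integral from {lower} to {upper} of {body} d {var}"


print(read_aloud(r"\int_0^1 x^2 dx"))
```

Anything outside that one pattern raises an error, which is the honest limit of such a small table of rules.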

***

2. Reasons why even if MathML had ever been a decent idea, now would be the

right time to drop it. Summary: never really got traction, and the same

rendering can now be achieved without MathML support.

2.1. MathML never saw much traction outside of Mozilla, despite having

been around for a decade. WebKit only got a very limited partial

implementation recently, and Google removed it from Blink. The fact that it

was just dropped from Blink says much about how little it's used: Google

wouldn't have disabled a feature that's needed to render web pages in the

real world. Opera got an implementation too, but Opera's engine has been

phased out.

2.2. High-quality mathematical typography in browsers is now possible,

without using MathML. Examples include MathJax ( http://www.mathjax.org/ ),

which happily takes either TeX or MathML input and renders it without

specific browser support, and of course PDF.js which is theoretically able

to render all PDFs including those generated by pdftex. Both approaches

give far higher quality output than what any current MathML browser

implementation offers.

***

3. Proposals

Assuming that there will be agreement to drop MathML, I can see us doing

either of two things:

3.1. Either just drop MathML support; the assumption would be that

current solutions not requiring specific browser support, such as MathJax

or PDF.js, are sufficient;

3.2. Or drop MathML support and create a new specification, that would

be based on a suitable subset of TeX.

In both approaches, distributing TeX source code alongside a page is

highly desirable because it is the preferred source form of most math

content and because it enables good accessibility as discussed above. In

the 3.1 approach, that would be like alt text on images: something that

many authors would omit in practice. In the 3.2 approach, that would be the

document itself, which means that it couldn't be neglected.

The big problem with 3.2. is the same issue as we described in 1.1: any

math-specific system may well be over-specialized. Then again, TeX is not

exclusively restricted to math typography, and it has been used for e.g.

music typography before. So to some extent that I haven't precisely figured

out yet, the 1.1 overspecialization argument against MathML may not fully apply against

a TeX-based solution.

Benoit

May 5, 2013, 12:10:10 PM

to Benoit Jacob, Jonathan Kew, dev-platform

Four points here.

1. We're assuming that MathJax is as good with MathML as it is without

it, but perhaps we could ask the MathJax folks to comment on whether

this is true. I'd certainly be a lot more comfortable dropping MathML

if the MathJax folks said there was no point.

2.

> A suitable subset of TeX (not

> the entirety of TeX, as that is a huge, single-implementation technology

> that reputedly only Knuth ever fully understood) was the right choice all

> along

Jonathan Kew is a much better person to comment on this, but in my

relatively limited experience typesetting documents in TeX, I've had

to use various LaTeX packages (particularly amsmath and amssymb) in

order to get all of the symbols and so on that I needed. I suspect

that "heavy" users of TeX frequently need more than these two

packages.

The point being, "a subset of TeX" isn't necessarily sufficient.

3. It's not clear to me why we should go through all the work of

rewriting MathML into this TeX thing unless we thought that the new

thing would see more enthusiastic adoption. It sounds like you would

probably agree on this point.

4.

> 2.2. High-quality mathematical typography in browsers is now possible,

> without using MathML. Examples include MathJax ( http://www.mathjax.org/ ),

> which happily takes either TeX or MathML input and renders it without

> specific browser support, and of course PDF.js which is theoretically able

> to render all PDFs including those generated by pdftex. Both approaches

> give far higher quality output than what any current MathML browser

> implementation offers.

Could you elaborate on how MathML is inferior to MathJax's HTML+CSS

rendering? MathJax has a page where you can switch between different

rendering modes, and to my eyes, the two modes are almost identical.

The only difference I see is that the HTML+CSS mode is better at

correctly sizing large parentheses and radicals, but I wouldn't call

this "far higher quality."

http://www.mathjax.org/demos/mathml-samples/

May 5, 2013, 12:51:45 PM

to Justin Lebar, dev-platform, Jonathan Kew

May 5, 2013, 2:10:25 PM

to

I'm not sure if that's a joke or complete misinformation about the topic. But obviously the answer is that the MathML support must be preserved. The MathJax team is strongly in favor of native MathML implementation.

May 5, 2013, 2:47:18 PM

to fred...@mathjax.org, dev-platform

It's not a joke.

Could you elaborate on this? In particular, as I wrote to the MathJax list,

I would be very interested in knowing what regressions the removal of

MathML would incur as far as MathJax is concerned.

Benoit

May 5, 2013, 6:28:52 PM

to Benoit Jacob, dev-platform

On Mon, May 6, 2013 at 3:38 AM, Benoit Jacob <jacob.b...@gmail.com> wrote:

> 2.1. MathML never saw much traction outside of Mozilla, despite having

> been around for a decade. WebKit only got a very limited partial

> implementation recently, and Google removed it from Blink. The fact that it

> was just dropped from Blink says much about how little it's used: Google

> wouldn't have disabled a feature that's needed to render web pages in the

> real world.

The Blink implementation was never good enough to render MathML pages well

in the real world, whether there were any or not. It also had some pretty

major brokenness in the way it was integrated into Blink, which made it

difficult to enable safely.

I would also say that one big difference between MathML and a hypothetical

TeX-based format is that MathML has a DOM and it's not clear how to fit TeX

into a DOM. That may not matter much for rendering, but it does if you want

to support editing.

One other thing: EPUB publishers are screaming for good math support for

textbooks (and currently that means they want MathML). They're mostly

Webkit-based, and maybe we don't care about them, but there you are.

Rob

--

"If you love those who love you, what credit is that to you? Even sinners

love those who love them. And if you do good to those who are good to you,

what credit is that to you? Even sinners do that."

May 5, 2013, 6:52:07 PM

to Robert O'Callahan, dev-platform

2013/5/5 Robert O'Callahan <rob...@ocallahan.org>

> On Mon, May 6, 2013 at 3:38 AM, Benoit Jacob <jacob.b...@gmail.com> wrote:

>

>> 2.1. MathML never saw much traction outside of Mozilla, despite having

>> been around for a decade. WebKit only got a very limited partial

>> implementation recently, and Google removed it from Blink. The fact that

>> it

>> was just dropped from Blink says much about how little it's used: Google

>> wouldn't have disabled a feature that's needed to render web pages in the

>> real world.

>

>

> The Blink implementation was never good enough to render MathML pages well

> in the real world, whether there were any or not. It also had some pretty

> major brokenness in the way it was integrated into Blink, which made it

> difficult to enable safely.

>

> I would also say that one big difference between MathML and a hypothetical

> TeX-based format is that MathML has a DOM and it's not clear how to fit TeX

> into a DOM. That may not matter much for rendering, but it does if you want

> to support editing.

>

That sounds interesting; could you please expand a little more?

- Do you mean that in order to support editing well, the format must be

naturally parseable into a tree representation? (Why so?)

- If that is what you mean: I think that TeX equations parse fairly

naturally into tree representations; in fact, I suppose that every

TeX-to-MathML conversion tool, or TeX-to-DOM-elements conversion tool in

existence, must already have solved this problem somehow (in particular,

MathJax).
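
To make the tree claim concrete, here is a minimal sketch in Python of a parser for a tiny TeX math subset (only \frac, braces, _ and ^, and single-character symbols). The names `tokenize` and `parse` and the tuple node shapes are made up for this illustration; this is not how MathJax or any existing converter is implemented.

```python
import re

# Tokens: a control word like \frac, one of { } _ ^, or any other single
# non-space character. Unsupported input such as "\," is silently skipped.
TOKEN = re.compile(r"\\[A-Za-z]+|[{}_^]|[^\s\\{}_^]")


def tokenize(tex):
    return TOKEN.findall(tex)


def parse_atom(tokens, pos):
    """Parse one group, one \\frac, or one symbol; return (node, next_pos)."""
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[pos]
    if tok == "{":
        nodes, pos = parse_sequence(tokens, pos + 1)
        if pos >= len(tokens):
            raise ValueError("missing '}'")
        return ("row", nodes), pos + 1
    if tok == r"\frac":
        numerator, pos = parse_atom(tokens, pos + 1)
        denominator, pos = parse_atom(tokens, pos)
        return ("frac", numerator, denominator), pos
    if tok in ("}", "_", "^"):
        raise ValueError(f"unexpected token {tok!r}")
    return ("sym", tok), pos + 1


def parse_sequence(tokens, pos):
    """Parse atoms until '}' or end of input, attaching sub/superscripts."""
    nodes = []
    while pos < len(tokens) and tokens[pos] != "}":
        node, pos = parse_atom(tokens, pos)
        while pos < len(tokens) and tokens[pos] in ("_", "^"):
            kind = "sub" if tokens[pos] == "_" else "sup"
            script, pos = parse_atom(tokens, pos + 1)
            node = (kind, node, script)
        nodes.append(node)
    return nodes, pos


def parse(tex):
    tokens = tokenize(tex)
    nodes, pos = parse_sequence(tokens, 0)
    if pos != len(tokens):
        raise ValueError("unbalanced '}'")
    return ("row", nodes)


print(parse(r"\frac{a}{b}"))
print(parse("x^2"))
```

Each tuple is a tree node whose first field is its kind, so mapping such nodes onto elements is mechanical for this subset. User-defined macros are the part this sketch does not attempt.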

>

> One other thing: EPUB publishers are screaming for good math support for

> textbooks (and currently that means they want MathML). They're mostly

> Webkit-based, and maybe we don't care about them, but there you are.

>

Given that TeX is already the standard in scientific publishing, I would

find it very surprising if they complained about a TeX-based or TeX-like

format!

Benoit

May 5, 2013, 7:16:05 PM

to Benoit Jacob, dev-platform

> 1.2.2. TeX is very friendly to manual writing, being concise and

> close to natural notation, with limited overhead (some backslashes and

> curly braces), while MathML is as tedious to handwrite as any other

> XML-based format. An example is worked out at

> http://en.wikipedia.org/wiki/MathML#Example_and_comparison_to_other_formats,

> where the solution to the quadratic equation is one line of TeX versus

> 30 lines of MathML!

This isn't exactly a fair comparison. I mean, it's fair, but for equations of any complexity (i.e. things you wouldn't find in a high school textbook) TeX can quickly become incredibly difficult (maybe more difficult than MathML) to manage. Most people I know who use TeX regularly have developed fairly thick sets of macros to try and manage things.

> Given that TeX is already the standard in scientific publishing, I would

> find it very surprising if they complained about a TeX-based or TeX-like

> format !

I always wanted to see MathML succeed. There are plenty of things to complain about in the format, but I think most of its problems stemmed from a lack of implementations. It feels to me like another one of those technologies (like flexbox or web components) that people need to reinvent (with a few of the sharp edges rounded off) and try to sell as "new". Until we have buy-in from some other browser vendors on a new format though, I don't think I understand why we'd kill off something that 1.) works and 2.) AFAIK requires almost zero upkeep. Are teams spending a lot of time maintaining MathML code?

- Wes

May 5, 2013, 7:40:30 PM

to Wesley Johnston, dev-platform

>

May 5, 2013, 8:38:53 PM

to

Here are a couple of reasons why dropping MathML would be a bad idea. (While I wrote this others made some of the points as well.)

* MathML is part of HTML5 and epub3.

* Gecko has the very best native implementation out there, only a few constructs short of complete.

* Killing it off means Mozilla gives up a competitive edge against all other browser engines.

* MathML is widely used. Almost all publishers use XML workflows and in those MathML for math. Similarly, XML+MathML dominates technical writing.

* In particular, the entire digital textbook market and thus the entire educational sector comes out of XML/MathML workflows right now.

* MathML is the only format supported by math-capable accessibility tools right now.

* MathML is just as powerful for typesetting math as TeX is. Publishers have been converting TeX to XML for over a decade (e.g., Wiley, Springer, Elsevier). Fun fact: the Math WG and the LaTeX3 group overlap.

* Limitations of browser support do not mean that the standard is limited.

From a MathJax point of view

* MathJax uses MathML as its internal format.

* MathJax output is ~5 times slower than native support. This is after 9 years of development of jsMath and MathJax (and JavaScript engines).

* The performance issues lie solely with rendering MathML using HTML constructs.

* Performance is the only reason why Wikipedia continues to use images.

* JavaScript cannot access font metrics, so MathJax can only use fonts we're able to teach it to use.

* While TeX and the basic LaTeX packages are stable, most macro packages are unreliable. Speaking as a mathematician, it's often hard to compile my own TeX documents from a few years ago. You can also ask the arXiv folks how painful it is to do what they do.

Other points

* MathML has never seen paid browser development. All work (save code review) has been done solely by unpaid volunteers. If Mozilla were paying even a part-time developer, Firefox would have had complete support years ago.

* The same holds for Apple, Google and Microsoft. Yes, when you don't put any developers on the job, MathML implementations do not get better.

* Google is even silly enough to kick out a hugely improved (albeit partial) implementation instead of landing the patches that fix that one remaining security issue -- while Apple doesn't have any problems with the same code.

* Firefox has shown how productive the feedback loop from a partial implementation can be, attracting a number of volunteers over the years, pushing it forward a little bit each time.

* MathML syntax is not as bad as people think but it takes getting used to (just like HTML). It's a bit like saying HTML is bad since markdown is much more human readable. Check out Dave Barton's jqmath, a serialization of MathML; with very little effort I find it as human readable as TeX.

* TeX is *not* the de-facto standard for math. It is the standard for researchers in mathematics and very few related fields. Most mathematical content is not created by researchers but by technical writers and in the educational sector. And again: most TeX gets converted to MathML in publishing workflows.

* MS Word and LibreOffice produce MathML out of the box.

Personal remarks

MathML still feels a lot like HTML 1 to me. It's only entered the web natively in 2012. We're lacking a lot of tools, in particular open source tools (authoring environments, cross-conversion, a11y tools etc).

But that's a bit like complaining in 1994 that HTML sucks and that there's TeX which is so much more natural with \chapter and \section and has higher typesetting quality anyway.

I'm totally for MusicML! More generally, there are things like CellML, CML and other scientific standards. I'd encourage them to work towards becoming web standards, to prove that the web is truly the native place for all human communication.

A statistical plot has no more reason to be an image than an equation -- it should be markup/data in the page and the browser should render it. Browsers may be the new printing press, but we are looking at Gutenberg's model here, not 20th century digital offset printing.

Anyway, the MathWG has fought extremely hard for 15 years to make mathematics a first class citizen on the web. Certainly, MathML is only the beginning for math on the web. But abandoning it now will throw scientific content back 20 years.

Personally, I don't want to wait for another Knuth to show up and fix the problem.

Peter.

May 5, 2013, 9:10:29 PM

to

On 5/5/2013 6:40 PM, Benoit Jacob wrote:

> Well, I have written hundreds of pages of TeX; for sure, some large equations would expand over more than one line of TeX, but I can't remember going over more than 5 lines of TeX source (without custom helper macros) per actual line of output, and that would be a really unusual case --- while the MathML example above has a ratio of 30 source lines to 1 output line.

For what it's worth, to compare the TeX to that MathML properly,
you'd have to count, e.g., \frac{a}{b} as three lines:

\frac

{a}

{b}

An example I have of several lines of TeX-per-output-line is the following (some Big-Step semantics rules):

\frac{\langle b,\sigma\rangle \Downarrow \mathtt{true}\quad

\langle \text{$s$ while ($b$) $s$},\sigma\rangle\Downarrow

\langle t, \sigma_1\rangle}

{\langle \text{while ($b$) $s$},\sigma\rangle \Downarrow

\langle t, \sigma_1\rangle}

The rendered output of this would be (hopefully the MathML makes it through):

〈b, σ〉 ⇓ true    〈s while (b) s, σ〉 ⇓ 〈t, σ₁〉
──────────────────────────────────────────────
〈while (b) s, σ〉 ⇓ 〈t, σ₁〉

Entering one of the lines in MathML in a more compact representation comes out to:

<mo>〈</mo><mtext>while (<mi>b</mi>) <mi>s</mi></mtext><mo>,</mo><mi>σ</mi><mo>〉</mo><mo>⇓</mo

So it's not a factor of 30-to-1 in verbosity, more like a factor of 2-to-1 or 3-to-1. Certainly the same order of magnitude. You might argue that I'm cheating by using Unicode characters instead of entities, but the LaTeX-to-MathML conversion tools I've seen all output UTF-8, and UTF-8 is generally much better supported by browsers than by TeX processors, so it's not an unrealistic assumption for how the text looks.

--
Joshua Cranmer
Thunderbird and DXR developer
Source code archæologist

May 5, 2013, 10:46:01 PM

to p.kraut...@gmail.com, dev-platform

Let me just reply to a few points to keep this conversation manageable:

May 5, 2013, 11:23:56 PM

to

On 5/5/2013 9:46 PM, Benoit Jacob wrote:

> I am still waiting for the rebuttal of my arguments, in the original

> email in this thread, about how TeX is strictly better than MathML for

> the particular task of representing equations. As far as I can see,

> MathML's only inherent claim to existence is "it's XML", and being XML

> stopped being a relevant selling point for a Web spec many years ago

> (or else we'd be stuck with XHTML)

Don't be quick to dismiss the utility of XML. The problem of XHTML, as I

understand it, was that the XHTML2 spec ignored the needs of its

would-be users and designed stuff that was untenable. XHTML as in "a

representation of the HTML DOM in XML syntax" isn't a bad idea to me.

Note that I'm really defining XML here as "the basic representation

format of HTML."

In this case, I think the XML nature of MathML actually works to its

benefit: it uses the same basic framework and "look and feel" as HTML,

so you can very easily insert arbitrary HTML into your equation. A

TeX-like language would have to invent awkward wrappers for this same

functionality, like \html{<b>I can insert arbitrary HTML!</b>}. It also

creates its own implicit DOM structure for manipulation, and provides

very natural launchpads for extra styling or scripting.
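
As a small sketch of that implicit structure, the following Python snippet parses a MathML fragment with the standard library's generic XML parser, walks the resulting tree, and restyles one node. The fragment and the `outline` helper are made up for this illustration; a browser would expose the same structure through its own DOM APIs rather than through Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the fraction a/b in presentation MathML.
MATHML = "<math><mfrac><mi>a</mi><mi>b</mi></mfrac></math>"


def outline(element, depth=0):
    """Return one indented line per element, with its text if it has any."""
    text = (element.text or "").strip()
    label = f"{element.tag}: {text}" if text else element.tag
    lines = ["  " * depth + label]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(MATHML)
print("\n".join(outline(root)))

# Styling hook: color the numerator via the MathML mathcolor attribute.
numerator = root.find("mfrac")[0]
numerator.set("mathcolor", "red")
print(ET.tostring(root, encoding="unicode"))
```

No math-specific parser is involved here: the markup is already a tree, so generic tools can inspect and modify it.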

May 6, 2013, 1:27:41 AM

to

Benoit, you said you need proof that MathML is better than TeX. I think it's the reverse at this point (from a web perspective -- you'll never get me to use Word instead of TeX privately ;) ).

Anyway, let me try to repeat how I had addressed your original points in my first post.

1.1. You make a point against adding unnecessary typography. Mathematics is text, but it adds new requirements. It's comparable to the introduction of RTL or tables much more than to musical notation. It's also something that all school children will encounter for 9-12 years. IMHO, this makes it necessary to implement mathematical typesetting functionality.

1.2 you claimed MathML is inferior to TeX. I've tried to point out that that's not the case as most scientific and educational publishers use it extensively.

1.2.1 You claimed TeX is the universal standard. I've tried to point out that only research mathematicians use it as a standard. Most mathematics happens outside that group.

1.2.2 You pointed out that MathML isn't friendly to manual input. That's true but HTML isn't very friendly either, nor is SVG.

1.2.3 You argued TeX is superior for accessibility. I've pointed out that that's not the case given the current technology landscape.

2 You wrote now is the time to drop MathML. I've tried to point out that now -- as web and ebook standard -- is the time to support it, especially when your implementation is almost complete and you're looking to carve a niche out of the mobile and mobile OS market, ebooks etc.

2.1 you claim MathML never saw traction outside of Firefox. I tried to point out that MathML has huge traction in publishing and the educational sector, even if it wasn't visible on the web until MathJax came along. Google wants MathML support (they just don't trust the current code) while Apple has happily advertised with the MathML they got for free. Microsoft indeed remains a mystery.

2.2 you claim MathJax does a great job -- ok, I'm not going to argue ;) -- while browsers don't. But we've used native output on Firefox before MathJax 2.0 and plan to do it again soon -- it is well implemented and can provide the same quality of typesetting.

3. Well, I'm not sure what to say to those. If math is a basic typographical need, then the syntax doesn't matter -- we need to see it implemented and its bottom up layout process clashes with CSS's top down process. No change in syntax will resolve that.

Since MathML development involved a large number of TeX and computer algebra experts, I doubt a TeX-like syntax will end up being extremely different from MathML the second time around.

Instead of fighting over syntax, I would prefer to focus on improving the situation of mathematics on the web -- so thank you for your offer to support us in fixing bugs and improving HTML layout.

Peter.

May 6, 2013, 1:58:21 AM

to Benoit Jacob, dev-platform

May 6, 2013, 2:14:18 AM

May 6, 2013, 2:21:09 AM

to Benoit Jacob, dev-platform, p.kraut...@gmail.com

On Mon, May 6, 2013 at 6:14 PM, Robert O'Callahan <rob...@ocallahan.org> wrote:

> wrote my thesis which also include a lot of semantics and type theory in

> FrameMaker, which was actually pretty good but is very dead.

>

Correction: it's alive! Amazing.

May 6, 2013, 4:20:38 AM

to

On Monday, 6 May 2013 07:27:41 UTC+2, p.kraut...@gmail.com wrote:

>

> Microsoft indeed remains a mystery.

>

Not so much when it comes to Microsoft Office:

http://blogs.msdn.com/b/murrays/

May 6, 2013, 5:01:22 AM

to Benoit Jacob, p.kraut...@gmail.com

On 05/06/2013 05:46 AM, Benoit Jacob wrote:

> Let me just reply to a few points to keep this conversation manageable:

>

> 2013/5/5 <p.kraut...@gmail.com>

>

>> Here are a couple of reasons why dropping MathML would be a bad idea.

>> (While I wrote this others made some of the points as well.)

>>

>> * MathML is part of HTML5 and epub3.

>>

>

> That MathML is part of epub3, is useful information. It doesn't mean that

> MathML is good but it means that it's more encroached than I knew.

>

> We don't care about "this is part of HTML5" arguments (or else we would

> support all the crazy stuff that flies on public-fx@w3...)

We do care about the stuff that is in the HTML spec.

http://www.whatwg.org/specs/web-apps/current-work/#mathml

(and if there is something we don't care about, it should be removed from the spec)

May 6, 2013, 7:22:55 AM

to Peter Krautzberger, dev-platform

Thanks Peter: that point-for-point format makes it easier for me to

understand your perspective on the issues that I raised.

2013/5/6 <p.kraut...@gmail.com>

> Benoit, you said you need proof that MathML is better than TeX. I think

> it's the reverse at this point (from a web perspective -- you'll never get

> me to use Word instead of TeX privately ;) ).

>

> Anyway, let me try to repeat how I had addressed your original points in

> my first post.

>

> 1.1. you make a point against adding unnecessary typography. Mathematics

> is text, but it adds new requirements. It's comparable to the introduction

> of RTL or tables much more than musical notation. It's also something that

> all school children will encounter for 9-12 years. IMHO, this makes it

> necessary to implement mathematical typesetting functionality.

>

School children are only on the reading end of math typesetting, so for

them, AFAICS, it doesn't matter that math is rendered with MathML or with

MathJax's HTML+CSS renderer.

> 1.2 you claimed MathML is inferior to TeX. I've tried to point out that

> that's not the case as most scientific and educational publishers use it

> extensively.

>

> 1.2.1 you claimed TeX is the universal standard. I've tried to point out

> only research mathematicians use it as a standard. Almost most mathematics

> happens outside that group.

>

I suppose that I can only accept your data as better documented than mine;

most of the TeX users I know are or have been math researchers.

> 1.2.2 You pointed out that MathML isn't friendly to manual input. That's

> true but HTML isn't very friendly either, nor is SVG.

>

It's not comparable at all.

If you're writing plain text, HTML's overhead is limited to some <br> or

<p> tags, with maybe the usual <b>, <i>, heading... so the overhead is

small compared to the size of your text.

If you add many anchors and links, and some style, the overhead can grow

significantly, but is hardly going to be more than 2 input lines per output

line.

With MathML, we're talking about easily over 10 input lines per output line

--- in wikipedia's example, MathML has 30 where TeX has 1.

So contrary to HTML, nobody's going to actually write MathML code by hand

for anything more than a few isolated equations.
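
To make the overhead argument concrete, here is a minimal Python sketch that builds presentation MathML for one small formula and counts the lines it takes once pretty-printed. The formula (a+b)/2 and the helper names are my own choices for illustration, not Wikipedia's example, and the exact count depends on how the markup is indented.

```python
import xml.etree.ElementTree as ET

# The same formula, (a+b)/2, as one line of TeX.
TEX_SOURCE = r"\frac{a+b}{2}"


def build_mathml() -> ET.Element:
    """Build presentation MathML for (a+b)/2 (illustrative helper)."""
    math = ET.Element("math", xmlns="http://www.w3.org/1998/Math/MathML")
    frac = ET.SubElement(math, "mfrac")
    numerator = ET.SubElement(frac, "mrow")
    for tag, text in (("mi", "a"), ("mo", "+"), ("mi", "b")):
        ET.SubElement(numerator, tag).text = text
    ET.SubElement(frac, "mn").text = "2"
    return math


def mathml_lines() -> list[str]:
    """Pretty-print the MathML, one element per line (needs Python 3.9+)."""
    root = build_mathml()
    ET.indent(root)
    return ET.tostring(root, encoding="unicode").splitlines()


if __name__ == "__main__":
    lines = mathml_lines()
    print("\n".join(lines))
    print(f"TeX: {len(TEX_SOURCE.splitlines())} line, MathML: {len(lines)} lines")
```

Even for a formula this small the markup is several times longer than the TeX source, which is the ratio being argued about here.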

Thanks also for your other points below, to which I'm not individually

replying; we have a perspective mismatch here, so it's interesting for me

to understand your perspective, but I'm not going to win a fight against

the entire publishing industry which you say is already behind MathML.

Benoit

May 6, 2013, 7:27:08 AM5/6/13

to Robert O'Callahan, dev-platform

May 6, 2013, 7:31:36 AM5/6/13

to Robert O'Callahan, dev-platform, Peter Krautzberger

> Let me go on a bit of a rampage about TeX for a bit.

>

> TeX is not a markup format. It is an executable code format. It is a

> programming language by design!

>

Yes, but a small subset of TeX could be purely a markup format, not a

programming language. Just support a finite list of common TeX math

operations, and no custom macros (or very restricted ones).
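
As a rough illustration of what "a finite list of common TeX math operations, and no custom macros" could mean in practice, here is a small Python sketch of a checker for such a subset. The whitelist, the function name and the error messages are all hypothetical; a real subset would need a much more careful design.

```python
import re

# Hypothetical whitelist of allowed commands; anything else is rejected,
# including \def and \newcommand, so no custom macros can be defined.
ALLOWED_COMMANDS = {
    "frac", "sqrt", "int", "sum", "alpha", "beta", "pi", "infty", "cdot", "pm",
}

# Matches a control word (\name), a control symbol (\ plus one character),
# or a bare brace. Every other character is ordinary math text.
TOKEN_RE = re.compile(r"\\([A-Za-z]+)|\\(.)|([{}])", re.DOTALL)


def check_restricted_tex(source: str) -> list[str]:
    """Return a list of problems; an empty list means the source is in the subset."""
    problems = []
    depth = 0
    for match in TOKEN_RE.finditer(source):
        name, symbol, brace = match.groups()
        if name is not None:
            if name not in ALLOWED_COMMANDS:
                problems.append(f"command \\{name} is not in the allowed subset")
        elif symbol is not None:
            problems.append(f"control symbol \\{symbol} is not in the allowed subset")
        elif brace == "{":
            depth += 1
        else:  # closing brace
            depth -= 1
            if depth < 0:
                problems.append("unbalanced closing brace")
                depth = 0
    if depth > 0:
        problems.append("unbalanced opening brace")
    return problems


if __name__ == "__main__":
    print(check_restricted_tex(r"\int_0^1 x^2 dx"))
    print(check_restricted_tex(r"\def\foo{x}"))
```

With no macro definitions and a fixed command list, the input can be checked in a single pass without executing anything, which is the property a markup format needs.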

Benoit

May 6, 2013, 7:45:55 AM5/6/13

to Benoit Jacob, dev-platform

On Mon, May 06, 2013 at 07:27:08AM -0400, Benoit Jacob wrote:

> 2013/5/6 Robert O'Callahan <rob...@ocallahan.org>

>

May 6, 2013, 8:19:31 AM5/6/13

to

On 5/6/13 7:27 AM, Benoit Jacob wrote:

> I guess I don't see the usefulness of allowing to apply style to individual

> parts of an equation

Styling parts of an equation with different colors can be _extremely_

useful for readability. It's rarely done in print, of course, and I

assume there are various reasons ranging from "it's more expensive" to

"no one does that" for why. But on the web it seems like a no-brainer.

Styling parts of an equation with different font styles is of course all

over the place; there are lots of TeX packages that will let you do

things like \mathfrak, for example. Of course fraktur in particular got

stuck into Unicode...

There are some interesting use cases I can think of for scripted

visibility styling in educational materials.

> Regarding editing, if I understand correctly, you have WYSIWYG or other

> kinds of fancy editing in mind, where understanding of the syntax tree

> inside of the equation is needed; I haven't seen a need for WYSIWYG editing

> of math

ok for specialists (maybe; I have in fact wished for a good wysiwyg

editor for TeX on many an occasion, but was always stymied by the need

for custom macros for my documents), but most people _do_ in fact want

wysiwyg editing. It's not "fancy" for most people but a baseline

requirement. So any system for math on the web needs to have support

for that requirement...

-Boris

May 6, 2013, 8:24:07 AM5/6/13

to

On 5/5/13 10:46 PM, Benoit Jacob wrote:

>> * MathJax output is ~5 times slower than native support. This is after 9

>> years of development of jsmath and MathJax (and javascript engines).

>

> JavaScript performance hasn't stopped improving and is already far better

> than 5x slower than native on use cases (like the Unreal Engine 3 demo)

> that were a priori much harder for JavaScript.

This is a layout/css issue, not a js engine issue, I suspect. MathJax

HTML output just ends up having to produce lots of stuff, do lots of

layout calculations (which means doing layout!) then redo all the layout

again based on the results of those calculations.

It's really hard to make 2+ layout passes as fast as one layout pass.

> I'm also speaking as a (former) mathematician, and I've never had to rely

> on TeX packages that aren't found in every sane TeX distribution

The packages I used in the mid-to-late '90s for embedding images in

documents no longer exist; their current replacements (with different

syntax) did not exist then.

> I am still waiting for the rebuttal of my arguments, in the original email

> in this thread, about how TeX is strictly better than MathML for the

> particular task of representing equations.

How easy is it to build an accessibility application on top of TeX, or

even a restricted subset of it? Note that these exist for MathML, but

not so much for TeX.

I guess this comes down to how easy it is to construct exactly the same

parse/syntax tree out of TeX, right?

-Boris

May 6, 2013, 8:36:20 AM5/6/13

to

On 5/6/2013 6:27 AM, Benoit Jacob wrote:

> I guess I don't see the usefulness of allowing to apply style to individual

> parts of an equation --- applying a single style to an entire equation

> would be plenty enough as far as I can see.

Suppose you were writing an introductory explanation course, where you

were explaining the derivation of a complex formula step-by-step. You

could illustrate the changes in each step with a different color. You

could also use strike through text formatting to clearly indicate.

>

> Regarding editing, if I understand correctly, you have WYSIWYG or other

> kinds of fancy editing in mind, where understanding of the syntax tree

> inside of the equation is needed; I haven't seen a need for WYSIWYG editing

> of math, but I don't want to try to fight the war "for or against WYSIWYG".

>

I would wager that the majority of HTML content in the wild is not

written by people who write HTML in a text editor but by people who use

some sort of WYSIWYG tool or document format conversion--I'm including

subsets like email and E-PUB here. Also, this strikes me as very biased

towards the frame of mind that "real mathematicians use TeX"--I was

introduced to the Equation Editor in Microsoft Office more or less as

part of the regular course of study, long before I was introduced to TeX

in any form.

May 6, 2013, 9:12:48 AM5/6/13

to dev-pl...@lists.mozilla.org

On Mon, May 06, 2013 at 08:24:07AM -0400, Boris Zbarsky wrote:

> >I am still waiting for the rebuttal of my arguments, in the original email

> >in this thread, about how TeX is strictly better than MathML for the

> >particular task of representing equations.

>

> How easy is it to build an accessibility application on top of TeX,

> or even a restricted subset of it? Note that these exist for

> MathML, but not so much for TeX.

I actually think it would be easier to map tx math into the
May 6, 2013, 10:13:04 AM5/6/13

May 6, 2013, 12:25:39 PM5/6/13

to

I'm coming late to this thread but I have to say that the misunderstanding present in the original post is huge. The author can take refuge in that he's made a common category mistake. MathML is a computer representation for math, TeX is a human input language.

MathML was never intended to be typed by humans so it is no wonder that you find it a bad experience. TeX is a poor computer representation which is one reason why MathML was invented.

It is reasonable to have a discussion of the relative merits of entering math by typing TeX vs point-and-click editing of math (ie, direct manipulation editing). I am biased toward the latter but I can understand the feelings of those whose hands know TeX really well.

In short, both MathML and TeX have good reasons to exist and don't compete with each other in their primary categories.

There are several problems/issues here:
# Context
How do you differentiate/identify math powers (e.g. "a^2"), footnotes (e.g. "some text^1") and code ("int c = a^b;")?

With MathML markup, you have clearly identified what the content of the document/sub-tree is.

# Parsing

With a TeX-like format, a speech synthesiser/screen reader/web browser would need to write a parser for that format.

With MathML, the parsing is already handled by the SGML/XML/HTML5 parser so the application can process it via DOM/SAX/a reader API.
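
The parsing point can be sketched in a few lines of Python: a stock XML parser produces the tree, and the application only has to walk it. The spoken phrases and the `speak` helper below are my own illustrative assumptions and do not reflect how any real screen reader words mathematics.

```python
import xml.etree.ElementTree as ET

# Illustrative wording only; real screen readers use far richer rules.
SPOKEN_OPERATORS = {"+": "plus", "-": "minus", "=": "equals"}


def local_name(element: ET.Element) -> str:
    """Strip an XML namespace prefix such as '{http://...}msup'."""
    return element.tag.rsplit("}", 1)[-1]


def speak(element: ET.Element) -> str:
    """Turn a small presentation MathML tree into a spoken-style string."""
    name = local_name(element)
    children = list(element)
    if name in ("mi", "mn", "mtext"):
        return (element.text or "").strip()
    if name == "mo":
        symbol = (element.text or "").strip()
        return SPOKEN_OPERATORS.get(symbol, symbol)
    if name == "msup" and len(children) == 2:
        return f"{speak(children[0])} to the power {speak(children[1])}"
    if name == "mfrac" and len(children) == 2:
        return f"{speak(children[0])} over {speak(children[1])}"
    # math, mrow and anything unrecognised: read the children in order.
    return " ".join(part for part in map(speak, children) if part)


SAMPLE = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<msup><mi>a</mi><mn>2</mn></msup><mo>+</mo><mi>b</mi>"
    "</math>"
)

if __name__ == "__main__":
    print(speak(ET.fromstring(SAMPLE)))
```

No math-specific parser appears anywhere in the sketch; the only math-specific code is the tree walk itself.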

> currently we don't expose mathml at all other than as an object that

> we say is an equation, and it's not really clear how to fix that with

> mathml.

Another important consideration is existing web content. If you are going to start rendering text that has e.g. "a^2" as math, then all documents that use that, e.g. "<p>You can use a^b in TeX to denote 'a raised to the b<sup>th</sup> power'.</p>"

- Reece

the same way the tx parser does, though that would be a problem for the
API consumer to deal with not us.
With MathML markup, you have clearly identified what the content of the document/sub-tree is.

>

# Parsing

>

With a TeX-like format, a speech synthesiser/screen reader/web browser would need to write a parser for that format.

>

With MathML, the parsing is already handled by the SGML/XML/HTML5 parser so the application can process it via DOM/SAX/a reader API.

> > currently we don't expose mathml at all other than as an object that

> > we say is an equation, and it's not really clear how to fix that with

> > mathml.

>

This is enough information for the screen reader/speech synthesiser to know that it has MathML content, and thus walk the MathML DOM to read the math out loud. It should also be enough to query associated CSS styles to handle any Aural CSS or CSS Speech styles associated with the MathML.

Another important consideration is existing web content. If you are going to start rendering text that has e.g. "a^2" as math, then all documents that use that, e.g. "<p>You can use a^b in TeX to denote 'a raised to the b<sup>th</sup> power'.</p>"

break existing pages, instead we'd have to do something like <p>this is

some text with an equation <tx>x = 2y</tx></p>

We're getting distracted by the comparison with TeX and the discussion of

MathML's relative merits. My bad: I obscured my message by starting two

conversations at once (1.1 and 1.2 in my initial email).

I happily concede this round, given that most people disagree with me about

TeX in this thread.

Can we focus on the other conversation now: should the Web have a

math-specific markup format at all? I claim it shouldn't; I mostly

mentioned TeX as a "if we really wanted one" side note and let it get out of

hand.

How many specific domains will want to have their own domain-specific

markup language next? Chemistry? Biology? Electronics? Music? Flow charts?

Calligraphy?

I suspect that when people start asking for that, we'll quickly have to

start saying "no", and at that point, the exception made for math will seem

unjustified.

I understand, from Boris' email, that there are nontrivial performance

issues associated with relying on generic HTML layout to render math. And

API issues associated with querying font metrics from JavaScript. But

surely it must be possible to overcome these issues, and that would benefit

entire classes of content, not just math.

If tomorrow a competing browser solves these problems, and renders

MathJax's HTML output fast, we will obviously have to follow. That can

easily happen, especially as neither of our two main competitors is

supporting MathML.

Benoit

MathML specifies mathematical formulae, which is not domain-specific,
Of course not; just like math, music will want a higher level of

abstraction that's not directly tied to graphical rendering, like a set of

SVG macros would be.

In fact, http://en.wikipedia.org/wiki/MusicXML

And in fact... http://en.wikipedia.org/wiki/List_of_XML_markup_languages

Benoit

This is a good question to ask, but I think it would help if there are
Hopefully Web Components will provide a good solution to let authors extend

the browser with support for vocabularies that can be rendered via a

straightforward decomposition to HTML or MathML or SVG.

I think the layout requirements of MathML are too onerous for MathML to be

reduced to HTML or SVG that way.

While diagrams such as chemical formulae, flowcharts or electronics

schematics can be compiled to SVG, the layout step is very much nontrivial

and I don't think Web Components is enough for that. Web Components plus

some JS to do the layout is probably satisfactory.

* About the "XML is evil, MathML is XML so MathML is evil" syllogism.

I don't think it makes sense in general to say that something is good or bad without mentioning for what purpose. I actually agree with Joshua that XML is a good format to work with for a computer engineer. There are very good libraries and tools to handle it and things like XML namespaces that are painful on the Web become very important for these tools. roc is right that the "catastrophic fail" is certainly not good for a Web model. Note however that most Web sites are automatically generated by server side programs and I often see MySQL, PHP, CGI etc failures without hearing anyone saying they should go back to static pages. The HTML5 parsing rules give you concision and error-tolerance when you want to quickly write pages but this "tag-soup" approach also brings confusion in general and is just useless for programming. The inclusion of MathML inside HTML5 removed the XML burden that prevented people from using it on the Web or in emails where the default content is text/html. Actually the syntactic difference has never been a problem for MathJax or WYSIWYG tools (working on a DOM-like tree) or for authoring tools like LaTeXML (that generates XML from LaTeX and then only uses XSLT stylesheets at the final step to convert to EPUB, XHTML or HTML5). Perhaps one of the best arguments against the XML-haters is that Henri Sivonen's HTML5 validator is itself heavily based on XML tools like RELAX NG schemas or Java XML-related libraries.

* About the "MathML is too specialized".

Obviously I agree with what has been said before about math being in a particular position. I personally see mathematical writing as a language by itself and so not having it in the browsers is just like not supporting Arabic or Asian scripts (BTW MathML was implemented in Gecko a long time before HTML ruby). Just to add one point: mathematical expressions are also very often mixed with other content like text or diagrams and it makes sense to have HTML+SVG+MathML+CSS well integrated together.

* About the "TeX is already the universally adopted standard" and " TeX is very friendly to manual writing".

Again, people have already said that this is not true, at least not outside academia (I personally use TeX too as an input method but don't want to impose that on other people and I'm open to using other methods like handwriting recognition in the future). One of the most popular questions on the MathJax list is of course "is there a WYSIWYG math editor?" Many people also like the ASCII-like syntax: (x^2 + y_1)/2. For example MathJax supports that syntax, Daniel Glazman has plugins for BlueGriffon & Thunderbird and this is commonly used in computer algebra systems. Some people say (cf jqMath or MathEL) that with tools to replace the traditional keyboard, entering Unicode characters becomes easy and so they generalize to a Unicode-based simple syntax where you write the actual symbol rather than commands like \Leftrightarrow. Even modern LaTeX environments support Unicode. Finally, people are also interested in handwriting recognition (see e.g. https://www.youtube.com/watch?v=26opB8DRf3c or http://webdemo.visionobjects.com/portal.html).

* About the "TeX can be nearly trivially read aloud".

This is an assumption but the reality is that math accessibility tools would need a parsing into an abstract representation at some point anyway. Just reading the plain text source naively is not enough for the two use cases I mentioned. There are already MathML-based tools showing that it is possible to use MathML.

* About "MathML never saw much traction outside of Mozilla"

If you are only talking about browser vendors then that's true. But Web users have requested MathML support for a long time (remember that the Web was created at CERN for research purposes) and it has been implemented in Gecko and WebKit by volunteers. MathJax is yet another community effort to bring math on the Web and was initially presented to me by Robert Miner as a "transition technology" towards MathML in browsers. At the last W3C workshop on ebooks, everybody complained about the lack of MathML support in layout engines (Gecko being excluded de facto for now) and this leads to serious discussions inside the MathJax consortium about how we could help implement MathML in browsers (hopefully the MathJax team will be able to say more about that later). Other people also indicated other domains where MathML is now used. BTW, you probably don't care either about that argument, but MathML is part of the OpenDocument OASIS/ISO/IEC standard used in OpenOffice and other office software suites.

[digression: Mozilla people keep saying that competition is good. That was certainly true when Mozilla was fighting against Internet Explorer predominance and stagnation in Web innovation. But to be honest, it seems that what's happening now is that Google is leading the development and other actors are following. Webkit developments were mostly done by Google and it's not clear what Apple will do after Blink fork. Opera just gave up its rendering engine and joined the Blink effort. Mozilla claims to propose something different but there have been many pragmatic decisions recently that are against its own community and manifesto. Many excellent Mozilla projects are now abandoned by MoCo and left to volunteers. I don't want the MathML story to be another defeat against Google's monoculture]

* About the "cost to support MathML"

I think most of the changes done by Mozilla staff are normal code maintenance (removing PR_TRUE macros, changing C++ interface, moving files etc) as well as a few security fixes. The MathML code is relatively small and isolated so it's not too serious an effort. There were large code refactorings a couple of years ago, like roc moving some font code to the style system or Karl's remarkable work to make MathML support live again. All the new features and other bug fixes not in the previous categories have been made by volunteers. I've actually been mentoring dozens of new volunteers in the last few months and some are still active in Gecko or other Mozilla projects. The synergy with other MathML projects outside Mozilla was also very fruitful. I'm certainly biased but I would say it has been more a benefit for Mozilla than a burden to have MathML.

* High-quality mathematical typography in browsers is now possible, without using MathML. Examples include MathJax or PDF.js

I only tried PDF.js once at the very beginning. It was really slow and the output was really bad (disclaimer: I tried with TeX-generated papers; I guess for simple PDF documents it is fine). I don't believe the future of Web content is PDF and I only see the PDF.js effort as a workaround for the Adobe plugin rather than a real wish to bring PDF documents to the Web. I agree that we should encourage scientific and technical content on the Web and that's one of the goals of the MathJax project.

Regarding MathJax output, it is certainly far better than Safari native MathML support at the moment but I would say that Gecko's rendering is close, depending on the fonts available on your system (https://developer.mozilla.org/en-US/docs/Mozilla/MathML_Project/Fonts). It's rather impressive that a Javascript-based approach with all the approximations (like rounding errors in measures, placements and hardcoded font metrics) is able to do better than native support but I would see that as an encouragement for browser vendors to do better than to just give up. Microsoft Office uses MathML (or at least a very similar XML format) and developed the OpenType Math Table extension, proving that you can get a good rendering from that too. Jonathan Kew's implementation of the OpenType Math Table in XeTeX and the same in LuaTeX proved that the TeX layout algorithm can be replaced by this method without losing in quality and can do better than PDF.js or MathJax. Currently Gecko's MathML tries to emulate TeX's heuristics with very little knowledge of the current font but I expect we could use the OpenType Math table in the future (Karl reported a related bug a long time ago). Also, MathJax TeX fonts are generated from the classical Computer Modern font. This means that we need an autotracer to convert from Knuth's metafont to Web fonts and this results in not so good quality depending on font size (at least, some of MathJax's partners from the publishing industry complain). I think the future for the Web is to directly use math fonts that have been designed as OpenType fonts and many have Microsoft's Math table (STIX, Asana Math, Neo Euler, Gyre, Latin Modern, Cambria Math etc).

Without further details, the main issues with MathJax are:

1) Performance

2) Dynamic Update (Javascript, reflow, repaint etc)

3) Integration with HTML/CSS/SVG

4) Font support

Some improvements could be made by providing Javascript APIs to give MathJax more information, especially to solve 4). However, I don't think the fundamental issues can be solved without proper MathML support. We have other ideas like using localStorage to cache the HTML-CSS output or trying to play on how many equations we insert at once, but I doubt we will ever get the same performance as native MathML.

BTW, one of the most important sources of complaints from MathJax users and partners at the moment seems to be incompatibility and conflicts with CSS and in general 3). It turns out that math on the web is very different from math on paper. With TeX, you're happy if you get a final black & white PDF, with a fixed layout inside page areas and user-defined size & line-breaking (you know the \\ and \Bigl commands). Web people not only want colors but also fonts, Unicode support, links, DOM & Javascript, inclusion in SVG diagrams and a rendering that is not "optimized for IE6 with screen resolution 1024x800". Actually they want compatibility with all the CSS effects like text-shadow, ::selection, CSS animations or max-width (just to mention a few examples of recent feedback from MathJax users). So as roc said, in one way or the other what you want is a Web language with a DOM.

In my opinion the main improvement to make to MathML would be to bring it even closer to CSS rather than trying to go back to the old TeX paradigm that is not appropriate at all for the Web. Some examples: <mstyle> (that just duplicates CSS in an incompatible and less powerful way), <mpadded width="2height"> (inspired from LaTeX but incompatible with the CSS box model) or <mphantom> (inspired from \phantom, but not very useful when you have "visibility: hidden"). I believe things like mfrac@linethickness, math@displaystyle, math@dir, MathML font properties that have been reimplemented by roc, or even parameters from the Open Type Math Table could become CSS properties. This would make the MathML layout more configurable and less random.

As a conclusion, I understand Benoit's concerns from a Web author's point of view. But really from my experience a solution like MathJax for a subset of LaTeX or alternative output like ASCIIMath + tools like LaTeXML for advanced and complicated TeX macros and processing is really satisfactory. WYSIWYG tools or handwriting recognition are maybe not widely available yet, but I expect the solution will be at a high abstraction level rather than just plain text.

Hello everyone!

This thread has caught my attention and I would like to share my

opinions, maybe as a "school child" who used mathematical software for

WYSIWYG editing (not only reading!), as the primary way of editing any

math, as a primary/fundamental tool for computer-aided learning. I was

(un)lucky enough to be forced by my situation to learn using *only*

computers in the late 1990s and early 2000s. That experience has taught

me the importance of WYSIWYG editing for HTML and maths.

I feel it's not easy for me to reply to this thread - seeing other people

who are technical experts that I admire have already replied, providing

proper arguments for their reasoning. Please excuse my, perhaps, less

formal, less backed-by-arguments reply.

This thread shows that there's some misunderstanding on the performance,

styling and editing requirements for math. I can say that I spent months

trying software to find the best one fitting my requirements. It wasn't

easy.

I haven't seen good (La)TeX WYSIWYG editors, but lately I haven't tried

any such software - now I write LaTeX manually. Still, in the early

2000s I did see and use one WYSIWYG editor that was really good:

Wolfram's Mathematica. It had fast rendering, good set of keyboard

editing shortcuts allowing fast input in WYSIWYG mode. Really good math

WYSIWYG editing is very much possible.

Performance matters not only for the initial document rendering. When

you do WYSIWYG editing performance characteristics matter in a lot more

subtle ways. When you are editing big equations, or some really big

document, updates need to happen as close as possible to instant. I have

tested software like MathCAD and Maple that did not seem slow at all

when loading documents. Editing math, however, proved to be quite slow.

Very good editing is *not* about "click and point" - this was one of the

biggest failures of MathCAD's UI: it encouraged the click-and-point

editing which meant you had to switch between the keyboard and the mouse

all the time. Word 97 (before Word 2007) forced you to manually switch

between the equation editor and the normal editor, which was a huge

problem, and so on.

Styling is really important when you collaborate with others and you

need to highlight relevant parts of the math output. I am surprised this

is even up for discussion.

Similarly I am surprised that the need for WYSIWYG editing for math is

being discussed. I am being subjective here: I believe that mathematics

should be a first-class citizen on the web. Mathematics is a fundamental

domain of study in all schools, in all forms of education throughout the

world. Mathematics is the basis for many other fields, see physics,

computer science and others.

Back in those days when I was writing math homework with Mathematica I

was very glad and I appreciated a lot that people write software that

can benefit my niche needs, it was invaluable for me. It made possible

things that were not possible. Microsoft's Word was not even close to

being as usable as Wolfram's software. Word 2007 has, indeed, improved

math editing a *lot*, today it's certainly usable.

Microsoft's work on improving math editing in Word shows there's a real

demand for math in documents. I don't see why we would believe otherwise

about the web. We should not need to include half-baked* JS libs to

render math in a document.

* I'm not claiming that MathJax is half-baked - I am simply pointing out

that once people have the choice of which JS lib to use for math

rendering they may (and will) fail to pick the best one.

I do not care about the technology here - MathML or TeX. What I care

about is for the web browsers to meet the technical demands for

producing really good math rendering and editors. I want this not for

the academics, not for professors who can write TeX documents. I want

this for school children who cannot write math on paper, who are blind,

or who have other physical disabilities. Manually writing LaTeX does not

"cut it" at early stages, when children learn maths. Such tools are

invaluable for them.

At the moment, removing MathML support from Gecko would make it harder

for web app developers to create (really) good software for math

editing. It may certainly have its problems, but its benefits are

greater. Before MathML is removed people should look into defining the

requirements, the APIs needed to be implemented in the browser such that

JS-based math rendering can be equally fast and versatile (eg. styling).

Font metrics stuff is, I believe, only a part of the problem that makes

JS-based math rendering slower than native. After requirements are

defined, those things should be implemented. After that, yes, remove MathML.

Back in the days when I was testing math software, I was also testing

MathML rendering in Gecko - it was slower than in specialized software.

I don't know how it is today, but keep in mind that native software like

Maple and MathCAD was not usable due to performance issues, during fast

editing of small to medium sized documents. It may take some time before

web apps can become as fast as Mathematica at rendering math, and as

good at editing -- even with MathML rendered natively.

Editors are really hard and it is unfortunate to note here that browsers

do not even do good enough at HTML editing. If we can do something to

improve the situation we should do that - not the opposite. The removal

of MathML would most-likely make things worse/harder for web-based math

editors.

Probably there is not much "value" from maintaining MathML - browser

competition happens in other areas, other APIs and technologies.

However, please let the volunteers do their work, maintain their work

and so on. From reading this thread I understand MathML support in Gecko

was implemented mostly by volunteers. It would be a big disappointment

to volunteer efforts to see that work goes away, especially without

anything better replacing it.

I doubt that, if we keep MathML, people will some day ask for their own

niche markup language - e.g. for domains like chemistry, biology, music,

etc. Did you see anyone doing that?

I find it surprising that HTML5 caters to advertisers/trackers by

introducing the ping attribute for anchors, yet here we question the

use/need for a standard way to write mathematics on the web - the

initial email in this thread questions the need for anything to replace

MathML, as writing maths is over-specialized.

Thank you for reading. Feel free to take these thoughts with a grain of

salt: I am biased, I was a user of native math software and I would like

the web platform to provide equally good software.

Best regards,

Mihai

I'd like to understand more about this. I have been hoping that one of the
best use cases for web components is to implement these kinds of

domain-specific languages. I greatly fear that we're accidentally

pushing the web from declarative markup to a model where everything is

controlled with script: in the process, we are going to lose some of the

core benefits of the web: pervasive hyperlinking, save-as and

view-source, and . I tend to think that web components are a great way

to abstract away the presentation of new declarative languages.

Without knowing a lot about it, it seems that SVG and HTML contain all

of the primitives necessary for a web components script to implement the

visual MathML presentation. Perhaps I'm not completely aware of the

problems, though. Does MathML need to participate in inline reflow in a

way that requires direct support from the layout engine?

script, and I'm really hoping they will be able to implement arbitrary

> Does MathML need to participate in inline reflow in a way that requires direct support from the layout engine?

I don't know if that answers your question, but one important thing that is currently lacking in Gecko's MathML implementation is line breaking. This is true for Web pages but I suspect this will become even more important with mobile devices. I use many long inline formulas in my blog and this is handled as I would like by Gecko. A very important use case is large tables enumerating mathematical properties and thus containing formulas (you can find some on Wikipedia, like pages on Fourier transforms, integral/derivatives, probability laws, usual function properties, and I also used such tables in the appendices of my two master's theses in CS and math). Currently, line breaking is disabled by default in MathJax, as that slows down the layout algorithm even more. And of course that does not work for formulas inside table cells, since MathJax has no knowledge of CSS intrinsic widths.
> problems, though. Does MathML need to participate in inline reflow in a way
Ideally, yes, although it currently doesn't in Gecko.
Good math layout requires specialized layout primitives that we don't have

in regular CSS. I'm thinking of features like stretchy characters (e.g.

integrals that grow based on the size of the enclosed formula), stretchy

overbars and underbars of various kinds, and careful placement of the

degree next to a radical symbol.

Keep in mind that without script, the kind of transformations you can apply

with Web Components are similar to XBL, and that's pretty limited.

I'll repeat what I said before: going from a domain-specific data model,

such as XML describing an electronic circuit, to a good rendering, is an

incredibly complex process. It's unclear it can even be automated at all,

let alone automated without script.

As can be seen in the integration into HTML(5) nothing in MathML requires an XML surface syntax.
fonts and page size. It is not designed at all for a web-like scenario; in particular, TeX has no support for line breaking of displayed mathematics,

which is particularly important with small screens etc. Also, of course, classic TeX has no support for Unicode fonts. There are Unicode variants (LuaTeX and XeTeX) and experimental support for Unicode math fonts (unicode-math fonts), but all these are relatively unstable development software and hardly well-established standards.

It does not re-invent TeX: it is different.
TeX is the standard in some fields for _author submission_, although even in scientific fields other formats, notably Word, are (perhaps surprisingly) common. Also, other than self-publishing mechanisms such as arXiv, many scientific journals do _not_ use TeX for publishing, and even if they accept TeX input they convert in house to XML/MathML workflows (thus MathML is a more natural publishing format).
Probably true but not very relevant. HTML is more verbose than wiki syntax for the same reasons.
MathML is a standard part of ODF (OpenOffice etc.); it is supported in the clipboard in Word and the rest of the MS Office suite. It is a standard part of EPUB3. It is supported in many typesetting systems used by journals. It is supported on import/export in Maple and Mathematica, just to name a few. In contrast, TeX is notoriously difficult to process by anything other than TeX: latex2html/tex4ht/latexml do a remarkable job but are inherently fragile and incomplete.
If you mean "in web browsers", it is true: Firefox and IE+MathPlayer are the two main implementations, with less complete support in Safari and Opera.
See the comment above for traction in MathML in other aspects.

MathJax (or any javascript) rendering is clearly slower and harder to interact with than a native implementation that maps directly to the DOM
It is hard to think of any advantages that would have.
Yes, I'm definitely not talking about a non-script implementation of any
> I'm coming late to this thread but I have to say that the misunderstanding present in the original post is huge. The author can take refuge in that he's made a common category mistake. MathML is a computer representation for math, TeX is a human input language.

>

> MathML was never intended to be typed by humans so it is no wonder that you find it a bad experience. TeX is a poor computer representation which is one reason why MathML was invented.

>

> It is reasonable to have a discussion of the relative merits of entering math by typing TeX vs point-and-click editing of math (ie, direct manipulation editing). I am biased toward the latter but I can understand the feelings of those whose hands know TeX really well.

> In short, both MathML and TeX have good reasons to exist and don't compete with each other in their primary categories.

But I wanted to present another problem which is in a way of the same

nature (has also a degree of separation). It's the development of

tableless grids against HTML.

Consider the case, table A, which a developer can think of:

4 columns:

abcd

ebfd

To do this in HTML, the developer has to make a "container DIV" with 4

main column cells in it, think c1,c2,c3,c4, in which c1=rows a,e;

c2=row b; c3=rows c,f; c4=row d;

It's easier to type something like the following:

4,abcdebfd

But this won't change the reality, the end product of this is:

<div>
  <div class='inline'><a /><e /></div>
  <div class='inline'><b /></div>
  <div class='inline'><c /><f /></div>
  <div class='inline'><d /></div>
</div>

Which can be, as many pointed out, styled, channeled to accessibility

observers, manipulated, and annotated, all of that with a greater

level of compatibility, as developers understand.
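
As a rough illustration of the expansion described above, here is a minimal Python sketch that turns the shorthand "4,abcdebfd" into the nested DIVs. The function name expand_grid and the rule that a letter repeated in consecutive rows of a column is one cell spanning those rows are assumptions made for this example; they are not taken from the toolkit or the spec linked in this message.

```python
def expand_grid(shorthand: str) -> str:
    """Expand a shorthand such as '4,abcdebfd' into nested column DIVs.

    Hypothetical helper: the text before the comma is the column count,
    the text after it lists the cells row by row.
    """
    count, cells = shorthand.split(",", 1)
    columns = int(count)
    if columns <= 0 or len(cells) % columns != 0:
        raise ValueError("cell count must be a multiple of a positive column count")

    parts = []
    for c in range(columns):
        names = []
        # Walk down column c, one row at a time.
        for name in cells[c::columns]:
            # Assumption: a letter repeated in consecutive rows is a single
            # cell spanning those rows, so it is emitted only once.
            if not names or names[-1] != name:
                names.append(name)
        inner = "".join(f"<{n} />" for n in names)
        parts.append(f"<div class='inline'>{inner}</div>")
    return "<div>" + "".join(parts) + "</div>"


print(expand_grid("4,abcdebfd"))
```

The point of the sketch is only that the shorthand is an input convenience: what the browser, stylesheets and accessibility tools see is still the expanded element tree.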

But then, right now, what we have are:

a) Toolkits using JS to do things like 4,abcdebfd [1]

a.1) For example, http://labs.telasocial.com/grid-layout/

b) Specs to do similar things:

b.1) http://dev.w3.org/csswg/css-template/#grid-shorthand

The above example, which refers to grid rearrangement, is a different

thing, I know. But I think it has similar points to this discussion:

shorthands that apply to HTML elements (or other elements) are good

things for developers.


A bit more on the TeX part of this argument. Over a decade ago my company polled publishers that accept submissions from authors of content containing math. Although not a scientific poll, the results were overwhelming. Approximately 85% of all submissions were in MS Word format with equations written using its Equation Editor (my company's product at the time, licensed to Microsoft).

It has been pointed out already here, but let me emphasize that most math is not done by mathematicians. All K-12 students must learn math, so there's a huge industry to serve them. Half the departments at a typical university use math in their teaching (not just science and engineering but anything with statistics, business, or economics). Mathematicians are a tiny fraction of the whole. I guarantee you that virtually all K-12 teachers have never heard of TeX.

Ok, with that dead horse beaten let me turn to MathML.

It was mentioned that MathML lacks open source tools to work with it. I believe one of the main reasons for this is the lack of browser support for MathML. It is ironic that most STEM publishers' internal workflows are based on XML and MathML for math but they can't deliver it to browsers. MathJax changes that of course which is why we now see the overwhelming interest in MathJax and its ability to render MathML.

It would be truly ironic if MathJax's success killed MathML. MathJax does a truly heroic job of formatting mathematics. It is amazing to me that a JavaScript library can do so well. That said, it is a stand-in for true MathML support. It lacks access to fonts and character metrics as well as layout information. The fact that it works so well is due to the brilliance of Davide Cervone and the rest of the MathJax team and not because it is the right way to render MathML in a browser. That it is not truly integrated into the browser results in all sorts of struggles. As witness, see all the postings in its forums from people who run into trouble creating dynamic web pages containing math. They get into all sorts of tangles involving DOM changes, rendering, event processing order, etc.

Those of us that work in MathML and equation editing constantly run into the misconception that an equation rendering is more like an image than text. I think this is due to the fact that many document processing systems through history have had to handle math as an image because they don't support math directly. Mathematics is really just a fancy text format. Think subscripts and superscripts on steroids. Some chemistry notations fit into this mold as well, but music and some of the other things mentioned in this thread do not. There is one easy way to tell if a notation is fancy text: do books include the notation inline in paragraphs or not? Even with a block (or display) equation it flows with the text.

It was stated that Mozilla's two main competitors don't support MathML. While that is literally true, Internet Explorer has had the benefit of MathML display via my company's MathPlayer plug-in for years. When we introduced it, Microsoft added some APIs specifically to allow us to do a better job integrating its rendering into the surrounding text. Most screen readers interface with MathPlayer to make math accessible to people with various disabilities.

Please don't turn your backs on MathML just as it is coming into its own. MathJax is a good stand-in for missing MathML support but it does not eliminate the need for native support. Instead, it makes its absence all the more glaring and the need to fix the problem all the more urgent.

of the fact machines parse it. However in any case AtkText /

IAccessibleText / the mac accessible protocol thing all expect the text

for an object to be a string so whatever format the web uses screen

readers will be handling a serialized format.

> - For people with reading disabilities (dyslexia etc), you need to synchronize highlight of equation parts / reading of equation parts.

tell API consumers that the bounds of characters in the serialized text

are those for the formatted text shown on screen. That doesn't quite

work for { / } / ^ etc, but we can just give them a size of zero or

something, and that should probably be fine.

> In both cases, you must know a bit more about the mathematical structure e.g. to have a DOM. It's not clear how to do that with plain text. It's just absurd to believe that putting TeX source inside the alt text of an <img> makes the formula accessible. It might work for very simple equations like x+2 but in general you'll have to do some parsing into an abstract representation if you want to read/highlight it correctly. With MathML you already have a standard representation and there already exist tools to work with that language.

In the cases above which you discussed and Braille which you didn't the

medium is fundamentally serial, so I'm not sure I buy that you need a

tree.

I think Fred's point here was that the literal text in the MathML or LaTeX is not what a blind person wants to hear. The whole point of math as a 2-D notation is that the relative positions of the parts of the equation carry meaning. This is unlike normal text, which almost always carries its whole message in its words and punctuation.
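
To make that concrete, here is a small Python sketch that walks a MathML tree and produces a spoken form from the structure rather than from the literal markup. It is only an illustration: the speak function and its wording rules are invented for this example and are not how MathPlayer or any screen reader actually reads math, and the input is assumed to be presentation MathML without an XML namespace.

```python
import xml.etree.ElementTree as ET

# Illustrative wording for a few operators; a real reader needs far more context.
OPERATOR_WORDS = {"+": "plus", "-": "minus", "=": "equals"}


def speak(node: ET.Element) -> str:
    """Return a rough spoken form of a presentation MathML subtree."""
    kids = list(node)
    if node.tag in ("mi", "mn", "mo"):
        text = (node.text or "").strip()
        return OPERATOR_WORDS.get(text, text)
    if node.tag == "msup":
        base, exponent = kids
        return f"{speak(base)} to the power {speak(exponent)}"
    if node.tag == "mfrac":
        numerator, denominator = kids
        return f"the fraction {speak(numerator)} over {speak(denominator)}, end fraction"
    if node.tag == "msqrt":
        return "the square root of " + " ".join(speak(k) for k in kids) + ", end root"
    # math, mrow and anything unrecognized: read the children in order.
    return " ".join(speak(k) for k in kids)


mathml = (
    "<math><mfrac>"
    "<mrow><mi>x</mi><mo>+</mo><mn>1</mn></mrow>"
    "<msup><mi>y</mi><mn>2</mn></msup>"
    "</mfrac></math>"
)
print(speak(ET.fromstring(mathml)))
# prints: the fraction x plus 1 over y to the power 2, end fraction
```

The spoken string is serial, but the "end fraction" marker and the choice of phrasing both come from the tree; the raw markup text would give a listener neither.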

Math accessibility is a surprisingly complex subject. How math should be read is dependent on the mathematical or scientific context in which the math is embedded, the educational level of the user, and their familiarity with the accessibility technology itself. In our grant work with the Educational Testing Service (ETS) we found out that a literal reading of a mathematical expression in a test question can give away the answer even when the graphical rendering doesn't.

BTW, all this work is done with math expressed in MathML. It could use MathML structures obtained from MathJax but this means that the screen reader can't use MSAA (or equivalent) to get an IAccessible interface from a DOM node. As far as I know, there is no mechanism that allows JavaScript code to implement IAccessible.

Even with MathML implemented natively in browsers, it seems like accessibility mechanisms still need some work. While the HTML5 effort is busy adding access to device features (phone, camera, GPS, touch) for use in web apps, there has been no effort to do something similar for screen readers and for accessibility support in general. Screen reader vendors are currently being cut out of the mobile market as device makers are playing the old proprietary "that functionality is part of the OS" game.

I guess I am going a bit far afield here. My hope was to show that there is a lot happening with MathML technology. It is not time to pull the plug but to properly support it.

