MathML-in-HTML5


r...@maths.uq.edu.au

Sep 23, 2006, 9:57:55 AM
to dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
I am currently driving an effort to enable MathML-in-HTML (apart from
MathML-in-XHTML that we already support). I have a patch that serves
the dual purpose of showing where things are going and raising the issues
to ponder.

Here is a
[screenshot] https://bugzilla.mozilla.org/attachment.cgi?id=239771
which is a _live_ rendering of this testcase:
[mathml-in-html] https://bugzilla.mozilla.org/attachment.cgi?id=239769

Those interested in following this up can see bug 353926:
https://bugzilla.mozilla.org/show_bug.cgi?id=353926

Quick background:
=================

At the Firefox engineering meeting in Mountain View (December
2005), I pleaded that we enable MathML in HTML5 to advance the cause
of MathML, which is so far locked in an XHTML/XML world that does not
seem to be going anywhere in terms of display content as opposed to
data (witness the WHATWG effort -- http://www.whatwg.org). Those to
whom I spoke included dbaron, hixie and sicking, and they welcomed the
suggestion, asking for a broader discussion. Hixie raised the caveat
that MathML elements should still remain in the MathML namespace. He
e-mailed me a while ago about a discussion on this matter in the
WHATWG mailing list, which can be seen here
http://listserver.dreamhost.com/pipermail/whatwg-whatwg.org/2006-June/thread.html.

That discussion is however too broad and involves tangential issues such as
inventing another syntax, etc. My original take was simply to enable
MathML+HTML, in the same vein as we have MathML+XHTML. I think MathML
is suffering from having to fight the battle for adoption of XHTML as
well. As a niche technology, it does not have the means to wage such
a fight. What it needs is simply MathML-in-HTML. W3C failed to
recognise that it could retrofit MathML in HTML -- see this archived
post for some insight:
http://groups.google.com/group/netscape.public.mozilla.mathml/msg/4d58c35217afcb54?dmode=source
But HTML5, being shepherded by WHATWG, could provide the right framework
for this to happen now.

I have finally been able to code this up (while keeping MathML
elements in the MathML namespace). I attached the patch I had so far
in bug 353926.

Design & Technical issues:
==========================

How does MathML-in-HTML5 work?

We support MathML-in-HTML5 when these two conditions are met:

1. The DOCTYPE of the document says so. If yes, we enable
MathML entities (TODO) and flag mMayHaveMathML in the HTML content sink.

2. And either a) OR b) is met:

a) <html> has the MathML namespace as the value of an attribute with a
prefix, e.g., <html xmlns:m="http://www.w3.org/1998/Math/MathML">.

In this case, we cache the prefix "m" in mMathMLNameSpacePrefix,
and we intercept all <m:tag> in the document and create
MathML content nodes for them.

b) MathML fragments are in the document as
<math xmlns="http://www.w3.org/1998/Math/MathML">
...
</math>

In this case, we intercept all non-HTML elements inside the <math> tag
and create MathML content nodes for them.
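
The two conditions above can be sketched as a small decision function. This is only an illustration in Python, not the code in the patch on bug 353926: the function names, the tiny HTML_TAGS set and the may_have_mathml parameter (standing in for the mMayHaveMathML flag) are assumptions made for the example.

```python
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Tiny illustrative subset of HTML tag names; a real sink knows the full list.
HTML_TAGS = {"a", "b", "div", "i", "img", "p", "span"}


def find_mathml_prefix(html_attrs):
    """Option a): return the prefix bound to the MathML namespace on <html>."""
    for name, value in html_attrs.items():
        if name.startswith("xmlns:") and value == MATHML_NS:
            return name[len("xmlns:"):]
    return None


def is_mathml_element(tag, attrs, *, may_have_mathml, prefix, inside_math):
    """Decide whether a MathML content node should be created for `tag`."""
    if not may_have_mathml:
        # Condition 1 failed: the DOCTYPE did not opt in.
        return False
    if prefix and tag.startswith(prefix + ":"):
        # Option a): <m:tag> with the cached prefix.
        return True
    if tag == "math" and attrs.get("xmlns") == MATHML_NS:
        # Option b): the <math> element that opens a fragment.
        return True
    # Option b): any non-HTML element inside a <math> fragment.
    return inside_math and tag not in HTML_TAGS
```

Under these assumptions a <b> inside <math> is left to the HTML sink, which is exactly the HTML-in-MathML case raised under the issues.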

Issues:
1. Tag soup: we understand that we are exposing ourselves to this.

2. a) What about CSS matching rules? From the Style System point of view,
the document is still HTML, but <m:math> is in the MathML namespace. We
might have to special case MathML-in-HTML5 in the Style System as well.

b) The second option raises an issue with HTML-in-MathML, e.g.,
<math xmlns="http://www.w3.org/1998/Math/MathML">
<b>bold</b>
</math>
We don't intercept the <b> in this case. Hence, even though it is
HTML-in-MathML without an explicit XHTML namespace for <b>,
the HTML sink
will give <b> an HTML content node. This is not really XHTML friendly.
On the other hand, we don't want to be an XML parser either... These
are conflicting objectives. We need to decide what to do. We may agree
to only support tags with prefixes as in a), or also keep b) knowing
that it has this XHTML unfriendly behavior.
---
RBS



Ian Hickson

Sep 23, 2006, 5:06:07 PM
to r...@maths.uq.edu.au, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On Sat, 23 Sep 2006 r...@maths.uq.edu.au wrote:
>
> Hixie raised the caveat that MathML elements should still remain in the
> MathML namespace.

I meant in the DOM, I didn't mean in the markup. I don't think we should
have any namespace declarations or namespace prefixes in text/html; I
would just have the HTML parser always support the MathML elements, in
the same way that it supports any random unknown element today, except
that when it sees a MathML element it puts it into the MathML namespace in
the DOM rather than the XHTML namespace.

I really don't think we want to introduce namespace prefixes or namespace
declarations into tag soup. I think that would be a big mistake.
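
A minimal sketch of that idea in Python, assuming the parser picks the DOM namespace purely from the tag name. The MATHML_ELEMENTS set here is an illustrative subset and namespace_for is a hypothetical helper, not an API of any parser.

```python
MATHML_NS = "http://www.w3.org/1998/Math/MathML"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Illustrative subset only; the real list is whatever the spec ends up naming.
MATHML_ELEMENTS = {"math", "mrow", "mfrac", "msqrt", "msub", "msup"}


def namespace_for(tag):
    """Pick the DOM namespace from the tag name alone; no xmlns, no prefixes."""
    if tag.lower() in MATHML_ELEMENTS:
        return MATHML_NS
    # Everything else, including unknown elements, is treated as HTML.
    return XHTML_NS
```

The markup carries no namespace information at all in this scheme; the table in the parser is the only source of truth.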

--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'

Paul Topping

Sep 23, 2006, 6:38:52 PM
to Ian Hickson, r...@maths.uq.edu.au, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
If MathML is considered a subset of HTML5, then no namespace declaration
would be necessary. However, if MathML is going to work in HTML that
isn't declared as HTML5 (not clear to me from this thread), then the
document would be poorly specified without it, IMHO.

At the risk of inciting an anti-Microsoft backlash, I should remind some
on the list that IE has covered this territory before. They already have
a mechanism for declaring XML islands in HTML that seems to work just
fine. Of course, Mozilla won't be interested in duplicating IE's way of
associating a plugin as the renderer of the namespace in the document.
IMHO, it doesn't belong there anyway. It is better (i.e., more secure) to
keep such associations out of the content.

Paul Topping
Design Science, Inc.
www.dessci.com/mathplayer

> _______________________________________________
> dev-tech-mathml mailing list
> dev-tec...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-tech-mathml
>

Ian Hickson

Sep 23, 2006, 8:08:52 PM
to Paul Topping, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org, r...@maths.uq.edu.au
On Sat, 23 Sep 2006, Paul Topping wrote:
>
> If MathML is considered a subset of HTML5, then no namespace declaration
> would be necessary. However, if MathML is going to work in HTML that
> isn't declared as HTML5 (not clear to me from this thread), then the
> document would be poorly specified without it, IMHO.

As far as HTML5 UAs are concerned, declaring HTML as HTML5 consists of
labelling it as text/html. It isn't clear to me what you would consider
HTML that isn't declared as HTML5. With the exception of quirks which are
required for compatibility with de facto standards that disagree with de
jure standards, HTML has no practical versioning story -- all features
work in all documents, regardless of the official "version" of HTML used.


> At the risk of inciting an anti-Microsoft backlash, I should remind some
> on the list that IE has covered this territory before. They already have
> a mechanism for declaring XML islands in HTML that seems to work just
> fine.

XML data islands don't form part of the parent DOM (they are "islands", as
opposed to part of the document). I'm not sure how wrapping <xml> tags
around the MathML content would help. :-)


> And, I should have added that without a namespace declaration there
> would be no way to differentiate different versions of MathML. While
> most MathML instances are now MathML 2.0, the MathML 3.0 effort is just
> now starting up.

Why would you need to distinguish them? MathML2 is a superset of MathML1,
and (for all intents and purposes) any compliant MathML2 UA can process
any compliant MathML1 content. I would assume that this would continue to
be the case; if not, then this is IMHO a problem with MathML3.

Note that the namespace declaration can't currently distinguish between
MathML1 and MathML2; I don't see any reason why MathML3 would change this.

Chris Chiasson

Sep 24, 2006, 4:58:59 AM
I don't understand. Aren't people who are savvy enough to generate
MathML also savvy enough to generate XHTML? Has anyone actually said,
"That MathML I can handle, but what's this XHTML?"

sha...@shantirao.com

Sep 24, 2006, 12:16:13 PM
On 9/24/2006 1:58 AM, Chris Chiasson wrote:
> I don't understand. Aren't people who are savvy enough to generate
> MathML also savvy enough to generate XHTML? Has anyone actually said,
> "That MathML I can handle, but what's this XHTML?"

Savvy, yes. But also impatient. You will notice that HTML is what gets
used -- not XHTML.

I like to write straight HTML with embedded LaTeX, then run it through a
translator to turn $exponents^2$ into MathML. Sure, HTML->XHTML
converters exist, but again, I'm lazy, selfish, and impatient.

Shanti

Chris Chiasson

Sep 24, 2006, 1:43:39 PM
You find it easier to transform HTML + LaTeX instead of XHTML + LaTeX?
What kind of tools are you using?

David Carlisle

Sep 25, 2006, 5:11:17 AM
to i...@hixie.ch, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org

Ian

> XML data islands don't form part of the parent DOM (they are "islands", as
> opposed to part of the document). I'm not sure how wrapping <xml> tags
> around the MathML content would help. :-)

The syntax Paul was referring to here wasn't the <xml> convention, but
the ability in IE to have (explicitly prefixed) XML elements within an
HTML document with rendering controlled by an external component,
but _without_ any other flag at that point in the markup, such as
<xml> or <object> etc.

In the IE implementation you need to have an <object> in the head
pointing at the particular rendering component, which is fairly horrible
and also, you need to declare the namespace using (a variant of) an
early working draft namespace syntax using a PI, but as Paul said, those
parts needn't be copied. An example of a document using this syntax is
shown here:

http://www.dessci.com/en/products/mathplayer/author/creatingpages.htm#AnatomyMathPlayerWebPage

By using a different classid you can do the same thing to include
(explicitly prefixed) svg into an html document and have it rendered by
Adobe's svg viewer, and in principle any other vocabularies (although I
don't personally know of any other implementations of this, except
techexplorer, which is again for MathML).

Having math more or less added directly to html would be nice in many
ways, but I'm not sure how well it scales. If you think people might
want to have html+svg+chemml+... then perhaps having an api
that allows processing to be attached to namespaced elements would be
more general. On the other hand that was part of the reason for having
namespaces (and for that matter, xml itself) that people could serve all
sorts of different xml vocabularies and have clients do whatever is
necessary. I suspect part of the reason for "html5" is a feeling that
that never happened and isn't going to be mainstream any time soon, and
that a solution that directly addresses the fixed html vocabulary, with
perhaps two specific extensions such as svg and mathml will in practice
cover the vast majority of browser needs, and other vocabularies can be
transformed to html+.. before being served.

David

Ian Hickson

Sep 25, 2006, 1:38:59 PM
to David Carlisle, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On Mon, 25 Sep 2006, David Carlisle wrote:
>
> The syntax Paul was referring to here wasn't the <xml> convention, but
> the ability in IE to have (explicitly prefixed) XML elements within an
> HTML document with rendering controlled by an external component, but
> _without_ any other flag at that point in the markup, such as
> <xml> or <object> etc.

Oh, well, as noted earlier, the idea of namespace prefixes in HTML isn't
one that I personally am particularly fond of.


> I suspect part of the reason for "html5" is a feeling that that never
> happened and isn't going to be mainstream any time soon, and that a
> solution that directly addresses the fixed html vocabulary, with perhaps
> two specific extensions such as svg and mathml will in practice cover
> the vast majority of browser needs, and other vocabularies can be
> transformed to html+.. before being served.

I think that's pretty much exactly correct, yes.

sha...@shantirao.com

Sep 25, 2006, 10:44:15 PM
On 9/24/2006 10:43 AM, Chris Chiasson wrote:
> You find it easier to transform HTML + LaTeX instead of XHTML + LaTeX?
> What kind of tools are you using?

A text editor, and itexMML, of course! Sure, more sophisticated tools
exist, but they aren't very reliable, are they?

Chris Chiasson

Sep 26, 2006, 8:00:53 AM
How would transforming XHTML+LaTeX be harder than HTML+LaTeX with
itexMML?

William F Hammond

Sep 26, 2006, 5:35:39 PM
to dev-tec...@lists.mozilla.org
sha...@shantirao.com writes:

> On 9/24/2006 10:43 AM, Chris Chiasson wrote:
>> You find it easier to transform HTML + LaTeX instead of XHTML + LaTeX?
>> What kind of tools are you using?
>
> A text editor, and itexMML, of course! Sure, more sophisticated tools
> exist, but they aren't very reliable, are they?

Oh?

I expect that the author of whatever more sophisticated tool you try
would like to hear of any lack of reliability you find.

Cheers.

-- Bill

Roger B. Sidje

Sep 26, 2006, 7:59:50 PM
to Ian Hickson, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
> What I would be proposing for HTML5 is just the following list of
> elements:
>
> math, mrow, mfrac, msqrt, mroot, mstyle, merror, mpadded, mphantom,
> mfenced, menclose, msub, msup, msubsup, munder, mover, munderover,
> mmultiscripts, mtable, mlabeledtr, mtr, mtd, maction

I don't like mlabeledtr very much (I have already expressed my views
about it to folks of the MathML WG), and would hope that they will take
my suggestion for <mtr label="..."> in MathML3. The former is
unnecessarily bloated and doesn't degrade gracefully at all with
renderers that don't support it (not to mention that it is hard to fit
in Gecko's existing table code).

However, your list misses some key tags, in particular leaf tags such as
<mspace/> -- which is sometimes quite useful. Also, <mprescripts/> and
<none/> are needed in <mmultiscripts> (albeit it can be argued that
<none/> is the same as <mrow></mrow> or an empty <mspace/>, but the
differentiation is worthwhile).

In general, I would prefer the list to at least include all the tags
that we already support, and which existing webpages have come to depend
on. This effectively boils down to your list above, excluding
<mlabeledtr>, and including <mspace/>, <mprescripts/>, <none/> and
<mi>, <mn>, <ms>, <mtext>, <mo>. In particular, <mo> is a vital tag as
it is at the heart of those stretchy MathML characters.
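
Spelled out, the adjustment described above is a simple list operation. The sketch below is in Python; the variable names are made up for the example, and the tag names are taken only from the two lists in this message.

```python
# The list quoted at the top of this message.
HIXIE_LIST = """
math mrow mfrac msqrt mroot mstyle merror mpadded mphantom
mfenced menclose msub msup msubsup munder mover munderover
mmultiscripts mtable mlabeledtr mtr mtd maction
""".split()

# Tags the reply asks to add: leaf/token elements and <mmultiscripts> helpers.
ADDED = ["mspace", "mprescripts", "none", "mi", "mn", "ms", "mtext", "mo"]

# The reply's list: the quoted list without <mlabeledtr>, plus the additions.
proposed = [tag for tag in HIXIE_LIST if tag != "mlabeledtr"] + ADDED

print(len(proposed), "elements:", " ".join(proposed))
```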

Implementation-wise, as this inclusion of MathML-in-HTML5 marks the
beginning of tag soup, it may be that the HTML parser would have to have
some knowledge of leaf tags, so that for example, a stray <mspace>
doesn't become the root of an entire HTML tree... which is later fed to
the hapless MathML engine. (The patch I attached in bug 353926 ignored
the issue.)
---
RBS

On 26/09/2006 3:59 AM, Ian Hickson wrote:
> On Sun, 24 Sep 2006, Boris Zbarsky wrote:
>
>>Ian Hickson wrote:
>>
>>>We didn't check that <canvas> wouldn't cause clashes, either.
>>
>>I see. I had assumed that we in fact had.
>>
>>
>>>I don't see why. We don't want a flag for when people can use the storage
>>>APIs. Or when they can use <img> elements. Or whatever.
>>
>>True, because those are very unlikely to collide with random stuff the pages
>>are doing (e.g. the storage APIs are using fairly long names that are unlikely
>>to collide with page-defined functions and variables).
>>
>>If we think MathML has a similarly low risk of collision, great.
>
>
> I don't know about "we".
>
> What I would be proposing for HTML5 is just the following list of
> elements:
>
> math, mrow, mfrac, msqrt, mroot, mstyle, merror, mpadded, mphantom,
> mfenced, menclose, msub, msup, msubsup, munder, mover, munderover,
> mmultiscripts, mtable, mlabeledtr, mtr, mtd, maction
>
> ...and of those only <math> came up in the top 1000 elements in my
> search of elements on about one billion pages.
>
> According to that same research, <math> is, on the Web, less frequent than
> the following elements: <m>, <e>, <rem>, <tab>, <yr>, <prohibits>, <your>,
> <lable>, <text-spez>, etc. It was present on less than 0.002% of the pages
> the research covered. (To give an idea of scale, <h8> is used on more than
> 0.003%, so if we avoid <math> because of this, we should probably
> introduce <h7> and <h8> into HTML, since we're saying that's an important
> enough level to worry about.)
>
> Now, of course, it could be that those 0.002% of pages are all hugely
> important and that we'll break the Web in adding this feature. We can't
> know until we've tried.
>

Ian Hickson

Sep 26, 2006, 8:16:07 PM
to Roger B. Sidje, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On Wed, 27 Sep 2006, Roger B. Sidje wrote:
>
> I don't like mlabeledtr very much (I have already expressed my views
> about it to folks of the MathML WG), and would hope that they will take
> my suggestion for <mtr label="..."> in MathML3. The former is
> unnecessarily bloated and doesn't degrade gracefully at all with
> renderers that don't support it (not to mention that it is hard to fit
> in Gecko's existing table code).

I'm happy to drop/add any tag to this list. Just give me the list you
want.


> However, your list misses some key tags, in particular leaf tags such as
> <mspace/> -- which is sometimes quite useful. Also, <mprescripts/> and
> <none/> are needed in <mmultiscripts> (albeit it can be argued that
> <none/> is the same as <mrow></mrow> or an empty <mspace/>, but the
> differentiation is worthwhile).

I missed anything that wasn't in the table I happened upon in the spec. I
didn't look very closely for the exact table I wanted.

Tell me what tags you want to have and we'll make that the list. You're
the expert. :-)


> Implementation-wise, as this inclusion of MathML-in-HTML5 marks the
> beginning of tag soup, it may be that the HTML parser would have to have
> some knowledge of leaf tags, so that for example, a stray <mspace>
> doesn't become the root of an entire HTML tree... which is later fed to
> the hapless MathML engine. (The patch I attached in bug 353926 ignored
> the issue.)

Don't worry, these tags auto-close when a parent tag is closed.

<foo><bar><baz></foo><quux>

...results in this DOM:

<foo>
  <bar>
    <baz>
<quux>

For leaf nodes with following siblings, people will have to use end tags,
as in:

<foo><bar></bar><baz></baz></foo><quux></quux>

If we want to start adding actual leaf tags, I'd rather do this in a
second stage, after we have a proof of concept. (I've so far avoided
adding any new tags to the HTML5 parser spec, but eventually there will be
a bunch we have to add.)

We can go from non-empty to empty much more easily than from empty to
non-empty.
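
The auto-closing behaviour described above can be sketched with a stack-based tree builder. This is a toy written in Python, not the HTML5 parsing algorithm: it handles bare start and end tags only (no text, attributes or empty elements), and build_tree and outline are names made up for the example.

```python
import re


def build_tree(markup):
    """Build a nested dict tree from bare <tag> and </tag> tokens."""
    root = {"tag": "#document", "children": []}
    stack = [root]
    for closing, name in re.findall(r"<(/?)([A-Za-z][A-Za-z0-9]*)\s*>", markup):
        if not closing:
            node = {"tag": name, "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        elif any(open_node["tag"] == name for open_node in stack[1:]):
            # Closing a parent also closes everything still open inside it.
            while stack.pop()["tag"] != name:
                pass
        # An end tag with no matching open element is ignored.
    return root


def outline(node, depth=0):
    """Return the tree as indented lines, one per element."""
    lines = []
    for child in node["children"]:
        lines.append("  " * depth + "<" + child["tag"] + ">")
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(build_tree("<foo><bar><baz></foo><quux>"))))
```

Feeding it the second example, with explicit end tags on <bar> and <baz>, makes the two elements siblings under <foo> instead of nesting them.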

Roger B. Sidje

Sep 26, 2006, 9:03:25 PM
to Ian Hickson, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On 27/09/2006 10:16 AM, Ian Hickson wrote:

> I'm happy to drop/add any tag to this list. Just give me the list you
> want.

OK.

> For leaf nodes with following siblings, people will have to use end tags,
> as in:
>
> <foo><bar></bar><baz></baz></foo><quux></quux>
>
> If we want to start adding actual leaf tags, I'd rather do this in a
> second stage, after we have a proof of concept. (I've so far avoided
> adding any new tags to the HTML5 parser spec, but eventually there will be
> a bunch we have to add.)

OK, I see.

The other issue is those 2000 entities that MathML has. You said that
you are not a big fan of a namespace thingy on the root <html> element.

Is it okay to assume HTML5 (with its <!DOCTYPE html>) as supporting all
W3C entities _by default_? We have a proof-of-concept of that in View
Selection Source, BTW. It will display any entity it can.
http://lxr.mozilla.org/mozilla/source/content/base/public/nsIDocumentEncoder.idl#125
As VSS has stood the test of time without major complaints, perhaps
<!DOCTYPE html> could assume that too? If that is agreed, we are all clear.

The other remaining issue might be with style matching because <math>
will then be internally in the MathML namespace whereas the HTML
document is in the none namespace (at present), but we will see how it
goes from there.
---
RBS

Ian Hickson

Sep 26, 2006, 9:23:38 PM
to Roger B. Sidje, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On Wed, 27 Sep 2006, Roger B. Sidje wrote:
>
> The other issue is those 2000 entities that MathML has.

Yeah... Do we really need those? Some of them seem reasonable to add, but
2000 seems like too many for the mnemonic advantage to beat just using
Unicode codepoints...

The problem with adding entities is that a LOT of people do things like

href="/u?aa=foo&ab=foo&ac=foo&ad=foo"

...which today works, but would break if MathML entities were introduced
(since &ac is a MathML entity).
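
The collision can be reproduced with a toy decoder. The sketch below is in Python and assumes, as the paragraph above implies, that a tag-soup parser recognises a known entity name even without a trailing semicolon. The two tables are small illustrative samples and decode is a hypothetical helper, not a browser API.

```python
import re

# Entities a tag-soup parser knows today (small sample).
LEGACY = {"amp": "&", "lt": "<", "gt": ">", "copy": "\u00a9"}

# Sample of what adding the MathML set would bring in.
MATHML = {"ac": "\u223e", "it": "\u2062"}


def decode(text, entities):
    """Replace &name or &name; when `name` is a known entity."""
    def replace(match):
        # Unknown names are left exactly as written.
        return entities.get(match.group(1), match.group(0))
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);?", replace, text)


href = "/u?aa=foo&ab=foo&ac=foo&ad=foo"
print(decode(href, LEGACY))                  # unchanged
print(decode(href, {**LEGACY, **MATHML}))    # "&ac" is now read as an entity
```

With only the legacy table the query string survives; once "ac" is a known name, the third parameter is silently rewritten.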


> Is it okay to assume HTML5 (with its <!DOCTYPE html>) as supporting all
> W3C entities _by default_?

Don't do anything based on the DOCTYPE. HTML5 is anything sent as
text/html.


> The other remaining issue might be with style matching because <math>
> will then be internally in the MathML namespace whereas the HTML
> document is in the none namespace (at present), but we will see how it
> goes from there.

I don't see why this would cause any problems.

Roger B. Sidje

Sep 26, 2006, 11:10:17 PM
to Ian Hickson, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On 27/09/2006 11:23 AM, Ian Hickson wrote:
>
> The problem with adding entities is that a LOT of people do things like
>
> href="/u?aa=foo&ab=foo&ac=foo&ad=foo"
>
> ...which today works, but would break if MathML entities were introduced
> (since &ac is a MathML entity).
>

That list is so big that trying to hand-pick some and leaving some out
would need another committee...

>>Is it okay to assume HTML5 (with its <!DOCTYPE html>) as supporting all
>>W3C entities _by default_?
>
>
> Don't do anything based on the DOCTYPE. HTML5 is anything sent as
> text/html.

I thought the DOCTYPE was trustworthy -- based on this excerpt from the
HTML5 spec:

"HTML documents that use the new features described in this
specification must start with the string <!DOCTYPE html> and, if they
are served over the wire (e.g. by HTTP) must be labelled with the
text/html MIME type."

If so, it would have meant fewer conflicts with agreed entities in HTML5.

BTW, for my own information, do you intend HTML5 to be transitional,
almost-standards, or strict? If it is HTML5 (or XHTML5) served as
text/html but put in the XHTML namespace at some later stage (as the
HTML5 spec implies), it had better be strict, no? And that would be driven by the
DOCTYPE detection code. Catch my drift? Or is tag soup going to be in
the XHTML namespace?

If it is strict then maybe entities could be required to have a
semi-colon -- which will then avoid the ambiguities you mentioned above.

Not that I have a position on this (at least as yet). I am just bringing
in some food for thought, to accommodate the realistic issues of MathML.
---
RBS

Ian Hickson

Sep 27, 2006, 1:59:04 AM
to Roger B. Sidje, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On Wed, 27 Sep 2006, Roger B. Sidje wrote:
> On 27/09/2006 11:23 AM, Ian Hickson wrote:
> >
> > The problem with adding entities is that a LOT of people do things
> > like
> >
> > href="/u?aa=foo&ab=foo&ac=foo&ad=foo"
> >
> > ...which today works, but would break if MathML entities were
> > introduced (since &ac is a MathML entity).
>
> That list is so big that trying to hand-pick some and leaving some out
> would need another committee...

Not really... I say we just add ApplyFunction, InvisibleComma, and
InvisibleTimes (but not their short aliases).


> > > Is it okay to assume HTML5 (with its <!DOCTYPE html>) as supporting
> > > all W3C entities _by default_?
> >
> > Don't do anything based on the DOCTYPE. HTML5 is anything sent as
> > text/html.
>
> I thought the DOCTYPE was trustworthy -- based on this excerpt from the
> HTML5 spec:
>
> "HTML documents that use the new features described in this
> specification must start with the string <!DOCTYPE html> and, if they
> are served over the wire (e.g. by HTTP) must be labelled with the
> text/html MIME type."

That's an authoring conformance requirement, and has no bearing on
implementations.


> BTW, for my own information, do you intend HTML5 to be transitional,
> almost-standards, or strict?

HTML5 documents starting with <!DOCTYPE HTML> must be in standards mode.
Documents with other DOCTYPEs or no DOCTYPE at all may be in another mode,
as already described in the spec. In due course I may specify quirks mode
and then there'll just be the spec, and no other modes.


> If it is HTML5 (or XHTML5) served as text/html but put in the XHTML
> namespace at some later stage (as the HTML5 spec implies), it had better be
> strict, no? And that would be driven by the DOCTYPE detection code.
> Catch my drift? Or is tag soup going to be in the XHTML namespace?

Not sure what you mean by that. All HTML DOM nodes are (per HTML5) in the
XHTML namespace, irrespective of the standards/quirks thing.


> If it is strict then maybe entities could be required to have a
> semi-colon -- which will then avoid the ambiguities you mentioned above.

That would break back-compat.
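
A rough illustration of the trade-off, in Python. It assumes that "back-compat" here refers to existing pages that write entities without the semicolon. The ENTITIES table is a tiny sample and decode is a hypothetical helper with a made-up require_semicolon switch.

```python
import re

# Tiny sample: two long-standing HTML entities plus one MathML one.
ENTITIES = {"amp": "&", "copy": "\u00a9", "ac": "\u223e"}


def decode(text, *, require_semicolon):
    """Decode known entities, with the trailing semicolon mandatory or optional."""
    tail = ";" if require_semicolon else ";?"
    pattern = r"&([A-Za-z][A-Za-z0-9]*)" + tail
    return re.sub(pattern, lambda m: ENTITIES.get(m.group(1), m.group(0)), text)


href = "/u?ab=foo&ac=foo"
old_page = "&copy 2006"

# Mandatory semicolon: the URL is safe, but the old page stops decoding.
print(decode(href, require_semicolon=True), "|", decode(old_page, require_semicolon=True))

# Optional semicolon: the old page works, but the URL is corrupted.
print(decode(href, require_semicolon=False), "|", decode(old_page, require_semicolon=False))
```

Neither setting handles both inputs, which is the tension between the two positions in this exchange.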

William F Hammond

Sep 27, 2006, 12:25:11 PM
to dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
Ian Hickson <i...@hixie.ch> writes:

> On Wed, 27 Sep 2006, Roger B. Sidje wrote:

> . . .


>> Implementation-wise, as this inclusion of MathML-in-HTML5 marks the

>> beginning of tag soup, ...


>
> Don't worry, these tags auto-close when a parent tag is closed.

Two points for clarification:

1. There's the old issue, related to dual parsers, of trying to get
Mozilla family user agents to give proper handling of XHTML+MathML
when served through text/html -- following early Amaya practice. (In
the end the W3C HTML WG refused to support this idea and spawned the
mimetype application/xhtml+xml.) It seems that formally correct
XHTML+MathML would now gain coverage as text/html under current WhatWG
thinking, at least when XML namespaces are evident only through use of
the xmlns attribute (which would be ignored in tag soup), i.e., no use
of xml namespace prefixing. Is this correct?

2. Is WhatWG entertaining the idea that off-the-cuff tag soup writers
will generate MathML content that's good enough for Mozilla rendering?

---

In case you don't know:

The W3C Math group has announced that it is beginning to think seriously
about author-level markup for math.

Long term -- say ten years in the future (we've already been at this
for ten years) -- I think author level math additions to the tag soup
vocabulary would work out much better, especially with enhanced CSS
support.

Cheers.

-- Bill

----------------------------------------------------------------------
William F. Hammond Dept. of Mathematics & Statistics
518-442-4625 The University at Albany
hammond At math.albany.edu Albany, NY 12222 (U.S.A.)
http://www.albany.edu/~hammond/ Dept. FAX: 518-442-4731
----------------------------------------------------------------------

David Carlisle

Sep 27, 2006, 12:44:42 PM
to r...@maths.uq.edu.au, dev-tec...@lists.mozilla.org, i...@hixie.ch, dev-tec...@lists.mozilla.org
I don't think I saw Ian's original comment, just Roger's reply.

> What I would be proposing for HTML5 is just the following list of
> elements:
>
> math, mrow, mfrac, msqrt, mroot, mstyle, merror, mpadded, mphantom,
> mfenced, menclose, msub, msup, msubsup, munder, mover, munderover,
> mmultiscripts, mtable, mlabeledtr, mtr, mtd, maction

You would need to include the leaf elements (mi mn mo mtext), otherwise
there'll be no characters in the mathml! Also, mspace is pretty
important.

But, as a more general point, I think it's dangerous for a spec to be
profiled by _implementations_. The Math WG activity has just been
restarted at W3C and if there is a need to profile MathML to
presentation MathML (or a subset thereof) please can it be done _there_
so that there is some chance that mathml authoring tools can be
customised to have options to generate code to match any profiled spec.

> I don't like mlabeledtr very much (I have already expressed my views
> about it to folks of the MathML WG)

Roger, I don't see anything searching for
http://www.w3.org/Search/Mail/Public/search?type-index=www-math&index-type=t&keywords=mlabeledtr&search=Search
I know you've talked to us at conferences etc, but we're all getting old
and if comments aren't on the comment list, then they are likely to get
forgotten over time.

_Now_ would be a really good time to make such comments as we are in the
process of finalising the requirements for what extra features should
be in MathML3, and what features, if necessary, should be deprecated.


I don't remember specific discussions about an <mtr label="...">. I
would guess there would be some concern about the label being an
attribute rather than an element restricting the possibilities, but
implementation advice on difficulties with the current scheme would be
taken seriously....

Ian wrote about entities


> Yeah... Do we really need those? Some of them seem reasonable to add, but
> 2000 seems like too many for the mnemonic advantage to beat just using
> Unicode codepoints...

I'd say that it's probably not worth including only a few, it would just
lead to confusion. The problem is that much mathml is generated using
tools and those tools may use entities, and if they do that the user
hasn't much control over which are used, and how to fix things to remove
entities that are not supported in the browser. It would be better to
just get the MathML authoring tools to use characters or character refs
directly and tell the user mathml entities are not supported (but html
ones are)

David

Ian Hickson

Sep 27, 2006, 2:06:36 PM
to William F Hammond, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On Wed, 27 Sep 2006, William F Hammond wrote:
>
> 1. There's the old issue, related to dual parsers, of trying to get
> Mozilla family user agents to give proper handling of XHTML+MathML when
> served through text/html -- following early Amaya practice. (In the end
> the W3C HTML WG refused to support this idea and spawned the mimetype
> application/xhtml+xml.) It seems that formally correct XHTML+MathML
> would now gain coverage as text/html under current WhatWG thinking, at
> least when XML namespaces are evident only through use of the xmlns
> attribute (which would be ignored in tag soup), i.e., no use of xml
> namespace prefixing. Is this correct?

I'm confused by your terminology.

MathML using namespaces and XML syntax would not, under the WHATWG
proposals here, be formally correct. XML sent as text/html is never
correct per the "WHATWG thinking".

What is being proposed here is a non-XML syntax, to be formally described
in the HTML5 specification, which, when processed by an HTML5 UA, would
generate a DOM that can then be processed per the MathML2 specification.

Per the WHATWG specifications, the presence of an "xmlns" attribute is
always a conformance error in any content sent as text/html.


> 2. Is WhatWG entertaining the idea that off-the-cuff tag soup writers
> will generate MathML content that's good enough for Mozilla rendering?

The idea being entertained is that off-the-cuff HTML5 authors, and HTML5
editors, would create content which, when processed by an HTML5 UA (such
as Mozilla, in due course), would render as MathML markup would.


> The W3C Math group has announced that it is beginning to think seriously
> about author-level markup for math.
>
> Long term -- say ten years in the future (we've already been at this for
> ten years) -- I think author level math additions to the tag soup
> vocabulary would work out much better, especially with enhanced CSS
> support.

On the very short term, the proposal here is just a proof of concept. On
the medium term (12 months) I was considering specifying more complex
parsing rules for MathML such that the same MathML2-compatible DOM could
be obtained from much smaller markup, e.g. by implying <mo> tags around
operators and <mn> tags around numbers.
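To make the idea of implied token tags concrete, here is a rough sketch of what such a rule might do to a flat string like "2+2=4". It is only an illustration under simple assumptions (numbers become <mn>, runs of letters become <mi>, any other non-space character becomes <mo>); it is not the parsing rule of any specification, and `imply_tokens` is a hypothetical name.

```python
import re
from xml.sax.saxutils import escape

# One token per match: a number, a run of letters, or any other single
# non-space character. Leading whitespace is skipped.
TOKEN = re.compile(r"\s*(?:(\d+(?:\.\d+)?)|([A-Za-z]+)|(\S))")


def imply_tokens(text: str) -> str:
    """Wrap numbers in <mn>, identifiers in <mi>, everything else in <mo>."""
    parts = []
    for number, ident, other in TOKEN.findall(text):
        if number:
            parts.append(f"<mn>{number}</mn>")
        elif ident:
            parts.append(f"<mi>{ident}</mi>")
        else:
            parts.append(f"<mo>{escape(other)}</mo>")
    return "<mrow>" + "".join(parts) + "</mrow>"


print(imply_tokens("2+2=4"))
```

Even this toy version has to guess that "sin" is one identifier rather than three, which hints at how much a real rule would need to pin down.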

HTH,

Roger B. Sidje

Sep 28, 2006, 4:52:26 AM9/28/06
to David Carlisle, dev-tec...@lists.mozilla.org, i...@hixie.ch, dev-tec...@lists.mozilla.org
On 28/09/2006 2:44 AM, David Carlisle wrote:

> I don't remember specific discussions about an <mtr label="...">. I
> would guess there would be some concern about the label being an
> attribute rather than an element restricting the possibilities, but
> implementation advice on difficulties with the current scheme would be
> taken seriously....

Here is an informative thread about it:
http://groups.google.com/group/netscape.public.mozilla.mathml/browse_thread/thread/d77d015a1fffc6fb/5b0eb0cc9724ce72
(not on www-math, though. Maybe I should forward it there?)

It appeared that attributes (like those in <mfenced>) aren't unanimous
either. But having a bloated tag that won't be implemented in the next
several years isn't really helpful.

> Ian wrote about entities
>
>>Yeah... Do we really need those? Some of them seem reasonable to add, but
>>2000 seems like too many for the mnemonic advantage to beat just using
>>Unicode codepoints...
>
> I'd say that it's probably not worth including only a few, it would just
> lead to confusion.

I am actually a fan of entities because they improve readability a fair
bit. I hope Ian won't give up thinking on this issue so quickly...
especially in the context of MathML where strange characters are quite
common.

As to my suggestion that "if [a document] is strict then maybe entities
could be required to have a semi-colon -- which will then avoid the
ambiguities", Ian responded that "That would break back-compat."

We have other cases of broken back-compat, where users were told to
use a non-strict DOCTYPE or some other workaround, e.g., the line-height of
images.
---
RBS

David Carlisle

Sep 28, 2006, 5:24:59 AM9/28/06
to r...@maths.uq.edu.au, dev-tec...@lists.mozilla.org, i...@hixie.ch, dev-tec...@lists.mozilla.org

Roger,
Thanks for the link on <mtr label="mylabel">,

> It appeared that attributes (like those in <mfenced>) aren't unanimous
> either.

yes mfenced also "suffers" from requiring attributes, but probably one
is more likely to need markup in an equation label than in a stretchy
operator. It's not so uncommon to want superscript * or daggers etc to
highlight special versions of formulae, and mfenced is explicitly a
shorthand form so you can always use the mrow/mo form if you need an
operator that is "decorated" in some way. That would not be the case
here if mlabeledtr were deprecated and an attribute form was the
only version. (Actually it would if the attribute could then be
css-styled using css generated content.) Allowing css (or other
mechanism) auto numbering is I think a highly requested feature for
mathml3.


> (not on www-math, though. Maybe I should forward it there?)

Yes please do. When we are doing a pass for errata or pulling in feature
requests for a new version we can do a more or less exhaustive check of
the official comment list but (even with google's help) doing an
exhaustive check of the entire web's a bit hard:-)


The charter for the current working group

http://www.w3.org/Math/Documents/Charter2006.html

has as one of its headline work items

Extension of MathML with enhanced support for equation labeling,
including automatic numbering, general label placement and style, and
resolution of references.

so getting that specified out in a way that ensures that implementations
can implement it sounds like a good idea, and the timing is good now
to get new features in this area if that is needed. If WhatWG members
are interested in mathml most of them are w3c members and could join the
WG of course (currently only Opera is represented out of the main
browser vendors). But WG membership isn't really needed; we can do the
technical discussion on the public www-math list if that is appropriate.

> I am actually a fan of entities because they improve readability a fair
> bit.

Well as you know I've invested a frightening number of hours maintaining
that entity set (and the draft iso set at www.w3.org/2003/entities,
which is the same thing, really) so I also think they are valuable,
although it's a kind of love-hate relationship most of the time:-)

> I hope Ian won't give up thinking on this issue so quickly...
> especially in the context of MathML where strange characters are quite
> common.

Yes I think the ideal situation is that they all be allowed. My comment
was that subsetting them is likely to be more confusing than helpful.

> As to my suggestion that "if [a document] is strict then maybe entities
> could be required to have a semi-colon -- which will then avoid the
> ambiguities", to which Ian responded that, "That would break back-compat."

Requiring a ; would seem reasonable to me (i.e. make the lack of a ; turn
the & into an implicit &amp; rather than be an error as in xml).
That does have a theoretical backward compatibility problem in that
&rightarrow; would be an arrow instead of &amp;rightarrow; but I would
have thought that the occurrences of any such construction outside of
test suites was rather rare.
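A small sketch of that rule, for illustration only: a reference is expanded only when it is terminated by a semicolon, and a bare & is kept as a literal ampersand. The name table is the one in the modern Python standard library (an assumption, not what any 2006 browser shipped), and `decode_strict` is a hypothetical name.

```python
import re
from html.entities import html5  # keys such as "rightarrow;" and "amp;"

# Either "&name;" (group 1 holds the name) or a lone "&".
AMP = re.compile(r"&(?:([A-Za-z][A-Za-z0-9]*);)?")


def decode_strict(text: str) -> str:
    """Expand &name; only when the semicolon is present; keep a bare & literal."""
    def repl(match):
        name = match.group(1)
        if name is None:
            return "&"                 # no terminating ";": implicit &amp;
        char = html5.get(name + ";")
        if char is None:
            return match.group(0)      # unknown name: keep the text as written
        return char
    return AMP.sub(repl, text)


print(decode_strict("a &rightarrow; b"))  # semicolon present: expanded
print(decode_strict("a &rightarrow b"))   # no semicolon: left as literal text
```

The second call shows the back-compat wrinkle in reverse: text that omits the semicolon is never reinterpreted, so only pages that already wrote a complete but previously unknown reference would change.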

David

White Lynx

Sep 28, 2006, 8:38:50 AM9/28/06
to
I consider switching from XML to text/html an inappropriate and
pointless development; moreover, it is damaging in the long-term perspective.


First of all it is unclear where this idea comes from, as the MathML
community has no legacy text/html content that one should care about.
All MathML content is well-formed (by definition), which means that one
has fewer errors in MathML documents compared to what one would have in
a tagsoup approach. It also means that all MathML content can be
handled with XML tools, can be processed with XSLT, matched using
XPath, mixed with other XML-based markup languages (OpenMath, SVG) etc.
There is no single MathML implementation that supports text/html
tagsoup, but does not support X(HT)ML, while inverse is not true, there
are XML only MathML implementations that by definition have nothing to
do with HTML legacy.
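As a small illustration of the "handled with XML tools" point, the sketch below parses a MathML fragment with a stock XML parser, selects elements with an XPath-style query, and shows that ill-formed input is rejected rather than repaired. It uses only the Python standard library; the sample markup and the helper name `is_well_formed` are made up for the example.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

SOURCE = """<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mrow><mn>2</mn><mo>+</mo><mn>2</mn><mo>=</mo><mn>4</mn></mrow>
</math>"""

# Any generic XML parser can read the fragment.
root = ET.fromstring(SOURCE)

# Namespace-aware query for every <mn> descendant.
numbers = [mn.text for mn in root.findall(".//m:mn", {"m": MATHML_NS})]
print(numbers)


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# A missing </mi> is a fatal error for an XML parser, not something to guess at.
print(is_well_formed("<math><mrow><mi>x</mrow></math>"))
```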

Further it is not clear for me why this has to be done today, after
paying price for wellformedness and tackling XML related problems for
seven years, when finally MSIE/MathPlayer accept application/xhtml+xml
and thus allow people to deliver the same XHTML+MathML to
MSIE/MathPlayer and Mozilla (one can add Opera with UserJS) someone
decides to revert (more precisely convert) everything to tagsoup.

The profiling policy sounds unclear and strange to me. Solving the issue on
the level of "I'm happy to drop/add any tag to this list. Just give me the
list you want", or basing it on the MathML support level of some particular
implementation, seems irresponsible.
There are at least two subgroups in the W3C Math WG to which one could send a
message with a profile proposal after looking at the "wrong table".
One is called the liaison with WhatWG subgroup and, as the name suggests, is
expected to ensure that the needs of MathML are addressed in WhatWG specs.
Another is the liaison with CSS subgroup, which is expected to define a MathML
profile suitable for usage in the XML+CSS framework and a few CSS
extensions needed to format the proposed MathML profile.
There is also a subgroup that deals with compound document formats. My
opinion is that profiling of MathML should be coordinated with these
units, as irresponsible steps may spoil W3C efforts in the same area.

One more thing that sounds illogical and rather strange is that
Mozilla/WhatWG try to move MathML further from XML+CSS framework,
by converting XML to tagsoup with ad hoc parsing rules and embracing
constructions like mstyle, mpadded in "proposed" profile.


Ian Hickson

Sep 28, 2006, 2:45:36 PM9/28/06
to Roger B. Sidje, dev-tec...@lists.mozilla.org, David Carlisle, dev-tec...@lists.mozilla.org
On Thu, 28 Sep 2006, Roger B. Sidje wrote:
> >
> > Ian wrote about entities
> >
> > > Yeah... Do we really need those? Some of them seem reasonable to add, but
> > > 2000 seems like too many for the mnemonic advantage to beat just using
> > > Unicode codepoints...
> >
> > I'd say that it's probably not worth including only a few, it would just
> > lead to confusion.
>
> I am actually a fan of entities because they improve readability a fair
> bit. I hope Ian won't give up thinking on this issue so quickly...
> especially in the context of MathML where strange characters are quite
> common.

I really don't want to start introducing weird rules for parsing entities
(I'm trying to simplify the entity parsing rules, not make them worse). At
least not at this stage. Maybe once we have a proof-of-concept working, it
would make more sense to revisit the issue, but I'd want to do a thorough
scan of the Web to see how common these entities actually are today.


> As to my suggestion that "if [a document] is strict then maybe entities
> could be required to have a semi-colon -- which will then avoid the
> ambiguities", to which Ian responded that, "That would break
> back-compat."
>

> We have other cases of broken back-compat. -- where users were told to
> use a non-strict DOCTYPE or some other workaround, e.g, line-height of
> images.

Yeah. And we can see how well _that_ went. QA nightmare, multiple
overlapping codepaths, obscure bugs, confused authors, contradicting
documentation, etc. Let's not go there again. The whole point of
MathML-in-HTML is to have back-compat work -- if we didn't care about
back-compat, we would just have people use MathML-in-XHTML.

Roger B. Sidje

Sep 28, 2006, 9:45:46 PM9/28/06
to David Carlisle, dev-tec...@lists.mozilla.org, i...@hixie.ch, dev-tec...@lists.mozilla.org
On 28/09/2006 7:24 PM, David Carlisle wrote:

> Roger,
> Thanks for the link on <mtr label="mylabel">,
>
>
>>It appeared that attributes (like those in <mfenced>) aren't unanimous
>>either.
>
>
> yes mfenced also "suffers" from requiring attributes, but probably one
> is more likely to need markup in an equation label than in a stretchy
> operator. It's not so uncommon to want superscript * or daggers etc to
> highlight special versions of formulae, and mfenced is explicitly a
> shorthand form so you can always use the mrow/mo form if you need an
> operator that is "decorated" in some way. That would not be the case
> here if mlabeledtr were deprecated and an attribute form was the
> only version. (Actually it would if the attribute could then be
> css-styled using css generated content.) Allowing css (or other
> mechanism) auto numbering is I think a highly requested feature for
> mathml3.

The danger (and problem) with that tag is that it is over-designed to
accommodate the tiny set of special-cases you alluded to, while holding
the 99.99% majority of cases hostage. One could put up with CDATA all
the way, e.g., (6') or (7*), (8&dagger;), (9a), etc -- if a subequation
is really needed. I would think we can put up with this and reap the
benefits. A <mtr label="mylabel"> tag that stands a chance, degrades
gracefully, *free* cross-referencing (with href#mylabel -- by just
invoking what the browser already does with <a name="...">), the
counters that you mentioned (which work in Gecko today, BTW), etc.
(Also conceivable, optimistically, is a pseudo-class :label to style the
label text, but we might be getting ahead of ourselves...)

Seems to me that the concrete benefits that might result outweigh the
feeling against an attribute.

>
>>(not on www-math, though. Maybe I should forward it there?)
>
> Yes please do.

OK.

> Well as you know I've invested a frightening number of hours maintaining
> that entity set (and the draft iso set at www.w3.org/2003/entities,
> which is the same thing, really) so I also think they are valuable,
> although it's a kind of love-hate relationship most of the time:-)

Yeah. Let's hope Ian is listening and keeps these entities on his radar...
---
RBS

Juan R.

Sep 29, 2006, 4:04:20 AM9/29/06
to
Ian Hickson wrote:
>
> I'm happy to drop/add any tag to this list. Just give me the list you
> want.
>

Ok, this is one in LISP syntax for lists: ()

>
> --
> Ian Hickson U+1047E )\._.,--....,'``. fL
> http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
> Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'

No need to reply to the rest of what you are promoting, since basically you may
think -parodying you- that MathML in HTML 5 is <anything sent as
text/html>

Far from simplifying the authoring of mathematical docs and spreading
online maths, you are really making communication still more difficult
for all of us with this strange hybrid that convinces nobody.


Juan R.

Center for CANONICAL |SCIENCE)

David Carlisle

Sep 29, 2006, 4:40:40 AM9/29/06
to r...@maths.uq.edu.au, dev-tec...@lists.mozilla.org, i...@hixie.ch, dev-tec...@lists.mozilla.org

> Seems to me that the concrete benefits that might result outweigh the
> feeling against an attribute.

Which is why it's good to get real implementation experience into the
language design (or update). Either by implementors joining the WG or
by doing the technical design on the public www-math list so you and
others can join in (or both).

David


sha...@shantirao.com

Oct 2, 2006, 12:13:04 AM10/2/06
to
I would like to call attention to RBS's original point: MathML is
imperiled, and something *can* be done. To expound:

1. MathML is nifty. It's the best thing since LaTeX. In fact, it's the
only thing since LaTeX. My colleagues admire my Mozilla-rendered
documents while they struggle with MS Word.

2. MathML is in trouble. My colleagues who use IE can't see my
equations. This makes it unacceptable for me to write anything important
in MathML, so long as I want to succeed at my job. So investing effort
into learning or using MathML is a quixotic proposition.

3. XHTML is the web language of the future -- and it always will be. It
might as well be dead. It was born crippled, and it never will catch on,
for the simple economic reason that HTML is easier to use. XHTML was
supposed to be the replacement of HTML. In fact, it was so popular that
we're moving forward with HTML5.

4. Languages that are not easy to write are ignored. The wasteland of
obsolete internet standards is littered with romantic, intellectually
superior, morally defensible languages like XFORMS, VRML, and SVG. Boy,
those sure made our lives better! Compare those to what actually gets
used: unvalidated HTML, CSS, JavaScript, and the DOM. All marginally
self-consistent languages that are easy to write and tolerant of abuse.

5. If MathML is not widely understood and easily used by browsers, say
by being a part of HTML5, then sites that drive technology adoption,
like Wikipedia, will have no incentive to switch from the current
TeX->PNG kludge. Lacking a large user base, MathML will not grow.

6. Although the MathML community is self-contained today, we all know
what happens to species that evolve on islands: they get smaller and
prone to extinction. The community needs to grow, and incorporation in
HTML5 is something we should all get behind.

Shanti

* Camel = a horse designed by committee

Chris Chiasson

Oct 2, 2006, 1:10:45 AM10/2/06
to
The reason MathML is in trouble is because Microsoft hasn't implemented
it (and many other good XML technologies) natively into Internet
Explorer. They are using their inertia to screw over the open
standards. It's hard for Firefox to compete in a (MathML) market that
doesn't exist.

The best thing that could be done without MS help is to make XML
handling plugins as ubiquitous and easy to install as Adobe's
Macromedia Flash plugin.

Maybe it would be prudent to make an open source MathML and SVG plugin
for IE so people don't have to rely on the changing winds of corporate
desires and licensing. You could call it something like Firefox
sub-rendering for IE - or whatever.

White Lynx

Oct 2, 2006, 5:33:30 AM10/2/06
to
> XHTML is the web language of the future -- and it always will be. It
> might as well be dead.

MathML is not necessarily confined to XHTML; it may use other XML
applications as host languages. In particular one can name several XML
applications that are much more suitable for encoding scientific
articles than XHTML (NIH Journal Publishing DTD, DocBook, TEI). Of
course XHTML will remain to be the most widespread host language for
MathML, but it is not something that MathML absolutely depends on. And
XML in general is apparently not dead, it is enough for MSIE to fix
their broken parser and the people that yesterday argued that we all
must switch to XHTML, today argue that HTML5 is the only way to go,
tomorrow may adjust their opinion once more. It should not be a
problem.

> Languages that are not easy to write are ignored

Well compare XML
<mmultiscripts><mi>A</mi><mprescripts/><none/><mi>B</mi></mmultiscripts>
and HTML
<mmultiscripts><mrow>A</mrow><mprescripts></mprescripts><none></none><mrow>B</mrow></mmultiscripts>
which, when processed by the parser, will generate mi-mo-mu tagsoup
automatically:
<mmultiscripts><mrow><mi>A</mi></mrow><mprescripts></mprescripts><none></none><mrow><mi>B</mi></mrow></mmultiscripts>
So how has switching to HTML helped make the language human-processable?

I am definitely for turning MathML into a human-processable language, and
removing mi-mo-mu (explicit markup is useful for stuff like integrals,
N-ary operators, delimiters, but otherwise it is just bloat:
<mn>2</mn><mo>+</mo><mn>2</mn><mo>=</mo><mn>4</mn>); however, this can
be done and should be done within XML, without introducing telepathic
parsing rules. If mi-mo-mu are not available in the original source and are
generated by the parser then their semantic value is exactly zero (and yes
I know that it is close to zero in any case). The ECMA approach is one
possible way to remove mi-mo-mu and use something like <nary> (but
not exactly the <nary> construction, which is the most CSS-unfriendly part
of ECMA math markup) for operators and just <i> for italic.
So we should either remove it from MathML (the problem however is lack
of consensus in the WG on the issue) or keep it. Removing it from the source but
keeping it in the DOM does not make any sense, as you remove semantics but
keep this stuff in the DOM.

> Although the MathML community is self-contained today, we all know
> what happens to species that evolve on islands: they get smaller and
> prone to extinction.

Integration with the environment in which formulae are embedded is crucial
for any mathematical markup. All other approaches are closely
integrated in some extensible framework with a powerful formatting
mechanism (LaTeX/TeX, ISO-12083/SGML+DSSSL, OfficeMath/WordML).
Extensibility and availability of a full-featured style language or
equivalent formatting mechanism are crucial here. In the case of MathML the
environment is the web, so integration of MathML into an extensible framework
is integration into XML+CSS+DOM, which is on the agenda of the Math WG. In
contrast HTML5 does not give us an extensible framework, and ad hoc parsing
rules do not help us to integrate MathML with CSS while keeping the DOM
synchronised with the actual markup.

William F Hammond

Oct 2, 2006, 11:23:32 AM10/2/06
to dev-tec...@lists.mozilla.org
"Chris Chiasson" <chris.c...@gmail.com> writes:

> ...


> The best thing that could be done without MS help is to make XML
> handling plugins as ubiquitous and easy to install as Adobe's
> Macromedia Flash plugin.

I acquired a new machine with MS Windows XP (Home) recently and found
that it had both IE and AOL/NetScape visible as desktop icons. Of
course, NetScape rendered XHTML+MathML.

Installing the Design Science plugin for IE called MathPlayer was
quite easy, but one does need to know where to go to get it. So the
math community might consider advertising its location -- or at least
advising its readers to google for "MathPlayer".

> Maybe it would be prudent to make an open source MathML and SVG plugin
> for IE so people don't have to rely on the changing winds of corporate
> desires and licensing. You could call it something like Firefox
> sub-rendering for IE - or whatever.

What was new for me about the OEM NetScape was that it would render in
IE mode if asked.

It's curious that an OEM platform should include NetScape along with
IE but not provide a seamless plugin for IE. Perhaps Microsoft will
want to rethink that.

-- Bill

Paul Topping

Oct 2, 2006, 12:16:39 PM10/2/06
to William F Hammond, dev-tec...@lists.mozilla.org
Hi,

Thanks Bill (Hammond) for mentioning our MathPlayer plugin. While I
understand that people might want IE to support MathML "out of the box",
many capabilities in many apps are provided as plugins. I don't think it
is right to think that all plugins are bad. Plugins allow companies like
mine, with an interest in providing technology in a particular area, to
move technology forward independently of monsters like Microsoft. In
other words, if Microsoft provided MathML support in IE, it wouldn't be
as good as MathPlayer and everyone would be complaining about that.

Of course, demanding that Microsoft support XHTML in IE is perfectly
reasonable. IE does a really good job, IMHO, of allowing plugins like
MathPlayer to support embedded XML languages, except in HTML, not XHTML.
MathPlayer works around this by allowing carefully prepared XHTML+MathML
to work in IE but proper support for XHTML in IE would be better.

Paul Topping
President & CEO

Design Science, Inc.
"How Science Communicates"
Makers of MathType, MathFlow, WebEQ, MathPlayer, Equation Editor,
TeXaide
http://www.dessci.com


Chris Chiasson

Oct 2, 2006, 2:31:40 PM10/2/06
to
White Lynx wrote:
> All other approaches are closely
> integrated in some extensible framework with powerful formatting
> mechanism (LaTeX/TeX, ISO-12083/SGML+DSSSL, OfficeMath/WordML).

I don't know how common the knowledge is, but MathML is closely tied
with a certain platform: Mathematica

Wolfram Research (makers of Mathematica) was one of the originators of
MathML. Anyway, present day MathML is strongly related to Mathematica's
internal representation of math, as shown in this short example.

Consider Euler's formula as entered in the most source-code like syntax
available in Mathematica (called InputForm):

E^(I*x)==Cos[x]+I*Sin[x]

After parsing, it becomes this (called FullForm):

Equal[Power[E,Times[Complex[0,1],x]],Plus[Cos[x],Times[Complex[0,1],Sin[x]]]]

Compare this with content MathML

<math xmlns='http://www.w3.org/1998/Math/MathML'>
<apply><eq/>
<apply><power/><exponentiale/>
<apply><times/><imaginaryi/><ci>x</ci></apply></apply>
<apply><plus/>
<apply><cos/><ci>x</ci></apply>
<apply><times/><imaginaryi/><apply><sin/><ci>x</ci></apply></apply>
</apply></apply></math>

Notice how <apply> is used to capture the structure of
head[arg1,arg2,arg3] as <apply><head/><arg1/><arg2/><arg3/></apply>.
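The head[args] to <apply> correspondence can be sketched in a few lines. This is only an illustration: the nested-tuple encoding of the expression, the small set of constant names, and the function `to_content_mathml` are assumptions made for the example, not part of Mathematica or of the MathML specification.

```python
from xml.sax.saxutils import escape

# Content MathML constants that are written as empty elements.
CONSTANTS = {"exponentiale", "imaginaryi", "pi"}


def to_content_mathml(expr) -> str:
    """Serialise a (head, arg1, arg2, ...) tuple tree as content MathML."""
    if isinstance(expr, tuple):
        head, *args = expr
        body = "".join(to_content_mathml(arg) for arg in args)
        return f"<apply><{head}/>{body}</apply>"
    if isinstance(expr, (int, float)):
        return f"<cn>{expr}</cn>"
    if expr in CONSTANTS:
        return f"<{expr}/>"
    return f"<ci>{escape(expr)}</ci>"


# Euler's formula, mirroring the FullForm tree quoted above.
euler = (
    "eq",
    ("power", "exponentiale", ("times", "imaginaryi", "x")),
    ("plus", ("cos", "x"), ("times", "imaginaryi", ("sin", "x"))),
)
print(to_content_mathml(euler))
```

Apart from the enclosing <math> element, the printed markup has the same nesting as the content MathML shown above: one <apply> per head.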

However, when an equation like this is typeset in Mathematica, it is
converted to a box structure. I'll use StandardForm boxes for this
example:

RowBox[{SuperscriptBox["\[ExponentialE]",RowBox[{"\[ImaginaryI]"," ","x"}]],
"\[Equal]",RowBox[{RowBox[{"Cos","[","x","]"}],"+",
RowBox[{"\[ImaginaryI]"," ",RowBox[{"Sin","[","x","]"}]}]}]}]

Note that RowBox means that the items within should have the same
baseline. Obviously, the possible necessity to linebreak complicates
things somewhat. However, the box structure remains invariant (which is
why I think it's odd that Firefox doesn't linebreak <mrow>).

Compare this with presentation MathML:

<math xmlns='http://www.w3.org/1998/Math/MathML'><mrow>
<msup><mi>&#8519;</mi><mrow><mi>&#8520;</mi><mo>&#8290;</mo><mi>x</mi></mrow></msup>
<mo>&#63449;</mo>
<mrow><mrow><mi>cos</mi><mo>&#8289;</mo><mo>(</mo><mi>x</mi><mo>)</mo></mrow>
<mo>+</mo>
<mrow><mi>&#8520;</mi><mo>&#8290;</mo><mrow><mi>sin</mi><mo>&#8289;</mo><mo>(</mo><mi>x</mi><mo>)</mo></mrow></mrow></mrow>
</mrow></math>

So those plentiful <mrow> elements shouldn't be unexpected. Also, it
becomes pretty apparent why presentation MathML is nearly
incomprehensible. It is a representation of an already verbose two
dimensional box formatting system in XML, making it even more verbose.

Of course, the fact that presentation MathML is a translation of a box
formatting system makes it well suited to styling by CSS.

Mathematica's box formatting subsystem (called the FrontEnd)
understands very few operators (input shortcuts). One of the few is
Rule (lhs->rhs). It doesn't even understand Plus (l+m+r). It certainly
wouldn't understand what's going on if someone left out a RowBox.

In that respect, presentation MathML is slightly more flexible, because
it can "insert" some implicit row boxes when the markup wouldn't make
sense otherwise.

Anyway, I don't speak for WRI, but I think it's fairly obvious they
will try to keep MathML "in their image" so that it will be easy for
them to have an XML language for math that is understood by machines
... aka their computer algebra system.

IMHE (in my humble estimation) Firefox people would be better off
trying to define "shorthand" definitions for the content MathML system,
which WRI will be less likely to oppose.

White Lynx wrote:
> Extensibility and availability of a full-featured style language or
> equivalent formatting mechanism are crucial here.

Agreed. I think it would be imprudent to remove formatting structures
from presentation MathML because that would make it harder to write
appropriate CSS.

Paul Topping

Oct 2, 2006, 3:00:07 PM10/2/06
to Chris Chiasson, dev-tec...@lists.mozilla.org
Chris,

While Mathematica people were heavily involved in MathML's creation, it
is hardly the result of their effort alone. They provided some much
needed early impetus and hosted two MathML conferences but since then
they have been more noticeable by their absence from the MathML
community. At any rate, the notion that they have some kind of control
over it now is just not even close to being the case.

If anyone has opinions on how MathML can be improved, they should
participate in the W3C's MathML 3.0 effort just getting underway. Then
they can see for themselves that Wolfram/Mathematica doesn't run the
show. Actually, I half expected someone to accuse my company, Design
Science, of that these days.

I would encourage anyone to create front ends that save as MathML.
Either GUI ones like our products or "programming" languages that are
converted into MathML. Now that MathML has been fairly well established
as the XML representation for math, ease of conversion should be a goal
for any front end. However, IMHO ease of use should take priority over
this.

Paul Topping
Design Science, Inc.
www.dessci.com

> -----Original Message-----
> From: dev-tech-ma...@lists.mozilla.org
> [mailto:dev-tech-ma...@lists.mozilla.org] On Behalf
> Of Chris Chiasson
> Sent: Monday, October 02, 2006 11:32 AM
> To: dev-tec...@lists.mozilla.org
> Subject: Re: MathML-in-HTML5
>

Jacques Distler

Oct 2, 2006, 10:44:35 PM10/2/06
to
In article
<mailman.6128.115938040...@lists.mozilla.org>, Ian
Hickson <i...@hixie.ch> wrote:


>What is being proposed here is a non-XML syntax, to be formally described
>in the HTML5 specification, which, when processed by an HTML5 UA, would
>generate a DOM that can then be processed per the MathML2 specification.
>

> ...


>
>On the very short term, the proposal here is just a proof of concept. On
>the medium term (12 months) I was considering specifying more complex
>parsing rules for MathML such that the same MathML2-compatible DOM could
>be obtained from much smaller markup, e.g. by implying <mo> tags around
>operators and <mn> tags around numbers.

Please don't go down that road.

Let's not have two incompatible markup languages, both called "MathML,"
one of which can be embedded in HTML5, the other in XHTML.

If you want MathML-in-HTML5, create a profile (along the lines of
XHTML's Appendix C) of MathML 2.0 that is safe to consume by the
Tag-Soup parser.

--
PGP public key: http://golem.ph.utexas.edu/~distler/distler.asc

Roger B. Sidje

Oct 3, 2006, 1:48:17 AM10/3/06
to White Lynx, www-...@w3.org, dev-tec...@lists.mozilla.org
On 28/09/2006 10:44 PM, White Lynx wrote:

> I consider switching from XML to text/html as inappropriate and
> pointless development, morover it is damaging in long term perspective.

Damaging to what? To MathML? Not really in my opinion. What damage could
there be to have plenty of MathML formulas on the web?!? But to the
XML/XHTML agenda, possibly. And that has been the real "problem" since
the beginning, and which I alluded to in my opening post. It wasn't a
fight fitted for a niche MathML that was already struggling to make a
name for itself.

Interested in using MathML? First pass that XHTML barrier, and that
wasn't even a small barrier. It was a significant barrier, taking seven
years before IE understood application/xhtml+xml. As for the claim that
"the people that yesterday argued that we all must switch to XHTML,
today argue that HTML5 is the only way to go": speaking generally, or
specifically w.r.t. MathML? People had to switch to XHTML to get MathML
-- it wasn't even a matter of choice. Cf. again this very insightful
post on the matter:
http://groups.google.com/group/netscape.public.mozilla.mathml/msg/4d58c35217afcb54?dmode=source

So after all these years making the case for something else (XHTML),
what this thread is about is to make <math>...</math> work everywhere,
especially where it still matters the most today, and that is HTML5. As
I indicated, my original take is for <math>...</math> to work as-is --
as we have come to know and enjoy it. But it is obvious that this new
mixing has to be defined somehow, even if we later come to a conclusion
saying that it is an opaque <object>, or a profile of some sort.

But I hope that as further insight is gathered through the
proof-of-concept, it turns out that <math>...</math> is just fine, and
that interoperability issues won't be thrown at an already special niche
technology. While on this, I should stress that tag-soup is possible
anywhere, although this is often not mentioned because the extent is
much different. Well-formed tag-soup (as odd as it sounds...) is
possible, which is why these reddish "invalid-markup" messages sometimes
pop in Gecko's MathML rendering. Such things are left undefined by the
spec. However, in the case of MathML where the markup is generated
automatically by software, there is no particular reason to believe that
these generators will suddenly start to generate an indigestible
tag-soup. So it is not quite realistic to over-emphasize this issue.

MathML already works in XML/XHTML and this proposal is not going to
break that. But there is little else to gain there (as far as MathML is
concerned). Publishers who use XML in their back-end production line can
continue to do what they have been doing.

However, MathML stands to win more (especially individual users) in the
front-end by being in HTML (HTML5 for that matter). This might also
encourage those building HTML authoring tools to consider interfacing
MathML (either with free or commercial plug-ins) because the XML/XHTML
barrier won't be standing right at their face. (On the issue of the
verbosity of MathML, this wouldn't be much of an issue if people didn't
have to stare at the MathML. In fact, when I look at HTML+Javascript+CSS
pages these days, they are also quite cryptic... Is it possible to have
invisible/collapsible MathML in an editor interfaced to a plug-in?
Surely, for people who have experience building comprehensive editors.
But with the XHTML barrier they can't even chime in...)

I am sure it should be evident by now that it is XML/XHTML that
stands to lose with MathML enabled in HTML5. Anyway, XHTML doesn't seem
to be going anywhere. (How often does one stumble on a page served as
application/xhtml+xml -- if it isn't a page with MathML?) In any case,
as I indicated, it will still work there, maybe not just as _the_
selling argument that it is now. (Many math pages wouldn't have bothered
with XHTML if it had been possible to have MathML in HTML, and that's
where their loss might come from. But does it really matter? Read Robert
Miner's earlier post again.)

To advance MathML, we contributed a great deal to XML/XHTML and pushed
for them so much that it is very easy to forget the initial focus.
MathML-in-HTML5? Worth a try. The thread is now about the issues in
prototyping this, and the benefits (or otherwise) for MathML and math on
the web. And I must say I don't see that many disadvantages in enabling
MathML everywhere at this point.
---
RBS

Paul Topping

Oct 3, 2006, 2:27:42 AM
to Roger B. Sidje, White Lynx, www-...@w3.org, dev-tec...@lists.mozilla.org
This all sounds vaguely familiar. When MathML (and Mozilla) were new,
many of us argued for MathML support in Mozilla's HTML parser for many
of the same reasons I see here. We were told by the Mozilla chieftains
that this would only happen over their dead bodies and that XHTML was
the only way we were going to get MathML support. Perhaps it did take us
7 years to get IE to work with XHTML+MathML but IE has also had a
solution for MathML embedded in HTML for even longer.

While Microsoft may have (nasty) business reasons for not supporting
XHTML, they may also have made the argument that the world wasn't ready
to change all their pages into XHTML just for some gain in "purity".
Sounds like some people on this list are coming around to that same
point of view.

So, as I posted a week ago, why not adopt the Microsoft convention for
embedding MathML (or any other XML language) in HTML? Minus the COM
class id stuff, of course. Basically, this would result in a simple
declaration of the embedded language's namespace. For the reasons stated
earlier, just <math> is not enough. At a minimum, it doesn't allow for
smooth transitions to new versions of MathML. Come on, Microsoft isn't
wrong all the time.
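
As a rough illustration of what a namespace declaration gives a
consumer, here is a small Python sketch. It is not the Microsoft
mechanism itself: the prefix "m", the sample page, and the use of a
strict XML parser (Python's standard HTML parser does no namespace
processing) are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical page: the prefix "m" is declared once on the root element.
page = (
    '<html xmlns:m="' + MATHML_NS + '">'
    "<body><p>Area: <m:math><m:mi>x</m:mi></m:math></p></body>"
    "</html>"
)

root = ET.fromstring(page)

# ElementTree expands prefixes, so lookups use the namespace URI, not "m".
islands = root.findall(".//{%s}math" % MATHML_NS)

print(len(islands))    # 1
print(islands[0].tag)  # {http://www.w3.org/1998/Math/MathML}math
```

The consumer keys on the URI rather than on the bare name "math", which
is the property a bare <math> tag cannot offer.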

Paul Topping
Design Science


White Lynx

Oct 3, 2006, 3:18:43 AM
to
> Please don't go down that road.
> Let's not have two incompatible markup languages, both called "MathML,"
> one of which can be embedded in HTML5, the other in XHTML.

Completely agree. Personally I am not against removing mandatory tokens
and following the approach taken by ECMA (this attitude does not
necessarily reflect the position of the Math WG, however), but I am
radically against the current approach. It does not make sense to remove
tokens from markup while preserving them in DOM (the semantic value of
tokens automatically generated by parser is zero, and not all
conversion/interchange tools operate through DOM).

> I don't know how common the knowledge is, but MathML is closely tied
> with a certain platform: Mathematica

They just use MathML for import/export of math formulae. This is not
the kind of integration I meant.

>> I consider switching from XML to text/html as inappropriate and
>> pointless development, moreover it is damaging in long term perspective.

> Damaging to what? To MathML? Not really in my opinion. What damage could
> there be to have plenty of MathML formulas on the web?!?

What prevents you from having plenty of formulae on the web today? Do we
have at least one MathML implementation that supports HTML but lacks
XHTML support? Do we have MathML implementations that support XHTML
only? So how will introducing two different and incompatible sets of
parsing rules improve interoperability? And assume that you have plenty
of formulae on the web and you want to process them. How does having
half of them in tagsoup and the other half in XML make them easier to
handle?
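
A small Python sketch of the difference, for illustration only. The two
sample strings and the helper names (TagCollector, xml_accepts,
soup_tags) are invented for this example, and Python's html.parser is
merely a lenient tokenizer standing in for a tagsoup consumer, not a
model of any browser's parser.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records start tags; html.parser tokenizes leniently and builds no tree."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def xml_accepts(source):
    """True if a strict XML parser accepts the input as well-formed."""
    try:
        ET.fromstring(source)
        return True
    except ET.ParseError:
        return False


def soup_tags(source):
    """Start tags seen by the lenient tokenizer, in document order."""
    collector = TagCollector()
    collector.feed(source)
    collector.close()
    return collector.tags


strict = "<math><mrow><mi>x</mi><mo>+</mo><mn>1</mn></mrow></math>"
soup = "<math><mrow><mi>x<mo>+</mo><mn>1</mrow></math>"  # <mi> and <mn> unclosed

print(xml_accepts(strict))  # True
print(xml_accepts(soup))    # False
print(soup_tags(soup))      # ['math', 'mrow', 'mi', 'mo', 'mn']
```

Any tool that processes formulae collected from the web would need both
code paths, plus rules for what the second input is supposed to mean.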

> But to the
> XML/XHTML agenda, possibly. And that has been the real "problem" since
> the beginning, and which I alluded to in my opening post.

It is not the beginning. Seven years have passed since that time and a
lot of XML applications have emerged since then. Most current W3C
specifications are designed with XML in mind, not SGML or HTML. MathML
is part of a large and extensible framework where it can be combined
with other XML applications. The current proposal adds no new
functionality to MathML, but rather artificially splits the MathML
community into incompatible parts that have to be dealt with separately.

> Interested in using MathML? First pass that XHTML barrier, and that
> wasn't even a small barrier. It was a significant barrier, taking seven
> years before IE understood application/xhtml+xml.

It was. But it is not anymore. So it is not clear what you are
struggling with. Maybe someone has to struggle with legacy text/html
content, but that is not our problem: we have no MathML-in-HTML legacy.
Maybe someone complains that MSIE does not support
application/xhtml+xml; again, that is not our problem, as without
MathPlayer MSIE cannot process MathML, while with MathPlayer the
application/xhtml+xml problem is N/A. If someone has doubts about the
future of XML in MSIE, note that Microsoft's own mathematical markup
language is (and most other recent formats are) entirely XML based.

> MathML already works in XML/XHTML and this proposal is not going to
> break that.

XML for maths means better interoperability (and extensibility); this
proposal splits MathML into two different versions.

> This might also
> encourage those building HTML authoring tools to consider interfacing
> MathML (either with free or commercial plug-ins) because the XML/XHTML
> barrier won't be standing right at their face.

Once again there is no barrier, XHTML has all the functionality that
HTML has and much more. The only issue is MSIE parser and as noted
above several times this issue is N/A to MathML today.

> Many math pages wouldn't have bothered
> with XHTML if it had been possible to have MathML in HTML

Which means that going in that direction will give rise to two
different versions of MathML, damaging interoperability and introducing
no new functionality.

> MathML-in-HTML5? Worth a try.

Once you try something you can't always untry it. Just proceed with your
proposal and we will have to struggle with the text/html legacy forever.

Jacques Distler

Oct 3, 2006, 9:17:47 AM
to
In article <1159859923....@m73g2000cwd.googlegroups.com>,
White Lynx <whit...@operamail.com> wrote:

>It does not make sense to remove
>tokens from markup while preserving them in DOM (the semantic value of
>tokens automatically generated by parser is zero, and not all
>conversion/interchange tools operate through DOM).

HTML does this all the time. (E.g. inferred <tbody> element as a child
of <table>, inferred <head> and <body> elements,...) There's nothing
wrong with inferred elements ... per se.

The only problem occurs when people expect their MathML code (or, more
pertinently, the software they use to generate it) to be interoperable
in an XML context.

>> This might also
>> encourage those building HTML authoring tools to consider interfacing
>> MathML (either with free or commercial plug-ins) because the XML/XHTML
>> barrier won't be standing right at their face.
>
>Once again there is no barrier, XHTML has all the functionality that
>HTML has and much more.

I disagree strongly. XHTML is a *huge* barrier. CMS's that reliably
produce XHTML are rare to nonexistent.

And many users don't have control over the MIME-type their pages are
sent with. If they did, you wouldn't have so many RSS feeds sent as
text/XML (which, unless they are plain ASCII, means they are
automatically ill-formed).

>The only issue is MSIE parser and as noted
>above several times this issue is N/A to MathML today.

Precisely. MathPlayer2 allows IE/6 to consume MathML embedded in tag
soup *TODAY*.

There's every incentive to have the Mozilla people experiment with
allowing Mozilla to do the same.

White Lynx

Oct 3, 2006, 10:55:21 AM
to
> >It does not make sense to remove
> >tokens from markup while preserving them in DOM (the semantic value of
> >tokens automatically generated by parser is zero, and not all
> >conversion/interchange tools operate through DOM).
>
> HTML does this all the time. (E.g. inferred <tbody> element as a child
> of <table>, inferred <head> and <body> elements,...) There's nothing
> wrong with inferred elements ... per se.

It is one thing to unambiguously infer a completely useless element
that has no semantic value and just groups rows (tbody); it is another
thing to infer out of nowhere elements with either predefined
presentation or semantics, like address or i.

> The only problem occurs when people expect their MathML code (or, more
> pertinently, the software they use to generate it) to be interoperable
> in an XML context.
>
> >> This might also
> >> encourage those building HTML authoring tools to consider interfacing
> >> MathML (either with free or commercial plug-ins) because the XML/XHTML
> >> barrier won't be standing right at their face.
>
> >Once again there is no barrier, XHTML has all the functionality that
> >HTML has and much more.
>
> I disagree strongly. XHTML is a *huge* barrier. CMS's that reliably
> produce XHTML are rare to nonexistent.

You tend to turn simple things into rocket science.

> And many users don't have control over the MIME-type their pages are
> sent with. If they did, you wouldn't have so many RSS feeds sent as
> text/XML (which, unless they are plain ASCII, means they are
> automatically ill-formed).

Browsers follow Appendix F.2 of the XML recommendation (if an XML entity
is in a file, the Byte-Order Mark and encoding declaration are used (if
present) to determine the character encoding), not RFC 3023.
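
The two rules can be put side by side in a short Python sketch. It is
deliberately simplified and the function names are invented for the
example: the first function covers only the text/xml case of RFC 3023
(the charset parameter wins, and the default is us-ascii), and the
second looks only at a byte-order mark and the encoding declaration,
ignoring cases such as UTF-32.

```python
import re


def charset_rfc3023(content_type):
    """Charset for text/xml per RFC 3023: the charset parameter, else us-ascii."""
    match = re.search(r'charset\s*=\s*"?([^";\s]+)', content_type, re.IGNORECASE)
    return match.group(1).lower() if match else "us-ascii"


def charset_from_document(data):
    """Simplified in-document detection: BOM first, then encoding declaration."""
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if data.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"
    match = re.match(rb'<\?xml[^>]*encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']', data)
    # XML defaults to UTF-8 when neither a BOM nor a declaration is present.
    return match.group(1).decode("ascii").lower() if match else "utf-8"


# Hypothetical feed: declares UTF-8 and contains one non-ASCII character.
feed = (
    '<?xml version="1.0" encoding="utf-8"?>'
    "<rss><title>caf\u00e9</title></rss>"
).encode("utf-8")

print(charset_rfc3023("text/xml"))   # us-ascii
print(charset_from_document(feed))   # utf-8
```

Under the first rule the sample feed cannot even be decoded, since it
contains a non-ASCII byte; under the second it is read as UTF-8.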

>
> >The only issue is MSIE parser and as noted
> >above several times this issue is N/A to MathML today.
>
> Precisely. MathPlayer2 allows IE/6 to consume MathML embedded in tag
> soup *TODAY*.

Well, MSIE does not deal with MathML in any form, and I am not against
embedding MathML in environments other than XML (you can embed it in
LaTeX if you want), but I am against turning it into tagsoup, which is
a different issue.

White Lynx

Oct 3, 2006, 11:01:06 AM